Results (1) Analysis of clinicopathological data of patients in the training dataset and validation dataset: the numbers of cases without microvascular invasion, with microvascular invasion, without liver cirrhosis, and with liver cirrhosis were 292, 8, 105, and 195 in the training dataset, versus 69, 6, 37, and 38 in the validation dataset, showing significant differences between the two datasets (χ2=4.749, 5.239, P<0.05). (2) Follow-up and survival of patients in the training dataset and validation dataset: all 375 patients were followed up. The 300 patients in the training dataset were followed up for 1.1-85.5 months, with a median follow-up time of 50.3 months. The 75 patients in the validation dataset were followed up for 1.0-85.7 months, with a median follow-up time of 46.7 months. The postoperative 1- and 3-year overall survival rates of the 375 patients were 91.7% and 79.5%. The postoperative 1- and 3-year overall survival rates were 92.0% and 79.7% in the training dataset, versus 90.7% and 81.9% in the validation dataset, showing no significant difference in postoperative survival between the two datasets (χ2=0.113, P>0.05). (3) Construction and evaluation of machine learning algorithm prediction models. ① Selection of the optimal machine learning algorithm prediction model: five machine learning algorithms, LR, SVM, DT, RF, and ANN, were used to comprehensively rank the clinicopathological variables of HCC according to their information divergence for predicting 3-year postoperative survival. The main predictive factors screened out were hepatitis B e antigen (HBeAg), surgical procedure, maximum tumor diameter, perioperative blood transfusion, liver capsule invasion, and liver segment Ⅳ invasion. The top 3, 6, 9, 12, 15, 18, 21, 24, 27, and 29 variables in the rank sequence were introduced into the 5 machine learning algorithms in turn.
The results showed that the area under the curve (AUC) of the receiver operating characteristic curve of the LR, SVM, DT, and RF machine learning algorithm prediction models tended to be stable when 9 variables were introduced. When more than 12 variables were introduced, the AUC of the ANN machine learning algorithm prediction model fluctuated significantly, the stability of the AUC of the LR and SVM machine learning algorithm prediction models continued to improve, and the AUC of the RF machine learning algorithm prediction model was nearly 0.990, suggesting that the RF machine learning algorithm prediction model was the optimal machine learning algorithm prediction model. ② Optimization and evaluation of RF machine learning algorithm prediction model: the 29 variables of predictive factors were sequentially introduced into the RF machine learning algorithm to construct the optimal RF machine learning algorithm prediction model in the training dataset. When 10 variables were introduced, the grid search method showed 4 as the optimal number of nodes in each DT and 1 000 as the optimal number of DT. When the number of introduced variables was not less than 10, the AUC of the RF machine learning algorithm prediction model was about 0.990. When 10 variables were introduced, the RF machine learning algorithm prediction model had an AUC of 0.992 for 3-year postoperative overall survival, a sensitivity of 0.629, and a specificity of 0.996 in the training dataset, versus an AUC of 0.723 for 3-year postoperative overall survival, a sensitivity of 0.177, and a specificity of 0.948 in the validation dataset. (4) Construction and evaluation of COX nomogram prediction model. ① Analysis of postoperative survival factors of HCC patients in the training dataset.
Results of univariate analysis showed that HBeAg, alpha fetoprotein (AFP), preoperative blood transfusion, maximum tumor diameter, liver capsule invasion, and degree of tumor differentiation were related factors for postoperative survival of HCC patients [hazard ratio (HR)=1.958, 1.878, 2.170, 1.188, 2.052, 0.222, 95% confidence interval (CI): 1.185-3.235, 1.147-3.076, 1.389-3.393, 1.092-1.291, 1.240-3.395, 0.070-0.703, P<0.05]. Clinicopathological data with P<0.2 were included for Lasso regression analysis, and the results showed that age, HBeAg, AFP, surgical procedure, perioperative blood transfusion, maximum tumor diameter, tumor located at liver segment Ⅴ or Ⅷ, liver capsule invasion, and degree of tumor differentiation (high differentiation, moderate-high differentiation, moderate differentiation, moderate-low differentiation) were related factors for postoperative survival of HCC patients. The above factors were included for further multivariate COX analysis, and the results showed that HBeAg, surgical procedure, and maximum tumor diameter were independent factors affecting postoperative survival of HCC patients (HR=1.770, 8.799, 1.142, 95%CI: 1.049-2.987, 1.203-64.342, 1.051-1.242, P<0.05). ② Construction and evaluation of COX nomogram prediction model: the clinicopathological factors with P≤0.1 in the COX multivariate analysis were imported into RStudio software with the rms software package to construct the COX nomogram prediction model in the training dataset. The COX nomogram prediction model for predicting postoperative overall survival had a consistency index of 0.723 (se=0.028), an AUC of 0.760 for 3-year postoperative overall survival in the training dataset, and an AUC of 0.795 for 3-year postoperative overall survival in the validation dataset. The calibration plot in the training dataset showed that the COX nomogram prediction model had a good prediction performance for postoperative survival.
COX nomogram score=0.627 06×HBeAg (normal=0, abnormal=1)+0.134 34×maximum tumor diameter (cm)+2.107 58×surgical procedure (laparoscopy=0, laparotomy=1)+0.545 58×perioperative blood transfusion (without blood transfusion=0, with blood transfusion=1)-1.421 33×high differentiation (non-high differentiation=0, high differentiation=1). The COX nomogram risk scores of all patients were calculated. X-tile software was used to find the optimal threshold of COX nomogram risk scores. Patients with risk scores ≥2.9 were assigned to the high risk group, and patients with risk scores <2.9 were assigned to the low risk group. Results of the Kaplan-Meier overall survival curve showed a significant difference in postoperative overall survival between the low risk group and the high risk group of the training dataset (χ2=33.065, P<0.05). There was also a significant difference in postoperative overall survival between the low risk group and the high risk group of the validation dataset (χ2=6.585, P<0.05). Results of further analysis by the decision curve showed that the COX nomogram prediction model based on the combination of HBeAg, surgical procedure, perioperative blood transfusion, maximum tumor diameter, and degree of tumor differentiation was superior to any of these individual factors in prediction performance. (5) Evaluation of prediction performance between RF machine learning algorithm prediction model and COX nomogram prediction model: the prediction difference between the two models was investigated by analyzing maximum tumor diameter (the important variable shared by both models) and by comparing the prediction error curves of both models.
The results showed that the postoperative 3-year survival rates predicted by the RF machine learning algorithm prediction model and the COX nomogram prediction model were 77.17% and 74.77%, respectively, for tumors with a maximum diameter of 2.2 cm (χ2=0.182, P>0.05), 57.51% and 61.65% for tumors with a maximum diameter of 6.3 cm (χ2=0.394, P>0.05), and 51.03% and 27.52% for tumors with a maximum diameter of 14.2 cm (χ2=12.762, P<0.05). As the maximum tumor diameter increased, the difference between the survival rates predicted by the two models became larger. In the validation dataset, the AUC for 3-year postoperative overall survival of the RF machine learning algorithm prediction model and the COX nomogram prediction model was 0.723 and 0.795, showing a significant difference between the two models (t=3.353, P<0.05). Results of Bootstrap cross-validation for prediction error showed that the integrated Brier scores of the RF machine learning algorithm prediction model and the COX nomogram prediction model for predicting 3-year survival were 0.139 and 0.134, respectively. The prediction error of the COX nomogram prediction model was lower than that of the RF machine learning algorithm prediction model.
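
The COX nomogram score and the risk grouping reported in (4) can be illustrated with a minimal Python sketch. The coefficients and the 2.9 threshold are taken from the formula above; the function names, parameter names, and the example patient are hypothetical and serve only as illustration, not as the authors' implementation.

```python
# Coefficients copied from the COX nomogram score formula reported in the Results.
# The dictionary keys are illustrative names, not identifiers from the study.
COEFFICIENTS = {
    "hbeag_abnormal": 0.62706,             # HBeAg: normal=0, abnormal=1
    "max_tumor_diameter_cm": 0.13434,      # maximum tumor diameter in cm
    "laparotomy": 2.10758,                 # surgical procedure: laparoscopy=0, laparotomy=1
    "perioperative_transfusion": 0.54558,  # without transfusion=0, with transfusion=1
    "high_differentiation": -1.42133,      # non-high differentiation=0, high differentiation=1
}

# Threshold of the risk score reported in the Results (scores >= 2.9 are high risk).
RISK_THRESHOLD = 2.9


def nomogram_score(hbeag_abnormal, max_tumor_diameter_cm, laparotomy,
                   perioperative_transfusion, high_differentiation):
    """Return the linear COX nomogram score for one patient."""
    values = {
        "hbeag_abnormal": hbeag_abnormal,
        "max_tumor_diameter_cm": max_tumor_diameter_cm,
        "laparotomy": laparotomy,
        "perioperative_transfusion": perioperative_transfusion,
        "high_differentiation": high_differentiation,
    }
    return sum(COEFFICIENTS[name] * values[name] for name in COEFFICIENTS)


def risk_group(score):
    """Assign the risk group using the reported threshold."""
    return "high" if score >= RISK_THRESHOLD else "low"


if __name__ == "__main__":
    # Hypothetical patient: abnormal HBeAg, 6.3 cm tumor, laparotomy,
    # no perioperative transfusion, non-high differentiation.
    score = nomogram_score(1, 6.3, 1, 0, 0)
    print(round(score, 3), risk_group(score))
```

Because the surgical procedure coefficient (2.107 58) is large relative to the 2.9 threshold, in this sketch a laparotomy patient crosses into the high risk group at a much smaller tumor diameter than a laparoscopy patient with otherwise identical covariates.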