Jianhua Tong, Panmiao Liu, Muhuo Ji, Ying Wang, Qiong Xue, Jian-Jun Yang, Cheng-Mao Zhou.
Abstract
OBJECTIVE: Over 1 million new cases of hepatocellular carcinoma (HCC) are diagnosed worldwide every year. Its prognosis remains poor, and the 5-year survival rate in all disease stages is estimated to be between 10% and 20%. Radiofrequency ablation (RFA) has become an important local treatment for liver cancer, and machine learning (ML) can provide many shortcuts for liver cancer medical research. Therefore, we explore the role of ML in predicting the total mortality of liver cancer patients undergoing RFA.
Keywords: Treatment algorithms; hepatocellular carcinoma; individual outcomes; patient stratification; radiofrequency ablation
Year: 2021 PMID: 33854400 PMCID: PMC8013536 DOI: 10.1177/11795549211000017
Source DB: PubMed Journal: Clin Med Insights Oncol ISSN: 1179-5549
Functions, packages, and tuning parameters used in Anaconda for each machine learning algorithm.
| Algorithm | Classifier | Package | Tuning parameters |
|---|---|---|---|
| Logistic regression | LogisticRegression | from sklearn.linear_model import LogisticRegression | penalty = 'l2', tol = 0.000001, C = 0.1, fit_intercept = True, intercept_scaling = 1, class_weight = None, max_iter = 100, multi_class = 'ovr', verbose = 0, warm_start = False, n_jobs = 1 |
| DecisionTree | DecisionTreeClassifier | from sklearn.tree import DecisionTreeClassifier | splitter = 'best', max_depth = 3, min_samples_split = 30, min_samples_leaf = 2, min_weight_fraction_leaf = 0.01 |
| Forest | RandomForestClassifier | from sklearn.ensemble import RandomForestClassifier | n_estimators = 50, n_jobs = -1, min_samples_split = 20, min_samples_leaf = 2, random_state = 41 |
| GradientBoosting | GradientBoostingClassifier | from sklearn.ensemble import GradientBoostingClassifier | learning_rate = 0.2, n_estimators = 20, max_depth = 3, min_samples_split = 20, min_samples_leaf = 5 |
| Gbm | lgb.LGBMClassifier | lightgbm 2.2.0 | boosting_type = 'gbdt', objective = 'binary', metrics = 'auc', learning_rate = 0.1, n_estimators = 100, max_depth = 2, bagging_fraction = 0.5, feature_fraction = 0.5 |
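As an illustration of how the settings in the table map onto code, the following is a minimal sketch, not the authors' script. The function name `build_models` and the dictionary keys are assumptions made for this example. The LightGBM model is left out so that the sketch depends on scikit-learn only, and `multi_class = 'ovr'` is omitted because recent scikit-learn releases deprecate that argument.

```python
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier


def build_models():
    """Return scikit-learn classifiers configured as listed in the table.

    Hypothetical helper: the name and the dictionary keys are illustrative.
    """
    return {
        # Only the non-default settings are passed; the remaining values in
        # the table (penalty='l2', fit_intercept=True, ...) are equal or
        # equivalent to the library defaults. multi_class='ovr' is omitted
        # (deprecated in recent scikit-learn releases).
        "Logistic": LogisticRegression(tol=1e-6, C=0.1, max_iter=100),
        "DecisionTree": DecisionTreeClassifier(
            splitter="best",
            max_depth=3,
            min_samples_split=30,
            min_samples_leaf=2,
            min_weight_fraction_leaf=0.01,
        ),
        "Forest": RandomForestClassifier(
            n_estimators=50,
            n_jobs=-1,
            min_samples_split=20,
            min_samples_leaf=2,
            random_state=41,
        ),
        "GradientBoosting": GradientBoostingClassifier(
            learning_rate=0.2,
            n_estimators=20,
            max_depth=3,
            min_samples_split=20,
            min_samples_leaf=5,
        ),
    }


models = build_models()
for name, model in models.items():
    print(name, type(model).__name__)
```

Each model could then be fitted with `model.fit(X_train, y_train)`, where `X_train` and `y_train` stand for the training features and the death/nondeath labels (placeholder names, not taken from the article).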
Basic patient characteristics.
| Characteristic | Training group: Nondeath | Training group: Death | P value | Test group: Nondeath | Test group: Death | P value |
|---|---|---|---|---|---|---|
| Number of people | 240 | 222 | | 60 | 56 | |
| Age (years) | 67.7 ± 9.7 | 70.7 ± 8.7 | 0.001 | 66.0 ± 10.8 | 71.4 ± 9.5 | 0.013 |
| Size (mm) | 23.1 ± 8.7 | 24.2 ± 8.3 | 0.067 | 23.2 ± 8.3 | 26.0 ± 9.9 | 0.134 |
| Height (cm) | 159.3 ± 9.8 | 159.1 ± 9.1 | 0.942 | 161.2 ± 9.5 | 157.6 ± 10.1 | 0.034 |
| Weight (kg) | 59.6 ± 11.3 | 59.6 ± 11.7 | 0.813 | 60.6 ± 9.9 | 58.5 ± 10.3 | 0.126 |
| Body mass index (kg/m²) | 23.4 ± 3.3 | 23.5 ± 3.9 | 0.811 | 23.3 ± 3.1 | 23.6 ± 3.5 | 0.875 |
| Serum albumin (g/dL) | 3.8 ± 0.5 | 3.6 ± 0.4 | <0.001 | 3.8 ± 0.5 | 3.5 ± 0.5 | <0.001 |
| Total bilirubin (mg/dL) | 0.9 ± 0.4 | 1.1 ± 0.5 | <0.001 | 0.9 ± 0.6 | 1.0 ± 0.6 | 0.052 |
| AST (IU/L) | 56.5 ± 36.7 | 62.1 ± 37.2 | 0.011 | 51.2 ± 30.1 | 63.0 ± 35.6 | 0.034 |
| ALT (IU/L) | 55.9 ± 50.8 | 53.3 ± 37.4 | 0.698 | 50.5 ± 31.2 | 53.2 ± 32.5 | 0.658 |
| Platelet count (×10⁹/L) | 12.3 ± 5.3 | 10.5 ± 5.5 | <0.001 | 13.3 ± 6.9 | 11.3 ± 6.6 | 0.049 |
| Hemoglobin (g/dL) | 12.9 ± 1.5 | 12.8 ± 1.6 | 0.394 | 13.2 ± 1.6 | 12.5 ± 1.8 | 0.025 |
| Prothrombin activity (%) | 88.3 ± 13.2 | 84.8 ± 13.0 | 0.002 | 90.0 ± 13.3 | 83.0 ± 15.3 | 0.005 |
| PTINR | 1.1 ± 0.1 | 1.1 ± 0.1 | 0.002 | 1.0 ± 0.2 | 1.1 ± 0.2 | 0.004 |
| Creatinine | 0.7 ± 0.2 | 0.8 ± 0.2 | 0.053 | 0.8 ± 0.2 | 0.8 ± 0.7 | 0.294 |
| Serum ferritin level (ng/mL) | 193.0 ± 249.8 | 197.8 ± 232.3 | 0.780 | 197.8 ± 194.7 | 212.7 ± 303.2 | 0.553 |
| AFP | 111.1 ± 459.8 | 129.5 ± 452.1 | <0.001 | 131.8 ± 490.2 | 432.3 ± 2487.4 | 0.513 |
| L3 | 5.1 ± 13.9 | 7.4 ± 15.7 | <0.001 | 7.2 ± 16.0 | 5.9 ± 13.3 | 0.577 |
| Male | | | 0.299 | | | 0.289 |
| No | 90 (37.5%) | 73 (32.9%) | | 21 (35.0%) | 25 (44.6%) | |
| Yes | 150 (62.5%) | 149 (67.1%) | | 39 (65.0%) | 31 (55.4%) | |
| HBsAg-positive only | | | <0.001 | | | 0.006 |
| No | 192 (80.0%) | 209 (94.1%) | | 44 (73.3%) | 52 (92.9%) | |
| Yes | 48 (20.0%) | 13 (5.9%) | | 16 (26.7%) | 4 (7.1%) | |
| Anti-HCVAb-positive only | | | 0.122 | | | 0.092 |
| No | 75 (31.2%) | 55 (24.8%) | | 25 (41.7%) | 15 (26.8%) | |
| Yes | 165 (68.8%) | 167 (75.2%) | | 35 (58.3%) | 41 (73.2%) | |
| Alcohol consumption | | | 0.045 | | | 0.203 |
| ⩽80 g/day | 192 (80.0%) | 159 (71.6%) | | 52 (86.7%) | 41 (73.2%) | |
| >80 g/day | 22 (9.2%) | 21 (9.5%) | | 4 (6.7%) | 6 (10.7%) | |
| None | 26 (10.8%) | 42 (18.9%) | | 4 (6.7%) | 9 (16.1%) | |
| Child-Pugh score | | | <0.001 | | | 0.019 |
| 0 | 155 (64.6%) | 86 (38.7%) | | 40 (66.7%) | 20 (35.7%) | |
| 1 | 57 (23.8%) | 80 (36.0%) | | 12 (20.0%) | 15 (26.8%) | |
| 2 | 14 (5.8%) | 38 (17.1%) | | 5 (8.3%) | 10 (17.9%) | |
| 3 | 11 (4.6%) | 10 (4.5%) | | 2 (3.3%) | 6 (10.7%) | |
| 4 | 1 (0.4%) | 7 (3.2%) | | 1 (1.7%) | 3 (5.4%) | |
| 5 | 2 (0.8%) | 1 (0.5%) | | 0 (0.0%) | 2 (3.6%) | |
Abbreviations: AFP, alpha-fetoprotein; ALT, alanine aminotransferase; anti-HCVAb, anti-hepatitis C virus antibody; AST, aspartate aminotransferase; HBsAg, hepatitis B surface antigen; PTINR, prothrombin time-international normalized ratio.
Figure 1. Factor correlations.
Figure 2. Variable importance of features included in the machine-learning algorithm for predicting postoperative death outcomes.
AFP indicates alpha-fetoprotein; ALB, albumin; ALT, alanine aminotransferase; AST, aspartate aminotransferase; BMI, body mass index; BW, body weight; HB, hemoglobin; HBsAg, hepatitis B surface antigen; PLT, platelet count; PT, prothrombin time; PTINR, prothrombin time-international normalized ratio; TB, total bilirubin.
Forecasted results for the training group.
| Algorithm | Accuracy | Precision | Recall | AUC |
|---|---|---|---|---|
| Logistic | 0.686 | 0.679 | 0.658 | 0.739 |
| DecisionTree | 0.690 | 0.642 | 0.806 | 0.748 |
| Forest | 0.900 | 0.904 | 0.887 | 0.971 |
| GradientBoosting | 0.831 | 0.795 | 0.874 | 0.914 |
| Gbm | 0.755 | 0.742 | 0.752 | 0.825 |
Abbreviation: AUC, area under the curve.
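For readers unfamiliar with the four reported metrics, the following is a minimal pure-Python sketch of how accuracy, precision, recall, and AUC are conventionally defined for a binary outcome (1 = death, 0 = nondeath). The function names and the toy data are assumptions for illustration; the article does not state which routines produced its figures.

```python
def classification_metrics(y_true, y_pred):
    """Return (accuracy, precision, recall) for binary labels coded 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


def auc(y_true, scores):
    """AUC as the probability that a positive case outscores a negative one.

    Ties count as half a win (the rank-based, Mann-Whitney formulation).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical, not from the study): predicted risk per patient.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # classify at a 0.5 threshold

accuracy, precision, recall = classification_metrics(y_true, y_pred)
toy_auc = auc(y_true, scores)
print(accuracy, precision, recall, toy_auc)
```

Accuracy, precision, and recall depend on the classification threshold, whereas AUC summarizes ranking quality across all thresholds, which is why the two kinds of figures can diverge for the same model.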
Figure 3. Machine-learning algorithm predictions of postoperative death outcomes in the training group.
AUC indicates area under the curve; ROC, receiver operating characteristic.
Forecasted results for the testing group.
| Algorithm | Accuracy | Precision | Recall | AUC |
|---|---|---|---|---|
| Logistic | 0.672 | 0.714 | 0.538 | 0.738 |
| DecisionTree | 0.664 | 0.690 | 0.642 | 0.723 |
| Forest | 0.638 | 0.646 | 0.554 | 0.693 |
| GradientBoosting | 0.664 | 0.689 | 0.571 | 0.714 |
| Gbm | 0.681 | 0.721 | 0.554 | 0.717 |
Abbreviation: AUC, area under the curve.
Figure 4. Machine-learning algorithm predictions of postoperative death outcomes in the test group.
AUC indicates area under the curve; ROC, receiver operating characteristic.
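One way to read the training-group and test-group results side by side is to compute, for each algorithm, the drop in AUC from training to test. The sketch below does only that arithmetic on the AUC values reported in the two tables; the variable names are assumptions for this example, and the gap is an illustrative summary rather than a statistic reported by the authors.

```python
# AUC values copied from the training-group and test-group result tables.
train_auc = {
    "Logistic": 0.739,
    "DecisionTree": 0.748,
    "Forest": 0.971,
    "GradientBoosting": 0.914,
    "Gbm": 0.825,
}
test_auc = {
    "Logistic": 0.738,
    "DecisionTree": 0.723,
    "Forest": 0.693,
    "GradientBoosting": 0.714,
    "Gbm": 0.717,
}

# Training-minus-test difference per algorithm, rounded to three decimals.
auc_gap = {name: round(train_auc[name] - test_auc[name], 3) for name in train_auc}

largest_gap = max(auc_gap, key=auc_gap.get)     # algorithm with the biggest drop
best_on_test = max(test_auc, key=test_auc.get)  # algorithm with the highest test AUC

for name, gap in sorted(auc_gap.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: train {train_auc[name]:.3f}, test {test_auc[name]:.3f}, gap {gap:.3f}")
print("largest gap:", largest_gap)
print("best on test:", best_on_test)
```

On the reported figures, the random forest shows the largest drop (0.971 to 0.693), while logistic regression is nearly unchanged (0.739 to 0.738) and has the highest test-group AUC.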