Runmin Wei, Jingye Wang, Xiaoning Wang, Guoxiang Xie, Yixing Wang, Hua Zhang, Cheng-Yuan Peng, Cynthia Rajani, Sandi Kwee, Ping Liu, Wei Jia.
Abstract
Clinical prediction of advanced hepatic fibrosis (HF) and cirrhosis has long been challenging because the gold standard, liver biopsy, is invasive and has certain limitations. A less invasive blood test used in tandem with a cutting-edge machine learning algorithm shows promising diagnostic potential. In this study, we constructed machine learning methods and compared them with the FIB-4 score in a discovery dataset (n = 490) of hepatitis B virus (HBV) patients. Models were validated in an independent HBV dataset (n = 86). We further applied these models to two independent hepatitis C virus (HCV) datasets (n = 254 and 230) to examine their applicability. In the discovery data, gradient boosting (GB) stably outperformed the other methods as well as FIB-4 scores (p < .001) in the prediction of advanced HF and cirrhosis. In the HBV validation dataset, for classification between early and advanced HF, the area under the receiver operating characteristic curve (AUROC) of the GB model was 0.918, while that of FIB-4 was 0.841; for classification between non-cirrhosis and cirrhosis, GB showed an AUROC of 0.871, while that of FIB-4 was 0.830. Additionally, GB-based prediction demonstrated good classification capacity on the two HCV datasets, although higher cutoffs for both GB and FIB-4 scores were required to achieve comparable specificity and sensitivity. Using the same parameters as FIB-4, the GB-based prediction system demonstrated steady improvements relative to FIB-4 in HBV and HCV cohorts, with different cutoff values required in different etiological groups. A user-friendly web tool, LiveBoost, makes our prediction models freely accessible for further clinical studies and applications.
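For readers unfamiliar with the baseline, the following is a minimal sketch of how a FIB-4 score can be computed, assuming the standard definition, (age × AST) / (PLT × √ALT), and the two commonly applied cutoffs of 1.45 and 3.25. The function names, the band labels, and the example patient are hypothetical illustrations and are not taken from the study.

```python
import math


def fib4(age_years, ast_u_l, alt_u_l, plt_1e9_l):
    """Return the FIB-4 index: (age x AST) / (PLT x sqrt(ALT)).

    Units assumed here: age in years, AST and ALT in U/L, PLT in 10^9/L.
    """
    if alt_u_l <= 0 or plt_1e9_l <= 0:
        raise ValueError("ALT and PLT must be positive")
    return (age_years * ast_u_l) / (plt_1e9_l * math.sqrt(alt_u_l))


def fib4_band(score, low=1.45, high=3.25):
    """Place a score relative to the two cutoffs (labels are illustrative)."""
    if score < low:
        return "below low cutoff"
    if score > high:
        return "above high cutoff"
    return "between cutoffs"


# Hypothetical patient, not from the study data.
score = fib4(age_years=40, ast_u_l=50, alt_u_l=25, plt_1e9_l=200)
print(round(score, 2), fib4_band(score))
```

Because the GB models take the same four inputs, a function with this signature could in principle be swapped for a trained model's scoring call, although the cutoffs would differ.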
Keywords: FIB-4; Gradient boosting; Hepatic fibrosis; Hepatitis B; Hepatitis C; Machine learning
Year: 2018 PMID: 30100397 PMCID: PMC6154783 DOI: 10.1016/j.ebiom.2018.07.041
Source DB: PubMed Journal: EBioMedicine ISSN: 2352-3964 Impact factor: 8.143
Fig. 1 Flowchart of the study design. In step 1 (model selection), we split the discovery set into training and testing sets 100 times, trained decision tree (DT), random forest (RF), and GB models on the training sets, and compared their results with FIB-4 on the testing sets. In step 2, we constructed the final GB models, compared their results with FIB-4 on the whole discovery set, and then validated them on the HBV validation set. In step 3, the GB models and FIB-4 were used to predict risks for two additional HCV cohorts. In step 4, we developed a user-friendly web tool for clinical practice.
Fig. 2 Boxplots of the area under the precision-recall curve (AUPR) and AUROC on the testing sets for the four methods. P-values were calculated using Student's t-tests.
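As an illustration of the AUROC metric reported here, the sketch below computes it with the rank-based (Mann-Whitney) formulation: the fraction of positive-negative pairs in which the positive case receives the higher score, with ties counted as one half. This is a generic pure-Python sketch, not the authors' evaluation code; the function name and the toy data are hypothetical.

```python
def auroc(labels, scores):
    """AUROC as the probability that a random positive outranks a random negative.

    labels: iterable of 0/1 values (1 = positive class).
    scores: iterable of numeric scores, higher meaning more likely positive.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive correctly ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Toy data, not from the study.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(auroc(toy_labels, toy_scores))
```

The pairwise loop is quadratic in the number of samples, which is acceptable for cohorts of a few hundred patients; a sort-based implementation would scale better.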
Clinical and demographic characteristics of the HBV cohorts.
| Data | HF stage | Total Num | Num of M | Num of F | BMI (kg/m^2) | Age (years) | AST (U/L) | ALT (U/L) | PLT (10^9/L) |
|---|---|---|---|---|---|---|---|---|---|
| Discovery Set (HBV) | 0 | 46 | 39 | 7 | 22.1 (20.3–23.5) | 32 (27–40) | 49 (35–66) | 106 (58–171) | 190 (161–215) |
| | 1 | 169 | 125 | 44 | 21.2 (19.5–24.1) | 30 (25–38) | 58 (39–99) | 114 (65–190) | 179 (155–214) |
| | 2 | 134 | 93 | 41 | 21.6 (20.1–24.0) | 31 (27–39) | 74 (43–138) | 155 (80–267) | 176 (150–210) |
| | 3 | 56 | 47 | 9 | 22.5 (20.9–25.0) | 39 (29–47) | 62 (44–112) | 90 (56–250) | 148 (108–182) |
| | 4 | 85 | 53 | 32 | 22.5 (20.9–24.5) | 50 (40–58) | 45 (31–77) | 45 (28–100) | 86 (43–121) |
| Validation Set (HBV) | 0 | 15 | 7 | 8 | 23.2 (21.2–24.0) | 35 (28–40) | 40 (23–67) | 65 (33–100) | 173 (152–193) |
| | 1 | 21 | 14 | 6 | 22.5 (21.3–24.8) | 31 (26–45) | 67 (36–128) | 98 (77–183) | 193 (174–221) |
| | 2 | 12 | 7 | 5 | 22.4 (21.5–23.3) | 39 (34–43) | 50 (40–97) | 76 (52–357) | 161 (145–178) |
| | 3 | 11 | 8 | 3 | 21.5 (20.5–22.8) | 40 (31–49) | 35 (33–53) | 40 (31–95) | 108 (93–118) |
| | 4 | 27 | 18 | 9 | 22.4 (20.1–23.9) | 45 (37–56) | 43 (32–70) | 35 (28–82) | 74 (40–98) |
Continuous variables are displayed as median (25th–75th percentile). Num, number; F, female; M, male.
Fig. 3 Classification performance of GB and FIB-4 on the discovery set and the HBV validation set. (A) ROC curves of GB and FIB-4 in advanced HF detection (left panel) and cirrhosis detection (right panel). (B) Specificity and sensitivity, with their 95% CIs, of GB and FIB-4 scores in advanced HF detection (left panel) and cirrhosis detection (right panel). We selected the best GB cutoff based on the Youden index for the discovery set and used two commonly applied FIB-4 cutoffs (1.45 and 3.25).
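The Youden index used for cutoff selection is J = sensitivity + specificity − 1, and the chosen cutoff is the one that maximizes J. The sketch below shows one simple way to do this, assuming that scores at or above the cutoff are called positive and that candidate cutoffs are the observed score values. It is not the authors' code; the function names and the toy data are hypothetical.

```python
def sens_spec(labels, scores, cutoff):
    """Sensitivity and specificity when score >= cutoff is called positive."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= cutoff)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < cutoff)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < cutoff)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= cutoff)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    return tp / (tp + fn), tn / (tn + fp)


def youden_cutoff(labels, scores):
    """Return (cutoff, J) maximizing J = sensitivity + specificity - 1.

    Candidate cutoffs are the distinct observed scores; on ties the
    lowest cutoff is kept (an arbitrary choice made for this sketch).
    """
    best_cutoff, best_j = None, float("-inf")
    for c in sorted(set(scores)):
        sens, spec = sens_spec(labels, scores, c)
        j = sens + spec - 1
        if j > best_j:
            best_cutoff, best_j = c, j
    return best_cutoff, best_j


# Toy data, not from the study.
toy_labels = [0, 0, 0, 0, 1, 1, 1]
toy_scores = [0.1, 0.2, 0.3, 0.6, 0.5, 0.7, 0.9]
print(youden_cutoff(toy_labels, toy_scores))
```

A cutoff chosen this way is tuned to the data it was selected on, which is one reason to report performance on a separate validation set.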
Fig. 4 FIB-4 and GB scores in four independent cohorts, compared between stages S0–2 and S3–4. P-values were calculated using Student's t-tests.
Fig. 5 A screenshot of the web tool (LiveBoost).