Monica A Konerman, Lauren A Beste, Tony Van, Boang Liu, Xuefei Zhang, Ji Zhu, Sameer D Saini, Grace L Su, Brahmajee K Nallamothu, George N Ioannou, Akbar K Waljee.
Abstract
BACKGROUND: Machine learning (ML) algorithms provide effective ways to build prediction models from longitudinal information, given their capacity to incorporate numerous predictor variables without compromising the accuracy of the risk prediction. Clinical risk prediction in chronic hepatitis C virus (CHC) infection can be challenging due to the non-linear nature of disease progression. We developed and compared two ML algorithms to predict cirrhosis development in a large CHC-infected cohort using longitudinal data. METHODS AND
Year: 2019 PMID: 30608929 PMCID: PMC6319806 DOI: 10.1371/journal.pone.0208141
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Baseline characteristics.
| Variable | Summary statistics |
|---|---|
| Primary outcome event | 11,616 (16%) |
| Followup years (mean (sd)) | 7.00 (4.01) |
| Age at enrollment (mean (sd)) | 52.84 (8.74) |
| Male (%) | 70,377 (96.8) |
| Race (%) | |
| WHITE | 35,216 (52.9%) |
| BLACK OR AFRICAN AMERICAN | 27,081 (40.7%) |
| HISPANIC | 3,101 (4.7%) |
| OTHER | 1,215 (1.8%) |
| Albumin g/dl (mean (sd)) | 3.96 (0.46) |
| Alkaline phosphatase U/L (mean (sd)) | 86.11 (38.01) |
| ALT U/L (mean (sd)) | 61.64 (58.14) |
| APRI (mean (sd)) | 0.62 (0.55) |
| AST U/L (mean (sd)) | 49.94 (32.64) |
| Bilirubin mg/dl (mean (sd)) | 0.69 (0.41) |
| Body-mass-index (mean (sd)) | 27.17 (5.32) |
| Creatinine mg/dl (mean (sd)) | 1.08 (0.87) |
| INR (mean (sd)) | 1.06 (0.30) |
| Platelet count 1000/uL (mean (sd)) | 233.02 (74.88) |
| Sodium mmol/L (mean (sd)) | 138.85 (3.24) |
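The APRI row above is a derived score, the AST-to-platelet ratio index. As a minimal sketch of the standard APRI formula, assuming an AST upper limit of normal of 40 U/L (the source does not state which limit was used, and the function and parameter names here are hypothetical):

```python
def apri(ast_u_per_l, platelets_k_per_ul, ast_uln=40.0):
    """AST-to-platelet ratio index.

    APRI = (AST / upper limit of normal) * 100 / platelet count,
    with platelets in 10^9/L (numerically the same as 1000/uL).
    ast_uln=40.0 is an assumed default, not a value taken from the source.
    """
    if platelets_k_per_ul <= 0 or ast_uln <= 0:
        raise ValueError("platelet count and AST upper limit must be positive")
    return (ast_u_per_l / ast_uln) * 100.0 / platelets_k_per_ul


# Illustration only: plug in the cohort means from the table above.
score_at_means = apri(ast_u_per_l=49.94, platelets_k_per_ul=233.02)
print(f"APRI at cohort means: {score_at_means:.2f}")
```

Under these assumptions the score at the cohort means is about 0.54, which need not equal the reported mean APRI of 0.62: the mean of per-patient ratios generally differs from the ratio of means, and the upper limit of normal used in the study may differ from the assumed 40 U/L.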
Performance of Cox models.
| Cox Model | Concordance | AuROC 1 year | AuROC 3 years | AuROC 5 years |
|---|---|---|---|---|
| Cross-sectional (CS) | 0.746 (0.003) | 0.801 (0.008) | 0.784 (0.005) | 0.774 (0.003) |
| Longitudinal (LGT) | 0.764 (0.003) | 0.820 (0.007) | 0.803 (0.005) | 0.794 (0.003) |
| | 2 | 1 | 6 | 1 |
* mean (standard deviation), 95% confidence interval
Performance of boosting models.
| Boosting Model | Concordance | AuROC 1 year | AuROC 3 years | AuROC 5 years |
|---|---|---|---|---|
| Cross-sectional (CS) | 0.758 (0.003) | 0.811 (0.008) | 0.797 (0.005) | 0.787 (0.005) |
| Longitudinal (LGT) | 0.774 (0.003) | 0.830 (0.007) | 0.814 (0.005) | 0.805 (0.004) |
| | 8 | 4 | 3 | 1 |
* mean (standard deviation), 95% confidence interval
Misclassification Table.
| Time | Test Sample Size | Event Proportion | Model | AuROC | Best cut-off | Specificity | Sensitivity | PPV | NPV |
|---|---|---|---|---|---|---|---|---|---|
| 1 year | 18896 | 0.036 | CS Cox | 0.807 | 0.041 | 0.79 | 0.71 | 0.11 | 0.99 |
| | | | CS Boosting | 0.817 | 0.037 | 0.77 | 0.73 | 0.11 | 0.99 |
| | | | LGT Cox | 0.828 | 0.037 | 0.75 | 0.76 | 0.10 | 0.99 |
| | | | LGT Boosting | 0.838 | 0.035 | 0.76 | 0.77 | 0.11 | 0.99 |
| 3 years | 14605 | 0.112 | CS Cox | 0.784 | 0.095 | 0.73 | 0.72 | 0.25 | 0.95 |
| | | | CS Boosting | 0.799 | 0.091 | 0.76 | 0.71 | 0.27 | 0.95 |
| | | | LGT Cox | 0.804 | 0.095 | 0.75 | 0.74 | 0.27 | 0.96 |
| | | | LGT Boosting | 0.815 | 0.090 | 0.76 | 0.73 | 0.28 | 0.96 |
| 5 years | 11334 | 0.206 | CS Cox | 0.775 | 0.151 | 0.74 | 0.70 | 0.41 | 0.90 |
| | | | CS Boosting | 0.790 | 0.138 | 0.75 | 0.70 | 0.42 | 0.91 |
| | | | LGT Cox | 0.794 | 0.151 | 0.75 | 0.71 | 0.42 | 0.91 |
| | | | LGT Boosting | 0.805 | 0.128 | 0.73 | 0.74 | 0.41 | 0.92 |
(CS) cross-sectional; (LGT) longitudinal; (PPV) positive predictive value; (NPV) negative predictive value.
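
The PPV and NPV columns follow from sensitivity, specificity, and the event proportion via Bayes' rule, which explains why PPV is low when events are rare even though the AuROC is high. The sketch below recomputes them for the first CS Cox row (event proportion 0.036, specificity 0.79, sensitivity 0.71). It is an illustration rather than the authors' code, and the function name is hypothetical.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) from sensitivity, specificity and event proportion."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# First CS Cox row of the misclassification table.
ppv, npv = predictive_values(sensitivity=0.71, specificity=0.79, prevalence=0.036)
print(f"PPV = {ppv:.2f}, NPV = {npv:.2f}")  # PPV = 0.11, NPV = 0.99
```

Because the table reports inputs rounded to two decimals, recomputed values for other rows can differ from the tabulated PPV or NPV in the last digit.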