Fasiha Kanwal, Thomas J Taylor, Jennifer R Kramer, Yumei Cao, Donna Smith, Allen L Gifford, Hashem B El-Serag, Aanand D Naik, Steven M Asch.
Abstract
Importance: Machine-learning algorithms offer better predictive accuracy than traditional prognostic models but are too complex and opaque for clinical use.

Objective: To compare different machine learning methods in predicting overall mortality in cirrhosis and to use machine learning to select easily scored clinical variables for a novel cirrhosis prognostic model.

Design, Setting, and Participants: This prognostic study used a retrospective cohort of adult patients with cirrhosis or its complications seen in 130 hospitals and affiliated ambulatory clinics in the integrated, national Veterans Affairs health care system from October 1, 2011, to September 30, 2015. Patients were followed up through December 31, 2018. Data were analyzed from October 1, 2017, to May 31, 2020.

Exposures: Potential predictors included demographic characteristics; liver disease etiology, severity, and complications; use of health care resources; comorbid conditions; and comprehensive laboratory and medication data. Patients were randomly selected for model development (66.7%) and validation (33.3%). Three different statistical and machine learning methods were evaluated: gradient descent boosting, logistic regression with least absolute shrinkage and selection operator (LASSO) regularization, and logistic regression with LASSO constrained to select no more than 10 predictors (partial pathway model). Predictor inclusion and model performance were evaluated in a 5-fold cross-validation. Last, the predictors identified in the most parsimonious (the partial path) model were refit using maximum-likelihood estimation (Cirrhosis Mortality Model [CiMM]), and its predictive performance was compared with that of the widely used Model for End Stage Liver Disease with sodium (MELD-Na) score.

Main Outcomes and Measures: All-cause mortality.
Year: 2020 PMID: 33141161 PMCID: PMC7610191 DOI: 10.1001/jamanetworkopen.2020.23780
Source DB: PubMed Journal: JAMA Netw Open ISSN: 2574-3805
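The abstract describes a random split into development (66.7%) and validation (33.3%) sets followed by 5-fold cross-validation. The following is a minimal sketch of that kind of partitioning, not the authors' code: the function name `split_and_folds`, the fixed seed, and the round-robin fold assignment are all assumptions made for illustration.

```python
import random


def split_and_folds(patient_ids, dev_fraction=2 / 3, n_folds=5, seed=0):
    """Randomly split IDs into development and validation sets, then
    assign each development ID to one of n_folds cross-validation folds."""
    rng = random.Random(seed)  # fixed seed is an assumption, for reproducibility
    ids = list(patient_ids)
    rng.shuffle(ids)
    n_dev = round(len(ids) * dev_fraction)
    development, validation = ids[:n_dev], ids[n_dev:]
    # Round-robin slicing of the shuffled list gives folds of near-equal size.
    folds = [development[k::n_folds] for k in range(n_folds)]
    return development, validation, folds


# Hypothetical toy cohort of 30 patient IDs.
development, validation, folds = split_and_folds(range(30))
print(len(development), len(validation), [len(f) for f in folds])
```

In each cross-validation round, one fold would be held out for evaluation and the remaining four used for fitting; the validation set stays untouched until the final comparison.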
Baseline Characteristics of 107 939 Patients With Cirrhosis
| Characteristic | Data |
|---|---|
| Age, mean (SD), y | 62.7 (9.6) |
| Race/ethnicity | |
| White | 71 563 (66.3) |
| Black | 19 852 (18.4) |
| Hispanic | 6376 (5.9) |
| Other | 3005 (2.8) |
| Sex | |
| Female | 3623 (3.4) |
| Male | 104 316 (96.6) |
| Marital status | |
| Divorced or separated | 47 981 (44.5) |
| Married | 45 792 (42.4) |
| Single or never married | 14 020 (13.0) |
| Etiology of cirrhosis | |
| HCV infection alone | 14 286 (13.2) |
| HCV and alcohol | 26 011 (24.1) |
| Alcohol alone | 34 112 (31.6) |
| Nonalcoholic steatohepatitis | 29 140 (27.0) |
| HBV infection | 3427 (3.2) |
| MELD-Na score | |
| <10 | 36 600 (33.9) |
| 10-20 | 29 442 (27.3) |
| >20 | 6329 (5.9) |
| Missing | 35 568 (32.9) |
| Cirrhosis complications | |
| Hepatic encephalopathy | 21 556 (20.0) |
| Ascites | 21 770 (20.2) |
| Varices | 17 631 (16.3) |
| Hepatocellular cancer | 8150 (7.6) |
| Laboratory test results, mean (SD) | |
| Sodium level, mEq/L | 137.7 (3.8) |
| Creatinine level, mg/dL | 1.2 (1.0) |
| Bilirubin level, mg/dL | 1.6 (2.7) |
| Albumin level, g/dL | 3.5 (0.7) |
| Platelet count, ×10³/μL | 166.5 (92.7) |
| Hemoglobin level, g/dL | 13.0 (2.3) |
| Physical health conditions | |
| Diabetes | 54 137 (50.2) |
| Chronic obstructive pulmonary disease | 17 326 (16.1) |
| Heart failure | 11 332 (10.5) |
| Cancer | 18 164 (16.8) |
| Chronic kidney disease | 10 872 (10.1) |
| CirCom score | |
| 0 | 25 649 (23.8) |
| 1 + 0 | 28 853 (26.7) |
| 1 + 1 | 20 362 (18.9) |
| 3 + 0 | 5813 (5.4) |
| 3 + 1 | 23 807 (22.1) |
| 5 + 0 | 109 (0.1) |
| 5 + 1 | 3346 (3.1) |
| Use of health care resources | |
| Hospitalization due to any cause in past year | 44 143 (40.9) |
| Hospitalization with primary diagnosis of cirrhosis in past year | 10 560 (9.8) |
| ≥1 Emergency department visit in the past year | 46 450 (43.0) |
| ≥3 Outpatient visits in the past year | 98 173 (91.0) |
Abbreviations: CirCom, cirrhosis-specific comorbidity score; HBV, hepatitis B virus; HCV, hepatitis C virus.
SI conversion factors: To convert albumin to g/L, multiply by 10.0; bilirubin to μmol/L, multiply by 17.104; creatinine to μmol/L, multiply by 88.4; hemoglobin to g/L, multiply by 10.0; platelet count to ×10⁹/L, multiply by 1.0; sodium to mmol/L, multiply by 1.0.
Unless otherwise indicated, data are expressed as number (percentage) of patients. Owing to missing data, percentages may not total 100.
For the MELD-Na score, higher scores indicate more severe liver disease.
The CirCom score uses a specific set of International Statistical Classification of Diseases and Related Health Problems, Tenth Revision, codes to define the conditions. We mapped these to International Classification of Diseases, Ninth Revision, Clinical Modification, codes to define clinical conditions (eTable 1 in the Supplement). We used the Agency for Healthcare Research and Quality Clinical Classifications Software to define the conditions that were not part of the CirCom score (eg, diabetes, depression, anxiety, and alcohol use). Nonmetastatic cancer, metastatic cancer, hematologic cancer, substance abuse other than alcoholism, epilepsy, acute myocardial infarction, heart failure, peripheral arterial disease, chronic obstructive pulmonary disease, and chronic kidney disease were identified using the most recent inpatient or outpatient diagnoses given in the 5 years before the index date. The CirCom score was calculated by the algorithm developed and validated by Jepsen et al.[21]
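The SI conversion factors listed in the table footnotes can be applied mechanically. The sketch below copies the factors from the footnote and converts a few of the baseline means from the table; the dictionary layout and the helper name `to_si` are illustrative assumptions.

```python
# Conversion factors copied from the table footnote (conventional unit -> SI unit).
SI_FACTORS = {
    "albumin": (10.0, "g/L"),         # from g/dL
    "bilirubin": (17.104, "umol/L"),  # from mg/dL
    "creatinine": (88.4, "umol/L"),   # from mg/dL
    "hemoglobin": (10.0, "g/L"),      # from g/dL
    "platelets": (1.0, "x10^9/L"),    # from x10^3/uL
    "sodium": (1.0, "mmol/L"),        # from mEq/L
}


def to_si(analyte, value):
    """Return (converted value, SI unit) for one laboratory result."""
    factor, unit = SI_FACTORS[analyte]
    return value * factor, unit


# Baseline means taken from the table above.
baseline_means = {"albumin": 3.5, "bilirubin": 1.6, "creatinine": 1.2}
si_means = {name: to_si(name, value) for name, value in baseline_means.items()}
for name, (value, unit) in si_means.items():
    print(f"{name}: {value:.1f} {unit}")
```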
Figure 1. Annual and Cumulative Incidence of All-Cause Mortality
Includes 107 939 participants at inception. Annual all-cause mortality was 8.8% at 1 year, 15.3% at 2 years, 12.8% at 3 years, 11.5% at 4 years, 10.9% at 5 years, 10.4% at 6 years, 9.8% at 7 years, and 10.3% at 8 years. Cumulative all-cause mortality was 8.8% at 1 year, 22.8% at 2 years, 32.7% at 3 years, 40.4% at 4 years, 46.2% at 5 years, 50.2% at 6 years, 52.6% at 7 years, and 53.4% at 8 years.
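One way to relate the annual and cumulative figures is to compound the annual values. The sketch below assumes that each annual figure is the risk of death among patients still alive at the start of that year; the caption does not state this, so it is an assumption. Under it, cumulative mortality is one minus the product of the annual survival fractions. The published cumulative values need not match this reconstruction exactly, for example because of censoring or rounding.

```python
# Annual all-cause mortality from the figure caption, years 1 through 8.
annual = [0.088, 0.153, 0.128, 0.115, 0.109, 0.104, 0.098, 0.103]


def cumulative_from_annual(annual_risks):
    """Compound conditional annual risks into cumulative mortality."""
    surviving = 1.0
    cumulative = []
    for risk in annual_risks:
        surviving *= 1.0 - risk  # fraction still alive at the end of the year
        cumulative.append(1.0 - surviving)
    return cumulative


cumulative = cumulative_from_annual(annual)
for year, value in enumerate(cumulative, start=1):
    print(f"year {year}: {100 * value:.1f}%")
```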
Discrimination and Calibration of 3 Modeling Approaches
| Method | No. of patients | Person-year | AUC (95% CI) discrimination | Brier score calibration (95% CI) |
|---|---|---|---|---|
| Extreme gradient boosting | 32 437 | 1 | 0.81 (0.80-0.82) | 0.07 (0.07-0.07) |
| | 29 603 | 2 | 0.78 (0.77-0.79) | 0.11 (0.11-0.11) |
| | 25 084 | 3 | 0.74 (0.73-0.75) | 0.10 (0.10-0.11) |
| | 21 908 | 4 | 0.72 (0.71-0.73) | 0.10 (0.09-0.10) |
| | 17 309 | 5 | 0.72 (0.71-0.73) | 0.09 (0.09-0.10) |
| | 12 341 | 6 | 0.72 (0.70-0.73) | 0.09 (0.08-0.09) |
| | 7940 | 7 | 0.71 (0.69-0.73) | 0.09 (0.08-0.09) |
| | 2715 | 8 | 0.72 (0.69-0.75) | 0.09 (0.07-0.10) |
| Full discrete time-to-event logistic regression with LASSO | 32 437 | 1 | 0.78 (0.77-0.79) | 0.07 (0.07-0.08) |
| | 29 603 | 2 | 0.76 (0.75-0.77) | 0.11 (0.11-0.12) |
| | 25 084 | 3 | 0.72 (0.71-0.73) | 0.10 (0.10-0.11) |
| | 21 908 | 4 | 0.70 (0.69-0.71) | 0.10 (0.09-0.10) |
| | 17 309 | 5 | 0.69 (0.68-0.71) | 0.09 (0.09-0.10) |
| | 12 341 | 6 | 0.69 (0.67-0.70) | 0.09 (0.08-0.09) |
| | 7940 | 7 | 0.69 (0.68-0.71) | 0.09 (0.08-0.09) |
| | 2715 | 8 | 0.69 (0.66-0.72) | 0.09 (0.08-0.10) |
| Partial discrete time-to-event logistic regression with LASSO | 32 437 | 1 | 0.78 (0.76-0.78) | 0.07 (0.07-0.08) |
| | 29 603 | 2 | 0.76 (0.74-0.76) | 0.12 (0.11-0.12) |
| | 25 084 | 3 | 0.71 (0.70-0.72) | 0.10 (0.10-0.11) |
| | 21 908 | 4 | 0.68 (0.67-0.69) | 0.10 (0.09-0.10) |
| | 17 309 | 5 | 0.67 (0.66-0.69) | 0.09 (0.09-0.10) |
| | 12 341 | 6 | 0.67 (0.65-0.68) | 0.09 (0.08-0.09) |
| | 7940 | 7 | 0.68 (0.66-0.70) | 0.09 (0.08-0.09) |
| | 2715 | 8 | 0.67 (0.64-0.70) | 0.09 (0.08-0.10) |
Abbreviations: AUC, area under the receiver operating characteristics curve; LASSO, least absolute shrinkage and selection operator.
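For readers less familiar with the two metrics in this table, the sketch below computes both on toy data. AUC is taken as the probability that a patient who died was assigned a higher predicted risk than a patient who survived, and the Brier score as the mean squared difference between predicted probability and observed outcome. The data are hypothetical and the pairwise AUC loop is chosen for clarity, not for cohorts of this size.

```python
def auc(y_true, y_score):
    """AUC as the fraction of (died, survived) pairs in which the patient
    who died has the higher predicted risk; ties count as one half."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


# Hypothetical toy data: 1 = died within the interval, 0 = survived.
died = [1, 0, 1, 0, 0, 1]
risk = [0.9, 0.2, 0.6, 0.7, 0.1, 0.4]

toy_auc = auc(died, risk)
toy_brier = brier_score(died, risk)
print(f"AUC = {toy_auc:.3f}, Brier = {toy_brier:.3f}")
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking; for the Brier score, lower is better and 0 is perfect.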
Figure 2. Associations Between Clinical Factors Included in Cirrhosis Mortality Model and Time to Death
ALT indicates alanine aminotransferase; AST, aspartate aminotransferase; CirCom, cirrhosis-specific comorbidity score; and RR, relative risk.
Comparison of Discrimination Between the CiMM and MELD-Na Score
| Person-year | No. of patients | CiMM, AUC (95% CI) | MELD-Na score, AUC (95% CI) | DeLong test, statistic | DeLong test, P value |
|---|---|---|---|---|---|
| 1 | 107 939 | 0.78 (0.77-0.79) | 0.67 (0.66-0.68) | 17.00 | <.001 |
| 2 | 98 419 | 0.76 (0.75-0.77) | 0.65 (0.64-0.66) | 21.29 | <.001 |
| 3 | 83 368 | 0.72 (0.71-0.73) | 0.61 (0.60-0.62) | 17.44 | <.001 |
| 4 | 72 665 | 0.69 (0.68-0.70) | 0.60 (0.59-0.61) | 14.54 | <.001 |
| 5 | 57 301 | 0.69 (0.67-0.70) | 0.59 (0.58-0.61) | 12.79 | <.001 |
| 6 | 41 087 | 0.68 (0.66-0.69) | 0.58 (0.56-0.60) | 10.95 | <.001 |
| 7 | 26 441 | 0.69 (0.67-0.71) | 0.60 (0.58-0.62) | 7.29 | <.001 |
| 8 | 9085 | 0.68 (0.65-0.71) | 0.57 (0.53-0.61) | 4.47 | <.001 |
Abbreviations: AUC, area under the receiver operating characteristics curve; CiMM, cirrhosis mortality model; MELD-Na, Model for End Stage Liver Disease with sodium.
The CiMM was fit by maximum-likelihood estimation using the predictors identified by the partial discrete time-to-event logistic regression model with least absolute shrinkage and selection operator.
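The DeLong test in the table compares two AUCs estimated on the same patients. As an illustration only, the sketch below implements the usual structural-components formulation of the test for two correlated scores and returns a z statistic with a two-sided P value from the normal distribution. It is not the authors' code: the function names and toy data are hypothetical, and the pairwise loops are only practical for small samples.

```python
import math


def _psi(x, y):
    """Pairwise kernel: 1 if the case outranks the control, 0.5 on ties."""
    return 1.0 if x > y else 0.5 if x == y else 0.0


def _components(pos, neg):
    """Per-case (v10) and per-control (v01) structural components."""
    v10 = [sum(_psi(x, y) for y in neg) / len(neg) for x in pos]
    v01 = [sum(_psi(x, y) for x in pos) / len(pos) for y in neg]
    return v10, v01


def _cov(a, b):
    """Sample covariance with an n - 1 denominator."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)


def delong_test(y_true, score_a, score_b):
    """Return (auc_a, auc_b, z, two-sided p) for two paired scores."""
    pos_idx = [i for i, y in enumerate(y_true) if y == 1]
    neg_idx = [i for i, y in enumerate(y_true) if y == 0]
    m, n = len(pos_idx), len(neg_idx)
    a10, a01 = _components([score_a[i] for i in pos_idx],
                           [score_a[i] for i in neg_idx])
    b10, b01 = _components([score_b[i] for i in pos_idx],
                           [score_b[i] for i in neg_idx])
    auc_a, auc_b = sum(a10) / m, sum(b10) / m
    # Variance of the AUC difference, combining case and control components.
    var = ((_cov(a10, a10) + _cov(b10, b10) - 2 * _cov(a10, b10)) / m
           + (_cov(a01, a01) + _cov(b01, b01) - 2 * _cov(a01, b01)) / n)
    z = (auc_a - auc_b) / math.sqrt(var)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal P value
    return auc_a, auc_b, z, p


# Hypothetical toy data: 4 deaths, 4 survivors, two competing risk scores.
y = [1, 1, 1, 1, 0, 0, 0, 0]
score_a = [0.9, 0.8, 0.7, 0.4, 0.5, 0.3, 0.2, 0.1]
score_b = [0.6, 0.3, 0.7, 0.2, 0.5, 0.4, 0.8, 0.1]
auc_a, auc_b, z, p = delong_test(y, score_a, score_b)
print(f"AUC A = {auc_a:.4f}, AUC B = {auc_b:.4f}, z = {z:.2f}, p = {p:.3f}")
```

The sketch does not guard against a zero variance, which occurs when the two scores rank all patients identically.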