Yonghui Wu, Xi Yang, Heather L Morris, Matthew J Gurka, Elizabeth A Shenkman, Kenneth Cusi, Fernando Bril, William T Donahoo.
Abstract
BACKGROUND: Nonalcoholic steatohepatitis (NASH), advanced fibrosis, and subsequent cirrhosis and hepatocellular carcinoma are becoming the most common etiologies of liver failure and liver transplantation; however, at the potentially reversible stages they can be diagnosed only with a liver biopsy, which carries a risk of complications and is expensive. Distinguishing the more benign isolated steatosis from the more severe NASH and cirrhosis tells the physician whether more aggressive management is needed.
Keywords: fatty liver; liver fibrosis; machine learning; nonalcoholic fatty liver disease; nonalcoholic steatohepatitis
Year: 2022; PMID: 35666557; PMCID: PMC9210198; DOI: 10.2196/36997
Source DB: PubMed; Journal: JMIR Med Inform
Baseline characteristics of patients with and without nonalcoholic steatohepatitis (N=492).
| Characteristic | Patients with NASH [a] (n=198) | Patients without NASH (n=294) | P value [b] |
| --- | --- | --- | --- |
| Age, years, mean (SD) | 55 (10) | 54 (11) | .22 |
| Males, n (%) | 142 (72) | 214 (73) | .88 |
| Race or ethnicity, n (%) | | | <.001 |
| Caucasian | 109 (55) | 126 (43) | |
| Hispanic | 73 (37) | 107 (36) | |
| African American | 11 (5.5) | 55 (19) | |
| Asian | 3 (1.5) | 4 (1) | |
| Indian | 0 (0) | 2 (1) | |
| Pacific Islander | 2 (1) | 0 (0) | |
| BMI, kg/m², mean (SD) | 34.1 (4.7) | 33 (5.5) | .02 |
| SBP [c], mmHg, mean (SD) | 134 (16) | 134 (17) | .93 |
| DBP [d], mmHg, mean (SD) | 79 (10) | 78 (10) | .57 |
| Total cholesterol, mg/dL, mean (SD) | 183 (44) | 168 (38) | <.001 |
| TG [e], mg/dL, mean (SD) | 202 (148) | 137 (85) | <.001 |
| LDL-C [f], mg/dL, mean (SD) | 106 (36) | 98 (34) | .03 |
| HDL-C [g], mg/dL, mean (SD) | 39 (11) | 43 (13) | <.001 |
| A1c [h], %, mean (SD) | 6.8 (1.3) | 6.5 (1.2) | .004 |
| AST [i], IU [j]/L, mean (SD) | 47 (26) | 28 (14) | <.001 |
| ALT [k], IU/L, mean (SD) | 64 (37) | 37 (27) | <.001 |
| Bilirubin, mg/dL, mean (SD) | 0.9 (0.5) | 0.8 (0.4) | .003 |
| Platelets, 10⁹/L, mean (SD) | 257 (84) | 237 (63) | .006 |
| Albumin, g/dL, mean (SD) | 4.2 (0.3) | 4.1 (0.4) | .005 |
| TSH [l], mIU/L, mean (SD) | 2.31 (1.51) | 2.05 (2.41) | .14 |
| FPG [m], mg/dL, mean (SD) | 136 (39) | 127 (40) | .01 |
| Glucose tolerance status, n (%) | | | <.001 |
| Type 2 diabetes | 144 (73) | 181 (62) | |
| Impaired glucose tolerance | 41 (21) | 48 (16) | |
| Impaired fasting glucose | 7 (3) | 36 (12) | |
| Normal glucose tolerance | 6 (3) | 29 (10) | |
| Presence of metabolic syndrome, n (%) | 191 (96) | 247 (84) | <.001 |
| Presence of dyslipidemia, n (%) | 180 (91) | 206 (70) | <.001 |
| Use of blood pressure medications, n (%) | 159 (80) | 181 (62) | <.001 |
| Use of statins, n (%) | 103 (52) | 154 (52) | .99 |
| Use of metformin, n (%) | 92 (46) | 119 (40) | .22 |
| Use of sulfonylurea, n (%) | 45 (23) | 65 (22) | .96 |
[a] NASH: nonalcoholic steatohepatitis.
[b] For continuous variables, P values were calculated with a 2-sided t test for 2 independent samples with unequal variances. For categorical variables, P values were calculated with the chi-square test.
[c] SBP: systolic blood pressure.
[d] DBP: diastolic blood pressure.
[e] TG: triglyceride.
[f] LDL-C: low-density lipoprotein-cholesterol.
[g] HDL-C: high-density lipoprotein-cholesterol.
[h] A1c: glycated hemoglobin.
[i] AST: aspartate transaminase.
[j] IU: international units.
[k] ALT: alanine aminotransferase.
[l] TSH: thyroid-stimulating hormone.
[m] FPG: fasting plasma glucose.
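The two tests named in the P value footnote can be approximated from the summary statistics in the table. The following is a minimal sketch, not the authors' analysis code: the function names are hypothetical, the t test P value uses a normal approximation (reasonable only because the degrees of freedom are in the hundreds), the chi-square helper handles only 2x2 tables, and it assumes every patient contributed to each row. Whether the authors applied a continuity correction is not stated, so it is left as an option.

```python
import math


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch t statistic and Welch-Satterthwaite degrees of freedom
    for two independent samples with unequal variances."""
    v1, v2 = sd1 ** 2 / n1, sd2 ** 2 / n2
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


def two_sided_p_normal(z):
    """Two-sided P value from the standard normal distribution.
    Assumption: used as a large-sample stand-in for the t distribution."""
    return math.erfc(abs(z) / math.sqrt(2))


def chi_square_2x2(a, b, c, d, yates=False):
    """Pearson chi-square statistic (1 degree of freedom) and P value
    for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        # Optional continuity correction (an assumption, not from the source).
        diff = max(0.0, diff - n / 2)
    chi2 = n * diff ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p = math.erfc(math.sqrt(chi2 / 2))  # exact upper tail for 1 df
    return chi2, p


# BMI row: 34.1 (4.7) in 198 patients vs 33 (5.5) in 294 patients.
t_bmi, df_bmi = welch_t(34.1, 4.7, 198, 33.0, 5.5, 294)
p_bmi = two_sided_p_normal(t_bmi)

# Sex row: 142 of 198 vs 214 of 294 patients were male.
chi2_sex, p_sex = chi_square_2x2(142, 198 - 142, 214, 294 - 214, yates=True)
```

Under these assumptions the BMI comparison gives a P value of roughly .02, in line with the table, and the sex comparison lands close to the reported .88.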
Performance of machine learning methods for prediction of nonalcoholic fatty liver disease.
| Method and feature encoding | Mean sensitivity | Mean specificity | Mean AUC [a] (95% CI) |
| --- | --- | --- | --- |
| | | | |
| Categorical | 0.7631 | 0.8557 | 0.8632 (0.8560-0.8704) |
| Continuous | 0.8232 | 0.8452 | 0.8786 (0.8716-0.8855) |
| | | | |
| Categorical | 0.8013 | 0.8112 | 0.8599 (0.8523-0.8676) |
| Continuous | 0.7773 | 0.8245 | 0.8524 (0.8455-0.8594) |
| | | | |
| Categorical | 0.7297 | 0.7796 | 0.7932 (0.7835-0.8029) |
| Continuous | 0.7888 | 0.7809 | 0.8078 (0.7974-0.8183) |
| | | | |
| Categorical | 0.7811 | 0.8602 | 0.8782 (0.8717-0.8848) |
| Continuous | 0.8250 | 0.8595 | 0.9020 (0.8957-0.9083) |
| | | | |
| Categorical | 0.7895 | 0.8380 | 0.8686 (0.8615-0.8756) |
| Continuous | 0.8343 | 0.8694 | |

[a] AUC: area under the receiver operating characteristic curve.
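The three metrics in the table columns can be computed from true labels and model scores as sketched below. This is an illustrative sketch only: the function names, the 0.5 decision threshold, and the toy data are assumptions, not values from the study. The AUC is computed through its rank interpretation, the probability that a randomly chosen positive case scores higher than a randomly chosen negative case, with ties counted as one half.

```python
def sensitivity_specificity(y_true, scores, threshold=0.5):
    """Sensitivity and specificity after thresholding the scores.
    Assumption: a score >= threshold is a positive prediction."""
    tp = fn = tn = fp = 0
    for label, score in zip(y_true, scores):
        predicted = 1 if score >= threshold else 0
        if label == 1 and predicted == 1:
            tp += 1
        elif label == 1:
            fn += 1
        elif predicted == 0:
            tn += 1
        else:
            fp += 1
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison of positives
    and negatives; ties contribute 0.5."""
    pos = [s for label, s in zip(y_true, scores) if label == 1]
    neg = [s for label, s in zip(y_true, scores) if label == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 3 positive and 4 negative cases.
labels = [1, 1, 1, 0, 0, 0, 0]
model_scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
sens, spec = sensitivity_specificity(labels, model_scores)
auc = roc_auc(labels, model_scores)
```

The pairwise AUC is quadratic in the number of cases, which is fine for a cohort of a few hundred patients; a sort-based implementation would be preferable for larger data.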
Performance of machine learning methods in prediction of nonalcoholic steatohepatitis.
| Method and feature encoding | Mean sensitivity | Mean specificity | Mean AUC [a] (95% CI) |
| --- | --- | --- | --- |
| | | | |
| Categorical | 0.7244 | 0.7523 | 0.7858 (0.7769-0.7948) |
| Continuous | 0.7070 | 0.7903 | 0.7956 (0.7871-0.8041) |
| | | | |
| Categorical | 0.7383 | 0.7480 | 0.7924 (0.7813-0.7983) |
| Continuous | 0.6836 | 0.8256 | 0.7968 (0.7886-0.8050) |
| | | | |
| Categorical | 0.7064 | 0.6693 | 0.7201 (0.7098-0.7304) |
| Continuous | 0.6937 | 0.6881 | 0.7305 (0.7210-0.7401) |
| | | | |
| Categorical | 0.6979 | 0.8041 | 0.7910 (0.7819-0.8001) |
| Continuous | 0.7582 | 0.7691 | 0.8119 (0.8036-0.8215) |
| | | | |
| Categorical | 0.7226 | 0.7600 | 0.7914 (0.7827-0.8001) |
| Continuous | 0.7525 | 0.7836 | |

[a] AUC: area under the receiver operating characteristic curve.
Performance of machine learning methods in prediction of advanced fibrosis.
| Method and feature encoding | Mean sensitivity | Mean specificity | Mean AUC [a] (95% CI) |
| --- | --- | --- | --- |
| | | | |
| Categorical | 0.7683 | 0.7730 | 0.7950 (0.7837-0.8063) |
| Continuous | 0.8500 | 0.7428 | 0.8278 (0.8172-0.8392) |
| | | | |
| Categorical | 0.7367 | 0.7587 | 0.7628 (0.7489-0.7767) |
| Continuous | 0.8242 | 0.7320 | 0.8122 (0.8002-0.8233) |
| | | | |
| Categorical | 0.7467 | 0.8010 | 0.7844 (0.7651-0.8037) |
| Continuous | 0.6667 | 0.7379 | 0.6947 (0.6740-0.7153) |
| | | | |
| Categorical | 0.7425 | 0.8529 | 0.8118 (0.7985-0.8251) |
| Continuous | 0.8325 | 0.7757 | 0.8337 (0.8227-0.8447) |
| Gradient boosting | | | |
| Categorical | 0.7492 | 0.8361 | 0.8115 (0.7977-0.8253) |
| Continuous | 0.8083 | 0.8074 | 0.8360 (0.8254-0.8467) |

[a] AUC: area under the receiver operating characteristic curve.
Comparison of gradient boosting (the best machine learning method) with existing scoring algorithms for prediction of advanced fibrosis [a].
| Method | Mean sensitivity | Mean specificity | Mean AUC [b] (95% CI) | P value |
| --- | --- | --- | --- | --- |
| GB [c] | 0.8083 | 0.8074 | 0.8360 (0.8254-0.8467) | N/A [d] |
| APRI [e] | 0.7424 | 0.7606 | 0.7984 (0.7964-0.8004) | <.001 |
| FIB-4 [f] | 0.7176 | 0.6674 | 0.7394 (0.7371-0.7417) | <.001 |
| NFS [g] | 0.7506 | 0.5673 | 0.6843 (0.6777-0.6909) | <.001 |
[a] The scores for APRI, FIB-4, and NFS were calculated by bootstrapping 80% of the data from all 492 patients 100 times.
[b] AUC: area under the receiver operating characteristic curve.
[c] GB: gradient boosting.
[d] N/A: not applicable.
[e] APRI: aspartate aminotransferase-to-platelet ratio index.
[f] FIB-4: Fibrosis-4.
[g] NFS: Nonalcoholic Fatty Liver Disease Fibrosis Score.
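For orientation, the three comparison scores and the repeated 80% resampling described in the footnotes can be sketched as follows. The score formulas are the commonly published ones; everything else is an assumption made for illustration: the function names, the AST upper limit of normal of 40 IU/L, sampling without replacement, and the normal-approximation 95% CI across repeats. The source does not state how the authors drew the samples or derived the intervals.

```python
import math
import random
import statistics

AST_ULN = 40.0  # assumption: AST upper limit of normal in IU/L; lab-specific


def apri(ast, platelets, ast_uln=AST_ULN):
    """AST-to-platelet ratio index; platelets in 10^9/L."""
    return (ast / ast_uln) * 100.0 / platelets


def fib4(age, ast, alt, platelets):
    """Fibrosis-4 index; age in years, platelets in 10^9/L."""
    return (age * ast) / (platelets * math.sqrt(alt))


def nfs(age, bmi, ifg_or_diabetes, ast, alt, platelets, albumin_g_dl):
    """NAFLD Fibrosis Score; albumin in g/dL, platelets in 10^9/L."""
    return (-1.675 + 0.037 * age + 0.094 * bmi
            + 1.13 * int(ifg_or_diabetes) + 0.99 * (ast / alt)
            - 0.013 * platelets - 0.66 * albumin_g_dl)


def roc_auc(y_true, scores):
    """Pairwise AUC; ties between a positive and a negative count 0.5."""
    pos = [s for label, s in zip(y_true, scores) if label == 1]
    neg = [s for label, s in zip(y_true, scores) if label == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def subsample_auc(y_true, scores, fraction=0.8, repeats=100, seed=0):
    """Mean AUC and normal-approximation 95% CI over repeated subsamples.
    Assumption: each repeat draws `fraction` of cases without replacement."""
    rng = random.Random(seed)
    n = len(y_true)
    k = int(round(fraction * n))
    aucs = []
    while len(aucs) < repeats:
        idx = rng.sample(range(n), k)
        ys = [y_true[i] for i in idx]
        if 0 < sum(ys) < k:  # AUC needs both classes in the subsample
            aucs.append(roc_auc(ys, [scores[i] for i in idx]))
    mean = statistics.mean(aucs)
    half_width = 1.96 * statistics.stdev(aucs) / math.sqrt(len(aucs))
    return mean, (mean - half_width, mean + half_width)
```

A call such as `subsample_auc(labels, [fib4(...) for each patient])` would then yield one row of the comparison. Intervals built this way describe the spread of the mean across repeats, which may explain why the reported intervals for the fixed scores are narrow, though that is an inference rather than something the source states.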
Figure 1. Top 10 important risk factors for prediction of NAFLD, NASH, and fibrosis based on SHAP importance calculated using the GB models with the continuous feature encoding method. SHAP importance was derived from the averaged absolute SHAP values. A1c: glycated hemoglobin; ALT: alanine aminotransferase; AST: aspartate transaminase; BILIRRUB: bilirubin; CHOL: cholesterol; DBP: diastolic blood pressure; DYSLIPID: dyslipidemia; FPG: fasting plasma glucose; GB: gradient boosting; HDL: high-density lipoprotein; LDL: low-density lipoprotein; NAFLD: nonalcoholic fatty liver disease; NASH: nonalcoholic steatohepatitis; TG: triglyceride; TSH: thyroid-stimulating hormone; SHAP: SHapley Additive exPlanations.
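The ranking described in the Figure 1 caption, importance as the average absolute SHAP value per feature, can be sketched as below. This assumes the per-patient SHAP values have already been computed by some explainer and are available as a list of rows; the function name and the toy numbers are hypothetical and do not come from the study.

```python
def shap_importance(shap_values, feature_names, top_k=10):
    """Rank features by mean absolute SHAP value.

    shap_values: one row per patient, one column per feature.
    Returns the top_k (feature, importance) pairs, largest first.
    """
    n_rows = len(shap_values)
    importance = [
        sum(abs(row[j]) for row in shap_values) / n_rows
        for j in range(len(feature_names))
    ]
    ranked = sorted(zip(feature_names, importance),
                    key=lambda pair: pair[1], reverse=True)
    return ranked[:top_k]


# Toy SHAP matrix (hypothetical values): 2 patients, 3 features.
toy_values = [
    [0.2, -0.1, 0.0],
    [-0.4, 0.3, 0.1],
]
ranking = shap_importance(toy_values, ["AST", "ALT", "TG"], top_k=2)
```

Taking the absolute value before averaging matters: a feature that pushes predictions up for some patients and down for others would otherwise cancel out and appear unimportant.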
Figure 2. Decision plots for false positive and false negative prediction cases using the gradient boosting model with the continuous feature encoding method on advanced fibrosis. A1c: glycated hemoglobin; AST: aspartate transaminase; ALT: alanine aminotransferase; BILIRRUB: bilirubin; CHOL: cholesterol; DBP: diastolic blood pressure; DIAB: diabetes; DYSLIPID: dyslipidemia; FPG: fasting plasma glucose; HDL: high-density lipoprotein; IFG: impaired fasting glucose; LDL: low-density lipoprotein; METFO: metformin; NGT: normal glucose tolerance; SBP: systolic blood pressure; TG: triglyceride; TSH: thyroid-stimulating hormone.