Kexing Han, Kexuan Tan, Jiapei Shen, Yuting Gu, Zilong Wang, Jiayu He, Luyang Kang, Weijie Sun, Long Gao, Yufeng Gao.
Abstract
Background: Early prevention and treatment of liver fibrosis is of great prognostic importance, yet changes in liver stiffness are often overlooked before obvious clinical symptoms appear. Recognizing liver fibrosis at an early stage is therefore essential. Objective: An XGBoost machine learning model was constructed to predict participants' liver stiffness measurement (LSM) from general characteristic information, blood test metrics and insulin resistance-related indexes, and to compare how well different datasets fit LSM.
Keywords: HOMA-IR; METS-IR; NHANES; insulin resistance; liver cirrhosis; liver stiffness measurement (LSM); machine learning model
Year: 2022 PMID: 36211651 PMCID: PMC9537573 DOI: 10.3389/fpubh.2022.1008794
Source DB: PubMed Journal: Front Public Health ISSN: 2296-2565
Figure 1. Flow chart for participants.
Comparison of participant characteristics in the training and validation cohorts.
| Characteristic | Training cohort | Validation cohort | P-value |
|---|---|---|---|
| Sample size | 2,376 | 1,188 | |
| Age (years) | 50.75 ± 17.11 | 49.94 ± 17.15 | 0.182 |
| Gender (%) | | | 0.570 |
| Male | 48.40 | 49.41 | |
| Female | 51.60 | 50.59 | |
| BMI (kg/m²) | 29.57 ± 7.00 | 29.84 ± 7.35 | 0.295 |
| Education level (%) | | | 0.332 |
| Less than high school | 20.12 | 17.76 | |
| High school | 23.27 | 24.83 | |
| More than high school | 56.57 | 57.32 | |
| Unclear | 0.04 | 0.08 | |
| PIR (%) | | | 0.966 |
| <1.35 | 23.48 | 23.99 | |
| 1.35–3.45 | 33.04 | 33.42 | |
| ≥3.45 | 29.71 | 29.04 | |
| Unclear | 13.76 | 13.55 | |
| Drinking frequency (%) | | | 0.131 |
| Not at all | 19.87 | 17.93 | |
| ≤1 time per month | 28.62 | 30.98 | |
| ≤1 time per week | 17.51 | 20.37 | |
| ≥2 times per week | 13.85 | 10.77 | |
| Almost daily | 6.69 | 6.65 | |
| Unclear | 13.47 | 13.30 | |
| Smoker (%) | | | 0.065 |
| Yes | 41.54 | 45.29 | |
| No | 58.38 | 54.71 | |
| Unclear | 0.08 | 0 | |
| Still smoking (%) | | | 0.079 |
| Every day | 13.38 | 15.82 | |
| Some days | 4.55 | 4.04 | |
| Not at all | 23.61 | 25.42 | |
| Unclear | 58.46 | 54.71 | |
| Age started smoking (years) (%) | | | 0.082 |
| <18 | 20.12 | 21.13 | |
| ≥18 | 21.42 | 24.16 | |
| Unclear | 58.46 | 54.71 | |
| Hypertension (%) | | | 0.245 |
| Yes | 36.83 | 39.65 | |
| No | 63.05 | 60.19 | |
| Unclear | 0.13 | 0.17 | |
| Age of hypertension (years) (%) | | | 0.072 |
| <40 | 10.44 | 13.22 | |
| 40–60 | 18.35 | 19.02 | |
| ≥60 | 8.04 | 7.41 | |
| Unclear | 63.17 | 60.35 | |
| Medication for hypertension (%) | | | 0.248 |
| Yes | 33.88 | 36.28 | |
| No | 2.86 | 3.28 | |
| Unclear | 63.26 | 60.44 | |
| Diabetes (%) | | | 0.738 |
| Yes | 14.48 | 15.74 | |
| No | 82.37 | 81.06 | |
| Borderline | 3.11 | 3.11 | |
| Unclear | 0.04 | 0.08 | |
| Age of diabetes (years) (%) | | | 0.332 |
| <40 | 2.69 | 3.45 | |
| 40–60 | 8.21 | 9.26 | |
| ≥60 | 3.58 | 3.03 | |
| Unclear | 85.52 | 84.26 | |
| Taking insulin now (%) | | | 0.119 |
| Yes | 4.12 | 3.37 | |
| No | 10.35 | 12.37 | |
| Unclear | 85.52 | 84.26 | |
| Taking diabetic pills now (%) | | | 0.181 |
| Yes | 12.21 | 14.39 | |
| No | 15.74 | 15.07 | |
| Unclear | 72.05 | 70.54 | |
| Ever told you have hepatitis B (%) | | | 0.202 |
| Yes | 1.47 | 1.18 | |
| No | 98.23 | 98.15 | |
| Unclear | 0.29 | 0.67 | |
| Ever treated for hepatitis B (%) | | | 0.910 |
| Yes | 0.29 | 0.25 | |
| No | 0.97 | 0.84 | |
| Unclear | 98.74 | 98.91 | |
| Ever told you have hepatitis C (%) | | | 0.903 |
| Yes | 1.64 | 1.77 | |
| No | 97.94 | 97.73 | |
| Unclear | 0.42 | 0.51 | |
| Ever treated for hepatitis C (%) | | | 0.848 |
| Yes | 0.88 | 1.01 | |
| No | 0.63 | 0.76 | |
| Unclear | 98.48 | 98.23 | |
| Waist circumference (cm) | 99.98 ± 16.25 | 100.38 ± 16.74 | 0.640 |
| Hip circumference (cm) | 106.89 ± 13.81 | 107.11 ± 14.27 | 0.808 |
| Systolic pressure (mmHg) | 124.39 ± 18.60 | 124.24 ± 18.35 | 0.816 |
| Diastolic pressure (mmHg) | 75.06 ± 11.08 | 75.44 ± 11.20 | 0.470 |
| Sedentary activity (min) | 375.01 ± 713.93 | 373.22 ± 657.70 | 0.493 |
| Median liver stiffness (kPa) | 5.34 ± 1.93 | 5.34 ± 1.97 | 0.664 |
| METS-IR | 44.19 ± 12.73 | 44.70 ± 13.20 | 0.264 |
| HOMA-IR | 113.52 ± 144.18 | 103.53 ± 246.13 | 0.128 |
Continuous variables are reported as mean ± SD, with P-values calculated by a weighted linear regression model. Categorical variables are reported as percentages, with P-values calculated by a weighted chi-square test.
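The cohort sizes in the table (2,376 and 1,188) correspond to a 2:1 division of 3,564 participants. A minimal sketch of such a split is shown below; the use of a seeded random shuffle, the function name `split_cohort` and the integer participant IDs are illustrative assumptions, not details confirmed by the source.

```python
import random


def split_cohort(ids, train_fraction=2 / 3, seed=0):
    """Shuffle participant IDs and split them into training and validation lists.

    The random shuffle and the fixed seed are assumptions made for illustration.
    """
    rng = random.Random(seed)
    shuffled = list(ids)
    rng.shuffle(shuffled)
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


# 3,564 hypothetical participant IDs (2,376 + 1,188 from the table).
participants = range(3564)
train_ids, valid_ids = split_cohort(participants)
print(len(train_ids), len(valid_ids))  # 2376 1188
```

Fixing the seed keeps the split reproducible, so the same participants land in the same cohort on every run.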
Figure 2. XGBoost machine learning model developed with dataset A in the training cohort. (A) Relative importance of the top 20 predictor variables. (B) Bland-Altman analysis of estimated LSM (kPa) for real data. The dark blue line in the middle represents the difference between the estimated and true values, and the light blue lines at the top and bottom represent 95% agreement limits of the estimated values. Each black point represents a sample. (C) The fitted plot of estimated and true values after XGBoost regression. Each black point represents a sample.
Figure 3. XGBoost machine learning model developed with dataset B in the training cohort. (A) Relative importance of the top 20 predictor variables. (B) Bland-Altman analysis of estimated LSM (kPa) for real data. The dark blue line in the middle represents the difference between the estimated and true values, and the light blue lines at the top and bottom represent 95% agreement limits of the estimated values. Each black point represents a sample. (C) The fitted plot of estimated and true values after XGBoost regression. Each black point represents a sample.
Figure 4. XGBoost machine learning model developed with dataset C in the training cohort. (A) Relative importance of the top 20 predictor variables. (B) Bland-Altman analysis of estimated LSM (kPa) for real data. The dark blue line in the middle represents the difference between the estimated and true values, and the light blue lines at the top and bottom represent 95% agreement limits of the estimated values. Each black point represents a sample. (C) The fitted plot of estimated and true values after XGBoost regression. Each black point represents a sample.
95% agreement limits of estimated LSM for three datasets in the training cohort.
| Dataset | Mean difference | Lower limit | Upper limit | |
|---|---|---|---|---|
| Dataset A | 0.00197 | −1.52 | 1.52 | 0.76 |
| Dataset B | 0.00407 | −1.49 | 1.48 | 0.74 |
| Dataset C | 0 | −1.72 | 1.72 | 0.86 |
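For readers unfamiliar with Bland-Altman analysis, the sketch below shows one common way to obtain agreement limits of this kind: take the differences between estimated and true values, then report their mean plus or minus a multiple of their standard deviation. The multiplier 1.96, the sign convention (estimated minus true) and the sample values are assumptions for illustration; the source does not state which convention the authors used.

```python
import statistics


def bland_altman_limits(estimated, true, z=1.96):
    """Return (mean difference, lower limit, upper limit, SD of differences).

    z=1.96 is the conventional multiplier for 95% limits and is an assumption here.
    """
    diffs = [e - t for e, t in zip(estimated, true)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - z * sd, bias + z * sd, sd


# Made-up LSM values in kPa, not data from the study.
true_lsm = [4.8, 5.5, 6.1, 3.9, 7.2]
estimated_lsm = [5.0, 5.2, 6.4, 4.1, 6.8]

bias, lower, upper, sd = bland_altman_limits(estimated_lsm, true_lsm)
print(f"mean difference={bias:.3f}, limits=({lower:.3f}, {upper:.3f}), SD={sd:.3f}")
```

A mean difference near zero with symmetric limits, as in the table above, indicates that the estimates are not systematically higher or lower than the measured values.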
Evaluation metric values in the training cohort.
| Dataset | | | | | |
|---|---|---|---|---|---|
| Dataset A | 0.58 | 0.58 | 0.76 | 0.86 | 0.93 |
| Dataset B | 0.55 | 0.57 | 0.74 | 0.87 | 0.93 |
| Dataset C | 0.74 | 0.67 | 0.86 | 0.83 | 0.91 |
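Regression fits like these are commonly summarised with mean squared error (MSE), mean absolute error (MAE), root mean squared error (RMSE) and the coefficient of determination (R²). In each row above, the third value is close to the square root of the first (for example 0.76 ≈ √0.58), which is consistent with an MSE/RMSE pair, but the column labels are not given here, so that reading is an assumption. The sketch below computes these four metrics on made-up values; the function name `regression_metrics` is hypothetical.

```python
import math


def regression_metrics(true, predicted):
    """Compute MSE, MAE, RMSE and R² for paired true and predicted values."""
    n = len(true)
    errors = [p - t for p, t in zip(predicted, true)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(mse)
    mean_true = sum(true) / n
    ss_total = sum((t - mean_true) ** 2 for t in true)  # total sum of squares
    r2 = 1 - (mse * n) / ss_total                        # 1 - SS_residual / SS_total
    return {"mse": mse, "mae": mae, "rmse": rmse, "r2": r2}


# Made-up LSM values in kPa, not data from the study.
true_lsm = [4.8, 5.5, 6.1, 3.9, 7.2]
estimated_lsm = [5.0, 5.2, 6.4, 4.1, 6.8]

metrics = regression_metrics(true_lsm, estimated_lsm)
print({name: round(value, 3) for name, value in metrics.items()})
```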
95% agreement limits of estimated LSM for three datasets in the validation cohort.
| Dataset | Mean difference | Lower limit | Upper limit | |
|---|---|---|---|---|
| Dataset A | 0.00002 | −1.59 | 1.59 | 0.80 |
| Dataset B | 0 | −1.56 | 1.56 | 0.78 |
| Dataset C | 0 | −1.78 | 1.78 | 0.89 |
Evaluation metric values in the validation cohort.
| Dataset | | | | | |
|---|---|---|---|---|---|
| Dataset A | 0.64 | 0.61 | 0.79 | 0.85 | 0.92 |
| Dataset B | 0.61 | 0.61 | 0.78 | 0.87 | 0.93 |
| Dataset C | 0.79 | 0.69 | 0.89 | 0.83 | 0.91 |