Xinqi Cheng1,2, Shicheng Li3,4, Lizong Deng3,4, Wei Luo2, Dancheng Wang2, Jin Cheng2, Chaochao Ma2, Luming Chen5, Taijiao Jiang3,4,5, Ling Qiu2, Guojun Zhang1.
Abstract
Objective: The purpose of this study was to predict elevated TSH levels by developing an effective machine learning model based on large-scale physical examination results.
Keywords: antithyroid peroxidase-antibody (TPO-Ab); elevated TSH level; free triiodothyronine (FT3) to thyroxine (FT4); logistic analysis; machine learning model
Year: 2022 PMID: 35282438 PMCID: PMC8907627 DOI: 10.3389/fendo.2022.839829
Source DB: PubMed Journal: Front Endocrinol (Lausanne) ISSN: 1664-2392 Impact factor: 5.555
The characteristics of the patients during follow-up with normal TSH and elevated TSH in this study.
| Characteristics | Normal TSH (n = 11,565) | Elevated TSH (n = 1,170) | Total (n = 12,735) | P value (Wilcoxon rank-sum test) |
|---|---|---|---|---|
| Male (%) | 5,308 (45.90%) | 318 (27.18%) | | |
| Age (years) | 39 (31-49) | 46 (38-54) | 40 (32-50) | <0.001 |
| FT3/FT4 | 2.53 (2.32-2.76) | 2.63 (2.42-2.87) | 2.55 (2.34-2.78) | <0.001 |
| TPO-Ab (IU/ml) | 13.91 (11.56-17.10) | 15.08 (10.81-79.17) | 13.96 (11.50-17.4) | 0.002 |
| Tg-Ab (IU/ml) | 10.53 (10.0-13.93) | 17.31 (11.68-192.17) | 10.83 (10.2-14.83) | <0.001 |
| ALT (U/L) | 17.0 (12.0-25.0) | 16.5 (13.0-23.0) | 17.0 (12.0-24.0) | 0.159 |
| TP (g/L) | 73.0 (71.0-76.0) | 72.0 (69.0-74.0) | 73.0 (70.0-76.0) | 0.081 |
| Alb (g/L) | 47.0 (45.0-48.0) | 43.0 (41.0-47.1) | 46.0 (44.0-47.8) | <0.001 |
| Cr (µmol/L) | 70.3 (61.2-82.5) | 66.1 (58.2-76.25) | 69.0 (60.0-82.0) | <0.001 |
| Urea (mmol/L) | 4.47 (3.75-5.2) | 4.41 (3.71-5.06) | 4.46 (3.75-5.19) | 0.123 |
| Glu (mmol/L) | 5.1 (4.8-5.5) | 5.0 (4.7-5.3) | 5.1 (4.8-5.4) | 0.255 |
| UA (µmol/L) | 305.0 (255.1-371.0) | 289.5 (242.5-341.0) | 303.0 (254.5-369.0) | <0.001 |
| TC (mmol/L) | 4.57 (4.02-5.17) | 4.78 (4.16-5.34) | 4.59 (4.03-5.19) | <0.001 |
| TG (mmol/L) | 1.03 (0.73-1.56) | 1.21 (0.88-1.68) | 1.05 (0.74-1.57) | <0.001 |
| HDL-C (mmol/L) | 1.27 (1.06-1.51) | 1.24 (1.05-1.47) | 1.26 (1.06-1.50) | 0.259 |
| LDL-C (mmol/L) | 2.83 (2.34-3.38) | 2.88 (2.36-3.38) | 2.83 (2.34-3.38) | 0.439 |
| Height (cm) | 167 (161-174) | 164 (159-171) | 167 (161-173.2) | 0.111 |
| Weight (kg) | 64 (56-75) | 63 (57-72) | 64 (56-74) | 0.317 |
| Systolic blood pressure (mmHg) | 117 (106-130) | 119 (108-134) | 117 (106-131) | 0.910 |
| Diastolic blood pressure (mmHg) | 72 (65-80) | 73 (67-81) | 72 (66-80) | 0.453 |
Data are shown as median (interquartile range). FT3, free triiodothyronine; FT4, free thyroxine; TPO-Ab, antithyroid peroxidase autoantibody; Tg-Ab, anti-thyroglobulin antibody; ALT, alanine aminotransferase; TP, total protein; Alb, albumin; Cr, creatinine; Glu, glucose; UA, uric acid; TC, total cholesterol; TG, triglycerides; HDL-C, high-density lipoprotein cholesterol; LDL-C, low-density lipoprotein cholesterol.
Logistic regression analysis of the risk factors for elevated TSH.
| Clinical Features | Univariate OR (95% CI) | Univariate P value | Multivariate OR (95% CI) | Multivariate P value |
|---|---|---|---|---|
| Age | 1.059 (0.943-1.188) | 0.332 | – | – |
| Gender | 1.932 (1.582-2.122) | 0.001 | 1.517 (1.338-1.719) | 0.006 |
| FT3/FT4 | 6.696 (4.668-9.605) | <0.001 | 3.170 (2.033-4.206) | <0.001 |
| TPO-Ab | 3.088 (2.099-4.542) | <0.001 | 1.958 (1.659-2.378) | <0.001 |
| Tg-Ab | 2.001 (1.189-3.345) | 0.029 | 2.746 (1.953-3.009) | <0.001 |
| ALT | 1.033 (0.875-1.214) | 0.396 | – | – |
| TP | 1.157 (0.794-1.196) | 0.405 | – | – |
| Alb | 0.950 (0.838-1.076) | 0.278 | – | – |
| Cr | 1.465 (1.347-1.600) | 0.012 | 1.331 (1.178-1.467) | 0.042 |
| Urea | 1.056 (0.935-1.192) | 0.377 | – | – |
| Glu | 1.009 (0.902-1.129) | 0.269 | – | – |
| UA | 0.974 (0.856-1.110) | 0.368 | – | – |
| TC | 0.646 (0.603-0.706) | <0.001 | 0.637 (0.345-0.970) | 0.027 |
| TG | 1.825 (1.623-2.503) | <0.001 | 1.749 (1.565-2.118) | <0.001 |
| HDL-C | 1.151 (0.889-1.490) | 0.585 | – | – |
| LDL-C | 1.549 (0.895-2.180) | 0.618 | – | – |
| Height | 0.982 (0.971-1.185) | 0.211 | – | – |
| Weight | 1.001 (0.997-1.006) | 0.168 | – | – |
| Diastolic blood pressure | 1.099 (0.969-1.187) | 0.200 | – | – |
| Systolic blood pressure | 1.174 (0.997-1.195) | 0.191 | – | – |
OR, odds ratio; CI, confidence interval; FT3, free triiodothyronine; FT4, free thyroxine; TPO-Ab, antithyroid peroxidase autoantibody; Tg-Ab, anti-thyroglobulin antibody; ALT, alanine aminotransferase; TP, total protein; Alb, albumin; Cr, creatinine; Glu, glucose; UA, uric acid; TC, total cholesterol; TG, triglycerides; HDL-C, high-density lipoprotein cholesterol; LDL-C, low-density lipoprotein cholesterol.
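The odds ratios and confidence intervals in the table above are the usual way of reporting logistic regression coefficients. As a minimal sketch, assuming a Wald-type interval (the record does not state how the intervals were computed), a coefficient `beta` with standard error `se` maps to an odds ratio as follows. The function name and the example numbers are hypothetical and are not taken from the study.

```python
import math

def odds_ratio_ci(beta, se, z=1.96):
    """Convert a logistic regression coefficient to an odds ratio.

    Returns (OR, lower, upper), where the bounds are a Wald-type
    confidence interval: exp(beta -/+ z * se). z=1.96 gives ~95%.
    """
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)

# Hypothetical coefficient and standard error, for illustration only.
or_, lower, upper = odds_ratio_ci(beta=0.5, se=0.1)
print(f"OR {or_:.3f} (95% CI {lower:.3f}-{upper:.3f})")
```

An interval that excludes 1 corresponds to a coefficient whose Wald test is significant at roughly the 5% level, which is how the univariate and multivariate columns are typically read.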
Performance of 5-fold cross-validation of the four machine learning models (mean ± SD).
| Model | One-year accuracy | One-year AUC | Two-year accuracy | Two-year AUC |
|---|---|---|---|---|
| DT | 0.80 ± 0.07 | 0.84 ± 0.04 | 0.62 ± 0.07 | 0.57 ± 0.05 |
| LR | 0.79 ± 0.09 | 0.82 ± 0.01 | 0.61 ± 0.03 | 0.52 ± 0.04 |
| SVM | 0.81 ± 0.08 | 0.85 ± 0.03 | 0.56 ± 0.04 | 0.52 ± 0.07 |
| XGBoost | 0.86 ± 0.07 | 0.87 ± 0.03 | 0.61 ± 0.06 | 0.62 ± 0.05 |
AUC, area under the curve; DT, decision tree; LR, logistic regression; SVM, support vector machine; XGBoost, eXtreme Gradient Boosting.
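The mean ± SD figures above summarise per-fold scores from 5-fold cross-validation. The sketch below illustrates that bookkeeping in plain Python, under several assumptions: the folds are stratified by class (the record does not say whether the authors stratified), AUC is computed as the pairwise rank statistic, and the one-feature "midpoint" model and toy data are hypothetical stand-ins, not the models or data from the study.

```python
import random
import statistics

def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def stratified_folds(y, k, seed=0):
    """Split indices into k folds, dealing each class out round-robin after shuffling."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in (0, 1):
        members = [i for i, label in enumerate(y) if label == cls]
        rng.shuffle(members)
        for j, i in enumerate(members):
            folds[j % k].append(i)
    return folds

def cross_validate(X, y, fit, predict, k=5):
    """Train on k-1 folds, score the held-out fold, and return the k AUC values."""
    aucs = []
    for held_out in stratified_folds(y, k):
        held = set(held_out)
        train = [i for i in range(len(y)) if i not in held]
        model = fit([X[i] for i in train], [y[i] for i in train])
        scores = [predict(model, X[i]) for i in held_out]
        aucs.append(auc([y[i] for i in held_out], scores))
    return aucs

# Hypothetical stand-in model: score = feature minus the midpoint of the class means.
def fit_midpoint(X_train, y_train):
    pos = [x for x, label in zip(X_train, y_train) if label == 1]
    neg = [x for x, label in zip(X_train, y_train) if label == 0]
    return (statistics.mean(pos) + statistics.mean(neg)) / 2

def predict_score(threshold, x):
    return x - threshold

# Deterministic toy data: one noisy feature that is shifted upward for the positive class.
y = [i % 2 for i in range(100)]
X = [label + ((i * 37) % 100) / 50 - 1 for i, label in enumerate(y)]

fold_aucs = cross_validate(X, y, fit_midpoint, predict_score, k=5)
# statistics.stdev is the sample standard deviation across folds.
print(f"AUC {statistics.mean(fold_aucs):.2f} (+/- {statistics.stdev(fold_aucs):.2f})")
```

Swapping `fit_midpoint` and `predict_score` for any other pair of train and score functions leaves the fold loop and the mean ± SD summary unchanged.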
Figure 1. ROC curves for the one-year prediction task. Receiver operating characteristic (ROC) curves are used to evaluate the performance of the four machine learning models with 5-fold cross-validation: (A) decision tree; (B) logistic regression; (C) support vector machine; (D) eXtreme Gradient Boosting.
Figure 2. Feature importance in the XGBoost model.