Peng Li1, Fang Liu2, Minsu Zhao3, Shaokai Xu1, Ping Li1, Jingang Cao1, Dongming Tian1, Yaopeng Tan1, Lina Zheng2, Xia Cao2, Yingxia Pan4,5, Hui Tang4,5, Yuanyuan Wu4,5, Yi Sun1.
Abstract
Background: Hashimoto's thyroiditis (HT) is a common autoimmune disease and may co-occur with thyroid cancer. However, HT is difficult to diagnose at an early stage from clinical symptoms alone. Integrating multiple clinical and laboratory factors is therefore urgently needed for the early diagnosis and risk prediction of HT.
Keywords: Hashimoto’s thyroiditis; machine learning modeling; precise diagnosis; risk factors; thyroid cancer
Year: 2022 PMID: 36004356 PMCID: PMC9393718 DOI: 10.3389/fendo.2022.886953
Source DB: PubMed Journal: Front Endocrinol (Lausanne) ISSN: 1664-2392 Impact factor: 6.055
Clinical characteristics of controls and HT patients.
| Characteristic | Control (n=866) | HT-C (n=393) | HT+C (n=44) | P-value |
|---|---|---|---|---|
| Goiter degree | | | | <0.0001 |
| None | 859 (99.19%) | 36 (9.16%) ↓ | 1 (2.27%) ↓ | |
| I | 3 (0.35%) | 183 (46.56%) ↑ | 13 (29.55%) ↑ | |
| II | 2 (0.23%) | 165 (41.98%) ↑ | 29 (65.91%) ↑ | |
| III | 2 (0.23%) | 9 (2.29%) ↑ | 1 (2.27%) ↑ | |
| Age (years) | 54.06 ± 13.50 | 47.66 ± 13.09 ↓ | 51.45 ± 11.86 ↓ | <0.0001 |
| BMI (kg/m²) | 24.49 ± 3.31 | 23.15 ± 3.55 ↓ | 24.64 ± 3.31 ↑ | <0.0001 |
| Gender | | | | <0.0001 |
| Male | 508 (58.66%) | 49 (12.47%) ↓ | 8 (18.18%) ↓ | |
| Female | 358 (41.34%) | 344 (87.53%) ↑ | 36 (81.82%) ↑ | |
| Diabetes | | | | 0.0433 |
| Yes | 28 (3.23%) | 24 (6.11%) ↑ | 3 (6.82%) ↑ | |
| No | 838 (96.77%) | 369 (93.89%) ↓ | 41 (93.18%) ↓ | |
The table shows the statistics of clinical characteristics and laboratory results of controls, HT patients and patients with thyroid cancer. Most factors differed significantly between controls and HT patients. BMI, body mass index. P-values were calculated by the Kruskal–Wallis H test or the Chi-square test across the three groups.
The symbols ↓ and ↑ denote a reduced and an increased value, respectively, compared with the corresponding controls.
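As an illustration of the Chi-square test named in the table note, the following minimal sketch recomputes the test for the gender rows using only the counts printed in the table. It is not the authors' analysis code; the function name `chi_square_statistic` and the variable names are hypothetical. In practice a library routine such as `scipy.stats.chi2_contingency` would normally be used; the hand-rolled version here only makes the arithmetic visible.

```python
import math

# Gender counts copied from the table above.
# Rows: male, female. Columns: Control, HT-C, HT+C.
observed = [
    [508, 49, 8],
    [358, 344, 36],
]


def chi_square_statistic(table):
    """Return (statistic, degrees_of_freedom) for an r x c contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            # Expected count under independence of row and column.
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (obs - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


stat, dof = chi_square_statistic(observed)

# With exactly 2 degrees of freedom, the chi-square upper tail has the
# closed form exp(-x / 2), so no statistics library is needed here.
assert dof == 2
p_value = math.exp(-stat / 2)
print(f"chi2 = {stat:.1f}, dof = {dof}, p = {p_value:.3g}")
```

The resulting p-value falls far below the 0.0001 threshold reported for gender in the table.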
Figure 1. Overview of the cohort study. The flow chart shows participant selection and classification, statistical analysis and logistic regression analysis on multiple factors of controls and patients, as well as machine learning modeling and evaluation of HT development.
Figure 2. Multiple comparative analysis testing of laboratory data among controls and HT patients. (A) UIC, (B) 25-(OH)D, (C) FT3, (D) FT4, (E) TSH, (F) TAG, (G) TC, (H) LDL-C, (I) HDL-C, (J) FPG. *P < 0.05, ***P < 0.001, ****P < 0.0001. UIC, urinary iodine concentrations; 25-(OH)D, 25 hydroxyvitamin D; FT3, free triiodothyronine; FT4, free thyroxine; TSH, thyroid stimulating hormone; TAG, triglyceride; TC, total cholesterol; LDL-C, low density lipoprotein cholesterol; HDL-C, high density lipoprotein cholesterol; FPG, fasting plasma glucose.
Logistic regression analysis on risk factors for HT.
| Characters | Wald | P-value | OR | 95% CI for OR (lower) | 95% CI for OR (upper) |
|---|---|---|---|---|---|
| Gender | 33.053 | <0.001 | 0.264 | 0.167 | 0.415 |
| Diabetes | 12.962 | <0.001 | 6.617 | 2.365 | 18.512 |
| UIC | 7.804 | 0.005 | 1.001 | 1.000 | 1.003 |
| 25-(OH)D | 19.761 | <0.001 | 0.945 | 0.922 | 0.969 |
| FT3 | 15.929 | <0.001 | 1.721 | 1.318 | 2.246 |
| TSH | 11.203 | <0.001 | 1.128 | 1.051 | 1.211 |
| FPG | 12.804 | <0.001 | 0.664 | 0.531 | 0.831 |
The table gives the adjusted odds ratios quantifying the association between each factor and HT. OR, odds ratio; UIC, urinary iodine concentrations; 25-(OH)D, 25 hydroxyvitamin D; FT3, free triiodothyronine; TSH, thyroid stimulating hormone; FPG, fasting plasma glucose.
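The columns of the logistic regression table are linked by standard identities: the OR is the exponential of the regression coefficient, the 95% CI is symmetric around that coefficient on the log scale, and the Wald statistic is the squared ratio of the coefficient to its standard error. The sketch below uses those identities to recover an approximate Wald value from a published OR and CI. It assumes a normal-approximation interval with z = 1.96, and the function name `wald_from_or_ci` is hypothetical; it is a consistency check, not the authors' procedure.

```python
import math


def wald_from_or_ci(odds_ratio, ci_lower, ci_upper, z=1.96):
    """Recover (beta, standard_error, wald) from an OR and its 95% CI.

    Assumes the interval was built as exp(beta +/- z * se).
    """
    beta = math.log(odds_ratio)
    # The CI width on the log scale equals 2 * z * se.
    se = (math.log(ci_upper) - math.log(ci_lower)) / (2 * z)
    wald = (beta / se) ** 2
    return beta, se, wald


# Gender row of the table: OR 0.264, 95% CI 0.167 to 0.415, reported Wald 33.053.
beta, se, wald = wald_from_or_ci(0.264, 0.167, 0.415)
print(f"beta = {beta:.3f}, se = {se:.3f}, Wald = {wald:.2f}")
```

The recovered value lands close to the reported Wald statistic; a small gap is expected because the published OR and CI bounds are rounded to three decimals.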
Figure 3. Machine learning models of HT development. (A) The rank of feature importance in model construction, including clinical and laboratory factors, as well as their corresponding F scores. (B) The ROC curves of six models.
Performance summary of six machine learning models of HT.
| Models | Accuracy | Precision | Recall | F1 Score | AUC Score |
|---|---|---|---|---|---|
| KNN | 0.681029 | 0.746247 | 0.681507 | 0.710569 | 0.728307 |
| LR | 0.701378 | 0.730701 | 0.780822 | 0.749290 | 0.767496 |
| SVM | 0.710839 | 0.720756 | 0.825304 | 0.766319 | 0.766069 |
| DT | 0.643135 | 0.680359 | 0.706202 | 0.691899 | 0.633101 |
| MLP | 0.721863 | 0.746918 | 0.797603 | 0.767313 | 0.775333 |
| XGBoost | 0.729774 | 0.770717 | 0.772755 | 0.767228 | 0.781673 |
AUC, area under the curve; KNN, k-nearest neighbor classifier; LR, logistic regression; SVM, support vector machine; DT, decision tree; MLP, multilayer perceptron; XGBoost, eXtreme Gradient Boosting.
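To make the comparison in the performance table easier to read, the following minimal sketch picks the top-scoring model for each metric. The numbers are copied from the table; the names `scores`, `metrics` and `best` are hypothetical and the snippet does not reproduce the model training itself.

```python
# Metric order matches the table columns.
metrics = ["Accuracy", "Precision", "Recall", "F1 Score", "AUC Score"]

# Values copied from the performance table above.
scores = {
    "KNN": [0.681029, 0.746247, 0.681507, 0.710569, 0.728307],
    "LR": [0.701378, 0.730701, 0.780822, 0.749290, 0.767496],
    "SVM": [0.710839, 0.720756, 0.825304, 0.766319, 0.766069],
    "DT": [0.643135, 0.680359, 0.706202, 0.691899, 0.633101],
    "MLP": [0.721863, 0.746918, 0.797603, 0.767313, 0.775333],
    "XGBoost": [0.729774, 0.770717, 0.772755, 0.767228, 0.781673],
}

# For each metric, keep the model with the highest value in that column.
best = {
    metric: max(scores, key=lambda model: scores[model][idx])
    for idx, metric in enumerate(metrics)
}

for metric, model in best.items():
    print(f"{metric}: {model}")
```

On these figures XGBoost leads on accuracy, precision and AUC, SVM on recall, and MLP on F1 score by a narrow margin over XGBoost.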