Rohit Babbar1,2, Martin Heni3,4,5, Andreas Peter3,4,5, Martin Hrabě de Angelis5, Hans-Ulrich Häring3,4,5, Andreas Fritsche3,4,5, Hubert Preissl4,5,6, Bernhard Schölkopf1, Róbert Wagner3,4,5.
Abstract
INTRODUCTION: Impaired glucose tolerance (IGT) is diagnosed by a standardized oral glucose tolerance test (OGTT). However, the OGTT is laborious, and when it is not performed, glucose tolerance cannot be determined retrospectively from fasting samples. We tested whether glucose tolerance status is reasonably predictable from a combination of demographic, anthropometric, and laboratory data assessed at one time point in a fasting state.
Keywords: classification; clinical study; impaired glucose tolerance; machine learning classification; oral glucose tolerance test; prediction; supervised machine learning; test-retest variability
Year: 2018 PMID: 29615972 PMCID: PMC5868129 DOI: 10.3389/fendo.2018.00082
Source DB: PubMed Journal: Front Endocrinol (Lausanne) ISSN: 1664-2392 Impact factor: 5.555
List of anthropometric, clinical, and laboratory variables used as features in the machine learning classifiers.
| Variable | Evidence of association with impaired glucose tolerance (IGT) |
|---|---|
| Sex | Women have higher risk of IGT |
| Age | IGT incidence increases with age |
| Height | Used separately as underlying variables of body mass index, which is strongly associated with IGT |
| Weight | |
| Glucose | By definition strongly associated with glycemia, prediabetes, and diabetes |
| HbA1c | Strongly associated with IGT |
| Hemoglobin | Potentially interacts with the association of HbA1c with the outcome (BZ120) |
| Mean corpuscular volume (MCV) | MCV could interact with HbA1c in modulating its association with glycemia |
| Ferritin | Elevated ferritin is associated with impaired glucose tolerance |
| Potassium | Associated with prediabetes in hypertensive persons |
| Insulin | Higher fasting insulin is associated with insulin resistance and IGT |
| C-peptide | Similar to insulin, higher levels are associated with IGT |
| Proinsulin | A read-out of proinsulin-insulin conversion, higher levels are associated with IGT |
| Non-esterified fatty acids (NEFAs) | High fasting free fatty acids (=NEFAs) predict diabetes |
| Triglycerides | Well-known association with type 2 diabetes and prediabetes |
| Total cholesterol | Well-known association with type 2 diabetes and prediabetes |
| LDL cholesterol | Well-known association with type 2 diabetes and prediabetes |
| HDL cholesterol | Well-known inverse association with type 2 diabetes and prediabetes |
| C-reactive protein (CRP) | Elevated CRP predicts the development of type 2 diabetes |
| Aspartate aminotransferase (AST) | AST is associated with fatty liver and IGT |
| Gamma-glutamyl transferase (GGT) | GGT is associated with IGT |
| Uric acid | Associated with IGT especially in women |
Characteristics of the training and test set for the feature variables and the target variable defining the classification.
| Variable | Training set N | Training set mean | Training set SD | Test set N | Test set mean | Test set SD | p* |
|---|---|---|---|---|---|---|---|
| Sex (f/m) | 2,337 | | | 929 | | | 0.79 |
| Age (years) | 2,337 | 40 | 13 | 929 | 49 | 15 | <0.0001 |
| Height (cm) | 2,337 | 171 | 9 | 929 | 170 | 9 | 0.00022 |
| Weight (kg) | 2,337 | 91.5 | 29.1 | 929 | 89.8 | 25.5 | 0.11 |
| Fasting glucose (mmol l−1) | 2,337 | 5.25 | 0.72 | 929 | 5.46 | 0.74 | <0.0001 |
| Glycated hemoglobin HbA1c (%) | 2,190 | 5.4 | 0.5 | 916 | 5.7 | 0.5 | <0.0001 |
| Hemoglobin (g dl−1) | 2,208 | 13.8 | 1.2 | 917 | 13.9 | 1.2 | 0.047 |
| Mean corpuscular volume (fl) | 2,208 | 86 | 5 | 917 | 86 | 4 | 0.59 |
| Potassium (mmol l−1) | 2,155 | 3.97 | 0.32 | 912 | 3.99 | 0.35 | 0.13 |
| Fasting insulin | 2,304 | 84 | 75 | 919 | 110 | 77 | <0.0001 |
| C-peptide (pmol l−1) | 2,236 | 685 | 344 | 913 | 603 | 308 | <0.0001 |
| Triglycerides (mg dl−1) | 2,185 | 132 | 152 | 917 | 123 | 71.9 | 0.027 |
| Cholesterol (mg dl−1) | 2,183 | 194 | 38.9 | 917 | 197 | 40.1 | 0.052 |
| Low-density lipoprotein (mg dl−1) | 2,157 | 121 | 33.5 | 917 | 115 | 34.1 | <0.0001 |
| HDL (mg dl−1) | 2,157 | 53.3 | 14.3 | 917 | 53.4 | 13.9 | 0.99 |
| Uric acid (mg dl−1) | 2,174 | 5.5 | 1.4 | 836 | 5.6 | 1.3 | 0.16 |
| Aspartate aminotransferase (U l−1) | 2,132 | 22 | 11 | 917 | 24 | 10 | <0.0001 |
| Gamma-glutamyl transferase (U l−1) | 2,167 | 28 | 33 | 917 | 28 | 36 | 0.71 |
| C-reactive protein (mg dl−1) | 2,161 | 0.42 | 0.62 | 917 | 0.36 | 0.51 | 0.0038 |
| Ferritin (μg dl−1) | 2,175 | 9 | 12 | 834 | 12 | 14 | <0.0001 |
| Non-esterified fatty acids (μmol l−1) | 2,210 | 593 | 251 | 906 | 606 | 225 | 0.15 |
| Proinsulin (pmol l−1) | 2,132 | 6 | 6.5 | 890 | 3.6 | 3.9 | <0.0001 |
| Postchallenge glucose (mmol l−1) | 2,337 | 6.65 | 2.15 | 929 | 6.96 | 2.19 | 0.00022 |
*t-Test or Fisher’s exact test, as appropriate.
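Group comparisons of this kind can be approximated from the reported summary statistics alone. The following is a minimal sketch, assuming Welch's unequal-variance t-test and a normal approximation to the p-value (reasonable at these sample sizes). The source states only "t-test", so the exact variant used is not known, and the function name `welch_t_from_summary` is hypothetical. Because the tabulated means and SDs are rounded, the results need not match the reported p-values exactly.

```python
import math


def welch_t_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Return (t, two-sided p) for two groups described by mean, SD, and N.

    Assumption: Welch's t statistic with a normal approximation for the
    p-value; this is an illustration, not the authors' documented method.
    """
    # Standard error of the difference in means without pooling variances.
    se = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    t = (mean1 - mean2) / se
    # Two-sided tail probability of a standard normal variable.
    p = math.erfc(abs(t) / math.sqrt(2))
    return t, p


# Age row of the table: training set 40 (SD 13, N 2,337), test set 49 (SD 15, N 929).
t_age, p_age = welch_t_from_summary(40, 13, 2337, 49, 15, 929)
print(f"t = {t_age:.2f}, p = {p_age:.3g}")
```

For the age row this gives a p-value far below 0.0001, in line with the tabulated entry. Categorical variables such as sex would instead need the underlying counts for Fisher's exact test.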
Model performance in the test set: crude accuracy (the ratio of correct predictions to all predictions) and the κ statistic (accuracy relative to the accuracy expected by chance) for the evaluated machine learning classifiers.
| Method | Accuracy | κ | p |
|---|---|---|---|
| Recursive partitioning (rpart) | 0.783 | 0.423 | <0.0001 |
| Lasso (glmnet) | 0.767 | 0.418 | 0.003 |
| Stochastic gradient boosting (gbm) | 0.761 | 0.414 | 0.012 |
| Random forest (rf) | 0.744 | 0.412 | 0.142 |
| Extended gradient boost (xgbLinear) | 0.74 | 0.394 | 0.22 |
| Generalized additive model (gamLoess) | 0.708 | 0.368 | 0.913 |
| Neural networks (nnet) | 0.695 | 0.348 | 0.987 |
| Generalized linear model (glm) | 0.686 | 0.339 | 0.998 |
| Penalized multinomial regression (multinom) | 0.686 | 0.339 | 0.998 |
| Partial least squares (pls) | 0.692 | 0.331 | 0.993 |
A one-sided binomial test.
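The two performance measures in the table can be computed from a confusion matrix. The sketch below uses the standard definition of Cohen's κ, κ = (p_o − p_e) / (1 − p_e), where p_o is the observed accuracy and p_e is the accuracy expected by chance from the row and column totals. The function name `accuracy_and_kappa` and the example counts are hypothetical and are not taken from the study.

```python
def accuracy_and_kappa(confusion):
    """Return (accuracy, Cohen's kappa) for a square confusion matrix.

    confusion[i][j] is the number of cases of true class i predicted as class j.
    """
    k = len(confusion)
    total = sum(sum(row) for row in confusion)
    # Observed agreement: share of cases on the diagonal.
    observed = sum(confusion[i][i] for i in range(k)) / total
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(k)) for j in range(k)]
    # Chance agreement: product of marginal proportions, summed over classes.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical two-class example (rows: true class, columns: predicted class).
example = [[50, 10],
           [5, 35]]
acc, kappa = accuracy_and_kappa(example)
print(f"accuracy = {acc:.3f}, kappa = {kappa:.3f}")
```

Reporting κ alongside accuracy is useful when classes are imbalanced, because a classifier that always predicts the majority class can reach a high accuracy while its κ stays near zero.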
Figure 1. Aggregated importance score in the machine learning classifiers for each feature variable. Individual importance scores are represented by colors in the stacked bars. The classifiers are described in Table S1 in Supplementary Material.