Bassam Farran, Rihab AlWotayan, Hessa Alkandari, Dalia Al-Abdulrazzaq, Arshad Channanath, Thangavel Alphonse Thanaraj.
Abstract
Objective: In recent decades, the prevalence of type 2 diabetes (T2DM) has increased in the Arab population, particularly within the Gulf Cooperation Council. In this context, early intervention programmes rely on the ability to identify individuals at risk of T2DM. We aimed to build prognostic models for the risk of T2DM in the Arab population, comparing machine-learning algorithms with conventional logistic regression (LR) and using simple non-invasive clinical markers over three prediction horizons (3, 5, and 7 years from baseline).
Design: This retrospective cohort study used three models based on LR, k-nearest neighbours (k-NN), and support vector machines (SVM), with five-fold cross-validation. The models included the following baseline non-invasive parameters: age, sex, body mass index (BMI), pre-existing hypertension, family history of hypertension, and family history of T2DM.
Setting: This study was based on data from the Kuwait Health Network (KHN), which integrated primary health and hospital laboratory data into a single system.
Participants: The study included 1,837 native Kuwaiti Arab individuals (equal proportions of men and women) with a mean age of 59.5 ± 11.4 years. Among them, 647 developed T2DM within 7 years of the baseline non-invasive measurements.
Analytical methods: The discriminatory power of each model for classifying people at risk of T2DM within 3, 5, or 7 years and the area under the receiver operating characteristic curve (AUC) were determined.
Outcome measures: Onset of T2DM at 3, 5, and 7 years.
Keywords: body mass index; hypertension; k-nearest neighbours; logistic regression; prognosis; support vector machine; type 2 diabetes
Year: 2019 PMID: 31572303 PMCID: PMC6749017 DOI: 10.3389/fendo.2019.00624
Source DB: PubMed Journal: Front Endocrinol (Lausanne) ISSN: 1664-2392 Impact factor: 5.555
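The design described in the abstract (a k-NN model scored with five-fold cross-validation) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names, the choice of five neighbours, the Euclidean distance, and the synthetic two-feature data are not taken from the study, and the KHN cohort is not used.

```python
import math
import random


def five_fold_indices(n, k=5, seed=0):
    """Shuffle indices 0..n-1 and split them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def knn_risk(train_X, train_y, x, n_neighbors=5):
    """Fraction of the n_neighbors closest training points that are positive."""
    nearest = sorted(
        range(len(train_X)),
        key=lambda i: math.dist(train_X[i], x),
    )[:n_neighbors]
    return sum(train_y[i] for i in nearest) / len(nearest)


def out_of_fold_scores(X, y, k=5, n_neighbors=5, seed=0):
    """Score each sample using only the folds that do not contain it."""
    scores = [None] * len(X)
    for fold in five_fold_indices(len(X), k, seed):
        held_out = set(fold)
        train = [i for i in range(len(X)) if i not in held_out]
        train_X = [X[i] for i in train]
        train_y = [y[i] for i in train]
        for i in fold:
            scores[i] = knn_risk(train_X, train_y, X[i], n_neighbors)
    return scores


# Synthetic, already standardised features (illustrative only).
rng = random.Random(42)
X, y = [], []
for _ in range(200):
    label = int(rng.random() < 0.35)
    centre = 1.0 if label else -1.0
    X.append((rng.gauss(centre, 1.0), rng.gauss(centre, 1.0)))
    y.append(label)

scores = out_of_fold_scores(X, y)
```

Each entry of `scores` is a risk estimate produced by a model that never saw that participant during fitting, which is what makes a cross-validated AUC an honest estimate of discrimination. With real clinical features on different scales (age in years, BMI in kg/m²), the features would normally be standardised before computing distances.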
Descriptive statistics of the participants.
| Characteristic | Value |
| --- | --- |
| Total number of participants | 1,837 |
| Sex (Male:Female) | 909:928 (49.5%:50.5%) |
| Number of participants with a family history of T2DM | 587 (32.0%) |
| Number of participants with a family history of hypertension | 371 (20.2%) |
| Number of participants who were baseline hypertensive | 1,316 (71.6%) |
| Number of participants with T2DM considering the 3-year horizon | 290 (15.8%) |
| Number of participants with T2DM considering the 5-year horizon | 468 (25.5%) |
| Number of participants with T2DM considering the 7-year horizon | 647 (35.2%) |
| Mean age of participants at T2DM onset considering the 3-year horizon (years) | 55.1 ± 11.0 |
| Mean age of participants at T2DM onset considering the 5-year horizon (years) | 56.7 ± 11.5 |
| Mean age of participants at T2DM onset considering the 7-year horizon (years) | 58.4 ± 11.5 |
| Mean BMI of participants considering the 3-year horizon (kg/m²) | 33.6 ± 10.2 |
| Mean BMI of participants considering the 5-year horizon (kg/m²) | 33.2 ± 8.9 |
| Mean BMI of participants considering the 7-year horizon (kg/m²) | 33.0 ± 8.5 |
| Mean interval from study entry point to diabetes diagnosis considering the 3-year horizon (months) | 17.9 ± 11.5 |
| Mean interval from study entry point to diabetes diagnosis considering the 5-year horizon (months) | 29.6 ± 17.9 |
| Mean interval from study entry point to diabetes diagnosis considering the 7-year horizon (months) | 41.5 ± 24.8 |
T2DM, type 2 diabetes mellitus; BMI, body mass index.
Descriptive statistics of participants who became diabetic within 7 years since study entry point vs. those who did not become diabetic.
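| Characteristic | Became diabetic within 7 years | Did not become diabetic | p-value |
| --- | --- | --- | --- |
| Male | 311 (48.1%) | 598 (50.3%) | 0.4 |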
| Mean age at entry point (years) | 54.92 ± 11.05 | 61.9 ± 10.8 | <0.001 |
| Mean BMI at entry point (kg/m²) | 32.95 ± 8.45 | 30.82 ± 6.19 | <0.001 |
| Positive diagnosis for hypertension at entry point | 431 (66.6%) | 885 (74.3%) | <0.001 |
| Family history of diabetes | 191 (29.5%) | 396 (33.2%) | 0.110 |
| Family history of hypertension | 165 (25.5%) | 206 (17.3%) | <0.001 |
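As a worked check of how a p-value for a categorical row such as "Male" could arise, the sketch below applies a Pearson chi-square test to the 2×2 table of sex by outcome. The choice of test is an assumption: the source does not state which test produced the tabulated p-values. The counts come from the tables (311 men among the 647 who developed T2DM, 598 men among the remaining 1,837 − 647 = 1,190), and `chi_square_2x2` is a hypothetical helper name.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Assumed test; counts taken from the descriptive tables.
men_t2dm, men_no_t2dm = 311, 598
women_t2dm = 647 - men_t2dm                # 336
women_no_t2dm = (1837 - 647) - men_no_t2dm  # 592

chi2, p = chi_square_2x2(men_t2dm, women_t2dm, men_no_t2dm, women_no_t2dm)
print(f"chi2 = {chi2:.2f}, p = {p:.2f}")
```

Under this assumed test the p-value is well above 0.05, in line with the non-significant value tabulated for sex.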
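AUC values obtained using logistic regression, k-nearest neighbours, and support vector machine models designed for predicting the risk of T2DM over three different prediction horizons.
| Prediction horizon | Logistic regression | k-nearest neighbours | Support vector machine |
| --- | --- | --- | --- |
| 3-year | 0.737 | 0.8308 | 0.7286 |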
| 5-year | 0.7161 | 0.818 | 0.6823 |
| 7-year | 0.7039 | 0.7903 | 0.7059 |
AUC, area under the curve; T2DM, type 2 diabetes mellitus; CI, confidence interval.
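The AUC values in the table can be read as the probability that a randomly chosen participant who developed T2DM receives a higher predicted risk than a randomly chosen participant who did not. A minimal sketch of that pairwise definition follows; the labels and scores are made up for illustration, and the function name `auc` is an assumption rather than anything from the study.

```python
def auc(labels, scores):
    """Probability that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Illustrative data only: 1 = developed T2DM, 0 = did not.
labels = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.3, 0.6, 0.6, 0.2, 0.4]
print(auc(labels, scores))
```

This pairwise form is quadratic in the number of participants, which is fine for a cohort of this size; rank-based formulas give the same value more efficiently on larger data.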
Figure 1. Receiver operating characteristic (ROC) curves derived for prediction horizons of 3, 5, and 7 years using the three models based on logistic regression (LR), k-nearest neighbours (k-NN), and support vector machine (SVM). (a) 3-year prediction horizon. (b) 5-year prediction horizon. (c) 7-year prediction horizon.
Variables identified as significant (shown in bold) when applying the logistic regression model to the three prediction horizons.
| Variable | Coefficient | Odds ratio (95% CI) | p-value |
| --- | --- | --- | --- |
| Family history of diabetes | −0.094 | 0.89 (0.67, 1.17) | 0.547 |
| Diagnosis for baseline hypertension | |||
| Sex | −0.11 | 0.99 (0.77, 1.28) | 0.447 |
| Family history of diabetes | −0.1523 | 0.89 (0.71, 1.12) | 0.239 |
| Diagnosis for baseline hypertension | |||
| Sex | −0.171 | 0.92 (0.75, 1.15) | 0.156 |
| Diagnosis for baseline hypertension | |||
| Sex | −0.1650 | 0.92 (0.76, 1.11) | 0.1261 |
BMI, body mass index.
Comparison of the presented prognostic models with models reported in the literature.
| No. | Study | Method | Predictors | AUC |
| --- | --- | --- | --- | --- |
| 1 | Current study | k-nearest neighbours | Age, BMI, family history of diabetes, hypertensive status, family history of hypertension, and sex | 0.83 (3-year), 0.82 (5-year), 0.79 (7-year) |
| 2 | Current study (3, 5 and 7 years) | Logistic regression; support vector machine | Age, BMI, family history of diabetes, hypertensive status, family history of hypertension, and sex | LR: 0.74 (3-year), 0.72 (5-year), 0.70 (7-year) |
| 3 | Alssema et al. | Logistic regression | Age, BMI, waist circumference, physical activity, diet, use of antihypertensive medication, history of high blood glucose level, sex, smoking, and family history of diabetes (parent, sibling, or both) | 0.77 |
| 4 | Wannamethee et al. | Logistic regression | Age, sex, family history of diabetes, smoking status, BMI, waist circumference, and hypertension | 0.77 |
| 5 | Rathmann et al. | Logistic regression | Age, sex, BMI, parental diabetes, smoking, and hypertension | 0.76 |
| 6 | Chen et al. | Logistic regression | Age, sex, ethnicity, parental history of diabetes, history of high blood glucose level, use of antihypertensive medications, smoking, physical inactivity, and waist circumference | 0.78 |
| 7 | Rosella et al. | Logistic regression | BMI, age, ethnicity, hypertension, immigrant status, smoking, education status, and heart disease | 0.77 |
| 8 | Joseph et al. | Cox proportional hazards models | Age, BMI, triglycerides, high-density lipoprotein cholesterol, hypertension, family history of diabetes, low education, and smoking | – |
| 9 | Kahn et al. | Proportional hazards models | Waist circumference, maternal diabetes, hypertension, paternal diabetes, short stature, black race, age 55 years or older, increased weight, rapid pulse, and smoking history | – |
| 10 | Hippisley-Cox et al. | Cox proportional hazards models (reports hazard ratios) | Ethnicity, age, sex, BMI, smoking status, family history of diabetes, Townsend deprivation score, treated hypertension, cardiovascular disease, and current use of corticosteroids | – |
| 11 | Balkau et al. | Logistic regression | Waist circumference and hypertension in both sexes, smoking in men, and family history of diabetes in women | 0.71 for men, 0.83 for women |
| 12 | Simmons et al. | Logistic regression | Physical activity, diet, age, BMI, and family history | 0.76 |
| 13 | Wilson et al. | Logistic regression | Age, sex, parental history of diabetes, and BMI | 0.72 |
| 14 | Lindström et al. | Logistic regression | Age, BMI, waist circumference, history of antihypertensive drug treatment, high blood glucose, physical activity, and diet | 0.85 (model development); 0.87 (model validation) |