Sajida Perveen, Muhammad Shahbaz, Muhammad Sajjad Ansari, Karim Keshavjee, Aziz Guergachi.
Abstract
Type 2 Diabetes Mellitus (T2DM) is a chronic, progressive metabolic disorder characterized by hyperglycemia resulting from abnormalities in insulin secretion, insulin action, or both. It is associated with an increased risk of developing both microvascular and macrovascular complications. Because of its inconspicuous and heterogeneous character, the management of T2DM is very complex. Modeling physiological processes over time to capture the patient's evolving health condition is essential for understanding the patient's current health status, projecting its likely dynamics, and assessing the care and treatment measures required in the future. The Hidden Markov Model (HMM) is an effective approach for such prognostic modeling. However, the nature of the clinical setting, together with the format of Electronic Medical Record (EMR) data, in particular sparse and irregularly sampled clinical data, which are well understood to present significant challenges, has confounded the standard HMM. In the present study, we propose an approximation technique based on Newton's Divided Difference Method (NDDM) as a component alongside the HMM to determine the risk of an individual developing diabetes over different time horizons using irregular and sparsely sampled EMR data. The proposed method is capable of exploiting available sequences of clinical measurements obtained from a longitudinal sample of patients for effective imputation and improved prediction performance. Furthermore, the results demonstrate that the discrimination capability of our proposed method in prognosticating diabetes risk is superior to that of the standard HMM.
Keywords: hidden Markov model; machine learning; prognostic modelling; risk prediction; risk scoring; type 2 diabetes mellitus
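The abstract names Newton's Divided Difference Method as the approximation step used to handle irregularly sampled measurements. The record does not describe how the authors implemented it, so the following is only a minimal sketch of the general technique: it fits the Newton interpolating polynomial through a patient's observed visits and evaluates it at an unobserved time point. The function names and the sample values are hypothetical and are not taken from the study.

```python
from typing import List, Sequence


def divided_differences(xs: Sequence[float], ys: Sequence[float]) -> List[float]:
    """Return the Newton coefficients f[x0], f[x0,x1], ..., f[x0..xn]."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    coef = [float(y) for y in ys]
    n = len(xs)
    # Build each order of divided difference in place, from the bottom up,
    # so that coef[i] still holds the previous order when it is needed.
    for order in range(1, n):
        for i in range(n - 1, order - 1, -1):
            coef[i] = (coef[i] - coef[i - 1]) / (xs[i] - xs[i - order])
    return coef


def newton_eval(xs: Sequence[float], coef: Sequence[float], x: float) -> float:
    """Evaluate the Newton-form polynomial at x using nested multiplication."""
    result = coef[-1]
    for k in range(len(coef) - 2, -1, -1):
        result = result * (x - xs[k]) + coef[k]
    return result


# Hypothetical example: fasting blood sugar (mmol/L) recorded at irregular
# visit times (months). These values are illustrative, not study data.
visit_months = [0.0, 3.0, 12.0]
fbs_values = [5.2, 5.6, 6.4]

coefficients = divided_differences(visit_months, fbs_values)
estimate_at_month_6 = newton_eval(visit_months, coefficients, 6.0)
print(f"Estimated FBS at month 6: {estimate_at_month_6:.3f} mmol/L")
```

Because the visit times enter the formula directly, the method does not require evenly spaced samples, which is what makes it a candidate for sparse EMR sequences. High-order polynomials can oscillate between distant points, so a practical pipeline would probably restrict the fit to a few neighbouring visits.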
Year: 2020 PMID: 31969896 PMCID: PMC6958689 DOI: 10.3389/fgene.2019.01076
Source DB: PubMed Journal: Front Genet ISSN: 1664-8021 Impact factor: 4.599
Descriptive statistics of Diabetic and Non-diabetic population in our derived data sample.
| Predictors | Total population | Progressors | Non-progressors |
|---|---|---|---|
| Sample size (%) | 1,918 | 584 (23.49) | 1,334 |
| Demographic (Gender, Age) | |||
| Male, sample size (%) | 1,143 (38.96) | 280 | 495 |
| Female, sample size (%) | 775 (61.03) | 304 | 839 |
| Overall age, mean ± SD, years | 63.19 ± 11.89 | 65.312 ± 12.34 | 58.937 ± 14.315 |
| Vital Signs/clinical measures | |||
| Systolic BP, mean ± SD, mm Hg | 128.611 ± 15.86 | 131.49 ± 16.7 | 128.496 ± 17.259 |
| Lab Values | |||
| FBS, mean ± SD, mmol/L | 6.029 ± 1.51 | 7.256 ± 2.056 | 5.214 ± 0.562 |
| Triglycerides, mean ± SD, mmol/L | 1.72 ± 1.02 | 1.777 ± 1.205 | 1.419 ± 0.837 |
| HDL, mean ± SD, mmol/L | 1.356 ± 0.39 | 1.249 ± 0.361 | 1.453 ± 0.423 |
| HbA(1c), mean ± SD, mmol/L | 6.268 ± 0.95 | 6.821 ± 1.049 | 5.698 ± 0.365 |
| Total Cholesterol, mean ± SD, mmol/L | 5.409 ± 0.59 | 4.433 ± 1.224 | 5.081 ± 1.085 |
| LDL, mean ± SD, mmol/L | 2.442 ± 0.851 | 2.427 ± 1.011 | 3.007 ± 0.939 |
| BMI, mean ± SD, kg/m2 | 29.81 ± 6.362 | 36.163 ± 1197.454 | 37.984 ± 1697.225 |
| Depression frequency (%) | |||
| YES | 373 | 112 | 261 |
| NO | 1,544 | 472 | 1,073 |
| Hypertension positive cases frequency (%) | |||
| YES | 1,107 | 421 | 686 |
| NO | 809 | 163 | 646 |
| Unknown | 2 | | 2 |
SD, standard deviation; BP, blood pressure; BMI, body mass index; FBS, fasting blood sugar; LDL, low-density lipoprotein; HDL, high-density lipoprotein; HbA(1c), glycated hemoglobin.
Characteristics of the population in the CPCSSN database.
| Predictors | Findings |
|---|---|
| Demographic (Sex, Age) | |
| Female, sample size (%) | 100,566 (57) |
| Male age mean ± SD, Yr | 48.2 ± 24.1 |
| Female age mean ± SD, Yr | 49.5 ± 24.8 |
| Vital Signs/clinical measures | |
| Systolic BP, mean ± SD, mm Hg | 129.34 ± 17.183 |
| Chronic obstructive pulmonary disease frequency (%) | 9,939 (2.4) |
| Dementia frequency (%) | 12,007 (1.8) |
| Depression frequency (%) | 32,672 (10) |
| Diabetes Mellitus frequency (%) | 26,077 (6) |
| Epilepsy frequency (%) | 5,553 (0.8) |
| Hypertension frequency (%) | 61,370 (13) |
| Osteoarthritis frequency (%) | 37,274 (7) |
| Parkinson’s Disease frequency (%) | 1,825 (0.2) |
| Lab Values | |
| Fasting blood glucose, mean ± SD, mmol/L | 5.54 ± 1.91 |
| TG, mean ± SD, mmol/L | 1.523 ± 0.962 |
| LDL, mean ± SD, mmol/L | 2.83 ± 0.99 |
| High density lipoprotein, mean ± SD, mmol/L | 1.3893 ± 0.416 |
| BMI, mean ± SD, kg/m2 | 37.113 ± 1528.71 |
| HbA(1c), mean ± SD, mmol/L | 6.268 ± 0.976 |
| Cholesterol mean ± SD, mmol/L | 4.893 ± 1.159 |
SD, standard deviation; Yr, year; BP, blood pressure; LDL, low-density lipoprotein; HbA(1c), glycated hemoglobin; TG, triglycerides; BMI, body mass index; HDL, high-density lipoprotein.
* Some patients have more than 1 disease in the database.
Figure 1. Visualization of the association between diabetes and each of the risk factors included in our study sample.
Figure 2. Comparative analysis of the area under the receiver operating characteristic curve of our proposed method and the standard HMM over different time horizons.
Analysis of the association between individual risk factors and diabetes risk in our study sample.
| Explanatory variables | OR (95% C.I.) | P Value |
|---|---|---|
| Age | 1.002 (0.999-1.006) | 3.49E-22 |
| Systolic blood pressure | 0.995 (0.993-0.997) | 7.02E-07 |
| BMI | 1.036 (1.030-1.042) | 1.60E-58 |
| LDL | 0.621 (0.528-0.732) | 1.56E-23 |
| HDL | 0.577 (0.480-0.695) | 5.63E-24 |
| HbA(1c) | 12.565 (10.902-14.482) | 7.94E-143 |
| Triglycerides | 1.183 (1.093-1.281) | 5.34E-07 |
| Fasting blood glucose | 5.965 (5.607-1.281) | 0.000 |
| Total Cholesterol | 0.935 (0.795-1.098) | 0.411 |
| Intercept | | 3.63E-145 |
Nagelkerke R2 = 0.546.
Hosmer and Lemeshow Test = 0.360 (Significantly greater than 0.0005).
OR, odds ratio; C.I., confidence interval; sBP, systolic blood pressure; BMI, body mass index; LDL, low-density lipoprotein; HDL, high-density lipoprotein; HbA(1c), glycated hemoglobin.
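To relate the odds ratios in the table above to the underlying logistic regression, the sketch below converts a reported odds ratio and its 95% confidence interval back to a log-odds coefficient and an approximate standard error. It assumes the intervals are Wald intervals, symmetric on the log scale with z = 1.96; the record does not state how they were computed, and the function name is hypothetical.

```python
import math
from typing import Tuple


def log_odds_summary(
    odds_ratio: float, ci_low: float, ci_high: float, z: float = 1.96
) -> Tuple[float, float]:
    """Return (coefficient, standard error) implied by an OR and its CI.

    Assumes a Wald interval: exp(beta - z * se) to exp(beta + z * se).
    """
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z)
    return beta, se


# HbA(1c) row from the table: OR 12.565, 95% C.I. 10.902-14.482.
beta, se = log_odds_summary(12.565, 10.902, 14.482)
print(f"HbA(1c): beta = {beta:.3f}, approx. SE = {se:.3f}")
```

An odds ratio above 1 corresponds to a positive coefficient (higher values of the predictor are associated with higher odds of diabetes), and an interval that contains 1, as for total cholesterol, corresponds to a coefficient that is not distinguishable from zero at the 5% level.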