Era Kim, David S Pieczkiewicz, M Regina Castro, Pedro J Caraballo, Gyorgy J Simon.
Abstract
Because deterioration in overall metabolic health underlies multiple complications of Type 2 Diabetes Mellitus, the risk factors for these complications overlap substantially, which makes the outcomes difficult to distinguish. We hypothesized that each risk factor has two roles: describing the extent of deterioration in overall metabolic health, and signaling the particular complication the patient is progressing towards. We aimed to examine the feasibility of our proposed methodology, which separates these two roles, thereby improving the interpretation of predictions and helping prioritize which complication to target first. To separate the two roles, we built models for six complications using Multi-Task Learning (a machine learning technique for modeling multiple related outcomes by exploiting their commonality) in 80% of EHR data (N=9,793) from a university hospital and validated them in the remaining 20% of the data. Additionally, we externally validated the models in claims and EHR data from the OptumLabs™ Data Warehouse (N=72,720). Our methodology successfully separated the two roles, revealing distinguishing outcome-specific risk factors without compromising predictive performance. We believe that our methodology has great potential to generate more understandable, and thus more actionable, clinical information for making a more accurate and timely prognosis for patients.
Year: 2018 PMID: 29888055 PMCID: PMC5961813
Source DB: PubMed Journal: AMIA Jt Summits Transl Sci Proc
Figure 1. Study design
Baseline Patient Characteristics in UMMC and OLDW Datasets
| Variable | Description | UMMC (N=9,793) | OLDW (N=72,720) |
|---|---|---|---|
| male | Male | 51 | 46 |
| age | Age (years) | 58±13 | 60±12 |
| never_smoker | Non-smoker | 56 | 45 |
| a1c | HbA1C | 7.2±1 | 7.0±1 |
| ldl | LDL-cholesterol (mg/dL) | 103±28 | 101±28 |
| hdl | HDL-cholesterol (mg/dL) | 44±12 | 46±12 |
| trigl | Triglycerides (mg/dL) | 172±90 | 169±117 |
| tchol | Total-cholesterol (mg/dL) | 181±34 | 179±34 |
| gfr | Glomerular Filtration Rate (ml/min/1.73m2) | 58±32 | 76±27 |
| gfr_norm | Normal Glomerular Filtration Rate | 22 | 7 |
| bmi | Body Mass Index (kg/m2) | 34±7 | 34±8 |
| sbp | Systolic Blood Pressure (mmHg) | 127±11 | 131±11 |
| dbp | Diastolic Blood Pressure (mmHg) | 75±7 | 77±7 |
| pls | Pulse (bpm) | 76±9 | 77±9 |
| hyperlip | Hyperlipidemia | 81 | 86 |
| htn | Hypertension | 71 | 81 |
| obese | Obesity (BMI > 30) | 70 | 67 |
Figure 2. Coefficients from Multi-Task Learning Methodology
P-values of Coefficients from Multi-Task Learning Methodology
| Variable | General | CKD | ARF | IHD | PVD | CHF | CVD |
|---|---|---|---|---|---|---|---|
| a1c | 0.008 | 0.049 | 0.340 | 0.027 | 0.005 | 0.058 | 0.327 |
| ldl | 0.329 | 0.117 | 0.289 | 0.034 | 0.265 | 0.259 | 0.256 |
| hdl | 0.058 | 0.281 | 0.343 | 0.123 | 0.145 | < 0.001 | 0.283 |
| trigl | 0.014 | 0.297 | 0.307 | 0.059 | 0.079 | < 0.001 | 0.278 |
| tchol | 0.021 | 0.057 | 0.25 | 0.039 | 0.023 | 0.059 | 0.234 |
| gfr | < 0.001 | < 0.001 | 0.265 | < 0.001 | < 0.001 | 0.251 | 0.255 |
| gfr_norm | < 0.001 | < 0.001 | 0.261 | < 0.001 | < 0.001 | 0.025 | 0.237 |
| bmi | 0.042 | 0.067 | 0.333 | 0.051 | 0.034 | < 0.001 | 0.272 |
| pls | 0.188 | 0.290 | 0.320 | 0.105 | 0.076 | < 0.001 | 0.294 |
| sbp | 0.01 | 0.076 | 0.31 | 0.002 | 0.003 | < 0.001 | 0.272 |
| dbp | 0.006 | 0.261 | 0.307 | 0.029 | < 0.001 | 0.289 | 0.266 |
| neversmoker | 0.002 | 0.089 | 0.017 | 0.098 | < 0.001 | < 0.001 | 0.317 |
| age | < 0.001 | 0.289 | 0.311 | 0.045 | < 0.001 | < 0.001 | < 0.001 |
| male | 0.107 | 0.094 | 0.321 | 0.019 | < 0.001 | 0.008 | 0.312 |
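The "General" column next to the per-complication columns reflects the two roles described in the abstract: one coefficient set shared by all outcomes and one set specific to each outcome. The sketch below illustrates that decomposition only. It is not the authors' model: it assumes binary outcomes, a logistic loss, plain gradient descent, and a ridge penalty on the outcome-specific part, and the function name `fit_shared_plus_specific` and its parameters are hypothetical.

```python
import numpy as np

def fit_shared_plus_specific(X, Y, lam_specific=0.05, lr=0.2, n_iter=2000):
    """Illustrative multi-task logistic regression (assumed setup, not the paper's).

    X: (n, p) standardized risk factors; Y: (n, k) binary outcomes, one column
    per complication. The coefficient of feature j for outcome t is modeled as
    shared[j] + specific[j, t]. Only `specific` is penalized, so effects common
    to all outcomes tend to be absorbed by `shared`.
    """
    n, p = X.shape
    k = Y.shape[1]
    shared = np.zeros(p)
    specific = np.zeros((p, k))
    for _ in range(n_iter):
        W = shared[:, None] + specific            # (p, k) effective coefficients
        P = 1.0 / (1.0 + np.exp(-(X @ W)))        # (n, k) predicted probabilities
        G = X.T @ (P - Y) / n                     # (p, k) gradient of mean log-loss per outcome
        shared -= lr * G.sum(axis=1)              # shared part sees every outcome
        specific -= lr * (G + lam_specific * specific)  # ridge-shrunk outcome-specific part
    return shared, specific
```

In this sketch, a feature that raises risk for every outcome ends up mostly in `shared`, while a feature that matters for a single outcome shows a large entry in that outcome's column of `specific`.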
Figure 3. Coefficients for Risk of Developing Any Complication
Figure 4. Characteristic Shapes of Differential Markers for CKD, IHD, PVD, and CHF
Figure 5. Coefficients from Baseline Methodology
P-values of Coefficients from Baseline Methodology
| Variable | CKD | ARF | IHD | PVD | CHF | CVD |
|---|---|---|---|---|---|---|
| a1c | 0.033 | < 0.001 | 0.132 | 0.023 | 0.164 | 0.007 |
| ldl | 0.052 | < 0.001 | 0.043 | 0.101 | 0.289 | 0.045 |
| hdl | 0.088 | 0.014 | 0.101 | 0.333 | < 0.001 | 0.024 |
| trigl | 0.333 | 0.010 | 0.023 | 0.025 | 0.003 | 0.162 |
| tchol | 0.269 | < 0.001 | 0.029 | 0.045 | 0.239 | 0.271 |
| gfr | < 0.001 | < 0.001 | 0.006 | < 0.001 | 0.021 | 0.019 |
| gfr_norm | < 0.001 | < 0.001 | 0.007 | < 0.001 | 0.258 | 0.033 |
| bmi | 0.085 | 0.102 | 0.101 | 0.118 | < 0.001 | 0.184 |
| pls | 0.348 | < 0.001 | 0.121 | 0.081 | < 0.001 | 0.088 |
| sbp | 0.055 | 0.005 | 0.024 | < 0.001 | < 0.001 | 0.007 |
| dbp | 0.154 | 0.004 | 0.117 | < 0.001 | 0.138 | 0.159 |
| neversmoker | 0.344 | < 0.001 | 0.016 | < 0.001 | < 0.001 | 0.128 |
| age | < 0.001 | < 0.001 | 0.021 | < 0.001 | < 0.001 | < 0.001 |
| male | 0.347 | < 0.001 | 0.059 | < 0.001 | 0.002 | 0.041 |
Predictive Performance in C-Index (95% CIs)
| Dataset | Methodology | CKD | ARF | IHD | PVD | CHF | CVD |
|---|---|---|---|---|---|---|---|
| Internal (UMMC) | MTL | .74(.73-.79) | .58(.48-.82) | .57(.52-.58) | .75(.59-.80) | .83(.70-.91) | .75(.63-.78) |
| Internal (UMMC) | Reference | .74(.73-.79) | .62(.48-.79) | .57(.52-.58) | .75(.60-.81) | .84(.67-.91) | .78(.65-.80) |
| External (OLDW) | MTL | .71 | .61 | .53 | .61 | .73 | .64 |
| External (OLDW) | Reference | .71 | .63 | .53 | .61 | .74 | .68 |
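For readers unfamiliar with the metric, the following is a minimal sketch of one common definition of the C-index (Harrell's concordance for right-censored data). The record does not state which variant or software the authors used, so the pairing rule here is an assumption, and the function name `concordance_index` is hypothetical.

```python
def concordance_index(times, events, risk):
    """Harrell-style C-index (assumed variant, for illustration only).

    times:  follow-up time per patient
    events: 1 if the complication was observed, 0 if censored
    risk:   model risk score (higher = expected earlier event)

    A pair (i, j) is comparable when patient i had an observed event strictly
    before patient j's follow-up time. The pair is concordant when the model
    assigns i the higher risk; tied risk scores count as half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

Under this definition a value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the table entries can be read.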