Che Ngufor1,2, Pedro J Caraballo1,3, Thomas J O'Byrne4, David Chen1, Nilay D Shah2,4, Lisiane Pruinelli5, Michael Steinbach6, Gyorgy Simon3,7,8.
Abstract
Importance: Clinical domain knowledge about diseases and their comorbidities, severity, treatment pathways, and outcomes can facilitate diagnosis, enhance preventive strategies, and help create smart evidence-based practice guidelines.

Objective: To introduce a new representation of patient data called disease severity hierarchy that leverages domain knowledge in a nested fashion to create subpopulations that share increasing amounts of clinical details suitable for risk prediction.

Design, Setting, and Participants: This retrospective cohort study included 51 969 patients aged 45 to 85 years, with 10 674 patients who received primary care at the Mayo Clinic between January 2004 and December 2015 in the training cohort and 41 295 patients who received primary care at Fairview Health Services from January 2010 to December 2017 in the validation cohort. Data were analyzed from May 2018 to December 2019.

Main Outcomes and Measures: Several binary classification measures, including the area under the receiver operating characteristic curve (AUC), Gini score, sensitivity, and positive predictive value, were used to evaluate models predicting all-cause mortality and major cardiovascular events at ages 60, 65, 75, and 80 years.
Year: 2020 PMID: 32678448 PMCID: PMC7368174 DOI: 10.1001/jamanetworkopen.2020.8270
Source DB: PubMed Journal: JAMA Netw Open ISSN: 2574-3805
Figure 1. Study Design and Model Development
A, Overview of study design with disease severity hierarchy determined at each time point in the baseline time window. B, Workflow of the training and validation procedure and selection of final model. Data were randomly divided into 10 equal and independent parts (ie, folds), then the models were trained on 9 folds and tested on the hold-out fold. The procedure was repeated until each fold had been used for testing. At each step, optimal parameters were selected and performance was evaluated on the hold-out fold. The final model was then developed on the complete Rochester Epidemiology Project (REP) data using optimal parameters. ACC indicates accuracy; ACM, all-cause mortality; AUC, area under the receiver operating characteristic curve; FHS, Fairview Health Services; MCE, major cardiovascular event; and PPV, positive predictive value.
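The 10-fold procedure in panel B can be sketched as follows. This is a minimal illustration and not the authors' code: the function names `k_fold_indices` and `cross_validate`, and the `fit` and `evaluate` callbacks, are hypothetical, and the per-fold parameter selection mentioned in the caption is omitted for brevity.

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle sample indices and split them into k near-equal, disjoint folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Striding by k gives folds whose sizes differ by at most one.
    return [indices[i::k] for i in range(k)]

def cross_validate(n_samples, fit, evaluate, k=10, seed=0):
    """Train on k-1 folds, score on the hold-out fold, and repeat for every fold.

    `fit(train_idx)` returns a model; `evaluate(model, test_idx)` returns a score.
    Both callbacks are placeholders supplied by the caller.
    """
    folds = k_fold_indices(n_samples, k, seed)
    scores = []
    for held_out, test_idx in enumerate(folds):
        train_idx = [i for f, fold in enumerate(folds) if f != held_out for i in fold]
        model = fit(train_idx)
        scores.append(evaluate(model, test_idx))
    return scores

# Toy usage: the "model" is just the training-set size.
fold_scores = cross_validate(
    n_samples=100,
    fit=lambda train_idx: len(train_idx),
    evaluate=lambda model, test_idx: model / (model + len(test_idx)),
)
print(fold_scores)
```

Each sample appears in exactly one hold-out fold, so every patient contributes to testing once and to training in the other 9 rounds.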
Figure 2. Disease Severity Hierarchy for Type 2 Diabetes
An example of a 6-level disease severity hierarchy tree structure and corresponding risk score allocations for type 2 diabetes. Disease severity intensifies on any right branch from the root to the leaves. A patient is considered under control if the laboratory result or vital sign associated with the condition is within its predefined reference range, eg, glycated hemoglobin less than 6.5% (to convert to proportion of total hemoglobin, multiply by 0.01). The disease severity hierarchy structures for hypertension and hyperlipidemia are similar to that for type 2 diabetes. An example of the disease severity hierarchy tree for obesity is presented in eFigure 3 in the Supplement.
Study Population
| Variable | REP (n = 10 674), No. (%) | FHS (n = 41 295), No. (%) |
|---|---|---|
| Age, mean (SD), y | 59.4 (10.8) | 57.4 (7.9) |
| Women | 6324 (59.3) | 21 975 (53.2) |
| Race/ethnicity | ||
| White | 9804 (91.9) | 37 653 (91.2) |
| Black | 179 (1.7) | 1586 (3.8) |
| Asian | 369 (3.5) | 902 (2.2) |
| American Indian | 21 (0.2) | 240 (0.6) |
| Hawaiian | 11 (0.1) | 29 (0.1) |
| Unknown | 290 (2.7) | 885 (2.1) |
| Type 2 diabetes | ||
| Sum | 5.4 (13.6) | 12.2 (17.6) |
| Mean | 0.4 (0.9) | 0.9 (1.3) |
| Last observed | 0.2 (0.4) | 0.2 (0.8) |
| Hypertension | ||
| Sum | 38.2 (22.7) | 47.3 (28.7) |
| Mean | 2.5 (1.5) | 3.4 (2.0) |
| Last observed | 1.5 (1.6) | 1.1 (2.3) |
| Hyperlipidemia | ||
| Sum | 19.5 (12.5) | 18.4 (13.5) |
| Mean | 1.3 (0.8) | 1.3 (1.0) |
| Last observed | 0.7 (1.1) | 0.6 (1.2) |
| Obesity | ||
| Sum | 15.1 (12.1) | 17.1 (12.0) |
| Mean | 1.0 (0.8) | 1.2 (0.9) |
| Last observed | 0.7 (1.1) | 0.3 (0.8) |
| Type 2 diabetes | ||
| Most frequent | 1733 (16.2) | 15 716 (38.1) |
| Rolling | 2664 (25.0) | 26 438 (64.0) |
| Hypertension | ||
| Most frequent | 10 019 (93.9) | 29 621 (71.7) |
| Rolling | 10 575 (99.1) | 41 218 (99.8) |
| Hyperlipidemia | ||
| Most frequent | 7678 (71.9) | 21 839 (52.9) |
| Rolling | 8806 (82.5) | 28 563 (69.2) |
| Obesity | ||
| Most frequent | 4078 (38.2) | 17 111 (41.4) |
| Rolling | 4986 (46.7) | 26 198 (63.4) |
| Frequency, mean (SD), No. | 23.2 (22.3) | 14.0 (0.2) |
| ACE inhibitor | 3416 (32.0) | 25 968 (62.9) |
| Calcium channel blocker | 2192 (20.5) | 19 214 (46.5) |
| β-blocker | 3958 (37.1) | 30 437 (73.7) |
| Diuretic | 1124 (10.5) | 28 046 (67.9) |
| Statin | 5687 (53.3) | 31 294 (75.8) |
| α-blocker | 1183 (11.1) | 90 (0.2) |
| Angiotensin receptor blocker | 1515 (14.2) | 11 832 (28.6) |
| Other | 262 (2.5) | 12 440 (30.1) |
| Fibrate | 478 (4.5) | 4095 (9.9) |
| Sulfonylurea | 660 (6.2) | 7476 (18.1) |
| Renin inhibitor | 7 (0.1) | 70 (0.3) |
| Insulin | 545 (5.1) | 14 559 (35.2) |
| Cholesterol absorption inhibitor | 587 (5.5) | 4213 (10.2) |
| Metformin | 1196 (11.2) | 13 033 (31.6) |
| Dipeptidyl peptidase-4 | 124 (1.2) | 2150 (5.2) |
| Meglitinide | 23 (0.2) | 203 (0.5) |
| Vasodilator | 63 (0.6) | 16 619 (40.2) |
| GLP-1 agonist | 48 (0.5) | 2070 (5.0) |
| Amylin | 4 (<0.1) | 52 (0.1) |
| SGLT-2 inhibitor | 1 (<0.1) | 608 (1.5) |
| Glycated hemoglobin | 2.9 (6.4) | 7.3 (9.3) |
| BNP | 0.2 (1.2) | 0.0 (0.0) |
| Creatinine | 18.9 (28.5) | 29.4 (39.1) |
| Cardiac troponin 1 | 0.1 (0.5) | 0.0 (0.0) |
| Cardiac troponin T | 2.5 (6.1) | 0.0 (0.0) |
| Glucose | ||
| Fasting | 10.1 (11.3) | 28.1 (38.1) |
| Random | 0.1 (0.4) | 33.6 (103.2) |
| Glomerular filtration rate | 11.0 (27.5) | 39.6 (66.8) |
| HDL cholesterol | 8.9 (6.8) | 8.9 (6.9) |
| LDL cholesterol | 8.7 (6.7) | 8.8 (6.8) |
| Total cholesterol | 9.0 (6.9) | 8.9 (6.9) |
| NT-proBNP | 0.3 (1.6) | 0.0 (0.0) |
| Triglycerides | 8.8 (6.9) | 9.0 (7.5) |
| BMI | 18.2 (15.5) | 38.3 (31.0) |
| Diastolic blood pressure | 106.9 (204.6) | 46.7 (39.5) |
| Height | 20.4 (18.1) | 0.0 (0.0) |
| Pulse | 196.0 (618.4) | 41.7 (35.9) |
| Systolic blood pressure | 106.9 (204.5) | 46.7 (39.5) |
| Weight | 40.2 (41.3) | 0.0 (0.0) |
| Respiration | 61.1 (222.6) | 0.0 (0.0) |
| Creatinine, mg/dL | 1.0 (0.3) | 6.1 (103.2) |
| Fasting glucose, mg/dL | 104.1 (20.2) | 119.9 (32.6) |
| Total cholesterol, mg/dL | 193.3 (30.1) | 178.8 (81.8) |
| BMI | 28.4 (6.3) | 35.1 (63.9) |
| Diastolic blood pressure, mm Hg | 73.9 (7.0) | 74.0 (6.7) |
| Pulse, bpm | 73.3 (8.4) | 74.5 (8.9) |
| Systolic blood pressure, mm Hg | 127.1 (11.7) | 128.0 (10.3) |
| Creatinine, mg/dL | 1.0 (0.3) | 8.3 (146.9) |
| Fasting glucose, mg/dL | 103.8 (20.2) | 119.7 (33.9) |
| Total cholesterol, mg/dL | 192.1 (30.0) | 177.9 (152.3) |
| BMI | 28.3 (6.2) | 36.1 (76.8) |
| Diastolic blood pressure, mm Hg | 73.6 (7.1) | 73.8 (7.1) |
| Pulse, bpm | 73.2 (8.6) | 74.8 (9.3) |
| Systolic blood pressure, mm Hg | 126.4 (11.8) | 127.9 (10.8) |
| ACM | 945 (8.9) | 1857 (4.5) |
| MCE | 787 (7.4) | 3178 (7.7) |
Abbreviations: ACE, angiotensin-converting enzyme; ACM, all-cause mortality; BMI, body mass index (calculated as weight in kilograms divided by height in meters squared); BNP, brain natriuretic peptide; bpm, beats per minute; DSH, disease severity hierarchy; FHS, Fairview Health Services; GLP-1, glucagon-like peptide 1; HDL, high-density lipoprotein; LDL, low-density lipoprotein; MCE, major cardiovascular event; NT-proBNP, N-terminal pro–brain natriuretic peptide; REP, Rochester Epidemiology Project; SGLT-2, sodium-glucose cotransporter-2.
SI conversion factors: To convert creatinine to micromoles per liter, multiply by 88.4; fasting glucose to millimoles per liter, multiply by 0.055; and total cholesterol to millimoles per liter, multiply by 0.0259.
The index date was defined as age on January 1, 2004, for patients in REP and as age on January 1, 2010, for patients in FHS.
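As a worked example of the SI conversion factors in the footnote above, the sketch below applies them to three REP values from the table (creatinine 1.0 mg/dL, fasting glucose 104.1 mg/dL, total cholesterol 193.3 mg/dL). The dictionary keys and the `to_si` helper are hypothetical names chosen for illustration; only the factors themselves come from the footnote.

```python
# Multiplicative factors quoted in the table footnote (conventional -> SI units).
SI_FACTORS = {
    "creatinine": 88.4,           # mg/dL -> micromoles per liter
    "fasting_glucose": 0.055,     # mg/dL -> millimoles per liter
    "total_cholesterol": 0.0259,  # mg/dL -> millimoles per liter
}

def to_si(analyte, value_mg_dl):
    """Convert a conventional-unit (mg/dL) value to SI units."""
    return value_mg_dl * SI_FACTORS[analyte]

# REP values taken from the study population table.
for analyte, value in [
    ("creatinine", 1.0),
    ("fasting_glucose", 104.1),
    ("total_cholesterol", 193.3),
]:
    print(analyte, round(to_si(analyte, value), 2))
```

With these factors, a fasting glucose of 104.1 mg/dL corresponds to roughly 5.7 mmol/L and a total cholesterol of 193.3 mg/dL to roughly 5.0 mmol/L.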
All-Cause Mortality Internal and External Validation
| Model | Age, y | ACC | AUC | Sensitivity | Specificity | PPV |
|---|---|---|---|---|---|---|
| Internal validation (REP) | | | | | | |
| DSH-RS | 60 | 0.91 (0.90-0.93) | 0.96 (0.94-0.97) | 0.98 (0.91-1.00) | 0.91 (0.90-0.93) | 0.10 (0.07-0.13) |
| | 65 | 0.92 (0.90-0.93) | 0.96 (0.95-0.98) | 0.99 (0.94-1.00) | 0.91 (0.90-0.93) | 0.15 (0.10-0.19) |
| | 75 | 0.94 (0.93-0.95) | 0.97 (0.96-0.98) | 0.98 (0.93-1.00) | 0.94 (0.93-0.95) | 0.35 (0.25-0.39) |
| | 80 | 0.96 (0.95-0.97) | 0.98 (0.98-0.99) | 0.99 (0.97-1.00) | 0.95 (0.95-0.96) | 0.55 (0.49-0.62) |
| COM | 60 | 0.62 (0.22-0.94) | 0.67 (0.55-0.80) | 0.49 (0.11-0.89) | 0.62 (0.21-0.94) | 0.02 (0.01-0.05) |
| | 65 | 0.61 (0.38-0.93) | 0.66 (0.56-0.79) | 0.55 (0.17-0.85) | 0.61 (0.38-0.94) | 0.03 (0.01-0.05) |
| | 75 | 0.54 (0.41-0.65) | 0.64 (0.57-0.71) | 0.64 (0.46-0.79) | 0.54 (0.40-0.65) | 0.05 (0.03-0.06) |
| | 80 | 0.59 (0.51-0.67) | 0.63 (0.54-0.70) | 0.60 (0.51-0.70) | 0.59 (0.50-0.67) | 0.08 (0.05-0.10) |
| COM + LB/VS | 60 | 0.80 (0.62-0.96) | 0.86 (0.79-0.92) | 0.68 (0.42-0.83) | 0.80 (0.62-0.97) | 0.06 (0.01-0.15) |
| | 65 | 0.84 (0.70-0.96) | 0.87 (0.82-0.93) | 0.70 (0.48-0.87) | 0.85 (0.70-0.97) | 0.10 (0.03-0.23) |
| | 75 | 0.79 (0.69-0.86) | 0.79 (0.74-0.83) | 0.64 (0.58-0.73) | 0.80 (0.69-0.87) | 0.11 (0.05-0.17) |
| | 80 | 0.72 (0.60-0.80) | 0.73 (0.67-0.78) | 0.58 (0.47-0.70) | 0.73 (0.59-0.82) | 0.11 (0.07-0.15) |
| External validation (FHS) | | | | | | |
| DSH-RS | 60 | 0.76 | 0.81 | 0.72 | 0.76 | 0.05 |
| | 65 | 0.75 | 0.80 | 0.68 | 0.76 | 0.07 |
| | 75 | 0.76 | 0.77 | 0.63 | 0.76 | 0.11 |
| | 80 | 0.74 | 0.77 | 0.65 | 0.74 | 0.11 |
| COM | 60 | 0.43 | 0.75 | 0.87 | 0.43 | 0.03 |
| | 65 | 0.41 | 0.74 | 0.88 | 0.39 | 0.04 |
| | 75 | 0.36 | 0.73 | 0.90 | 0.33 | 0.06 |
| | 80 | 0.38 | 0.72 | 0.88 | 0.36 | 0.06 |
| COM + LB/VS | 60 | 0.42 | 0.82 | 0.94 | 0.41 | 0.03 |
| | 65 | 0.46 | 0.81 | 0.92 | 0.45 | 0.04 |
| | 75 | 0.41 | 0.81 | 0.95 | 0.38 | 0.07 |
| | 80 | 0.35 | 0.80 | 0.96 | 0.33 | 0.06 |
Abbreviations: ACC, accuracy; AUC, area under the receiver operating characteristic curve; COM, comorbidities with medications; COM + LB/VS, COM with laboratory results and vital signs; DSH-RS, disease severity hierarchy with risk scores only; FHS, Fairview Health Services; PPV, positive predictive value; REP, Rochester Epidemiology Project.
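The measures reported in the table above follow the standard confusion-matrix definitions. The sketch below is illustrative only: the function names are hypothetical and the counts are invented, not taken from the study. It also shows how a model can combine high sensitivity and specificity with a low PPV when the outcome is rare.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard binary classification measures from confusion-matrix counts.

    Assumes every denominator is nonzero (no guard against empty classes).
    """
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),   # true-positive rate
        "specificity": tn / (tn + fp),   # true-negative rate
        "ppv": tp / (tp + fp),           # positive predictive value
    }

def ppv_from_rates(sensitivity, specificity, prevalence):
    """PPV implied by sensitivity, specificity, and outcome prevalence (Bayes rule)."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)

# Hypothetical cohort of 10 000 people in which 1% experience the outcome.
metrics = classification_metrics(tp=98, fp=900, tn=9000, fn=2)
print({name: round(value, 3) for name, value in metrics.items()})
print(round(ppv_from_rates(metrics["sensitivity"], metrics["specificity"], 0.01), 3))
```

In this invented cohort, sensitivity is 0.98 and specificity about 0.91, yet fewer than 1 in 10 positive predictions is a true event, because false positives from the large event-free group outnumber the true positives.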
Figure 3. Internal Validation Performance Plots for Cox Proportional Hazard Models Predicting ACM at Age 75 Years
B, The Gini score was computed by dividing the area between the gain curve and the random classifier (indicated by the dotted diagonal line) by the area between the perfect classifier (indicated by the purple curve) and the random classifier. D, The dotted diagonal line represents the line of perfect calibration. Systematic deviation (below or above) from the diagonal line indicates that the model might not reliably estimate event rates, leading to overestimation or underestimation. COM indicates model with only comorbidities and medication; COM + LB/VS, COM with laboratory results and vital signs; DSH-RS, disease severity hierarchy–risk score; ROC, receiver operating characteristic curve.
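The Gini score described for panel B can be sketched as a ratio of areas under cumulative gain curves. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, areas are computed with the trapezoid rule, tied scores are left in input order, and the labels must contain at least one event and one nonevent.

```python
def gain_curve_area(labels_in_rank_order):
    """Area under the cumulative gain curve for 0/1 labels ordered by rank.

    x-axis: fraction of the population screened; y-axis: fraction of events captured.
    """
    n = len(labels_in_rank_order)
    n_pos = sum(labels_in_rank_order)
    area, captured = 0.0, 0
    for label in labels_in_rank_order:
        previous = captured / n_pos
        captured += label
        area += (previous + captured / n_pos) / 2 / n  # trapezoid of width 1/n
    return area

def gini_score(y_true, y_score):
    """(model area - random area) / (perfect area - random area)."""
    pairs = sorted(zip(y_score, y_true), key=lambda p: p[0], reverse=True)
    ranked = [label for _, label in pairs]
    model_area = gain_curve_area(ranked)
    perfect_area = gain_curve_area(sorted(y_true, reverse=True))
    random_area = 0.5  # area under the diagonal
    return (model_area - random_area) / (perfect_area - random_area)

# Toy usage with made-up labels and scores.
print(gini_score([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.6]))
```

A perfect ranking yields a score of 1, a ranking no better than chance yields a score near 0, and a fully inverted ranking yields -1.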