Lili Chan, Girish N Nadkarni, Fergus Fleming, James R McCullough, Patricia Connolly, Gohar Mosoyan, Fadi El Salem, Michael W Kattan, Joseph A Vassalotti, Barbara Murphy, Michael J Donovan, Steven G Coca, Scott M Damrauer.
Abstract
AIM: Predicting progression in diabetic kidney disease (DKD) is critical to improving outcomes. We sought to develop and validate a machine-learned, prognostic risk score (KidneyIntelX™) combining electronic health records (EHR) and biomarkers.
Keywords: Biomarkers; Diabetic kidney disease; Electronic data; Machine learning; Prediction
Year: 2021 PMID: 33797560 PMCID: PMC8187208 DOI: 10.1007/s00125-021-05444-0
Source DB: PubMed Journal: Diabetologia ISSN: 0012-186X Impact factor: 10.122
Clinical characteristics of the participants in the derivation and validation cohorts
| Characteristic | Study population | Derivation population | Validation population |
|---|---|---|---|
| Clinical characteristics | | | |
| Age in years, median [Q1–Q3] | 63 [55–69] | 63 [55–68] | 63 [56–69] |
| Female, n (%) | 581 (50.7) | 352 (51.3) | 229 (49.8) |
| Race, n (%) | | | |
| White | 373 (32.6) | 231 (33.7) | 142 (30.9) |
| African American | 386 (33.7) | 226 (32.9) | 160 (34.8) |
| Other | 387 (33.8) | 229 (33.4) | 158 (34) |
| BMI, median [Q1–Q3] | 31 [29–35] | 31 [29–35] | 31 [29–36] |
| Hypertension, n (%) | 1043 (91.0) | 622 (90.7) | 421 (91.5) |
| CAD, n (%) | 406 (35.4) | 234 (34.1) | 172 (37.4) |
| Heart failure, n (%) | 378 (33) | 213 (31.1) | 165 (35.9) |
| Systolic BP (mmHg) | 130 [120–144] | 130 [119–144] | 130 [120–144] |
| Diastolic BP (mmHg) | 74 [67–81] | 74 [66–81] | 73 [67–80] |
| Follow-up (months), median [Q1–Q3] | 51.9 [36.5–58.1] | 51.3 [36.8–58.1] | 52.8 [35.9–58.1] |
| Laboratory characteristics | | | |
| eGFR (ml min⁻¹ [1.73 m]⁻²) | | | |
| Baseline, median [Q1–Q3] | 54.3 [45.3–67.3] | 54.4 [44.4–68.4] | 54.1 [45.7–66.1] |
| 30–44.9, n (%) | 279 (24.4) | 176 (25.7) | 103 (22.4) |
| 45–59.9, n (%) | 490 (42.8) | 275 (40.1) | 215 (46.7) |
| 60–89.9, n (%) | 263 (22.9) | 170 (24.8) | 93 (20.2) |
| ≥ 90, n (%) | 114 (9.9) | 65 (9.5) | 49 (10.7) |
| uACR (mg/mmol) | | | |
| Baseline, median [Q1–Q3] | 6.9 [1.8–27.2] | 7.4 [2–26.9] | 6.1 [1.7–27.8] |
| Missing, n (%) | 433 (37.8) | 269 (39.2) | 164 (35.7) |
| Baseline HbA1c, median [Q1–Q3] | | | |
| mmol/mol | 51.9 [44.3–66.1] | 53 [44.3–66.3] | 51.9 [44.3–66.1] |
| % | 6.9 [6.2–8.2] | 7 [6.2–8.22] | 6.9 [6.2–8.2] |
| Medication | | | |
| ACEi/ARB, n (%) | 926 (80.8) | 560 (81.6) | 366 (79.6) |
| Plasma biomarkers (pg/ml), median [Q1–Q3] | | | |
| TNFR1 | 2807 [2192–3830] | 2807 [2191–3830] | 2924 [2217–3894] |
| TNFR2 | 11,090 [8031–14,984] | 11,090 [8031–14,984] | 11,171 [8302–15,046] |
| KIM-1 | 124 [76–235] | 124 [76–235] | 138 [82–253] |
| Smoking status, n (%) | | | |
| Never | 354 (30.9) | 214 (31.2) | 140 (30.4) |
| Ever | 503 (43.9) | 298 (43.4) | 205 (44.6) |
| Missing | 289 (25.2) | 174 (25.4) | 115 (25) |
| Events | | | |
| eGFR slope ≥ 5 ml min⁻¹ [1.73 m]⁻² per year, n (%) | 171 (14.9) | 98 (14.3) | 73 (15.9) |
| Sustained 40% decline in eGFRᵃ, n (%) | 179 (15.6) | 103 (15) | 76 (16.5) |
| Kidney failureᵇ, n (%) | 52 (4.5) | 29 (4.2) | 23 (5) |
| Composite endpointᶜ, n (%) | 241 (21) | 137 (20) | 104 (22.6) |
ᵃ Defined as a decline in eGFR of ≥40% from baseline, confirmed at least 3 months later
ᵇ Defined as sustained eGFR <15 confirmed at least 30 days later, or receipt of long-term maintenance dialysis, or receipt of a kidney transplant
ᶜ Defined as progressive decline in kidney function, indicated by any of the following: eGFR slope ≥ 5 ml min⁻¹ [1.73 m]⁻² per year, sustained 40% decline in eGFR, or kidney failure
ACEi, ACE inhibitor; ARB, angiotensin receptor blocker; CAD, coronary artery disease
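The composite endpoint in the footnotes above is a logical OR of three component events. A minimal Python sketch of that rule follows; the class name, field names and the convention that the slope is recorded as a positive rate of loss are illustrative assumptions, not taken from the study's code.

```python
from dataclasses import dataclass


@dataclass
class KidneyFollowUp:
    """Hypothetical per-patient follow-up summary (field names are assumptions)."""
    egfr_decline_per_year: float   # eGFR lost per year; positive means decline
    sustained_40pct_decline: bool  # >=40% decline from baseline, confirmed later
    kidney_failure: bool           # sustained eGFR <15, dialysis or transplant


def composite_endpoint(p: KidneyFollowUp, slope_threshold: float = 5.0) -> bool:
    """Return True if any component of the composite kidney endpoint occurred."""
    rapid_decline = p.egfr_decline_per_year >= slope_threshold
    return rapid_decline or p.sustained_40pct_decline or p.kidney_failure


if __name__ == "__main__":
    stable = KidneyFollowUp(1.2, False, False)
    rapid = KidneyFollowUp(6.0, False, False)
    print(composite_endpoint(stable), composite_endpoint(rapid))
```

Because the components overlap, the composite count in the table (241) is smaller than the sum of the three component counts.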
Fig. 1 Shapley additive explanations (SHAP) plot showing relative feature importance. SHAP summary plots order features based on their importance. Each plot is made up of individual points from the training dataset, with higher values shown in darker purple and lower values in yellow. If the dots on one side of the middle line are predominantly purple or yellow, higher or lower values of that feature, respectively, move the prediction in that direction. For example, higher systolic BP is associated with higher risk of the composite kidney outcome. AST, aspartate aminotransferase
Fig. 2 Composite kidney endpoint event rates by (a) KidneyIntelX predicted risk in derivation set, (b) KidneyIntelX predicted risk in validation set and (c) KidneyIntelX score prediction distributions of patients with DKD according to the risk of composite kidney endpoint in the derivation and validation set. (a, b) Events are denoted with orange dots (progression) and represent the composite kidney endpoint within 5 years. Non-events are denoted with blue dots (no progression) and represent an absence of the composite kidney event in the follow-up period. (c) Dots represent cumulative incidence: blue, low risk 10% (6%, 14%); pink, intermediate risk 22% (16%, 28%); and red, high risk 61% (50%, 71%)
Test characteristics for KidneyIntelX and the comprehensive clinical model
| Predicted risk (KidneyIntelX) | KidneyIntelX risk score | Full derivation setᵃ: population | Full derivation set: sens | Full derivation set: spec | Full derivation set: NPV/PPV | Validation setᵇ: population | Validation set: sens | Validation set: spec | Validation set: NPV/PPV | Predicted risk (optimised clinical model) | Optimised clinical model: population | Optimised clinical model: sens | Optimised clinical model: spec | Optimised clinical model: NPV/PPV |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Low risk | | | | | | | | | | | | | | |
| 0.040 | ≤30 | Lowest 30% | 96% | 37% | 98% | Lowest 32% | 88% | 38% | 91% | 0.142 | Lowest 32% | 74% | 33% | 86% |
| 0.061 | ≤45 | Lowest 45% | 88% | 53% | 95% | Lowest 46% | 81% | 54% | 90% | 0.171 | Lowest 46% | 67% | 48% | 88% |
| 0.0712 | ≤50 | Lowest 50% | 85% | 59% | 94% | Lowest 48% | 77% | 58% | 90% | 0.175 | Lowest 48% | 67% | 51% | 89% |
| High risk | | | | | | | | | | | | | | |
| 0.241 | ≥85 | Top 20% | 56% | 89% | 56% | Top 21% | 50% | 88% | 55% | 0.288 | Top 21% | 41% | 82% | 31% |
| 0.302 | ≥90 | Top 15% | 46% | 93% | 63% | Top 17% | 45% | 93% | 61% | 0.319 | Top 17% | 37% | 88% | 37% |
| 0.401 | ≥95 | Top 10% | 32% | 96% | 67% | Top 12% | 31% | 96% | 70% | 0.361 | Top 12% | 28% | 91% | 38% |
ᵃ AUC in the derivation set: 0.85 (95% CI 0.84, 0.86) in training and 0.77 (95% CI 0.74, 0.79) from tenfold cross-validation testing
ᵇ AUC in the validation set: 0.77 (95% CI 0.76, 0.79)
NPV, negative predictive value (for low risk); PPV, positive predictive value (for high risk); Sens, sensitivity; Spec, specificity
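The four test characteristics are linked through event prevalence by Bayes' rule. The sketch below shows that relationship only; it is not the authors' computation, and the function name is an assumption. Feeding it the rounded percentages from the table will reproduce the reported NPV and PPV only approximately.

```python
def predictive_values(sens: float, spec: float, prevalence: float) -> tuple[float, float]:
    """Return (PPV, NPV) from sensitivity, specificity and event prevalence.

    All inputs are proportions in [0, 1]; prevalence must be strictly
    between 0 and 1 so that both denominators are well defined.
    """
    if not (0.0 <= sens <= 1.0 and 0.0 <= spec <= 1.0 and 0.0 < prevalence < 1.0):
        raise ValueError("sens and spec must lie in [0, 1], prevalence in (0, 1)")
    tp = sens * prevalence                # true-positive fraction of the cohort
    fn = (1.0 - sens) * prevalence        # false-negative fraction
    tn = spec * (1.0 - prevalence)        # true-negative fraction
    fp = (1.0 - spec) * (1.0 - prevalence)  # false-positive fraction
    if tp + fp == 0.0 or tn + fn == 0.0:
        raise ValueError("predictive value undefined: no positive or no negative calls")
    return tp / (tp + fp), tn / (tn + fn)


if __name__ == "__main__":
    # Validation set, low-risk cut-off (score <=45): sens 81%, spec 54%,
    # composite endpoint in 104 of 460 participants (counts from the tables).
    ppv, npv = predictive_values(0.81, 0.54, 104 / 460)
    print(f"NPV at the low-risk cut-off: {npv:.2f}")
```

With those inputs the NPV comes out near 0.91, in line with the 90% reported for the validation set; the small gap reflects rounding in the published sensitivity and specificity.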
Fig. 3 Kaplan–Meier curves by KidneyIntelX risk strata for the endpoint of sustained 40% decline in eGFR or kidney failure in derivation (a) and validation (b) sets. The risk cut-offs derived from the derivation set and applied to the validation set were: low risk 0–0.061129, intermediate risk 0.061129–0.30209 and high risk 0.30209–1. In the derivation set, 45% were low risk, 40% were intermediate risk and 15% were high risk. In the validation set, 46% were low risk, 37% were intermediate risk and 17% were high risk. The HR for high vs low risk was 18.3 (95% CI 10.1, 33.1) in derivation and 14.7 (95% CI 7.8, 27.6) in validation. The HR for high vs intermediate risk was 5.7 (95% CI 3.7, 8.7) in derivation and 6.0 (95% CI 3.5, 10.0) in validation. The HR for high vs low and intermediate risk combined was 9.2 (95% CI 6.2, 13.6) in derivation and 9.1 (95% CI 5.8, 14.4) in validation
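The risk cut-offs in the caption above translate into a simple three-way classification of a predicted probability. The sketch below is illustrative only: the caption's ranges share their boundary values, so treating the low cut-off as inclusive for low risk and the high cut-off as inclusive for high risk (mirroring the ≤ and ≥ signs in the test-characteristics table) is an assumption, as are the constant and function names.

```python
LOW_CUTOFF = 0.061129   # upper bound of the low-risk stratum (from the caption)
HIGH_CUTOFF = 0.30209   # lower bound of the high-risk stratum (from the caption)


def risk_stratum(predicted_risk: float) -> str:
    """Map a predicted probability to 'low', 'intermediate' or 'high'.

    Boundary handling is an assumption: values equal to LOW_CUTOFF count
    as low risk and values equal to HIGH_CUTOFF count as high risk.
    """
    if not 0.0 <= predicted_risk <= 1.0:
        raise ValueError("predicted_risk must be a probability in [0, 1]")
    if predicted_risk <= LOW_CUTOFF:
        return "low"
    if predicted_risk >= HIGH_CUTOFF:
        return "high"
    return "intermediate"


if __name__ == "__main__":
    for p in (0.04, 0.20, 0.40):
        print(p, risk_stratum(p))
```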