Jingjing Ren1,2,3,4,5, Dongwei Liu1,2,3,4, Guangpu Li1,2,3,4,5, Jiayu Duan1,2,3,4,5, Jiancheng Dong5, Zhangsuo Liu1,2,3,4.
Abstract
Background: Patients with diabetic kidney disease (DKD) face an extremely high risk of cardiovascular disease (CVD), which is a major cause of death in this population. We aimed to build a deep learning model to predict CVD risk among DKD patients and to stratify them by risk, which could support early intervention and improve personal health management.
Keywords: cardiovascular disease; diabetic kidney disease; machine learning; prediction model; risk stratification
Year: 2022 PMID: 35811691 PMCID: PMC9263287 DOI: 10.3389/fcvm.2022.923549
Source DB: PubMed Journal: Front Cardiovasc Med ISSN: 2297-055X
Figure 1. The working process of this study. The study procedure consisted of variable selection, model building, and model evaluation.
Baseline characteristics in the training set and validation set.
| Characteristic | Overall | Training set | Validation set | P-value |
|---|---|---|---|---|
| Smoking status | | | | 0.474 |
| Never (%) | 707 (79.4) | 493 (79.1) | 214 (80.2) | |
| Previous (%) | 64 (7.2) | 42 (6.7) | 22 (8.2) | |
| Current (%) | 119 (13.4) | 88 (14.1) | 31 (11.6) | |
| Age (years) | 52 (45–60) | 52 (45–60) | 52 (45–60) | 0.543 |
| Systolic blood pressure (mmHg) | 135 (126–148) | 135 (126–148) | 135 (125–149) | 0.746 |
| Total cholesterol (mmol/L) | 4.5 (3.7–5.4) | 4.6 (3.7–5.4) | 4.5 (3.7–5.3) | 0.615 |
| Hemoglobin (g/L) | 111 (94–131) | 111 (95–130) | 111 (92–132) | 0.652 |
| High density lipoprotein (mmol/L) | 1.1 (0.9–1.4) | 1.1 (0.8–1.4) | 1.1 (0.9–1.4) | 0.390 |
| 24 h urinary protein (g) | 2.7 (0.4–6.3) | 2.7 (0.4–6.3) | 2.8 (0.4–6.1) | 0.296 |
Continuous variables are shown as mean (SD) or median (interquartile range) according to the distribution, and categorical variables are shown as frequency (percentage).
Figure 2. Cumulative incidence curve of cardiovascular disease in the training set and validation set. Cardiovascular disease is the composite of coronary heart disease, cerebrovascular disease, congestive heart failure, and peripheral arterial disease. There was no statistically significant difference between the survival of the two sets using the log-rank test (p = 0.21).
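The log-rank comparison mentioned in this caption can be illustrated with a minimal sketch. It assumes a standard two-group log-rank test with a chi-square statistic on one degree of freedom; the function name `logrank_test` and its arguments are hypothetical and are not taken from the study's code.

```python
import math

def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square statistic, p-value).

    Illustrative sketch only. `events_*` hold 1 for an observed event
    and 0 for a censored observation.
    """
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e})

    observed_a = expected_a = variance = 0.0
    for t in event_times:
        n = sum(1 for u, _, _ in data if u >= t)                 # at risk overall
        n_a = sum(1 for u, _, g in data if u >= t and g == 0)    # at risk in group A
        d = sum(1 for u, e, _ in data if u == t and e)           # events at time t
        d_a = sum(1 for u, e, g in data if u == t and e and g == 0)
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            # Hypergeometric variance of the group A event count at time t.
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)

    if variance == 0:
        raise ValueError("the test is undefined for these data")
    chi2 = (observed_a - expected_a) ** 2 / variance
    # Survival function of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value
```

A large p-value, such as the 0.21 reported here, gives no evidence that the two sets differ in their time-to-event distribution.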
Demographic and clinical characteristics of patients with or without CVD in the dataset.
| Characteristic | Overall | With CVD | Without CVD | P-value |
|---|---|---|---|---|
| Smoking status | | | | <0.001 |
| Never (%) | 707 (79.4) | 191 (67.3) | 516 (31.5) | |
| Previous (%) | 64 (7.2) | 25 (8.8) | 39 (4.1) | |
| Current (%) | 119 (13.4) | 68 (23.9) | 51 (11.2) | |
| Age (years) | 52 (45–60) | 56.4 ± 11.7 | 51 (43–57) | <0.001 |
| Systolic blood pressure (mmHg) | 135 (126–148) | 142 (131.3–160) | 133 (124.8–142) | <0.001 |
| Total cholesterol (mmol/L) | 4.5 (3.7–5.4) | 4.9 (4.1–5.7) | 4.3 (3.6–5.2) | <0.001 |
| Hemoglobin (g/L) | 111 (94–131) | 107 (91–128) | 113 (95.3–133) | 0.007 |
| High density lipoprotein (mmol/L) | 1.1 (0.9–1.4) | 1 (0.8–1.3) | 1.1 (0.9–1.4) | 0.002 |
| 24 h urinary protein (g) | 2.7 (0.4–6.3) | 3.2 (0.7–7.1) | 2.5 (0.4–6) | 0.042 |
Continuous variables are shown as mean (SD) or median (interquartile range) according to the distribution, and categorical variables are shown as frequency (percentage).
Performance of different models.
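| Model | C-index, training set (95% CI) | C-index, validation set (95% CI) | AUC, training set (95% CI) | AUC, validation set (95% CI) | IBS, training set | IBS, validation set |
|---|---|---|---|---|---|---|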
| DeepSurv | 0.796 (0.761–0.831) | 0.767 (0.717–0.817) | 0.781 (0.740–0.822) | 0.780 (0.721–0.839) | 0.046 | 0.067 |
| CPH | 0.755 (0.719–0.790) | 0.745 (0.691–0.800) | 0.737 (0.698–0.776) | 0.724 (0.659–0.789) | 0.177 | 0.194 |
| RSF | 0.721 (0.681–0.761) | 0.753 (0.700–0.806) | 0.723 (0.680–0.766) | 0.765 (0.704–0.826) | 0.064 | 0.148 |
Performance of the DeepSurv, RSF, and CPH models in terms of C-index, AUC, and IBS. CPH, Cox proportional hazards regression; DeepSurv, deep learning-based survival model; RSF, random survival forest model; IBS, integrated Brier score; AUC, area under the receiver operating characteristic curve.
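To make the C-index concrete, the following is a minimal sketch of Harrell's concordance index for right-censored data. It is an illustration under stated assumptions, not the study's implementation: the function name `harrell_c_index` is hypothetical, a higher risk score is assumed to mean an earlier expected event, and pairs with tied times are ignored.

```python
def harrell_c_index(times, events, risk_scores):
    """Fraction of comparable pairs in which the higher-risk patient fails first.

    Illustrative sketch only. `events` holds 1 for an observed event and
    0 for a censored observation.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a pair is anchored on a patient with an observed event
        for j in range(n):
            if times[i] < times[j]:  # patient i failed while j was still event-free
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied scores count as half concordant
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the values in the table above are reported.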
Figure 3. Receiver operating characteristic (ROC) curves for the training set and validation set.
Figure 4. Variable importance. The importance score of the selected variables is calculated by their weights in the DeepSurv model. SBP, systolic blood pressure; TC, total cholesterol; Hb, hemoglobin; HDL, high density lipoprotein.
Subgroup analysis of the performance of different cardiovascular diseases.
| Subgroup | | |
|---|---|---|
| Coronary heart disease | 0.812 (0.768–0.856) | 0.771 (0.722–0.820) |
| Cerebrovascular disease | 0.844 (0.802–0.885) | 0.806 (0.759–0.853) |
| Congestive heart failure | 0.874 (0.822–0.926) | 0.831 (0.770–0.892) |
| Peripheral artery disease | 0.861 (0.789–0.934) | 0.726 (0.646–0.806) |
CI, confidence intervals.
Figure 5. Cumulative incidence curves for predicted cardiovascular disease among diabetic kidney disease patients in different risk groups. Cardiovascular disease is the composite of coronary heart disease, cerebrovascular disease, congestive heart failure, and peripheral arterial disease. Patients were stratified into a high-risk group and a low-risk group based on the cut-off value of the ROC curve. The P-values between the high-risk and low-risk subgroups were calculated by the log-rank test.
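The caption does not state how the ROC cut-off was chosen. One common choice, assumed here purely for illustration, is the threshold that maximizes Youden's J (sensitivity + specificity - 1). The function names `youden_cutoff` and `stratify` below are hypothetical.

```python
def youden_cutoff(scores, labels):
    """Return (threshold, J) for the threshold that maximizes Youden's J.

    Illustrative sketch only; the source does not say Youden's J was used.
    `labels` holds 1 for patients who developed the outcome and 0 otherwise.
    """
    positives = sum(1 for y in labels if y)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both outcome classes are required")

    best_j, best_threshold = -1.0, None
    for threshold in sorted(set(scores)):
        # A patient is predicted positive when the score is at or above the threshold.
        tp = sum(1 for s, y in zip(scores, labels) if y and s >= threshold)
        tn = sum(1 for s, y in zip(scores, labels) if not y and s < threshold)
        j = tp / positives + tn / negatives - 1.0
        if j > best_j:
            best_j, best_threshold = j, threshold
    return best_threshold, best_j


def stratify(scores, threshold):
    """Assign each patient to the high-risk or low-risk group."""
    return ["high" if s >= threshold else "low" for s in scores]
```

The two resulting groups can then be compared with cumulative incidence curves and a log-rank test, as the caption describes.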
Figure 6. The interface of the online calculation tool. This online calculation tool is used to predict the cardiovascular disease risk among diabetic kidney disease patients.