Hao-Yun Kao, Chi-Chang Chang, Chin-Fang Chang, Ying-Chen Chen, Chalong Cheewakriangkrai, Ya-Ling Tu.
Abstract
Gender is an important risk factor in predicting chronic kidney disease (CKD); however, it is under-researched. The purpose of this study was to examine whether gender differences affect the risk factors used for early CKD prediction. The study used data from 19,270 adult health screenings, including 5101 with CKD. Eleven independent variables were selected as candidate risk factors and tested for significance with the Chi-square test, and seven machine learning techniques were used to train the predictive models. Performance indicators included classification accuracy, sensitivity, specificity, and precision. Class imbalance was addressed using three extraction (sampling) methods: manual sampling, the synthetic minority oversampling technique (SMOTE), and SpreadSubsample. The Chi-square test revealed statistically significant results (p < 0.001) for gender, age, red blood cell count in urine, urine protein (PRO) content, and the PRO-to-urinary creatinine ratio. In terms of classifier performance, logistic regression with manual extraction exhibited the highest average prediction accuracy (0.8053) for men, whereas linear discriminant analysis with manual extraction demonstrated the highest average prediction accuracy (0.8485) for women. The clinical features indicated that the PRO-to-urinary creatinine ratio (normal or abnormal), age, and urine red blood cell count are the most important risk factors for predicting CKD in both genders. This study therefore proposes a prediction model with acceptable accuracy that supports doctors in diagnosis and treatment and serves the goal of early detection. Grounded in evidence-based medicine, the model was developed using machine learning methods and supports the prediction of early clinical CKD risk to improve the efficacy and quality of clinical decision making.
Keywords: chronic kidney disease; gender differences; machine learning; risk factors
Year: 2022 PMID: 35162242 PMCID: PMC8835286 DOI: 10.3390/ijerph19031219
Source DB: PubMed Journal: Int J Environ Res Public Health ISSN: 1660-4601 Impact factor: 3.390
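The four performance indicators named in the abstract (accuracy, sensitivity, specificity, precision) all follow from a binary confusion matrix. The sketch below shows the standard definitions, assuming CKD is treated as the positive class; the function name and the example counts are hypothetical and do not come from the study.

```python
def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Standard indicators from confusion-matrix counts (CKD assumed positive)."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),   # all correct / all cases
        "sensitivity": tp / (tp + fn),                 # true positive rate
        "specificity": tn / (tn + fp),                 # true negative rate
        "precision": tp / (tp + fp),                   # positive predictive value
    }


# Made-up counts for illustration only; these are not results from the study.
example = classification_metrics(tp=80, fp=20, tn=70, fn=30)
```

With an imbalanced sample such as the one described here, accuracy alone can look good while sensitivity stays low, which is one reason to report all four indicators together.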
Five stages of chronic kidney disease.
| Stages | Description | GFR Value |
|---|---|---|
| 1 | CKD with normal or high GFR | ≥90 mL/min/1.73 m² |
| 2 | Mild CKD | 60–89.9 mL/min/1.73 m² |
| 3 | Moderate CKD | 30–59.9 mL/min/1.73 m² |
| 3a | | 45–59.9 mL/min/1.73 m² |
| 3b | | 30–44.9 mL/min/1.73 m² |
| 4 | Severe CKD | 15–29.9 mL/min/1.73 m² |
| 5 | End stage CKD | <15 mL/min/1.73 m² |
GFR: glomerular filtration rate; 3a: stage 3a of kidney disease; 3b: stage 3b of kidney disease.
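The staging bands above can be expressed as a small lookup. This is a minimal sketch, assuming the bands are contiguous (a value such as 89.95 falls in stage 2) and that stage 3 is always reported as 3a or 3b; the function name `ckd_stage` is hypothetical.

```python
def ckd_stage(gfr: float) -> str:
    """Map a GFR value (mL/min/1.73 m^2) to the stage band in the table.

    Note: a GFR >= 90 only corresponds to stage 1 when CKD is otherwise
    established; GFR alone does not diagnose CKD at that level.
    """
    if gfr < 0:
        raise ValueError("GFR cannot be negative")
    if gfr >= 90:
        return "1"
    if gfr >= 60:
        return "2"
    if gfr >= 45:
        return "3a"
    if gfr >= 30:
        return "3b"
    if gfr >= 15:
        return "4"
    return "5"
```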
Data sources and variable codes.
| Variables | Name | Normal Range |
|---|---|---|
| X1 | Gender | 1 male/2 female |
| X2 | Age | Continuous |
| X3 | RBC | 0–5 |
| X4 | GLU | 70–100 |
| X5 | TG | 50–150 |
| X6 | T-CHO | 50–200 |
| X7 | HDL | >40 |
| X8 | LDL | <130 |
| X9 | ALB | 3.5–5.0 |
| X10 | PRO | Random < 12 mg/dL |
| X11 | UPCR | <150 |
| Y | eGFR | 1: <90 mL/min/1.73 m²; 2: ≥90 mL/min/1.73 m² |
RBC: red blood cell; GLU: glucose; TG: triglycerides; T-CHO: total cholesterol; HDL: high-density cholesterol; LDL: low-density cholesterol; ALB: albumin; PRO: proteinuria; UPCR: urine protein-to-creatinine ratio; eGFR: estimated glomerular filtration rate.
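The normal ranges in the table suggest how each laboratory variable can be reduced to a normal/abnormal flag, which is the form used in the descriptive analysis. The sketch below is one way to do that, assuming range endpoints are inclusive and that values are already in the units the table implies (the table states units only for PRO); the names `NORMAL_RANGE`, `flag_abnormal`, and `outcome_code` are hypothetical.

```python
# Normal-range predicates taken from the variable table (endpoints assumed inclusive).
NORMAL_RANGE = {
    "RBC": lambda v: 0 <= v <= 5,
    "GLU": lambda v: 70 <= v <= 100,
    "TG": lambda v: 50 <= v <= 150,
    "T-CHO": lambda v: 50 <= v <= 200,
    "HDL": lambda v: v > 40,
    "LDL": lambda v: v < 130,
    "ALB": lambda v: 3.5 <= v <= 5.0,
    "PRO": lambda v: v < 12,
    "UPCR": lambda v: v < 150,
}


def flag_abnormal(record: dict) -> dict:
    """Return {variable: True if abnormal} for every known variable in the record."""
    return {
        name: not is_normal(record[name])
        for name, is_normal in NORMAL_RANGE.items()
        if name in record
    }


def outcome_code(egfr: float) -> int:
    """Code the outcome Y as in the table: 1 if eGFR < 90, otherwise 2."""
    return 1 if egfr < 90 else 2
```

For example, `flag_abnormal({"GLU": 126, "HDL": 35})` marks both values as abnormal, while variables missing from the record are simply skipped.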
Descriptive analysis of variables.
| Items | Healthy | CKD | p-value | χ² |
|---|---|---|---|---|
| Total | 14,169 (73.5%) | 5101 (26.5%) | | |
| Gender | | | | |
| Male | 5608 (39.6%) | 2465 (48.3%) | <0.001 ** | 117.817 |
| Female | 8561 (60.4%) | 2636 (51.7%) | | |
| Age | | | | |
| Mean (±SD) | 63.37 ± 11.56 | 69.19 ± 10.74 | <0.001 * | 699.271 |
| RBC | | | | |
| Normal | 11,460 (80.9%) | 3917 (76.8%) | <0.001 ** | 38.956 |
| Abnormal | 2709 (19.1%) | 1184 (23.2%) | | |
| GLU | | | | |
| Normal | 2667 (18.8%) | 1055 (20.7%) | 0.004 ** | 8.321 |
| Abnormal | 11,502 (81.2%) | 4046 (79.3%) | | |
| TG | | | | |
| Normal | 5878 (41.5%) | 2012 (39.4%) | 0.011 * | 6.466 |
| Abnormal | 8291 (58.5%) | 3089 (60.6%) | | |
| T-CHO | | | | |
| Normal | 9198 (64.9%) | 3284 (64.4%) | 0.491 | 0.474 |
| Abnormal | 4971 (35.1%) | 1817 (35.6%) | | |
| HDL | | | | |
| Normal | 11,954 (84.4%) | 4369 (85.6%) | 0.029 * | 4.763 |
| Abnormal | 2215 (15.6%) | 732 (14.4%) | | |
| LDL | | | | |
| Normal | 11,400 (80.5%) | 4095 (80.3%) | 0.782 | 0.076 |
| Abnormal | 2769 (19.5%) | 1006 (19.7%) | | |
| ALB | | | | |
| Normal | 14,162 (100.0%) | 5097 (99.9%) | 0.457 | 0.553 |
| Abnormal | 7 (0.0%) | 4 (0.1%) | | |
| PRO | | | | |
| Normal | 9203 (65.0%) | 915 (17.9%) | <0.001 * | 3324.451 |
| Abnormal | 4966 (35.0%) | 4186 (82.1%) | | |
| UPCR | | | | |
| Normal | 12,364 (87.3%) | 1639 (32.1%) | <0.001 * | 5739.411 |
| Abnormal | 1805 (12.7%) | 3462 (67.9%) | | |
** p-value < 0.01; * p-value < 0.05. RBC: red blood cell; GLU: glucose; TG: triglycerides; T-CHO: total cholesterol; HDL: high-density cholesterol; LDL: low-density cholesterol; ALB: albumin; PRO: proteinuria; UPCR: urine protein-to-creatinine ratio.
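The χ² statistics in the table can be checked directly from the cell counts. The sketch below applies the standard Pearson formula for a 2×2 table, without continuity correction, to the gender rows; the function name is hypothetical, and whether the authors used exactly this variant is an assumption.

```python
def pearson_chi2_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-square for the 2x2 table [[a, b], [c, d]], no continuity correction."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


# Gender rows from the table: (male healthy, male CKD, female healthy, female CKD).
chi2_gender = pearson_chi2_2x2(5608, 2465, 8561, 2636)
```

For the gender counts this comes out close to the 117.817 reported in the table, which suggests the table values are uncorrected Pearson statistics.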
Figure 1. The receiver operating characteristic (ROC) curves for comparing extraction methods for males. (a) Manual extraction. (b) SMOTE extraction. (c) SpreadSubsample extraction.
Figure 2. The receiver operating characteristic (ROC) curves for comparing extraction methods for females. (a) Manual extraction. (b) SMOTE extraction. (c) SpreadSubsample extraction.
Figure 3. Decision tree for predicting variables by gender. RBC: red blood cell; GLU: glucose; TG: triglycerides; T-CHO: total cholesterol; HDL: high-density cholesterol; LDL: low-density cholesterol; ALB: albumin; PRO: proteinuria; UPCR: urine protein-to-creatinine ratio; eGFR: estimated glomerular filtration rate.
Classification and regression tree (CART) decision rules for predicting variables by gender.
| Rule No. | The Composition of Risk Factors | No. | Status | Accuracy |
|---|---|---|---|---|
| 1 | Gender (Female) + UPCR (<150) + PRO (<12) | 1799 | Non-CKD | 77.5% |
| 2 | Gender (Female) + UPCR (<150) + PRO (≥12) + Age (<65) + T-CHO (50–200) + LDL (<130) | 21 | Non-CKD | 71.4% |
| 3 | Gender (Female) + UPCR (<150) + PRO (≥12) + Age (<65) + T-CHO (50–200) + LDL (≥130) | 12 | CKD | 66.7% |
| 4 | Gender (Female) + UPCR (<150) + PRO (≥12) + Age (<65) + T-CHO (<50 or >200) | 74 | CKD | 60.8% |
| 5 | Gender (Female) + UPCR (<150) + PRO (≥12) + Age (≥65) | 85 | CKD | 78.8% |
| 6 | Gender (Female) + UPCR (≥150) | 4335 | CKD | 84.9% |
| 7 | Gender (Male) + UPCR (<150) + Age (<65) | 1038 | Non-CKD | 82.3% |
| 8 | Gender (Male) + UPCR (<150) + Age (≥65) + RBC (0–5) | 218 | Non-CKD | 72% |
| 9 | Gender (Male) + UPCR (<150) + Age (≥65) + RBC (<0 or >5) + TG (50–150) + PRO (<12) | 384 | Non-CKD | 55.2% |
| 10 | Gender (Male) + UPCR (<150) + Age (≥65) + RBC (<0 or >5) + TG (50–150) + PRO (≥12) | 30 | CKD | 70% |
| 11 | Gender (Male) + UPCR (<150) + Age (≥65) + RBC (<0 or >5) + TG (<50 or >150) | 149 | CKD | 59.7% |
| 12 | Gender (Male) + UPCR (≥150) | 4097 | CKD | 83.8% |
RBC: red blood cell; TG: triglycerides; T-CHO: total cholesterol; LDL: low-density cholesterol; PRO: proteinuria; UPCR: urine protein-to-creatinine ratio.
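The twelve rules above form two small decision paths, one per gender, and can be written as nested conditions. This is a sketch of the rule table only, not the trained CART model: the function name and signature are hypothetical, range endpoints are assumed inclusive, and the abnormal TG branch of rule 11 is taken to be the complement of the 50–150 range used in rules 9 and 10.

```python
from typing import Tuple


def cart_rule(gender: str, upcr: float, pro: float, age: float,
              rbc: float, tg: float, tcho: float, ldl: float) -> Tuple[int, str]:
    """Return (rule number, predicted status) following the rule table."""
    if gender == "female":
        if upcr >= 150:
            return 6, "CKD"
        if pro < 12:
            return 1, "Non-CKD"
        if age >= 65:
            return 5, "CKD"
        if not 50 <= tcho <= 200:      # abnormal total cholesterol
            return 4, "CKD"
        return (2, "Non-CKD") if ldl < 130 else (3, "CKD")
    if gender == "male":
        if upcr >= 150:
            return 12, "CKD"
        if age < 65:
            return 7, "Non-CKD"
        if 0 <= rbc <= 5:              # normal urine RBC
            return 8, "Non-CKD"
        if not 50 <= tg <= 150:        # abnormal triglycerides (assumed complement)
            return 11, "CKD"
        return (9, "Non-CKD") if pro < 12 else (10, "CKD")
    raise ValueError("gender must be 'female' or 'male'")
```

In both paths UPCR is the first split, and an abnormal UPCR (≥150) alone leads to a CKD prediction, which matches the two largest rules in the table (rules 6 and 12).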