Javier Trujillano, Mariona Badia, Luis Serviá, Jaume March, Angel Rodriguez-Pozo.
Abstract
BACKGROUND: To develop three classification trees (CT) based on the CART (Classification and Regression Trees), CHAID (Chi-Square Automatic Interaction Detection) and C4.5 methodologies for calculating the probability of hospital mortality, and to compare the results with the APACHE II, SAPS II and MPM II-24 scores and with a model based on multiple logistic regression (LR).
Year: 2009 PMID: 20003229 PMCID: PMC2797013 DOI: 10.1186/1471-2288-9-83
Source DB: PubMed Journal: BMC Med Res Methodol ISSN: 1471-2288 Impact factor: 4.615
Characteristics of the classification tree methods
| | CART | CHAID | C4.5 |
|---|---|---|---|
| Description | Classification and Regression Tree | Chi-Square Automatic Interaction Detection | Concept Learning Systems |
| Developer | Breiman et al. (1984) | Kass (1980) | Quinlan (1993) |
| Primary users | Many disciplines with little data | Applied statisticians | Data miners |
| Splitting method | Gini reduction or twoing | Chi-square tests | Gain ratio |
| Branch limitations | Best binary split | Number of values of the input | Best binary split |
| Pruning | Cross-validation | Uses p-values | Misclassification rates |
| Software | WEKA | Answer-Tree (SPSS) | WEKA |
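The three splitting criteria named in the table can be compared on a single candidate split. The following is a minimal sketch, not the implementation used by WEKA or AnswerTree; the function names and the example counts are hypothetical and chosen only for illustration.

```python
import math

def gini(counts):
    """Gini impurity of a node, given its class counts."""
    n = sum(counts)
    return 1.0 - sum((c / n) ** 2 for c in counts)

def entropy(counts):
    """Shannon entropy (bits) of a node, given its class counts."""
    n = sum(counts)
    return -sum((c / n) * math.log2(c / n) for c in counts if c > 0)

def gini_reduction(parent, children):
    """CART-style criterion: parent impurity minus weighted child impurity."""
    n = sum(parent)
    weighted = sum(sum(ch) / n * gini(ch) for ch in children)
    return gini(parent) - weighted

def gain_ratio(parent, children):
    """C4.5-style criterion: information gain divided by split information."""
    n = sum(parent)
    info_gain = entropy(parent) - sum(sum(ch) / n * entropy(ch) for ch in children)
    split_info = entropy([sum(ch) for ch in children])
    return info_gain / split_info if split_info > 0 else 0.0

def chi_square(children):
    """CHAID-style criterion: Pearson chi-square of the child-by-class table."""
    row_tot = [sum(ch) for ch in children]
    col_tot = [sum(col) for col in zip(*children)]
    n = sum(row_tot)
    stat = 0.0
    for i, ch in enumerate(children):
        for j, observed in enumerate(ch):
            expected = row_tot[i] * col_tot[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat

# Hypothetical node with counts [survivors, non-survivors] and one binary split.
parent = [70, 30]
children = [[60, 10], [10, 20]]

print(gini_reduction(parent, children))
print(gain_ratio(parent, children))
print(chi_square(children))
```

All three functions score the same split; the methods differ in which score they maximise and, for CHAID, in converting the statistic to a p-value before deciding whether to split.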
Demographic characteristics of patients
| Variable | All patients | Development | Validation | p |
|---|---|---|---|---|
| Age (years) (a) | 55.0 (19) | 55.2 (19) | 54.6 (19) | 0.485 |
| Sex, male (%) | 66.9 | 66.8 | 67.3 | 0.786 |
| Elective (%) | 6.5 | 6.1 | 7.5 | 0.184 |
| Diagnostic category | | | | 0.414 |
| TBI (%) | 15.1 | 15.2 | 14.9 | |
| Trauma (%) | 15.2 | 15.2 | 15.3 | |
| Neurological (%) | 14.8 | 14.6 | 15.3 | |
| Respiratory (%) | 19.0 | 18.1 | 21.1 | |
| Surgery (%) | 18.7 | 19.4 | 17.0 | |
| O Medicine (%) | 17.2 | 17.6 | 16.3 | |
| MV (%) | 65.9 | 66.6 | 64.2 | 0.216 |
| Inotropic therapy (%) | 33.7 | 33.9 | 33.3 | 0.783 |
| Acute renal failure (%) | 19.9 | 19.8 | 20.3 | 0.773 |
| Infection (%) | 34.6 | 34.6 | 34.8 | 0.900 |
| Coagulopathy (%) | 12.2 | 12.1 | 12.6 | 0.724 |
| COI (%) | 16.0 | 16.3 | 15.4 | 0.582 |
| HR (a) | 107.8 (30) | 108.3 (31) | 106.5 (30) | 0.253 |
| Glasgow (a) | 12.9 (4) | 12.8 (4) | 13.0 (4) | 0.507 |
| (A-a)O2 gradient (a) | 244.1 (161) | 241.7 (160) | 249.5 (162) | 0.250 |
| APACHE II (b) | 18 (7-41) | 18 (6-37) | 16 (6-45) | 0.805 |
| SAPS II (b) | 15 (6-47) | 15 (5-35) | 14 (5-47) | 0.742 |
| MPM II-24 (b) | 17 (7-43) | 17 (6-37) | 15 (6-38) | 0.779 |
| LOS (days) (b) | 7 (3-16) | 7 (3-16) | 7 (3-15) | 0.972 |
| MORT (%) | 31.4 | 30.7 | 32.8 | 0.308 |
TBI: Traumatic brain injury; O Medicine: Other Medical; MV: Mechanical ventilation; A. renal failure: Acute renal failure; COI: Chronic organ insufficiency; HR: Heart rate; (A-a)O2 gradient: Alveolar-arterial oxygen gradient; LOS: Length of stay; MORT: Hospital mortality; (a): Mean (SD); (b): Median (Interquartile range).
p: Determined by χ2 test for percentages, t test for comparison of means or Mann-Whitney test for comparison of medians.
Outcome trend over the observation period
| Variable | All | 1997 | 1998 | 1999 | 2000 | 2001 | 2002 | 2003 | 2004 | 2005 | 2006 | p |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| n | 2684 | 176 | 191 | 201 | 223 | 279 | 297 | 303 | 319 | 337 | 358 | ----- |
| MORT (%) | 31.4 | 35.4 | 33.7 | 39.1 | 35.0 | 28.4 | 32.9 | 33.0 | 28.6 | 27.7 | 23.3 | 0.112 |
| DEV (%) | 70.0 | 70.8 | 61.7 | 69.0 | 70.9 | 72.4 | 68.9 | 75.4 | 66.8 | 70.4 | 72.6 | 0.146 |
| APACHE II (a) | 18 (7-41) | 21 (7-41) | 19 (8-41) | 19 (6-34) | 17 (6-36) | 14 (5-30) | 14 (6-30) | 15 (6-35) | 16 (7-38) | 17 (6-36) | 15 (7-34) | 0.361 |
| SAPS II (a) | 15 (6-47) | 15 (6-47) | 17 (5-41) | 12 (4-31) | 13 (3-35) | 13 (4-27) | 13 (4-31) | 15 (5-39) | 17 (6-37) | 17 (6-37) | 13 (5-31) | 0.415 |
| MPM II-24 (a) | 17 (7-43) | 17 (7-43) | 16 (6-35) | 14 (6-29) | 16 (6-35) | 14 (6-32) | 13 (6-34) | 18 (6-39) | 18 (6-40) | 17 (7-36) | 14 (6-34) | 0.389 |
MORT: Hospital mortality; DEV: Development set percentage; (a): Median (Interquartile range).
p: Determined by χ2 test for percentages or Kruskal-Wallis test for comparison of medians.
Univariate analyses of characteristics of patients at discharge, by survival status.
| Variable | Survivors | Non-survivors | p | SCORE |
|---|---|---|---|---|
| Age (years) | 51.2 (19) | 63.8 (16) | < 0.001 | 1,2,3 |
| HR (bpm) | 104.7 (29) | 115.0 (31) | < 0.001 | 1,2 |
| MAP (mmHg) | 82.8 (28) | 72.4 (32) | < 0.001 | 1,2 |
| Inotropic therapy (%) | 25.0 | 52.7 | < 0.001 | 3 |
| Glasgow | 13.4 (3) | 11.8 (5) | < 0.001 | 1,2,3 |
| Intracranial mass (%) | 3.0 | 6.3 | 0.001 | 3 |
| FiO2 | 0.49 (0.2) | 0.62 (0.2) | < 0.001 | 1,2 |
| (A-a)O2 gradient (mmHg) | 212.3 (143) | 304.5 (176) | < 0.001 | 1,2 |
| MV (%) | 53.6 | 79.9 | < 0.001 | 3 |
| HCO3 (mEq/L) | 23.4 (5) | 22.1 (6) | < 0.001 | 1,2 |
| pH | 7.36 (0.1) | 7.34 (0.1) | < 0.001 | 1,2 |
| Urine output (cc/24 h) | 2124 (1058) | 1778 (1398) | < 0.001 | 2 |
| Urea (mg/dL) | 50.8 (41) | 76.5 (53) | < 0.001 | 2 |
| Creatinine (mg/dL) | 1.34 (1.3) | 1.81 (1.4) | < 0.001 | 1,3 |
| Sodium (mEq/L) | 139.5 (5) | 140.5 (7) | 0.015 | 1,2 |
| Acute renal failure (%) | 14.1 | 31.9 | < 0.001 | 3 |
| Urine output < 150 cc/8 h (%) | 3.3 | 17.3 | < 0.001 | 3 |
| Temperature (°C) | 38.2 (13) | 38.3 (14) | 0.036 | 1,2 |
| Infection (%) | 28.8 | 46.9 | < 0.001 | 3 |
| Coagulopathy (%) | 9.7 | 17.1 | < 0.001 | 3 |
| COI (%) | 11.7 | 26.0 | < 0.001 | 1,2,3 |
| Elective (%) | 7.7 | 2.7 | < 0.001 | 1,2,3 |
| Trauma (%) | 36.6 | 23.2 | < 0.001 | |
| Surgery (%) | 30.2 | 44.8 | 0.001 | |
Development set.
HR: Heart rate; MAP: Mean arterial pressure; (A-a)O2 gradient: Alveolar-arterial oxygen gradient; MV: Mechanical ventilation; COI: Chronic organ insufficiency; MORT: Hospital mortality; Data presented as the mean (SD) or percentages.
SCORE: (1) APACHE II, (2) SAPS II and (3) MPM II 24.
p: Determined by χ2 test for percentages or Mann-Whitney test for comparison of medians.
Results of multiple logistic regression
| Variable | Coefficient | SD | p | OR | 95% CI |
|---|---|---|---|---|---|
| Age (years) | 0.041 | 0.004 | < 0.001 | 1.041 | 1.033 - 1.050 |
| HR (bpm) | 0.009 | 0.002 | < 0.001 | 1.009 | 1.005 - 1.013 |
| Inotropic therapy | 0.730 | 0.137 | < 0.001 | 2.074 | 1.585 - 2.714 |
| Glasgow | -0.180 | 0.019 | < 0.001 | 0.835 | 0.805 - 0.867 |
| MV | 0.502 | 0.145 | 0.001 | 1.655 | 1.245 - 2.201 |
| (A-a)O2 gradient | 0.002 | 0.001 | < 0.001 | 1.002 | 1.002 - 1.003 |
| Acute renal failure | 0.459 | 0.160 | 0.002 | 1.582 | 1.180 - 2.123 |
| COI | 1.026 | 0.156 | < 0.001 | 2.789 | 2.054 - 3.788 |
| Trauma | -0.357 | 0.160 | 0.026 | 0.700 | 0.511 - 0.957 |
| Intercept | -3.351 | | | | |
HR: Heart rate; MV: Mechanical ventilation; (A-a)O2 gradient: Alveolar-arterial oxygen gradient; COI: Chronic organ insufficiency.
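The coefficients in the logistic regression table translate into a predicted probability of hospital mortality through the logistic function, and each odds ratio is the exponential of its coefficient. The following is a minimal sketch using the rounded coefficients from the table. It assumes that binary predictors are coded 0/1 and that continuous predictors use the units shown in the univariate table; the dictionary keys and the example patient are hypothetical.

```python
import math

# Rounded coefficients copied from the logistic regression table.
COEF = {
    "age": 0.041,          # years
    "hr": 0.009,           # heart rate
    "inotropic": 0.730,    # 0/1 (assumed coding)
    "glasgow": -0.180,     # Glasgow Coma Score
    "mv": 0.502,           # mechanical ventilation, 0/1 (assumed coding)
    "aa_gradient": 0.002,  # (A-a)O2 gradient, mmHg
    "arf": 0.459,          # acute renal failure, 0/1 (assumed coding)
    "coi": 1.026,          # chronic organ insufficiency, 0/1 (assumed coding)
    "trauma": -0.357,      # 0/1 (assumed coding)
}
INTERCEPT = -3.351

def mortality_probability(patient):
    """Logistic model: p = 1 / (1 + exp(-(intercept + sum(coef * x))))."""
    logit = INTERCEPT + sum(COEF[name] * patient[name] for name in COEF)
    return 1.0 / (1.0 + math.exp(-logit))

def odds_ratio(name):
    """Odds ratio per unit increase of a predictor: exp(coefficient)."""
    return math.exp(COEF[name])

# Hypothetical patient, for illustration only.
patient = {
    "age": 60, "hr": 110, "inotropic": 1, "glasgow": 10, "mv": 1,
    "aa_gradient": 300, "arf": 0, "coi": 0, "trauma": 0,
}

print(round(mortality_probability(patient), 3))
print(round(odds_ratio("coi"), 3))
```

Because the table reports coefficients to three decimals, odds ratios recomputed this way can differ from the tabulated values in the last digit.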
Figure 1. Classification tree by CART algorithm. The gray squares denote terminal prognostic subgroups. INOT: Inotropic therapy; (A-a)O2 gradient: Alveolar-arterial oxygen gradient (mmHg); MV: Mechanical ventilation; COI: Chronic organ insufficiency.
Figure 2. Classification tree by CHAID algorithm. The gray squares denote terminal prognostic subgroups. INOT: Inotropic therapy; (A-a)O2 gradient: Alveolar-arterial oxygen gradient (mmHg); MV: Mechanical ventilation; COI: Chronic organ insufficiency.
Figure 3. Classification tree by C4.5 algorithm. The gray squares denote terminal prognostic subgroups. INOT: Inotropic therapy; (A-a)O2 gradient: Alveolar-arterial oxygen gradient (mmHg); MV: Mechanical ventilation; COI: Chronic organ insufficiency; MAP: Mean arterial pressure.
Performance of the classification models: development and validation sets
| Model | AUC (95% CI) | HL-C | Brier | PPV (95% CI) | PCC (95% CI) | SMR (95% CI) |
|---|---|---|---|---|---|---|
| DEVELOPMENT (n = 1880) | | | | | | |
| APACHE II | 0.81 (0.79 - 0.83) | 68.2 | 0.17 | 0.72 (0.66 - 0.78) | 0.75 (0.73 - 0.77) | 1.30 (1.23 - 1.37) |
| SAPS II | 0.82 (0.80 - 0.84) | 77.2 | 0.16 | 0.74 (0.68 - 0.79) | 0.74 (0.68 - 0.75) | 1.31 (1.24 - 1.38) |
| MPM II 24 | 0.81 (0.79 - 0.83) | 74.2 | 0.16 | 0.75 (0.70 - 0.80) | 0.77 (0.75 - 0.79) | 1.29 (1.22 - 1.36) |
| LR | 0.83 (0.81 - 0.85) | 16.8 | 0.16 | 0.75 (0.70 - 0.80) | 0.77 (0.76 - 0.79) | 1.00 (0.92 - 1.10) |
| CART | 0.78 (0.76 - 0.80) | ------ | 0.17 | 0.67 (0.61 - 0.72) | 0.75 (0.73 - 0.77) | 1.00 (0.94 - 1.06) |
| CHAID | 0.80 (0.78 - 0.82) | ------ | 0.16 | 0.68 (0.63 - 0.73) | 0.75 (0.73 - 0.77) | 1.00 (0.93 - 1.08) |
| C4.5 | 0.80 (0.78 - 0.82) | ------ | 0.16 | 0.69 (0.65 - 0.74) | 0.78 (0.76 - 0.80) | 1.00 (0.94 - 1.06) |
| VALIDATION (n = 804) | | | | | | |
| APACHE II | 0.77 (0.74 - 0.81) | 74.1 | 0.18 | 0.69 (0.60 - 0.70) | 0.73 (0.70 - 0.76) | 1.36 (1.26 - 1.47) |
| SAPS II | 0.79 (0.76 - 0.83) | 78.3 | 0.18 | 0.71 (0.63 - 0.78) | 0.74 (0.71 - 0.77) | 1.39 (1.28 - 1.49) |
| MPM II 24 | 0.79 (0.75 - 0.82) | 66.9 | 0.18 | 0.71 (0.63 - 0.78) | 0.74 (0.71 - 0.77) | 1.36 (1.25 - 1.46) |
| LR | 0.81 (0.78 - 0.84) | 41.5 | 0.17 | 0.73 (0.66 - 0.81) | 0.75 (0.73 - 0.78) | 1.22 (1.16 - 1.29) |
| CART | 0.75 (0.71 - 0.81) | ------ | 0.18 | 0.64 (0.57 - 0.72) | 0.72 (0.69 - 0.75) | 1.04 (0.95 - 1.31) |
| CHAID | 0.76 (0.72 - 0.79) | ------ | 0.18 | 0.64 (0.56 - 0.72) | 0.72 (0.69 - 0.75) | 1.06 (0.97 - 1.15) |
| C4.5 | 0.76 (0.73 - 0.80) | ------ | 0.18 | 0.70 (0.63 - 0.76) | 0.76 (0.73 - 0.79) | 1.08 (0.98 - 1.16) |
AUC: Area under ROC curve; CI: Confidence interval; HL-C: Hosmer-Lemeshow test C (eight degrees of freedom); Brier: Brier score; PPV: Positive predictive value (cutoff 0.5); PCC: Percentage correctly classified (cutoff 0.5); SMR: Standardized mortality ratio. The severity scores (APACHE II, SAPS II and MPM II 24) were not developed in the development phase and recalibration was not performed.
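Most of the performance measures defined in the footnote can be computed directly from the observed outcomes and the predicted probabilities. The following is a minimal sketch with standard textbook definitions, not the authors' code. The function names and toy data are hypothetical, a prediction counts as positive when the probability is at least the cutoff (an assumption), and PCC is returned as a fraction to match the table. The Hosmer-Lemeshow statistic and the confidence intervals are omitted.

```python
def auc(y, p):
    """Area under the ROC curve: chance that a random non-survivor has a
    higher predicted probability than a random survivor (ties count 1/2)."""
    pos = [pi for yi, pi in zip(y, p) if yi == 1]
    neg = [pi for yi, pi in zip(y, p) if yi == 0]
    wins = sum((a > b) + 0.5 * (a == b) for a in pos for b in neg)
    return wins / (len(pos) * len(neg))

def brier(y, p):
    """Brier score: mean squared difference between prediction and outcome."""
    return sum((pi - yi) ** 2 for yi, pi in zip(y, p)) / len(y)

def ppv(y, p, cutoff=0.5):
    """Positive predictive value: observed deaths among predicted deaths.
    Raises ZeroDivisionError if no prediction reaches the cutoff."""
    predicted_pos = [yi for yi, pi in zip(y, p) if pi >= cutoff]
    return sum(predicted_pos) / len(predicted_pos)

def pcc(y, p, cutoff=0.5):
    """Proportion of patients correctly classified at the cutoff."""
    correct = sum((pi >= cutoff) == (yi == 1) for yi, pi in zip(y, p))
    return correct / len(y)

def smr(y, p):
    """Standardized mortality ratio: observed deaths / expected deaths."""
    return sum(y) / sum(p)

# Toy data: 1 = died in hospital, 0 = survived (hypothetical values).
y = [1, 0, 1, 0, 0, 1, 0, 0]
p = [0.9, 0.2, 0.6, 0.4, 0.1, 0.3, 0.7, 0.2]

for name, fn in [("AUC", auc), ("Brier", brier), ("PPV", ppv),
                 ("PCC", pcc), ("SMR", smr)]:
    print(name, round(fn(y, p), 3))
```

An SMR above 1 means more deaths were observed than the model expected, which is the pattern the table shows for the severity scores that were not recalibrated.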
Figure 4. Calibration curves for the classification models. Validation set.
Correlation matrix of the probabilities (CTs and LR models)
| | DEVELOPMENT SET (n = 1880) | | | VALIDATION SET (n = 804) | | |
|---|---|---|---|---|---|---|
| | ------- | ------- | ------- | ------- | ------- | ------- |
| | 0.872 | ------- | ------- | 0.877 | ------- | ------- |
| | 0.803 | 0.821 | ------- | 0.788 | 0.810 | ------- |
| | 0.768 | 0.796 | 0.789 | 0.777 | 0.794 | 0.801 |
LR: Logistic Regression. Numbers are Spearman correlation coefficients.
All correlations have p < 0.001.
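A Spearman coefficient is the Pearson correlation of the ranks, with tied values sharing their average rank. Ties matter here because a classification tree assigns the same probability to every patient in a terminal node. The following is a minimal sketch with hypothetical function names and toy probabilities; it does not compute the p-value.

```python
def average_ranks(values):
    """1-based ranks; tied values all receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current group of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Toy probabilities from two hypothetical models for the same four patients.
tree_probs = [0.1, 0.4, 0.4, 0.8]   # two patients fall in the same leaf
lr_probs = [0.2, 0.3, 0.5, 0.9]

print(round(spearman(tree_probs, lr_probs), 3))
```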
Figure 5. Bland-Altman plot analysis. (a) CART vs Logistic Regression. (b) CART vs CHAID. (c) CART vs C4.5. The dotted lines are the limits of agreement (mean ± 2 SD). Validation set.
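The limits of agreement in a Bland-Altman plot come from the per-patient differences between two models' predicted probabilities. The following is a minimal sketch under the caption's definition (mean ± 2 SD of the differences). It assumes the sample standard deviation, which the caption does not specify, and the function name and toy data are hypothetical.

```python
import statistics

def bland_altman(p_a, p_b):
    """Return plot coordinates and limits of agreement for two sets of
    predicted probabilities given for the same patients."""
    diffs = [a - b for a, b in zip(p_a, p_b)]        # y-axis of the plot
    means = [(a + b) / 2 for a, b in zip(p_a, p_b)]  # x-axis of the plot
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample SD (assumption)
    return {
        "means": means,
        "diffs": diffs,
        "bias": bias,
        "lower": bias - 2 * sd,
        "upper": bias + 2 * sd,
    }

# Toy probabilities from two hypothetical models.
cart_probs = [0.2, 0.5, 0.8, 0.4]
lr_probs = [0.1, 0.6, 0.7, 0.5]

result = bland_altman(cart_probs, lr_probs)
print(round(result["bias"], 3), round(result["lower"], 3), round(result["upper"], 3))
```

Points outside the two limits mark patients for whom the two models disagree most about the predicted risk.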