Pao-Jen Kuo1, Shao-Chun Wu2, Peng-Chen Chien1, Cheng-Shyuan Rau3, Yi-Chun Chen1, Hsiao-Yun Hsieh1, Ching-Hua Hsieh1.
Abstract
OBJECTIVES: This study aimed to build and test machine learning (ML) models to predict the mortality of hospitalised motorcycle riders.
Keywords: decision tree (dt); logistic regression (lr); machine learning (ml); mortality; motorcycle accident; support vector machine (svm)
Year: 2018 PMID: 29306885 PMCID: PMC5781097 DOI: 10.1136/bmjopen-2017-018252
Source DB: PubMed Journal: BMJ Open ISSN: 2044-6055 Impact factor: 2.692
Demographics and injury characteristics of the patients regarding gender, helmet-wearing status, comorbidities, injury region and number of injury regions
| Variables | Category | Total (N=7252) | Survival (n=7084) | Mortality (n=168) | P value |
| Sex | Female | 4291 (59.2%) | 4174 (58.9%) | 117 (69.6%) | 0.005 |
| | Male | 2961 (40.8%) | 2910 (41.1%) | 51 (30.4%) | |
| Helmet | No | 1011 (13.9%) | 929 (13.1%) | 82 (48.8%) | <0.001 |
| | Yes | 6241 (86.1%) | 6155 (86.9%) | 86 (51.2%) | |
| DM | No | 6562 (90.5%) | 6414 (90.5%) | 148 (88.1%) | 0.286 |
| | Yes | 690 (9.5%) | 670 (9.5%) | 20 (11.9%) | |
| HTN | No | 5939 (81.9%) | 5802 (81.9%) | 137 (81.5%) | 0.919 |
| | Yes | 1313 (18.1%) | 1282 (18.1%) | 31 (18.5%) | |
| CAD | No | 7120 (98.2%) | 6960 (98.2%) | 160 (95.2%) | 0.011 |
| | Yes | 132 (1.8%) | 124 (1.8%) | 8 (4.8%) | |
| CHF | No | 7228 (99.7%) | 7061 (99.7%) | 167 (99.4%) | 0.431 |
| | Yes | 24 (0.3%) | 23 (0.3%) | 1 (0.6%) | |
| CVA | No | 7168 (98.8%) | 7002 (98.8%) | 166 (98.8%) | 0.722 |
| | Yes | 84 (1.2%) | 82 (1.2%) | 2 (1.2%) | |
| ESRD | No | 7250 (100%) | 7082 (100%) | 168 (100%) | 1.000 |
| | Yes | 2 (0.0%) | 2 (0.0%) | 0 (0.0%) | |
| AIS (head/neck) | 0 | 4642 (64%) | 4627 (65.3%) | 15 (8.9%) | <0.001 |
| | 1 | 665 (9.2%) | 661 (9.3%) | 4 (2.4%) | |
| | 2 | 192 (2.6%) | 189 (2.7%) | 3 (1.8%) | |
| | 3 | 713 (9.8%) | 699 (9.9%) | 14 (8.3%) | |
| | 4 | 840 (11.6%) | 795 (11.2%) | 45 (26.8%) | |
| | 5 | 189 (2.6%) | 113 (1.6%) | 76 (45.3%) | |
| | 6 | 11 (0.2%) | 0 (0%) | 11 (6.5%) | |
| AIS (face) | 0 | 5472 (75.4%) | 5347 (75.5%) | 125 (74.4%) | <0.001 |
| | 1 | 574 (7.9%) | 568 (8%) | 6 (3.6%) | |
| | 2 | 1173 (16.2%) | 1141 (16.1%) | 32 (19%) | |
| | 3 | 33 (0.5%) | 28 (0.4%) | 5 (3%) | |
| AIS (thorax) | 0 | 6081 (83.9%) | 5973 (84.3%) | 108 (64.3%) | <0.001 |
| | 1 | 234 (3.2%) | 229 (3.3%) | 5 (3%) | |
| | 2 | 260 (3.6%) | 258 (3.6%) | 2 (1.2%) | |
| | 3 | 423 (5.8%) | 404 (5.7%) | 19 (11.3%) | |
| | 4 | 245 (3.4%) | 217 (3.1%) | 28 (16.7%) | |
| | 5 | 7 (0.1%) | 3 (<0.1%) | 4 (2.4%) | |
| | 6 | 2 (<0.1%) | 0 (0%) | 2 (1.1%) | |
| AIS (abdomen) | 0 | 6654 (91.8%) | 6516 (92%) | 138 (82.1%) | <0.001 |
| | 1 | 57 (0.8%) | 54 (0.8%) | 3 (1.8%) | |
| | 2 | 288 (4%) | 277 (3.9%) | 11 (6.5%) | |
| | 3 | 170 (2.2%) | 163 (2.3%) | 7 (4.2%) | |
| | 4 | 66 (0.9%) | 58 (0.8%) | 8 (4.8%) | |
| | 5 | 17 (0.2%) | 16 (0.2%) | 1 (0.6%) | |
| AIS (extremity) | 0 | 2000 (27.6%) | 1897 (26.8%) | 103 (61.3%) | <0.001 |
| | 1 | 528 (7.3%) | 524 (7.4%) | 4 (2.4%) | |
| | 2 | 2886 (39.8%) | 2853 (40.3%) | 33 (19.6%) | |
| | 3 | 1822 (25.1%) | 1800 (25.4%) | 22 (13.1%) | |
| | 4 | 12 (0.2%) | 8 (0.1%) | 4 (2.4%) | |
| | 5 | 4 (0.1%) | 2 (0.0%) | 2 (1.2%) | |
| AIS (external) | 0 | 6155 (84.9%) | 6001 (84.7%) | 154 (91.7%) | 0.003 |
| | 1 | 1072 (14.8%) | 1059 (14.9%) | 13 (7.7%) | |
| | 2 | 25 (0.3%) | 24 (0.3%) | 1 (0.6%) | |
| Number of AIS locations | 1 | 3687 (50.8%) | 3631 (51.3%) | 56 (33.3%) | <0.001 |
| | 2 | 2255 (31.1%) | 2205 (31.1%) | 50 (29.8%) | |
| | 3 | 982 (13.5%) | 939 (13.3%) | 43 (25.6%) | |
| | 4 | 280 (3.9%) | 265 (3.7%) | 15 (8.9%) | |
| | 5 | 43 (0.6%) | 39 (0.6%) | 4 (2.4%) | |
| | 6 | 5 (0.1%) | 5 (0.1%) | 0 (0.0%) | |
AIS, Abbreviated Injury Scale; CAD, coronary artery disease; CHF, congestive heart failure; CVA, cerebral vascular accident; DM, diabetes mellitus; ESRD, end-stage renal disease; HTN, hypertension.
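The P values for the two-level variables in this table can be approximated from the counts alone. The sketch below assumes a Pearson chi-square test without continuity correction; this record does not state which test the authors used, so that choice and the helper name `chi_square_2x2` are assumptions made for illustration only.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (no continuity correction) for [[a, b], [c, d]].

    Assumed test: the source record does not name the test behind its P values.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    observed = [a, b, c, d]
    expected = [
        row1 * col1 / n,
        row1 * col2 / n,
        row2 * col1 / n,
        row2 * col2 / n,
    ]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # With 1 degree of freedom, the chi-square survival function
    # equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# (survival, mortality) counts taken from the Sex rows: female, then male.
stat_sex, p_sex = chi_square_2x2(4174, 117, 2910, 51)
# (survival, mortality) counts from the Helmet rows: no helmet, then helmet.
stat_helmet, p_helmet = chi_square_2x2(929, 82, 6155, 86)

print(f"sex:    chi2={stat_sex:.2f}, p={p_sex:.4f}")
print(f"helmet: chi2={stat_helmet:.2f}, p={p_helmet:.2e}")
```

Under these assumptions the sex comparison lands close to the tabulated 0.005 and the helmet comparison falls well below 0.001. Variables with more than two levels (the AIS rows) would need more degrees of freedom and are not covered by this sketch.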
Injury characteristics of the patients regarding laboratory data collected on arrival at the emergency department
| Variables | Total (N=7252) | Survival (n=7084) | Mortality (n=168) | P value |
| Age (years) | 38 (29) | 37 (29) | 47 (32) | <0.001 |
| HR (beats/min) | 89 (23) | 89 (23) | 93 (43) | <0.001 |
| SBP (mm Hg) | 137 (38) | 137 (37) | 143 (79) | 0.374 |
| RR (times/min) | 19 (2) | 19 (2) | 19 (5) | 0.660 |
| Temperature (°C) | 36.4 (0.8) | 36.4 (0.8) | 36.0 (0.5) | <0.001 |
| GCS | 15 (5) | 15 (3) | 3 (3) | <0.001 |
| ISS | 13 (12) | 13 (13) | 29 (11) | <0.001 |
| RBC (10^12/L) | 4.6 (0.8) | 4.6 (0.8) | 4.3 (1.1) | <0.001 |
| WCC (10^9/L) | 12.9 (7.7) | 12.9 (7.7) | 13.2 (8.7) | <0.001 |
| Hb (g/dL) | 13.9 (2.5) | 13.9 (2.5) | 12.9 (3.5) | <0.001 |
| Hct (%) | 40.9 (6.8) | 41.1 (6.6) | 38.6 (9.4) | <0.001 |
| Platelets (10^3/μL) | 228 (79) | 230 (79) | 190 (78) | <0.001 |
| Glucose (mg/dL) | 145 (27) | 145 (23) | 218 (60) | <0.001 |
| Na (mEq/L) | 139 (3) | 139 (3) | 139 (4) | 0.094 |
| K (mEq/L) | 3.5 (0.6) | 3.5 (0.6) | 3.4 (0.9) | <0.001 |
| BUN (mg/dL) | 12 (6) | 12 (5) | 14 (8) | <0.001 |
| Cr (mg/dL) | 0.8 (0.3) | 0.8 (0.3) | 1.0 (0.5) | <0.001 |
| AST (U/L) | 47 (50) | 45 (48) | 65 (76) | <0.001 |
| ALT (U/L) | 34 (35) | 34 (33) | 39 (55) | <0.001 |
| BAC (mg/dL) | 4.9 (133.0) | 4.9 (136.4) | 4.9 (62.5) | 0.698 |
ALT, alanine aminotransferase; AST, aspartate aminotransferase; BAC, blood alcohol concentration; BUN, blood urea nitrogen; Cr, creatinine; GCS, Glasgow Coma Scale; Hb, haemoglobin; Hct, haematocrit; HR, heart rate; ISS, Injury Severity Score; K, potassium; Na, sodium; RBC, red blood cell; RR, respiratory rate; SBP, systolic blood pressure; WCC, white cell count.
Figure 1 ROC curves for LR, SVM and DT models in predicting mortality of motorcycle riders. AUC, area under the curve; DT, decision tree; LR, logistic regression; ROC, receiver operating characteristic; SVM, support vector machine.
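As a reminder of what the AUC in this figure measures, the sketch below computes it through the rank (Mann-Whitney) formulation: the probability that a randomly chosen fatal case receives a higher predicted score than a randomly chosen survivor, with ties counted as one half. The function name `roc_auc` and the toy labels and scores are hypothetical and are not taken from the study.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = death, 0 = survival; scores are model outputs.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(toy_labels, toy_scores)
print(auc)  # 0.75: three of the four positive/negative pairs are ordered correctly
```

The pairwise loop is quadratic in the number of cases, which is fine for illustration; a sort-based rank computation gives the same value for larger data sets.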
Summary of mortality prediction performances regarding accuracy, sensitivity, specificity and geometric mean with LR, SVM and DT models in the training and test sets
| Model | Set | Metric | All samples (n=6306), all variables | All samples (n=6306), selected features | Reduced samples (n=1510), all variables | Reduced samples (n=1510), selected features |
| LR | Train | Accuracy | 98.64 | | 94.44 | |
| | | Sensitivity | 59.31 | | 60 | |
| | | Specificity | 99.56 | | 98.1 | |
| | | Geometric mean | 76.84 | | 76.72 | |
| | Test | Accuracy | 98.41 | | 98.41 | |
| | | Sensitivity | 73.91 | | 73.91 | |
| | | Specificity | 99.02 | | 99.02 | |
| | | Geometric mean | 85.55 | | 85.55 | |
| SVM | Train | Accuracy | 98.62 | 98.62 | 94.37 | 93.84 |
| | | Sensitivity | 62.07 | 64.14 | 59.31 | 62.76 |
| | | Specificity | 99.48 | 99.43 | 98.1 | 97.14 |
| | | Geometric mean | 78.58 | 79.86 | 76.28 | 78.08 |
| | Test | Accuracy | 98.41 | 98.73 | 98.41 | 98.31 |
| | | Sensitivity | 69.57 | 86.96 | 69.57 | 73.91 |
| | | Specificity | 99.13 | 99.02 | 99.13 | 98.92 |
| | | Geometric mean | 83.05 | 92.79 | 83.05 | 85.51 |
| DT | Train | Accuracy | 98.92 | 98.92 | 95.83 | 95.83 |
| | | Sensitivity | 62.76 | 64.14 | 68.97 | 70.34 |
| | | Specificity | 99.77 | 99.74 | 98.68 | 98.53 |
| | | Geometric mean | 79.13 | 79.98 | 82.50 | 83.25 |
| | Test | Accuracy | 98.31 | 98.52 | 97.67 | 97.89 |
| | | Sensitivity | 65.22 | 69.57 | 65.22 | 69.57 |
| | | Specificity | 99.13 | 99.24 | 98.48 | 98.59 |
| | | Geometric mean | 80.41 | 83.09 | 80.14 | 82.82 |
DT, decision tree; LR, logistic regression; SVM, support vector machine.
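The four metrics in this table are linked: the geometric mean is the square root of sensitivity times specificity, which penalises a model that does well on the majority class (survivors) while missing deaths. The sketch below shows the arithmetic. The confusion-matrix counts in it are illustrative only and are not reported in this record; they were picked because they reproduce the LR test-set percentages.

```python
import math


def classification_metrics(tp, fn, tn, fp):
    """Return accuracy, sensitivity, specificity and geometric mean (as fractions)."""
    sensitivity = tp / (tp + fn)   # deaths correctly predicted
    specificity = tn / (tn + fp)   # survivors correctly predicted
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    g_mean = math.sqrt(sensitivity * specificity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "g_mean": g_mean,
    }


def geometric_mean(sensitivity_pct, specificity_pct):
    """Geometric mean computed directly from two percentages."""
    return math.sqrt(sensitivity_pct * specificity_pct)


# Illustrative counts (an assumption, not from the source).
metrics = classification_metrics(tp=17, fn=6, tn=914, fp=9)
for name, value in metrics.items():
    print(f"{name}: {value * 100:.2f}")

# Cross-check against tabulated sensitivity/specificity pairs.
lr_train = geometric_mean(59.31, 99.56)   # table reports 76.84
svm_test = geometric_mean(86.96, 99.02)   # table reports 92.79
print(f"{lr_train:.2f} {svm_test:.2f}")
```

The cross-check explains why the SVM with selected features stands out on the test set: its specificity is about the same as the other models, so the jump in geometric mean comes almost entirely from the higher sensitivity.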
Figure 2 Illustration of the DT model for mortality of motorcycle riders. The boxes denote the percentage of patients with discriminating variables from CART analysis. Survivors and fatalities are indicated in green and red, respectively, in the boxes. CART, classification and regression trees; DT, decision tree.
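CART grows a tree by picking, at each node, the split that most reduces class impurity. The sketch below shows that calculation with the Gini index, the usual CART criterion for classification, using the helmet counts from the demographics table purely as an example. Which variables the published tree actually splits on is not given in this record, and the function names are hypothetical.

```python
def gini(counts):
    """Gini impurity of a node given its per-class counts."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)


def weighted_gini(children):
    """Size-weighted Gini impurity over the child nodes of a split."""
    n = sum(sum(child) for child in children)
    return sum(sum(child) / n * gini(child) for child in children)


# (survival, mortality) counts for the whole cohort.
parent = (7084, 168)
# Example split on helmet use: no helmet, then helmet.
helmet_split = [(929, 82), (6155, 86)]

gain = gini(parent) - weighted_gini(helmet_split)
print(f"parent impurity: {gini(parent):.4f}")
print(f"impurity after split: {weighted_gini(helmet_split):.4f}")
print(f"impurity reduction: {gain:.4f}")
```

A tree builder would repeat this for every candidate variable and threshold, keep the split with the largest reduction, and recurse on each child node.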
Pairwise comparison of AUC between LR, SVM and DT models in the training set (P values)
| | | LR AS | LR RS | SVM (AS+AV) | SVM (AS+SF) | SVM (RS+AV) | SVM (RS+SF) | DT (AS+AV) | DT (AS+SF) | DT (RS+AV) | DT (RS+SF) |
| LR | AS | | | | | | | | | | |
| | RS | 0.6575 | | | | | | | | | |
| SVM | (AS+AV) | 0.7481 | 0.6785 | | | | | | | | |
| | (AS+SF) | 0.4121 | 0.7075 | 0.2473 | | | | | | | |
| | (RS+AV) | 0.9151 | 0.9161 | 0.6619 | 0.6652 | | | | | | |
| | (RS+SF) | 0.3502 | 0.5965 | 0.4135 | 0.9939 | 0.5346 | | | | | |
| DT | (AS+AV) | 0.0001* | 0.0001* | 0.0001* | 0.0002* | 0.0002* | 0.0002* | | | | |
| | (AS+SF) | 0.0001* | 0.0002* | 0.0001* | 0.0002* | 0.0002* | 0.0002* | 0.3578 | | | |
| | (RS+AV) | 0.0542 | 0.0618 | 0.0543 | 0.0713 | 0.0658 | 0.0703 | 0.0009* | 0.0010* | | |
| | (RS+SF) | 0.0566 | 0.0643 | 0.0567 | 0.0743 | 0.0684 | 0.0731 | 0.0008* | 0.0009* | 0.3570 | |
*P<0.05.
AS, all samples; AUC, area under the curve; AV, all variables; DT, decision tree; LR, logistic regression; RS, reduced samples; SF, selected features; SVM, support vector machine.