Ali J Ghandour, Huda Hammoud, Samar Al-Hajj.
Abstract
Road traffic injury accounts for a substantial human and economic burden globally. Understanding risk factors contributing to fatal injuries is of paramount importance. In this study, we proposed a model that adopts a hybrid ensemble machine learning classifier structured from sequential minimal optimization and decision trees to identify risk factors contributing to fatal road injuries. The model was constructed, trained, tested, and validated using the Lebanese Road Accidents Platform (LRAP) database of 8482 road crash incidents, with fatality occurrence as the outcome variable. A sensitivity analysis was conducted to examine the influence of multiple factors on fatality occurrence. Seven out of the nine selected independent variables were significantly associated with fatality occurrence, namely, crash type, injury severity, spatial cluster-ID, and crash time (hour). Evidence gained from the model data analysis will be adopted by policymakers and key stakeholders to gain insights into major contributing factors associated with fatal road crashes and to translate knowledge into safety programs and enhanced road policies.
Keywords: classifier ensemble; fatal crashes; machine learning; road fatality factors
Year: 2020 PMID: 32526945 PMCID: PMC7312085 DOI: 10.3390/ijerph17114111
Source DB: PubMed Journal: Int J Environ Res Public Health ISSN: 1660-4601 Impact factor: 3.390
Input and output variables.
| Variable | Range |
|---|---|
| **Input variables** | |
| Month | 1–12 |
| Day | 1–31 |
| Day of the Week | Monday–Sunday |
| Hour of Crash | 0–23 |
| AM/PM | am, pm |
| Crash Type | Vehicle–Vehicle, Vehicle–Truck, Vehicle–Pedestrian, Vehicle–Motorcycle, Vehicle–Barrier, Truck–Truck, Truck–Motorcycle, Truck–Barrier, Motorcycle–Motorcycle, Motorcycle–Barrier, Other |
| Injury Severity Level | No Apparent-Injury, Minor Injury, Serious Injury |
| Road Type | Motorway, Trunk, Primary, Secondary, Tertiary, Unclassified |
| Spatial Cluster ID | 1–10 |
| **Output variable** | |
| Fatality occurrence | Fatal, Not Fatal |
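The variable ranges in the table above can be expressed as a small record type that rejects out-of-range values. This is only an illustrative sketch: the class name `CrashRecord`, its field names, and the rule that AM/PM follows from the hour (0–11 = am) are assumptions, not the schema of the LRAP database.

```python
from dataclasses import dataclass

DAYS = ("Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday")
CRASH_TYPES = (
    "Vehicle–Vehicle", "Vehicle–Truck", "Vehicle–Pedestrian", "Vehicle–Motorcycle",
    "Vehicle–Barrier", "Truck–Truck", "Truck–Motorcycle", "Truck–Barrier",
    "Motorcycle–Motorcycle", "Motorcycle–Barrier", "Other",
)
SEVERITY_LEVELS = ("No Apparent-Injury", "Minor Injury", "Serious Injury")
ROAD_TYPES = ("Motorway", "Trunk", "Primary", "Secondary", "Tertiary", "Unclassified")


@dataclass(frozen=True)
class CrashRecord:
    """Hypothetical container for one crash; field names are illustrative."""
    month: int
    day: int
    day_of_week: str
    hour: int
    crash_type: str
    injury_severity: str
    road_type: str
    cluster_id: int
    fatal: bool  # outcome variable: True = Fatal, False = Not Fatal

    def __post_init__(self):
        # Each check mirrors one row of the variable table.
        checks = {
            "month": 1 <= self.month <= 12,
            "day": 1 <= self.day <= 31,
            "day_of_week": self.day_of_week in DAYS,
            "hour": 0 <= self.hour <= 23,
            "crash_type": self.crash_type in CRASH_TYPES,
            "injury_severity": self.injury_severity in SEVERITY_LEVELS,
            "road_type": self.road_type in ROAD_TYPES,
            "cluster_id": 1 <= self.cluster_id <= 10,
        }
        bad = [name for name, ok in checks.items() if not ok]
        if bad:
            raise ValueError(f"out-of-range fields: {bad}")

    @property
    def am_pm(self) -> str:
        # Assumption: AM/PM is derived from the hour (0-11 = am, 12-23 = pm).
        return "am" if self.hour < 12 else "pm"
```

A record built with `hour=14` reports `am_pm == "pm"`, and a record with `hour=24` raises `ValueError`.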
Performance metrics of single learning classifiers.
| Classifier | F1-Score | AUC-PR | Kappa |
|---|---|---|---|
| SMO | 0.493 | 0.276 | 0.4678 |
| Random Forest | 0.453 | 0.376 | 0.4258 |
| ANN | 0.385 | 0.291 | 0.3462 |
| Logistic Regression | 0.455 | 0.361 | 0.4309 |
| Naïve Bayes | 0.313 | 0.337 | 0.294 |
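For reference, the F1-score and Cohen's kappa reported in these tables can both be computed from a binary confusion matrix. The sketch below uses the standard definitions and assumes the fatal class is treated as the positive class; the counts in the example are made up for illustration and are not taken from the study.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall for the positive class."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def cohen_kappa(tp: int, fp: int, fn: int, tn: int) -> float:
    """Agreement between predictions and labels, corrected for chance."""
    n = tp + fp + fn + tn
    observed = (tp + tn) / n
    # Chance agreement from the row and column totals of the confusion matrix.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    return (observed - expected) / (1 - expected) if expected != 1 else 0.0


# Hypothetical counts for an imbalanced problem (few fatal crashes).
tp, fp, fn, tn = 40, 30, 50, 880
print(f"F1 = {f1_score(tp, fp, fn):.3f}")
print(f"kappa = {cohen_kappa(tp, fp, fn, tn):.3f}")
```

Both metrics ignore the large number of correctly classified non-fatal records to different degrees, which is why they are more informative than plain accuracy when fatal crashes are rare.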
Performance metrics using the ensemble method.
| Classifier | F1-Score | AUC-PR | Kappa |
|---|---|---|---|
| Bagging J48 100 decision trees | 0.464 | 0.382 | 0.4365 |
| Vote SMO with Bagging J48 | 0.511 | 0.402 | 0.4882 |
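One way to picture the "Vote SMO with Bagging J48" combination is as a vote between two members: the SMO-trained classifier and a bagged set of decision trees. The record does not state how the two are combined, so the sketch below assumes an average-of-probabilities rule and assumes that each member already yields a probability of the fatal class. The function names and the example numbers are hypothetical.

```python
from typing import List, Sequence


def bagging_probability(tree_predictions: Sequence[Sequence[int]]) -> List[float]:
    """Fraction of bagged trees that predict 'fatal' (1) for each record.

    tree_predictions[t][i] is tree t's 0/1 prediction for record i.
    """
    n_trees = len(tree_predictions)
    return [sum(votes) / n_trees for votes in zip(*tree_predictions)]


def vote(prob_a: Sequence[float], prob_b: Sequence[float], threshold: float = 0.5):
    """Average two members' fatal-class probabilities and threshold the result."""
    if len(prob_a) != len(prob_b):
        raise ValueError("members must score the same records")
    averaged = [(a + b) / 2 for a, b in zip(prob_a, prob_b)]
    labels = [p >= threshold for p in averaged]
    return labels, averaged


# Hypothetical outputs for three records.
smo_probs = [0.9, 0.2, 0.6]                      # assumed SMO member probabilities
trees = [[1, 0, 0], [1, 0, 1], [1, 1, 0], [0, 0, 0]]  # 4 trees here, not 100
labels, averaged = vote(smo_probs, bagging_probability(trees))
print(labels)  # [True, False, False]
```

In the third record the two members disagree (0.6 versus 0.25), and the averaged probability falls below the threshold, so the vote resolves to "not fatal".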
Performance metrics for the selected classifier over the test dataset.
| Metric | Vote SMO with Bagging J48 |
|---|---|
| F1-score | 0.435 |
| AUC-PR | 0.368 |
| Kappa | 0.4067 |
Figure 1. Precision–recall curve for the “Vote SMO with Bagging J48” classifier.
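A precision–recall curve and its area (AUC-PR) can be derived from ranked classifier scores. The sketch below is a minimal version that sweeps the decision threshold over the sorted scores and uses the step-wise (average precision) estimate of the area. The tool behind the reported values may interpolate differently, so this is an assumption, and the labels and scores are invented for illustration.

```python
from typing import List, Sequence, Tuple


def pr_curve(labels: Sequence[int], scores: Sequence[float]) -> List[Tuple[float, float]]:
    """Return (recall, precision) points, one per record, from highest score down.

    Tied scores are not grouped; each record is treated as its own threshold.
    """
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("at least one positive label is required")
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    tp = fp = 0
    points = []
    for i in order:
        if labels[i]:
            tp += 1
        else:
            fp += 1
        points.append((tp / total_pos, tp / (tp + fp)))
    return points


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Step-wise area under the precision-recall curve."""
    area, prev_recall = 0.0, 0.0
    for recall, precision in pr_curve(labels, scores):
        area += (recall - prev_recall) * precision
        prev_recall = recall
    return area


# Hypothetical labels (1 = fatal) and predicted scores.
labels = [1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.2]
print(f"AUC-PR = {average_precision(labels, scores):.3f}")  # AUC-PR = 0.833
```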
Figure 2. Top correlated variables in order of the Chi-squared score.
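A Chi-squared score of the kind used to rank variables in Figure 2 can be computed as the Pearson statistic of the contingency table between one categorical variable and the outcome. The sketch below assumes that definition; the exact scoring procedure used in the study is not given in this record, and the sample data are invented.

```python
from collections import Counter
from typing import Dict, Hashable, List, Sequence, Tuple


def chi_squared(feature: Sequence[Hashable], outcome: Sequence[Hashable]) -> float:
    """Pearson chi-squared statistic between a categorical feature and the outcome."""
    n = len(feature)
    joint = Counter(zip(feature, outcome))
    feature_totals = Counter(feature)
    outcome_totals = Counter(outcome)
    stat = 0.0
    for f, f_count in feature_totals.items():
        for o, o_count in outcome_totals.items():
            expected = f_count * o_count / n  # count expected under independence
            observed = joint.get((f, o), 0)
            stat += (observed - expected) ** 2 / expected
    return stat


def rank_features(features: Dict[str, Sequence[Hashable]],
                  outcome: Sequence[Hashable]) -> List[Tuple[str, float]]:
    """Sort variables by descending chi-squared score."""
    scored = [(name, chi_squared(values, outcome)) for name, values in features.items()]
    return sorted(scored, key=lambda item: -item[1])


# Hypothetical sample: 20 crashes, half involving a pedestrian.
crash_type = ["Pedestrian"] * 10 + ["Barrier"] * 10
fatal = [1] * 8 + [0] * 2 + [1] * 2 + [0] * 8
print(round(chi_squared(crash_type, fatal), 3))  # 7.2
```

A larger score means the variable's categories differ more in their fatality rates than independence would predict, which is the sense in which a variable ranks higher.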