Farshid Afshar, Seyedehsan Seyedabrishami, Sara Moridpour.
Abstract
Crash severity models play a crucial role in evaluating the factors that influence the severity of traffic crashes. In this study, the Extremely Randomised Tree (ERT) method is used as a machine learning technique to analyse the severity of crashes. Crash data from the province of Khorasan Razavi, Iran, covering the 5 years from 2013 to 2017, are used to develop the crash severity model. The dataset includes traffic-related variables, vehicle specifications, vehicle movement, land use characteristics, temporal characteristics, and environmental variables. Feature Importance Analysis (FIA), Partial Dependence Plots (PDP), and Individual Conditional Expectation (ICE) plots are used to analyse and interpret the results. According to the results, the involvement of vulnerable road users such as motorcyclists and pedestrians, together with traffic-related variables, is among the most significant factors in crash severity. The presence of motorcycles can increase the probability of injury crashes by around 30% and almost double the probability of fatal crashes. Analysing the interaction PDPs shows that driving speeds above 60 km/h in residential areas raise the probability of injury crashes by about 10%. In addition, at speeds higher than 70 km/h, the presence of pedestrians increases the probability of fatal crashes by approximately 6%.
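To make the ERT idea concrete, the sketch below illustrates the split rule that distinguishes extremely randomised trees from classical decision trees: for each candidate feature a threshold is drawn uniformly at random between the feature's minimum and maximum, and the best of those random candidates is kept. It is a plain-Python illustration, not the authors' pipeline; the function names (`gini`, `random_split`) and the toy rows (average speed, motorcycle flag) are hypothetical and invented for this example.

```python
import random

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

def random_split(rows, labels, n_candidates, rng):
    """Return (score, feature, threshold) for the best of a few random splits.

    Thresholds are not optimised: each one is drawn uniformly between
    the minimum and maximum of the chosen feature.
    """
    n_features = len(rows[0])
    best = None
    for feature in rng.sample(range(n_features), n_candidates):
        values = [r[feature] for r in rows]
        lo, hi = min(values), max(values)
        if lo == hi:
            continue  # constant feature, cannot split
        threshold = rng.uniform(lo, hi)
        left = [y for r, y in zip(rows, labels) if r[feature] <= threshold]
        right = [y for r, y in zip(rows, labels) if r[feature] > threshold]
        # Size-weighted impurity of the two children (lower is better).
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if best is None or score < best[0]:
            best = (score, feature, threshold)
    return best

# Invented toy data: [average speed in km/h, motorcycle involved (0/1)].
rows = [[55.0, 0], [62.0, 1], [71.0, 1], [80.0, 0], [88.0, 1], [95.0, 0]]
labels = ["PDO", "Injury", "Injury", "PDO", "Fatal", "Injury"]

best = random_split(rows, labels, n_candidates=2, rng=random.Random(0))
print(best)
```

A full ERT classifier would apply this rule recursively to grow many trees and average their votes; in practice a library implementation would normally be used instead of hand-written code.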
Year: 2022 PMID: 35798814 PMCID: PMC9263179 DOI: 10.1038/s41598-022-15693-7
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.996
Injury severity categories in the crash dataset.
| Severity level | Description | Train | Test | Total |
|---|---|---|---|---|
| Level 1 | PDO-no injury | 424 (37%) | 104 (36%) | 528 (37%) |
| Level 2 | Injury | 674 (59%) | 166 (58%) | 840 (59%) |
| Level 3 | Fatal injury | 43 (4%) | 16 (6%) | 59 (4%) |
| Overall | | 1141 | 286 | 1427 |
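The percentages in the severity table follow directly from the counts. As a quick check, the sketch below recomputes the class shares for each split and the fraction of records held out for testing. The counts are taken from the table; the variable and function names are hypothetical.

```python
# Crash counts per severity level, copied from the table above.
counts = {
    "PDO-no injury": {"train": 424, "test": 104},
    "Injury": {"train": 674, "test": 166},
    "Fatal injury": {"train": 43, "test": 16},
}

def shares(split):
    """Percentage of each severity level within one split, rounded to integers."""
    total = sum(c[split] for c in counts.values())
    return {name: round(100 * c[split] / total) for name, c in counts.items()}

train_shares = shares("train")
test_shares = shares("test")

# Fraction of all records that ended up in the test split.
n_train = sum(c["train"] for c in counts.values())
n_test = sum(c["test"] for c in counts.values())
test_fraction = n_test / (n_train + n_test)

print(train_shares, test_shares, round(test_fraction, 2))
```

The recomputed shares make the class imbalance explicit: fatal crashes form only a few percent of either split, so overall accuracy alone says little about how well the rarest class is predicted.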
Description of the explanatory variables.
| Category | Variable | Type | Description | Value | Train | Test |
|---|---|---|---|---|---|---|
| Vehicle | Moving forward | Binary | Crash occurred while moving forward | Zero | 508 (45%) | 120 (42%) |
| | | | | One | 633 (55%) | 166 (58%) |
| | Motorcycle at fault | Binary | Motorcycle hit another vehicle | Zero | 1034 (91%) | 260 (91%) |
| | | | | One | 107 (9%) | 26 (9%) |
| | Car involved in crash | Binary | At least one passenger car involved in crash | Zero | 817 (72%) | 212 (74%) |
| | | | | One | 324 (28%) | 74 (26%) |
| | Hit fixed objects | Binary | Vehicle hit fixed objects | Zero | 1044 (91%) | 265 (93%) |
| | | | | One | 97 (9%) | 21 (7%) |
| | Left turn | Binary | Crash occurred while one vehicle was turning left | Zero | 1079 (95%) | 271 (95%) |
| | | | | One | 62 (5%) | 15 (5%) |
| | Motorcycle innocent | Binary | Motorcycle was hit by another vehicle | Zero | 1022 (90%) | 254 (89%) |
| | | | | One | 119 (10%) | 32 (11%) |
| | Truck at fault | Binary | Truck was found at fault in crash | Zero | 1037 (91%) | 260 (91%) |
| | | | | One | 104 (9%) | 26 (9%) |
| | Truck innocent | Binary | Truck was not found at fault in crash | Zero | 1082 (95%) | 277 (97%) |
| | | | | One | 59 (5%) | 9 (3%) |
| | Pedestrian involved | Binary | Pedestrian was hit by vehicle | Zero | 1057 (93%) | 259 (91%) |
| | | | | One | 84 (7%) | 27 (9%) |
| Land use | Residential area | Binary | Crash occurred in residential area | Zero | 869 (76%) | 219 (77%) |
| | | | | One | 272 (24%) | 67 (23%) |
| | Agricultural area | Binary | Crash occurred in agricultural area | Zero | 1055 (92%) | 268 (94%) |
| | | | | One | 86 (8%) | 18 (6%) |
| Crash cause | Fail to keep longitudinal distance | Binary | Driver failed to keep longitudinal distance | Zero | 1077 (94%) | 272 (95%) |
| | | | | One | 64 (6%) | 14 (5%) |
| | Failing to yield the right of way | Binary | Driver failed to yield the right of way | Zero | 1019 (89%) | 260 (91%) |
| | | | | One | 122 (11%) | 26 (9%) |
| Temporal | February | Binary | Crash occurred in February | Zero | 1083 (95%) | 262 (92%) |
| | | | | One | 58 (5%) | 24 (8%) |
| | Winter | Binary | Crash occurred in winter | Zero | 854 (75%) | 199 (70%) |
| | | | | One | 287 (25%) | 87 (30%) |
| Environment | No road marking | Binary | No road marking at the crash scene | Zero | 937 (82%) | 224 (78%) |
| | | | | One | 204 (18%) | 62 (22%) |
| | Dry road surface | Binary | Dry road surface | Zero | 323 (28%) | 90 (31%) |
| | | | | One | 818 (72%) | 196 (69%) |
| | Unpaved shoulder | Binary | Road had unpaved shoulder at crash scene | Zero | 890 (78%) | 203 (71%) |
| | | | | One | 251 (22%) | 83 (29%) |
| | Clean weather (no cloud) | Binary | No cloud was in the sky | Zero | 228 (20%) | 73 (26%) |
| | | | | One | 913 (80%) | 213 (74%) |
| | Road lighting deficiency | Binary | Lighting system was not fully functional | Zero | 1133 (99%) | 283 (99%) |
| | | | | One | 8 (1%) | 3 (1%) |
| | Collision in the median strip | Binary | Collision occurred in the median strip | Zero | 1052 (92%) | 266 (93%) |
| | | | | One | 89 (8%) | 20 (7%) |
| | Broken line | Binary | Broken marking line | Zero | 402 (35%) | 123 (43%) |
| | | | | One | 739 (65%) | 163 (57%) |
| | Direct-downhill road | Binary | Crash occurred on a direct-downhill road | Zero | 1106 (97%) | 278 (97%) |
| | | | | One | 35 (3%) | 8 (3%) |
| | Night without enough lighting | Binary | Crash occurred at night without enough lighting | Zero | 892 (78%) | 223 (78%) |
| | | | | One | 249 (22%) | 63 (22%) |
| | Snowy and frozen surface | Binary | Snowy and frozen road surface | Zero | 1119 (98%) | 277 (97%) |
| | | | | One | 22 (2%) | 9 (3%) |
| | Paved shoulder | Binary | Road had paved shoulder at crash scene | Zero | 581 (51%) | 165 (58%) |
| | | | | One | 560 (49%) | 121 (42%) |
| Traffic | Flow rate (vehicle/min) | Continuous | Traffic flow rate one hour before crash | Mean | 8.07 | 8.1 |
| | | | | Standard deviation | 8.34 | 8.44 |
| | Headway (sec) | Continuous | Headway one hour before crash | Mean | 37.12 | 42.42 |
| | | | | Standard deviation | 183.66 | 237.63 |
| | Average speed (km/h) | Continuous | Average speed one hour before crash | Mean | 80.05 | 79.53 |
| | | | | Standard deviation | 15.28 | 14.82 |
Figure 1. Accuracy score for different combinations of hyper-parameters [plotted by Python Matplotlib v. 3.3.4 https://matplotlib.org].
Performance of selected models (train).
| Metric | AdaBoost | XGBoost | Random forest | ERT |
|---|---|---|---|---|
| Accuracy | 0.756 | 0.776 | 0.831 | 0.850 |
| Precision | 0.729 | 0.748 | 0.842 | 0.858 |
| Recall | 0.756 | 0.776 | 0.831 | 0.850 |
| F-measure | 0.743 | 0.758 | 0.816 | 0.840 |
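In the performance table, recall equals accuracy for every model. That is what one would expect if precision, recall, and F-measure are support-weighted averages over the three severity classes, although this averaging scheme is an assumption and is not stated here. The sketch below demonstrates the identity on a hypothetical confusion matrix; the matrix values are invented for illustration and are not results from the study.

```python
def weighted_metrics(cm):
    """Accuracy and support-weighted precision, recall, F1.

    cm[i][j] is the number of class-i crashes predicted as class j.
    """
    k = len(cm)
    total = sum(sum(row) for row in cm)
    accuracy = sum(cm[i][i] for i in range(k)) / total
    precision = recall = f1 = 0.0
    for i in range(k):
        support = sum(cm[i])                         # true members of class i
        predicted = sum(cm[r][i] for r in range(k))  # predictions of class i
        p = cm[i][i] / predicted if predicted else 0.0
        r = cm[i][i] / support if support else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        weight = support / total
        precision += weight * p
        recall += weight * r
        f1 += weight * f
    return accuracy, precision, recall, f1

# Invented 3x3 matrix; rows and columns are PDO, Injury, Fatal.
cm = [
    [30, 10, 0],
    [8, 45, 2],
    [1, 3, 1],
]
accuracy, precision, recall, f1 = weighted_metrics(cm)
print(accuracy, precision, recall, f1)
```

Under the same assumption, the F-measure is an average of per-class F1 scores rather than the harmonic mean of the aggregate precision and recall, which would be consistent with the ERT F-measure (0.840) being lower than the harmonic mean of 0.858 and 0.850.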
Figure 2. ROC curves for the ERT model [plotted by Python Matplotlib v. 3.3.4 https://matplotlib.org].
Figure 3. Importance of variables used in the model [plotted by Python Matplotlib v. 3.3.4 https://matplotlib.org].
Figure 4. PDPs for the most influential variables on crash severity [plotted by Python Matplotlib v. 3.3.4 https://matplotlib.org].
Figure 5. ICE plots for the most influential variables on crash severity [plotted by Python Matplotlib v. 3.3.4 https://matplotlib.org].
Figure 6. ICE plots for the most important variables [plotted by Python Matplotlib v. 3.3.4 https://matplotlib.org].
Figure 7. Interaction PDPs of selected variables with average speed [plotted by Plotly].
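The PDP, ICE, and interaction PDP figures are closely related: an ICE curve traces the prediction for one observation as a single feature is varied, a PDP is the pointwise average of the ICE curves, and an interaction PDP averages predictions over a grid of two features. The plain-Python sketch below illustrates these definitions. It uses a hand-written stand-in, `predict_fatal`, instead of the trained ERT model; its coefficients and the toy observations are invented, so the numbers do not reproduce the reported results.

```python
def predict_fatal(speed_kmh, pedestrian):
    """Hypothetical stand-in for a fitted model's P(fatal crash)."""
    p = 0.02 + 0.0005 * max(speed_kmh - 60.0, 0.0)
    if pedestrian and speed_kmh > 70.0:
        p += 0.05  # invented interaction effect
    return min(p, 1.0)

# Invented observations standing in for rows of the crash dataset.
observations = [
    {"speed_kmh": 65.0, "pedestrian": 0},
    {"speed_kmh": 82.0, "pedestrian": 1},
    {"speed_kmh": 90.0, "pedestrian": 0},
]

def ice_curves(feature, grid_values):
    """One curve per observation: vary `feature`, keep the rest fixed."""
    return [
        [predict_fatal(**dict(obs, **{feature: v})) for v in grid_values]
        for obs in observations
    ]

def partial_dependence(feature, grid_values):
    """PDP = average of the ICE curves at each grid value."""
    curves = ice_curves(feature, grid_values)
    return [sum(column) / len(column) for column in zip(*curves)]

def interaction_pdp(feat_a, grid_a, feat_b, grid_b):
    """Two-way PDP: average prediction for each (a, b) grid pair."""
    surface = []
    for a in grid_a:
        row = []
        for b in grid_b:
            preds = [
                predict_fatal(**dict(obs, **{feat_a: a, feat_b: b}))
                for obs in observations
            ]
            row.append(sum(preds) / len(preds))
        surface.append(row)
    return surface

grid = [50.0, 60.0, 70.0, 80.0, 90.0]
ice = ice_curves("speed_kmh", grid)
pdp = partial_dependence("speed_kmh", grid)
surface = interaction_pdp("speed_kmh", [60.0, 80.0], "pedestrian", [0, 1])
print(pdp)
print(surface)
```

Plotting each list in `ice` against `grid` would give ICE curves, plotting `pdp` gives the PDP, and `surface` can be drawn as a heat map or 3D surface for the interaction view.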