Md Mostafizur Rahman Komol, Md Mahmudul Hasan, Mohammed Elhenawy, Shamsunnahar Yasmin, Mahmoud Masoud, Andry Rakotonirainy.
Abstract
Road crash fatality is a universal problem of the transportation system. Road crashes cause a massive death toll every year, and vulnerable road users (VRUs) are especially exposed to high crash severity. This paper employs machine learning-based classification approaches to model the injury severity of vulnerable road users: pedestrians, bicyclists, and motorcyclists. Specifically, the study analyses the critical features associated with each VRU group (pedestrians, bicyclists, motorcyclists) and with all VRU groups together. The critical factors of crash severity outcomes are estimated to identify similarities and differences in the important features across VRU groups. The crash data are sourced from the state of Queensland, Australia, for the years 2013 through 2019. The supervised machine learning algorithms considered for the empirical analysis include K-Nearest Neighbour (KNN), Support Vector Machine (SVM) and Random Forest (RF). The models are trained on 17 distinct road crash parameters as input features, which originate from road user characteristics, weather and environment, vehicle and driver condition, period, road characteristics and regions, traffic, and speed jurisdiction. The classification models are trained and tested separately for each individual VRU group and for the unified VRU group to assess crash severity levels. Model performances are then compared to identify the best classifier; the Random Forest models are found to be comparatively robust in test accuracy for all VRU modes (motorcyclist: 72.30%, bicyclist: 64.45%, pedestrian: 67.23%, unified VRU: 68.57%). Based on the Random Forest model, the road crash features are ranked and compared according to their impact on crash severity classification.
Furthermore, a model-based partial dependency of each road crash parameter on the severity levels is plotted and compared for each individual VRU group and for the unified VRU group. This clarifies how the road crash parameters vary with VRU crash severity. Based on the comparative analysis, motorcyclists are found to be more likely exposed to higher crash severity, followed by pedestrians and bicyclists.
Year: 2021 PMID: 34352026 PMCID: PMC8341492 DOI: 10.1371/journal.pone.0255828
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Data summary.
| Crash Severity Feature | Description (Level) | Code | Motorcyclist Frequency | Motorcyclist Percentage | Bicyclist Frequency | Bicyclist Percentage | Pedestrian Frequency | Pedestrian Percentage | Unified VRU Frequency | Unified VRU Percentage |
|---|---|---|---|---|---|---|---|---|---|---|
| Period | | | | | | | | | | |
| Year | 2013 | 0 | 1556 | 13.91% | 782 | 14.51% | 664 | 14.50% | 3002 | 14.19% |
| | 2014 | 1 | 1655 | 14.79% | 832 | 15.44% | 629 | 13.74% | 3116 | 14.73% |
| | 2015 | 2 | 1664 | 14.87% | 724 | 13.43% | 664 | 14.50% | 3052 | 14.42% |
| | 2016 | 3 | 1594 | 14.24% | 769 | 14.27% | 695 | 15.18% | 3058 | 14.45% |
| | 2017 | 4 | 1550 | 13.85% | 798 | 14.81% | 677 | 14.79% | 3025 | 14.30% |
| | 2018 | 5 | 1585 | 14.16% | 710 | 13.17% | 603 | 13.17% | 2898 | 13.70% |
| | 2019 | 6 | 1586 | 14.17% | 775 | 14.38% | 646 | 14.11% | 3007 | 14.21% |
| Month | January | 0 | 976 | 8.72% | 443 | 8.22% | 384 | 8.39% | 1803 | 8.52% |
| | February | 1 | 1067 | 9.54% | 489 | 9.07% | 425 | 9.28% | 1981 | 9.36% |
| | March | 2 | 850 | 7.60% | 339 | 6.29% | 335 | 7.32% | 1524 | 7.20% |
| | April | 3 | 777 | 6.94% | 481 | 8.92% | 345 | 7.54% | 1603 | 7.58% |
| | May | 4 | 811 | 7.25% | 373 | 6.92% | 274 | 5.99% | 1458 | 6.89% |
| | June | 5 | 988 | 8.83% | 485 | 9.00% | 410 | 8.96% | 1883 | 8.90% |
| | July | 6 | 903 | 8.07% | 440 | 8.16% | 434 | 9.48% | 1777 | 8.40% |
| | August | 7 | 852 | 7.61% | 508 | 9.42% | 382 | 8.34% | 1742 | 8.23% |
| | September | 8 | 1077 | 9.62% | 504 | 9.35% | 460 | 10.05% | 2041 | 9.65% |
| | October | 9 | 923 | 8.25% | 456 | 8.46% | 373 | 8.15% | 1752 | 8.28% |
| | November | 10 | 972 | 8.69% | 454 | 8.42% | 393 | 8.58% | 1819 | 8.60% |
| | December | 11 | 994 | 8.88% | 418 | 7.76% | 363 | 7.93% | 1775 | 8.39% |
| Day of Week | Monday | 0 | 1686 | 15.07% | 742 | 13.77% | 804 | 17.56% | 3232 | 15.28% |
| | Tuesday | 1 | 1373 | 12.27% | 726 | 13.47% | 599 | 13.08% | 2698 | 12.75% |
| | Wednesday | 2 | 1741 | 15.56% | 659 | 12.23% | 539 | 11.77% | 2939 | 13.89% |
| | Thursday | 3 | 1855 | 16.58% | 571 | 10.59% | 473 | 10.33% | 2899 | 13.70% |
| | Friday | 4 | 1546 | 13.82% | 873 | 16.20% | 719 | 15.71% | 3138 | 14.83% |
| | Saturday | 5 | 1453 | 12.98% | 968 | 17.96% | 714 | 15.60% | 3135 | 14.82% |
| | Sunday | 6 | 1536 | 13.73% | 851 | 15.79% | 730 | 15.95% | 3117 | 14.73% |
| Hour (Time of Day) | Early morning (midnight–6:30 a.m.) | 0 | 941 | 8.41% | 871 | 16.16% | 417 | 9.11% | 2229 | 10.54% |
| | A.m. peak (6:30 a.m.–9:00 a.m.) | 1 | 1783 | 15.93% | 1677 | 31.11% | 726 | 15.86% | 4186 | 19.78% |
| | A.m. off-peak (9:00 a.m.–noon) | 2 | 2236 | 19.98% | 608 | 11.28% | 684 | 14.94% | 3528 | 16.67% |
| | P.m. off-peak (noon–4:00 p.m.) | 3 | 3388 | 30.28% | 1226 | 22.75% | 1348 | 29.45% | 5962 | 28.18% |
| | P.m. peak (4:00 p.m.–6:30 p.m.) | 4 | 1892 | 16.91% | 809 | 15.01% | 875 | 19.11% | 3576 | 16.90% |
| | Evening (6:30 p.m.–midnight) | 5 | 857 | 7.66% | 199 | 3.69% | 528 | 11.53% | 1677 | 7.93% |
| Road and Environment Characteristics | | | | | | | | | | |
| Road & Environment Condition | Lighting Condition | 1 | 9136 | 81.64% | 4603 | 85.40% | 3941 | 86.09% | 17566 | 83.02% |
| | Road Condition | 2 | 648 | 5.79% | 450 | 8.35% | 482 | 10.53% | 1407 | 6.65% |
| | Rain wet Slippery | 3 | 1387 | 12.39% | 325 | 6.03% | 142 | 3.10% | 2141 | 10.12% |
| | Atmospheric Condition | 4 | 7 | 0.06% | 9 | 0.17% | 8 | 0.17% | 23 | 0.11% |
| | None | 0 | 12 | 0.11% | 3 | 0.06% | 5 | 0.11% | 21 | 0.10% |
| Roadway Feature | Intersection and Roundabout | 1 | 6275 | 56.08% | 2258 | 41.89% | 2948 | 64.39% | 11481 | 54.26% |
| | Other Roadway Features | 0 | 4915 | 43.92% | 3132 | 58.11% | 1630 | 35.61% | 9677 | 45.74% |
| Traffic and Speed Jurisdiction | | | | | | | | | | |
| Posted Speed Limit | 0–50 km/hr | 0 | 2300 | 20.55% | 2130 | 39.52% | 2222 | 48.54% | 6652 | 31.44% |
| | 60 km/hr | 1 | 5633 | 50.34% | 2887 | 53.56% | 1989 | 43.45% | 10509 | 49.67% |
| | 70–80 km/hr | 2 | 618 | 5.52% | 179 | 3.32% | 146 | 3.19% | 943 | 4.46% |
| | 80–100 km/hr | 3 | 1131 | 10.11% | 143 | 2.65% | 118 | 2.58% | 1392 | 6.58% |
| | 100–110 km/hr | 4 | 1508 | 13.48% | 51 | 0.95% | 103 | 2.25% | 1662 | 7.86% |
| Speeding Driving Factor | Crashes due to Speeding | 1 | 10662 | 95.28% | 5389 | 99.98% | 4552 | 99.43% | 20603 | 97.38% |
| | Crashes irrelevant to Speeding | 0 | 528 | 4.72% | 1 | 0.02% | 26 | 0.57% | 555 | 2.62% |
| Road User Characteristics | | | | | | | | | | |
| Age Group | 0 to 16 | 1 | 20 | 0.18% | 25 | 0.46% | 32 | 0.70% | 77 | 0.36% |
| | 17 to 24 | 2 | 159 | 1.42% | 850 | 15.77% | 1046 | 22.85% | 2055 | 9.71% |
| | 25 to 59 | 3 | 1970 | 17.61% | 647 | 12.00% | 772 | 16.86% | 3389 | 16.02% |
| | 60 to 75 | 4 | 7908 | 70.67% | 3268 | 60.63% | 1910 | 41.72% | 13086 | 61.85% |
| | 75 up | 5 | 1034 | 9.24% | 514 | 9.54% | 494 | 10.79% | 2042 | 9.65% |
| | unknown | 0 | 99 | 0.88% | 86 | 1.60% | 324 | 7.08% | 509 | 2.41% |
| Region | | | | | | | | | | |
| Road Region | Central Queensland | 0 | 910 | 8.13% | 295 | 5.47% | 270 | 5.90% | 1475 | 6.97% |
| | Downs South West | 1 | 522 | 4.66% | 177 | 3.28% | 214 | 4.67% | 913 | 4.32% |
| | Metropolitan | 2 | 3713 | 33.18% | 2146 | 39.81% | 1864 | 40.72% | 7723 | 36.50% |
| | North Coast and Wide Bay/Burnett | 3 | 2585 | 23.10% | 971 | 18.01% | 830 | 18.13% | 4386 | 20.73% |
| | North Queensland | 4 | 1341 | 11.98% | 757 | 14.04% | 562 | 12.28% | 2660 | 12.57% |
| | South Coast | 5 | 2119 | 18.94% | 1044 | 19.37% | 838 | 18.30% | 4001 | 18.91% |
| Area Remoteness | Inner Regional | 0 | 2628 | 23.49% | 696 | 12.91% | 750 | 16.38% | 4074 | 19.26% |
| | Major Cities | 1 | 6575 | 58.76% | 3866 | 71.73% | 3152 | 68.85% | 13593 | 64.25% |
| | Outer Regional and Remote Areas | 2 | 1987 | 17.76% | 828 | 15.36% | 676 | 14.77% | 3491 | 16.50% |
| Roadsection Authority | Locally controlled | 0 | 6603 | 59.01% | 4026 | 74.69% | 3400 | 74.27% | 14029 | 66.31% |
| | Not coded | 1 | 5 | 0.04% | 4 | 0.07% | 2 | 0.04% | 11 | 0.05% |
| | State controlled | 2 | 4582 | 40.95% | 1360 | 25.23% | 1176 | 25.69% | 7118 | 33.64% |
| Traffic Condition and Management | | | | | | | | | | |
| Vehicle Condition | Unrestrained | 1 | 10162 | 90.81% | 11 | 0.20% | 4398 | 96.07% | 19573 | 92.51% |
| | Unlicensed | 2 | 166 | 1.48% | 24 | 0.45% | 35 | 0.76% | 244 | 1.15% |
| | Unregistered | 3 | 366 | 3.27% | 71 | 1.32% | 46 | 1.00% | 556 | 2.63% |
| | Vehicle Defect | 4 | 489 | 4.37% | 0 | 0.00% | 90 | 1.97% | 771 | 3.64% |
| | None | 0 | 7 | 0.06% | 5284 | 98.03% | 9 | 0.20% | 14 | 0.07% |
| Driver Condition | Inattentive | 1 | 1874 | 16.75% | 726 | 13.47% | 3869 | 84.51% | 5122 | 24.21% |
| | Fatigued | 2 | 156 | 1.39% | 4 | 0.07% | 10 | 0.22% | 213 | 1.01% |
| | Controller Condition | 3 | 1062 | 9.49% | 253 | 4.69% | 357 | 7.80% | 1782 | 8.42% |
| | Worn Helmet | 4 | 1753 | 15.67% | 210 | 3.90% | 318 | 6.95% | 2712 | 12.82% |
| | None | 0 | 6345 | 56.70% | 4197 | 77.87% | 24 | 0.52% | 11329 | 53.54% |
| Traffic Control | Flashing amber lights (FL) | 0 | 1 | 0.01% | 1 | 0.02% | 1 | 0.02% | 3 | 0.01% |
| | Give way sign (GWS) | 1 | 1522 | 13.60% | 1405 | 26.07% | 196 | 4.28% | 3123 | 14.76% |
| | Miscellaneous (MC) | 2 | 0 | 0.00% | 0 | 0.00% | 2 | 0.04% | 2 | 0.01% |
| | No traffic control (NT) | 3 | 8133 | 72.68% | 3132 | 58.11% | 3147 | 68.74% | 14412 | 68.12% |
| | Operating traffic lights (OTL) | 4 | 1139 | 10.18% | 568 | 10.54% | 758 | 16.56% | 2465 | 11.65% |
| | Pedestrian crossing sign (PCS) | 5 | 25 | 0.22% | 81 | 1.50% | 253 | 5.53% | 359 | 1.70% |
| | Pedestrian operated lights (POL) | 6 | 5 | 0.04% | 12 | 0.22% | 95 | 2.08% | 112 | 0.53% |
| | Police (PL) | 7 | 18 | 0.16% | 2 | 0.04% | 19 | 0.42% | 39 | 0.18% |
| | Railway - lights and boom gate (RL&BG) | 8 | 5 | 0.04% | 6 | 0.11% | 2 | 0.04% | 13 | 0.06% |
| | Railway - lights only (RL) | 9 | 4 | 0.04% | 0 | 0.00% | 0 | 0.00% | 5 | 0.02% |
| | Railway crossing sign (RCS) | 10 | 0 | 0.00% | 1 | 0.02% | 1 | 0.02% | 1 | 0.00% |
| | Road/Rail worker (RW) | 11 | 8 | 0.07% | 4 | 0.07% | 40 | 0.87% | 52 | 0.25% |
| | School crossing - flags (SCF) | 12 | 0 | 0.00% | 0 | 0.00% | 1 | 0.02% | 1 | 0.00% |
| | Stop sign (SS) | 13 | 329 | 2.94% | 177 | 3.28% | 55 | 1.20% | 561 | 2.65% |
| | Supervised school crossing (SSC) | 14 | 1 | 0.01% | 1 | 0.02% | 8 | 0.17% | 10 | 0.05% |
| Traffic Law Impairment | | | | | | | | | | |
| Disobey Road Rule | All driver | 1 | 5680 | 50.76% | 2897 | 53.75% | 3439 | 75.12% | 11438 | 54.06% |
| | Traffic Driver | 2 | 225 | 2.01% | 56 | 1.04% | 76 | 1.66% | 394 | 1.86% |
| | No Giveaway | 3 | 2218 | 19.82% | 1844 | 34.21% | 566 | 12.36% | 4569 | 21.59% |
| | Other road rule violation | 4 | 2958 | 26.43% | 592 | 10.98% | 495 | 10.81% | 4609 | 21.78% |
| | None | 0 | 109 | 0.97% | 1 | 0.02% | 2 | 0.04% | 148 | 0.70% |
| Drink Drug Alcohol Related | Alcohol Drug Related | 0 | 9873 | 88.23% | 5206 | 96.59% | 3927 | 85.78% | 18865 | 89.16% |
| | Drink Driving | 1 | 1220 | 10.90% | 184 | 3.41% | 309 | 6.75% | 1947 | 9.20% |
| | Alcohol Impaired Pedestrian | 2 | 97 | 0.87% | 0 | 0.00% | 342 | 7.47% | 346 | 1.64% |
| Classification Target | | | | | | | | | | |
| Crash Severity | High Crash Severity | 1 | 3987 | 35.63% | 2909 | 53.97% | 1876 | 40.98% | 8772 | 41.46% |
| | Low Crash Severity | 0 | 7203 | 64.37% | 2481 | 46.03% | 2702 | 59.02% | 12386 | 58.54% |
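
The Code column of the data summary indicates that each categorical level is mapped to an integer before modelling. The sketch below illustrates that kind of encoding for three of the features, with the level-to-code pairs copied from the table. The record keys (`area_remoteness`, `roadway_feature`, `crash_severity`) and the helper `encode_record` are hypothetical names chosen for illustration; they are not taken from the original data set or the authors' code.

```python
# Level-to-code mappings copied from the Data summary table.
AREA_REMOTENESS = {
    "Inner Regional": 0,
    "Major Cities": 1,
    "Outer Regional and Remote Areas": 2,
}
ROADWAY_FEATURE = {
    "Other Roadway Features": 0,
    "Intersection and Roundabout": 1,
}
CRASH_SEVERITY = {"Low Crash Severity": 0, "High Crash Severity": 1}

# Hypothetical record layout: these key names are illustrative only.
ENCODERS = {
    "area_remoteness": AREA_REMOTENESS,
    "roadway_feature": ROADWAY_FEATURE,
    "crash_severity": CRASH_SEVERITY,
}


def encode_record(record):
    """Replace each categorical level in `record` with its integer code."""
    return {key: ENCODERS[key][level] for key, level in record.items()}


example = {
    "area_remoteness": "Major Cities",
    "roadway_feature": "Intersection and Roundabout",
    "crash_severity": "Low Crash Severity",
}
encoded = encode_record(example)
```

An unknown level raises a `KeyError` here, which is one simple way to catch labels that do not appear in the coding scheme.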
Fig 1. Hyperparameter tuning in KNN.
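
The figure itself is not reproduced here, and the search procedure used in the paper is not described in this excerpt. As a rough illustration of what tuning the number of neighbours can look like, the sketch below scores several candidate values of k on a held-out validation set and keeps the best one. The hold-out scheme, the toy points, and the function names (`knn_predict`, `tune_k`) are all assumptions made for this example.

```python
from collections import Counter


def knn_predict(train_X, train_y, x, k):
    """Predict the majority label among the k nearest training points."""
    # Squared Euclidean distance is sufficient for ranking neighbours.
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), label)
        for row, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


def accuracy(train_X, train_y, val_X, val_y, k):
    """Fraction of validation points classified correctly for a given k."""
    hits = sum(
        knn_predict(train_X, train_y, x, k) == y for x, y in zip(val_X, val_y)
    )
    return hits / len(val_y)


def tune_k(train_X, train_y, val_X, val_y, candidates):
    """Return the best k (smallest k wins ties) and all validation scores."""
    scores = {k: accuracy(train_X, train_y, val_X, val_y, k) for k in candidates}
    best_k = max(scores, key=lambda k: (scores[k], -k))
    return best_k, scores


# Toy data (hypothetical): two clusters plus one mislabelled point at (0.5, 0.5).
train_X = [(0, 0), (0, 1), (1, 0), (1, 1), (3, 3), (3, 4), (4, 3), (4, 4), (0.5, 0.5)]
train_y = [0, 0, 0, 0, 1, 1, 1, 1, 1]
val_X = [(0.4, 0.6), (0.2, 0.2), (3.5, 3.5), (3.8, 3.2)]
val_y = [0, 0, 1, 1]

best_k, scores = tune_k(train_X, train_y, val_X, val_y, candidates=[1, 3, 5])
```

With k = 1 the mislabelled training point misleads one validation query, while larger odd values of k outvote it; odd candidates also avoid tied votes in a two-class problem.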
Performance of classification models for crash severity levels.
| Performance Metrics | Random Forest | Support Vector Machine | K-Nearest Neighbour |
|---|---|---|---|
| Accuracy | 68.38% | 65.79% | |
| F1 Score | 80.25% | 78.24% | 77.59% |
| Sensitivity (True Positive Rate) | 94.53% | 94.12% | 92.81% |
| Specificity (True Negative Rate) | 29.78% | 17.51% | 13.53% |
| Precision Score | 70% | 67% | 67.45% |
| AUC Score | 0.74 | 0.70 | 0.67 |

| Performance Metrics | Random Forest | Support Vector Machine | K-Nearest Neighbour |
|---|---|---|---|
| Accuracy | 60.25% | 58.21% | |
| F1 Score | 67.15% | 47.69% | 37.88% |
| Sensitivity (True Positive Rate) | 53.53% | 40.95% | 30.22% |
| Specificity (True Negative Rate) | 70.85% | 75.12% | 77.36% |
| Precision Score | 75.87% | 54.23% | 52.33% |
| AUC Score | 0.66 | 0.62 | 0.60 |

| Performance Metrics | Random Forest | Support Vector Machine | K-Nearest Neighbour |
|---|---|---|---|
| Accuracy | 63.28% | 61.75% | |
| F1 Score | 79.35% | 76.49% | 69.2% |
| Sensitivity (True Positive Rate) | 92.66% | 98.12% | 88.56% |
| Specificity (True Negative Rate) | 27.38% | 12.22% | 19.38% |
| Precision Score | 61.67% | 63.5% | 63.7% |
| AUC Score | 0.68 | 0.64 | 0.65 |

| Performance Metrics | Random Forest | Support Vector Machine | K-Nearest Neighbour |
|---|---|---|---|
| Accuracy | 65.59% | 62.56% | |
| F1 Score | 75.35% | 73.67% | 70.72% |
| Sensitivity (True Positive Rate) | 83.56% | 82.23% | 75.98% |
| Specificity (True Negative Rate) | 45.28% | 38.32% | 40.21% |
| Precision Score | 69.37% | 66.23% | 66.31% |
| AUC Score | 0.70 | 0.67 | 0.64 |
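
For readers who want to relate the metrics in these tables to one another, the sketch below computes accuracy, sensitivity, specificity, precision and F1 from the four cells of a binary confusion matrix using their standard definitions. The counts are made up for illustration and are not taken from the study; the function name `classification_metrics` is likewise hypothetical.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)            # true positive rate (recall)
    specificity = tn / (tn + fp)            # true negative rate
    precision = tp / (tp + fp)              # share of predicted positives that are correct
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    # F1 is the harmonic mean of precision and sensitivity.
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
m = classification_metrics(tp=80, fp=30, tn=70, fn=20)
```

A classifier can combine high sensitivity with low specificity when it labels most cases as positive, which is why the tables report both rates alongside accuracy.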
Fig 2. Comparison of performance measures of classifiers for different VRU models.
Fig 3. ROC curves of different classifiers.
(A) Motorcyclists (B) Bicyclists (C) Pedestrians (D) VRUs.
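
As background for reading the ROC curves and the AUC scores reported above, the following sketch builds ROC points by sweeping a decision threshold over predicted scores and then integrates the curve with the trapezoidal rule. It is a generic illustration, not the authors' plotting code; the labels, scores and function names are invented for the example.

```python
def roc_points(labels, scores):
    """Return (FPR, TPR) pairs obtained by sweeping the decision threshold."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]
    # Lower the threshold step by step; a case is predicted positive if score >= t.
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= t and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= t and y == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc_trapezoid(points):
    """Area under a piecewise-linear curve given as ordered (x, y) points."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


# Hypothetical labels (1 = positive class) and classifier scores.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]

points = roc_points(labels, scores)
auc = auc_trapezoid(points)
```

An AUC of 0.5 corresponds to the diagonal of the ROC plot (no discrimination), and 1.0 to a curve that passes through the top-left corner.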
Fig 4. Random forest-based feature ranking for QLD VRU.
Fig 5. Partial dependency plots of different features with respect to VRU crash severity.
(A) Age Group (B) Disobey Road Rule (C) Drink, Drug and Alcohol Related (D) Area Remoteness (E) Day of Week (F) Driver Condition.
Fig 7. Partial dependency plots of different features with respect to VRU crash severity.
(M) Year (N) Speed Limit (O) Vehicle Condition (P) Traffic Control (Q) Speed Driving.
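
The partial dependency plots are not reproduced here. As a reminder of how such a curve is usually computed, the sketch below fixes one feature at each value of a grid, averages the model's predictions over all rows of the data, and returns one averaged value per grid point. The stand-in model `toy_model`, its coefficients, and the two-column rows are hypothetical; in the paper the predictions come from the trained Random Forest.

```python
def partial_dependence(predict, rows, feature_index, grid):
    """Average prediction over `rows` with one feature forced to each grid value."""
    curve = []
    for value in grid:
        total = 0.0
        for row in rows:
            modified = list(row)
            modified[feature_index] = value   # overwrite only the feature of interest
            total += predict(modified)
        curve.append(total / len(rows))
    return curve


def toy_model(row):
    """Hypothetical severity score from a speed-limit code and an age-group code."""
    speed_code, age_code = row
    return 0.2 + 0.15 * speed_code + 0.05 * age_code


# Hypothetical encoded rows: (speed-limit code, age-group code).
rows = [(0, 1), (1, 3), (2, 4), (4, 2)]
pd_speed = partial_dependence(toy_model, rows, feature_index=0, grid=[0, 1, 2, 3, 4])
```

Because the other features keep their observed values, the resulting curve shows the average effect of the chosen feature on the predicted severity rather than its effect for any single crash.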