Xiuguang Song, Rendong Pi, Yu Zhang, Jianqing Wu, Yuhuan Dong, Han Zhang, Xinyuan Zhu.
Abstract
Multi-vehicle (MV) crashes, which can cause great damage to society, have always been a serious issue for traffic safety. A better understanding of crash severity can help transportation engineers identify the critical contributing factors and find effective countermeasures to improve transportation safety. However, studies that use machine learning methods to predict the injury severity of MV crashes are rare, and previous studies have rarely considered temporal stability in MV crashes. To bridge these knowledge gaps, two kinds of models, a random parameters logit (RPL) model with heterogeneity in the means and variances and a Random Forest (RF) model, were employed in this research to identify the critical contributing factors and to predict MV crash injury severity. Three years (2016-2018) of MV crash data from Washington, United States, extracted from the Highway Safety Information System (HSIS), were used for the crash injury-severity analysis. In addition, a series of likelihood ratio tests was conducted to assess temporal stability between different years. Four indicators were employed to measure the prediction performance of the selected models, and four categories of crash-related characteristics were specifically investigated based on the RPL model. The results showed that the machine learning-based model performed better than the statistical model when overall accuracy was taken as the evaluation indicator. However, the statistical model predicted better than the machine learning model when crash costs were considered. Temporal instabilities were present between the 2016 and 2017 MV data. The effects of the significant factors were elaborated based on the RPL model with heterogeneity in the means and variances.
Keywords: crash costs; machine learning; multi-vehicle crash; statistical model; unobserved heterogeneity
Year: 2021 PMID: 34063528 PMCID: PMC8157156 DOI: 10.3390/ijerph18105271
Source DB: PubMed Journal: Int J Environ Res Public Health ISSN: 1660-4601 Impact factor: 3.390
Figure 1. Flowchart of data processing.
Figure 2. The construction of the Random Forest (RF) model.
Confusion matrix (error matrix) of an injury-severity level prediction model.
| Actual Injury-Severity Level | Predicted: Property Damage Only (PDO) | Predicted: Injury (I) | Predicted: Fatal Injury (FI) | Actual Number of Crashes |
|---|---|---|---|---|
| Property Damage Only (PDO) | | | | |
| Injury (I) | | | | |
| Fatal Injury (FI) | | | | |
Note: P, the prediction result; R, the ratio of the prediction result to the actual number of crashes; N, the actual number of crashes.
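The relationship between P, R, and N in the note can be illustrated with a minimal sketch. It assumes that N is the sum of one actual-severity row and that R is each predicted count divided by that N; the function name `row_ratios` is hypothetical. The example row uses the RPL counts for crashes whose actual level is PDO, taken from the error matrix reported in this article.

```python
def row_ratios(predicted_counts):
    """Return (N, [R...]) for one actual-severity row of a confusion matrix.

    Assumption: N is the row total and R = P / N for each predicted level.
    """
    n_actual = sum(predicted_counts)
    if n_actual == 0:
        # No crashes of this actual level: report zero ratios instead of dividing by zero.
        return 0, [0.0 for _ in predicted_counts]
    return n_actual, [p / n_actual for p in predicted_counts]


# RPL predictions (PDO, I, F) for crashes whose actual level is PDO.
n, ratios = row_ratios([1881, 918, 3])
print(n, [f"{r:.2%}" for r in ratios])
# prints: 2802 ['67.13%', '32.76%', '0.11%']
```

Under this reading, the printed ratios agree with the percentages listed for that row in the error matrix.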
2017 comprehensive crash unit cost based on injury-severity level (USD).
| Injury-Severity Level | Economic Crash Costs | QALY Crash Unit Costs | Comprehensive Crash Unit Cost |
|---|---|---|---|
| Property Damage Only (PDO) | 12,456 | 0 | 12,456 |
| Injury (I) | 46,132 | 97,535 | 143,667 |
| Fatal Injury (FI) | 588,738 | 3,173,900 | 3,762,638 |
Note: QALY, quality-adjusted life years.
Likelihood ratio test results between different years.
| | 2016 | 2017 | 2018 |
|---|---|---|---|
| 2016 | - | 35.06 (9) | 32.02 (10) |
| 2017 | 7.48 (13) | - | 3.56 (10) |
| 2018 | 8.94 (13) | 2.84 (9) | - |
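
One way to read the table is sketched below. It assumes that the number in parentheses is the degrees of freedom of each test and that stability is rejected at the 5% level; the source states neither, so both are assumptions. The critical values are standard chi-square table values, and the names `CHI2_CRIT_05`, `LR_TESTS`, and `rejects_stability` are hypothetical.

```python
# Upper 5% critical values of the chi-square distribution (standard table values).
CHI2_CRIT_05 = {9: 16.919, 10: 18.307, 13: 22.362}

# (row year, column year) -> (test statistic, assumed degrees of freedom)
LR_TESTS = {
    (2016, 2017): (35.06, 9),
    (2016, 2018): (32.02, 10),
    (2017, 2016): (7.48, 13),
    (2017, 2018): (3.56, 10),
    (2018, 2016): (8.94, 13),
    (2018, 2017): (2.84, 9),
}


def rejects_stability(stat, dof):
    """True when the statistic exceeds the assumed 5% critical value."""
    return stat > CHI2_CRIT_05[dof]


unstable = sorted(
    pair for pair, (stat, dof) in LR_TESTS.items() if rejects_stability(stat, dof)
)
print(unstable)
# prints: [(2016, 2017), (2016, 2018)]
```

Under these assumptions only the statistics in the 2016 row exceed the threshold; a different significance level would need its own critical values.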
Figure 3. The learning curve of the RF model.
Comparison of model prediction based on four indicators.
| Indicator | Statistical Method (RPL) | Machine Learning Method (RF) |
|---|---|---|
| Roverall | 56.59% | 67.16% |
| OPMAE | 2143 | 14,076 |
| OPAPE | 3.61% | 27.12% |
| OPRMSE | 137 | 895 |
| POCC | 252 | 153 |
| AOCC | 243 | 209 |
Note: Roverall, the overall correct prediction ratio; OPMAE, the overall prediction mean absolute error; OPAPE, the overall prediction absolute percentage error; OPRMSE, the overall prediction root-mean-squared error; POCC, the predicted overall costs of crashes; AOCC, the actual overall costs of crashes.
Error matrix for injury-severity prediction model.
| Injury-Severity Level | Method | Property Damage Only (PDO) | Injury (I) | Fatal (F) |
|---|---|---|---|---|
| Property Damage Only (PDO) | RPL | 1881 (67.13%) | 918 (32.76%) | 3 (0.11%) |
| Property Damage Only (PDO) | RF | 2408 (85.14%) | 419 (14.81%) | 1 (0.03%) |
| Injury (I) | RPL | 850 (65.74%) | 440 (34.03%) | 3 (0.23%) |
| Injury (I) | RF | 909 (74.75%) | 306 (25.16%) | 1 (0.08%) |
| Fatal (F) | RPL | 4 (66.67%) | 2 (33.33%) | 0 (0%) |
| Fatal (F) | RF | 0 (0%) | 0 (0%) | 0 (0%) |
Note: RPL, random parameters logit model with heterogeneity in means and variances; RF, random forest model.
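The cost-based indicators can be recomputed from the error matrix and the 2017 comprehensive unit costs. The sketch below is not the authors' code. It assumes that rows are actual levels and columns are predicted levels, that POCC and AOCC weight the predicted and actual counts by unit cost, and that OPMAE and OPRMSE divide the absolute cost gap by N and by the square root of N. The last two definitions are guesses and are not given in the source.

```python
import math

# Comprehensive unit costs in USD, ordered PDO, Injury, Fatal (from the cost table).
UNIT_COST = [12_456, 143_667, 3_762_638]

# Assumption: rows are actual levels, columns are predicted levels.
ERROR_MATRIX = {
    "RPL": [[1881, 918, 3], [850, 440, 3], [4, 2, 0]],
    "RF": [[2408, 419, 1], [909, 306, 1], [0, 0, 0]],
}


def indicators(matrix, unit_cost=UNIT_COST):
    """Compute accuracy and cost-based indicators for one error matrix."""
    n = sum(map(sum, matrix))
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    actual_counts = [sum(row) for row in matrix]
    predicted_counts = [sum(col) for col in zip(*matrix)]
    aocc = sum(c * u for c, u in zip(actual_counts, unit_cost))
    pocc = sum(c * u for c, u in zip(predicted_counts, unit_cost))
    gap = abs(pocc - aocc)
    return {
        "Roverall": correct / n,
        "OPMAE": gap / n,              # assumed definition
        "OPAPE": gap / aocc,
        "OPRMSE": gap / math.sqrt(n),  # assumed definition
        "POCC": pocc,
        "AOCC": aocc,
    }


for name, matrix in ERROR_MATRIX.items():
    print(name, indicators(matrix))
```

With these assumed definitions, the cost-based results appear consistent with the comparison table once truncated: POCC and AOCC in millions of USD, OPRMSE in thousands of USD, and OPMAE in USD per crash. The computed overall accuracy is close to the reported value for RPL and slightly lower for RF.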
Figure 4. The measurements for road safety. (a) The relationship between crash-related factors and measurements, (b) better not sit in the front seat, and (c) try to take public transportation on weekends.
| Explanatory Variable | PDO | PDO (%) | Injury | Injury (%) | Fatal | Fatal (%) | Total | Total (%) |
|---|---|---|---|---|---|---|---|---|
| | 9287 | 68.90% | 4180 | 31.01% | 11 | 0.08% | 13,478 | |
| | | | | | | | | |
| Driver gender | ||||||||
| Male driver | 5512 | 40.90% | 2280 | 16.92% | 5 | 0.04% | 7797 | 57.85% |
| Female driver | 3775 | 28.01% | 1900 | 14.10% | 6 | 0.04% | 5681 | 42.15% |
| Driver age | ||||||||
| Young driver | 1458 | 10.82% | 628 | 4.66% | 3 | 0.02% | 2089 | 15.50% |
| Middle-aged driver | 6718 | 49.84% | 2951 | 21.89% | 5 | 0.04% | 9674 | 71.78% |
| Elder driver | 1111 | 8.24% | 601 | 4.46% | 3 | 0.02% | 1715 | 12.72% |
| Driver restraint | ||||||||
| No restraints used | 33 | 0.24% | 19 | 0.14% | 3 | 0.02% | 55 | 0.41% |
| Lap belt/shoulder or other restraints used | 9254 | 68.66% | 4161 | 30.87% | 8 | 0.06% | 13,423 | 99.59% |
| Driver mistake action | ||||||||
| Skidding involved | 175 | 1.30% | 60 | 0.45% | 2 | 0.01% | 237 | 1.76% |
| Avoiding maneuvers | 156 | 1.16% | 47 | 0.35% | 0 | 0.00% | 203 | 1.51% |
| Sudden slowing maneuvers | 4089 | 30.34% | 1505 | 11.17% | 1 | 0.01% | 5595 | 41.51% |
| Stopped vehicle | 4145 | 30.75% | 2061 | 15.29% | 2 | 0.01% | 6208 | 46.06% |
| | | | | | | | | |
| Carry hazardous material | ||||||||
| Yes | 0 | 0.01% | 2 | 0.04% | 4 | 0.00% | 6 | 0.04% |
| No | 5417 | 40.21% | 8055 | 59.79% | 0 | 0.00% | 13,472 | 99.96% |
| | | | | | | | | |
| Roadway classification | ||||||||
| Urban freeways | 5463 | 40.53% | 2337 | 17.34% | 3 | 0.02% | 7803 | 57.89% |
| Urban multilane roads | 2607 | 19.34% | 1272 | 9.44% | 0 | 0.00% | 3879 | 28.78% |
| Rural freeways | 562 | 4.17% | 232 | 1.72% | 3 | 0.02% | 797 | 5.91% |
| Rural multilane roads | 655 | 4.86% | 339 | 2.52% | 5 | 0.04% | 999 | 7.41% |
| Road characteristics | ||||||||
| Straight | 8589 | 63.73% | 3845 | 28.53% | 10 | 0.07% | 12,444 | 92.33% |
| Curve | 698 | 5.18% | 335 | 2.49% | 1 | 0.01% | 1034 | 7.67% |
| Federal function class | ||||||||
| Rural collector | 1221 | 9.06% | 573 | 4.25% | 8 | 0.06% | 1802 | 13.37% |
| Urban collector | 8066 | 59.85% | 3607 | 26.76% | 3 | 0.02% | 11,676 | 86.63% |
| Road surface type | ||||||||
| Portland concrete cement | 2440 | 18.10% | 1006 | 7.46% | 0 | 0.00% | 3446 | 25.57% |
| Asphalt concrete | 6847 | 50.80% | 3171 | 23.53% | 11 | 0.08% | 10,029 | 74.41% |
| Brick/gravel/dirt | 0 | 0.00% | 3 | 0.02% | 0 | 0.00% | 3 | 0.02% |
| | | | | | | | | |
| Day of week | ||||||||
| Non-weekend | 6038 | 44.80% | 2796 | 20.74% | 8 | 0.06% | 8842 | 65.60% |
| Weekend | 3249 | 24.11% | 1384 | 10.27% | 3 | 0.02% | 4636 | 34.40% |
| Location of the crash | ||||||||
| Intersection-related | 2316 | 17.18% | 1135 | 8.42% | 3 | 0.02% | 3454 | 25.63% |
| Driveway-related | 279 | 2.07% | 177 | 1.31% | 1 | 0.01% | 457 | 3.39% |
| Not at intersection or driveway | 6692 | 49.65% | 2868 | 21.28% | 7 | 0.05% | 9567 | 70.98% |
| Weather | ||||||||
| Clear | 7354 | 54.56% | 3410 | 25.30% | 5 | 0.04% | 10,769 | 79.90% |
| Cloudy | 1689 | 12.53% | 657 | 4.87% | 4 | 0.03% | 2350 | 17.44% |
| Raining/snowing | 154 | 1.14% | 69 | 0.51% | 0 | 0.00% | 223 | 1.65% |
| Fog/wind/other | 90 | 0.67% | 44 | 0.33% | 2 | 0.01% | 136 | 1.00% |
| Light condition | ||||||||
| Daylight | 7233 | 53.67% | 3261 | 24.19% | 7 | 0.05% | 10,501 | 77.91% |
| Dusk-dawn | 319 | 2.37% | 137 | 1.02% | 0 | 0.00% | 456 | 3.38% |
| Dark, light on | 1274 | 9.45% | 594 | 4.41% | 2 | 0.01% | 1870 | 13.87% |
| Dark, light off | 461 | 3.42% | 188 | 1.39% | 2 | 0.01% | 651 | 4.83% |
| Roadway surface | ||||||||
| Dry | 6722 | 49.87% | 3104 | 23.03% | 4 | 0.03% | 9830 | 72.93% |
| Wet/snow/slush/ice | 2538 | 18.83% | 1060 | 7.86% | 7 | 0.05% | 3605 | 26.75% |
| Other | 27 | 0.20% | 16 | 0.12% | 0 | 0.00% | 43 | 0.32% |
| | | | | | | | | |
| Age | ||||||||
| Young passenger | 5019 | 37.24% | 1951 | 14.48% | 4 | 0.03% | 6974 | 51.74% |
| Middle-aged passenger | 3352 | 24.87% | 1654 | 12.27% | 5 | 0.04% | 5011 | 37.18% |
| Elder passenger | 916 | 6.80% | 575 | 4.27% | 2 | 0.01% | 1493 | 11.08% |
| Gender | ||||||||
| Male | 4115 | 30.53% | 1572 | 11.66% | 5 | 0.04% | 5692 | 42.23% |
| Female | 5172 | 38.37% | 2608 | 19.35% | 6 | 0.04% | 7786 | 57.77% |
| Seat position | ||||||||
| First row | 4934 | 36.61% | 2499 | 18.54% | 6 | 0.04% | 7439 | 55.19% |
| Second row | 1237 | 9.18% | 446 | 3.31% | 0 | 0.00% | 1683 | 12.49% |
| Third row | 3116 | 23.12% | 1235 | 9.16% | 5 | 0.04% | 4356 | 32.32% |
| Eject | ||||||||
| Not ejected | 9281 | 68.86% | 4175 | 30.98% | 7 | 0.05% | 13,463 | 99.89% |
| Ejected | 6 | 0.04% | 5 | 0.04% | 4 | 0.03% | 15 | 0.11% |
| Occupant Restraint | ||||||||
| No restraints used | 34 | 0.25% | 33 | 0.24% | 3 | 0.02% | 70 | 0.52% |
| Lap belt/shoulder or other used | 9253 | 68.65% | 4147 | 30.77% | 8 | 0.06% | 13,408 | 99.48% |
Estimation results of the random parameters logit model (with heterogeneity in means and variances).
| Variable | Parameter Estimate | z-Stat |
|---|---|---|
| Constant (PDO) | 7.0652 | 15.68 |
| Constant (I) | 5.4921 | 11.68 |
| Driver characteristics | ||
| Old-aged driver (1 if driver is older than 60 years old; 0 otherwise) (PDO) | −1.3907 | −3.66 |
| Middle-aged driver (1 if driver is between 25 and 60 years old; 0 otherwise) (PDO) | 1.4329 | 3.34 |
| Male driver (1 if the gender of driver is male; 0 otherwise) (PDO) | 0.7133 | −3.44 |
| Sudden slowing maneuvers (1 if the driver mistake action is sudden slowing maneuvers; 0 otherwise) (FI) | −2.0871 | −1.68 |
| Road characteristics | ||
| Wet/snow/slush/ice road surface (1 if the road surface is wet/snow/slush/ice; 0 otherwise) (PDO) | 0.2841 | 2.09 |
| Rural freeways (1 if the road classification is rural freeways; 0 otherwise) (FI) | 1.8023 | 2.21 |
| Crash characteristics | ||
| Not at intersection or driveway (1 if the crash occurred not at intersection or driveway; 0 otherwise) (PDO) | 0.2232 | 1.73 |
| Weekend (1 if weekend; 0 otherwise) (I) | −0.1791 | −1.56 |
| Occupant characteristics | ||
| Male occupant (1 if the gender of occupant is male; 0 otherwise) (PDO) | −0.5782 | −2.39 |
| Old-aged occupant (1 if occupant is older than 60 years old; 0 otherwise) (PDO) | −0.8212 | −2.30 |
| Ejected (1 if occupant is ejected; 0 otherwise) (PDO) | −4.2151 | −4.30 |
| Second row (1 if the occupant seated in second row; 0 otherwise) (I) | −0.4940 | −2.29 |
| Random parameters | ||
| Occupant restraints (1 if occupant’s safety equipment is used; 0 otherwise) (I) | −1.5017 | −2.56 |
| Standard deviation of “Occupant restraints” (I) | 4.5840 | 3.45 |
| Male driver (1 if the gender of driver is male; 0 otherwise) (I) | 0.6905 | 2.38 |
| Standard deviation of “Male driver” (I) | 3.1585 | 2.66 |
| Heterogeneity in the mean of the random parameters | ||
| Occupant restraints (I): Sudden slowing maneuvers | −0.5543 | −2.83 |
| Male driver (I): Sudden slowing maneuvers | −0.8786 | −2.54 |
| Heterogeneity in the variances of the random parameters | ||
| Occupant restraints (I): Middle-aged driver | −0.4272 | −2.19 |
| Model statistics | - | - |
| Number of observations | 13,478 | - |
| AIC | 16,593 | - |
| BIC | 16,743 | - |
| McFadden pseudo-R² | 0.44 | - |
PDO, Property Damage Only; I, Injury; FI, Fatal Injury.
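How a random parameter enters the outcome probabilities can be sketched with a simulated logit calculation. This is a simplified illustration and not the estimated model. It uses only the two constants, one fixed parameter, and the "Occupant restraints" random parameter from the table. It assumes a normal distribution for that parameter and a fatal-injury utility normalized to zero, and it ignores the heterogeneity-in-means and heterogeneity-in-variances terms. The function names are hypothetical.

```python
import math
import random

# Point estimates copied from the results table (a subset only).
CONST_PDO, CONST_I = 7.0652, 5.4921
WET_SURFACE_PDO = 0.2841                             # fixed parameter, PDO utility
RESTRAINT_I_MEAN, RESTRAINT_I_SD = -1.5017, 4.5840   # random parameter, injury utility


def softmax(utilities):
    """Logit choice probabilities, shifted by the maximum for numerical stability."""
    m = max(utilities)
    exps = [math.exp(u - m) for u in utilities]
    total = sum(exps)
    return [e / total for e in exps]


def simulated_probs(wet_surface, restraint_used, n_draws=2000, seed=0):
    """Average logit probabilities (PDO, I, FI) over draws of the random parameter."""
    rng = random.Random(seed)
    sums = [0.0, 0.0, 0.0]
    for _ in range(n_draws):
        # Assumption: the random parameter is normally distributed.
        beta = rng.gauss(RESTRAINT_I_MEAN, RESTRAINT_I_SD)
        u_pdo = CONST_PDO + WET_SURFACE_PDO * wet_surface
        u_inj = CONST_I + beta * restraint_used
        u_fatal = 0.0  # assumption: fatal utility normalized to zero here
        for k, p in enumerate(softmax([u_pdo, u_inj, u_fatal])):
            sums[k] += p
    return [s / n_draws for s in sums]


print(simulated_probs(wet_surface=0, restraint_used=0))
print(simulated_probs(wet_surface=0, restraint_used=1))
```

When `restraint_used` is 0 the draws have no effect and the result reduces to an ordinary logit on the constants. When it is 1, the large standard deviation relative to the mean spreads the injury utility widely across draws, which is what the random parameter is meant to capture.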