Xin Ye, Ke Wang, Yajie Zou, Dominique Lord.
Abstract
This paper develops a semi-nonparametric Poisson regression model to analyze motor vehicle crash frequency data collected from rural multilane highway segments in California, US. Crash frequency on rural highways is of particular interest in transportation safety because of higher driving speeds and the resulting crash severity. Unlike the traditional Negative Binomial (NB) model, the semi-nonparametric Poisson regression model can accommodate unobserved heterogeneity that follows a highly flexible semi-nonparametric (SNP) distribution. Simulation experiments demonstrate that the SNP distribution can closely mimic a large family of distributions, including normal, log-gamma, bimodal, and trimodal distributions. Empirical estimation results show that the flexibility offered by the SNP distribution can greatly improve model precision and overall goodness-of-fit. The SNP distribution can also provide a better understanding of crash data structure through its ability to capture potential multimodality in the distribution of unobserved heterogeneity. When the estimated coefficients of the empirical models are compared, the SNP and NB models are found to have substantially different coefficients for the dummy variable indicating lane width. The SNP model, which has better statistical performance, suggests that the NB model overestimates the effect of lane width on crash frequency reduction by 83.1%.
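The abstract does not state the functional form of the SNP distribution. A minimal sketch, assuming the common squared-polynomial form (a polynomial in the heterogeneity term, squared, multiplied by the standard normal density, and normalized to integrate to one), is given below. The function names `normal_moment` and `snp_density` are hypothetical, and the coefficients in the usage line are the a0 to a3 estimates reported in the crash frequency table of this page.

```python
import numpy as np


def normal_moment(n):
    """E[Z^n] for Z ~ N(0, 1): zero for odd n, (n - 1)!! for even n."""
    if n % 2 == 1:
        return 0.0
    result = 1.0
    for k in range(n - 1, 0, -2):
        result *= k
    return result


def snp_density(eps, a):
    """Assumed SNP density: (sum_k a_k eps^k)^2 * phi(eps) / normalizing constant."""
    a = np.asarray(a, dtype=float)
    eps = np.asarray(eps, dtype=float)
    # Polynomial a_0 + a_1*eps + ... + a_K*eps^K (coefficients in ascending order).
    poly = np.polynomial.polynomial.polyval(eps, a)
    phi = np.exp(-0.5 * eps**2) / np.sqrt(2.0 * np.pi)
    # Normalizing constant: sum_i sum_j a_i a_j E[Z^(i+j)].
    norm = sum(
        a[i] * a[j] * normal_moment(i + j)
        for i in range(len(a))
        for j in range(len(a))
    )
    return poly**2 * phi / norm


# Usage: evaluate the assumed density with the coefficients from the crash frequency model.
grid = np.linspace(-4.0, 4.0, 9)
print(snp_density(grid, [1.0, -0.3242, -0.1714, 0.0408]))
```

Squaring the polynomial keeps the density non-negative, and fixing a0 = 1, as the tables do, removes the scale indeterminacy that the normalization would otherwise introduce.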
Year: 2018 PMID: 29791481 PMCID: PMC5965849 DOI: 10.1371/journal.pone.0197338
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Node and weight values in Gauss–Hermite quadrature (30 supporting points).
| Node | Weight |
|---|---|
| -6.86335 | 0.834247 |
| -6.13828 | 0.649098 |
| -5.53315 | 0.569403 |
| -4.98892 | 0.522526 |
| -4.48306 | 0.491058 |
| -4.00391 | 0.468375 |
| -3.54444 | 0.451321 |
| -3.09997 | 0.438177 |
| -2.66713 | 0.427918 |
| -2.24339 | 0.419895 |
| -1.82674 | 0.413679 |
| -1.41553 | 0.408982 |
| -1.00834 | 0.405605 |
| -0.60392 | 0.40342 |
| -0.20113 | 0.402346 |
| 0.201129 | 0.402346 |
| 0.603921 | 0.40342 |
| 1.00834 | 0.405605 |
| 1.41553 | 0.408982 |
| 1.82674 | 0.413679 |
| 2.24339 | 0.419895 |
| 2.66713 | 0.427918 |
| 3.09997 | 0.438177 |
| 3.54444 | 0.451321 |
| 4.00391 | 0.468375 |
| 4.48306 | 0.491058 |
| 4.98892 | 0.522526 |
| 5.53315 | 0.569403 |
| 6.13828 | 0.649098 |
| 6.86335 | 0.834247 |
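The tabulated weights appear to be the standard Gauss-Hermite weights multiplied by exp(x²) at each node, so that an integral of f(x) over the real line can be approximated directly as the weighted sum of f at the nodes. The sketch below regenerates the values under that assumption using NumPy's `hermgauss`; the helper name `scaled_gauss_hermite` is hypothetical.

```python
import numpy as np


def scaled_gauss_hermite(n_points=30):
    """Return Gauss-Hermite nodes and weights rescaled by exp(x^2).

    hermgauss gives weights w_i for integrals of exp(-x^2) * g(x).
    Multiplying by exp(x_i^2) yields weights for integrals of f(x) itself.
    """
    nodes, weights = np.polynomial.hermite.hermgauss(n_points)
    return nodes, weights * np.exp(nodes**2)


nodes, scaled = scaled_gauss_hermite(30)
# Print the first few node/weight pairs for comparison with the table.
for x, w in zip(nodes[:3], scaled[:3]):
    print(f"{x:.5f}  {w:.6f}")
```

The nodes are returned in ascending order, which matches the ordering of the table.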
Comparison between NB and SNP Models (α2 = 0.8, Sample Size = 1000).
| Variable | NB Value | NB SE | SNP Value | SNP SE |
|---|---|---|---|---|
| b0 (1.0) | 1.0031 | 0.0908 | 1.0031 | — |
| b1 (-0.3) | -0.2969 | 0.0239 | -0.2969 | 0.0232 |
| b2 (0.4) | 0.3829 | 0.0243 | 0.3864 | 0.0244 |
| α2 | 0.8113 | 0.0541 | — | — |
| a0 | — | — | 1.0000 | — |
| a1 | — | — | -0.0581 | 0.0692 |
| a2 | — | — | -0.1393 | 0.0388 |
| a3 | — | — | -0.0521 | 0.0135 |
| a4 | — | — | 0.0207 | 0.0065 |
| LL(β) | ||||
Fig 1. Comparison of SNP and Log-Gamma distributions (α2 = 0.8).
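The simulation design behind this comparison can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: the two covariates are assumed to be standard normal, α2 is assumed to be the NB dispersion parameter (the variance of a mean-one gamma multiplier), and the function name `simulate_counts` and the seed are hypothetical. The true coefficients (1.0, -0.3, 0.4) and the sample size come from the table.

```python
import numpy as np


def simulate_counts(n=1000, alpha=0.8, b=(1.0, -0.3, 0.4), seed=0):
    """Simulate Poisson counts with log-gamma unobserved heterogeneity."""
    rng = np.random.default_rng(seed)
    # Assumption: covariates are independent standard normal draws.
    x1 = rng.standard_normal(n)
    x2 = rng.standard_normal(n)
    # Gamma multiplier with mean 1 and variance alpha; its log is log-gamma.
    g = rng.gamma(shape=1.0 / alpha, scale=alpha, size=n)
    eps = np.log(g)
    # Poisson mean with heterogeneity entering the log-linear predictor.
    mu = np.exp(b[0] + b[1] * x1 + b[2] * x2 + eps)
    y = rng.poisson(mu)
    return y, x1, x2, eps


y, x1, x2, eps = simulate_counts()
print(y[:10], y.mean())
```

Marginally, counts generated this way follow an NB distribution, which is why the NB model serves as the correctly specified benchmark in this experiment.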
Comparison between NB and SNP Models (α2 = 1.2, Sample Size = 1000).
| Variable | NB Value | NB SE | SNP Value | SNP SE |
|---|---|---|---|---|
| b0 (1.0) | 1.0157 | 0.1047 | 1.0157 | — |
| b1 (-0.3) | -0.3450 | 0.0279 | -0.3537 | 0.0264 |
| b2 (0.4) | 0.4007 | 0.0276 | 0.3915 | 0.0169 |
| α2 | 1.2215 | 0.0760 | — | — |
| a0 | — | — | 1.0000 | — |
| a1 | — | — | 0.0496 | 0.0572 |
| a2 | — | — | -0.0459 | 0.0450 |
| a3 | — | — | -0.0895 | 0.0131 |
| a4 | — | — | 0.0213 | 0.0070 |
| LL(β) | ||||
Fig 2. Comparison of SNP and Log-Gamma distributions (α2 = 1.2).
SNP models to approximate normal heterogeneities.
| Variable (True Value) | SNP Model 1 Value | SNP Model 1 SE | SNP Model 2 Value | SNP Model 2 SE |
|---|---|---|---|---|
| b1 (-0.3) | -0.2993 | 0.0113 | -0.3042 | 0.0151 |
| b2 (0.4) | 0.4090 | 0.0039 | 0.3946 | 0.0045 |
| a0 | 1.0000 | — | 1.0000 | — |
| a1 | 0.0059 | 0.0332 | 0.0067 | 0.0305 |
| a2 | -0.1194 | 0.0185 | 0.0980 | 0.0218 |
| LL(β) | -2112.09 | | -2462.97 | |
(Sample Size = 1000)
Fig 3. Comparison of SNP and normal distributions (σ = 0.8 or 1.2).
Fig 4. Comparison of SNP and bimodal distributions.
SNP models to approximate bimodal and trimodal heterogeneities.
| Variable (True Value) | SNP Model 1 Value | SNP Model 1 SE | SNP Model 2 Value | SNP Model 2 SE |
|---|---|---|---|---|
| b1 (-0.3) | -0.2801 | 0.0119 | -0.2804 | 0.0061 |
| b2 (0.4) | 0.4139 | 0.0051 | 0.4107 | 0.0021 |
| a0 | 1.0000 | — | 1.0000 | — |
| a1 | 0.8984 | 0.2587 | -0.0496 | 0.0942 |
| a2 | 0.9218 | 0.3554 | -0.2594 | 0.0712 |
| a3 | -0.3543 | 0.1535 | -0.0160 | 0.0514 |
| a4 | -0.0637 | 0.0485 | 0.0804 | 0.0088 |
| a5 | 0.0174 | 0.0170 | -0.0007 | 0.0047 |
| LL(β) | -1289.46 | | -1327.89 | |
Sample Size = 500
Fig 5. Comparison of SNP and trimodal distributions.
Summary statistics of variables for the California data.
| Variable | Minimum | Maximum | Mean | Std. Dev. |
|---|---|---|---|---|
| Number of crashes (10 years) | 0.00 | 44.43 | ||
| Segment length (in miles) (L) | 0.10 | 4.37 | 0.50 | 0.52 |
| Average daily traffic over 10 years | 1372.00 | 78300.00 | 16001.57 | 13088.46 |
| Ln(L·10) | 0.00 | 3.78 | 1.26 | 0.79 |
| Ln(AADT) | 7.22 | 11.27 | 9.39 | 0.77 |
| Median width (in feet) | 0.00 | 99.00 | 34.56 | 32.34 |
| Lane width (in feet) | 6.00 | 15.00 | 12.01 | 0.39 |
| Right shoulder width (in feet) | 0.00 | 23.00 | 7.85 | 2.80 |
Crash frequency model estimation results.
| Variable | NB Value | NB SE | SNP Value | SNP SE |
|---|---|---|---|---|
| Intercept | -7.0561 | 0.6873 | -7.0561 | — |
| Ln[10×length] | 1.0000 | — | 1.0000 | — |
| Ln(AADT) | 1.0711 | 0.0267 | 1.0046 | 0.0187 |
| Median | -0.0348 | 0.0083 | -0.0369 | 0.0056 |
| Lane | 0.0542 | 0.0171 | ||
| Right shoulder width (ft) | -0.0733 | 0.0093 | -0.0699 | 0.0043 |
| α2 | 0.5035 | 0.0239 | — | — |
| a0 | — | — | 1.0000 | — |
| a1 | — | — | -0.3242 | 0.0336 |
| a2 | — | — | -0.1714 | 0.0164 |
| a3 | — | — | 0.0408 | 0.0093 |
| Overall Performance Measurements | ||||
| Sample size | 1443 | | 1443 | |
| LL(β) | -4480.06 | |||
| Deviance | 8960.13 | |||
| AIC | 8972.13 | |||
| BIC | 9003.78 | |||
Fig 6. Comparison of SNP and Log-Gamma distributions in crash frequency models.
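The goodness-of-fit measures that survive in the crash frequency table can be cross-checked from the log-likelihood. The sketch below assumes the usual definitions (deviance as -2·LL, AIC as deviance plus 2k, BIC as deviance plus k·ln n) and assumes k = 6 free parameters for the NB model (intercept, Ln(AADT), median, lane, right shoulder width, and α2, with the length coefficient fixed at 1). The function name `fit_measures` is hypothetical.

```python
import math


def fit_measures(log_lik, n_params, n_obs):
    """Return (deviance, AIC, BIC) from a maximized log-likelihood."""
    deviance = -2.0 * log_lik
    aic = deviance + 2.0 * n_params
    bic = deviance + n_params * math.log(n_obs)
    return deviance, aic, bic


# Reported log-likelihood and sample size; k = 6 is an assumption.
deviance, aic, bic = fit_measures(log_lik=-4480.06, n_params=6, n_obs=1443)
print(round(deviance, 2), round(aic, 2), round(bic, 2))
```

Under these assumptions the three results agree with the reported 8960.13, 8972.13, and 9003.78 to within rounding of the log-likelihood.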