Literature DB >> 32244469
Ke Wang, Qingwen Xue, Yingying Xing, Chongyi Li.
Abstract
Real-time recognition of risky driving behavior and aggressive drivers is a promising research domain, thanks to powerful machine learning algorithms and the big data provided by in-vehicle and roadside sensors. However, since aggressive drivers are infrequent in real traffic, most machine learning algorithms treat each sample equally and are prone to predicting normal drivers better than aggressive drivers, who are our real interest. This paper aims to test the advantage of imbalanced class boosting algorithms in aggressive driver recognition using vehicle trajectory data. First, a surrogate measurement of collision risk, called Average Crash Risk (ACR), is proposed to calculate a vehicle's crash risk. Second, each driver's driving aggressiveness is determined from his/her ACR with three anomaly detection methods. Third, we train classification models to identify aggressive drivers using partial trajectory data. Three imbalanced class boosting algorithms, SMOTEBoost, RUSBoost, and CUSBoost, are compared with cost-sensitive AdaBoost and cost-sensitive XGBoost. Additionally, we try two resampling techniques with AdaBoost and XGBoost. Among all algorithms tested, CUSBoost achieves the highest or the second-highest Area Under the Precision-Recall Curve (AUPRC) on most datasets. We find the discrete Fourier coefficients of gap to be the key feature for identifying aggressive drivers.
Keywords: collision surrogate measurement; driving aggressiveness; imbalanced class boosting; vehicle trajectory
Year: 2020 PMID: 32244469 PMCID: PMC7177658 DOI: 10.3390/ijerph17072375
Source DB: PubMed Journal: Int J Environ Res Public Health ISSN: 1660-4601 Impact factor: 3.390
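The second step of the abstract labels each driver as aggressive or normal by applying one of three anomaly detection rules to the driver's ACR. Below is a minimal sketch, not the authors' implementation, of how each rule named in the paper (K-means clustering, the interquartile range rule, and a fixed percentile cutoff) could be applied to an assumed 1-D array `acr` of per-driver scores; all names and thresholds other than the rules themselves are illustrative.

```python
import numpy as np
from sklearn.cluster import KMeans

def label_by_kmeans(acr, n_clusters=2):
    """Cluster ACR values; treat the cluster with the highest mean ACR as aggressive."""
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0)
    clusters = km.fit_predict(acr.reshape(-1, 1))
    aggressive_cluster = np.argmax(km.cluster_centers_.ravel())
    return (clusters == aggressive_cluster).astype(int)

def label_by_iqr(acr):
    """Flag drivers whose ACR exceeds Q3 + 1.5 * IQR (interquartile range rule)."""
    q1, q3 = np.percentile(acr, [25, 75])
    return (acr > q3 + 1.5 * (q3 - q1)).astype(int)

def label_by_percentile(acr, pct=94):
    """Flag drivers whose ACR lies above a fixed percentile (e.g., the 94th)."""
    return (acr > np.percentile(acr, pct)).astype(int)
```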
Figure 1. Methodology framework.
Figure 2. Difference in procedure between AdaBoost and imbalanced class boosting.
Nine algorithms tested.

| Algorithm | Resampling of Training Data |
|---|---|
| Cost-sensitive boosting | |
| AdaBoost | No |
| XGBoost | No |
| Standard boosting with resampling | |
| SMOTE + AdaBoost | SMOTE |
| SMOTE + XGBoost | SMOTE |
| RUS + AdaBoost | Random undersampling |
| RUS + XGBoost | Random undersampling |
| Imbalanced class boosting | |
| SMOTEBoost | No |
| RUSBoost | No |
| CUSBoost | No |
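The first two families in the table differ only in whether the class imbalance is handled through misclassification costs or by resampling the training set before boosting. The sketch below illustrates that distinction with XGBoost on a synthetic stand-in dataset; it is an assumption-laden illustration, not the paper's code, and all hyperparameters are placeholders.

```python
import xgboost as xgb
from imblearn.over_sampling import SMOTE
from imblearn.under_sampling import RandomUnderSampler
from sklearn.datasets import make_classification

# Synthetic stand-in for the trajectory feature datasets (minority class = aggressive drivers).
X, y = make_classification(n_samples=2000, weights=[0.9, 0.1], random_state=0)

# Cost-sensitive boosting: no resampling; minority-class errors are weighted more heavily.
n_neg, n_pos = int((y == 0).sum()), int((y == 1).sum())
cost_sensitive = xgb.XGBClassifier(scale_pos_weight=n_neg / n_pos).fit(X, y)

# Standard boosting with resampling: rebalance the training set before boosting.
X_sm, y_sm = SMOTE(random_state=0).fit_resample(X, y)
smote_xgb = xgb.XGBClassifier().fit(X_sm, y_sm)

X_rus, y_rus = RandomUnderSampler(random_state=0).fit_resample(X, y)
rus_xgb = xgb.XGBClassifier().fit(X_rus, y_rus)
```

The third family, imbalanced class boosting (SMOTEBoost, RUSBoost, CUSBoost), instead resamples inside every boosting iteration, with CUSBoost undersampling the majority class cluster by cluster; this is why no separate resampling of the training data is listed for those rows.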
Figure 3. Histogram of Average Crash Risk (ACR).
Characteristics of datasets.

| Dataset | Features | Anomaly Detection Method | ACR Threshold | Share of Aggressive Drivers | Imbalance Ratio |
|---|---|---|---|---|---|
| Dataset 1 | DFT of speed and acceleration | K-means clustering | 0.14 | 14.4% | 6:1 |
| Dataset 2 | DFT of gap | K-means clustering | 0.14 | 14.4% | 6:1 |
| Dataset 3 | DFT of speed, acceleration, and gap | K-means clustering | 0.14 | 14.4% | 6:1 |
| Dataset 4 | DFT of speed, acceleration, and gap | Interquartile range rule | 0.19 | 10.0% | 9:1 |
| Dataset 5 | DFT of speed, acceleration, and gap | 94th percentile | 0.28 | 6.4% | 14:1 |
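Each dataset's features are discrete Fourier transform (DFT) coefficients of one or more trajectory signals (speed, acceleration, gap). As a rough illustration under assumed inputs, the sketch below builds such a feature vector from a single driver's gap series; the series, sampling, and number of retained coefficients are placeholders, not values from the paper.

```python
import numpy as np

def dft_features(series, n_coeff=10):
    """Magnitudes of the first n_coeff discrete Fourier coefficients of a 1-D series."""
    spectrum = np.fft.rfft(series - np.mean(series))  # remove mean, one-sided FFT
    return np.abs(spectrum[:n_coeff])

# Example: one driver's gap (spacing to the leading vehicle, in meters) over time.
gap = np.random.default_rng(0).normal(loc=25.0, scale=3.0, size=300)
features = dft_features(gap)  # feature vector of the kind used for Dataset 2
```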
The performance of boosting algorithms (Dataset 1).
| Algorithms | Precision | Recall | F1 Score | AUPRC |
|---|---|---|---|---|
| Cost-sensitive boosting | ||||
| AdaBoost | 0.720 | 0.504 | 0.561 | 0.639 |
| XGBoost | 0.809 | 0.552 | 0.639 | 0.693 |
| Standard boosting with resampling | ||||
| SMOTE + AdaBoost | 0.495 | 0.663 | 0.557 | 0.617 |
| SMOTE + XGBoost | 0.526 | 0.684 | 0.586 | 0.655 |
| RUS + AdaBoost | 0.414 | 0.763 | 0.529 | 0.572 |
| RUS + XGBoost | 0.432 | 0.779 | 0.551 | 0.573 |
| Imbalanced class boosting | ||||
| SMOTEBoost | 0.441 | 0.823 | 0.571 | 0.664 |
| RUSBoost | 0.297 | 0.928 | 0.445 | 0.507 |
| CUSBoost | 0.586 | 0.661 | 0.615 | 0.715 |
The performance of boosting algorithms (Dataset 2).
| Algorithms | Precision | Recall | F1 Score | AUPRC |
|---|---|---|---|---|
| Cost-sensitive boosting | ||||
| AdaBoost | 0.832 | 0.768 | 0.786 | 0.852 |
| XGBoost | 0.910 | 0.894 | 0.897 | 0.917 |
| Standard boosting with resampling | ||||
| SMOTE + AdaBoost | 0.845 | 0.824 | 0.825 | 0.869 |
| SMOTE + XGBoost | 0.887 | 0.930 | 0.903 | 0.902 |
| RUS + AdaBoost | 0.681 | 0.901 | 0.774 | 0.820 |
| RUS + XGBoost | 0.823 | 0.917 | 0.861 | 0.890 |
| Imbalanced class boosting | ||||
| SMOTEBoost | 0.799 | 0.856 | 0.818 | 0.895 |
| RUSBoost | 0.588 | 0.962 | 0.722 | 0.851 |
| CUSBoost | 0.840 | 0.908 | 0.866 | 0.912 |
The performance of boosting algorithms (Dataset 3).
| Algorithms | Precision | Recall | F1 Score | AUPRC |
|---|---|---|---|---|
| Cost-sensitive boosting | ||||
| AdaBoost | 0.890 | 0.824 | 0.847 | 0.923 |
| XGBoost | 0.924 | 0.893 | 0.904 | 0.938 |
| Standard boosting with resampling | ||||
| SMOTE + AdaBoost | 0.830 | 0.873 | 0.732 | 0.802 |
| SMOTE + XGBoost | 0.848 | 0.916 | 0.912 | 0.926 |
| RUS + AdaBoost | 0.827 | 0.888 | 0.806 | 0.855 |
| RUS + XGBoost | 0.925 | 0.929 | 0.914 | 0.910 |
| Imbalanced class boosting | ||||
| SMOTEBoost | 0.812 | 0.917 | 0.852 | 0.942 |
| RUSBoost | 0.605 | 0.954 | 0.730 | 0.902 |
| CUSBoost | 0.870 | 0.911 | 0.884 | 0.935 |
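The metrics in the three tables above (precision, recall, F1 score, and AUPRC) can be computed from a classifier's predicted labels and scores. The sketch below shows one standard way to do so with scikit-learn, using placeholder labels and scores in place of a fitted model's output; note that `average_precision_score` is one common estimator of the area under the precision-recall curve and may differ slightly from other AUPRC calculations.

```python
import numpy as np
from sklearn.metrics import (precision_score, recall_score, f1_score,
                             average_precision_score)

# Placeholder labels and scores standing in for a fitted classifier's output.
rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=200)   # 1 = aggressive driver, 0 = normal driver
y_score = rng.random(200)               # predicted probability of the aggressive class
y_pred = (y_score >= 0.5).astype(int)   # hard labels from a 0.5 threshold

precision = precision_score(y_true, y_pred)
recall = recall_score(y_true, y_pred)
f1 = f1_score(y_true, y_pred)
auprc = average_precision_score(y_true, y_score)  # area under the precision-recall curve
```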
Figure 4. The impact of imbalance ratio on AdaBoost.
Figure 5. The impact of resampling on AdaBoost.
Figure 6. The impact of resampling on XGBoost.