Thodoris Garefalakis, Christos Katrakazas, George Yannis.
Abstract
Predicting driving behavior and crash risk in real time has been heavily researched in recent years. Although in-vehicle interventions and gamification features in post-trip dashboards have emerged, real-time driving behavior prediction has yet to be connected to the triggering of such interventions. This connection is the focus of the European Horizon 2020 project "i-DREAMS", which aims to define, develop, test, and validate a 'Safety Tolerance Zone' (STZ) that keeps drivers from risky driving behaviors through interventions both in real time and post-trip. However, the data-driven conceptualization of STZ levels is challenging, and class imbalance in the data might hinder it. Following the project principles and taking these challenges into consideration, this paper proposes a framework to identify the level of risky driving behavior, as well as the time spent in each risk level, by private car drivers. To this end, four classification algorithms are used, namely Support Vector Machines (SVMs), Random Forests (RFs), AdaBoost, and Multilayer Perceptron (MLP) neural networks, together with imbalanced learning using the Adaptive Synthetic technique (ADASYN) to deal with the unbalanced distribution of the dataset across the STZ levels. Moreover, as an alternative approach to risk prediction, three regression algorithms, namely Ridge, Lasso, and Elastic Net, are used to predict time duration. The results showed that RF and MLP outperformed the other classifiers, with 84% and 82% overall accuracy, respectively, and that the maximum vehicle speed during a 30 s interval is the most crucial predictor for identifying the driving time at each safety level.
Keywords: driving behavior analysis; driving behavior classification; imbalanced machine learning
Year: 2022 PMID: 35890990 PMCID: PMC9319394 DOI: 10.3390/s22145309
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.847
Figure 1. DSS Car Simulator.
Different scenarios applied during the driving simulator experiment.
| Scenario | Road Section | Number of Lanes | Speed Limit |
|---|---|---|---|
| A | 0–6300 m | 1 × 1 | 70 km/h |
| A | 6300–11,300 m | 2 × 2 | 90 km/h |
| A | 11,300–16,500 m | 2 × 2 | 120 km/h |
| B | 0–6100 m | 2 × 2 | 90 km/h |
| B | 6100–12,000 m | 2 × 2 | 120 km/h |
| B | 12,000–18,200 m | 1 × 1 | 70 km/h |
| C | 0–6000 m | 2 × 2 | 120 km/h |
| C | 6000–11,000 m | 2 × 2 | 90 km/h |
| C | 11,000–17,200 m | 1 × 1 | 70 km/h |
Description of variables collected by the driving simulator experiment.
| Variable | Description | Units | Type |
|---|---|---|---|
| TTC | Time to collision with the vehicle ahead | Seconds | Numeric |
| Headway | Time headway to the vehicle ahead in the same lane | Seconds | Numeric |
| Speed | Vehicle speed | Kilometers per hour | Numeric |
| Distance_travelled | Distance driving | Meters | Numeric |
| BSAV_SpeedLimitKPH | Current speed limit | Kilometers per hour | Numeric |
| HandsOnEvent | Whether hands are on the steering wheel | None/both | Discrete |
| FatigueEvent | KSS score | 32–35–39 | Discrete |
Comparison of results of different methods for determining safety levels.
| Technique | Risk Level: Normal | Risk Level: Dangerous | Risk Level: Avoidable Accident |
|---|---|---|---|
| K-means Clustering | 239 | 1483 | 1599 |
| Hierarchical Clustering | 368 | 1204 | 1749 |
| Threshold of the variable TTC_mean | 3150 | 35 | 136 |
| Threshold of the variable Speed_mean | 3320 | 1 | 0 |
| Threshold of the variable Headway_min | 2820 | 338 | 163 |
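The class imbalance mentioned in the abstract can be read directly off this table. The sketch below is only an illustration: the counts are copied from the table (each row sums to 3321), while the helper names `class_shares` and `imbalance_ratio` are hypothetical and not taken from the paper.

```python
# Class counts per labelling technique, copied from the table above.
COUNTS = {
    "K-means Clustering": (239, 1483, 1599),
    "Hierarchical Clustering": (368, 1204, 1749),
    "Threshold on TTC_mean": (3150, 35, 136),
    "Threshold on Speed_mean": (3320, 1, 0),
    "Threshold on Headway_min": (2820, 338, 163),
}
LEVELS = ("Normal", "Dangerous", "Avoidable Accident")


def class_shares(counts):
    """Return each class count as a fraction of the row total."""
    total = sum(counts)
    return tuple(c / total for c in counts)


def imbalance_ratio(counts):
    """Largest class count divided by the smallest non-empty class count."""
    non_empty = [c for c in counts if c > 0]
    return max(non_empty) / min(non_empty)


if __name__ == "__main__":
    for technique, counts in COUNTS.items():
        shares = ", ".join(
            f"{level}: {share:.1%}"
            for level, share in zip(LEVELS, class_shares(counts))
        )
        print(f"{technique} (n={sum(counts)}): {shares}; "
              f"ratio={imbalance_ratio(counts):.1f}")
```

Empty classes are skipped in the ratio so that the Speed_mean row, which has no "Avoidable Accident" samples, does not cause a division by zero.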
Figure 2. Correlation heatmap of the examined variables.
Figure 3. Permutation feature importance.
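For readers unfamiliar with the technique named in Figure 3, the following is a minimal sketch of permutation feature importance: shuffle one feature column at a time and measure how much a model's score drops. The record does not state which model or scoring metric the authors used, so plain accuracy, the `predict` callable, and the function names are all assumptions made for illustration.

```python
import random


def accuracy(predict, X, y):
    """Fraction of rows for which predict(row) equals the true label."""
    return sum(predict(row) == label for row, label in zip(X, y)) / len(y)


def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Mean drop in accuracy after shuffling each feature column.

    X is a list of feature lists; predict maps one row to a label.
    Accuracy as the score is an assumption, not the paper's stated choice.
    """
    rng = random.Random(seed)
    baseline = accuracy(predict, X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            # Shuffle column j only, leaving all other columns intact.
            column = [row[j] for row in X]
            rng.shuffle(column)
            shuffled = [row[:j] + [value] + row[j + 1:]
                        for row, value in zip(X, column)]
            drops.append(baseline - accuracy(predict, shuffled, y))
        importances.append(sum(drops) / n_repeats)
    return importances
```

A feature the model ignores gets an importance of exactly zero, because shuffling it cannot change any prediction.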
Descriptive statistics for input variables.
| Variable | Description | Mean | St. Dev. | Min | Max |
|---|---|---|---|---|---|
| Speed_max | Maximum value of Speed variable for an interval of 30 s (km/h) | 75.45 | 3.00 | 64.00 | 100.00 |
| Distance travelled_sum | Sum of Distance travelled variable for an interval of 30 s (m) | 7,006,041.28 | 4,176,949.80 | 363.50 | 20,023,055.37 |
| BSAV_SpeedLimitKPH_max | Maximum value of BSAV_SpeedLimitKPH variable for an interval of 30 s (km/h) | 95.94 | 20.96 | 75.50 | 125.50 |
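The input variables above are per-interval aggregates (maximum or sum over 30 s). A minimal sketch of that aggregation step is shown below. The sample format, the `time_s` key, and the function name `aggregate_windows` are assumptions, since the record only gives the 30 s interval length and the variable names.

```python
from collections import defaultdict


def aggregate_windows(samples, window_s=30.0):
    """Group raw samples into fixed-length windows and summarise each one.

    `samples` is an iterable of dicts with the hypothetical keys
    "time_s", "Speed" and "Distance_travelled".
    """
    windows = defaultdict(list)
    for sample in samples:
        # Samples whose timestamps fall in the same 30 s slot share an index.
        windows[int(sample["time_s"] // window_s)].append(sample)

    rows = []
    for index in sorted(windows):
        group = windows[index]
        rows.append({
            "window_start_s": index * window_s,
            "Speed_max": max(s["Speed"] for s in group),
            "Distance_travelled_sum": sum(
                s["Distance_travelled"] for s in group
            ),
        })
    return rows
```

Windows that contain no samples are simply absent from the result rather than filled with placeholder values.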
Classification metrics for the developed classifiers.
| Classifier | Accuracy | Precision | Recall | False Alarm Rate | F1-Score |
|---|---|---|---|---|---|
| SVM | 68.67% | 51.35% | 74.72% | 12.47% | 53.22% |
| RF | 84.00% | 59.41% | 70.27% | 11.47% | 63.42% |
| AdaBoost | 75.08% | 52.31% | 70.71% | 11.30% | 55.87% |
| MLP | 81.28% | 57.51% | 72.04% | 11.37% | 61.79% |
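The metrics in this table can all be derived from a multi-class confusion matrix. The sketch below shows one way to do so, treating each class one-vs-rest and then averaging. The record does not say how the authors averaged across the three STZ levels, so the macro (unweighted) average used here is an assumption, as are the function names.

```python
def per_class_metrics(confusion, k):
    """One-vs-rest metrics for class k.

    confusion[i][j] counts samples with true class i predicted as class j.
    """
    n = len(confusion)
    tp = confusion[k][k]
    fn = sum(confusion[k]) - tp
    fp = sum(confusion[i][k] for i in range(n)) - tp
    tn = sum(map(sum, confusion)) - tp - fn - fp
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    false_alarm = fp / (fp + tn) if fp + tn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, false_alarm, f1


def macro_metrics(confusion):
    """Overall accuracy plus macro-averaged per-class metrics (assumed scheme)."""
    n = len(confusion)
    total = sum(map(sum, confusion))
    accuracy = sum(confusion[k][k] for k in range(n)) / total
    per_class = [per_class_metrics(confusion, k) for k in range(n)]
    precision, recall, false_alarm, f1 = (
        sum(values) / n for values in zip(*per_class)
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "false_alarm_rate": false_alarm,
        "f1": f1,
    }
```

With unbalanced classes, macro-averaged precision and recall can sit well below overall accuracy, because minority classes count as much as the majority class in the average.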
Figure 4. Classification metrics of the four machine learning models.
Figure 5. ROC curve of RF classifier.
Figure 6. Precision-Recall curve of RF classifier.
Summary of Ridge Regression model.
| Coefficient | Estimate | Std. Error | t Value | p-Value |
|---|---|---|---|---|
| Intercept | 9966.72 | 472.91 | 21.08 | 0.00 |
| Speed_max | −112.01 | 2.18 | −51.44 | 0.00 |
| Distance travelled_sum | 0.01 | 0.00 | 8.90 | 0.00 |

R² = 0.85; Adjusted R² = 0.85
Summary of Lasso Regression model.
| Coefficient | Estimate | Std. Error | t Value | p-Value |
|---|---|---|---|---|
| Intercept | 9966.36 | 472.91 | 21.08 | 0.00 |
| Speed_max | −112.02 | 2.18 | −51.45 | 0.00 |
| Distance travelled_sum | 0.01 | 0.00 | 8.90 | 0.00 |

R² = 0.85; Adjusted R² = 0.85
Summary of Elastic Net Regression model.
| Coefficient | Estimate | Std. Error | t Value | p-Value |
|---|---|---|---|---|
| Intercept | 9697.04 | 472.98 | 20.46 | 0.00 |
| Speed_max | −108.84 | 2.18 | −49.87 | 0.00 |
| Distance travelled_sum | 0.01 | 0.00 | 8.96 | 0.00 |

R² = 0.85; Adjusted R² = 0.85
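For reference, the three regression models differ only in the penalty added to the least-squares loss. The block below gives the standard textbook forms, not the authors' exact parametrisation. Here $y_i$ is assumed to be the time duration being predicted, $x_i$ the predictor vector (Speed_max, Distance travelled_sum), and $\lambda \ge 0$ and $\alpha \in [0, 1]$ are tuning parameters whose values are not reported in this record.

```latex
\begin{aligned}
\text{Ridge:} \quad
  & \min_{\beta_0,\,\beta} \;
    \sum_{i=1}^{n} \bigl(y_i - \beta_0 - x_i^{\top}\beta\bigr)^2
    + \lambda \sum_{j=1}^{p} \beta_j^{2} \\
\text{Lasso:} \quad
  & \min_{\beta_0,\,\beta} \;
    \sum_{i=1}^{n} \bigl(y_i - \beta_0 - x_i^{\top}\beta\bigr)^2
    + \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert \\
\text{Elastic Net:} \quad
  & \min_{\beta_0,\,\beta} \;
    \sum_{i=1}^{n} \bigl(y_i - \beta_0 - x_i^{\top}\beta\bigr)^2
    + \lambda \Bigl( \alpha \sum_{j=1}^{p} \lvert \beta_j \rvert
    + (1 - \alpha) \sum_{j=1}^{p} \beta_j^{2} \Bigr)
\end{aligned}
```

Setting $\alpha = 1$ in the Elastic Net form recovers the Lasso penalty and $\alpha = 0$ recovers the Ridge penalty. Software packages scale these terms differently, so coefficients from different libraries are comparable only after matching conventions.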