Mustafa Aljasim, Rasha Kashef.
Abstract
The increasing number of car accidents is a significant issue in current transportation systems. According to the World Health Organization (WHO), road accidents are the eighth leading cause of death worldwide. More than 80% of road accidents are caused by distracted driving, such as using a mobile phone, talking to passengers, and smoking. Many efforts have been made to tackle driver distraction; however, no optimal solution has yet been provided. A practical approach to this problem is to implement quantitative measures of driver activity and design a classification system that detects distracting actions. In this paper, we implement a portfolio of ensemble deep learning models proven to efficiently classify distracted driver actions and provide in-car recommendations that minimize distraction and increase in-car awareness for improved safety. This paper proposes E2DR, a new scalable model that uses stacking ensemble methods to combine two or more deep learning models, improving accuracy, enhancing generalization, and reducing overfitting, while providing real-time recommendations. The highest-performing E2DR variant, which combines the ResNet50 and VGG16 models, achieved a test accuracy of 92% on state-of-the-art datasets, including the State Farm Distracted Drivers dataset, using novel data-splitting strategies.
Keywords: deep learning; distracted driving; ensemble learning; stacking
Year: 2022 PMID: 35271004 PMCID: PMC8914716 DOI: 10.3390/s22051858
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.576
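The abstract describes E2DR as a pair-wise stacking ensemble: the class-probability outputs of two trained base networks are combined and passed to a meta-classifier. A minimal sketch of that idea, using random stand-ins for the base model outputs and a plain NumPy softmax meta-classifier (the shapes, model names, and training loop here are illustrative assumptions, not the authors' implementation):

```python
import numpy as np

rng = np.random.default_rng(0)
NUM_CLASSES = 10  # classes c0..c9 in the State Farm dataset

def softmax(z):
    """Row-wise softmax with the usual max-subtraction for stability."""
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

# Stand-ins for the class-probability outputs of two trained base
# models (e.g., ResNet50 and VGG16) on a held-out stacking set.
n = 200
probs_a = softmax(rng.normal(size=(n, NUM_CLASSES)))
probs_b = softmax(rng.normal(size=(n, NUM_CLASSES)))
y = rng.integers(0, NUM_CLASSES, size=n)

# Stacking: concatenate the two base predictions into one
# meta-feature vector per sample, then fit a linear meta-classifier.
X_meta = np.hstack([probs_a, probs_b])        # shape (n, 20)
W = np.zeros((X_meta.shape[1], NUM_CLASSES))

# A few steps of plain gradient descent on softmax cross-entropy.
one_hot = np.eye(NUM_CLASSES)[y]
for _ in range(200):
    p = softmax(X_meta @ W)
    grad = X_meta.T @ (p - one_hot) / n
    W -= 1.0 * grad

meta_pred = softmax(X_meta @ W).argmax(axis=1)
print(meta_pred.shape)  # (200,)
```

In practice, the base models would be trained on the training split and the meta-classifier fitted on their predictions for a separate split, so the meta-learner corrects, rather than memorizes, the base models' errors.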
Comparative analysis of driver distraction detection systems.
| Paper | Model (Type) | Dataset | Validation | Pros | Cons |
|---|---|---|---|---|---|
| [ | Deep learning gaze estimation system | Driver Gaze in the Wild dataset | Accuracy | High performance | Accurate only to a limited extent |
| [ | Deep learning Gaze estimation driver-assistant system | DR(eye)VE dataset | Ground truth | Provide suggestion | Driver gaze is subjective |
| [ | Deep learning distracted driver detection | Distracted driver dataset | Accuracy, Recall, Precision, F1 score | Computationally efficient | Few epochs for training |
| [ | Hierarchical, weighted random forest (WRF) model | The Keimyung University Facial Expression of Drivers (KMU-FED) and Cohn-Kanade datasets | Accuracy | Requires little memory and few computing operations | Not accurate when the face is rotated |
| [ | Driver distraction detection using CNNs | State Farm dataset and the AUC Distracted Driver dataset | Accuracy, sensitivity | Computationally efficient | Not enough validation metrics |
| [ | Deep learning distracted driver detection using pose estimation | AUC Distracted Driver Dataset | Accuracy, F1 score | Pose estimation improves accuracy | Low-resolution images affected training |
| [ | Driver action recognition using R-CNN | images of different driver actions | Accuracy and log loss | Effective feature representation | Small dataset |
| [ | Driver Distraction recognition Using Octave-Like CNN | Lilong Distracted Driving Behavior data | Accuracy, training duration | Lightweight network | Not enough validation metrics |
| [ | Temporal–Spatial Deep Learning driver distraction detection | EEG signals from 24 participants | Precision, Recall, F1 score | Unique approach | Drivers’ individual differences need to be considered |
| [ | Optimized Residual Neural Network Architecture | The State Farm Distracted Driver dataset | Accuracy, training time | Enhanced model | Only detects head movement |
| [ | Wearable sensing and deep learning driver distraction detection | Wearable sensing information from 20 participants | Recall, Precision, F1 score | Good potential | Small dataset |
| [ | Hybrid distraction detection model using deep learning | State Farm dataset | Accuracy | Computationally expensive | Not enough validation |
| [ | Triple-Wise Multi-Task Learning | AUC Distracted Driver Dataset | Accuracy, sensitivity | High detection accuracy | High computational cost |
| [ | Safety-critical events prediction | Driving events from 3500 drivers | Accuracy, Recall, Precision, F1 score | Can detect potential car accidents | Hard to get enough data |
| [ | CNN driver action detection system | 10 drivers’ data with driving activities | Accuracy | Accurate | It does not detect the driver action |
| [ | CNN driver action detection system | Distracted driver dataset | Accuracy and loss | Computationally simple | Not enough training |
| [ | Distracted driver behavior detection using deep learning | Self-built dataset of drivers making phone calls and smoking | Recall, Precision, Speed | Real-time | Only trained to detect 2 driver actions |
| [ | Hybrid driver distraction detection model | AUC Distracted Driver Dataset | Accuracy and loss | High performance | Complex |
| [ | Driver Inattentiveness detection | NA | Accuracy | Comprehensive analysis of deep learning models | Not effective in detecting aggressive behavior |
| [ | Deep learning manual distraction detection model | 106,677 frames extracted from a video that was taken from 20 participants in a driving simulator | Accuracy, Recall, Precision, F1 score | Novel approach | Only detects manual distraction |
Figure 1. The ResNet50 blocks [27].
Figure 2. The VGG16 blocks [33].
Figure 3. The Inception v3 architecture [39].
Figure 4. The MobileNet architecture [43].
Figure 5. The E2DR (pair-wise stacking) architecture.
Figure 6. ResNet50 architecture.
Figure 7. Dense net architecture (VGG16).
Figure 8. The Inception model architecture.
Figure 9. Model architecture (MobileNet).
Recommendations for distracted drivers.
| Class | Description | Recommendation |
|---|---|---|
| C0 | Safe driving | - |
| C1 | Texting—Right | “Please avoid texting in all cases or make a stop” |
| C2 | Talking on the phone—Right | “Please use a hands-free device” |
| C3 | Texting—Left | “Please avoid texting in all cases or make a stop” |
| C4 | Talking on the phone—Left | “Please use a hands-free device” |
| C5 | Adjusting Radio | “Please use steering control” |
| C6 | Drinking | “Please keep your hands at the steering wheel or make a stop” |
| C7 | Reaching Behind | “Please keep your eyes on the road or make a stop” |
| C8 | Hair and Makeup | “Please make a stop” |
| C9 | Talking to passenger | “Please keep your eyes on the road while talking” |
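The class-to-recommendation mapping above is a simple lookup once the classifier has produced a label. A minimal sketch (the function and dictionary names are illustrative, not from the paper):

```python
# Lookup from predicted State Farm class labels (c0..c9) to the
# in-car recommendations listed in the table above.
RECOMMENDATIONS = {
    "c0": None,  # safe driving: no prompt needed
    "c1": "Please avoid texting in all cases or make a stop",
    "c2": "Please use a hands-free device",
    "c3": "Please avoid texting in all cases or make a stop",
    "c4": "Please use a hands-free device",
    "c5": "Please use steering control",
    "c6": "Please keep your hands at the steering wheel or make a stop",
    "c7": "Please keep your eyes on the road or make a stop",
    "c8": "Please make a stop",
    "c9": "Please keep your eyes on the road while talking",
}

def recommend(class_label: str):
    """Return the recommendation for a predicted class, or None if
    no prompt is warranted (safe driving or an unknown label)."""
    return RECOMMENDATIONS.get(class_label.lower())

print(recommend("C2"))  # Please use a hands-free device
```

Keying the lookup on the lower-cased label keeps it robust to whichever case the classifier emits.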
Figure 10. State Farm Distracted Driver dataset class representation [47].
Performance of the deep learning image classification models.
| Model | Training Accuracy | Validation Accuracy | Test Accuracy |
|---|---|---|---|
| ResNet50 | 0.89 | 0.88 | 0.88 |
| VGG16 | 0.94 | 0.86 | 0.87 |
| MobileNet | 0.88 | 0.84 | 0.82 |
| Inception | 0.83 | 0.84 | 0.83 |
Figure 11. Accuracy and loss graphs for the models on the training and validation sets.
Figure 12. Performance reports for the deep learning models on the test set.
Figure 13. Confusion matrices for the deep learning models.
Performance of the E2DR models on the test set.
| E2DR Model | Accuracy | Precision | Recall | F1 Score | Loss |
|---|---|---|---|---|---|
| MobileNet–Inception | 0.88 | 0.89 | 0.88 | 0.88 | 0.55 |
| ResNet50–Inception | 0.88 | 0.89 | 0.88 | 0.88 | 0.47 |
| ResNet50–MobileNet | 0.90 | 0.91 | 0.90 | 0.90 | 0.43 |
| VGG16–Inception | 0.90 | 0.91 | 0.90 | 0.90 | 0.39 |
| VGG16–MobileNet | 0.91 | 0.92 | 0.91 | 0.91 | 0.42 |
| ResNet50–VGG16 | 0.92 | 0.92 | 0.92 | 0.92 | 0.37 |
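As a quick arithmetic check, the reported F1 scores in the table above are consistent with being the harmonic mean of the reported precision and recall, rounded to two decimals:

```python
# F1 is the harmonic mean of precision and recall:
#   F1 = 2 * P * R / (P + R)
def f1(precision, recall):
    return 2 * precision * recall / (precision + recall)

# (precision, recall, reported F1) for each E2DR pair in the table.
reported = {
    "MobileNet-Inception": (0.89, 0.88, 0.88),
    "ResNet50-Inception":  (0.89, 0.88, 0.88),
    "ResNet50-MobileNet":  (0.91, 0.90, 0.90),
    "VGG16-Inception":     (0.91, 0.90, 0.90),
    "VGG16-MobileNet":     (0.92, 0.91, 0.91),
    "ResNet50-VGG16":      (0.92, 0.92, 0.92),
}
for model, (p, r, f) in reported.items():
    assert round(f1(p, r), 2) == f, model
```

Every row passes, so the table's precision, recall, and F1 columns are internally consistent.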
Figure 14. Best-performing E2DR model classification report and confusion matrix.
Figure 15. Lowest-performing E2DR model classification report and confusion matrix.
Figure 16. Accuracy of base models and E2DR models.
Figure 17. Precision, recall, and F1 score of base models and E2DR models.
Figure 18. Loss of base models and E2DR models.
Figure 19. Training time (proposed E2DR vs. baseline deep learning models).