Yaman Albadawi, Maen Takruri, Mohammed Awad.
Abstract
Continuous advancements in computing technology and artificial intelligence over the past decade have led to improvements in driver monitoring systems. Numerous experimental studies have collected real driver drowsiness data and applied various artificial intelligence algorithms and feature combinations with the goal of significantly enhancing the real-time performance of these systems. This paper presents an up-to-date review of the driver drowsiness detection systems implemented over the last decade. The paper illustrates and reviews recent systems that use different measures to track and detect drowsiness. Each system falls under one of four categories, based on the information used. Each system presented in this paper is accompanied by a detailed description of its features, classification algorithms, and datasets. In addition, these systems are evaluated in terms of final classification accuracy, sensitivity, and precision. Furthermore, the paper highlights recent challenges in the area of driver drowsiness detection, discusses the practicality and reliability of each of the four system types, and presents some future trends in the field.
Keywords: biological-based measures; driver drowsiness detection; hybrid-based measures; image-based measures; vehicle-based measures
Year: 2022 PMID: 35271215 PMCID: PMC8914892 DOI: 10.3390/s22052069
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.576
Karolinska sleepiness scale, adapted from [19].
| Scale | Verbal Description |
|---|---|
| 1 | Extremely alert |
| 2 | Very alert |
| 3 | Alert |
| 4 | Fairly alert |
| 5 | Neither alert nor sleepy |
| 6 | Some signs of sleepiness |
| 7 | Sleepy, but no effort to keep alert |
| 8 | Sleepy, some effort to keep alert |
| 9 | Very sleepy, great effort to keep alert |
Wierwille and Ellsworth drowsiness scale.
| Levels | Verbal Description |
|---|---|
| 1 | Not drowsy |
| 2 | Slightly drowsy |
| 3 | Moderately drowsy |
| 4 | Significantly drowsy |
| 5 | Extremely drowsy |
Figure 1. Driver drowsiness detection measures.
Figure 2. Driver drowsiness detection systems data flow.
Some of the image-based measures.
| Features | Description |
|---|---|
| Blink frequency [ | The number of times an eye closes over a specific period of time. |
| Maximum closure duration of the eyes [ | The maximum time the eye was closed. However, it can be risky to delay detecting an extended eye closure that indicates a drowsy driver. |
| Percentage of eyelid closure (PERCLOS) [ | The percentage of time (per minute) in which the eye is 80% closed or more. |
| Eye aspect ratio (EAR) [ | The EAR reflects the eye’s degree of openness. It remains approximately constant while the eye is open and drops toward zero when the eye closes, so a sharp drop in the EAR indicates an eye closure. |
| Yawning frequency [ | The number of times the mouth opens over a specific period of time. |
| Head pose [ | A measure that describes the driver’s head movements. It is determined by counting the video segments in which the three Euler angles of the head pose deviate substantially from their regular positions. These three angles correspond to nodding, shaking, and tilting. |
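
The EAR, PERCLOS, and blink-frequency entries above can be illustrated with a short sketch. It assumes the commonly used six-landmark EAR formulation (two vertical lid distances divided by twice the horizontal corner distance), which the table does not spell out. The landmark ordering, the function names, and the 0.2 EAR threshold are illustrative assumptions, not values taken from the reviewed systems.

```python
import math


def eye_aspect_ratio(p1, p2, p3, p4, p5, p6):
    """EAR from six (x, y) eye landmarks (assumed ordering).

    p1 and p4 are the horizontal eye corners, p2 and p3 lie on the
    upper lid, and p6 and p5 are the lower-lid points beneath them.
    """
    vertical = math.dist(p2, p6) + math.dist(p3, p5)
    horizontal = math.dist(p1, p4)
    return vertical / (2.0 * horizontal)


def perclos(closure_fractions, closed_threshold=0.8):
    """Percentage of frames in which the eye is at least 80% closed.

    closure_fractions holds one value per frame in [0, 1],
    where 0 means fully open and 1 means fully closed.
    """
    if not closure_fractions:
        raise ValueError("need at least one frame")
    closed = sum(1 for c in closure_fractions if c >= closed_threshold)
    return 100.0 * closed / len(closure_fractions)


def blink_count(ear_values, ear_threshold=0.2):
    """Count open-to-closed transitions in a per-frame EAR sequence.

    ear_threshold is a hypothetical cut-off; in practice it would
    need tuning per camera and per driver.
    """
    blinks = 0
    was_closed = False
    for ear in ear_values:
        is_closed = ear < ear_threshold
        if is_closed and not was_closed:
            blinks += 1
        was_closed = is_closed
    return blinks
```

Dividing `blink_count` by the duration of the observation window gives a blink frequency, and evaluating `perclos` over a one-minute window matches the per-minute definition in the table.
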
Image-based drowsiness detection systems.
| Ref. | Image-Based | Extracted Features | Classification Method | Description | Quality Metric | Dataset |
|---|---|---|---|---|---|---|
| [ | Mouth | Yawning | Cold and hot voxels [ | A fatigue detection method based on yawning detection using thermal imaging. The cold and hot voxels were used to detect yawning. | Accuracy: | Prepared their own dataset [ |
| [ | Respiration (using thermal camera) | Standard deviation and the mean of respiration rate, as well as the inspiration-to-expiration time ratio | SVM and KNN | Used facial thermal imaging to study the driver’s respiration and relate it to drowsiness. | Accuracy: | New thermal image dataset was prepared |
| [ | Eye | Eyelids’ curvature | Classification based on the period of eye closure | Based on the eyelid’s curvature’s concavity, the system determined if the eye is opened or closed. Then, it detected drowsiness based on the eye closure period. | Accuracy: | Dataset1: Prepared their own image dataset |
| [ | Eye | Eye state (open/closed) | Proposed optical correlation with deformed filter | Used an optical Vander Lugt correlator to precisely estimate the eye’s location in the correlator’s Fourier plane. | Different accuracies for different datasets | FEI [ |
| [ | Eye | The eyes’ EAR value | Multilayer perceptron, RF, and SVM | Tracked eye blinking duration in video streams, as an indicator of drowsiness using the EAR. Overall, the SVM showed the best performance. | Accuracy: | Prepared their own dataset |
| [ | Face and eye | PERCLOS, blink frequency, and maximum closure duration of the eyes. | KNN, SVM, logistic regression, and ANN | A nonintrusive system based on face and eye state tracking. The final results revealed that the best models were the KNN and ANN. | Accuracy: | NTHUDDD public dataset [ |
| [ | Eye | Eye closure | FD-NN, TL-VGG16, and TL-VGG19 | Applied real-time system based on the area of eye closure using CNN. For eye closure classification, three networks were introduced: FD-NN, TL-VGG16, and TL-VGG19. | Accuracy: | ZJU gallery and prepared their own dataset |
| [ | Eye | 34 eye-tracking features | RF and non-linear SVM | Used 34 eye-tracking signal features to detect drowsiness. These features were extracted from overlapping epochs of the eye signals, using different epoch lengths. The labels were extracted from EEG signals. | Accuracy: | Prepared their own dataset |
| [ | Eye and Mouth | PERCLOS, eye closing duration, and average mouth opening time | Mamdani fuzzy inference system | The state of the extracted parameters is determined through a cascade of regression tree algorithms. A Mamdani fuzzy inference system then estimates the driver state. | Accuracy: 95.5% | 300-W dataset [ |
| [ | Eye and Mouth | Eye closure and mouth openness for a duration of time | Circular Hough transform | The circular Hough transform method is applied to check whether the mouth is open or iris is detected. Based on these two measures, the driver’s state is determined. | Accuracy: 94% | Prepared their own dataset |
| [ | Eye and Head | Frequency of eye blinking and frequency of head tilting | Template matching to detect the eyes, then calculating the frequency of head tilting and eye blinking to detect the drowsiness level | By calculating the frequency of head tilting and eye blinking, the drowsiness level is determined on a scale of 0–100. If the drowsiness level reaches 100, a loud audible warning is triggered. | Accuracy: 99.59% | Prepared their own dataset |
| [ | Mouth and Eye | Proportion of closed-eye frames to the total number of frames in 1 min, continuous eye-closure time, blinking frequency, and number of yawns in 1 min | For face tracking: multiple CNNs-kernelized correlation filters method | The multiple CNNs-kernelized correlation filters method is used for face tracking and to extract the image-based parameters. If found drowsy, the driver is alerted. | Accuracy: 92% | CelebA dataset [ |
| [ | Facial, hand, Behavioral (head, eyes, or mouth movements) | Facial expression, behavioral features, head gestures, and hand gestures | SoftMax classifier | This system introduced an architecture that uses four deep learning models to extract four different types of features. | Accuracy: 85% | NTHUDDD public dataset [ |
| [ | Eye, head, and mouth | Eye closure duration, head nodding, and yawning | A two-stream CNN | Used multi-task cascaded CNNs to find the positions of the mouth and eyes. Then, it extracted the static and dynamic features from a partial facial image and partial facial optical flow, respectively. Lastly, it combined the features to classify the image data. | Accuracy: 97.06% | NTHUDDD public dataset [ |
| [ | Eye, mouth, head, and scene conditions | Facial changes in eye, mouth, and head, illumination condition of driving, and wearing glasses | 3D-deep CNN | The framework contained four models to recognize the drivers’ alertness status, using the condition-adaptive representation. | Accuracy: 76.2% | NTHUDDD public dataset [ |
| [ | Eye, head, and mouth | Blinking rate, head-nodding, and yawning frequency | Fisher score for feature selection and non-linear SVM for classification | The system is based on a hand-crafted compact face texture descriptor that can capture the most discriminant drowsy features. | Accuracy: 79.84% | NTHUDDD public dataset [ |
| [ | Facial features | Face feature vectors | SVM | Used facial motion information entropy, extracted from real-time videos. The algorithm contained four modules. | Accuracy: 94.32% | YawDD dataset [ |
| [ | Facial features, head movements | Implicitly decides the important features like eye closure, mouth position, chin or brow raises, frowning, and nose wrinkles | 3D CNN | DDD was performed, based on activity prediction, through a depth-wise separable 3D CNN, using real-time face video. An advantage of this method was that it implicitly decided the important features, rather than pre-specifying a set of features beforehand. | Accuracy: 73.9% | NTHUDDD public dataset [ |
| [ | Eye and mouth | Temporal facial feature vectors formed from spatial features | LSTM | A method that applied real-time DDD, based on a combination of CNN and LSTM. It consisted of two parts: spatial and temporal. | Accuracy: 84.85% | NTHUDDD public dataset [ |
| [ | Eye, head, and mouth | Yawning, eye closure, and head nodding | Multi-layer model-based 3D convolutional networks | Used a recurrent neural network (RNN)-based architecture, called multi-layer model-based 3D convolutional networks, to detect fatigue. | Accuracy: 97.3% | NTHUDDD public dataset [ |
| [ | Eye and mouth | PERCLOS and mouth opening degree | Eye and mouth CNN | Applied face detection and feature points location, using multi-task cascaded CNNs architecture and EM-CNN to detect the mouth and eye state from the ROI. | Accuracy: 93.62% | Driving images dataset from Biteda company |
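
The quality metrics this review uses to compare systems (accuracy, sensitivity, and precision) all follow from a binary confusion matrix. The sketch below uses the standard definitions and treats "drowsy" as the positive class; the function name and the example counts are illustrative assumptions and are not results from any of the listed studies.

```python
def classification_metrics(tp, fp, tn, fn):
    """Accuracy, sensitivity, and precision from confusion-matrix counts.

    tp: drowsy samples correctly flagged as drowsy
    fp: alert samples wrongly flagged as drowsy
    tn: alert samples correctly left unflagged
    fn: drowsy samples that were missed
    """
    total = tp + fp + tn + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    accuracy = (tp + tn) / total
    # Sensitivity (recall): share of truly drowsy samples that were detected.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    # Precision: share of drowsy alarms that were correct.
    precision = tp / (tp + fp) if (tp + fp) else float("nan")
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "precision": precision,
    }


# Made-up counts, for illustration only.
metrics = classification_metrics(tp=8, fp=2, tn=85, fn=5)
```

With these made-up counts the accuracy is high even though over a third of the drowsy samples are missed, which is one reason sensitivity and precision are reported alongside accuracy when drowsy samples are rare.
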
Some biological-based measures.
| Biological Signals | Description |
|---|---|
| Electroencephalography (EEG) [ | EEG is a monitoring method that records the brain’s electrical activity from the scalp. The signal represents the macroscopic activity of the brain’s surface layer underneath the scalp. Based on their frequency ranges (0.1 Hz–100 Hz), EEG signals are categorized as delta, theta, alpha, beta, and gamma. |
| Electrocardiography (ECG) [ | ECG signals represent the electrical activity of the heart and are acquired using electrodes placed on the skin. ECG monitors heart function, including heart rhythm and rate. |
| Photoplethysmography (PPG) [ | PPG signals are used to detect blood volume changes. They are measured at the skin’s surface using a pulse oximeter and are often used for heart rate monitoring. |
| Heart rate variability (HRV) [ | HRV signals are used to monitor changes in the cardiac cycle, in particular the variation in the time between consecutive heartbeats. |
| Electrooculography (EOG) [ | EOG signals are used to measure the corneo-retinal standing potential between the front and back of the human eye and to record eye movements. |
| Electromyography (EMG) [ | EMG signals are the collective electrical signals produced by muscle movement. |
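
The EEG row above names five frequency bands without giving their boundaries. The following minimal sketch maps a frequency to its band. The cut-offs are conventional approximations that vary between studies and are assumptions here; only the overall 0.1 Hz–100 Hz range comes from the table.

```python
# Assumed, conventional band edges in Hz as (name, low, high).
# Exact boundaries differ between publications.
EEG_BANDS = [
    ("delta", 0.1, 4.0),
    ("theta", 4.0, 8.0),
    ("alpha", 8.0, 13.0),
    ("beta", 13.0, 30.0),
    ("gamma", 30.0, 100.0),
]


def eeg_band(freq_hz):
    """Return the band name for a frequency, or None if out of range.

    Each band includes its lower edge and excludes its upper edge,
    except that the final upper edge (100 Hz) is included.
    """
    for name, low, high in EEG_BANDS:
        if low <= freq_hz < high:
            return name
    if freq_hz == EEG_BANDS[-1][2]:
        return EEG_BANDS[-1][0]
    return None
```

For example, `eeg_band(10.0)` falls in the assumed alpha range, while a frequency above 100 Hz returns `None`.
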
Biological-based drowsiness detection systems.
| Ref. | Biological Parameters | Sensors | Extracted Features | Classification Method | Description | Quality Metric | Dataset |
|---|---|---|---|---|---|---|---|
| [ | Brain activity | Bluetooth-enabled EEG headband and a commercial smartwatch | Relative EEG | SVM-based posterior probabilistic model | A real-time system used an SVM-based posterior probabilistic model to detect and classify drowsiness into three levels. | Accuracy: | Prepared their own dataset |
| [ | Brain activity | EEG | IMF of the EEG signal | ANN | Detection was based on the extraction of the IMFs from the EEG signal by applying the EMD method. | Accuracy: 88.2% | Prepared their own dataset |
| [ | EEG signals and EEG spectrogram images | EEG Sensors | Energy distribution and zero-crossing distribution of the raw EEG signals, in-depth features of the EEG spectrogram, etc. | LSTM network | EEG-based drowsiness detection method. It used pre-trained AlexNet and VGG16 models to extract in-depth features from the EEG spectrogram images. | Accuracy: 94.31% | MIT/BIH polysomnographic EEG database [ |
| [ | EEG | EEG Sensors | The first quartile, median, range, and energy of the Hermite coefficients | ELM decision tree, KNN, least | Detection was based on an adaptive Hermite decomposition of the EEG signals. The Hermite functions were employed as basis functions. | Accuracy: | MIT/BIH polysomnographic database [ |
| [ | EEG | Standard wet-electrode EEG and a cap-type dry-electrode EEG | Multi-taper power spectral density | Extreme gradient boosting classifier | A framework for detecting instantaneous drowsiness from a 2-s EEG segment. It was implemented on both wireless and wired EEG to show its applicability in a mobile environment. | Accuracy: | Prepared their own dataset |
| [ | EEG | EEG sensors | F1–F9, extracted from | Extra trees classifier | Employed wavelet packet transform to extract the time domain features from a single-channel EEG signal. Eleven classifiers were tested in this work. The extra trees classifier had the best results. | Accuracy, sensitivity, and precision: | Dataset1: Fpz-Cz channel dataset [ |
| [ | EEG | EEG Sensors | Tsallis entropy, Renyi entropy, permutation entropy, log energy entropy, and Shannon entropy | Ensemble boosted tree classifier | Used AVMD to analyze and synthesize the EEG signals. By applying statistical analysis, five entropy-based features were selected. Ten classifiers were used, and the ensemble boosted tree classifier achieved the highest accuracy. | Accuracy: | MIT/BIH polysomnographic dataset [ |
| [ | Heart rate and blood volume changes | ECG and PPG | Features obtained from Bin-RP, Cont-RP, and ReLU-RP patterns | CNN | Used wearable ECG/PPG sensors to track the different patterns in HRV signals in a simulation environment and used CNN. | Best accuracy, sensitivity, and precision: | Prepared their own dataset |
| [ | Heart rate | PPG | Frequency measurements (HF, LF, and HF/LF) extracted from PPG signals | Differentiating between two (HF, LF, and HF/LF) patterns | Detection is done by analyzing changes in the frequency measurements (HF, LF, and HF/LF) of PPG signals obtained from the fingers and earlobes. | Accuracy: 8/9 ≈ 88.9% | Prepared their own dataset |
| [ | Heart rate | Wrist-worn wearable sensor | HRV and activity of the autonomic nervous system | Random Tree, RF, KNN, SVM, Decision Stump, etc. | Detection was based on the physiological data extracted from a wrist-worn wearable sensor and an ECG sensor. Multiple ML algorithms for binary classification were used. | The highest accuracy was more than 92% for the KNN algorithm | Prepared their own dataset |
| [ | HRV | ECG electrodes | MeanNN, SDNN, RMSSD, TP, NN50, LF, HF, and LF/HF | Multivariate statistical process control | Detection was based on HRV analysis. Eight HRV features were monitored to detect the changes in HRV using the multivariate statistical process control anomaly detection method. The algorithm was validated by comparing its results with EEG-based sleep scoring. | Accuracy: 92% | Prepared their own dataset |
| [ | Respiration | Three respiratory inductive plethysmography sensors | RRV and quality of the respiratory signals | Thoracic effort-derived drowsiness | An algorithm for DDD, based on the respiratory signal variations. It combined the analysis of the RRV and the quality level of the respiratory signals to detect the changes in the driver’s alertness status. | Sensitivity: 90.3% | Prepared their own dataset |
| [ | ECG and EMG | Disposable Ag–AgCl electrodes | Features extracted from the bispectrum of the signals H1, H2, and H3 | Linear discriminant analysis, quadratic discriminant analysis, and KNN classifiers | Detects hypovigilance caused by drowsiness and inattention using ECG and EMG signals. The gathered physiological signals from the experiments were first pre-processed. Then, multiple higher-order spectral features were extracted to be classified. | Accuracy and Sensitivity: | Prepared their own dataset |
| [ | ECG and EMG | Two pieces of conductive knit fabric | EMG peak factor and maximum of the cross-relation curve of ECG and EMG | Discriminant criterion using Mahalanobis distance | A noncontact onboard DDD system studied the EMG and ECG signals changes during driving. Feature selection was applied using the Kolmogorov–Smirnov Z test. | Accuracy: 86%. | Prepared their own dataset |
| [ | ECG and EEG | Enobio-20 channel device | EEG signals: time-domain statistical descriptors, complexity measures, and power spectral measures; ECG signals: HR and HRV’s LF, HF, and LF/HF ratio | SVM | Combined ECG and EEG features to detect drowsiness. After feature extraction, a paired t-test was used to select only the significant features. | Accuracy: 80.9% | Prepared their own dataset |
| [ | EEG, EOG, ECG | EEG, ECG, and EOG electrodes | EEG features from the temporal, frontal, and occipital channels | Linear discriminant analysis, linear SVM, kernel SVM, and KNN | The fuzzy mutual information-based wavelet packet transform method extracted the features. The features were dimensionally reduced, using spectral regression and kernel-based spectral regression methods. After that, four classifiers were applied. | Accuracy: | Prepared their own dataset |
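
Several of the HRV features named in the table above (MeanNN, SDNN, RMSSD, and NN50) are simple time-domain statistics of the intervals between successive normal heartbeats. The sketch below uses their standard definitions. The function name and input format (NN intervals in milliseconds) are assumptions, and the frequency-domain features (TP, LF, HF, LF/HF) are left out because they require spectral estimation.

```python
import math
import statistics


def hrv_time_domain(nn_intervals_ms):
    """Time-domain HRV features from NN intervals given in milliseconds."""
    if len(nn_intervals_ms) < 2:
        raise ValueError("need at least two NN intervals")
    # Differences between each interval and the one before it.
    diffs = [b - a for a, b in zip(nn_intervals_ms, nn_intervals_ms[1:])]
    return {
        # Mean NN interval.
        "mean_nn": statistics.mean(nn_intervals_ms),
        # Sample standard deviation of the NN intervals.
        "sdnn": statistics.stdev(nn_intervals_ms),
        # Root mean square of successive differences.
        "rmssd": math.sqrt(statistics.mean(d * d for d in diffs)),
        # Number of successive differences larger than 50 ms.
        "nn50": sum(1 for d in diffs if abs(d) > 50),
    }


# Made-up intervals, for illustration only.
features = hrv_time_domain([800, 810, 790, 850, 845])
```

A monitoring scheme such as the multivariate statistical process control approach in the table would track features like these over a sliding window. That windowing is not shown here.
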
Vehicle-based drowsiness detection systems.
| Ref. | Vehicle | Extracted Features | Classification Method | Description | Quality Metric | Dataset |
|---|---|---|---|---|---|---|
| [ | Steering wheel | SWA | RF | Used SWA as input data and compared it with PERCLOS. The RF algorithm was trained by a series of decision trees, with a randomly selected feature. | Accuracy: RF- steering model: | Prepared their own dataset |
| [ | Lateral distance | Statistical features, derived from the time and wavelet domains, relevant to the lateral distance and lane trajectory | SVM and neural network | Detection was based on lateral distance. Additionally, it collects data of the driver’s facial and head movements to be used as ground truth for the vehicle data. | Accuracy: | Prepared their own dataset |
| [ | Steering wheel | SWA | Specially designed binary decision classifier | Used SWA data to apply online fatigue detection. The alertness state is determined using a specially designed classifier. | Accuracy: Drowsy: 84.85% | Prepared their own dataset |
| [ | Steering wheel | SWA, steering wheel velocity | ANFIS for feature selection, PSO for optimizing the ANFIS parameters, and | Detection was based on steering wheel data. The system used a selection method that utilized ANFIS. | Accuracy: 98.12% | Prepared their own dataset |
| [ | Steering wheel | SW_Range_2, Amp_D2_Theta, PNS, and NMRHOLD | MOL, SVM, and BPNN | Used steering wheel status data. Using variance analysis, four parameters were selected, based on the correlation level with the driver’s status. MOL model performed best. | Accuracy: | Prepared their own dataset |
Hybrid-based drowsiness detection systems.
| Ref. | Sensors | Hybrid | Extracted Features | Classification Method | Description | Quality Metric | Dataset |
|---|---|---|---|---|---|---|---|
| [ | Automatic gearbox, image-generating computers, and control-loaded steering system | Image- and vehicle-based features | Lateral position, yaw angle, speed, steering angle, driver’s input torque, eyelid opening degree, etc. | A series of mathematical operations, specified schemes from the study hypothesis | A system that assists the driver when drowsiness is detected, to prevent lane departure. It gives the driver a specific amount of time to regain control of the car. If the driver does not, the system takes control of the vehicle and parks it. | Accuracies up to 100% in taking control of the car when the specified driving conditions were met | Prepared their own dataset |
| [ | PPG sensor, accelerometer, and gyroscope | Biological- and vehicle-based features | Heart rate, stress level, respiratory rate, adjustment counter, pulse rate variability, and the steering wheel’s linear acceleration and radian speed | SVM | Data were collected from the sensors. Then, the features were extracted and fed to the SVM algorithm. If the driver is determined to be drowsy, they are alerted via the watch’s alarm. | Accuracy: 98.3% | Prepared their own dataset |
| [ | Smartphone camera | Biological- and image-based features | Blood volume pulse, blinking duration and frequency, HRV, and yawning frequency | If any of the detected parameters showed a specific change/value | Used a multichannel second-order blind identification based on the extended-PPG in a smartphone to extract blood volume pulse, yawning, and blinking signals. | Sensitivity: Up to 94% | Prepared their own dataset |
| [ | Headband, equipped with EEG electrodes, accelerometer, and | Biological- and behavioral-based features | Eyeblink patterns analysis, head movement angle, and magnitude, and spectral power analysis | Backward feature selection method applied followed by various classifiers | Used a non-invasive and wearable headband that contains three sensors. This system combines the features extracted from the head movement analysis, eye blinking, and spectral signals. The features are then fed to a feature selection block followed by various classification methods. Linear SVM performed the best. | Accuracy, sensitivity, and precision: | Prepared their own dataset |
| [ | SCANeR Studio, faceLAB, electrocardiogram, PPG sensor, | Biological-, image-, and vehicle-based features | Heart rate and variability, respiration rate, blink duration, frequency, | ANN | Included two models that used ANN. One is for detecting the drowsiness degree, and the other is for predicting the time needed to reach a specific drowsiness level. Different combinations of the features were tested. | Overall mean square error of 0.22 for predicting various drowsiness levels | Prepared their own dataset |
| [ | EEG, EOG, ECG | Biological-based features and NIRS | Heart rate, alpha and beta bands power, blinking rate, and eye closure duration | Fisher’s linear discriminant analysis method | A new approach that combined EEG and NIRS to detect driver drowsiness. The most informative parameters were the frontal beta band and the oxygenation. As for classification, Fisher’s linear discriminant analysis method was used. Additionally, time series analysis was employed to predict drowsiness. | Accuracy: 79.2% | MIT/BIH polysomnographic database [ |
| [ | Multi-channel amplifier with active electrodes, projection screen, and touch screen | Biological-based features and contextual information | EEG signal: power spectra, five frequency characteristics, and four power ratios; EOG signal: blinking duration and PERCLOS; contextual information: the driving conditions (lighting condition and driving environment) and sleep/wake predictor value. | KNN, SVM, case-based reasoning, and RF | Used EOG, EEG, and contextual information. The scheme contained five sub-modules. Overall, the SVM classifier showed the best performance. | Accuracy: | Prepared their own data |
| [ | Smartphone | Image-based features, as well as voice and touch information | PERCLOS, vocal data, touch response data | Linear SVM | Utilized a smartphone for DDD. The system used three verification stages in the process of detection. If drowsiness is verified, an alarm will be initiated. | Accuracy: 93.33% | Prepared their own dataset called ‘Invedrifac’ [ |
| [ | Driving simulator and monitoring system | Biological-, image-, and vehicle-based features | 80 features were extracted: PERCLOS, SWA, LF/HF, etc. | RF and majority voting (logistic regression, SVM, KNN) classifiers | Vehicle-based, physiological, and behavioral signs were used in this system. Two ways for labeling the driver’s drowsiness state were used, slightly drowsy and moderately drowsy. | Accuracy, sensitivity, and precision: | Prepared their own dataset |
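
The last row above combines logistic regression, SVM, and KNN through majority voting. A minimal sketch of hard majority voting over per-classifier labels is shown below. The function name, the label strings, and the choice to reject ties are assumptions made for illustration and are not details taken from the cited study.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label predicted by most classifiers.

    predictions: one label per classifier for a single sample.
    Raises ValueError on an empty input or a tie, so the caller must
    decide how ties are resolved (an assumption of this sketch).
    """
    if not predictions:
        raise ValueError("need at least one prediction")
    ranked = Counter(predictions).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        raise ValueError("tie between labels")
    return ranked[0][0]


# Hypothetical outputs of three classifiers for one sample.
label = majority_vote(["drowsy", "alert", "drowsy"])
```

With three voters and two classes, as in the row above, a tie cannot occur. The explicit tie check only matters for an even number of voters or for more than two drowsiness levels.
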
DDD systems challenges.
| Challenge | Image-Based | Biological-Based | Vehicle-Based |
|---|---|---|---|
| Difficulty in extracting drowsiness signs, due to facial characteristics/skin color | High | N/A | N/A |
| Difficulty in extracting drowsiness signs, due to objects that cover the face | High | N/A | N/A |
| Driver’s posture and distance from the dashboard | High | Low | N/A |
| Real-time video analysis | Medium | N/A | N/A |
| Driver movement | High | High | N/A |
| Noisy sensor measurements | Low | High | Low |
| Monitoring equipment and sensors inconvenience | Low | Medium | Low |
| Influence of environmental conditions (weather/illumination) | High | Low | Medium |
| Influence of the road conditions and geometry | Low | Low | High |
| Hardware complexity and limitations | Low | High | Low |
| Drowsiness signs extraction precision | Low | Low | High |
| Testing under real (not simulated) driving conditions | Medium | Medium | Medium |
DDD systems comparison, based on practicality.
| DDD System | Intrusiveness | Invasiveness | Reported Accuracies in DDD Literature | Cost | Ease of Use |
|---|---|---|---|---|---|
| Image-based systems | Non-intrusive | Non-invasive | High accuracy, between 72.25% and 99.59% | Generally low-cost | Automatic; no setup or user intervention required |
| Biological-based systems | Depends on the hardware and method used | Depends on the hardware and method used | High accuracy, between 70% and 97.19% | Expensive when high-quality sensors are used | May require setup, user intervention, or wearing sensors |
| Vehicle-based systems | Non-intrusive | Non-invasive | Low accuracy, as low as 62.1% | Mostly comes as an expensive car accessory | Automatic; no setup or user intervention required |
| Hybrid-based systems | Depends on the hardware and method used | Depends on the hardware and method used | High accuracy, between 79% and 99% | Cost depends on the hardware used | May require setup, user intervention, or wearing sensors |