Yi Hu1, Zhuo Liu2, Aiqin Hou2, Chase Wu3, Wenbin Wei4, Yanjun Wang1, Min Liu5.
Abstract
Fatigue detection for air traffic controllers is an important yet challenging problem in aviation safety research. Most existing methods for this problem are based on facial features alone. In this paper, we propose an ensemble learning model that combines both facial and voice features and design a fatigue detection method based on multifeature fusion, referred to as Facial and Voice Stacking (FV-Stacking). Specifically, for facial features, we first use the OpenCV and Dlib libraries to extract the mouth and eye regions and then employ a combination of M-CNN and E-CNN (convolutional neural networks for the mouth and eye regions, respectively) to determine the closure state of the mouth and eyes based on five features: blinking times, average blinking time, average blinking interval, Percentage of Eyelid Closure over the Pupil over Time (PERCLOS), and Frequency of Open Mouth (FOM). For voice features, we extract the Mel-Frequency Cepstral Coefficients (MFCC) of speech. These facial and voice features are fused through a carefully designed stacking model for fatigue detection. Real-life experiments are conducted on 14 air traffic controllers at the Southwest Air Traffic Management Bureau of Civil Aviation of China. The results show that the proposed FV-Stacking method achieves a detection accuracy of 97%, whereas the best accuracy achieved by a single model is 92% and the best accuracy achieved by state-of-the-art detection methods is 88%.
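The fusion strategy described above can be sketched with scikit-learn's `StackingClassifier`. This is a minimal, hedged illustration of the stacking idea only: the feature dimensions, the synthetic data, the fatigue labels, and the choice of base learners (LR, SVM, KNN, as in the comparison table below) are assumptions for demonstration, not the paper's exact configuration.

```python
# Sketch of feature-level fusion + stacking, assuming 5 facial features
# (e.g., blink count, PERCLOS, FOM) and 13 MFCC voice coefficients.
# Data and labels here are synthetic; only the pipeline shape is illustrative.
import numpy as np
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n = 200
facial = rng.normal(size=(n, 5))    # assumed 5 facial features per sample
voice = rng.normal(size=(n, 13))    # assumed 13 MFCC coefficients per sample
X = np.hstack([facial, voice])      # simple feature-level fusion
y = (facial[:, 0] + voice[:, 0] > 0).astype(int)  # synthetic fatigue label

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Base learners feed a logistic-regression meta-learner (the "stacking" step).
base = [
    ("lr", LogisticRegression(max_iter=1000)),
    ("svm", SVC(probability=True)),
    ("knn", KNeighborsClassifier()),
]
stack = StackingClassifier(
    estimators=base,
    final_estimator=LogisticRegression(max_iter=1000),
)
stack.fit(X_tr, y_tr)
print(round(stack.score(X_te, y_te), 2))
```

In practice the facial branch would feed CNN-derived closure states rather than raw numbers, but the meta-learner structure is the same.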
Year: 2022 PMID: 36267308 PMCID: PMC9578874 DOI: 10.1155/2022/4911005
Source DB: PubMed Journal: Comput Math Methods Med ISSN: 1748-670X Impact factor: 2.809
Figure 1. FV-Stacking architecture.
Figure 2. LSTM structure.
Figure 3. CNN structure.
Figure 4. The location of 68 feature points on a face.
Figure 5. Eye identification.
Figure 6. E-CNN structure.
Figure 7. Mouth identification.
Figure 8. M-CNN structure.
Figure 9. The process for obtaining the closure state of the mouth and eyes.
Figure 10. MFCC feature extraction.
Figure 11. A schematic block diagram.
Figure 12. AUC based on ROC.
Performance of the E-CNN and M-CNN models.

| Model | Recall | Precision | Accuracy | F1-score | AUC |
|---|---|---|---|---|---|
| E-CNN | 98% | 98% | 98% | 98% | 0.99 |
| M-CNN | 97% | 98% | 97% | 97% | 0.99 |
Figure 13. Performance comparison between different models.
Performance comparison between different methods.

| Metric | Our method | Zhang et al.: LR | Zhang et al.: SVM | Zhang et al.: KNN | Nie et al.: LR | Nie et al.: SVM | Nie et al.: KNN | Gu et al.: LR | Gu et al.: SVM | Gu et al.: KNN |
|---|---|---|---|---|---|---|---|---|---|---|
| Precision | 97% | 85% | 85% | 88% | 85% | 86% | 89% | 83% | 85% | 73% |
| Accuracy | 97% | 84% | 84% | 88% | 85% | 85% | 88% | 82% | 82% | 71% |
| Recall | 97% | 86% | 86% | 89% | 85% | 85% | 89% | 83% | 85% | 70% |
| F1-score | 97% | 85% | 85% | 88% | 85% | 85% | 89% | 83% | 83% | 70% |
| AUC | 0.99 | 0.93 | 0.93 | 0.92 | 0.93 | 0.93 | 0.92 | 0.91 | 0.91 | 0.83 |