Jaewon Lee, Hyeonjeong Lee, and Miyoung Shin.
Abstract
Mental stress can lead to traffic accidents by reducing a driver's concentration or increasing fatigue while driving. In recent years, demand for methods that detect drivers' stress in advance to prevent dangerous situations has increased. Thus, we propose a novel method for detecting driving stress using nonlinear representations of short-term (30 s or less) physiological signals with multimodal convolutional neural networks (CNNs). Specifically, from short-term hand/foot galvanic skin response (HGSR, FGSR) and heart rate (HR) input signals, we first generate corresponding two-dimensional nonlinear representations called continuous recurrence plots (Cont-RPs). Second, from the Cont-RPs, we use multimodal CNNs to automatically extract representative features of the FGSR, HGSR, and HR signals that can effectively differentiate between stressed and relaxed states. Lastly, we concatenate the three extracted features into one integrated representation vector, which we feed to a fully connected layer to perform classification. For evaluation, we use a public stress dataset collected in actual driving environments. Experimental results show that the proposed method achieves superior performance for 30-s signals, with an overall accuracy of 95.67%, an improvement of approximately 2.5-3% over previous works. Additionally, for 10-s signals, the proposed method achieves 92.33% classification accuracy, which is similar to or better than the performance of other methods using long-term signals (over 100 s).
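The Cont-RP step described in the abstract can be sketched as an unthresholded recurrence plot computed over a time-delay embedding of a 1-D signal; a minimal sketch, where `dim` and `tau` are illustrative embedding parameters rather than the paper's settings:

```python
import numpy as np

def cont_recurrence_plot(signal, dim=3, tau=1):
    """Continuous recurrence plot (sketch): pairwise Euclidean distances
    between time-delay embedded states, kept as continuous values instead
    of being thresholded into a binary recurrence plot."""
    signal = np.asarray(signal, dtype=float)
    n = len(signal) - (dim - 1) * tau
    # Time-delay embedding: each row is one reconstructed state vector.
    states = np.array([signal[i : i + dim * tau : tau] for i in range(n)])
    # Distance between every pair of states; no thresholding is applied.
    diff = states[:, None, :] - states[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

# A toy periodic signal stands in for a 30-s physiological segment.
rp = cont_recurrence_plot(np.sin(np.linspace(0, 8 * np.pi, 120)))
```

The resulting matrix is symmetric with a zero diagonal and can be rendered as a grayscale image, which is the 2-D form the CNN branches consume.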
Keywords: convolutional neural network (CNN); deep learning; galvanic skin response (GSR); heart rate (HR); physiological signals; recurrence plot (RP); stress detection
Year: 2021 PMID: 33808147 PMCID: PMC8038071 DOI: 10.3390/s21072381
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.576
Types of features often used in stress recognition studies using physiological signals.
| Feature Domain | Physiological Signals | Feature Examples | Study |
|---|---|---|---|
| Time | GSR, ECG, HR, | Mean, Median, SD, RMS, Skewness, Kurtosis, | [ |
| Frequency | GSR, ECG, RSP | Entropy, Power spectrum density, Power sum, The average power, LF, HF, Ratio of LF/HF, Spectral peak features | [ |
| Domain-dependent | GSR, ECG, RSP, EMG | Mean HP, Variation in HP, Variation in GSR, Differential area between GSR and its first-order interpolation, Product between RMS and SDCC, Trend-based feature generation | [ |
| Nonlinear | ECG | RP, RQA, Poincare plot | [ |
GSR: galvanic skin response; ECG: electrocardiogram; HR: heart rate; ST: skin temperature; BR: breath-flow rate; SpO2: oximetry; BVP: blood volume pulse; RSP: respiration; EMG: electromyogram; SD: standard deviation; RMS: root mean square; LF: low frequency; HF: high frequency; HP: heart period; SDCC: standard deviation of the frequencies; RP: recurrence plot; RQA: recurrence quantification analysis.
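Several of the time-domain features in the table above reduce to simple moment statistics; a minimal sketch (the moment-based skewness and excess-kurtosis definitions are one common convention, not necessarily the one each cited study used):

```python
import numpy as np

def time_domain_features(x):
    """A few of the time-domain features listed above (sketch)."""
    x = np.asarray(x, dtype=float)
    mu, sd = x.mean(), x.std()
    z = (x - mu) / sd  # standardized samples
    return {
        "mean": mu,
        "median": np.median(x),
        "sd": sd,
        "rms": np.sqrt(np.mean(x ** 2)),
        "skewness": np.mean(z ** 3),        # third standardized moment
        "kurtosis": np.mean(z ** 4) - 3.0,  # excess kurtosis
    }

# Illustrative values standing in for a short GSR segment.
feats = time_domain_features([1.0, 2.0, 2.0, 3.0, 10.0])
```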
Figure 1 Overview of the proposed multimodal CNN approach using FGSR, HGSR, and HR signals for stress class prediction.
Figure 2 Example of multiple physiological signals within one recording, segmented based on different road conditions.
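Cutting a labeled recording into the short-term segments the method operates on is a fixed-length windowing step; a minimal sketch, where the 15.5 Hz sampling rate and non-overlapping windows are assumptions for illustration:

```python
import numpy as np

def segment(signal, fs, win_s, step_s=None):
    """Split a 1-D signal into fixed-length windows (sketch).
    fs: sampling rate in Hz; win_s: window length in seconds;
    step_s: hop between window starts (defaults to non-overlapping)."""
    step_s = win_s if step_s is None else step_s
    win, step = int(win_s * fs), int(step_s * fs)
    return np.array([signal[i : i + win]
                     for i in range(0, len(signal) - win + 1, step)])

# e.g. a 5-minute trace cut into 30-s segments at an assumed 15.5 Hz rate
x = np.zeros(int(300 * 15.5))
segs = segment(x, fs=15.5, win_s=30)
```

Each window would then be converted to its Cont-RP and labeled according to the road condition (rest vs. highway/city) it falls in.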
Excluded recordings in our paper and the reasons.
| Excluded Recording | Reason |
|---|---|
| drive 01 | Marker signal is missing. |
| drive 02 | HGSR signal is missing. |
| drive 03 | Marker and HR signals are missing. |
| drive 04 | Marker signal is not clear. |
| drive 05 | HR signal is missing. |
| drive 13 | HGSR signal is missing. |
| drive 14 | HR signal is missing. |
| drive 17 | Marker signal is missing. |
Mean and standard deviation of the three physiological signals for the nine recordings used in the experiment.
| Recording | FGSR Rest | FGSR Highway | FGSR City | HGSR Rest | HGSR Highway | HGSR City | HR Rest | HR Highway | HR City |
|---|---|---|---|---|---|---|---|---|---|
| drive 06 | 7.42 ± 1.80 | 7.25 ± 1.22 | 10.29 ± 2.64 | 18.36 ± 1.32 | 16.19 ± 1.77 | 19.36 ± 1.91 | 80.24 ± 9.35 | 88.31 ± 10.50 | 99.75 ± 13.19 |
| drive 07 | 9.21 ± 3.36 | 12.76 ± 1.16 | 12.81 ± 1.72 | 5.46 ± 1.71 | 6.76 ± 1.17 | 7.75 ± 1.20 | 70.9 ± 8.41 | 73.44 ± 5.55 | 78.22 ± 7.60 |
| drive 08 | 2.89 ± 0.93 | 6.44 ± 0.90 | 6.80 ± 1.19 | 3.21 ± 0.67 | 5.45 ± 0.97 | 6.03 ± 1.54 | 63.65 ± 12.53 | 66.49 ± 11.04 | 74.87 ± 24.93 |
| drive 09 | 3.55 ± 1.70 | 5.12 ± 0.99 | 5.27 ± 1.10 | 4.40 ± 2.39 | 5.66 ± 1.35 | 6.60 ± 1.69 | 71.24 ± 15.33 | 73.36 ± 18.20 | 74.03 ± 15.36 |
| drive 10 | 4.62 ± 3.23 | 6.96 ± 2.12 | 9.66 ± 2.23 | 6.98 ± 4.05 | 6.44 ± 1.75 | 9.32 ± 2.60 | 75.35 ± 10.60 | 77.66 ± 7.92 | 83.73 ± 12.99 |
| drive 11 | 3.24 ± 0.89 | 5.61 ± 0.86 | 6.23 ± 1.28 | 3.53 ± 1.21 | 7.32 ± 1.36 | 8.52 ± 1.94 | 60.64 ± 9.53 | 71.42 ± 21.00 | 75.54 ± 23.85 |
| drive 12 | 3.32 ± 2.99 | 4.07 ± 1.27 | 5.35 ± 3.40 | 7.67 ± 2.70 | 15.44 ± 2.21 | 15.53 ± 2.00 | 78.72 ± 4.57 | 87.59 ± 4.06 | 88.44 ± 6.32 |
| drive 15 | 4.35 ± 1.38 | 6.84 ± 0.80 | 7.69 ± 1.37 | 4.55 ± 1.01 | 6.67 ± 1.25 | 7.77 ± 1.86 | 69.83 ± 24.91 | 67.98 ± 11.01 | 72.36 ± 14.48 |
| drive 16 | 3.74 ± 0.91 | 5.71 ± 0.74 | 6.90 ± 1.31 | 16.09 ± 1.84 | 20.10 ± 1.07 | 21.21 ± 2.11 | 89.16 ± 10.30 | 101.9 ± 12.65 | 106.1 ± 17.57 |
Figure 3 Examples of Cont-RPs for short-term (30 s) FGSR, HGSR, and HR signals.
Figure 4 Examples of Cont-RPs for short-term (10 s) FGSR, HGSR, and HR signals.
Figure 5 Detailed configuration of the proposed multimodal CNN model for feature learning and stress classification.
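The fusion scheme behind Figure 5 (one feature extractor per modality, concatenation, then a fully connected classifier) can be sketched in miniature; here each CNN branch is stood in for by a linear map with ReLU, and all shapes and weights are illustrative, not the paper's architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

def branch_features(x, w):
    """Stand-in for one per-modality CNN branch: a linear map + ReLU
    (the real model learns convolutional features from each Cont-RP)."""
    return np.maximum(x @ w, 0.0)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# Hypothetical per-modality feature inputs (16-dim each, illustrative).
fgsr, hgsr, hr = rng.normal(size=(3, 16))
w = rng.normal(size=(16, 8))

# Concatenate the three branch outputs into one joint representation.
fused = np.concatenate([branch_features(fgsr, w),
                        branch_features(hgsr, w),
                        branch_features(hr, w)])  # 24-dim vector

# Fully connected layer + softmax over {stressed, relaxed}.
w_fc = rng.normal(size=(24, 2))
probs = softmax(fused @ w_fc)
```

The point of the sketch is the data flow: three modality-specific representations are learned separately and only merged before the final classifier.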
Classification performance of proposed method based on input signal length.
| Input Length | Class | Precision (PPV) | Recall (Sensitivity) | F1-Score | Overall Accuracy | AUC |
|---|---|---|---|---|---|---|
| 30 s | Stressed | 95.7% | 96.0% | 95.8% | | |
| | Relaxed | 95.9% | 95.8% | 95.7% | | |
| | Average | 95.89% | 95.67% | 95.67% | 95.67% | 0.9870 |
| 10 s | Stressed | 91.7% | 92.8% | 92.3% | | |
| | Relaxed | 92.4% | 91.7% | 91.9% | | |
| | Average | 91.67% | 92.78% | 92.33% | 92.33% | 0.9619 |
Figure 6 Aggregated confusion matrices for each input signal length (30 s and 10 s). The number in parentheses in each quadrant of the confusion matrices indicates the total number of samples classified as that case.
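The per-class precision, recall, F1, and accuracy figures in the tables follow directly from confusion-matrix counts like those in Figure 6; a minimal sketch with illustrative counts (not the paper's actual numbers):

```python
def metrics(tp, fp, fn, tn):
    """Standard binary-classification metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return precision, recall, f1, accuracy

# Illustrative counts for the "stressed" class as the positive class.
p, r, f1, acc = metrics(tp=96, fp=4, fn=4, tn=96)
```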
Figure 7 Classification performance on individual recordings based on input signal length (30 s and 10 s).
Classification performance of proposed method based on input signal length and sensor type.
| Signal Length | Sensor Type | Stressed Precision | Stressed Recall | Relaxed Precision | Relaxed Recall | F1-Score | Overall Accuracy | AUC |
|---|---|---|---|---|---|---|---|---|
| 30 s | FGSR | 92.67% | 87.50% | 89.67% | 92.50% | 90.62% | 90.83% | 0.9091 |
| | HGSR | 82.71% | 79.57% | 82.86% | 77.00% | 76.57% | 78.29% | 0.7825 |
| | HR | 67.25% | 59.75% | 64.25% | 66.00% | 61.00% | 62.50% | 0.6274 |
| | 3 types | 95.67% | 96.00% | 95.89% | 95.78% | 95.67% | 95.67% | 0.9870 |
| 10 s | FGSR | 92.88% | 88.50% | 89.63% | 92.38% | 90.50% | 90.38% | 0.9101 |
| | HGSR | 83.56% | 82.67% | 83.56% | 79.00% | 79.83% | 80.67% | 0.8141 |
| | HR | 63.86% | 61.86% | 55.57% | 57.43% | 56.71% | 59.57% | 0.5963 |
| | 3 types | 91.7% | 92.8% | 92.4% | 91.7% | 92.33% | 92.33% | 0.9619 |
Figure 8 Comparison of the performance variations of our multimodal CNN model and three unimodal CNNs for FGSR, HGSR, and HR signals, depending on input signal length and sensor type.
Classification performance of the proposed method compared with baseline CNN classifiers.
| Signal Length | Input Type | Classification Model | Stressed Precision | Stressed Recall | Relaxed Precision | Relaxed Recall | Overall Accuracy |
|---|---|---|---|---|---|---|---|
| 30 s | 1-D | Multimodal 1-D CNN | 82.56% | 86.78% | 86.89% | 80.22% | 83.44% |
| | Cont-RP | Multimodal VGG16 | 87.88% | 81.88% | 85.22% | 86.11% | 84.11% |
| | Cont-RP | Multimodal CNN | 95.67% | 96.00% | 95.89% | 95.78% | 95.67% |
| 10 s | 1-D | Multimodal 1-D CNN | 83.11% | 84.33% | 86.33% | 82.44% | 83.33% |
| | Cont-RP | Multimodal VGG16 | 84.55% | 81.33% | 84.55% | 86.44% | 84.00% |
| | Cont-RP | Multimodal CNN | 91.7% | 92.8% | 92.4% | 91.7% | 92.33% |
Comparison with other 2-class stress classification methods in real driving scenarios.
| Method | Dataset | Used Signals | Input Length | Classifier | Accuracy |
|---|---|---|---|---|---|
| [ | SRAD | FGSR, HR, RESP | 5 min | Logistic | 81.39% |
| [ | Self-collection | HGSR, HR, HRV, | 100 s | CNN | 92% |
| [ | SRAD | FGSR, HGSR, HR | 30 s | SVM | 93% |
| Proposed | SRAD | FGSR, HGSR, HR | 30 s | Multimodal CNN | 95.67% |
| Proposed | SRAD | FGSR, HGSR, HR | 10 s | Multimodal CNN | 92.33% |
HRV: heart rate variability; RESP: respiration.
Figure 9 Distribution of the learned representation vectors from FGSR, HGSR, and HR signals, embedded in a two-dimensional vector space (red: stressed, green: relaxed).
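Visualizations like Figure 9 project high-dimensional representation vectors down to two dimensions; a minimal sketch using PCA via the SVD (the figure itself may use a different embedding such as t-SNE, and the 24-dimensional random inputs here are purely illustrative):

```python
import numpy as np

def pca_2d(X):
    """Project row vectors of X onto their first two principal axes."""
    Xc = X - X.mean(axis=0)          # center each feature
    # Right singular vectors of the centered data are the principal axes.
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ vt[:2].T             # (n_samples, 2) embedding

# Hypothetical 24-dim representation vectors for 50 samples.
emb = pca_2d(np.random.default_rng(1).normal(size=(50, 24)))
```

Coloring each embedded point by its class label (stressed vs. relaxed) then shows how separable the learned representations are.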