Wonju Seo, Namho Kim, Sehyeon Kim, Chanhee Lee, Sung-Min Park.
Abstract
Unmanaged long-term mental stress in the workplace can lead to serious health problems and reduced productivity. To prevent this, it is important to recognize and relieve mental stress in a timely manner. Here, we propose a novel stress detection algorithm based on end-to-end deep learning using multiple physiological signals, such as electrocardiogram (ECG) and respiration (RESP) signals. To mimic workplace stress in our experiments, we used Stroop and math tasks as stressors, with each stressor being followed by a relaxation task. We recruited 18 subjects and measured both ECG and RESP signals using Zephyr BioHarness 3.0. After five-fold cross validation, the proposed network performed well, with an average accuracy of 83.9%, an average F1 score of 0.81, and an average area under the receiver operating characteristic (ROC) curve (AUC) of 0.92, demonstrating its superiority over conventional machine learning models. Furthermore, by visualizing the activation of the trained network's neurons, we found that they were activated by specific ECG and RESP patterns. In conclusion, we successfully validated the feasibility of end-to-end deep learning using multiple physiological signals for recognition of mental stress in the workplace. We believe that this is a promising approach that will help to improve the quality of life of people suffering from long-term work-related mental stress.
Keywords: deep learning; electrocardiogram; machine learning; mental stress detection; respiration
Year: 2019 PMID: 31324001 PMCID: PMC6652136 DOI: 10.3390/s19133021
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.576
Figure 1. The experimental procedure. The experiment has two stages: an initial relaxation stage (light green) and a regular experiment stage (indicated by gray bold lines). The regular experiment stage is divided into 5 min relaxation tasks (dark green) and stress tasks (dark orange). From the start to the end of the regular experiment stage, subjects' physiological signals were captured by a wearable device. At the end of each relaxation or stress task, a mental stress level assessment was carried out (indicated with an orange arrow).
Figure 2. Setup of the experiment in a closed room. The subject carries out the experiment on a laptop computer. No one else was present in the room, and no camera was used, so as not to make the subject nervous or embarrassed.
A list of features extracted from ECG and RESP. We computed the power spectral density of ECG’s NN interval and RESP, using Welch’s method, to extract frequency domain features. Abbreviations: ECG, electrocardiogram; RESP, respiration; NN, normal-to-normal; RR, R peak-to-R peak.
| Signal | Domain | Feature | Description |
|---|---|---|---|
| ECG | Time | HR mean | Mean of heart rate |
| | | sdNN | Standard deviation of NN intervals |
| | | rmssd | Root mean square of successive difference of RR intervals |
| | | pNN50 | Percentage of differences between adjacent RR intervals that are greater than 50 ms |
| ECG | Frequency | VLF | Power of NN interval (0.00–0.04 Hz) |
| | | LF | Power of NN interval (0.04–0.15 Hz) |
| | | HF | Power of NN interval (0.15–0.40 Hz) |
| | | TF | Power of NN interval (0.14–0.40 Hz) |
| | | nLF | LF to (LF + HF) ratio |
| | | nHF | HF to (LF + HF) ratio |
| | | LF2HF | LF to HF ratio |
| RESP | Time | RMS | Square root of mean of squared RESP |
| | | IQR | Interquartile range of RESP |
| | | MDA | Square root of mean of squared differences between adjacent elements |
| RESP | Frequency | LF1 | Power of RESP (0.00–1.00 Hz) |
| | | LF2 | Power of RESP (1.00–2.00 Hz) |
| | | HF1 | Power of RESP (2.00–3.00 Hz) |
| | | HF2 | Power of RESP (3.00–4.00 Hz) |
| | | L2H | (LF1 + LF2) to (HF1 + HF2) ratio |
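The time-domain features and the band-power ratios in the table above can be computed with a few lines of standard-library Python. The following is a minimal sketch, not the authors' implementation: the function names, the use of the sample standard deviation for sdNN, the quartile method used for IQR, and the assumption that NN intervals are given in milliseconds are all illustrative choices. The band powers passed to `normalized_band_ratios` are assumed to have been estimated beforehand (for example with Welch's method).

```python
import math
import statistics


def ecg_time_features(nn_ms):
    """Time-domain ECG features from NN intervals given in milliseconds."""
    diffs = [b - a for a, b in zip(nn_ms, nn_ms[1:])]
    return {
        # Mean of the instantaneous heart rate in beats per minute.
        "HR mean": statistics.fmean([60000.0 / nn for nn in nn_ms]),
        # Sample standard deviation (assumption: the source does not specify).
        "sdNN": statistics.stdev(nn_ms),
        # Root mean square of successive differences.
        "rmssd": math.sqrt(statistics.fmean([d * d for d in diffs])),
        # Percentage of successive differences larger than 50 ms in magnitude.
        "pNN50": 100.0 * sum(abs(d) > 50 for d in diffs) / len(diffs),
    }


def resp_time_features(resp):
    """Time-domain features of a respiration waveform (list of samples)."""
    diffs = [b - a for a, b in zip(resp, resp[1:])]
    # Quartiles with the default 'exclusive' method (assumption).
    q1, _, q3 = statistics.quantiles(resp, n=4)
    return {
        "RMS": math.sqrt(statistics.fmean([x * x for x in resp])),
        "IQR": q3 - q1,
        "MDA": math.sqrt(statistics.fmean([d * d for d in diffs])),
    }


def normalized_band_ratios(lf, hf):
    """nLF, nHF and LF2HF from precomputed LF and HF band powers."""
    return {"nLF": lf / (lf + hf), "nHF": hf / (lf + hf), "LF2HF": lf / hf}


if __name__ == "__main__":
    # Hypothetical NN intervals, for illustration only.
    print(ecg_time_features([800, 810, 790, 850, 800]))
```

Each function takes a plain list, so the same code can be applied to every analysis window of a recording.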
Figure 3. The structure of the proposed DeepER Net. Each signal was processed in its own network branch, and the branch outputs were then concatenated to recognize stress. The basic structure is based on that of Deep ECG Net [12].
Average normalized visual analogue scale (VAS) scores for all tasks. These have been normalized to a range of 0–1 with a MinMax scaler.
| Task | Average Value |
|---|---|
| Relax | 0.24 |
| Easy math | 0.51 |
| Easy Stroop | 0.61 |
| Hard math | 0.80 |
| Hard Stroop | 0.52 |
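Min-max scaling to the range 0–1, as used for the VAS scores above, maps the smallest value to 0 and the largest to 1. The sketch below shows the arithmetic only. The function name, the raw scores, and the handling of a constant input are assumptions, and the source does not state whether scaling was applied per subject or over all subjects.

```python
def min_max_scale(values):
    """Linearly rescale values so that min -> 0.0 and max -> 1.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Constant input has no spread; map everything to 0.0 (assumption).
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


if __name__ == "__main__":
    raw_vas = [12, 45, 60, 88, 51]  # hypothetical raw VAS scores
    print(min_max_scale(raw_vas))
```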
Average differences between the normalized VAS scores before and after each task. Here, the relaxation tasks were used as a baseline before stressor tasks.
| Task | Average Value |
|---|---|
| Easy math | 0.12 |
| Easy Stroop | 0.42 |
| Hard math | 0.55 |
| Hard Stroop | 0.32 |
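The before/after differences can be averaged per task as in the following sketch. It assumes each record pairs a stressor task with the normalized VAS score of the preceding relaxation task (the baseline) and the score after the stressor. The record layout, the function name, and the example values are hypothetical.

```python
from collections import defaultdict
from statistics import fmean


def average_task_deltas(records):
    """Average (after - before) VAS difference for each task.

    records: iterable of (task, vas_before, vas_after) tuples, where
    vas_before is the score after the preceding relaxation task.
    """
    deltas = defaultdict(list)
    for task, before, after in records:
        deltas[task].append(after - before)
    return {task: fmean(values) for task, values in deltas.items()}


if __name__ == "__main__":
    # Hypothetical normalized scores from three trials.
    example = [
        ("Easy math", 0.2, 0.4),
        ("Easy math", 0.3, 0.3),
        ("Hard math", 0.1, 0.6),
    ]
    print(average_task_deltas(example))
```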
Average metrics after five-fold cross validation. We used Equations (2) and (3) to calculate the average accuracy, F1 score, and AUC, as well as their standard deviations, and show these results as average ± standard deviation. Abbreviations: SVM, support vector machine; RF, random forest; KNN, k-nearest neighbors; LR, logistic regression; DT, decision tree; AUC, area under the ROC curve; ROC, receiver operating characteristic.
| Model | Accuracy (%) | F1 Score | AUC |
|---|---|---|---|
| DeepER Net | 83.9 ± 2.3 | 0.81 ± 0.05 | 0.92 ± 0.01 |
| SVM | 61.7 ± 3.4 | 0.62 ± 0.04 | 0.68 ± 0.05 |
| RF | 71.8 ± 2.3 | 0.67 ± 0.04 | 0.80 ± 0.02 |
| KNN | 64.0 ± 3.2 | 0.60 ± 0.02 | 0.67 ± 0.04 |
| LR | 59.1 ± 2.5 | 0.55 ± 0.05 | 0.63 ± 0.04 |
| DT | 68.8 ± 1.6 | 0.66 ± 0.02 | 0.70 ± 0.02 |
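Entries of the form average ± standard deviation can be produced by scoring each fold separately and then aggregating over the five folds. The sketch below is one way to do this, under assumptions: Equations (2) and (3) are not reproduced here, so the population standard deviation and the binary label convention (1 = stress, 0 = relax) are illustrative choices, and the function names are hypothetical.

```python
from statistics import fmean, pstdev


def accuracy_and_f1(y_true, y_pred):
    """Accuracy and F1 score for binary labels (1 = stress, 0 = relax)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    denom = 2 * tp + fp + fn
    # F1 = 2TP / (2TP + FP + FN); defined as 0.0 when there are no positives.
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1


def summarize(fold_scores):
    """Mean and population standard deviation over folds (assumption)."""
    return fmean(fold_scores), pstdev(fold_scores)


if __name__ == "__main__":
    # Hypothetical per-fold accuracies, for illustration only.
    mean, sd = summarize([0.82, 0.85, 0.81, 0.86, 0.84])
    print(f"{100 * mean:.1f} ± {100 * sd:.1f}")
```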
Figure 4. Activations of the first ReLU layer for the electrocardiogram (ECG) signal. To show clearly which signal patterns triggered activation, the activations and the output of the first batch-normalization layer were normalized to the range 0–1 with a MinMax scaler. The blue line indicates the activations and the red line indicates the batch-normalization output. Activations around (a) the ECG Q wave and the ascending part of the T wave and (b) the ECG QRS complex and the descending part of the T wave.
Figure 5. Activations of the first ReLU layer for the respiration (RESP) signal. To show clearly which signal patterns triggered activation, the activations and the output of the first batch-normalization layer were normalized to the range 0–1 with a MinMax scaler. The blue line indicates the activations and the red line indicates the batch-normalization output. Activations around (a) the RESP peak (e.g., inspiration) and (b) the RESP nadir (e.g., expiration).
Comparison with state-of-the-art deep learning approaches that use physiological signals to recognize stress. Abbreviations: CNN, convolutional neural network; LSTM, long short-term memory.
| Study | Physiological Signal | Model | Accuracy |
|---|---|---|---|
| Hwang et al. [ | ECG | CNN and LSTM | 80.7% |
| Cho et al. [ | Thermal respiration images | CNN | 84.6% |
| He et al. [ | Lomb Periodogram spectrum extracted from zero-one transformed NN intervals | CNN | 82.7% |
| Proposed DeepER Net | ECG and RESP | CNN and LSTM | 83.9% |