Hyun-Myung Cho, Heesu Park, Suh-Yeon Dong, Inchan Youn.
Abstract
The goals of this study are to propose a better classification method for detecting stressed states from raw electrocardiogram (ECG) data and a method for training a deep neural network (DNN) with a smaller data set. We suggest an end-to-end architecture to detect stress using raw ECGs. The architecture consists of successive stages that contain convolutional layers. Two data sets are used to train and validate the model: a driving data set and a mental arithmetic data set, which is smaller than the driving data set. We apply a transfer learning method to train a model with the small data set. The proposed model shows better performance, based on receiver operating characteristic (ROC) curves, than conventional methods. Compared with other DNN methods using raw ECGs, the proposed model improves the accuracy from 87.39% to 90.19%. The transfer learning method improves accuracy by 12.01% and 10.06% when 10 s and 60 s of ECG signals, respectively, are used in the model. In conclusion, our model outperforms previous models using raw ECGs from a small data set, and we believe that it can contribute significantly to mobile healthcare for stress management in daily life.
Keywords: convolutional neural network; deep neural network; electrocardiogram; stress detection
Year: 2019 PMID: 31614646 PMCID: PMC6833036 DOI: 10.3390/s19204408
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.576
Previous studies on stress detection. ECG, Electrocardiogram; SC, Skin Conductance; ST, Skin Temperature; HR, Heart Rate; BN, Bayesian Network; AB, Adaptive Boosting; CNN, Convolutional Neural Network; RNN, Recurrent Neural Network; MT-CNN, Multitask CNN; AUC, Area Under the Curve; SCWT, Stroop Color Word Test; MA, Mental Arithmetic.
| Ref | # of Subjects | Signal | Window Length (s) | Classifier | Performance (%) | Stressor |
|---|---|---|---|---|---|---|
| [ | 13 | ECG, SC, Respiration | 10 | BN | 84 | Driving |
| [ | 42 | ECG | 180 | AB | 80 | Verbal examination |
| [ | 20 | ECG, SC, Respiration, ST | 30 | BN | 84.6 | SCWT, math, counting |
| [ | 20, 30 | Raw ECG | 10 | CNN+RNN | 87.39, 73.96 | MA, interview, SCWT, visual stimuli, cold pressor |
| [ | 10 | HR, SC | - | MT-CNN | 0.918 (AUC) | Driving [ |
Figure 1. The experimental procedures and their durations.
Number of samples in the data sets.
| Stressor | Window Length (s) | Rest Samples | Stress Samples | Total |
|---|---|---|---|---|
| Driving | 10 | 2161 | 3731 | 5892 |
| Driving | 30 | 712 | 1227 | 1939 |
| Driving | 60 | 349 | 598 | 947 |
| Mental arithmetic | 10 | 1020 | 1020 | 2040 |
| Mental arithmetic | 30 | 340 | 340 | 680 |
| Mental arithmetic | 60 | 170 | 170 | 340 |
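As a reading aid for the table, the class balance per window length can be worked out directly from the counts. The sketch below is not from the paper; the dictionary layout and the helper name `stress_share` are illustrative assumptions. It suggests that the driving data set is skewed toward the stress class while the mental arithmetic data set is balanced, which is worth keeping in mind when raw accuracies on the two data sets are compared.

```python
# Sample counts copied from the table above:
# (stressor, window length in seconds) -> (rest samples, stress samples)
SAMPLES = {
    ("Driving", 10): (2161, 3731),
    ("Driving", 30): (712, 1227),
    ("Driving", 60): (349, 598),
    ("Mental arithmetic", 10): (1020, 1020),
    ("Mental arithmetic", 30): (340, 340),
    ("Mental arithmetic", 60): (170, 170),
}


def stress_share(rest, stress):
    """Fraction of samples labeled as stress (hypothetical helper)."""
    return stress / (rest + stress)


for (stressor, window), (rest, stress) in SAMPLES.items():
    total = rest + stress
    share = stress_share(rest, stress)
    print(f"{stressor:17s} {window:2d} s  total={total:5d}  stress share={share:.3f}")
```

For the driving data, about 63% of the 10 s windows are labeled as stress, so a classifier that always predicts stress would already reach roughly that accuracy; for mental arithmetic the corresponding figure is 50%.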
Figure 2. Deep neural network architecture and the components of each stage. Raw ECG signals are fed into the input layer. Each successive stage extracts features from the output of the previous stage. After the last stage, a softmax classifier performs binary classification between rest and stress.
List of operations and hyperparameters used in each stage.
| Order | Operation | Filter Width | Number of Filters | Stride |
|---|---|---|---|---|
| 1 | Conv 1-D | 16 | | 1 |
| 2 | Conv 1-D | 16 | | 2 |
| 2 | Pooling | 16 | - | 2 |
| 3 | Concat | Concatenating | | |
| 4 | BN | Batch normalization | | |
| 5 | Activation | ReLU | | |
| 6 | Dropout | Drop rate: 0.3 | | |
More detailed information about the proposed architecture. “conv” denotes “conv(filter width)-(number of filters)”. Similarly, “maxpool16” refers to max pooling with a filter length of 16.
| Order | Operation | Output | Stride | # of Parameters |
|---|---|---|---|---|
| 0 | input | (?, 2560, 1) | - | - |
| 1-1 | conv16-8 | (?, 2560, 8) | 1 | 128 |
| 1-2 | conv16-8 | (?, 1280, 8) | 2 | 1024 |
| 1-2 | maxpool16 | (?, 1280, 8) | 2 | - |
| 1-3 | concatenating | (?, 1280, 16) | - | - |
| 1-4 | batch normalization | (?, 1280, 16) | - | 32 |
| 1-5 | activation & dropout | (?, 1280, 16) | - | - |
| 2-1 | conv16-8 | (?, 1280, 8) | 1 | 2048 |
| 2-2 | conv16-8 | (?, 640, 8) | 2 | 1024 |
| 2-2 | maxpool16 | (?, 640, 8) | 2 | - |
| 2-3 | concatenating | (?, 640, 16) | - | - |
| 2-4 | batch normalization | (?, 640, 16) | - | 32 |
| 2-5 | activation & dropout | (?, 640, 16) | - | - |
| 3-1 | conv16-16 | (?, 640, 16) | 1 | 4096 |
| 3-2 | conv16-16 | (?, 320, 16) | 2 | 4096 |
| 3-2 | maxpool16 | (?, 320, 16) | 2 | - |
| 3-3 | concatenating | (?, 320, 32) | - | - |
| 3-4 | batch normalization | (?, 320, 32) | - | 64 |
| 3-5 | activation & dropout | (?, 320, 32) | - | - |
| 4-1 | conv16-16 | (?, 320, 16) | 1 | 8192 |
| 4-2 | conv16-16 | (?, 160, 16) | 2 | 4096 |
| 4-2 | maxpool16 | (?, 160, 16) | 2 | - |
| 4-3 | concatenating | (?, 160, 32) | - | - |
| 4-4 | batch normalization | (?, 160, 32) | - | 64 |
| 4-5 | activation & dropout | (?, 160, 32) | - | - |
| 5-1 | conv16-32 | (?, 160, 32) | 1 | 16,384 |
| 5-2 | conv16-32 | (?, 80, 32) | 2 | 16,384 |
| 5-2 | maxpool16 | (?, 80, 32) | 2 | - |
| 5-3 | concatenating | (?, 80, 64) | - | - |
| 5-4 | batch normalization | (?, 80, 64) | - | 128 |
| 5-5 | activation & dropout | (?, 80, 64) | - | - |
| 6-1 | conv16-32 | (?, 80, 32) | 1 | 32,768 |
| 6-2 | conv16-32 | (?, 40, 32) | 2 | 16,384 |
| 6-2 | maxpool16 | (?, 40, 32) | 2 | - |
| 6-3 | concatenating | (?, 40, 64) | - | - |
| 6-4 | batch normalization | (?, 40, 64) | - | 128 |
| 6-5 | activation & dropout | (?, 40, 64) | - | - |
| 7-1 | conv16-64 | (?, 40, 64) | 1 | 65,536 |
| 7-2 | conv16-64 | (?, 20, 64) | 2 | 65,536 |
| 7-2 | maxpool16 | (?, 20, 64) | 2 | - |
| 7-3 | concatenating | (?, 20, 128) | - | - |
| 7-4 | batch normalization | (?, 20, 128) | - | 256 |
| 7-5 | activation & dropout | (?, 20, 128) | - | - |
| 8-1 | conv16-64 | (?, 20, 64) | 1 | 131,072 |
| 8-2 | conv16-64 | (?, 10, 64) | 2 | 65,536 |
| 8-2 | maxpool16 | (?, 10, 64) | 2 | - |
| 8-3 | concatenating | (?, 10, 128) | - | - |
| 8-4 | batch normalization | (?, 10, 128) | - | 256 |
| 8-5 | activation & dropout | (?, 10, 128) | - | - |
| Total | | | | 435K |
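The parameter counts in the table can be reproduced with a few lines of arithmetic. The sketch below is not the authors' code; it assumes bias-free 1-D convolutions (weights = filter width × input channels × filters), two trainable parameters per channel for batch normalization, and "same"-style padding so that a stride of 2 halves the length. Under these assumptions the per-stage numbers and the 435K total match the table. The softmax classifier is not listed in the table and is left out here as well.

```python
KERNEL_WIDTH = 16
# Filters per stage, read off the "conv16-N" entries of the table.
STAGE_FILTERS = [8, 8, 16, 16, 32, 32, 64, 64]


def conv1d_params(width, in_channels, filters):
    """Weight count of a 1-D convolution without bias (assumption)."""
    return width * in_channels * filters


def summarize(input_length=2560, input_channels=1):
    """Walk through the stages, returning per-stage rows and the total count."""
    length, channels = input_length, input_channels
    rows, total = [], 0
    for stage, filters in enumerate(STAGE_FILTERS, start=1):
        first_conv = conv1d_params(KERNEL_WIDTH, channels, filters)   # stride 1
        second_conv = conv1d_params(KERNEL_WIDTH, filters, filters)   # stride 2
        length = (length + 1) // 2   # stride 2 halves the length (rounded up)
        channels = 2 * filters       # strided conv and max-pool branches concatenated
        batch_norm = 2 * channels    # one scale and one shift per channel
        total += first_conv + second_conv + batch_norm
        rows.append((stage, length, channels, first_conv, second_conv, batch_norm))
    return rows, total


rows, total = summarize()
for stage, length, channels, c1, c2, bn in rows:
    print(f"stage {stage}: output (?, {length}, {channels})  "
          f"params conv={c1}+{c2}  bn={bn}")
print(f"total trainable parameters: {total}")  # 435264, i.e. about 435K
```

The walk-through also makes the growth pattern visible: the first convolution of each stage sees the concatenated output of the previous stage, so its input has twice as many channels as the previous stage had filters.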
Difference in self-reported scores, compared to baseline measurement.
| Task | SAM | DT |
|---|---|---|
| Math1 | | 0.37 |
| Math2 | | 0.89 |
Accuracy of the conventional methods. DT, Decision Tree; kNN, k-Nearest Neighbors; LR, Logistic Regression; RF, Random Forest; SVM, Support Vector Machine; MA, Mental Arithmetic.
| Stressor | Classifier | 10 s Window | 30 s Window | 60 s Window |
|---|---|---|---|---|
| MA | DT | 0.539 (0.050) | 0.517 (0.062) | 0.490 (0.066) |
| MA | kNN | 0.497 (0.030) | 0.511 (0.040) | 0.535 (0.058) |
| MA | LR | 0.493 (0.029) | 0.537 (0.076) | 0.508 (0.055) |
| MA | RF | 0.512 (0.075) | 0.505 (0.062) | 0.515 (0.041) |
| MA | SVM | 0.483 (0.025) | 0.516 (0.071) | 0.520 (0.082) |
| Driving | DT | 0.487 (0.210) | 0.457 (0.234) | 0.512 (0.208) |
| Driving | kNN | 0.361 (0.051) | 0.423 (0.150) | 0.451 (0.208) |
| Driving | LR | 0.447 (0.188) | 0.443 (0.235) | 0.434 (0.225) |
| Driving | RF | 0.528 (0.187) | 0.486 (0.215) | 0.523 (0.193) |
| Driving | SVM | 0.514 (0.155) | 0.533 (0.177) | 0.498 (0.205) |
Figure 3. The t-SNE plots of raw ECG and extracted features from the stages. Round points denote features of ECG labeled as rest, and crosses represent stress-labeled features. This figure shows only the extracted features from stage 1, stage 5, and the last stage.
Figure 4. Accuracy of the end-to-end model in binary classification. Types are grouped by the raw ECG window (i.e., 10 s, 30 s, and 60 s) fed to the model. * and ** indicate that the difference of the means is significant at the 0.001 and 0.05 levels, respectively.
Figure 5. ROC and PR curves. The lines represent the curves from Type I, Type II, and Type III training, respectively. A cross marks the performance of the conventional model. (a) ROC curves and (b) PR curves.
Evaluation metrics.
| Type | Window Length (s) | AUC | | Sensitivity | Specificity |
|---|---|---|---|---|---|
| I | 10 | 0.938 | 0.922 | 0.930 | 0.854 |
| II | 10 | 0.701 | 0.602 | 0.552 | 0.759 |
| III | 10 | 0.761 | 0.752 | 0.787 | 0.696 |
| I | 30 | 0.924 | 0.922 | 0.949 | 0.788 |
| II | 30 | 0.766 | 0.755 | 0.815 | 0.665 |
| III | 30 | 0.807 | 0.815 | 0.830 | 0.797 |
| I | 60 | 0.857 | 0.901 | 0.923 | 0.755 |
| II | 60 | 0.679 | 0.717 | 0.760 | 0.670 |
| III | 60 | 0.835 | 0.826 | 0.820 | 0.845 |
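Sensitivity and specificity combine into an overall accuracy once the class counts are known. The sketch below is a rough consistency check, not part of the paper. It rests on several assumptions: that the last two numeric columns of the table are sensitivity and specificity, that the Type I, 10 s row refers to the driving data, that stress is the positive class, and that the evaluation data share the class ratio of the full 10 s driving data set (2161 rest, 3731 stress). The function name `accuracy_from_rates` is hypothetical.

```python
def accuracy_from_rates(sensitivity, specificity, n_positive, n_negative):
    """Overall accuracy implied by per-class rates and class counts."""
    correct = sensitivity * n_positive + specificity * n_negative
    return correct / (n_positive + n_negative)


# Type I, 10 s row (assumed column alignment), with the driving counts
# for 10 s windows: stress is treated as the positive class (assumption).
implied = accuracy_from_rates(0.930, 0.854, n_positive=3731, n_negative=2161)
print(f"implied accuracy: {implied:.4f}")  # about 0.902
```

Under these assumptions the implied accuracy is close to the 90.19% reported for the proposed model with 10 s windows.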
Comparison with models featuring a DNN algorithm.
| | [ | [ | Proposed |
|---|---|---|---|
| Window | 10 s | - | 10 s |
| Input | Raw ECG | Raw HR and SC | Raw ECG |
| Accuracy | 87.39%, 73.96% | - | 90.19% |
| AUC | - | 0.918 | 0.938 |