Xiang Li, Zhigang Zhao, Dawei Song, Yazhou Zhang, Jingshan Pan, Lu Wu, Jidong Huo, Chunyang Niu, Di Wang.
Abstract
Robust cross-subject emotion recognition based on multichannel EEG has long been a challenging task. In this work, we hypothesize that there exist default latent variables in the brain that are shared across subjects during emotional processing. The states of these latent variables, which relate to emotional processing, should therefore help build robust recognition models. Specifically, we propose to use an unsupervised deep generative model (e.g., a variational autoencoder) to determine the latent factors from multichannel EEG. Through a sequence modeling method, we then examine emotion recognition performance based on the learned latent factors. The proposed methodology is verified on two public datasets (DEAP and SEED) and compared with traditional matrix factorization-based (ICA) and autoencoder-based approaches. Experimental results demonstrate that autoencoder-like neural networks are suitable for unsupervised EEG modeling and that our proposed emotion recognition framework achieves promising performance. To the best of our knowledge, this is the first work to introduce the variational autoencoder into multichannel EEG decoding for emotion recognition. We believe the proposed approach is not only feasible for emotion recognition but also promising for diagnosing depression, Alzheimer's disease, mild cognitive impairment, etc., in which the specific latent processes may be altered or aberrant compared with those of normal healthy controls.
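The core of the approach is a variational autoencoder: an encoder maps a multichannel EEG sample to the mean and log-variance of a latent distribution, a latent sample is drawn via the reparameterization trick, and a decoder reconstructs the channels from the latent factors. A minimal NumPy sketch of one forward pass (hypothetical layer sizes and untrained linear maps, not the authors' architecture):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 32 EEG channels per sample, 10 latent factors.
n_channels, n_latent = 32, 10

# Toy "encoder": linear maps producing the mean and log-variance of q(z|x).
W_mu = rng.normal(scale=0.1, size=(n_latent, n_channels))
W_logvar = rng.normal(scale=0.1, size=(n_latent, n_channels))
# Toy "decoder": linear map from latent space back to channel space.
W_dec = rng.normal(scale=0.1, size=(n_channels, n_latent))

def encode_decode(x):
    """One VAE forward pass with the reparameterization trick."""
    mu = W_mu @ x
    logvar = W_logvar @ x
    eps = rng.standard_normal(n_latent)
    z = mu + np.exp(0.5 * logvar) * eps    # z ~ q(z|x), differentiable in mu/logvar
    x_rec = W_dec @ z                       # reconstruction from the latent factors
    # Closed-form KL(q(z|x) || N(0, I)), summed over latent dimensions.
    kl = 0.5 * np.sum(np.exp(logvar) + mu**2 - 1.0 - logvar)
    return z, x_rec, kl

x = rng.standard_normal(n_channels)         # one multichannel EEG sample
z, x_rec, kl = encode_decode(x)
print(z.shape, x_rec.shape)
```

Training would minimize the reconstruction error plus this KL term; the learned `z` sequences are then fed to the downstream sequence model.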
Keywords: EEG; deep learning; emotion recognition; latent factor decoding; variational autoencoder
Year: 2020 PMID: 32194367 PMCID: PMC7061897 DOI: 10.3389/fnins.2020.00087
Source DB: PubMed Journal: Front Neurosci ISSN: 1662-453X Impact factor: 4.677
Figure 1. The framework of the latent EEG factor decoding and recurrent neural network-based emotion recognition approach.
Figure 2. The neural network-based multichannel EEG fusion and latent factor decoding method.
Three main categories of EEG features that we extracted for the baseline methods.
| Category | Features |
| --- | --- |
| Time-frequency domain features | 1. Peak-Peak Mean. 2. Mean Square Value. 3. Variance. 4. Hjorth Parameter: Activity. 5. Hjorth Parameter: Mobility. 6. Hjorth Parameter: Complexity. 7. Maximum Power Spectral Frequency. 8. Maximum Power Spectral Density. 9. Power Sum. |
| Non-linear dynamical system features | 1. Approximate Entropy. 2. C0 Complexity. 3. Correlation Dimension. 4. Kolmogorov Entropy. 5. Lyapunov Exponent. 6. Permutation Entropy. 7. Singular Entropy. 8. Shannon Entropy. 9. Spectral Entropy. |
| Asynchronous brain activity features | 1. Fp1-Fp2. 2. AF3-AF4. 3. F3-F4. 4. F7-F8. 5. FC5-FC6. 6. FC1-FC2. 7. C3-C4. 8. T7-T8. 9. CP5-CP6. 10. CP1-CP2. 11. P3-P4. 12. P7-P8. 13. PO3-PO4. 14. O1-O2. |
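Several of the time-frequency features above have simple closed forms. As one illustrative sketch (standard textbook definitions, not code from the paper), the three Hjorth parameters of a 1-D signal can be computed from the variances of the signal and its successive differences:

```python
import numpy as np

def hjorth_parameters(x):
    """Hjorth Activity, Mobility, and Complexity of a 1-D signal.

    activity   = var(x)
    mobility   = sqrt(var(dx) / var(x))
    complexity = mobility(dx) / mobility(x)
    """
    x = np.asarray(x, dtype=float)
    dx = np.diff(x)                 # first difference (discrete derivative)
    ddx = np.diff(dx)               # second difference
    var_x, var_dx, var_ddx = np.var(x), np.var(dx), np.var(ddx)
    activity = var_x
    mobility = np.sqrt(var_dx / var_x)
    complexity = np.sqrt(var_ddx / var_dx) / mobility
    return activity, mobility, complexity

# Sanity check: the complexity of a pure sinusoid is close to 1.
t = np.linspace(0, 1, 1000, endpoint=False)
act, mob, comp = hjorth_parameters(np.sin(2 * np.pi * 10 * t))
print(act, mob, comp)
```

The same per-channel pattern applies to the entropy features; the asynchronous features are computed on differences between the listed symmetric channel pairs.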
Figure 3. The reconstruction performance (mean Pearson correlation coefficient) of different decoding models when assuming different numbers of latent factors.
Figure 4. The reconstruction performance (mean Pearson correlation coefficient) of different subjects when assuming different numbers of latent factors (taking DEAP as an example).
Figure 5. Pearson correlation coefficient between the reconstructed EEG signal and the original EEG signal for each channel. (A) Reconstruction performance of different methods on the DEAP dataset. (B) Reconstruction performance of different methods on the SEED dataset.
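The per-channel reconstruction metric used in Figure 5 is the Pearson correlation between each original channel and its reconstruction. A small NumPy sketch of this computation (synthetic data; the 32-channel, 512-sample segment is a hypothetical stand-in):

```python
import numpy as np

def channelwise_pearson(original, reconstructed):
    """Pearson r between original and reconstructed signals, per channel.

    Both arrays have shape (n_channels, n_samples); returns shape (n_channels,).
    """
    o = original - original.mean(axis=1, keepdims=True)
    r = reconstructed - reconstructed.mean(axis=1, keepdims=True)
    num = (o * r).sum(axis=1)
    den = np.sqrt((o**2).sum(axis=1) * (r**2).sum(axis=1))
    return num / den

rng = np.random.default_rng(1)
x = rng.standard_normal((32, 512))               # hypothetical 32-channel segment
x_rec = x + 0.1 * rng.standard_normal(x.shape)   # near-perfect reconstruction
r = channelwise_pearson(x, x_rec)
print(r.shape)
```

Averaging `r` over channels (and segments) gives the mean correlation reported in Figures 3 and 4.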
Recognition performance on subject data of the DEAP dataset (Valence).
| Subject | ICA+LSTM | PCA+LSTM | AE+LSTM | RBM+LSTM | VAE+LSTM |
| --- | --- | --- | --- | --- | --- |
| s01 | 0.5818 | 0.6296 | 0.6207 | 0.6296 | 0.6552 |
| s02 | 0.5882 | 0.6897 | 0.6885 | 0.7018 | 0.7213 |
| s03 | 0.7097 | 0.6897 | 0.7241 | 0.7097 | 0.7097 |
| s04 | 0.4444 | 0.5455 | 0.5490 | 0.5000 | 0.5818 |
| s05 | 0.5965 | 0.7000 | 0.7302 | 0.7500 | 0.7541 |
| s06 | 0.7241 | 0.7500 | 0.7419 | 0.8529 | 0.8182 |
| s07 | 0.7368 | 0.7619 | 0.7586 | 0.8065 | 0.8125 |
| s08 | 0.6071 | 0.6415 | 0.6545 | 0.6780 | 0.7213 |
| s09 | 0.5217 | 0.4898 | 0.5926 | 0.6545 | 0.6552 |
| s10 | 0.6207 | 0.6182 | 0.6441 | 0.6441 | 0.7059 |
| s11 | 0.6909 | 0.6552 | 0.7097 | 0.7419 | 0.7500 |
| s12 | 0.6667 | 0.6909 | 0.6552 | 0.6885 | 0.7000 |
| s13 | 0.4186 | 0.5714 | 0.6182 | 0.5882 | 0.6182 |
| s14 | 0.4444 | 0.5926 | 0.6182 | 0.6429 | 0.6780 |
| s15 | 0.5200 | 0.6143 | 0.6667 | 0.6441 | 0.6667 |
| s16 | 0.4706 | 0.5217 | 0.5306 | 0.5385 | 0.5660 |
| s17 | 0.6786 | 0.6552 | 0.6885 | 0.6897 | 0.7097 |
| s18 | 0.6667 | 0.6983 | 0.7407 | 0.7213 | 0.7500 |
| s19 | 0.7059 | 0.6697 | 0.7302 | 0.7302 | 0.7419 |
| s20 | 0.7018 | 0.7119 | 0.7097 | 0.7302 | 0.7719 |
| s21 | 0.6667 | 0.6600 | 0.6441 | 0.6780 | 0.6667 |
| s22 | 0.5306 | 0.5333 | 0.5965 | 0.6207 | 0.6207 |
| s23 | 0.7000 | 0.7170 | 0.7119 | 0.7241 | 0.8000 |
| s24 | 0.5106 | 0.5714 | 0.6207 | 0.5714 | 0.6316 |
| s25 | 0.4889 | 0.5532 | 0.6441 | 0.6441 | 0.6441 |
| s26 | 0.6415 | 0.7619 | 0.7333 | 0.7692 | 0.7813 |
| s27 | 0.8060 | 0.8309 | 0.8125 | 0.8235 | 0.8529 |
| s28 | 0.6786 | 0.7070 | 0.7241 | 0.7692 | 0.7619 |
| s29 | 0.6667 | 0.6667 | 0.7097 | 0.7143 | 0.7541 |
| s30 | 0.7742 | 0.7241 | 0.7813 | 0.8125 | 0.8182 |
| s31 | 0.6667 | 0.6897 | 0.7018 | 0.7302 | 0.7213 |
| s32 | 0.6441 | 0.5769 | 0.6538 | 0.6667 | 0.6897 |
| Mean | 0.6303 | 0.6528 | 0.6783 | 0.6927 | 0.7167 |
Recognition performance on subject data of the DEAP dataset (Arousal).
| Subject | ICA+LSTM | PCA+LSTM | AE+LSTM | RBM+LSTM | VAE+LSTM |
| --- | --- | --- | --- | --- | --- |
| s01 | 0.6545 | 0.7241 | 0.7419 | 0.7419 | 0.7500 |
| s02 | 0.7119 | 0.7018 | 0.7458 | 0.7500 | 0.7619 |
| s03 | 0.2791 | 0.3182 | 0.3043 | 0.3111 | 0.3478 |
| s04 | 0.4138 | 0.5714 | 0.5106 | 0.5660 | 0.5714 |
| s05 | 0.5600 | 0.6441 | 0.6316 | 0.6316 | 0.6441 |
| s06 | 0.5200 | 0.6182 | 0.5818 | 0.5818 | 0.6275 |
| s07 | 0.7213 | 0.7338 | 0.7541 | 0.7500 | 0.7869 |
| s08 | 0.6667 | 0.6667 | 0.7213 | 0.7018 | 0.7333 |
| s09 | 0.6885 | 0.7302 | 0.7419 | 0.7213 | 0.7636 |
| s10 | 0.5660 | 0.6154 | 0.6667 | 0.6667 | 0.7097 |
| s11 | 0.5217 | 0.4898 | 0.5455 | 0.5306 | 0.5556 |
| s12 | 0.8000 | 0.8732 | 0.8889 | 0.8889 | 0.9041 |
| s13 | 0.5385 | 0.7125 | 0.7419 | 0.9041 | 0.8889 |
| s14 | 0.7241 | 0.7385 | 0.7619 | 0.7937 | 0.8060 |
| s15 | 0.5882 | 0.6316 | 0.6545 | 0.6552 | 0.6667 |
| s16 | 0.6273 | 0.6296 | 0.6545 | 0.6667 | 0.6667 |
| s17 | 0.6667 | 0.7302 | 0.7241 | 0.7333 | 0.7869 |
| s18 | 0.7541 | 0.7500 | 0.7692 | 0.7500 | 0.7742 |
| s19 | 0.7213 | 0.7213 | 0.8060 | 0.7097 | 0.8065 |
| s20 | 0.7879 | 0.8060 | 0.8235 | 0.8657 | 0.8732 |
| s21 | 0.8529 | 0.8406 | 0.8732 | 0.8696 | 0.9014 |
| s22 | 0.7241 | 0.7458 | 0.7500 | 0.7500 | 0.7500 |
| s23 | 0.3673 | 0.3404 | 0.3750 | 0.4000 | 0.4000 |
| s24 | 0.8358 | 0.8889 | 0.8696 | 0.8732 | 0.9041 |
| s25 | 0.7500 | 0.7458 | 0.8182 | 0.8406 | 0.8406 |
| s26 | 0.5714 | 0.5769 | 0.5769 | 0.5965 | 0.6071 |
| s27 | 0.6780 | 0.7797 | 0.8060 | 0.7879 | 0.8060 |
| s28 | 0.6071 | 0.6071 | 0.6154 | 0.6207 | 0.6545 |
| s29 | 0.7018 | 0.7000 | 0.7458 | 0.7419 | 0.7586 |
| s30 | 0.6182 | 0.5769 | 0.6207 | 0.6207 | 0.6441 |
| s31 | 0.5769 | 0.6182 | 0.6316 | 0.6316 | 0.6552 |
| s32 | 0.7213 | 0.7419 | 0.7692 | 0.7541 | 0.8148 |
| Mean | 0.6418 | 0.6741 | 0.6944 | 0.7002 | 0.7243 |
Recognition performance on subject data of the SEED dataset.
| Subject | ICA+LSTM | PCA+LSTM | AE+LSTM | RBM+LSTM | VAE+LSTM |
| --- | --- | --- | --- | --- | --- |
| s01 | 0.6753 | 0.7195 | 0.7368 | 0.7677 | 0.8308 |
| s02 | 0.6897 | 0.7636 | 0.7386 | 0.7721 | 0.8504 |
| s03 | 0.7255 | 0.7259 | 0.7674 | 0.8018 | 0.8741 |
| s04 | 0.6434 | 0.6495 | 0.6677 | 0.6915 | 0.7012 |
| s05 | 0.6683 | 0.7221 | 0.7323 | 0.7552 | 0.8027 |
| s06 | 0.7321 | 0.7333 | 0.7733 | 0.8054 | 0.8904 |
| s07 | 0.7143 | 0.7713 | 0.7525 | 0.7953 | 0.8600 |
| s08 | 0.7053 | 0.7479 | 0.7488 | 0.7897 | 0.8571 |
| s09 | 0.6912 | 0.7080 | 0.7430 | 0.7759 | 0.8538 |
| s10 | 0.6677 | 0.7599 | 0.7264 | 0.7452 | 0.7931 |
| s11 | 0.7295 | 0.7548 | 0.7696 | 0.8026 | 0.8827 |
| s12 | 0.6776 | 0.7095 | 0.7373 | 0.7718 | 0.8320 |
| s13 | 0.7168 | 0.7894 | 0.7560 | 0.7962 | 0.8676 |
| s14 | 0.6939 | 0.7586 | 0.7442 | 0.7876 | 0.8565 |
| s15 | 0.7608 | 0.7699 | 0.7844 | 0.8079 | 0.8908 |
| Mean | 0.6994 | 0.7389 | 0.7452 | 0.7777 | 0.8429 |
Performance comparison between this work and the baseline methods.
| Method | DEAP (Valence) | DEAP (Arousal) | SEED |
| --- | --- | --- | --- |
| L1-SVM (kernel="linear") + handcrafted features | 0.7134 | 0.7154 | 0.8234 |
| RF (n_estimators=100) + handcrafted features | 0.5870 | 0.5754 | 0.7680 |
| KNN (n_neighbors=7) + handcrafted features | 0.6406 | 0.5890 | 0.6763 |
| LR + handcrafted features | 0.5757 | 0.5614 | 0.7649 |
| NB + handcrafted features | 0.6903 | 0.6989 | 0.6666 |
| DNN (hidden_layer_sizes=(100, 100, 100)) + handcrafted features | 0.6183 | 0.6593 | 0.7219 |
| ICA+LSTM | 0.6303 | 0.6418 | 0.6994 |
| PCA+LSTM | 0.6528 | 0.6741 | 0.7389 |
| AE+LSTM | 0.6783 | 0.6944 | 0.7452 |
| RBM+LSTM | 0.6927 | 0.7002 | 0.7777 |
| VAE+LSTM | 0.7167 | 0.7243 | 0.8429 |
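The baseline rows pair handcrafted features with standard classifiers. As one illustration, the majority-vote k-nearest-neighbour rule with n_neighbors=7 (as in the KNN row) can be sketched in plain NumPy; the two-class feature data here are synthetic stand-ins for the handcrafted EEG features:

```python
import numpy as np

def knn_predict(X_train, y_train, X_test, n_neighbors=7):
    """Majority-vote k-NN with Euclidean distance (as in the KNN baseline)."""
    preds = []
    for x in X_test:
        d = np.linalg.norm(X_train - x, axis=1)      # distance to every training point
        nearest = y_train[np.argsort(d)[:n_neighbors]]
        preds.append(np.bincount(nearest).argmax())  # majority label among neighbours
    return np.array(preds)

# Synthetic, well-separated 2-class data standing in for handcrafted features.
rng = np.random.default_rng(2)
X_train = np.vstack([rng.normal(-2.0, size=(50, 8)),
                     rng.normal(+2.0, size=(50, 8))])
y_train = np.array([0] * 50 + [1] * 50)
X_test = np.vstack([rng.normal(-2.0, size=(5, 8)),
                    rng.normal(+2.0, size=(5, 8))])
y_pred = knn_predict(X_train, y_train, X_test)
print(y_pred.tolist())  # → [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
```

The other baselines (SVM, RF, LR, NB, DNN) follow the same pattern of fitting a standard classifier on the extracted feature vectors.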
Figure 6. Mean cross-subject recognition performance under different settings when the L1-SVM-based approach is applied.
List of related works in recent years and the corresponding performance obtained.
| DEAP | | | | |
| Ontology-based storage and representation, feature selection, and decision-tree-based recognition method (Chen et al.) | 0.6783 | N/A | 0.6896 | N/A |
| Minimum-redundancy-maximum-relevance (mRMR)-based feature selection combined with statistical features, band power features, Hjorth parameters, and fractal dimension (Atkinson and Campos) | 0.7314 | N/A | 0.7306 | N/A |
| Integrated classifier based on a multi-layer stacked autoencoder combined with time-domain and PSD features (Yin et al.) | 0.7617 | 0.7243 | 0.7719 | 0.6901 |
| Multivariate empirical mode decomposition (MEMD)-based feature extraction combined with an ANN (Mert and Akan) | 0.7287 | N/A | 0.7500 | N/A |
| Generative adversarial network (WGANDA)-based transfer learning combined with the differential entropy feature (Luo et al.) | 0.6799 | N/A | 0.6685 | N/A |
| The VAE-based approach proposed in this work | 0.7623 | 0.7167 | 0.7989 | 0.7243 |
| SEED | | | | |
| Dynamical graph CNN (DGCNN) learning from an adjacency-matrix representation based on DE, PSD, DASM, RASM, and DCAU features (Song et al.) | 0.7995 | N/A | | |
| Differential entropy features arranged in a 2D sparse graph representation, combined with a CNN for classification (Li et al.) | 0.8820 | N/A | | |
| Transfer learning combined with differential entropy features and logistic-regression-based classification (Lan et al.) | 0.7247 | N/A | | |
| Generative adversarial network (WGANDA)-based transfer learning combined with the differential entropy feature (Luo et al.) | 0.8707 | N/A | | |
| Spatial-temporal recurrent neural network (STRNN) combined with the differential entropy feature (Zhang et al.) | 0.8950 | N/A | | |
| The VAE-based approach proposed in this work | 0.8581 | 0.8429 | | |