| Literature DB >> 36212045 |
Lei Jiang, Panote Siriaraya, Dongeun Choi, Fangmeng Zeng, Noriaki Kuwahara.
Abstract
Reminiscence and conversation between older adults and younger volunteers using past photographs are effective in improving the emotional state of older adults and alleviating depression. However, the emotional state of the older adult needs to be evaluated while they converse about the past photographs. Although the electroencephalogram (EEG) is far more strongly associated with emotion than other physiological signals, the challenge is to eliminate muscle artifacts in the EEG during speech and to reduce the number of dry electrodes to improve user comfort while maintaining high emotion recognition accuracy. We therefore propose the CTA-CNN-Bi-LSTM emotion recognition framework. EEG signals from eight channels (P3, P4, F3, F4, F7, F8, T7, and T8) are first processed with the MEMD-CCA method on three brain regions separately (frontal, temporal, parietal) to remove muscle artifacts, then fed into a channel-temporal attention (CTA) module that weights the channels and temporal points most relevant to positive, negative, and neutral emotions and recodes the EEG data accordingly. A convolutional neural network (CNN) module then extracts spatial information from the recoded EEG data to obtain spatial feature maps, which are sequentially input into a Bi-LSTM module to learn bidirectional temporal information for emotion recognition. Finally, four groups of experiments demonstrate that the proposed CTA-CNN-Bi-LSTM framework outperforms previous works, achieving a highest average recognition accuracy of 98.75% over the positive, negative, and neutral emotions.
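The channel-temporal "recoding" step described in the abstract can be illustrated with a toy sketch. The energy-based softmax weights below are a placeholder assumption (in the paper the attention weights are learned end-to-end with the CNN and Bi-LSTM), and `recode` is a hypothetical helper name, not the authors' code.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D vector."""
    e = np.exp(x - x.max())
    return e / e.sum()

def recode(eeg):
    """eeg: (time, channels) segment, e.g. (750, 8) for 3 s at 250 Hz.
    Re-weight the raw segment by one weight per channel and one per
    time point (here: softmax of mean signal energy, as a stand-in for
    the learned attention weights)."""
    ch_w = softmax((eeg ** 2).mean(axis=0))   # one weight per channel
    t_w = softmax((eeg ** 2).mean(axis=1))    # one weight per time point
    return eeg * ch_w[None, :] * t_w[:, None]

segment = np.random.randn(750, 8)             # 3 s x 250 Hz x 8 channels
recoded = recode(segment)
```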
Keywords: CNN-RNN; channel-temporal attention; electroencephalogram (EEG); emotion recognition; older adults
Year: 2022 PMID: 36212045 PMCID: PMC9535340 DOI: 10.3389/fnagi.2022.945024
Source DB: PubMed Journal: Front Aging Neurosci ISSN: 1663-4365 Impact factor: 5.702
FIGURE 1 Overview of the causes and methods of monitoring individual (subject) emotions. (A) Characteristics of mood, emotion, and feeling. (B) Emotion space models and mental disorders. (C) The formation of emotion. (D) EEG emotion recognition system.
FIGURE 2 The sources of artifacts in EEG signals and their removal methods.
Comparison of EMG and EOG artifact removal techniques.

| Methods | Type | Ref. E | Channel | Comparison results (better than) |
|---|---|---|---|---|
| Adaptive filtering | PK | √ | All | EMG: low-pass filter; EOG: WPT, ICA, DWT, ANC |
| Linear regression | PK | √ | All | EOG: visual identification |
| ICA | NPK (BSS-based) | × | Multi | PCA, LR, wavelet |
| CCA | NPK (BSS-based) | × | Multi | EMG: low-pass filter + Robust ICA; EOG: equivalent to ICA |
| EMD | NPK (BSS-based) | × | Single | ICA, CCA, WT |
| EEMD-CCA | NPK (BSS-based) | × | Single | EMD, EMD-ICA, EMD-CCA, EEMD, EEMD-ICA |
| MEMD | NPK (BSS-based) | × | Few | ICA |
| MEMD-CCA | NPK (BSS-based) | × | Few | EMG: ICA, EEMD-ICA, MEMD-ICA, CCA, EEMD-CCA |
| CCA-MEMD | NPK (BSS-based) | × | Few | EOG: ICA, CCA |
PK, prior knowledge; NPK, no prior knowledge; BSS, blind source separation; Ref. E, reference electrode; ICA, independent component analysis; CCA, canonical correlation analysis; EMD, empirical mode decomposition; EEMD, ensemble empirical mode decomposition; MEMD, multivariate empirical mode decomposition.
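The BSS-CCA idea behind several rows of the table can be sketched in a few lines: canonically correlating the signal with a one-sample-delayed copy of itself orders the separated sources by autocorrelation, and since EMG artifacts have low autocorrelation, the weakest components are zeroed before reconstruction. This is a minimal numpy sketch of that principle only; the MEMD decomposition stage of MEMD-CCA is omitted, and the choice of `n_drop` is an illustrative assumption.

```python
import numpy as np

def cca_denoise(X, n_drop=2):
    """X: (time, channels). Drop the n_drop least-autocorrelated sources."""
    Xc = X - X.mean(axis=0)
    A, B = Xc[:-1], Xc[1:]                        # signal and 1-sample-delayed copy
    Caa, Cbb, Cab = A.T @ A, B.T @ B, A.T @ B
    Wa = np.linalg.inv(np.linalg.cholesky(Caa))   # whitening transform for A
    Wb = np.linalg.inv(np.linalg.cholesky(Cbb))   # whitening transform for B
    U, s, _ = np.linalg.svd(Wa @ Cab @ Wb.T)      # standard CCA via whitened SVD
    Wx = Wa.T @ U                                 # un-mixing matrix for X
    S = Xc @ Wx                                   # sources, autocorrelation descending
    S[:, S.shape[1] - n_drop:] = 0                # zero the artifact-like sources
    return S @ np.linalg.inv(Wx) + X.mean(axis=0)
```

With `n_drop=0` the round-trip reconstructs the input exactly, which is a useful sanity check on the un-mixing/mixing pair.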
Summary of experiment dataset (OCER).

| Conversation experiment | |
|---|---|
| Trials | 36 trials × 60 s |
| Subjects | Older: 11 |
| Rating | Valence (–4, 4), Arousal (–4, 4), Stress (1, 7) |

| EEG recording | |
|---|---|
| Device | OpenBCI Cyton board (250 Hz) |
| Channels | F3, F4, F7, F8, T7, T8, P3, P4 (10–20 system) |
| Array | 396 (samples) × 60 (s) × 250 (Hz) × 8 (channels) |
FIGURE 3 Illustration of the proposed framework based on raw EEG data for emotion recognition. (A) Removal of biological artifacts by MEMD-CCA. (B) CTA-CNN-Bi-LSTM framework.
Division of OCER into three emotions by k-means.

| Subject ID | Rating scale | Clustering center 1 | Clustering center 2 | Clustering center 3 |
|---|---|---|---|---|
| 1 | Valence | –1.58 | 0.26 | 1.27 |
| | Arousal | –0.98 | 0.65 | 1.85 |
| | Stress | –0.90 | –0.90 | –0.90 |
| 2 | Valence | –1.58 | 0.55 | 1.16 |
| | Arousal | –0.98 | –1.12 | 1.08 |
| | Stress | –0.90 | –0.90 | –0.90 |
| 3 | Valence | 0.63 | 0.61 | 0.63 |
| | Arousal | –2.40 | –0.96 | 0.44 |
| | Stress | 0.23 | –0.80 | –0.90 |
| 4 | Valence | –0.20 | 0.63 | 1.02 |
| | Arousal | –0.41 | –0.56 | 0.62 |
| | Stress | 0.23 | 0.46 | 0.38 |
| 5 | Valence | –1.58 | –1.53 | –0.11 |
| | Arousal | –3.11 | –1.07 | –0.98 |
| | Stress | 0.23 | 0.78 | 1.36 |
| 6 | Valence | –0.35 | 0.22 | 0.63 |
| | Arousal | –0.63 | 0.44 | 0.60 |
| | Stress | 2.30 | 0.73 | 2.01 |
| 7 | Valence | –3.05 | –1.58 | –0.48 |
| | Arousal | –0.98 | –0.98 | –0.98 |
| | Stress | –0.90 | –0.83 | –0.90 |
| 8 | Valence | 0.14 | –0.11 | 0.35 |
| | Arousal | 0.91 | –0.27 | 0.44 |
| | Stress | 1.36 | –0.90 | 0.23 |
| 9 | Valence | –1.39 | –0.45 | 0.51 |
| | Arousal | –0.81 | –0.60 | –0.20 |
| | Stress | 2.78 | 1.10 | 0.41 |
| 10 | Valence | –0.48 | 0.49 | 1.36 |
| | Arousal | –0.27 | 0.91 | 1.69 |
| | Stress | –0.90 | –0.85 | –0.90 |
| 11 | Valence | –0.11 | 0.63 | 1.36 |
| | Arousal | 0.40 | 0.76 | 1.14 |
| | Stress | 0.23 | 0.23 | 0.23 |
All values are rounded to two decimal places. Higher Valence scores indicate more positive emotion; higher Arousal scores indicate greater emotional intensity (with no positive or negative directionality); higher Stress scores indicate greater stress (negative directionality).
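The per-subject step behind the table can be sketched with a plain Lloyd's k-means on each subject's (valence, arousal, stress) trial ratings, with k = 3 clusters whose centers are then read as negative / neutral / positive. The ratings below are random placeholders, not the study's data, and the initialization and iteration count are illustrative assumptions.

```python
import numpy as np

def kmeans(points, k=3, iters=50, seed=0):
    """Plain Lloyd's algorithm: returns (labels, centers)."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), k, replace=False)]
    for _ in range(iters):
        # Assign each rating vector to its nearest center.
        d = np.linalg.norm(points[:, None] - centers[None], axis=2)
        labels = d.argmin(axis=1)
        # Recompute centers, keeping the old one if a cluster emptied.
        centers = np.array([points[labels == j].mean(axis=0)
                            if (labels == j).any() else centers[j]
                            for j in range(k)])
    return labels, centers

ratings = np.random.default_rng(1).normal(size=(36, 3))  # 36 trials x 3 scales
labels, centers = kmeans(ratings)
```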
The emotion classification of OCER and the data arrays.

| Emotion classification | |
|---|---|
| Negative | 72 samples (60 s) |
| Neutral | 180 samples (60 s) |
| Positive | 138 samples (60 s) |

| 3s-dataset | |
|---|---|
| Dataset | 7800 (seg) × 3 (s) × 250 (Hz) × 8 (channels) |
| Label | 7800 × 3 (negative, neutral, positive) |
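The segmentation implied by the two tables is a pure reshape: 390 one-minute recordings (72 + 180 + 138) at 250 Hz and 8 channels, cut into non-overlapping 3 s windows, give 390 × 20 = 7800 segments. The data and per-trial labels below are random placeholders.

```python
import numpy as np

fs, seg_s, n_ch = 250, 3, 8                              # sampling rate, window, channels
recordings = np.random.randn(390, 60 * fs, n_ch)         # (390, 15000, 8) placeholder EEG
trial_labels = np.random.default_rng(0).integers(0, 3, 390)  # placeholder emotion per trial

# Cut each 60 s recording into 20 non-overlapping 3 s windows.
segments = recordings.reshape(390, 60 // seg_s, seg_s * fs, n_ch)
segments = segments.reshape(-1, seg_s * fs, n_ch)        # (7800, 750, 8)
labels = np.repeat(trial_labels, 60 // seg_s)            # every segment keeps its trial's label
```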
FIGURE 4 The resulting channel weights on the OCER dataset for negative, neutral, and positive emotions. **P < 0.01.
Baseline and proposed methods for EEG dataset emotion recognition.

| Method | Channel attention | Temporal attention | CNN | LSTM/Bi-LSTM |
|---|---|---|---|---|
| RNN | × | × | × | √ |
| C-RNN | √ | × | × | √ |
| CTA-RNN | √ | √ | × | √ |
| CNN-RNN | × | × | √ | √ |
| C-CNN-RNN | √ | × | √ | √ |
| CTA-CNN-RNN (proposed) | √ | √ | √ | √ |
Array and total parameters of the 3s-dataset (OCER) fed into different models.

| Model | Input array | Main layers | Total params (LSTM / Bi-LSTM) |
|---|---|---|---|
| RNN | None × 3 × 2000 | 3 units (64, 32, 16) | 544,243 / 1,108,963 |
| C/CTA-RNN | (None × 3 × 250 × 8) reshaped to (None × 3 × 2000) | 3 units (64, 32, 16) | 544,243 / 1,108,963 |
| -CNN-RNN | (None × 3 × 250 × 8) reshaped to (None × 3 × 2000) | 2 Conv + 3 units (64, 32, 16) | 648 × 2 + 263,411 / 530,915 |
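The RNN-row totals can be sanity-checked with the standard Keras-style gate arithmetic: a stacked (64, 32, 16) LSTM on a 2000-dim input plus a 3-class dense head gives exactly 544,243 parameters, and the bidirectional version (where each later layer sees a doubled input) gives 1,108,963. The decomposition into a dense softmax head is our inference from the numbers, not stated in the record.

```python
def lstm_params(input_dim, units):
    # 4 gates, each with input weights, recurrent weights, and a bias.
    return 4 * ((input_dim + units) * units + units)

def dense_params(input_dim, units):
    return input_dim * units + units

# Unidirectional stack (64, 32, 16) on a 2000-dim input + 3-class head:
uni = (lstm_params(2000, 64) + lstm_params(64, 32) + lstm_params(32, 16)
       + dense_params(16, 3))

# Bidirectional: each layer doubled, and each later layer sees a 2x input.
bi = 2 * (lstm_params(2000, 64) + lstm_params(2 * 64, 32)
          + lstm_params(2 * 32, 16)) + dense_params(2 * 16, 3)
```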
FIGURE 5 The results of mean accuracy (%) and one-way ANOVA for the baseline and proposed methods on the 3s-dataset. **P < 0.01 (statistically significant difference); ****P < 0.0001 (extremely significant difference).
FIGURE 6 Confusion matrices for negative, neutral, and positive emotions for the baseline and proposed methods.
FIGURE 7 Average accuracy (%) of the baseline and proposed methods for recognizing negative emotion in each individual.
FIGURE 9 Average accuracy (%) of the baseline and proposed methods for recognizing positive emotion in each individual.