| Literature DB >> 30959956 |
Fadi Al Machot, Ali Elmachot, Mouhannad Ali, Elyan Al Machot, Kyandoghere Kyamakya.
Abstract
One of the main objectives of Active and Assisted Living (AAL) environments is to ensure that elderly and/or disabled people live well in their immediate environments; this can be monitored by, among other means, recognizing emotions from minimally intrusive sensors such as Electrodermal Activity (EDA) sensors. However, building a machine-learning model that recognizes human emotions when trained on one specific group of persons and tested on an entirely new group remains a serious challenge in the field, since the second group may exhibit different emotion patterns. Accordingly, this paper contributes to the field of human emotion recognition by proposing a Convolutional Neural Network (CNN) architecture that yields promising robustness-related results for both subject-dependent and subject-independent human emotion recognition. The CNN model was trained using grid search, a model hyperparameter optimization technique, to fine-tune the parameters of the proposed CNN architecture. The overall concept's performance is validated and stress-tested on the MAHNOB and DEAP datasets. The results demonstrate a promising robustness improvement across various evaluation metrics: accuracy for subject-independent classification reaches 78% and 82% for MAHNOB and DEAP respectively, and 81% and 85% for subject-dependent classification on MAHNOB and DEAP respectively (4 classes/labels). The work shows clearly that, using solely the non-intrusive EDA sensor, a robust classification of human emotion is possible even without involving additional physiological signals.
Keywords: convolutional neural networks; deep learning; electrodermal activity (EDA); subject-dependent emotion recognition; subject-independent emotion recognition
Year: 2019 PMID: 30959956 PMCID: PMC6479880 DOI: 10.3390/s19071659
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.576
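The abstract states that the CNN's hyperparameters were tuned with grid search. A minimal sketch of exhaustive grid search in plain Python follows; the parameter grid and the `toy_evaluate` scoring function are illustrative assumptions, not the authors' actual search space or training loop.

```python
from itertools import product

# Illustrative hyperparameter grid (assumed values, not the paper's exact search space)
grid = {
    "batch_size": [25, 50, 100],
    "learning_rate": [1e-2, 1e-3, 1e-4],
    "kernel_size": [3, 5],
}

def grid_search(grid, evaluate):
    """Exhaustively evaluate every parameter combination and return the best one."""
    names = list(grid)
    best_score, best_params = float("-inf"), None
    for values in product(*(grid[n] for n in names)):
        params = dict(zip(names, values))
        score = evaluate(params)  # in practice: cross-validated accuracy of a trained model
        if score > best_score:
            best_score, best_params = score, params
    return best_params, best_score

# Toy stand-in for training + validation; peaks at batch_size=50, lr=1e-3, kernel=3
def toy_evaluate(p):
    return -abs(p["batch_size"] - 50) - abs(p["learning_rate"] - 1e-3) - abs(p["kernel_size"] - 3)

best, score = grid_search(grid, toy_evaluate)
```

In a real run, `evaluate` would train the CNN with the given parameters and return a validation score; everything else stays the same.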
Summary of state-of-the-art works for human emotion recognition using physiological signals.
| Paper | Classifier | Features | Signals |
|---|---|---|---|
| [ | SVM | Statistical Features | Facial electromyograms, electrocardiogram, respiration, and electrodermal activity |
| [ | Genetic algorithm and K-NN | Statistical features | EDA |
| [ | Neuro-fuzzy inference | Statistical Features | Facial electromyograms, electrocardiogram, respiration, and electrodermal activity |
| [ | K-NN | Statistical features | EDA |
| [ | SVM | Wrapper feature selection (WFS) | EDA |
| [ | CNN | Raw data | Patient’s movements XYZ + EDA |
| [ | Deep learning (CNN+RNN) | Raw data | AVEC 2016 |
| [ | ESN-CNN | Statistical features | ECG (Electrocardiogram), EDA (Electrodermal activity) and ST (Skin Temperature) |
| [ | Dynamic calibration + K-NN | Statistical features | EDA |
SVM: Support Vector Machine, K-NN: K-Nearest Neighbor, CNN: Convolutional Neural Network, RNN: Recurrent Neural Network, ESN-CNN: Echo State Network - Cellular Neural Network.
Figure 1. Self-assessment manikin scales for valence (above) and arousal (below) [32].
Figure 2. The proposed CNN model.
Parameters used for all the layers of the proposed CNN model.
| Layer | Kernel, Units | Other Layer Parameters |
|---|---|---|
| C1 | (3 × 3), 2 | Activation = SELU, Strides = 1 |
| P1 | (2 × 2) | Strides = 2 |
| C2 | (3 × 3), 196 | Activation = SELU, Strides = 1 |
| P2 | (3 × 3) | Strides = 3 |
| C3 | (3 × 3), 92 | Activation = SELU, Strides = 1 |
| P3 | (3 × 3) | Strides = 3 |
C is the convolution layer, P is the max-pooling layer and SELU is the Scaled Exponential Linear Unit activation function.
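The layer table can be sanity-checked by propagating a feature-map size through the stack. A minimal sketch, assuming 'valid' padding (output width = ⌊(n − k)/s⌋ + 1) and a hypothetical 128-wide input; the paper's exact input shape and padding are not given in this excerpt.

```python
def out_size(n, kernel, stride):
    """Output width of a 'valid'-padded conv or pooling layer: floor((n - k) / s) + 1."""
    return (n - kernel) // stride + 1

# (kernel, stride) per layer, matching the table order: C1, P1, C2, P2, C3, P3
layers = [(3, 1), (2, 2), (3, 1), (3, 3), (3, 1), (3, 3)]

def trace(n):
    """Return the feature-map width after each layer, starting from the input width n."""
    sizes = [n]
    for k, s in layers:
        n = out_size(n, k, s)
        sizes.append(n)
    return sizes

sizes = trace(128)  # hypothetical input width; yields [128, 126, 63, 61, 20, 18, 6]
```

The alternating stride-1 convolutions and larger-stride poolings shrink the map quickly, which keeps the parameter count of the final dense layers small.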
Figure 3. Overall emotion distribution for one subject, where C1: High Valence/High Arousal (HVHA), C2: High Valence/Low Arousal (HVLA), C3: Low Valence/Low Arousal (LVLA) and C4: Low Valence/High Arousal (LVHA), based on a subject's data in MAHNOB.
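The four classes above partition the valence/arousal plane into quadrants. A minimal sketch of that mapping, assuming SAM ratings on a 1–9 scale split at the midpoint 5 (the split threshold is my assumption; the excerpt does not state it):

```python
def quadrant(valence, arousal, threshold=5.0):
    """Map a (valence, arousal) rating pair to one of the four emotion classes.

    C1: HVHA, C2: HVLA, C3: LVLA, C4: LVHA; `threshold` is an assumed midpoint.
    """
    high_v = valence > threshold
    high_a = arousal > threshold
    if high_v and high_a:
        return "C1"  # High Valence / High Arousal
    if high_v:
        return "C2"  # High Valence / Low Arousal
    if not high_a:
        return "C3"  # Low Valence / Low Arousal
    return "C4"      # Low Valence / High Arousal
```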
Figure 4. Scatter plot of the first three Fisher scores based on a subject's data in MAHNOB.
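Figure 4 ranks features by Fisher score. A common per-feature formulation is between-class scatter divided by within-class scatter; a minimal pure-Python sketch follows (the paper's exact formulation may differ):

```python
def mean(xs):
    return sum(xs) / len(xs)

def fisher_score(values, labels):
    """Per-feature Fisher score: between-class scatter / within-class scatter.

    A high score means class means are far apart relative to class variances.
    """
    mu = mean(values)
    between, within = 0.0, 0.0
    for c in set(labels):
        xs = [v for v, l in zip(values, labels) if l == c]
        mu_c = mean(xs)
        var_c = sum((x - mu_c) ** 2 for x in xs) / len(xs)
        between += len(xs) * (mu_c - mu) ** 2
        within += len(xs) * var_c
    return between / within if within else float("inf")
```

A well-separated feature (class values clustered apart) scores orders of magnitude higher than one whose classes overlap, which is what makes the score useful for ranking EDA features.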
Values of parameters of proposed CNN and other classifiers.
| Model | Parameters |
|---|---|
| SVM (poly) | Degree of the polynomial kernel function = 3 |
| SVM (rbf) | – |
| Random Forest | Number of estimators = 10 trees, criterion = Gini impurity, minimum number of samples required to split an internal node = 2 |
| Naive Bayes | Priors = class prior probabilities |
| KNN | Distance metric = 'minkowski', power parameter for the Minkowski metric = 2, number of neighbors = 3 |
| Proposed (CNN) | Loss = categorical_crossentropy, optimizer = Adam, batch_size = 50, epochs = 1000 |
Performance metrics for DEAP (the average performance results for training and testing on the same subjects).
| Model | Accuracy | Precision | Recall | F-Measure |
|---|---|---|---|---|
| SVM (Linear) | 0.46 | 0.41 | 0.46 | 0.42 |
| SVM (poly) | 0.41 | 0.53 | 0.43 | 0.33 |
| SVM (rbf) | 0.59 | 0.60 | 0.60 | 0.58 |
| Random Forest | 0.74 | 0.76 | 0.75 | 0.75 |
| Naive Bayes | 0.44 | 0.48 | 0.44 | 0.42 |
| K-NN | 0.80 | 0.80 | 0.80 | 0.80 |
| Proposed (CNN) | 0.85 | – | – | – |
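The precision, recall and F-measure columns in the tables above are presumably macro-averaged over the four classes. A minimal sketch of computing those metrics from predictions, in pure Python (the F-measure here is derived from the macro precision/recall, one of several common conventions):

```python
def macro_metrics(y_true, y_pred, classes):
    """Accuracy plus macro-averaged precision, recall and F-measure."""
    precisions, recalls = [], []
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        precisions.append(tp / (tp + fp) if tp + fp else 0.0)
        recalls.append(tp / (tp + fn) if tp + fn else 0.0)
    precision = sum(precisions) / len(classes)
    recall = sum(recalls) / len(classes)
    f_measure = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    return accuracy, precision, recall, f_measure
```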
Performance metrics for MAHNOB (the average performance results for training and testing on the same subjects).
| Model | Accuracy | Precision | Recall | F-Measure |
|---|---|---|---|---|
| SVM (Linear) | 0.49 | 0.48 | 0.50 | 0.43 |
| SVM (poly) | 0.47 | 0.49 | 0.48 | 0.36 |
| SVM (rbf) | 0.55 | 0.53 | 0.56 | 0.51 |
| Random Forest | 0.68 | 0.70 | 0.70 | 0.70 |
| Naive Bayes | 0.37 | 0.43 | 0.39 | 0.35 |
| K-NN | 0.74 | 0.76 | 0.75 | 0.75 |
| Proposed (CNN) | 0.81 | – | – | – |
Performance metrics for MAHNOB (the average performance results for training and testing on different subjects).
| Model | Accuracy | Precision | Recall | F-Measure |
|---|---|---|---|---|
| SVM (Linear) | 0.34 | 0.47 | 0.34 | 0.37 |
| SVM (poly) | 0.36 | 0.70 | 0.37 | 0.42 |
| SVM (rbf) | 0.41 | 0.53 | 0.42 | 0.45 |
| Random Forest | 0.64 | 0.65 | 0.65 | 0.65 |
| Naive Bayes | 0.27 | 0.43 | 0.27 | 0.33 |
| K-NN | 0.72 | 0.73 | 0.73 | 0.72 |
| Proposed (CNN) | 0.78 | – | – | – |
Performance metrics for DEAP (the average performance results for training and testing on different subjects).
| Model | Accuracy | Precision | Recall | F-Measure |
|---|---|---|---|---|
| SVM (Linear) | 0.40 | 0.41 | 0.40 | 0.31 |
| SVM (poly) | 0.39 | 0.41 | 0.39 | 0.28 |
| SVM (rbf) | 0.44 | 0.50 | 0.44 | 0.40 |
| Random Forest | 0.69 | 0.70 | 0.69 | 0.69 |
| Naive Bayes | 0.36 | 0.31 | 0.36 | 0.28 |
| K-NN | 0.75 | 0.76 | 0.75 | 0.76 |
| Proposed (CNN) | 0.82 | – | – | – |
Confusion matrix for both MAHNOB and DEAP (the average performance results for training and testing on the same subjects).
| Class | C1 | C2 | C3 | C4 |
|---|---|---|---|---|
| C1 | 0.861 | 0.057 | 0.071 | 0.046 |
| C2 | 0.062 | 0.808 | 0.059 | 0.034 |
| C3 | 0.039 | 0.050 | 0.878 | 0.017 |
| C4 | 0.045 | 0.063 | 0.042 | 0.866 |
C1: High Valence/High Arousal (HVHA), C2: High Valence/Low Arousal (HVLA), C3: Low Valence/Low Arousal (LVLA) and C4: Low Valence/High Arousal (LVHA).
Confusion matrix for both MAHNOB and DEAP (the average performance results for training and testing on different subjects).
| Class | C1 | C2 | C3 | C4 |
|---|---|---|---|---|
| C1 | 0.762 | 0.177 | 0 | 0.146 |
| C2 | 0.049 | 0.685 | 0 | 0.077 |
| C3 | 0.004 | 0 | 0.705 | 0.017 |
| C4 | 0.108 | 0.126 | 0.058 | 0.857 |
C1: High Valence/High Arousal (HVHA), C2: High Valence/Low Arousal (HVLA), C3: Low Valence/Low Arousal (LVLA) and C4: Low Valence/High Arousal (LVHA).
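Given a row-normalized confusion matrix like the subject-dependent one above, the implied overall accuracy is the mean of the diagonal if the four classes are equally frequent (the equal-frequency assumption is mine, not stated in the excerpt):

```python
# Row-normalized subject-dependent confusion matrix, values copied from the table above
matrix = [
    [0.861, 0.057, 0.071, 0.046],
    [0.062, 0.808, 0.059, 0.034],
    [0.039, 0.050, 0.878, 0.017],
    [0.045, 0.063, 0.042, 0.866],
]

# Each diagonal entry is a per-class recall; with equally frequent classes,
# overall accuracy is their average.
accuracy = sum(matrix[i][i] for i in range(4)) / 4
```

The result (about 0.853) is consistent in magnitude with the subject-dependent accuracies reported in the abstract.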
A summary of the state-of-the-art results using only EDA.
| Paper | Experiment | Number of Classes | Classifier Used | Arousal | Valence | Accuracy (Both) |
|---|---|---|---|---|---|---|
| [ | Subject-dependent | 4 | Genetic algorithm and K-NN | 0.56 | 0.50 | – |
| [ | Subject-independent | 3 | K-NN | 0.77 | 0.84 | – |
| [ | Subject-independent | 2 | SVM | – | – | 0.95 |
| [ | Subject-dependent | 2 | CNN | – | – | 0.80 |
| [ | Subject-independent | 2 | CNN | 0.10 | 0.33 | – |
| Proposed CNN | Subject-independent (DEAP) | 4 | CNN | – | – | 0.82 |
| Proposed CNN | Subject-independent (MAHNOB) | 4 | CNN | – | – | 0.78 |
| Proposed CNN | Subject-dependent (DEAP) | 4 | CNN | – | – | 0.85 |
| Proposed CNN | Subject-dependent (MAHNOB) | 4 | CNN | – | – | 0.81 |
SVM: Support Vector Machine, K-NN: K-Nearest Neighbor, CNN: Convolutional Neural Network.