Yea-Hoon Kwon, Sae-Byuk Shin, Shin-Dug Kim.
Abstract
The purpose of this study is to improve the accuracy of human emotion classification using a convolution neural network (CNN) model and to propose an overall method for classifying emotion from multimodal data. We improve classification performance by combining electroencephalogram (EEG) and galvanic skin response (GSR) signals. GSR signals are preprocessed using the zero-crossing rate. Because a CNN can extract sufficient EEG features, we propose a CNN model suited to feature extraction by tuning the hyperparameters of its convolution filters. The EEG signal is preprocessed prior to convolution by a wavelet transform, which considers time and frequency simultaneously. We verify the proposed process on the open Database for Emotion Analysis using Physiological Signals (DEAP) dataset, achieving 73.4% accuracy, a significant performance improvement over the current best-practice models.
Keywords: EEG; GSR; convolution neural networks; deep learning; emotion recognition; hybrid neural network; pattern recognition
Year: 2018 PMID: 29710869 PMCID: PMC5982398 DOI: 10.3390/s18051383
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.576
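The abstract states that GSR signals are preprocessed using the zero-crossing rate but gives no formula. As a minimal sketch, assuming the common definition (the fraction of adjacent sample pairs whose sign differs) and assuming the mean is removed first so that a positive-only signal can still cross zero, it could look like the following. The function name `zero_crossing_rate` and the mean-removal step are illustrative assumptions, not the authors' implementation.

```python
from typing import Sequence


def zero_crossing_rate(signal: Sequence[float]) -> float:
    """Fraction of adjacent sample pairs whose signs differ (hypothetical helper)."""
    if len(signal) < 2:
        return 0.0
    # Assumption: centre the signal on its mean before counting sign changes.
    mean = sum(signal) / len(signal)
    centered = [x - mean for x in signal]
    crossings = sum(
        1 for a, b in zip(centered, centered[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(centered) - 1)
```

A constant signal yields a rate of 0, while a signal that alternates in sign on every sample yields a rate of 1.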
Figure 1. The international standard 10-20 system and the location of the electrodes.
Figure 2. The two-dimensional arousal-valence plane.
Figure 3. K-means clustering results of arousal-valence self-assessment data: (a) clustered arousal-valence data when k = 2; (b) clustered arousal-valence data when k = 4.
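The clustering named in the Figure 3 caption can be illustrated with a small k-means sketch over (arousal, valence) pairs. This is a plain Lloyd iteration written for illustration only: the farthest-first initialisation, the function name `kmeans`, and the sample ratings in the usage note are assumptions, and the record does not say how the authors initialised or ran the clustering.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (arousal, valence) self-assessment rating


def _dist2(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two rating pairs."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def kmeans(points: List[Point], k: int, max_iter: int = 100):
    """Return (centroids, labels) for k clusters (illustrative sketch)."""
    # Assumption: farthest-first initialisation keeps the sketch deterministic.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(_dist2(p, c) for c in centroids))
        )
    labels: List[int] = []
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: _dist2(p, centroids[c])) for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = (
                    sum(m[0] for m in members) / len(members),
                    sum(m[1] for m in members) / len(members),
                )
    return centroids, labels
```

Calling `kmeans(ratings, 2)` and `kmeans(ratings, 4)` on the same list of hypothetical ratings mirrors the two panels of the figure.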
Figure 4. Wavelet-transformed spectrogram for each electrode.
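To make the idea of a wavelet-transformed spectrogram concrete, the sketch below computes a time-frequency magnitude map for one channel with a complex Morlet wavelet. The choice of the Morlet wavelet, the `n_cycles` parameter, the truncation at three standard deviations, and the function name `morlet_scalogram` are all assumptions made for illustration; this record does not specify which wavelet or parameters the authors used.

```python
import cmath
import math
from typing import List, Sequence


def morlet_scalogram(
    signal: Sequence[float],
    fs: float,
    freqs: Sequence[float],
    n_cycles: float = 6.0,
) -> List[List[float]]:
    """Return one row of wavelet magnitudes per frequency (illustrative sketch)."""
    n = len(signal)
    rows: List[List[float]] = []
    for f in freqs:
        sigma_t = n_cycles / (2.0 * math.pi * f)   # Gaussian width in seconds
        half = int(math.ceil(3.0 * sigma_t * fs))  # truncate at +/- 3 sigma
        # Sampled complex Morlet wavelet: complex sinusoid times a Gaussian.
        kernel = [
            cmath.exp(2j * math.pi * f * (k / fs))
            * math.exp(-((k / fs) ** 2) / (2.0 * sigma_t ** 2))
            for k in range(-half, half + 1)
        ]
        norm = sum(abs(w) for w in kernel)  # sum of the Gaussian envelope
        row = []
        for i in range(n):
            acc = 0j
            for j, w in enumerate(kernel):
                idx = i + j - half
                if 0 <= idx < n:  # samples outside the signal count as zero
                    acc += signal[idx] * w.conjugate()
            row.append(abs(acc) / norm)
        rows.append(row)
    return rows
```

With this normalisation, a unit-amplitude sinusoid produces a magnitude near 0.5 in the row matching its frequency and much smaller values in distant rows, which is what makes the rows readable as a spectrogram over time.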
The number of extracted wavelet transformed data for two types of labels.
| Label 1 | Data quantity |
|---|---|
| HAHV | 458 |
| HALV | 294 |
| LAHV | 255 |
| LALV | 273 |
| Total | 1280 |
1 H: high, L: low, V: valence, A: arousal, e.g., HAHV: high arousal, high valence.
The number of extracted wavelet transformed data for four types of labels.
| Label 1 | Data quantity | Label 2 | Data quantity |
|---|---|---|---|
| HA | 752 | HV | 713 |
| LA | 528 | LV | 567 |
| Total | 1280 | Total | 1280 |
1, 2 H: high, L: low, V: valence, A: arousal, e.g., HA: high arousal.
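The two count tables above are consistent with each other: each two-class total is the sum of the combined-label counts that share that arousal or valence level. The short check below reproduces that arithmetic from the tabulated values; the helper name `two_class_counts` is hypothetical.

```python
from collections import Counter
from typing import Dict

# Counts copied from the combined-label table above.
FOUR_CLASS = {"HAHV": 458, "HALV": 294, "LAHV": 255, "LALV": 273}


def two_class_counts(four_class: Dict[str, int]) -> Dict[str, int]:
    """Sum combined-label counts into per-arousal and per-valence totals."""
    totals: Counter = Counter()
    for label, n in four_class.items():
        totals[label[:2]] += n  # arousal part, e.g. "HA"
        totals[label[2:]] += n  # valence part, e.g. "HV"
    return dict(totals)


print(two_class_counts(FOUR_CLASS))
```

The result gives HA = 752, LA = 528, HV = 713 and LV = 567, matching the second table, and both label sets total 1280 samples.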
Figure 5. Proposed convolution neural network combining electroencephalogram (EEG) and wavelet transformed galvanic skin response (GSR).
Figure 6. Four-class loss used to find the optimal training point.
Hardware specifications.
| Component | Specification |
|---|---|
| CPU | Intel Core i5-6600 |
| GPU | NVIDIA GeForce GTX 1070, 8 GB |
| Memory | DDR4, 16 GB |
| Operating system | Ubuntu 16.04 |
| Framework | TensorFlow 1.3 |
Emotion classification accuracy.
| Results | Two Class Classification Accuracy 3: Arousal | Two Class Classification Accuracy 3: Valence | Four Class Classification Accuracy 4 |
|---|---|---|---|
| Results A 1 | 0.7812 | 0.8125 | 0.7500 |
| Results B 2 | 0.7656 | 0.8046 | 0.7343 |
1 Results A: label-based classification using hold-out validation.
2 Results B: video-based classification using leave-one-out cross-validation.
3 Arousal: HA, LA; Valence: HV, LV.
4 HAHV, HALV, LALV, LAHV.
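The two validation schemes named in footnotes 1 and 2 differ in how samples are split. The sketch below shows one plausible reading: a single random hold-out split, and a leave-one-out scheme in which all samples from one video are held out per fold. Treating "video based" as grouping by video ID is an assumption, as are the function names, the 20% test fraction, and the fixed seed.

```python
import random
from typing import Iterator, List, Sequence, Tuple

Split = Tuple[List[int], List[int]]  # (train indices, test indices)


def hold_out_split(n_samples: int, test_fraction: float = 0.2, seed: int = 0) -> Split:
    """Single random train/test split (assumed 80/20 by default)."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = int(round(n_samples * test_fraction))
    return sorted(indices[n_test:]), sorted(indices[:n_test])


def leave_one_video_out(video_ids: Sequence[int]) -> Iterator[Split]:
    """Yield one fold per video, holding out every sample of that video."""
    for held_out in sorted(set(video_ids)):
        train = [i for i, v in enumerate(video_ids) if v != held_out]
        test = [i for i, v in enumerate(video_ids) if v == held_out]
        yield train, test
```

Averaging the accuracy over all folds of `leave_one_video_out` gives a single cross-validated score, whereas `hold_out_split` gives one score from one partition.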
Two class classification performance.
| Model | Arousal accuracy | Valence accuracy |
|---|---|---|
| CNS feature based single modality | 0.6200 | 0.5760 |
| PNS feature based single modality | 0.5700 | 0.6270 |
| Liu and Sourina | 0.7651 | 0.5080 |
| Naser and Saha | 0.6620 | 0.6430 |
| Chen et al. | 0.6909 | 0.6789 |
| Yoon and Chung | 0.7010 | 0.7090 |
| Li et al. | 0.6420 | 0.5840 |
| Wang and Shang | 0.5120 | 0.6090 |
| Proposed fusion CNN model | | |
Figure 7. Arousal and valence classification accuracy.
Four class classification performance.
| Model | Accuracy |
|---|---|
| M Zubair and C Yoon | 0.4540 |
| N Jadhav et al. | 0.4625 |
| Hatamikia et al. | 0.5515 |
| Martínez-Rodrigo et al. | 0.7250 |
| Zhang et al. | 0.7162 |
| Mei et al. | 0.7310 |
| Proposed fusion CNN model | |
Figure 8. Four-class classification accuracy.