Jing Chen1, Bin Hu2, Yue Wang3, Philip Moore3, Yongqiang Dai3, Lei Feng4, Zhijie Ding5.
Abstract
BACKGROUND: Collaboration between humans and computers has become pervasive and ubiquitous; however, current computer systems are limited in that they fail to address the emotional component. An accurate understanding of human emotions is necessary for these computers to trigger proper feedback. Among multiple emotional channels, physiological signals are synchronous with emotional responses; therefore, analyzing physiological changes is a recognized way to estimate human emotions. In this paper, a three-stage decision method is proposed to recognize four emotions based on physiological signals in the multi-subject context. Emotion detection is achieved by using a stage-divided strategy in which each stage deals with a fine-grained goal.
Keywords: Emotion recognition; Multimodal physiological signals; Stage-divided; Subject-independent
Year: 2017 PMID: 29297324 PMCID: PMC5751758 DOI: 10.1186/s12911-017-0562-x
Source DB: PubMed Journal: BMC Med Inform Decis Mak ISSN: 1472-6947 Impact factor: 2.796
Fig. 1 Valence-arousal space. EQ1: valence rating > 5 and arousal rating > 5, EQ2: valence rating > 5 and arousal rating ≤ 5, EQ3: valence rating ≤ 5 and arousal rating ≤ 5, EQ4: valence rating ≤ 5 and arousal rating > 5
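The quadrant definitions in the Fig. 1 caption can be written as a small lookup. The following is a minimal sketch, not the authors' code; the function name `emotion_quadrant` is hypothetical, and the only rule taken from the source is the threshold of 5 on each rating.

```python
def emotion_quadrant(valence: float, arousal: float) -> str:
    """Map a (valence, arousal) rating pair to EQ1..EQ4.

    Thresholds follow the Fig. 1 caption: a rating counts as "high"
    only when it is strictly greater than 5.
    """
    high_valence = valence > 5
    high_arousal = arousal > 5
    if high_valence and high_arousal:
        return "EQ1"  # valence > 5, arousal > 5
    if high_valence:
        return "EQ2"  # valence > 5, arousal <= 5
    if not high_arousal:
        return "EQ3"  # valence <= 5, arousal <= 5
    return "EQ4"      # valence <= 5, arousal > 5
```

A rating pair that sits exactly on the boundary, such as (5, 5), falls into EQ3 under these inequalities.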
Database content summary
| Property | Value |
|---|---|
| Online subjective annotation | |
| Number of videos | 120 |
| Video duration | 1 min affective highlight |
| Selection method | 60 via last.fm affective tags, 60 manually selected |
| Number of ratings per video | 14–16 |
| Rating scales | Arousal, Valence and Dominance |
| Rating values | 1–9 |
| Physiological experiment | |
| Number of subjects | 32 |
| Number of videos | 40 |
| Selection method | Subset of online annotated videos with clearest responses |
| Rating scales | Arousal, Valence, Dominance, Liking and Familiarity |
| Rating values | Familiarity: discrete scale of 1–5, others: continuous scale of 1–9 |
| Recorded signals | 32-channel 512 Hz EEG, peripheral physiological signals, face video from 22 subjects |
Features extracted from physiological signals
| Feature index | Notation of the extracted features |
|---|---|
| No. 1-448 EEG time and frequency-domain features (14 feature types × 32 channels) | Mean, Var, peak-to-peak amplitude, Skewness, Kurtosis |
| | Average PSD in theta (4-7 Hz), alpha (8-15 Hz), beta (16-31 Hz), gamma (32-45 Hz), beta/theta, beta/alpha |
| | Three Hjorth parameters: mobility, activity and complexity |
| No. 449-504 EEG hemispheric asymmetry (4 feature types × 14 channel pairs) | Difference of average PSD in theta, alpha, beta and gamma bands for 14 channel pairs between right and left scalp |
| No. 505-600 EEG nonlinear features (3 feature types × 32 channels) | Spectral Entropy, Shannon Entropy and C0 complexity |
| No. 601-608 EOG features (4 feature types × 2 channels) | Mean, Var, peak-to-peak amplitude, Energy |
| No. 609-642 EMG features (17 feature types × 2 channels) | Mean, Var, Total spectral power |
| | 1Diff-Mean, 1Diff-Median, 1Diff-Min, 1Diff-Var, 1Diff-Max, 1Diff-MinRatio, 1Diff-MaxRatio |
| | 2Diff-Mean, 2Diff-Median, 2Diff-Min, 2Diff-Var, 2Diff-Max, 2Diff-MinRatio, 2Diff-MaxRatio |
| No. 643-646 TMP features (4 feature types × 1 channel) | Mean, 1Diff-Mean, Spectral power in the bands (0-0.1 Hz) and (0.1-0.2 Hz) |
| No. 647-666 BVP features (20 feature types × 1 channel) | Hr-Mean, Hr-Var, Hr-Range |
| | Hrv-Mean, Hrv-Var, Hrv-Min, Hrv-Max, Hrv-Range, Hrv-pNN50 |
| | HrvDistr-Mean, HrvDistr-Median, HrvDistr-Var, HrvDistr-Min, HrvDistr-Max, HrvDistr-Range, HrvDistr-Triind |
| | PSD in bands (0-0.2 Hz), (0.2-0.4 Hz), (0.4-0.6 Hz), and (0.6-0.8 Hz) of Hrv |
| No. 667-721 RSP features (55 feature types × 1 channel) | Mean, Var, Range, MaxRatio |
| | 1Diff-Mean, 1Diff-Median, 1Diff-Var, 1Diff-Range, 1Diff-MaxRatio |
| | 2Diff-Mean, 2Diff-Median, 2Diff-Var, 2Diff-Range, 2Diff-MaxRatio |
| | RSPPulse-Mean, RSPPulse-Var, RSPPulse-Range, RSPPulse-MaxRatio |
| | RSPPulse-1Diff-Mean, RSPPulse-1Diff-Median, RSPPulse-1Diff-Var, RSPPulse-1Diff-Min, RSPPulse-1Diff-Max, RSPPulse-1Diff-Range, RSPPulse-1Diff-MaxRatio |
| | RSPPulse-2Diff-Mean, RSPPulse-2Diff-Median, RSPPulse-2Diff-Var, RSPPulse-2Diff-Min, RSPPulse-2Diff-Max, RSPPulse-2Diff-Range, RSPPulse-2Diff-MaxRatio |
| | RSPAmpl-Mean, RSPAmpl-Var, RSPAmpl-Range, RSPAmpl-MaxRatio |
| | RSPAmpl-1Diff-Mean, RSPAmpl-1Diff-Median, RSPAmpl-1Diff-Var, RSPAmpl-1Diff-Min, RSPAmpl-1Diff-Max, RSPAmpl-1Diff-Range, RSPAmpl-1Diff-MaxRatio |
| | RSPAmpl-2Diff-Mean, RSPAmpl-2Diff-Median, RSPAmpl-2Diff-Var, RSPAmpl-2Diff-Min, RSPAmpl-2Diff-Max, RSPAmpl-2Diff-Range, RSPAmpl-2Diff-MaxRatio |
| | PSD in the bands (0-0.1 Hz), (0.1-0.2 Hz), (0.2-0.3 Hz), and (0.3-0.4 Hz), Ratio of PSD in the band (0-0.25 Hz) to PSD in the band (0.25-0.45 Hz) |
| No. 722-742 GSR features (21 feature types × 1 channel) | Rising time, Decay time |
| | Sc-Mean, Sc-Median, Sc-Var, Sc-MinRatio, Sc-MaxRatio |
| | Sc-1Diff-Mean, Sc-1Diff-Median, Sc-1Diff-Var, Sc-1Diff-Min, Sc-1Diff-Max, Sc-1Diff-MinRatio, Sc-1Diff-MaxRatio |
| | Sc-2Diff-Mean, Sc-2Diff-Median, Sc-2Diff-Var, Sc-2Diff-Min, Sc-2Diff-Max, Sc-2Diff-MinRatio, Sc-2Diff-MaxRatio |
MaxRatio: number of maxima divided by the total number of signal values, MinRatio: number of minima divided by the total number of signal values, 1Diff: approximation of the first derivative, 2Diff: approximation of the second derivative, Range: maximum minus minimum, RSPPulse: pulse signal of RSP, RSPAmpl: amplitude signal of RSP, Sc: skin conductance, Hr: heart rate, Hrv: heart rate variability, pNN50: number of pairs of adjacent NN intervals differing by more than 50 ms in the entire recording divided by the total number of NN intervals, HrvDistr: distribution of NN intervals, HrvDistr-Triind: total number of all NN intervals divided by the height of the histogram of all NN intervals, Var: variance, PSD: power spectral density
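Several of the statistics defined in the note above are simple enough to sketch directly. The code below is an illustrative reading of those definitions, not the authors' implementation. It assumes that 1Diff and 2Diff are plain successive differences, and that "maxima" and "minima" mean strict interior local extrema; the source does not state either choice. All function names are hypothetical.

```python
def first_diff(x):
    """1Diff: successive differences (assumed approximation of the first derivative)."""
    return [b - a for a, b in zip(x, x[1:])]


def second_diff(x):
    """2Diff: differences of the differences (assumed approximation of the second derivative)."""
    return first_diff(first_diff(x))


def value_range(x):
    """Range: maximum minus minimum."""
    return max(x) - min(x)


def max_ratio(x):
    """MaxRatio: number of local maxima divided by the total number of values.

    Assumption: a maximum is an interior sample strictly greater than both neighbours.
    """
    n_max = sum(1 for i in range(1, len(x) - 1) if x[i - 1] < x[i] > x[i + 1])
    return n_max / len(x)


def min_ratio(x):
    """MinRatio: number of local minima divided by the total number of values."""
    n_min = sum(1 for i in range(1, len(x) - 1) if x[i - 1] > x[i] < x[i + 1])
    return n_min / len(x)


def pnn50(nn_ms):
    """pNN50 as defined in the note: adjacent NN-interval pairs differing by more
    than 50 ms, divided by the total number of NN intervals (given in ms)."""
    n_pairs = sum(1 for d in first_diff(nn_ms) if abs(d) > 50)
    return n_pairs / len(nn_ms)
```

Composite features in the table, such as 1Diff-Mean or 2Diff-Range, would then be obtained by applying the corresponding statistic to the output of `first_diff` or `second_diff`.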
Fig. 2 Diagram of the three-stage decision method for multi-subject emotion recognition
Fig. 3 The process of feature selection in stage one. S_k (k = 1, 2, …, 31) denotes the subject IDs
Fig. 4 Most selected features in each stage. a When the two emotion pools correspond to HA and LA (high and low arousal). b When the two emotion pools correspond to HV and LV (high and low valence)
Recognition performance of allocating four emotions to two emotion pools differently
| Strategy | EQ1 | EQ2 | EQ3 | EQ4 | Average |
|---|---|---|---|---|---|
| Two pools: HA and LA | 86.67% | 80.00% | 30.56% | 58.33% | 77.57% |
| Two pools: HV and LV | 33.33% | 83.33% | 50.55% | 50.00% | 43.57% |
Fig. 5 Three typical multiclass classification approaches. a Multiclass classifiers. b One-against-Rest scheme. c One-against-One scheme
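To make the difference between the two decomposition schemes in Fig. 5 concrete, the sketch below enumerates the binary problems each one creates for the four classes EQ1 to EQ4. It is an illustration only. The function names are hypothetical, and the majority vote used to combine One-against-One decisions is a common convention assumed here, not a rule stated in the source.

```python
from collections import Counter
from itertools import combinations

CLASSES = ["EQ1", "EQ2", "EQ3", "EQ4"]


def one_against_rest_tasks(classes):
    """One binary problem per class: that class versus all remaining classes."""
    return [(c, [o for o in classes if o != c]) for c in classes]


def one_against_one_tasks(classes):
    """One binary problem per unordered pair of classes."""
    return list(combinations(classes, 2))


def majority_vote(pairwise_winners):
    """Combine pairwise decisions by picking the most frequent winner.

    Assumption: simple majority voting; ties go to the label seen first.
    """
    return Counter(pairwise_winners).most_common(1)[0][0]
```

With K = 4 classes, One-against-Rest trains K = 4 binary classifiers, while One-against-One trains K(K-1)/2 = 6.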
Parameter setting and recognition performance of comparative methods
| No. | Method | Parameters | Description | Accuracy |
|---|---|---|---|---|
| 1 | | | | 44.30% |
| | SVM | Linear kernel, cost C=200, tolerance E=0.001, epsilon for the loss function P=1.0E-12 | One sample from one trial | 51.10% |
| | C4.5 | - | | 47.24% |
| | Random forest | - | | 46.69% |
| 2 | One-against-Rest | | 29 samples from one trial | 51.13% |
| 3 | One-against-One | | | 52.16% |
| | Three-stage decision | Two pools: HA and LA | | 77.57% |