Hariharan Muthusamy, Kemal Polat, Sazali Yaacob.
Abstract
In recent years, many studies have used speech-related features for speech emotion recognition; however, recent studies show a strong correlation between emotional states and glottal features. In this work, Mel-frequency cepstral coefficients (MFCCs), linear predictive cepstral coefficients (LPCCs), perceptual linear predictive (PLP) features, gammatone filter outputs, timbral texture features, stationary wavelet transform based timbral texture features, and relative wavelet packet energy and entropy features were extracted from emotional speech (ES) signals and their glottal waveforms (GW). Particle swarm optimization based clustering (PSOC) and wrapper based particle swarm optimization (WPSO) were proposed to enhance the discriminating ability of the features and to select the discriminating features, respectively. Three emotional speech databases were used to evaluate the proposed method. An extreme learning machine (ELM) was employed to classify the different types of emotions. The experimental results show that the proposed method significantly improves speech emotion recognition performance compared with previous works published in the literature.
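The abstract names an extreme learning machine (ELM) as the classifier but gives no configuration. As a rough illustration only, a basic ELM draws its hidden-layer weights at random and solves for the output weights in closed form. The sketch below assumes a tanh hidden layer, ridge-regularised least squares, and synthetic two-class data; the class name `SimpleELM`, the hidden size, and the regularisation value are hypothetical choices, not the authors' settings.

```python
import numpy as np


class SimpleELM:
    """Illustrative single-hidden-layer ELM (hypothetical settings)."""

    def __init__(self, n_hidden=30, reg=1e-3, seed=0):
        self.n_hidden = n_hidden
        self.reg = reg
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        # Random, untrained projection followed by a tanh non-linearity.
        return np.tanh(X @ self.W + self.b)

    def fit(self, X, y):
        X = np.asarray(X, dtype=float)
        y = np.asarray(y)
        self.classes_ = np.unique(y)
        # One-hot encode the class labels.
        T = (y[:, None] == self.classes_[None, :]).astype(float)
        # Hidden weights are drawn once and never updated.
        self.W = self.rng.standard_normal((X.shape[1], self.n_hidden))
        self.b = self.rng.standard_normal(self.n_hidden)
        H = self._hidden(X)
        # Output weights: ridge-regularised least-squares solution.
        A = H.T @ H + self.reg * np.eye(self.n_hidden)
        self.beta = np.linalg.solve(A, H.T @ T)
        return self

    def predict(self, X):
        H = self._hidden(np.asarray(X, dtype=float))
        return self.classes_[np.argmax(H @ self.beta, axis=1)]


if __name__ == "__main__":
    # Synthetic stand-in data; NOT the emotional speech features.
    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(-3, 0.5, (40, 2)), rng.normal(3, 0.5, (40, 2))])
    y = np.array([0] * 40 + [1] * 40)
    model = SimpleELM().fit(X, y)
    print("training accuracy:", (model.predict(X) == y).mean())
```

Only the output weights are fitted, so training reduces to a single linear solve; this is the property that makes ELMs attractive inside a wrapper-style feature search, where the classifier is retrained many times.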
Year: 2015 PMID: 25799141 PMCID: PMC4370637 DOI: 10.1371/journal.pone.0120344
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. Proposed improved emotion recognition from emotional speech signals and their glottal waveforms.
Number of speech samples per emotion in each database.
| Database | Anger | Disgust | Fear | Neutral | Happiness | Sadness | Boredom | Surprise |
|---|---|---|---|---|---|---|---|---|
| BES | 127 | 45 | 70 | 70 | 71 | 62 | 81 | NA |
| SAVEE | 60 | 60 | 60 | 120 | 60 | 60 | NA | 60 |
| SES | 240 | NA | NA | 240 | 240 | 240 | NA | 240 |
NA: not applicable.
Fig 2. PSO-based clustering for feature enhancement.
Fig 3. Feature selection process using PSO.
Fig 4. Class distribution plots of raw features.
Fig 5. Class distribution plots of weighted features.
Number of selected weighted features.
| Features | BES (speech signals) | BES (glottal signals) | SES (speech signals) | SES (glottal signals) | SAVEE (speech signals) | SAVEE (glottal signals) |
|---|---|---|---|---|---|---|
| MFCCs (24+24) | 1 | 1 | 2 | 3 | 3 | 4 |
| LPCCs (18+18) | 3 | 2 | 3 | 2 | 2 | 2 |
| GTFBOs (24+24) | 2 | 1 | 3 | 3 | 2 | 3 |
| PLPs (13+13) | 1 | 1 | 2 | 1 | 2 | 1 |
| TTFs (24+24) | 3 | 2 | 3 | 2 | 3 | 1 |
| SWTTTFs (144+144) | 8 | 10 | 9 | 10 | 14 | 11 |
| RWPFs (60+60) | 5 | 5 | 4 | 4 | 1 | 4 |
| Total selected features (average) | 23 | 22 | 26 | 25 | 27 | 26 |
| Total selected features (minimum) | 16 | 18 | 16 | 18 | 21 | 17 |
| Total selected features (maximum) | 28 | 29 | 33 | 31 | 36 | 32 |
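The feature counts above come from wrapper-based PSO selection. As a hedged sketch of how such a wrapper can work, the code below runs a binary PSO in which each particle is a feature mask and its fitness is the accuracy of a classifier trained on the masked features. Everything specific here is an assumption made for illustration: the nearest-centroid classifier stands in for whatever classifier the authors wrapped, and the function names, swarm size, inertia and acceleration constants, and the small feature-count penalty are hypothetical.

```python
import numpy as np


def centroid_accuracy(X, y, mask):
    """Training accuracy of a nearest-centroid classifier on masked features."""
    if not mask.any():
        return 0.0  # an empty feature subset cannot classify anything
    Xs = X[:, mask]
    classes = np.unique(y)
    centroids = np.stack([Xs[y == c].mean(axis=0) for c in classes])
    d2 = ((Xs[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return float((classes[d2.argmin(axis=1)] == y).mean())


def wrapper_pso(X, y, n_particles=20, n_iter=30, w=0.7, c1=1.5, c2=1.5,
                penalty=0.01, seed=0):
    """Binary PSO over feature masks; returns (best_mask, best_fitness)."""
    rng = np.random.default_rng(seed)
    n_feat = X.shape[1]

    def fitness(mask):
        # Reward accuracy, lightly penalise large subsets (assumed trade-off).
        return centroid_accuracy(X, y, mask) - penalty * mask.mean()

    pos = rng.random((n_particles, n_feat)) < 0.5       # boolean masks
    vel = rng.uniform(-1.0, 1.0, (n_particles, n_feat))
    pbest = pos.copy()
    pbest_fit = np.array([fitness(p) for p in pos])
    g = int(pbest_fit.argmax())
    gbest, gbest_fit = pbest[g].copy(), float(pbest_fit[g])

    for _ in range(n_iter):
        r1, r2 = rng.random(vel.shape), rng.random(vel.shape)
        p = pos.astype(float)
        vel = (w * vel
               + c1 * r1 * (pbest.astype(float) - p)
               + c2 * r2 * (gbest.astype(float) - p))
        vel = np.clip(vel, -4.0, 4.0)
        # A sigmoid of the velocity gives the probability of selecting a feature.
        pos = rng.random(vel.shape) < 1.0 / (1.0 + np.exp(-vel))
        fit = np.array([fitness(q) for q in pos])
        better = fit > pbest_fit
        pbest[better] = pos[better]
        pbest_fit[better] = fit[better]
        g = int(pbest_fit.argmax())
        if pbest_fit[g] > gbest_fit:
            gbest, gbest_fit = pbest[g].copy(), float(pbest_fit[g])
    return gbest, gbest_fit


if __name__ == "__main__":
    # Synthetic data: only feature 0 separates the two classes.
    rng = np.random.default_rng(1)
    X = rng.standard_normal((60, 6))
    y = np.repeat([0, 1], 30)
    X[y == 1, 0] += 10.0
    mask, score = wrapper_pso(X, y)
    print("selected features:", np.flatnonzero(mask), "fitness:", score)
```

Because the search is stochastic, repeated runs can select different subsets, which is consistent with the table reporting average, minimum, and maximum counts rather than a single number.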
Confusion matrices for emotion recognition using raw, weighted and selected weighted features (BES).
| Experiment | Emotion | Raw Ang | Raw Bor | Raw Dis | Raw Fea | Raw Hap | Raw Sad | Raw Neu | Weighted Ang | Weighted Bor | Weighted Dis | Weighted Fea | Weighted Hap | Weighted Sad | Weighted Neu | Selected Ang | Selected Bor | Selected Dis | Selected Fea | Selected Hap | Selected Sad | Selected Neu |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|  | Ang |  | 0.00 | 0.87 | 0.34 | 3.99 | 0.00 | 0.00 |  | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |  | 0.00 | 0.15 | 0.37 | 3.59 | 0.00 | 0.00 |
|  | Bor | 0.00 |  | 1.68 | 1.06 | 0.00 | 2.80 | 13.92 | 0.00 |  | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |  | 0.64 | 0.00 | 0.00 | 0.03 | 2.49 |
|  | Dis | 4.87 | 1.25 |  | 8.01 | 0.00 | 0.00 | 6.32 | 0.00 | 0.00 |  | 0.00 | 0.00 | 0.00 | 0.00 | 0.06 | 0.00 |  | 0.71 | 0.41 | 0.03 | 0.00 |
|  | Fea | 3.37 | 1.11 | 1.96 |  | 2.97 | 2.65 | 0.77 | 0.00 | 0.00 | 0.80 |  | 0.00 | 0.00 | 0.00 | 0.16 | 0.33 | 0.37 |  | 0.70 | 0.20 | 0.19 |
|  | Hap | 23.27 | 0.00 | 4.32 | 6.75 |  | 1.00 | 2.27 | 0.00 | 0.00 | 0.00 | 0.00 |  | 0.00 | 0.00 | 0.90 | 0.00 | 0.51 | 1.19 |  | 0.00 | 0.40 |
|  | Sad | 0.00 | 3.73 | 0.00 | 1.97 | 0.00 |  | 2.37 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |  | 0.00 | 0.00 | 0.03 | 0.00 | 0.34 | 0.00 |  | 0.16 |
|  | Neu | 0.67 | 7.61 | 0.00 | 0.00 | 0.00 | 5.23 |  | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |  | 0.00 | 1.36 | 0.34 | 0.13 | 0.38 | 0.57 |  |
|  | Ang |  | 0.00 | 2.51 | 1.74 | 7.07 | 0.00 | 0.00 |  | 0.00 | 0.00 | 0.00 | 0.35 | 0.00 | 0.00 |  | 0.00 | 0.49 | 1.15 | 5.78 | 0.00 | 0.00 |
|  | Bor | 0.00 |  | 3.07 | 1.98 | 0.00 | 8.79 | 15.21 | 0.00 |  | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |  | 1.95 | 0.00 | 0.10 | 0.29 | 3.73 |
|  | Dis | 10.69 | 21.25 |  | 8.22 | 2.00 | 0.00 | 8.94 | 0.00 | 0.44 |  | 0.00 | 0.00 | 0.00 | 0.00 | 0.03 | 0.16 |  | 0.38 | 1.09 | 0.60 | 0.04 |
|  | Fea | 12.62 | 5.91 | 5.05 |  | 7.05 | 3.40 | 1.05 | 0.27 | 0.00 | 0.00 |  | 0.00 | 0.00 | 0.00 | 0.16 | 0.55 | 0.77 |  | 1.08 | 0.79 | 0.88 |
|  | Hap | 37.66 | 1.11 | 3.61 | 12.33 |  | 0.00 | 4.22 | 1.34 | 0.00 | 0.00 | 0.00 |  | 0.00 | 0.00 | 1.35 | 0.00 | 0.66 | 0.74 |  | 0.00 | 0.36 |
|  | Sad | 0.00 | 8.50 | 3.08 | 7.18 | 0.00 |  | 1.25 | 0.00 | 0.00 | 0.00 | 0.92 | 0.00 |  | 0.00 | 0.00 | 0.12 | 0.00 | 0.40 | 0.00 |  | 0.35 |
|  | Neu | 1.25 | 16.02 | 6.53 | 3.08 | 0.95 | 3.47 |  | 0.00 | 0.56 | 0.00 | 0.00 | 0.00 | 0.00 |  | 0.00 | 1.62 | 0.89 | 0.17 | 0.36 | 0.65 |  |
|  | Ang |  | 0.00 | 0.83 | 0.83 | 4.17 | 0.00 | 0.00 |  | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |  | 0.00 | 1.00 | 0.11 | 4.24 | 0.00 | 0.00 |
|  | Bor | 0.00 |  | 0.00 | 2.86 | 0.00 | 8.57 | 27.14 | 0.00 |  | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |  | 1.20 | 0.00 | 0.00 | 0.00 | 1.25 |
|  | Dis | 10.00 | 5.00 |  | 25.00 | 0.00 | 0.00 | 5.00 | 0.00 | 0.00 |  | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |  | 0.34 | 0.00 | 0.00 | 0.00 |
|  | Fea | 2.86 | 0.00 | 0.00 |  | 1.43 | 0.00 | 2.86 | 0.00 | 0.00 | 0.00 |  | 0.00 | 0.00 | 0.00 | 0.17 | 0.34 | 3.60 |  | 0.64 | 0.16 | 0.20 |
|  | Hap | 20.00 | 0.00 | 0.00 | 8.00 |  | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |  | 0.00 | 0.00 | 0.37 | 0.06 | 3.60 | 0.52 |  | 0.00 | 0.35 |
|  | Sad | 0.00 | 10.00 | 0.00 | 0.00 | 0.00 |  | 2.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |  | 0.00 | 0.00 | 0.12 | 0.20 | 0.34 | 0.00 |  | 0.30 |
|  | Neu | 1.25 | 3.75 | 0.00 | 7.50 | 1.25 | 1.25 |  | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |  | 0.00 | 1.54 | 4.80 | 0.12 | 0.08 | 0.88 |  |
|  | Ang |  | 0.00 | 0.77 | 0.00 | 2.31 | 0.00 | 0.00 |  | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |  | 0.00 | 0.29 | 0.40 | 4.13 | 0.00 | 0.00 |
|  | Bor | 0.00 |  | 0.00 | 0.00 | 0.00 | 0.00 | 11.11 | 0.00 |  | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |  | 0.00 | 0.06 | 0.00 | 0.00 | 3.60 |
|  | Dis | 8.57 | 0.00 |  | 5.71 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |  | 0.00 | 0.00 | 0.00 | 0.00 | 0.19 | 0.00 |  | 1.71 | 0.31 | 0.06 | 0.00 |
|  | Fea | 2.86 | 0.00 | 2.86 |  | 7.14 | 0.00 | 0.00 | 0.00 | 0.00 | 2.00 |  | 0.00 | 0.00 | 0.00 | 0.12 | 0.09 | 0.34 |  | 0.53 | 0.06 | 0.15 |
|  | Hap | 33.33 | 0.00 | 0.00 | 1.11 |  | 2.22 | 3.33 | 0.00 | 0.00 | 0.00 | 0.00 |  | 0.00 | 0.00 | 0.98 | 0.00 | 0.29 | 0.97 |  | 0.00 | 0.05 |
|  | Sad | 0.00 | 0.00 | 1.43 | 0.00 | 0.00 |  | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |  | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |  | 0.00 |
|  | Neu | 0.00 | 13.75 | 0.00 | 0.00 | 0.00 | 0.00 |  | 0.00 | 0.50 | 0.00 | 0.00 | 0.00 | 0.00 |  | 0.00 | 1.24 | 0.00 | 0.06 | 0.27 | 0.11 |  |
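The entries in these confusion matrices appear to be percentages. Assuming each row of a matrix sums to 100 % (an assumption, not stated in this excerpt), the correct-recognition rate for an emotion is 100 minus the sum of that row's misclassification percentages. A minimal sketch, using the first listed row of raw-feature values for BES as input; the function name `recognition_rate` is hypothetical:

```python
def recognition_rate(off_diagonal_percentages):
    """Correct-classification rate (%) implied by one confusion-matrix row.

    Assumes the row is expressed in percent and sums to 100.
    """
    return round(100.0 - sum(off_diagonal_percentages), 2)


# Misclassification percentages from the first BES raw-feature row.
first_row = [0.00, 0.87, 0.34, 3.99, 0.00, 0.00]
print(recognition_rate(first_row))
```

The same calculation applies row by row to the weighted and selected-weighted blocks, again only under the row-sum assumption.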
Confusion matrices for emotion recognition using raw, weighted and selected weighted features (SES).
| Experiment | Emotion | Raw Neu | Raw Sur | Raw Ang | Raw Sad | Raw Hap | Weighted Neu | Weighted Sur | Weighted Ang | Weighted Sad | Weighted Hap | Selected Neu | Selected Sur | Selected Ang | Selected Sad | Selected Hap |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|  | Neu |  | 6.04 | 11.04 | 24.17 | 13.54 |  | 2.29 | 2.21 | 2.21 | 1.67 |  | 5.32 | 2.44 | 3.33 | 3.13 |
|  | Sur | 10.21 |  | 12.08 | 16.25 | 17.29 | 2.08 |  | 1.67 | 1.38 | 1.79 | 4.73 |  | 3.35 | 4.02 | 2.03 |
|  | Ang | 9.17 | 10.21 |  | 6.25 | 13.54 | 0.42 | 0.21 |  | 1.00 | 2.83 | 3.92 | 5.39 |  | 4.39 | 5.03 |
|  | Sad | 16.46 | 8.33 | 7.08 |  | 5.21 | 1.13 | 0.33 | 2.38 |  | 1.54 | 3.43 | 2.88 | 2.53 |  | 3.71 |
|  | Hap | 12.92 | 18.75 | 17.50 | 6.25 |  | 0.29 | 0.04 | 1.46 | 0.33 |  | 4.44 | 4.18 | 6.13 | 6.72 |  |
|  | Neu |  | 20.83 | 15.00 | 16.67 | 15.00 |  | 3.25 | 2.75 | 3.42 | 2.58 |  | 10.52 | 5.07 | 4.85 | 7.95 |
|  | Sur | 21.25 |  | 16.25 | 13.33 | 20.00 | 4.75 |  | 5.33 | 4.92 | 2.58 | 9.05 |  | 6.25 | 6.02 | 3.88 |
|  | Ang | 21.67 | 23.75 |  | 6.25 | 17.08 | 1.75 | 1.75 |  | 5.33 | 9.17 | 5.61 | 10.70 |  | 7.46 | 9.77 |
|  | Sad | 37.08 | 23.75 | 7.50 |  | 6.67 | 0.08 | 2.50 | 5.67 |  | 2.50 | 5.43 | 7.83 | 6.97 |  | 8.10 |
|  | Hap | 20.42 | 25.42 | 17.92 | 8.33 |  | 3.33 | 0.83 | 5.17 | 2.58 |  | 6.36 | 5.40 | 9.98 | 8.88 |  |
| GD (Male) | Neu |  | 0.00 | 9.58 | 0.00 | 22.92 |  | 0.33 | 1.83 | 1.83 | 0.50 |  | 0.60 | 4.07 | 1.73 | 0.88 |
| GD (Male) | Sur | 0.00 |  | 0.00 | 22.92 | 0.00 | 0.00 |  | 0.00 | 0.25 | 0.00 | 0.55 |  | 0.23 | 1.53 | 0.25 |
| GD (Male) | Ang | 7.92 | 0.00 |  | 0.00 | 17.08 | 2.08 | 0.00 |  | 0.25 | 4.08 | 4.80 | 0.08 |  | 0.77 | 3.60 |
| GD (Male) | Sad | 0.00 | 15.83 | 0.00 |  | 0.00 | 0.50 | 0.75 | 0.08 |  | 0.42 | 2.83 | 0.35 | 1.00 |  | 1.68 |
| GD (Male) | Hap | 22.08 | 0.00 | 21.25 | 0.00 |  | 0.00 | 0.00 | 1.75 | 0.25 |  | 1.95 | 0.29 | 6.62 | 2.03 |  |
| GD (Female) | Neu |  | 0.42 | 23.33 | 0.00 | 13.75 |  | 0.00 | 0.33 | 0.50 | 0.08 |  | 0.57 | 0.63 | 1.85 | 0.32 |
| GD (Female) | Sur | 0.00 |  | 0.42 | 20.00 | 0.00 | 1.75 |  | 0.83 | 2.58 | 4.50 | 0.17 |  | 0.58 | 3.83 | 1.72 |
| GD (Female) | Ang | 25.42 | 0.00 |  | 0.00 | 17.08 | 0.00 | 0.08 |  | 0.08 | 0.00 | 0.20 | 1.22 |  | 0.82 | 0.78 |
| GD (Female) | Sad | 1.25 | 22.92 | 0.00 |  | 0.00 | 1.67 | 0.25 | 0.83 |  | 1.00 | 0.40 | 4.37 | 0.39 |  | 2.58 |
| GD (Female) | Hap | 17.50 | 0.00 | 17.08 | 0.00 |  | 0.00 | 0.17 | 0.00 | 0.83 |  | 0.25 | 4.03 | 0.38 | 3.13 |  |
Confusion matrices for emotion recognition using raw, weighted and selected weighted features (SAVEE).
| Experiment | Emotion | Raw Ang | Raw Dis | Raw Fea | Raw Hap | Raw Neu | Raw Sad | Raw Sur | Weighted Ang | Weighted Dis | Weighted Fea | Weighted Hap | Weighted Neu | Weighted Sad | Weighted Sur | Selected Ang | Selected Dis | Selected Fea | Selected Hap | Selected Neu | Selected Sad | Selected Sur |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|  | Ang |  | 0.83 | 0.83 | 10.00 | 2.50 | 0.00 | 8.33 |  | 0.00 | 0.00 | 1.17 | 0.00 | 0.00 | 0.00 |  | 0.73 | 0.17 | 2.90 | 0.03 | 0.00 | 0.30 |
|  | Dis | 3.33 |  | 6.67 | 0.83 | 21.67 | 4.17 | 3.33 | 0.00 |  | 0.00 | 0.00 | 2.17 | 0.00 | 0.00 | 0.30 |  | 1.20 | 1.17 | 0.67 | 1.33 | 0.43 |
|  | Fea | 4.17 | 7.50 |  | 9.17 | 2.50 | 1.67 | 12.50 | 0.00 | 0.00 |  | 0.00 | 0.00 | 0.00 | 0.67 | 0.27 | 0.67 |  | 1.17 | 0.20 | 0.27 | 1.60 |
|  | Hap | 5.00 | 4.17 | 4.17 |  | 0.00 | 0.83 | 15.83 | 0.83 | 0.00 | 0.00 |  | 0.00 | 0.00 | 1.00 | 3.03 | 0.73 | 1.70 |  | 0.02 | 0.03 | 4.10 |
|  | Neu | 1.25 | 2.08 | 3.33 | 0.42 |  | 2.50 | 1.67 | 0.00 | 0.00 | 0.00 | 0.00 |  | 0.42 | 0.00 | 0.07 | 5.83 | 0.33 | 0.00 |  | 2.16 | 0.17 |
|  | Sad | 1.67 | 4.17 | 2.50 | 0.00 | 2.50 |  | 0.83 | 0.00 | 0.17 | 0.00 | 0.00 | 0.17 |  | 0.00 | 0.00 | 1.07 | 0.50 | 0.10 | 0.52 |  | 0.13 |
|  | Sur | 0.83 | 5.00 | 15.83 | 17.50 | 0.83 | 0.83 |  | 0.00 | 0.00 | 0.17 | 0.50 | 0.00 | 0.00 |  | 0.40 | 0.43 | 4.43 | 2.50 | 0.10 | 0.20 |  |
|  | Ang |  | 20.00 | 0.00 | 3.33 | 15.00 | 0.00 | 6.67 |  | 5.33 | 0.00 | 13.00 | 6.00 | 0.00 | 8.33 |  | 0.93 | 0.40 | 17.00 | 0.30 | 0.60 | 1.27 |
|  | Dis | 10.00 |  | 3.33 | 5.00 | 41.67 | 13.33 | 3.33 | 0.00 |  | 1.33 | 0.33 | 13.67 | 1.67 | 0.00 | 5.93 |  | 5.40 | 4.40 | 11.93 | 7.20 | 4.67 |
|  | Fea | 18.33 | 25.00 |  | 0.00 | 11.67 | 8.33 | 21.67 | 0.00 | 5.00 |  | 7.00 | 2.67 | 2.00 | 10.67 | 1.07 | 4.07 |  | 5.73 | 0.10 | 3.20 | 9.13 |
|  | Hap | 28.33 | 21.67 | 3.33 |  | 8.33 | 0.00 | 23.33 | 11.67 | 4.33 | 4.00 |  | 1.33 | 0.00 | 11.33 | 9.47 | 1.93 | 6.47 |  | 0.43 | 1.27 | 11.67 |
|  | Neu | 0.00 | 23.33 | 0.00 | 0.00 |  | 0.00 | 3.33 | 0.00 | 11.33 | 0.00 | 0.17 |  | 2.00 | 0.00 | 4.13 | 13.67 | 1.67 | 2.27 |  | 12.87 | 0.53 |
|  | Sad | 0.00 | 40.00 | 1.67 | 0.00 | 21.67 |  | 0.00 | 0.00 | 15.00 | 0.00 | 0.33 | 8.33 |  | 0.00 | 1.87 | 6.13 | 5.47 | 2.00 | 5.40 |  | 2.00 |
|  | Sur | 15.00 | 28.33 | 10.00 | 11.67 | 8.33 | 0.00 |  | 0.00 | 5.00 | 7.00 | 13.33 | 0.33 | 0.00 |  | 8.47 | 4.20 | 13.33 | 12.67 | 2.57 | 2.27 |  |