Simon Schaerlaeken, Didier Grandjean.
Abstract
The unfolding dynamics of the vocal expression of emotions are crucial for the decoding of the emotional state of an individual. In this study, we analyzed how much information is needed to decode a vocally expressed emotion using affect bursts, a gating paradigm, and linear mixed models. We showed that some emotions (fear, anger, disgust) were significantly better recognized at full duration than others (joy, sadness, neutral). As predicted, recognition improved when a greater proportion of the stimuli was presented. Emotion recognition curves for anger and disgust were best described by higher order polynomials (second to third), while fear, sadness, neutral, and joy were best described by linear relationships. Acoustic features were extracted for each stimulus and subjected to a principal component analysis for each emotion. The principal components were successfully used to partially predict the accuracy of recognition (e.g., for anger, a component encompassing acoustic features such as fundamental frequency (f0) and jitter; for joy, pitch and loudness range). Furthermore, the impact of the principal components on the recognition of anger, disgust, and sadness changed with longer portions being presented. These results support the importance of studying the unfolding conscious recognition of emotional vocalizations to reveal the differential contributions of specific acoustical feature sets. It is likely that these effects are due to the relevance of threatening information to the human mind and are related to urgent motor responses when people are exposed to potential threats as compared with emotions where no such urgent response is required (e.g., joy).
Year: 2018 PMID: 30376561 PMCID: PMC6207317 DOI: 10.1371/journal.pone.0206216
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Percentages of recognition and confidence for each emotion across all stimuli.
| | Anger | Disgust | Fear | Joy | Neutral | Sadness |
|---|---|---|---|---|---|---|
| Recognition% | 72.86 | 67.51 | 60.87 | 21.67 | 33.37 | 32.09 |
| Certainty% | 61.78 | 57.59 | 53.56 | 51.04 | 45.95 | 48.51 |
Fig 1. Unbiased recognition over time.
(A) Graphic of the unbiased ratings for every emotion at every gate. Shown is the unbiased estimated percentage of correct recognition of the emotion expressed for different lengths of stimulus presented (divided into 10 gates representing the percentage of the full stimulus presented, e.g., 10%, 20%…90%, and 100%). For example, the 100% gate for anger corresponds to a group of stimuli with durations ranging from 400 to 2250 that represent at least 90.1% of the original corresponding stimulus. The mean duration for this group is 1244 and is used to represent this specific gate. The recognition percentages were computed with the confusion matrix based on continuous responses from participants. Continuous thinner lines represent the best polynomial fit based on the orthogonal polynomial contrasts and RMSE reported in Table 3 (anger: cubic, disgust: quadratic, fear: linear, sadness: linear, joy: linear, neutral: linear). (B) Graphic of the linear mixed model outputs using duration and emotion presented (model with four gates). Shown is the unbiased estimated percentage of correct recognition of the emotion as expressed over time (divided into four gates representing the percentage of the full stimulus presented, e.g., 25%, 50%, 75%, and 100%). The values were computed with a linear mixed model that evaluated the percentage of correctness using the emotion expressed, the gate duration, and their interaction as predictors. The error bars represent the 95% confidence interval.
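The caption above derives unbiased recognition percentages from a confusion matrix. As a rough illustration, the sketch below assumes the unbiased measure is the unbiased hit rate (Hu) in its usual form: squared hits divided by the product of the row total (stimuli of that emotion presented) and the column total (times that emotion label was chosen). The function name and the matrix layout are hypothetical, and the study's exact handling of continuous responses is not described here.

```python
def unbiased_hit_rates(confusion):
    """Return one unbiased hit rate (Hu) per category.

    confusion[i][j] holds how often stimuli of category i received
    response j (layout is an assumption for this sketch). Entries may be
    counts or summed continuous ratings.
    """
    k = len(confusion)
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(k)) for j in range(k)]
    rates = []
    for i in range(k):
        denom = row_totals[i] * col_totals[i]
        # Hu = hits^2 / (stimuli presented * times the label was used);
        # defined as 0 when the category was never presented or chosen.
        rates.append(confusion[i][i] ** 2 / denom if denom else 0.0)
    return rates
```

Unlike a raw hit rate, this quantity penalizes a response label that is overused: a category chosen for almost every stimulus gets a large column total and therefore a lower score.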
Comparison between polynomial fits and orthogonal polynomial contrasts for each emotion curve.
| Polynomial contrast (Root mean squared error, adjusted R squared) [10 gates (10%-20%-..-100%) dataset] | Equation (duration in seconds) | |
|---|---|---|
| Anger | Chi-squared test | |
| Linear | ||
| Quadratic | ||
| Cubic | ||
| Disgust | Chi-squared test | |
| Linear | ||
| Quadratic | ||
| Cubic | ||
| Fear | Chi-squared test | |
| Linear | ||
| Quadratic | ||
| Cubic | ||
| Joy | Chi-squared test | |
| Linear | ||
| Quadratic | ||
| Cubic | ||
| Neutral | Chi-squared test | |
| Linear | ||
| Quadratic | ||
| Cubic | ||
| Sadness | Chi-squared test | |
| Linear | ||
| Quadratic | ||
| Cubic | ||
The polynomial fits are computed on the Hu scores derived from the confusion matrix. These fits are characterised by the root mean squared error (RMSE) and the adjusted R-squared (R2). The orthogonal polynomial contrasts are computed on the linear mixed models encompassing the interaction of emotion presented and duration as a fixed effect (4 gates). The orthogonal polynomial contrasts are compared using chi-squared tests and p-values. FDR corrected = corrected for multiple comparisons using a false discovery rate correction.
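To make the two fit statistics in this note concrete, here is a minimal sketch of a least-squares polynomial fit with RMSE and adjusted R-squared. It is not the authors' code: the function names are hypothetical, RMSE is assumed to divide by the number of points n, and adjusted R-squared uses the common form 1 - (1 - R2)(n - 1)/(n - p - 1), with p the polynomial degree.

```python
def polyfit(xs, ys, degree):
    """Least-squares polynomial coefficients, lowest order first."""
    m = degree + 1
    # Normal equations A c = b with A[i][j] = sum(x^(i+j)), b[i] = sum(y * x^i).
    A = [[sum(x ** (i + j) for x in xs) for j in range(m)] for i in range(m)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(m)]
    # Gaussian elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, m):
            f = A[r][col] / A[col][col]
            for c in range(col, m):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    # Back substitution.
    coeffs = [0.0] * m
    for i in reversed(range(m)):
        s = sum(A[i][j] * coeffs[j] for j in range(i + 1, m))
        coeffs[i] = (b[i] - s) / A[i][i]
    return coeffs


def fit_quality(xs, ys, coeffs):
    """Return (RMSE, adjusted R-squared) for a fitted polynomial."""
    n = len(xs)
    p = len(coeffs) - 1  # number of predictors = polynomial degree
    preds = [sum(c * x ** k for k, c in enumerate(coeffs)) for x in xs]
    ss_res = sum((y - yhat) ** 2 for y, yhat in zip(ys, preds))
    mean_y = sum(ys) / n
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    rmse = (ss_res / n) ** 0.5
    r2 = 1 - ss_res / ss_tot
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - p - 1)
    return rmse, adj_r2
```

Calling `fit_quality(xs, ys, polyfit(xs, ys, d))` for d = 1, 2, 3 on the ten gate values of one emotion would give the kind of per-degree comparison the table describes. Adjusted R-squared is the more useful of the two when comparing degrees, since it penalizes the extra coefficients of higher-order fits.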
Contrasts comparing Hu scores at last gate between emotions.
| | Disgust | Fear | Joy | Neutral | Sadness |
|---|---|---|---|---|---|
| Anger | |||||
| Disgust | |||||
| Fear | |||||
| Joy | |||||
| Neutral | |||||
All significance levels are FDR corrected (corrected for multiple comparisons using a false discovery rate correction).
Fig 2. Certainty over time.
Graphic of the generalized linear mixed model outputs using duration and emotion presented. Shown is the estimated self-confidence percentage for each emotion over time. The values were computed with a generalized linear mixed model that evaluated the percentage of confidence using the emotion expressed, the gate duration, and their interaction as predictors. The error bars represent the 95% confidence interval.
Comparison of the slopes of the different emotion recognition curves between the first and second gate.
| Chi-squared test comparing slopes between emotions (FDR corrected) | |||||
|---|---|---|---|---|---|
| | Disgust | Fear | Joy | Neutral | Sadness |
| 25%–50% | |||||
| Anger | |||||
| Disgust | |||||
| Fear | |||||
| Joy | |||||
| Neutral | |||||
All the comparisons at later gates were non-significant. Chi-squared tests comparing slopes between different emotions for the model using four gates. All p-values are corrected for multiple comparisons using a false discovery rate correction.
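The notes above state that p-values were corrected with a false discovery rate procedure but do not name it. As an illustration only, the sketch below assumes the widely used Benjamini-Hochberg step-up adjustment; the function name is hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Assumption: the FDR correction is the Benjamini-Hochberg procedure;
    the source does not specify which FDR method was used.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted
```

An adjusted value below the chosen threshold (for instance 0.05) marks a comparison as significant while keeping the expected proportion of false discoveries among the significant results at or below that threshold.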
Fig 3. Impact of acoustic features on recognition across and over time.
Graphic of the generalized linear mixed model outputs using the principal component (PC) scores computed independently for every emotion (A,B,C,D,E,F) and the interaction with the duration of the stimulus (G,H,I). Shown is the unbiased estimated percentage of correctness for each emotion. The values were computed separately for each emotion with a linear mixed model that evaluated the unbiased percentage of correctness using PC scores. The error bars represent the 95% confidence interval. (A) Anger PC3, (B) Disgust PC4, (C) Fear PC1, (D) Joy PC2, (E) Neutral PC3, (F) Sadness PC2, (G) third PC and duration for anger, (H) fourth PC and duration for disgust, (I) fourth PC and duration for sadness. The numbers associated with the data represent the significance level (FDR-corrected) as well as the statistical power shown in S6 Table. *: p < 0.05, **: p < 0.01, ***: p < 0.001.