Henrik Nordström, Petri Laukka, Nutankumar S Thingujam, Emery Schubert, Hillary Anger Elfenbein.
Abstract
This study explored the perception of emotion appraisal dimensions on the basis of speech prosody in a cross-cultural setting. Professional actors from Australia and India vocally portrayed different emotions (anger, fear, happiness, pride, relief, sadness, serenity and shame) by enacting emotion-eliciting situations. In a balanced design, participants from Australia and India then inferred aspects of the emotion-eliciting situation from the vocal expressions, described in terms of appraisal dimensions (novelty, intrinsic pleasantness, goal conduciveness, urgency, power and norm compatibility). Bayesian analyses showed that the perceived appraisal profiles for the vocally expressed emotions were generally consistent with predictions based on appraisal theories. Few group differences emerged, which suggests that the perceived appraisal profiles are largely universal. However, some differences between Australian and Indian participants were also evident, mainly for ratings of norm compatibility. The appraisal ratings were further correlated with a variety of acoustic measures in exploratory analyses, and inspection of the acoustic profiles suggested similarity across groups. In summary, results showed that listeners may infer several aspects of emotion-eliciting situations from the non-verbal aspects of a speaker's voice. These appraisal inferences also seem to be relatively independent of the cultural background of the listener and the speaker.
Keywords: acoustic parameters; appraisal; cross-cultural; emotion recognition; speech; vocal expression
Year: 2017; PMID: 29291085; PMCID: PMC5717659; DOI: 10.1098/rsos.170912
Source DB: PubMed; Journal: R Soc Open Sci; ISSN: 2054-5703; Impact factor: 2.963
Summary of selected acoustic parameters.
| feature type | description | factor loading |
|---|---|---|
| F0M | mean fundamental frequency (F0) on a semitone frequency scale | factor 4: 0.76 |
| F0SD | variability (standard deviation) of F0 | factor 3: 0.73 |
| F1FreqM | mean of first formant (F1) centre frequency | factor 6: 0.76 |
| IntM | mean voice intensity estimated from an auditory spectrum | factor 1: 0.90 |
| IntSD | variability (standard deviation) of voice intensity | factor 5: 0.87 |
| F1amplitude | relative energy of the spectral envelope in the first formant region | factor 2: −0.85 |
| Hammarberg | Hammarberg index, i.e. the ratio of the strongest energy peaks in the 0–2 versus 2–5 kHz regions | factor 7: 0.83 |
| spectral slope | mean spectral slope (i.e. linear regression slope of the logarithmic power spectrum) for the 500–1500 Hz region | factor 9: 0.73 |
| spectral flux | mean spectral flux, i.e. the difference between the spectra of two consecutive speech frames | factor 1: 0.89 |
| spectral flux s.d. | variability (standard deviation) of spectral flux | factor 5: 0.88 |
| VoicedSegPerSec | the number of continuous voiced regions per second (pseudo syllable rate) | factor 8: −0.83 |
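
Three of the descriptors in the table (F0M, F0SD and the Hammarberg index) can be illustrated with a minimal sketch. It is not the extraction pipeline used in the study. The function names, the 27.5 Hz semitone reference, the treatment of non-positive F0 values as unvoiced frames, the use of the sample standard deviation, and the assignment of the 2 kHz boundary to the upper band are all assumptions made for illustration.

```python
import math
import statistics


def hz_to_semitones(f0_hz, ref_hz=27.5):
    """Convert F0 in Hz to semitones above ref_hz (ref_hz is an assumed reference)."""
    if f0_hz <= 0 or ref_hz <= 0:
        raise ValueError("frequencies must be positive")
    return 12.0 * math.log2(f0_hz / ref_hz)


def f0_mean_and_sd(f0_track_hz, ref_hz=27.5):
    """Return (F0M, F0SD) in semitones, skipping frames with F0 <= 0 (assumed unvoiced)."""
    semitones = [hz_to_semitones(f, ref_hz) for f in f0_track_hz if f > 0]
    if len(semitones) < 2:
        raise ValueError("need at least two voiced frames")
    # Sample standard deviation is an assumption; the table only says "standard deviation".
    return statistics.mean(semitones), statistics.stdev(semitones)


def hammarberg_index(freqs_hz, power):
    """Ratio of the strongest spectral peak in 0-2 kHz to the strongest in 2-5 kHz."""
    low = [p for f, p in zip(freqs_hz, power) if 0 <= f < 2000]
    high = [p for f, p in zip(freqs_hz, power) if 2000 <= f <= 5000]
    if not low or not high:
        raise ValueError("spectrum must cover both frequency bands")
    return max(low) / max(high)
```

For example, `hz_to_semitones(55.0)` gives 12 semitones, because 55 Hz lies one octave above the assumed 27.5 Hz reference.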
Figure 1. Participants' mean ratings (and boxplots) of appraisal dimensions (z scores) as a function of culture and the intended emotion of the vocal expressions. Colour of the boxes indicates speaker and perceiver culture: solid blue boxes, Australian stimuli, Australian participants; striped blue boxes, Australian stimuli, Indian participants; solid orange boxes, Indian stimuli, Indian participants; and striped orange boxes, Indian stimuli, Australian participants. Error bars indicate 95% CIs and shaded areas indicate direction of theoretical predictions. Diamonds indicate the mean appraisal ratings and the colour of the diamonds indicates if the associated Bayes factor (BF) supports the prediction (green, BF > 3), supports a population mean close to zero (red, BF < 1/3) or if both hypotheses are equally likely (black, 3 > BF > 1/3). Numbers presented on the y-axis show BFs for comparisons of ratings between Australian and Indian participants, and the colours indicate if the BF supports a difference between participant cultures (green, BF > 3), supports no difference between cultures (red, BF < 1/3) or if both hypotheses are equally likely (black, 3 > BF > 1/3).
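
The colour coding described in the caption amounts to a three-way classification of each Bayes factor, which can be sketched as follows. The function name is hypothetical. The caption does not say how values exactly equal to 3 or 1/3 are coloured, so assigning them to the black category here is an assumption.

```python
def bayes_factor_colour(bf, threshold=3.0):
    """Map a Bayes factor to the colour scheme described in the figure caption.

    green: BF > 3 (supports the prediction, or a difference between cultures)
    red:   BF < 1/3 (supports a mean close to zero, or no difference)
    black: otherwise (neither hypothesis is favoured)
    """
    if bf <= 0:
        raise ValueError("a Bayes factor must be positive")
    if bf > threshold:
        return "green"
    if bf < 1.0 / threshold:
        return "red"
    # Boundary values (exactly 3 or 1/3) fall here by assumption.
    return "black"
```

With these thresholds, `bayes_factor_colour(5.0)` returns `"green"`, `bayes_factor_colour(0.2)` returns `"red"` and `bayes_factor_colour(1.0)` returns `"black"`.
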
Correlations (Pearson's r) between selected acoustic parameters and participants’ mean ratings of appraisal dimensions for each combination of listener and speaker culture. Note: N = 64. Bold type indicates r ≥ 0.30.
| acoustic cue | speaker culture | novelty (Aus listeners) | novelty (Ind listeners) | urgency (Aus listeners) | urgency (Ind listeners) | intrinsic pleasantness (Aus listeners) | intrinsic pleasantness (Ind listeners) | goal conduciveness (Aus listeners) | goal conduciveness (Ind listeners) | norm compatibility (Aus listeners) | norm compatibility (Ind listeners) | power (Aus listeners) | power (Ind listeners) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| F0M | Aus | −0.05 | −0.11 | −0.03 | 0.02 | −0.18 | −0.09 | −0.12 | 0.04 | | | | |
| | Ind | −0.26 | −0.15 | −0.18 | 0.08 | − | 0.00 | −0.06 | 0.09 | | | | |
| F0SD | Aus | −0.25 | −0.24 | −0.27 | − | −0.13 | −0.10 | −0.13 | −0.22 | −0.10 | −0.19 | −0.12 | −0.24 |
| | Ind | 0.25 | 0.15 | 0.20 | 0.11 | −0.27 | −0.21 | −0.23 | −0.11 | − | −0.11 | −0.15 | −0.05 |
| F1FreqM | Aus | 0.21 | 0.23 | 0.20 | 0.16 | −0.10 | −0.05 | 0.01 | 0.02 | −0.16 | −0.01 | −0.12 | 0.02 |
| | Ind | −0.23 | −0.15 | −0.03 | 0.18 | −0.25 | 0.14 | 0.22 | | | | | |
| IntM | Aus | 0.15 | 0.05 | 0.07 | 0.28 | | | | | | | | |
| | Ind | 0.08 | 0.09 | 0.06 | | | | | | | | | |
| IntSD | Aus | 0.00 | −0.02 | 0.00 | −0.01 | 0.13 | 0.10 | 0.19 | 0.21 | 0.11 | 0.08 | 0.14 | 0.12 |
| | Ind | 0.00 | −0.14 | −0.02 | −0.14 | −0.13 | −0.13 | −0.07 | −0.10 | −0.14 | −0.16 | 0.04 | −0.06 |
| F1amplitude | Aus | 0.07 | 0.15 | 0.08 | 0.20 | 0.21 | 0.29 | 0.26 | 0.23 | 0.26 | 0.22 | | |
| | Ind | 0.11 | 0.15 | 0.11 | 0.18 | | | | | | | | |
| Hammarberg | Aus | − | − | − | − | −0.11 | −0.09 | − | − | −0.15 | − | − | − |
| | Ind | −0.24 | −0.25 | −0.18 | −0.25 | −0.18 | −0.11 | − | −0.29 | −0.17 | − | − | − |
| spectral slope | Aus | 0.19 | 0.18 | 0.19 | 0.18 | −0.03 | 0.03 | −0.03 | 0.14 | −0.14 | 0.06 | 0.01 | 0.12 |
| | Ind | 0.26 | 0.26 | − | − | − | −0.20 | − | −0.27 | −0.19 | −0.25 | | |
| spectral flux | Aus | 0.01 | −0.12 | 0.18 | −0.05 | 0.20 | 0.16 | | | | | | |
| | Ind | 0.02 | 0.04 | −0.01 | | | | | | | | | |
| spectral flux s.d. | Aus | 0.18 | 0.25 | 0.20 | 0.22 | 0.08 | 0.07 | 0.20 | 0.09 | 0.12 | 0.21 | 0.29 | |
| | Ind | −0.09 | −0.21 | −0.09 | −0.20 | −0.13 | −0.11 | −0.13 | −0.17 | −0.19 | −0.25 | −0.07 | −0.13 |
| VoicedSegPerSec | Aus | 0.17 | 0.15 | 0.15 | 0.12 | − | − | − | −0.21 | − | −0.18 | − | −0.25 |
| | Ind | 0.21 | 0.20 | 0.29 | 0.28 | −0.22 | − | −0.15 | −0.13 | −0.15 | −0.21 | −0.05 | −0.05 |
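
Each cell in the table is a Pearson correlation between one acoustic parameter and the mean rating of one appraisal dimension, computed over N = 64 paired values. The sketch below shows that computation together with the highlighting rule from the table note. The function names are hypothetical, and applying the 0.30 cutoff to the absolute value of r is an assumption, since the note only states "r ≥ 0.30".

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


def is_highlighted(r, cutoff=0.30):
    """True if r would be set in bold; using |r| is an assumption."""
    return abs(r) >= cutoff
```

In use, `x` would hold one acoustic parameter (for example IntM) per vocal expression and `y` the corresponding mean appraisal rating from one listener group.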