| Literature DB >> 35767528 |
Cyrielle Chappuis1, Didier Grandjean1.
Abstract
Prior research has established that valence-trustworthiness and power-dominance are the two main dimensions of voice evaluation at zero acquaintance. These impressions shape many of our interactions and high-impact decisions, so understanding this dynamic is crucial for many domains. Yet the relationship between the acoustical properties of novel voices and personality/attitudinal trait attributions remains poorly understood. The fundamental problem of understanding vocal impressions and related decision-making lies in the complex nature of the acoustical properties of voices. To disentangle this relationship, this study extends the line of research on the acoustical bases of vocal impressions in two ways. First, by attempting to replicate previous findings on the bi-dimensional nature of first impressions: using personality judgements and establishing a correspondence between acoustics and voice-first-impression (VFI) dimensions relative to sex (Study 1). Second (Study 2), by exploring non-linear relationships between acoustical parameters and VFI by means of machine learning models. In accordance with the literature, a bi-dimensional projection comprising valence-trustworthiness and power-dominance evaluations was found to explain 80% of the variance in VFI. In Study 1, brighter (high center of gravity), smoother (low shimmer), and louder (high minimum intensity) voices reflected trustworthiness, while vocal roughness (harmonics-to-noise ratio), energy in the high frequencies (Energy3250), pitch (Quantile 1, Quantile 5), and a lower range of pitch values reflected dominance. In Study 2, above-chance classification of vocal profiles was achieved by both Support Vector Machine (77.78% accuracy) and Random Forest (out-of-bag error = 36.14%) classifiers, generally confirming that machine learning algorithms can predict first impressions from voices.
Hence, results support a bi-dimensional structure of VFI, emphasize the usefulness of machine learning techniques in understanding vocal impressions, and shed light on the influence of sex on VFI formation.
Year: 2022 PMID: 35767528 PMCID: PMC9242519 DOI: 10.1371/journal.pone.0267432
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.752
Weighted linear combinations of attitudinal traits show the individual contribution of each attitude to the first and second principal components (PCs) for all speakers, male speakers, and female speakers, after PCA with Varimax rotation.
| Attitude | All speakers PC1 | All speakers PC2 | Male speakers PC1 | Male speakers PC2 | Female speakers PC1 | Female speakers PC2 |
|---|---|---|---|---|---|---|
| agreeableness | 0.89 | | 0.90 | -0.11 | 0.89 | |
| aggressiveness | -0.31 | 0.87 | -0.26 | 0.90 | -0.25 | 0.88 |
| attractiveness | 0.89 | | 0.91 | | 0.89 | |
| warmth | 0.78 | | 0.79 | | 0.74 | 0.22 |
| competence | 0.68 | 0.59 | 0.81 | 0.40 | 0.55 | 0.70 |
| trustworthiness | 0.85 | 0.33 | 0.90 | 0.14 | 0.78 | 0.44 |
| dominance | 0.33 | 0.88 | 0.54 | 0.76 | 0.24 | 0.91 |
| confidence | 0.60 | 0.70 | 0.77 | 0.48 | 0.42 | 0.84 |
| Proportion of variance | 0.49 | 0.31 | 0.59 | 0.23 | 0.42 | 0.38 |
| Cumulative variance | 0.49 | 0.80 | 0.59 | 0.81 | 0.42 | 0.80 |
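As a quick sanity check on the variance rows above, the cumulative values are running sums of the per-component proportions; a minimal sketch in plain Python, using the values printed in the table (the male-speaker sum comes to 0.82 rather than the table's 0.81 because the displayed proportions are themselves rounded):

```python
# Sanity check on the table's variance rows: "Cumulative variance" is the
# running sum of "Proportion of variance" for each speaker group.
proportions = {
    "all speakers":    (0.49, 0.31),
    "male speakers":   (0.59, 0.23),
    "female speakers": (0.42, 0.38),
}

cumulative = {
    group: (pc1, round(pc1 + pc2, 2))
    for group, (pc1, pc2) in proportions.items()
}
# cumulative["all speakers"] == (0.49, 0.80): the two components together
# account for the ~80% of VFI variance reported in the abstract.
```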
Fig 1. Confusion matrices of p(true attitude | model prediction) for each SVM.
(A) Results of vocal classification for the four attitudinal labels, with 77.8% accuracy. From the bottom-left to the top-right corner lie all the congruent classifications (HTHD: trustworthy/dominant; HTLD: trustworthy/non-dominant; LTHD: untrustworthy/dominant; LTLD: untrustworthy/non-dominant). True attitude is the given class label, and predicted attitude is the class label assigned after supervised learning. (B) Results of vocal classification for valence (positive, negative, or neutral). (C) Results of vocal classification for dominance (high, low, or neutral); from the bottom-left to the top-right corner lie the correctly classified items (low dominance = 50%; neutral = 54.3%; high dominance = 73.3%).
Fig 2. Receiver Operating Characteristic (ROC) curves for bi-dimensional profile (A), valence (B), and dominance (C) classification for polynomial SVMs.
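The column normalization named in the Fig 1 caption, p(true attitude | model prediction), divides each confusion count by its predicted-class column total, so every column sums to 1. A minimal sketch in plain Python; the 3x3 counts below are made up for illustration and are not the study's data:

```python
def p_true_given_pred(counts):
    """counts[i][j] = number of items of true class i predicted as class j.
    Returns the column-normalized matrix p(true class | predicted class)."""
    n = len(counts)
    col_sums = [sum(counts[i][j] for i in range(n)) for j in range(n)]
    return [
        [counts[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
        for i in range(n)
    ]

# Hypothetical counts for a 3-class valence task (positive/neutral/negative).
counts = [
    [8, 1, 1],   # true: positive
    [2, 6, 2],   # true: neutral
    [0, 3, 7],   # true: negative
]
probs = p_true_given_pred(counts)
# Each column of `probs` now sums to 1; probs[0][0] is
# p(true = positive | predicted = positive).
```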
RF confusion matrices for the four vocal profiles.

(A)

| Actual class | Predicted HTHD | Predicted HTLD | Predicted LTHD | Predicted LTLD | Classification error |
|---|---|---|---|---|---|
| HTHD | 13 | 0 | 2 | 4 | 0.32 |
| HTLD | 0 | 0 | 0 | 7 | 1.00 |
| LTHD | 3 | 0 | 4 | 3 | 0.60 |
| LTLD | 3 | 1 | 0 | 23 | 0.15 |

(B)

| Actual class | Predicted HTHD | Predicted HTLD | Predicted LTHD | Predicted LTLD | Classification error |
|---|---|---|---|---|---|
| HTHD | 11 | 0 | 2 | 6 | 0.42 |
| HTLD | 0 | 1 | 0 | 6 | 0.86 |
| LTHD | 3 | 0 | 4 | 3 | 0.60 |
| LTLD | 2 | 0 | 0 | 25 | 0.07 |
Note. (A) OOB estimate of error rate is 36.51% (mtry = 9, ntree = 10000). (B) OOB estimate of error rate is 34.92% (ntree = 6000); the number of variables randomly sampled at each split is adapted to the total number of variables (mtry = 3).
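The Note's error figures are related as follows: each row's classification error is the off-diagonal share of that actual class, and the overall OOB estimate is the off-diagonal share of the whole matrix. A minimal sketch in plain Python; the 4x4 counts below are hypothetical, not the study's data:

```python
classes = ["HTHD", "HTLD", "LTHD", "LTLD"]
# Hypothetical counts: rows are actual classes, columns predicted classes.
counts = [
    [12, 1, 2, 1],   # actual HTHD
    [2, 10, 1, 3],   # actual HTLD
    [1, 2, 11, 2],   # actual LTHD
    [0, 1, 3, 12],   # actual LTLD
]

total = sum(sum(row) for row in counts)
correct = sum(counts[i][i] for i in range(len(classes)))

# Overall error: share of all items falling off the diagonal.
oob_error = 1 - correct / total

# Per-class error: off-diagonal share of each actual-class row.
per_class_error = {
    classes[i]: round(1 - counts[i][i] / sum(counts[i]), 2)
    for i in range(len(classes))
}
```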
Fig 3. Visualization of all decision trees in the RF, for energy levels in the 250–750 Hz band, relative to minimum intensity and HNR.
Each dot indicates a decision tree's classification. HTLD and LTHD do not appear, indicating that the combination of the three features was not significantly associated with those classes.