Vincent Aubanel, Noël Nguyen.
Abstract
Recent research on speech communication has revealed a tendency for speakers to imitate at least some of the characteristics of their interlocutor's speech sound shape. This phenomenon, referred to as phonetic convergence, entails a moment-to-moment adaptation of the speaker's speech targets to the perceived interlocutor's speech. It is thought to contribute to setting up a conversational common ground between speakers and to facilitate mutual understanding. However, it remains uncertain to what extent phonetic convergence occurs in voice fundamental frequency (F0), in spite of the major role played by pitch, F0's perceptual correlate, as a conveyor of both linguistic information and communicative cues associated with the speaker's social/individual identity and emotional state. In the present work, we investigated to what extent two speakers converge towards each other with respect to variations in F0 in a scripted dialogue. Pairs of speakers jointly performed a speech production task, in which they were asked to alternately read aloud a written story divided into a sequence of short reading turns. We devised an experimental set-up that allowed us to manipulate the speakers' F0 in real time across turns. We found that speakers tended to imitate each other's changes in F0 across turns that were both limited in amplitude and spread over large temporal intervals. This shows that, at the perceptual level, speakers monitor slow-varying movements in their partner's F0 with high accuracy and, at the production level, that speakers exert a very fine-tuned control on their laryngeal vibrator in order to imitate these F0 variations. Remarkably, F0 convergence across turns was found to occur in spite of the large melodic variations typically associated with reading turns. Our study sheds new light on speakers' perceptual tracking of F0 in speech processing, and the impact of this perceptual tracking on speech production.
Year: 2020 PMID: 32365075 PMCID: PMC7197779 DOI: 10.1371/journal.pone.0232209
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. Experimental set-up.
Participants are seated in individual booths and communicate with each other through microphones and headphones, while an experimenter operates the F0 transformation software in the control room. Dashed lines indicate F0-transformed voice.
Fig 2. F0 transformation values for the two repetitions of the task.
Top: 0-phase condition. Bottom: π-phase condition.
Fig 3. Between-speaker convergence in F0 shifts across turns.
δ measure: Difference between 0-phase and π-phase condition in median F0 for each participant and each turn. Sinusoidal fit is shown in blue, with 95% confidence interval in light blue.
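The δ measure and the sinusoidal fit described in this caption can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' analysis code: the function names, the input layout (one list of F0 samples per turn and per condition), and the known `period` are hypothetical. The fit uses a Fourier projection at a single known frequency, which coincides with a least-squares sinusoidal fit only when turns are evenly spaced and span a whole number of periods.

```python
import math
from statistics import median


def delta_per_turn(f0_zero_phase, f0_pi_phase):
    """Per-turn delta: median F0 in the 0-phase condition minus
    median F0 in the pi-phase condition (inputs: one list per turn)."""
    return [median(z) - median(p) for z, p in zip(f0_zero_phase, f0_pi_phase)]


def fit_sinusoid(delta, period):
    """Fit delta[t] ~ amplitude * sin(2*pi*t/period + phase) + offset.

    Assumes turns t = 0..n-1 are evenly spaced and cover a whole number
    of periods, so sine, cosine and constant terms are orthogonal.
    """
    n = len(delta)
    omega = 2.0 * math.pi / period
    offset = sum(delta) / n
    # Projection of the centred series onto sin and cos at the known frequency.
    a = 2.0 / n * sum((d - offset) * math.sin(omega * t) for t, d in enumerate(delta))
    b = 2.0 / n * sum((d - offset) * math.cos(omega * t) for t, d in enumerate(delta))
    amplitude = math.hypot(a, b)
    phase = math.atan2(b, a)
    return amplitude, phase, offset


# Synthetic, purely illustrative series: 24 turns, one full period.
example_delta = [0.5 * math.sin(2.0 * math.pi * t / 24 + 0.3) + 0.1 for t in range(24)]
example_fit = fit_sinusoid(example_delta, period=24)
```

With real data, an ordinary least-squares fit over the sine, cosine and constant terms would be the safer choice, since it does not rely on the even-spacing assumption.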
Fig 4. Between-speaker convergence in mean F0.
In each panel, the regression line is shown in blue. (a) Global A/B F0 correlation: mean F0 of participant A as a function of mean F0 of participant B over the total duration of the task. The dashed line represents a hypothetical correlation of 1. (b) A/B correlation in F0 (as in (a)) for successive pairs of turns. (c) Mean δ in turns 19 to 22 as a function of the overall F0 difference of the pair members.
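The kind of A/B relationship shown in panel (a) can be illustrated with a plain Pearson correlation and a least-squares regression line. The sketch below is an assumption about how such a plot could be produced, not a description of the authors' procedure; the function name and the example values are hypothetical.

```python
import math


def correlation_and_line(x, y):
    """Return (Pearson r, slope, intercept) for the regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    r = sxy / math.sqrt(sxx * syy)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return r, slope, intercept


# Hypothetical mean F0 values (arbitrary units), one pair of speakers per entry.
mean_f0_b = [100.0, 150.0, 200.0, 250.0]  # x axis: participant B
mean_f0_a = [105.0, 148.0, 210.0, 240.0]  # y axis: participant A
r, slope, intercept = correlation_and_line(mean_f0_b, mean_f0_a)
```

Following the caption, participant B is placed on the x axis and participant A on the y axis; a slope of 1 with zero intercept would correspond to the dashed reference line.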
Output of linear mixed-effect modeling of the turn median F0.
refers to the median of the F0 value in turn t. Columns show, from left to right: the model ID, the predictors, random effects standard deviations for terms: (a) and (b): intercept and resp. by turn, (c), (d), (e): intercept, and resp. by participant, (f): residuals, and Akaike’s Information Criterion. To the right of the vertical separator are the results of an ANOVA between the first two models and m3.
| Model | Predictors | SD (a) | SD (b) | SD (c) | SD (d) | SD (e) | SD (f) | AIC | ANOVA with m3: statistic | ANOVA with m3: p-value |
|---|---|---|---|---|---|---|---|---|---|---|
| | | 0.37 | 0.01 | 0.55 | 0.30 | 0.09 | 0.63 | 8891 | 87.52 | < 2.2e-16 |
| | | 0.38 | 0.01 | 0.46 | 0.12 | 0.12 | 0.63 | 8827 | 23.71 | 1.122e-06 |
| | | 0.38 | 0.01 | 0.47 | 0.12 | 0.08 | 0.63 | 8805 | – | – |
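As a reading aid for the AIC and ANOVA columns, the short sketch below recomputes two quantities from the tabulated values. It rests on an assumption the table does not state: that the ANOVA statistic is a likelihood-ratio chi-square, and that the second row differs from the last one by a single parameter (one degree of freedom). Rows are referred to by position, and the helper name is hypothetical.

```python
import math


def chi2_upper_tail_df1(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


# AIC values copied from the table, in row order.
aic = {"row 1": 8891, "row 2": 8827, "row 3": 8805}

# AIC gap of each row relative to the last row (lower AIC is preferred).
aic_gap = {name: value - aic["row 3"] for name, value in aic.items()}

# p-value implied by the second row's statistic under the 1-df assumption.
p_row2 = chi2_upper_tail_df1(23.71)

print(aic_gap)
print(f"{p_row2:.3e}")
```

Under that assumption, the recomputed p-value for the second row falls within about one percent of the tabulated 1.122e-06, which is consistent with a one-parameter difference between the two models. The last row has the lowest AIC of the three.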