Wim Pouw, Alexandra Paxton, Steven J. Harrison, James A. Dixon.
Abstract
We show that the human voice has complex acoustic qualities that are directly coupled to peripheral musculoskeletal tensioning of the body, such as subtle wrist movements. In this study, human vocalizers produced a steady-state vocalization while rhythmically moving the wrist or the arm at different tempos. Although listeners could only hear and not see the vocalizer, they were able to completely synchronize their own rhythmic wrist or arm movement with the movement of the vocalizer, which they perceived in the voice acoustics. This study corroborates recent evidence suggesting that the human voice is constrained by bodily tensioning affecting the respiratory-vocal system. The current results show that the human voice contains a bodily imprint that is directly informative for the interpersonal perception of another's dynamic physical states.
Keywords: hand gesture; interpersonal synchrony; motion tracking; vocalization acoustics
Year: 2020 PMID: 32393618 PMCID: PMC7260986 DOI: 10.1073/pnas.2004163117
Source DB: PubMed Journal: Proc Natl Acad Sci U S A ISSN: 0027-8424 Impact factor: 11.205
Fig. 1. Vocalizer movements (A) and the resultant acoustic patterning caused by movement (B). (A) Six vocalizers moved their wrist or arm rhythmically at different tempos (slow = 1.06 Hz; medium = 1.33 Hz; fast = 1.6 Hz), guided by a green bar digitally connected to a motion-tracking system that represented their movement frequency relative to the target tempo. Human postures modified from ref. 23. (B) The resultant movement and acoustic data were collected. A preliminary analysis confirmed that acoustics were affected by movement, with sharp peaks in the fundamental frequency (perceived as pitch; C) and in the smoothed amplitude envelope of the vocalization (in purple, B) when movements reached peaks in deceleration during the stopping motion at maximum extension. Peaks in deceleration recruit counteracting muscular adjustments throughout the body to maintain postural integrity, and these adjustments cascade into the vocalization acoustics. (D) Here we assessed how the fundamental frequency of voicing (within the human range: 75 to 450 Hz) was modulated around the maximum extension for each vocalizer and for all vocalizers combined (red line). D shows that smoothed, average-normalized F0 (also linearly detrended and z-scaled per vocalization trial) peaked around the moment of maximum extension, when a sudden deceleration and acceleration occurred; normalized F0 dipped at steady-state, low-physical-impetus moments of the movement phase (when velocity was constant), rising again toward maximum flexion (∼300 to 375 ms before and after the maximum extension), replicating previous work (21, 24). Vocalizer wrist movements showed a less pronounced F0 modulation than vocalizer arm movements. For individual vocalizer differences in each tempo condition, see our interactive graph.
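The F0 normalization described in the caption (linear detrending followed by z-scaling, applied per vocalization trial) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' analysis pipeline; the function name `normalize_f0` is hypothetical.

```python
import numpy as np
from scipy.signal import detrend

def normalize_f0(f0_trace):
    """Linearly detrend and z-scale one trial's F0 trace.

    f0_trace: 1-D array of fundamental-frequency samples (Hz) for a
    single vocalization trial. Returns a unit-variance, zero-mean trace
    so that F0 modulations are comparable across vocalizers and trials.
    """
    f0 = np.asarray(f0_trace, dtype=float)
    residual = detrend(f0, type="linear")        # remove linear drift over the trial
    return (residual - residual.mean()) / residual.std()  # z-scale
```

Normalizing per trial removes between-speaker differences in mean pitch and slow pitch drift, leaving only the movement-locked modulations of interest.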
Fig. 2. Synchrony results. The example shows different ways movements can synchronize between the listener and the vocalizer. Fully asynchronous movement would entail a mismatch of movement tempos and a random variation of relative phases. Synchronization of phases may occur without exact matching of movement tempos. Full synchronization entails tempo matching and 0° relative phasing between vocalizer and listener movement. The main results show clear tempo synchronization, as the observed frequencies for each vocalization trial were well matched by the observed movement frequencies of listeners moving to that trial. Similarly, phase synchronization was clearly apparent, as the relative-phase distributions are all pronouncedly peaked rather than flat, with a negative mean asynchrony regardless of vocalizer movement or movement tempo. Individual differences in vocalizer F0 modulations for each vocalizer trial were modeled using a nonlinear regression method, generalized additive modeling (GAM), providing a model fit (adjusted R2) for each trial that indicates the degree of structure in normalized F0 modulations around moments of maximum extension (also see Fig. 1). The variance explained for each vocalizer trial was then regressed against the listeners' average synchronization performance for that trial (average circular SD of relative phase, SD Φ). More structured F0 modulations around the maximum extensions of upper limb movement (higher adjusted R2) predicted better synchronization performance (lower SD Φ), r = −0.48, P < 0.003. This means that more reliable acoustic patterning in vocalizers' voicing predicts higher listener synchronization performance. Human postures modified from ref. 23.
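The synchronization measure in the caption, the circular SD of relative phase (SD Φ), can be illustrated with a short sketch. This is a generic implementation of the standard circular-statistics formula, SD = sqrt(−2 ln R) with R the mean resultant length, not the authors' exact code; the function name `circular_sd` is hypothetical.

```python
import numpy as np

def circular_sd(relative_phases_rad):
    """Circular standard deviation (radians) of relative phases.

    Each phase is mapped to a unit vector on the circle; R is the
    length of the mean vector (1 = all phases identical, 0 = phases
    uniformly spread). SD = sqrt(-2 ln R): low values indicate tight
    phase-locking, i.e., good listener-vocalizer synchronization.
    """
    z = np.exp(1j * np.asarray(relative_phases_rad, dtype=float))
    R = np.abs(z.mean())  # mean resultant length in [0, 1]
    return np.sqrt(-2.0 * np.log(R))
```

Tightly clustered relative phases yield an SD Φ near zero, while phases scattered around the circle yield a large SD Φ, which is why lower SD Φ in the figure corresponds to better synchronization.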