
The interrelationship between the face and vocal tract configuration during audiovisual speech.

Chris Scholes, Jeremy I Skipper, Alan Johnston.

Abstract

It is well established that speech perception is improved when we are able to see the speaker talking along with hearing their voice, especially when the speech is noisy. While we have a good understanding of where speech integration occurs in the brain, it is unclear how visual and auditory cues are combined to improve speech perception. One suggestion is that integration can occur as both visual and auditory cues arise from a common generator: the vocal tract. Here, we investigate whether facial and vocal tract movements are linked during speech production by comparing videos of the face and fast magnetic resonance (MR) image sequences of the vocal tract. The joint variation in the face and vocal tract was extracted using an application of principal components analysis (PCA), and we demonstrate that MR image sequences can be reconstructed with high fidelity using only the facial video and PCA. Reconstruction fidelity was significantly higher when images from the two sequences corresponded in time, and including implicit temporal information by combining contiguous frames also led to a significant increase in fidelity. A "Bubbles" technique was used to identify which areas of the face were important for recovering information about the vocal tract, and vice versa, on a frame-by-frame basis. Our data reveal that there is sufficient information in the face to recover vocal tract shape during speech. In addition, the facial and vocal tract regions that are important for reconstruction are those that are used to generate the acoustic speech signal.
Copyright © 2020 the Author(s). Published by PNAS.

Keywords:  PCA; audiovisual; speech

Year:  2020        PMID: 33293422      PMCID: PMC7768679          DOI: 10.1073/pnas.2006192117

Source DB:  PubMed          Journal:  Proc Natl Acad Sci U S A        ISSN: 0027-8424            Impact factor:   12.779


References (30 in total; first 10 shown)

Review 1.  Speech perception.

Authors:  Randy L Diehl; Andrew J Lotto; Lori L Holt
Journal:  Annu Rev Psychol       Date:  2004       Impact factor: 24.137

2.  Do you see what I am saying? Exploring visual enhancement of speech comprehension in noisy environments.

Authors:  Lars A Ross; Dave Saint-Amour; Victoria M Leavitt; Daniel C Javitt; John J Foxe
Journal:  Cereb Cortex       Date:  2006-06-19       Impact factor: 5.357

3.  Development of a test of speech intelligibility in noise using sentence materials with controlled word predictability.

Authors:  D N Kalikow; K N Stevens; L L Elliott
Journal:  J Acoust Soc Am       Date:  1977-05       Impact factor: 1.840

4.  Identity From Variation: Representations of Faces Derived From Multiple Instances.

Authors:  A Mike Burton; Robin S S Kramer; Kay L Ritchie; Rob Jenkins
Journal:  Cogn Sci       Date:  2015-03-30

Review 5.  Perception of the speech code.

Authors:  A M Liberman; F S Cooper; D P Shankweiler; M Studdert-Kennedy
Journal:  Psychol Rev       Date:  1967-11       Impact factor: 8.934

Review 6.  Prediction and constraint in audiovisual speech perception.

Authors:  Jonathan E Peelle; Mitchell S Sommers
Journal:  Cortex       Date:  2015-03-20       Impact factor: 4.027

7.  The natural statistics of audiovisual speech.

Authors:  Chandramouli Chandrasekaran; Andrea Trubanova; Sébastien Stillittano; Alice Caplier; Asif A Ghazanfar
Journal:  PLoS Comput Biol       Date:  2009-07-17       Impact factor: 4.475

8.  Language familiarity modulates relative attention to the eyes and mouth of a talker.

Authors:  Elan Barenholtz; Lauren Mavica; David J Lewkowicz
Journal:  Cognition       Date:  2015-11-30

Review 9.  Neural pathways for visual speech perception.

Authors:  Lynne E Bernstein; Einat Liebenthal
Journal:  Front Neurosci       Date:  2014-12-01       Impact factor: 4.677

10.  Eye Movements During Visual Speech Perception in Deaf and Hearing Children.

Authors:  Elizabeth Worster; Hannah Pimperton; Amelia Ralph-Lewis; Laura Monroy; Charles Hulme; Mairéad MacSweeney
Journal:  Lang Learn       Date:  2017-09-26
Cited by (4 in total)

1.  Modulation transfer functions for audiovisual speech.

Authors:  Nicolai F Pedersen; Torsten Dau; Lars Kai Hansen; Jens Hjortkjær
Journal:  PLoS Comput Biol       Date:  2022-07-19       Impact factor: 4.779

2.  A PCA-Based Active Appearance Model for Characterising Modes of Spatiotemporal Variation in Dynamic Facial Behaviours.

Authors:  David M Watson; Alan Johnston
Journal:  Front Psychol       Date:  2022-05-26

3.  Neural indicators of articulator-specific sensorimotor influences on infant speech perception.

Authors:  Dawoon Choi; Ghislaine Dehaene-Lambertz; Marcela Peña; Janet F Werker
Journal:  Proc Natl Acad Sci U S A       Date:  2021-05-18       Impact factor: 11.205

Review 4.  Faces and Voices Processing in Human and Primate Brains: Rhythmic and Multimodal Mechanisms Underlying the Evolution and Development of Speech.

Authors:  Maëva Michon; José Zamorano-Abramson; Francisco Aboitiz
Journal:  Front Psychol       Date:  2022-03-30
