
Vision perceptually restores auditory spectral dynamics in speech.

John Plass; David Brang; Satoru Suzuki; Marcia Grabowecky

Abstract

Visual speech facilitates auditory speech perception, but the visual cues responsible for these benefits and the information they provide remain unclear. Low-level models emphasize basic temporal cues provided by mouth movements, but these impoverished signals may not fully account for the richness of auditory information provided by visual speech. High-level models posit interactions among abstract categorical (i.e., phonemes/visemes) or amodal (e.g., articulatory) speech representations, but require lossy remapping of speech signals onto abstracted representations. Because visible articulators shape the spectral content of speech, we hypothesized that the perceptual system might exploit natural correlations between midlevel visual (oral deformations) and auditory speech features (frequency modulations) to extract detailed spectrotemporal information from visual speech without employing high-level abstractions. Consistent with this hypothesis, we found that the time-frequency dynamics of oral resonances (formants) could be predicted with unexpectedly high precision from the changing shape of the mouth during speech. When isolated from other speech cues, speech-based shape deformations improved perceptual sensitivity for corresponding frequency modulations, suggesting that listeners could exploit this cross-modal correspondence to facilitate perception. To test whether this type of correspondence could improve speech comprehension, we selectively degraded the spectral or temporal dimensions of auditory sentence spectrograms to assess how well visual speech facilitated comprehension under each degradation condition. Visual speech produced drastically larger enhancements during spectral degradation, suggesting a condition-specific facilitation effect driven by cross-modal recovery of auditory speech spectra. The perceptual system may therefore use audiovisual correlations rooted in oral acoustics to extract detailed spectrotemporal information from visual speech.

Keywords:  audiovisual speech; multisensory; spectrotemporal; speech perception


Year:  2020        PMID: 32632010      PMCID: PMC7382243          DOI: 10.1073/pnas.2002887117

Source DB:  PubMed          Journal:  Proc Natl Acad Sci U S A        ISSN: 0027-8424            Impact factor:   11.205


References (51 in total; first 10 shown)

1.  The use of visible speech cues for improving auditory detection of spoken sentences.

Authors:  K W Grant; P F Seitz
Journal:  J Acoust Soc Am       Date:  2000-09       Impact factor: 1.840

2.  Lipreading and audio-visual speech perception. (Review)

Authors:  Q Summerfield
Journal:  Philos Trans R Soc Lond B Biol Sci       Date:  1992-01-29       Impact factor: 6.237

3.  Hearing lips and seeing voices.

Authors:  H McGurk; J MacDonald
Journal:  Nature       Date:  1976 Dec 23-30       Impact factor: 49.962

4.  The principle of inverse effectiveness in multisensory integration: some statistical considerations.

Authors:  Nicholas P Holmes
Journal:  Brain Topogr       Date:  2009-04-29       Impact factor: 3.020

5.  Visual speech form influences the speed of auditory speech processing.

Authors:  Tim Paris; Jeesun Kim; Chris Davis
Journal:  Brain Lang       Date:  2013-08-11       Impact factor: 2.381

6.  On the role of spectral transition for speech perception.

Authors:  S Furui
Journal:  J Acoust Soc Am       Date:  1986-10       Impact factor: 1.840

7.  Mouth and Voice: A Relationship between Visual and Auditory Preference in the Human Superior Temporal Sulcus.

Authors:  Lin L Zhu; Michael S Beauchamp
Journal:  J Neurosci       Date:  2017-02-08       Impact factor: 6.167

8.  Congruent Visual Speech Enhances Cortical Entrainment to Continuous Auditory Speech in Noise-Free Conditions.

Authors:  Michael J Crosse; John S Butler; Edmund C Lalor
Journal:  J Neurosci       Date:  2015-10-21       Impact factor: 6.167

9.  A Causal Inference Model Explains Perception of the McGurk Effect and Other Incongruent Audiovisual Speech.

Authors:  John F Magnotti; Michael S Beauchamp
Journal:  PLoS Comput Biol       Date:  2017-02-16       Impact factor: 4.475

10.  The auditory representation of speech sounds in human motor cortex.

Authors:  Connie Cheung; Liberty S Hamilton; Keith Johnson; Edward F Chang
Journal:  Elife       Date:  2016-03-04       Impact factor: 8.140

Cited by (6 in total)

1.  Vision perceptually restores auditory spectral dynamics in speech.

Authors:  John Plass; David Brang; Satoru Suzuki; Marcia Grabowecky
Journal:  Proc Natl Acad Sci U S A       Date:  2020-07-06       Impact factor: 11.205

2.  Event-related potentials evidence for long-term audiovisual representations of phonemes in adults.

Authors:  Natalya Kaganovich; Sharon Christ
Journal:  Eur J Neurosci       Date:  2021-11-25       Impact factor: 3.386

3.  Shared and modality-specific brain regions that mediate auditory and visual word comprehension.

Authors:  Anne Keitel; Joachim Gross; Christoph Kayser
Journal:  Elife       Date:  2020-08-24       Impact factor: 8.140

4.  Intelligibility of audiovisual sentences drives multivoxel response patterns in human superior temporal cortex.

Authors:  Johannes Rennig; Michael S Beauchamp
Journal:  Neuroimage       Date:  2021-12-11       Impact factor: 6.556

5.  MEG Activity in Visual and Auditory Cortices Represents Acoustic Speech-Related Information during Silent Lip Reading.

Authors:  Felix Bröhl; Anne Keitel; Christoph Kayser
Journal:  eNeuro       Date:  2022-06-27

6.  Neurosensory development of the four brainstem-projecting sensory systems and their integration in the telencephalon.

Authors:  Bernd Fritzsch; Karen L Elliott; Ebenezer N Yamoah
Journal:  Front Neural Circuits       Date:  2022-09-23       Impact factor: 3.342

