An audio-visual corpus for speech perception and automatic speech recognition.

Martin Cooke, Jon Barker, Stuart Cunningham, Xu Shao.

Abstract

An audio-visual corpus has been collected to support the use of common material in speech perception and automatic speech recognition studies. The corpus consists of high-quality audio and video recordings of 1000 sentences spoken by each of 34 talkers. Sentences are simple, syntactically identical phrases such as "place green at B 4 now". Intelligibility tests using the audio signals suggest that the material is easily identifiable in quiet and in low levels of stationary noise. The annotated corpus is available on the web for research use.

Year:  2006        PMID: 17139705     DOI: 10.1121/1.2229005

Source DB:  PubMed          Journal:  J Acoust Soc Am        ISSN: 0001-4966            Impact factor:   1.840


  16 in total

1.  Speaker-dependent multipitch tracking using deep neural networks.

Authors:  Yuzhou Liu; DeLiang Wang
Journal:  J Acoust Soc Am       Date:  2017-02       Impact factor: 1.840

2.  Modulation transfer functions for audiovisual speech.

Authors:  Nicolai F Pedersen; Torsten Dau; Lars Kai Hansen; Jens Hjortkjær
Journal:  PLoS Comput Biol       Date:  2022-07-19       Impact factor: 4.779

3.  The effects of Lombard perturbation on speech intelligibility in noise for normal hearing and cochlear implant listeners.

Authors:  Juliana N Saba; John H L Hansen
Journal:  J Acoust Soc Am       Date:  2022-02       Impact factor: 2.482

4.  The natural statistics of audiovisual speech.

Authors:  Chandramouli Chandrasekaran; Andrea Trubanova; Sébastien Stillittano; Alice Caplier; Asif A Ghazanfar
Journal:  PLoS Comput Biol       Date:  2009-07-17       Impact factor: 4.475

5.  The Bluegrass corpus: Audio-visual stimuli to investigate foreign accents.

Authors:  Bailey McGuffin; Sara Incera; Homer S White
Journal:  Behav Res Methods       Date:  2021-05-04

6.  Explaining face-voice matching decisions: The contribution of mouth movements, stimulus effects and response biases.

Authors:  Nadine Lavan; Harriet Smith; Li Jiang; Carolyn McGettigan
Journal:  Atten Percept Psychophys       Date:  2021-04-01       Impact factor: 2.199

7. 

Authors:  Robert Peharz; Franz Pernkopf
Journal:  Neurocomputing       Date:  2012-03-15       Impact factor: 5.719

8.  Matching novel face and voice identity using static and dynamic facial images.

Authors:  Harriet M J Smith; Andrew K Dunn; Thom Baguley; Paula C Stacey
Journal:  Atten Percept Psychophys       Date:  2016-04       Impact factor: 2.199

9.  Temporal Fine-Structure Coding and Lateralized Speech Perception in Normal-Hearing and Hearing-Impaired Listeners.

Authors:  Gusztáv Lőcsei; Julie H Pedersen; Søren Laugesen; Sébastien Santurette; Torsten Dau; Ewen N MacDonald
Journal:  Trends Hear       Date:  2016-09-05       Impact factor: 3.293

10.  The contribution of visual information to the perception of speech in noise with and without informative temporal fine structure.

Authors:  Paula C Stacey; Pádraig T Kitterick; Saffron D Morris; Christian J Sumner
Journal:  Hear Res       Date:  2016-04-13       Impact factor: 3.208
