
Separation of speech from interfering sounds based on oscillatory correlation.

D L Wang, G J Brown.

Abstract

A multistage neural model is proposed for an auditory scene analysis task--segregating speech from interfering sound sources. The core of the model is a two-layer oscillator network that performs stream segregation on the basis of oscillatory correlation. In the oscillatory correlation framework, a stream is represented by a population of synchronized relaxation oscillators, each of which corresponds to an auditory feature, and different streams are represented by desynchronized oscillator populations. Lateral connections between oscillators encode harmonicity, and proximity in frequency and time. Prior to the oscillator network are a model of the auditory periphery and a stage in which mid-level auditory representations are formed. The model has been systematically evaluated using a corpus of voiced speech mixed with interfering sounds, and produces improvements in terms of signal-to-noise ratio for every mixture. The performance of our model is compared with other studies on computational auditory scene analysis. A number of issues including biological plausibility and real-time implementation are also discussed.

Year:  1999        PMID: 18252568     DOI: 10.1109/72.761727

Source DB:  PubMed          Journal:  IEEE Trans Neural Netw        ISSN: 1045-9227
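The abstract describes streams as synchronized populations of relaxation oscillators, with different streams desynchronized from one another. The paper's network uses Terman-Wang oscillators with lateral excitatory connections and a global inhibitor; the script below is only a minimal sketch of the synchronization mechanism, using two identical stimulated Terman-Wang oscillators joined by simple diffusive coupling instead of the paper's connection scheme. Parameter values and coupling strength are illustrative assumptions, not taken from the paper.

```python
import math

def terman_wang_pair(k=2.0, dt=0.005, steps=60000,
                     eps=0.02, gamma=6.0, beta=0.1, I=0.8):
    """Integrate two diffusively coupled Terman-Wang relaxation
    oscillators (forward Euler) and return their x trajectories.

    Each unit follows
        dx/dt = 3x - x^3 + 2 - y + I + k*(x_other - x)
        dy/dt = eps * (gamma * (1 + tanh(x / beta)) - y)
    where I > 0 marks a stimulated unit. The coupling strength k
    and all parameter values here are illustrative assumptions.
    """
    # Start the two units at clearly different phases, so any
    # eventual agreement is produced by the coupling term.
    x1, y1 = -2.0, 4.0
    x2, y2 = 2.0, 2.0
    xs1, xs2 = [], []
    for _ in range(steps):
        dx1 = 3*x1 - x1**3 + 2 - y1 + I + k*(x2 - x1)
        dy1 = eps*(gamma*(1 + math.tanh(x1/beta)) - y1)
        dx2 = 3*x2 - x2**3 + 2 - y2 + I + k*(x1 - x2)
        dy2 = eps*(gamma*(1 + math.tanh(x2/beta)) - y2)
        x1 += dt*dx1; y1 += dt*dy1
        x2 += dt*dx2; y2 += dt*dy2
        xs1.append(x1); xs2.append(x2)
    return xs1, xs2

xs1, xs2 = terman_wang_pair()
# Discard the transient; inspect the second half of the run.
tail1 = xs1[len(xs1)//2:]
tail2 = xs2[len(xs2)//2:]
# Each unit keeps jumping between the two branches of its cubic
# nullcline (relaxation oscillation), and the pair ends up
# synchronized: the phase difference shrinks to near zero.
print("x1 range:", min(tail1), max(tail1))
print("max |x1 - x2| after transient:",
      max(abs(a - b) for a, b in zip(tail1, tail2)))
```

In the full model a population of such units, linked by connections that encode harmonicity and time-frequency proximity, synchronizes to represent one stream, while a global inhibitor keeps different populations desynchronized; that machinery is omitted here.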


Related articles: 14 in total

1.  Fast and robust image segmentation by small-world neural oscillator networks.

Authors:  Chunguang Li; Yuke Li
Journal:  Cogn Neurodyn       Date:  2011-03-01       Impact factor: 5.082

2.  A cocktail party with a cortical twist: how cortical mechanisms contribute to sound segregation.

Authors:  Mounya Elhilali; Shihab A Shamma
Journal:  J Acoust Soc Am       Date:  2008-12       Impact factor: 1.840

3. [Review] Time-frequency masking for speech separation and its potential for hearing aid design.

Authors: 
Journal:  Trends Amplif       Date:  2008-10-30

4.  An oscillatory correlation model of auditory streaming.

Authors:  Deliang Wang; Peter Chang
Journal:  Cogn Neurodyn       Date:  2008-01-10       Impact factor: 5.082

5.  An oscillatory neural network model that demonstrates the benefits of multisensory learning.

Authors:  A Ravishankar Rao
Journal:  Cogn Neurodyn       Date:  2018-06-07       Impact factor: 5.082

6.  The importance of processing resolution in "ideal time-frequency segregation" of masked speech and the implications for predicting speech intelligibility.

Authors:  Christopher Conroy; Virginia Best; Todd R Jennings; Gerald Kidd
Journal:  J Acoust Soc Am       Date:  2020-03       Impact factor: 1.840

7. [Review] Temporal coherence and attention in auditory scene analysis.

Authors:  Shihab A Shamma; Mounya Elhilali; Christophe Micheyl
Journal:  Trends Neurosci       Date:  2010-12-31       Impact factor: 13.837

8.  Speech Segregation Using an Auditory Vocoder With Event-Synchronous Enhancements.

Authors:  Toshio Irino; Roy D Patterson; Hideki Kawahara
Journal:  IEEE Trans Audio Speech Lang Process       Date:  2006-11

9.  Ecological origins of perceptual grouping principles in the auditory system.

Authors:  Wiktor Młynarski; Josh H McDermott
Journal:  Proc Natl Acad Sci U S A       Date:  2019-11-21       Impact factor: 11.205

10.  A Dynamic Compressive Gammachirp Auditory Filterbank.

Authors:  Toshio Irino; Roy D Patterson
Journal:  IEEE Trans Audio Speech Lang Process       Date:  2006-11
