A Framework for Speech Activity Detection Using Adaptive Auditory Receptive Fields.

Michael A Carlin, Mounya Elhilali.

Abstract

One of the hallmarks of sound processing in the brain is the ability of the nervous system to adapt to changing behavioral demands and surrounding soundscapes. It can dynamically shift sensory and cognitive resources to focus on relevant sounds. Neurophysiological studies indicate that this ability is supported by adaptively retuning the shapes of cortical spectro-temporal receptive fields (STRFs) to enhance features of target sounds while suppressing those of task-irrelevant distractors. Because an important component of human communication is the ability of a listener to dynamically track speech in noisy environments, the solution obtained by auditory neurophysiology implies a useful adaptation strategy for speech activity detection (SAD). SAD is an important first step in a number of automated speech processing systems, and performance is often reduced in highly noisy environments. In this paper, we describe how task-driven adaptation is induced in an ensemble of neurophysiological STRFs, and show how speech-adapted STRFs reorient themselves to enhance spectro-temporal modulations of speech while suppressing those associated with a variety of nonspeech sounds. We then show how an adapted ensemble of STRFs can better detect speech in unseen noisy environments compared to an unadapted ensemble and a noise-robust baseline. Finally, we use a stimulus reconstruction task to demonstrate how the adapted STRF ensemble better captures the spectrotemporal modulations of attended speech in clean and noisy conditions. Our results suggest that a biologically plausible adaptation framework can be applied to speech processing systems to dynamically adapt feature representations for improving noise robustness.

Keywords:  Adaptive filtering; neural plasticity; spectro-temporal receptive fields; speech activity detection (SAD); stimulus reconstruction

Year:  2015        PMID: 29904642      PMCID: PMC5997283          DOI: 10.1109/TASLP.2015.2481179

Source DB:  PubMed          Journal:  IEEE/ACM Trans Audio Speech Lang Process


  40 in total

1.  Adaptive changes in cortical receptive fields induced by attention to complex sounds.

Authors:  Jonathan B Fritz; Mounya Elhilali; Shihab A Shamma
Journal:  J Neurophysiol       Date:  2007-08-15       Impact factor: 2.714

2.  Odor perception and olfactory bulb plasticity in adult mammals. (Review)

Authors:  Nathalie Mandairon; Christiane Linster
Journal:  J Neurophysiol       Date:  2009-03-04       Impact factor: 2.714

3.  Influence of context and behavior on stimulus reconstruction from neural activity in primary auditory cortex.

Authors:  Nima Mesgarani; Stephen V David; Jonathan B Fritz; Shihab A Shamma
Journal:  J Neurophysiol       Date:  2009-09-16       Impact factor: 2.714

4.  Mechanisms of noise robust representation of speech in primary auditory cortex.

Authors:  Nima Mesgarani; Stephen V David; Jonathan B Fritz; Shihab A Shamma
Journal:  Proc Natl Acad Sci U S A       Date:  2014-04-21       Impact factor: 11.205

5.  The spectro-temporal receptive field. A functional characteristic of auditory neurons.

Authors:  A M Aertsen; P I Johannesma
Journal:  Biol Cybern       Date:  1981       Impact factor: 2.086

6.  Focusing attention on sound.

Authors:  Victoria M Bajo; Andrew J King
Journal:  Nat Neurosci       Date:  2010-08       Impact factor: 24.884

7.  Adaptive auditory computations. (Review)

Authors:  Shihab Shamma; Jonathan Fritz
Journal:  Curr Opin Neurobiol       Date:  2014-02-11       Impact factor: 6.627

8.  Differential dynamic plasticity of A1 receptive fields during multiple spectral tasks.

Authors:  Jonathan B Fritz; Mounya Elhilali; Shihab A Shamma
Journal:  J Neurosci       Date:  2005-08-17       Impact factor: 6.167

9.  The modulation transfer function for speech intelligibility.

Authors:  Taffeta M Elliott; Frédéric E Theunissen
Journal:  PLoS Comput Biol       Date:  2009-03-06       Impact factor: 4.475

10.  Reconstructing speech from human auditory cortex.

Authors:  Brian N Pasley; Stephen V David; Nima Mesgarani; Adeen Flinker; Shihab A Shamma; Nathan E Crone; Robert T Knight; Edward F Chang
Journal:  PLoS Biol       Date:  2012-01-31       Impact factor: 8.029
