
Recognizing the message and the messenger: biomimetic spectral analysis for robust speech and speaker recognition.

Sridhar Krishna Nemala, Kailash Patil, Mounya Elhilali.

Abstract

Humans are quite adept at communicating in the presence of noise. However, most speech processing systems, such as automatic speech and speaker recognition systems, suffer a significant drop in performance when speech signals are corrupted with unseen background distortions. The proposed work explores the use of a biologically motivated multi-resolution spectral analysis for speech representation. This approach focuses on the information-rich spectral attributes of speech and presents an intricate yet computationally efficient analysis of the speech signal through careful choice of model parameters. Further, the approach takes advantage of an information-theoretic analysis of the message-dominant and speaker-dominant regions in the speech signal, and defines feature representations that address two diverse tasks: speech recognition and speaker recognition. The proposed analysis surpasses the standard Mel-Frequency Cepstral Coefficients (MFCC) and their enhanced variants (via mean subtraction, variance normalization, and time sequence filtering), and yields significant improvements over a state-of-the-art noise-robust feature scheme on both speech and speaker recognition tasks.
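The abstract names mean subtraction and variance normalization as enhancements applied to the MFCC baseline. As a minimal sketch of that generic normalization step (not code from the paper, and not the proposed biomimetic analysis), the following assumes features arrive as a list of per-frame coefficient vectors for one utterance. The names `cmvn`, `frames`, and `eps` are hypothetical and chosen only for illustration.

```python
from statistics import fmean, pstdev


def cmvn(frames, eps=1e-8):
    """Per-utterance mean and variance normalization of feature frames.

    frames: list of equal-length lists, one feature vector per time frame
            (assumed layout; e.g. one MFCC vector per frame).
    eps:    small constant (an assumption) that avoids division by zero
            when a coefficient is constant across the utterance.
    Returns a new list of frames in which each coefficient has zero mean
    and approximately unit variance over the utterance.
    """
    if not frames:
        return []
    # Transpose so that each tuple holds one coefficient across all frames.
    columns = list(zip(*frames))
    means = [fmean(col) for col in columns]
    stds = [pstdev(col) for col in columns]  # population standard deviation
    return [
        [(x - m) / (s + eps) for x, m, s in zip(frame, means, stds)]
        for frame in frames
    ]


# Toy input: three frames, two coefficients each.
normalized = cmvn([[1.0, 10.0], [3.0, 10.0], [5.0, 10.0]])
```

Normalizing each coefficient over the utterance removes a constant offset and rescales its dynamic range, which is the usual motivation for applying this step before recognition in mismatched conditions.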


Keywords:  Biomimetic; Multi-resolution; Speaker verification; Speech recognition

Year:  2012        PMID: 26412979      PMCID: PMC4579853          DOI: 10.1007/s10772-012-9184-y

Source DB:  PubMed          Journal:  Int J Speech Technol        ISSN: 1381-2416


  6 in total

1.  Spectrotemporal receptive fields in the lemniscal auditory thalamus and cortex.

Authors:  Lee M Miller; Monty A Escabí; Heather L Read; Christoph E Schreiner
Journal:  J Neurophysiol       Date:  2002-01       Impact factor: 2.714

2.  Neural mechanisms for spectral analysis in the auditory midbrain, thalamus, and cortex. (Review)

Authors:  Monty A Escabí; Heather L Read
Journal:  Int Rev Neurobiol       Date:  2005       Impact factor: 3.230

3.  Robust combination of neural networks and hidden Markov models for speech recognition.

Authors:  E Trentin; M Gori
Journal:  IEEE Trans Neural Netw       Date:  2003

4.  Neural Network Classifiers Estimate Bayesian a posteriori Probabilities.

Authors:  Michael D Richard; Richard P Lippmann
Journal:  Neural Comput       Date:  1991       Impact factor: 2.026

5.  A frequency-selective feedback model of auditory efferent suppression and its implications for the recognition of speech in noise.

Authors:  Nicholas R Clark; Guy J Brown; Tim Jürgens; Ray Meddis
Journal:  J Acoust Soc Am       Date:  2012-09       Impact factor: 1.840

6.  The modulation transfer function for speech intelligibility.

Authors:  Taffeta M Elliott; Frédéric E Theunissen
Journal:  PLoS Comput Biol       Date:  2009-03-06       Impact factor: 4.475

  1 in total

1.  Feedback-Driven Sensory Mapping Adaptation for Robust Speech Activity Detection.

Authors:  Ashwin Bellur; Mounya Elhilali
Journal:  IEEE/ACM Trans Audio Speech Lang Process       Date:  2016-12-13
