
Biomimetic multi-resolution analysis for robust speaker recognition.

Sridhar Krishna Nemala, Dmitry N Zotkin, Ramani Duraiswami, Mounya Elhilali.

Abstract

Humans exhibit a remarkable ability to reliably classify sound sources in the environment, even in the presence of high levels of noise. In contrast, most engineering systems suffer a drastic drop in performance when speech signals are corrupted by channel or background distortions. Our brains are equipped with elaborate machinery for speech analysis and feature extraction, which holds great lessons for improving the performance of automatic speech processing systems under adverse conditions. The work presented here explores a biologically motivated, multi-resolution representation of speaker information, obtained through an intricate yet computationally efficient analysis of the information-rich spectro-temporal attributes of the speech signal. We evaluate the proposed features in a speaker verification task performed on NIST SRE 2010 data. The biomimetic approach yields significant robustness in the presence of non-stationary noise and reverberation, offering a new framework for deriving reliable features for speaker recognition and speech processing.

Year:  2012        PMID: 30546387      PMCID: PMC6289187          DOI: 10.1186/1687-4722-2012-22

Source DB:  PubMed          Journal:  EURASIP J Audio Speech Music Process        ISSN: 1687-4714


