Statistical voice activity detection based on integrated bispectrum likelihood ratio tests for robust speech recognition.

J Ramírez, J M Górriz, J C Segura.

Abstract

Currently, technology barriers prevent speech processing systems that work in extremely noisy conditions from meeting the demands of modern applications. These systems often require a noise reduction system working in combination with a precise voice activity detector (VAD). This paper presents statistical likelihood ratio tests formulated in terms of the integrated bispectrum of the noisy signal. The integrated bispectrum is defined as a cross spectrum between the signal and its square, and is therefore a function of a single frequency variable. It inherits the ability of higher order statistics to detect signals in noise, with two additional advantages: (i) its computation as a cross spectrum leads to significant computational savings, and (ii) the variance of the estimator is of the same order as that of the power spectrum estimator. The proposed approach incorporates contextual information into the decision rule, a strategy that has reported significant benefits for robust speech recognition applications. The proposed VAD is compared to the G.729, adaptive multirate, and advanced front-end standards as well as recently reported algorithms, showing a sustained advantage in speech/nonspeech detection accuracy and speech recognition performance.
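The definition of the integrated bispectrum as a cross spectrum between the signal and its square can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `dft` and `integrated_bispectrum`, the non-overlapping block averaging, the division by the block length, and the choice of which factor is conjugated are all assumptions made for illustration. Only the Python standard library is used.

```python
import cmath
import math


def dft(frame):
    """Naive O(N^2) discrete Fourier transform; adequate for short frames."""
    n = len(frame)
    return [
        sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def integrated_bispectrum(signal, block_len):
    """Estimate the cross spectrum between a signal and its centered square.

    The signal is split into non-overlapping blocks and the cross-periodogram
    conj(X) * Y is averaged over the blocks. The normalization by block_len
    and the conjugation convention are assumptions of this sketch.
    """
    n_blocks = len(signal) // block_len
    if n_blocks == 0:
        raise ValueError("signal is shorter than one block")
    acc = [0j] * block_len
    for b in range(n_blocks):
        x = signal[b * block_len:(b + 1) * block_len]
        mean_x = sum(x) / block_len
        x = [v - mean_x for v in x]          # remove the block mean
        sq = [v * v for v in x]
        mean_sq = sum(sq) / block_len
        y = [v - mean_sq for v in sq]        # centered square of the block
        spec_x, spec_y = dft(x), dft(y)
        for k in range(block_len):
            acc[k] += spec_x[k].conjugate() * spec_y[k] / block_len
    # One complex value per frequency bin: a function of a single frequency.
    return [a / n_blocks for a in acc]


if __name__ == "__main__":
    n = 32
    # A lone sinusoid has no component in common with its own square.
    lone = [math.cos(2 * math.pi * 4 * t / n) for t in range(2 * n)]
    # Adding the second harmonic makes the signal and its square share bins.
    coupled = [
        math.cos(2 * math.pi * 4 * t / n) + math.cos(2 * math.pi * 8 * t / n)
        for t in range(2 * n)
    ]
    print(max(abs(v) for v in integrated_bispectrum(lone, n)))
    print(abs(integrated_bispectrum(coupled, n)[4]))
```

In this toy case the estimate for the lone sinusoid is zero up to rounding, while the harmonically related pair yields a clearly nonzero value at the fundamental bin. The cost per block is two transforms and one bin-wise product, which is the source of the computational savings over evaluating the full two-frequency bispectrum.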

Year:  2007        PMID: 17550192     DOI: 10.1121/1.2714915

Source DB:  PubMed          Journal:  J Acoust Soc Am        ISSN: 0001-4966            Impact factor:   1.840


