Separable spectro-temporal Gabor filter bank features: Reducing the complexity of robust features for automatic speech recognition.


Abstract

To test whether simultaneous spectral and temporal processing is required to extract robust features for automatic speech recognition (ASR), the robust spectro-temporal two-dimensional Gabor filter bank (GBFB) front-end from Schädler, Meyer, and Kollmeier [J. Acoust. Soc. Am. 131, 4134-4151 (2012)] was decomposed into a spectral one-dimensional Gabor filter bank and a temporal one-dimensional Gabor filter bank. A feature set extracted with these separate spectral and temporal modulation filter banks, the separate Gabor filter bank (SGBFB) features, was introduced and evaluated on the CHiME (Computational Hearing in Multisource Environments) keywords-in-noise recognition task. From the perspective of robust ASR, the results showed that spectral and temporal processing can be performed independently and are not required to interact with each other. Using SGBFB features permitted the signal-to-noise ratio (SNR) to be lowered by 1.2 dB while still performing as well as the GBFB-based reference system, which corresponds to a relative improvement of the word error rate by 12.8%. Additionally, the real-time factor of the spectro-temporal processing could be reduced by more than an order of magnitude. Compared to human listeners, the SNR needed to be 13 dB higher when using Mel-frequency cepstral coefficient features, 11 dB higher when using GBFB features, and 9 dB higher when using SGBFB features to achieve the same recognition performance.
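
The decomposition described in the abstract can be illustrated with a minimal sketch. It assumes a two-dimensional filter that is exactly the outer product of one spectral and one temporal filter, and shows that filtering with it matches two successive one-dimensional passes at a lower cost per output point. The filter shape (Hann envelope times a complex sinusoid), the modulation frequencies, the filter lengths, and the random stand-in spectrogram are all illustrative assumptions; this record does not give the parameters used in the paper, and the helper names `gabor_1d`, `filter_2d_direct`, and `filter_separable` are hypothetical.

```python
import numpy as np


def gabor_1d(omega, half_width):
    """Complex 1D Gabor-like filter: Hann envelope times a complex sinusoid.

    omega and half_width are illustrative values, not the paper's parameters.
    """
    n = np.arange(-half_width, half_width + 1)
    envelope = 0.5 + 0.5 * np.cos(np.pi * n / (half_width + 1))
    return envelope * np.exp(1j * omega * n)


def filter_2d_direct(spec, kernel):
    """Naive zero-padded 2D convolution with output the same size as spec."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(spec, ((ph, ph), (pw, pw)))
    flipped = kernel[::-1, ::-1]  # convolution flips the kernel
    out = np.zeros(spec.shape, dtype=complex)
    for i in range(spec.shape[0]):
        for j in range(spec.shape[1]):
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * flipped)
    return out


def filter_separable(spec, spectral, temporal):
    """Spectral 1D pass along axis 0, then temporal 1D pass along axis 1."""
    step = np.apply_along_axis(
        lambda col: np.convolve(col, spectral, mode="same"), 0, spec)
    return np.apply_along_axis(
        lambda row: np.convolve(row, temporal, mode="same"), 1, step)


# Stand-in for a log Mel-spectrogram: 23 frequency channels x 40 time frames.
rng = np.random.default_rng(0)
spectrogram = rng.standard_normal((23, 40))

spectral = gabor_1d(omega=0.5, half_width=3)   # 7 taps along frequency
temporal = gabor_1d(omega=0.3, half_width=8)   # 17 taps along time
kernel_2d = np.outer(spectral, temporal)       # separable 2D filter

direct = filter_2d_direct(spectrogram, kernel_2d)
separable = filter_separable(spectrogram, spectral, temporal)
same_result = np.allclose(direct, separable)

# Multiplications per output point: 7 * 17 for the 2D kernel
# versus 7 + 17 for the two 1D passes.
```

The equivalence holds only because the two-dimensional kernel here is built as an outer product; a two-dimensional filter that cannot be factored this way would not reduce to two one-dimensional passes.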


Year: 2015 PMID: 25920855 DOI: 10.1121/1.4916618

Source DB: PubMed Journal: J Acoust Soc Am ISSN: 0001-4966 Impact factor: 1.840

Cited

5 in total

1. Speech Intelligibility Prediction using Spectro-Temporal Modulation Analysis.

2. Matching Pursuit Analysis of Auditory Receptive Fields' Spectro-Temporal Properties.

3. Time-frequency scattering accurately models auditory similarities between instrumental playing techniques.

4. Attention Differentially Affects Acoustic and Phonetic Feature Encoding in a Multispeaker Environment.

5. Objective Prediction of Hearing Aid Benefit Across Listener Groups Using Machine Learning: Speech Recognition Performance With Binaural Noise-Reduction Algorithms.

Authors: Marc R Schädler; Anna Warzybok; Birger Kollmeier
Journal: Trends Hear Date: 2018 Jan-Dec Impact factor: 3.293