Computational speech segregation based on an auditory-inspired modulation analysis.

Abstract

A monaural speech segregation system is presented that estimates the ideal binary mask from noisy speech based on the supervised learning of amplitude modulation spectrogram (AMS) features. Instead of using linearly scaled modulation filters with constant absolute bandwidth, an auditory-inspired modulation filterbank with logarithmically scaled filters is employed. To reduce the dependency of the AMS features on the overall background noise level, a feature normalization stage is applied. In addition, a spectro-temporal integration stage is incorporated in order to exploit the context information about speech activity present in neighboring time-frequency units. In order to evaluate the generalization performance of the system to unseen acoustic conditions, the speech segregation system is trained with a limited set of low signal-to-noise ratio (SNR) conditions, but tested over a wide range of SNRs up to 20 dB. A systematic evaluation of the system demonstrates that auditory-inspired modulation processing can substantially improve the mask estimation accuracy in the presence of stationary and fluctuating interferers.
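Two ideas from the abstract can be illustrated with a minimal sketch: the ideal binary mask that the system learns to estimate, and the logarithmic spacing of modulation filters. This is not the authors' implementation. It uses the standard definition of the ideal binary mask, in which a time-frequency unit is labeled 1 when its local SNR exceeds a local criterion. The threshold `lc_db`, the function names, and the frequency range are assumptions chosen for illustration, since the abstract does not specify them.

```python
import math

def ideal_binary_mask(speech_power, noise_power, lc_db=0.0):
    """Label each time-frequency unit 1 if its local SNR exceeds lc_db, else 0.

    speech_power and noise_power are equally shaped lists of rows holding
    the premixed speech and noise power per unit. lc_db is an assumed value.
    """
    eps = 1e-12  # guards against division by zero and log of zero
    mask = []
    for s_row, n_row in zip(speech_power, noise_power):
        row = []
        for s, n in zip(s_row, n_row):
            local_snr_db = 10.0 * math.log10((s + eps) / (n + eps))
            row.append(1 if local_snr_db > lc_db else 0)
        mask.append(row)
    return mask

def log_spaced_center_frequencies(f_min_hz, f_max_hz, n_filters):
    """Center frequencies with a constant ratio between neighbors.

    The range and filter count are hypothetical, not taken from the paper.
    """
    ratio = (f_max_hz / f_min_hz) ** (1.0 / (n_filters - 1))
    return [f_min_hz * ratio ** k for k in range(n_filters)]

# Toy example: 2 frequency channels x 2 time frames of power values.
speech = [[4.0, 0.1], [1.0, 9.0]]
noise = [[1.0, 1.0], [1.0, 1.0]]
mask = ideal_binary_mask(speech, noise)
centers = log_spaced_center_frequencies(4.0, 256.0, 7)
```

With a constant ratio between neighboring center frequencies, the filters are evenly spaced on a logarithmic axis, in contrast to the linearly scaled filters with constant absolute bandwidth that the abstract describes as the conventional choice.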

Year: 2014 PMID: 25480079 DOI: 10.1121/1.4901711

Source DB: PubMed Journal: J Acoust Soc Am ISSN: 0001-4966 Impact factor: 1.840

Cited by

2 in total

1. A Deep Ensemble Learning Method for Monaural Speech Separation.

Authors: Xiao-Lei Zhang; DeLiang Wang
Journal: IEEE/ACM Trans Audio Speech Lang Process Date: 2016-03-01

2. The benefit of combining a deep neural network architecture with ideal ratio mask estimation in computational speech segregation to improve speech intelligibility.

Authors: Thomas Bentsen; Tobias May; Abigail A Kressner; Torsten Dau
Journal: PLoS One Date: 2018-05-15 Impact factor: 3.240
