Monaural Speech Dereverberation Using Temporal Convolutional Networks with Self Attention.

Yan Zhao, DeLiang Wang, Buye Xu, Tao Zhang.

Abstract

In daily listening environments, human speech is often degraded by room reverberation, especially under highly reverberant conditions. Such degradation poses a challenge for many speech processing systems, where the performance becomes much worse than in anechoic environments. To combat the effect of reverberation, we propose a monaural (single-channel) speech dereverberation algorithm using temporal convolutional networks with self attention. Specifically, the proposed system includes a self-attention module to produce dynamic representations given input features, a temporal convolutional network to learn a nonlinear mapping from such representations to the magnitude spectrum of anechoic speech, and a one-dimensional (1-D) convolution module to smooth the enhanced magnitude among adjacent frames. Systematic evaluations demonstrate that the proposed algorithm improves objective metrics of speech quality in a wide range of reverberant conditions. In addition, it generalizes well to untrained reverberation times, room sizes, measured room impulse responses, real-world recorded noisy-reverberant speech, and different speakers.
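The three-stage pipeline described in the abstract (self attention, then a temporal convolutional network, then a 1-D smoothing convolution across adjacent frames) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no layer sizes, kernels, or training details, so everything below is an assumption made for illustration. In particular, the attention uses no learned projections, the TCN is reduced to depthwise dilated convolutions with ReLU residual connections, the weights are random, and the names `self_attention`, `dilated_conv`, `tcn`, and `dereverb_sketch` are hypothetical.

```python
import math
import random

def self_attention(frames):
    """Scaled dot-product self attention over time. Queries, keys, and values
    are the input frames themselves (assumption: no learned projections)."""
    dim = len(frames[0])
    out = []
    for q in frames:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in frames]
        peak = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        out.append([sum(w * v[d] for w, v in zip(weights, frames)) / total
                    for d in range(dim)])
    return out

def dilated_conv(frames, kernel, dilation):
    """Depthwise 1-D convolution along time with zero padding, so the number
    of frames is preserved. Tap j reads frame t + (j - half) * dilation."""
    n, dim, half = len(frames), len(frames[0]), len(kernel) // 2
    out = []
    for t in range(n):
        row = [0.0] * dim
        for j, w in enumerate(kernel):
            src = t + (j - half) * dilation
            if 0 <= src < n:
                for d in range(dim):
                    row[d] += w * frames[src][d]
        out.append(row)
    return out

def tcn(frames, kernels, dilations=(1, 2, 4)):
    """Stack of residual blocks with growing dilation (assumed values), which
    widens the temporal context seen by each output frame."""
    x = frames
    for kernel, dilation in zip(kernels, dilations):
        y = dilated_conv(x, kernel, dilation)
        x = [[xi + max(0.0, yi) for xi, yi in zip(xr, yr)] for xr, yr in zip(x, y)]
    return x

def dereverb_sketch(features, kernels, smooth_kernel=(0.25, 0.5, 0.25)):
    attended = self_attention(features)        # dynamic representations
    mapped = tcn(attended, kernels)            # nonlinear mapping
    magnitude = [[max(0.0, v) for v in row] for row in mapped]  # magnitudes are non-negative
    return dilated_conv(magnitude, smooth_kernel, dilation=1)   # smooth adjacent frames

random.seed(0)
features = [[random.random() for _ in range(8)] for _ in range(20)]  # 20 frames x 8 bins
kernels = [[random.uniform(-0.5, 0.5) for _ in range(3)] for _ in range(3)]
enhanced = dereverb_sketch(features, kernels)
print(len(enhanced), len(enhanced[0]))  # 20 8
```

The sketch only shows how data flows through the three modules and that the frame-by-bin shape is preserved; a trained system would learn all weights from pairs of reverberant and anechoic speech.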

Keywords: Dereverberation; room impulse response; self attention; temporal convolutional networks

Year: 2020 PMID: 33748325 PMCID: PMC7971181 DOI: 10.1109/taslp.2020.2995273

Source DB: PubMed Journal: IEEE/ACM Trans Audio Speech Lang Process

4 in total

1. Supervised Speech Separation Based on Deep Learning: An Overview.

Authors: DeLiang Wang; Jitong Chen
Journal: IEEE/ACM Trans Audio Speech Lang Process Date: 2018-05-30

2. Objective measures for predicting speech intelligibility in noisy conditions based on new band-importance functions.

Authors: Jianfen Ma; Yi Hu; Philipos C Loizou
Journal: J Acoust Soc Am Date: 2009-05 Impact factor: 1.840

3. Hearing loss, aging, and speech perception in reverberation and noise.

Authors: K S Helfer; L A Wilber
Journal: J Speech Hear Res Date: 1990-03

4. Conv-TasNet: Surpassing Ideal Time-Frequency Magnitude Masking for Speech Separation.

Authors: Yi Luo; Nima Mesgarani
Journal: IEEE/ACM Trans Audio Speech Lang Process Date: 2019-05-06

3 in total

1. A causal and talker-independent speaker separation/dereverberation deep learning algorithm: Cost associated with conversion to real-time capable operation.

Authors: Eric W Healy; Hassan Taherian; Eric M Johnson; DeLiang Wang
Journal: J Acoust Soc Am Date: 2021-11 Impact factor: 1.840

2. Self-attending RNN for Speech Enhancement to Improve Cross-corpus Generalization.

Authors: Ashutosh Pandey; DeLiang Wang
Journal: IEEE/ACM Trans Audio Speech Lang Process Date: 2022-03-22

3. Dense CNN with Self-Attention for Time-Domain Speech Enhancement.

Authors: Ashutosh Pandey; DeLiang Wang
Journal: IEEE/ACM Trans Audio Speech Lang Process Date: 2021-03-08
