Divide and Conquer: A Deep CASA Approach to Talker-independent Monaural Speaker Separation.

Abstract

We address talker-independent monaural speaker separation from the perspectives of deep learning and computational auditory scene analysis (CASA). Specifically, we decompose the multi-speaker separation task into the stages of simultaneous grouping and sequential grouping. Simultaneous grouping is first performed in each time frame by separating the spectra of different speakers with a neural network trained with permutation-invariant training. In the second stage, the frame-level separated spectra are sequentially grouped to different speakers by a clustering network. The proposed deep CASA approach optimizes frame-level separation and speaker tracking in turn, and produces excellent results for both objectives. Experimental results on the benchmark WSJ0-2mix database show that the new approach achieves state-of-the-art results with a modest model size.
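The two stages described in the abstract can be illustrated with a minimal NumPy sketch for the two-speaker case. This is not the authors' implementation. The function names `frame_level_pit` and `regroup`, the squared-error criterion, and the array layout `(speaker, frame, frequency)` are assumptions made for illustration. In particular, the abstract says a clustering network performs the sequential grouping; the sketch substitutes the oracle frame-level assignment for that network's prediction.

```python
import numpy as np


def frame_level_pit(est, ref):
    """Frame-level permutation-invariant error for two speakers.

    est, ref: arrays of shape (2, T, F) holding magnitude spectra
    (hypothetical layout: speaker, time frame, frequency bin).
    Returns the mean per-frame error under the best pairing and a boolean
    vector `swap`, where swap[t] is True if frame t matches the references
    better with the two outputs exchanged.
    """
    # Error of each of the two possible output-to-speaker pairings, per frame.
    keep = ((est[0] - ref[0]) ** 2 + (est[1] - ref[1]) ** 2).sum(axis=-1)
    flip = ((est[0] - ref[1]) ** 2 + (est[1] - ref[0]) ** 2).sum(axis=-1)
    swap = flip < keep
    loss = np.minimum(keep, flip).mean()
    return loss, swap


def regroup(est, swap):
    """Reorder frame-level outputs so that each stream follows one speaker.

    Assumption: `swap` stands in for the per-frame assignment that a
    clustering network would predict in the second stage.
    """
    mask = swap[:, None]  # broadcast the per-frame decision over frequency
    first = np.where(mask, est[1], est[0])
    second = np.where(mask, est[0], est[1])
    return np.stack([first, second])
```

Because the first stage chooses the pairing independently in every frame, its outputs may switch between speakers from frame to frame. The second function shows why a separate tracking step is needed to assemble consistent per-speaker streams.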

Keywords: Monaural speech separation; computational auditory scene analysis; deep CASA; speaker separation

Year: 2019 PMID: 33748322 PMCID: PMC7976856 DOI: 10.1109/taslp.2019.2941148

Source DB: PubMed Journal: IEEE/ACM Trans Audio Speech Lang Process

6 in total

1. Informational and energetic masking effects in the perception of two simultaneous talkers.

Authors: D S Brungart
Journal: J Acoust Soc Am Date: 2001-03 Impact factor: 1.840

2. An algorithm to increase intelligibility for hearing-impaired listeners in the presence of a competing talker.

Authors: Eric W Healy; Masood Delfarah; Jordan L Vasko; Brittney L Carter; DeLiang Wang
Journal: J Acoust Soc Am Date: 2017-06 Impact factor: 1.840

3. Conv-TasNet: Surpassing Ideal Time-Frequency Magnitude Masking for Speech Separation.

Authors: Yi Luo; Nima Mesgarani
Journal: IEEE/ACM Trans Audio Speech Lang Process Date: 2019-05-06

4. On Training Targets for Supervised Speech Separation.

Authors: Yuxuan Wang; Arun Narayanan; DeLiang Wang
Journal: IEEE/ACM Trans Audio Speech Lang Process Date: 2014-12

5. A Deep Ensemble Learning Method for Monaural Speech Separation.

Authors: Xiao-Lei Zhang; DeLiang Wang
Journal: IEEE/ACM Trans Audio Speech Lang Process Date: 2016-03-01

6. Complex Ratio Masking for Monaural Speech Separation.

Authors: Donald S Williamson; Yuxuan Wang; DeLiang Wang
Journal: IEEE/ACM Trans Audio Speech Lang Process Date: 2015-12-23

5 in total

1. A causal and talker-independent speaker separation/dereverberation deep learning algorithm: Cost associated with conversion to real-time capable operation.

Authors: Eric W Healy; Hassan Taherian; Eric M Johnson; DeLiang Wang
Journal: J Acoust Soc Am Date: 2021-11 Impact factor: 1.840

2. Deep learning based speaker separation and dereverberation can generalize across different languages to improve intelligibility.

Authors: Eric W Healy; Eric M Johnson; Masood Delfarah; Divya S Krishnagiri; Victoria A Sevich; Hassan Taherian; DeLiang Wang
Journal: J Acoust Soc Am Date: 2021-10 Impact factor: 2.482