A causal and talker-independent speaker separation/dereverberation deep learning algorithm: Cost associated with conversion to real-time capable operation.

Eric W Healy, Hassan Taherian, Eric M Johnson, DeLiang Wang.

Abstract

The fundamental requirement for real-time operation of a speech-processing algorithm is causality: that it operate without utilizing future time frames. In the present study, the performance of a fully causal deep computational auditory scene analysis algorithm was assessed. Target sentences were isolated from complex interference consisting of an interfering talker and concurrent room reverberation. The talker- and corpus/channel-independent model used Dense-UNet and temporal convolutional networks and estimated both magnitude and phase of the target speech. It was found that mean algorithm benefit was significant in every condition. Mean benefit for hearing-impaired (HI) listeners across all conditions was 46.4 percentage points. The cost of converting the algorithm to causal processing was also assessed by comparing to a prior non-causal version. Intelligibility decrements for HI and normal-hearing listeners from non-causal to causal processing were present in most but not all conditions, and these decrements were statistically significant in half of the conditions tested, namely those representing the greater levels of complex interference. Although a cost associated with causal processing was present in most conditions, it may be considered modest relative to the overall level of benefit.
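
The causality constraint described in the abstract can be illustrated with a minimal sketch. This is not the authors' Dense-UNet/temporal convolutional network model; it is a toy frame-wise filter in plain Python, and the names `causal_filter`, `centered_filter`, and `depends_on_future` are hypothetical helpers introduced only for illustration. The causal version reads the current and past frames only, while the centered version also reads frames that have not yet arrived in a real-time setting.

```python
from typing import Callable, List, Sequence


def causal_filter(frames: Sequence[float], taps: Sequence[float]) -> List[float]:
    """Filter using only the current and past frames (illustrative assumption).

    taps[0] weights the current frame, taps[1] the previous frame, and so on.
    Frames before the start of the signal are treated as zero.
    """
    out = []
    for t in range(len(frames)):
        acc = 0.0
        for k, w in enumerate(taps):
            if t - k >= 0:  # never index a frame later than t
                acc += w * frames[t - k]
        out.append(acc)
    return out


def centered_filter(frames: Sequence[float], taps: Sequence[float]) -> List[float]:
    """Non-causal counterpart: the window is centered on t, so it reads future frames."""
    half = len(taps) // 2
    out = []
    for t in range(len(frames)):
        acc = 0.0
        for k, w in enumerate(taps):
            idx = t + k - half
            if 0 <= idx < len(frames):
                acc += w * frames[idx]
        out.append(acc)
    return out


def depends_on_future(
    filter_fn: Callable[[Sequence[float], Sequence[float]], List[float]],
    frames: Sequence[float],
    taps: Sequence[float],
    t: int,
) -> bool:
    """Return True if the output at frame t changes when frames after t are altered."""
    altered = list(frames[: t + 1]) + [x + 1.0 for x in frames[t + 1:]]
    return filter_fn(frames, taps)[t] != filter_fn(altered, taps)[t]


frames = [1.0, 2.0, 3.0, 4.0, 5.0]
taps = [0.5, 0.3, 0.2]
print(depends_on_future(causal_filter, frames, taps, 2))    # False
print(depends_on_future(centered_filter, frames, taps, 2))  # True
```

Under these assumptions, the output of the causal filter at a given frame is unaffected by later frames, which is the property that makes frame-by-frame real-time operation possible. The centered filter has access to more context, which is consistent with the intelligibility cost the study reports when moving from non-causal to causal processing.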

Year:  2021        PMID: 34852625      PMCID: PMC8612765          DOI: 10.1121/10.0007134

Source DB:  PubMed          Journal:  J Acoust Soc Am        ISSN: 0001-4966            Impact factor:   1.840


