Monaural speech segregation based on pitch tracking and amplitude modulation.

Guoning Hu, Deliang Wang.

Abstract

Segregating speech from one monaural recording has proven to be very challenging. Monaural segregation of voiced speech has been studied in previous systems that incorporate auditory scene analysis principles. A major problem for these systems is their inability to deal with the high-frequency part of speech. Psychoacoustic evidence suggests that different perceptual mechanisms are involved in handling resolved and unresolved harmonics. We propose a novel system for voiced speech segregation that segregates resolved and unresolved harmonics differently. For resolved harmonics, the system generates segments based on temporal continuity and cross-channel correlation, and groups them according to their periodicities. For unresolved harmonics, it generates segments based on common amplitude modulation (AM) in addition to temporal continuity and groups them according to AM rates. Underlying the segregation process is a pitch contour that is first estimated from speech segregated according to dominant pitch and then adjusted according to psychoacoustic constraints. Our system is systematically evaluated and compared with previous systems, and it yields substantially better performance, especially for the high-frequency part of speech.
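
As a rough illustration of the cross-channel correlation cue mentioned in the abstract, the sketch below compares the autocorrelations of adjacent frequency channels. It is not the authors' implementation. It assumes the signal has already been split into channels by some filterbank (the abstract does not specify one), and the function names and the `max_lag` parameter are hypothetical.

```python
import numpy as np


def normalized_autocorrelation(x, max_lag):
    """Autocorrelation of one channel for lags 0..max_lag-1,
    shifted to zero mean and scaled to unit Euclidean norm across lags."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    acf = np.array([np.dot(x[: n - lag], x[lag:]) for lag in range(max_lag)])
    acf = acf - acf.mean()
    norm = np.sqrt(np.sum(acf ** 2))
    return acf / norm if norm > 0 else acf


def cross_channel_correlation(channels, max_lag):
    """Correlation between the normalized autocorrelations of each pair
    of adjacent channels; values near 1 suggest a shared periodicity."""
    acfs = [normalized_autocorrelation(c, max_lag) for c in channels]
    return np.array([np.dot(a, b) for a, b in zip(acfs[:-1], acfs[1:])])


# Toy input (assumed, not from the paper): two channels carrying the same
# 200 Hz periodicity, followed by one channel of white noise.
rng = np.random.default_rng(0)
t = np.arange(0, 0.02, 1 / 16000)
tone = np.sin(2 * np.pi * 200 * t)
channels = [tone, 0.5 * tone, rng.standard_normal(len(t))]
cc = cross_channel_correlation(channels, max_lag=160)
```

In this toy case the first pair of channels should score close to 1 and the tone/noise pair should score noticeably lower, which is the kind of contrast a segment-forming stage could threshold on.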

Year:  2004        PMID: 18238087     DOI: 10.1109/TNN.2004.832812

Source DB:  PubMed          Journal:  IEEE Trans Neural Netw        ISSN: 1045-9227


  11 in total

1.  Coding of amplitude modulation in primary auditory cortex.

Authors:  Pingbo Yin; Jeffrey S Johnson; Kevin N O'Connor; Mitchell L Sutter
Journal:  J Neurophysiol       Date:  2010-12-08       Impact factor: 2.714

2.  Speech enhancement using the modified phase-opponency model.

Authors:  Om D Deshmukh; Carol Y Espy-Wilson; Laurel H Carney
Journal:  J Acoust Soc Am       Date:  2007-06       Impact factor: 1.840

3.  Factors influencing intelligibility of ideal binary-masked speech: implications for noise reduction.

Authors:  Ning Li; Philipos C Loizou
Journal:  J Acoust Soc Am       Date:  2008-03       Impact factor: 1.840

Review 4.  Time-frequency masking for speech separation and its potential for hearing aid design.

Authors: 
Journal:  Trends Amplif       Date:  2008-10-30

5.  An algorithm that improves speech intelligibility in noise for normal-hearing listeners.

Authors:  Gibak Kim; Yang Lu; Yi Hu; Philipos C Loizou
Journal:  J Acoust Soc Am       Date:  2009-09       Impact factor: 1.840

6.  A glimpsing account for the benefit of simulated combined acoustic and electric hearing.

Authors:  Ning Li; Philipos C Loizou
Journal:  J Acoust Soc Am       Date:  2008-04       Impact factor: 1.840

7.  Evaluation of the importance of time-frequency contributions to speech intelligibility in noise.

Authors:  Chengzhu Yu; Kamil K Wójcicki; Philipos C Loizou; John H L Hansen; Michael T Johnson
Journal:  J Acoust Soc Am       Date:  2014-05       Impact factor: 1.840

8.  A new sound coding strategy for suppressing noise in cochlear implants.

Authors:  Yi Hu; Philipos C Loizou
Journal:  J Acoust Soc Am       Date:  2008-07       Impact factor: 1.840

Review 9.  The cocktail-party problem revisited: early processing and selection of multi-talker speech.

Authors:  Adelbert W Bronkhorst
Journal:  Atten Percept Psychophys       Date:  2015-07       Impact factor: 2.199

Review 10.  A comparison of several computational auditory scene analysis (CASA) techniques for monaural speech segregation.

Authors:  Jihen Zeremdini; Mohamed Anouar Ben Messaoud; Aicha Bouzid
Journal:  Brain Inform       Date:  2015-08-04