
Speech Segregation Using an Auditory Vocoder With Event-Synchronous Enhancements.

Toshio Irino, Roy D Patterson, Hideki Kawahara.

Abstract

We propose a new method to segregate concurrent speech sounds using an auditory version of a channel vocoder. The auditory representation of sound, referred to as an "auditory image," preserves fine temporal information, unlike conventional window-based processing systems. This makes it possible to segregate speech sources with an event-synchronous procedure. Fundamental frequency information is used to estimate the sequence of glottal pulse times for a target speaker, and to repress the glottal events of other speakers. The procedure leads to robust extraction of the target speech and effective segregation even when the signal-to-noise ratio is as low as 0 dB. Moreover, the segregation performance remains high when the speech contains jitter, or when the estimate of the fundamental frequency (F0) is inaccurate. This contrasts with conventional comb-filter methods, where errors in F0 estimation produce a marked reduction in performance. We compared the new method to a comb-filter method using a cross-correlation measure and perceptual recognition experiments. The results suggest that the new method has the potential to supplant comb-filter and harmonic-selection methods for speech enhancement.
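
As a rough illustration of the step in which fundamental frequency information is turned into a sequence of glottal pulse times, the sketch below accumulates phase from a frame-wise F0 track and emits one event each time a full cycle completes. This is not the authors' implementation; the function name `pulse_times_from_f0`, the frame-based F0 input, and the convention that non-positive F0 values mark unvoiced frames are all assumptions made for the example.

```python
import math


def pulse_times_from_f0(f0_hz, frame_period_s):
    """Return estimated pulse times (seconds) from a frame-wise F0 track.

    Hypothetical helper, for illustration only.
    f0_hz: per-frame F0 estimates in Hz; values <= 0 mark unvoiced frames
           (an assumed convention, not taken from the paper).
    frame_period_s: spacing between successive F0 frames in seconds.
    """
    times = []
    phase = 0.0  # accumulated phase, measured in cycles
    for i, f0 in enumerate(f0_hz):
        if f0 <= 0:
            phase = 0.0  # restart the cycle count after an unvoiced frame
            continue
        start = i * frame_period_s
        advance = f0 * frame_period_s  # cycles covered by this frame
        next_cycle = math.floor(phase) + 1
        # Emit one event per integer phase crossing inside this frame,
        # interpolating linearly to find the crossing time.
        while next_cycle <= phase + advance:
            frac = (next_cycle - phase) / advance
            times.append(start + frac * frame_period_s)
            next_cycle += 1
        phase += advance
    return times
```

With a constant F0 the events fall one period apart, and a small error in F0 shifts each event only slightly. The values in an F0 track would normally come from a separate F0 estimator, which this sketch does not include.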


Year:  2006        PMID: 20191101      PMCID: PMC2828642          DOI: 10.1109/TASL.2006.872611

Source DB:  PubMed          Journal:  IEEE Trans Audio Speech Lang Process        ISSN: 1558-7916


