Prominence Detection Using Auditory Attention Cues and Task-Dependent High Level Information.

Ozlem Kalinli, Shrikanth Narayanan.

Abstract

Auditory attention is a complex mechanism that involves the processing of low-level acoustic cues together with higher-level cognitive cues. In this paper, a novel method is proposed that combines biologically inspired auditory attention cues with higher-level lexical and syntactic information to model task-dependent influences on a given spoken language processing task. A set of low-level multiscale features (intensity, frequency contrast, temporal contrast, orientation, and pitch) is extracted in parallel from the auditory spectrum of the sound, following the processing stages of the central auditory system. The resulting feature maps are converted to auditory gist features that capture the essence of a sound scene. The auditory attention model biases the gist features in a task-dependent way to maximize target detection in a given scene. Furthermore, the top-down task-dependent influence of lexical and syntactic information is incorporated into the model using a probabilistic approach: the lexical information is incorporated through a probabilistic language model, and the syntactic knowledge is modeled using part-of-speech (POS) tags. The combined model is tested on automatically detecting prominent syllables in speech using the BU Radio News Corpus. It achieves 88.33% prominence detection accuracy at the syllable level and 85.71% accuracy at the word level. These results compare well with reported human performance on this task.
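The abstract says that acoustic, lexical, and syntactic evidence are combined probabilistically but does not give the combination rule. The following is a minimal sketch of one plausible way to fuse three per-syllable probabilities, assuming a weighted log-linear (product-of-experts) combination over two classes with equal priors. The function names, the weights, and the example probabilities are hypothetical and are not taken from the paper.

```python
import math


def combine_log_linear(p_acoustic, p_lexical, p_syntactic,
                       weights=(1.0, 1.0, 1.0)):
    """Fuse three probabilities that a syllable is prominent.

    Assumption (not from the paper): each stream contributes a weighted
    log-probability for each class, and the result is renormalized over
    the two classes "prominent" and "non-prominent".
    """
    eps = 1e-12  # guards against log(0)
    probs = (p_acoustic, p_lexical, p_syntactic)

    # Weighted log-score for the "prominent" class.
    log_pos = sum(w * math.log(max(p, eps))
                  for w, p in zip(weights, probs))
    # Weighted log-score for the "non-prominent" class.
    log_neg = sum(w * math.log(max(1.0 - p, eps))
                  for w, p in zip(weights, probs))

    # Subtract the larger score before exponentiating for stability.
    m = max(log_pos, log_neg)
    pos = math.exp(log_pos - m)
    neg = math.exp(log_neg - m)
    return pos / (pos + neg)


def is_prominent(p_acoustic, p_lexical, p_syntactic, threshold=0.5):
    """Label a syllable prominent when the fused score exceeds threshold."""
    return combine_log_linear(p_acoustic, p_lexical, p_syntactic) > threshold


if __name__ == "__main__":
    # Hypothetical scores for one syllable from the three streams.
    score = combine_log_linear(0.9, 0.8, 0.7)
    print(f"fused probability of prominence: {score:.3f}")
```

In this sketch a stream that outputs 0.5 is uninformative and leaves the fused score unchanged, so a missing lexical or syntactic score can be replaced by 0.5. The weights would normally be tuned on held-out data.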

Year:  2009        PMID: 20084186      PMCID: PMC2806691          DOI: 10.1109/tasl.2009.2014795

Source DB:  PubMed          Journal:  IEEE Trans Audio Speech Lang Process        ISSN: 1558-7916


