
An Acoustic Measure for Word Prominence in Spontaneous Speech.

Dagen Wang, Shrikanth Narayanan.

Abstract

This paper reports an algorithm for automatic speech prominence detection. We describe a comparative analysis of various acoustic features for word prominence detection and report results using a spoken dialog corpus with manually assigned prominence labels. The focus is on features such as spectral intensity and speech rate that are extracted directly from speech using a correlation-based approach, without requiring explicit linguistic or phonetic knowledge. Additionally, various pitch-based measures are studied with respect to their discriminating ability for prominence detection. A parametric scheme for modeling pitch plateau is proposed, and this feature alone is found to outperform the traditional local pitch statistics. Two sets of experiments are used to explore the usefulness of the acoustic score generated using these features. The first set focuses on a more traditional way of word prominence detection based on a manually tagged corpus. A 76.8% classification accuracy was achieved on a corpus of role-playing spoken dialogs. Due to difficulties in manually tagging speech prominence into discrete levels (categories), the second set of experiments focuses on evaluating the score indirectly. Specifically, through experiments on the Switchboard corpus, it is shown that the proposed acoustic score can discriminate between content words and function words in a statistically significant way. The relation between speech prominence and content/function words is also explored. Since prominent words tend to be predominantly content words, and since content words can be automatically marked from text-derived part-of-speech (POS) information, it is shown that the proposed acoustic score can be indirectly cross-validated through POS information.
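The indirect evaluation described in the abstract (checking whether a per-word acoustic score separates content words from function words marked via POS tags) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract does not say which statistical test was used, so Welch's t statistic stands in here; the tag prefixes treated as content words are a hypothetical Penn Treebank-style choice; and the scores are made-up toy values, not data from the paper.

```python
import math
from statistics import mean, variance

# Assumption: Penn Treebank-style tags whose prefixes mark nouns, verbs,
# adjectives, and adverbs are treated as content words; all others as function words.
CONTENT_TAG_PREFIXES = ("NN", "VB", "JJ", "RB")


def is_content_word(pos_tag):
    """Return True if the POS tag falls in the (assumed) content-word classes."""
    return pos_tag.startswith(CONTENT_TAG_PREFIXES)


def split_scores(tagged_scores):
    """Split (pos_tag, score) pairs into content-word and function-word score lists."""
    content = [score for tag, score in tagged_scores if is_content_word(tag)]
    function = [score for tag, score in tagged_scores if not is_content_word(tag)]
    return content, function


def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom for two samples."""
    va = variance(a) / len(a)  # squared standard error of the mean of sample a
    vb = variance(b) / len(b)  # squared standard error of the mean of sample b
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Toy (POS tag, acoustic score) pairs, invented for illustration only.
toy = [
    ("NN", 0.9), ("DT", 0.2), ("VB", 0.7), ("IN", 0.3),
    ("JJ", 0.8), ("TO", 0.1), ("NNS", 0.6), ("CC", 0.25),
]

content, function = split_scores(toy)
t, df = welch_t(content, function)
print(f"content mean={mean(content):.3f}, function mean={mean(function):.3f}")
print(f"Welch t={t:.2f}, df={df:.2f}")
```

A large positive t would indicate that content words receive higher scores on average than function words; turning t and df into a p-value requires a Student's t distribution, which is left out here to keep the sketch free of third-party dependencies.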


Year:  2007        PMID: 20454538      PMCID: PMC2864931          DOI: 10.1109/tasl.2006.881703

Source DB:  PubMed          Journal:  IEEE Trans Audio Speech Lang Process        ISSN: 1558-7916


