
Directly data-derived articulatory gesture-like representations retain discriminatory information about phone categories.

Vikram Ramanarayanan, Maarten Van Segbroeck, Shrikanth S Narayanan.

Abstract

How the speech production and perception systems evolved in humans remains a mystery. Previous research suggests that the human auditory system is able, and has possibly evolved, to preserve maximal information about the speaker's articulatory gestures. This paper takes an initial step towards answering the complementary question of whether speakers' articulatory mechanisms have also evolved to produce sounds that can be optimally discriminated by the listener's auditory system. To this end, we explicitly model, using computational methods, the extent to which derived representations of "primitive movements" of speech articulation can be used to discriminate between broad phone categories. We extract interpretable spatio-temporal primitive movements as recurring patterns in a data matrix of human speech articulation, i.e. a matrix representing the trajectories of vocal tract articulators over time. Specifically, we propose a weakly-supervised learning method that attempts to find a part-based representation of the data in terms of recurring basis trajectory units (or primitives) and their corresponding activations over time. For each phone interval, we then derive a feature representation that captures the co-occurrences between the activations of the various bases over different time-lags. We show that this feature, derived entirely from activations of these primitive movements, achieves greater discrimination than conventional features on an interval-based phone classification task. We discuss the implications of these findings for our understanding of speech signal representations and the links between speech production and perception systems.
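
The abstract does not give the exact form of the lagged co-occurrence feature, so the following is only a minimal sketch of one plausible reading. It assumes the activations are available as a K-by-T array (one row per primitive, one column per frame) and that the feature for a phone interval is the sum, over frames in the interval, of products of activation pairs at each lag. The function name `lagged_cooccurrence` and its parameters are hypothetical, chosen for illustration and not taken from the paper.

```python
def lagged_cooccurrence(activations, start, end, max_lag):
    """Flattened lagged co-occurrence feature for one phone interval.

    Assumed (hypothetical) formulation, not the paper's confirmed method.

    activations: list of K rows, each a list of T nonnegative activation
                 values (one row per basis/primitive).
    start, end:  frame indices delimiting the interval [start, end).
    max_lag:     largest time lag, in frames, to include.

    Returns a list of length (max_lag + 1) * K * K, ordered by lag,
    then by first basis i, then by second basis j.
    """
    segment = [row[start:end] for row in activations]
    n_frames = len(segment[0]) if segment else 0
    feature = []
    for lag in range(max_lag + 1):
        for row_i in segment:
            for row_j in segment:
                # Sum over frames t of a_i[t] * a_j[t + lag], staying
                # inside the interval; empty (zero) if lag >= n_frames.
                total = sum(
                    row_i[t] * row_j[t + lag] for t in range(n_frames - lag)
                )
                feature.append(total)
    return feature


# Toy example: two primitives that strictly alternate over four frames.
toy = [[1, 0, 1, 0],
       [0, 1, 0, 1]]
print(lagged_cooccurrence(toy, 0, 4, 1))  # -> [2, 0, 0, 2, 0, 2, 1, 0]
```

In the toy output, the lag-0 block is diagonal (each primitive co-occurs only with itself), while the lag-1 block is off-diagonal, reflecting that one primitive tends to follow the other by one frame. A fixed-length vector of this kind could then be passed to any standard classifier for the interval-based phone classification task.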


Keywords:  information transfer; motor theory; movement primitives; phone classification; speech communication

Year:  2015        PMID: 26688612      PMCID: PMC4681009          DOI: 10.1016/j.csl.2015.03.004

Source DB:  PubMed          Journal:  Comput Speech Lang        ISSN: 0885-2308            Impact factor:   1.899


