Acoustic and Lexical Representations for Affect Prediction in Spontaneous Conversations.

Houwei Cao, Arman Savran, Ragini Verma, Ani Nenkova.

Abstract

In this article we investigate which representations of acoustics and word usage are most suitable for predicting dimensions of affect (arousal, valence, power, and expectancy) in spontaneous interactions. Our experiments are based on the AVEC 2012 challenge dataset. For lexical representations, we compare corpus-independent features based on psychological word norms of emotional dimensions with corpus-dependent representations. We find that a corpus-dependent bag-of-words approach that uses the mutual information between words and emotion dimensions is by far the best representation. For the analysis of acoustics, we focus on the question of granularity. We confirm on our corpus that utterance-level features are more predictive than word-level features. Further, we study more detailed representations in which the utterance is divided into regions of interest (ROI), each with a separate representation. We introduce two ROI representations, which significantly outperform less informed approaches. In addition, we show that acoustic models of emotion can be improved considerably by taking annotator agreement into account and training the model on a smaller but more reliable dataset. Finally, we discuss the potential for improving prediction by combining the lexical and acoustic modalities. Simple fusion methods do not lead to consistent improvements over lexical classifiers alone, but they do improve over acoustic models.
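
To illustrate the idea of scoring words by their mutual information with an emotion dimension, here is a minimal sketch. It is not the authors' implementation: the abstract does not give the exact formulation, so this assumes the emotion dimension has been binarized into high (1) and low (0) labels and that each word is represented by its presence or absence in an utterance. The function name `mutual_information`, the toy utterances, and the labels are all hypothetical.

```python
import math
from collections import Counter


def mutual_information(docs, labels, word):
    """Mutual information (in bits) between the presence of `word`
    in an utterance and a binary label (assumed binarization)."""
    n = len(docs)
    # Joint counts over (word present?, label) pairs.
    joint = Counter((word in doc.split(), label) for doc, label in zip(docs, labels))
    mi = 0.0
    for (present, label), count in joint.items():
        p_xy = count / n
        # Marginals are recovered by summing the joint counts.
        p_x = sum(c for (p, _), c in joint.items() if p == present) / n
        p_y = sum(c for (_, l), c in joint.items() if l == label) / n
        mi += p_xy * math.log2(p_xy / (p_x * p_y))
    return mi


# Toy, made-up utterances with hypothetical high (1) / low (0) arousal labels.
docs = ["oh wow great", "well okay", "wow really", "okay fine"]
labels = [1, 0, 1, 0]

# Score a few candidate vocabulary words; higher means more informative.
scores = {w: mutual_information(docs, labels, w) for w in ("wow", "okay", "great")}
ranked = sorted(scores, key=scores.get, reverse=True)
```

In a sketch like this, the highest-scoring words would be kept as bag-of-words features, or their scores used as feature weights. Which of these the article does is not stated in the abstract.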

Keywords:  acoustics; affect; emotion; lexical features; spontaneous speech

Year:  2015        PMID: 25382936      PMCID: PMC4219625          DOI: 10.1016/j.csl.2014.04.002

Source DB:  PubMed          Journal:  Comput Speech Lang        ISSN: 0885-2308            Impact factor:   1.899


  3 in total

1.  The world of emotions is not two-dimensional.

Authors:  Johnny R J Fontaine; Klaus R Scherer; Etienne B Roesch; Phoebe C Ellsworth
Journal:  Psychol Sci       Date:  2007-12

2.  Class-Level Spectral Features for Emotion Recognition.

Authors:  Dmitri Bitouk; Ragini Verma; Ani Nenkova
Journal:  Speech Commun       Date:  2010-07       Impact factor: 2.017

3.  Combining Video, Audio and Lexical Indicators of Affect in Spontaneous Conversation via Particle Filtering.

Authors:  Arman Savran; Houwei Cao; Miraj Shah; Ani Nenkova; Ragini Verma
Journal:  Proc ACM Int Conf Multimodal Interact       Date:  2012
