
CREMA-D: Crowd-sourced Emotional Multimodal Actors Dataset.

Houwei Cao, David G Cooper, Michael K Keutmann, Ruben C Gur, Ani Nenkova, Ragini Verma.

Abstract

People convey their emotional state in their face and voice. We present an audio-visual data set uniquely suited for the study of multi-modal emotion expression and perception. The data set consists of facial and vocal emotional expressions in sentences spoken in a range of basic emotional states (happy, sad, anger, fear, disgust, and neutral). The 7,442 clips of 91 actors with diverse ethnic backgrounds were rated by multiple raters in three modalities: audio, visual, and audio-visual. Categorical emotion labels and real-valued intensity ratings for the perceived emotion were collected through crowd-sourcing from 2,443 raters. Human recognition of the intended emotion for the audio-only, visual-only, and audio-visual data is 40.9%, 58.2%, and 63.6%, respectively. Recognition rates are highest for neutral, followed by happy, anger, disgust, fear, and sad. Average intensity levels of emotion are rated highest for visual-only perception. The accurate recognition of disgust and fear requires simultaneous audio-visual cues, while anger and happiness can be well recognized based on evidence from a single modality. The large data set we introduce can be used to probe other questions concerning the audio-visual perception of emotion.
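
As a reading aid only (this is not from the paper): the per-modality recognition figures above can be thought of as majority-vote accuracy, i.e., the fraction of clips whose most frequent perceived-emotion label among raters matches the actor's intended emotion, computed separately for audio, visual, and audio-visual ratings. The sketch below assumes a hypothetical flat CSV export of the crowd-sourced ratings with columns clip_id, modality, perceived, and intended; the column names, the emotion codes, and the arbitrary tie-breaking are all illustrative assumptions, not the published schema or scoring protocol of CREMA-D.

    import csv
    from collections import Counter, defaultdict

    # Six emotion categories, using common short codes (assumed, not
    # taken from the paper): anger, disgust, fear, happy, neutral, sad.
    EMOTIONS = {"ANG", "DIS", "FEA", "HAP", "NEU", "SAD"}

    def majority_label(labels):
        """Most frequent perceived-emotion label among a clip's raters
        (ties broken arbitrarily by Counter ordering)."""
        return Counter(labels).most_common(1)[0][0]

    def recognition_rates(ratings_csv):
        """Per-modality fraction of clips whose majority-vote perceived
        emotion matches the intended emotion. Column names are assumed."""
        votes = defaultdict(list)   # (clip_id, modality) -> perceived labels
        intended = {}               # clip_id -> intended emotion code
        with open(ratings_csv, newline="") as f:
            for row in csv.DictReader(f):
                label = row["perceived"]
                if label in EMOTIONS:  # skip malformed or non-emotion rows
                    votes[(row["clip_id"], row["modality"])].append(label)
                    intended[row["clip_id"]] = row["intended"]
        hits, totals = Counter(), Counter()
        for (clip_id, modality), labels in votes.items():
            totals[modality] += 1
            if majority_label(labels) == intended[clip_id]:
                hits[modality] += 1
        return {m: hits[m] / totals[m] for m in sorted(totals)}

Grouping the ratings by audio-only, visual-only, and audio-visual modality, recognition_rates would return the analogue of the 40.9%, 58.2%, and 63.6% figures quoted in the abstract.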

Keywords:  Emotional corpora; facial expression; multi-modal recognition; voice expression

Year: 2014    PMID: 25653738    PMCID: PMC4313618    DOI: 10.1109/TAFFC.2014.2336244

Source DB: PubMed    Journal: IEEE Trans Affect Comput    ISSN: 1949-3045    Impact factor: 10.506


References (13 in total; first 10 shown)

1.  Supramodal representation of emotions.

Authors:  Martin Klasen; Charles A Kenworthy; Krystyna A Mathiak; Tilo T J Kircher; Klaus Mathiak
Journal:  J Neurosci       Date:  2011-09-21       Impact factor: 6.167

2.  [Review] Beyond emotion archetypes: databases for emotion modelling using neural networks.

Authors:  Roddy Cowie; Ellen Douglas-Cowie; Cate Cox
Journal:  Neural Netw       Date:  2005-05

3.  Incongruence effects in crossmodal emotional integration.

Authors:  Veronika I Müller; Ute Habel; Birgit Derntl; Frank Schneider; Karl Zilles; Bruce I Turetsky; Simon B Eickhoff
Journal:  Neuroimage       Date:  2010-10-23       Impact factor: 6.556

4.  Fundamental frequency of phonation and perceived emotional stress.

Authors:  A Protopapas; P Lieberman
Journal:  J Acoust Soc Am       Date:  1997-04       Impact factor: 1.840

5.  [Review] Multisensory emotions: perception, combination and underlying neural processes.

Authors:  Martin Klasen; Yu-Han Chen; Klaus Mathiak
Journal:  Rev Neurosci       Date:  2012       Impact factor: 4.353

6.  Analysis of the glottal excitation of emotionally styled and stressed speech.

Authors:  K E Cummings; M A Clements
Journal:  J Acoust Soc Am       Date:  1995-07       Impact factor: 1.840

7.  [Review] Toward the simulation of emotion in synthetic speech: a review of the literature on human vocal emotion.

Authors:  I R Murray; J L Arnott
Journal:  J Acoust Soc Am       Date:  1993-02       Impact factor: 1.840

8.  A video database of moving faces and people.

Authors:  Alice J O'Toole; Joshua Harms; Sarah L Snow; Dawn R Hurst; Matthew R Pappas; Janet H Ayyad; Hervé Abdi
Journal:  IEEE Trans Pattern Anal Mach Intell       Date:  2005-05       Impact factor: 6.226

9.  Audio-visual integration of emotion expression.

Authors:  Olivier Collignon; Simon Girard; Frederic Gosselin; Sylvain Roy; Dave Saint-Amour; Maryse Lassonde; Franco Lepore
Journal:  Brain Res       Date:  2008-04-20       Impact factor: 3.252

10.  Validation of affective and neutral sentence content for prosodic testing.

Authors:  Jeff B Russ; Ruben C Gur; Warren B Bilker
Journal:  Behav Res Methods       Date:  2008-11
Cited by (5 in total)

1.  The Sabancı University Dynamic Face Database (SUDFace): Development and validation of an audiovisual stimulus set of recited and free speeches with neutral facial expressions.

Authors:  Yağmur Damla Şentürk; Ebru Ecem Tavacioglu; İlker Duymaz; Bilge Sayim; Nihan Alp
Journal:  Behav Res Methods       Date:  2022-08-26

2.  Face-voice space: Integrating visual and auditory cues in judgments of person distinctiveness.

Authors:  Joshua R Tatz; Zehra F Peynircioğlu; William Brent
Journal:  Atten Percept Psychophys       Date:  2020-10       Impact factor: 2.199

3.  The Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS): A dynamic, multimodal set of facial and vocal expressions in North American English.

Authors:  Steven R Livingstone; Frank A Russo
Journal:  PLoS One       Date:  2018-05-16       Impact factor: 3.240

4.  Speech-Driven Facial Animations Improve Speech-in-Noise Comprehension of Humans.

Authors:  Enrico Varano; Konstantinos Vougioukas; Pingchuan Ma; Stavros Petridis; Maja Pantic; Tobias Reichenbach
Journal:  Front Neurosci       Date:  2022-01-05       Impact factor: 4.677

5.  Robust Multi-Scenario Speech-Based Emotion Recognition System.

Authors:  Fangfang Zhu-Zhou; Roberto Gil-Pita; Joaquín García-Gómez; Manuel Rosa-Zurera
Journal:  Sensors (Basel)       Date:  2022-03-18       Impact factor: 3.576

