CREMA-D: Crowd-sourced Emotional Multimodal Actors Dataset
Houwei Cao, David G. Cooper, Michael K. Keutmann, Ruben C. Gur, Ani Nenkova, Ragini Verma.
Abstract
People convey their emotional state in their face and voice. We present an audio-visual dataset uniquely suited for the study of multi-modal emotion expression and perception. The dataset consists of facial and vocal emotional expressions in sentences spoken in a range of basic emotional states (happy, sad, anger, fear, disgust, and neutral). 7,442 clips of 91 actors with diverse ethnic backgrounds were rated by multiple raters in three modalities: audio, visual, and audio-visual. Categorical emotion labels and real-valued intensity ratings for the perceived emotion were collected through crowd-sourcing from 2,443 raters. Human recognition rates of the intended emotion for the audio-only, visual-only, and audio-visual data are 40.9%, 58.2%, and 63.6%, respectively. Recognition rates are highest for neutral, followed by happy, anger, disgust, fear, and sad. Average intensity levels of emotion are rated highest for visual-only perception. Accurate recognition of disgust and fear requires simultaneous audio-visual cues, while anger and happiness can be recognized well from a single modality. This large dataset can be used to probe other questions concerning the audio-visual perception of emotion.
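The headline figures (40.9%, 58.2%, 63.6%) are per-modality human recognition accuracies: the fraction of crowd-sourced ratings whose perceived emotion category matches the actor's intended emotion. A minimal sketch of that computation follows, assuming a hypothetical flat ratings file with columns rater_id, clip_id, modality, intended, and perceived; the actual CREMA-D release organizes its labels differently.

```python
import csv
from collections import defaultdict

def recognition_rates(path):
    """Per-modality recognition accuracy: the fraction of ratings
    whose perceived category equals the actor's intended emotion.
    Column names are hypothetical, not the CREMA-D release schema."""
    hits = defaultdict(int)    # matching ratings per modality
    totals = defaultdict(int)  # all ratings per modality
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            m = row["modality"]  # "audio", "visual", or "audio-visual"
            totals[m] += 1
            if row["perceived"] == row["intended"]:
                hits[m] += 1
    return {m: hits[m] / totals[m] for m in totals}

if __name__ == "__main__":
    for modality, acc in recognition_rates("ratings.csv").items():
        print(f"{modality}: {acc:.1%}")
```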
Keywords: Emotional corpora; facial expression; multi-modal recognition; voice expression
Year: 2014 | PMID: 25653738 | PMCID: PMC4313618 | DOI: 10.1109/TAFFC.2014.2336244
Source DB: PubMed | Journal: IEEE Trans Affect Comput | ISSN: 1949-3045 | Impact factor: 10.506