Yue Gu, Shuhong Chen, Ivan Marsic.
Abstract
In this paper, we present a novel deep multimodal framework to predict human emotions based on sentence-level spoken language. Our architecture has two distinctive characteristics. First, it extracts the high-level features from both text and audio via a hybrid deep multimodal structure, which considers the spatial information from text, temporal information from audio, and high-level associations from low-level handcrafted features. Second, we fuse all features by using a three-layer deep neural network to learn the correlations across modalities and train the feature extraction and fusion modules together, allowing optimal global fine-tuning of the entire structure. We evaluated the proposed framework on the IEMOCAP dataset. Our result shows promising performance, achieving 60.4% in weighted accuracy for five emotion categories.Entities:
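The abstract outlines a text branch capturing spatial patterns, an audio branch capturing temporal structure, a projection of handcrafted features, and a three-layer fusion network trained jointly. The PyTorch snippet below is a minimal sketch of such an architecture, not the authors' implementation; all layer sizes, the choice of a 1-D convolution for text and an LSTM for audio, and the input dimensions are illustrative assumptions.

```python
# Hypothetical sketch of the described architecture (not the authors' code):
# a text CNN branch, an audio LSTM branch, a dense branch for handcrafted
# features, and a three-layer DNN fusion head trained jointly end to end.
import torch
import torch.nn as nn

class MultimodalEmotionNet(nn.Module):
    def __init__(self, vocab_size=10000, embed_dim=100, audio_dim=40,
                 handcrafted_dim=30, hidden=128, num_classes=5):
        super().__init__()
        # Text branch: embedding + 1-D convolution captures local (spatial)
        # patterns over the word sequence.
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.text_conv = nn.Conv1d(embed_dim, hidden, kernel_size=3, padding=1)
        # Audio branch: LSTM models the temporal structure of frame features.
        self.audio_lstm = nn.LSTM(audio_dim, hidden, batch_first=True)
        # Handcrafted low-level features projected to high-level associations.
        self.hand_fc = nn.Linear(handcrafted_dim, hidden)
        # Three-layer DNN fusion over the concatenated modality features.
        self.fusion = nn.Sequential(
            nn.Linear(3 * hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, num_classes),
        )

    def forward(self, tokens, audio_frames, handcrafted):
        t = self.embed(tokens).transpose(1, 2)                # (B, E, T_text)
        t = torch.relu(self.text_conv(t)).max(dim=2).values   # pool over time
        _, (a, _) = self.audio_lstm(audio_frames)              # last hidden state
        a = a.squeeze(0)
        h = torch.relu(self.hand_fc(handcrafted))
        return self.fusion(torch.cat([t, a, h], dim=1))        # 5-class logits

# Joint training: one loss backpropagates through the fusion head and both
# feature extractors, giving the global fine-tuning described in the abstract.
model = MultimodalEmotionNet()
logits = model(torch.randint(0, 10000, (4, 20)),   # token ids
               torch.randn(4, 150, 40),            # audio frames
               torch.randn(4, 30))                 # handcrafted features
loss = nn.CrossEntropyLoss()(logits, torch.randint(0, 5, (4,)))
loss.backward()
```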
Keywords: Emotion recognition; deep multimodal learning; spoken language
Year: 2018 PMID: 30505240 PMCID: PMC6261381 DOI: 10.1109/ICASSP.2018.8462440
Source DB: PubMed Journal: Proc IEEE Int Conf Acoust Speech Signal Process ISSN: 1520-6149