CMU-MOSEAS: A Multimodal Language Dataset for Spanish, Portuguese, German and French.

Amir Zadeh1, Yan Sheng Cao2, Simon Hessner1, Paul Pu Liang3, Soujanya Poria4, Louis-Philippe Morency1.   

Abstract

Modeling multimodal language is a core research area in natural language processing. While languages such as English have relatively large multimodal language resources, other widely spoken languages across the globe have few or no large-scale datasets in this area. This disproportionately affects native speakers of languages other than English. As a step towards building more equitable and inclusive multimodal systems, we introduce the first large-scale multimodal language dataset for Spanish, Portuguese, German and French. The proposed dataset, called CMU-MOSEAS (CMU Multimodal Opinion Sentiment, Emotions and Attributes), is the largest of its kind with 40,000 total labelled sentences. It covers a diverse set of topics and speakers, and carries supervision of 20 labels including sentiment (and subjectivity), emotions, and attributes. Our evaluations on a state-of-the-art multimodal model demonstrate that CMU-MOSEAS enables further research for multilingual studies in multimodal language.
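To make the dataset description concrete, here is a minimal sketch of how one labelled sentence in a CMU-MOSEAS-style corpus could be represented: word-level text tokens, frame-level acoustic and visual feature streams, and a dictionary of per-sentence labels (sentiment, subjectivity, emotions, attributes). The class name, field layout, and label names here are illustrative assumptions, not the actual CMU-MOSEAS release format or API.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical record for one labelled sentence in a multimodal corpus
# of the kind described in the abstract (three modality streams plus
# up to 20 per-sentence labels). Not the official CMU-MOSEAS schema.
@dataclass
class MultimodalSentence:
    text: List[str]                      # word-level transcript tokens
    acoustic: List[List[float]]          # per-frame acoustic feature vectors
    visual: List[List[float]]            # per-frame visual feature vectors
    labels: Dict[str, float] = field(default_factory=dict)  # e.g. {"sentiment": 1.5}

    def is_fully_labelled(self, label_names: List[str]) -> bool:
        """True if every expected label (e.g. all 20 dataset labels) is present."""
        return all(name in self.labels for name in label_names)

example = MultimodalSentence(
    text=["hola", "mundo"],
    acoustic=[[0.1, 0.2], [0.3, 0.4]],
    visual=[[1.0], [0.5]],
    labels={"sentiment": 1.5, "subjectivity": 1.0},
)
print(example.is_fully_labelled(["sentiment", "subjectivity"]))  # True
```

A structure like this makes it easy to check label coverage across the corpus before training a multimodal model on any subset of the 20 supervision signals.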

Year:  2020        PMID: 33969362      PMCID: PMC8106386          DOI: 10.18653/v1/2020.emnlp-main.141

Source DB:  PubMed          Journal:  Proc Conf Empir Methods Nat Lang Process
