
Multimodal Transformer for Unaligned Multimodal Language Sequences.

Yao-Hung Hubert Tsai, Shaojie Bai, Paul Pu Liang, J. Zico Kolter, Louis-Philippe Morency, Ruslan Salakhutdinov.

Abstract

Human language is often multimodal, comprising a mixture of natural language, facial gestures, and acoustic behaviors. However, modeling such multimodal human language time-series data poses two major challenges: 1) inherent data non-alignment due to variable sampling rates of the sequences from each modality; and 2) long-range dependencies between elements across modalities. In this paper, we introduce the Multimodal Transformer (MulT) to generically address both issues in an end-to-end manner without explicitly aligning the data. At the heart of our model is directional pairwise crossmodal attention, which attends to interactions between multimodal sequences across distinct time steps and latently adapts streams from one modality to another. Comprehensive experiments on both aligned and non-aligned multimodal time-series show that our model outperforms state-of-the-art methods by a large margin. In addition, empirical analysis suggests that the proposed crossmodal attention mechanism in MulT captures correlated crossmodal signals.
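
As a rough illustration of the directional pairwise crossmodal attention described in the abstract, the sketch below shows one direction of such a block in PyTorch: the target modality supplies the queries and the source modality supplies the keys and values, so the two streams can have different, unaligned lengths. The class name, layer-norm placement, head count, and dimensions are illustrative assumptions, not the paper's reference implementation.

import torch
import torch.nn as nn

class CrossmodalAttention(nn.Module):
    """One direction of a pairwise crossmodal attention block (sketch).

    Queries come from the target modality; keys and values come from the
    source modality, so the target stream is latently adapted by the
    source without any pre-alignment of time steps.
    """

    def __init__(self, dim, num_heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm_q = nn.LayerNorm(dim)
        self.norm_kv = nn.LayerNorm(dim)

    def forward(self, target, source):
        # target: (batch, T_target, dim); source: (batch, T_source, dim)
        # T_target and T_source may differ: attention handles the non-alignment.
        q = self.norm_q(target)
        kv = self.norm_kv(source)
        adapted, _ = self.attn(q, kv, kv)
        return target + adapted  # residual connection back to the target stream

# Usage: adapt a 50-step language stream with a 120-step audio stream.
block = CrossmodalAttention(dim=32)
language = torch.randn(2, 50, 32)
audio = torch.randn(2, 120, 32)
out = block(language, audio)  # shape: (2, 50, 32)

In the full MulT, blocks of this kind are stacked and applied for every ordered pair of modalities (hence "directional pairwise"), so each stream is latently adapted by every other before fusion.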


Year:  2019        PMID: 32362720      PMCID: PMC7195022          DOI: 10.18653/v1/p19-1656

Source DB:  PubMed          Journal:  Proc Conf Assoc Comput Linguist Meet        ISSN: 0736-587X


Citing articles: 16 in total

1.  Integrating Multimodal Information in Large Pretrained Transformers.

Authors:  Wasifur Rahman; Md Kamrul Hasan; Sangwu Lee; Amir Zadeh; Chengfeng Mao; Louis-Philippe Morency; Ehsan Hoque
Journal:  Proc Conf Assoc Comput Linguist Meet       Date:  2020-07

2.  Multimodal Routing: Improving Local and Global Interpretability of Multimodal Language Analysis.

Authors:  Yao-Hung Hubert Tsai; Martin Q Ma; Muqiao Yang; Ruslan Salakhutdinov; Louis-Philippe Morency
Journal:  Proc Conf Empir Methods Nat Lang Process       Date:  2020-11

3.  CMU-MOSEAS: A Multimodal Language Dataset for Spanish, Portuguese, German and French.

Authors:  Amir Zadeh; Yan Sheng Cao; Simon Hessner; Paul Pu Liang; Soujanya Poria; Louis-Philippe Morency
Journal:  Proc Conf Empir Methods Nat Lang Process       Date:  2020-11

4.  Human-Guided Modality Informativeness for Affective States.

Authors:  Torsten Wörtwein; Lisa B Sheeber; Nicholas Allen; Jeffrey F Cohn; Louis-Philippe Morency
Journal:  Proc ACM Int Conf Multimodal Interact       Date:  2021-10

5.  Multimodal Sentiment Analysis Based on Cross-Modal Attention and Gated Cyclic Hierarchical Fusion Networks.

Authors:  Zhibang Quan; Tao Sun; Mengli Su; Jishu Wei
Journal:  Comput Intell Neurosci       Date:  2022-08-09

6.  A Survey of Challenges and Opportunities in Sensing and Analytics for Risk Factors of Cardiovascular Disorders.

Authors:  Nathan C Hurley; Erica S Spatz; Harlan M Krumholz; Roozbeh Jafari; Bobak J Mortazavi
Journal:  ACM Trans Comput Healthc       Date:  2020-12-30

7.  Cross-Modal Sentiment Sensing with Visual-Augmented Representation and Diverse Decision Fusion.

Authors:  Sun Zhang; Bo Li; Chunyong Yin
Journal:  Sensors (Basel)       Date:  2021-12-23       Impact factor: 3.576

8.  STonKGs: A Sophisticated Transformer Trained on Biomedical Text and Knowledge Graphs.

Authors:  Helena Balabin; Charles Tapley Hoyt; Colin Birkenbihl; Benjamin M Gyori; John Bachman; Alpha Tom Kodamullil; Paul G Plöger; Martin Hofmann-Apitius; Daniel Domingo-Fernández
Journal:  Bioinformatics       Date:  2022-01-05       Impact factor: 6.937

9.  Decoding EEG Brain Activity for Multi-Modal Natural Language Processing.

Authors:  Nora Hollenstein; Cedric Renggli; Benjamin Glaus; Maria Barrett; Marius Troendle; Nicolas Langer; Ce Zhang
Journal:  Front Hum Neurosci       Date:  2021-07-13       Impact factor: 3.169

10.  Pre-training Model Based on Parallel Cross-Modality Fusion Layer.

Authors:  Xuewei Li; Dezhi Han; Chin-Chen Chang
Journal:  PLoS One       Date:  2022-02-03       Impact factor: 3.240

