Speech Intention Classification with Multimodal Deep Learning.



Yue Gu¹, Xinyu Li¹, Shuhong Chen¹, Jianyu Zhang¹, Ivan Marsic¹.

Abstract

We present a novel multimodal deep learning structure that automatically extracts features from textual-acoustic data for sentence-level speech classification. Textual and acoustic features are first extracted by two independent convolutional neural networks, then combined into a joint representation, and finally fed into a softmax decision layer. We tested the proposed model in an actual medical setting, using speech recordings and their transcripts. The model achieved 83.10% average accuracy in detecting six different intentions. We also found that our model, which uses automatically extracted features for intention classification, outperformed existing models that use manually engineered features.
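The pipeline described in the abstract (two independent convolutional branches, a joint representation, a softmax decision layer) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the abstract does not say how the two feature sets are combined, so concatenation is an assumption here, and all dimensions, filter counts, the random untrained weights, and the function names (`conv1d_maxpool`, `classify`, `random_filters`) are hypothetical choices made only for illustration.

```python
import math
import random


def conv1d_maxpool(seq, filters):
    """Slide each filter over a sequence of vectors, apply ReLU, max-pool over time."""
    features = []
    for filt in filters:  # filt: list of `width` weight vectors
        width = len(filt)
        best = 0.0  # starting at 0 makes the max equivalent to ReLU + max-pooling
        for start in range(len(seq) - width + 1):
            window = seq[start:start + width]
            act = sum(w * x
                      for w_vec, x_vec in zip(filt, window)
                      for w, x in zip(w_vec, x_vec))
            best = max(best, act)
        features.append(best)
    return features


def softmax(logits):
    """Numerically stable softmax."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def random_filters(rng, n_filters, width, dim):
    """Hypothetical untrained filters; a real model would learn these."""
    return [[[rng.uniform(-0.5, 0.5) for _ in range(dim)]
             for _ in range(width)]
            for _ in range(n_filters)]


def classify(text_seq, audio_seq, text_filters, audio_filters, weights, biases):
    text_feat = conv1d_maxpool(text_seq, text_filters)     # textual branch
    audio_feat = conv1d_maxpool(audio_seq, audio_filters)  # acoustic branch
    joint = text_feat + audio_feat  # assumed fusion: simple concatenation
    logits = [sum(w * f for w, f in zip(row, joint)) + b
              for row, b in zip(weights, biases)]
    return softmax(logits)


rng = random.Random(0)
NUM_CLASSES = 6  # the abstract reports six intention classes
text_filters = random_filters(rng, n_filters=4, width=3, dim=8)
audio_filters = random_filters(rng, n_filters=4, width=5, dim=5)
weights = [[rng.uniform(-0.5, 0.5) for _ in range(8)] for _ in range(NUM_CLASSES)]
biases = [0.0] * NUM_CLASSES

# Toy inputs: 10 word embeddings (dim 8) and 40 acoustic frames (dim 5).
text_seq = [[rng.gauss(0, 1) for _ in range(8)] for _ in range(10)]
audio_seq = [[rng.gauss(0, 1) for _ in range(5)] for _ in range(40)]

probs = classify(text_seq, audio_seq, text_filters, audio_filters, weights, biases)
predicted = max(range(NUM_CLASSES), key=probs.__getitem__)
```

With untrained weights the predicted class is meaningless; the sketch only shows how data flows from the two modality-specific branches through the joint representation to a six-way probability distribution.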


Keywords: Convolutional neural network; Multimodal intention classification; Textual-acoustic feature representation; Trauma resuscitation

Year: 2017 PMID: 30506055 PMCID: PMC6261374 DOI: 10.1007/978-3-319-57351-9_30

Source DB: PubMed Journal: Adv Artif Intell

Cited by (3 in total):


1. Multimodal Attention Network for Trauma Activity Recognition from Spoken Language and Environmental Sound.

2. Deep Learning-Based Classification of Spoken English Digits.

3. Towards Aircraft Maintenance Metaverse Using Speech Interactions with Virtual Objects in Mixed Reality.