Speech emotion recognition based on transfer learning from the FaceNet framework.

Shuhua Liu¹, Mengyu Zhang¹, Ming Fang¹, Jianwei Zhao¹, Kun Hou¹, Chih-Cheng Hung².

Abstract

Speech plays an important role in human-computer emotional interaction. FaceNet, used in face recognition, has achieved great success owing to its excellent feature extraction. In this study, we adopt the FaceNet model and improve it for speech emotion recognition. To apply the model, speech signals are divided into segments at a given time interval, and each segment is transformed into a discrete waveform diagram and a spectrogram. The waveform and spectrogram are then fed separately into FaceNet for end-to-end training. Our empirical study shows that pretraining on spectrograms is effective for FaceNet. Hence, we pretrain the network on the CASIA dataset and then fine-tune it on the IEMOCAP dataset with waveforms. Pretraining on CASIA is intended to transfer as much knowledge as possible, because the model reaches high accuracy on that dataset, possibly owing to its clean signals. Our preliminary experimental results show accuracies of 68.96% and 90% on the emotion benchmark datasets IEMOCAP and CASIA, respectively. Cross-training is then conducted on the dataset, and comprehensive experiments are performed. The experimental results indicate that the proposed approach outperforms state-of-the-art methods on the IEMOCAP dataset among single-modal approaches.
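The segmentation and spectrogram step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not give the segment length, window size, hop size, or how the images are rendered, so `segment_seconds`, `frame_len`, `hop`, and both function names are assumptions chosen only for illustration. The FaceNet network itself is not shown.

```python
import numpy as np


def split_into_segments(signal, sample_rate, segment_seconds=2.0):
    """Cut a 1-D signal into equal-length, non-overlapping segments.

    The 2-second default is an assumption; samples that do not fill a
    whole segment at the end are dropped.
    """
    seg_len = int(round(segment_seconds * sample_rate))
    n_segments = len(signal) // seg_len
    return signal[: n_segments * seg_len].reshape(n_segments, seg_len)


def log_spectrogram(segment, frame_len=400, hop=160):
    """Log-magnitude spectrogram from a Hann-windowed short-time FFT.

    frame_len and hop (25 ms and 10 ms at 16 kHz) are assumed values.
    Returns an array of shape (frequency bins, time frames).
    """
    n_frames = 1 + (len(segment) - frame_len) // hop
    window = np.hanning(frame_len)
    frames = np.stack(
        [segment[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    return np.log1p(magnitude).T


# Synthetic stand-in for a speech recording: 5 s of a 440 Hz tone at 16 kHz.
sample_rate = 16000
t = np.arange(5 * sample_rate) / sample_rate
signal = np.sin(2 * np.pi * 440 * t)

segments = split_into_segments(signal, sample_rate)
spec = log_spectrogram(segments[0])
```

Each row of `segments` is one fixed-length waveform segment, and `spec` is the matching time-frequency image. In a pipeline like the one the abstract outlines, both representations would be converted to the input size the network expects before training.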

Year: 2021 PMID: 33639796 DOI: 10.1121/10.0003530

Source DB: PubMed Journal: J Acoust Soc Am ISSN: 0001-4966 Impact factor: 1.840

Cited

4 in total

1. Design of Aging Smart Home Products Based on Radial Basis Function Speech Emotion Recognition.

Authors: Xu Wu; Qian Zhang
Journal: Front Psychol Date: 2022-05-04

2. Emotional Speech Recognition Using Deep Neural Networks.

Authors: Loan Trinh Van; Thuy Dao Thi Le; Thanh Le Xuan; Eric Castelli
Journal: Sensors (Basel) Date: 2022-02-12 Impact factor: 3.576

3. Enterprise Strategic Management From the Perspective of Business Ecosystem Construction Based on Multimodal Emotion Recognition.

Authors: Wei Bi; Yongzhen Xie; Zheng Dong; Hongshen Li
Journal: Front Psychol Date: 2022-03-03

4. Design of Association Application System of Face Recognition and Test-Tube Barcode Based on CNN.

Authors: Zhangning Zhou; He Shi; Xuemin Niu
Journal: Comput Math Methods Med Date: 2022-08-24 Impact factor: 2.809
