Acoustic scene classification based on three-dimensional multi-channel feature-correlated deep learning networks.

Yuanyuan Qu, Xuesheng Li, Zhiliang Qin, Qidong Lu.

Abstract

As an effective approach to perceiving environments, acoustic scene classification (ASC) has received considerable attention in the past few years. ASC is generally considered a challenging task because of the subtle differences between classes of environmental sounds. In this paper, we propose a novel approach that performs accurate classification by aggregating spatial-temporal features extracted from a multi-branch three-dimensional (3D) convolutional neural network (CNN) model. The contributions of this paper are as follows. First, we form multiple frequency-domain representations of signals by fully utilizing expert knowledge of acoustics and discrete wavelet transforms (DWT). Second, we propose a novel 3D CNN architecture featuring residual connections and squeeze-and-excitation attention (3D-SE-ResNet) to effectively capture both the long-term and short-term correlations inherent in environmental sounds. Third, an auxiliary supervised branch based on the chromatogram of the original signal is incorporated into the proposed architecture to alleviate the risk of overfitting by providing supplementary information to the model. The performance of the proposed multi-input multi-feature 3D-CNN architecture is numerically evaluated on a large-scale dataset from the 2019 IEEE AASP Challenge on Detection and Classification of Acoustic Scenes and Events (DCASE 2019) and is shown to obtain noticeable performance gains over state-of-the-art methods in the literature.
© 2022. The Author(s).
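
The abstract's first contribution, forming several frequency-domain representations of a signal with the DWT, can be illustrated with a minimal sketch. It is not the authors' pipeline: the abstract does not name a wavelet family or decomposition depth, so the Haar wavelet, the three levels, the toy input frame, and the function names `haar_dwt` and `haar_wavedec` are all assumptions made for illustration.

```python
import math


def haar_dwt(signal):
    """Single-level Haar DWT (assumed wavelet; the abstract names none).

    Returns (approximation, detail) coefficient lists, each half the
    length of the input.
    """
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    scale = 1.0 / math.sqrt(2.0)
    # Low-pass branch: scaled pairwise sums capture the coarse trend.
    approx = [(signal[i] + signal[i + 1]) * scale for i in range(0, len(signal), 2)]
    # High-pass branch: scaled pairwise differences capture fine detail.
    detail = [(signal[i] - signal[i + 1]) * scale for i in range(0, len(signal), 2)]
    return approx, detail


def haar_wavedec(signal, levels):
    """Multi-level decomposition, returned as [approx_L, detail_L, ..., detail_1]."""
    approx = list(signal)
    details = []
    for _ in range(levels):
        approx, detail = haar_dwt(approx)
        details.append(detail)
    return [approx] + details[::-1]


# Hypothetical 64-sample frame: a slow tone plus a weaker fast tone.
frame = [
    math.sin(2 * math.pi * 0.05 * n) + 0.3 * math.sin(2 * math.pi * 0.4 * n)
    for n in range(64)
]

# Three levels give four sub-bands, from coarsest to finest.
bands = haar_wavedec(frame, levels=3)
```

Each entry of `bands` covers a different frequency range of the same frame, which is the sense in which a wavelet decomposition yields multiple representations that could feed separate input branches. Because the Haar transform is orthonormal, the total energy of the coefficients equals that of the input frame.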

Year:  2022        PMID: 35962021      PMCID: PMC9374676          DOI: 10.1038/s41598-022-17863-z

Source DB:  PubMed          Journal:  Sci Rep        ISSN: 2045-2322            Impact factor:   4.996


