Xinqi Bao, Yujia Xu, Ernest Nlandu Kamavuako.
Abstract
Deep learning techniques are the future trend for designing heart sound classification methods, making conventional heart sound segmentation dispensable. However, although a fixed signal duration is commonly used for training, no study has assessed its effect on the final performance in detail. Therefore, this study aims to analyse the effect of input duration on commonly used deep learning methods, to provide insight for future studies in data processing, classifier, and feature selection. The results revealed that (1) a very short heart sound signal duration (1 s) weakens the performance of Recurrent Neural Networks (RNNs), whereas no apparent decrease was found in the tested Convolutional Neural Network (CNN) model; (2) the RNNs outperformed the CNN using Mel-frequency cepstral coefficients (MFCCs) as features, with no difference between the RNN models (LSTM, BiLSTM, GRU, or BiGRU); (3) adding dynamic information (∆ and ∆²MFCCs) of the heart sound as a feature did not improve the RNNs' performance, and the improvement on the CNN was also minimal (≤2.5% in MAcc). These findings provide a theoretical basis for selecting the input length in further heart sound classification studies using deep learning techniques.
Keywords: convolutional neural network (CNN); deep learning (DL); heart sound; recurrent neural networks (RNNs)
Year: 2022 PMID: 35336432 PMCID: PMC8951308 DOI: 10.3390/s22062261
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.576
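The study's central variable is the fixed duration of the input window cut from each PCG recording. A minimal NumPy sketch of fixed-length (optionally overlapping) windowing; the 2000 Hz sampling rate and the window/shift values here are illustrative assumptions, not values taken from the paper:

```python
import numpy as np

def segment_fixed_length(signal, fs, duration_s, shift_s=None):
    """Split a 1-D PCG signal into fixed-duration windows.

    shift_s defaults to duration_s (non-overlapping windows)."""
    win = int(round(duration_s * fs))
    hop = int(round((shift_s if shift_s else duration_s) * fs))
    starts = range(0, len(signal) - win + 1, hop)
    return np.stack([signal[s:s + win] for s in starts])

# Example: a 10 s recording at 2000 Hz cut into 5 s windows
fs = 2000
x = np.random.randn(10 * fs)
segs = segment_fixed_length(x, fs, duration_s=5.0)
print(segs.shape)  # (2, 10000)
```

With a shift shorter than the window (e.g. 3 s windows, 1 s shift, as in the Xiao et al. row below), the same recording yields more, overlapping training examples.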
Figure 1. Visualisation of the heart sound signal with its component locations.
Heart Sound Components and their properties.
| Heart Sound | S1 | S2 | S3 | S4 |
|---|---|---|---|---|
| Duration (ms) | 100–160 | 80–140 | 40–50 | 40–50 |
| Frequency (Hz) | 30–50 | 40–70 | <30 | <20 |
| Occurrence | Sound of mitral and tricuspid valve closure | Sound of aortic and pulmonic valve closure | The sound caused by an increase in ventricular blood volume | Sound of an atrial gallop produced by blood being forced into a stiff ventricle |
Recent advancements in heart anomaly detection using deep learning.
| Authors | Year | Segmented (Input Length) | Features | Model | Accuracy (%) |
|---|---|---|---|---|---|
| Huai et al. | 2020 | No (5 s intervals, 2 s window) | Time-Frequency (Spectrogram) | CNN + LSTM | 91.06 |
| Deng et al. | 2020 | No (5 s) | MFCCs, ΔMFCC, Δ²MFCC | CNN + RNN | 98.34 |
| Xiao et al. | 2020 | No (3 s length, 1 s shift) | Raw signals, MFCCs, PSDs | CNN | 93 |
| Dissanayake et al. | 2020 | No (1 s, 0.1 s shift) | MFCCs | LSTM, CNN | 99.72 |
| Zhang et al. | 2019 | No (2 s) | Temporal Quasi-Periodic Features | LSTM | 94.66 |
| Latif et al. | 2018 | Yes (2, 5, and 8 cycles) | MFCCs | RNNs | 98.61 |
| Maknickas and Maknickas | 2017 | No (128 × 128 frames) | MFCCs | CNN | 84.15 |
| Rubin et al. | 2017 | Yes (3 s) | Time-Frequency, MFCCs | CNN | 84 |
Figure 2. Graphical structure of (a) LSTM unit; (b) GRU; (c) BRNN.
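The GRU unit in Figure 2 can be written out directly. A minimal NumPy forward pass of one GRU step using the standard update/reset gating equations; the input size (13, as for one MFCC frame), hidden size, and random initialisation are illustrative assumptions:

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def gru_step(x, h, params):
    """One GRU step: update gate z, reset gate r, candidate state h_tilde."""
    Wz, Uz, bz, Wr, Ur, br, Wh, Uh, bh = params
    z = sigmoid(Wz @ x + Uz @ h + bz)               # update gate
    r = sigmoid(Wr @ x + Ur @ h + br)               # reset gate
    h_tilde = np.tanh(Wh @ x + Uh @ (r * h) + bh)   # candidate state
    return (1.0 - z) * h + z * h_tilde              # interpolated new state

rng = np.random.default_rng(0)
d_in, d_h = 13, 8  # e.g. 13 MFCCs per frame, hidden size 8 (illustrative)
params = [rng.standard_normal(s) * 0.1
          for s in [(d_h, d_in), (d_h, d_h), (d_h,)] * 3]
h = np.zeros(d_h)
for x in rng.standard_normal((5, d_in)):  # run 5 frames through the cell
    h = gru_step(x, h, params)
print(h.shape)  # (8,)
```

The bidirectional variants (BiGRU, BiLSTM) run one such cell forward and a second cell backward over the frame sequence, then concatenate the two hidden states.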
Figure 3. (a) MFCCs, (c) ∆MFCCs, (e) ∆²MFCCs for a normal heart sound recording; (b) MFCCs, (d) ∆MFCCs, (f) ∆²MFCCs for an abnormal heart sound recording.
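The ∆ and ∆² features in Figure 3 are frame-wise local regressions over the MFCC trajectory. A NumPy sketch of the standard delta formula, ∆cₜ = Σₙ n·(cₜ₊ₙ − cₜ₋ₙ) / (2·Σₙ n²); the half-window N = 2 is a common default and an assumption here, not a parameter stated in the paper:

```python
import numpy as np

def delta(feats, N=2):
    """Delta features along the time axis (axis 0), edge-padded."""
    denom = 2 * sum(n * n for n in range((1), N + 1))
    padded = np.pad(feats, ((N, N), (0, 0)), mode="edge")
    T = feats.shape[0]
    out = np.zeros_like(feats, dtype=float)
    for n in range(1, N + 1):
        out += n * (padded[N + n:N + n + T] - padded[N - n:N - n + T])
    return out / denom

# Toy MFCC matrix: 10 frames x 13 coefficients, rising 1 unit per frame
mfcc = np.linspace(0, 9, 10)[:, None] * np.ones((1, 13))
d1 = delta(mfcc)   # ∆MFCCs: ~1 per frame away from the edges
d2 = delta(d1)     # ∆²MFCCs: ~0 away from the edges
```

Stacking `mfcc`, `d1`, and `d2` along the coefficient axis gives the "MFCCs + deltas + delta-deltas" input compared in Figure 6.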
Figure 4. The deep learning network models used in the study: a 3-layer CNN model and 2-layer LSTM, BiLSTM, GRU, and BiGRU models.
Figure 5. The proposed models' performance (average of 10 runs) with different input PCG signal lengths.
Comparison between the deep learning models with 5 s PCG input (10 times, average ± standard deviation %).
| Model | Accuracy (%) | Specificity (%) | Sensitivity (%) | MAcc (%) |
|---|---|---|---|---|
| LSTM | 91.86 ± 1.20 | | 81.75 ± 5.95 | 88.58 ± 2.44 |
| BiLSTM | | 95.14 ± 1.33 | | |
| GRU | 92.13 ± 0.44 | 95.22 ± 1.37 | 83.01 ± 2.69 | 89.12 ± 0.87 |
| BiGRU | 92.35 ± 0.72 | 95.31 ± 1.87 | 83.24 ± 3.46 | 89.27 ± 1.04 |
| CNN | 90.08 ± 1.22 | 93.80 ± 2.46 | 79.02 ± 4.57 | 86.41 ± 1.70 |
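MAcc is the mean of sensitivity and specificity, the balanced score used in the PhysioNet/CinC 2016 challenge; a one-line sanity check against the GRU row (95.22 and 83.01 average to the reported 89.12):

```python
def macc(sensitivity, specificity):
    """Mean accuracy: balanced average of Se and Sp (in %)."""
    return (sensitivity + specificity) / 2.0

# GRU row: mean of 83.01 and 95.22 gives 89.115, i.e. the table's 89.12
print(macc(83.01, 95.22))
```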
Figure 6. Comparison between using MFCCs only and using MFCCs with their deltas and delta-deltas on BiLSTM and CNN.