Jou-Kou Wang, Yun-Fan Chang, Kun-Hsi Tsai, Wei-Chien Wang, Chang-Yen Tsai, Chui-Hsuan Cheng, Yu Tsao.
Abstract
Recognizing specific heart sound patterns is important for the diagnosis of structural heart diseases. However, the correct recognition of heart murmurs depends largely on clinical experience, and accurately identifying abnormal heart sound patterns is challenging for young and inexperienced clinicians. This study aimed to develop a novel algorithm that automatically recognizes systolic murmurs in patients with ventricular septal defects (VSDs). Heart sounds were obtained from 51 subjects with VSDs and 25 subjects without a significant heart malformation. The soundtracks were then divided into training and testing sets to establish the recognition system and evaluate its performance. The automatic murmur recognition system was based on a novel temporal attentive pooling-convolutional recurrent neural network (TAP-CRNN) model. On test data comprising 178 VSD heart sounds and 60 normal heart sounds, the system obtained a sensitivity of 96.0% and a specificity of 96.7%. For heart sounds recorded in the second aortic and tricuspid areas, both the sensitivity and specificity were 100%. We demonstrated that the proposed TAP-CRNN system can accurately recognize the systolic murmurs of VSD patients, showing promising potential for the development of software for classifying the heart murmurs of several other structural heart diseases.
Year: 2020 PMID: 33311565 PMCID: PMC7732853 DOI: 10.1038/s41598-020-77994-z
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Basic information of the subjects in this study.
| Variables | VSD group (N = 51) | Normal group (N = 25) |
|---|---|---|
| Age (years) | 22.12 ± 16.96 (min: 2; max: 65) | 29.30 ± 18.67 (min: 4.3; max: 65) |
| Male [n; (%)] | 30 (58.82%) | 14 (56%) |
| Female [n; (%)] | 21 (41.18%) | 11 (44%) |
| Height (cm) | 147.23 ± 27.86 | 155.04 ± 23.91 |
| Weight (kg) | 46.75 ± 22.70 | 55.82 ± 24.63 |
Details of the VSD types included in this study.
| VSD types (N = 51) | Case number (%) |
|---|---|
| Type I: infundibular, outlet | 2 (3.92%) |
| Type II: perimembranous | 42 (82.35%) |
| Type III: inlet, atrioventricular | 0 |
| Type IV: muscular, trabecular | 5 (9.80%) |
| Unknown | 2 (3.92%) |
| VSD size* (N = 51) | Case number (%) |
| Small | 30 (58.82%) |
| Medium | 13 (25.49%) |
| Large | 4 (7.84%) |
| Unknown | 4 (7.84%) |
*Small VSD: Qp/Qs < 1.5; medium VSD: 1.5 ≤ Qp/Qs < 2; large VSD: Qp/Qs ≥ 2, where Qp is the pulmonary blood flow and Qs is the systemic blood flow.
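The size thresholds in the footnote can be written as a small function. This is only an illustrative sketch: the function name `classify_vsd_size` is hypothetical and does not come from the study, and the rejection of non-positive ratios is an added assumption.

```python
def classify_vsd_size(qp_qs: float) -> str:
    """Map a Qp/Qs ratio to the size category defined in the table footnote.

    Qp is the pulmonary blood flow and Qs is the systemic blood flow.
    The function name is hypothetical; only the thresholds come from the source.
    """
    if qp_qs <= 0:
        # Assumption: a flow ratio must be positive to be meaningful.
        raise ValueError("Qp/Qs must be positive")
    if qp_qs < 1.5:
        return "small"
    if qp_qs < 2.0:
        return "medium"
    return "large"


if __name__ == "__main__":
    for ratio in (1.2, 1.5, 2.0):
        print(ratio, classify_vsd_size(ratio))
```

The boundaries are half-open, so a ratio of exactly 1.5 counts as medium and a ratio of exactly 2 counts as large, matching the inequalities in the footnote.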
Number of subjects and heart sound recordings in this study.
| Variables | VSD group | Normal group |
|---|---|---|
| Number of subjects | 51 | 25 |
| Number of sound recordings | 525 | 251 |
Figure 1. The structure of the TAP-CRNN model. In the first step, the short-time Fourier transform (STFT) converts the phonocardiogram (PCG) signals into spectral features. In the second step, a CNN extracts invariant spatial–temporal representations from the spectral features. In the third step, an RNN extracts long temporal-context information from these representations for classification. In the fourth step, TAP assigns an importance weight to each frame in the systolic regions. STFT: short-time Fourier transform; LSTM: long short-term memory; TAP: temporal attentive pooling.
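The final step in the caption, weighting frames by importance before pooling, can be illustrated with a generic softmax attention pooling. This is a simplified sketch and not the authors' TAP formulation: the `query` vector stands in for learned parameters, and the frame features are made-up numbers.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attentive_pool(frames, query):
    """Pool T frame vectors into one vector using attention weights.

    frames: list of T feature vectors (e.g. per-frame RNN outputs).
    query:  hypothetical learned vector that scores each frame.
    Returns the pooled vector and the per-frame weights.
    """
    # Score each frame by its dot product with the query vector.
    scores = [sum(f * q for f, q in zip(frame, query)) for frame in frames]
    weights = softmax(scores)
    dim = len(frames[0])
    # Weighted sum of the frames, one feature dimension at a time.
    pooled = [sum(w * frame[d] for w, frame in zip(weights, frames))
              for d in range(dim)]
    return pooled, weights


# Made-up example: the middle frame has the strongest response.
frames = [[0.1, 0.0], [2.0, 1.0], [0.2, 0.1]]
query = [1.0, 1.0]
pooled, weights = attentive_pool(frames, query)
```

Compared with plain average pooling, the weights let frames that carry the murmur contribute more to the pooled representation than silent or uninformative frames.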
Results of testing the algorithm’s ability to distinguish systolic murmurs from normal heart sounds.
| Model | Accuracy | Sensitivity | Specificity | PPV | NPV |
|---|---|---|---|---|---|
| CNN | 87.0% | 87.6% | 85.0% | 94.5% | 69.9% |
| CRNN | 92.0% | 91.6% | 93.3% | 97.6% | 78.9% |
| TAP-CRNN | 97.1% | 96.6% | 98.3% | 99.4% | 90.1% |
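The five columns of this table all derive from a 2×2 confusion matrix. The sketch below shows the standard definitions, using the CNN row as a worked example. The counts (156 true positives, 22 false negatives, 51 true negatives, 9 false positives) are not reported in the source; they are inferred here from the CNN percentages and the test set of 178 VSD and 60 normal heart sounds, so treat them as an assumption.

```python
def diagnostic_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Standard diagnostic metrics from confusion-matrix counts.

    tp/fn: VSD recordings classified as murmur / as normal.
    tn/fp: normal recordings classified as normal / as murmur.
    """
    return {
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
    }


# Inferred (not reported) counts for the CNN row: 178 VSD + 60 normal sounds.
cnn = diagnostic_metrics(tp=156, fn=22, tn=51, fp=9)

if __name__ == "__main__":
    for name, value in cnn.items():
        print(f"{name}: {value:.1%}")
```

Because the test set contains roughly three times as many VSD sounds as normal sounds, PPV is higher and NPV lower than they would be for the same sensitivity and specificity on a balanced set.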
Figure 2. ROC curves of the CNN, CRNN, and TAP-CRNN models.
Results of fourfold cross-validation of TAP-CRNN.
| Fold | Accuracy | Sensitivity | Specificity | PPV | NPV |
|---|---|---|---|---|---|
| 1st fold | 98.4% | 99.2% | 96.7% | 98.5% | 98.3% |
| 2nd fold | 96.8% | 96.2% | 98.3% | 99.2% | 92.2% |
| 3rd fold | 96.4% | 99.3% | 90.0% | 95.7% | 98.2% |
| 4th fold | 90.2% | 94.0% | 82.9% | 91.2% | 87.9% |
| Average | 95.45% | 97.18% | 91.98% | 96.15% | 94.15% |
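The Average row is the arithmetic mean of the four folds for each metric. A minimal check, with the per-fold percentages copied from the table and variable names chosen for illustration:

```python
from statistics import mean

# Per-fold percentages copied from the cross-validation table.
folds = {
    "accuracy":    [98.4, 96.8, 96.4, 90.2],
    "sensitivity": [99.2, 96.2, 99.3, 94.0],
    "specificity": [96.7, 98.3, 90.0, 82.9],
    "ppv":         [98.5, 99.2, 95.7, 91.2],
    "npv":         [98.3, 92.2, 98.2, 87.9],
}

# Unweighted mean across folds, one value per metric.
averages = {name: mean(values) for name, values in folds.items()}

if __name__ == "__main__":
    for name, value in averages.items():
        print(f"{name}: {value:.3f}")
```

The means agree with the reported averages to within rounding. Note that this is an unweighted mean over folds, which equals the pooled result only if the folds are the same size.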
Test results of the TAP-CRNN model’s ability to distinguish systolic murmur from normal heart sounds at the 5 standard auscultation locations.
| Auscultation area | Accuracy | Sensitivity | Specificity | PPV | NPV |
|---|---|---|---|---|---|
| Aortic area | 94.6% | 95.5% | 91.7% | 97.7% | 84.6% |
| Pulmonic area | 95.7% | 94.1% | 100% | 100% | 85.7% |
| Second aortic area/Erb’s point | 100% | 100% | 100% | 100% | 100% |
| Tricuspid area | 100% | 100% | 100% | 100% | 100% |
| Mitral area/apex | 95.7% | 94.1% | 100% | 100% | 85.7% |
Figure 3. Spectrograms of heart sounds from the normal subjects (a) and the subjects with VSD (b). A spectrogram shows how the spectrum of a sound or other signal varies with time. S1 (empty triangle) and S2 (solid triangle) are observed in the spectrogram of the normal heart sound. A systolic murmur (white arrow) is observed in the spectrogram of the VSD heart sound.
Figure 4. The mechanism of TAP-CRNN.
Figure 5. (a) The spectrogram of heart sounds from the subjects with VSD, and (b) the product of global attention and local attention coefficients.