Saumya Borwankar, Jai Prakash Verma, Rachna Jain, Anand Nayyar.
Abstract
Every respiratory-related checkup includes audio samples collected from the individual through tools such as a sonograph or stethoscope. This audio is analyzed to identify pathology, which requires time and effort. The research work proposed in this paper aims to ease that task with deep learning, diagnosing lung-related pathologies using a Convolutional Neural Network (CNN) fed with transformed features from the audio samples. The International Conference on Biomedical and Health Informatics (ICBHI) corpus dataset was used for lung sounds. A novel approach is proposed to pre-process the data and pass it through a newly proposed CNN architecture. The combination of the pre-processing steps MFCC, Melspectrogram, and Chroma CENS with the CNN improves the performance of the proposed system, which helps to make an accurate diagnosis from lung sounds. The comparative analysis shows that the proposed approach performs better than previous state-of-the-art research approaches. It also shows that a wheeze or a crackle need not be present in the lung sound to classify respiratory pathologies.
Keywords: CENS (Chroma energy normalized statistics); CNN (Convolutional neural network); MFCC (Mel-frequency cepstral coefficients); Melspectrogram; Respiratory pathologies classification
Year: 2022 PMID: 35505670 PMCID: PMC9047583 DOI: 10.1007/s11042-022-12958-1
Source DB: PubMed Journal: Multimed Tools Appl ISSN: 1380-7501 Impact factor: 2.577
A comparative analysis for better understanding of approaches and the classification done by researchers
| Ref | Objective | Methodology | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | Pros | Cons |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Islam et al. [ | To classify lung sound based on wheeze | MFCC - ANN, SVM | ✓ | ✓ | | | | | | | Whole lung sound cycle used for the classification | Multi-class classification not looked at |
| Jakovljević et al. [ | Classify audio into normal, crackle, wheeze, and both crackle and wheeze | MFCC - HMM | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | Use of an ensemble of classifiers. | Authors mention advanced noise-suppression techniques that were not looked at. |
| Bardou et al. [ | Classify audio into lung sounds of wheeze, normal, monophonic, coarse, fine, polyphonic, crackle, squawk, and stridor sounds. | MFCC - SVM, KNN, GMM; Spectrogram - CNN | ✓ | | | | | | | | LBP feature classification and introduction of CNN | Larger CNN topologies need to be made |
| Chen et al. [ | Classify audio into wheeze, crackles, and normal | Spectrogram - CNN | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | Different spectral mapping techniques used | Multi-class classification and real-time detection not looked at. |
| Guler et al. [ | Classify PSD values into crackle, wheeze, and normal using different neural structures. | Welch method | ✓ | | | | | | | | Hybridization of a genetic algorithm with neural networks; Welch PSD estimation methods used. | Work needs to be done on proper model selection |
| Prerna et al. [ | Classify audio into chronic, non-chronic, wheezes, crackles, and normal | MFCC - RNN | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | The use of LSTM to enhance the performance | Authors mention multiple deep learning architectures as future work |
| Garcia et al. [ | Classify audio into chronic, non-chronic, wheezes, crackles, and normal | Melspectrogram - CNN | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | The use of melspectrogram and a CNN model for classification | Variable-length audio and detection based on emphasized melspectrogram regions not looked at |
| Feng et al. [ | Classify sounds into normal and abnormal | Spectrogram - k-NN | ✓ | | | | | | | | Use of a temporal-spectral dominance spectrogram along with k-NN improved performance. | The time-frequency representation has to be examined from a high-complexity point of view. |
| Serbes et al. [ | Classify crackles from audio | Wavelet transform - SVM | ✓ | | | | | | | | Preprocessing with DTCWT helps remove regions containing no information. | Effect of windowing for time-frequency and of the mother wavelet function has not been looked at. |
| Guler et al. [ | Classify audio into normal, wheeze and crackles | Welch - Genetic algorithm ANN | ✓ | | | | | | | | Use of the Welch method helped in preprocessing | More networks need to be looked at for better performance. |
| Nishi et al. [ | Classify COPD from audio | MFCC - SVM | ✓ | ✓ | | | | | | | The linear predictive coefficient and median frequency are found to be the best biomarkers for COPD detection | Multi-centered data along with COPD subjects should be considered. |
| Chen et al. [ | Classify rale, rhonchus, wheeze and normal sounds | MFCC - k-NN | ✓ | | | | | | | | Approach provides sound-frame identification results to grade the lung sounds | Different variations of ML algorithms were not looked at. |
| Hashemi et al. [ | Classify wheeze from respiratory sounds. | Wavelet transform - MLP | ✓ | | | | | | | | They were able to demonstrate that higher wavelets do not produce better results. | Computation cost was increased. |
| Yamashita et al. [ | Classify normal and emphysema | Segmentation - HMM | ✓ | | | | | | | | They were able to get robust results between two likelihoods. | Multi-class classification was not looked at. |
| Prerna et al. [ | Classify audio into chronic, non-chronic diseases and normal | MFCC - CNN | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | The use of convolutional variational autoencoders for augmenting the classes with fewer samples. | Classification based on variable length not looked at. |
Disease-column key — 1: normal; 2: Bronchiectasis; 3: Bronchiolitis; 4: COPD; 5: Pneumonia; 6: LRTI; 7: URTI; 8: Asthma.
Fig. 1 Proposed execution flow with flowchart
Fig. 2 Mel scale filter bank, from (Young et al., 1997)
Fig. 3 Feature visualization
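The mel filter bank of Fig. 2 can be sketched in NumPy. This is an illustrative reconstruction using the common HTK-style mel formula from Young et al. (1997), not the authors' code; the parameter values (10 filters, 512-point FFT, 16 kHz sample rate) are assumptions chosen for the example.

```python
import numpy as np

def hz_to_mel(f):
    # O'Shaughnessy/HTK formula, as in Young et al. (1997)
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filter_bank(n_filters=10, n_fft=512, sr=16000):
    """Triangular filters spaced evenly on the mel scale, 0 Hz to Nyquist."""
    mel_edges = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2), n_filters + 2)
    hz_edges = mel_to_hz(mel_edges)
    # Map the filter edge frequencies onto FFT bin indices
    bins = np.floor((n_fft + 1) * hz_edges / sr).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(n_filters):
        left, center, right = bins[i], bins[i + 1], bins[i + 2]
        for k in range(left, center):           # rising slope
            fbank[i, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):          # falling slope
            fbank[i, k] = (right - k) / max(right - center, 1)
    return fbank

bank = mel_filter_bank()
print(bank.shape)  # (10, 257)
```

Multiplying this matrix with a power spectrum yields the mel-band energies that MFCC and melspectrogram features are built on.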
Original and augmented data sizes
| ID | Name of Disease | # Audio Samples | # Augmented Samples |
|---|---|---|---|
| 1 | Bronchiectasis | 16 | 80 |
| 2 | Bronchiolitis | 13 | 65 |
| 3 | COPD | 793 | 3965 |
| 4 | Healthy | 35 | 175 |
| 5 | Pneumonia | 37 | 185 |
| 6 | LRTI | 2 | 10 |
| 7 | URTI | 23 | 115 |
| 8 | Asthma | 1 | 5 |
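The table shows each class expanded roughly fivefold. This record does not detail the augmentation method used, so the sketch below illustrates generic waveform perturbations (additive noise, circular time shift, gain scaling) that yield five samples per original; every transform and parameter here is an assumption, not the paper's procedure.

```python
import numpy as np

rng = np.random.default_rng(0)

def add_noise(x, scale=0.005):
    """Additive Gaussian noise scaled to the signal's peak amplitude."""
    return x + scale * np.abs(x).max() * rng.standard_normal(x.shape)

def time_shift(x, max_frac=0.1):
    """Circularly shift the waveform by up to max_frac of its length."""
    limit = int(len(x) * max_frac)
    return np.roll(x, rng.integers(-limit, limit + 1))

def amplitude_scale(x, low=0.8, high=1.2):
    """Randomly rescale loudness."""
    return x * rng.uniform(low, high)

def augment(x, n_copies=4):
    """Return the original plus n_copies perturbed versions (5x total)."""
    transforms = [add_noise, time_shift, amplitude_scale]
    out = [x]
    for _ in range(n_copies):
        y = x.copy()
        for t in rng.choice(transforms, size=2, replace=False):
            y = t(y)
        out.append(y)
    return out

# toy 1-second, 16 kHz tone standing in for a lung-sound recording
sample = np.sin(2 * np.pi * 440 * np.linspace(0, 1, 16000))
augmented = augment(sample)
print(len(augmented))  # 5
```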
Fig. 4 Plot of a healthy audio
Fig. 5 CNN architecture
Analysis of independent features
| Approach | Dataset | Classes | Precision (%) | F1-Score | Recall |
|---|---|---|---|---|---|
| MFCC | ICBHI | 3 (chronic, non-chronic, normal) | 90.5 | 0.85 | 0.89 |
| Melspectrogram | ICBHI | 3 (normal, chronic, non-chronic) | 84.5 | 0.84 | 0.82 |
| Chroma CENS | ICBHI | 3 (normal, chronic, non-chronic) | 94.5 | 0.900 | 0.986 |
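The precision, recall, and F1 scores reported above can be derived from a confusion matrix. The minimal NumPy sketch below uses toy labels, not the paper's data, and (as a simplification) computes F1 from the macro-averaged precision and recall rather than averaging per-class F1.

```python
import numpy as np

def macro_metrics(y_true, y_pred, n_classes=3):
    """Macro-averaged precision, recall, and an F1 derived from them."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1                         # rows: true, cols: predicted
    precisions, recalls = [], []
    for c in range(n_classes):
        tp = cm[c, c]
        precisions.append(tp / max(cm[:, c].sum(), 1))  # TP / predicted-as-c
        recalls.append(tp / max(cm[c].sum(), 1))        # TP / actually-c
    p, r = float(np.mean(precisions)), float(np.mean(recalls))
    f1 = 2 * p * r / (p + r) if (p + r) else 0.0
    return p, r, f1

# toy labels: 0 = normal, 1 = chronic, 2 = non-chronic
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 1, 2, 2, 2]
p, r, f1 = macro_metrics(y_true, y_pred)
```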
Fig. 6 Training and validation accuracy
Fig. 7 Training and validation loss
Experimental results
| Approach | Dataset | Classes | Precision | F1-Score | Recall |
|---|---|---|---|---|---|
| 2D CNN [ | ICBHI | 3 (chronic, non-chronic, normal) | 0.96 | 0.84 | 0.82 |
| RNN [ | ICBHI | 3 (normal, chronic, non-chronic) | 0.93 | 0.91 | 0.90 |
| CNN [ | ICBHI | 3 (normal, chronic, non-chronic) | 0.994 | 0.900 | 0.986 |
| Proposed Approach | ICBHI | 3 (chronic, non-chronic, normal) | | | |
Fig. 8 Comparison of features