Santosh Kumar1, Sachin Kumar Gupta2, Vinit Kumar3, Manoj Kumar4, Mithilesh Kumar Chaube5, Nenavath Srinivas Naik1.
Abstract
Over the past few years, the COVID-19 pandemic has caused severe and often lethal illness. Processing the gathered samples takes extra time because the early diagnosis of infected individuals depends on medical diagnostic equipment, methodologies, and clinical testing procedures. To address this issue, an innovative multimodal framework for the early diagnosis and precise classification of COVID-19 is proposed. A chest X-ray-based model and a cough-based model extract distinguishing features from the prepared chest X-ray image and cough (audio) databases. The experiments use other public chest X-ray image datasets and the Coswara cough (audio) dataset, which contains 92 COVID-19-positive and 1079 healthy subjects, with the deep Uniform-Net and a Convolutional Neural Network (CNN). The weighted sum-rule fusion method and ensemble deep learning algorithms are then used to combine the extracted features. For the early diagnosis of patients, the framework offers an accuracy of 98.67%.
Keywords: COVID-19; Deep learning; Ensemble Learning; Fusion; Machine Learning
Year: 2022 PMID: 36160764 PMCID: PMC9485428 DOI: 10.1016/j.compeleceng.2022.108396
Source DB: PubMed Journal: Comput Electr Eng ISSN: 0045-7906 Impact factor: 4.152
Literature work based on chest X-ray modality system.
| Ref. | Pros. | Cons | Tech. |
|---|---|---|---|
| | NA | LA | VG |
| | Lung segmentation | LA | FC-DenseNet |
| | Effective | NS | DarkNet |
| CheXNet | Precise localization | LA | CNN |
| VP | RL | SL | CAAD model |
Abbreviations: Ref.=Reference, Tech.=Techniques, LA=Low accuracy (%), VG=VGG-19 and ResNet-50, NA=Not available, VP=Viral pneumonia screening, RL=Reinforces one-class model, SL=Single class, so not useful in COVID-19 detection, NS=No segmentation technique used.
Literature work based on cough sound modality system/frameworks.
| Ref. | Advantage | Dis | Tech. |
|---|---|---|---|
| | MD | HSR | CNN |
| | HF | A | DL |
| | Bio | Low-D | Bio-Model |
| | TM | Not implemented | ML |
Abbreviations: AD=Advantages, Dis=Disadvantages, Tech.=Technique used, MD=Mediator-based architecture, HSR=High sample rate & Mel-spectrogram, CNN=Convolutional Neural Network, Low-D=Requires more data, Bio=Use of multiple biomarkers, TM=Simple theoretical model and proof methods, FD=Faulty way of data collection, DL=Deep/machine learning & pattern recognition, A=Low AUC, EA=Easy availability, HF=Easy availability.
Current state-of-the-art works on early detection of COVID-19 using ML techniques based on chest and cough samples.
| Study | Year | Rs | RSS | NS | NR | PS | TM | Accuracy |
|---|---|---|---|---|---|---|---|---|
| | 2020 | SP | Cough | CO | 3621 | ST | ResNet-18 | AUC(0.72) |
| | 2020 | Web based | Cough | COD | 5320 | | ResNet-50 | 97.10% |
| | 2021 | Web based | Cough | CODD | 1502 | MFCC | ENCNN | 77% |
| | 2021 | ES | B | COV10 | 10 | 2DFT | Inception-v3 | 80% |
| | 2022 | Sound | 200 | Chest | HSR | SFs | CNN | 70% |
| | 2020 | cough | 4352 | Speech Pro. | MFCC | DL | CNN | 80% |
Abbreviations: Speech Pro.=Speech processing, T=dataset consists of 4352 unique people collected from the web app + 2261 unique people from the Android app+4352 and 5634 samples. Cough=Crowdsourced Respiratory Sound Data, SFs=Shape chest features, 2DFT=Two-dimensional (2D) Fourier transformation, COV10=[5 COVID19 5 healthy 10], ES=Electronic stethoscope, B=Breathing, SP=Smartphone app, TM=Trained model, Rs=Recordings source, RSS=Respiratory sound, NS=Number of subjects, NR=Number of recordings, PS=Pre-processing steps, ST=Short-term magnitude spectrogram, CO=COVID-19 1620 healthy, Mel=Mel-frequency cepstral coefficients (MFCC), COD=2660 COVID-19 2660 healthy, CODD=114 COVID-19, 1388 healthy, MFCC=Spectrogram Mel spectrum, power spectrum Tonal, MFCC=Mel-frequency cepstral coefficients, ENCNN=Ensemble CNN.
Fig. 1 Working flowchart of the proposed framework using a deep learning model.
Fig. 2 CXR images of healthy and ill subjects from the prepared chest dataset, which categorizes chest X-ray images into three classes: (a) normal case, (b) pneumonia case, and (c) COVID-19 case.
Description of cough (audio) dataset.
| Database | Size | Class ratio | Data types |
|---|---|---|---|
| Coswara | 8000 | 10:1 | C+B+S |
| Coughvid | 20000 | 7:3 | Cough sounds |
| DetectNow | 6500 | 8:5 | Cough sounds |
| Virufy | 16 | 9:7 | Cough sounds |
Abbreviation: C+B+S = Cough (C), breathing (B), and speech sound (S).
Fig. 3 Pipeline for the proposed cough-based diagnostic model.
Fig. 4 Cough (speech) signal for volume and its power spectrum.
Fig. 5 Feature extraction from the cough (voice) samples.
Fig. 6 Working of the chest X-ray model.
Fig. 7 Segmentation of a chest X-ray image using the proposed model.
Performance metrics of the proposed cough classification model.
| Sensitivity | Specificity | Precision | Accuracy | F1-Score |
|---|---|---|---|---|
| 0.8345 | 0.8126 | 0.8141 | 0.8461 | 0.8653 |
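The five metrics in this table are standard functions of a binary confusion matrix. The following is a minimal sketch of those definitions; the function name `classification_metrics` and the counts in the example are hypothetical and are not taken from the study.

```python
def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute standard binary-classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)            # recall, true-positive rate
    specificity = tn / (tn + fp)            # true-negative rate
    precision = tp / (tp + fp)              # positive predictive value
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    # F1 is the harmonic mean of precision and sensitivity
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "accuracy": accuracy,
        "f1": f1,
    }


# Hypothetical counts, for illustration only (not the study's confusion matrix)
metrics = classification_metrics(tp=80, fp=20, tn=90, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Because F1 is a harmonic mean, it always lies between precision and sensitivity, which gives a quick sanity check when reading tabulated results.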
Fig. 8 Chest classification model using a CNN with convolution and max-pooling layer operations on chest X-ray images.
Performance metrics of the proposed chest X-ray model: 5-fold cross-validation results.
| Fold | Sensitivity | Specificity | Precision | Accuracy | F1 |
|---|---|---|---|---|---|
| 1 | 0.9459 | 1.0000 | 1.0000 | 0.9890 | 0.9722 |
| 2 | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 1.0000 |
| 3 | 0.9453 | 0.91235 | 0.9367 | 0.9578 | 0.9646 |
| 4 | 0.9354 | 0.94285 | 0.9669 | 0.9677 | 0.9879 |
| 5 | 0.9891 | 0.97685 | 0.9556 | 0.9576 | 0.9789 |
Comparisons with existing techniques for COVID-19 detection.
| Ref. | Database used | Technique used | Accuracy(%) |
|---|---|---|---|
| | CXR | COVID-Net | 92.4% |
| | CXR | ResNet50 | 95.4% |
| | CXR | DarkCovidNet | 98% |
| | Chest CT-Scan | ResNet | 86.7% |
| | CXR | VGG-19 | 93.48% |
| | 1,065 CT images | M-Inception | 82.90% |
Fig. 9 Confusion matrix of the proposed system with fusion method.
Fig. 10 Training accuracy vs. validation accuracy for chest X-ray based COVID-19 classification.
Fig. 11 Confusion matrix for (a) chest X-ray images and (b) cough-based diagnosis.
Fig. 12 (a) Accuracy (%) vs. epoch and (b) loss (%) vs. epoch for the cough classification model.
Fig. 13 Confusion matrix of the proposed system without fusion method.
Classification (%) based on different ML models.
| Model | Accuracy(%) | Precision(%) | Recall(%) |
|---|---|---|---|
| SqueezeNet | 84.40 | 83.40 | 84.30 |
| Darknet | 92.5 | 91.8 | 93.2 |
| ResNet50 | 90.0 | 89.78 | 90.8 |
| REs-A | Chest CT | MI | 82.90% |
| DLT | Chest-X-ray | DTL | 92.1% |
| | 86.6 | 85.7 | 86.86 |
| | 92.5 | 91.8 | 93.89 |
Abbreviation: Ref.=Reference, Data=database used, Tech.=Techniques, A=Accuracy(%), Prop.=Proposed, DLT=DLT-based classifier, ED=Ensembled Deep Neural Network (DNN), TL=Transfer Learning with VGGish, chest-CT=Chest CT Scan, REs-A=ResNet and Location Attention, MI=M-Inception techniques, CNN=Convolutional neural Network.
Comparisons with existing techniques on cough for COVID-19 detection.
| Ref. | Used dataset | Technique Used | Accuracy | Remark |
|---|---|---|---|---|
| | chest-CT | Res-A | 86.7% | Takes more time, less accuracy |
| | Chest CT | MI | 82.90% | Less dataset |
| | X-ray | DTL | 92.1% | Overfitting + processing time is more |
| | CC | ED | 77.1% | Less accuracy |
| | audioset/ESC-50 | TL | 70.58% | Not Normalized dataset used |
Abbreviation: Ref.=Reference, Prop.=Proposed, DLT=DLT-based classifier, ED=Ensembles DNN, TL=Transfer Learning with VGGish, chest-CT=Chest CT Scan, REs-A=ResNet and Location Attention, MI=M-Inception techniques, CNN=Convolutional neural Network, Coswara=Coswara cough audio database, CC=Coswara/Coughvid.
Summary of current state-of-the-art works on early detection of COVID-19 using ML techniques based on chest and cough samples.
| Study | Year | Rs | RSS | NS | NR | PS | TM | Per(%) |
|---|---|---|---|---|---|---|---|---|
| | 2022 | Sound | 200 | Chest | HSR | SFs | CNN | 70% |
| | 2020 | cough | 4352 | Speech | MFCC | DL | CNN | 80% |
| | 2020 | Web | Cough | COD | 5320 | DL | R2 | 97.10% |
| | 2020 | SP | Cough | CO | 3621 | ST | R-18 | AUC(0.72) |
| | 2021 | ES | B | COV10 | 10 | 2DFT | In-v3 | 80% |
| | 2021 | Web based | Cough | CODD | 1502 | MFCC | ENCNN | 77% |
| | 2021 | Web-based | Cough | COT | 16 | MFCCs | SFT SVM | 94.21% |
| | 2022 | B | BB | 220 | MFCCS | DL | Low-D | 70% |
Abbreviations: MultM=Multimodal, In-v3=Inception v3, R=ResNet-18, R1=ResNet-50, COT=7 COVID-19 + 9 healthy, SFT=Singlet time Fourier transformation, SVM=Support vector machine, Speech Pro=Speech processing, T=dataset consists of 4352 unique people collected from the web app + 2261 unique people from the Android app+4352 and 5634 samples. Cough=Crowdsourced Respiratory Sound Data, SF=Shape chest features, 2DFT=Two-dimensional (2D) Fourier transformation, COVID10=[5 COVID19, 5 healthy], ES=Electronic stethoscope, BB=120 COVID + 100 healthy, B=Breathing, Sp=Smartphone app, TM=Trained model, Per.=Performance, Rs=Recordings source, RSS=Respiratory sound, NS=Number of subjects, NR=Number of recordings, PS=Pre-processing steps, ST=Short-term magnitude spectrogram, CO=COVID-19 1620 healthy, Mel=Mel-frequency cepstral coefficients (MFCC), COD=2660 COVID-19 2660 healthy, CODD=114 COVID-19, 1388 healthy, MFCC=Spectrogram Mel spectrum, power spectrum Tonal, Mel-frequency cepstral coefficients (MFCC), ENCNN=Ensemble CNN, Total=COVID-19 34516 samples + 2100 (1500 chest X-ray positive + 600 chest X-ray negative), proposed method=MFCC+Darknet+ensemble ML+weighted sum rule fusion method, Model=Unet+Handcrafted features.
Classification accuracy of 5-fold cross-validation of the proposed framework (values given as fractions).
| Fold | Chest X-ray | Cough |
|---|---|---|
| 1 | 0.9890 | 0.7776 |
| 2 | 0.9778 | 0.8278 |
| 3 | 0.9833 | 0.8116 |
| 4 | 1.000 | 0.7926 |
| 5 | 0.9832 | 0.8453 |
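Per-fold accuracies such as these are usually summarized by their mean. The sketch below averages the two columns of this table; the helper name `mean_percent` is hypothetical, and the assumption that a simple unweighted mean over folds is the intended summary is ours.

```python
from statistics import mean

# Per-fold accuracies copied from the table above (as fractions)
chest_xray_folds = [0.9890, 0.9778, 0.9833, 1.000, 0.9832]
cough_folds = [0.7776, 0.8278, 0.8116, 0.7926, 0.8453]


def mean_percent(fold_scores: list) -> float:
    """Return the unweighted mean of fold scores as a percentage, rounded to 2 decimals."""
    return round(100 * mean(fold_scores), 2)


print("Chest X-ray mean accuracy (%):", mean_percent(chest_xray_folds))
print("Cough mean accuracy (%):", mean_percent(cough_folds))
```

Under this assumption the chest X-ray column averages to about 98.67%, the accuracy figure quoted in the abstract.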
Weighted sum-rule fusion accuracy for classification of COVID-19 patients.
| Modality | Weight | Mean accuracy (Si) | Weighted sum score |
|---|---|---|---|
| Chest X ray | 0.54 | 98.67 | 53.35 |
| Cough audio | 0.46 | 86.53 | 39.80 |
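A weighted sum rule combines per-modality scores as the sum of weight times score over the modalities. The following is a minimal sketch under the assumption that the fused score is exactly this sum with weights adding up to 1; the function name `weighted_sum_fusion` and the dictionary keys are hypothetical, and the weights and scores are the ones tabulated above.

```python
def weighted_sum_fusion(scores: dict, weights: dict) -> float:
    """Fuse per-modality scores with a weighted sum rule: sum_i w_i * s_i."""
    if set(scores) != set(weights):
        raise ValueError("scores and weights must cover the same modalities")
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights are assumed to sum to 1")
    return sum(weights[m] * scores[m] for m in scores)


# Weights and mean accuracies (Si) taken from the table above
weights = {"chest_xray": 0.54, "cough_audio": 0.46}
scores = {"chest_xray": 98.67, "cough_audio": 86.53}

# Per-modality weighted contributions and the fused score
contributions = {m: weights[m] * scores[m] for m in scores}
fused = weighted_sum_fusion(scores, weights)
print(contributions)
print(f"fused score: {fused:.2f}")
```

For the cough row, 0.46 × 86.53 ≈ 39.80, which agrees with the tabulated weighted score. Giving the chest X-ray modality the larger weight lets the more accurate model dominate the fused decision while still using the cough signal.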