Nihad Karim Chowdhury, Muhammad Ashad Kabir, Md Muhtadir Rahman, Sheikh Mohammed Shariful Islam.
Abstract
This research analyzes the performance of state-of-the-art machine learning techniques for classifying COVID-19 from cough sounds and identifies the model(s) that perform consistently well across different cough datasets. Because performance can be measured with many different evaluation metrics (precision, sensitivity, specificity, AUC, accuracy, etc.), selecting the best-performing model is difficult. To address this issue, we propose an ensemble-based multi-criteria decision-making (MCDM) method for selecting the top-performing machine learning technique(s) for COVID-19 cough classification. We use four cough datasets, namely Cambridge, Coswara, Virufy, and NoCoCoDa, to verify the proposed method. The method first extracts audio features from the cough samples and then applies machine learning (ML) techniques to classify them as COVID-19 or non-COVID-19. It then applies an MCDM method combined with ensemble techniques (i.e., soft and hard) to select the best model. Within MCDM, we use the technique for order preference by similarity to ideal solution (TOPSIS) for ranking, while entropy is used to calculate the weights of the evaluation criteria. In addition, we perform feature reduction through recursive feature elimination with cross-validation under different estimators. Our empirical evaluations show that the proposed method outperforms the state-of-the-art models. With the Extra-Trees classifier, the proposed method achieves promising results (AUC: 0.95, Precision: 1, Recall: 0.97).
Keywords: COVID-19; Classification; Cough; Ensemble; Entropy; MCDM; Machine learning; TOPSIS
Year: 2022 PMID: 35318171 PMCID: PMC8926945 DOI: 10.1016/j.compbiomed.2022.105405
Source DB: PubMed Journal: Comput Biol Med ISSN: 0010-4825 Impact factor: 6.698
Fig. 1. An overview of the proposed method for detecting COVID-19 from cough samples.
Datasets description.
| Dataset | Category | COVID-19 | Non-COVID-19 | Total |
|---|---|---|---|---|
| Cambridge | Asymptomatic | 141 | 298 | 439 |
| | Symptomatic | 54 | 32 | 86 |
| Coswara | - | 185 | 1134 | 1319 |
| Virufy | - | 48 | 73 | 121 |
| NoCoCoDa | - | 73 | - | 73 |
| Virufy + NoCoCoDa | - | 121 | 73 | 194 |
Fig. 2. COVID-19 and Non-COVID-19 cough samples of the Cambridge dataset.
Hyper-parameters search space of classifiers for optimization.
| Classifiers | Hyper-parameters | Range |
|---|---|---|
| Extra-Trees | Estimators | 600, 700, 800 |
| | Criterion | Gini, Entropy |
| | Max. features | Auto, Sqrt, Log2 |
| SVM | C | 0.10 to 1.0, step = 0.10 |
| | Kernel | Linear, Poly, rbf, Sigmoid |
| | Gamma | Auto, Scale |
| RF | Estimators | 600, 700, 800 |
| | Max. features | Auto, Sqrt, Log2 |
| AdaBoost | Estimators | 600, 700, 800 |
| | Algorithm | SAMME, SAMME.R |
| MLP | Hidden layer sizes | (64), (64,64), (128), (128,128) |
| | Activation | identity, logistic, tanh, relu |
| | Solver | lbfgs, sgd, adam |
| | Learning rate | constant, invscaling, adaptive |
| XGBoost | Estimators | 600, 700, 800 |
| | Max. depth | 4, 5, 6 |
| GBoost | Estimators | 600, 700, 800 |
| | Criterion | friedman_mse, mse |
| | Max. features | auto, sqrt, log2 |
| | Loss | deviance, exponential |
| LR | Penalty | l1, l2, elasticnet |
| | Solver | newton-cg, lbfgs, liblinear, sag, saga |
| k-NN | Number of neighbours | 5 to 8, step = 1 |
| | Algorithm | auto, ball tree, kd tree, brute |
| HGBoost | Max. iteration | 100 to 600, step = 100 |
| | Loss | binary crossentropy |
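To give a sense of how large such a search space is, the following minimal sketch enumerates every combination for the Extra-Trees row of the table. The dictionary keys and the helper `enumerate_grid` are illustrative names chosen here, not identifiers taken from the paper, and the sketch only lists the candidates without training any model.

```python
from itertools import product

# Search space for the Extra-Trees classifier, copied from the table above.
# The dictionary keys are illustrative names (an assumption of this sketch).
extra_trees_space = {
    "n_estimators": [600, 700, 800],
    "criterion": ["gini", "entropy"],
    "max_features": ["auto", "sqrt", "log2"],
}


def enumerate_grid(space):
    """Return every hyper-parameter combination in the search space."""
    names = list(space)
    return [dict(zip(names, values)) for values in product(*space.values())]


grid = enumerate_grid(extra_trees_space)
print(len(grid))  # number of candidate configurations a grid search would try
print(grid[0])    # first candidate
```

A grid search evaluates each of these candidates with cross-validation and keeps the best one, so the cost grows with the product of the range sizes.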
Configurations of different training strategies.
| Training Strategy # | Cross-Validation Method | Cross-Validation Folds | Up-sampling Method | Threshold Moving | Hyper-parameters Selection Method |
|---|---|---|---|---|---|
| Strategy 1 | Stratified | 10 | N/A | | Fixed |
| Strategy 2 | Stratified | 10 | SMOTE | | Fixed |
| Strategy 3 | Stratified | 10 | SMOTE | | Optimized using Nested Cross-Validation with Grid Search |
Decision matrix of the proposed method for the asymptomatic category under each training strategy. The evaluation criteria fall into two groups: Acc., AUC, Precision, Recall, Specificity, and F1-score are to be maximized, whereas FPR and FNR are to be minimized.
| Training strategy | Classifier | Acc. (↑) | AUC (↑) | Precision (↑) | Recall (↑) | Specificity (↑) | F1-score (↑) | FPR (↓) | FNR (↓) |
|---|---|---|---|---|---|---|---|---|---|
| Strategy 1 | Extra-Trees | 0.86 | 0.85 | 0.93 | 0.62 | 0.98 | 0.75 | 0.02 | 0.38 |
| | SVM | 0.81 | 0.81 | 0.82 | 0.54 | 0.94 | 0.65 | 0.06 | 0.46 |
| | RF | 0.85 | 0.81 | 0.90 | 0.62 | 0.97 | 0.73 | 0.03 | 0.38 |
| | AdaBoost | 0.82 | 0.80 | 0.82 | 0.55 | 0.94 | 0.66 | 0.06 | 0.45 |
| | MLP | 0.81 | 0.83 | 0.84 | 0.51 | 0.95 | 0.63 | 0.05 | 0.49 |
| | XGBoost | 0.77 | 0.68 | 0.90 | 0.33 | 0.98 | 0.48 | 0.02 | 0.67 |
| | GBoost | 0.81 | 0.80 | 0.78 | 0.57 | 0.92 | 0.66 | 0.08 | 0.43 |
| | LR | 0.80 | 0.78 | 0.76 | 0.57 | 0.91 | 0.65 | 0.09 | 0.43 |
| | k-NN | 0.80 | 0.80 | 0.82 | 0.48 | 0.95 | 0.61 | 0.05 | 0.52 |
| | HGBoost | 0.84 | 0.83 | 0.92 | 0.56 | 0.98 | 0.70 | 0.02 | 0.44 |
| Strategy 2 | Extra-Trees | 0.85 | 0.85 | 0.76 | 0.77 | 0.88 | 0.76 | 0.12 | 0.23 |
| | SVM | 0.77 | 0.79 | 0.62 | 0.79 | 0.77 | 0.69 | 0.23 | 0.21 |
| | RF | 0.82 | 0.84 | 0.71 | 0.77 | 0.85 | 0.74 | 0.15 | 0.23 |
| | AdaBoost | 0.78 | 0.79 | 0.64 | 0.72 | 0.81 | 0.68 | 0.19 | 0.28 |
| | MLP | 0.80 | 0.81 | 0.69 | 0.70 | 0.85 | 0.69 | 0.15 | 0.30 |
| | XGBoost | 0.83 | 0.84 | 0.70 | 0.79 | 0.84 | 0.75 | 0.16 | 0.21 |
| | GBoost | 0.78 | 0.80 | 0.64 | 0.76 | 0.80 | 0.69 | 0.20 | 0.24 |
| | LR | 0.78 | 0.79 | 0.62 | 0.78 | 0.78 | 0.69 | 0.22 | 0.22 |
| | k-NN | 0.80 | 0.81 | 0.68 | 0.71 | 0.84 | 0.69 | 0.16 | 0.29 |
| | HGBoost | 0.84 | 0.86 | 0.72 | 0.81 | 0.85 | 0.76 | 0.15 | 0.19 |
| Strategy 3 | Extra-Trees | 0.84 | 0.83 | 0.75 | 0.74 | 0.88 | 0.74 | 0.12 | 0.26 |
| | SVM | 0.81 | 0.83 | 0.67 | 0.79 | 0.82 | 0.72 | 0.18 | 0.21 |
| | RF | 0.84 | 0.84 | 0.75 | 0.77 | 0.88 | 0.76 | 0.12 | 0.23 |
| | AdaBoost | 0.79 | 0.82 | 0.65 | 0.78 | 0.80 | 0.71 | 0.20 | 0.22 |
| | MLP | 0.82 | 0.82 | 0.71 | 0.74 | 0.86 | 0.72 | 0.14 | 0.26 |
| | XGBoost | 0.83 | 0.85 | 0.71 | 0.79 | 0.85 | 0.75 | 0.15 | 0.21 |
| | GBoost | 0.84 | 0.85 | 0.74 | 0.76 | 0.88 | 0.75 | 0.12 | 0.24 |
| | LR | 0.78 | 0.79 | 0.63 | 0.72 | 0.80 | 0.68 | 0.20 | 0.28 |
| | k-NN | 0.79 | 0.79 | 0.66 | 0.70 | 0.83 | 0.68 | 0.17 | 0.30 |
| | HGBoost | 0.83 | 0.85 | 0.71 | 0.79 | 0.85 | 0.75 | 0.15 | 0.21 |
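As a reminder of how these criteria relate to one another, the sketch below derives them from the four cells of a binary confusion matrix using the standard definitions. The function name and the example counts are hypothetical and are not taken from the paper. AUC is omitted because it needs the ranked prediction scores rather than a single confusion matrix.

```python
def evaluation_criteria(tp, fp, tn, fn):
    """Standard binary-classification criteria from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)          # also called sensitivity
    specificity = tn / (tn + fp)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": 2 * precision * recall / (precision + recall),
        "fpr": fp / (fp + tn),       # equals 1 - specificity
        "fnr": fn / (fn + tp),       # equals 1 - recall
    }


# Hypothetical counts, chosen only to illustrate the relationships.
criteria = evaluation_criteria(tp=40, fp=5, tn=45, fn=10)
for name, value in criteria.items():
    print(f"{name}: {value:.2f}")
```

Because FPR and FNR are the complements of specificity and recall, each row of the decision matrix carries the same information twice, once as a criterion to maximize and once as a criterion to minimize.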
Decision matrix of the proposed method for the symptomatic category under each training strategy. The evaluation criteria fall into two groups: Acc., AUC, Precision, Recall, Specificity, and F1-score are to be maximized, whereas FPR and FNR are to be minimized.
| Training strategy | Classifier | Acc. (↑) | AUC (↑) | Precision (↑) | Recall (↑) | Specificity (↑) | F1-score (↑) | FPR (↓) | FNR (↓) |
|---|---|---|---|---|---|---|---|---|---|
| Strategy 1 | Extra-Trees | 0.87 | 0.87 | 1 | 0.80 | 1 | 0.89 | 0 | 0.20 |
| | SVM | 0.79 | 0.78 | 0.93 | 0.72 | 0.91 | 0.81 | 0.09 | 0.28 |
| | RF | 0.87 | 0.87 | 0.96 | 0.83 | 0.94 | 0.89 | 0.06 | 0.17 |
| | AdaBoost | 0.79 | 0.78 | 0.93 | 0.72 | 0.91 | 0.81 | 0.09 | 0.28 |
| | MLP | 0.83 | 0.81 | 0.98 | 0.74 | 0.97 | 0.84 | 0.03 | 0.26 |
| | XGBoost | 0.84 | 0.81 | 0.95 | 0.78 | 0.94 | 0.86 | 0.06 | 0.22 |
| | GBoost | 0.79 | 0.75 | 0.93 | 0.72 | 0.91 | 0.81 | 0.09 | 0.28 |
| | LR | 0.84 | 0.80 | 0.95 | 0.78 | 0.94 | 0.86 | 0.06 | 0.22 |
| | k-NN | 0.73 | 0.75 | 0.92 | 0.63 | 0.91 | 0.75 | 0.09 | 0.37 |
| | HGBoost | 0.77 | 0.70 | 0.87 | 0.74 | 0.81 | 0.80 | 0.19 | 0.26 |
| Strategy 2 | Extra-Trees | 0.86 | 0.86 | 0.98 | 0.80 | 0.97 | 0.88 | 0.03 | 0.20 |
| | SVM | 0.84 | 0.80 | 0.92 | 0.81 | 0.88 | 0.86 | 0.13 | 0.19 |
| | RF | 0.84 | 0.85 | 0.95 | 0.78 | 0.94 | 0.86 | 0.06 | 0.22 |
| | AdaBoost | 0.80 | 0.79 | 0.91 | 0.76 | 0.88 | 0.83 | 0.13 | 0.24 |
| | MLP | 0.87 | 0.86 | 0.94 | 0.85 | 0.91 | 0.89 | 0.09 | 0.15 |
| | XGBoost | 0.81 | 0.84 | 1 | 0.70 | 1 | 0.83 | 0 | 0.30 |
| | GBoost | 0.86 | 0.83 | 0.94 | 0.83 | 0.91 | 0.88 | 0.09 | 0.17 |
| | LR | 0.86 | 0.83 | 0.89 | 0.89 | 0.81 | 0.89 | 0.19 | 0.11 |
| | k-NN | 0.72 | 0.74 | 0.97 | 0.57 | 0.97 | 0.72 | 0.03 | 0.43 |
| | HGBoost | 0.77 | 0.74 | 0.93 | 0.69 | 0.91 | 0.79 | 0.09 | 0.31 |
| Strategy 3 | Extra-Trees | 0.84 | 0.83 | 1 | 0.74 | 1 | 0.85 | 0 | 0.26 |
| | SVM | 0.80 | 0.79 | 0.93 | 0.74 | 0.91 | 0.82 | 0.09 | 0.26 |
| | RF | 0.87 | 0.85 | 0.96 | 0.83 | 0.94 | 0.89 | 0.06 | 0.17 |
| | AdaBoost | 0.83 | 0.80 | 0.95 | 0.76 | 0.94 | 0.85 | 0.06 | 0.24 |
| | MLP | 0.83 | 0.78 | 0.88 | 0.83 | 0.81 | 0.86 | 0.19 | 0.17 |
| | XGBoost | 0.84 | 0.81 | 0.92 | 0.81 | 0.88 | 0.86 | 0.13 | 0.19 |
| | GBoost | 0.88 | 0.87 | 0.98 | 0.83 | 0.97 | 0.90 | 0.03 | 0.17 |
| | LR | 0.78 | 0.76 | 0.95 | 0.69 | 0.94 | 0.80 | 0.06 | 0.31 |
| | k-NN | 0.69 | 0.71 | 0.97 | 0.52 | 0.97 | 0.67 | 0.03 | 0.48 |
| | HGBoost | 0.83 | 0.80 | 0.91 | 0.80 | 0.88 | 0.85 | 0.13 | 0.20 |
Evaluation criteria and weights based on the entropy of all categories.
| Category | Training strategy | Acc. | AUC | Precision | Recall | Specificity | F1-score | FPR | FNR |
|---|---|---|---|---|---|---|---|---|---|
| Asymptomatic | Strategy 1 | 0.10 | 0.06 | 0.13 | 0.06 | 0.11 | 0.07 | 0.25 | 0.22 |
| | Strategy 2 | 0.13 | 0.19 | 0.14 | 0.09 | 0.09 | 0.18 | 0.09 | 0.10 |
| | Strategy 3 | 0.10 | 0.10 | 0.09 | 0.08 | 0.12 | 0.11 | 0.19 | 0.21 |
| Symptomatic | Strategy 1 | 0.13 | 0.13 | 0.11 | 0.10 | 0.09 | 0.12 | 0.15 | 0.16 |
| | Strategy 2 | 0.09 | 0.16 | 0.14 | 0.09 | 0.10 | 0.08 | 0.15 | 0.18 |
| | Strategy 3 | 0.07 | 0.09 | 0.10 | 0.07 | 0.09 | 0.07 | 0.15 | 0.37 |
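For readers unfamiliar with entropy weighting, here is a minimal sketch of the textbook entropy weight method: each criterion column is normalized to proportions, its Shannon entropy is computed, and criteria whose values vary more across classifiers receive larger weights. The exact normalization used in the paper is not given in this excerpt, so this is an assumption and the output is not expected to reproduce the table. The input is a three-classifier, three-criterion subset (Acc., Recall, FPR) of the asymptomatic Strategy 1 decision matrix.

```python
import math


def entropy_weights(matrix):
    """Textbook entropy weight method.

    matrix[i][j] is the score of alternative i on criterion j.
    Assumes every column has a positive sum and that at least one
    column is not perfectly uniform.
    """
    n_alternatives = len(matrix)
    n_criteria = len(matrix[0])
    k = 1.0 / math.log(n_alternatives)

    divergences = []
    for j in range(n_criteria):
        column = [row[j] for row in matrix]
        total = sum(column)
        entropy = 0.0
        for value in column:
            p = value / total
            if p > 0:                    # 0 * log(0) is taken as 0
                entropy -= k * p * math.log(p)
        divergences.append(1.0 - entropy)

    total_divergence = sum(divergences)
    return [d / total_divergence for d in divergences]


# Rows: Extra-Trees, SVM, RF. Columns: Acc., Recall, FPR.
subset = [
    [0.86, 0.62, 0.02],
    [0.81, 0.54, 0.06],
    [0.85, 0.62, 0.03],
]
weights = entropy_weights(subset)
print([round(w, 3) for w in weights])
```

In this subset FPR varies the most in relative terms, so it receives the largest weight, which agrees in direction with the large FPR and FNR weights reported for Strategy 1.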
The results of the ideal best and the ideal worst value of each task for each training strategy.
| Category | Evaluation criteria | Strategy 1: ideal best | Strategy 1: ideal worst | Strategy 2: ideal best | Strategy 2: ideal worst | Strategy 3: ideal best | Strategy 3: ideal worst |
|---|---|---|---|---|---|---|---|
| Asymptomatic | Acc. | 0.032 | 0.028 | 0.042 | 0.038 | 0.031 | 0.029 |
| | AUC | 0.020 | 0.016 | 0.064 | 0.059 | 0.034 | 0.032 |
| | Precision | 0.045 | 0.037 | 0.049 | 0.040 | 0.032 | 0.027 |
| | Recall | 0.023 | 0.013 | 0.030 | 0.026 | 0.025 | 0.022 |
| | Specificity | 0.037 | 0.034 | 0.029 | 0.025 | 0.041 | 0.037 |
| | F1-score | 0.025 | 0.016 | 0.059 | 0.053 | 0.036 | 0.032 |
| | FPR | 0.029 | 0.131 | 0.019 | 0.036 | 0.045 | 0.075 |
| | FNR | 0.057 | 0.100 | 0.026 | 0.041 | 0.058 | 0.083 |
| Symptomatic | Acc. | 0.043 | 0.036 | 0.031 | 0.026 | 0.024 | 0.019 |
| | AUC | 0.046 | 0.037 | 0.055 | 0.047 | 0.030 | 0.025 |
| | Precision | 0.036 | 0.031 | 0.046 | 0.041 | 0.034 | 0.030 |
| | Recall | 0.036 | 0.028 | 0.034 | 0.022 | 0.024 | 0.015 |
| | Specificity | 0.032 | 0.026 | 0.035 | 0.028 | 0.029 | 0.024 |
| | F1-score | 0.041 | 0.035 | 0.028 | 0.023 | 0.022 | 0.017 |
| | FPR | 0 | 0.102 | 0 | 0.092 | 0 | 0.095 |
| | FNR | 0.033 | 0.073 | 0.025 | 0.096 | 0.077 | 0.216 |
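The ideal best and ideal worst values feed the TOPSIS relative closeness score. The sketch below follows the standard TOPSIS steps (vector normalization, weighting, ideal points, Euclidean distances, closeness) and is not claimed to be the authors' implementation. The weights are hypothetical, and the input is a small subset (Acc., Recall, FPR for Extra-Trees, SVM, RF) of the asymptomatic Strategy 1 decision matrix.

```python
import math


def topsis(matrix, weights, benefit):
    """Standard TOPSIS.

    matrix[i][j]: score of alternative i on criterion j.
    benefit[j]: True if criterion j is maximized, False if minimized.
    Assumes no column is all zeros and the alternatives are not all identical.
    Returns (ideal_best, ideal_worst, closeness_scores).
    """
    n_criteria = len(weights)
    # Vector-normalize each column, then apply the criterion weight.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_criteria)]
    weighted = [
        [weights[j] * row[j] / norms[j] for j in range(n_criteria)]
        for row in matrix
    ]
    columns = list(zip(*weighted))
    ideal_best = [max(c) if benefit[j] else min(c) for j, c in enumerate(columns)]
    ideal_worst = [min(c) if benefit[j] else max(c) for j, c in enumerate(columns)]

    closeness = []
    for row in weighted:
        d_best = math.dist(row, ideal_best)
        d_worst = math.dist(row, ideal_worst)
        closeness.append(d_worst / (d_best + d_worst))
    return ideal_best, ideal_worst, closeness


# Rows: Extra-Trees, SVM, RF. Columns: Acc., Recall, FPR.
subset = [
    [0.86, 0.62, 0.02],
    [0.81, 0.54, 0.06],
    [0.85, 0.62, 0.03],
]
hypothetical_weights = [0.3, 0.3, 0.4]   # illustrative only, not from the paper
best, worst, scores = topsis(subset, hypothetical_weights, [True, True, False])
print([round(s, 3) for s in scores])
```

In this subset Extra-Trees is best on every criterion, so it coincides with the ideal best point and scores 1, while SVM coincides with the ideal worst point and scores 0. With all ten classifiers and eight criteria the scores would differ.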
Results of MCDM with integration of ensemble.
| Category | Classifier | Closeness: Strategy 1 | Closeness: Strategy 2 | Closeness: Strategy 3 | Soft: Avg. (↑) | Soft: Rank | Hard: Strategy 1 | Hard: Strategy 2 | Hard: Strategy 3 | Hard: Total (↑) | Hard: Rank |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Asymptomatic | Extra-Trees | 1 | 0.806 | 0.701 | 0.835 | **1** | 10 | 10 | 6 | 26 | 2 |
| | SVM | 0.478 | 0.370 | 0.535 | 0.461 | 7 | 3 | 4 | 4 | 11 | 8 |
| | RF | 0.871 | 0.683 | 0.867 | 0.807 | 3 | 8 | 7 | 10 | 25 | 3 |
| | AdaBoost | 0.483 | 0.256 | 0.422 | 0.387 | 9 | 4 | 1 | 3 | 8 | 9 |
| | MLP | 0.579 | 0.428 | 0.614 | 0.540 | 5 | 6 | 6 | 5 | 17 | 5 |
| | XGBoost | 0.690 | 0.699 | 0.736 | 0.708 | 4 | 7 | 8 | 8 | 23 | 4 |
| | GBoost | 0.314 | 0.351 | 0.807 | 0.490 | 6 | 2 | 2 | 9 | 13 | 6 |
| | LR | 0.267 | 0.357 | 0.132 | 0.252 | 10 | 1 | 3 | 1 | 5 | 10 |
| | k-NN | 0.561 | 0.405 | 0.262 | 0.409 | 8 | 5 | 5 | 2 | 12 | 7 |
| | HGBoost | 0.920 | 0.806 | 0.736 | 0.821 | 2 | 9 | 10 | 8 | 27 | **1** |
| Symptomatic | Extra-Trees | 0.947 | 0.790 | 0.772 | 0.836 | **1** | 10 | 10 | 8 | 28 | **1** |
| | SVM | 0.515 | 0.484 | 0.647 | 0.548 | 8 | 5 | 4 | 4 | 13 | 7 |
| | RF | 0.717 | 0.675 | 0.837 | 0.743 | 2 | 8 | 8 | 9 | 25 | 2 |
| | AdaBoost | 0.515 | 0.427 | 0.743 | 0.561 | 7 | 5 | 2 | 7 | 14 | 6 |
| | MLP | 0.784 | 0.643 | 0.596 | 0.674 | 5 | 9 | 7 | 3 | 19 | 4 |
| | XGBoost | 0.693 | 0.694 | 0.672 | 0.686 | 3 | 7 | 9 | 6 | 22 | 3 |
| | GBoost | 0.514 | 0.626 | 0.915 | 0.685 | 4 | 3 | 6 | 10 | 19 | 4 |
| | LR | 0.692 | 0.440 | 0.589 | 0.573 | 6 | 7 | 3 | 2 | 12 | 8 |
| | k-NN | 0.457 | 0.511 | 0.362 | 0.443 | 9 | 2 | 5 | 1 | 8 | 9 |
| | HGBoost | 0.176 | 0.467 | 0.662 | 0.435 | 10 | 1 | 1 | 5 | 7 | 10 |
Underlined boldface indicates the highest-ranked models.
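The two ensemble columns can be illustrated with a short sketch. The soft ensemble averages the relative closeness scores over the three training strategies. For the hard ensemble, this sketch assumes a points scheme in which a classifier earns, per strategy, as many points as there are classifiers scoring at or below it, so tied classifiers share the higher value. The excerpt does not state the rule explicitly, so this is an inference that happens to reproduce the hard totals in the asymptomatic rows of the table.

```python
# Relative closeness scores per training strategy (asymptomatic rows of the table).
closeness = {
    "Extra-Trees": [1.0, 0.806, 0.701],
    "SVM": [0.478, 0.370, 0.535],
    "RF": [0.871, 0.683, 0.867],
    "AdaBoost": [0.483, 0.256, 0.422],
    "MLP": [0.579, 0.428, 0.614],
    "XGBoost": [0.690, 0.699, 0.736],
    "GBoost": [0.314, 0.351, 0.807],
    "LR": [0.267, 0.357, 0.132],
    "k-NN": [0.561, 0.405, 0.262],
    "HGBoost": [0.920, 0.806, 0.736],
}


def soft_ensemble(scores):
    """Average closeness score of each classifier across strategies."""
    return {name: sum(vals) / len(vals) for name, vals in scores.items()}


def hard_ensemble(scores):
    """Sum of per-strategy points; points = number of scores at or below one's own."""
    n_strategies = len(next(iter(scores.values())))
    totals = dict.fromkeys(scores, 0)
    for s in range(n_strategies):
        column = [vals[s] for vals in scores.values()]
        for name, vals in scores.items():
            totals[name] += sum(1 for v in column if v <= vals[s])
    return totals


soft = soft_ensemble(closeness)
hard = hard_ensemble(closeness)
print(max(soft, key=soft.get))  # classifier with the highest average closeness
print(max(hard, key=hard.get))  # classifier with the highest point total
```

On these scores the soft ensemble favors Extra-Trees and the hard ensemble favors HGBoost, which is why both classifiers are carried forward in the comparison tables.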
Comparison of the proposed methods for COVID-19 cough detection.
| Category | Method | AUC | Precision | Recall |
|---|---|---|---|---|
| Asymptomatic | Proposed (Audio Features + Extra-Trees) | 0.83 | **0.75** | 0.74 |
| | Proposed (Audio Features + HGBoost) | **0.85** | 0.71 | **0.79** |
| Symptomatic | Proposed (Audio Features + Extra-Trees) | **0.83** | **1** | 0.74 |
| | Proposed (Audio Features + HGBoost) | 0.80 | 0.91 | **0.80** |
Bold values indicate the highest.
Fig. 3. Normalized confusion matrices of the Extra-Trees classifier with 10-fold cross-validation for all training strategies. Panels (a)–(c) show the confusion matrices for the asymptomatic category, and panels (d)–(f) show those for the symptomatic category. The values for each class sum to 1. Note that 0 represents COVID-19 and 1 represents Non-COVID-19 cough.
Fig. 4. Optimal number of features selected using recursive feature elimination with cross-validation (RFECV) for the Cambridge asymptomatic and symptomatic categories.
Comparison of our proposed approach with the state-of-the-art approaches.
| Dataset | Method | AUC | Precision | Recall | |
|---|---|---|---|---|---|
| Cambridge | Asymptomatic | Brown et al. | 0.80 | 0.72 | 0.69 |
| Proposed (RFECV + Extra-Trees) | 0.75 | ||||
| Proposed (RFECV + HGBoost) | 0.85 | 0.73 | |||
| Symptomatic | Brown et al. | 0.87 | 0.70 | 0.90 | |
| Muhammad et al. | - | 0.87 | 0.82 | ||
| Proposed (RFECV + Extra-Trees) | |||||
| Proposed (RFECV + HGBoost) | 0.81 | 0.93 | 0.80 | ||
| Coswara | Proposed (RFECV + Extra-Trees) | 0.64 | 0.70 | 0.58 | |
| Proposed (RFECV + HGBoost) | 0.66 | 0.76 | 0.47 | ||
| Virufy | Proposed (RFECV + Extra-Trees) | 0.92 | 0.88 | ||
| Proposed (RFECV + HGBoost) | |||||
| Virufy + NoCoCoDa | Melek | 0.99 | 0.97 | ||
| Proposed (RFECV + Extra-Trees) | 0.97 | 0.92 | |||
| Proposed (RFECV + HGBoost) | 0.98 | 0.99 | |||
| Combined dataset | Proposed (RFECV + Extra-Trees) | ||||
| Proposed (RFECV + HGBoost) | 0.78 | 0.66 | |||
Bold values indicate the highest.