Jawad Rasheed, Raed M Shubair.
Abstract
The global pandemic COVID-19 is still a cause of a health emergency in several parts of the world. Apart from standard testing techniques to identify positive cases, auxiliary tools based on artificial intelligence can help with the identification and containment of the disease. The need for the development of alternative smart diagnostic tools to combat the COVID-19 pandemic has become more urgent. In this study, a smart auxiliary framework based on machine learning (ML) is proposed; it can help medical practitioners in the identification of COVID-19-affected patients, among others with pneumonia and healthy individuals, and can help in monitoring the status of COVID-19 cases using X-ray images. We investigated the application of transfer-learning (TL) networks and various feature-selection techniques for improving the classification accuracy of ML classifiers. Three different TL networks were tested to generate relevant features from images; these TL networks include AlexNet, ResNet101, and SqueezeNet. The generated relevant features were further refined by applying feature-selection methods that include iterative neighborhood component analysis (iNCA), iterative chi-square (iChi2), and iterative maximum relevance-minimum redundancy (iMRMR). Finally, classification was performed using convolutional neural network (CNN), linear discriminant analysis (LDA), and support vector machine (SVM) classifiers. Moreover, the study exploited stationary wavelet (SW) transform to handle the overfitting problem by decomposing each image in the training set up to three levels. Furthermore, it enhanced the dataset, using various operations as data-augmentation techniques, including random rotation, translation, and shear operations. The analysis revealed that the combination of AlexNet, ResNet101, SqueezeNet, iChi2, and SVM was very effective in the classification of X-ray images, producing a classification accuracy of 99.2%. 
Similarly, AlexNet, ResNet101, and SqueezeNet, along with iChi2 and the proposed CNN network, yielded 99.0% accuracy. The results showed that the cascaded feature generator and selection strategies significantly affected the performance accuracy of the classifier.
Keywords: COVID-19; diagnostic tool; pneumonia; stationary wavelets transformation; transfer learning
Year: 2022; PMID: 35885839; PMCID: PMC9317294; DOI: 10.3390/healthcare10071313
Source DB: PubMed; Journal: Healthcare (Basel); ISSN: 2227-9032
Figure 1. Workflow of the proposed system.
Comparison of pre-trained transfer-learning models used as feature extractors with machine-learning classification algorithms (classification accuracy, %).
| Feature-Generation Model | CNN | DT | KNN | LDA | LR | RF | SVM |
|---|---|---|---|---|---|---|---|
| AlexNet | 95.6 | 90.7 | 88.8 | 95.0 | 91.9 | 90.4 | |
| DenseNet-121 | 95.0 | 90.4 | 87.5 | 94.8 | 91.9 | 91.0 | 94.8 |
| DenseNet-169 | 95.1 | 91.0 | 89.5 | 95.1 | 92.0 | 91.0 | 95.3 |
| DenseNet-201 | 95.1 | 90.1 | 87.6 | 94.5 | 90.7 | 88.9 | 95.3 |
| DenseNet-263 | 94.8 | 88.8 | 85.0 | 93.8 | 90.1 | 87.5 | 95.0 |
| EfficientNet-B0 | 92.9 | 86.6 | 8.9 | 90.7 | 86.6 | 86.8 | 91.5 |
| GoogleNet | 93.5 | 90.1 | 82.2 | 92.5 | 87.1 | 88.8 | 92.5 |
| InceptionNetV3 | 95.0 | 91.0 | 83.1 | 93.3 | 87.3 | 89.3 | 95.1 |
| MobileNetV2 | 90.2 | 87.5 | 80.1 | 89.9 | 85.0 | 87.5 | 90.1 |
| ResNet18 | 90.7 | 88.8 | 82.2 | 90.7 | 86.7 | 85.0 | 90.6 |
| ResNet50 | 92.5 | 86.6 | 82.6 | 91.9 | 87.1 | 85.5 | 92.9 |
| ResNet101 | | 91.9 | 85.0 | 95.9 | 90.7 | 87.8 | |
| ResNet152 | 95.1 | 91.2 | 85.3 | 95.0 | 90.7 | 90.4 | 95.3 |
| SqueezeNet | 96.1 | 93.5 | 85.0 | 96.1 | 91.5 | 92.2 | |
| VGG16 | 93.5 | 90.1 | 84.4 | 91.9 | 88.6 | 91.9 | 93.0 |
| VGG19 | 92.9 | 90.7 | 83.1 | 92.9 | 88.6 | 91.5 | 92.5 |
| XceptionNet | 91.0 | 87.5 | 80.5 | 88.8 | 87.5 | 86.8 | 91.0 |
CNN: convolutional neural network; DT: decision tree; KNN: k-nearest neighbors; LDA: linear discriminant analysis; LR: logistic regression; RF: random forest; SVM: support vector machine.
Data set information (number of instances).
| Clinical State | Training Set without Augmentation | Training Set with Augmentation | Validation Set | Testing Set |
|---|---|---|---|---|
| Normal | 2456 | 22,104 | 1024 | 614 |
| Other pneumonia | 2456 | 22,104 | 1024 | 614 |
| COVID-19+ | 2456 | 22,104 | 1024 | 614 |
Output coefficient for each level in 2-D stationary wavelet (SW) transform.
| Decomposition Level | Down Sampling | Approximate Coefficient (Low Frequency) | Detail Coefficient (High Frequency) |
|---|---|---|---|
| 1 | Yes (by 2) | | |
| 2 | - | | |
| 3 | - | | |
App: approximate coefficient (image); Ver: vertical coefficient; Hor: horizontal coefficient; Dia: diagonal coefficient of the SW transformed image.
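
The abstract states that each training image is decomposed with the stationary wavelet transform up to three levels. A minimal pure-Python sketch of such a decomposition is given here, assuming a Haar wavelet, periodic boundary handling, and averaging normalisation, none of which the source specifies. The function names are hypothetical, the band naming (Hor, Ver, Dia) follows one common convention, and the down-sampling step listed in the table is not modelled.

```python
def haar_pair(signal, step):
    """One undecimated Haar step on a 1-D list with periodic wrap-around.

    `step` is the spacing between the two taps; it doubles at each level
    (the "a trous" scheme), so the output keeps the input length.
    """
    n = len(signal)
    low = [(signal[i] + signal[(i + step) % n]) / 2 for i in range(n)]
    high = [(signal[i] - signal[(i + step) % n]) / 2 for i in range(n)]
    return low, high


def transpose(matrix):
    return [list(col) for col in zip(*matrix)]


def filter_rows(image, step):
    """Filter along each row (across columns): low-pass and high-pass images."""
    lows, highs = [], []
    for row in image:
        low, high = haar_pair(row, step)
        lows.append(low)
        highs.append(high)
    return lows, highs


def filter_cols(image, step):
    """Filter along each column by transposing, filtering rows, transposing back."""
    low, high = filter_rows(transpose(image), step)
    return transpose(low), transpose(high)


def swt2_haar(image, levels=3):
    """Return one dict of App/Hor/Ver/Dia bands per level (assumed naming)."""
    bands = []
    approx = image
    for level in range(1, levels + 1):
        step = 2 ** (level - 1)
        row_low, row_high = filter_rows(approx, step)
        app, hor = filter_cols(row_low, step)    # smooth rows, then low/high on columns
        ver, dia = filter_cols(row_high, step)   # difference rows, then low/high on columns
        bands.append({"App": app, "Hor": hor, "Ver": ver, "Dia": dia})
        approx = app  # the next level decomposes the current approximation
    return bands


# Toy 8x8 image with a single vertical edge (illustrative data only).
toy_image = [[0.0] * 4 + [1.0] * 4 for _ in range(8)]
toy_bands = swt2_haar(toy_image, levels=3)
```

Every band keeps the size of the input image, which is what distinguishes the stationary transform from the decimated discrete wavelet transform. In practice a tested library routine would be preferable to this sketch.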
Parametric values of data-augmentation techniques.
| Augmentation Technique | Parametric Values |
|---|---|
| Translation | −10, 10 |
| Rotation | −90, 90 |
| Shear | −30, 30 |
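
To illustrate how the ranges in this table could drive random augmentation, the sketch below samples one parameter set and builds the corresponding 2-D affine matrix. Several points are assumptions rather than details from the source: translation is taken to be in pixels, rotation and shear in degrees, sampling is uniform, and the transforms are composed as shear, then rotation, then translation. All function names are hypothetical.

```python
import math
import random

# Ranges copied from the table; the units are assumptions (not stated in the source).
TRANSLATION_RANGE = (-10, 10)  # assumed pixels
ROTATION_RANGE = (-90, 90)     # assumed degrees
SHEAR_RANGE = (-30, 30)        # assumed degrees


def sample_params(rng):
    """Draw one random augmentation parameter set (uniform sampling assumed)."""
    return {
        "tx": rng.uniform(*TRANSLATION_RANGE),
        "ty": rng.uniform(*TRANSLATION_RANGE),
        "angle_deg": rng.uniform(*ROTATION_RANGE),
        "shear_deg": rng.uniform(*SHEAR_RANGE),
    }


def matmul3(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def affine_matrix(tx, ty, angle_deg, shear_deg):
    """Homogeneous matrix applying shear, then rotation, then translation."""
    a = math.radians(angle_deg)
    s = math.tan(math.radians(shear_deg))
    rotation = [[math.cos(a), -math.sin(a), 0.0],
                [math.sin(a), math.cos(a), 0.0],
                [0.0, 0.0, 1.0]]
    shear = [[1.0, s, 0.0],
             [0.0, 1.0, 0.0],
             [0.0, 0.0, 1.0]]
    translation = [[1.0, 0.0, tx],
                   [0.0, 1.0, ty],
                   [0.0, 0.0, 1.0]]
    return matmul3(translation, matmul3(rotation, shear))


def transform_point(matrix, x, y):
    """Map a pixel coordinate through the affine matrix."""
    return (matrix[0][0] * x + matrix[0][1] * y + matrix[0][2],
            matrix[1][0] * x + matrix[1][1] * y + matrix[1][2])


# One randomly drawn transform; the seed is fixed only for reproducibility.
params = sample_params(random.Random(0))
matrix = affine_matrix(**params)
```

An image library would then resample the X-ray through `matrix`; that step is omitted here because the source does not say which toolkit was used.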
Performance metrics for the convolutional neural network against each feature-selection technique.
| Feature Selector | Statistics | Precision | Recall | F1-Score | Accuracy |
|---|---|---|---|---|---|
| iNCA | Minimum | 98.536 | 98.534 | 98.535 | 98.534 |
| | Maximum | 99.078 | 99.077 | 99.077 | 99.077 |
| | Average | 98.810 | 98.806 | 98.808 | 98.806 |
| iChi2 | Minimum | 98.915 | 98.914 | 98.914 | 98.914 |
| | Maximum | 99.133 | 99.132 | 99.132 | 99.131 |
| | Average | | | | |
| iMRMR | Minimum | 98.489 | 98.154 | 98.322 | 98.154 |
| | Maximum | 98.914 | 98.914 | 98.914 | 98.914 |
| | Average | 98.535 | 98.534 | 98.534 | 98.534 |
iNCA: iterative neighborhood component analysis; iChi2: iterative chi-square; iMRMR: iterative maximum relevance–minimum redundancy.
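
The four reported metrics can all be derived from a confusion matrix. The sketch below assumes macro-averaged precision and recall and takes F1 as the harmonic mean of those two averages. The source does not state its averaging scheme, although the tabulated F1 values are consistent with this reading. The confusion matrix is hypothetical and only mirrors the balanced test set of 614 instances per class.

```python
def macro_metrics(cm):
    """Compute macro precision, macro recall, F1 and accuracy (all in %).

    cm[i][j] is the number of samples of true class i predicted as class j.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    # Recall per class: correct predictions over all samples of that true class.
    recalls = [cm[i][i] / sum(cm[i]) for i in range(n)]
    # Precision per class: correct predictions over all samples predicted as that class.
    precisions = [cm[i][i] / sum(cm[r][i] for r in range(n)) for i in range(n)]
    precision = 100 * sum(precisions) / n
    recall = 100 * sum(recalls) / n
    # Assumption: F1 is the harmonic mean of the macro-averaged values.
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = 100 * sum(cm[i][i] for i in range(n)) / total
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}


# Hypothetical 3-class confusion matrix (Normal, Pneumonia, COVID-19);
# each row sums to 614. These counts are NOT taken from the paper.
example_cm = [
    [607, 4, 3],
    [3, 608, 3],
    [2, 3, 609],
]
example_metrics = macro_metrics(example_cm)
```

With equally sized classes, macro-averaged recall equals overall accuracy, which may explain why the Recall and Accuracy columns nearly coincide in these tables.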
Performance metrics for linear discriminant analysis against each feature-selection technique.
| Feature Selector | Statistics | Precision | Recall | F1-Score | Accuracy |
|---|---|---|---|---|---|
| iNCA | Minimum | 97.790 | 97.768 | 97.779 | 97.774 |
| | Maximum | 98.752 | 98.751 | 98.751 | 98.751 |
| | Average | | | | |
| iChi2 | Minimum | 97.674 | 97.666 | 97.670 | 97.666 |
| | Maximum | 98.207 | 98.209 | 98.208 | 98.208 |
| | Average | 97.535 | 97.515 | 97.525 | 97.515 |
| iMRMR | Minimum | 96.526 | 96.526 | 96.526 | 96.526 |
| | Maximum | 97.927 | 97.720 | 97.724 | 97.720 |
| | Average | 97.124 | 97.123 | 97.123 | 97.123 |
iNCA: iterative neighborhood component analysis; iChi2: iterative chi-square; iMRMR: iterative maximum relevance–minimum redundancy.
Performance metrics for the support vector machine against each feature-selection technique.
| Feature Selector | Statistics | Precision | Recall | F1-Score | Accuracy |
|---|---|---|---|---|---|
| iNCA | Minimum | 98.756 | 98.751 | 98.754 | 98.751 |
| | Maximum | 99.191 | 99.186 | 99.189 | 99.186 |
| | Average | 98.969 | 98.969 | 98.969 | 98.969 |
| iChi2 | Minimum | 99.024 | 99.023 | 99.023 | 99.023 |
| | Maximum | 99.462 | 99.457 | 99.460 | 99.457 |
| | Average | | | | |
| iMRMR | Minimum | 98.534 | 98.534 | 98.534 | 98.534 |
| | Maximum | 99.078 | 99.077 | 99.078 | 99.077 |
| | Average | 98.806 | 98.806 | 98.806 | 98.806 |
iNCA: iterative neighborhood component analysis; iChi2: iterative chi-square; iMRMR: iterative maximum relevance–minimum redundancy.
Error of omission (%) of the convolutional neural network (CNN), linear discriminant analysis (LDA), and support vector machine (SVM) for each class under each feature-selection technique.
| Feature Selector | Class Label | CNN | LDA | SVM |
|---|---|---|---|---|
| iNCA | Normal | 1.63 | 2.12 | 1.14 |
| | Pneumonia | 0.81 | 1.63 | 1.14 |
| | COVID-19 | 1.14 | 1.47 | 0.81 |
| iChi2 | Normal | 1.14 | 2.12 | 0.98 |
| | Pneumonia | 0.98 | 2.44 | 0.65 |
| | COVID-19 | 0.81 | 1.63 | 0.65 |
| iMRMR | Normal | 1.47 | 3.09 | 1.30 |
| | Pneumonia | 1.79 | 3.26 | 1.30 |
| | COVID-19 | 1.14 | 2.28 | 0.98 |
iNCA: iterative neighborhood component analysis; iChi2: iterative chi-square; iMRMR: iterative maximum relevance–minimum redundancy.
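
The omission errors can be related back to raw counts. The sketch below assumes the usual definition, which the source does not spell out: the share of a class's test instances that the classifier fails to assign to that class. It also assumes the 614 test instances per class listed in the data set table. The helper names are hypothetical.

```python
TEST_INSTANCES_PER_CLASS = 614  # per-class test set size from the data set table


def omission_error(missed, total=TEST_INSTANCES_PER_CLASS):
    """Error of omission in percent: missed instances over all instances of the class."""
    return 100.0 * missed / total


def missed_from_error(error_pct, total=TEST_INSTANCES_PER_CLASS):
    """Recover the (integer) number of missed instances from a rounded percentage."""
    return round(error_pct * total / 100)


# iChi2 + SVM omission errors as reported in the table above.
svm_ichi2_errors = {"Normal": 0.98, "Pneumonia": 0.65, "COVID-19": 0.65}

missed_counts = {label: missed_from_error(err) for label, err in svm_ichi2_errors.items()}

# Overall accuracy implied by those counts across the three balanced classes.
total_test = TEST_INSTANCES_PER_CLASS * len(svm_ichi2_errors)
implied_accuracy = 100.0 * (1 - sum(missed_counts.values()) / total_test)
```

Under these assumptions the iChi2 and SVM row corresponds to 6, 4, and 4 missed instances, and the implied overall accuracy rounds to 99.2%, in line with the figure quoted in the abstract.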
Comparative analysis of the proposed study with prior works.
| Study | Techniques | Accuracy (%) |
|---|---|---|
| [ | ResNet50 feature extractor with SVM | 95.33 |
| [ | SMOTE and ResNet152 with XGBoost and random forest | 97.70 |
| [ | Customized CNN-based network | 84.22 |
| [ | VGG-16-based scheme | 97.0 |
| [ | Customized Xception Net | 95.0 |
| [ | CNN with transfer multireceptive feature optimizer | 95.1 |
| [ | Cascaded ResNet50V2 and Xception Net | 91.4 |
| [ | Customized CNN-based model | 93.30 |
| [ | Pre-trained deep learning models with GAN | 85.2 |
SVM: support vector machine; CNN: convolutional neural network; GAN: generative adversarial network; SWT: stationary wavelet transform; iChi2: iterative chi-square.