Automated detection of COVID-19 cases from chest X-ray images using deep neural network and XGBoost.

H Nasiri, S Hasani.

Abstract

INTRODUCTION: After the COVID-19 pandemic emerged in late 2019, many researchers tried to develop methods for detecting COVID-19 cases. Accordingly, this study focused on identifying patients with COVID-19 from chest X-ray images.
METHODS: In this paper, a method for diagnosing coronavirus disease from X-ray images was developed. In this method, DenseNet169 Deep Neural Network (DNN) was used to extract the features of X-ray images taken from the patients' chests. The extracted features were then given as input to the Extreme Gradient Boosting (XGBoost) algorithm to perform the classification task.
RESULTS: Evaluation of the proposed approach and its comparison with the methods presented in recent years revealed that this method was more accurate and faster than the existing ones and had an acceptable performance for detecting COVID-19 cases from X-ray images. The experiments showed 98.23% and 89.70% accuracy, 99.78% and 100% specificity, 92.08% and 95.20% sensitivity in two and three-class problems, respectively.
CONCLUSION: This study aimed to detect people with COVID-19, focusing on non-clinical approaches. The developed method could be employed as an initial detection tool to assist radiologists in diagnosing the disease more accurately and quickly.
IMPLICATION FOR PRACTICE: The proposed method's simple implementation, along with its acceptable accuracy, allows it to be used in COVID-19 diagnosis. Moreover, gradient-based class activation mapping (Grad-CAM) can be used to represent the deep neural network's decision area on a heatmap. Radiologists might use this heatmap to evaluate the chest area more accurately.
Copyright © 2022 The College of Radiographers. Published by Elsevier Ltd. All rights reserved.

Keywords:  COVID-19; Chest X-ray images; Deep neural network (DNN); DenseNet169; XGBoost

Year:  2022        PMID: 35410707      PMCID: PMC8958100          DOI: 10.1016/j.radi.2022.03.011

Source DB:  PubMed          Journal:  Radiography (Lond)        ISSN: 1078-8174


Introduction

COVID-19 was first reported in Wuhan, China, in late December 2019, initially as an illness of unknown cause, after which it spread rapidly throughout the world [1, 2, 3]. The virus reached most parts of China within 30 days. The infectious disease caused by this virus was named COVID-19 by the World Health Organization (WHO) on February 11, 2020. COVID-19 was reported in Iran on February 21, 2020. By July 20, 2021, about 3.5 million confirmed cases had been identified in Iran and about 192 million worldwide. Most types of coronavirus affect animals, but they can also be transmitted to humans. The Severe Acute Respiratory Syndrome-associated coronavirus (SARS-CoV) causes severe respiratory disease and death in humans. The well-known signs and symptoms of COVID-19 include fever, cough, sore throat, headache, fatigue, muscle pain, and shortness of breath. Since the onset of the pandemic, COVID-19 has directly affected most communities, including human health, social welfare, businesses, and social relationships. It has also had indirect effects, such as reducing the quality of education in schools and universities, weakening family relationships, and decreasing sports activities. The most common method of diagnosing COVID-19 is the Real-Time Reverse Transcription-Polymerase Chain Reaction (RT-PCR) assay. However, this approach is time-consuming, and its results may have a high rate of false negatives. Alternatively, chest radiographic imaging methods, such as Computed Tomography (CT) and X-ray, can play a vital role in the timely diagnosis and treatment of this disease, especially in pregnant women and children. Chest X-ray Radiograph (CXR) images are mostly used to diagnose chest pathology and have rarely been applied to detect COVID-19.
This research was conducted on CXR images because radiographic imaging devices are available in most hospitals and specialized clinics. According to previous studies, radiological images of patients with COVID-19 carry important and useful information for identifying the virus in the body. However, one disadvantage of CXR images is that they cannot resolve soft tissues with poor contrast and therefore cannot determine the degree of a patient's lung involvement. To compensate for this shortcoming, Computer-Aided Diagnosis (CAD) systems can be employed. Most CAD systems depend on Graphics Processing Units (GPUs), which are used to run medical image processing algorithms, such as image enhancement and limb or tumor segmentation. The development of artificial intelligence, especially branches such as Machine Learning and Deep Learning (DL), has made this process more intelligent than is possible with human judgment alone. Artificial intelligence has also significantly increased the speed of diagnosis and even treatment in fields such as the medical sciences. For instance, it has contributed effectively to diagnosing lung and cardiovascular diseases and to performing brain surgery. Advances in DL have shown promising results in medical image analysis and radiology. DL has various architectures, each with a variety of applications. One such architecture is the Deep Convolutional Neural Network (DCNN), which is used specifically in image processing; its applications include pattern recognition and image classification. Depending on the problem involved, DCNNs can be used in many different ways. One of these is to use pre-trained neural networks, which is the approach taken in this research.
In this approach, freely available pre-trained models are employed, and image features are extracted using DNNs. The second step, following the extraction of image features, is to apply a classification method. Among the various classification methods, such as Support Vector Machines (SVMs) and Decision Trees, the XGBoost classifier was applied in this paper.
Apostolopoulos et al. carried out a study on a set of X-ray images from patients with pneumonia, patients with COVID-19, and healthy individuals to assess the performance of Convolutional Neural Networks (CNNs). Transfer learning was utilized, and the research was done in 3 stages. The results demonstrated that DL could extract significant features related to COVID-19. In another study, Wang et al. presented COVID-Net, a DCNN for COVID-19 detection from X-ray images. Their proposed network could help physicians during the screening phase. Sethy et al. utilized DL and SVM to detect patients with coronavirus using X-ray images; SVM was chosen for the classification step because it is a powerful approach. Hemdan et al. proposed COVIDX-Net, which consisted of VGG16 and Google MobileNet. Mishra et al. proposed a decision fusion approach, which combined predictions from several DCNNs to identify COVID-19 from chest CT images. Narin et al. proposed five models for diagnosing patients with pneumonia and coronavirus from X-ray images. These models were based on pre-trained CNNs, namely ResNet 152, ResNet 101, ResNet 50, Inception-ResNetV2, and InceptionV3. Pandit et al. employed the pre-trained VGG-16 to detect COVID-19 from chest radiographs and achieved 96% accuracy. Song et al. developed a system to identify patients with COVID-19 using CT images collected from hospitals in two provinces of China.
Javadi Moghaddam and Gholamalinejad developed a novel DL structure for COVID-19 detection. The pooling layer of their proposed structure was a combination of pooling and the Squeeze Excitation Block layer. They also used the Mish function for convergence optimization. Ozturk et al. proposed a new model for detecting COVID-19 from X-ray images. Their model addressed two problems: binary classification (distinguishing COVID-19 from the "no-finding" class) and multi-class classification (distinguishing the COVID-19, pneumonia, and "no-finding" classes). They proposed a DCNN called DarkCovidNet, which included 17 convolutional layers, and achieved 98.08% and 87.02% accuracy in binary and multi-class classification, respectively. The rest of the paper is organized as follows: the Methodology section describes the dataset, the XGBoost algorithm, and the proposed method. The Results and Discussion sections present and discuss the experimental results. Finally, the Conclusion section concludes the paper.

Methodology

In this section, first, the dataset used in this study is presented. Then the XGBoost algorithm employed for classification in our proposed method is introduced, and finally, the proposed method is presented.

Dataset

As in the study by Ozturk et al., X-ray images from two different sources were used as the dataset in this paper: (1) the COVID-19 X-ray image dataset collected by Cohen, and (2) the ChestX-ray8 dataset collected by Wang et al. Cohen collected images from public sources and through indirect collection from hospitals and physicians. This project was approved by the University of Montreal's Ethics Committee. Fig. 1 depicts sample images from this dataset. There are 43 female and 82 male cases in the dataset, and the subjects' average age is approximately 55 years.
Figure 1

Sample images in Cohen's dataset.

The ChestX-ray8 dataset was employed for normal and pneumonia images. This dataset consists of 108,948 frontal-view X-ray images of 32,717 unique patients, from which 500 no-findings and 500 pneumonia chest X-ray images were selected randomly. Overall, the dataset used in this study contained 1125 chest X-ray images: 125 labeled as COVID-19, 500 labeled as pneumonia, and 500 labeled as no findings. Fig. 2 shows sample images from the ChestX-ray8 dataset.
Figure 2

Sample images in the ChestX-ray 8 dataset.


XGBoost

XGBoost is an efficient and scalable tree boosting algorithm proposed by Chen & Guestrin in 2016. It is an improved version of the Gradient Boosted Decision Tree (GBDT) method and has been shown to overcome GBDT's computational limitations [41, 42, 43]. GBDT uses the first-order Taylor expansion of the loss function, while XGBoost uses the second-order Taylor expansion [44, 45, 46]. In addition, the objective function in XGBoost includes a regularization term that limits the model's complexity and helps prevent overfitting.
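For reference, the second-order approximation and the regularization term mentioned above can be written out as follows. This is the standard formulation from Chen & Guestrin's XGBoost paper, not notation taken from this article; the symbols are introduced here for illustration. At boosting round $t$, $f_t$ is the new tree, $l$ is the loss, $g_i$ and $h_i$ are the first and second derivatives of $l$ with respect to the previous prediction $\hat{y}_i^{(t-1)}$, $T$ is the number of leaves, and $w$ is the vector of leaf weights.

```latex
\mathcal{L}^{(t)} \simeq \sum_{i=1}^{n} \Big[\, l\big(y_i, \hat{y}_i^{(t-1)}\big) + g_i f_t(x_i) + \tfrac{1}{2} h_i f_t^{2}(x_i) \Big] + \Omega(f_t),
\qquad
\Omega(f) = \gamma T + \tfrac{1}{2} \lambda \lVert w \rVert^{2},
\qquad
g_i = \frac{\partial\, l\big(y_i, \hat{y}_i^{(t-1)}\big)}{\partial\, \hat{y}_i^{(t-1)}},
\quad
h_i = \frac{\partial^{2} l\big(y_i, \hat{y}_i^{(t-1)}\big)}{\partial \big(\hat{y}_i^{(t-1)}\big)^{2}}
```

Dropping the $h_i$ term gives the first-order view used by plain GBDT, and setting $\gamma = 0$ removes the per-leaf penalty from $\Omega$.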

Proposed method

Building on previous research and on common uses of artificial intelligence in image processing, especially for medical images, the proposed method uses image features extracted by pre-trained networks. This is an application of transfer learning, in which networks are designed and trained on a very large dataset and the resulting layer weights are reused. For example, in image processing, the ImageNet dataset contains millions of images in 1000 different classes. Pre-trained networks can be used in several ways:
1. Using the structure of a pre-designed network to train one's own model: the last layer of the network is removed, and new layers are added to perform classification.
2. Extracting image features with a pre-trained model and passing the extracted features to another algorithm for classification.
In this paper, the features were extracted using the second method, and the XGBoost classifier was employed for classification. The images were first given as input to the DenseNet169 DNN so that the network could extract image features. The extracted features were then given as input to the XGBoost algorithm to perform the classification. The framework of the proposed method can be seen in Fig. 3. The proposed method was implemented using Python 3.8 and Keras 2.4 (i.e., the Python deep learning API).
Figure 3

Framework of the proposed method.
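The two-stage framework can be sketched roughly as below. This is a minimal illustration, not the authors' code: it assumes the `tensorflow.keras` and `xgboost` packages, and it assumes global average pooling (`pooling="avg"`) as the way the 1664-dimensional feature vector reported in the Results section is obtained. The function names are hypothetical. The XGBoost values are those listed in Table 2, mapped onto the keyword names of xgboost's scikit-learn interface.

```python
"""Sketch: frozen DenseNet169 features -> XGBoost classifier (assumed details)."""

INPUT_SHAPE = (224, 224, 3)  # input size reported in the Results section
FEATURE_DIM = 1664           # feature vector length reported in the Results section

# Values from Table 2; keyword names follow xgboost's scikit-learn API.
XGB_PARAMS = {
    "booster": "gbtree",
    "tree_method": "exact",
    "n_estimators": 100,
    "learning_rate": 0.44,
    "gamma": 0,
    "max_depth": 6,
}


def build_extractor():
    """Return an ImageNet-pretrained DenseNet169 without its classification head."""
    from tensorflow.keras.applications import DenseNet169

    # Assumption: average pooling collapses the last feature map to 1664 values.
    return DenseNet169(include_top=False, weights="imagenet",
                       pooling="avg", input_shape=INPUT_SHAPE)


def extract_features(extractor, images):
    """images: NumPy array of shape (n, 224, 224, 3), pixel values in [0, 255]."""
    from tensorflow.keras.applications.densenet import preprocess_input

    # astype() copies the array, so the caller's images are not modified.
    batch = preprocess_input(images.astype("float32"))
    return extractor.predict(batch, verbose=0)


def train_classifier(features, labels):
    """Fit XGBoost on extracted features; the DNN itself is never trained."""
    from xgboost import XGBClassifier

    classifier = XGBClassifier(**XGB_PARAMS)
    classifier.fit(features, labels)
    return classifier
```

A typical call sequence would be `extractor = build_extractor()`, then `features = extract_features(extractor, images)`, then `train_classifier(features, labels)` with integer class labels. The heavy imports sit inside the functions so that the configuration can be inspected without loading TensorFlow.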


Results

This research was done in two phases. In the first phase, the best pre-trained DNN for feature extraction was selected, and in the second phase, the XGBoost classifier parameters were set by trial and error. The dataset was used in two settings: a 2-class problem comprising COVID-19 and no findings (625 images), and a 3-class problem comprising COVID-19, pneumonia, and no findings (1125 images). In the first phase, 17 pre-trained neural networks were assessed, with the XGBoost classifier run with its default parameters. Table 1 shows the average accuracy of each DNN on the 2-class and 3-class problems. A 5-fold cross-validation was applied to obtain the average accuracies in this experiment.
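The 5-fold averaging described above can be illustrated with a small, dependency-free sketch. The article does not say whether the folds were stratified or which library produced them, so this version uses a plain shuffled split. `majority_baseline` is a hypothetical placeholder standing in for the feature extraction and XGBoost step.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Shuffle sample indices once and deal them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_val_accuracy(fit_predict, X, y, k=5, seed=0):
    """Average test accuracy (%) over k folds.

    fit_predict(X_train, y_train, X_test) must return one label per test sample.
    """
    folds = k_fold_indices(len(X), k, seed)
    scores = []
    for i, test_idx in enumerate(folds):
        # Every fold except the i-th one is used for training.
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        predictions = fit_predict([X[j] for j in train_idx],
                                  [y[j] for j in train_idx],
                                  [X[j] for j in test_idx])
        correct = sum(p == y[j] for p, j in zip(predictions, test_idx))
        scores.append(100.0 * correct / len(test_idx))
    return sum(scores) / k


def majority_baseline(X_train, y_train, X_test):
    """Placeholder classifier: always predicts the most frequent training label."""
    most_common = max(set(y_train), key=y_train.count)
    return [most_common] * len(X_test)


# Label distribution of the 2-class problem: 125 COVID-19, 500 no findings.
labels = ["covid"] * 125 + ["no_findings"] * 500
dummy_inputs = list(range(len(labels)))
baseline_accuracy = cross_val_accuracy(majority_baseline, dummy_inputs, labels)
```

With this label distribution the majority baseline reaches 80% accuracy, which gives a floor for reading the 2-class accuracies, since the classes are unbalanced.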
Table 1

Comparison of the average accuracies of the different DNNs.

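| DNN | Average accuracy (%), three-class problem | Average accuracy (%), two-class problem |
|---|---|---|
| Xception | 78.84 | 93.59 |
| VGG16 | 81.68 | 96.48 |
| VGG19 | 80.08 | 95.36 |
| ResNet 50 | 80.71 | 95.51 |
| ResNet 152 | 79.55 | 95.68 |
| ResNet50V2 | 80.53 | 94.71 |
| ResNet101V2 | 76.88 | 93.95 |
| ResNet152V2 | 77.60 | 93.59 |
| InceptionV3 | 79.02 | 92.79 |
| InceptionResNetV2 | 68.44 | 90.72 |
| MobileNet | 79.55 | 95.51 |
| MobileNetV2 | 82.57 | 96.16 |
| DenseNet121 | 82.51 | 96.32 |
| DenseNet169 | 83.02 | 97.43 |
| DenseNet201 | 82.31 | 96.63 |
| NASNetMobile | 74.57 | 93.11 |
| EfficientNetB0 | 80.00 | 97.28 |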
As can be seen, the DenseNet169 network has the best accuracy in both cases. As a result, this network was selected to extract features in the second phase. The input to this network consisted of images with dimensions 224 × 224 × 3, and its output consisted of 1664 features extracted from each image. After the network was chosen, the parameters of the XGBoost classifier were set. Table 2 shows the parameters used in the XGBoost algorithm.
Table 2

The XGBoost parameter settings.

| Parameter | Value |
|---|---|
| Base learner | Gradient boosted tree |
| Tree construction algorithm | Exact greedy |
| Number of gradient boosted trees | 100 |
| Learning rate (η) | 0.44 |
| Lagrange multiplier (γ) | 0 |
| Maximum depth of trees | 6 |
A 5-fold cross-validation was used for the 2-class problem. For the 3-class problem, 80% of the dataset was used for training, and the remaining 20% was used as the test set. The average accuracy for the 2-class problem was 98.23%, and the test accuracy for the 3-class problem was 89.70%. The confusion matrices for each of the five folds in the 2-class problem are shown in Fig. 4, and the confusion matrix for the 3-class problem is shown in Fig. 5.
Figure 4

Confusion matrices for the 2-class problem.

Figure 5

Confusion matrix for the 3-class problem.

The proposed approach is compared with the method proposed by Ozturk et al. for the 3-class and 2-class problems in Table 3 and Table 4, respectively. A comparison of the results obtained in this study with those of other proposed methods is given in Table 5.
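As a reminder of how the metrics in the following tables relate to a confusion matrix, the sketch below computes them for a binary problem with COVID-19 as the positive class. The counts in `example` are hypothetical and are not read from Fig. 4, and the article does not state how precision and F1 were aggregated across classes, so this is only one plausible convention. The last lines simply average the per-fold accuracies listed for the proposed method in Table 4.

```python
def binary_metrics(tp, fn, fp, tn):
    """Return sensitivity, specificity, precision, F1-score and accuracy in %."""
    sensitivity = tp / (tp + fn)   # share of positive cases that were detected
    specificity = tn / (tn + fp)   # share of negative cases that were cleared
    precision = tp / (tp + fp)     # share of positive predictions that were right
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    metrics = {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
        "accuracy": accuracy,
    }
    return {name: round(100 * value, 2) for name, value in metrics.items()}


# Hypothetical fold of 125 images: 25 COVID-19 (one missed), 100 no findings.
example = binary_metrics(tp=24, fn=1, fp=0, tn=100)

# Per-fold accuracies of the proposed method as listed in Table 4.
fold_accuracy = [99.20, 99.20, 99.20, 95.20, 98.40]
mean_accuracy = round(sum(fold_accuracy) / len(fold_accuracy), 2)
```

The mean of the five fold accuracies comes out at 98.24, matching the Average column of Table 4.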
Table 3

Comparison of the proposed method with DarkCovidNet (3-class problem).

| Metric (%) | Proposed Method | DarkCovidNet |
|---|---|---|
| Sensitivity | 95.20 | 88.17 |
| Specificity | 100 | 93.66 |
| Precision | 92.50 | 90.97 |
| F1-score | 91.20 | 89.44 |
| Accuracy | 89.70 | 89.33 |
Table 4

Comparison of the proposed method with DarkCovidNet (2-class problem).

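| Performance metric | Method | Fold 1 | Fold 2 | Fold 3 | Fold 4 | Fold 5 | Average |
|---|---|---|---|---|---|---|---|
| Sensitivity | Proposed Method | 95.20 | 95.40 | 96.70 | 81.40 | 91.40 | 92.08 |
| Sensitivity | DarkCovidNet | 100 | 96.42 | 90.47 | 93.75 | 93.18 | 95.13 |
| Specificity | Proposed Method | 100 | 100 | 100 | 89.90 | 100 | 99.78 |
| Specificity | DarkCovidNet | 100 | 96.42 | 90.47 | 93.75 | 93.18 | 95.30 |
| Precision | Proposed Method | 99.50 | 99.50 | 99.40 | 95.30 | 99.02 | 98.54 |
| Precision | DarkCovidNet | 100 | 94.52 | 98.14 | 98.57 | 98.58 | 98.03 |
| F1-score | Proposed Method | 98.50 | 98.50 | 98.20 | 92.50 | 97.30 | 97.00 |
| F1-score | DarkCovidNet | 100 | 95.52 | 93.79 | 95.93 | 95.62 | 96.51 |
| Accuracy | Proposed Method | 99.20 | 99.20 | 99.20 | 95.20 | 98.40 | 98.24 |
| Accuracy | DarkCovidNet | 100 | 97.60 | 96.80 | 97.60 | 97.60 | 98.08 |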
Table 5

Comparison of the proposed method with other DL-based methods.

| Study | Type of images | Number of samples | Method used | Accuracy (%) |
|---|---|---|---|---|
| Apostolopoulos et al. [28] | Chest X-ray | 1428 | VGG-19 | 93.48 |
| Wang et al. [29] | Chest X-ray | 13,645 | COVID-Net | 92.40 |
| Sethy et al. [30] | Chest X-ray | 50 | ResNet 50 + SVM | 95.38 |
| Hemdan et al. [31] | Chest X-ray | 50 | COVIDX-Net | 90.00 |
| Narin et al. [33] | Chest X-ray | 100 | Deep CNN ResNet-50 | 98.00 |
| Song et al. [35] | Chest CT | 1485 | DRE-Net | 86.00 |
| Wang et al. [48] | Chest CT | 453 | M-Inception | 82.90 |
| Zheng et al. [49] | Chest CT | 542 | UNet + 3D Deep Network | 90.80 |
| Xu et al. [50] | Chest CT | 443 | ResNet + Location Attention | 86.60 |
| Ozturk et al. [12] | Chest X-ray | 625 | DarkCovidNet | 98.08 |
| | | 1125 | | 89.33 |
| Proposed Method | Chest X-ray | 625 | DenseNet169 + XGBoost | 98.24 |
| | | 1125 | | 89.70 |

Discussion

As can be seen in Table 3 and Table 4, the proposed method performs better than the DarkCovidNet network in both the 3-class and 2-class problems. Notably, the proposed approach was faster and had lower computational complexity than the method presented by Ozturk et al. because it did not require training the DNN; only the XGBoost classifier was trained. Table 5 shows that the proposed method is more accurate than other DL-based models. However, it should be noted that the results presented in Table 5 were obtained from different datasets. This study's limitations include the use of an unbalanced dataset with a limited number of COVID-19 X-ray images and low sensitivity in the two-class problem. To compare the performance of XGBoost with other machine learning algorithms, we replaced XGBoost with Random Forest and with SVM as the classifier and repeated the experiments. A linear kernel was used for the SVM. Table 6 shows the results of this comparison. As can be seen, XGBoost outperforms the other machine learning algorithms in both the 2-class and 3-class problems.
Table 6

Comparison of the different machine learning algorithms.

| Method | Accuracy (%), 2-class problem | Accuracy (%), 3-class problem |
|---|---|---|
| DenseNet169 + XGBoost | 98.24 | 89.70 |
| DenseNet169 + Random Forest | 95.85 | 80.15 |
| DenseNet169 + SVM | 96.96 | 79.20 |
To further analyze the proposed method, gradient-based class activation mapping (Grad-CAM) was used to represent the decision area on a heatmap. Fig. 6 depicts the heatmaps for two confirmed COVID-19 cases. As can be seen, the developed method extracted relevant features, and the model concentrated mainly on the lung area. Radiologists can employ the heatmap to evaluate the chest area more accurately.
Figure 6

Heatmap of two confirmed COVID-19 cases.
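For readers unfamiliar with Grad-CAM, its standard definition is given below. The notation comes from the original Grad-CAM formulation, not from this article: $A^{k}$ is the $k$-th feature map of a chosen convolutional layer, $y^{c}$ is the score for class $c$, and $Z$ is the number of spatial positions in the feature map. The article does not describe how $y^{c}$ was obtained for the DenseNet169 + XGBoost pipeline or which layer was used, so both are left open here.

```latex
\alpha_k^{c} = \frac{1}{Z} \sum_{i} \sum_{j} \frac{\partial y^{c}}{\partial A_{ij}^{k}},
\qquad
L_{\text{Grad-CAM}}^{c} = \operatorname{ReLU}\!\Big( \sum_{k} \alpha_k^{c} A^{k} \Big)
```

The weights $\alpha_k^{c}$ measure how strongly each feature map influences the class score, and the ReLU keeps only the regions that contribute positively, which is what the heatmap overlays on the X-ray.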


Conclusion

This study aimed to detect and identify people with COVID-19, focusing on non-clinical approaches and artificial intelligence techniques. In the proposed method, DenseNet169 was employed to extract image features, and the XGBoost algorithm was used for classification. The obtained results revealed that the detection accuracy of the proposed method in the 2-class problem was 98.24%, which was higher than other proposed methods. Also, 89.70% accuracy was reached in the 3-class problem, thus indicating better performance compared to the DarkCovidNet network. Besides being highly accurate, the proposed approach had a higher speed and lower computational complexity than the other proposed methods since it did not require the training of DNN.

Funding statement

This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.

Conflict of interest statement

The authors declare that they have no conflicts of interest.