Improving the performance of CNN to predict the likelihood of COVID-19 using chest X-ray images with preprocessing algorithms.

Morteza Heidari¹, Seyedehnafiseh Mirniaharikandehei², Abolfazl Zargari Khuzani³, Gopichandh Danala², Yuchen Qiu², Bin Zheng².

Abstract

OBJECTIVE: This study aims to develop and test a new computer-aided diagnosis (CAD) scheme of chest X-ray images to detect coronavirus (COVID-19) infected pneumonia.
METHOD: The CAD scheme first applies two image preprocessing steps: removing the majority of diaphragm regions, and processing the resulting image with a histogram equalization algorithm and a bilateral low-pass filter. Then, the original image and the two filtered images are used to form a pseudo color image. This image is fed into the three input channels of a transfer learning-based convolutional neural network (CNN) model to classify chest X-ray images into 3 classes: COVID-19 infected pneumonia, other community-acquired non-COVID-19 infected pneumonia, and normal (non-pneumonia) cases. To build and test the CNN model, a publicly available dataset of 8474 chest X-ray images is used, which includes 415, 5179, and 2880 cases in the three classes, respectively. The dataset is randomly divided into 3 subsets (training, validation, and testing), preserving the same frequency of cases in each class, to train and test the CNN model.
RESULTS: The CNN-based CAD scheme yields an overall accuracy of 94.5 % (2404/2544) with a 95 % confidence interval of [0.93,0.96] in classifying 3 classes. CAD also yields 98.4 % sensitivity (124/126) and 98.0 % specificity (2371/2418) in classifying cases with and without COVID-19 infection. However, without using two preprocessing steps, CAD yields a lower classification accuracy of 88.0 % (2239/2544).
CONCLUSION: This study demonstrates that adding two image preprocessing steps and generating a pseudo color image play an important role in developing a deep learning CAD scheme of chest X-ray images to improve accuracy in detecting COVID-19 infected pneumonia.

Keywords: COVID-19 diagnosis; Computer-aided diagnosis; Convolution neural network (CNN); Coronavirus; Disease classification; VGG16 network

Year: 2020 PMID: 32992136 PMCID: PMC7510591 DOI: 10.1016/j.ijmedinf.2020.104284

Source DB: PubMed Journal: Int J Med Inform ISSN: 1386-5056 Impact factor: 4.046

Introduction

Since the end of 2019, a new coronavirus disease, COVID-19, has been confirmed in humans as a new category of disease that causes dangerous respiratory problems, heart infection, and even death. To more effectively control the spread of COVID-19 and treat patients to reduce the mortality rate, medical images can play an important role [1]. In current clinical practice, chest X-ray radiography and computed tomography (CT) are the two imaging modalities used to detect COVID-19, assess its severity, and monitor its prognosis (or response to treatment). Although CT can achieve higher detection sensitivity, chest X-ray radiography is more commonly used in clinical practice due to its advantages, including low cost, low radiation dose, ease of operation, and wide accessibility in general or community hospitals [2]. However, pneumonia can be caused by many different types of viruses and bacteria. Thus, it may be time-consuming and challenging for general radiologists in community hospitals to read a high volume of chest X-ray images to detect subtle COVID-19 infected pneumonia and distinguish it from other community-acquired non-COVID-19 infected pneumonia, because there are many similarities between pneumonia caused by COVID-19 and pneumonia caused by other types of viruses or bacteria. This is a clinical challenge faced by radiologists in this pandemic [3]. To address this challenge, developing computer-aided detection or diagnosis (CAD) schemes based on medical image processing and machine learning has been attracting broad research interest. Such schemes aim to automatically analyze disease characteristics and provide radiologists with valuable decision-making support tools for more accurate or efficient detection and diagnosis of COVID-19 infected pneumonia.
To this aim, studies may involve the following steps: preprocessing images, segmenting regions of interest (ROIs) related to the targeted diseases, computing and identifying effective image features, and building multiple-feature fusion-based machine learning models to detect and classify cases. For example, one study [4] computed 961 image features from the segmented ROIs of chest X-ray images. After applying a feature selection algorithm, a KNN classification model was built and yielded an accuracy of 96.1 % in classifying between COVID-19 and non-COVID-19 cases. However, due to the difficulty in identifying and segmenting subtle pneumonia-related disease patterns or ROIs on chest X-ray images, recent studies have demonstrated that developing CAD schemes based on deep learning algorithms, without segmenting suspicious ROIs or computing handcrafted image features, is more efficient and reliable than using classical machine learning methods. As a result, many deep learning models have been reported recently in the literature to detect and classify COVID-19 cases [2,5-13]. Although some deep learning convolution neural network (CNN) models are applied to CT images [5,6], more studies applied CNN models to detect and classify COVID-19 cases using chest X-ray images. They include different existing CNN models (i.e., ResNet50 [2,7], MobileNetV2 [8], CoroNet [9], Xception + ResNet50V2 [10]) and several new special CNN models (i.e., DarkCovidNet [11], COVID-Net [12] and COVIDX-Net [13]). These studies used different image datasets with a varying number of COVID-19 cases (from 25 to 224) among a total number of cases from 50 to 11,302. The reported sensitivity in detecting COVID-19 cases ranged from 79.0 % to 98.6 %. Despite the promising results reported in previous studies, many issues have not been well investigated regarding how to train deep learning models optimally.
For instance, it remains unclear whether applying image preprocessing algorithms can help improve the performance and robustness of deep learning models. To better address some of these challenges or technical issues, we develop and test in this study a new deep learning based CAD scheme of chest X-ray radiography images. The scheme can detect and classify images into 3 classes: COVID-19 infected pneumonia, other community-acquired non-COVID-19 infected pneumonia, and normal (non-pneumonia) cases. The hypothesis of this study is that, instead of directly using the original chest X-ray images to train deep learning models, we can apply image processing algorithms to remove the majority of diaphragm regions, normalize image contrast, and reduce image noise, and then generate a pseudo color image to feed into the 3 input channels of existing deep learning models that were pre-trained using color (RGB) images in the transfer learning process. This may help significantly improve model performance and robustness in detecting COVID-19 cases and distinguishing them from other community-acquired non-COVID-19 infected pneumonia cases. To test this hypothesis and demonstrate the potential advantages of the new approach, we assemble a relatively large chest X-ray image dataset with cases in 3 classes. Then, we select a well-trained VGG16 based CNN model as the transfer learning model used in our CAD scheme. The details of the study design and data analysis results are reported in the following sections of this article.

Materials and method

Dataset

In this study, we assemble a dataset of chest X-ray radiography (CXR) images acquired from several different publicly available medical repositories [14-18]. These repositories were initially created and examined by the Allen Institute for AI in partnership with the Chan Zuckerberg Initiative, Georgetown University's Center for Security and Emerging Technology, Microsoft Research, and the National Library of Medicine - National Institutes of Health, in coordination with The White House Office of Science and Technology Policy. Specifically, the dataset used in this study includes 8474 2D X-ray images in the posteroanterior (PA) chest view. Among them, 415 images depict confirmed COVID-19 disease, 5179 depict other community-acquired non-COVID-19 infected pneumonia, and 2880 depict normal (non-pneumonia) cases.

Image preprocessing

Fig. 1 shows examples of three chest X-ray images from the three classes of normal, community-acquired non-COVID-19 infected pneumonia, and COVID-19 pneumonia cases (from top to bottom). It shows that the bottom part of the images includes a diaphragm region with high intensity (bright pixels), which may have a negative effect on distinguishing and quantifying lung disease patterns using deep learning models. Hence, an image preprocessing algorithm is applied to identify and remove diaphragm regions. Specifically, the algorithm detects the maximum (brightest) and minimum (darkest) pixel values of the image, then uses a threshold to segment the original image into a binary image, as shown in Fig. 1(b). Next, after labeling all connected regions in the binary image, the CAD scheme detects the biggest region, fills the holes in this region, and deletes all other small regions (if any), as shown in Fig. 1(c). This detected region is located in the diaphragm. Then, morphological filters are applied to smooth the boundary of the region, as shown in Fig. 1(d). Last, the processed binary image is mapped back to the original image, and the CAD scheme removes the overlapped pixels at the corresponding locations on the original image, as shown in Fig. 1(e). The images produced by this step are referred to below as the diaphragm-removed images.
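The thresholding and largest-region selection described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `remove_brightest_region` and the threshold position `frac` are hypothetical (the text does not give the threshold value), and the hole-filling and morphological smoothing steps are omitted for brevity.

```python
from collections import deque

import numpy as np


def remove_brightest_region(image, frac=0.8):
    """Zero out the largest connected bright region of a grayscale image.

    `frac` is a hypothetical threshold position between the darkest and
    brightest pixel values; the source text does not state the value used.
    """
    lo, hi = float(image.min()), float(image.max())
    binary = image >= lo + frac * (hi - lo)       # binary image after thresholding

    rows, cols = image.shape
    labels = np.zeros((rows, cols), dtype=int)    # 0 means "not labeled yet"
    sizes = {}
    next_label = 0
    for r in range(rows):
        for c in range(cols):
            if not binary[r, c] or labels[r, c]:
                continue
            next_label += 1
            labels[r, c] = next_label
            queue, count = deque([(r, c)]), 0
            while queue:                          # breadth-first flood fill (4-connectivity)
                y, x = queue.popleft()
                count += 1
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and binary[ny, nx] and not labels[ny, nx]):
                        labels[ny, nx] = next_label
                        queue.append((ny, nx))
            sizes[next_label] = count

    if not sizes:                                 # nothing above the threshold
        return image.copy(), np.zeros((rows, cols), dtype=bool)
    biggest = max(sizes, key=sizes.get)           # keep only the biggest region
    mask = labels == biggest
    cleaned = image.copy()
    cleaned[mask] = 0                             # remove the detected region
    return cleaned, mask
```

On a real radiograph the returned mask would normally be cleaned up (holes filled, boundary smoothed) before it is applied, as the text describes.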

Fig. 1

Example of chest X-ray images in three classes (top: normal, middle: community-acquired non-COVID-19 pneumonia, bottom: COVID-19 infected pneumonia). The figure shows (a) the original images, (b) the binary images after thresholding, (c) the images after selecting the biggest segmented region, (d) the images after applying morphological filtering, and (e) the original images after removing the majority of the diaphragm region.

In the next step, we convert the diaphragm-removed grayscale images to 3-channel images suitable for fine-tuning an existing CNN model pre-trained using color (RGB) images. To do so, we apply an image noise filtering method and a contrast normalization method to the image after removing the diaphragm region. First, since X-ray images often include additive noise, we apply a bilateral low-pass filter to the diaphragm-removed image. This filter is non-linear and highly effective at noise removal while preserving textural information compared to other low-pass filters. In other words, the filter analyzes intensity values locally and takes the intensity variation of the local area into account when replacing the intensity value of each pixel with the weighted average intensity of the pixels in the local area. To calculate the weights, we apply a Gaussian low-pass filter in the space domain. This step generates a noise-reduced image. Based on our experimental results, we set the two bilateral filter parameters to 75. Second, chest X-ray images may have different image contrast or brightness due to differences in patient body size and/or variation of X-ray dose. To compensate for such a potentially negative impact, we apply a histogram equalization method to normalize the images. This filter can enhance lung tissue patterns and characteristics associated with COVID-19 infection. Then, as shown in Fig. 2, the three preprocessed images (the diaphragm-removed image, the histogram-equalized image, and the bilateral-filtered image) form a pseudo color image that is fed into the 3 input (RGB) channels of the CNN model.

Fig. 2

A flow diagram illustrating the image preprocessing steps that generate the input of the CNN model: the original image in the dataset, the diaphragm-removed image, the image after applying histogram equalization to the diaphragm-removed image, and the image after applying bilateral filtering to the diaphragm-removed image. The diaphragm-removed image and its two filtered versions are fed into the three channels of the CNN model to simulate an RGB image.
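The pseudo color construction can be illustrated with a small NumPy sketch. It is an illustration under assumptions rather than the authors' code: the function names, the window `radius`, the interpretation of the value 75 as the two Gaussian sigmas of the bilateral filter, and the channel order are all choices made here for the example.

```python
import numpy as np


def equalize_histogram(image):
    """Global histogram equalization of an 8-bit grayscale image."""
    hist = np.bincount(image.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]                     # CDF value at the darkest level present
    denom = max(int(cdf[-1] - cdf_min), 1)        # guard against flat images
    lut = np.clip(np.round((cdf - cdf_min) / denom * 255), 0, 255).astype(np.uint8)
    return lut[image]


def bilateral_filter(image, radius=2, sigma_space=75.0, sigma_range=75.0):
    """Brute-force bilateral filter: each pixel becomes a weighted mean of its
    neighbours, weighted by spatial distance and by intensity difference."""
    img = image.astype(np.float64)
    padded = np.pad(img, radius, mode="edge")
    h, w = img.shape
    num = np.zeros_like(img)
    den = np.zeros_like(img)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            shifted = padded[radius + dy:radius + dy + h, radius + dx:radius + dx + w]
            w_space = np.exp(-(dy * dy + dx * dx) / (2 * sigma_space ** 2))
            w_range = np.exp(-((shifted - img) ** 2) / (2 * sigma_range ** 2))
            num += w_space * w_range * shifted
            den += w_space * w_range
    return np.round(num / den).astype(np.uint8)


def make_pseudo_color(image):
    """Stack the input image and its two filtered versions as three channels.
    The channel order used here is an assumption."""
    return np.stack(
        [image, equalize_histogram(image), bilateral_filter(image)], axis=-1
    )
```

The brute-force filter is only meant to show the weighting idea; an optimized library routine would be preferable for full-resolution images.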

Transfer learning

In this study, we adopt a transfer learning approach because previous studies have shown that, to avoid overfitting or underfitting when only a small training dataset is available, a better approach is to take advantage of a CNN initially trained on a large-scale dataset [19]. Many CNN models have been developed and are available for different engineering applications. We select a VGG16 model, which was pre-trained on the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) using a large dataset with 14 million images [20]. The VGG16 model won first place in the image localization task and second place in the image classification task of the 2014 ILSVRC challenge [21]. As shown in Fig. 3, the VGG16 model has 13 convolution layers, 5 max pooling layers, and 3 fully connected layers in 6 blocks, which include over 138 million trainable parameters.

Fig. 3

Illustration of the architecture of VGG16 based CNN model.

In our transfer learning, the weights between all connected nodes in the front (low) layers of the VGG16 based CNN model remain unchanged (blocks 1-5 as shown in Fig. 3). Next, block 6 of the model is replaced with one flatten layer and two fully connected layers, which include 256 and 128 nodes, respectively. In these layers, the rectified linear unit (ReLU) [22] is used as the activation function. Then, all trainable weights of the whole modified VGG16 model are fine-tuned using chest X-ray image data. In this fine-tuning process, a small learning rate is used so that the pre-trained parameters change only slightly. In this way, we preserve the valuable parameters as much as possible by avoiding dramatic changes to the pre-trained parameters, while letting the model learn the special characteristics of chest X-ray images. Finally, in the last classification layer, Softmax is used as the activation function. As a result, a new transfer learning model is built to fulfill a three-class classification task. The complete CNN model is compiled with the Adam [23] optimizer, a batch size of 4, and a maximum of 200 epochs, starting from the small initial learning rate and monitoring the validation loss to reduce the learning rate every 5 epochs by a factor of 0.8. Table 1 shows the complete architecture of the transfer learning VGG16 model built in this study.
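The learning-rate reduction mentioned above can be read in more than one way. The sketch below assumes one plausible reading, a reduce-on-plateau rule: the rate is multiplied by 0.8 whenever the validation loss has not improved for 5 consecutive epochs. The function name, the `patience` interpretation, and the initial rate passed in by the caller are assumptions made for illustration; the initial learning rate is not given in the text.

```python
def reduce_on_plateau(val_losses, initial_lr, factor=0.8, patience=5):
    """Return the learning rate in effect at each epoch.

    Assumed rule: when the validation loss fails to improve for `patience`
    consecutive epochs, multiply the learning rate by `factor`.
    """
    lr = initial_lr
    best = float("inf")
    wait = 0
    schedule = []
    for loss in val_losses:
        schedule.append(lr)          # rate used during this epoch
        if loss < best:
            best, wait = loss, 0     # improvement resets the counter
        else:
            wait += 1
            if wait >= patience:     # plateau detected: shrink the rate
                lr, wait = lr * factor, 0
    return schedule
```

Deep learning frameworks ship equivalent callbacks, so a hand-written loop like this is only useful for seeing how quickly the rate decays over 200 epochs.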

Table 1

The architecture of the new VGG16 model after transfer learning with new layers (19 to 22).

Number	Layer	Size	Activation
0	Input Image	224×224×3	---
2	2× Convolution (3×3)	224×224×64	ReLu
3	Max Pooling	112×112×64	ReLu
5	2× Convolution (3×3)	112×112×128	ReLu
6	Max Pooling	56×56×128	ReLu
9	3× Convolution (3×3)	56×56×256	ReLu
10	Max Pooling	28×28×256	ReLu
13	3× Convolution (3×3)	28×28×512	ReLu
14	Max Pooling	14×14×512	ReLu
16	3× Convolution (3×3)	14×14×512	ReLu
18	Max Pooling	7×7×512	ReLu
19	Flattening	25,088	---
20	Fully Connected	256	ReLu
21	Fully Connected	128	ReLu
22	Fully Connected	3	SoftMax

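As a check on the layer sizes in Table 1, the parameter counts of the convolutional blocks and of the new fully connected head can be computed directly. The sketch below uses only standard arithmetic (weights plus biases per layer); the function names are illustrative.

```python
# (number of 3x3 convolutions, output channels) for the five VGG16 conv blocks
CONV_BLOCKS = [(2, 64), (2, 128), (3, 256), (3, 512), (3, 512)]


def conv_parameters(in_channels=3):
    """Trainable parameters in the 13 convolution layers."""
    total = 0
    for n_convs, out_channels in CONV_BLOCKS:
        for _ in range(n_convs):
            # 3x3 kernel weights plus one bias per output channel
            total += 3 * 3 * in_channels * out_channels + out_channels
            in_channels = out_channels
    return total


def head_parameters(sizes=(7 * 7 * 512, 256, 128, 3)):
    """Trainable parameters in the flatten -> 256 -> 128 -> 3 head."""
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes, sizes[1:]))


if __name__ == "__main__":
    print("flattened features:", 7 * 7 * 512)
    print("convolution parameters:", conv_parameters())
    print("new head parameters:", head_parameters())
```

The flattened size of 25,088 in Table 1 follows from 7 × 7 × 512. Under these assumptions the convolution layers hold about 14.7 million parameters and the new head about 6.5 million, so the modified network is considerably smaller than the original VGG16, whose 4096-node fully connected layers account for most of its 138 million parameters.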

Model training and testing

First, the original chest X-ray images have 1024 × 1024 pixels, while the VGG16 model was pre-trained using images of 224 × 224 pixels. Thus, each chest X-ray image is down-sampled to 224 × 224 pixels to fit the VGG16 model. Then, to train and evaluate the proposed VGG16 based transfer learning CNN model, we randomly split the entire image dataset of 8474 cases into 3 independent subsets for training, validation, and testing. Overall, 10 % of cases (848) are assigned to the testing subset. Of the remaining 7626 cases, 10 % (757) are assigned to the validation subset, while 90 % (6869) form the training subset. To maintain the same case partition ratios for the three classes of COVID-19 infected pneumonia, other community-acquired non-COVID-19 infected pneumonia, and normal cases, the case assignment is done on the three classes independently. Table 2 shows the number of cases in each subset.

Table 2

Distribution of cases in three subsets.

Image Data Subset	Training	Validation	Testing
COVID-19 cases	336	37	42
Other pneumonia cases	4201	460	518
Normal cases	2332	260	288
Total number of cases	6869	757	848
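A per-class (stratified) split of the kind described above can be sketched as follows. This is a minimal example under assumptions: the function name and the rounding rule are choices made here, so the validation counts it produces may differ by a few cases from those in Table 2.

```python
import random


def stratified_split(labels, test_frac=0.1, val_frac=0.1, seed=0):
    """Split sample indices into train/val/test, class by class.

    First `test_frac` of each class goes to testing, then `val_frac` of the
    remainder of that class goes to validation; the rest is used for training.
    """
    rng = random.Random(seed)
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)

    split = {"train": [], "val": [], "test": []}
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_frac)
        n_val = round((len(indices) - n_test) * val_frac)
        split["test"] += indices[:n_test]
        split["val"] += indices[n_test:n_test + n_val]
        split["train"] += indices[n_test + n_val:]
    return split
```

Calling the function with different `seed` values gives different random partitions with the same class ratios, which is the property the repeated experiments rely on.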

Second, different techniques are available to deal with imbalanced data [24]. In this study, the class weight technique is applied during training to reduce the potential consequences of data imbalance. In the class weight technique, we adjust the weights to be inversely proportional to the class frequencies in the input data [25], so that the weight of a class decreases as the number of training cases in that class increases. The class weights are used while fitting the model. Hence, in the loss function, we assign higher values to the instances of smaller classes, and the calculated loss is a weighted average in which each sample is weighted by the weight of its class. Additionally, a common augmentation technique [26] is applied to the training data of the minority class (COVID-19 cases) to increase the training sample size. First, using shearing factors (≤0.2), images are sheared by the shearing angle in a counter-clockwise direction. Second, using zooming factors (≤0.2), images are randomly magnified. Third, using rotation factors, images are randomly rotated within a limited angle range. Fourth, using a shift factor (≤0.2), images are randomly shifted in 4 directions (up, down, left, and right). Last, images are randomly flipped horizontally to generate as much augmented data as possible. Multiple iterations or epochs are applied to train the VGG16 based CNN model. The model is first trained using the data in the training subset and validated using the validation subset. During the training process, the optimizer drives the model to learn more and more information to reduce the performance gap between training and validation. To control overfitting and maintain training efficiency, we limit model training to 200 epochs.
Hence, at the end of 200 training epochs, the trained model is saved and then tested using the data in the testing subset, which is not involved in the model training and validation process. To reduce the risk of potential bias in partitioning the data into the three subsets, we repeat this model training and testing process three times, each time randomly dividing all cases into training, validation, and testing subsets using the same case numbers as shown in Table 2. In addition, across these three partitions, the cases assigned to the validation and testing subsets are totally different (no duplication), so the three trained models are tested using totally different testing cases. Thus, the total number of testing cases increases from the 848 shown in Table 2 to 2544 (3 × 848). Fig. 4 shows a schematic diagram that illustrates the complete architecture of this VGG16 transfer learning CNN model, as well as the training, validation, and testing phases.
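The class weight technique described above makes each weight inversely proportional to the class frequency. One common way to do this, assumed here because the exact equation is not reproduced in this text, is the "balanced" heuristic w_c = N / (K × n_c), where N is the total number of cases, K the number of classes, and n_c the number of cases in class c. The example uses the class counts of the full dataset for illustration.

```python
def balanced_class_weights(class_counts):
    """Weights inversely proportional to class frequency: N / (K * n_c).

    This normalization is an assumption; any rule of the form
    weight ~ 1 / n_c matches the description in the text.
    """
    total = sum(class_counts.values())
    n_classes = len(class_counts)
    return {
        name: total / (n_classes * count)
        for name, count in class_counts.items()
    }


if __name__ == "__main__":
    counts = {"COVID-19": 415, "Other pneumonia": 5179, "Normal": 2880}
    for name, weight in balanced_class_weights(counts).items():
        print(f"{name}: {weight:.3f}")
```

With this normalization the weighted class counts still sum to N, so the overall scale of the loss is unchanged while errors on the minority COVID-19 class are penalized more heavily.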

Fig. 4

Schematic representing the training and validation phases of the proposed scheme.

Performance assessment

We perform experiments to analyze two different accuracies. The first one is the accuracy of a three-class classification that distinguishes between COVID-19 infected pneumonia, community-acquired pneumonia, and normal (non-pneumonia) cases. We compute the accuracy values in detecting images of the 3 classes. We also calculate (1) a macro average, which is the average of the 3 class accuracy values without considering the proportion of cases in each class ($\frac{1}{3}\sum_{i=1}^{3} A_i$), and (2) a weighted average, which weights the 3 class accuracy values by the proportion of each class ($\sum_{i=1}^{3} w_i A_i$), where $A_1, A_2, A_3$ are the accuracy values of the 3 classes and $w_1, w_2, w_3$ are weighting factors representing the ratios of cases in the 3 classes. Then, for the three-class classification, a confusion matrix is generated, from which several evaluation indices, including precision, recall, F1-score, and Cohen's kappa [27], are computed to evaluate CAD performance. Cohen's kappa coefficient measures the agreement between predicted and true labels after correcting for the agreement expected by chance. A value near zero indicates that the results are close to chance-level agreement, while a value near one indicates strong agreement and higher robustness. The second accuracy evaluation refers to the classification between COVID-19 and non-COVID-19 cases (the latter including both normal and community-acquired pneumonia cases). In this case, we count true positives (TP) for the cases correctly identified as COVID-19, false negatives (FN) for the COVID-19 cases incorrectly classified as normal or community-acquired pneumonia cases, true negatives (TN) for the cases correctly identified as non-COVID-19 cases, and false positives (FP) for the normal and community-acquired pneumonia cases incorrectly classified as COVID-19 by the CNN model. Then, the accuracy, sensitivity, specificity, recall, and F1-scores of the model classification are computed and tabulated.
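For reference, the standard textbook definitions of the indices named above can be written as follows. The notation is chosen here for illustration: $n_{kk}$ is the number of cases of class $k$ classified correctly, $n_{k+}$ and $n_{+k}$ are the row and column totals of the confusion matrix for class $k$, and $N$ is the total number of cases.

```latex
p_o = \frac{1}{N}\sum_{k} n_{kk}, \qquad
p_e = \frac{1}{N^2}\sum_{k} n_{k+}\, n_{+k}, \qquad
\kappa = \frac{p_o - p_e}{1 - p_e}

\text{Sensitivity} = \frac{TP}{TP + FN}, \qquad
\text{Specificity} = \frac{TN}{TN + FP}, \qquad
\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}
```

Here $p_o$ is the observed agreement (the overall accuracy) and $p_e$ is the agreement expected by chance given the class totals.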

Results

Fig. 5(a-c) presents, in the left column, trend curves of the training and validation accuracy of the new transfer learning VGG16 based CNN model in three experiments using different training and validation subsets. The right column shows the three confusion matrices obtained by applying the trained models to the corresponding testing subsets. All three sets of curves show that, as the number of training epochs increases, the prediction accuracy on the validation subset varies greatly (with large oscillations) initially and then gradually converges to a higher accuracy level with much smaller oscillations. For all three experiments, the validation accuracy follows the training accuracy after epoch 75, which indicates that the model continues to learn across epochs. The trend graphs also show that our transfer learning model does not suffer from significant overfitting or underfitting. Then, by combining the three confusion matrices of the three independent testing subsets shown in the right column of Fig. 5(a-c), Fig. 5(d) displays a combined 3-class confusion matrix of 2544 (3 × 848) cases.

Fig. 5

(a-c) The left column shows three sets of performance curves for the training and validation subsets in 3 experiments over 200 training epochs. The horizontal axis shows the number of epochs, and the vertical axis shows the accuracy. The right column shows the three confusion matrices of the corresponding testing results. (d) A combined confusion matrix obtained by applying the three trained models to three independent testing data subsets with a total of 2544 cases.

First, based on the three confusion matrices shown in Fig. 5(a-c), the overall 3-class classification accuracy levels are 93.9 % (796/848), 94.7 % (803/848), and 94.9 % (805/848), respectively. The difference is approximately 1 %. Then, based on the confusion matrix of the combined data shown in Fig. 5(d), we compute the precision, recall, F1-score, and prediction accuracy of the new transfer learning VGG16 based CNN model, as shown in Table 3. Among the 2544 testing cases, 2404 are correctly detected and classified into the 3 classes. The overall accuracy is 94.5 % (2404/2544) with a 95 % confidence interval of [0.93, 0.96]. In addition, the computed Cohen's kappa coefficient is 0.89, which confirms the reliability of the proposed approach to training this new deep transfer learning model for this classification task.

Table 3

Classification report of the proposed method.

	Precision	Recall	F1-score	Support cases
Normal	0.96	0.91	0.93	864
Other Pneumonia	0.96	0.96	0.96	1554
COVID19	0.73	0.98	0.84	126
Accuracy	---	---	0.95	2544
Macro avg	0.88	0.95	0.91	2544
Weighted avg	0.95	0.94	0.94	2544

To further evaluate the performance of our CAD scheme in detecting COVID-19 infected pneumonia cases using chest X-ray images, we place both normal and community-acquired pneumonia images into the negative class and COVID-19 infected pneumonia cases into the positive class. Combining the data in the confusion matrix shown in Fig. 5(d), the CAD scheme yields 98.4 % detection sensitivity (124/126) and 98.0 % specificity (2371/2418). The overall accuracy is 98.1 % (2495/2544). Next, Table 4 shows and compares (1) the confusion matrices generated by four models trained and tested using different input images and the three data subsets generated from the data partition, and (2) the overall classification accuracy and 95 % confidence intervals. The results indicate the following. (1) Without using the data augmentation technique, the model accuracy on the testing subsets drops to 82.3 % with a Cohen's kappa score of 0.71. (2) Without applying image preprocessing, that is, when the original chest X-ray images are fed directly into the VGG16 based CNN model ("simple model"), the classification accuracy is 88.0 % with a Cohen's kappa score of 0.75. (3) Using image filtering and pseudo color images without removing the majority of the diaphragm regions, the "filter-based model" yields 91.2 % accuracy and a Cohen's kappa score of 0.83. All three models yield lower classification accuracy than the proposed model, which involves the data augmentation technique and two steps of image preprocessing.

Table 4

Confusion matrix of four CNN models on X-ray Images. 95 % confidence interval (CI) for the accuracy is shown in the last column.

			Normal	Pneumonia	COVID19	Accuracy	95 % CI
Proposed Model	True Label	Normal	788	56	20	94.5 %	[0.93,0.96]
		Pneumonia	35	1492	27
		COVID 19	1	1	124
Filter-based model		Normal	750	89	25	91.2 %	[0.90,0.92]
		Pneumonia	64	1452	38
		COVID19	2	6	118
Simple model		Normal	701	123	40	88.0 %	[0.86,0.89]
		Pneumonia	72	1431	51
		COVID19	6	13	107
No-augmentation		Normal	653	158	53	82.3 %	[0.80,0.84]
		Pneumonia	124	1346	74
		COVID19	8	23	95
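The headline figures can be re-derived from the proposed model's confusion matrix in Table 4. The sketch below uses the standard definitions of accuracy, Cohen's kappa, and one-vs-rest counts; the function names are illustrative and the code is a check on the arithmetic, not the authors' evaluation script.

```python
CLASSES = ["Normal", "Pneumonia", "COVID19"]
# Rows are true labels, columns are predicted labels (proposed model, Table 4).
MATRIX = [
    [788, 56, 20],
    [35, 1492, 27],
    [1, 1, 124],
]


def accuracy(m):
    """Fraction of cases on the diagonal of the confusion matrix."""
    return sum(m[i][i] for i in range(len(m))) / sum(map(sum, m))


def cohen_kappa(m):
    """Agreement between true and predicted labels, corrected for chance."""
    n = sum(map(sum, m))
    row_totals = [sum(r) for r in m]
    col_totals = [sum(c) for c in zip(*m)]
    p_e = sum(r * c for r, c in zip(row_totals, col_totals)) / n ** 2
    return (accuracy(m) - p_e) / (1 - p_e)


def one_vs_rest(m, k):
    """Return (TP, FN, FP, TN) when class k is positive and the rest negative."""
    n = sum(map(sum, m))
    tp = m[k][k]
    fn = sum(m[k]) - tp
    fp = sum(r[k] for r in m) - tp
    tn = n - tp - fn - fp
    return tp, fn, fp, tn


if __name__ == "__main__":
    tp, fn, fp, tn = one_vs_rest(MATRIX, CLASSES.index("COVID19"))
    print(f"3-class accuracy: {accuracy(MATRIX):.3f}")
    print(f"Cohen's kappa:    {cohen_kappa(MATRIX):.2f}")
    print(f"sensitivity:      {tp / (tp + fn):.3f}")
    print(f"specificity:      {tn / (tn + fp):.3f}")
```

The one-vs-rest counts for the COVID-19 class are TP = 124, FN = 2, FP = 47, and TN = 2371, which give the 124/126 sensitivity and 2371/2418 specificity reported in the text.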

In addition, Table 5 compares our transfer learning VGG16 based CNN model with 10 state-of-the-art models recently reported in the literature to detect and classify COVID-19 cases. The table shows the number of cases in the datasets, the imaging modality (CT or X-ray radiography), and the reported classification performance, including either 3-class or 2-class classification, for these studies. Although the reported performance of these studies cannot be directly compared due to the use of different image datasets and testing methods, the presented data demonstrate that our model is tested using a relatively large dataset and yields classification performance comparable to that of the state-of-the-art models developed and tested in this research field.

Table 5

Comparison of the accuracy results of the proposed method with other deep learning methods for COVID-19 diagnosis.

Approach	Data Type	Cases number (including COVID-19 cases)	Method utilized	2 classes accuracy (%)	3 classes accuracy (%)	COVID-19 detection Sensitivity (%)
Narin et al. [2]	X-ray	100 (50)	ResNet50	98.0	---	96.0
Sethy et al. [7]	X-ray	50 (25)	ResNet50+SVM	95.4	---	97.0
Ioannis et al. [8]	X-ray	1427 (224)	MobileNetV2	96.7	93.5	98.6
Wang et al. [5]	CT	237 (119)	M-Inception	82.9	---	81.0
Tulin et al. [11]	X-ray	1127 (127)	DarkCovidNet	98.08	87.02	90.6
Khan et al. [9]	X-ray	221 (29)	CoroNet (Xception)	98.8	94.52	95.0
Rahimzadeh & Attar [10]	X-ray	11,302 (31)	Xception + ResNet50V2	99.5	91.4	80.53
Wang et al. [12]	X-ray	300 (100)	COVID-Net	96.6	93.3	91.0
Ying et al. [6]	CT	57 (30)	DRE-Net (ResNet50)	86	---	79.0
Hemdan et al. [13]	X-ray	50 (25)	COVIDX-Net	90	---	----
Our new method	X-ray	2544 (126)	VGG16	98.1	94.5	98.4

Discussion

In this study, we developed and tested a novel deep transfer learning CNN model to detect and classify chest X-ray images depicting COVID19 infected pneumonia. This study has several unique characteristics as compared to the previously reported studies in this field and produces several new interesting observations. First, since the deep learning CNN model includes a considerable number of parameters that need to be trained and determined, a large and diverse image dataset is required to produce robust results [28]. Although we used a relatively large image dataset of 8474 chest X-ray images, the dataset is unbalanced in 3 classes of images, and the number of the COVID-19 infected pneumonia cases (415) remains small. Thus, in order to build a robust deep learning model, we apply a class weight technique during the training process and select a well-trained VGG16 model and apply a transfer learning approach. Specifically, the original VGG16 model includes over 138 million parameters. These parameters have been trained and determined using a large ImageNet database over 14 million images. It is difficult to train so many parameters from scratch robustly using a dataset of 8474 images. Thus, we retrain or fine-tune the pre-trained VGG16 (as shown in Fig. 3) to reduce the overfitting risk. Study results demonstrate that this transfer learning approach can yield higher performance with the overall accuracy of 94.5 % (2404/2544) in the classification of 3 classes and 98.1 % (2495/2544) in classifying cases with and without COVID-19 infection, as well as the high robustness with a Cohen’s kappa score of 0.89. Second, unlike the regular color photographs, chest X-ray images are gray-level images. Thus, in order to fully use the pre-trained VGG16-based CNN model, we generate two new gray-level images. Then, instead of applying the original chest X-ray image to the CNN model directly, 3 different gray-level images are fed into 3 input (RGB color) channels of the CNN model. 
Specifically, we apply a bilateral low-pass filter to generate a noise-reduced image and a histogram equalization method to generate a contrast normalized image. Comparing two approaches of using only original chest X-ray image and 3 different images to generate a pseudo color image as an input image to the CNN model, our study results show that using a pseudo color image approach, overall classification accuracy increases 3.6 % from 91.2%–94.5%, and Cohen’s kappa score increases 7.2 % from 0.83 to 0.89, respectively. The results demonstrate the advantage of using our new approach to fully use 3 input channels of the CNN model pre-trained using color images because these two filtered gray-level images contain additional information, which can enhance image classification capability. Third, since in the area of medical imaging, generally, disease’s patterns are not comparable to the other existing patterns in the image, preprocessing steps are noteworthy [29]. Hence, we apply an image preprocessing algorithm to automatically detect and remove the majority part of the diaphragm region from the chest X-ray images. Comparing the approaches with and without removing the diaphragm regions, classification performance of the CNN model changes from 94.5%–88.0% and 0.89 to 0.75 for the overall classification accuracy and Cohen’s kappa coefficients, respectively, which indicates a 7.4 % increase in the overall classification accuracy and 18.7 % increase in Cohen’s kappa coefficient by removing the majority of diaphragm regions. Thus, although skipping segmentation of the suspicious disease regions of interest is one important characteristic of deep learning, our study demonstrates that applying an image preprocessing and segmentation algorithm to remove irrelevant regions on the image can also play an important role in increasing performance and robustness of deep learning models. 
In addition, we observe and confirm that applying data augmentation to the training data is also essential. Without data augmentation to increase the training dataset size, the overall classification accuracy of the CNN model drops substantially to around 82.3 %.
In summary, this paper presents a new deep transfer learning model to detect and classify COVID-19 infected pneumonia cases, as well as several unique image preprocessing approaches to optimally train the deep learning model using a limited and unbalanced medical image dataset. Similar learning concepts and image preprocessing approaches can also be adopted to develop new deep learning models for other medical images to detect and classify other types of diseases (e.g., cancers [30,31]).
Despite encouraging results, this study also has limitations. First, although we used a publicly available dataset of 8474 cases, including 415 COVID-19 cases, the diversity or heterogeneity of COVID-19 cases means that the performance and robustness of this CAD scheme need to be further tested and validated using other large and diverse image databases. Second, this study only investigates and tests two image preprocessing methods to generate two filtered images, which may not be the best or optimal methods. New methods should also be investigated and compared in future studies. Third, to further improve model performance and robustness, new image processing and segmentation algorithms need to be developed to more accurately remove the diaphragm and other regions outside the lung areas in the images. Therefore, more research work is needed to overcome these limitations in future studies.
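
The class weight technique mentioned in this discussion is not specified in detail. One common choice, shown here only as an assumed illustration, is inverse-frequency ("balanced") weighting, in which each class weight is the total number of cases divided by the number of classes times the class count. The class names and the function name below are hypothetical, and the counts are those of the full dataset. In practice the weights would be computed on the training subset.

```python
def inverse_frequency_weights(class_counts: dict) -> dict:
    """Return per-class weights proportional to 1 / class frequency.

    weight_c = total / (n_classes * count_c), so that the weighted
    class counts sum back to the total number of cases.
    """
    total = sum(class_counts.values())
    n_classes = len(class_counts)
    return {
        name: total / (n_classes * count)
        for name, count in class_counts.items()
    }


if __name__ == "__main__":
    # Case counts of the 3 classes reported for the 8474-image dataset.
    counts = {
        "covid19_pneumonia": 415,
        "other_pneumonia": 5179,
        "normal": 2880,
    }
    weights = inverse_frequency_weights(counts)
    for name, weight in weights.items():
        print(f"{name}: {weight:.2f}")
    # covid19_pneumonia: 6.81
    # other_pneumonia: 0.55
    # normal: 0.98
```

With such weights, each misclassified COVID-19 case contributes roughly 12 times more to the training loss than a misclassified case of other pneumonia, which counteracts the small size of the COVID-19 class.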

Conclusion

In this study, we proposed and investigated several new approaches to develop a deep transfer learning CNN model to detect and classify COVID-19 cases using chest X-ray images. Study results demonstrate the added value of performing image preprocessing to generate better input image data for building deep learning models. These steps include removing irrelevant regions, normalizing the image contrast-to-noise ratio, and generating pseudo color images to feed into all three channels of the CNN model when applying the transfer learning method. The reported high classification performance is also promising and provides a solid foundation to further optimize deep learning models to detect COVID-19 cases and to validate their performance and robustness using large and diverse image datasets in future studies.

Summary points

What was already known on the topic

Due to its low cost, low radiation, and wide accessibility, chest X-ray radiography is a good imaging modality to detect COVID-19. However, its sensitivity is lower than that of CT.
Developing deep learning model based CAD schemes of chest X-ray images may play a useful role in facilitating detection and diagnosis of COVID-19.
A few deep learning models using chest X-ray images to detect COVID-19 have been reported using small datasets. The models were trained using the original images only.

What this study adds to our knowledge

Due to the diversity of image contrast and noise, adding image preprocessing steps is important and can help improve deep learning model performance.
In transfer learning, one should not use the original images only. Adding two filtered images to fill the 3 input channels of the deep learning model can enhance information learning and improve model performance.
The deep learning CAD scheme can achieve high performance in detecting and classifying not only between COVID-19 cases and healthy (non-pneumonia) cases, but also between COVID-19 infected pneumonia and other community-acquired non-COVID-19 infected pneumonia cases.
Our model is tested using a larger dataset compared with previous studies reported in the literature, which further supports the feasibility of this CAD approach.

CRediT authorship contribution statement

Morteza Heidari and Bin Zheng design the study. Morteza Heidari implements the idea and writes the required computer code linked to the VGG16 model. Abolfazl Zargari and Seyedehnafiseh Mirniaharikandehei collect the dataset and help test image filtering and normalization algorithms. Gopichandh Danala helps design and test the image pre-processing and blob discovery strategy. Morteza Heidari drafts the manuscript. Bin Zheng and Yuchen Qiu review and revise the manuscript.

Declaration of Competing Interest

The authors report no declarations of interest.

14 in total

1. Deep CNN models for pulmonary nodule classification: Model modification, model integration, and transfer learning.

Authors: Xinzhuo Zhao; Shouliang Qi; Baihua Zhang; He Ma; Wei Qian; Yudong Yao; Jianjun Sun
Journal: J Xray Sci Technol Date: 2019 Impact factor: 1.535

2. Prediction of breast cancer risk using a machine learning approach embedded with a locality preserving projection algorithm.

Authors: Morteza Heidari; Abolfazl Zargari Khuzani; Alan B Hollingsworth; Gopichandh Danala; Seyedehnafiseh Mirniaharikandehei; Yuchen Qiu; Hong Liu; Bin Zheng
Journal: Phys Med Biol Date: 2018-01-30 Impact factor: 3.609

3. Identifying Medical Diagnoses and Treatable Diseases by Image-Based Deep Learning.

Authors: Daniel S Kermany; Michael Goldbaum; Wenjia Cai; Carolina C S Valentim; Huiying Liang; Sally L Baxter; Alex McKeown; Ge Yang; Xiaokang Wu; Fangbing Yan; Justin Dong; Made K Prasadha; Jacqueline Pei; Magdalene Y L Ting; Jie Zhu; Christina Li; Sierra Hewett; Jason Dong; Ian Ziyar; Alexander Shi; Runze Zhang; Lianghong Zheng; Rui Hou; William Shi; Xin Fu; Yaou Duan; Viet A N Huu; Cindy Wen; Edward D Zhang; Charlotte L Zhang; Oulan Li; Xiaobo Wang; Michael A Singer; Xiaodong Sun; Jie Xu; Ali Tafreshi; M Anthony Lewis; Huimin Xia; Kang Zhang
Journal: Cell Date: 2018-02-22 Impact factor: 41.582

4. New machine learning method for image-based diagnosis of COVID-19.

Authors: Mohamed Abd Elaziz; Khalid M Hosny; Ahmad Salah; Mohamed M Darwish; Songfeng Lu; Ahmed T Sahlol
Journal: PLoS One Date: 2020-06-26 Impact factor: 3.240

5. CT Imaging and Differential Diagnosis of COVID-19.

Authors: Wei-Cai Dai; Han-Wen Zhang; Juan Yu; Hua-Jian Xu; Huan Chen; Si-Ping Luo; Hong Zhang; Li-Hong Liang; Xiao-Liu Wu; Yi Lei; Fan Lin
Journal: Can Assoc Radiol J Date: 2020-03-04 Impact factor: 2.248

6. Automated detection of COVID-19 cases using deep neural networks with X-ray images.

Authors: Tulin Ozturk; Muhammed Talo; Eylul Azra Yildirim; Ulas Baran Baloglu; Ozal Yildirim; U Rajendra Acharya
Journal: Comput Biol Med Date: 2020-04-28 Impact factor: 4.589

7. A modified deep convolutional neural network for detecting COVID-19 and pneumonia from chest X-ray images based on the concatenation of Xception and ResNet50V2.

Authors: Mohammad Rahimzadeh; Abolfazl Attar
Journal: Inform Med Unlocked Date: 2020-05-26

8. Interrater reliability: the kappa statistic.

Authors: Mary L McHugh
Journal: Biochem Med (Zagreb) Date: 2012 Impact factor: 2.313

9. CoroNet: A deep neural network for detection and diagnosis of COVID-19 from chest x-ray images.

Authors: Asif Iqbal Khan; Junaid Latief Shah; Mohammad Mudasir Bhat
Journal: Comput Methods Programs Biomed Date: 2020-06-05 Impact factor: 5.428

10. Covid-19: automatic detection from X-ray images utilizing transfer learning with convolutional neural networks.

Authors: Ioannis D Apostolopoulos; Tzani A Mpesiana
Journal: Phys Eng Sci Med Date: 2020-04-03
