Varalakshmi Perumal (1), Vasumathi Narayanan (1), Sakthi Jaya Sundar Rajasekar (2). (1) Department of Computer Technology, Madras Institute of Technology, Anna University, Chromepet, Chengalpattu District, Tamilnadu, India. (2) Melmaruvathur Adhiparasakthi Institute of Medical Sciences and Research, Melmaruvathur, Chengalpattu District, Tamilnadu, India.
Coronavirus Disease (COVID-19) is a pulmonary disease caused by Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2). The pandemic has so far caused tremendous loss of life. The World Health Organization (WHO) continuously monitors and publishes reports of the outbreak in various countries. COVID-19 is a respiratory illness that spreads largely through airborne droplets: the infection is transmitted predominantly through close contact and through respiratory droplets produced when an individual coughs or sneezes. Its symptoms are coughing, difficulty in breathing and fever. Difficulty in breathing is an indication of possible pneumonia and requires prompt clinical attention. No vaccine or specific treatment for COVID-19 infection is available. Hospitals provide isolation wards for infected individuals. Transmission is most common when individuals are symptomatic, yet spread may be possible before symptoms appear. The virus can survive on surfaces for 72 hours. Symptoms of COVID-19 start to appear between 2 and 14 days after exposure, with a mean of 7 days. The standard diagnostic technique is real-time Reverse Transcription Polymerase Chain Reaction (RT-PCR) performed on a nasopharyngeal swab sample. The disease can also be diagnosed from a combination of symptoms, risk factors and a chest CT showing features of pneumonia. Many countries are unable to limit the spread of COVID-19 quickly because of insufficient medical kits. A great deal of research is being carried out across the globe to handle the pandemic, and many deep learning models have been proposed to predict COVID-19 at the earliest stage in order to control the spread. We propose a transfer learning model on top of a deep learning model to further speed up the prediction process.
The literature survey for the proposed work is presented in Section 2. The proposed model is described in Section 3. Experiments and results are discussed in Section 4. Discussion and conclusion are presented in Section 5 and Section 6 respectively.
Related Work
To limit the transmission of COVID-19 [11], a large number of suspected cases must be screened, followed by proper medication and quarantine. RT-PCR testing is considered the gold standard, yet it yields a significant number of false negatives. Efficient and rapid diagnostic techniques are urgently needed to fight the disease. In view of the radiographic changes that COVID-19 produces in CT images, we propose to develop a deep learning model that mines the distinctive features of COVID-19 to give a clinical diagnosis ahead of the pathological test, thereby saving crucial time for disease control.
Understanding the basic nature of COVID-19 and its subtypes and variants may be a continuing challenge, and findings should be produced and shared all over the world. Baidu Research [14] released its LinearFold algorithm and services, which can be used for whole-genome secondary structure prediction of COVID-19 and is reportedly several times faster than other algorithms. Pathological findings of COVID-19 associated with acute respiratory distress syndrome are presented in [4]. In [29], the analyses included: 1) a summary of patient characteristics; 2) an examination of age distributions; 3) calculation of case fatality and mortality rates; 4) geo-temporal analysis of viral spread; 5) epidemic curve construction; and 6) subgroup analysis. Another study [6] was designed to examine the extent of COVID-19 infection in a chosen population, as determined through positive antibody tests in the general public. Chen H [5] noted that limited information is available for pregnant women with COVID-19; that study aimed to assess the clinical characteristics of COVID-19 during pregnancy. Shan F [19] suggested that CT screening is vital for diagnosis, assessment and staging of COVID-19 infection. Follow-up scans every 3 to 5 days are recommended to track disease progression.
It is concluded that peripheral and bilateral Ground Glass Opacification (GGO) with consolidation is more prevalent in COVID-19 infected patients. Gozes [8] developed AI-based automated CT image analysis tools for detection, quantification and tracking of coronavirus. Wang S [23] proposed a deep learning system that uses CT scan images to detect COVID-19. Wang Y [27] identified tachypnea, an abnormally rapid respiration rate, as a critical symptom of COVID-19; the study can be used to recognize different respiratory conditions, and the device is convenient to use. Rajpurkar P [17] presented CheXNet, an algorithm that detects pneumonia with high accuracy. Xu X [28] showed that the traditional method used for identifying COVID-19 has a low positive rate during the initial stages, and developed a model for early screening of CT images. Yoon SH [31] found that COVID-19 pneumonia in patients in Korea exhibits characteristics similar to those seen in patients in China. Singh R [22] evaluated the impact of social distancing on the pandemic, taking age structure into account. These papers offer ways of responding to the pandemic caused by the novel coronavirus.
Much research has also been carried out using image processing techniques on COVID-19. Recently, Subhankar Roy [2] proposed a deep learning technique for classification of Lung Ultrasonography (LUS) images. Classification of COVID-19 from chest CT images with the help of an evolution-based CNN was carried out in [18]. In [3], Harrison X. Bai proposed that the main features for discriminating COVID-19 from viral pneumonia are peripheral distribution, ground-glass opacity and vascular thickening. That study can distinguish COVID-19 from viral pneumonia with high specificity using CT scan images and CT scan features.
The same study [3] also found that patients with COVID-19 and patients with viral pneumonia both develop central and peripheral distribution, air bronchogram, pleural thickening, pleural effusion and lymphadenopathy, with no significant differences. The general similarities between the two viruses are, first, that both cause respiratory disease, which can be asymptomatic or mild but can also be severe and fatal, and second, that both are transmitted by contact or droplets. These predominant similarities between pneumonia-causing viruses (influenza and SARS-CoV-2) motivated the proposed work. The primary limitations identified so far are as follows:
Use of CT and CXR images: The COVID-19 virus attacks the cells of the respiratory tract, predominantly the lung tissue, so images of the thorax can be used to detect the virus without any test kit. Chest X-rays are reported to be of little value in the initial stages, whereas CT scans of the chest are valuable even before symptoms appear.
Testing time: The test used to identify COVID-19 is not fast enough. Especially during the initial stages of the infection, it is very hard to assess patients. Manual analysis of the CXR and CT scans of many patients by radiologists takes a great deal of time, so an automated system is needed to save radiologists' valuable time.
Reusing an existing model: In this paper, an existing model is reused to identify COVID-19 from CT scan and chest X-ray images. It can precisely detect the abnormal features that appear in the images.
To address the limitations stated above, this paper proposes a transfer learning model.
Proposed work
In this section, the proposed transfer learning model along with Haralick features used in the model are discussed.
Transfer learning
In this work, a novel system is presented to identify COVID-19 infection using deep learning techniques. First, transfer learning is performed on images from the NIH [21, 26, 30] Chest X-Ray-14 dataset to identify which disease has the greatest similarity to COVID-19. Pneumonia is found to be the most similar to COVID-19. Fig. 1 shows chest X-ray images for various lung conditions.
Fig. 1
(a) The normal lungs, (b) The bacterial pneumonia affected lungs, (c) The viral pneumonia affected lungs, (d) COVID-19 affected lungs
Transfer learning [9, 24] is a method in which the knowledge gained by a model on one task (here, the viral pneumonia detection model) is transferred to a similar task (the COVID-19 detection model). In transfer learning, initial training is carried out on a large dataset to perform classification. The architecture of the proposed transfer learning model is shown in Fig. 2. The CXR and CT images of various lung diseases, including COVID-19, are fed to the model. First, the images are preprocessed to improve their quality.
Fig. 2
Proposed Architecture of the Transfer Learning Model
Histogram equalization and a Wiener filter are applied as image enhancement techniques, to increase the contrast and to remove noise respectively. Histogram equalization provides an enhanced, better-quality image without loss of information. The Wiener filter estimates the target signal by filtering the noisy regions of an image; blurred images are deconvolved by minimizing the Mean Square Error (MSE). The area of interest is chosen using ITK-SNAP software. Image resizing is performed automatically using Python PIL. The images obtained after applying the image enhancement techniques are presented in Fig. 3.
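The contrast-enhancement step can be illustrated with a minimal pure-Python sketch of histogram equalization for a grayscale image stored as a list of rows. The function name `equalize_histogram` and the `levels` parameter are illustrative assumptions rather than names from the paper, and the Wiener filtering, ITK-SNAP region selection and PIL resizing steps are not covered here.

```python
def equalize_histogram(image, levels=256):
    """Spread grey levels by mapping them through the normalised cumulative histogram.

    `image` is a list of rows of integer grey levels in [0, levels - 1].
    This is a textbook formulation, assumed here for illustration only.
    """
    # Histogram of grey levels.
    hist = [0] * levels
    for row in image:
        for value in row:
            hist[value] += 1

    # Cumulative distribution of the histogram.
    cdf = []
    running = 0
    for count in hist:
        running += count
        cdf.append(running)

    total = cdf[-1]
    cdf_min = next(c for c in cdf if c > 0)
    if total == cdf_min:
        # Constant image: nothing to stretch.
        return [list(row) for row in image]

    # Look-up table that stretches the occupied range over [0, levels - 1].
    scale = (levels - 1) / (total - cdf_min)
    lut = [max(0, round((c - cdf_min) * scale)) for c in cdf]
    return [[lut[value] for value in row] for row in image]


# A low-contrast 2x2 patch is stretched over the full 8-bit range.
patch = [[10, 20], [30, 40]]
equalized = equalize_histogram(patch)
```

The look-up table is built once per image, so the mapping preserves the ordering of grey levels while widening the gaps between them.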
Fig. 3
(a) Chest CT scan image, (b) Histogram-equalized image with increased contrast, (c) Wiener-filtered image with noise removed
Haralick texture features are obtained from the enhanced images, and the enhanced images are then fed into various pre-defined CNN models. The Haralick features are discussed in the following section. In pre-defined CNN models such as ResNet50, VGG16 and InceptionV3, the convolutional layers extract the image features and the max-pooling layers downsample the images for dimensionality reduction. Intermediate results are shown in Fig. 4.
Fig. 4
Intermediate output of COVID-19 image for first two layers from VGG16
Regularization is performed by the dropout layers, which also speed up execution by dropping neurons whose contribution to the output is low. The weights and biases are initialized randomly. The image size is chosen as 226x226. The Adam optimizer updates the weights and biases. The sample images are trained in batches of size 250. Early stopping is employed to avoid over-fitting. Five different CNN models are also built with different configurations to compare the results, as shown in Fig. 5.
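As an illustration of the early-stopping rule mentioned above, the following sketch stops training once the validation loss has failed to improve for a fixed number of epochs. The paper does not state its stopping criterion, so the `patience` and `min_delta` parameters, the function name and the loss values are all hypothetical.

```python
def early_stopping_epoch(val_losses, patience=3, min_delta=0.0):
    """Return (stop_epoch, best_epoch) for a patience-based early-stopping rule.

    Training stops at the first epoch where the validation loss has not
    improved by more than `min_delta` for `patience` consecutive epochs.
    If that never happens, the last epoch is returned as the stop epoch.
    """
    best_loss = float("inf")
    best_epoch = -1
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss - min_delta:
            best_loss = loss
            best_epoch = epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch, best_epoch
    return len(val_losses) - 1, best_epoch


# Hypothetical validation losses: improvement stalls after epoch 2.
losses = [0.90, 0.70, 0.60, 0.61, 0.62, 0.63, 0.50]
stop_epoch, best_epoch = early_stopping_epoch(losses, patience=3)
```

In practice the weights from `best_epoch` would be restored, since later epochs did not improve on them.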
Fig. 5
CNN model with different configuration
The stride and dilation are both set to 1, which is the default value. Since these models perform one-class classification, i.e. a sample either belongs to the class or does not, the task is equivalent to binary classification. The sigmoid function is therefore used as the activation function in the fully connected layer, as given in Eq (1).
\[ \sigma(x) = \frac{1}{1 + e^{-x}} \tag{1} \]
For the convolution and max-pooling layers, the ReLU function is used to activate the neurons; it is defined in Eq (2).
\[ f(x) = \max(0, x) \tag{2} \]
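The two activation functions can be written in a few lines of plain Python. This is a scalar sketch for illustration; the function names are assumptions, and a deep learning framework would apply the same formulas element-wise to tensors.

```python
import math


def sigmoid(x):
    """Logistic sigmoid 1 / (1 + exp(-x)), written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)  # safe: x is negative, so exp(x) is at most 1
    return z / (1.0 + z)


def relu(x):
    """Rectified linear unit: passes positive inputs, clamps negatives to zero."""
    return max(0.0, x)


# A sigmoid output above 0.5 is read as "belongs to the class".
score = sigmoid(1.2)
belongs_to_class = score > 0.5
```

The sigmoid maps any real-valued score to the interval (0, 1), which is why it suits the binary decision in the final layer, while ReLU leaves positive activations unchanged in the earlier layers.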
Each model is trained with a dropout of 0.2 or 0.3. The transfer learning model is applied to predict COVID-19 images instead of developing a new deep learning model from scratch, since the latter takes more training time. The different pre-trained models and the differently configured CNN models are trained and tested with images of different lung diseases. The model that gives the lowest misclassification with viral pneumonia, VGG16, is used for prediction of COVID-19 cases by the proposed transfer learning model.
Haralick Texture Feature Extraction
The Haralick features [16] are extracted from the resized images, as indicated in Fig. 2. Haralick features describe the relation between the intensities of adjacent pixels. Haralick presented fourteen metrics for textural features, which are obtained from the co-occurrence matrix. This matrix records how the intensity of a pixel at a given position is related to that of a neighbouring pixel. The Gray-Level Co-occurrence Matrix (GLCM) is an N x N matrix, where N is the number of intensity levels in the image. A GLCM is constructed for each image and the fourteen features are evaluated from it. Computing these 14 features helps to identify new inter-relationships between biological features in the images. The relationships among pixels carry information about the spatial distribution of tone and texture variations in an image. The homogeneity of the image (F1), which is the similarity between pixels, is given by Eq (3), where p(k,l) is the element at position (k,l) of the matrix.
The contrast (F2), a measure of the difference between the maximum and minimum pixel values, is given by Eq (4), where m is |k−l| and p(k,l) is the element at position (k,l) of the matrix.
The correlation (F3), which measures the dependency between the gray levels of adjacent pixels, is given by Eq (5), where μ_x, μ_y, σ_x and σ_y are the means and standard deviations of the marginal probability density functions and p(k,l) is the element at position (k,l) of the matrix.
The variance, or sum of squares (F4), which averages the squared differences from the mean of an image, is given by Eq (6), where p(k,l) is the element at position (k,l) of the matrix and μ is the mean value.
The local homogeneity of an image (F5), also called the Inverse Difference Moment (IDM), is given by Eq (7).
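The construction of the co-occurrence matrix and the first five features can be sketched in plain Python. This sketch assumes a horizontal offset of one pixel, symmetric counting of pixel pairs, gray levels numbered from 1 when computing means, and the variance taken about the row-marginal mean; none of these choices is stated in the text, and the function names `glcm` and `haralick_f1_to_f5` are hypothetical.

```python
import math


def glcm(image, levels, dx=1, dy=0):
    """Symmetric, normalised grey-level co-occurrence matrix for offset (dx, dy)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                a, b = image[r][c], image[r2][c2]
                counts[a][b] += 1
                counts[b][a] += 1  # count each pair in both directions
    total = sum(sum(row) for row in counts)
    return [[value / total for value in row] for row in counts]


def haralick_f1_to_f5(p):
    """F1..F5 (ASM, contrast, correlation, variance, IDM) from a normalised GLCM."""
    n = len(p)
    cells = [(k, l) for k in range(n) for l in range(n)]
    # Marginal means and variances, using grey levels 1..N.
    mean_x = sum((k + 1) * p[k][l] for k, l in cells)
    mean_y = sum((l + 1) * p[k][l] for k, l in cells)
    var_x = sum((k + 1 - mean_x) ** 2 * p[k][l] for k, l in cells)
    var_y = sum((l + 1 - mean_y) ** 2 * p[k][l] for k, l in cells)
    covariance = sum((k + 1) * (l + 1) * p[k][l] for k, l in cells) - mean_x * mean_y
    spread = math.sqrt(var_x * var_y)
    return {
        "F1": sum(p[k][l] ** 2 for k, l in cells),
        "F2": sum((k - l) ** 2 * p[k][l] for k, l in cells),
        "F3": covariance / spread if spread > 0 else 0.0,
        "F4": var_x,
        "F5": sum(p[k][l] / (1 + (k - l) ** 2) for k, l in cells),
    }


# A 4x4 image with four grey levels (0..3).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
features = haralick_f1_to_f5(glcm(image, levels=4))
```

Haralick features are commonly averaged over several offsets and directions; a single offset is used here only to keep the sketch short.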
The sum average (F6), obtained by summing the mean values in a given image, is given by Eq (8), where a and b are the row and column positions in the co-occurrence matrix and the sum runs over a+b.
The sum variance (F7), obtained by summing the variance values of an image, is given by Eq (9).
The sum entropy (F8), the total amount of information that must be coded for an image, is given by Eq (10).
The entropy (F9), the amount of information that must be coded for an image, is given by Eq (11).
The difference variance (F10), obtained by differencing the variance of an image, is given by Eq (12).
The difference entropy (F11), obtained by differencing the entropy values of an image, is given by Eq (13).
Eq (14) gives the information measure of correlation 1 (F12), where HX and HY are the entropies of the two marginal distributions of the co-occurrence matrix.
Eq (15) gives the information measure of correlation 2 (F13), where HX and HY are the entropies of the two marginal distributions of the co-occurrence matrix. In these equations, HXY, HXY1 and HXY2 are as given in Eq (16), Eq (17) and Eq (18).
Eq (19) gives the linear interdependence of pixels in an image (F14), where p(k,i) and p(l,i) are the elements at positions (k,i) and (l,i) of the matrix.
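For reference, the fourteen features can be written out using the standard definitions from Haralick's formulation. The notation below (p_x, p_y, p_{x+y}, p_{x-y}, and the index s for k+l) is assumed for this summary and may differ from the symbols used in the numbered equations. The sum variance is centred on F6 here, which is a common convention; Haralick's original paper prints F8 in that position.

```latex
% p(k,l): normalised GLCM entry; N: number of grey levels.
\begin{align*}
p_x(k) &= \sum_{l} p(k,l), \qquad p_y(l) = \sum_{k} p(k,l) \\
p_{x+y}(s) &= \sum_{k+l=s} p(k,l), \qquad p_{x-y}(m) = \sum_{|k-l|=m} p(k,l) \\[4pt]
F_1 &= \sum_{k}\sum_{l} p(k,l)^2 \\
F_2 &= \sum_{m=0}^{N-1} m^2 \, p_{x-y}(m) \\
F_3 &= \frac{\sum_{k}\sum_{l} k\,l\,p(k,l) - \mu_x \mu_y}{\sigma_x \sigma_y} \\
F_4 &= \sum_{k}\sum_{l} (k-\mu)^2 \, p(k,l) \\
F_5 &= \sum_{k}\sum_{l} \frac{p(k,l)}{1+(k-l)^2} \\
F_6 &= \sum_{s=2}^{2N} s \, p_{x+y}(s) \\
F_7 &= \sum_{s=2}^{2N} (s-F_6)^2 \, p_{x+y}(s) \\
F_8 &= -\sum_{s=2}^{2N} p_{x+y}(s) \log p_{x+y}(s) \\
F_9 &= -\sum_{k}\sum_{l} p(k,l) \log p(k,l) \\
F_{10} &= \sum_{m=0}^{N-1} \bigl(m-\mu_{x-y}\bigr)^2 p_{x-y}(m),
  \qquad \mu_{x-y} = \sum_{m=0}^{N-1} m \, p_{x-y}(m) \\
F_{11} &= -\sum_{m=0}^{N-1} p_{x-y}(m) \log p_{x-y}(m) \\
F_{12} &= \frac{HXY - HXY1}{\max(HX, HY)} \\
F_{13} &= \sqrt{1 - \exp\bigl(-2\,(HXY2 - HXY)\bigr)} \\
HXY &= -\sum_{k}\sum_{l} p(k,l) \log p(k,l) \\
HXY1 &= -\sum_{k}\sum_{l} p(k,l) \log\bigl(p_x(k)\,p_y(l)\bigr) \\
HXY2 &= -\sum_{k}\sum_{l} p_x(k)\,p_y(l) \log\bigl(p_x(k)\,p_y(l)\bigr) \\
F_{14} &= \sqrt{\text{second largest eigenvalue of } Q},
  \qquad Q(k,l) = \sum_{i} \frac{p(k,i)\,p(l,i)}{p_x(k)\,p_y(i)}
\end{align*}
```

Terms with p = 0 are taken to contribute zero to the entropy sums.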
Experiments and results
In this section, the datasets used for the experiments are described, and the results and statistical analyses are presented.
Dataset
The COVID-19 data is gathered from various resources available in a GitHub open repository, RSNA and Google Images. The data collected from these resources are summarized in Table 1.
Table 1
COVID-19 image dataset

| Images | Number of images |
|---|---|
| X-Ray | 205 |
| CT scan | 202 |

The data [21, 26, 30] for the chest X-ray pulmonary diseases are obtained from NIH, with a total of 81,176 observations with disease labels from 30,805 unique patients, as shown in Table 2. The images are of size 1024x1024.
Table 2
Chest X-Ray 14 pulmonary disease dataset

| Disease name | Number of images |
|---|---|
| Atelectasis | 11559 |
| Cardiomegaly | 2776 |
| Consolidation | 4667 |
| Edema | 2303 |
| Effusion | 13317 |
| Emphysema | 2516 |
| Fibrosis | 1686 |
| Hernia | 227 |
| Infiltration | 19894 |
| Mass | 5782 |
| Nodule | 6331 |
| Pleural Thickening | 3385 |
| Pneumonia | 1431 |
| Pneumothorax | 5302 |

The data [12] for the viral pneumonia, bacterial pneumonia and normal images are obtained from Mendeley, with a total of 5,232 images, as shown in Table 3.
Table 3
Mendeley dataset for types of pneumonia

| Disease name | Number of images |
|---|---|
| Bacterial Pneumonia | 2,538 |
| Viral pneumonia | 1,345 |
| Normal | 1,349 |
Results
The misclassification rate is calculated for all the pre-trained models, namely VGG16, ResNet50 and InceptionV3. The misclassification rate in Eq (20) is used to find the models that are most similar to COVID-19, where N is the total number of images, F_N is the number of images that are actually COVID-19 but wrongly classified as not COVID-19, and F_P is the number of images that are not COVID-19 but wrongly classified as COVID-19.
\[ \text{Misclassification rate} = \frac{F_N + F_P}{N} \tag{20} \]
From Table 4, we can see that architecture 1 gives a better result than the other architectures, with the lowest misclassification rate. These architectures are shown in Fig. 5.
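The misclassification rate described above can be computed directly from paired true and predicted labels. The sketch below assumes boolean labels (True meaning COVID-19) and uses made-up example labels; the function name is hypothetical.

```python
def misclassification_rate(actual, predicted):
    """Fraction of samples that are false negatives or false positives.

    `actual` and `predicted` are equal-length sequences of booleans,
    where True means the image is (or is predicted to be) COVID-19.
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("label sequences must be non-empty and of equal length")
    false_negatives = sum(1 for a, p in zip(actual, predicted) if a and not p)
    false_positives = sum(1 for a, p in zip(actual, predicted) if not a and p)
    return (false_negatives + false_positives) / len(actual)


# Hypothetical labels for ten images: one false negative, one false positive.
actual = [True, True, True, True, True, False, False, False, False, False]
predicted = [True, True, True, True, False, True, False, False, False, False]
rate = misclassification_rate(actual, predicted)
```

A lower rate for a model trained on a given disease indicates that the model treats COVID-19 images much like images of that disease, which is the similarity criterion used in this section.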
Table 4
Misclassification rate when COVID-19 data is tested with models built using Mendeley dataset with 14 pulmonary lung diseases

| Model | Bacterial pneumonia | Viral pneumonia | Normal |
|---|---|---|---|
| 1 | 0.1251 | 0.0258 | 0.97 |
| 2 | 0.2937 | 0.0391 | 0.93 |
| 3 | 0.2506 | 0.0353 | 0.94 |
| 4 | 0.2395 | 0.0342 | 0.95 |
| 5 | 0.1429 | 0.0295 | 0.96 |

From Table 5, we can see that the COVID-19 data is very similar to pneumonia, consolidation and effusion. COVID-19 data tested on the pneumonia-trained model produces the lowest misclassification rate.
Table 5
Misclassification rate when COVID-19 data is tested with models built using NIH dataset with 14 pulmonary lung diseases

| Class | VGG-16 | Resnet50 | InceptionV3 |
|---|---|---|---|
| Atelectasis | 0.41 | 0.57 | 0.45 |
| Cardiomegaly | 0.78 | 0.76 | 0.85 |
| Consolidation | 0.34 | 0.43 | 0.38 |
| Edema | 0.56 | 0.59 | 0.63 |
| Effusion | 0.32 | 0.38 | 0.39 |
| Emphysema | 0.69 | 0.71 | 0.75 |
| Fibrosis | 0.74 | 0.82 | 0.75 |
| Hernia | 0.67 | 0.83 | 0.69 |
| Infiltration | 0.69 | 0.76 | 0.76 |
| Mass | 0.59 | 0.60 | 0.59 |
| Nodule | 0.73 | 0.73 | 0.76 |
| Pleural Thickening | 0.67 | 0.69 | 0.72 |
| Pneumothorax | 0.79 | 0.65 | 0.85 |
| Pneumonia | 0.30 | 0.35 | 0.33 |

Table 6 shows that transfer learning produces better accuracy (Accuracy1) than traditional learning (Accuracy2). This is because the COVID-19 data is similar to the pneumonia data; consequently, when a model is trained on pneumonia and tested with COVID-19 data, the accuracy is better. The time taken by VGG16 is lower because it is only 16 layers deep, whereas ResNet50 and InceptionV3 are 50 and 48 layers deep respectively; VGG16 nevertheless achieves better accuracy than the other models. The models are trained using NVIDIA Tesla P100 GPUs provided by Kaggle.
Table 6
Testing the pneumonia models with COVID-19 dataset

| Model | Accuracy1 | Accuracy2 | Elapsed time | Loss |
|---|---|---|---|---|
| VGG16 | 93.8% | 91.4% | 39 min 24 sec | 0.1272 |
| Resnet50 | 89.21% | 87.92% | 56 min 39 sec | 0.2433 |
| InceptionV3 | 82.42% | 78.15% | 79 min 54 sec | 0.3989 |

Analyzing the pneumonia images further, transfer learning is performed for the two types of pneumonia. COVID-19 is found to be most similar to viral pneumonia. The VGG16 model identifies the COVID-19 data with a misclassification rate of 0.012, as shown in Table 7.
Table 7
Misclassification rate when COVID-19 data is tested with models built using NIH dataset

| Class | VGG16 | Resnet50 | InceptionV3 |
|---|---|---|---|
| Bacterial pneumonia | 0.05 | 0.11 | 0.1526 |
| Viral pneumonia | 0.012 | 0.0222 | 0.0159 |
| Normal | 0.99 | 0.99 | 0.99 |

Table 8 shows that, out of 407 images for COVID-19 and normal images, 385 COVID-19 images are correctly classified as COVID-19 and 22 images are falsely classified under the non-viral pneumonia class. This shows that COVID-19 is very similar to viral pneumonia. A further 28 images are misclassified, which leads to the misclassification rate of 0.012 reported for viral pneumonia.
Table 8
Confusion matrix when COVID-19 data is tested with VGG16 viral pneumonia model

| | Viral pneumonia | Non Viral pneumonia |
|---|---|---|
| Viral pneumonia | 385 | 22 |
| Non Viral pneumonia | 28 | 379 |

From Fig. 6 we can see that the pre-trained VGG16 model has correctly classified the chest CT scan image as COVID-19. The top right image shows ground-glass opacity in the right lower lobe of the chest, which is filled with air. The images on the top left and bottom left are normal chest CT images. In the bottom right image, there are extensive ground-glass opacities in the lungs involving almost the entire lower left and lower right lobes, indicating COVID-19.
Fig. 6
Output of COVID-19 CT scan image classified correctly by VGG16
From Fig. 7, we can see that the VGG16 model has classified the chest X-ray images correctly. The images on the right side of Fig. 7 show increased patchy opacity in the right lower lobe, while the images on the left appear clearer. The left-side images are normal lung images, which are correctly classified by VGG16.
Fig. 7
Output of COVID-19 CXR image classified correctly by VGG16
The CT scan of the same person shows peculiar features, because the patient does not have any nodules or consolidation-like reticular opacities as in the previous images. The image shows a small patchy ground-glass opacity in the center of the lungs that has developed from the periphery. These images are also classified precisely, with a higher similarity percentage, because the model has found a similar image in the training set. This shows that the model is consistent across peculiar cases like the one shown in Fig. 8. This is the reason why the model is trained using both CXR and CT scan images.
Fig. 8
Output of COVID-19 CT scan image classified correctly by VGG16 with a higher similarity percentage to the training images
The accuracy and loss graphs are shown in Fig. 9 and Fig. 10 respectively. Accuracy increases steadily and loss decreases steadily during training and testing, which indicates that the model is effective and efficient.
Fig. 9
Accuracy graphs for ResNet50, InceptionV3 and VGG16 while finding the similarity to COVID-19 model
Fig. 10
Loss graphs for ResNet50, InceptionV3 and VGG16 while finding the similarity to COVID-19 model
After finding the similar models, the final classification is carried out and the confusion matrix is calculated for model evaluation. Tables 9, 10 and 11 show the confusion matrices for the conventional and transfer learning models when COVID-19 data is tested with the viral pneumonia models. Tables 12, 13 and 14 show the classification reports for all three models, with precision, recall and F1-score, where C1 denotes the normal class, C2 the bacterial pneumonia class, C3 the viral pneumonia class and C4 the COVID-19 class. The precision and recall for all the classes are found to be promising. The F1-score in Eq (21) is calculated from the precision in Eq (22) and the recall in Eq (23), where T_P is the number of images that are COVID-19 and are correctly classified as COVID-19, and F_P and F_N are the false positives and false negatives defined for Eq (20).
\[ F1 = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}} \tag{21} \]
\[ \text{Precision} = \frac{T_P}{T_P + F_P} \tag{22} \]
\[ \text{Recall} = \frac{T_P}{T_P + F_N} \tag{23} \]
This shows that the model is skilled and classifies the images precisely. Transfer learning gives better outcomes than normal classification.
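The per-class precision, recall and F1-score can be derived from a confusion matrix as sketched below, using the transfer learning matrix reported for VGG16 in Table 9. The sketch assumes that rows hold the predicted class and columns the actual class; this orientation is not stated in the text and is chosen because it agrees with the values in Table 12 to within the second decimal place. The function name is hypothetical.

```python
LABELS = ["C1", "C2", "C3", "C4"]

# VGG16 transfer learning confusion matrix from Table 9.
# Assumption: rows are predicted classes, columns are actual classes.
VGG16_TRANSFER = [
    [184, 16, 0, 0],
    [15, 185, 0, 0],
    [0, 2, 187, 11],
    [0, 0, 7, 193],
]


def classification_report(matrix):
    """Return a list of (precision, recall, f1) tuples, one per class."""
    size = len(matrix)
    report = []
    for c in range(size):
        true_positives = matrix[c][c]
        predicted_total = sum(matrix[c])                         # T_P + F_P
        actual_total = sum(matrix[r][c] for r in range(size))    # T_P + F_N
        precision = true_positives / predicted_total if predicted_total else 0.0
        recall = true_positives / actual_total if actual_total else 0.0
        denominator = precision + recall
        f1 = 2 * precision * recall / denominator if denominator else 0.0
        report.append((precision, recall, f1))
    return report


report = classification_report(VGG16_TRANSFER)
for label, (precision, recall, f1) in zip(LABELS, report):
    print(f"{label}: precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

If the opposite orientation were intended, precision and recall would simply swap for each class, while the F1-score would be unchanged.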
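Table 9
Confusion matrix for VGG16 model for conventional classification and transfer learning models

Classification

| Category | C1 | C2 | C3 | C4 | Total |
|---|---|---|---|---|---|
| C1 | 182 | 18 | 0 | 0 | 200 |
| C2 | 19 | 181 | 0 | 0 | 200 |
| C3 | 0 | 4 | 182 | 14 | 200 |
| C4 | 0 | 0 | 15 | 185 | 200 |
| Total | 201 | 203 | 197 | 199 | 800 |

Transfer learning

| Category | C1 | C2 | C3 | C4 | Total |
|---|---|---|---|---|---|
| C1 | 184 | 16 | 0 | 0 | 200 |
| C2 | 15 | 185 | 0 | 0 | 200 |
| C3 | 0 | 2 | 187 | 11 | 200 |
| C4 | 0 | 0 | 7 | 193 | 200 |
| Total | 199 | 203 | 194 | 204 | 800 |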
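Table 10
Confusion matrix for Resnet50 model for conventional classification and transfer learning models

Classification

| Category | C1 | C2 | C3 | C4 | Total |
|---|---|---|---|---|---|
| C1 | 175 | 25 | 0 | 0 | 200 |
| C2 | 6 | 174 | 20 | 0 | 200 |
| C3 | 0 | 5 | 176 | 19 | 200 |
| C4 | 0 | 0 | 22 | 178 | 200 |
| Total | 181 | 204 | 218 | 197 | 800 |

Transfer learning

| Category | C1 | C2 | C3 | C4 | Total |
|---|---|---|---|---|---|
| C1 | 176 | 24 | 0 | 0 | 200 |
| C2 | 20 | 176 | 4 | 0 | 200 |
| C3 | 0 | 0 | 178 | 22 | 200 |
| C4 | 0 | 0 | 17 | 183 | 200 |
| Total | 196 | 200 | 199 | 205 | 800 |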
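Table 11
Confusion matrix for InceptionV3 model for conventional classification and transfer learning models

Classification

| Category | C1 | C2 | C3 | C4 | Total |
|---|---|---|---|---|---|
| C1 | 154 | 46 | 0 | 0 | 200 |
| C2 | 45 | 155 | 0 | 0 | 200 |
| C3 | 0 | 0 | 156 | 44 | 200 |
| C4 | 0 | 0 | 40 | 160 | 200 |
| Total | 199 | 201 | 196 | 204 | 800 |

Transfer learning

| Category | C1 | C2 | C3 | C4 | Total |
|---|---|---|---|---|---|
| C1 | 162 | 38 | 0 | 0 | 200 |
| C2 | 36 | 164 | 0 | 0 | 200 |
| C3 | 0 | 3 | 165 | 32 | 200 |
| C4 | 0 | 2 | 30 | 168 | 200 |
| Total | 198 | 207 | 195 | 200 | 800 |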
Table 12
Classification report for VGG16 model for conventional classification and transfer learning models

Classification

| Category | Precision | Recall | F1-Score |
|---|---|---|---|
| C1 | 0.91 | 0.90 | 0.90 |
| C2 | 0.90 | 0.89 | 0.89 |
| C3 | 0.91 | 0.92 | 0.91 |
| C4 | 0.92 | 0.92 | 0.92 |
| Average | 0.91 | 0.90 | 0.90 |

Transfer learning

| Category | Precision | Recall | F1-Score |
|---|---|---|---|
| C1 | 0.92 | 0.92 | 0.92 |
| C2 | 0.92 | 0.91 | 0.92 |
| C3 | 0.93 | 0.96 | 0.94 |
| C4 | 0.96 | 0.94 | 0.95 |
| Average | 0.93 | 0.93 | 0.93 |
Table 13
Classification report for Resnet50 model for conventional classification and transfer learning models

Classification

| Category | Precision | Recall | F1-Score |
|---|---|---|---|
| C1 | 0.87 | 0.96 | 0.91 |
| C2 | 0.87 | 0.85 | 0.86 |
| C3 | 0.88 | 0.80 | 0.84 |
| C4 | 0.89 | 0.90 | 0.89 |
| Average | 0.87 | 0.87 | 0.87 |

Transfer learning

| Category | Precision | Recall | F1-Score |
|---|---|---|---|
| C1 | 0.88 | 0.89 | 0.88 |
| C2 | 0.88 | 0.88 | 0.88 |
| C3 | 0.89 | 0.89 | 0.89 |
| C4 | 0.91 | 0.89 | 0.89 |
| Average | 0.89 | 0.88 | 0.885 |
Table 14
Classification report for InceptionV3 model for conventional classification and transfer learning models

Classification

| Category | Precision | Recall | F1-Score |
|---|---|---|---|
| C1 | 0.77 | 0.77 | 0.77 |
| C2 | 0.77 | 0.77 | 0.77 |
| C3 | 0.78 | 0.78 | 0.78 |
| C4 | 0.80 | 0.78 | 0.78 |
| Average | 0.78 | 0.77 | 0.77 |

Transfer learning

| Category | Precision | Recall | F1-Score |
|---|---|---|---|
| C1 | 0.81 | 0.81 | 0.81 |
| C2 | 0.82 | 0.79 | 0.80 |
| C3 | 0.82 | 0.84 | 0.82 |
| C4 | 0.84 | 0.84 | 0.84 |
| Average | 0.93 | 0.93 | 0.93 |
A sample of the 14 Haralick features for 10 sample images is shown in Table 15, Table 16 and Table 17 for normal, viral pneumonia and COVID-19 images respectively. The Haralick features of 200 images of normal, viral pneumonia and COVID-19 lungs are then analyzed. From this analysis, it is concluded from Tables 18, 19 and 20 that the features F4, F6 and F7 should lie only within certain ranges, while the other features exhibit smaller deviations. The range of values for an image to be detected as normal is given in Eq (24), Eq (25) and Eq (26).
For an image to be identified as a viral pneumonia affected lung image, the values of the features must lie within the ranges given in Eq (27), Eq (28) and Eq (29).
For an image to be identified as a COVID-19 affected lung image, the values of the features must lie within the ranges given in Eq (30), Eq (31) and Eq (32).
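The range test described above can be sketched as a simple lookup. The sketch assumes that the ranges are the minimum and maximum values of F4, F6 and F7 reported in Tables 19 and 20; the dictionary layout and the function name `matching_classes` are hypothetical.

```python
# (min, max) ranges for F4, F6 and F7, taken from Tables 19 and 20.
# Assumption: the range equations correspond to these min/max values.
FEATURE_RANGES = {
    "normal": {
        "F4": (23.6780, 33.8134),
        "F6": (8.1234, 11.8791),
        "F7": (100.7445, 132.4556),
    },
    "viral pneumonia": {
        "F4": (35.7826, 47.1890),
        "F6": (10.8714, 15.6770),
        "F7": (147.7801, 205.5645),
    },
    "COVID-19": {
        "F4": (33.7812, 57.3214),
        "F6": (10.1552, 19.0101),
        "F7": (132.8199, 210.9901),
    },
}


def matching_classes(f4, f6, f7):
    """Return every class whose F4, F6 and F7 ranges all contain the given values."""
    values = {"F4": f4, "F6": f6, "F7": f7}
    matches = []
    for label, ranges in FEATURE_RANGES.items():
        if all(low <= values[name] <= high for name, (low, high) in ranges.items()):
            matches.append(label)
    return matches


# Average feature values from Table 15 (normal) and Table 17 (COVID-19).
normal_match = matching_classes(29.25, 10.09, 115.79)
covid_match = matching_classes(41.86, 15.04, 162.53)
```

Because the viral pneumonia ranges lie inside the COVID-19 ranges, some feature vectors satisfy both sets of conditions, so a range test of this kind can only narrow the candidates rather than replace the CNN classifier.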
Table 15
Haralick features extracted from ten normal lung images

| ID | F1 | F2 | F3 | F4 | F5 | F6 | F7 | F8 | F9 | F10 | F11 | F12 | F13 | F14 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | .1731 | .0412 | .9921 | 31.45 | .9634 | 10.2 | 109.10 | 2.1566 | 3.3121 | .0934 | .1843 | .8299 | .9934 | .1355 |
| 2 | .1654 | .0499 | .9910 | 30.12 | .9601 | 9.98 | 140.24 | 2.0014 | 3.7768 | .0567 | .2909 | .8593 | .9891 | .2197 |
| 3 | .2278 | .0711 | .9899 | 25.12 | .9576 | 8.10 | 106.75 | 1.8962 | 2.5467 | .0568 | .3201 | .8402 | .9945 | .2401 |
| 4 | .1978 | .0599 | .9923 | 24.67 | .9656 | 9.28 | 133.00 | 1.9938 | 2.4934 | .0678 | .1904 | .8723 | .9912 | .1646 |
| 5 | .1867 | .0567 | .9985 | 29.01 | .9767 | 11.9 | 112.12 | 1.9021 | 2.9676 | .0507 | .2708 | .8645 | .9978 | .1593 |
| 6 | .1934 | .0634 | .9878 | 34.67 | .9689 | 9.52 | 123.67 | 2.0567 | 3.0192 | .0593 | .2736 | .8345 | .9893 | .2054 |
| 7 | .2034 | .0689 | .9886 | 31.98 | .9525 | 11.5 | 129.34 | 2.1776 | 2.4366 | .0799 | .2498 | .8012 | .9991 | .1995 |
| 8 | .1901 | .0734 | .9906 | 26.78 | .9712 | 10.8 | 100.12 | 2.1132 | 2.5012 | .0673 | .2809 | .8510 | .9923 | .1697 |
| 9 | .1835 | .0726 | .9978 | 28.99 | .9694 | 8.71 | 102.67 | 1.9056 | 2.7094 | .0986 | .1908 | .8399 | .9967 | .2001 |
| 10 | .1905 | .0448 | .9956 | 29.78 | .9623 | 10.9 | 100.93 | 1.9990 | 2.6790 | .0590 | .1802 | .8299 | .9987 | .2109 |
| AVG | .1991 | .0601 | .9924 | 29.25 | .9647 | 10.09 | 115.79 | 2.0202 | 2.8442 | .0689 | .2431 | .8422 | .9942 | .1904 |
Table 16
Haralick features extracted from ten viral pneumonia lung images

| ID | F1 | F2 | F3 | F4 | F5 | F6 | F7 | F8 | F9 | F10 | F11 | F12 | F13 | F14 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | .1921 | .0478 | .9841 | 34.87 | .9835 | 11.57 | 149.46 | 2.1203 | 3.0135 | .0605 | .2919 | .8012 | .9867 | .2895 |
| 2 | .2822 | .0439 | .9811 | 39.98 | .9627 | 11.56 | 148.69 | 2.0421 | 2.5664 | .0467 | .1604 | .8445 | .9934 | .2049 |
| 3 | .2405 | .0505 | .9984 | 46.35 | .9640 | 12.46 | 205.10 | 1.7903 | 2.2357 | .0512 | .2999 | .8746 | .9895 | .2210 |
| 4 | .2012 | .0603 | .9990 | 42.89 | .9795 | 12.01 | 176.24 | 1.6919 | 2.4903 | .0670 | .1734 | .8601 | .9942 | .1712 |
| 5 | .1912 | .0435 | .9992 | 40.45 | .9698 | 13.35 | 156.35 | 1.9123 | 2.9927 | .0513 | .2820 | .8719 | .9983 | .1807 |
| 6 | .2989 | .0934 | .9978 | 38.89 | .9533 | 11.56 | 156.78 | 2.1046 | 3.0111 | .0593 | .2694 | .8203 | .9923 | .2010 |
| 7 | .1947 | .0613 | .9929 | 37.99 | .9782 | 12.99 | 180.35 | 2.0367 | 2.7266 | .0799 | .2302 | .8803 | .9901 | .2198 |
| 8 | .1603 | .0949 | .9890 | 42.45 | .9639 | 13.01 | 150.34 | 1.9567 | 2.6904 | .0649 | .2901 | .8712 | .9899 | .1732 |
| 9 | .2834 | .0392 | .9823 | 37.45 | .9599 | 11.68 | 167.93 | 1.9302 | 2.8012 | .0956 | .1799 | .8593 | .9896 | .2278 |
| 10 | .2011 | .0389 | .9898 | 39.73 | .9721 | 10.98 | 157.89 | 1.9856 | 2.5065 | .0920 | .1904 | .8302 | .9896 | .2049 |
| AVG | .2245 | .0573 | .9913 | 40.10 | .9686 | 12.11 | 164.91 | 1.9570 | 2.7034 | .0668 | .2367 | .8513 | .9913 | .2094 |
Table 17
Haralick features extracted from ten COVID-19 affected lung images

| ID | F1 | F2 | F3 | F4 | F5 | F6 | F7 | F8 | F9 | F10 | F11 | F12 | F13 | F14 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | .2019 | .0532 | .9935 | 32.95 | .9934 | 10.56 | 152.34 | 1.234 | 3.3498 | .0712 | .3046 | .8346 | .9897 | .2467 |
| 2 | .2775 | .0604 | .9719 | 34.78 | .9789 | 14.23 | 130.46 | 1.9424 | 2.4672 | .0645 | .4606 | .8578 | .9775 | .2744 |
| 3 | .2390 | .0392 | .9968 | 68.23 | .9546 | 15.67 | 211.23 | 2.4678 | 2.1290 | .0946 | .1203 | .8913 | .9805 | .2102 |
| 4 | .2110 | .0402 | .9839 | 43.34 | .9920 | 19.10 | 199.34 | 1.0012 | 2.9348 | .0201 | .5675 | .8160 | .9799 | .1304 |
| 5 | .1560 | .0670 | .9946 | 45.67 | .9459 | 12.64 | 189.12 | 2.5478 | 3.9960 | .0645 | .2201 | .8325 | .9902 | .0712 |
| 6 | .2756 | .0892 | .9898 | 39.25 | .9278 | 15.95 | 170.12 | 2.9045 | 2.1132 | .0344 | .1930 | .8404 | .9999 | .2590 |
| 7 | .1728 | .0302 | .9917 | 34.33 | .9735 | 17.91 | 105.67 | 2.4677 | 2.5471 | .0930 | .2102 | .8576 | .9987 | .2701 |
| 8 | .1496 | .0956 | .9923 | 48.12 | .9567 | 16.99 | 157.45 | 1.2309 | 2.5780 | .0923 | .1682 | .8602 | .9993 | .1230 |
| 9 | .3040 | .0505 | .9780 | 39.89 | .9893 | 12.02 | 189.45 | 2.9204 | 2.0913 | .0675 | .1012 | .8503 | .9921 | .2903 |
| 10 | .1846 | .0678 | .9946 | 32.10 | .9829 | 15.38 | 120.12 | 1.5789 | 2.9103 | .0859 | .1110 | .8902 | .9929 | .2001 |
| AVG | .2172 | .0593 | .9887 | 41.86 | .9695 | 15.04 | 162.53 | 2.0295 | 2.7116 | .0680 | .2456 | .8503 | .9900 | .2075 |
Table 18
Statistical analysis of Haralick features extracted for normal, COVID-19 and viral pneumonia lung images (200 images)

| Haralick feature | Normal: Mean | Normal: Min | Normal: Max | COVID-19: Mean | COVID-19: Min | COVID-19: Max | Viral pneumonia: Mean | Viral pneumonia: Min | Viral pneumonia: Max |
|---|---|---|---|---|---|---|---|---|---|
| F1 | 0.2010 | 0.1567 | 0.2298 | 0.2217 | 0.1567 | 0.3123 | 0.2345 | 0.1989 | 0.2712 |
| F2 | 0.0724 | 0.0417 | 0.0800 | 0.0673 | 0.0345 | 0.987 | 0.0671 | 0.0512 | 0.0987 |
| F3 | 0.9873 | 0.9799 | 0.9945 | 0.9987 | 0.9567 | 0.9987 | 0.9954 | 0.9847 | 0.9989 |
| F4 | 31.5 | 23.67 | 33.81 | 43.12 | 33.78 | 57.32 | 41.56 | 35.78 | 47.18 |
| F5 | 0.9723 | 0.9487 | 0.9764 | 0.9789 | 0.9568 | 0.9921 | 0.9789 | 0.9456 | 0.9923 |
| F6 | 11.56 | 8.12 | 11.87 | 16.78 | 10.15 | 19.01 | 14.71 | 10.87 | 15.67 |
| F7 | 117.21 | 100.74 | 132.45 | 167.65 | 132.81 | 210.99 | 170.89 | 147.78 | 205.56 |
| F8 | 2.1245 | 1.789 | 2.1576 | 2.0278 | 1.4567 | 2.9523 | 2.1045 | 1.6829 | 1.9985 |
| F9 | 2.8394 | 2.4589 | 3.3012 | 2.7345 | 2.1145 | 2.9670 | 2.8915 | 2.4567 | 3.1278 |
| F10 | 0.0672 | 0.0561 | 0.0947 | 0.0678 | 0.0561 | 0.0742 | 0.0676 | 0.0672 | 0.0986 |
| F11 | 0.2542 | 0.1856 | 0.2789 | 0.3124 | 0.1291 | 0.3099 | 0.2453 | 0.1567 | 0.2671 |
| F12 | 0.8521 | 0.7989 | 0.8871 | 0.8670 | 0.8125 | 0.8891 | 0.8645 | 0.8967 | 0.7934 |
| F13 | 0.9954 | 0.98767 | 0.9921 | 0.9889 | 0.9865 | 0.9923 | 0.9945 | 0.9789 | 0.9969 |
| F14 | 0.2017 | 0.1459 | 0.2199 | 0.2077 | 0.0789 | 0.1678 | 0.2198 | 0.1674 | 0.2914 |
Table 19
Statistical analysis of Haralick features extracted for normal and viral pneumonic lung images

| Haralick feature | Lung X-ray type | Mean value | Min value | Max value |
|---|---|---|---|---|
| F4 | Pneumonia lung | 41.5612 | 35.7826 | 47.1890 |
| F4 | Normal lung | 31.5017 | 23.6780 | 33.8134 |
| F6 | Pneumonia lung | 14.7123 | 10.8714 | 15.6770 |
| F6 | Normal lung | 11.5677 | 8.1234 | 11.8791 |
| F7 | Pneumonia lung | 170.8990 | 147.7801 | 205.5645 |
| F7 | Normal lung | 117.2112 | 100.7445 | 132.4556 |
Table 20
Statistical analysis of Haralick features extracted for normal and COVID-19 lung images

| Haralick feature | Lung X-ray type | Mean value | Min value | Max value |
|---|---|---|---|---|
| F4 | COVID-19 lung | 43.1267 | 33.7812 | 57.3214 |
| F4 | Normal lung | 31.5017 | 23.6780 | 33.8134 |
| F6 | COVID-19 lung | 16.7810 | 10.1552 | 19.0101 |
| F6 | Normal lung | 11.5677 | 8.1234 | 11.8791 |
| F7 | COVID-19 lung | 167.6543 | 132.8199 | 210.9901 |
| F7 | Normal lung | 117.2112 | 100.7445 | 132.4556 |
Performance Comparison
The efficacy of the proposed model is compared with other recent studies on COVID-19 classification in Table 21. In this comparison, the proposed transfer learning model reports the highest accuracy (93%) among the listed models.
Table 21
Comparison of studies on COVID-19 classification

| Author | Image type | Accuracy | Classification method |
|---|---|---|---|
| Shuai Wang [24] | CT images | 82.9% | Transfer learning |
| Ezz El-Din Hemdan [10] | X-Ray images | 89% | COVIDX-Net |
| Jinyu Zhao [32] | CT images | 83% | Pre-trained model |
| Feng Shi [20] | CT images | 87.9% | Random Forest method |
| Ioannis D. Apostolopoulos [1] | X-Ray images | 88.8% | CNN |
| Khalid El Asnaoui [7] | X-Ray images | 84% | Pre-trained models |
| Shuo Wang [25] | CT images | 88% | Pre-trained models |
| Yujin Oh [15] | X-Ray images | 88.9% | Pre-trained model |
| Asif Iqbal Khan [13] | X-Ray images | 89.5% | Deep Neural Network |
| Proposed work | CT+X-Ray images | 93% | Transfer learning |
Visualisation
The infected regions of the lung images are identified using GradCAM. The images in Fig. 11 show heatmap visualizations based on the predictions made by the transfer learning model (VGG-16), which produces better accuracy. Using GradCAM, we can visually check where the proposed network is looking, verifying that it attends to the correct patterns in the image and activates around the patterns that our model uses for classification. The generated heatmaps show the infected regions that are correctly identified by our model. To sum up GradCAM, the images are passed through the fully trained model and features are extracted from the last convolution layer. Let $f_i$ be the $i$th feature map and let $w_i$ be the weight in the final classification layer that connects feature map $i$ to the predicted class. We obtain a map $M$ of the most salient features used in assigning the image to that class by calculating the weighted sum of the feature maps using these weights, as given in Eq. (33):

$$M = \sum_i w_i f_i \tag{33}$$
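The weighted sum described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `class_activation_map`, the array layout (height, width, channels) and the final normalisation to [0, 1] are all choices made here for the example. The weights are simply passed in; in Grad-CAM proper they are obtained by averaging the gradients of the class score over each feature map, and the result is passed through a ReLU, which the sketch also applies.

```python
import numpy as np


def class_activation_map(feature_maps, class_weights):
    """Weighted sum of feature maps, clipped at zero and scaled to [0, 1].

    feature_maps  : array of shape (H, W, K), output of the last conv layer
    class_weights : array of shape (K,), one weight per feature map
    """
    feature_maps = np.asarray(feature_maps, dtype=float)
    class_weights = np.asarray(class_weights, dtype=float)

    # M = sum_i w_i * f_i, contracted over the channel axis.
    cam = np.tensordot(feature_maps, class_weights, axes=([2], [0]))

    # Keep only regions that contribute positively to the class (ReLU).
    cam = np.maximum(cam, 0.0)

    # Scale to [0, 1] so the map can be overlaid as a heatmap.
    peak = cam.max()
    return cam / peak if peak > 0 else cam


# Toy example: two 2x2 feature maps with hypothetical weights 1 and 2.
toy_maps = np.zeros((2, 2, 2))
toy_maps[0, 0, 0] = 1.0
toy_maps[1, 1, 1] = 1.0
toy_heatmap = class_activation_map(toy_maps, [1.0, 2.0])
print(toy_heatmap)
```

In practice the resulting map would be resized to the input image resolution before being overlaid on the X-ray or CT image.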
Fig. 11
Output of Gradient-weighted Class Activation Mapping (GradCAM) generated heatmap visualization for images
Discussion
The World Health Organisation (WHO) has recommended RT-PCR testing for suspected cases, but many countries have been unable to follow this recommendation because of a shortage of testing kits. Here, the transfer learning technique can provide a quick alternative to aid the diagnostic process and thereby limit the spread. The primary purpose of this work is to provide radiologists with a less complex model that can aid in the early diagnosis of COVID-19. Using VGG-16 with transfer learning, the proposed model achieves a precision of 91%, a recall of 90% and an accuracy of 93%, which is higher than the accuracies reported by the existing models listed in Table 21.
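For readers who want to see how precision, recall and accuracy relate to each other, the sketch below computes them from confusion-matrix counts for a single class treated one-vs-rest. The function name and the counts in the example are hypothetical and chosen only for illustration; they are not the counts behind the figures reported above, and how the per-class values were aggregated in this work is not assumed here.

```python
def classification_metrics(tp, fp, fn, tn):
    """Precision, recall and accuracy from one-vs-rest confusion counts.

    tp: true positives, fp: false positives,
    fn: false negatives, tn: true negatives.
    """
    def safe_ratio(numerator, denominator):
        # Avoid division by zero when a class is never predicted or present.
        return numerator / denominator if denominator else 0.0

    return {
        "precision": safe_ratio(tp, tp + fp),
        "recall": safe_ratio(tp, tp + fn),
        "accuracy": safe_ratio(tp + tn, tp + fp + fn + tn),
    }


# Hypothetical counts, for illustration only.
example = classification_metrics(tp=90, fp=10, fn=10, tn=90)
print(example)
```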
Conclusion
This COVID-19 detection model has been developed keeping in mind the challenges that prevail in COVID-19 detection, using data assimilated from multiple sources. Detecting this viral infection requires analysis of unusual features in the images. The earlier the viral infection is detected, the more it helps in saving lives. This paper takes a holistic approach, taking into account the critical issues in the domain. The results are fairly consistent across the cases considered. We hope the outcomes discussed in this paper serve as a small step towards constructing a refined COVID-19 detection model using CXR and CT images. In future work, more data can be assimilated to obtain better results and further strengthen the proposed model.