
Deep Learning in Classification of Covid-19 Coronavirus, Pneumonia and Healthy Lungs on CXR and CT Images.

Mihaela-Ruxandra Lascu

Abstract

Purpose: In this paper, transfer learning is applied to chest X-ray (CXR) and computed tomography (CT) images of various lung diseases, including coronavirus disease 2019 (COVID-19). Identifying COVID-19 is a difficult task that demands careful analysis of a patient's clinical images, because COVID-19 closely resembles viral pneumonia. A transfer learning model is proposed to accelerate prediction and to assist medical professionals. The main purpose is to classify COVID-19, pneumonia, and healthy lungs accurately using CXR and CT images.
Methods: Transfer learning makes it possible to learn about the new disease COVID-19 using existing knowledge about viral pneumonia. The knowledge gained by an architecture trained to detect virus-related pneumonia is transferred to COVID-19 detection. Transfer learning yields considerably different results from traditional classification. It is not necessary to create a separate model for the classification of COVID-19, which simplifies the problem by adapting the available model to COVID-19 detection. Automated diagnosis of COVID-19 using Haralick texture features focuses on segmented lung images and on problematic lung patches. Lung patches are needed to augment the COVID-19 image data.
Results: The obtained outcomes are quite reliable across all the processes considered, as the proposed architecture can distinguish healthy lungs, pneumonia, and COVID-19.
Conclusions: The results suggest that the implemented model improves on other existing models, because the obtained classification accuracy is higher than recently reported results. The new architecture implemented in this study is believed to be a small step toward a refined COVID-19 diagnosis architecture using CXR and CT images. © Taiwanese Society of Biomedical Engineering 2021.

Keywords:  COVID-19; Classification augmentation; Deep learning; Network architecture; Pulmonary disease detection; Segmentation; X-ray/CT imaging

Year:  2021        PMID: 34127912      PMCID: PMC8190751          DOI: 10.1007/s40846-021-00630-2

Source DB:  PubMed          Journal:  J Med Biol Eng        ISSN: 1609-0985            Impact factor:   2.213


Introduction

Coronavirus

Coronavirus disease 2019 (COVID-19) is caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). Because the disease is highly infectious and proper treatment is lacking, fast detection of COVID-19 is essential to limit its spread and to guide the appropriate allocation of finite clinical resources. Antigen tests are fast but have low sensitivity. Currently, reverse transcription-polymerase chain reaction (RT-PCR), which detects the viral nucleic acid, is the reference standard for detecting COVID-19. RT-PCR samples are obtained with nasopharyngeal and throat swabs, which are subject to sampling errors and low viral load [1]. Consequently, deep learning (DL) strategies for COVID-19 classification from CXR and CT images have been thoroughly explored. In [2], an open-source convolutional neural network platform named COVID-Net is proposed and adapted for recognizing COVID-19 cases in CXR and CT images. COVID-Net achieves a sensitivity of 80% for COVID-19 cases [3].

Neural Network Design

In this paper, deep convolutional neural networks (DCNN) are evaluated for diagnosing COVID-19. However, it is challenging to gather a large set of clean data for building deep neural networks. For this reason, the primary objective of this paper is to implement a deep neural network (DNN) architecture that can learn from a small dataset and produce radiologically interpretable results [3]. A systematic investigation of CXR and CT images requires several imaging features, for example contrast, homogeneity, entropy, and correlation. The training dataset can be extended by extracting patches from each selected image, so that even with a small dataset the DNN can be trained successfully without overfitting. By adding new preprocessing steps that normalize heterogeneity and data bias, it is possible to demonstrate on the same dataset that the proposed DNN architecture offers better sensitivity and interpretability than the existing COVID-Net [3]. The resulting patches differ appreciably from one another, and their intensity distribution correlates well with the radiological intensity variations of COVID-19 in CXR and CT. This motivates a new patch-based DNN model with fuzzy area cutting, in which the final classification result is derived from the results obtained at several patch locations. The neural network design uses a fully convolutional DenseNet, FC-DenseNet103, for segmentation and a residual network, ResNet101, for classification. FC-DenseNet103 was implemented using the PyTorch library and ResNet101 using MATLAB 2021a. The major advantage of Densely Connected Convolutional Networks (DenseNet) is that they have fewer parameters than standard convolutional networks while still allowing the depth of the network to grow. Fig. 1 shows a DenseNet structure with dense blocks. In such a deep network, the path from the input layer to the output is long. The layers between two adjacent blocks are referred to as transition layers and change the feature-map sizes using convolution and pooling. The solution adopted in DenseNets is to connect each layer directly to all subsequent layers, as shown in Fig. 2, so that each layer takes all preceding feature maps as input. This makes the network easier to train: even as the depth of the network increases, the vanishing-gradient problem is mitigated.
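The dense connectivity described above can be written compactly. The notation below follows the usual DenseNet formulation and is an assumption here, since the text does not give it explicitly: $x_\ell$ is the output of layer $\ell$ inside a dense block, $H_\ell$ is the composite operation of that layer, $[\cdot]$ is channel-wise concatenation, $k_0$ is the number of channels entering the block, and $k$ is the growth rate (feature maps added by each layer).

```latex
x_\ell = H_\ell\big([x_0, x_1, \dots, x_{\ell-1}]\big),
\qquad
\text{channels entering layer } \ell = k_0 + k\,(\ell - 1)
```

Because every layer adds only $k$ new feature maps and reuses all earlier ones, the layers can stay narrow, which is one way to read the claim that DenseNet needs fewer parameters than a standard convolutional network.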
Fig. 1

DenseNet architecture with three blocks [6]

Fig. 2

DenseNet classic architecture [6]

The ResNet-101 architecture is presented in Fig. 3. It is a convolutional neural network (CNN) that has 101 layers. A pretrained version of the network can be trained on more than 12,520 biomedical images of viral pneumonia taken from the Kaggle database [4]. As a result, the network has learned rich features that characterize a wide range of images.
Fig. 3

A regular block and a residual block (up). ResNet-101 architecture (down) [9]

In addition, Haralick texture features such as contrast, correlation, and entropy are computed based on lung morphology; they highlight the kind of abnormality found inside the lungs.

Proposed Method—Network Architecture Using Transfer Learning Model

Transfer Learning Architecture

Pneumonia resembles COVID-19 in many ways (Fig. 4) [4]. The proposed transfer learning architecture is presented in Fig. 5. In transfer learning, the neural network is first trained on a large dataset to perform classification. The CXR and CT images (12,520 images) of a wide variety of lung diseases, including COVID-19 (220 images), are fed to the proposed architecture. The first step is preprocessing, which yields more accurate images. Histogram equalization is used to improve the quality of the CXR and CT images, and Wiener filters are applied to enhance contrast and suppress noise. Together, these two enhancement techniques improve the quality of the images. The knowledge acquired by a trained model on a given dataset (here, the viral pneumonia recognition architecture) is transferred to a different but similar task (the COVID-19 recognition architecture); this is the transfer learning method [5].
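The histogram equalization step mentioned above can be illustrated with a minimal sketch. This is not the author's preprocessing code: the function name `equalize_histogram`, the list-of-rows image format, and the 256 gray levels are assumptions made for illustration, and the Wiener filtering step is not shown.

```python
def equalize_histogram(image, levels=256):
    """Histogram-equalize a grayscale image given as rows of ints in [0, levels)."""
    # Count how many pixels fall on each gray level.
    hist = [0] * levels
    for row in image:
        for v in row:
            hist[v] += 1

    # Cumulative distribution of the gray levels.
    cdf = []
    running = 0
    for count in hist:
        running += count
        cdf.append(running)

    total = cdf[-1]
    cdf_min = next(c for c in cdf if c > 0)  # CDF value at the darkest level present
    if total == cdf_min:
        # Constant image: there is nothing to stretch.
        return [row[:] for row in image]

    # Map each level so that the output CDF becomes approximately linear.
    scale = (levels - 1) / (total - cdf_min)
    lut = [max(0, round((c - cdf_min) * scale)) for c in cdf]
    return [[lut[v] for v in row] for row in image]


# Hypothetical low-contrast patch whose values span only 100..103.
low_contrast = [
    [100, 100, 101, 101],
    [100, 102, 102, 103],
    [101, 102, 103, 103],
    [100, 101, 102, 103],
]
for row in equalize_histogram(low_contrast):
    print(row)
```

In this toy patch the four original gray levels are spread over the full 0 to 255 range, which is the contrast stretch that equalization is meant to provide before segmentation.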
Fig. 4

a Normal, b lung opacity, c viral pneumonia, d COVID-19

Fig. 5

Proposed architecture for Covid-19 classification

In the following, each network is described separately. The networks were trained sequentially: first the segmentation network, then the classification network. The mathematical background for extracting the Haralick texture features is also presented.

Segmentation Network

The main task of the segmentation network is to extract the lung contour from the CXR and CT lung images. A fully convolutional FC-DenseNet103 is implemented to perform semantic segmentation [4, 6]. The training target is:

$$\min_{\Theta} L(\Theta)$$

where $L(\Theta)$ is the cross-entropy loss of multi-categorical semantic segmentation and $\Theta$ represents the network parameter set, which is composed of filter kernel weights and biases. $L(\Theta)$ is defined as follows:

$$L(\Theta) = -\sum_{s} \lambda_s \sum_{j} \mathbf{1}(y_j = s)\, \log p_{\Theta}(x_j)$$

where $\mathbf{1}(\cdot)$ represents the indicator function, $p_{\Theta}(x_j)$ is the softmax probability of the $j$-th pixel in a CXR or CT image $x$, and $y_j$ represents the corresponding ground-truth label. $s$ represents the class category, and $\lambda_s$ represents the weight given to each class category. The FC-DenseNet103 was trained using the preprocessed data. The network parameters were initialized from a random distribution. The SGDM optimizer was used with an initial learning rate of 0.00001. The segmentation network achieved 88.99% accuracy, 83.42% precision, 85.97% recall, and an F1-score of 84.47%. The PyTorch library was used for the network implementation [4].
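The class-weighted cross-entropy described above can be sketched in a few lines. Each pixel contributes minus the logarithm of the softmax probability assigned to its ground-truth class, scaled by the weight of that class. The function name, the two-class example, and the weight values are hypothetical, and summing rather than averaging over pixels is an assumption. The paper's own implementation used PyTorch.

```python
import math


def weighted_cross_entropy(probs, labels, class_weights, eps=1e-12):
    """Class-weighted cross-entropy summed over pixels.

    probs:         probs[j][s] is the softmax probability of class s at pixel j
    labels:        labels[j] is the ground-truth class index y_j of pixel j
    class_weights: class_weights[s] is the weight lambda_s of class s
    """
    loss = 0.0
    for p_j, y_j in zip(probs, labels):
        # The indicator 1(y_j = s) keeps only the true-class term of each pixel.
        # eps guards against log(0) when the true class has zero probability.
        loss -= class_weights[y_j] * math.log(max(p_j[y_j], eps))
    return loss


# Hypothetical example: class 0 = background, class 1 = lung, lung weighted twice.
probs = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]]
labels = [0, 1, 1]
weights = [1.0, 2.0]
print(round(weighted_cross_entropy(probs, labels, weights), 4))
```

Raising the weight of a class makes errors on its pixels more expensive, which is a common way to compensate for classes that cover few pixels.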

Classification Network

To use the classification network, the first step was to apply the lung masks to the preprocessed images. The second step was to calculate the Haralick texture features of the segmented images in order to narrow down the candidate classes; this is explained in more detail in the next section. The third step was to extract patches of 224 × 224 pixels, the input size of ResNet-101, at random locations within the lung contour. The resulting patches were used as the network input. Because the patches already have the right size for ResNet-101, the only remaining operation was a color-processing step that converts the grayscale images of the Kaggle dataset into the RGB images that the classification network expects. The patch centers were selected randomly within the lung areas to avoid cropping patches from the empty area of the masked images. The prepared images form the input of the classification network. Because the pretrained classification network has an output for 1000 classes, the fully connected layer and the classification layer were replaced to output only four classes. The network weights were initialized with parameters pretrained on ImageNet; the network was then trained on CXR and CT images for viral pneumonia, and the last step was training it for COVID-19 classification. Training used the stochastic gradient descent with momentum (SGDM) optimizer with a learning rate of 0.01, ten epochs, and a batch size of 10. The classification network was implemented using MATLAB 2021a (Neural Network Design Application).
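The patch extraction step can be sketched as follows. This is a minimal illustration, not the MATLAB implementation used in the paper: the function names, the 0/1 list-of-rows mask format, and the rule of shifting a box inward when its center lies too close to the border are all assumptions.

```python
import random


def sample_patch_boxes(mask, patch_size, n_patches, seed=0):
    """Pick square patch boxes whose centers are drawn from lung pixels (mask value 1).

    Returns (top, left, bottom, right) boxes that lie fully inside the image.
    """
    height, width = len(mask), len(mask[0])
    if patch_size > height or patch_size > width:
        raise ValueError("patch is larger than the image")

    lung_pixels = [(r, c) for r in range(height) for c in range(width) if mask[r][c]]
    if not lung_pixels:
        raise ValueError("mask contains no lung pixels")

    rng = random.Random(seed)
    half = patch_size // 2
    boxes = []
    for _ in range(n_patches):
        r, c = rng.choice(lung_pixels)
        # Shift the box inward if the chosen center is too close to a border;
        # the chosen lung pixel still lies inside the shifted box.
        top = min(max(r - half, 0), height - patch_size)
        left = min(max(c - half, 0), width - patch_size)
        boxes.append((top, left, top + patch_size, left + patch_size))
    return boxes


def crop_as_rgb(image, box):
    """Crop a grayscale patch and repeat each gray value on three channels."""
    top, left, bottom, right = box
    return [[(v, v, v) for v in row[left:right]] for row in image[top:bottom]]


# Toy 8 x 8 example with a 4 x 4 "lung" region in the middle.
mask = [[1 if 2 <= r <= 5 and 2 <= c <= 5 else 0 for c in range(8)] for r in range(8)]
image = [[r * 8 + c for c in range(8)] for r in range(8)]
boxes = sample_patch_boxes(mask, patch_size=4, n_patches=3)
patch = crop_as_rgb(image, boxes[0])
print(len(boxes), len(patch), len(patch[0]), len(patch[0][0]))  # 3 4 4 3
```

For the setting described in the text, `patch_size` would be 224 so that each crop matches the ResNet-101 input size. Drawing centers only from masked pixels is what keeps patches away from the empty area around the lungs.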

Haralick Texture Features Extraction

The Gray Level Co-occurrence Matrix (GLCM) and its texture features were introduced in 1973 by Haralick et al. [7]. This technique is often used in biomedical image processing. The implemented feature extraction consists of two steps: first the GLCM is calculated, and then the texture features are calculated from the GLCM. The GLCM describes how often a pixel with a given gray level occurs at a fixed geometric position relative to a pixel with another gray level, for every pair of gray levels [7]. Contrast (C) measures the gray-level change between the reference pixel and its neighbor:

$$C = \sum_{i=1}^{N}\sum_{j=1}^{N} (i-j)^2\, p(i,j)$$

where $P$ is the normalized symmetric GLCM of dimension $N \times N$, $N$ represents the number of gray levels, and $p(i,j)$ is the $(i,j)$-th element of the normalized GLCM [7]. Local homogeneity (LC) measures how close the distribution of elements in the GLCM is to the GLCM diagonal; when the contrast decreases, the homogeneity increases. Homogeneity (H) of an image represents the similarity between pixels. Entropy (E) represents the degree of disorder in the image:

$$E = -\sum_{i=1}^{N}\sum_{j=1}^{N} p(i,j)\, \log p(i,j)$$

The entropy is maximal when all elements of the co-occurrence matrix are the same, and it is lower when the elements are unequal. Correlation (Corr) represents the linear dependency of gray-level values in the co-occurrence matrix:

$$\mathrm{Corr} = \sum_{i=1}^{N}\sum_{j=1}^{N} \frac{(i-\mu_x)(j-\mu_y)\, p(i,j)}{\sigma_x \sigma_y}$$

where the mean values $\mu_x$, $\mu_y$ and the standard deviations $\sigma_x$, $\sigma_y$ have the following expressions:

$$\mu_x = \sum_{i=1}^{N}\sum_{j=1}^{N} i\, p(i,j), \qquad \mu_y = \sum_{i=1}^{N}\sum_{j=1}^{N} j\, p(i,j)$$

$$\sigma_x = \sqrt{\sum_{i=1}^{N}\sum_{j=1}^{N} (i-\mu_x)^2\, p(i,j)}, \qquad \sigma_y = \sqrt{\sum_{i=1}^{N}\sum_{j=1}^{N} (j-\mu_y)^2\, p(i,j)}$$

The most important basic difference-statistic texture descriptors are Mean (M), Sum Entropy (SE), Variance (V), and Sum Variance (SV), in whose expressions $\mu$ denotes the mean value. The Haralick features of 100 images of normal, viral pneumonia, and COVID-19 lungs were calculated and analyzed using MATLAB 2021a and the Statistics and Machine Learning Toolbox. This analysis (Table 1) shows that the features V, M, and SV must lie within a certain domain for each class, while the other features vary less. The domain for an image to be detected as normal is given in Eqs. (13), (14) and (15).
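The two-step procedure described above (build the GLCM, then derive features from it) can be sketched in plain Python. The paper used MATLAB for this step, so the code below is only an illustration. The function names, the horizontal neighbor offset, the natural logarithm, and the commonly used textbook definitions of contrast, entropy, and correlation are assumptions.

```python
import math


def glcm(image, levels, dr=0, dc=1):
    """Normalized symmetric gray-level co-occurrence matrix.

    image:    rows of ints in [0, levels)
    (dr, dc): offset from the reference pixel to its neighbor
    """
    counts = [[0.0] * levels for _ in range(levels)]
    height, width = len(image), len(image[0])
    for r in range(height):
        for c in range(width):
            rr, cc = r + dr, c + dc
            if 0 <= rr < height and 0 <= cc < width:
                i, j = image[r][c], image[rr][cc]
                counts[i][j] += 1  # pair (reference, neighbor)
                counts[j][i] += 1  # mirrored pair keeps the matrix symmetric
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]


def texture_features(p):
    """Contrast, entropy and correlation of a normalized GLCM p."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    contrast = sum((i - j) ** 2 * v for i, j, v in cells)
    entropy = -sum(v * math.log(v) for _, _, v in cells if v > 0)
    mu_x = sum(i * v for i, _, v in cells)
    mu_y = sum(j * v for _, j, v in cells)
    sd_x = math.sqrt(sum((i - mu_x) ** 2 * v for i, _, v in cells))
    sd_y = math.sqrt(sum((j - mu_y) ** 2 * v for _, j, v in cells))
    # Correlation is undefined for a constant image (zero standard deviation).
    if sd_x * sd_y == 0:
        correlation = float("nan")
    else:
        correlation = sum((i - mu_x) * (j - mu_y) * v for i, j, v in cells) / (sd_x * sd_y)
    return {"contrast": contrast, "entropy": entropy, "correlation": correlation}


# Small 4-level test image with a horizontal neighbor offset.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
p = glcm(image, levels=4)
features = texture_features(p)
print(round(features["contrast"], 3))  # 14/24, i.e. 0.583
```

On real CXR or CT data the gray levels are usually quantized to a small number of bins first, and features are often averaged over several offsets; neither choice is specified in the text.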
Table 1

Statistical analysis of Haralick features for normal, COVID-19 and viral pneumonia for segmented lung images (100 images)

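| Haralick feature | Normal (mean) | Normal (min) | Normal (max) | COVID-19 (mean) | COVID-19 (min) | COVID-19 (max) | Viral pneumonia (mean) | Viral pneumonia (min) | Viral pneumonia (max) |
|---|---|---|---|---|---|---|---|---|---|
| Homogeneity-H | 0.195 | 0.17 | 0.22 | 0.23 | 0.13 | 0.33 | 0.23 | 0.17 | 0.29 |
| Contrast-C | 0.06 | 0.05 | 0.07 | 0.505 | 0.03 | 0.98 | 0.085 | 0.06 | 0.11 |
| Correlation-corr | 0.975 | 0.96 | 0.99 | 0.975 | 0.96 | 0.99 | 0.53 | 0.95 | 0.11 |
| Variance-V | 28.39 | 22.97 | 33.81 | 45.885 | 33.78 | 57.99 | 42.16 | 36.13 | 48.19 |
| Mean-M | 10.02 | 8.17 | 11.87 | 14.705 | 10.44 | 18.97 | 13.28 | 10.77 | 15.79 |
| Sum variance-SV | 116.21 | 99.98 | 132.45 | 172.16 | 132.78 | 211.54 | 177.15 | 147.78 | 206.52 |
| Sum entropy-SE | 1.895 | 1.64 | 2.15 | 2.23 | 1.49 | 2.97 | 1.825 | 1.67 | 1.98 |
| Entropy-E | 2.865 | 2.43 | 3.30 | 2.415 | 2.07 | 2.76 | 2.785 | 2.44 | 3.13 |
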
For lung images affected by viral pneumonia, the feature values must lie within the domain given in Eqs. (16), (17) and (18). COVID-19 images have feature values in the domain given in Eqs. (19), (20) and (21). The values calculated in Table 1 are similar to the feature values obtained in [5].
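The domain test on V, M, and SV can be sketched as a simple range check. Eqs. (13) to (21) are not reproduced in this text, so the sketch assumes that each domain is the minimum-to-maximum interval listed in Table 1; the names `FEATURE_RANGES` and `candidate_classes` are hypothetical.

```python
# (min, max) intervals per class for V, M and SV, copied from Table 1.
# Assumption: these intervals stand in for the domains of Eqs. (13)-(21).
FEATURE_RANGES = {
    "normal": {"V": (22.97, 33.81), "M": (8.17, 11.87), "SV": (99.98, 132.45)},
    "COVID-19": {"V": (33.78, 57.99), "M": (10.44, 18.97), "SV": (132.78, 211.54)},
    "viral pneumonia": {"V": (36.13, 48.19), "M": (10.77, 15.79), "SV": (147.78, 206.52)},
}


def candidate_classes(features, ranges=FEATURE_RANGES):
    """Return every class whose V, M and SV intervals all contain the given values."""
    matches = []
    for label, bounds in ranges.items():
        if all(lo <= features[name] <= hi for name, (lo, hi) in bounds.items()):
            matches.append(label)
    return matches


print(candidate_classes({"V": 28.0, "M": 10.0, "SV": 116.0}))  # ['normal']
print(candidate_classes({"V": 42.0, "M": 13.0, "SV": 175.0}))  # ['COVID-19', 'viral pneumonia']
print(candidate_classes({"V": 55.0, "M": 18.0, "SV": 210.0}))  # ['COVID-19']
```

Under this assumption the COVID-19 and viral pneumonia intervals overlap, so the check can only narrow the set of candidate classes; the final decision still comes from the classification network.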

Sorting System for COVID-19

The most common cause of pneumonia is usually a viral infection. In an extensive epidemic caused by a contagious disease, the allocation of clinical resources is of the utmost importance. COVID-19 is spreading fast and is exceeding the capacity of the healthcare system in many countries. It is therefore necessary to allocate the limited resources appropriately, based on the sorting system in Fig. 6, which determines the needs and urgency of each patient [4].
Fig. 6

Sorting system for COVID-19 patients [4]

Results

Datasets

The Kaggle dataset was divided into four classes (normal, lung opacity, viral pneumonia, and COVID-19), as specified in Table 2. The images were resized to 224 × 224 pixels for the pretrained CNN. Datastores for the four classes were assembled from the image data folders and subfolders, then properly labeled, resized, color processed, and augmented. Augmentation was performed by random patching. The dataset size is acceptable for this study considering the computer used: a Dell G7 7790 gaming laptop (Intel Core i7 @ 4.1 GHz, 17.3″ Full HD, 16 GB, 1 TB + SSD 256 GB, NVIDIA GeForce RTX 2020 6 GB, Windows 10 Home). The training time is acceptable and depends on how the random patching for data augmentation is done: it varies from approximately 55 to 111 min on the GPU. If training is performed on a single CPU instead of the GPU, it takes about ten times longer.
Table 2

Dataset images (1024 × 1024 pixels jpg format) from database Kaggle for CXR (80%) and CT (20%)

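| Kaggle dataset | Normal | Pneumonia | Lung opacity | COVID-19 | Total |
|---|---|---|---|---|---|
| Training | 3900 | 3900 | 3900 | 200 | 11,900 |
| Validation | 100 | 100 | 100 | 10 | 310 |
| Test | 100 | 100 | 100 | 10 | 310 |
| Total | 4100 | 4100 | 4100 | 220 | 12,520 |
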
Another option for training the networks is deep neural network training in the Azure cloud. In that case the processing cost is high, but the training time is extremely low.

Performance Parameters

After the model is trained, the final classification is evaluated by computing the confusion matrix. The parameters that measure the classification performance of the ResNet-101 network are described in this section: accuracy, precision, recall, and F1-score [4]. Table 3 presents the confusion matrices of the traditional classification and of the transfer learning model for COVID-19 data when tested for viral pneumonia. Table 4 shows the precision, recall, and F1-score for the normal (Class1), lung opacity (Class2), viral pneumonia (Class3), and COVID-19 (Class4) classes. The precision and recall values are encouraging. The F1-score is calculated from precision and recall as in Eq. (25). The trained model classifies most images correctly. In general, transfer learning provides better results than classical classification.
Table 3

Confusion matrix for Resnet-101 network for classification and transfer learning model

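Classification

| Category | Class1 | Class2 | Class3 | Class4 | Total |
|---|---|---|---|---|---|
| Class1 | 91 | 9 | 0 | 0 | 100 |
| Class2 | 10 | 90 | 0 | 0 | 100 |
| Class3 | 0 | 2 | 91 | 7 | 100 |
| Class4 | 0 | 0 | 8 | 92 | 100 |
| Total | 101 | 101 | 99 | 99 | 400 |

Transfer learning

| Category | Class1 | Class2 | Class3 | Class4 | Total |
|---|---|---|---|---|---|
| Class1 | 92 | 8 | 0 | 0 | 100 |
| Class2 | 8 | 92 | 0 | 0 | 100 |
| Class3 | 0 | 1 | 93 | 6 | 100 |
| Class4 | 0 | 0 | 4 | 96 | 100 |
| Total | 100 | 101 | 97 | 102 | 400 |
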
Table 4

Performance parameters for Resnet-101 network for classification and transfer learning model

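Classification

| Category | Precision | Recall | F1-score |
|---|---|---|---|
| Class1 | 0.91 | 0.90 | 0.90 |
| Class2 | 0.90 | 0.89 | 0.89 |
| Class3 | 0.91 | 0.92 | 0.91 |
| Class4 | 0.92 | 0.92 | 0.92 |
| Average | 0.91 | 0.90 | 0.90 |

Transfer learning

| Category | Precision | Recall | F1-score |
|---|---|---|---|
| Class1 | 0.92 | 0.92 | 0.92 |
| Class2 | 0.92 | 0.91 | 0.92 |
| Class3 | 0.93 | 0.95 | 0.93 |
| Class4 | 0.96 | 0.94 | 0.95 |
| Average | 0.93 | 0.93 | 0.93 |
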
Accuracy is the most intuitive performance measure: it is the fraction of all predictions that the model gets right [4]. Precision evaluates how reliable the model is when it predicts a positive label [4]. Recall is the percentage of actual positives that the model correctly identifies [4]. The F1-score (F-measure) is the harmonic mean of precision and recall and therefore takes both false positives and false negatives into account. True positives (TP) are the predictions where the classifier correctly predicts the positive class as positive; for example, TP is the number of COVID-19 images that are correctly classified as COVID-19. True negatives (TN) are the predictions where the classifier correctly predicts the negative class as negative. False positives (FP) are the predictions where the classifier incorrectly predicts the negative class as positive. False negatives (FN) are the predictions where the classifier incorrectly predicts the positive class as negative. In terms of these counts:

$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}, \qquad \mathrm{Precision} = \frac{TP}{TP + FP}, \qquad \mathrm{Recall} = \frac{TP}{TP + FN}$$

$$F_1 = \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$$
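The per-class metrics can be computed directly from a confusion matrix, as in the sketch below. The function names are hypothetical. The orientation of the matrix is also an assumption: the text does not say whether the rows of Table 3 are predictions or ground truth, and rows are treated as predictions here because that reading gives per-class values close to those in Table 4.

```python
def per_class_metrics(matrix):
    """Precision, recall and F1 for each class of a square confusion matrix.

    Assumption: matrix[i][j] counts images predicted as class i whose true
    class is j (rows = predictions, columns = ground truth).
    """
    n = len(matrix)
    metrics = []
    for k in range(n):
        tp = matrix[k][k]
        fp = sum(matrix[k]) - tp                       # predicted k, actually another class
        fn = sum(matrix[i][k] for i in range(n)) - tp  # actually k, predicted another class
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        metrics.append({"precision": precision, "recall": recall, "f1": f1})
    return metrics


def accuracy(matrix):
    """Fraction of all images that lie on the diagonal of the matrix."""
    correct = sum(matrix[k][k] for k in range(len(matrix)))
    return correct / sum(map(sum, matrix))


# Transfer learning counts from Table 3 (Class1 to Class4).
transfer = [
    [92, 8, 0, 0],
    [8, 92, 0, 0],
    [0, 1, 93, 6],
    [0, 0, 4, 96],
]
for k, m in enumerate(per_class_metrics(transfer), start=1):
    print(f"Class{k}: precision={m['precision']:.2f} recall={m['recall']:.2f} f1={m['f1']:.2f}")
```

In the multi-class case each class is treated in turn as the positive class and all other classes as negative, which is how TP, FP, and FN are obtained for every row of Table 4.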

Comparison for CORONAVIRUS 2019 Classification Accuracy

Fig. 7 shows the loss and accuracy graphs. The accuracy increases and the loss decreases during training and testing.
Fig. 7

Accuracy and loss graphs for COVID-19 classification using a classification network Resnet-101

Table 5 compares the testing accuracy of different COVID-19 studies. According to this performance analysis, the proposed transfer learning model slightly improves on other existing models. Proper labeling, mask segmentation, Haralick feature calculation, and the choice of suitable patches were extremely important for obtaining higher network performance.
Table 5

Comparison of CORONAVIRUS 2019 classification accuracy

| Author and source | Bio-images | Accuracy (%) | Classification |
|---|---|---|---|
| Hemdan [2] | X-ray | 89 | COVID-Net |
| Wang-Zha [3] | CT | 88 | Pretrained neural networks |
| Oh [4] | X-ray | 88.9 | Pretrained neural networks |
| Zhao-Zang [10] | CT | 83 | Pretrained neural networks |
| El Asnaoui [11] | X-ray | 84 | Pretrained neural networks |
| Apostolopoulos [12] | X-ray | 88.8 | CNN |
| Perumal [5] | CT and X-ray | 93 | Transfer learning |
| Khan [13] | X-ray | 89.5 | CNN |
| Shi [14] | CT | 87.9 | Random Forest method |
| Wang-Kang [8] | CT | 82.9 | Transfer learning |
| Proposed architecture | CT and X-ray | 94.9 | Transfer learning |

Using an adequate split of the data for training, validation, and testing is very important for obtaining better parameters for the implemented network and, ultimately, a better COVID-19 classification. The Haralick features made it easier to distinguish between normal lungs, viral pneumonia, and COVID-19. The patch size of 224 × 224 [4] was also very important for improving the performance of the classification algorithm. The proposed model achieves 93% precision, 93% recall, and 94.9% accuracy using ResNet-101 with four classes.

Conclusions

In this paper, the obtained results have been presented and visualized comprehensively, taking into account the demanding problems that remain in this domain. The COVID-19 recognition architecture benefited from incorporating data from multiple sources [8]. The viral infection was detected by examining characteristic features found in the studied images. If the viral infection is detected earlier, lives can be saved. The obtained outcomes are quite reliable, as the proposed architecture can detect healthy lungs, viral pneumonia, and COVID-19, as represented in Fig. 4. The new architecture implemented in this study is believed to be a small step toward a refined COVID-19 diagnosis architecture using CXR and CT images. In future research, supplementary details and information will be incorporated to obtain improved conclusions that further strengthen the proposed transfer learning architecture.

References

1.  Large-scale screening of COVID-19 from community acquired pneumonia using infection size-aware classification.

Authors:  Feng Shi; Liming Xia; Fei Shan; Bin Song; Dijia Wu; Ying Wei; Huan Yuan; Huiting Jiang; Yichu He; Yaozong Gao; He Sui; Dinggang Shen
Journal:  Phys Med Biol       Date:  2021-02-19       Impact factor: 3.609

2.  Deep Learning COVID-19 Features on CXR Using Limited Training Data Sets.

Authors:  Yujin Oh; Sangjoon Park; Jong Chul Ye
Journal:  IEEE Trans Med Imaging       Date:  2020-05-08       Impact factor: 10.048

3.  A fully automatic deep learning system for COVID-19 diagnostic and prognostic analysis.

Authors:  Shuo Wang; Yunfei Zha; Weimin Li; Qingxia Wu; Xiaohu Li; Meng Niu; Meiyun Wang; Xiaoming Qiu; Hongjun Li; He Yu; Wei Gong; Yan Bai; Li Li; Yongbei Zhu; Liusu Wang; Jie Tian
Journal:  Eur Respir J       Date:  2020-08-06       Impact factor: 16.671

Review 4.  Detection of COVID-19 using CXR and CT images using Transfer Learning and Haralick features.

Authors:  Varalakshmi Perumal; Vasumathi Narayanan; Sakthi Jaya Sundar Rajasekar
Journal:  Appl Intell (Dordr)       Date:  2020-08-12       Impact factor: 5.086

5.  Using X-ray images and deep learning for automated detection of coronavirus disease.

Authors:  Khalid El Asnaoui; Youness Chawki
Journal:  J Biomol Struct Dyn       Date:  2020-05-22

6.  CoroNet: A deep neural network for detection and diagnosis of COVID-19 from chest x-ray images.

Authors:  Asif Iqbal Khan; Junaid Latief Shah; Mohammad Mudasir Bhat
Journal:  Comput Methods Programs Biomed       Date:  2020-06-05       Impact factor: 5.428

7.  Covid-19: automatic detection from X-ray images utilizing transfer learning with convolutional neural networks.

Authors:  Ioannis D Apostolopoulos; Tzani A Mpesiana
Journal:  Phys Eng Sci Med       Date:  2020-04-03

8.  A deep learning algorithm using CT images to screen for Corona virus disease (COVID-19).

Authors:  Shuai Wang; Bo Kang; Jinlu Ma; Xianjun Zeng; Mingming Xiao; Jia Guo; Mengjiao Cai; Jingyi Yang; Yaodong Li; Xiangfei Meng; Bo Xu
Journal:  Eur Radiol       Date:  2021-02-24       Impact factor: 5.315
