E-GCS: Detection of COVID-19 through classification by attention bottleneck residual network.

Abstract

Background: The coronavirus disease 2019 (COVID-19) has recently caused many deaths globally. There is thus a need to detect this disease in order to prevent its further spread. Hence, the study aims to predict COVID-19 infected patients based on deep learning (DL) and image processing.
Objectives: The study intends to classify normal and abnormal cases of COVID-19 by considering three medical imaging modalities, namely ultrasound imaging, X-ray images and CT scan images, through the introduced attention bottleneck residual network (AB-ResNet). It also aims to segment the infected area in abnormal images, and thereby localise the disease, through the proposed edge based graph cut segmentation (E-GCS).
Methodology: AB-ResNet is used for classifying images, whereas E-GCS segments the abnormal images. The approach has various advantages: it relies on DL and is capable of accelerating the training of deep networks. It also enhances the network depth with a minimum of parameters, minimises the impact of the vanishing gradient issue and attains effective network performance with respect to accuracy.
Results/Conclusion: Performance and comparative analyses are undertaken to evaluate the efficiency of the introduced system, and the results show that the proposed system detects COVID-19 with high accuracy (99%).

Entities: Chemical

Keywords: Attention bottleneck residual network; Coronavirus disease 2019; Edge based graph cut segmentation; Ultrasound and CT images; X-ray image

Year: 2022 PMID: 36158870 PMCID: PMC9485443 DOI: 10.1016/j.engappai.2022.105398

Source DB: PubMed Journal: Eng Appl Artif Intell ISSN: 0952-1976 Impact factor: 7.802

Introduction

Recently, the new outbreak of coronavirus disease (COVID-19) has confronted medical, economic and public health systems worldwide, with the exception of a few countries. Apart from restricting this outbreak, attempts have to be made to devise clear measures for preventing further outbreaks of this disease (Singhal, 2020). A comprehensive analysis of clinical features, COVID-19 treatments etc. has been carried out in the paper (He et al., 2020). The analysis found that pathological discoveries related to SARS-CoV-2 have been limited. Autopsy has also been warranted and is useful for future study that will enhance the accuracy of initial diagnostic tests (Li et al., 2020). The study (Kumar et al., 2020) also offers opinions on the technological advancements utilised for minimising the substantial impact of this disease. Though various research associated with modern technology has been emerging towards the detection of COVID-19, technological contributions to this prediction remain limited. That study explored artificial intelligence (AI) applications and the way in which modern technology might help in detecting COVID-19. AI can be categorised into computer vision applications, natural language processing and machine learning (ML). A study has been carried out to analyse image processing and ML for accurate COVID-19 detection from two widely utilised medical imaging modalities, namely CT images and chest X-ray (Saygılı, 2021). The introduced system has been applied to three COVID-19 datasets and comprises five fundamental steps: dataset acquisition, pre-processing, feature extraction, dimensionality reduction and classification. The empirical outcomes showed that the introduced model performed better and faster than other models. Though the results were better, accuracy needed further improvement.
For this purpose, the paper (Turkoglu, 2021) proposed MKs-ELM-DNN (Multiple Kernels-Extreme Learning Machine-Deep Neural Network) to classify COVID-19 affected areas from CT images; it relies on deep learning (DL), which enables the system to achieve higher accuracy. The proposed model comprises three major stages: data augmentation, feature extraction and classification. This study showed that the introduced system could contribute efficiently to the detection of this disease. The study aimed to develop a mobile based web platform that relies on the proposed approach. Various studies analysed the existing methods for predicting COVID-19 that rely on ML assessment and data mining (Albahri et al., 2020). Three ML methods have been applied to a dataset named MERS-CoV to determine the correct classification model for multiclass and binary class labels. Outcomes showed the efficacy of K-nearest neighbour (K-NN) for two-class problems. On the other hand, decision tree (DT) and Naïve Bayes (NB) have been found to be the effective models for multiclass problems. Various existing studies attempted to use different strategies for detecting COVID-19. Accordingly, the paper (Ucar and Korkmaz, 2020) explored an AI based structure that performs effectively in comparison to traditional methods. SqueezeNet, with its light weight network design, has been tuned for diagnosing COVID-19 with BOA (Bayesian optimization additive). The augmented dataset and fine-tuned hyper-parameters made the introduced network perform more efficiently than traditional network designs, thereby attaining high accuracy in diagnosing COVID-19. In future, the study aimed to make the proposed model suitable for mobile use by healthcare specialists diagnosing COVID-19. In addition, convolutional neural network (CNN) models have been trained to efficiently detect COVID-19 through the systematic construction of the large scale COVID-CS (COVID classification and segmentation) dataset (Wu et al., 2021).
A JCS (Joint Classification and Segmentation) system has also been developed for this diagnosis. The proposed model determined whether suspected patients have been affected with COVID-19 and also provided convincing visual explanations. However, it needs further enhancement. Among several solutions to combat the COVID-19 pandemic, DL has been regarded as an efficient methodology for providing intelligent solutions (Bhattacharya et al., 2021). Inspired by several DL applications in the area of medical image processing (as shown in Fig. 1 and Fig. 2), it has been applied to deliver several solutions for diagnosing COVID-19.

Fig. 1

Multiple applications of DL for COVID-19 prediction (Shorten et al., 2021).

Fig. 2

Real-time application of DL and image processing for detecting COVID-19 (Bhattacharya et al., 2021).

From Fig. 1, it is clear that DL has numerous applications in the areas of life sciences, NLP (Natural Language Processing), epidemiology and computer vision for predicting COVID-19. Further, Fig. 2 shows real-time DL applications for detecting COVID-19, which include identification of positive COVID-19 cases, disease tracking, diagnosis according to the patient's health status etc. Despite promising outcomes, various challenges associated with traditional research remain, such as changes in the outbreak pattern and differentiating COVID from non-COVID patients with good accuracy. Thus, the present study intends to improve the accuracy of classifying normal and abnormal COVID-19 patients and also to segment the abnormal area based on deep learning. Three significant imaging modalities, namely ultrasound imaging, CT scans and X-ray, are taken into account, which is an additional improvement of this system. The major contributions of the study are given below.
To classify the abnormal and normal COVID classes by considering X-ray, ultrasound imaging and CT scan images using the proposed attention bottleneck residual network (AB-ResNet).
To segment the infected area from the abnormal images using the introduced edge based graph cut segmentation (E-GCS).
To analyse the performance of the proposed system with respect to significant metrics for evaluating its efficiency in detecting COVID-19 from different image modalities.

Paper organisation

The paper is organised as follows. Section 1 explores the fundamental ideas of detecting COVID-19. Section 2 reviews traditional systems. Section 3 describes the proposed DL based system for classifying and predicting COVID and non-COVID patients. The results obtained through the implementation of the proposed system are discussed in Section 4, and the overall system is summarised in Section 5.

Review of existing work

Since December 2019, the COVID-19 (Coronavirus Disease 2019) pandemic has spread globally and has become a worldwide disorder within a short span of time. The infection is caused by SARS-CoV-2, which has been spreading rapidly. Predicting the spread of this disease is necessary to prevent it, and data mining techniques and imaging methodologies like chest X-rays, Computed Tomography (CT) scans or chest ultrasound images have been employed for this purpose. Various traditional systems used several methods to detect COVID-19 by different means. The different perspectives employed by these systems are comprehensively discussed in this section.

Detection of COVID-19 using Chest X-ray (CX-R) images

Among medical imaging data, CX-Rs are widely utilised to observe signs of COVID-19. One study explored the merits of iteratively pruned ensembles of deep learning (DL) models for differentiating CX-Rs showing COVID-19 pneumonia associated opacities from normal and bacterial pneumonia cases, using publicly accessible CX-R collections (Aggarwal et al., 2022). The performance of the introduced model has been evaluated and weights have been assigned based on its predictions (Rajaraman et al., 2020). The empirical outcomes showed the outstanding performance of the proposed system in comparison to traditional methods. Experimental assessments revealed that the weighted average of the pruned models significantly enhanced the performance. Further research could explore visualising and interpreting the learned behaviour of the pruned models as well as their application to other screening situations, namely detection of COVID-19, 3D-CT scan localization etc. The proposed methodology could be quickly adapted to detect COVID-19 in digitized chest radiographs (Bharati et al., 2021b). COVID-19 also exhibits a few radiological signatures that can be detected by CX-R (Das et al., 2020a). Radiologists have been needed to analyse these signatures. Nevertheless, this has been an error-prone and time consuming task. Thus, there is a need to automate the examination of CX-R. Automatic CX-R analysis could be accomplished through DL based methods that might shorten the analysis time. These methods train network weights on huge datasets and then fine-tune the pre-trained weights on small datasets. Their application to CX-R has been limited. Thus, this study mainly aimed to propose an automated deep transfer learning (DTL) based method to detect COVID-19 infections in CX-Rs through the extreme inception (Xception) model.
The proposed model has also been comparatively analysed and the outcomes revealed the efficacy of the introduced system. However, the initial parameters of this system have to be tuned through several methods to enhance the visibility of CX-R images (Panwar et al., 2020). In accordance with this, a TL method (Naronglerdrit et al., 2021) has been employed in the study (Zebin and Rezvy, 2021). Various renowned pre-trained deep convolutional neural network (CNN) models have been assessed for detecting COVID-19 from CX-R images. The data comprised viral pneumonia, bacterial pneumonia and normal cases, excluding the novel COVID-19. The proposed models have been tested on two different datasets to find the most efficient model for detecting COVID-19. ResNet, DenseNet and MobileNet have been found to be the effective models, with better accuracy. In addition, CT scan images have been considered along with CX-R for detecting positive cases of COVID-19 using AI tools (Mukherjee et al., 2021a, Alyasseri et al., 2022). A CNN and a DNN (deep neural network) have been employed that could be trained and tested on both kinds of image (Majeed et al., 2020). The accuracy has been found to be 96.28%. Though efficient results have been attained, multi-modal data have to be taken into account. To address this, the paper (Mukherjee et al., 2021b) introduced a light weight CNN that could automatically detect positive cases of COVID-19 from CX-R with no false negatives (FN). It performed better, but the system has to be enhanced by reducing false positive (FP) cases (Pham, 2021, Das et al., 2020b). Further, a deep CNN architecture termed CovXNet has been suggested, which uses depth-wise convolution with different dilation rates for effective extraction of varied features from CX-R images.
Numerous experiments have been undertaken on two different datasets, giving 97.4% accuracy for normal versus COVID, 96.9% for viral pneumonia versus COVID, 90.2% for multi-class pneumonia and 94.7% for bacterial pneumonia versus COVID. Thus, the recommended approaches could serve as a better tool in the present COVID-19 pandemic (Podder et al., 2021). However, the study needs to be enhanced in terms of accuracy, which is a drawback (Mahmud et al., 2020). In addition, CX-R images of patients with and without COVID-19, as confirmed by RT-PCR (Reverse Transcriptase-Polymerase Chain Reaction), have been retrospectively gathered, and a regulatory-approved CAD system that can find several abnormalities such as pneumonia has been used for analysing the CX-R. Performance has been assessed through the AUC (area under the receiver operating characteristic curve). Outcomes revealed that pneumonia observed on chest CT within 24 h could be detected from CX-R images. Nevertheless, different medical image modalities have to be considered in future (Hwang et al., 2021). To enhance the system, CO-ResNet (COVID-19 Optimised Residual Network) has been suggested, which is developed by applying hyper-parameter tuning to the traditional ResNet-101. Empirical analysis has revealed a detection accuracy of 98.74% for normal lungs and 92.08% for pneumonia. Though a better prediction rate has been attained, accuracy has to be enhanced further (Bharati et al., 2021a). Additionally, a deep CNN based method has been recommended to detect patients with COVID-19 from CX-R images. The accuracy has been found to be 91.62%, which is better than conventional models. A GUI based application has been developed for public usage. Nevertheless, the study must be enhanced with respect to detection rate (Das et al., 2021).

Detection of COVID-19 using CT scan

Imaging features of the human chest can be obtained through medical imaging modalities like X-rays and CT scans. CT scans have many merits over X-rays: they provide a three dimensional view of organ formation and allow flexible analysis of the disease as well as its location (Ahuja et al., 2021). One study introduced a three-phase technique for classifying lung CT scan slices into non-COVID and COVID classes. Better accuracy has been attained (Rahimzadeh et al., 2021). Yet, this system has to be analysed on a large dataset of positive COVID-19 cases. To eliminate these kinds of drawbacks, various existing studies have attempted to use different methods for detecting COVID-19. Accordingly, the article (Sharma, 2020) explored machine learning (ML) methods to gain significant insights, such as whether lung CT scan should be the initial screening test or an alternative to RT-PCR (real time reverse transcriptase-polymerase chain reaction). The article also considered whether COVID-19 differs from other kinds of viral pneumonia and, if so, how to differentiate it through CT scan images, using data from hospitals in India, Moscow, China and Italy. The proposed system has been trained and tested through software named CV-MA (custom vision software of Microsoft Azure) that relies on ML methods. Empirical outcomes showed an accuracy of 91% for classifying COVID-19. Training and testing on high quality image datasets might further enhance the model accuracy (Bao et al., 2020). To improve accuracy, DL is an effective methodology that could be utilised in the medical area (Shah et al., 2021). It is an efficient and fast method for diagnosing and predicting several kinds of diseases with better accuracy. Taking this into account, the paper (Silva et al., 2020) explored the ability of DL models to detect COVID-19 on CT images.
It has also been highlighted that more datasets are required to assess the techniques in a realistic way. A future direction of the study might be to construct large CT scan image datasets from different Brazilian centres so as to encompass many sensors, acquisition processes and ethnic groups, and hence properly assess the proposed technique (Serte and Demirel, 2021). Similarly, the study (Alshazly et al., 2021) introduced various DL based methods to automatically detect COVID-19 from chest CT images. Highly advanced deep network architectures and their variants have been taken into account, and many experiments have been undertaken on the two datasets with the largest counts of CT images accessible so far, namely COVID19-CT and SARS-CoV-2 CT-scan. Outcomes showed high performance for the proposed model in comparison to traditional research (Aljondi and Alghamdi, 2020). A few traditional systems have considered different kinds of deep CNN models. In this regard, the paper (Loey et al., 2020) analysed five CNN models, namely VGGNet16, AlexNet, ResNet50, VGGNet19 and GoogleNet, to find COVID-19 infected patients from CT images. Traditional data augmentation in addition to CGAN enhanced the classification performance of every chosen deep transfer learning model. Results showed that ResNet50 outperformed the other models with 82.91% accuracy (Irfan et al., 2021). The main demerit of this study was that it did not attempt to use further DL models to attain higher performance (Liu et al., 2020). In addition, two forms of optimised NASNet (Neural Architecture Search Network) have been suggested to diagnose COVID-19. Outcomes showed an accuracy of 82.42% for the suggested NASNet-mobile and 81.06% for NASNet-large (Bharati et al., 2020b).
Further, an optimised Inception ResNetV2 has been suggested to detect COVID-19. The model has been applied to 2481 CT images gathered from two distinct datasets. Lastly, it has been applied to 1662 X-ray images, leading to 99.4% accuracy. Though effective results have been attained, it has to be assessed on large COVID-19 datasets (Mondal et al., 2021).

Detection of covid-19 using ultrasound images

CT scanning has been found to be time consuming and also exposes patients to transport associated risks as well as ionising radiation (Sorlini et al., 2021). CX-R plays a major part in monitoring the clinical course of patients rather than in the prediction stage of the pandemic, where a highly sensitive test is vital (Zhao and Lediju Bell, 2022). Therefore, in recent years, lung POCUS (point-of-care ultrasound) has been performed and interpreted (Xue et al., 2021) by emergency doctors, and it has widespread utility in assessing patients with pneumogenic and cardiogenic dyspnoea (Dacrema et al., 2021). One study introduced a line artefact quantification technique based on non-convex regularisation for lung ultrasound applications. Empirical outcomes revealed accurate line detection in one hundred lung ultrasound images exhibiting various B-line structures observed in nine patients with COVID-19 (Karakuş et al., 2020). It is also significant to note that the introduced technique is entirely unsupervised. On the presumption that annotated data are not freely accessible, an unsupervised technique like the proposed system has merits over supervised techniques. Accordingly, another study employed a CNN model that has few parameters to learn but could achieve robust accuracy (Muhammad and Hossain, 2021). The model has five major convolution connector layers. Multi-layer fusion of the individual blocks has been introduced to enhance the efficacy of the COVID-19 screening technique. Experiments have been undertaken on freely available lung ultrasound video and image datasets. The introduced fusion technique achieved an accuracy of 91.8% on this data collection.
Yet, the proposed system has to be enhanced by taking into account various input modalities available from the same person infected with COVID-19 (Xing et al., 2020). As different studies tried to detect this disease in varied ways, the article (Mateos González et al., 2021) aimed to evaluate the correlation of CX-R and lung ultrasound in detecting pulmonary infiltrates in COVID-19 infected patients. Lung ultrasound findings correlated well with CX-R in confirmed or suspected COVID-19 patients. Lung ultrasound has been able to find pulmonary infiltrates in nearly half of the patients with a normal CX-R. Hence, a lung ultrasound analysis might be undertaken as the preliminary diagnostic imaging test in patients infected with COVID-19. Future research is needed to assess the use of a consistent lung ultrasound protocol in health services for suspected cases of COVID-19. However, a few studies intended to check whether lung ultrasound is a worthwhile technique for detecting severe respiratory syndrome (Fonsi et al., 2021). The analysis found that lung ultrasound has good reliability in comparison to chest CT scan images, which indicates that lung ultrasound might be utilised for evaluating patients suspected of having COVID-19. This study also had various limitations, like the small sample size and the fact that lung ultrasound and CT scan examinations were undertaken only at admission (Dastider et al., 2021). Severity prediction is also significant. In accordance with this, the study (Narinx et al., 2020) introduced an architecture for frame based four score disease severity prediction that incorporates an RNN (recurrent neural network) and a CNN to take into account both temporal and spatial features of lung ultrasound frames. The comprehensive study shows a promising enhancement in classification performance when an LSTM (Long Short Term Memory) is placed after the introduced CNN architecture.
A clear analysis found that the introduced end to end method is efficient in finding COVID-19 severity scores from lung ultrasound images. The study stated that future work will be to design a DL based segmentation framework. The enhancement in classification obtained by introducing the LSTM after the CNN model has been reported as an average of 7% to 12%. However, accurate outcomes have not been explored (Dastider et al., 2021). An analysis of AI for detecting COVID-19 has also been performed using the PRISMA (Preferred Reporting Items of Systematic reviews and Meta-Analysis) guidelines. Various diagnosis (Glowacz, 2021b) stages have been reviewed, which include the pre-processing phase, segmentation phase and feature extraction. Outcomes reveal that ResNet-18 and DenseNet-169 show better classification performance for X-ray images, whereas DenseNet-201 has high accuracy in detecting COVID-19 from CT images (Taylor et al., 2020). Outcomes revealed that DL and ML have been valuable tools to support medical professionals and researchers in detecting COVID-19. However, such methods (Glowacz, 2021a) have to be used with different medical image modalities, as this study considered only X-ray and CT scan images (Rubaiyat Hossain Mondal et al., 2021). From the analysis of conventional works, it is found that most of the methods worked well in COVID-19 detection and achieved good accuracy. However, they have failed to consider all three possible medical image modalities, namely CT scan, CX-R and ultrasound. In addition, a few studies were also lacking with respect to accuracy.
Hence, there is a need to focus on choosing effective methods that obtain high accuracy while considering all three probable image modalities in COVID-19 prediction.

Problem identification

Several problems are identified by analysing various existing systems for COVID-19 detection. They are listed here.
The traditional study (Rajaraman et al., 2020) utilised DL models for differentiating CX-Rs and reported effective results. Yet, this study has not considered 3D CT scans. Similarly, the paper (Mukherjee et al., 2021a) employed a CNN-DNN that could be collectively trained and tested on CX-R and CT scans. However, it has not considered ultrasound images. In addition, the article (Muhammad and Hossain, 2021) proposed to use a multi-source system in future by considering different kinds of input modalities to detect COVID-19 infected patients. Hence, taking all these into account, the present study aims to detect normal and abnormal COVID-19 patients by considering CT scan images, CX-R images and ultrasound imaging. A few studies have considered only one or two of these three kinds of image, but the present study considers all three for effective prediction of COVID-19 infected persons.
The existing paper (Loey et al., 2020) shows that ResNet50 has been the effective DL model for COVID-19 detection from chest CT images. However, its main demerit is that it did not attempt to use further DL models to achieve high performance. In addition, the study (Saygılı, 2021) aimed to develop a comparison model by incorporating DL techniques in future. The present study considers all these works and intends to detect abnormal and normal COVID-19 cases based on DL techniques.
The article (Dastider et al., 2021) intended to enhance the proposed system by designing a DL based segmentation framework for segmenting the pathological artifacts. In addition, the paper (Das et al., 2020b) introduced a truncated inception net DL model for detecting COVID-19 positive patients.
That system has been restricted in its ability to localise the disease in CX-R. This goal has to be accomplished by utilising a larger amount of data or a DL model that has been pre-trained on a large number of CX-Rs. Hence, the present study aims to use DL based methods to predict the abnormal and normal cases of COVID-19 and, if a case is found to be abnormal, to localise the infected area, taking into account CT scan images, CX-R images and ultrasound imaging. This is an added advantage of the proposed system.

Proposed methodology

The study proposes the attention bottleneck residual network (AB-ResNet) and edge based graph cut segmentation (E-GCS), both of which rely on DL. The approach has various advantages: it can accelerate the training of deep networks, it enhances the network depth with a minimum of parameters, it minimises the impact of the vanishing gradient issue and it attains efficient network performance in terms of accuracy, particularly in image classification. Due to these advantages, the present study intends to classify abnormal and normal COVID cases based on DL and image processing. Various stepwise procedures are involved in achieving this detection. The overall view of this process is outlined in Fig. 3.

Fig. 3

Overall view of the proposed method for COVID-19 diagnosis.

Initially, the ultrasound images, CT scans and X-ray images are taken into account. Then, the attention bottleneck residual network (AB-ResNet) is used to classify the images as abnormal or normal by examining an image from any one of these modalities. Here, the ResNet is built by stacking numerous attention modules. Features from different layers have to be modelled through different attention masks, since relying on a single mask branch might require an exponential number of channels to capture the different combinations of factors. Subsequently, edge based graph cut segmentation (E-GCS) is employed to detect the abnormal areas in images classified as abnormal. This step helps to localise the disease infected areas specifically. Finally, a performance analysis is undertaken to assess the efficacy of the proposed system in comparison to traditional systems. In addition, the systematic workflow of the proposed system is shown in Fig. 4 to provide a comprehensive view of the overall proposed work.
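The two-stage flow described above (classify first, segment only the abnormal images) can be sketched as follows. This is a minimal illustration of the control flow only, not the authors' implementation: `diagnose`, `toy_classifier` and `toy_segmenter` are hypothetical names, and the two toy callables are simple threshold stand-ins for AB-ResNet and E-GCS.

```python
def diagnose(image, classifier, segmenter):
    """Classify one image and, only if it is abnormal, localise the infected area.

    `classifier` and `segmenter` are hypothetical callables standing in for
    AB-ResNet and E-GCS; any functions with these call shapes will work.
    """
    label = classifier(image)
    if label != "abnormal":
        # Normal images are not segmented.
        return {"label": label, "mask": None}
    return {"label": label, "mask": segmenter(image)}


def toy_classifier(image):
    # Placeholder rule (assumption): any bright pixel marks the image abnormal.
    return "abnormal" if max(max(row) for row in image) > 0.5 else "normal"


def toy_segmenter(image):
    # Placeholder rule (assumption): the mask is 1 where the pixel is bright.
    return [[1 if value > 0.5 else 0 for value in row] for row in image]


healthy = [[0.1, 0.2], [0.0, 0.3]]
infected = [[0.1, 0.9], [0.7, 0.2]]
healthy_result = diagnose(healthy, toy_classifier, toy_segmenter)
infected_result = diagnose(infected, toy_classifier, toy_segmenter)
```

Passing the classifier and segmenter as arguments keeps the sketch independent of any particular model or imaging modality.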

Fig. 4

Overall view of measurement of the proposed work.

Attention bottleneck residual network (AB-ResNet)

The overall framework of the introduced system is outlined in Fig. 1. The attention bottleneck residual network comprises three convolution layers, a global average pooling (GAP) layer, a fully connected (FC) layer and two AB-ResNet blocks. The COVID-19 dataset is described by its height and width. To extract spatial–spectral features, the study adopts three dimensional image patches positioned on labelled pixels as the input of the introduced AB-ResNet. The label of an image patch is the label of the pixel at its position, and the patch size gives the spatial size of the neighbourhood. If the COVID-19 dataset contains a given number of labelled pixels, the same number of image patches is obtained, together with an equivalent set of ground truth (GT) labels in which each label takes one of the COVID-19 class values. Accordingly, the patch set is partitioned into training, validation and test sets, and the label set is partitioned in the same way.
Prior to AB-ResNet training, hyper-parameters like batch size, patch size and learning rate are configured as follows:
Patch size: [128 × 128, 256 × 256]
Batch size: [4, 8, 16, 32]
Learning rate: [0.01, 0.001, 0.0001]
The proposed AB-ResNet is trained for two hundred epochs. In each epoch, the training set is partitioned into mini-batches, which are fed to the network one at a time. During training, the estimated label vectors corresponding to the training set are obtained by forward propagation through the model. Subsequently, a cross entropy loss function (CELF) is used to compute the difference between the estimated label vectors and the corresponding label vectors derived from the GT labels. Following this, the learned parameters of the proposed model are updated by the back propagation (BP) algorithm.
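The hyper-parameter values, mini-batch partitioning and cross entropy loss mentioned above can be illustrated with a short sketch. The listed values come from the text; treating them as a full grid of combinations is an assumption, and the names `SEARCH_SPACE`, `hyperparameter_grid`, `mini_batches` and `cross_entropy` are illustrative rather than taken from the paper.

```python
import itertools
import math

# Values copied from the text; the dictionary keys are illustrative names.
SEARCH_SPACE = {
    "patch_size": [(128, 128), (256, 256)],
    "batch_size": [4, 8, 16, 32],
    "learning_rate": [0.01, 0.001, 0.0001],
}


def hyperparameter_grid(space):
    """Return every combination of the listed values as a list of dicts."""
    keys = list(space)
    combos = itertools.product(*(space[key] for key in keys))
    return [dict(zip(keys, values)) for values in combos]


def mini_batches(samples, batch_size):
    """Split samples into consecutive mini-batches (the last may be shorter)."""
    return [samples[i:i + batch_size] for i in range(0, len(samples), batch_size)]


def softmax(logits):
    """Numerically stable softmax over a list of class scores."""
    shift = max(logits)
    exps = [math.exp(value - shift) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]


def cross_entropy(logits, true_class):
    """Cross entropy between softmax(logits) and a one-hot ground truth label."""
    return -math.log(softmax(logits)[true_class])


grid = hyperparameter_grid(SEARCH_SPACE)
batches = mini_batches(list(range(10)), 4)
loss = cross_entropy([2.0, 0.0], 0)
```

In a real training run the loss would be averaged over each mini-batch and minimised by back propagation; that step is omitted here because it depends on the chosen DL framework.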
Additionally, during training, the validation set is used to calculate the classification accuracy after every epoch in order to monitor the performance of the model. In this way, the trained model with the highest accuracy can be selected. Finally, the test set is used to assess the performance of the proposed AB-ResNet. For a given input feature map, AB-ResNet infers a three-dimensional attention map. The refined feature map is obtained by multiplying the input element-wise with the attention map and, following a residual learning strategy that facilitates gradient flow, adding the result to the input. To design an effective and powerful module, channel attention and spatial attention are first computed in two separate branches. The attention map is then obtained by applying a sigmoid function to the sum of the two branch outputs, both of which are resized to the size of the input feature map prior to addition. As each channel contains the response of a specific feature, the inter-channel relationship is exploited in the channel branch to compute the channel attention. In the spatial branch, the feature is projected into a reduced dimension using a 1 × 1 convolution to integrate and compress the feature map across the channel dimension. For simplicity, the same reduction ratio is used as in the channel branch. Subsequently, two 3 × 3 dilated convolutions are employed to use contextual information efficiently. Lastly, the features are reduced again to the spatial attention map using a 1 × 1 convolution. Further, a batch normalisation layer is applied at the end of the spatial branch for scale adjustment. Hence, the spatial attention is the batch-normalised output of this sequence of convolutions.
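The way the two branch outputs are combined into the refined feature map can be sketched in plain Python. The sketch assumes the channel branch yields one value per channel and the spatial branch one value per pixel, so that both can be broadcast to the size of the input; the branch computations themselves are not reproduced, and `refine`, `channel_logits` and `spatial_logits` are illustrative names.

```python
import math


def sigmoid(value):
    return 1.0 / (1.0 + math.exp(-value))


def refine(feature, channel_logits, spatial_logits):
    """Combine branch outputs into a refined feature map.

    feature:        C x H x W nested lists
    channel_logits: length-C list (assumed output of the channel branch)
    spatial_logits: H x W nested lists (assumed output of the spatial branch)
    """
    refined = []
    for c, plane in enumerate(feature):
        new_plane = []
        for h, row in enumerate(plane):
            new_row = []
            for w, value in enumerate(row):
                # Broadcast both branch outputs to C x H x W, add, then squash.
                attention = sigmoid(channel_logits[c] + spatial_logits[h][w])
                # Residual form: input plus attention-weighted input.
                new_row.append(value + value * attention)
            new_plane.append(new_row)
        refined.append(new_plane)
    return refined


feature = [[[1.0, 2.0], [3.0, 4.0]], [[-1.0, 0.0], [0.5, 2.0]]]  # C = 2, H = W = 2
zeros = [[0.0, 0.0], [0.0, 0.0]]
neutral = refine(feature, [0.0, 0.0], zeros)
suppressed = refine(feature, [-50.0, -50.0], zeros)
```

With all logits at zero the attention is 0.5 everywhere, so every value is scaled by 1.5; with strongly negative logits the attention vanishes and the residual path returns the input almost unchanged.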

Residual connection and activation based bottleneck mechanism

The residual block is the main element of the proposed AB-ResNet; its framework is presented in Fig. 5(b). As the figure shows, the residual block comprises a residual (skip) connection and two convolution layers. Through the skip connection, high-level and low-level features can be combined by addition. In this way, the residual block can lessen the exploding or vanishing gradient issue that typically occurs in deep networks. Each residual block is computed as per Eq. (5), which relates the input and the output of the block through the residual learning function, the activation function and the output of the convolution layers prior to the summation operation.

Fig. 5(b)

Attention bottleneck residual block.

To attain efficient performance, batch normalization (BN) and an activation function are applied in the residual block of the proposed network. As presented in Fig. 5(b), the activation framework is implemented by moving the BN and the ReLU activation function ahead of the convolution operation. The activation residual block is computed as per Eq. (6). As shown in Fig. 5(a), the activation function in Eq. (5) is ReLU, defined in Eq. (7). ReLU forces negative signals to zero, which might lead to the loss of some residual and informative features in a typical residual block. When the activation function is an identity mapping, Eqs. (5) and (6) become similar. This identity mapping permits signals to be propagated directly between any two units, so the features learned by the residual learning function (RLF) are always retained. In this way, the activation based bottleneck mechanism makes the network simpler and easier to train, thereby improving its generalization performance.
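The difference between the typical residual block and the activation (pre-activation) block can be sketched numerically. This is a simplified illustration under stated assumptions: BN is omitted, and `toy_residual_fn` is a hypothetical stand-in for the convolution layers rather than the study's residual learning function.

```python
def relu(values):
    return [max(0.0, v) for v in values]


def add(a, b):
    return [x + y for x, y in zip(a, b)]


def residual_block(x, residual_fn):
    """Typical form: ReLU is applied after the skip connection is added."""
    return relu(add(x, residual_fn(x)))


def pre_activation_block(x, residual_fn):
    """Pre-activation form: ReLU precedes the residual function, and the
    skip connection is a pure identity mapping (BN is omitted here)."""
    return add(x, residual_fn(relu(x)))


def toy_residual_fn(values):
    # Hypothetical stand-in for the convolution layers: scales by 0.5.
    return [0.5 * v for v in values]


x = [1.0, -2.0]
post = residual_block(x, toy_residual_fn)        # negative entry is zeroed
pre = pre_activation_block(x, toy_residual_fn)   # negative entry survives
```

In the typical form the negative component is clipped by the final ReLU, whereas in the pre-activation form it passes unchanged along the identity path, which is the behaviour the paragraph above attributes to the identity mapping.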

Fig. 5(a)

Typical residual network.

Attention bottleneck residual block

As the data comprising all spectral bands are fed directly to the proposed network as input, it is vital to deal with redundant information that might lower the classification accuracy. To solve this issue, this study selected a Squeeze and Excitation (SE) block to recalibrate the feature responses of the channels by explicitly modelling the interdependencies amongst channels. Hence, it can be considered a channel attention mechanism. This attention mechanism is integrated into the activation residual block to form the attention bottleneck residual block (ABRB). Assume the input to the attention mechanism consists of a number of feature maps (channels), each with the spatial size of the neighbourhood. The individual feature maps are first processed by a three-dimensional GAP layer to squeeze the global spatial information (GSI). Hence, a channel-wise feature tensor is generated. Subsequently, this feature tensor is fed to a three-dimensional convolution layer to reduce the channel dimension. After this convolution operation, the channel dimension of the feature tensor is divided by the reduction ratio, which is fixed at 4 in the introduced network. Following this, a ReLU function is employed to increase the non-linearity of the channel response, and an additional three-dimensional convolution layer restores the channel dimension to produce the feature tensor. Finally, a sigmoid function is applied and the result is multiplied with all the feature maps from the ABRB to rescale them. In this manner, channel weights are allocated to the individual feature maps, and the features are recalibrated adaptively.
Moreover, the parameters of the attention mechanism come from the two three-dimensional convolution layers inside it.
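The squeeze, excitation and rescaling steps described above can be sketched as follows. This is a minimal illustration, assuming two-dimensional feature maps and plain weight matrices in place of the three-dimensional convolution layers (biases omitted); `squeeze_excite`, `w_reduce` and `w_expand` are hypothetical names, and the all-zero weights are chosen only to make the result easy to check.

```python
import math


def squeeze_excite(feature_maps, w_reduce, w_expand):
    """Recalibrate C feature maps (each H x W) with channel weights.

    w_reduce has shape (C // r) x C and w_expand has shape C x (C // r);
    both are hypothetical weight matrices standing in for the two
    convolution layers.
    """
    # Squeeze: global average pooling gives one value per channel.
    squeezed = [
        sum(sum(row) for row in fmap) / (len(fmap) * len(fmap[0]))
        for fmap in feature_maps
    ]
    # Excitation, step 1: reduce the channel dimension, then ReLU.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed)))
              for row in w_reduce]
    # Excitation, step 2: restore the channel dimension, then sigmoid.
    weights = [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
               for row in w_expand]
    # Scale: multiply every feature map by its channel weight.
    scaled = [[[weight * v for v in row] for row in fmap]
              for fmap, weight in zip(feature_maps, weights)]
    return scaled, weights


# Toy input: C = 4 channels of size 2 x 2, reduction ratio r = 4.
C, r = 4, 4
feature_maps = [[[float(c + 1)] * 2 for _ in range(2)] for c in range(C)]
w_reduce = [[0.0] * C for _ in range(C // r)]    # illustrative all-zero weights
w_expand = [[0.0] * (C // r) for _ in range(C)]
scaled, weights = squeeze_excite(feature_maps, w_reduce, w_expand)
```

With a reduction ratio of 4 and four channels, the hidden tensor has a single entry. Trained weights would give each channel its own scale instead of the uniform 0.5 produced here.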

Overall architecture of the introduced network

Taking the COVID-19 dataset as an example, with image patches as input samples, the details of the proposed ABRB are shown in Fig. 6. Each convolutional layer is followed by BN and ReLU, except those in the ABRB. In line with the Spectral Spatial Residual Network (SSRN), the introduced network places particular emphasis on learning the spectral features from the raw input data. In addition, particular emphasis is placed on spatial features. Hence, various spectral and spatial features are extracted.

Fig. 6

Overall architecture of the proposed network.

Lastly, the spatial and spectral features are processed through the GAP and FC operations. Here, the FC operation adaptively produces a feature vector whose length equals the number of classes in the COVID data. Because there are sixteen classes in this dataset, the output vector's length is sixteen in Fig. 3. Additionally, the initial convolutional layer uses a stride that reduces the channel dimension of the input sample. The other convolution layers in the introduced ABRB are also configured with a stride. In the ABRB, every convolution layer uses padding to maintain the size of the feature cuboids. Without padding, the channel dimension or spatial size is reduced when the feature cuboids are processed by the convolution layers outside the ABRB.

Edge based graph cut segmentation (E-GCS)

Graph cut (GC) segmentation is an effective graph-based segmentation method with two major parts: the data part, which computes the conformity of the image data within the segmentation region using the features of the image, and the regularization part, which smooths the boundaries of the segmented region while maintaining the spatial information of the image. Graph cut is treated as an energy minimisation process on the constructed graph, so as to segment the region of interest (ROI) of the image using a set of seeds. The graph cut problem can be recast as a minimal cut or maximum flow problem. Consider a map that assigns pixels to clusters. The energy function of the graph cut then has two elements: the data part, which measures how well each pixel conforms to its allocated region, and the regularization part, which assesses the smoothness of the boundaries. The procedure is shown in Algorithm I. Initially, the abnormal image is taken as input so that the proposed E-GCS can segment the infected area from it. For this purpose, the graph is constructed and its representation is converted into a minimal cut or maximum flow problem. In addition, an energy function is computed to attain the final segmented image, as indicated by step 4. This step localises the infected area of the disease separately, in addition to the classification of normal and abnormal images, which makes the proposed system effective.
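The two-part energy described above can be illustrated on a tiny example. This is a sketch under stated assumptions and not the E-GCS algorithm itself: it uses a one-dimensional row of pixels, a squared-difference data term, a constant boundary penalty without edge weighting, and an exhaustive search in place of the max-flow computation. All names and values are hypothetical.

```python
from itertools import product


def energy(labels, pixels, fg_mean, bg_mean, smoothness):
    """Graph-cut style energy of a labelling on a 1-D row of pixels.

    Data part: squared distance of each pixel to the mean of its region.
    Regularisation part: a fixed penalty for every pair of neighbouring
    pixels that receive different labels.
    """
    means = {0: bg_mean, 1: fg_mean}
    data = sum((p - means[l]) ** 2 for p, l in zip(pixels, labels))
    boundary = sum(1 for a, b in zip(labels, labels[1:]) if a != b)
    return data + smoothness * boundary


def segment(pixels, fg_mean=1.0, bg_mean=0.0, smoothness=0.1):
    """Exhaustively search all labellings for the minimum-energy one.

    Only feasible for tiny inputs; graph-cut solvers reach the minimum of
    this kind of two-label energy with a max-flow / min-cut computation.
    """
    candidates = product((0, 1), repeat=len(pixels))
    return min(candidates,
               key=lambda lab: energy(lab, pixels, fg_mean, bg_mean, smoothness))


pixels = [0.1, 0.2, 0.8, 0.9]   # hypothetical intensities in [0, 1]
labels = segment(pixels)        # 1 marks the bright region
```

The data term pulls each pixel towards the region whose mean it resembles, while the boundary penalty discourages fragmented labellings. For this input the two dark pixels and the two bright pixels end up in separate regions with a single boundary between them.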

Results and discussion

The results obtained after implementing the proposed system to detect COVID and non-COVID cases are discussed in this section. The performance metrics considered for the analysis are also presented. A comparative analysis is undertaken to demonstrate the efficacy of the proposed system over existing systems.

Dataset description

A large number of CT, ultrasound and X-ray images are accessible from publicly available datasets. Because COVID-19 emerged only recently, these large repositories do not contain labelled COVID-19 data, so several datasets are needed as sources of COVID and non-COVID images. The data are taken from the following sources.
CT images: https://www.kaggle.com/datasets/kmader/siim-medical-images
Chest X-ray images: https://data.mendeley.com/datasets/rscbjbr9sj/2
Ultrasound images: https://www.butterflynetwork.com/covid 19#gallery
The number of images in each data class for the train and test split is given in Table 1.

Table 1

Number of images for train and test.

Modality (classes)	Train data	Test data
X-ray images (normal or pneumonia)	4690	1172
CT images (COVID or Non-COVID)	596	149
Ultrasound (normal or pneumonia)	900	225

Performance metrics

Performance of the introduced system is analysed with respect to various significant metrics for classifying the three modalities of medical images. The metrics include accuracy, sensitivity, recall, specificity, precision, area under curve (AUC), kappa and Jaccard coefficients, F-score, NPV, FPR and FNR.
A. Accuracy: the proportion of all images that are classified correctly. It is given by Eq. (8).
B. Specificity: the proportion of actual negative cases that are correctly identified as negative. It is given by Eq. (9).
C. Sensitivity: the proportion of actual positive cases that are correctly recognized. It is given by Eq. (10).
D. Recall: the ratio of relevant images that are retrieved to all relevant images. It is given by Eq. (11).
E. Precision: the proportion of images classified as positive that are actually positive. It is given by Eq. (12).
F. Area under curve (AUC): the area under the ROC curve, which summarises classification performance across thresholds. It is given by Eq. (13).
G. F-score: also termed the F-measure, it is the harmonic mean of precision and recall and is computed as per Eq. (14).
H. Negative predictive value (NPV): the probability that an individual is not affected by the disease given a negative test outcome. It is given by Eq. (15).
I. False positive rate (FPR): the ratio of negative cases incorrectly detected as positive to all negative cases in the data. It is given by Eq. (16).
J. False negative rate (FNR): one minus the true positive rate (TPR). It is given by Eq. (17).
In Eqs. (8)–(17), TP is true positive, FP is false positive, TN is true negative and FN is false negative.
K. Kappa coefficient: typically utilised to measure inter-rater reliability. It can also be stated as the degree of agreement between two sets of frequencies collected on two different occasions. It is a measure of agreement, is applied for assessing the accuracy in segmentation, and is computed by Eq. (18) from the proportion on which the judges agree and the agreement expected by chance.
L. Jaccard coefficient: also called the Jaccard similarity coefficient, it compares the members of two sets to observe which are shared and which are distinct. It is a similarity measure for two sets within a range of 0%–100%; the higher the percentage, the more similar the two sets. It is calculated using Eq. (19).
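The count-based metrics listed above can be computed from a confusion matrix as sketched below. The formulas are the standard textbook definitions and are assumed, not confirmed, to match Eqs. (8)–(19) of the study. AUC is left out because it needs scores at several thresholds, and the counts in the example are hypothetical.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion-matrix counts."""
    total = tp + fp + tn + fn
    sensitivity = tp / (tp + fn)            # also called recall or TPR
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / total
    # Agreement expected by chance, used by Cohen's kappa.
    p_chance = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / total ** 2
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "recall": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f_score": 2 * precision * sensitivity / (precision + sensitivity),
        "npv": tn / (tn + fn),
        "fpr": fp / (fp + tn),
        "fnr": fn / (fn + tp),
        "kappa": (accuracy - p_chance) / (1 - p_chance),
        "jaccard": tp / (tp + fp + fn),
    }


# Hypothetical counts chosen for easy checking.
metrics = classification_metrics(tp=40, fp=10, tn=45, fn=5)
```

Sensitivity and recall are the same quantity under these definitions, and FPR and FNR are the complements of specificity and sensitivity respectively.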

Experimental results

The results attained from the proposed system are shown in Table 2, which presents the classification results obtained for CT, X-ray and ultrasound images. The proposed system differentiates the normal and abnormal cases of COVID-19, and the two groups are shown separately in the table.

Table 2

COVID and non-COVID classification.

Performance and comparative analysis

Performance of the proposed system is analysed with respect to training accuracy, testing accuracy, precision and recall. Of the total data, 70% is allocated for training, and the remaining 30% is divided equally between testing and validation. The obtained outcomes are shown in Table 3.
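The partition described above can be sketched as follows. This is a minimal illustration, assuming a random shuffle with a fixed seed; the function name `split_indices` and the seed are hypothetical, and the study does not state how samples were assigned to each set.

```python
import random


def split_indices(n_samples, seed=0):
    """Shuffle sample indices and split them 70% / 15% / 15%."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = n_samples * 70 // 100
    n_test = (n_samples - n_train) // 2
    train_idx = indices[:n_train]
    test_idx = indices[n_train:n_train + n_test]
    val_idx = indices[n_train + n_test:]
    return train_idx, test_idx, val_idx


train_idx, test_idx, val_idx = split_indices(1000)
```

Integer arithmetic is used for the set sizes so that every sample lands in exactly one set, with any remainder going to the validation set.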

Table 3

Training and testing results.

Parameters (in %)	Training	Testing
Accuracy	99.70	97.56
Precision	98.65	96.15
Recall	98.67	96.34

From Table 3, the training accuracy is 99.70%, with a precision of 98.65% and a recall of 98.67%. The testing accuracy is 97.56%, with a precision of 96.15% and a recall of 96.34%. Thus, it can be concluded that the proposed system performs well, as it shows a good accuracy rate. In addition, the proposed system is analysed with respect to various metrics to assess its efficiency in the overall classification of the three medical image modalities: X-ray, ultrasound and CT images. The considered metrics include sensitivity, recall, specificity, precision, accuracy, NPV, FPR, FNR, AUC, kappa and Jaccard coefficients and F-score. The results obtained through this analysis are presented in Table 4.

Table 4

Performance analysis of the proposed system.

Performance metrics (in % unless noted)	Overall	X-ray	CT	Ultrasound
Sensitivity	99.67	96.64	94.55	86.22
Specificity	99.66	95.42	92.25	86.95
Precision	99.67	95.36	92.05	87.08
Recall	99.671	96.64	94.55	86.22
NPV	99.7	96.68	94.7	86.09
FPR (fraction)	0.003	0.0457	0.0774	0.1304
FNR (fraction)	0.0032	0.0335	0.0544	0.1377
Accuracy	99.66	96.02	93.37	86.58
Kappa coefficient	99.34	92.05	86.75	73.17
AUC	99.67	96.03	93.04	86.59
Jaccard	99.35	92.3	87.42	76.45
F-score	99.68	96	93.28	86.65

From Table 4, the overall classification rate of the proposed system across the three image modalities is high, at 0.9968. The classification rate is 0.96 for X-ray images, 0.9328 for CT scans and 0.8665 for ultrasound images; these values correspond to the F-score row of Table 4. The sensitivity, recall, specificity and precision are also high. In addition, the kappa and Jaccard coefficients, computed using Eqs. (18) and (19), are high. The overall FNR is low, at 0.0032, which indicates the low error rate of the proposed system in classifying the images into normal and abnormal. Further, a receiver operating characteristic (ROC) curve is plotted (Fig. 7) to give a graphical view of the classifier's performance rather than a single value, as most other performance metrics do.

Fig. 7

ROC graph.

From Fig. 7, the rapid rise of the ROC curve, with an area of 0.99, reflects the efficiency of the classifier. Additionally, a confusion matrix is constructed to check the model's efficacy; it is tabulated in Table 5 and presented graphically in Fig. 8.

Table 5

Confusion matrices.

Actual \ Predicted	Positive	Negative
Positive	3852	10
Negative	15	3856

Fig. 8

Confusion matrix.

From the confusion matrix, 3852 positive cases are correctly identified as positive and 3856 negative cases are correctly identified as negative, while 10 positive cases are misclassified as negative and 15 negative cases are misclassified as positive. The number of correctly detected cases is far higher than the number of misclassifications, which confirms the efficacy of the proposed method. In addition, a comparative analysis is carried out with respect to accuracy, as it is a significant metric, especially in the medical domain. This analysis is undertaken to compare the efficacy of the introduced system with other methods. The obtained results are tabulated in Table 6.

Table 6

Comparative analysis of the proposed and traditional systems (Narin et al., 2021).

Methods	Accuracy (in %)
Narayan Das et al.	97.4
Singh et al.	94.65
Afshar et al.	95.7
Ucar and Korkmaz	98.26
Khan et al.	89.6
Sahinbas and Catak	80
Medhi et al.	93
Zhang et al.	95.18
Apostolopoulos et al.	93.48
Narin et al.	98
Ali Narin et al.	96.1
Proposed	98.34

From Table 6, the existing system (Narin et al., 2021) used a CNN for COVID-19 detection through classification. Various other methods are also considered in the table. Although the traditional systems achieve good accuracy, the proposed system attains the highest accuracy in the table (98.34%) in classifying normal and abnormal images, and it additionally segments the affected area to localize the disease. This reveals the efficacy of the introduced system. The comparison is shown graphically in Fig. 9.

Fig. 9

Analysis of the proposed and traditional methods with respect to accuracy (Narin et al., 2021).

Another comparative analysis is carried out to assess the efficacy of the proposed DL based system against conventional techniques in detecting COVID-19. For this analysis, sensitivity, specificity, accuracy and AUC are taken into account, and the obtained results are tabulated in Table 7.

Table 7

Comparative analysis in terms of significant metrics (El-Kenawy et al., 2020).

Methods	Sensitivity (in %)	Specificity (in %)	Accuracy (in %)	AUC
X. Wu et al.	81.1	61.5	76	0.819
A.A. Ardakani et al.	100	99.02	99.61	0.994
K. Zhang et al.	94.93	91.13	92.49	0.981
H. Panwar et al.	97.62	89.13	88.1	0.881
C. Butt et al.	98.2	92.2	86.7	0.996
M. Nour et al.	89.39	99.75	98.97	0.994
X. Wang et al.	94.5	95.3	96.2	0.97
Proposed	99.67	99.66	99.68	0.99

From Table 7, the AUC of the proposed system is 0.99. The existing system (El-Kenawy et al., 2020) used new feature selection and voting classifier algorithms for COVID-19 detection. The outcomes of the introduced system are high in comparison to the traditional systems: its sensitivity and specificity are 99.67% and 99.66%, and its accuracy of 99.68% is the highest in the table. The results are shown graphically in Fig. 10.

Fig. 10

Analysis of the existing (El-Kenawy et al., 2020) and proposed system with respect to important metrics.

Thus, the performance and comparative analysis of the proposed system with 12 significant performance metrics reveals its efficiency relative to the existing system. As mentioned earlier, the proposed system considers three medical image modalities, classifying normal and abnormal images from CT scans, X-rays and ultrasound images. The existing study (El-Kenawy et al., 2020) considers only CT images, whereas the present study considers all three modalities. Additionally, the proposed system is compared with conventional systems in terms of execution time, and the obtained outcomes are shown in Table 8 and Fig. 11.

Table 8

Comparative analysis with respect to execution time (Wu et al., 2020).

Method	Time (s)
ACNN	561.8
CNN	921.2
Proposed	365.1

Fig. 11

Analysis in terms of execution time (Wu et al., 2020).

From the outcomes, conventional methods such as ACNN (adaptive median filter CNN) and CNN take 561.8 s and 921.2 s respectively, while the proposed system takes the minimum execution time of 365.1 s, which shows its efficiency over the conventional methods. Further, a comparative analysis is undertaken with respect to recall, precision and validation accuracy. The methods considered are Vanilla gray, Vanilla RGB, Basic CapsNet (Basic Capsule Network), Modified CapsNet, Hybrid CNN VGG (Hybrid Convolutional Neural Network Visual Geometry Group) and VDSNet (VGG Data Spatial transformer network with CNN). The obtained outcomes are shown in Table 9 and presented graphically in Fig. 12.

Table 9

Comparative analysis with respect to performance metrics.

Methods	Recall (in %)	Precision (in %)	Validation accuracy (in %)
Vanilla gray	58	68	67.80
Vanilla RGB	61	68	69
Hybrid CNN VGG	62	68	69.50
VDSNet	63	69	73
Modified CapsNet	48	61	63.80
Basic CapsNet	51	64	60.50
Proposed	96.64	95	96.02

Fig. 12

Analysis in terms of performance metrics (Bharati et al., 2020b).

From the analytical results, the best recall among the existing methods is 63%, achieved by VDSNet, whereas the proposed system reaches a recall of 96.64%. Similarly, the best precision among the conventional methods is 69% (VDSNet), whereas the proposed system reaches 95%. Further, the validation accuracy is 69% for Vanilla RGB and 73% for VDSNet, whereas the proposed method reaches 96.02%. Thus, the overall analytical outcomes reveal the efficiency of the proposed system over the conventional systems with respect to the considered metrics.

Conclusion

The present study proposed methods based on image processing and deep learning (DL) to detect COVID-19, using the Attention Bottleneck Residual Network (AB-ResNet) to classify abnormal and normal images. The study also utilised Edge Based Graph Cut Segmentation (E-GCS) to segment the infected area and localise the disease, considering three medical imaging modalities: X-ray images, CT images and ultrasound imaging. The proposed system was assessed by comparing it with traditional studies with respect to significant metrics, namely accuracy, sensitivity, recall, specificity, precision, area under curve (AUC), kappa and Jaccard coefficients, F-score, NPV, FPR and FNR. The analytical outcomes showed the efficacy of the proposed system over the traditional systems in detecting the abnormal and normal cases of COVID-19, with a high accuracy of 99.68%. However, it takes more training time. This study can help medical practitioners to differentiate COVID and non-COVID cases easily and to segment the infected area to localise the disease, by which patients' lives could be saved. In future, special focus will be given to reducing the training time and running time. Additionally, ablation studies will be included to further validate the proposed work.

CRediT authorship contribution statement

T. Ahila: Writing, Developing, Reviewing the content of the manuscript. A.C. Subhajini: Works to cite the figure, table and references.

Uncited References

Bharati et al. (2020a)

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

54 in total

1. Classification of COVID-19 chest X-rays with deep learning: new models or fine tuning?

Authors: Tuan D Pham
Journal: Health Inf Sci Syst Date: 2020-11-22

2. COVID-19 detection and disease progression visualization: Deep learning on chest X-rays for classification and coarse localization.

Authors: Tahmina Zebin; Shahadate Rezvy
Journal: Appl Intell (Dordr) Date: 2020-09-12 Impact factor: 5.086

3. Is Lung Ultrasound Imaging a Worthwhile Procedure for Severe Acute Respiratory Syndrome Coronavirus 2 Pneumonia Detection?

Authors: Giovanni Battista Fonsi; Paolo Sapienza; Gioia Brachini; Chiara Andreoli; Maria Luisa De Cicco; Bruno Cirillo; Simona Meneghini; Francesco Pugliese; Daniele Crocetti; Enrico Fiori; Andrea Mingoli
Journal: J Ultrasound Med Date: 2020-09-07 Impact factor: 2.153

4. Drawing insights from COVID-19-infected patients using CT scan images and machine learning techniques: a study on 200 patients.

Authors: Sachin Sharma
Journal: Environ Sci Pollut Res Int Date: 2020-07-22 Impact factor: 4.223