
BEMD-3DCNN-based method for COVID-19 detection.

Ali Riahi¹, Omar Elharrouss², Somaya Al-Maadeed³.

Abstract

The coronavirus outbreak continues to spread around the world and no one knows when it will stop. Therefore, from the first day of the identification of the virus in Wuhan, China, scientists have launched numerous research projects to understand the nature of the virus, how to detect it, and search for the most effective medicine to help and protect patients. Importantly, a rapid diagnostic and detection system is a priority and should be developed to stop COVID-19 from spreading. Medical imaging techniques have been used for this purpose. Current research is focused on exploiting different backbones like VGG, ResNet, DenseNet, or combining them to detect COVID-19. By using these backbones many aspects cannot be analyzed like the spatial and contextual information in the images, although this information can be useful for more robust detection performance. In this paper, we used 3D representation of the data as input for the proposed 3DCNN-based deep learning model. The process includes using the Bi-dimensional Empirical Mode Decomposition (BEMD) technique to decompose the original image into IMFs, and then building a video of these IMF images. The formed video is used as input for the 3DCNN model to classify and detect the COVID-19 virus. The 3DCNN model consists of a 3D VGG-16 backbone followed by a Context-aware attention (CAA) module, and then fully connected layers for classification. Each CAA module takes the feature maps of different blocks of the backbone, which allows learning from different feature maps. In our experiments, we used 6484 X-ray images, of which 1802 were COVID-19 positive cases, 1910 normal cases, and 2772 pneumonia cases. The experiment results showed that our proposed technique achieved the desired results on the selected dataset. Additionally, the use of the 3DCNN model with contextual information processing exploited CAA networks to achieve better performance.


Keywords: 3DCNN; BEMD; COVID-19; Context-aware attention


Year: 2021 PMID: 34998222 PMCID: PMC8717690 DOI: 10.1016/j.compbiomed.2021.105188

Source DB: PubMed Journal: Comput Biol Med ISSN: 0010-4825 Impact factor: 4.589

Introduction

Coronavirus disease 2019 (COVID-19) is a contagious respiratory illness that was first identified in December 2019 in Wuhan, China. As of August 27, 2021, there were 215,612,352 confirmed COVID-19 cases worldwide, of which 192,807,624 persons had recovered and 4,491,204 had died [1]. The outbreak continues to spread, and no one knows when it will stop. Since the virus was first identified, scientists have therefore launched numerous research projects to understand how the pandemic spreads and how a person's immune system responds, in order to prevent, diagnose, and treat the disease [2]. As a result, medical professionals and research scientists around the world have built up an understanding of how the novel coronavirus spreads and affects the body. The virus mainly attacks the lungs and then spreads through the body, affecting other organs such as the kidneys and liver. Researchers in other domains have also pursued solutions for early detection of COVID-19, including efforts to ensure social distancing. Among these domains, computer science offers various techniques for analyzing data, especially medical images. Using features from CT scans and X-ray images combined with artificial intelligence (AI) tools and deep-learning-based frameworks, previously proposed methods have achieved high accuracy with minimal computational time [3]. AI techniques have also been used to mitigate the spread of COVID-19 through early detection algorithms, drones, mask detection, etc. In this paper, we propose a deep learning (DL) framework that combines a 3D convolutional neural network (3DCNN) and a context-aware attention network, applied to BEMD features of X-ray images, to detect the virus. We use the BEMD technique to decompose each original image into four new images, a finite number of intrinsic mode functions (IMFs) [4], and then apply the 3DCNN to classify and detect COVID-19.
Using 3D data can help a model extract more useful information than using a single image. We used a dataset gathered by Islam et al. [5]. In addition, we evaluated the proposed method on new data collected from the internet (Kaggle and Mendeley). The contributions of this paper can be summarized as follows.

Extraction of three components from the original images using the BEMD decomposition technique, followed by the creation of a video from the extracted components for use in the proposed 3DCNN model.

A 3DCNN model consisting of a backbone for feature extraction followed by two context-aware attention (CAA) modules that take as input the feature maps of two different blocks of the backbone. The concatenated outputs of the CAA modules are classified using fully connected layers.

An evaluation on two datasets of different sizes, which demonstrates the effectiveness of the proposed method with an accuracy of 99.99%.

This paper is organized as follows. Section 2 presents a summary of recent research related to this study. Section 3 describes the proposed methodology, and section 4 describes the experiments and results together with the dataset collection and preparation. Finally, section 5 concludes the paper.

Related works

Early work on COVID-19 included detection from images of patients' lungs acquired through ultrasound technology, along with a technique to identify and monitor patients affected by viruses. These methods required trained and qualified staff, so automated detection and recognition techniques that do not require skilled specialists are needed. Computer vision techniques provide such detection from images and videos. Islam et al. [5] introduced a deep learning (DL) technique that combines a convolutional neural network (CNN) with Long Short-Term Memory (LSTM) to detect COVID-19 from X-ray images. They used 4575 X-ray images, of which 1525 were COVID-19 positive cases. In the same context, Carrer et al. [6] discussed a method to automatically detect the pleural line of the lungs and extract characteristics based on its geometry and intensity. The extracted and modeled pleural lines were given as input to a supervised support vector machine (SVM) classifier to identify COVID-19. Farooq et al. [7] used infrared thermography (IRT), a medical diagnostic tool that detects heat patterns and measures quantitative temperature data of the human body, together with machine learning (ML) techniques to diagnose COVID-19. They also illustrated, through a generic comprehensive block diagram, a thermal imaging method using a computer-aided diagnosis (CAD) system. In this method, ML is used to preprocess the images, refine the output, train on the data, and extract features to predict the outputs. A CNN was applied to train on the data and detect COVID-19. In study [8], the authors demonstrated how deep learning could be used to detect COVID-19 from images regardless of source, such as X-ray, ultrasound, or CT scan. They built a CNN model based on a comparison of several known CNN models.
Their approach aimed to minimize noise so that deep learning of image features could be applied to detect COVID-19. Their study showed better results on ultrasound images than on CT scans and X-ray images. They used a VGG19 network as the backbone of their detection model. The CoroNet technique for detecting COVID-19 from X-ray images was presented in Ref. [9]. The authors used 1251 images in their experiments, of which 284 were COVID-19, 330 bacterial pneumonia, 327 viral pneumonia, and 310 from normal patients. They divided the collected dataset into two sets, 80% to train the model and 20% to validate it. Horry et al. [10] proposed a system based on pre-trained models to detect COVID-19. The system uses four pre-trained models: VGG, Xception, Inception, and ResNet. The dataset includes 100 COVID-19 images, 100 pneumonia images, and 200 normal cases. They used 80% for training and 20% for testing. Iteratively pruned deep learning models were shown to be capable of detecting COVID-19 pulmonary signs from X-ray images, with a CNN used to train on the data and evaluate the model [11]. In the same context, Roy et al. [12] applied deep learning techniques to lung ultrasonography (LUS) images using a fully annotated dataset. The authors of another study [13] confirmed that the automatic analysis of X-ray images is a good alternative for COVID-19 diagnosis. However, although the deep learning models used have potential for COVID-19 detection, the accuracy of such a method depends on data annotated by experts. Techniques to detect COVID-19 using transfer learning were proposed with five CNN variants: VGG-19, MobileNetv2, Inception, Xception, and ResNetv2 were tested in a first experiment, and MobileNetv2 in a second [14]. Waheed et al. [15] proposed a GAN-based model named CovidGAN, based on an Auxiliary Classifier Generative Adversarial Network (ACGAN), to detect COVID-19. Minaee et al.
[16] presented a technique named Deep-COVID, based on deep transfer learning, to detect COVID-19 from X-ray images. Their experiment used ResNet18, ResNet50, SqueezeNet, and DenseNet121; SqueezeNet gave the best performance. In another research project, Moutounet et al. [17] developed a DL scheme to differentiate between COVID-19 and other pneumonia from X-ray images. They tested VGG-16, VGG-19, InceptionResNetV2, InceptionV3, and Xception and partitioned the dataset using five-fold cross-validation. The best performance was obtained using VGG-16. The authors in Ref. [18] developed a scheme based on CNN variants to detect COVID-19 from X-ray images. They divided the dataset into two sets, 80% for training and 20% for testing, and reported that VGG-19 and DenseNet were the best at detecting COVID-19. In another study [19], conceptual structures and platforms were investigated for their capability to deal with COVID-19 diagnosis. Different techniques have been developed to detect the disease, such as Generative Adversarial Networks (GAN), LSTM, Extreme Learning Machine (ELM), and Recurrent Neural Networks (RNN). To identify COVID-19, Maguolo and Nanni [20] tested AlexNet with ten-fold cross-validation for training and testing. In the same context, Oh et al. [21] discussed the importance of artificial intelligence (AI) in the diagnosis of COVID-19 using X-ray images and proposed a CNN method with a small number of trainable parameters to detect COVID-19. In another study, Wang et al. [22] developed a scheme named COVID-Net to detect COVID-19 from X-ray images, whereas other researchers [23] used pre-trained CNN models and an SVM. They used eleven pre-trained CNN models to extract features and then applied the SVM for classification. In their experiment, they found that ResNet50 with SVM gave the best accuracy.
In the same context, Shi et al. [25] demonstrated the importance of medical imaging with CT scans and X-rays, which can help in COVID-19 diagnosis. Exploiting new technologies such as artificial intelligence (AI) to analyze these types of images helps in detecting COVID-19. X-ray images with COVID-19 were applied to an SOFM network to search and classify patients' sickness; this work showed that unsupervised learning could extract features from X-ray images [26]. Chowdhury et al. [27] used a dataset that contains 423 COVID-19 X-ray images, 1485 viral pneumonia X-ray images, and 1579 normal X-ray images. They used transfer learning with image augmentation techniques to train and validate several pre-trained deep CNNs. Furthermore, a modified CNN was proposed in Ref. [28] to detect coronavirus from X-ray images. The authors combined Xception and ResNet50-V2 and applied five-fold cross-validation on 180 COVID-19 images, 6054 pneumonia images, and 8851 normal images. Loey et al. [40] presented pre-trained CNN models with deep transfer learning and a Generative Adversarial Network (GAN) to detect COVID-19. Three networks were tested: AlexNet, GoogleNet, and ResNet18. The experiment showed that GoogleNet was the best. Furthermore, the authors in Ref. [41] concatenated Xception and ResNet50 and used five-fold cross-validation on 180 COVID-19 images, 6054 pneumonia images, and 8851 normal images to detect COVID-19. In another study, Ucar et al. [42] used Bayes-SqueezeNet to develop a scheme named COVIDiagnosis-Net to detect coronavirus using X-ray images. They partitioned the dataset into three sets, 80% for training, 10% for validation, and 10% for testing, without reaching high results compared to other studies. In another study [44], the authors presented a network called DarkCovidNet, based on a CNN, to detect COVID-19 using X-ray images.
The proposed solution used DarkNet with 17 convolutional layers for the classification of 1127 images partitioned with five-fold cross-validation. In another study, Punn et al. [45] used ResNet, Inception-v3, Inception ResNet-v2, DenseNet169, and NASNetLarge as pre-trained CNNs to detect COVID-19 from X-ray images. In their experiment, they used 1076 images divided into three sets: 80% for training, 10% for testing, and 10% for validation. Narin et al. [46] used pre-trained InceptionV3, ResNet50, and Inception-ResNetV2 models to detect COVID-19, with five-fold cross-validation for the dataset partition. The authors of another work [47] combined three pre-trained CNN models: ResNet18, ResNet50, and GoogleNet. Grid search was used to select the best hyperparameters, and the pre-trained models were used to extract features and perform the classification. They divided the dataset into 50% for training, 30% for testing, and 20% for validation. Meanwhile, Li et al. [48] proposed the discriminative cost-sensitive learning (DCSL) technique to detect COVID-19. The dataset was partitioned using five-fold cross-validation. In another study, the authors proposed a deep learning technique with 50 layers using ResNet50 to detect COVID-19 [51]. Within only 41 iterations, their system achieved an accuracy of 96.23%. Khobahi et al. [52] used a semi-supervised deep learning technique based on auto-encoders, named CoroNet, to detect COVID-19. The dataset was divided into two sets: 90% for training and 10% for testing. The authors in Ref. [53] developed a COVID-19 detection model called BMO-CRNN. They first remove noise in the preprocessing stage, then tune the parameters using a BMO algorithm, use the CRNN technique to extract features, and finally apply SoftMax (SM) based classification to identify COVID-19. Yamaç et al.
[54] proposed an approach based on the CSEN that can be seen as a bridge between deep learning models and representation based methods. CSEN uses both a dictionary and a set of training samples to learn a direct mapping from the query samples to the sparse support set of representation coefficients. With this unique ability and having the advantage of a compact network, the proposed CSEN-based COVID-19 recognition systems surpass the competing methods and achieve over 98% sensitivity and over 95% specificity. Zhou et al. [55] proposed a method that combines image regrouping and ResNet-SVM applied to X-ray images. The image regrouping and the residual encoder block combined for feature learning and extraction, and the extracted features were input into SVM for classification. Tang et al. [56] proposed a technique called Ensemble Deep Learning (EDL-COVID) by combining deep learning and ensemble learning. The model is generated by putting together several snapshot models from COVID-Net, which pioneered an open source COVID-19 case detection method with a deep neural network applied to X-ray images using a weighted average assembling method and different sensitivities of deep learning models for different types of classes. In another study [57], the authors used five pre-trained CNN models (ResNet50, ResNet101, ResNet152, InceptionV3 and Inception-ResNetV2) for the detection of coronavirus using X-ray images. They implemented three different binary classifications with four classes (COVID-19, normal (healthy), viral pneumonia and bacterial pneumonia) using five-fold cross-validation. According to their experiment, ResNet50 model provided the highest classification performance. Ahsan et al. [58] proposed a machine vision approach to detect COVID-19 from X-ray images. From the latter, they extracted features using histogram oriented gradient (HOG) and convolution neural network (CNN) to generate the classification model. 
They also employed a modified anisotropic diffusion filtering (MADF) technique for better edge preservation and to reduce noise in the images. To mask the significant fracture in the X-ray images, they used a watershed segmentation algorithm. The authors in Ref. [59] demonstrated that deep learning models can leverage data-source-specific confounders to differentiate between COVID-19 and pneumonia labels. They eliminated many confounders from earlier studies, such as those related to large age discrepancies between populations (pediatric vs. adult), image post-processing artifacts introduced by working from low-resolution PDF images, and positioning artifacts, by pre-segmenting and cropping the lungs. A summary of the proposed methods is presented in Table 1.

Table 1

Summary of techniques and datasets used for COVID-19 detection from X-ray images.

Method	COVID-19 (images)	Pneumonia (images)	Normal (images)	Techniques used
Islam et al. [5]	613	1525	1525	CNN-LSTM
Horry et al. [8]	140	322	60,361	VGG19
Khan et al. [9]	284	327	310	CoroNet (CNN)
Horry et al. [10]	100	100	200	VGG16,VGG19,ResNet50,InceptionV3,Xception
Apostolopoulos et al. [14]	224	714	504	VGG19,MobileNetv2,Inception, Xception, Inception, ResNetv2
Minaee et al. [16]	71		5000	ResNet18,ResNet50,SqueezeNet, DenseNet-121
Moutounet-Cartan et al. [17]	125	50	152	VGG16, VGG19, InceptionResNetV2, InceptionV3, Xception
Hemdan et al. [18]	25	–	25	VGG19,DenseNet121, InceptionV3, ResNetV2, InceptionResNet-V2,Xception, MobileNetV2
Maguolo et al. [20]	144	339	–	AlexNet
Chowdhury et al. [27]	423	1485	1579	SqueezeNet, Mobilenetv2, ResNet18,ResNet101, VGG19, DenseNet201
Rahimzadeh et al. [28]	180	6054	8851	Concatenated CNN
Loey et al. [40]	69	79	79	GAN, Alexnet, Googlenet, Resnet18
Rahimzade et al. [41]	180	4575	4575	Xception, ResNet50V2,Concatenated CNN
Ucar et al. [42]	45	1591	1203	Bayes SqueezeNet
Bukharia et al. [43]	89	96	93	ResNet50
Ozturk et al. [44]	127	500	500	DarkNet
Punn et al. [45]	108	515	453	ResNet, Inception-v3, Inception ResNet-v2, DenseNet169, NASNetLarge
Narin et al. [46]	50		50	ResNet50,InceptionV3, InceptionResNetV2
Ozcan et al. [47]	131	148	200	GoogleNet, ResNet18, ResNet50
Li et al. [48]	239	1000	1000	DCSL
Mukherjee et al. [49]	130		130	Shallow CNN
Luz et al. [50]	152	5421	7966	MobileNet, ResNet50, VGG16, VGG19
Farooq et al. [51]	68	931	1203	ResNet50
Khobahi et al. [52]	89	8521	7966	TFEN, CIN

Algorithm 1

BEMD algorithm.

Proposed method

During the last two years, many methods have been proposed for COVID-19 detection using different deep learning architectures on X-ray images. The majority of these techniques succeeded in identifying COVID-19 from X-ray images, reaching an accuracy of 99%. However, the datasets used were not very large, and the number of available images increases every month, which can introduce cases that differ from existing ones in the appearance of the X-ray images and the medical situation of each patient. Research therefore continues in order to maintain and improve accuracy. In this paper, we propose a novel method to automatically detect COVID-19 from X-ray images. It consists of a 3DCNN network with a context-aware attention (CAA) module trained on BEMD data extracted from the original X-ray images. First, we use the BEMD technique to extract four intrinsic mode functions (IMFs) from each original image, where each image is of size 512 × 512. Then we create a video from the extracted IMFs and use it as input to the proposed 3DCNN model. Combining a video that carries different features with a 3DCNN can facilitate learning better than using single images, and the CAA module allows the model to learn from different features. Fig. 1 shows a flowchart of the proposed approach, including the extraction of IMFs using the BEMD algorithm and the architecture of the proposed 3DCNN model.

Fig. 1

Flowchart of the proposed system.

Bidimensional empirical mode decomposition (BEMD)

Empirical mode decomposition (EMD) is an adaptive technique that decomposes a signal into intrinsic mode functions (IMFs), which provide local frequency information about the signal; it is suitable for non-linear, non-stationary data analysis. For images, which are 2D representations, EMD has been extended to bidimensional EMD (BEMD), also termed image EMD (IEMD) or 2D EMD. The BEMD algorithm has been applied to different image processing tasks such as pattern analysis, medical imaging, and texture analysis. The iterations for each IMF are stopped based on the detected maps of local maxima and minima, which makes BEMD a technique with an adaptive iteration process. The BEMD algorithm also simplifies the extraction of zero-mean 2D AM-FM components, termed bidimensional IMFs (BIMFs). In the decomposition process, the local maxima and minima of the image are extracted at each iteration and then interpolated to form the upper and lower envelopes, respectively. The number of BIMFs resulting from the decomposition, and their properties, depend strongly on the interpolation method. As mentioned in Ref. [37], BEMD is based on finding the extrema of the original image and using them for the decomposition. The technique finds the local maxima and minima in the image and then the distance between extrema, which provides details that characterize the image on intrinsic length scales [38]. Image pixels are denoted by (m,n), as in Algorithm 1 [39], which summarizes the basic procedure of BEMD. After applying the BEMD algorithm, four IMFs are obtained and used to generate the video sequence that serves as input to the proposed 3DCNN model. Some examples of BEMD decomposition are illustrated in Fig. 2 for each category.
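The envelope-and-subtract idea behind the decomposition can be sketched in a few lines. This is not Algorithm 1 and not the Matlab implementation used in the paper: as a simplifying assumption, the envelopes are estimated with sliding maximum/minimum filters followed by mean smoothing instead of interpolating detected extrema, and each mode is obtained in a single pass. The names `_window_filter`, `mean_envelope`, and `decompose`, and the window sizes, are illustrative choices only.

```python
import numpy as np


def _window_filter(img, size, reduce_fn):
    """Combine all pixels of a size x size window (size must be odd) with reduce_fn."""
    pad = size // 2
    padded = np.pad(img, pad, mode="edge")
    h, w = img.shape
    out = None
    for dy in range(size):
        for dx in range(size):
            patch = padded[dy:dy + h, dx:dx + w]
            out = patch.copy() if out is None else reduce_fn(out, patch)
    return out


def mean_envelope(img, size=3):
    """Average of the smoothed upper (max) and lower (min) envelopes."""
    upper = _window_filter(img, size, np.maximum)
    lower = _window_filter(img, size, np.minimum)
    area = float(size * size)
    upper_smooth = _window_filter(upper, size, np.add) / area
    lower_smooth = _window_filter(lower, size, np.add) / area
    return (upper_smooth + lower_smooth) / 2.0


def decompose(img, n_imfs=4, size=3):
    """Split an image into n_imfs detail modes plus a residue (simplified sketch)."""
    residue = np.asarray(img, dtype=np.float64)
    imfs = []
    for k in range(n_imfs):
        # Wider windows for later modes, so they capture coarser structure.
        envelope = mean_envelope(residue, size + 2 * k)
        imfs.append(residue - envelope)  # detail removed at this scale
        residue = envelope               # what remains for the next mode
    return imfs, residue
```

By construction, the modes and the final residue add back up to the input image, which is a convenient sanity check for any decomposition of this kind.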

Fig. 2

Some results using BEMD algorithm.

In the pre-processing phase, we took the original images and applied the bidimensional empirical mode decomposition (BEMD) technique to decompose each of them into four IMFs. The IMFs were then used to generate the video that serves as input to the proposed 3DCNN model, which allows the model to learn from 3D shapes.
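Building the video from the IMFs amounts to stacking them along a depth axis. The sketch below assumes a single-channel clip laid out as (batch, channel, depth, height, width), which is the input layout expected by 3D convolution layers in PyTorch; the function name `imfs_to_clip` and the use of float32 are assumptions, and the zero arrays merely stand in for real IMFs.

```python
import numpy as np


def imfs_to_clip(imfs):
    """Stack 2D IMFs into one clip of shape (1, 1, depth, height, width)."""
    frames = [np.asarray(f, dtype=np.float32) for f in imfs]
    clip = np.stack(frames, axis=0)        # (depth, height, width)
    return clip[np.newaxis, np.newaxis]    # add batch and channel axes


# Placeholder IMFs with the image size quoted in the paper (512 x 512).
placeholder_imfs = [np.zeros((512, 512)) for _ in range(4)]
clip = imfs_to_clip(placeholder_imfs)
print(clip.shape)
```

The order of the list is preserved along the depth axis, so the first IMF becomes the first frame of the clip.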

Proposed architecture

A convolutional neural network (CNN) is a deep learning neural network used to process structured arrays such as the pixels of an image. It is important in many applications, such as image classification, text classification, and pattern recognition [29], [30], [31], [32]. A CNN is a feed-forward neural network whose power comes from convolutional layers stacked on top of each other; its architecture is composed of many layers, including hidden layers, activation layers, and pooling layers. Before training, BEMD features are extracted and used as input to the proposed 3DCNN backbone, which is followed by two CAA modules. The pre-processing consists of extracting the BEMD features, namely four IMFs, and collecting them to generate the video format. The proposed 3DCNN model allows a multistage learning technique that permits better learning. The proposed 3DCNN-based architecture, illustrated in Fig. 1, consists of a feature extraction part with convolutional and max-pooling layers, followed by two CAA modules and then a classification block of fully connected layers. The 3D network is based on the VGG16 architecture for generating features over a greater spatial extent. The proposed model contains ten 3D convolutional layers and two fully connected layers. We took the convolutional and max-pooling layers from VGG-16. After two consecutive convolutional layers, a dropout layer is added. The features are then flattened so that the model can learn from multi-scale layers, and passed to two fully connected layers of 1024 and 128 units, respectively. The output layer consists of three neurons representing the COVID-19, pneumonia, and normal classes. For the ten convolutional layers, we used the PReLU (Parametric Rectified Linear Unit) activation function.
PReLU learns its parameters and improves accuracy at negligible extra computational cost. Positive values pass through unchanged, as in the ReLU activation function, whereas negative values, which ReLU sets to zero, are scaled by a coefficient. The PReLU function is defined as $f(y_i) = \max(0, y_i) + a_i \min(0, y_i)$, where $y_i$ is the input on the i-th channel and $a_i$ controls the slope of the negative part. When $a_i = 0$, it operates as a ReLU; when $a_i$ is a learnable parameter, it is referred to as Parametric ReLU (PReLU). If $a_i$ is a small fixed value, PReLU becomes Leaky ReLU (LReLU, $a_i = 0.01$). PReLU can be trained using backpropagation.

Context-aware attention module: analyzing the context is a way to extract, recognize, segment, or classify the content of an image or a video. To do that, existing deep learning architectures combine multiple convolutional and pooling layers to extract features for learning, as in Ref. [33]. For classification of X-ray images, the lung infection can appear at varying positions and scales in the lung, and a plain combination of convolutional and pooling layers may not handle all these variations effectively. The authors in Ref. [34] implemented a module inspired by the SIFT feature extraction method [35] that extracts features based on the image content. For that purpose, atrous convolution [36] is used to extract the features of the same content. For multi-scale features, the output of each convolutional-pooling block can be used. The same strategy is used to implement the context-aware pyramid (CP) module shown in Fig. 3. Two CAA blocks are used for scale- and shape-based extraction. The first takes the output of the third VGG-16 block as input, while the second takes the output of the last block of the backbone. Each CAA module is composed of atrous convolutions with varying dilation rates and padding parameters. The outputs of the atrous convolutions are concatenated.
Then the two CAA modules are combined, using a PyTorch function, to obtain the feature maps of a context-aware pyramid module, which are then used as input to the flatten layer.
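The PReLU activation used in the convolutional layers can be illustrated with a scalar sketch. It assumes the standard form max(0, y) + a * min(0, y); the function names and the slope value 0.25 are illustrative, and in a real network the slope would be a learned parameter rather than a fixed argument.

```python
def prelu(y, a):
    """Parametric ReLU: identity for positive inputs, slope `a` for negative ones."""
    return max(0.0, y) + a * min(0.0, y)


def prelu_grad_a(y):
    """Derivative of prelu with respect to the slope `a`, used when `a` is learned."""
    return min(0.0, y)


for value in (-2.0, 0.0, 3.0):
    print(value, prelu(value, a=0.25), prelu(value, a=0.0), prelu(value, a=0.01))
```

Setting `a` to 0 reproduces ReLU, and fixing it at 0.01 reproduces Leaky ReLU, which matches the special cases described in the text.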

Fig. 3

Context-Aware attention module.
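The core operation of the module, parallel atrous convolutions whose outputs are concatenated, can be sketched as follows. This is a simplified 2D, single-channel illustration and not the paper's 3D PyTorch implementation; the dilation rates (1, 2, 4), the 3 x 3 kernel, and the function names are assumptions chosen only to show how dilation and padding interact.

```python
import numpy as np


def atrous_conv2d(x, kernel, rate):
    """Zero-padded 'same' 2D convolution (cross-correlation) with dilation `rate`."""
    k = kernel.shape[0]                 # square kernel with odd size assumed
    pad = rate * (k // 2)               # padding grows with the dilation rate
    padded = np.pad(x, pad, mode="constant")
    h, w = x.shape
    out = np.zeros((h, w), dtype=np.float64)
    for i in range(k):
        for j in range(k):
            # Kernel taps are spaced `rate` pixels apart.
            out += kernel[i, j] * padded[i * rate:i * rate + h, j * rate:j * rate + w]
    return out


def multi_rate_features(x, kernel, rates=(1, 2, 4)):
    """Run one atrous convolution per rate and concatenate along a channel axis."""
    return np.stack([atrous_conv2d(x, kernel, r) for r in rates], axis=0)


feature_map = np.ones((6, 6))
features = multi_rate_features(feature_map, np.ones((3, 3)))
print(features.shape)
```

Because the padding scales with the rate, every branch keeps the spatial size of its input, which is what makes the concatenation possible.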

In the proposed model, several combined COVID-19 datasets were used for training, validation, and testing. Using different features (the IMFs) as inputs helps the deep learning model learn better.

Loss function

In this paper, binary cross-entropy loss is used to estimate the performance of the classifier [60,61]. Binary cross-entropy measures the difference between the actual and predicted probability distributions, with predictions in the range [0,1]; the loss increases as the predicted probability deviates from the true label. The loss is minimized during training, and a perfect value is 0. The loss is expressed as follows: $L = -\frac{1}{S} \sum_{i=1}^{S} \left[ y_i \log(\hat{y}_i) + (1 - y_i) \log(1 - \hat{y}_i) \right]$, where S is the number of scalar values in the model output, $y_i$ is the i-th target value, and $\hat{y}_i$ is the corresponding predicted probability.
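A minimal sketch of binary cross-entropy averaged over the S output values is given below, assuming the standard textbook form of the loss. The clipping constant `eps` is an implementation detail added here to avoid taking the logarithm of zero; it is not taken from the paper.

```python
import math


def binary_cross_entropy(y_true, y_pred, eps=1e-12):
    """Mean binary cross-entropy over the S scalar outputs."""
    s = len(y_true)
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)   # keep log() finite
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return -total / s


# The loss grows as the predicted probability moves away from the true label.
print(binary_cross_entropy([1, 0, 0], [0.9, 0.1, 0.1]))
print(binary_cross_entropy([1, 0, 0], [0.6, 0.4, 0.4]))
```

A prediction of 0.5 for every output gives a loss of ln 2, and predictions that match the labels exactly drive the loss towards 0.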

Experimental results

To evaluate the proposed method for COVID-19 detection from X-ray images, a set of evaluation metrics was used, and the proposed architecture was tested on two datasets of different sizes. Using the same metrics, we compared the obtained results with state-of-the-art methods including Islam et al. [5], Chowdhury et al. [27], Rahimzade et al. [41], Ucar et al. [42], An et al. [43], Ozturk et al. [44], Punn et al. [45], Narin et al. [46], Ozcan et al. [47], Bukhari et al. [48], Mukherjee et al. [49], Shankar et al. [53], Yamaç et al. [54], Zhou et al. [55], Tang et al. [56], Narin et al. [57], Ahsan et al. [58], and Kaoutar Ben et al. [59]. This section describes the datasets used and the experimental setup of the proposed deep-learning-based model, and discusses the obtained results.

Experimental setup

For the experiments, we divided the dataset into training, validation, and testing sets. The training set contains 80% of the data, and the testing and validation sets contain 10% each. For feature extraction we used a pre-trained VGG-16 network adapted to the 3D representation. The decomposition of the images into IMFs was performed in Matlab using the BEMD algorithm. The BEMD-3DCNN model was implemented in Python with PyTorch on an Intel(R) Core(TM) i7 2.2 GHz processor with an NVIDIA RTX 2070 (8 GB) graphics processing unit (GPU) and 64 GB of RAM.

Dataset

In this paper, two datasets were used for training and testing the proposed method. The first was used by Islam et al. [5], who collected the images from multiple sources, preprocessed them to reduce noise, and then applied augmentation techniques to obtain a reasonable number of images for training the CNN. This dataset contains 4575 images. The second was collected from the internet (Kaggle and Mendeley) and merged to obtain a large-scale dataset for training; it contains 6484 images. We tested the performance of our model on both datasets. In the experiments, we divided the data into 80% for training, 10% for validation, and 10% for testing. A description of the data used in the experiments is presented in Table 2.

Table 2

Dataset used in our experiment.

Data/Cases	COVID-19	Normal	Pneumonia	Overall
Training	1442	1528	2218	5188
Testing	180	191	277	648
Validation	180	191	277	648
Overall	1802	1910	2772	6484
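The per-class counts in Table 2 are consistent with a simple 80/10/10 split in which the validation and testing sets each take one tenth of a class, rounded down, and the training set takes the remainder. That rounding rule is an assumption inferred from the table, not a procedure stated in the paper, and `split_80_10_10` is an illustrative name.

```python
def split_80_10_10(n):
    """Split n samples into train/validation/test counts (assumed rounding rule)."""
    held_out = n // 10                  # size of the validation and of the test set
    return {"train": n - 2 * held_out, "val": held_out, "test": held_out}


class_sizes = {"COVID-19": 1802, "Normal": 1910, "Pneumonia": 2772}
splits = {name: split_80_10_10(n) for name, n in class_sizes.items()}
for name, counts in splits.items():
    print(name, counts)
```

Splitting each class separately in this way keeps the class proportions the same across the three sets.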


Evaluation metrics

To evaluate the image classification method, several metrics are used, including recall, precision, F1-score, sensitivity, and specificity. These metrics are computed from four measures: True Positive (TP), False Positive (FP), True Negative (TN), and False Negative (FN). TP denotes COVID-19 cases that are correctly predicted, FP refers to samples that are falsely classified as COVID-19, TN refers to normal cases that are correctly predicted, and FN refers to COVID-19 cases that are misclassified as normal. We also used the accuracy of the trained model as a performance metric, $Accuracy = \frac{TP + TN}{TP + TN + FP + FN}$. The metrics are defined as follows.

Recall: Recall measures the completeness of the classifier, $Recall = \frac{TP}{TP + FN}$. Higher recall indicates fewer false negatives, while lower recall indicates more false negatives. Precision often decreases as recall improves.

Precision: Precision shows how much of the data predicted as positive is predicted correctly, $Precision = \frac{TP}{TP + FP}$. In other words, high precision means fewer false positives.

Sensitivity: Sensitivity is defined as the ratio of correctly detected COVID-19 cases to the total number of actual COVID-19 cases, and is computed as $Sensitivity = \frac{TP}{TP + FN}$, which is the same quantity as recall.

Specificity: Specificity is defined as the ratio of correctly detected non-COVID-19 cases to the total number of non-COVID-19 cases, and is measured as $Specificity = \frac{TN}{TN + FP}$.

F1-Score: The F1-score is the harmonic mean of recall and precision, that is, twice their product divided by their sum: $F1 = \frac{2 \cdot Precision \cdot Recall}{Precision + Recall}$.
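The metrics above follow directly from the four counts. The sketch below uses the standard definitions, with the F1-score taken as the harmonic mean of precision and recall; the counts in the example are made up for illustration and are not results from the paper.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute the standard binary classification metrics from the four counts."""
    recall = tp / (tp + fn)             # identical to sensitivity
    precision = tp / (tp + fp)
    specificity = tn / (tn + fp)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return {
        "recall": recall,
        "sensitivity": recall,
        "precision": precision,
        "specificity": specificity,
        "f1": f1,
        "accuracy": accuracy,
    }


# Illustrative counts only.
metrics = classification_metrics(tp=90, fp=10, tn=880, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Note that accuracy can stay high when negatives dominate the data, which is why sensitivity and specificity are reported alongside it.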

Evaluation and discussion

In the experiments, we divided the data into 80% for training, 10% for validation, and 10% for testing the proposed architecture. Fig. 4 and Fig. 5 show the accuracy and loss curves, respectively. These figures show that the accuracy of the proposed method converges quickly to 1 during training, while the loss converges to 0. We attribute this to the 3D representation of the data decomposed by the BEMD algorithm, which lets the model learn from different features. In addition, the combination of two CAA modules lets the model learn from the contextual information contained in the extracted features.

Fig. 4

Accuracy graph.

Fig. 5

Loss graph.

Fig. 6 illustrates the confusion matrix obtained on the testing data by the proposed BEMD-3DCNN model for three classes: COVID-19, Normal (non-COVID), and Pneumonia. The confusion matrix shows that the test data were successfully classified by the proposed method. It also demonstrates that the proposed method is efficient for COVID-19 classification, with a minimal number of misclassified cases.

Fig. 6

Confusion matrix.

To demonstrate the performance on each class using the proposed BEMD-3DCNN-based method, we used the evaluation metrics shown in Fig. 7, including accuracy, specificity, sensitivity, and F1-score for each class. Both the proposed method and CNN-LSTM classified the classes with high performance, but Fig. 7 shows that BEMD-3DCNN is more accurate and reached 99.99% for all metrics, which is near-perfect. In our experiment, we used the same metrics as other existing COVID-19 detection methods. The proposed method achieved the most accurate detection of COVID-19 from X-ray images. Compared with the other methods, our results are also the best in terms of accuracy for the COVID-19, normal, and pneumonia classes.
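
Per-class metrics of this kind are typically derived from a multi-class confusion matrix by treating each class in turn as the positive class (one-vs-rest). The sketch below shows that derivation. The matrix values are hypothetical: the row totals match the test split sizes in Table 2, but the entries are NOT the results shown in Fig. 6, and the helper name `one_vs_rest` is an assumption for illustration.

```python
CLASSES = ["COVID-19", "Normal", "Pneumonia"]

# Hypothetical 3x3 confusion matrix (rows: true class, columns: predicted).
# Illustrative values only, NOT the paper's results.
CM = [
    [178, 1, 1],
    [0, 190, 1],
    [2, 1, 274],
]


def one_vs_rest(cm, k):
    """Return (tp, fp, tn, fn) for class index k against all other classes."""
    total = sum(sum(row) for row in cm)
    tp = cm[k][k]
    fn = sum(cm[k]) - tp               # true class k, predicted as another class
    fp = sum(row[k] for row in cm) - tp  # other classes predicted as k
    tn = total - tp - fn - fp
    return tp, fp, tn, fn


for k, name in enumerate(CLASSES):
    tp, fp, tn, fn = one_vs_rest(CM, k)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    f1 = 2 * tp / (2 * tp + fp + fn)
    print(f"{name}: sensitivity={sensitivity:.4f}, "
          f"specificity={specificity:.4f}, F1={f1:.4f}")
```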

Fig. 7

Comparison of the sensitivity, specificity, and F1-score of the proposed method with state-of-the-art methods.

The results in terms of precision, recall, and F1-score are presented in Table 3. Because each existing method reports its own set of metrics, we attempted to report all the metrics used in previous work. The table shows that BEMD-3DCNN reached the highest precision, recall, and F1-score, exceeding the CNN-LSTM method by about 0.8, 0.7, and 1.1 percentage points, respectively. For this dataset, the precision and the other metric values indicate that deep learning methods can detect COVID-19 with high performance. These methods are therefore applicable to real-time COVID-19 detection as a safe alternative technique. Further, maintaining these methods with new data may bring the performance closer to 100%.

Table 3

Performance of the BEMD-3DCNN network compared to the existing methods.

Method	Data	Accuracy	Precision	Recall	F1-Score
CNN-LSTM [2]	4575	99.4%	99.2%	99.3%	98.9%
CNN [24]	4575	99.7%	99.7%	99.7%	99.55%
BEMD-3DCNN	4575	99.99%	99.99%	99.99%	99.99%
BEMD-3DCNN	6484	99.99%	99.99%	99.99%	99.99%
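
The margins between methods can be recomputed directly from the values in Table 3. A minimal sketch follows; the dictionary layout and the helper name `margins` are illustrative choices, and the numbers are copied from the 4575-image rows of the table.

```python
# Values in percent, copied from Table 3 (4575-image dataset).
TABLE_3 = {
    "CNN-LSTM [2]": {"accuracy": 99.4, "precision": 99.2, "recall": 99.3, "f1": 98.9},
    "CNN [24]": {"accuracy": 99.7, "precision": 99.7, "recall": 99.7, "f1": 99.55},
    "BEMD-3DCNN": {"accuracy": 99.99, "precision": 99.99, "recall": 99.99, "f1": 99.99},
}


def margins(table, ours, baseline):
    """Difference in percentage points between two methods, per metric."""
    return {
        metric: round(table[ours][metric] - table[baseline][metric], 2)
        for metric in table[ours]
    }


vs_cnn_lstm = margins(TABLE_3, "BEMD-3DCNN", "CNN-LSTM [2]")
vs_cnn = margins(TABLE_3, "BEMD-3DCNN", "CNN [24]")
print("vs CNN-LSTM:", vs_cnn_lstm)
print("vs CNN:", vs_cnn)
```

All margins are below 1.1 percentage points, so the comparison concerns differences in the last digit or two of scores that are already above 98.9%.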

Sensitivity and specificity were also used to evaluate the proposed method against state-of-the-art methods. Table 4 summarizes the results for these metrics, together with other significant factors such as data sources, data partitioning technique, number of images, classes, and performance measures. The table shows that the proposed method had the highest accuracy and sensitivity of the compared methods, and the highest specificity apart from the 100% reported by [46] and [49] on much smaller two-class datasets. BEMD-3DCNN outperformed [5] by about 0.6 points in accuracy, 0.7 in sensitivity, and 0.8 in specificity. It also outperformed [48] and [49] by about 3.0 and 6.0 points in sensitivity, respectively. Newer methods such as [56,57] also reached good results that are close to ours, with comparable amounts of data used for training and evaluation. The table also suggests that the number of classes can affect accuracy, as in Ozturk et al. [44] and Yamaç et al. [54], as can the size of the data used, as in Tang et al. [56]. In conclusion, all metrics of the proposed method exceed 96%. However, the continuous evolution of COVID-19 should be taken into consideration by maintaining the models with new data. The BEMD-3DCNN method was also evaluated on the same dataset without CAA blocks. Table 4 shows that the proposed method can classify COVID-19 images with robust accuracy (99.1%) even without CAA blocks, which we attribute to the 3D representation of the BIMF images.

Table 4

COVID-19 detection techniques (comparison).

Authors	Images	Classes	Partitioning	Accuracy	Sensitivity	Specificity
Islam et al. [5]	4575	3	80%–20%	99.4	99.3	99.2
Chowdhury et al. [27]	3487	3	80%–20%	97.9	97.9	98.8
Rahimzade et al. [41]	180	3	five-fold cross-val	99.5	80.5	99.5
Ucar et al. [42]	2839	3	80%-10%–10%	98.2	98.2	99.1
An et al. [43]	278	3	80%–20%	98.1	98.2	98.1
Ozturk et al. [44]	1127	3	five-fold cross-val	98.0	95.1	95.3
Punn et al. [45]	1076	3	80%-10%–10%	98.0	91.0	91.0
Narin et al. [46]	100	2	five-fold cross-val	98.0	96.0	100
Ozcan et al. [47]	721	4	50%-30%–20%	97.6	97.2	97.9
Bukhari et al. [48]	2239	3	five-fold cross-val	97.0	97.0	97.0
Mukherjee et al. [49]	260	2	five-fold cross-val	96.9	94.0	100
Shankar et al. [53]	247	2	five-fold cross-val	94.8	98.3	98.8
Yamaç et al. [54]	6200	4	five-fold cross-val	–	98.0	95.0
ZHOU et al. [55]	672	2	70%–30%	93.6	88.0	–
Tang et al. [56]	15 477	3	90%–10%	95.0	96.0	–
Narin et al. [57]	7406	2	80%–20%	99.7	98.8	99.8
Ahsan et al. [58]	5090	3	80%–20%	99.4	93.6	95.7
Kaoutar Ben et al. [59]	1332	3	65%–6%–29%	98.1	96.2	98.7
BEMD-3DCNN(without CAA)	6484	3	80%-10%-10%	99.1	98.8	99.2
BEMD-3DCNN	6484, 4575	3	80%-10%-10%	99.99	99.99	99.99

Conclusion

COVID-19 spreads between people, mainly when an infected person is in close contact with another person. Cases continue to increase every day, and many countries are still dramatically affected by the disease because of limited resources and the large number of infected persons, even as vaccination progresses. Hence, it is necessary to identify positive cases quickly during any emergency. Accordingly, we introduced a deep learning model that combines the BEMD technique with a 3DCNN on X-ray images to identify the coronavirus. The method uses BEMD to decompose the original image into IMFs, builds a video from these IMFs, and uses a 3DCNN model with Context-aware attention (CAA) modules to classify and detect COVID-19. The proposed technique achieved a high accuracy of 99.99%. Compared with existing methods, BEMD-3DCNN achieved the best results on two datasets. This suggests that the proposed technique could be used in real time for early COVID-19 detection.
