Nahian Ibn Hasan, Department of Electrical and Electronic Engineering, Bangladesh University of Engineering and Technology, Dhaka, Bangladesh.
Abstract
The outbreak of the SARS-CoV-2/Covid-19 virus in 2019-2020 has made the world look for fast and accurate methods of detecting the disease. The most commonly used tools for detecting Covid patients are Chest-X-rays or chest CT-scans of the patient. However, it is sometimes hard for physicians to diagnose the SARS-CoV-2 infection from the raw image, and deep-learning-based techniques using raw images sometimes fail to detect the infection. Hence, this paper presents a hybrid method employing both traditional signal processing and deep learning techniques for quick detection of SARS-CoV-2 patients based on the CT-scan and Chest-X-ray images of a patient. Unlike other AI-based methods, here a CT-scan/Chest-X-ray image is decomposed by two-dimensional Empirical Mode Decomposition (2DEMD), which generates different orders of Intrinsic Mode Functions (IMFs). Next, the decomposed IMF signals are fed into a deep Convolutional Neural Network (CNN) for feature extraction and classification of Covid and non-Covid patients. The proposed method is validated on three publicly available SARS-CoV-2 data sets using two deep CNN architectures. In all the databases, the modified CT-scan/Chest-X-ray image provides a better result than the raw image in terms of the classification accuracy of two fundamental CNNs. This paper presents a new viewpoint of extracting preprocessed features from the raw image using 2DEMD.
The recent outbreak of SARS-CoV-2/Covid-19/Corona-Virus has affected people in 215 countries [1] around the world. SARS-CoV-2 affects people in different physiological manners. It is considered particularly lethal for the elderly, children, and people with other significant physical conditions. Some of the common symptoms of SARS-CoV-2 are diarrhea, fever, tiredness, cough, etc. Most of the time, physicians diagnose prospective SARS-CoV-2 patients using CT-scan images and Chest-X-ray images; this is one of the quickest methods for detecting pneumonia patients. Early diagnosis of Covid patients is essential for efficient treatment. Bai et al. [2] have reported that radiologists around the world classified SARS-CoV-2 from common viral pneumonia cases on chest CT-scan images with moderate sensitivity but high specificity. Hence, automatic detection of SARS-CoV-2 patients, based on CNNs and machine learning algorithms with higher accuracy and a lower false-negative rate, is needed. Deep-learning-based Covid-patient detection has attracted the attention of the scientific community. Some of the reported methods involve direct usage of images through different convolutional neural network architectures, while others involve gathering features from traditional signal processing techniques and then training a CNN on those features to classify the patients. Jaiswal et al. [3] proposed a DenseNet201-based transfer learning model to classify a Covid patient from a non-Covid patient. Chen et al. [4] have used a massive database of 46,096 anonymous images to train deep learning models for Covid-19 patient detection and classification. On top of that, Barstugan et al. 
[5] used Grey Level Co-occurrence Matrix, Length Matrix, Local Directional Pattern, Size Zone Matrix, and Discrete Wavelet Transform as primary feature extraction methods, which were then trained through support vector machines (SVMs) for efficient detection of SARS-CoV-2 patients. On the other hand, Wang et al. [6] introduced COVID-Net, a deep-learning-based classifier of SARS-CoV-2 patients from Chest-X-ray images; they designed and fine-tuned the architecture for their database. At the same time, Narin et al. [7] used different existing network architectures (i.e., ResNet50, Inception-ResNetV2, and InceptionV3) for classification purposes. Recently, Kareem et al. [8] have used ML models like Naive Bayes (NB), Random Forest (RF), and Support Vector Machine (SVM) for diagnosing Covid patients. Also, Al-Waisy et al. [9] introduced a hybrid CheXNet model for detecting Covid patients by utilizing image enhancement and noise removal with pre-trained deep-learning models; the authors also showed in another report [10] that the same enhancement and noise-removal techniques work as well with deep belief networks. Apart from these, Rajaraman et al. [11] used a new strategy to localize the region of interest (ROI) in the Chest-X-ray images, which is then passed through a CNN (VGG16 architecture) for final prediction and classification. Furthermore, Ozturk et al. [12] have reported the usage of DarkNet and You-Only-Look-Once (YOLO) (an object detection system) for both binary and multi-class classification of SARS-CoV-2 patients. Sun et al. [13] have proposed a deep forest algorithm based on adaptive feature selection criteria for classification; they also used a deep forest model for a high-level representation of features. Besides, Bai et al. [14] compared the results of a deep learning model and radiologists; for this purpose, they first segmented the lung to exclude non-pulmonary regions of the CT. Ko et al. 
[15] have developed a Fast Track SARS-CoV-2 Classification Network (FCONet) to classify SARS-CoV-2 cases from CT-scan images, using pre-trained models like VGG16, ResNet-50, Inception-v3, or Xception. Hu et al. [16] have proposed a weakly supervised deep learning model for weakly labeled CT-scan images to classify SARS-CoV-2 patients. Apart from these, Mahmud et al. [17] have introduced a new neural network architecture, named 'CovXNet', which utilizes depth-wise convolution and different dilation rates; they used the new architecture to differentiate SARS-CoV-2 Chest-X-ray images from viral pneumonia, bacterial pneumonia, and normal patients. Moreover, Harmon et al. [18] have reported an approach similar to [14], where they first segmented the lung and then used 3D deep learning models to classify SARS-CoV-2 patients from CT-scan images. In all of these deep-learning-based studies, either the raw input image or a lightly preprocessed image has been used as input to the deep learning network. We wanted to explore this preprocessing task further, processing the raw image with a mode decomposition technique so that the network can learn the inherent features more effectively. In this study, two-dimensional Empirical Mode Decomposition (2DEMD) is presented as a new strategy to extract features from CT-scan and Chest-X-ray images. To our knowledge, 2DEMD has not previously been used as a preprocessing technique for deep-learning-based methods. The primary goal of the classification is to show that a modified image from 2DEMD performs better as input data for a deep CNN when classifying Covid patients. The paper demonstrates that training a fundamental deep neural network with a modified CT-scan/Chest-X-ray image (through 2DEMD) yields better performance than training with the raw image. 
The modified image acts as a better candidate than the raw image irrespective of the complexity and performance of the CNN and the variation of data in the databases. The proposed method has been validated on three publicly available SARS-CoV-2 databases: two comprise CT-scan images, and the third consists of Chest-X-ray images.
Proposed Method
This section presents an overview of the proposed method and a detailed discussion of the feature extraction methodology and CNN training. Fig. 1
briefly illustrates the proposed method. First, any input image is converted to a gray-scale image. Next, the CT-scan/Chest-X-ray image is decomposed into separate high- and low-frequency Intrinsic Mode Functions (IMFs) using two-dimensional Empirical Mode Decomposition (2DEMD). The reason for using 2DEMD is to capture the small, high-frequency texture profiles of the lung in the CT-scan/Chest-X-ray images, which are characteristic of a SARS-CoV-2 patient. After that, the residual part of the decomposition is discarded, and all other independent IMFs are combined (by addition) to form the modified image. This modified CT-scan/Chest-X-ray image is then trained through a Convolutional Neural Network (CNN) for secondary feature extraction, learning, and classification.
Fig. 1
The proposed method for SARS-CoV-2 patient classification. The original CT-scan/Chest-X-ray image is first resized according to the input specification of the CNN classifier. Next, the resized image is converted to a gray-scale image (if it is not already). Then, the 2DEMD algorithm decomposes the image into separate IMFs, which are simply summed together to form a modified version of the real medical image. The modified image is then passed through the CNN for feature learning and classification.
2DEMD is an extension of one-dimensional EMD. EMD decomposes a signal into separate modes, also known as Intrinsic Mode Functions (IMFs) [19], [20]. Each IMF has a similar length to the signal and an equal number of zero-crossing and extrema points. The envelopes of the decomposed signals serve as oscillatory modes. The IMFs are non-orthogonal, but they can describe the signal adequately. One-dimensional EMD is effective for natural signals (i.e., electrocardiogram signals) because it can track the non-linearity and non-stationarity of these signals, and the resulting IMFs convey intrinsic features of the signal. In this description, we first consider one-dimensional EMD. For 1D EMD, the signal values are specified with respect to time $t$; a digital representation replaces the time $t$ with the sample index $n$, but for simplicity we consider a time-varying signal $x(t)$. The minima and maxima points of the signal are interpolated to form the lower envelope curve $e_l(t)$ and the upper envelope curve $e_u(t)$, respectively. The local mean of the two envelopes is

$$m(t) = \frac{e_l(t) + e_u(t)}{2}.$$

Subtracting this mean from the original signal $x(t)$ yields the first proto-IMF signal $h_1(t)$:

$$h_1(t) = x(t) - m(t).$$

The proto-IMF signal then goes through the sifting process repeatedly up to a threshold point, at which the conditions for an Intrinsic Mode Function (IMF) are fulfilled [19]. These steps provide the very first IMF signal $c_1(t)$. The corresponding residue signal $r_1(t)$ after the first step is

$$r_1(t) = x(t) - c_1(t).$$

Next, applying the sifting process to the first residue signal $r_1(t)$ in a similar fashion yields the second, third, and subsequent IMFs. In general,

$$r_k(t) = r_{k-1}(t) - c_k(t), \quad k = 2, 3, \ldots, N,$$

where $N$ denotes the number of IMFs. Finally, a residue signal $r_N(t)$ and a certain number of IMFs ($c_1(t)$, $c_2(t)$, ..., $c_N(t)$) are obtained such that

$$x(t) = \sum_{i=1}^{N} c_i(t) + r_N(t),$$

where $c_i(t)$ is the $i$-th order IMF. 
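The single sifting step above can be sketched numerically. The fragment below is a minimal illustration on a sampled signal (the index $n$ standing in for $t$): cubic splines through the detected extrema form the two envelopes, and their mean is subtracted to give the proto-IMF. Real EMD implementations differ in their endpoint handling and stopping criteria; this sketch simply pins the envelopes at the signal endpoints.

```python
import numpy as np
from scipy.interpolate import CubicSpline
from scipy.signal import argrelextrema

def sift_once(x, t):
    """One sifting step of 1D EMD: subtract the mean envelope from the signal.

    Envelope construction (spline type, endpoint treatment) varies between
    EMD implementations; this is a minimal sketch, not the paper's exact code.
    """
    maxima = argrelextrema(x, np.greater)[0]
    minima = argrelextrema(x, np.less)[0]
    # Pin both envelopes at the endpoints so the splines span the full support.
    maxima = np.concatenate(([0], maxima, [len(x) - 1]))
    minima = np.concatenate(([0], minima, [len(x) - 1]))
    e_upper = CubicSpline(t[maxima], x[maxima])(t)  # upper envelope e_u(t)
    e_lower = CubicSpline(t[minima], x[minima])(t)  # lower envelope e_l(t)
    m = (e_upper + e_lower) / 2.0                   # local mean m(t)
    return x - m                                    # proto-IMF h_1(t) = x(t) - m(t)

# A fast oscillation riding on a slow one; one sift suppresses the slow trend.
t = np.linspace(0.0, 1.0, 1000)
x = np.sin(40 * np.pi * t) + 0.5 * np.sin(4 * np.pi * t)
h1 = sift_once(x, t)
```

Iterating `sift_once` until the IMF conditions hold, then subtracting the IMF and repeating on the residue, produces the full decomposition described above.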
The lower-order IMFs correspond to high-frequency modes, and the higher-order IMFs resemble the low-frequency modes [19]. Whereas uni-directional EMD yields uni-directional IMFs, two-dimensional EMD (2DEMD) yields 2D IMFs. The algorithm for 2DEMD is discussed in [21]. Morphological reconstruction detects the image extrema during the sifting process, and a radial basis function (RBF) is used to compute the surface interpolation. In this paper, the sifting process is allowed to run up to a fixed maximum number of iterations to reduce the error in obtaining the IMFs; the whole sifting algorithm is specified in detail in [20]. According to Havlicek et al. [22], a two-dimensional IMF is a zero-mean 2D AM-FM component. Image AM-FM decomposition is a separate algorithm that is based on partially unsupervised features, whereas uni-directional EMD is fully unsupervised. During the 2DEMD process, morphological reconstruction, which is based on geodesic operators, first identifies the extrema of the image. Next, similar to 1D EMD, the envelope (in 2DEMD, a 2D envelope) is generated with an RBF. The 2D local mean is calculated by averaging the maxima and minima envelopes, and subtracting this mean from the original image results in the proto-2D-IMF. The sifting process is repeatedly applied to the proto-2D-IMF until the conditions for a 2D-IMF are fulfilled. The same procedure is then applied to the residual signal after the first 2D-IMF is obtained. The mechanism of extrema detection and the parameter specifications of 2DEMD are elaborately described in [21]. In this paper, 2DEMD is applied to the CT-scan/Chest-X-ray images, and the maximum number of IMFs to be extracted is set to 6. 
This means that any image undergoing the 2DEMD process can generate a varying number of IMFs (up to six), but all of them are combined by simple summation to form the modified CT-scan/Chest-X-ray image. Some of the modified images are shown in Fig. 4 and Fig. 5. The decomposed 2D IMFs capture the texture profile of the image; 2DEMD thus creates scope for extracting spatial frequency components from the coarsest to the finest scales.
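The assembly of the modified image can be illustrated as follows. Note the decomposition below is NOT the morphological 2DEMD of the paper (which requires extrema detection and RBF interpolation); a difference-of-Gaussians pyramid is used only as a stand-in to show how modes plus a residue reconstruct the image, and how the modified image keeps the modes while discarding the residue.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def pseudo_2d_modes(img, n_modes=6):
    """Stand-in for 2DEMD: split an image into band-pass layers plus a residue.

    Each layer plays the role of a 2D IMF (fine to coarse); `current` at the
    end is the coarse residue. A true 2DEMD would obtain these by sifting.
    """
    modes, current = [], img.astype(float)
    for k in range(n_modes):
        smooth = gaussian_filter(current, sigma=2.0 ** k)
        modes.append(current - smooth)   # band-pass layer ("IMF" of order k+1)
        current = smooth                 # what remains after extracting the mode
    return modes, current                # current = residue (coarse trend)

rng = np.random.default_rng(0)
img = rng.random((64, 64))
modes, residue = pseudo_2d_modes(img)
# Modified image: sum all modes, discard the residue (as in the paper).
modified = np.sum(modes, axis=0)
```

By construction the modes and residue sum back to the original image exactly, so discarding the residue removes only the coarse intensity trend while keeping the texture detail.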
Fig. 4
Two-dimensional EMD applied to CT-scan sample images of (a) SARS-CoV-2-positive patients, (b) SARS-CoV-2-negative patients. In fig (a), the images in the top row denote the original CT-scan images of SARS-CoV-2-positive patients, whereas the bottom row denotes the 2DEMD-based modified CT-scan images of SARS-CoV-2-positive patients. In fig (b), the images in the top row denote the original CT-scan sample images of SARS-CoV-2-negative patients, whereas the bottom row denotes the 2DEMD-based modified CT-scan sample images of SARS-CoV-2-negative patients. Here, the residual part of the 2DEMD is discarded.
Fig. 5
Two-dimensional EMD applied to Chest-X-ray images of (a) SARS-CoV-2-positive cases, (b) SARS-CoV-2-negative cases. In fig (a), the images in the top row denote the original Chest-X-ray sample images of SARS-CoV-2-positive scenarios, whereas the bottom row denotes the 2DEMD-based modified Chest-X-ray images of SARS-CoV-2-positive scenarios. In fig (b), the images in the top row denote the original Chest-X-ray sample images of SARS-CoV-2-negative patients, whereas the bottom row denotes the 2DEMD-based modified Chest-X-ray images of SARS-CoV-2-negative patients. Here, the residual part of the 2DEMD is discarded and the IMFs are simply added together.
Deep Convolutional Neural Network Architecture
Two fundamental neural network architectures (VGG16 [23] and VGG19 [23]) have been used for training on the decomposed and modified images. VGG16 consists of 5 stages of convolutional operations and requires an input image size of 224×224. Similarly, VGG19 also consists of 5 stages of convolutional operations, but with a higher number of operations per stage. The kernels of the CNN layers are initialized with the pre-trained 'imagenet' [24] weights of the VGG16, VGG19, and ResNet152 models [23]. The learning rate is the same for all of the layers. A Tesla K80 Graphics Processing Unit has been utilized for training on a system with an Intel Core-i9 CPU and 4 GB of memory. The training loss function is 'categorical cross-entropy', and a 'softmax' activation is applied at the final output classification layer.
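The final-layer activation and the training loss named above can be written out explicitly. The NumPy sketch below is for exposition only; in practice frameworks such as Keras or PyTorch provide these built in, and the paper's models use the framework implementations.

```python
import numpy as np

def softmax(z):
    """Softmax over the last axis (the final classification-layer activation)."""
    e = np.exp(z - z.max(axis=-1, keepdims=True))  # shift logits for stability
    return e / e.sum(axis=-1, keepdims=True)

def categorical_crossentropy(y_true, logits):
    """Mean categorical cross-entropy over a batch of one-hot labels."""
    p = softmax(logits)
    return -np.mean(np.sum(y_true * np.log(p + 1e-12), axis=-1))

# Two-class example (Covid vs. non-Covid): confident-correct vs. confident-wrong.
y = np.array([[1.0, 0.0]])
loss_good = categorical_crossentropy(y, np.array([[20.0, -20.0]]))
loss_bad = categorical_crossentropy(y, np.array([[-20.0, 20.0]]))
```

A confidently correct prediction drives the loss toward zero, while a confidently wrong one is heavily penalized, which is what pushes the network's softmax outputs toward the one-hot labels during training.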
Results and Discussions
This section provides a detailed description of the databases, different training procedures, learning parameters, evaluation metrics, and training results.
Database
The proposed method is evaluated on three publicly available SARS-CoV-2 databases. The first database was collected from [25]. It contains 2482 CT-scan images from a varying number of patients: 1252 CT-scans of SARS-CoV-2-positive samples and 1230 CT-scans of SARS-CoV-2-negative samples. These data were collected from SARS-CoV-2 patients in hospitals in Sao Paulo, Brazil. Some of the reported results on this database can be found in [26], [27], [28], [29], [30], [31], [32], [33]. The second dataset was collected from [34] and contains a total of 746 CT-scan images of SARS-CoV-2 and non-SARS-CoV-2 patients; among them, 349 images are from 216 SARS-CoV-2-positive patients. Several classification results based on this database have been reported in [35], [36], [37], [38], [39], [40]. The third dataset was collected from [41] and consists of 813 Chest-X-ray images from multiple sources and multiple patients. Researchers have reported classification and segmentation results on this dataset in [12], [42], [43], [44], [45]. The data distribution within the training set and the testing set is illustrated in Fig. 2
for all three databases. For databases 1 and 3, the training and testing sets follow an (80-20)% random split. For database 2, the split is slightly different from the conventional rule of thumb: we have used a (60-40)% split, since this database contains less data than databases 1 and 3. For database 3, the 'Non-Covid' class in the training set has been up-sampled by simply making multiple copies of the images, to mitigate the effect of class imbalance. In training, we utilized the 2D image as input data rather than the whole 3D scan of the lung; hence, a single 2D image of a CT scan acts as a single data point. Besides, since the CNNs use 2D convolutions, 2D input images are required.
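The split-and-balance procedure above can be sketched as follows. The function name, the file-list representation, and the duplication-based balancing details are illustrative assumptions; the paper only specifies the split fractions and that minority-class images were copied.

```python
import numpy as np

def split_and_upsample(paths, labels, test_frac=0.2, seed=0):
    """Random (80-20)% split, then duplicate minority-class samples in the
    train set until the classes are balanced (as done for the 'Non-Covid'
    class of database 3). Illustrative sketch, not the paper's exact code."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(paths))
    n_test = int(len(paths) * test_frac)
    test_idx, train_idx = idx[:n_test], idx[n_test:]
    train_p = [paths[i] for i in train_idx]
    train_y = [labels[i] for i in train_idx]
    # Duplicate random minority-class samples until the class counts match.
    classes, counts = np.unique(train_y, return_counts=True)
    minority = classes[np.argmin(counts)]
    deficit = int(counts.max() - counts.min())
    pool = [i for i, y in enumerate(train_y) if y == minority]
    for i in rng.choice(pool, size=deficit, replace=True):
        train_p.append(train_p[i])
        train_y.append(minority)
    test_p = [paths[i] for i in test_idx]
    test_y = [labels[i] for i in test_idx]
    return (train_p, train_y), (test_p, test_y)
```

Duplicating images is the simplest form of oversampling; it balances the loss contributions of the two classes without altering any pixel data.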
Fig. 2
Data distribution within train set and test set for database 1 (a), database 2 (b), and database 3 (c). Database 1 and 2 consist of CT-scan images and database 3 consists of Chest-X-ray images.
Learning Rate Scheduler
The training procedure incorporates a learning rate scheduler for better and more efficient learning of the image features by the CNN. The scheduler used for training is 'cosine annealing'. Learning rate schedulers with such a restarting mechanism are also known as stochastic gradient descent with warm restarts (SGDR) [46]; this paper, however, uses the restarting mechanism with the RMSProp optimizer. The restart technique frees the optimization from local minima at any time during training. 'Cosine annealing with warm restart' consists of two parts: the 'cosine function', which acts as the learning rate annealing function, and the 'warm restart', which makes the learning rate scheduler restart from the initial point. The purpose of such a scheduler is to maximize the probability of converging to the global minimum of the cost and to minimize the probability of being stuck at a local minimum. For this paper, an initial learning rate of 0.00001 has been used as the maximum learning rate, with 10 restart cycles accommodated within the range of training epochs; as a result, the minimum learning rate achieved was only 0.00000006155. Within each cycle, the learning rate is decayed using the following function, specified in [46]:

$$\eta_t = \eta_{min} + \frac{1}{2}\left(\eta_{max} - \eta_{min}\right)\left(1 + \cos\left(\frac{T_{cur}}{T_i}\pi\right)\right),$$

where $\eta_{min}$ and $\eta_{max}$ are the limits of the learning rate, $T_{cur}$ denotes the number of epochs that have passed since the last restart, and $T_i$ denotes the periodicity of the restart (for example, after $T_i$ epochs, the learning rate is restarted again from the initial rate). Fig. 3
represents the schematic diagram of the ‘cosine-annealing’ scheduler throughout the training epochs.
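The schedule is a one-line function of the epoch. A minimal sketch, assuming $\eta_{min} = 0$ and a fixed restart period (the paper's exact $\eta_{min}$ and per-cycle epoch counts are not restated here):

```python
import math

def cosine_annealing_lr(epoch, lr_max=1e-5, lr_min=0.0, period=10):
    """Cosine annealing with warm restarts (SGDR-style):
    lr = lr_min + 0.5*(lr_max - lr_min)*(1 + cos(pi * T_cur / T_i)),
    where T_cur is the number of epochs since the last restart and T_i the
    restart period. lr_min=0 and period=10 are illustrative assumptions."""
    t_cur = epoch % period           # epochs since the last warm restart
    return lr_min + 0.5 * (lr_max - lr_min) * (
        1 + math.cos(math.pi * t_cur / period))
```

At each restart boundary the rate jumps back to `lr_max`, then decays along a half cosine toward `lr_min`, which is the sawtooth-of-cosines shape plotted in Fig. 3.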
Fig. 3
'Cosine annealing' learning rate scheduler used for training. The maximum learning rate is 0.00001, and the number of cycles within the total range of epochs is 10.
Evaluation Metrics
Several evaluation metrics have been incorporated into the analysis of the training results to convey the performance of the proposed method from varying perspectives. This section provides a brief discussion of these evaluation metrics.
TP, TN, FP, FN
Four basic counts are calculated from the predicted results. 'True-Positive (TP)' denotes the number of patients who were infected and for whom the model correctly predicted the infection. 'True-Negative (TN)' denotes the number of patients who were not infected and for whom the model correctly predicted that. 'False-Positive (FP)' denotes the number of patients who were not infected but for whom the model predicted a positive result. Finally, the most important quantity in disease classification, 'False-Negative (FN)', denotes the number of patients who were infected but for whom the model predicted a negative result. This paper tries to minimize the False-Negative Rate (FNR), which can be expressed as follows:

$$\mathrm{FNR} = \frac{FN}{FN + TP}.$$

This parameter is also known as the 'miss-rate'.
Categorical Accuracy
It denotes how often the predicted class matches the true class. All databases in this study have two classes, so this is binary classification accuracy. However, the proposed method also reports class-specific accuracy, which helps to explain the class-specific performance of the models. The classification accuracy can be expressed in terms of TP, TN, FP, and FN as follows:

$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}.$$
Precision or Positive Predictive Value (PPV)
'Precision' denotes what proportion of the patients predicted as infected were actually infected. The following equation describes the metric:

$$\mathrm{Precision} = \frac{TP}{TP + FP}.$$
Recall or True Positive Rate (TPR) or Sensitivity
'Recall' indicates what proportion of the truly infected patients were correctly classified. The following formula expresses the idea of recall:

$$\mathrm{Recall} = \frac{TP}{TP + FN}.$$
Specificity or True Negative Rate (TNR) or Selectivity
'Specificity' signifies the proportion of non-infected patients whom the model correctly classified as non-infected. The following formula describes specificity:

$$\mathrm{Specificity} = \frac{TN}{TN + FP}.$$
F1 Score and F2 Score
The $F_\beta$ score combines precision and recall. If $\beta = 1$, it is known as the $F_1$ score, and if $\beta = 2$, it is known as the $F_2$ score. The formula is expressed as

$$F_\beta = (1 + \beta^2)\,\frac{\mathrm{Precision} \cdot \mathrm{Recall}}{\beta^2 \cdot \mathrm{Precision} + \mathrm{Recall}}.$$

The $F_1$ score gives equal importance to precision and recall; the $F_2$ score weights recall higher than precision.
Area under the characteristic curve (AUC)
AUC measures the area under the Receiver Operating Characteristic (ROC) curve. It renders an overall performance measure of the model that does not depend on the classification threshold and is scale-invariant. The higher the AUC, the better the performance of the model.
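The confusion-count metrics above can be gathered into one helper; AUC is excluded because it requires the full ROC curve rather than just the four counts. A minimal sketch (the function returns fractions; the paper reports percentages):

```python
def classification_metrics(tp, tn, fp, fn, beta=1.0):
    """Confusion-matrix metrics used in the paper, as fractions in [0, 1].
    beta=1 gives the F1 score, beta=2 the F2 score."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)                 # sensitivity / TPR
    f_beta = (1 + beta ** 2) * precision * recall / (
        beta ** 2 * precision + recall)
    return {
        "fnr": fn / (fn + tp),              # miss rate
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "precision": precision,
        "recall": recall,
        "specificity": tn / (tn + fp),      # TNR / selectivity
        f"f{beta:g}": f_beta,
    }
```

Note that FNR and recall are complements (FNR = 1 − recall), which is why the two always move together in the result tables.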
2DEMD Feature Analysis
In this section, some sample analysis of the 2DEMD is presented. 2DEMD extracts the texture profile of an image, as outlined in the previous section. Fig. 4 shows the effect of 2DEMD on CT-scan images, whereas Fig. 5 illustrates the effect of 2DEMD on Chest-X-ray images. The 2DEMD intensifies the inner view of the lung, and discarding the residual signal of the 2DEMD further improves the scope for visual inspection. In the modified images, the colors seem to differentiate tissue types (i.e., liquid/empty space from bones/tissues). For example, most of the liquid- and air-filled pores/cavities in the lung are colored blue, whereas most of the soft tissues and bones are colored red/yellow/orange. The color also helps to differentiate the densities of mucus in the lungs of Covid patients. This bonus visualization benefit emerged from adding the different IMFs.
Results of Proposed Method
This section presents the classification results on all three databases using the proposed method, in Table 1, Table 2, and Table 3 for databases 1, 2, and 3, respectively. The evaluation metrics have been calculated per image (not per subject).
Table 1
Validation result of the proposed method on database 1.
| Metric | Original CT-scans (VGG16) | Original CT-scans (VGG19) | Modified CT-scans (VGG16) | Modified CT-scans (VGG19) |
|---|---|---|---|---|
| FNR (%) | 0.40 | 20.16 | 1.58 | 7.51 |
| Accuracy (%) | 98.41 | 86.71 | 99.01 | 91.87 |
| Precision (%) | 97.30 | 92.66 | 99.60 | 91.41 |
| Recall (%) | 100.00 | 80.00 | 98.00 | 92.00 |
| Specificity (%) | 97.21 | 93.63 | 99.60 | 91.24 |
| AUC (%) | 99.49 | 96.64 | 99.71 | 97.43 |
| F1 Score (%) | 98.44 | 85.77 | 99.01 | 91.94 |
| F2 Score (%) | 99.13 | 82.11 | 82.18 | 76.92 |
Table 2
Validation result of the proposed method on database 2.
| Metric | Original CT-scans (VGG16) | Original CT-scans (VGG19) | Modified CT-scans (VGG16) | Modified CT-scans (VGG19) |
|---|---|---|---|---|
| FNR (%) | 31.88 | 20.63 | 11.88 | 12.00 |
| Accuracy (%) | 80.31 | 84.06 | 84.06 | 85.63 |
| Precision (%) | 90.08 | 87.59 | 81.50 | 82.50 |
| Recall (%) | 68.00 | 79.00 | 88.00 | 88.00 |
| Specificity (%) | 92.50 | 88.75 | 80.00 | 83.53 |
| AUC (%) | 90.08 | 89.47 | 90.90 | 90.37 |
| F1 Score (%) | 77.58 | 83.28 | 84.68 | 85.16 |
| F2 Score (%) | 71.62 | 80.89 | 86.72 | 86.84 |
Table 3
Validation result of the proposed method on the Chest-X-ray images of database 3.
| Metric | Original Chest-X-rays (VGG16) | Original Chest-X-rays (VGG19) | Modified Chest-X-rays (VGG16) | Modified Chest-X-rays (VGG19) |
|---|---|---|---|---|
| FNR (%) | 19.00 | 29.00 | 30.71 | 31.30 |
| Accuracy (%) | 74.00 | 71.50 | 74.50 | 74.50 |
| Precision (%) | 71.05 | 71.72 | 88.00 | 90.00 |
| Recall (%) | 81.00 | 71.00 | 69.00 | 69.00 |
| Specificity (%) | 67.00 | 72.00 | 83.56 | 85.51 |
| AUC (%) | 75.01 | 72.24 | 74.68 | 74.04 |
| F1 Score (%) | 75.70 | 71.36 | 77.53 | 77.92 |
| F2 Score (%) | 78.79 | 71.14 | 72.37 | 72.12 |
Table 1 shows the resulting metrics for the classifiers (VGG16 and VGG19 models) for both the original CT-scan images and the modified CT-scan images (through 2DEMD). It is evident from the study that models trained with modified CT-scan images provide superior results to models trained with original CT-scan images from the perspective of classification accuracy, AUC, and F1 score. When the VGG16 architecture is trained with modified CT-scan images, the accuracy increases from 98.41% to 99.01%, the AUC increases from 99.49% to 99.71%, and the F1 score increases from 98.44% to 99.01%. Similarly, when the VGG19 architecture is trained with modified CT-scans, there is a similar steady increase in these three metrics: the accuracy improves from 86.71% to 91.87%, the AUC increases from 96.64% to 97.43%, and the F1 score increases from 85.77% to 91.94%. All other metrics (FNR, precision, recall, specificity, F2 score, etc.) follow a randomly fluctuating pattern. These steady and fluctuating patterns are verified on a second database that also contains CT-scan images; the results for database 2 are shown in Table 2. From Table 2, it is evident that the models trained with the modified CT-scan images show better performance than models trained with the original CT-scan images in terms of accuracy, AUC, and F1 score. For example, for database 2, when VGG16 is trained with modified CT-scan images, the accuracy increases from 80.31% to 84.06%, the AUC increases from 90.08% to 90.90%, and the F1 score increases from 77.58% to 84.68%. 
Similarly, for the VGG19 architecture with modified CT-scan images, the accuracy increases from 84.06% to 85.63%, the AUC increases from 89.47% to 90.37%, and the F1 score increases from 83.28% to 85.16%. Investigating the other metrics in Table 2, the FNR, recall, and F2 score also improve when the models are trained with modified CT-scan images, while precision and specificity decrease. Therefore, classification accuracy, AUC, and F1 score follow a generalized pattern irrespective of the database, but that is not the case for the other metrics (FNR, precision, recall, specificity, and F2 score). Next, the proposed method is applied to a database containing Chest-X-ray images to see whether the same generalization holds as for CT-scan images. Table 3 shows the resulting metrics for database 3. From Table 3, it is apparent that when VGG16 and VGG19 are trained with modified Chest-X-ray images, the accuracy increases from 74.00% to 74.50% and from 71.50% to 74.50%, respectively. Besides, the F1 score increases from 75.70% to 77.53% and from 71.36% to 77.92%, respectively. However, the AUC does not follow the same steadily increasing pattern as for the CT-scan images in databases 1 and 2. Instead, the precision and specificity now increase steadily for both architectures: the precision increases from 71.05% to 88.00% with the VGG16 model and from 71.72% to 90.00% with the VGG19 model, and the specificity increases from 67.00% to 83.56% and from 72.00% to 85.51% with the VGG16 and VGG19 architectures, respectively. Other metrics, such as FNR, recall, and F2 score, follow fluctuating patterns. Overall, the chosen deep-learning models perform relatively poorly on this Chest-X-ray database, but the modified image is still a better training input than the raw image. Fig. 6
and Fig. 7
show the class-specific performance metrics for all three databases. Fig. 6 shows the class-specific classification accuracy and F1 score for both SARS-CoV-2-positive and -negative cases. The models trained with the 2DEMD-based modified CT-scan/Chest-X-ray images provide superior performance to models trained with the original images for both the 'Covid' and 'non-Covid' classes. These two metrics show a generalized, steady improvement when trained with the modified CT-scan/Chest-X-ray images, irrespective of the database. On the other hand, Fig. 7 shows the other metrics, which give database-specific steady or fluctuating results when trained with the modified CT-scan/Chest-X-ray images. For example, in databases 1 and 2, the AUC increases when trained with the modified CT-scan images for both the 'Covid' and 'non-Covid' classes. For database 3, the FNR and recall decrease when trained with the modified Chest-X-ray images for both classes. The F2 score decreases for database 1 but increases for database 2 when the model is trained with modified CT-scans. Moreover, for database 2, the precision and specificity decrease for both classes, while the FNR decreases with modified CT-scans for both classes.
Fig. 6
Class-specific performance of different models for original and modified CT-scan/Chest-X-ray images from the perspective of accuracy (a) and F1 score (b). The models trained with the modified CT-scan/Chest-X-ray images, based on 2DEMD, provide superior performance to the models trained with the original images.
Fig. 7
Class-specific performance of different models for original and modified CT-scan/Chest-X-ray images in terms of AUC (a), F2 score (b), FNR (c), precision (d), recall (e), and specificity (f). For databases 1 and 2, the AUC (a) increases when trained with the modified CT-scan images. For database 3, the FNR (c) and recall (e) decrease when trained with the modified Chest-X-ray images. However, the F2 score (b) decreases for database 1 but increases for database 2 when the model is trained with modified CT-scan images. Moreover, for database 2, the precision (d) and specificity (f) decrease, while the FNR decreases with the modified CT-scan.
In summary, across all databases, the accuracy generally increases when the models (VGG16 and VGG19) are trained with 2DEMD-based modified CT-scan/Chest-X-ray images rather than with the raw images, whereas some of the other metrics show database-specific steady or fluctuating behavior. Achieving the best performance in all evaluation metrics simultaneously is quite challenging. In medical diagnosis (i.e., the classification of Covid and non-Covid patients), the most important evaluation metric is the classification accuracy (shown in Fig. 6(a)), which the proposed method improves in all databases. Reports on the other metrics are included to show the overall performance and to show what effects the training imposed on them while the best classification accuracy was pursued; evidently, performance on some metrics had to be traded off.
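As a concrete reference for the metrics discussed above, the following minimal sketch (an illustrative implementation, not the code used in this study) computes accuracy, precision, recall, specificity, FNR, F1, and F2 from a binary confusion matrix, with 'Covid' taken as the positive class:

```python
import numpy as np

def binary_metrics(tp, fp, tn, fn):
    """Evaluation metrics for a binary classifier from confusion-matrix
    counts: true/false positives (tp, fp) and true/false negatives (tn, fn)."""
    accuracy    = (tp + tn) / (tp + fp + tn + fn)
    precision   = tp / (tp + fp)
    recall      = tp / (tp + fn)            # sensitivity
    specificity = tn / (tn + fp)
    fnr         = fn / (fn + tp)            # false-negative rate = 1 - recall
    f1 = 2 * precision * recall / (precision + recall)
    # F-beta with beta = 2 weights recall more heavily than precision
    beta2 = 2.0 ** 2
    f2 = (1 + beta2) * precision * recall / (beta2 * precision + recall)
    return dict(accuracy=accuracy, precision=precision, recall=recall,
                specificity=specificity, fnr=fnr, f1=f1, f2=f2)

# hypothetical counts for a 200-image test split
m = binary_metrics(tp=80, fp=10, tn=90, fn=20)
```

Because the F2 score emphasizes recall, it penalizes missed Covid cases more than false alarms, which is why it is reported alongside F1 here.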
On average, 2DEMD takes less than 30 seconds to analyze a single image in the dataset, and a single prediction from the CNN model takes less than 5 seconds. Hence, the method can be readily implemented in practice.
Conclusion
In the recent advancements of SARS-CoV-2 detection and classification, most automatic detection algorithms work with the raw CT-scan/Chest-X-ray image. In this report, however, a hybrid method is utilized, comprising traditional signal processing and deep-learning methodology. First, the raw CT-scan/Chest-X-ray image is converted to a single-channel (gray-scale) image, and then 2DEMD is applied to it. Next, the residual part of the decomposition is discarded, and all of the Intrinsic Mode Functions (IMFs) are synthesized together to form a modified CT-scan/Chest-X-ray image. This modification represents the image with greater scope for visual inspection and also extracts the texture profile of the image. After that, the modified image is passed through a deep Convolutional Neural Network (CNN), and a final fully connected layer classifies the image as either 'Covid-Positive' or 'Covid-Negative'. The method is applied to CT-scan and Chest-X-ray images of three publicly available databases. In all of the cases, the models trained with modified CT-scan/Chest-X-ray images provide superior performance to the models trained with the raw images from the perspective of accuracy and F1 score. However, the models also show some database-specific steady and fluctuating patterns in the other resultant metrics. Two fundamental CNN architectures are used for validation. One of the main purposes of this paper is not to achieve the best classification result, but to compare the performance of raw-image and 2DEMD-based modified-image training while maintaining the same set of hyper-parameters and the same learning methodology. The analysis in this paper makes it possible for the 2DEMD-based modified CT-scan/Chest-X-ray image to be used as a performance-boosting criterion in the deep-learning-based classification of SARS-CoV-2 patients.
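The synthesis step described above (discard the residue, sum the IMFs, rescale to gray-scale) can be sketched as follows. The IMF stack here is a random stand-in, since the actual 2DEMD decomposition is outside the scope of this snippet and would come from a 2DEMD implementation:

```python
import numpy as np

def synthesize_modified_image(imfs):
    """Sum all IMFs (the residue has already been discarded) and rescale
    the result to an 8-bit gray-scale image suitable for CNN input."""
    combined = np.sum(imfs, axis=0)
    lo, hi = combined.min(), combined.max()
    if hi > lo:
        combined = (combined - lo) / (hi - lo)   # normalize to [0, 1]
    else:
        combined = np.zeros_like(combined)       # constant image edge case
    return (combined * 255).astype(np.uint8)

# Toy stand-in for a 2DEMD output: two oscillatory "IMFs" plus a smooth
# residue that is intentionally left out of the synthesis.
rng = np.random.default_rng(0)
imfs = rng.standard_normal((2, 64, 64))     # IMF stack: (n_imfs, H, W)
residue = np.full((64, 64), 100.0)          # slowly varying trend, discarded
modified = synthesize_modified_image(imfs)  # residue not passed in
```

Dropping the residue removes the slowly varying illumination trend, so the recombined image emphasizes the oscillatory texture components that the paper identifies as the useful features for classification.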
In summary, this paper presents a new viewpoint for deep-learning engineers to utilize the feature-extraction power of 2DEMD. One limitation of this work is that it has not been applied to a large set of Chest-X-ray images. The method appears to perform better on chest CT scans than on Chest-X-ray images, possibly because chest ribs and the diaphragm appear more prominently in a Chest-X-ray image than in a CT scan. Besides, how the result would be affected if only a subset of the decomposed IMFs, rather than all of them, were combined to form the modified image remains an area of possible future work.
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.