UncertaintyFuseNet: Robust uncertainty-aware hierarchical feature fusion model with Ensemble Monte Carlo Dropout for COVID-19 detection.

Moloud Abdar1, Soorena Salari2, Sina Qahremani3, Hak-Keung Lam4, Fakhri Karray5,6, Sadiq Hussain7, Abbas Khosravi1, U Rajendra Acharya8,9,10, Vladimir Makarenkov11, Saeid Nahavandi1.   

Abstract

The COVID-19 (Coronavirus disease 2019) pandemic has become a major global threat to human health and well-being. Thus, the development of computer-aided detection (CAD) systems that are capable of accurately distinguishing COVID-19 from other diseases using chest computed tomography (CT) and X-ray data is of immediate priority. Such automatic systems are usually based on traditional machine learning or deep learning methods. Differently from most of the existing studies, which used either CT scan or X-ray images in COVID-19-case classification, we present a new, simple but efficient deep learning feature fusion model, called UncertaintyFuseNet, which is able to accurately classify large datasets of both of these types of images. We argue that the uncertainty of the model's predictions should be taken into account in the learning process, even though most of the existing studies have overlooked it. We quantify the prediction uncertainty in our feature fusion model using the effective Ensemble Monte Carlo Dropout (EMCD) technique. A comprehensive simulation study has been conducted to compare the results of our new model to the existing approaches, evaluating the performance of competing models in terms of Precision, Recall, F-Measure, Accuracy and ROC curves. The obtained results prove the efficiency of our model, which provided prediction accuracies of 99.08% and 96.35% for the considered CT scan and X-ray datasets, respectively. Moreover, our UncertaintyFuseNet model was generally robust to noise and performed well with previously unseen data. The source code of our implementation is freely available at: https://github.com/moloud1987/UncertaintyFuseNet-for-COVID-19-Classification.
© 2022 Elsevier B.V. All rights reserved.

Keywords:  COVID-19; Deep learning; Early fusion; Feature fusion; Uncertainty quantification

Year:  2022        PMID: 36217534      PMCID: PMC9534540          DOI: 10.1016/j.inffus.2022.09.023

Source DB:  PubMed          Journal:  Inf Fusion        ISSN: 1566-2535            Impact factor:   17.564


Introduction

The 2019 coronavirus (COVID-19) has been spreading astonishingly fast across the globe since its emergence in December 2019, and its exact origin is still unknown [1], [2], [3]. Overall, the COVID-19 pandemic has caused a consecutive series of catastrophic losses worldwide, infecting more than 287 million people and causing around 5.4 million deaths up to the present. The rapid spread of COVID-19 continues to threaten human life and health with the emergence of novel variants such as Delta and Omicron. All of this makes COVID-19 not only an epidemiological disaster, but also a psychological and emotional one. The uncertainties and the loss of normalcy caused by this pandemic provoke severe anxiety, stress and sadness among people. Easy respiratory transmission of the disease from person to person triggers swift spread of the pandemic. While many COVID-19 cases show milder symptoms, the symptoms of the remaining cases are unfortunately life-critical. The health-care systems in many countries seem to have arrived at the point of collapse as the number of cases has been increasing drastically due to the fast propagation of some of its variants. Regarding COVID-19 diagnostics, the reverse transcription polymerase chain reaction (RT-PCR) is one of the gold standards for COVID-19 detection. However, RT-PCR has a low sensitivity. Hence, many COVID-19 cases will not be recognized by this test and the patients may not get the proper treatments. These unrecognized patients pose a threat to the healthy population due to the highly infectious nature of the virus. Chest X-ray (CXR) and Computed Tomography (CT) have been widely used to identify prominent pneumonia patterns in the chest. These imaging technologies, accompanied by artificial intelligence tools, may be used to diagnose COVID-19 patients in a more accurate, fast and cost-effective manner.
Failure to provide prompt detection and treatment of COVID-19 patients increases the mortality rate. Hence, the detection of COVID-19 cases using deep learning models applied to both CXR and CT images may have huge potential in healthcare applications. In recent years, deep learning models have found widespread applicability not only in the medical imaging field but also in many other areas [4], [5], [6], [7]. These models have also been extensively applied for COVID-19 detection. It is critical to discriminate COVID-19 from other forms of pneumonia and flu. Farooq et al. [8] introduced an open-access dataset and the open-source code of their implementation using a CNN framework for distinguishing COVID-19 from analogous pneumonia cohorts in chest X-ray images. The authors designed their COVIDResNet model by utilizing a pre-trained ResNet-50 framework, allowing them to improve the model's performance and reduce its training time. An automatic and accurate identification of COVID-19 using CT images helps radiologists to screen patients in a better way. Zheng et al. [9] proposed a fully automated system for COVID-19 detection from chest CT images. Their deep learning model, called COVNet, investigates visual features of the chest CT images. Moreover, Hall et al. [10] presented a new deep learning model, named COVIDX-Net, to aid radiologists with COVID-19 detection from CXR image data. The authors explored seven deep learning architectures, including DenseNet, VGG-19 and MobileNet v2.0. In another study, Abbas et al. [11] designed the Decompose, Transfer, and Compose (DeTraC) model for COVID-19 image classification using CXR data. A class decomposition approach was employed to identify irregularities in CXR data by scrutinizing the class boundaries. Segmentation also plays a key role in COVID-19 quantification applied to CT scan data. Chen et al. [12] proposed a novel deep learning method for automatic segmentation of COVID-19 infection regions.
Aggregated Residual Transformations were employed to learn a robust and expressive feature representation and the soft attention technique was applied to improve the potential of the system to distinguish several symptoms of COVID-19. However, we noticed that there are still some open issues in the recently proposed traditional machine learning and deep learning models for COVID-19 detection. For this reason, optimizing the existing models should be a priority in COVID-19 detection and classification. Ensemble and fusion-based models [13] have shown outstanding performance in different medical applications. In the following, we provide more information about fusion-based models, discussing how they can be used in the framework of the deep learning approach.

Uncertainty quantification (UQ)

Many traditional machine learning and deep learning models have been developed not only for the analysis of CXR and CT image data but also for many other medical applications, often yielding high-accuracy results even for a limited number of images [7]. However, DNNs require a large amount of data to fine-tune their trainable parameters. A limited number of images usually leads to epistemic uncertainty. Trust is an issue for models deployed with small numbers of training samples. Out-of-distribution (OoD) samples and distribution mismatch between the training and testing samples can make such models fail in real-world applications. The lack of confidence in unknown or new cases is usually not reported for these models. However, this information is essential for the development of reliable medical diagnostic tools. These unknown samples, which are generally hard to predict, often have important practical value. It is essential to complement point estimates with uncertainty estimates that offer extra insight. This additional vision aims at enhancing the overall trustworthiness of the systems, allowing clinicians to know where they can trust predictions made by the models. The flawed decisions made by some models can be fatal for patients at risk. Hence, proper uncertainty estimation is necessary to improve the efficiency of ML models, making them trustworthy and reliable [14], [15], [16]. Trustworthy uncertainty estimates can facilitate clinical decision making and, more importantly, provide clinicians with appropriate feedback on the reliability of the obtained results [17]. As discussed above, COVID-19 has had many negative effects on all aspects of human life around the world. The COVID-19 pandemic has caused millions of deaths worldwide. In this regard, our study attempts to propose a simple and accurate deep learning model, called UncertaintyFuseNet, for detecting COVID-19 cases.
Our model includes an uncertainty quantification method to increase the reliability of the obtained results.

Research gaps

Our comprehensive literature review helped us to identify several important research gaps related to the use of COVID-19 detection/segmentation methods. Below, we list the most important of them:
- There are not sufficient COVID-19 image data to develop accurate and robust deep learning models. This lack of data can impact the performance of deep learning approaches.
- To the best of our knowledge, there are very few studies that have used both types of images (CT scan and X-ray) simultaneously.
- There are very few studies that have examined the uncertainty of the COVID-19 predictions provided by deep learning models. Moreover, we found that there are very few COVID-19 classification studies considering the model's robustness and its ability to process unknown data.
- The impressive effect of different feature fusion methods has received less attention in the COVID-19 classification research. It is worth noting that feature fusion techniques are very effective both for improving the model's performance and for dealing with uncertainty within ML and DL models.

Main contributions

The main contributions of this study are as follows:
- We proposed a novel feature fusion model for accurate detection of COVID-19 cases.
- We quantified the uncertainty in our proposed feature fusion model using the effective Ensemble MC Dropout (EMCD) technique.
- The proposed feature fusion model demonstrates strong robustness to data contamination (data noise).
- Our new model provided very encouraging results in terms of unknown data detection.
The main characteristics of the proposed UncertaintyFuseNet model are as follows: (i) it is an accurate model with promising performance, (ii) it can be used efficiently to carry out classification analysis of large CT and X-ray image datasets, (iii) it quantifies the prediction uncertainty, (iv) it is a reliable model in terms of processing noisy data, and finally, (v) it allows for an accurate detection of OoD samples. The rest of this study is organized as follows. Section 2 formulates the proposed methodology. The main experiments of this study are discussed in Section 3. Section 4 presents the obtained results and provides a comprehensive comparison with existing studies. Finally, the conclusions are presented in Section 5.

Proposed methodology

This section includes two main sub-sections describing: (i) the basic deep learning models in sub-Section 2.1, and (ii) our novel feature fusion model, UncertaintyFuseNet, in sub-Section 2.2. It may be noted that we also applied two traditional machine learning algorithms (i.e., Random Forest (RF) and Decision Tree (DT, max-depth = 50 and n-estimators = 200)) and compared their performances with the considered deep learning models.

Basic deep learning models

In this sub-section, we provide more details regarding the two basic deep learning models: (i) deep 1 (Simple CNN), and (ii) deep 2 (Multi-headed CNN). Fig. 1 and Fig. 2 show the deep 1 (Simple CNN) and deep 2 (Multi-headed CNN) models, respectively. The first deep learning model (Simple CNN) includes three convolutional layers followed by MC dropout in the feature extraction layer. The extracted features are then given to the classification layer, which includes three dense layers and MC dropout. More details of the deep 1 model can be found in Fig. 1. Our second deep learning model, deep 2 (Multi-headed CNN), comprises three main heads (as feature extractors). The extracted features in each branch are then given to the fusion layers, followed by the classification layer, as illustrated in Fig. 2.
Fig. 1

A general overview of the applied deep learning model deep 1 (simple CNN).

Fig. 2

A general overview of the applied deep learning model deep 2 (multi-headed CNN).

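The MC dropout mentioned above stays active at test time, so each forward pass samples a different dropout mask and hence a different prediction. A minimal numpy sketch of this mechanism (the 64-unit dense head, dropout rate, and random weights are illustrative assumptions, not the exact architecture of the paper's models):

```python
import numpy as np

rng = np.random.default_rng(0)

def mc_dropout(x, rate, rng):
    """Dropout that stays active at inference (MC dropout):
    randomly zero units and rescale the survivors."""
    mask = rng.random(x.shape) >= rate
    return x * mask / (1.0 - rate)

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# Illustrative dense head mapping 64 features to 3 classes.
W = rng.normal(size=(64, 3))
features = rng.normal(size=(1, 64))

def stochastic_predict(features, rate=0.5):
    """One stochastic forward pass through the dropout head."""
    h = mc_dropout(features, rate, rng)
    return softmax(h @ W)

# Two passes give different predictions because the masks differ.
p1, p2 = stochastic_predict(features), stochastic_predict(features)
```

In a Keras implementation the same effect is obtained by keeping the dropout layers in training mode at inference; the variability across passes is what the uncertainty estimate is built from.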

Proposed feature fusion model: UncertaintyFuseNet

Feature fusion is an approach used to combine features (different information) of the same sample (input) extracted by various methods. Let X be a training sample (image) space of N labeled samples (images). Given v_1, v_2, …, v_n, the feature vectors of the same input sample x extracted by various deep learning models, the total feature fusion vector V obtained from the different sources can be calculated as the concatenation V = [v_1, v_2, …, v_n]. In this study, after preprocessing the data, we feed our dataset to the model. Our model consists of two major branches. The first branch has five convolutional blocks. Each block is made up of two tandem convolutional layers followed by batch normalization and max-pooling layers. Also, the fourth and fifth blocks have dropout layers at their outputs. It is worth noting that separable convolutions were utilized in the UncertaintyFuseNet model in the second and subsequent layers; we used the usual convolution layer in the other architectures (the first layer of UncertaintyFuseNet, Simple CNN, and Multi-headed CNN). The following training parameters were used in our experiments: the learning rate of the proposed model is 0.0005, the batch size is 128, the number of epochs is 200, and Adam is selected as our optimizer. The second branch is a VGG16 transfer learning network whose output is used in the fusion layer. After the two branches, the model is followed by a fusion layer that concatenates the third, fourth, and fifth convolutional blocks' outputs with VGG16's output. Finally, we used fully connected layers to process the fused features and classify the data. In this part, we have used four dense layers with 512, 128, 64, and 3 neurons with the ReLU activation function, respectively. The outputs of the first three dense layers have dropout with rates equal to 0.7, 0.5, and 0.3, respectively. The stated model is simple but not simplistic.
Indeed, to boost the model's power in dealing with data and extracting high-quality features, we have employed a novel feature fusion approach combining different sources: We selected the third convolutional block's output as a fusion source to have a holistic perspective on the data distribution. These features help the model to consider the unprocessed and raw information and use it in the prediction. We included the final and penultimate convolutional blocks' outputs in the feature fusion layer to have more accurate information. These features give a detailed view of the dataset to the model and help it to process advanced classification features. As has been suggested by recent pneumonia detection studies, where pretrained networks have been successfully used to create high-quality generalizable features, we used the output of VGG16 in the fusion layer. The pseudo-code of the proposed UncertaintyFuseNet model for detecting the COVID-19 cases is reported in Algorithm 1. Its general view is illustrated in Fig. 3.
Fig. 3

A general overview of the proposed UncertaintyFuseNet model inspired by a hierarchical feature fusion approach and EMCD.
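The fusion layer described above amounts to concatenating per-source feature vectors into V = [v_1, …, v_n]. A numpy sketch of this step, where the global-average pooling and the channel counts (64/128/256/512) are illustrative assumptions rather than the paper's exact dimensions:

```python
import numpy as np

def fuse_features(feature_maps):
    """Hierarchical feature fusion: global-average-pool each source's
    feature map, then concatenate into one fused vector V = [v1, ..., vn]."""
    pooled = [fm.mean(axis=(0, 1)) for fm in feature_maps]  # (H, W, C) -> (C,)
    return np.concatenate(pooled)

rng = np.random.default_rng(1)
# Illustrative stand-ins for conv block 3, 4, 5 outputs and VGG16 features.
block3 = rng.normal(size=(28, 28, 64))
block4 = rng.normal(size=(14, 14, 128))
block5 = rng.normal(size=(7, 7, 256))
vgg16 = rng.normal(size=(7, 7, 512))

# Fused vector of length 64 + 128 + 256 + 512 = 960, fed to the dense layers.
V = fuse_features([block3, block4, block5, vgg16])
```

The fused vector is then what the 512/128/64/3 dense head processes; concatenation preserves every source's information and lets the dense layers learn how to weight it.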

It should be noted that the detailed information about the convolution blocks in Figs. 1, 2, and 3 is reported in Table B.12 in the Appendix. To generate the final prediction, after training the applied models with the uncertainty module, we first run each model T times. Thereafter, we average the predicted softmax probabilities (outputs) over those T random predictions, obtained by passing the data through the network with a stochastically sampled dropout mask for each single prediction.
Table B.12

The detailed parts of Convolution blocks used in Deep 1 (Simple CNN), Deep 2 (Multi-headed CNN), and our proposed model (Fusion model).

Model | Layer name | Layer details
Deep 1 (Simple CNN) | Conv1 | 2D convolution, Kernel size: 3, Activation: ReLU, Max Pooling: 2
Deep 1 (Simple CNN) | Conv2 | 2D convolution, Kernel size: 3, Activation: ReLU, Batch Normalization, Max Pooling: 2
Deep 1 (Simple CNN) | Conv3 | 2D convolution, Kernel size: 3, Activation: ReLU, Batch Normalization

Deep 2 (Multi-headed CNN) | Conv1 | 2D convolution, Kernel size: 3, Activation: ReLU, Batch Normalization, Max Pooling: 2
Deep 2 (Multi-headed CNN) | Conv2 | 2D convolution, Kernel size: 3, Activation: ReLU, Batch Normalization, Max Pooling: 2
Deep 2 (Multi-headed CNN) | Conv3 | 2D convolution, Kernel size: 3, Activation: ReLU, Batch Normalization, Max Pooling: 2

Proposed (Fusion model) | Conv1 | 2D convolution, Kernel size: 3, Activation: ReLU, Max Pooling: 2
Proposed (Fusion model) | Conv2 | 2D Separable convolution, Kernel size: 3, Activation: ReLU, Batch Normalization, Max Pooling: 2
Proposed (Fusion model) | Conv3 | 2D Separable convolution, Kernel size: 3, Activation: ReLU, Batch Normalization, Max Pooling: 2
Proposed (Fusion model) | Conv4 | 2D Separable convolution, Kernel size: 3, Activation: ReLU, Batch Normalization, Max Pooling: 2
Proposed (Fusion model) | Conv5 | 2D Separable convolution, Kernel size: 3, Activation: ReLU, Batch Normalization, Max Pooling: 2
We then used model ensembling and acquired predictions from the trained model with various weight distributions and initialized weights using this strategy. This allowed us to improve the model's performance drastically. Thus, after training the model, we use the MC equation with T = 200 (see Eq. (3)) to obtain predictions of the model through different stochastic paths (using MC dropouts to create randomness in our architectures). After getting all predictions, we calculate the mean for each sample. Using this approach, we obtain an ensemble of different models, which helps boost the model's performance. Precisely, we run the proposed model 200 times for each sample at the test stage and take the average prediction as the final prediction of the model. The pseudo-code of the applied EMCD procedure included in our UncertaintyFuseNet model for detecting COVID-19 cases is summarized in Algorithm 2.
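The EMCD prediction step described above can be sketched as follows: run T = 200 stochastic forward passes with dropout active and average the softmax outputs. Here `toy_model` is only a stand-in that mimics a stochastic classifier (noisy logits around a fixed distribution); it is not the trained UncertaintyFuseNet:

```python
import numpy as np

rng = np.random.default_rng(2)

def emcd_predict(stochastic_model, x, T=200):
    """Ensemble Monte Carlo Dropout: run T stochastic forward passes
    (dropout active) and average the softmax outputs. The per-class STD
    across passes quantifies the prediction uncertainty."""
    preds = np.stack([stochastic_model(x) for _ in range(T)])  # (T, n_classes)
    return preds.mean(axis=0), preds.std(axis=0)

def toy_model(x):
    """Stand-in for a trained model with MC dropout: a noisy softmax
    around fixed logits (purely illustrative)."""
    logits = np.array([2.0, 0.5, -1.0]) + rng.normal(scale=0.3, size=3)
    e = np.exp(logits - logits.max())
    return e / e.sum()

mean_pred, std_pred = emcd_predict(toy_model, x=None, T=200)
label = int(mean_pred.argmax())  # final prediction; std_pred is the uncertainty
```

Averaging the softmax outputs rather than the hard labels is what makes the ensemble smooth: a sample on which the stochastic passes disagree ends up with a flat mean distribution and a large STD.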

Experiments

In this section, we present the data considered in our study (sub-Section 3.1), the results obtained using our new model (sub-Section 3.2), the results showing that our new model is robust against noise (sub-Section 3.3), and the results showing how our new model copes with unknown data (sub-Section 3.4).

Datasets considered

In this study, two types of input image data were used: CT scan [18] and X-ray images (see Table 1). Some random samples of the CT scan and X-ray datasets considered in this study are shown in Fig. 4. The CT scan dataset has three classes of data: non-informative CT (NiCT), positive CT (pCT), and negative CT (nCT) images. The X-ray dataset also has three data classes: COVID-19, Normal, and Pneumonia images.
Table 1

Characteristics of the CT scan and X-ray datasets considered in our study.

Dataset | # of samples | # of classes
CT scan images | 19,685 (70% train, 30% test) | 3
X-ray images | 6432 (train: 5144, test: 1288) | 3
Fig. 4

Some random image samples from the CT scan and X-ray datasets considered in our study.

It should be pointed out that for the CT scan dataset we randomly used 70% of the whole data for training and the rest (30%) for testing the applied models. However, the X-ray dataset was originally divided into two main categories: train (5144 images) and test (1288 images). Thus, we used these train and test categories in our study as well.
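The random 70/30 split used for the CT scan data can be sketched as a shuffled index partition (the seed here is arbitrary, not the one used in the paper):

```python
import numpy as np

def train_test_split_70_30(n_samples, rng=None):
    """Random 70/30 index split, as used for the CT scan dataset."""
    rng = rng or np.random.default_rng()
    idx = rng.permutation(n_samples)  # shuffle all sample indices
    cut = int(0.7 * n_samples)        # 70% boundary
    return idx[:cut], idx[cut:]

# 19,685 CT scan images -> 13,779 train / 5,906 test indices.
train_idx, test_idx = train_test_split_70_30(19685, rng=np.random.default_rng(4))
```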

Experimental results

In this section, the experimental results are presented and discussed. Since we also considered the impact of UQ methods, our experiments have been conducted both with and without applying them for the detection of COVID-19 cases. In our first experiment, we compared five different machine learning models: Random Forest (RF), Decision Tree (DT, max-depth = 50 and n-estimators = 200), Deep 1 (Simple CNN), Deep 2 (Multi-headed CNN), and our proposed model (feature fusion model).

COVID-19 classification without considering uncertainty

First, we investigated the performance of the five considered classifiers (RF, DT, simple CNN, multi-headed CNN and our proposed feature fusion model) without considering uncertainty. The obtained results are presented in Table 2 and Table 3 for the CT scan and X-ray datasets, respectively. As shown in Table 2, our feature fusion model outperformed the other methods for the CT scan dataset, providing an accuracy of 99.136%, followed by the simple CNN with an accuracy of 98.763%. The obtained results also indicate that DT provided the weakest performance for the CT scan dataset among the five competing models. Figs. B.15 (in the Appendix) and 5 present the confusion matrices and the ROC curves obtained for the CT scan dataset without quantifying uncertainty, respectively.
Table 2

Comparison of the results (given in %) provided by different ML models for detecting COVID-19 cases for the CT scan dataset: Results without considering uncertainty.

ML model | Precision | Recall | F-measure | Accuracy
RF | 97.111 | 97.070 | 97.091 | 97.070
DT | 93.049 | 93.040 | 93.045 | 93.040
Deep 1 (Simple CNN) | 98.787 | 98.763 | 98.775 | 98.763
Deep 2 (Multi-headed CNN) | 98.599 | 98.577 | 98.588 | 98.577
Proposed (Fusion model) | 99.137 | 99.136 | 99.136 | 99.136
Table 3

Comparison of the results (given in %) provided by different ML models for detecting COVID-19 cases for the X-ray dataset: Results without considering uncertainty.

ML model | Precision | Recall | F-measure | Accuracy
RF | 91.532 | 91.381 | 91.456 | 91.381
DT | 83.828 | 84.006 | 83.917 | 84.006
Deep 1 (Simple CNN) | 93.847 | 93.167 | 93.506 | 93.167
Deep 2 (Multi-headed CNN) | 95.041 | 94.953 | 94.997 | 94.953
DarkCovidNet | 95.752 | 95.729 | 95.741 | 95.729
Proposed (Fusion model) | 97.121 | 97.127 | 97.124 | 97.127
Fig. B.15

Confusion matrices obtained using different models for the CT scan datasets without quantifying uncertainty.

Fig. 5

ROC curves obtained for the five considered ML models for the CT scan data without quantifying uncertainty.

To demonstrate the effectiveness of the proposed feature fusion model, the same five ML models were applied to analyze the X-ray data. It can be observed from Table 3 that our feature fusion model performed much better than the other competing ML models, providing an accuracy of 97.127%, followed by the multi-headed CNN model with an accuracy of 94.953%. The traditional Decision Tree model provided much worse results for the X-ray data (a recall value of 84.006%) than for the CT scan data (a recall value of 93.040%). Figs. B.16 (in the Appendix) and 6 present the confusion matrices and the ROC curves obtained by the five ML models for the X-ray dataset without quantifying uncertainty.
Fig. B.16

Confusion matrices obtained using different models for the X-ray dataset without quantifying uncertainty.

Fig. 6

ROC curves obtained for the five considered ML models for the X-ray data without quantifying uncertainty.


COVID-19 classification considering uncertainty

The results provided by our new feature fusion model, discussed in the previous sub-Section 3.2.1, are promising, suggesting that it can be used by clinical practitioners for automatic detection of COVID-19 cases. We believe that new efficient intelligent (i.e., ML and DL) models to deal with COVID-19 data are urgently needed. At the same time, we believe that uncertainty estimates should accompany such intelligent models. To accomplish this, we applied the uncertainty quantification method called Ensemble MC (EMC) dropout to estimate the uncertainty of our deep learning predictions. The EMC method was used in the framework of the Deep 1 (Simple CNN) and Deep 2 (Multi-headed CNN) models, and our proposed UncertaintyFuseNet model. Table 4 and Fig. B.17 in the Appendix (confusion matrices) and Fig. 7 (ROC curves) show the results provided by the three compared deep learning models considering uncertainty for the CT scan dataset. As shown in Table 4, our feature fusion model yielded a better classification performance than the Deep 1 and Deep 2 CNN-based models. UncertaintyFuseNet provided an accuracy value of 99.085%, followed by the Deep 1 model with an accuracy value of 98.831%, for the CT scan data. The results obtained using the deep learning models with and without uncertainty quantification (UQ) reveal that our proposed feature fusion model with the UQ method performed slightly worse than the model without UQ. The Deep 1 CNN model performed slightly better with UQ, while the Deep 2 CNN model performed slightly better without UQ.
Table 4

Comparison of the results (given in %) provided by the 3 DL models for detecting COVID-19 cases for the CT scan dataset: Results obtained with uncertainty quantification.

DL model | Precision | Recall | F-measure | Accuracy
Deep 1 (Simple CNN) | 98.831 | 98.854 | 98.843 | 98.831
Deep 2 (Multi-headed CNN) | 98.493 | 98.523 | 98.508 | 98.493
Proposed (Fusion model) | 99.085 | 99.085 | 99.085 | 99.085
Fig. B.17

Confusion matrices obtained using different models for the CT scan dataset with quantifying uncertainty.

Fig. 7

ROC curves obtained for the three considered DL models for the CT scan data with uncertainty quantification.

We also evaluated the performance of the three considered DL models with uncertainty quantification on the X-ray dataset. The obtained statistics, confusion matrices, and ROC curves for the three competing DL models are presented in Table 5, Fig. B.17 in the Appendix, and Fig. 8 (ROC curves), respectively. Our proposed feature fusion model achieved the best performance among the three DL models for COVID-19 detection using the X-ray dataset, with an accuracy of 96.350% compared to the simple CNN (accuracy of 95.263%). For the X-ray data, the proposed UncertaintyFuseNet model outperformed both the Deep 1 simple CNN and the Deep 2 multi-headed CNN models, but was slightly surpassed by DarkCovidNet (see Table 5 and Fig. 8).
Table 5

Comparison of the results (given in %) provided by the 3 DL models for detecting COVID-19 cases for the X-ray dataset: Results obtained with uncertainty quantification.

Method | Precision | Recall | F-measure | Accuracy
Deep 1 (Simple CNN) | 95.263 | 95.354 | 95.309 | 95.263
Deep 2 (Multi-headed) | 95.186 | 95.257 | 95.222 | 95.186
DarkCovidNet | 97.460 | 97.458 | 97.459 | 97.458
Proposed (Fusion model) | 96.350 | 96.370 | 96.360 | 96.350
Fig. 8

ROC curves obtained for the three considered DL models for the X-ray data with uncertainty quantification.


Robustness against noise

The human visual system is remarkably robust against a wide variety of natural noises and corruptions occurring in nature, such as snow, fog or rain [19]. However, the overall performance of various modern image and speech recognition systems is greatly degraded when they are evaluated on previously unseen noises and corruptions. Thus, conducting robustness tests for the considered ML and DL models is necessary to reveal their level of stability against noise. In this study, the robustness of the applied deep learning models against noise has been investigated. We added different levels of noise to both the CT scan and X-ray datasets to evaluate the performance of the Simple CNN, the Multi-headed CNN and our proposed feature fusion model. Gaussian noise with different standard deviations (STD) was generated. The generated STD values were the following: 0.0001, 0.001, 0.01, 0.1, 0.2, 0.3, 0.4, 0.5, and 0.6, while the mean was set to 0. Our simulation results obtained for the CT scan and X-ray datasets are presented in Table 6 and Table 7, respectively. It may be noted from Table 6 (CT scan data results) that both the Simple CNN and Multi-headed CNN models did not perform well with noisy data compared to our feature fusion model. The results reported in Table 6 indicate that the values of all metrics computed for the Simple and Multi-headed CNNs decrease dramatically as the level of noise increases. In contrast, our feature fusion model was much more robust against noise according to all considered metrics.
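The noise-injection protocol above can be sketched as additive Gaussian corruption at each listed STD. The clipping of pixel values to the normalized [0, 1] range is an assumption about the preprocessing, not stated in the text:

```python
import numpy as np

def add_gaussian_noise(images, std, mean=0.0, rng=None):
    """Corrupt images with additive Gaussian noise N(mean, std^2),
    as in the robustness experiments (mean = 0)."""
    rng = rng or np.random.default_rng()
    noisy = images + rng.normal(loc=mean, scale=std, size=images.shape)
    return np.clip(noisy, 0.0, 1.0)  # assumption: keep pixels in [0, 1]

rng = np.random.default_rng(3)
batch = rng.random((4, 64, 64, 1))  # illustrative normalized image batch
stds = [0.0001, 0.001, 0.01, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
noisy_batches = {s: add_gaussian_noise(batch, s, rng=rng) for s in stds}
```

Each corrupted copy of the test set is then fed to the already-trained models, and the metrics are recomputed per STD to produce rows like those in Tables 6 and 7.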
Table 6

Robustness against noise results (given in %) provided by the 3 compared DL models for detecting COVID-19 cases for the CT scan dataset. Here, the noise follows N(mu, sigma^2), where mu is the mean of the noise and sigma is its standard deviation.

DL model | Noise STD | Precision | Recall | F-measure | Accuracy
Deep 1 (Simple CNN) | 0.0001 | 98.852 | 98.831 | 98.842 | 98.831
Deep 1 (Simple CNN) | 0.001 | 98.868 | 98.848 | 98.858 | 98.848
Deep 1 (Simple CNN) | 0.01 | 98.754 | 98.730 | 98.742 | 98.730
Deep 1 (Simple CNN) | 0.1 | 95.785 | 95.614 | 95.700 | 95.614
Deep 1 (Simple CNN) | 0.2 | 89.270 | 87.284 | 88.266 | 87.284
Deep 1 (Simple CNN) | 0.3 | 86.370 | 82.593 | 84.439 | 82.593
Deep 1 (Simple CNN) | 0.4 | 82.628 | 75.194 | 78.736 | 75.194
Deep 1 (Simple CNN) | 0.5 | 78.303 | 64.053 | 70.465 | 64.053
Deep 1 (Simple CNN) | 0.6 | 75.770 | 57.534 | 65.405 | 57.534

Deep 2 (Multi-headed) | 0.0001 | 98.526 | 98.493 | 98.509 | 98.493
Deep 2 (Multi-headed) | 0.001 | 98.524 | 98.493 | 98.508 | 98.493
Deep 2 (Multi-headed) | 0.01 | 98.558 | 98.526 | 98.542 | 98.526
Deep 2 (Multi-headed) | 0.1 | 93.447 | 92.871 | 93.158 | 92.871
Deep 2 (Multi-headed) | 0.2 | 86.943 | 83.423 | 85.147 | 83.423
Deep 2 (Multi-headed) | 0.3 | 80.168 | 69.065 | 74.203 | 69.065
Deep 2 (Multi-headed) | 0.4 | 76.032 | 58.465 | 66.102 | 58.465
Deep 2 (Multi-headed) | 0.5 | 73.837 | 54.690 | 62.837 | 54.690
Deep 2 (Multi-headed) | 0.6 | 72.914 | 53.149 | 61.482 | 53.149

Proposed (Fusion model) | 0.0001 | 99.085 | 99.085 | 99.085 | 99.085
Proposed (Fusion model) | 0.001 | 99.119 | 99.119 | 99.119 | 99.119
Proposed (Fusion model) | 0.01 | 99.194 | 99.187 | 99.190 | 99.187
Proposed (Fusion model) | 0.1 | 99.098 | 99.085 | 99.092 | 99.085
Proposed (Fusion model) | 0.2 | 98.828 | 98.814 | 98.821 | 98.814
Proposed (Fusion model) | 0.3 | 98.109 | 98.086 | 98.097 | 98.086
Proposed (Fusion model) | 0.4 | 96.956 | 96.884 | 96.920 | 96.884
Proposed (Fusion model) | 0.5 | 96.201 | 96.088 | 96.145 | 96.088
Proposed (Fusion model) | 0.6 | 95.804 | 95.665 | 95.734 | 95.665
Table 7

Robustness against noise results (given in %) provided by the 3 compared DL models for detecting COVID-19 cases for the X-ray dataset.

DL model | Noise STD | Precision | Recall | F-measure | Accuracy
Deep 1 (Simple CNN) | 0.0001 | 95.408 | 95.341 | 95.375 | 95.341
Deep 1 (Simple CNN) | 0.001 | 95.408 | 95.341 | 95.341 | 95.341
Deep 1 (Simple CNN) | 0.01 | 95.338 | 95.263 | 95.301 | 95.263
Deep 1 (Simple CNN) | 0.1 | 94.554 | 94.254 | 94.404 | 94.254
Deep 1 (Simple CNN) | 0.2 | 91.540 | 89.285 | 90.398 | 89.285
Deep 1 (Simple CNN) | 0.3 | 88.534 | 82.065 | 85.176 | 82.065
Deep 1 (Simple CNN) | 0.4 | 85.770 | 73.136 | 78.951 | 73.136
Deep 1 (Simple CNN) | 0.5 | 84.294 | 64.518 | 73.092 | 64.518
Deep 1 (Simple CNN) | 0.6 | 82.545 | 57.375 | 67.696 | 57.375

Deep 2 (Multi-headed) | 0.0001 | 95.474 | 95.419 | 95.446 | 95.419
Deep 2 (Multi-headed) | 0.001 | 95.188 | 95.108 | 95.148 | 95.108
Deep 2 (Multi-headed) | 0.01 | 95.404 | 95.341 | 95.372 | 95.341
Deep 2 (Multi-headed) | 0.1 | 93.922 | 93.322 | 93.621 | 93.322
Deep 2 (Multi-headed) | 0.2 | 88.861 | 82.453 | 85.537 | 82.453
Deep 2 (Multi-headed) | 0.3 | 83.781 | 58.074 | 68.598 | 58.074
Deep 2 (Multi-headed) | 0.4 | 82.207 | 40.062 | 53.871 | 40.062
Deep 2 (Multi-headed) | 0.5 | 81.750 | 31.521 | 45.499 | 31.521
Deep 2 (Multi-headed) | 0.6 | 81.366 | 27.639 | 41.262 | 27.639

Proposed (Fusion model) | 0.0001 | 96.498 | 96.506 | 96.502 | 96.506
Proposed (Fusion model) | 0.001 | 96.568 | 96.583 | 96.576 | 96.583
Proposed (Fusion model) | 0.01 | 96.492 | 96.506 | 96.499 | 96.506
Proposed (Fusion model) | 0.1 | 96.363 | 96.350 | 96.357 | 96.350
Proposed (Fusion model) | 0.2 | 94.403 | 94.254 | 94.329 | 94.254
Proposed (Fusion model) | 0.3 | 91.769 | 91.071 | 91.418 | 91.071
Proposed (Fusion model) | 0.4 | 88.225 | 85.714 | 86.951 | 85.714
Proposed (Fusion model) | 0.5 | 84.181 | 78.804 | 81.404 | 78.804
Proposed (Fusion model) | 0.6 | 81.082 | 68.322 | 74.157 | 68.322
Table 7 reports the performance of the three selected deep learning models under different noise conditions for the X-ray dataset. Neither the Simple CNN nor the Multi-headed CNN performed well in this context, whereas our new model was generally much more robust against noise. It should be noted that our feature fusion model performed better on the CT scan data than on the X-ray data. This stage of the experiments was necessary to demonstrate the stability of the applied models against noise. Our results clearly indicate that the proposed feature fusion model is robust against noise for both considered types of image data: CT scan and X-ray images. It should be mentioned that there are various COVID-19 diagnostic resources, the main ones being CT scan and X-ray imaging tools. Thus, we were motivated to propose an efficient deep learning-based COVID-19 detection model that works promisingly on both CT scan and X-ray images, in order to assist clinicians in providing timely diagnostics and clinical support to their patients.
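The noise-injection protocol used in these robustness experiments can be sketched as follows. This is a minimal NumPy illustration, not the authors' code: the batch shape, the [0, 1] intensity scale and the clipping step are our assumptions.

```python
import numpy as np

def add_gaussian_noise(images, std, mean=0.0, rng=None):
    """Perturb a batch of images (intensities scaled to [0, 1]) with
    additive Gaussian noise drawn from N(mean, std**2), then clip the
    result back to the valid [0, 1] range."""
    rng = np.random.default_rng(0) if rng is None else rng
    noisy = images + rng.normal(loc=mean, scale=std, size=images.shape)
    return np.clip(noisy, 0.0, 1.0)

# Sweep the same STD grid as in the tables above; a trained model would
# be re-evaluated on each perturbed batch (the model call is omitted).
images = np.random.default_rng(42).random((4, 64, 64, 1))  # dummy batch
results = {}
for std in (0.0001, 0.001, 0.01, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6):
    noisy = add_gaussian_noise(images, std)
    results[std] = noisy  # placeholder for model.evaluate(noisy, labels)
```

Clipping after perturbation keeps the noisy inputs in the same intensity range the network saw during training, so the only distribution shift being measured is the noise itself.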

Unknown data detection

In this sub-section, we evaluate the performance of the deep learning models when they are fed with unknown images, i.e., inputs for which the models either do not know the class or cannot clearly estimate the uncertainty of their predictions. To perform this evaluation, we fed the compared DL models with one sample image from the well-known MNIST dataset (see Fig. 9). The Mean and STD values of the Simple CNN model, the Multi-headed CNN model and our proposed feature fusion model are reported in Table 8. The obtained results indicate that our feature fusion model expressed its uncertainty towards unknown data much better than the two other DL models.
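The Mean/STD quantities reported in Table 8 come from repeated stochastic forward passes, as in Ensemble Monte Carlo Dropout. A minimal sketch of that computation is given below; the `toy_stochastic_forward` stand-in is purely illustrative (it mimics a dropout-enabled network by perturbing fixed logits) and is not the authors' architecture.

```python
import numpy as np

def emcd_predict(stochastic_forward, x, n_passes=50):
    """Monte Carlo Dropout sketch: run several stochastic forward passes
    (dropout kept active at test time) and report the per-class mean
    (the prediction) and STD (the uncertainty)."""
    probs = np.stack([stochastic_forward(x) for _ in range(n_passes)])
    return probs.mean(axis=0), probs.std(axis=0)

# Toy stand-in for a dropout-enabled network: fixed logits perturbed on
# every call, mimicking the randomness of active dropout masks.
rng = np.random.default_rng(0)

def toy_stochastic_forward(x):
    logits = np.array([2.0, 0.5, 0.1]) + rng.normal(0.0, 1.0, size=3)
    exp = np.exp(logits - logits.max())  # numerically stable softmax
    return exp / exp.sum()

mean, std = emcd_predict(toy_stochastic_forward, x=None, n_passes=200)
# A confident prediction shows one dominant mean with small STDs; an
# unknown input (like the MNIST digit of Fig. 9) yields flatter means
# and larger STDs.
```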
Fig. 9

The MNIST sample image fed to the deep learning models as an unknown sample.

Table 8

Unknown image class detection by Simple CNN, Multi-headed CNN and our proposed feature fusion model when fed with the image presented in Fig. 9.

DL model | Statistic | nCT | NiCT | pCT | COVID-19 | Normal | Pneumonia
Deep 1 (Simple CNN) | Mean | 0.02 | 0.98 | 0.0 | 0.57 | 0.15 | 0.28
 | STD | 0.10 | 0.10 | 0.01 | 0.39 | 0.26 | 0.35
Deep 2 (Multi-headed CNN) | Mean | 0.05 | 0.30 | 0.65 | 0.68 | 0.22 | 0.10
 | STD | 0.19 | 0.43 | 0.45 | 0.32 | 0.27 | 0.16
Proposed fusion model | Mean | 0.56 | 0.0 | 0.44 | 0.41 | 0.59 | 0.0
 | STD | 0.50 | 0.07 | 0.50 | 0.49 | 0.49 | 0.0

Note: nCT, NiCT and pCT are the CT scan classes; COVID-19, Normal and Pneumonia are the X-ray classes.
We fed the MNIST sample image presented in Fig. 9 to the three deep learning models trained on the CT scan and X-ray datasets, and then predicted the class of this unknown image sample. Estimating the uncertainty of traditional machine learning and deep learning models using different UQ methods is vital for critical predictions such as medical case studies. Ideally, the applied ML models should be able to capture a portion of both epistemic and aleatoric uncertainty. In this study, we applied a new feature fusion model to classify two types of medical data: CT scan and X-ray images. Table 8 reports the Mean and STD values of the three considered deep learning models applied to unknown data. It should be noted that the Mean value accounts for the model's prediction and the STD for its uncertainty. As reported in Table 8, our model usually provides zero (or close-to-zero) Mean and STD values for one of the image classes (for both CT scan and X-ray image types).
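One way to turn the Mean/STD readings of Table 8 into a concrete rejection rule is sketched below. The 0.3 threshold and the helper name are illustrative assumptions on our part, not part of the original study.

```python
import numpy as np

def flag_unknown(mean_probs, std_probs, std_threshold=0.3):
    """Reject a prediction as 'unknown' when the STD of the winning class
    across the MC passes exceeds a threshold (threshold is illustrative)."""
    winner = int(np.argmax(mean_probs))
    return bool(std_probs[winner] > std_threshold), winner

# Values resembling the fusion model's X-ray row of Table 8: the mean is
# split across classes and the winning class has a large STD, so the
# sample is flagged rather than confidently (mis)classified.
mean_probs = np.array([0.41, 0.59, 0.0])
std_probs = np.array([0.49, 0.49, 0.0])
is_unknown, winner = flag_unknown(mean_probs, std_probs)
```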

Discussion

Nowadays, timely and accurate detection of COVID-19 cases has become a crucial health care task. Various methods from different fields of science have been proposed to tackle the problem of accurate COVID-19 diagnosis. Traditional machine learning (ML) and deep learning (DL) methods have been among the most effective of them. In this work, we focused on the detection of COVID-19 cases using CT scan and X-ray image data. We proposed a new, simple but very efficient feature fusion model, called UncertaintyFuseNet, and compared its performance with several classical ML and DL techniques. The prediction results we obtained confirm that our feature fusion model can be highly effective in detecting COVID-19 cases. Moreover, we have shown the superiority of our model in dealing with noisy data. The obtained results also reveal that the proposed UncertaintyFuseNet model can be effectively used for classifying previously unseen images. Our study attempts to fill the gap reported in the literature [20]. To do so, we have compared the performance of the proposed feature fusion model to recent state-of-the-art machine learning techniques used to classify CT scan and X-ray image data (see Table 11). The Grad-CAM visualization procedure was carried out to identify the important features for each data class (this analysis was conducted for both the CT scan and X-ray image datasets). Fig. 12 and Fig. 13 illustrate the most important features used by our feature fusion model to identify each data class for the CT scan and X-ray image datasets, respectively. Moreover, the T-SNE visualisations of the different models applied to the CT scan and X-ray datasets, without and with quantifying uncertainty, are presented in Fig. 10 and Fig. 11. Finally, the output posterior distributions of our proposed feature fusion model for both considered image datasets are presented in Fig. 14.
This figure clearly shows that the correctly classified samples of a given class do not overlap with samples of the other classes (incorrect classes).
Table 11

Comprehensive comparison of the results provided by our proposed model with the state-of-the-art techniques for automated detection of COVID-19 cases using both the CT scan and X-ray image datasets.

Dataset | Study | Year | # of samples | Precision | Recall | F-measure | Accuracy | AUC | UQ | Code
CT scan | Li et al. [36] | 2020 | 1540 (3 classes) | N/A | 82.60 | N/A | N/A | 0.918 | × | ×
 | Jaiswal et al. [37] | 2020 | 2492 (2 classes) | 96.29 | 96.29 | 96.29 | 96.25 | 0.970 | × | ×
 | Wang et al. [38] | 2020 | 640 (2 classes) | 96.61 | 97.71 | 97.14 | 97.15 | N/A | × | ×
 | Sharma [39] | 2020 | 2200 (3 classes) | N/A | 92.10 | N/A | 91.00 | N/A | × | ×
 | Panwar et al. [40] | 2020 | 1600 (2 classes) | 95.00 | 95.00 | 95.00 | 95.00 | N/A | × | ×
 | Do and Vu [41] | 2020 | 746 (2 classes) | 85.00 | 85.00 | 85.00 | 85.00 | 0.922 | × | ×
 | Singh [42] | 2020 | N/A (2 classes) | N/A | 91.00 | 89.97 | 93.50 | N/A | × | ×
 | Pham [43] | 2020 | 746 (2 classes) | N/A | 91.14 | 93.00 | 92.62 | 0.980 | × | ×
 | Martinez [44] | 2020 | 746 (2 classes) | 94.40 | 86.60 | 90.30 | 90.40 | 0.965 | × | ×
 | Loey et al. [45] | 2020 | 11 012 (2 classes) | N/A | 80.85 | N/A | 81.41 | N/A | × | ×
 | Ning et al. [18] | 2020 | 19 685 (3 classes) | N/A | N/A | N/A | N/A | 0.978 | × | ×
 | Han et al. [46] | 2020 | 460 (3 classes) | 95.90 | 90.50 | 92.30 | 94.30 | 0.988 | × | ×
 | Shamsi Jokandan et al. [7] | 2021 | 746 (2 classes) | N/A | 86.50 | N/A | 87.90 | 0.942 | ✓ | ×
 | Benmalek et al. [47] | 2021 | 19 685 (3 classes) | 98.50 | 98.60 | 98.50 | N/A | N/A | ✓ | ✓
 | Kumar et al. [48] | 2022 | 2926 (2 classes) | N/A | N/A | N/A | 98.87 | N/A | × | ✓
 | Masood et al. [49] | 2022 | 19 685 (2 classes) | 99.75 | 99.70 | 99.72 | 99.75 | N/A | × | ×
 | Ours | 2022 | 19 685 (3 classes) | 99.08 | 99.08 | 99.08 | 99.08 | 1.00 | ✓ | ✓

X-ray | Khan et al. [50] | 2020 | 1251 (4 classes) | 90.00 | 89.92 | 89.80 | 89.60 | N/A | × | ✓
 | Ozturk et al. [22] | 2020 | 1125 (3 classes) | 89.96 | 85.35 | 87.37 | 87.02 | N/A | × | ✓
 | Mesut et al. [51] | 2020 | 458 (3 classes) | 98.89 | 98.33 | 98.57 | 99.27 | N/A | × | ✓
 | Mahmud et al. [52] | 2020 | 1220 (4 classes) | 82.87 | 83.82 | 83.37 | 90.30 | 0.825 | × | ✓
 | Heidari et al. [53] | 2020 | 2544 (3 classes) | N/A | N/A | N/A | 94.50 | N/A | × | ×
 | Rahimzadeh and Attar [54] | 2020 | 11 302 (3 classes) | 72.83 | 87.31 | N/A | 91.40 | N/A | × | ✓
 | Pereira et al. [55] | 2020 | 1144 (7 classes) | N/A | N/A | 64.91 | N/A | N/A | × | ×
 | De Moura et al. [56] | 2020 | 1616 (3 classes) | 79.00 | 79.33 | 79.33 | 79.86 | N/A | × | ×
 | Yoo et al. [57] | 2020 | 1170 (2 classes) | 97.00 | 99.00 | 97.98 | 98.00 | 0.980 | × | ×
 | Chandra et al. [58] | 2020 | 2346 (2 classes) | N/A | N/A | N/A | 91.32 | 0.914 | × | ×
 | Zhang et al. [59] | 2020 | 2706 (2 classes) | 77.13 | N/A | N/A | 78.57 | 0.844 | × | ✓
 | Shamsi Jokandan et al. [7] | 2021 | 100 (2 classes) | N/A | 99.90 | N/A | 98.60 | 0.997 | ✓ | ×
 | Ahmad et al. [60] | 2021 | 4200 (4 classes) | 93.01 | 92.97 | 92.97 | 96.49 | N/A | × | ×
 | Patel [61] | 2021 | 6432 (3 classes) | N/A | N/A | N/A | 93.67 | N/A | ✓ | ✓
 | Basu et al. [62] | 2022 | 2926 (2 classes) | N/A | 92.90 | N/A | 97.60 | N/A | × | ×
 | Masud [63] | 2022 | 6432 (2 classes) | N/A | N/A | N/A | 92.70 | 0.964 | × | ×
 | Ours | 2022 | 6432 (3 classes) | 96.35 | 96.37 | 96.36 | 96.35 | 0.993 | ✓ | ✓
Fig. 12

Grad-CAM visualization for our proposed fusion model, without and with UQ, for the nCT, NiCT, and pCT classes of the CT scan dataset.

Fig. 13

Grad-CAM visualization for our proposed fusion model, without and with UQ, for the COVID-19, Normal, and Pneumonia classes of the X-ray dataset.

Fig. 10

T-SNE visualisation of different models applied to the CT scan data without and with quantifying uncertainty.

Fig. 11

T-SNE visualisation of different models applied to the X-ray data without and with quantifying uncertainty.

Fig. 14

The output posterior distributions of our proposed feature fusion model calculated for the nCT (a), NiCT (b) and pCT (c) classes of the CT scan dataset, and the COVID-19 (d), Normal (e) and Pneumonia (f) classes of the X-ray dataset.


Comparison with the state-of-the-art

In this sub-section, we briefly compare the results provided by our new model with those yielded by state-of-the-art DL techniques (see Table 9 and Table 10). The state-of-the-art models used in our comparison are the Bayesian Deep Learning [21], DarkCovidNet [22], CNN [23], DeTraC (Decompose, Transfer, and Compose) [24], and ResNet50 [25] models.
Table 9

Comparison of the results of our DL feature fusion model with the state-of-the-art DL models for CT scan data.

DL model | Precision | Recall | F-measure | Accuracy
Bayesian Deep Learning [21] | 98.351 | 98.333 | 98.342 | 98.333
DarkCovidNet [22] | 97.460 | 97.458 | 97.459 | 97.458
CNN [23] | 97.753 | 97.750 | 97.751 | 97.750
DeTraC [24] | 96.972 | 96.958 | 96.965 | 96.958
ResNet50 [25] | 95.571 | 95.541 | 95.556 | 95.541
Proposed fusion model | 99.085 | 99.085 | 99.085 | 99.085
Table 10

Comparison of the results of our DL feature fusion model with the state-of-the-art DL models for X-ray data.

DL model | Precision | Recall | F-measure | Accuracy
Bayesian Deep Learning [21] | 95.398 | 95.419 | 95.408 | 95.419
DarkCovidNet [22] | 95.752 | 95.729 | 95.741 | 95.729
CNN [23] | 95.400 | 95.341 | 95.370 | 95.341
DeTraC [24] | 95.276 | 95.263 | 95.270 | 95.263
ResNet50 [25] | 94.153 | 94.177 | 94.165 | 94.177
Proposed fusion model | 96.350 | 96.370 | 96.360 | 96.350
As can be seen from Table 9 and Table 10, our proposed feature fusion model not only achieved superior performance but also consistently outperformed the state-of-the-art models applied to the same datasets.
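In Tables 6 and 7 above, Recall always coincides with Accuracy, which is consistent with support-weighted averaging of the per-class metrics. A minimal sketch of that averaging is given below; this is our reading of the reported numbers, not the authors' evaluation code.

```python
import numpy as np

def weighted_metrics(y_true, y_pred, n_classes):
    """Support-weighted Precision, Recall and F-measure. With this
    averaging, the weighted Recall coincides with plain Accuracy."""
    precision = recall = f_measure = 0.0
    for c in range(n_classes):
        tp = np.sum((y_pred == c) & (y_true == c))
        fp = np.sum((y_pred == c) & (y_true != c))
        fn = np.sum((y_pred != c) & (y_true == c))
        p = tp / (tp + fp) if tp + fp else 0.0
        r = tp / (tp + fn) if tp + fn else 0.0
        f = 2 * p * r / (p + r) if p + r else 0.0
        w = np.sum(y_true == c) / len(y_true)  # class support weight
        precision += w * p
        recall += w * r
        f_measure += w * f
    return precision, recall, f_measure

# Tiny 3-class example.
y_true = np.array([0, 0, 1, 1, 2, 2])
y_pred = np.array([0, 1, 1, 1, 2, 0])
p, r, f = weighted_metrics(y_true, y_pred, 3)
# r equals plain accuracy by construction
```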

Significance of the UncertaintyFuseNet feature fusion model

Wang et al. [20] proposed a DL feature fusion model for COVID-19 case detection. Their model provided excellent prediction performance for the CT scan data considered. However, the authors stated that their model may be much less efficient for other types of medical data, such as X-ray images. In another study, Tang et al. [26] proposed an ensemble deep learning model for COVID-19 detection using X-ray image data only. Moreover, most of the existing studies focus on COVID-19 case detection without conducting any uncertainty analysis of the model's predictions. Shamsi et al. [7] have been among the rare authors who considered uncertainty in their study; however, they used very small datasets in their training experiments. In this work, we proposed a novel general feature fusion model which can be effectively used to analyze large CT scan and X-ray datasets (both of these types of images can be processed successfully), while quantifying the uncertainty of the model's predictions using the Ensemble MC Dropout (EMCD) technique. It should be noted that the proposed feature fusion model could be easily generalized to classify other complex diseases. Moreover, the model's performance could be further improved by incorporating into it different optimization algorithms such as the Arithmetic optimization algorithm [27], Aquila optimizer [28], Artificial Immune System (AIS) algorithm [29], Marine Predators algorithm [30], or Cuckoo search optimization algorithm [31]. Finally, Neural Architecture Search (NAS) is a new technique for automating the design of various deep learning models. Therefore, the architecture of the proposed UncertaintyFuseNet feature fusion model can be further improved using newly proposed NAS techniques [32], [33], [34], [35]. A comprehensive comparison of the results provided by our proposed model with the state-of-the-art techniques for automated detection of COVID-19 cases using both the CT scan and X-ray image datasets is presented in Table 11.
Moreover, the most important features of our UncertaintyFuseNet feature fusion model are summarized below:
- Our model provided the highest COVID-19 detection performance compared to traditional machine learning models, simple deep learning models, and state-of-the-art deep learning techniques for both considered types of medical data (CT scan and X-ray images).
- The proposed model takes advantage of an uncertainty quantification strategy based on the effective Ensemble MC Dropout (EMCD) technique.
- The proposed model is robust against noise.
- The proposed model is able to detect unknown data with high accuracy.
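The hierarchical feature fusion idea summarized above, namely concatenating flattened activations from several CNN stages into one vector before the classifier head, can be sketched at the shape level as follows. The stage shapes are illustrative placeholders, not the actual UncertaintyFuseNet layer sizes.

```python
import numpy as np

def fuse_hierarchical_features(feature_maps):
    """Flatten the activations of several convolutional stages and
    concatenate them into a single vector per sample; this fused vector
    would feed the final classifier head."""
    flat = [f.reshape(f.shape[0], -1) for f in feature_maps]
    return np.concatenate(flat, axis=1)

# Dummy activations from three CNN stages for a batch of 2 samples.
stages = [
    np.ones((2, 32, 32, 16)),  # early stage: fine spatial detail
    np.ones((2, 16, 16, 32)),  # middle stage
    np.ones((2, 8, 8, 64)),    # late stage: abstract features
]
fused = fuse_hierarchical_features(stages)
# fused has shape (2, 32*32*16 + 16*16*32 + 8*8*64) = (2, 28672)
```

Fusing early and late stages lets the classifier see both fine spatial detail and abstract semantic features at once, which is the motivation behind the hierarchical design.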

Conclusion

In this study, we have described a new deep learning feature fusion model to accurately detect COVID-19 cases using CT scan and X-ray data. In order to detect COVID-19 cases accurately and provide health practitioners with an efficient diagnostic tool they can rely on, we carried out the uncertainty quantification of the model's predictions while detecting the disease cases. Moreover, our UncertaintyFuseNet model demonstrated excellent robustness to noise and the ability to process unknown data. A class-wise analysis procedure has been implemented to ensure a steady performance of the model. We have demonstrated the effectiveness of our model using various computational experiments. Our experimental results suggest that the presented feature fusion model can be applied to analyze efficiently both CT and X-ray data. The use of hierarchical features in the model's architecture helped UncertaintyFuseNet to outperform the considered traditional machine learning models, classical deep learning models, and state-of-the-art deep learning models. The limitations of the proposed feature fusion model will be addressed in our future studies. Thus, in the future, we intend to: (i) expand the considered COVID-19 datasets and test our feature fusion model using multi-modal data, (ii) include an attention mechanism while merging features, and (iii) integrate into our model some modern data fusion techniques such as decision level fusion.

CRediT authorship contribution statement

Moloud Abdar: Conception of the project, Design of methodology, Data selection and collection, Analysis and experimental protocol, Writing – original draft, Result discussion, Writing – review & editing. Soorena Salari: Analysis and experimental protocol, Drafting the initial draft, Experimental protocol, Result discussion, Writing – review & editing. Sina Qahremani: Analysis and experimental protocol, Drafting the initial draft, Experimental protocol, Result discussion, Writing – review & editing. Hak-Keung Lam: Writing – review & editing, Supervision. Fakhri Karray: Writing – review & editing, Supervision. Sadiq Hussain: Drafting the initial draft. Abbas Khosravi: Writing – review & editing, Supervision. U. Rajendra Acharya: Writing – review & editing, Supervision. Vladimir Makarenkov: Writing – review & editing, Supervision. Saeid Nahavandi: Supervision.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.