Moloud Abdar1, Soorena Salari2, Sina Qahremani3, Hak-Keung Lam4, Fakhri Karray5,6, Sadiq Hussain7, Abbas Khosravi1, U Rajendra Acharya8,9,10, Vladimir Makarenkov11, Saeid Nahavandi1. 1. Institute for Intelligent Systems Research and Innovation (IISRI), Deakin University, Geelong, Australia. 2. Department of Computer Science and Software Engineering, Concordia University, Montreal, Canada. 3. Department of Electrical Engineering, Sharif University of Technology, Tehran, Iran. 4. Centre for Robotics Research, Department of Engineering, King's College London, London, United Kingdom. 5. Centre for Pattern Analysis and Machine Intelligence, Department of Electrical and Computer Engineering, University of Waterloo, Waterloo, ON, Canada. 6. Department of Machine Learning, Mohamed bin Zayed University of Artificial Intelligence, Abu Dhabi, United Arab Emirates. 7. System Administrator, Dibrugarh University, Dibrugarh, India. 8. Department of Electronics and Computer Engineering, Ngee Ann Polytechnic, Clementi, Singapore. 9. Department of Biomedical Engineering, School of Science and Technology, SUSS University, Singapore. 10. Department of Biomedical Informatics and Medical Engineering, Asia University, Taichung, Taiwan. 11. Department of Computer Science, University of Quebec in Montreal, Montreal, Canada.
Abstract
The COVID-19 (Coronavirus disease 2019) pandemic has become a major global threat to human health and well-being. Thus, the development of computer-aided detection (CAD) systems that are capable of accurately distinguishing COVID-19 from other diseases using chest computed tomography (CT) and X-ray data is of immediate priority. Such automatic systems are usually based on traditional machine learning or deep learning methods. Differently from most of the existing studies, which used either CT scan or X-ray images in COVID-19-case classification, we present a new, simple but efficient deep learning feature fusion model, called UncertaintyFuseNet, which is able to classify accurately large datasets of both of these types of images. We argue that the uncertainty of the model's predictions should be taken into account in the learning process, even though most of the existing studies have overlooked it. We quantify the prediction uncertainty in our feature fusion model using the effective Ensemble Monte Carlo Dropout (EMCD) technique. A comprehensive simulation study has been conducted to compare the results of our new model to the existing approaches, evaluating the performance of competing models in terms of Precision, Recall, F-Measure, Accuracy and ROC curves. The obtained results demonstrate the efficiency of our model, which provided prediction accuracies of 99.08% and 96.35% for the considered CT scan and X-ray datasets, respectively. Moreover, our UncertaintyFuseNet model was generally robust to noise and performed well with previously unseen data. The source code of our implementation is freely available at: https://github.com/moloud1987/UncertaintyFuseNet-for-COVID-19-Classification.
The 2019 coronavirus (COVID-19) has been spreading astonishingly fast across the globe since its emergence in December 2019, and its exact origin is still unknown [1], [2], [3]. Overall, the COVID-19 pandemic has caused a consecutive series of catastrophic losses worldwide, infecting more than 287 million people and causing around 5.4 million deaths up to the present. The rapid spread of COVID-19 continues to threaten human life and health with the emergence of novel variants such as Delta and Omicron. All of this makes COVID-19 not only an epidemiological disaster, but also a psychological and emotional one. The uncertainties and the loss of normalcy caused by this pandemic provoke severe anxiety, stress and sadness among people. Easy respiratory transmission of the disease from person to person triggers swift spread of the pandemic. While many COVID-19 cases show milder symptoms, the symptoms of the remaining cases are unfortunately life-critical. The health-care systems in many countries seem to have arrived at the point of collapse as the number of cases has been increasing drastically due to the fast propagation of some of its variants. Regarding COVID-19 diagnostics, the reverse transcription polymerase chain reaction (RT-PCR) is one of the gold standards for COVID-19 detection. However, RT-PCR has a low sensitivity; hence, many COVID-19 cases will not be recognized by this test, and the patients may not get proper treatment. These unrecognized patients pose a threat to the healthy population due to the highly infectious nature of the virus. Chest X-ray (CXR) and Computed Tomography (CT) have been widely used to identify prominent pneumonia patterns in the chest. These imaging technologies, accompanied by artificial intelligence tools, may be used to diagnose COVID-19 patients in a more accurate, fast and cost-effective manner.
Failure to provide prompt detection and treatment of COVID-19 patients increases the mortality rate. Hence, the detection of COVID-19 cases using deep learning models applied to both CXR and CT images may have huge potential in healthcare applications. In recent years, deep learning models have found widespread applicability not only in the medical imaging field but also in many other areas [4], [5], [6], [7]. These models have also been extensively applied for COVID-19 detection. It is critical to discriminate COVID-19 from other forms of pneumonia and flu. Farooq et al. [8] introduced an open-access dataset and the open-source code of their implementation using a CNN framework for distinguishing COVID-19 from analogous pneumonia cohorts in chest X-ray images. The authors designed their COVIDResNet model by utilizing a pre-trained ResNet-50 framework, allowing them to improve the model's performance and reduce its training time. An automatic and accurate identification of COVID-19 using CT images helps radiologists to screen patients in a better way. Zheng et al. [9] proposed a fully automated system for COVID-19 detection from chest CT images. Their deep learning model, called COVNet, investigates visual features of the chest CT images. Moreover, Hall et al. [10] presented a new deep learning model, named COVIDX-Net, to aid radiologists with COVID-19 detection from CXR image data. The authors explored seven deep learning architectures, including DenseNet, VGG-19 and MobileNet v2.0. In another study, Abbas et al. [11] designed the Decompose, Transfer, and Compose (DeTraC) model for COVID-19 image classification using CXR data. A class decomposition approach was employed to identify irregularities in CXR data by scrutinizing the class boundaries. Segmentation also plays a key role in COVID-19 quantification applied to CT scan data. Chen et al. [12] proposed a novel deep learning method for automatic segmentation of COVID-19 infection regions.
Aggregated Residual Transformations were employed to learn a robust and expressive feature representation and the soft attention technique was applied to improve the potential of the system to distinguish several symptoms of COVID-19. However, we noticed that there are still some open issues in the recently proposed traditional machine learning and deep learning models for COVID-19 detection. For this reason, optimizing the existing models should be a priority in COVID-19 detection and classification. Ensemble and fusion-based models [13] have shown outstanding performance in different medical applications. In the following, we provide more information about fusion-based models, discussing how they can be used in the framework of the deep learning approach.
Uncertainty quantification (UQ)
Many traditional machine learning and deep learning models have been developed not only for the analysis of CXR and CT image data but also for many other medical applications, often yielding highly accurate results even for a limited number of images [7]. However, DNNs require a large amount of data to fine-tune their trainable parameters, and a limited number of images usually leads to epistemic uncertainty. Trust is an issue for models deployed with low numbers of training samples. Out-of-distribution (OoD) samples and distribution shift between the training and testing samples make such models fail in real-world applications. The lack of confidence on unknown or new cases is usually not reported for these models; however, this information is essential for the development of reliable medical diagnostic tools. These unknown samples, which are generally hard to predict, often have important practical value. It is essential to estimate uncertainties to provide extra insight beyond point estimates. This additional vision aims at enhancing the overall trustworthiness of the systems, allowing clinicians to know when they can trust the predictions made by the models. Flawed decisions made by such models can be fatal for patients at risk. Hence, proper uncertainty estimation is necessary to make ML models trustworthy and reliable [14], [15], [16]. Trustworthy uncertainty estimates can facilitate clinical decision making and, more importantly, provide clinicians with appropriate feedback on the reliability of the obtained results [17]. As discussed above, COVID-19 has had many negative effects on all aspects of human life around the world, causing millions of deaths. In this regard, our study proposes a simple and accurate deep learning model, called UncertaintyFuseNet, for detecting COVID-19 cases.
Our model includes an uncertainty quantification method to increase the reliability of the obtained results.
Research gaps
Our comprehensive literature review helped us to identify several important research gaps related to COVID-19 detection/segmentation methods. Below, we list the most important of them:
- There are not sufficient COVID-19 image data to develop accurate and robust deep learning models. This lack of data can impact the performance of deep learning approaches.
- To the best of our knowledge, there are very few studies that have used both types of images (CT scan and X-ray) simultaneously.
- There are very few studies that have examined the uncertainty of the COVID-19 predictions provided by deep learning models.
- There are very few COVID-19 classification studies considering the model's robustness and its ability to process unknown data.
- The impressive effect of different feature fusion methods has received little attention in COVID-19 classification research. It is worth noting that feature fusion techniques are very effective both for improving a model's performance and for dealing with uncertainty within ML and DL models.
Main contributions
The main contributions of this study are as follows:
- We propose a novel feature fusion model for accurate detection of COVID-19 cases.
- We quantify the uncertainty in our proposed feature fusion model using the effective Ensemble MC Dropout (EMCD) technique.
- The proposed feature fusion model demonstrates strong robustness to data contamination (data noise).
- Our new model provides very encouraging results in terms of unknown data detection.
The main characteristics of the proposed UncertaintyFuseNet model are as follows: (i) it is an accurate model with promising performance; (ii) it can be used efficiently to carry out classification analysis of large CT and X-ray image datasets; (iii) it quantifies the prediction uncertainty; (iv) it is a reliable model in terms of processing noisy data; and (v) it allows for accurate detection of OoD samples.
The rest of this study is organized as follows. Section 2 formulates the proposed methodology. The main experiments of this study are discussed in Section 3. Section 4 presents the obtained results and provides a comprehensive comparison with existing studies. Finally, the conclusions are presented in Section 5.
Proposed methodology
This section includes two main sub-sections describing: (i) basic deep learning models, in sub-Section 2.1, and (ii) our novel feature fusion model, UncertaintyFuseNet, in sub-Section 2.2. It may be noted that we also applied two traditional machine learning algorithms (i.e., Random Forest (RF) and Decision Tree (DT), with max-depth 50 and n-estimators 200) and compared their performances with the considered deep learning models.
Basic deep learning models
In this sub-section, we provide more details regarding the two basic deep learning models: (i) Deep 1 (Simple CNN) and (ii) Deep 2 (Multi-headed CNN), shown in Fig. 1 and Fig. 2, respectively. The first deep learning model (Simple CNN) includes three convolutional layers followed by MC dropout in the feature extraction layer. The extracted features are then given to the classification layer, consisting of three dense layers and MC dropout. More details of the Deep 1 model can be found in Fig. 1. Our second deep learning model, Deep 2 (multi-headed CNN), comprises three main heads (feature extractors). The extracted features in each branch are then given to the fusion layers, followed by the classification layer, as illustrated in Fig. 2.
Fig. 1
A general overview of the applied deep learning model deep 1 (simple CNN).
Fig. 2
A general overview of the applied deep learning model deep 2 (multi-headed CNN).
Proposed feature fusion model: UncertaintyFuseNet
Feature fusion is an approach used to combine features (different pieces of information) of the same sample (input) extracted by various methods. Let X be a training space of N labeled samples (images). Given feature vectors f_1, f_2, ..., f_k of the same input sample, extracted by k different deep learning models, the total feature fusion vector obtained from the k sources is computed as follows:

F = f_1 ⊕ f_2 ⊕ ... ⊕ f_k,

where ⊕ denotes vector concatenation.

In this study, after preprocessing the data, we feed our dataset to the model. Our model consists of two major branches. The first branch has five convolutional blocks; each block is made up of two tandem convolutional layers followed by batch normalization and max-pooling layers. The fourth and fifth blocks also have dropout layers at their outputs. It is worth noting that separable convolutions were utilized in the second and subsequent layers of this branch, whereas the usual convolution layer was used elsewhere (the first layer of UncertaintyFuseNet, Simple CNN, and Multi-headed CNN). The following training parameters were used in our experiments: a learning rate of 0.0005, a batch size of 128, 200 epochs, with Adam as the optimizer. The second branch is a VGG16 transfer learning network whose output is used in the fusion layer. After the two branches, the model is followed by a fusion layer that concatenates the third, fourth, and fifth convolutional blocks' outputs with VGG16's output. Finally, we used fully connected layers to process the fused features and classify the data. In this part, we used four dense layers with 512, 128, 64, and 3 neurons, respectively, with the ReLU activation function. The outputs of the first three dense layers are followed by dropout with rates of 0.7, 0.5, and 0.3, respectively. The stated model is not simplistic.
Indeed, to boost the model's power in dealing with data and extracting high-quality features, we have employed a novel feature fusion approach combining different sources:
- We selected the third convolutional block's output as a fusion source to provide a holistic perspective on the data distribution. These features help the model to consider unprocessed, raw information and use it in the prediction.
- We included the final and penultimate convolutional blocks' outputs in the feature fusion layer to provide more refined information. These features give the model a detailed view of the dataset and help it process advanced classification features.
- As suggested by recent pneumonia detection studies, where pretrained networks have been successfully used to create high-quality generalizable features, we used the output of VGG16 in the fusion layer.
The pseudo-code of the proposed model for detecting COVID-19 cases is reported in Algorithm 1. Its general view is illustrated in Fig. 3.
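The concatenation-based fusion described above can be sketched in a few lines of NumPy. The feature dimensions below are illustrative placeholders, not the model's actual layer widths:

```python
import numpy as np

# Hypothetical flattened feature vectors for one input image; the
# dimensions are illustrative placeholders, not the model's widths.
f_conv3 = np.random.rand(64)    # third convolutional block output
f_conv4 = np.random.rand(128)   # fourth (penultimate) block output
f_conv5 = np.random.rand(256)   # fifth (final) block output
f_vgg16 = np.random.rand(512)   # VGG16 transfer-learning branch output

# Feature fusion: concatenate the vectors from all sources into a
# single vector that feeds the dense classification layers.
fused = np.concatenate([f_conv3, f_conv4, f_conv5, f_vgg16])
print(fused.shape)  # (960,)
```

In the actual model, this concatenation is a layer inside the network, so the dense classification layers learn jointly from raw, intermediate, and pretrained features.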
Fig. 3
A general overview of the proposed model inspired by a hierarchical feature fusion approach and EMCD.
It should be noted that detailed information about the convolution blocks in Figs. 1, 2, and 3 is reported in Table B.12 in the Appendix. To generate the final prediction, after training the applied models with the uncertainty module, we first run each model T times. Thereafter, we average the predicted softmax probabilities (outputs) over those T random predictions, sampling a stochastic dropout mask for each single prediction.
Table B.12
The detailed parts of Convolution blocks used in Deep 1 (Simple CNN), Deep 2 (Multi-headed CNN), and our proposed model (Fusion model).
Model | Layer name | Input size
Deep 1 (Simple CNN) | Conv1 | 2D convolution, Kernel size: 3, Activation: ReLU, Max Pooling: 2
We then used model ensembling and acquired predictions from the trained models with various weight distributions and initialized weights. This strategy allowed us to improve the model's performance drastically. Thus, after training the model, we use the MC equation with T = 200 (see Eq. (3)) to obtain predictions of the model through different stochastic paths (using MC dropouts to create randomness in our architectures). After getting all predictions, we calculate the mean for each sample. Using this approach, we obtain an ensemble of different models, which helps boost the model's performance. Precisely, we run the proposed model 200 times for each sample at the test stage and take the average prediction as the final prediction of the model. The pseudo-code of the applied EMCD procedure included in our model for detecting COVID-19 cases is summarized in Algorithm 2.
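The EMCD averaging step can be sketched as follows. The forward pass below is a stand-in for a network with MC dropout active at test time; the logits and noise model are our illustrative assumptions, not the paper's network:

```python
import numpy as np

rng = np.random.default_rng(0)
T = 200  # number of stochastic forward passes, as in the paper

def mc_forward_pass():
    """Stand-in for one forward pass with MC dropout kept active at
    test time. A real model would sample a fresh dropout mask; here
    we perturb fixed class logits to mimic that stochasticity."""
    logits = np.array([2.0, 0.5, -1.0]) + rng.normal(0.0, 0.3, size=3)
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()  # softmax probabilities over 3 classes

# EMCD: average the softmax outputs of T stochastic passes.
probs = np.stack([mc_forward_pass() for _ in range(T)])
mean_prob = probs.mean(axis=0)        # ensemble prediction
prediction = int(mean_prob.argmax())  # final class label
spread = probs.std(axis=0)            # dispersion as an uncertainty cue
```

The per-class standard deviation across the T passes gives a simple uncertainty signal: the wider the spread, the less the averaged prediction should be trusted.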
Experiments
In this section, we present the data considered in our study (sub-Section 3.1), the results obtained using our new model (sub-Section 3.2), the results showing that our new model is robust against noise (sub-Section 3.3), and the results showing how our new model copes with unknown data (sub-Section 3.4).
Datasets considered
In this study, two types of input image data were used: CT scan [18] and X-ray images (see Table 1). Some random samples of the CT scan and X-ray datasets considered in this study are shown in Fig. 4. The CT scan dataset has three classes of data: non-informative CT (NiCT), positive CT (pCT), and negative CT (nCT) images. The X-ray dataset also has three data classes: COVID-19, Normal, and Pneumonia images.
Table 1
Characteristics of the CT scan and X-ray datasets considered in our study.
Dataset | # of samples | # of classes
CT scan images | 19 685 (70% train, 30% test) | 3
X-ray images | 6432 (train: 5144, test: 1288) | 3
Fig. 4
Some random image samples from the CT scan and X-ray datasets considered in our study.
It should be pointed out that for the CT scan dataset we randomly used 70% of the whole data for training and the remaining 30% for testing the applied models. The X-ray dataset, however, was originally divided into two main categories: train (5144 images) and test (1288 images). Thus, we used these train and test categories in our study as well.
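The 70/30 split of the CT scan dataset can be sketched as follows (a minimal NumPy illustration using the sample count from Table 1; the seed and split procedure are our assumptions, not the paper's exact code):

```python
import numpy as np

rng = np.random.default_rng(0)          # arbitrary seed, for reproducibility

n_samples = 19685                       # CT scan dataset size (Table 1)
indices = rng.permutation(n_samples)    # shuffle before splitting
n_train = int(0.7 * n_samples)          # 70% of the data for training

train_idx, test_idx = indices[:n_train], indices[n_train:]
print(len(train_idx), len(test_idx))    # 13779 5906
```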
Experimental results
In this section, the experimental results are presented and discussed. Since we also considered the impact of UQ methods, our experiments have been conducted with and without applying them for detection of COVID-19 cases. In our first experiment, we compared five different machine learning models, including Random Forest (RF), Decision Trees (DT, max-depth 50, and n-estimators 200), Deep 1 (Simple CNN), Deep 2 (Multi-headed CNN), and our proposed model (feature fusion model).
COVID-19 classification without considering uncertainty
First, we investigated the performance of the five considered classifiers (RF, DT, simple CNN, multi-headed CNN and our proposed feature fusion model) without considering uncertainty. The obtained results are presented in Table 2 and Table 3 for the CT scan and X-ray datasets, respectively. As shown in Table 2, our feature fusion model outperformed the other methods for the CT scan dataset, providing an accuracy of 99.136%, followed by the simple CNN with an accuracy of 98.763%. The obtained results also indicate that DT provided the weakest performance for the CT scan dataset among the five competing models. Figs. B.15 (in the Appendix) and 5 present the confusion matrices and the ROC curves, respectively, obtained for the CT scan dataset without quantifying uncertainty.
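In the tables that follow, the Recall and Accuracy columns coincide for every model, which is consistent with support-weighted averaging of per-class metrics. A small NumPy sketch (with a hypothetical confusion matrix, not the paper's data) shows how these metrics are derived:

```python
import numpy as np

# Hypothetical 3-class confusion matrix (rows: true class, cols: predicted).
cm = np.array([[50,  2,  1],
               [ 3, 45,  2],
               [ 1,  1, 48]])

tp = np.diag(cm).astype(float)           # true positives per class
precision = tp / cm.sum(axis=0)          # per-class precision
recall = tp / cm.sum(axis=1)             # per-class recall
f_measure = 2 * precision * recall / (precision + recall)
accuracy = tp.sum() / cm.sum()

# Support-weighted averages, as commonly reported in multi-class studies.
support = cm.sum(axis=1)
w_precision = np.average(precision, weights=support)
w_recall = np.average(recall, weights=support)
w_f = np.average(f_measure, weights=support)
# Note: the support-weighted recall always equals the overall accuracy.
```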
Table 2
Comparison of the results (given in %) provided by different ML models for detecting COVID-19 cases for the CT scan dataset: Results without considering uncertainty.
ML model | Precision | Recall | F-measure | Accuracy
RF | 97.111 | 97.070 | 97.091 | 97.070
DT | 93.049 | 93.040 | 93.045 | 93.040
Deep 1 (Simple CNN) | 98.787 | 98.763 | 98.775 | 98.763
Deep 2 (Multi-headed CNN) | 98.599 | 98.577 | 98.588 | 98.577
Proposed (Fusion model) | 99.137 | 99.136 | 99.136 | 99.136
Table 3
Comparison of the results (given in %) provided by different ML models for detecting COVID-19 cases for the X-ray dataset: Results without considering uncertainty.
ML model | Precision | Recall | F-measure | Accuracy
RF | 91.532 | 91.381 | 91.456 | 91.381
DT | 83.828 | 84.006 | 83.917 | 84.006
Deep 1 (Simple CNN) | 93.847 | 93.167 | 93.506 | 93.167
Deep 2 (Multi-headed CNN) | 95.041 | 94.953 | 94.997 | 94.953
DarkCovidNet | 95.752 | 95.729 | 95.741 | 95.729
Proposed (Fusion model) | 97.121 | 97.127 | 97.124 | 97.127
Fig. B.15
Confusion matrices obtained using different models for the CT scan datasets without quantifying uncertainty.
Fig. 5
ROC curves obtained for the five considered ML models for the CT scan data without quantifying uncertainty.
To demonstrate the effectiveness of the proposed feature fusion model, the same five ML models have been applied to analyze the X-ray data. It can be observed from Table 3 that our feature fusion model performed much better than the other competing ML models, providing an accuracy of 97.127%, followed by the multi-headed CNN model with an accuracy of 94.953%. The traditional Decision Tree model provided much worse results for the X-ray data (recall of 84.006%) than for the CT scan data (recall of 93.040%). Figs. B.16 (in the Appendix) and 6 present the confusion matrices and the ROC curves obtained by the five ML models for the X-ray dataset without quantifying uncertainty.
Fig. B.16
Confusion matrices obtained using different models for the X-ray dataset without quantifying uncertainty.
Fig. 6
ROC curves obtained for the five considered ML models for the X-ray data without quantifying uncertainty.
COVID-19 classification considering uncertainty
The results, discussed in the previous sub-Section 3.2.1, provided by our new feature fusion model are promising, suggesting that it can be used by clinical practitioners for automatic detection of COVID-19 cases. We believe that new, efficient intelligent (i.e., ML and DL) models to deal with COVID-19 data are urgently needed. At the same time, we believe that uncertainty estimates should accompany such intelligent models. To accomplish this, we applied the uncertainty quantification method called EMC dropout to estimate the uncertainty of our deep learning predictions. The EMC method was used in the framework of the Deep 1 (Simple CNN) and Deep 2 (Multi-headed CNN) models, and our proposed model. Table 4 and Fig. B.17 in the Appendix (confusion matrices) and Fig. 7 (ROC curves) show the results provided by the three compared deep learning models considering uncertainty for the CT scan dataset. As shown in Table 4, our feature fusion model yielded a better classification performance compared to the Deep 1 and Deep 2 CNN-based models, providing an accuracy of 99.085% for the CT scan data, followed by the Deep 1 model with an accuracy of 98.831%. The results obtained using deep learning models with and without uncertainty quantification (UQ) reveal that our proposed feature fusion model with the UQ method had a slightly poorer performance than the model without UQ. The Deep 1 CNN model performed slightly better with UQ, while the Deep 2 CNN model performed slightly better without UQ.
Table 4
Comparison of the results (given in %) provided by the 3 DL models for detecting COVID-19 cases for the CT scan dataset: Results obtained with uncertainty quantification.
DL model | Precision | Recall | F-measure | Accuracy
Deep 1 (Simple CNN) | 98.831 | 98.854 | 98.843 | 98.831
Deep 2 (Multi-headed CNN) | 98.493 | 98.523 | 98.508 | 98.493
Proposed (Fusion model) | 99.085 | 99.085 | 99.085 | 99.085
Fig. B.17
Confusion matrices obtained using different models for the CT scan dataset with quantifying uncertainty.
Fig. 7
ROC curves obtained for the three considered DL models for the CT scan data with uncertainty quantification.
We also evaluated the performance of the three considered DL models with uncertainty quantification on the X-ray dataset. The obtained statistics, confusion matrices, and ROC curves for the three competing DL models are presented in Table 5, Fig. B.17 in the Appendix, and Fig. 8, respectively. Our proposed feature fusion model achieved an accuracy of 96.350% for COVID-19 detection on the X-ray dataset, compared to 95.263% for the simple CNN. For the X-ray data, the proposed model outperformed both the Deep 1 (simple CNN) and Deep 2 (multi-headed CNN) models, but was slightly surpassed by DarkCovidNet (see Table 5 and Fig. 8).
Table 5
Comparison of the results (given in %) provided by the 3 DL models for detecting COVID-19 cases for the X-ray dataset: Results obtained with uncertainty quantification.
Method | Precision | Recall | F-measure | Accuracy
Deep 1 (Simple CNN) | 95.263 | 95.354 | 95.309 | 95.263
Deep 2 (Multi-headed) | 95.186 | 95.257 | 95.222 | 95.186
DarkCovidNet | 97.460 | 97.4589 | 97.459 | 97.458
Proposed (Fusion model) | 96.350 | 96.370 | 96.360 | 96.350
Fig. 8
ROC curves obtained for the three considered DL models for the X-ray data with uncertainty quantification.
Robustness against noise
The human visual system is remarkably robust to a wide variety of natural noises and corruptions, such as snow, fog or rain [19]. However, the performance of many modern image and speech recognition systems degrades greatly when they are evaluated on previously unseen noises and corruptions. Thus, conducting robustness tests on the considered ML and DL models is necessary to reveal their level of stability against noise. In this study, the robustness of the applied deep learning models against noise has been investigated. We added different amounts of noise to both the CT scan and X-ray datasets to evaluate the performance of the Simple CNN, the Multi-headed CNN and our proposed feature fusion model. Gaussian noise with different standard deviations (STD) was generated, using STD values of 0.0001, 0.001, 0.01, 0.1, 0.2, 0.3, 0.4, 0.5, and 0.6, with the mean equal to 0. Our simulation results obtained for the CT scan and X-ray datasets are presented in Table 6 and Table 7, respectively. It may be noted from Table 6 (CT scan data results) that both the Simple CNN and the Multi-headed CNN models did not perform well on noisy data compared to our feature fusion model. The results reported in Table 6 indicate that the values of all metrics computed for the Simple and Multi-headed CNNs decrease dramatically as the level of noise increases. In contrast, our feature fusion model was much more robust against noise according to all of the considered metrics.
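The noise-injection protocol described above can be sketched as follows. This is a minimal NumPy illustration; the `add_gaussian_noise` helper and the assumption that images are scaled to [0, 1] are ours, not the paper's exact preprocessing:

```python
import numpy as np

rng = np.random.default_rng(42)

def add_gaussian_noise(images, std, mean=0.0):
    """Corrupt images (assumed scaled to [0, 1]) with additive
    Gaussian noise N(mean, std^2), clipped back to the valid range."""
    noisy = images + rng.normal(mean, std, size=images.shape)
    return np.clip(noisy, 0.0, 1.0)

batch = rng.random((4, 64, 64, 1))  # hypothetical image batch
for std in (0.0001, 0.001, 0.01, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6):
    # each noisy batch would then be passed to the trained model and
    # precision/recall/F-measure/accuracy recomputed per STD level
    noisy = add_gaussian_noise(batch, std)
```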
Table 6
Robustness against noise results (given in %) provided by the 3 compared DL models for detecting COVID-19 cases for the CT scan dataset. Here, the added noise follows N(μ, σ²), where μ is the mean of the noise and σ is the standard deviation of the noise.
DL model | Noise STD | Precision | Recall | F-measure | Accuracy
Deep 1 (Simple CNN) | 0.0001 | 98.852 | 98.831 | 98.842 | 98.831
 | 0.001 | 98.868 | 98.848 | 98.858 | 98.848
 | 0.01 | 98.754 | 98.730 | 98.742 | 98.730
 | 0.1 | 95.785 | 95.614 | 95.700 | 95.614
 | 0.2 | 89.270 | 87.284 | 88.266 | 87.284
 | 0.3 | 86.370 | 82.593 | 84.439 | 82.593
 | 0.4 | 82.628 | 75.194 | 78.736 | 75.194
 | 0.5 | 78.303 | 64.053 | 70.465 | 64.053
 | 0.6 | 75.770 | 57.534 | 65.405 | 57.534
Deep 2 (Multi-headed) | 0.0001 | 98.526 | 98.493 | 98.509 | 98.493
 | 0.001 | 98.524 | 98.493 | 98.508 | 98.493
 | 0.01 | 98.558 | 98.526 | 98.542 | 98.526
 | 0.1 | 93.447 | 92.871 | 93.158 | 92.871
 | 0.2 | 86.943 | 83.423 | 85.147 | 83.423
 | 0.3 | 80.168 | 69.065 | 74.203 | 69.065
 | 0.4 | 76.032 | 58.465 | 66.102 | 58.465
 | 0.5 | 73.837 | 54.690 | 62.837 | 54.690
 | 0.6 | 72.914 | 53.149 | 61.482 | 53.149
Proposed (Fusion model) | 0.0001 | 99.085 | 99.085 | 99.085 | 99.085
 | 0.001 | 99.119 | 99.119 | 99.119 | 99.119
 | 0.01 | 99.194 | 99.187 | 99.190 | 99.187
 | 0.1 | 99.098 | 99.085 | 99.092 | 99.085
 | 0.2 | 98.828 | 98.814 | 98.821 | 98.814
 | 0.3 | 98.109 | 98.086 | 98.097 | 98.086
 | 0.4 | 96.956 | 96.884 | 96.920 | 96.884
 | 0.5 | 96.201 | 96.088 | 96.145 | 96.088
 | 0.6 | 95.804 | 95.665 | 95.734 | 95.665
Table 7
Robustness against noise results (given in %) provided by the 3 compared DL models for detecting COVID-19 cases for the X-ray dataset.
DL model
Noise STD
Precision
Recall
F-measure
Accuracy
Deep 1 (Simple CNN)
0.0001
95.408
95.341
95.375
95.341
0.001
95.408
95.341
95.341
95.341
0.01
95.338
95.263
95.301
95.263
0.1
94.554
94.254
94.404
94.254
0.2
91.540
89.285
90.398
89.285
0.3
88.534
82.065
85.176
82.065
0.4
85.770
73.136
78.951
73.136
0.5
84.294
64.518
73.092
64.518
0.6
82.545
57.375
67.696
57.375
Deep 2 (Multi-headed)
0.0001
95.474
95.419
95.446
95.419
0.001
95.188
95.108
95.148
95.108
0.01
95.404
95.341
95.372
95.341
0.1
93.922
93.322
93.621
93.322
0.2
88.861
82.453
85.537
82.453
0.3
83.781
58.074
68.598
58.074
0.4
82.207
40.062
53.871
40.062
0.5
81.750
31.521
45.499
31.521
0.6
81.366
27.639
41.262
27.639
Proposed (Fusion model)
0.0001
96.498
96.506
96.502
96.506
0.001
96.568
96.583
96.576
96.583
0.01
96.492
96.506
96.499
96.506
0.1
96.363
96.350
96.357
96.350
0.2
94.403
94.254
94.329
94.254
0.3
91.769
91.071
91.418
91.071
0.4
88.225
85.714
86.951
85.714
0.5
84.181
78.804
81.404
78.804
0.6
81.082
68.322
74.157
68.322
Table 7 reports the performance of the three selected deep learning models under different noise conditions for the X-ray dataset. Again, neither the Simple CNN nor the Multi-headed CNN performed well in this context, whereas our new model was usually much more robust against noise. It should be noted that our feature fusion model performed better on the CT scan data than on the X-ray data. This stage of the experiments was necessary to demonstrate the stability of the applied models against noise. Our results clearly indicate that the proposed feature fusion model is robust against noise for both considered types of image data: CT scan and X-ray images. Among the various COVID-19 diagnostic resources, CT scan and X-ray imaging are the main ones. This motivated us to propose an efficient deep learning-based COVID-19 detection model that works well on both CT scan and X-ray images, in order to assist clinicians in providing timely diagnostics and clinical support to their patients.
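For reference, the Precision, Recall, F-measure and Accuracy values reported in Tables 6 and 7 can be recomputed from a confusion matrix. The sketch below assumes support-weighted averaging across classes (the averaging scheme is our assumption, as it is not stated explicitly in the tables), and the function name `weighted_metrics` is illustrative.

```python
import numpy as np

def weighted_metrics(y_true, y_pred, n_classes):
    """Support-weighted precision, recall and F-measure, plus overall
    accuracy, computed from integer class labels."""
    cm = np.zeros((n_classes, n_classes), dtype=float)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1                      # rows: true class, cols: predicted
    tp = np.diag(cm)
    support = cm.sum(axis=1)               # number of true samples per class
    pred_ct = cm.sum(axis=0)               # number of predictions per class
    prec = np.divide(tp, pred_ct, out=np.zeros_like(tp), where=pred_ct > 0)
    rec = np.divide(tp, support, out=np.zeros_like(tp), where=support > 0)
    f1 = np.divide(2 * prec * rec, prec + rec,
                   out=np.zeros_like(tp), where=(prec + rec) > 0)
    w = support / support.sum()            # weight each class by its support
    return float(w @ prec), float(w @ rec), float(w @ f1), float(tp.sum() / cm.sum())

prec, rec, f1, acc = weighted_metrics([0, 0, 1, 2], [0, 0, 1, 2], 3)
```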
Unknown data detection
In this sub-section, we evaluate the performance of the deep learning models when they are fed with unknown images, i.e., images whose class the models cannot know and for which they should express high uncertainty. To perform this evaluation, we fed the compared DL models with one sample image from the well-known MNIST dataset (see Fig. 9). The Mean and STD values produced by the Simple CNN, the Multi-headed CNN and our proposed feature fusion model are reported in Table 8. The obtained results indicate that our feature fusion model expressed its uncertainty towards unknown data much better than the two other DL models.
Fig. 9
The MNIST sample image fed to the deep learning models as an unknown sample.
Table 8
Unknown image class detection by Simple CNN, Multi-headed CNN and our proposed feature fusion model when fed with the image presented in Fig. 9.
DL model
CT scan
X-ray
nCT
NiCT
pCT
COVID-19
Normal
Pneumonia
Deep 1 (Simple CNN)
Mean
0.02
0.98
0.0
0.57
0.15
0.28
STD
0.10
0.10
0.01
0.39
0.26
0.35
Deep 2 (Multi-headed CNN)
Mean
0.05
0.30
0.65
0.68
0.22
0.10
STD
0.19
0.43
0.45
0.32
0.27
0.16
Proposed fusion model
Mean
0.56
0.0
0.44
0.41
0.59
0.0
STD
0.50
0.07
0.50
0.49
0.49
0.0
We fed the MNIST sample image presented in Fig. 9 to the three deep learning models trained on the CT scan and X-ray datasets, and then predicted the class of this unknown image sample. Estimating the uncertainty of traditional machine learning and deep learning models using UQ methods is vital for critical predictions, such as those arising in medical case studies. Ideally, the applied ML models should be able to capture a portion of both the epistemic and aleatoric uncertainty. In this study, we applied a new feature fusion model to classify two types of medical data: CT scan and X-ray images. Table 8 reports the Mean and STD values of the three considered deep learning models applied to unknown data; the Mean value corresponds to the model's prediction and the STD to its uncertainty. As reported in Table 8, our model usually provides zero (or close to zero) Mean and STD values for one of the image classes (for both CT scan and X-ray image types).
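The Mean and STD columns of Table 8 come from repeated stochastic forward passes with dropout kept active at test time (Monte Carlo Dropout). A minimal sketch of that procedure follows; the linear "network" `noisy_logits`, its weight matrix `W`, and all names are hypothetical stand-ins, not the paper's architecture.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def mc_dropout_predict(logits_fn, x, n_samples=100, rng=None):
    """Monte Carlo Dropout at test time: the per-class mean of the
    stochastic softmax outputs is the prediction (Table 8 'Mean'),
    and their per-class STD is the uncertainty (Table 8 'STD')."""
    rng = rng or np.random.default_rng(0)
    samples = np.stack([softmax(logits_fn(x, rng)) for _ in range(n_samples)])
    return samples.mean(axis=0), samples.std(axis=0)

# Toy stand-in for a 3-class network with dropout active at inference.
W = np.array([[1.0, -0.5, 0.2],
              [0.3, 0.8, -1.0]])          # hypothetical weights

def noisy_logits(x, rng, p=0.5):
    mask = rng.binomial(1, 1 - p, size=W.shape) / (1 - p)   # inverted dropout
    return x @ (W * mask)

mean, std = mc_dropout_predict(noisy_logits, np.array([0.4, 0.6]))
```

For an unknown input such as the MNIST digit, a well-calibrated model should yield a diffuse Mean vector together with large STD values, rather than a confident low-variance prediction.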
Discussion
Nowadays, timely and accurate detection of COVID-19 cases has become a crucial health care task. Various methods from different fields of science have been proposed to tackle the problem of accurate COVID-19 diagnostics, and traditional machine learning (ML) and deep learning (DL) methods have been among the most effective of them. In this work, we focused on the detection of COVID-19 cases using CT scan and X-ray image data. We proposed a new, simple but very efficient feature fusion model, called UncertaintyFuseNet, and compared its performance with several classical ML and DL techniques. The prediction results we obtained confirm that our feature fusion model can be highly effective in detecting COVID-19 cases. Moreover, we have shown the superiority of our model in dealing with noisy data. The obtained results also reveal that the proposed model can effectively classify previously unseen images. Our study attempts to fill the gap reported in the literature [20]. To do so, we have compared the performance of the proposed feature fusion model to recent state-of-the-art machine learning techniques used to classify CT scan and X-ray image data (see Table 11). The Grad-CAM visualization procedure was carried out to identify the important features of each data class (this analysis was conducted for both the CT scan and X-ray image datasets). Fig. 12 and Fig. 13 illustrate the most important features used by our feature fusion model to identify each data class for the CT scan and X-ray image datasets, respectively. Moreover, the T-SNE visualisations of the different models applied to the CT scan and X-ray datasets, without and with quantifying uncertainty, are presented in Fig. 10 and Fig. 11. Finally, the output posterior distributions of our proposed feature fusion model for both considered image datasets are presented in Fig. 14. 
This figure clearly shows that the correctly classified samples of a given class do not overlap with samples of the other classes (incorrect classes).
Table 11
Comprehensive comparison of the results provided by our proposed model with the state-of-the-art techniques for automated detection of COVID-19 cases using both the CT scan and X-ray image datasets.
Dataset
Study
Year
# of samples
Performance
UQ
Code
Precision
Recall
F-measure
Accuracy
AUC
CT scan
Li et al. [36]
2020
1540 (3 classes)
N/A
82.60
N/A
N/A
0.918
×
×
Jaiswal et al. [37]
2020
2492 (2 classes)
96.29
96.29
96.29
96.25
0.970
×
×
Wang et al. [38]
2020
640 (2 classes)
96.61
97.71
97.14
97.15
N/A
×
×
Sharma [39]
2020
2200 (3 classes)
N/A
92.10
N/A
91.00
N/A
×
×
Panwar et al. [40]
2020
1600 (2 classes)
95.00
95.00
95.00
95.00
N/A
×
×
Do and Vu [41]
2020
746 (2 classes)
85.00
85.00
85.00
85.00
0.922
×
×
Singh [42]
2020
N/A (2 classes)
N/A
91.00
89.97
93.50
N/A
×
×
Pham [43]
2020
746 (2 classes)
N/A
91.14
93.00
92.62
0.980
×
×
Martinez [44]
2020
746 (2 classes)
94.40
86.60
90.30
90.40
0.965
×
×
Loey et al. [45]
2020
11 012 (2 classes)
N/A
80.85
N/A
81.41
N/A
×
×
Ning et al. [18]
2020
19 685 (3 classes)
N/A
N/A
N/A
N/A
0.978
×
×
Han et al. [46]
2020
460 (3 classes)
95.90
90.50
92.30
94.30
0.988
×
×
Shamsi Jokandan et al. [7]
2021
746 (2 classes)
N/A
86.50
N/A
87.90
0.942
✓
×
Benmalek et al. [47]
2021
19 685 (3 classes)
98.50
98.60
98.50
N/A
N/A
✓
✓
Kumar et al. [48]
2022
2926 (2 classes)
N/A
N/A
N/A
98.87
N/A
×
✓
Masood et al. [49]
2022
19 685 (2 classes)
99.75
99.70
99.72
99.75
N/A
×
×
Ours
2022
19 685 (3 classes)
99.08
99.08
99.08
99.08
1.00
✓
✓
X-ray
Khan et al. [50]
2020
1251 (4 classes)
90.00
89.92
89.80
89.60
N/A
×
✓
Ozturk et al. [22]
2020
1125 (3 classes)
89.96
85.35
87.37
87.02
N/A
×
✓
Mesut et al. [51]
2020
458 (3 classes)
98.89
98.33
98.57
99.27
N/A
×
✓
Mahmud et al. [52]
2020
1220 (4 classes)
82.87
83.82
83.37
90.30
0.825
×
✓
Heidari et al. [53]
2020
2544 (3 classes)
N/A
N/A
N/A
94.50
N/A
×
×
Rahimzadeh and Attar [54]
2020
11 302 (3 classes)
72.83
87.31
N/A
91.40
N/A
×
✓
Pereira et al. [55]
2020
1144 (7 classes)
N/A
N/A
64.91
N/A
N/A
×
×
De Moura et al. [56]
2020
1616 (3 classes)
79.00
79.33
79.33
79.86
N/A
×
×
Yoo et al. [57]
2020
1170 (2 classes)
97.00
99.00
97.98
98.00
0.980
×
×
Chandra et al. [58]
2020
2346 (2 classes)
N/A
N/A
N/A
91.32
0.914
×
×
Zhang et al. [59]
2020
2706 (2 classes)
77.13
N/A
N/A
78.57
0.844
×
✓
Shamsi Jokandan et al. [7]
2021
100 (2 classes)
N/A
99.90
N/A
98.60
0.997
✓
×
Ahmad et al. [60]
2021
4200 (4 classes)
93.01
92.97
92.97
96.49
N/A
×
×
Patel [61]
2021
6432 (3 classes)
N/A
N/A
N/A
93.67
N/A
✓
✓
Basu et al. [62]
2022
2926 (2 classes)
N/A
92.90
N/A
97.60
N/A
×
×
Masud [63]
2022
6432 (2 classes)
N/A
N/A
N/A
92.70
0.964
×
×
Ours
2022
6432 (3 classes)
96.35
96.37
96.36
96.35
0.993
✓
✓
Fig. 12
Grad-CAM visualization for our proposed fusion model, without and with UQ, for the nCT, NiCT, and pCT classes of the CT scan dataset.
Fig. 13
Grad-CAM visualization for our proposed fusion model, without and with UQ, for the COVID-19, Normal, and Pneumonia classes of the X-ray dataset.
Fig. 10
T-SNE visualisation of different models applied to the CT scan data without and with quantifying uncertainty.
Fig. 11
T-SNE visualisation of different models applied to the X-ray data without and with quantifying uncertainty.
Fig. 14
The output posterior distributions of our proposed feature fusion model for the nCT (14(a)), NiCT (14(b)) and pCT (14(c)) classes of the CT scan dataset, and the COVID-19 (14(d)), Normal (14(e)) and Pneumonia (14(f)) classes of the X-ray dataset.
Comparison with the state-of-the-art
In this sub-section, we briefly compare the results provided by our new model with those yielded by state-of-the-art DL techniques (see Table 9 and Table 10). The state-of-the-art models used in our comparison are Bayesian Deep Learning [21], DarkCovidNet [22], CNN [23], DeTraC (Decompose, Transfer, and Compose) [24], and ResNet50 [25].
Table 9
Comparison of the results of our DL feature fusion model with the state-of-the-art DL models for CT scan data.
DL model
Precision
Recall
F-measure
Accuracy
Bayesian Deep Learning [21]
98.351
98.333
98.342
98.333
DarkCovidNet [22]
97.460
97.458
97.459
97.458
CNN [23]
97.753
97.750
97.751
97.750
DeTraC [24]
96.972
96.958
96.965
96.958
ResNet50 [25]
95.571
95.541
95.556
95.541
Proposed fusion model
99.085
99.085
99.085
99.085
Table 10
Comparison of the results of our DL feature fusion model with the state-of-the-art DL models for X-ray data.
DL model
Precision
Recall
F-measure
Accuracy
Bayesian Deep Learning [21]
95.398
95.419
95.408
95.419
DarkCovidNet [22]
95.752
95.729
95.741
95.729
CNN [23]
95.400
95.341
95.370
95.341
DeTraC [24]
95.276
95.263
95.270
95.263
ResNet50 [25]
94.153
94.177
94.165
94.177
Proposed fusion model
96.350
96.370
96.360
96.350
As can be seen from Table 9 and Table 10, our proposed feature fusion model not only achieved superior performance, but also significantly outperformed the state-of-the-art models applied to the same datasets.
Significance of the feature fusion model
Wang et al. [20] proposed a DL feature fusion model for COVID-19 case detection. Their model provided excellent prediction performance on the CT scan data considered. However, the authors stated that it may be much less efficient on other types of medical data, such as X-ray images. In another study, Tang et al. [26] proposed an ensemble deep learning model for COVID-19 detection using X-ray image data only. Moreover, most of the existing studies focus on COVID-19 case detection without conducting any uncertainty analysis of the model's predictions. Shamsi et al. [7] have been among the rare authors who considered uncertainty in their study; however, they used very small datasets in their training experiments. In this work, we proposed a novel general feature fusion model which can be effectively used to analyze large CT scan and X-ray datasets (both of these types of images can be processed successfully), while quantifying the uncertainty of the model's predictions using the Ensemble MC Dropout (EMCD) technique. It should be noted that the proposed feature fusion model could easily be generalized to classify other complex diseases. Moreover, the model's performance could be further improved by incorporating different optimization algorithms into it, such as the Arithmetic optimization algorithm [27], Aquila optimizer [28], Artificial Immune System (AIS) algorithm [29], Marine Predators algorithm [30], or Cuckoo search optimization algorithm [31]. Finally, Neural Architecture Search (NAS) is a recent technique for automating the design of deep learning models; the architecture of the proposed feature fusion model could therefore be further improved using newly proposed NAS techniques [32], [33], [34], [35]. A comprehensive comparison of the results provided by our proposed model with the state-of-the-art techniques for automated detection of COVID-19 cases, using both the CT scan and X-ray image datasets, is presented in Table 11. 
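The Ensemble MC Dropout idea mentioned above can be sketched as follows: several ensemble members each perform multiple dropout-active forward passes, and all of the resulting softmax samples are pooled before computing the predictive mean and STD. This is an illustrative reconstruction under our own assumptions; the toy linear "members", their weights, and all names (`emcd_predict`, `make_member`) are hypothetical, not the authors' implementation.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def emcd_predict(members, x, n_passes=20, rng=None):
    """Ensemble MC Dropout: pool the stochastic softmax outputs of every
    ensemble member over several dropout-active passes; the pooled mean
    is the prediction and the pooled STD is the predictive uncertainty."""
    rng = rng or np.random.default_rng(0)
    samples = np.stack([softmax(m(x, rng))
                        for m in members
                        for _ in range(n_passes)])
    return samples.mean(axis=0), samples.std(axis=0)

# Toy ensemble: each "member" is a linear head with inverted dropout on.
def make_member(w, p=0.3):
    def member(x, rng):
        mask = rng.binomial(1, 1 - p, size=w.shape) / (1 - p)
        return x @ (w * mask)
    return member

rng0 = np.random.default_rng(42)
members = [make_member(rng0.normal(size=(2, 3))) for _ in range(3)]
mean, std = emcd_predict(members, np.array([1.0, -1.0]))
```

Pooling across both the ensemble and the dropout passes lets the STD reflect disagreement between members as well as the within-member dropout variance.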
Moreover, the most important features of our feature fusion model are summarized below:
- Our model provided the highest COVID-19 detection performance, compared to traditional machine learning models, simple deep learning models, and state-of-the-art deep learning techniques, for both considered types of medical data (CT scan and X-ray images).
- The proposed model takes advantage of an uncertainty quantification strategy based on the effective Ensemble MC Dropout (EMCD) technique.
- The proposed model is robust against noise.
- The proposed model is able to detect unknown data with high accuracy.
Conclusion
In this study, we have described a new deep learning feature fusion model that accurately detects COVID-19 cases using CT scan and X-ray data. In order to detect COVID-19 cases accurately and provide health practitioners with an efficient diagnostic tool they can rely on, we quantified the uncertainty of the model's predictions while detecting the disease cases. Moreover, our model demonstrated excellent robustness to noise and the ability to process unknown data. A class-wise analysis procedure was implemented to ensure a steady performance of the model. We have demonstrated the effectiveness of our model in various computational experiments. Our experimental results suggest that the presented feature fusion model can efficiently analyze both CT and X-ray data. The use of hierarchical features in the model's architecture helped it outperform the considered traditional machine learning models, classical deep learning models, and state-of-the-art deep learning models. The limitations of the proposed feature fusion model will be addressed in our future studies. Thus, in the future, we intend to: (i) expand the considered COVID-19 datasets and test our feature fusion model on multi-modal data, (ii) include an attention mechanism while merging features, and (iii) integrate into our model some modern data fusion techniques, such as decision-level fusion.
CRediT authorship contribution statement
Moloud Abdar: Conception of the project, Design of methodology, Data selection and collection, Analysis and experimental protocol, Writing – original draft, Result discussion, Writing – review & editing. Soorena Salari: Analysis and experimental protocol, Drafting the initial draft, Experimental protocol, Result discussion, Writing – review & editing. Sina Qahremani: Analysis and experimental protocol, Drafting the initial draft, Experimental protocol, Result discussion, Writing – review & editing. Hak-Keung Lam: Writing – review & editing, Supervision. Fakhri Karray: Writing – review & editing, Supervision. Sadiq Hussain: Drafting the initial draft. Abbas Khosravi: Writing – review & editing, Supervision. U. Rajendra Acharya: Writing – review & editing, Supervision. Vladimir Makarenkov: Writing – review & editing, Supervision. Saeid Nahavandi: Supervision.
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.