Abstract
COVID-19 became a global pandemic within only four months of its first detection. Detecting the disease as early as possible is crucial to slowing its spread. Chest X-ray (CXR) imaging became an effective screening strategy, complementary to reverse transcription-polymerase chain reaction (RT-PCR) testing. Convolutional neural networks (CNNs) are often used for automatic image classification and can be very useful in CXR diagnostics. In this paper, 21 different CNN architectures are tested and compared on the task of identifying COVID-19 in CXR images. They were applied to the COVIDx8B dataset, a large COVID-19 dataset with 16,352 CXR images from patients in at least 51 countries. Ensembles of CNNs were also employed, and they performed better than individual instances. The best individual CNN results were achieved by DenseNet169, with an accuracy of 98.15% and an F1 score of 98.12%. An ensemble of five DenseNet169 instances further increased these to 99.25% and 99.24%, respectively. These results are higher than those obtained in recent works using the same dataset.
Keywords: Chest X-ray images; Convolutional neural networks; Transfer learning
Year: 2022 PMID: 35615621 PMCID: PMC9122742 DOI: 10.1016/j.eswa.2022.117549
Source DB: PubMed Journal: Expert Syst Appl ISSN: 0957-4174 Impact factor: 8.665
List of datasets that compose the COVIDx8B benchmark dataset.
| Source dataset | Size | Reference |
|---|---|---|
| Covid-chestxray-dataset | 950 | |
| COVID-19 Chest X-ray Dataset Initiative | 55 | |
| Actualmed COVID-19 Chest X-ray Dataset Initiative | 238 | |
| COVID-19 Radiography Database-Version 3 | 21,165 | |
| RSNA Pneumonia Detection Challenge | 29,684 | |
| RSNA International COVID-19 Open Radiology Database (RICORD) | 1,257 | |
Fig. 1. Examples of CXR images from the COVIDx8B dataset. The first row shows COVID-19 negative patient cases, and the second row shows COVID-19 positive patient cases.
CNN architectures, some of their characteristics, and their references.
| Model | Input Image Resolution | Output of Last Conv. Layer | Trainable Parameters | Reference |
|---|---|---|---|---|
| DenseNet121 | 224 × 224 | | | |
| DenseNet169 | 224 × 224 | | | |
| DenseNet201 | 224 × 224 | | | |
| EfficientNetB0 | 224 × 224 | | | |
| EfficientNetB1 | 240 × 240 | | | |
| EfficientNetB2 | 260 × 260 | | | |
| EfficientNetB3 | 300 × 300 | | | |
| InceptionResNetV2 | 299 × 299 | | | |
| InceptionV3 | 299 × 299 | | | |
| MobileNet | 224 × 224 | | | |
| MobileNetV2 | 224 × 224 | | | |
| NASNetMobile | 224 × 224 | | | |
| ResNet101 | 224 × 224 | | | |
| ResNet101V2 | 224 × 224 | | | |
| ResNet152 | 224 × 224 | | | |
| ResNet152V2 | 224 × 224 | | | |
| ResNet50 | 224 × 224 | | | |
| ResNet50V2 | 224 × 224 | | | |
| VGG16 | 224 × 224 | | | |
| VGG19 | 224 × 224 | | | |
| Xception | 299 × 299 | | | |
Fig. 2. The proposed CNN Transfer Learning architecture.
Comparison of 21 different CNN models applied to the COVIDx8B dataset. Each model is executed five times. The highest values for each measure are highlighted in bold.
| Model | ACC Mean | ACC S.D. | TPR Mean | TPR S.D. | PPV Mean | PPV S.D. | F1 Mean | F1 S.D. |
|---|---|---|---|---|---|---|---|---|
| DenseNet169 | **0.9815** | 0.0056 | **0.9700** | 0.0138 | 0.9930 | 0.0075 | **0.9812** | 0.0058 |
| EfficientNetB2 | 0.9760 | 0.0049 | 0.9600 | 0.0141 | 0.9918 | 0.0051 | 0.9756 | 0.0052 |
| InceptionResNetV2 | 0.9755 | 0.0099 | 0.9590 | 0.0246 | 0.9919 | 0.0051 | 0.9749 | 0.0106 |
| InceptionV3 | 0.9750 | 0.0065 | 0.9520 | 0.0144 | 0.9979 | 0.0041 | 0.9744 | 0.0069 |
| MobileNet | 0.9710 | 0.0060 | 0.9430 | 0.0136 | 0.9990 | 0.0021 | 0.9701 | 0.0064 |
| EfficientNetB0 | 0.9705 | 0.0051 | 0.9510 | 0.0086 | 0.9896 | 0.0033 | 0.9699 | 0.0053 |
| EfficientNetB3 | 0.9700 | 0.0163 | 0.9470 | 0.0337 | 0.9927 | 0.0051 | 0.9690 | 0.0177 |
| DenseNet201 | 0.9695 | 0.0176 | 0.9400 | 0.0342 | 0.9989 | 0.0022 | 0.9683 | 0.0186 |
| ResNet152V2 | 0.9695 | 0.0244 | 0.9420 | 0.0510 | 0.9970 | 0.0040 | 0.9679 | 0.0268 |
| ResNet152 | 0.9660 | 0.0223 | 0.9370 | 0.0443 | 0.9947 | 0.0033 | 0.9644 | 0.0243 |
| DenseNet121 | 0.9630 | 0.0053 | 0.9270 | 0.0103 | 0.9989 | 0.0022 | 0.9616 | 0.0057 |
| Xception | 0.9615 | 0.0077 | 0.9230 | 0.0154 | **1.0000** | 0.0000 | 0.9599 | 0.0083 |
| VGG19 | 0.9580 | 0.0198 | 0.9170 | 0.0385 | 0.9989 | 0.0023 | 0.9558 | 0.0216 |
| EfficientNetB1 | 0.9570 | 0.0224 | 0.9240 | 0.0413 | 0.9892 | 0.0075 | 0.9551 | 0.0242 |
| ResNet50 | 0.9545 | 0.0172 | 0.9090 | 0.0344 | **1.0000** | 0.0000 | 0.9520 | 0.0192 |
| VGG16 | 0.9525 | 0.0123 | 0.9090 | 0.0282 | 0.9958 | 0.0052 | 0.9501 | 0.0138 |
| ResNet101V2 | 0.9530 | 0.0302 | 0.9100 | 0.0643 | 0.9959 | 0.0050 | 0.9497 | 0.0342 |
| MobileNetV2 | 0.9485 | 0.0172 | 0.9030 | 0.0359 | 0.9935 | 0.0019 | 0.9457 | 0.0190 |
| ResNet101 | 0.9410 | 0.0170 | 0.8830 | 0.0333 | 0.9988 | 0.0023 | 0.9370 | 0.0190 |
| ResNet50V2 | 0.9280 | 0.0075 | 0.8590 | 0.0153 | 0.9966 | 0.0046 | 0.9226 | 0.0087 |
| NASNetMobile | 0.8530 | 0.0653 | 0.7090 | 0.1317 | 0.9960 | 0.0034 | 0.8212 | 0.0918 |
| Average | 0.9569 | 0.0162 | 0.9178 | 0.0334 | 0.9957 | 0.0036 | 0.9536 | 0.0187 |
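The four measures reported in the table above can be illustrated with a minimal sketch, assuming the standard confusion-matrix definitions of accuracy (ACC), sensitivity (TPR), precision (PPV), and F1 score. The `Confusion` class, the `compute_metrics` function, and the counts in the example are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    """Binary confusion-matrix counts (COVID-19 positive is the positive class)."""
    tp: int  # positives predicted positive
    fp: int  # negatives predicted positive
    tn: int  # negatives predicted negative
    fn: int  # positives predicted negative


def compute_metrics(c: Confusion) -> dict:
    """Return ACC, TPR, PPV, and F1 using the standard definitions."""
    total = c.tp + c.fp + c.tn + c.fn
    acc = (c.tp + c.tn) / total
    tpr = c.tp / (c.tp + c.fn)          # sensitivity / recall
    ppv = c.tp / (c.tp + c.fp)          # precision
    f1 = 2 * ppv * tpr / (ppv + tpr)    # harmonic mean of PPV and TPR
    return {"ACC": acc, "TPR": tpr, "PPV": ppv, "F1": f1}


# Illustrative counts only (assumed, not reported in the paper).
example = compute_metrics(Confusion(tp=194, fp=1, tn=199, fn=6))
for name, value in example.items():
    print(f"{name}: {value:.4f}")
```

The sketch does not guard against empty classes (a zero denominator), which a production implementation would need to handle.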
Comparison of the best four models tested in this paper (in italic) with other recently proposed models applied to the COVIDx8B dataset. The highest values for each measure are highlighted in bold. The results obtained by other authors were compiled from the respective cited references.
| Model | ACC | TPR | PPV | F1 | Source |
|---|---|---|---|---|---|
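| *DenseNet169* | **0.9815** | **0.9700** | 0.9930 | **0.9812** | |
| *EfficientNetB2* | 0.9760 | 0.9600 | 0.9918 | 0.9756 | |
| *InceptionResNetV2* | 0.9755 | 0.9590 | 0.9919 | 0.9749 | |
| *InceptionV3* | 0.9750 | 0.9520 | 0.9979 | 0.9744 | |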
| VGG16 (ImageNet) | 0.9750 | 0.9500 | 0.9744 | ||
| Covid-Net | 0.9400 | 0.9350 | 0.9664 | ||
| DenseNet121 (ChestXray) | 0.9650 | 0.9350 | 0.9947 | 0.9639 | |
| ResNet50V2 (Bit-M) | 0.9650 | 0.9300 | 0.9637 | ||
| Covid-Net CXR-2 | 0.9630 | 0.9550 | 0.9700 | 0.9624 | |
| VGG19 (ImageNet) | 0.9625 | 0.9250 | 0.9610 | ||
| ResNet-50 (ImageNet) | 0.9575 | 0.9200 | 0.9946 | 0.9558 | |
| DenseNet121 (ImageNet) | 0.9575 | 0.9150 | 0.9556 | ||
| Xception (ImageNet) | 0.9550 | 0.9100 | 0.9529 | ||
| ResNet50V2 (Bit-S) | 0.9480 | 0.8950 | 0.9446 | ||
| ResNet50V2 (Random) | 0.9280 | 0.8550 | 0.9218 | ||
| ResNet50 | 0.9050 | 0.8850 | 0.9220 | 0.9031 |
Comparison of different CNN-based models applied to different COVID-19 datasets found in individual papers and the best result by an individual CNN model applied to the COVIDx8B dataset in this paper.
| Reference | Architecture | Dataset Size | Accuracy |
|---|---|---|---|
| | VGG16 | 7,329 | 99.82% |
| | CSDB | 15,265 | 99.80% |
| | Se-ResNeXt-50 | 8,830 | 99.32% |
| | MobileNet | 7,592 | 99.30% |
| | ResNet50 | 4,809 | 98.50% |
| | ResNet50 | 7,406 | 98.43% |
| This paper | DenseNet169 | 16,352 | 98.15% |
| | InceptionV3 | 11,244 | 97.70% |
| | InceptionV3 | 14,486 | 97.03% |
| | EfficientNetB0 | 15,496 | 95.82% |
| | MobileNet | 13,975 | 95.00% |
| | ResNet50 | 380 | 94.70% |
| | VGG16 | 8,474 | 94.50% |
| | EfficientNetB7 | 16,634 | 93.48% |
| | DeTrac | 196 | 93.10% |
| | InceptionV3 | 8,246 | 84.95% |
Ensembles of CNN models applied to the COVIDx8B dataset. Each ensemble configuration is executed five times with different instances of the models. The highest values for each measure are highlighted in bold.
| Models | ACC Mean | ACC S.D. | TPR Mean | TPR S.D. | PPV Mean | PPV S.D. | F1 Mean | F1 S.D. |
|---|---|---|---|---|---|---|---|---|
| Top 2 models | 0.9855 | 0.0024 | 0.9730 | 0.0040 | 0.9980 | 0.0025 | 0.9853 | 0.0025 |
| Top 3 models | | 0.0034 | 0.9770 | 0.0068 | | 0.0000 | 0.9884 | 0.0035 |
| Top 4 models | 0.9870 | 0.0019 | 0.9740 | 0.0037 | | 0.0000 | 0.9868 | 0.0019 |
| Top 5 models | 0.9865 | 0.0020 | 0.9730 | 0.0040 | | 0.0000 | 0.9863 | 0.0021 |
| Top 6 models | 0.9880 | 0.0010 | | 0.0020 | | 0.0000 | | 0.0010 |
| Top 7 models | 0.9865 | 0.0025 | 0.9730 | 0.0051 | | 0.0000 | 0.9863 | 0.0026 |
| All models | 0.9775 | 0.0032 | 0.9550 | 0.0063 | | 0.0000 | 0.9770 | 0.0033 |
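The tables here do not state how the member predictions of an ensemble are combined. As one common possibility, the sketch below averages the predicted COVID-19 probabilities of the member models (soft voting) and thresholds the result. The function name `soft_vote`, the 0.5 threshold, and the probabilities are assumptions for illustration, not the authors' procedure.

```python
from typing import List, Sequence


def soft_vote(member_probs: Sequence[Sequence[float]], threshold: float = 0.5) -> List[int]:
    """Average per-image positive-class probabilities across models, then threshold.

    member_probs[m][i] is model m's predicted probability that image i is
    COVID-19 positive. Returns one 0/1 label per image.
    """
    if not member_probs:
        raise ValueError("at least one model is required")
    n_images = len(member_probs[0])
    if any(len(p) != n_images for p in member_probs):
        raise ValueError("all models must score the same number of images")

    n_models = len(member_probs)
    averaged = [
        sum(model[i] for model in member_probs) / n_models
        for i in range(n_images)
    ]
    return [1 if p >= threshold else 0 for p in averaged]


# Hypothetical probabilities from three models for three images.
labels = soft_vote([
    [0.9, 0.4, 0.2],
    [0.8, 0.6, 0.1],
    [0.7, 0.7, 0.6],
])
print(labels)  # [1, 1, 0]
```

Majority voting over hard labels is an equally plausible alternative; in the second image above, soft voting lets two confident members outweigh one uncertain member.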
Ensembles of CNN models applied to the COVIDx8B dataset. Each ensemble is composed of five instances of the same model, with different training/validation splits. The highest values for each measure and the highest gains in comparison to single instances of each model are highlighted in bold.
| Models | ACC Mean | ACC Gain | TPR Mean | TPR Gain | PPV Mean | PPV Gain | F1 Mean | F1 Gain |
|---|---|---|---|---|---|---|---|---|
| DenseNet169 | **0.9925** | 1.12% | | 1.55% | | | **0.9924** | 1.14% |
| EfficientNetB2 | 0.9850 | 0.92% | 0.9750 | 1.56% | 0.9949 | 0.31% | 0.9848 | 0.94% |
| InceptionResNetV2 | 0.9875 | 1.23% | 0.9750 | 1.67% | | 0.82% | 0.9873 | 1.27% |
| InceptionV3 | 0.9800 | 0.51% | 0.9600 | 0.84% | | 0.21% | 0.9796 | 0.53% |
| MobileNet | 0.9825 | 1.18% | 0.9650 | 2.33% | | 0.10% | 0.9822 | 1.25% |
| EfficientNetB0 | 0.9750 | 0.46% | 0.9600 | 0.95% | 0.9897 | 0.01% | 0.9746 | 0.48% |
| EfficientNetB3 | 0.9850 | 1.55% | 0.9750 | 2.96% | 0.9949 | 0.22% | 0.9848 | 1.63% |
| DenseNet201 | 0.9825 | 1.34% | 0.9650 | 2.66% | | 0.11% | 0.9822 | 1.44% |
| ResNet152V2 | 0.9900 | 2.11% | 0.9800 | 4.03% | | 0.30% | 0.9899 | 2.27% |
| ResNet152 | 0.9800 | 1.45% | 0.9650 | 2.99% | 0.9948 | 0.01% | 0.9797 | 1.59% |
| DenseNet121 | 0.9725 | 0.99% | 0.9450 | 1.94% | | 0.11% | 0.9717 | 1.05% |
| Xception | 0.9625 | 0.10% | 0.9250 | 0.22% | | 0.00% | 0.9610 | 0.11% |
| VGG19 | 0.9700 | 1.25% | 0.9400 | 2.51% | | 0.11% | 0.9691 | 1.39% |
| EfficientNetB1 | 0.9725 | 1.62% | 0.9500 | 2.81% | 0.9948 | 0.57% | 0.9719 | 1.76% |
| ResNet50 | 0.9650 | 1.10% | 0.9300 | 2.31% | | 0.00% | 0.9637 | 1.23% |
| VGG16 | 0.9550 | 0.26% | 0.9100 | 0.11% | | 0.42% | 0.9529 | 0.29% |
| ResNet101V2 | 0.9650 | 1.26% | 0.9300 | 2.20% | | 0.41% | 0.9637 | 1.47% |
| MobileNetV2 | 0.9650 | 1.74% | 0.9350 | 3.54% | 0.9947 | 0.12% | 0.9639 | 1.92% |
| ResNet101 | 0.9575 | 1.75% | 0.9150 | 3.62% | | 0.12% | 0.9556 | 1.99% |
| ResNet50V2 | 0.9350 | 0.75% | 0.8700 | 1.28% | | 0.34% | 0.9305 | 0.86% |
| NASNetMobile | 0.8750 | | 0.7500 | | | 0.40% | 0.8571 | |
| Average | 0.9683 | 1.20% | 0.9383 | 2.28% | 0.9983 | 0.26% | 0.9666 | 1.38% |
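The Gain columns appear to be relative improvements of the ensemble mean over the single-instance mean of the same model, expressed as a percentage. The sketch below checks that reading against ACC values listed in the tables. The helper name `relative_gain_pct` is hypothetical, and the formula is inferred from the reported numbers rather than stated in this listing.

```python
def relative_gain_pct(single: float, ensemble: float) -> float:
    """Relative improvement of `ensemble` over `single`, in percent."""
    return 100.0 * (ensemble - single) / single


# (single-instance ACC, five-instance ensemble ACC, reported ACC gain in %)
rows = {
    "EfficientNetB2": (0.9760, 0.9850, 0.92),
    "InceptionResNetV2": (0.9755, 0.9875, 1.23),
    "MobileNet": (0.9710, 0.9825, 1.18),
    "ResNet152V2": (0.9695, 0.9900, 2.11),
}

for model, (single, ensemble, reported) in rows.items():
    computed = round(relative_gain_pct(single, ensemble), 2)
    print(f"{model}: computed {computed:.2f}%, reported {reported:.2f}%")
```

For these four rows the computed value matches the reported gain to two decimal places, which supports the relative (not absolute) interpretation.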
Ensembles of CNN models applied to the COVIDx8B dataset. Each ensemble configuration has five instances of each participant model. The highest values for each measure and the highest gains in comparison with the ensembles of single instances for each model are highlighted in bold.
| Models | ACC Mean | ACC Gain | TPR Mean | TPR Gain | PPV Mean | PPV Gain | F1 Mean | F1 Gain |
|---|---|---|---|---|---|---|---|---|
| Top 2 models | 0.9850 | −0.05% | 0.9700 | −0.31% | **1.0000** | | 0.9848 | −0.05% |
| Top 3 models | | −0.10% | | −0.20% | **1.0000** | 0.00% | | −0.11% |
| Top 4 models | | 0.05% | | 0.10% | **1.0000** | 0.00% | | 0.05% |
| Top 5 models | | | | | **1.0000** | 0.00% | | |
| Top 6 models | | −0.05% | | −0.10% | **1.0000** | 0.00% | | −0.06% |
| Top 7 models | | | | | **1.0000** | 0.00% | | |
| All models | 0.9775 | 0.00% | 0.9550 | 0.00% | **1.0000** | 0.00% | 0.9770 | 0.00% |
| Average | 0.9857 | 0.01% | 0.9714 | −0.01% | 1.0000 | 0.03% | 0.9855 | 0.00% |
Classification accuracy (ACC) achieved by the CNN architectures when applied to the train, validation, and test subsets individually.
| Model | Train | Validation | Test |
|---|---|---|---|
| DenseNet169 | 0.9951 | 0.9794 | 0.9815 |
| EfficientNetB2 | 0.9936 | 0.9793 | 0.9760 |
| InceptionResNetV2 | 0.9835 | 0.9681 | 0.9755 |
| InceptionV3 | 0.9960 | 0.9784 | 0.9750 |
| MobileNet | 0.9936 | 0.9788 | 0.9710 |
| EfficientNetB0 | 0.9894 | 0.9761 | 0.9705 |
| EfficientNetB3 | 0.9948 | 0.9803 | 0.9700 |
| ResNet152V2 | 0.9945 | 0.9757 | 0.9695 |
| DenseNet201 | 0.9971 | 0.9816 | 0.9695 |
| ResNet152 | 0.9923 | 0.9783 | 0.9660 |
| DenseNet121 | 0.9962 | 0.9806 | 0.9630 |
| Xception | 0.9909 | 0.9777 | 0.9615 |
| VGG19 | 0.9922 | 0.9804 | 0.9580 |
| EfficientNetB1 | 0.9802 | 0.9697 | 0.9570 |
| ResNet50 | 0.9955 | 0.9806 | 0.9545 |
| ResNet101V2 | 0.9909 | 0.9707 | 0.9530 |
| VGG16 | 0.9913 | 0.9772 | 0.9525 |
| MobileNetV2 | 0.9987 | 0.9808 | 0.9485 |
| ResNet101 | 0.9923 | 0.9803 | 0.9410 |
| ResNet50V2 | 0.9859 | 0.9662 | 0.9280 |
| NASNetMobile | 0.9798 | 0.9660 | 0.8530 |
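One way to read the table above is through the gap between train and test accuracy, a rough indicator of how well each architecture generalizes. The sketch below computes that gap for three rows copied from the table. The helper name `generalization_gap` is hypothetical, and using this gap as the comparison criterion is an assumption, not something the paper states here.

```python
def generalization_gap(train_acc: float, test_acc: float) -> float:
    """Difference between train and test accuracy (larger means a bigger drop)."""
    return train_acc - test_acc


# (train ACC, test ACC) copied from the table above for three architectures.
accuracies = {
    "DenseNet169": (0.9951, 0.9815),
    "MobileNetV2": (0.9987, 0.9485),
    "NASNetMobile": (0.9798, 0.8530),
}

gaps = {
    model: round(generalization_gap(train, test), 4)
    for model, (train, test) in accuracies.items()
}
worst = max(gaps, key=gaps.get)

for model, gap in gaps.items():
    print(f"{model}: train-test gap = {gap:.4f}")
print(f"Largest gap among these rows: {worst}")
```

Among these three rows, DenseNet169 loses about 1.4 percentage points from train to test, while NASNetMobile loses about 12.7.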
Classification sensitivity (TPR) achieved by the CNN architectures when applied to the train, validation, and test subsets individually.
| Model | Train | Validation | Test |
|---|---|---|---|
| DenseNet169 | 0.9987 | 0.9611 | 0.9700 |
| EfficientNetB2 | 0.9936 | 0.9662 | 0.9600 |
| InceptionResNetV2 | 0.9830 | 0.9491 | 0.9590 |
| InceptionV3 | 0.9965 | 0.9458 | 0.9520 |
| EfficientNetB0 | 0.9760 | 0.9361 | 0.9510 |
| EfficientNetB3 | 0.9935 | 0.9662 | 0.9470 |
| MobileNet | 0.9957 | 0.9440 | 0.9430 |
| ResNet152V2 | 0.9928 | 0.9375 | 0.9420 |
| DenseNet201 | 0.9964 | 0.9454 | 0.9400 |
| ResNet152 | 0.9854 | 0.9509 | 0.9370 |
| DenseNet121 | 0.9973 | 0.9421 | 0.9270 |
| EfficientNetB1 | 0.9750 | 0.9505 | 0.9240 |
| Xception | 0.9847 | 0.9333 | 0.9230 |
| VGG19 | 0.9874 | 0.9338 | 0.9170 |
| ResNet101V2 | 0.9882 | 0.9278 | 0.9100 |
| VGG16 | 0.9915 | 0.9324 | 0.9090 |
| ResNet50 | 0.9918 | 0.9338 | 0.9090 |
| MobileNetV2 | 0.9933 | 0.9241 | 0.9030 |
| ResNet101 | 0.9701 | 0.9162 | 0.8830 |
| ResNet50V2 | 0.9758 | 0.8921 | 0.8590 |
| NASNetMobile | 0.8874 | 0.8255 | 0.7090 |
Classification precision (PPV) achieved by the CNN architectures when applied to the train, validation, and test subsets individually.
| Model | Train | Validation | Test |
|---|---|---|---|
| Xception | 0.9502 | 0.9057 | 1.0000 |
| ResNet50 | 0.9759 | 0.9242 | 1.0000 |
| MobileNet | 0.9587 | 0.9040 | 0.9990 |
| VGG19 | 0.9566 | 0.9228 | 0.9989 |
| DenseNet121 | 0.9754 | 0.9169 | 0.9989 |
| DenseNet201 | 0.9822 | 0.9215 | 0.9989 |
| ResNet101 | 0.9727 | 0.9366 | 0.9988 |
| InceptionV3 | 0.9748 | 0.8998 | 0.9979 |
| ResNet152V2 | 0.9682 | 0.8904 | 0.9970 |
| ResNet50V2 | 0.9244 | 0.8630 | 0.9966 |
| NASNetMobile | 0.9602 | 0.9159 | 0.9960 |
| ResNet101V2 | 0.9501 | 0.8701 | 0.9959 |
| VGG16 | 0.9474 | 0.9044 | 0.9958 |
| ResNet152 | 0.9598 | 0.8964 | 0.9947 |
| MobileNetV2 | 0.9973 | 0.9337 | 0.9935 |
| DenseNet169 | 0.9672 | 0.8946 | 0.9930 |
| EfficientNetB3 | 0.9702 | 0.8977 | 0.9927 |
| InceptionResNetV2 | 0.9077 | 0.8408 | 0.9919 |
| EfficientNetB2 | 0.9631 | 0.8925 | 0.9918 |
| EfficientNetB0 | 0.9486 | 0.8928 | 0.9896 |
| EfficientNetB1 | 0.8924 | 0.8470 | 0.9892 |
Classification F1 score achieved by the CNN architectures when applied to the train, validation, and test subsets individually.
| Model | Train | Validation | Test |
|---|---|---|---|
| DenseNet169 | 0.9825 | 0.9266 | 0.9812 |
| EfficientNetB2 | 0.9774 | 0.9273 | 0.9756 |
| InceptionResNetV2 | 0.9428 | 0.8905 | 0.9749 |
| InceptionV3 | 0.9855 | 0.9221 | 0.9744 |
| MobileNet | 0.9768 | 0.9233 | 0.9701 |
| EfficientNetB0 | 0.9619 | 0.9139 | 0.9699 |
| EfficientNetB3 | 0.9815 | 0.9304 | 0.9690 |
| DenseNet201 | 0.9892 | 0.9331 | 0.9683 |
| ResNet152V2 | 0.9801 | 0.9128 | 0.9679 |
| ResNet152 | 0.9722 | 0.9226 | 0.9644 |
| DenseNet121 | 0.9862 | 0.9292 | 0.9616 |
| Xception | 0.9670 | 0.9190 | 0.9599 |
| VGG19 | 0.9716 | 0.9281 | 0.9558 |
| EfficientNetB1 | 0.9312 | 0.8954 | 0.9551 |
| ResNet50 | 0.9837 | 0.9287 | 0.9520 |
| VGG16 | 0.9688 | 0.9173 | 0.9501 |
| ResNet101V2 | 0.9679 | 0.8966 | 0.9497 |
| MobileNetV2 | 0.9953 | 0.9287 | 0.9457 |
| ResNet101 | 0.9714 | 0.9262 | 0.9370 |
| ResNet50V2 | 0.9493 | 0.8773 | 0.9226 |
| NASNetMobile | 0.9209 | 0.8675 | 0.8212 |