Muhammad Owais, Na Rae Baek, Kang Ryoung Park.
Abstract
The recent COVID-19 disaster has brought the whole world to the verge of devastation because of the virus's highly transmissible nature. In this pandemic, radiographic imaging modalities, particularly computed tomography (CT), have shown remarkable performance for the effective diagnosis of this virus. However, the diagnostic assessment of CT data is a human-dependent process that requires sufficient time from expert radiologists. Recent developments in artificial intelligence have substituted several personal diagnostic procedures with computer-aided diagnosis (CAD) methods that can make an effective diagnosis, even in real time. In response to COVID-19, various CAD methods have been developed that can detect and localize infectious regions in chest CT images. However, most existing methods do not provide cross-data analysis, an essential measure for assessing the generality of a CAD method. The few studies that have performed cross-data analysis show limited results in real-world scenarios because they do not address generality issues. Therefore, in this study, we attempt to address these generality issues and propose a deep learning-based CAD solution for the diagnosis of COVID-19 lesions from chest CT images. We propose a dual multiscale dilated fusion network (DMDF-Net) for the robust segmentation of small lesions in a given CT image. The proposed network mainly exploits the fusion of multiscale deep features inside the encoder and decoder modules in a mutually beneficial manner to achieve superior segmentation performance. Additional pre- and post-processing steps are introduced in the proposed method to address the generality issues and further improve the diagnostic performance. In particular, the concept of post-region-of-interest (ROI) fusion is introduced in the post-processing step, which reduces the number of false positives and provides a way to accurately quantify the infected area of the lung.
Consequently, the proposed framework outperforms various state-of-the-art methods, achieving superior infection segmentation results with an average Dice similarity coefficient of 75.7%, Intersection over Union of 67.22%, Average Precision of 69.92%, Sensitivity of 72.78%, Specificity of 99.79%, Enhance-Alignment Measure of 91.11%, and Mean Absolute Error of 0.026.
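The reported metrics follow the standard definitions for binary segmentation masks. As a minimal sketch (not the authors' evaluation code), the overlap-based metrics can be computed from binary prediction and ground-truth masks as follows:

```python
import numpy as np

def dice(pred, gt):
    """Dice similarity coefficient: 2|P∩G| / (|P| + |G|)."""
    inter = np.logical_and(pred, gt).sum()
    return 2.0 * inter / (pred.sum() + gt.sum())

def iou(pred, gt):
    """Intersection over Union: |P∩G| / |P∪G|."""
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return inter / union

def sensitivity(pred, gt):
    """True-positive rate over infected (foreground) pixels."""
    tp = np.logical_and(pred, gt).sum()
    return tp / gt.sum()

def specificity(pred, gt):
    """True-negative rate over background pixels."""
    tn = np.logical_and(~pred, ~gt).sum()
    return tn / (~gt).sum()

def mae(pred, gt):
    """Mean absolute error between the two binary maps."""
    return np.abs(pred.astype(float) - gt.astype(float)).mean()

# Toy 2-by-2 masks with one overlapping foreground pixel.
pred = np.array([[1, 1], [0, 0]], dtype=bool)
gt   = np.array([[1, 0], [0, 0]], dtype=bool)
print(dice(pred, gt))  # 2/3
print(iou(pred, gt))   # 1/2
```

The Enhance-Alignment Measure reported above is a more involved structural metric and is not sketched here.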
Keywords: COVID-19 lesions segmentation; Computer-aided diagnosis; DMDF-Net; Infection quantification; Lung segmentation
Year: 2022 PMID: 35529253 PMCID: PMC9057951 DOI: 10.1016/j.eswa.2022.117360
Source DB: PubMed Journal: Expert Syst Appl ISSN: 0957-4174 Impact factor: 8.665
Fig. 1. Example CT images of different patients infected with the COVID-19 virus from (a) MosMed data and (b) COVID-19-CT-Seg data. The infectious regions of the lungs are shown inside the red boundary lines.
Comparative review of the proposed and existing methods for COVID-19 infection segmentation.
| Method | #Sli. (#Pat.) | Strengths | Limitations |
|---|---|---|---|
| C-GAN and U-Net | 829 (9) | C-GAN overcomes the underfitting issue | Lack of cross-data analysis |
| CoSinGAN and 2D U-Net | 5,569 (70) | CoSinGAN overcomes the underfitting issue | Data synthesis requires high computation power |
| Inf-Net and Semi-Inf-Net | 100 (40) | Semi-supervised learning improves the performance | Lack of cross-data analysis |
| 3D U-Net | 5,569 (70) | Detailed performance analysis and comparison | 3D U-Net requires high computation power |
| Modified local contrast enhancement | 275 (22) | Visualizes the progression of disease | Lack of cross-data analysis |
| InceptionV3 and DeepLabV3+ | 100 (40) | Joint segmentation and classification framework | Lack of ablation study |
| MSD-Net | 4,780 (36) | Improves small lesion segmentation | Lack of cross-data analysis |
| FSS-2019-nCov | 939 (69) | Performs optimal training with a limited dataset | Lack of cross-data analysis |
| CNN | 80 (N/A) | Performs optimal training with a limited dataset | Lack of ablation study |
| DAL-Net | 5,569 (70) | Addresses generality issue | Requires a pre-processing stage |
| U-Net + Attention mechanism | 473 (69) | Improves small lesion segmentation | Lack of cross-data analysis |
| Res-UNet | 200 (10) | Detects the infected areas in various 2D planes of lung CT slices | Limited dataset |
| FractalCovNet | 473 (N/A) | Detection of COVID-19 cases using both chest X-ray and CT images | Lack of ablation study |
| U-Net and FCNs | 939 (10) | Overcomes the effect of class imbalance and annotation errors | Lack of comparison with state-of-the-art models |
| Improved 3D CU-Net | 5,569 (70) | Performs well in case of uneven distribution of lesions | Lack of ablation study |
| D2A U-Net | 1,765 (N/A) | Performs well in case of blurred edges of infection | Limited testing dataset |
| DMDF-Net | 5,569 (70) | Computationally efficient | Requires pre- and post-processing steps |
#Sli.: Number of CT scan slices; #Pat.: Number of patients.
Fig. 2. COVID-19-positive CT images and corresponding ground-truth segmentation masks from (a) MosMed and (b) COVID-19-CT-Seg.
Fig. 3. Complete workflow diagram of the proposed diagnostic framework, including the pre-processing step, lung segmentation network (DMDF-Net-1), infection segmentation network (DMDF-Net-2), and post-processing step. (Symbols in the figure denote the RT mapping function, the total number of green pixels in the final output image, and the total number of red pixels in the final output image.)
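Fig. 10 later reports the proportion of the infected area of the lung (PIAL) per image. A plausible sketch of this quantification, assuming the final output image labels healthy lung pixels green and infected pixels red (the authors' exact RT mapping function is not reproduced here):

```python
import numpy as np

# Illustrative label convention (assumed, not the paper's encoding):
# 0 = background, 1 = healthy lung (green), 2 = infection (red).
def pial(label_map):
    """Proportion of the infected area of the lung (PIAL):
    red pixels divided by all lung pixels (green + red)."""
    n_green = np.count_nonzero(label_map == 1)
    n_red = np.count_nonzero(label_map == 2)
    return n_red / (n_green + n_red)

# Toy output: 6 lung pixels, 2 of them infected -> PIAL = 1/3.
out = np.array([[0, 1, 1, 0],
                [0, 1, 2, 0],
                [0, 1, 2, 0]])
print(f"PIAL = {pial(out):.1%}")
```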
Fig. 4. Overall design and workflow diagram of the proposed DMDF-Net (including both encoder and decoder modules).
Complete layer-wise structure, configuration, and parametric information of the proposed DMDF-Net.
| Stage | Layer | Input Size | Kernel Size | Kernel Depth | Stride | Output Size | #Par. |
|---|---|---|---|---|---|---|---|
| Encoder | Image Input | 288 × 352 × 3 | – | – | – | – | – |
| | Conv 1 | 288 × 352 × 3 | | 32 | 2 | 144 × 176 × 32 | 960 |
| | G-Conv 1 | 144 × 176 × 32 | | 32 | 1 | 144 × 176 × 32 | 384 |
| | Conv 2 | 144 × 176 × 32 | | 16 | 1 | 144 × 176 × 16 | 560 |
| | A-Block 1 | 144 × 176 × 16 | | 96, 96, 24 | 1, 2, 1 | 72 × 88 × 24 | 5,352 |
| | B-Block 1 | 72 × 88 × 24 | | 144, 144, 24 | 1, 1, 1 | 72 × 88 × 24 | 9,144 |
| | A-Block 2 | 72 × 88 × 24 | | 144, 144, 32 | 1, 2, 1 | 36 × 44 × 32 | 10,320 |
| | B-Block 2 | 36 × 44 × 32 | | 192, 192, 32 | 1, 1, 1 | 36 × 44 × 32 | 15,264 |
| | A-Block 3 | 36 × 44 × 32 | | 192, 192, 64 | 1, 2, 1 | 18 × 22 × 64 | 21,504 |
| | B-Block 3 | 18 × 22 × 64 | | 384, 384, 64 | 1, 1, 1 | 18 × 22 × 64 | 55,104 |
| | A-Block 4 | 18 × 22 × 64 | | 384, 384, 320 | 1, 1, 1 | 18 × 22 × 320 | 154,176 |
| | C-Block 1 | 18 × 22 × 320 | | 320, 320, 320, 320 | 1, 1, 1, 1 | 18 × 22 × 1024 | 340,992 |
| | Conv 3 | 18 × 22 × 1024 | | 256 | 1 | 18 × 22 × 256 | 262,912 |
| Decoder | TP-Conv 1* | 18 × 22 × 256 | | 256 | 4 | 72 × 88 × 256 | 4,194,560 |
| | Conv 4* | 72 × 88 × 144 | | 48 | 1 | 72 × 88 × 48 | 7,056 |
| | Depth Concatenation | 72 × 88 × 256 | – | – | – | 72 × 88 × 304 | – |
| | G-Conv 2 | 72 × 88 × 304 | | 304 | 1 | 72 × 88 × 304 | 3,040 |
| | Conv 5 | 72 × 88 × 304 | | 256 | 1 | 72 × 88 × 256 | 78,592 |
| | G-Conv 3 | 72 × 88 × 256 | | 256 | 1 | 72 × 88 × 256 | 2,560 |
| | Conv 6 | 72 × 88 × 256 | | 320 | 1 | 72 × 88 × 320 | 82,880 |
| | C-Block 2 | 72 × 88 × 320 | | 320, 320, 320, 320 | 1, 1, 1, 1 | 72 × 88 × 1024 | 340,992 |
| | Conv 7 | 72 × 88 × 1024 | | 256 | 1 | 72 × 88 × 256 | 262,912 |
| | Conv 8 | 72 × 88 × 256 | | 2 | 1 | 72 × 88 × 2 | 514 |
| | TP-Conv 2 | 72 × 88 × 2 | | 2 | 4 | 288 × 352 × 2 | 258 |
| | Softmax | 288 × 352 × 2 | – | – | – | 288 × 352 × 2 | – |
| | Pixel Classification | – | – | – | – | – | – |
*Output tensors of these layers are fed to the depth concatenation layer; **dilation rate (DR); #Par.: Total number of parameters; '–': Not applicable.
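The C-Blocks apply multiscale dilated convolutions and fuse the responses along the channel axis. A minimal single-channel sketch of the mechanism in numpy (the actual kernel sizes, dilation rates, and channel counts of the C-Blocks are not fully specified in this excerpt, so the values below are illustrative):

```python
import numpy as np

def dilate_kernel(k, rate):
    """Insert (rate - 1) zeros between kernel taps; a k-by-k kernel
    grows to an effective size of rate*(k-1)+1 without new weights."""
    if rate == 1:
        return k
    kh, kw = k.shape
    d = np.zeros(((kh - 1) * rate + 1, (kw - 1) * rate + 1))
    d[::rate, ::rate] = k
    return d

def conv2d_same(x, k):
    """Naive 'same'-padded 2D cross-correlation (odd-sized kernels)."""
    kh, kw = k.shape
    xp = np.pad(x, ((kh // 2, kh // 2), (kw // 2, kw // 2)))
    out = np.zeros_like(x, dtype=float)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i, j] = np.sum(xp[i:i + kh, j:j + kw] * k)
    return out

def multiscale_dilated_fusion(x, kernel, rates):
    """Run the same kernel at several dilation rates and stack the
    responses channel-wise, in the spirit of a C-Block fusion."""
    return np.stack([conv2d_same(x, dilate_kernel(kernel, r)) for r in rates])

rng = np.random.default_rng(0)
x = rng.random((16, 16))
identity = np.zeros((3, 3))
identity[1, 1] = 1.0  # identity kernel: output should equal the input
fused = multiscale_dilated_fusion(x, identity, rates=[1, 2, 3, 4])
print(fused.shape)  # (4, 16, 16): one response map per dilation rate
```

Because the dilated kernels share their weights with the base kernel, the receptive field grows with the rate while the parameter count stays fixed, which is the usual motivation for dilated fusion blocks.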
Fig. 5. Training/validation accuracies and losses of the proposed DMDF-Net for (a) lung segmentation (Exp#1) and (b) COVID-19 infection segmentation (Exp#2).
Average five-fold performance of the proposed lung segmentation network (DMDF-Net-1) for lung segmentation (Exp#1). These results also highlight the significance of multiscale deep feature fusion using multiscale dilated convolution (C-Blocks) and transfer learning in Exp#1 as an ablation study. (unit: %).
| Transfer Learning | C-Blocks | Dataset | DICE (Std) | IoU (Std) | AP (Std) | SEN (Std) | SPE (Std) | EM (Std) | MAE (Std) |
|---|---|---|---|---|---|---|---|---|---|
| Without Transfer Learning | Without | Left lung | 43.86 (5.42) | 33.24 (5.38) | 52.38 (1.84) | 63.39 (7.42) | 60.12 (7.93) | 36.84 (4.79) | 3.971 (0.788) |
| Right lung | 43.09 (2.4) | 32.24 | 52.02 (0.87) | 62.04 (5.11) | 58.44 (4.28) | 36.49 (2.86) | 4.137 (0.428) | ||
| Both lung | 48.52 | 35.75 | 54.08 (1.89) | 62.34 (5.29) | 60.3 (5.87) | 43.21 (4.37) | 3.95 (0.578) | ||
| With | Left lung | 41.88 (3.61) | 31.37 | 51.52 (1.12) | 59.0 (5.28) | 57.63 (6.12) | 35.17 (3.9) | 4.231 (0.608) | |
| Right lung | 44.56 (5.11) | 33.82 | 52.43 (1.74) | 62.4 (7.8) | 60.93 (8.3) | 38.2 (5.05) | 3.899 (0.826) | ||
| Both lung | 52.65 (4.69) | 39.74 (4.55) | 56.28 (2.38) | 67.32 (4.53) | 65.28 (5.8) | 47.6 (5.01) | 3.451 (0.564) | ||
| With Transfer Learning | Without | Left lung | 94.45 | 89.92 | 90.75 (0.97) | 99.15 (0.17) | 98.89 (0.24) | 94.36 (1.7) | 0.11 (0.023) |
| Right lung | 97.76 | 95.71 | 96.3 (1.11) | 99.1 (1.41) | 99.57 (0.14) | 97.81 (1.22) | 0.046 (0.02) | ||
| Both lung | 98.41 | 96.91 | 97.6 (0.59) | 99.07 (1.02) | 99.48 (0.08) | 98.29 (0.5) | 0.057 (0.01) | ||
| With | Left lung | ||||||||
| Right lung | |||||||||
| Both lung | |||||||||
#Par.: Number of parameters; M: Million; Std: Standard deviation; EM: Enhance-Alignment Measure; The best results are shown in boldface.
Comparative results of the proposed encoder design versus the original MobileNetV2 backbone, and of the adopted BCE loss versus the conventional CE loss function, for lung segmentation (Exp#1). (unit: %).
| | Backbone/Loss | Dataset | DICE (Std) | IoU (Std) | AP (Std) | SEN (Std) | SPE (Std) | EM (Std) | MAE (Std) |
|---|---|---|---|---|---|---|---|---|---|
| Backbone Networks | MobileNetV2 | Left Lung | 94.06 (1.65) | 88.97 (1.13) | 89.97 (1.16) | 98.77 (0.25) | 98.79 (0.18) | 94.07 (1.68) | 0.121 (0.016) |
| Proposed | |||||||||
| MobileNetV2 | Right Lung | 95.49 (1.88) | 95.33 (0.7) | 95.83 (0.5) | 99.4 (0.43) | 99.5 (0.11) | 97.71 (0.51) | 0.05 (0.013) | |
| Proposed | |||||||||
| MobileNetV2 | Both Lung | 96.26 (1.63) | 96.21 (0.46) | 96.86 (0.27) | 99.34 (0.39) | 99.28 (0.15) | 98.07 (0.41) | 0.072 (0.016) | |
| Proposed | |||||||||
| Loss Functions | CE | Left Lung | 94.62 (0.48) | 90.19 (0.79) | 90.94 (0.83) | 99.35 (0.23) | 98.93 (0.12) | 94.48 (1.6) | 0.105 (0.011) |
| BCE (our) | |||||||||
| CE | Right Lung | 98.39 (0.24) | 96.86 (0.44) | 97.16 (0.42) | 99.68 (0.04) | 98.71 (0.03) | 0.032 (0.005) | | |
| BCE (our) | 99.64 (0.28) | ||||||||
| CE | Both Lung | 98.51 (0.12) | 97.09 (0.22) | 97.55 (0.18) | 99.44 (0.1) | 0.054 (0.009) | | | |
| BCE (our) | 99.57 (0.12) | 98.73 (0.35) | |||||||
#Fold: Fold number; Avg.: Average results; Std: Standard deviation; The best results are shown in boldface.
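The table above compares the adopted binary cross-entropy (BCE) loss against the conventional cross-entropy. As a sketch of a pixel-wise BCE with optional class weighting (the paper's exact formulation is not given in this excerpt, so the weighting scheme is an assumption):

```python
import numpy as np

def weighted_bce(p, y, w_pos=1.0, w_neg=1.0, eps=1e-7):
    """Pixel-wise binary cross-entropy.
    p: predicted foreground probabilities in (0, 1); y: binary ground truth.
    w_pos/w_neg weight the foreground/background terms, which can help
    when lesions occupy only a tiny fraction of the pixels."""
    p = np.clip(p, eps, 1.0 - eps)
    loss = -(w_pos * y * np.log(p) + w_neg * (1 - y) * np.log(1 - p))
    return loss.mean()

y = np.array([[1, 0], [0, 0]], dtype=float)
good = np.array([[0.9, 0.1], [0.1, 0.1]])  # close to the ground truth
bad  = np.array([[0.1, 0.9], [0.9, 0.9]])  # far from the ground truth
print(weighted_bce(good, y))  # ≈ 0.105 (= -ln 0.9)
print(weighted_bce(bad, y))   # ≈ 2.303 (= -ln 0.1)
```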
Quantitative results of the proposed infection segmentation network (DMDF-Net-2) for COVID-19 infection segmentation (Exp#2). These results also highlight the significance of transfer learning, pre-processing, post-processing, and multiscale deep feature fusion using multiscale dilated convolution (C-Blocks) in Exp#2 as an ablation study. (unit: %).
| Transfer Learning | C-Blocks (#Par. (M)) | Pre-Proc. | Post-Proc. | DICE | IoU | AP | SEN | SPE | EM | MAE |
|---|---|---|---|---|---|---|---|---|---|---|
| Without Transfer Learning | Without | 36.59 | 28.69 | 50.03 | 50.27 | 57.2 | 45.04 | 4.281 | ||
| ✓ | 43.37 | 38.06 | 50.02 | 27.14 | 76 | 57 | 2.409 | |||
| ✓ | 36.48 | 28.53 | 50.05 | 55.5 | 56.86 | 44.87 | 4.314 | |||
| ✓ | ✓ | 43.42 | 38.11 | 50.04 | 31.84 | 76.06 | 57.04 | 2.403 | ||
| With | 33.62 | 25.2 | 50.01 | 53.56 | 50.23 | 40.59 | 4.976 | |||
| ✓ | 44.99 | 40.65 | 50.02 | 21.61 | 81.2 | 60.5 | 1.892 | |||
| ✓ | 33.76 | 25.34 | 50.02 | 54.7 | 50.52 | 40.75 | 4.948 | |||
| ✓ | ✓ | 44.72 | 40.12 | 50.06 | 29.62 | 80.07 | 59.92 | 2.003 | ||
| With Transfer Learning | Without | 71.94 | 63.98 | 68.99 | 52.13 | 99.84 | 89.98 | 0.026 | ||
| ✓ | 71.63 | 63.73 | 69.13 | 49.94 | 99.84 | 89.98 | 0.025 | |||
| ✓ | 73.96 | 65.67 | 67.74 | 74.32 | 99.74 | 89.98 | 0.031 | |||
| ✓ | ✓ | 74.41 | 66.06 | 68.28 | 73.88 | 99.75 | 90.15 | 0.03 | ||
| With | 72.91 | 64.78 | 68.38 | 61.07 | 99.8 | 91.02 | 0.028 | |||
| ✓ | 73.13 | 64.96 | 68.74 | 60.66 | 99.8 | 91.13 | 0.027 | |||
| ✓ | 75.3 | 66.86 | 69.43 | 99.78 | 91.01 | 0.027 | ||||
| ✓ | ✓ | 72.78 | ||||||||
×: Not included; ✓: Included; #Par.: Number of parameters; M: Million; The best results are shown in boldface.
Fig. 6. Visual output results of the proposed framework with and without the pre- and post-processing stages for COVID-19 infection segmentation (Exp#2).
Comparative results of the proposed encoder design versus the original MobileNetV2 backbone, and of the adopted BCE loss versus the conventional CE loss function, for COVID-19 infection segmentation (Exp#2). (unit: %).
| | Backbone/Loss | Pre-/Post-Processing | DICE | IoU | AP | SEN | SPE | EM | MAE |
|---|---|---|---|---|---|---|---|---|---|
| Backbone Networks | MobileNetV2 | Without Pre- and Post-Processing | 73.2 | 65.02 | 69 | 59.79 | 99.81 | 88.44 | 0.027 |
| Proposed | 72.91 | 64.78 | 68.38 | 61.07 | 99.8 | 91.02 | 0.028 | ||
| MobileNetV2 | With Pre-Processing | 72.84 | 64.7 | 66.76 | 72.21 | 99.72 | 89.35 | 0.033 | |
| Proposed | 75.3 | 66.86 | 69.43 | 72.95 | 99.78 | 91.01 | 0.027 | ||
| MobileNetV2 | With Pre- and Post-Processing | 73.4 | 65.18 | 67.42 | 71.72 | 99.74 | 89.54 | 0.031 | |
| Proposed | |||||||||
| Loss Functions | CE | Without Pre- and Post-Processing | 69.37 | 61.87 | 63.26 | 72.92 | 99.61 | 88.83 | 0.044 |
| BCE (our) | 72.91 | 64.78 | 68.38 | 61.07 | 99.8 | 91.02 | 0.028 | ||
| CE | With Pre-Processing | 71.86 | 63.87 | 65.23 | 78.33 | 99.65 | 89.25 | 0.039 | |
| BCE (our) | 75.3 | 66.86 | 69.43 | 72.95 | 99.78 | 91.01 | 0.027 | ||
| CE | With Pre- and Post-Processing | 72.37 | 64.3 | 65.76 | 99.67 | 89.53 | 0.037 | | |
| BCE (our) | 72.78 | ||||||||
The best results are shown in boldface.
Quantitative performance comparison of pre-ROI fusion versus post-ROI fusion, with and without applying the pre-processing step, for COVID-19 infection segmentation (Exp#2). (unit: %).
| Transfer Learning | Pre-Processing | Pre-ROI Fusion | Post-ROI Fusion | DICE | IoU | AP | SEN | SPE | EM | MAE |
|---|---|---|---|---|---|---|---|---|---|---|
| Without Transfer Learning | 33.62 | 25.2 | 50.01 | 53.56 | 50.23 | 40.59 | 4.976 | |||
| ✓ | 33.76 | 25.34 | 50.02 | 54.7 | 50.52 | 40.75 | 4.948 | |||
| ✓ | 36.08 | 28.09 | 50.0 | 43.67 | 56.05 | 44.29 | 4.398 | |||
| ✓ | 44.99 | 40.65 | 50.02 | 21.61 | 81.2 | 60.5 | 1.892 | |||
| ✓ | ✓ | 36.3 | 28.35 | 50.0 | 43.38 | 56.57 | 44.59 | 4.345 | ||
| ✓ | ✓ | 44.72 | 40.12 | 50.06 | 29.62 | 80.07 | 59.92 | 2.003 | ||
| With Transfer Learning | 72.91 | 64.78 | 68.38 | 61.07 | 99.8 | 91.02 | 0.028 | |||
| ✓ | 75.3 | 66.86 | 69.43 | 72.95 | 99.78 | 91.01 | 0.027 | |||
| ✓ | 52.09 | 49.46 | 51.65 | 95.64 | 75.76 | 0.439 | ||||
| ✓ | 73.13 | 64.96 | 68.74 | 60.66 | 0.027 | |||||
| ✓ | ✓ | 67.46 | 60.45 | 62.65 | 56.94 | 99.68 | 89.39 | 0.041 | ||
| ✓ | ✓ | 72.78 | 99.79 | 91.11 | ||||||
×: Not included; ✓: Included; The best results are shown in boldface.
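Post-ROI fusion intersects the predicted infection mask with the predicted lung mask, so any infection response falling outside the lungs is discarded as a false positive. A minimal sketch under that reading (the authors' full post-processing step may include additional operations):

```python
import numpy as np

def post_roi_fusion(infection_mask, lung_mask):
    """Keep only infection pixels that lie inside the lung ROI;
    responses outside the lungs are treated as false positives."""
    return np.logical_and(infection_mask, lung_mask)

def dice(pred, gt):
    inter = np.logical_and(pred, gt).sum()
    return 2.0 * inter / (pred.sum() + gt.sum())

lung = np.array([[0, 1, 1, 0],
                 [0, 1, 1, 0],
                 [0, 0, 0, 0]], dtype=bool)
gt_inf = np.array([[0, 1, 0, 0],
                   [0, 0, 0, 0],
                   [0, 0, 0, 0]], dtype=bool)
# Raw network output: one true hit plus a spurious response outside the lung.
raw_inf = np.array([[0, 1, 0, 0],
                    [0, 0, 0, 0],
                    [0, 0, 1, 0]], dtype=bool)

fused = post_roi_fusion(raw_inf, lung)
print(dice(raw_inf, gt_inf), dice(fused, gt_inf))  # 0.666... -> 1.0
```

Pre-ROI fusion, by contrast, applies the lung mask to the input image before the infection network runs; the tables above suggest the post-ROI variant is the stronger choice in this framework.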
Fig. 7. Visual output results of pre-ROI fusion versus post-ROI fusion for COVID-19 infection segmentation (Exp#2) after applying the pre-processing step and training DMDF-Net-2 through transfer learning.
Comparative results of same-dataset versus cross-dataset testing for COVID-19 infection segmentation (Exp#2). (unit: %).
| Dataset (Train/Test) | Pre-/Post-Processing | DICE (Std) | IoU (Std) | AP (Std) | SEN (Std) | SPE (Std) | EM (Std) | MAE (Std) |
|---|---|---|---|---|---|---|---|---|
| COVID-19-CT-Seg | 81.51 (7.52) | 73.35 (7.81) | 75.63 (8.85) | 98.89 (1.09) | 87.69 (5.56) | 0.121 (0.106) | ||
| ✓ | 90.51 (5.21) | |||||||
| COVID-19-CT-Seg/MosMed | 72.91 | 64.78 | 68.38 | 61.07 | 91.02 | 0.028 | ||
| ✓ | 99.79 | |||||||
The best results are shown in boldface.
Quantitative results of the proposed DMDF-Net-1 compared with different state-of-the-art segmentation networks for the lung segmentation task (Exp#1). (unit: %).
| | Methods | #Par. (M) | FLOPs (G) | PT (FPS) | DICE (Std) | IoU (Std) | AP (Std) | SEN (Std) | SPE (Std) | EM (Std) | MAE (Std) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Left Lung | 3D U-Net | – | – | – | 85.8 (10.5) | – | – | – | – | – | – |
| | CoSinGAN + 2D U-Net | – | – | – | 93.9 | – | – | – | – | – | – |
| | SegNet (VGG16) | 29.44 | 123.91 | 10.31 | 92.91 (0.78) | 87.43 (1.23) | 88.79 (1.14) | 97.78 (0.25) | 98.61 (0.29) | 93.16 (1.66) | 0.143 (0.027) |
| | SegNet (VGG19) | 40.07 | 157.54 | 9.62 | 93.37 (0.66) | 88.16 (1.03) | 89.6 (1.14) | 97.43 (0.58) | 98.76 (0.09) | 93.98 (0.57) | 0.13 (0.01) |
| | U-Net (Encoder Depth: 4) | 31.03 | 155.69 | 23.26 | 90.85 (3.98) | 84.54 (5.74) | 87.98 (6.84) | 93.38 (11.06) | 98.46 (1.08) | 90.96 (4.48) | 0.183 (0.081) |
| | FCN (Upsampling Factor: 32) | 134.29 | 187.34 | 18.52 | 91.39 (0.57) | 85.07 (0.85) | 86.97 (0.91) | 96.15 (0.58) | 98.35 (0.27) | 91.15 (1.56) | 0.175 (0.025) |
| | DeepLabV3+ (ResNet) | 20.61 | 58.42 | 31.25 | 89.97 (0.81) | 90.89 (0.85) | 98.85 (0.17) | 98.93 (0.1) | 94.79 (1.12) | 0.107 (0.009) | |
| | DAL-Net | 6.65 | 35.81 | 23.26 | 94.52 (0.44) | 90.03 (0.73) | 90.89 (0.67) | 99.05 (0.24) | 98.92 (0.17) | 0.108 (0.017) | |
| | DeepLabV3+ (MobileNetV2) | 6.78 | 30.89 | 20.41 | 94.06 (1.65) | 88.97 (1.13) | 89.97 (1.16) | 98.77 (0.25) | 98.79 (0.18) | 94.07 (1.68) | 0.121 (0.016) |
| | DMDF-Net-1 (Proposed) | 5.85 | 37.45 | 25.64 | 94.86 (0.4) | 94.67 (1.54) | | | | | |
| Right Lung | 3D U-Net | – | – | – | 87.9 | – | – | – | – | – | – |
| | CoSinGAN + 2D U-Net | – | – | – | 94.6 | – | – | – | – | – | – |
| | SegNet (VGG16) | 29.44 | 123.91 | 10.31 | 96.47 (0.3) | 93.36 (0.53) | 94.01 (0.52) | 99.27 (0.37) | 99.27 (0.08) | 96.54 (0.83) | 0.072 (0.007) |
| | SegNet (VGG19) | 40.07 | 157.54 | 9.62 | 96.44 (0.26) | 93.3 (0.46) | 94.06 (0.5) | 98.98 (1.07) | 99.28 (0.05) | 96.46 (0.46) | 0.072 (0.008) |
| | U-Net (Encoder Depth: 4) | 31.03 | 155.69 | 23.26 | 94.78 (5.21) | 91.01 (7.84) | 96.69 (1.65) | 87.89 (16.73) | 99.67 (0.24) | 93.3 (6.51) | 0.091 (0.074) |
| | FCN (Upsampling Factor: 32) | 134.29 | 187.34 | 18.52 | 94.04 (0.44) | 89.22 (0.7) | 90.4 (0.85) | 98.35 (0.53) | 98.74 (0.05) | 93.56 (0.82) | 0.128 (0.006) |
| | DeepLabV3+ (ResNet) | 20.61 | 58.42 | 31.25 | 96.74 (0.73) | 95.94 (0.28) | 96.44 (0.27) | 99.32 (0.56) | 99.59 (0.03) | 98.15 (0.21) | 0.043 (0.005) |
| | DAL-Net | 6.65 | 35.81 | 23.26 | 98.13 (0.32) | 96.39 (0.59) | 96.91 (0.49) | 99.21 (0.94) | 99.65 (0.02) | 98.33 (0.44) | 0.037 (0.007) |
| | DeepLabV3+ (MobileNetV2) | 6.78 | 30.89 | 20.41 | 95.49 (1.88) | 95.33 (0.7) | 95.83 (0.5) | 99.4 (0.43) | 99.5 (0.11) | 97.71 (0.51) | 0.05 (0.013) |
| | DMDF-Net-1 (Proposed) | 5.85 | 37.45 | 25.64 | | | | | | | |
| Both Lung | SegNet (VGG16) | 29.44 | 123.91 | 10.31 | 97.1 (0.26) | 94.45 (0.47) | 95.38 (0.47) | 99.07 (0.34) | 98.92 (0.09) | 97.34 (0.5) | 0.103 (0.009) |
| | SegNet (VGG19) | 40.07 | 157.54 | 9.62 | 97.16 (0.53) | 94.57 (0.96) | 95.46 (1.01) | 99.12 (0.55) | 98.95 (0.13) | 97.09 (0.92) | 0.102 (0.012) |
| | U-Net (Encoder Depth: 4) | 31.03 | 155.69 | 23.26 | 85.76 (18.93) | 80.99 (19.36) | 90.18 (3.79) | 80.73 (42.08) | 98.04 (1.62) | 82.37 (22.69) | 0.4 (0.411) |
| | FCN (Upsampling Factor: 32) | 134.29 | 187.34 | 18.52 | 94.21 (0.58) | 89.42 (0.96) | 91.12 (1.06) | 98.33 (0.38) | 97.73 (0.19) | 93.31 (0.61) | 0.221 (0.015) |
| | DeepLabV3+ (ResNet) | 20.61 | 58.42 | 31.25 | 96.68 (0.89) | 96.42 (0.27) | 97.12 (0.34) | 99.17 (0.59) | 99.36 (0.02) | 98.28 (0.22) | 0.067 (0.007) |
| | DAL-Net | 6.65 | 35.81 | 23.26 | 98.33 (0.25) | 96.75 (0.47) | 97.37 (0.25) | 99.28 (0.55) | 99.41 (0.09) | 98.28 (0.51) | 0.061 (0.014) |
| | DeepLabV3+ (MobileNetV2) | 6.78 | 30.89 | 20.41 | 96.26 (1.63) | 96.21 (0.46) | 96.86 (0.27) | 99.34 (0.39) | 99.28 (0.15) | 98.07 (0.41) | 0.072 (0.016) |
| | DMDF-Net-1 (Proposed) | 5.85 | 37.45 | 25.64 | | | | | | | |
#Par.: Number of parameters; M: Million; G: Giga; PT: Processing time; FPS: Frames per second; EM: Enhance-Alignment Measure; '–': Not available; Std: Standard deviation; The best results are shown in boldface.
Fig. 8. Visual comparison of lung segmentation results of the proposed framework with the other state-of-the-art deep segmentation models.
Quantitative results of the proposed DMDF-Net-2 compared with different state-of-the-art segmentation networks for the infection segmentation task (Exp#2). (unit: %).
| | Methods | #Par. (M) | FLOPs (G) | PT (FPS) | DICE | IoU | AP | SEN | SPE | EM | MAE |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Without Pre- and Post-Processing | 3D U-Net | – | – | – | 58.8 | – | – | – | – | – | – |
| | CoSinGAN + 2D U-Net | – | – | – | 47.4 | – | – | – | – | – | – |
| | U-Net (Encoder Depth: 4) | 31.03 | 155.69 | 23.26 | 59.49 | 55.13 | 58.93 | 20.44 | 99.82 | 82.41 | 0.083 |
| | SegNet (VGG19) | 40.07 | 157.54 | 9.62 | 66.32 | 59.61 | 61.64 | 55.4 | 99.65 | 85.35 | 0.044 |
| | FCN (Upsampling Factor: 32) | 134.29 | 187.34 | 18.52 | 68.16 | 60.97 | 63.35 | 57.3 | 99.7 | 83.72 | 0.039 |
| | SegNet (VGG16) | 29.44 | 123.91 | 10.31 | 66.81 | 60 | 63.97 | 42.44 | 99.79 | 84.65 | 0.032 |
| | DeepLabV3+ (ResNet) | 20.61 | 58.42 | 31.25 | 68.12 | 60.97 | 65.81 | 42.63 | 99.82 | 88.26 | 0.029 |
| | DAL-Net | 6.65 | 35.81 | 23.26 | 69.76 | 62.18 | 63.7 | 71.88 | 99.63 | 88.13 | 0.042 |
| | DeepLabV3+ (MobileNetV2) | 6.78 | 30.89 | 20.41 | 59.79 | 88.44 | | | | | |
| | DMDF-Net-2 (Proposed) | 5.85 | 37.45 | 25.64 | 72.91 | 64.78 | 68.38 | 99.8 | 0.028 | | |
| With Pre-Processing | U-Net (Encoder Depth: 4) | 31.03 | 155.69 | 9.9 | 64.36 | 58.26 | 61.34 | 39.52 | 99.74 | 86.95 | 0.078 |
| | SegNet (VGG19) | 40.07 | 157.54 | 6.17 | 64.5 | 58.24 | 59.04 | 77.28 | 99.32 | 83.27 | 0.072 |
| | FCN (Upsampling Factor: 32) | 134.29 | 187.34 | 8.93 | 66.14 | 59.44 | 60.48 | 72.29 | 99.47 | 82.54 | 0.058 |
| | SegNet (VGG16) | 29.44 | 123.91 | 6.45 | 66.14 | 59.46 | 60.92 | 63.17 | 99.56 | 85.24 | 0.051 |
| | DeepLabV3+ (ResNet) | 20.61 | 58.42 | 11.11 | 70.1 | 62.46 | 64.64 | 64.72 | 99.7 | 89.18 | 0.037 |
| | DAL-Net | 6.65 | 35.81 | 9.9 | 72.5 | 64.41 | 66.01 | 99.69 | 89.65 | 0.036 | |
| | DeepLabV3+ (MobileNetV2) | 6.78 | 30.89 | 9.35 | 72.84 | 64.7 | 66.76 | 72.21 | 99.72 | 89.35 | 0.033 |
| | DMDF-Net-2 (Proposed) | 5.85 | 37.45 | 10.31 | 72.95 | | | | | | |
| With Pre- and Post-Processing | U-Net (Encoder Depth: 4) | 36.88 | 193.14 | 7.14 | 67.64 | 60.62 | 66.06 | 39.3 | 85.07 | 0.057 | |
| | SegNet (VGG19) | 45.92 | 194.99 | 4.98 | 66.9 | 59.98 | 60.93 | 99.47 | 84.68 | 0.058 | |
| | FCN (Upsampling Factor: 32) | 140.14 | 224.79 | 6.62 | 68.95 | 61.55 | 62.95 | 71.87 | 99.6 | 84.08 | 0.045 |
| | SegNet (VGG16) | 35.29 | 161.36 | 5.15 | 69.54 | 62.03 | 64.25 | 62.77 | 99.7 | 86.53 | 0.038 |
| | DeepLabV3+ (ResNet) | 26.46 | 95.87 | 70.53 | 62.81 | 65.11 | 64.56 | 99.71 | 89.24 | 0.036 | |
| | DAL-Net | 12.5 | 73.26 | 7.14 | 73.17 | 64.98 | 66.72 | 76.09 | 99.71 | 89.87 | 0.034 |
| | DeepLabV3+ (MobileNetV2) | 12.63 | 6.85 | 73.56 | 65.32 | 67.57 | 71.98 | 99.74 | 89.54 | 0.031 | |
| | DMDF-Net-2 (Proposed) | 74.9 | 7.35 | 72.78 | | | | | | | |
#Par.: Number of parameters; M: Million; G: Giga; PT: Processing time; FPS: Frames per second; EM: Enhance-Alignment Measure; '–': Not available; The best results are shown in boldface.
Fig. 9. Visual comparison of COVID-19 infection segmentation results of the proposed framework with the other state-of-the-art deep segmentation models (a) without pre- and post-processing, (b) with pre-processing, and (c) with pre- and post-processing (applying the same lung segmentation network).
Fig. 10. Infection quantification results of the proposed diagnostic framework for some CT images, including both (a) infectious and (b) normal data samples. (PIAL: Proportion of the infected area of the lung).
Fig. 11. Multiscale class activation map (CAM) visualizations of the proposed (a) DMDF-Net-1 for lung segmentation (Exp#1) and (b) DMDF-Net-2 for COVID-19 infection segmentation (Exp#2).
(Algorithm listing damaged during extraction; only two fragments are recoverable: "// Split the whole training data into" and "// Check validation accuracy after every epoch to avoid overfitting".)