Narges Saeedizadeh, Shervin Minaee, Rahele Kafieh, Shakib Yazdani, Milan Sonka.
Abstract
The novel coronavirus disease (COVID-19) pandemic has caused a major outbreak in more than 200 countries around the world, severely impacting the health and lives of many people globally. By October 2020, more than 44 million people had been infected, and more than 1,000,000 deaths had been reported. Computed Tomography (CT) images can be used as an alternative to the time-consuming RT-PCR test to detect COVID-19. In this work we propose a segmentation framework to detect chest regions in CT images that are infected by COVID-19. An architecture similar to a Unet model was employed to detect ground-glass regions at the voxel level. As the infected regions tend to form connected components (rather than randomly distributed voxels), a suitable regularization term based on 2D anisotropic total variation was developed and added to the loss function. The proposed model is therefore called "TV-Unet". Experimental results obtained on a relatively large-scale CT segmentation dataset of around 900 images show that incorporating this new regularization term leads to a 2% gain in overall segmentation performance compared to the Unet trained from scratch. Our experimental analysis, ranging from visual evaluation of the predicted segmentation results to quantitative assessment of segmentation performance (precision, recall, Dice score, and mIoU), demonstrated great ability to identify COVID-19-associated regions of the lungs, achieving an mIoU rate of over 99% and a Dice score of around 86%.
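The BCE-plus-TV loss described in the abstract can be sketched as follows. This is a minimal NumPy illustration of a 2D anisotropic total-variation penalty added to binary cross-entropy, not the authors' implementation; the function names and the `tv_weight` argument (the λ weight) are assumptions:

```python
import numpy as np

def anisotropic_tv(p):
    """Anisotropic 2D total variation of a probability map p (H x W):
    sum of absolute differences between vertically and horizontally
    adjacent pixels. Scattered foreground pixels incur many large
    neighbor differences, so the penalty favors connected regions."""
    dv = np.abs(p[1:, :] - p[:-1, :]).sum()  # vertical neighbor differences
    dh = np.abs(p[:, 1:] - p[:, :-1]).sum()  # horizontal neighbor differences
    return dv + dh

def bce(p, y, eps=1e-7):
    """Mean pixel-wise binary cross-entropy between prediction p and mask y."""
    p = np.clip(p, eps, 1 - eps)
    return -(y * np.log(p) + (1 - y) * np.log(1 - p)).mean()

def tv_unet_loss(p, y, tv_weight):
    """Binary cross-entropy plus the TV regularization term, weighted by tv_weight."""
    return bce(p, y) + tv_weight * anisotropic_tv(p)
```

The tables below sweep this weight over choices such as 1, 1/(number of pixels), and 1/(256 × number of pixels).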
Keywords: COVID-19; Computed tomography; Convolutional encoder decoder; Deep learning; Image segmentation; Total variation
Year: 2021 PMID: 34337587 PMCID: PMC8056883 DOI: 10.1016/j.cmpbup.2021.100007
Source DB: PubMed Journal: Comput Methods Programs Biomed Update ISSN: 2666-9900
Fig. 1 The architecture of the Unet model.
Fig. 2 The difference between normal and COVID-19 images.
Fig. 3 Sample images from the COVID-19 CT segmentation dataset. The first row shows two COVID-19 images. The red boundary contours in the second row denote regions of COVID-19 ground-glass pathology and are not a part of the original image data. The third row shows ground-glass masks.
Training/Validation/Testing splits prior to data augmentation.
| Data | Split 1 (Number of Images) | Split 2 (Number of Images) |
|---|---|---|
| Training | 654 | 590 |
| Validation | 75 | 64 |
| Test | 200 | 275 |
| Total | 929 | 929 |
Overall performance with different loss functions, using the best cut-off threshold of 0.3. Best performance shown in bold font.
| Loss | Optimizer | Learning Rate | mIOU | DSC | Average Precision |
|---|---|---|---|---|---|
| BCE | ADAM | 0.001 | 0.993 | 0.839 | 0.92 |
| DSC | ADAM | 0.001 | 0.990 | 0.764 | 0.90 |
| BCE+DSC | ADAM | 0.001 | 0.993 | 0.843 | 0.91 |
| BCE+DSC+TV | ADAM | 0.001 | 0.988 | 0.645 | 0.91 |
| BCE+TV | ADAM | 0.001 | 0.995 | | |
Overall performance for different Optimizer selection, using the best cut-off threshold of 0.3. Best performance shown in bold font.
| Loss | Optimizer | Learning Rate | mIOU | DSC | Average Precision |
|---|---|---|---|---|---|
| BCE+TV | ADAM | 0.001 | 0.995 | | |
| BCE+TV | SGD | 0.001 | 0.985 | 0.573 | 0.8 |
| BCE+TV | Adadelta | 0.001 | 0.991 | 0.780 | 0.9 |
| BCE+TV | Adagrad | 0.001 | 0.992 | 0.784 | 0.9 |
Overall model performance for different Learning Rates, again for the best cut-off threshold of 0.3. Best performance shown in bold font.
| Loss | Optimizer | Learning Rate | mIOU | DSC | Average Precision |
|---|---|---|---|---|---|
| BCE+TV | ADAM | 0.001 | 0.995 | | |
| BCE+TV | ADAM | 0.0001 | 0.993 | 0.838 | 0.92 |
Overall model performance for different values of the TV regularization weight λ, using the best cut-off threshold of 0.3. Best performance shown in bold font.
| Loss | λ | mIOU | DSC | Average Precision |
|---|---|---|---|---|
| BCE+TV | 1 | 0.991 | 0.770 | 0.86 |
| BCE+TV | 1/number of pixels | 0.992 | 0.824 | 0.90 |
| BCE+TV | 1/(256 × number of pixels) | 0.995 | | |
Fig. 4 Predicted segmentation masks by the Unet trained from scratch and the proposed TV-Unet for typical sample images from the testing set.
Precision, recall, Dice score, and mIoU rates of TV-Unet model for different threshold values for Split 1. Confidence intervals provided for recall metric. Best performance shown in bold font.
| Threshold | Recall | Precision | mIoU | DSC |
|---|---|---|---|---|
| 0.1 | 0.955 | 0.736 | 0.992 | 0.831 |
| 0.2 | 0.913 | 0.811 | 0.994 | 0.859 |
| 0.3 | 0.867 | 0.859 | | |
| 0.4 | 0.813 | 0.900 | 0.994 | 0.854 |
| 0.5 | 0.746 | 0.933 | 0.993 | 0.829 |
| 0.6 | 0.662 | 0.959 | 0.992 | 0.783 |
| 0.7 | 0.547 | 0.978 | 0.990 | 0.702 |
| 0.8 | 0.362 | 0.990 | 0.986 | 0.531 |
Precision, recall, Dice score, and mIoU rates of TV-Unet model for different threshold values for Split 2. Best performance shown in bold font.
| Threshold | Recall | Precision | mIoU | DSC |
|---|---|---|---|---|
| 0.1 | 0.892 | 0.626 | 0.987 | 0.7363 |
| 0.2 | 0.833 | 0.700 | 0.989 | 0.7609 |
| 0.3 | 0.781 | 0.750 | | |
| 0.4 | 0.730 | 0.789 | 0.990 | 0.7582 |
| 0.5 | 0.674 | 0.825 | 0.990 | 0.7413 |
| 0.6 | 0.610 | 0.859 | 0.990 | 0.7139 |
| 0.7 | 0.535 | 0.890 | 0.989 | 0.6692 |
| 0.8 | 0.422 | 0.926 | 0.987 | 0.5801 |
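The threshold sweeps in the two tables above can be reproduced with a short routine: binarize the predicted probability map at a cut-off and compute the four metrics from the confusion counts. A NumPy sketch, not the authors' code, under the assumption that mIoU is averaged over the foreground and background classes (which would explain mIoU values near 0.99 alongside lower Dice scores, since the large background class is almost always predicted correctly):

```python
import numpy as np

def segmentation_metrics(prob, mask, threshold=0.3):
    """Binarize a predicted probability map at `threshold` and score it
    against a binary ground-truth mask."""
    pred = prob >= threshold
    gt = mask.astype(bool)
    tp = np.sum(pred & gt)    # infected pixels correctly detected
    fp = np.sum(pred & ~gt)   # healthy pixels flagged as infected
    fn = np.sum(~pred & gt)   # infected pixels missed
    tn = np.sum(~pred & ~gt)  # healthy pixels correctly rejected
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    dice = 2 * tp / (2 * tp + fp + fn) if tp + fp + fn else 0.0
    # mean IoU over the two classes (foreground and background)
    iou_fg = tp / (tp + fp + fn) if tp + fp + fn else 1.0
    iou_bg = tn / (tn + fp + fn) if tn + fp + fn else 1.0
    miou = (iou_fg + iou_bg) / 2
    return precision, recall, dice, miou
```

Raising the threshold trades recall for precision, reproducing the trend visible down the rows of both tables.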
Fig. 5 Precision-Recall curve for Split 1.
Fig. 6 Precision-Recall curve for Split 2.
Comparison of Unet trained from scratch and the proposed TV-Unet model performance in terms of precision, mIOU and DSC for Split 1. Best performance for each method shown in bold font.
| Model | Recall | Precision | mIOU | DSC |
|---|---|---|---|---|
| Unet | 0.975 | 0.575 | 0.985 | 0.727 |
| Unet | 0.945 | 0.688 | 0.990 | 0.798 |
| Unet | 0.91 | 0.765 | 0.992 | 0.832 |
| Unet | 0.85 | 0.834 | 0.993 | |
| TV-Unet | 0.975 | 0.675 | 0.990 | 0.798 |
| TV-Unet | 0.945 | 0.760 | 0.993 | 0.842 |
| TV-Unet | 0.91 | 0.812 | 0.994 | 0.860 |
| TV-Unet | 0.85 | 0.871 | 0.995 | |
Comparison of the Unet trained from scratch and TV-Unet model performance in terms of precision, mIOU and DSC for Split 2. Best performance for each method shown in bold font.
| Model | Recall | Precision | mIOU | DSC |
|---|---|---|---|---|
| Unet | 0.810 | 0.594 | 0.983 | |
| Unet | 0.643 | 0.621 | 0.985 | 0.633 |
| Unet | 0.535 | 0.655 | 0.985 | 0.595 |
| Unet | 0.422 | 0.693 | 0.984 | 0.527 |
| TV-Unet | 0.810 | 0.727 | 0.990 | |
| TV-Unet | 0.643 | 0.842 | 0.990 | 0.729 |
| TV-Unet | 0.535 | 0.890 | 0.989 | 0.670 |
| TV-Unet | 0.422 | 0.926 | 0.987 | 0.580 |
Fig. 7 The training and validation loss of the model during training.
Fig. 8 The training and validation recall of the model during training.
Fig. 9 The training and validation precision of the model during training.
Comparison of the TV-Unet model performance with other recent methods in terms of Sensitivity, Specificity and DSC for pathologic regions on COVID-SemiSeg dataset. Best performance is shown in bold font.
| Model | Sensitivity | Specificity | Dice Score |
|---|---|---|---|
| Unet+ | 0.672 | 0.902 | 0.518 |
| Inf-Net | 0.692 | 0.943 | 0.682 |
| Semi-Inf-Net | 0.725 | 0.960 | 0.739 |
| TV-Unet | | | |
Comparison of the TV-Unet model performance with other methods in terms of Sensitivity, Specificity and DSC for the Ground-Glass mask on COVID-SemiSeg dataset. Best performance shown in bold font.
| Model | Sensitivity | Specificity | Dice Score |
|---|---|---|---|
| DeepLab-v3+ (stride=8) | 0.478 | 0.863 | 0.375 |
| DeepLab-v3+ (stride=16) | 0.713 | 0.823 | 0.443 |
| FCN8s | 0.537 | 0.905 | 0.471 |
| Semi-Inf-Net+FCN8s | 0.720 | 0.941 | 0.646 |
| TV-Unet | | | |
Comparison of the TV-Unet model performance with several other methods in terms of Sensitivity, Specificity and DSC for the Consolidation mask on COVID-SemiSeg dataset. Best performance shown in bold font.
| Model | Sensitivity | Specificity | Dice Score |
|---|---|---|---|
| DeepLab-v3+ (stride=8) | 0.120 | 0.584 | 0.117 |
| DeepLab-v3+ (stride=16) | 0.245 | 0.560 | 0.188 |
| FCN8s | 0.212 | 0.567 | 0.221 |
| Semi-Inf-Net+FCN8s | 0.186 | 0.639 | 0.238 |
| TV-Unet | | | |