Ayat Abedalla, Malak Abdullah, Mahmoud Al-Ayyoub, Elhadj Benkhelifa.
Abstract
Medical imaging refers to visualization techniques that provide valuable information about the internal structures of the human body for clinical applications, diagnosis, treatment, and scientific research. Segmentation is one of the primary methods for analyzing and processing medical images; it helps doctors diagnose accurately by providing detailed information on the relevant part of the body. However, segmenting medical images faces several challenges: it requires trained medical experts and is time-consuming and error-prone. An automatic medical image segmentation system therefore appears necessary. Deep learning algorithms have recently shown outstanding performance on segmentation tasks, especially semantic segmentation networks that provide pixel-level image understanding. Since the introduction of the first fully convolutional network (FCN) for semantic image segmentation, several segmentation networks have been proposed on its basis. One of the state-of-the-art convolutional networks in the medical imaging field is U-Net. This paper presents a novel end-to-end semantic segmentation model for medical images, named Ens4B-UNet, which ensembles four U-Net architectures with pre-trained backbone networks. Ens4B-UNet builds on U-Net's success with several significant improvements: adapting powerful and robust convolutional neural networks (CNNs) as backbones for the U-Net encoders and using nearest-neighbor up-sampling in the decoders. Ens4B-UNet is designed as a weighted-average ensemble of four encoder-decoder segmentation models. The backbone networks of all ensembled models are pre-trained on the ImageNet dataset to exploit the benefits of transfer learning. To improve our models, we apply several training and prediction techniques, including stochastic weight averaging (SWA), data augmentation, test-time augmentation (TTA), and different types of optimal thresholds.
We evaluate and test our models on the 2019 Pneumothorax Challenge dataset, which contains 12,047 training images with 12,954 masks and 3,205 test images. Our proposed segmentation network achieves a 0.8608 mean Dice similarity coefficient (DSC) on the test set, which places it among the top 1% of systems in the Kaggle competition.
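The evaluation metric above is the mean Dice similarity coefficient. As a minimal sketch (assuming binary NumPy masks, with the common challenge convention that an all-empty prediction matching an all-empty ground truth scores 1.0):

```python
import numpy as np

def dice_coefficient(pred, target, empty_score=1.0):
    """Dice similarity coefficient between two binary masks of equal shape.

    If both masks are empty, the score is defined as `empty_score`
    (1.0 here), rewarding correct negative (no-pneumothorax) predictions.
    """
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    total = pred.sum() + target.sum()
    if total == 0:
        return empty_score
    intersection = np.logical_and(pred, target).sum()
    return 2.0 * intersection / total
```

The reported mean DSC is this score averaged over all test images.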
Keywords: EfficientNet-B4; Medical image segmentation; Pneumothorax; ResNet-50; SE-ResNext-50; Test-time augmentation; Transfer learning; U-Net
Year: 2021 PMID: 34307860 PMCID: PMC8279140 DOI: 10.7717/peerj-cs.607
Source DB: PubMed Journal: PeerJ Comput Sci ISSN: 2376-5992
Figure 1. Illustration of deep learning structure.
Figure 2. Illustration of CNN structure.
Figure 3. A ResBlock within a ResNet.
Overview of medical image segmentation methods.
| Reference | Method | Modality | Task | Advantages | Limitations |
|---|---|---|---|---|---|
| Thresholding | |||||
| ( | Entropic | Ultrasonic | Breast and liver | Simple and fast | Sensitive to noise, and unable to segment most medical images |
| ( | Thresholding and morphological | MRI | Brain Tumor | ||
| Region growing | |||||
| ( | Region growing | CT | Organs | Simple, less sensitive to noise, and separates regions fairly accurately | Sensitive to noise and seed values, and requires manual interaction |
| ( | Adaptive region growing | MRI | Brain Tumor | ||
| Deformable | |||||
| ( | Deformable model | MRI | Cardiac | Robust to noise | Requires manual tuning of parameters |
| Clustering | |||||
| ( | FCM | MRI | Ophthalmology | Fast computation, easy to implement, and works well for MRI | Does not utilize spatial information, and does not work on CT images |
| ( | FCM | MRI | Brain | ||
| Classifiers | |||||
| ( | KNN | MRI | Brain abnormalities | Robust, easy to train, and works well for MRI and CT images | Relies on hand-crafted features, and time-consuming |
| ( | Decision Tree | CT | Anatomical regions | ||
| CNN | |||||
| ( | Sliding-window CNN | Microscopy | Neuronal structures | Automatic feature extraction | Time-consuming, and requires a large amount of computing resources |
| ( | Cascaded CNN | MRI | Brain tumor | ||
| FCN | |||||
| ( | FCN | CT | Liver | Processes varying input sizes and performs end-to-end segmentation | Loss of global context information and resolution |
| ( | FCN | MRI | Cardiac | ||
| U-Net | |||||
| ( | U-Net | Microscopy | Cell | Effectively captures localization and context information, and works well with limited training images | Takes a long time to train because of the large number of parameters to learn |
| ( | U-Net | MRI | Brain tumor | ||
| ( | H-DenseU-Net | CT | Liver lesion | ||
| ( | U-SegNet | MRI | Brain tissue | ||
| ( | UNet-VGG16 | MRI | Brain tumor | ||
| ( | HTTU-Net | MRI | Brain tumor | ||
| ( | MRBSU-Net | EUS | GIST | ||
| ( | U-Net | CT | Esophagus | ||
Solutions and results for the top winning teams in the pneumothorax challenge.
| Rank | Team | Network | Encoder | Techniques | Score |
|---|---|---|---|---|---|
| 1 | [dsmlkz] sneddy | U-Net | ResNet (34, 50), SE-ResNext-50 | Triplet threshold | 0.8679 |
| 2 | X5 | Deeplabv3+, U-Net | SE-ResNext (50, 101), EfficientNet (B3, B5) | Segmentation with Classification | 0.8665 |
| 3 | Bestfitting | U-Net | ResNet-34, SE-ResNext-50 | Lung segmentation and CBAM attention | 0.8651 |
| 4 | [ods.ai] amirassov | U-Net | ResNet-34 | Deep supervision | 0.8644 |
| 5 | Earhian | U-Net | SE-ResNext (50, 101) | ASPP and Semi-supervision | 0.8643 |
Figure 4. The proposed methodology of pneumothorax segmentation.
(A) Overview of the proposed pipeline of pneumothorax segmentation. (B) Overview of the pre-processing steps of the DICOM files and Mask values.
Dataset overview.
| Attribute | Training set | Validation set |
|---|---|---|
| Number of cases | 10,842 | 1,205 |
| Number of positive cases | 2,405 | 264 |
| Number of negative cases | 8,437 | 941 |
| Number of cases with a single mask | 1,853 | 192 |
| Number of cases with multiple masks | 552 | 72 |
Figure 5. Examples of X-ray images (left), the masks (middle), and the X-ray images with masks (right).
Figure 6. An example of applying augmentation operations.
Figure 7. The architecture of the proposed semantic segmentation networks for pneumothorax segmentation from chest X-ray images.
Figure 8. Architecture of the Ens4B-UNet framework.
Training parameters for all experiments.
| Experiment | B-UNet Network | Optimizer | LR Schedule | Batch Size | Epochs |
|---|---|---|---|---|---|
| EXP1 | ResNet50-UNet | SGD | 1e−3 to 1e−5 | 10 | 60 |
| EXP2 | DenseNet169-UNet | Adam | 1e−4 to 1e−6 | 6 | 80 |
| EXP3 | SE-ResNext50-UNet | Adam | 1e−3 to 1e−5 | 4 | 100 |
| EXP4 | SE-ResNext50-UNet | Adam | 1e−4 to 1e−6 | 6 | 80 |
| EXP5 | SE-ResNext101-UNet | Adam | 1e−4 to 1e−6 | 4 | 100 |
| EXP6 | EfficientNetB3-UNet | Adam | 1e−4 to 1e−6 | 4 | 45 |
| EXP7 | EfficientNetB4-UNet | Adam | 1e−4 to 1e−6 | 4 | 80 |
The number of trainable parameters, training step time, training time per epoch, and IoU score on the validation set for all proposed B-UNet networks.
| Experiment | B-UNet network | #Params | Train step time (ms) | Train time/epoch | IoU |
|---|---|---|---|---|---|
| EXP1 | ResNet50-UNet | 32.5 M | 568 | 10.3 min | 0.7535 |
| EXP2 | DenseNet169-UNet | 19.5 M | 437 | 13.2 min | 0.7669 |
| EXP3 | SE-ResNext50-UNet | 34.5 M | 472 | 21.3 min | 0.6986 |
| EXP4 | SE-ResNext50-UNet | 34.5 M | 644 | 19.4 min | 0.7820 |
| EXP5 | SE-ResNext101-UNet | 59.89 M | 709 | 32 min | 0.7795 |
| EXP6 | EfficientNetB3-UNet | 17.77 M | 431 | 19.5 min | 0.7589 |
| EXP7 | EfficientNetB4-UNet | 25.6 M | 516 | 23.3 min | 0.7758 |
Figure 9. Training and validation IoU over the training of the four proposed segmentation networks.
(A) ResNet50-UNet. (B) DenseNet169-UNet. (C) SE-ResNext50-UNet. (D) EfficientNetB4-UNet.
Figure 10. Training and validation loss over the training of the four proposed segmentation networks.
(A) ResNet50-UNet. (B) DenseNet169-UNet. (C) SE-ResNext50-UNet. (D) EfficientNetB4-UNet.
Figure 11. Confusion matrices for all proposed segmentation networks on the validation dataset.
(A) ResNet50-UNet. (B) DenseNet169-UNet. (C) SE-ResNext50-UNet. (D) EfficientNetB4-UNet. (E) Ens4B-UNet.
Pixel-wise classification results of our segmentation networks on the validation set.
| Model | Accuracy (%) | Recall (%) | Precision (%) | F-measure (%) |
|---|---|---|---|---|
| ResNet50-UNet | 99.78 | 51.26 | 65.81 | 57.63 |
| DenseNet169-UNet | 99.80 | 57.61 | 67.16 | 62.02 |
| SE-ResNext50-UNet | 99.80 | 54.68 | 69.83 | 61.33 |
| EfficientNetB4-UNet | 99.80 | 55.53 | 69.50 | 61.74 |
| Ens4B-UNet | 99.81 | 56.94 | 71.19 | 63.27 |
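The pixel-wise metrics in the table above can be computed directly from the confusion-matrix counts, with pneumothorax pixels as the positive class. A minimal sketch, assuming binary NumPy masks and reporting percentages as in the table:

```python
import numpy as np

def pixel_scores(pred, target):
    """Pixel-wise accuracy, recall, precision, and F-measure (in %)."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    tp = np.logical_and(pred, target).sum()
    tn = np.logical_and(~pred, ~target).sum()
    fp = np.logical_and(pred, ~target).sum()
    fn = np.logical_and(~pred, target).sum()
    accuracy = (tp + tn) / pred.size
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": 100 * accuracy, "recall": 100 * recall,
            "precision": 100 * precision, "f_measure": 100 * f1}
```

The very high accuracy alongside much lower recall and precision reflects the class imbalance: the vast majority of pixels are background, so accuracy alone is uninformative here.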
IoU results on the validation set using different base networks.
| Segmentation network | TTA | B-TH | R-TH | IoU |
|---|---|---|---|---|
| ResNet50-UNet | ✗ | Auto 0.86 | ✗ | 0.7542 |
| ✗ | Auto 0.87 | Auto 2048 | 0.7968 | |
| ✓ | Auto 0.77 | ✗ | 0.7714 | |
| ✓ | Auto 0.73 | Auto 2048 | 0.7981 | |
| DenseNet169-UNet | ✗ | Auto 0.70 | ✗ | 0.7656 |
| ✗ | Auto 0.83 | Auto 3072 | 0.7989 | |
| ✓ | Auto 0.89 | ✗ | 0.7759 | |
| ✓ | Auto 0.65 | Auto 2048 | 0.8007 | |
| SE-ResNext50-UNet | ✗ | Auto 0.82 | ✗ | 0.7832 |
| ✗ | Auto 0.81 | Auto 3072 | 0.7986 | |
| ✓ | Auto 0.89 | ✗ | 0.7920 | |
| ✓ | Auto 0.56 | Auto 1024 | 0.8000 | |
| EfficientNetB4-UNet | ✗ | Auto 0.90 | ✗ | 0.7786 |
| ✗ | Auto 0.21 | Auto 2048 | 0.8015 | |
| ✓ | Auto 0.59 | ✗ | 0.7867 | |
| ✓ | Auto 0.55 | Auto 2048 | 0.8030 |
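The two thresholds in the table above can be sketched as a post-processing step on the predicted probability map. This is a minimal sketch under an assumption: B-TH is read as the per-pixel binarization threshold, and R-TH as a minimum-area cutoff (the 1024/2048/3072 values suggest pixel counts) below which the whole prediction is discarded as a negative case.

```python
import numpy as np

def apply_thresholds(prob_map, b_th, r_th=None):
    """Turn a probability map into a binary mask.

    `b_th` (B-TH) binarizes the per-pixel probabilities; `r_th` (R-TH),
    when given, zeroes out the whole mask if fewer than `r_th` pixels
    survive, treating the image as a no-pneumothorax case. The reading
    of R-TH as a minimum-area cutoff is an assumption.
    """
    mask = (np.asarray(prob_map) > b_th).astype(np.uint8)
    if r_th is not None and mask.sum() < r_th:
        mask[:] = 0
    return mask
```

Discarding tiny predicted regions trades a little segmentation recall for a large gain on images with no pneumothorax, which is why the R-TH rows in the table score higher.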
The PSNR (dB) results of our segmentation networks on the validation set.
| Model | PSNR (dB) |
|---|---|
| ResNet50-UNet | 26.71 |
| DenseNet169-UNet | 27.01 |
| SE-ResNext50-UNet | 27.05 |
| EfficientNetB4-UNet | 27.08 |
| Ens4B-UNet | 27.20 |
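The PSNR values above follow the standard definition over mask pixel values. A minimal sketch, assuming predictions and ground truth are scaled to [0, max_val]:

```python
import numpy as np

def psnr(pred, target, max_val=1.0):
    """Peak signal-to-noise ratio in dB between predicted and
    ground-truth masks with pixel values in [0, max_val]."""
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    mse = np.mean((pred - target) ** 2)
    if mse == 0:
        return float("inf")  # identical masks
    return 10.0 * np.log10(max_val ** 2 / mse)
```

Higher PSNR means a smaller mean squared error between predicted and reference masks, consistent with Ens4B-UNet scoring best in the table.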
Average prediction time per sample and mean DSC score on the test set for the proposed segmentation models.
| Model | Prediction time (ms) | Public DSC | Private DSC |
|---|---|---|---|
| ResNet50-UNet | 37.2 | 0.9113 | 0.8400 |
| DenseNet169-UNet | 56 | 0.9041 | 0.8473 |
| SE-ResNext50-UNet | 66.3 | 0.9014 | 0.8515 |
| EfficientNetB4-UNet | 53.6 | 0.9065 | 0.8547 |
| Ens4B-UNet | – | 0.9060 | 0.8608 |
Different weights for the proposed models on the validation set.
| ResNet50-UNet (%) | DenseNet169-UNet (%) | SE-ResNext50-UNet (%) | EfficientNetB4-UNet (%) | IoU |
|---|---|---|---|---|
| 40 | 40 | 10 | 10 | 0.7796 |
| 30 | 30 | 20 | 20 | 0.7845 |
| 25 | 25 | 25 | 25 | 0.7867 |
| 20 | 20 | 30 | 30 | 0.7875 |
| 10 | 10 | 40 | 40 | 0.7877 |
| 10 | 10 | 30 | 50 | 0.7844 |
| 10 | 10 | 20 | 60 | 0.7858 |
| 10 | 20 | 30 | 40 | 0.7857 |
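The weighted-average ensemble behind the table above can be sketched as a normalized weighted sum of the per-model probability maps, binarized afterwards. A minimal sketch, assuming each model outputs a probability map of the same shape:

```python
import numpy as np

def ensemble_predict(prob_maps, weights):
    """Weighted-average ensemble of per-model probability maps.

    `prob_maps` is a list of equal-shape arrays (one sigmoid output per
    model); `weights` are the percentages from the table above and are
    normalized to sum to 1 before averaging.
    """
    weights = np.asarray(weights, dtype=np.float64)
    weights = weights / weights.sum()
    stacked = np.stack([np.asarray(p, dtype=np.float64) for p in prob_maps])
    # Contract the model axis: sum_i w_i * prob_maps[i]
    return np.tensordot(weights, stacked, axes=1)
```

With four maps and the table's best row, the call would be `ensemble_predict(maps, [10, 10, 40, 40])`, giving the stronger SE-ResNext50 and EfficientNetB4 models 80% of the vote.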
Comparison with the most recent work on the pneumothorax medical condition.
| Reference | Dataset size | Task | Results |
|---|---|---|---|
| ( | 12,047 | Segmentation | Official DSC 76.04 |
| ( | 12,047 | Segmentation | DSC 84.3 |
| ( | 100,000 | Classification | AUC 0.911 |
| ( | 11,051 | Segmentation | MPA 0.93, DSC 0.92 |
| Ens4B-UNet | 12,047 | Segmentation | Official DSC 0.8608 |
Notes:
indicates that the reference works on the same dataset that we do.
means that the reference's reported result is not on the official test set but on a test set split from the training set.
Comparison between our proposed model and other teams' results on the pneumothorax segmentation challenge.
| Top (%) | Rank | Team/Model | Image size | Public DSC | Private DSC |
|---|---|---|---|---|---|
| 1 | 1 | [dsmlkz] sneddy | 1,024 | 0.8985 | 0.8679 |
| | 5 | Earhian | 1,024 | 0.9035 | 0.8643 |
| | 13 | Ens4B-UNet | 512 | 0.9060 | 0.8608 |
| 2 | 22 | [ods.ai] 11,111 good team is all you need | 512, 1,024 | 0.9024 | 0.8557 |
| 3 | 31 | [ods.ai] Vasiliy Kotov | 1,024 | 0.8995 | 0.8537 |
| | 33 | imedhub ppc64 | 1,024 | 0.9093 | 0.8532 |
| 4 | 58 | OmerS | 1,024 | 0.9180 | 0.8477 |
| 6 | 80 | Mohamed Ramzy | 768 | 0.9124 | 0.8451 |
| 9 | 124 | Ayat/2ST-UNet (2) | 256, 512 | 0.9023 | 0.8356 |
| | 127 | DataKeen | 512 | 0.9073 | 0.8353 |
| 19 | 279 | diCELLa | – | 0.9041 | 0.8096 |
| 21 | 308 | Marsh | – | 0.6215 | 0.7757 |
| 24 | 340 | pete | – | 0.6351 | 0.6983 |