Jingfei Hu1,2,3,4, Hua Wang1,2,3,4, Jie Wang5, Yunqi Wang1,2, Fang He2, Jicong Zhang1,2,3,4,6.
Abstract
Semantic segmentation of medical images provides an important cornerstone for subsequent tasks of image analysis and understanding. With rapid advancements in deep learning methods, conventional U-Net segmentation networks have been applied in many fields. Based on exploratory experiments, features at multiple scales have been found to be of great importance for the segmentation of medical images. In this paper, we propose a scale-attention deep learning network (SA-Net), which extracts features of different scales in a residual module and uses an attention module to enforce the scale-attention capability. SA-Net can better learn multi-scale features and achieve more accurate segmentation for different medical images. In addition, this work validates the proposed method across multiple datasets. The experimental results show that SA-Net achieves excellent performance in vessel detection in retinal images, lung segmentation, artery/vein (A/V) classification in retinal images, and blastocyst segmentation. To facilitate SA-Net utilization by the scientific community, the code implementation will be made publicly available.
Year: 2021 PMID: 33852577 PMCID: PMC8046243 DOI: 10.1371/journal.pone.0247388
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. Diagram of the proposed SA-Net.
Fig 2. Visualization of the retinal vessel diameters in the fundus.
(a) The raw 1880×2886 image. (b) Diameter of each point on the skeleton. (c) Partially enlarged view of the red area in (b). (d) Histogram of the diameter map for each point on the skeleton (distance between pixels).
Fig 3. Comparison of the Res2Net and ResNet blocks (with a scale dimension of k = 4).
(a) The conventional building block in CNN variants. (b) Res2Net uses a group of 3×3 filters. (c) The SA module adds the attention module to enforce the scale-attention capability.
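The multi-scale idea behind panel (b) can be sketched in a few lines. This is a minimal illustration of the Res2Net-style hierarchical connection, not the authors' implementation: the channels are split into k groups, the first group passes through unchanged, and each later group is transformed after adding the output of the previous one. The names `split_into_scales` and `hierarchical_block` are hypothetical, and a plain Python function stands in for each 3×3 convolution.

```python
def split_into_scales(channels, k):
    """Split a flat list of channel values into k equal groups."""
    size = len(channels) // k
    if size * k != len(channels):
        raise ValueError("number of channels must be divisible by k")
    return [channels[i * size:(i + 1) * size] for i in range(k)]


def hierarchical_block(channels, k, transforms):
    """Res2Net-style connection: y1 = x1, y2 = K2(x2), yi = Ki(xi + y(i-1)).

    `transforms` holds k - 1 callables; each is a stand-in (assumption)
    for one 3x3 convolution applied to a channel group.
    """
    groups = split_into_scales(channels, k)
    outputs = [groups[0]]          # first group is passed through untouched
    previous = None
    for group, transform in zip(groups[1:], transforms):
        if previous is not None:
            # Later groups also see the output of the preceding scale.
            group = [g + p for g, p in zip(group, previous)]
        previous = transform(group)
        outputs.append(previous)
    # Concatenate all scales back into one channel list.
    return [value for out in outputs for value in out]


def double(group):
    """Toy stand-in for a learned filter: multiply every value by two."""
    return [2 * value for value in group]


example = hierarchical_block([1, 2, 3, 4, 5, 6, 7, 8], k=4, transforms=[double] * 3)
print(example)
```

Because each group receives the output of the one before it, later groups pass through more transformations and so cover a larger effective receptive field, which is what the scale dimension k controls.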
Fig 4. Diagram of the attention module.
As illustrated, the attention module utilizes both max-pooling outputs and average-pooling outputs.
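A rough sketch of an attention gate that combines the two pooled descriptors is given below. The caption only states that both max-pooling and average-pooling outputs are used; the shared two-layer perceptron, the summation of the two branches, and the sigmoid gate are assumptions borrowed from common channel-attention designs, and all function names and weights here are hypothetical.

```python
import math
import random


def global_pool(feature_map):
    """Return (average, maximum) over all pixels of one H x W channel."""
    values = [v for row in feature_map for v in row]
    return sum(values) / len(values), max(values)


def shared_mlp(vector, w1, w2):
    """Two-layer perceptron with ReLU, applied to a length-C descriptor."""
    hidden = [max(0.0, sum(w * x for w, x in zip(row, vector))) for row in w1]
    return [sum(w * h for w, h in zip(row, hidden)) for row in w2]


def channel_attention(features, w1, w2):
    """Scale each channel by a gate in (0, 1) built from both pooled descriptors."""
    pooled = [global_pool(fm) for fm in features]
    avg_desc = [p[0] for p in pooled]
    max_desc = [p[1] for p in pooled]
    # Assumption: the same MLP weights process both descriptors.
    avg_out = shared_mlp(avg_desc, w1, w2)
    max_out = shared_mlp(max_desc, w1, w2)
    # Assumption: the branches are summed and squashed by a sigmoid.
    gates = [1.0 / (1.0 + math.exp(-(a + m))) for a, m in zip(avg_out, max_out)]
    scaled = [[[g * v for v in row] for row in fm] for g, fm in zip(gates, features)]
    return gates, scaled


rng = random.Random(0)
C, H, W, R = 4, 3, 3, 2  # channels, height, width, hidden width (all illustrative)
features = [[[rng.uniform(-1, 1) for _ in range(W)] for _ in range(H)] for _ in range(C)]
w1 = [[rng.uniform(-1, 1) for _ in range(C)] for _ in range(R)]  # R x C
w2 = [[rng.uniform(-1, 1) for _ in range(R)] for _ in range(C)]  # C x R
gates, scaled = channel_attention(features, w1, w2)
print([round(g, 3) for g in gates])
```

Average pooling summarizes the overall response of a channel, while max pooling keeps its strongest activation, so using both gives the gate two complementary views of each channel.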
Fig 5. Samples of the retinal fundus images and their H×W dimensions in pixels.
(a) Input images: Left: DRIVE (584×565); right: CHASE_DB1 (960×999). (b) Manual annotation of the retinal vessels. (c) Binary segmentation masks of the fundus images.
Segmentation performance metrics on two publicly available retinal image datasets.
| Dataset | Type | Method | MCC | SE | SP | ACC | AUC | F1 |
|---|---|---|---|---|---|---|---|---|
| DRIVE | Unsupervised | Zhao [ | N/A | 0.7420 | 0.9820 | 0.9540 | 0.8620 | N/A |
| | | Azzopardi [ | N/A | 0.7655 | 0.9704 | 0.9442 | 0.9614 | N/A |
| | | Roychowdhury [ | N/A | 0.7395 | 0.9782 | 0.9494 | 0.8672 | N/A |
| | Supervised | U-Net [ | N/A | 0.7537 | 0.9820 | 0.9531 | 0.9755 | 0.8142 |
| | | RU-Net [ | N/A | 0.7792 | 0.9813 | 0.9556 | 0.9784 | 0.8171 |
| | | DE-Unet [ | N/A | 0.7940 | 0.9816 | 0.9567 | 0.9772 | 0.8270 |
| | | SWT-UNet [ | 0.8045 | 0.8039 | 0.9804 | 0.9576 | 0.9821 | 0.8281 |
| | | BTS-UNet [ | 0.7923 | 0.7800 | 0.9806 | 0.9551 | 0.9796 | 0.8208 |
| | | Driu [ | 0.7941 | 0.7855 | 0.9799 | 0.9552 | 0.9793 | 0.8220 |
| | | CS-Net [ | N/A | 0.8170 | | | 0.9798 | N/A |
| | | | | | 0.9764 | 0.9569 | | |
| CHASE_DB1 | Unsupervised | Azzopardi [ | N/A | 0.7585 | 0.9587 | 0.9387 | 0.9487 | N/A |
| | | Roychowdhury [ | N/A | 0.7615 | 0.9575 | 0.9467 | 0.9623 | N/A |
| | Supervised | RU-Net [ | N/A | 0.7756 | 0.9820 | 0.9634 | 0.9815 | 0.7928 |
| | | SWT-UNet [ | 0.8011 | 0.7779 | | 0.9653 | 0.9855 | 0.8188 |
| | | BTS-UNet [ | 0.7733 | 0.7888 | 0.9801 | 0.9627 | 0.9840 | 0.7983 |
| | | | | | 0.9827 | | | |
N/A = not available. SWT-UNet gives the vessel segmentation results obtained with fully convolutional neural networks; BTS-DSN gives the segmentation results obtained with multi-scale deeply-supervised networks with short connections.
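For readers unfamiliar with the column abbreviations, the sketch below computes SE (sensitivity), SP (specificity), ACC (accuracy), F1 and MCC (Matthews correlation coefficient) from pixel-level confusion counts, using their standard textbook definitions. The exact formulas used in the paper are not reproduced in this record, so treat this as an assumption; the function names and the example counts are made up for illustration. AUC is omitted because it needs the continuous probability map rather than a binary mask.

```python
import math


def confusion_counts(prediction, truth):
    """Count TP, FP, TN, FN for two flat sequences of 0/1 pixel labels."""
    tp = sum(1 for p, t in zip(prediction, truth) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(prediction, truth) if p == 1 and t == 0)
    tn = sum(1 for p, t in zip(prediction, truth) if p == 0 and t == 0)
    fn = sum(1 for p, t in zip(prediction, truth) if p == 0 and t == 1)
    return tp, fp, tn, fn


def segmentation_metrics(tp, fp, tn, fn):
    """Standard binary metrics; assumes every denominator is non-zero."""
    se = tp / (tp + fn)                       # sensitivity (recall)
    sp = tn / (tn + fp)                       # specificity
    acc = (tp + tn) / (tp + fp + tn + fn)     # accuracy
    precision = tp / (tp + fp)
    f1 = 2 * precision * se / (precision + se)
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {"SE": se, "SP": sp, "ACC": acc, "F1": f1, "MCC": mcc}


# Illustrative counts only; these are not taken from any dataset in the table.
metrics = segmentation_metrics(tp=80, fp=10, tn=890, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Vessel pixels are a small fraction of a fundus image, so ACC and SP are dominated by the background class; F1 and MCC are more informative about how well the thin vessel structures are recovered.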
Fig 9. Example results for lung segmentation, detection of retinal blood vessels, artery/vein classification and blastocyst segmentation.
From top to bottom: lung segmentation, retinal vessel detection, artery/vein classification and blastocyst segmentation.
Fig 6. Sample lung CT images with dimensions of H×W in pixels.
Segmentation performance measures for lung image datasets.
| Method | E | ACC | SE |
|---|---|---|---|
| | 0.087 | 0.975 | 0.938 |
| | 0.038 | | 0.980 |
| | | 0.986 | |
Fig 7. A sample retinal image from the DRIVE dataset and its corresponding artery/vein segmentation map.
Performance of different A/V classification methods on the DRIVE dataset.
| Method | BACC | SEAV | SPAV | F1A | |
|---|---|---|---|---|---|
| | 0.9122 | 0.9145 | 0.9083 | 0.7089 | 0.7586 |
| | N/A | 0.9190 | 0.9150 | N/A | N/A |
Fig 8. A sample blastocyst image from the blastocyst dataset and its manual label.
Blastocyst segmentation performance measures on the blastocyst dataset.
| Method | Mean | Background | Blastocoel | ZP | TE | |
|---|---|---|---|---|---|---|
| | 0.8137 | 0.9404 | 0.7941 | 0.7932 | 0.7506 | 0.7903 |
| | 0.8151 | 0.9460 | 0.7926 | 0.8057 | 0.7483 | 0.7828 |
| | 0.8165 | 0.9449 | 0.7835 | 0.8084 | 0.7398 | 0.8060 |
| | 0.8285 | 0.9474 | 0.8079 | 0.8115 | 0.7652 | 0.8107 |
| | 0.8142 | 0.9450 | 0.7861 | 0.8024 | 0.7616 | 0.7758 |
Ablation study segmentation results on the DRIVE dataset.
| Method | MCC | SE | SP | ACC | AUC | F1 | Params |
|---|---|---|---|---|---|---|---|
| Backbone [ | N/A | 0.7537 | | 0.9531 | 0.9755 | 0.8142 | 34.5M |
| Backbone+ResNet | 0.8013 | 0.8039 | 0.9793 | 0.9566 | 0.9817 | 0.8244 | 49.5M |
| Backbone+Res2Net | 0.8026 | 0.8148 | 0.9775 | 0.9566 | 0.9816 | 0.8260 | 161M |
| Backbone+SA | | | 0.9764 | | | | 194M |