Xingqing Nie, Xiaogen Zhou, Tong Tong, Xingtao Lin, Luoyan Wang, Haonan Zheng, Jing Li, Ensheng Xue, Shun Chen, Meijuan Zheng, Cong Chen, Min Du.
Abstract
Medical image segmentation is an essential component of computer-aided diagnosis (CAD) systems. Thyroid nodule segmentation in ultrasound images is a necessary step for the early diagnosis of thyroid diseases. Encoder-decoder deep convolutional neural networks (DCNNs), such as the U-Net architecture and its variants, have been used extensively for medical image segmentation tasks. In this article, we propose a novel N-shaped dense fully convolutional neural network for medical image segmentation, referred to as N-Net. The proposed framework is composed of three major components: a multi-scale input layer, an attention guidance module, and an innovative stackable dilated convolution (SDC) block. First, we apply the multi-scale input layer to construct an image pyramid, which provides multi-level receptive field sizes and yields rich feature representations. A U-shaped convolutional network is then employed as the backbone structure. Moreover, we use the attention guidance module to filter the features before several skip connections, which transfers structural information from earlier feature maps to the following layers; this module also suppresses noise and reduces the negative impact of the background. Finally, we propose the stackable dilated convolution (SDC) block, which captures deep semantic features that may be lost in bilinear upsampling. We evaluated the proposed N-Net framework on a thyroid nodule ultrasound image dataset (the TNUI-2021 dataset) and the publicly available DDTI dataset. The experimental results show that our N-Net model outperforms several state-of-the-art methods on thyroid nodule segmentation tasks.
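The multi-scale input layer described in the abstract builds an image pyramid from the input and feeds each scale into the matching encoder stage. A minimal sketch of that idea in PyTorch is shown below; the channel widths and the plain 3×3 projection per level are assumptions for illustration, not the authors' exact implementation.

```python
import torch.nn as nn
import torch.nn.functional as F

class MultiScaleInput(nn.Module):
    """Builds an image pyramid and projects each level to the channel width
    of the corresponding encoder stage (widths here are hypothetical)."""
    def __init__(self, in_channels=1, stage_channels=(64, 128, 256, 512)):
        super().__init__()
        self.proj = nn.ModuleList(
            [nn.Conv2d(in_channels, c, kernel_size=3, padding=1) for c in stage_channels]
        )

    def forward(self, x):
        pyramid = []
        for i, proj in enumerate(self.proj):
            if i > 0:
                # Halve the resolution once per level to form the pyramid.
                scaled = F.interpolate(x, scale_factor=0.5 ** i, mode="bilinear",
                                       align_corners=False)
            else:
                scaled = x
            pyramid.append(proj(scaled))
        return pyramid  # one side input per encoder stage
```

In practice, each projected side input would be concatenated with (or added to) the encoder feature map of the same resolution before the next downsampling step.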
Keywords: deep convolutional neural network; dilated convolution; medical image segmentation; multi-scale input layer; thyroid nodule
Year: 2022 PMID: 36117632 PMCID: PMC9475170 DOI: 10.3389/fnins.2022.872601
Source DB: PubMed Journal: Front Neurosci ISSN: 1662-453X Impact factor: 5.152
Figure 1. Illustration of our N-Net segmentation framework.
Figure 2. Flowchart of the attention guidance module. I and F are the inputs of the attention guidance module, and A is the calculated attention map.
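Based on the caption's description (two inputs, I and F, producing an attention map A that filters the skip feature), a minimal additive-attention sketch in PyTorch might look as follows; the 1×1 projections, the sigmoid gating, and the source of F are assumptions rather than the paper's exact formulation.

```python
import torch.nn as nn

class AttentionGuidance(nn.Module):
    """Minimal additive-attention sketch: an attention map A is computed from the
    skip feature I and a guiding feature F, then used to re-weight I."""
    def __init__(self, in_channels, guide_channels, inter_channels):
        super().__init__()
        self.theta = nn.Conv2d(in_channels, inter_channels, kernel_size=1)   # on I
        self.phi = nn.Conv2d(guide_channels, inter_channels, kernel_size=1)  # on F
        self.psi = nn.Conv2d(inter_channels, 1, kernel_size=1)               # to A
        self.relu = nn.ReLU(inplace=True)
        self.sigmoid = nn.Sigmoid()

    def forward(self, i_feat, f_feat):
        # i_feat ~ I, f_feat ~ F; both are assumed to share the same spatial size
        # (F would otherwise be resized first).
        a = self.sigmoid(self.psi(self.relu(self.theta(i_feat) + self.phi(f_feat))))
        return i_feat * a, a  # filtered feature for the skip connection, and A
```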
Figure 3. The attention map highlights the foreground region of the thyroid nodule.
Figure 4. Illustration of the stackable dilated convolution (SDC) block. It has four branches of stacked 3×3 dilated convolutional layers, with dilation rates of 1, 3, and 11 within the stacks. The SDC block can extract features at different scales.
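One way to read this caption is that, alongside a plain shortcut branch, each remaining branch stacks a growing prefix of the dilated convolutions (rates 1, 3, 11), giving four parallel paths whose outputs are fused. The PyTorch sketch below encodes that interpretation; the branch layout, normalization, and 1×1 fusion are assumptions.

```python
import torch
import torch.nn as nn

class SDCBlock(nn.Module):
    """Sketch of the stackable dilated convolution idea: a 1x1 shortcut plus three
    branches that stack a growing prefix of 3x3 dilated convolutions (rates 1, 3, 11),
    concatenated and fused by a 1x1 convolution."""
    def __init__(self, channels, rates=(1, 3, 11)):
        super().__init__()

        def dilated_conv(rate):
            return nn.Sequential(
                nn.Conv2d(channels, channels, kernel_size=3, padding=rate, dilation=rate),
                nn.BatchNorm2d(channels),
                nn.ReLU(inplace=True),
            )

        self.shortcut = nn.Conv2d(channels, channels, kernel_size=1)
        self.branches = nn.ModuleList(
            [nn.Sequential(*[dilated_conv(r) for r in rates[:k]])
             for k in range(1, len(rates) + 1)]
        )
        # Four parallel paths in total (shortcut + three stacks).
        self.fuse = nn.Conv2d(channels * (len(rates) + 1), channels, kernel_size=1)

    def forward(self, x):
        outs = [self.shortcut(x)] + [branch(x) for branch in self.branches]
        return self.fuse(torch.cat(outs, dim=1))
```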
Thyroid nodule segmentation performance of different segmentation approaches on the TNUI-2021 dataset.
| Method | | | | | |
|---|---|---|---|---|---|
| U-Net (Ronneberger et al.) | 88.87 ± 0.62 | 83.72 ± 0.55 | 85.06 ± 1.66 | 77.31 ± 0.72 | 78.40 ± 1.19 |
| AttU-Net (Oktay et al.) | 89.45 ± 0.65 | 84.42 ± 0.56 | 84.87 ± 0.82 | 79.13 ± 2.32 | 79.53 ± 1.27 |
| PSP-Net (Zhao et al.) | 89.25 ± 0.38 | 83.91 ± 0.51 | 85.37 ± 1.81 | 78.34 ± 2.53 | 79.12 ± 0.74 |
| U-Net++ (Zhou et al.) | 91.41 ± 0.48 | 86.86 ± 0.44 | 87.01 ± 0.91 | 82.91 ± 1.07 | 83.68 ± 0.90 |
| M-Net (Fu et al.) | 91.52 ± 0.57 | 86.67 ± 0.51 | 88.06 ± 1.37 | 82.36 ± 1.75 | 83.60 ± 1.12 |
| DeepLabV3 (Chen et al.) | 91.66 ± 0.60 | 86.88 ± 0.67 | 88.04 ± 0.93 | 82.97 ± 2.11 | 83.81 ± 1.13 |
| Ours | | | | | |
The best results are shown in bold. Values are the average of 5-fold cross-validation ± SD.
Figure 5. Comparison of the precision-recall curves of our N-Net and other medical image segmentation approaches on the TNUI-2021 dataset.
Figure 6. Visual comparison of thyroid nodule segmentation results generated by our N-Net and five typical methods. (A) Input, (B) GT, (C) Ours, (D) DeepLabV3, (E) M-Net, (F) PSP-Net, (G) AttU-Net, and (H) U-Net.
Thyroid nodule segmentation performance of different segmentation approaches on the DDTI dataset.
| Method | | | | | | | |
|---|---|---|---|---|---|---|---|
| U-Net (Ronneberger et al.) | 84.17 | 76.29 | 70.48 | 81.23 | 71.57 | 46.42 | 27.75 |
| AttU-Net (Oktay et al.) | 84.91 | 77.37 | 71.37 | 81.70 | 72.76 | 266.47 | 84.81 |
| PSP-Net (Zhao et al.) | 81.25 | 73.36 | 69.08 | 73.72 | 65.91 | | |
| M-Net (Fu et al.) | 86.40 | 79.38 | 80.45 | 76.59 | 75.45 | 63.59 | 31.58 |
| DeepLabV3 (Chen et al.) | 87.72 | 82.66 | 82.77 | 82.88 | 79.54 | 109.33 | 48.69 |
| nnU-Net (Isensee et al.) | 88.59 | 80.76 | 82.27 | 85.23 | 82.79 | | |
| Ours | 88.76 | 82.69 | 81.53 | 82.94 | 79.62 | 68.52 | 36.96 |
| 1st in TN-SCUI2020 (Wang et al.) | 92.39 | 86.84 | 84.33 | 86.71 | 86.37 | 410.58 | 108.49 |
| Two-stage cascade of ours | | | | | | 137.04 | 73.92 |
The best results are shown in bold.
Figure 7. The segmentation areas and errors of DeepLabV3 and our N-Net are further visualized on the TNUI-2021 dataset. (A–E) Ground truth, our result, our segmentation errors, DeepLabV3 result, and DeepLabV3 segmentation errors.
Figure 8. Visual comparison of the ablation study on the TNUI-2021 dataset. (A) Input, (B) GT, (C) Ours, (D) Multi-scale + AG + U-Net, (E) AG + U-Net, and (F) U-Net.
Ablation study of each component of our model for thyroid nodule segmentation on the TNUI-2021 dataset. Values are the average of 5-fold cross-validation ± SD.
| Method | | | | | |
|---|---|---|---|---|---|
| Backbone (U-Net) | 88.87 ± 0.62 | 83.72 ± 0.55 | 85.06 ± 1.66 | 77.31 ± 0.72 | 78.40 ± 1.19 |
| Attention guidance (AG) + U-Net | 89.36 ± 0.42 | 84.24 ± 0.40 | 84.99 ± 1.05 | 79.82 ± 1.33 | 79.35 ± 0.83 |
| Multi-scale + AG + U-Net | 90.28 ± 0.46 | 85.34 ± 0.55 | 85.75 ± 1.61 | 80.99 ± 2.52 | 81.13 ± 0.92 |
| SDC + Multi-scale + AG + U-Net | | | | | |
The bold values indicate the best results.