Xi Liu¹, Kai-Wen Li¹,², Ruijie Yang³, Li-Sheng Geng¹,²,⁴,⁵.
Abstract
Lung cancer is the leading cause of cancer-related mortality in both males and females. Radiation therapy (RT) is one of the primary treatment modalities for lung cancer. While delivering the prescribed dose to tumor targets, it is essential to spare the nearby tissues, the so-called organs-at-risk (OARs). Optimal RT planning benefits from accurate segmentation of the gross tumor volume and the surrounding OARs. Manual segmentation is a time-consuming and tedious task for radiation oncologists, so it is crucial to develop automatic image segmentation to relieve them of this contouring work. Currently, atlas-based automatic segmentation is commonly used in clinical routine; however, this technique depends heavily on the similarity between the atlas and the image to be segmented. With significant advances in computer vision, deep learning, as a branch of artificial intelligence, has attracted increasing attention in automatic medical image segmentation. In this article, we review deep learning-based automatic segmentation techniques related to lung cancer and compare them with atlas-based automatic segmentation. At present, auto-segmentation of OARs with relatively large volumes, such as the lung and heart, outperforms that of small-volume organs such as the esophagus. The average Dice similarity coefficients (DSCs) of the lung, heart, and liver exceed 0.9, and the best DSC of the spinal cord reaches 0.9; the DSC of the esophagus, however, ranges between 0.71 and 0.87 with uneven performance. For the gross tumor volume, the average DSC remains below 0.8. Although deep learning-based automatic segmentation shows significant advantages over manual segmentation in many respects, various issues remain to be solved. We discuss potential issues in deep learning-based automatic segmentation, including low contrast, dataset size, consensus guidelines, and network design.
Clinical limitations and future research directions of deep learning-based automatic segmentation are discussed as well.
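The DSC values quoted throughout the review compare a predicted mask against a reference delineation. As an illustrative sketch (this code is ours, not from any of the reviewed works), DSC for a pair of binary masks can be computed as:

```python
import numpy as np

def dice(pred: np.ndarray, ref: np.ndarray) -> float:
    """Dice similarity coefficient, 2|A ∩ B| / (|A| + |B|), for binary masks."""
    pred, ref = pred.astype(bool), ref.astype(bool)
    denom = pred.sum() + ref.sum()
    if denom == 0:
        return 1.0  # both masks empty: define as perfect agreement
    return 2.0 * np.logical_and(pred, ref).sum() / denom

# Toy example: a 4-voxel reference and a 6-voxel prediction sharing 4 voxels.
ref = np.zeros((4, 4), dtype=bool); ref[1:3, 1:3] = True
pred = np.zeros((4, 4), dtype=bool); pred[1:3, 1:4] = True
print(round(dice(pred, ref), 2))  # → 0.8
```

Because DSC is a volume-overlap measure, it is relatively insensitive to errors in small organs' boundaries for large structures like the lung, which is one reason the review pairs it with surface-distance metrics.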
Keywords: automatic segmentation; deep learning; lung cancer; organs-at-risk; radiotherapy
Year: 2021 PMID: 34336704 PMCID: PMC8323481 DOI: 10.3389/fonc.2021.717039
Source DB: PubMed Journal: Front Oncol ISSN: 2234-943X Impact factor: 6.244
Figure 1. Architecture of a classic CNN (44).
Figure 2. Architecture of a typical FCN (45). The white boxes represent multi-channel feature maps after the convolutional operation.
Figure 3. Architecture of a conventional U-Net (46). The blue boxes represent multi-channel feature maps, and the white boxes correspond to feature maps copied from the encoder branch. Arrows of different colors represent different operations. The number on top of each box is the number of channels, and the x-y size of the feature map is denoted at the lower left edge of the box.
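The channel counts and shrinking x-y sizes annotated in this figure follow directly from the unpadded 3×3 convolutions and 2×2 pooling/up-convolution of the conventional U-Net. As an illustrative sketch (the function and its defaults are ours, not from the review), the x-y size of the feature maps can be traced through the network:

```python
def unet_xy_sizes(size: int = 572, depth: int = 4) -> list:
    """Trace the x-y size of feature maps through a classic (unpadded) U-Net:
    each 3x3 conv shrinks the map by 2 pixels per side pair, each 2x2 max-pool
    halves it, and each 2x2 up-convolution doubles it."""
    sizes = [size]
    # Encoder: two 3x3 convs then a 2x2 max-pool, repeated `depth` times.
    for _ in range(depth):
        size -= 4          # two unpadded 3x3 convolutions
        sizes.append(size)
        size //= 2         # 2x2 max-pooling
        sizes.append(size)
    size -= 4              # two convolutions at the bottleneck
    sizes.append(size)
    # Decoder: up-conv doubles the size, then two 3x3 convs shrink it by 4.
    for _ in range(depth):
        size *= 2
        sizes.append(size)
        size -= 4
        sizes.append(size)
    return sizes

# A 572x572 input yields a 388x388 output map, as in the original U-Net.
print(unet_xy_sizes()[-1])  # → 388
```

This size mismatch between encoder and decoder is why the copied encoder feature maps must be cropped before concatenation.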
Selected works on deep learning-based automated segmentation of OARs for lung cancer.
| Reference | Year | Networks | OARs | Research Highlight |
|---|---|---|---|---|
| Zhao et al. | 2018 | FCN | lung | introducing a multi-instance loss and a conditional adversarial loss to facilitate more accurate segmentation |
| Agnes et al. | 2018 | U-Net | lung | exploring the performance of different convolutional network configurations, and comparing the proposed model with other methods such as thresholding |
| Zhu et al. | 2018 | U-Net | lung, heart, esophagus, spinal cord, and liver | replacing the common convolutional layers with residual convolution units |
| Dong et al. | 2019 | U-net-GAN | left lung, right lung, heart, esophagus, and spinal cord | proposing a 2.5D patch-based GAN to delineate the left lung, right lung, and heart, and two 3D patch-based GANs to delineate the esophagus and spinal cord separately |
| Feng et al. | 2019 | U-Net | left lung, right lung, heart, esophagus, and spinal cord | developing two 3D U-Nets, one for localizing each OAR and the other for individually segmenting each OAR |
| Harten et al. | 2019 | CNN | heart, aorta, trachea, esophagus | combining a 2D CNN that utilizes dilated convolutions with a 3D CNN that employs residual blocks |
| He et al. | 2019 | adapted U-Net | heart, aorta, trachea, esophagus | proposing a U-like architecture whose encoder can adopt diverse network frameworks, trained under a multi-task learning scheme |
| Vesal et al. | 2019 | U-Net | heart, aorta, trachea, esophagus | employing dilated convolutions in the bottleneck of a 2D U-Net-style network and residual connections in the encoder branch |
| Han et al. | 2019 | V-Net | heart, aorta, trachea, esophagus | proposing a multi-resolution VB-Net architecture by replacing the convolutional layers inside the V-Net with a bottleneck framework |
| Zhang et al. | 2020 | CNN | left lung, right lung, heart, esophagus, spinal cord, and liver | establishing a dilated CNN structure based on ResNet-101, and comparing its performance with atlas-based and manual segmentation |
| Hu et al. | 2020 | Mask R-CNN | lung | using a Mask R-CNN combined with supervised and unsupervised machine learning methods to segment the lung |
| Tan et al. | 2020 | GAN | lung | proposing a GAN-based architecture combining a novel loss function based on the Earth Mover's distance for lung segmentation |
| Pawar et al. | 2020 | c-GAN | lung | introducing a c-GAN structure with multi-scale dense feature extraction blocks for lung segmentation across different interstitial lung disease patterns |
| Morris et al. | 2020 | U-Net | cardiac substructures | proposing a 3D U-Net combined with fully connected conditional random fields to segment twelve cardiac substructures |
| Chen et al. | 2020 | FCN | left lung, right lung, liver, spleen, left kidney, and right kidney | proposing a 3D FCN based on the U-Net and ResNet to segment OARs in dual-energy CT images |
FCN, fully convolutional network; GAN, generative adversarial network; OAR, organ-at-risk; 3D, three-dimensional; 2D, two-dimensional; CNN, convolutional neural network; R-CNN, region-based convolutional neural network; c-GAN, conditional generative adversarial network.
Comparison of selected works on segmentation of lung.
| Reference | Year | Networks | DSC | IOU | HD (mm) | 95%HD (mm) | MSD (mm) | Sensitivity | Specificity |
|---|---|---|---|---|---|---|---|---|---|
| Zhao et al. | 2018 | FCN | LIDC: 0.92; CLEF: 0.96; HUG: 0.98 | – | – | – | – | – | – |
| Agnes et al. | 2018 | U-Net | 0.95 ± 0.03 | – | – | – | – | 0.95 ± 0.03 | 0.99 ± 0.01 |
| Zhu et al. | 2018 | adapted U-Net | 0.95 ± 0.01 | – | – | 7.96 ± 2.57 | 1.93 ± 0.51 | – | – |
| Dong et al. | 2019 | U-net-GAN | Left: 0.97 ± 0.01; Right: 0.97 ± 0.01 | – | – | Left: 2.07 ± 1.93; Right: 2.50 ± 3.34 | Left: 0.61 ± 0.73; Right: 0.65 ± 0.53 | Left: 0.97 ± 0.02; Right: 0.96 ± 0.02 | Left: 0.9989 ± 0.0010; Right: 0.9992 ± 0.0007 |
| Feng et al. | 2019 | 3D U-Net | Left: 0.98 ± 0.01; Right: 0.97 ± 0.02 | – | – | Left: 2.10 ± 0.94; Right: 3.96 ± 2.85 | Left: 0.59 ± 0.29; Right: 0.93 ± 0.57 | – | – |
| Zhang et al. | 2020 | ResNet-101 | Left: 0.95 ± 0.01; Right: 0.94 ± 0.02 | – | – | – | Left: 1.10 ± 0.15; Right: 2.23 ± 2.33 | – | – |
| Hu et al. | 2020 | Mask R-CNN | 0.97 ± 0.03 | – | – | – | – | 0.97 ± 0.09 | 0.9711 ± 0.0365 |
| Tan et al. | 2020 | GAN | – | 0.938 | 2.812 | – | – | – | – |
| Chen et al. | 2020 | 3D FCN | Left: 0.98 ± 0.01; Right: 0.98 ± 0.02 | – | – | – | – | – | – |
FCN, fully convolutional network; LIDC, Lung Image Database Consortium; CLEF, Conference and Labs of the Evaluation Forum; HUG, University Hospitals of Geneva; DSC, Dice similarity coefficient; IOU, intersection over union; HD, Hausdorff distance; 95%HD, 95% Hausdorff distance; MSD, mean surface distance; R-CNN, region-based convolutional neural network; GAN, generative adversarial network.
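The surface-distance metrics defined above (HD, 95%HD, MSD) are computed from the distances between the boundary voxels of two binary masks. The following brute-force numpy sketch (our illustration, not code from the reviewed works, and assuming small 2D masks) shows one way to obtain all three:

```python
import numpy as np

def surface_points(mask):
    """Boundary voxels of a 2D boolean mask (pixels with a False 4-neighbor)."""
    m = mask.astype(bool)
    pad = np.pad(m, 1)  # pad with False so edge pixels count as boundary
    interior = (pad[:-2, 1:-1] & pad[2:, 1:-1] &
                pad[1:-1, :-2] & pad[1:-1, 2:])
    return np.argwhere(m & ~interior)

def surface_distances(a, b):
    """Nearest-surface distances from a's boundary to b's, and vice versa."""
    pa, pb = surface_points(a), surface_points(b)
    d = np.linalg.norm(pa[:, None, :] - pb[None, :, :], axis=-1)
    return np.concatenate([d.min(axis=1), d.min(axis=0)])

def hd(a, b):   return surface_distances(a, b).max()            # Hausdorff
def hd95(a, b): return np.percentile(surface_distances(a, b), 95)
def msd(a, b):  return surface_distances(a, b).mean()           # mean surface distance

# Toy example: two 3x3 squares shifted by one pixel.
a = np.zeros((6, 6), dtype=bool); a[1:4, 1:4] = True
b = np.zeros((6, 6), dtype=bool); b[1:4, 2:5] = True
print(hd(a, b), msd(a, b))  # → 1.0 0.5
```

In practice, distances are scaled by the voxel spacing and computed with distance transforms rather than this O(n²) pairwise comparison; HD is sensitive to single outlier voxels, which is why 95%HD is usually reported alongside it.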
Selected works on segmentation of other OARs (liver, aorta, trachea).
| Reference | Year | Networks | DSC | HD (mm) | MSD (mm) |
|---|---|---|---|---|---|
| Zhu et al. | 2018 | adapted U-Net | Liver: 0.89 ± 0.02 | – | 3.21 ± 0.93 |
| Zhang et al. | 2020 | ResNet-101 | Liver: 0.94 ± 0.03 | – | 2.03 ± 1.49 |
| Chen et al. | 2020 | 3D FCN | Liver: 0.96 ± 0.16 | – | – |
| Harten et al. | 2019 | CNN | Aorta: 0.93 ± 0.01; Trachea: 0.91 ± 0.02 | Aorta: 2.7 ± 3.6; Trachea: 2.1 ± 1.0 | – |
| He et al. | 2019 | U-Net with multi-task learning | Aorta: 0.95; Trachea: 0.92 | Aorta: 0.113; Trachea: 0.182 | – |
| Vesal et al. | 2019 | 2D U-Net | Aorta: 0.94; Trachea: 0.93 | Aorta: 0.297; Trachea: 0.193 | – |
| Han et al. | 2019 | VB-Net | Aorta: 0.95; Trachea: 0.93 | Aorta: 0.121; Trachea: 0.145 | – |
DSC, Dice similarity coefficient; HD, Hausdorff distance; MSD, mean surface distance; CNN, convolutional neural network; FCN, fully convolutional network.
Comparison of different networks on segmentation of lung tumors in (41).
| Networks | TCIA DSC | TCIA 95%HD (mm) | TCIA Sensitivity | TCIA Precision | MSKCC DSC | MSKCC 95%HD (mm) | MSKCC Sensitivity | MSKCC Precision | LIDC DSC | LIDC 95%HD (mm) | LIDC Sensitivity | LIDC Precision |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| U-Net | 0.68 | 15.51 | 0.73 | 0.71 | 0.65 | 7.87 | 0.75 | 0.66 | 0.58 | 4.95 | 0.80 | 0.64 |
| SegNet | 0.70 | 15.24 | 0.73 | 0.72 | 0.66 | 7.92 | 0.72 | 0.69 | 0.57 | 4.48 | 0.77 | 0.60 |
| FRRN | 0.71 | 12.66 | 0.75 | 0.73 | 0.71 | 7.72 | 0.69 | 0.71 | 0.60 | 2.91 | 0.76 | 0.64 |
| incremental-MRRN | 0.74 | 7.94 | 0.80 | 0.73 | 0.74 | 5.85 | 0.82 | 0.72 | 0.68 | 2.60 | 0.85 | 0.67 |
| dense-MRRN | 0.73 | 8.10 | 0.79 | 0.73 | 0.73 | 5.94 | 0.80 | 0.72 | 0.67 | 2.72 | 0.82 | 0.70 |
TCIA, The Cancer Imaging Archive; MSKCC, Memorial Sloan Kettering Cancer Center; LIDC, Lung Image Database Consortium; DSC, Dice similarity coefficient; 95%HD, 95% Hausdorff distance; FRRN, full resolution residual neural network; MRRN, multiple resolution residually connected network.
Selected works on deep learning-based automated segmentation of lung tumors.
| Reference | Year | Network | Datasets | Input | Targets | Results | Research Highlight |
|---|---|---|---|---|---|---|---|
| Wang et al. | 2018 | CNN | 9 patients | MRI | GTV | DSC: 0.82 ± 0.10; Precision: 0.81 ± 0.10 | establishing a patient-specific adaptive patch-based CNN and a population-based CNN, and comparing their performance with each other |
| Zhang et al. | 2020 | ResNet | 330 patients (training set: 300; test set: 30) | CT | GTV | DSC: 0.73 ± 0.07; JSC: 0.68 ± 0.09; TPR: 0.74 ± 0.07; FPR: 0.0012 ± 0.0014 | proposing a modified version of ResNet to segment the GTV for NSCLC patients and comparing its performance with U-Net |
| Zhao et al. | 2018 | 3D FCN | 84 patients (training set: 48; test set: 36) | PET/CT | Lung tumor | DSC: 0.85 ± 0.08 | proposing a multi-modality co-segmentation network and comparing its performance with using CT or PET alone |
| Jiang et al. | 2019 | MRRN | 1210 patients from three datasets (377 from TCIA for training, 304 from MSKCC for validation, and 529 from LIDC for testing) | CT | Lung tumor | results given in the order (TCIA, MSKCC, LIDC): DSC: (0.74, 0.75, 0.68); Precision: (0.73, 0.72, 0.67); Sensitivity: (0.80, 0.82, 0.85); 95%HD (mm): (7.94, 5.85, 2.60) | developing two multiple-resolution residually connected networks, incremental-MRRN and dense-MRRN, and comparing their performance with other commonly used networks |
| Bi et al. | 2019 | ResNet-101 | 269 patients (training set: 200; validation set: 50; test set: 19) | CT | CTV | DSC: 0.75 ± 0.06; MDTA: 2.97 ± 0.91; CV: 0.129 ± 0.040; SDD: 0.47 ± 0.22 | introducing a deep residual network with dilated blocks to segment the CTV for NSCLC patients, and comparing with manual delineation |
CNN, convolutional neural network; MRI, magnetic resonance imaging; GTV, gross tumor volume; DSC, Dice similarity coefficient; ResNet, residual network; CT, computed tomography; JSC, Jaccard similarity coefficient; TPR, true positive rate; FPR, false positive rate; NSCLC, non-small cell lung cancer; MRRN, multiple-resolution residually connected network; TCIA, The Cancer Imaging Archive; MSKCC, Memorial Sloan Kettering Cancer Center; LIDC, Lung Image Database Consortium; 95%HD, 95% Hausdorff distance; CTV, clinical target volume; MDTA, mean distance to agreement; CV, coefficient of variation; SDD, standard distance deviation.
Selected works on comparison between atlas-based and deep learning-based automated segmentation.
| Reference | Year | Evaluation Metrics | Atlas-based | Deep learning-based |
|---|---|---|---|---|
| Lustberg et al. | 2017 | time saved (compared with manual segmentation) | 7.8 min | 10 min |
| Zhu et al. | 2018 | DSC | heart: 0.90 ± 0.04; liver: 0.87 ± 0.05; esophagus: 0.54 ± 0.08; spinal cord: 0.71 ± 0.06; lungs: 0.95 ± 0.01 | heart: 0.91 ± 0.03; liver: 0.89 ± 0.02; esophagus: 0.71 ± 0.05; spinal cord: 0.79 ± 0.03; lungs: 0.95 ± 0.01 |
| | | MSD (mm) | heart: 3.14 ± 1.31; liver: 3.83 ± 1.74; esophagus: 2.67 ± 1.26; spinal cord: 3.03 ± 1.57; lungs: 1.85 ± 0.53 | heart: 2.92 ± 1.51; liver: 3.21 ± 0.93; esophagus: 2.18 ± 0.80; spinal cord: 1.25 ± 0.23; lungs: 1.93 ± 0.51 |
| | | 95%HD (mm) | heart: 9.53 ± 4.99; liver: 11.87 ± 5.06; esophagus: 9.45 ± 4.64; spinal cord: 11.97 ± 6.88; lungs: 8.07 ± 2.39 | heart: 7.98 ± 4.56; liver: 10.06 ± 4.28; esophagus: 7.83 ± 2.85; spinal cord: 4.01 ± 2.05; lungs: 7.96 ± 2.57 |
| Zhang et al. | 2020 | average time | 2.4 min | 1.6 min |
| | | DSC | left lung: 0.932 ± 0.040; right lung: 0.943 ± 0.017; heart: 0.858 ± 0.077; spinal cord: 0.868 ± 0.031; liver: 0.936 ± 0.012; esophagus: – | left lung: 0.948 ± 0.013; right lung: 0.943 ± 0.015; heart: 0.893 ± 0.048; spinal cord: 0.821 ± 0.046; liver: 0.937 ± 0.027; esophagus: 0.732 ± 0.069 |
| | | MSD (mm) | left lung: 1.73 ± 1.58; right lung: 2.17 ± 2.44; heart: 3.66 ± 2.44; spinal cord: 0.66 ± 0.16; liver: 2.11 ± 1.31; esophagus: – | left lung: 1.10 ± 0.15; right lung: 2.23 ± 2.33; heart: 1.65 ± 0.48; spinal cord: 0.87 ± 0.21; liver: 2.03 ± 1.49; esophagus: 1.38 ± 0.44 |
DSC, Dice similarity coefficient; MSD, mean surface distance; 95%HD, 95% Hausdorff distance.
Comparison of selected works on segmentation of esophagus.
| Reference | Year | Networks | DSC | HD (mm) | 95%HD (mm) | MSD (mm) |
|---|---|---|---|---|---|---|
| Zhu et al. | 2018 | adapted U-Net | 0.71 ± 0.05 | – | 7.83 ± 2.85 | 2.18 ± 0.80 |
| Dong et al. | 2019 | U-net-GAN | 0.75 ± 0.08 | – | 4.52 ± 3.81 | 1.05 ± 0.66 |
| Feng et al. | 2019 | 3D U-Net | 0.73 ± 0.09 | – | 8.71 ± 10.59 | 2.34 ± 2.38 |
| Harten et al. | 2019 | CNN | 0.85 ± 0.05 | 3.4 ± 2.3 | – | – |
| He et al. | 2019 | U-Net with multi-task learning | 0.86 | 0.274 | – | – |
| Vesal et al. | 2019 | 2D U-Net | 0.86 | 0.331 | – | – |
| Han et al. | 2019 | VB-Net | 0.87 | 0.259 | – | – |
| Zhang et al. | 2020 | ResNet-101 | 0.73 ± 0.07 | – | – | 1.38 ± 0.44 |
DSC, Dice similarity coefficient; HD, Hausdorff distance; 95%HD, 95% Hausdorff distance; MSD, mean surface distance; GAN, generative adversarial network.
Comparison of selected works on segmentation of spinal cord.
| Reference | Year | Networks | DSC | 95%HD (mm) | MSD (mm) |
|---|---|---|---|---|---|
| Zhu et al. | 2018 | adapted U-Net | 0.79 ± 0.03 | 4.01 ± 2.05 | 1.25 ± 0.23 |
| Dong et al. | 2019 | U-net-GAN | 0.90 ± 0.04 | 1.19 ± 0.46 | 0.38 ± 0.27 |
| Feng et al. | 2019 | 3D U-Net | 0.89 ± 0.04 | 1.89 ± 0.63 | 0.66 ± 0.25 |
| Zhang et al. | 2020 | ResNet-101 | 0.82 ± 0.05 | – | 0.87 ± 0.21 |
DSC, Dice similarity coefficient; HD, Hausdorff distance; 95%HD, 95% Hausdorff distance; MSD, mean surface distance; GAN, generative adversarial network.
Comparison of selected works on segmentation of heart.
| Reference | Year | Networks | DSC | HD (mm) | 95%HD (mm) | MSD (mm) |
|---|---|---|---|---|---|---|
| Zhu et al. | 2018 | adapted U-Net | 0.91 ± 0.03 | – | 7.98 ± 4.56 | 2.92 ± 1.51 |
| Dong et al. | 2019 | U-net-GAN | 0.87 ± 0.05 | – | 4.58 ± 3.67 | 1.49 ± 0.85 |
| Feng et al. | 2019 | 3D U-Net | 0.93 ± 0.02 | – | 6.57 ± 1.50 | 2.30 ± 0.49 |
| Harten et al. | 2019 | CNN | 0.95 ± 0.01 | 2.0 ± 1.1 | – | – |
| He et al. | 2019 | U-Net with multi-task learning | 0.95 | 0.138 | – | – |
| Vesal et al. | 2019 | 2D U-Net | 0.94 | 0.226 | – | – |
| Han et al. | 2019 | VB-Net | 0.95 | 0.127 | – | – |
| Zhang et al. | 2020 | ResNet-101 | 0.89 ± 0.05 | – | – | 1.65 ± 0.48 |
DSC, Dice similarity coefficient; HD, Hausdorff distance; 95%HD, 95% Hausdorff distance; MSD, mean surface distance; GAN, generative adversarial network.