Marcel Früh, Marc Fischer, Andreas Schilling, Sergios Gatidis, Tobias Hepp.
Abstract
Purpose: We introduce and evaluate deep learning methods for weakly supervised segmentation of tumor lesions in whole-body fluorodeoxyglucose positron emission tomography (FDG-PET) based solely on binary global labels ("tumor" versus "no tumor"). Approach: We propose a three-step approach based on (i) a deep learning framework for image classification, (ii) subsequent generation of class activation maps (CAMs) using different CAM methods (CAM, GradCAM, GradCAM++, ScoreCAM), and (iii) final tumor segmentation based on the aforementioned CAMs. A VGG-based classification neural network was trained to distinguish between PET image slices with and without FDG-avid tumor lesions. Subsequently, the CAMs of this network were used to identify the tumor regions within images. The proposed framework was applied to FDG-PET/CT data of 453 oncological patients with available manually generated ground-truth segmentations. Quantitative segmentation performance was assessed for the different CAM approaches and compared with the manual ground-truth segmentation and with supervised segmentation methods. In addition, further biomarkers (metabolic tumor volume, MTV, and total lesion glycolysis, TLG) were extracted from the segmentation masks.
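The CAM step (ii) of the approach can be sketched as a weighted sum of the classifier's final convolutional feature maps, using the final linear-layer weights of the target class. This is a minimal NumPy sketch under illustrative shapes and names, not the authors' implementation:

```python
import numpy as np

def class_activation_map(features, fc_weights, class_idx):
    """Weighted sum of feature maps (basic CAM).

    features:   (C, H, W) activations of the last conv layer
    fc_weights: (num_classes, C) weights of the final linear layer
    class_idx:  index of the target class (e.g., "tumor")
    """
    w = fc_weights[class_idx]                          # (C,)
    cam = np.tensordot(w, features, axes=([0], [0]))   # (H, W)
    cam = np.maximum(cam, 0)                           # keep positive evidence
    if cam.max() > 0:
        cam /= cam.max()                               # normalise to [0, 1]
    return cam

# toy example: 2 channels, 2x2 feature maps
feats = np.array([[[1.0, 0.0], [0.0, 0.0]],
                  [[0.0, 2.0], [0.0, 0.0]]])
w = np.array([[0.5, 0.0], [1.0, 1.0]])  # class 1 = "tumor"
cam = class_activation_map(feats, w, class_idx=1)
```

Gradient-based variants (GradCAM, GradCAM++) and ScoreCAM replace the linear-layer weights with gradient- or score-derived channel weights, but the weighted-sum structure is the same.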
Keywords: computed tomography; deep learning; label efficiency; oncological imaging; positron emission tomography; weakly supervised learning
Year: 2021 PMID: 34660843 PMCID: PMC8510879 DOI: 10.1117/1.JMI.8.5.054003
Source DB: PubMed Journal: J Med Imaging (Bellingham) ISSN: 2329-4302
Fig. 1 Exemplary PET/CT slice with high SUV uptake next to the hilum of the right lung. The right image shows the manually annotated segmentation mask as a red overlay on the PET image.
Fig. 2 Distribution of tumor sizes for slices containing malignant tissue. Slices with small tumors dominate.
Fig. 3 Proposed processing routine. First, a binary tumor classifier is trained in a supervised manner on PET/CT data. Then a class activation map is computed from the classifier. Finally, threshold-based segmentation is performed on the PET images within the region proposed by the CAM.
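The final step of this routine, thresholding restricted to the CAM-proposed region, can be sketched as follows. The CAM cutoff and the relative SUV fraction are illustrative values, not the paper's chosen parameters:

```python
import numpy as np

def segment_within_cam(pet, cam, cam_thresh=0.5, suv_frac=0.4):
    """Threshold-based segmentation restricted to the CAM region.

    pet: 2D SUV image; cam: activation map in [0, 1], same shape.
    cam_thresh and suv_frac are illustrative, not the paper's values.
    """
    region = cam >= cam_thresh                 # CAM region proposal
    if not region.any():
        return np.zeros_like(region)
    suv_cut = suv_frac * pet[region].max()     # relative SUV threshold
    return region & (pet >= suv_cut)

pet = np.array([[1.0, 8.0], [0.5, 4.0]])
cam = np.array([[0.1, 0.9], [0.2, 0.8]])
mask = segment_within_cam(pet, cam)
```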
Fig. 4 PET with ground-truth segmentation, corresponding activation maps from the four CAM methods, extracted segmentations, and the corresponding CT for a sample slice containing a tumor.
Fig. 5 Per-subject Dice scores for the weakly supervised segmentation methods (blue) and the supervised baselines (red).
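The Dice score used for this comparison is the standard overlap measure between predicted and ground-truth binary masks, 2|A∩B| / (|A| + |B|). A minimal sketch (the empty-mask convention is an assumption, not stated in the record):

```python
import numpy as np

def dice_score(pred, gt):
    """Dice coefficient between two binary masks."""
    pred = pred.astype(bool)
    gt = gt.astype(bool)
    denom = pred.sum() + gt.sum()
    if denom == 0:
        return 1.0  # convention: two empty masks agree perfectly
    return 2.0 * (pred & gt).sum() / denom

pred = np.array([[1, 1], [0, 0]])
gt = np.array([[1, 0], [0, 0]])
score = dice_score(pred, gt)
```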
Fig. 6 Comparison between true and estimated MTV. All units in ml.
Intraclass correlation between estimated and ground-truth MTV/TLG.

| | MTV (ml) | | | TLG (g) | | |
|---|---|---|---|---|---|---|
| Method | ICC | 95%-CI | p-value | ICC | 95%-CI | p-value |
| CAM | 0.64 | [0.50, 0.74] | <0.001 | 0.85 | [0.77, 0.90] | <0.001 |
| GradCAM | 0.55 | [0.39, 0.67] | <0.001 | 0.40 | [0.19, 0.57] | <0.001 |
| GradCAM++ | 0.64 | [0.48, 0.73] | <0.001 | 0.79 | [0.66, 0.86] | <0.001 |
| ScoreCAM | 0.64 | [0.50, 0.75] | <0.001 | 0.82 | [0.71, 0.88] | <0.001 |
| Threshold | 0.59 | [0.45, 0.71] | <0.001 | 0.88 | [0.83, 0.92] | <0.001 |
| UNET | 0.94 | [0.91, 0.96] | <0.001 | 0.99 | [0.98, 0.99] | <0.001 |
Fig. 7 Comparison between true and estimated TLG. All units in g.
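The two biomarkers compared above follow standard definitions: MTV is the number of tumor voxels times the voxel volume, and TLG is MTV times the mean SUV inside the mask. A minimal sketch with an illustrative voxel volume (the function name and values are not from the paper):

```python
import numpy as np

def mtv_tlg(suv, mask, voxel_vol_ml):
    """MTV (ml) and TLG (g) from a binary tumor mask.

    MTV = number of tumor voxels x voxel volume;
    TLG = MTV x mean SUV inside the mask (standard definitions).
    """
    m = mask.astype(bool)
    n = int(m.sum())
    mtv = n * voxel_vol_ml
    tlg = mtv * float(suv[m].mean()) if n else 0.0
    return mtv, tlg

suv = np.array([[2.0, 4.0], [0.0, 0.0]])
mask = np.array([[1, 1], [0, 0]])
mtv, tlg = mtv_tlg(suv, mask, voxel_vol_ml=0.5)
```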
CAM Segmentation (algorithm listing; only fragments recovered): 1: Predict class … 3: Upscale … 6: Segmentation mask = …
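The "Upscale" step of the listing brings the coarse CAM (computed at feature-map resolution) up to PET image resolution before thresholding. A nearest-neighbour sketch; the scale factor and interpolation method are illustrative, as the listing does not specify them:

```python
import numpy as np

def upscale_nearest(cam, factor):
    """Nearest-neighbour upscaling of a coarse CAM by an integer factor."""
    # each CAM cell is repeated into a factor x factor block
    return np.kron(cam, np.ones((factor, factor)))

cam = np.array([[0.0, 1.0], [0.5, 0.0]])
big = upscale_nearest(cam, 2)
```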