Kendall J Kiser1,2,3, Sara Ahmed3, Sonja Stieb3, Abdallah S R Mohamed3,4, Hesham Elhalawani5, Peter Y S Park6, Nathan S Doyle6, Brandon J Wang6, Arko Barman2, Zhao Li2, W Jim Zheng2, Clifton D Fuller3,4, Luca Giancardo2,5.
Abstract
This manuscript describes a dataset of thoracic cavity segmentations and discrete pleural effusion segmentations we have annotated on 402 computed tomography (CT) scans acquired from patients with non-small cell lung cancer. The segmentation of these anatomic regions precedes fundamental tasks in image analysis pipelines such as lung structure segmentation, lesion detection, and radiomics feature extraction. Bilateral thoracic cavity volumes and pleural effusion volumes were manually segmented on CT scans acquired from The Cancer Imaging Archive "NSCLC Radiomics" data collection. Four hundred and two thoracic segmentations were first generated automatically by a U-Net based algorithm trained on chest CTs without cancer, manually corrected by a medical student to include the complete thoracic cavity (normal, pathologic, and atelectatic lung parenchyma, lung hilum, pleural effusion, fibrosis, nodules, tumor, and other anatomic anomalies), and revised by a radiation oncologist or a radiologist. Seventy-eight pleural effusions were manually segmented by a medical student and revised by a radiologist or radiation oncologist. Interobserver agreement between the radiation oncologist and radiologist corrections was acceptable. All expert-vetted segmentations are publicly available in NIfTI format through The Cancer Imaging Archive at https://doi.org/10.7937/tcia.2020.6c7y-gq39. Tabular data detailing clinical and technical metadata linked to segmentation cases are also available. Thoracic cavity segmentations will be valuable for developing image analysis pipelines on pathologic lungs, where current automated algorithms struggle most. In conjunction with gross tumor volume segmentations already available from "NSCLC Radiomics," pleural effusion segmentations may be valuable for investigating radiomics profile differences between effusion and primary tumor or training algorithms to discriminate between them.
Keywords: computer-aided decision support systems; image processing; image segmentation techniques; informatics in imaging; quantitative imaging
Year: 2020 PMID: 32749075 PMCID: PMC7722027 DOI: 10.1002/mp.14424
Source DB: PubMed Journal: Med Phys ISSN: 0094-2405 Impact factor: 4.071
Fig. 1 Thoracic cavity volumes were segmented automatically then iteratively corrected by a medical student and at least one radiologist or radiation oncologist to include the entire hemithoraces. [Color figure can be viewed at wileyonlinelibrary.com]
Fig. 2 Pleural effusion segmentations, excluding gross tumor volume, were delineated by a medical student and subsequently corrected by at least two radiologists. [Color figure can be viewed at wileyonlinelibrary.com]
Seven radiologists (Rad) and radiation oncologists (RO) collaborated to review and correct 402 thoracic cavity segmentations and 78 pleural effusion segmentations delineated by a fourth‐year medical student.
| Expert reviewer | Years of experience |
|---|---|
| Rad1 | 4 |
| Rad2 | 2 |
| Rad3 | 1 |
| Rad4 | 3 |
| RO1 | 4 |
| RO2 | 11 |
| RO3 | 5 |
Fig. 3 A schematic of interobserver comparisons, with the number of segmentation cases shared between observer pairs given as n. All 78 pleural effusion segmentations were reviewed and, as necessary, corrected by two radiologists: Rad3 and Rad4. A subset of 15 pleural effusion segmentations was also reviewed and corrected by RO1. In contrast, not all 402 thoracic cavity segmentations were reviewed by two physicians. Rather, four subsets of 21 or 22 thoracic cavity segmentations were randomly selected for dual review. All members of a given subset were exclusive to that subset. [Color figure can be viewed at wileyonlinelibrary.com]
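The random, mutually exclusive selection described in this caption can be sketched as follows. This is a minimal illustration and not the authors' procedure: the case identifiers, the subset sizes assigned to each observer pair, and the seed are all assumptions chosen only to match the "21 or 22" wording.

```python
import random


def draw_disjoint_subsets(case_ids, subset_sizes, seed=0):
    """Draw non-overlapping random subsets of the requested sizes."""
    total = sum(subset_sizes)
    if total > len(case_ids):
        raise ValueError("not enough cases for the requested subsets")
    rng = random.Random(seed)
    # Sampling without replacement guarantees no case appears twice.
    chosen = rng.sample(list(case_ids), total)
    subsets, start = [], 0
    for size in subset_sizes:
        subsets.append(chosen[start:start + size])
        start += size
    return subsets


# Hypothetical placeholder identifiers for the 402 thoracic cavity cases.
case_ids = [f"case-{i:03d}" for i in range(1, 403)]
# Observer pairs come from the schematic; the size given to each pair is assumed.
pairs = ["RO1-RO3", "Rad1-RO3", "Rad1-Rad2", "RO2-Rad2"]
sizes = [21, 22, 21, 22]
dual_review = dict(zip(pairs, draw_disjoint_subsets(case_ids, sizes, seed=42)))
```

Drawing one sample of the combined size and slicing it keeps the subsets exclusive by construction, so no separate overlap check is needed.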
Median and minimum values for Dice similarity coefficient (DSC), surface DSC, κ, 95HD, and symmetric ASD spatial similarity metrics calculated between paired physician segmentations. The distributions for each observer pair are significantly different from one another for all metrics (paired Mann‐Whitney U test P < 0.001). However, interobserver variability between pairs of physician‐vetted segmentations is generally acceptable. Median DSC, 95HD and symmetric ASD values for thoracic cavity segmentations are comparable to mean interobserver variability values reported by the 2017 AAPM Thoracic Auto‐Segmentation Challenge for lung segmentations. In general, pleural effusion segmentation interobserver agreement is also acceptable but more variable, reflecting both variation in pleural effusion size and inclusion or exclusion of trace pleural fluid.
| Metric | Statistic | Pleural effusions: RO1‐Rad3 | Pleural effusions: RO1‐Rad4 | Pleural effusions: Rad3‐Rad4 | Thoracic cavities: RO1‐RO3 | Thoracic cavities: Rad1‐RO3 | Thoracic cavities: Rad1‐Rad2 | Thoracic cavities: RO2‐Rad2 |
|---|---|---|---|---|---|---|---|---|
| Conformality metrics (unitless) | | | | | | | | |
| DSC | Med | 0.81 | 0.85 | 0.93 | 0.99 | 1.00 | 1.00 | 1.00 |
| | Min | 0.10 | 0.26 | 0.20 | 0.96 | 0.91 | 0.99 | 0.97 |
| sDSC | Med | 0.62 | 0.77 | 0.87 | 0.94 | 0.98 | 0.98 | 1.00 |
| | Min | 0.20 | 0.32 | 0.21 | 0.73 | 0.82 | 0.94 | 0.88 |
| Kappa | Med | 0.81 | 0.85 | 0.93 | 0.99 | 0.99 | 0.99 | 0.99 |
| | Min | 0.10 | 0.26 | 0.20 | 0.96 | 0.90 | 0.99 | 0.97 |
| Surface distance metrics (mm) | | | | | | | | |
| 95HD | Med | 24.00 | 21.65 | 5.31 | 1.95 | 0.00 | 0.00 | 0.00 |
| | Max | 127.83 | 127.80 | 161.48 | 11.35 | 55.11 | 0.98 | 24.82 |
| ASD | Med | 1.82 | 2.45 | 0.79 | 0.25 | 0.05 | 0.03 | 0.00 |
| | Max | 23.53 | 22.39 | 33.49 | 1.01 | 5.78 | 0.12 | 1.68 |
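As an illustration of two of the conformality metrics tabulated above, the sketch below computes the Dice similarity coefficient and Cohen's κ from voxel-wise agreement counts between two binary masks. It assumes the masks have been flattened to equal-length sequences of 0/1 values and uses the standard textbook definitions; the function names are hypothetical, and the source does not state which software was used to obtain the reported values.

```python
def confusion_counts(mask_a, mask_b):
    """Count voxels labeled by both observers, only one, or neither."""
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must contain the same number of voxels")
    both = only_a = only_b = neither = 0
    for a, b in zip(mask_a, mask_b):
        if a and b:
            both += 1
        elif a:
            only_a += 1
        elif b:
            only_b += 1
        else:
            neither += 1
    return both, only_a, only_b, neither


def dice(mask_a, mask_b):
    """DSC = 2|A ∩ B| / (|A| + |B|); two empty masks are treated as identical."""
    both, only_a, only_b, _ = confusion_counts(mask_a, mask_b)
    denom = 2 * both + only_a + only_b
    return 1.0 if denom == 0 else 2 * both / denom


def cohen_kappa(mask_a, mask_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    both, only_a, only_b, neither = confusion_counts(mask_a, mask_b)
    n = both + only_a + only_b + neither
    p_observed = (both + neither) / n
    p_chance = ((both + only_a) * (both + only_b)
                + (neither + only_b) * (neither + only_a)) / n ** 2
    return 1.0 if p_chance == 1 else (p_observed - p_chance) / (1 - p_chance)


# Toy 100-voxel example: observer A labels 10 voxels, observer B labels 9,
# and 8 voxels are labeled by both.
mask_a = [1] * 10 + [0] * 90
mask_b = [1] * 8 + [0] * 2 + [1] + [0] * 89
print(f"DSC = {dice(mask_a, mask_b):.3f}, kappa = {cohen_kappa(mask_a, mask_b):.3f}")
```

When background voxels greatly outnumber foreground voxels, chance agreement on the foreground becomes negligible and κ approaches DSC, which is consistent with the near-identical DSC and Kappa rows in the table.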
Select cases with extreme spatial similarity metric values are explored visually in Fig. 6.
Fig. 6 A visual exploration of physician‐corrected segmentation pairs with the least interobserver agreement. Case LUNG1‐005 accounts for the worst Dice similarity coefficient (DSC), surface DSC, and κ values between any pair of pleural effusion segmentations. RO1 mistook atelectatic lung for effusion, but Rad3 and Rad4 did not. Case LUNG1‐253 accounts for the worst 95HD and symmetric ASD values between any pair of pleural effusion segmentations. Rad3 and Rad4 varied in how much trace pleural fluid they chose to segment. This exposes a weakness in our pleural effusion segmentation methodology because we did not decide at project initiation whether or to what extent trace pleural fluid should be part of the segmentation. Case LUNG1‐026 accounts for the worst sDSC value between any pair of thoracic cavity segmentations. RO3 failed to segment the full extent of peri‐mediastinal primary gross tumor volume, right effusion, and left hilum (orange arrows). Case LUNG1‐354 accounts for the worst DSC, κ, 95HD, and symmetric ASD values between any pair of thoracic cavity segmentations. RO3 erroneously excluded a collapsed left lung. [Color figure can be viewed at wileyonlinelibrary.com]
Fig. 4 Dice similarity coefficient (DSC) distributions reveal consistently strong agreement (>0.98) between paired, independently vetted radiologist and radiation oncologist thoracic cavity segmentations. Colored curves are kernel density estimates of DSC distributions. Note that Figs. 4 and 5 do not share the same x axis limits; the difference in DSC distributions in Figs. 4 and 5 is at least partially an artifact of the difference in average volume between thoraces and effusions. [Color figure can be viewed at wileyonlinelibrary.com]
Fig. 5 Dice similarity coefficient (DSC) distributions indicate good agreement (>0.8) between most paired, independently vetted radiologist and radiation oncologist pleural effusion segmentations. Interpretation of this result should respect that DSC values calculated on trace pleural effusions are sensitive to variation between segmentations on the order of only a few voxels. As in Fig. 4, colored curves are kernel density estimates of DSC distributions. [Color figure can be viewed at wileyonlinelibrary.com]
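The volume sensitivity noted in the captions for Figs. 4 and 5 can be made concrete with a short worked example. The voxel counts below are invented for illustration and are not taken from the dataset: the same 20‐voxel disagreement is applied to a hypothetical trace effusion and to a hypothetical thoracic cavity.

```python
def dice_from_counts(overlap, volume_a, volume_b):
    """DSC from voxel counts: 2 * |A ∩ B| / (|A| + |B|)."""
    return 2 * overlap / (volume_a + volume_b)


# In both scenarios observer B omits 20 voxels that observer A included,
# so B's segmentation is a subset of A's and the overlap equals B's volume.
missed = 20

# Hypothetical trace effusion: A segments 30 voxels, B segments 10 of them.
trace_a = 30
trace_b = trace_a - missed
trace_dsc = dice_from_counts(trace_b, trace_a, trace_b)

# Hypothetical thoracic cavity: A segments 3,000,000 voxels.
thorax_a = 3_000_000
thorax_b = thorax_a - missed
thorax_dsc = dice_from_counts(thorax_b, thorax_a, thorax_b)

print(f"trace effusion DSC:  {trace_dsc:.4f}")
print(f"thoracic cavity DSC: {thorax_dsc:.7f}")
```

Under these assumed counts, an identical absolute disagreement halves the DSC of the small structure while leaving the DSC of the large structure essentially at 1, which is why the two figures are not directly comparable.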