E Tryggestad1, A Anand2, C Beltran3, J Brooks1, J Cimmiyotti1, N Grimaldi1, T Hodge1, A Hunzeker1, J J Lucido1, N N Laack1, R Momoh1, D J Moseley1, S H Patel2, A Ridgway2, S Seetamsetty1, S Shiraishi1, L Undahl1, R L Foote1.
Abstract
In this era of patient-centered, outcomes-driven and adaptive radiotherapy, deep learning is now being successfully applied to tackle imaging-related workflow bottlenecks such as autosegmentation and dose planning. These applications typically require supervised learning approaches enabled by relatively large, curated radiotherapy datasets which are highly reflective of the contemporary standard of care. However, little has been previously published describing technical infrastructure, recommendations, methods or standards for radiotherapy dataset curation in a holistic fashion. Our radiation oncology department has recently embarked on a large-scale project in partnership with an external partner to develop deep-learning-based tools to assist with our radiotherapy workflow, beginning with autosegmentation of organs-at-risk. This project will require thousands of carefully curated radiotherapy datasets comprising all body sites we routinely treat with radiotherapy. Given such a large project scope, we have approached the need for dataset curation rigorously, with an aim towards building infrastructure that is compatible with efficiency, automation and scalability. Focusing on our first use-case pertaining to head and neck cancer, we describe our developed infrastructure and novel methods applied to radiotherapy dataset curation, inclusive of personnel and workflow organization, dataset selection, expert organ-at-risk segmentation, quality assurance, patient de-identification, data archival and transfer. Over the course of approximately 13 months, our expert multidisciplinary team generated 490 curated head and neck radiotherapy datasets. This task required approximately 6000 human-expert hours in total (not including planning and infrastructure development time). This infrastructure continues to evolve and will support ongoing and future project efforts.
Keywords: artificial intelligence; autosegmentation; convolutional neural network; curation; deep learning; head and neck cancer; organs-at-risk; radiotherapy
Year: 2022 PMID: 36106100 PMCID: PMC9464982 DOI: 10.3389/fonc.2022.936134
Source DB: PubMed Journal: Front Oncol ISSN: 2234-943X Impact factor: 5.738
DICOM objects comprising the curated datasets.
| Description | DICOM modality | Required (R) vs. optional (O) | Typical voxel size (x, y, z) in mm |
|---|---|---|---|
| H&N Curated OARs | RTSTRUCT | R | n/a |
| H&N Planning CT Recon. | CT | R | (1.27, 1.27, ≤2.5) |
| Small FOV CT Recon. | CT | O | (≤0.59, ≤0.59, ≤2.5) |
| Contrast CT Recon. | CT | O | (1.27, 1.27, ≤2.5) |
| Contrast CT Registration | REG | O | n/a |
| H&N Clinical OARs | RTSTRUCT | R | n/a |
| H&N Clinical RT Plan | RTPLAN or RTIONPLAN | R | n/a |
| H&N Clinical RT Dose | RTDOSE | R | (≤3.0, ≤3.0, ≤3.0) |
≠ Most-frequent voxel size in z was 2 mm.
† Most-frequent voxel size was (0.59, 0.59, 1) mm.
n/a, not applicable.
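As an illustration of how the required (R) rows above could be turned into a completeness check, the following is a minimal sketch and not the authors' tooling. It assumes the caller has already collected one modality string per DICOM series or object in a dataset (for example from the Modality attribute); the function name `missing_required` and the problem-message format are hypothetical.

```python
from collections import Counter

# Minimum counts read off the "R" rows of the table above.
# RTSTRUCT is required twice (curated OARs and clinical OARs).
REQUIRED = {"CT": 1, "RTSTRUCT": 2, "RTDOSE": 1}

# The clinical plan may be either a photon/electron plan or an ion plan.
PLAN_MODALITIES = ("RTPLAN", "RTIONPLAN")


def missing_required(modalities):
    """Return a list of problems; an empty list means the dataset looks complete.

    `modalities` holds one entry per series/object (not one per CT slice).
    Optional objects such as REG are simply ignored.
    """
    counts = Counter(modalities)
    problems = [
        f"{name}: need {need}, found {counts[name]}"
        for name, need in REQUIRED.items()
        if counts[name] < need
    ]
    if not any(counts[name] for name in PLAN_MODALITIES):
        problems.append("RTPLAN or RTIONPLAN: need 1, found 0")
    return problems
```

A dataset containing `["CT", "RTSTRUCT", "RTSTRUCT", "RTIONPLAN", "RTDOSE"]` passes this check, while one with a single RTSTRUCT is reported as incomplete. The check only counts objects; it does not verify which RTSTRUCT is the curated one.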
H&N cancer sites or histologies for cases included in our curated dataset.
| H&N Cancer site | Sub-sites or Relevant histologies (if applicable) |
|---|---|
| Hypopharynx | Pyriform sinus; postcricoid mucosa; posterior pharyngeal wall |
| Larynx | Supraglottic; glottic; subglottic |
| Nasal cavity | |
| Nasopharynx | |
| Oral cavity | Lip; gingiva; buccal mucosa; floor of mouth; oral tongue; retromolar trigone; hard palate |
| Oropharynx | Tonsil; base of tongue; soft palate; posterior pharyngeal wall |
| Paraganglioma | Carotid body; vagale; jugulotympanicum |
| Paranasal sinus | Maxillary; sphenoid; ethmoid; frontal |
| Salivary gland | Parotid; submandibular; sublingual; minor |
| Skin | Squamous cell ca.; basal cell ca.; Merkel cell ca.; melanoma |
| Thyroid gland | |
H&N OAR segmentation labels and their anatomic designation.
| H&N OAR labels | Anatomic name or designation |
|---|---|
| brachial_plex_[l,r] | Brachial plexus nerve |
| brain | |
| brain_stem | Brainstem |
| carotid_artery_[l,r] | |
| cochlea_[l,r] | |
| constrictors_p | Pharyngeal constrictor muscle |
| cord | Spinal cord |
| crico_p_inlet | Cricopharyngeal inlet muscle |
| esophagus | |
| esophagus_cerv | Cervical esophagus |
| ext_aud_canal_[l,r] | External auditory canal |
| eye_[l,r] | |
| lacrimal_[l,r] | Lacrimal gland |
| larynx | |
| lens_[l,r] | |
| lips | |
| lung_[l,r] | |
| mandible | |
| mastoid_[l,r] | Mastoid air cells |
| nasal_cavity | Nasal cavity (region) |
| optic_nrv_[l,r] | Optic nerve |
| oral_cavity | Oral cavity (region) |
| parotid_[l,r] | Parotid gland |
| pituitary | Pituitary gland |
| retina_[l,r] | |
| semi_cir_canal_[l,r] | Semicircular canal |
| sub_mandib_[l,r] | Submandibular gland |
| thyroid | Thyroid gland |
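The `_[l,r]` suffix in the label table appears to be shorthand for a left/right pair of structures. A minimal sketch of expanding that shorthand is shown below; it assumes the concrete labels are formed by appending `_l` and `_r` to the stem, which this table does not state explicitly, and the helper name `expand_label` is hypothetical.

```python
import re

# Matches labels written with the bilateral shorthand, e.g. "parotid_[l,r]".
_BILATERAL = re.compile(r"^(?P<stem>.+)_\[l,r\]$")


def expand_label(label):
    """Expand a bilateral shorthand label into its two sides.

    Assumption: "<stem>_[l,r]" stands for "<stem>_l" and "<stem>_r".
    Labels without the shorthand are returned unchanged.
    """
    match = _BILATERAL.match(label)
    if match is None:
        return [label]
    stem = match.group("stem")
    return [f"{stem}_l", f"{stem}_r"]


# Example with three labels taken from the table above.
expanded = [name for label in ["brain", "parotid_[l,r]", "cord"]
            for name in expand_label(label)]
```

Here `expanded` is `["brain", "parotid_l", "parotid_r", "cord"]`.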
Figure 1. Schematic of technical curation infrastructure. (MCR, Mayo Clinic Rochester; MCA, Mayo Clinic Arizona; MCF, Mayo Clinic Florida; DeID, de-identification; CSS, Cloud Storage Service).
Figure 2. Schematic of dataset curation workflow.
Figure 3. Example of curated H&N OAR segmentations for a randomly selected case. (A) Moving from left-to-right, top-to-bottom: selected axial slices moving from superior to inferior aspect of the H&N RT planning CT. (B) Anteroposterior maximum intensity projection rendering with OAR projections overlaid. (C) 3D perspective rendering of 2D DICOM-RTStruct-format OARs with tick-marked scales on DICOM coordinates as indicated.
Mean, median and standard deviation of time spent per recorded case by H&N anatomy experts for OAR segmentation and related revisions, broken down per cohort subset.
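| Cohort | NCases | MDA carotid time (min.): NSamples | MDA carotid time (min.): mean | MDA carotid time (min.): median | MDA carotid time (min.): σ | MDA other OAR time (hr.): NSamples | MDA other OAR time (hr.): mean | MDA other OAR time (hr.): median | MDA other OAR time (hr.): σ | Physician time (hr.): NSamples | Physician time (hr.): mean | Physician time (hr.): median | Physician time (hr.): σ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|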
| 1 | 251 | 28 | 27.7 | 27.5 | 6.6 | 63 | 6.9 | 6.00 | 2.9 | - | - | - | - |
| 2 | 105 | 39 | 24.1 | 20.0 | 9.2 | 72 | 7.1 | 7.00 | 0.6 | - | - | - | - |
| 3 | 99 | 74 | 35.3 | 35.0 | 8.7 | 77 | 7.4 | 7.50 | 0.6 | - | - | - | - |
| Hold Out | 35 | - | - | - | - | 35 | 6.6 | 6.75 | 0.8 | 34 | 4.0 | 4.25 | 1.6 |
| Combined | 490 | 141 | 30.7 | 30.0 | 9.8 | 247 | 7.1 | 7.00 | 1.6 | 34 | 4.0 | 4.25 | 1.6 |
MDA time spent on carotid segmentation was recorded to the nearest 5 minutes; time spent by MDAs on other OARs, as well as physician time, was recorded to the nearest 15 minutes. MDA effort can be considered as interchangeable with dosimetrist effort.
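The means in the Combined row are consistent with a sample-weighted (pooled) mean of the per-cohort means. The short sketch below reproduces that arithmetic from the tabulated values; the function name `pooled_mean` is hypothetical, and this is a consistency check rather than the authors' analysis code.

```python
def pooled_mean(groups):
    """Sample-weighted mean of per-group means.

    `groups` is a list of (n_samples, mean) pairs.
    """
    total_n = sum(n for n, _ in groups)
    return sum(n * mean for n, mean in groups) / total_n


# (NSamples, mean) pairs copied from the table above.
carotid_min = [(28, 27.7), (39, 24.1), (74, 35.3)]               # cohorts 1-3
other_oar_hr = [(63, 6.9), (72, 7.1), (77, 7.4), (35, 6.6)]      # cohorts 1-3 + hold out

print(round(pooled_mean(carotid_min), 1))   # 30.7, matching the Combined row
print(round(pooled_mean(other_oar_hr), 1))  # 7.1, matching the Combined row
```

The median and σ columns cannot be pooled this way from the summary values alone; they require the per-case records.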
Figure 4. (A) Box and whisker plots of DSC comparing curated versus clinical H&N OARs (NCases=490; counts per OAR are variable, dependent on the set of OARs available per case). "Bulls-eyes" indicate the median values; medians are connected over all OARs with a dashed line. (B) Same representation as in panel (A) for the alternative metric Overlap DSC. The dashed line showing medians from (standard) DSC, taken from panel (A), is projected onto this panel for comparison.
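For readers unfamiliar with the metric in panel (A), the standard Dice similarity coefficient (DSC) between two segmentations A and B is 2|A∩B| / (|A| + |B|). The sketch below computes it for two structures represented as sets of voxel indices. It is an illustration of the textbook definition under that representation, not the authors' implementation, and it does not cover the Overlap DSC variant, whose definition is not given in this excerpt. Returning 1.0 when both structures are empty is an assumed convention.

```python
def dice(a, b):
    """Dice similarity coefficient between two collections of voxel indices.

    Each voxel is any hashable index, e.g. an (i, j, k) tuple.
    """
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # assumed convention: two empty structures agree perfectly
    return 2 * len(a & b) / (len(a) + len(b))


# Toy example: two 2-voxel structures sharing one voxel.
curated = {(0, 0, 0), (0, 0, 1)}
clinical = {(0, 0, 1), (0, 0, 2)}
print(dice(curated, clinical))  # 0.5
```

A DSC of 1 means the two contours cover exactly the same voxels, and 0 means they do not overlap at all.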