David J Foran, Eric B Durbin, Wenjin Chen, Evita Sadimin, Ashish Sharma, Imon Banerjee, Tahsin Kurc, Nan Li, Antoinette M Stroup, Gerald Harris, Annie Gu, Maria Schymura, Rajarsi Gupta, Erich Bremer, Joseph Balsamo, Tammy DiPrima, Feiqiao Wang, Shahira Abousamra, Dimitris Samaras, Isaac Hands, Kevin Ward, Joel H Saltz.
Abstract
BACKGROUND: Population-based state cancer registries are an authoritative source for cancer statistics in the United States. They routinely collect a variety of data, including patient demographics, primary tumor site, stage at diagnosis, first course of treatment, and survival, on every cancer case that is reported across all U.S. states and territories. The goal of our project is to enrich NCI's Surveillance, Epidemiology, and End Results (SEER) registry data with high-quality population-based biospecimen data in the form of digital pathology, machine-learning-based classifications, and quantitative histopathology imaging feature sets (referred to here as Pathomics features).
Keywords: Cancer registries; computational imaging; deep-learning; digital pathology
Year: 2022 PMID: 35136672 PMCID: PMC8794027 DOI: 10.4103/jpi.jpi_31_21
Source DB: PubMed Journal: J Pathol Inform
Figure 1. Workflow for assembling linked image/data cohorts.
Figure 2. Clinical Research Data Warehouse workflow. The research data warehouse aggregates information from multiple data sources such as electronic health records, tumor registries, and radiology and pathology archives. It facilitates review of imaging data and linked clinical data on a single-patient or cohort basis.
Representative categories and linked data elements
| Source | Category | Representative elements |
|---|---|---|
| Cancer Registry | Demographics | age_at_dx, sex, marital_status_at_dx, race, nhia, napiia, county_at_dx, etc. |
| | Vital information | vital_status, date_of_death, primary_cause |
| | Tumor information | Primary_site, laterality, grade, diagnosis_confirmation |
| | Tumor extension and metastasis | cs_extension, cs_tumor_size, cs_lymph_nodes, cs_mets_at_dx |
| | Pathology info and tumor staging | histology_icdo3, behavior_icdo3, clinical and pathology staging in AJCC 6, 7, 8 and SEER staging |
| | Site-specific data | cs_site_specific factors |
| | Tumor treatments | Surgical, radiation, hormone, BRM, and other cancer treatment information |
| Imaging | Pathology images | Digitized representative diagnostic slides in Olympus (.vsi) and Philips (.svs?) whole slide image formats, including image metadata such as imaging device, optical settings and configuration, specimen staining, etc. |
| | Computational imaging signatures | Tumor-infiltrating lymphocytes; tumor pattern segmentation; tumor and stromal nuclei segmentation; spatial and spectral signatures |
Figure 3. TIL and tumor analysis results displayed as heatmaps on the whole slide tissue image, with TIL analysis results on the left and tumor segmentation results on the right. Red indicates a higher probability that a patch is TIL-positive (or tumor-positive); blue indicates a lower probability.
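
The red-to-blue coloring described in this caption can be illustrated with a small sketch. It assumes a simple linear blue-to-red ramp over per-patch probabilities in [0, 1]; the function names `probability_to_rgb` and `heatmap_colors` are hypothetical, and the colormap actually used to render the figure is not specified in the source.

```python
from typing import List, Tuple

RGB = Tuple[int, int, int]


def probability_to_rgb(p: float) -> RGB:
    """Map a patch probability to a color on a linear blue-to-red ramp.

    Assumption: a plain linear ramp; the real viewer may use another colormap.
    """
    p = min(1.0, max(0.0, p))      # clamp out-of-range values into [0, 1]
    red = round(255 * p)           # more red as the probability rises
    blue = round(255 * (1.0 - p))  # more blue as the probability falls
    return (red, 0, blue)


def heatmap_colors(prob_grid: List[List[float]]) -> List[List[RGB]]:
    """Convert a 2D grid of patch probabilities into a grid of RGB colors."""
    return [[probability_to_rgb(p) for p in row] for row in prob_grid]


if __name__ == "__main__":
    grid = [[0.0, 1.0], [0.25, 0.75]]
    for row in heatmap_colors(grid):
        print(row)
```

Each grid cell stands for one image patch, so the output grid has the same shape as the patch-level prediction map and can be overlaid on a downsampled slide.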
Figure 4. Segmented nuclei overlaid on the WSI as blue polygons. Each polygon represents the boundary of a segmented nucleus.
Figure 5. The iterative workflow starts with a set of patches that are extracted from whole slide tissue images and labeled for initial model training. Predictions from the trained model are reviewed as feature maps and heatmaps. The heatmaps are annotated to generate additional labeled patches, which are added to the training dataset. The deep learning network is then retrained with the updated training dataset to refine the model.
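
The train, review, annotate, retrain cycle in this caption can be sketched as a generic loop. This is only an outline under stated assumptions: `train`, `predict`, and `review` are hypothetical callables supplied by the caller (standing in for the deep learning training step, patch-level inference, and expert annotation of heatmaps), and the stopping rule of a fixed round limit or an empty review is an assumption, not something the source states.

```python
from typing import Any, Callable, List, Sequence, Tuple

Labeled = Tuple[Any, Any]  # (patch, label)


def iterative_refinement(
    initial_labeled: Sequence[Labeled],
    unlabeled: Sequence[Any],
    train: Callable[[List[Labeled]], Any],
    predict: Callable[[Any, Any], Any],
    review: Callable[[List[Tuple[Any, Any]]], List[Labeled]],
    rounds: int = 3,
) -> Tuple[Any, List[Labeled]]:
    """Refine a model by repeatedly adding reviewer-labeled patches."""
    labeled = list(initial_labeled)
    model = train(labeled)                      # initial model training
    for _ in range(rounds):
        # Run the current model over the unlabeled patches.
        predictions = [(patch, predict(model, patch)) for patch in unlabeled]
        # A reviewer inspects the predictions and returns new labeled patches.
        new_labels = review(predictions)
        if not new_labels:                      # nothing to correct: stop early
            break
        labeled.extend(new_labels)              # grow the training dataset
        model = train(labeled)                  # retrain on the updated set
    return model, labeled
```

Passing the three steps in as callables keeps the loop independent of any particular network or annotation tool.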
Figure 6. A feature map representation of TIL and tumor analysis results generated from a WSI in The Cancer Genome Atlas repository. A low-resolution version of the input WSI is displayed in the upper left corner, the tumor segmentation map in the upper right, the TIL map in the lower left, and the combined and thresholded TIL and tumor maps in the lower right.
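
One way to read "combined and thresholded" is as a per-patch categorization of the two probability maps. The sketch below assumes both maps share the same grid and uses a cutoff of 0.5 for each; the cutoffs, the category names, and the function name `combine_maps` are illustrative assumptions rather than values taken from the source.

```python
from typing import List


def combine_maps(
    til_map: List[List[float]],
    tumor_map: List[List[float]],
    til_threshold: float = 0.5,    # assumed cutoff, not from the source
    tumor_threshold: float = 0.5,  # assumed cutoff, not from the source
) -> List[List[str]]:
    """Threshold two aligned probability grids and merge them per patch."""
    combined = []
    for til_row, tumor_row in zip(til_map, tumor_map):
        row = []
        for til_p, tumor_p in zip(til_row, tumor_row):
            is_til = til_p >= til_threshold
            is_tumor = tumor_p >= tumor_threshold
            if is_til and is_tumor:
                row.append("til_in_tumor")
            elif is_tumor:
                row.append("tumor")
            elif is_til:
                row.append("til")
            else:
                row.append("background")
        combined.append(row)
    return combined


if __name__ == "__main__":
    til = [[0.9, 0.1], [0.8, 0.2]]
    tumor = [[0.9, 0.9], [0.1, 0.1]]
    for row in combine_maps(til, tumor):
        print(row)
```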
Figure 7. Pathology image workflow. WSIs are de-identified and analyzed by deep-learning analysis pipelines deployed in containers. Image data are linked to the SEER Registry database to enhance it with quantitative imaging features (such as TIL distributions and tumor segmentations) extracted by deep-learning models. De-identified images and imaging features can then be used for data mining and research purposes.
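
The linkage step in this workflow can be pictured as a join between registry records and per-slide imaging features on a shared de-identified key. In the sketch below, the key name `study_id`, the feature name `til_fraction`, and the function `link_features` are hypothetical; the registry fields `age_at_dx` and `Primary_site` are borrowed from the table of linked data elements, and the sample values are invented purely for illustration.

```python
from typing import Any, Dict, List

Record = Dict[str, Any]


def link_features(
    registry_rows: List[Record],
    feature_rows: List[Record],
    key: str = "study_id",  # hypothetical de-identified linkage key
) -> List[Record]:
    """Attach imaging features to registry records that share the same key."""
    features_by_id = {row[key]: row for row in feature_rows}
    linked = []
    for record in registry_rows:
        features = features_by_id.get(record[key])
        if features is None:
            continue  # no analyzed slide for this case: leave it out
        merged = dict(record)  # copy so the input record is not mutated
        merged.update({k: v for k, v in features.items() if k != key})
        linked.append(merged)
    return linked


if __name__ == "__main__":
    registry = [
        {"study_id": "A1", "age_at_dx": 61, "Primary_site": "C50.9"},
        {"study_id": "B2", "age_at_dx": 47, "Primary_site": "C34.1"},
    ]
    features = [{"study_id": "A1", "til_fraction": 0.18}]
    print(link_features(registry, features))
```

This is an inner join: cases without imaging features are dropped, which suits building an image-linked cohort; a left join would be the alternative if every registry case must be retained.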