Anne E Thessen, Cynthia J Grondin, Resham D Kulkarni, Susanne Brander, Lisa Truong, Nicole A Vasilevsky, Tiffany J Callahan, Lauren E Chan, Brian Westra, Mary Willis, Sarah E Rothenberg, Annie M Jarabek, Lyle Burgoon, Susan A Korrick, Melissa A Haendel.
Abstract
BACKGROUND: A critical challenge in genomic medicine is identifying the genetic and environmental risk factors for disease. Currently, the available data link a majority of known coding human genes to phenotypes, but the environmental component of human disease is extremely underrepresented in these linked data sets. Without environmental exposure information, our ability to realize precision health is limited, even with the promise of modern genomics. Achieving integration of gene, phenotype, and environment will require extensive translation of data into a standard, computable form and the extension of the existing gene/phenotype data model. The data standards and models needed to achieve this integration do not currently exist.
Year: 2020 PMID: 33369481 PMCID: PMC7769179 DOI: 10.1289/EHP7215
Source DB: PubMed Journal: Environ Health Perspect ISSN: 0091-6765 Impact factor: 9.031
Figure 1. Example competency questions. Competency questions were developed by workshop participants (see the section “Competency Questions” in the Supplemental Material) to help guide the data model and expose deficiencies in ontological coverage. The input indicates the type of information provided by the hypothetical user in the example question. The output indicates the type of information the hypothetical user is requesting. The semantic types of the inputs and outputs are listed as priority concepts that can be represented by the listed ontologies, such as chemical (dashed line), taxon (white), phenotype or disease (hashed lines), and genotype (gray). These semantic types are represented in the semantic model (Figure 2) as the model data type. Note: ChEBI, Chemical Entities of Biological Interest; ECTO, Environmental Conditions, Treatments, and Exposures Ontology; GENO, Genotype Ontology; GO, Gene Ontology; Mondo, Mondo Disease Ontology; NCBI, National Center for Biotechnology Information; uPheno, Unified Phenotype Ontology.
Figure 2. Semantic model for representing environmental exposures. Exposure events (open box, represented by terms in the ECTO) can be incorporated into the Monarch knowledge graph via direct links to the diseases and phenotypes resulting from the exposure, the genes being affected, and the organism(s) or organism parts that are being exposed. Note: ECTO, Environmental Conditions, Treatments, and Exposures Ontology.
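The linkage described in the Figure 2 caption can be pictured as subject-predicate-object triples radiating from one exposure event. The following is a minimal sketch under stated assumptions: the function name `link_exposure`, the predicate labels, and the identifiers in the usage example are all hypothetical placeholders, not the actual Monarch or ECTO vocabulary.

```python
from typing import Iterable, List, Tuple

Triple = Tuple[str, str, str]


def link_exposure(
    event: str,
    outcomes: Iterable[str] = (),
    genes: Iterable[str] = (),
    receptors: Iterable[str] = (),
) -> List[Triple]:
    """Build direct links from one exposure event to its neighbours.

    The predicate labels below are illustrative assumptions only.
    """
    triples: List[Triple] = []
    # Diseases and phenotypes resulting from the exposure
    triples += [(event, "has_outcome", o) for o in outcomes]
    # Genes being affected
    triples += [(event, "affects_gene", g) for g in genes]
    # Organism(s) or organism parts being exposed
    triples += [(event, "has_receptor", r) for r in receptors]
    return triples


# Placeholder identifiers, not real ontology terms
example = link_exposure(
    "EXPOSURE:example",
    outcomes=["DISEASE:example"],
    genes=["GENE:example"],
    receptors=["TAXON:example"],
)
```

Keeping the exposure event as the shared subject mirrors the caption's "direct links" wording: each kind of neighbour is reachable in one hop from the event.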
Figure 3. ExO and ECTO exposure data model. (A) The ExO (Mattingly et al. 2012) model combines stressors and receptors in the representation of the exposure event, which is linked to an outcome. Each element (dark gray) has associated metadata (light gray). (B) Exposure events in ECTO are precomposed with the stressor, medium, and route (when known) contained in the axiomatic definition: “exposure event” and (“has exposure stimulus” some “stressor”) and (“has exposure medium” some “medium”) and (“has exposure route” some “route”). Additional metadata about the exposure event (dark gray) are added as annotations on the event (light gray). Information about the receptor and the outcome are linked to the exposure event in the larger knowledge graph as shown in Figure 2. Note: ECTO, Environmental Conditions, Treatments, and Exposures Ontology; ExO, Exposure Ontology.
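The precomposed pattern quoted in the Figure 3 caption can be illustrated by assembling the class expression from its parts. This is a sketch, not ECTO's actual build tooling: the `ExposureEvent` class is hypothetical, the term labels in the usage example are illustrative rather than real ECTO terms, and treating both medium and route as optional is an assumption drawn from the caption's "when known".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ExposureEvent:
    """Hypothetical container for the parts of a precomposed exposure class."""

    stressor: str
    medium: Optional[str] = None  # assumed optional ("when known")
    route: Optional[str] = None   # assumed optional ("when known")

    def class_expression(self) -> str:
        """Return the axiomatic definition as a Manchester-style string."""
        parts = [
            '"exposure event"',
            f'("has exposure stimulus" some "{self.stressor}")',
        ]
        # Medium and route are appended only when they are known
        if self.medium is not None:
            parts.append(f'("has exposure medium" some "{self.medium}")')
        if self.route is not None:
            parts.append(f'("has exposure route" some "{self.route}")')
        return " and ".join(parts)


# Illustrative labels only
event = ExposureEvent("arsenic", medium="drinking water", route="ingestion")
expression = event.class_expression()
```

Metadata about the event, along with the receptor and outcome, is left out of the expression on purpose: per the caption, those are attached as annotations or as links in the larger knowledge graph rather than being part of the axiomatic definition.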