Christopher J Mungall, Julie A McMurry, Sebastian Köhler, James P Balhoff, Charles Borromeo, Matthew Brush, Seth Carbon, Tom Conlin, Nathan Dunn, Mark Engelstad, Erin Foster, J P Gourdine, Julius O B Jacobsen, Dan Keith, Bryan Laraway, Suzanna E Lewis, Jeremy NguyenXuan, Kent Shefchek, Nicole Vasilevsky, Zhou Yuan, Nicole Washington, Harry Hochheiser, Tudor Groza, Damian Smedley, Peter N Robinson, Melissa A Haendel.
Abstract
The correlation of phenotypic outcomes with genetic variation and environmental factors is a core pursuit in biology and biomedicine. Numerous challenges impede our progress: patient phenotypes may not match known diseases, candidate variants may be in genes that have not been characterized, model organisms may not recapitulate human or veterinary diseases, filling evolutionary gaps is difficult, and many resources must be queried to find potentially significant genotype-phenotype associations. Non-human organisms have proven instrumental in revealing biological mechanisms. Advanced informatics tools can identify phenotypically relevant disease models in research and diagnostic contexts. Large-scale integration of model organism and clinical research data can provide a breadth of knowledge not available from individual sources and can provide contextualization of data back to these sources. The Monarch Initiative (monarchinitiative.org) is a collaborative, open science effort that aims to semantically integrate genotype-phenotype data from many species and sources in order to support precision medicine, disease modeling, and mechanistic exploration. Our integrated knowledge graph, analytic tools, and web services enable diverse users to explore relationships between phenotypes and genotypes across species.
Year: 2016 PMID: 27899636 PMCID: PMC5210586 DOI: 10.1093/nar/gkw1128
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Glossary of acronyms
| Acronym | Name |
|---|---|
| Bgee | BgeeDb |
| BioGRID | Biological General Repository for Interaction Datasets |
| CL | Cell Ontology |
| ClinVar | ClinVar |
| CTD | Comparative Toxicogenomics Database |
| ECO | Evidence and Conclusions Ontology |
| ExAC | Exome Aggregation Consortium |
| FlyBase | FlyBase |
| GeneNetwork | Gene Network |
| GENO | Genotype Ontology |
| GO | Gene Ontology |
| GWAS | GWAS Catalog |
| HP | Human Phenotype Ontology |
| KEGG | Kyoto Encyclopedia of Genes and Genomes |
| MGI | Mouse Genome Informatics |
| MonDO | Monarch Merged Disease Ontology |
| MP | Mammalian Phenotype Ontology |
| MPD | Mouse Phenome Database |
| MyGene | MyGene |
| OMIA | Online Mendelian Inheritance in Animals |
| OMIM | Online Mendelian Inheritance in Man |
| Orphanet | Portal for rare diseases and orphan drugs |
| Panther | PantherDB |
| RO | Relation Ontology |
| SEPIO | Scientific Evidence and Provenance Information Ontology |
| SO | Sequence Ontology |
| Uberon | Uber-anatomy Ontology |
| UPheno | Unified Phenotype Ontology |
| WormBase | WormBase |
| ZFIN | Zebrafish Information Network |
Figure 1. The phenotype annotation coverage of human coding genes. Yellow bars show that 51% of those genes have at least one phenotype association reported in humans (HPO annotations of OMIM, ClinVar, Orphanet, CTD and GWAS). The blue bars show that 58% of human coding genes have orthologs with causal phenotypic associations reported in at least one non-human model (MGI, WormBase, FlyBase and ZFIN). The green bars show that 40% of human coding genes have annotations both in human and in non-human orthologs. There are phenotypic associations from humans and/or non-human orthologs that cover 89% of human coding genes.
Figure 2. Monarch Data Architecture. Structured and unstructured data sources are loaded into SciGraph via Dipper. Ontologies are also loaded into SciGraph, resulting in a combined knowledge and data graph. Data are disseminated through SciGraph Services, through an ontology-enhanced Solr instance called GOlr, and to the OwlSim semantic similarity software. Monarch applications and end users access the services for graph querying, application population and phenotype matching.
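The idea of a combined knowledge and data graph can be illustrated with a minimal sketch. It assumes a toy in-memory triple store; it does not use the SciGraph, Dipper, or GOlr APIs, and every identifier, predicate name, and function in it (`GENE:A`, `PHENO:small mandible`, `build_graph`, `genes_with_phenotype`) is hypothetical. Ontology edges and data associations are loaded into one structure, so a query for a general phenotype also returns genes annotated with its more specific subtypes.

```python
from collections import defaultdict

# Hypothetical ontology edges (child, predicate, parent); not real term IDs.
ONTOLOGY_EDGES = [
    ("PHENO:small mandible", "subClassOf", "PHENO:abnormal mandible"),
    ("PHENO:abnormal mandible", "subClassOf", "PHENO:abnormal jaw"),
]

# Hypothetical gene-to-phenotype associations; not real Monarch records.
DATA_EDGES = [
    ("GENE:A", "has_phenotype", "PHENO:small mandible"),
    ("GENE:B", "has_phenotype", "PHENO:abnormal jaw"),
    ("GENE:C", "has_phenotype", "PHENO:tall stature"),
]


def build_graph(*edge_lists):
    """Merge ontology and data edges into one index: (predicate, object) -> subjects."""
    graph = defaultdict(set)
    for edges in edge_lists:
        for subject, predicate, obj in edges:
            graph[(predicate, obj)].add(subject)
    return graph


def descendants(graph, term):
    """Return the term plus every term reachable downward via subClassOf."""
    found = {term}
    stack = [term]
    while stack:
        current = stack.pop()
        for child in graph.get(("subClassOf", current), ()):
            if child not in found:
                found.add(child)
                stack.append(child)
    return found


def genes_with_phenotype(graph, term):
    """Find genes annotated with the term or with any of its subtypes."""
    genes = set()
    for phenotype in descendants(graph, term):
        genes |= graph.get(("has_phenotype", phenotype), set())
    return genes


graph = build_graph(ONTOLOGY_EDGES, DATA_EDGES)
print(sorted(genes_with_phenotype(graph, "PHENO:abnormal jaw")))
```

In this toy graph, the query for the general term picks up `GENE:A` through the subclass chain even though that gene is annotated only with the most specific term. This ontology-aware kind of retrieval is what merging the two edge sets makes possible.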
Figure 3. Data types, sources, and the ontologies used for their integration into the Monarch knowledge graph. Each data source uses or is mapped to a suite of different ontologies or vocabularies. These are in turn integrated into bridging ontologies for Genetics (GENO), Anatomy (Uberon/CL), Phenotypes (UPheno) and Diseases (MonDO).
Figure 4. Distribution of phenotypic annotations across species in Monarch, broken down by the top levels of the phenotype ontology. The graph can be interactively explored at https://monarchinitiative.org/phenotype/. Note that annotations are currently dominated by human, mouse, zebrafish and C. elegans (top panel); the chart is faceted, allowing individual species to be switched on and off to see contributions for less data-rich species such as veterinary animals and monkeys (middle panel). Clicking on a given phenotype text allows drilling down to its subtypes (lower panel).
Figure 5. Annotated Monarch webpage for Marfan and Marfan Related syndrome. This group of syndromic diseases has a number of different associations spanning multiple entity types: disease phenotypes, implicated human genes, variants, and animal models and other model systems. An abstraction of the contents and features of the tabs is shown in the lower panel. Actual contents of the tabs are best viewed in the context of the web app at https://monarchinitiative.org/DOID:14323.
Figure 6. Partial screenshot of PhenoGrid showing Marfan syndrome. PhenoGrid shows the input disease phenotypes as rows and phenotypically matching human diseases and model organism genes as columns; cells are color-coded, with greater saturation indicating a stronger phenotypic match. Mouse-over tooltips highlight diseases associated with a selected phenotype (or vice versa), or details (including similarity scores) of any match between a phenotype and a model. User controls support the selection of alternative sort orders, similarity metrics, and displayed organism(s) (mouse, human, zebrafish or the 10 most similar models for each). Here, we see all diseases or genes that exhibit ‘Hypoplasia of the mandible’, with the matching mouse gene Tgfb2. Actual PhenoGrid data is best viewed in the context of the web app at https://monarchinitiative.org/Orphanet:284993#compare. Note that matches do not need to be exact: here the mouse phenotype ‘small mandible’ (Mammalian Phenotype Ontology) has a high-scoring match to ‘micrognathia’ (Human Phenotype Ontology) because both phenotypes are related to ‘small mandible’ (Mammalian Phenotype Ontology). Advanced PhenoGrid features (not displayed) include the ability to alter the scoring and sorting methods, as well as zoomed-out map-style navigation.
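The inexact matching described in this caption can be sketched with a simple ontology-based similarity score. The text does not specify which metrics OwlSim or PhenoGrid compute, so the sketch substitutes a plain Jaccard index over ancestor sets as a stand-in. The toy hierarchy, the term labels (written as `PREFIX:label` rather than real term IDs), and the functions `ancestors` and `jaccard` are all assumptions made for illustration.

```python
# Toy cross-species phenotype hierarchy. Every key and parent below is a
# made-up label for illustration, not a real HP, MP, or UPheno term ID.
PARENTS = {
    "HP:micrognathia": {"UPHENO:small mandible"},
    "MP:small mandible": {"UPHENO:small mandible"},
    "HP:tall stature": {"UPHENO:abnormal body size"},
    "UPHENO:small mandible": {"UPHENO:abnormal mandible morphology"},
    "UPHENO:abnormal mandible morphology": {"UPHENO:phenotype"},
    "UPHENO:abnormal body size": {"UPHENO:phenotype"},
    "UPHENO:phenotype": set(),
}


def ancestors(term):
    """Return the term together with all of its ancestors in the toy hierarchy."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(PARENTS.get(current, ()))
    return seen


def jaccard(term_a, term_b):
    """Jaccard index of the two ancestor sets: shared ancestors / all ancestors."""
    anc_a, anc_b = ancestors(term_a), ancestors(term_b)
    return len(anc_a & anc_b) / len(anc_a | anc_b)


# The human and mouse terms differ, but they share most of their ancestry.
print(jaccard("HP:micrognathia", "MP:small mandible"))
# An unrelated phenotype shares only the root term.
print(jaccard("HP:tall stature", "MP:small mandible"))
```

Under these assumptions the human and mouse mandible terms score well above the unrelated pair, even though neither term is an exact match for the other. A grid of such scores, one per phenotype and model pair, is the kind of data a display like PhenoGrid could render as cell saturation.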