Robert Hoehndorf, Paul N Schofield, Georgios V Gkoutos.
Abstract
Phenotypes are investigated in model organisms to understand and reveal the molecular mechanisms underlying disease. Phenotype ontologies were developed to capture and compare phenotypes within the context of a single species. Recently, these ontologies were augmented with formal class definitions that may be utilized to integrate phenotypic data and enable the direct comparison of phenotypes between different species. We have developed a method to transform phenotype ontologies into a formal representation, combine phenotype ontologies with anatomy ontologies, and apply a measure of semantic similarity to construct the PhenomeNET cross-species phenotype network. We demonstrate that PhenomeNET can identify orthologous genes, genes involved in the same pathway and gene-disease associations through the comparison of mutant phenotypes. We provide evidence that the Adam19 and Fgf15 genes in mice are involved in the tetralogy of Fallot, and, using zebrafish phenotypes, propose the hypothesis that the mammalian homologs of Cx36.7 and Nkx2.5 lie in a pathway controlling cardiac morphogenesis and electrical conductivity which, when defective, causes the tetralogy of Fallot phenotype. Our method implements a whole-phenome approach toward disease gene discovery and can be applied to prioritize genes for rare and orphan diseases for which the molecular basis is unknown.
Year: 2011 PMID: 21737429 PMCID: PMC3185433 DOI: 10.1093/nar/gkr538
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
General overview of the method
| Step | Method | Materials used | Example |
|---|---|---|---|
| Formalization of cross-species anatomy | Assign equivalent class statements based on UBERON cross-species mappings; integrate all ontologies and equivalence statements in single ontology. | Gene Ontology, Mouse Anatomy, Worm Anatomy and Development, Zebrafish Anatomy and Development, Fly Anatomy, Foundational Model of Anatomy, UBERON | Both the classes ‘Tail’ from Mouse Anatomy and ‘Caudal fin’ from Zebrafish Anatomy are declared equivalent to the class ‘Tail’ in UBERON. |
| Consistency verification and removal of contradictions | Remove disjointness statements from UBERON. | UBERON ontology, processing of OBO Flatfile Format | The disjointness between ‘Material anatomical entity’ ( |
| Formalization of cross-species phenotypes | Convert phenotype ontologies' definitions to enable interoperability with anatomy (using has-part and part-of relations); combine all phenotype ontologies, their class definitions and the cross-species anatomy ontology in a single ontology. | Yeast Phenotype, FlyBase Controlled Vocabulary, Worm Phenotype, Mammalian Phenotype, Human Phenotype; related ontologies: PATO, ChEBI, Gene Ontology, Mouse Pathology, Celltype, Protein Ontology | Define the mouse phenotype ‘Matted coat’ as shown in |
| Represent phenotype annotations | Each set of phenotype annotations, whether from a model organism database or for a disease, is represented as a class C; the class C is then asserted as equivalent to the intersection of the annotated phenotypes. | HPO-based phenotype annotations of OMIM; phenotype annotations in model organism databases | The class ‘Alport syndrome’ is defined as equivalent to the intersection of the disease's phenotypic characteristics: ‘Renal failure’, ‘Nephritis’, ‘Hearing loss’ and ‘Hematuria’. |
| Inference of cross-species phenotype representation | Using automated reasoning, each class that represents a genotype annotation or a disease is examined and all its super-classes in each species-specific phenotype ontology are inferred. The result is a representation of the annotated phenotypes based on five species-specific phenotype ontologies. | OWL reasoners (CB and CEL) | The worm phenotype ‘Abnormal apoptosis’ is inferred as a super-class of the human phenotype ‘Defective lymphocyte apoptosis’. |
| Application of semantic similarity | A semantic similarity measure is applied to compensate for missing information and noisy data. | Jaccard metric weighted by information content; implemented parallel algorithm for computation | The phenotype of an allele of the |
| Quantitative evaluation | KEGG and known disease models provide gene–gene and gene–disease associations, which are compared with the similarities in the network. Orthologous genes and genes in the same pathway are phenotypically similar; gene–disease pairs of known gene–disease associations are similar. | KEGG, OMIM and Morbidmap, mouse model annotations in MGI | The area under the receiver operating characteristic curve (a plot of the true positive rate as a function of the false positive rate) for pathways is 0.59, for orthology 0.62 and for disease 0.68. |
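The inference and similarity steps in the table can be illustrated with a minimal sketch. It is not the authors' implementation: the toy hierarchy, the annotation sets, and the function names (`closure`, `information_content`, `weighted_jaccard`) are assumptions made for illustration, and the superclass closure here is a plain graph traversal standing in for the OWL reasoners (CB and CEL) named above. The similarity is one common reading of "Jaccard metric weighted by information content": the summed information content of shared classes divided by that of all classes in either set.

```python
import math

# Hypothetical toy hierarchy (class -> direct super-classes); in the method
# described above these links would come from automated reasoning.
PARENTS = {
    "Phenotype": set(),
    "Abnormal apoptosis": {"Phenotype"},
    "Defective lymphocyte apoptosis": {"Abnormal apoptosis"},
    "Hearing loss": {"Phenotype"},
    "Hematuria": {"Phenotype"},
}

# Hypothetical annotation sets for two genotypes and one disease.
ANNOTATIONS = {
    "genotype_1": {"Defective lymphocyte apoptosis"},
    "genotype_2": {"Abnormal apoptosis"},
    "disease_1": {"Hearing loss", "Hematuria"},
}


def closure(terms, parents):
    """Return the given classes together with all of their super-classes."""
    seen = set()
    stack = list(terms)
    while stack:
        term = stack.pop()
        if term not in seen:
            seen.add(term)
            stack.extend(parents.get(term, ()))
    return seen


def information_content(annotations, parents):
    """IC(t) = -log p(t), with p(t) the fraction of annotated entities
    whose closed annotation set contains t."""
    total = len(annotations)
    counts = {}
    for terms in annotations.values():
        for term in closure(terms, parents):
            counts[term] = counts.get(term, 0) + 1
    return {term: -math.log(count / total) for term, count in counts.items()}


def weighted_jaccard(a, b, ic, parents):
    """Summed IC of shared classes divided by summed IC of all classes."""
    closed_a, closed_b = closure(a, parents), closure(b, parents)
    union = sum(ic.get(t, 0.0) for t in closed_a | closed_b)
    if union == 0.0:
        return 0.0
    shared = sum(ic.get(t, 0.0) for t in closed_a & closed_b)
    return shared / union


IC = information_content(ANNOTATIONS, PARENTS)
SIM_RELATED = weighted_jaccard(
    ANNOTATIONS["genotype_1"], ANNOTATIONS["genotype_2"], IC, PARENTS
)
SIM_UNRELATED = weighted_jaccard(
    ANNOTATIONS["genotype_1"], ANNOTATIONS["disease_1"], IC, PARENTS
)
```

In this toy data the two apoptosis genotypes share the informative class ‘Abnormal apoptosis’ and receive a similarity above zero, whereas the genotype and the disease share only the root class, which carries no information, so their similarity is zero.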
Figure 1. Overview of ontology-based data analysis. First, the ontologies have to be formalized before their consistency can be verified. If contradictory axioms are identified, they must be removed. Using the ontology, biological data are represented within the same model so that biological questions across the data can be asked in flexible ways. If necessary, statistical approaches are applied to complete missing information, and results can then be inferred over the combined representation.
Figure 2. Illustration of assertions and inferences about the class Matted coat. Blue-colored shapes represent qualities, gray-colored shapes represent anatomical entities and green-colored shapes represent phenotypes. Dashed lines represent inferred associations.
Figure 3. ROC curves for predicting disease, participation in a common pathway and orthology using PhenomeNET. The ROC curves for pathway and orthology predictions are obtained by comparison with KEGG, while the gene-disease predictions are derived from OMIM and the annotated disease models in the MGI. AUC for pathways is 0.59, for orthology 0.62 and for disease 0.68.
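For readers unfamiliar with the AUC values reported here, the following sketch shows one standard way to compute the area under a ROC curve from similarity scores and known associations. The function name `roc_auc` and the example scores are hypothetical and are not taken from the study. The sketch uses the rank-based (Mann-Whitney) formulation: the AUC equals the probability that a randomly chosen true association scores higher than a randomly chosen non-association, with ties counted as one half.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    scores: similarity values, one per candidate pair.
    labels: truthy for known associations, falsy otherwise.
    """
    positives = [s for s, y in zip(scores, labels) if y]
    negatives = [s for s, y in zip(scores, labels) if not y]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(positives) * len(negatives))


# Hypothetical similarity scores for four candidate gene-disease pairs,
# two of which are known associations.
EXAMPLE_AUC = roc_auc([0.9, 0.8, 0.4, 0.3], [1, 0, 1, 0])
```

An AUC of 0.5 corresponds to a ranking no better than chance and 1.0 to perfect separation. The pairwise loop is quadratic in the number of examples; for data at the scale of a whole phenotype network, a sort-based computation would be the more practical choice.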