Sirarat Sarntivijai1, Drashtti Vasant1, Simon Jupp2, Gary Saunders1, A Patrícia Bento1, Daniel Gonzalez1, Joanna Betts3, Samiul Hasan3, Gautier Koscielny3, Ian Dunham1, Helen Parkinson2, James Malone1.
Abstract
BACKGROUND: The Centre for Therapeutic Target Validation (CTTV - https://www.targetvalidation.org/) was established to generate therapeutic target evidence from genome-scale experiments and analyses. CTTV aims to support the validity of therapeutic targets by integrating existing and newly generated data. Data integration has been achieved in some resources by mapping metadata such as disease and phenotypes to the Experimental Factor Ontology (EFO). Additionally, the relationship between ontology descriptions of rare and common diseases and their phenotypes can offer insights into shared biological mechanisms and potential drug targets. Ontologies are not ideal for representing the 'sometimes associated' type of relationship required. This work addresses two challenges: annotation of diverse big data, and representation of complex, 'sometimes associated' relationships between concepts.
Keywords: CTTV; EFO; OBAN; Phenotype disease associations; Rare disease
Year: 2016 PMID: 27011785 PMCID: PMC4804633 DOI: 10.1186/s13326-016-0051-7
Source DB: PubMed Journal: J Biomed Semantics
Fig. 1. There were 2214 EFO-native classes in January 2010, and 3992 EFO-native classes in January 2015. Although EFO has grown significantly in its number of native classes, the number of imported classes has grown at a much higher rate. Importing more than 6000 rare disease classes from ORDO in 2012, and axiomatizing them into EFO, resulted in a sudden increase between 2012 and 2013. This reflects the use of EFO as an application ontology providing interoperability across domain ontologies through semantic axiomatization.
Fig. 2. The cell line design pattern in EFO links the EFO class ‘cell line’ to external ontologies via the import mechanism. An EFO cell line derives_from a cell type class from the Cell Ontology, which is part_of an organism (a class imported from NCBI Taxon). The EFO cell line class is also a bearer_of a disease, which is either a class imported from ORDO or a class native to EFO itself.
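The design pattern in Fig. 2 can be read as three subject-predicate-object statements. The following is a minimal sketch of that reading in plain Python, not EFO's actual OWL axioms: the `Triple` type, the `cell_line_pattern` helper, and all entity labels are hypothetical placeholders introduced for illustration. Only the relation names `derives_from`, `part_of`, and `bearer_of` come from the caption.

```python
from typing import List, NamedTuple


class Triple(NamedTuple):
    """A simple subject-predicate-object statement (illustrative only)."""
    subject: str
    predicate: str
    obj: str


def cell_line_pattern(cell_line: str, cell_type: str,
                      organism: str, disease: str) -> List[Triple]:
    """Return the three statements described in the Fig. 2 caption.

    All arguments are plain labels; a real ontology would use class IRIs
    from EFO, the Cell Ontology, NCBI Taxon, and ORDO.
    """
    return [
        Triple(cell_line, "derives_from", cell_type),  # cell type: Cell Ontology
        Triple(cell_type, "part_of", organism),        # organism: NCBI Taxon
        Triple(cell_line, "bearer_of", disease),       # disease: ORDO or EFO
    ]


# Placeholder labels, not real ontology classes.
triples = cell_line_pattern(
    "example cell line", "example cell type",
    "example organism", "example disease",
)
for t in triples:
    print(f"{t.subject} --{t.predicate}--> {t.obj}")
```

Note that `part_of` hangs off the cell type rather than the cell line, so the organism is reached from a cell line only by following `derives_from` first.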
An overview of ontology usage by each CTTV data source. The cross-reference sources of each CTTV data resource are normalized to EFO for the CTTV data validation process
| Database | Cross-reference annotation sources |
|---|---|
| EVA | OMIM, SNOMED-CT, MeSH |
| ArrayExpress | GO, OMIM, EFO |
| UniProt | OMIM, Orphanet, MeSH |
| Reactome | OMIM, GO |
| ChEMBL | MedDRA, ATC, GO |
| GWAS Catalog | EFO, DO |
Summary of the mapping between textual data annotations and EFO or ORDO ontology classes, following the process outlined in the methods section (%)
| Database | % Annotated to EFO or ORDO |
|---|---|
| EVA (inc. ClinVar) | 89 % of annotations of frequency > 100 |
| ArrayExpress | 77 % |
| UniProt | 78 % |
| Reactome | 100 % |
| ChEMBL | 99 % |
| GWAS Catalog | 100 % |
Fig. 3. An OBAN association links an entity such as a disease to another entity such as an associated phenotype, and retains the provenance information (e.g., manual curation, published findings, etc.). Entities marked with * are required; others are added on a per-association basis, for instance the PubMed triple in this figure.
Fig. 4. An example of connecting a phenotype (malabsorption) with a disease (ileocolitis) using OBAN. Provenance here is manual curation by a named surgeon (name omitted here).
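The association in Fig. 4 can be sketched as a small data structure in which the link between the two entities carries its own provenance, rather than being a direct edge between them. This is an illustrative sketch only: the class and field names (`Association`, `Provenance`, `evidence`, `curator`, `pubmed_ids`) are assumptions made here and are not OBAN's actual property names. The assignment of the disease as subject and the phenotype as object is likewise an assumption.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Provenance:
    """Where an association came from (field names are hypothetical)."""
    evidence: str                      # e.g. "manual curation"
    curator: Optional[str] = None      # optional; omitted in the Fig. 4 example
    pubmed_ids: Tuple[str, ...] = ()   # optional, added per association


@dataclass(frozen=True)
class Association:
    """A reified 'sometimes associated' link between two entities."""
    subject: str            # assumed here to be the disease
    obj: str                # assumed here to be the phenotype
    provenance: Provenance  # every association keeps its provenance


def describe(a: Association) -> str:
    """Render an association as a readable sentence."""
    return (f"{a.subject} is sometimes associated with {a.obj} "
            f"(evidence: {a.provenance.evidence})")


# The example from the Fig. 4 caption; the curator name is left unset.
assoc = Association(
    subject="ileocolitis",
    obj="malabsorption",
    provenance=Provenance(evidence="manual curation"),
)
print(describe(assoc))
```

Because the association is an object in its own right, the same disease-phenotype pair can appear several times with different provenance, which a plain ontology axiom between the two classes cannot express.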
Fig. 5. A summary of the rare-to-common associations linking diseases via anatomical system through the has_disease_location axiomatization inside EFO. The high-resolution image is downloadable at https://github.com/CTTV/ISMB2015/blob/master/figures/r2c.pdf and is provided in the supplementary materials.
Fig. 6. Summary of the number of associations and provenances in each group of diseases in CTTV as of 28 September 2015.