Anika Oellrich, Nigel Collier, Tudor Groza, Dietrich Rebholz-Schuhmann, Nigam Shah, Olivier Bodenreider, Mary Regina Boland, Ivo Georgiev, Hongfang Liu, Kevin Livingston, Augustin Luna, Ann-Marie Mallon, Prashanti Manda, Peter N Robinson, Gabriella Rustici, Michelle Simon, Liqin Wang, Rainer Winnenburg, Michel Dumontier.
Abstract
Phenotypes have gained increased prominence in the clinical and biological domains owing to their application in numerous areas such as the discovery of disease genes and drug targets, phylogenetics and pharmacogenomics. Phenotypes, defined as observable characteristics of organisms, can be seen as one of the bridges that lead to a translation of experimental findings into clinical applications and thereby support 'bench to bedside' efforts. However, to build this translational bridge, a common and universal understanding of phenotypes is required that goes beyond domain-specific definitions. To achieve this ambitious goal, a digital revolution is ongoing that enables the encoding of data in computer-readable formats and their storage in specialized repositories, ready for integration, enabling translational research. While phenome research is an ongoing endeavor, the true potential hidden in the currently available data still needs to be unlocked, offering exciting opportunities for the forthcoming years. Here, we provide insights into the state of the art in digital phenotyping, by means of representing, acquiring and analyzing phenotype data. In addition, we provide visions of this field for future research work that could enable better applications of phenotype data.
Keywords: acquisition; interoperability; knowledge discovery; phenomics; phenotypes; semantic representation
Year: 2015 PMID: 26420780 PMCID: PMC5036847 DOI: 10.1093/bib/bbv083
Source DB: PubMed Journal: Brief Bioinform ISSN: 1467-5463 Impact factor: 11.622
Figure 1. The four dimensions of the phenotype development phases. Representation: Subject to the underlying domain and goal, phenotypes may be represented at different granularity levels. Interoperability: Existing ontologies and vocabularies externalize domain-specific phenotype knowledge at different levels of granularity. Acquisition: Capturing and documenting phenotypes in any representational format can be achieved manually (via curation) or automatically (via text mining). Processing: Representing and capturing phenotypes in a structured manner (a form that also enables interoperability) has led to their application in a large variety of domains. The arrows denote direct points of connection between the phenotyping dimensions. Note that this figure only illustrates the interplay of the four dimensions and is therefore not meant to be comprehensive (e.g. interoperability could also be achieved with a mapping instead of EQ statements).
Summary of all resources mentioned throughout the manuscript, together with their URLs and references (where applicable)
| Resource | Link | Reference |
|---|---|---|
| Online Mendelian Inheritance in Man database | | |
| Mouse Genome Database | | |
| FlyBase | | |
| Zebrafish Model Organism Database | | |
| Mammalian Phenotype Ontology | | |
| Human Phenotype Ontology | | |
| London Dysmorphology Database | | |
| Gene Ontology | | |
| Phenotypic quality and Trait Ontology | | |
| OrphaNet | | |
| PharmGKB | | |
| Zebrafish Anatomical Ontology | | |
| International Mouse Phenotyping Consortium | | |
| IMPReSS | | |
| Phenote | | |
| PhenoTips | | |
| MetaMap | | |
| NCBO Annotator | | |
| cTakes | | |
| ShARE/CLEF 2013 | | |
| DeepPhe | | |
| Bio-LarK | | |
| PhenoMiner | | |
| Unified Medical Language System | | |
| Unified Medical Language System Metathesaurus tool | | |
| UberPheno | | |
| SNOMED CT | | |
| AgreementMaker | | |
| Zooma | | |
| SIDER | | |
| AVAToL | | |
| PhenoScape | | |
| ORCID | | |
| ResearcherID | | |
Figure 2. To date, phenotypes have mostly been captured and defined using a pre-composed and/or a post-composed representation. A pre-composed representation defines a phenotype as a monolithic concept, one that captures the essence of the phenotype semantics. The post-composed representation decomposes the phenotype into an Entity–Quality pair, with its individual components being mapped to appropriate ontological concepts. In this case, the phenotype semantics is denoted by the compositional property of the pair. The transition between pre-composed and post-composed is realized via logical axioms. Both forms of representation have been successfully applied across different species.
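The distinction between the two representations can be illustrated with a minimal Python sketch. It is not taken from the manuscript: the class name `EQStatement`, the lookup table and all identifiers (the `EX_...` strings) are hypothetical placeholders rather than real ontology IDs, and a plain dictionary stands in for the logical axioms that connect the two forms.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class EQStatement:
    """Post-composed phenotype: an Entity-Quality pair."""
    entity: str   # placeholder ID for the affected entity
    quality: str  # placeholder ID for the quality


# Hypothetical table standing in for the logical axioms that link a
# pre-composed concept to its Entity-Quality decomposition.
# Every identifier here is a made-up placeholder, not a real ontology ID.
PRECOMPOSED_TO_EQ = {
    "EX_PHENO:001": EQStatement(entity="EX_ANAT:heart",
                                quality="EX_QUAL:increased_size"),
}

# Reverse index so that a pair can be mapped back to its monolithic concept.
EQ_TO_PRECOMPOSED = {eq: pre for pre, eq in PRECOMPOSED_TO_EQ.items()}


def decompose(precomposed_id: str) -> EQStatement:
    """Pre-composed -> post-composed (raises KeyError if unknown)."""
    return PRECOMPOSED_TO_EQ[precomposed_id]


def compose(eq: EQStatement) -> Optional[str]:
    """Post-composed -> pre-composed, or None if no concept is defined."""
    return EQ_TO_PRECOMPOSED.get(eq)
```

In this sketch a post-composed pair without a matching pre-composed concept simply maps to `None`, which reflects that an Entity–Quality pair can describe a phenotype for which no monolithic concept has been defined.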
Figure 3. The increasing amount of data made available over the past years has rendered manual phenotype curation impractical. While automating the process is in principle the only viable solution, it poses its own technical challenges. These include, among others: (i) boundary detection, i.e. identifying the exact span of text that represents a phenotype candidate; (ii) disambiguation and alignment, subject to the desired level of granularity and the underlying knowledge source; and (iii) interpretation, which covers lack of context, hedging or negation.
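Two of these challenges, boundary detection and negation, can be made concrete with a deliberately naive Python sketch. It is an illustration only, not how any of the tools listed above works: the lexicon, the placeholder concept IDs (`EX:0001`, `EX:0002`), the negation cue list and the three-token window are all assumptions chosen for brevity.

```python
import re

# Hypothetical mini-lexicon: surface form -> placeholder concept ID.
LEXICON = {
    "short stature": "EX:0001",
    "seizures": "EX:0002",
}

# Assumed negation cues; real systems use far richer rule sets.
NEGATION_CUES = ("no", "without", "denies")


def find_phenotypes(text):
    """Return sorted (start, end, concept_id, negated) tuples for lexicon hits."""
    results = []
    lowered = text.lower()
    for term, concept in LEXICON.items():
        pattern = r"\b" + re.escape(term) + r"\b"
        for match in re.finditer(pattern, lowered):
            # Boundary detection: the match span is the candidate mention.
            preceding = lowered[:match.start()]
            # Restrict the negation check to the current sentence.
            sentence = re.split(r"[.;!?]", preceding)[-1]
            # Naive negation: a cue among the three tokens before the mention.
            window = re.findall(r"[a-z]+", sentence)[-3:]
            negated = any(cue in window for cue in NEGATION_CUES)
            results.append((match.start(), match.end(), concept, negated))
    return sorted(results)
```

For the input `"Patient has short stature. No history of seizures."`, the sketch marks the first mention as affirmed and the second as negated. Exact dictionary matching of this kind misses paraphrases, coordinated mentions and hedged statements, which is where the disambiguation and interpretation challenges arise.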