Robert Hoehndorf, Michel Dumontier, Georgios V Gkoutos.
Abstract
Ontologies are now pervasive in biomedicine, where they serve as a means to standardize terminology, to enable access to domain knowledge, to verify data consistency and to facilitate integrative analyses over heterogeneous biomedical data. For this purpose, research on biomedical ontologies applies theories and methods from diverse disciplines such as information management, knowledge representation, cognitive science, linguistics and philosophy. Depending on the application in which an ontology is used, the evaluation of research in biomedical ontologies must follow different strategies. Here, we provide a classification of research problems in which ontologies are applied, focusing on the use of ontologies in basic and translational research, and we demonstrate how research results in biomedical ontologies can be evaluated. The evaluation strategies depend on the desired application and measure the success of using an ontology for a particular biomedical problem. For many applications, the success can be quantified, thereby facilitating the objective evaluation and comparison of research in biomedical ontology. The objective, quantifiable comparison of research results based on scientific applications opens up the possibility for systematically improving the utility of ontologies in biomedical research.
Keywords: biomedical ontology; evaluation criteria; ontology evaluation; ontology-based applications; quantitative biology
Year: 2012 PMID: 22962340 PMCID: PMC3888109 DOI: 10.1093/bib/bbs053
Source DB: PubMed Journal: Brief Bioinform ISSN: 1467-5463 Impact factor: 11.622
Formal approaches to ontology research and their potential impact on biomedical applications and analyses
| Task | Description | Potential impact on biomedical applications and analyses | Example |
|---|---|---|---|
| Philosophical foundations | A theory from philosophy is applied either to biomedical ontologies or to biomedical domains. It is then demonstrated that the philosophical theory can explain the distinctions within the domain. Furthermore, a philosophical foundation can provide insights into the principles by which scientists within a domain distinguish different kinds of entities, and can provide a methodology for classifying domain entities. Formalizing aspects of the philosophical principles can enable verification of a domain theory with regard to these principles. | Increased coherence in representing knowledge and data, comprehensibility and interoperability | A study demonstrated that a particular perspective on philosophical realism can be used to describe chemical structures, even when structures of that type are known to be incapable of existing [ |
| Provision of unambiguous, formal documentation | The use of formal languages can remove ambiguity from specifications (of a domain, the meaning of a term, etc.). Based on formal logics, consequences of a specification can then be determined by a mathematical proof, thereby avoiding potential misunderstandings based on natural language. | Increased coherence, increased clarity | The RNA ontology (RNAO) [ |
| Provision of machine-readable documentation | Some aspects of the meaning of terms are formalized using a knowledge representation language so that automated systems can gain access to the meaning and process it. | Automated data processing, automated knowledge- and data-integration, semantic integration | The GO [ |
| Consistency verification | Statements that are considered true in the domain (axioms) and term definitions are formalized, and an automated reasoner is used to verify the consistency (i.e. the absence of contradictions) of the stated knowledge. Furthermore, the ‘satisfiability’ of a class can be automatically verified (a class is satisfiable if it is possible for the class to have instances). Once a model of a domain is consistently formalized, it can be applied to verify data in this domain. For this purpose, an automated reasoner verifies whether data items satisfy the constraints expressed in the model of the domain. Often, expressive automated reasoners such as OWL 2 reasoners are used to perform consistency verification. | Increased coherence, detection of modelling errors, detection of competing scientific theories, data coherence | Inconsistencies when combining anatomy and phenotype ontologies were detected [ |
| Data classification | An ontology of a domain is applied to classify data in that domain. In this task, an automated reasoner uses the constraints in the domain ontology to automatically assign data items to ontology categories. | Classification, data analysis | A study [ |
| Supporting ontology development | Formal representations and automated reasoning can support ontology development by inferring information that is not explicitly stated. Possibly undesired consequences can be examined either manually or automatically and the statements leading to the undesired consequence can be corrected. Particularly useful is the automated construction of taxonomies based on axioms in an ontology. Complex statements and definitions are automatically transformed into a generalization hierarchy (a taxonomy) by the automated reasoner. | Decreased maintenance, detection of errors | The GULO software [ |
| Support querying | Based on the axioms about a domain, automated reasoners can infer a potentially infinite number of statements that are true if the axioms are true. Therefore, formal logics are ideally suited to encode knowledge about a domain so that it can support a wide variety of queries. Automated reasoners are capable of automatically determining the answers to the queries using the statements in the formalized theory. This is one of the most widely used applications of automated reasoning in ontologies. To efficiently support querying in applications that require quick response times, highly optimized reasoners and low expressivity of the knowledge representation language are beneficial [ | Support knowledge extraction, connect databases and domains | A web-based query tool used the ELK reasoner [ |
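The automated taxonomy construction described in the ‘Supporting ontology development’ row can be illustrated with a minimal sketch: given asserted subclass axioms, a reasoner derives the full subsumption hierarchy for each class. The sketch below (not from the article; class names are invented for illustration) uses a plain transitive closure, whereas real reasoners such as ELK handle far more expressive axioms.

```python
# Minimal sketch: inferring all superclasses of a class from asserted
# direct-subclass axioms, loosely analogous to the taxonomy a reasoner builds.

def inferred_superclasses(axioms, cls):
    """Return all direct and inferred superclasses of cls.

    axioms: dict mapping a class name to its asserted direct superclasses.
    """
    seen = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in axioms.get(current, []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen

# Hypothetical toy ontology (names chosen only to resemble GO-style terms).
axioms = {
    "apoptosis": ["programmed cell death"],
    "programmed cell death": ["cell death"],
    "cell death": ["biological process"],
}

print(sorted(inferred_superclasses(axioms, "apoptosis")))
# ['biological process', 'cell death', 'programmed cell death']
```

From the three asserted axioms, the closure places ‘apoptosis’ under all three ancestors, which is exactly the generalization hierarchy a reasoner would materialize.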
Opportunities for the quantitative evaluation of research results in applied ontology
| Application | Possible evaluation methods | Description | Quantifiable result | Example |
|---|---|---|---|---|
| Establish ‘community agreement’ about meanings of terms in a vocabulary. In a domain in which terms can have different meanings based on the background of a researcher, an ontology is developed to provide a reference for ‘particular’ meanings of terms. | User-study | Multiple people perform a task, such as determining the occurrence of terms from an ontology in a manuscript, independently. The goal is to achieve a high agreement between annotators about the ontology terms that have occurred in the manuscript. | Percentage agreement, κ statistics | A study was performed to evaluate the agreement between expert curators of the GO and found ‘that there is 39% chance of curators exactly interpreting the text and selecting the same GO term, a 43% chance that they will extract a term from new/different lineage and a 19% chance that they will annotate a term from the same GO lineage’ [ |
| ‘Annotate data consistently’ across multiple databases, user communities or domains. | User-study | Multiple people annotate the same data set using an ontology. The goal is to achieve a high agreement in the resulting annotations. | Percentage agreement, κ statistics | A study was performed to evaluate GO annotation consistency between human and mouse. The authors find that, out of a set of 3359 annotations, 2137 are matches and 1222 are mismatches (and potential annotation inconsistencies) [ |
| ‘Integrate multiple databases’ and provide a uniform view across them. | User study, integrated analysis | An evaluation can perform an analysis of an integrated data set, or compare the integration results to a gold standard. Integrated analysis results can either be compared to a reference or tested based on a scientific use case. | Integrated data analysis results, precision, recall, F-measure | The phenotype data contained in multiple model organism databases were integrated and utilized for the task of prioritization of candidate genes for a disease. The results were compared against gene–disease associations in the OMIM database (gold standard) and quantitatively evaluated using ROC analysis [ |
| ‘Answer queries over data’ using the ontology as the conceptual model of a database (i.e. the classes and relations in the ontology are used to structure the database and function as a vocabulary based on which queries can be built). | Test suite, comparison to gold standard | An evaluation can be based on a test suite (in which particular queries and the desired results are specified) or a gold standard, and use the ontology to perform test queries over the database and determine if the results conform to the desired outcome or compare the results to the gold standard. Additionally, a performance analysis can be used to determine the time and space required to implement the queries. | Number of tests passed, precision, recall, F-measure; complexity class, performance measurements | A study implemented an RDF-based query system over biomedical ontologies together with several relation axioms, demonstrating several queries that could not be answered before. The evaluation found that ‘the answers to such a query are complete and they correspond to the logical meaning of the relation types as intended by the ontology engineers’ [ |
| ‘Answer questions’ over the knowledge contained in the ontology. | Test suite, content evaluation | Evaluation can take several directions. A test suite of questions can be designed, the ontology used to answer these questions, and the results compared to the expected outcome. Furthermore, the content of the ontology can be evaluated similarly to database content evaluation [ | Number of tests passed, domain coverage (percentage), currency (number of times updated), expert evaluation | A study evaluated whether questions about existential restrictions in biomedical ontologies are correct as judged by experts in the field. The results show that, ‘[a]ccording to a rating done by four experts, 23% of all existential restrictions in OBO Foundry candidate ontologies are suspicious (Cohens’ κ = 0.78)’ [ |
| Determine ‘consistency of data’ with respect to constraints in the ontology. | Test suite, performance measurement | A test suite of different types of data inconsistencies can be designed, and a performance evaluation used to measure the time and space complexity for identifying inconsistencies. | Number of tests passed (contradictions found); complexity class | A top-level ontology of computation models in systems biology (consisting of fewer than 10 classes and fewer than 10 relations) was formalized in OWL and the models in the BioModels database [ |
| Determine the ‘consistency and accuracy of the conceptual model’. | Automated reasoning, test suite | Automated reasoning can be used to determine model consistency, and a test suite can be used to test accuracy of consequences following from the model. | Number of tests passed, number of inconsistencies found | A project to formalize the definitions of GO terms [ |
| Enable ‘novel scientific analyses’, such as Gene Set Enrichment Analysis (GSEA) or semantic similarity, that rely on the type and the number of distinctions made in an ontology to analyse a data set. | Case-specific scientific validation | Evaluation must be based on the specific scientific problem and the standards established for the particular scientific discipline. An example of an evaluation would be to perform a validating experiment. | Various quantifiable results, including p-value, F-measure, ROC AUC | GSEA was proposed as a novel method that utilizes the annotations and the structure of GO to interpret gene expression data [ |
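The first two rows of the table above rely on two simple statistics: percentage agreement is the fraction of items on which two annotators choose the same ontology term, and Cohen's κ corrects that agreement for the agreement expected by chance. A minimal sketch in Python (the curator annotations below are invented for illustration):

```python
from collections import Counter

def percentage_agreement(rater_a, rater_b):
    """Fraction of items on which two annotators chose the same label."""
    return sum(a == b for a, b in zip(rater_a, rater_b)) / len(rater_a)

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(rater_a)
    observed = percentage_agreement(rater_a, rater_b)
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: probability that both raters independently pick
    # the same label, estimated from each rater's label frequencies.
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical GO-term annotations of four text passages by two curators.
curator_1 = ["GO:0008150", "GO:0008150", "GO:0003674", "GO:0003674"]
curator_2 = ["GO:0008150", "GO:0003674", "GO:0003674", "GO:0003674"]

print(percentage_agreement(curator_1, curator_2))  # 0.75
print(cohens_kappa(curator_1, curator_2))          # 0.5
```

Note that κ (0.5) is lower than raw agreement (0.75) because some of the observed matches are expected by chance alone; this is why the annotation-consistency studies cited above report κ alongside percentage agreement.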
Figure 1: A direct evaluation of an ontology can assess intrinsic properties of the ontology such as consistency, expressivity, or the inclusion of natural language definitions and labels. Furthermore, the evaluating person can examine definitions and axioms of the ontology and either agree or disagree with their content.
Figure 2: An application-based evaluation does not directly assess an ontology, but rather evaluates an application that utilizes an ontology for its operations.
Figure 3: An analysis-based evaluation performs a scientific data analysis that relies on an ontology and evaluates the success of the analysis using criteria established in the scientific domain.
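Several rows of the evaluation table list precision, recall and F-measure as quantifiable results. For set-valued outputs, such as predicted gene–disease associations compared against a gold standard like OMIM, these metrics reduce to simple set operations. A minimal sketch (the gene identifiers are invented for illustration):

```python
def precision_recall_f1(predicted, gold):
    """Precision, recall and F-measure of a predicted set vs. a gold standard."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    # F-measure: harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Hypothetical candidate-gene predictions vs. a gold standard of known
# gene-disease associations.
predicted = {"GENE1", "GENE2", "GENE3", "GENE4"}
gold = {"GENE1", "GENE2", "GENE5"}

p, r, f = precision_recall_f1(predicted, gold)
print(p, r, f)  # precision 0.5, recall 2/3, F-measure 4/7
```

Half of the predictions are correct (precision 0.5) and two of the three gold-standard associations are recovered (recall 2/3); the F-measure balances the two. ROC analysis, also cited in the table, additionally requires a ranking or score for each prediction rather than a flat set.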