Philip R O Payne, Alan Kwok, Rakesh Dhaval, Tara B Borlawsky.
Abstract
The conduct of large-scale translational studies presents significant challenges related to the storage, management and analysis of integrative data sets. Ideally, the application of methodologies such as conceptual knowledge discovery in databases (CKDD) provides a means for moving beyond intuitive hypothesis discovery and testing in such data sets, and towards the high-throughput generation and evaluation of knowledge-anchored relationships between complex bio-molecular and phenotypic variables. However, the induction of such high-throughput hypotheses is non-trivial, and requires correspondingly high-throughput validation methodologies. In this manuscript, we describe an evaluation of the efficacy of a natural language processing-based approach to validating such hypotheses. As part of this evaluation, we will examine a phenomenon that we have labeled as "Conceptual Dissonance" in which conceptual knowledge derived from two or more sources of comparable scope and granularity cannot be readily integrated or compared using conventional methods and automated tools.
Year: 2009 PMID: 21347178 PMCID: PMC3041552
Source DB: PubMed Journal: Summit Transl Bioinform ISSN: 2153-6430
Examples of literature-derived CKCs.
| Chromosomes, Human, Pair 8 | LOCATION_OF | IGH@ gene cluster |
| IGH@ gene cluster | ASSOCIATED_WITH | Disease Progression |
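The two rows above read as subject, predicate, object triples, and the object of the first is the subject of the second. A minimal sketch of that structure, assuming a plain in-memory representation (the `Triple` type and `chain` helper are hypothetical names used for illustration, not part of TOKEn or SemRep), might look like this:

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """Hypothetical container for one subject-predicate-object row."""
    subject: str
    predicate: str
    obj: str


# The two literature-derived rows shown in the table above.
ckcs = [
    Triple("Chromosomes, Human, Pair 8", "LOCATION_OF", "IGH@ gene cluster"),
    Triple("IGH@ gene cluster", "ASSOCIATED_WITH", "Disease Progression"),
]


def chain(triples):
    """Pair up triples where the object of one is the subject of another."""
    by_subject = {}
    for t in triples:
        by_subject.setdefault(t.subject, []).append(t)
    return [(a, b) for a in triples for b in by_subject.get(a.obj, [])]


links = chain(ckcs)
for first, second in links:
    print(f"{first.subject} -[{first.predicate}]-> {first.obj} "
          f"-[{second.predicate}]-> {second.obj}")
```

Joining on a shared concept in this way is only one simple way to read such rows as a path from a chromosomal location to a phenotypic outcome; the record does not describe how the authors link them.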
Figure 2: Incidence of concepts included in TOKEn and SemRep generated CKCs at increasing granularity levels (e.g., depths from UMLS root).
Example of a TOKEn-based triplet.
| Gain of Chromosome 6 - [ | Gain of Chromosome 6 - [ |
| stage I childhood liver cancer - [ |