Michelle Deng, Amin Zollanvari, Gil Alterovitz.
Abstract
The immense corpus of biomedical literature existing today poses challenges for information search and integration. Many links between pieces of knowledge occur, or are significant, only under certain contexts rather than across the entire corpus. This study proposes using networks of ontology concepts, linked by their co-occurrences in annotations of biomedical abstracts and experiment descriptions, to draw conclusions from context-specific queries and to better integrate existing knowledge. In particular, a Bayesian network framework is constructed to link related terms from two biomedical ontologies under the queried context concept. Edges in such a Bayesian network allow associations between biomedical concepts to be quantified, and allow inferences to be made about the existence of some concepts given prior information about others. This approach could potentially be a powerful inferential tool for context-specific queries, applicable to ontologies in other fields as well.
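The context-specific co-occurrence idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy records, the term labels, and the function names `cooccurrence_under_context` and `conditional_estimate` are all hypothetical, and a plain relative-frequency estimate stands in for the Bayesian network parameters, whose exact form the abstract does not give.

```python
from collections import Counter
from itertools import product

# Hypothetical toy annotations: each record is a set of ontology term labels.
records = [
    {"obesity", "polyphagia", "lipid metabolic process"},
    {"obesity", "polyphagia", "feeding behavior"},
    {"obesity", "cholelithiasis", "lipid metabolic process"},
    {"alcoholism", "addiction", "response to ethanol"},
]
# Stand-ins for source (DOID) and destination (GO) terms; assumed for illustration.
source_terms = {"polyphagia", "cholelithiasis", "addiction"}
dest_terms = {"lipid metabolic process", "feeding behavior", "response to ethanol"}


def cooccurrence_under_context(records, context, source_terms, dest_terms):
    """Count source/destination term pairs in records annotated with the context."""
    pair_counts = Counter()
    source_counts = Counter()
    for terms in records:
        if context not in terms:
            continue  # only records mentioning the context contribute
        sources = terms & source_terms
        dests = terms & dest_terms
        source_counts.update(sources)
        pair_counts.update(product(sources, dests))
    return pair_counts, source_counts


def conditional_estimate(pair_counts, source_counts, source, dest):
    """Relative-frequency estimate of P(dest | source, context); 0.0 if source unseen."""
    n = source_counts[source]
    return pair_counts[(source, dest)] / n if n else 0.0


pair_counts, source_counts = cooccurrence_under_context(
    records, "obesity", source_terms, dest_terms
)
```

Restricting the counts to records that mention the context is what makes the resulting associations context-specific: the pair ("addiction", "response to ethanol") never appears under "obesity" even though it exists in the full toy corpus.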
Year: 2012 PMID: 22779044 PMCID: PMC3392061
Source DB: PubMed Journal: AMIA Jt Summits Transl Sci Proc
Sources of records in the knowledge base. Records were compiled in 2009. For all sources except PubMed, every existing record was included in the knowledge base. For efficiency, 100,000 PubMed records were randomly selected from the 16,000,000 existing at the time. A B-tree index was created on the records for searching.
| Source | Number of records |
| Adverse Event Reporting System | 774,606 |
| Array Express | 9,281 |
| BioSiteMaps | 1,013 |
| caNanoLab | 444 |
| Conserved Domain Databases | 34,735 |
| Clinical Trials Database | 75,828 |
| Drug Bank | 4,774 |
| Database of Phenotypes and Genotypes | 75,828 |
| Gene Expression Omnibus | 15,968 |
| Stanford Microarray Database | 16,148 |
| Published articles in PubMed | 100,000 |
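The sampling and indexing step described in the table caption can be sketched as follows. This is an illustration under stated assumptions, not the authors' setup: the source does not name the database used, so SQLite (whose indexes are stored as B-trees) is a stand-in, the table and column names are hypothetical, and the sizes are scaled down from 100,000 of 16,000,000 to keep the example fast.

```python
import random
import sqlite3


def build_sampled_index(total_records, sample_size, seed=0):
    """Sample record IDs without replacement and store them in an indexed SQLite table."""
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    sampled_ids = rng.sample(range(total_records), sample_size)

    conn = sqlite3.connect(":memory:")
    # Hypothetical schema: one row per sampled record.
    conn.execute("CREATE TABLE records (record_id INTEGER, source TEXT)")
    conn.executemany(
        "INSERT INTO records VALUES (?, ?)",
        ((rid, "PubMed") for rid in sampled_ids),
    )
    # SQLite stores this index as a B-tree, giving logarithmic-time lookups.
    conn.execute("CREATE INDEX idx_record_id ON records (record_id)")
    conn.commit()
    return conn, sampled_ids


# Scaled-down stand-in for sampling 100,000 of 16,000,000 PubMed records.
conn, sampled_ids = build_sampled_index(total_records=16_000, sample_size=100)
```

A lookup such as `conn.execute("SELECT source FROM records WHERE record_id = ?", (sampled_ids[0],))` can then use the index instead of scanning the whole table.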
Figure 2: TAN Bayesian network constructed for context “obesity.” Yellow nodes represent GO (destination) terms, light blue nodes represent DOID (source) terms, and the orange node represents the context. Displayed are 4,401 of the 447,374 total edges in the network; the full network contains 240 unique nodes from DOID and 8,218 unique nodes from GO. Magnified views of the circled subnets are provided in Fig. 3.
The ten Disease Ontology (source) terms identified as most strongly linked to the context “alcoholism.”
| Drug abuse | 340.755 | 12.949 | 4.187 × 10⁻⁴ |
| Alcohol-related disorder NOS | 294.652 | 12.460 | 5.718 × 10⁻³ |
| Substance-related disorder | 185.522 | 11.309 | 6.450 × 10⁻³ |
| Disease of environmental origin | 74.763 | 8.707 | 8.962 × 10⁻³ |
| Environmentally induced disease | 74.763 | 8.707 | 8.962 × 10⁻³ |
| Addiction | 68.806 | 10.934 | 6.726 × 10⁻³ |
| Schizophrenia | 24.199 | 5.157 | 1.768 × 10⁻² |
| Alzheimer’s dementia | 16.957 | 3.400 | 3.121 × 10⁻² |
| Organic mental disorder of unknown etiology | 6.543 | 6.341 | 1.345 × 10⁻² |
| Tauopathies | 5.114 | 6.007 | 1.445 × 10⁻² |
The ten Disease Ontology (source) terms identified as most strongly linked to the context “obesity.”
| Obesity, unspecified | 12225.085 | 630.816 | 5.999 × 10⁻⁴ |
| Polyphagia | 11458.037 | 20.524 | 3.102 × 10⁻³ |
| Morbid obesity | 5790.621 | 11.878 | 6.067 × 10⁻³ |
| Alcoholic liver damage, unspecified | 1416.854 | 7.704 | 1.047 × 10⁻² |
| Eating disorder, unspecified | 163.686 | 16.905 | 3.929 × 10⁻³ |
| Cholelithiasis | 123.205 | 6.733 | 1.246 × 10⁻² |
| Choroideremia | 123.205 | 6.733 | 1.246 × 10⁻² |
| Alcohol induced liver disorder | 113.348 | 6.733 | 1.246 × 10⁻² |
| Ovarian dysfunction | 93.180 | 15.432 | 4.392 × 10⁻³ |
| Ovarian non-neoplastic disease | 93.180 | 15.432 | 4.392 × 10⁻³ |