Luis Tari, Nguyen Vo, Shanshan Liang, Jagruti Patel, Chitta Baral, James Cai.
Abstract
BACKGROUND: With the large amount of pharmacological and biological knowledge available in the literature, finding novel indications for existing drugs using in silico approaches has become increasingly feasible. Typical literature-based approaches generate new hypotheses in the form of protein-protein interaction networks by linking concepts based on their co-occurrences within abstracts. However, such approaches tend to generate too many hypotheses, and identifying new drug indications from large networks can be time-consuming.
Year: 2012 PMID: 22911721 PMCID: PMC3402456 DOI: 10.1371/journal.pone.0040946
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Different types of knowledge used in our approach and their sources.
| Types of knowledge | Sources |
| Drug-target interactions | DrugBank |
| Oncogenes and tumor suppressors | UniProt, Entrez Gene, CancerQuest |
| Genes involved in cancer-related biological processes | Gene Ontology |
| Gene-disease relations | Medline abstracts by text mining |
| Protein-protein interactions | Medline abstracts by text mining |
Examples of extracted gene-disease relationships and protein-protein interactions with their supporting evidence.
| Evidence | Extracted relationships |
| The results of our study demonstrate that | <over-expressed AMACR, associated with, gastric cancer> |
| Therefore, | <under-expressed RB1, associated with, small cell carcinoma> |
| Moreover, | <EGF, induces, ERBB2> |
| | <TNF, inhibits, PPARG> |
Logic forms for the classes and entities involved in the drug mechanism domain.
| Facts | Logic forms | Examples |
| | protein(Prot) | protein(tp53) |
| | oncogene(Prot) | oncogene(egfr) |
| | suppressor(Prot) | suppressor(tp53) |
| | drug(Dr) | drug(moclobemide) |
| | disease(Dise) | disease(depression) |
| | cancer_promoting_bioprocess(Bp) | cancer_promoting_bioprocess(pos_reg_cell_proliferation) |
| | cancer_resisting_bioprocess(Bp) | cancer_resisting_bioprocess(pos_reg_apoptosis) |
Logic forms for the interactions involved in the drug mechanism domain.
| Relations | Logic forms |
| Drug | interaction(Dr, induces, Prot) |
| Drug | interaction(Dr, inhibits, Prot) |
| Protein | interaction(Prot1, induces, Prot2) |
| Protein | interaction(Prot1, inhibits, Prot2) |
| Overexpressed protein | relation(overexpressed(Prot), associated_with, Dise) |
| Underexpressed protein | relation(underexpressed(Prot), associated_with, Dise) |
| Protein | relation(Prot, is_associated, Bp) |
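The logic forms in the two tables above can be mirrored as plain data for quick experimentation. The following is a minimal sketch, not the authors' reasoning system: the class names `Interaction` and `Relation`, the helper `regulators_of`, and the lowercase, underscore-joined identifiers are all assumptions made for illustration. The facts themselves come from the example tables in this record.

```python
from typing import NamedTuple


class Interaction(NamedTuple):
    """Mirrors interaction(Source, Effect, Prot); Source is a drug or a protein."""
    source: str
    effect: str  # "induces" or "inhibits"
    target: str


class Relation(NamedTuple):
    """Mirrors relation(Subject, Predicate, Object)."""
    subject: str
    predicate: str
    obj: str


# Unary facts, taken from the Examples column of the class/entity table.
oncogenes = {"egfr"}
suppressors = {"tp53"}
drugs = {"moclobemide"}
diseases = {"depression"}

# Binary facts, taken from the extracted-relationship examples.
# The normalized identifiers (e.g. "gastric_cancer") are an assumption.
interactions = [
    Interaction("egf", "induces", "erbb2"),
    Interaction("tnf", "inhibits", "pparg"),
]
relations = [
    Relation("overexpressed(amacr)", "associated_with", "gastric_cancer"),
    Relation("underexpressed(rb1)", "associated_with", "small_cell_carcinoma"),
]


def regulators_of(target: str, effect: str) -> list[str]:
    """Return every source that has the given effect on the target protein."""
    return sorted(
        i.source for i in interactions
        if i.target == target and i.effect == effect
    )


print(regulators_of("erbb2", "induces"))  # ['egf']
```

Keeping the effect as a separate field, as the logic forms do, lets one query cover both `induces` and `inhibits` facts.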
Figure 1. A diagrammatic view of (a) direct and (b) indirect inferences for dipyridamole and tazarotene as novel cancer indications.
Evaluation of the inferences using a list of 943 drugs based on original indication and clinical trials.
| | Cancer genes | GO | Text mining | All |
| Cancer as original indication (81) | 25 | 43 | 58 | 67 (82.7%) |
| Non-cancer drugs under clinical trials for cancer (289) | 46 | 95 | 133 | 144 (49.8%) |
| Total inferences | 171 | 335 | 455 | 507 |
| % inferences confirmed to be cancer-related | 41.5% | 41.2% | 42.0% | 41.6% |
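The last row of the table appears to be the sum of the two confirmed categories divided by the total inferences in each column. The short check below reproduces those percentages from the counts in the table. The dictionary and function names are illustrative, and the formula is inferred from the numbers, not stated in this record.

```python
# Counts copied from the evaluation table, keyed by knowledge source.
cancer_original = {"Cancer genes": 25, "GO": 43, "Text mining": 58, "All": 67}
in_cancer_trials = {"Cancer genes": 46, "GO": 95, "Text mining": 133, "All": 144}
total_inferences = {"Cancer genes": 171, "GO": 335, "Text mining": 455, "All": 507}


def confirmed_rate(source: str) -> float:
    """Percentage of inferences that are approved for, or in trials for, cancer."""
    confirmed = cancer_original[source] + in_cancer_trials[source]
    return 100 * confirmed / total_inferences[source]


for source in total_inferences:
    print(f"{source}: {confirmed_rate(source):.1f}%")
```

The "All" column is smaller than the sum of the other three columns, which suggests that the same drug can be inferred from more than one knowledge source.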
Figure 2. Treatment distribution for the 296 inferred drugs that neither have cancer as the original indication nor are in clinical trials for cancer.
Performance of the extraction of gene-disease relations (GDRs) and protein-protein interactions (PPIs).
| | GDRs (Bundschus corpus) | PPIs (BioInfer corpus) |
| True Positives (TP) | 205 | 20 |
| False Positives (FP) | 14 | 18 |
| False Negatives (FN) | 469 | 150 |
| Precision | 93.61% | 52.63% |
| Recall | 30.42% | 11.76% |
| F-measure | 45.91% | 19.23% |
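The precision, recall, and F-measure rows follow from the TP, FP, and FN counts by the standard definitions, with the F-measure taken as the harmonic mean of precision and recall. A minimal sketch that recomputes them (the function name `prf` is illustrative):

```python
def prf(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Return (precision, recall, F-measure) as fractions in [0, 1]."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Counts copied from the performance table.
results = {
    "GDRs": prf(tp=205, fp=14, fn=469),
    "PPIs": prf(tp=20, fp=18, fn=150),
}

for task, (p, r, f) in results.items():
    print(f"{task}: precision={p:.2%} recall={r:.2%} F-measure={f:.2%}")
```

Both extractors trade recall for precision: most extracted gene-disease relations are correct, but the majority of annotated relations in each corpus are missed.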
Examples of incorrectly extracted gene-disease relations due to negation (E1) and wrong interactor (E2).
| | Gene-disease relation | Sentence |
| E1 | <overexpressed CCR7, associated with, lymphocyte-predominant Hodgkin disease> | |
| E2 | <overexpressed Bcl-2, associated with, acute myelogenous leukemia> | Synergistic |