Cory B Giles, Jonathan D Wren.
Abstract
BACKGROUND: Relationships between entities such as genes, chemicals, metabolites, phenotypes and diseases in MEDLINE are often directional. That is, one may affect the other in a positive or negative manner. Detection of causality and direction is key in piecing pathways together and in examining possible implications of experimental results. Because of the size and growth of biomedical literature, it is increasingly important to be able to automate this process as much as possible.
Year: 2008 PMID: 18793456 PMCID: PMC2537562 DOI: 10.1186/1471-2105-9-S9-S11
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. Flowchart of the relation extraction process.
Figure 2. An example of sentence structure for the sentence "Both isobutylmethylxanthine and theophylline increased the level of cyclic AMP in rat mast cells." Shown are the grammatical relationships diagrammed in the Penn Treebank format.
Figure 3. Grammatical sentence structure in dependency graph format.
Figure 4. Path of dependency between database terms.
Figure 5. SVM identification of directional relationships.
Figure 6. Precision/recall curve for A) detecting relationships and B) detecting directional relationships within the GENIA corpus.
Figure 7. Precision/recall curve for detecting A) forward relationships and B) reverse relationships in the GENIA corpus.
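Figures 6 and 7 report precision/recall curves. As a reminder of how such a curve is traced, the following minimal sketch sweeps a decision threshold over classifier scores (for instance SVM decision values, as in Figure 5) and records precision and recall at each step. The function name and the toy scores are illustrative assumptions; they are not taken from the paper or from the GENIA corpus.

```python
from typing import List, Tuple


def precision_recall_points(
    scores: List[float], labels: List[int]
) -> List[Tuple[float, float, float]]:
    """Return (threshold, precision, recall) for every distinct score.

    `labels` uses 1 for a true relationship and 0 otherwise.
    A sentence is predicted positive when its score is >= threshold.
    """
    total_positive = sum(labels)
    points = []
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        # tp + fp >= 1 because the threshold itself is one of the scores.
        precision = tp / (tp + fp)
        recall = tp / total_positive if total_positive else 0.0
        points.append((threshold, precision, recall))
    return points


# Toy data (made up for illustration only).
toy_scores = [0.9, 0.8, 0.4, 0.3]
toy_labels = [1, 0, 1, 0]
for point in precision_recall_points(toy_scores, toy_labels):
    print(point)
```

Lowering the threshold can only keep recall the same or raise it, while precision may move in either direction; that trade-off is what the curves summarize.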
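Table 1. Identifying directional relationships in MEDLINE for the chemical compound caffeine on the basis of summarized relationships after analysis of 17,145 abstracts.
| Object | No rel | N.D. rel | F+ | F- | Fn | R+ | R- | Rn | Expected | Extracted |
|---|---|---|---|---|---|---|---|---|---|---|
| (-)-Adrenaline | 58 | 22 | 14 | 4 | 17 | 0 | 3 | 17 | F+ | F+ |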
| (-)-Dopa | 5 | 2 | 1 | 0 | 1 | 1 | 0 | 2 | F+ | None |
| Cyclic AMP | 30 | 9 | 21 | 11 | 12 | 4 | 3 | 13 | F+ | F+ |
| diuresis | 0 | 0 | 2 | 0 | 0 | 0 | 0 | 0 | F+ | F+ |
| Fatigue | 23 | 8 | 4 | 6 | 15 | 1 | 0 | 11 | F- | F- |
| Fatty acid | 10 | 0 | 1 | 0 | 6 | 0 | 0 | 2 | F+ | None |
| gastric acid secretion | 3 | 3 | 1 | 0 | 2 | 0 | 0 | 1 | F+ | None |
| insomnia | 16 | 3 | 3 | 0 | 5 | 0 | 0 | 2 | F+ | F+ |
| Irritability | 2 | 2 | 2 | 1 | 1 | 0 | 0 | 1 | F+ | F+ |
| Lethargy | 2 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | F- | None |
| lipid metabolism | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 2 | F+ | None |
| Nervousness | 0 | 1 | 0 | 1 | 5 | 0 | 0 | 1 | F+ | None |
| Norepinephrine | 61 | 19 | 10 | 2 | 18 | 4 | 0 | 16 | F+ | F+ |
| Palpitation | 5 | 1 | 0 | 0 | 1 | 0 | 0 | 2 | F+ | None |
| PKA | 3 | 1 | 1 | 0 | 3 | 0 | 0 | 2 | F+ | None |
| Respiratory alkalosis | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | F+ | None |
| ryanodine-sensitive calcium-release channel activity | 48 | 18 | 33 | 2 | 17 | 5 | 4 | 15 | F+ | F+ |
| vasodilation | 2 | 0 | 1 | 2 | 1 | 0 | 0 | 0 | F- | F- |
| vasoconstriction | 5 | 0 | 3 | 0 | 2 | 0 | 0 | 0 | F+ | F+ |
Possible directional relationships are classified as forward (F: caffeine affects the object) or reverse (R: the object affects caffeine) and as stimulatory (+), inhibitory (-) or neutral (n). No rel = no relationship found, N.D. rel = no directional relationship found, Expected = the directional relation suggested by the summary information, Extracted = the directional relation with the most support, based on the sentences processed. font indicates errors.
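The "Extracted" column can be read as a simple vote over sentence counts. The sketch below is one hedged interpretation, not the authors' implementation. It assumes the six count columns after "No rel" and "N.D. rel" are ordered F+, F-, Fn, R+, R-, Rn, that neutral counts do not vote, and that a relation needs at least two supporting sentences (the hypothetical `min_support` parameter, chosen because rows with a single supporting sentence are listed as "None"). The "Inc." label for ties follows the note given with the c-myc table.

```python
from typing import Dict

# Only signed directional relations vote; neutral (Fn, Rn) counts are ignored (assumption).
DIRECTIONAL = ("F+", "F-", "R+", "R-")


def extracted_relation(counts: Dict[str, int], min_support: int = 2) -> str:
    """Return the directional relation with the most supporting sentences.

    Returns "None" when no relation reaches `min_support` (an assumed cutoff)
    and "Inc." when two or more relations tie for the highest count.
    """
    best = max(counts.get(rel, 0) for rel in DIRECTIONAL)
    if best < min_support:
        return "None"
    winners = [rel for rel in DIRECTIONAL if counts.get(rel, 0) == best]
    return winners[0] if len(winners) == 1 else "Inc."


# Counts copied from the table rows, under the assumed column order.
adrenaline = {"F+": 14, "F-": 4, "Fn": 17, "R+": 0, "R-": 3, "Rn": 17}
fatigue = {"F+": 4, "F-": 6, "Fn": 15, "R+": 1, "R-": 0, "Rn": 11}
dopa = {"F+": 1, "F-": 0, "Fn": 1, "R+": 1, "R-": 0, "Rn": 2}

print(extracted_relation(adrenaline))  # F+
print(extracted_relation(fatigue))     # F-
print(extracted_relation(dopa))        # None
```

Under these assumptions the rule reproduces the "Extracted" value for the three rows shown, but the cutoff and the treatment of neutral counts remain guesses inferred from the table.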
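Table 2. Identifying directional relationships in MEDLINE for the gene c-myc.
| Object | No rel | N.D. rel | F+ | F- | Fn | R+ | R- | Rn | Expected | Extracted |
|---|---|---|---|---|---|---|---|---|---|---|
| Anaplasia | 6 | 1 | 0 | 0 | 1 | 1 | 0 | 4 | F+/R+ | None |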
| Cell growth | 131 | 34 | 19 | 19 | 76 | 6 | 8 | 34 | F+ | Inc. |
| Cervical carcinoma | 22 | 4 | 0 | 0 | 3 | 0 | 0 | 8 | F+ | None |
| R+ | ||||||||||
| DNA repair | 13 | 11 | 0 | 0 | 1 | 1 | 0 | 4 | F- | None |
| Medulloblastoma | 21 | 6 | 4 | 0 | 13 | 1 | 0 | 20 | F+ | F+ |
| Tumorigenesis | 80 | 40 | 20 | 6 | 48 | 6 | 4 | 32 | F+ | F+ |
| Rapamycin | 4 | 0 | 1 | 0 | 2 | 0 | 1 | 0 | R+ | None |
| Reactive oxygen species | 9 | 5 | 2 | 0 | 4 | 1 | 1 | 10 | F+ | F+ |
| Valproic acid | 1 | 0 | 0 | 0 | 0 | 0 | 2 | 0 | R- | R- |
| AICDA | 4 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | R+ | None |
| AURKA | 5 | 1 | 0 | 0 | 2 | 0 | 0 | 1 | R+ | None |
| Calcineurin | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | R+ | None |
| CREBBP | 7 | 3 | 0 | 0 | 5 | 1 | 0 | 6 | R- | None |
| EPHA2 | 8 | 0 | 1 | 0 | 1 | 0 | 0 | 2 | F- | None |
| R+ | ||||||||||
| HDAC1 | 1 | 5 | 0 | 0 | 0 | 0 | 0 | 4 | F+ | None |
| HMGCS2 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | F- | None |
| IFN-gamma | 2 | 1 | 1 | 0 | 0 | 0 | 1 | 1 | R- | None |
| F+/R+ | ||||||||||
| NDRG2 | 2 | 1 | 0 | 0 | 5 | 0 | 0 | 2 | F- | None |
| NFATC1 | 1 | 0 | 0 | 0 | 1 | 2 | 0 | 1 | R+ | R+ |
| PCGF2 | 4 | 4 | 0 | 0 | 0 | 0 | 0 | 2 | R- | None |
| PPARG | 1 | 1 | 0 | 0 | 1 | 0 | 0 | 2 | F+ | None |
| PRL | 15 | 0 | 0 | 0 | 5 | 6 | 1 | 5 | R+ | R+ |
| F+ | ||||||||||
| SP1 | 29 | 5 | 0 | 3 | 9 | 4 | 0 | 9 | R+ | R+ |
| R+ | ||||||||||
| Telomerase | 27 | 5 | 13 | 0 | 9 | 5 | 1 | 8 | F+ | F+ |
| VEGFA | 49 | 2 | 0 | 1 | 18 | 0 | 0 | 17 | F+ | None |
| WRN | 1 | 1 | 2 | 0 | 3 | 0 | 0 | 2 | F+ | F+ |
| ZBTB16 | 3 | 2 | 0 | 0 | 0 | 0 | 0 | 1 | R- | None |
| ZBTB17 | 24 | 17 | 1 | 8 | 11 | 1 | 3 | 9 | F- | F- |
| Zfp472 | 1 | 2 | 0 | 0 | 0 | 0 | 0 | 1 | R- | None |
See Table 1 for header explanations. ("Inc." = inconclusive - tied for the highest relation score).
Table 3. Example sentences from the failed DRE between c-myc and the entities "Breast Cancer" and "c-jun". The expected relationship for breast cancer was F+ and for c-jun, F-.
| Object | Extracted Rel. | PubMed ID | Sentence |
|---|---|---|---|
| Breast Cancer | R+ | 12150449 | The c-myc oncogene is frequently activated in invasive breast cancer and has been associated with high nuclear grade, lymph node metastasis and poorer disease outcome |
| | R+ | 7734397 | The proto-oncogene c-myc is involved in the stimulation of cell proliferation, and its expression is known to be stimulated by estradiol (E2) in human breast cancer cell lines and various non-cancerous E2-dependent tissues |
| | F+ | 1855215 | In search of critical genes in the mechanism of estrogen action in human breast cancer, we previously showed that estrogen stimulates transcription of the c-myc gene in estrogen-dependent (MCF-7) cells |
| c-Jun | R+ | 8417822 | 17 beta-Estradiol had little effect on expression of c-jun, jun B, jun D, or c-fos mRNA by MCF-7 cells over 12 h, although it stimulated c-myc expression 4-fold within 30 min |
| | F+ | 14523011 | Furthermore, we identify a phylogenetically conserved AP-1-responsive element in the promoter of the c-myc proto-oncogene that recruits in vivo the c-Jun and JunD AP-1 family members and controls the PDGF-dependent transactivation of the c-myc promoter |
| | R- | 8219202 | In addition, intracellularly, mitoxantrone-induced PCD was associated with a marked induction of c-jun and significant repression of c-myc and BCL-2 oncogenes |
Table 4. Path lengths and quantity of stimulatory and inhibitory tokens seem to vary with the type of object.
| 42 | 7.64 | 0.36 | 0.21 | |
| 850 | 7.23 | 0.16 | 0.11 | |
| 1378 | 4.93 | 0.15 | 0.06 |
See Table 2 for the particular objects tested.
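To make "path length" and "stimulatory and inhibitory tokens" concrete, here is a minimal sketch that finds the shortest dependency path between two terms and counts cue words on it. Everything in it is an assumption for illustration: the edges are hand-written for the Figure 2 example sentence rather than produced by a parser, the two cue-word sets are a hypothetical lexicon, and path length is taken to be the number of edges, which may differ from the definition used for the table.

```python
from collections import deque
from typing import Dict, Iterable, List, Optional, Set, Tuple

# Hypothetical cue-word lexicon (not the paper's token lists).
STIMULATORY = {"increased", "stimulated", "activated", "induced"}
INHIBITORY = {"decreased", "inhibited", "repressed", "blocked"}


def build_graph(edges: Iterable[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Build an undirected adjacency map from dependency edges."""
    graph: Dict[str, Set[str]] = {}
    for head, dependent in edges:
        graph.setdefault(head, set()).add(dependent)
        graph.setdefault(dependent, set()).add(head)
    return graph


def shortest_path(graph: Dict[str, Set[str]], start: str, goal: str) -> Optional[List[str]]:
    """Breadth-first search for the shortest token path from start to goal."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for neighbour in sorted(graph.get(path[-1], ())):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(path + [neighbour])
    return None


def path_features(path: List[str]) -> Tuple[int, int, int]:
    """Return (edge count, stimulatory tokens, inhibitory tokens) for a path."""
    tokens = [t.lower() for t in path]
    return (
        len(path) - 1,
        sum(t in STIMULATORY for t in tokens),
        sum(t in INHIBITORY for t in tokens),
    )


# Hand-written, simplified edges for the Figure 2 sentence (illustrative only).
edges = [
    ("increased", "isobutylmethylxanthine"),
    ("isobutylmethylxanthine", "theophylline"),
    ("increased", "level"),
    ("level", "AMP"),
    ("AMP", "cyclic"),
    ("increased", "cells"),
    ("cells", "mast"),
    ("cells", "rat"),
]
path = shortest_path(build_graph(edges), "theophylline", "AMP")
print(path)
print(path_features(path))
```

Averaging such per-sentence features over all sentences for a given object type would yield summary figures of the kind tabulated above, subject to the stated assumptions.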