Yoshimasa Tsuruoka, Makoto Miwa, Kaisei Hamamoto, Jun'ichi Tsujii, Sophia Ananiadou.
Abstract
MOTIVATION: Discovering useful associations between biomedical concepts has been one of the main goals in biomedical text-mining, and understanding their biomedical contexts is crucial in the discovery process. Hence, we need a text-mining system that helps users explore various types of (possibly hidden) associations in an easy and comprehensible manner.
Year: 2011 PMID: 21685059 PMCID: PMC3117364 DOI: 10.1093/bioinformatics/btr214
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
Examples of event-describing phrases
| Event type | Phrase |
|---|---|
| Gene expression | Although resting Jurkat cells |
| Positive regulation | |
| Binding | … responses induced by |
| Phosphorylation | Differential expression and |
| Regulation | |
| Negative regulation | …, a specific |
The terms in bold are protein names, and the italicized words are event triggers.
Joint learning of event triggers and protein names
| Word | Tag |
|---|---|
| CD44 | B-Protein |
| activated | B-Positive_regulation |
| the | Filler |
| transcription | Filler |
| factor | Filler |
| AP | B-Protein |
| - | I-Protein |
| 1 | I-Protein |
| . | O |
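The tagging scheme in the table above can be read back into labeled spans. The following is a minimal sketch, not the authors' implementation: the function name `decode_spans` is hypothetical, and it assumes that `Filler` and `O` both mark tokens outside any protein or trigger span, and that an `I-` tag only continues a span with the same label.

```python
from typing import List, Tuple


def decode_spans(words: List[str], tags: List[str]) -> List[Tuple[str, List[str]]]:
    """Group B-/I- tagged tokens into (label, tokens) spans.

    Assumption: any tag that is not a B- tag or a matching I- tag
    (for example 'O' or 'Filler') closes the current span.
    """
    spans: List[Tuple[str, List[str]]] = []
    label = None          # label of the span being built, if any
    tokens: List[str] = []

    for word, tag in zip(words, tags):
        if tag.startswith("B-"):
            # A B- tag always starts a new span; flush the previous one.
            if label is not None:
                spans.append((label, tokens))
            label, tokens = tag[2:], [word]
        elif tag.startswith("I-") and label == tag[2:]:
            # Continuation of the current span.
            tokens.append(word)
        else:
            # Outside token (or a stray I- tag): close any open span.
            if label is not None:
                spans.append((label, tokens))
            label, tokens = None, []

    if label is not None:
        spans.append((label, tokens))
    return spans


# The example sentence from the table above.
words = ["CD44", "activated", "the", "transcription", "factor", "AP", "-", "1", "."]
tags = ["B-Protein", "B-Positive_regulation", "Filler", "Filler", "Filler",
        "B-Protein", "I-Protein", "I-Protein", "O"]
spans = decode_spans(words, tags)
```

With the table's sentence, this yields two protein spans (`CD44` and the three-token `AP - 1`) and one `Positive_regulation` trigger (`activated`).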
Feature templates used in the CRF tagger
| Feature | Template |
|---|---|
| Word unigram | & |
| Word bigram | & |
| Word trigram | & |
| Substrings | substrings of (up to length 10) & |
| Word shape | S( & |
| Tag bigram | True & |
w is the current word. y is the current tag. Word shape S(w) is produced by converting capital letters into ‘A’, small letters into ‘a’ and numerals into ‘#’.
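The word-shape mapping described above can be sketched in a few lines. This is an illustration under assumptions: the function name `word_shape` is hypothetical, characters other than letters and numerals are left unchanged, and repeated symbols are not collapsed (the source does not say either way).

```python
def word_shape(word: str) -> str:
    """Map capitals to 'A', small letters to 'a' and numerals to '#'.

    Assumptions: other characters (e.g. hyphens) pass through unchanged,
    and runs of identical shape symbols are not collapsed.
    """
    shape = []
    for ch in word:
        if ch.isupper():
            shape.append("A")
        elif ch.islower():
            shape.append("a")
        elif ch.isdigit():
            shape.append("#")
        else:
            shape.append(ch)
    return "".join(shape)


shape_cd44 = word_shape("CD44")   # "AA##"
shape_ap1 = word_shape("AP-1")    # "AA-#"
```

Under these assumptions, protein-like tokens such as `CD44` and `AP-1` get shapes that differ from ordinary lowercase words, which is presumably what makes the feature useful to the tagger.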
Accuracy of trigger detection
| | Triggers Only | | | Joint | | | Joint + Filler | | |
|---|---|---|---|---|---|---|---|---|---|
| Event type | Precision | Recall | F-score | Precision | Recall | F-score | Precision | Recall | F-score |
| Gene expression | 70.9 | 60.8 | 65.5 | 74.9 | 57.4 | 65.0 | 77.9 | 66.4 | 71.7 |
| Transcription | 66.7 | 39.4 | 49.5 | 62.5 | 37.9 | 47.2 | 67.5 | 40.9 | 50.9 |
| Protein catabolism | 93.8 | 79.0 | 85.7 | 93.8 | 79.0 | 85.7 | 93.8 | 79.0 | 85.7 |
| Localization | 86.4 | 47.5 | 61.3 | 82.8 | 60.0 | 69.6 | 85.2 | 57.5 | 68.7 |
| Binding | 64.0 | 26.7 | 37.6 | 67.5 | 31.1 | 42.6 | 72.8 | 32.8 | 45.2 |
| Phosphorylation | 68.6 | 63.2 | 65.8 | 75.8 | 65.8 | 70.4 | 76.7 | 60.5 | 67.7 |
| Regulation | 57.8 | 19.0 | 28.6 | 54.5 | 21.9 | 31.2 | 50.0 | 13.9 | 21.7 |
| Positive regulation | 64.5 | 33.6 | 44.1 | 62.0 | 33.8 | 43.7 | 65.2 | 35.4 | 45.8 |
| Negative regulation | 61.3 | 30.7 | 40.9 | 58.2 | 30.7 | 40.2 | 61.8 | 28.0 | 38.5 |
| Micro Average | 67.2 | 38.4 | 48.9 | 67.1 | 39.1 | 49.4 | 70.5 | 40.4 | 51.4 |
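The F-scores in the table are consistent with the usual harmonic mean of precision and recall. The sketch below assumes that definition (the balanced F1 measure); the function name `f_score` is hypothetical, and the inputs are the micro-average precision and recall values from the table.

```python
def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (assumed F1 definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Micro-average precision and recall from the table, per setting.
micro = {
    "Triggers Only": (67.2, 38.4),
    "Joint": (67.1, 39.1),
    "Joint + Filler": (70.5, 40.4),
}
micro_f = {name: round(f_score(p, r), 1) for name, (p, r) in micro.items()}
```

Rounded to one decimal place, the computed values match the micro-average F-scores reported in the table (48.9, 49.4 and 51.4).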
Fig. 1. Finding indirectly associated concepts.
Fig. 2. A screenshot of FACTA+ search results for indirect associations. The links and icons in the table give the user quick access to the textual evidence (snippets) for the associations.
Fig. 3. Visualization of directly associated concepts using treemapping.
Fig. 4. Visualization of indirectly associated concepts using treemapping and links.