Jari Björne, Filip Ginter, Tapio Salakoski.
Abstract
BACKGROUND: We present a system for extracting biomedical events (detailed descriptions of biomolecular interactions) from research articles, developed for the BioNLP'11 Shared Task. Our goal is to develop a system easily adaptable to different event schemes, following the theme of the BioNLP'11 Shared Task: generalization, the extension of event extraction to varied biomedical domains. Our system extends our BioNLP'09 Shared Task winning Turku Event Extraction System, which uses support vector machines to first detect event-defining words, followed by detection of their relationships.
Year: 2012 PMID: 22759458 PMCID: PMC3384251 DOI: 10.1186/1471-2105-13-S11-S4
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. Event extraction. In most tasks named entities are given (A). Sentences are parsed (B) to produce a dependency parse. Entities not given are predicted through trigger detection (C). Edge detection predicts event arguments between entities (D) and unmerging creates events (E). Finally, event modality is predicted (F). When the graph is converted to the Shared Task format, site arguments are paired with core arguments that have the same target protein.
Figure 2. Site argument representation. Site arguments add detail to core arguments, and each site argument is paired with one core argument. (A) In most tasks we link both core and site arguments to given protein nodes. This minimizes the number of outgoing edges per trigger node, simplifying unmerging, but loses the connection between site and core arguments. (B) In the EPI task, all events with site arguments have a single core argument, so linking sites to the trigger node preserves the site/core connection. (C) To both limit the number of arguments in trigger nodes and preserve site information, event arguments using sites could be linked to protein nodes through the site entity. However, in this approach the core argument would remain undetected if the site was not detected.
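The pairing step for representation (A) can be illustrated with a minimal Python sketch. It is not the system's actual code: the `Edge` type, the function name and the node ids are hypothetical, and it assumes that a site edge runs from the site entity node to the protein node, which is one reading of panel (A).

```python
from collections import defaultdict
from typing import NamedTuple


class Edge(NamedTuple):
    source: str  # trigger node id for core arguments, site entity id for sites (assumed)
    role: str    # e.g. "Theme", "Cause", "Site"
    target: str  # id of the protein node the edge points to


def pair_sites_with_core(edges):
    """Attach to each core argument the site entities that share its target protein."""
    sites_by_protein = defaultdict(list)
    for edge in edges:
        if edge.role == "Site":
            sites_by_protein[edge.target].append(edge.source)
    # Core arguments are every non-Site edge; .get() avoids creating empty entries.
    return [
        (edge, list(sites_by_protein.get(edge.target, [])))
        for edge in edges
        if edge.role != "Site"
    ]


# Hypothetical graph: two triggers share protein P1, one site entity E1 points to P1.
edges = [
    Edge("T1", "Theme", "P1"),
    Edge("T2", "Theme", "P1"),
    Edge("E1", "Site", "P1"),
]
pairs = pair_sites_with_core(edges)
```

In this toy graph the site entity E1 is attached to the Theme arguments of both T1 and T2, because the graph alone does not say which event it belongs to. This is one way to picture the lost site/core connection that the caption mentions.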
Corpus statistics
| Corpus | Sentences | Events | Equiv events | Nesting events | Intersentence events | Neg/spec events |
|---|---|---|---|---|---|---|
| GE'09 | 8906 | 11285 | 7.9% | 38.8% | 6.0% | 12.1% |
| GE | 11581 | 14496 | 6.6% | 37.2% | 6.0% | 13.3% |
| EPI | 7648 | 2684 | 9.1% | 10.2% | 9.3% | 10.1% |
| ID | 3193 | 2931 | 5.3% | 21.3% | 3.9% | 4.9% |
| BB | 1762 | 5843 | 79.4% | N/A | 86.0% | 0% |
| BI | 120 | 458 | 0% | N/A | 0% | 0% |
| CO | 8906 | 5284 | 0% | N/A | 8.5% | N/A |
| REL | 8906 | 2440 | 4.2% | N/A | 0% | 0% |
| REN | 13235 | 373 | 0% | N/A | 2.4% | 0% |
Numbers are for all available annotated data, i.e. the merged training and development sets. Event numbers include the resolved equivalencies.
Figure 3. Ranking of the systems participating in the BioNLP'11 Shared Task. Our system is marked with black dots and the dotted line shows its theoretical maximum performance (see section Graph representation) with all correct classifications. The horizontal line in the ID results shows the improved, post-Shared Task result.
Event types
| Event type | Corpora | Core arguments | Optional arguments |
|---|---|---|---|
| Gene expression | GE, ID | Theme(Protein, Regulon/Operon<sup>ID</sup>) | |
| Transcription | GE, ID | Theme(Protein, Regulon/Operon<sup>ID</sup>) | |
| Protein catabolism | GE, ID | Theme(Protein) | |
| Phosphorylation* | GE, EPI, ID | Theme(Protein) | Site(Entity) |
| Localization | GE, ID, BB | Theme<sup>GE, ID</sup>(Protein, Core entity<sup>ID</sup>), Bacterium<sup>BB</sup>(Bacterium), Localization<sup>BB</sup>(Host, HostPart, Geographical, Environmental, Food, Medical, Soil, Water) | AtLoc<sup>GE, ID</sup>(Entity), ToLoc<sup>GE, ID</sup>(Entity) |
| Binding | GE, ID | Theme(Protein, Core entity<sup>ID</sup>)+ | Site(Entity)+ |
| Regulation | GE, ID | Theme(Protein, Core entity<sup>ID</sup>, Event), Cause(Core entity<sup>ID</sup>, Event) | Site(Entity), CSite(Entity) |
| Positive regulation | GE, ID | Theme(Protein, Core entity<sup>ID</sup>, Event), Cause(Protein, Core entity<sup>ID</sup>, Event) | Site(Entity), CSite(Entity) |
| Negative regulation | GE, ID | Theme(Protein, Core entity<sup>ID</sup>, Event), Cause(Core entity<sup>ID</sup>, Event) | Site(Entity), CSite(Entity) |
| Process | ID | Participant(Core entity) | |
| Hydroxylation* | EPI | Theme(Protein) | Site(Entity) |
| Ubiquitination* | EPI | Theme(Protein) | Site(Entity) |
| DNA methylation* | EPI | Theme(Protein) | Site(Entity) |
| Glycosylation* | EPI | Theme(Protein) | Site(Entity) |
| Acetylation* | EPI | Theme(Protein) | Site(Entity) |
| Methylation* | EPI | Theme(Protein) | Site(Entity) |
| Catalysis | EPI | Theme(Event), Cause(Protein) | |
| PartOf | BB | HostPart(HostPart), Host(Host) | |
| RegulonDependence | BI | Regulon(Regulon), Target(GeneEntity, ProteinEntity) | |
| BindTo | BI | Agent(ProteinEntity), Target(Site, Promoter, Gene, GeneComplex) | |
| TranscriptionFrom | BI | Transcription(Transcription, Expression), Site(Site, Promoter) | |
| RegulonMember | BI | Regulon(Regulon), Member(GeneEntity, ProteinEntity) | |
| SiteOf | BI | Site(Site), Entity(Site, Promoter, GeneEntity) | |
| TranscriptionBy | BI | Transcription(Transcription), Agent(ProteinEntity) | |
| PromoterOf | BI | Promoter(Promoter), Gene(GeneEntity, ProteinEntity) | |
| PromoterDependence | BI | Promoter(Promoter), Protein(GeneEntity, ProteinEntity) | |
| ActionTarget | BI | Action(Action, Expression, Transcription), Target( | |
| Interaction | BI | Agent(GeneEntity, ProteinEntity), Target(GeneEntity, ProteinEntity) | |
| Coref | CO | Anaphora(Exp), Antecedent(Exp), Reference(Protein)+ | |
| Protein-Component | REL | Arg1(Protein), Arg2(Entity) | |
| Subunit-Complex | REL | Arg1(Protein), Arg2(Entity) | |
| Renaming | REN | Former(Gene), New(Gene) | |
The event types for all tasks, with the core arguments used for the primary evaluation and the optional arguments used for the secondary evaluation. For events present in multiple tasks, superscripts mark the arguments and targets that are limited to a specific task. In the EPI task, starred events have a corresponding reverse event (e.g. Dephosphorylation) with identical argument types. The plus sign indicates where multiple arguments of the same type are allowed for one event.
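A table of this kind can be treated as a machine-checkable schema. The following minimal Python sketch encodes three rows of the table and checks an event's arguments against them. The `SCHEMA` layout and the `is_valid` function are illustrative assumptions, not part of the described system. The sketch ignores the task-specific superscript restrictions and does not require any argument to be present; it only checks roles, target types and multiplicity.

```python
from collections import Counter

# Three rows transcribed from the event type table; "multi" lists roles marked with "+".
SCHEMA = {
    "Phosphorylation": {
        "core": {"Theme": {"Protein"}},
        "optional": {"Site": {"Entity"}},
        "multi": set(),
    },
    "Binding": {
        "core": {"Theme": {"Protein", "Core entity"}},
        "optional": {"Site": {"Entity"}},
        "multi": {"Theme", "Site"},
    },
    "Catalysis": {
        "core": {"Theme": {"Event"}, "Cause": {"Protein"}},
        "optional": {},
        "multi": set(),
    },
}


def is_valid(event_type, arguments):
    """Check a list of (role, target_type) pairs against the schema entry."""
    spec = SCHEMA[event_type]
    allowed = {**spec["core"], **spec["optional"]}
    seen = Counter()
    for role, target_type in arguments:
        # The role must exist for this event type and accept this target type.
        if role not in allowed or target_type not in allowed[role]:
            return False
        seen[role] += 1
        # Repeated roles are only legal where the table shows a plus sign.
        if seen[role] > 1 and role not in spec["multi"]:
            return False
    return True


ok_site = is_valid("Phosphorylation", [("Theme", "Protein"), ("Site", "Entity")])
two_themes = is_valid("Phosphorylation", [("Theme", "Protein"), ("Theme", "Protein")])
binding_pair = is_valid("Binding", [("Theme", "Protein"), ("Theme", "Protein")])
```

Here `ok_site` and `binding_pair` are accepted, while `two_themes` is rejected because Phosphorylation carries no plus sign on Theme.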
Devel and test results for the BioNLP'11 Shared Task
| Corpus | Devel F | Test F |
|---|---|---|
| GE'09 task 1 | 56.27 | 53.15 |
| GE'09 task 2 | 54.25 | 50.68 |
| GE task 1 | 55.78 | 53.30 |
| GE task 2 | 53.39 | 51.97 |
| GE task 3 | 38.34 | 26.86 |
| EPI | 56.41 | 53.33 |
| ID | 44.92 | 42.57 |
| BB | 27.01 | 26 |
| BI | 77.24 | 77 |
| CO | 36.22 | 23.77 |
| REL | 65.99 | 57.7 |
| REN | 84.62 | 87.0 |
The performance of our new system on the BioNLP'09 ST GENIA dataset is shown for reference, with task 3 omitted due to a changed metric. For GE-tasks, the Approximate Span & Recursive matching criterion is used. In many tasks, the development and test set results differ considerably, which may be partially explained by noise unseen due to lack of cross-validation and by the event distribution not being stratified across the sets.
Figure 4. Learning curves. The learning curves provide an analysis of system performance relative to dataset size. The dotted line shows the addition of GE training data to ID training data. The x-axis is binary logarithmic, and the training corpus size roughly doubles between most points in the curves (2, 4, 8, 16, 32, 64 and 100%). Thus, a linear growth in F-score indicates a need for a corresponding exponential increase in dataset size.
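The last sentence of the caption can be made concrete with a small worked example. If the curve is roughly a straight line on the binary logarithmic axis, the F-score rises by a fixed amount g for every doubling of the data, so gaining ΔF points requires multiplying the corpus size by 2^(ΔF/g). The Python sketch below computes that factor. The function name and the numeric values are hypothetical and are not read off the figure.

```python
def data_multiplier(delta_f: float, gain_per_doubling: float) -> float:
    """Factor by which the corpus must grow to gain delta_f F-score points,
    assuming a constant gain per doubling (a straight line on a log2 axis)."""
    if gain_per_doubling <= 0:
        raise ValueError("gain_per_doubling must be positive")
    return 2.0 ** (delta_f / gain_per_doubling)


# Hypothetical slope: 2 F-score points per doubling of the training data.
factor_for_two_points = data_multiplier(2.0, 2.0)   # one doubling
factor_for_six_points = data_multiplier(6.0, 2.0)   # three doublings
```

Under this assumed slope, two extra points cost twice the data and six extra points cost eight times the data, which is the exponential growth the caption refers to.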
Results of self-training
| | Random distribution (devel/test) | Even distribution (devel/test) |
|---|---|---|
| | 55.97% | 56.17% |
| | 56.18%/52.72% | 56.83%/53.21% |
| | 54.83% | 55.78% |
| | 55.67% | 55.79% |
| baseline | 55.46%/52.84% | 55.46%/52.84% |
Performance of the system on the GE subtask 1 in terms of F-score on the overall Approximate Span & Recursive matching criterion. Random distribution refers to self-training example selection by random sampling, whereas even distribution refers to selection of an equal number of examples for each event type and argument combination. Baseline is the performance of the system with no self-training (trained on GE subtask 1 data only).
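The two selection strategies named in the caption can be contrasted with a minimal Python sketch. It is only an illustration under stated assumptions: the tuple layout of an example, the function names and the toy pool are hypothetical, and the actual selection procedure of the system may differ.

```python
import random
from collections import defaultdict


def sample_random(examples, k, rng):
    """Random distribution: draw k examples uniformly from the whole pool."""
    return rng.sample(examples, min(k, len(examples)))


def sample_even(examples, per_group, rng, key):
    """Even distribution: draw the same number of examples from every group,
    where key(example) returns the (event type, argument combination) group."""
    groups = defaultdict(list)
    for example in examples:
        groups[key(example)].append(example)
    picked = []
    for name in sorted(groups):
        members = groups[name]
        # Small groups contribute everything they have.
        picked.extend(rng.sample(members, min(per_group, len(members))))
    return picked


# Hypothetical pool of automatically labelled examples: (event type, arguments, id).
pool = [("Gene expression", "Theme", i) for i in range(50)]
pool += [("Localization", "Theme", i) for i in range(5)]

rng = random.Random(0)
even = sample_even(pool, 3, rng, key=lambda ex: ex[:2])
uniform = sample_random(pool, 6, rng)
```

With even selection the rare Localization group contributes as many examples as the frequent Gene expression group, whereas uniform sampling tends to reproduce the skew of the pool.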
Detailed results of the even distribution self-training experiment
| Event type | # | % of events | Baseline [%] | ST [%] | Δ (devel.) | Δ (test) |
|---|---|---|---|---|---|---|
| Gene expression | 749 | 23.1% | 78.79 | 79.21 | +0.42 | +0.50 |
| Transcription | 158 | 4.9% | 59.78 | 61.71 | +1.93 | -0.33 |
| Protein catabolism | 23 | 0.7% | 89.80 | 95.83 | +6.03 | -6.32 |
| Phosphorylation | 111 | 3.4% | 85.97 | 86.49 | +0.52 | +0.46 |
| Localization | 67 | 2.1% | 64.91 | 66.67 | +1.76 | +6.00 |
| Binding | 373 | 11.5% | 51.30 | 50.88 | -0.42 | -0.61 |
| Regulation | 292 | 9.0% | 38.28 | 38.33 | +0.05 | +1.16 |
| Positive regulation | 999 | 30.8% | 42.74 | 47.14 | +4.40 | +1.70 |
| Negative regulation | 471 | 14.5% | 41.37 | 42.16 | +0.79 | -3.04 |
| Overall | 3,243 | 100.0% | 55.46 | 56.83 | +1.37 | +0.37 |
Performance of the system on the GE subtask 1 in terms of F-score on the overall Approximate Span & Recursive matching criterion. Baseline and self-training (ST) results, as well as evaluation event counts are given for the development set. Difference (Δ) in F-score is given for both the development and test sets.