Farrokh Mehryary, Suwisa Kaewphan, Kai Hakala, Filip Ginter.
Abstract
BACKGROUND: Biomedical event extraction is one of the key tasks in biomedical text mining, supporting applications such as database curation and hypothesis generation. Several systems have been introduced to solve this task, some of which have been applied at a large scale. Past studies have shown that identifying the phrases that describe biological processes, also known as trigger detection, is a crucial part of event extraction, and that notable overall performance gains can be obtained by focusing solely on this sub-task. In this paper we propose a novel approach for filtering falsely identified triggers from large-scale event databases, thus improving the quality of knowledge extraction.
Keywords: BioNLP; Event extraction; Trigger detection; Word embeddings
Year: 2016 PMID: 27175227 PMCID: PMC4864999 DOI: 10.1186/s13326-016-0070-4
Source DB: PubMed Journal: J Biomed Semantics
Fig. 1. Visualization of a specific event occurrence. Genes and gene products (‘GGPs’) are marked, as well as the trigger words that refer to specific event types. Finally, arrows denote the roles of each argument in the event (e.g. Theme or Cause). (Adapted from [23])
Fig. 2. Example sentence with multiple events sharing a single trigger. Two event occurrences extracted from the same trigger word recognized
Table 1. Distribution of triggers and their associated event percentages in the EVEX database
| Trigger word frequency (at least) | EVEX events coverage (%) | Number of trigger words |
|---|---|---|
| 100 | 98.4 | 6339 |
| 200 | 97.6 | 4263 |
| 300 | 97.1 | 3391 |
| 400 | 96.6 | 2880 |
| 500 | 96.3 | 2538 |
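The relationship between a frequency threshold, the share of events retained, and the number of surviving trigger words can be sketched as follows. This is a minimal illustration, assuming that a trigger word's "frequency" is the number of events extracted from it; the function name `coverage_at_least` and the toy counts are hypothetical and are not taken from the EVEX database.

```python
from collections import Counter


def coverage_at_least(trigger_event_counts, min_freq):
    """Return (coverage_percent, n_trigger_words) for triggers whose
    event count is at least `min_freq`.

    `trigger_event_counts` maps a trigger word to the number of events
    extracted from it (an assumption about what "frequency" means).
    """
    total = sum(trigger_event_counts.values())
    kept = {w: n for w, n in trigger_event_counts.items() if n >= min_freq}
    coverage = 100.0 * sum(kept.values()) / total if total else 0.0
    return coverage, len(kept)


# Toy data for illustration only; these are not EVEX counts.
toy_counts = Counter(
    {"expression": 900, "binding": 600, "induced": 450, "role": 40, "via": 10}
)

for threshold in (100, 500):
    pct, n_words = coverage_at_least(toy_counts, threshold)
    print(f"min_freq={threshold}: {pct:.1f}% of events, {n_words} trigger words")
```

As in the table, raising the threshold removes many rare trigger words while giving up only the events those words account for.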
Table 2. Examples of matching EVEX trigger words against Shared Task exact trigger words or their corresponding parts/lemmas
| EVEX trigger word | ST’11-trigger word/Part/Lemma |
|---|---|
| co-transcribed | transcribed |
| calcium-induced | induced |
| co-immunoprecipitates | immunoprecipitate |
| downregulating | downregulate |
| recognise | recognize |
| preceding | precede |
| analyzing | analyse |
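One way such matching could work is sketched below: an EVEX trigger is accepted if the word itself, one of its hyphen-separated parts, or the lemma of any of these appears among the Shared Task triggers. This is only an illustration under stated assumptions, not the authors' procedure. The helper names, the toy lemma lookup, and the small Shared Task set are hypothetical; a real system would plug in a proper lemmatizer and would need extra handling for spelling variants such as recognise/recognize.

```python
def candidate_forms(word, lemmatize):
    """Collect the forms of `word` that may match a Shared Task trigger:
    the word itself, its hyphen-separated parts, and their lemmas."""
    word = word.lower()
    forms = {word}
    forms.update(part for part in word.split("-") if part)
    forms.update(lemmatize(form) for form in list(forms))
    return forms


def matches_shared_task(evex_trigger, st_triggers, lemmatize):
    """True if any candidate form of the EVEX trigger is a known ST trigger."""
    return bool(candidate_forms(evex_trigger, lemmatize) & st_triggers)


# Hypothetical stand-in for a real lemmatizer (illustration only).
TOY_LEMMAS = {
    "immunoprecipitates": "immunoprecipitate",
    "downregulating": "downregulate",
    "preceding": "precede",
}


def toy_lemmatize(word):
    return TOY_LEMMAS.get(word, word)


# Hypothetical excerpt of Shared Task trigger words and lemmas (lowercase).
st_triggers = {"transcribed", "induced", "immunoprecipitate", "downregulate", "precede"}

for trigger in ("co-transcribed", "calcium-induced", "co-immunoprecipitates", "role"):
    print(trigger, matches_shared_task(trigger, st_triggers, toy_lemmatize))
```

Passing the lemmatizer in as a function keeps the matching logic independent of any particular NLP library.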
Table 3. Performance comparison of the different pruning approaches and the baseline methods (TEES/EVEX) on the official BioNLP Shared Task GE data sets
| Predictions | Method | Precision | Recall | F1-score |
|---|---|---|---|---|
| TEES-2011 (Shared Task 2011) | Original TEES | 61.76 | 48.78 | 54.51 |
| | Pruned-TEES (Unsupervised Method) | 62.39 | 48.75 | 54.74 |
| | Pruned-TEES (Manual Annotation Method) | 62.04 | 48.78 | 54.62 |
| | Pruned-TEES (Aggregation Method) | 62.26 | 48.78 | 54.70 |
| | Pruned-TEES (Aggregation Method + SVM) | 62.27 | 48.78 | 54.71 |
| TEES-2013 (Shared Task 2013) | Original TEES | 56.32 | 46.17 | 50.74 |
| | Pruned-TEES (Unsupervised Method) | 57.13 | 46.02 | 50.97 |
| | Pruned-TEES (Manual Annotation Method) | 56.63 | 46.17 | 50.87 |
| | Pruned-TEES (Aggregation Method) | 56.97 | 46.17 | 51.00 |
| | Pruned-TEES (Aggregation Method + SVM) | 57.01 | 46.17 | 51.02 |
| EVEX-2013 (Shared Task 2013) | Original EVEX | 58.03 | 45.44 | 50.97 |
| | Pruned-EVEX (Unsupervised Method) | 58.77 | 45.29 | 51.15 |
| | Pruned-EVEX (Manual Annotation Method) | 58.32 | 45.44 | 51.08 |
| | Pruned-EVEX (Aggregation Method) | 58.66 | 45.44 | 51.21 |
| | Pruned-EVEX (Aggregation Method + SVM) | 58.71 | 45.44 | 51.23 |
Table 4. Trigger/event classification performance, measured on the EVEX test set. The first column (Count) shows prediction results based on the counts of trigger words (test set examples). The second column (Sum of frequency) shows the number of respective events of those triggers in the EVEX database. For instance, the first row (True-Positive) shows that the classifier has correctly predicted 352 test set trigger words to be correct triggers, while these words account for 4,602 extracted events in the EVEX resource
| | Count | Sum of frequency (number of events) |
|---|---|---|
| True-Positive | 352 | 4602 |
| True-Negative | 99 | 679 |
| False-Positive | 134 | 850 |
| False-Negative | 11 | 115 |
| Total | 596 | 6246 |
Table 5. Trigger classification performance on the EVEX resource based on trigger counts (test set examples). The prediction measures in this table are calculated from the values in the first column of Table 4. This table shows how well the classifier distinguishes between correct and incorrect trigger words. The last column (Support) shows that there are 363 correct and 233 incorrect trigger words in the test set, i.e., 596 in total
| | Precision | Recall | F2-score | Support |
|---|---|---|---|---|
| Negative (incorrect) | 0.90 | 0.42 | 0.48 | 233 |
| Positive (correct) | 0.72 | 0.97 | 0.91 | 363 |
| Weighted averages, total | 0.79 | 0.76 | 0.74 | 596 |
Table 6. Classification performance on the EVEX resource based on the respective event counts in the EVEX database. This table shows how well the classifier preserves correct events and eliminates incorrect events in the EVEX database. The prediction measures in this table are calculated from the values in the second column of Table 4. The last column (Support) shows that there are 1,529 incorrect and 4,717 correct corresponding events in the EVEX database (6,246 in total), which are extracted based on the 596 trigger words in the test set
| | Precision | Recall | F2-score | Support |
|---|---|---|---|---|
| Negative (incorrect) | 0.86 | 0.44 | 0.49 | 1529 |
| Positive (correct) | 0.84 | 0.98 | 0.95 | 4717 |
| Weighted averages, total | 0.85 | 0.77 | 0.77 | 6246 |
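The per-class precision, recall and F2 values in the two tables above follow from the Table 4 counts. The sketch below recomputes them, assuming the standard definition F_β = (1 + β²)·P·R / (β²·P + R) with β = 2, and treating "incorrect" as the target class when scoring the Negative row. The `Confusion` class and function names are illustrative only.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    tp: int  # correct triggers predicted as correct
    tn: int  # incorrect triggers predicted as incorrect
    fp: int  # incorrect triggers predicted as correct
    fn: int  # correct triggers predicted as incorrect


def f_beta(precision, recall, beta=2.0):
    """Standard F-beta score; beta=2 weights recall higher than precision."""
    b2 = beta * beta
    denom = b2 * precision + recall
    return (1 + b2) * precision * recall / denom if denom else 0.0


def class_scores(c, positive=True):
    """Precision, recall and F2 for the positive (correct) class, or for
    the negative (incorrect) class when `positive` is False."""
    if positive:
        hit, false_alarm, miss = c.tp, c.fp, c.fn
    else:
        hit, false_alarm, miss = c.tn, c.fn, c.fp
    precision = hit / (hit + false_alarm)
    recall = hit / (hit + miss)
    return precision, recall, f_beta(precision, recall)


# Counts copied from Table 4.
TRIGGERS = Confusion(tp=352, tn=99, fp=134, fn=11)   # Count column
EVENTS = Confusion(tp=4602, tn=679, fp=850, fn=115)  # Sum of frequency column

for name, counts in (("triggers", TRIGGERS), ("events", EVENTS)):
    for label, positive in (("negative", False), ("positive", True)):
        p, r, f2 = class_scores(counts, positive)
        print(f"{name:8s} {label:8s} P={p:.2f} R={r:.2f} F2={f2:.2f}")
```

Because F2 favours recall, the Positive class scores well despite its lower precision on the trigger counts, which matches a pruning goal of preserving correct events.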