Hai-Long Trieu1, Thy Thy Tran2, Khoa N A Duong1, Anh Nguyen1, Makoto Miwa1,3, Sophia Ananiadou2.
Abstract
MOTIVATION: Recent neural approaches to event extraction from text mainly focus on flat events in the general domain, while there have been fewer attempts to detect nested and overlapping events. These existing systems are built on given entities and depend on external syntactic tools.
Year: 2020 PMID: 33141147 PMCID: PMC7750964 DOI: 10.1093/bioinformatics/btaa540
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
Fig. 1. Example flat and nested events from the BioNLP-ST 2013 Cancer Genetics task. Green and orange denote entities, while red denotes event triggers, e.g. erbA and erythroid cells are two entities and transformed is a trigger. A blue connection denotes the role of an argument to a trigger, where an argument can be an entity or a trigger, e.g. erythroid cells is a Theme of the trigger transformed. Event structures are constructed from an event trigger and its arguments. The flat event, transformed erythroid cells, has only one entity as its argument. The nested event, erbA/myb IRES transformed erythroid cells, has the entity erbA/myb IRES as Cause and the flat event transformed erythroid cells as its argument. The nested event is a tree whose root is the red rectangle (Positive regulation) with blue connections to its entity argument (Organism) and its flat event argument (transformed erythroid cells).
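
The event structure described in the caption, a trigger plus role-labelled arguments that are either entities or other events, can be sketched as a small recursive data type. This is only an illustrative sketch: the class names, the `nesting_level` helper, the entity type `Cell`, the `Theme` role of the inner event, and the convention that a flat event has level 0 are all assumptions made for this example, not details taken from the paper.

```python
from dataclasses import dataclass, field
from typing import List, Tuple, Union


@dataclass
class Entity:
    text: str
    type: str


@dataclass
class Event:
    trigger: str
    type: str
    # Each argument is a (role, filler) pair; the filler is an entity or another event.
    arguments: List[Tuple[str, Union[Entity, "Event"]]] = field(default_factory=list)


def nesting_level(event: Event) -> int:
    """Return 0 for a flat event, otherwise 1 + the deepest event argument (assumed convention)."""
    child_levels = [
        nesting_level(filler)
        for _, filler in event.arguments
        if isinstance(filler, Event)
    ]
    return 1 + max(child_levels) if child_levels else 0


# Flat event: "transformed erythroid cells" (entity type "Cell" is an assumption).
flat_event = Event(
    trigger="transformed",
    type="Cell transformation",
    arguments=[("Theme", Entity("erythroid cells", "Cell"))],
)

# Nested event: an entity as Cause plus the flat event as a second argument.
# The "Theme" role of the inner event is an assumption for illustration.
nested_event = Event(
    trigger="transformed",
    type="Positive regulation",
    arguments=[
        ("Cause", Entity("erbA/myb IRES", "Organism")),
        ("Theme", flat_event),
    ],
)

print(nesting_level(flat_event), nesting_level(nested_event))
```

Because an argument may itself be an event, the structure is a tree rooted at the outermost trigger, which matches the description of the nested event in the caption.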
Fig. 2. Overview of our deep end-to-end event extraction model. The hidden layers are omitted for brevity. We use the example in Figure 1 to present our method. Owing to space limitations, we omit some unimportant tokens (e.g. the, virus, and). We use abbreviations to represent triggers and entities: CTrans and PReg stand for Cell Transformation and Positive Regulation triggers, respectively, whilst Org and GGP are the short forms of Organism and Gene or gene product entities, respectively. We sum the representations of the tokens and concatenate the result with the first and last tokens to form the representation of a span. We show the combination details for the entity erythroid cells, while those of the others are omitted for simplicity.
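
The span representation mentioned in the caption (sum of the token representations, concatenated with the first and last tokens) might look roughly like the following. This is a minimal sketch with plain Python lists standing in for the model's token vectors; the function name `span_representation`, the inclusive index convention, and the concatenation order are assumptions for illustration.

```python
from typing import List

Vector = List[float]


def span_representation(token_vectors: List[Vector], start: int, end: int) -> Vector:
    """Build a span vector for the inclusive token range [start, end]."""
    span = token_vectors[start:end + 1]
    if not span:
        raise ValueError("span must contain at least one token")
    # Element-wise sum over all token vectors in the span.
    summed = [sum(dims) for dims in zip(*span)]
    # Concatenate first token, summed tokens, last token (the order is an assumption).
    return span[0] + summed + span[-1]


# Toy 2-dimensional token vectors; a real model would use learned representations.
tokens = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
example_span = span_representation(tokens, 1, 2)
print(example_span)  # [3.0, 4.0, 8.0, 10.0, 5.0, 6.0]
```

With token vectors of dimension d, the resulting span vector has dimension 3d regardless of how many tokens the span covers.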
Results of event extraction on the test sets given gold entities
| Task | Model | P | R | F (%) |
|---|---|---|---|---|
| CG | TEES-CNN | 66.55 | 50.77 | 57.60 |
| | DeepEventMine (single) | 69.54 | 54.24 | 60.94 |
| | DeepEventMine (ensemble) | 72.23 | 53.92 | |
| EPI | EventMine | 54.42 | 54.28 | 54.35 |
| | TEES-CNN | 64.93 | 50.00 | 56.50 |
| | DeepEventMine (single) | 73.73 | 55.95 | 63.62 |
| | DeepEventMine (ensemble) | 78.34 | 56.39 | |
| GE11 | EventMine | 63.48 | 53.35 | 57.98 |
| | BioMLN | 63.61 | 53.42 | 58.07 |
| | TEES-CNN | 69.45 | 49.94 | 58.10 |
| | DeepEventMine (single) | 71.71 | 56.20 | 63.02 |
| | DeepEventMine (ensemble) | 76.28 | 55.06 | |
| GE13 | TEES-CNN | 65.78 | 44.38 | 53.00 |
| | BioMLN | 59.24 | 48.95 | 53.61 |
| | DeepEventMine (single) | 60.98 | 49.80 | 54.83 |
| | DeepEventMine (ensemble) | 67.08 | 49.14 | |
| ID | TEES-CNN | 66.48 | 50.66 | 57.50 |
| | EventMine | 61.33 | 58.96 | 60.12 |
| | DeepEventMine (single) | 63.56 | 57.30 | 60.27 |
| | DeepEventMine (ensemble) | 68.51 | 55.99 | |
| PC | EventMine | 53.48 | 52.23 | 52.84 |
| | TEES-CNN | 62.16 | 50.34 | 55.62 |
| | DeepEventMine (single) | 64.12 | 49.19 | 55.67 |
| | DeepEventMine (ensemble) | 68.13 | 50.07 | |
| MLEE | MultiRep-CNN | 60.56 | 56.23 | 58.31 |
| | PMCNN | 67.23 | 53.61 | 59.65 |
| | BLSTM | — | — | 59.61 |
| | DeepEventMine (single) | 67.39 | 56.35 | 61.38 |
| | DeepEventMine (ensemble) | 69.91 | 55.49 | |
Notes: The highest scores are shown in bold.
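
The relationship between the P, R and F columns can be checked with a few lines of code. The sketch below assumes that F is the balanced F-score (the harmonic mean of precision and recall), which is consistent with the fully reported rows; the function name `f_score` is chosen for this example. Published scores are presumably computed from raw counts, so a value recomputed from rounded P and R may differ in the last digit.

```python
def f_score(precision: float, recall: float) -> float:
    """Balanced F-score: harmonic mean of precision and recall (same scale as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Check against two fully reported rows of the CG task.
tees_cnn_cg = round(f_score(66.55, 50.77), 2)
deepeventmine_single_cg = round(f_score(69.54, 54.24), 2)
print(tees_cnn_cg, deepeventmine_single_cg)  # 57.6 60.94
```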
Results of nested event categories on the test sets
| Nested category | SOTA P | SOTA R | SOTA F (%) | DeepEventMine (ens.) P | DeepEventMine (ens.) R | DeepEventMine (ens.) F (%) |
|---|---|---|---|---|---|---|
| CG (versus TEES-CNN) | | | | | | |
| Regulation | 50.52 | 30.93 | 38.36 | 61.13 | 34.07 | |
| Positive_regulation | 61.28 | 44.24 | 51.38 | 66.26 | 45.72 | |
| Negative_regulation | 56.60 | 45.97 | 50.73 | 62.23 | 44.29 | |
| Total regulation | 57.33 | 41.14 | 47.90 | 63.88 | 42.19 | |
| Planned_process | 48.53 | 34.10 | 40.05 | 57.45 | 42.33 | |
| % nested events | 50.76% | | | | | |
| EPI (versus TEES-CNN) | | | | | | |
| Catalysis | 46.67 | 6.14 | 10.85 | 81.25 | 23.68 | |
| % nested events | 8.27% | | | | | |
| GE11 (versus TEES-CNN) | | | | | | |
| Regulation | 53.47 | 34.03 | 41.59 | 67.69 | 34.29 | |
| Positive_regulation | 63.63 | 38.67 | 48.10 | 69.42 | 47.68 | |
| Negative_regulation | 54.89 | 43.26 | 48.38 | 64.27 | 46.94 | |
| Total regulation | 59.54 | 39.02 | 47.14 | 67.87 | 45.35 | |
| % nested events | 53.83% | | | | | |
| GE13 (versus BioMLN) | | | | | | |
| Regulation | — | — | — | 64.04 | 25.35 | 36.32 |
| Positive_regulation | — | — | — | 62.84 | 38.76 | 47.95 |
| Negative_regulation | — | — | — | 64.31 | 44.87 | 52.86 |
| Total regulation | 50.86 | 36.47 | 42.48 | 63.41 | 38.43 | |
| % nested events | 58.89% | | | | | |
| ID (versus EventMine) | | | | | | |
| Regulation | 44.00 | 22.80 | | 59.65 | 17.62 | 27.20 |
| Positive_regulation | 63.95 | 49.23 | 55.63 | 64.77 | 59.49 | |
| Negative_regulation | 74.63 | 55.25 | | 74.02 | 51.93 | 61.04 |
| Total regulation | 62.47 | 42.18 | 50.36 | 67.22 | 42.88 | |
| % nested events | 41.50% | | | | | |
| PC (versus TEES-CNN) | | | | | | |
| Regulation | 50.48 | 35.81 | 41.90 | 60.33 | 33.11 | |
| Positive_regulation | 56.05 | 36.86 | 44.48 | 66.24 | 35.88 | |
| Negative_regulation | 52.38 | 43.67 | 47.63 | 59.90 | 44.05 | |
| Total regulation | 53.69 | 38.43 | 44.80 | 62.94 | 37.43 | |
| Activation | 75.48 | 79.76 | 77.56 | 81.28 | 77.33 | |
| Inactivation | 51.67 | 47.69 | | 52.17 | 36.92 | 43.24 |
| % nested events | 57.57% | | | | | |
| MLEE (versus PMCNN) | | | | | | |
| Regulation | 42.96 | 27.16 | 33.28 | 50.00 | 25.00 | |
| Positive_regulation | 45.92 | 39.65 | 42.56 | 61.33 | 49.16 | |
| Negative_regulation | 51.01 | 38.49 | 43.87 | 57.21 | 41.83 | |
| Total regulation | — | — | — | 57.86 | 40.53 | 47.67 |
| Planned_process | 44.03 | 37.36 | | 50.00 | 32.14 | 39.13 |
| % nested events | 53.17% | | | | | |
Notes: ens.: ensemble. Total regulation: micro-averaged scores over the three regulation types. % nested events: the proportion of events in the nested categories among all events in a corpus. The highest scores are shown in bold.
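
Micro-averaging, as used for the total regulation rows, pools the counts of all types before computing precision, recall and F, so frequent types weigh more than rare ones. The sketch below illustrates this under stated assumptions: the per-type counts are hypothetical numbers invented for the example (they are not taken from any corpus in the table), and the helper names `prf` and `micro_average` are introduced here.

```python
from typing import Dict, Tuple

Counts = Tuple[int, int, int]  # (true positives, false positives, false negatives)


def prf(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Precision, recall and balanced F-score in percent."""
    precision = 100 * tp / (tp + fp) if tp + fp else 0.0
    recall = 100 * tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f = 2 * precision * recall / total if total else 0.0
    return precision, recall, f


def micro_average(per_type: Dict[str, Counts]) -> Tuple[float, float, float]:
    """Pool the counts over all types, then compute P/R/F once."""
    tp = sum(c[0] for c in per_type.values())
    fp = sum(c[1] for c in per_type.values())
    fn = sum(c[2] for c in per_type.values())
    return prf(tp, fp, fn)


# Hypothetical counts, for illustration only.
hypothetical_counts = {
    "Regulation": (30, 20, 60),
    "Positive_regulation": (90, 50, 110),
    "Negative_regulation": (40, 25, 50),
}
micro_p, micro_r, micro_f = micro_average(hypothetical_counts)
print(round(micro_p, 2), round(micro_r, 2), round(micro_f, 2))
```

A macro-average would instead compute P/R/F per type and average the three results, which generally gives a different number.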
Comparison of our model with the pipeline training setting when extracting events from raw text and given gold entities on the development sets
| Task | Model | Entity P | Entity R | Entity F | Trigger P | Trigger R | Trigger F | Role P | Role R | Role F | Event P | Event R | Event F (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| CG | Pipeline (gold) | — | — | — | 78.85 | 82.89 | 80.82 | 67.15 | 66.38 | 66.76 | 59.64 | 53.66 | 56.49 |
| | DeepEventMine (gold) | — | — | — | 79.17 | 82.93 | 81.01 | 63.18 | 66.65 | 64.87 | 65.24 | 55.93 | |
| | Pipeline | 85.47 | 84.12 | 84.79 | 79.18 | 81.90 | 80.52 | 61.99 | 60.84 | 61.41 | 53.25 | 47.49 | 50.20 |
| | DeepEventMine | 85.57 | 82.02 | 83.76 | 79.54 | 81.45 | 80.48 | 59.72 | 60.70 | 60.20 | 61.23 | 48.74 | |
| EPI | Pipeline (gold) | — | — | — | 74.43 | 83.11 | 78.53 | 75.74 | 65.50 | 70.25 | 71.02 | 55.29 | 62.18 |
| | DeepEventMine (gold) | — | — | — | 76.99 | 82.52 | 79.66 | 76.83 | 67.25 | 71.72 | 75.90 | 56.18 | |
| | Pipeline | 85.06 | 84.02 | 84.54 | 74.43 | 83.11 | 78.53 | 67.53 | 61.07 | 64.14 | 59.61 | 49.55 | 54.12 |
| | DeepEventMine | 83.84 | 82.04 | 82.93 | 73.52 | 81.94 | 77.50 | 66.46 | 62.35 | 64.34 | 59.69 | 52.40 | |
| GE11 | Pipeline (gold) | — | — | — | 71.11 | 69.97 | 70.54 | 69.96 | 59.63 | 64.38 | 67.99 | 56.37 | 61.64 |
| | DeepEventMine (gold) | — | — | — | 71.23 | 70.31 | 70.77 | 63.52 | 59.09 | 61.22 | 70.52 | 56.52 | |
| | Pipeline | 88.69 | 84.64 | 86.62 | 73.32 | 68.72 | 70.95 | 66.25 | 55.52 | 60.41 | 60.63 | 50.25 | 54.95 |
| | DeepEventMine | 88.51 | 84.29 | 86.35 | 72.05 | 68.89 | 70.43 | 60.82 | 57.14 | 58.92 | 62.36 | 51.88 | |
| GE13 | Pipeline (gold) | — | — | — | 76.59 | 68.73 | 72.45 | 69.45 | 57.07 | 62.65 | 59.37 | 48.37 | 53.31 |
| | DeepEventMine (gold) | — | — | — | 74.29 | 71.25 | 72.74 | 62.50 | 54.64 | 58.31 | 64.50 | 49.25 | |
| | Pipeline | 82.36 | 80.02 | 81.17 | 75.21 | 70.36 | 72.70 | 62.51 | 53.42 | 57.61 | 48.39 | 41.90 | 44.91 |
| | DeepEventMine | 81.11 | 80.74 | 80.93 | 74.96 | 69.42 | 72.08 | 58.09 | 52.04 | 54.90 | 49.49 | 42.88 | |
| ID | Pipeline (gold) | — | — | — | 71.96 | 84.06 | 77.54 | 52.34 | 61.68 | 56.65 | 52.82 | 56.39 | 54.54 |
| | DeepEventMine (gold) | — | — | — | 74.56 | 80.24 | 77.30 | 52.17 | 58.88 | 55.32 | 59.91 | 53.94 | |
| | Pipeline | 81.97 | 86.24 | 84.05 | 71.79 | 83.36 | 77.15 | 48.07 | 55.66 | 51.59 | 47.94 | 51.72 | 49.76 |
| | DeepEventMine | 81.93 | 85.46 | 83.66 | 73.15 | 82.15 | 77.39 | 44.36 | 48.95 | 46.54 | 58.14 | 44.02 | |
| PC | Pipeline (gold) | — | — | — | 75.26 | 80.71 | 77.89 | 65.60 | 65.06 | 65.33 | 53.82 | 52.11 | 52.95 |
| | DeepEventMine (gold) | — | — | — | 76.97 | 77.74 | 77.35 | 63.54 | 62.96 | 63.25 | 65.94 | 49.52 | |
| | Pipeline | 88.26 | 89.61 | 88.93 | 75.26 | 80.71 | 77.89 | 61.73 | 61.59 | 61.66 | 48.13 | 47.67 | 47.90 |
| | DeepEventMine | 87.80 | 89.25 | 88.52 | 75.73 | 78.73 | 77.20 | 57.77 | 60.44 | 59.08 | 56.90 | 45.45 | |
| MLEE | Pipeline (gold) | — | — | — | 80.77 | 73.93 | 77.20 | 62.09 | 56.22 | 59.01 | 55.66 | 48.60 | 51.89 |
| | DeepEventMine (gold) | — | — | — | 79.73 | 76.39 | 78.03 | 56.64 | 55.92 | 56.28 | 63.52 | 48.26 | |
| | Pipeline | 82.69 | 80.78 | 81.72 | 81.49 | 78.43 | 79.93 | 56.47 | 55.32 | 55.89 | 50.14 | 45.79 | 47.86 |
| | DeepEventMine | 81.87 | 81.41 | 81.64 | 79.37 | 78.86 | 79.12 | 51.93 | 56.52 | 54.13 | 56.33 | 47.83 | |
Notes: gold: given gold entities. The highest scores are shown in bold.
Comparison of our model with the TEES-CNN on the CG development set given gold entities
| Model | Trigger P | Trigger R | Trigger F | Role P | Role R | Role F | Event P | Event R | Event F (%) |
|---|---|---|---|---|---|---|---|---|---|
| TEES-CNN | 76.81 | 80.87 | 78.78 | 65.10 | 62.83 | 63.95 | 59.24 | 51.81 | 55.27 |
| Pipeline | 78.85 | 82.89 | 80.82 | 67.15 | 66.38 | **66.76** | 59.64 | 53.66 | 56.49 |
| DeepEventMine | 79.17 | 82.93 | **81.01** | 63.18 | 66.65 | 64.87 | 65.24 | 55.93 | |
Note: The highest scores are shown in bold.
Results on different nested event levels on the CG development set
| Nested level | TEES-CNN P | TEES-CNN R | TEES-CNN F (%) | DeepEventMine (gold) P | DeepEventMine (gold) R | DeepEventMine (gold) F (%) |
|---|---|---|---|---|---|---|
| Flat + nested events | 60.42 | 53.45 | 56.72 | 66.22 | 57.19 | |
| Total flat | 64.99 | 62.99 | 63.98 | 71.19 | 66.91 | |
| Total nested | 48.82 | 35.23 | 40.93 | 53.90 | 38.62 | |
| Level 1 | 51.81 | 38.62 | 44.25 | 54.49 | 42.02 | |
| Level 2 | 26.93 | 16.78 | 21.43 | 46.67 | 19.58 | |
| Level 3 | 0.00 | 0.00 | 0.00 | 100.00 | 16.67 | |
| % nested events | 34.37% | | | | | |
Note: The highest scores are shown in bold.
Analysis of missing events on the CG development set
| Error Type | #events | % |
|---|---|---|
| Missing role | 1294 | 89.0 |
| Missing trigger | 650 | 44.7 |
| Incorrect event class | 25 | 1.7 |
| Missing entity | 497 | 34.2 |
| Missing argument | 136 | 9.4 |
| Missing event argument | 107 | 7.4 |
| Total | 1454 | 100 |
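
The percentage column above can be reproduced from the event counts. The short check below divides each count by the total of 1454 missing events; the variable names are chosen for this example. It also shows that the per-type counts add up to more than the total, which suggests (as an inference from the numbers, not a statement from the paper) that a single missing event can be counted under more than one error type.

```python
TOTAL_MISSING_EVENTS = 1454

error_counts = {
    "Missing role": 1294,
    "Missing trigger": 650,
    "Incorrect event class": 25,
    "Missing entity": 497,
    "Missing argument": 136,
    "Missing event argument": 107,
}

# Share of all missing events affected by each error type, in percent.
percentages = {
    name: round(100 * count / TOTAL_MISSING_EVENTS, 1)
    for name, count in error_counts.items()
}

# The counts sum to more than the total, so the categories overlap.
summed_counts = sum(error_counts.values())

for name, pct in percentages.items():
    print(f"{name}: {pct}%")
print("sum of counts:", summed_counts, "total events:", TOTAL_MISSING_EVENTS)
```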