Kai Hakala, Sofie Van Landeghem, Tapio Salakoski, Yves Van de Peer, Filip Ginter.
Abstract
BACKGROUND: Modern methods for mining biomolecular interactions from literature typically make predictions based solely on the immediate textual context, in effect a single sentence. No prior work has been published on extending this context to information automatically gathered from the whole biomedical literature. Our motivation for this study is therefore to explore whether mutually supporting evidence, aggregated across several documents, can be used to improve the performance of state-of-the-art event extraction systems.
Year: 2015 PMID: 26551766 PMCID: PMC4642107 DOI: 10.1186/1471-2105-16-S16-S3
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Original rule-based conversion of EVEX event types to the GRN interaction types.
| EVEX type | GRN type |
|---|---|
| Binding | Binding |
| Regulation* of Transcription | Transcription |
| Regulation* of Gene expression | Transcription |
| Positive regulation of Any* | Activation |
| Negative regulation of Any* | Inhibition |
| Regulation of Any* | Regulation |
The table is traversed from top to bottom, and the first rule that matches is applied. Regulation* refers to any type of regulatory event, and Any* refers to any other non-regulatory event type. Because the EVEX Binding type, unlike the corresponding GRN Binding, is symmetrical, the EVEX data is split into two candidates per instance. Because some GRN types do not have an equivalent type in the EVEX resource, such as 'Requirement' and 'Promoter', these types could not be addressed by our work.
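The first-match traversal described above can be sketched as an ordered rule list. This is a minimal illustration, not the authors' implementation: the string labels, the `REGULATORY` set, and the function names `evex_to_grn` and `binding_candidates` are assumptions, and reading "two candidates per instance" as the two argument orders of a Binding is an interpretation of the caption.

```python
from typing import Callable, List, Optional, Tuple

# Assumed labels for the regulatory EVEX event types (Regulation* in the table).
REGULATORY = {"Regulation", "Positive regulation", "Negative regulation"}


def is_any(theme: Optional[str]) -> bool:
    """Any*: the regulated event exists and is itself non-regulatory."""
    return theme is not None and theme not in REGULATORY


# (predicate, GRN type) pairs in table order; the first matching rule is applied.
Rule = Tuple[Callable[[str, Optional[str]], bool], str]
RULES: List[Rule] = [
    (lambda ev, th: ev == "Binding", "Binding"),
    (lambda ev, th: ev in REGULATORY and th == "Transcription", "Transcription"),
    (lambda ev, th: ev in REGULATORY and th == "Gene expression", "Transcription"),
    (lambda ev, th: ev == "Positive regulation" and is_any(th), "Activation"),
    (lambda ev, th: ev == "Negative regulation" and is_any(th), "Inhibition"),
    (lambda ev, th: ev == "Regulation" and is_any(th), "Regulation"),
]


def evex_to_grn(event_type: str, theme_type: Optional[str] = None) -> Optional[str]:
    """Return the GRN type of an EVEX event, or None when no rule matches.

    event_type is the type of the top-level event; theme_type is the type of
    the event it regulates (None if the event has no nested event).
    """
    for matches, grn_type in RULES:
        if matches(event_type, theme_type):
            return grn_type
    return None


def binding_candidates(a: str, b: str) -> List[Tuple[str, str]]:
    """Assumed split of a symmetrical EVEX Binding into two directed candidates."""
    return [(a, b), (b, a)]
```

For example, `evex_to_grn("Positive regulation", "Transcription")` returns `"Transcription"` rather than `"Activation"`, because the Transcription rule sits higher in the table.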
Original rule-based entity-type filtering of event predictions.
| GRN event type | Possible target types | Possible agent types |
|---|---|---|
| Interaction.Binding | Protein | Gene |
| Interaction.Transcription | Protein, PolymeraseComplex | Gene, Operon |
| Interaction.Regulation Interaction.Activation Interaction.Inhibition | Protein, PolymeraseComplex | Gene, Operon, Protein, ProteinComplex |
Only events for which both arguments (target and agent) have a permitted entity type are retained in the result set.
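The filter can be sketched as a lookup against the table. This is only an illustration: the `ALLOWED` mapping copies the target and agent columns exactly as printed in the table, while the function name `keep_event` and the choice to drop event types that the table does not list are assumptions.

```python
from typing import Dict, Set, Tuple

# event type -> (possible target types, possible agent types), as in the table.
ALLOWED: Dict[str, Tuple[Set[str], Set[str]]] = {
    "Interaction.Binding": ({"Protein"}, {"Gene"}),
    "Interaction.Transcription": (
        {"Protein", "PolymeraseComplex"},
        {"Gene", "Operon"},
    ),
}

# The last table row covers three event types with identical constraints.
for _event_type in (
    "Interaction.Regulation",
    "Interaction.Activation",
    "Interaction.Inhibition",
):
    ALLOWED[_event_type] = (
        {"Protein", "PolymeraseComplex"},
        {"Gene", "Operon", "Protein", "ProteinComplex"},
    )


def keep_event(event_type: str, target_type: str, agent_type: str) -> bool:
    """Keep an event only if both argument entity types are permitted."""
    allowed = ALLOWED.get(event_type)
    if allowed is None:
        # Assumption: event types absent from the table are discarded.
        return False
    target_types, agent_types = allowed
    return target_type in target_types and agent_type in agent_types
```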
Performance measurement of different system settings on the GRN training data.
| Method | Dataset | SER | F | Rel. P | Rel. R | Rel. F | Rel. SER |
|---|---|---|---|---|---|---|---|
| TC | All EVEX data | 1.56 | 8.86 | 39.29% | 40.59% | 1.23 | |
| TC, ETF | All EVEX data | 1.15 | 11.53 | 59.74% | 35.11% | 0.89 | |
| TC, ETF | 0.954 | 71.43% | 22.90% | 34.68% | |||
| TC, ETF | GRN PMIDs | 17.39 | 18.32% | 29.81% |
The SER score is the main evaluation criterion of the GRN challenge. The relaxed precision, recall, F and SER scores are produced by scoring the predictions regardless of the specific event types. TC refers to rule-based EVEX-GRN type conversion, ETF to rule-based entity type filtering. The ML-based methods are not shown because they are trained on this dataset.
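The caption does not define SER. The sketch below assumes the usual slot error rate, (substitutions + deletions + insertions) divided by the number of reference interactions, and models the relaxed variant by ignoring the interaction type when matching. The data layout (one type per agent-target pair) and the function names are hypothetical, and the official GRN scorer may differ in detail.

```python
from typing import Dict, Tuple

Pair = Tuple[str, str]  # (agent, target)


def ser_counts(
    reference: Dict[Pair, str], predicted: Dict[Pair, str], relaxed: bool = False
) -> Tuple[int, int, int, int]:
    """Return (matches, substitutions, deletions, insertions)."""
    matches = substitutions = 0
    for pair, ref_type in reference.items():
        if pair in predicted:
            # Relaxed scoring accepts any predicted type for a reference pair.
            if relaxed or predicted[pair] == ref_type:
                matches += 1
            else:
                substitutions += 1
    deletions = len(reference) - matches - substitutions
    insertions = sum(1 for pair in predicted if pair not in reference)
    return matches, substitutions, deletions, insertions


def slot_error_rate(
    reference: Dict[Pair, str], predicted: Dict[Pair, str], relaxed: bool = False
) -> float:
    """Assumed definition: (S + D + I) / N, with N reference interactions."""
    if not reference:
        raise ValueError("reference must contain at least one interaction")
    _, subs, dels, ins = ser_counts(reference, predicted, relaxed)
    return (subs + dels + ins) / len(reference)


# Toy data with placeholder entity names.
reference = {("A", "B"): "Transcription", ("C", "D"): "Activation", ("E", "F"): "Inhibition"}
predicted = {("A", "B"): "Transcription", ("C", "D"): "Inhibition", ("X", "Y"): "Binding"}
strict = slot_error_rate(reference, predicted)
relaxed = slot_error_rate(reference, predicted, relaxed=True)
```

Under this definition lower is better, and a system that produces many spurious predictions can score above 1, which is consistent with values such as 1.56 in the table.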
Performance measurement of different system settings on the GRN development set.
| Method | Dataset | SER | F | Rel. P | Rel. R | Rel. F | Rel. SER |
|---|---|---|---|---|---|---|---|
| TC, ETF | GRN PMIDs | 1.00 | 11.90% | 70.59% | 0.896 | ||
| Hybrid | GRN PMIDs | 5.19% | 13.43% | 23.38% | |||
| ML-full | GRN PMIDs | 1.12 | 2.27% | 57.14% | 27.27% | 0.955 |
For the original rule-based system, only the best setting is included here (cf. Table 3). 'Hybrid' refers to the hybrid ML system with rule-based ETF, and ML-full refers to the classifier with the full feature set.
Official GRN performance rates of all participants (first 5 rows), as well as the test result of the Hybrid ML-based system.
| | SER | Relaxed SER |
|---|---|---|
| University of Ljubljana | 0.73 | 0.64 |
| K.U.Leuven | 0.83 | 0.66 |
| TEES-2.1 | 0.86 | 0.76 |
| IRISA-TexMex | 0.91 | 0.60 |
| EVEX (orig) | 0.92 | 0.81 |
| EVEX (hybrid) | 0.94 | n/a |
The evaluation service made available for obtaining additional test results did not report relaxed SER scores.
Figure 1. Incompatible entity span annotation. Example of entity annotations in EVEX that are incompatible with the GRN challenge. In this case, the sigX-ypuN operon was not annotated as such by BANNER; instead, two separate gene/gene products (GGPs) were defined.
Figure 2. Missing event structure. Example of a missing event structure, resulting in a false negative instance for the GRN challenge. In this case, the long sentence and the intervening subclauses prevented the event extractor from recognising the requirement relation with all its arguments.
Figure 3. Wrongly predicted event type. Example of a wrongly predicted GRN interaction type. In this case, the crucial word 'negative' is not taken into consideration, resulting in a positive regulation of transcription rather than a negative one, which would have led to the correct GRN type 'inhibition'.
Official test set results of the five best performing GE participants, in percentages.
| | P | R | F |
|---|---|---|---|
| EVEX | 58.03 | 45.44 | 50.97 |
| TEES-2.1 | 56.32 | 46.17 | 50.74 |
| BioSEM | 62.83 | 42.47 | 50.68 |
| NCBI | 61.72 | 40.53 | 48.93 |
| DlutNLP | 57.00 | 40.81 | 47.56 |
All in all, 10 teams participated in this task.
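The F column in the table above is consistent with the balanced harmonic mean of precision and recall. A small check, assuming F here is the standard F1 score (the function name `f_score` is illustrative):

```python
def f_score(precision: float, recall: float) -> float:
    """Balanced F-score (harmonic mean); inputs and output in percent."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall taken from the EVEX and TEES-2.1 rows of the table.
evex_f = round(f_score(58.03, 45.44), 2)
tees_f = round(f_score(56.32, 46.17), 2)
```

Both values reproduce the reported F-scores (50.97 and 50.74), so the gap between the two systems comes from EVEX trading a little recall for higher precision.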
Performance of our system for different event types compared against the TEES system, in percentage points, according to the official GE test results.
| | # | P | R | F |
|---|---|---|---|---|
| Simple events | 833 | -0.08 | -0.36 | -0.23 |
| Protein mod. | 191 | +0.09 | -2.09 | -1.12 |
| Binding | 333 | +0.43 | -1.20 | -0.44 |
| Regulation | 1944 | +2.38 | -0.67 | +0.36 |
| All | 3301 | +1.71 | -0.73 | +0.23 |
Performance of our current system in the GE task in contrast to TEES, the best case (B-C) and worst case (W-C) oracles with re-ranked output, as well as the worst case oracle with randomized rankings (averaged over ten runs).
| All events | P | R | F |
|---|---|---|---|
| B-C oracle (re-ranked) | 81.32 | 39.61 | 53.27 |
| W-C oracle (re-ranked) | 54.92 | 39.61 | 46.02 |
| W-C oracle (random) | 51.06 | 39.19 | 44.34 |
| Current system | 47.15 | 39.61 | 43.05 |
| TEES | 45.46 | 40.39 | 42.77 |
| B-C oracle (re-ranked) | 81.37 | 50.58 | 62.38 |
| W-C oracle (re-ranked) | 56.09 | 50.58 | 53.19 |
| W-C oracle (random) | 52.73 | 50.00 | 51.33 |
| Current system | 48.66 | 50.44 | 49.53 |
| TEES | 47.16 | 51.09 | 49.04 |
| B-C oracle (re-ranked) | 81.02 | 16.83 | 27.87 |
| W-C oracle (re-ranked) | 48.61 | 16.83 | 25.00 |
| W-C oracle (random) | 42.66 | 16.75 | 24.05 |
| Current system | 39.64 | 17.12 | 23.91 |
| TEES | 37.57 | 18.17 | 24.50 |
Performance of the system on the GE development set when applied to single-argument events only (1-arg), to multiple-argument events only (N-arg), and to all events (Full).
| | TEES | 1-arg | N-arg | Full |
|---|---|---|---|---|
| Simple events | 64.43 | +0.07 | +0.07 | |
| Protein mod. | 40.47 | +0.06 | +0.06 | |
| Binding | 82.03 | |||
| Regulation | 30.34 | +0.70 | -0.14 | +0.53 |
| All events | 45.04 | +0.66 | +0.64 | |
Performance comparison of the vector representations of words on the GE development and test sets.
| | Dev. P | Dev. R | Dev. F | Test P | Test R | Test F |
|---|---|---|---|---|---|---|
| TEES | 52.49 | 45.07 | 48.50 | 56.32 | 46.17 | 50.74 |
| Vectors #1 | 52.96 | 45.26 | 48.81 | 56.91 | 46.05 | 50.90 |
| Vectors #2 | 53.16 | 45.45 | 49.00 | 55.97 | 45.71 | 50.33 |
| Clusters #1 | 53.09 | 44.85 | 48.62 | 56.41 | 45.71 | 50.50 |
| Clusters #2 | 52.53 | 45.29 | 48.64 | 54.73 | 45.71 | 49.82 |
Vectors #1 uses vector representations in trigger detection, and Vectors #2 uses them in the edge detection and unmerging steps. Clusters #1 uses word clusters in trigger detection, and Clusters #2 uses them in the edge detection and unmerging steps.