Andrew Mackinlay, David Martinez, Timothy Baldwin.
Abstract
BACKGROUND: This work describes a system for identifying event mentions in bio-molecular research abstracts that are either speculative (e.g. analysis of IkappaBalpha phosphorylation, where it is not specified whether phosphorylation did or did not occur) or negated (e.g. inhibition of IkappaBalpha phosphorylation, where phosphorylation did not occur). The data comes from a standard dataset created for the BioNLP 2009 Shared Task. The system uses a machine-learning approach, where the features used for classification are a combination of shallow features derived from the words of the sentences and more complex features based on the semantic outputs produced by a deep parser.
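The combination of shallow and deep features described in the abstract can be pictured with a minimal sketch. It assumes a plain bag-of-words over sentence tokens as the shallow features and takes the parser-derived features as an already computed dictionary; the function names, the `word=` and `deep:` prefixes, and the example deep feature are hypothetical and are not taken from the authors' system.

```python
from collections import Counter
from typing import Dict, Iterable


def shallow_features(tokens: Iterable[str]) -> Dict[str, float]:
    """Bag-of-words counts over lowercased tokens (assumed shallow feature set)."""
    counts = Counter(token.lower() for token in tokens)
    return {f"word={word}": float(count) for word, count in counts.items()}


def merge_feature_groups(shallow: Dict[str, float],
                         deep: Dict[str, float]) -> Dict[str, float]:
    """Put both feature groups into one vector; prefixes keep the names distinct."""
    merged = dict(shallow)
    merged.update(deep)
    return merged


tokens = "inhibition of IkappaBalpha phosphorylation".split()
# Hypothetical parser-derived feature, for illustration only.
deep = {"deep:trigger_under_negating_predicate": 1.0}
features = merge_feature_groups(shallow_features(tokens), deep)
```

The resulting dictionary could then be passed to any classifier that accepts sparse named features.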
Year: 2012 PMID: 22595089 PMCID: PMC3339397 DOI: 10.1186/1472-6947-12-S1-S4
Source DB: PubMed Journal: BMC Med Inform Decis Mak ISSN: 1472-6947 Impact factor: 2.796
Figure 1. A sample RMRS. RMRS representation of the sentence "Thus NF-kappa B activation is not required for neuroblastoma cell differentiation", showing, in order, elementary predicates (each consisting of a label, predicate name, character span and arguments), qeq-constraints, and in-g constraints. The unlabelled first argument of each predicate is the mandatory ARG0 argument, which is closely linked to the predicate. The 'udef_q_rel' predicates are default quantifiers introduced to keep the RMRS well-formed; they have no directly corresponding words in the sentence.
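The components listed in the Figure 1 caption can be sketched as a small data structure. This is a simplified illustration, not the representation used by the authors or by any parser: the class names, the handle and variable strings, and the `_require_v_rel` predicate name are hypothetical, and both constraint types are reduced to pairs of handle strings.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class ElementaryPredicate:
    label: str                 # handle identifying this predicate
    name: str                  # predicate name, e.g. "udef_q_rel"
    span: Tuple[int, int]      # character span [start, end) in the sentence
    arg0: str                  # mandatory ARG0 variable
    args: Dict[str, str] = field(default_factory=dict)  # any further arguments


@dataclass
class Rmrs:
    predicates: List[ElementaryPredicate] = field(default_factory=list)
    qeq: List[Tuple[str, str]] = field(default_factory=list)   # simplified qeq pairs
    in_g: List[Tuple[str, str]] = field(default_factory=list)  # simplified in-g pairs

    def predicates_at(self, start: int, end: int) -> List[ElementaryPredicate]:
        """Return predicates whose character span covers [start, end)."""
        return [ep for ep in self.predicates
                if ep.span[0] <= start and end <= ep.span[1]]


sentence = ("Thus NF-kappa B activation is not required "
            "for neuroblastoma cell differentiation")
start = sentence.index("required")
end = start + len("required")

# Illustrative values only; not actual parser output.
rmrs = Rmrs(predicates=[
    ElementaryPredicate(label="h1", name="_require_v_rel",
                        span=(start, end), arg0="e1"),
])
covering = rmrs.predicates_at(start, end)
```

Keeping the character span on each predicate is what would let features be tied back to a specific word in the sentence, such as an event trigger.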
Results over development set
| Mod | RMRS from | Extra | Gold R | Gold P | Gold F | UTurku R | UTurku P | UTurku F |
|---|---|---|---|---|---|---|---|---|
| - | 42.9 | 55.4 | 48.3 | 19.0 | 33.3 | 24.2 | ||
| - | 16.7 | 66.7 | 26.7 | 5.5 | 26.1 | 9.0 | ||
| - | 20.2 | 68.0 | 31.2 | 10.7 | 56.2 | 18.0 | ||
| fb( | - | 25.0 | 61.8 | 35.6 | 13.1 | 50.0 | 20.8 | |
| cb( | - | 15.5 | 59.1 | 24.5 | 10.7 | 52.9 | 17.8 | |
| 45.2 | 59.4 | 51.3 | 20.2 | 34.0 | ||||
| fb( | 45.2 | 60.3 | 16.7 | 31.1 | 21.7 | |||
| cb( | 40.5 | 54.8 | 46.6 | 19.0 | 32.0 | 23.9 | ||
| - | 32.7 | 32.7 | 32.7 | 22.6 | 17.8 | 19.9 | ||
| - | 51.8 | 54.8 | 53.3 | 25.4 | 33.7 | |||
| - | 12.7 | 35.9 | 18.8 | 5.4 | 26.1 | 9.0 | ||
| - | 26.4 | 72.5 | 38.7 | 15.4 | 34.0 | 21.2 | ||
| fb( | - | 35.4 | 66.1 | 46.1 | 17.3 | 34.6 | 23.0 | |
| cb( | - | 29.1 | 64.0 | 40.0 | 13.6 | 34.1 | 19.5 | |
| 45.4 | 48.5 | 47.0 | 18.2 | 26.3 | 21.5 | |||
| fb( | 44.6 | 66.2 | 53.3 | 19.1 | 33.3 | 24.3 | ||
| cb( | 50.9 | 59.0 | 21.8 | 32.0 | 26.0 | |||
Results over the development data using gold-standard Task 1 annotations and the UTurku Task 1 system ("fb" = fallback strategy, where we use the first source if possible, otherwise the second; "cb" = use undifferentiated RMRSs from each source to create feature vectors).
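The "fb" and "cb" strategies named in the caption can be sketched as follows. This is one reading under stated assumptions: a source that produced no usable RMRS is represented as `None` or an empty dictionary, features are name-to-value dictionaries, and "undifferentiated" is taken to mean that both sources share the same feature names with their values summed. The function names are hypothetical.

```python
from typing import Dict, Optional

Features = Dict[str, float]


def fallback(first: Optional[Features], second: Optional[Features]) -> Features:
    """'fb': use the first source if it produced anything, otherwise the second."""
    if first:
        return dict(first)
    return dict(second or {})


def combine(first: Optional[Features], second: Optional[Features]) -> Features:
    """'cb': pool features from both sources without marking where they came from."""
    merged: Features = {}
    for source in (first, second):
        for name, value in (source or {}).items():
            # Summing shared features is an assumption of this sketch.
            merged[name] = merged.get(name, 0.0) + value
    return merged
```

Under this reading, `fallback` never mixes sources within a single instance, whereas `combine` always uses whatever both sources provide.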
Results over test set
| Mod | RMRS from | Extra | R | P | F |
|---|---|---|---|---|---|
| - | 6.97 | 23.73 | 10.77 | ||
| - | 8.96 | 41.86 | 14.75 | ||
| fb( | - | 12.44 | 52.08 | ||
| cb( | - | 6.47 | 41.94 | 11.21 | |
| 11.44 | 26.14 | 15.92 | |||
| fb( | 9.95 | 28.17 | 14.71 † | ||
| cb( | 7.46 | 24.19 | 11.41 | ||
| - | 19.55 | 30.94 | 23.96 | ||
| - | 17.83 | 12.73 | 14.85 | ||
| - | 11.82 | 35.62 | 17.75 | ||
| fb( | - | 13.64 | 34.09 | 19.48 | |
| cb( | - | 12.73 | 33.33 | 18.42 | |
| 19.55 | 32.33 | 24.36 | |||
| fb( | 19.55 | 32.58 | 24.43 | ||
| cb( | 20.91 | 41.07 | |||
Results over the test data using the UTurku Task 1 system ("fb" = fallback strategy, where we use the first source if possible, otherwise the second; "cb" = use undifferentiated RMRSs from each source to create feature vectors). † denotes the feature set which performed best over the development set using gold Task 1 annotations.
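Assuming the F column is the balanced F-score (the harmonic mean of precision and recall), a row of the table can be sanity-checked with a few lines. The function name is hypothetical, and because the published R and P values are themselves rounded, a recomputed F may differ from the printed one in the last digit.

```python
def f_score(recall: float, precision: float) -> float:
    """Balanced F-score: harmonic mean of precision and recall."""
    if recall + precision == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Row with R = 19.55 and P = 30.94 from the test-set table.
print(round(f_score(19.55, 30.94), 2))  # 23.96
```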
Figure 2. Task 3 against Task 1 for . Task 3 F-score against Task 1 F-score for , over the different combinations of Task 1 and Task 3 systems on the development set.
Figure 3. Task 3 against Task 1 for . Task 3 F-score against Task 1 F-score for , over the different combinations of Task 1 and Task 3 systems on the development set.