| Literature DB >> 28818042 |
K Bretonnel Cohen, Arrick Lanfranchi, Miji Joo-Young Choi, Michael Bada, William A Baumgartner, Natalya Panteleyeva, Karin Verspoor, Martha Palmer, Lawrence E Hunter.
Abstract
BACKGROUND: Coreference resolution is the task of finding strings in text that have the same referent as other strings. Failures of coreference resolution are a common cause of false negatives in information extraction from the scientific literature. In order to better understand the nature of the phenomenon of coreference in biomedical publications and to increase performance on the task, we annotated the Colorado Richly Annotated Full Text (CRAFT) corpus with coreference relations.
Keywords: Anaphora; Annotation; Benchmarking; Coreference; Corpus; Resolution
Year: 2017 PMID: 28818042 PMCID: PMC5561560 DOI: 10.1186/s12859-017-1775-9
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Descriptive statistics of Yang et al.’s coreference corpus [28]
| | Total number | Percentage |
|---|---|---|
| Anaphors | | |
| Noun phrase | 3561 | 29.1% |
| Pronoun | 131 | 1% |
| Antecedents | | |
| Noun phrase | 8272 | 67.6% |
| Pronoun | 259 | 2.1% |
Descriptive statistics of Kim and Park’s coreference corpus [63]
| Anaphoric expression | Count |
|---|---|
| Pronouns | 53 |
| Noun phrase with determiner | 26 |
| Zero anaphora | 8 |
Gasperin et al.’s inter-annotator agreement scores for five papers, calculated as Kappa, before and after annotation revision
| Before revision | After revision | |
|---|---|---|
| Paper 1 | 0.75 | 0.85 |
| Paper 2 | 0.70 | 0.83 |
| Paper 3 | 0.68 | 0.93 |
| Paper 4 | 0.62 | 0.95 |
| Paper 5 | 0.41 | 0.91 |
Gasperin et al.’s inter-annotator agreement scores for five semantic classes of anaphora, calculated as Kappa
| Class/Paper | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|
| Coreferent | 0.84 | 0.84 | 0.98 | 0.97 | 0.93 |
| Biotype | 0.84 | 0.81 | 0.92 | 0.88 | 0.79 |
| Homolog | 0.77 | N/A | 1.0 | N/A | 0.53 |
| Set-member | 0.78 | 0.69 | 0.66 | 0.83 | 0.88 |
| Discourse-new | 0.89 | 1.0 | 0.56 | 1.0 | 0.98 |
Descriptive statistics of the BioNLP-ST 2011 coreference corpus [70], downsampled from [71]
| Training | Devtest | Test | |
|---|---|---|---|
| Relative | 1193 | 254 | 349 |
| Pronoun | 738 | 149 | 269 |
| Definite or demonstrative noun phrase | 296 | 58 | 91 |
| Appositive | 9 | 1 | 3 |
| Other | 11 | 1 | 2 |
| Antecedent | 2116 | 451 | 674 |
| Total | 2247 | 463 | 714 |
Descriptive statistics of the i2b2 clinical coreference corpus [75, 76]
| Statistic | Value |
|---|---|
| Markables | 7214 |
| Average markables per report | 40.08 |
| Pairs | 5992 |
| Average pairs per report | 33.29 |
| Identity chains | 1304 |
| Average identity chains per report | 7.24 |
Adapted from [76]
Descriptive statistics of coreference annotations in the CRAFT corpus
| Statistic | Value |
|---|---|
| IDENTITY chains | 23,887 |
| APPOSITIVE | 4591 |
| Pronouns | 4145 |
| Mean IDENT chains per paper | 246.3 |
| Median IDENT chains per paper | 236 |
| Mean APPOS per paper | 47.3 |
| Median APPOS per paper | 43 |
| Mean length of IDENT chains | 4 |
| Median length of IDENT chains | 2 |
| Longest IDENT chain | 186 |
| Within-sentence IDENT chains | 1495 |
| Between-sentence IDENT chains | 22,392 |
Benchmarking results: System A, Simple, and the union of the two
| System | B3 P | B3 R | B3 F | BLANC P | BLANC R | BLANC F |
|---|---|---|---|---|---|---|
| System A | 0.93 | 0.08 | 0.14 | 0.93 | 0.026 | 0.05 |
| Simple | 0.78 | 0.29 | 0.42 | 0.78 | 0.22 | 0.33 |
| Union | 0.78 | 0.35 | 0.46 | 0.78 | 0.26 | 0.37 |
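The B3 columns in the benchmarking table score each mention by comparing the chain it occupies in the system response against the chain it occupies in the gold standard. As a minimal sketch of that definition (the mention strings and chains below are hypothetical toy data, not the paper's, and twinless response mentions are handled naively as singletons):

```python
def b3(key_chains, response_chains):
    """Mention-level B3 precision, recall, and F over coreference chains.

    For each mention m, precision = |R(m) & K(m)| / |R(m)| and
    recall = |R(m) & K(m)| / |K(m)|, where K(m)/R(m) are the key/response
    chains containing m; scores are averaged over the key's mentions.
    """
    def chain_of(mention, chains):
        for c in chains:
            if mention in c:
                return c
        return {mention}  # treat an unclustered mention as a singleton chain

    mentions = set().union(*key_chains)
    p_sum = r_sum = 0.0
    for m in mentions:
        k = chain_of(m, key_chains)
        r = chain_of(m, response_chains)
        overlap = len(k & r)
        p_sum += overlap / len(r)
        r_sum += overlap / len(k)
    p = p_sum / len(mentions)
    r = r_sum / len(mentions)
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f

# Toy gold chains and a system response that drops one mention from a chain:
key = [{"mouse", "it", "the animal"}, {"Gene X", "the gene"}]
resp = [{"mouse", "it"}, {"Gene X", "the gene"}]
p, r, f = b3(key, resp)
```

Dropping a correct mention from a chain leaves precision intact but lowers recall, which mirrors the high-precision/low-recall pattern of System A in the table above.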
Inter-annotator agreement
| Metric | Average |
|---|---|
| MUC | 0.684 |
| Class-B3 | 0.858 |
| Entity-B3 | 0.750 |
| Mention-based CEAF | 0.644 |
| Entity-based CEAF | 0.480 |
| Krippendorff’s alpha | 0.619 |
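The MUC figure in the agreement table is a link-based score: recall counts, per gold chain, how many coreference links survive when the chain is partitioned by the other annotator's chains, and precision is the symmetric computation. A minimal sketch under the standard Vilain et al. definition, with hypothetical toy chains:

```python
def muc(key_chains, response_chains):
    """Link-based MUC precision, recall, and F (Vilain et al. style)."""
    def score(a_chains, b_chains):
        num = den = 0
        for c in a_chains:
            # Partition chain c by the b-side chains; mentions absent from
            # every b-side chain each form their own singleton part.
            parts = set()
            for m in c:
                owner = next(
                    (i for i, b in enumerate(b_chains) if m in b), None
                )
                parts.add(("chain", owner) if owner is not None else ("single", m))
            num += len(c) - len(parts)   # links preserved
            den += len(c) - 1            # links in the chain
        return num / den if den else 0.0

    r = score(key_chains, response_chains)
    p = score(response_chains, key_chains)
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f

# Toy example: one gold chain of three mentions; the response keeps one link.
key = [{"mouse", "it", "the animal"}]
resp = [{"mouse", "it"}]
p, r, f = muc(key, resp)
```

Because MUC only counts links, it cannot reward correctly identified singletons, which is one reason the table also reports B3, CEAF, and Krippendorff's alpha.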