Halil Kilicoglu, Dina Demner-Fushman.
Abstract
Coreference resolution is one of the fundamental and challenging tasks in natural language processing. Resolving coreference successfully can have a significant positive effect on downstream natural language processing tasks, such as information extraction and question answering. The importance of coreference resolution for biomedical text analysis applications has increasingly been acknowledged. One of the difficulties in coreference resolution stems from the fact that distinct types of coreference (e.g., anaphora, appositive) are expressed with a variety of lexical and syntactic means (e.g., personal pronouns, definite noun phrases), and that resolution of each combination often requires a different approach. In the biomedical domain, it is common for coreference annotation and resolution efforts to focus on specific subcategories of coreference deemed important for the downstream task. In the current work, we aim to address some of these concerns regarding coreference resolution in biomedical text. We propose a general, modular framework underpinned by a smorgasbord architecture (Bio-SCoRes), which incorporates a variety of coreference types and their mentions, and allows fine-grained specification of resolution strategies for distinct coreference type-mention pairs. For development and evaluation, we used a corpus of structured drug labels annotated with fine-grained coreference information. In addition, we evaluated our approach on two other corpora (i2b2/VA discharge summaries and a protein coreference dataset) to investigate its generality and ease of adaptation to other biomedical text types. Our results demonstrate the usefulness of our novel smorgasbord architecture. The specific pipelines based on the architecture perform successfully in linking coreferential mention pairs, while we find that recognition of full mention clusters is more challenging.
The corpus of structured drug labels (SPL) as well as the components of Bio-SCoRes and some of the pipelines based on it are publicly available at https://github.com/kilicogluh/Bio-SCoRes. We believe that Bio-SCoRes can serve as a strong and extensible baseline system for coreference resolution of biomedical text.
Year: 2016 PMID: 26934708 PMCID: PMC4774913 DOI: 10.1371/journal.pone.0148538
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Number of annotations before and after pre-annotation.
| Annotation Type | Original | With pre-annotations |
|---|---|---|
| | 4621 | 13144 |
| | 2808 | 5635 |
| | 153 | 589 |
| | 262 | 198 |
| | 431 | 352 |
Coarse-grained annotation counts.
| Annotation Type | Number of annotations |
|---|---|
| | 13124 |
| | 5628 |
| | 713 |
| | 1976 |
| | 3006 |
Fine-grained coreferential mention counts.
| Type | Number of annotations | % |
|---|---|---|
| | 196 | 9.9 |
| | 230 | 11.6 |
| | 8 | 0.4 |
| | 32 | 1.6 |
| | 3 | 0.2 |
| | 14 | 0.7 |
| | 6 | 0.3 |
| | 587 | 29.7 |
| | 367 | 18.6 |
| | 328 | 16.6 |
| | 61 | 3.1 |
| | 144 | 7.3 |
After fine-grained annotation.
| Coreference Type | Total | % | By Mention Type |
|---|---|---|---|
| | 2021 | 67.2 | definite NP (595), demonstrative NP (571), possessive pronoun (239), personal pronoun (205), distributive NP (131), zero article NP (116), indefinite NP (63), relative pronoun (49), distributive pronoun (28), reciprocal pronoun (12), demonstrative pronoun (9), indefinite pronoun (3) |
| | 488 | 16.2 | definite NP (419), demonstrative NP (34), indefinite NP (20), possessive pronoun (12), distributive NP (2), personal pronoun (1) |
| | 312 | 10.4 | indefinite NP (146), zero article NP (106), definite NP (60) |
| | 185 | 6.2 | indefinite NP (147), zero article NP (30), definite NP (8) |
Inter-annotator agreement.
| Batch | Mention (Exact) | Mention (Approximate) | Coreference (Exact) | Coreference (Approximate) |
|---|---|---|---|---|
| 1 | 0.6078 | 0.6976 | 0.4888 | 0.6485 |
| 2 | 0.7781 | 0.8138 | 0.6382 | 0.7083 |
| 3 | 0.7514 | 0.8171 | 0.5970 | 0.7164 |
| 4 | 0.8218 | 0.8764 | 0.7399 | 0.8074 |
| 5 | 0.8315 | 0.8853 | 0.7255 | 0.8309 |
| 6 | 0.9485 | 0.9708 | 0.8651 | 0.8921 |
Fig 1. The high-level view of the Bio-SCoRes framework.
Resolution strategy for personal pronominal anaphora.
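| Parameter | Value |
|---|---|
| Coreference Type | Anaphora |
| Mention Type | PersonalPronoun |
| Mention Filters | ThirdPerson, PleonasticIt |
| Referent Filters | PriorDiscourse, WindowSize(2), SyntacticConfig, Default, Exemplification |
| Agreement Methods | Person(1,0) + Gender(1,0) + Animacy(1,0) + Number(1,0) |
| Post-Scoring Filters | Threshold(4), TopScore, Salience(Parse) |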
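To make the strategy above concrete, the following is a minimal Python sketch of how agreement scoring and post-scoring filters could combine. It is not the Bio-SCoRes implementation. It assumes that a method written as `Person(1,0)` adds 1 when the mention and candidate agree and subtracts 0 when they disagree, that `Threshold(4)` keeps candidates scoring at least 4, and that salience only breaks ties after `TopScore`. The `Expression` class, its feature values, and the numeric `salience` field are hypothetical, and the mention and referent filters are omitted.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Expression:
    """A mention or candidate referent with toy features (hypothetical schema)."""
    text: str
    person: str
    gender: str
    animacy: str
    number: str
    salience: float = 0.0  # stand-in for a salience measure; an assumption


# (feature, score on agreement, penalty on disagreement).
# Assumed reading of Person(1,0) + Gender(1,0) + Animacy(1,0) + Number(1,0).
AGREEMENT_METHODS = (
    ("person", 1, 0),
    ("gender", 1, 0),
    ("animacy", 1, 0),
    ("number", 1, 0),
)
THRESHOLD = 4  # assumed reading of Threshold(4): keep scores >= 4


def agreement_score(mention: Expression, candidate: Expression) -> int:
    """Sum the per-feature scores, subtracting the penalty on disagreement."""
    total = 0
    for feature, reward, penalty in AGREEMENT_METHODS:
        if getattr(mention, feature) == getattr(candidate, feature):
            total += reward
        else:
            total -= penalty
    return total


def resolve(mention: Expression, candidates: Sequence[Expression]) -> Optional[Expression]:
    """Apply threshold, top-score, and salience filters in that order."""
    scored = [(agreement_score(mention, c), c) for c in candidates]
    kept = [(s, c) for s, c in scored if s >= THRESHOLD]
    if not kept:
        return None  # no candidate passes the threshold
    top = max(s for s, _ in kept)
    best = [c for s, c in kept if s == top]
    return max(best, key=lambda c: c.salience)  # tie-break on salience


it = Expression("it", "third", "neuter", "inanimate", "singular")
candidates = [
    Expression("the tablet", "third", "neuter", "inanimate", "singular", salience=0.4),
    Expression("the drug", "third", "neuter", "inanimate", "singular", salience=0.9),
    Expression("patients", "third", "unknown", "animate", "plural", salience=1.0),
]
best = resolve(it, candidates)
print(best.text if best else None)  # the drug
```

Under these assumptions, four methods worth one point each with a threshold of 4 mean a candidate must agree on every feature, so "patients" is discarded despite its higher salience.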
Configuration for structured drug label coreference resolution.
| Mention Type | Mention Filters | Referent Filters | Agreement Methods | Post-Scoring Filters |
|---|---|---|---|---|
| PersonalPronoun | ThirdPerson, PleonasticIt | PriorDiscourse, WindowSize(2), SyntacticConfig, Default, Exemplification | Person(1,0), Gender(1,0), Animacy(1,0), Number(1,0) | Threshold(4), TopScore, Salience(Parse) |
| PossessivePronoun | ThirdPerson | PriorDiscourse, WindowSize(2), Default, Exemplification | Person(1,0), Gender(1,0), Animacy(1,0), Number(1,0) | Threshold(4), TopScore, Salience(Parse) |
| DistributivePronoun, ReciprocalPronoun | None | PriorDiscourse, WindowSize(2), SyntacticConfig, Default, Exemplification | Person(1,0), Gender(1,0), Animacy(1,0), Number(1,0) | Threshold(4), TopScore, Salience(Default) |
| DefiniteNP, DemonstrativeNP, DistributiveNP | Anaphoricity | PriorDiscourse, SyntacticConfig, Default, Exemplification | Number(1,1), HypernymList(3,0) | Threshold(4), TopScore, Salience(Default) |
| PersonalPronoun | ThirdPerson, PleonasticIt | SubsequentDiscourse, WindowSize(Sentence), SyntacticConfig, Default | Person(1,0), Gender(1,0), Animacy(1,0), Number(1,0) | Threshold(4), TopScore, Salience(Default) |
| PossessivePronoun | ThirdPerson | SubsequentDiscourse, WindowSize(Sentence), Default | Person(1,0), Gender(1,0), Animacy(1,0), Number(1,0), DiscourseConnective(1,2) | Threshold(4), TopScore, Salience(Default) |
| DefiniteNP | Cataphoricity | SubsequentDiscourse, SyntacticConfig, WindowSize(2), Default | Number(1,1), HypernymList(3,0) | Threshold(4), TopScore, Salience(Default) |
| DefiniteNP, IndefiniteNP, ZeroArticleNP | None | WindowSize(Sentence), Default | Number(1,1), SyntacticAppositive(3,2), HypernymList(1,1) | Threshold(4), TopScore, Salience(Default) |
| IndefiniteNP, ZeroArticleNP | None | PriorDiscourse, WindowSize(Sentence), Default | Number(1,1), SyntacticPredicateNominative(3,2), HypernymList(1,1) | Threshold(4), TopScore, Salience(Default) |
Evaluation results for mention detection on the test portion of SPL coreference dataset.
| | Precision | Recall | F1 |
|---|---|---|---|
| PersonalPronoun | 96.8 | 100.0 | 98.4 |
| PossessivePronoun | 95.7 | 63.8 | 76.5 |
| DistributivePronoun | 33.3 | 100.0 | 50.0 |
| DefiniteNP | 85.0 | 94.6 | 89.6 |
| DemonstrativeNP | 93.8 | 96.0 | 94.9 |
| DistributiveNP | 81.3 | 59.1 | 68.4 |
| IndefiniteNP | 79.1 | 72.0 | 75.4 |
| ZeroArticleNP | 41.2 | 59.2 | 48.6 |
| Overall |
Evaluation results for coreference resolution on the test portion of SPL coreference dataset.
| | Precision | Recall | F1 |
|---|---|---|---|
| Baseline | 6.0 | 35.6 | 10.3 |
| Bio-SCoRes | |||
| - Anaphora | 64.8 | 45.3 | 53.3 |
| — PersonalPronoun (60) | 82.9 | 48.3 | 61.1 |
| — PossessivePronoun (74) | 76.1 | 47.3 | 58.3 |
| — DistributivePronoun (12) | 45.4 | 83.3 | 58.8 |
| — DefiniteNP (219) | 55.1 | 44.3 | 49.1 |
| — DemonstrativeNP (190) | 74.4 | 62.6 | 68.0 |
| — DistributiveNP (40) | 41.7 | 25.0 | 31.3 |
| - Cataphora | 61.1 | 37.5 | 46.5 |
| — PersonalPronoun (1) | 100.0 | 100.0 | 100.0 |
| — PossessivePronoun (6) | 100.0 | 100.0 | 100.0 |
| — DefiniteNP (135) | 58.4 | 43.7 | 50.0 |
| - Appositive | 61.1 | 50.5 | 55.3 |
| — DefiniteNP (24) | 86.7 | 54.2 | 66.7 |
| — IndefiniteNP (40) | 91.7 | 55.0 | 68.8 |
| — ZeroArticleNP (45) | 39.2 | 44.4 | 41.7 |
| - PredicateNominative | 93.0 | 55.6 | 69.6 |
| — IndefiniteNP (47) | 96.8 | 63.9 | 76.9 |
| — ZeroArticleNP (24) | 83.3 | 41.7 | 55.6 |
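The F1 column can be checked against the precision and recall columns, since F1 is the harmonic mean of the two. The short Python sketch below recomputes it for a few rows copied from the table above. The helper name `f1` and the `rows` dictionary are illustrative only, and small differences in the last digit may arise because the published precision and recall are already rounded.

```python
def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same scale as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from the table above
rows = {
    "Baseline": (6.0, 35.6, 10.3),
    "Anaphora": (64.8, 45.3, 53.3),
    "Cataphora": (61.1, 37.5, 46.5),
    "Appositive": (61.1, 50.5, 55.3),
    "PredicateNominative": (93.0, 55.6, 69.6),
}

for name, (p, r, reported) in rows.items():
    print(f"{name}: computed {f1(p, r):.1f}, reported {reported}")
```

For these rows the recomputed values agree with the reported ones to one decimal place.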
Evaluation results on the test portion of SPL coreference dataset using gold coreferential mentions.
| | Precision | Recall | F1 |
|---|---|---|---|
| Baseline | 66.7 | 38.6 | 48.9 |
| Bio-SCoRes | |||
| - Anaphora | 76.9 | 50.7 | 61.1 |
| - Cataphora | 62.3 | 37.5 | 46.8 |
| - Appositive | 91.9 | 41.3 | 57.0 |
| - PredicateNominative | 98.0 | 68.1 | 80.3 |
Evaluation results on the test portion of SPL coreference dataset using concept recognition/normalization with MetaMap.
| | Precision | Recall | F1 | F1 difference from using gold entities |
|---|---|---|---|---|
| Bio-SCoRes | 61.1 | 41.3 | 49.2 | |
| - Anaphora | 61.7 | 43.1 | 50.7 | |
| - Cataphora | 66.7 | 28.4 | 39.8 | |
| - Appositive | 43.9 | 43.9 | 43.9 | |
| - PredicateNominative | 88.1 | 52.1 | 65.5 | |
| Bio-SCoRes | ||||
| - Anaphora | 59.6 | 44.9 | 51.2 | -2.1 |
| - Cataphora | 74.0 | 32.4 | 45.1 | -1.4 |
| - Appositive | 58.7 | 50.5 | 54.3 | -1.0 |
| - PredicateNominative | 86.4 | 53.5 | 66.1 | -3.5 |
Configuration for coreference resolution in discharge summaries.
| Mention Type | Mention Filters | Referent Filters | Agreement Methods | Post-Scoring Filters |
|---|---|---|---|---|
| PersonalPronoun, DistributivePronoun, ReciprocalPronoun | None | PriorDiscourse, WindowSize(Section), SyntacticConfig, SemanticClass | Person(1,0), Gender(1,0), Animacy(1,0), Number(1,0) | Threshold(4), TopScore, Salience(Default) |
| PossessivePronoun | None | PriorDiscourse, WindowSize(Section), SemanticClass | Person(1,0), Gender(1,0), Animacy(1,0), Number(1,0) | Threshold(4), TopScore, Salience(Default) |
| RelativePronoun | None | PriorDiscourse, WindowSize(Sentence), SemanticClass | Person(1,0) | Threshold(1) |
| DefiniteNP, DemonstrativeNP, DistributiveNP | Anaphoricity | PriorDiscourse, WindowSize(Section), SyntacticConfig, SemanticClass | Number(1,1), SemanticType(2,2), HeadWord(2,0), ExactString(2,0), PreModifierAndHead(2,0), RelaxedStem(2,0) | Threshold(4), TopScore, Salience(Default) |
| ZeroArticleNP | None | PriorDiscourse, WindowSize(Document), SyntacticConfig, SemanticClass | Number(1,3), ExactString(4,0), PreModifierAndHead(4,0), KeyValuePair(4,0), RelaxedStem(3,0) | Threshold(4), TopScore, Salience(Default) |
| DefiniteNP, IndefiniteNP | None | WindowSize(Sentence), SemanticClass | Number(1,1), SyntacticAppositive(3,2) | Threshold(4), TopScore, Salience(Default) |
Evaluation results on the test portion of the i2b2/VA dataset.
| | Precision | Recall | F1 |
|---|---|---|---|
| Baseline | 0.517 | 0.597 | 0.541 |
| Xu et al. [ | 0.906 | 0.925 | 0.915 |
| Bio-SCoRes | |||
| - BCUBED | 0.964 | 0.944 | 0.954 |
| - MUC | 0.735 | 0.830 | 0.779 |
| - CEAF | 0.815 | 0.868 | 0.841 |
| – Test | 0.796 | 0.700 | 0.735 |
| – Person | 0.816 | 0.903 | 0.850 |
| – Problem | 0.774 | 0.851 | 0.808 |
| – Treatment | 0.759 | 0.826 | 0.789 |
| Bio-SCoRes | 0.800 | 0.871 | 0.832 |
| - BCUBED | 0.966 | 0.941 | 0.953 |
| - MUC | 0.655 | 0.807 | 0.723 |
| - CEAF | 0.777 | 0.869 | 0.821 |
Configuration for the Protein Coreference Dataset.
| Mention Type | Mention Filters | Referent Filters | Agreement Methods | Post-Scoring Filters |
|---|---|---|---|---|
| PersonalPronoun | ThirdPerson, PleonasticIt | PriorDiscourse, WindowSize(2), SyntacticConfig, Default | Person(1,0), Gender(1,0), Animacy(1,0), Number(1,0) | Threshold(4), TopScore, Salience(Parse) |
| PossessivePronoun | ThirdPerson | PriorDiscourse, WindowSize(2), Default | Person(1,0), Gender(1,0), Animacy(1,0), Number(1,0), SemanticCoercion(1,0) | Threshold(4), TopScore, Salience(Parse) |
| DistributivePronoun, ReciprocalPronoun | None | PriorDiscourse, WindowSize(2), SyntacticConfig, Default | Person(1,0), Gender(1,0), Animacy(1,0), Number(1,0) | Threshold(4), TopScore, Salience(Default) |
| RelativePronoun | CoreferentialPronoun | PriorDiscourse, WindowSize(Sentence), Default | Adjacency(1,0) | Threshold(1), Salience(Default) |
| DefiniteNP, DemonstrativeNP, DistributiveNP | Anaphoricity | PriorDiscourse, SyntacticConfig, Default | Number(1,1), HypernymList(3,0) | Threshold(4), TopScore, Salience(Default) |
Evaluation results on the development portion of the Protein Coreference Dataset.
| | Precision | Recall | F1 |
|---|---|---|---|
| Bio-SCoRes | |||
| - PRON | 52.5 | 46.8 | 49.8 |
| - RELAT | 86.2 | 83.3 | 84.7 |
| - DNP | 36.4 | 18.5 | 24.5 |
| Nguyen et al. [ | 67.8 | 57.8 | 62.4 |