Rashmi Prasad, Susan McRoy, Nadya Frid, Aravind Joshi, Hong Yu.
Abstract
BACKGROUND: Identification of discourse relations, such as causal and contrastive relations, between situations mentioned in text is an important task for biomedical text-mining. A biomedical text corpus annotated with discourse relations would be very useful for developing and evaluating methods for biomedical discourse processing. However, little effort has been made to develop such an annotated resource.
Year: 2011 PMID: 21605399 PMCID: PMC3130691 DOI: 10.1186/1471-2105-12-188
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
BioDRB sense classification for discourse relations
| Type | Subtype | Type | Subtype |
|---|---|---|---|
| CAUSE | Reason | CONDITION | Hypothetical |
| | Result | | Factual |
| | Claim | | Non-Factual |
| | Justification | | |
| PURPOSE | Goal | TEMPORAL | Synchronous |
| | Enablement | | Precedence |
| | | | Succession |
| CONCESSION | Contra-Expectation | ALTERNATIVE | Chosen-Alternative |
| | Expectation | | Conjunctive |
| | | | Disjunctive |
| CONTRAST | | INSTANTIATION | |
| CONJUNCTION | | EXCEPTION | |
| SIMILARITY | | CONTINUATION | |
| CIRCUMSTANCE | Forward-Circumstance | BACKGROUND | Forward-Background |
| | Backward-Circumstance | | Backward-Background |
| RESTATEMENT | Equivalence | REINFORCEMENT | |
| | Generalization | | |
| | Specification | | |
Grouping of BioDRB sense types into PDTB generalized classes
| BioDRB Type-level Senses | PDTB Class-level Sense |
|---|---|
| Concession, Contrast | Comparison |
| Cause, Condition, Purpose | Contingency |
| Temporal | Temporal |
| Alternative, Background, Circumstance, Conjunction, Continuation, Exception, Instantiation, Reinforcement, Restatement, Similarity | Expansion |
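The grouping in this table can be expressed as a simple lookup. The following is a minimal Python sketch; the names `BIODRB_TO_PDTB` and `to_pdtb_class` are hypothetical and not part of any BioDRB tooling, and the sketch assumes that a subtype, when present, follows the type after a period (as in `Cause.Result`).

```python
# Hypothetical lookup built from the grouping table: BioDRB type -> PDTB class.
BIODRB_TO_PDTB = {
    "Concession": "Comparison",
    "Contrast": "Comparison",
    "Cause": "Contingency",
    "Condition": "Contingency",
    "Purpose": "Contingency",
    "Temporal": "Temporal",
    "Alternative": "Expansion",
    "Background": "Expansion",
    "Circumstance": "Expansion",
    "Conjunction": "Expansion",
    "Continuation": "Expansion",
    "Exception": "Expansion",
    "Instantiation": "Expansion",
    "Reinforcement": "Expansion",
    "Restatement": "Expansion",
    "Similarity": "Expansion",
}


def to_pdtb_class(sense: str) -> str:
    """Map a BioDRB sense such as 'Cause.Result' or 'CONTRAST' to its PDTB class."""
    # Keep only the type part (assumed to precede the first period) and
    # normalise the case so that 'CAUSE' and 'Cause' are treated alike.
    type_name = sense.split(".", 1)[0].strip().capitalize()
    return BIODRB_TO_PDTB[type_name]


if __name__ == "__main__":
    for label in ("Cause.Result", "CONTRAST", "Temporal.Precedence"):
        print(label, "->", to_pdtb_class(label))
```

An unknown type raises `KeyError`, which makes mislabelled senses visible instead of silently assigning them to a class.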
BioDRB distribution of relation types
| Relation Type | No. of Tokens (%) | Types |
|---|---|---|
| Explicit | 2636 (45.0%) | 179 |
| Implicit | 3001 (51.2%) | 57 |
| AltLex | 193 (3.3%) | 165 |
| NoRel | 29 (0.5%) | - |
| TOTAL | 5859 | - |
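The percentages in this table follow directly from the token counts. As a quick check, here is a small Python sketch; the names `RELATION_COUNTS` and `relation_shares` are illustrative only.

```python
# Token counts copied from the relation type table.
RELATION_COUNTS = {"Explicit": 2636, "Implicit": 3001, "AltLex": 193, "NoRel": 29}


def relation_shares(counts: dict) -> dict:
    """Return each relation type's share of all tokens, in percent (one decimal)."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


if __name__ == "__main__":
    print("total tokens:", sum(RELATION_COUNTS.values()))
    for name, share in relation_shares(RELATION_COUNTS).items():
        print(f"{name}: {share}%")
```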
Distribution of senses in BioDRB.
| Sense | Explicit | Implicit | AltLex | TOTAL |
|---|---|---|---|---|
| Alternative | 31 | 3 | 3 | 37 |
| Background | - | 132 | 1 | 133 |
| Cause | 339 | 98 | 105 | 542 |
| Circumstance | 8 | 221 | 1 | 230 |
| Concession | 257 | 70 | 2 | 329 |
| Condition | 22 | - | - | 22 |
| Conjunction | 421 | 641 | 3 | 1065 |
| Continuation | 24 | 831 | - | 855 |
| Contrast | 205 | 75 | 2 | 282 |
| Exception | 7 | 2 | - | 9 |
| Instantiation | 21 | 53 | 14 | 88 |
| Purpose | 616 | - | 1 | 617 |
| Reinforcement | 22 | 60 | 19 | 101 |
| Restatement | 69 | 445 | 19 | 533 |
| Similarity | 5 | - | - | 5 |
| Temporal | 394 | 370 | 16 | 780 |
| Cause/Background | 8 | - | - | 8 |
| Cause/Conjunction | 5 | - | - | 5 |
| Cause/Reinforcement | - | - | 1 | 1 |
| Cause/Temporal | 6 | - | 3 | 9 |
| Concession/Background | 2 | - | - | 2 |
| Concession/Circumstance | 1 | - | - | 1 |
| Condition/Circumstance | 2 | - | - | 2 |
| Condition/Temporal | 5 | - | - | 5 |
| Conjunction/Temporal | 70 | - | 1 | 71 |
| Continuation/Reinforcement | 1 | - | - | 1 |
| Contrast/Background | - | - | 1 | 1 |
| Contrast/Concession | 1 | - | - | 1 |
| Purpose/Conjunction | 1 | - | - | 1 |
| Reinforcement/Conjunction | - | - | 1 | 1 |
| Temporal/Circumstance | 92 | - | - | 92 |
| Temporal/Continuation | 1 | - | - | 1 |
| TOTAL | 2636 | 3001 | 193 | 5830 |
Multiple senses provided for connectives are shown separately.
Contextual ambiguity of explicit connectives
| Connective Type | Senses | Tokens |
|---|---|---|
| accordingly | 2: Cause, Conjunction | 2 |
| although | 2: Concession, Contrast | 76 |
| and | 6: Cause, Concession, Conjunction, Continuation, Purpose, Temporal | 274 |
| as | 3: Cause, Purpose, Temporal | 23 |
| both upon | 2: Circumstance, Temporal | 2 |
| but | 2: Concession, Contrast | 42 |
| by | 3: Cause, Purpose, Temporal | 262 |
| finally | 2: Conjunction, Temporal | 21 |
| however | 2: Concession, Contrast | 117 |
| in part by | 2: Cause, Purpose | 3 |
| in particular | 2: Instantiation, Restatement | 4 |
| in response to | 3: Cause, Circumstance, Temporal | 12 |
| in turn | 3: Cause, Conjunction, Temporal | 6 |
| in | 2: Circumstance, Purpose | 3 |
| indeed | 2: Circumstance, Reinforcement | 15 |
| on the other hand | 2: Concession, Contrast | 6 |
| once | 2: Circumstance, Temporal | 7 |
| second | 2: Conjunction, Temporal | 3 |
| since | 2: Cause, Temporal | 52 |
| so | 2: Cause, Restatement | 7 |
| then | 2: Restatement, Temporal | 91 |
| therefore | 2: Cause, Restatement | 75 |
| thus | 2: Cause, Restatement | 77 |
| upon | 2: Circumstance, Temporal | 15 |
| when | 3: Circumstance, Condition, Temporal | 65 |
| while | 4: Concession, Conjunction, Contrast, Temporal | 64 |
| whilst | 2: Concession, Contrast | 4 |
| Total | - | 1328 |
Annotation fields in the BioDRB data representation
| Field | Description |
|---|---|
| 0 | Relation type (Explicit, Implicit, AltLex, NoRel) |
| 1 | (Sets of) Span offsets for connective (when explicit) |
| 7 | Connective string "inserted" for Implicit relation |
| 8 | Sense1 of Explicit Connective (or Implicit Connective) |
| 9 | Sense2 of Explicit Connective (or Implicit Connective) |
| 14 | (Sets of) Span offsets for Arg1 |
| 20 | (Sets of) Span offsets for Arg2 |
Annotation representation
Explicit|9171..9174|||||||Temporal.Precedence|Conjunction|||||9137..9170||||||9175..9244||||||
Explicit|21670..21678;21729..21737|||||||Conjunction||||||21679..21727||||||21738..21829||||||
Explicit|10101..10105||||||||Temporal.Precedence||||||9932..10088||||||10090..10100;10106..10209||||||
Implicit||||||||as a result|Cause.Result||||||3418..3655||||||3657..3714||||||
AltLex|25183..25199||||||||Reinforcement|Cause.Claim||||||24621..25181||||||25183..25444|||||||
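A record in this representation can be read by splitting on the field separator and picking out the field numbers listed under "Annotation fields in the BioDRB data representation". The Python sketch below is illustrative only: it assumes fields are separated by a pipe character, that discontinuous spans are joined with `;`, and that each span is written `start..end`. The function and key names (`parse_spans`, `parse_record`, `relation_type`, and so on) are hypothetical and not part of the corpus distribution.

```python
from typing import Any, Dict, List, Tuple

Span = Tuple[int, int]


def parse_spans(text: str) -> List[Span]:
    """Parse a span field such as '21670..21678;21729..21737' into integer pairs."""
    spans = []
    for part in text.split(";"):
        if part:  # an empty field yields no spans
            start, end = part.split("..")
            spans.append((int(start), int(end)))
    return spans


def parse_record(line: str) -> Dict[str, Any]:
    """Split one pipe-delimited record and extract the documented fields."""
    fields = line.rstrip("\n").split("|")

    def get(index: int) -> str:
        # Tolerate short records by treating missing fields as empty.
        return fields[index] if index < len(fields) else ""

    return {
        "relation_type": get(0),
        "connective_spans": parse_spans(get(1)),
        "implicit_connective": get(7) or None,
        "sense1": get(8) or None,
        "sense2": get(9) or None,
        "arg1_spans": parse_spans(get(14)),
        "arg2_spans": parse_spans(get(20)),
    }


# One Explicit record, written with pipe separators.
SAMPLE = "Explicit|9171..9174|||||||Temporal.Precedence|Conjunction|||||9137..9170||||||9175..9244||||||"

if __name__ == "__main__":
    for key, value in parse_record(SAMPLE).items():
        print(key, "=", value)
```

Because the field positions are fixed, a record whose separators are miscounted will place senses or argument spans under the wrong key, so checking `relation_type` and the span fields after parsing is a cheap sanity test.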
Ten-fold cross validation accuracies for explicit connective sense classification in BioDRB and PDTB.
| Corpus | First Sense | Second Sense | Both Senses |
|---|---|---|---|
| BioDRB | 90.9% | 83.6% | 85.6% |
| PDTB | 90.1% | 84.1% | 85.6% |
Columns represent three scenarios for selecting from multiple senses provided for connectives.
Explicit sense classification in BioDRB: Class-wise Precision, Recall and F1.
| Class | Precision | Recall | F1 |
|---|---|---|---|
| Comparison | 0.983 | 0.868 | 0.922 |
| Contingency | 0.819 | 0.992 | 0.897 |
| Expansion | 0.923 | 0.9 | 0.911 |
| Temporal | 1.0 | 0.754 | 0.860 |
Macro average F1 score is 0.91.
Explicit sense classification in PDTB: Class-wise Precision, Recall and F1.
| Class | Precision | Recall | F1 |
|---|---|---|---|
| Comparison | 0.948 | 0.993 | 0.970 |
| Contingency | 1.0 | 0.706 | 0.828 |
| Expansion | 0.907 | 0.978 | 0.941 |
| Temporal | 0.883 | 0.889 | 0.886 |
Macro average F1 score is 0.91.
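For reference, the F1 column is the harmonic mean of precision and recall, F1 = 2PR / (P + R), and the macro average is the unweighted mean of the per-class F1 scores. The sketch below recomputes both from the PDTB precision and recall values in the table above; the names `f1_score`, `macro_f1`, and `PDTB_EXPLICIT` are illustrative, and the sketch assumes the macro average is taken over unrounded per-class F1 values.

```python
from typing import Dict, Tuple


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def macro_f1(scores: Dict[str, Tuple[float, float]]) -> float:
    """Unweighted mean of per-class F1 over (precision, recall) pairs."""
    values = [f1_score(p, r) for p, r in scores.values()]
    return sum(values) / len(values)


# (precision, recall) per class, copied from the PDTB table.
PDTB_EXPLICIT = {
    "Comparison": (0.948, 0.993),
    "Contingency": (1.0, 0.706),
    "Expansion": (0.907, 0.978),
    "Temporal": (0.883, 0.889),
}

if __name__ == "__main__":
    for name, (p, r) in PDTB_EXPLICIT.items():
        print(f"{name}: F1 = {f1_score(p, r):.3f}")
    print(f"Macro average F1 = {macro_f1(PDTB_EXPLICIT):.2f}")
```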
Cross-domain sense classification: Class-wise Precision, Recall and F1.
| Class | Precision | Recall | F1 |
|---|---|---|---|
| Comparison | 0.983 | 0.897 | 0.938 |
| Contingency | 0.643 | 0.073 | 0.131 |
| Expansion | 0.347 | 0.938 | 0.507 |
| Temporal | 0.863 | 0.585 | 0.697 |
Macro average F1 score is 0.57.
Sense distributions in IMRAD segments
| Type-level Sense | Introduction | Methods | Results | Abstract | Discussion | Total |
|---|---|---|---|---|---|---|
| Alternative | 4 (13.8%) | 3 (10.3%) | 7 (24.1%) | 0 (0.0%) | 15 (51.7%) | 29 |
| Background | 24 (19.8%) | 7 (5.8%) | 36 (29.8%) | 15 (12.4%) | 39 (32.2%) | 121 |
| Cause | 80 (17.0%) | 16 (3.4%) | 134 (28.5%) | 33 (7.0%) | 208 (44.2%) | 471 |
| Circumstance | 11 (7.1%) | 7 (4.5%) | 112 (71.8%) | 13 (8.3%) | 13 (8.3%) | 156 |
| Concession | 59 (21.7%) | 3 (1.1%) | 73 (26.8%) | 21 (7.7%) | 116 (42.6%) | 272 |
| Condition | 1 (5.3%) | 6 (31.6%) | 0 (0.0%) | 1 (5.3%) | 11 (57.9%) | 19 |
| Conjunction | 105 (13.9%) | 100 (13.3%) | 271 (35.9%) | 78 (10.3%) | 195 (25.9%) | 754 |
| Continuation | 80 (19.3%) | 121 (29.2%) | 112 (27.0%) | 17 (4.1%) | 85 (20.5%) | 415 |
| Contrast | 26 (10.6%) | 9 (3.7%) | 118 (48.0%) | 12 (4.9%) | 81 (32.9%) | 246 |
| Exception | 1 (16.7%) | 2 (33.3%) | 2 (33.3%) | 0 (0.0%) | 1 (16.7%) | 6 |
| Instantiation | 17 (23.9%) | 0 (0.0%) | 9 (12.7%) | 3 (4.2%) | 42 (59.2%) | 71 |
| Purpose | 93 (20.2%) | 84 (18.3%) | 144 (31.3%) | 35 (7.6%) | 104 (22.6%) | 460 |
| Reinforcement | 14 (16.5%) | 3 (3.5%) | 14 (16.5%) | 4 (4.7%) | 50 (58.8%) | 85 |
| Restatement | 63 (19.2%) | 47 (14.3%) | 124 (37.8%) | 29 (8.8%) | 65 (19.8%) | 328 |
| Similarity | 0 (0.0%) | 0 (0.0%) | 2 (40.0%) | 0 (0.0%) | 3 (60.0%) | 5 |
| Temporal | 41 (8.0%) | 259 (50.3%) | 141 (27.4%) | 22 (4.3%) | 52 (10.1%) | 515 |