Su Nam Kim, David Martinez, Lawrence Cavedon, Lars Yencken.
Abstract
AIM: Given a set of pre-defined medical categories used in Evidence Based Medicine, we aim to automatically annotate sentences in medical abstracts with these labels.
Year: 2011 PMID: 21489224 PMCID: PMC3073185 DOI: 10.1186/1471-2105-12-S2-S5
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. Annotex interface for the annotation of sentences.
Kappa values per class.
| Class | Kappa |
|---|---|
| Background | 0.70 |
| Intervention | 0.61 |
| Other | 0.67 |
| Outcome | 0.71 |
| Population | 0.63 |
| Study Design | 0.41 |
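The per-class agreement figures above can be illustrated with a minimal sketch. It assumes Cohen's kappa computed one-vs-rest for each class between two annotators; the source does not state which kappa variant was used, and the function names and toy labels below are hypothetical.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two equal-length label sequences."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items labelled identically.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement from each annotator's marginal label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1 - expected)


def per_class_kappa(labels_a, labels_b, cls):
    """One-vs-rest kappa: does each annotator assign `cls` or not?"""
    a = [label == cls for label in labels_a]
    b = [label == cls for label in labels_b]
    return cohen_kappa(a, b)


# Hypothetical toy annotations (not taken from the paper's data).
annotator_1 = ["B", "B", "O", "O", "P", "I"]
annotator_2 = ["B", "O", "O", "O", "P", "P"]
kappa_outcome = per_class_kappa(annotator_1, annotator_2, "O")
print(round(kappa_outcome, 2))
```

In the toy data the annotators agree on five of six sentences for the Outcome class, while chance agreement is 0.5, so kappa is (5/6 - 0.5) / 0.5.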
Number of abstracts and sentences for Structured (S) and Unstructured (U) abstract sets, including number of sentences per class.
| | All | S | U |
|---|---|---|---|
| # Abstracts | 1000 | 376 | 624 |
| # Sentences | 10379 | 4774 | 5605 |
| - Background | 2557 | 669 | 1888 |
| - Intervention | 690 | 313 | 377 |
| - Outcome | 4523 | 2240 | 2283 |
| - Population | 812 | 369 | 443 |
| - Study Design | 233 | 149 | 84 |
| - Other | 1564 | 1034 | 530 |
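As a quick consistency check on the table above, the following sketch recomputes the totals and the class shares from the per-class counts. The counts are copied from the table; the variable names are illustrative only.

```python
# Per-class sentence counts from the table: (structured, unstructured).
counts = {
    "Background": (669, 1888),
    "Intervention": (313, 377),
    "Outcome": (2240, 2283),
    "Population": (369, 443),
    "Study Design": (149, 84),
    "Other": (1034, 530),
}

total_s = sum(s for s, _ in counts.values())
total_u = sum(u for _, u in counts.values())

# Share of each class within its subset, in percent.
shares = {
    cls: (100 * s / total_s, 100 * u / total_u)
    for cls, (s, u) in counts.items()
}

print(total_s, total_u, total_s + total_u)
for cls, (share_s, share_u) in shares.items():
    print(f"{cls:<13} S={share_s:5.1f}%  U={share_u:5.1f}%")
```

The per-class counts add up to the sentence totals reported in the table, and Outcome is the largest class in both subsets while Study Design is the smallest.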
F-scores for the benchmark system based on [4]. 1.P: unigrams with POS, Pst: position, W: windowed features, Sec: section headings. Best results per column are given in bold.
| Feature | 6-way S (P) | 6-way S (R) | 6-way S (F) | 6-way U (P) | 6-way U (R) | 6-way U (F) | 5-way S (P) | 5-way S (R) | 5-way S (F) | 5-way U (P) | 5-way U (R) | 5-way U (F) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1.P+Pst | 75.11 | 71.49 | 73.26 | | | | 86.38 | 81.07 | 83.64 | | | |
| 1.P+Pst+W | 72.40 | 68.91 | 70.62 | 64.14 | 59.96 | 61.98 | 85.07 | 79.84 | 82.37 | 72.61 | 67.39 | 69.90 |
| 1.P+Pst+Sec+W | | | | – | – | – | | | | – | – | – |
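The P, R and F columns above are related as follows, assuming F is the balanced F-score (the harmonic mean of precision and recall); the source does not spell out the formula, but the reported numbers are consistent with it up to rounding.

$$
F = \frac{2 \, P \, R}{P + R}
$$

For the 1.P+Pst row, with P = 75.11 and R = 71.49:

$$
F = \frac{2 \times 75.11 \times 71.49}{75.11 + 71.49} = \frac{10739.23}{146.60} \approx 73.26
$$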
F-scores using lexical and semantic information for 6-way and 5-way classification.
| Feature | 6-way S | 6-way U | 5-way S | 5-way U |
|---|---|---|---|---|
| 1.P | | | | |
| 2.P | 47.50 | 44.19 | 59.09 | 49.61 |
| Token-CUI | 66.19 | 59.47 | 78.26 | 65.57 |
| Token-Syn | 64.13 | 58.79 | 76.77 | 65.47 |
| Token-Syn-B | 65.25 | 59.94 | 77.43 | 66.22 |
| MetaMap-CUI | 56.08 | 52.23 | 64.58 | 56.58 |
1.P: unigrams with POS, 2.P: bigrams with POS, CUI: UMLS tag, Syn: expansion with synonyms, Syn-B: synonyms in break-down form. Best results per column are given in bold.
F-scores combining lexical and semantic information.
| Feature | 6-way S | 6-way U | 5-way S | 5-way U |
|---|---|---|---|---|
| 1.P+2.P | | 60.53 | | 68.47 |
| 1.P+T-CUI | 67.83 | 61.01 | 79.94 | 67.41 |
| 1.P+T-Syn | 66.13 | 59.79 | 78.26 | 67.39 |
| 1.P+T-Syn-B | 67.03 | 61.24 | 79.09 | |
| 1.P+T-CUI+T-Syn | 65.89 | 60.23 | 77.85 | 66.82 |
| 1.P+T-CUI+T-Syn-B | 66.82 | | 78.90 | 68.27 |
1.P: unigrams with POS, 2.P: bigrams with POS, T: Token, CUI: UMLS tag, Syn: expansion with synonyms, Syn-B: synonyms in break-down form. Best results per column are given in bold.
F-scores using Structural Information.
| Feature | 6-way S | 6-way U | 5-way S | 5-way U |
|---|---|---|---|---|
| 1.P+Pst | 73.26 | | 83.64 | |
| 1.P+Sec | 79.22 | – | 88.88 | – |
| 1.P+SecM | 76.95 | – | 87.48 | – |
| 1.P+Pst+Sec | | – | | – |
| 1.P+Pst+SecM | 78.45 | – | 88.55 | – |
1.P: unigrams with POS, Pst: position, Sec: section heading, SecM: section heading with mapping. Best results per column are given in bold.
F-scores using 1 to 3 previous sentences (Indirect).
| Feature | 6-way S | 6-way U | 5-way S | 5-way U |
|---|---|---|---|---|
| B+1 Prev. Sen. | | 65.06 | | 71.80 |
| B+2 Prev. Sen. | 77.53 | 66.30 | | 73.64 |
| B+3 Prev. Sen. | 76.75 | | 88.03 | |
| B+Window | 77.48 | 61.98 | 87.50 | 69.90 |
B: base features, Window: features in previous and posterior sentence. Best performance per column is given in bold.
F-scores using previous labels (Direct).
| Feature | 6-way S | 6-way U | 5-way S | 5-way U |
|---|---|---|---|---|
| B+1 Prev. Label | 79.85 | 63.64 | 89.24 | 71.15 |
| B+3 Prev Labels | | 63.57 | | |
| B+All Prev Labels | 79.48 | | 88.11 | 71.50 |
B: base features. Best performance per column is given in bold.
F-scores per class from systems based on sequential features (applying the best configurations for each data subset).
| Class | 6-way S | 6-way U | 5-way S | 5-way U |
|---|---|---|---|---|
| Background | 81.84 | 68.46 | 87.92 | 74.67 |
| Intervention | 20.25 | 12.68 | 48.08 | 21.39 |
| Outcome | | | | |
| Population | 56.25 | 39.80 | 63.88 | 43.15 |
| Study Design | 43.95 | 4.40 | 47.44 | 8.60 |
| Other | 69.98 | 24.28 | – | – |
Best performance per column is given in bold.
F-scores over dataset from [18].
| Feature | 5-way S | 5-way U | 4-way S | 4-way U |
|---|---|---|---|---|
| Lexical & Structural | | | | |
| 1.P | 55.12 | 37.10 | 76.90 | 76.32 |
| 1.P+Pst | 57.80 | 38.53 | 78.04 | 72.82 |
| 1.P+Pst+Sec | 62.83 | – | 83.81 | – |
| Sequential (indirect) | | | | |
| 1.P+Pst+W | 56.06 | 38.76 | 75.26 | 72.82 |
| 1.P+Pst+Sec+W | 61.57 | – | 81.85 | – |
| B+1 Prev. Sen. | 62.36 | 75.20 | ||
| B+2 Prev. Sen. | 61.26 | 37.81 | 82.26 | 72.78 |
| B+3 Prev. Sen. | 60.16 | 37.81 | 82.20 | 75.27 |
| Sequential (direct) | | | | |
| B+1 Prev. Label | 37.57 | 81.93 | ||
| B+3 Prev. Label | 36.39 | 77.72 | 76.98 | |
| B+All Prev. Labels | 62.05 | 37.10 | 82.67 | 78.26 |
1.P: unigrams with POS, Pst: position, W: windowed features, Sec: section headings, B: base features. Best results per column are given in bold.
F-scores per class over dataset from [18].
| Class | 5-way S | 5-way U | 4-way S | 4-way U |
|---|---|---|---|---|
| Background | 56.18 | 15.67 | 77.27 | 37.50 |
| Intervention | 15.38 | 28.57 | 28.17 | 8.33 |
| Outcome | | | | |
| Population | 35.62 | 28.07 | 42.86 | 28.57 |
| Other | 46.32 | 15.77 | – | – |
For each of the two tasks (5-way, 4-way) the best feature set is applied. Best results per column are given in bold.
Confusion matrix over structured abstracts.
| Gold (rows) / Prediction (columns) | B | I | O | P | S | Ot |
|---|---|---|---|---|---|---|
| B | 561 | 4 | 43 | 8 | 2 | 51 |
| I | 27 | 41 | 48 | 60 | 5 | 132 |
| O | 6 | 1 | 2165 | 4 | 0 | 64 |
| P | 24 | 17 | 33 | 198 | 10 | 87 |
| S | 21 | 5 | 6 | 35 | 49 | 33 |
| Ot | 63 | 24 | 155 | 30 | 8 | 754 |
B: Background, I: Intervention, O: Outcome, P: Population, S: Study Design, and Ot: Other.
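Per-class precision, recall and F-score can be derived from a confusion matrix such as the one above. The sketch below is a minimal illustration, assuming rows are gold labels, columns are predictions, and F is the harmonic mean of precision and recall; the function name `per_class_scores` is hypothetical and not from the paper.

```python
CLASSES = ["B", "I", "O", "P", "S", "Ot"]

# Structured-abstract confusion matrix: rows = gold, columns = predicted.
MATRIX = [
    [561, 4, 43, 8, 2, 51],
    [27, 41, 48, 60, 5, 132],
    [6, 1, 2165, 4, 0, 64],
    [24, 17, 33, 198, 10, 87],
    [21, 5, 6, 35, 49, 33],
    [63, 24, 155, 30, 8, 754],
]


def per_class_scores(matrix, classes):
    """Return {class: (precision, recall, f_score)} as fractions in [0, 1]."""
    scores = {}
    for i, cls in enumerate(classes):
        true_pos = matrix[i][i]
        gold_total = sum(matrix[i])                  # row sum
        pred_total = sum(row[i] for row in matrix)   # column sum
        precision = true_pos / pred_total if pred_total else 0.0
        recall = true_pos / gold_total if gold_total else 0.0
        denom = precision + recall
        f_score = 2 * precision * recall / denom if denom else 0.0
        scores[cls] = (precision, recall, f_score)
    return scores


scores = per_class_scores(MATRIX, CLASSES)
for cls, (p, r, f) in scores.items():
    print(f"{cls:>2}  P={100 * p:5.2f}  R={100 * r:5.2f}  F={100 * f:5.2f}")
```

Under these assumptions, the F-scores computed for Background, Intervention, Population, Study Design and Other agree with the 6-way structured (S) entries of the per-class F-score table (81.84, 20.25, 56.25, 43.95 and 69.98). The row sums also match the structured per-class sentence counts, which supports reading rows as gold labels.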
Confusion matrix over unstructured abstracts.
| Gold (rows) / Prediction (columns) | B | I | O | P | S | Ot |
|---|---|---|---|---|---|---|
| B | 1505 | 15 | 272 | 70 | 2 | 24 |
| I | 141 | 30 | 120 | 64 | 2 | 20 |
| O | 496 | 13 | 1722 | 18 | 0 | 34 |
| P | 161 | 24 | 73 | 158 | 1 | 26 |
| S | 36 | 3 | 7 | 26 | 2 | 10 |
| Ot | 170 | 11 | 245 | 15 | 0 | 89 |
B: Background, I: Intervention, O: Outcome, P: Population, S: Study Design, and Ot: Other.
Confusion matrix when testing over dataset from [18] for structured abstracts.
| Gold (rows) / Prediction (columns) | B | I | O | P | Ot |
|---|---|---|---|---|---|
| B | 47 | 0 | 3 | 3 | 6 |
| I | 1 | 8 | 1 | 2 | 24 |
| O | 0 | 0 | 244 | 2 | 7 |
| P | 1 | 0 | 9 | 13 | 6 |
| Ot | 74 | 3 | 53 | 18 | 80 |
B: Background, I: Intervention, O: Outcome, P: Population, and Ot: Other.
Confusion matrix when testing over dataset from [18] for unstructured abstracts.
| Gold (rows) / Prediction (columns) | B | I | O | P | Ot |
|---|---|---|---|---|---|
| B | 21 | 1 | 1 | 1 | 3 |
| I | 3 | 3 | 3 | 1 | 1 |
| O | 3 | 0 | 112 | 0 | 2 |
| P | 4 | 1 | 6 | 4 | 3 |
| Ot | 134 | 0 | 71 | 8 | 12 |
B: Background, I: Intervention, O: Outcome, P: Population, and Ot: Other.