Angus Roberts, Robert Gaizauskas, Mark Hepple, Yikun Guo.
Abstract
BACKGROUND: The Clinical E-Science Framework (CLEF) project has built a system to extract clinically significant information from the textual component of medical records in order to support clinical research, evidence-based healthcare and genotype-meets-phenotype informatics. One part of this system is the identification of relationships between clinically important entities in the text. Typical approaches to relationship extraction in this domain have used full parses, domain-specific grammars, and large knowledge bases encoding domain knowledge. In other areas of biomedical NLP, statistical machine learning (ML) approaches are now routinely applied to relationship extraction. We report on the novel application of these statistical techniques to the extraction of clinical relationships.
Year: 2008 PMID: 19025689 PMCID: PMC2586752 DOI: 10.1186/1471-2105-9-S11-S3
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
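The abstract describes relationship extraction as a statistical classification task over pairs of entity mentions. A minimal sketch of the candidate-pair generation step (the entity representation and the `max_sent_dist` cutoff are illustrative assumptions, not the paper's actual data structures):

```python
from itertools import combinations

def candidate_pairs(entities, max_sent_dist=1):
    """Enumerate candidate argument pairs for relation classification.

    entities: list of dicts with a 'sent' key (sentence index).
    Pairs whose mentions lie more than max_sent_dist sentences apart
    are skipped, since most relations in the gold standard are
    intra-sentential (illustrative cutoff, not the paper's setting).
    """
    return [(a, b) for a, b in combinations(entities, 2)
            if abs(a["sent"] - b["sent"]) <= max_sent_dist]
```

Each surviving pair would then be turned into a feature vector and passed to an ML classifier that predicts a relation type or "no relation".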
Relationship types and examples.
| Description | Example |
| Relates an intervention or an investigation to the bodily locus at which it is targeted. | This patient has had a … |
| Relates a condition to an investigation that demonstrated its presence, or a result to the investigation that produced that result. | This patient has had a lymph node … |
| Relates a condition to a drug, intervention, or investigation that is targeted at that condition. | Her facial … |
| Relationship between a condition and a locus: describes the bodily location of a specific condition. | … a biopsy which shows … |
| Relates a condition to its negation or uncertainty about it. | There was … |
| Relates a bodily locus or intervention to its sidedness. | … on his … |
| Relates a bodily locus to other information about the location. | … |
Relationship types, their argument type constraints, a description and examples. Each example shows a single relation of the given type; in the original table, arguments are underlined and preceded by their argument number. (Adapted from the CLEF Annotation Guidelines.)
Figure 1: The relationship schema, showing entities (rectangles), modifiers (ovals), and relationships (arrows).
Relationship counts in the gold standard.
| n | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | >9 |
| | 265 | 46 | 25 | 7 | 5 | 4 | 3 | 2 | 2 | 2 | 0 |
| | 139 | 85 | 35 | 32 | 14 | 11 | 6 | 4 | 5 | 5 | 12 |
| | 360 | 4 | 1 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 4 |
| | 122 | 14 | 4 | 2 | 2 | 4 | 3 | 1 | 0 | 1 | 0 |
| | 128 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| | 100 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| | 76 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Total | 1190 | 150 | 65 | 42 | 22 | 20 | 13 | 7 | 7 | 8 | 16 |
| Cumulative | 1190 | 1340 | 1405 | 1447 | 1469 | 1489 | 1502 | 1509 | 1516 | 1524 | 1540 |
Count of relations in 77 gold standard documents, sub-divided by the number of sentence boundaries (n) between the arguments of each relation.
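The sentence-boundary subdivision above can be computed from character offsets; a sketch assuming the end offset of each sentence is available (function and argument names are illustrative):

```python
def boundaries_between(arg1_end, arg2_start, sentence_ends):
    """Count sentence boundaries (n) between two relation arguments.

    arg1_end, arg2_start: character offsets of the two arguments.
    sentence_ends: sorted character offsets where each sentence ends.
    Returns 0 for an intra-sentential relation.
    """
    lo, hi = sorted((arg1_end, arg2_start))
    return sum(1 for end in sentence_ends if lo <= end < hi)
```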
Figure 2: The relationship extraction system, as a GATE pipeline.
Feature sets for learning.
| Feature set | Size | Description |
| tokN | 8 | Surface string and POS of tokens surrounding the arguments, windowed |
| gentokN | 8 | Root and generalised POS of tokens surrounding the argument entities, windowed |
| atype | 1 | Concatenated semantic type of arguments, in arg1-arg2 order |
| dir | 1 | Direction: linear text order of the arguments (is arg1 before arg2, or vice versa?) |
| dist | 2 | Distance: absolute number of sentence and paragraph boundaries between arguments |
| str | 14 | Surface string features based on Zhou et al. |
| pos | 14 | POS features, as above |
| root | 14 | Root features, as above |
| genpos | 14 | Generalised POS features, as above |
| inter | 11 | Intervening mentions: numbers and types of intervening entity mentions between arguments |
| event | 5 | Events: are any of the arguments, or intervening entities, events? |
| allgen | 96 | All above features in root and generalised POS forms, i.e. gentok6+atype+dir+dist+root+genpos+inter+event |
| notok | 48 | All above except tokN features, others in string and POS forms, i.e. atype+dir+dist+str+pos+inter+event |
| dep | 16 | Features based on a syntactic dependency path. |
| syndist | 2 | The distance between the two arguments, along a token path and along a syntactic dependency path. |
Feature sets used for learning relationships. The table is split into non-syntactic features, combined non-syntactic features, and syntactic features. The size of a set is the number of features in that set.
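As an illustration of the simpler non-syntactic feature sets (atype, dir, dist), a hedged sketch; the dict-based entity representation is an assumption for illustration, not the system's actual GATE data model:

```python
def pair_features(arg1, arg2, sentence_ends):
    """Build 'atype', 'dir' and 'dist'-style features for one
    candidate argument pair. Each argument is a dict with 'type',
    'start' and 'end' character offsets (illustrative representation).
    """
    first, second = sorted((arg1, arg2), key=lambda a: a["start"])
    return {
        # atype: concatenated semantic types, in arg1-arg2 order
        "atype": f"{arg1['type']}-{arg2['type']}",
        # dir: linear text order of the arguments
        "dir": "arg1_first" if arg1["start"] < arg2["start"] else "arg2_first",
        # dist: number of sentence boundaries between the arguments
        "dist_sent": sum(1 for e in sentence_ends
                         if first["end"] <= e < second["start"]),
    }
```

In practice such name/value features would be binarised into a sparse vector before being handed to the classifier.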
Performance by feature set, non-syntactic features.
| P | 44 | 49 | 58 | 63 | 62 | 64 | 65 | 63 | 63 |
| R | 39 | 63 | 78 | 80 | 80 | 81 | 81 | 82 | 82 |
| F | 39 | 54 | 66 | 70 | 69 | 71 | 72 | 71 | 71 |
| P | 37 | 23 | 38 | 42 | 40 | 41 | 42 | 37 | 44 |
| R | 14 | 14 | 46 | 44 | 44 | 47 | 47 | 45 | 47 |
| F | 18 | 16 | 39 | 39 | 38 | 41 | 42 | 38 | 41 |
| P | 36 | 36 | 50 | 68 | 71 | 72 | 72 | 73 | 73 |
| R | 28 | 28 | 74 | 79 | 79 | 81 | 81 | 83 | 83 |
| F | 30 | 30 | 58 | 72 | 74 | 76 | 75 | 77 | 76 |
| P | 9 | 9 | 32 | 63 | 57 | 60 | 62 | 60 | 59 |
| R | 11 | 11 | 51 | 68 | 67 | 67 | 66 | 68 | 68 |
| F | 9 | 9 | 38 | 64 | 60 | 63 | 63 | 63 | 62 |
| P | 21 | 38 | 73 | 84 | 83 | 84 | 84 | 86 | 86 |
| R | 9 | 55 | 82 | 89 | 86 | 88 | 88 | 87 | 89 |
| F | 12 | 44 | 76 | 85 | 83 | 84 | 84 | 84 | 85 |
| P | 19 | 54 | 85 | 81 | 80 | 79 | 79 | 77 | 81 |
| R | 12 | 82 | 97 | 98 | 93 | 92 | 93 | 93 | 93 |
| F | 13 | 63 | 89 | 88 | 85 | 84 | 85 | 83 | 85 |
| P | 2 | 2 | 55 | 88 | 86 | 86 | 88 | 88 | 87 |
| R | 1 | 1 | 62 | 94 | 92 | 95 | 95 | 95 | 95 |
| F | 1 | 1 | 56 | 90 | 86 | 89 | 91 | 91 | 90 |
| P | 33 | 38 | 50 | 63 | 62 | 64 | 65 | 64 | 64 |
| R | 22 | 36 | 70 | 74 | 73 | 75 | 75 | 76 | 76 |
| F | 26 | 37 | 58 | 68 | 67 | 69 | 69 | 69 | 70 |
Variation in performance by feature set, non-syntactic features. Feature sets are abbreviated as in Table 3. For the first seven columns, features were added cumulatively to each other. The next two columns, allgen and notok, are as described in Table 3.
Performance by feature set, syntactic features.
| P | 65 | 73 | 74 |
| R | 81 | 77 | 77 |
| F | 72 | 71 | 74 |
| P | 42 | 42 | 43 |
| R | 47 | 37 | 37 |
| F | 42 | 38 | 39 |
| P | 72 | 74 | 73 |
| R | 81 | 86 | 86 |
| F | 75 | 79 | 78 |
| P | 62 | 65 | 71 |
| R | 66 | 63 | 66 |
| F | 63 | 62 | 64 |
| P | 84 | 89 | 89 |
| R | 88 | 84 | 90 |
| F | 84 | 85 | 89 |
| P | 79 | 85 | 85 |
| R | 93 | 97 | 93 |
| F | 85 | 90 | 88 |
| P | 88 | 90 | 93 |
| R | 95 | 95 | 95 |
| F | 91 | 92 | 94 |
| P | 65 | 71 | 71 |
| R | 75 | 74 | 74 |
| F | 69 | 72 | 72 |
Variation in performance by feature set, syntactic features. The first column shows the cumulative +event system from Table 4. The next two columns show the effect of cumulatively adding syntactic features to this system. Syntactic features are as described in Table 3.
Performance by sentences.
| P | 24 | 68 | 65 | 62 | 60 | 61 | 61 |
| R | 18 | 89 | 81 | 79 | 78 | 78 | 77 |
| F | 18 | 76 | 72 | 69 | 67 | 68 | 67 |
| P | 18 | 49 | 42 | 42 | 36 | 32 | 30 |
| R | 17 | 59 | 47 | 42 | 42 | 39 | 38 |
| F | 16 | 51 | 42 | 39 | 37 | 34 | 33 |
| P | n/a | 74 | 72 | 73 | 72 | 72 | 72 |
| R | n/a | 83 | 81 | 81 | 81 | 82 | 82 |
| F | n/a | 77 | 75 | 76 | 75 | 76 | 76 |
| P | 3 | 64 | 62 | 59 | 60 | 59 | 58 |
| R | 1 | 75 | 66 | 64 | 62 | 61 | 61 |
| F | 2 | 68 | 63 | 61 | 60 | 60 | 59 |
| P | n/a | 86 | 84 | 86 | 86 | 86 | 87 |
| R | n/a | 89 | 88 | 88 | 88 | 87 | 88 |
| F | n/a | 85 | 84 | 85 | 86 | 85 | 86 |
| P | n/a | 80 | 79 | 79 | 80 | 80 | 80 |
| R | n/a | 94 | 93 | 91 | 93 | 93 | 93 |
| F | n/a | 86 | 85 | 84 | 85 | 86 | 85 |
| P | n/a | 89 | 88 | 88 | 89 | 89 | 89 |
| R | n/a | 95 | 95 | 95 | 95 | 95 | 95 |
| F | n/a | 91 | 91 | 91 | 91 | 91 | 91 |
| P | 22 | 69 | 65 | 64 | 62 | 61 | 60 |
| R | 17 | 83 | 75 | 73 | 71 | 70 | 70 |
| F | 19 | 75 | 69 | 68 | 66 | 65 | 65 |
Variation in performance, by number of sentence boundaries (n) crossed by a relationship. For all cases, the cumulative feature set +event of Table 4 was used. For the inter-sentential-only classifier 1 ≤ n ≤ 5, the score fields for some relations are marked as n/a (not applicable). This is because some relations are either absent from the inter-sentential data (i.e. only ever appear intra-sententially), or are so rare that they do not appear in all training/test folds, and so a macro-average cannot be computed across the folds.
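The n/a entries arise because a macro-average can only be computed for relation types present in every cross-validation fold. A sketch of that averaging logic (the per-fold score representation is an assumption for illustration):

```python
def macro_average(fold_scores):
    """Macro-average per-type scores across cross-validation folds.

    fold_scores: one dict per fold mapping relation type -> score.
    A type absent from any fold yields None (reported as n/a),
    since its macro-average cannot be computed across all folds.
    """
    types = set().union(*fold_scores)
    averaged = {}
    for t in sorted(types):
        vals = [fold[t] for fold in fold_scores if t in fold]
        averaged[t] = (sum(vals) / len(vals)
                       if len(vals) == len(fold_scores) else None)
    return averaged
```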
Performance by corpus size.
| Count | 91 | 216 | 311 |
| P | 66 | 63 | 65 |
| R | 74 | 74 | 81 |
| F | 67 | 67 | 72 |
| Count | 91 | 117 | 224 |
| P | 22 | 25 | 42 |
| R | 30 | 31 | 47 |
| F | 23 | 25 | 42 |
| Count | 127 | 199 | 364 |
| P | 72 | 71 | 72 |
| R | 76 | 80 | 81 |
| F | 73 | 74 | 75 |
| Count | 51 | 90 | 136 |
| P | 65 | 49 | 62 |
| R | 60 | 65 | 66 |
| F | 59 | 54 | 63 |
| Count | 57 | 73 | 128 |
| P | 77 | 78 | 84 |
| R | 69 | 68 | 88 |
| F | 72 | 69 | 84 |
| Count | 34 | 67 | 101 |
| P | 78 | 79 | 79 |
| R | 80 | 93 | 93 |
| F | 78 | 84 | 85 |
| Count | 30 | 43 | 76 |
| P | 64 | 91 | 88 |
| R | 64 | 85 | 95 |
| F | 64 | 86 | 91 |
| Count | 481 | 805 | 1340 |
| P | 62 | 63 | 65 |
| R | 65 | 71 | 75 |
| F | 63 | 66 | 69 |
Variation in performance by training corpus size. The "Count" row gives the number of training instances of a relation type, for the given corpus. The cumulative feature set +event of Table 4 was used.
Performance over extracted entities.
| | Gold standard entities | Extracted entities |
| P | 63 | 62 |
| R | 82 | 32 |
| F | 71 | 41 |
| P | 44 | 44 |
| R | 47 | 27 |
| F | 41 | 32 |
| P | 73 | 68 |
| R | 83 | 49 |
| F | 76 | 55 |
| P | 59 | 47 |
| R | 68 | 39 |
| F | 62 | 41 |
| P | 86 | 83 |
| R | 89 | 76 |
| F | 85 | 74 |
| P | 81 | 81 |
| R | 93 | 53 |
| F | 85 | 60 |
| P | 87 | 71 |
| R | 95 | 24 |
| F | 90 | 31 |
| P | 64 | 63 |
| R | 76 | 40 |
| F | 70 | 48 |
Performance of relation extraction over automatically extracted entities, compared to relation extraction using perfect gold standard entities. For relation extraction, the cumulative feature set +event of Table 4 was used.
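The precision (P), recall (R) and F1 figures in these tables follow the standard definitions over predicted versus gold-standard relations; a minimal sketch, assuming relations are represented as hashable tuples:

```python
def precision_recall_f1(gold, predicted):
    """Score predicted relations against the gold standard.

    Both arguments are sets of (arg1_id, arg2_id, relation_type)
    tuples; a prediction counts as correct only on an exact match.
    """
    true_pos = len(gold & predicted)
    p = true_pos / len(predicted) if predicted else 0.0
    r = true_pos / len(gold) if gold else 0.0
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1
```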
Overall performance evaluation.
| P | 63 | 74 | 65 |
| R | 82 | 77 | 76 |
| F | 71 | 74 | 70 | 46 | 80 |
| P | 44 | 43 | 0 |
| R | 47 | 37 | 0 |
| F | 41 | 39 | 0 | 26 | 50 |
| P | 73 | 73 | 0 |
| R | 83 | 86 | 0 |
| F | 76 | 78 | 0 | 55 | 80 |
| P | 59 | 71 | 0 |
| R | 68 | 66 | 0 |
| F | 62 | 64 | 0 | 42 | 63 |
| P | 86 | 89 | 60 |
| R | 89 | 90 | 91 |
| F | 85 | 89 | 72 | 73 | 94 |
| P | 81 | 85 | 81 |
| R | 93 | 93 | 98 |
| F | 85 | 88 | 88 | 66 | 93 |
| P | 87 | 93 | 50 |
| R | 95 | 95 | 68 |
| F | 90 | 94 | 58 | 49 | 96 |
| P | 64 | 71 | 36 |
| R | 76 | 74 | 48 |
| F | 70 | 72 | 41 | 47 | 75 |
System best performance figures (from Tables 4 and 5), and comparison to baseline performance and to inter-annotator agreement scores.