Wangjin Lee1, Jinwook Choi2,3,4.
Abstract
BACKGROUND: This paper presents a conditional random fields (CRF) method that captures specific high-order label transition factors to improve clinical named entity recognition performance. Consecutive clinical entities in a sentence are usually separated from each other, and the textual descriptions in clinical narrative documents frequently indicate causal or posterior relationships that can be used to facilitate clinical named entity recognition. However, the CRF generally used for named entity recognition is a first-order model that, under the Markov assumption, restricts label transition dependencies to adjoining labels.
Keywords: Clinical named entity recognition; Clinical natural language processing; Conditional random fields; High-order dependency; Induction method
Year: 2019 PMID: 31307440 PMCID: PMC6632205 DOI: 10.1186/s12911-019-0865-1
Source DB: PubMed Journal: BMC Med Inform Decis Mak ISSN: 1472-6947 Impact factor: 2.796
Fig. 1. NER perspective of a text; the label O represents a non-entity
Fig. 2. Example of entities separated by non-entity words in the CRF model (S: symptom; D: drug; O: non-entity)
Fig. 3. The transformation from conventional first-order CRF to precursor-induced CRF
Fig. 4. Trellis graphs generated by different CRFs; each circle indicates a random hidden state variable at a time step, and lines indicate the transition paths among the labels. The small circles in (c) are the memory elements added to the hidden states for the non-entity label
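The idea behind Figs. 3 and 4, letting the non-entity label remember the entity that preceded it, can be illustrated with a small label rewrite. This is a minimal sketch under stated assumptions, not the authors' implementation: the `O-<type>` naming and the function `induce_precursor_labels` are hypothetical, and the paper's actual label scheme is not shown in this excerpt.

```python
from typing import List, Optional


def induce_precursor_labels(labels: List[str], outside: str = "O") -> List[str]:
    """Rewrite outside labels so they carry the most recent entity label.

    Assumption: one label per token, with `outside` marking non-entities.
    The "O-<type>" naming is hypothetical and used only for illustration.
    """
    induced: List[str] = []
    precursor: Optional[str] = None  # last entity label seen in this sentence
    for label in labels:
        if label == outside:
            # Before any entity there is nothing to remember, so keep plain O.
            induced.append(outside if precursor is None else f"{outside}-{precursor}")
        else:
            precursor = label
            induced.append(label)
    return induced


# Labels as in Fig. 2 (S: symptom; D: drug; O: non-entity).
labels = ["O", "S", "O", "O", "D", "O"]
print(induce_precursor_labels(labels))
# ['O', 'S', 'O-S', 'O-S', 'D', 'O-D']
```

With labels rewritten this way, a first-order transition from `O-S` to `D` still reflects that a symptom came earlier, even though several non-entity words lie in between.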
Data specification
| Corpus | Domain | Set | Article | Sentence | Token | Entity |
|---|---|---|---|---|---|---|
| i2b2 2012 | Clinical | Train | 190 | 7,258 | 94,836 | 11,239 |
| i2b2 2012 | Clinical | Test | 120 | 5,547 | 78,564 | 9,623 |
| SNUH | Clinical | Train | 196 | 11,669 | 116,402 | 18,383 |
| SNUH | Clinical | Test | 193 | 11,042 | 107,666 | 17,125 |
| CoNLL 2003 | General | Train | 946 | 14,987 | 203,621 | 23,499 |
| CoNLL 2003 | General | Test | 231 | 3,684 | 46,435 | 5,629 |
Annotation statistics
a) i2b2 2012

| Set | Problem | Test | Treatment |
|---|---|---|---|
| Train | 4,962 | 2,558 | 3,719 |
| Test | 4,270 | 2,140 | 3,213 |

b) SNUH

| Set | Symptom | Test | Disease | Medication | Procedure |
|---|---|---|---|---|---|
| Train | 3,923 | 4,559 | 5,084 | 3,642 | 1,175 |
| Test | 3,737 | 3,917 | 4,828 | 3,496 | 1,147 |

c) CoNLL 2003

| Set | Location | Person | Organization | Miscellaneous |
|---|---|---|---|---|
| Train | 7,140 | 6,600 | 6,321 | 3,438 |
| Test | 1,656 | 1,617 | 1,662 | 694 |
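As a quick consistency check, the per-type counts in the annotation statistics should add up to the Entity column of the data specification table. The sketch below copies the numbers from both tables; the helper name `find_mismatches` is hypothetical.

```python
from typing import Dict, List, Tuple

Key = Tuple[str, str]  # (corpus, set)

# Entity totals from the "Data specification" table.
entity_totals: Dict[Key, int] = {
    ("i2b2 2012", "Train"): 11239,
    ("i2b2 2012", "Test"): 9623,
    ("SNUH", "Train"): 18383,
    ("SNUH", "Test"): 17125,
    ("CoNLL 2003", "Train"): 23499,
    ("CoNLL 2003", "Test"): 5629,
}

# Per-type counts from the "Annotation statistics" tables.
per_type_counts: Dict[Key, List[int]] = {
    ("i2b2 2012", "Train"): [4962, 2558, 3719],
    ("i2b2 2012", "Test"): [4270, 2140, 3213],
    ("SNUH", "Train"): [3923, 4559, 5084, 3642, 1175],
    ("SNUH", "Test"): [3737, 3917, 4828, 3496, 1147],
    ("CoNLL 2003", "Train"): [7140, 6600, 6321, 3438],
    ("CoNLL 2003", "Test"): [1656, 1617, 1662, 694],
}


def find_mismatches(totals: Dict[Key, int], counts: Dict[Key, List[int]]) -> List[Key]:
    """Return the (corpus, set) pairs whose per-type counts do not sum to the total."""
    return [key for key, total in totals.items() if sum(counts[key]) != total]


print(find_mismatches(entity_totals, per_type_counts))  # [] means the tables agree
```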
Example sentences of the entity distances (single: entity not having a precursor)
| Type | Example sentence with entity annotation |
|---|---|
| single | The patient is a 28-year-old woman who is [HIV positive]problem for 2 years . |
| distance 0 | With [intravenous hydration]treatment [the BUN]test and … |
| distance 1 | … because of [pancytopenia]problem and [vomiting]problem on [DDI]treatment |
| distance | She was brought in for [an esophagogastroduodenoscopy]test on 9/26 but she basically was not sufficiently [sedated]treatment and readmitted at this time for [a GI work-up]test . |
Fig. 5. Histograms of distances between named entities in each corpus. A value of n on the x-axis means that n non-entity words lie between the two entities
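The distance used in the example sentences and in Fig. 5 can be computed directly from a tagged sentence. The sketch below assumes BIO-style tags (`B-`/`I-` prefixes and `O` for non-entities); that tag format and the function name `entity_distances` are assumptions for illustration, since the excerpt does not specify the encoding.

```python
from typing import List


def entity_distances(tags: List[str]) -> List[int]:
    """Count the non-entity tokens between each pair of consecutive entities.

    Assumes BIO-style tags: "B-x" starts an entity, "I-x" continues it,
    and "O" marks a non-entity token.
    """
    distances: List[int] = []
    gap = 0              # outside tokens seen since the previous entity ended
    seen_entity = False  # True once the first entity has been encountered
    in_entity = False    # True while the current token is inside an entity
    for tag in tags:
        if tag == "O":
            in_entity = False
            if seen_entity:
                gap += 1
        else:
            starts_new = tag.startswith("B-") or not in_entity
            if starts_new:
                if seen_entity:
                    distances.append(gap)
                seen_entity = True
                gap = 0
            in_entity = True
    return distances


# "... because of [pancytopenia] and [vomiting] on [DDI]"
tags = ["O", "O", "B-problem", "O", "B-problem", "O", "B-treatment"]
print(entity_distances(tags))  # [1, 1]
```

A sentence with a single entity yields an empty list, which corresponds to the "single" row of the example table.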
Summary of the feature settings. (w denotes the window size; if it is absent, only the feature of the current token is used. n denotes the order of the n-gram. 'len' denotes the length of the affixes. The matching features are the results of controlled vocabulary matching)
| Set | Token | Norm-token | n-gram | character affix | capitalization | POS/Chunk | Matching |
|---|---|---|---|---|---|---|---|
| #1-context | w = 3 | w = 3 | |||||
| #2-morph | w = 3 | w = 3 | le w = 3 | ||||
| #3-i2b2 | w = 5 | w = 5 | n = 2 w = 5 | len = 2~7 w = 3 | w = 1 | ||
| #3-snuh | w = 5 | w = 3 | n = 2 w = 5 | len = 2~3 | modifier /control | ||
| #3-conll | w = 5 | len = 3~4 w = 5 | w = 5 |
F1 scores of the first-order models and the pi-CRF for each corpus. The first value ('whole instance') is the F1 score on the whole test set, and the second value ('distanced instance') is the F1 score evaluated only on instances having a transition dependency between NEs. (bold: best performance, shaded: pi-CRF)
| Feature | Model | i2b2 2012 (whole instance) | i2b2 2012 (distanced instance) | SNUH (whole instance) | SNUH (distanced instance) | CoNLL 2003 (whole instance) | CoNLL 2003 (distanced instance) |
|---|---|---|---|---|---|---|---|
| Set 1 | 1st-order CRF | 67.22 | 68.24 | 74.75 | 73.20 | | |
| Set 1 | 1st-order CRF with induced labels | 66.60 | 67.69 | 74.09 | 72.85 | 23.38 | 15.24 |
| Set 1 | pi-CRF | **67.29** | **68.43** | | | 45.54 | 43.41 |
| Set 2 | 1st-order CRF | 71.61 | 72.85 | 75.81 | 75.04 | 68.43 | |
| Set 2 | 1st-order CRF with induced labels | 70.73 | 71.98 | 75.24 | 74.36 | 44.90 | 41.89 |
| Set 2 | pi-CRF | | | | | **69.61** | 72.31 |
| Set 3 | 1st-order CRF | 72.55 | 73.97 | 76.18 | 75.06 | | |
| Set 3 | 1st-order CRF with induced labels | 71.25 | 72.75 | 75.37 | 74.18 | 80.81 | 81.55 |
| Set 3 | pi-CRF | | | **76.28** | **75.45** | 82.08 | 82.76 |
F1 scores of the higher-order CRF models and the pi-CRF for each corpus. The first value ('whole instance') is the F1 score on the whole test set, and the second value ('distanced instance') is the F1 score evaluated only on instances having a transition dependency between NEs. (bold: best performance, shaded: pi-CRF)
| Feature | Model | i2b2 2012 (whole instance) | i2b2 2012 (distanced instance) | SNUH (whole instance) | SNUH (distanced instance) | CoNLL 2003 (whole instance) | CoNLL 2003 (distanced instance) |
|---|---|---|---|---|---|---|---|
| Set 1 | 2nd-order CRF | | | 73.43 | 72.21 | | |
| Set 1 | semi-Markov CRF | 67.87 | 68.91 | 73.44 | 71.61 | 37.31 | 34.13 |
| Set 1 | high-order CRF | 68.38 | 69.52 | 73.50 | 71.69 | 36.97 | 33.87 |
| Set 1 | pi-CRF | 67.29 | 68.43 | | | 45.54 | 43.41 |
| Set 2 | 2nd-order CRF | 70.99 | 72.31 | 74.31 | 73.27 | | 72.26 |
| Set 2 | semi-Markov CRF | 72.19 | 73.54 | 76.01 | 74.87 | 63.19 | 63.32 |
| Set 2 | high-order CRF | 71.50 | 72.74 | 76.11 | 74.97 | 63.56 | 63.76 |
| Set 2 | pi-CRF | | | | | 69.61 | **72.31** |
| Set 3 | 2nd-order CRF | 71.75 | 73.01 | 75.17 | 74.05 | | |
| Set 3 | semi-Markov CRF | 69.30 | 70.73 | 76.70 | 75.79 | 82.47 | 83.29 |
| Set 3 | high-order CRF | 69.26 | 70.64 | | | 82.18 | 82.80 |
| Set 3 | pi-CRF | | | 76.28 | 75.45 | 82.08 | 82.76 |
Efficiency test results. The numbers of parameters and states indicate the model’s size. The elapsed training/inference times indicate the model’s speed. (shaded: pi-CRF)
| Data | Model | Parameters | States | Elapsed training time (sec) | Training time per iteration (sec) | Elapsed inference time |
|---|---|---|---|---|---|---|
| i2b2 | 1st-order CRF | 442,705 | 8 | 1,550 | 12.5 | 1.7 |
| i2b2 | 2nd-order CRF | 581,604 | 64 | 6,819 | 55.4 | 5.7 |
| i2b2 | pi-CRF | 442,768 | 11 | 3,751 | 17.0 | 2.1 |
| SNUH | 1st-order CRF | 396,245 | 12 | 2,946 | 19.5 | 1.9 |
| SNUH | 2nd-order CRF | 495,772 | 144 | 27,388 | 139.7 | 9.3 |
| SNUH | pi-CRF | 396,400 | 17 | 6,231 | 23.6 | 2.1 |
| CoNLL | 1st-order CRF | 313,672 | 10 | 4,031 | 19.1 | 0.6 |
| CoNLL | 2nd-order CRF | 431,044 | 100 | 24,828 | 173.6 | 2.6 |
| CoNLL | pi-CRF | 313,776 | 14 | 13,512 | 29.4 | 0.7 |
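The state counts in the efficiency table follow a simple pattern: the second-order model has the square of the first-order state count, while the pi-CRF adds one state per entity type. This is an observation drawn from the table and the annotation statistics, not a formula stated in this excerpt. The sketch below checks it; the dictionary layout and the function name `check_state_pattern` are hypothetical.

```python
from typing import Dict

# State counts from the efficiency table; entity-type counts from the
# annotation statistics (i2b2: 3 types, SNUH: 5 types, CoNLL: 4 types).
corpora: Dict[str, Dict[str, int]] = {
    "i2b2": {"entity_types": 3, "first_order": 8, "second_order": 64, "pi_crf": 11},
    "SNUH": {"entity_types": 5, "first_order": 12, "second_order": 144, "pi_crf": 17},
    "CoNLL": {"entity_types": 4, "first_order": 10, "second_order": 100, "pi_crf": 14},
}


def check_state_pattern(row: Dict[str, int]) -> bool:
    """Check the observed pattern: quadratic growth for the 2nd-order CRF,
    one extra state per entity type for the pi-CRF."""
    quadratic = row["second_order"] == row["first_order"] ** 2
    additive = row["pi_crf"] == row["first_order"] + row["entity_types"]
    return quadratic and additive


for name, row in corpora.items():
    print(name, check_state_pattern(row))
```

Under this reading, the pi-CRF state space grows linearly with the number of entity types, which is in line with its per-iteration training time staying much closer to the first-order model than to the second-order model in the table.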
Numbers of entities predicted by each model ('expected') and predicted correctly ('correct') on each held-out set, alongside the gold-standard counts. (shaded: pi-CRF)
| Data | Model | Whole instances (gold) | Whole instances (expected) | Whole instances (correct) | Distanced instances (gold) | Distanced instances (expected) | Distanced instances (correct) |
|---|---|---|---|---|---|---|---|
| i2b2 (clinical) | 1st-order CRF | 9,623 | 7,361 | 5,708 | 8,552 | 6,188 | 4,927 |
| i2b2 (clinical) | 2nd-order CRF | 9,623 | 7,785 | 6,046 | 8,552 | 6,547 | 5,245 |
| i2b2 (clinical) | pi-CRF | 9,623 | 7,542 | 5,775 | 8,552 | 6,397 | 5,012 |
| SNUH (clinical) | 1st-order CRF | 17,125 | 15,326 | 12,128 | 12,520 | 10,813 | 8,540 |
| SNUH (clinical) | 2nd-order CRF | 17,125 | 15,702 | 12,053 | 12,520 | 11,088 | 8,524 |
| SNUH (clinical) | pi-CRF | 17,125 | 15,516 | 12,322 | 12,520 | 11,012 | 8,758 |
| CoNLL (general) | 1st-order CRF | 5,629 | 3,785 | 2,856 | 4,331 | 2,693 | 2,184 |
| CoNLL (general) | 2nd-order CRF | 5,629 | 2,778 | 2,529 | 4,331 | 1,986 | 1,799 |
| CoNLL (general) | pi-CRF | 5,629 | 1,855 | 1,704 | 4,331 | 1,280 | 1,218 |
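The gold, expected, and correct counts are enough to derive precision, recall, and F1. The sketch below uses the standard definitions (precision = correct / expected, recall = correct / gold) and only the whole-instance columns, since the excerpt does not say how the distanced-instance scores are computed. The function name `precision_recall_f1` is hypothetical. For the i2b2 whole-instance counts it yields 67.22 for the 1st-order CRF and 67.29 for the pi-CRF, which agrees with the Set 1 F1 scores reported for those models.

```python
from typing import Tuple


def precision_recall_f1(gold: int, expected: int, correct: int) -> Tuple[float, float, float]:
    """Compute precision, recall, and F1 from entity counts.

    gold:     number of entities in the reference annotation
    expected: number of entities the model predicted
    correct:  number of predicted entities that match the reference
    """
    precision = correct / expected
    recall = correct / gold
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Whole-instance counts for i2b2 from the table above.
for model, expected, correct in [("1st-order CRF", 7361, 5708), ("pi-CRF", 7542, 5775)]:
    p, r, f1 = precision_recall_f1(gold=9623, expected=expected, correct=correct)
    print(f"{model}: P={100 * p:.2f} R={100 * r:.2f} F1={100 * f1:.2f}")
```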
Fig. 6. Recall by distance between named entities in each corpus. The y-axis denotes the recall score; each numeric label on the x-axis denotes the set of entities separated from their precursors by that number of outside labels. (feature set: set #3)