Pierre Zweigenbaum, Thomas Lavergne, Natalia Grabar, Thierry Hamon, Sophie Rosset, Cyril Grouin.
Abstract
Medical entity recognition is currently generally performed by data-driven methods based on supervised machine learning. Expert-based systems, where linguistic and domain expertise are directly provided to the system, are often combined with data-driven systems. We present here a case study where an existing expert-based medical entity recognition system, Ogmios, is combined with a data-driven system, Caramba, based on a linear-chain Conditional Random Field (CRF) classifier. Our case study specifically highlights the risk of overfitting incurred by an expert-based system. We observe that it prevents the combination of the two systems from obtaining improvements in precision, recall, or F-measure, and analyze the underlying mechanisms through a post-hoc feature-level analysis. Wrapping the expert-based system alone as attributes input to a CRF classifier does boost its F-measure from 0.603 to 0.710, bringing it on par with the data-driven system. The generalization of this method remains to be further investigated.
Keywords: hybrid methods; information extraction; machine learning; medical records; natural language processing; overfitting
Year: 2013 PMID: 24052691 PMCID: PMC3776026 DOI: 10.4137/BII.S11770
Source DB: PubMed Journal: Biomed Inform Insights ISSN: 1178-2226
Annotation statistics in percentage on training and test corpora for the six clinical event types.
| | Clinical_dept | Evidential | Occurrence | Problem | Test | Treatment |
|---|---|---|---|---|---|---|
| Train | 6.05 | 4.49 | 19.95 | 30.50 | 15.76 | 23.25 |
| Test | 5.39 | 4.38 | 18.38 | 31.70 | 15.99 | 24.17 |
Features for CRF-based event identification.
| Feature |
|---|
| Section id among four sections we defined as follows: admission date (section #1), discharge date (#2), history of present illness (#3) and hospital course (#4) |
| Morpho-syntactic tagging with the Tree Tagger |
| Morpho-syntactic tags projected from a specific lexicon of 62,263 adjectives and 320,013 nouns based on the UMLS Specialist Lexicon |
| Semantic types and semantic groups from the UMLS |
| Semantic annotation (the six event types and other markers such as “anatomical part”, “localization”, “pre/post-examination”, “value unit”, etc.) with WMatch |
| Syntactic analysis with the Charniak-McClosky biomedical parser |
| Two series of unsupervised clusters obtained through Brown’s algorithm |
Ogmios: direct evaluation (training set).
| | P | R | F | Description |
|---|---|---|---|---|
| Ogmios | 0.8229 | 0.7079 | 0.7611 | Ogmios as is |
Abbreviations: P, Precision; R, Recall; F, F-measure.
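The F column can be checked against P and R. The sketch below assumes that F is the balanced F-measure (F1), that is, the harmonic mean of precision and recall; the source only says "F-measure", but the values in the table above are consistent with that reading. The function name `f_measure` is illustrative.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure (F1): harmonic mean of precision and recall.

    Assumption: equal weighting of P and R (beta = 1).
    """
    if precision + recall == 0:
        return 0.0  # avoid division by zero when nothing is found
    return 2 * precision * recall / (precision + recall)


# Ogmios on the training set, P and R taken from the table above.
print(round(f_measure(0.8229, 0.7079), 4))  # 0.7611
```

The same function can be applied to any P/R pair in the tables that follow.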
Caramba: best groups of patterns at first iteration (training set).
| P | R | F | Description |
|---|---|---|---|
| 0.7322 | 0.6452 | 0.6859 | B: Brown Beth_Partners unigrams |
| 0.6949 | 0.5407 | 0.6082 | Brown UMLS unigrams |
| 0.5239 | 0.3475 | 0.4179 | B: UMLS first or two Semantic Types |
| 0.4307 | 0.2908 | 0.3472 | Charniak-McClosky POS unigrams, bigrams, trigrams |
| 0.5209 | 0.2551 | 0.3425 | Wmatch only |
| 0.5137 | 0.2564 | 0.3421 | Wmatch only, BIO |
| 0.3378 | 0.1476 | 0.2055 | B: Charniak-McClosky chunk unigrams, bigrams, trigrams |
| 0.3378 | 0.0478 | 0.0837 | B: alphabetic or case unigrams |
| 0.7469 | 0.6809 | 0.7124 | Subset 1: All of the above |
Note: B: bigram of classes.
Caramba: best additional groups of patterns at second iteration (training set).
| P | R | F | Description |
|---|---|---|---|
| 0.7622 | 0.6904 | 0.7245 | *Lemma, from TreeTagger |
| 0.7684 | 0.6851 | 0.7244 | *B: Brown Beth_Partners unigrams |
| 0.7589 | 0.6898 | 0.7227 | *Normalized token |
| 0.7637 | 0.6857 | 0.7226 | *Specialist Lexicon syntactic category, with normalized token |
| 0.7624 | 0.6852 | 0.7217 | B: Specialist Lexicon syntactic category, with normalized token |
| 0.7575 | 0.6876 | 0.7209 | *TreeTagger POS, with normalized token |
| 0.7595 | 0.6856 | 0.7206 | B: lemma, from TreeTagger |
| 0.7648 | 0.6811 | 0.7205 | B: Brown UMLS unigrams |
| 0.7640 | 0.6796 | 0.7194 | *Section identifier |
| 0.7632 | 0.6799 | 0.7192 | *Digit |
| 0.7578 | 0.6837 | 0.7189 | B: Charniak-McClosky POS unigrams, bigrams, trigrams |
| 0.7590 | 0.6805 | 0.7176 | B: TreeTagger POS, with normalized token |
| 0.7570 | 0.6819 | 0.7175 | UMLS first or two Semantic Types |
| 0.7487 | 0.6887 | 0.7175 | *Date |
| 0.7561 | 0.6821 | 0.7172 | *Alphabetic or case |
| 0.7579 | 0.6802 | 0.7169 | B: Wmatch |
| 0.7627 | 0.6757 | 0.7166 | B: section identifier |
| 0.7561 | 0.6810 | 0.7166 | *B: TreeTagger chunk, BIO |
| 0.7607 | 0.6772 | 0.7165 | B: Wmatch, BIO |
| 0.7522 | 0.6775 | 0.7129 | B: date |
| 0.7527 | 0.7119 | 0.7317 | Subset 2: Subset 1 + all of the above |
| 0.7761 | 0.6957 | 0.7337 | Subset 3: Subset 1 + starred feature groups only |
Notes: B: bigram of classes. Each pattern group is added independently to the pool of Iteration 1 (ie, Subset 1).
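The procedure described in the notes, where each pattern group is added on its own to the Iteration 1 pool and scored, can be sketched as follows. This is only an illustration: `rank_additions` and the `evaluate` callback are hypothetical names, and `evaluate` stands in for training and scoring a CRF with the given feature groups. The criterion used to star a group is not stated in the source, so the sketch stops at ranking.

```python
from typing import Callable, FrozenSet, Iterable, List, Tuple


def rank_additions(
    base: Iterable[str],
    candidates: Iterable[str],
    evaluate: Callable[[FrozenSet[str]], float],
) -> List[Tuple[float, str]]:
    """Score base + one candidate group at a time, best F first."""
    pool = frozenset(base)
    scored = [(evaluate(pool | {group}), group) for group in candidates]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Toy stand-in for CRF training: look up F values reported in the table above.
reported_f = {"lemma": 0.7245, "digit": 0.7192, "date": 0.7175}


def toy_evaluate(groups: FrozenSet[str]) -> float:
    # Exactly one candidate is present in each evaluated pool.
    (added,) = groups & reported_f.keys()
    return reported_f[added]


ranked = rank_additions({"subset1"}, reported_f, toy_evaluate)
print(ranked)
```

Because every candidate is evaluated against the same base pool, the scores are independent of the order in which the groups are tried.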
Combinations of Ogmios (Og) with Caramba (Ca). 10-fold cross validation on the training corpus (except for Ogmios, first row), then application to the test corpus. Pairs of numbers (−n, +m) in the rest of this caption indicate the range of relative positions of n-grams of attributes. All feature sets in the CRF include bigrams of classes (B feature).
| System | Training Conlleval P | Training Conlleval R | Training Conlleval F | Training i2b2Evaluation P | Training i2b2Evaluation R | Training i2b2Evaluation F | Test Conlleval P | Test Conlleval R | Test Conlleval F | Test i2b2Evaluation P | Test i2b2Evaluation R | Test i2b2Evaluation F |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Og | 0.7079 | 0.8229 | 0.7611 | 0.8281 | 0.9602 | 0.8893 | 0.5681 | 0.6419 | | 0.7839 | 0.8852 | 0.8315 |
| Ca | 0.7761 | 0.6957 | | 0.9322 | 0.8336 | 0.8801 | 0.7541 | 0.6787 | | 0.9210 | 0.8282 | |
| OgF | 0.7581 | 0.8091 | 0.7828 | 0.8648 | 0.9206 | 0.8918 | 0.7469 | 0.6758 | | 0.9183 | 0.8303 | |
| OgT | 0.8483 | 0.8370 | | 0.9292 | 0.9144 | | 0.7443 | 0.6746 | | 0.9163 | 0.8299 | |
| OgCa | 0.8613 | 0.8477 | | 0.9362 | 0.9192 | | 0.7472 | 0.6795 | | 0.9159 | 0.8324 | |
Notes: Og: Ogmios alone, as is; Ca: Caramba alone; OgF: Ogmios output as only attributes: unigrams and bigrams of Ogmios attributes (−1, +1); OgT: Ogmios + normalized token: unigrams and bigrams of Ogmios attributes (−1, +1), with unigrams (−5, +3) and bigrams (−2, +1) of tokens, and one of the previous three tokens; OgCa: Ogmios as feature added to Caramba: unigrams and bigrams of Ogmios attributes (−1, +1), and above subset of Caramba features. Bold shows the (set of) best results per column; italics shows the lowest results when they are notable.
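As an illustration of the OgF configuration, the sketch below turns per-token Ogmios output into unigram and bigram attributes over the relative window (−1, +1). It is one possible reading of the notes, not the authors' implementation: the function name `window_features`, the attribute string format, the padding symbol, and the BIO-style labels in the example are all assumptions.

```python
from typing import List, Sequence


def window_features(
    labels: Sequence[str], i: int, lo: int = -1, hi: int = 1, pad: str = "_"
) -> List[str]:
    """Unigrams and adjacent bigrams of labels at offsets lo..hi around token i."""

    def at(k: int) -> str:
        # Positions outside the sentence get a padding symbol.
        return labels[k] if 0 <= k < len(labels) else pad

    # One unigram attribute per offset in the window.
    feats = [f"og[{d}]={at(i + d)}" for d in range(lo, hi + 1)]
    # One bigram attribute per pair of adjacent offsets in the window.
    feats += [
        f"og[{d},{d + 1}]={at(i + d)}|{at(i + d + 1)}" for d in range(lo, hi)
    ]
    return feats


# Hypothetical Ogmios output for a four-token sentence.
ogmios_labels = ["O", "B-Problem", "I-Problem", "O"]
for feat in window_features(ogmios_labels, 1):
    print(feat)
```

Each token's attribute list would then be passed to the CRF alongside the bigram-of-classes (B) feature mentioned in the caption; widening `lo` and `hi` gives token windows such as the (−5, +3) range used in OgT.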
Strongest groups of features to make a decision.
| Group of features | Range of weights | Sum of weights |
|---|---|---|
| Ogmios | ~[0.8; 4] | ~[2.7; 21.4] |
| Bigrams of classes feature | ~[1.5; 4] | ~[1.5; 4] |
| Total score | ~[10; 30] | |
| Total mass | Up to ~50 | |