Elliot Schumacher, Mark Dredze.
Abstract
OBJECTIVES: An important component of processing medical texts is the identification of synonymous words or phrases. Synonyms can inform learned representations of patients or improve the linking of mentioned concepts to medical ontologies. However, medical synonyms can be lexically similar ("dilated RA" and "dilated RV") or dissimilar ("cerebrovascular accident" and "stroke"); contextual information can determine whether 2 strings are synonymous. Medical professionals use extensive variation in medical terminology, often not evidenced in structured medical resources. Therefore, the ability to discover synonyms, especially without reliance on training data, is an important component in processing training notes. The ability to discover synonyms from models trained on large amounts of unannotated data removes the need to rely on annotated pairs of similar words. Models relying solely on non-annotated data can be trained on a wider variety of texts without the cost of annotation, and thus may capture a broader variety of language.
Keywords: contextual representations; medical terminology; synonym discovery
Year: 2019 PMID: 32025651 PMCID: PMC6994012 DOI: 10.1093/jamiaopen/ooz057
Source DB: PubMed Journal: JAMIA Open ISSN: 2574-2531
Mean reciprocal rank, coverage, and top-1 accuracy, for pairwise identification of synonyms of the top 50 results run on the tuning data (n = 1275)
| Model | MRR | Cov. (%) | Top-1 (%) |
|---|---|---|---|
| ***Word2vec*** | 0.373 | 71.90 | 30.90 |
| Char. Bigram | 0.404 | 70.60 | 33.70 |
| ***Char. Trigram*** | 0.417 | 69.80 | 33.70 |
| Char. Fourgram | 0.414 | 68.80 | 34.00 |
| Context2vec | 0.374 | 58.10 | 31.80 |
| ***C2v + M2v*** | 0.385 | 63.90 | 28.40 |
| ELMo | | | |
| L0, Avg | 0.499 | 71.90 | 43.70 |
| ***L0, Max*** | 0.503 | 67.80 | |
| L1, Avg | 0.37 | 71.50 | 28.50 |
| L1, Max | 0.344 | 63.30 | 27.40 |
| L2, Avg | 0.299 | 71.10 | 21.90 |
| L2, Max | 0.291 | 63.70 | 22.20 |
| ELMo + M2v In | | | |
| ***L0, Avg*** | | | 44.50 |
| L0, Max | 0.488 | 69.80 | 43.50 |
| L1, Avg | 0.346 | 72.20 | 26.50 |
| L1, Max | 0.336 | 65.50 | 26.90 |
| L2, Avg | 0.279 | 70.00 | 19.80 |
| L2, Max | 0.281 | 63.00 | 21.10 |
| ELMo + M2v Out | | | |
| L0, Avg | 0.484 | 72.20 | 42.60 |
| ***L0, Max*** | 0.493 | 69.60 | 44.90 |
| L1, Avg | 0.371 | 73.30 | 29.10 |
| L1, Max | 0.351 | 66.80 | 27.70 |
| L2, Avg | 0.309 | 71.10 | 23.20 |
| L2, Max | 0.302 | 64.50 | 23.10 |
| BERT | | | |
| ***L1, Avg*** | 0.491 | 65.18 | 43.92 |
| L1, Max | 0.489 | 65.02 | 44.31 |
| L4, Avg | 0.438 | 61.96 | 39.37 |
| L4, Max | 0.435 | 63.29 | 38.43 |
| L8, Avg | 0.340 | 53.96 | 30.90 |
| L8, Max | 0.395 | 51.69 | 35.53 |
| L12, Avg | 0.324 | 47.22 | 29.33 |
| L12, Max | 0.372 | 50.67 | 33.25 |
Note: For ELMo and BERT models, L followed by a number (for example L0, L1, L2) indicates the layer, and Avg or Max indicates how token representations are combined for multi-word phrases. Bolded entries are the best performing result for each measure. We report test results for the models whose names are in bold italics, selecting the best model by MRR in each category (categories are marked by line separators).
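The three measures in the table can be illustrated with a minimal Python sketch. It assumes the usual definitions, which the page itself does not spell out: MRR is the mean of 1/rank of the first correct synonym within the top 50 candidates (0 when none is found), coverage is the share of mentions with any correct synonym in the top 50, and top-1 accuracy is the share of mentions whose first candidate is correct. The function names and the toy rankings are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Optional, Set


def first_correct_rank(ranked: List[str], gold: Set[str], k: int = 50) -> Optional[int]:
    """Return the 1-based rank of the first gold synonym in the top k, or None."""
    for rank, candidate in enumerate(ranked[:k], start=1):
        if candidate in gold:
            return rank
    return None


def evaluate(rankings: Dict[str, List[str]], gold: Dict[str, Set[str]], k: int = 50) -> Dict[str, float]:
    """Compute MRR, coverage, and top-1 accuracy over all query mentions."""
    ranks = [first_correct_rank(rankings[m], gold[m], k) for m in rankings]
    n = len(ranks)
    return {
        # Mentions with no correct synonym in the top k contribute 0 to MRR.
        "mrr": sum(1.0 / r for r in ranks if r is not None) / n,
        "coverage": sum(r is not None for r in ranks) / n,
        "top1": sum(r == 1 for r in ranks) / n,
    }


# Toy data (hypothetical): candidate lists are ordered by model similarity.
rankings = {
    "rib fx": ["fractures rib", "rib pain"],
    "septic": ["aseptic", "sepsis"],
    "nausea": ["masses", "vomiting"],
}
gold = {
    "rib fx": {"fractures rib"},
    "septic": {"sepsis"},
    "nausea": {"queasiness"},
}
scores = evaluate(rankings, gold)
print(scores)
```

In the toy data the first mention is matched at rank 1, the second at rank 2, and the third not at all, so each measure takes a different value and the three definitions are easy to tell apart.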
Figure 1. Architecture for Context2vec with ICD-9 representation integration.
Figure 2. Architecture for ELMo with Med2vec representation integration in output layer.
Figure 3. Architecture for ELMo with Med2vec representation integration in input layer.
Mean reciprocal rank, coverage, top-1 accuracy, and Jaro-Winkler average for correct top-1 synonyms, for pairwise identification of synonyms of the top 50 results run on the test data (n = 599)
| Model | MRR | Coverage (%) | Top-1 Acc. (%) | JW Top-1 |
|---|---|---|---|---|
| Word2vec | 0.355 | 69.40 | 29.20 | 0.798 |
| Char. Trigram | 0.359 | 67.90 | 28.00 | 0.826 |
| C2v + M2v | 0.335 | 60.60 | 28.60 | 0.719 |
| ELMo (L0, Max) | 0.474 | 62.40 | 43.10 | 0.838 |
| ELMo+M2v In (L0, Avg) | 0.476 | | 40.70 | 0.813 |
| ELMo+M2v Out (L0, Max) | | 63.40 | | 0.814 |
| BERT (L1, Avg) | 0.442 | 64.94 | 39.07 | 0.835 |
Note: Significance tests were performed using a two-sided Z-score test to compare the best performing models (bolded) to the baseline models. For Top-1, the best performing model is significantly better than the best performing non-ELMo model. For Coverage, the increase between Word2vec and the best model is not significant. Significance testing was not performed for MRR or Jaro-Winkler Top-1.
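A two-sided Z-score test on two accuracy figures can be sketched as follows. The note does not say which variant the authors used, so this assumes a pooled two-proportion z-test that treats the two result sets as independent samples; the function name is hypothetical. The example compares the Top-1 accuracies of ELMo (L0, Max) and Word2vec from the table above, purely as an illustration and not necessarily the comparison the authors ran.

```python
import math
from typing import Tuple


def two_proportion_z_test(p1: float, n1: int, p2: float, n2: int) -> Tuple[float, float]:
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    # Pooled proportion under the null hypothesis that both rates are equal.
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    z = (p1 - p2) / se
    # Two-sided p-value: 2 * (1 - Phi(|z|)) equals erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


# Top-1 accuracy on the test data (n = 599), values taken from the table.
z, p = two_proportion_z_test(0.431, 599, 0.292, 599)
print(f"z = {z:.2f}, two-sided p = {p:.2g}")
```

Because both models are scored on the same 599 test mentions, a paired test such as McNemar's would also be a reasonable choice; the independent-sample form is used here only because it needs nothing beyond the reported percentages.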
Selected correct Top-1 examples from the ELMo (L0, Max) Model tuning set results
| Ex. mention | Top-1 synonym |
|---|---|
| Varicosities | Varices |
| Left atrial enlargement | LA enlargement |
| Difficulty…breathing | Shortness of breath |
| Hypokinesis | Hypokinetic |
| Decreased responsiveness | Poorly responsive |
| Uterine fibroid | Fibroid |
| Mitral regurgitation | Mitral regurg |
| Rib fx | Fractures…rib |
| Septic | Sepsis |
Incorrect Top-1 examples are listed by category, along with the percentage of occurrence (from 400 reviewed Top-1 errors) from the ELMo (L0, Max) model tuning set results
| Category | Percentage | Mention | Top-1 synonym |
|---|---|---|---|
| Synonym overlap | 52% | Left atrium…dilated | Right atrium…dilated |
| | | Myocardial infarction | Inferior myocardial infarction |
| | | Diabetes mellitus | Diabetes mellitus type 2 |
| | | Aortic valve disease | Valvular heart disease |
| | | Dilated RA | Dilated RV |
| Abbreviation | 19% | LVH | MVP |
| | | NIDDM | DM |
| Morph. or lexical overlap | 16% | Hypokinesis | Akinesis |
| | | Bradycardic | Tachycardic |
| | | Cyanosis | Stenosis |
| No relation | 9% | Nausea | Masses |
| | | Clubbing | Bleeding |
| Similar concepts | 5% | Bleed | Bleeding |
| | | Cerebrovascular accident | Cerebrovascular accidents |
Figure 4. We performed dimensionality reduction using t-SNE on the tuning set mention representations from the ELMo (L0, Max) model, for a randomly selected 5% of unique mention strings.