Bo Sun1, Fei Zhang1, Jing Li2, Yicheng Yang3, Xiaolin Diao1, Wei Zhao4, Ting Shu5.
Abstract
BACKGROUND: With the development and application of medical information systems, semantic interoperability is essential for accurate and advanced health-related computing and electronic health record (EHR) information sharing. The openEHR approach can improve semantic interoperability. One key improvement of openEHR is that it allows existing archetypes to be reused. The crucial problem is how to improve precision and resolve ambiguity in archetype retrieval.
Keywords: Information retrieval; Interoperability; Natural language processing; OpenEHR
Year: 2021 PMID: 34174874 PMCID: PMC8234679 DOI: 10.1186/s12911-021-01554-2
Source DB: PubMed Journal: BMC Med Inform Decis Mak ISSN: 1472-6947 Impact factor: 2.796
Fig. 1 Approach description of this study. EHR: electronic health records; RIS: Radiology Information System; LIS: Laboratory Information Management System
Fig. 2 Process of query expansion. a Three key steps of query expansion: term segmentation, term expansion and term combination; b Example of query expansion: risk of side effects
Fig. 3 Process for constructing test sets at different medical professional levels: low, medium and high
Description of test sets
| Test sets | Target archetype: examples | Target archetype: number, n | Original search terms: examples | Original search terms: number, n |
|---|---|---|---|---|
| Low level set | Skeletal age, medicine management, etc. | 15 | Bone age, drug management, etc. | 40 |
| Medium level set | Fetal heartrate, care program, etc. | 15 | Fetal heartbeat, care plan, etc. | 40 |
| High level set | NA | NA | Blood transfusion label, admission diagnosis, etc. | 40 |
Fig. 4 Illustration of synonym expansion based on Word2Vec. In the training stage, Wikidata in Chinese is used to learn the vector representation of words. Cosine distance is then used to calculate the similarity between the user’s term and every term in the vocabulary; after ranking, the top 10 terms are returned as the expansion of the user’s term
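The ranking step described in the Fig. 4 caption can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `cosine_similarity` and `top_k_similar` are hypothetical, and the three-dimensional vectors are invented toy values standing in for embeddings learned by Word2Vec.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k_similar(term, vectors, k=10):
    """Rank every other vocabulary term by similarity to `term`, keep the first k."""
    query = vectors[term]
    scores = {
        other: cosine_similarity(query, vec)
        for other, vec in vectors.items()
        if other != term
    }
    return sorted(scores, key=scores.get, reverse=True)[:k]

# Toy vectors, invented for illustration only (not trained embeddings).
toy_vectors = {
    "risk": [1.0, 0.1, 0.0],
    "possibility": [0.9, 0.2, 0.1],
    "danger": [0.8, 0.0, 0.3],
    "archetype": [0.0, 1.0, 0.0],
}

expansion = top_k_similar("risk", toy_vectors, k=2)
print(expansion)
```

With real embeddings the vocabulary would hold many thousands of terms, so a vectorized similarity search would likely replace the explicit loop.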
Fig. 5 Two methods for assessing test sets. Assessment method A: for the Low and Medium Level Sets, compare the result with the ID in the original archetypes; Assessment method B: for the High Level Set, recruit two volunteers with clinical backgrounds to examine the results against the original EMR data
Process for constructing expansion terms (for example: risk of side effects)
| Original search term | Segmentation terms | Expansion terms: Top 3 (sum: 6) | Expansion terms: Top 5 (sum: 10) | Expansion terms: Top 10 (sum: 20) |
|---|---|---|---|---|
| Risk of side effects | Side effects | Side effect | Side effect | Side effect |
| | | Adverse reactions | Adverse reactions | Adverse reactions |
| | | … | Untoward reactions | Untoward reactions |
| | | | … | Anaphylaxis |
| | | | | … |
| | Risk | Risk | Risk | Risk |
| | | Possibility | Possibility | Possibility |
| | | … | Danger | Danger |
| | | | … | Probability |
| | | | | … |
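The "sum" figures in the column headers follow from taking the same number of candidates for each segmented term: two segments with three candidates each give six expansion terms. A minimal sketch of that step is below. The helper `expand_segments` is hypothetical, the candidate lists are copied from the table with elided entries ("…") left out, and their order is assumed to be the rank order.

```python
def expand_segments(segments, ranked_candidates, k):
    """Keep the first k ranked candidates for each segmented term."""
    return {segment: ranked_candidates[segment][:k] for segment in segments}

# Candidates as listed in the table; the order is assumed to be the rank order.
ranked_candidates = {
    "side effects": ["side effect", "adverse reactions", "untoward reactions", "anaphylaxis"],
    "risk": ["risk", "possibility", "danger", "probability"],
}

expanded = expand_segments(["side effects", "risk"], ranked_candidates, k=3)
total_terms = sum(len(terms) for terms in expanded.values())
print(expanded)
print(total_terms)
```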
Process for constructing combination terms (for example: risk of side effects)
| Original search term | Combination terms: Top 3 (sum: 9) | Combination terms: Top 5 (sum: 25) | Combination terms: Top 10 (sum: 100) |
|---|---|---|---|
| Risk of side effects | Side effect risk | Side effect risk | Side effect risk |
| | Side effect possibility | Side effect possibility | Side effect possibility |
| | … | Adverse reactions risk | Adverse reactions risk |
| | | Adverse reactions possibility | Adverse reactions possibility |
| | | … | Untoward reactions risk |
| | | | Untoward reactions possibility |
| | | | … |
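The sums of 9, 25 and 100 are consistent with pairing every candidate of one segment with every candidate of the other, that is, a Cartesian product of size k × k. The sketch below assumes this reading; the function `combine_terms` and the joining of parts with a space are illustrative choices, not details confirmed by the source.

```python
from itertools import product

def combine_terms(expansions):
    """Build one combination term per element of the Cartesian product."""
    return [" ".join(parts) for parts in product(*expansions)]

# Top 3 candidates per segment, taken from the expansion example (rank order assumed).
side_effect_terms = ["side effect", "adverse reactions", "untoward reactions"]
risk_terms = ["risk", "possibility", "danger"]

combination_terms = combine_terms([side_effect_terms, risk_terms])
print(len(combination_terms))
print(combination_terms[:2])
```

Because the count grows multiplicatively with the number of segments, queries that split into more than two segments produce far more combination terms than expansion terms.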
Results for constructing expansion terms & combination terms
| Test queries | Original search term, n | Expansion term, n: Top 3 | Expansion term, n: Top 5 | Expansion term, n: Top 10 | Combination term, n: Top 3 | Combination term, n: Top 5 | Combination term, n: Top 10 |
|---|---|---|---|---|---|---|---|
| Low level set | 40 | 216 | 360 | 719 | 312 | 840 | 3279 |
| Medium level set | 40 | 267 | 455 | 891 | 546 | 2085 | 13,712 |
| High level set | 40 | 303 | 505 | 1013 | 1050 | 6300 | 36,486 |
Manual assessment results of ‘High Level Set’
| Methods | Threshold | Volunteer A: AP | Volunteer A: P@3 | Volunteer A: P@5 | Volunteer B: AP | Volunteer B: P@3 | Volunteer B: P@5 | Final result: AP | Final result: P@3 | Final result: P@5 |
|---|---|---|---|---|---|---|---|---|---|---|
| Original search terms | Top 3 | 0.150 | 0.150 | | | | | | | |
| | Top 5 | | | | | | | | | |
| | Top 10 | | | | | | | | | |
| Expansion terms | Top 3 | 0.637 | 0.525 | 0.750 | 0.662 | 0.575 | 0.750 | | | |
| | Top 5 | 0.575 | 0.475 | 0.675 | 0.612 | 0.550 | 0.675 | | | |
| | Top 10 | 0.575 | 0.500 | 0.650 | 0.575 | 0.500 | 0.650 | | | |
| Combination terms | Top 3 | 0.150 | 0.150 | 0.150 | 0.175 | 0.175 | 0.175 | | | |
| | Top 5 | 0.150 | 0.150 | 0.150 | 0.175 | 0.175 | 0.175 | | | |
| | Top 10 | 0.150 | 0.150 | 0.150 | 0.200 | 0.200 | 0.200 | | | |

Final result values are emphasized in bold
Fig. 6 Comparison with the baseline model. a Comparison between expansion terms and combination terms. b Comparison between our method (Expansion terms & Top 3) and the baseline model
Retrieval performance comparison between expansion terms and combination terms with different similarity thresholds
| Test set | Methods | Threshold | AP | P@3 | P@5 |
|---|---|---|---|---|---|
| Low level set | Original search terms | | 0.050 | 0.050 | 0.050 |
| | Expansion terms | Top 3 | 0.963 | 0.950 | 0.975 |
| | | Top 5 | 0.975 | 0.950 | 1.000 |
| | | Top 10 | 0.963 | 0.950 | 0.975 |
| | | Mean | 0.967 | 0.950 | 0.983 |
| | Combination terms | Top 3 | 0.512 | 0.500 | 0.525 |
| | | Top 5 | 0.575 | 0.550 | 0.600 |
| | | Top 10 | 0.6125 | 0.600 | 0.625 |
| | | Mean | 0.567 | 0.550 | 0.583 |
| Medium level set | Original search terms | | 0.137 | 0.125 | 0.15 |
| | Expansion terms | Top 3 | 0.888 | 0.850 | 0.925 |
| | | Top 5 | 0.888 | 0.850 | 0.925 |
| | | Top 10 | 0.875 | 0.850 | 0.900 |
| | | Mean | 0.883 | 0.850 | 0.917 |
| | Combination terms | Top 3 | 0.200 | 0.200 | 0.200 |
| | | Top 5 | 0.200 | 0.200 | 0.200 |
| | | Top 10 | 0.250 | 0.250 | 0.250 |
| | | Mean | 0.217 | 0.217 | 0.217 |
| High level set | Original search terms | | 0.150 | 0.150 | 0.150 |
| | Expansion terms | Top 3 | 0.637 | 0.525 | 0.750 |
| | | Top 5 | 0.612 | 0.550 | 0.675 |
| | | Top 10 | 0.575 | 0.500 | 0.650 |
| | | Mean | 0.608 | 0.525 | 0.692 |
| | Combination terms | Top 3 | 0.150 | 0.150 | 0.150 |
| | | Top 5 | 0.175 | 0.175 | 0.175 |
| | | Top 10 | 0.175 | 0.175 | 0.175 |
| | | Mean | 0.167 | 0.167 | 0.167 |
Different similarity thresholds: during synonym expansion, the 3, 5, or 10 most similar terms are used as expansion terms and are then composed into combination terms
AP: average precision, P@3: precision at 3, P@5: precision at 5
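The record does not spell out how these metrics are computed. In the table, every AP value coincides with the arithmetic mean of P@3 and P@5, for example (0.950 + 0.975) / 2 = 0.9625 against the reported 0.963. The sketch below reproduces that relationship under two labeled assumptions: that P@k is the fraction of queries whose target archetype appears among the first k results, and that AP is the mean of P@3 and P@5. The function name and the archetype IDs are hypothetical.

```python
def precision_at_k(ranked_results, targets, k):
    """Assumed definition: share of queries whose target is in the first k results."""
    hits = sum(
        1 for results, target in zip(ranked_results, targets)
        if target in results[:k]
    )
    return hits / len(targets)

# Hypothetical ranked archetype IDs for four queries, with one target per query.
ranked_results = [
    ["a1", "a7", "a3", "a9", "a2"],  # target found at rank 1
    ["b4", "b1", "b8", "b2", "b5"],  # target found at rank 4
    ["c9", "c3", "c6", "c2", "c1"],  # target found at rank 5
    ["d5", "d6", "d7", "d8", "d9"],  # target not retrieved
]
targets = ["a1", "b2", "c1", "d0"]

p_at_3 = precision_at_k(ranked_results, targets, 3)
p_at_5 = precision_at_k(ranked_results, targets, 5)
ap = (p_at_3 + p_at_5) / 2  # assumption inferred from the table values
print(p_at_3, p_at_5, ap)
```

Under this reading each of the 40 queries contributes 0.025 to P@k, which matches the step size of the reported values.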