Yan Xu, Luoxin Chen, Junsheng Wei, Sophia Ananiadou, Yubo Fan, Yi Qian, Eric I-Chao Chang, Junichi Tsujii.
Abstract
BACKGROUND: Electronic medical record (EMR) systems are widely used throughout the world to improve the quality of healthcare and the efficiency of hospital services. A bilingual Chinese-English medical lexicon is needed to meet the demand for multilingual and multinational treatment. We extract a bilingual lexicon from English and Chinese discharge summaries, starting from a small seed lexicon. Lexical terms fall into two categories: single-word terms (SWTs) and multi-word terms (MWTs). For SWTs, we use a context-based label propagation (LP) method to extract candidate translation pairs. For MWTs, which are pervasive in the medical domain, we propose a term alignment method that first obtains translation candidates for each component word of a Chinese MWT and then generates their combinations, from which the system selects a set of plausible translation candidates.
Year: 2015 PMID: 25956056 PMCID: PMC4424557 DOI: 10.1186/s12859-015-0606-0
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. A sentence in a Chinese discharge summary.
Figure 2. Construction of bilingual lexicons from discharge summaries in English and Chinese.
Figure 3. Multi-term alignment.
Translation candidates for “心房”, “颤动”, “吞咽”, and “困难”
| | | | |
|---|---|---|---|
| bradycardia | bradycardia | hematuria | hematuria |
| | | | syncope |
| acidosis | acidosis | syncope | |
| angina | angina | stools | fever |
| depression | rheumatoid | fever | stools |
| hypothyroidism | myocardial | acidosis | jaundice |
| syncope | pericarditis | bloody | anemia |
| encephalopathy | cesarean | jaundice | acidosis |
| heart | heart | anemia | sepsis |
| rheumatoid | leukemia | angina | bloody |
| glaucoma | glaucoma | sepsis | respiratory |
| dysphagia | cardiomyopathy | thrombocytopenia | angina |
| | encephalopathy | respiratory | encephalopathy |
| thrombocytopenia | hypothyroidism | bronchitis | leukemia |
| cardiomyopathy | syncope | bacteremia | thrombocytopenia |
| cesarean | palsy | leukemia | bacteremia |
| anemia | | fibrillation | fibrillation |
Bold marks the correct translations (“atrial fibrillation” for “心房颤动” and “dysphagia” for “吞咽困难”).
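The term alignment idea from the abstract (translate each component word of a Chinese MWT, combine the candidates, then keep the plausible combinations) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `align_multiword_term`, the candidate scores, the product-of-scores ranking, the `english_terms` plausibility filter, and the rule that collapses a repeated word (so that two Chinese components can map to one English word, as in “吞咽困难” and “dysphagia”) are all assumptions made for illustration.

```python
from itertools import product


def align_multiword_term(component_candidates, english_terms, top_n=5):
    """Combine per-component translation candidates into ranked term candidates.

    component_candidates: one list per component word of the Chinese MWT,
        each holding (english_word, score) pairs. Scores are hypothetical.
    english_terms: set of English terms used as a plausibility filter
        (an assumption of this sketch, e.g. terms seen in English summaries).
    """
    best = {}
    for combo in product(*component_candidates):
        words = []
        score = 1.0
        for word, word_score in combo:
            score *= word_score
            # Collapse repeats so that two components can map to one word.
            if word not in words:
                words.append(word)
        phrase = " ".join(words)
        # Keep only combinations that are known English terms,
        # remembering the best score seen for each phrase.
        if phrase in english_terms and score > best.get(phrase, 0.0):
            best[phrase] = score
    # Rank by descending score, breaking ties alphabetically.
    ranked = sorted(best.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_n]


# Illustrative inputs only: the scores below are invented, not taken from the paper.
english_terms = {"atrial fibrillation", "dysphagia", "heart failure"}
xinfang = [("atrial", 0.6), ("heart", 0.4)]            # candidates for 心房
chandong = [("fibrillation", 0.7), ("syncope", 0.3)]   # candidates for 颤动
tunyan = [("dysphagia", 0.5), ("fever", 0.5)]          # candidates for 吞咽
kunnan = [("dysphagia", 0.8), ("anemia", 0.2)]         # candidates for 困难

atrial_fibrillation = align_multiword_term([xinfang, chandong], english_terms)
dysphagia = align_multiword_term([tunyan, kunnan], english_terms)
print(atrial_fibrillation)
print(dysphagia)
```

Filtering against a set of attested English terms is only one possible way to select "plausible" combinations; the source does not specify the selection criterion.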
Lengths of the 37 multi-word translation pairs
| Lengths | Number of pairs |
|---|---|
| 2 to 1 | 11 |
| 2 to 2 | 14 |
| 3 to 2 | 2 |
| 3 to 3 | 1 |
| 1 to 2 | 7 |
| 2 to 3 | 2 |
Performance on SWT translation
| | | | |
|---|---|---|---|
| LP | 4.44% | 24.44% | 62.22% |
| Baseline | 3.70% | 8.89% | 20.74% |
Performance on MWT translation
| | | | |
|---|---|---|---|
| Multi-term alignment | 16.22% | 27.03% | 29.73% |
| LP | 5.41% | 8.11% | 16.22% |