CODER: Knowledge-infused cross-lingual medical term embedding for term normalization.

Zheng Yuan, Zhengyun Zhao, Haixia Sun, Jiao Li, Fei Wang, Sheng Yu.

Abstract

OBJECTIVE: This paper proposes a knowledge-aware embedding, a critical tool for medical term normalization.
METHODS: We develop CODER (Cross-lingual knowledge-infused medical term embedding) via contrastive learning on a medical knowledge graph (KG), the Unified Medical Language System (UMLS), with similarities calculated from both the terms and the relation triplets in the KG. Training with relations injects medical knowledge into the embeddings and can potentially improve their performance as machine learning features.
RESULTS: We evaluate CODER on zero-shot term normalization, semantic similarity, and relation classification benchmarks. The results show that CODER outperforms various state-of-the-art biomedical word embeddings, concept embeddings, and contextual embeddings.
CONCLUSION: CODER embeddings closely reflect the semantic similarity and relatedness of medical concepts. One can use CODER for embedding-based medical term normalization or to provide features for machine learning. Like other pretrained language models, CODER can also be fine-tuned for specific tasks. Code and models are available at https://github.com/GanjinZero/CODER.
Copyright © 2022 Elsevier Inc. All rights reserved.
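
The embedding-based term normalization mentioned in the conclusion can be illustrated with a minimal sketch: a query term is mapped to the dictionary concept whose embedding has the highest cosine similarity. Everything in the block is an assumption made for illustration. The 3-dimensional vectors are hand-written toy values standing in for real CODER embeddings, the concept labels (CONCEPT_A, etc.) are hypothetical placeholders, and the helper names cosine and normalize_term are not taken from the CODER repository.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def normalize_term(query_vec, dictionary):
    """Return (concept_id, score) for the dictionary entry closest to query_vec."""
    best_id, best_score = None, -2.0  # cosine is always >= -1
    for concept_id, vec in dictionary.items():
        score = cosine(query_vec, vec)
        if score > best_score:
            best_id, best_score = concept_id, score
    return best_id, best_score


# Toy stand-ins for term embeddings (hypothetical values, not model output).
toy_dictionary = {
    "CONCEPT_A": [0.9, 0.1, 0.0],
    "CONCEPT_B": [0.0, 1.0, 0.2],
    "CONCEPT_C": [0.1, 0.0, 1.0],
}
toy_query = [0.8, 0.2, 0.1]  # embedding of an unseen surface form (assumed)

predicted_id, similarity = normalize_term(toy_query, toy_dictionary)
print(predicted_id, round(similarity, 3))
```

In practice the toy vectors would be replaced by embeddings produced by the released model, and a linear scan like this one would typically give way to an approximate nearest-neighbour index once the dictionary grows large.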

Keywords:  Contrastive learning; Cross-lingual; Knowledge graph embedding; Medical term normalization; Medical term representation

Year: 2022; PMID: 34990838; DOI: 10.1016/j.jbi.2021.103983

Source DB: PubMed; Journal: J Biomed Inform; ISSN: 1532-0464; Impact factor: 6.317


  1 in total

1.  Medical terminology-based computing system: a lightweight post-processing solution for out-of-vocabulary multi-word terms.

Authors: Nadia Saeed; Hammad Naveed
Journal: Front Mol Biosci; Date: 2022-08-12
