
Medical Concept Representation Learning from Multi-source Data.

Tian Bai, Brian L Egleston, Richard Bleicher, Slobodan Vucetic.

Abstract

Representing words as low-dimensional vectors is very useful in many natural language processing tasks. This idea has been extended to the medical domain, where medical codes listed in medical claims are represented as vectors to facilitate exploratory analysis and predictive modeling. However, depending on the type of medical provider, medical claims can use medical codes from different ontologies or from a combination of ontologies, which complicates learning of the representations. To properly utilize such multi-source medical claim data, we propose an approach that represents medical codes from different ontologies in the same vector space. We first modify the Pointwise Mutual Information (PMI) measure of similarity between the codes. We then develop a new negative sampling method for the word2vec model that implicitly factorizes the modified PMI matrix. The new approach was evaluated on the code cross-reference problem, which aims at identifying similar codes across different ontologies. In our experiments, we evaluated cross-referencing between the ICD-9 and CPT medical code ontologies. Our results indicate that vector representations of codes learned by the proposed approach provide superior cross-referencing when compared to several existing approaches.
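
For readers unfamiliar with PMI, the following is a minimal Python sketch of the standard (unmodified) measure computed from code co-occurrence within claims. It is not the modified PMI proposed in the paper, whose definition is not given in this abstract. The function name `pmi_table` and the codes `ICD9_A`, `ICD9_B`, `CPT_X`, `CPT_Y` are hypothetical placeholders, and treating each claim as a single co-occurrence window is an assumption made for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_table(claims):
    """Return {(code_a, code_b): PMI} for code pairs that co-occur in a claim.

    Standard PMI only; the paper's modified PMI is not reproduced here.
    Each claim is treated as one co-occurrence window (an assumption).
    """
    code_counts = Counter()
    pair_counts = Counter()
    for claim in claims:
        codes = sorted(set(claim))  # de-duplicate and fix the pair order
        code_counts.update(codes)
        pair_counts.update(combinations(codes, 2))

    n = len(claims)
    table = {}
    for (a, b), n_ab in pair_counts.items():
        # PMI(a, b) = log( p(a, b) / (p(a) * p(b)) ),
        # with probabilities estimated as fractions of claims.
        p_ab = n_ab / n
        p_a = code_counts[a] / n
        p_b = code_counts[b] / n
        table[(a, b)] = math.log(p_ab / (p_a * p_b))
    return table


# Toy claims mixing two hypothetical ontologies (placeholder codes).
claims = [
    ["ICD9_A", "CPT_X"],
    ["ICD9_A", "CPT_X"],
    ["ICD9_B", "CPT_Y"],
    ["ICD9_B", "CPT_Y", "CPT_X"],
]
pmi = pmi_table(claims)
for pair, value in sorted(pmi.items()):
    print(pair, round(value, 3))
```

In this toy data, `CPT_Y` and `ICD9_B` always appear together and receive a positive PMI, while `CPT_X` and `ICD9_B` co-occur less often than chance would predict and receive a negative PMI. Pairs that never co-occur are simply absent from the table.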


Year: 2019 | PMID: 32116463 | PMCID: PMC7047512 | DOI: 10.24963/ijcai.2019/680

Source DB: PubMed | Journal: IJCAI (U S) | ISSN: 1045-0823


