
Automated ICD coding via unsupervised knowledge integration (UNITE).

Aaron Sonabend W, Winston Cai, Yuri Ahuja, Ashwin Ananthakrishnan, Zongqi Xia, Sheng Yu, Chuan Hong.

Abstract

OBJECTIVE: Accurate coding is critical for medical billing and electronic medical record (EMR)-based research. Recent research has focused on developing supervised methods to automatically assign International Classification of Diseases (ICD) codes from clinical notes. However, supervised approaches rely on ICD code data stored in the hospital EMR system and are therefore subject to bias arising from local practice and coding behavior. Consequently, the portability of trained supervised algorithms to external EMR systems may suffer.
METHOD: We developed an unsupervised knowledge integration (UNITE) algorithm to automatically assign ICD codes for a specific disease by analyzing clinical narrative notes via semantic relevance assessment. The algorithm was validated using coded ICD data for 6 diseases from Partners HealthCare (PHS) Biobank and Medical Information Mart for Intensive Care (MIMIC-III). We compared the performance of UNITE against penalized logistic regression (LR), topic modeling, and neural network models within each EMR system. We additionally evaluated the portability of UNITE by training at PHS Biobank and validating at MIMIC-III, and vice versa.
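The abstract does not spell out how the semantic relevance assessment is computed. As a rough illustration only, one common way to score how relevant a note is to a target disease is the cosine similarity between concept embedding vectors. The sketch below assumes that approach; the toy embeddings, the count-weighted average, and the names `cosine` and `note_relevance` are all hypothetical and are not taken from UNITE itself.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def note_relevance(concept_counts, embeddings, target):
    """Count-weighted mean similarity between a note's concepts and the target concept.

    concept_counts: {concept: number of mentions in the note} (hypothetical input format)
    embeddings:     {concept: vector}; concepts without a vector are skipped
    """
    target_vec = embeddings[target]
    total, weight = 0.0, 0
    for concept, count in concept_counts.items():
        if concept not in embeddings:
            continue
        total += count * cosine(embeddings[concept], target_vec)
        weight += count
    return total / weight if weight else 0.0


# Toy 2-dimensional embeddings, invented for illustration only.
EMBEDDINGS = {
    "ulcerative colitis": [1.0, 0.0],
    "colonoscopy": [0.8, 0.6],
    "fracture": [0.0, 1.0],
}

score_gi = note_relevance(
    {"colonoscopy": 2, "ulcerative colitis": 1}, EMBEDDINGS, "ulcerative colitis"
)
score_ortho = note_relevance({"fracture": 3}, EMBEDDINGS, "ulcerative colitis")
```

In this toy setup the note that mentions related concepts scores higher than the unrelated one. Because no ICD labels are used to compute the score, a scheme of this kind is unsupervised in the sense the abstract describes.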
RESULTS: UNITE achieved an average AUC of 0.91 at PHS and 0.92 at MIMIC-III over the 6 diseases, comparable to LR and multilayer perceptron (MLP) models and substantially better than topic models. With regard to portability, the performance of UNITE was consistent across EMR systems and superior to that of LR, topic models, and neural network models.
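For readers unfamiliar with the metric, an AUC averaged over diseases can be computed as sketched below. This is a minimal illustration assuming one binary label vector and one score vector per disease; the function names and toy data are hypothetical, and the values are not the study's results.

```python
def auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def mean_auc(per_disease):
    """Return (unweighted mean AUC, per-disease AUCs) for {disease: (labels, scores)}."""
    aucs = {name: auc(y, s) for name, (y, s) in per_disease.items()}
    return sum(aucs.values()) / len(aucs), aucs


# Toy data for two placeholder diseases, invented for illustration only.
toy = {
    "disease_a": ([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]),
    "disease_b": ([1, 0, 0], [0.8, 0.3, 0.2]),
}
average, by_disease = mean_auc(toy)
```

A portability check of the kind described would apply the same computation to scores from a model fitted at one institution and evaluated on labels from the other.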
CONCLUSION: UNITE accurately assigns ICD codes in EMRs without requiring human labor and has major advantages over commonly used machine learning approaches. In addition, UNITE attained stable performance and high portability across EMRs at different institutions.
Copyright © 2020 Elsevier B.V. All rights reserved.


Keywords:  Automated ICD assignment; Electronic medical records; Knowledge integration; Portability; Semantic embedding; Unsupervised learning


Year:  2020        PMID: 32361145      PMCID: PMC9410729          DOI: 10.1016/j.ijmedinf.2020.104135

Source DB:  PubMed          Journal:  Int J Med Inform        ISSN: 1386-5056            Impact factor:   4.730


