Automated grouping of medical codes via multiview banded spectral clustering.

Luwan Zhang, Yichi Zhang, Tianrun Cai, Yuri Ahuja, Zeling He, Yuk-Lam Ho, Andrew Beam, Kelly Cho, Robert Carroll, Joshua Denny, Isaac Kohane, Katherine Liao, Tianxi Cai.

Abstract

OBJECTIVE: With their increasingly widespread adoption, electronic health records (EHRs) have enabled phenotypic information extraction at an unprecedented granularity and scale. However, a medical concept (e.g., diagnosis, prescription, symptom) is often described by different synonyms across EHR systems, which hinders data integration for signal enhancement and complicates dimensionality reduction for knowledge discovery. Although ontologies and hierarchies exist, curating and maintaining them demands tremendous human effort, a process that is both unscalable and susceptible to subjective biases. This paper aims to develop a data-driven approach that automates the grouping of medical terms into clinically relevant concepts by combining multiple up-to-date data sources in an unbiased manner.
METHODS: We present a novel data-driven grouping approach, multi-view banded spectral clustering (mvBSC), which combines summary data from multiple healthcare systems. The proposed method consists of a banding step that leverages prior knowledge from the existing coding hierarchy, and a combining step that performs spectral clustering on an optimally weighted matrix.
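The two steps named above can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes each healthcare system contributes one symmetric code-by-code similarity matrix whose rows follow the order of the coding hierarchy, it interprets "banding" as zeroing entries far from the diagonal, and it takes the view weights as user-supplied inputs instead of deriving the optimal weighting, which the abstract does not specify. The names `band`, `kmeans`, `mv_banded_spectral`, and `demo_view` are hypothetical.

```python
import numpy as np


def band(sim, bandwidth):
    """Zero out entries more than `bandwidth` positions off the diagonal.

    Assumption: rows/columns are ordered by the coding hierarchy, so codes
    that are far apart in that order are treated as unrelated a priori.
    """
    idx = np.arange(sim.shape[0])
    mask = np.abs(idx[:, None] - idx[None, :]) <= bandwidth
    return np.where(mask, sim, 0.0)


def kmeans(points, k, n_iter=100):
    """Plain Lloyd's k-means with deterministic farthest-first initialization."""
    centers = [points[0]]
    for _ in range(1, k):
        dist = np.min([((points - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(points[dist.argmax()])
    centers = np.array(centers)
    for _ in range(n_iter):
        dists = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


def mv_banded_spectral(views, weights, bandwidth, k):
    """Band each view, combine with the given weights, then cluster spectrally."""
    combined = sum(w * band(s, bandwidth) for w, s in zip(weights, views))
    combined = (combined + combined.T) / 2            # enforce symmetry
    _, eigvecs = np.linalg.eigh(combined)             # eigenvalues ascending
    embedding = eigvecs[:, -k:]                       # top-k eigenvectors
    norms = np.linalg.norm(embedding, axis=1, keepdims=True)
    embedding = embedding / np.where(norms > 0, norms, 1.0)
    return kmeans(embedding, k)


def demo_view(within, across, sizes=(3, 3)):
    """Toy similarity matrix with contiguous groups (illustrative data only)."""
    groups = np.repeat(np.arange(len(sizes)), sizes)
    sim = np.where(groups[:, None] == groups[None, :], within, across)
    np.fill_diagonal(sim, 1.0)
    return sim


# Two toy "healthcare systems" describing the same six codes.
views = [demo_view(0.9, 0.1), demo_view(0.8, 0.2)]
labels = mv_banded_spectral(views, weights=[0.5, 0.5], bandwidth=2, k=2)
print(labels)
```

In this toy setting the first three codes and the last three codes end up in separate clusters. Equal weights are used purely for illustration; a faithful reproduction would replace them with the weighting scheme defined in the paper.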
RESULTS: We apply the proposed method to group ICD-9 and ICD-10-CM codes together by integrating data from two healthcare systems. We show grouping results and hierarchies for 13 representative disease categories. Individual grouping quality was evaluated using normalized mutual information, the adjusted Rand index, and the F1-measure, and the groupings were consistently found to be highly similar to their existing manually curated counterparts. The resulting ICD groupings are also comparably interpretable and well aligned with the current ICD hierarchy.
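For readers unfamiliar with the three agreement measures, the following standard-library Python sketch shows one common way to compute them for a pair of groupings. Two choices here are assumptions, since the abstract does not state which variants were used: the normalized mutual information is normalized by the arithmetic mean of the two entropies, and the F1-measure is computed over pairs of codes (a pair counts as positive when both codes share a group). The function names and the toy labels are hypothetical.

```python
from collections import Counter
from math import comb, log


def _pair_counts(truth, pred):
    """Number of code pairs grouped together jointly, in truth, and in pred."""
    joint = sum(comb(c, 2) for c in Counter(zip(truth, pred)).values())
    in_truth = sum(comb(c, 2) for c in Counter(truth).values())
    in_pred = sum(comb(c, 2) for c in Counter(pred).values())
    return joint, in_truth, in_pred


def adjusted_rand_index(truth, pred):
    joint, in_truth, in_pred = _pair_counts(truth, pred)
    expected = in_truth * in_pred / comb(len(truth), 2)
    max_index = (in_truth + in_pred) / 2
    if max_index == expected:          # degenerate case, e.g. one group each
        return 1.0
    return (joint - expected) / (max_index - expected)


def _entropy(labels):
    n = len(labels)
    return -sum(c / n * log(c / n) for c in Counter(labels).values())


def normalized_mutual_information(truth, pred):
    n = len(truth)
    ct, cp = Counter(truth), Counter(pred)
    mi = sum(
        c / n * log(c * n / (ct[t] * cp[p]))
        for (t, p), c in Counter(zip(truth, pred)).items()
    )
    denom = (_entropy(truth) + _entropy(pred)) / 2   # arithmetic-mean variant
    return mi / denom if denom > 0 else 1.0


def pairwise_f1(truth, pred):
    joint, in_truth, in_pred = _pair_counts(truth, pred)
    if in_truth + in_pred == 0:
        return 1.0
    return 2 * joint / (in_truth + in_pred)   # harmonic mean of pair precision/recall


# Toy example: a manual grouping versus an automated grouping of six codes.
manual = [0, 0, 0, 1, 1, 1]
automated = [0, 0, 1, 1, 1, 1]
print(adjusted_rand_index(manual, automated))
print(normalized_mutual_information(manual, automated))
print(pairwise_f1(manual, automated))
```

All three measures are invariant to how the groups are labeled, so an automated grouping that matches the manual one up to a renaming of the groups scores 1 on each.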
CONCLUSION: By systematically leveraging multiple data sources, the proposed approach is able to overcome bias while maximizing consensus to achieve generalizability. It is efficient, scalable, and adaptive to the evolving human knowledge reflected in the data, and represents a significant step toward automating medical knowledge integration.
Copyright © 2019. Published by Elsevier Inc.

Keywords:  Data-driven grouping; Electronic health records (EHR); International Classification of Disease (ICD); Multiple data sources; Spectral clustering

Year:  2019        PMID: 31672532      PMCID: PMC7261410          DOI: 10.1016/j.jbi.2019.103322

Source DB:  PubMed          Journal:  J Biomed Inform        ISSN: 1532-0464            Impact factor:   6.317


  4 in total

Review 1.  Using Phecodes for Research with the Electronic Health Record: From PheWAS to PheRS.

Authors:  Lisa Bastarache
Journal:  Annu Rev Biomed Data Sci       Date:  2021-07-20

2.  A framework for employing longitudinally collected multicenter electronic health records to stratify heterogeneous patient populations on disease history.

Authors:  Marc P Maurits; Ilya Korsunsky; Soumya Raychaudhuri; Shawn N Murphy; Jordan W Smoller; Scott T Weiss; Thomas W J Huizinga; Marcel J T Reinders; Elizabeth W Karlson; Erik B van den Akker; Rachel Knevel
Journal:  J Am Med Inform Assoc       Date:  2022-04-13       Impact factor: 7.942

Review 3.  Phenotype clustering in health care: A narrative review for clinicians.

Authors:  Tyler J Loftus; Benjamin Shickel; Jeremy A Balch; Patrick J Tighe; Kenneth L Abbott; Brian Fazzone; Erik M Anderson; Jared Rozowsky; Tezcan Ozrazgat-Baslanti; Yuanfang Ren; Scott A Berceli; William R Hogan; Philip A Efron; J Randall Moorman; Parisa Rashidi; Gilbert R Upchurch; Azra Bihorac
Journal:  Front Artif Intell       Date:  2022-08-12

4.  Deep phenotyping: Embracing complexity and temporality-Towards scalability, portability, and interoperability.

Authors:  Chunhua Weng; Nigam H Shah; George Hripcsak
Journal:  J Biomed Inform       Date:  2020-04-23       Impact factor: 6.317
