Learning probabilistic phenotypes from heterogeneous EHR data.

Rimma Pivovarov, Adler J Perotte, Edouard Grave, John Angiolillo, Chris H Wiggins, Noémie Elhadad.

Abstract

We present the Unsupervised Phenome Model (UPhenome), a probabilistic graphical model for large-scale discovery of computational models of disease, or phenotypes. We tackle this challenge through the joint modeling of a large set of diseases and a large set of clinical observations. The observations are drawn directly from heterogeneous patient record data (notes, laboratory tests, medications, and diagnosis codes), and the diseases are modeled in an unsupervised fashion. We apply UPhenome to two qualitatively different mixtures of patients and diseases: records of extremely sick patients in the intensive care unit with constant monitoring, and records of outpatients regularly followed by care providers over multiple years. We demonstrate that UPhenome can learn from these different care settings without any additional adaptation. Our experiments show that (i) the learned phenotypes combine the heterogeneous data types more coherently than baseline LDA-based phenotypes; (ii) they represent a single disease, rather than a mix of diseases, more often than the baseline phenotypes do; and (iii) when applied to unseen patient records, they are correlated with the patients' ground-truth disorders. Code for training, inference, and quantitative evaluation is made available to the research community.
Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.

Keywords:  Clinical phenotype modeling; Computational disease models; Electronic health record; Medical information systems; Phenotyping; Probabilistic modeling

Year:  2015        PMID: 26464024      PMCID: PMC8025140          DOI: 10.1016/j.jbi.2015.10.001

Source DB:  PubMed          Journal:  J Biomed Inform        ISSN: 1532-0464            Impact factor:   6.317


References (24 in total)

1.  The Unified Medical Language System (UMLS): integrating biomedical terminology.

Authors:  Olivier Bodenreider
Journal:  Nucleic Acids Res       Date:  2004-01-01       Impact factor: 16.971

2.  Finding scientific topics.

Authors:  Thomas L Griffiths; Mark Steyvers
Journal:  Proc Natl Acad Sci U S A       Date:  2004-02-10       Impact factor: 11.205

3.  Incorporating Domain Knowledge into Topic Modeling via Dirichlet Forest Priors.

Authors:  David Andrzejewski; Xiaojin Zhu; Mark Craven
Journal:  Proc Int Conf Mach Learn       Date:  2009

4.  Validation of electronic medical record-based phenotyping algorithms: results and lessons learned from the eMERGE network.

Authors:  Katherine M Newton; Peggy L Peissig; Abel Ngo Kho; Suzette J Bielinski; Richard L Berg; Vidhu Choudhary; Melissa Basford; Christopher G Chute; Iftikhar J Kullo; Rongling Li; Jennifer A Pacheco; Luke V Rasmussen; Leslie Spangler; Joshua C Denny
Journal:  J Am Med Inform Assoc       Date:  2013-03-26       Impact factor: 4.497

5.  Clinical Case-based Retrieval Using Latent Topic Analysis.

Authors:  Corey W Arnold; Suzie M El-Saden; Alex A T Bui; Ricky Taira
Journal:  AMIA Annu Symp Proc       Date:  2010-11-13

6.  Comorbidity clusters in autism spectrum disorders: an electronic health record time-series analysis.

Authors:  Finale Doshi-Velez; Yaorong Ge; Isaac Kohane
Journal:  Pediatrics       Date:  2013-12-09       Impact factor: 7.124

7.  A dynamic network approach for the study of human phenotypes.

Authors:  César A Hidalgo; Nicholas Blumm; Albert-László Barabási; Nicholas A Christakis
Journal:  PLoS Comput Biol       Date:  2009-04-10       Impact factor: 4.475

8.  Unfolding Physiological State: Mortality Modelling in Intensive Care Units.

Authors:  Marzyeh Ghassemi; Tristan Naumann; Finale Doshi-Velez; Nicole Brimmer; Rohit Joshi; Anna Rumshisky; Peter Szolovits
Journal:  KDD       Date:  2014-08-24

9.  Computational phenotype discovery using unsupervised feature learning over noisy, sparse, and irregular clinical data.

Authors:  Thomas A Lasko; Joshua C Denny; Mia A Levy
Journal:  PLoS One       Date:  2013-06-24       Impact factor: 3.240

10.  Risk prediction for chronic kidney disease progression using heterogeneous electronic health record data and time series analysis.

Authors:  Adler Perotte; Rajesh Ranganath; Jamie S Hirsch; David Blei; Noémie Elhadad
Journal:  J Am Med Inform Assoc       Date:  2015-04-20       Impact factor: 4.497

Cited by (40 in total)

1.  Can Patient Record Summarization Support Quality Metric Abstraction?

Authors:  Rimma Pivovarov; Yael Judith Coppleson; Sharon Lipsky Gorman; David K Vawdrey; Noémie Elhadad
Journal:  AMIA Annu Symp Proc       Date:  2017-02-10

2.  Learning endometriosis phenotypes from patient-generated data.

Authors:  Iñigo Urteaga; Mollie McKillop; Noémie Elhadad
Journal:  NPJ Digit Med       Date:  2020-06-24

3.  Aspiring to Unintended Consequences of Natural Language Processing: A Review of Recent Developments in Clinical and Consumer-Generated Text Processing. (Review)

Authors:  D Demner-Fushman; N Elhadad
Journal:  Yearb Med Inform       Date:  2016-11-10

4.  Clinical Research Informatics Contributions from 2015.

Authors:  C Daniel; R Choquet
Journal:  Yearb Med Inform       Date:  2016-11-10

5.  Detecting time-evolving phenotypic topics via tensor factorization on electronic health records: Cardiovascular disease case study.

Authors:  Juan Zhao; Yun Zhang; David J Schlueter; Patrick Wu; Vern Eric Kerchberger; S Trent Rosenbloom; Quinn S Wells; QiPing Feng; Joshua C Denny; Wei-Qi Wei
Journal:  J Biomed Inform       Date:  2019-08-22       Impact factor: 6.317

6.  A Computable Phenotype Improves Cohort Ascertainment in a Pediatric Pulmonary Hypertension Registry.

Authors:  Alon Geva; Jessica L Gronsbell; Tianxi Cai; Tianrun Cai; Shawn N Murphy; Jessica C Lyons; Michelle M Heinz; Marc D Natter; Nandan Patibandla; Jonathan Bickel; Mary P Mullen; Kenneth D Mandl
Journal:  J Pediatr       Date:  2017-06-16       Impact factor: 4.406

7.  Using Clinical Notes and Natural Language Processing for Automated HIV Risk Assessment.

Authors:  Daniel J Feller; Jason Zucker; Michael T Yin; Peter Gordon; Noémie Elhadad
Journal:  J Acquir Immune Defic Syndr       Date:  2018-02-01       Impact factor: 3.731

8.  Joint Learning of Representations of Medical Concepts and Words from EHR Data.

Authors:  Tian Bai; Ashis Kumar Chanda; Brian L Egleston; Slobodan Vucetic
Journal:  Proceedings (IEEE Int Conf Bioinformatics Biomed)       Date:  2017-12-18

9.  A Review of Challenges and Opportunities in Machine Learning for Health.

Authors:  Marzyeh Ghassemi; Tristan Naumann; Peter Schulam; Andrew L Beam; Irene Y Chen; Rajesh Ranganath
Journal:  AMIA Jt Summits Transl Sci Proc       Date:  2020-05-30

10.  Automated disease cohort selection using word embeddings from Electronic Health Records.

Authors:  Benjamin S Glicksberg; Riccardo Miotto; Kipp W Johnson; Khader Shameer; Li Li; Rong Chen; Joel T Dudley
Journal:  Pac Symp Biocomput       Date:  2018