
Deep representation learning of electronic health records to unlock patient stratification at scale.

Isotta Landi, Benjamin S Glicksberg, Hao-Chih Lee, Sarah Cherng, Giulia Landi, Matteo Danieletto, Joel T Dudley, Cesare Furlanello, Riccardo Miotto.

Abstract

Deriving disease subtypes from electronic health records (EHRs) can guide next-generation personalized medicine. However, challenges in summarizing and representing patient data prevent widespread practice of scalable EHR-based stratification analysis. Here we present an unsupervised framework based on deep learning to process heterogeneous EHRs and derive patient representations that can efficiently and effectively enable patient stratification at scale. We considered EHRs of 1,608,741 patients from a diverse hospital cohort comprising a total of 57,464 clinical concepts. We introduce a representation learning model based on word embeddings, convolutional neural networks, and autoencoders (i.e., ConvAE) to transform patient trajectories into low-dimensional latent vectors. We evaluated these representations as broadly enabling patient stratification by applying hierarchical clustering to different multi-disease and disease-specific patient cohorts. ConvAE significantly outperformed several baselines in a clustering task to identify patients with different complex conditions, with 2.61 entropy and 0.31 purity average scores. When applied to stratify patients within a certain condition, ConvAE led to various clinically relevant subtypes for different disorders, including type 2 diabetes, Parkinson's disease, and Alzheimer's disease, largely related to comorbidities, disease progression, and symptom severity. With these results, we demonstrate that ConvAE can generate patient representations that lead to clinically meaningful insights. This scalable framework can help better understand varying etiologies in heterogeneous sub-populations and unlock patterns for EHR-based research in the realm of personalized medicine.
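The entropy and purity scores quoted in the abstract are external clustering metrics: they compare cluster assignments against known disease labels. The abstract does not give the exact formulas used, so the following is only a minimal sketch under the common textbook definitions (purity as the fraction of patients who fall in the majority label of their cluster, entropy as the size-weighted Shannon entropy of labels within each cluster). The function name `cluster_purity_entropy`, the base-2 logarithm, and the toy labels are assumptions made for illustration, not details taken from the paper.

```python
from collections import Counter, defaultdict
from math import log2


def cluster_purity_entropy(cluster_ids, true_labels):
    """Return (purity, entropy) of a clustering against reference labels.

    Assumed definitions (not confirmed by the abstract):
      purity  = (1/N) * sum over clusters of the majority-label count
      entropy = sum over clusters of (n_k/N) * H(labels in cluster k), log base 2
    """
    if len(cluster_ids) != len(true_labels):
        raise ValueError("cluster_ids and true_labels must have the same length")
    if not cluster_ids:
        raise ValueError("at least one sample is required")

    n = len(cluster_ids)

    # Count how often each reference label occurs inside each cluster.
    label_counts = defaultdict(Counter)
    for cluster, label in zip(cluster_ids, true_labels):
        label_counts[cluster][label] += 1

    # Purity: each cluster contributes the size of its majority label.
    purity = sum(max(counts.values()) for counts in label_counts.values()) / n

    # Entropy: Shannon entropy per cluster, weighted by cluster size.
    entropy = 0.0
    for counts in label_counts.values():
        size = sum(counts.values())
        h = -sum((c / size) * log2(c / size) for c in counts.values())
        entropy += (size / n) * h

    return purity, entropy


# Toy example with hypothetical disease labels (illustrative only).
clusters = [0, 0, 0, 1, 1, 1]
labels = ["T2D", "T2D", "PD", "PD", "PD", "AD"]
purity, entropy = cluster_purity_entropy(clusters, labels)
print(f"purity={purity:.3f} entropy={entropy:.3f}")
```

Under these definitions higher purity and lower entropy both indicate clusters that align more closely with the reference conditions.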

Year:  2020        PMID: 33597713     DOI: 10.1038/s41746-020-0301-z

Source DB:  PubMed          Journal:  NPJ Digit Med        ISSN: 2398-6352



