High-throughput phenotyping with temporal sequences.

Hossein Estiri, Zachary H Strasser, Shawn N Murphy.

Abstract

OBJECTIVE: High-throughput electronic phenotyping algorithms can accelerate translational research using data from electronic health record (EHR) systems. The temporal information buried in EHRs is often underutilized in developing computational phenotypic definitions. This study aims to develop a high-throughput phenotyping method, leveraging temporal sequential patterns from EHRs.
MATERIALS AND METHODS: We develop a representation mining algorithm to extract 5 classes of representations from EHR diagnosis and medication records: the aggregated vector of the records (aggregated vector representation), the standard sequential patterns (sequential pattern mining), the transitive sequential patterns (transitive sequential pattern mining), and 2 hybrid classes. Using EHR data on 10 phenotypes from the Mass General Brigham Biobank, we train and validate phenotyping algorithms.
RESULTS: Phenotyping with temporal sequences resulted in superior classification performance across all 10 phenotypes compared with the standard representations in electronic phenotyping. The high-throughput algorithm's classification performance was superior or similar to that of previously published electronic phenotyping algorithms. We characterize and evaluate the top transitive sequences of diagnosis records paired with the records of risk factors, symptoms, complications, medications, or vaccinations.
DISCUSSION: The proposed high-throughput phenotyping approach enables seamless discovery of sequential record combinations that may be difficult to assume from raw EHR data. Transitive sequences offer a more accurate characterization of the phenotype than its individual components, and reflect the actual lived experiences of the patients with that particular disease.
CONCLUSION: Sequential data representations provide a precise mechanism for incorporating raw EHR records into downstream machine learning. Our approach starts with user interpretability and works backward to the technology.
© The Author(s) 2020. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For permissions, please email: journals.permissions@oup.com.
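
The abstract names the representation classes in MATERIALS AND METHODS but does not define them. As a minimal sketch, the Python below makes these assumptions (none are stated in the abstract): a patient's records are `(date, code)` pairs with ISO dates; the aggregated vector is a per-code count; a "standard" sequence pairs each code with the next distinct code by first occurrence; and a "transitive" sequence pairs each code with every code that first occurs later. The function names and the sample records are hypothetical, and the 2 hybrid classes are left out because the abstract does not describe them.

```python
from collections import Counter
from itertools import combinations


def first_occurrence_order(records):
    """Return the distinct codes ordered by the date each first appears."""
    first_seen = {}
    for date, code in sorted(records):
        first_seen.setdefault(code, date)  # keep only the earliest date
    return sorted(first_seen, key=first_seen.get)


def aggregated_vector(records):
    """Count how often each code occurs, ignoring time (assumed definition)."""
    return Counter(code for _, code in records)


def consecutive_sequences(records):
    """Pairs of immediately adjacent codes (assumed 'standard' patterns)."""
    ordered = first_occurrence_order(records)
    return set(zip(ordered, ordered[1:]))


def transitive_sequences(records):
    """Every (earlier, later) pair of codes (assumed 'transitive' patterns)."""
    return set(combinations(first_occurrence_order(records), 2))


# Hypothetical patient record, for illustration only.
records = [
    ("2015-03-01", "dx:type 2 diabetes"),
    ("2015-03-20", "rx:metformin"),
    ("2016-01-10", "rx:metformin"),
    ("2018-07-05", "dx:neuropathy"),
]

print(aggregated_vector(records))
print(sorted(consecutive_sequences(records)))
print(sorted(transitive_sequences(records)))
```

Under these assumptions, the transitive set contains the pair of diabetes followed by neuropathy, which the consecutive set misses because a medication record falls between the two diagnoses.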

Keywords:  electronic health records; phenotyping; sequential pattern mining; temporal data representation

Year: 2021; PMID: 33313899; PMCID: PMC7973443; DOI: 10.1093/jamia/ocaa288

Source DB: PubMed; Journal: J Am Med Inform Assoc; ISSN: 1067-5027; Impact factor: 4.497

