High-throughput phenotyping with electronic medical record data using a common semi-supervised approach (PheCAP).

Yichi Zhang, Tianrun Cai, Sheng Yu, Kelly Cho, Chuan Hong, Jiehuan Sun, Jie Huang, Yuk-Lam Ho, Ashwin N Ananthakrishnan, Zongqi Xia, Stanley Y Shaw, Vivian Gainer, Victor Castro, Nicholas Link, Jacqueline Honerlaw, Sicong Huang, David Gagnon, Elizabeth W Karlson, Robert M Plenge, Peter Szolovits, Guergana Savova, Susanne Churchill, Christopher O'Donnell, Shawn N Murphy, J Michael Gaziano, Isaac Kohane, Tianxi Cai, Katherine P Liao.

Abstract

Phenotypes are the foundation for clinical and genetic studies of disease risk and outcomes. The growth of biobanks linked to electronic medical record (EMR) data has both facilitated and increased the demand for efficient, accurate, and robust approaches for phenotyping millions of patients. Challenges to phenotyping with EMR data include variation in the accuracy of codes, as well as the high level of manual input required to identify features for the algorithm and to obtain gold standard labels. To address these challenges, we developed PheCAP, a high-throughput semi-supervised phenotyping pipeline. PheCAP begins with data from the EMR, including structured data and information extracted from the narrative notes using natural language processing (NLP). The standardized steps integrate automated procedures, which reduce the level of manual input, and machine learning approaches for algorithm training. PheCAP itself can be executed in 1-2 days if all data are available; however, the overall timing depends largely on the chart review stage, which typically requires at least 2 weeks. The final products of PheCAP include a phenotype algorithm, the probability of the phenotype for all patients, and a phenotype classification (yes or no).
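As a rough illustration of the three final products named above (a phenotype algorithm, a per-patient probability, and a yes/no classification), the sketch below fits a small logistic regression on a few chart-reviewed patients and then scores all patients. It is not the PheCAP implementation and does not reproduce its feature selection steps. The feature set (ICD code count, NLP mention count, note count), the log(1 + x) transform, the toy data, the gradient-descent fit with a ridge penalty, and the 0.5 threshold are all assumptions made for illustration.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logistic(X, y, lr=0.1, epochs=5000, l2=0.01):
    """Fit logistic regression by batch gradient descent.

    A ridge penalty (l2) is applied to the weights but not the intercept.
    Returns (weights, intercept).
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * p
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err
            for j in range(p):
                grad_w[j] += err * xi[j]
        b -= lr * grad_b / n
        w = [wj - lr * (gj / n + l2 * wj) for wj, gj in zip(w, grad_w)]
    return w, b


def predict_proba(model, x):
    # Probability of the phenotype for one patient's feature vector.
    w, b = model
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))


def to_features(icd_count, nlp_count, note_count):
    # Hypothetical feature set: log(1 + count) of each raw count.
    return [math.log1p(icd_count), math.log1p(nlp_count), math.log1p(note_count)]


# Hypothetical chart-reviewed (gold standard) patients:
# ((ICD code count, NLP mention count, note count), label)
labeled = [
    ((12, 30, 40), 1),
    ((8, 15, 25), 1),
    ((5, 9, 60), 1),
    ((1, 0, 35), 0),
    ((0, 1, 10), 0),
    ((2, 0, 80), 0),
]
X = [to_features(*raw) for raw, _ in labeled]
y = [label for _, label in labeled]

# Product 1: the trained phenotype algorithm (weights and intercept).
model = fit_logistic(X, y)

# Products 2 and 3: probability and yes/no classification for every patient.
all_patients = {"patient_a": (10, 22, 30), "patient_b": (0, 0, 50)}
THRESHOLD = 0.5  # illustrative cutoff, not a recommended value
results = {}
for pid, raw in all_patients.items():
    prob = predict_proba(model, to_features(*raw))
    results[pid] = (prob, prob >= THRESHOLD)

for pid, (prob, has_phenotype) in results.items():
    print(f"{pid}: probability={prob:.3f}, phenotype={'yes' if has_phenotype else 'no'}")
```

In practice the cutoff would be chosen against the gold standard labels (for example, to reach a target specificity) rather than fixed at 0.5.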

Year:  2019        PMID: 31748751      PMCID: PMC7323894          DOI: 10.1038/s41596-019-0227-6

Source DB:  PubMed          Journal:  Nat Protoc        ISSN: 1750-2799            Impact factor:   13.491

