High-throughput multimodal automated phenotyping (MAP) with application to PheWAS.

Katherine P Liao, Jiehuan Sun, Tianrun A Cai, Nicholas Link, Chuan Hong, Jie Huang, Jennifer E Huffman, Jessica Gronsbell, Yichi Zhang, Yuk-Lam Ho, Victor Castro, Vivian Gainer, Shawn N Murphy, Christopher J O'Donnell, J Michael Gaziano, Kelly Cho, Peter Szolovits, Isaac S Kohane, Sheng Yu, Tianxi Cai.

Abstract

OBJECTIVE: Electronic health records linked with biorepositories are a powerful platform for translational studies. A major bottleneck exists in the ability to phenotype patients accurately and efficiently. The objective of this study was to develop an automated high-throughput phenotyping method integrating International Classification of Diseases (ICD) codes and narrative data extracted using natural language processing (NLP).
MATERIALS AND METHODS: We developed a mapping method that leverages the Unified Medical Language System to automatically identify the ICD and NLP concepts relevant to a specific phenotype. Aggregated ICD and NLP counts, along with health care utilization, were jointly analyzed by fitting an ensemble of latent mixture models. The multimodal automated phenotyping (MAP) algorithm yields a predicted probability of the phenotype for each patient and a threshold for classifying each participant as having or not having the phenotype. The algorithm was validated using labeled data for 16 phenotypes from a biorepository and further tested in an independent cohort through phenome-wide association studies (PheWAS) of 2 single nucleotide polymorphisms with known associations.
RESULTS: The MAP algorithm achieved higher or similar AUC and F-scores compared to the ICD code across all 16 phenotypes. The features assembled via the automated approach had comparable accuracy to those assembled via manual curation (AUC 0.943 for MAP vs 0.941 for manual curation). The PheWAS results suggest that the MAP approach detected previously validated associations with higher power when compared to the standard PheWAS method based on ICD codes.
CONCLUSION: The MAP approach increased the accuracy of phenotype definition while maintaining scalability, thereby facilitating use in studies requiring large-scale phenotyping, such as PheWAS.
© The Author(s) 2019. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For permissions, please email: journals.permissions@oup.com.
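
ILLUSTRATIVE SKETCH: The modeling step described under MATERIALS AND METHODS can be pictured with a small example. The code below is not the authors' MAP implementation; it is a simplified stand-in that assumes two-component Gaussian mixtures fitted by EM on log-transformed counts, a log-ratio adjustment for utilization, a plain average as the ensemble, and a cutoff set so that the share of predicted cases matches the estimated prevalence. All function names, the synthetic data, and these modeling choices are hypothetical.

```python
import math
import random


def mixture_posterior(values, n_iter=100, min_sigma=0.05):
    """Fit a 1-D two-component Gaussian mixture by EM.

    Returns, for each value, the posterior probability of belonging to the
    component with the larger mean (read here as "has the phenotype").
    """
    lo, hi = min(values), max(values)
    mu = [lo, hi]  # start the two means at the extremes
    spread = max((hi - lo) / 4.0, min_sigma)
    sigma = [spread, spread]
    weight = [0.5, 0.5]
    post = [0.5] * len(values)
    for _ in range(n_iter):
        # E-step: posterior probability of the second component
        post = []
        for x in values:
            dens = [
                weight[k] / sigma[k]
                * math.exp(-0.5 * ((x - mu[k]) / sigma[k]) ** 2)
                for k in range(2)
            ]
            total = dens[0] + dens[1]
            post.append(dens[1] / total if total > 0 else 0.5)
        # M-step: update weights, means, and standard deviations
        for k in range(2):
            resp = [p if k == 1 else 1.0 - p for p in post]
            n_k = sum(resp)
            if n_k < 1e-9:
                continue
            mu[k] = sum(r * x for r, x in zip(resp, values)) / n_k
            var = sum(r * (x - mu[k]) ** 2 for r, x in zip(resp, values)) / n_k
            sigma[k] = max(math.sqrt(var), min_sigma)
            weight[k] = n_k / len(values)
    if mu[1] < mu[0]:  # guard against label switching
        post = [1.0 - p for p in post]
    return post


def map_sketch(icd_counts, nlp_counts, utilization):
    """Average mixture posteriors over ICD/NLP features (assumed ensemble)."""
    features = []
    for counts in (icd_counts, nlp_counts):
        raw = [math.log1p(c) for c in counts]
        # Assumed utilization adjustment: log ratio of count to utilization.
        adjusted = [r - math.log1p(u) for r, u in zip(raw, utilization)]
        features.extend([raw, adjusted])
    posteriors = [mixture_posterior(f) for f in features]
    n = len(icd_counts)
    probs = [sum(p[i] for p in posteriors) / len(posteriors) for i in range(n)]
    # Assumed threshold rule: predicted case share equals estimated prevalence.
    prevalence = sum(probs) / n
    cut = min(n - 1, int((1.0 - prevalence) * n))
    threshold = sorted(probs)[cut]
    labels = [1 if p >= threshold else 0 for p in probs]
    return probs, threshold, labels


# Synthetic cohort (hypothetical): 60 cases with high counts, 140 controls.
random.seed(0)
is_case = [1] * 60 + [0] * 140
icd = [random.randint(5, 30) if c else random.randint(0, 2) for c in is_case]
nlp = [random.randint(8, 40) if c else random.randint(0, 3) for c in is_case]
util = [random.randint(20, 60) for _ in is_case]

probs, threshold, labels = map_sketch(icd, nlp, util)
print(f"threshold={threshold:.3f}, predicted cases={sum(labels)}")
```

The sketch needs no labeled data to produce probabilities, which mirrors the abstract's emphasis on scalability; labels appear only in the synthetic setup so the output can be checked against a known truth.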

Keywords:  High-throughput; PheWAS; phenotyping

Year:  2019        PMID: 31613361      PMCID: PMC6798574          DOI: 10.1093/jamia/ocz066

Source DB:  PubMed          Journal:  J Am Med Inform Assoc        ISSN: 1067-5027            Impact factor:   4.497
