High-throughput multimodal automated phenotyping (MAP) with application to PheWAS.

Katherine P Liao^1,2,3, Jiehuan Sun^3,4, Tianrun A Cai^1,2,3, Nicholas Link^3, Chuan Hong^2,3,4, Jie Huang^2, Jennifer E Huffman^3, Jessica Gronsbell^5, Yichi Zhang^4,6, Yuk-Lam Ho^3, Victor Castro^7, Vivian Gainer^7, Shawn N Murphy^2,7,8, Christopher J O'Donnell^1,3, J Michael Gaziano^1,2,3, Kelly Cho^1,2,3, Peter Szolovits^9, Isaac S Kohane^2, Sheng Yu^10,11,12, Tianxi Cai^2,3,4.

Abstract

OBJECTIVE: Electronic health records linked with biorepositories are a powerful platform for translational studies. A major bottleneck exists in the ability to phenotype patients accurately and efficiently. The objective of this study was to develop an automated high-throughput phenotyping method integrating International Classification of Diseases (ICD) codes and narrative data extracted using natural language processing (NLP).
MATERIALS AND METHODS: We developed a mapping method that uses the Unified Medical Language System to automatically identify the ICD and NLP concepts relevant to a specific phenotype. Along with health care utilization, aggregated ICD and NLP counts were jointly analyzed by fitting an ensemble of latent mixture models. The multimodal automated phenotyping (MAP) algorithm yields a predicted probability of the phenotype for each patient and a threshold for classifying each participant as having the phenotype or not. The algorithm was validated using labeled data for 16 phenotypes from a biorepository and further tested in phenome-wide association studies (PheWAS) in an independent cohort for 2 single-nucleotide polymorphisms with known associations.
RESULTS: The MAP algorithm achieved higher or similar area under the curve (AUC) and F-scores compared with the ICD code across all 16 phenotypes. The features assembled via the automated approach had accuracy comparable to those assembled via manual curation (AUC 0.943 for MAP vs 0.941 for manual curation). The PheWAS results suggest that the MAP approach detected previously validated associations with higher power than the standard PheWAS method based on ICD codes.
CONCLUSION: The MAP approach increased the accuracy of phenotype definition while maintaining scalability, thereby facilitating use in studies requiring large-scale phenotyping, such as PheWAS.
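
The core idea in the methods, turning unlabeled ICD and NLP counts into a phenotype probability with latent mixture models, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a two-component Gaussian mixture fitted by EM to log(1 + count) for each data source, averages the two posteriors as a stand-in for the ensemble, and ignores the health care utilization adjustment. The function names and the prevalence-based threshold rule are hypothetical choices made for this example.

```python
import numpy as np


def _posteriors(x, weight, mu, sigma):
    """Return per-point component responsibilities and mixture densities."""
    dens = weight * np.exp(-0.5 * ((x[:, None] - mu) / sigma) ** 2) / (
        sigma * np.sqrt(2.0 * np.pi)
    )
    total = dens.sum(axis=1, keepdims=True) + 1e-300  # guard against 0/0
    return dens / total, total


def high_component_posterior(x, n_iter=500, tol=1e-8):
    """Fit a 1-D two-component Gaussian mixture by EM (no labels needed).

    Returns, for each point, the posterior probability of belonging to the
    component with the larger mean (assumed here to be the "case" group).
    """
    x = np.asarray(x, dtype=float)
    mu = np.array([x.min(), x.max()])       # start means at the extremes
    sigma = np.full(2, x.std() + 1e-6)
    weight = np.array([0.5, 0.5])
    prev_ll = -np.inf
    for _ in range(n_iter):
        resp, total = _posteriors(x, weight, mu, sigma)          # E-step
        nk = resp.sum(axis=0) + 1e-12                            # M-step
        weight = nk / nk.sum()
        mu = (resp * x[:, None]).sum(axis=0) / nk
        sigma = np.sqrt((resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk) + 1e-6
        ll = float(np.log(total).sum())
        if abs(ll - prev_ll) < tol:
            break
        prev_ll = ll
    resp, _ = _posteriors(x, weight, mu, sigma)
    return resp[:, int(np.argmax(mu))]


def map_like_probability(icd_counts, nlp_counts):
    """Average the mixture posteriors from the ICD and NLP count features."""
    p_icd = high_component_posterior(np.log1p(icd_counts))
    p_nlp = high_component_posterior(np.log1p(nlp_counts))
    return (p_icd + p_nlp) / 2.0


def classify_by_prevalence(prob):
    """Assumed rule: flag the top fraction equal to the mean probability."""
    prevalence = float(prob.mean())
    threshold = float(np.quantile(prob, 1.0 - prevalence))
    return prob >= threshold, threshold


if __name__ == "__main__":
    # Synthetic counts: 300 patients without and 100 with the phenotype.
    rng = np.random.default_rng(0)
    truth = np.r_[np.zeros(300, bool), np.ones(100, bool)]
    icd = np.r_[rng.poisson(1.0, 300), rng.poisson(15.0, 100)]
    nlp = np.r_[rng.poisson(1.0, 300), rng.poisson(15.0, 100)]
    prob = map_like_probability(icd, nlp)
    labels, threshold = classify_by_prevalence(prob)
    print("threshold:", threshold)
    print("accuracy on synthetic data:", (labels == truth).mean())
```

The synthetic labels are used only to check the output. The fitting step itself never sees them, which is what makes an approach of this kind scalable to many phenotypes.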

Keywords: High-throughput; PheWAS; phenotyping

Year: 2019 PMID: 31613361 PMCID: PMC6798574 DOI: 10.1093/jamia/ocz066

Source DB: PubMed Journal: J Am Med Inform Assoc ISSN: 1067-5027 Impact factor: 4.497
