
Improving the power of genetic association tests with imperfect phenotype derived from electronic medical records.

Jennifer A Sinnott, Wei Dai, Katherine P Liao, Stanley Y Shaw, Ashwin N Ananthakrishnan, Vivian S Gainer, Elizabeth W Karlson, Susanne Churchill, Peter Szolovits, Shawn Murphy, Isaac Kohane, Robert Plenge, Tianxi Cai

Abstract

To reduce costs and improve clinical relevance of genetic studies, there has been increasing interest in performing such studies in hospital-based cohorts by linking phenotypes extracted from electronic medical records (EMRs) to genotypes assessed in routinely collected medical samples. A fundamental difficulty in implementing such studies is extracting accurate information about disease outcomes and important clinical covariates from large numbers of EMRs. Recently, numerous algorithms have been developed to infer phenotypes by combining information from multiple structured and unstructured variables extracted from EMRs. Although these algorithms are quite accurate, they typically do not provide perfect classification due to the difficulty in inferring meaning from the text. Some algorithms can produce for each patient a probability that the patient is a disease case. This probability can be thresholded to define case-control status, and this estimated case-control status has been used to replicate known genetic associations in EMR-based studies. However, using the estimated disease status in place of true disease status results in outcome misclassification, which can diminish test power and bias odds ratio estimates. We propose to instead directly model the algorithm-derived probability of being a case. We demonstrate how our approach improves test power and effect estimation in simulation studies, and we describe its performance in a study of rheumatoid arthritis. Our work provides an easily implemented solution to a major practical challenge that arises in the use of EMR data, which can facilitate the use of EMR infrastructure for more powerful, cost-effective, and diverse genetic studies.
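The contrast drawn in the abstract, between testing against a thresholded case-control status and testing against the algorithm-derived probability itself, can be illustrated with a small simulation. The sketch below is not the authors' procedure. It assumes a toy data model in which a hypothetical phenotyping score depends on genotype only through true disease status, converts that score into a calibrated case probability with Bayes' rule, and uses a simple correlation-based z-test in place of whatever test the paper employs. The function name `simulate_power` and all parameter values (`maf`, `log_or`, `sep`, `noise_sd`, the intercept of -2) are illustrative assumptions.

```python
import numpy as np


def simulate_power(n=2000, n_sims=500, maf=0.3, log_or=0.3,
                   sep=1.0, noise_sd=1.0, seed=0):
    """Estimate power of three association tests in a toy EMR-style simulation.

    All parameters are illustrative assumptions, not values from the paper.
    """
    rng = np.random.default_rng(seed)
    z_crit = 1.96  # two-sided 5% critical value of the standard normal
    hits = {"true_status": 0, "thresholded": 0, "probability": 0}
    for _ in range(n_sims):
        g = rng.binomial(2, maf, size=n)  # additive genotype coded 0/1/2
        risk = 1.0 / (1.0 + np.exp(-(-2.0 + log_or * g)))
        d = rng.binomial(1, risk)  # true (unobserved) disease status
        # Hypothetical phenotyping score: related to genotype only through d.
        score = sep * (2 * d - 1) + rng.normal(0.0, noise_sd, size=n)
        # Calibrated case probability from Bayes' rule, given two normal
        # score densities centred at +sep (cases) and -sep (controls).
        prev = d.mean()
        log_odds = np.log(prev / (1 - prev)) + 2.0 * sep * score / noise_sd**2
        p = 1.0 / (1.0 + np.exp(-log_odds))
        outcomes = {
            "true_status": d.astype(float),
            "thresholded": (p > 0.5).astype(float),  # estimated case status
            "probability": p,                        # probability used directly
        }
        for name, y in outcomes.items():
            # sqrt(n) * sample correlation is approximately N(0, 1) under
            # the null hypothesis of no genotype-outcome association.
            r = np.corrcoef(g, y)[0, 1]
            z = r * np.sqrt(n)
            hits[name] += int(abs(z) > z_crit)
    return {name: count / n_sims for name, count in hits.items()}


if __name__ == "__main__":
    for name, value in simulate_power().items():
        print(f"{name}: estimated power = {value:.2f}")
```

Under these assumptions, the test on true status should be the most powerful. The thresholded status should lose power through misclassification, and the probability-based test would be expected to recover part of that loss, since a calibrated probability retains information that the threshold discards. How large the gain is depends on the assumed noise level and disease prevalence.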

Year: 2014; PMID: 25062868; PMCID: PMC4185241; DOI: 10.1007/s00439-014-1466-9

Source DB: PubMed; Journal: Hum Genet; ISSN: 0340-6717; Impact factor: 4.132


