A Fast and Accurate Method for Genome-Wide Time-to-Event Data Analysis and Its Application to UK Biobank.

Wenjian Bi, Lars G Fritsche, Bhramar Mukherjee, Sehee Kim, Seunggeun Lee.

Abstract

With increasing biobanking efforts connecting electronic health records and national registries to germline genetics, time-to-event data analysis has attracted increasing attention in genetic studies of human diseases. In time-to-event data analysis, the Cox proportional hazards (PH) regression model is one of the most widely used approaches. However, existing methods and tools do not scale to a large biobank with hundreds of thousands of samples and endpoints, and they are not accurate when testing low-frequency and rare variants. Here, we propose a scalable and accurate method, SPACox (a saddlepoint approximation implementation based on the Cox PH regression model), that is applicable to genome-wide time-to-event data analysis. SPACox requires fitting a Cox PH regression model only once across the genome-wide analysis and then uses a saddlepoint approximation (SPA) to calibrate the test statistics. Simulation studies show that SPACox is 76-252 times faster than other existing alternatives, such as gwasurvivr, 185-511 times faster than the standard Wald test, and more than 6,000 times faster than the Firth correction, and that it controls type I error rates at the genome-wide significance level regardless of minor allele frequency. Through the analysis of UK Biobank inpatient data from 282,871 samples of white British European ancestry, we show that SPACox can efficiently analyze large sample sizes and accurately control type I error rates. We identified 611 loci associated with time-to-event phenotypes of 12 common diseases, of which 38 loci would be missed within a logistic regression framework with a binary phenotype defined as event occurrence status during the follow-up period.
Copyright © 2020. Published by Elsevier Inc.

Keywords:  Cox proportional hazards regression model; GWAS; PheWAS; UK Biobank; electronic health record; saddlepoint approximation; survival analysis; time-to-event data

Year:  2020        PMID: 32589924      PMCID: PMC7413891          DOI: 10.1016/j.ajhg.2020.06.003

Source DB:  PubMed          Journal:  Am J Hum Genet        ISSN: 0002-9297            Impact factor:   11.025


  14 in total

1.  Two-step hypothesis testing to detect gene-environment interactions in a genome-wide scan with a survival endpoint.

Authors:  Eric S Kawaguchi; Gang Li; Juan Pablo Lewinger; W James Gauderman
Journal:  Stat Med       Date:  2022-01-24       Impact factor: 2.373

2.  ILRUN Promotes Atherosclerosis Through Lipid-Dependent and Lipid-Independent Factors.

Authors:  Xin Bi; Sylvia Stankov; Paul C Lee; Ziyi Wang; Xun Wu; Li Li; Yi-An Ko; Lan Cheng; Hanrui Zhang; Nicholas J Hand; Daniel J Rader
Journal:  Arterioscler Thromb Vasc Biol       Date:  2022-07-14       Impact factor: 10.514

3.  Genetic Risk of Second Primary Cancer in Breast Cancer Survivors: The Multiethnic Cohort Study.

Authors:  Fei Chen; Sungshim L Park; Lynne R Wilkens; Peggy Wan; Steven N Hart; Chunling Hu; Siddhartha Yadav; Fergus J Couch; David V Conti; Adam J de Smith; Christopher A Haiman
Journal:  Cancer Res       Date:  2022-09-16       Impact factor: 13.312

4.  GWAS of longitudinal trajectories at biobank scale.

Authors:  Seyoon Ko; Christopher A German; Aubrey Jensen; Judong Shen; Anran Wang; Devan V Mehrotra; Yan V Sun; Janet S Sinsheimer; Hua Zhou; Jin J Zhou
Journal:  Am J Hum Genet       Date:  2022-02-22       Impact factor: 11.043

5.  Inference for set-based effects in genetic association studies with interval-censored outcomes.

Authors:  Ryan Sun; Liang Zhu; Yimei Li; Yutaka Yasui; Leslie Robison
Journal:  Biometrics       Date:  2022-02-14       Impact factor: 1.701

6.  The immunogenetics of viral antigen response is associated with subtype-specific glioma risk and survival.

Authors:  Geno Guerra; Linda Kachuri; George Wendt; Helen M Hansen; Steven J Mack; Annette M Molinaro; Terri Rice; Paige Bracci; John K Wiencke; Nori Kasahara; Jeanette E Eckel-Passow; Robert B Jenkins; Margaret Wrensch; Stephen S Francis
Journal:  Am J Hum Genet       Date:  2022-05-11       Impact factor: 11.043

7.  Genomic architecture and prediction of censored time-to-event phenotypes with a Bayesian genome-wide analysis.

Authors:  Sven E Ojavee; Athanasios Kousathanas; Daniel Trejo Banos; Etienne J Orliac; Marion Patxot; Kristi Läll; Reedik Mägi; Krista Fischer; Zoltan Kutalik; Matthew R Robinson
Journal:  Nat Commun       Date:  2021-04-20       Impact factor: 14.919

8.  Associations between genetic loci, environment factors and mental disorders: a genome-wide survival analysis using the UK Biobank data.

Authors:  Peilin Meng; Jing Ye; Xiaomeng Chu; Bolun Cheng; Shiqiang Cheng; Li Liu; Xuena Yang; Chujun Liang; Feng Zhang
Journal:  Transl Psychiatry       Date:  2022-01-11       Impact factor: 6.222

9.  Accounting for age of onset and family history improves power in genome-wide association studies.

Authors:  Emil M Pedersen; Esben Agerbo; Oleguer Plana-Ripoll; Jakob Grove; Julie W Dreier; Katherine L Musliner; Marie Bækvad-Hansen; Georgios Athanasiadis; Andrew Schork; Jonas Bybjerg-Grauholm; David M Hougaard; Thomas Werge; Merete Nordentoft; Ole Mors; Søren Dalsgaard; Jakob Christensen; Anders D Børglum; Preben B Mortensen; John J McGrath; Florian Privé; Bjarni J Vilhjálmsson
Journal:  Am J Hum Genet       Date:  2022-02-08       Impact factor: 11.025

10.  Block coordinate descent algorithm improves variable selection and estimation in error-in-variables regression.

Authors:  Célia Escribe; Tianyuan Lu; Julyan Keller-Baruch; Vincenzo Forgetta; Bowei Xiao; J Brent Richards; Sahir Bhatnagar; Karim Oualkacha; Celia M T Greenwood
Journal:  Genet Epidemiol       Date:  2021-09-01       Impact factor: 2.344
