
FastSKAT: Sequence kernel association tests for very large sets of markers.

Thomas Lumley, Jennifer Brody, Gina Peloso, Alanna Morrison, Kenneth Rice.

Abstract

The sequence kernel association test (SKAT) is widely used to test for associations between a phenotype and a set of genetic variants that are usually rare. Evaluating tail probabilities or quantiles of the null distribution for SKAT requires computing the eigenvalues of a matrix related to the genotype covariance between markers. Extracting the full set of eigenvalues of this matrix (an $n \times n$ matrix, for $n$ subjects) has computational complexity proportional to $n^3$. As SKAT is often used when $n > 10^4$, this step becomes a major bottleneck in practice. We therefore propose fastSKAT, a new computationally inexpensive but accurate approximation to the tail probabilities, in which the $k$ largest eigenvalues of a weighted genotype covariance matrix, or the $k$ largest singular values of a weighted genotype matrix, are extracted, and a single term based on the Satterthwaite approximation is used for the remaining eigenvalues. While the method is not particularly sensitive to the choice of $k$, we also describe how to choose its value, and show how fastSKAT can automatically alert users to the rare cases where the choice may affect results. As well as providing a faster implementation of SKAT, the new method also enables entirely new applications of SKAT that were not possible before; we give examples grouping variants by topologically associating domains, and comparing chromosome-wide association by class of histone marker.
© 2018 WILEY PERIODICALS, INC.
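
The approximation described in the abstract (leading eigenvalues kept exactly, the remainder collapsed into a single Satterthwaite term) can be illustrated with a minimal sketch. This is not the fastSKAT implementation: the function names, the simulated matrix `G`, and the Monte Carlo tail estimate are all assumptions made for illustration. The full eigendecomposition in the demo is only a convenient way to obtain the top `k` eigenvalues for a small matrix; a Lanczos or randomized solver would supply them at scale. The remainder is matched on its first two moments, using the fact that for a symmetric matrix the trace is the sum of the eigenvalues and the squared Frobenius norm is the sum of their squares.

```python
import numpy as np

def satterthwaite_remainder(H, top_eigs):
    """Summarize the eigenvalues of symmetric H that are NOT in top_eigs
    as scale * chi-square(df), matching their first two moments.
    Assumes the remaining eigenvalues are not all (numerically) zero."""
    top_eigs = np.asarray(top_eigs, dtype=float)
    # trace(H) = sum of all eigenvalues; sum(H * H) = sum of squared eigenvalues.
    rest_sum = np.trace(H) - top_eigs.sum()
    rest_sumsq = np.sum(H * H) - np.sum(top_eigs ** 2)
    scale = rest_sumsq / rest_sum
    df = rest_sum ** 2 / rest_sumsq
    return scale, df

def approx_tail_probability(q, top_eigs, scale, df, n_draws=100_000, seed=0):
    """Monte Carlo stand-in (an assumption, not the paper's tail computation)
    for P(sum_i lambda_i * chi2_1 + scale * chi2_df > q)."""
    rng = np.random.default_rng(seed)
    top_eigs = np.asarray(top_eigs, dtype=float)
    lead = rng.chisquare(1, size=(n_draws, top_eigs.size)) @ top_eigs
    rest = scale * rng.chisquare(df, size=n_draws)
    return float(np.mean(lead + rest > q))

# Hypothetical data: 200 subjects, 50 markers, standing in for a weighted genotype matrix.
rng = np.random.default_rng(1)
G = rng.normal(size=(200, 50))
H = G @ G.T

k = 10
# Full decomposition used here only for illustration on a small matrix.
top = np.linalg.eigvalsh(H)[::-1][:k]
scale, df = satterthwaite_remainder(H, top)
p = approx_tail_probability(np.trace(H), top, scale, df)
```

Because `satterthwaite_remainder` needs only the leading eigenvalues, the trace, and the squared Frobenius norm, it never requires the full spectrum, which is the point of the approach.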

Keywords:  Lanczos algorithm; convolution; genetic association; randomized trace estimator; stochastic singular value decomposition


Year:  2018        PMID: 29932245      PMCID: PMC6129408          DOI: 10.1002/gepi.22136

Source DB:  PubMed          Journal:  Genet Epidemiol        ISSN: 0741-0395            Impact factor:   2.135


