Dajiang J Liu (1), Suzanne M Leal. (1) Department of Biostatistics, Center of Statistical Genetics, University of Michigan, Ann Arbor, MI 48109, USA. dajiang@umich.edu
Abstract
MOTIVATION: Next-generation sequencing greatly increases the capacity to detect associations between rare variants and complex traits. However, sequencing a large number of samples is still expensive, so small datasets are often used. Given cost constraints, a potentially more powerful two-step strategy is to sequence a subset of the sample to discover variants and then genotype the identified variants in the remaining sample. If only cases are sequenced, directly combining the sequence and genotype data inflates type-I error in rare-variant association analysis. Although several methods have been developed to correct for this bias, they are either underpowered or theoretically invalid. We propose a new method, SEQCHIP, to integrate genotype and sequence data; it can be used with most existing rare-variant tests.
RESULTS: Using both simulated and real datasets, we demonstrate that SEQCHIP controls type-I error and is substantially more powerful than all other currently available methods.
AVAILABILITY: SEQCHIP is implemented as an R package and is available at http://linkage.rockefeller.edu/suzanne/seqchip/Seqchip.html.
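The source of the bias can be illustrated with a small null simulation. The sketch below is not SEQCHIP and does not reproduce the authors' analysis; it only shows why variant discovery in sequenced cases inflates type-I error when the data are naively combined. Everything in it is an assumption made for illustration: allele counts at each rare site are drawn from a Poisson approximation, sites are independent, the burden test is a simple normal-approximation comparison of allele counts, and the sample sizes, allele frequency and function names are hypothetical. For contrast, it also runs the same test after excluding the sequenced cases, which removes the discovery bias but discards data.

```python
import math
import random


def poisson(rng, lam):
    """Draw a Poisson(lam) variate with Knuth's multiplication method."""
    limit = math.exp(-lam)
    k = 0
    prod = rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def count_test_z(count_a, count_b, size_a, size_b):
    """Z statistic for whether group A holds its expected share of alleles.

    Under the null, given the total count, group A's count is binomial with
    probability size_a / (size_a + size_b).
    """
    total = count_a + count_b
    if total == 0:
        return 0.0
    share = size_a / (size_a + size_b)
    return (count_a - total * share) / math.sqrt(total * share * (1 - share))


def simulate_type1(n_reps=2000, n_sites=50, maf=0.002, n_seq=100,
                   n_geno_cases=400, n_controls=500, z_crit=1.96, seed=1):
    """Estimate null rejection rates for two ways of using the data.

    Returns (naive_rate, excluded_rate). All parameter values are
    hypothetical and chosen only to make the effect visible.
    """
    rng = random.Random(seed)
    naive_rejections = 0
    excluded_rejections = 0
    for _rep in range(n_reps):
        seq = rest = ctrl = 0
        for _site in range(n_sites):
            # Minor-allele count among sequenced cases (2 chromosomes each).
            s = poisson(rng, 2 * n_seq * maf)
            if s == 0:
                continue  # site is never discovered, so it is never genotyped
            seq += s
            rest += poisson(rng, 2 * n_geno_cases * maf)
            ctrl += poisson(rng, 2 * n_controls * maf)
        # Naive: pool sequenced and genotyped cases, compare with controls.
        z_naive = count_test_z(seq + rest, ctrl, n_seq + n_geno_cases, n_controls)
        # Contrast: drop the sequenced cases that drove discovery.
        z_excluded = count_test_z(rest, ctrl, n_geno_cases, n_controls)
        naive_rejections += abs(z_naive) > z_crit
        excluded_rejections += abs(z_excluded) > z_crit
    return naive_rejections / n_reps, excluded_rejections / n_reps


if __name__ == "__main__":
    naive_rate, excluded_rate = simulate_type1()
    print(f"naive combination: {naive_rate:.3f}")
    print(f"sequenced cases excluded: {excluded_rate:.3f}")
```

Because every discovered site carries at least one minor allele in the sequenced cases by construction, the pooled case count is biased upward even though no variant affects the trait, so the naive rejection rate should sit well above the nominal 5% level while the exclusion-based comparison should stay close to it.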