Song Yan¹, Yun Li. ¹Department of Biostatistics, University of North Carolina, 3101 McGavran-Greenberg Hall, Chapel Hill, NC 27599, USA, Department of Genetics, University of North Carolina, Chapel Hill, NC 27599, USA and Department of Computer Science, University of North Carolina, Chapel Hill, NC 27599, USA.
Abstract
SUMMARY: Despite its great capability to detect rare variant associations, next-generation sequencing is still prohibitively expensive when applied to large samples. In case-control studies, it is thus appealing to sequence only a subset of cases to discover variants, and then to genotype the identified variants in controls and the remaining cases, under the reasonable assumption that causal variants are usually enriched among cases. However, this approach leads to inflated type-I error if analyzed naively for rare variant association. Several methods have been proposed in the recent literature to control type-I error at the cost of either excluding some sequenced cases or correcting the genotypes of discovered rare variants. All of these approaches therefore suffer some loss of information and are underpowered. We propose a novel method (BETASEQ), which corrects the inflation of type-I error by supplementing pseudo-variants while keeping the original sequence and genotype data intact. Extensive simulations and real data analysis demonstrate that, in most practical situations, BETASEQ leads to higher power than existing approaches, with guaranteed (controlled or conservative) type-I error. AVAILABILITY AND IMPLEMENTATION: BETASEQ and associated R files, including documentation and examples, are available at http://www.unc.edu/~yunmli/betaseq
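The inflation described in the summary can be illustrated with a small null simulation. The sketch below is not BETASEQ and does not reproduce the authors' simulations. It only shows, under assumed toy settings, why discovering variants in a subset of cases and then testing naively rejects too often. The sample sizes, number of variant sites, carrier frequency, the simple carrier-based two-proportion z-test, and all function names are assumptions made for illustration.

```python
import math
import random


def simulate_genotypes(n, n_sites, freq, rng):
    """Return n rows of 0/1 carrier indicators, one column per variant site."""
    return [[1 if rng.random() < freq else 0 for _ in range(n_sites)]
            for _ in range(n)]


def burden_z(cases, controls, sites):
    """Two-proportion z statistic for carrying at least one variant in `sites`."""
    def carrier_count(rows):
        return sum(1 for row in rows if any(row[j] for j in sites))

    n1, n0 = len(cases), len(controls)
    x1, x0 = carrier_count(cases), carrier_count(controls)
    pooled = (x1 + x0) / (n1 + n0)
    var = pooled * (1 - pooled) * (1 / n1 + 1 / n0)
    if var == 0:
        return 0.0  # no carriers (or all carriers): no evidence either way
    return (x1 / n1 - x0 / n0) / math.sqrt(var)


def type1_error(n_reps=200, n_cases=200, n_controls=200, n_seq=100,
                n_sites=50, freq=0.005, seed=1):
    """Estimate rejection rates at |z| > 1.96 when no variant is causal.

    Returns (naive_rate, full_rate):
      naive_rate: sites are discovered only in the first n_seq cases, then
                  tested in all cases and controls.
      full_rate:  reference analysis in which every site is tested.
    """
    rng = random.Random(seed)
    reject_naive = reject_full = 0
    all_sites = list(range(n_sites))
    for _ in range(n_reps):
        # Null model: cases and controls share the same carrier frequency.
        cases = simulate_genotypes(n_cases, n_sites, freq, rng)
        controls = simulate_genotypes(n_controls, n_sites, freq, rng)
        # Discovery step: keep only sites seen in the sequenced cases.
        sequenced = cases[:n_seq]
        found = [j for j in all_sites if any(row[j] for row in sequenced)]
        if abs(burden_z(cases, controls, found)) > 1.96:
            reject_naive += 1
        if abs(burden_z(cases, controls, all_sites)) > 1.96:
            reject_full += 1
    return reject_naive / n_reps, reject_full / n_reps


if __name__ == "__main__":
    naive, full = type1_error()
    print(f"naive two-stage rejection rate: {naive:.3f}")
    print(f"all-sites reference rejection rate: {full:.3f}")
```

Under these assumed settings, the naive rate should come out well above the nominal 0.05, while the all-sites reference should stay close to it. The reason is that every variant carried by a sequenced case is discovered by construction, whereas variants carried only by controls or unsequenced cases are counted only if a sequenced case happens to share them.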