Yufeng Shen (1), Ruijie Song, Itsik Pe'er. (1) Department of Computer Science, Columbia University, New York, NY 10027, USA. yshen@c2b2.columbia.edu
Abstract
MOTIVATION: Whole-genome sequencing (WGS) allows direct interrogation of previously undetected uncommon or rare variants, which potentially contribute to the missing heritability of human disease. However, the cost of sequencing large numbers of samples limits its application in case-control association studies. Here, we describe theoretical and empirical design considerations for such sequencing studies, aimed at maximizing the power of detecting association under the constraint of study-wide cost. RESULTS: We consider two cost regimes. First, assuming cost is proportional to the total number of base pairs sequenced across all samples, which is a practical model for whole-genome sequencing, we explore the tradeoff in study power between increasing the number of subjects and increasing the depth of coverage. We demonstrate that the optimal power of detecting association is achieved at medium depth of coverage under a wide range of realistic conditions for case-only sequencing designs. Second, if cost is fixed per sample, which is approximately the case in exome sequencing, we show that in a simple case+control sequencing study, the optimal design should include cases totaling 1/e of all subjects. AVAILABILITY: A web tool implementing the methods is available at http://www.cs.columbia.edu/~itsik/OPERA/.
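The first cost regime can be illustrated with a toy calculation. The sketch below is not the model used in the paper; it is a minimal illustration under assumptions introduced here: the budget is a fixed total coverage (number of samples times depth), alt-allele reads at a heterozygous site follow a Poisson distribution with mean depth/2, and a carrier counts as detected when at least two alt-allele reads are seen. All function names and parameter values are hypothetical. The expected number of detected carriers serves as a rough stand-in for study power.

```python
import math


def carrier_call_prob(depth, min_alt_reads=2):
    """Probability of seeing at least `min_alt_reads` alt-allele reads
    at a heterozygous site.

    Assumption (not from the paper): alt-allele read count ~ Poisson(depth / 2).
    """
    mean_alt = depth / 2.0
    p_below = sum(
        math.exp(-mean_alt) * mean_alt ** i / math.factorial(i)
        for i in range(min_alt_reads)
    )
    return 1.0 - p_below


def expected_detected_carriers(total_coverage, depth, carrier_freq):
    """Expected number of carriers detected when a fixed total coverage
    budget is split evenly into samples sequenced at `depth`."""
    n_samples = total_coverage / depth  # more depth means fewer samples
    return n_samples * carrier_freq * carrier_call_prob(depth)


def best_depth(total_coverage, carrier_freq, depths):
    """Depth on the given grid that maximizes expected detected carriers."""
    return max(
        depths,
        key=lambda d: expected_detected_carriers(total_coverage, d, carrier_freq),
    )


if __name__ == "__main__":
    # Hypothetical budget: 1200 units of total coverage, 1% carrier frequency.
    for d in (1, 2, 4, 8, 16, 30):
        yield_d = expected_detected_carriers(1200, d, 0.01)
        print(f"depth {d:>2}: expected detected carriers = {yield_d:.3f}")
    print("best depth on grid 1..30:", best_depth(1200, 0.01, range(1, 31)))
```

On the integer grid 1 to 30, this toy yield peaks at depth 4: very shallow sequencing misses most carriers, while very deep sequencing spends the budget on too few samples. The exact optimum depends entirely on the assumed calling rule and should not be read as the paper's result.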