Chandler Zuo¹, Sündüz Keleş.
¹ Department of Statistics, and Department of Biostatistics and Medical Informatics, 1300 University Avenue, Madison, WI 53706, USA.
Abstract
MOTIVATION: ChIP-seq technology enables investigators to study genome-wide binding of transcription factors and to map epigenomic marks. Although basic analysis tools for ChIP-seq data are rapidly becoming available, there has been little progress on the related design issues. A challenging question in designing a ChIP-seq experiment is how deeply the ChIP and the control samples should be sequenced. The answer depends on multiple factors, some of which can be set by the experimenter based on pilot or preliminary data. The sequencing depth of a ChIP-seq experiment is one of the key factors that determine whether all the underlying targets (e.g. binding locations or epigenomic profiles) can be identified with a targeted power.
RESULTS: We developed a statistical framework named CSSP (ChIP-seq Statistical Power) for power calculations in ChIP-seq experiments based on a local Poisson model, which is adopted by many peak callers. Evaluations with simulations and data-driven computational experiments demonstrate that this framework can reliably estimate the power of a ChIP-seq experiment at different sequencing depths based on pilot data. Furthermore, it provides an analytical approach for calculating the required depth for a targeted power while controlling the false discovery rate at a user-specified level. Hence, our results enable researchers to use their own or publicly available data to determine the required sequencing depths of their ChIP-seq experiments and potentially to make better use of the multiplexing functionality of the sequencers. Evaluation of power for multiple public ChIP-seq datasets indicates that typical ChIP-seq studies are currently well powered for detecting large fold changes of ChIP enrichment over the control sample, but have considerably less power for detecting smaller fold changes.
AVAILABILITY: Available at www.stat.wisc.edu/~zuo/CSSP.
CONTACT: keles@stat.wisc.edu
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
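To make the idea of depth-dependent power concrete, the following is a minimal sketch of a per-bin power calculation under a simple Poisson model. It is not the CSSP implementation: it assumes a single genomic bin, a known background rate estimated from pilot data, and a fixed per-bin significance level `alpha` in place of the false discovery rate control described above. All function and parameter names (`bin_power`, `mu0_pilot`, `depth_scale`, and so on) are hypothetical and chosen only for illustration.

```python
import math


def poisson_sf(k, mu):
    """Return P(X >= k) for X ~ Poisson(mu) by summing the pmf below k."""
    if k <= 0:
        return 1.0
    term = math.exp(-mu)  # pmf at 0
    cdf = term
    for i in range(1, k):
        term *= mu / i  # pmf at i from pmf at i - 1
        cdf += term
    return max(0.0, 1.0 - cdf)


def critical_count(mu0, alpha):
    """Smallest count c such that P(Poisson(mu0) >= c) <= alpha."""
    c = 0
    while poisson_sf(c, mu0) > alpha:
        c += 1
    return c


def bin_power(mu0_pilot, fold, depth_scale, alpha=1e-3):
    """Power to call a bin enriched at `fold` times the background rate.

    Assumption: the expected background count scales linearly with depth,
    so sequencing `depth_scale` times deeper than the pilot multiplies
    the background mean `mu0_pilot` by `depth_scale`.
    """
    mu0 = mu0_pilot * depth_scale
    c = critical_count(mu0, alpha)
    return poisson_sf(c, fold * mu0)


def required_depth_scale(mu0_pilot, fold, target_power, alpha=1e-3,
                         scales=(1, 2, 4, 8, 16)):
    """First depth multiplier in `scales` reaching `target_power`, else None."""
    for s in scales:
        if bin_power(mu0_pilot, fold, s, alpha) >= target_power:
            return s
    return None


if __name__ == "__main__":
    # Illustrative numbers only: 5 expected background reads per bin in the pilot.
    for fold in (1.5, 2.0, 4.0):
        powers = [round(bin_power(5.0, fold, s), 3) for s in (1, 2, 4)]
        print(f"fold={fold}: power at 1x, 2x, 4x depth = {powers}")
    print("depth multiplier for 50% power at fold 2:",
          required_depth_scale(5.0, 2.0, 0.5))
```

Under these assumptions the sketch reproduces the qualitative pattern reported in the abstract: large fold changes are detected with high power at modest depth, while small fold changes need substantially deeper sequencing. A grid search is used for the required depth because the discreteness of the Poisson critical value means power need not increase smoothly with depth.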