Yanxiao Zhang1, Yu-Hsuan Lin1, Timothy D Johnson1, Laura S Rozek1, Maureen A Sartor2. 1. Department of Computational Medicine and Bioinformatics, Department of Biostatistics and Department of Environmental Health Sciences, School of Public Health, University of Michigan, Ann Arbor, MI 48109, USA. 2. Department of Computational Medicine and Bioinformatics, Department of Biostatistics and Department of Environmental Health Sciences, School of Public Health, University of Michigan, Ann Arbor, MI 48109, USA.
Abstract
MOTIVATION: ChIP-Seq is the standard method to identify genome-wide DNA-binding sites for transcription factors (TFs) and histone modifications. There is a growing need to analyze experiments with biological replicates, especially for epigenomic experiments where variation among biological samples can be substantial. However, tools that can perform group comparisons are currently lacking.

RESULTS: We present a peak-calling prioritization pipeline (PePr) for identifying consistent or differential binding sites in ChIP-Seq experiments with biological replicates. PePr models read counts across the genome among biological samples with a negative binomial distribution and uses a local variance estimation method, ranking consistent or differential binding sites more favorably than sites with greater variability. We compared PePr with commonly used and recently proposed approaches on eight TF datasets and show that PePr uniquely identifies consistent regions with enriched read counts, high motif occurrence rate and known characteristics of TF binding based on visual inspection. For histone modification data with broadly enriched regions, PePr identified differential regions that are consistent within groups and outperformed other methods in scaling False Discovery Rate (FDR) analysis.

AVAILABILITY AND IMPLEMENTATION: http://code.google.com/p/pepr-chip-seq/.
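The idea of ranking consistent sites above variable ones can be illustrated with a small sketch. This is not PePr's implementation: the function names (`nb_dispersion`, `local_dispersion`, `wald_score`), the method-of-moments dispersion estimate, the neighbour-averaging scheme and the Wald-style score are all assumptions chosen for illustration. The sketch only uses the negative binomial mean-variance relation, var = mu + phi * mu^2, to show why a window with the same mean enrichment but larger replicate-to-replicate variability receives a lower score.

```python
from statistics import mean, variance


def nb_dispersion(counts):
    """Method-of-moments dispersion phi under var = mu + phi * mu^2.

    Hypothetical helper; needs at least two replicates.
    """
    mu = mean(counts)
    if mu == 0:
        return 0.0
    var = variance(counts)  # unbiased sample variance across replicates
    # Clamp at zero: under-dispersed data falls back to the Poisson case.
    return max((var - mu) / (mu * mu), 0.0)


def local_dispersion(windows, index, flank=1):
    """Average per-window dispersion over neighbouring windows.

    A simple stand-in for a local variance estimate (assumed scheme).
    """
    lo = max(0, index - flank)
    hi = min(len(windows), index + flank + 1)
    return mean(nb_dispersion(w) for w in windows[lo:hi])


def wald_score(chip, control, phi_chip, phi_control):
    """Wald-style score for the difference in group means.

    The variance of each group mean uses the negative binomial variance.
    """
    mu1, mu2 = mean(chip), mean(control)
    var1 = (mu1 + phi_chip * mu1 ** 2) / len(chip)
    var2 = (mu2 + phi_control * mu2 ** 2) / len(control)
    denom = (var1 + var2) ** 0.5
    return (mu1 - mu2) / denom if denom > 0 else 0.0


# Toy read counts for one window in three replicates (made-up numbers).
chip_consistent = [50, 52, 48]
chip_variable = [10, 50, 90]   # same mean, much larger spread
control = [10, 12, 8]

score_consistent = wald_score(
    chip_consistent, control,
    nb_dispersion(chip_consistent), nb_dispersion(control))
score_variable = wald_score(
    chip_variable, control,
    nb_dispersion(chip_variable), nb_dispersion(control))

print(score_consistent, score_variable)
```

With these toy counts both ChIP windows have the same mean, so the difference in score comes entirely from the estimated dispersion. Averaging dispersion over neighbouring windows, as in `local_dispersion`, is one way to stabilise the estimate when only a few replicates are available.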