A Statistical Framework for the Analysis of ChIP-Seq Data.

Pei Fen Kuan¹, Dongjun Chung¹, Guangjin Pan², James A Thomson³, Ron Stewart², Sündüz Keleş⁴.

Abstract

Chromatin immunoprecipitation followed by sequencing (ChIP-Seq) has revolutionized experiments for genome-wide profiling of DNA-binding proteins, histone modifications, and nucleosome occupancy. As the cost of sequencing decreases, many researchers are switching from microarray-based technologies (ChIP-chip) to ChIP-Seq for genome-wide studies of transcriptional regulation. Despite its increasing and well-deserved popularity, little work investigates and accounts for sources of bias in the ChIP-Seq technology. These biases typically arise from both the standard pre-processing protocol and the underlying DNA sequence of the generated data. We study data from a naked DNA sequencing experiment, which sequences non-cross-linked DNA after deproteinizing and shearing, to understand the factors affecting the background distribution of data generated in a ChIP-Seq experiment. We introduce a background model that accounts for apparent sources of bias such as mappability and GC content, and we develop a flexible mixture model named MOSAiCS for detecting peaks in both one- and two-sample analyses of ChIP-Seq data. We illustrate that our model fits observed ChIP-Seq data well and further demonstrate the advantages of MOSAiCS over commonly used tools for ChIP-Seq data analysis with several case studies.
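The idea summarized in the abstract, a background count model that depends on mappability and GC content, combined with a two-component mixture for peak detection, can be illustrated with a small sketch. This is not the MOSAiCS implementation: the log-linear form of the background mean, the coefficient values, the mixing proportion, and the signal mean are all hypothetical choices made only for illustration, and the enriched component is simplified to a negative binomial whose mean is the background mean plus a fixed signal.

```python
import math

def nb_logpmf(y, mu, size):
    """Log-pmf of a negative binomial with mean mu and dispersion parameter size."""
    p = size / (size + mu)  # "success" probability in the (size, p) parameterization
    return (math.lgamma(y + size) - math.lgamma(size) - math.lgamma(y + 1)
            + size * math.log(p) + y * math.log1p(-p))

def background_mean(mappability, gc, coef=(0.2, 1.5, 1.0)):
    """Expected background count for one bin.

    Assumption: a log-linear dependence on mappability and GC content with
    made-up coefficients; the published model may use a different form.
    """
    b0, b_map, b_gc = coef
    return math.exp(b0 + b_map * mappability + b_gc * gc)

def posterior_enriched(y, mappability, gc,
                       pi_enriched=0.05, signal_mean=20.0, size=2.0):
    """Posterior probability that a bin with count y belongs to the enriched component.

    Assumption: the enriched component is a negative binomial with mean
    (background mean + signal_mean); all parameter values are hypothetical.
    """
    mu_bg = background_mean(mappability, gc)
    f0 = math.exp(nb_logpmf(y, mu_bg, size))                 # background density
    f1 = math.exp(nb_logpmf(y, mu_bg + signal_mean, size))   # enriched density
    w0 = (1.0 - pi_enriched) * f0
    w1 = pi_enriched * f1
    return w1 / (w0 + w1)

if __name__ == "__main__":
    # Compare a low and a high tag count in the same hypothetical bin.
    for count in (2, 20, 60):
        print(count, round(posterior_enriched(count, mappability=0.9, gc=0.4), 4))
```

In this sketch the posterior grows with the observed count for a fixed bin, and the threshold at which a bin looks enriched shifts with its mappability and GC content, which is the kind of bias adjustment the abstract describes.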


Keywords: GC content; Mappability; Mixture model; Negative binomial regression; Next generation sequencing

Year: 2012; PMID: 26478641; PMCID: PMC4608541; DOI: 10.1198/jasa.2011.ap09706

Source DB: PubMed; Journal: J Am Stat Assoc; ISSN: 0162-1459; Impact factor: 5.033


