
CoCAS: a ChIP-on-chip analysis suite.

Touati Benoukraf, Pierre Cauchy, Romain Fenouil, Adrien Jeanniard, Frederic Koch, Sébastien Jaeger, Denis Thieffry, Jean Imbert, Jean-Christophe Andrau, Salvatore Spicuglia, Pierre Ferrier.

Abstract

MOTIVATION: High-density tiling microarrays are increasingly used in combination with ChIP assays to study transcriptional regulation. To ease the analysis of the large amounts of data generated by this approach, we have developed ChIP-on-chip Analysis Suite (CoCAS), a standalone software suite which implements optimized ChIP-on-chip data normalization, improved peak detection, as well as quality control reports. Our software supports dye swap and replicate correlation, and connects easily with genome browsers and other peak detection algorithms. CoCAS can readily be used on the latest generation of Agilent high-density arrays. Also, the implemented peak detection methods are suitable for other datasets, including ChIP-Seq output.

AVAILABILITY: The software is available for download along with a sample dataset at http://www.ciml.univ-mrs.fr/software/ferrier.htm.

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Year:  2009        PMID: 19193731      PMCID: PMC2660873          DOI: 10.1093/bioinformatics/btp075

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


1 INTRODUCTION

In the last few years, coupling of chromatin immunoprecipitation with microarray technology (ChIP-on-chip; Ren et al., 2000) and computational analysis tools has resulted in major leaps in our understanding of transcriptional networks and of the dynamics of chromatin structure (Bock and Lengauer, 2008). Microarray analysis is a stepwise process which encompasses spot detection in scanned images, normalization of fluorescence intensities within and between arrays, as well as probeset to gene assignment. In the case of ChIP-on-chip (CoC), this process additionally comprises the detection of binding events, also known as peak detection. Several CoC analysis software solutions already exist, often adapted for one specific microarray platform. To our knowledge, in the case of Agilent microarrays, only one application suite is currently available: DNA Analytics (http://chem.agilent.com), a licensed program. Here, we introduce a new standalone ChIP-on-chip Analysis Suite (CoCAS) that provides several additional functions, including new normalization options, flexible peak detection, quality control reports, as well as a compilation of replicate samples. CoCAS is free (GPL) software which runs on Windows XP/Vista, Mac OS X and Linux, and builds upon existing packages in the Java and R programming languages (http://www.r-project.org), notably BioConductor (http://bioconductor.org). CoCAS uses Java for the graphical user interface and for peak detection, and R for the bulk of the calculations.

2 PROCEDURES

As input, CoCAS takes Feature Extraction files (Agilent Technologies) originating from scanner quantification. Microarray files are read in R using BioConductor. Since two-channel normalization methods tend to underestimate enrichment, we made variance stabilization normalization (Huber et al., 2002) available in our software, an option that other Agilent CoC analysis programs do not offer. We also adapted, implemented de novo in R and validated a novel CoC-optimized intra-normalization method (Peng et al., 2007) (Supplementary Fig. S1). These methods can now be used along with other traditional intra- and inter-normalization methods: median, loess and quantile (Yang et al., 2002) (Supplementary Fig. S2). Background subtraction can be carried out using any of the options that limma (Smyth, 2004) offers in this regard, or it can be disabled. A per-spot P-value is systematically calculated according to the Rosetta error model (Weng et al., 2006), and can be used for peak detection. Multiple slide designs are handled as separate experiments until inter-array normalization, after which they are merged into one whole experiment. Experimental and/or biological replicates can be merged using either a mean of log ratios or the Rosetta error model.

Peak detection is automatically performed in Java following microarray processing. The peak detection tab can also be called from within the main interface at any time for standalone peak detection. The algorithm is based on the neighbourhood effect (Zheng et al., 2007). Significantly enriched probes are first identified as those lying above a given threshold, which is based on background noise estimation as used by Ringo (Toedling et al., 2007) or MPeak (Zheng et al., 2007). Peaks are then extended for as long as the log ratio of contiguous probes is greater than the extension threshold. Each peak is scored by calculating its effective peak area.
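The seed-and-extend peak detection described above can be illustrated with a minimal Python sketch. It is not the CoCAS implementation (which is written in Java): the function name `detect_peaks`, its parameters, and the choice to score a peak as the sum of its probe log ratios are assumptions made for illustration, since the text does not define how the "effective peak area" is computed. Probes are assumed to be sorted by position on a single chromosome.

```python
from typing import List, Sequence, Tuple


def detect_peaks(
    positions: Sequence[int],
    log_ratios: Sequence[float],
    seed_threshold: float,
    extension_threshold: float,
) -> List[Tuple[int, int, float]]:
    """Return (start, end, score) for each peak (hypothetical helper).

    A probe whose log ratio is strictly above seed_threshold seeds a peak.
    The peak is extended left and right while neighbouring probes stay
    strictly above extension_threshold. The score is the sum of log ratios
    inside the peak, an assumed stand-in for the "effective peak area".
    """
    if len(positions) != len(log_ratios):
        raise ValueError("positions and log_ratios must have equal length")
    if extension_threshold > seed_threshold:
        raise ValueError("extension_threshold must not exceed seed_threshold")

    peaks: List[Tuple[int, int, float]] = []
    n = len(log_ratios)
    i = 0
    while i < n:
        if log_ratios[i] <= seed_threshold:
            i += 1
            continue
        # Seed probe found: extend to the left over contiguous probes.
        left = i
        while left > 0 and log_ratios[left - 1] > extension_threshold:
            left -= 1
        # Extend to the right in the same way.
        right = i
        while right < n - 1 and log_ratios[right + 1] > extension_threshold:
            right += 1
        score = sum(log_ratios[left:right + 1])
        peaks.append((positions[left], positions[right], score))
        # Resume scanning after the peak so peaks never overlap.
        i = right + 1
    return peaks


if __name__ == "__main__":
    pos = [0, 100, 200, 300, 400, 500, 600]
    lr = [0.1, 0.6, 2.0, 0.8, 0.2, 1.5, 0.1]
    for start, end, score in detect_peaks(pos, lr, 1.0, 0.5):
        print(start, end, round(score, 2))
```

In this toy input the probe at position 200 seeds a peak that extends to its two neighbours, while the probe at position 500 forms a single-probe peak. Using a lower extension threshold than seed threshold lets a peak include flanking probes that are enriched but would not be called on their own.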

3 RESULTS AND CONCLUSION

CoCAS offers either a simple stepwise wizard with detailed help to facilitate analyses, or a user-parameterized interface allowing more flexibility (an example screenshot of the interface is shown in Supplementary Fig. S3). It can handle large files originating from new high-density microarrays (>1 000 000 probes). Dye swap can be carried out on a selection of slides, and replicate correlation plots are displayed. As an illustration, we provide genome-wide profiling of Suz12, a subunit of the Polycomb repressor complex, performed in mouse ES cells and processed with CoCAS (Fig. 1 and Supplementary Material S1). Because Suz12 is located throughout the genome (Boyer et al., 2006), we applied median normalization in this case. A PDF Quality Control report is generated for global estimation of per-slide enrichment (Fig. 1A–C). Output is written in several generic file formats that are readable by most genome browsers, such as Integrated Genome Browser (IGB), Ensembl (http://ensembl.org) or UCSC genome browser (http://genome.ucsc.edu) (Supplementary Fig. S4), a function supported by most CoC packages but not yet available for the Agilent platform (Supplementary Table S1). As expected, our software shows high Suz12 enrichment at the genome-wide scale, notably in the Hox cluster region (Fig. 1D and data not shown). Importantly, the peak detection methods implemented in CoCAS can be used for any set of data (in GFF format), including ChIP-Seq data (Supplementary Fig. S5), where signal processing is similar to that of CoC.
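To make the median normalization and replicate merging mentioned above concrete, the following Python sketch centres the log2(IP/Input) ratios of each array on zero by subtracting the array median, then merges replicates as the per-probe mean of log ratios. This is a common form of median normalization and is assumed here for illustration; CoCAS itself performs these steps in R, and the function names below are hypothetical.

```python
import math
from statistics import median
from typing import List, Sequence


def log2_ratios(ip: Sequence[float], inp: Sequence[float]) -> List[float]:
    """Per-probe log2(IP / Input) from two intensity channels."""
    return [math.log2(a / b) for a, b in zip(ip, inp)]


def median_normalize(ratios: Sequence[float]) -> List[float]:
    """Subtract the array median so the normalized median is zero."""
    m = median(ratios)
    return [r - m for r in ratios]


def merge_replicates(replicates: Sequence[Sequence[float]]) -> List[float]:
    """Merge replicate arrays as the per-probe mean of log ratios."""
    return [sum(values) / len(values) for values in zip(*replicates)]


if __name__ == "__main__":
    # Toy intensities for three probes on two replicate arrays.
    rep1 = median_normalize(log2_ratios([200, 400, 800], [100, 100, 100]))
    rep2 = median_normalize(log2_ratios([100, 100, 400], [100, 100, 100]))
    print(merge_replicates([rep1, rep2]))
```

The sketch covers only the mean-of-log-ratios option; merging with the Rosetta error model weights each measurement by its estimated error and is not shown.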

Fig. 1. Stepwise data analysis of Suz12 ChIP-on-chip in CoCAS. Quality control reports include (A) density plots of immunoprecipitated (IP) DNA, in red, and Input DNA, in green, so as to detect any dye bias; (B) MA plots which allow assessment of normalization quality and probe enrichment; (C) replicate correlation plots, which also help estimate background noise (which shows no correlation at low intensities). (D) Chromosomal view (chromosome 6) of Suz12 IP over input log ratios (in red) via IGB (top), followed by peak detection (green track) on a close up in the Hox cluster region (bottom).

Funding: Inserm, CNRS, Association pour la Recherche sur le Cancer, Institut National du Cancer, Fondation de France, Association Laurette Fugain, Fondation Princesse Grace de Monaco and Commission of the European Communities (to Ferrier laboratory); Inserm, Université de la Méditerranée and Association pour la Recherche sur le Cancer (to Imbert laboratory); Agence Nationale de la Recherche (ANR-06-BYOS-0006 for collaboration between the two groups and to T.B.); fellowship from Institut National du Cancer (to P.C.); Marie Curie Research Training Network (RTN ‘Chromatin Plasticity’) from the Commission of the European Communities (to F.K.).

Conflict of Interest: none declared.

REFERENCES

1.  Genome-wide location and function of DNA binding proteins.

Authors:  B Ren; F Robert; J J Wyrick; O Aparicio; E G Jennings; I Simon; J Zeitlinger; J Schreiber; N Hannett; E Kanin; T L Volkert; C J Wilson; S P Bell; R A Young
Journal:  Science       Date:  2000-12-22       Impact factor: 47.728

2.  Normalization for cDNA microarray data: a robust composite method addressing single and multiple slide systematic variation.

Authors:  Yee Hwa Yang; Sandrine Dudoit; Percy Luu; David M Lin; Vivian Peng; John Ngai; Terence P Speed
Journal:  Nucleic Acids Res       Date:  2002-02-15       Impact factor: 16.971

3.  Variance stabilization applied to microarray data calibration and to the quantification of differential expression.

Authors:  Wolfgang Huber; Anja von Heydebreck; Holger Sültmann; Annemarie Poustka; Martin Vingron
Journal:  Bioinformatics       Date:  2002       Impact factor: 6.937

4.  Polycomb complexes repress developmental regulators in murine embryonic stem cells.

Authors:  Laurie A Boyer; Kathrin Plath; Julia Zeitlinger; Tobias Brambrink; Lea A Medeiros; Tong Ihn Lee; Stuart S Levine; Marius Wernig; Adriana Tajonar; Mridula K Ray; George W Bell; Arie P Otte; Miguel Vidal; David K Gifford; Richard A Young; Rudolf Jaenisch
Journal:  Nature       Date:  2006-04-19       Impact factor: 49.962

5.  Linear models and empirical bayes methods for assessing differential expression in microarray experiments.

Authors:  Gordon K Smyth
Journal:  Stat Appl Genet Mol Biol       Date:  2004-02-12

6.  Computational epigenetics.

Authors:  Christoph Bock; Thomas Lengauer
Journal:  Bioinformatics       Date:  2007-11-17       Impact factor: 6.937

7.  ChIP-chip: data, model, and analysis.

Authors:  Ming Zheng; Leah O Barrera; Bing Ren; Ying Nian Wu
Journal:  Biometrics       Date:  2007-09       Impact factor: 2.571

8.  Rosetta error model for gene expression analysis.

Authors:  Lee Weng; Hongyue Dai; Yihui Zhan; Yudong He; Sergey B Stepaniants; Douglas E Bassett
Journal:  Bioinformatics       Date:  2006-03-07       Impact factor: 6.937

9.  Ringo--an R/Bioconductor package for analyzing ChIP-chip readouts.

Authors:  Joern Toedling; Oleg Sklyar; Tammo Krueger; Jenny J Fischer; Silke Sperling; Wolfgang Huber
Journal:  BMC Bioinformatics       Date:  2007-06-26       Impact factor: 3.169

10.  Normalization and experimental design for ChIP-chip data.

Authors:  Shouyong Peng; Artyom A Alekseyenko; Erica Larschan; Mitzi I Kuroda; Peter J Park
Journal:  BMC Bioinformatics       Date:  2007-06-25       Impact factor: 3.169
