MEDIPS: genome-wide differential coverage analysis of sequencing data derived from DNA enrichment experiments.

Matthias Lienhard, Christina Grimm, Markus Morkel, Ralf Herwig, Lukas Chavez.

Abstract

MOTIVATION: DNA enrichment followed by sequencing is a versatile tool in molecular biology, with a wide variety of applications including genome-wide analysis of epigenetic marks and mechanisms. A common requirement of these diverse applications is a comparison of read coverage between experimental conditions. The number of samples generated for such comparisons ranges from a few replicates to hundreds of samples per condition for epigenome-wide association studies. Consequently, there is an urgent need for software that allows for fast and simple processing and comparison of sequencing data derived from enriched DNA.
RESULTS: Here, we present a major update of the R/Bioconductor package MEDIPS, which allows for an arbitrary number of replicates per group and integrates sophisticated statistical methods for the detection of differential coverage between experimental conditions. Our approach can be applied to a diversity of quantitative sequencing data. In addition, our update adds novel functionality to MEDIPS, including correlation analysis between samples, and takes advantage of Bioconductor's annotation databases to facilitate annotation of specific genomic regions.
AVAILABILITY AND IMPLEMENTATION: The latest version of MEDIPS is available as version 1.12.0 and is part of Bioconductor 2.13. The package comes with a manual containing a detailed description of its functionality and is available at http://www.bioconductor.org.

Year:  2013        PMID: 24227674      PMCID: PMC3892689          DOI: 10.1093/bioinformatics/btt650

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


1 INTRODUCTION

DNA enrichment methods are widely used for genome-wide identification of many different kinds of epigenetic marks. These techniques include chromatin immunoprecipitation for localizing transcription factor binding sites or for revealing the genomic distribution of different histone modifications. Methylated DNA Immuno-Precipitation (MeDIP) (Weber et al.) and methyl-CpG binding domain (MBD) protein capture (Serre et al.) are similar techniques, but target the enrichment of DNA fragments containing methylated cytosines. Similarly, 5-hydroxymethylcytosines can be detected by antiserum specific to cytosine-5-hydroxymethylenesulfate (CMS) after treatment with sodium bisulfite (Pastor et al.). It can be expected that further affinity methods will be developed for immunoprecipitation (IP) of known or novel kinds of epigenetic marks. To provide a general framework for efficient genome-wide differential coverage analysis of IP-sequencing data, we have improved the user-friendly MEDIPS package. In contrast to the previous version, the MEDIPS update is capable of processing an arbitrary number of replicates or samples per condition. Furthermore, MEDIPS now integrates an elaborate statistical framework developed for the digital nature of count data, which includes a model for biological variation across replicates (Robinson et al.), and has greatly reduced runtime and memory requirements.

2 MEDIPS WORK FLOW

The MEDIPS package provides functions for the quality control and analysis of data derived from IP-seq samples. It starts with the aligned reads (typically BAM files) and can be used for any genome of interest. Figure 1 gives an overview of a typical work flow.
Fig. 1. The MEDIPS work flow

2.1 Preparation

In the first step, the alignment files (single- or paired-end) are imported, and the fragments overlapping previously specified genomic regions are counted. These regions can be either genome-wide windows of regular width or any given regions of interest. To control for polymerase chain reaction artifacts, MEDIPS optionally replaces reads with the same position and orientation by one representative.
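The two preparation steps described above can be illustrated with a minimal Python sketch. It is not MEDIPS code: the read tuples, the function names `deduplicate` and `count_in_windows`, and the window size are hypothetical, and for brevity each read is assigned to a window by its start position rather than by the overlap of the full fragment.

```python
from collections import Counter

def deduplicate(reads):
    """Keep one representative per (chromosome, start, strand) combination."""
    seen = set()
    unique = []
    for chrom, start, strand in reads:
        key = (chrom, start, strand)
        if key not in seen:
            seen.add(key)
            unique.append(key)
    return unique

def count_in_windows(reads, window_size):
    """Count reads per fixed-width window, keyed by (chromosome, window index)."""
    counts = Counter()
    for chrom, start, _strand in reads:
        counts[(chrom, start // window_size)] += 1
    return counts

# Hypothetical toy reads: (chromosome, start position, strand)
reads = [
    ("chr1", 120, "+"),
    ("chr1", 120, "+"),  # same position and orientation: collapsed
    ("chr1", 120, "-"),  # same position, opposite orientation: kept
    ("chr1", 480, "+"),
    ("chr2", 30, "-"),
]
unique_reads = deduplicate(reads)
window_counts = count_in_windows(unique_reads, window_size=100)
```

Keying the duplicate filter on both position and strand mirrors the description in the text, where only reads that share position and orientation are treated as potential amplification artifacts.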

2.2 Quality control

The saturation analysis helps to verify whether the given set of mapped reads is sufficient to generate a saturated and reproducible coverage profile of the reference genome. This is done by correlating the coverage profiles of read subsets of increasing size and extrapolating the resulting correlation (see Fig. 1C). To assess the effectiveness of the MeDIP/MBD enrichment, a function to calculate overall CpG enrichment is provided. MEDIPS identifies the fraction of CpGs in the reference genome covered by the sequencing data and evaluates their depth of coverage.
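The idea behind the saturation analysis can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the MEDIPS implementation: the reads are split into two random halves, window coverage profiles are computed for growing subsets of each half, and the Pearson correlation between the two profiles is recorded. The function names, the toy read positions, and the number of steps are hypothetical, and the extrapolation step is omitted.

```python
import random

def pearson(x, y):
    """Pearson correlation of two equally long numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov / (var_x * var_y) ** 0.5

def window_profile(positions, n_windows, window_size):
    """Number of reads starting in each fixed-width window."""
    profile = [0] * n_windows
    for pos in positions:
        profile[pos // window_size] += 1
    return profile

def saturation_curve(positions, n_windows, window_size, steps=5, seed=0):
    """Correlation between two disjoint read subsets of increasing size."""
    rng = random.Random(seed)
    shuffled = positions[:]
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    set_a, set_b = shuffled[:half], shuffled[half:2 * half]
    curve = []
    for step in range(1, steps + 1):
        n = half * step // steps
        r = pearson(
            window_profile(set_a[:n], n_windows, window_size),
            window_profile(set_b[:n], n_windows, window_size),
        )
        curve.append((n, r))
    return curve

# Hypothetical toy data: 2000 reads on 50 windows with uneven enrichment
rng = random.Random(1)
n_windows, window_size = 50, 100
weights = [1 + (w % 5) * 3 for w in range(n_windows)]
windows = rng.choices(range(n_windows), weights=weights, k=2000)
positions = [w * window_size + rng.randrange(window_size) for w in windows]
curve = saturation_curve(positions, n_windows, window_size)
```

A curve that flattens near its maximum as the subset size grows suggests that adding more reads would change the coverage profile only marginally.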

2.3 CpG density normalization

It has been reported by Down et al. that methylation levels obtained from MeDIP/MBD experiments and bisulfite sequencing cannot be compared directly. Therefore, MEDIPS maintains its normalization function based on the concept of CpG coupling analysis (Down et al.) to calculate the relative methylation score (see Fig. 1D and E). It has been shown by Chavez et al. that this normalization can improve the correlation to bisulfite data.
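To make the role of local CpG density concrete, the following Python sketch counts CpG dinucleotides per window and relates a read count to that density. It is a deliberately naive stand-in and does not reproduce the relative methylation score or the coupling analysis of Down et al.; the functions `cpg_counts` and `signal_per_cpg`, the sequence, and the read counts are hypothetical.

```python
def cpg_counts(sequence, window_size):
    """Count CpG dinucleotides whose C falls in each fixed-width window."""
    sequence = sequence.upper()
    n_windows = (len(sequence) + window_size - 1) // window_size
    counts = [0] * n_windows
    for i in range(len(sequence) - 1):
        if sequence[i] == "C" and sequence[i + 1] == "G":
            counts[i // window_size] += 1
    return counts

def signal_per_cpg(read_counts, cpgs):
    """Toy density adjustment: reads per CpG, None where a window has no CpG."""
    return [r / c if c > 0 else None for r, c in zip(read_counts, cpgs)]

# Hypothetical 30 bp sequence split into three 10 bp windows
sequence = "ACGTCGCGAA" + "TTTTACGTTT" + "CGCGCGCGCG"
densities = cpg_counts(sequence, window_size=10)

# Hypothetical read counts for the same three windows
adjusted = signal_per_cpg([6, 4, 10], densities)
```

In this toy example the CpG-poor middle window has the lowest raw count but the highest count per CpG, which is the kind of density dependence that a CpG-aware normalization is meant to account for.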

2.4 Differential coverage analysis

The main task for comparative epigenetic analyses is the detection of regions with differential coverage between conditions. Variability, which can emerge from technical and biological variation, has to be estimated and modeled, and the statistical test has to consider the discrete nature of the count values. For this purpose, we make use of the edgeR package, which has been developed in the context of RNA-seq by Robinson et al. It provides functions to estimate the biological variability from a low number of replicates and models the count data using a negative binomial distribution. Copy number alterations (CNAs) are known to locally influence the MeDIP signal (Robinson et al.). To control for this interference, alterations in copy number are evaluated and can be considered in further analysis. To help with the functional interpretation of genomic regions identified by the differential coverage analysis, MEDIPS provides the functionality to annotate these regions with any provided set of annotations. The features can be imported from custom files or from online databases accessible from Bioconductor.
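The negative binomial model mentioned above is commonly parameterized so that the variance of a count with mean mu is mu + phi * mu^2, where phi is the dispersion. The Python sketch below estimates phi for one window by the method of moments. This is an assumption made for illustration only: edgeR uses more elaborate estimators that share information across regions, and the function names and replicate counts here are hypothetical.

```python
from statistics import mean, variance

def nb_dispersion_moments(counts):
    """Method-of-moments dispersion: phi = (s^2 - mu) / mu^2, floored at 0."""
    if len(counts) < 2:
        return 0.0
    mu = mean(counts)
    if mu == 0:
        return 0.0
    s2 = variance(counts)  # unbiased sample variance
    return max((s2 - mu) / mu ** 2, 0.0)

def nb_variance(mu, dispersion):
    """Variance implied by the negative binomial mean-variance relation."""
    return mu + dispersion * mu ** 2

# Hypothetical counts of one window across three replicates
replicates = [10, 20, 30]
phi = nb_dispersion_moments(replicates)
implied_variance = nb_variance(mean(replicates), phi)
```

With a dispersion of zero the model reduces to Poisson variance, so a positive estimate indicates variation between replicates beyond pure counting noise.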

3 APPLICATION

To demonstrate the functionality of the MEDIPS package, we processed recently published MeDIP-seq data (Grimm et al.) that was generated to assess genome-wide epigenetic changes in mouse intestinal adenoma. For this study, differential methylation was inferred for the sample groups by calculating Wilcoxon rank tests for the normalized count values (reads per million, rpm) of each window. Differentially methylated regions (DMRs) were determined by applying filters for P-values, minimal coverage and ratios (Grimm et al.). Here, we process the same data but by using the presented MEDIPS package version 1.12.0. The commented R script, showing the function calls of this analysis, can be found in the Supplementary Material. From the five adenoma and seven normal control mouse samples, 14–22 M MeDIP-seq reads were uniquely mapped to the mouse reference genome (NCBI37/mm9) using bowtie (Langmead et al.), of which ∼93% remain after replacing reads with the same position and orientation by one representative. The saturation analysis indicates sufficient sequencing depth, and the CpG coverage indicates an effective MeDIP enrichment (see Fig. 1C and Supplementary Figs S1 and S2). Comparison of the normalized relative methylation score values with bisulfite validation showed a good overall correlation of 0.69–0.79 with a set of bisulfite validation assays previously performed by Grimm et al. on the same genomic samples (see Fig. 1E and Supplementary Fig. S3). The edgeR test for differentially methylated regions finds 51,722 DMRs (P < 0.01), which correspond to 0.5% of the genome. Correction for multiple testing leads to 110 regions at 10% false discovery rate (FDR). Figure 1F shows the methylation logFC versus average log methylation (MA-plot). DMRs are depicted as orange points (P < 0.01) and red crosses (FDR < 0.1). The result table containing the DMRs can be found in Supplementary Table S1. About 60% of the DMRs identified by Grimm et al. overlap with the DMRs identified by MEDIPS 1.12.
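Two quantities used in this section, reads per million and FDR-adjusted P-values, are easy to illustrate in Python. The sketch below assumes the Benjamini-Hochberg procedure for the FDR correction, which the text does not name explicitly; the function names and the example values are hypothetical and unrelated to the reported results.

```python
def reads_per_million(window_counts, total_reads):
    """Scale raw window counts to reads per million mapped reads (rpm)."""
    return [c * 1e6 / total_reads for c in window_counts]

def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted P-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest to the smallest P-value to enforce monotonicity
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical per-window P-values
p_values = [0.001, 0.008, 0.039, 0.041, 0.6]
q_values = benjamini_hochberg(p_values)
n_significant = sum(q < 0.1 for q in q_values)

rpm = reads_per_million([10, 0], total_reads=2_000_000)
```

Because the adjustment scales each P-value by the number of tests divided by its rank, a genome-wide analysis with millions of windows can retain far fewer regions after correction than pass the unadjusted threshold.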
A detailed comparison between the two approaches can be found in the Supplementary Material. Although the overall number of hypo- and hypermethylated regions is balanced, preferential hypermethylation was found in functionally important subgenomic regions, such as promoters and CpG islands. In particular, CpG-rich promoters showed a substantial enrichment of hyper- over hypomethylation (5:1; see Fig. 1G). The identification of CpG-rich promoters as preferential targets for hypermethylation may provide important leads for further wet lab experiments. For instance, the analysis can be helpful to identify binding patterns of epigenetic modulator complexes and can be suited to identify candidate genes for epigenetic transcriptional silencing. The processing of the aligned reads took ∼90 min on an AMD Opteron 6380 2.5 GHz computer, using 1 CPU core and allocating a maximum of 20 GB RAM.

Funding: German Federal Ministry of Education and Research with the grant EPITREAT (No. 0316190A) and by the Max Planck Society with its International Research School program (IMPRS-CBSC). Feodor Lynen postdoctoral Research Fellowship from the Alexander von Humboldt Foundation (to L.C.).

Conflict of Interest: none declared.

REFERENCES

1.  Computational analysis of genome-wide DNA methylation during the differentiation of human embryonic stem cells along the endodermal lineage.

Authors:  Lukas Chavez; Justyna Jozefczuk; Christina Grimm; Jörn Dietrich; Bernd Timmermann; Hans Lehrach; Ralf Herwig; James Adjaye
Journal:  Genome Res       Date:  2010-08-27       Impact factor: 9.043

2.  Evaluation of affinity-based genome-wide DNA methylation data: effects of CpG density, amplification bias, and copy number variation.

Authors:  Mark D Robinson; Clare Stirzaker; Aaron L Statham; Marcel W Coolen; Jenny Z Song; Shalima S Nair; Dario Strbenac; Terence P Speed; Susan J Clark
Journal:  Genome Res       Date:  2010-11-02       Impact factor: 9.043

3.  Chromosome-wide and promoter-specific analyses identify sites of differential DNA methylation in normal and transformed human cells.

Authors:  Michael Weber; Jonathan J Davies; David Wittig; Edward J Oakeley; Michael Haase; Wan L Lam; Dirk Schübeler
Journal:  Nat Genet       Date:  2005-07-10       Impact factor: 38.330

4.  Genome-wide mapping of 5-hydroxymethylcytosine in embryonic stem cells.

Authors:  William A Pastor; Utz J Pape; Yun Huang; Hope R Henderson; Ryan Lister; Myunggon Ko; Erin M McLoughlin; Yevgeny Brudno; Sahasransu Mahapatra; Philipp Kapranov; Mamta Tahiliani; George Q Daley; X Shirley Liu; Joseph R Ecker; Patrice M Milos; Suneet Agarwal; Anjana Rao
Journal:  Nature       Date:  2011-05-08       Impact factor: 49.962

5.  Ultrafast and memory-efficient alignment of short DNA sequences to the human genome.

Authors:  Ben Langmead; Cole Trapnell; Mihai Pop; Steven L Salzberg
Journal:  Genome Biol       Date:  2009-03-04       Impact factor: 13.583

6.  A Bayesian deconvolution strategy for immunoprecipitation-based DNA methylome analysis.

Authors:  Thomas A Down; Vardhman K Rakyan; Daniel J Turner; Paul Flicek; Heng Li; Eugene Kulesha; Stefan Gräf; Nathan Johnson; Javier Herrero; Eleni M Tomazou; Natalie P Thorne; Liselotte Bäckdahl; Marlis Herberth; Kevin L Howe; David K Jackson; Marcos M Miretti; John C Marioni; Ewan Birney; Tim J P Hubbard; Richard Durbin; Simon Tavaré; Stephan Beck
Journal:  Nat Biotechnol       Date:  2008-07       Impact factor: 54.908

7.  edgeR: a Bioconductor package for differential expression analysis of digital gene expression data.

Authors:  Mark D Robinson; Davis J McCarthy; Gordon K Smyth
Journal:  Bioinformatics       Date:  2009-11-11       Impact factor: 6.937

8.  MBD-isolated Genome Sequencing provides a high-throughput and comprehensive survey of DNA methylation in the human genome.

Authors:  David Serre; Byron H Lee; Angela H Ting
Journal:  Nucleic Acids Res       Date:  2009-11-11       Impact factor: 16.971

9.  DNA-methylome analysis of mouse intestinal adenoma identifies a tumour-specific signature that is partly conserved in human colon cancer.

Authors:  Christina Grimm; Lukas Chavez; Mireia Vilardell; Alexandra L Farrall; Sascha Tierling; Julia W Böhm; Phillip Grote; Matthias Lienhard; Jörn Dietrich; Bernd Timmermann; Jörn Walter; Michal R Schweiger; Hans Lehrach; Ralf Herwig; Bernhard G Herrmann; Markus Morkel
Journal:  PLoS Genet       Date:  2013-02-07       Impact factor: 5.917
