Computational epigenomics: challenges and opportunities.

Keywords: ChIP-seq; DNA methylation; epigenetic code; epigenomics; histone mark

Year: 2015 PMID: 25798147 PMCID: PMC4350413 DOI: 10.3389/fgene.2015.00088

Source DB: PubMed Journal: Front Genet ISSN: 1664-8021 Impact factor: 4.599

The field of epigenetics is attracting immense interest, with numerous studies in various areas of investigation; see Rivera and Ren (2013) for a review of the state of the art in human epigenomics. From a computational point of view, and given the characteristics of the data generated, epigenomics is a very complex field, for two main reasons. First, epigenetics encompasses a multi-layered set of regulatory cues that act coordinately, and possibly in a combinatorial way, to control fundamental biological processes such as the output of gene expression programs. Second, profiling techniques based on high-throughput sequencing are widely adopted in this field, generating comprehensive yet complex and massive genome-wide datasets. As a result, the contribution of scientists with computational skills (computer scientists, statisticians, physicists and computational biologists) is considered essential at research institutes investing in this field (Bock and Lengauer, 2008).

In this Research Topic, we collected a number of contributions in the field of computational epigenomics covering three main research areas: (i) computational analyses tackling important issues closely related to the experimental method used to generate epigenetic data (Flensburg et al., 2014; Ji et al., 2014; Mensaert et al., 2015), (ii) computational approaches useful to overcome pitfalls associated with the analysis of a given epigenetic layer (Barozzi et al., 2014; Cairns et al., 2014; Robinson et al., 2014), and (iii) studies on the integration of multiple epigenetic layers (de Pretis and Pelizzola, 2014; Fejes et al., 2014; Osella et al., 2014).

Computational tools developed for the analysis of specific epigenetic data types, including DNA methylation and ChIP-seq of histone post-translational modifications (so-called “marks”), have to deal with biases that originate directly from the experimental methodology.
In the case of profiling DNA methylation, various sequencing-based approaches are available, depending on the desired tradeoff between cost, coverage and data resolution. In some cases, a non-trivial subset of the DNA fragments sequenced in MBD-seq experiments, which rely on affinity purification through a methyl-CpG binding protein, could not be assigned to the expected reference genome. It was then shown that this unanticipated proportion of unmapped reads can be used to profile methylated viral sequences, which can be particularly relevant in certain studies (e.g., oncoviruses; Mensaert et al., 2015). On the other hand, reads from methylated DNA were shown to be over-represented in data from whole-genome bisulfite sequencing experiments. The technical reasons for this bias, and the need for computational methods to correct it, especially when allelic methylation is of interest, were explored (Ji et al., 2014). Finally, regarding the analysis of ChIP-seq data, computational methods were shown to be helpful in clarifying how to generate the reference samples necessary for the identification of enriched genomic regions. Specifically, the effects of using a pull-down of whole histone H3 and of using the more common input sample (whole-cell extract) were compared, showing that this choice had negligible impact on the downstream results (Flensburg et al., 2014).

Various computational methods have been developed for the analysis of different epigenetic data types, yet it remains difficult to understand the relative merits and performance of all the available approaches. To help identify the best-suited method, a number of contributions in this Research Topic compared computational methods and contrasted the available analysis strategies.
Regarding DNA methylation data, a number of methods developed for the identification of differentially methylated bases or regions were compared, together with a discussion of the importance of experimental design and of confounders such as batch effects and cell type composition (Robinson et al., 2014). This is a very active field, as evidenced by the emergence of new tools such as DMRcate (Peters et al., 2015) and M3D (Mayo et al., 2014), which also highlights the need to constantly update performance comparisons. Turning to a different data type, chromatin accessibility, various methods for the identification of footprints in DNase-seq were discussed and compared using ENCODE data (Barozzi et al., 2014). Accessibility only reveals information about presumed activity, but it is commonly chosen because it is complementary to the analysis of specific epigenetic marks and provides a list of putative regulatory proteins that bind open chromatin regions. Finally, tackling the issue of the statistical modeling of read counts for ChIP-seq data, various alternatives were discussed and a method based on a double Negative Binomial (i.e., Poisson-distributed counts with a mixture of two gamma-distributed rates) was proposed (Cairns et al., 2014).

While the experimental methods and the computational analysis of individual data types are being compared and perfected, scientists are investigating how to make connections between the various epigenetic layers that are surveyed. It is now clear that patterns of DNA methylation and histone marks are established, maintained and exert their effects through a machinery that is influenced by the crosstalk between these layers and by their interplay with the binding of regulatory proteins, chromatin accessibility and 3D conformation. In other words, the joint analysis of multiple epigenetic layers through data integration methods (Ritchie et al., 2015) is considered the key to comprehending how epigenetic information contributes to controlling complex regulatory processes.
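To make the double Negative Binomial model mentioned above more concrete, the following minimal sketch evaluates the probability of a read count under a two-component mixture of negative binomials, each of which is equivalent to Poisson counts with a gamma-distributed rate. The function names, the mean/size parameterisation and all parameter values are assumptions chosen for illustration; they are not taken from Cairns et al. (2014), and the sketch does not reproduce their fitting procedure.

```python
import math


def nb_log_pmf(k, mean, size):
    """Log-probability of count k under a negative binomial.

    The distribution is parameterised by its mean and a dispersion
    parameter `size`. It is equivalent to a Poisson count whose rate is
    gamma-distributed with shape `size` and scale `mean / size`.
    """
    p = size / (size + mean)
    return (math.lgamma(k + size) - math.lgamma(size) - math.lgamma(k + 1)
            + size * math.log(p) + k * math.log1p(-p))


def double_nb_pmf(k, weight, mean_1, size_1, mean_2, size_2):
    """Probability of count k under a mixture of two negative binomials.

    `weight` is the mixing proportion of the first component.
    """
    return (weight * math.exp(nb_log_pmf(k, mean_1, size_1))
            + (1.0 - weight) * math.exp(nb_log_pmf(k, mean_2, size_2)))


# Hypothetical parameters: a frequent low-count component and a rarer
# high-count component. These values are illustrative only.
PARAMS = dict(weight=0.9, mean_1=2.0, size_1=1.5, mean_2=30.0, size_2=3.0)

# Evaluate the mixture over a range wide enough to capture nearly all mass.
probs = [double_nb_pmf(k, **PARAMS) for k in range(2000)]
total = sum(probs)
mixture_mean = sum(k * p for k, p in enumerate(probs))

print(f"total probability: {total:.6f}")
print(f"mean read count:   {mixture_mean:.3f}")
```

Working on the log scale with `math.lgamma` avoids overflow in the gamma-function terms at large counts. In practice the mixture parameters would be estimated from the data rather than fixed by hand as they are here.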
In this series of articles, computational and experimental methods for the integrative analysis of epigenetic marks are discussed and proposed. Double-negative feedback loops, where a microRNA inhibits an epigenetic regulator while being epigenetically controlled by the same regulator, are considered and shown to exhibit properties that are well suited for circuits involved in cell fate transitions (Osella et al., 2014). In the context of data integration and visualization, an online platform (DaVIE) was developed based on a database of DNA methylation experiments. This tool allows navigating through multiple DNA methylation experiments and integrating different data types, including ChIP-seq data (Fejes et al., 2014). Finally, recent and past evidence in favor of the notion of an epigenetic code is discussed, and computational and experimental strategies are proposed that can be instrumental in further investigating how different epigenetic layers and marks are interconnected (de Pretis and Pelizzola, 2014).

Altogether, this series of articles provides a comprehensive glance at the emerging field of computational epigenomics. This research area brings to the field of epigenetics a set of tools that were initially developed in the field of genomics. At the same time, computational epigenetics is showing its maturity by closing the circle between the genome and the epigenome, revealing how regulatory layers are interconnected and highlighting the need to jointly consider epigenetic phenomena to explain complex transcriptional responses.
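As an illustration of why the double-negative feedback loops discussed above may suit cell fate transitions, the sketch below simulates a generic mutual-repression circuit and shows that it can settle into two distinct stable states depending on the starting condition. This is a textbook toggle-switch model with Hill-type repression, not the model analyzed by Osella et al. (2014); the variable names, rate constants and initial conditions are all assumptions chosen for illustration.

```python
def simulate_toggle(mirna0, regulator0, max_rate=4.0, hill=2,
                    dt=0.01, steps=5000):
    """Integrate a two-component mutual-repression circuit with Euler steps.

    Each species is produced at a rate that decreases with the level of
    the other species (Hill repression) and decays at unit rate. All
    quantities are dimensionless and purely illustrative.
    """
    mirna, regulator = mirna0, regulator0
    for _ in range(steps):
        # Production of each species is repressed by the other one.
        d_mirna = max_rate / (1.0 + regulator ** hill) - mirna
        d_regulator = max_rate / (1.0 + mirna ** hill) - regulator
        mirna += dt * d_mirna
        regulator += dt * d_regulator
    return mirna, regulator


# Two hypothetical starting points that differ only in which species
# is initially more abundant.
state_a = simulate_toggle(mirna0=2.0, regulator0=1.0)
state_b = simulate_toggle(mirna0=1.0, regulator0=2.0)

print("start with more microRNA:  microRNA=%.3f regulator=%.3f" % state_a)
print("start with more regulator: microRNA=%.3f regulator=%.3f" % state_b)
```

With these assumed parameters, whichever species starts higher ends up dominating, so a transient perturbation could in principle flip the circuit from one state to the other. Whether a given biological loop behaves this way depends on its actual kinetics, which this sketch does not attempt to capture.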

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

1. Computational epigenetics.

Authors: Christoph Bock; Thomas Lengauer
Journal: Bioinformatics Date: 2007-11-17 Impact factor: 6.937

2. Mining for viral fragments in methylation enriched sequencing data.

Authors: Klaas Mensaert; Wim Van Criekinge; Olivier Thas; Ed Schuuring; Renske D M Steenbergen; G Bea A Wisman; Tim De Meyer
Journal: Front Genet Date: 2015-02-04 Impact factor: 4.599

3. De novo identification of differentially methylated regions in the human genome.

Authors: Timothy J Peters; Michael J Buckley; Aaron L Statham; Ruth Pidsley; Katherine Samaras; Reginald V Lord; Susan J Clark; Peter L Molloy
Journal: Epigenetics Chromatin Date: 2015-01-27 Impact factor: 4.954

4. DaVIE: Database for the Visualization and Integration of Epigenetic data.

Authors: Anthony P Fejes; Meaghan J Jones; Michael S Kobor
Journal: Front Genet Date: 2014-09-18 Impact factor: 4.599

5. Interplay of microRNA and epigenetic regulation in the human regulatory network.

Authors: Matteo Osella; Andrea Riba; Alessandro Testori; Davide Corà; Michele Caselle
Journal: Front Genet Date: 2014-10-06 Impact factor: 4.599

6. Methylated DNA is over-represented in whole-genome bisulfite sequencing data.

Authors: Lexiang Ji; Takahiko Sasaki; Xiaoxiao Sun; Ping Ma; Zachary A Lewis; Robert J Schmitz
Journal: Front Genet Date: 2014-10-21 Impact factor: 4.599

7. Statistical methods for detecting differentially methylated loci and regions.

Authors: Mark D Robinson; Abdullah Kahraman; Charity W Law; Helen Lindsay; Malgorzata Nowicka; Lukas M Weber; Xiaobei Zhou
Journal: Front Genet Date: 2014-09-16 Impact factor: 4.599

8. Computational and experimental methods to decipher the epigenetic code.

Authors: Stefano de Pretis; Mattia Pelizzola
Journal: Front Genet Date: 2014-09-23 Impact factor: 4.599

9. M3D: a kernel-based test for spatially correlated changes in methylation profiles.

Authors: Tom R Mayo; Gabriele Schweikert; Guido Sanguinetti
Journal: Bioinformatics Date: 2014-11-13 Impact factor: 6.937

10. Comparative evaluation of DNase-seq footprint identification strategies.

Authors: Iros Barozzi; Pranami Bora; Marco J Morelli
Journal: Front Genet Date: 2014-08-15 Impact factor: 4.599
