
DMEAS: DNA methylation entropy analysis software.

Jianlin He, Xinxi Sun, Xiaojian Shao, Liji Liang, Hehuang Xie.

Abstract

SUMMARY: DMEAS is the first user-friendly tool dedicated to analyzing the distribution of DNA methylation patterns in order to quantify epigenetic heterogeneity. It supports the analysis of both locus-specific and genome-wide bisulfite sequencing data. DMEAS progressively scans the mapping results of bisulfite sequencing reads to extract DNA methylation patterns for contiguous CpG dinucleotides. It determines the DNA methylation level and calculates the methylation entropy of genomic segments, enabling a quantitative assessment of the DNA methylation variation observed in cell populations.
AVAILABILITY AND IMPLEMENTATION: The DMEAS program, user guide and all testing data are freely available from http://sourceforge.net/projects/dmeas/files/

Year:  2013        PMID: 23749987      PMCID: PMC3722522          DOI: 10.1093/bioinformatics/btt332

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


1 INTRODUCTION

DNA methylation is a crucial epigenetic modification involved in many biological processes, from normal cellular differentiation to disease genesis and progression. Traditionally, DNA methylation analysis has been limited to determining and comparing DNA methylation levels. A number of computational tools, including BISMA, QDMR, BiQ Analyzer HT and CpG_MPs, have been developed to analyze DNA methylation data derived from Illumina BeadChip arrays or bisulfite sequencing reads (Lutsik et al.; Rohde et al.; Su et al.; Zhang et al.). Recently, there has been great interest in decoding DNA methylation patterning to understand how cell diversity is generated. In addition to tracing cell lineage (Shibata, 2012; Tsai et al.), DNA methylation patterns can be used as a measure of epigenetic heterogeneity in cell populations (Xie et al.). In particular, with the emergence of next-generation sequencing techniques, rapidly accumulating deep bisulfite sequencing data allow the scrutiny of DNA methylation patterns on a genome-wide scale. However, existing DNA methylation analysis tools focus mainly on mapping bisulfite sequencing data and on comparing DNA methylation levels. No software has been developed to assess the DNA methylation variation embedded in bisulfite sequencing data. Here, we present DMEAS, a C# implementation of the algorithm for DNA methylation entropy calculation (Xie et al.), as an interactive tool to evaluate the variation in DNA methylation patterns.

2 OVERVIEW OF DMEAS

2.1 Input data

DMEAS offers user-friendly interfaces for importing high-throughput methylation sequencing data analyzed with the Bismark software (Krueger and Andrews, 2011). For each sequence read, Bismark provides a one-line annotation of the mapping information, including the read ID, chromosome ID, genomic start position and methylation calls for the cytosines identified. Using this annotation, DMEAS identifies all possible genomic segments with at least four contiguous CpG dinucleotides and records the combination of their methylation statuses. Based on the mapping result for each segment, the sequence reads covering the corresponding genomic region are identified and clustered. The DNA methylation patterns are then extracted, and the methylation level and entropy are determined for visualization and comparison. DMEAS also accepts user-defined locus-specific methylation data. For each genomic region, the default input file should consist of sample information, locus information and multiple lines of numerical data representing DNA methylation patterns. Specifically, DNA methylation statuses are encoded as 0, 1 or 2 for unmethylated, methylated or unknown methylation status, respectively. DMEAS processes lines from the input stream to extract DNA methylation patterns for all possible genomic segments with four contiguous CpG dinucleotides.
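The sliding-window extraction described above can be illustrated with a minimal Python sketch. It is not the DMEAS implementation (which is written in C#). The function name, the one-character-per-CpG string encoding of each read and the decision to skip windows containing an unknown call (2) are all assumptions made for illustration.

```python
from collections import Counter

SEGMENT_SIZE = 4  # contiguous CpG sites per segment, as stated in the text


def extract_segment_patterns(reads, segment_size=SEGMENT_SIZE):
    """Count methylation patterns for every window of contiguous CpG sites.

    `reads` is a list of equal-length strings over '0' (unmethylated),
    '1' (methylated) and '2' (unknown), one character per CpG site.
    Returns {window_start_index: Counter({pattern: read_count})}.
    """
    if not reads:
        return {}
    n_sites = len(reads[0])
    if any(len(read) != n_sites for read in reads):
        raise ValueError("all reads must cover the same number of CpG sites")

    segments = {}
    for start in range(n_sites - segment_size + 1):
        counts = Counter()
        for read in reads:
            pattern = read[start:start + segment_size]
            # Assumption: a window with any unknown call is not counted.
            if "2" in pattern:
                continue
            counts[pattern] += 1
        segments[start] = counts
    return segments


# Hypothetical locus with five CpG sites and four reads.
reads = ["11110", "11100", "00002", "11110"]
patterns = extract_segment_patterns(reads)
for start, counts in patterns.items():
    print(start, dict(counts))
```

With five CpG sites there are two overlapping 4-CpG windows. The third read contributes to the first window only, because its last call is unknown.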

2.2 DNA methylation level and entropy analysis

DNA methylation entropy is calculated as described previously (Xie et al.). Briefly, for a given genomic segment, the frequency of each distinct DNA methylation pattern observed is calculated from all sequence reads mapped to the locus. Given the number of CpG sites, the frequencies of all patterns observed and the total number of sequence reads generated for a given genomic locus, methylation entropy can be determined with a modified version of the Shannon entropy equation (Xie et al.). A 4-CpG segment has 2^4 = 16 possible methylation patterns; so that all 16 could in principle be observed, only genomic segments with ≥16× coverage are included in further analysis. The DNA methylation level of a genomic region with multiple CpG sites is defined as the percentage of methylated CpG dinucleotides observed. For each high-throughput bisulfite sequencing dataset, DMEAS provides a descriptive statistical summary and distribution plots for methylation level and entropy. Statistical analyses such as Pearson correlation are provided for pairwise and/or multi-sample comparisons. Similar functions have been implemented for the analysis of locus-specific methylation data.
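As a worked illustration of these two quantities, the following Python sketch computes a methylation entropy and a methylation level from pattern counts. The text does not spell out the modified Shannon equation, so the form used here (Shannon entropy in bits of the pattern frequencies, divided by the number of CpG sites so that a 4-CpG segment scores between 0 and 1) is an assumption that should be checked against Xie et al. All function names are hypothetical.

```python
import math
from collections import Counter

MIN_COVERAGE = 16  # reads required per 4-CpG segment, as stated in the text


def methylation_entropy(pattern_counts, n_cpgs=4):
    """Shannon entropy (bits) of pattern frequencies, divided by n_cpgs.

    Assumed normalization: 0 when every read shows the same pattern,
    1 when all 2**n_cpgs patterns are equally frequent.
    """
    total = sum(pattern_counts.values())
    if total == 0:
        raise ValueError("segment has no reads")
    entropy_bits = 0.0
    for count in pattern_counts.values():
        if count > 0:
            p = count / total
            entropy_bits -= p * math.log2(p)
    return entropy_bits / n_cpgs


def methylation_level(pattern_counts):
    """Percentage of methylated calls ('1') among all CpG calls."""
    methylated = sum(p.count("1") * c for p, c in pattern_counts.items())
    total_calls = sum(len(p) * c for p, c in pattern_counts.items())
    if total_calls == 0:
        raise ValueError("segment has no CpG calls")
    return 100.0 * methylated / total_calls


def has_enough_coverage(pattern_counts, min_coverage=MIN_COVERAGE):
    """Apply the >=16x coverage filter described in the text."""
    return sum(pattern_counts.values()) >= min_coverage


# Two extreme 4-CpG segments, each covered by 16 reads.
uniform = Counter({format(i, "04b"): 1 for i in range(16)})  # every pattern once
single = Counter({"1111": 16})                               # one pattern only

print(methylation_entropy(uniform), methylation_level(uniform))  # 1.0 50.0
print(methylation_entropy(single), methylation_level(single))    # 0.0 100.0
```

The two examples show why level alone cannot capture heterogeneity: a segment at 50% methylation may consist of 16 different patterns, as here, or of just two patterns in equal proportion, and only the entropy distinguishes these cases.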

2.3 Methylation pattern visualization and data output

DMEAS allows users to visualize methylation patterns at genomic loci of interest. DNA methylation patterns are displayed as heatmaps, in which red, blue and gray rectangles represent methylated, unmethylated and unknown methylation status, respectively. Owing to the large volume of high-throughput data, all sequence reads are sorted in ascending order of genomic coordinate to facilitate later retrieval. To achieve a reasonable resolution, DNA methylation patterns are displayed in 1 kb windows. If no methylation data are found for the targeted 1 kb window, DMEAS automatically searches upstream and downstream for the nearest genomic region with methylation data and reports its genomic coordinates. DMEAS also provides user-friendly interfaces to export results as text files or images. The distribution of DNA methylation level or DNA methylation entropy can be shown as either a histogram (Fig. 1A) or a line chart, and the correlation between methylation level and entropy is shown as a scatter plot (Fig. 1B). In addition, the methylation level and entropy results can be exported directly to tables (Fig. 1C). Together with the heatmap representation (Fig. 1D), DMEAS generates an output file with basic statistical information for the selected samples, such as the number of segments, the total number of reads, the total number of CpG sites and the average methylation level and entropy. All saved images and tables can be used for further analysis.
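Sorting reads by coordinate makes both the window lookup and the nearest-region fallback a binary search. The Python sketch below illustrates the idea under simplifying assumptions: each read is reduced to a single start coordinate on one chromosome, distance is measured to the nearer window edge, and ties go to the upstream side. The function names are hypothetical and this is not the DMEAS code.

```python
from bisect import bisect_left

WINDOW = 1000  # 1 kb display window, as stated in the text


def reads_in_window(sorted_starts, window_start, window=WINDOW):
    """Return read starts in [window_start, window_start + window)."""
    lo = bisect_left(sorted_starts, window_start)
    hi = bisect_left(sorted_starts, window_start + window)
    return sorted_starts[lo:hi]


def nearest_position_with_data(sorted_starts, window_start, window=WINDOW):
    """Return a read start inside the window, else the closest one outside.

    Returns None when there are no reads at all.
    """
    if not sorted_starts:
        return None
    hits = reads_in_window(sorted_starts, window_start, window)
    if hits:
        return hits[0]

    # The window is empty, so idx splits the reads into upstream/downstream.
    idx = bisect_left(sorted_starts, window_start)
    upstream = sorted_starts[idx - 1] if idx > 0 else None
    downstream = sorted_starts[idx] if idx < len(sorted_starts) else None
    if upstream is None:
        return downstream
    if downstream is None:
        return upstream

    up_dist = window_start - upstream
    down_dist = downstream - (window_start + window - 1)
    return upstream if up_dist <= down_dist else downstream


# Hypothetical read start coordinates, already sorted in ascending order.
starts = [100, 250, 5200, 9000]
print(reads_in_window(starts, 0))                # [100, 250]
print(nearest_position_with_data(starts, 2000))  # 250
print(nearest_position_with_data(starts, 4000))  # 5200
```

Each lookup costs O(log n) in the number of reads, which is the practical benefit of keeping the reads sorted.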
Fig. 1.

Methylation level/entropy analysis with DMEAS. (A) Histogram of DNA methylation level/entropy. (B) Scatter plot of the association between methylation level and methylation entropy. (C) Output table of DNA methylation entropy/level. (D) DNA methylation pattern heatmap for locus-specific data. Blue, red and gray represent unmethylated, methylated and unknown methylation status, respectively.

3 CONCLUSION

We have developed DMEAS, the first software to enable the analysis of DNA methylation patterns and the quantification of epigenetic heterogeneity. For bisulfite sequencing data, whether locus-specific or genome-wide, DMEAS automatically identifies all possible 4-CpG segments and determines their methylation levels and entropies. The DNA methylation pattern of each segment can be visualized, and a descriptive statistical summary of the segments, including the distributions of methylation level and entropy, is provided. In addition, Pearson correlation is used to measure the correlation of methylation levels and entropies across samples. We anticipate that DMEAS will help researchers explore DNA methylation data in a new dimension.

Funding: This work was supported by grants from the Natural Science Foundation of China [31201002, 81270633]. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Conflict of Interest: none declared.
REFERENCES

1.  Heterogeneity and randomness of DNA methylation patterns in human embryonic stem cells.

Authors:  Albert G Tsai; Debbie M Chen; Mayin Lin; John C F Hsieh; Cindy Y Okitsu; Alexander Taghva; Darryl Shibata; Chih-Lin Hsieh
Journal:  DNA Cell Biol       Date:  2012-01-25       Impact factor: 3.311

2.  Cancer. Heterogeneity and tumor history.

Authors:  Darryl Shibata
Journal:  Science       Date:  2012-04-20       Impact factor: 47.728

3.  BISMA--fast and accurate bisulfite sequencing data analysis of individual clones from unique and repetitive sequences.

Authors:  Christian Rohde; Yingying Zhang; Richard Reinhardt; Albert Jeltsch
Journal:  BMC Bioinformatics       Date:  2010-05-06       Impact factor: 3.169

4.  Genome-wide quantitative assessment of variation in DNA methylation patterns.

Authors:  Hehuang Xie; Min Wang; Alexandre de Andrade; Maria de F Bonaldo; Vasil Galat; Kelly Arndt; Veena Rajaram; Stewart Goldman; Tadanori Tomita; Marcelo B Soares
Journal:  Nucleic Acids Res       Date:  2011-01-28       Impact factor: 16.971

5.  Bismark: a flexible aligner and methylation caller for Bisulfite-Seq applications.

Authors:  Felix Krueger; Simon R Andrews
Journal:  Bioinformatics       Date:  2011-04-14       Impact factor: 6.937

6.  BiQ Analyzer HT: locus-specific analysis of DNA methylation by high-throughput bisulfite sequencing.

Authors:  Pavlo Lutsik; Lars Feuerbach; Julia Arand; Thomas Lengauer; Jörn Walter; Christoph Bock
Journal:  Nucleic Acids Res       Date:  2011-05-11       Impact factor: 16.971

7.  QDMR: a quantitative method for identification of differentially methylated regions by entropy.

Authors:  Yan Zhang; Hongbo Liu; Jie Lv; Xue Xiao; Jiang Zhu; Xiaojuan Liu; Jianzhong Su; Xia Li; Qiong Wu; Fang Wang; Ying Cui
Journal:  Nucleic Acids Res       Date:  2011-02-08       Impact factor: 16.971

8.  CpG_MPs: identification of CpG methylation patterns of genomic regions from high-throughput bisulfite sequencing data.

Authors:  Jianzhong Su; Haidan Yan; Yanjun Wei; Hongbo Liu; Hui Liu; Fang Wang; Jie Lv; Qiong Wu; Yan Zhang
Journal:  Nucleic Acids Res       Date:  2012-08-31       Impact factor: 16.971
