Giovanni Scala1, Ornella Affinito2,3, Domenico Palumbo2, Ermanno Florio2,3, Antonella Monticelli3, Gennaro Miele4,5, Lorenzo Chiariotti2,3, Sergio Cocozza2,3.
Abstract
BACKGROUND: CpG sites in an individual molecule may exist in a binary state (methylated or unmethylated), and each individual DNA molecule, containing a certain number of CpGs, carries a combination of these states that defines an epihaplotype. Classic quantification-based approaches to studying DNA methylation are intrinsically unable to fully represent the complexity of the underlying methylation substrate. Epihaplotype-based approaches, on the other hand, allow methylation profiles of cell populations to be studied at the single-molecule level. For such investigations, next-generation sequencing techniques can be used, both for quantitative and for epihaplotype analysis. Currently available tools for methylation analysis lack output formats that explicitly report CpG methylation profiles at the single-molecule level, and they lack suitable statistical tools for interpreting those profiles.
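The distinction drawn above between quantitative and epihaplotype-based analysis can be illustrated with a minimal sketch. It is not the ampliMethProfiler implementation; the function names and the two toy populations are hypothetical, and each molecule is assumed to be encoded as a string of "1" (methylated) and "0" (unmethylated) characters, one per CpG.

```python
from collections import Counter


def profile_abundances(profiles):
    """Relative abundance of each epihaplotype (one binary string per molecule)."""
    counts = Counter(profiles)
    total = sum(counts.values())
    return {profile: n / total for profile, n in counts.items()}


def average_methylation(profiles):
    """Fraction of methylated molecules at each CpG position."""
    n_sites = len(profiles[0])
    return [
        sum(p[i] == "1" for p in profiles) / len(profiles)
        for i in range(n_sites)
    ]


# Hypothetical populations: 4 molecules, 2 CpGs each ("1" = methylated).
pop_a = ["11", "11", "00", "00"]
pop_b = ["10", "10", "01", "01"]

print(average_methylation(pop_a), average_methylation(pop_b))
print(profile_abundances(pop_a), profile_abundances(pop_b))
```

Both toy populations have the same per-site average methylation (0.5 at each CpG), yet they share no epihaplotype, which is the kind of difference a purely quantitative summary cannot show.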
Keywords: Bisulfite sequencing; DNA methylation; Epihaplotype; Epihaplotype based analysis; Methylation profiles
Year: 2016 PMID: 27884103 PMCID: PMC5123276 DOI: 10.1186/s12859-016-1380-3
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Fig. 1 ampliMethProfiler workflow. Functional modules are represented as trapezoids. Input and output files are represented as dashed and solid rectangles, respectively
Fig. 2 ampliMethProfiler output files. a Content example of a summary and quality statistics file. b Content example of a plain text alignment file. c Content example of a methylation profiles file. d Content example of a methylation profile abundances file
Sample characteristics
| Mouse | Age | Input reads | Passing filter reads |
|---|---|---|---|
| M1_0 | P0 | 114987 | 112283 |
| M2_0 | P0 | 48780 | 48288 |
| M3_0 | P0 | 90636 | 89114 |
| M4_90 | P90 | 5436 | 2498 |
| M5_90 | P90 | 28711 | 27750 |
| M6_90 | P90 | 117228 | 115069 |
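The read counts in the table can be turned into a per-sample passing-filter fraction with a short calculation. The sketch below is only an illustration: the numbers are copied from the table, while the variable and function names are hypothetical.

```python
# (sample, input reads, passing filter reads), copied from the table above.
samples = [
    ("M1_0", 114987, 112283),
    ("M2_0", 48780, 48288),
    ("M3_0", 90636, 89114),
    ("M4_90", 5436, 2498),
    ("M5_90", 28711, 27750),
    ("M6_90", 117228, 115069),
]


def pass_fraction(input_reads, passing_reads):
    """Fraction of input reads that passed the filter."""
    return passing_reads / input_reads


fractions = {name: pass_fraction(n_in, n_pass) for name, n_in, n_pass in samples}

for name, frac in fractions.items():
    print(f"{name}: {frac:.1%}")
```

With these counts, M4_90 is the only sample in which fewer than half of the input reads pass the filter; every other sample retains more than 96% of its reads.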
Fig. 3 Profile abundances plots. a Profile composition summary charts. Bar charts representing relative abundances of profiles grouped by number of methylated CpGs. b Heatmap representing methylation profile abundances in each sample
Fig. 4 Alpha diversity rarefaction plots at sample level (right column) and developmental stage level (left column)
Fig. 5 Beta diversity plots. a 3D Emperor plot snapshot representing the first three principal components of the PCoA. b Boxplots of Bray-Curtis distances. From left to right: pairwise distances between samples from the same developmental stage, pairwise distances between samples from different developmental stages, distances within P90 mice, distances within P0 mice, and distances between pairs of P90 and P0 mice
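For readers unfamiliar with the Bray-Curtis distance used in panel b, the following is a minimal pure-Python sketch of the standard formula, the sum of absolute count differences divided by the sum of all counts. It does not reproduce the QIIME-based pipeline; the function name and the two example count tables are hypothetical, and any normalisation or rarefaction applied upstream is not modelled.

```python
def bray_curtis(counts_a, counts_b):
    """Bray-Curtis distance between two profile-count dictionaries.

    Keys are methylation profiles (for example "0110"), values are the
    number of reads showing that profile. Missing profiles count as zero.
    """
    profiles = set(counts_a) | set(counts_b)
    diff = sum(abs(counts_a.get(p, 0) - counts_b.get(p, 0)) for p in profiles)
    total = sum(counts_a.get(p, 0) + counts_b.get(p, 0) for p in profiles)
    if total == 0:
        raise ValueError("both samples are empty")
    return diff / total


# Hypothetical profile counts for two samples with 2 CpGs each.
sample_x = {"00": 6, "01": 3, "11": 1}
sample_y = {"00": 2, "01": 3, "11": 5}

print(bray_curtis(sample_x, sample_y))
```

The distance is 0 for identical count tables and 1 for samples that share no profile, so larger boxplot values indicate more dissimilar epihaplotype compositions.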
Comparison of existing software programs for bisulfite sequencing analysis (Adapted from [14])
| Software | Programming Language and Implementation | Analysis Process | Visual Output | Input File | Output File | EHA | Epihaplotype Counts | Experiment Quality Check |
|---|---|---|---|---|---|---|---|---|
| MethPat | Python, pip install. | Summarises Bismark output. | Interactive HTML and summary text file of epihaplotype counts. | Bismark methylation extractor output, user-defined BED format file. | HTML and tab delimited text file. | No | Yes | No, made by Bismark. |
| Bismark | Command line. | Performs alignment to bisulfite reference genome. | None, generates BAM files for visualisation with SeqMonk or IGV. | FASTQ file. | BAM and tab delimited text files. | No | No | Yes, computes C to T conversion. |
| BSPAT | Java/JSP web interface. | Visualization and summarization of Bismark output. | PNG file and UCSC | Bismark output, FASTQ files. | Text file summary, PNG and UCSC Genome Browser BED file. | No | Yes | No |
| MPFE | R library, Bioconductor. | Calculates probabilities that epihaplotypes are true. | R image outputs. | Table of read counts from bisulfite sequencing data. | Derived statistics and plots. | No | Yes | Yes |
| Methylation plotter | R library, shiny interactive web application. | Visualizes beta DNA methylation values. | Interactive webpage with setting options to adjust a static image of DNA methylation values for each sample. PNG and PDF output. | Text file containing matrix of sample vs beta value at each CpG of interest. | PDF and PNG image file. | No | No | No |
| RnBeads | R library, Bioconductor. | Processes summary data from other software for visualization. | Interactive HTML and UCSC Genome browser track hub files. PNG files. | BED file | HTML summary | No | No | Yes |
| coMET | R library, Webserver for analysis. | For EWAS studies. | Image files of plots with genomic locations. | Text matrix files | Image files | No | No | No |
| ampliMethProfiler | Python, BLAST and QIIME. | Filtering and de-multiplexing of the sequences, generation of the methylation status and epihaplotype composition analysis. | HTML plots and summary text file. A heatmap in PDF format. An alpha and a beta diversity plot in HTML and PDF format. | A FASTA directory with all FASTA files for each sample. A file containing the reads from the sequencer. A metaFile containing information about the samples. | Filtered FASTA file. BLAST-aligned sequences in XML and TXT format. Summary and quality statistics for each region. CpG methylation profile matrix. BIOM file with number of occurrences. | Yes | Yes | Yes, quality statistics for each analyzed region. |