Nicholas C Wong, Bernard J Pope, Ida L Candiloro, Darren Korbie, Matt Trau, Stephen Q Wong, Thomas Mikeska, Xinmin Zhang, Mark Pitman, Stefanie Eggers, Stephen R Doyle, Alexander Dobrovic.
Abstract
BACKGROUND: DNA methylation at a gene promoter region has the potential to regulate gene transcription. Patterns of methylation over multiple CpG sites in a region are often complex and cell type specific, with the region showing multiple allelic patterns in a sample. This complexity is commonly obscured when DNA methylation data are summarised as an average percentage value for each CpG site (or aggregated across CpG sites). Methylation patterns can only be fully characterised by clonal analysis. Deep sequencing makes it possible to investigate clonal DNA methylation patterns in unprecedented detail and at unprecedented scale, enabling proper characterisation of the heterogeneity of methylation patterns. However, the sheer amount and complexity of sequencing data require new synoptic approaches to visualise the distribution of allelic patterns.
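The point about averaging can be illustrated with a minimal sketch. The reads below are invented for illustration and are not data from the study; the function names `per_site_average` and `epiallele_counts` are hypothetical and are not part of Methpat. Two samples with identical per-CpG averages turn out to contain entirely different allelic patterns once each read is counted as a whole pattern.

```python
from collections import Counter

# Hypothetical reads covering the same three CpG sites.
# Each tuple is one read: 1 = methylated, 0 = unmethylated.
sample_a = [(1, 1, 1)] * 5 + [(0, 0, 0)] * 5
sample_b = [(1, 0, 1)] * 5 + [(0, 1, 0)] * 5


def per_site_average(reads):
    """Fraction of methylated calls at each CpG site across all reads."""
    n = len(reads)
    return [sum(site_calls) / n for site_calls in zip(*reads)]


def epiallele_counts(reads):
    """Count how many reads carry each whole-read methylation pattern."""
    return Counter(
        "".join("M" if call else "U" for call in read) for read in reads
    )


if __name__ == "__main__":
    for name, reads in (("sample_a", sample_a), ("sample_b", sample_b)):
        print(name, per_site_average(reads), dict(epiallele_counts(reads)))
```

Both samples average 50 % methylation at every site, yet the first holds only fully methylated and fully unmethylated reads while the second holds two alternating patterns. A per-site summary cannot distinguish them; a count per pattern can.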
Year: 2016 PMID: 26911705 PMCID: PMC4765133 DOI: 10.1186/s12859-016-0950-8
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Mapping statistics of bisulfite amplicon libraries
| Sample | Mapping Efficiency | Unique Hits | Methylated CpG | Methylated CHG | Methylated CHH | Total C’s analysed |
|---|---|---|---|---|---|---|
| 293 | 52.2 % | 7539 | 64.9 % | 0.2 % | 0.3 % | 316211 |
| 40424 | 55.3 % | 9414 | 37.5 % | 0.2 % | 0.2 % | 351086 |
| 910046 | 42.0 % | 7060 | 32.6 % | 0.2 % | 0.3 % | 299795 |
| 12a-cd19 | 14.9 % | 48648 | 47.9 % | 0.4 % | 0.5 % | 1933767 |
| 12a-cd34 | 30.3 % | 85049 | 36.5 % | 0.1 % | 0.2 % | 3703147 |
| 12a-cd45 | 32.4 % | 109173 | 32.6 % | 0.1 % | 0.2 % | 4714744 |
| 12acd33 | 36.2 % | 161885 | 32.8 % | 0.2 % | 0.2 % | 6997070 |
| 6-mda453 | 54.6 % | 201660 | 84.4 % | 0.8 % | 1.3 % | 9179816 |
| 6c-cd19 | 7.9 % | 22258 | 77.8 % | 0.2 % | 0.3 % | 777739 |
| 6c-cd33 | 27.9 % | 20071 | 35.2 % | 0.2 % | 0.2 % | 851116 |
| 6c-cd34 | 19.5 % | 36928 | 49.7 % | 0.2 % | 0.2 % | 1628107 |
| 6ccd45 | 33.0 % | 31087 | 39.5 % | 0.1 % | 0.2 % | 1314281 |
| 9a-cd19 | 21.2 % | 39352 | 48.7 % | 0.2 % | 0.3 % | 1638757 |
| 9a-cd33 | 31.9 % | 125884 | 35.8 % | 0.2 % | 0.2 % | 5459419 |
| 9a-cd34 | 26.2 % | 77870 | 43.4 % | 0.2 % | 0.2 % | 3321993 |
| 9a-cd45 | 46.6 % | 28085 | 29.8 % | 0.2 % | 0.2 % | 1211803 |
| 9awholeblood | 31.5 % | 97532 | 30.8 % | 0.2 % | 0.2 % | 4081834 |
| brl | 49.3 % | 9107 | 32.7 % | 0.2 % | 0.4 % | 398977 |
| caco | 19.6 % | 129536 | 78.1 % | 0.2 % | 0.2 % | 4512574 |
| dg75 | 51.7 % | 10827 | 57.2 % | 0.3 % | 0.3 % | 489096 |
| ekvx | 23.0 % | 115915 | 63.1 % | 0.2 % | 0.2 % | 4494359 |
| hela | 43.1 % | 41650 | 55.9 % | 0.2 % | 0.2 % | 1731811 |
| hepg2 | 39.2 % | 24667 | 63.4 % | 0.3 % | 0.3 % | 971693 |
| ht1080 | 40.7 % | 4586 | 67.0 % | 0.2 % | 0.4 % | 176188 |
| htb22-col | 30.9 % | 45576 | 79.9 % | 0.2 % | 0.2 % | 1863098 |
| jwl | 31.3 % | 18814 | 42.7 % | 0.2 % | 0.2 % | 771188 |
| k562 | 49.7 % | 144791 | 55.9 % | 0.3 % | 0.3 % | 6230391 |
| ls174t | 41.2 % | 3691 | 57.2 % | 0.2 % | 0.3 % | 151722 |
| mcf7 | 30.0 % | 87404 | 71.6 % | 0.8 % | 0.8 % | 3786412 |
| mda-mb231-bag | 29.0 % | 94811 | 77.3 % | 1.0 % | 1.1 % | 4171147 |
| nalm6 | 43.6 % | 37669 | 85.8 % | 0.2 % | 0.2 % | 1569041 |
| nccit | 44.0 % | 31656 | 45.7 % | 0.4 % | 0.3 % | 1406165 |
| ovcar8 | 32.3 % | 46864 | 63.4 % | 0.3 % | 0.3 % | 1917527 |
| sknas | 21.6 % | 275040 | 27.7 % | 0.1 % | 0.2 % | 11313285 |
| u231 | 14.0 % | 123302 | 74.8 % | 0.4 % | 0.2 % | 4389352 |
Fig. 1 Methpat visualisation of DNA methylation at the FOXP3 gene promoter region. Blood samples from one individual were separated by fluorescence-activated cell sorting (FACS) into various haematopoietic compartments, assessed for DNA methylation and analysed by Methpat. DNA methylation across this locus varies according to cell type. The diversity of epialleles within each cell type also varies, with one or two patterns dominating the read counts.
Fig. 2 Methpat visualisation of DNA methylation at the MEST imprinted region in a range of primary cells (CD34, CD45, CD19 and CD33) and tissue (whole blood), model cancer cell lines (HeLa and MDA-MB-231-BAG) and a normal lymphoblast cell line (BRL). The expected MEST methylation level of ~50 % was observed in all normal sample types, whereas the cancer cell lines show methylated MEST. In addition, Methpat visualises the epiallelic diversity of MEST in all these samples.
Fig. 3 Methpat visualisation of DNA methylation at the RASSF1A gene promoter region. Methylation of RASSF1A is present in cancer cell lines (Caco, HepG2 and NALM6), with the exception of HeLa. Examples of RASSF1A methylation in whole blood and a normal lymphoblast cell line (JWL) are also shown.
Fig. 4 Methpat visualisation of DNA methylation at the CDKN2A gene promoter region.
Fig. 5 Methpat visualisation of DNA methylation within the D-Loop regulatory region of the mitochondrial genome.
Alternative DNA methylation Analysis and Visualisation Tools
| Software | Program Language and Implementation | Analysis Process | Visual Output | Input file | Output file | Epiallelic Counts | Experiment Quality Check |
|---|---|---|---|---|---|---|---|
| Methpat | Python, pip install, URL available to install files locally | Summarises Bismark output | Interactive HTML and summary text file of epiallele counts. Scalable PNG file | Bismark methylation extractor output, user-defined BED format file | HTML and tab delimited text file | Yes | No, leverages Bismark |
| Bismark | Command line, Python, requires bwa | Performs alignment to bisulfite reference genome | None, generates BAM files for visualisation with SeqMonk or IGV | FASTQ file | BAM and tab delimited text files | No | Yes, calculates C to T conversion |
| BSPAT | Java/JSP web interface | Visualisation and summarisation of Bismark output | PNG file and UCSC Genome Browser file | Bismark output, fastq files | Text file summary, PNG and UCSC Genome Browser BED file | Yes | No |
| MPFE | R library, Bioconductor | Calculates probabilities that epialleles are true | R image outputs | Table of read counts from bisulfite sequencing data | Derived statistics and plots | Yes | Yes |
| Methylation plotter | R library, shiny interactive web application | Visualises beta DNA methylation values | Interactive webpage with setting options to adjust a static image of DNA methylation values for each sample. PNG and PDF output. | Text file containing matrix of sample vs beta value at each CpG of interest | PDF and PNG image file | No | No |
| RnBeads | R library, Bioconductor | Processes summary data from other software for visualisation | Interactive HTML and UCSC Genome browser track hub files. PNG files | BED file | HTML summary | No | Yes |
| coMET | R library, Webserver for analysis | For EWAS studies. Analyses derived matrix files | Image files of plots with genomic locations. | Text matrix files | Image files | No | No |
EWAS: epigenome-wide association studies using Illumina Infinium HM450 BeadArrays.