Shian Su1,2, Quentin Gouil1,2, Marnie E Blewitt1,2, Dianne Cook3, Peter F Hickey1,2, Matthew E Ritchie1,2.
Abstract
A key benefit of long-read nanopore sequencing technology is the ability to detect modified DNA bases, such as 5-methylcytosine. The lack of R/Bioconductor tools for the effective visualization of nanopore methylation profiles between samples from different experimental groups led us to develop the NanoMethViz R package. Our software can handle methylation output generated from a range of different methylation callers and manages large datasets using a compressed data format. To fully explore the methylation patterns in a dataset, NanoMethViz allows plotting of data at various resolutions. At the sample level, we use dimensionality reduction to look at the relationships between methylation profiles in an unsupervised way. We visualize methylation profiles of classes of features such as genes or CpG islands by scaling them to relative positions and aggregating their profiles. At the finest resolution, we visualize methylation patterns across individual reads along the genome using spaghetti plots and heatmaps, allowing users to explore particular genes or genomic regions of interest. In summary, our software makes the handling of methylation signal more convenient, expands upon the visualization options for nanopore data and works seamlessly with existing methylation analysis tools available in the Bioconductor project. Our software is available at https://bioconductor.org/packages/NanoMethViz.
Year: 2021 PMID: 34695109 PMCID: PMC8568149 DOI: 10.1371/journal.pcbi.1009524
Source DB: PubMed Journal: PLoS Comput Biol ISSN: 1553-734X Impact factor: 4.475
Fig 1. Nanopore methylation workflow and data format.
A) The workflow used to perform differential methylation analysis. The red arrows indicate steps where NanoMethViz provides conversion functions to bridge workflow steps. NanoMethViz performs visualization at the end of the workflow. B) Functions are provided in NanoMethViz to import the output of various methylation callers into a format used for visualization. This can be further converted by provided functions into formats suitable for various DMR detection methods provided in Bioconductor. C) The bgzip-tabix format compresses rows of tabular genomic information into blocks, and indexes each block by the range of genomic positions it contains. This index is used for fast access to the relevant blocks for decompression and reading.
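The block-and-index idea in panel C can be illustrated with a minimal Python sketch. It is not the actual BGZF/tabix on-disk format and does not use any NanoMethViz function: the names `build_blocks` and `query`, the `rows_per_block` parameter, and the three-column row layout are all assumptions made for illustration. Rows sorted by chromosome and position are compressed in small groups, and a lightweight index records the position range each block covers so that a region query decompresses only the overlapping blocks.

```python
import zlib


def build_blocks(rows, rows_per_block=2):
    """Compress sorted (chrom, pos, text) rows into blocks with a position index.

    Illustrative only: real bgzip blocks are sized in bytes, not rows.
    """
    blocks = []  # compressed payloads
    index = []   # (chrom, first_pos, last_pos, block_number)
    chunk = []

    def flush():
        # Write the pending rows as one compressed block and index its range.
        if chunk:
            payload = "\n".join(f"{c}\t{p}\t{t}" for c, p, t in chunk)
            index.append((chunk[0][0], chunk[0][1], chunk[-1][1], len(blocks)))
            blocks.append(zlib.compress(payload.encode()))
            chunk.clear()

    for row in rows:
        # Start a new block when the chromosome changes or the block is full.
        if chunk and (row[0] != chunk[0][0] or len(chunk) == rows_per_block):
            flush()
        chunk.append(row)
    flush()
    return blocks, index


def query(blocks, index, chrom, start, end):
    """Return rows in chrom:start-end and the number of blocks decompressed."""
    hits = []
    decompressed = 0
    for c, first, last, n in index:
        if c != chrom or last < start or first > end:
            continue  # no overlap: skip without decompressing
        decompressed += 1
        for line in zlib.decompress(blocks[n]).decode().split("\n"):
            _, pos, text = line.split("\t")
            if start <= int(pos) <= end:
                hits.append((int(pos), text))
    return hits, decompressed
```

Calling `query(blocks, index, "chr1", 120, 200)` on a small table touches only the blocks whose indexed range overlaps positions 120 to 200, which is the property that keeps region lookups fast on large compressed files.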
Fig 2. Summary of the plotting capabilities of NanoMethViz.
A) Multidimensional scaling plot of haplotyped samples. B) Aggregated methylation profile across all genes on the X chromosome, scaled to relative positions. C) Box plot of methylation probabilities over promoter and non-promoter regions for the BL6 and CAST haplotypes. D) Spaghetti plots of known imprinted genes Peg3, Meg3, Peg10 and Peg13. Thin lines show the smoothed methylation probability on individual long reads; thick lines show the aggregated trend across all reads. The shaded regions are annotated as DMRs by bsseq, and the tick marks along the x-axis show the location of CpG motifs. E) Spaghetti plot of Gnas, which shows two adjacent regions of opposite imprinting patterns. F) Spaghetti plot of Xist, a gene expressed from the inactive X chromosome.
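The aggregation in panel B, where features of different lengths are scaled to relative positions before their profiles are combined, could be sketched roughly as follows. This is a hedged illustration in Python, not the package's implementation: the function names, the fixed-bin averaging, and the input layout are assumptions, and strand orientation and smoothing are ignored.

```python
def relative_position(pos, start, end):
    """Map a genomic position to a 0..1 coordinate along a feature."""
    return (pos - start) / (end - start)


def aggregate_profile(features, calls, n_bins=4):
    """Average methylation probability per relative-position bin.

    features: {name: (start, end)}
    calls:    [(feature_name, genomic_pos, methylation_probability)]
    Returns one mean per bin, or None for bins with no calls.
    """
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for name, pos, prob in calls:
        start, end = features[name]
        rel = relative_position(pos, start, end)
        if not 0.0 <= rel <= 1.0:
            continue  # call lies outside the feature body
        # The last bin is closed on the right so that rel == 1.0 is kept.
        b = min(int(rel * n_bins), n_bins - 1)
        sums[b] += prob
        counts[b] += 1
    return [s / c if c else None for s, c in zip(sums, counts)]
```

Because every feature is mapped onto the same 0 to 1 axis, a 100 bp gene and a 2 kb gene contribute to the same bins, which is what allows a single profile to summarize a whole class of features.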