Quentin Gouil, Andrew Keniry.
Abstract
Bisulfite sequencing is a powerful technique to detect 5-methylcytosine in DNA that has contributed immensely to our understanding of epigenetic regulation in plants and animals. Meanwhile, research on other base modifications, including 6-methyladenine and 4-methylcytosine, which are frequent in prokaryotes, has been impeded by the lack of a comparable technique. Bisulfite sequencing also suffers from a number of drawbacks that are difficult to surmount, including DNA degradation, lack of specificity, and short reads with low sequence diversity. In this review, we explore the recent refinements to bisulfite sequencing protocols that enable targeting genomic regions of interest, detecting derivatives of 5-methylcytosine, and mapping single-cell methylomes. We then present the unique advantage of long-read sequencing in detecting base modifications in native DNA and highlight the respective strengths and weaknesses of PacBio and Nanopore sequencing for this application. Although analysing epigenetic data from long-read platforms remains challenging, the ability to detect various modified bases from a universal sample preparation, together with the mapping and phasing advantages of the longer read lengths, gives long-read sequencing a decisive edge over short-read bisulfite sequencing for an expanding number of applications across kingdoms.
Keywords: Nanopore; PacBio; base modifications; bisulfite; epigenetics
Year: 2019 PMID: 31755932 PMCID: PMC6923321 DOI: 10.1042/EBC20190027
Source DB: PubMed Journal: Essays Biochem ISSN: 0071-1365 Impact factor: 8.000
Figure 1. Detection of base modifications by bisulfite and long-read sequencing
Whole-genome bisulfite sequencing (WGBS) provides accurate binary calls of cytosine methylation status at nucleotide resolution, but cannot distinguish between 5mC and 5hmC or detect other oxidised forms without additional techniques. The short reads are limited in their ability to map to repeated sequences and to support haplotyping. Reduced representation bisulfite sequencing (RRBS) restricts the sequencing space to CpG islands. Both bisulfite protocols are compatible with single-cell genomics. PacBio single molecule real-time (SMRT) sequencing generates reads of 250 b to 20 kb with relatively low-accuracy probabilistic calls for 4mC and 6mA modifications. Oxford Nanopore Technologies (ONT) nanopore sequencing produces reads of 500 b to 1 Mb with higher-accuracy probabilistic calls for a range of modifications, including 5mC, 4mC, and 6mA. Long-read technologies offer a simpler library preparation workflow that avoids amplification biases, but they require more input material and more advanced data analysis. The longer read lengths improve mapping and haplotyping (estimates based on our experience with C57BL/6 × CAST/EiJ F1 mice on Nanopore [62]).
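
To make the "binary calls" of bisulfite sequencing concrete, the following is a minimal sketch and not the authors' pipeline. It relies on the standard chemistry: bisulfite treatment converts unmethylated cytosine so that it is read as T, whereas methylated cytosine stays C. The function name `call_methylation`, the toy reference, and the toy reads are all hypothetical. The sketch assumes ungapped, forward-strand reads with known start positions and ignores incomplete conversion, sequencing errors, and the reverse strand.

```python
from collections import Counter


def call_methylation(reference, reads):
    """Per-cytosine methylation fraction from bisulfite-converted reads.

    reference: forward-strand reference sequence (hypothetical toy input).
    reads: list of (start, sequence) pairs; ungapped, forward strand only.
    Returns {reference position: fraction of reads showing C at that position}.
    """
    methylated = Counter()
    covered = Counter()
    for start, seq in reads:
        for offset, base in enumerate(seq):
            pos = start + offset
            if pos >= len(reference) or reference[pos] != "C":
                continue  # only reference cytosines are informative
            if base == "C":
                # Protected from conversion: called methylated.
                # 5mC and 5hmC are indistinguishable at this step.
                methylated[pos] += 1
                covered[pos] += 1
            elif base == "T":
                # Converted by bisulfite: called unmethylated.
                covered[pos] += 1
            # Any other base is treated as an error and skipped.
    return {pos: methylated[pos] / covered[pos] for pos in sorted(covered)}


# Toy example: cytosines sit at positions 1, 4, and 7 of the reference.
reference = "ACGTCGAC"
reads = [(0, "ACGTTGAT"), (0, "ACGTCGAT"), (2, "GTTGAC")]
fractions = call_methylation(reference, reads)
for pos, frac in fractions.items():
    print(pos, round(frac, 2))
```

Each read contributes a yes/no call at every cytosine it covers, and the per-site fraction is simply the proportion of "yes" calls. Long-read platforms instead report a probability of modification per base, which is why the caption describes their calls as probabilistic.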