Jinfeng Chen, Jingfei Cheng, Xiufei Chen, Masato Inoue, Yibin Liu, Chun-Xiao Song.
Abstract
Long-read sequencing provides valuable information on difficult-to-map genomic regions and can complement short-read sequencing to improve genome assembly, yet few methods are available to accurately detect DNA methylation over long distances at a whole-genome scale. By combining our recently developed TET-assisted pyridine borane sequencing (TAPS) method, which enables direct detection of 5-methylcytosine and 5-hydroxymethylcytosine, with PacBio single-molecule real-time sequencing, we present here whole-genome long-read TAPS (wglrTAPS). To evaluate the performance of wglrTAPS, we applied it to mouse embryonic stem cells as a proof of concept and achieved an N50 read length of 3.5 kb. By sequencing wglrTAPS to 8.2× depth, we discovered a significant proportion of CpG sites that were not covered in previous 27.5× short-read TAPS. Our results demonstrate that wglrTAPS facilitates methylation profiling in problematic genomic regions with repetitive elements or structural variations, and also in an allelic manner, all of which are extremely difficult for short-read sequencing methods to resolve. This method therefore enhances applications of third-generation sequencing technologies for DNA epigenetics.
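The N50 read length quoted in the abstract is a standard summary statistic: the length L such that reads of length L or longer together contain at least half of all sequenced bases. The following is a minimal sketch of that calculation, not the authors' pipeline; the function name `n50` and the read lengths are hypothetical and chosen only for illustration.

```python
def n50(lengths):
    """Return the N50 of a collection of read lengths.

    N50 is the length L such that reads of length >= L together
    contain at least half of the total number of bases.
    """
    if not lengths:
        raise ValueError("no read lengths given")
    ordered = sorted(lengths, reverse=True)  # longest reads first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical read lengths in bases (not data from the study).
read_lengths = [5000, 4000, 3500, 3000, 2000, 1000]
print(n50(read_lengths))  # 3500
```

Because N50 weights reads by the bases they contribute, it is usually larger than the median read length when the length distribution has a long tail.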
Year: 2022 PMID: 35849350 PMCID: PMC9561279 DOI: 10.1093/nar/gkac612
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 19.160
Figure 1. Development of wglrTAPS. (A) Schematic representation of wglrTAPS. (B) Sequence length distribution of wglrTAPS HiFi reads. (C) Conversion rate of wglrTAPS at methylated CpG sites and false-positive rate of wglrTAPS at non-methylated CpG sites from a CmCGG-methylated 4-kb spike-in. (D) Fraction of mapped reads with ≥Q20 in wglrTAPS.
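The two spike-in metrics in Figure 1C can be illustrated with a short sketch. In TAPS, modified cytosines are read out as converted bases, so the conversion rate is the fraction of reads showing conversion at known methylated CpG sites, and the false-positive rate is the same fraction at known non-methylated CpG sites. The function name `spike_in_rates`, the input layout, and the counts below are assumptions made for illustration; they are not taken from the paper.

```python
def spike_in_rates(sites):
    """Compute (conversion_rate, false_positive_rate) from spike-in counts.

    `sites` is an iterable of (is_methylated, converted_reads, total_reads)
    tuples, one per CpG site in the spike-in control.
    """
    conv_meth = total_meth = 0      # counts at known methylated sites
    conv_unmeth = total_unmeth = 0  # counts at known non-methylated sites
    for is_methylated, converted, total in sites:
        if is_methylated:
            conv_meth += converted
            total_meth += total
        else:
            conv_unmeth += converted
            total_unmeth += total
    if total_meth == 0 or total_unmeth == 0:
        raise ValueError("need coverage at both methylated and non-methylated sites")
    return conv_meth / total_meth, conv_unmeth / total_unmeth


# Hypothetical per-site counts (not data from the study).
example_sites = [
    (True, 96, 100),
    (True, 48, 50),
    (False, 1, 200),
    (False, 0, 300),
]
conversion_rate, false_positive_rate = spike_in_rates(example_sites)
print(f"conversion rate: {conversion_rate:.3f}")
print(f"false-positive rate: {false_positive_rate:.3f}")
```

Pooling read counts across sites, as done here, weights each site by its coverage; averaging per-site fractions instead would be an equally reasonable choice and can give slightly different values.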
Figure 2. Comparison of covered CpG sites between short-read TAPS and wglrTAPS. (A) Venn diagram showing the number of CpG sites covered in short-read TAPS and wglrTAPS; only CpG sites covered by at least five reads were counted. (B) Bar plot showing the fraction of CpG sites overlapping with repeat regions in the mouse genome. (C) An example of a repeat region covered only in wglrTAPS, visualized using IGV in bisulfite mode with the CG option. Blue denotes converted cytosine and red denotes unconverted cytosine. The top panel shows alignments from wglrTAPS; the bottom panel shows alignments from short-read TAPS. The repeat region covered only in wglrTAPS is highlighted by the green box.
Figure 3. Insertion detection using short-read TAPS and wglrTAPS. (A) Venn diagram showing the number of insertions detected in wglrTAPS alone, in both wglrTAPS and short-read TAPS, or in short-read TAPS alone. (B) Example of an insertion detected only in wglrTAPS, visualized using IGV in bisulfite mode with the CG option. Blue denotes converted cytosine and red denotes unconverted cytosine. The insertion is shown in the box. (C) The upper panel shows a region covered in wglrTAPS but not in short-read TAPS. The lower panel shows short-read TAPS reads re-aligned to the reference corrected by wglrTAPS. The tracks were visualized using IGV in bisulfite mode with the CG option. Blue denotes converted cytosine and red denotes unconverted cytosine.
Figure 4. Detecting allele-specific methylation using wglrTAPS. IGV snapshot showing wglrTAPS reads aligned to two imprinted genes. The heterozygous SNP (A to G) and the deletion are shown in the box. The tracks were visualized using IGV in bisulfite mode with the CG option. Blue denotes converted cytosine and red denotes unconverted cytosine.