Yew Kok Lee1, Shengnan Jin1, Shiwei Duan1, Yen Ching Lim1, Desmond Py Ng1, Xueqin Michelle Lin1, George Sh Yeo2, Chunming Ding1.
Abstract
BACKGROUND: DNA methylation plays crucial roles in epigenetic gene regulation in normal development and disease pathogenesis. Efficient and accurate quantification of DNA methylation at single base resolution can greatly advance the knowledge of disease mechanisms and be used to identify potential biomarkers. We developed an improved pipeline based on reduced representation bisulfite sequencing (RRBS) for cost-effective genome-wide quantification of DNA methylation at single base resolution. A selection of two restriction enzymes (TaqαI and MspI) enables a more unbiased coverage of genomic regions of different CpG densities. We further developed a highly automated software package to analyze bisulfite sequencing results from the Solexa GAIIx system.
Year: 2014 PMID: 24406024 PMCID: PMC3895702 DOI: 10.1186/1480-9222-16-1
Source DB: PubMed Journal: Biol Proced Online ISSN: 1480-9222 Impact factor: 3.244
Figure 1. Key laboratory steps in RRBS. DNA isolated from the samples is digested with two restriction enzymes (TaqαI and MspI). The fragments are then end-repaired and ligated to adapters. The ligated DNA is size-selected and repetitive sequences are removed. PCR amplification is performed after bisulfite conversion. The Illumina GAIIx system is used for high-throughput sequencing. Newly added bases (marked in bold in step 2) are always unmethylated and are therefore discarded in the downstream analysis of CpG methylation.
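The digestion and size-selection steps can be illustrated in silico. The following is a minimal sketch, not the authors' software: it assumes the standard recognition sites C^CGG for MspI and T^CGA for TaqαI, considers only the top strand, and uses hypothetical names (`digest`, `min_len`, `max_len`). The size-selection window is left as a parameter because the range used in the protocol is not stated here.

```python
import re

# Recognition sites; both enzymes cut after the first base (C^CGG, T^CGA).
SITES = {"MspI": "CCGG", "TaqαI": "TCGA"}
CUT_OFFSET = 1


def cut_positions(seq, enzymes):
    """Return sorted top-strand cut coordinates for the given enzymes."""
    cuts = set()
    for name in enzymes:
        # A lookahead pattern finds every site, including overlapping ones.
        for match in re.finditer("(?=" + SITES[name] + ")", seq):
            cuts.add(match.start() + CUT_OFFSET)
    return sorted(cuts)


def digest(seq, enzymes, min_len=0, max_len=None):
    """Cut seq at every site and keep fragments inside the size window."""
    seq = seq.upper()
    bounds = [0] + cut_positions(seq, enzymes) + [len(seq)]
    fragments = [seq[a:b] for a, b in zip(bounds, bounds[1:])]
    return [
        f for f in fragments
        if len(f) >= min_len and (max_len is None or len(f) <= max_len)
    ]


toy = "AAACCGGTTTTCGAAACCGGAA"  # made-up sequence, for illustration only
single = digest(toy, ["MspI"])
double = digest(toy, ["MspI", "TaqαI"])
print(single)
print(double)
```

On this toy sequence the TaqαI site splits the middle MspI fragment in two, which is the mechanism by which a second enzyme can bring additional fragments into the selected size range.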
Comparison of digestions (double vs. single digestion)
| Covered feature | Single digestion | Double digestion | Increase |
| --- | --- | --- | --- |
| CpGs in CGI regions | 1,098,462 | 1,180,058 | 7.4% |
| CpGs in non-CGI regions | 1,919,174 | 2,720,858 | 41.8% |
| CGIs* | 20,227 | 21,511 | 6.3% |
| Promoters* (−1000 to +500 bp) | 24,520 | 27,633 | 12.7% |
*A genomic region is counted as covered if at least 3 CpGs within it are covered.
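The percentages in the last column are consistent with a relative increase from the first count to the second. The short check below assumes that the first numeric column is the single digestion and the second the double digestion; the column order is inferred from the table title and the arithmetic, and the function name is hypothetical.

```python
def percent_increase(single, double):
    """Relative gain of the double digestion over the single digestion, in %."""
    return 100.0 * (double - single) / single


rows = {
    "CpGs in CGI regions": (1_098_462, 1_180_058),
    "CpGs in non-CGI regions": (1_919_174, 2_720_858),
    "CGIs": (20_227, 21_511),
    "Promoters": (24_520, 27_633),
}

for name, (single, double) in rows.items():
    print(f"{name}: {percent_increase(single, double):.1f}%")
```

The gain is largest for CpGs outside CGIs (41.8% versus 7.4% inside CGIs), in line with the abstract's statement that the two-enzyme design gives less biased coverage across CpG densities.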
Figure 2. Coverage of CpGs and genomic regions by RRBS. (A) An example of the number of CpG sites at different minimum sequencing depths; (B) Distribution of CpGs in CGIs/CGSs/Others, using a sequencing depth ≥ 10 as the cutoff; (C) Distribution of CpGs in Promoter/TTR/Intragenic/Intergenic regions; (D) Distribution of genomic regions in CGIs/CGSs/Others; (E) Distribution of genomic regions in Promoter/TTR/Intragenic/Intergenic regions. A genomic region was considered covered if at least three CpGs within the region were sequenced at a depth ≥ 10.
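The coverage rule in this caption (at least three CpGs in the region, each sequenced at depth ≥ 10) can be written down directly. This is a minimal sketch for a single chromosome; the data layout (a dict mapping CpG position to depth), the half-open region convention, and the function names are assumptions made for illustration.

```python
from bisect import bisect_left


def covered_cpgs(depth_by_pos, min_depth=10):
    """Sorted positions of CpGs whose sequencing depth meets the cutoff."""
    return sorted(p for p, d in depth_by_pos.items() if d >= min_depth)


def is_region_covered(sorted_positions, start, end, min_cpgs=3):
    """True if [start, end) contains at least min_cpgs covered CpGs."""
    n_inside = bisect_left(sorted_positions, end) - bisect_left(sorted_positions, start)
    return n_inside >= min_cpgs


# Made-up depths for five CpGs on one chromosome.
depths = {100: 12, 150: 9, 180: 10, 220: 30, 400: 50}
positions = covered_cpgs(depths)

print(positions)
print(is_region_covered(positions, 90, 250))
print(is_region_covered(positions, 200, 500))
```

Sorting the covered positions once and using binary search keeps each region query logarithmic, which matters when tens of thousands of CGIs and promoters are tested against millions of CpGs.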
Genome-wide coverage by RRBS
| Genomic feature | Total in genome | Covered by RRBS | Coverage |
| --- | --- | --- | --- |
| CpGs | 56,434,896 | 1,837,502 | 3.3% |
| CGIs | 27,718 | 21,252 | 76.7% |
| CGSs (±2 KB from CGI) | 49,300 | 27,074 | 54.9% |
| Promoters (−1000 to +500 bp) | 44,399 | 23,168 | 52.2% |
Figure 3. Alignment of sequencing reads to two reference genomes, the C2T and G2A reference genomes. Because of the pointer size limitation of the Bowtie program, read alignment must be performed against the two reference genomes separately. The results are then cross-checked and compared to remove reads that are not uniquely aligned. In the end, sets A and C are read pairs aligned to the C2T reference genome, while sets B and D are read pairs aligned to the G2A reference genome.
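The idea behind the two converted references can be shown on a toy sequence. The sketch below assumes that "C2T" means every C in the reference is replaced by T and "G2A" means every G is replaced by A, and that reads are converted the same way before matching, as is common in bisulfite aligners; whether this pipeline converts the reads is not stated here. All function names are hypothetical, and a real pipeline would delegate the matching to an aligner such as Bowtie.

```python
def c2t(seq):
    """In silico C-to-T conversion."""
    return seq.upper().replace("C", "T")


def g2a(seq):
    """In silico G-to-A conversion."""
    return seq.upper().replace("G", "A")


def call_methylation(ref, read):
    """Compare an aligned read with the unconverted reference.

    At each reference C, a C in the read means the base was protected
    from conversion (methylated, 'M'); a T means it was converted
    (unmethylated, 'U').
    """
    calls = []
    for pos, (r, b) in enumerate(zip(ref.upper(), read.upper())):
        if r == "C" and b in "CT":
            calls.append((pos, "M" if b == "C" else "U"))
    return calls


ref = "ACGTCCGA"   # toy reference, top strand
read = "ACGTTTGA"  # toy read: first CpG methylated, other Cs converted

# After conversion, the read matches the C2T reference regardless of
# its methylation state, so alignment is not biased by methylation.
print(c2t(read) == c2t(ref))
print(call_methylation(ref, read))
print(g2a(ref))
```

Methylation is called only after alignment, by going back to the unconverted reference and the original read, so the conversion step affects where a read maps but not the methylation value recorded for it.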