Patrick Boyle, Kendell Clement, Hongcang Gu, Zachary D Smith, Michael Ziller, Jennifer L Fostel, Laurie Holmes, Jim Meldrim, Fontina Kelley, Andreas Gnirke, Alexander Meissner.
Abstract
Sequencing-based approaches have led to new insights about DNA methylation. While many different techniques for genome-scale mapping of DNA methylation have been employed, throughput has been a key limitation for most. To further facilitate the mapping of DNA methylation, we describe a protocol for gel-free multiplexed reduced representation bisulfite sequencing (mRRBS) that reduces the workload dramatically and enables processing of 96 or more samples per week. mRRBS achieves similar CpG coverage to the original RRBS protocol, while the higher throughput and lower cost make it better suited for large-scale DNA methylation mapping studies, including cohorts of cancer samples.
Year: 2012 PMID: 23034176 PMCID: PMC3491420 DOI: 10.1186/gb-2012-13-10-r92
Source DB: PubMed Journal: Genome Biol ISSN: 1474-7596 Impact factor: 13.583
Figure 1. Flowchart comparing RRBS and mRRBS steps. Each step that can be completed in a standard workday is shown. Orange boxes highlight phenol:chloroform clean-up and preparative agarose gel purification steps that were omitted in the new mRRBS protocol. Purple boxes highlight key new steps specific to mRRBS. Each box also shows the approximate amount of hands-on time required per step. QC, quality control.
Table 1. Summary of mRRBS performance
| Description | Total reads | Informative reads | Bisulfite conversion | 1× coverage CpG count | 5× coverage CpG count | 10× coverage CpG count |
|---|---|---|---|---|---|---|
| 96 samples | 11,295,879 | 8,921,543 | 99% | 2,523,793 | 1,399,192 | 563,980 |
| 84 HQ samples | 12,151,833 | 9,629,839 | 99% | 2,583,636 | 1,510,414 | 645,828 |
The first row summarizes statistics for all 96 libraries generated using mRRBS, and the second row includes only those high-quality (HQ) samples with greater than 5 million reads per sample (see Additional file 2 for per-sample details). The total reads column gives the median number of sequencing reads produced for each library. The informative reads column gives the median number of those reads that passed sequencer quality controls and were aligned to the reference genome. The estimated bisulfite conversion rate is based on all methylated cytosines in a non-CpG context [6]. The median numbers of CpGs covered with at least 1×, 5×, and 10× coverage are shown.
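The conversion-rate estimate described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes that non-CpG cytosines are essentially unmethylated, so any non-CpG cytosine still read as C counts as a conversion failure. The function name and the example counts are hypothetical.

```python
def bisulfite_conversion_rate(non_cpg_unconverted: int, non_cpg_total: int) -> float:
    """Estimate the conversion rate from cytosines in a non-CpG context.

    non_cpg_unconverted: non-CpG cytosine calls still read as C (apparently methylated)
    non_cpg_total:       all non-CpG cytosine calls (read as C or T)
    """
    if non_cpg_total <= 0:
        raise ValueError("need at least one non-CpG cytosine observation")
    if not 0 <= non_cpg_unconverted <= non_cpg_total:
        raise ValueError("unconverted count must lie between 0 and the total")
    # Assumption: every apparently methylated non-CpG cytosine is a conversion failure.
    return 1.0 - non_cpg_unconverted / non_cpg_total


# Made-up counts chosen only to illustrate a rate near the 99% reported in the table.
rate = bisulfite_conversion_rate(non_cpg_unconverted=1_000, non_cpg_total=100_000)
print(f"{rate:.1%}")
```

Because genuine non-CpG methylation exists in some cell types, an estimate of this kind is a lower bound on the true conversion rate.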
Figure 2. Performance summary of mRRBS. Ninety-six samples were processed using mRRBS and sequenced with eight lanes of Illumina HiSeq 2000 using 12 barcoded adapters per lane. (a) The total number of reads for each sample is shown; the 84 samples with >5 million total reads were included in the subsequent comparisons. (b) Quartile plots of summary coverage depth from these samples. The minimum and maximum values are bounded by the light blue area in (b-d), while the darker blue area represents the interquartile range. The dark blue line indicates the median. (c,d) MspI in silico digestion of the hg19 genome produced a total of 1,124,739 fragments. (c) The percentage of fragments of each fragment size that were covered by at least one read. (d) The average coverage depth for fragments of each length. Genomic MspI-digested fragments longer than 300 bp were not included in the sequence alignment target, which partly contributes to the sharp drop in coverage at 300 bp in (c,d).
Table 2. Summary for 12 RRBS and 12 mRRBS libraries
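An in silico MspI digestion of the kind mentioned in the caption can be sketched as below. MspI recognizes CCGG and cuts after the first C (C^CGG); the function names are hypothetical, the toy sequence is made up, and the treatment of sequence ends is an assumption rather than a description of how the hg19 fragment count was obtained.

```python
import re
from typing import List

MSPI_SITE = "CCGG"  # MspI recognition site, cut as C^CGG
CUT_OFFSET = 1      # the cut falls after the first base of the site


def mspi_fragments(seq: str) -> List[str]:
    """Split a sequence at every MspI cut position."""
    seq = seq.upper()
    cuts = [m.start() + CUT_OFFSET for m in re.finditer(MSPI_SITE, seq)]
    bounds = [0] + cuts + [len(seq)]
    # Assumption: the two terminal pieces are kept even though only one
    # of their ends is a real MspI cut.
    return [seq[a:b] for a, b in zip(bounds, bounds[1:]) if b > a]


def alignment_target(fragments: List[str], max_len: int = 300) -> List[str]:
    """Keep fragments no longer than max_len (300 bp in the caption)."""
    return [f for f in fragments if len(f) <= max_len]


toy = "AACCGGTTTTCCGGAA"  # made-up sequence with two MspI sites
print(mspi_fragments(toy))
```

On a real genome the same function would be applied per chromosome, and fragment lengths would then be tallied to reproduce a size distribution like the one in panels (c,d).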
| Description | Total reads | Informative reads | Bisulfite conversion | 1× coverage CpG count | 5× coverage CpG count | 10× coverage CpG count |
|---|---|---|---|---|---|---|
| 12 RRBS samples | 18,066,460 | 12,482,608 | 99% | 1,851,441 | 1,312,909 | 831,581 |
| 12 mRRBS samples | 12,523,362 | 10,000,051 | 99% | 2,631,436 | 1,617,861 | 704,994 |
The same statistics reported in Table 1 are shown here for 12 RRBS and 12 mRRBS samples that were used for the coverage comparison in Figure 3.
Figure 3. Comparison of CpG measurements in RRBS (top) and mRRBS (bottom) across five genomic features. Pie charts compare the relative CpG coverage for different genomic features as sampled by the original RRBS and mRRBS protocols. Twelve representative samples with 10 to 20 million reads and more than 10 million mapped reads were selected from each method (Table 2; Additional file 2). To be scored at a given coverage, a unique CpG measurement residing within a given feature must be observed in at least 80% of the samples used. Promoters are defined as 1 kb upstream and 1 kb downstream of the transcription start site of Ensembl genes. CgiHunter was used to computationally derive CpG islands with a minimum CpG observed versus expected ratio of 0.6, a minimum GC content of 0.5 and a minimum length of 700 bp. CpG island shores are defined as the 2 kb regions adjacent to the derived CpG islands. Previously published H3K4me2 peaks across multiple human cells were used to derive a consensus enhancer set [20]. As a more global measurement, the genome was divided into non-overlapping consecutive 5 kb tiles, and the number of CpG measurements in each tile was analyzed.
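The three CpG island thresholds quoted in the caption can be checked for a candidate region as sketched below. This is not CgiHunter itself: the observed/expected ratio uses the common definition (number of CpGs × length) / (number of C × number of G), which is assumed here to match what the tool computes, and the function names are hypothetical.

```python
from typing import Tuple


def cpg_stats(seq: str) -> Tuple[int, float, float]:
    """Return (length, GC content, CpG observed/expected ratio) for a sequence."""
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        return 0, 0.0, 0.0
    c, g = seq.count("C"), seq.count("G")
    observed = seq.count("CG")   # CpG dinucleotides on the given strand
    expected = c * g / n         # expectation if C and G were placed independently
    gc_content = (c + g) / n
    obs_exp = observed / expected if expected > 0 else 0.0
    return n, gc_content, obs_exp


def passes_island_criteria(seq: str, min_oe: float = 0.6,
                           min_gc: float = 0.5, min_len: int = 700) -> bool:
    """Apply the thresholds from the caption to one candidate region."""
    n, gc_content, obs_exp = cpg_stats(seq)
    return n >= min_len and gc_content >= min_gc and obs_exp >= min_oe


print(passes_island_criteria("CG" * 350))  # 700 bp of pure CpG repeats
print(passes_island_criteria("AT" * 350))  # 700 bp with no C or G
```

A genome-wide island caller also needs a strategy for scanning and merging windows, which this per-region check leaves out.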
Figure 4. Single-base resolution view across the PAX9 locus. DNA methylation values of 44 individual CpGs that are captured at greater than 5× coverage within at least 80% of our 84 high-quality samples are shown for the region 3 kb upstream and 2 kb downstream of the PAX9 transcription start site. The 279 genomic CpGs within this region are marked in black and those captured by the Illumina Infinium HumanMethylation450 BeadChip Kit are shown in red. The regional average of these 44 CpGs is shown to the left of the individual CpG measurements for each sample.
Table 3. Cost comparison of RRBS and mRRBS
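The selection rule used for this figure (a minimum depth in at least 80% of samples) can be sketched as a simple filter. The data layout, function name and CpG labels below are hypothetical, and the depth comparison is written as inclusive (`>=`), which is an assumption: the caption says "greater than 5×" while the tables say "at least" 5×.

```python
from typing import Dict, List


def cpgs_passing(coverage: Dict[str, List[int]],
                 min_depth: int = 5,
                 min_fraction: float = 0.8) -> List[str]:
    """Return CpGs covered at min_depth or more in at least min_fraction of samples.

    coverage maps a CpG identifier to its read depth in each sample.
    """
    kept = []
    for cpg, depths in coverage.items():
        if not depths:
            continue  # no samples recorded for this CpG
        n_ok = sum(d >= min_depth for d in depths)
        if n_ok / len(depths) >= min_fraction:
            kept.append(cpg)
    return kept


# Toy input with five samples; identifiers and depths are invented.
toy = {
    "cpg_A": [10, 7, 5, 6, 0],  # 4 of 5 samples reach depth 5
    "cpg_B": [10, 2, 1, 6, 0],  # 2 of 5 samples reach depth 5
}
print(cpgs_passing(toy))
```

With 84 samples, the 80% threshold corresponds to at least 68 samples reaching the required depth.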
| | | mRRBS | RRBS |
|---|---|---|---|
| Enzymes | Total (96 samples) enzymes | $665.99 | $998.69 |
| | Per sample | $6.94 | $10.40 |
| Other supplies and sequencing^a | Total (96 samples) other supplies and sequencing | $16,770.00 | $15,360.00 |
| | Per sample | $174.69 | $160.00 |
| With salary^b | Total (96 samples) supplies + salary | $18,820.60 | $37,254.08 |
| | Total per sample cost | $196.05 | $388.06 |
^a Sequencing costs are based on the current list price for HiSeq 40 bp indexed single read at the Whitehead Institute Core Facility. ^b Using the previous RRBS method, an estimated 72 days are required to complete library preparation for 96 samples, whereas only 6 days are required using the mRRBS method. Salary costs are calculated using these labor estimates with a $60,000 annual research associate salary. All values are in US dollars.
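The per-sample figures in the cost table follow from dividing each 96-sample total by 96, as the short check below illustrates. The salary component is not listed separately in the table; the sketch derives it by subtracting the two supply totals from the with-salary total, which is an inference from the table rather than a figure stated by the authors. Names in the code are hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP

N_SAMPLES = 96
CENT = Decimal("0.01")

# Totals for 96 samples, copied from the cost table (US dollars).
TOTALS = {
    "mRRBS": {"enzymes": Decimal("665.99"), "other": Decimal("16770.00"),
              "with_salary": Decimal("18820.60")},
    "RRBS": {"enzymes": Decimal("998.69"), "other": Decimal("15360.00"),
             "with_salary": Decimal("37254.08")},
}


def per_sample(total: Decimal) -> Decimal:
    """Divide a 96-sample total by the sample count, rounded to the cent."""
    return (total / N_SAMPLES).quantize(CENT, rounding=ROUND_HALF_UP)


def implied_salary(entry: dict) -> Decimal:
    """Salary share inferred as with-salary total minus both supply totals."""
    return entry["with_salary"] - entry["enzymes"] - entry["other"]


for method, entry in TOTALS.items():
    print(method,
          per_sample(entry["enzymes"]),
          per_sample(entry["other"]),
          per_sample(entry["with_salary"]),
          implied_salary(entry))
```

`Decimal` is used instead of floats so that the rounding to cents is exact and explicit.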