Johannes Dapprich, Deborah Ferriola, Kate Mackiewicz, Peter M Clark, Eric Rappaport, Monica D'Arcy, Ariella Sasson, Xiaowu Gai, Jonathan Schug, Klaus H Kaestner, Dimitri Monos.
Abstract
BACKGROUND: The ability to capture and sequence large contiguous DNA fragments represents a significant advance towards the comprehensive characterization of complex genomic regions. While emerging sequencing platforms can produce reads several kilobases long, the fragment sizes generated by current DNA target enrichment technologies remain a limiting factor, with fragments generally shorter than 1 kbp. The DNA enrichment methodology described herein, Region-Specific Extraction (RSE), produces DNA segments in excess of 20 kbp in length. Coupling this enrichment method to appropriate sequencing platforms will significantly enhance the ability to generate complete and accurate sequence characterization of any genomic region without the need for reference-based assembly.
Keywords: DNA sequencing; DNA target capture; Genomic resequencing; MHC haplotype; Targeted enrichment
Year: 2016 PMID: 27393338 PMCID: PMC4938946 DOI: 10.1186/s12864-016-2836-6
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Fig. 1 Principle of RSE. a During the first step of RSE, the genomic template DNA (light blue) is briefly denatured to allow capture primers (red) to hybridize. b The bound primers are enzymatically extended with biotinylated nucleotides. The extended portions of the primers, shown in green, form the “handle” to which streptavidin-coated magnetic beads bind. During this process many biotins of the same primer/target DNA complex are bound to streptavidin binding sites on the same bead, thereby forming a topological linkage that firmly locks even very long DNA segments extending in both directions from the capture point onto the surface of the magnetic bead. The primer/target DNA complex is then magnetically purified and released from the bead surface by heat. (The drawing is not to scale: the magnetic beads are approximately an order of magnitude larger than illustrated here.)
Fig. 2 Effects of RSE capture primer spacing on target enrichment. a Schematic representation of the distribution of captured genomic DNA copy number around the primer hybridization site (red triangle), as measured by qPCR assays placed at increasing distances from the primer hybridization site (black inverted triangles). Gray bars indicate captured random DNA fragments. b qPCR results for RSE-extracted material at seven non-contiguous genomic regions, plotted as the copy number ratio of targeted sites (diamonds) to a common non-targeted region (beta actin). The amount of targeted vs. off-target material decreases within about 10 kbp of the RSE extraction site.
Fig. 3 Effects of RSE capture primer spacing on capture effectiveness. a 46 RSE primers were designed to capture ≈ 700 kbp of genomic sequence for four gene regions. b To examine the effect of RSE primer spacing on capture efficiency, we assumed that the midpoint between the RSE primers would produce the least amount of signal on the array. Each midpoint in the bins shown above was averaged across 20 array primers to account for array probe capture variability. The distance between RSE primers and the averaged array value is presented. c The distances between RSE primers were segregated into bins to show the collective effect of similar RSE primer spacing. As seen in the graph, capture of the material as used here at the midpoint between primers drops rapidly beyond ≈ 15 kbp, with little to no capture evident at 25 kbp or greater for the type of genomic DNA used in this study (average length ≈ 20 kbp).
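The spacing observation in Fig. 3 can be turned into a simple design check. The following is a minimal sketch, not the authors' procedure: the function name `flag_wide_gaps`, the default 15 kbp threshold (taken from the caption's "drops rapidly beyond ≈ 15 kbp") and the primer coordinates are all illustrative assumptions.

```python
from typing import List, Tuple


def flag_wide_gaps(primer_positions: List[int],
                   max_gap_bp: int = 15_000) -> List[Tuple[int, int, int]]:
    """Return (left, right, gap) for adjacent primers spaced wider than max_gap_bp.

    The 15 kbp default is an assumption based on the Fig. 3 caption and
    would depend on the length of the input genomic DNA.
    """
    ordered = sorted(primer_positions)
    wide = []
    for left, right in zip(ordered, ordered[1:]):
        gap = right - left
        if gap > max_gap_bp:
            wide.append((left, right, gap))
    return wide


# Hypothetical primer coordinates in bp (not taken from the paper).
example_primers = [1_000, 9_000, 17_500, 45_000, 52_000]
wide_gaps = flag_wide_gaps(example_primers)
for left, right, gap in wide_gaps:
    print(f"gap of {gap} bp between primers at {left} and {right}")
```

Any interval reported by such a check is a region where midpoint capture would be expected to be weak for DNA of the length used in this study.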
Sequencing results
| Metric | Value | Metric | Value | Metric | Value |
|---|---|---|---|---|---|
| Targeted Region (bp) | 4,000,002 | Targeted Bases Called | 3,997,493 | Depth >1 | 99.937 % |
| Unique Bases (bp) | 1,895,669 | Unique Bases Called | 1,891,678 | Depth >1 | 99.789 % |
| % of Repeat Sequences | 52.68 | | | | |
| % of Unique Sequences | 47.39 | | | | |
| Total # of Reads Mapped to Whole Genome | 67,257,141 | | | | |
| Total # of Mapped Reads to Targeted Region | 6,951,692 | | | | |
| Average Depth of Coverage for Entire Genome (Non-Targeted) | 2 | | | | |
| Average Depth of Coverage for Entire Targeted Region | 164 | | | | |
| Average Depth of Coverage for Unique Sequence in Targeted Region | 173 | | | | |
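The percentages in the sequencing results table follow directly from the raw counts. The short sketch below recomputes them from the table values; the variable names are illustrative, and treating the ratio of targeted to non-targeted depth as a rough enrichment proxy is an assumption of this example rather than a metric stated in the source.

```python
# Counts copied from the "Sequencing results" table.
targeted_region_bp = 4_000_002
targeted_bases_called = 3_997_493
unique_bases_bp = 1_895_669
unique_bases_called = 1_891_678
reads_mapped_genome = 67_257_141
reads_mapped_target = 6_951_692
depth_genome = 2      # average depth, non-targeted genome
depth_target = 164    # average depth, entire targeted region


def percent(part: float, whole: float) -> float:
    """Express part as a percentage of whole."""
    return 100.0 * part / whole


targeted_called_pct = percent(targeted_bases_called, targeted_region_bp)
unique_called_pct = percent(unique_bases_called, unique_bases_bp)
on_target_read_pct = percent(reads_mapped_target, reads_mapped_genome)

# Assumption: depth ratio used here only as a rough enrichment proxy.
depth_enrichment = depth_target / depth_genome

print(f"targeted bases called: {targeted_called_pct:.3f} %")
print(f"unique bases called:   {unique_called_pct:.3f} %")
print(f"reads on target:       {on_target_read_pct:.1f} %")
print(f"depth ratio (target / genome): {depth_enrichment:.0f}x")
```

The first two values reproduce the table's 99.937 % and 99.789 %; roughly one in ten mapped reads falls in the 4 Mb target.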
Fig. 4 Sequencing depth of coverage of the enriched MHC. The RSE enrichment process results in clinical sequencing depth (>30×) for ≈ 97 % of all enriched bases, with >90 % coverage at 50× or greater.
Fig. 5 Sequencing depth of coverage map for RSE-extracted MHC region. a MHC sequencing coverage is displayed for the entire enriched 4 Mb of the PGF MHC region along with 300 kbp of non-targeted sequence on either side. Each qPCR probe assay is marked by a numbered arrow. b 50 kbp regions around each of three qPCR assays are shown to demonstrate differing levels of coverage. RSE capture primer positions are marked with a green marker. The red circle shows the approximate depth of coverage at the qPCR probe position. While regions 2 and 5 differ in average depth of coverage, the qPCR results at the site of capture are very similar (930 vs 1010 copies/μl), which suggests similar amounts of enrichment; this is consistent with the sequencing depth of coverage at those sites (130 vs 95). Region 3 shows enhanced depth of coverage, suggesting higher enrichment, which is supported by the higher qPCR result (2569 copies/μl). The depth of coverage results correlate well with the qPCR copy number estimates of the extracted material: higher enrichment corresponds to higher depth of coverage.
qPCR correlation to sequencing coverage
| qPCR Probe Position within MHC | 30362055 | 31417450 | 31682240 | 32016911 | 32935499 | Corr. Coef. 1&2 | Corr. Coef. 2&3 |
|---|---|---|---|---|---|---|---|
| (1) non-Amped | 1565 | 930 | 2569 | 1227 | 1010 | 0.94 | |
| (2) WGA | 4,201,954 | 3,312,705 | 12,750,000 | 5,974,923 | 2,337,060 | | 0.97 |
| (3) Coverage Depth | 166 | 130 | 372 | 253 | 95 | | |
Results in rows (1) and (2) are copies of target per μl of extracted material.
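The two correlation coefficients can be recomputed from the five probe positions. The source does not state which coefficient was used; the sketch below assumes Pearson's r, which matches the tabulated values to two decimal places. The function and variable names are illustrative.

```python
from math import sqrt
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Values copied from the table, in probe-position order.
non_amped = [1565, 930, 2569, 1227, 1010]                    # row (1), copies/ul
wga = [4_201_954, 3_312_705, 12_750_000, 5_974_923, 2_337_060]  # row (2), copies/ul
coverage = [166, 130, 372, 253, 95]                          # row (3), depth

r_1_2 = pearson(non_amped, wga)
r_2_3 = pearson(wga, coverage)
print(f"rows 1&2: {r_1_2:.2f}")
print(f"rows 2&3: {r_2_3:.2f}")
```

With only five data points, these coefficients indicate a trend rather than a tight statistical estimate.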
Fig. 6 Average depth of coverage at the site of capture and midpoint between capture primers. Average depth of coverage was calculated across all bases underlying each RSE capture primer position. Black diamonds represent the average depth of coverage at the RSE primer position while open circles represent the average depth of coverage at the midpoint between adjacent RSE primers. Out of 500 RSE primers, only 7 were at a depth of coverage of <30× at the RSE capture site (≈ 99 % produced 30× coverage or better) while only 16 midpoints between RSE primers were at a depth of <30× (≈ 97 % of the midpoints were 30× and above).
Sanger validation of identified NGS variants
| Variant set | Type of variants | Sanger agrees with NGS | Sanger agrees with reference | Total |
|---|---|---|---|---|
| 61 Sanger Validated Variants (Gene Regions) | Exonic | 4 | 4 | 8 |
| | Intronic | 28 | 22 | 50 |
| | Insertions/deletions | 3 | | 3 |
| 25 Sanger Validated Variants (Intergenic) | Mismatches | 15 | 10 | 25 |
| Total | | 50 | 36 | 86 |
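The totals in the Sanger validation table can be cross-checked, and the share of tested variants where Sanger agreed with the NGS call can be derived per category. This is a minimal sketch using the tabulated counts; the dictionary layout and the treatment of the empty insertions/deletions cell as zero are assumptions of the example.

```python
# (agrees with NGS, agrees with reference) per variant type, from the table.
# Assumption: the empty "agrees with reference" cell for indels is read as 0.
sanger_counts = {
    "Exonic": (4, 4),
    "Intronic": (28, 22),
    "Insertions/deletions": (3, 0),
    "Mismatches (intergenic)": (15, 10),
}

total_ngs = sum(ngs for ngs, _ in sanger_counts.values())
total_ref = sum(ref for _, ref in sanger_counts.values())
total_tested = total_ngs + total_ref

for variant_type, (ngs, ref) in sanger_counts.items():
    share = 100.0 * ngs / (ngs + ref)
    print(f"{variant_type}: {ngs}/{ngs + ref} agree with NGS ({share:.1f} %)")

overall_ngs_pct = 100.0 * total_ngs / total_tested
print(f"Overall: {total_ngs}/{total_tested} ({overall_ngs_pct:.1f} %)")
```

The column sums reproduce the table's bottom row (50, 36 and 86).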