Amelia D Wallace, Thomas A Sasani, Jordan Swanier, Brooke L Gates, Jeff Greenland, Brent S Pedersen, Katherine E Varley, Aaron R Quinlan.
Abstract
A substantial fraction of the human genome is difficult to interrogate with short-read DNA sequencing technologies due to paralogy, complex haplotype structures, or tandem repeats. Long-read sequencing technologies, such as Oxford Nanopore's MinION, enable direct measurement of complex loci without introducing many of the biases inherent to short-read methods, though they suffer from relatively lower throughput. This limitation has motivated recent efforts to develop amplification-free strategies to target and enrich loci of interest for subsequent sequencing with long reads. Here, we present CaBagE, a method for target enrichment that is efficient and useful for sequencing large, structurally complex targets. The CaBagE method leverages the stable binding of Cas9 to its DNA target to protect desired fragments from digestion with exonuclease. Enriched DNA fragments are then sequenced with Oxford Nanopore's MinION long-read sequencing technology. Enrichment with CaBagE resulted in a median of 116X coverage (range 39-416) of target loci when tested on five genomic targets ranging from 4-20 kb in length using healthy donor DNA. Four cancer gene targets were enriched in a single reaction and multiplexed on a single MinION flow cell. We further demonstrate the utility of CaBagE in two ALS patients with C9orf72 short tandem repeat expansions to produce genotype estimates commensurate with genotypes derived from repeat-primed PCR for each individual. With CaBagE there is a physical enrichment of on-target DNA in a given sample prior to sequencing. This feature allows adaptability across sequencing platforms and potential use as an enrichment strategy for applications beyond sequencing. CaBagE is a rapid enrichment method that can illuminate regions of the 'hidden genome' underlying human disease.
Year: 2021 PMID: 33830997 PMCID: PMC8031414 DOI: 10.1371/journal.pone.0241253
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Results from individual CaBagE runs in DNA from healthy donors.
| Run ID | Total Reads | Target(s) per flowcell | Target Length (bp) | On-Target Read Depth | Total Spanning Reads |
|---|---|---|---|---|---|
| | 536,943 | | 4,044 | 416 | 404 |
| | 485,412 | | 4,044 | 179 | 168 |
| | 845,510 | | 17,819 | 91 | 61 |
| | | | 18,189 | 162 | 98 |
| | | | 13,644 | 190 | 136 |
| | | | 24,389 | 116 | 77 |
| | 681,142 | | 17,819 | 39 | 25 |
| | | | 18,189 | 61 | 36 |
| | | | 13,644 | 54 | 39 |
| | | | 24,389 | 63 | 41 |
a: MapQ = 60
b: Reads that span ≥ 90% of the target locus
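The two footnote criteria (mapping quality of 60, and a read covering at least 90% of the target locus) can be illustrated with a minimal sketch. The `Alignment` class, the `spans_target` helper, and all coordinates below are hypothetical and chosen only for illustration; this is not the authors' pipeline, and it assumes half-open coordinates on a single chromosome.

```python
from dataclasses import dataclass


@dataclass
class Alignment:
    """Hypothetical minimal alignment record (half-open coordinates)."""
    start: int
    end: int
    mapq: int


def spans_target(aln, target_start, target_end, min_fraction=0.9, min_mapq=60):
    """Return True if the alignment passes the MapQ filter and
    overlaps at least `min_fraction` of the target interval."""
    if aln.mapq < min_mapq:
        return False
    overlap = min(aln.end, target_end) - max(aln.start, target_start)
    target_len = target_end - target_start
    return overlap >= min_fraction * target_len


# Hypothetical 4,044 bp target and four made-up reads.
target = (1000, 5044)
reads = [
    Alignment(900, 5100, 60),   # covers the whole target
    Alignment(1000, 3000, 60),  # covers less than half of the target
    Alignment(900, 5100, 20),   # full coverage but low mapping quality
    Alignment(1300, 5044, 60),  # covers about 93% of the target
]
spanning = [r for r in reads if spans_target(r, *target)]
print(len(spanning))
```

With real data, the start, end, and mapping quality would come from an alignment file rather than hand-written records.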
Results from CaBagE runs in known carriers of the C9orf72 repeat expansion.
| Coriell ID | RP-PCR CN Estimate | Total Reads | On-Target Read Depth | Total Spanning Reads | Reads spanning expanded repeat | CaBagE CN Estimate |
|---|---|---|---|---|---|---|
| ND11386 | 8/704 | 1,490,712 | 115 | 98 | 21 | 9/749/1,893 |
| ND13803 | 2/EXP | 852,155 | 71 | 66 | 7 | 2/808/1,538 |
a: MapQ = 60
*RP-PCR: repeat-primed PCR; genotypes derived by RP-PCR and agarose gel electrophoresis are from Bram et al. [38]. CN: copy number.
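As a rough aid to reading the copy-number columns, the sketch below converts the length of a repeat tract into a copy-number estimate. It relies on the fact that the C9orf72 repeat unit is the hexanucleotide GGGGCC, which is not stated in this record; the `estimate_copy_number` helper is hypothetical, and the record does not describe how the authors derived their estimates, so this should not be read as their method.

```python
REPEAT_UNIT = "GGGGCC"  # C9orf72 hexanucleotide repeat unit (6 bp)


def estimate_copy_number(repeat_tract_length_bp, unit_length=len(REPEAT_UNIT)):
    """Estimate repeat copy number as tract length divided by unit length,
    rounded to the nearest whole copy."""
    return round(repeat_tract_length_bp / unit_length)


# Illustrative arithmetic: a 4,494 bp tract of 6 bp units is 749 copies.
print(estimate_copy_number(4494))
```

In practice the tract length would have to be measured from a read that spans the whole expansion, and nanopore indel errors would add uncertainty that this sketch ignores.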