Alexis L Norris, Rachael E Workman, Yunfan Fan, James R Eshleman, Winston Timp.
Abstract
Despite advances in sequencing, structural variants (SVs) remain difficult to reliably detect due to the short read length (<300 bp) of 2nd generation sequencing. Not only do the reads (or paired-end reads) need to straddle a breakpoint, but repetitive elements often lead to ambiguities in the alignment of short reads. We propose to use the long reads (up to 20 kb) possible with 3rd generation sequencing, specifically nanopore sequencing on the MinION. Nanopore sequencing relies on a similar concept to a Coulter counter, reading the DNA sequence from the change in electrical current resulting from a DNA strand being forced through a nanometer-sized pore embedded in a membrane. Though nanopore sequencing currently has a relatively high mismatch rate that precludes base substitution and small frameshift mutation detection, its accuracy is sufficient for SV detection because of its long reads. In fact, long reads in some cases may improve SV detection efficiency. We have tested nanopore sequencing to detect a series of well-characterized SVs, including large deletions, inversions, and translocations that inactivate the CDKN2A/p16 and SMAD4/DPC4 tumor suppressor genes in pancreatic cancer. Using PCR amplicon mixes, we have demonstrated that nanopore sequencing can detect large deletions, translocations and inversions at dilutions as low as 1:100, with as few as 500 reads per sample. Given the speed, small footprint, and low capital cost, nanopore sequencing could become the ideal tool for the low-level detection of cancer-associated SVs needed for molecular relapse, early detection, or therapeutic monitoring.
Keywords: 3rd generation sequencing; DNA sequencing; Deletions; cancer diagnostics; inversions; nanopore sequencing; next generation sequencing; structural variation; translocations; tumor suppressor gene
Year: 2016 PMID: 26787508 PMCID: PMC4848001 DOI: 10.1080/15384047.2016.1139236
Source DB: PubMed Journal: Cancer Biol Ther ISSN: 1538-4047 Impact factor: 4.742
Table 1. Details of amplicons included in this study.
| Amplicon ID | Amplicon Size Without Barcodes (bp) | TSG Deleted | SV Type | SV Left Breakpoint (hg19) | SV Right Breakpoint (hg19) | Expected alignment: Left (hg19) | Expected alignment: Center (hg19) | Expected alignment: Right (hg19) |
|---|---|---|---|---|---|---|---|---|
| SV01, SV07 | 573 | | TRANS | chr9:24,353,014 | chr22:36,338,191 | chr9:24352894-24353014(+) | | chr22:36338191-36338601(+) |
| SV02 | 579 | | (WT) | chr9:21,970,115 | chr9:21,970,649 | chr9:21970115-21970649(−) | | |
| SV03 | 562 | | INV+TRANS | chr9:21,083,362 | chr9:21,083,521 | chr9:21083139-21083362(+) | chr9:21083440-21083521(−) (81 bp) | chr3:79387683-79387899(−) |
| SV04 | 573 | | TRANS | chr10:132,412,941 | chr9:27,096,867 | chr10:132412940-132413131(−) | | chr9:27096866-27097203(+) |
| SV05 | 576 | | ID | chr18:48,570,319 | chr18:49,191,882 | chr18:48569959-48570319(+) | | chr18:49191882-49192052(+) |
| SV06 | 561 | | INV | chr9:24,320,470 | chr9:24,323,843 | chr9:24323843-24324156(−) | | chr9:24320470-24320672(+) |
| SV08 | 559 | | INV | chr9:25,968,399 | chr9:25,969,868 | chr9:25968120-25968399(+) | | chr9:25969634-25969868(−) |
| SV09 | 581 | | INV | chr9:25,969,504 | chr9:25,972,326 | chr9:25969502-25969820(−) | | chr9:25972324-25972543(+) |
| SV10 | 584 | | INV+TRANS | chr9:21,326,884 | chr7:140,023,555 | chr9:21326735-21326867(+) | chr9:21326884-21326931(−) (47 bp) | chr7:140023553-140023913(+) |
| SV11 | 578 | | TRANS | chr6:124,911,349 | chr18:53,465,051 | chr6:124911349-124911707(−) | | chr18:53465049-53465220(+) |
| SV12 | 573 | | ID | chr18:48,434,141 | chr18:49,851,882 | chr18:48433731-48434141(+) | | chr18:49851882-49852004(+) |
Tumor Suppressor Gene (TSG), Structural Variant (SV), Translocation (TRANS), Wild-type Control (WT), Inversion (INV), Interstitial Deletion (ID).
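The expected-alignment entries in the amplicon table follow a `chrom:start-end(strand)` pattern. As a minimal sketch of how such strings could be turned into structured records for downstream checks, the snippet below parses one entry. The names `Region` and `parse_region` are hypothetical and not from the study; the pattern tolerates an optional space after the colon, either minus character as the strand symbol, and a trailing length note such as `(81bp)`.

```python
import re
from typing import NamedTuple


class Region(NamedTuple):
    """One expected alignment segment (hypothetical helper type)."""
    chrom: str
    start: int
    end: int
    strand: str  # "+" or "-"


# Optional whitespace after the colon; the strand symbol may be "+",
# an ASCII hyphen-minus, or the Unicode minus sign (U+2212).
_REGION_RE = re.compile(r"^(chr\w+):\s*(\d+)-(\d+)\(([+\-\u2212])\)")


def parse_region(text: str) -> Region:
    """Parse a string such as 'chr9:24352894-24353014(+)' into a Region."""
    m = _REGION_RE.match(text.strip())
    if m is None:
        raise ValueError(f"unrecognized region: {text!r}")
    chrom, start, end, strand = m.groups()
    # Normalize both minus characters to ASCII "-".
    return Region(chrom, int(start), int(end), "+" if strand == "+" else "-")


if __name__ == "__main__":
    print(parse_region("chr9:24352894-24353014(+)"))
    print(parse_region("chr9: 21083440-21083521(\u2212) (81bp)"))
```

Normalizing the strand symbol up front avoids mismatches later when records are compared against aligner output, which typically uses ASCII characters only.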
Figure 1. Nanopore library prep workflow. Oxford Nanopore barcodes were incorporated into amplicons by PCR, individually for each SV, and the resulting reactions were then pooled (A). After the NEB End Repair and dA-tailing modules (B), hairpin and leader adapters were ligated on, each containing a motor protein. Only the hairpin protein contained a his-tag, which was used to enrich for molecules containing a leader adapter and his-tag (his-tag selection step not shown). Tether attachment (C) allowed for direct attachment of the molecules to the flow cell membrane. Within the MinION flow cell (D), DNA molecules are pulled through a protein pore (blue), with the motor protein (orange) affecting the speed of DNA translocation through the pore. One side of the DNA molecule is read, then the hairpin, then the second side. Both reads were aligned to produce a 2D consensus read.
Figure 2. Nanopore sequencing QC data. Flow cell 1: A) read length and B) PHRED quality histograms for each barcode, shown as stacked bar graphs (average length 570 bp, average PHRED score 11.5). Flow cell 2: C) read length and D) PHRED quality histograms (average length 573 bp, average PHRED score 10.9).
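For orientation, a PHRED score Q corresponds to an estimated per-base error probability of 10^(−Q/10). The sketch below applies that standard conversion to the two average scores quoted in the QC caption; the function name is illustrative. Note that converting an averaged PHRED score is not the same as averaging per-base error probabilities, so these figures are only a rough guide.

```python
def phred_to_error_prob(q: float) -> float:
    """Standard PHRED definition: Q = -10 * log10(p_error)."""
    return 10 ** (-q / 10)


if __name__ == "__main__":
    # Average scores taken from the QC figure caption.
    for label, q in [("flow cell 1", 11.5), ("flow cell 2", 10.9)]:
        print(f"{label}: Q{q} -> p_error ~ {phred_to_error_prob(q):.3f}")
```

Under this conversion, Q11.5 and Q10.9 correspond to error probabilities of roughly 0.071 and 0.081 per base.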
Yield and Quality of Exp1, Limited to 2D reads.
| Amplicon | Avg. Length (bp) | Yield (bp) | Yield (reads) | Quality (PHRED) | % Match | % Mismatch | % Insertion | % Deletion |
|---|---|---|---|---|---|---|---|---|
| SV01 | 533.07 | 53,307 | 100 | 11.52 | 81.3% | 12.6% | 2.3% | 6.1% |
| SV02 | 582.07 | 81,490 | 140 | 10.98 | 79.7% | 13.2% | 2.4% | 7.1% |
| SV03 | 555.62 | 228,914 | 412 | 11.86 | 76.3% | 15.7% | 2.6% | 8.0% |
| SV04 | 562.60 | 200,285 | 356 | 11.50 | 75.2% | 16.0% | 2.5% | 8.8% |
| SV05 | 596.14 | 134,131 | 225 | 11.31 | 79.0% | 13.6% | 2.2% | 7.4% |
| SV06 | 548.78 | 311,156 | 567 | 11.47 | 81.2% | 12.5% | 2.2% | 6.3% |
| SV07 | 560.40 | 44,832 | 80 | 11.53 | 80.9% | 12.7% | 2.2% | 6.4% |
| SV08 | 547.34 | 266,554 | 487 | 11.33 | 78.1% | 14.0% | 1.8% | 8.0% |
| SV09 | 610.68 | 123,358 | 202 | 11.17 | 81.4% | 12.2% | 2.6% | 6.3% |
| SV10 | 595.92 | 182,353 | 306 | 11.58 | 78.2% | 15.4% | 3.0% | 6.4% |
| SV11 | 578.32 | 419,283 | 725 | 11.55 | 77.6% | 14.8% | 2.5% | 7.6% |
| SV12 | 583.76 | 225,914 | 387 | 12.26 | 76.4% | 14.7% | 2.5% | 8.9% |
| Average | 571.22 | 189,298 | 332 | 11.50 | 78.8% | 14.0% | 2.4% | 7.3% |
All SVs were detected in the nanopore multiplex (1:12) experiment [Exp1].
| Amplicon ID | SV Type | 2D reads | % total reads per barcode | 2D reads aligned to hg19 (%) | Reads properly aligned* (%) | Off-target Reads | Lumpy break-points | Top Lumpy Breakpoint |
|---|---|---|---|---|---|---|---|---|
| SV01 | TRANS | 100 | 2.5% | 91 (91.0%) | 77 (77.0%) | 6 (6.0%) | 1 | chr9:24353014/chr22:36338191 (58) |
| SV02 | n/a (WT) | 140 | 3.5% | 139 (99.3%) | 115 (82.1%) | 24 (17.1%) | 0 | None |
| SV03 | INV+TRANS | 412 | 10.3% | 412 (100.0%) | 68 (16.5%) | 1 (0.2%) | 8 | chr3:79387939/chr9:21083384 (132) |
| SV04 | TRANS | 356 | 8.9% | 356 (100.0%) | 303 (85.1%) | 6 (1.7%) | 3 | chr9:27096843/chr10:132412942 (183) |
| SV05 | ID | 225 | 5.6% | 224 (99.6%) | 198 (88.0%) | 7 (3.1%) | 2 | chr18:48570319 (154) |
| SV06 | INV | 567 | 14.2% | 567 (100.0%) | 549 (96.8%) | 7 (1.2%) | 4 | chr9:24320456/chr9:24323864 (120) |
| SV07 | TRANS | 80 | 2.0% | 78 (97.5%) | 70 (87.5%) | 0 (0.0%) | 2 | chr9:24353014/chr22:36338191 (52) |
| SV08 | INV | 487 | 12.2% | 487 (100.0%) | 449 (92.2%) | 3 (0.6%) | 4 | chr9:25968397/chr9:25969868 (384) |
| SV09 | INV | 202 | 5.1% | 202 (100.0%) | 190 (94.1%) | 5 (2.5%) | 2 | chr9:25969501/chr9:25972324 (172) |
| SV10 | INV+TRANS | 306 | 7.7% | 301 (98.4%) | 254 (83.0%) | 1 (0.3%) | 5 | chr7:140023522/chr9:21326914 (167) |
| SV11 | TRANS | 725 | 18.2% | 725 (100.0%) | 471 (65.0%) | 3 (0.4%) | 3 | chr6:124911349/chr18:53465049 (362) |
| SV12 | ID | 387 | 9.7% | 387 (100.0%) | 274 (70.8%) | 3 (0.8%) | 2 | chr18:48434141 (115) |
| Average | | 332 | 8.3% | 331 (98.8%) | 252 (78.2%) | 6 (2.8%) | | |
*To be considered properly aligned, a read must align to all expected regions (e.g., the Left, Center, and Right sequences from Table 1).
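The "properly aligned" criterion can be illustrated with a small interval check. This is a minimal sketch, not the authors' pipeline: it assumes that "aligns to" means any coordinate overlap on the same chromosome, and it ignores strand and breakpoint precision. The type alias and function names are hypothetical; the expected regions are the SV01 entries from Table 1, while the two example reads are invented for illustration.

```python
from typing import List, Tuple

# (chromosome, start, end), inclusive coordinates
Interval = Tuple[str, int, int]


def overlaps(a: Interval, b: Interval) -> bool:
    """True if two intervals share at least one base on the same chromosome."""
    return a[0] == b[0] and a[1] <= b[2] and b[1] <= a[2]


def is_properly_aligned(read_alignments: List[Interval],
                        expected: List[Interval]) -> bool:
    """A read qualifies only if every expected region is hit by some alignment."""
    return all(
        any(overlaps(aln, region) for aln in read_alignments)
        for region in expected
    )


# Expected left and right segments for SV01 (from Table 1).
sv01_expected = [("chr9", 24352894, 24353014), ("chr22", 36338191, 36338601)]

# Invented example reads: one split across both segments, one covering only chr9.
split_read = [("chr9", 24352900, 24353020), ("chr22", 36338195, 36338590)]
partial_read = [("chr9", 24352900, 24353010)]

if __name__ == "__main__":
    print(is_properly_aligned(split_read, sv01_expected))    # True
    print(is_properly_aligned(partial_read, sv01_expected))  # False
```

A read that covers only one side of the breakpoint is indistinguishable from wild-type sequence at that locus, which is why every expected segment must be present.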
Figure 3. A) IGV screenshot of the WT (SV02) alignment. B-C) IGV screenshots of the translocation (SV01) alignment: B) shows the alignment to the region in chr9 and C) the alignment to the region in chr22. Note the erroneous extension of the read past the breakpoint in the bottom left. D-E) IGV screenshots of the interstitial deletion (SV05) alignment, showing the alignment to the region upstream D) and downstream E) of the deletion in chr18. Note the erroneous extension of the read past the breakpoint in the top right. F-G) IGV screenshots of the inversion (SV09) alignment, showing the alignment to the inverted region F) and to the region downstream of the inversion G). We have flipped G) to show how the 2 parts align. Note the erroneous extension of the read past the breakpoint in the top left.
Results of low frequency serial dilutions of SVs 1:100 into wildtype [Exp2].
| Amplicon ID | SV Type | Dilution into WT | # 2D reads | % per barcode | Reads aligned to hg19 | Aligned reads mapped to WT | Aligned reads mapped to SV* | # Off-target Reads | Lumpy break-points | Top Lumpy Breakpoint |
|---|---|---|---|---|---|---|---|---|---|---|
| SV01 | TRANS | 1:100 | 867 | 21.37% | 851 (98.2%) | 838 (96.7%) | 11 (1.3%) | 3 (0.3%) | 1 | chr9:24353014/chr22:36338197 (7) |
| SV03 | INV+TRANS | 1:100 | 760 | 18.73% | 741 (97.5%) | 685 (90.1%) | 7 (0.9%) | 50 (6.6%) | 3 | chr3:79387933/chr9:21083384 (13) |
| SV04 | TRANS | 1:100 | 378 | 9.31% | 377 (99.7%) | 367 (97.1%) | 10 (2.6%) | 1 (0.3%) | 1 | chr9:27096848/chr10:132412942 (8) |
| SV05 | ID | 1:100 | 577 | 14.22% | 571 (99.0%) | 538 (93.2%) | 31 (5.4%) | 3 (0.5%) | 1 | chr18:48570319 (25) |
| SV09 | INV | 1:100 | 621 | 15.30% | 617 (99.4%) | 601 (96.8%) | 16 (2.6%) | 1 (0.2%) | 1 | chr9:25969504/chr9:25972324 (14) |
| SV12 | ID | 1:100 | 855 | 21.07% | 849 (99.3%) | 810 (94.7%) | 26 (3.0%) | 14 (1.6%) | 1 | chr18:48434141 (11) |
| Mean | | | 676 | 16.67% | 668 (98.8%) | 640 (94.8%) | 17 (2.6%) | 12 (1.6%) | | |
*To be considered properly aligned, a read must align to all expected regions (e.g., the Left, Center, and Right sequences from Table 1).
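To give a feel for why a few hundred reads can suffice at a 1:100 dilution, the sketch below computes the probability of sampling at least k SV-supporting reads under a simple binomial model. The assumptions are mine, not the authors': reads are sampled independently, the SV fraction is taken as 1% (the observed fractions in the table above range from 0.9% to 5.4%), and every SV-derived read is correctly identified. The function name is illustrative.

```python
from math import comb


def prob_at_least(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p), via the complementary sum."""
    below = sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k))
    return 1.0 - below


if __name__ == "__main__":
    n_reads = 500       # "as few as 500 reads per sample"
    sv_fraction = 0.01  # assumed expected fraction for a 1:100 dilution
    for k in (1, 3, 5):
        p_detect = prob_at_least(k, n_reads, sv_fraction)
        print(f"P(at least {k} SV reads in {n_reads}) = {p_detect:.3f}")
```

Under these assumptions the chance of seeing at least one SV read in 500 is about 0.99; requiring more supporting reads per call lowers that probability, which is the trade-off a detection threshold has to balance.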
Yield and Quality of Experiment 2, Limited to 2D reads.
| Amplicon | Avg Length (bp) | Yield (bp) | Yield (reads) | Quality (PHRED) | % Match | % Mismatch | % Insertion | % Deletion |
|---|---|---|---|---|---|---|---|---|
| SV01 | 570.85 | 494,925 | 867 | 10.79 | 80.2% | 13.0% | 2.6% | 6.8% |
| SV03 | 573.37 | 435,760 | 760 | 10.83 | 79.2% | 13.7% | 2.8% | 7.1% |
| SV04 | 575.37 | 217,491 | 378 | 10.90 | 80.5% | 12.7% | 2.6% | 6.7% |
| SV05 | 571.22 | 329,593 | 577 | 10.85 | 80.0% | 13.1% | 2.8% | 6.8% |
| SV09 | 572.65 | 355,617 | 621 | 10.91 | 80.6% | 12.8% | 2.7% | 6.6% |
| SV12 | 573.27 | 490,146 | 855 | 10.93 | 80.1% | 13.0% | 2.6% | 6.9% |
| Average | 572.79 | 387,255 | 676 | 10.87 | 80.1% | 13.1% | 2.7% | 6.8% |
Alignment termini position error.
| Amplicon ID | Overlapping alignments | Correct Upstream Termini (%) | Mean Upstream Error ± SD | Correct Downstream Termini (%) | Mean Downstream Error ± SD |
|---|---|---|---|---|---|
| SV01, SV07 (L) | 77 | 3 (3.9%) | 1.8 ± 5.4 | 53 (68.8%) | 5.3 ± 14.5 |
| SV01, SV07 (R) | 85 | 4 (4.7%) | 1.1 ± 5.5 | 1 (1.2%) | −1.3 ± 17.3 |
| SV02 | 116 | 7 (6.0%) | 0.1 ± 5.2 | 13 (11.2%) | −4.3 ± 10.1 |
| SV03 (L) | 407 | 6 (1.5%) | 4.9 ± 8.3 | 16 (3.9%) | 20.8 ± 13.7 |
| SV03 (C) | 83 | 1 (1.2%) | 10.8 ± 18.4 | 58 (69.9%) | 4.7 ± 18.7 |
| SV03 (R) | 391 | 51 (13.0%) | 1.8 ± 17.2 | 3 (0.8%) | 39.1 ± 24.2 |
| SV04 (L) | 317 | 23 (7.3%) | 11.0 ± 19.3 | 1 (0.3%) | 7.3 ± 7.0 |
| SV04 (R) | 349 | 2 (0.6%) | 19.0 ± 23.3 | 196 (56.2%) | −5.7 ± 14.2 |
| SV05 (L) | 225 | 4 (1.8%) | 5.7 ± 8.5 | 107 (47.6%) | 2.9 ± 14.2 |
| SV05 (R) | 206 | 1 (0.5%) | 7.3 ± 9.8 | 62 (30.1%) | −3.8 ± 9.7 |
| SV06 (L) | 565 | 0 (0.0%) | −20.9 ± 5.2 | 229 (40.5%) | −1.0 ± 4.5 |
| SV06 (R) | 555 | 4 (0.7%) | 5.6 ± 13.6 | 297 (53.5%) | −1.6 ± 7.8 |
| SV08 (L) | 70 | 6 (8.6%) | 2.7 ± 5.4 | 45 (64.3%) | 2.9 ± 14.7 |
| SV08 (R) | 78 | 3 (3.8%) | 4.4 ± 14.3 | 1 (1.3%) | −1.1 ± 7.9 |
| SV09 (L) | 476 | 125 (26.3%) | 4.4 ± 9.9 | 116 (24.4%) | −2.1 ± 6.6 |
| SV09 (R) | 464 | 37 (8.0%) | 0.0 ± 11.0 | 276 (59.5%) | 7.0 ± 28.4 |
| SV10 (L) | 271 | 32 (11.8%) | −0.5 ± 4.7 | 0 (0.0%) | 49.9 ± 19.4 |
| SV10 (C) | 260 | 0 (0.0%) | 144.0 ± 26.5 | 0 (0.0%) | −9.9 ± 14.7 |
| SV10 (R) | 302 | 0 (0.0%) | 28.8 ± 24.4 | 8 (2.6%) | 15.7 ± 27.6 |
| SV11 (L) | 725 | 4 (0.6%) | 34.7 ± 51.5 | 91 (12.6%) | −5.7 ± 7.9 |
| SV11 (R) | 472 | 8 (1.7%) | −0.1 ± 8.0 | 75 (15.9%) | 2.4 ± 8.6 |
| SV12 (L) | 386 | 72 (18.7%) | 5.2 ± 6.9 | 151 (39.1%) | 4.1 ± 11.2 |
| SV12 (R) | 278 | 11 (4.0%) | −0.6 ± 7.9 | 6 (2.2%) | 9.9 ± 9.9 |
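The columns of the termini table (percentage of correct termini and mean error ± SD) can be reproduced from per-read alignment end positions along the following lines. This is a sketch under stated assumptions: the error is taken as observed position minus expected position, "correct" means an error of exactly zero, and the SD is the sample standard deviation. The source does not state its sign convention or which SD estimator was used, and the positions in the usage example are invented.

```python
from statistics import mean, stdev
from typing import Sequence, Tuple


def termini_error_summary(observed: Sequence[int],
                          expected: int) -> Tuple[float, float, float]:
    """Return (% exactly correct, mean signed error, sample SD of the error).

    Needs at least two observed positions, because the sample SD is
    undefined for a single value.
    """
    errors = [pos - expected for pos in observed]
    pct_correct = 100.0 * sum(e == 0 for e in errors) / len(errors)
    return pct_correct, mean(errors), stdev(errors)


if __name__ == "__main__":
    # Invented alignment end positions around the SV01 left breakpoint.
    observed_ends = [24353014, 24353014, 24353019, 24353010]
    pct, avg, sd = termini_error_summary(observed_ends, expected=24353014)
    print(f"correct: {pct:.1f}%  error: {avg:.2f} +/- {sd:.2f} bp")
```

Reporting the signed mean alongside the SD separates systematic over-extension past a breakpoint (a mean far from zero) from random scatter (a large SD with a mean near zero).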