Timour Baslan1, Sam Kovaka2, Fritz J Sedlazeck3, Yanming Zhang4, Robert Wappel5, Sha Tian1, Scott W Lowe1,6, Sara Goodwin5, Michael C Schatz2,5,7.
Abstract
Genome copy number is an important source of genetic variation in health and disease. In cancer, copy number alterations (CNAs) can be inferred from short-read sequencing data, enabling genomics-based precision oncology. Emerging nanopore sequencing technologies offer the potential for broader clinical utility, for example in smaller hospitals, owing to lower instrument cost, greater portability, and ease of use. Nonetheless, nanopore sequencing devices return fewer sequencing reads/molecules than short-read sequencing platforms, which limits CNA inference accuracy. To address this limitation, we sequenced short DNA molecules loaded at an optimized concentration to increase the read/molecule yield of a single nanopore run. We show that short-molecule nanopore sequencing reproducibly returns high read counts and allows high-quality CNA inference. We demonstrate the clinical relevance of this approach by accurately inferring CNAs in acute myeloid leukemia samples. The data show that, compared to traditional approaches such as chromosome analysis/cytogenetics, short-molecule nanopore sequencing returns more sensitive and accurate copy number information in a cost-effective and expeditious manner, including for multiplexed samples. Our results provide a framework for short-molecule nanopore sequencing with applications in research and medicine that include, but are not limited to, CNA inference.
Year: 2021 PMID: 34551429 PMCID: PMC8643650 DOI: 10.1093/nar/gkab812
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 19.160
Annotations and sequencing statistics for samples run on a MinION device
| Sample/type | Sample ID | Read count | Yield (Mb) | Passed (%) | | Read-length mean (bp) | Read N50 (bp) | Minutes to 250k reads |
|---|---|---|---|---|---|---|---|---|
| Cancer Cell Line SKBR3 | SKBR3-Long | 899425 | 8360.86 | 84.94 | 10.17 | 9295.79 | 14301 | 291.93 |
| Cancer Cell Line SKBR3 | SKBR3-Short (1ME) | 6332668 | 3024.8 | 86.42 | 10.54 | 477.65 | 538 | 16.5 |
| Cancer Cell Line SKBR3 | SKBR3-Short (3ME) | 1465247 | 789.33 | 89.61 | 10.89 | 538.7 | 637 | 17.07 |
| Cancer Cell Line SKBR3 | SKBR3-Short (6ME) | 699009 | 416.13 | 88.69 | 10.73 | 595.32 | 732 | 19.35 |
| Leukemia (AML-BM) | AML1 | 3797811 | 1670.22 | 84.97 | 9.09 | 439.79 | 501 | 15.13 |
| Leukemia (AML-BM) | AML2 | 6065542 | 2510.64 | 81.17 | 8.69 | 413.92 | 461 | 16.42 |
| Leukemia (AML-BM) | AML3 | 2312000 | 1072.69 | 91.29 | 9.72 | 463.96 | 503 | 16.59 |
| Leukemia (AML-BM) | AML4 | 3935726 | 1753.57 | 86.68 | 9.29 | 445.55 | 488 | 15.58 |
| Leukemia (AML-BM) | AML5 | 6263490 | 2886.35 | 82.82 | 8.95 | 460.82 | 519 | 15.53 |
AML = acute myeloid leukemia, ME = molar equivalent.
All AML samples were run at 1ME concentrations.
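As a quick sanity check on the table, the reported yield should roughly equal read count multiplied by mean read length. The sketch below is an illustration only, not part of the study's pipeline: it recomputes the implied yield for three rows copied from the table and includes a small `n50` helper. The helper names are hypothetical, and the N50 definition used is the common one (the read length at which the cumulative sum, taken from the longest read down, first reaches half of all bases).

```python
# Values copied from the table above: (read count, yield in Mb, mean read length in bp).
RUNS = {
    "SKBR3-Long": (899425, 8360.86, 9295.79),
    "SKBR3-Short (1ME)": (6332668, 3024.8, 477.65),
    "AML2": (6065542, 2510.64, 413.92),
}


def implied_yield_mb(read_count, mean_length_bp):
    """Yield in megabases implied by read count and mean read length."""
    return read_count * mean_length_bp / 1e6


def n50(lengths):
    """Read length at which the cumulative sum, from longest to shortest,
    first reaches half of all sequenced bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("lengths must be non-empty")


for name, (count, reported_mb, mean_len) in RUNS.items():
    implied = implied_yield_mb(count, mean_len)
    print(f"{name}: reported {reported_mb:.2f} Mb, implied {implied:.2f} Mb")

# Ratio of read counts between the short (1ME) and long SK-BR-3 runs.
fold = RUNS["SKBR3-Short (1ME)"][0] / RUNS["SKBR3-Long"][0]
print(f"Short (1ME) vs Long read count: {fold:.2f}x")
```

The short-molecule run at 1ME produces roughly seven times as many reads as the long-molecule run, even though its total yield in bases is lower. Read count, not base yield, is the quantity that matters for count-based CNA inference.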
Figure 1. Sequencing short molecules on a nanopore device yields high read counts and enables accurate copy number profiling. (A) Cumulative number of reads/molecules sequenced over time throughout each SK-BR-3 nanopore run (ME = molar equivalent). (B) Total number of ‘living’ channels over time throughout each run, as measured by the time at which the last read was sequenced by each channel. Note that a channel may not produce a read for several hours but still be considered ‘alive’ by this definition. (C) Distributions of the time between the entry/exit of discrete DNA molecules in a given channel, i.e. vacancy time. (D) Genome-wide copy number profiles of SK-BR-3 sequenced using short-molecule nanopore (right panel) and short-read Illumina (left panel) data. Profiles are plotted after dividing the genome into 5 thousand bins (i.e. 5k bins). Examples of detected CNAs are annotated on the profiles.
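Copy number profiles of this kind are built from read counts per genomic bin. The following is a minimal sketch of that idea on a toy genome, not the authors' pipeline: it assumes equal-width bins, median normalization, and a diploid baseline, and none of these is stated in the text above. The function names and the toy data are hypothetical.

```python
from statistics import median


def bin_counts(positions, genome_length, n_bins):
    """Count read start positions in n_bins equal-width bins."""
    counts = [0] * n_bins
    for pos in positions:
        idx = min(pos * n_bins // genome_length, n_bins - 1)
        counts[idx] += 1
    return counts


def copy_ratio(counts):
    """Divide each bin count by the median count, so a typical bin has ratio 1.0."""
    mid = median(counts)
    return [c / mid for c in counts]


# Toy genome of 1000 bp in 10 bins; reads per bin are made up for illustration.
# Bins 3 and 4 carry a gain (20 reads), bin 7 carries a loss (5 reads).
reads_per_bin = [10, 10, 10, 20, 20, 10, 10, 5, 10, 10]
positions = [
    b * 100 + (i * 100 // n)  # spread n read starts evenly inside bin b
    for b, n in enumerate(reads_per_bin)
    for i in range(n)
]

counts = bin_counts(positions, genome_length=1000, n_bins=10)
ratios = copy_ratio(counts)
# Assuming a diploid baseline, ratio 1.0 corresponds to two copies.
copy_number = [2 * r for r in ratios]
print(copy_number)
```

Because each bin's estimate rests on a count, its noise shrinks as the number of reads per bin grows, which is why a higher read yield per run supports a finer bin resolution.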
Figure 2. Short-molecule sequencing on a MinION nanopore device yields accurate, high-resolution copy number information in a clinically relevant setting. (A and B) Genome-wide copy number profiles of a normal and a complex karyotype sample sequenced using short molecules on a MinION (lower panel) and short reads on an Illumina instrument (upper panel). (C) Density scatter correlation plot of normalized bin read counts (proportional to copy number) from MinION and Illumina sequencing data for leukemic sample AML-2. The Pearson correlation value is provided. (D) Zoomed-in views of chromosome 7 (left panel), with a complex rearrangement resulting in loss of both the short and long arms, and chromosome 11 (right panel), with gains of the long arm at various levels, in complex karyotype sample AML-2 using Nanopore (gray line) and Illumina (black line) sequencing data. Dashed gray vertical lines illustrate alteration breakpoints. (E and F) Zoomed-in views of chromosomal deletions on chromosomes 12p, 16p, 17p and 20q. The right panel in (F) shows chromosome images derived from cytogenetic analysis: a normal chromosome 20 (top) and a marker chromosome (bottom), which was redefined as a derivative chromosome 20 with deletion of the long arm based on this study. (G) DNA-FISH-based validation of selected CNAs. Gain of the MLL gene on chromosome cytoband 11q23 (top panel), using a break-apart MLL probe set (5’MLL and 3’MLL are labeled in orange and green, respectively); two cells (arrows) show five copies of MLL/KMT2A. FISH tests confirm focal 17p loss, encompassing the TP53 gene (bottom panel). TP53 and D17Z1 (CEP17-Cen) probes are labeled in orange and green, respectively. Three cells (arrows) show two green signals for CEP17 and only one orange signal for TP53. Nuclei are counterstained with DAPI (blue).
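The cross-platform comparison in panel (C) reduces to a Pearson correlation between two vectors of normalized bin counts. A minimal sketch of that calculation follows; the bin values are invented for illustration and are not data from the study, and the function name is hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical normalized bin counts for the same bins on two platforms.
platform_a = [1.0, 1.0, 0.5, 0.5, 1.0, 1.5, 2.0, 1.0]
platform_b = [1.1, 0.9, 0.6, 0.5, 1.0, 1.4, 2.1, 1.0]

r = pearson(platform_a, platform_b)
print(f"Pearson r = {r:.3f}")
```

A value close to 1 indicates that the two platforms rank and scale the bins consistently, which is the sense in which the profiles agree.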
Figure 3. Increased read counts via short-molecule sequencing enable accurate, multiplexed profiling on a MinION device. (A) Heatmap of copy number profiles of all profiled AML samples (n = 5) using Illumina, MinION, and multiplex MinION sequencing. The karyotype status of each leukemic sample (normal vs. complex) and the sequencing modality are annotated in bars above the heatmap. Heatmap color and bar color codes are provided below the heatmap. Selected, prognostically relevant CNAs are identified on the heatmap. (B) Genome-wide copy number profiles from a complex karyotype AML sample inferred from short-molecule nanopore sequencing in multiplex (lower panel) and non-multiplex (upper panel) mode. (C) DNA-FISH-based validation of identified copy number alterations on the long arms of chromosomes 5 and 7 (Chr5q and Chr7q, respectively). A 5p15.2 probe (labeled in green) and a centromeric probe (D7Z1, CEP7) for chromosome 7 were used as internal controls (Con). Left: two cells (arrows) show two green signals for 5p and only one orange signal for 5q. Right: all three cells (arrows) show two green signals for CEP7 and only one orange signal for 7q. Color codes of probes are illustrated. Nuclei are counterstained with DAPI.
Figure 4. Short-molecule sequencing using a Flongle adaptor enables high read counts, accurate copy number data, and cheap, portable sequencing. (A) Absolute number of sequenced molecules for a Flongle run compared to other nanopore runs. Short 1ME and Long are as depicted in Figure 1A. (B) Normalized number of sequenced molecules per channel for each nanopore run (126 channels for Flongle and 512 channels for MinION). (C and D) Genome-wide copy number profiles of a complex karyotype (C) and a normal karyotype (D) AML case inferred from short-molecule nanopore sequencing (top panel) and short-read Illumina sequencing (lower panel), respectively.
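The per-channel normalization in panel (B) divides each run's molecule count by the number of channels on the device. The sketch below applies it to the two SK-BR-3 MinION read counts from the table, using the channel counts given in the caption. The Flongle read count is not reported in this text, so the example only computes the Flongle total that would match a given MinION per-channel yield. The function names are hypothetical.

```python
# Channel counts as stated in the Figure 4 caption.
MINION_CHANNELS = 512
FLONGLE_CHANNELS = 126

# MinION read counts copied from the table of sequencing statistics.
MINION_READS = {
    "SKBR3-Long": 899425,
    "SKBR3-Short (1ME)": 6332668,
}


def reads_per_channel(total_reads, n_channels):
    """Normalize a run's read count by the number of channels on the device."""
    return total_reads / n_channels


def matching_total(per_channel_reads, n_channels):
    """Total reads a device with n_channels needs to match a per-channel yield."""
    return per_channel_reads * n_channels


for name, total in MINION_READS.items():
    per_channel = reads_per_channel(total, MINION_CHANNELS)
    print(f"{name}: {per_channel:.1f} reads per channel")

# Flongle total that would equal the per-channel yield of the short (1ME) MinION run.
short_per_channel = reads_per_channel(MINION_READS["SKBR3-Short (1ME)"], MINION_CHANNELS)
flongle_equivalent = matching_total(short_per_channel, FLONGLE_CHANNELS)
print(f"Equivalent Flongle total: {flongle_equivalent:.0f} reads")
```

Normalizing this way separates the effect of library type (short vs. long molecules) from the effect of device size when comparing runs.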