Sneha D Goenka1, John E Gorzynski1, Kishwar Shafin2, Dianna G Fisk3, Trevor Pesout2, Tanner D Jensen1, Jean Monlong2, Pi-Chuan Chang4, Gunjan Baid4, Jonathan A Bernstein1, Jeffrey W Christle1, Karen P Dalton1, Daniel R Garalde5, Megan E Grove3, Joseph Guillory5, Alexey Kolesnikov4, Maria Nattestad4, Maura R Z Ruzhnikov1, Mehrzad Samadi6, Ankit Sethia6, Elizabeth Spiteri1, Christopher J Wright5, Katherine Xiong1, Tong Zhu6, Miten Jain2, Fritz J Sedlazeck7, Andrew Carroll4, Benedict Paten2, Euan A Ashley8.
Abstract
Whole-genome sequencing (WGS) can identify variants that cause genetic disease, but the time required for sequencing and analysis has been a barrier to its use in acutely ill patients. In the present study, we develop an approach for ultra-rapid nanopore WGS that combines an optimized sample preparation protocol, sequencing distributed over 48 flow cells, near real-time base calling and alignment, accelerated variant calling and fast variant filtration for efficient manual review. Application to two example clinical cases identified a candidate variant in <8 h from sample preparation to variant identification. We show that this framework provides accurate variant calls and efficient prioritization, and accelerates diagnostic clinical genome sequencing twofold compared with previous approaches.
Year: 2022 PMID: 35347328 PMCID: PMC9287171 DOI: 10.1038/s41587-022-01221-5
Source DB: PubMed Journal: Nat Biotechnol ISSN: 1087-0156 Impact factor: 68.164
Fig. 2 Comparison of barcoded and nonbarcoded variant calling performance, and standard and ultra-rapid variant filtration performance.
a, Stratified variant calling performance comparison between barcoded and nonbarcoded HG002 samples in all benchmarking and exonic regions. The HG002 nonbarcoded run was the seventh sample run on the flow cells. The similarity in variant calling performance shows that barcoding is not necessary to achieve high-quality variant calls. b, Comparison between standard and ultra-rapid variant filtration pipelines. The standard pipeline selects a variant if any of its subcategories is true, whereas the ultra-rapid pipeline uses a score-based method to surface mostly relevant variants for fast manual review. The standard pipeline flags 147 variants, whereas the ultra-rapid pipeline proposes 31, which substantially reduces the time required for manual review.
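The contrast in panel b between an any-subcategory rule and a score-based rule can be illustrated with a minimal Python sketch. The subcategory names, weights and threshold below are hypothetical and chosen only for illustration; the caption does not specify the scoring scheme actually used.

```python
from dataclasses import dataclass, field

@dataclass
class Variant:
    name: str
    # Maps a subcategory name to True/False. All names here are hypothetical.
    flags: dict = field(default_factory=dict)

# Hypothetical weights; the real scoring scheme is not given in the caption.
WEIGHTS = {
    "rare": 1,
    "predicted_deleterious": 2,
    "known_pathogenic": 3,
    "in_gene_panel": 2,
}

def standard_filter(variants):
    """Keep a variant if any of its subcategories is true."""
    return [v for v in variants if any(v.flags.values())]

def score(variant):
    """Sum the weights of the subcategories that are true for this variant."""
    return sum(w for key, w in WEIGHTS.items() if variant.flags.get(key))

def ultra_rapid_filter(variants, threshold=4):
    """Keep variants scoring at or above the threshold, highest score first."""
    kept = [v for v in variants if score(v) >= threshold]
    return sorted(kept, key=score, reverse=True)

variants = [
    Variant("v1", {"rare": True}),
    Variant("v2", {"rare": True, "known_pathogenic": True, "in_gene_panel": True}),
    Variant("v3", {}),
    Variant("v4", {"predicted_deleterious": True, "in_gene_panel": True}),
]

print([v.name for v in standard_filter(variants)])
print([v.name for v in ultra_rapid_filter(variants)])
```

In this toy input the any-subcategory rule keeps every variant with at least one true flag, while the score-based rule keeps only those whose combined evidence reaches the threshold and ranks them for review.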
Fig. 1 Overview of ultra-rapid computational pipeline.
a, After the start of sequencing on the PromethION48 device, raw signal files are periodically uploaded to cloud storage. Our cloud-based pipeline scales compute-intensive base calling and alignment across 16 instances with 4× Tesla V100 GPUs each and runs concurrently with sequencing. The instances aim for maximum resource utilization, where base calling using Guppy runs on GPU and alignment using Minimap2 (ref. [17]) runs on 42 virtual CPUs in parallel. b, Once the alignment file is ready, small-variant calling is performed using GPU-accelerated PEPPER–Margin–DeepVariant[11] on 14 instances and SV calling is performed using Sniffles[18] on 2 instances. Each instance processes a specific set of contigs. Specific details about the Google Cloud Platform-based instance configurations are provided in Supplementary Table 21. c, These variant calls are annotated to aid in the subsequent variant filtration and prioritization. Our score-based variant filtration method takes in millions of variants reported by the variant caller to surface any deleterious variant for review using Alissa. The filtration method is designed such that it reports a tractable number of variants for manual curation.
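One way to give each of the 14 small-variant-calling instances its own set of contigs is sketched below in Python. The greedy longest-first balancing is an assumption made for illustration; the caption does not say how contigs were actually assigned. The contig lengths are rounded, approximate human chromosome sizes in Mb and are likewise illustrative only.

```python
import heapq

# Approximate, rounded chromosome lengths in Mb (illustrative values only).
CONTIG_MB = {
    "chr1": 249, "chr2": 242, "chr3": 198, "chr4": 190, "chr5": 182,
    "chr6": 171, "chr7": 159, "chr8": 145, "chr9": 138, "chr10": 134,
    "chr11": 135, "chr12": 133, "chr13": 114, "chr14": 107, "chr15": 102,
    "chr16": 90, "chr17": 83, "chr18": 80, "chr19": 59, "chr20": 64,
    "chr21": 47, "chr22": 51, "chrX": 156, "chrY": 57,
}

def partition_contigs(lengths, n_instances):
    """Greedy balancing: give the next-longest contig to the least-loaded instance."""
    # Min-heap of (current load, instance index).
    heap = [(0, i) for i in range(n_instances)]
    heapq.heapify(heap)
    assignment = [[] for _ in range(n_instances)]
    for contig, length in sorted(lengths.items(), key=lambda kv: kv[1], reverse=True):
        load, idx = heapq.heappop(heap)
        assignment[idx].append(contig)
        heapq.heappush(heap, (load + length, idx))
    return assignment

# 14 instances, matching the small-variant-calling count in panel b.
plan = partition_contigs(CONTIG_MB, 14)
for idx, contigs in enumerate(plan):
    total = sum(CONTIG_MB[c] for c in contigs)
    print(f"instance {idx}: {contigs} ({total} Mb)")
```

Balancing by contig length keeps the slowest instance from dominating wall-clock time, which matters when every stage is on the critical path to a result.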
Fig. 3 Ultra-rapid-sequencing pipeline performance.
Detailed schema of the ultra-rapid-sequencing pipeline and its end-to-end performance on two clinical samples. a, The ultra-rapid-sequencing pipeline starts from sample collection on the far left and ends at final diagnosis on the far right. The details of each step are presented in Online Methods. All the steps that run in parallel are vertically stacked. b, The first patient was the fourth sample sequenced on the set of flow cells, resulting in 2:16 h of sequencing. Variant calling completed 6:55 h from the start of sample preparation. Subsequently, variant filtration and manual review identified a candidate variant in the gene TNNT2 at 7:18 h. c, The second patient was the sixth sample on the same set of flow cells. Sequencing completed in 2:46 h with 200 Gb, followed by another 2:44 h to identify a candidate variant, resulting in an end-to-end time of 7:48 h.
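The intervals implied by the times in panels b and c can be worked out with a short Python sketch. The helper names are hypothetical. The last figure assumes that, for the second patient, the stages run back to back, so that whatever remains of the end-to-end time precedes the start of sequencing; the caption does not state this explicitly.

```python
def to_minutes(hmm):
    """Convert an 'h:mm' string to whole minutes."""
    hours, minutes = hmm.split(":")
    return int(hours) * 60 + int(minutes)

def to_hmm(total_minutes):
    """Convert whole minutes back to an 'h:mm' string."""
    return f"{total_minutes // 60}:{total_minutes % 60:02d}"

# Patient 1: time from completed variant calling to candidate variant.
review_p1 = to_minutes("7:18") - to_minutes("6:55")

# Patient 2: sequencing plus the time needed to reach a candidate variant.
seq_and_analysis_p2 = to_minutes("2:46") + to_minutes("2:44")

# Patient 2: remainder of the end-to-end time, assumed to precede sequencing.
before_sequencing_p2 = to_minutes("7:48") - seq_and_analysis_p2

print(f"patient 1 filtration and review: {review_p1} min")
print(f"patient 2 sequencing + analysis: {to_hmm(seq_and_analysis_p2)} h")
print(f"patient 2 time before sequencing (assumed): {to_hmm(before_sequencing_p2)} h")
```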