Sara Donzelli1, Ludovica Ciuffreda2, Martina Pontone3, Martina Betti4, Alice Massacci4, Carla Mottini2, Francesca De Nicola2, Giulia Orlandi5, Frauke Goeman2, Eugenia Giuliani5, Eleonora Sperandio4, Giulia Piaggio2, Aldo Morrone5, Gennaro Ciliberto6, Maurizio Fanciulli2, Giovanni Blandino1, Fulvia Pimpinelli3, Matteo Pallocca4.
Abstract
Tracking SARS-CoV-2 Variants of Concern via Whole Genome Sequencing is a pillar of the public health measures for containing the pandemic. The ability to track lineage distribution on a local and global scale leads to a better understanding of immune escape and supports interventions to contain novel outbreaks. This scenario poses a challenge for NGS laboratories worldwide, which are pressed to deliver both a faster turnaround time and high-throughput processing of swabs for sequencing and analysis. In this study, we present an optimization of the Illumina COVID-seq protocol, carried out on thousands of SARS-CoV-2 samples at the wet and dry level. We discuss the unique challenges of processing hundreds of swabs per week, such as the tradeoff between ultra-high sensitivity and contamination levels in negative controls, cost efficiency, and bioinformatics quality metrics.
Keywords: BAM, Binary Alignment Map; BED, Browser Extensible Data; Bioinformatics workflow; COVID mutations; FDA, Food and Drug Administration; HPC, High Performance Computing; Illumina COVID-seq; LIMS, Laboratory Information Management System; NGS, Next Generation Sequencing; Oncology; Oncology Metagenomics; RBD, Receptor-Binding Domain; SARS-CoV-2 Variants of Concern; SARS-CoV-2 genome; SARS-CoV-2 mutation; SARS-CoV-2, Severe Acute Respiratory Syndrome Coronavirus; TAT, Turnaround Time; VoC, Variants of Concern
Year: 2022 PMID: 35611117 PMCID: PMC9119164 DOI: 10.1016/j.csbj.2022.05.033
Source DB: PubMed Journal: Comput Struct Biotechnol J ISSN: 2001-0370 Impact factor: 6.155
Fig. 1A: High-level abstraction of wet laboratory and bioinformatics procedures. The overall workflow is divided into three major sections: laboratory (wet), cloud, and in-house HPC-automated analyses. Each section prompted a specific line of engineering and research, from LIMS automated extraction and reporting (Clinical Informatics) to library prep optimization and bioinformatics pipeline parameter tuning, in order to obtain the required information in the shortest timeframe possible.
Fig. 2A-B: Absolute coverage of signal and noise after reducing the PCR cycles from 35 to 22. Noise is represented by negative control samples (water) resequenced after the protocol change. Positive samples are real swabs collected from March to September 2021, containing alpha and delta variants. C: Complete coverage analysis per amplicon on 3232 SARS-CoV-2 COVID-seq samples. Amplicon 64 is highlighted as an outlier. The overall coverage is consistently high across all samples (median >7000). C: Vertical coverage across samples on the whole genome. The Spike protein region is highlighted: its coverage is lower than the median but consistently above the 200x threshold. D: Horizontal coverage across samples at different thresholds, showing that >80% of samples have a median horizontal coverage >100x across the whole SARS-CoV-2 genome, regardless of the number of PCR cycles. Overall, better horizontal coverage is achieved with the adapted protocol.
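
The coverage metrics named in this caption can be illustrated with a minimal sketch. It assumes that "horizontal coverage" means the fraction of genome positions covered at or above a depth threshold, and that per-base depths are available as a plain list (for example, parsed from `samtools depth -a` output). The function names and the toy depth values are hypothetical and are not taken from the study's pipeline.

```python
from statistics import median


def horizontal_coverage(depths, threshold):
    """Fraction of positions whose depth is at least `threshold`."""
    if not depths:
        return 0.0
    return sum(d >= threshold for d in depths) / len(depths)


def summarize(depths, thresholds=(10, 100, 200)):
    """Median depth plus horizontal coverage at each threshold."""
    return {
        "median_depth": median(depths),
        "breadth": {t: horizontal_coverage(depths, t) for t in thresholds},
    }


# Toy per-base depths for a 10-position "genome" (illustrative only).
toy_depths = [0, 50, 150, 250, 7000, 7000, 7000, 7000, 300, 120]
summary = summarize(toy_depths)
print(summary)
```

Under this definition, a sample passes a 100x criterion when its breadth at threshold 100 exceeds whatever cutoff the laboratory chooses; the cutoff itself is not specified here.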
Time performance metrics of several sample batches and bioinformatics workflows.
| Plates | Samples (controls included) | nf-core/viralrecon (32 CPUs, 256 GB RAM) | Illumina DRAGEN COVID Lineage | Illumina DRAGEN RNA Pathogen Detection |
|---|---|---|---|---|
| 1 | 96 | 3–4 h | 1.5 h (4 nodes) | 3.5 h (4 nodes) |
| 2 | 192 | 6–8 h | 5–6 h (4 nodes) | 3–4 h (9 nodes) |
| 4 | 384 | 10–12 h | 5–6 h (4 nodes) | 5 h (10 nodes) |
| 8 | 768 | 20–24 h | 15.5 h (4 nodes) | 9 h (16 nodes) |
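
To compare the workflows in the table above on a common scale, the runtimes can be converted to samples processed per hour. The sketch below is only an illustration of that arithmetic: it takes the upper bound of each reported range and ignores the differing node counts, so the figures are rough. The dictionary keys and the function name are hypothetical.

```python
# Upper-bound runtimes in hours, copied from the table (ranges use the upper value).
RUNTIME_H = {
    "viralrecon": {96: 4, 192: 8, 384: 12, 768: 24},
    "dragen_covid_lineage": {96: 1.5, 192: 6, 384: 6, 768: 15.5},
    "dragen_rna_pathogen": {96: 3.5, 192: 4, 384: 5, 768: 9},
}


def samples_per_hour(workflow, n_samples):
    """Throughput for one batch size, using the tabulated upper-bound runtime."""
    return n_samples / RUNTIME_H[workflow][n_samples]


for workflow, runs in RUNTIME_H.items():
    rates = {n: round(samples_per_hour(workflow, n), 1) for n in runs}
    print(workflow, rates)
```

Because the DRAGEN RNA Pathogen Detection runs used between 4 and 16 nodes, its throughput is not directly comparable across batch sizes without normalizing by node count.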
Sequencing instruments and flow cell/cartridge combos for desired throughput.
| Samples | Instrument and Flow cell/cartridge |
|---|---|
| 96 | NextSeq 500/550 Mid Output Kit v2.5 (300 Cycles) |
| 192–384 | NextSeq 500/550 High Output Kit v2.5 (300 Cycles) |
| 192–768 | NovaSeq 6000 SP Reagent Kit v1.5 (300 cycles) with NovaSeq XP 2-Lane Kit v1.5 #20043130 |
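
The table above can be read as a lookup from batch size to candidate kits. A minimal sketch follows, assuming the "96" row applies to any run of up to 96 samples (an assumption, since the table lists only the single value) and treating the other rows as inclusive ranges. Batch sizes the table does not cover return no candidates. The names `KITS` and `candidate_kits` are hypothetical.

```python
# (min_samples, max_samples, instrument and flow cell/cartridge), from the table.
# The lower bound of 1 for the first row is an assumption.
KITS = [
    (1, 96, "NextSeq 500/550 Mid Output Kit v2.5 (300 Cycles)"),
    (192, 384, "NextSeq 500/550 High Output Kit v2.5 (300 Cycles)"),
    (192, 768, "NovaSeq 6000 SP Reagent Kit v1.5 (300 cycles) "
               "with NovaSeq XP 2-Lane Kit v1.5 #20043130"),
]


def candidate_kits(n_samples):
    """Return every tabulated option whose sample range includes n_samples."""
    return [name for low, high, name in KITS if low <= n_samples <= high]


for n in (96, 384, 768):
    print(n, candidate_kits(n))
```

For 192 to 384 samples, two options overlap, so the choice between them would depend on factors the table does not state, such as instrument availability or cost.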