Luca Marcolungo, Cristina Beltrami, Chiara Degli Esposti, Giulia Lopatriello, Chiara Piubelli, Antonio Mori, Elena Pomari, Michela Deiana, Salvatore Scarso, Zeno Bisoffi, Valentina Grosso, Emanuela Cosentino, Simone Maestri, Denise Lavezzari, Barbara Iadarola, Marta Paterno, Elena Segala, Barbara Giovannone, Martina Gallinaro, Marzia Rossato, Massimo Delledonne.
Abstract
Sequencing the SARS-CoV-2 genome from clinical samples can be challenging, especially in specimens with low viral titer. Here we report Accurate SARS-CoV-2 genome Reconstruction (ACoRE), an amplicon-based viral genome sequencing workflow for the complete and accurate reconstruction of SARS-CoV-2 sequences from clinical samples, including suboptimal ones that would usually be excluded even if unique and irreplaceable. The protocol was optimized to improve flexibility, and the combination of technical replicates was established as the central strategy to achieve accurate analysis of low-titer/suboptimal samples. We demonstrated the utility of the approach by achieving complete genome reconstruction and identifying false-positive variants in >170 clinical samples, thus avoiding the generation of inaccurate and/or incomplete sequences. Most importantly, ACoRE was crucial to identify the correct viral strain responsible for a relapse case, which would otherwise have been misclassified as a re-infection because of missing or incorrect variant identification by a standard workflow.
Keywords: Genetic variants; Low-viral titer; Re-infection; SARS-CoV-2 genome sequencing; Suboptimal samples
Year: 2021 PMID: 33839270 PMCID: PMC8028595 DOI: 10.1016/j.ygeno.2021.04.008
Source DB: PubMed Journal: Genomics ISSN: 0888-7543 Impact factor: 5.736
Fig. 1. Comparison of intra-cDNA and inter-cDNA replicates of SARS-CoV-2 genome amplification and sequencing. (A) Schematic diagram showing the five clinical samples obtained from COVID-19 patients, their RT-qPCR Ct values and the experimental workflow. For each sample, we generated three independent cDNAs and each cDNA was amplified in duplicate using the ARTIC nCoV-2019 V3 Panel. Amplicons used as the input for library preparation were sequenced in 250PE mode on the Illumina MiSeq platform. The bar charts show mean concordance rates (± standard deviations) for (B) genome coverage, (C) genotypability, (D) consensus variants and (E) iSNV between amplification replicates generated from different cDNAs (inter-cDNA) or the same cDNA (intra-cDNA).
Fig. 2. Coverage and variant calling between intra-cDNA and inter-cDNA replicates. (A) Sequencing coverage of the 98 amplicons of the ARTIC V3 panel from four representative replicates of sample S5. Green bars represent the amplicons generated using the ARTIC original primer set, and orange bars represent the amplicons generated using the alternative V3 primers. Red arrows point at representative amplicons missing in only one replicate. (B) Integrative Genomics Viewer (IGV) visualization of four representative sequencing replicates of sample S5 in the region 19,080–19,180 of the SARS-CoV-2 genome. Black arrows indicate variants called in only one replicate. The amplicon was not amplified in replicate S5 2.1. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
Fig. 3. Merging sequencing replicates can improve coverage and genotypability. (A) Mean percentage genome coverage (± standard deviations). (B) Mean percentage genotypability (± standard deviations). Both genome coverage and genotypability were calculated for single replicates or after merging all possible combinations of two or six replicates, starting from the same total sequencing reads (****p < 0.0001, Mann-Whitney U test). (C) The coverage fraction contributed by each of the six replicates generated from sample S5. (D) Percentage of genome coverage after merging different numbers of replicates from sample S5, and from three other COVID-19-positive swab samples, namely samples 3270 (E), 4572 (F) and 4173 (G), whose sequencing results are reported in Table S12.
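The effect of merging replicates on genome coverage can be illustrated with a minimal Python sketch. It assumes that a per-position depth list is already available for each replicate and that a position counts as covered when the pooled depth reaches a threshold. The function names, the `min_depth` default and the toy data are hypothetical and are not taken from the article.

```python
from itertools import combinations


def coverage_fraction(depth, min_depth=1):
    """Fraction of genome positions whose depth is at least min_depth."""
    if not depth:
        return 0.0
    return sum(d >= min_depth for d in depth) / len(depth)


def merge_depths(replicates):
    """Sum per-position depth across replicates (equivalent to pooling their reads)."""
    return [sum(position) for position in zip(*replicates)]


def mean_coverage_by_k(replicates, k, min_depth=1):
    """Mean coverage fraction over all possible combinations of k replicates."""
    fractions = [
        coverage_fraction(merge_depths(combo), min_depth)
        for combo in combinations(replicates, k)
    ]
    return sum(fractions) / len(fractions)


# Toy data (assumption): three replicates of a 6-position genome,
# each with a different pattern of amplicon dropout.
toy = [
    [5, 0, 3, 4, 0, 2],
    [0, 6, 2, 0, 0, 1],
    [4, 3, 0, 5, 0, 0],
]

for k in (1, 2, 3):
    print(k, round(mean_coverage_by_k(toy, k), 3))
```

Unlike the analysis in the figure, this sketch does not subsample the merged data to the same total number of reads, so it only illustrates how dropouts in one replicate can be filled by another.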
Fig. 4. Comparison of SARS-CoV-2 sequencing and mapping results obtained using the KAPA and Illumina library preparation kits. (A) Distribution of the number of fragments generated using the KAPA Hyper Prep and Illumina DNA Prep kits for the same set of 30 replicates. (B) Visualization of mean sequencing coverage on a representative ARTIC amplicon using the KAPA and Illumina library kits. Given the overlap with adjacent amplicons, the 5′ and 3′ ends show increased coverage. (C) Mean coverage (± standard deviations) and (D) mean genotypability (± standard deviations) of sequencing libraries prepared from the 30 replicates using either the KAPA or Illumina kits. The 100PE results were obtained from the 150PE dataset by in silico trimming.
Fig. 5. SARS-CoV-2 sequencing in a cohort of clinical samples with a wide range of viral titers. (A, C) Percentage of genome coverage and (B, D) genotypability for each sample (N = 170) considering a single replicate (selected randomly) or after merging two sequencing replicates. The pie charts show the fraction of samples with a complete SARS-CoV-2 genome (>96.98%) in terms of (E) coverage or (F) genotypability for samples with Ct < 30 or Ct ≥ 30.
High-frequency variants identified in the COVID-19 relapse case study. The positions of high-frequency variants (>75%) are shown in the consensus sequence of a specimen collected during the first hospitalization. For each of these positions, the genotypes identified in the samples collected during the second hospitalization are also shown. Genotypes are reported for each sequencing replicate independently or after merging all replicates from the same sample (merged). Positions that could not be genotyped are indicated with a dash.
Samples: 9075 (1st hospitalization, collected 05/03/2020, Ct 27); 9076 (2nd hospitalization, collected 22/03/2020, Ct 34); 9078 (2nd hospitalization, collected 03/04/2020, Ct 35.7). Each column header gives the sample ID followed by the sequencing replicate.

| Genome position | Reference allele | 9075 1.1 | 9075 1.2 | 9075 2.1 | 9075 2.2 | 9075 merged | 9076 1.1 | 9076 1.2 | 9076 merged | 9078 1.1 | 9078 1.2 | 9078 2.1 | 9078 2.2 | 9078 merged |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 241 | C | T | T | T | T | T | T | – | T | – | – | – | T | T |
| 3037 | C | T | T | T | T | T | – | – | – | – | T | – | – | T |
| 13,620 | C | T | T | T | T | T | T | – | T | – | – | T | T | T |
| 14,408 | C | T | T | T | T | T | T | T | T | – | – | – | T | T |
| 23,403 | A | G | G | G | G | G | G | G | G | – | – | G | – | G |
| 28,881 | G | A | A | A | A | A | – | A | A | – | A | – | – | A |
| 28,882 | G | A | A | A | A | A | – | A | A | – | A | – | – | A |
| 28,883 | G | C | C | C | C | C | – | C | C | – | C | – | – | C |
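
The relationship between replicate-level genotypes and a combined call can be sketched in Python as follows. This is a simplification and not the workflow used for the table: there, the merged genotype comes from pooling the reads of all replicates before variant calling, which can also recover positions that no single replicate could genotype. The hypothetical `merge_calls` helper below only combines calls that were already made, treats a dash as "not genotyped", and flags disagreement rather than resolving it.

```python
MISSING = "–"  # en dash, as used in the table for positions that could not be genotyped


def merge_calls(calls, missing=MISSING):
    """Combine per-replicate genotype calls for one genome position.

    Returns the shared allele if all genotyped replicates agree, the missing
    marker if no replicate was genotyped, or a 'conflict:' label otherwise.
    """
    observed = {call for call in calls if call != missing}
    if not observed:
        return missing
    if len(observed) == 1:
        return observed.pop()
    return "conflict:" + "/".join(sorted(observed))


# Position 241 in sample 9078: only replicate 2.2 was genotyped.
print(merge_calls(["–", "–", "–", "T"]))

# Position 3037 in sample 9076: neither replicate was genotyped.
print(merge_calls(["–", "–"]))
```

Flagging conflicts instead of picking an allele keeps discordant positions visible, which matters when single-replicate calls may be false positives.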