James J Davis, S Wesley Long, Paul A Christensen, Randall J Olsen, Robert Olson, Maulik Shukla, Sishir Subedi, Rick Stevens, James M Musser.
Abstract
The ARTIC Network provides a common resource of PCR primer sequences and recommendations for amplifying SARS-CoV-2 genomes. The initial tiling strategy was developed with the reference genome Wuhan-01, and subsequent iterations have addressed areas of low amplification and sequence dropout. Recently, a new version (V4) was released, based on new variant genome sequences, in response to the realization that some V3 primers were located in regions with key mutations. Herein, we compare the performance of the ARTIC V3 and V4 primer sets on a matched set of 663 SARS-CoV-2 clinical samples sequenced with an Illumina NovaSeq 6000 instrument. We observe general improvements in sequencing depth and quality, and improved resolution of the SNP causing the D950N variation in the spike protein. Importantly, we also find nearly universal presence of the spike protein substitution G142D in Delta-lineage samples. Because the ARTIC V3 primers were released earlier and were in widespread use during the initial surge of the Delta variant, it is likely that the G142D amino acid substitution is substantially underrepresented among early Delta variant genomes deposited in public repositories. In addition to demonstrating the improved performance of the ARTIC V4 primer set, this study illustrates the importance of the primer scheme in downstream analyses.
IMPORTANCE ARTIC Network primers are commonly used by laboratories worldwide to amplify and sequence SARS-CoV-2 present in clinical samples. As new variants have evolved and spread, it was found that the V3 primer set poorly amplified regions containing several key mutations. In this report, we compare the results of sequencing a matched set of samples with the V3 and V4 primer sets. We find that adoption of the ARTIC V4 primer set is critical for accurate sequencing of the SARS-CoV-2 spike region. The absence of metadata describing the primer scheme used will negatively impact the downstream use of publicly available SARS-CoV-2 sequencing reads and assembled genomes.
Keywords: ARTIC; COVID-19; SARS-CoV-2; genome sequencing; primers
Year: 2021 PMID: 34878296 PMCID: PMC8653831 DOI: 10.1128/Spectrum.01803-21
Source DB: PubMed Journal: Microbiol Spectr ISSN: 2165-0497
FIG 1. Sequencing artifact analysis of spike protein amino acid position 142. (A) Median read depths at each nucleotide position for the assembly of the set of 663 V3 (blue) and V4 (red) samples. (B) Median read depth at each primer for the set of 663 V3 assemblies. (C) Median read depth at each primer for the set of 663 V4 assemblies. Orange squares are right primers and blue circles are left primers. (D) The fraction of B.1.617.2 sequences in GISAID with (blue) and without (orange) G142D through August 31, 2021. (E) Frequency of L452R (gray), D950N (blue), and G142D (orange) amino acid substitutions observed in SARS-CoV-2-positive samples from April through August of 2021. L452R, D950N, and G142D are hallmark amino acid substitutions of the Delta variant.
Distribution of Pangolin lineages in the set of 663 samples that were sequenced using either the ARTIC version 3 or version 4 primers
| Lineage | ARTIC version 3 | ARTIC version 4 |
|---|---|---|
| AY.10 | 2 | 1 |
| AY.12 | 4 | 0 |
| AY.13 | 2 | 2 |
| AY.14 | 1 | 1 |
| AY.15 | 5 | 0 |
| AY.2 | 4 | 7 |
| AY.20 | 3 | 3 |
| AY.21 | 1 | 0 |
| AY.24 | 5 | 1 |
| AY.25 | 124 | 126 |
| AY.3 | 40 | 36 |
| AY.3.1 | 6 | 7 |
| AY.4 | 4 | 0 |
| B.1 | 0 | 1 |
| B.1.1 | 0 | 1 |
| B.1.1.7 | 40 | 40 |
| B.1.575 | 1 | 0 |
| B.1.617.2 | 375 | 398 |
| B.1.621 | 4 | 4 |
| B.1.621.1 | 1 | 1 |
| B.1.625 | 0 | 1 |
| B.1.628 | 11 | 11 |
| B.1.637 | 1 | 1 |
| C.37 | 4 | 3 |
| P.1 | 9 | 9 |
| P.1.10 | 1 | 1 |
| None | 15 | 8 |
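
The table can be checked for internal consistency with a few lines of Python. The sketch below is illustrative only and is not part of the study's analysis: the counts are copied from the table above, and the names `counts`, `column_totals`, and `changed_lineages` are hypothetical. It confirms that each column accounts for all 663 matched samples and lists the lineages whose assignment counts differ between the two primer sets.

```python
# Per-lineage sample counts copied from the table above: (ARTIC V3, ARTIC V4).
# "None" is the table's label for samples without a Pangolin lineage assignment.
counts = {
    "AY.10": (2, 1), "AY.12": (4, 0), "AY.13": (2, 2), "AY.14": (1, 1),
    "AY.15": (5, 0), "AY.2": (4, 7), "AY.20": (3, 3), "AY.21": (1, 0),
    "AY.24": (5, 1), "AY.25": (124, 126), "AY.3": (40, 36), "AY.3.1": (6, 7),
    "AY.4": (4, 0), "B.1": (0, 1), "B.1.1": (0, 1), "B.1.1.7": (40, 40),
    "B.1.575": (1, 0), "B.1.617.2": (375, 398), "B.1.621": (4, 4),
    "B.1.621.1": (1, 1), "B.1.625": (0, 1), "B.1.628": (11, 11),
    "B.1.637": (1, 1), "C.37": (4, 3), "P.1": (9, 9), "P.1.10": (1, 1),
    "None": (15, 8),
}


def column_totals(table):
    """Return the total number of samples in the V3 and V4 columns."""
    v3_total = sum(v3 for v3, _ in table.values())
    v4_total = sum(v4 for _, v4 in table.values())
    return v3_total, v4_total


def changed_lineages(table):
    """Map each lineage whose count differs to its change (V4 minus V3)."""
    return {name: v4 - v3 for name, (v3, v4) in table.items() if v3 != v4}


if __name__ == "__main__":
    print(column_totals(counts))  # (663, 663)
    # Print the largest shifts first.
    shifts = changed_lineages(counts)
    for name, delta in sorted(shifts.items(), key=lambda kv: -abs(kv[1])):
        print(f"{name}: {delta:+d}")
```

Both columns sum to 663, matching the sample count given in the abstract. The largest shifts are in B.1.617.2 (375 to 398) and in unassigned samples (15 to 8).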