Maria T Arévalo, Mark A Karavis, Sarah E Katoski, Jacquelyn V Harris, Jessica M Hill, Samir V Deshpande, Pierce A Roth, Alvin T Liem, R Cory Bernhards.
Abstract
A new human coronavirus, severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), emerged at the end of 2019 in Wuhan, China, and caused disease of varying severity, with symptoms including fever, shortness of breath, and coughing. This disease, now known as coronavirus disease 2019 (COVID-19), quickly spread throughout the world and was declared a pandemic by the World Health Organization in March 2020. As the disease continues to spread, rapid characterization has proven crucial to inform the design and execution of control measures, such as decontamination methods, diagnostic tests, antiviral drugs, and prophylactic vaccines for long-term control. Our work at the United States Army's Combat Capabilities Development Command Chemical Biological Center (DEVCOM CBC) is focused on engineering workflows to efficiently identify, characterize, and evaluate the threat level of any potential biological threat in the field and in more remote, lower-resource settings, such as forward operating bases. While we have successfully established untargeted sequencing approaches for rapid identification of pathogens, our current work entails a more in-depth sequencing analysis for use in evolutionary monitoring. We are developing and validating a SARS-CoV-2 nanopore sequencing assay based on the ARTIC protocol. The standard ARTIC, Illumina, and nanopore sequencing protocols for SARS-CoV-2 are elaborate and time-consuming. The new protocol integrates Oxford Nanopore Technologies' Rapid Sequencing Kit following targeted RT-PCR of RNA extracted from human clinical specimens. This approach decreases sample manipulations and preparation times. Our current bioinformatics pipeline utilizes Centrifuge as the classifier for quick identification of SARS-CoV-2 and RAMPART software for verification and mapping of reads to the full SARS-CoV-2 genome.
ARTIC rapid sequencing of patient samples previously confirmed by RT-PCR showed that the modified protocol produces high-quality data, with up to 98.9% genome coverage at >1,000x depth for samples with presumably higher viral loads. Furthermore, whole genome assembly and subsequent mutational analysis of six of these sequences identified existing and unique mutations in this cluster, including three in the Spike protein: V308L, P521R, and D614G. This work suggests that an accessible, portable, and relatively fast sample-to-sequence process to characterize viral outbreaks is feasible and effective.
Keywords: COVID-19; SARS-CoV-2; nanopore sequencing; whole genome assembly; whole genome sequencing
Year: 2022 PMID: 35733956 PMCID: PMC9207459 DOI: 10.3389/fmicb.2022.910955
Source DB: PubMed Journal: Front Microbiol ISSN: 1664-302X Impact factor: 6.064
SARS-CoV-2 GISAID clade classifications, corresponding pango lineage, and variants of concern.
| GISAID clade | Pango lineage | Marker variants | Variants of concern |
| S | A | C8782T, T28144C includes NS8-L84S | |
| L | B | C241, C3037, A23403, C8782, G11083, G26144, T28144 (early clade markers in WIV04- GISAID reference sequence) | |
| V | B.2 | G11083T, G26144T NSP6-L37F + NS3-G251V | |
| G | B.1 | C241T, C3037T, A23403G includes S-D614G | |
| GH | B.1.* | C241T, C3037T, A23403G, G25563T includes S-D614G + NS3-Q57H | Beta (B.1.351) |
| GR | B.1.1.1 | C241T, C3037T, A23403G, G28882A includes S-D614G + N-G204R | Gamma (P.1 or B.1.1.28.1) |
| GV | B.1.177 | C241T, C3037T, A23403G, C22227T includes S-D614G + S-A222V | |
| GRY | B.1.1.7 | C241T, C3037T, 21765-21770del, 21991-21993del, A23063T, A23403G, G28882A includes S-H69del, S-V70del, S-Y144del, S-N501Y + S-D614G + N-G204R | Alpha |
| GK | B.1.617.2 | C241T, C3037T, A23403G, C22995A S-D614G + S-T478K | Delta |
| GRA | B.1.1.529 | A67V, del69-70, T95I, del142-144, Y145D, del211, L212I, ins214EPE, G339D, S371L, S373P, S375F, K417N, N440K, G446S, S477N, T478K, E484A, Q493R, G496S, Q498R, N501Y, Y505H, T547K, D614G, H655Y, N679K, P681H, N764K, D796Y, N856K, Q954H, N969K, L981F | Omicron (includes BA.1-BA.5 or B.1.1.529.1-B.1.1.529.5, XE-recombinant BA.1/BA.2) |
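The marker variants in the table above can be read as a simple lookup from observed nucleotide changes to a clade label. The following is a minimal sketch of that reading, not the procedure used by GISAID, Pangolin, or Nextclade. The function name `assign_clade` and the `"unassigned"` label are hypothetical. Deletion markers for GRY are left out, and GRA is omitted because the table lists its markers as Spike amino-acid changes rather than nucleotide changes.

```python
# Nucleotide substitution markers copied from the clade table above.
# Assumption: only substitutions are checked; the GRY deletions are ignored,
# and GRA (Omicron) is omitted because its markers are amino-acid changes.
G_CORE = {"C241T", "C3037T", "A23403G"}

# Ordered from most specific to least specific so that, for example,
# GRY is tested before GR and GR before G.
CLADE_MARKERS = [
    ("GRY", G_CORE | {"G28882A", "A23063T"}),
    ("GK", G_CORE | {"C22995A"}),
    ("GV", G_CORE | {"C22227T"}),
    ("GR", G_CORE | {"G28882A"}),
    ("GH", G_CORE | {"G25563T"}),
    ("G", G_CORE),
    ("V", {"G11083T", "G26144T"}),
    ("S", {"C8782T", "T28144C"}),
]

ALL_MARKERS = set().union(*(markers for _, markers in CLADE_MARKERS))


def assign_clade(observed):
    """Return the first clade whose full marker set is present in `observed`."""
    observed = set(observed)
    for clade, markers in CLADE_MARKERS:
        if markers <= observed:
            return clade
    # No marker substitutions at all: reference-like sequence (clade L).
    if not (observed & ALL_MARKERS):
        return "L"
    # Hypothetical label for partial marker sets that match no row.
    return "unassigned"


if __name__ == "__main__":
    print(assign_clade({"C241T", "C3037T", "A23403G", "G25563T"}))
```

Ordering the list from most to least specific matters here because every G-derived clade contains the three core G markers, so testing `G` first would mask the more specific labels.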
FIGURE 1. ARTIC protocol and modified versions. Schematics for the COVID-19 sequencing workflows are shown, starting with (A) the standard ARTIC protocol with v3 primers, followed by (B) the ARTIC protocol modified for use with the Rapid Barcoding Sequencing kit for multiplexed samples, and (C) the ARTIC protocol modified for use with the Rapid Sequencing kit for rapid sequencing of individual samples.
FIGURE 2. CoV-2 analysis and assembly pipeline. A schematic showing analysis of sequencing data starting with sequencing using the MinION and MinIT and creation of basecalled data (fastq) (Wang et al., 2020). The fastq files are demultiplexed if appropriate, and Centrifuge is used to map reads and deliver a report of the ranked organisms and a visual report (Krona). The basecalled data is also analyzed using (Wu et al., 2020) RAMPART software, and the sequences with high coverage are assembled using Medaka. Fasta files are generated for subsequent analyses.
Pilot study evaluating ARTIC-based assays for sequencing SARS-CoV-2 from clinical specimens.
| | | | Centrifuge | | | | RAMPART | | | | | |
| Sample ID | RT-PCR Diagnosis | Seq Assay | Total Reads | # CoV-2 Reads | # Unique CoV-2 Reads | % CoV-2 Reads | # CoV-2 Reads | Median Length | % CoV-2 Genome | Depth > 10x | Depth > 100x | Depth > 1,000x |
| BEI RNA | n/a | ARTIC v3 | 179,391 | 170,400 | 168,872 | 95.0 | 119,188 | 500 | 97.3 | 99.8 | 99.8 | 83.1 |
| BEI RNA | n/a | ARTIC + Rapid | 2,256,855 | 1,896,808 | 1,896,808 | 84.0 | 2,148,026 | 290 | 95.2 | 99.9 | 99.8 | 99.8 |
| P02 | + | ARTIC v3 | 547,474 | 483,403 | 470,900 | 88.3 | 327,577 | 490 | 77.74 | 98 | 92.9 | 72 |
| P02 | + | ARTIC + Rapid Barcoding | 24,000 | 15,507 | 15,507 | 64.6 | ND | ND | ND | ND | ND | ND |
| P02 | + | ARTIC + Rapid | 443,368 | 370,676 | 370,676 | 83.6 | 418,818 | 270 | 94.46 | 99.1 | 95.8 | 80.6 |
| P03 | + | ARTIC v3 | 344,910 | 73,885 | 73,867 | 21.4 | 29,429 | 300 | 15.24 | 56.5 | 20.6 | 15.8 |
| P03 | + | ARTIC + Rapid Barcoding | 11,000 | 1,011 | 1,011 | 9.2 | ND | ND | ND | ND | ND | ND |
| P03 | + | ARTIC + Rapid | 645,017 | 490,929 | 490,929 | 76.1 | 545,826 | 270 | 84.62 | 48.2 | 41.4 | 30.1 |
| P04 | + | ARTIC v3 | 543,005 | 119,657 | 118,894 | 22.0 | 39,939 | 310 | 12.2 | 73.1 | 22.6 | 15.8 |
| P04 | + | ARTIC + Rapid Barcoding | 13,000 | 1,597 | 1,597 | 12.3 | ND | ND | ND | ND | ND | ND |
| P04 | + | ARTIC + Rapid | 861,734 | 121,254 | 120,491 | 14.1 | 748,263 | 260 | 86.21 | 64.9 | 53.8 | 35.9 |
| N01 | - | ARTIC v3 | 568,266 | 59,246 | 59,246 | 10.4 | 10,482 | 300 | 3.02 | 82.6 | 22.5 | 4.1 |
| N01 | - | ARTIC + Rapid Barcoding | 32,000 | – | – | 0.0 | ND | ND | ND | ND | ND | ND |
| N01 | - | ARTIC + Rapid | 578,916 | 446,225 | 443,662 | 77.1 | 495,874 | 280 | 85.66 | 62 | 58.8 | 43.8 |
| Neg. Ctl 1/Water | n/a | ARTIC + Rapid | 25,173 | 7,487 | 7,487 | 29.7 | 8,336 | 190 | 33.11 | 1.5 | 1.4 | 1.3 |
| Neg. Ctl 2/Water | n/a | ARTIC + Rapid | 12,131 | – | – | 0.0 | 0 | - | 0 | 0 | 0 | 0 |
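The "% CoV-2 Reads" column appears to be the Centrifuge CoV-2 read count divided by the total read count. This is an assumption, but it is consistent with the rows checked below. A short worked example, using a hypothetical helper name `percent_cov2_reads`:

```python
def percent_cov2_reads(cov2_reads, total_reads):
    """Percentage of all reads classified as CoV-2, rounded to one decimal.

    Assumption: this reproduces the '% CoV-2 Reads' column as
    (# CoV-2 Reads / Total Reads) * 100; the source does not state the formula.
    """
    if total_reads == 0:
        return 0.0
    return round(100.0 * cov2_reads / total_reads, 1)


if __name__ == "__main__":
    # Values taken from the pilot study table above.
    print(percent_cov2_reads(170_400, 179_391))      # BEI RNA, ARTIC v3
    print(percent_cov2_reads(1_896_808, 2_256_855))  # BEI RNA, ARTIC + Rapid
    print(percent_cov2_reads(483_403, 547_474))      # P02, ARTIC v3
```

For these three rows the helper returns 95.0, 84.0, and 88.3, matching the tabulated percentages.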
Assemblies with GenBank accession numbers.
| ID | Seq assay | Assembly | Accession # |
| BEI RNA | ARTIC v3 | C0V2_lsk109/ARTIC/medaka MN908947.3 | |
| BEI RNA | ARTIC + Rapid | C0V2_rad004/ARTIC/medaka MN908947.3 | |
| P02 | ARTIC v3 | 1P2_lsk109/ARTIC/medaka MN908947.3 | |
| P02 | ARTIC + Rapid | 1P2_rad004/ARTIC/medaka MN908947.3 | |
| P11 | ARTIC + Rapid | P11_rad004/ARTIC/medaka MN908947.3 | |
| P12 fixed | ARTIC + Rapid | P12_rad004 (organism = Severe acute respiratory syndrome coronavirus 2) (isolate = P12) reference assembly to MN908947.3 | |
| P14 | ARTIC + Rapid | P14_rad004/ARTIC/medaka MN908947.3 | |
| P15 fixed | ARTIC + Rapid | P15_rad004 (organism = Severe acute respiratory syndrome coronavirus 2) (isolate = P15) reference assembly to MN908947.3 | |
| P18 | ARTIC + Rapid | P18_rad004/ARTIC/medaka MN908947.3 | |
Sequencing of previously diagnosed clinical specimens by ARTIC with rapid sequencing approach.
| | | | Centrifuge | | | | RAMPART | | | | | |
| Sample ID | RT-PCR Diagnosis | Seq Assay | Total Reads | # CoV-2 Reads | # Unique CoV-2 Reads | % CoV-2 Reads | #CoV-2 Reads | Median Length | % CoV-2 Genome | Depth > 10x | Depth > 100x | Depth > 1,000x |
| P01 | + | ARTIC + Rapid | 462,000 | 71,138 | 71,138 | 15.4 | ND | ND | ND | ND | ND | ND |
| P02 | + | ARTIC + Rapid | 443,368 | 370,676 | 370,676 | 83.6 | 418,818 | 270 | 94.46 | 99.1 | 95.8 | 80.6 |
| P03 | + | ARTIC + Rapid | 645,017 | 490,929 | 490,929 | 76.1 | 545,826 | 270 | 84.62 | 48.2 | 41.4 | 30.1 |
| P04 | + | ARTIC + Rapid | 861,734 | 121,254 | 120,491 | 14.1 | 748,263 | 260 | 86.21 | 64.9 | 53.8 | 35.9 |
| P05 | + | ARTIC + Rapid | 44,838 | 18,779 | 18,779 | 41.9 | ND | ND | ND | ND | ND | ND |
| P06 | + | ARTIC + Rapid | 5,136,573 | 4,260,587 | 3,808,126 | 82.9 | 4,361,013 | 280 | 84.9 | 99.8 | 99.8 | 98.9 |
| P07 | + | ARTIC + Rapid | 1,523,922 | 1,189,946 | 1,189,171 | 78.1 | 1,331,034 | 290 | 87.34 | 62.2 | 61.3 | 56.2 |
| P08 | + | ARTIC + Rapid | 701,282 | 66,782 | 66,782 | 9.5 | ND | ND | ND | ND | ND | ND |
| P09 | + | ARTIC + Rapid | 2,245,950 | 59,114 | 59,114 | 2.6 | 66,862 | 340 | 2.98 | 57.6 | 42.2 | 18 |
| P10 | + | ARTIC + Rapid | 16,691 | 1,511 | 1,511 | 9.1 | ND | ND | ND | ND | ND | ND |
| P11 | + | ARTIC + Rapid | 3,417,719 | 2,816,093 | 2,724,325 | 82.4 | 3,087,854 | 290 | 90.35 | 99.8 | 98.3 | 91.5 |
| P12 | + | ARTIC + Rapid | 861,623 | 301,852 | 293,972 | 35.0 | 339,009 | 280 | 60.65 | 89 | 80.2 | 54.5 |
| P13 | + | ARTIC + Rapid | 360,276 | 23,244 | 23,244 | 6.5 | 26,335 | 280 | 7.31 | 7.1 | 6.8 | 4.3 |
| P14 | + | ARTIC + Rapid | 3,437,488 | 875,159 | 853,443 | 25.5 | 3,136,063 | 290 | 91.23 | 99.9 | 99.8 | 98.2 |
| P15 | + | ARTIC + Rapid | 2,100,544 | 1,822,242 | 1,754,580 | 86.8 | 1,982,939 | 290 | 94.4 | 97.4 | 96.3 | 82.9 |
| P16 | + | ARTIC + Rapid | 237,504 | 15,807 | 15,807 | 6.7 | 18,607 | 280 | 7.83 | 5.7 | 4.4 | 2.7 |
| P17 | + | ARTIC + Rapid | 526,781 | 178,745 | 174,792 | 33.9 | 198,615 | 300 | 37.7 | 25.6 | 24 | 26 |
| P18 | + | ARTIC + Rapid | 5,724,659 | 4,538,653 | 4,460,075 | 79.3 | 5,065,568 | 280 | 88.49 | 99.9 | 99.8 | 99.8 |
| P19 | + | ARTIC + Rapid | 162,340 | 35,574 | 35,574 | 21.9 | 39,440 | 290 | 24.29 | 38.2 | 34.5 | 7.7 |
| P20 | + | ARTIC + Rapid | 286,145 | 41,196 | 41,196 | 14.4 | 46,575 | 230 | 16.28 | 44.1 | 21.4 | 4.6 |
| N01 | - | ARTIC + Rapid | 578,916 | 446,225 | 443,662 | 77.1 | 495,874 | 280 | 85.66 | 62 | 58.8 | 43.8 |
| N02 | - | ARTIC + Rapid | 121,813 | 15,313 | 15,313 | 12.6 | ND | ND | ND | ND | ND | ND |
| N03 | - | ARTIC + Rapid | 67,918 | – | – | 0.0 | – | - | 0 | 0 | 0 | 0 |
| N04 | - | ARTIC + Rapid | 141,075 | 1,273 | 1,273 | 0.0 | 1,407 | 290 | 1 | 2 | 1.9 | 0 |
| N05 | - | ARTIC + Rapid | 649,918 | – | – | 0.0 | 431 | 340 | 0.07 | 1.3 | 1.2 | 0 |
| N06 | - | ARTIC + Rapid | 168,575 | – | – | 0.0 | ND | ND | ND | ND | ND | ND |
| N07 | - | ARTIC + Rapid | 111,549 | – | – | 0.0 | ND | ND | ND | ND | ND | ND |
| N08 | - | ARTIC + Rapid | 30,365 | – | – | 0.0 | ND | ND | ND | ND | ND | ND |
| N09 | - | ARTIC + Rapid | 91,432 | 14,671 | 14,656 | 16.0 | 16,219 | 280 | 17.74 | 2.6 | 2.5 | 2.1 |
| N10 | - | ARTIC + Rapid | 574,708 | 183,764 | 183,764 | 32.0 | 208,541 | 280 | 36.29 | 33.9 | 31.7 | 25.5 |
| N11 | - | ARTIC + Rapid | 395,322 | 166,603 | 166,603 | 42.1 | 189,251 | 290 | 47.87 | 64.4 | 56.5 | 33.7 |
| N12 | - | ARTIC + Rapid | 512,171 | 386,424 | 386,424 | 75.4 | 441,824 | 270 | 82.26 | 50.3 | 49.7 | 43.3 |
FIGURE 3. Plot of CoV-2 reads and coverage as analyzed using Centrifuge and RAMPART. The number of SARS-CoV-2-specific reads and the genome coverage are presented for sequenced COVID-19 RT-PCR-positive (POS) and RT-PCR-negative (NEG) samples. (A) Plot of CoV-2 reads identified using Centrifuge in all POS versus NEG samples sequenced using the ARTIC with Rapid sequencing protocol. (B) Comparison of POS versus NEG samples that were analyzed using both Centrifuge and RAMPART pipelines. The number of CoV-2-specific reads is shown. (C) The depth of coverage at >10X, >100X, and >1,000X as determined by RAMPART analyses is shown for POS versus NEG samples.
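The depth columns (>10x, >100x, >1,000x) can be understood as the percentage of reference positions whose read depth exceeds each threshold. The sketch below illustrates that calculation on a toy list of per-position depths. It is not RAMPART's internal implementation, and the function name `breadth_at_depth` and the input values are hypothetical.

```python
def breadth_at_depth(depths, thresholds=(10, 100, 1000)):
    """Percentage of positions with depth strictly greater than each threshold.

    `depths` is a sequence of per-position read depths over the reference
    genome (hypothetical input, e.g. exported from any read-mapping tool).
    """
    n_positions = len(depths)
    if n_positions == 0:
        return {t: 0.0 for t in thresholds}
    return {
        t: 100.0 * sum(1 for d in depths if d > t) / n_positions
        for t in thresholds
    }


if __name__ == "__main__":
    # Toy example with ten positions, not real sequencing data.
    toy_depths = [0, 5, 50, 500, 5000, 5000, 20, 200, 2000, 11]
    print(breadth_at_depth(toy_depths))
```

Because the thresholds are nested, the percentage at >1,000x can never exceed the percentage at >100x for the same sample, which is a quick sanity check when reading the tables.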
FIGURE 4. CoV-2 reads mapped over reference genome. Representative plots showing the number of CoV-2 reads mapping over specific regions of a CoV-2 reference genome after sequencing using the ARTIC with Rapid sequencing protocol and RAMPART analyses. (A) P09 is shown as a positive sample just below the median coverage for the positive sample cohort; (B) P14 is representative of high coverage samples; (C) N03 is a negative sample with zero CoV-2 reads identified; (D) N04 is a negative sample with possible non-specific amplification or contamination; (E,F) are two RT-PCR negative samples with 50–60% CoV-2 genome coverage following sequencing.
Mutation analysis of assembled sequences using CoVsurver and comparison to Pangolin and Nextclade classifications.
| ID | Seq Assay | Query | %N | Length (nt) | Length (aa) | #Muts | %Muts | Comment | Unique | Existing | GISAID | Pango lineage | Nextstrain Clade |
| BEI RNA | ARTIC v3 | C0V2_lsk109/ARTIC/medaka MN908947.3 | 0.00% | 29,903 | 9,710 | 1 | 0.01% | | | NS8_L84S | S | A | 19B |
| BEI RNA | ARTIC + Rapid | C0V2_rad004/ARTIC/medaka MN908947.3 | 0.00% | 29,903 | 9,710 | 1 | 0.01% | | | NS8_L84S | S | A | 19B |
| P02 | ARTIC v3 | 1P2_lsk109/ARTIC/medaka MN908947.3 | 1.18% | 29,903 | 9,710 | 4 | 0.04% | Stretches of NNNs (1.18% of overall sequence). | | NSP12_P323L, Spike_D614G, Spike_D138Y, Spike_P521R | G | B.1.473 | 20A |
| P02 | ARTIC + Rapid | 1P2_rad004/ARTIC/medaka MN908947.3 | 2.91% | 29,903 | 9,710 | 6 | 0.06% | Stretches of NNNs (2.91% of overall sequence). | | NSP12_P323L, Spike_D614G, Spike_D138Y, Spike_P521R, NS3_Q57H, NS3_G76S | GH | B.1.473 | 20A |
| P11 | ARTIC + Rapid | P11_rad004/ARTIC/medaka MN908947.3 | 0.00% | 29,903 | 9,710 | 6 | 0.06% | | | NSP12_P323L, NSP12_H613Y, NSP14_A274S, Spike_D614G, N_G204R, N_R203K | GR | B.1.1 | 20B |
| P12 | ARTIC + Rapid | P12_rad004/ARTIC/medaka MN908947.3 | 15.51% | 29,904 | 9,534 | 6 | 0.06% | Long stretches of NNNs (15.51% of overall sequence). Insertion of 1 nucleotide(s) found at refpos 26653 (FRAMESHIFT). M without BLAST coverage. NSP3 has 103 Δs, NSP4 has 22 Δs; NSP16 has 80 Δs | | NSP3_V477F, NSP12_P323L, Spike_D614G, Spike_P521R, NS3_Q57H, M_R44S | GH | B.1 | 20A |
| P14 | ARTIC + Rapid | P14_rad004/ARTIC/medaka MN908947.3 | 0.00% | 29,903 | 9,710 | 5 | 0.05% | | | NSP2_T85I, NSP12_P323L, Spike_D614G, NS3_Q57H, N_P364L | GH | B.1 | 20C |
| P15 | ARTIC + Rapid | P15_rad004/ARTIC/medaka MN908947.3 | 3.30% | 29,902 | 9,710 | 10 | 0.10% | Stretches of NNNs (3.30% of overall sequence). Gap of 1 nucleotide(s) found at refpos 3013 (FRAMESHIFT). NSP3 has 248 Δs | NSP3_L98I | NSP3_A99S, NSP3_T1456I, NSP3_V477F, NSP12_P323L, Spike_D614G, Spike_V308L, Spike_P521R, NS3_Q57H, N_R209I | GH | B.1 | 20A |
| P18 | ARTIC + Rapid | P18_rad004/ARTIC/medaka MN908947.3 | 0.00% | 29,903 | 9,710 | 6 | 0.06% | | NSP15_D212V | NSP2_T85I, NSP12_P323L, NSP13_V209I, Spike_D614G, NS3_Q57H | GH | B.1 | 20C |
| P12 fixed | ARTIC + Rapid | P12_rad004 (organism = Severe acute respiratory syndrome coronavirus 2) (isolate = P12) reference assembly to MN908947.3 | 15.51% | 29,903 | 9,534 | 6 | 0.06% | Long stretches of NNNs (15.51% of overall sequence). NSP3 has 103 Δs, NSP4 has 22 Δs; NSP16 has 80 Δs | | NSP3_V477F, NSP12_P323L, Spike_D614G, Spike_P521R, NS3_Q57H, M_R44S | GH | B.1 | 20A |
| P15 fixed | ARTIC + Rapid | P15_rad004 (organism = Severe acute respiratory syndrome coronavirus 2) (isolate = P15) reference assembly to MN908947.3 | 3.30% | 29,903 | 9,710 | 10 | 0.10% | Stretches of NNNs (3.30% of overall sequence). NSP3 has 248 Δs | | NSP3_A99S, NSP3_T1456I, NSP3_V477F, NSP3_L98F, NSP12_P323L, Spike_D614G, Spike_V308L, Spike_P521R, NS3_Q57H, N_R209I | GH | B.1 | 20A |
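The %N and %Muts columns of the mutation table can be reproduced with simple ratios. The sketch below assumes that %Muts is the number of amino-acid mutations divided by the second length column (9,710 for full-length assemblies), which is consistent with the tabulated values but is not stated in the source. It also assumes that %N is the share of `N` characters in the assembled nucleotide sequence. The function names are hypothetical.

```python
def percent_muts(n_muts, length_aa=9710):
    """Mutations as a percentage of the translated length, to two decimals.

    Assumption: the second 'Length' column is the amino-acid length used as
    the denominator; 9,710 is the value reported for full-length assemblies.
    """
    return round(100.0 * n_muts / length_aa, 2)


def percent_n(sequence):
    """Percentage of ambiguous 'N' bases in an assembled nucleotide sequence."""
    if not sequence:
        return 0.0
    return round(100.0 * sequence.upper().count("N") / len(sequence), 2)


if __name__ == "__main__":
    print(percent_muts(1))         # BEI RNA rows
    print(percent_muts(6))         # e.g. P11, P18
    print(percent_muts(10))        # P15
    print(percent_muts(6, 9534))   # P12, shorter translated length
    print(percent_n("ACGTNNNNAC"))  # toy sequence, not real data
```

With these inputs `percent_muts` returns 0.01, 0.06, 0.1, and 0.06, in line with the 0.01%, 0.06%, 0.10%, and 0.06% entries in the table.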