Sushama Telwatte 1, Holly Anne Martin 1, Ryan Marczak 2, Parinaz Fozouni 3, Albert Vallejo-Gracia 3, G Renuka Kumar 3, Victoria Murray 4, Sulggi Lee 4, Melanie Ott 5, Joseph K Wong 1, Steven A Yukl 6.
1. Department of Medicine, University of California, San Francisco (UCSF), San Francisco, CA, United States; Department of Medicine, San Francisco VA Health Care System, San Francisco, CA, United States.
2. University of California, Santa Barbara, CA, United States.
3. Gladstone Institute of Virology, Gladstone Institutes, San Francisco, CA, United States.
4. Department of Medicine, University of California, San Francisco (UCSF), San Francisco, CA, United States.
5. Department of Medicine, University of California, San Francisco (UCSF), San Francisco, CA, United States; Gladstone Institute of Virology, Gladstone Institutes, San Francisco, CA, United States.
6. Department of Medicine, University of California, San Francisco (UCSF), San Francisco, CA, United States; Department of Medicine, San Francisco VA Health Care System, San Francisco, CA, United States. Electronic address: Steven.yukl@ucsf.edu.
Abstract
The replication of SARS-CoV-2 and other coronaviruses depends on transcription of negative-sense RNA intermediates that serve as the templates for the synthesis of positive-sense genomic RNA (gRNA) and multiple different subgenomic mRNAs (sgRNAs) encompassing fragments arising from discontinuous transcription. Recent studies have aimed to characterize the expression of subgenomic SARS-CoV-2 transcripts in order to investigate their clinical significance. Here, we describe a novel panel of reverse transcription droplet digital PCR (RT-ddPCR) assays designed to specifically quantify multiple different subgenomic SARS-CoV-2 transcripts and distinguish them from transcripts that do not arise from discontinuous transcription at each locus. These assays can be applied to samples from SARS-CoV-2 infected patients to better understand the regulation of SARS-CoV-2 transcription and how different sgRNAs may contribute to viral pathogenesis and clinical disease severity. Published by Elsevier Inc.
Keywords:
COVID-19; Coronavirus; Digital PCR; Droplet digital PCR; Quantitative assays; SARS-CoV-2; Subgenomic RNA; Viral transcription/replication
Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) [1], [2], the etiologic agent of COVID-19, is an enveloped virus with a positive-sense, single-stranded RNA genome of about 30,000 nucleotides. Like other viruses of the order Nidovirales, SARS-CoV-2 replicates through the transcription of negative-sense RNA intermediates that serve as the templates for the transcription of positive-sense genomic RNA (gRNA) and multiple different subgenomic RNAs (sgRNAs). These sgRNAs, which are analogous to spliced RNAs, are generated from discontinuous transcription during the synthesis of the negative strand RNA [3]. sgRNAs arise from a template switch at transcription-regulating sequences (TRS) located at the end of the ‘leader’ sequence in the 5′ untranslated region [UTR] and ‘body’ TRS sequences located upstream of various genes in the distal third of the genome (open reading frames [ORF] 2 to 9) [4]. The resulting sgRNAs contain the 5′ UTR ‘leader’ sequence followed by the ‘body’ sequence derived from one of the 3′ genes [5] (see Fig. 1). Although further studies are needed to elucidate the contribution of different sgRNAs to SARS-CoV-2 replication and disease, subgenomic transcription may allow for variation in expression of viral structural proteins and proteins involved in pathogenesis. The transcription of gRNA and sgRNA occurs at double-membraned vesicles that contain cellular and viral materials in the cytoplasm of infected cells [6]. While the gRNAs are packaged into virions, it seems unlikely that the sgRNAs are packaged [7], [8]. However, sgRNAs are detectable during early symptomatic infection and sometimes after symptoms have subsided [8], [9]. It is unclear which mechanisms allow for prolonged persistence of sgRNAs, but they might be protected from degradation by encapsulation in double-membrane vesicles [6], [9].
Fig. 1
Schematic representation of SARS-CoV-2 genome organization, virion structure, assay design and sgRNA targets. SARS-CoV-2 employs discontinuous transcription to generate subgenomic RNAs. (A) The genome organization of SARS-CoV-2. The genome features two large genes, ORF1a (yellow) and ORF1b (blue), which encode 16 non-structural proteins (NSP1–NSP16). The structural genes encode the structural proteins, spike (S; green), envelope (E; blue), membrane (M; purple), and nucleocapsid (N; gold). Assay locations of each assay designed for this study are indicated. The SARS-CoV-2 virion structure is shown in the lower panel. (B) Assay design. Assays for a given subgenomic RNA and the corresponding “genomic” RNA region share the same probe and reverse primer (located in a body gene) but differ in the forward primers. The sgRNA-targeting forward primer is located in the 5’ UTR (upstream of the leader-body TRS junction), whereas the cognate gRNA-targeting forward primer is located upstream of the body TRS and coding region. (C) Schematic representation of negative strand RNA synthesis from the full-length positive strand gRNA. A template-switch occurs at the body TRS [TRS-B] to the 5’ leader TRS [TRS-L] to give rise to the negative-strand sgRNA. (D) The panel of validated canonical subgenomic RNA targets is shown, including S, 3a, E, M, 7a, 8 and N. Figure adapted from Telwatte et al., 2021 and Kim et al., 2020. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
Advantages of droplet digital PCR for assessing SARS-CoV-2 transcription
By partitioning PCR reactions into thousands of individual droplets prior to amplification and acquiring data at the reaction end point, ddPCR permits absolute quantification of targets independent of standard curves, which leads to higher precision and reproducibility compared to quantitative PCR [10]. Recent studies have attempted to measure SARS-CoV-2 sgRNAs using various strategies in cell lines [11] or cells from infected individuals [9], [12]. These methods consistently indicate a low abundance of subgenomic transcripts [9], [11], [12]. The combination of low abundance targets (as is the case with sgRNAs), limited yields of cells or fluid from sampling procedures (e.g. nasopharyngeal swabs), and the potential presence of inhibitors such as chemical or protein contaminants [10] requires amplification methods that are sensitive, reliable, accurate, and precise. In this regard, ddPCR offers absolute quantification, improved sensitivity [10], [13], improved tolerance to PCR inhibitors [14], [15], increased precision (particularly for low concentration samples [16], [17], [18]), inter-run reproducibility, and enhanced discrimination between small differences in concentration [19]. Therefore, the use of RT-ddPCR offers advantages to study SARS-CoV-2 sgRNAs compared to other methods that have been recently described, such as RT-PCR [20] and Thermofisher’s Ampliseq panel. The latter contains 237 primer pairs targeting SARS-CoV-2 and 5 primer pairs for cellular targets; however, only two of the forward primers are located within the leader sequence and are likely to amplify subgenomic RNAs [9]. Furthermore, the number of cycles must be carefully calibrated because biases due to differential amplification efficiency of primer pairs can skew amplification [21]. This presents a technical challenge for samples with low viral loads that require >20 amplification cycles [9], [22]. 
Here, we detail a protocol designed to measure SARS-CoV-2 sgRNAs using RT-ddPCR and validate the protocol by applying it to Vero CCL-81 kidney epithelial cells infected with SARS-CoV-2.
Material and methods
Assay approach
We designed ddPCR assays targeting seven of the reportedly most abundant canonical sgRNAs expressed by SARS-CoV-2 [11]: spike (S) protein (ORF2), open reading frame 3a (ORF3a), envelope (E; ORF 4), membrane (M) glycoprotein (ORF5), ORF7a, ORF8, and nucleocapsid (N) protein (ORF9). For each region, we designed pairs of primer/probe sets that shared the same reverse primer and probe (located in a 3′ body gene) but differed in the forward primer, either directed to the leader sequence in the 5′ UTR (which is far from the probe in gRNA and therefore should only amplify sgRNA) or upstream of the body TRS (in a region that would be removed during the discontinuous transcription of that particular sgRNA) [Table 1, Fig. 1]. It should be noted that the genomic forward primer for the spike (S) gene (the most upstream of the body genes) should only detect gRNA, whereas “genomic” primers for more downstream genes will detect gRNA as well as sgRNAs arising from discontinuous transcription at other body TRS sequences located further upstream. However, the levels of each sgRNA and its corresponding “genomic” target can still be used to measure the fraction of transcripts that result from discontinuous transcription at the given locus. Of note, the “genomic” and subgenomic forward primers were chosen to be a similar distance from the probe in their respective genomic and subgenomic RNAs in order to minimize any potential difference in amplification efficiency prior to subsequent validations.
Table 1
SARS-CoV-2 genomic and subgenomic primer/probe sets selected.
| Target Region | Primer Name (a) | SARS-CoV-2 coordinates (b) | Sequence (5′-3′) |
|---|---|---|---|
| Genomic S | gS_F1 | 21516–21541 | CAGAGTTGTTATTTCTAGTGATGTTC |
| | S_P1 | 21585–21607 | TGCCACTAGTCTCTAGTCAGTGT |
| | S_R1 | 21616–21637 | GGGTAATTGAGTTCTGGTTGTA |
| Subgenomic S | sgS-ORF3a_F1 | 35–56 | ACCAACTTTCGATCTCTTGTAG |
| Genomic ORF3a | gORF3a_F1 | 25350–25369 | CAGTGCTCAAAGGAGTCAAA |
| | ORF3a_P1 | 25421–25445 | TTGGAACTGTAACTTTGAAGCAAGG |
| | ORF3a_R1 | 25448–25469 | GAAGGAGTAGCATCCTTGATTT |
| Subgenomic ORF3a | sgS-ORF3a_F1 | 35–56 | ACCAACTTTCGATCTCTTGTAG |
| Genomic E (ORF 4) | gE_F3 | 26217–26236 | GTAAGCACAAGCTGATGAGT |
| | E_P1 | 26252–26275 | CATTCGTTTCGGAAGAGACAGGTA |
| | E_R3 | 26303–26323 | AGAATACCACGAAAGCAAGAA |
| Subgenomic E | sgE_F3 | 20–38 | CCAGGTAACAAACCAACCA |
| Genomic M (ORF 5) | gM_F1 | 26452–26473 | GTTCCTGATCTTCTGGTCTAAA |
| | M_P1 | 26516–26537 | TTTAGCCATGGCAGATTCCAAC |
| | M_R1 | 26541–26561 | AAGCTCTTCAACGGTAATAGT |
| Subgenomic M | sgM_F1 | 33–53 | CAACCAACTTTCGATCTCTTG |
| Genomic ORF7a | gORF7a_F1 | 27361–27380 | GAAGAGCAACCAATGGAGAT |
| | ORF7a_P2 | 27410–27431 | TCTTGGCACTGATAACACTCGC |
| | ORF7a_R2 | 27432–27454 | GGTAGTGATAAAGCTCACAAGTA |
| Subgenomic ORF7a | sgORF7a_F2 | 20–38 | CCAGGTAACAAACCAACCA |
| Genomic ORF8 | gORF8_F2 | 27877–27894 | GTCACGCCTAAACGAACA |
| | ORF8_P1 | 27930–27953 | ACATTCTTGGTGAAATGCAGCTAC* |
| | ORF8_R1 | 27956–27978 | GATGTTGAGTACATGACTGTAAA |
| Subgenomic ORF8 | sgORF8-N_F1 | 32–51 | CCAACCAACTTTCGATCTCT |
| Genomic N (ORF 9) | gN_F1 | 28227–28245 | CATGACGTTCGTGTTGTTT |
| | N_P2 | 28275–28299 | TGTCTGATAATGGACCCCAAAATCA |
| | N_R2 | 28316–28333 | GGGTCCACCAAACGTAAT |
| Subgenomic N | sgORF8-N_F1 | 32–51 | CCAACCAACTTTCGATCTCT |
* Reverse complement.
(a) ‘F’ = forward primer, ‘R’ = reverse primer, ‘P’ = probe (fluorophore/quencher: FAM, MGB).
(b) SARS-CoV-2 coordinates indicated are based on the SARS-CoV-2 reference sequence (NC_045512.2).
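The coordinates in Table 1 can be used to estimate the amplicon length of each “genomic” assay. The sketch below is illustrative only: it assumes the listed coordinates are 1-based and inclusive, and that the amplicon spans from the first base of the forward primer to the last base of the reverse primer. Subgenomic amplicon lengths are not computed because they depend on the leader-body junction positions, which are not given in the table. The names `GENOMIC_ASSAYS`, `amplicon_length`, and `genomic_amplicon_lengths` are hypothetical.

```python
# Forward-primer start and reverse-primer end for each "genomic" assay,
# copied from Table 1 (NC_045512.2 coordinates, assumed 1-based inclusive).
GENOMIC_ASSAYS = {
    "S": (21516, 21637),
    "ORF3a": (25350, 25469),
    "E": (26217, 26323),
    "M": (26452, 26561),
    "ORF7a": (27361, 27454),
    "ORF8": (27877, 27978),
    "N": (28227, 28333),
}


def amplicon_length(fwd_start: int, rev_end: int) -> int:
    """Length in nucleotides of an amplicon spanning fwd_start..rev_end inclusive."""
    if rev_end < fwd_start:
        raise ValueError("reverse primer end must not precede forward primer start")
    return rev_end - fwd_start + 1


def genomic_amplicon_lengths(assays=GENOMIC_ASSAYS) -> dict:
    """Return {target: amplicon length} for every genomic assay."""
    return {name: amplicon_length(start, end) for name, (start, end) in assays.items()}


if __name__ == "__main__":
    for target, length in genomic_amplicon_lengths().items():
        print(f"g{target}: {length} nt")
```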
Primer design and selection
A multiple sequence alignment was performed using Clustal Omega [23], encompassing complete sequences of 86 SARS-CoV-2 isolates from all geographical locations and all sequences available from the US on 3/14/2020, as previously described [24]. Using this alignment, we assessed sequence conservation across target regions. Multiple primer/probe sets were designed to target seven regions of SARS-CoV-2 reported to generate subgenomic RNA in addition to genomic RNA. Where feasible, two alternative primer/probe sets were designed for a given subgenomic or “genomic” target in order to provide flexibility in choosing the better assay. Primers/probes were designed using the PrimerQuest® Tool (Integrated DNA Technologies, Coralville, IA). A sequence similarity analysis using the Basic Local Alignment Search Tool (BLAST) [25] found no significant similarity between any primer or probe and human sequences.
Preparation of SARS-CoV-2 supernatant viral RNA
Vero CCL-81 kidney epithelial cells, derived from Cercopithecus aethiops, were infected with SARS-CoV-2 (Isolate: USA-WA1/2020) at an MOI of 0.003 (250 000 cells/well) [24]. After 72 h, supernatant was harvested and centrifuged twice to remove cells, as previously described [24]. The clarified supernatant from one infection was mixed with TriReagent (Molecular Research Center; 1 mL per 100 µL supernatant), and RNA was extracted using TriReagent to generate a viral RNA standard [24]. The copies/μL in the supernatant virus standard were estimated by triplicate measurements using the Abbott RealTime SARS-CoV-2 assay (Abbott m2000 Molecular Platform), which targets the RdRp and N genes. Dilutions of the supernatant virus standard were added to reverse transcription reactions to achieve expected inputs of 1 to 10,000 copies per 5 μL RT, and 5 μL aliquots of cDNA were used to test each assay in parallel using replicate 20 μL ddPCR reactions (see Section 2.5) containing primers/probe specific for a given region.
Reverse transcription
Reverse transcription (RT) reactions were performed in 50 µL containing 5 µL of 10× SuperScript III buffer (Invitrogen), 5 µL of 50 mM MgCl2, 2.5 µL of random hexamers (50 ng/µL; Invitrogen), 2.5 µL of 50 µM poly-dT15, 2.5 µL of 10 mM deoxynucleoside triphosphates (dNTPs), 1.25 µL of RNAseOUT (40 U/µL; Invitrogen), and 2.5 µL of SuperScript III RT (200 U/µL; Invitrogen) [26]. Key reagents and resources are listed in Table S1. Reverse transcription was performed with both random hexamers and poly-dT in anticipation of this approach being applied to clinical samples containing long polyadenylated SARS-CoV-2 RNAs, for which the combination of poly-dT plus random hexamers may reduce bias towards reverse transcription of any one region (as can be seen with specific reverse primers), the 5′ end (as would be expected with random hexamers), or the 3′ end (as would be expected with poly-dT).
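When several RT reactions are set up in parallel, the per-reaction volumes listed above can be scaled into a master mix. The following is a minimal sketch, not part of the published protocol: the 10% pipetting overage is an assumed default, and the volume left over after the listed reagents (here treated as RNA template plus water) is inferred by subtraction from the 50 µL total rather than stated in the text.

```python
# Per-reaction reagent volumes (µL) for one 50 µL RT reaction, as listed in the text.
RT_COMPONENTS_UL = {
    "10x SuperScript III buffer": 5.0,
    "50 mM MgCl2": 5.0,
    "random hexamers (50 ng/µL)": 2.5,
    "50 µM poly-dT15": 2.5,
    "10 mM dNTPs": 2.5,
    "RNAseOUT (40 U/µL)": 1.25,
    "SuperScript III RT (200 U/µL)": 2.5,
}
RT_TOTAL_UL = 50.0


def remaining_volume_ul() -> float:
    """Volume per reaction not taken up by the listed reagents
    (assumed here to be RNA template plus water)."""
    return RT_TOTAL_UL - sum(RT_COMPONENTS_UL.values())


def master_mix(n_reactions: int, overage: float = 0.10) -> dict:
    """Scale each reagent for n_reactions, adding a fractional overage
    (assumed value, to cover pipetting losses)."""
    if n_reactions < 1 or overage < 0:
        raise ValueError("need at least one reaction and a non-negative overage")
    scale = n_reactions * (1 + overage)
    return {name: round(vol * scale, 2) for name, vol in RT_COMPONENTS_UL.items()}


if __name__ == "__main__":
    print(f"Template + water per reaction: {remaining_volume_ul()} µL")
    for reagent, volume in master_mix(8).items():
        print(f"{reagent}: {volume} µL")
```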
Droplet digital PCR
Droplet digital PCR reactions consisted of 20 μL per well containing 10 μL of ddPCR Probe Supermix (no deoxyuridine triphosphate, Bio-Rad, Hercules, CA), 900 nM of primers, 250 nM of probe, and 5 μL of plasmid DNA or cDNA generated from SARS-CoV-2 virion RNA [24]. The ddPCR reactions were incorporated into droplets using the QX100 Droplet Generator (Bio-Rad). Nucleic acids were amplified with the following cycling conditions: 10 min at 95 °C, 45 cycles of 30 s at 95 °C and 59 °C for 60 s, and a final droplet cure step of 10 min at 98 °C using a Mastercycler® nexus (Eppendorf, Hamburg, Germany). Droplets were read and analyzed using Bio-Rad QX100 system and QuantaSoft software in the absolute quantification mode. Only wells containing ≥ 11,000 droplets were accepted for further analysis.
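Absolute quantification in ddPCR rests on Poisson statistics: the mean number of target copies per droplet is estimated from the fraction of positive droplets. The sketch below illustrates that calculation and the ≥ 11,000 droplet acceptance criterion stated above. It is not the QuantaSoft implementation; the droplet volume of 0.85 nL is an assumed nominal value, and all function names are hypothetical.

```python
import math

MIN_ACCEPTED_DROPLETS = 11_000  # acceptance threshold stated in the text


def well_accepted(total_droplets: int) -> bool:
    """Apply the droplet-count acceptance criterion to a well."""
    return total_droplets >= MIN_ACCEPTED_DROPLETS


def copies_per_droplet(positive: int, total: int) -> float:
    """Poisson estimate of mean copies per droplet: lambda = -ln(1 - p),
    where p is the fraction of positive droplets."""
    if total <= 0 or not 0 <= positive <= total:
        raise ValueError("droplet counts out of range")
    if positive == total:
        raise ValueError("all droplets positive: well is saturated")
    return -math.log(1 - positive / total)


def copies_per_well(positive: int, total: int,
                    droplet_volume_nl: float = 0.85,   # assumed nominal volume
                    reaction_volume_ul: float = 20.0) -> float:
    """Scale the per-droplet estimate to copies per reaction well."""
    lam = copies_per_droplet(positive, total)
    copies_per_ul = lam / (droplet_volume_nl * 1e-3)  # nL -> µL
    return copies_per_ul * reaction_volume_ul


if __name__ == "__main__":
    # Hypothetical droplet counts, for illustration only.
    pos, tot = 640, 14_500
    if well_accepted(tot):
        print(f"{copies_per_well(pos, tot):.0f} copies per 20 µL well")
```

The logarithmic correction matters most at high target concentrations, where many droplets contain more than one copy; at low inputs the estimate is close to the simple positive-droplet count scaled by volume.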
sgRNA expression levels in a clinical sample
Ethics statement
All participants provided written informed consent, and the research was approved by the Committee on Human Research (CHR), the Institutional Review Board for the University of California, San Francisco (IRB# 20–30588).
Participant recruitment, sample collection, and processing
Nasopharyngeal (NP) swab samples were obtained from an acutely-infected, symptomatic male with mild disease (outpatient). SARS-CoV-2 infection was confirmed by diagnostic PCR 4 days prior to collection of the study sample, while symptom onset was reported 5 days prior to the study sample. The collected NP swab was transported in 850 μL Viral Transport Medium (VTM; Hanks Balanced Salt Solution, 2% fetal bovine serum, 100 μg/mL Gentamycin, and 0.5 μg/mL Amphotericin B). The nasopharyngeal sample in VTM was subjected to two cycles of centrifugation at 500 xg for 5 min to pellet cells. Cell-associated RNA was extracted using the AllPrep DNA/RNA/miRNA Universal Kit (Qiagen) as per manufacturer’s instructions. The levels of each subgenomic target and three genomic targets were measured using RT-ddPCR as described in 2.4, 2.5.
Results
Validations using plasmid DNA
Plasmid constructs containing genomic regions upstream of one or more SARS-CoV-2 body TRS and body gene (“S2-ORF3a-E”, “ORF6-ORF7ab-ORF8-NC-3’UTR”, “Nsp16-S1”, and “E-M”) and subgenomic constructs containing the 5’ UTR leader sequence upstream of different body genes (S, ORF3a, E, M, ORF7a, ORF8, and N) were designed in pBluescript KS(+) (Bio Basic Inc., Ontario, Canada) to enable assay validations (Table 2). Plasmid concentrations were quantified by ultraviolet (UV) spectrophotometry (NanoDrop ND-1000 instrument, Thermo Fisher) and the molecular weights of respective plasmids were used to calculate the number of molecules per μL. Extracted PBMC from a healthy donor (150–200 ng/well) and H2O were included as negative controls for each primer/probe set.
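The conversion from a spectrophotometric concentration to plasmid molecules per μL follows from Avogadro's number and the plasmid molecular weight. The sketch below shows one way to carry it out. The helper that approximates molecular weight from plasmid length (about 650 g/mol per base pair of double-stranded DNA) is an assumption for cases where the exact molecular weight is not at hand; the function names and example numbers are hypothetical.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def copies_per_ul(conc_ng_per_ul: float, mol_weight_g_per_mol: float) -> float:
    """Convert a DNA concentration (ng/µL) to molecules/µL."""
    if conc_ng_per_ul < 0 or mol_weight_g_per_mol <= 0:
        raise ValueError("concentration must be >= 0 and molecular weight > 0")
    grams_per_ul = conc_ng_per_ul * 1e-9
    return grams_per_ul * AVOGADRO / mol_weight_g_per_mol


def approx_dsdna_mol_weight(length_bp: int, g_per_mol_per_bp: float = 650.0) -> float:
    """Rough molecular weight of double-stranded DNA (assumed average mass per bp)."""
    return length_bp * g_per_mol_per_bp


def dilution_factor(stock_copies_per_ul: float, target_copies: float,
                    input_volume_ul: float = 5.0) -> float:
    """Fold dilution needed so that input_volume_ul delivers target_copies."""
    return stock_copies_per_ul / (target_copies / input_volume_ul)


if __name__ == "__main__":
    # Hypothetical plasmid: 4,000 bp at 50 ng/µL.
    stock = copies_per_ul(50.0, approx_dsdna_mol_weight(4000))
    print(f"stock: {stock:.3e} copies/µL")
    print(f"dilute {dilution_factor(stock, 10_000):.3e}-fold for 10,000 copies per 5 µL")
```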
Table 2
Plasmid constructs for SARS-CoV-2 sgRNA and gRNA assay validations.
| Plasmid Name | Coding Regions (a) | Features | Contains regions targeted by |
|---|---|---|---|
| Subgenomic S | Spike subunit 1 | TRS at 5′ end | sgS |
| Subgenomic ORF3a | ORF3a | TRS at 5′ end | sgORF3a |
| Subgenomic E | Envelope | TRS at 5′ end | sgE, E_Sarbeco |
| Subgenomic M | Membrane | TRS at 5′ end | sgM, M-ORF5 |
| Subgenomic ORF7a | ORF7a | TRS at 5′ end | sgORF7a |
| Subgenomic ORF8 | ORF8 | TRS at 5′ end | sgORF8 |
| Subgenomic N | Nucleocapsid | TRS at 5′ end | sgN, N-ORF9 |
| NSp16-SpikeS1 | Nsp16, Spike subunit 1 | genomic sequence upstream of S | gS |
| S2-ORF3a-E | Spike subunit 2, ORF3a, and envelope | genomic sequence upstream of ORF3a and E | gORF3a, gE |
| E-M | Envelope, membrane | genomic sequence upstream of M | gM |
| ORF6-ORF7ab-ORF8-NC-3UTR | ORF6, ORF7a, ORF7b, ORF8, nucleocapsid, and 3′UTR | genomic sequence upstream of ORF7a, ORF8, and N | gORF7a, gORF8, gN |
(a) All plasmids feature a backbone of pBluescript KS(+).
Comparison of sgRNA primer/probe sets to previously validated primer/probe sets targeting SARS-CoV-2
Initially, to assess the performance of the designed sgRNA assays, we simultaneously measured the copies detected for selected subgenomic primer/probe sets (sgE, sgM, and sgN_2) and previously-validated SARS-CoV-2 primer/probe sets that target coding regions of the E, M, or N genes [24] using the subgenomic plasmids with the leader sequence linked to the respective coding region (Table 2, Fig. 2). Each of the previously-validated primer/probe sets targets a coding region downstream of the body TRS, so they detect both genomic and subgenomic RNAs containing that gene (“total” SARS-CoV-2 RNA) [24].
Fig. 2
Comparing efficiency of subgenomic assays and total SARS-CoV-2 assays. The efficiency of selected subgenomic assays (sgE, sgM, and sgN_2) relative to previously validated SARS-CoV-2 assays (targeting total E, M or N coding region; grey symbols) was measured using the same plasmids. S indicates slope (efficiency).
Two plasmid inputs were used to compare each assay for subgenomic or total RNA, and the efficiency at each input was determined as the ratio of measured copies to expected copies (as calculated from plasmid concentration and molecular weight) to ascertain whether there were marked differences between primer/probe sets using the same plasmid input. The average efficiency was calculated as the slope (S) of measured copies vs. expected copies across both inputs (Fig. 2). The average efficiency of each subgenomic primer/probe set was similar to that of the assay for total SARS-CoV-2 RNA (slope for E: subgenomic E [sgE] = 1.20, total E [“E_Sarbeco”] = 1.16; slope for M: sgM = 0.66, total M [“M-ORF5”] = 0.71; and slope for N: sgN_2 = 0.92, total N [“N-ORF9”] = 0.95; Fig. 2). The sgE and sgN assays performed with efficiency close to 100%, whereas sgM was lower (at 66%) but similar to the assay for total M (“M-ORF5”; efficiency = 71%) in this study and previous work [24]. Because only two plasmid inputs were measured in these initial tests, the efficiency estimates served only as a preliminary guide prior to the subsequent validation experiments.
Efficiency and linearity of SARS-CoV-2 panel of ddPCR assays determined using plasmid DNA
Next, we determined the efficiency, linearity, and sensitivity for all of the subgenomic primer/probe sets using terminal dilutions of the respective subgenomic plasmids (Fig. 3). Briefly, plasmid DNA (Table 2) was added to ddPCR wells at expected inputs of 1–10,000 copies/well in duplicate (10,000, 1000 and 100 copies) or quadruplicate (10 and 1 copy).
Fig. 3
Efficiency and linearity of ddPCR assays for SARS-CoV-2 subgenomic transcripts determined using plasmid DNA. Specially designed subgenomic plasmids containing the 5′ UTR sequence upstream of the coding regions of individual body genes were quantified by UV spectroscopy and diluted (expected copies) to test the absolute number of copies detected by each primer/probe set using duplicate ddPCR reactions (measured copies). Coloured symbols denote the best-performing of alternate primer/probe sets for a given sgRNA.
Assay efficiencies (slope of measured vs. expected copies) were > 63% for all assays (Range: 63%-138%; Median: 81%; Fig. 3). For subgenomic targets for which we had designed alternative primer/probe sets (sgORF3a, sgORF7a, and sgORF8), the better performing assay (Fig. 3, colored symbols) was chosen based on efficiency, sensitivity, and separation of the positive and negative droplets (Figs. S1-S2). All assays showed linear quantification (R² ≥ 0.999) with a dynamic range of ≥4 log10 (Fig. 3). Five of the seven assays were able to detect as few as one copy per well (two independent experiments, each with quadruplicate wells for each primer/probe set). Negative controls of human donor PBMC and H2O included in every experiment were consistently negative for all selected sgRNA primer/probe sets listed in Table 1 (Figs. S1-S2). Alternate primer/probe sets for which false positives were detected, such as ‘sgORF8_2’ and ‘sg_N1’, were not selected for the final panel.
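Efficiency (slope) and linearity (R²) of measured versus expected copies can be obtained from a simple linear fit. The sketch below uses ordinary least squares on the untransformed copy numbers with a free intercept; the text does not specify whether the fit was performed on a linear or logarithmic scale or forced through the origin, so these choices are assumptions. The example data are synthetic and serve only to exercise the function.

```python
def fit_line(expected, measured):
    """Ordinary least-squares fit of measured = slope * expected + intercept.

    Returns (slope, intercept, r_squared). The slope is read as the assay
    efficiency relative to the expected input.
    """
    if len(expected) != len(measured) or len(expected) < 2:
        raise ValueError("need two or more paired observations")
    n = len(expected)
    mean_x = sum(expected) / n
    mean_y = sum(measured) / n
    sxx = sum((x - mean_x) ** 2 for x in expected)
    if sxx == 0:
        raise ValueError("expected inputs must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(expected, measured))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(expected, measured))
    ss_tot = sum((y - mean_y) ** 2 for y in measured)
    r_squared = 1.0 if ss_tot == 0 else 1 - ss_res / ss_tot
    return slope, intercept, r_squared


if __name__ == "__main__":
    # Synthetic dilution series spanning the 1-10,000 copy input range.
    expected = [1, 10, 100, 1000, 10000]
    measured = [1.0, 9.0, 78.0, 820.0, 8100.0]  # illustrative values, not study data
    slope, intercept, r2 = fit_line(expected, measured)
    print(f"efficiency (slope) = {slope:.2f}, R^2 = {r2:.4f}")
```

On an untransformed scale the highest input dominates the fit, which is one reason sensitivity at 1 to 10 copies is assessed separately with replicate wells.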
Specificity of subgenomic primer/probe sets
To ascertain the specificity of the subgenomic RNA primer/probe sets, we tested each subgenomic primer/probe set using plasmids designed for their respective genomic sequence (Table 2). Each genomic plasmid was added to ddPCR reactions at an input of at least 10 000 copies along with either the subgenomic or respective genomic sequence-targeting primer/probe set. Five of the seven subgenomic sequence targeting primer/probe sets (sgORF3a_1, sgE, sgM, sgORF8_1, sgN_2) did not detect any copies of plasmid containing the respective genomic sequence even at inputs above 17 000 copies [near saturation] (Fig. S3). At inputs >17 000 copies of genomic plasmid, only one droplet was detected across 4 replicate wells for sgS, while for sgORF7a_2, one droplet was detected in each well of two replicates. In contrast, the matched genomic sequence-targeting primer/probe sets detected copies in the expected input range, close to saturation. Taken together, these data demonstrate that the detection of subgenomic sequences by all seven designed primer/probe sets is highly specific.
The expression of sgRNA vs. “genomic” RNA in supernatant from SARS-CoV-2 in vitro infections
Although supernatant from in vitro infections with SARS-CoV-2 might be expected to contain mostly gRNA in virions, we previously reported that supernatant RNA levels of the total M, N, and 3′ UTR regions (found in both gRNA and sgRNA) were higher than those of 5′ targets found only in gRNA (main protease, RDRP) [24], suggesting the supernatant contains sgRNA. In order to test this hypothesis and determine if we could quantify different sgRNAs, all of the assays for subgenomic and “genomic” ORF2-9 transcripts were applied to RNA extracted from the supernatant. Dilutions of the supernatant RNA were added to RT reactions to achieve expected inputs (based on testing the supernatant with the Abbott RealTime SARS-CoV-2 clinical assay) of 1 to 10,000 copies per 5 μL RT (the input into each ddPCR well).
All assays for “genomic” ORF 2–9 transcripts showed relatively similar efficiencies, ranging from 118 to 169% (Fig. 4). In these experiments, for 6 out of 7 regions, at least one “genomic” RNA primer/probe set was able to detect as few as 1 copy/well. Negative controls, including ‘no template’ and ‘no RT’ controls, were routinely negative (Fig. S4). In contrast, each sgRNA assay detected a smaller fraction of the total, ranging from 1 to 14% of expected total copies, based on average efficiency inferred from the slope [S] (Fig. 4). Since each subgenomic assay should only detect one particular form of sgRNA arising from a template switch at a given body TRS (Fig. 1) and is highly specific for sgRNA and not genomic RNA (Fig. S3), we reasoned that the sum of all of these sgRNA species should approximate the overall levels of subgenomic RNA present in the supernatant. The cumulative sum of all measured sgRNAs was approximately 42% of the total expected SARS-CoV-2 RNA copies in the supernatant, which accords with the previously measured excess of 3’ over 5’ SARS-CoV-2 RNA levels [24].
Fig. 4
Efficiency and linearity of ddPCR assays for SARS-CoV-2 “genomic” and subgenomic transcripts determined using supernatant from in vitro infection. A SARS-CoV-2 ‘supernatant’ standard was prepared by extracting the RNA from the supernatant of an in vitro infection and quantified using the Abbott Real Time SARS-CoV-2 assay. Various inputs of the supernatant standard (which were used to calculate ‘Expected Copies’ per ddPCR well) were applied to a common reverse transcription reaction, from which aliquots of cDNA were used to measure the absolute number of copies detected by each ddPCR assay (measured copies) for ‘genomic’ or subgenomic transcripts. Each assay was tested with expected inputs of 10–10,000 copies/ddPCR well in duplicate. S (slope) and R2 are indicated for each assay. Coloured symbols denote the best-performing of alternate primer/probe sets for a given gRNA or sgRNA.
Next, we calculated the ratio of copies of each sgRNA relative to its cognate “genomic” RNA, which represents the ratio of discontinuous transcription to continuous transcription at a given body TRS. This comparison should minimize differences due to assay efficiency because the corresponding assays share the same probe and reverse primer, while the forward primers are located similar distances from the probe. The ratios of subgenomic to cognate “genomic” RNA varied between loci, with a range of 1.5–5.2%. The sum of these ratios, which provides a minimum estimate of all subgenomic RNA as a fraction of genomic RNA, was 27.9%. However, as mentioned previously, some of our “genomic” assays may also detect subgenomic transcripts that result from upstream “splicing.” Therefore, we also calculated the ratio of copies of each sgRNA relative to a single true genomic RNA, as measured by the copies from the genomic S assay. The ratios of subgenomic RNA to genomic S ranged from 1.2 to 5.2%. The sum of these ratios, which provides another minimum estimate of the total ratio of subgenomic to genomic RNA, was 24.9% (Fig. 5C). These two analyses suggest a subgenomic to genomic ratio of at least 25–28%, indicating that subgenomic transcripts may account for at least 20–22% of all viral RNA. Taken together, these results suggest that our novel panel of RT-ddPCR assays can be used to measure the absolute levels of 7 different SARS-CoV-2 sgRNAs and provide a minimum estimate of all subgenomic transcripts as a fraction of genomic or total viral RNA.
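The step from a subgenomic-to-genomic ratio of 25–28% to a subgenomic share of 20–22% of all viral RNA follows from simple arithmetic: if r is the ratio of sgRNA to gRNA, the sgRNA fraction of the total is r / (1 + r). The short sketch below reproduces that conversion for the two summed ratios reported above, under the simplifying assumption that total viral RNA consists only of the genomic RNA and the measured subgenomic species. The function name is hypothetical.

```python
def subgenomic_fraction_of_total(sg_to_g_ratio: float) -> float:
    """Convert a subgenomic:genomic ratio r into the subgenomic share of
    total RNA, r / (1 + r), assuming total = genomic + subgenomic."""
    if sg_to_g_ratio < 0:
        raise ValueError("ratio must be non-negative")
    return sg_to_g_ratio / (1 + sg_to_g_ratio)


if __name__ == "__main__":
    # Summed ratios reported in the text: relative to genomic S (24.9%)
    # and relative to each cognate "genomic" target (27.9%).
    for ratio in (0.249, 0.279):
        share = subgenomic_fraction_of_total(ratio)
        print(f"sg:g ratio {ratio:.1%} -> {share:.1%} of total viral RNA")
```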
Fig. 5
Expression of sgRNA and gRNA in supernatant from a SARS-CoV-2 in vitro infection. Aliquots of cDNA from a common RT reaction containing SARS-CoV-2 RNA were added to ddPCR reactions (predicted to contain 1000 copies of SARS-CoV-2/well) and the levels of each (A) sgRNA and (B) gRNA target were measured. (C) The copies of each sgRNA were divided by the copies of genomic S RNA to express the ratio of each subgenomic RNA to genomic RNA. Each target region is depicted by a different colour (red: S; orange: ORF3a; yellow: E; green: M; blue: ORF7a; teal: ORF8; and grey: N). Symbols represent the following: squares: subgenomic RNA; circles: genomic RNA; and diamonds: ratio of sgRNA/gRNA. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
Subgenomic RNA expression levels in pharynx from one acutely infected individual
Next, we applied our newly-validated assays to a nasopharyngeal sample from a laboratory-confirmed SARS-CoV-2 acutely-infected individual (5 days post-symptom onset and 4 days after positive clinical PCR test). Levels of each distinct sgRNA (sgS, sgORF3a, sgE, sgM, sgORF7a, sgORF8, and sgN) and three targets found only in genomic RNA (gS, RDRP and Main Proteinase) were measured in cells from the nasopharynx (Fig. 6). All 7 subgenomic RNAs were detected in the nasopharyngeal cells. With the exception of subgenomic S, the hierarchy of subgenomic transcript levels (N > 3a, M, 7a > S > 8, E) was relatively similar to that seen in the supernatant from SARS-CoV-2 in vitro infection (N > S > 3a, M, 7a > 8, E). Using the sum of the ratios of each sgRNA to genomic S RNA, we determined that the total ratio of subgenomic to genomic RNA was 55.4%, indicating that subgenomic transcripts may account for at least 35.6% of the total viral RNA present in nasopharyngeal cells from this acutely infected individual.
Fig. 6
Expression of sgRNAs in pharynx of one acutely SARS-CoV-2- infected individual. Total cell-associated RNA was isolated from a nasopharyngeal swab. A common RT reaction was divided across ddPCR reactions to measure seven sgRNA targets and one genomic target (S). Independent measurements of two other genomic targets, RDRP and Main Proteinase, from the same sample are included as a reference. Levels of each target are expressed as copies per μL extract.
Discussion
The role of sgRNAs in infection with SARS-CoV-2 and other coronaviruses remains unclear. While sgRNAs are important in the generation of viral proteins that are encoded in the 3′ region of the viral genome, their role in the life cycle and pathogenicity of these viruses remains to be studied in greater detail. It is important to determine if sgRNAs are a suitable marker for active replication, as suggested by some studies [12], [20], [27], [28], or if their enclosure in double membrane vesicles [6], [29] and/or extracellular vesicles allows persistence after replication has ceased. Furthermore, the discovery of several non-canonical sgRNAs [2], [30] suggests that more diverse sgRNA species may play a role in the SARS-CoV-2 life cycle. However, the unique recombination sequences seen in these non-canonical sgRNAs may indicate that they function as defective interfering RNAs, as has been previously suggested [31]. Since sgRNAs in other viruses play roles in viral replication and recombination, these and other roles of sgRNA in SARS-CoV-2 merit further investigation. Given these gaps in knowledge, it is critical to continue to study sgRNAs to better understand coronaviruses, their role in pathogenicity, and potential targets of future therapies.
In this study, we developed and validated a novel panel of RT-ddPCR-based SARS-CoV-2 assays that target 7 distinct canonical sgRNAs. All assays could detect as few as 1–10 copies using plasmid DNA carrying the cognate sequence and demonstrated linearity over 4–5 orders of magnitude (R2 > 0.999 for all; Fig. 3). Given the relatively similar PCR efficiencies of the subgenomic assays on plasmid DNA (Fig. 3), it is unlikely that variable assay efficiencies alone account for the differing abundances of the sgRNA species observed in the supernatant from the in vitro infection (Fig. 4). 
Though we did not formally measure the limit of detection (LOD) of each primer/probe set, the lowest inputs of 1 or 10 copies were typically measured in quadruplicate wells for each independent validation experiment using either plasmid DNA or viral RNA. We found the sensitivity of each primer/probe set to be reproducible in these experiments.
It is worth noting that the expected copies of SARS-CoV-2 RNA in the supernatant were based on the viral load as determined by the Abbott RealTime SARS-CoV-2 assay, which measures the combined fluorescence from a reaction containing two qPCR assays (targeting RDRP and N) employing probes with the same fluorophore. The Abbott RealTime SARS-CoV-2 assay does not exclusively detect genomic RNA, since the N primer/probe set will detect both genomic and subgenomic SARS-CoV-2 RNA. Moreover, the Abbott assay provides relative quantification by extrapolation from an external standard. The observation that our “genomic” RNA assays consistently detected more than the expected copies from the Abbott assay (Fig. 4) could reflect the possibility that the relative quantification from the Abbott assay underestimated the true number of copies, that our assays were more efficient, or that there were other sources of error (dilution, pipetting, assay performance, etc.). Nonetheless, the fraction of sgRNA detected for each target is indicative of the level of sgRNA produced during in vitro infection of Vero-CCL81 cells with SARS-CoV-2. Our data suggest that the ratio of subgenomic to genomic RNA is at least 25–28% in the supernatant from this in vitro infection.
Like the approach used by Kim et al., 2020 [11], we employed Vero cells infected with virus isolated from a patient [24]. However, our approach differed from other reported methods [9], [11], [12] to quantify sgRNAs. We employed an RT strategy that seeks to minimize bias towards any one region or either end of the RNA by using both random hexamers and poly-dT. 
Furthermore, our use of a common RT reaction for multiple ddPCR reactions should reduce differences due to reverse transcription and facilitate more accurate comparison of different RNA target levels (for instance, matched sgRNA and gRNA targets for multiple regions). Our approach is also faster than deep and direct RNA sequencing methods [11], although it does not allow for the examination of any RNA modifications, as does direct RNA sequencing. While a prior publication described a PCR-based assay for one subgenomic RNA (sgE) [12], our approach is more comprehensive in that it allows us to quantify seven canonical sgRNAs, measure the rates of discontinuous transcription at each region, and estimate the total abundance of sgRNA species.
Recent studies estimate the abundance of sgRNAs in clinical samples or in vitro infections to be 0.4% [12] to 5.6% [11]. Our data suggest that this may be an underestimate of total sgRNA species, which may be present at levels of 25–55% or more of the genomic SARS-CoV-2 RNA copies, or 20–36% or more of the total viral RNA. Further studies with larger numbers of clinical samples are required to confirm this finding. Taken together, these data suggest that sgRNA species are likely more abundant than previously thought and highlight the difficulty in accurately quantifying one or all sgRNAs using other published approaches.
Limitations
Given the difficulty in preparing and independently quantifying full-length RNA standards for genomic and seven different subgenomic SARS-CoV-2 transcripts, we were unable to determine whether the efficiency of reverse transcription may differ between the various assays for sgRNA or “genomic” RNA. Consequently, we cannot exclude the possibility that differences in the efficiencies of reverse transcription could contribute to differences between the measured levels of various sgRNAs or the ratios of each sgRNA to its cognate “genomic” RNA. However, the use of a “common” RT reaction (from which aliquots are taken for all assays) and the reverse transcription with random hexamers and poly-dT should minimize differences at the reverse transcription step. Moreover, use of a common probe and reverse primer should minimize differences between the sgRNA and cognate “genomic” RNA assays.
We also considered testing one-step dd-RT-PCR assays, in which reverse transcription would take place in each droplet using specific reverse primers. However, prior experiments using dd-RT-PCR suggested that RT efficiency did not contribute significantly to differences between supernatant levels of different SARS-CoV-2 RNA regions as measured by our previously validated panel of SARS-CoV-2 RNA assays [24]. Moreover, we found that the one-step dd-RT-PCR approach resulted in a poorer signal-to-noise ratio and more false positive droplets using the same primers/probes (all TaqMan dual-labelled FAM-MGB probes from Applied Biosystems) used in parallel with the 2-step approach.
While genetic diversification of coronaviruses occurs at lower rates than that of influenza virus, viral variants have been described that, owing to enhanced transmissibility, are predicted to replace original founder strains and become predominant in the near future. These mutations could affect performance of PCR-based assays if they occur in regions targeted by the PCR primers or probes. 
Table S2 shows the mutations described in the recent COVID-19 Genomics UK Consortium report [32]. Based on this report, one mutation falls in the reverse primer for the subgenomic and genomic ORF 8 assay (ORF8_R1), while another mutation corresponds to the very 5′ end of the forward primer for “genomic” N (gN_F1; Table S1). However, the remaining assays (including those for the spike gene) should not be affected. Moreover, it should be noted that not all mismatches in primer/probe sequences will have an appreciable impact on amplification efficiency, and ddPCR is reported to be less susceptible to the inhibitory effects of sequence mismatches. When mismatches do cause significant effects on amplification efficiency, these effects are often visible in the raw ddPCR plots, and primer/probe sequences can be corrected to account for known mutations.
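Checking whether reported mutations fall inside primer or probe binding sites is a simple interval test. The sketch below is illustrative only and is not the procedure used for Table S2. The `Oligo` type, the `oligos_hit` function, and all oligo names, coordinates, and mutation positions are hypothetical placeholders; actual binding coordinates would have to be taken from Table S1 and a reference genome.

```python
from typing import NamedTuple


class Oligo(NamedTuple):
    name: str
    start: int  # 1-based, inclusive genome coordinate (hypothetical)
    end: int    # 1-based, inclusive genome coordinate (hypothetical)


def oligos_hit(mutation_positions, oligos):
    """Return {oligo name: sorted mutated positions inside its binding site}."""
    unique_positions = set(mutation_positions)
    hits = {}
    for oligo in oligos:
        inside = sorted(p for p in unique_positions if oligo.start <= p <= oligo.end)
        if inside:
            hits[oligo.name] = inside
    return hits


# Placeholder coordinates and mutation positions, for illustration only.
oligos = [
    Oligo("primer_A", 100, 120),
    Oligo("probe_A", 130, 150),
    Oligo("primer_B", 160, 180),
]
mutations = [95, 100, 145, 200]
print(oligos_hit(mutations, oligos))
```

An overlap found this way only flags an assay for review. As noted above, not every mismatch measurably reduces amplification efficiency, so a flagged oligo would still need to be checked against the raw ddPCR plots.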
Potential applications
The sensitive, quantitative performance of our RT-ddPCR assays and the breadth of detection for different sgRNA species enable fine discrimination of differences between the rates of discontinuous transcription at various loci and between the abundance of different sgRNAs within the same sample. As a result, this approach can be utilized to enhance our understanding of the mechanisms that regulate coronavirus transcription. Our panel of assays for sgRNAs and corresponding “genomic” RNAs (or previously developed RT-ddPCR assays for “genomic only” targets such as RdRP and main protease [24]) should be applied to other samples from in vitro infections and SARS-CoV-2 infected individuals. These assays can be used to study the kinetics of SARS-CoV-2 genomic and subgenomic transcription, the function of sgRNAs, and the contribution of sgRNA to viral pathogenesis. It remains to be determined whether sgRNA levels correlate with infectivity/transmission, disease severity, or sequelae of COVID-19, but our assays provide powerful tools to address these questions.
Conclusions
Here, we describe the design and validation of seven novel ddPCR assays specific for the most abundantly reported sgRNAs of SARS-CoV-2, as well as matched primer/probe sets that target “genomic” RNA. Our approach, which may be faster, cheaper, and more quantitative than sequencing, could be utilized to study less abundant sgRNAs of SARS-CoV-2 or adapted to measure sgRNAs from other viruses. In this way, the methods described here can help researchers understand the role of subgenomic transcription in SARS-CoV-2 and other viruses.
CRediT authorship contribution statement
Sushama Telwatte: Conceptualization, Data curation, Formal analysis, Funding acquisition, Investigation, Methodology, Supervision, Validation, Visualization, Writing - original draft. Holly Anne Martin: Investigation, Formal analysis, Writing - original draft. Ryan Marczak: Investigation, Writing - review & editing. Parinaz Fozouni: Resources, Investigation, Writing - review & editing. Albert Vallejo-Gracia: Resources, Investigation, Writing - review & editing. G. Renuka Kumar: Resources, Investigation, Writing - review & editing. Victoria Murray: Resources, Writing - review & editing. Sulggi Lee: Resources, Supervision, Writing - review & editing. Melanie Ott: Resources, Supervision, Writing - review & editing. Joseph K. Wong: Resources, Supervision, Writing - review & editing. Steven A. Yukl: Conceptualization, Funding acquisition, Methodology, Resources, Supervision, Visualization, Writing - original draft.
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.