Wei Zou1, Min Xiong2, Siyuan Hao2, Elizabeth Yan Zhang3, Nathalie Baumlin4, Michael D Kim4, Matthias Salathe4, Ziying Yan5, Jianming Qiu6. 1. Department of Microbiology and Immunology, University of Michigan, Ann Arbor, Michigan, USA. 2. Department of Microbiology, Molecular Genetics and Immunology, University of Kansas Medical Center, Kansas City, Kansas, USA. 3. GeneGoCell Inc., San Diego, California, USA. 4. Department of Internal Medicine, University of Kansas Medical Center, Kansas City, Kansas, USA. 5. Department of Anatomy and Cell Biology, University of Iowa, Iowa City, Iowa, USA ziying-yan@uiowa.edu jqiu@kumc.edu. 6. Department of Microbiology, Molecular Genetics and Immunology, University of Kansas Medical Center, Kansas City, Kansas, USA ziying-yan@uiowa.edu jqiu@kumc.edu.
Abstract
The spike (S) polypeptide of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) consists of the S1 and S2 subunits and is processed by cellular proteases at the S1/S2 boundary that contains a furin cleavage site (FCS), 682RRAR↓S686 Various deletions surrounding the FCS have been identified in patients. When SARS-CoV-2 propagated in Vero cells, it acquired deletions surrounding the FCS. We studied the viral transcriptome in Vero cell-derived SARS-CoV-2-infected primary human airway epithelia (HAE) cultured at an air-liquid interface (ALI) with an emphasis on the viral genome stability of the FCS. While we found overall the viral transcriptome is similar to that generated from infected Vero cells, we identified a high percentage of mutated viral genome and transcripts in HAE-ALI. Two highly frequent deletions were found at the FCS region: a 12 amino acid deletion (678TNSPRRAR↓SVAS689) that contains the underlined FCS and a 5 amino acid deletion (675QTQTN679) that is two amino acids upstream of the FCS. Further studies on the dynamics of the FCS deletions in apically released virions from 11 infected HAE-ALI cultures of both healthy and lung disease donors revealed that the selective pressure for the FCS maintains the FCS stably in 9 HAE-ALI cultures but with 2 exceptions, in which the FCS deletions are retained at a high rate of >40% after infection of ≥13 days. Our study presents evidence for the role of unique properties of human airway epithelia in the dynamics of the FCS region during infection of human airways, which is likely donor dependent.IMPORTANCE Polarized human airway epithelia at an air-liquid interface (HAE-ALI) are an in vitro model that supports efficient infection of SARS-CoV-2. The spike (S) protein of SARS-CoV-2 contains a furin cleavage site (FCS) at the boundary of the S1 and S2 domains which distinguishes it from SARS-CoV. However, FCS deletion mutants have been identified in patients and in vitro cell cultures, and how the airway epithelial cells maintain the unique FCS remains unknown. We found that HAE-ALI cultures were capable of suppressing two prevalent FCS deletion mutants (Δ678TNSPRRAR↓SVAS689 and Δ675QTQTN679) that were selected during propagation in Vero cells. While such suppression was observed in 9 out of 11 of the tested HAE-ALI cultures derived from independent donors, 2 exceptions that retained a high rate of FCS deletions were also found. Our results present evidence of the donor-dependent properties of human airway epithelia in the evolution of the FCS during infection.
Thespike (S) polypeptide of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) consists of the S1 and S2 subunits and is processed by cellular proteases at the S1/S2 boundary that contains a furin cleavage site (FCS), 682RRAR↓S686 Various deletions surrounding the FCS have been identified in patients. WhenSARS-CoV-2 propagated in Vero cells, it acquired deletions surrounding the FCS. We studied the viral transcriptome in Vero cell-derived SARS-CoV-2-infected primary human airway epithelia (HAE) cultured at an air-liquid interface (ALI) with an emphasis on the viral genome stability of the FCS. While we found overall the viral transcriptome is similar to that generated frominfected Vero cells, we identified a high percentage of mutated viral genome and transcripts in HAE-ALI. Two highly frequent deletions were found at the FCS region: a 12 amino acid deletion (678TNSPRRAR↓SVAS689) that contains the underlined FCS and a 5 amino acid deletion (675QTQTN679) that is two amino acids upstream of the FCS. Further studies on the dynamics of the FCS deletions in apically released virions from 11 infected HAE-ALI cultures of both healthy and lung disease donors revealed that the selective pressure for the FCS maintains the FCS stably in 9 HAE-ALI cultures but with 2 exceptions, in which the FCS deletions are retained at a high rate of >40% after infection of ≥13 days. Our study presents evidence for the role of unique properties of human airway epithelia in the dynamics of the FCS region during infection of human airways, which is likely donor dependent.IMPORTANCE Polarized human airway epithelia at an air-liquid interface (HAE-ALI) are an in vitro model that supports efficientinfection of SARS-CoV-2. Thespike (S) protein of SARS-CoV-2 contains a furin cleavage site (FCS) at the boundary of the S1 and S2 domains which distinguishes it fromSARS-CoV. However, FCS deletion mutants have been identified in patients and in vitro cell cultures, and how the airway epithelial cells maintain the unique FCS remains unknown. We found that HAE-ALI cultures were capable of suppressing two prevalent FCS deletion mutants (Δ678TNSPRRAR↓SVAS689 and Δ675QTQTN679) that were selected during propagation in Vero cells. While such suppression was observed in 9 out of 11 of the tested HAE-ALI cultures derived from independent donors, 2 exceptions that retained a high rate of FCS deletions were also found. Our results presentevidence of the donor-dependent properties of human airway epithelia in theevolution of the FCS during infection.
The ongoing coronavirus disease 2019 (COVID-19) outbreak, caused by the novel severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), poses a great threat to global public health with a devastating mortality (1–4). The virus has spread with unprecedented speed and has infected >100 million people worldwide, causing >2 million deaths so far. Theefficacy of the only U.S. Food and Drug Administration (FDA)-approved drug, Veklury (remdesivir), to treat COVID-19patients is limited to early phases of the disease and is supportive (5). Ongoing rollout of theadenovirus-based and two mRNA-based COVID-19 vaccines approved by the FDA under an emergency use authorization, with more vaccines to follow soon, is the hope to preventCOVID-19 and contain the virus (6).SARS-CoV-2 phylogenetically belongs to the genus Betacoronavirus of the family Coronaviridae (7, 8) and is closely related to the previously identified severe acute respiratory syndrome coronavirus (SARS-CoV) with an identity of 79% in genome sequence (9, 10). SARS had an outbreak in 2002 to 2003 (11, 12). The genome organization of SARS-CoV-2 is the same as other betacoronaviruses. It has six major open reading frames (ORFs) arranged in order from the structured 5′ untranslated region (UTR) to 3′ UTR (13, 14): replicases (ORF1a and ORF1b), spike (S), envelope (E), membrane (M), and nucleocapsid (N). In addition, at least seven ORFs encoding accessory proteins (3a, 6, 7a, 7b, 8, 9a, and 9b) are interspersed between the structural protein genes (10, 15).The replication and transcription of SARS-CoV-2 largely resemble that of theSARS-CoV (15, 16). The accepted model for coronavirus transcription indicates that all viral mRNAs have a common 5′-leader (L) sequence or the 5′-cap structure at the 5′ UTR, and a common poly(A) tail at the 3′ UTR (15, 17–19). The highly conserved leader sequences contain the transcription regulatory sequences (TRS) which play an important role in viral RNA transcription (18–20). Upon cell entry of the virus, the incoming positive-sense genomic RNA (+gRNA) subjects it to immediate translation of two large ORFs, ORF1a and ORF1b, for viral nonstructural proteins, which form the viral replication and transcription complex (RTC) (21). In the complex, the viral +gRNA serves as the template for the production of negative-sense gRNA (−gRNA) and subgenome RNA (−sgRNA) intermediates, which in turn serve as the templates for the synthesis of +gRNA and +sgRNAs (22). The positive-sense viral RNAs are themRNAs used for the translations of 16 viral nonstructural proteins, 9 accessory proteins, and 4 structural proteins, which includespike protein S, envelope protein E, membrane protein M, and nucleocapsid protein N (21, 23). Next, thenewly synthesized +gRNA is encapsidated by the N protein to assemble progeny virions with other viral structural proteins, M, E, and S (21).Most SARS-CoV-2 structural and nonstructural proteins share greater than 85% identity in protein sequence with SARS-CoV, whereas their S proteins share an identity of only approximately 77% (2). TheS protein consists of two subunits, S1 and S2, and is a key glycoprotein responsible for receptor binding and determining the host tropism, pathogenicity, and transmissibility (24, 25). It forms a homotrimer on the virion surface and triggers viral entry into target cells via binding of the S1 subunit to its cognate receptor, angiotensin-converting enzyme 2 (ACE2) (2, 26, 27). One significant difference among S proteins of SARS-CoV-2, SARS-CoV, and other bat SARS-likecoronaviruses, such as bat coronavirus BtCoV-RaTG13, is the addition of 4 amino acids, PRRA, at the S1/S2 boundary (25). This insertion forms a polybasic residuemotif, assembling a furin cleavage site (FCS), RRAR↓S, which is highly related to thefurin cleavage consensus sequence RX[K/R]R (X, any amino acid) (28). The absence of the FCS in the other betacoronaviruses suggests the insertion of PRRA is a key factor in the virulence of SARS-CoV-2, which has been shown to broaden cell tropism, transmissibility, and pathogenicity of the virus (29–31).Viral transcriptomes of SARS-CoV-2 have been studied by several groups but only in infected Vero cells (15, 17), which revealed quick mutations in the S1/S2 boundary of the S geneafter a few passages, including the loss of the FCS and the immediately adjacent amino acids upstream or downstream of the FCS. The loss of the FCS has been identified in progeny virions replicated in Vero cells (17, 32–35). Themutant viruses were stable, quickly took over the wild-type (WT) virus, and became the dominant population during passaging. Of note, various deletions surrounding the FCS have been identified in patients. This raises the question of how the FCS region deletions are selected in human airways.In this study, we used transcriptome sequencing (RNA-seq) to analyze the viral transcriptome of SARS-CoV-2 in theinfectedhuman airway epithelia (HAE) cultured at an air-liquid interface (HAE-ALI), which mimics natural viral infection of human airways (36, 37). While the viral transcriptome overall recapitulated that in Vero cells, we discovered that there is a selective pressure in HAE-ALI to suppress the deletions at the S1/S2 boundary and that this pressure appears individual donor dependent. We identified two FCS region deletions that are strikingly amplified in two HAE-ALI cultures after 2 to 3 weeks of infection, whereas these deletions were suppressed in nine other HAE-ALI cultures.
RESULTS
The SARS-CoV-2 transcriptome in SARS-CoV-2-infected HAE-ALI cultures.
HAE-ALIB2-20 cultures wereinfected with SARS-CoV-2 at a multiplicity of infection (MOI) of 0.2 or 2 or mock infected. At 4 days postinfection (dpi), immunofluorescence assay for theSARS-CoV-2 N protein expression revealed effectiveSARS-CoV-2 infection in these cultures, with ∼10% and ∼30% of cells positive in theinfections at MOIs of 0.2 and 2, respectively (Fig. 1). This result was similar to our previous observation (37).
FIG 1
Immunofluorescence analysis of SARS-CoV-2 infection of HAE cells. HAE-ALIB2-20 cultures were infected with SARS-CoV-2 at an MOI of 0.2 or 2 PFU/cell or, as indicated, mock -infected (Mock). At 4 days postinfection, a piece of the insert membrane was fixed in 4% paraformaldehyde in PBS at 4°C overnight and subjected to direct immunofluorescence analysis. The membranes were stained with anti-SARS-CoV-2 N protein (NP). Images were taken on a Leica TCS SPE confocal microscope under 40×, which was controlled by Leica Application Suite X software. The nuclei were stained with DAPI (4′,6-diamidino-2-phenylindole). Bar, 20 μm.
Immunofluorescence analysis of SARS-CoV-2 infection of HAE cells. HAE-ALIB2-20 cultures wereinfected with SARS-CoV-2 at an MOI of 0.2 or 2 PFU/cell or, as indicated, mock -infected (Mock). At 4 days postinfection, a piece of the insert membrane was fixed in 4% paraformaldehyde in PBS at 4°C overnight and subjected to direct immunofluorescence analysis. Themembranes were stained with anti-SARS-CoV-2 N protein (NP). Images were taken on a Leica TCS SPE confocal microscope under 40×, which was controlled by Leica Application Suite X software. The nuclei were stained with DAPI (4′,6-diamidino-2-phenylindole). Bar, 20 μm.Total RNA samples wereextracted frominfected HAE-ALI cultures at 4 dpi and subjected to reverse transcription, followed by DNA nanoball sequencing. An average total reads of 18.27% and 26.54% weremapped to theSARS-CoV-2 reference genome (Wuhan-Hu-1 isolate; GenBank accession no. MN908947) in the groups of MOI 0.2 and MOI 2, respectively (Table 1). No significant difference was observed in the total reads in the two groups. Notably, the RNA-seq data obtained fromSARS-CoV-2-infected Vero-E6 cells had up to 70% of the reads mapped to the viral genome (15), which was likely due to the high infectivity of Vero-E6 cells and that not all the cell types in HAE-ALI are permissive to theinfection (37). Also, for the whole viral genomecoverage, in contrast to the observation in SARS-CoV-2-infected Vero-E6 cells (15), we did not observe an obvious 5′-leader peak in theinfected HAE cells (Fig. 2). Instead, we observed ∼2-fold higher reads in the 3′ end than that in the 5′-end viral genome.
TABLE 1
Summary of RNA-seq data of SARS-CoV-2 and mock-infected HAE-ALI cultures
Sample
No. of total reads
No. of mapped viral reads
Mapped viral reads (%)
Mock-1
44,690,289
0
0
Mock-2
44,710,969
0
0
Mock-3
44,661,039
0
0
MOI0.2-1
44,665,846
9,109,261
20.39
MOI0.2-2
44,606,709
9,446,219
21.18
MOI0.2-3
44,625,565
5,907,458
13.24
MOI2-1
44,622,724
14,536,455
32.58
MOI2-2
44,658,564
12,466,502
27.92
MOI2-3
44,616,157
8,529,469
19.12
FIG 2
Genome coverage of SARS-CoV-2-infected HAE cells with MOIs of 0.2 and 2, respectively. Six total RNA samples, as indicated by six colors, extracted from HAE-ALIB2-20 cultures infected with SARS-CoV-2 at MOIs of 0.2 and 2, respectively, were subjected to whole RNA-seq. The reads were mapped to the reference SARS-CoV-2 Wuhan-Hu-1 strain genome (GenBank accession no. MN908947, NCBI), as shown with nucleotide numbers (x axis), using BWA, and the sequencing read coverage (y axis) was calculated.
Genomecoverage of SARS-CoV-2-infected HAE cells with MOIs of 0.2 and 2, respectively. Six total RNA samples, as indicated by six colors, extracted from HAE-ALIB2-20 cultures infected with SARS-CoV-2 at MOIs of 0.2 and 2, respectively, were subjected to whole RNA-seq. The reads weremapped to the referenceSARS-CoV-2 Wuhan-Hu-1 strain genome (GenBank accession no. MN908947, NCBI), as shown with nucleotide numbers (x axis), using BWA, and the sequencing read coverage (y axis) was calculated.Summary of RNA-seq data of SARS-CoV-2 and mock-infected HAE-ALI culturesWe further analyzed the viral sgRNA expression in infected HAE cells. Junction-spanning reads covering the 5′ leader and different sgRNAs were counted and analyzed (see Data Set S1 in the supplemental material). Different sgRNAs were abundantly expressed in infected HAE cells. As thenegative-strand intermediates are only ∼1% as abundant as their positive-sense counterparts (22, 38), this indicates most of the identified sgRNAs were +sgRNAs. N-protein-encoding RNA was themost abundantly expressed viral transcript and accounted for 23.11% and 16.93% of total junction-spanning reads in the groups of MOI 0.2 and MOI 2, respectively, followed by ORF3a, ORF7a, M, ORF8, S, E, ORF6 coding RNAs (Fig. 3). The junction-spanning reads associated with ORF7b and ORF9a/b were identified at a level of 0.01% or less of the total junction-spanning reads and were identified in only part of all the six samples in two MOI groups (Data Set S1). In SARS-CoV-2-infected HAE cells, S RNA transcript was expressed at a ratio of ∼2% of total junction-spanning reads in both groups (Fig. 3), compared to that of ∼8% in Vero cells. We detected relatively higher level of ORF3a (∼8%) transcript in SARS-CoV-2-infected HAE cells (Fig. 3), in contrast to 5.22% in infected Vero cells (15).
FIG 3
Identification and quantification of SARS-CoV-2 subgenomic RNAs. (A) Genome organization. The SARS-CoV-2 genome is schematically diagrammed (not to scale) with regions in order coding for open reading frame 1a (ORF1a)/ORF1b, S protein, ORF3a, E and M proteins, ORF7a/b and ORF8, N protein, and ORF9a/b. The leader sequence was labeled L in a blue box. The structural genes are labeled within boxes in orange, and the accessory genes are labeled within boxes in light green. (B) Subgenomic RNAs. Six total RNA samples were extracted from SARS-CoV-2-infected HAE-ALI cultures (at MOIs of 0.2 and 2, respectively) and subjected to whole RNA-seq. Three repeats in each MOI group were merged. Junction-spanning reads were identified using STAR (2.7.3a), and the transcript abundance, as shown as a percentage under HAE-ALI/MOI of 0.2 or 2, was estimated by counting the reads that span the junction of the corresponding RNA transcript. The left is the diagrammed subgenomic RNAs. The canonical junction-spanning reads related to each sgRNA were calculated, and the ratios are shown on right. The abundances of the subgenomic transcripts identified in Vero cells in a previous study (15) are listed for comparison.
Identification and quantification of SARS-CoV-2 subgenomic RNAs. (A) Genome organization. TheSARS-CoV-2 genome is schematically diagrammed (not to scale) with regions in order coding for open reading frame 1a (ORF1a)/ORF1b, S protein, ORF3a, E and M proteins, ORF7a/b and ORF8, N protein, and ORF9a/b. The leader sequence was labeled L in a blue box. The structural genes are labeled within boxes in orange, and the accessory genes are labeled within boxes in light green. (B) Subgenomic RNAs. Six total RNA samples wereextracted fromSARS-CoV-2-infected HAE-ALI cultures (at MOIs of 0.2 and 2, respectively) and subjected to whole RNA-seq. Three repeats in each MOI group weremerged. Junction-spanning reads were identified using STAR (2.7.3a), and the transcript abundance, as shown as a percentage under HAE-ALI/MOI of 0.2 or 2, was estimated by counting the reads that span the junction of the corresponding RNA transcript. The left is the diagrammed subgenomic RNAs. The canonical junction-spanning reads related to each sgRNA were calculated, and the ratios are shown on right. The abundances of the subgenomic transcripts identified in Vero cells in a previous study (15) are listed for comparison.Summary of the junction-spanning reads identified in SARS-CoV-2-infected HAE-ALI cultures. HAE-ALIB2-20 cultures wereinfected with SARS-CoV-2 at MOIs of 0.2 and 2, respectively. Three repeated total RNA samples, as indicated, wereextracted fromeach MOI group and were subjected to whole RNA-seq. The RNA-seq reads weremapped to the reference Wuhan-Hu-1 genome. The junction-spanning reads as shown with start and end sites, as well as numbers of the junction, are listed in theExcel file. Download Data Set S1, XLSX file, 0.9 MB.Interestingly, in all identified spanning-junction reads, only ∼50% correlated with the canonical sgRNA transcripts in both MOI infection groups. The other half junction-spanning reads representeither reads covering 5′-leader sequence but with unexpected 3′ sites located in themiddle of annotated ORFs or reads covering between different ORFs or inside an ORF without 5′-leader sequence (Data Set S1). It is important to note that a lot of these noncanonical junction-spanning patterns were supported by only one read from the RNA-seq data, indicating that these noncanonical transcripts may arise fromerroneous replicase activity.
Identification of deletions surrounding the furin cleavage site at S1/S2.
Among the ∼50% noncanonical junction-spanning reads, we identified a high abundant 36-bp deletion, mut-del1, located at nucleotide (nt) 23,594 to 23,629 spanning the FCS (Fig. 4A) that encodes amino acids (aa) 678TNSPRRAR↓SVAS689 (Fig. 4B, “↓” indicates cleavage). It displayed at frequencies of 21.04% and 14.79% of total junction-spanning reads in MOI 0.2 and MOI 2 groups, respectively (Table 2). Another 15-bp deletion, mut-del2, located at nt 23,583 to 23,597 and encoding aa 675QTQTN679 (Fig. 4A), just two amino acids ahead of the FCS, was also identified. It accounted for 0.42% and 15.11% of the total junction-spanning reads in MOI 0.2 and MOI 2 groups, respectively (Table 2). The ratio of mut-del1 is only slightly lower than the N sgRNA and nearly 10 times higher than the S sgRNA transcript, indicating a high fraction of this mutation comes from the viral genome (+gRNA).
FIG 4
Features of the S gene of SARS-CoV-2 and the deletions detected in the FCS region. (A) S gene and FCS. Key domains of the S polypeptide are diagrammed in the context of the SARS-CoV-2 genome. The S1 protein, receptor binding unit, harbors N-terminal domain (NTD) and receptor binding domain (RBD) subunit, which is conserved and recognizes ACE2. The S2, membrane fusion subunit, has fusion peptide (FP), S2' proteolytic site, two heptad repeats, HR1 and HR2, and a transmembrane domain (TM) followed by cytoplasmic peptide (CP) (31). The S protein has acquired a polybasic site (RRAR↓S, a furin cleavage site [FCS]) for cleavage at the S1/S2 boundary. An FCS region of aa 670 to 695, together with the two key deletions mut-del1 (ΔFCS1) and mut-del2 (ΔFCS2), are shown with S amino acid sequences of the SARS-CoV-2 genome (GenBank accession no. MN908947). (B) Coverage plots of S gene at nt 23,500 to 23,698 in SARS-CoV-2-infected HAE-ALIB2-20. The coverage plots show that the most abundant junction-spanning reads in SARS-CoV-2-infected HAE-ALIB2-20 cultures are the 36-bp and 15-bp deletions in the S gene of nt 23,594 to 23,629 and nt 23,583 to 23,597, respectively, which deleted 12 aa and 5 aa shown in mut-del1 and mut-del2 in panel A.
TABLE 2
Ratio of reads covering mut-del1 and mut-del2 to total junction spanning reads and viral genome
Mutant
Junction-spanning reads (%)a
Viral genome ratio (%)b
MOI 0.2
MOI 2
MOI 0.2
MOI 2
Mut-del1
21.04
14.79
26.31
6.67
Mut-del2
0.42
15.11
0.37
8.17
The minimal size of the junctions was set at 10 as described in Materials and Methods.
The total reads include both viral genome RNA (gRNA) and subgenomic RNA (sgRNA).
Features of the S gene of SARS-CoV-2 and the deletions detected in the FCS region. (A) S gene and FCS. Key domains of the S polypeptide are diagrammed in the context of theSARS-CoV-2 genome. The S1 protein, receptor binding unit, harbors N-terminal domain (NTD) and receptor binding domain (RBD) subunit, which is conserved and recognizes ACE2. The S2, membrane fusion subunit, has fusion peptide (FP), S2' proteolytic site, two heptad repeats, HR1 and HR2, and a transmembrane domain (TM) followed by cytoplasmic peptide (CP) (31). TheS protein has acquired a polybasic site (RRAR↓S, a furin cleavage site [FCS]) for cleavage at the S1/S2 boundary. An FCS region of aa 670 to 695, together with the two key deletions mut-del1 (ΔFCS1) and mut-del2 (ΔFCS2), are shown with S amino acid sequences of theSARS-CoV-2 genome (GenBank accession no. MN908947). (B) Coverage plots of S gene at nt 23,500 to 23,698 in SARS-CoV-2-infected HAE-ALIB2-20. Thecoverage plots show that themost abundant junction-spanning reads in SARS-CoV-2-infected HAE-ALIB2-20 cultures are the 36-bp and 15-bp deletions in the S gene of nt 23,594 to 23,629 and nt 23,583 to 23,597, respectively, which deleted 12 aa and 5 aa shown in mut-del1 and mut-del2 in panel A.Ratio of reads covering mut-del1 and mut-del2 to total junction spanning reads and viral genomeTheminimal size of the junctions was set at 10 as described in Materials and Methods.The total reads include both viral genome RNA (gRNA) and subgenomic RNA (sgRNA).To further reveal the ratio of the two deletions in total viral genome, the junction-spanning reads associated with the two deletions were normalized with the average reads covering the same deletions. The results showed that 26.31% and 6.67% of the viral reads related to this region contain themut-del1 deletion, while 0.37% and 8.17% of this region contain mut-del2 deletion in MOI 0.2 and MOI 2 groups, respectively (Table 2). It should be noted that the total reads used for normalization include reads of both viral gRNA and sgRNAs. Thus, here we were unable to distinguish the origin of these two deletions from the viral genome and viral RNA transcripts in these total cellular transcriptome data.Except for these two highly abundantmut-del1 and mut-del2 deletions, we also observed a 21-bp FCS deletion at nt 23,595 to 23,615, encoding aa 678TNSPRRA684, but only in theMOI 2 group with 1.13% of the total junction-spanning reads, and a 39-bp deletion at the N terminus of theS protein (nt 21,743 to 21,781 encoding aa 61NVTWFHAIHVSGT73) with 0.27% and 0.60% of the total junction-spanning reads in MOI 0.2 and MOI 2 groups, respectively.In addition to these deletions in the S gene, we identified about 50 different in-frame or frameshift deletions in theMencoding region that appeared in all six samples of both MOI groups, and there wereevenmore deletions in theM coding region that appeared in only a part of the six RNA samples (Data Set S1). Although the ratio of single deletion was low, the 50 deletion patterns that appeared in all six RNA samples had the ratios of 2.39% and 3.18% in MOI 0.2 and MOI 2 groups, respectively, which is similar or even higher than the identified canonical junction-spanning reads related to M sgRNAs (Fig. 3). Notably, most of these identified deletion patterns of theM gene also appeared in SARS-CoV-2-infected Vero cells (15). Whether these deletions produce functional M protein or affect the function of M protein warrants further studies. In SARS-CoV-2-infected Vero-E6 cells, a high ratio of 27-bp deletion in E gene (nt 26,257 to 26,283) was identified (15), which, however, was not found in infected HAE cells.
Dynamics of the FCS region deletions in virions apically released from SARS-CoV-2-infected HAE-ALI cultures derived from various donors.
To further investigate the FCS region deletions during SARS-CoV-2 infection of HAE cells, weinfected HAE-ALI cultures generated from five different donors, B3-20 (MOI = 0.2), B4-20 (MOI = 0.2 and 2), B9-20 (MOI = 2), L209 (MOI = 0.2), and KC19 (MOI = 0.2), and collected the progeny in the apical washes at different time points. The dynamics of apical virus release of the HAE-ALI cultures of B3-20, B4-20, B9-20, and L209 have been described in our previous study (37). The apical virus release kinetics of the HAE-ALIKC19 is shown in Fig. 5A. Viral RNA was prepared either for RNA-seq or for PCR amplicon-seq of a 384-nt sequencecovering the FCS. Notably, mut-del1 was not significantly detected (<0.1%) in all the apically released viruses collected at >13 dpi (Table 3, Bx-20). Nevertheless, for viruses collected from HAE-ALIKC19, themut-del2 was detected at a high level (20.75%RNA-seq and 20.98%RNA-seq) at 4 dpi and 13 dpi, respectively, which reached a close level of 41.79%PCR-seq at 21 dpi. Although the viruses derived from HAE-ALIB3-20, HAE-ALIL209, and HAE-ALIB4-20 contain a high level (23.17%, 30.33%, and 8.3%, respectively) of mut-del2 at 3 dpi, it decreased to a level of <2% at ≥17 dpi (Table 3, Bx-20). HAE-ALIB9-20never produced significantmut-del2 (<0.1%).
FIG 5
Apical virus release kinetics of SARS-CoV-2-infected HAE-ALI cultures. (A and B) HAE-ALIKC19 and four COPD HAE-ALI cultures (HAE-ALIN707, HAE-ALIUC24, HAE-ALIL184, and HAE-ALIN703) (A) and HAE-ALIB15-20 and HAE-ALIB16-20 (B) cultures were infected with SARS-CoV-2 at an MOI of 0.2 from the apical side. At the indicated days postinfection (dpi), the apical surface was washed with 300 μl of D-PBS to collect the released viruses. PFU were determined (y axis) and plotted to the dpi. Values represent means ± standard deviations (SD) (error bars). The graphs in panels A and B were prepared with GraphPad Prism 9.0.2.
TABLE 3
Summary of the detections of mut-del1 and mut-del2 in stock viruses and apical washes of SARS-CoV-2-infected HAE-ALI cultures derived from different donors
Virus or donor
Source or dpi
Mut-del1
Mut-del2
RNA-seq (%)
PCR-seq (%)
RNA-seq (%)
PCR-seq (%)
P0 stock
BEI
0.87
2.03
2.09
2.45
P1 stock
Vero-E6
21.69
40.47
5.18
5.16
B3-20 (0.2c)
3
NDa
23.17
12
ND
4.44
20
0.00
0.57
B4-20 (2)
3
1.08
8.33
12
0.052
4.27
13
0.033
3.90
17
0.103
1.59
B4-20 (0.2)
3
1.58
1.48
14
0.07
1.41
B9-20 (2)
5
0.001
0.01
11
0.064
0.09
17b
0.056
0.15
17b
0.024
0.07
KC19 (0.2)
4
0.16
1.79
20.75
35.71
13
ND
20.98
21
0.017
41.79
L209 (0.2)
3
0.037
30.33
14
0.001
0.11
41
0.001
0.04
ND, not detected or <0.1%.
Independent samples.
Numbers in parentheses are MOIs.
Apical virus release kinetics of SARS-CoV-2-infected HAE-ALI cultures. (A and B) HAE-ALIKC19 and four COPD HAE-ALI cultures (HAE-ALIN707, HAE-ALIUC24, HAE-ALIL184, and HAE-ALIN703) (A) and HAE-ALIB15-20 and HAE-ALIB16-20 (B) cultures wereinfected with SARS-CoV-2 at an MOI of 0.2 from the apical side. At the indicated days postinfection (dpi), the apical surface was washed with 300 μl of D-PBS to collect the released viruses. PFU were determined (y axis) and plotted to the dpi. Values representmeans ± standard deviations (SD) (error bars). The graphs in panels A and B were prepared with GraphPad Prism 9.0.2.Summary of the detections of mut-del1 and mut-del2 in stock viruses and apical washes of SARS-CoV-2-infected HAE-ALI cultures derived from different donorsND, not detected or <0.1%.Independent samples.Numbers in parentheses areMOIs.Notably, SARS-CoV-2 isolate USA-WA1/2020 P0 stock provided by BEI was already passaged four times in Vero cells, and it was reported that there appeared significant heterogeneity at the S1/S2 boundary in Vero cell-propagated virus (32). To verify this, we sequenced the viral RNA of passage 0 (P0) (the originally received vial) and P1 (passaged once in Vero-E6 cells) virus stocks. The results showed that there was no detectablemut-del1 in the P0 stock but a high rate of 21.69%RNA-seq and 40.47PCR-seq in the P1 stock. However, while there was ∼2% of mut-del2 in the P0 stock, it slightly increased to only ∼5% in the P1 stock (Table 3, P0 and P1). These results confirmed that even though there was no or a low level of mut-del1 and mut-del2 in the P0 stock, there was a high level of mut-del1 and a low level of mut-del2 in the P1 stock, which we used for infection of all HAE-ALI cultures.Taken together, the results demonstrated that themut-del1 appears at a very low rate in all the HAE-ALI produced viruses at late time points of infection (≥17 dpi), which was confirmed by both RNA-seq and PCR amplicon-seq. Although the inoculum had themut-del1 detected at a high frequency rate of 21.69%RNA-seq (40.47%PCR-seq), this deletion was obviously suppressed when the virus propagated in the HAE-ALI cultures. In addition, except for the viruses produced from HAE-ALIKC19, mut-del2 also appeared at a low detection rate during the course of infection (≥17 dpi) in various HAE-ALI cultures.
FCS region deletions during SARS-CoV-2 infection of human airway epithelia are donor dependent.
The above RNA-seq and PCR-seq results of apically released virions from five individual HAE-ALI cultures suggested a selective pressure in suppressing the deletions of the FCS region during SARS-CoV-2 propagation in human airway epithelia. Theexception is theinfection in HAE-ALIKC19 cultures, which amplified themut-del2 to a high level. To address the possibility of the donor dependency of the FCS region deletions, weinfected HAE-ALI cultures generated from two additional donors, B15-20 and B16-20, and collected both the viral progeny in apical washes at theearly and late time points postinfection for PCR-seq. The apical washes frominfected HAE-ALIB15-20 and HAE-ALIB16-20 had similar kinetics of apical virus release, reaching titers of >106 PFU/ml at both 3 dpi and 13 dpi (Fig. 5B), indicating that the virus replicated in these two HAE-ALI cultures at a similar level. The sequencing results of the apically released viruses frominfected HAE-ALIB15-20 showed that mut-del1 was detected at a rate of 31.87% at 3 dpi and an increased level of 54.22% at 13 dpi (Table 4, B15-20/mut-del1). However, mut-del2 was barely detectable at a low rate (0.18%) at 3 dpi, and this rate remained very low at 0.69% at 13 dpi (Table 4, B15-20/mut-del2). For the viruses apically released frominfected HAE-ALIB16-20 cultures, a stark difference was that themut-del1 was suppressed during theinfection period, whereas mut-del2 was also detected at very low levels (<1%) at both 3 and 13 dpi (Table 4, B16-20). Themut-del1 was detected at a rate of 6.78% at 3 dpi and nearly disappeared (at a rate of 0.08%) by theend of infection (13 dpi). The suppression of mut-del1 in HAE-ALIB16-20 cultures was similar to what was observed in the previously tested five ALI cultures (Table 3).
TABLE 4
Detections of mut-del1 and mut-del2 in SARS-CoV-2 virions apically released from infected HAE-ALI cultures derived from B15-20 and B16-20 donors (MOI = 0.2)
Donor
dpi
PCR-seq (%) of mut-del1
PCR-seq (%) of mut-del2
B15-20
3
31.87
0.18
13
54.22
0.69
B16-20
3
6.78
0.71
13
0.08
0.16
Detections of mut-del1 and mut-del2 in SARS-CoV-2 virions apically released frominfected HAE-ALI cultures derived from B15-20 and B16-20 donors (MOI = 0.2)Now that we have tested seven HAE-ALI cultures from seven healthy donors, wenext looked into the FCS deletions in four infected HAE-ALI cultures derived from bronchiolar epithelial cells isolated from four patients who had chronic obstructive pulmonary disease (COPD). While we did not observe significant differences in the titers of apically released viruses over the course of 21 days (Fig. 5A), we found that the FCS was largely retained at a rate of >99% in HAE-ALIN707, HAE-ALIUC24, and HAE-ALIL184 cultures at 3 dpi and in HAE-ALIN703 culture at 21 dpi (Table 5).
TABLE 5
Summary of the detections of mut-del1 and mut-del2 in apical washes of SARS-CoV-2 infected HAE-ALI cultures derived from four COPD donors (MOI = 0.2)
COPD donor
dpi
PCR-seq (%)
Mut-del1
Mut-del2
N707
3
0.171
0.75
13
0.001
0.05
21
0.003
0.05
UC24
3
0.345
0.47
13
0.007
0.07
L184
3
0.327
1.11
14
0.002
0.40
21
0.004
0.68
N703
3
0.602
7.15
13
0.195
3.33
21
0.002
0.47
Summary of the detections of mut-del1 and mut-del2 in apical washes of SARS-CoV-2 infected HAE-ALI cultures derived from four COPD donors (MOI = 0.2)Together with the detection rates of the viruses produced from theinfected HAEKC19-ALI cultures, the above results suggested that the probability of FCS region deletions is dependent on the HAE-ALI cultures made from airway epithelial cells of different donors. In contrast to the HAE-ALIKC19 that produced a high rate of mut-del2, HAE-ALIB15-20 tended to generate the viruses that have a high rate of themut-del1 deletion.
DISCUSSION
In this study, we analyzed the transcriptome of SARS-CoV-2 in polarized human bronchial airway epithelia, an in vitro model mimicking theSARS-CoV-2 infection in human lower airways (36, 37). We found that the transcriptome in HAE-ALI reflects more closely the viral transcriptome in the airways of COVID-19patients, providing further support for HAE-ALI as a physiologically relevant in vitro culturemodel to study SARS-CoV-2. Neither RNA-seq data of clinical SARS-CoV-2-positive nasopharyngeal specimens nor RNA-seq of SARS-CoV-2-infected HAE-ALI showed the 5′-leader sequence read peak (39, 40). In SARS-CoV-2-infected Vero-E6 cells, N sgRNAs accounted for up to 69% in total viral RNA transcripts, and S sgRNAs accounted for 8% of the total junction-spanning reads (15) (Fig. 3). Nevertheless, in HAE-ALI cultures, SARS-CoV-2 still expresses the abundant N protein transcript and a relatively low level of S genemRNA. Of note, the overall sgRNA transcripts in infected HAE-ALI weremapped to only 50% of all the canonical sgRNAs, much lower than that in Vero cells (15), which is partially due to the high deletion rate of the FCS region derived from the inoculated virus (P1 stock, Table 3). Except for those arising fromerroneous replicaseevents, some of these noncanonical transcripts may play unknown but functional roles in thecoronavirus life cycle.Among all theSARS-CoV-2 viral genes, the S gene is themost variable one, in particular the S1/S2 junctional region which features the FCS. Increasing evidence has shown that the S1/S2 FCS region is highly unstable, and various deletions and mutations have been detected or isolated in SARS-CoV-2-infected Vero cells (17, 32–35). A mutant with a 30-bp deletion, encoding aa 679NSPRRAR↓SVA688, showed enhanced replication ability in Vero cells and had the capability to dominate the genome population during passage in Vero cells (35). A 21-bp deletion encoding aa 679NSPRRAR686 was detected (>10%) in low (<2 to 3) passaged isolates (33). However, detection of the original clinical specimen where themutant was derived and SARS-CoV-2-positive clinical specimens showed no such deletions (15 to 30 bp) in the FCS region (34), indicating that themut-del1 or mut-del1-like (containing FCS) deletions are generated during the propagation in Vero cells. Apparently, SARS-CoV-2 is under strong selection pressure in Vero cells to acquire adaptivemutations in theS protein. Nevertheless, mut-del2 (675QTQTN679) has been identified not only in Vero cell-passaged isolates (41) but also in 3 of 68 clinical specimens (33), indicating that mut-del2 may be clinically more important (relevant) than mut-del1.TheS protein as a part of the viral envelope facilitates viral entry into infected cells. The S1 subunit contains the receptor binding domain, and the S2 domain mediates fusion of the viral envelope with a cellular membrane (24). The infectivity of SARS-CoV-2necessitates the activation of S protein. There are two proteolytic activation events associated with S-mediated receptor binding and membrane fusion. The first is a priming cleavage that occurs at the S1/S2 boundary, and the second is the obligatory triggering cleavage that occurs within the S2′ site (Fig. 4A). The priming cleavage at the S1/S2 boundary causes the conformation changes of the S1 subunit for receptor binding and of the S2 subunit for conversion of a fusion competent form, by enabling theS protein to better bind receptors or expose the hidden S2′ cleavage site. The cleavage at S2′ triggers the fusion of the viral envelope with the host cell membrane (24). Cleavage by furin at the S1/S2 site is required for subsequenttransmembrane serine protease 2 (TMPRSS2)-mediated cleavage at the S2′ site during viral entry into lung cells (27). However, a cathepsin B/L-dependent auxiliary activation pathway is available during infection of SARS-CoV-2 infection in TMPRSS2-negative cells (35, 42), which is likely not dependent on the cleavage at S1/S2 (43). One important novel finding of our study is that HAE-ALI cultures prepared fromhuman airway cells isolated from different donors selected different FCS deletions. Whilemost (9/11; 81.8%) of the HAE cultures (B3-20, B9-20, L209, B16-20, and 4 COPD) strongly selected the FCS during virus replication after a long-terminfection (the FCS deletions accounted for only <1% at 13 dpi), HAE-ALIKC19 preferred selection of mut-del2 (675QTQTN679) (41.79%PCR-seq at 21 dpi), and HAE-ALIB15-20 selected themut-del1 (678TNSPRRAR↓SVAS689) at a rate of 54.22% PCR-seq at 13 dpi. Although mut-del2 retains the FCS, deletion of QTQTN upstream of the FCS also prevented the cleavage (41). Thesemutants with amino acid deletions immediately upstream of FCS, likemut-del2, or downstream (685RSV687 or 689SQS691) also showed significant defects in S protein processing (41, 42). Both types of the FCS region deletions were unable to utilize thefurin and TMPRSS2-mediated plasma membrane fusion entry pathway and exhibited a more limited range of cell tropism (42, 44), which might be randomly selected. This is substantiated by the fact that there were no FCS region deletions detected in SARS-CoV-2 propagated in TMPRSS2-expressing cells (42).Overall, we believe that human airway epithelial cells express ACE2 and TMPRSS2 (37, 45–47), which plays an important role during S protein priming and viral entry, and the virus entry is mediated by themembrane fusion pathway. However, from 2 out of the 11 HAE-ALI cultures tested in this study, the lack of suppression of the FCS region deletion was also found in apically released virions of theinfected HAE-ALI cultures made from KC19 and B15-20 donors. Previously, we discovered that theSARS-CoV-2 infection in HAE-ALI resulted in periodic recurrent replication peaks of progeny (37). Since the cleavage at the S2′ site by TMPRSS2necessitates the priming cleavage at S1/S2, the accumulation of FCS mutations in the progeny during theinfection in HAE-ALIB15-20 and HAE-ALIKC19 may limit infection of themut-del1 or mut-del2. Since the progeny virus was a pool mixed with WT and mutants, we did not observe a significant difference in apical virus release in HAE-ALIB15-20 and HAE-ALIKC19. In a future study, we will plaque purify the two FCS mutants and examine their infectivity in HAE-ALI derived from various donors, as well as their transmissibility and pathogenicity in animals. We speculate that epithelial cells from these two donors may express much less TMPRSS2, and therefore, the virus utilizes theTMPRSS2-independent and cathepsin-dependentendosomal entry pathway (42, 44, 48), which likely does not require the S cleavage at S1/S2 (43) and thus prefers replication of the FCS deletion mutants.SARS-CoV-2 infection is less studied in primary airway epithelial culture derived from individuals who already have critical pulmonary disease, such as COPD. COPD is a chronic inflammatory lung disease that causes obstructed airflow in the lung (49). The severity of COVID-19 in COPDpatients is in general worse (50, 51). Here, we did not observe an increased replication of SARS-CoV-2 in the four COPD HAE-ALI cultures. However, the FCS deletions were strongly suppressed. In the future, it is worthwhile to look into the retention of the FCS site in HAE-ALI cultures derived from donors of different sexes, age groups, and other chronic lung diseases, as well theexpression levels of the cellular proteases furin, TMPRSS2, and cathepsin in these HAE-ALI cultures.Importantly, the deletion of QTQTN (mut-del2) diminished SARS-CoV-2entry and infection in Vero-E6 cells (41). Furthermore, three FCS-related deletion mutants, ΔPRRA↓, ΔRAR↓SVAS, and ΔNSPRRAR↓SVA, have been shown to have reduced replication in vitro and lung disease in animal models (44, 52, 53), strongly supporting that the FCS is a virulence-related motif. Since the ΔQTQTN also abolished furin cleavage (41), we speculate that themut-del2 mutant should have reduced lung disease in animals as well. Since the FCS is a key motif related to virulence, it is important to investigate the natural occurrence rate of the FCS region deletions, possibility or limitation of their human-to-human transmission, as well as their pathogenicity. Several studies tried to screen the FCS region deletions frompatients-derived SARS-CoV-2. As discussed above, screening of 27 SARS-CoV-2-positive clinical specimens, including one specimen that had FCS deletions identified after passaging in Vero-E6 cells, failed to detect any FCS deletions (34). However, one study detected the 675QTQTN679-deleted mutants (mut-del2) in 3 of 68 SARS-CoV-2-positive clinical specimens (33). In another detection of 51 SARS-CoV-2-positivepatient specimens, although a high rate of 52.9% and 82.4% of the positive clinical samples contained the FCS upstreammotif (661ECDIPIGAG669) and the PRRA deletions, respectively, themutant population was at a very low level (0.33% ±1.17% for FCS upstreammotif deletion and 1.12% ±1.21% for PRRA deletion) (54), arguing for the infectivity and transmissibility of thesemutants. Notably, we detected a high rate of mut-del2 (7.86% and 20.97%) but not mut-del1 (<0.1%) in two SARS-CoV-2-positive nasopharyngeal aspirates. Thus, it is possible that mut-del1 is an artificial deletion from the inoculated virus cultured in Vero-E6 cells. Therefore, further studies need to be carried out to understand the significance of the prevalence of these FCS deletion mutants in the disease progression of COVID-19patients.Along with the usages of antibody drugs and the wide inoculation of the vaccine, which target theS protein, the virus may undergo further mutations under the pressure of human immune response. Supervision and screening themutations in theS protein gene in clinical specimens is extremely important to identify theescaped isolates which may increase or decrease infectivity and transmissibility. Apparently, the in vitro-polarized HAEmodel, which can facilitate long-terminfection of SARS-CoV-2, is an ideal model to study S genemutants under various conditions.
MATERIALS AND METHODS
Ethics statement.
Primary human bronchial epithelial cells were isolated from the lungs of healthy human donors and COPDpatients by the Cells and Tissue Core of the Center for Gene Therapy, University of Iowa, and by the Department of Internal Medicine, University of Kansas Medical Center with the approvals of the institutional review boards (IRB) of the University of Iowa and University of Kansas Medical Center, respectively.
Viruses.
SARS-CoV-2 (NR-52281), isolate USA-WA1/2020 (batch no. 70034262), was obtained from BEI Resources (Manassas, VA) and designated P0 passage. The virus used for infections of HAE-ALI was propagated once in Vero-E6 cells, designated P1 passage. Viruses were titrated by plaque assays on Vero-E6 cells and stored at −80°C as previously described (37). A biosafety protocol to work on SARS-CoV-2 infection in the biosafety level 3 (BSL3) lab was approved by the Institutional Biosafety Committee of the University of Kansas Medical Center.
HAE-ALI cultures.
Primary HAE-ALI cultures, lots of B2-20, B3-20, B4-20, B9-20, B15-20, and B16-20, were directly prepared from bronchial airway epithelial cells isolated from various healthy donors. They were obtained from the Cells and Tissue Core of the Center for Gene Therapy, University of Iowa and polarized in Transwell inserts (0.33 cm2; Costar, Corning, Tewksbury, WA). L209, KC19, and four COPD phenotypes (N707, UC24, L184, and N703) HAE-ALI cultures were prepared from propagated (passage 2) bronchial airway cells of the two healthy L209 and KC19 donors and four COPD donors. These six lung tissues were obtained from organ donors whose lungs were rejected for transplant and recovered for research by the Life Alliance Organ Recovery Agency at the University of Miami (Miami, Florida), Life Center Northwest (Seattle, Washington), and theMidwest Transplant Network (Kansas City, Kansas). Bronchial airway epithelia cells were polarized on Transwell inserts (1.1 cm2) (Costar; Corning). The HAE-ALI cultures that had transepithelial electrical resistance (TEER) of >1,000 Ω·cm2, determined with an epithelial volt-ohmmeter (MilliporeSigma, Burlington, MA), were used for infections.
Virus infections.
Polarized HAE-ALI cultures wereinfected with SARS-CoV-2 at a multiplicity of infection (MOI) of 0.2 or 2. The inoculum of 100 μl or 300 μl was apically applied to the 0.33-cm2 or 1.1-cm2 Transwell inserts with an incubation period of 1 h at 37°C and 5% CO2. After aspiration of the inoculum, the apical surface of the insert was washed with 100 μl (or 300 μl) of Dulbecco’s phosphate-buffered saline (D-PBS; Corning, Tewksbury, WA) three times to maximally remove the unbound viruses. The HAE-ALI cultures were then placed back into the incubator at 37°C and 5% CO2. To collect the apically released progeny frominfected cultures, 100 μl (or 300 μl) of D-PBS was added to the apical chamber for 30 min at 37°C and 5% CO2. Thereafter, the apical wash was pipetted carefully from the apical chamber.
Immunofluorescence assay.
Themembrane of theinfected HAE-ALI was cut out and fixed with 4% paraformaldehyde in phosphate-buffered saline (PBS) at 4°C overnight. The fixed membrane was washed in PBS for 5 min three times and then split into several pieces for whole-mount immunostaining. Following permeabilization with 0.2% Triton X-100 for 15 min at room temperature, the slide was incubated with a rabbit monoclonal anti-SARS-CoV-2nucleocapsid (NP) (catalog no. 40143-R001; SinoBiological US, Wayne, PA) at a dilution of 1:25 in PBS with 2% fetal bovine serum for 1 h at 37°C. After washing, the slide was incubated with a rhodamine-conjugated secondary antibody, followed by staining of the nuclei with DAPI (4′,6-diamidino-2-phenylindole).
RNA extraction.
For total RNA extraction, four Transwell inserts of HAE-ALI cultures were dissolved in 1 ml of TRIzol reagent (ThermoFisher, Waltham, MA), following manufacturer’s instructions. Viral RNA was isolated from the virions in apical washes. Fifty microliters of apical wash was used for theextraction of nuclease digestion-resistant viral RNA using the Quick-RNA Viral kit (catalog no. R1035; Zymo Research, Irvine, CA), as described previously (37). The final RNA samples were dissolved in 50 μl of deionized H2O and quantified for concentrations using a microplate reader (Synergy H; BioTek).
RNA-seq.
For viral transcriptome, total RNA was extracted from HAE-ALI cultures infected with SARS-CoV-2 at an MOI of 0.2 and 2, respectively, or mock infected at 4 dpi. After RNA quality control and reverse transcription, DNA nanoball sequencing (DNSeq) was performed at BGI Genomics (Cambridge, MA). Briefly, RNA samples were tested using an Agilent 2100 bioanalyzer (Agilent RNA 6000 Nano kit). Samples with an RNA integrity number (RIN) of ≥8.0 were chosen for library construction. rRNA was removed from the total RNA samples by using RNase H or Ribo-Zero method. Then, samples were fragmented in a fragmentation buffer for thermal fragmentation to 130 to 160 nucleotides (nt). First-strand cDNA was generated by First Strand Mix, then Second Strand Mix was added to synthesize the second-strand cDNA. The reaction product was purified by magnetic beads and end repaired by addition of adaptors, followed by several rounds of PCR amplification to enrich the cDNA fragments. The PCR products were then purified and subjected to library quality control on the Agilent Technologies 2100 bioanalyzer. The double-stranded PCR products were heat denatured and circularized by the splintoligonucleotide sequence. The single-strand circle DNA (ssCir DNA) was formatted as the final library. The final library was amplified with phi29 to make DNA nanoballs (DNBs), which havemore than 300 copies of onemolecule. TheDNBs were loaded into the patterned nanoarray and 2 × 100 paired-end reads were generated in the way of combinatorial probe-anchor synthesis (cPAS).For RNA-seq of the viral RNAs, the apical washes were collected frominfected HAE-ALI cultures at the indicated times (days postinfection [dpi]; Table 3), and viral RNA was extracted as described above. For library preparation, the stranded-RNA seq kit (Thermo Fisher) was used following themanufacturer’s protocol. The rRNA depletion step was added for the library preparation. The Illumina sequencer NextSeq550 was used to generate paired-end 2 × 150 reads at GeneGoCell Inc. (San Diego, CA).
PCR amplicon-seq.
For sequencing the FCS region of the S gene, viral RNA extracted from the apical washes was reverse transcribed using avian myeloblastosis virus (AMV) (Promega, Madison, WI). A 384-nt sequencecovering the S gene FCS region (nt 23,487 to 23,870) was amplified by 20 cycles of PCR using the primers containing the adaptor sequences: forward, 5′-ACA CTC TTT CCC TAC ACG CTC TTC CGA TCT TTT TCA AAC ACG TGC AGG C-3′, and reverse, 5′-GAC TGG AGT TCA GAC GTG TGC TCT TCC GAT CTT CCA GTT AAA GCA CGG TTT AAT-3′. The PCR products were analyzed on 1.5% agarose and excised for purification. The purified DNA samples were quantified on a microplate reader (Synergy LX; BioTek, Winooski, VT), and 500 ng of each DNA sample (20 ng/μl) was sent for PCR amplicon-seq (AMPLICON-EZ) at GENEWIZ, Inc. (South Plainfield, NJ).
Bioinformatic analyses. (i) Total cellular DNBseq data (BGI) and PCR-amplicon-seq (GENEWIZ).
The reads were aligned to the referenceSARS-CoV-2 Wuhan-Hu-1 isolate genome (GenBank accession no. MN908947) using BWA v0.7.5a-r405. Sequencing read coverage was calculated using bedtools genomecov of version 2.27.1. We used STAR (2.7.3a) to identify the junction-spanning reads as described previously except that we set theminimal size of deletions at 10 (15).
(ii) Viral RNA-seq data (GeneGoCell Inc.).
Raw sequence reads (fastq files) were processed through the following steps by the Genenius NGS bioinformatics pipeline (v2.1). Low-quality reads were removed using a quality score threshold of 25 (Q25). The resulting fastq files were analyzed by FastQC v0.10.1 for quality control (QC). Reads were aligned to the reference genome Wuhan-Hu-1. The alignment results were analyzed using the proprietary GeneGoCell program for variant calling on the target sites as follows. (i) Each read pair was processed to report the variant in the read. (ii) Each variant's allele frequency (AF) was calculated based on the number of variant reads/total reads covering the region (both variant and nonvariant). (iii) Variants with ≥1% AF and ≥3 variant reads were reported in a variant calling file (vcf). The output of the bioinformatics workflow was collected and further organized/processed in Microsoft Office 365.
Data availability.
All the RNA-seq and PCR amplicon-seq data have been deposited in NIH-sponsored BioProject database under accession number PRJNA698337 (https://dataview.ncbi.nlm.nih.gov/object/PRJNA698337?reviewer=k4gtr6eundj03jnpcrj62tq1un).
Authors: Christian Drosten; Stephan Günther; Wolfgang Preiser; Sylvie van der Werf; Hans-Reinhard Brodt; Stephan Becker; Holger Rabenau; Marcus Panning; Larissa Kolesnikova; Ron A M Fouchier; Annemarie Berger; Ana-Maria Burguière; Jindrich Cinatl; Markus Eickmann; Nicolas Escriou; Klaus Grywna; Stefanie Kramme; Jean-Claude Manuguerra; Stefanie Müller; Volker Rickerts; Martin Stürmer; Simon Vieth; Hans-Dieter Klenk; Albert D M E Osterhaus; Herbert Schmitz; Hans Wilhelm Doerr Journal: N Engl J Med Date: 2003-04-10 Impact factor: 91.245
Authors: Natacha S Ogando; Tim J Dalebout; Jessika C Zevenhoven-Dobbe; Ronald W A L Limpens; Yvonne van der Meer; Leon Caly; Julian Druce; Jutte J C de Vries; Marjolein Kikkert; Montserrat Bárcena; Igor Sidorov; Eric J Snijder Journal: J Gen Virol Date: 2020-09 Impact factor: 3.891
Authors: Daniel Butler; Christopher Mozsary; Cem Meydan; Jonathan Foox; Joel Rosiene; Alon Shaiber; David Danko; Ebrahim Afshinnekoo; Matthew MacKay; Fritz J Sedlazeck; Nikolay A Ivanov; Maria Sierra; Diana Pohle; Michael Zietz; Undina Gisladottir; Vijendra Ramlall; Evan T Sholle; Edward J Schenck; Craig D Westover; Ciaran Hassan; Krista Ryon; Benjamin Young; Chandrima Bhattacharya; Dianna L Ng; Andrea C Granados; Yale A Santos; Venice Servellita; Scot Federman; Phyllis Ruggiero; Arkarachai Fungtammasan; Chen-Shan Chin; Nathaniel M Pearson; Bradley W Langhorst; Nathan A Tanner; Youngmi Kim; Jason W Reeves; Tyler D Hether; Sarah E Warren; Michael Bailey; Justyna Gawrys; Dmitry Meleshko; Dong Xu; Mara Couto-Rodriguez; Dorottya Nagy-Szakal; Joseph Barrows; Heather Wells; Niamh B O'Hara; Jeffrey A Rosenfeld; Ying Chen; Peter A D Steel; Amos J Shemesh; Jenny Xiang; Jean Thierry-Mieg; Danielle Thierry-Mieg; Angelika Iftner; Daniela Bezdan; Elizabeth Sanchez; Thomas R Campion; John Sipley; Lin Cong; Arryn Craney; Priya Velu; Ari M Melnick; Sagi Shapira; Iman Hajirasouliha; Alain Borczuk; Thomas Iftner; Mirella Salvatore; Massimo Loda; Lars F Westblade; Melissa Cushing; Shixiu Wu; Shawn Levy; Charles Chiu; Robert E Schwartz; Nicholas Tatonetti; Hanna Rennert; Marcin Imielinski; Christopher E Mason Journal: Nat Commun Date: 2021-03-12 Impact factor: 14.919
Authors: Michelle N Vu; Kumari G Lokugamage; Jessica A Plante; Dionna Scharton; Aaron O Bailey; Stephanea Sotcheff; Daniele M Swetnam; Bryan A Johnson; Craig Schindewolf; R Elias Alvarado; Patricia A Crocquet-Valdes; Kari Debbink; Scott C Weaver; David H Walker; William K Russell; Andrew L Routh; Kenneth S Plante; Vineet D Menachery Journal: Proc Natl Acad Sci U S A Date: 2022-07-26 Impact factor: 12.779