James W Saville1, Alison M Berezuk1, Shanti S Srivastava1, Sriram Subramaniam1,2. 1. Department of Biochemistry and Molecular Biology, University of British Columbia, Vancouver, British Columbia, Canada, V6T 1Z3. 2. Gandeeva Therapeutics Inc., Vancouver, British Columbia, Canada, V5C 6N5.
Abstract
The global spread of SARS-CoV-2 has proceeded at an unprecedented rate. Remarkably, characterization of the virus using modern tools in structural biology has also progressed at exceptional speed. Advances in electron-based imaging techniques, combined with decades of foundational studies on related viruses, have enabled the research community to rapidly investigate structural aspects of the novel coronavirus from the level of individual viral proteins to imaging the whole virus in a native context. Here, we provide a detailed review of the structural biology and pathobiology of SARS-CoV-2 as it relates to all facets of the viral life cycle, including cell entry, replication, and three-dimensional (3D) packaging based on insights obtained from X-ray crystallography, cryo-electron tomography, and single-particle cryo-electron microscopy. The structural comparison between SARS-CoV-2 and the related earlier viruses SARS-CoV and MERS-CoV is a common thread throughout this review. We conclude by highlighting some of the outstanding unanswered structural questions and underscore areas that are under rapid current development such as the design of effective therapeutics that block viral infection.
As early as November 2019, initial reports surfaced describing patients presenting with
pneumonia-like symptoms in the Guangdong region of China, believed to be the origin of the
severe acute respiratory syndrome (SARS)-associated coronavirus (SARS-CoV) in
2003.[1−3] In late December 2019,
the Wuhan Municipal Health Commission reported a cluster of 27 cases of pneumonia and days
later identified a novel coronavirus—now named SARS-CoV-2—as the causative
agent of the disease now called COVID-19 (Figure 1).[4] Coronaviruses are enveloped, positive-strand RNA
viruses, with SARS-CoV-2 part of the β-coronavirus genus containing SARS-CoV and
MERS-CoV (the causative viruses of the 2003 SARS and 2012 MERS outbreaks,
respectively).[1−3,5] The
first COVID-19-associated death was reported in China on January 11, 2020, and the genetic
sequence of SARS-CoV-2 was published by the Global Initiative on Sharing All Influenza
Data (GISAID) the following day (Figure 1).[4,6] This
genetic sequence revealed that SARS-CoV-2 shares 79% sequence identity with
SARS-CoV.[6−8] On January 13, 2020 the
first case of COVID-19 outside of China was reported in Thailand, and the World Health
Organization (WHO) suggested evidence of “limited human-to-human
transmission” in a press briefing the following day.[4,9] In early February 2020, the first
COVID-19 death was reported outside of China, and countries worldwide began reporting
cases.[4] The WHO characterized COVID-19 as a pandemic on March 11,
2020 (Figure 1), and over the next few years,
SARS-CoV-2 would spread globally, infect over 6% of the global population, mutate into
more infectious variants of concern, and generally disrupt many aspects of daily life.
Figure 1
Timeline of the COVID-19 pandemic. Events are divided into general milestones (blue),
variants of concern (red), and vaccine and therapeutic developments (green). The
number of global COVID-19-associated deaths (gray) and vaccine doses administered
(green) are graphed per month over the course of the COVID-19 pandemic (ref (10)).
Many diverse therapeutic avenues have been employed to treat COVID-19. Small molecules,
convalescent plasma, and biologics (monoclonal antibodies, human recombinant ACE2, and
peptides) have all been successfully used as COVID-19 treatments.[11−13] However, the majority of these therapeutic regimens are only
implemented in mild-to-severe hospitalized cases as measures of treatment following
SARS-CoV-2 infection. Relatively early on in the progression of this pandemic, it was
clear that population-level vaccination was the most promising prophylactic approach to
slow the spread of SARS-CoV-2.[14−16] As such, at
the time of writing, there are 194 and 149 vaccines in preclinical and clinical
development, respectively (Figure 2).[17] Development and approval of SARS-CoV-2 vaccines are
proceeding at an unprecedented rate, afforded by multiple factors.[14,15,18] First,
ongoing fundamental research on vaccine development and characterization of pathogens has
provided tools and knowledge that could be leveraged to begin vaccine development
rapidly.[14,15,19] Second, improvements in preclinical and
clinical scientific throughput have further accelerated the development rate. Third, the
overwhelming widespread need for these vaccines has pushed governmental and conglomerate
regulatory agencies to speed their evaluation processes. Finally, the 10 vaccines in phase
4 clinical trials are composed of vastly different vaccine technologies, including nucleic
acids, whole virus, and viral vectors, thus permitting a multiangled
approach.[19,20]
These accelerating factors have combined to reduce the typical vaccine-development
timeline from 5–10 years to under 1 year for the COVID-19 pandemic.[20]
Figure 2
Summary of SARS-CoV-2 vaccines in development. The 10 vaccines in phase 4 are the
following: nucleic acids mRNA-1273 (Moderna), BNT162 (Pfizer/BioNTech), and
mRNA-1273.351 (Moderna); viral vectors ChAdOx1 (AstraZeneca/Oxford), Ad5-nCoV (CanSino
Biologics), and JNJ-78436735 (Johnson & Johnson); protein-based MVC-COV1901
(Medigen); whole virus CoronaVac (Sinovac), BBIBP-CorV (Beijing Institute of
Biological Products), and BIBP (Sinopharm) (refs (17 and 20)). Adapted with
permission from ref (20). Copyright 2022 Gavi,
the Vaccine Alliance.
Genomic surveillance of SARS-CoV-2 samples during the first year of the COVID-19 pandemic
revealed limited mutation.[7,8,21] The D614G mutation in the spike (S) protein was the sole
widespread consensus mutation, with the G614 genotype largely displacing D614 in March
2020 (Figure 3).[7,8,21] In November 2020,
however, the emergence of the Alpha (B.1.1.7) variant began capturing global headlines and
coincided with a surge in COVID-19 cases in the United Kingdom. Within 4 months, the Alpha
variant became the most frequently sequenced SARS-CoV-2 lineage worldwide (Figure 3).[7,8] Emergence of the Alpha lineage was quickly followed by the emergence
of the Beta (B.1.351), Gamma (P.1), Delta (B.1.617.2), Kappa (B.1.617.1), and Epsilon
(B.1.429) variants in early 2021 (Figure 3). Most
of these variants were classified as variants of concern (VoCs) by the WHO, demonstrating
(a) increased transmissibility or detrimental changes in COVID-19 epidemiology, (b)
increased virulence or changes in clinical disease presentation, and/or (c) decreased
effectiveness of public health and social measures or available diagnostics, vaccines, and
therapeutics.[22] Finally, in late 2021, the Omicron (B.1.1.529)
variant—which contained an unprecedented number of mutations—rapidly
supplanted the Delta variant as the most sequenced variant worldwide. Given that most
current SARS-CoV-2 vaccine immunogens and testing reagents are based on the original
Wuhan-1 reference sequence, the mutations present in emergent VoCs warrant urgent
investigation to assess their consequences on vaccine efficacy and the SARS-CoV-2 life
cycle.[7,8]
Figure 3
Emergence and global prevalence of the D614G and variant of concern lineages of
SARS-CoV-2. Sequence data was downloaded from the Global Initiative on Sharing All
Influenza Data (GISAID) and graphed as weekly totals (refs (7 and 8)).
D614 and G614 genotype prevalence is shown from January to September 2020, and variant
of concern lineage prevalence is shown from September 2020 to April 2022.
Structural Biology Perspective of SARS-CoV-2
Scientists have never before been so readily positioned to quickly answer critical
questions about the three-dimensional (3D) arrangement of emergent viruses. As highlighted
in the above section, the COVID-19 pandemic has evoked unprecedented speeds of response as
researchers aim to understand and combat the spread of the virus. Our cumulative knowledge
and ever-advancing scientific toolbox—garnered by studying other recent viral
pandemics—has fueled the swift unravelling of the SARS-CoV-2 viral life cycle.
Recent epidemics such as MERS (2012–2015), Ebola (2014–2016), Zika
(2015–2016), and Dengue (2019–2020) have prepared us to quickly gain an
understanding of emergent viruses. Joachim Frank highlights how these pandemics have
coincided with the “resolution revolution” in cryo-electron tomography
(cryo-ET) and cryo-electron microscopy (cryo-EM) imaging (afforded largely by improved
electron detectors), allowing researchers to more finely resolve the arrangement of these
virions and the proteins that compose them.[23]
No single imaging technique paints a full picture of SARS-CoV-2 biology given the vast
biological size scale across which viral infection, replication, and packaging take
place. Rather, the aggregation of results across the imaging size spectrum allows for a
more comprehensive characterization. Cryo-ET involves the flash-freezing of a biospecimen
(a virus, cell, tissue, or protein), imaging the sample through a tilt-series using an
electron microscope, and finally aligning and merging the images using computational
techniques to reconstruct a 3D image.[24] Cryo-ET provided the first
images of the SARS-CoV-2 virion and how the S and nucleocapsid (N) proteins were arranged
on the viral surface and within the viral lumen, respectively. Similar to cryo-ET,
single-particle cryo-EM generally involves the flash-freezing of a biospecimen (individual
proteins or protein complexes), collecting images or movies of the vitrified sample using
an electron microscope, and aligning and merging the images to produce a 3D image.[25] Cryo-EM provided the first structures of the spike glycoprotein and
of the actively replicating SARS-CoV-2 RNA polymerase complex.[26,27] As there exists a theoretical minimum
protein-size limit for high-resolution cryo-EM (∼40 kDa), under which many of the
nonstructural SARS-CoV-2 proteins lie,[28,29] X-ray crystallography has been used extensively to
determine the structures of these small (<40 kDa) crystallizable proteins. Here,
isolated protein crystals are exposed to an incident X-ray beam, and the resulting
diffraction pattern is converted into a 3D electron density map using Fourier transforms.[30] The combination of these three structural techniques (cryo-ET, cryo-EM,
and X-ray crystallography), each with its inherent strengths and limitations, has
rapidly yielded a 3D understanding of SARS-CoV-2, from atomic details of viral replication
machinery to entire viral particles being packaged and trafficked within a
human cell.
The first structure of a SARS-CoV-2 protein, the main protease (Mpro/NSP5),
was reported mere weeks after the global sharing of the viral sequence (Figure 4A).[31] This initial atomic
structure—in complex with an inhibitor—permitted the rational design of
improved and specific protease inhibitors that may help to treat
COVID-19.[31−33] Additionally, the
reporting of structural impacts imparted by the many mutations within VoC S proteins
provides us with insights into how they may affect vaccine efficacy. These insights are
particularly important given the global dominance of VoCs and the fact that the majority
of vaccines in development use the S protein as their sole protein
immunogen.[7,8,17] The Mpro and spike proteins are just two examples of how
structural biology is being employed to help researchers gain a better understanding of
SARS-CoV-2.
Figure 4
Summary of progress toward SARS-CoV-2 structural characterization. (A) Schematic of
the SARS-CoV-2 genome with structurally characterized proteins indicated in full
color. Proteins that have not yet been characterized are displayed through homology
modeling and are shown as semitransparent images (NSP4/6/12, E protein, M protein).
Protein illustrations were generated using Illustrate (ref (34)). (B) The number of X-ray crystallography,
cryo-EM, and all SARS-CoV-2 protein structures deposited into the RCSB protein data
bank (PDB) over the first 18 months of the COVID-19 pandemic (ref (35)).
Scope and Organization of This Review
Researchers worldwide have worked at great pace to unravel the 3D architecture of the
SARS-CoV-2 virion and the atomic arrangement of its proteome. From depositing the first
structure of the main protease (6LU7), just 2 weeks following the global sharing of the SARS-CoV-2 genome, to
the first S protein structure deposited only 2 weeks later (6VSB), the structural characterization of SARS-CoV-2 has
developed at a truly unprecedented rate.[26,31] Herein, we aim to summarize these results in the context
of how they inform our understanding of the SARS-CoV-2 life cycle.
The driving theme of this review is the 3D visualization of the SARS-CoV-2 life cycle,
including how the arrangement of the virion, cell entry, replication, packaging, and
release are orchestrated in 3D space. We strive to integrate results across the 3D
visualization spectrum (X-ray crystallography, single-particle cryo-EM, cryo-ET, and
molecular dynamics) and describe both a general overview and specific interesting themes
throughout. Structural differences between SARS-CoV-2 and other previously emerged viruses
(SARS-CoV, MERS-CoV) will additionally be highlighted. While fundamental chemical,
molecular, clinical, and cellular biology have each provided crucial information in our
characterization of this virus, this review will not delve deeply into these topics;
rather, we will use the findings of these fields to contextualize the structural biology
described herein.
This review is arranged into three main sections, each providing both a broad overview
and specific insights into various processes within the SARS-CoV-2 life cycle. The initial
section (Structure of the Virion) describes the 3D architecture of
the viral particle and largely summarizes results obtained from cryo-ET. Specific themes
in this section include the arrangement and conformations of the spike glycoprotein on the
surface of the virion and how the nucleocapsids are organized within the viral particle.
The next section (Entry into the Cell) examines the molecular
mechanisms by which SARS-CoV-2 transits the plasma membrane to deposit its genome into
host cells. Specific themes in this section include the conservation of glycosylation
across coronavirus spike proteins (SARS-CoV-2, MERS-CoV, and SARS-CoV) and the spike
protein–ACE2 interaction. The following section (Replication,
Packaging, and Release) compiles the numerous structures of proteins encoded by
the SARS-CoV-2 genome and how they come together to orchestrate replication, packaging,
and release of the viral particles. Here, X-ray crystallography and cryo-EM structures of
individual and complexed proteins and cryo-ET imaging of the packaging process in
vivo combine to visualize these complicated final steps in the viral
replication cycle. Our final section (Future Prospects) aims to
highlight yet-to-be-addressed areas of SARS-CoV-2 pathobiology that warrant continued
structural investigation.
Structure of the Virion
Brief Overview of the SARS-CoV-2 Viral Structure
Like all coronaviruses, the SARS-CoV-2 viral particle is composed of proteins, nucleic
acids, and lipids that are assembled within host cells.[36] The viral
envelope is derived from the membrane of the endoplasmic reticulum and is studded with
membrane (M), envelope (E), and S structural proteins (Figure 5). M is the most abundant envelope protein in coronaviruses and is
a critical structural component that facilitates budding and defines the shape of the
viral particle.[37,38] As
such, the M protein is considered the central organizer of the viral envelope, as it
contacts and coordinates all other structural proteins (E, S, and N).[39,40] The S protein facilitates both
attachment to and entry into host cells by binding the angiotensin-converting enzyme 2
(ACE2) receptor, as covered in greater detail in the next section (Entry
into the Cell). E is the smallest of the structural proteins with crucial, yet
currently ill-defined, mechanistic roles.[41] As a viroporin, E is a
hydrophobic protein that oligomerizes in the membrane of host cells, forming hydrophilic
pores that precipitate membrane remodelling and viral packaging.[42]
Recombinant coronaviruses lacking the E protein exhibit hampered viral titers and yield
propagation-incompetent progeny, demonstrating the critical nature of this structural
protein in viral replication.[41,43−45] These three structural proteins (M, S, and E) define the viral
envelope which encapsulates an ∼30 kb viral genome. The SARS-CoV-2 genome is
composed of positive-sense single-stranded RNA (ssRNA) that associates with hundreds of
copies of the fourth and final structural protein, the N protein (Figure 5).[46,47] The ribonucleoprotein (RNP) complex of ssRNA and N protein compresses
and packages the viral genome within the virion, coordinated by interactions between the N
and M proteins (covered in greater depth in the section Replication, Packaging, and
Release).[40,48−51] This brief overview of
SARS-CoV-2 structural proteins leverages decades-long investigations of related
coronaviruses (SARS-CoV and MERS-CoV) and is reviewed in greater depth by Schoeman and
Fielding and Mariano et al.[41,52] Herein, we highlight several recent studies that report in
situ evidence for the overall 3D arrangement of the SARS-CoV-2 virion.
Figure 5
Three-dimensional model of a coronavirus particle. Membrane (M), spike (S), envelope
(E), and nucleocapsid (N) structural proteins are shown. Models for E and M proteins
were obtained from https://sars3d.com/ and were
manually (not experimentally) arranged on the surface of a 3D model of the virion
rendered using EMD-30430. Adapted with permission from ref (47). Copyright 2020 Elsevier.
Spike Protein Distribution within the Viral Envelope
Multiple groups have leveraged cryo-ET to uncover the overall arrangement of authentic
SARS-CoV-2 viral particles.[47,53,54] Ke et al. reported roughly spherical
particles, with an average outer diameter of 91 ± 11 nm, while Yao et al. reported
both spherical and ellipsoidal shaped viruses, with dimensions of 64.8 ± 11.8, 85.9
± 9.4, and 96.6 ± 11.8 nm for the short, medium, and the long axes of the
ellipsoid envelope, respectively (Figure 6A).[47,53]
Both of these results are consistent with the diameter of the SARS-CoV virion (∼85
nm), which was determined in 2008 by Neuman et al., also by cryo-ET methods.[40] Each SARS-CoV-2 virion contains roughly 15–40 S proteins randomly
distributed across the viral surface, with the vast majority (97%) of S proteins adopting
the prefusion conformation (Figure 6, parts B and
C).[53] Notably, SARS-CoV was previously determined to have roughly 90
S proteins per virion, with fewer S proteins potentially providing a viral fitness
advantage given the immune susceptibility of this protein in viral neutralization.[40] Modeling of intact SARS-CoV-2 virions indicates approximately one
spike protein per 1000 nm² of membrane surface, in contrast to approximately
one hemagglutinin per 100 nm² for the influenza A virus.[55]
This ∼10-fold lower S protein density suggests that S protein–ACE2
receptor binding may be less dependent on avidity effects than hemagglutinin binding is in influenza A. This
finding is additionally consistent with the nanomolar and millimolar affinities for the S
protein–ACE2 (SARS-CoV-2) and hemagglutinin–sialic acid (influenza A)
interactions, respectively.[53] A further cryo-ET study by Liu et al.
employed β-propiolactone-inactivated SARS-CoV-2 viruses and found that this chemical
inactivation drastically shifted the spike proteins toward the postfusion conformation
(∼74%), thus altering their antigenic profile.[56] This finding is
particularly relevant as chemical inactivation of pathogens is one of the most common
vaccine strategies, with β-propiolactone used in current SARS-CoV-2 vaccine
formulations.[56−59]
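The spike counts and surface densities quoted above can be cross-checked with simple geometry. The following is a minimal sketch, not a calculation taken from the cited studies: it assumes the virion is a smooth sphere with the 91 nm mean outer diameter reported by Ke et al., and the function and variable names are illustrative only.

```python
import math


def sphere_surface_area_nm2(diameter_nm: float) -> float:
    """Surface area of a sphere (pi * d^2), in nm^2."""
    return math.pi * diameter_nm ** 2


# Values quoted in the text above; the spherical-envelope model is an assumption.
virion_diameter_nm = 91.0        # mean outer diameter (Ke et al.)
area_per_spike_nm2 = 1000.0      # ~1 S protein per 1000 nm^2 of membrane
area_per_ha_nm2 = 100.0          # ~1 hemagglutinin per 100 nm^2 (influenza A)

area_nm2 = sphere_surface_area_nm2(virion_diameter_nm)
spikes_per_virion = area_nm2 / area_per_spike_nm2

# Ratio of hemagglutinin density to S protein density (copies per unit area).
density_ratio = area_per_spike_nm2 / area_per_ha_nm2

print(f"Envelope surface area: {area_nm2:.0f} nm^2")
print(f"Estimated S proteins per virion: {spikes_per_virion:.0f}")
print(f"Hemagglutinin density relative to S protein: {density_ratio:.0f}x")
```

Under these assumptions the estimate is roughly 26 S proteins per virion, which falls within the 15–40 range quoted above, and the two surface densities differ by the stated factor of ∼10. Ellipsoidal virions or a different choice of reference diameter would shift the estimate.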
Figure 6
Spike protein distribution, conformations, and tilt angles in authentic SARS-CoV-2
virions. (A) Tomographic slices of four representative SARS-CoV-2 virions and side
projections of three individual S proteins. (B) Three-dimensional model of a single
SARS-CoV-2 virion derived from subtomogram averaging. Prefusion S proteins are colored
in blue with up RBDs colored pink. Postfusion S protein densities are colored in
orange. (C) Prefusion and postfusion S protein trimer densities obtained by
subtomogram averaging and fitted with PDBs 6VXX and 6XRA,
respectively. (D) Prefusion trimer conformations as observed on intact virions. The
densities corresponding to three closed, one open, and two open RBDs are fitted with
PDBs 6VXX, 6VYB, and 6X2B, respectively, with protomers containing up RBDs
colored in blue. (E) Averaging of trimer subsets is shown for pools centered at
0°, 30°, and 60° from the normal, as well as for two rotations of the S
protein relative to the tilt direction. Adapted with permission from ref (53). Copyright 2020 Ke et al. http://creativecommons.org/licenses/by/4.0/.
These cryo-ET studies provide moderate resolution (7–8 Å) 3D reconstructions
of pre- and postfusion S proteins. These resolutions enable the assignment of receptor
binding domain (RBD) “open” versus “closed” states and
in situ validation of higher resolution soluble ectodomain structures.
Only recently was the first structure of the SARS-CoV S protein reported, which revealed
the requirement for the RBD to adopt an open or “up” conformation before
engaging the ACE2 receptor.[60] The in situ cryo-ET
classification of SARS-CoV-2 S proteins revealed three distinct states: (1) all three RBDs
in the closed or “down” conformation, (2) one RBD in the open conformation,
and (3) a small fraction with two RBDs in the open conformation, which has also been
observed in various ectodomain structures (Figure 6D).[53] Overall, these in situ S protein
structures were found to be very structurally similar to soluble and recombinantly
produced ectodomain structures. Three hinge points present in the stalk of the prefusion S
protein afford a high degree of flexibility relative to the viral membrane, and
accordingly, S proteins were found to adopt a wide range of tilt angles, with a mode of
40° from normal (Figure 6E).[47,54]
Interestingly, this flexibility was not observed in postfusion S proteins, and it has
therefore been proposed that rigid postfusion S proteins anchor the viral particle into
the cell membrane, while unbound flexible prefusion S proteins are able to
“scan” and bind additional ACE2 receptors, therefore contributing to avidity
effects.[54]
Packing of the Viral Contents
A well-defined molecular model for how coronaviruses compress ∼30 kb RNA genomes
into an 80 nm diameter viral lumen remained elusive prior to the emergence of SARS-CoV-2.
The N protein binds genomic RNA to form the RNP core and is the basic unit of genome
packing.[46,47,61,62] X-ray crystal structures of the individual
structured domains of the N protein were published months after the emergence of
SARS-CoV-2 and defined the location of the viral RNA’s binding site. These
structures revealed relatively conserved N protein structures compared to other reported
coronaviral N proteins and defined the electrostatic interactions between RNA and the N
protein N-terminal domain (NTD).[62] The structure of the C-terminal
domain (CTD) of the N protein showed that it dimerizes in a highly conserved manner
relative to SARS-CoV and tetramerizes through a conserved spacer domain.[51] The synthesis of these domain-specific structural insights describes a
protein that binds RNA at its NTD and oligomerizes to facilitate packing via its CTD.
Notably, these in vitro studies were likely complicated by the
recently demonstrated tendency of the SARS-CoV-2 N protein to undergo liquid–liquid
phase separation following RNA binding.[48−50,63,64]
These in vitro X-ray crystallography observations were further
corroborated by in situ cryo-ET studies of authentic SARS-CoV-2 viruses,
wherein visualization of higher order RNP packing is possible (Figure 7A).[47] A majority of RNPs were found to be
membrane-proximal, which is consistent with the previously described interaction between N
and M proteins.[46,47,61,62] Following 3D refinement, two
distinct RNP ultrastructure assemblies emerged; the first is a membrane-proximal
“hexon” assembly in the shape of “eggs in a nest”, and the
second is a membrane-free “tetrahedron” assembly in the shape of a
“pyramid” (Figure 7B).[47] A portion of the hexon-assembled RNPs simultaneously participate in
tetrahedron assemblies, suggesting that the RNP pyramid is the fundamental genome packing
unit in SARS-CoV-2. Backprojection of these distinct RNP assemblies onto their viral
coordinates reveals a 2-fold increase in the proportion of pyramid assemblies in ellipsoid
virions compared to spherical virions (Figure 7C).[47] Whether differential RNP assemblies play a driving
role in viral morphology, or whether virion shape simply favors different RNP assemblies,
remains to be uncovered.
Figure 7
Packing of the ribonucleoprotein (RNP) complex within spherical and ellipsoidal
SARS-CoV-2 particles. (A) Representative tomogram slices (5 Å thick) of spherical
and ellipsoid viral particles. RNPs are visible as granular densities within the viral
lumen. (B) Hexon and pyramid in situ ultrastructure reconstructions
of the RNP. There was an approximately 2-fold increase in pyramid RNP reconstructions
in ellipsoid viruses compared to spherical viruses. (C) Representative RNP packing
arrangements in spherical and ellipsoid SARS-CoV-2 virions. Adapted with permission
from ref (47). Copyright 2020 Elsevier.
Entry into the Cell
Overview of SARS-CoV-2 Cell Entry
Provided a viral particle has evaded innate and adaptive immune responses, the first step
in viral infection is attachment and subsequent entry into host cells. Cellular attachment
by the SARS-CoV-2 viral particle exploits the same receptor, ACE2, as the related
coronaviruses SARS-CoV and HCoV-NL63 (Figure 8A).[65−67] The viral particle binds
the ACE2 receptor via the RBD of its S protein, with the first structures of this
interaction reported by Shang et al. and Lan et al. in March 2020.[68−70] The S protein is composed of an N-terminal S1 subunit that mediates
cell attachment and a C-terminal S2 subunit which facilitates fusion of the viral and host
membranes (Figure 8B).[69,71,72] During
expression and processing of the S protein, it is cleaved at the S1/S2 boundary, but the
domains remain associated through noncovalent interactions. Upon binding ACE2, the
SARS-CoV-2 viral genome may transit the cellular plasma membrane by two distinct entry
mechanisms: (1) the membrane fusion pathway (early pathway) or (2) the endocytosis pathway
(late pathway) (Figure 8C; comprehensively
reviewed by Tang et al. and others).[73,74] The membrane fusion pathway involves cleavage of the S
protein at its S2′ boundary by the TMPRSS2 host protease, followed by dissociation
of the S1 subunit, leaving S2 exposed. Through a highly conserved (across the coronavirus
family and in the HIV gp41 protein), yet structurally uncharacterized, mechanism the S2
domain unfolds to adopt its postfusion conformation and extends into the host-cell plasma
membrane.[71,75,76] The postfusion S protein then ratchets the two membranes together,
again by a structurally uncharacterized process that results in membrane fusion (Figure 8B). In contrast, the late endocytosis pathway
does not rely upon TMPRSS2 cleavage and exploits the host cells’ innate
endocytosis process to transit the plasma membrane (Figure 8C). Prolonged attachment of the virus at the exterior of the cell
triggers receptor-mediated endocytosis of the viral particle. The particle is invaginated
into an endosome, wherein acidification activates cathepsin L and other host proteases,
which cleave at the S protein S2′ site, resulting in endosome–viral membrane
fusion.[74] These two distinct cell entry pathways both result in
deposition of the SARS-CoV-2 genome into the cytosol to precipitate viral replication.
Figure 8
Overview of coronavirus cell-entry mechanisms. (A) Members of the α- and
β-coronavirus genera and their major associated cellular receptors. (B) Model of
coronavirus receptor-mediated membrane-fusion mechanism between viral and cellular
membranes. Adapted from ref (72). Copyright
2018 Xia et al. http://creativecommons.org/licenses/by/4.0/. (C) Endocytosis (a) and membrane
fusion (b) pathways of coronavirus cell entry. Created with BioRender.
S Protein–ACE2 Interaction
As described above, both the membrane fusion and endocytosis pathways of SARS-CoV-2
cellular infection critically rely upon association of the S protein with the ACE2
receptor. This interaction was first predicted in January 2020 by Wan et al., wherein the
authors leveraged decade-long structural studies of SARS-CoV.[77] Using
homology modeling to gain insights into the yet to be validated SARS-CoV-2 S
protein–ACE2 interaction, the authors emphasized that the N501 residue was not
ideal for binding human ACE2 and that “2019-nCoV [SARS-CoV-2] evolution in patients
should be closely monitored for the emergence of novel mutations at the 501
position”. Therefore, early structural biology insights enabled the prediction of a
mutation—N501Y—that would replace N501 as the dominantly sequenced genotype
over a year later (January 2021, see the Introduction, Figure ).[7,8] Following this prediction, experimental evidence by
Hoffmann et al. and others confirmed that SARS-CoV-2 cell entry is dependent upon
ACE2.[78−80] Additionally, recent
reports have found overexpression of specific lectins (DC-/L-SIGN and SIGLEC1) to enhance
SARS-CoV-2 infectivity, potentially implicating the heavily glycosylated S protein NTD in
viral particle attachment.[81,82]
Following the structurally predicted and experimentally validated SARS-CoV-2 S
protein–ACE2 interaction, Lan et al. and Yan et al. solved structures of the ACE2
receptor in complex with the S protein RBD by X-ray crystallography and cryo-EM methods,
respectively.[69,70]
The structure by Yan et al. included the ACE2-associated B0AT1 protein and
suggested that two S protein trimers are able to simultaneously bind the ACE2 homodimer
(Figure 9A).[69] Both studies
discussed the SARS-CoV-2 RBD–ACE2 interaction in the context of the SARS-CoV
interaction and concluded that the interaction is structurally similar; however, several
small sequence and conformational variations are present in the respective ACE2
interfaces. The higher resolution X-ray structure allowed for the conclusion that there
are subtle rearrangements within the SARS-CoV-2 receptor binding motif (RBM; the portion
of the RBD that forms the interface with ACE2) that cause the RBM ridge to become more
compact and form better contacts with the N-terminal helix of ACE2 (circled in Figure 9B).[70] The synthesis of
structural and biochemical data reported by these groups and others revealed that the
SARS-CoV-2 RBD recognizes and binds ACE2 better than the SARS-CoV
RBD.[26,68−70,80] The interplay of these two structures, the X-ray structure
illuminating subtle rearrangements at the ACE2-RBM interface and the cryo-EM structure
yielding a more global picture of S protein binding relative to the membrane plane,
combined to provide an early understanding of how SARS-CoV-2 particles attach to our
cells. These structures additionally provided the basis for structure-based rational
design of neutralizing binders with enhanced affinities to either ACE2 or the S
protein.
Figure 9
Structural insights into the SARS-CoV-2 S protein–ACE2 interaction. (A)
Cryo-EM structure of the SARS-CoV-2 RBD–ACE2–B0AT1 protein
complex reported by Yan et al. (6M17). The complex is shown as a colorized ribbon model and molecular
surface with the RBD, ACE2, and B0AT1 shown in red, blue, and green,
respectively. (B) Superposition of ACE2-complexed SARS-CoV (2AJF, brown) and SARS-CoV-2 (6M0J, purple) RBDs aligned by the RBD
(refs (70 and 93)). The ACE2 structure for the SARS-CoV-2 complex is
shown alone to simplify the RBD–ACE2 interface. The major structural
discrepancy between the SARS-CoV and SARS-CoV-2 RBDs is circled with a black dotted
line. The side chains of residues mutated in variants of concern (prior to the Omicron
variant) are shown and labeled in red. (C) The same as in panel B, but mutated
residues are shown for the Omicron BA.2 variant.
Mutations within the S protein have been the major focus of the structural
characterization of emergent SARS-CoV-2 variants, with several RBD mutations shared
between multiple VoCs (Alpha, Beta, Gamma, Delta, and Omicron). Figure 9B
demonstrates that these VoC mutations localize to the
RBD–ACE2 interface and therefore may elicit effects on ACE2 binding affinity.
Indeed, the combination of these mutations (N501Y, E484K/Q, L452R, T478K) has been
biochemically implicated in increasing ACE2 binding affinity.[83−88] The N501Y
mutation was structurally demonstrated to insert into a cavity at the ACE2 binding
interface and form a perpendicular π–π stacking interaction with
Y41.[83] This additional interaction likely underlies the increased
ACE2 affinity afforded by the N501Y mutation and rationalizes its presence in the Alpha,
Beta, Gamma, and Omicron VoCs. In contrast to these mutations that enhance ACE2 affinity,
the substitution of residue K417 with either T or N (K417T/N) uniquely decreases the
ACE2 binding affinity.[84,85,89] Accordingly, the K417N/T mutations are not significantly
prevalent in the absence of N501 or E484 mutations, which likely compensate for the loss
in ACE2 affinity.[7,8]
The structural rationale for this decreased ACE2 affinity by K417N/T is the loss of the
K417–D30 salt bridge that spans the ACE2–S protein complex. K417N potently
escapes neutralizing antibodies, justifying its inclusion in these variants, despite the
imparted penalty on ACE2 affinity. The recently emerged Omicron variant contains over 3
times the number of S protein mutations relative to any other previously emerged VoC, with
many mutations localizing to the RBD–ACE2 interface (Figure 9C).[90] These numerous mutations again balance
ACE2 binding affinity and afford unprecedented escape from convalescent and
vaccine-induced antibodies, likely rationalizing the rapid replacement of the Delta
variant by Omicron in late 2021.[90]
Mutations elsewhere within the S protein, such as the omnipresent D614G mutation and the
A570D and S982A mutations in the Alpha variant, have also been implicated in increasing
ACE2 affinity through allosteric mechanisms including influencing the propensity of the
RBD to occupy the up or open conformation.[91,92] Additionally, these variants are defined by mutations
within other viral proteins which have been superficially structurally characterized
relative to the S protein mutations. The antigenic dominance of the S protein and its
inclusion as the sole protein antigenic component of all approved COVID-19 vaccines
rationalize this hyperfocus on S protein mutations.
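The substitution shorthand used in this section (e.g., N501Y, K417N/T) encodes the wild-type residue, the residue position, and one or more replacement residues. As a minimal sketch, assuming only that notation, the following Python snippet parses such labels and tabulates the direction of the ACE2-affinity effect described above. The names `Substitution`, `parse_substitution`, and `ACE2_AFFINITY_EFFECT` are hypothetical and introduced purely for illustration; the table restates the qualitative summary in the text and carries no quantitative binding data.

```python
import re
from typing import NamedTuple


class Substitution(NamedTuple):
    """Hypothetical container for one substitution label."""
    wild_type: str   # one-letter code of the original residue
    position: int    # residue number within the S protein
    variants: tuple  # one-letter codes of the replacement residue(s)


# Wild-type letter, residue number, then one or more '/'-separated letters.
_LABEL = re.compile(r"^([A-Z])(\d+)([A-Z](?:/[A-Z])*)$")


def parse_substitution(label: str) -> Substitution:
    """Parse shorthand such as 'N501Y' or 'K417N/T'."""
    match = _LABEL.match(label.replace(" ", ""))
    if match is None:
        raise ValueError(f"unrecognized substitution label: {label!r}")
    wild_type, position, variants = match.groups()
    return Substitution(wild_type, int(position), tuple(variants.split("/")))


# Direction of the reported effect on ACE2 binding, as summarized in the text.
ACE2_AFFINITY_EFFECT = {
    "N501Y": "increase",
    "E484K/Q": "increase",
    "L452R": "increase",
    "T478K": "increase",
    "K417N/T": "decrease",
}

for label, effect in ACE2_AFFINITY_EFFECT.items():
    sub = parse_substitution(label)
    print(f"{sub.wild_type}{sub.position} -> {'/'.join(sub.variants)}: {effect}")
```

Keeping the parsed position as an integer makes it straightforward to group substitutions by site, for example to check which variants carry changes at both residue 417 and residue 501.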
Glycosylation of the S Protein
Complementary structural and mass spectrometry analyses of the SARS-CoV-2 S protein
confirmed that, like the related proteins from SARS-CoV and MERS-CoV viruses, the
SARS-CoV-2 S protein is also extensively glycosylated.[94,95] Glycosylation has a myriad of roles in viral
pathobiology including shielding vulnerable neutralizing epitopes, shaping viral tropism,
and mediating S protein folding and stability.[94−100] Figure 10A shows that there are 22, 22, and 23
glycosylation sites in the SARS-CoV-2, SARS-CoV, and MERS-CoV S proteins, respectively,
with glycosylation preferentially localizing to the NTD, S1/S2 boundary, and stem helix of
the S2 fusion domain.[94,95] Using the first reported structure of the SARS-CoV-2 spike protein
(6VSB), Grant et al. generated 3D
structures of the S protein glycoforms and subjected them to molecular dynamics (MD)
simulations to determine the antibody-accessible surface area.[101]
Despite only accounting for 17% of the molecular weight of the S trimer, Grant et al.
found that the glycans shield approximately 40% of the S protein surface (Figure 10B). The most exposed protein epitope
comprises the ACE2 receptor site of the RBD in the up or open conformation (indicated by
the blue circle in Figure 10B). The RBD has been
demonstrated to present the antigen against which the vast majority (∼90%) of
patient-derived neutralizing antibodies bind.[102] Therefore, the
exposure of the RBD in the up position due to lack of glycan shielding is likely a
vulnerability necessitated by the crucial requirement for the S protein to bind the ACE2
receptor. Taking glycan microheterogeneity into account, the authors further conclude that
variations in glycan identity may affect local structural fluctuation at either the
protein or glycan level, which may influence S protein function and stability.[101]
Figure 10
Glycosylation of the SARS-CoV-2, SARS-CoV, and MERS-CoV spike proteins. (A) Schematic
representation of the SARS-CoV-2, SARS-CoV, and MERS-CoV protein open reading frames
with glycosylation sites indicated (refs (94 and 95)). Adapted with permission
from ref (95). Copyright 2020 Watanabe et al.
http://creativecommons.org/licenses/by/4.0/. (B) Moss surface representation
of SARS-CoV-2 S protein glycosylation from molecular dynamic simulations performed by
Grant et al. (ref (101)). Glycans are shown in
ball-and-stick representations and colorized accordingly: M9, green; M5, dark yellow;
hybrid, orange; complex, pink. The S protein surface (6VSB) is colored according to antibody accessibility from
black (least accessible) to red (most accessible). The RBD in the up conformation is
circled in blue. Adapted with permission from ref (101). Copyright 2020 Grant et al. http://creativecommons.org/licenses/by/4.0/. (C) The same as in panel B, but
for SARS-CoV, SARS-CoV-2, and MERS-CoV S proteins and with available S
protein–antibody structures overlapped.
Given the high degree of structural similarity between the SARS-CoV, MERS-CoV, and
SARS-CoV-2 S proteins, comparison between their respective glycan shields, and therefore
their impact on antibody accessibility, was possible. Figure 10C presents a side-by-side comparison of the glycan shield and
antibody-accessible area of these three viral S proteins, with neutralizing antibody
co-complexes superimposed. From this analysis and various reports mapping the epitopes of
patient-derived neutralizing antibodies, it can be concluded that the majority
(∼90%) of neutralizing antibodies map to the RBD of the SARS-CoV-2 S
protein.[102−104] Additionally, these
simulated structures show a remarkable degree of epitope conservation among the SARS-CoV,
MERS-CoV, and SARS-CoV-2 coronavirus S proteins. The strong correlation between the
predicted gaps in the S protein glycan shield and the observed antibody binding sites
highlights the importance of these epitopes in the elicitation of neutralizing antibodies
by therapeutics such as vaccines. This point may underlie the hampered (decades-long)
efforts to develop successful vaccines incorporating the even more densely glycosylated
HIV-1 Env and gp120 proteins.[101,105]
Replication, Packaging, and Release
Overview
Our understanding of the SARS-CoV-2 life cycle after it enters a host cell has greatly
benefited from the ability to “peer” into infected cells using numerous 2D
and 3D imaging technologies. Collectively, these snapshots highlight the extensive spatial
and temporal coordination employed during replication, packaging, and release of newly
formed viral particles, where each step takes place in discrete but highly coordinated
cytoplasmic compartments.
Initial translation of the ∼30 kb long positive-sense SARS-CoV-2 genome by host
ribosomes occurs in the cytoplasm (Figure 11A).[106−108] Beginning at a single
ribosome entry site and exploiting ribosomal frame-shifting, two large polyproteins (pp1a
and pp1ab) are translated from the first two-thirds of the genome.[108−112] These polyproteins encode 16 individual
nonstructural proteins (NSPs), many of which function as components of the
replication–transcription complex (RTC) responsible for viral RNA
synthesis.[111,113−116] Overall, the SARS-CoV-2 NSPs share 86%
sequence identity with the NSPs produced by SARS-CoV.[117] The first four
NSPs (NSPs 1–4) are cleaved by the viral protease encoded by NSP3
(PLpro).[118−120] The remaining NSPs
(NSPs 5–16) are cleaved by the main viral protease, NSP5 (Mpro, also
called 3CLpro).[31] Mpro undergoes autolytic
cleavage from the polyprotein, and then assembles as an asymmetric dimer (Figure 11A, bottom panel) before cleaving the
remaining downstream NSPs.[31] The Mpro substrate-binding
pocket is highly conserved in all β-coronaviruses and contains a Cys-His catalytic
dyad located in a cleft formed between two adjacent protein domains.[31]
Since the cleavage recognition sequence for Mpro is distinct from that of human
proteases, and given its high conservation and functional importance in coronavirus
replication, Mpro has become a prominent target for therapeutic
development.[11,31,33,121−124]
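The division of labor between the two viral proteases, as described above, can be written down as a small lookup. This is a minimal sketch that restates the text rather than modeling the cleavage chemistry; the function name `protease_for_nsp` and the rounded genome length are assumptions made for illustration only.

```python
def protease_for_nsp(nsp: int) -> str:
    """Return the viral protease that releases a given NSP, per the text above."""
    if not 1 <= nsp <= 16:
        raise ValueError("the polyproteins encode NSP1 through NSP16")
    # NSPs 1-4 are attributed to PLpro (encoded by NSP3); NSPs 5-16 to Mpro (NSP5).
    return "PLpro (NSP3)" if nsp <= 4 else "Mpro (NSP5)"


# Approximate genome length (~30 kb), rounded here for illustration.
GENOME_LENGTH_NT = 30_000

# pp1a and pp1ab are translated from roughly the first two-thirds of the genome.
polyprotein_region_nt = GENOME_LENGTH_NT * 2 // 3

for nsp in (1, 4, 5, 16):
    print(f"NSP{nsp}: released by {protease_for_nsp(nsp)}")
print(f"Approximate polyprotein-coding region: {polyprotein_region_nt} nt")
```

Because Mpro is responsible for 12 of the 16 NSPs in this scheme, the lookup also makes concrete why inhibiting this single protease is expected to block most polyprotein processing.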
Figure 11
Overview of RNA translation and replication, viral packaging, and release of the
SARS-CoV-2 virion. Schematic representations (top) and experimental data (bottom) of
the cellular machinery and viral proteins involved in (A) genome translation and
initial polypeptide processing, (B) replication of genomic and subgenomic RNA, (C)
assembly of the virion at the ER–Golgi intermediate compartment (ERGIC), and
(D) final egress of the viral particle into the extracellular environment. (A, bottom)
Mpro dimer surface model (6LU7) (ref (31)) colored by chain
and the substrate binding pocket (inset) depicting the bound Mpro inhibitor
N3 (sticks). Experimental data from panels B–D show tomographic slices from
cryo-ET studies of (B) murine hepatitis virus (MHV) or (C and D) SARS-CoV-2-infected
cells highlighting the transport of RNA through a molecular DMV pore, budding of a
SARS-CoV-2 virion, and a viral exit tunnel, respectively. Panel B was adapted with
permission from ref (138). Copyright 2020 Wolff
et al. http://creativecommons.org/licenses/by/4.0/. Panel C was adapted with
permission from ref (125). Copyright 2020 Klein
et al. http://creativecommons.org/licenses/by/4.0/. Panel D was adapted with
permission from ref (139). Copyright 2021
Mendonça et al. http://creativecommons.org/licenses/by/4.0/. Created with BioRender.
The remaining stages of replication require
extensive spatial reorganization of the cytoplasm to sequester RNA synthesis within viral
replication organelles (vROs). These vROs have been extensively studied using transmission
electron microscopy (TEM) of stained plastic sections, serial cryo-focused ion beam
(FIB)/scanning electron microscopy (SEM), and cryo-ET, which all reveal a perinuclear
network of interconnected membrane compartments created by reorganization of the rough
endoplasmic reticulum (ER) (Figure 11B).[54,56,125−132] This network is predominantly made up of double-membrane
vesicles (DMVs) and is induced by the combined action of NSP3, NSP4, NSP6, and various
host factors.[126,133,134] This spatial segregation is thought to have several
beneficial effects for viral replication, as sequestering RNA synthesis into DMVs may not
only concentrate RNA replication machinery but also provide cover from host innate immune
sensors that detect the double-stranded RNA replication intermediates produced by this
process.[135,136]
However, early cellular studies on coronavirus infection raised the issue of how newly
made genomic RNA and subgenomic mRNAs could be transported out of fully enclosed DMVs to
the site of viral assembly.[137] High-resolution electron microscopy
analysis has shed light onto this conundrum with the identification of a 3 MDa molecular
pore complex that spans both DMV membranes (Figure 11B, bottom panel).[56,138] This pore contains NSP3 as a structural component and likely serves as
the export channel for viral mRNA and genomic RNA back into the cytoplasm.[138]
Viral Replication–Transcription Complex (RTC)
Within the lumen of SARS-CoV-2 DMVs, the viral genome is transcribed by the multiprotein
RTC (Figures 11B and 12). This
complex is not only responsible for transcription of the entire 30 kb genome but also the
numerous subgenomic mRNAs encoded within the final one-third of the genome required for
structural protein synthesis (i.e., N, M, E, and S proteins). In fact, genomic RNA only
accounts for a small fraction of the total RNA produced during replication.[137] The RTC uses a discontinuous transcription mechanism that relies on
complementary transcription regulatory sequences throughout the genome to produce these
subgenomic mRNAs (reviewed by Sawicki et al.).[140] These mRNAs are then
exported through the DMV molecular pore to the cytoplasm and translocated to the ER/Golgi
for protein production.
Figure 12
Structure of the SARS-CoV-2 multiprotein replication–transcription complex
(RTC). Surface representation (PDB 7KRN) (ref (143)) of the RTC
highlighting the relative positions of the RNA-dependent RNA polymerase (RdRp, NSP12),
processivity cofactors (NSP7 and NSP8), and the viral helicase (NSP13) as determined
by single-particle cryo-EM. The NTP entry tunnel (inset) plays a critical role in the
backtracking/proofreading function of the RTC, as erroneously incorporated
ribonucleotides are frayed into the entry tunnel where they can then be removed by the
3′-5′ exonuclease, NSP14, to ensure high-fidelity replication of the
viral genome. Created with BioRender.
Almost all NSPs produced by cleavage of the viral polyproteins play a role in the
structure and function of the RTC. NSPs 2–11 provide RTC supporting functions,
while the core enzymatic functions of RNA synthesis, RNA proofreading, and RNA
modification are carried out by NSPs 12–16.[52] The main component
of the RTC is the RNA-dependent RNA polymerase (RdRp) NSP12. The structure of the RdRp has
been likened to a “right hand” wrapped around the replicating RNA, with the
conserved polymerase motifs (A–G) located in the “palm” domain and
the NSP7/NSP8 cofactor binding sites located in the “thumb” and
“fingers” (Figure 12).[27,141,142] NSP7 and NSP8 act as processivity factors for the RdRp
and bind the NSP12 thumb as a heterodimer.[27,141,142] An additional copy of NSP8 also
occupies the NSP12 fingers domain.[27,141,142] While many structural features of the RdRp
are consistent with prior information obtained from SARS-CoV, the SARS-CoV-2 RdRp
structure highlights a previously unresolved N-terminal β hairpin that is predicted
to stabilize the overall structure.[142] The positively charged RNA
template and NTP entry tunnels located at the back of the RdRp join together in a central
hydrophilic cavity in the palm where template-directed RNA synthesis occurs (Figure 12, inset).[142] Specificity
of the polymerase for RNA over DNA synthesis is likely conferred through recognition of
the 2′-OH group of the NTP by residues N691, S682, and D623 in the RdRp
palm.[27,141] After
incorporation of the nucleotide into the nascent RNA strand, the double-stranded RNA
intermediate exits through a tunnel located at the front side of the polymerase.[142]
RNA Proofreading and Modification
One of the most intriguing qualities of the coronavirus RTC is its exceptionally high
fidelity during replication. The high mutation rates of typical RNA viruses promote
genetic diversity and viral adaptation; however, coronaviruses have a mutation rate that
is an order of magnitude lower compared to most other RNA viruses.[124,144] The driving force of this
high-fidelity replication is the ability of the SARS-CoV-2 RTC to backtrack along the
nascent RNA strand and remove erroneously incorporated ribonucleotides. This backtracking
and proofreading function is driven by the viral helicase, NSP13, and the
3′-5′ exonuclease, NSP14 (also known as ExoN).[143,145] On the basis of single-particle
cryo-EM structures of backtracked complexes and molecular dynamics analysis of the RTC, it
is suggested that, when a ribonucleotide is misincorporated into the growing product RNA,
the 5′-3′ helicase activity of NSP13 pushes the mismatched duplex RNA
backward into the RdRp.[143,145] In doing so, the template RNA and product RNA are separated by a
structural motif of the RdRp that frays the misincorporated ribonucleotide into the more
favorable environment of the NTP entry tunnel (Figure 12, inset).[143,145] Exposure of the erroneously incorporated ribonucleotide in the entry
tunnel allows NSP14 (along with its stabilizing cofactor NSP10) to remove it. It is this
backtracking and proofreading function that limits the efficacy of nucleotide analogue
drugs (e.g., remdesivir) for the treatment of COVID-19.
Final processing of the viral mRNAs involves addition of a 5′ cap and
polyadenylation of the 3′ end that together aid in viral mRNA stability,
translation initiation, and escape from the cellular innate immune system. Synthesis of
the 5′ cap is facilitated by the nucleotide triphosphatase activity of NSP13,
C-terminal N7-methyltransferase activity of NSP14 (ExoN), and 2′-O methylation by
NSP16.[124,146] To
easily facilitate capping, these enzymes are positioned in close proximity to the newly
synthesized viral mRNA as part of the large RTC.
Viral Packaging
Formation of the complete SARS-CoV-2 virion requires all components of the viral genome
and envelope to assemble at the same time and place. Numerous high-resolution electron
microscopy imaging studies of SARS-CoV-2-infected cells have shown that virions bud into
the lumen of the ER–Golgi intermediate compartment (ERGIC) (Figure 11C).[125,147] The membrane-associated structural proteins M, E, and S
are all translated from viral mRNA and inserted into the ER membrane. These virus assembly
sites are frequently observed in close proximity to DMV molecular pores, thereby aiding in
the spatiotemporal coordination of the packaging process.[125,139]
Given the large size of the SARS-CoV-2 genome, it must be extensively condensed prior to
encapsulation by the viral envelope. This packaging is mediated by RNA binding and
dimerization of the N protein which coats the genomic RNA to form a tightly packed RNP
complex.[47,125]
Since the N protein is required to bind variable regions all along the genome,
interactions between its N-terminal RNA-binding domain and RNA are likely nonspecific and
largely electrostatic.[61] However, this nonspecificity gives rise to the
issue of how genomic RNA is specifically packaged over the other abundant RNAs produced
during the process of viral replication. To date, no specific packaging signal has been
conclusively identified within the SARS-CoV-2 genome that drives selective packaging by
the N protein. However, Syed et al. carried out a series of truncations of the SARS-CoV-2
genome—guided by reported packaging sequences for related viruses (murine hepatitis
virus and SARS-CoV)—and found that deletion of a region termed “T20”
(nucleotides 20080–22222) resulted in significant impairment of viral
infectivity.[148] Additionally, it has been suggested that phase
separation of the N protein, driven by its three intrinsically disordered regions, may
play a role in recruiting the nascent viral RNA.[46,49,50,63,64,149] Once condensed, the RNP complex is
retained at the cytoplasmic side of the ERGIC through interactions with the highly
abundant M protein.[125]
Oligomerization and association of the M protein with the viral RNP complex and E and S
proteins drives assembly, membrane curvature, and eventual budding of the viral particles
into the ERGIC.[150] These interactions are mediated through its soluble
C-terminal domain that extends into the viral lumen.[46] Trimeric
prefusion spike proteins are produced in the Golgi/ER network and carried by small
transport vesicles to the viral assembly site.[151] Here, spike proteins
cluster exclusively in association with RNP complexes, likely through a bridged
interaction with the M protein.[139] As previously mentioned, the precise
role of the E protein in SARS-CoV-2 assembly remains unclear, as deletion of the E protein
in other related β-coronaviruses does not impact viral assembly but instead
attenuates the viral particles.[43,44,152] The M2 protein of influenza viruses is also a
viroporin and has been demonstrated to alter membrane cholesterol levels, inducing
membrane scission and viral particle budding; however, further experiments are needed to
determine whether the E protein plays a similar mechanistic role in SARS-CoV-2 viral
packaging.[153,154]
Nevertheless, the M protein-driven clustering of the E protein with all other viral
components induces positive membrane curvature of the ERGIC and eventual fusion of the
viral envelope to produce a fully encapsulated and infectious SARS-CoV-2 virion.
Virion Release
Once all components of the virion have been assembled, the final phase of the SARS-CoV-2
replication cycle is release of the viral particles into the extracellular environment. It
was initially assumed that β-coronaviruses use vesicles from the biosynthetic
pathway for cellular egress similar to other enveloped RNA viruses that assemble and bud
directly from the plasma membrane.[155−157] In
contrast, assembled SARS-CoV-2 virions in the ERGIC are trafficked to the plasma membrane
via lysosomal exocytosis.[125,158,159] This pathway poses a unique challenge for
cellular egress of the viral particles as the typical acidification of lysosomes would
result in particle inactivation and degradation. To circumvent this, β-coronaviruses
have been shown to significantly disrupt lysosomal acidification and inactivate lysosomal
proteases.[150] While the exact mechanism of this deacidification is
currently unknown, ORF3a of SARS-CoV-2 and other related β-coronaviruses has been
implicated in this process.[150,158] As a consequence of this deacidification, antigen presentation on
SARS-CoV-2-infected cells is limited and could potentially serve as an additional path for
immune evasion.[158]
Given that many of the SARS-CoV-2 structural proteins require post-translational
modifications, such as phosphorylation, glycosylation, and/or cleavage, single-membrane
vesicles (SMVs) containing single or multiple virions are first trafficked to the Golgi
and trans-Golgi network where these post-translational modifications can be made.[150] From here, lysosomes carry the viral particles to the cell periphery.
Cryo-FIB/SEM and cryo-ET imaging analyses reveal that final egress of the virions into the
extracellular space is mediated through exit tunnels connected to the cell membrane (Figure D), likely formed by fusion of the SMVs
with the plasma membrane.[125,139]
Future Prospects
Recent advancements in electron microscopy have rendered structurally tractable viral proteins and
processes that were intractable only a few years ago. Here, we have reviewed
SARS-CoV-2 structural insights and placed them into the context of greater viral
pathobiology. What follows is a discussion of the limitations to our current understanding
of the SARS-CoV-2 viral life cycle and examples of how structural results are being used to
inform therapeutic design.
Our understanding of the 3D rearrangement of viral biology stems from the synthesis of
results across imaging techniques. However, due to technology specialization, there is a
tendency for structural fields to compare results exclusively within their own imaging field
(i.e., cryo-EM structures compared with other cryo-EM structures) and a lack of integration
of results across imaging techniques. An excellent example of the implementation of
multitechnique structural investigation was conducted by Cortese et al., wherein the authors
describe the effects of viral infection on cellular structure, visualizing fixed samples by
cryo-ET and FIB/SEM and live cells by confocal and super-resolution (STED) light
microscopy.[147] This integrative imaging approach revealed in
situ remodeling of internal cellular membrane systems upon SARS-CoV-2 infection
and provided a functional link by demonstrating the pharmacological inhibition of
cytoskeleton remodeling to restrict viral replication.
While broad 3D characterization of the viral infection cycle has been achieved, there still
exist a few structural unknowns. High-resolution details for the M and E structural proteins
remain elusive, with pan-coronavirus implications given their high degree of conservation.
An additional structural unknown is the molecular mechanism by which the SARS-CoV-2 S
protein transitions from the pre- to postfusion state during cellular infection. This
mechanism is presumed to be highly conserved across many viruses (coronaviruses and HIV-1),
and therefore the in-depth experimental characterization of this process may inform broad
therapeutic targeting.[71,75,76]
We would like to finally highlight how some of these structural results have aided in the
development of several SARS-CoV-2 therapies. Overall, therapeutic discovery has steadily
advanced throughout the COVID-19 pandemic (Figure ). A specific example of structure-based therapeutic design was reported by Hunt
et al., wherein high-affinity small proteins (“minibinders”) were designed to
bind and geometrically complement the ACE2 binding site on the SARS-CoV-2 S protein.[161] These in silico and structurally designed minibinders were
further demonstrated to potently neutralize SARS-CoV-2 VoCs, with protection provided in
human ACE2-expressing transgenic mice (both prophylactically and therapeutically).
Foundational work by Pallesen et al. and Hsieh et al. demonstrating structure-based design
of prefusion-stabilized MERS-CoV, SARS-CoV, and SARS-CoV-2 spike proteins has
undoubtedly informed the formulation of the BNT162b2 (Pfizer, Inc.) and mRNA-1273 (Moderna,
Inc.) mRNA vaccines.[162,163] Finally, the development of the recently announced oral pills
Molnupiravir (Merck & Co.) and Paxlovid (Pfizer Inc.) has likely benefited from the
high-resolution structural data available for their viral targets (the RTC and 3CL protease,
respectively). The interdisciplinary combination of structural biology, immunology,
virology, and continual scientific technological advancements has proven to represent our
best approach to understanding and mitigating the COVID-19 pandemic.
Figure 13
Therapeutic discovery during the COVID-19 pandemic. These data were compiled by the
Biotechnology Innovation Organization and are presented here as the cumulative monthly
number of antiviral, treatment, and vaccine therapies in development (ref (160)). Antivirals are defined here as drugs that
interact directly with the virus or disrupt its ability to replicate. Treatments are
defined here as drugs that treat various COVID-19-associated illnesses resulting from
SARS-CoV-2 viral infection. Vaccines are defined here as prophylactic therapeutics that
stimulate immunity against SARS-CoV-2.