The virus that causes COVID-19, SARS-CoV-2, has a large RNA genome that encodes numerous proteins that might be targets for antiviral drugs. Some of these proteins, such as the RNA-dependent RNA polymerase, helicase, and main protease, are well conserved between SARS-CoV-2 and the original SARS virus, but several others are not. This study examines one of the proteins encoded by SARS-CoV-2 that is most different, a macrodomain of nonstructural protein 3 (nsp3). Although 26% of the amino acids in this SARS-CoV-2 macrodomain differ from those observed in other coronaviruses, biochemical and structural data reveal that the protein retains the ability to bind ADP-ribose, which is an important characteristic of beta coronaviruses and a potential therapeutic target.
The development of antivirals targeting severe acute respiratory syndrome coronavirus 2
(SARS-CoV-2), the causative agent of the present COVID-19 pandemic,[1] will
most likely focus on viral proteins and enzymes needed for replication.[2]
Similar to other coronaviruses, SARS-CoV-2 has a large positive sense (+)RNA genome that is
nearly 30000 nucleotides long with several open reading frames. Most of the proteins that
form the viral replicase are encoded by the “rep 1ab” reading frame, which
encodes a 7096-amino acid polyprotein that is ultimately processed into at least 15
functional peptides, five of which are produced only by a translational frameshift event
occurring after nsp10 (Figure ). Parts of the
SARS-CoV-2 rep 1ab polyprotein are very similar to the rep 1ab protein of the coronavirus
that caused the SARS epidemic in 2003 (which here will be termed SARS-CoV), suggesting
that drugs targeting SARS-CoV nsp5–14 might be effective against SARS-CoV-2. However,
some portions of the SARS rep 1ab polyproteins are quite different.
Figure 1
Sequence divergence between potential drug targets in SARS-CoV and SARS-CoV-2. The
SARS-CoV-2 rep 1ab peptide sequence was aligned with each of the PDB files listed, which
describe an atomic structure of a homologous region of the SARS-CoV rep 1ab polyprotein.
nsps are shown in sequence as black arrows (note that there is no “nsp11”,
and the translational frameshift occurs after nsp10). The percent of amino acids that
differ in each protein is plotted. nsp1 is an interferon antagonist.[28] The nsp3 Mac1 domain is studied here. Mac2 and Mac3 are tandem macrodomains that bind
G-quadruplex structures,[4] and nsp3 dC is the C-terminus of
Mac3.[29] nsp3pro is a papain-like protease,[30] and
nsp3 RBD is another possible RNA-binding domain.[31] nsp5 is the main
viral protease.[32] nsp7 and nsp8 are polymerase cofactors.[33] nsp9 is an RNA-binding protein.[34] nsp10 is a
zinc-binding cofactor for nsp14 and nsp16.[35] nsp12 is an RNA
polymerase.[33] nsp13 is a helicase.[36] nsp14 is a
3′–5′ exonuclease and a 7-methyltransferase.[35]
nsp15 is an RNA endonuclease.[37] nsp16 is an RNA cap
2′-O-methyltransferase.[38]
In contrast to the well-conserved SARS-CoV nsp5 protease, nsp12 polymerase, and nsp13
helicase enzymes, significantly more differences exist between the nsp3 proteins encoded by
SARS-CoV and SARS-CoV-2. The most variation occurs in a domain of nsp3 suspected to bind
ADP-ribose, which will here be termed the Mac1 domain,[3] to differentiate
it from the two downstream macrodomains (Mac2 and Mac3), which do not bind
ADP-ribose.[4] The Mac1 domain of SARS-CoV also catalyzes the hydrolysis
of ADP-ribose 1′′ phosphate, albeit at a slow rate.[5] Some
viral macrodomains also remove ADP-ribose from proteins,[6] and this
de-ADP-ribosylation activity correlates with virulence and the ability to evade the innate
immune response.[6−11] Jean-Michel Claverie recently suggested that the putative
ability of Mac1 to remove ADP-ribose from proteins might be related to the cytokine storm
syndrome seen in severe cases of COVID-19.[12]

Compounds blocking ADP-ribose binding could be used to test this important hypothesis.
However, the many sequence differences preclude the use of the SARS-CoV Mac1 domain
structures as scaffolds to design compounds that might target this nsp3 region in
SARS-CoV-2. ADP-ribose binding must also be confirmed, especially in light of the
observation that the same nsp3 domain from gamma coronaviruses does not bind ADP-ribose
in vitro.[13] The ability of the SARS-CoV-2 Mac1 domain
to bind ADP-ribose was therefore examined here using a recombinant purified protein and
isothermal titration calorimetry (ITC). We also determined the structure of the SARS-CoV-2 Mac1 domain to examine the biochemical context of ADP-ribose binding and to provide data for
rational inhibitor design or in silico screening.
Materials and Methods
Gene Synthesis
To facilitate the comparison between SARS-CoV and SARS-CoV-2, a protein expression vector
was generated that is similar to that used by Egloff et al.[14] To this
end, a codon-optimized open reading frame was synthesized by GenScript (Piscataway, NJ)
that encodes the Mac1 domain with an N-terminal TEV-cleavage site flanked by
NheI and BamHI restriction sites. This open reading
frame was cloned into pET21b to yield plasmid pET21-COVID-Mac1. The pET21-COVID-Mac1
plasmid was used to transform BL21(DE3) cells.
Protein Purification
Colonies of BL21(DE3) cells harboring the pET21-COVID-Mac1 plasmid were used to inoculate
3 mL of lysogeny broth containing 100 μg/mL ampicillin. The starter culture was incubated
at 37 °C while being shaken at 225 rpm. After the cells grew to an OD600
of 1.0, they were transferred to 1 L of fresh medium containing ampicillin. After the
cells reached an OD600 of 1.0 again, protein production was induced with 1 mM
isopropyl β-d-thiogalactoside. After growing for 16 h at 23 °C, cells
were harvested by centrifugation at 4000 rpm and 4 °C. The resulting cell pellet was
suspended in 25 mL of IMAC buffer [20 mM Tris (pH 8) and 0.5 M NaCl], sonicated on ice for
five 1 min bursts separated by 2 min rests, and clarified by centrifugation at
10000g for 30 min. The supernatant was loaded onto a 5 mL Ni-NTA
column, and the fractions were eluted with a step gradient from 5 to 500 mM imidazole.
Fractions containing the Mac1 domain protein (5 mL total) were loaded on a 250 mL
Sephacryl S300 gel filtration column and eluted with 10 mM MOPS and 150 mM NaCl. Purified
protein concentrations were determined by measuring the absorbance at 280 nm using a molar
extinction coefficient of 10555 M⁻¹ cm⁻¹, which was
calculated using the ProtParam tool (https://web.expasy.org/protparam/).
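As a minimal sketch of the concentration calculation described above, the Beer-Lambert law, c = A / (ε·l), can be applied with the quoted extinction coefficient. The 1 cm path length, the dilution factor, the function name, and the example absorbance reading are illustrative assumptions, not values reported in this study.

```python
# Beer-Lambert estimate of protein concentration from a measured absorbance.
EXTINCTION_COEFF = 10555.0  # M^-1 cm^-1, the ProtParam-derived value quoted in the text


def protein_concentration_molar(absorbance, path_length_cm=1.0, dilution_factor=1.0):
    """Return the molar concentration c = A * dilution / (epsilon * l).

    path_length_cm and dilution_factor are hypothetical parameters added for
    illustration; the text does not state the cuvette or dilution used.
    """
    if absorbance < 0 or path_length_cm <= 0 or dilution_factor <= 0:
        raise ValueError("absorbance must be >= 0; path length and dilution must be > 0")
    return absorbance * dilution_factor / (EXTINCTION_COEFF * path_length_cm)


# Hypothetical reading of 0.528 absorbance units in a 1 cm cuvette, undiluted.
example_conc = protein_concentration_molar(0.528)
print(f"{example_conc * 1e6:.1f} uM")  # prints "50.0 uM"
```

A reading of this size would correspond to roughly the 50 μM protein concentration used in the calorimetry experiments.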
Isothermal Titration Calorimetry (ITC)
Binding of ADP-ribose to the SARS-CoV-2 Mac1 domain was measured using a Nano ITC
apparatus (TA Instruments). Before the measurement was started, samples of both ligand and
protein were diluted in 10 mM MOPS and 150 mM NaCl (pH 7) and degassed at 400 mmHg for 30
min. Measurements were taken at 20 °C by injecting 2.0 μL aliquots of 500
μM ADP-ribose (Sigma) into 50 μM protein (175 μL initial volume) with
stirring at 250 rpm. Using NanoAnalyze Software (version 3.11.0), data were fitted by
nonlinear regression to an independent binding model. Briefly, after baseline correction,
background heats from ligand-to-buffer titrations were subtracted, and the corrected heats
from the binding reaction were used to identify best fit parameters for the stoichiometry
of binding (n), free energy of binding (ΔG),
apparent enthalpy of binding (ΔH), and entropy change
(ΔS). Dissociation constants (Kd)
were calculated from the ΔG.
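The conversion in that last step can be sketched as follows, assuming the standard relations ΔG = RT ln Kd (with Kd expressed relative to a 1 M standard state) and ΔG = ΔH − TΔS. The function names, the kJ/mol units, and the example inputs are assumptions made for illustration; the internal implementation of the NanoAnalyze software is not reproduced here.

```python
import math

R_GAS = 8.314462618e-3      # gas constant in kJ mol^-1 K^-1
T_KELVIN = 20.0 + 273.15    # the 20 °C measurement temperature stated in the text


def kd_from_delta_g(delta_g_kj_per_mol, temperature_k=T_KELVIN):
    """Dissociation constant (M) from the binding free energy: Kd = exp(dG / RT)."""
    return math.exp(delta_g_kj_per_mol / (R_GAS * temperature_k))


def delta_g_from_kd(kd_molar, temperature_k=T_KELVIN):
    """Binding free energy (kJ/mol) from a dissociation constant: dG = RT ln(Kd)."""
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return R_GAS * temperature_k * math.log(kd_molar)


def entropy_term(delta_g_kj_per_mol, delta_h_kj_per_mol):
    """Return -T*dS (kJ/mol), rearranged from dG = dH - T*dS."""
    return delta_g_kj_per_mol - delta_h_kj_per_mol


# Illustrative input: a 10 uM dissociation constant at 20 °C.
print(f"dG = {delta_g_from_kd(10e-6):.1f} kJ/mol")
```

A positive value returned by `entropy_term` indicates an entropic penalty, that is, binding that is driven by enthalpy.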
Crystallization and Structure Determination
In preparation for crystallization experiments, purified SARS-CoV-2 Mac1 domain protein
was cleaved with tobacco etch virus (TEV) protease to remove the N-terminal
His6 tag and passed back through the Ni-NTA column. The flow-through
fractions were desalted into 10 mM HEPES (pH 7.2) using a 2 × 5 mL HiTrap desalting
column (GE Life Sciences) and concentrated to 10 mg/mL in a centrifugal concentrator. This
preparation of the protein was mixed at a 1:1 μL ratio with Morpheus HT screen
reagents (Molecular Dimensions) in a 96-well SwissSci MRC ultraviolet-transmissible
sitting drop plate. Large, diffraction-quality crystals grew directly from a number of the
screen conditions. The crystal ultimately used for structure determination grew from
condition D9: 0.12 M alcohols (0.02 M each 1,6-hexanediol, 1-butanol, 1,2-propanediol,
2-propanol, 1,4-butanediol, and 1,3-propanediol), 0.1 M buffer system 3, pH 8.5 (0.05 M
Tris and 0.05 M bicine), and 30% precipitant mix 1 [20% poly(ethylene glycol) (PEG) 500
monomethylether and 10% PEG 20000]. Large, thick plates grew within 1 week at 22 °C.
Given the high concentration of PEG 500 MME, the crystal did not require additional
cryo-protection and was flash-cooled by being directly looped from the sitting drop and
plunged into liquid nitrogen.

Diffraction data were collected on Life Sciences Collaborative Access Team (LS-CAT)
beamline 21-ID-F at the Advanced Photon Source of Argonne National Laboratory. The
wavelength at this station is fixed at 0.9787 Å; the detector is a MarMosaic M300
charge-coupled device. The data were collected with an oscillation width of 0.5° per
image for a total oscillation of 180°. The data were indexed and integrated with
DIALS[15,16] as
implemented in version 7.2 of the CCP4 software suite.[17,18] Data scaling and reduction were performed
using AIMLESS.[19−21] Data collection
statistics are listed in Table .
Table 1
Crystallographic Data Collection and Model Refinement Statistics for the SARS-CoV-2 Mac1 Domain

| Parameter | Value |
|---|---|
| Data Collection | |
| resolution range (Å) (last shell)ᵃ | 43.32–0.95 (0.97–0.95) |
| space group | P2₁2₁2₁ |
| a, b, c (Å) | 43.3, 54.4, 67.6 |
| α, β, γ (deg) | 90.0, 90.0, 90.0 |
| Rmergeᵃ | 0.063 (0.348) |
| Rmeasᵃ | 0.075 (0.475) |
| Rpimᵃ | 0.039 (0.320) |
| CC1/2ᵃ | 99.5 (89.1) |
| no. of unique reflectionsᵃ | 99442 (4199) |
| completeness (%)ᵃ | 98.4 (85.0) |
| multiplicityᵃ | 6.4 (2.6) |
| ⟨I/σ(I)⟩ᵃ | 13.1 (1.5) |
| Model Refinement | |
| no. of reflections used in refinement | 99335 (2606)ᵇ |
| no. of reflections used for Rfree | 5046 (125) |
| Rcryst (Rfree) | 0.119 (0.137) |
| Wilson B-factor (Å²) | 7.4 |
| average B-factor (Å²) | 13.9 |
| average B-factor, protein atoms (Å²) | 10.2 |
| average B-factor, solvent (Å²) | 28.2 |
| RMS deviation, bond lengths (Å) | 0.010 |
| RMS deviation, bond angles (deg) | 1.16 |
| coordinate error (Å) | 0.06 |
| Ramachandran statistics, favored/allowed/outliers (%) | 98.2/1.8/0.0 |
| rotamer outliers (%) | 1.2 |
| Clashscore | 1.70 |

ᵃValues in parentheses apply to the high-resolution shell indicated in the resolution row.
ᵇThe limits of the high-resolution bin for refinement were 0.96–0.95 Å.

The structure was determined by molecular replacement in PHASER[22]
using the model of the SARS-CoV Mac1 domain as the search model [Protein Data Bank (PDB)
entry 2FAV[14]].
The model underwent iterative rounds of (re)building in COOT[23] and
refinement in PHENIX.refine.[24,25] The very high resolution of the data justified a full anisotropic
treatment of the protein and solvent temperature factors. Model refinement and validation
statistics are listed in Table . The coordinates
were deposited in the Protein Data Bank as entry 6WEY.
Results and Discussion
Variability in the nsp3 Mac1 Domain
The structures of most of the soluble portions of the SARS-CoV nsp proteins were examined
at atomic resolution to help understand coronavirus replication and facilitate antiviral
drug discovery. The amino acid sequences of each of these proteins were compared with the
homologous regions of the rep 1ab protein encoded by SARS-CoV-2 (GenPept accession number
YP_009724389). The most similar proteins were the RNA helicases (nsp13), which are
identical in all but one of their 603 amino acids, namely a conservative Val to Ile
substitution near their C-termini. The RNA-dependent RNA polymerases (nsp12) are also well
conserved, sharing all but 34 of 955 amino acids. The primary protease that cleaves the
polyprotein (nsp5) is also similar in SARS-CoV and SARS-CoV-2, with only 13 amino acids
that differ among 306 (4.2% different) (Figure ).

At the other end of the spectrum are the nsp3 proteins, which are notably more different
in the two SARS viruses. nsp3 is a large multidomain membrane-bound protein,[26] and its clearest role in viral replication is cleaving the rep
polyprotein. Greater than 17% of the amino acids in the nsp3 protease domain differ
between SARS-CoV and SARS-CoV-2. Other parts of nsp3 are even more variable, such as the
macrodomains that lie N-terminal to the nsp3 protease domain. Macrodomains consist of four
helices that surround a mixed β sheet. A ligand-binding pocket that typically binds
ADP-ribose or related compounds lies between the helices and the sheet.[27] SARS-CoV has three macrodomains in tandem, but only the first binds ADP-ribose. The
amino acid sequences of this Mac1 domain differ by 26% between SARS-CoV and SARS-CoV-2
(Figure ).

Six of the 47 variant residues in the 180-amino acid SARS Mac1 domain are clustered in a
21-amino acid region near its N-terminus, which is particularly variable in the three
Coronaviridae genera (Figure ) and sometimes not
included in other macrodomains. As shown by the conservation plot above the alignment
(Figure ), there are several highly variable
regions throughout the protein, which might affect protein function. Mac1 domains from
coronaviruses that cause the common cold (alpha coronaviruses)[13] and
the beta coronaviruses, such as SARS-CoV and Middle East respiratory syndrome coronavirus
(MERS-CoV),[39] all bind ADP-ribose. However, reports with gamma
coronaviruses suggest that ADP-ribose binding might not be conserved. Sequence differences
exist between infectious bronchitis virus (IBV) strain M41, which binds ADP-ribose, and
IBV strain Beaudette, which does not.[13]
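The percent-difference figures quoted in this section can be reproduced from residue counts or from a pairwise alignment, as in the sketch below. The function name, the choice to skip gapped columns, and the toy sequences are assumptions for illustration only; the text does not state how gaps were treated, and the toy strings are not real coronavirus sequences.

```python
def percent_different(aligned_a, aligned_b, gap="-"):
    """Percent of aligned, ungapped positions at which two sequences differ.

    Columns containing a gap in either sequence are skipped (an assumption;
    other conventions count gaps as differences).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    differing = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue
        compared += 1
        if res_a != res_b:
            differing += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * differing / compared


# Toy alignment: 2 differences over 8 ungapped columns.
print(percent_different("MKVLAT-GE", "MRVLSTAGE"))  # prints 25.0

# Counts quoted in the text: 47 of 180 residues (Mac1) and 13 of 306 (nsp5).
print(round(100.0 * 47 / 180, 1))  # prints 26.1
print(round(100.0 * 13 / 306, 1))  # prints 4.2
```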
Figure 2
Variation in the Mac1 domains of coronaviruses. Mac1 structures were aligned using
the “MatchMaker” function of UCSF Chimera (version 1.14).[40] Amino acids are colored by class. β sheets are denoted with
green boxes, and α helices are denoted with yellow boxes. Arrows mark key
residues F360, F336, D226, and N244 (see the text).
Expression and Purification of the SARS-CoV-2 Mac1 Domain
An Escherichia coli expression vector for the Mac1 domain was generated
to express an N-terminally His-tagged protein similar to a SARS-CoV protein studied by
Egloff et al.[14] Upon induction, a 1 L culture of BL21(DE3) cells
harboring the vector expresses 50–100 mg of the Mac1 domain protein that can be
purified in one step to apparent homogeneity using immobilized metal affinity
chromatography (Figure A). The protein was
polished further with gel filtration chromatography and concentrated before analysis and
crystallization.
Figure 3
SARS-CoV-2 Mac1 domain binds ADP-ribose. (A) A 15% sodium dodecyl
sulfate–polyacrylamide gel electrophoresis gel showing 10 μL samples of a
soluble crude lysate of induced BL21(DE3) cells harboring plasmid p21-COVID-Mac1 (lane
C), proteins that do not bind a Ni-NTA column (F), and fractions eluted from a
Ni-nitrilotriacetic acid column during an imidazole step gradient from 0 mM (lanes
1–3) to 5 mM (lanes 4–6), 40 mM (lanes 7–9), and 500 mM (lanes
10–12). Protein markers (lane M) are 116, 66.2, 45, 35, and 25 kDa. (B) Example
ITC experiment in which the purified SARS-CoV-2 nsp3 macrodomain was titrated with
ADP-ribose. (C) ITC experiments such as those shown in panel B were repeated thrice with
each of the compounds listed. Means are plotted, and error bars are standard
deviations. Average (±SD) dissociation constants were 10 ± 4 μM for
ADP-ribose, 8 ± 9 μM for ADP, 3 ± 3 μM for ATP, 6 ± 4
μM for ADP-glucose, 2 ± 1 μM for cAMP, and 2 ± 1 μM for
adenosine. (D) Comparison of the thermodynamics of ADP-ribose binding by Mac1 domains
from SARS-CoV-2 (data from panel C), SARS-CoV, MERS-CoV, and an alpha coronavirus.
ᵃData from ref (14). ᵇData from ref (39). ᶜData from ref (13).
Nucleotide Binding by the SARS-CoV-2 Mac1 Domain
Repeated ITC experiments revealed that the purified recombinant protein bound ADP-ribose
(Figure C) with a dissociation constant of 10
± 4 μM (the uncertainty is the standard deviation of
Kds from independent titrations). To examine binding
specificity, similar titrations were repeated with related nucleotides. The SARS-CoV-2
protein bound ADP, cAMP, ATP, and ADP-glucose (Figure D). All nucleotides lacking the ribose moiety bound with similar high
affinities, but none bound with an enthalpy change similar to that observed with
ADP-ribose, suggesting specific contacts form between ADP-ribose and the SARS-CoV-2
protein. On the basis of findings from the SARS-CoV-2 structures below, these contacts
likely occur with the conserved D226 and N244 (positions 30 and 43, respectively, in the
numbering above the alignment in Figure ). None
of the other nucleotides bind with an entropic penalty as observed with ADP-ribose,
suggesting that the ribose moiety becomes structured when bound to the macrodomain.

The energetics of binding of ADP-ribose to the SARS-CoV-2 protein are similar to those
noted for the same protein from SARS-CoV[14] and MERS-CoV.[39] The enthalpy and entropy of binding were also very similar for all three
proteins (Figure D). In contrast to findings
from the Mac1 protein from an alpha coronavirus,[13] enthalpy appears to
drive binding of ADP-ribose to the Mac1 domains of the three beta coronaviruses. Variation
in the ADP-ribose-binding cleft might account for these differences. For example, nsp3
amino acid 360, which is near the adenine base, is a Phe in SARS-CoV-2, an Asn in both
SARS-CoV and MERS-CoV, and an aliphatic amino acid in the alpha and gamma coronaviruses
(Figure ).
Structure of the SARS-CoV-2 Mac1 Domain
The SARS-CoV-2 Mac1 domain (nsp3 residues 207–377) crystallized in space group
P2₁2₁2₁ with one molecule per
asymmetric unit. These crystals had a solvent content of 43% and diffracted extremely
well. The final resolution limit of the data was set at 0.95 Å (Table ). The quality of the electron density maps is
correspondingly excellent (Figure A). The
section of the structure depicted in this image is located on the surface of the protein,
and the B-factors of these residues (7.2 Å²) are close to
the average B-factor of the protein (10.2 Å²), indicating
that this sample accurately represents the overall quality of the maps. The final model
contains the entire sequence from V207 to S377 of nsp3, an N-terminal glycine residue that
was left from the TEV-protease cleavage, and 374 solvent molecules. The
Rcryst and Rfree values of the
final model were 0.119 and 0.137, respectively (Table ).
Figure 4
SARS-CoV-2 Mac1 domain structure. (A) The electron density is shown for a
representative portion of the structure (residues 359–361) on the surface of
the protein. The 2mFo –
DFc map is contoured at 1.5σ and is shown as a
magenta mesh. The mFo –
DFc (difference) maps are shown at +3.0σ and
−3.0σ as green and red mesh, respectively. (B) Ribbon diagram of the
SARS-CoV-2 Mac1 domain structure colored according to the sequence conservation plot
in Figure as a gradient from red (weakly
conserved, <10%) to blue (strongly conserved, 100%) through magenta. As observed in
the sequence alignment, the N- and C-termini are particularly poorly conserved. (C)
Overlay of the structure of the SARS-CoV-2 Mac1 domain bound to ADP-ribose determined
by Michalska et al. of the CSGID (PDB entry 6W02) with the ultra-high-resolution structure of the
unliganded protein determined here. ADP-ribose is presented as a ball-and-stick model
with the carbon atoms colored gold. The backbone trace of the unliganded structure is
colored cyan, and that of the ADP-ribose-bound model is colored gray. There are three
loops with significantly different conformations in the two structures. In the
unliganded structure, the β2−α2 loop is colored bright red, the
β4−α4 loop is colored purple, and the β5−α5 loop
is colored bright green. The same regions of the ADP-ribose-bound structure are
colored pale red, pale purple, and pale green, respectively. The transparent blue and
yellow spheres represent water molecules bound to the unliganded (transparent blue)
and ADP-ribose-bound (transparent yellow) forms of the protein. Interestingly, several
of the water molecules interacting with ADP-ribose in PDB entry 6W02 can also be found in the
unliganded structure of the protein. The inset shows a close-up of the boxed region
colored according to the same scheme. The β2−α2 and
β5−α5 loops, which contact ADP-ribose, are presented as a
ball-and-stick model. Note that the β2−α2 loop rotates
∼180° to allow it to make a hydrogen bonding interaction with the
1′-hydroxyl of the ribose moiety. Additionally, the phenylalanine residue in
the β5−α5 loop (F336) would clash with the β-phosphate and
ribose of ADP-ribose if the β5−α5 loop did not adopt a different
conformation.
As expected, the tertiary structure ranges from approximately identical to very similar
to those of other coronavirus macrodomains, including SARS-CoV (2FAV,[14] 74.7% sequence
identity) with a root-mean-square deviation (RMSD) value for 162 of 172 Cα atoms of
0.6 Å, MERS-CoV (5DUS,[39] 42.2% identical) with a 1.2 Å RMSD for 161 of 172 Cα atoms,
human alpha coronavirus 229E (3EWR,[41] 32.5% identical) with a 1.5 Å RMSD for 154 of
172 Cα atoms, feline coronavirus (FCoV, 3JZT,[42] 26.8% identical) with a 1.5 Å RMSD for 153 of
172 Cα atoms, and the gamma CoV IBV (3EWP,[41] 26.7% identical) with a 2.1 Å RMSD for 150 of
172 Cα atoms. The regions with a high degree of sequence conservation are not
clustered in any particular region(s) of the molecule, as is clear when the ribbon is
colored as a gradient from red (poorly conserved) to blue (highly conserved) (Figure B). This finding is consistent with the fact
that the protein atoms involved in hydrogen bonding interactions with the ligands in these
structures are more often part of the main chain; relatively few interactions of side
chains with ligands are observed.

At the time of writing, we discovered that Michalska et al. of the Center for Structural
Genomics of Infectious Diseases (CSGID) deposited coordinates for a very similar construct
of the SARS-CoV-2 Mac1 domain including the region from E206 to E275 of the nsp3 protein
plus an additional four residues at the N-terminus (6VXS, unpublished). Their crystals also allowed binding of
ADP-ribose (6W02) and AMP (6W6Y), whereas ours seemed to be packed too
tightly to permit ligands to access the binding site (data not shown). We compared our
ultra-high-resolution model of the unliganded protein to the ADP-ribose-bound form. The
RMSD for the fitting, which was determined by secondary structure matching (SSM)[43] as implemented in COOT, is 0.59 Å for 165 of 172 Cα atoms. This
value is very similar to the RMSD values of the free protein (6VXS, 0.66 Å) and the AMP-bound form
(6W6Y, 0.50 Å), indicating
that no large conformational changes occur upon ligand binding. In fact, the only notable
conformational changes occur in three surface-exposed loops in or near the ligand-binding
pocket (Figure C). These loops connect strand
β2 with helix α2 (the β2−α2 loop), strand β4 with
helix α4 (the β4−α4 loop), and strand β5 with helix
α5 (the β5−α5 loop). The subtle change in conformation of the
β4−α4 loop (purple in Figure C) appears to be the result of crystal contacts and not the direct influence of
ADP-ribose binding. The other two loops are more intimately involved in ligand binding.
The main chain of the β2−α2 loop rotates 180° to allow the amide N
atom of G252 to participate in a hydrogen bonding interaction with the 1′-hydroxyl
of the ribose moiety of ADP-ribose. G252 corresponds to residue V33 in the Chikungunya
macrodomain, which Eckei et al. proposed is needed for de-ADP-ribosylation activity based
on results of site-directed mutagenesis.[10] This loop also carries N244,
which directly interacts with the ribose. The phenyl ring of F336 in the
β5−α5 loop occupies the portion of the binding pocket in the unliganded
structure that is occupied by the β-phosphate of ADP-ribose. Thus, without
rearrangement of the β5−α5 loop, ADP-ribose would not be able to
bind.
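For readers who want to reproduce comparisons like the RMSD values quoted in this section, the quantity itself is the root-mean-square distance over matched Cα pairs after superposition. The sketch below assumes the two coordinate lists are already superposed and paired residue by residue; the superposition and residue matching performed by SSM in COOT are not reproduced, and the function name and toy coordinates are hypothetical.

```python
import math


def ca_rmsd(coords_a, coords_b):
    """RMSD (same units as the input) over paired, pre-superposed Cα coordinates.

    coords_a and coords_b are equal-length sequences of (x, y, z) tuples in
    which element i of one list is matched to element i of the other.
    """
    if len(coords_a) != len(coords_b) or len(coords_a) == 0:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))


# Toy example: every atom is displaced by the vector (3, 4, 0), a distance of 5.
model_a = [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)]
model_b = [(3.0, 4.0, 0.0), (4.0, 5.0, 1.0)]
print(ca_rmsd(model_a, model_b))  # prints 5.0
```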
Conclusion
The significance of the study stems mainly from the demonstration that the SARS-CoV-2 Mac1
domain binds ADP-ribose. Although the SARS-CoV and SARS-CoV-2 Mac1 domains have 26% divergence in amino acid
sequences, their structures are highly similar, which may explain the ability of the latter
to bind ADP-ribose. This is the first step needed to justify screens for potential
antivirals that bind in place of ADP-ribose. However, more work needs to be done to
understand the antiviral potential of such compounds because the biological role for
ADP-ribose binding is not completely understood. Some work with alpha coronaviruses suggests
that ADP-ribose binding by the Mac1 domain is not required for viral replication.[44] However, studies with other (+)RNA viruses suggest that macrodomains are
essential for virulence.[45] This work is also noteworthy because the
synthetic codon-optimized plasmid reported here produces up to 100 mg of soluble Mac1 domain
protein per liter of E. coli culture, and this protein retains a high
affinity for ADP-ribose. The protein could be used for structural studies and screening
campaigns. Screening assays with the SARS-CoV-2 protein might be more efficient because the
SARS-CoV-2 protein binds ADP-ribose somewhat more tightly (Kd =
10 μM) than the SARS-CoV protein (Kd = 24 μM). The
recombinant protein reported here together with detailed structural information might also
be useful to others developing SARS-CoV-2 diagnostics and/or therapeutics.