The nucleocapsid (N) protein is one of the four structural proteins of the SARS-CoV-2 virus and plays a crucial role in viral genome organization and, hence, replication and pathogenicity. The N-terminal domain (NNTD) binds to the genomic RNA and thus represents a potential target for inhibitor and vaccine development. We determined the atomic-resolution structure of crystalline NNTD by integrating solid-state magic angle spinning (MAS) NMR and X-ray diffraction. Our combined approach provides atomic details of protein packing interfaces as well as information about flexible regions such as the N- and C-termini and the functionally important RNA-binding β-hairpin loop. In addition, ultrafast (100 kHz) MAS 1H-detected experiments permitted the assignment of side-chain proton chemical shifts not available by other means. The present structure offers guidance for designing therapeutic interventions against SARS-CoV-2 infection.
SARS-CoV-2, a positive-sense single-stranded RNA virus from the β-coronavirus
family,[1] is the causative agent of the COVID-19 pandemic that killed
millions of people and brought the world economy to a grinding halt.[2,3] The SARS-CoV-2 genome encodes four
structural proteins: spike (S) glycoprotein, envelope (E) protein, membrane (M) protein, and
nucleocapsid (N) protein.[4,5] All play crucial roles in the viral life cycle and pathogenicity,
including host immunity evasion.[6] Due to its important role in genome
packaging and ribonucleoprotein (RNP) formation, the N protein represents a potential target
for therapeutic interventions.[7−9]
The N protein comprises two folded domains, the N-terminal (NNTD, residues
40–174) and C-terminal (NCTD, residues 246–365) domains, connected
by an ∼70 amino acid linker region that contains a 13-residue serine/arginine motif,
as well as extensive intrinsically disordered regions (IDRs) at the N- and
C-termini[10−15] (Figure 1a). All domains, including the NNTD, play an important role in the interaction
with the RNA genome.[16−18]
Figure 1
Domain delineation, amino acid sequence, and magic angle spinning (MAS) NMR spectra
used for resonance assignment of SARS-CoV-2 NNTD. (a) Top: domain
organization of SARS-CoV-2 nucleocapsid (N) protein; N-terminal domain (NNTD)
and C-terminal domain (NCTD). Bottom: NNTD primary sequence and
β-strands (blue arrows); residues 2–136 (black) in the current
NNTD construct (this work) correspond to residues 40–174 (gray) in
the full-length N protein. (b) Selected regions of 1H-detected
two-dimensional (2D) (H)NH HETCOR and (H)CH HETCOR spectra of
U–13C,15N–NNTD. The expansions around
the A52 and T38 cross-peaks in (H)NH and (H)CH spectra depict one-dimensional (1D)
slices to illustrate the line widths in the two frequency dimensions. (c) Aliphatic
region of the 2D CORD spectrum (25 ms mixing time). (d) Sequential assignments for the
D44–G47 stretch of residues are illustrated with representative strips of 2D
NCACX (gray), NCOCX (blue), 1H-detected three-dimensional (3D) (H)CANH
(black), and (H)CONH (pale blue) spectra. (e) Selected strips from the
1H-detected 3D (H)CCH spectrum for residues T77 and P42 illustrating
13C and 1H side-chain resonance assignments. CORD, NCACX, and
NCOCX spectra were recorded at an MAS frequency of 14 kHz; (H)CANH and (H)CONH spectra
were acquired at an MAS frequency of 60 kHz; and HETCOR and (H)CCH spectra were acquired
at an MAS frequency of 100 kHz. The number of scans and the number of points in the direct
and indirect dimensions are as follows: 2D (H)NH HETCOR - 32 scans, 1024
t2 points, 1034 t1 points; 2D
(H)CH HETCOR - 64 scans, 1024 t2 points, 2310
t1 points; 2D CORD - 192 scans, 2048
t2 points, 840 t1 points; 2D
NCACX - 2048 scans, 2048 t2 points, 96
t1 points; 2D NCOCX - 1536 scans, 3072
t2 points, 96 t1 points; 3D
(H)CANH - 48 scans, 2048 t3 points, 112(15N)
t2 points, 32(13C) t1 points; 3D
(H)CONH - 32 scans, 2048 t3 points, 112(15N)
t2 points, 32(13C)
t1 points; and 3D (H)CCH - 8 scans, 1024
t3 points, 264 (13C)
t2 points, 264 (13C)
t1 points.
Several structures of β-coronavirus NNTD domains have been reported, all
of which possess the same architecture, resembling a right hand.[19−24] The core
structure is made of a four-stranded antiparallel β-sheet, the palm, from which the
β2, β3 hairpin prominently protrudes. It contains several basic residues, and
this basic finger and the palm have been implicated in RNA binding.[25] The
loop connecting β2 and β3 is flexible, in agreement with the missing density in
this region of most X-ray structures (see below).[19,26] The N-terminal disordered tail projects outward and may
contribute to RNA binding.[19]
Here, we report the atomic-resolution structure of crystalline NNTD, determined
by combining X-ray crystallography and solid-state magic angle spinning (MAS) NMR
spectroscopy. The protein crystallized in the P2₁2₁2₁ space
group with four chains in the asymmetric unit, and the X-ray structure was solved at 1.7
Å resolution. The MAS NMR structure of an individual NNTD chain, with a backbone precision of 0.7
Å (pairwise rmsd), was determined using a single crystalline
U–13C,15N–NNTD sample, based on 2968
nonredundant 13C–13C, 15N–13C, and
15N–1H distance restraints. Several inter-chain contacts were
identified in 13C–13C correlation experiments, both for chains
in the asymmetric unit as well as across asymmetric units. Side-chain proton chemical shifts
were assigned from high-frequency (100 kHz) MAS NMR correlation experiments and provided
important structural information, such as the tautomeric state of the H107 residue. Our
results illustrate the power of integrating orthogonal structural techniques, here MAS NMR
and X-ray diffraction, for assessing details of protein conformations. The atomic-resolution
structure of crystalline NNTD reported here will guide the development of
small-molecule inhibitors and biologics for treatment as well as biosensors for the
detection of SARS-CoV-2 infection.[27]
Results
Resonance Assignments
Chemical shift assignments and distance restraints for NNTD structure
calculation were obtained using a single sample of fully protonated crystalline
U–13C,15N–NNTD comprising residues
40–174 (current construct residues 2–136; Figure 1a; see the Materials and Methods section
for experimental details). A total of eleven 2D and three 3D 1H- and
13C-detected high-frequency (100 and 60 kHz) MAS NMR experiments were
recorded (Figure 1b–e and Supporting
Information Table S1). The spectra are of remarkably high resolution, with line widths
as narrow as 35 Hz for 15N, 48 Hz for 13C, and 174 Hz for
1HN (Figure 1b).
2D CORD,[28] NCACX, and NCOCX at a 25 ms mixing time, as well as
1H-detected 2D (H)NH HETCOR, 3D (H)CANH, and (H)CONH spectra (Figure 1c,d), were used for sequential backbone
assignments, and 13C and 15N chemical shifts are complete for 128 of
136 residues. For five residues, F28, P84, P113, P124, and E136, partial backbone chemical
shift assignments were obtained, and, for 119 residues, backbone amide proton
(HN) chemical shifts were assigned. The resonances of the first two residues,
R2 and R3, are missing in the spectra, likely due to disorder. Overall, good agreement is
observed between 1H and 15N chemical shifts determined in this work
and those reported previously from solution NMR.[25,29] MAS NMR assignments for a representative
stretch of residues D44–G47 are illustrated in Figure 1d.
Side-chain 13C chemical shifts and inter-residue correlations were obtained
from 2D NCACX, NCOCX, and CORD spectra, the latter acquired with 25, 100, and 500 ms
mixing times (Figures 1c and 2a). High spectral resolution permitted the unambiguous assignment of numerous
cross-peaks, including those corresponding to aliphatic-to-aromatic (left panel) and
aliphatic-to-aliphatic (right panel) side-chain correlations (Figure 2a).
To determine side-chain and backbone 1H
chemical shifts, a 3D (H)CCH correlation experiment was recorded at an MAS frequency of
100 kHz (Figure 1e). In conjunction with spectra
acquired at an MAS frequency of 60 kHz, 84 side-chain proton resonances for 71 residues
and Hα resonances for 65 residues were assigned. For 11 Ala, 3 Val, 4
Ser, and 1 His residues, complete 13C, 15N, and 1H
backbone and side-chain chemical shifts were obtained. Overall, assignments for 132
residues were attained (Figure S1) on the basis of 3728 cross-peaks in various spectra (Table 1). All chemical shifts are summarized in
Table S2 of the Supporting Information.
Figure 2
Correlation spectra, inter-residue distance restraints, and MAS NMR structure of a
single NNTD chain. (a) Superposition of representative regions of 2D CORD
spectra of U–13C,15N–NNTD acquired with
the mixing times of 100 ms (blue) and 500 ms (gray). Aromatic and aliphatic regions
are shown in the left and right panels, respectively. Representative cross-peaks
between amino acids are labeled by residue numbers. (b) The number of all
inter-residue distance restraints and long-range inter-residue distance restraints are
plotted for each residue along the polypeptide chain. (c) Superposition of the 10
lowest-energy MAS NMR structures of a single chain of SARS-CoV-2 NNTD.
β-strands are colored in blue and labeled. The number of scans and the number of
points in the direct and indirect dimensions are as follows: 2D CORD (100 ms mixing
time) - 96 scans, 3072 t2 points, 840 t1
points; 2D CORD (500 ms mixing time) - 192 scans, 3072 t2
points, 667 t1 points.
Table 1
Summary of Samples and the Number of Assigned Peaks
sample / correlation type                        no. of assigned peaks(a)
U–13C,15N–NNTD (MAS NMR)
  intraresidue                                   1943
  sequential (|i – j| = 1)                       495
  medium range (1 < |i – j| < 5)                 306
  long range (|i – j| ≥ 5)                       972
  long range (|i – j| ≥ 5) (interchain)          12
  total assigned peaks (MAS NMR)                 3728
U–15N–NNTD (solution NMR)
  intraresidue                                   159
total assigned peaks                             3887
(a) Cross-peaks present in different experiments are counted only once.
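The sequence-separation categories used in Table 1 can be expressed as a small classifier. The following is a minimal sketch, not the procedure used in this work: the function name `classify_range` and the example residue pairs are hypothetical and serve only to illustrate the |i – j| thresholds stated in the table.

```python
from collections import Counter


def classify_range(i: int, j: int) -> str:
    """Classify a correlation between residues i and j by sequence separation.

    Thresholds follow the categories listed in Table 1:
    intraresidue (|i - j| = 0), sequential (|i - j| = 1),
    medium range (1 < |i - j| < 5), long range (|i - j| >= 5).
    """
    separation = abs(i - j)
    if separation == 0:
        return "intraresidue"
    if separation == 1:
        return "sequential"
    if separation < 5:
        return "medium range"
    return "long range"


# Hypothetical residue pairs, for illustration only (not taken from the peak lists).
example_pairs = [(44, 44), (44, 45), (46, 49), (10, 90), (5, 10)]
counts = Counter(classify_range(i, j) for i, j in example_pairs)
print(dict(counts))
```

Interchain correlations cannot be recognized from residue numbers alone; in this sketch they would have to be tracked with an additional chain label.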
Gratifyingly, many side-chain protons of aromatic residues could be unambiguously
assigned from the 1H-detected 100 kHz MAS NMR spectra (Figure
b). For example, for W70 and W94, located in the
β-sheet core and assumed to be involved in RNA binding, side-chain protons were
assigned fully (W70) or partially (W94). Moreover, the tautomeric state of H107 was
determined (see below).
Structure of a Single NNTD Chain Determined by MAS NMR
The structure of an NNTD single chain was calculated using 2968 nonredundant
distance and 101 ϕ/ψ torsion angle restraints. Of these, 2197 are unambiguous
13C–13C, 763 are 15N–13C, and
4 are 1H–15N distance restraints, including 968 long-range
(|i – j| ≥ 5) restraints (Table 2 and Figure S2 of the Supporting Information). The number of restraints per
residue is plotted in Figure 2b. The 10
lowest-energy MAS NMR structures in the structural ensemble and an average structure of a
single chain of NNTD are shown in Figures 2c and S3 of the Supporting Information, respectively. All MAS NMR distance
restraints are summarized in Table 2. With nearly
22 restraints/residue on average, the NNTD structure determined in this study
represents a notable technical advance, being one of only two MAS NMR structures of
proteins larger than 100 residues per chain determined using more than 20
restraints/residue and reaching the maximum accuracy and precision attained for MAS NMR
protein structures[30] (see below).
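The restraint density quoted above follows from simple arithmetic on the reported counts. The sketch below assumes that the total is the sum of the unambiguous and ambiguous entries and that the chain length is 136 residues (the number used in the assignment statistics); both assumptions reproduce the reported values, but the variable names are illustrative only.

```python
# Restraint counts as reported in this work.
unambiguous = {"13C-13C": 2197, "15N-13C": 763, "1H-15N": 4}
ambiguous = 4

# Assumption: 136 residues per chain, as in "128 of 136 residues".
n_residues = 136

total_restraints = sum(unambiguous.values()) + ambiguous
restraints_per_residue = total_restraints / n_residues

print(total_restraints)                  # total number of distance restraints
print(round(restraints_per_residue, 1))  # average restraints per residue
```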
Table 2
Summary of MAS NMR Restraints and Structure Statistics
MAS NMR distance restraints                      13C–13C   15N–13C   1H–15N
unambiguous                                      2197      763       4
  intraresidue                                   807       505       0
  sequential (|i – j| = 1)                       119       258       4
  medium range (1 < |i – j| < 5)                 303       0         0
  long range (|i – j| ≥ 5)                       968       0         0
ambiguous                                        4
total number of restraints assigned              2968 (21.8 restraints per residue)
MAS NMR Dihedral Angle Restraints
  Φ                                              101
  Ψ                                              101
Structure Statistics from 10 Lowest-Energy Subunits
violations (mean ± sd)
  distance restraints ≥ 7.2 Å (Å)                0.144 ± 0.001
  dihedral angle restraints ≥ 5° (deg)           1.528 ± 0.137
  max. distance restraint violation (Å)          1.254
  max. dihedral angle restraint violation (deg)  17.267
deviations from idealized geometry
  bond lengths (Å)                               0.008 ± 0.000
  bond angles (deg)                              0.774 ± 0.012
  impropers (deg)                                0.516 ± 0.016
average pairwise rmsd (Å)(a)
  backbone (N, Cα, C′)                           0.7 ± 0.2
  heavy                                          1.2 ± 0.1
(a) Disordered N-terminus (residues 1–9) excluded.
Like all coronavirus NNTD structures,[16,19,24−26,31,32] the MAS NMR-derived structure exhibits the overall shape
of a right hand, made up of a four-stranded β-sheet, comprising β1
(L18–T19), β2 (I46–R55), β3 (D65–Y74), and β4
(I92–A96). At its center, a long β-hairpin protrudes out from the palm (Figure 2c). The irregular regions at the N- and
C-termini exhibit well-defined backbone and side-chain orientations in the MAS NMR
structure, except for the first eight amino acids (R2–N9) and the last residue
(E136) (see Figures S2 and S3 of the Supporting Information). The lack of long-range
inter-residue distance restraints for the N-terminal tail residues (P4–N9) and
β-hairpin loop (I56–K64) suggests that they are dynamic (Figure 2b). The precision of the single-chain MAS NMR structure
is 0.7 ± 0.2 Å, as measured by the pairwise atomic backbone rmsd for the 10
lowest-energy structures (excluding the disordered N-terminal tail, residues R2–N9)
(Table 2 and Figure S3 of the Supporting Information).
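The precision measure quoted above, an average pairwise backbone rmsd over an ensemble, can be sketched as follows. This is an illustration under stated assumptions, not the software used in this work: the models are assumed to be already superimposed (no fitting step is performed here), the function names `rmsd` and `pairwise_rmsd` are hypothetical, and the toy coordinates are invented for the example.

```python
import math
from itertools import combinations
from statistics import mean, stdev


def rmsd(model_a, model_b):
    """RMSD between two equally long lists of (x, y, z) coordinates.

    Assumes the two models are already superimposed.
    """
    if len(model_a) != len(model_b):
        raise ValueError("models must contain the same number of atoms")
    squared = sum(
        (p - q) ** 2
        for atom_a, atom_b in zip(model_a, model_b)
        for p, q in zip(atom_a, atom_b)
    )
    return math.sqrt(squared / len(model_a))


def pairwise_rmsd(models):
    """Mean and sample standard deviation of the RMSD over all model pairs."""
    values = [rmsd(a, b) for a, b in combinations(models, 2)]
    spread = stdev(values) if len(values) > 1 else 0.0
    return mean(values), spread


# Toy ensemble: three hypothetical models with two atoms each (in Å).
models = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)],
    [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)],
]
average, spread = pairwise_rmsd(models)
print(f"{average:.2f} ± {spread:.2f} Å")
```

For real ensembles, each pair of models would first need an optimal superposition over the selected backbone atoms (N, Cα, C′), with the disordered residues excluded from the selection.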
X-ray Crystal Structure of the NNTD
The protein crystallized in the P2₁2₁2₁ orthorhombic
space group with four monomers (chains A–D) in the asymmetric unit (Figures 3a and S4a of the Supporting Information). Two views of the four chains are
provided in Figure 3a, and chain A is depicted in
Figure 3b. Details of the β2, β3
hairpin and loop region, as well as the difference electron density map, are shown in
Figure 3c. Complete statistics for X-ray data
collection, phasing, and refinement are provided in Table 3. The average pairwise rmsd value between the four chains in the
asymmetric unit is 0.5 ± 0.1 Å for the backbone atoms (excluding common missing
residues in all four chains, R2–N9, Q20–D25, R57–P68, and
P124–E136) (Figure S5b and Table S3 of the Supporting Information). A positively charged
region, comprising basic residues in the β-sheet (R50, R51, R54, R55, R69) and at
the tip of the β-hairpin finger (R57, K62), may contribute to RNA binding
(Figure S4c of the Supporting Information).[19,25]
Figure 3
X-ray crystal structure of SARS-CoV-2 NNTD. (a) Ribbon and surface
representation of the four NNTD chains in the asymmetric unit shown in two
different orientations (PDB: 7UW3); chain A (gray), chain B (purple), chain C (cyan), and chain D
(orange). (b) Structure of chain A (ribbon representation) with the strands in the
β-sheet core labeled β1−β4. (c) Electron density map for the
β-hairpin loop of chain A superimposed on the atomic model in stick
representation.
Table 3
X-ray Data Collection and Refinement Statistics(a)
                                   SARS-CoV-2 NNTD
Data Collection
  wavelength (Å)                   0.9794
  space group                      P2₁2₁2₁
  cell dimensions
    a, b, c (Å)                    58.76, 92.76, 95.59
    α, β, γ (deg)                  90, 90, 90
  resolution range (Å)             37–1.70
  Rsym or Rmerge                   0.024 (0.65)(b)
  I/σI-CC1/2                       27.5 (1.1)-99 (42)
  completeness (%)                 98.9 (98.2)
  redundancy                       11 (7)
Refinement
  refinement program               COOT
  resolution range (Å)             37–1.70
  no. reflections                  59,316
  Rwork/Rfree                      24.4/28.5
  no. of nonhydrogen atoms
    protein                        3642
    solvent (water)                650
  B-factors
    protein                        48.5
    solvent (water)                42
  Rms deviations
    bond lengths (Å)               0.003
    bond angles (deg)              0.68
  PDB ID                           7UW3
(a) Molecular replacement.
(b) Values in parentheses are for the highest-resolution shell.
Interesting details about the intratetramer interfaces in the crystal can be noted in the
structure, with five unique types of contacts formed by the residues within the tetramer
(Figure 4a). Specifically, (i) the A–B
interface comprises several residues (T18, H21, I36, R54, R56–G59, K64, L66, S67,
V120, Q122, Y134, and A135) from both chain A and chain B; (ii) the A–C interface
is very small and involves residues G59 and D60 of chain A contacting K131 of chain C;
(iii) the B–C interface comprises several residues, such as R30–Q32, P42,
D43, E98–L101, and P124–T127 of the chain B palm region, which are in
contact with the C-terminal tail (residues L121–L129, K131, G132, Y134, A135), as
well as H21, G22, and K23 of chain C; (iv) the B–D interface packs the palm regions
of chains B and D against each other; and (v) the D–C interface comprises residues
I56 and P113–A117 of chain D and T16, H21, A117, A118, V120, and Q122 of chain
C.
Figure 4
Interchain interfaces and crystal packing in the NNTD structure. (a)
Intratetramer interfaces. Top and middle: five unique interchain interfaces in the
asymmetric unit of NNTD crystal; chain A (gray), chain B (purple), chain C
(cyan), and chain D (orange). Interface residues are in yellow stick representation.
(b) Intertetramer interfaces. Top-left: each single tetramer (numbered 0) forms
intertetramer interfaces with 10 neighboring tetramers (1–10). Intertetramer
interface residues are colored yellow. Top-right and middle-right: four unique
intertetramer interfaces are formed based on symmetry operation. The nomenclature for
a specific chain (A) in a tetramer (0) is 0A. Symmetry-related interfaces
are boxed and expanded, with individual residues labeled and depicted in stick
representation. Selected regions of a 2D CORD spectrum (100 ms mixing time) showing
intratetramer correlations (magenta) and intertetramer correlations (green) (a and b,
bottom panels). (c) Left: representative strips of the 2D CORD (top strips, 25 ms
mixing time, gray; 100 ms mixing time, red), 2D NCACX (middle strip), 2D NCOCX (bottom
strip), and 2D (H)NH HETCOR (right strips) spectra illustrating the sequential
assignment for T77–G78. Resonances for two conformers, a and b, of T77 are
indicated by dotted and solid lines, respectively. Right: interchain contacts of T77
for each chain are colored yellow. The number of scans and the number of points in the
direct and indirect dimensions are as follows: 2D CORD (25 ms mixing time) - 192
scans, 2048 t2 points, 840 t1
points; 2D CORD (100 ms mixing time) - 96 scans, 3072 t2
points, 840 t1 points; 2D NCACX - 2048 scans, 2048
t2 points, 96 t1 points; 2D
NCOCX - 1536 scans, 3072 t2 points, 96 t1
points; and 2D (H)NH HETCOR - 80 scans, 3072 t2 points;
400 t1 points.
The X-ray and the MAS NMR structures of the individual chains are in good agreement, with
a backbone rmsd of 1.1 Å between the X-ray structure (averaged over the four chains
in the asymmetric unit) and the MAS NMR structure (averaged over the ensemble of 10
lowest-energy structures) (Figure S5 and Table S3 of the Supporting Information). Upon exclusion of
chain D, which possesses the highest degree of disorder in the X-ray structure, the
corresponding value becomes 0.7 Å. The average pairwise backbone rmsd values between the
four chains in the X-ray structure and within the ensemble of 10 lowest-energy MAS NMR
structures are both 0.5 ± 0.1 Å (Table S3 of the Supporting Information). Side-chain conformations for most
residues in all four chains of the X-ray structure vary little, except for residues N10,
R30, R55, and I56, which exhibit major differences (Figure S6 of the Supporting Information). In contrast to the MAS NMR structure,
which was determined using distance restraints and/or chemical shifts for most residues,
except R2, R3, and E136, the X-ray electron density is either missing or very weak for residues R2–N9
and E136 in all chains and for residues Q20–D25, R57–P68, and
P124–E136 in chain D.
In addition to the five intratetramer interfaces, each tetramer (arbitrarily designated
as “tetramer 0”) contacts 10 neighboring tetramers (tetramers 1–10),
resulting in four distinct intertetramer interfaces, classified by symmetry operators. A
specific chain (A) in a tetramer (0) is denoted as 0A. The
packing of tetramers in the crystallographic lattice is depicted in Figure 4b. Intertetramer interface 1 is formed by two
tetramers, 1 and 2, adjacent to tetramer 0. This interface comprises residues
G82–P84 and A87–K89 of chains C (0C) and D (0D) of
tetramer 0 and equivalent residues G82–P84 and A87–K89 in 2D and
1C, respectively (colored light gray in Figure 4b(i)). Intertetramer interface 2 is formed by four tetramers, 3, 4,
5, and 6 (colored green in Figure 4b(ii)) around
tetramer 0. Several residues in 0A (N10–W14, T16, T38–S41, D44,
R50–R57, K62, M63, L66, R69, Y71, Y73, P79, D106–V120), 0B
(N10–S13, T16, T38–S40, R50–R57, G58–D60, M63, L66, R69, Y71,
Y73, P79, G82–K89, R111–V120, Q122), 0C (N10, T11, W14, R30,
N37–I46, Y48, L75–A87, V95–I119), and 0D (N10, T11, W14,
N39, S41–D44, I46, Y48, L75–A87, A96–N116) are involved in
crystallographically inequivalent interfaces (see Table S4). Intertetramer interface 3 is formed by two tetramers, 7 and 8
(colored pink in Figure 4b(iii)), adjacent to
tetramer 0. Residues H21–D25, K27, P29–Q32, A96–G99, and
Q122–A135 in 0A, Q20–K23, R57, K62, K64, L66–P68, and Y134
in 0B, residues R50, R51, T53–R57, G61–M63, and D65–L66 in
0C, and I36 and V120–L123 of 0D have contacts with residues
comprising tetramers 7 and 8. Intertetramer interface 4 comprises two tetramers, 9 and 10
(colored blue in Figure 4b(iv)), adjacent to
tetramer 0. The residues involved in this interface are E80, G82–P84, and
A87–K89 of 0A and 9A and W14, D106–R112 in
10B and 0B, respectively. All intra- and intertetramer contacts
for each NNTD residue are summarized in Table S4.
These unique intra- and intertetramer interfaces are reflected in distinct correlations
in the MAS NMR spectra. In the (H)NH spectrum recorded at an MAS frequency of 100 kHz,
multiple resolved peaks or broad (unresolved) peaks are observed for residues that are
found in several different local environments, with 15N peak widths of
∼85–110 Hz (Figure 5b), whereas
those for amino acids in single conformations are ∼40–60 Hz (Figure 5a,c). Examples of correlations corresponding
to the intratetramer A–B, A–C, and B–D interfaces, as well as to the
intertetramer interfaces, are shown in Figure 4a,b, bottom panels.
Figure 5
Selected amino acids that exhibit multiple backbone amide resonances in the 2D CORD
spectra of crystalline SARS-CoV-2 NNTD. (a, b) Individual single
cross-peaks reporting on a unique environment (a, pink labels) and doubled cross-peaks
reporting on different environments for the different chains (b, gray labels) with
their corresponding 1D 15N slices. (c) The location of amino acids (pink)
possessing single amide backbone cross-peaks mapped onto the structure of chain A in
the X-ray structure.
One notable example of unique intratetramer contacts, evidenced by multiple cross-peaks
in the MAS NMR spectra, is seen for the H21–Y134 pair of residues (Figure 4a, bottom panel). It is evident from the X-ray
structure that the interchain distances are much shorter than the intrachain ones
(3.5–4.6 and 7.0–8.0 Å, respectively); hence, only interchain
correlations are expected in the spectra (Figure 6). Another interesting example involves residues D60 and M63 in the β2,
β3-hairpin loop, for which intra- and intertetramer correlations were identified
based on the following considerations: (i) D60 from chain A, which has no intertetramer contacts,
has a unique intratetramer correlation with K131 at the A–C interface. In contrast,
there are no intratetramer correlations involving D60 from other chains with K131, and
only intertetramer correlations are present; (ii) M63 has no intratetramer interactions;
(iii) M63 from chains A and B is in close proximity to N39 in the symmetry-related
tetramer interface 2; and (iv) M63 from chain C forms contacts with A135 across the
tetramer interface 3 (Figure 4b, bottom panel,
and Figure 7).
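The reasoning used above for the H21–Y134 pair, namely that only atom pairs closer than some distance limit are expected to give cross-peaks, can be sketched as a simple distance filter. The cutoff value, the function name `contacts_within`, the atom labels, and the coordinates below are all hypothetical; they are chosen so that the interchain pair (4.0 Å) passes and the intrachain pair (7.5 Å) does not, mirroring the distance ranges quoted in the text.

```python
import math


def contacts_within(atoms_a, atoms_b, cutoff):
    """Return (label_a, label_b, distance) for atom pairs at or below cutoff (Å)."""
    hits = []
    for label_a, xyz_a in atoms_a.items():
        for label_b, xyz_b in atoms_b.items():
            distance = math.dist(xyz_a, xyz_b)
            if distance <= cutoff:
                hits.append((label_a, label_b, round(distance, 2)))
    return hits


# Hypothetical coordinates (Å), for illustration only.
h21_chain_a = {"A:H21": (0.0, 0.0, 0.0)}
y134_candidates = {
    "B:Y134": (4.0, 0.0, 0.0),  # interchain partner, 4.0 Å away
    "A:Y134": (0.0, 7.5, 0.0),  # intrachain partner, 7.5 Å away
}

# Assumed detection limit; the actual limit depends on the experiment.
hits = contacts_within(h21_chain_a, y134_candidates, cutoff=7.0)
print(hits)
```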
Figure 6
H21–Y134 intra- and interchain contacts at the A–B interface in the
X-ray structure. (a) The A–B dimer in the asymmetric unit. (b) Close-up view of
the interchain contacts formed between H21 and Y134 from chains A and B. Interchain
contacts (dotted lines) are shorter than the intrachain contacts.
Figure 7
Intra- and intertetramer contacts involving the β-hairpin loop. (a) The
A–C interface in the asymmetric unit of X-ray structure is colored yellow
(top), and the D60–K131 contact for which correlations are observed in MAS NMR
spectra is shown at the bottom. (b) The three intertetramer interfaces around M63 in
the crystal lattice are colored yellow (top), with a detailed view provided in the
bottom two panels. The numbering of the tetramers and colors is as in Figure 4b.
In addition to the intra- and intertetramer contacts discussed above, we also observed
multiple conformers for many residues, as is evident from their distinct backbone and
side-chain chemical shifts. For example, T77 exhibits several resolved resonances with
unique chemical shifts each (Figure c, left
panel). A unique T77Cβ–A117C′ cross-peak is found for one
of the two conformers (designated as conformer b, T77b) in the 100 ms mixing
time CORD spectrum. This correlation is missing for the second conformer, T77a.
Both conformers exhibit significant chemical shift differences (ΔN
= 1.4 ppm, ΔCβ = 0.3 ppm,
ΔCγ = 0.3 ppm, ΔC′
= 0.6 ppm), consistent with distinct local environments for T77. Indeed, in the X-ray
structure, T77 in chains 0A and 0B has no intra- or intertetramer contacts
within 7 Å, consistent with this conformer being T77a. In contrast, T77 in
chains 0C and 0D exhibits intertetramer contacts with A117 from
chains 4B and 3A, respectively, therefore suggesting that these
resonances correspond to the T77b conformer (Figure c, right panels). Likewise, at least two distinct conformations are
seen for A52, whose backbone chemical shifts are different (ΔN =
2.0 ppm, ΔCα = 0.4 ppm), suggesting that they
exist in unique local environments (Figure S7, top panel, Supporting Information). This finding is fully
consistent with the X-ray structure (Figure S7, bottom panel, Supporting Information), where A52 in chains 0A
and 0B forms intertetramer contacts with D43 and L101 from chains 6C
and 5D, respectively, while A52 in chains 0C and 0D forms
contacts with E24 and K27 from chains 8A and 0B, respectively.
Tautomeric State of Histidine-107 in the Crystal
In the 1H-detected 2D (H)NH spectrum acquired with a CP contact time of 0.3
ms, H107 gives rise to two distinct
15Nε2–Hε2 cross-peaks at
δ(15N) = 170.2 ppm/δ(1H) = 12.4 ppm and
δ(15N) = 170.4 ppm/δ(1H) = 12.6 ppm (Figures b and 8a). In 2D (H)CH spectra,
multiple Cε1–Hε1 and
Cδ2–Hδ2 correlations are also observed (Figure a). Taken together, these results suggest the
presence of local heterogeneity around H107, consistent with two distinct local
environments seen in the X-ray crystal structure (Table S4 of the Supporting Information).
Figure 8
Side-chain imidazole state of H107 in the NNTD crystal. (a) Strips
extracted from 2D (H)CH and (H)NH HETCOR spectra acquired with CP contact times of
0.3 ms (gray) and 4 ms (blue) (2D (H)NH HETCOR spectra only). H107 side-chain
13C, 15N, and 1H resonances are labeled. (b)
Conformations of H107 and the neighboring D44 residues in chains A–D of the
crystal. The interatomic
H107Nε2–D44Oδ1,δ2 distances vary
between 2.9 and 3.4 Å. The number of scans and the number of points in the direct
and indirect dimensions are as follows: 2D (H)CH HETCOR - 64 scans, 1024
t2 points, 2310 t1 points;
2D (H)NH HETCOR - 16 scans, 1024 t2 points, 512
t1 points.
The tautomeric state of H107 was ascertained by a 1H-detected 2D (H)NH
experiment acquired with a CP contact time of 4 ms (Figure a). The cross-peaks at δ(15N) ∼ 170
ppm/δ(1H) = 7.9 ppm, δ(15N) ∼ 170
ppm/δ(1H) = 7.7 ppm, and δ(15N) ∼ 249
ppm/δ(1H) = 7.9 ppm were unambiguously assigned as
Nε2–Hε1,
Nε2–Hδ2, and
Nδ1–Hε1 correlations, respectively, based on
the Hε1 and Hδ2 chemical shift assignments from 2D (H)CH
and 3D (H)CCH experiments (Figure a). These
results suggest that H107 is the Nε2–H
tautomer.[33−36] Further evidence comes from a solution HMBC spectrum
recorded at the pH of the crystallization (pH 6.3) (Figure S8 of the Supporting Information), where
Nε2–Hε1 and
Nε2–Hδ2 correlations were observed at
δ(15N) = 170 ppm/δ(1H) = 7.7 ppm and
δ(15N) = 170 ppm/δ(1H) = 6.9 ppm, respectively. The
Nδ1–Hε1 correlation at δ(15N)
= 243.4 ppm/δ(1H) = 7.7 ppm is consistent with H107 being the
Nε2–H tautomer.[35,36]

The deshielded Hε2 resonance at δ(1H) ∼ 12.5 ppm
suggests that the H107 imidazole may be involved in hydrogen bonding or close to a
negatively charged group.[34,37] The presence of a
15Nε2–Hε2 correlation in the (H)NH
spectrum acquired with a 0.3 ms CP contact time also suggests a short
nitrogen–hydrogen distance, possibly a directly bonded
Nε2–H. Indeed, in the X-ray crystal structure, H107 is in close
proximity to D44 and the interatomic
H107Nε2–D44Oδ1,δ2 distances are 2.9, 2.9,
3.2, and 3.4 Å in chains A, B, C, and D, respectively (Figure b). Taken together, these data indicate that the
(H107)Nε2–H is close to the carboxylate side chain of D44.
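The assignment logic used above can be illustrated with a minimal sketch. It assumes only what the spectra in this section show: a protonated imidazole nitrogen resonates near 170 ppm and an unprotonated one near 249 ppm. The 210 ppm cutoff and the function names are hypothetical choices made for illustration and are not part of the analysis reported here.

```python
# Illustrative sketch: infer the histidine tautomer from imidazole 15N shifts.
# ASSUMPTION: a single cutoff of 210 ppm separates protonated (~170 ppm) from
# unprotonated (~249 ppm) nitrogens; this value is hypothetical, chosen midway
# between the two shift ranges reported in the text.
PROTONATION_CUTOFF_PPM = 210.0


def is_protonated(n15_shift_ppm):
    """Return True if the 15N shift looks like an N-H (protonated) nitrogen."""
    return n15_shift_ppm < PROTONATION_CUTOFF_PPM


def classify_tautomer(nd1_ppm, ne2_ppm):
    """Classify the imidazole state from the Nd1 and Ne2 chemical shifts."""
    nd1_h = is_protonated(nd1_ppm)
    ne2_h = is_protonated(ne2_ppm)
    if ne2_h and not nd1_h:
        return "Ne2-H tautomer"
    if nd1_h and not ne2_h:
        return "Nd1-H tautomer"
    if nd1_h and ne2_h:
        return "both nitrogens protonated"
    return "neither nitrogen protonated"


# Shifts reported for H107: MAS NMR (crystal) and solution HMBC.
print(classify_tautomer(nd1_ppm=249.0, ne2_ppm=170.0))
print(classify_tautomer(nd1_ppm=243.4, ne2_ppm=170.0))
```

Both the crystal and solution shifts fall on the same side of the cutoff, so the sketch returns the same tautomer for each, in line with the conclusion drawn from the spectra.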
Discussion
The NTD of the SARS-CoV-2 N protein has previously been structurally
characterized by X-ray crystallography and solution NMR.[16,19,25] Here, we present a
structure that was determined by integrating MAS NMR and X-ray diffraction, providing
important novel findings of distinct conformers, made possible by the remarkably high
resolution of the MAS NMR spectra. The structural heterogeneity of NNTD is an
outcome of crystallization, as seen in other NNTD crystal
structures,[19,26,31] and underscores the ability of the protein to form multiple types of
contacts involving distinct conformers with unique local environments.

From the technical standpoint, the current study represents a notable advance for
determining protein structures by MAS NMR since a single sample of only 3.6 mg of
U–13C,15N-labeled NNTD packed in a 1.3 mm MAS
rotor was sufficient to obtain all necessary spectra. A portion of the same sample (∼0.5 mg) was
subsequently packed in a 0.7 mm MAS rotor for 1H-detected experiments at 100 kHz
MAS and 20.0 T, and these additional experiments yielded unique information on the
side-chain protons. Resonances for 98% of all amino acids were assigned, and a large number
of correlations corresponding to 2968 distance restraints, including 968 long-range
restraints, were obtained from 14 2D and 3D data sets. As a result, no preparations of
isotopically diluted samples were necessary. At 21.8 restraints per residue, the
NNTD single-chain structure reported here is among the most precise and
accurate MAS NMR structures determined to date.

Notable is the complementarity of information obtained by MAS NMR and X-ray diffraction. In
the X-ray structure, the atomic level details for individual chains and information on the
quaternary arrangement in the crystal are obtained, while the strength of MAS NMR lies in
providing information about dynamically disordered regions, proton positions, protonation
and tautomeric states, and contacts with water molecules. To understand the dynamics of the
different regions of NNTD in solution and in the crystalline forms and their role
in RNA binding, it will be interesting in future studies to perform measurements of
relaxation rates, chemical shifts, and dipolar anisotropy tensors.
Conclusions
We have determined the structure of SARS-CoV-2 NNTD by integrating MAS NMR and
X-ray diffraction. Our combined approach provided atomic details of packing interfaces as
well as information about disordered residues at the N- and C-termini and the functionally
important RNA binding, β-hairpin loop. In addition, 1H-detected experiments
at an MAS frequency of 100 kHz permitted the assignment of side-chain proton chemical shifts
not available by other means. The present structure offers guidance for designing
therapeutic interventions against the SARS-CoV-2 infection.
Materials and Methods
Expression and Purification of NNTD
The recombinant plasmid for expressing SARS-CoV-2 NNTD (residues
40–174, current construct residue numbering 2–136) was prepared by
GenScript, based on the sequence previously reported for NNTD[19] and codon-optimized for E. coli. The template coding for SARS-CoV-2
NNTD was subcloned into a pET28a(+) vector fused with an N-terminal
hexahistidine tag (His6), followed by a tobacco etch virus (TEV) protease
cleavage site, His6-TEV-NNTD. For the expression of
U–13C,15N–NNTD and
U–15N–NNTD, transformed E. coli BL21
(DE3) cells were cultured in 5 mL of Luria-Bertani (LB) medium containing 100 μg/mL
of kanamycin. LB preculture was incubated at 37 °C with agitation until the
OD600 reached 1.0–1.2. Fifty milliliters of M9 medium, supplemented
with 1 g/L 15NH4Cl (U–15N–NNTD)
or 1 g/L 15NH4Cl and 2 g/L
U–13C6–glucose
(U–13C,15N–NNTD), was inoculated with 1
mL of the LB preculture and incubated overnight at 37 °C. Following the overnight
growth, the 50 mL M9 culture was transferred to 1 L of the isotopically labeled M9 medium
and incubated at 37 °C. Cells were grown to an OD600 of 1.0 and induced
with a final isopropylthio-β-galactoside (IPTG) concentration of 400 μM. The
protein was expressed at 20 °C for 16–18 h, and cells were harvested by
centrifugation at 4000g for 10 min at 4 °C. The cell pellet was
resuspended in lysis buffer (20 mM tris–HCl, 500 mM NaCl, 20 mM imidazole, 0.02%
NaN3, pH 8.0) and flash-frozen (−80 °C) for short-term
storage.

After treatment with 1 mM phenylmethylsulfonyl fluoride (PMSF), cells were lysed by
sonication at 40% power for 5 min (15 s pulse on and 45 s pulse off) on ice. The cellular
lysate was clarified by centrifugation at 14,000g for 1 h at 4 °C.
His6-tagged NNTD was purified by affinity chromatography over a 5
mL HisTrap HP column (GE Healthcare). For elution, a gradient of 20–500 mM
imidazole in 20 mM tris–HCl (pH 8.0), 500 mM NaCl, and 0.02% NaN3 was
employed. The His6-tag was cleaved with TEV protease (1:25 ratio of TEV
protease to
His6–U–13C,15N–NNTD)
for 12–16 h at 4 °C and again fractionated over a 5 mL HisTrap HP column (GE
Healthcare). Fractions were eluted in 20 mM tris–HCl, 500 mM NaCl, 20 mM imidazole,
0.02% NaN3, pH 8.0, and the pure protein was buffer-exchanged into either
crystallization buffer (20 mM tris–HCl, 50 mM NaCl, pH 6.0) or NMR buffer (20 mM
tris–HCl, 150 mM NaCl, 90:10 H2O/D2O, pH 8.0). The
buffer-exchanged NNTD was concentrated to 12 mg/mL for solution NMR and 30
mg/mL for crystallization.
Crystallization of NNTD
Small-scale crystallization was carried out at room temperature (∼20 °C)
using sitting-drop vapor diffusion. Two microliters of
U–13C,15N–NNTD (20 mM tris–HCl, 50
mM NaCl, pH 6.0) was mixed with 2 μL of crystallization buffer (100 mM MES, 30%
PEG4000, pH 6.5), modified from a previously published crystallization condition (PDB:
6WKP).[26]

U–13C,15N–NNTD for MAS NMR experiments was
crystallized using a large-scale sitting-drop method based on the volumetric proportions
of 500 μL sitting-drop crystallization wells (Figure S9 of the Supporting Information). A series of presterilized Petri
dishes were used to form a concentric sitting-drop vessel with a reservoir (volume
capacity of 25–75 mL) and three droplet wells (optimal volume of 300–1000
μL). Similar to the small-scale crystallization, the crystallization droplet mixture
comprised 250 μL of NNTD and 250 μL of crystallization buffer. The
large-scale sitting-drop vessel was sealed using vacuum grease and left undisturbed at 20
°C for 5 days. Once crystallization was complete, the protein crystals were harvested
and packed into a Bruker 1.3 mm rotor by ultracentrifugation at 10,000g
for 15 min at 4 °C. The fully packed 1.3 mm rotor contained 3.6 mg of hydrated
protein crystals. A portion of the same sample (∼0.5 mg) was subsequently packed in a 0.7 mm
rotor for experiments at 100 kHz MAS and 20.0 T.
Diffraction Data Collection and Structure Determination
X-ray diffraction data of protein crystals were collected at beamline 12-2 at the
Stanford Synchrotron Radiation Lightsource (SSRL). All diffraction data used for analysis
were collected at 100 K from crystals grown in 100 mM MES, 30% PEG4000, pH 6.0. All
diffraction data were indexed and integrated using the program XDS[38]
and scaled using the program AIMLESS from the CCP4 suite.[39] The
structure was solved by molecular replacement (MOLREP, CCP4 suite) using one monomer of
PDB ID 6M3M. Structure refinement
was carried out in Phenix[40] with manual building in COOT[41] (PDB ID 7UW3, this
work).
Solution NMR Spectroscopy
A 2D 1H–15N HSQC spectrum of 850 μM
U–15N–NNTD in 20 mM tris–HCl, 150 mM NaCl,
90:10 H2O/D2O buffer (pH 8.0) was recorded at 25 °C on a 14.1 T
Bruker Neo spectrometer equipped with a triple-resonance inverse detection (TXI) probe.
The Larmor frequencies were 600.13 MHz (1H), 150.9 MHz (13C), and
60.8 MHz (15N). Backbone and side-chain 1H and 15N
chemical shift assignments (Figure S10 and Table S5 of the Supporting Information) were obtained by
comparison with SARS-CoV-2 NNTD (BMRB:34511) and SARS-CoV-1 NNTD
(BMRB:6372) chemical shifts in the BMRB.[25,29]

1H–15N HMBC spectra were recorded at pH 6.3 to match the
crystallization pH, with delays set to 5.4, 25, and 50 ms, corresponding to 1/(2J) for a
1J coupling constant of 92 Hz and 2,3J coupling constants of 20 and 10 Hz,
respectively.
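The relation between the HMBC delays and the coupling constants can be checked with a short calculation. The sketch below assumes the standard 1/(2J) relation between a transfer delay and a scalar coupling; the function name is hypothetical and used only for illustration.

```python
# Illustrative sketch: transfer delay for a given scalar coupling constant.
# ASSUMPTION: delay = 1 / (2 * J), with J in Hz and the result reported in ms.


def transfer_delay_ms(j_hz):
    """Return the 1/(2J) delay in milliseconds for a coupling J in Hz."""
    if j_hz <= 0:
        raise ValueError("coupling constant must be positive")
    return 1000.0 / (2.0 * j_hz)


# Coupling constants quoted in the text: 92 Hz (1J), 20 and 10 Hz (2,3J).
for j_hz in (92.0, 20.0, 10.0):
    print(f"J = {j_hz:5.1f} Hz -> delay = {transfer_delay_ms(j_hz):.1f} ms")
```

The 92 Hz coupling gives the shortest delay (about 5.4 ms), and the 20 and 10 Hz couplings give 25 and 50 ms, respectively.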
MAS NMR Spectroscopy
MAS NMR spectra of U–13C,15N–NNTD protein
crystals were recorded on a 14.1 T Bruker AVIII spectrometer outfitted with a 1.3 mm HCN
probe. The Larmor frequencies were 599.8 MHz (1H), 150.8 MHz (13C),
and 60.7 MHz (15N). The MAS frequencies were controlled to within ±10 Hz
by a Bruker MAS controller. The actual sample temperature was maintained at ∼25
°C throughout the experiments.

Typical 90° pulse lengths were 1.3–1.5 μs for 1H,
2.6–2.9 μs for 13C, and 3.2–3.5 μs for
15N. 1H–13C and
1H–15N cross-polarizations were performed with an
80–100% linear amplitude ramp on 1H, with contact times of 1–1.5
and 2–2.5 ms, respectively. The center of the ramp was matched to the
Hartmann–Hahn condition at the first spinning sideband. 2D
13C–13C CORD,[28] 2D NCACX, and 2D NCOCX
spectra were recorded at an MAS frequency of 14 kHz. CORD mixing times were 25, 100, 250,
and 500 ms, and the 1H radio frequency (rf) field strength during CORD mixing
was 14 kHz. Band-selective 15N–13C magnetization transfer was achieved by spectrally induced
filtering in combination with cross-polarization (SPECIFIC-CP)[42] with a
contact time of 6.0–7.5 ms. SPINAL-64[43] decoupling
(90–100 kHz) was used during the evolution and acquisition periods.

2D 13C–13C RFDR, 1H-detected 2D (H)NH, and (H)CH
HETCOR, as well as 3D (H)CANH and (H)CONH spectra, were recorded at an MAS frequency of 60
kHz with a 2.4 ms RFDR mixing time. Swept low-power TPPM (slTPPM, 15 kHz)[44] was used for 1H heteronuclear decoupling during acquisition.
WALTZ-16[45] broad-band decoupling was used for 13C and
15N decoupling during 1H acquisition. For 3D
1H-detected (H)CANH and (H)CONH spectra, CA-N and CO-N CP contact times were
6–7.5 ms with a constant-amplitude spin lock of about 25 kHz on 13C and a
tangent-modulated amplitude spin lock with a mean rf field amplitude of about 35 kHz on
15N.[46]

Additional MAS NMR spectra were recorded on a 20.0 T Bruker AVIII spectrometer outfitted
with a 0.7 mm HCND and a 1.3 mm HCN probe. The Larmor frequencies were 850.4 MHz
(1H), 213.9 MHz (13C), and 86.2 MHz (15N). The MAS
frequency was 100 kHz, controlled to within ±50 Hz by a Bruker MAS controller. The
sample temperature was maintained at ∼25 °C throughout the experiments. Pulse
lengths (90°) were 1.3 μs for 1H, 3.15 μs for 13C,
and 3 μs for 15N. The (H)NH spectrum was recorded using a back CP (HN) of
an 800 μs contact time with an 80–100% linear amplitude ramp on
1H; the rf field strengths were 145 kHz for 1H and 48 kHz for
15N. The forward CP (NH) used a 200 μs contact time, with an
80–100% linear amplitude ramp on 1H; the rf field strengths were 134 kHz
for 1H and 48 kHz for 15N. For (H)CH and (H)CCH experiments, the
13C CP rf field strength was set to 30 kHz; for forward and back CP, linear
amplitude ramps on 1H were 80–100 and 100–80%; the 1H
rf field strengths were set at 138 and 132 kHz; the contact times were 600 and 175
μs, respectively. The CC RFDR mixing time was 0.56 ms. For all spectra, the
1H rf field strengths for water suppression and proton decoupling were set at
1/4 ωr, and a WALTZ sequence at 10 kHz was used for heteronuclear
decoupling of both 13C and 15N. An additional 2D (H)NH spectrum was
recorded at 60 kHz MAS, using a 1.3 mm HCN probe. The CP contact time was 4 ms, and
all other conditions were identical to those at 14.1 T (see above).
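As a consistency check on the Larmor frequencies quoted in this section, the 13C and 15N frequencies can be estimated from the 1H frequency by scaling with the ratio of gyromagnetic ratios. The sketch below is an approximation: the rounded |γ|/2π values and the function name are assumptions introduced for illustration, and small deviations (about 0.1 MHz) from the instrument values are expected.

```python
# Illustrative sketch: estimate 13C and 15N Larmor frequencies from the 1H one.
# ASSUMPTION: rounded magnitudes of gamma/(2*pi) in MHz/T; the sign of the
# 15N gyromagnetic ratio is ignored because only frequencies are compared.
GAMMA_MHZ_PER_T = {"1H": 42.577, "13C": 10.708, "15N": 4.316}


def larmor_from_proton(proton_mhz):
    """Scale the 1H Larmor frequency (MHz) to the other nuclei."""
    scale = proton_mhz / GAMMA_MHZ_PER_T["1H"]
    return {nucleus: gamma * scale for nucleus, gamma in GAMMA_MHZ_PER_T.items()}


# 1H frequencies quoted for the 14.1 T and 20.0 T MAS NMR spectrometers.
for proton_mhz in (599.8, 850.4):
    freqs = larmor_from_proton(proton_mhz)
    summary = ", ".join(f"{nuc}: {val:.1f} MHz" for nuc, val in freqs.items())
    print(summary)
```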
Data Processing
All MAS NMR data were processed using Bruker TopSpin and NMRPipe.[47]

1H resonances are referenced with respect to water at 4.7 ppm and
13C and 15N to the external standards adamantane and ammonium
chloride, respectively. All 2D and 3D data sets were processed by applying 30, 45, 60, and
90° shifted sine bell apodization, followed by a Lorentzian-to-Gaussian
transformation in all dimensions. Forward linear prediction to twice the number of
original data points was applied in the indirect dimension for some data sets, followed by
zero filling. 2D and 3D 1H-detected data sets were processed with Gaussian
and/or square sine window apodization and quadrature baseline correction.

Spectra were analyzed using CCPN[48] and Sparky,[49,50] and MAS NMR backbone and side-chain
1H–15N resonance assignments were initially carried out by
comparison with solution NMR chemical shifts[25,29] and verified by de novo backbone
assignment based on 2D 13C–13C CORD (25 ms mixing time) and
RFDR spectra, combined with 2D NCACX (25 ms mixing time), 2D NCOCX (25 ms mixing time),
1H-detected 2D (H)NH HETCOR, 3D (H)CANH, and 3D (H)CONH spectra. Side-chain
carbon and nitrogen resonances were assigned using 2D CORD, 2D NCACX, 2D NCOCX, and 2D
(H)NH spectra, and side-chain and backbone protons were assigned using
1H-detected 2D (H)NH, (H)CH HETCOR, and 3D (H)CCH experiments.
Structure Calculation of SARS-CoV-2 NNTD
The MAS NMR structure of a single NNTD chain was calculated in Xplor-NIH
version 2.53[51−53] using
13C–13C, 15N–13C, and
1H–15N distance restraints, extracted from 2D CORD (100,
250, and 500 ms mixing times), NCACX, NCOCX, and (H)NH HETCOR spectra and backbone
dihedral angles predicted by TALOS–N[54] from the experimental
1H, 13C, and 15N chemical shifts. The bounds for the
distance restraints were set to 1.5–6.5 Å (4.0 ± 2.5 Å) and
2.0–7.2 Å (4.6 ± 2.6 Å) for intra- and inter-residue restraints,
respectively, consistent with our previous studies.[30,55]

Calculations were seeded using the primary sequence as extended strands. One thousand
structures were generated with molecular dynamics simulated annealing in torsion angle
space with two successive annealing schedules and a final gradient minimization in
Cartesian space, essentially as described previously[30,55] and detailed below.

Two successive annealing schedules were used, the first in vacuum with the REPEL module
and the second with an implicit solvent refinement using the EEFx module.[56] The 10 lowest-energy structures from the first schedule served as input for the
second schedule, and the 10 lowest-energy structures from the second schedule formed
the final ensemble (PDB: 7SD4).
Standard terms for bond lengths, bond angles, and impropers were applied to enforce the
correct covalent geometry.

The first annealing calculation was essentially identical to that reported
previously,[30,55]
with initial random velocities at 3500 K constant-temperature molecular dynamics run for
the shorter of 800 ps or 8000 steps, with the time step size allowed to float to maintain
constant energy. Subsequently, simulated annealing calculations at reduced temperatures in
steps of 25–100 K were carried out for the shorter of 0.4 ps or 200 steps. Force
constants for distance restraints were ramped from 10 to 50
kcal/(mol·Å2). Dihedral angle restraints were disabled for
high-temperature dynamics at 3500 K and subsequently applied with a force constant of 200
kcal/(mol·rad2). The force constant for the radius of gyration was
geometrically scaled from 0.002 to 1, and a hydrogen bond term, HBPot, was used to improve
hydrogen bond geometries.[57] After simulated annealing, structures were
minimized using a Powell energy minimization scheme.

For the second schedule performed in implicit solvent, all parameters were set as in the
example EEFx of Xplor-NIH. Annealing was performed at 3500 K for 15 ps or 15,000 steps,
whichever was completed first. The starting time step was 1 fs and was self-adjusted in
subsequent steps to ensure conservation of energy. Random initial velocities were assigned
about a Maxwell distribution at the starting temperature of 3500 K. Subsequently, the
temperatures were reduced to 25 K in steps of 12.5 K. At each temperature, 0.4 ps dynamics
were run with an initial time step of 1 fs. Force constants for distance restraints were
ramped from 2 to 30 kcal/(mol·Å2). The dihedral restraint force
constants were set to 10 kcal/(mol·rad2) for high-temperature dynamics at
3,000 K and 200 kcal/(mol·rad2) during cooling. After the EEFx module,
structures were minimized using a Powell energy minimization scheme.
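Two numerical details of this section can be made concrete with a minimal sketch: the distance restraint bounds (4.0 ± 2.5 Å for intraresidue and 4.6 ± 2.6 Å for inter-residue restraints) and the cooling ladder of the second annealing schedule (3500 K to 25 K in steps of 12.5 K). The sketch does not use the Xplor-NIH interface. The function names are hypothetical, and counting both the starting and the final temperature as stages is an assumption.

```python
# Illustrative sketch of two parameters described in the text; it does not
# call Xplor-NIH and is not the protocol script used for the calculation.


def restraint_bounds(same_residue):
    """Return (lower, upper) distance bounds in angstroms.

    Intraresidue restraints: 4.0 +/- 2.5; inter-residue: 4.6 +/- 2.6.
    """
    center, half_width = (4.0, 2.5) if same_residue else (4.6, 2.6)
    return (round(center - half_width, 1), round(center + half_width, 1))


def cooling_ladder(t_start=3500.0, t_end=25.0, step=12.5):
    """Return the list of temperatures (K) visited during cooling.

    ASSUMPTION: both endpoints are included as temperature stages.
    """
    n_steps = int(round((t_start - t_end) / step))
    return [t_start - i * step for i in range(n_steps + 1)]


print("intraresidue bounds:", restraint_bounds(True))
print("inter-residue bounds:", restraint_bounds(False))

ladder = cooling_ladder()
print("temperature stages:", len(ladder))
print("first and last stage (K):", ladder[0], ladder[-1])
```

With 0.4 ps of dynamics at each stage, the length of the ladder gives a rough estimate of the total cooling time per structure under these assumptions.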
Structure Analysis and Visualization
Atomic rmsd values were calculated using routines in Xplor-NIH (version
2.53).[51−53] The visualization of
structural ensembles was rendered in PyMOL,[58] using in-house shell/bash
scripts. Secondary structure elements were classified according to STRIDE[59] and manual inspection.