The SARS coronavirus 2 (SARS-CoV-2) spike protein is located at the outermost perimeter of the viral envelope and is the first component of the virus to make contact with surrounding interfaces. The stability of the spike protein in contact with surfaces plays a decisive role in infection pathways and in the viability of the virus after surface contact. While cryo-EM structures of the spike protein have been solved at high resolution and structural studies in solution have provided information about the secondary and tertiary structures, little is known about its folding when adsorbed to surfaces. Here, we report on the secondary structure and orientation of the S1 segment of the spike protein, which is often used as a model protein for in vitro studies of SARS-CoV-2, at the air–water interface using surface-sensitive vibrational sum-frequency generation (SFG) spectroscopy. The air–water interface plays an important role for SARS-CoV-2 when suspended in aerosol droplets, and it serves as a model system for hydrophobic surfaces in general. The SFG experiments show that the S1 segment of the spike protein remains folded at the air–water interface and predominantly binds in its monomeric state, while the combination of small-angle X-ray scattering and two-dimensional infrared spectroscopy measurements indicates that it forms hexamers with the same secondary structure in aqueous solution.
The spike protein of SARS coronavirus 2 (SARS-CoV-2) is the key molecule for viral entry
into human cells as it is the first contact point with host cells and therefore crucial for
viral viability and reproduction. As such, it is a prime target for the immune response,
antigen testing, and vaccines. The structure of the spike protein has been studied
intensively, and X-ray crystal structures, cryo-EM, as well as solution-state studies have
provided detailed information about the secondary and tertiary
structure.[1−5] However, surface contacts
can change the protein structure severely and, to the best of our knowledge, only
theoretical studies have been reported about the folding of the spike protein when in
contact with surfaces.[6,7]

We here report experimental evidence on the structure and orientation of the spike protein
adsorbed to the air–water interface (AWI). The AWI is a hydrophobic interface, which
is potentially a very disruptive surface and known to alter protein structure in many
cases.[8−10] The AWI plays an important
role for SARS-CoV-2 when airborne within aerosol droplets, suspended in test tubes, or bound
to the surfaces within the respiratory tract. Besides this direct role, the AWI is also a
good model system for hydrophobic surfaces in general and can therefore provide a first
approximation of how the spike protein will respond to natural and technical hydrophobic
surfaces such as plant surfaces, test tubes, coatings, textiles, and skin.

To determine the structure of the SARS-CoV-2 spike protein at the AWI and in bulk aqueous
solution, we combine vibrational sum frequency generation (SFG) spectroscopy with
two-dimensional infrared (2D-IR) spectroscopy and small-angle X-ray scattering (SAXS). SFG
spectroscopy in the amide I and amide II regions can determine how proteins fold and orient
specifically at interfaces.[11−13] Mixing broadband infrared
laser pulses with narrowband visible laser pulses at the interface generates sum frequency
photons, which report on vibrational modes at the interface.[14] On the
other hand, 2D-IR and SAXS data provide information about the secondary structure and
aggregation state of the protein in solution.[15,16]

Together, we find that the SARS-CoV-2 spike protein remains intact when binding to the AWI
and predominantly binds in its monomeric state.
Methods
Further details about experiments and data analysis can be found in the Supporting Information.
Sample Preparation
The S1 segment of the SARS-CoV-2 spike protein (residues Gln14-Arg685) was purchased from
GenScript Biotech (purity > 85%), and the sequence and purity were checked by mass
spectrometry. The samples were kept at −80 °C until use. Upon thawing, the
samples were buffer-exchanged into D2O-based phosphate-buffered saline (PBS) and diluted
to the desired concentration for a given experiment.
SFG Spectroscopy
The SFG experiments were conducted on a home-built setup described in detail
elsewhere.[17] Briefly, we used an amplified Ti:Sapphire laser system
(Astrella, Coherent) to generate broadband mid-infrared light (TOPAS Prime, Light
Conversion) and narrowband visible light through an etalon (FWHM ≈ 16
cm⁻¹), which were overlapped on the sample in time and space. The
resultant SFG light was directed onto a spectrograph and camera (Shamrock/Newton, Andor)
for spectral acquisition. The spectra were background subtracted and normalized to the
spectrum from a clean gold sample.
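
The background subtraction and gold normalization described above can be sketched as follows. This is a minimal illustration, not the authors' processing code: the function name, the assumption that the gold reference is background-subtracted in the same way as the sample, and the example counts are all hypothetical.

```python
def normalize_sfg(sample, sample_bg, gold, gold_bg):
    """Background-subtract a sample spectrum and divide by a gold reference.

    All arguments are equal-length sequences of detector counts, one value
    per spectrograph pixel. Names and conventions here are assumptions.
    """
    if not (len(sample) == len(sample_bg) == len(gold) == len(gold_bg)):
        raise ValueError("all spectra must have the same number of pixels")
    normalized = []
    for s, sb, g, gb in zip(sample, sample_bg, gold, gold_bg):
        reference = g - gb
        # Pixels outside the IR envelope give a reference near zero;
        # mark them as missing instead of dividing by ~0.
        normalized.append((s - sb) / reference if reference > 0 else float("nan"))
    return normalized


# Made-up counts for a 4-pixel detector, for illustration only.
sample = [110.0, 150.0, 210.0, 120.0]
sample_bg = [100.0, 100.0, 100.0, 100.0]
gold = [600.0, 1100.0, 2100.0, 100.0]
gold_bg = [100.0, 100.0, 100.0, 100.0]
spectrum = normalize_sfg(sample, sample_bg, gold, gold_bg)
```

Dividing by the gold spectrum removes the spectral envelope of the broadband infrared pulse, because the nonresonant response of gold is approximately flat across the probed region.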
Small-Angle X-ray Scattering
SAXS data were recorded at the HyperSAXS facility in Aarhus, Denmark.[18] The data were calibrated to an absolute scale and displayed as a function of the
modulus of the scattering vector, q. The forward scattering data yield a
mass of about 390 kDa corresponding to six spike proteins per scatterer. The SAXS data
were modeled by rigid-body refinement with 10 independent runs per input structure using
home-written software.[19]
Two-Dimensional Infrared Spectroscopy
Transmission two-dimensional infrared (2D-IR) spectra were recorded using a commercial
instrument set to operate in the amide I region (2DQuickIR, PhaseTech Spectroscopy), as
described previously.[20] Briefly, a time-delayed collinear pair of
femtosecond infrared pump pulses was produced using a pulse shaper, and the transmission
spectrum of the pumped sample was measured using a third, non-collinear, infrared pulse
that was dispersed on a grating spectrometer and measured using a high-repetition-rate MCT
detector (Jackhammer, PhaseTech Spectroscopy). Fourier transformation with respect to the
delay between the pump pulses produced the 2D-IR spectra shown.
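
As a rough illustration of how the pump-frequency axis arises from the pump-pulse delay scan, the sketch below Fourier transforms a synthetic delay trace for a single probe pixel. The 1650 cm⁻¹ oscillator, the 500 fs dephasing time, the 5 fs delay step, and the function name are assumptions chosen for the example. Real acquisitions typically also involve a rotating frame and phase cycling, which are omitted here.

```python
import cmath
import math

C_CM_PER_FS = 2.99792458e-5  # speed of light in cm/fs


def pump_axis_spectrum(signal, dt_fs):
    """Discrete Fourier transform of a pump-delay scan at one probe pixel.

    Returns (wavenumbers in cm^-1, magnitudes) for the non-negative
    frequency bins.
    """
    n = len(signal)
    wavenumbers, magnitudes = [], []
    for k in range(n // 2 + 1):
        total = sum(
            signal[j] * cmath.exp(-2j * math.pi * k * j / n) for j in range(n)
        )
        wavenumbers.append(k / (n * dt_fs * C_CM_PER_FS))
        magnitudes.append(abs(total))
    return wavenumbers, magnitudes


# Synthetic scan: one damped oscillator (all parameters are illustrative).
nu0, t2_fs, dt_fs, n_points = 1650.0, 500.0, 5.0, 256
scan = [
    math.cos(2 * math.pi * C_CM_PER_FS * nu0 * j * dt_fs)
    * math.exp(-j * dt_fs / t2_fs)
    for j in range(n_points)
]
axis, spectrum = pump_axis_spectrum(scan, dt_fs)
peak = axis[max(range(len(spectrum)), key=spectrum.__getitem__)]
```

Repeating this transform for every probe pixel would give the two-dimensional map of pump versus probe frequency. The peak position is limited by the bin spacing of the scan, here about 26 cm⁻¹.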
Spectral Calculations
The SFG spectra are calculated based on the formalism reported previously.[21] Briefly, we construct a one-exciton amide I Hamiltonian based on the atomic
positions of the backbone amide groups. The couplings between the nearest neighbors
(dominated by through-bond interactions) are determined using a parameterized map of the
coupling as a function of the dihedral angle derived from an ab initio calculation of
“glycine dipeptide” (Ac-Gly-NHCH3) using the 6-31+G(d) basis set and
the B3LYP functional,[22] while all other couplings (dominated by
through-space interactions) are calculated using the transition-dipole coupling model
(TDCM).[23] We diagonalize the Hamiltonian to obtain the eigenvalues
and eigenvectors of the eigenmodes, from which the IR and Raman responses, and their outer
product, the SFG hyperpolarizability tensor, are determined.

For the 2D-IR spectral calculations, the corresponding two-exciton Hamiltonian is
constructed using the formalism described by Hamm and Zanni,[24] which
leads to such large matrices for the S1 spike protein that the diagonalization is
computationally very challenging. Therefore, we have split each 670-residue monomer into
3 segments of almost equal length and averaged the 2D-IR response of the resulting 18 segments
of the hexamer structure. Because the couplings decrease with the
third power of the inter-amide-group distance, this should lead to only a minimal loss of
the couplings that shape the 2D-IR amide I spectrum.
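
To make the through-space part of the Hamiltonian concrete, the sketch below evaluates the standard point-dipole expression used in transition-dipole coupling for two amide I oscillators and diagonalizes the resulting 2×2 one-exciton Hamiltonian analytically. It is a toy example, not the formalism of ref 21: the 0.35 D transition dipole magnitude, the 1650 cm⁻¹ site energies, the geometries, and the function names are assumptions for illustration.

```python
import math

# Converts D^2 / Angstrom^3 to cm^-1 (dipole-dipole energy in wavenumbers).
D2_PER_A3_TO_CM1 = 5034.0


def _dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def tdc_coupling(mu_a, mu_b, r_ab):
    """Transition-dipole coupling (cm^-1) between two oscillators.

    mu_a, mu_b: transition dipole vectors in Debye.
    r_ab: vector from oscillator a to oscillator b in Angstrom.
    """
    r = math.sqrt(_dot(r_ab, r_ab))
    term = _dot(mu_a, mu_b) / r**3 - 3.0 * _dot(mu_a, r_ab) * _dot(mu_b, r_ab) / r**5
    return D2_PER_A3_TO_CM1 * term


def dimer_eigenvalues(e_a, e_b, beta):
    """Eigenvalues of the one-exciton Hamiltonian [[e_a, beta], [beta, e_b]]."""
    mean = 0.5 * (e_a + e_b)
    half_gap = math.hypot(0.5 * (e_a - e_b), beta)
    return mean - half_gap, mean + half_gap


mu = (0.35, 0.0, 0.0)                              # assumed dipole, Debye
beta_near = tdc_coupling(mu, mu, (0.0, 0.0, 4.0))  # side by side, 4 Angstrom
beta_far = tdc_coupling(mu, mu, (0.0, 0.0, 8.0))   # same geometry, 8 Angstrom
beta_inline = tdc_coupling(mu, mu, (4.0, 0.0, 0.0))  # head to tail
low, high = dimer_eigenvalues(1650.0, 1650.0, beta_near)
```

Doubling the separation reduces the coupling by a factor of eight, which is the distance scaling invoked above to justify splitting each monomer into segments. For a full protein, the same couplings would fill an N×N matrix that is diagonalized numerically.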
Results
Surface Studies
For the SFG experiments, we inject the S1 segment of the spike protein (Val16-Arg685),
responsible for the initial interaction with binding partners and surfaces, into the
subphase of a trough and overlap the laser beams at the aqueous surface. An illustration
of the experimental geometry is shown in Figure 1A. The frequencies of the infrared beam were chosen to cover the amide I and
amide II spectral regions (1350–1800 cm⁻¹). We prepared our
samples using buffer based on D2O instead of H2O to avoid
interference from water bending modes in this spectral region. This also leads to
deuteration of the N atoms in the amide groups, which modifies the amide I and amide II
modes to their deuterated equivalents (denoted amide I′ and amide II′).
Figure 1
Experimental characterization of the spike protein at the air–water interface
with SFG spectroscopy. (A) Schematic overview of the experimental setup with incoming
and outgoing laser beams. Also shown are the relevant angles and coordinate systems of
the protein relative to the interface. (B) Normalized SFG spectra of the spike protein
injected into a trough with phosphate-buffered D2O to a concentration of
0.3 μM. The spectra are recorded in ssp (red), ppp (orange), sps (blue), and pss
(black) polarization combinations, respectively. For further clarity, the spectra have
been plotted individually in the Supporting Information. (C) Normalized SFG signal amplitude of the amide
I′ band at 1647 cm⁻¹, plotted as a function of the added spike
protein concentration in the bulk. The black solid line is a fit to a modified
Langmuir/Hill equation for protein surface adsorption (see Supporting Information for details).

SFG spectra for different polarization combinations of the incoming and outgoing laser
beams are displayed in Figure 1B. Given the
chosen beam angles, the most intense spectra are observed in the ssp combination
(s-polarized SFG, s-polarized visible, and p-polarized IR). The broad peak around 1650
cm⁻¹ can be assigned to the amide I′ band. The width of the
peak reflects the structural variation present within this relatively large protein. The
weak mode near 1750 cm⁻¹ can be assigned to side-chain C=O modes.
There are several broad resonances below 1600 cm⁻¹. The shoulder near
1500 cm⁻¹ is likely due to protonated C=O groups, and the
signal near 1450 cm⁻¹ is likely related to side-chain C–H
bending modes. The feature near 1460 cm⁻¹ is assigned to the amide
II′ backbone mode. Spectra recorded in other polarization combinations show most of
the same modes with different intensities (see Figure 1B and the Supporting Information).

The selection rules of SFG dictate that only ordered species at an interface are visible
in the spectra. A disordered structure or signal from proteins in solution near the
surface cancels in the far field (doi.org/10.1116/6.0001401). Therefore, the presence of backbone and
side-chain modes in the spectra implies that the spike protein forms a relatively
well-ordered layer at the air–water interface. To investigate the
propensity of the spike protein to bind to the AWI, we recorded SFG spectra as a function
of protein concentration, where the protein coverage of the surface was tracked using the
intensity of the amide I′ band at 1647 cm⁻¹ (Figure 1C). This analysis shows that the spike protein binds
relatively strongly to the surface with an apparent binding constant of (2.3 ± 0.3)
× 10⁶ μM^−4.9. The Hill coefficient of 4.9
extracted from the binding isotherm indicates cooperative binding to the air–water
interface, which suggests attractive interactions between spike proteins at the water
surface (see Supporting Information for details).
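
To illustrate what the quoted binding constant and Hill coefficient imply, the sketch below evaluates a textbook Hill isotherm, θ(c) = K·cⁿ / (1 + K·cⁿ). This functional form is an assumption: the exact "modified Langmuir/Hill" expression used for the fit is given only in the Supporting Information, and the function names are hypothetical.

```python
def hill_coverage(conc_um, k_app, n_hill):
    """Fractional surface coverage for a Hill-type isotherm.

    conc_um: bulk protein concentration in micromolar.
    k_app: apparent binding constant in uM^-n_hill.
    n_hill: Hill coefficient (n > 1 indicates cooperative binding).
    """
    x = k_app * conc_um**n_hill
    return x / (1.0 + x)


def half_coverage_concentration(k_app, n_hill):
    """Concentration (uM) at which the modelled coverage equals 0.5."""
    return k_app ** (-1.0 / n_hill)


K_APP = 2.3e6  # uM^-4.9, value quoted in the text
N_HILL = 4.9   # Hill coefficient quoted in the text

c_half = half_coverage_concentration(K_APP, N_HILL)
coverage_at_0p3 = hill_coverage(0.3, K_APP, N_HILL)
```

Under this assumed form, half coverage is reached at roughly 0.05 μM, and the surface is essentially saturated at the 0.3 μM used for the spectra in Figure 1B. The steep rise between these concentrations is what a Hill coefficient well above 1 expresses.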
Structural Analysis
Since SFG is a coherent method and spectra from structurally diverse proteins are the
result of complex interferences between SFG photons, the data cannot be reliably
interpreted by spectral inspection or fitting alone. To compare the cryo-EM structure of
the spike protein with SFG data directly, and thereby determine the folding state and
orientation of the spike protein at the surface, we calculate the theoretical amide I SFG
response and compare the calculated spectra with the experimental data. The calculations
are based on the cryo-EM derived SARS-CoV-2 spike structures[3] in the
closed and open states (PDB entries 6VXX and 6VYB,
respectively), which have been structurally completed using the C-I-TASSER
algorithm.[25]

For the calculations, we have to take the aggregation state of the spike protein into
account. While the protein is known to form homo-trimers at the viral
surface,[4,26,27] the oligomeric state of our recombinantly expressed protein in
solution is not known. To investigate this, we use SAXS and 2D-IR spectroscopy, as shown
in Figure 2. SAXS profiles for the spike protein
in solution indicate that, when testing a variety of cluster sizes and starting
structures, a hexameric cluster dominates the solution-state ensemble (Figure 2A,C). Furthermore, when we compare the
experimental 2D-IR spectrum with the calculated 2D-IR spectrum for the SAXS-derived
hexamer structures (Figure 2B), we find a match
that is consistent with a globular protein structure that contains both α-helical
and β-sheet secondary structure elements.[16] This makes it
unlikely that the formation of the hexameric oligomers causes a large change in the
secondary structure, for example, to amyloid aggregates, which have distinct sharp peaks
in the 2D-IR spectrum,[28] very different from the ones observed here for
the spike protein.
Figure 2
Structural characterization of the spike protein in bulk solution with SAXS and 2D-IR
spectroscopy. (A) Experimental SAXS data (black crosses) superimposed with the model
fit from the hexamer structure (red line). Representative model fits for the monomer
(blue) and trimer (orange) structures are also shown for comparison. (B) Measured
2D-IR spectrum of the spike protein (i) with pump and probe beams polarized parallel
to each other, alongside the calculated spectra for the protein in its monomeric (ii)
or hexameric (iii) form and the corresponding difference between the normalized
parallel and perpendicular polarized 2D-IR spectra (iv–vi). In these
parallel–perpendicular difference spectra, the diagonal features are removed,
which enhances the sensitivity to the cross-peaks that directly show the coupling
between vibrational modes. Note that, to enhance weak features, the contours are not
uniformly spaced—the lines within the color bar show the contour positions. (C)
Illustration of one of the hexameric spike structures, which gives the best match to
the experimental SAXS data. Each monomer unit within the structure has its own
distinct color. Please note that this structure is not unique in modeling the SAXS
data, but just one of the SAXS-derived hexamer structures that we include in our
calculations.

However, since surfaces are known to potentially disrupt protein clusters,[8] and since the concentration of protein at the interface is unknown, we
cannot immediately rule out any of these aggregation states for the protein when bound to
the surface. Therefore, we include the monomeric state, a dimer state, and the native
trimeric state along with an ensemble of SAXS-derived hexameric structures as potential
starting binding poses in our SFG calculations. Furthermore, we also take into account the
fact that the protein can reside at the interface in either an open or a closed form. The
former is characteristic of the spike protein when situated at the viral envelope, while
the latter is observed for proteins interacting with their target receptor ACE2.

Consequently, we run our SFG spectral calculations for a map of all possible
tilt and twist angles for monomer, dimer, and trimer structures, in both open and closed
forms, and 30 different structural models based on the SAXS data. We then rank the
structures by the resemblance of the experimental and the calculated SFG spectra for each
structural model. A summary of all structural models and the associated calculated spectra
can be found in the Supporting Information. Overall, the monomeric structure in the open form
shows the best match with the experimental data, as shown in Figure 3A. The calculations converge on this result irrespective of our
starting choice of the spectral parameters (see Supporting Information). We find the best match for the open form protein,
as is expected, given that no receptors capable of binding to and closing the
protein are present in our system. The calculations match the experimental data very well for
all the achiral polarization combinations when the monomeric protein adopts a slightly
tilted binding geometry of about 58° with respect to the surface (Figure 3B). The residual sum of squares (RSS) error value in
this conformation is less than 0.67, which is much smaller than the minimal RSS value
obtained for the other structures (Figure S4). A schematic of the binding geometry of the monomer at the
air–water interface is shown in Figure 3C.
Figure 3
Calculated SFG spectra for the best-matching monomeric open-form protein structure.
(A) Comparison of the best matching calculated SFG spectra in the amide I region to
experimental data for four different polarization combinations. Spectra for all other
structures are shown in the Supporting Information. (B) Residual sum of squares (RSS) error plot
showing the optimal tilt (θ) and twist (Ψ) angles of the monomer protein
with respect to the surface, which was used to produce the spectra shown in panel (A).
(C) Illustration of the spike protein in the best-matching orientation at the
air–water interface. (D) Top-view of a lateral arrangement of spike proteins
within the monolayer, which is in agreement with the experimental data. In this model,
the spike monomers are interacting laterally through β-sheet structures, forming
a densely packed layer at the water surface.

The second-best match (RSS = 1.3) was achieved for the trimer model in a somewhat more
inclined binding orientation of 65° (see Supporting Information). The dimer model did not match the experimental data
well, with an RSS of more than 2.6. The SAXS structures, which describe the
solution-state clusters well, did not match the experimental SFG data in any orientation
(RSS higher than 2.4). Ostensibly, the hexameric spike protein clusters in solution are
not stabilized at the AWI but break down into smaller units and rearrange significantly
when interacting with the surface.
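
The orientation search described in this section, in which calculated spectra are ranked by their RSS against the experiment over a grid of tilt and twist angles, can be sketched as below. The two-band model `toy_sfg_intensity` is a hypothetical stand-in for the full hyperpolarizability calculation, and the band positions, widths, grid steps, and the 40° twist used to generate the "measured" data are made up for the example. Only the 58° tilt echoes a value from the text.

```python
import math


def toy_sfg_intensity(tilt_deg, twist_deg, freqs):
    """Hypothetical two-band |chi2|^2 spectrum that depends on orientation."""
    tilt, twist = math.radians(tilt_deg), math.radians(twist_deg)
    bands = [  # (amplitude, centre in cm^-1, half-width in cm^-1)
        (math.cos(tilt), 1640.0, 12.0),
        (math.sin(tilt) * math.cos(twist), 1675.0, 10.0),
    ]
    spectrum = []
    for w in freqs:
        # Coherent sum of Lorentzian resonances, then the squared modulus.
        chi = sum(a / complex(w - w0, gamma) for a, w0, gamma in bands)
        spectrum.append(abs(chi) ** 2)
    return spectrum


def rss(calculated, measured):
    """Residual sum of squares between two equally sampled spectra."""
    return sum((c - m) ** 2 for c, m in zip(calculated, measured))


def best_orientation(measured, freqs, tilts, twists):
    """Exhaustive scan of the tilt/twist grid; returns (rss, tilt, twist)."""
    return min(
        (rss(toy_sfg_intensity(t, p, freqs), measured), t, p)
        for t in tilts
        for p in twists
    )


freqs = [1600.0 + 2.0 * i for i in range(61)]  # 1600 to 1720 cm^-1
measured = toy_sfg_intensity(58, 40, freqs)    # stand-in for experimental data
tilts = range(0, 91, 2)
twists = range(0, 181, 5)
error, tilt, twist = best_orientation(measured, freqs, tilts, twists)
```

Because the bands add coherently before the modulus is taken, the spectral shape changes with orientation in a way that cannot be read off peak by peak, which is why a full grid comparison is used rather than peak fitting.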
Discussion
Spike Protein Interaction with the Air–Water Interface
The spike protein layer at the AWI is essentially a large assembly that promotes lateral
protein interactions in combination with the surface interaction. Likely, the delicate
balance of forces, which steers the proteins into different types of clusters, is shifted
toward a more aligned assembly of monomer spike proteins when bound to the surface. The
binding geometry will depend on several factors, including the interaction with the AWI,
interaction between monomers, and solvation energy. An aligned, tilted binding geometry,
inferred here for the spike protein, is often observed for interfacial structures and
proteins, where it is important to maximize lateral as well as surface interactions
simultaneously.[29−32]

Strong lateral interactions between spike proteins within the surface layer are also
manifested in the appearance of the intense amide II′ band (Figure 1B). This band is SFG inactive for most proteins and is
usually only observed when vibrational modes are highly delocalized across the protein
structure, for example, within large aggregates such as amyloid
fibers.[33−35] In this context, our
data suggest that the spike proteins are strongly coupled to each other when forming a
densely packed protein layer at the air–water interface, while still retaining their
monomeric structure within this layer. A packing geometry in agreement with these results
is shown in a top view in Figure 3D. In the
displayed binding structure, the packing density of the monomers is high and lateral
interactions through β-strand sites are maximized. We note that, while the shown
assembly is in agreement with the structural data, there is no direct evidence for the
model at this point.
Conclusions
Together, the data show that the SARS-CoV-2 spike protein is strongly attracted to the
air–water interface. The protein binds to the AWI as monomers and assumes a slightly
tilted orientation. Since the AWI is a model for hydrophobic surfaces in general, this
implies that the spike protein will likely bind to other hydrophobic surfaces as well.
Unlike many other globular proteins, the spike protein can be expected to stay intact
during contact with hydrophobic surfaces such as polymers, plant surfaces, skin, aerosol
particles, and test tubes, as well as with gas bubbles in non-degassed buffers.