Frederico Campos Freitas1, Paulo Henrique Borges Ferreira1, Denize Cristina Favaro2, Ronaldo Junio de Oliveira1. 1. Laboratório de Biofísica Teórica, Departamento de Física, Instituto de Ciências Exatas, Naturais e Educação, Universidade Federal do Triângulo Mineiro, Uberaba, MG 38064-200, Brazil. 2. Departamento de Química Orgânica, Instituto de Química, Universidade Estadual de Campinas, São Paulo, SP 13083-970, Brazil.
Abstract
Angiotensin-converting enzyme 2 (ACE2) is the host cellular receptor that locks onto the surface spike protein of the 2002 SARS coronavirus (SARS-CoV-1) and of the novel, highly transmissible and deadly 2019 SARS-CoV-2, responsible for the COVID-19 pandemic. One strategy to avoid the virus infection is to design peptides by extracting the human ACE2 peptidase domain α1-helix, which would bind to the coronavirus surface protein, preventing the virus entry into the host cells. The natural α1-helix peptide has a stronger affinity to SARS-CoV-2 than to SARS-CoV-1. Another peptide was designed by joining α1 with the second portion of ACE2 that is far in the peptidase sequence yet grafted in the spike protein interface with ACE2. Previous studies have shown that, among several α1-based peptides, the hybrid peptidic scaffold is the one with the highest/strongest affinity for SARS-CoV-1, which is comparable to the full-length ACE2 affinity. In this work, binding and folding dynamics of the natural and designed ACE2-based peptides were simulated by the well-known coarse-grained structure-based model, with the computed thermodynamic quantities correlating with the experimental binding affinity data. Furthermore, theoretical kinetic analysis of native contact formation revealed the distinction between these processes in the presence of the different binding partners SARS-CoV-1 and SARS-CoV-2 spike domains. Additionally, our results indicate the existence of a two-state folding mechanism for the designed peptide en route to bind to the spike proteins, in contrast to a downhill mechanism for the natural α1-helix peptides. The presented low-cost simulation protocol demonstrated its efficiency in evaluating binding affinities and identifying the mechanisms involved in the neutralization of spike-ACE2 interaction by designed peptides. Finally, the protocol can be used as a computer-based screening of more potent designed peptides by experimentalists searching for new therapeutics against COVID-19.
Angiotensin-converting enzyme 2 (ACE2) is the host cellular receptor that locks onto the surfacespike protein of the 2002 SARS coronavirus (SARS-CoV-1) and of the novel, highly transmissible and deadly 2019 SARS-CoV-2, responsible for theCOVID-19 pandemic. One strategy to avoid thevirus infection is to design peptides by extracting thehumanACE2 peptidase domain α1-helix, which would bind to thecoronavirus surface protein, preventing the virus entry into the host cells. The natural α1-helix peptide has a stronger affinity to SARS-CoV-2 than to SARS-CoV-1. Another peptide was designed by joining α1 with the second portion of ACE2 that is far in the peptidase sequence yet grafted in thespike protein interface with ACE2. Previous studies have shown that, among several α1-based peptides, the hybrid peptidic scaffold is the one with the highest/strongest affinity for SARS-CoV-1, which is comparable to the full-length ACE2 affinity. In this work, binding and folding dynamics of the natural and designed ACE2-based peptides were simulated by the well-known coarse-grained structure-based model, with the computed thermodynamic quantities correlating with theexperimental binding affinity data. Furthermore, theoretical kinetic analysis of native contact formation revealed the distinction between these processes in the presence of the different binding partners SARS-CoV-1 and SARS-CoV-2spike domains. Additionally, our results indicate theexistence of a two-state folding mechanism for the designed peptideen route to bind to thespike proteins, in contrast to a downhill mechanism for the natural α1-helix peptides. The presented low-cost simulation protocol demonstrated its efficiency in evaluating binding affinities and identifying themechanisms involved in theneutralization of spike-ACE2 interaction by designed peptides. Finally, the protocol can be used as a computer-based screening of more potent designed peptides by experimentalists searching for new therapeutics against COVID-19.
Coronaviruses (CoV) represent a diverse family of single-stranded, positive-sense RNA
viruses capable of infecting humans and animals.[1] Historically,
coronaviruses haveinfectedhumans causing mild cold; however, highly pathogenic strains
haveemerged over the past two decades.[2−5] In 2002–2003, severe acute respiratory syndromecoronavirus (SARS-CoV-1) infected over 8000 people worldwide, with approximately 800 deaths.
In 2012, Middle East respiratory syndrome coronavirus (MERS-CoV), first identified in Saudi
Arabia, caused 2500 confirmed cases and a death rate of 36%.[5] Even though
a new outbreak of SARS-CoV-1 was expected by 2004, SARS-CoV-1 remained absent fromhuman
circulation until December 2019. Compared to SARS-CoV-1 and MERS-CoV, thenew strain of
coronaviruses (SARS-CoV-2, previously known as 2019-nCoV) spreads moreefficiently.
According to the World Health Organization (WHO) report, by January 24th of 2021, more than
97.4 million people have beeninfected worldwide, including more than 2.1 million deaths.
Sequencing of theSARS-CoV-2 genome indicates that the 30 kb RNA encodes four structural
proteins: spike (S), envelope (E), membrane (M), and nucleocapsid (N); 16 nonstructural
proteins (Nsp1-16); and 9 putative accessory factors.[6,7] The trimeric glycosylated spike protein is the
key used by both SARS-CoV-1 and SARS-CoV-2 viruses to enter host cells. During viral
infection, the trimeric protein is cleaved into S1 and S2 subunits. S1 contains the
receptor-binding domain (RBD), which recognizes the peptidase domain (PD) of
angiotensin-converting enzyme 2 (ACE2), and S2 is responsible for membrane
fusion[3,8,9] (Figures A,B, and S1).
Figure 1
SARS-CoV-1 and SARS-CoV-2 spike proteins inhibited by ACE2-based peptides. (A)
Coronavirus uses the surface spike protein (in red) to lock onto human ACE2 receptors
(in gray) located on the surface of host cells to deliver its RNA to produce more
viruses. (B) Closer virus–host interface look is shown, represented by the spike
receptor-binding domain (RBD, in cyan) and the ACE2 peptidase domain (PD, in light
gray). The residues of ACE2-PD interfacing with spike-RBD (in green) are also shown,
which were used as one designed peptide with the aim to inhibit the virus–cell
recognition schemed in (C). Human ACE2-based peptides binding to spike-RBD of (D)
SARS-CoV-1 and (E) SARS-CoV-2 with the spike receptor-binding motif (RBM) is shown in
red. The residue sequences of each extracted ACE2 peptide are presented in green in (D)
and (E): α1-helix composed of residues 22–44, 22–57, and
residues 22–44 of α1 connect by one glycine (G) with the loop
region numbering from 351 to 357; inhibitors 1, 2, and
3 for sort.
SARS-CoV-1 and SARS-CoV-2spike proteins inhibited by ACE2-based peptides. (A)
Coronavirus uses the surfacespike protein (in red) to lock onto humanACE2 receptors
(in gray) located on the surface of host cells to deliver its RNA to producemore
viruses. (B) Closer virus–host interface look is shown, represented by thespike
receptor-binding domain (RBD, in cyan) and theACE2 peptidase domain (PD, in light
gray). The residues of ACE2-PD interfacing with spike-RBD (in green) are also shown,
which were used as one designed peptide with the aim to inhibit the virus–cell
recognition schemed in (C). HumanACE2-based peptides binding to spike-RBD of (D)
SARS-CoV-1 and (E) SARS-CoV-2 with thespike receptor-binding motif (RBM) is shown in
red. The residue sequences of each extracted ACE2peptide are presented in green in (D)
and (E): α1-helix composed of residues 22–44, 22–57, and
residues 22–44 of α1 connect by oneglycine (G) with the loop
region numbering from 351 to 357; inhibitors 1, 2, and
3 for sort.Theoretical models serve as a powerful predictive approach in aiding the design of new
treatments to block viral infection.[10] In this sense, we propose that
models with the foundation in theenergy landscape theory might contribute to attack this
problem. Theenergy landscape theory was initially developed to study protein
folding[11−14] and it has been a valuable framework in revealing many
biological mechanisms related to folding and many other biomolecular functions, such as
functional conformation dynamics, domain swapping, and binding
mechanisms.[15−24] The structural
formation of biomolecules has been described in this framework as a complex process that
takes place in a multidimensional configurational space.[25] Folding,
binding, and conformational rearrangements are depicted in terms of diffusion of
configurational search down a funnel-likeenergy landscape, in direction to theminimumenergy structure.[26,27]
Also, natural protein sequences with funnel-like landscapes haveenergetic roughness that is
relatively small compared to the global native state and the funnel slope is steep enough to
overcome the local traps imposed by the roughness so that the biological function
occurs.[11]Biological function is controlled by the association of biomolecules, an essential event
for cellular signaling and communication. Understanding themechanisms involved in molecular
recognition processes and the correspondent proteins’ roles are key targets for
therapeutic interventions.[28] The classical binding mechanism called
“lock-and-key” treats two binding biomolecules as rigid entities in their
docking process.[29] Proteins are rather flexible, involving conformational
changes; therefore, two other coupling mechanisms can be proposed. One is the
“conformational selection”, where a conformational change of one binding
partner occurs before binding to the other. The second is the “induced-fit”,
where the conformational change occurs after binding.[30] These two
flexiblemechanisms are reasonable for small molecules that dock to proteins on a time-scalefaster than the protein conformational change. However, this time-scale separation is not
always clear for protein–protein or protein–peptide binding, which is induced
during folding events of at least one of the involved biomolecules.[31,32]The “binding-induced folding” association mechanisms can be
cooperative—“coupled binding–folding” or
noncooperative—“binding prior folding” and
“binding after folding”.[32] These
binding–folding scenarios have beenmore investigated in the association of
intrinsically disordered proteins (IDPs) in which their global conformational change (or
folding) is always associated with their partners.[33] The coupling
kinetics of binding and folding has been studied by experiments, demonstrating a broad range
of binding–folding scenarios.[34−47]In this manuscript, natural and designed peptide binders known for having inhibitory action
against coronavirus were selected by blocking the binding of spike-RBD to ACE2-PD (Figure C). Spike protein/gene is the preferred target
site in SARS/MERS vaccine development, with many patents already deposited.[48] The objective is to determine the structural binding mechanisms and
affinities of inhibitory peptides by characterizing their thermodynamic and kinetic
profiles. This strategy could save lab work by computationally selecting new designs of
peptides with a stronger affinity to SARS-CoV-2 than SARS-CoV-1. Also, virus surveillance
will take place after the first generation of vaccines and drugs and there will be a
continuous search for upgrades. Thus, new tools that are computationally less expensive can
come in hand to select potential candidates to inhibit viral entry into human cells. At the
beginning of theCOVID-19 outbreak, many studies were performed to understand the structural
aspects of the interaction between the virus and potential inhibitors and antibodies of
spike-mediated infection[4,49−60] in a manner similar to the first SARS-CoVepidemic.[61−66] However, the dynamic aspects must be taken into
account when designing new biomolecules,[4,66−69] and traditional molecular
dynamics might be computationally expensive to screen a combinatorial variation of even
small peptides, yet efficient to energy-minimized designed structures.[67,68,70−75] In this perspective, here we present a relatively
low demand protocol to computationally characterize the difference in the binding
thermodynamics and mechanisms of thepeptides designed to inhibit the surfacespike proteins
of the old and new coronaviruses. The protocol is based on structure-based model (SBM)
simulations, which has its foundation in theenergy landscape theory,[76]
already applied to study thespike prefusion dynamics.[77]
Results and Discussion
Selected Peptide Inhibitors have Increased Hydrogen Bonds
Thespike protein located on the surface of thecoronavirus binds to the cellular
receptor ACE2 of the host to deliver the virus genetic code.[78] Peptide
binders are suitable therapeutics to interrupt thespike/ACE2 interaction by specifically
binding to the interface region spike-RBD/ACE2-PD, preventing humaninfection.[79−82] Also, synthetic small biomolecules, such as aptamers of
nucleic acid, bioconjugates of peptides, and polysaccharides, have been studied as protein
blockers for coronavirus and other virus-related diseases.[56,83−87] Reports have shown that
viral spike protein subunit vaccines produced higher neutralizing antibody titers and more
complete protection than live-attenuated SARS-CoV-1, full-length spike protein, and
DNA-based spike protein vaccines.[48] Protein vaccines targeting
spike-RBD account for half of the vaccine patents.Nanotechnology-enabled approaches like nanoparticles conjugated with the protein blockers
enhance specificity/efficiency when used as delivery systems to inactivateSARS-CoV-2 in
patients.[88,89] The
computational design of peptides can contribute as an important step for speeding up
nanomedicine-based approaches for COVID-19,[55,68,71−73,90−93] and characterizing their dynamics is a natural next step
for the selection of potential candidates. For these reasons, combining in
silico with experimental analyses narrows the large possible candidates among
thousands of pharmaceutically active substances to be later employed in clinical
trials.[88]Threepeptides constructed from the N-terminal amino acid residues of ACE2-PD were
selected: two natural segments of α1-helix (residues 22–44 and
22–57) and another peptide comprised of α1-helix (residues
22–44) linked by a glycine (G) with the loop segment 351–357; hereafter
coined as peptide inhibitors 1, 2, and 3,
respectively. The two natural peptides 1 and 2exhibited a
modest antiviral activity for SARS-CoV-1 RBD to humanACE2 with IC50 of about
50 and 6 μM, respectively, implying that 2 had a stronger affinity to
RBD.[94] The artificial peptide 3 exhibited an evenmore
potent antiviral activity of about 0.1 μM, implying that it had a much stronger
binding affinity for SARS-CoV-1 than peptides 1 and 2. Related
to thenovel coronavirus, it was reported that the natural fragment composed of residues
21–43 of the samehumanACE2 α1-helix can strongly bind to
SARS-CoV-2 RBD with a micromolar affinity (dissociation constant,
KD = 1.3 μM)[68] that is comparable
to the full-length ACE2 binding to SARS-CoV-2 RBD.[4] It was hypothesized
by Huang et al. that the artificial peptide 3 may also bind stronger than
peptide 21–43 (similar to peptide 1) due to the high similarity among
SARS-CoV-1 and 2 RBD interfaces with ACE2-PD[55] (Figures S1 and S2).Peptides 1 and 2 binding modes with SARS-CoV-1 and 2 RBDs were
constructed from the complex structures deposited in the Protein Data Bank to be used in
this work (PDB codes 2ajf(61) and 6m0j,[51] respectively). Peptide 3 bound to SARS-CoV-2 RBD was obtained
from the computational design of Huang et al.,[55] and
CoV1+3 was prepared by superposition to theCoV2+3 and to the
full-length ACE2-PD, followed by local energy minimization. Figure shows the structures of thepeptides bound to thespike-RBMs,
representing the protein–peptide binding simulations of this study. Also, the
sequence alignment of both spike-RBDs is presented in Figure S1, where the high similarity of the sequences is evident. Figure S1 also shows the sequence similarity among the threepeptide
inhibitors, residue binding contacts, and secondary structures.
Figure 2
Hydrogen bonds formed by (A) SARS-CoV-1 (CoV1) and (B) SARS-CoV-2 (CoV2) and the
inhibitors 1, 2, and 3. Only the side chains of
residues involved in hydrogen bonds are evidenced, and the hydrogen bond formed by the
residue K417 (CoV2) and D30 of the inhibitors is highlighted in dark blue.
Hydrogen bonds formed by (A) SARS-CoV-1 (CoV1) and (B) SARS-CoV-2 (CoV2) and theinhibitors 1, 2, and 3. Only the side chains of
residues involved in hydrogen bonds areevidenced, and thehydrogen bond formed by the
residueK417 (CoV2) and D30 of the inhibitors is highlighted in dark blue.Interfaces and interactions formed by SARS-CoV-1/SARS-CoV-2 and the inhibitors were
analyzed using the protein interfaces, surfaces, and assemblies (PISA) server of theEuropean Bioinformatics Institute (EBI), which predicts themost likely interfaces, based
on size and energetic considerations.[95] Analysis of the
complex’s interfaces shows that SARS-CoV-1 and inhibitor 1 form a weak
complex with only two hydrogen bonds (Y41/T486 and D38/Y436). This result agrees with our
MD simulations, where the complex SARS-CoV-1 + 1 shows high flexibility.
Nevertheless, Figure A shows that the addition
of extra amino acids increases the binding efficiency of the inhibitors 2 and
3 toward theSARS-CoV-1 spike protein. The addition of 13 amino acids leads
to the appearance of threeextra hydrogen bonds in theSARS-CoV-1 + 2 complex
(2 × Q42/Y436 and T487/Y41). Intriguingly, the inhibitor formed by the synthetic
peptide of ACE2 boosted the number of hydrogen bonds to 7 (T41/Y486, K353/Y481, K353/G482,
K353/G488, E37/Y491, and D38/Y436). It may be recalled that themajority of these
interactions are also present in the crystallographic structure of the complex formed by
SARS-CoV-1-RBD/ACE2.[54] It is worth mentioning that our analysis is in
agreement with theexperiments performed by Han and col., where increasing the size of thepeptide caused an increase in thehydrogen bonds.[94] Examination of
Figure B indicates that SARS-CoV-2 binds all
of the inhibitors with a higher number of hydrogen bonds than SARS-CoV-1. Furthermore,
inhibitors 1 and 2 showed a similar interaction mode toward the
RBD-ACE2 domain, presenting 9 hydrogen bonds (Q24/A475, Q42/Y449, Q42/G446, D30/K417,
E35/Q493, E37/Y505, D38/Y449, D38/Q498, and D38/Y449) and a salt bridge formed by residues
D30 and K417. Here, it is important to highlight thehydrogen bond and salt bridge formed
by residues D30 and K417, which have been featured as one of themost important for the
affinity improvement of the viral protein toward humanACE2.[54] Similar
to SARS-CoV-1, the number of hydrogen bonds suggests that the synthetic peptide binds theSARS-CoV-2 protein with the highest number of hydrogen bonds (Figure
B, right corner). The PISA web server identified 13 hydrogen
bonds in the interface of the complex SARS-CoV-2 + 3 (Q24/A475, Y41/T500, Y41/N501,
Q42/G446, Q42/Y449, D30/K417, E35/Q493, E37/Y505, D38/Q498, D38/Y449, K353/Q498,
K353/G498, and K353/G502) and a salt bridge formed by D30 and K417, suggesting that
sequence 3 might form themost stable complex and themost efficient
inhibitor among the three sequences studied here. The designed 3 binding
sites overlap with the binding sites of the de novo miniprotein
inhibitors that showed picomolar to nanomolar affinity and blocked SARS-CoV-2infection.[82]Local frustration in the structures of SARS-CoV-1 and 2 complexed with the inhibitors
1, 2, and 3 was recognized using the
Frustratometer Web Server.[96,97] Mutational and configurational frustration indexes were calculated for
each molecular complex. Table shows the
percentage of minimally, neutral, and highly frustrated interface contacts betweenSARS-CoV-1 and 2 and theinhibitors 1, 2, and 3.
The complexes formed by SARS-CoV-1 and 2 with theinhibitor 3 presented a
high increase in the number of interface contacts when compared to the inhibitor
1. A slight increase could also be seen in the number of interface contacts
from the complexes formed by SARS-CoV-1 and 2 with the inhibitor 2.
Furthermore, a lower percentage of the interface contacts betweenSARS-CoV-1 and 2
(specially SARS-CoV-2) and theinhibitor 3 was inclined to be highly
frustrated, unfavorable, in comparison with the other complexes (see Figures S3 and S4).
Table 1
Percentage of Minimally, Neutral, and Highly Mutational and Configurational
Frustration Indexes of Interface Contacts between SARS-CoV-1 and 2 and the Inhibitors
1, 2, and 3
mutational index
protein
minimally
neutral
highly
total
CoV1+1
11 (26.2%)
18 (42.9%)
13 (31.0%)
42 (100%)
CoV2+1
7 (13.0%)
33 (61.1%)
14 (25.9%)
54 (100%)
CoV1+2
14 (29.2%)
21 (43.8%)
13 (27.1%)
48 (100%)
CoV2+2
11 (18.6%)
34 (57.6%)
14 (23.7%)
59 (100%)
CoV1+3
13 (16.5%)
49 (62.0%)
17 (21.5%)
79 (100%)
CoV2+3
12 (14.3%)
60 (71.4%)
12 (14.3%)
84 (100%)
Designed Peptide Folds via Two-State Mechanism after Binding to Spike-RBDs
Protein folding and binding are intertwined processes closely connected to the cellular
function. Both processes are similar at the thermodynamic level: the dynamics aremapped
as diffusion over an ensemble of partially structured states, accompanied by a
simultaneous reduction of the configurational entropy and the freeenergy due to solvent
exclusion, hydrogen bonds, and salt bridge formations.[98] In addition,
folding/binding kinetics are influenced by the thermodynamic free-energy barrier that
separates the unfolded/unbound and the folded/bound states.[99] The
thermodynamic barriers for the real proteins were reasonably described by SBMs in the
α-carbon (Cα) representation of the protein.[100,101] However, side-chain packing can
play an important role in the native contact interactions presented in the transition
state of coupled protein folding and binding reaction,[37] and this can
be achieved computationally when all atoms (except nonpolar hydrogen) are included in the
pure Cα-SBM.[102,103] In either Cα and all-atom cases, and whether or not
it is a Go̅-model, protein topology and native contacts determine binding mechanisms
in coupled binding–folding that are reflected by funneled energy landscapes, even
if intermediates occur.[100,103−105]The folding process of the selected peptides, coupled to the binding on theSARS-CoV-1
and 2 spike protein RBDs (CoV1 and CoV2, respectively), was simulated to characterize
their thermodynamic quantities. The SBM, with all heavy (nonhydrogen) atom representation,
was chosen to represent the system topology with default parameters. Figure S5A shows the residue contact map of three out of six simulated
protein dimer interfaces CoV1+2, CoV2+1, and
CoV2+3, showing just the intermolecular contacts between two chains in the
native structure defined as Qbind. For each dimer, themodel
also accounts for the number of intramolecular contacts of thespike-RBD
(Qspike) and thepeptide
(Qpeptide). Qtotal is the sum of
the intramolecular and intermolecular native contacts (Qspike,
Qpeptide, and Qbind) of each
protein–peptide simulated system.The computational SBM implemented here does not have a direct connection with the real
temperatures. However, differences between quantities extracted from the theoretical
temperatures have been proved to be proportional to theexperimental temperatures, such as
the Φ-value analysis that is based on the free-energy variation or folding rates of
mutant/wild-type proteins.[106] The SBM is a topology-based model; thus,
it is assumed that slight variations on the reduced temperature around the vicinity of the
Cv(T) up/down shape do not abruptly change
the dimerization mechanism. Also, as experimentalists choose the quantity half-maximal
inhibitory concentration (IC50) to compare different inhibitory assays, we
choose T = Tbind as the temperature to
compare all of the simulated dimers. Tbind was defined as the
temperature where Cv is maximum for each system that separates
the transition between the states on (bound, low temperatures) and off (unbound, high
temperatures). At this temperature, bound/unbound states aremaximally sampled and the
computed statistical quantities are better estimated.Each dimer presented a broad yet a defined peak in the specific heat curve as a function
of temperature (Cv(T) in FigureS5B), determined by the thermodynamic analysis of the full set of
simulated runs. In general, protein–peptide dimers of CoV1 showed higher
Tbind than the respectivepeptide binding and folding to
CoV2, although they are comparable. Another thermodynamic result is related to the
designed fragment 3 that showed a higher peak in
Cv for both coronavirusspike proteins compared to the
natural peptides. This could be related to the insertion of the coil region on thespike-RBMs in addition to the α-helix, which may increase theenergy/temperaturenecessary for transition.The thermodynamic analysis of the free-energy profiles (F) as a function
of selected reaction coordinates at the binding transition temperature
(Tbind) was also computed. To improve themanuscript
clarity, analyses were focused on the comparison of one natural with the designed peptide
in complex with CoV1 and CoV2 and CoV1+2 and CoV2+3,
respectively. The results of all of the six peptide complexes are presented in the
Supporting Information section for the sake of completeness. Peptide
2, the longest selected α-helix peptide, was chosen to be compared
with fragment 3 and complexed with CoV1 and CoV2. Their implications can also
be discussed along with the different coronavirusspike proteins. CoV1 and CoV2 are very
similar protein homologues in the sequence and structure. Inhibitors 2
(natural peptide) and 3 (designed fragment) differ after residueSer44 in the
C-terminal portion of thehumanACE2-based peptides (Figures S1 and S2).Figure shows F as a function
of reaction coordinates based on native contacts (Qs) for the selected
protein–inhibitor dimers CoV1+2 and CoV2+3.
F as a function of Qs and position-based reaction
coordinates radius of gyration (Rg), root-mean-square
deviation (rmsd), and distance between the center of mass of the chains
(d) for all of the simulated protein dimers are presented in Figure S6 for completion. The defined fraction of native binding contacts
(Qbind) was monitored for thecoronavirusspike protein
dimers, and the two-state on/off free-energy profiles are shown in Figure
A for CoV1+2 and CoV2+3 and in
Figure S6E for all dimers at Tbind. A broad
bound state (around Qbind = 0.7) is separated from a sharp
unbound state (at Qbind = 0) by an F barrier
around Qbind = 0.1. The natural peptide complex
CoV1+2 presented the lowest binding F barrier in
Qbind of around
9kBT compared with the other complexes that
resulted in a barrier between 12 to 15kBT.
Thepeptides complexed with CoV2 presented, on average, a slightly higher binding
F(Qbind) barrier in relation to theCoV1
complexes. The same is also true for F profiles as a function of
Qtotal (Figure B),
Qpeptide (Figure C), and the position-based reaction coordinates (Figure S6, right panels). In Figure B, themajority of the native contacts of Qtotal
are related to thespike-RBDs (Qspike), over 1600, in contrast
to the total number of binding contacts of the two chains and the intramolecular contacts
of thepeptides (Qbind and
Qpeptide, respectively), both in the order of 100. Thus, in
Figure B, Qspike
is oscillating in its folded state and this is the reason why the bound–unbound
behavior is presented for values higher than Qtotal = 0.8. The
two states of F in Figure B
contain the unbound/unfolded state of thepeptide on the left well and the bound/folded
state on the right well, representing that dimer association and peptide/spike folding
occur without distinction, in contrast to Figure A,C, where binding and folding are dissociated by the coordinates. Higher
F barriers for the dimers were found along with the reaction coordinateQbind
(9–15kBT), suggesting that over
one-dimensional landscapes, the rate-limiting step in RBD-peptide dimerization is the
formation of native binding contacts among the chains interface at the transition state of
this coordinate. The dimerization rate limit is followed by the position-based coordinates
(4–12kBT in Figure S6).
Figure 3
Free energy (F) of simulated ACE2 peptide binding to SARS-CoV-1
(red) and SARS-CoV-2 (orange) spike-RBD proteins. One-dimensional F
profiles are shown as a function of reaction coordinates: a fraction of native (A)
peptide–protein chain interface contacts (Qbind),
(B) inter plus intrachain contacts (Qtotal), and (C)
peptide contacts (Qpeptide). (D, E) show two-dimensional
F profiles as a function of native contacts
Qbind and Qpeptide. In (A),
the designed peptide ACE22-44G351-357 (inhibitor 3) binds to the
SARS-CoV-2 protein with a higher F barrier when compared with the
α1-helix peptide ACE22-57 (peptide 2) binding to the
SARS-CoV-1 spike domain. (C) shows that fragment 3 folds through a
two-state mechanism upon binding to SARS-CoV-2 RBD, in contrast to a one-state
(downhill) folding mechanism of 2 binding to the CoV1 surface. In
addition, the coupled binding–folding processes are more cooperative to the
designed 3 (E) than the natural 2 (D), despite the fact that
the crossing barrier for 3 is higher and bumpier. F
profiles are in units of
kBTbind, with
Tbind being the respective system’s binding
temperature.
Freeenergy (F) of simulated ACE2peptide binding to SARS-CoV-1
(red) and SARS-CoV-2 (orange) spike-RBD proteins. One-dimensional F
profiles are shown as a function of reaction coordinates: a fraction of native (A)
peptide–protein chain interface contacts (Qbind),
(B) inter plus intrachain contacts (Qtotal), and (C)
peptide contacts (Qpeptide). (D, E) show two-dimensional
F profiles as a function of native contacts
Qbind and Qpeptide. In (A),
the designed peptideACE22-44G351-357 (inhibitor 3) binds to theSARS-CoV-2 protein with a higher F barrier when compared with the
α1-helix peptideACE22-57 (peptide 2) binding to theSARS-CoV-1 spike domain. (C) shows that fragment 3 folds through a
two-statemechanism upon binding to SARS-CoV-2 RBD, in contrast to a one-state
(downhill) folding mechanism of 2 binding to theCoV1 surface. In
addition, the coupled binding–folding processes aremore cooperative to the
designed 3 (E) than the natural 2 (D), despite the fact that
the crossing barrier for 3 is higher and bumpier. F
profiles are in units of
kBTbind, with
Tbind being the respective system’s binding
temperature.It is interesting to observe in Figures C and
S6A that during the dimerization processes, two distinct folding scenarios
emerged for thepeptides over F along their one-dimensional folding
reaction coordinate: one-state (downhill) and two-statepeptide folding mechanisms. It can
be observed that natural peptides fold via a completely downhill scenario, i.e., there is
no thermodynamic barrier in F(Qpeptide). Thenatural peptides 1 and 2manifested a downhill-likemechanism as
they bind to CoV1 and CoV2, except for CoV2+1 that showed a small signal of
two-state behavior in the coupled folding–binding two-dimensional landscapes,
discussed in thenext section. The downhill-like switches to a two-state folding scenario
as the joined peptide 3 binds to CoV1 and CoV2 RBDs (FigureS7A) with a folding barrier of
1–2kBT at the raised transition
state (Qpeptide ≈ 0.5) between the unfolded
(Qpeptide ≈ 0.3) and folded
(Qpeptide ≈ 0.7) states. Insertion of the loop
sequence G351–357 on the natural peptide 1 to form the designed
3 created many additional inter and intrachain contacts (see Figure S5A), which may lead to this two-state folder. Enlarging the natural
peptide from residues 22–44 to 22–57 only increases the size of the
α-helix peptide and did not substantially increase the binding native contacts of
the dimers as the addition of loop G351–357 (Figure S5A) did, and this could be one of the reasons of this change in the
folding scenario. Another possibility could be that the information of peptide folding
might not only be on thepeptide sequence but also templated on the sequence of the
binding partner as coupling proceeds,[107] although the partner sequence
changes fromSARS-CoV-1 to SARS-CoV-2spike-RBD without affecting protein topology. This
is a common feature of IDPs, and a large fraction of proteins undergo a disorder-to-order
transition when recognized (templated) by their physiological partners.[107] However, some IDPs have their folding–binding pathways encoded
within their sequence and are not templated by partner proteins with the same
topology.[108]The one-dimensional F profiles in Qbind and
Qpeptide were combined to produce the higher two-dimensional
F for the dimers CoV1+2 and CoV2+3, shown in
Figure D,E, respectively.
F(Qbind,
Qpeptide) represents the coupled binding–folding
pathways that a given dimer causes during theexperiments. As shown in Figure D,E, projections over these two-dimensions also raised
one thermodynamic barrier (transition state) for each dimer dynamics, with
CoV2+3 having a higher and rougher energetic barrier than
CoV1+2. The recovered energetic barriers were on the same order of
magnitude as the one-dimensional F barrier in
Qbind. In addition, the one- and two-state folding aspects
of the two peptides are presented in the
F(Qbind,
Qpeptide) projection on Qbind.
In Figure D, the downhill-folder natural peptide
binds and folds almost without distinction of the coupled dynamics. In contrast, FigureE shows that fragment 3 binds to
spike after folding, recovering a two-state kind of coupled mechanism. 3
binding–folding pathways aremore cooperative, although there is a higher and
funneled energetic barrier to cross. Once bound, the natural peptide can fold and unfold
more freely than the fragment, an aspect that will be thoroughly discussed in thenext
section. Highly dynamic “fuzzy” complexes, indicating a malleable
binding–folding pathway, were previously reported for IDPs complexed with stable
proteins by theexperiments temperature-jump[22] and multinuclear
relaxation dispersion NMR.[40] For another IDP-protein dimer, it was
reported that a free-energy barrier shows up separating the self-assembly process.[104]
Two-State Designed Peptide is Less Flexible on the Spike-RBMs
The coupled dynamics of peptide folding and binding to spike-RBD was investigated by
analyzing two-dimensional free-energy profiles. The binding was captured as the
one-dimensional F(Qbind), as shown in Figure A. Folding was captured by
Qpeptide in Figure C. The composition of Qbind and
Qpeptide to form a two-dimensional F
resulted in a landscape with a not so clear valley at the unbound state since this state
has 0 intersubunit contacts (Figure A),
resulting in a sharp valley located at the boundary of Figure (right panel, not clear to inspect) and another broad valley at
the center, as previously described in many works.[21,103] After combining all of the analyzed reaction
coordinates of Figure S6, it was found that the combinations presented in Figures , S7, and S8 presented two broader valleys at the bound and unbound states.
These combinations enabled more binding–folding mechanism analyses in the course of
the sampled states.
Figure 4
Two-dimensional free energy (F) of ACE2 peptides binding to the
SARS-CoV-1 (left panel) and the SARS-CoV-2 (right panel) spike proteins. (A)
Root-mean-square fluctuation (rmsf) of the peptides in the native (N) state. In the
right panel, the rmsf for the designed peptide in the encounter complex (E) state is
also shown (dash line). Representative simulated structures in these states (gray) are
aligned with the native dimer structure (colored). F profiles are
shown as a function of reaction coordinates of the native (B) peptide
(Qpeptide) and protein
(Qspike) contacts and (C) inter plus intramolecular
contacts (Qtotal) and
Qpeptide. The designed peptide ACE22-44G351–357
(inhibitor 3) folds via a two-state mechanism (monitored by
Qpeptide) upon binding to the SARS-CoV-2 RBD, which is
oscillating in the protein native state (monitored by
Qspike); in contrast to a downhill folding mechanism of
ACE22-57 (inhibitor 2) binding to the SARS-CoV-1 protein. Also, the rmsf
of CoV1+2 is higher than CoV2+3. F profiles
are in units of kBTbind, with
Tbind being the respective system binding
temperature.
Two-dimensional freeenergy (F) of ACE2peptides binding to theSARS-CoV-1 (left panel) and theSARS-CoV-2 (right panel) spike proteins. (A)
Root-mean-square fluctuation (rmsf) of thepeptides in the native (N) state. In the
right panel, the rmsf for the designed peptide in theencounter complex (E) state is
also shown (dash line). Representative simulated structures in these states (gray) are
aligned with the native dimer structure (colored). F profiles are
shown as a function of reaction coordinates of the native (B) peptide
(Qpeptide) and protein
(Qspike) contacts and (C) inter plus intramolecular
contacts (Qtotal) and
Qpeptide. The designed peptideACE22-44G351–357
(inhibitor 3) folds via a two-statemechanism (monitored by
Qpeptide) upon binding to theSARS-CoV-2 RBD, which is
oscillating in the protein native state (monitored by
Qspike); in contrast to a downhill folding mechanism of
ACE22-57 (inhibitor 2) binding to theSARS-CoV-1 protein. Also, the rmsf
of CoV1+2 is higher than CoV2+3. F profiles
are in units of kBTbind, with
Tbind being the respective system binding
temperature.Figure shows two-dimensional free-energy
landscapes for the two protein–peptide dimers CoV1+2 and
CoV2+3 at the defined transition temperature,
Tbind. Figure B
presents F as a function of the reaction coordinates
Qpeptide and Qspike, clearly
displaying one valley for CoV1+2 and two wells for CoV2+3.
Projection of F in Qspike (Figure S6C) shows one defined valley in F for both systems
with the fluctuations and sampling along Qspike of the protein
monomer being more rigid in CoV2. This is confirmed by the root-mean-square fluctuation
(rmsf) analyses of the simulated protein dimers in Figure S10. Figure S10 shows the rmsf of each residue, according to the primary
structure of CoV1/CoV2 proteins and theACE2-based peptides. Interestingly, the overall
flexibility of the complex CoV1-2 is the highest compared to the complexes
formed by CoV1 with inhibitors 1 and 3 and the threeCoV2
dimers. These differences areevenmoreevident in FigureS10D, where the rmsfs are incorporated in the complex structure as
tube radius. The binding of the three inhibitors 1, 2, and
3 makes the overall structure of CoV2 spike-RBD less flexible, which is an
indication of a tighter binding toward thenovel coronavirus protein. In addition, peptide
fluctuations are higher in CoV1 than in CoV2, as represented by thepeptide rmsfs in Figures A and S10. This rigidity of theSARS-CoV-2spike receptor-binding motif has
already been reported by simulations and experiments, and it was hypothesized as one of
the reasons for its enhanced infectivity.[59,60,67,74]In Figure B, it is the projection of
F in Qpeptide that differentiates the
one-to-two wells in F(Qspike,
Qpeptide) landscapes between the two dimers. The downhill
and two-state folding mechanisms of thepeptides along the
Qpeptide coordinate are again evident for
CoV1+2 and CoV2+3, respectively. For the two-state fragment
folder in CoV2+3, there is an F barrier separating the two
valleys on the order of 2kBT, which is the
double of the free-energy barrier of the one-dimensional
F(Qpeptide) (Figure C). In this high order projection of F in Figure B, conformational rearrangements of thespike
protein along Qspike are also taken into account combined with
folding in Qpeptide to overcome the barrier. Figure shows one valley marked with the letter
“N” standing for the “Native” state in which both protein and
peptide are folded and bound. “E” denotes the “Encounter”
complex state in which the dimer can be partially bound or unbound, the protein can be
folded yet slightly loose and open, and thepeptide is unfolded but partially structured
(Figure A). This two-state dimerization
mechanismemerges due to the temperature selected to plot
F(Tbind) that samples thesementioned
states without separating them in the classic three or more dimerization states reported
for bigger dimers.[21,103]Cv has a wide peak around Tbind in
FigureS5B, suggesting a smooth transition where these states coexist. The
completely unbound and unfolded state for thepeptide was seen in higher simulated
temperature data. Peptides are small compared with spike-RBDs, and thepeptide dynamics
arefaster in sampling states at Tbind as shown in a small
part of the representative time trajectory in Figure S7B,C. Dimers of CoV2 with natural peptides 1 and
2 also presented this two-state dimerization profiles in
F(Qspike,
Qpeptide), however, with lower barrier and elevated E well
minima comparing with N valleys (Figure S9A). Theexistence of more than two states at the
Cv(Tbind) peak does not
guarantee that these wells will have the same depth in Figures , S8, and S9 at Tbind.Figure C presents
F(Qtotal,
Qpeptide) landscapes at Tbind
for CoV1+2 and CoV2+3 with one and two global wells,
respectively. Qtotal contains
Qspike, despite being the largest amount of native contacts,
and its value fluctuates around a narrow valley wherespike-RBD is folded
(Qspikeminimum in Figure S6C). Qtotal also contains
Qbind and Qpeptide, and these
reaction coordinates have a smooth change in F at the transition
temperatureTbind (for example, Figures S6E and S7A, respectively). Apart from
Qpeptide that monitors thepeptide folding mechanism
separately, as shown in Figure C, we would say
that the increase in Qtotal is proportional to the increase in
Qbind, representing theminimal path of the diagonal. With
that being stated, theCoV1 dimer increases Qtotal
concomitantly with the increase in Qpeptide, suggesting a
mechanism that binding and folding of thepeptide occurs concomitantly in Figure C. TheE state is marked in Figure
C as a reference for the discussion of the coupled
binding–folding mechanism of theCoV1 dimer, although it is not a deep minimum
valley per se. Figure S7B shows that Qpeptide changes in a
manner similar to Qbind during the selected few steps of
CoV1+2 molecular dynamics, again suggesting thepeptide coupled
binding–folding mechanism.CoV2 dimer dynamics over F(Qtotal,
Qpeptide) shows a different minimal path diverging from the
diagonal that separates E and N states. The off-diagonal path suggests that the folding of
the designed fragment 3 (same for the natural 1 in Figure S9B) is coupled to binding, yet slightly different from
2 with CoV1. The separation F barrier of about
1kBT for the fragment folding is small
compared to the thermal fluctuations that could also be around
kBT;[109] thus, it does
not impose an obligatory step to the fragment being completely folded to bind to themonomer. Yet, it seems that thepeptide should be partially structured in the native
conformation, reflected by the first increase in Qpeptide, and
then binding to CoV2 occurs, shown by the second increase in
Qtotal. The off-diagonal path suggests a mechanism of
folding after binding, although folding occurs by crossing an energetic barrier. Figure S7C shows that Qpeptide still changes
around Qpeptide = 0.3 after thepeptide is unbound at step 32,
although Qbind remains equal to zero, again suggesting the
folding-upon-binding mechanism of the fragment. CoV2+3 dimer forms after
transposing the F(Qtotal,
Qpeptide) barrier of over
2kBT that occurs after at least 30% of the
designed native contacts being formed at Tbind.It is important to emphasize that our findings indicate that the limiting rate for the
dimer assembling of the two coronaviruses was the thermodynamic barrier on the
one-dimensional Qbind coordinate
(9–15kBT) and theenergetic barrier
along Qpeptide that segregates one-to-two-statepeptide
folding; therefore, binding is the critical step. Figures S8C and S9C show the two-dimensional F of the
simulated dimers as a function of the distance between the center of mass between the two
chains (dspike–peptide) and
Qpeptide.
F(dspike–peptide,
Qpeptide) shows an unbound (U) state in which the distance
between the protein and thepeptide is bigger than the native/bound (N) and theencounter
(E) complex states, both located at almost the same
dspike–peptide ≈ 2 nm. It can be observed that
U are wide wells approximately aligned with theE state in which both have a low level of
nativeness for thepeptide, i.e., thepeptide was found to be unfolded at U and E. Despite
the fact that there is a difference between the two natural and the designed peptides en
route to the N well. There is no barrier betweenE and N for the natural peptides; only
the fragment has to overcome an energetic barrier to fold, even though the threepeptides
are limited by a free-energy barrier to reach thespike proteins at
dspike–peptide ≈ 2 nm. Thus, this is an
indication that the fragment has a more cooperative coupled binding–folding
dynamics than the natural peptides. These two coupled events determine the kinetics and
the thermodynamics of the studied protein dimers and it is feasible to be characterized by
all-atom SBM simulations.
Designed Peptide has Binding Contacts Tightend by the Secondary Loop Site
The structural evolution content of the folding-upon-binding mechanism of thespike-ACE2-based dimers was analyzed by building average contact formation probability
maps. The probability map of an atom i to form native contacts along with
a reaction coordinate Q was computed
(p(Q, i) given by eq S3). In this probability map, each pixel corresponds to the average
frequency of an atom belonging to a residue to form their atom–atom native contacts
in a specific point of the reaction coordinate. p(Q,
i) is computed over a long time trajectory at a constant temperature
run. The simulations contain about 1600 atoms belonging to the two chains of each
protein–peptide dimer. Thus, it was convenient to show in the y-label the residue
index instead of the original atom numbering, which produced irregularly distributed
labels, yet, human-readable.Figure shows the probability map of thepeptide atoms to form native contacts along the binding (Figure A) and folding (Figure B)
reaction coordinates for the dimers CoV1+2 (left panels) and
CoV2+3 (right panels), both at Tbind. Figure B also shows the free-energy profile for the
binding coordinate (F(Qbind)), with the
defined transition state (TS) region shaded as the gray area. In addition, the probability
maps for the threepeptides simulated with SARS-CoV-1 and 2 spike-RBDs are presented in
Figures S11 and S12, respectively. Qtotal and
Qspike contain thousands of native contacts with similar
probability maps, and this was the reason to show only
p(Qspike, i) in Figures S11C and S12C. Thespike protein is relatively well folded (red maps
in Figures S11C and S12C) during the binding and folding contact formation of
thepeptides, which are on the order of a hundred. Thepeptides are α-helix-based
and this linearity in their structure and sequence are reflected in the high similarity
between the two portions of Qbind probability maps of the
proteins (top half) and peptides (bottom half) in Figures S11A and S12A.
Figure 5
Contact formation probability as a function of (A) binding
(Qbind) and (B) peptide
(Qpeptide) native contacts of each peptide atom
i (p(Q, i)) for
CoV1+2 (left panel) and CoV2+3 (right panel). The atom
numbers were replaced by the respective residue number to a better visualization. The
normalized contact probability increases from zero (blue) to 1 (red). Both cases were
analyzed at the respective binding temperature Tbind. For
each dimer, the free-energy profile
(F/kBT) as a function
of (A) Qbind and (B) Qpeptide
with the transition state (TS) area shaded in gray is also shown. Main native contact
interactions involved in the early binding event are marked in the structures at the
left side of each panel in (A). Peptide inhibitor 2 binds and folds to
CoV1 by the N-terminal portion of its α-helix. Fragment 3 also
attempts to bind to CoV2 by the N-terminal portion at the transition state (right
panel in A); however, after crossing TS, 3 unbinds its N-terminal region
and binds and folds via its C-terminal loop segment.
Contact formation probability as a function of (A) binding
(Qbind) and (B) peptide
(Qpeptide) native contacts of each peptide atom
i (p(Q, i)) for
CoV1+2 (left panel) and CoV2+3 (right panel). The atom
numbers were replaced by the respective residue number to a better visualization. The
normalized contact probability increases from zero (blue) to 1 (red). Both cases were
analyzed at the respective binding temperatureTbind. For
each dimer, the free-energy profile
(F/kBT) as a function
of (A) Qbind and (B) Qpeptide
with the transition state (TS) area shaded in gray is also shown. Main native contact
interactions involved in theearly binding event aremarked in the structures at the
left side of each panel in (A). Peptide inhibitor 2 binds and folds to
CoV1 by the N-terminal portion of its α-helix. Fragment 3 also
attempts to bind to CoV2 by the N-terminal portion at the transition state (right
panel in A); however, after crossing TS, 3 unbinds its N-terminal region
and binds and folds via its C-terminal loop segment.At the unbound state (blue strip at Qbind = 0 in Figures A, S11A, and S12A), the almost folded spike proteins approach thepeptides that
are completely unfolded and disordered, represented by the blue strip region at
Qpeptide = 0 on themaps in Figures B, S11B, and S12B. The sharp transition to the bound state starts when there
are between 4 and 10 binding native contacts formed for thepeptides, which is the
beginning of the TS in Figure A. At TS along
Qbind, many nonspecific protein–peptide contacts are
formed with a probability of up to 50%, except for the constructed 3 that
also showed some binding contacts made with a probability of above 80% at some positions
in TS (red dots at TS on themaps). Many other protein–peptide native interface
contacts start to bemade with high probability in red as binding proceeds by crossing TS
along Qbind.Thenatural peptides 1 and 2 have similar binding contacts
being formed with thespike-RBMs (Figure S5A). Figure A shows that
CoV1 first casts the natural peptide 2 by binding the protein residueAsn479
to thepeptide residueHist34 at Qbind ≈ 0, followed by
Tyr442 binding to peptide residues Lys31 and Glu35. CoV1 proceeds by binding residues
Asn473 and Cys474 to thepeptide residues Thr27 and Phe28. These two interface regions are
highly hydrophilic on both surfaces, protein and peptide, as is shown in Figure S13 (left panels). Natural peptide residues over Tyr41 do not
contribute to binding significantly. Previous experiments showed that CoV1 residues Asn479
and Thr487 were associated with the recognition of thehumanACE2 receptor in addition to
residues Tyr442, Leu472, Asp479, and Asp480 that allowed interspecies infection.[110] ResidueThr487 is a CoV1 residue important for binding of the designed
3 loop segment, as shown in Figure S11A (right panel). The corresponding CoV1 residues Asn479 and Thr487
in CoV2 are the polar residues Gln493 and Asn501, respectively, in which they also play a
key role in theACE2 recognition by CoV2.[111]The designed peptide 3 has a different sequence for the native binding
contact formation revealed by the probability maps. Figure A shows that fragment 3 binds at the transition state
to CoV2 by many interface contacts from its N-terminal portion attempting to follow the
same sequence of events of the natural peptides. However, after crossing TS in the
F(Qbind) profile, 3 unbinds
some of its N-terminal contacts and binds its C-terminal loop segment to CoV2. In Figure A, CoV2 binds to fragment 3
after TS by residueAsn501 to peptide residueLys353, followed by Gly354, Asp355, and
Arg357. CoV2 residues Gln498 to Tyr505 also contribute to these binding contacts to the
loop segment. This is connected with the recent finding that, among the novel spikemutations, SARS-CoV-2 has themaximum van der Waals interaction between residues
CoV2-Asp491 and ACE2-Lys353.[70] This partial interface formation
proceeds by CoV2 residueGln498 connecting to peptide residueGln42. The loop segment of
3, present in the wild-typeACE2-PD as a secondary binding site, creates
many additional intermolecular native contacts with thespike-RBD tightening up the
protein–peptide interaction.Figure S13 shows how this partial interface, created by the turn residues
Gln498, Pro499, and Asp501 of CoV2, is more hydrophilic than the similar (mutated)
residues in CoV1 (Figures S1C and S2). Also, theTrp500 surroundings in CoV2 showed to bemore
hydrophobic than in the same position in CoV1, as is pointed by the upper orange arrow in
Figure S13. In CoV1, the first probable binding residue was Asn479 that is
mutated by Gln493, which maintained the hydrophilic nature of this site in both proteins.
However, other mutations surrounding Gln493 in CoV2 created a more hydrophobic site than
in CoV1. This increase in hydrophobicity caused by mutations on this CoV2 site could be
another reason for the late capture of fragment 3 residues His34 and Glu35
during the binding process, despite the large number of binding attempts at TS. This
observation suggests room for fragment 3 binding affinity improvement. These
two mechanisms described by the natural and designed peptides are similar in CoV1 and CoV2
with a difference that many binding contacts weremadeearlier in CoV2 (Figure S12A) than in CoV1 (Figure S11A).Thepeptide binding processes are accompanied by the folding of their linear structures,
which can be seen in the probability maps. Figure B shows the probability of a peptide residue to form intramolecular native
contacts at a specific position of the folding coordinate,
Qpeptide. Figure B
presents p(Qpeptide, i) for
dimers CoV1+2 (left panel) and CoV2+3 (right panel), both at
Tbind, in addition to the probability maps of the threepeptides simulated with SARS-CoV-1 and 2 spike-RBDs presented in Figures S11B and S12B, respectively. It was discussed in the previous
sections by inspecting one- and two-dimensional F profiles that, upon
binding, thenatural peptides 1 and 2 fold via the downhill
mechanism and the designed fragment folds via the two-statemechanism. These two folding
mechanisms were again represented by
F(Qpeptide) in Figure
B, with the transition state shaded as the gray area only for
the two-state-folder fragment. The probability maps of Qbind
resemble those of Qpeptide in Figures , S11, and S12. This is a signal that as a peptide atom/residue is captured
(binding), on average, it assumes its native conformation (by folding) that is represented
by the red pixels in Figure B.An unstructured protein might have a greater capture radius for a specific binding site
than the folded state with its restricted conformational freedom. Therefore, the unfolded
state binds weakly, followed by folding, as the protein approaches the binding site, and
the binding rate is enhanced. This scenario has been hypothesized over a couple of decades
as the “fly-casting” mechanism and it has been studied by SBMs in which the
thermodynamic barriers for the real proteins were reasonably described by themodel in the
α-carbon (Cα) and all-atom representations of
proteins.[100−103] Thus, folding-upon-binding, similarly to the proposed
fly-casting mechanism, might be one of the possible ways of interaction between thespike-RBDs and thepeptides.In Figure B, the α-helix peptide
2 starts the downhill folding (at Qpeptide
≈ 0.1) by their N-terminal residues in a similar manner as the residues bind to
CoV1, as discussed in Figure A. The downhill
F profile has its minimum at Qpeptide
≈ 0.5, meaning that around 50% of thepeptide native contacts were not formed most
of the time during the simulation at Tbind. The probability
map of peptide 2 projected on F shows that residues over
Tyr41 are not yet formed at the free-energy lowest level, suggesting that the C-terminal
region of the natural peptide is more flexible, unbound, and loose in CoV1. However,
folding of the natural peptides in the presence of CoV2 starts in residues closer to the
C-terminal domain, as shown in Figure S12B, in addition to the fact that they are less flexible than in
CoV1. Figure B shows that the folding of
fragment 3 in the presence of CoV2 also starts from its C-terminal loop
residues that havemany of their intramolecular native contacts formed at TS of
F(Qpeptide). In addition, few
3 residues close to the N-terminal sequence are not formed at the folded
state in Qpeptide ≈ 0.8. Regarding 3 with
CoV1, thepeptide native contacts from the N-terminal residues are formed before TS,
similar to the natural peptides shown in Figure S11B. Nevertheless, some contacts are unmade after TS and a less
specific native contact formation proceeds as the folding occurs, also leaving some
N-terminal residues unfolded at the folded well.In general terms, the threepeptides (1, 2, and 3)
in a disordered state are cast (fly-casting) by the two spike-RBDs CoV1/CoV2 with the
hydrophilic Asn479/Gln493 residues being the bait of theexposed hydrophilic region close
to thepeptide residueHis34. This precise requirement of a small partner search diameter
is necessary for the α-based peptides to dock into the protein binding site
interface. The capping loop CoV1/CoV2 residues Thr487/Asn501 play an essential role in the
association of theACE2-based secondary binding site, the 3 loop region. The
inserted loop residues of 3 increase the binding contacts in the two
spike-RBDs, a major hydrophilic region. These polar contacts formed near the casting and
the loop residues correlate with thehydrogen bonds formed between the proteins and thepeptides. Figure shows that CoV2 has morehydrogen bonds formed with thepeptides than CoV1, which are thenew hydrophilic contacts
that emerged in CoV2 by residuemutations that occurred in CoV1. It was reported by
binding simulations of the full-length ACE2-RBD of SARS-CoV-1 and 2 that these differences
in the hydrophobicity and hydrogen bond network of the two dimers govern theenhanced
binding affinity of thenovel coronavirus.[74] Also, polar residuemutations in SARS-CoV-2 result in a greater electrostatic complementarity than that of theSARS-CoV-1 complex,[74] although the simulated electrostatic and
pH-dependent free-energy affinities of both dimers were similar.[112] It
is suggested that this similar, yet enhanced, binding energy of theSARS-CoV-2spike
protein is not due to a singlemutant but because of the sophisticated structural changes
induced by all mutations together.[70] This has implications for antibody
therapeutics and vaccines that might beeffective in combating different SARS-CoV-2
isolates affected by common mutations.[113]
Peptide Binding Affinity is Higher for SARS-CoV-2 Spike-RBD
Association rates and binding affinity constants have been investigated extensively for
interactions between proteins that can be folded or partially folded and for complexes
where one or more partner is initially disordered.[114] Solvent and
temperature dependence of the association rates indicate that kinetics of a coupled
folding and binding reaction is limited by the free-energy barrier or transition
state.[115] Theexperimentally observed binding process of the
binding–folding reaction was consistent with molecular dynamics simulations of the
coarse-grained Cα-SBM performed previously,[39] which
was similar to the all-atom SBM simulations of this study. Themolecular binding affinity
is related to theequilibrium dissociation constant (KD =
1/KA) that measures the propensity of a dimer in reversibly
separating its two components. Therefore, a low dissociation constant is related to higher
biding affinity and vice-versa. One can measure KD from the
ratio between the kinetic dissociation (koff) and association
(kon) rate constants (KD =
koff/kon).[39]
From the thermodynamic point of view, KD can also be
associated with the Gibbs free-energy change during the dissociation/binding events. The
stability of a protein–ligand binding can be determined by themagnitude of the
free-energy variation upon binding (ΔG) that is connected to
KD by[116]From the thermodynamic stability given by eq , it
is possible to measure theequilibrium KD and to infer the
biomolecular binding affinity of the simulated protein–peptide dimers. Theequilibrium dissociation constant was evaluated for a range of temperatures around
Tbind using the Helmholtz free-energy stability
(ΔF) in eq , which is
obtained in the simulations by the Weighted Histogram Analysis Method (WHAM).[117]F was evaluated as a function of the distance (d)
between the center of mass of thespike-RBDs and thepeptides (Figure S6d), differing from the previous sections that analyzed
F of native contact-based coordinates (Figure S6, left panels). We observed that the coordinateQbind is suitable for mapping the thermodynamics of the
contact formation mechanism; however, Qbind can give imprecise
information about the association. For instance, one of the dimer’s components
could be slightly rotated from the native bound conformation resulting in a low
Qbind, but in a very close distance between the
protein–ligand.Figure A shows an increase of theequilibrium
dissociation constant (KD) with temperature for the dimers
CoV1+2 and CoV2+3. At T =
Tbind, Figure B
presents the F(d) profiles for CoV1+2 and
CoV2+3 that resulted in the thermodynamic stabilities between the
bound/unbound states of ΔF ∼
3kBT and ΔF ∼
4kBT, respectively. This difference in
ΔF between the two dimers at Tbind is
mainly responsible for the difference of about one order of magnitude in
KD in Figure A.
The dissociation constant of CoV2+3 is lower than the
KD of CoV1+2, implying that the binding
affinity is higher. This trend is also observed for the other equilibrium
KD reported in Table , in which it predicts that the binding affinity of the threepeptides with the
novel spike-RBD CoV2 is higher than with SARS-CoV-1 RBD.
Figure 6
(A) Equilibrium dissociation constant (KD) as a function
of temperature for two simulated dimers: CoV1+2 (red) and
CoV2+3 (orange). KD was calculated using
eq (circles) and the line represents a
simple linear fit to the data that helps to guide the eye. CoV2+3 has
lower dissociation constants for increasing T than
CoV1+2, implying higher binding affinity in temperatures around
Tbind. (B) Free energy (F) as a
function of the distance between the center of mass of the protein and the peptide at
Tbind. At T =
Tbind in (A), the CoV2+3 equilibrium
KD is about one order of magnitude smaller than
CoV1+2, which is a reflection of the higher ΔF
stability between bound/unbound states in (B). Temperature is normalized by the
respective dimer binding temperature
(T/Tbind) and F
profiles are in units of
kBTbind.
Table 2
Equilibrium Dissociation Constant (KD) between the
SARS-CoV-1 and 2 Spike-RBDs and the Peptide Inhibitors 1, 2,
and 3, Evaluated at the Binding Temperature
(Tbind) of Each Protein Dimer
spike-RBD + peptide
KD(T = Tbind)
CoV1+1
1.5 × 10–2
CoV1+2
3.7 × 10–2
CoV1+3
1.4 × 10–2
CoV2+1
2.7 × 10–3
CoV2+2
1.6 × 10–3
CoV2+3
6.6 × 10–3
(A) Equilibrium dissociation constant (KD) as a function
of temperature for two simulated dimers: CoV1+2 (red) and
CoV2+3 (orange). KD was calculated using
eq (circles) and the line represents a
simple linear fit to the data that helps to guide theeye. CoV2+3 has
lower dissociation constants for increasing T than
CoV1+2, implying higher binding affinity in temperatures around
Tbind. (B) Freeenergy (F) as a
function of the distance between the center of mass of the protein and thepeptide at
Tbind. At T =
Tbind in (A), theCoV2+3equilibrium
KD is about one order of magnitude smaller than
CoV1+2, which is a reflection of the higher ΔF
stability between bound/unbound states in (B). Temperature is normalized by the
respective dimer binding temperature
(T/Tbind) and F
profiles are in units of
kBTbind.The analysis of binding events can be a rather difficult task. It is strongly dependent
on choosing the perfect setup and experimental conditions, and it is also advisable to
measure binding interactions on multiple platforms for cross-validation. Two label-free
binding measurement platforms are commonly used: the surface plasmon resonance (SPR) and
biolayer interference (BLI).[118] These two platforms were recently
employed by theexperimentalists to analyze the binding of optimized ACE2-based peptides
and antibodies targeting the full spike protein (or its RBD) of thenovel coronavirus in
which KD ranged from picomolar to nanomolar affinities.[119] The full ACE2enzyme dissociation constant of theSARS-CoV-2spike
protein is reported to be on the higher end, the nanomolar binding affinity,[4] though that of SARS-CoV-1 were reported by different BLI experiments to be
similar[50,120] or to
have an up to 20-fold higher KD,[4] meaning
lowered affinity to CoV1. The de novo designed ACE2-based peptide and
miniprotein affinities to CoV2 were also reported to have comparable or lower affinity to
CoV1.[68,82,93,121−124] This is an
indication that ACE2-derived peptides might be, in some cases, more sensitive to CoV2 than
to CoV1, that is, qualitatively in line with the theoretical binding affinities estimated
in Table . The natural peptides 1
and 2 and the designed 3 are based on the original sequence of
thehumanACE2 that is shown to be suboptimal for binding CoV2 in the nanomolar range.
This opens the opportunity to redesign thesepeptides to significantly improve their
binding affinity up to the picomolar range, inhibiting SARS-CoV-2 and its mutations fromentering human cells and hindering its fast transmission.
Conclusions
Many cellular activities are determined by the recognition of at least two flexible
biomolecules. In this manuscript, two sets of dimers relating to SARS-CoV-1 and 2 spike-RBDs
were simulated (CoV1 and CoV2), in which each RBD was simulated with threepeptide
inhibitors of increasing sequence length based on the α1-helix of ACE2-PD,
two naturals (1 and 2) and one designed (3) by
including a secondary loop sequence. The thermodynamic analysis showed that the threepeptides have different binding and folding free-energy barriers in the presence of CoV1 and
CoV2 and, for some coordinates, even a different transition state position. Kinetic analysis
of binding/folding native contact formation also revealed different patterns between the two
sets of dimers. In these two sets, the threepeptides have the same residue sequence;
therefore, the difference in the binding partner sequence seems to have an important role in
the binding affinity and folding of thepeptides. This difference in the
binding–folding process of the binder that is templated by the protein partner was
already reported for some IDPs;[107] however, it is not a universal
feature.[108] In the case of thecoronavirus dimers of this study, just
folding by itself separated the natural and designed peptides into two classes of
mechanisms; thus, the coupled binding-upon-folding mechanisms were dictated by both
protein–inhibitor residue sequences.Figure summarizes the two binding–folding
mechanisms of the studied protein–peptide dimers. Figure shows that it starts with the conformational rearrangements of the
superficially “open” spike-RBMs catching thepeptides at theencounter complex
step and then the binding occurs. The RBM surface accommodates thepeptide at the same time
that folding proceeds. This flexible binding–folding mechanism is more cooperative
for the designed peptide, as shown in Figure B,
due to the free-energy barrier for folding that 3 has to cross to reach the
native/bound state (Figure A). Free-energy
barriers for binding over the native contact-based and coordinate-based reaction coordinates
were similar when comparing SARS-CoV-1 spike protein dimers with SARS-CoV-2, although CoV2
dimers resulted in increased energetic barriers. Our results are in alignment with recent
computer simulations showing that electrostatic and van der Waals binding energies are
similar, yet higher, for theSARS-CoV-2spike protein bound to ACE2-based peptide
inhibitors, full-length ACE2-PD, and monoclonal antibodies.[72,74,112] It is also in synergy
with theexperimental results showing again that binding kinetics and affinity of humanACE2
to spike are similar, although higher to SARS-CoV-2 than to SARS-CoV-1, correlating with theefficient spread of SARS-CoV-2 among humans.[50]
Figure 7
Scheme of the binding and folding dynamics of the studied dimers with the natural
peptides 1 and 2 and the designed fragment 3. (A)
Schematic free energy as a function of a joint binding and folding reaction coordinate.
(B) Schematic diagram projecting the association mechanisms in two-dimensional binding
and folding reaction coordinates. The natural and designed peptides bind prior to
folding, although the designed fragment has a more cooperative binding–folding
mechanism due to its two-state folding mechanism.
Scheme of the binding and folding dynamics of the studied dimers with the natural
peptides 1 and 2 and the designed fragment 3. (A)
Schematic freeenergy as a function of a joint binding and folding reaction coordinate.
(B) Schematic diagram projecting the association mechanisms in two-dimensional binding
and folding reaction coordinates. The natural and designed peptides bind prior to
folding, although the designed fragment has a more cooperative binding–folding
mechanism due to its two-state folding mechanism.Concluding, the applied computational protocol was successful in determining binding
free-energies and main residues for the binding kinetics of the natural and designed
peptides in complex with both CoV1 and CoV2. Theenhanced binding energy of the designed
fragment was stated to be related to the increasing contacts of the secondary loop region,
which switched the folding mechanism to two-state-like, in contrast to the downhill-folder
natural peptides. In addition, the inserted loop sequence conferred to 3 a
tight bounding with the two spike-RBDs, explained by the increased hydrogen bonds and polar
contacts. It is important to remember that the theoretical protocol is a simplified all-atommodel based on the protein–peptideexperimental structure with no electrostatic
potential term and without solutes included, and they are implicit on the functional
Hamiltonian. These assumptions make themodel fast to run when compared with traditional
molecular dynamics simulations, which allows many different inhibitor peptides to beevaluated against thenovel coronavirus protein targets. Notwithstanding, themodel is
extremely versatile and these features or many others could be included to evaluate their
influence on the association dynamics. The design of new potent peptides and biomolecules
that inhibit novel coronaviruses can be advantageous to the development of diagnosis,
vaccines, and epidemic surveillance in the years to follow.[125]
Methods
Simulation and Analysis Details
Protein dimers were simulated with all heavy (nonhydrogen) atoms with the structure-based
model (SBM),[126] also known as the Go̅-model.[23]
SBMs define theenergy minimum of the Hamiltonian function as the native conformation;
thus, SBM is a topology-based model.[76] The simulation protocol was
based on the previous work[26] with GROMACS suite 5.1.4.[127] Input files for the SBM simulations were prepared with SMOG2 version
2.3-β with standard options for the all-atommodel.[128] Disulfide
bridges between the two cysteine residues (SS-bond) were included as a regular covalent
SBM bond, which represents an SS-bond oxidized state.[129] Native
structures were analyzed by FoldX[130] to minimize residue side-chain
energy.For each system simulated, the thermodynamic analysis was performed over 40 temperature
runs from lower temperatures (fully folded and on states) to higher temperatures (off
states), including the transition temperature between the on and off states
(Tbind), by the weighted histogram analysis method
(WHAM)[117] with a Python script, the PyWham package.[131] The WHAM computes themicrocanonical density of states that is used to
build the free-energy profiles. Tb was defined as the
temperature related to the specific heat peak between binding states for each protein. The
number/fraction of native contacts within a given structure (Q) was used
to evaluate nativeness as the reaction coordinate, which describes how similar is a given
structure with respect to the initial native structure.[132]
Root-mean-square fluctuation (rmsf), the radius of gyration
(Rg), root-mean-square deviation (rmsd), and distance
between the two chains (d) were computed with the analysis package
contained in theGROMACS suite. Each simulation was executed over 1 μs with
integration steps of 2 fs and snapshots of 4 ps. The all-heavy-atom SBM simulations wereexecuted using periodic boundary conditions with no constraint on the distance between the
two molecules. Enough sampling for the statistical calculations of bound/unbound states
was achieved at Tbind and temperatures in its vicinity.Sequence alignment and secondary structureelements were identified using theENDscript 2
web server.[133] Interfaces and interactions formed by CoV1/CoV2 and the
inhibitors were analyzed using the PISA (protein interfaces, surfaces, and assemblies)
server of theEuropean Bioinformatics Institute (EBI) (http://www.ebi.ac.uk/msd-srv/prot_int/cgi-bin/piserver).[95]
Protein-energy frustration was obtained by submitting the native conformation to the
AWSEM-MD Frustratometer[96,97] web server (http://frustratometer.qb.fcen.uba.ar). Protein structures were visualized with
PyMOL (https://pymol.org), VMD,[134]
and UCSF Chimera[135] softwares. Two- and three-dimensional curves were
plotted using Grace (https://plasma-gate.weizmann.ac.il/Grace), Gnuplot (http://www.gnuplot.info), and Matplotlib[136] packages. The contact probability maps were generated using the code
available at https://github.com/ronaldolab/pyContactProbability.
Authors: Aleksei Kuznetsov; Piret Arukuusk; Heleri Härk; Erkki Juronen; Mart Ustav; Ülo Langel; Jaak Järv Journal: Int J Pept Res Ther Date: 2021-12-01 Impact factor: 1.931
Authors: Robert T Mackin; J Vincent Edwards; E Berk Atuk; Noah Beltrami; Brian D Condon; Janarthanan Jayawickramarajah; Alfred D French Journal: Molecules Date: 2022-03-23 Impact factor: 4.927