Luisa Di Paola1, Hamid Hadi-Alijanvand2, Xingyu Song3, Guang Hu3, Alessandro Giuliani4. 1. Unit of Chemical-Physics Fundamentals in Chemical Engineering, Department of Engineering, Università Campus Bio-Medico di Roma, via Álvaro del Portillo 21, 00128 Rome, Italy. 2. Department of Biological Sciences, Institute for Advanced Studies in Basic Sciences (IASBS), Zanjan, 45137-66731, Iran. 3. Center for Systems Biology, Department of Bioinformatics, School of Biology and Basic Medical Sciences, Soochow University, Suzhou 215123, China. 4. Environmental and Health Department, Istituto Superiore di Sanità, 00161 Rome, Italy.
Abstract
SARS-CoV-2 has caused the largest pandemic of the twenty-first century (COVID-19), threatening the life and economy of all countries in the world. The identification of novel therapies and vaccines that can mitigate or control this global health threat is among the most important challenges facing biomedical sciences. To construct a long-term strategy to fight both SARS-CoV-2 and other possible future threats from coronaviruses, it is critical to understand the molecular mechanisms underlying the virus action. The viral entry and associated infectivity stems from the formation of the SARS-CoV-2 spike protein complex with angiotensin-converting enzyme 2 (ACE2). The detection of putative allosteric sites on the viral spike protein molecule can be used to elucidate the molecular pathways that can be targeted with allosteric drugs to weaken the spike-ACE2 interaction and, thus, reduce viral infectivity. In this study, we present the results of the application of different computational methods aimed at detecting allosteric sites on the SARS-CoV-2 spike protein. The adopted tools consisted of the protein contact networks (PCNs), SEPAS (Affinity by Flexibility), and perturbation response scanning (PRS) based on elastic network modes. All of these methods were applied to the ACE2 complex with both the SARS-CoV2 and SARS-CoV spike proteins. All of the adopted analyses converged toward a specific region (allosteric modulation region [AMR]), present in both complexes and predicted to act as an allosteric site modulating the binding of the spike protein with ACE2. Preliminary results on hepcidin (a molecule with strong structural and sequence with AMR) indicated an inhibitory effect on the binding affinity of the spike protein toward the ACE2 protein.
SARS-CoV-2 has caused the largest pandemic of the twenty-first century (COVID-19), threatening the life and economy of all countries in the world. The identification of novel therapies and vaccines that can mitigate or control this global health threat is among the most important challenges facing biomedical sciences. To construct a long-term strategy to fight both SARS-CoV-2 and other possible future threats from coronaviruses, it is critical to understand the molecular mechanisms underlying the virus action. The viral entry and associated infectivity stems from the formation of the SARS-CoV-2spike protein complex with angiotensin-converting enzyme 2 (ACE2). The detection of putative allosteric sites on the viral spike protein molecule can be used to elucidate the molecular pathways that can be targeted with allosteric drugs to weaken the spike-ACE2 interaction and, thus, reduce viral infectivity. In this study, we present the results of the application of different computational methods aimed at detecting allosteric sites on the SARS-CoV-2spike protein. The adopted tools consisted of the protein contact networks (PCNs), SEPAS (Affinity by Flexibility), and perturbation response scanning (PRS) based on elastic network modes. All of these methods were applied to the ACE2 complex with both the SARS-CoV2 and SARS-CoVspike proteins. All of the adopted analyses converged toward a specific region (allosteric modulation region [AMR]), present in both complexes and predicted to act as an allosteric site modulating the binding of the spike protein with ACE2. Preliminary results on hepcidin (a molecule with strong structural and sequence with AMR) indicated an inhibitory effect on the binding affinity of the spike protein toward the ACE2 protein.
Entities:
Keywords:
ACE2 binding; COVID-19; allosteric drugs; binding patch softness; drug discovery; elastic networks modeling; protein contact networks; spike protein
The recent pandemic caused by SARS-CoV-2 (COVID-19) has posed the most
challenging global health and sanitation concern of the twenty-first
century, with 6,663,304 confirmed cases of infected and 392,802 confirmed
fatalities (World Health Organization, updated on June 6th, 2020[1]). Moreover, this pandemic has caused a widespread
lockdown, affecting more than 1 in 2 individuals in the world.The virus responsible for COVID-19, SARS-CoV-2, emerged in late 2019, in the
region of Wuhan, China.[2] SARS-CoV-2 belongs to the family
Coronaviridae, which is characterized by large,
enveloped, positive-stranded RNA viruses that can potentially cause
gastrointestinal, nervous system, and respiratory distress.[3] This family also includes SARS-CoV and MERS-CoV,[4] which, despite several similarities, present some
interesting differences from SARS-CoV2.[4,5]SARS-CoV-2, MERS-CoV, and SARS-CoV use an envelope protein, termed the spike
(S) protein, to gain access to host cells.[4,6] The S protein displays
the highest degree of genetic variability in the virus genome.[6] Additionally, the SARS-CoV-2 S protein exhibits a greater
affinity for the ACE2 receptor with respect to analogs in SARS-CoV and
MERS-CoV.[7,8] The SARS-CoV-2 S protein shares a 96.2% sequence
similarity with the bat SARS-like RaTG13 S protein, indicating the natural
origin of the virus.[5,9]Due to its crucial role in the very first stage of viral infection, several
studies[7,10−14]
point to the S protein as an elective target to develop coronaviruses
therapies and vaccines (the S protein is also the main target of
neutralizing antibodies[15−17]).The S protein belongs to the viral fusion class I proteins,[7]
has a trimeric, crown-like shape, protrudes from the virus envelope, and
targets diverse host cell receptors in different species (see Table 1 in ref
(4)). Both SARS and SARS-like
(MERS and SARS-CoV-2) coronaviruses target the angiotensin-converting-enzyme
2 (ACE2) present in pneumocytes and enterocytes in the respiratory system
and other susceptible cells (e.g., intestinal mucosal, renal tubular
epithelial, cerebral neurons, and immune cells).[18] The
S-ACE2 interaction occurs at the very first stage of viral entry to initiate
viral fusion with the host cell.[19,20] The hallmark of SARS-CoV-2 is that
its S protein is optimized for ACE2 binding.[9]The S protein comprises two subunits, termed S1 and S2, which are cleaved
during the first stage of viral infection. The S1 subunit is responsible for
the interaction of the S protein with the host receptor (ACE2), while S2
promotes viral fusion.[21] The receptor binding domain
(RBD) resides between residues 318 and 513[22] in
SARS-CoV-2 and 318 and 510 in SARS-CoV.[13]The development of therapeutics targeting the S-ACE2 complex requires a
comprehensive study of this complex interface and its associated
dynamics.[23] Thus, investigating the allosteric
nature of the S-ACE2 complex will help to understand and quantify the extent
by which the S protein binds to ACE2, and therefore the molecular basis of
virus infectivity.The identification of allosteric modulation regions (AMRs) allows for the
design and testing of allosteric drugs (i.e., drugs targeting a different
site from the complex interface) with enhanced bioactivity with respect to
orthosteric drugs targeting the protein–protein interaction
interface,[24] which are often tightly packed and
scarcely accessible. The field of allosteric drugs is a rapidly developing
field in drug discovery and may play a crucial role in the therapeutic
strategy for SARS-CoV-2.[25−28]In this study, we adopted a multifaceted computational approach to identify
promising “druggable” allosteric sites of viral S protein. We
first applied the method of protein contact networks (PCNs) to the S-ACE2
complex in SARS-CoV-2 and SARS-CoV. In previous studies, we adopted this
method to investigate both protein–protein
interactions[29−31] and the allosteric nature of
binding.[32,33]In this work, the application of PCNs clearly highlighted a region in the
SARS-CoV and SARS-CoV-2 S protein, which acts as the putative
locus for allosteric modulation (AMR). AMR was
further investigated in terms of its relevance in the stability of the
complex and ability to transmit modulating signals to the binding site by
means of independent computational approaches.Specifically, we adopted a monomer-based approach[34] which
can predict the affinity of the S protein to its ligand. In this step, the
ensembles for the ground state of the structures have been produced using
the anisotropic network model (ANM) approach.[35] Moreover,
we adopted an unsupervised blind procedure to predict possible interaction
sites on S proteins and their affinities for tentative partners using a
softness-based prediction of intersubunits affinity (SEPAS). In addition,
the dynamical difference in RBD between the S proteins of the two virus
strains suggested that they may have different binding and allosteric
properties.We also provided biophysical evidence based on the elastic network modeling
(ENM) approach, combined with perturbation-response scanning (PRS)[36] that AMRs in both viruses acted as a mediator of
intermolecular allostery between the S protein and ACE2. “The three
methods converged in allosteric character for residues in AMR in both
complexes. This allowed us to state that residues in AMR for SARS-CoV-2AMR
are more susceptible to allosteric drug targeting than for SARS-CoV.A recent study[37] suggested that the many residues in the AMR
may be one of the most efficient epitopes for antibody recognition.Preliminary docking studies individuated some molecules that bind to AMR
(hepcidin), which were independently shown to exhibit strong sequence
similarities with the AMR.[38]
Materials and Methods
The analyzed structures consist of the following: complex SARS-spike
glycoprotein-humanACE2 complex (stabilized variant, all ACE2-bound
particles, PDB code 6CS2,[39] termed the SARS-CoV S/ACE2
complex) and the SARS-CoV-2 analogous spike glycoprotein-humanACE2
complex[40].
Protein Contact Network Methods
Purposed software was used to transform the full structural information
in the PDB files into a protein contact network (PCN): the network
nodes are the amino acid residues represented by α-carbons.
Links between nodes (residues) exist if the mutual distance of the
residues (centered on α-carbons) were in the range between 4 and
8 Å, thus including only significant (<8 Å) noncovalent
(>4 Å) bonds, while discarding “obliged”
contacts due to proximity along the sequence.The adjacency matrix A mathematically represented the PCN in
terms of undirected, unweighted network and is defined
asThe topological role of nodes
(residues) addressed the functional role at the corresponding
residues, based on the value of network descriptors. The method is
widely discussed elsewhere.[41]The basic network descriptor was the node degree
k, defined
asTo detect allosteric sites and, more
generally, the functional regions activating upon binding, we adopted
a method based on network spectral clustering.[42,43]
Spectral clustering was based on the Laplacian matrix spectral
decomposition, where the Laplacian matrix L is defined
aswhere D is the degree
matrix (i.e., a diagonal matrix whose diagonal is the degree
vector).The spectral clustering was based on the eigenvalue decomposition of
Laplacian L. The sign of the Fiedler vector (the
eigenvector corresponding to the second minor eigenvalue) was used for
binary partition (each network was parted into two clusters, which in
turn could be parted into four, and so on), as outlined in refs
(43 and
44). We previously
demonstrated that network clusters (group of residues) correspond to
functional regions in the protein.[43]Once network nodes (again, protein residues) were parted into the given
number of clusters (an independent user-defined parameter for the
clustering software), it was possible to define the participation
coefficient
Pwhere
k is the
degree of the ith residue computed in the cluster it
belongs (i.e., accounting only for links with nodes pertaining to the
same cluster).This network descriptor has been demonstrated to represent the functional
role of residues in binding and stability.[33,41,43,45] Residues with a
high P value are responsible for the communication
between clusters (functional regions) and are thus addressed via the
role in allosteric communication.[46] In this study,
the participation coefficient maps were projected onto ribbon protein
structures (as heat maps), to highlight activating hotspots in the
spike-ACE2 complex. Participation coefficient maps are visualized by
PyMol (https://pymol.org/2/).We characterized the spike protein/ACE2 interface by means of descriptors
stemming from the structural information and the network analysis
described above:the interchain
degree kIC was defined as the node degree, but was
only computed over contacts between nodes (residues)
belonging to different
chainsThe interface roughness Q/R[29] was defined for each
chain participating in an interface, Q was the number
of chain residues in the interface and R, the
sequence range of chain residues in the interface.the interface energy
EINT is the sum of of
eij and
⟨EINT⟩ is the value
averaged over the whole number of residues at the interface.the interface amino acid
range[29] IAR =
R/N, where
N was the total number of
residues in the
chainthe interface
energy matrix, E, defined
as
Elastic Network Modeling of ANM and GNM
Elastic network models (ENMs) provide efficient methods for investigating
the intrinsic dynamics and allosteric communication pathways in
proteins. We adopted two elastic network models: (1) anisotropic
network model (ANM); and (2) Gaussian network model (GNM).In ANM,[47] the interaction potential for a protein of
N residues
was:In ANM, the 3N
× 3N Hessian matrix, H, determined
the systems dynamics.H generic element
waswhere
X,
Y, and
Z
represented the Cartesian components of the ith
residue, V was the potential energy of the system. We
selected rc = 13 Å.ANMs provide information about the amplitudes, as well as the direction
of residue fluctuations. The similarity between two ANM modes,
u and
v, evaluated for
proteins with two different conformations, can be quantified in terms
of the inner product of their
eigenvectors:GNM[36] was
based on the description of the protein structure as a network of
C connected
by springs of uniform force constant γ if they were located
within cutoff distance rc. In GNM, the
interaction potential for a protein of N residues
waswhere R0 and R
were the equilibrium and instantaneous distance between residues
i and j, and
Γ was the N ×
N Kirchhoff
matrix:Thus, the square fluctuations
wereThe normal modes were
extracted by eigenvalue decomposition of the matrix
Γ =
UΛU,
with U being the orthogonal matrix whose
kth column
u was the
kth mode eigenvector. Λ was
the diagonal matrix of eigenvalues
λ.The perturbation response scanning (PRS) approach was based on the linear
response theory and allowed evaluation of residue displacements in
response to external forces.[48] Our study performed
a PRS analysis based on GNM by constructing the inverse Laplacian
matrix, Γ–1.The N-dimensional vector ΔR of node displacements
in response to the application of a perturbation (a N-dimensional
force vector F) obeys Hooke’s law (F
= H • ΔR).The purpose of PRS was to exert a force of a given magnitude on the
network, one residue at a time, and F for other
residues not being perturbed, equal to zero. Thus, the force exerted
on residue i was expressed
asand the resulting response
wasΔR(
was an N-dimensional vector that described the deformation of all the
residues in response to
F. A metric for
the response of residue k was the magnitude
⟨∥ΔR(∥2⟩ of
the kth block of
ΔR(
averaged over multiple
F(, expressed as
the ikth element of the N ×
N PRS matrix, SPRS.
The elements of SPRS referred to the unit
(or uniform) perturbing force. The response to unit deformation at
each perturbation site was obtained by dividing each row by its
diagonal
valueThe average effect of the
perturbed effector site i on all other residues was
computed by averaging over all sensors (receivers) residues
j and can be expressed as
⟨(ΔR)2⟩sensor.
The effector profile
⟨(ΔRi)2⟩effector
described the average effect that the local perturbation in the
effector site i had on all other residues. The maxima
along the effector and sensor profiles corresponded to the functional
mobile residues that underwent allosteric structural change.Because the crystallized structure presented one snapshot of the protein
dynamic life at the bottom of a potential well, we created a
trajectory of protein dynamics in its ground state by setting the RMSD
cutoff to 5 Å using the ANM[36,47] approach
implemented in Prody.[49] To predict the most
possible protein binding patches (PBPs) on spike proteins, we utilized
ISPRED to predict the interface residues.[50] The
ISPRED-predicted interface residues acted as a seed to create a
possible PBP based on the procedure as previously described.[34] The predictions of monomer-based affinities based
on the mechanical softness of PBPs were performed by utilizing SEPAS
ver. 1.[34] We introduced ALA mutations into the
Covid-spike using FoldX suite.[51]
Additional Computational Methods
We computed the general interface properties of the spike/ACE2 complex
using PISA software,[52] specifically the interface
area, SINT, and the energy gain upon the
interface formation, ΔGINT.
Results and Discussion
Protein Contact Network Results
Figure reports the cluster
partition (2 clusters) of the SARS-CoV S/ACE2 complex (Figure B). The chains are
also shown to highlight which chains participate in the two clusters
(Figure A).
Figure 1
SARS-CoV S/ACE2 complex (PDB code 6cs2(39)). (A) Partition
into two clusters (green cluster 1, comprising the
ectodomain of ACE2 and the fusion peptide, in red cluster
2, the remaining part of the spike protein); (B) chains in
the complex: in green chain A, in cyan chain B, carrying
the fusion peptide, in magenta chain C, in yellow the ACE2
ectodomain.
SARS-CoV S/ACE2 complex (PDB code 6cs2(39)). (A) Partition
into two clusters (green cluster 1, comprising the
ectodomain of ACE2 and the fusion peptide, in red cluster
2, the remaining part of the spike protein); (B) chains in
the complex: in green chain A, in cyan chain B, carrying
the fusion peptide, in magenta chain C, in yellow the ACE2
ectodomain.The interface between the fusion peptide and the ACE2 ectodomain was so
strong (in terms of number of contacts between viral and host
proteins) that the clustering partition algorithm recognized the
fusion+ACE2 region as a single cluster, even though it comprised
sequences belonging to different chains.The map of the participation coefficient projected onto the ribbon
structure of the SARS-CoV/ACE2 complex (Figure C) shows an active region
(P > 0) in the junction between the fusion
peptide and the trimeric bulk phase of the spike protein. Active
residues are shown more in greater detail in red in Figure D. As previously demonstrated,
the participation coefficient describes the attitude of residues
toward participating in the intercommunication between clusters; thus,
it was a putative score of the involvement of the residues in
allosteric communication. Therefore, this region was a good candidate
to intervene in the allosteric regulation of complex formation and
there was value in investigating allosteric drugs that target this
portion of the protein.Figure refers to the analogous
analysis for the SARS-CoV2 S/ACE2 complex.
Figure 2
SARS-COV 2 S/ACE2 complex: (A) Partition into two clusters
(green cluster 1, comprising the ectodomain of ACE2 and
the fusion peptide, in red cluster 2, the remaining part
of the spike protein); (B) chains in the complex: in green
the chain A, in cyan the chain B, in magenta the chain C,
carrying the fusion peptide, in yellow the ACE2
ectodomain.
SARS-COV 2 S/ACE2 complex: (A) Partition into two clusters
(green cluster 1, comprising the ectodomain of ACE2 and
the fusion peptide, in red cluster 2, the remaining part
of the spike protein); (B) chains in the complex: in green
the chain A, in cyan the chain B, in magenta the chain C,
carrying the fusion peptide, in yellow the ACE2
ectodomain.A similar active region appeared at the junction of the fusion peptide to
the body of the spike protein; however, the active region in the SARS
CoV S/ACE2 was more compact than that of SARS-CoV2 S/ACE2. Active
regions in both complexes were characterized by two β-sheets and
unfolded traits, and could thus be targeted to peptides due to
flexibility of the region and absence of a pocket to bind small
organic molecules.[53]Table reports the values of
the participation coefficient P in residues with
P > 0, all of which were located in the
chain carrying the fusion peptides of the SARS-CoV S/ACE2 (chain B, in
cyan in Figure A) and
SARS-CoV2 S/ACE2 (chain C, in magenta in Figure A) complexes.
Table 1
Values of the Participation Coefficient
P for the SARS-CoV S/ACE2 and
nCoV2019 S/ACE2 Complexes (Only Positive Values Are
Reported, with Corresponding Position, in Italic Values
Larger than 0.6)
SARS CoV S/ACE2
SARS-CoV2 S/ACE2
aa
P
aa
P
312D
0.31
320Q
0.61
313V
0.56
321P
0.56
314 V
0.69
322T
0.64
315R
0.47
323E
0.67
316F
0.23
531N
0.75
516S
0.31
532L
0.75
517T
0.64
533V
0.44
518D
0.64
534 K
0.31
519L
0.56
537C
0.15
520I
0.36
538V
0.44
527F
0.19
539N
0.70
528N
0.51
540F
0.51
529N
0.61
542F
0.21
530N
0.64
546T
0.31
532L
0.36
547G
0.36
563R
0.21
548T
0.51
564D
0.27
549G
0.44
565P
0.61
576R
0.21
566 K
0.56
577D
0.21
567T
0.56
578P
0.84
568S
0.64
580T
0.56
-
-
581L
0.36
The SARS CoV S/ACE2 and SARS-CoV2 S/ACE2 complexes accounted for 21 and
22 active residues, respectively. The average value was similar for
the two distributions (0.47 for SARS CoV S/ACE2, 0.48 for SARS-CoV2
S/ACE2). However, the absolute values of higher P
residues (>0.6) were higher for COVID19 S/ACE2, indicative of a
corresponding larger reactivity of the active region.
Characterization of the S-ACE2 Interface
Table reports the properties
of interface S/ACE2 in the two complexes. The interface area and the
absolute value of energy of the SARS CoV S/ACE2 complex were higher
than the corresponding area for the SARS-CoV-2 S/ACE2 complex;
however, the specific values for the unit area and single residues
were higher for SARS-CoV-2 S/ACE2, indicating a more efficient yet
less stable interface.
Table 2
Properties of the Spike Protein/ACE2 Interface
SARS-CoV SA/ACE2
SARS-CoV2 S/ACE2
IARSP
0.01
0.008
IARACE
0.62
0.56
(Q/R)SP
0.45
0.37
(Q/R)ACE
0.05
0.04
EINT
5.37
4.35
⟨EINT⟩
0.17
0.17
nIC =
QSP +
QACE
31
26
SINT,
Å2
962.2
739.9
ΔGINT,
kcal/mol
–7.9
–7.4
(,
cal/(mol Å2))
8.21
10.00
This implied an optimized complex formation with the ACE2 of SARS-CoV-2,
as observed in ref (9). This
helped to explain the more rapid kinetics of infection, mediated by a
faster complex formation with main receptor, ACE2.The RBDs for the two proteins are reported as a wide peptide, which
comprised approximately 200 residues for both structures; however, a
more limited core region, termed the receptor binding motif (RBM), was
claimed to actively participate in the complex formation of the spike
protein with its receptor, ACE2. We independently identified the
residues in the spike protein for SARS-CoV and SARS-CoV2 using the
method of PCNs. The position and the value of the interchain degree
are reported in Table for
both complexes. The two residues in the two structures with the
highest value of interface degree were 488G in SARS-CoV and 502G in
SARS-CoV-2, respectively.
Table 3
Residues in the Interface S/ACE2, with Corresponding
Interface Degree
SARS-CoV S/ACE2
SARS-CoV2 S/ACE2
Position
kIC
Position
kIC
462P
3
475A
2
463D
2
476G
1
464G
1
487N
2
473N
3
489Y
1
475Y
1
493Q
1
479N
3
500T
4
482G
1
501N
4
486T
4
502G
8
487T
4
503V
3
488G
6
504G
1
489I
4
505Y
2
490G
1
-
-
491Y
2
-
-
These two residues could be considered the hotspots of the PPI interface
in the S/ACE2 complex; both are glycine residues, likely due to the
high steric adaptability of the small glycine residues in the most
tightly packed region of the S-ACE2 interface. This confirmed the high
density of the interface and its extremely scarce accessibility to
small molecules targeting the region, which also suggested the need to
interfere with the S/ACE2 interface through allosteric regulation.
Spike Binding Affinity and the Allosteric Modulation Region
(AMR)
In the previous sections, the AMRs of the S/ACE2 complexes have been
identified for SARS-CoV-2 and SARS-CoV using the PCNs approach (Table ). The AMR is
accessible for binding, which is at odds with the PBD.One critical property of the protein complexes was the intersubunit
affinity. Therefore, besides the S/ACE2 stability, we predicted the
affinity between the AMR and incoming-designed molecule (peptide
molecules), which will be designed to affect the function of AMR.This was a crucial step that will provide an independent proof-of-concept
of both the allosteric features of AMR (by definition, an allosteric
site must “sense” the microenvironment) and the
druggability of the predicted AMR.We reviewed the need for complete 3D knowledge of the S/ACE2 complex by
means of a recently introduced method to predict the intersubunit
affinity of protein complexes using protein binding patch (PBP) on a
single subunit by computing the PBP mechanical
softness.[34,54]To alleviate the static presentation of the protein structure in a
crystallized state, we created ensembles of protein complexes in their
ground state using the ANM approach. In the first step, we predicted
the affinity of the SARS-CoV-2 and SARS-CoV S proteins for ACE2.We compared the predicted affinity between the S protein and ACE2 when
the S protein was in its monomeric or trimeric state to obtain some
hints about the effect of adjacent subunits on the affinity between
spike and its ligand (Figure ). Subunits (chains) are denoted as SA, SB, and SC.
Figure 3
Predicted affinities of spikes to ACE2 are presented. The
affinity of SARS-CoV 2 spike (blue curve) and SARS-CoV
spike (red curve) to ACE2 is predicted using SARS-CoV
2/SARS_X_C/B_Y
summarized the state of spike proteins in affinity
prediction: X denotes number of chain in
predictions. X = 1 means S1 protein in
monomer form is considered, X = 2 means
S1 + ACE2 complex, and X = 4 means whole
spike complex + ACE2. Y notes the
considered PBP in computations: Y = 2
means the PBP’s residues of S1 in S1-ACE2 complex
is used to predict the affinity and Y = 4
means the PBP’s residues of S1 in whole spike
proteins-ACE2 complex is used to predict the affinity
between S1 and ACE2. Horizontal axis presents predicted
affinity (kcal/mol). Normalized density denotes in
Y-axis.
Predicted affinities of spikes to ACE2 are presented. The
affinity of SARS-CoV 2spike (blue curve) and SARS-CoVspike (red curve) to ACE2 is predicted using SARS-CoV
2/SARS_X_C/B_Y
summarized the state of spike proteins in affinity
prediction: X denotes number of chain in
predictions. X = 1 means S1 protein in
monomer form is considered, X = 2 means
S1 + ACE2 complex, and X = 4 means whole
spike complex + ACE2. Y notes the
considered PBP in computations: Y = 2
means the PBP’s residues of S1 in S1-ACE2 complex
is used to predict the affinity and Y = 4
means the PBP’s residues of S1 in whole spike
proteins-ACE2 complex is used to predict the affinity
between S1 and ACE2. Horizontal axis presents predicted
affinity (kcal/mol). Normalized density denotes in
Y-axis.To manage the potential states of the spike protein, we propose different
forms of PBPs on the S protein:Our
computations suggested that SARS-CoV-2SA (dark blue curves, Figure ) was more sensitive
to the presence of the SB subunit than SARS-CoVSA (red curves, Figure ). This result
suggested approaches that potentially disrupted the SA-SB association
may provide an efficient therapeutic route.The SA subunit is alone and waiting
for its partner (left column, Figure ). We suppose
that SA may interact with ACE2 using PBPs similar to
those in the ensemble of S/ACE2 in its dimeric state
(the first row, Figure ) or by using PBPs similar to
ones SA displays in the trimer spike and ACE2
ensemble (second row, Figure )Subunit SA
finds ACE2 and binds it directly (right column,
Figure ). SA binds to ACE2 via the PBPs that it
displays, either when no SB subunit is available
(the first row, Figure ) or in association with SB
monomers (second row, Figure ).We observed a higher affinity between S and ACE2 in SARS-CoV-2 in
SARS-CoV. This observation may explain the higher infectivity of
SARS-CoV-2.We are interested in predicting the affinities between possible PBPs on
spike proteins and tentative protein partners. In this case, we did
not know the PBPs on spike proteins because we did not have the
resolved 3D structures of all possible protein complexes between the
spike protein and other natural or synthesized partners. We predicted
the seed interface residues on the spike surface using the ISPRED
approach.[50] By expanding the selection radius
for defining the PBP region,[34] many possible PBPs
were generated for both SARS-CoV-2 and SARS-CoVspike proteins. Using
SEPAS, the affinities between the generated PBPs and unknown partners
were predicted.The affinities averaged over the whole are presented in Figure a and b (for SARS-CoV-2 and
SARS-CoVspikes, respectively); the average affinities mapped on the
surface of the spike proteins are reported in Supporting Information Figures 1 and 2. Some of the
predicted regions with an affinity < −5 kcal/mol resided in
the interface region of the spike subunits. There were some residues
in the proposed PBPs that belonged to AMR (indicated as red triangles
in Figure ). This meant that
AMR could act as a distinct PBP for some partners, suggesting that
targeting AMR may be a good choice for drug design.
Figure 4
Affinity of possible PBPs on the S1 subunit of the spike to
tentative proteins are predicted. Possible PBPs are
defined for the S1 subunit. The affinity of the PBPs on
SARS-CoV2 (a) and SARS-CoV (b) S1 protein for their
possible partners is presented. Left vertical axis points
to the averaged predicted affinity (kcal/mol) of the
proposed PBPs. Right vertical axis denotes the
P metric (Table
) of the AMR participat
residues. X-axis notes the residue of the
PBP’s central residue. Red triangles represent the
position of AMR residues on the S1 protein sequence.
Affinity of possible PBPs on the S1 subunit of the spike to
tentative proteins are predicted. Possible PBPs are
defined for the S1 subunit. The affinity of the PBPs on
SARS-CoV2 (a) and SARS-CoV (b) S1 protein for their
possible partners is presented. Left vertical axis points
to the averaged predicted affinity (kcal/mol) of the
proposed PBPs. Right vertical axis denotes the
P metric (Table
) of the AMR participat
residues. X-axis notes the residue of the
PBP’s central residue. Red triangles represent the
position of AMR residues on the S1 protein sequence.Thus, in considering AMR as distinct PBP, we were able toPredict the
affinities between AMR and unknown drug-like protein
partners, simply by using a 3D structure of AMR on
the spike subunit in different ensembles (monomer,
case “C” in Figure
a)
Figure 5
Affinity of AMR for tentative protein partners and the effect
of AMR perturbation on ACE2 affinity are presented. (a)
Affinity (kcal/mol) of AMR to possible partners are
predicted when S1 is a monomer (state C), S1 is associated
with ACE2 (stade DC), S1 is accompanied by two S2 subunits
(state ABC), and when the whole spike binds to ACE2 (state
ABCD). (b) Affinity of S1 for ACE2 is predicted when AMR
is intact (WT), residues of AMR with P
> 0.5 are mutated to ALA (P > 0.5),
or all AMR residues are mutated to ALA (P
> 0). The affected region of affinity density curve is
marked with vertical gray arrow. X-axis
notes predicted affinity (kcal/mol) and vertical axis
presents normalized density of affinities.
Account
for the effect of ACE2 binding on AMR affinity for
its possible partners. We assumed the spike monomer
in association with ACE2 then predicted the affinity
between AMR and its possible partners (dimer, case
“DC” in Figure a)Address
the effect of the SBspike chains on the affinity of
AMR in the SAspike chain for its partners, feeding
SEPAS with the trimeric state of spikes (trimer,
case “ABC” in Figure
a)Analyze
the condition of a trimer spike and ACE2 in terms of
a holo-complex (S/ACE2 complex, case
“ABCD” in Figure
a)Affinity of AMR for tentative protein partners and the effect
of AMR perturbation on ACE2 affinity are presented. (a)
Affinity (kcal/mol) of AMR to possible partners are
predicted when S1 is a monomer (state C), S1 is associated
with ACE2 (stade DC), S1 is accompanied by two S2 subunits
(state ABC), and when the whole spike binds to ACE2 (state
ABCD). (b) Affinity of S1 for ACE2 is predicted when AMR
is intact (WT), residues of AMR with P
> 0.5 are mutated to ALA (P > 0.5),
or all AMR residues are mutated to ALA (P
> 0). The affected region of affinity density curve is
marked with vertical gray arrow. X-axis
notes predicted affinity (kcal/mol) and vertical axis
presents normalized density of affinities.These results indicated that the AMR of SARS-CoV-2spike exhibited a
higher affinity for generic peptides than the AMR of SARS-CoVspike.
Interestingly, the affinity of AMR for their partners was increased
upon binding to ACE2, which suggested an interplay between the two
distant regions, the ACE2-binding site of spike and AMR. This last
result could be considered to be another proof-of-concept of the
allosteric character of the AMR.To further investigate the effect of the induced structural perturbation
in AMR on the binding affinity of the spike protein to ACE2, we
introduced two sets of ALA mutations in AMR. In one set, the AMR
residues that play a major role in allosteric modulation
(P > 0.5 in Table ) were mutated to ALA, and in the other
set, all residues of AMR on SARS-CoV-2SA were mutated to ALA.To simulate an SA perturbation, we used the ensemble of structures
generated using the ANM approach after setting the RMSD cutoff to 8
Å for selecting normal mode states. Such global perturbation only
distorts the ACE2-binding site of SA to less than 1.5 Å. In Figure b, we reported that
wild-type SA exhibits a shoulder in its affinity density curve at a
high affinity region. Our observations suggested that the structural
perturbation of AMR due to the ALA mutation of native residues
decreased the affinity of SA for ACE2. This was further confirmation
of the allosteric properties of the AMR.In order to explore AMR druggability, we started by recognizing the
strict structural similarity of the AMR region with hepcidin, a
crucial protein for iron regulation. Moreover, a recent preprint
reported a distant (albeit relevant) sequence similarity between
hepcidin and the SARS-CoV-2spike protein.[38] It has
been established that there was some proof of the similarity between
the hydrophobicity spectra of ligand–receptor pairs, which
fostered the observed sequence similarity between hepcidin and the
SARS-CoV-2 S-protein.[55,56]Moreover, hepcidin is strictly involved in interleukin 6 (IL-6)
regulation,[57] given that IL-6 is the main
actor associated with the “cytokine storm” that is
linked to fatalities provoked by SARS-CoV-2 infection.[58] A preliminary docking analysis showed a perfect
fit between hepcidin and the AMR region; however, this now must only
be considered as a first step with which to prioritize different
peptides as allosteric effectors.
Dynamical RBDs Comparison of SARS-CoV and SARS-CoV-2
In addition to the characterization of the S/ACE2 interface, a comparison
of RBD dynamics between SARS-CoV and SARS-CoV-2 was also important to
understand the functional mechanism of the S/ACE2 interaction. To this
aim, the overlapping values between the slowest ANM modes
(eigenvectors) were calculated to compare the intrinsic dynamics of
RBDs in both S proteins (eq
in Materials and Methods). In Figure a, the dot products (overlap
values) were plotted in the map for the RBDs in SARS-CoV versus
SARS-CoV-2 for the first 10 ANM modes. In this map, the upper limit of
1 indicated a perfect overlap (red) correspondent to a perfectly
shared dynamics, while 0 indicated no overlap (blue), indicating that
they showed independent dynamics. Although RBDs in the two complexes
showed very high structural similarity with an RMSD of 0.68 Å,
the overlap between the first 10 ANM modes of two RBDs exhibited some
differences. Overlap values with ∼0.6 for the first three modes
indicated a weaker dynamical overlap, even at the lowest frequencies,
which were those endowed with most important functional meaning.
Figure 6
ANM results for RBDs. (a) Overlap map between the ten slowest
ANM modes of SARS-CoV and SARS-CoV-2 RBDs, in which dark
red and blue regions indicate strong and weak similar
dynamics, respectively. The motions of the first ANM mode
of RDBs in SARS-CoV (b) and SARS-CoV2 (c). The color scale
from red to blue changes from the highest to lowest
flexibilities.
ANM results for RBDs. (a) Overlap map between the ten slowest
ANM modes of SARS-CoV and SARS-CoV-2 RBDs, in which dark
red and blue regions indicate strong and weak similar
dynamics, respectively. The motions of the first ANM mode
of RDBs in SARS-CoV (b) and SARS-CoV2 (c). The color scale
from red to blue changes from the highest to lowest
flexibilities.To further understand the relative motions and dynamics of the interface
structures, the ANM motions of the first mode for two RBDs are shown
in Figure b and c. In ANM 1,
most of the interfacial residues were very stable (blue), while the
peripheral loop exhibited the highest fluctuation (red). The most
relevant dynamical difference between the two systems referred to this
loop motion. For SARS-CoV RBD, the highest fluctuations point to
residues 463D and 465 K, while the SARS-CoV-2 RBD displayed higher
flexibilities at 476G to 479P (orange beads). This loop was found to
be located at the C terminal of the α1 helix of the RBD,
interacting with ACE2 through van der Waals forces, hereinafter
referred to as an “interfacial C loop”. We suppose that
the difference in the dynamics of such an interfacial C loop may be
caused by the amino acid mutations in their respective interfaces with
ACE2 (i.e., 442Y to 455L, 443L to F, 460F to 473Y, and 479N to 493Q)
from SARS-CoV-RBD to SARS-CoV-2-RBD.[59] Thus, we
further mapped these interfacial mutations onto the structures (yellow
beads). Thus, the lower fluctuations of these mutations in
SARS-CoV-2-RBD may account for the higher affinity of such
interface.Accordingly, despite the overall structural similarity, RBDs in the two S
proteins showed different dynamics, especially for the S-ACE2
interfaces. These differences in dynamics may also indicate that these
two S-ACE2 interfaces achieved their activities, including substrate
binding and allosteric regulation.
PRS Analysis Identifies the Long-Range Allosteric Potential of the
AMR
We performed perturbation response scanning (PRS) based on GNM to
quantify the allosteric properties of AMRs and identify the key
residues for which perturbation provokes structural dynamical changes
at a long distance. First, the PRS map provided information regarding
the sensitivity and effectiveness of a given residue in transmitting
signals. The column of the matrix corresponding to a single residue
(e.g., for r313V) described how well-connected that residue was within
the AMR region and its likely relevance in transmitting allosteric
signals to distal parts of the S-ACE2 complex.Here, we adapted this method to probe the allosteric response of the AMR
region upon perturbation on residues with P > 0.5
from the PCN prediction (Table ).For the S/ACE2 complex in SARS-CoV, the dynamical changes of the other
residues in AMR (red arrows in Figure a) have been caught due to their close
distance in space. In addition, all AMR residues have allosteric
effects on the interfacial C loop with high fluctuation. Because the
effectiveness of the three residues with P > 0.5
was almost identical, we only reported in Figure
a for the case of the
perturbation of 568S that had the highest P value
(0.64). It is appreciated that three residues (313V, 529N, and 568S)
displayed long-range allosteric effects on some particular regions in
ACE2 through the S/ACE interface. The residues relative to the AMR in
SARS-CoV-2 could be classified into three types according to their
allosteric character.
Figure 7
Allosteric effects obtained upon unit perturbation on the AMR
residues, obtained from the PRS analysis.
Effectiveness/influence profiles with respect to the
linker residue (A) 568S in SARS-CoV AMR, and (B) 320Q,
531N, and 580T in SARS-CoV-2 AMR. Peaks indicate the most
influential residues, while some important regions are
highlighted.
Allosteric effects obtained upon unit perturbation on the AMR
residues, obtained from the PRS analysis.
Effectiveness/influence profiles with respect to the
linker residue (A) 568S in SARS-CoVAMR, and (B) 320Q,
531N, and 580T in SARS-CoV-2AMR. Peaks indicate the most
influential residues, while some important regions are
highlighted.The first class included most residues, including 320Q, 321P, 322T, 323E,
539N, 548T, and 578P. Upon perturbation, these residues exerted a
moderate dynamical effect on other AMR residues, and a relatively weak
but sensible intramolecular allosteric effect on residues of the
S/ACE2 interface, especially on the highly flexible C loop. The
perturbation of these AMR residues largely affected the dynamics of
ACE2 through stronger long-range allosteric communications.The perturbation of the second type of (531N and 532L) residues provokes
a large intermolecular allosteric effect in the RBD, including some
peaks at the interfacial C loop (Figure c). In addition, the perturbation
weakened the intermolecular communications from AMR to ACE2.The last type only included 580T, which exhibited similar dynamical
effects on three regions (AMR, RBD, and ACE2) via long-range
communications.Accordingly, our PRS analysis provided important insight into AMR
allostery. Among other findings, 3 residues in SARS-CoVAMR and 10
residues in SARS-CoV-2AMR were predicted to have wide-ranging
allosteric effects on the ACE2 protein. This result in conjunction
with the statistically significant higher effectiveness of the
SARS-CoV-2AMR perturbation (SI Figure 3),
suggested that its allosteric character was stronger than that of AMR
in SARS-CoV. Notably, in both complexes, the interfacial C loop formed
a bridge to transmit the signal from AMR to ACE2. The reported
RBD-ACE2 interfacial mutations from SARS-CoV to SARS-CoV-2 were close
to the interfacial C loop. Therefore, we propose that the different
allosteric properties of AMR residues were due to mutations in the
S-ACE2 interface.In Figure S3, the box plot shows that AMR in
SARS-CoV-2 had a statistically higher effectiveness than that of
SARS-CoV (under the p-value of 7.4 ×
10–5 by Wilcoxon signed-ranked test).
Conclusions
The application of PCNs, SEPAS (Affinity by Flexibility) and PRS based on
Elastic Network modes to two complexes of the spike protein with its
receptor, ACE2, generated highly consistent results. The integrated
computational approach disclosed active zones in the SARS-CoV and SARS-CoV2
complexes with the ACE2 ectodomain. These active zones, as demonstrated in
previous studies[41,60,61] and in the present
research by the convergence among different approaches, have an extremely
high probability of acting as allosteric sites, which can modulate
spike/ACE2 protein binding.These regions are worthy of further investigation from the perspective of drug
or vaccine development as indicated by a preliminary docking simulation.
From a methodological perspective, our results highlighted the need to
identify convergence of different simulation methods arising from
independent theoretical premises, in order to obtain a reliable picture of
the associated biomolecular systems.
Authors: Safaa M Kishk; Rania M Kishk; Asmaa S A Yassen; Mohamed S Nafie; Nader A Nemr; Gamal ElMasry; Salim Al-Rejaie; Claire Simons Journal: Molecules Date: 2020-10-29 Impact factor: 4.411
Authors: Almerinda Di Venere; Eleonora Nicolai; Velia Minicozzi; Anna Maria Caccuri; Luisa Di Paola; Giampiero Mei Journal: Int J Mol Sci Date: 2021-05-30 Impact factor: 5.923