José X Lima Neto¹, Davi S Vieira², Jones de Andrade³, Umberto Laino Fulco¹. ¹ Departamento de Biofísica e Farmacologia, Universidade Federal do Rio Grande do Norte, 59072-970 Natal-RN, Brazil. ² Instituto de Química, Universidade Federal do Rio Grande do Norte, 59072-970 Natal-RN, Brazil. ³ Department of Physical Chemistry, Universidade Federal do Rio Grande do Sul, 91501-970 Porto Alegre-RS, Brazil.
Abstract
Coronaviruses (CoVs) have been responsible for three major outbreaks since the beginning of the 21st century, and the emergence of the recent COVID-19 pandemic has resulted in considerable efforts to design new therapies against coronaviruses. Thus, it is crucial to understand the structural features of their major proteins related to the virus-host interaction. Several studies have shown that, of the seven known CoV human pathogens, three use the human Angiotensin-Converting Enzyme 2 (hACE-2) to mediate their host cell entry: SARS-CoV-2, SARS-CoV, and HCoV-NL63. Therefore, we employed quantum biochemistry techniques within the density functional theory (DFT) framework and the molecular fractionation with conjugate caps (MFCC) approach to analyze the interactions between hACE-2 and the spike protein RBD of the three CoVs in order to map the hot-spot residues that form the recognition surface of these complexes and define the similarities and differences in the interaction scenario. The total interaction energies showed good agreement with the experimental binding affinity order: SARS-2 > SARS > NL63. A detailed investigation revealed the energetically most relevant regions of hACE-2 and the spike protein for each complex, as well as the key residue-residue interactions. Our results provide valuable information for understanding the structural behavior and binding site characteristics, which could help to develop antiviral therapeutics that inhibit protein-protein interactions between the CoV S protein and hACE-2.
In December 2019, a novel human coronavirus was discovered in the city of Wuhan, China, and
designated as Severe Acute Respiratory Syndrome coronavirus 2 (SARS-CoV-2) due to its
similarity to the SARS-CoV genome (2002–2003). Commonly referred to as coronavirus
disease 19 (COVID-19), this novel coronavirus disease spread to more than 200 countries in a
few months. Almost two years after its discovery, ∼247 million infections and over 5
million deaths have been registered all over the world according to the World Health
Organization (WHO), and the numbers are increasing daily.[1] Despite all
efforts, to date, there is no safe treatment available against CoVs, and the emergence of
new variants that can escape from immune surveillance is still a constant risk.

CoVs are enveloped viruses with positive-sense RNA belonging to the
Nidovirales order, Coronaviridae family, and
Coronavirinae subfamily that is divided into four genera (α,
β, γ, and δ types), and can infect a wide range of mammalian and avian
species, causing diseases in the liver, as well as in the respiratory, gastrointestinal,
renal, and nervous systems.[2,3] Their genome commonly encodes four structural proteins: the spike (S)
glycoprotein, the envelope (E) protein, the membrane (M) protein, and the nucleocapsid (N)
protein, as well as a series of nonstructural proteins involved in replication and
transcription.[4] Among these, the spike glycoprotein is seen as a key
target for potential therapies and diagnostics, since it mediates coronavirus recognition
and entry into the host cell.[5]

The S protein is a trimeric class I fusion protein containing over 1200 amino acids, with each
monomer consisting of three segments: ectodomain, transmembrane, and intracellular. The
ectodomain is further divided into the S1 receptor-binding subunit and S2
membrane-fusion subunit, with the receptor-binding domain (RBD) and receptor-binding motif
(RBM) in S1 being responsible for the recognition of the host’s cell
receptor.[6,7] Of the
seven confirmed coronavirus species known as human pathogens, only three recognize
the same human target to enter into the cell, the Angiotensin-Converting Enzyme 2 (hACE-2;
EC: 3.4.17.23): SARS-CoV (β-CoV), SARS-CoV-2 (β-CoV), and HCoV-NL63
(α-CoV), with the last being a prevalent human respiratory pathogen often related to
common colds.[8] hACE-2 is a carboxypeptidase widely distributed in the
human body that consists of an 805 amino-acid type I transmembrane protein containing, in
the extracellular part, a catalytic domain formed by a substrate-binding region, a zinc
metallopeptidase domain, and a binding site that CoVs RBM evolved to recognize, the
virus-binding motif (VBM).[6,9,10] Thus, inhibiting the binding of the RBD-spike protein to
VBM-hACE-2 is an attractive strategy for developing antibodies and other potential
inhibitors that hinder the viral attachment.[11−13]

Coronaviruses demonstrate a high versatility in viral receptor binding strategies since the
S1 amino-acid sequence and structure, as well as the host’s receptor target, differ
among the CoV genera.[14] Curiously, among the human-infecting CoVs that recognize hACE-2,
only SARS-CoV-2 resulted in a pandemic, which led some authors to
relate this to differences in the structural conformation of the spike protein and a distinct
binding interface of SARS-CoV-2 to hACE-2 as compared to other CoVs.[15,16] Recent experimental studies have
shown that the SARS-CoV-2 spike has a higher binding affinity toward hACE-2 than that of
SARS-CoV, while previous data showed that the HCoV-NL63 receptor-binding affinity is much
lower than that of SARS-CoV, which may help explain the differences in the
infectivity and transmissibility of these CoVs.[14,17−20] It is therefore vital
to investigate the receptor recognition mechanisms of coronaviruses in depth, both to understand
their pathogenesis and epidemics and to guide human intervention in coronavirus
infections.

To this end, we investigated the residue–residue interactions in the crystal
structures of the hACE-2/spike protein complexes using in silico approaches to
understand the mechanism underlying virus attachment and to find the similarities
and differences in the interaction patterns of the low-pathogenic common cold virus HCoV-NL63
and the highly pathogenic SARS viruses (SARS-CoV and SARS-CoV-2). For this purpose, we
obtained the three hACE-2/spike protein X-ray structures from the Protein Data Bank (www.rcsb.org/pdb) and employed quantum biochemistry
techniques within the density functional theory (DFT) framework and the molecular
fractionation with conjugate caps (MFCC) approach to calculate the individual contribution
of each amino-acid residue to the protein–protein interface and thereby map the recognition
surface of these complexes. These data may shed some light on how different CoVs recognize
hACE-2 and could help the development of novel therapeutic strategies.
Materials and Methods
Drug-Receptor Complex Data and Quantum Calculations
We used the X-ray crystallographic data of the human Angiotensin-Converting Enzyme 2
(hACE-2) solved in complex with the receptor-binding domain (RBD) of SARS-CoV (PDB
ID: 2AJF; 2.90 Å resolution),[21] SARS-CoV-2 (PDB ID: 6M0J; 2.45 Å resolution),[22] and
HCoV-NL63 (PDB ID: 3KBH; 3.31 Å resolution).[23] First, we added the missing protein atoms;
hydrogen atoms were then added, according to the results obtained from the PROPKA 3.1
package set up at pH 7.0,[24] to the protein as well as to the water molecules. Then, the
protein main-chain heavy atoms were constrained, and all the other atoms were submitted to a
classical energy minimization using the Chemistry at Harvard Molecular Mechanics (CHARMm)
force field,[25] with convergence tolerances of
10⁻⁵ kcal mol⁻¹ (total energy variation) and
10⁻² kcal mol⁻¹ Å⁻¹ (RMS
gradient). Since the PROPKA software is slightly sensitive to the ligand pocket geometry,
the steps of hydrogen addition/withdrawal and energy minimization were repeated until no
difference was observed in the protonation results.[26]

After the energy minimization step, the three complexes were fragmented following the
molecular fractionation with conjugate caps (MFCC; see below) scheme, and the structures
generated were submitted to energetic quantum mechanical calculations through the Gaussian
(G09) package,[27] within the density functional theory (DFT) formalism.
The generalized gradient approximation (GGA) functional B97D[28] was
selected to perform the quantum in silico simulation, and the
6-311+G(d,p) basis set was chosen to expand the Kohn–Sham orbitals. We have chosen
the functional B97D over other methods due to its good performance for noncovalently bound
systems. Besides, it was already used in the case of the interaction of
nanomaterial–ligand systems,[29] as well as in the evaluation of a
large number of data sets proposed by Li et al., in which B97D presents the best
performance over some of the hybrid methods applied in their work and is close to
functionals using D3 correction, including B3LYP+D3.[30] To improve our
results, the effect of the residues’ surroundings formed by neighboring atoms
(amino acids and water molecules) was included in our calculations through the use of the
conductor-like polarizable continuum model (CPCM)[31,32] with a dielectric constant
ε = 40, which represents the influence of the electrostatic environment
surrounding the residue–residue complex.[33−35]
Molecular Fractionation with Conjugate Caps
As presented above, we fragmented the protein into amino acids following the MFCC
scheme[36,37] adapted
by Rodrigues et al. to calculate protein–protein interactions.[38]
The MFCC scheme together with DFT calculations has been widely employed to calculate the
interaction energies (IEs) in protein–ligand and protein–protein complexes
with great success, making the investigation of a large number of amino-acid residues in a
protein possible with a small computational cost and high accuracy.[39−43]

In the framework of this approach, for each amino acid of interest of hACE-2 at
position $R_i$, we mapped its distance to the residues in the
spike protein at position $R_j$ and chose those
$R_i$–$R_j$ pairs that showed at
least one atom inside a radius ($r$) equal to 8.0 Å. Then,
$R_i$ and $R_j$ were decomposed into
individual fragments by cutting through the peptide bonds, and a pair of conjugate caps was
designed to saturate each fragment, aiming to preserve the local chemical environment and
comply with the valence requirements. Hydrogen atoms were then added to the molecular
caps to avoid dangling bonds.[44] Here, the caps are formed by the
neighbor residues covalently bound to the amine ($C_{i-1}$ and $C_{j-1}$) and
carboxyl ($C_{i+1}$ and $C_{j+1}$) groups of residues $R_i$ and $R_j$,
respectively, along the protein chain, providing a
better description of their electronic environment. Finally, the interaction energy (IE) of
each residue–residue pair, $\mathrm{IE}(R_i - R_j)$, was calculated as follows:

$$\mathrm{IE}(R_i - R_j) = E(\Delta_i - \Delta_j) - E(\Delta_i - \delta_j) - E(\delta_i - \Delta_j) + E(\delta_i - \delta_j)$$

where $\Delta_m = C_{m-1} R_m C_{m+1}$ and $\delta_m = C_{m-1} C_{m+1}$ ($m = i, j$). The term
$E(\Delta_i - \Delta_j)$ corresponds to the total
energy of the fragment comprised by both capped residues. The second [third] term,
$E(\Delta_i - \delta_j)$ [$E(\delta_i - \Delta_j)$], gives the total energy of the system formed by
the capped residue $R_i$ [$R_j$] and the
hydrogenated caps of $R_j$ [$R_i$].
$E(\delta_i - \delta_j)$ is the total energy of the
system formed only by the caps. Additionally, to account for the structural stabilization
of the complex promoted by interactions with the extended hydration network, all water
molecules forming hydrogen bonds with a particular residue or cap were included in the
fragments. The interaction types were identified
with the Discovery Studio visualizer[45] and by visual inspection.
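The pair selection and the four-term energy combination described in this section can be sketched in a few lines of Python. This is only an illustrative sketch, not the workflow actually used: the function names, the toy coordinates, and the reading of the 8.0 Å criterion as "smallest atom–atom distance no greater than the cutoff" are assumptions made here, and the four total energies are assumed to come from separate quantum calculations (in Hartree) performed elsewhere.

```python
import math
from itertools import product

HARTREE_TO_KCAL_MOL = 627.5095  # standard unit conversion factor


def min_distance(atoms_a, atoms_b):
    """Smallest atom-atom distance (in Å) between two residues.

    Each residue is a list of (x, y, z) coordinates.
    """
    return min(math.dist(a, b) for a, b in product(atoms_a, atoms_b))


def select_pairs(receptor, spike, cutoff=8.0):
    """Return (i, j) labels of residue pairs with any atom pair within `cutoff` Å.

    `receptor` and `spike` map residue labels to lists of atom coordinates
    (hypothetical data layout chosen for this sketch).
    """
    return [
        (i, j)
        for (i, atoms_i), (j, atoms_j) in product(receptor.items(), spike.items())
        if min_distance(atoms_i, atoms_j) <= cutoff
    ]


def mfcc_interaction_energy(e_both, e_ri_capsj, e_capsi_rj, e_caps, to_kcal=True):
    """Combine the four MFCC total energies into one interaction energy.

    IE = E(Δi–Δj) − E(Δi–δj) − E(δi–Δj) + E(δi–δj)
    Inputs are assumed to be in Hartree; the result is in kcal/mol by default.
    """
    ie = e_both - e_ri_capsj - e_capsi_rj + e_caps
    return ie * HARTREE_TO_KCAL_MOL if to_kcal else ie


# Toy usage with made-up coordinates (not taken from any PDB entry).
receptor = {"A1": [(0.0, 0.0, 0.0)], "A2": [(20.0, 0.0, 0.0)]}
spike = {"B1": [(5.0, 0.0, 0.0)]}
pairs = select_pairs(receptor, spike)
```

Only the pair A1–B1 survives the cutoff in this toy case, since A2 lies 15 Å away from B1. The subtraction of the two mixed terms removes the cap contributions that are counted in the full fragment, and the caps-only term is added back because it was subtracted twice.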
Results and Discussion
In this work, quantum mechanical calculations were employed to describe the
residue–residue interactions and highlight the hot spots on the proteins’
surfaces. This is a valuable strategy in drug design due to its potential for the
identification of druggable sites, and it can thus lead to the development of new therapeutic
strategies to modulate this system under pathological conditions. To obtain a detailed
understanding of the residue–residue network supporting the hACE-2 and S protein
interaction, we considered a total of 62, 63, and 54 residues belonging to hACE-2 and 51, 49, and 27
residues belonging to the spike proteins, resulting in 357, 333, and 276
interaction pairs evaluated for the hACE-2/spike complexes of SARS-CoV-2, SARS-CoV, and
HCoV-NL63, respectively.

By adding the interaction energies (IEs) for all
$R_i$–$R_j$ pairs within
$r$ = 8.0 Å,[46] we obtained the total interaction
energies (TIEs) between the hACE-2 and spike proteins as −118.6, −83.1, and
−64.6 kcal mol⁻¹ for the SARS-CoV-2, SARS-CoV, and HCoV-NL63
complexes, respectively, which is in accordance with the experimental binding affinity
order.[14,17−20,47] It is also in agreement
with the observations of Rawat et al., who obtained a positive correlation
between the interaction surface size, the interaction energy, and the increased virulence of
CoVs.[48] A list with all calculated interaction energies for individual
residues is shown in Tables S1–S6 in the Supporting Information. From now on, SARS-CoV-2,
SARS-CoV, and HCoV-NL63 are termed SARS-2, SARS, and NL63, respectively, and all the energy
values are in kcal mol⁻¹.
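Written out, the summation that yields each total interaction energy can be expressed as below. The set $P_r$ and the symbol $d_{\min}$ (smallest atom–atom distance between two residues) are notation introduced here for illustration only; the text itself states only that all residue pairs within the 8.0 Å radius are added.

```latex
\mathrm{TIE} = \sum_{(i,j) \in P_r} \mathrm{IE}(R_i - R_j),
\qquad
P_r = \{\, (i,j) : d_{\min}(R_i, R_j) \le r \,\},
\quad r = 8.0~\text{\AA}
```

Under this reading, a more negative TIE corresponds to a more strongly bound complex, which is how the order SARS-2 (−118.6) > SARS (−83.1) > NL63 (−64.6) is compared with the experimental affinities.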
Energetic Description of Secondary Structures
It has been shown that the SARS-CoV-2 and SARS-CoV receptor-binding domains (RBDs) share
approximately 72% amino-acid sequence identity, while their identity with HCoV-NL63 is
quite low (∼23%–25%).[48] Similarly, the structures of the
RBDs from both SARS-CoVs are analogous when compared to each other but different when
compared to HCoV-NL63.[23] In Figure 1 (top), we present the structures of the SARS (Figure 1(a)), SARS-2 (Figure 1(b)), and NL63 (Figure 1(c)) RBDs. As
one can see, each of the three S protein RBDs contains two subdomains: a core structure and the
receptor-binding motif (RBM). The core structures of SARS (Figure 1(a); orange) and SARS-2 (Figure 1(b); yellow) consist of a five-stranded antiparallel β-sheet (β1,
β2, β3, β4, and β7) and some short α-helices connecting
them, while the NL63 (Figure 1(c); light pink)
core structure is a β-sandwich consisting of two β-sheet layers (β5,
β7, β8, β6, β3, and β4, β1, β2). The RBMs of both
SARS-CoVs are made of loops divided by two antiparallel β-strands, with
approximately 70 amino acids that connect β4 and β7 in the core
regions. On the other hand, the NL63 RBM is formed by three small loops that connect the core
β-sheets: β1−β2, β4−β5, and
β7−β8. Besides, it can be seen that the SARS RBD has some small structural
differences when compared to SARS-2, with the former presenting less defined secondary
structures (α-helix or β-strand). These differences are reflected in the
binding surface with hACE-2 and can be related to the specific recognition pattern of each
CoV, as one can see in Figure 1 (bottom).
Figure 1
Overall structures of (a) SARS-CoV, (b) SARS-CoV-2, and (c) HCoV-NL63 RBD (top), each
complexed with their common receptor, hACE-2 (bottom). Core regions of RBDs are in
orange (SARS), yellow (SARS-2), and light pink (NL63) colors, while RBMs of SARS,
SARS-2, and NL63 are in purple, pink, and yellow, respectively.
Several authors[6,49−51] have stated
that the interactions between hACE-2 and S proteins occur in two main hot spots of hACE-2
that are in close contact with RBM: (i) One is in the α-helix 1 (α1). (ii) The
other is in the region formed by the β-strands 3 and 4 (β3−β4).
To assess the energetic relevance of these two hot spots, and to look for
new ones, we analyzed the energetic profile of each secondary structure of both proteins
to identify the most relevant ones. We summed the IEs between the amino-acid residues of
hACE-2 and S proteins, separated by segments, and show the results in Figure 2.
Figure 2
Interaction energies per (a) hACE-2, (b) SARS and SARS-2, and (c) NL63 protein
secondary structures. The left panel shows a view of the protein structures. In
panel (a), the yellow color represents a common binding site for the three proteins.
The orange color is a binding site exclusively for NL63. The pink region is a binding
site only for SARS. The purple is the SARS-2 exclusive site.
As one can see in Figure 2(a), three segments of
hACE-2 (yellow color) are recognized with the strongest interaction energies by all the CoVs
studied here, namely α-helix α1 (SARS, −43.9; SARS-2, −72.3;
NL63, −20.9), β-strand β4 (SARS, 4.9; SARS-2, −4.5; NL63,
−3.4), and loop L18 (SARS, −13.7; SARS-2, −26.8;
NL63, −22.1), which is in agreement with previous studies. It is
noticeable that the amino acids of SARS-2 in the crystal structures are more tightly bound
to these segments than those of the other two CoVs, mainly to α1, which accounts for
179, 220, and 68 of all interaction pairs evaluated in the hACE-2/SARS, SARS-2, and NL63
complexes, respectively, indicating that this segment could be a good starting point for
the development of inhibitors of viral docking to hACE-2. Besides, each of the
CoVs also showed a strong interaction with a specific region of the host’s target
compared to the other viruses. The segment α14 (pink color) presents more favorable
interactions with the SARS spike protein (−18.0) than with the other two viruses, whose
interactions (SARS-2, −3.9; NL63, −3.4) were weaker and quite similar
to each other. The L3 (purple color) amino acids do not
interact with NL63 residues, but they are attracted by SARS-2 residues (−5.8) and
repelled by some SARS residues (3.0). Finally, the loops L16 (SARS,
−0.6; SARS-2, −0.5; NL63, −6.5) and L20 (SARS,
−0.4; SARS-2, −0.5; NL63, −4.7) (orange color) are specific
recognition regions of the NL63 virus.

Figure 2(b) and (c) depict the segments of the
SARS-CoVs and NL63 spike proteins, respectively, that interact with hACE-2 residues.
As observed for the hACE-2 segments, Figure 2(b)
shows that the interaction energy between hACE-2 and the S protein of SARS-2 is stronger
than that of SARS in most of the segments, except the α-helix 3 (α3) and loop 10
(L10). As expected, the two segments with the strongest interaction
energies are at the top of the RBM (L9 and L10), closer to
the receptor. It should be mentioned that the numbering of the S protein α-helices is
not exactly the same for SARS and SARS-2, as SARS has a more disordered structure, and
some helices are not formed (Figure 1; top). This
disordered, and consequently more flexible, character has been pointed out as one of
the factors for the lower affinity between the spike of SARS and hACE-2 in comparison to
SARS-2 and can be related to a smaller binding surface.[48] In Figure 2(c), one can see the most relevant segments
of NL63. Almost all the amino acids interacting
with the human receptor lie in the three loops of the virus RBM, the exception being β7.

These results reinforce the relevance of the spike protein RBM residues, as well as of the
segments formed by the helix α1 and the β-turn
“β3-L18-β4” in hACE-2, for the attachment of
CoVs to the host’s protein. Moreover, we found segments of both proteins that seem
to be specific for the recognition of each virus, including the α-helix 3 of SARS,
the segments α14, L3, L16, and
L20 of hACE-2, and the β-strand 7 of NL63.
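The virus-specific segments named above can be recovered directly from the per-segment energies quoted in the text. The short Python sketch below is only an illustration: the dictionary layout and the function name are choices made here, the numbers are copied from the paragraph on Figure 2(a), and the missing NL63 value for L3 (no interaction reported) is represented as `None` by assumption.

```python
# Summed interaction energies per hACE-2 segment (kcal/mol), as quoted in the text.
# Tuple order: (SARS, SARS-2, NL63). None marks "no interaction reported".
VIRUSES = ("SARS", "SARS-2", "NL63")
SEGMENT_ENERGIES = {
    "α1": (-43.9, -72.3, -20.9),
    "β4": (4.9, -4.5, -3.4),
    "L18": (-13.7, -26.8, -22.1),
    "α14": (-18.0, -3.9, -3.4),
    "L3": (3.0, -5.8, None),
    "L16": (-0.6, -0.5, -6.5),
    "L20": (-0.4, -0.5, -4.7),
}


def strongest_binder(energies):
    """Return the virus with the most negative (most attractive) segment energy."""
    reported = [(e, v) for e, v in zip(energies, VIRUSES) if e is not None]
    return min(reported)[1]


preferred = {seg: strongest_binder(e) for seg, e in SEGMENT_ENERGIES.items()}
for segment, virus in preferred.items():
    print(f"{segment}: strongest attraction with {virus}")
```

With these inputs, α14 is assigned to SARS, L3 to SARS-2, and L16 and L20 to NL63, in line with the color coding described for Figure 2(a); the three shared segments (α1, β4, and L18) all come out strongest for SARS-2.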
Hot-Spot Region between hACE-2 and CoVs S Protein
It is of great interest to identify the dominant sets of contact residues involved in
attraction or repulsion between hACE-2 and S proteins. In order to evaluate this
interaction in different species of coronaviruses, we performed a search for the most
relevant residue–residue pair interactions. Figure 3 depicts the 24 residues of hACE-2 for which the sum of the
interaction energies with all spike residues within a radius of 8.0
Å is at least 2.0 kcal mol⁻¹ in magnitude (≥2.0 or ≤−2.0 kcal
mol⁻¹) in at least one of the three complexes studied. One can see that
the residues Q24, T27, F28, D30, K31, H34, E37, D38, Y41, Q42, and
L45 contribute about 48% (−42.2), 69%
(−70.4), and 37% (−19.0) of the TIEs of hACE-2 interacting with the SARS
(−87.6), SARS-2 (−101.3), and NL63 (−50.3) spike proteins,
respectively. This explains why the hACE-2 α1 segment is the most
important region of the protein in terms of interaction energy, since these residues are
part of this secondary structure. Of these residues, only
Y41 shows energetically relevant results, using the
energy criterion above, for all three complexes, while
D30, H34, and
E37 are strongly bound to NL63 and to only one of the SARS
viruses (SARS or SARS-2). The residues Q24, T27, F28, K31, D38, and
L45 present strong total interaction energies with
both SARS and SARS-2 spike residues, while Q42 is only
energetically relevant to the hACE-2/SARS-2 complex (see the energy values for the three
complexes in Table 1).
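The classification of the α1 residues given above follows mechanically from the energy criterion. The Python sketch below illustrates this, assuming the criterion is inclusive (an energy of exactly −2.0 counts as relevant, as the later wording "≥2.0 or ≤−2.0" suggests); the per-residue values are copied from Table 1, while the names `ALPHA1`, `relevant_complexes`, and `column_sums` are introduced here for illustration.

```python
ENERGY_CRITERION = 2.0  # kcal/mol, applied to the absolute value
VIRUSES = ("SARS", "SARS-2", "NL63")

# Total interaction energies of the hACE-2 α1 residues (kcal/mol), from Table 1.
ALPHA1 = {
    "Y41": (-3.9, -2.1, -3.6),
    "D30": (-1.8, -9.9, -4.1),
    "H34": (-6.3, 0.7, -5.6),
    "E37": (-0.8, -3.8, -5.2),
    "Q24": (-4.6, -7.6, 0.0),
    "T27": (-4.4, -7.2, -0.1),
    "F28": (-2.5, -2.4, 0.0),
    "K31": (-5.3, -24.6, -0.7),
    "D38": (-8.3, -8.8, 0.1),
    "L45": (-3.1, -2.0, -0.3),
    "Q42": (-1.4, -2.8, -0.1),
}


def relevant_complexes(energies, threshold=ENERGY_CRITERION):
    """List the complexes in which a residue meets the energy criterion."""
    return [v for v, e in zip(VIRUSES, energies) if abs(e) >= threshold]


def column_sums(table):
    """Sum each complex's column over all residues, rounded to one decimal."""
    return tuple(round(sum(col), 1) for col in zip(*table.values()))


relevance = {res: relevant_complexes(e) for res, e in ALPHA1.items()}
alpha1_totals = column_sums(ALPHA1)
```

Here only Y41 is flagged in all three complexes and Q42 only in SARS-2, matching the description in the text. Because the table entries are rounded to one decimal, the column sums computed this way come out close to, but not exactly equal to, the α1 contributions quoted above.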
Figure 3
Energy profiles for the hACE-2 amino-acid residues with the strongest interaction
energies in the recognition surface, interacting with amino acids from the spike proteins
within a radius of 8.0 Å. A gray scale is used in which dark gray, light gray,
and gray represent the SARS, SARS-2, and NL63 energy spectra, respectively.
Table 1
Total Interaction Energy Values of Each One of the Most Relevant Residues of
hACE-2 in Complex with Spike Proteins of SARS, SARS-2, and NL63
(energies in kcal mol⁻¹)

Residue   Segment   SARS     SARS-2   NL63
Y41       α1        −3.9     −2.1     −3.6
D30       α1        −1.8     −9.9     −4.1
H34       α1        −6.3     0.7      −5.6
E37       α1        −0.8     −3.8     −5.2
Q24       α1        −4.6     −7.6     0.0
T27       α1        −4.4     −7.2     −0.1
F28       α1        −2.5     −2.4     0.0
K31       α1        −5.3     −24.6    −0.7
D38       α1        −8.3     −8.8     0.1
L45       α1        −3.1     −2.0     −0.3
Q42       α1        −1.4     −2.8     −0.1
L79       α2        −2.6     −2.2     0.0
M82       L3        5.2      −1.7     0.0
Y83       L3        −2.0     −4.0     0.0
T324      L16       −0.6     −0.5     −3.7
Q325      α14       −1.3     −0.7     −3.5
E329      α14       −14.0    −0.4     0.2
N330      α14       −0.4     −2.2     2.3
K353      L18       −20.1    −7.3     −12.3
G354      L18       −5.3     −7.2     −4.3
D355      L18       1.1      −0.7     −4.0
F356      β4        −1.3     −0.4     −2.3
R357      β4        −4.1     −3.7     −1.0
A387      L20       −0.1     −0.2     −2.0
As one can see, no other hACE-2 segment has more than three
energetically relevant residues. Therefore, we present all the remaining ones together; their
values are listed in Table 1 for
L79 (α2), M82 (L3), Y83 (L3), T324 (L16), Q325 (α14), E329 (α14), N330 (α14),
K353 (L18), G354 (L18), D355 (L18), F356 (β4), R357 (β4), and
A387 (L20). Of these, only
K353 and G354 are
shown to be energetically relevant for the three complexes, while
T324, Q325, D355, F356, and
A387 are strongly bound to NL63, and
L79 and R357 present
strong total interaction energies with both SARS viruses. Similarly, in Figure 4, one can see the most energetically relevant amino
acids of the SARS, SARS-2, and NL63 spike proteins, namely SARS
R426 (α3), Y436, L442, P462, N473, Y475, N479 (β6), Y484, T486, T487, I489, and Y491;
SARS-2 K417 (α6), Y449, Y453 (β5), F456, A475, E484, F486, N487, Y489, Q493 (β6), G496,
Q498, T500, N501, G502 (α8), and Y505; and NL63
G494, S496, C497, Y498, V499, S535, P536, G537, W585 (β7), and
H586 (see Table S7 for the total interaction energy values of each of these
residues).
Figure 4
Energy profiles for (a) SARS, (b) SARS-2, and (c) NL63 spike amino-acid residues with
strongest interaction energies in the recognition surfaces.
These results are well correlated with previous experimental and computational
data.[23,51−70]
Spinello et al. performed multimicrosecond-long molecular dynamics simulations of the
SARS-CoV-2/ACE2 and SARS-CoV/ACE2 complexes, observing that regions
of the spike protein in SARS-CoV-2/ACE2 are markedly more rigid than in SARS-CoV/ACE2; they
also revealed a map of the most important H-bond and salt bridge interactions that
enable the former to be more stable.[51] Important studies using fragment
molecular orbitals were carried out to characterize the protein–protein
interactions (PPI) between the RBD and several antibodies/peptides[52] as well
as the PPI in the RBD-hACE2 interfaces.[56] In the former, nine key residues
were found in the RBD (T415, K417, Y421, F456, A475, F486, N487, N501, and Y505), while in the
latter, four residues (E37, K353, G354, and D355) of hACE2 were identified as forming
strong interactions with the spike proteins of coronaviruses (SARS-CoV-1, SARS-CoV-2, and
HCoV-NL-63). Through a mutagenesis study, Yi et al. reported that mutations in the SARS RBD
residues R426, K439, N457, P470, Y484, T487, and Y491 dramatically decreased its binding
affinity to hACE-2, while replacing the SARS-2 RBD amino acids N501, Q498, E484,
T470, K452, and R439 decreased its receptor binding affinity.
Interestingly, the residues K439 and
N457 of SARS, as well as
T470, K452, and
R439 of SARS-2, are not within our 8.0 Å
radius, indicating that their relevance is not related to a direct interaction with
hACE-2.[71]

Zou et al. performed a molecular dynamics simulation together with an alanine scanning
analysis, finding that the relative binding free energy of the hACE-2/SARS complex
changes significantly when there is a mutation in the spike residues R426, L443, Y484, and
T487. In our work, L443 forms only one interaction pair
with an IE stronger than −1.0 kcal mol⁻¹, with the hACE-2 residue
T27, and perhaps the relevant interaction is not
captured when only a static crystal structure is taken into account. Similarly, in the same
work, it was shown that a mutation in the amino acid L455 of the S protein of SARS-2
resulted in the largest difference in binding energy, but in our results, it presented
only three weak IEs. Moreover, Zou et al. found the residues F456, F486, Q493, and N501 to be
the most relevant for the binding energy of the hACE-2/SARS-2 complex, and they are also
among the most relevant residues according to the results reported here (Figure 4b).[72]

It is important to mention that mutations in spike residues K417(N/T), E484(K), and
N501(Y) have been found in some SARS-2 variants, possibly involved in reducing
neutralization by some antibodies.[66,67,69,70,73]
Of these, N501Y is present in lineages B.1.1.7 (Alpha), B.1.351 (Beta), and P.1 (Gamma),
and E484K was observed in lineages Beta and Gamma, which also possess the alternative
amino-acid substitutions K417N and K417T, respectively. Watanabe et al.,[66] using the fragment molecular orbital method, found that the K417N/T, E484K, and
N501Y mutations are energetically disadvantageous for antibody interactions. Watanabe et
al.[67] presented a detailed study on the effect of the N501Y mutation. They
concluded that this mutation in the S protein enhanced the attractive interaction in
comparison to the wild-type S protein due to the hydrogen bonds and XH/π
interactions with Y41 and K353 from hACE-2.

Here, K417, E484, and N501 have shown some of the strongest attractive TIEs, and a study
by Taka et al. showed that spike residues E484 and K417 are involved in salt bridges
with the hACE-2 residues K31 and D30, respectively, during most of their
molecular dynamics simulation, which could help to understand the key
role of these residues.[74] On the other hand, the mutation E484(K) leads
to a small increase in the binding affinity of the complex,[75] and some
computational studies have related it to a repulsive energy of E484, which is not present
when the lysine residue is in this position.[70,76,77] In our calculations, we only found
one repulsion between E484 and a hACE-2 residue (Table S2), and it showed a quite small IE. This difference in the
interaction energy could be related to the pose captured by the crystal structure, which
represents a moment with favorable interactions, or to differences in the calculation
methods used.

In contrast, there are only a few studies trying to understand the binding of
hACE-2 to the NL63 S protein at a molecular level, as well as the effects of mutations in
hACE-2 on the recognition of CoVs.[17,18,23,78] Wu et al. reported four
mutations in hACE-2 (K353A, D38A, D37A, Y41A, and Y41F) that decrease the binding affinity
for the NL63 and SARS S proteins, as well as three NL63 spike mutations (Y498A, S535A, and
S535T).[47] Rawat et al. studied the conserved residues in the spike
protein by in silico analysis, showing that the conserved glycine and
tyrosine residues in SARS (G488 and Y436), SARS-2 (G502 and Y449), and NL63 (G537 and
Y498) are important both for stabilizing the S protein and for its interaction with
hACE-2.[48] Li et al. showed that nine mutations in
hACE-2 affected inhibition of interactions with SARS (Q24K, K31D, Y41A, K68D, K353D,
K353A, D255A, R357A, and R393A).[65]

Finally, to assess the differences in the interactions between hACE-2 and the S protein
of the three coronaviruses, we performed a thorough analysis of the individual interaction
pairs. All energy values are in kcal mol–1. In Figures
–7 (top), we
present the most energetically relevant interaction pairs in the complexes hACE-2 with
SARS (Figure ), SARS-2 (Figure
), and NL63 (Figure ), respectively. Among the 11 hACE-2 α1 residues,
Y41 forms interaction pairs with 14, 13, and 7
residues of SARS, SARS-2, and NL63, respectively, but most of the IE values are weakly
attractive (negative). When the energy criterion is applied (≥2.0 kcal
mol–1 or ≤ −2.0 kcal mol–1), 2 (SARS),
0 (SARS-2), and 1 (NL63) interaction pairs are observed. In the complex hACE-2/SARS, the
two strongest IEs of Y41 are with the residues Y484 (Y41α1–Y484: −4.1) and T486
(Y41α1–T486: 3.3), through the formation of a π–π interaction (4.7 Å) and a
repulsion between the oxygen atoms of their side-chain hydroxyl groups (2.73 Å apart),
respectively. Given these IEs, Y41 is not expected to be among the residues with the
strongest interaction energy, but it also forms a π–alkyl interaction (4.0 Å) with
T487 (Y41α1–T487: −1.5) that increases the attraction between the hACE-2 residue and
the S protein.
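The energy criterion applied above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' in-house script: the function name `within_criterion` and the dictionary layout are assumptions, and the only data used are the three Y41 pair energies quoted in the text for the hACE-2/SARS complex.

```python
# Illustrative sketch (hypothetical helper, not the authors' script): apply the
# |IE| >= 2.0 kcal/mol energy criterion to the Y41 pairs quoted for SARS.

CRITERION = 2.0  # kcal/mol, threshold on the absolute interaction energy


def within_criterion(pairs, threshold=CRITERION):
    """Return the pairs whose IE is <= -threshold or >= +threshold."""
    return {pair: ie for pair, ie in pairs.items() if abs(ie) >= threshold}


# IE values (kcal/mol) for hACE-2 Y41 with SARS spike residues, from the text.
y41_sars = {
    ("Y41", "Y484"): -4.1,  # pi-pi interaction
    ("Y41", "T486"): 3.3,   # repulsion between side-chain hydroxyl oxygens
    ("Y41", "T487"): -1.5,  # pi-alkyl interaction, weaker than the threshold
}

relevant = within_criterion(y41_sars)

# List the retained pairs from most attractive to most repulsive.
for (ace2_res, spike_res), ie in sorted(relevant.items(), key=lambda kv: kv[1]):
    label = "attractive" if ie < 0 else "repulsive"
    print(f"{ace2_res}-{spike_res}: {ie:+.1f} kcal/mol ({label})")
```

With these inputs, two pairs pass the filter, matching the count of 2 reported for SARS, while the Y41–T487 pair is excluded.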
Figure 5
(Top) Graphical panel presenting the most relevant interactions involving the
hACE-2–spike residues of SARS. (Bottom) Detailed spatial organization of the
interaction pairs with their intermolecular interactions. Dashed lines in marine
(green) represent direct (nonconventional) hydrogen bonds and orange (yellow) lines
π–π (π–alkyl). Repulsion, amide-π stacked, and
salt bridges are represented in red, cyan, and violet lines, respectively.
Figure 7
(Top) Graphical panel presenting the most relevant interactions involving the
hACE-2–spike residues of NL63. (Bottom) Detailed spatial organization of the
interaction pairs with their intermolecular interactions. Dashed lines in marine
(green) represent direct (nonconventional) hydrogen bonds and orange (yellow) lines
π–π (π–alkyl). Repulsion, amide-π stacked, and
salt bridges are represented in red, cyan, and violet lines, respectively.
Figure 6
(Top) Graphical panel presenting the most relevant interactions involving the
hACE-2–spike residues of SARS-2. (Bottom) Detailed spatial organization of the
interaction pairs with their intermolecular interactions. Dashed lines in marine
(green) represent direct (nonconventional) hydrogen bonds and orange (yellow) lines
π–π (π–alkyl). Repulsion and salt bridges are
represented in red and violet lines, respectively.
Y41 does not have any individual interaction with spike
residues of SARS-2 within the energy criterion, but three interaction pairs are the main
ones responsible for bringing the energy of this residue within the criterion (all other
energy values can be seen in Table S2): Y41α1–Q498 (−2.0), Y41α1–T500 (1.9), and
Y41α1–N501 (−1.0). The first attraction occurs by a nonconventional hydrogen bond (from now
on, nonconventional H-bond) between the side-chain amide of the glutamine and a C–H
group from the tyrosine ring (2.60 Å), while the second attraction is the result of
small hydrophobic contacts between the amino acids. Curiously, T500 in SARS-2 occupies the
same position as T486 in the SARS S protein, and both residues are involved in repulsion
with Y41 through the oxygen atoms of their side-chain hydroxyl groups at a similar distance
(2.9 Å). On the other hand, S535 (Y41α1–S535) of NL63 interacts with hACE-2 Y41 through a
nonconventional H-bond (3.2 Å) and a π–alkyl interaction (3.3 Å) with an IE of −2.1.

As one can see in Figure S1(a), SARS Y484 and
T487 almost overlap the positions of SARS-2 Q498 and N501, respectively, while the NL63
S535 main chain is quite close to the position of T487 (with the side chain rotated to the
other side; Figure S1(b)), indicating that the presence of hydrophobic or small polar
residues surrounding hACE-2 Y41 is more favorable to the stabilization of the spike
protein, although the repulsion with T486 and T500 should also be considered. These results
explain the more energetic behavior of the SARS-RBD loop 10 (L10) when compared to SARS-2.

The residues Q24α1,
K31α1, D38α1, and G354 are the hACE-2 residues that showed IEs within the energy criterion
and interacted only with SARS and SARS-2 residues. The position of Q24 almost overlaps
in both crystals (Figure S1(c)), as do the positions of N473 and N487 of the SARS and
SARS-2 spikes, respectively (Figure S1(d)). These amino acids form H-bonds with Q24 and
show IEs of −2.5 (Q24α1–N473: 2.5 Å) and −3.7 (Q24α1–N487: 2.0 Å). In addition, the
interaction pair Q24α1–N487 also presents a nonconventional H-bond at a distance of 2.9 Å.
K31 is an important residue for SARS-2 interactions, accounting for three strong
interaction pairs with E484 (K31α1–E484: −11.7), Y489 (K31α1–Y489: −3.4), and Q493
(K31α1–Q493β6: −5.2). The first pair has the second strongest IE and forms a salt bridge
(2.1 Å) and a nonconventional H-bond (3.1 Å), the second is involved in a π–alkyl
(4.4 Å) and several hydrophobic interactions, and the third forms an H-bond (2.2 Å) and a
nonconventional H-bond (3.3 Å). In contrast, the position occupied by Y489 in SARS-2 is
occupied by Y475 in SARS, which also forms a π–alkyl interaction with K31 (K31α1–Y475:
−2.4) at a distance of 3.6 Å.

Moreover, the residue D38 seems to be relevant to SARS,
since it has two interaction pairs with the spike residues Y436 (D38α1–Y436: −2.7) and
Y484 (D38α1–Y484: −2.2), both of them forming H-bonds, at distances of 1.9 and 2.2 Å,
respectively. On the other hand, in the complex SARS-2/hACE-2, D38 forms an H-bond with
residue Y449 (D38α1–Y449: 2.1 Å), showing an IE of −4.3. It is important to note that the
SARS residue Y436 overlaps the residue Y449 of SARS-2 (Figure S1(a)), indicating the
relevance of an H-bond with the residue D38 of hACE-2 for both SARS-CoVs. Finally, the
interaction pairs G354–Y491 in SARS and G354–G502α8 in SARS-2 have IEs of −5.5
(π–amide: 4.1 Å) and −2.4 (nonconventional H-bond: 2.7 Å; 3.6 Å), respectively.

Residue D30 interacts with 8, 10, and 4 residues of SARS,
SARS-2, and NL63, respectively, while 0, 1, and 1 interaction pairs, respectively, have IEs
that meet the energy criterion used. All IEs with the eight residues of SARS are below −0.9
kcal mol–1; therefore, they will not be discussed here. On the other
hand, in the complex hACE-2/SARS-2, the interaction pair D30α1–K417α6 shows the third
strongest IE (−9.9) of all complexes, which occurs through the formation of a salt bridge
between their charged side chains (2.9 Å), while D30α1–S496 (NL63) forms an H-bond
(2.4 Å) and a nonconventional H-bond (2.5 Å) with an IE of −2.7.

Of the 19 (SARS), 17 (SARS-2), and 7 (NL63) interaction pairs of
H34 with S proteins, 2 (SARS), 1 (SARS-2), and 1 (NL63) are within the energy criterion.
The H34α1–Y442 (SARS) and H34α1–N479β6 (SARS) interaction pairs form a π–π interaction
(5.1 Å) and an H-bond (2.8 Å), with IEs of −2.2 and −2.4, respectively. The IE of
H34α1–Y453β5 (SARS-2) is 3.3, with a repulsion between the C–N–C ring side chain and the
hydroxyl oxygen (3.3 Å), and H34α1–S496 (NL63) forms a nonconventional H-bond (−2.2) at
3.2 Å.

The residue E37 of hACE-2 interacts with 9, 11, and
12 residues of SARS, SARS-2, and NL63 S proteins, respectively, but only in the last
complex do we observe IE values within the energy criterion: E37α1–G495, with an IE of 2.8
through a repulsion between the oxygen atoms of the E37 carboxyl group and the G495 main
chain (3.1 Å), and E37α1–C497, with an IE of −7.0 through a nonconventional H-bond
(2.5 Å) and some hydrophobic interactions. However, two residues of SARS-2, R403β3 and
Y505, showed attractive interactions with hACE-2 E37 close to −2.0 (−1.7 and −1.8,
respectively).

Two residues of hACE-2 show energetically relevant IEs only with the SARS complex:
M82 and E329α14. The pair M82–L472 has the strongest repulsion among the three complexes
(M82–L472: 5.4), caused by the proximity between the sulfur atom of methionine and the
methyl group of leucine (3.1 Å), while E329 interacts with R426 with the strongest
attraction (E329α14–R426α3: −13.9) through a salt bridge (2.0 Å) and an H-bond (2.1 Å).
On the other hand, T27α1 and Y83 are the residues of hACE-2 that formed strong interactions
only with SARS-2. T27 interacts with Y489 through a nonconventional H-bond (T27α1–Y489:
−2.3) at a distance of 2.8 Å, and Y83 forms a π–π interaction (5.1 Å) with F486
(Y83–F486: −3.1). Finally, K353 interacts strongly with the SARS residues T487
(K353–T487: −5.2), through a nonconventional H-bond (3.0 Å), and G488 (K353–G488: 3.3),
through a repulsion with the main-chain nitrogen of G488 (2.5 Å); with N501 of SARS-2
(K353–N501: −5.3), through nonconventional H-bonds (3.0 and 3.1 Å); and with Y498 of NL63
(K353–Y498: −6.6), through a π–alkyl (4.2 Å) and an amide-π stacked (3.8 Å) interaction.

In summary, the overall receptor-binding modes of SARS and SARS-2 were quite similar, as
well as the most energetically important residues of hACE-2 interacting with NL63, but the
detailed interaction patterns were substantially different, which might explain the
distinct affinities and immunogenic features. As one can see, the most relevant residues
presented here are in agreement with published data, and new ones are shown, including
I489, N473, and P462 (SARS), T500, Y489, and E484 (SARS-2), and H586, P536, and C497
(NL63). Finally, the strongest IEs are observed for E329α14–R426α3 (−13.9), G354–Y491
(−5.5), Y41α1–Y484 (−4.1), and M82–L472 (5.4) of SARS; K31α1–E484 (−11.7), D30α1–K417α6
(−9.9), K353–N501 (−5.3), and K31α1–Q493β6 (−5.2) of SARS-2; and E37α1–C497 (−7.0),
K353–Y498 (−6.6), D30α1–S496 (−2.68), and E37α1–G495 (2.8) of NL63.
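As a quick cross-check of the ranking statements in this section, the summary pairs can be sorted by interaction energy. The sketch below is only illustrative: the tuple layout and the helper names `rank_attractive` and `strongest_repulsion` are assumptions, the residue labels are shortened (segment suffixes dropped), and the energies are the ones listed in the summary above.

```python
# Illustrative sketch (hypothetical helpers): rank the summary interaction
# pairs quoted in the text. Energies are in kcal/mol.

summary_pairs = [
    ("SARS", "E329-R426", -13.9),
    ("SARS", "G354-Y491", -5.5),
    ("SARS", "Y41-Y484", -4.1),
    ("SARS", "M82-L472", 5.4),
    ("SARS-2", "K31-E484", -11.7),
    ("SARS-2", "D30-K417", -9.9),
    ("SARS-2", "K353-N501", -5.3),
    ("SARS-2", "K31-Q493", -5.2),
    ("NL63", "E37-C497", -7.0),
    ("NL63", "K353-Y498", -6.6),
    ("NL63", "D30-S496", -2.68),
    ("NL63", "E37-G495", 2.8),
]


def rank_attractive(pairs):
    """Sort attractive pairs (negative IE) from strongest to weakest."""
    return sorted((p for p in pairs if p[2] < 0), key=lambda p: p[2])


def strongest_repulsion(pairs):
    """Return the pair with the largest positive IE."""
    return max(pairs, key=lambda p: p[2])


ranked = rank_attractive(summary_pairs)
for virus, pair, ie in ranked[:3]:
    print(f"{pair} ({virus}): {ie:.1f}")

virus, pair, ie = strongest_repulsion(summary_pairs)
print(f"strongest repulsion: {pair} ({virus}): {ie:+.1f}")
```

Sorted this way, the three most attractive pairs are E329–R426, K31–E484, and D30–K417, in line with the "strongest", "second strongest", and "third strongest" labels used in the text, and M82–L472 is the largest repulsion.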
Conclusion
Virus-receptor recognition is a primary phase that plays a decisive role in tissue tropism
in host cells. Several studies have shown that SARS-CoV, SARS-CoV-2, and HCoV-NL63, despite
binding to the same receptor, use fairly different mechanisms to recognize hACE-2 and
initiate the cell entry process, which could be related to the order of binding affinity
and severity of these viruses. In this sense, we employed quantum biochemistry methods to
investigate the interactions of hACE-2 with the spike protein of the three CoVs in
atomic detail to understand the process by which these proteins interact, aiming to discover
methods that could be used to neutralize virus infection.

According to the protein–protein interaction results, the total interaction energy
between hACE-2 and spike protein obtained in this work followed the experimental binding
affinity: SARS-2 (−118.6 kcal mol–1) > SARS (−83.1 kcal
mol–1) > NL63 (−64.6 kcal mol–1). When investigating the energetic relevance of the
segments of both proteins to the attachment and searching for differences among the CoV
species, we observed that, as expected, the most
important residues of hACE-2 are in the helix α1 and the β-turn
“β3-L18-β4”, while in the S protein these
residues are in the receptor-binding motif (RBM). Moreover, we found segments of both
proteins that appear to be specific for the recognition of each virus, including the
α-helix 3 of SARS, the segments α14, L3, L16,
and L20 of hACE-2, and the β-strand 7 of NL63.

Finally, our results showed that 24 residues of hACE-2 are important to the recognition of
the CoVs (Q24, T27, F28, D30, K31, H34, E37, D38, Y41, Q42, L45, L79, M82, Y83, T324, Q325,
E329, N330, K353, G354, D355, F356, R357, and A387), while 12 (R426, Y436, Y442, P462, N473,
Y475, N479, Y484, T486, T487, I489, and Y491), 16 (K417, Y449, Y453, F456, A475, E484, F486,
N487, Y489, Q493, G496, Q498, T500, N501, G502, and Y505), and 10 (G494, S496, C497, Y498,
V499, S535, P536, G537, W585, and H586) residues are energetically relevant to the
interaction of SARS, SARS-2, and NL63 with hACE-2, respectively. As one can see, the most
relevant residues and segments presented here are in agreement with previous studies
published during this two-year period of the COVID-19 outbreak, and new ones are
highlighted: (a) the segments α-helix 3 of SARS, α14, L3, L16, and L20 of hACE-2, and the
β-strand 7 of NL63, and (b) the residues I489, N473, and P462 (SARS), T500, Y489, and E484
(SARS-2), and H586, P536, and C497 (NL63). These results provide valuable information for
the discovery of antiviral therapeutics that inhibit protein–protein interactions of human
pathogenic CoVs that use hACE-2 as a target.
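The ordering of the total interaction energies reported in this Conclusion can be checked with a short sketch. It is an illustration only: the variable names are assumptions, the three energies are the totals quoted above, and the experimental order is the one stated in the text.

```python
# Illustrative sketch: order the three complexes by total interaction energy
# and compare with the experimental affinity order stated in the text.

total_ie = {"SARS-2": -118.6, "SARS": -83.1, "NL63": -64.6}  # kcal/mol

# A more negative total IE corresponds to a stronger predicted binding.
predicted_order = sorted(total_ie, key=total_ie.get)
experimental_order = ["SARS-2", "SARS", "NL63"]

print(" > ".join(predicted_order))
print("matches experimental order:", predicted_order == experimental_order)

# Energy gap of each complex relative to the strongest binder (kcal/mol).
reference = total_ie[predicted_order[0]]
gaps = {name: round(ie - reference, 1) for name, ie in total_ie.items()}
for name in predicted_order:
    print(f"{name}: {gaps[name]:+.1f} relative to {predicted_order[0]}")
```

Sorting by the most negative energy first gives SARS-2, then SARS, then NL63, the same order as the experimental binding affinities cited in the text.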
Data and Software Availability
Structures used to perform the MFCC fragmentation are available in the Supporting
Information. For the analysis of the results and the plotting of the energy data, we used
in-house Python scripts. Additional software details and data that support the findings of
this study are available from the authors upon request.