Leonardo Bò1, Mattia Miotto1,2, Lorenzo Di Rienzo1, Edoardo Milanetti1,2, Giancarlo Ruocco1,2.
Abstract
Recent experimental evidence has demonstrated that the SARS-CoV-2 spike protein can bind sialic acid molecules, a trait not present in SARS-CoV that could shed light on the molecular mechanism the virus uses for cell invasion. This peculiar feature was successfully predicted by in-silico studies comparing the sequence and structural characteristics that SARS-CoV-2 shares with other sialic acid-binding viruses, such as MERS-CoV. Although the binding region has been identified in the N-terminal domain of the spike protein, no comprehensive analysis has so far been carried out on the conformations that spike and sialic acid adopt once in the complex. Here, we address this aspect by performing an extensive molecular dynamics simulation of a system composed of the N-terminal domain of the spike protein and a sialic acid molecule. We observed several short-lived binding events, consistent with the avidic nature of the binding, which interestingly occur in the surface region of the spike where several insertions are present with respect to the SARS-CoV sequence. Characterizing the bound configurations via a clustering analysis on the principal components of the motion, we identified different possible binding conformations and discussed their dynamic and structural properties. In particular, we analyzed the correlated motion between the binding residues and the effect of binding on the stability of atomic fluctuations, thus proposing regions with a high propensity to bind sialic acid.
Keywords: SARS-CoV-2; entry mechanism; molecular binding; molecular dynamics; sialic acid
Year: 2021 PMID: 35047894 PMCID: PMC8757799 DOI: 10.3389/fmedt.2020.614652
Source DB: PubMed Journal: Front Med Technol ISSN: 2673-3129
Figure 1. (A) Cartoon representation of the SARS-CoV-2 spike protein in trimeric form. Different colors highlight the three domains. (B) Root mean square deviation (RMSD) as a function of time for the N-terminal regions of SARS-CoV-2 in the trimeric form (chains A, B, and C) and of chain A alone. (C) Root mean square fluctuation (RMSF) for all the 290 residues of the analyzed domains.
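For readers unfamiliar with the two quantities plotted in panels (B) and (C), the following is a minimal NumPy sketch of how RMSD and RMSF are commonly computed from a trajectory. It is not the authors' pipeline: the function names, the array layout, and the assumption that frames have already been superposed on the reference are all illustrative.

```python
import numpy as np

def rmsd(traj, ref):
    """RMSD of every frame from a reference structure.

    traj: array (n_frames, n_atoms, 3), assumed already superposed on ref.
    ref:  array (n_atoms, 3) with the reference coordinates.
    Returns an array of length n_frames.
    """
    diff = np.asarray(traj, dtype=float) - np.asarray(ref, dtype=float)
    # Squared displacement per atom, averaged over atoms, one value per frame.
    return np.sqrt((diff ** 2).sum(axis=2).mean(axis=1))

def rmsf(traj):
    """RMSF of every atom around its time-averaged position.

    traj: array (n_frames, n_atoms, 3), assumed already superposed.
    Returns an array of length n_atoms.
    """
    traj = np.asarray(traj, dtype=float)
    diff = traj - traj.mean(axis=0)
    # Squared displacement per frame, averaged over frames, one value per atom.
    return np.sqrt((diff ** 2).sum(axis=2).mean(axis=0))
```

RMSD collapses each frame to a single number (averaging over atoms), whereas RMSF collapses each atom or residue to a single number (averaging over time), which is why the former is plotted against time and the latter against residue index.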
SARS-CoV-2 area under the curve values for the identified regions of high contact probability (peaks in Figure 2C).

| Residues | Area under the curve | Peak |
|---|---|---|
| 1–23 | 0.117 | Peak-1S |
| 56–78 | 0.132 | Peak-2S |
| 140–165 | 0.153 | Peak-3S |
| 178–191 | 0.153 | Peak-4S |
| 241–265 | 0.194 | Peak-5S |
Figure 2. (A) Root mean square deviation (RMSD) as a function of time of the atomic positions of the SARS-CoV-2 spike N-terminal domain and a sialic acid molecule, using the final bound configuration as reference. The progressive approach of the sialic acid molecule to the binding region can also be seen in the snapshots above the graph. (B) RMSD of the sialic acid atomic positions as a function of simulation time, using the first bound configuration as reference. (C) Probability for each residue of the SARS-CoV-2 N-terminal domain to interact with the sialic acid molecule, obtained from a 1.75 μs molecular dynamics simulation. The solid line represents the contact probability averaged over the six nearest neighbors of each residue, while dashed lines show the mean value plus or minus one standard deviation. Asterisks mark the binding regions predicted in (24).
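The per-residue contact probability in panel (C), its neighbor-averaged version, and the area under a peak can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the boolean contact matrix (and whatever distance cutoff produced it), the reading of "six nearest neighbors" as three residues on each side, the truncation of windows at the chain ends, and the plain trapezoidal area without normalization are all choices made here for the example.

```python
import numpy as np

def contact_probability(contacts):
    """Fraction of frames in which each residue contacts the ligand.

    contacts: boolean array (n_frames, n_residues); True where a residue is
    in contact with sialic acid (the contact criterion is assumed upstream).
    """
    return np.asarray(contacts, dtype=float).mean(axis=0)

def smooth(prob, half_window=3):
    """Average each residue with up to `half_window` neighbors per side.

    half_window=3 is one reading of "six nearest neighbors" (assumption).
    Windows are truncated at the chain ends rather than zero-padded.
    """
    prob = np.asarray(prob, dtype=float)
    n = len(prob)
    out = np.empty(n)
    for i in range(n):
        lo, hi = max(0, i - half_window), min(n, i + half_window + 1)
        out[i] = prob[lo:hi].mean()
    return out

def region_area(prob, first, last):
    """Trapezoidal area under `prob` for residues first..last (1-based, inclusive)."""
    seg = np.asarray(prob, dtype=float)[first - 1:last]
    return float(((seg[:-1] + seg[1:]) / 2.0).sum())

def peak_thresholds(prob):
    """Mean minus and plus one standard deviation of the profile."""
    prob = np.asarray(prob, dtype=float)
    return prob.mean() - prob.std(), prob.mean() + prob.std()
```

A call such as `region_area(smooth(p), 241, 265)` would integrate the smoothed profile over one of the residue ranges listed in the table above; whether the published values were normalized in some way is not stated in this record.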
Figure 3. (A) Projection onto the two principal components of the heavy-atom coordinates of both the 111 spike interacting residues and the sialic acid. A clustering analysis (described in section 4) isolates five distinct clusters, highlighted with different colors in the plot. (B) Time evolution of the root mean square deviation (RMSD) of the atomic positions of the SARS-CoV-2 spike N-terminal domain and a sialic acid molecule, using the first bound configuration as reference. Bound frames are colored according to the five clusters found in (A). (C) Cartoon representations of the five different complexes between the SARS-CoV-2 spike N-terminal domain and the sialic acid molecule. Solvent-exposed molecular surfaces in the binding region are highlighted.
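The projection in panel (A) can be illustrated with a short principal component analysis written directly on top of NumPy's SVD. This is a sketch under assumptions, not the authors' implementation: the function name and array layout are hypothetical, the frames are assumed to be aligned beforehand, and the clustering step (described in the paper's section 4) is deliberately not reproduced here.

```python
import numpy as np

def pca_projection(coords, n_components=2):
    """Project trajectory frames onto the leading principal components.

    coords: array (n_frames, n_atoms, 3) of heavy-atom coordinates,
            assumed already aligned to a common reference.
    Returns (projections, explained_variance_ratio), where projections has
    shape (n_frames, n_components).
    """
    coords = np.asarray(coords, dtype=float)
    # Flatten each frame into one feature vector and remove the mean frame.
    x = coords.reshape(len(coords), -1)
    x = x - x.mean(axis=0)
    # Rows of vt are the principal directions, ordered by decreasing variance.
    _, s, vt = np.linalg.svd(x, full_matrices=False)
    projections = x @ vt[:n_components].T
    variance = s ** 2
    return projections, variance[:n_components] / variance.sum()
```

The two columns of `projections` would be the x and y coordinates of a scatter plot like the one in panel (A), and any clustering algorithm could then be run on those two columns. The sign of each component is arbitrary, so mirrored plots are equivalent.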
Summary of the relevant features of the five clusters identified in Figure 3A: percentage of bound conformations assigned to each cluster, list of residues interacting with the sialic acid molecule, and mean RMSF in the free and bound states of the residues associated with each cluster.

| Cluster | Bound conformations (%) | Residues interacting with sialic acid | Mean RMSF (free) | Mean RMSF (bound) |
|---|---|---|---|---|
| Cluster 1 | 8.91 | Phe 32, Thr 33, Phe 59, Pro 217, Pro 218, Gln 219, Phe 220 | 0.11 ± 0.01 | 0.10 ± 0.01 |
| Cluster 2 | 31.93 | His 66, Ala 67, Ile 68, Val 70, Ser 71, Gly 72, Gln 183, Gly 184, Asn 185, Phe 186, Val 213, Arg 214, Gly 261 | 0.25 ± 0.03 | 0.17 ± 0.02 |
| Cluster 3 | 22.52 | Lys 147, Gly 181, Gln 183, Tyr 248, Leu 249, Thr 250, Pro 251, Gly 252, Asp 253 | 0.44 ± 0.07 | 0.37 ± 0.06 |
| Cluster 4 | 22.42 | Tyr 144, Tyr 145, His 146, Trp 152, Met 153, Tyr 248, Pro 251, Gly 252, Asp 253, Ser 254 | 0.38 ± 0.08 | 0.47 ± 0.11 |
| Cluster 5 | 14.21 | Cys 15, Val 16, Phe 140, Glu 156, Arg 158, Arg 246 | 0.13 ± 0.02 | 0.09 ± 0.01 |
Figure 4. (A) Pearson pairwise correlation matrix of the motion of the 290 residues constituting the N-terminal domain of the SARS-CoV-2 spike protein. The correlation between a pair of residues is obtained by averaging the correlations of their motion along each direction. Colors range from blue to red as the correlation goes from −1 to 1. (B) Density distributions of the mean correlation coefficients obtained by grouping the residues inside a sphere of radius 8 Å centered on each of the 111 residues forming the NTD loop regions. Vertical colored lines mark the mean correlation of the pairs of residues found to interact with the sialic acid molecule in the five clusters identified in Figure 3. (C) Difference in root mean square fluctuation (RMSF) for each of the 290 residues comprising the N-terminal domain, computed using the whole dynamics (RMSFtot) or only the configurations where the sialic acid molecule is not bound to the protein (RMSFfree), divided by RMSFtot. (D) Difference in RMSF for each of the residues comprising the binding sites found in Figure 3, computed using the bound configurations (RMSFbound) or the free ones (RMSFfree), divided by RMSFfree.
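The direction-averaged correlation of panel (A) and the normalized RMSF differences of panels (C) and (D) can be sketched as below. This is one plausible reading of the caption rather than the authors' code: using a single representative position per residue (for example the C-alpha atom), averaging the x, y, and z Pearson matrices with equal weight, and the order of subtraction in the RMSF difference are all assumptions.

```python
import numpy as np

def direction_averaged_correlation(positions):
    """Pearson correlation between residues, averaged over x, y and z.

    positions: array (n_frames, n_residues, 3) with one representative
    position per residue (assumed here; e.g. the C-alpha atom).
    Returns an (n_residues, n_residues) matrix with values in [-1, 1].
    """
    positions = np.asarray(positions, dtype=float)
    # One correlation matrix per Cartesian direction; columns are residues.
    per_axis = [np.corrcoef(positions[:, :, k], rowvar=False) for k in range(3)]
    return np.mean(per_axis, axis=0)

def relative_difference(rmsf_a, rmsf_b, rmsf_norm):
    """Element-wise (rmsf_a - rmsf_b) / rmsf_norm.

    The order of subtraction is an assumption; the caption only states
    which quantity appears in the denominator.
    """
    a, b, n = (np.asarray(v, dtype=float) for v in (rmsf_a, rmsf_b, rmsf_norm))
    return (a - b) / n
```

Under these assumptions, panel (C) would correspond to `relative_difference(rmsf_tot, rmsf_free, rmsf_tot)` and panel (D) to `relative_difference(rmsf_bound, rmsf_free, rmsf_free)`. A residue that does not move at all along one direction has zero variance there, so `np.corrcoef` returns NaN for it and such cases need separate handling.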