The novel SARS-CoV-2 uses ACE2 (Angiotensin-Converting Enzyme 2) receptor as an entry point. Insights on S protein receptor-binding domain (RBD) interaction with ACE2 receptor and drug repurposing has accelerated drug discovery for the novel SARS-CoV-2 infection. Finding small molecule binding sites in S protein and ACE2 interface is crucial in search of effective drugs to prevent viral entry. In this study, we employed molecular dynamics simulations in mixed solvents together with virtual screening to identify small molecules that could be potential inhibitors of S protein -ACE2 interaction. Observation of organic probe molecule localization during the simulations revealed multiple sites at the S protein surface related to small molecule, antibody, and ACE2 binding. In addition, a novel conformation of the S protein was discovered that could be stabilized by small molecules to inhibit attachment to ACE2. The most promising binding site on RBD-ACE2 interface was targeted with virtual screening and top-ranked compounds (DB08248, DB02651, DB03714, and DB14826) are suggested for experimental testing. The protocol described here offers an extremely fast method for characterizing key proteins of a novel pathogen and for the identification of compounds that could inhibit or accelerate spreading of the disease.
The novel SARS-CoV-2 uses ACE2 (Angiotensin-Converting Enzyme 2) receptor as an entry point. Insights on S protein receptor-binding domain (RBD) interaction with ACE2 receptor and drug repurposing has accelerated drug discovery for the novel SARS-CoV-2 infection. Finding small molecule binding sites in S protein and ACE2 interface is crucial in search of effective drugs to prevent viral entry. In this study, we employed molecular dynamics simulations in mixed solvents together with virtual screening to identify small molecules that could be potential inhibitors of S protein -ACE2 interaction. Observation of organic probe molecule localization during the simulations revealed multiple sites at the S protein surface related to small molecule, antibody, and ACE2 binding. In addition, a novel conformation of the S protein was discovered that could be stabilized by small molecules to inhibit attachment to ACE2. The most promising binding site on RBD-ACE2 interface was targeted with virtual screening and top-ranked compounds (DB08248, DB02651, DB03714, and DB14826) are suggested for experimental testing. The protocol described here offers an extremely fast method for characterizing key proteins of a novel pathogen and for the identification of compounds that could inhibit or accelerate spreading of the disease.
The pandemic caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has inflicted substantial socio-economic damage on a global scale [1], [2]. While the number of confirmed infections and deaths (135.6 million and 2.9 million as of 13.4.2021, respectively; https://www.who.int/emergencies/diseases/novel-coronavirus-2019) keeps increasing, recently approved vaccines that protect from severe novel coronavirus disease (COVID-19) have brought the way out of the pandemic in sight. Extremely fast vaccine development was undoubtedly made possible by the scientific community's impressive efforts right from the beginning of the pandemic: for example, the first high-resolution structural models of SARS-CoV-2 proteins were published within weeks after the first confirmed infections [3], [4], [5], [6], [7], [8]. However, the threat of novel pathogens emerging in the future remains real and should be constantly prepared for [9]. The current pandemic has shown that quickly acquiring knowledge of the pathogen is essential to contain the disease and accelerate the development of effective treatments.Drug repurposing offers a potential speedway to find treatments for an emerging disease. The main benefit of drug repurposing is that compounds with known safety profiles can target the disease if neutralizing activity is detected [10]. The viral drug remdesivir, originally developed against the Ebola virus, was reported to shorten hospitalized COVID-19 patients [11]. In addition, a preliminary report showing the steroid dexamethasone decrease mortality in severe cases of COVID-19 was recently published [12]. Other recent examples of successful drug repurposing include Dapoxetine, that was initially developed as an antidepressant but is currently used to treat premature ejaculation [13], and Aspirin, a widely used analgesic that was recommended for prevention of colorectal cancer [14].In the context of drug discovery and repurposing, computer-aided methods offer an excellent toolbox for rapid identification of the most promising drug candidates and they have already been employed in various studies targeting Sars-CoV-2 proteins [10], [15], [16], [17], [18], [19]. As a result, expensive, laborious and time-consuming experimental testing can be focused on the highlighted compounds. Computational tools such as molecular docking enable extremely fast virtual screening (VS) of molecular libraries to detect the most potential binders to a disease-related protein target [20], [21], [22], [23], [24]. Accuracy of docking can typically be improved by utilizing a different scoring function to re-score ligand poses generated by the docking software [25], [26], [27], [28]. In recent years, increase in computing capacity has enabled efficient utilization of various molecular dynamics (MD) simulation-based techniques in VS protocols [29], [30], [31]. MD simulations performed in mixed solvents of organic probe molecules and water (MixMD) have been successful in the identification of orthosteric, allosteric and cofactor binding sites as well as protein-protein interaction (PPI) sites [32], [33]. In addition to revealing binding sites for small organic molecules, an ensemble of protein structures produced by MixMD can be used in VS to consider protein flexibility [30]. These features of MixMD are particularly useful when targeting a protein that has not much binding data available.SARS-CoV-2 internalization to human cells is mediated by the interaction between the trimeric viral Spike glycoprotein (S protein) and its plasma membrane receptor, angiotensin-converting enzyme 2 (ACE2) [8], [34]. Available high-resolution structural models show the binding region of ACE2 on the S protein receptor-binding domain (RBD) and that prior to the complex formation, S1 subunit RBD shifts to an “up” conformation and becomes accessible for ACE2 attachment to initiate the fusion with the host cell [3]. S protein - ACE2 interaction is enhanced by forming multiple electrostatic interactions, most of which were stable in a recent MD simulation study [4], [35]. Thus, prevention of the S protein - ACE2 complex formation using small molecule PPI blockers is a potential strategy for slowing down propagation of the infection. PPI interfaces have traditionally been challenging targets for small molecules due to their physicochemical and geometrical properties. However, up to date, numerous small-molecule PPI blockers have been identified and some have entered clinical trials for medical use [36], [37], [38]. PPI interfaces may contain small-sized binding hotspots that contribute significantly to the binding free energy of the PPI. In addition, PPI interfaces can undergo structural adaptation to different binding partners, facilitating small molecule binding to the PPI hotspots [36].In this study, the SARS-CoV-2 S protein RBD was simulated in mixed solvents utilizing the protocol described by Ghanakota et al.
[33]. The aim of the MixMD simulations was to detect suitable binding hotspots for small molecules on the RBD structure, focusing especially on the S protein – ACE2 interface. The biological relevance of the identified sites was evaluated by comparison with the available structural and binding data. The most promising PPI binding site was targeted with VS of a database containing approved and investigational small molecule drugs. VS results were inspected to identify compounds that could affect the S protein – ACE2 interaction and be potential for treatment of COVID-19.
Materials and Methods
MixMD Simulations
A 2.45 Å resolution crystal structure of SARS-CoV-2 S protein RBD in complex with ACE2 was obtained from Protein Data Bank [39] (rcsb.org; PDB: 6M0J [4]). ACE2, ions and crystal waters were deleted. For MixMD simulations, the system was solvated in 5 percent v/v ratio of probe to water using three different probes: pyrimidine (1P3), acetonitrile (ACN) and isopropanol (IPA). Tleap in AMBER18 package [40] was used to set up the simulation systems. Hydrogens were added, and disulfide bonds and histidine protonations were assigned. Systems were solvated with a layer of probe molecules after which enough TIP3P [41] water molecules were added to obtain the correct probe-water ratio. Parameters from ff14SB force field [42] were used for protein. GLYCAM_06j-1 force field parameters were used for N-acetyl-D-glucosamine (NAG) and the glycosylated asparagine [43]. Validated MixMD simulation parameters for 1P3, ACN and IPA in TIP3P water were used [44].All simulations were run with the CUDA implementation of PMEMD, using 2 fs time step [45], [46]. Constant temperature (300 K) and pressure (1 atm) were maintained. Andersen thermostat was used [47]. Bonds to hydrogen atoms were restrained with the SHAKE algorithm [48]. 10 Å cutoff was used for short-range electrostatic interactions, and Particle Mesh Ewald was used for long-range electrostatics [49], [50]. All systems were energy minimized and heated by gradually increasing the temperature. Equilibration runs were conducted with restraints on the protein backbone atoms. The restraints were gradually decreased, and finally, 1.4 ns equilibrations were run without any restraints. 100 ns production simulation was repeated 10 times for all three probes, resulting in 3 µs total simulation time.
Analysis of MixMD Simulations
Last 25 ns of each simulation run was included in the analysis. Probe occupancies were calculated using “grid” function in cpptraj [51] of AMBER18 with 0.5 Å spacing. Occupancy maps were visualized in Pymol 2.3.0 [52] using σ value 15 or higher. The σ-values were obtained by using the equation (x - m)/s where x is the bin count of a grid point, m is the mean and s is the standard deviation of all binned data. Thus, σ signifies deviation from the mean of all data at a grid point in units of standard deviation. Using higher σ value disposes of signal caused by sites where probes rapidly exchange with water while the signal from spots with high probe residence time is retained. Occupancy clustering was conducted using Probeview plugin [53] in Pymol. Grid points having at least 20 percent of the maximum occupancy were included, points within 3 Å of each other were considered within the same cluster, and a minimum of 10 points per cluster was required. The most common probe poses at the mapped hotspots were obtained by root-mean-square deviation (RMSD)-based clustering using cpptraj. Cpptraj was used to extract probe molecules within 3 Å of hotspot residues from each simulation snapshot. The probes were then again loaded to cpptraj one by one for clustering. Average-linkage method and epsilon value 4 Å were used. The simulation structures of the protein were clustered by RMSD fluctuations of the residues closest to the largest probe density observed at the PPI interface, using epsilon 2.0 Å. The resulting centroid structures were used in molecular docking.
Ligand Preparation
Drugbank database (https://www.drugbank.ca/; version 5.1.5; 2020-01-03) was downloaded as a 2D structure-data file (∼10000 compounds). LIGPREP in MAESTRO 2020-1 (Schrödinger, LLC, New York, NY, USA, 2020) was used to generate 3D coordinates, OPLS3 charges and tautomeric states at pH 7.4. Molecules with over ten rotatable bonds or molecular weight outside the range of 150-550 g/mol were excluded from the dataset with LIGFILTER in MAESTRO. The viral drug remdesivir was retained from Drugbank despite exceeding the molecular weight threshold. For docking, the database was converted to SYBYL MOL2 format with MOL2CONVERT in MAESTRO.
Virtual Screening
Molecular docking of Drugbank was performed with PLANTS software [54]. Chemplp scoring function and search speed “speed1” were used. Four best-scored conformations of each ligand were kept with cluster RMSD 2.0. Binding site center was varied between models 1 and 2. For model1 structures, two dockings were performed using two different coordinates. First, docking was done using a coordinate point in the center of the groove between Arg403 and Lys417. Docking radius of 12.0 Å was used. For the second docking, the coordinate point was moved closer to the Tyr505 – Gly502 region to obtain hits that interact with these residues. Docking radius was increased to 14.0 Å for the second docking to retain the possibility of occupying the groove-like region between Arg403 and Lys417. In model2 docking, binding site center was set to the center of the groove-like region between Arg403 and Lys417 and docking radius 12.0 Å was used. In docking of remdesivir, also GLIDE in MAESTRO 2020-01 was used [55], [56], [57]. The extra precision mode was used with enhanced planarity of conjugated pi groups, otherwise, default settings were retained. Same docking coordinates and radiuses were used as in the corresponding PLANTS dockings.All resulting docking conformations were re-scored with the negative image-based (NIB) re-scoring method using the programs PANTHER [21] (version 0.18.19) and SHAEP [58] (version 1.3.1). PANTHER generates a NIB-model of the binding pocket that describes the shape and electrostatic properties of an optimal ligand. The NIB-model can then be used in an extremely fast molecular similarity comparison. The similarity screening is performed with SHAEP, using the option “noOptimize” to score the ligand poses generated by the docking program. PANTHER re-scoring has been shown to provide significant enrichment of active compounds in VS with various protein targets [25], [26]. In addition, a fragment-based approach for NIB-screening was utilized by incorporating 1P3 molecules extracted from the MixMD simulations into the NIB-models [20]. Multiple coordinate points for each model were used. Coordinates, box radius and protein C atom radius were adjusted for each protein structure for complete coverage of the regions of interest. Face-centered cubic packing method was used. In models where 1P3 molecules were included, points overlapping with the probes were deleted. Same partial charges were assigned for 1P3 atoms as in MixMD simulations.In the final phase of VS, top-ranking compounds were inspected visually in BODIL molecular modelling software [59]. Top 50 molecules of each docking and re-scoring and compounds ranked top 100 in more than one docking/re-scoring run were evaluated visually. Furthermore, the second model 1 docking results were filtered with SDFCONF (version 0.8.37) to focus on compounds that interact with Gly502. Compounds with some atom within 2.0 Å or 3.5 Å of backbone amide H of Gly502 were considered.
Protein-Ligand Complex Simulations
Classical MD simulations in water were run for RBD in complex with the 30 most promising compounds obtained from docking. Simulation systems were set up with the same protocol as the MixMD simulations, but no probe molecules were added this time. Ligands were parameterized and atomic charges were calculated with General AMBER force field using antechamber and AM1-BCC method [60], [61]. 100 ns production simulation was repeated three times for each compound. Binding free energy of the most stable compounds was evaluated with the Molecular mechanics/Generalized Born Surface Area (MM/GBSA) method, with igb5 model for Generalized Born calculations [62], using MMPBSA.py available in AMBER18 [63]. Residue-wise decomposition of free energy was calculated for each compound. In addition, hydrogen bond number and lifetime analysis was conducted for the most stable compounds with cpptraj.
Figure Preparation
All structural figures were generated using Pymol version 2.3.0 [52].
Results and Discussion
Probe occupancy maps were aligned with the available S protein RBD structures complexed with various binding partners. The biological relevance of the observed probe densities was evaluated by their localization concerning those binding molecules. In addition, regions with mapping from multiple probes associated with a cavity-like surface shape were sought to detect potential sites for molecular docking.
Receptor Binding, Antibody Epitopes and Small Molecule Binding Sites Mapped by MixMD
In probe occupancy analysis, special attention was paid to the PPI interface as a small molecule binding to this region could directly prevent the S protein – ACE2 interaction. All three probes showed mapping at the PPI interface, 1P3 having the highest total occupancy (Table 1). Probe occupancy clustering identified five separate clusters at the PPI interface (clusters 0, 2, 21, 1 and 3; Fig. 1). Cluster 0 had occupancy for all three probes, and it overlapped with Lys353 of ACE2 that forms a hydrogen bond with Gly502 backbone amide of S protein in the complex crystal structure (Fig. S1F, which can be found on the Computer Society Digital Library at http://doi.ieeecomputersociety.org/10.1109/TCBB.2021.3076259). In addition, clusters 1 and 3 overlapped or were close to ACE2 residues facing the S protein surface. Notably, clusters 1 and 0 overlapped with sites of ACE2 residues Lys31 and Lys353 that have been identified as virus-binding hotspots crucial for receptor attachment of both SARS-CoV and SARS-CoV-2 [6], [64], [65]. In addition, S protein residues Gln498, Asn501, Tyr505, Gln493, Tyr453, Lys417, Tyr489, Asn487 and Ala475 were recently shown to form stable hydrogen bonds with ACE2 during MD simulation [35]. All the PPI probe clusters shown in Fig. 1 were located near these residues.
TABLE 1
Probe Cluster Occupancy Values
Cluster
1P3_max
1P3_sum
ACN_max
ACN_sum
IPA_max
IPA_sum
Max
Total_Sum
5
60.02
19285.96
146.29
13321.6
115.38
24867.37
146.29
57474.95
6
71.57
38503.78
63.9
2909.88
44.53
4574.25
71.57
45987.91
20
53.53
13723.34
49.18
1652.53
45.69
8291.69
53.53
23667.56
19
36.92
8852.39
0
0
30.59
1808.94
36.92
10661.33
10
49.2
7024.6
0
0
0
0
49.2
7024.6
8
39.09
5190.84
0
0
0
0
39.09
5190.84
4
28.98
4404.96
0
0
27.1
530.06
28.98
4935.02
15
27.54
4307.6
0
0
0
0
27.54
4307.6
0
39.81
3801.62
31.53
151.76
23.62
94.47
39.81
4047.85
14
26.82
3051.6
0
0
0
0
26.82
3051.6
16
22.48
1637.71
0
0
0
0
22.48
1637.71
9
21.76
1224.96
0
0
0
0
21.76
1224.96
13
27.54
1007.23
0
0
0
0
27.54
1007.23
11
17.43
340.15
0
0
25.94
218.37
25.94
558.52
18
14.54
56.73
0
0
30.59
473.14
30.59
529.87
3
17.43
490.62
0
0
0
0
17.43
490.62
2
21.04
479.59
0
0
0
0
21.04
479.59
21
0
0
43.3
385.07
0
0
43.3
385.07
17
18.87
365.62
0
0
0
0
18.87
365.62
7
16.71
359.85
0
0
0
0
16.71
359.85
1
15.99
350.46
0
0
0
0
15.99
350.46
12
18.87
340.87
0
0
0
0
18.87
340.87
Clusters are ordered by Total_Sum which is the total occupancy of all probes in that cluster. Max values indicate maximum bin counts of the probes in each cluster.
Fig. 1.
Probe occupancy clusters on SARS-CoV-2 S protein RBD. S protein RBD is shown in white cartoon representation. Cyan, blue, yellow, red and orange clusters overlap or are close to known antibody epitopes (cyan: B38, blue: CR3022, yellow: S309, red: P2B-2F6, orange: 298-52) or small molecule binding sites (5 and 6). Green clusters are related to packing of adjacent domains in the S protein trimer. NAG and the glycosylated asparagine are represented as sticks and with the following atom coloring: C = white, O = red, N = blue. RBD is flipped 180° over vertical axis between left and right panel.
Clusters are ordered by Total_Sum which is the total occupancy of all probes in that cluster. Max values indicate maximum bin counts of the probes in each cluster.Probe occupancy clusters on SARS-CoV-2 S protein RBD. S protein RBD is shown in white cartoon representation. Cyan, blue, yellow, red and orange clusters overlap or are close to known antibody epitopes (cyan: B38, blue: CR3022, yellow: S309, red: P2B-2F6, orange: 298-52) or small molecule binding sites (5 and 6). Green clusters are related to packing of adjacent domains in the S protein trimer. NAG and the glycosylated asparagine are represented as sticks and with the following atom coloring: C = white, O = red, N = blue. RBD is flipped 180° over vertical axis between left and right panel.Hydrogen bonding with Gly502 was also observed by neutralizing antibody B38, whose epitope partially overlaps with the ACE2 binding region (PDB: 7BZ5) [66]. Interestingly, all the observed probe densities at the PPI interface were overlapping or very close to B38 residues facing the S protein (Fig. S1A, available online). Furthermore, the crystal structure of S protein RBD in complex with Fab fragment of SARS-CoV-2 neutralizing antibody P2B-2F6 revealed a second epitope that partially overlaps with ACE2 binding region (PDB: 7BWJ) [67]. Site occupied by the antibody residues Ile103 and Val105 was mapped by cluster 4 (Fig. S1B, available online).Clusters 9, 11, 14 and 19 overlapped with antibody CR3022 bound to a cryptic epitope on S protein RBD (PDB: 6W41; Fig. 1; Fig. S1D, available online). CR3022 is a SARS-CoV antibody, and its epitope is accessible only when at least two of the S protein trimer RBDs are in the “up” conformation and in a slightly rotated orientation [68]. Interestingly, MixMD mapped the binding pocket of linoleic acid (LA), a fatty acid that was recently shown to decrease ACE2 attachment of S protein by binding and stabilizing the RBD in the closed S protein conformation (PDB: 6ZB5) [69]. Cluster 13 formed deep inside the pocket whereas cluster 15 resided at the pocket opening, both showing considerable 1P3 occupancy (Table 1; Fig. 1; Fig. S2A, available online). The deep-reaching part of cluster 6 overlapped with the fatty acid tail of LA, showing 1P3 and IPA occupation in the area. In addition, small density for ACN was observed near cluster 13 when using a smaller cutoff value in probe occupancy clustering (Fig. S2A, available online). A slight movement of the helix next to cluster 15 (between Ser383-Leu387) was observed during the simulations, allowing probes enter the LA pocket (Fig. S2B and C, available online). Due to the inhibitory properties of LA, its binding pocket has been suggested to be a potential target for drug discovery [69]. Mapping by the probes used here provides indication on what kind of chemical entities are accepted into the LA pocket from a binding molecule.Clusters 7, 10, 17 and 18 mapped third and fourth antibody epitopes close to the RBD glycosylation site (Fig. 1). Although not completely overlapping with the clusters, complex of antibodies 298 and 52 Fab fragments makes contact with the RBD in the region occupied by clusters 7, 10 and 17, and also cluster 8 resides on this RBD-antibody interface (Fig. 1; Fig. S1E, available online; publicly available PDB: 7K9Z). In addition, cluster 18 showed overlap with Phe186 of the antibody S309 bound to this epitope (PDB: 6WPS; Fig. S1C, available online) [70]. Several other S309 residues reached into the vicinity of cluster 6, but no overlap was observed.5 and 6 were the top-ranked clusters by total probe occupancy (Table 1). Both clusters showed strong mapping for all the probes and were associated with a cavity- or groove-like surface shape, making them potential small molecule binding sites. Recently, Carino et al. reported using VS to identify steroidal compounds that decrease S protein binding to ACE2 in vitro
[71]. The compounds were predicted to bind sites that overlap with clusters 5 and 6 and were hypothesized to decrease ACE2 binding by an allosteric effect. Notably, these sites are contiguous and between them resides the LA binding site (PDB: 6ZB5) that opens up from sides of both clusters 5 and 6 in the MixMD simulations. Thus, a molecule binding to either of the steroid-binding regions could possibly utilize the LA binding site through these channels. Finally, clusters 20, 10, 12, 16 and 17 were identified as contact sites with adjacent domains when RBD is in the “down” conformation in the trimeric S protein (Fig. 1; PDB: 6VSB) [3].
Novel Conformation of the S Protein – ACE2 Interface Revealed
Simulation structures of the S protein RBD were clustered based on the RMSD deviation of the residues in contact with the mapped spots at the PPI interface. Clustering resulted in a total of 7 centroid structures that were used in molecular docking. Due to significant conformational differences observed at the PPI interface, these structures were divided into two models for which different docking coordinates were used. Models 1 and 2 contained 4 and 3 structures, respectively.In model1, conformations of Tyr505 and Arg403 resembled the crystal structure (Fig. 2). A groove-like region was observed below Arg403 and next to Tyr505. Depth of the groove was increased by sidechain of Glu406 turning away from Tyr495 to form a salt bridge with either Arg403 or Arg408. Clustering of probe poses at cluster 0 showed 1P3 forming π-stacking interaction with Tyr505 and hydrogen bonding with the backbone amide of Gly502 (Fig. 1; Fig. 2). 1P3 at this site overlapped with the sidechain of ACE2 Lys353 in S protein – ACE2 complex, suggesting a potential mechanism for inhibition. For model1 structures, compounds were sought to utilize this site by forming similar interactions to those formed by the 1P3 centroid probe.
Fig. 2.
S protein RBD PPI interface conformations revealed by MixMD simulations. Model1 conformation on the left panel, Model2 conformation on the right panel. RBD is represented as white transparent surface. Residues Y505 and R403 are shown as sticks with the following atom coloring: C = white, O = red, N = blue, H = white. In addition, NIB-models generated of these conformations are shown as spheres and blue transparent surface. NIB-model atom colorings are: Neutral = cyan, negatively charged = red, positively charged = blue. 1P3 probes incorporated in the NIB-models are shown as sticks and the same atom colorings as RBD residues except for carbon atoms that are colored cyan.
S protein RBD PPI interface conformations revealed by MixMD simulations. Model1 conformation on the left panel, Model2 conformation on the right panel. RBD is represented as white transparent surface. Residues Y505 and R403 are shown as sticks with the following atom coloring: C = white, O = red, N = blue, H = white. In addition, NIB-models generated of these conformations are shown as spheres and blue transparent surface. NIB-model atom colorings are: Neutral = cyan, negatively charged = red, positively charged = blue. 1P3 probes incorporated in the NIB-models are shown as sticks and the same atom colorings as RBD residues except for carbon atoms that are colored cyan.In model2, sidechains of Tyr505 and Arg403 adopted conformations that caused overlap with ACE2 Lys353 in the superimposed crystal structure complex (Fig. 2). Such conformation was observed in approximately 10 percent of structures of 1P3 simulations and 15 percent of structures of ACN and IPA simulations. This difference between 1P3 and other probes was reasoned to be caused by the high residence time of 1P3 close to Tyr505, resulting in the crystal structure-like conformation being stabilized by π-stacking interaction. Probe pose clustering of 1P3 within the model2 structures revealed 1P3 frequently forming hydrogen bond and π-stacking interactions with Tyr505 and Arg403 (Fig. 2). Thus, potential hit compounds were predicted to stabilize the observed Tyr505 and Arg403 conformations by forming similar interactions.
Potential Modulators of S Protein – ACE2 Interaction Discovered
61 Drugbank compounds were predicted to bind to the S protein – ACE2 interface based on initial virtual screens by docking and re-Scoring (Table S1, available online). Nucleotide analogues were dominantly present within docking and re-scoring hits. Notably, nucleotide analogues are common viral drugs that are usually delivered to infected cells in nucleoside form. They are modified to nucleotide form by viral enzymes, after which they act as nucleic acid chain terminators or inhibitors of the related enzymes (e.g., Acyclovir) [72]. In addition, VS suggested several commercially used drugs to bind ACE2 interface of the RBD: for example, DB04884 (Dapoxetine) and DB13419 (Troxipide) are being used against premature ejaculation and gastroesophageal reflux disease, respectively, and have favourable safety profiles (Fig. S3, available online) [13], [73]. Reports suggesting good tolerability are available for DB11654 (T-2000) and DB15048 (Licogliflozin) that have been studied in phase 2 clinical trials (Fig. S3, available online) [74], [75].Of these compounds, 30 were selected for complex MD simulations with the RBD by visual inspection. Compounds that occupied the region lined by residues Gly502, Tyr505, Arg403 and Gly496 were predicted to be potential blockers of the ACE2 interaction. Emphasis was also put on the compounds' ability to form hydrogen bonds in the region and on chemical diversity: for example, as many of the virtual screening hits were highly similar nucleotide analogues, only the most potential ones according to the mentioned criteria were submitted to MD simulations.MD simulations were run for the selected compounds in complex with the S protein RBD to evaluate stability and binding free energy. 6 compounds displayed stable binding at the ACE2 interface of S protein in at least one of the 3 repeat simulations. Of these, DB08248 was ranked first by MM/GBSA with binding energy ∼ -28.9 kcal/mol for MD runs 2 and 3 (Table 2). DB08248 formed hydrogen bonds frequently with Asn501 and Gly502, which indicated stable occupation of the ACE2 Lys353 binding region (Table S2, available online; Fig. S4D, available online). Visual inspection of the binding interactions in MD run 3 showed that hydrogen bonding and π-stacking interactions with Tyr505 and Tyr495 mediated stability of the benzenesulfonamide and purin-2-ylamino moieties whereas the cyclohexyl tail fluctuated more due to lack of significant stabilizing interactions (Fig. 3, bottom left panel; Fig. S6, available online).
TABLE 2
Average MM/GBSA ΔGbind Values From Protein-Ligand Complex MD Simulations
DrugBank ID
Run 1
Run 2
Run 3
Mean DGbind
DB01937
-15.08 ± 7.5
-13.15 ± 5.25
-22.15 ± 14.64
-16.79 ± 4.74
DB02651
-27.14 ± 7.32
-22.48 ± 6.4
-20.95 ± 4.55
-23.53 ± 3.22
DB03714
-19.71 ± 5.41
-20.76 ± 6.21
-25.23 ± 4.66
-21.9 ± 2.93
DB08248
-22.3 ± 5.73
-28.87 ± 4.33
-28.91 ± 4.88
-26.69 ± 3.81
DB08434
-3.14 ± 5.79
-9.95 ± 6.3
-4.15 ± 5.28
-5.74 ± 3.67
DB14826
-6.64 ± 4.29
-21.93 ± 3.5
-8.12 ± 6.36
-12.23 ± 8.43
ΔGbind is shown in kcal/mol with standard deviation for each MD run. Bolded ΔGbind values indicate runs where the compound remained bound at the ACE2 interface of RBD throughout the simulation.
Fig. 3.
Potential modulators of S protein – ACE2 interaction identified from Drugbank database. Drugbank IDs of the compounds are: DB02651 (top left panel), DB03714 (top right panel), DB08248 (bottom left panel) and DB14826 (bottom right panel). Predicted binding mode as identified by MD simulation and MM/GBSA analysis is shown for each compound. RBD is shown as white transparent surface. Y505 and R403 are shown in each figure with several hydrogen bonding and nearby residues. Residues are represented as sticks with the following atom coloring: C = white, O = red, N = blue, H = white. Docked compounds are represented as balls and sticks and their atom colorings are: C = cyan, O = red, N = blue, H = white, P = orange, S = yellow, F = pink. Only polar hydrogens are shown. Purple dashed lines indicate potential hydrogen bonds between ligand and RBD.
ΔGbind is shown in kcal/mol with standard deviation for each MD run. Bolded ΔGbind values indicate runs where the compound remained bound at the ACE2 interface of RBD throughout the simulation.Potential modulators of S protein – ACE2 interaction identified from Drugbank database. Drugbank IDs of the compounds are: DB02651 (top left panel), DB03714 (top right panel), DB08248 (bottom left panel) and DB14826 (bottom right panel). Predicted binding mode as identified by MD simulation and MM/GBSA analysis is shown for each compound. RBD is shown as white transparent surface. Y505 and R403 are shown in each figure with several hydrogen bonding and nearby residues. Residues are represented as sticks with the following atom coloring: C = white, O = red, N = blue, H = white. Docked compounds are represented as balls and sticks and their atom colorings are: C = cyan, O = red, N = blue, H = white, P = orange, S = yellow, F = pink. Only polar hydrogens are shown. Purple dashed lines indicate potential hydrogen bonds between ligand and RBD.The second-ranked compound, DB02651 (Table 2), had stable electrostatic interactions with Arg403 and Gly496 via phosphonic acid and benzotriazol groups (Fig. S4B, available online; Fig. S6, available online; Table S2, available online). Binding of the phosphonic acid was further enhanced by frequent interaction with Lys417 (Fig. 3, top left panel). However, the other terminal phosphonic acid of DB02651 was not as well stabilized due to lack of positively charged amino acids in the area lined by Gly502 and Tyr505. A similar observation was made with the third-ranked compound, DB03714 (Table 2), whose terminal phosphonomethyl group did not form stable electrostatic interaction with the RBD. However, other parts of the compound displayed stable binding mediated especially by hydrogen bonding and π-stacking with Gly496, Lys417, Arg403, Tyr453, Tyr495 and Tyr505 (Fig. 3, top right panel; Fig. S4C, available online; Fig. S6, available online; Table S2, available online).DB14826 was ranked fifth by MM/GBSA (Table 2, run2) but displayed more stable binding than the fourth-ranked DB01937. DB14826 formed 2-3 hydrogen bonds, primarily with Arg403, Gly496 and Asn501, and significant contribution to binding was made by π-stacking and van der Waals interactions with Tyr505 and Tyr495 (Fig. 3, bottom right panel; Fig. S4F, available online; Fig. S6, available online; Table S2, available online). DB01937 displayed high variation in binding mode during simulations, which was also reflected in large fluctuation of the binding free energy (Table 2, run3). Electrostatic interaction between phosphate group and Arg403 was maintained for the whole simulation (Fig. S5, available online; Fig. S4A, available online), but other parts of the compound intermittently shifted away from the ACE2 Lys353 binding region, which could impair the compound's inhibitory activity. DB08434 stabilized model2 RBD structure in one of the MD replicates (Table 2, run3). However, despite of multiple hydrogen bonds observed during the simulation (Fig. S4E, available online), the estimated binding free energy (-4.15 ± 5.28 kcal/mol) of this complex indicated low affinity. Shape complementarity of the binding site and DB08434 was not optimal, as the compound did not properly fill a pocket between Lys417 and Glu406 that could be utilized for increased affinity (Fig. S5, available online).The stable binding mode displayed by DB08248, DB02651, DB03714 and DB14826 was confirmed to overlap with ACE2 Lys353 binding region by superimposing the ACE2-bound RBD with the MD simulation snapshots. These compounds are thus predicted to inhibit the S protein - ACE2 interaction and should be experimentally tested. Lys353 binding region has been indicated as a PPI hotspot of Sars-CoV-2 RBD-ACE2 attachment, which makes prevention of this interaction a potential strategy for inhibition [6]. A literature search was conducted to determine the potential of using these compounds to slow down propagation of Sars-CoV-2 infection. DB08248, DB02651, DB03714 had “experimental” status in Drugbank, suggesting that not enough clinical data is available to estimate their safety and tolerability in humans. DB14826 (Topramezone) is a herbicide that has been studied in clinical trials for use in immunonutrition before major vascular surgery (clinicaltrials.gov: NCT00559520) but, to our knowledge, no results of this study have been published.Thus far, several in-silico studies have been conducted for drug repurposing to treat COVID-19. The results vary, and compounds from vitamins and antioxidants (e.g., riboflavin) to adrenergic drugs (e.g., fenoterol), ACE2 blockers (tasosartan), antibiotics (e.g., salazosulfadimidine) and antivirals, such as a nucleotide analogue vidarabine, have been suggested against COVID-19 [15], [16], [17], [18], [19]. Various natural compounds, such as rosmarinic acid and demethoxycurcumin, have been shown to exhibit antiviral activity [76], [77], [78]. We recently screened a natural compound database and identified potential S protein modulators - ACE2 interaction [79]. Despite not being optimal for drug repurposing due to lack of clinical data, DB08248, DB02651, DB03714 and DB14826 are useful starting points for the development of more optimal small molecule inhibitors of the S protein - ACE2 interaction. The performed MD simulations provide indications for structural optimization of the compounds. Especially, the core structures placed at the ACE2 Lys353 binding region by DB08248, DB02651 and DB03714 displayed stable binding whereas increased affinity could be gained by optimization of the parts placed near Gly502 and Tyr453 to allow more efficient hydrogen bonding in these regions (Fig. 3).One of the key objectives of the current protocol is to identify compounds that could either inhibit or mediate internalization of SARS-CoV-2. Certain antimalarial drugs have been reported to accelerate viral replication [80]. In VS of Drugbank, several compounds displayed binding modes that could be speculated to stabilize ACE2 binding to the RBD. Small molecule mediated stabilization of protein oligomerization has been documented, for example, in triggering receptors expressed on myeloid cells 2 [81]. For instance, DB02790 (“experimental” status in Drugbank) was predicted to bind to the groove region below Arg403 and did not have any overlap with ACE2. The compound placed phosphate groups suitably to possibly contact ACE2 Lys353 and make favourable binding interactions resulting in acceleration of virus internalization. Thus, compounds that either accelerate or decelerate viral infection could be present as commonly used drugs or various natural or food products, making their identification essential for controlling the spread of the disease on a larger scale.
Conclusion
Impact of COVID-19 to modern society has been sudden and fierce. Although vaccination campaigns are likely to relieve the situation within the following months, emergence of novel, even more contagious pathogens is possible in the future and it should be prepared for. In this study, we have presented a protocol for extremely fast identification of small molecule binding sites and subsequent virtual screening and affinity prediction to identify compounds that could modulate the interaction between a pathogenic virus and its receptor. In addition, molecular dynamics simulations in mixed solvents recognized multiple antibody epitopes and small molecule binding sites that could affect antibody and receptor binding, emphasizing the power of the method in the characterization of a novel protein target. This information can be readily adopted into strategies, such as drug repurposing, that aim to accelerate the development of effective treatment against novel and fast spreading diseases. In the future, further efforts will be made in search of inhibitors and stabilizing agents of the S protein – ACE2 interaction and antibody binding by targeting the potential binding sites identified here.
Conflict of Interest
Authors declare that there is no conflict of interest regarding the publication of this article.
Authors: Elmeri M Jokinen; Pekka A Postila; Mira Ahinko; Sanna Niinivehmas; Olli T Pentikäinen Journal: Chem Biol Drug Des Date: 2019-07-19 Impact factor: 2.817
Authors: Jarmo Käpylä; Olli T Pentikäinen; Tommi Nyrönen; Liisa Nissinen; Sanna Lassander; Johanna Jokinen; Matti Lahti; Anne Marjamäki; Mark S Johnson; Jyrki Heino Journal: J Med Chem Date: 2007-04-21 Impact factor: 7.446
Authors: John H Beigel; Kay M Tomashek; Lori E Dodd; Aneesh K Mehta; Barry S Zingman; Andre C Kalil; Elizabeth Hohmann; Helen Y Chu; Annie Luetkemeyer; Susan Kline; Diego Lopez de Castilla; Robert W Finberg; Kerry Dierberg; Victor Tapson; Lanny Hsieh; Thomas F Patterson; Roger Paredes; Daniel A Sweeney; William R Short; Giota Touloumi; David Chien Lye; Norio Ohmagari; Myoung-Don Oh; Guillermo M Ruiz-Palacios; Thomas Benfield; Gerd Fätkenheuer; Mark G Kortepeter; Robert L Atmar; C Buddy Creech; Jens Lundgren; Abdel G Babiker; Sarah Pett; James D Neaton; Timothy H Burgess; Tyler Bonnett; Michelle Green; Mat Makowski; Anu Osinusi; Seema Nayak; H Clifford Lane Journal: N Engl J Med Date: 2020-10-08 Impact factor: 91.245
Authors: Daniel Wrapp; Nianshuang Wang; Kizzmekia S Corbett; Jory A Goldsmith; Ching-Lin Hsieh; Olubukola Abiona; Barney S Graham; Jason S McLellan Journal: Science Date: 2020-02-19 Impact factor: 47.728