The coronavirus SARS-CoV-2 main protease, Mpro, is conserved among coronaviruses with no human homolog and has therefore attracted significant attention as an enzyme drug target for COVID-19. The number of studies targeting Mpro for in silico screening has grown rapidly, and it would be of great interest to know in advance how well docking methods can reproduce the correct ligand binding modes and rank these correctly. Clearly, current attempts at designing drugs targeting Mpro with the aid of computational docking would benefit from a priori knowledge of the ability of docking programs to predict correct binding modes and score these correctly. In the current work, we tested the ability of several leading docking programs, namely, Glide, DOCK, AutoDock, AutoDock Vina, FRED, and EnzyDock, to correctly identify and score the binding mode of Mpro ligands in 193 crystal structures. None of the codes were able to correctly identify the crystal structure binding mode (lowest energy pose with root-mean-square deviation < 2 Å) in more than 26% of the cases for noncovalently bound ligands (Glide: top performer), whereas for covalently bound ligands the top score was 45% (EnzyDock). These results suggest that one should perform in silico campaigns of Mpro with care and that more comprehensive strategies including ligand free energy perturbation might be necessary in conjunction with virtual screening and docking.
The coronavirusSARS-CoV-2 main protease, Mpro, is conserved among coronaviruses with no human homolog and has therefore attracted significant attention as an enzyme drug target for COVID-19. The number of studies targeting Mpro for in silico screening has grown rapidly, and it would be of great interest to know in advance how well docking methods can reproduce the correct ligand binding modes and rank these correctly. Clearly, current attempts at designing drugs targeting Mpro with the aid of computational docking would benefit from a priori knowledge of the ability of docking programs to predict correct binding modes and score these correctly. In the current work, we tested the ability of several leading docking programs, namely, Glide, DOCK, AutoDock, AutoDock Vina, FRED, and EnzyDock, to correctly identify and score the binding mode of Mpro ligands in 193 crystal structures. None of the codes were able to correctly identify the crystal structure binding mode (lowest energy pose with root-mean-square deviation < 2 Å) in more than 26% of the cases for noncovalently bound ligands (Glide: top performer), whereas for covalently bound ligands the top score was 45% (EnzyDock). These results suggest that one should perform in silico campaigns of Mpro with care and that more comprehensive strategies including ligand free energy perturbation might be necessary in conjunction with virtual screening and docking.
Coronaviruses are positive-stranded RNA viruses that infect humans and animals and cause
common and severe respiratory diseases, including severe acute respiratory syndrome (SARS)
and Middle East respiratory syndrome (MERS).[1,2] These viruses rely heavily on functional polypeptides that
are generated by proteolytic cleavage of polyproteins that are translated from viral RNA.
The principal coronavirus proteases responsible for this polypeptide formation are mainly
protease and papain protease. The coronavirusSARS-CoV-2 main protease, Mpro
(henceforth denoted simply as Mpro), has garnered significant attention in the
past year as an enzyme drug target due to the COVID-19 pandemic outbreak.[3] Mpro is a druggable target[4,5] as it is conserved among coronaviruses and has no human
homolog. The first Mpro structures were published in early
2020.[3,6,7] The first crystal structures revealed that the active form of
Mpro is a homodimer containing two protomers, each composed of three domains
(Figure A). The active site in Mpro
is located in a cleft between domains I and II (Figure B) and features a noncanonical catalytic Cys–His dyad. The active site is
composed of four regions: S1′, S1, S2, and S4, with the catalytic Cys145 located in
S1′ and His41 located in S1′ and S2 (Figure C). To date, well over 200 three-dimensional holostructures have been resolved at
a resolution of 3 Å or better.[3,6−9] In these structures,
ligands bind to a variety of binding sites, including covalent and noncovalent binding in
the main catalytic site, noncovalent binding in pockets in between the two proteomers, and
weakly bound at the protein surface (i.e., in between crystallographic homodimer units).
This wealth of structural information makes Mpro an attractive benchmarking
system for testing the ability of docking programs to correctly identify and rank the
correct poses. This becomes particularly important in light of the severity of the COVID-19
pandemic[2] and the significant number of mutant forms of the virus that
are rapidly appearing and might render current and future vaccines ineffective. To date,
Mpro has been studied extensively using ligand docking and screening
tools[10−19] and
computational enzymology tools, such as hybrid quantum mechanics–molecular mechanics
(QM/MM).[20−29]
Figure 1
(A) SARS-CoV-2 Mpro dimer with bound ligand in each active site (PDB ID:
7BQY). The domains in chain A
are colored as follows: N′-finger in bright orange, domain I in pale green,
domain II in light blue, loop L3 in light teal, and domain III in light pink. The ligand
bound to chain A is colored yellow and appears in stick representation. Both chain B and
the ligand bound to it appear in gray. The same color code applies to (B), where the
active site is shown in greater detail. (C) Four conserved subsites are presented, along
with the Cys 145–His 41 dyad (deep teal stick representation). The covalently
bound ligand appears in limon-shade stick representation. The protein appears as a white
surface.
(A) SARS-CoV-2Mpro dimer with bound ligand in each active site (PDB ID:
7BQY). The domains in chain A
are colored as follows: N′-finger in bright orange, domain I in pale green,
domain II in light blue, loop L3 in light teal, and domain III in light pink. The ligand
bound to chain A is colored yellow and appears in stick representation. Both chain B and
the ligand bound to it appear in gray. The same color code applies to (B), where the
active site is shown in greater detail. (C) Four conserved subsites are presented, along
with the Cys 145–His 41 dyad (deep teal stick representation). The covalently
bound ligand appears in limon-shade stick representation. The protein appears as a white
surface.The number of studies targeting Mpro for in silico screening has grown,[30] and it would be of great interest to know in advance how well docking
methods can reproduce the correct ligand binding modes and rank these correctly. Clearly,
current attempts at designing drugs targeting Mpro with the aid of computational
docking are problematic if programs struggle to predict correct binding modes and score
these correctly. This is true even if common docking programs have undergone extensive
testing since each protein target comes with its own challenges due to the complexity of
binding pocket crevices, nature of interactions, and solvent exposure. Therefore, critical
evaluation of how well common docking programs perform for Mpro is important.Since the development of the first automated docking program DOCK,[31−35] a multitude of docking software packages have been
developed, with different physicochemical approximations and algorithmic details. Popular
docking programs in addition to DOCK, include Autodock,[36−38] Autodock Vina,[39] Glide,[40−42] Rosettaligand,[43,44] Gold,[45−47] and
CDocker.[48−51] Widely used search algorithms include molecular dynamics
(MD), Monte Carlo (MC), and genetic algorithms (GA). Common scoring functions include force
field-based energy functions, such as CHARMM,[52−56] AMBER,[57] and OPLS[58,59] (i.e., molecular mechanics, MM), and
knowledge-based scoring functions, such as DRUG-SCORE,[60] IT-SCORE,[61] DSX,[62] CHEMSCORE,[63] and SMoG.[64] Specialized docking programs addressing enzymes,[65,66] such as EnzyDock, have also been
developed.[67] Current challenges for docking methods include protein
flexibility,[68] ligand solvation, and binding-site
hydration.[47,69] Thus
far, several docking approaches have been employed to screen Mpro for potential
drugs in virtual screening and drug repurposing campaigns, including
Glide,[7,10,13,17,18,70−72]
Autodock,[11,13,73,74] Autodock Vina,[11,13,14,19,71,73,75,76]
Surflex,[77] PLANT,[78] DockThor,[76]
fast pulling of ligands,[14] deep docking,[70] algebraic
topology and deep learning,[79] and virtual reality-based docking.[16] However, to the best of our knowledge, no rigorous benchmark study
addressing the ability of such docking tools to reproduce and correctly rank known ligand
binding modes has been published, in spite of the known inherent challenges in
docking.[80−83]In the current work, we tested the ability of several leading docking programs to correctly
identify and score the binding mode of Mpro ligands in 193 crystal structures
(Figure S1). We tested the following docking programs:
Glide,[40−42] DOCK,[31−35] Autodock,[36−38] Autodock
Vina,[39] FRED,[84−87] and EnzyDock.[67] The current
results suggest that care should be taken in applying docking programs to a challenging
protein target such as Mpro.
Methods
Preparing Mpro Structure Database
The available crystal structures of ligand-containing SARS-CoV-2Mpro were
downloaded from the RCSB PDB website (March-December 2020).[88] In total,
we collected 193 different structures, including covalent and noncovalent ligands
(Table S1 and Figure S1). All structures were aligned relative to one
reference structure (PDB ID: 5R84)
for easy comparison. To perform docking, we separated each protein and ligand into
separate files, removing crystal waters, ions, and cosolvents. Missing residues were added
using the Modeller homology program.[89] Hydrogens were added using the
CHARMM simulation platform (using HBUILD) for the protein structure and using Openbabel
for the ligands.[90,91]
Visual inspection was also performed. For systems including only one monomer, the
complementary unit was generated using the crystallographic information included in the
PDB file using CHARMM. Protonation states of His residues were determined based on
hydrogen bonding patterns and knowledge of the chemistry catalyzed by Mpro
(Table S2), and they match the protonation states of key His residues
recently published.[23] All docking simulations described below commenced
with the CHARMM prepared systems.
Clustering of Ligands and Water Molecules
Chemical descriptors were calculated for all ligands from the 193 PDB files using RDKit
libraries in Python. Features thought to be important for ligand binding were chosen.
Specifically, we computed the number of rotatable torsions, molecular weight, number of
H-bond donors and acceptors, number of aromatic rings, the fraction of carbon atoms in
sp3 hybridization (relative to all carbon atoms in the molecule), and log
P values. We applied principal component analysis (PCA) using these
ligands descriptors, followed by k-means clustering to cluster the ligands into groups. We
selected the number of clusters by silhouette analysis of the k-means clustering results.
Ligand clustering was performed using Python 3.7. Density-Based Spatial Clustering of
Applications with Noise (DBSCAN) clustering was performed to analyze the water molecules
from all crystal structures. These water molecules were not included in the docking
studies.
Subsite-Binding Pocket Binding
To classify the binding patterns of all the 193 protein–ligand complexes, we
categorized the ligands as bound on the surface, at the dimer interface, or in the active
site. The latter was characterized according to the subsites S1, S1′, S2, and S4
(Figure C, Table S3).[7,77] A ligand is considered to occupy a binding pocket if any ligand atom
is within 4.0 Å of any pocket atom and also within 3.0 Å of the geometric center
of the pocket (defined as geometric center of all pocket atoms). Moreover, a ligand is
considered to be in the proximity of a binding pocket if any ligand atom is within 4.0
Å of any pocket atom and also within 5.0 Å of the geometric center of the
pocket. The criteria were designed to account for both small and bulky ligands and to
distinguish between binding poses where ligand groups are well docked inside a subpocket
and poses where ligand groups are located at the periphery of a subpocket. Cutoff values
are suitable for various nonbonded interactions (e.g., π–π stacking,
hydrogen bond, ionic, and hydrophobic interactions). The final values were obtained by
trial and error and validated by means of visual inspection. The subsite-binding pocket
occupancy analysis was implemented as a CHARMM[90,91] script.
Docking Protocols
To compare the performance of selected docking programs for use with Mpro
(search algorithm and scoring function), we performed noncovalent docking using
DOCK,[31−35] Autodock
Vina,[39] and FRED[84−87] and noncovalent and covalent docking using
Glide,[40−42]
Autodock,[36−38] and EnzyDock[67] to the systems described above. In all docking simulations described
below, the ligand was fully flexible, while the protein was fixed (except for the
covalently connected complexes, where the appropriate Cys residue was flexible).
Ligand Docking with Glide
Proteins and ligands were prepared using Schrödinger’s Maestro (version
11.4, 2017-4 release) Prep Wiz and LigPrep modules, respectively, with default settings
for docking with Glide. All covalent docking simulations were performed using the CovDock
module available in Glide.[92] For the noncovalent simulations, the grid
was generated using XGlide, which enables creation of different grids in parallel. The
grids were centered around the ligand’s centroid. The dimensions of the enclosing
box and the bounding box were set to 12 × 12 × 12 Å3 and 26
× 26 × 26 Å3, respectively, for all cases. The ligand
stereochemistry was kept during all docking simulations. The number of poses written per
ligand was set to 10,000. The scaling factors of the vdW radii and the partial atomic
charge cutoff were set to the default values 0.80 and 0.15, respectively. Standard
precision (SP) mode was chosen for all ligand docking runs. The selection of the
best-docked ligand structure among the proposed poses is made based on several model
energies implemented with Glide (docking score, Prime energy and E-model energy, and cdock
affinity). Solvent effects were incorporated using MMGBSA. All reported energies herein
used the docking score function for noncovalent docking and the cdock affinity scoring
function for covalent docking as these performed best [i.e., produced the highest number
of top-ranking structures with root-mean-square deviation (rmsd) < 2 Å]. In all
Glide docking simulations (ligand preparation, protein preparation, grid generation,
covalent, and noncovalent docking), the OPLS3 force field[59,93] was used.
Ligand Docking with DOCK (Version 6.9)
Proteins and ligands were prepared for docking simulations using the DockPrep option of
Chimera v.14.[94] The grid was generated according to the center of mass
of the crystal structure ligand with a grid spacing of 0.4 Å. The maximum and minimum
radius of the sphere generated was set to 4 and 1.4 Å. All the spheres within 10
Å of the ligand were selected for docking. The box length surrounding the ligand was
set to 6 Å, that is, the edge of the box from any atom of the ligand was at least 6
Å away, which easily accommodates all the selected spheres.
Ligand Docking with Autodock (Version 4.2)
Proteins and ligand pdbqt files were prepared using standard AutoDock tools
(prepare_flexreceptor4.py and prepare_ligand4.py). These files include Cartesian
coordinates and Gasteiger atomic charges[95] for each atom. AutoDock
employs a united atom method, and thus, no nonpolar hydrogens are present. The center of
mass of the crystallographic ligand was used to determine the center of the grid. AutoDock
uses one grid box to perform the docking calculations, and the dimensions of this box were
set to 37.5 × 37.5 × 37.5 Å3 and the spacing was set to 0.375
Å for all systems. We performed flexible ligand docking into a rigid protein
environment using GA, with default settings. For covalent docking,[96]
each ligand was prepared with the active Cys residue already present in the input file
using AutoDock tools (prepare_receptor4.py and prepare_flexreceptor4.py). For covalent
docking, the ligand flexible torsional angles were presampled using MC simulations with
CHARMM prior to docking.
Ligand Docking with Autodock Vina
Protein and ligand input pdbqt files were prepared in the same way as for Autodock4.2
(see above). The size of the grid was set to 30.0 × 30.0 × 30.0
Å3, and remaining parameters were set to default values.
Ligand Docking with FRED[85−87]
FRED is one of the docking programs available within the OpenEye scientific library. For
the docking process, proteins and ligands were prepared using the graphical user interface
“Make Receptor” provided with OpenEye. FRED creates a potential field around
the binding site by producing a negative image, which complements the shape of the protein
site. This potential field is represented on a contour, which completely surrounds the
ligand. OMEGA, an internal program within OpenEye is used to generate an ensemble of
conformers for each ligand. A total of 200 different conformers were generated for each
ligand for docking, and the 50 lowest energy docked structures were used to select the
best pose in terms of lowest rmsd or Chemgauss energy scoring relative to the crystal
ligand structure. The proteins and ligands were held fixed during the docking process.
Ligand Docking with EnzyDock
Protein and ligand files were prepared as described at the beginning of Methods. CHARMM topology (RTF) and parameters (PRM) files for the ligands were
generated using the CHARMM General Force Field (CGenFF) program.[53,97] For the proteins, CHARMM 36 was
used.[52,54,55] The grid was generated according to the center of mass of the crystal
structure ligand. The grid was generated with a mesh spacing of 0.25 Å and dimensions
of 30.0 Å along each axis. The docking entailed 25 cycles of simulated MC and MD
annealing for 25 differently rotated and MC torsion-sampled ligand configurations.
Settings for noncovalent and covalent docking were identical.
Results and Discussion
Ligand Clustering and the Subsite-Binding Pocket
To better understand the nature of the 193 Mpro complexes prior to discussing
the docking results, we present analyses of the ligands and their binding modes. The
ligands were clustered into seven main groups based on their chemical features by PCA and
k-means clustering. The features of each cluster were normalized, and the average value
for each cluster was calculated (Figure ). The
relative amount of the 193 ligands composing each cluster can be seen in the inserted pie
chart (Figure ).
Figure 2
Normalized radar plot showing various features of each cluster centroid for
SARS-CoV-2 Mpro ligands. Inset: pie plot with the relative fraction of each
cluster among all the ligands studied in this work. The first number is the cluster
name, and the second is the number of ligands in the cluster.
Normalized radar plot showing various features of each cluster centroid for
SARS-CoV-2Mpro ligands. Inset: pie plot with the relative fraction of each
cluster among all the ligands studied in this work. The first number is the cluster
name, and the second is the number of ligands in the cluster.For instance, cluster 6 is characterized by 17 ligands of low molecular weight and high
fraction of carbon atoms with sp3 hybridization, while cluster 3 is composed of
78 low-molecular-weight ligands that are rather rigid and slightly hydrophobic. Cluster 4
has 15 ligands of high molecular weight, flexible chains with sp3carbons, and
several hydrogen bond donors and acceptors, while cluster 7 has medium-molecular-weight
ligands that are highly hydrophobic with aromatic rings, yet has some hydrogen bond donors
and acceptors.Next, we analyzed the binding modes of the ligand clusters in Mpro (Tables and S1). In Table we present the fraction of each cluster that is bound in a specific subpocket
of the active site, at the surface, or at the interface between the two monomers. Note
that ligands can bind in more than one pocket, and hence, the fractions do not add up to
unity for each cluster. Inspection of the binding data clearly shows that ligands of low
molecular weight (clusters 1, 3, 5, and 6) tend to bind at the surface of the protein
(i.e., clusters 3, 5, and 6 are caught in between crystal units) or at the interface
between the homodimer units (cluster 1). Still many low-molecular-weight ligands occupy
pockets S1 and S1′ as these are covalently attached to C145. Ligands rich in
aromatic rings and correspondingly high log P values (cluster 7) tend to
occupy all pockets more than average, specifically sites S1 and S2. This is due to
favorable π–π interactions with F140, H163, and H172 located in S1 or
H41 and Y54 in S2. Another important observation is that ligands more likely to bind to S4
(which is rarely occupied) belong to clusters 2 and 4, which tend to include large,
flexible molecules that are rich in H-bond donors and acceptors.
Table 1
Fraction of Ligands in Each Cluster that are Located in a Specific Active-Site
Pocket, on the Surface, or at the Interface between Monomers in the SARS-CoV-2
Mpro Crystal Structures
S1
S1′
S2
S4
surface
dimer interface
1
0.74
0.32
0.62
0.29
0.06
0.18
2
0.84
0.44
0.74
0.50
0.04
0.00
3
0.42
0.35
0.16
0.05
0.22
0.06
4
0.87
0.67
0.80
0.87
0.00
0.00
5
0.57
0.18
0.25
0.10
0.27
0.00
6
0.24
0.44
0.12
0.00
0.29
0.06
7
0.82
0.50
0.77
0.36
0.09
0.09
Total
0.57
0.38
0.37
0.22
0.17
0.05
We also clustered the water molecules in all crystal structures using DBSCAN clustering.
Following clustering, we removed all waters that overlap any bound ligand (Figure S2). These waters form a set of active site features that can be
included in docking studies (these waters were not included in the current docking
study).
Docking of Ligands in Mpro
We docked all ligands from 193 crystal Mpro structures into their respective
crystal protein structure (Table S4). These crystal structures include ligands bound noncovalently to
the main binding pocket, surface and dimer interface, as well as covalently attached
ligands. In all results below, we present the success rate of different docking programs
in reproducing the crystal bound poses. For DOCK, AutoDock Vina, and FRED, the results
reflect the noncovalent complexes only.In Figure A we show the overall success of all
programs. Glide and EnzyDock reproduce the correct crystal structure pose (rmsd < 2
Å) for over 50% of the structures, with success rates of 64 and 70%, respectively,
while for AutoDock, this rate falls to 40%. However, in many cases, even if the correct
pose is identified, it is not scored as lowest in energy, and the success rate reduces to
33% (Glide), AutoDock (30%), and EnzyDock (35%). The overall success rates of Glide,
AutoDock, and EnzyDock are in part due to the covalent complexes, whose poses are easier
to reproduce than the noncovalent ones. If we analyze the success rates of identifying
only the covalently bound complexes, Glide, AutoDock, and EnzyDock identify the correct
poses 70, 42, and 71% of the cases, while the correct pose is also scored as the best one
in 38, 36, and 45% of the cases (Figure B). For
the noncovalent complexes, Glide, DOCK, AutoDock, AutoDock Vina, FRED, and EnzyDock
identify the correct poses in 55, 61, 37, 29, 46, and 68% of the cases, respectively,
while these are ranked as the lowest energy poses in 26, 20, 24, 14, 14, and 22% of the
cases, respectively (Figure C). Finally, if we
remove the complexes with surface-bound ligands (i.e., ligands bound in between crystal
units), all programs perform significantly better (Figure D). The correct poses are identified (and scored as best) as
follows (%): Glide 74 (39), DOCK 77 (29), AutoDock 48 (35), AutoDock Vina 40 (23), FRED 61
(18), and EnzyDock 80 (35), respectively.
Figure 3
Docking pose prediction success (%) for selected methods for 193 crystal structures
of SARS-CoV-2 Mpro. For each column, the upper part represents the ability
of methods to identify poses with rmsd < 2 Å, while the lower part represents
ability of methods to identify a pose with rmsd < 2 Å as the lowest energy
pose. (A) All 193 crystal structures, (B) 108 crystal structures with covalently bound
ligands, (C) 85 crystal structures with noncovalently bound ligands, (D) 51 crystal
structures with noncovalently bound ligands excluding surface bound ligands.
Docking pose prediction success (%) for selected methods for 193 crystal structures
of SARS-CoV-2Mpro. For each column, the upper part represents the ability
of methods to identify poses with rmsd < 2 Å, while the lower part represents
ability of methods to identify a pose with rmsd < 2 Å as the lowest energy
pose. (A) All 193 crystal structures, (B) 108 crystal structures with covalently bound
ligands, (C) 85 crystal structures with noncovalently bound ligands, (D) 51 crystal
structures with noncovalently bound ligands excluding surface bound ligands.Next, we analyze how the different programs perform as a function of binding site
locations on Mpro. In Figures S3 and 4, we present box plots of the best rmsd
values and the rmsd values for the lowest scoring pose for noncovalently bound ligands,
respectively. All methods struggle with ligands bound at the interface between monomers
and on the protein surface, with Glide and FRED producing the best results. Additionally,
most methods perform better for ligands bound to more than a single pocket (i.e., S1 +
S2), and this trend is particularly clear for Glide and EnzyDock.
Figure 4
Distributions of rmsd values (Å) for the lowest-scoring pose at different sites
in SARS-CoV-2 Mpro for selected noncovalent docking methods. The ligands
were clustered into groups occupying similar regions, and only clusters with more than
seven members are shown.
Distributions of rmsd values (Å) for the lowest-scoring pose at different sites
in SARS-CoV-2Mpro for selected noncovalent docking methods. The ligands
were clustered into groups occupying similar regions, and only clusters with more than
seven members are shown.Similarly, we analyze how the different covalent docking programs perform as a function
of binding site locations on Mpro. In Figures S4 and 5, we present box plots of the best rmsd
values and the rmsd values for the lowest scoring pose for covalently bound ligands,
respectively. Also here, we observe a general trend, where the docking methods perform
better for ligands occupying more pockets.
Figure 5
Distributions of rmsd values (Å) for the lowest-scoring pose at different sites
in SARS-CoV-2 Mpro for selected covalent docking methods. The ligands were
clustered into groups occupying similar regions, and only clusters with more than 22
members are shown.
Distributions of rmsd values (Å) for the lowest-scoring pose at different sites
in SARS-CoV-2Mpro for selected covalent docking methods. The ligands were
clustered into groups occupying similar regions, and only clusters with more than 22
members are shown.
Conclusions
In wake of the growing threat emerging from the SARS-CoV-2 pandemic, the modeling community
has rushed to study a variety of potential pharmaceutical targets. One of these targets,
Mpro, is particularly attractive as this enzyme has no human analogue and is
conserved among coronaviruses. A large number of studies have addressed ligand docking and
virtual screening of ligand libraries against Mpro in search of promising leads.
A prerequisite for such studies is the ability of the docking programs to correctly identify
and score ligand poses. Due to the intense efforts by the scientific community, there is
already a wealth of structural biology information on Mpro, hence enabling
comparative studies of different docking approaches against this target. To date, the
available crystal structures of Mpro include ligands bound covalently and
noncovalently to the main catalytic site, surface, and in between the two monomers. Here, we
studied several leading docking codes, namely, Glide, DOCK, AutoDock, AutoDock Vina, FRED,
and EnzyDock, and evaluate their ability to correctly reproduce and score the crystal
structure ligand configuration for 193 Mpro crystal structures. None of the codes
are able to correctly identify and score the crystal structure in more than 26% of the cases
for noncovalently bound ligands (Glide top performer), whereas for covalently bound ligands,
the top score was 45% (EnzyDock best performer). Additionally, a general trend, where
several of the docking methods (e.g., Glide and EnzyDock) perform better for larger, bulkier
ligands occupying more than a single pocket, is observed. All docking methods struggle with
prediction of small ligands. In the original crystal structures, many of the smaller ligands
are surrounded by numerous explicit water molecules, dimethyl sulfoxide molecules, and
Cl– ions that were removed prior to docking. Thus, these redocking
trends may be ascribed to difficulty in accurately scoring docking poses, where a delicate
balance between intra- and intermolecular terms and solvation terms must be stricken.In conclusion, the current results suggest that one should perform docking studies and
virtual screening campaigns of Mpro with care and that more comprehensive
strategies might be necessary. Such strategies might include initial virtual screening
(e.g., using FRED or AutoDock Vina) or docking (e.g., Glide or EnzyDock), followed by more
rigorous ligand free energy binding calculations[98,99] and in-depth QM/MM studies.[20,24,26,28,29] Inclusion of conserved water molecules, as identified in
this study, may also be of help in guiding the docking process. Indeed, MD simulations have
pointed to several water molecules, as important for Mpro.[11,18,20,24,26,29,77]
Authors: Amr Moustafa; Markus Perbandt; Eva Liebau; Christian Betzel; Sven Falke Journal: Acta Crystallogr F Struct Biol Commun Date: 2022-05-27 Impact factor: 1.072
Authors: Lukas Bucinsky; Dušan Bortňák; Marián Gall; Ján Matúška; Viktor Milata; Michal Pitoňák; Marek Štekláč; Daniel Végh; Dávid Zajaček Journal: Comput Biol Chem Date: 2022-02-26 Impact factor: 3.737
Authors: Léa El Khoury; Zhifeng Jing; Alberto Cuzzolin; Alessandro Deplano; Daniele Loco; Boris Sattarov; Florent Hédin; Sebastian Wendeborn; Chris Ho; Dina El Ahdab; Theo Jaffrelot Inizan; Mattia Sturlese; Alice Sosic; Martina Volpiana; Angela Lugato; Marco Barone; Barbara Gatto; Maria Ludovica Macchia; Massimo Bellanda; Roberto Battistutta; Cristiano Salata; Ivan Kondratov; Rustam Iminov; Andrii Khairulin; Yaroslav Mykhalonok; Anton Pochepko; Volodymyr Chashka-Ratushnyi; Iaroslava Kos; Stefano Moro; Matthieu Montes; Pengyu Ren; Jay W Ponder; Louis Lagardère; Jean-Philip Piquemal; Davide Sabbadin Journal: Chem Sci Date: 2022-02-10 Impact factor: 9.825
Authors: Guillem Macip; Pol Garcia-Segura; Júlia Mestres-Truyol; Bryan Saldivar-Espinoza; María José Ojeda-Montes; Aleix Gimeno; Adrià Cereto-Massagué; Santiago Garcia-Vallvé; Gerard Pujadas Journal: Med Res Rev Date: 2021-10-26 Impact factor: 12.388