Jessie Low Gan, Dhruv Kumar, Cynthia Chen, Bryn C Taylor, Benjamin R Jagger, Rommie E Amaro, Christopher T Lee.
Abstract
The discovery of new drugs is a time-consuming and expensive process. Methods such as virtual screening, which can filter out ineffective compounds from drug libraries prior to expensive experimental study, have become popular research topics. As the computational drug discovery community has grown, organizations such as the Drug Design Data Resource (D3R) have begun hosting blinded grand challenges to benchmark advances in methodology and identify the best methods for ligand pose prediction, ligand affinity ranking, and free energy calculations. Such open challenges offer a unique opportunity for researchers to partner with junior students (e.g., high school and undergraduate) to validate basic yet fundamental hypotheses considered to be uninteresting to domain experts. Here, we, a group of high school-aged students and their mentors, present the results of our participation in Grand Challenge 4, where we predicted ligand affinity rankings for the Cathepsin S protease, an important protein target for autoimmune diseases. To investigate the effect of incorporating receptor dynamics on ligand affinity rankings, we employed the Relaxed Complex Scheme, a molecular docking method paired with molecular dynamics-generated receptor conformations. We found that Cathepsin S is a difficult target for molecular docking, and we explore some advanced methods such as distance-restrained docking to try to improve the correlation with experiments. This project has exemplified the capabilities of high school students when supported with a rigorous curriculum, and demonstrates the value of community-driven competitions for beginners in computational drug discovery.
Keywords: Computational biophysics; Drug discovery; Ensemble docking; Molecular dynamics; Restrained docking
Year: 2022 PMID: 35199221 PMCID: PMC8907095 DOI: 10.1007/s10822-021-00433-2
Source DB: PubMed Journal: J Comput Aided Mol Des ISSN: 0920-654X Impact factor: 3.686
Fig. 1 Workflow of the ensemble docking approach. A PDB file was selected and simulated using molecular dynamics. The resultant trajectory was clustered using six different methods, and cluster centroids were extracted as representative structures. Ligand SMILES were prepared as 3D structures and conformers were generated. Molecular docking of ligands was performed with Glide to the cluster centroids and the crystal ensemble. Pose scores were used to generate rank orderings and Kendall’s τ values when compared to the experimental rank ordering
Fig. 2 Apo molecular dynamics (MD) clustering results. (A) Binding Atoms definition for Clustered by Binding Atoms (CBA) centroids, defined by taking all atoms within 2 Å of docked poses of a ligand from the D3R dataset (‘CatS_2’, the second ligand in the dataset) from Glide apo blind docking. The crystal structure protein is depicted in NewCartoon and colored teal, while the binding atoms are represented by both red spheres and a transparent red surface, visualized in Visual Molecular Dynamics (VMD) [80, 81]. (B) The pairwise root-mean-square deviations (RMSDs) of the binding atoms of the crystal structure and all 10 centroid structures from each clustering method are depicted in a heatmap. The centroids obtained from clustering span a range of RMSDs and are therefore structurally variable. (C) MD clustering extracts various centroid structures, and different clustering methods yield different conformations. The RMSF of the 10 centroids extracted from each clustering method, shown as the relative thickness and color, was calculated with MDTraj [71] and visualized using PyMOL [73]. The orientation of the protein is the same in parts A and C
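The pairwise RMSD comparison behind the heatmap in panel B can be illustrated with a minimal sketch. It assumes the structures are already superposed on a common frame and that each structure is a list of (x, y, z) coordinates in Å for the same binding atoms in the same order. The function names `rmsd` and `pairwise_rmsd` and the toy coordinates are hypothetical and are not taken from the authors' workflow.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two pre-aligned coordinate sets (no superposition is done)."""
    if len(coords_a) != len(coords_b):
        raise ValueError("both structures must contain the same atoms")
    squared = sum(
        (a - b) ** 2
        for atom_a, atom_b in zip(coords_a, coords_b)
        for a, b in zip(atom_a, atom_b)
    )
    return math.sqrt(squared / len(coords_a))


def pairwise_rmsd(structures):
    """Symmetric matrix of RMSDs between every pair of structures."""
    n = len(structures)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            matrix[i][j] = matrix[j][i] = rmsd(structures[i], structures[j])
    return matrix


# Toy input (hypothetical): a "crystal" structure and two "centroids",
# each with two binding atoms.
crystal = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
centroid_1 = [(1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]  # shifted 1 Å along x
centroid_2 = [(0.0, 2.0, 0.0), (1.0, 2.0, 0.0)]  # shifted 2 Å along y
heatmap_values = pairwise_rmsd([crystal, centroid_1, centroid_2])
```

Each row of `heatmap_values` corresponds to one structure, so the matrix can be passed directly to a plotting routine to draw a heatmap of this kind.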
Fig. 3 Kendall’s τ values for ligand rankings based on minimum scores from Glide docking to apo MD centroids, compared to a random rank ordering distribution. ‘XTAL Ens.’ indicates the crystal ensemble results. The probability distribution function is graphed from the Kendall’s τ values of 10,000 random ligand rank orderings. The distribution has μ = 0 and
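The comparison against random rank orderings described in the Fig. 3 caption can be sketched as follows. This is a minimal illustration, not the authors' code: it uses a plain-Python Kendall's τ-b (the tie-aware variant, which is an assumption about the variant used), and the names `kendall_tau` and `random_tau_distribution` and the placeholder ligand values are hypothetical.

```python
import random
from itertools import combinations


def kendall_tau(x, y):
    """Kendall's tau-b between two equal-length lists of scores or ranks."""
    concordant = discordant = only_x_tied = only_y_tied = 0
    for i, j in combinations(range(len(x)), 2):
        dx, dy = x[i] - x[j], y[i] - y[j]
        if dx == 0 and dy == 0:
            continue  # tied in both lists: contributes to neither term
        if dx == 0:
            only_x_tied += 1
        elif dy == 0:
            only_y_tied += 1
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    untied = concordant + discordant
    denom = ((untied + only_x_tied) * (untied + only_y_tied)) ** 0.5
    return (concordant - discordant) / denom if denom else 0.0


def random_tau_distribution(experimental, n_samples=10_000, seed=0):
    """Tau values obtained by comparing random orderings to the experiment."""
    rng = random.Random(seed)
    shuffled = list(experimental)
    taus = []
    for _ in range(n_samples):
        rng.shuffle(shuffled)
        taus.append(kendall_tau(experimental, shuffled))
    return taus


# Placeholder "experimental" values for 20 hypothetical ligands.
experimental = [float(k) for k in range(20)]
null_taus = random_tau_distribution(experimental, n_samples=2_000)
null_mean = sum(null_taus) / len(null_taus)
```

A docking protocol's τ can then be judged by where it falls relative to `null_taus`, for example by the fraction of random orderings that score at least as high.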
Kendall’s τ for all ligand rankings. The Kendall’s τ values for the initial Glide docking fluctuate slightly across scoring schemes but show no substantial improvement
| Docking function | Scoring method | XTAL | TICA | PCA | GROMOS | TICA CBA | PCA CBA | GROMOS CBA |
|---|---|---|---|---|---|---|---|---|
| XTAL Ens. AB | Minimum | 0.23 | | | | | | |
| | W. Avg. | – | | | | | | |
| | Avg. | 0.27 | | | | | | |
| Glide SP-AB | Minimum | 0.20 | 0.18 | 0.18 | 0.22 | 0.12 | 0.28 | 0.12 |
| | W. Avg. | – | 0.21 | 0.20 | 0.18 | 0.17 | 0.24 | 0.21 |
| | Avg. | – | 0.20 | 0.21 | 0.25 | 0.20 | 0.24 | 0.23 |
| Glide SP-AR | Minimum | 0.13 | 0.14 | 0.13 | 0.13 | 0.11 | 0.11 | 0.09 |
| | W. Avg. | – | 0.08 | 0.07 | 0.10 | 0.09 | 0.05 | 0.07 |
| | Avg. | – | 0.12 | 0.07 | 0.09 | 0.07 | 0.06 | 0.09 |
| Glide XP-AB | Minimum | 0.20 | 0.11 | 0.11 | 0.12 | 0.11 | 0.24 | 0.14 |
| | W. Avg. | – | 0.10 | 0.08 | 0.08 | 0.11 | 0.17 | 0.15 |
| | Avg. | – | 0.11 | 0.08 | 0.07 | 0.12 | 0.19 | 0.17 |
| Glide XP-AR | Minimum | 0.13 | 0.14 | 0.10 | 0.12 | 0.11 | 0.09 | 0.13 |
| | W. Avg. | – | 0.10 | 0.07 | 0.04 | 0.07 | 0.03 | 0.11 |
| | Avg. | – | 0.11 | 0.07 | 0.06 | 0.04 | 0.04 | 0.08 |
| Glide SP-HB | Minimum | 0.09 | 0.17 | 0.14 | 0.18 | 0.23 | 0.23 | 0.18 |
| | W. Avg. | – | 0.20 | −0.01 | 0.17 | 0.24 | 0.14 | 0.20 |
| | Avg. | – | 0.22 | −0.01 | 0.21 | 0.23 | 0.18 | 0.21 |
| Glide SP-HR | Minimum | 0.12 | 0.18 | 0.13 | 0.11 | 0.16 | 0.17 | 0.15 |
| | W. Avg. | – | 0.13 | 0.09 | 0.08 | 0.15 | 0.11 | 0.13 |
| | Avg. | – | 0.15 | 0.11 | 0.12 | 0.14 | 0.13 | 0.14 |
Here we show the Kendall’s τ from rank orderings produced through various docking functions, clustering methods, and scoring schemes. Docking functions are labeled as follows: XTAL Ens., crystal ensemble; SP, Glide Standard Precision docking; XP, Glide Extra Precision docking; A, apo structure; H, holo structure; B, blind docking; R, restrained docking. We experimented with these scoring schemes to test whether a particular method of combining scores across each ensemble would better represent the protein binding mechanisms and improve rank ordering. The scoring schemes were the Minimum, Weighted Average (W. Avg.), and Average (Avg.)
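The three scoring schemes named in the table note can be sketched as below. This is an illustration under stated assumptions rather than the authors' implementation: lower (more negative) docking scores are treated as more favorable, the weighted average is assumed to use one weight per receptor structure (for example a cluster population, which this section does not specify), and all names and numbers are hypothetical.

```python
def ensemble_score(scores, scheme="minimum", weights=None):
    """Combine one ligand's docking scores across an ensemble of receptors."""
    if scheme == "minimum":
        return min(scores)  # best (lowest) score over all receptor structures
    if scheme == "average":
        return sum(scores) / len(scores)
    if scheme == "weighted_average":
        # Assumption: one non-negative weight per receptor structure.
        if weights is None or len(weights) != len(scores):
            raise ValueError("need one weight per receptor structure")
        return sum(w * s for w, s in zip(weights, scores)) / sum(weights)
    raise ValueError(f"unknown scheme: {scheme}")


def rank_ligands(score_table, scheme, weights=None):
    """Order ligand names from most to least favorable combined score."""
    combined = {
        ligand: ensemble_score(scores, scheme, weights)
        for ligand, scores in score_table.items()
    }
    return sorted(combined, key=combined.get)


# Hypothetical scores for three ligands docked to three centroids.
score_table = {
    "lig_a": [-7.0, -5.0, -6.0],
    "lig_b": [-6.5, -6.5, -6.5],
    "lig_c": [-4.0, -8.0, -3.0],
}
weights = [0.5, 0.2, 0.3]  # hypothetical per-centroid weights

by_minimum = rank_ligands(score_table, "minimum")
by_average = rank_ligands(score_table, "average")
by_weighted = rank_ligands(score_table, "weighted_average", weights)
```

In this toy input the minimum scheme favors `lig_c`, which binds well to a single centroid, while both averaging schemes favor `lig_b`, which scores consistently across the ensemble. The resulting orderings are what would be compared with the experimental ranking via Kendall’s τ.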
Fig. 4 Docking pose analysis shows that a distance restraint improves pose accuracy. (A) Cocrystal pose (PDB ID: 5QC4 [15]). Ligand carbons are pink; ligand common core carbons are yellow; key binding residues PHE71, VAL163, and PHE212 are green. (B) Ligand CatS 259 of the crystal ensemble docking: an ideal docking pose in an initial cocrystal structure. (C) Ligand CatS 118 of the SP apo blind crystal docking: ideal pose most similar to the cocrystal structure. (D) Ligand CatS 363 of the SP apo blind crystal docking: some docked ligands show a flipped core binding mode that is less common but can be found in some available cocrystals [11]. (E) The RMSDs of the ligand core for each pose in each Glide docking method show that blind poses were concentrated farther from the cocrystal position than poses from the ligand-core-restrained docking. Each violin is composed of all minimum poses for each clustering method which contributed to the final rank ordering and the crystal structure poses, totaling per violin. Method acronyms: SP, Glide Standard Precision docking; XP, Glide Extra Precision docking; A, apo structure; H, holo structure; B, blind docking; R, restrained docking. The median is represented in white, the interquartile range is shown in black, and the minimum and maximum values are shown as whiskers. (F) Ligand CatS 23 of the SP apo restrained PCA docking: when the ligand is restrained, it can be unnaturally docked in receptors that are dissimilar to the cocrystal, such as here, where PHE71 is in a different configuration