Karan Kapoor, Sundar Thangapandian, Emad Tajkhorshid.
Abstract
Proteins can sample a broad landscape as they undergo conformational transitions between different functional states. At the same time, as key players in almost all cellular processes, proteins are important drug targets. Considering the different conformational states of a protein is therefore central to a successful drug-design strategy. Here we introduce a novel docking protocol, termed extended-ensemble docking, pertaining to proteins that undergo large-scale (global) conformational changes during their function. In its application to the multidrug ABC-transporter P-glycoprotein (Pgp), extensive non-equilibrium molecular dynamics simulations employing system-specific collective variables are first used to describe the transition cycle of the transporter. An extended set of conformations (extended ensemble) representing the full transition cycle between the inward- and the outward-facing states is then used to seed high-throughput docking calculations of known substrates, non-substrates, and modulators of the transporter. Large differences are predicted in the binding affinities to different conformations, with compounds showing stronger binding affinities to intermediate conformations compared to the starting crystal structure. Hierarchical clustering of the binding modes shows that all ligands preferentially bind to the large central cavity of the protein, formed at the apex of the transmembrane domain (TMD), whereas only small binding populations are observed in the previously described R and H sites present within the individual TMD leaflets. Based on the results, the central cavity is further divided into two major subsites, the first preferentially binding smaller substrates and high-affinity inhibitors, whereas the second shows a preference for larger substrates and low-affinity modulators. 
These central subsites along with the low-affinity interaction sites present within the individual TMD leaflets may respectively correspond to the proposed high- and low-affinity binding sites in Pgp. We further propose an optimization strategy for developing more potent inhibitors of Pgp, based on increasing an inhibitor's specificity to the extended ensemble of the protein, instead of using a single protein structure, as well as its selectivity for the high-affinity binding site. In contrast to earlier in silico studies using single static structures of Pgp, our results show better agreement with experimental studies, pointing to the importance of incorporating the global conformational flexibility of proteins in future drug-discovery endeavors. This journal is © The Royal Society of Chemistry.
Year: 2022 PMID: 35440993 PMCID: PMC8985516 DOI: 10.1039/d2sc00841f
Source DB: PubMed Journal: Chem Sci ISSN: 2041-6520 Impact factor: 9.825
Fig. 1 Conformational domains targeted by different docking approaches. Single-point docking utilizes a single structure of the protein target, restricting the sampling to a single point (orange dots) in the conformational landscape. Ensemble docking utilizes an ensemble of protein structures, often generated using MD simulations, taking into account thermal fluctuations within a local conformational basin in the vicinity of the starting experimental structure (blue lines). Extended-ensemble docking, the method introduced here, aims at taking into account the full functional cycle of the protein, generated, e.g., through the application of biasing techniques to transition between the major functional states of the protein (green lines).
Fig. 2 Extended-ensemble docking protocol. A flow diagram showing different steps involved in the extended-ensemble docking approach. The approach involves the targeting of an extended ensemble of the protein conformations, generated along its functional cycle, by docking small molecules, followed by clustering of the predicted binding poses for each representative conformation. See Methods for details of each step.
Fig. 3 Docking in the extended ensemble. (A) The extended ensemble of Pgp (with color changing from red to blue between states depending on the position in the trajectory), generated by taking 50 snapshots nearly equally distributed along the CV phase-space defining the IF to OF transition. Molecular docking was carried out in the same docking grid box (shown in green) defined around the TMD of the protein in all conformations. (B) The chemical structures of the 14 compounds selected for docking in the extended ensemble of Pgp are shown. These compounds include known substrates (S), modulators (M) and non-substrates (NS) of Pgp.
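The snapshot selection described in Fig. 3A can be illustrated with a minimal sketch. It assumes a single one-dimensional collective variable recorded per trajectory frame, whereas the CV phase-space used in the paper may be multi-dimensional; the function name `select_snapshots` and its arguments are hypothetical and not taken from the authors' workflow.

```python
def select_snapshots(cv_values, n_snapshots):
    """Return indices of frames whose CV values lie closest to
    n_snapshots evenly spaced targets between min(CV) and max(CV).

    Assumption: one scalar CV value per frame (hypothetical input).
    """
    if not 1 <= n_snapshots <= len(cv_values):
        raise ValueError("n_snapshots must be between 1 and the number of frames")

    lo, hi = min(cv_values), max(cv_values)
    if n_snapshots == 1:
        targets = [lo]
    else:
        step = (hi - lo) / (n_snapshots - 1)
        targets = [lo + i * step for i in range(n_snapshots)]

    chosen = []        # selected frame indices, in target order
    used = set()       # guarantees that no frame is picked twice
    for target in targets:
        best = min(
            (i for i in range(len(cv_values)) if i not in used),
            key=lambda i: abs(cv_values[i] - target),
        )
        chosen.append(best)
        used.add(best)
    return chosen
```

Calling `select_snapshots(cv_values, 50)` on a per-frame CV trace would return 50 distinct frame indices that are nearly equally spaced in CV value rather than in simulation time.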
Fig. 4 Binding pocket predictions during Pgp IF to OF transition. (A) The active binding pockets predicted for the starting IF crystal structure are shown. The binding pocket residues are shown by space filling representations in different colors. (B–G) Binding pockets predicted for snapshots 1, 10, 20, 30, 40 and 50 of the extended ensemble of Pgp, respectively. The large central binding region (blue) in the apex of the DBP shows the highest ligand binding propensity and is present in all protein conformations.
Fig. 5 Binding affinities to the extended ensemble. The highest predicted binding affinities for 4 representative compounds (small substrate: rhodamine; large substrate: doxorubicin; low-affinity modulator: QZ59; high-affinity modulator: zosuquidar) to each conformation in the extended ensemble of Pgp are shown. Data for the other compounds are presented in Fig. S10.† Conformation 0 is the starting IF crystal structure, and the selected conformations along the IF to OF transition pathway are numbered 1 to 50. The predicted binding affinities fluctuate between different protein conformations, with the high-affinity modulator showing the highest binding affinities among all the compounds.
Fig. 6 Clustering of binding modes generated by docking. Clustering of the binding modes for 4 representative compounds to the extended ensemble of Pgp is shown (the results for other compounds are shown in Fig. S12†). Only one representative, IF-like conformation of the protein is shown here for clarity. The main binding sites in the TMDs are marked by colored rectangles (different shades of blue: modulator or M site; green and yellow: Hoechst-binding or H site; red and salmon: rhodamine-binding or R site; purple: extracellular or E site; brown: subsidiary or S site) shown for the first representative compound (rhodamine). Binding clusters (or binding subsites) within the main binding regions are highlighted with colored points (as indicated in the legend at the bottom), representing the heavy atoms of the clustered binding modes (E3 and S2 sites are not shown as they may not represent sites important for substrate binding/transport). The density of points in each cluster represents the cluster population. M1 and M2 subsites at the apex of the TMDs show the highest cluster populations for all compounds.
Distribution (percentage) of binding modes of different compounds to different binding sites observed in the extended ensemble of Pgp
| Class | Compound | M1 | M2 | M3 | H1 | H2 | R1 | R2 | E1 | E2 | E3 | S1 | S2 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Substrates | Colchicine | 42.4 | 22.0 | 1.8 | 0.9 | 0.7 | 2.0 | 3.1 | 0.0 | 15.7 | 0.9 | 8.3 | 2.2 |
| Substrates | Doxorubicin | 22.3 | 40.0 | 5.1 | 5.3 | 0.9 | 8.3 | 3.6 | 0.0 | 9.8 | 1.8 | 1.8 | 1.1 |
| Substrates | Hoechst | 58.5 | 27.1 | 1.1 | 2.2 | 0.0 | 3.8 | 0.9 | 2.4 | 4.0 | 0.0 | 0.0 | 0.0 |
| Substrates | Prazosin | 46.0 | 28.5 | 0.4 | 7.4 | 0.2 | 5.3 | 0.5 | 0.5 | 7.6 | 2.5 | 0.9 | 0.2 |
| Substrates | Progesterone | 71.3 | 24.0 | 0.0 | 0.2 | 0.2 | 0.2 | 0.0 | 0.2 | 3.4 | 0.0 | 0.5 | 0.0 |
| Substrates | Rhodamine | 60.0 | 15.4 | 2.7 | 2.4 | 1.3 | 1.3 | 0.9 | 0.7 | 11.4 | 0.7 | 1.6 | 1.6 |
| Substrates | Verapamil | 75.3 | 16.4 | 0.0 | 1.1 | 0.0 | 1.6 | 0.5 | 0.9 | 2.0 | 0.2 | 1.8 | 0.2 |
| Substrates | Vinblastine | 10.2 | 51.1 | 8.5 | 1.3 | 1.8 | 5.3 | 14.7 | 0.0 | 4.4 | 0.9 | 1.3 | 0.5 |
| Modulators | QZ59 | 16.1 | 54.3 | 7.4 | 2.5 | 0.0 | 4.5 | 4.7 | 0.0 | 1.8 | 2.2 | 4.5 | 2.0 |
| Modulators | Laniquidar | 68.2 | 19.6 | 0.7 | 0.7 | 0.0 | 0.2 | 2.0 | 1.3 | 3.8 | 0.0 | 3.1 | 0.4 |
| Modulators | Tariquidar | 56.2 | 27.2 | 0.5 | 1.3 | 2.9 | 4.0 | 1.8 | 1.8 | 3.6 | 0.0 | 0.5 | 0.2 |
| Modulators | Zosuquidar | 70.0 | 17.9 | 2.2 | 1.3 | 1.1 | 0.0 | 1.4 | 0.5 | 3.8 | 0.0 | 0.5 | 1.3 |
| Non-substrates | Diphenhydramine | 80.2 | 15.4 | 0.0 | 0.4 | 0.0 | 0.7 | 0.0 | 3.3 | 0.0 | 0.0 | 0.0 | 0.0 |
| Non-substrates | Trimethoprim | 28.5 | 31.0 | 1.1 | 10.2 | 0.0 | 9.1 | 0.0 | 1.1 | 14.6 | 1.4 | 3.0 | 0.0 |
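As an illustration of how the tabulated populations separate the compounds by preferred central subsite, the sketch below compares the M1 and M2 percentages listed in the table above. The simple "larger population wins" rule and the names `POPULATIONS` and `preferred_subsite` are assumptions made for this example, not the classification procedure used in the paper.

```python
# (M1 %, M2 %) binding-mode populations copied from the table above.
POPULATIONS = {
    "Colchicine": (42.4, 22.0),
    "Doxorubicin": (22.3, 40.0),
    "Hoechst": (58.5, 27.1),
    "Prazosin": (46.0, 28.5),
    "Progesterone": (71.3, 24.0),
    "Rhodamine": (60.0, 15.4),
    "Verapamil": (75.3, 16.4),
    "Vinblastine": (10.2, 51.1),
    "QZ59": (16.1, 54.3),
    "Laniquidar": (68.2, 19.6),
    "Tariquidar": (56.2, 27.2),
    "Zosuquidar": (70.0, 17.9),
    "Diphenhydramine": (80.2, 15.4),
    "Trimethoprim": (28.5, 31.0),
}


def preferred_subsite(m1_percent, m2_percent):
    """Hypothetical rule: the subsite with the larger population is preferred."""
    return "M1" if m1_percent > m2_percent else "M2"


def compounds_preferring(subsite):
    """Return the alphabetically sorted compounds whose preferred subsite matches."""
    return sorted(
        name
        for name, (m1, m2) in POPULATIONS.items()
        if preferred_subsite(m1, m2) == subsite
    )
```

Under this rule, `compounds_preferring("M2")` returns doxorubicin, QZ59, trimethoprim and vinblastine; the remaining ten compounds show a larger M1 population.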
Fig. 7 Binding cluster energies. The predicted binding affinities in each cluster are shown as a swarm plot for 4 representative compounds (data for the other compounds are shown in Fig. S13†). 1–50 represent protein conformations arising during the IF-OF transition (shown with different colors defined in the legend). Additionally, a boxplot providing the median cluster values, Q1 and Q3 quartiles, as well as minimum (Q1 − 1.5× interquartile range) and maximum (Q3 + 1.5× interquartile range) binding affinity values, is overlaid on top of the swarm plot for each cluster. M1 and M2 binding clusters show the highest populations and display the strongest binding affinities for all compounds.
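The boxplot quantities named in the Fig. 7 caption can be computed as in the following sketch. The function name `box_stats` is hypothetical, and the quartile convention (the "inclusive" method of Python's `statistics.quantiles`) is an assumption, since the caption does not state which convention the plotting software used.

```python
import statistics


def box_stats(affinities):
    """Median, quartiles and 1.5x IQR limits for one binding cluster.

    `affinities` is a list of predicted binding affinities for the poses
    in a single cluster (hypothetical input; at least two values needed).
    """
    q1, median, q3 = statistics.quantiles(affinities, n=4, method="inclusive")
    iqr = q3 - q1  # interquartile range
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "minimum": q1 - 1.5 * iqr,  # lower limit as defined in the caption
        "maximum": q3 + 1.5 * iqr,  # upper limit as defined in the caption
    }
```

Note that the "minimum" and "maximum" here follow the caption's definition as limits derived from the interquartile range, so they need not coincide with the smallest and largest observed values.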
Fig. 8 Frequency of Pgp residues interacting with ligands. The normalized interaction frequencies of Pgp's binding residues for all predicted binding modes of 4 representative compounds are shown (data for the other compounds are shown in Fig. S14†). The residues are considered to interact with the ligand if their heavy atoms are within 4 Å. The orange arrows point to the regions of the protein showing differences in their interaction patterns for different classes of compounds. High-affinity modulators like zosuquidar display the highest interaction frequencies with the binding residues, pointing to a more specific mode of binding.
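The 4 Å heavy-atom criterion in the Fig. 8 caption can be turned into a per-residue interaction frequency as sketched below. Two assumptions are made here: the frequency is normalized by the number of binding modes (the caption does not specify the normalization), and a distance of exactly 4 Å counts as a contact. The function names and the coordinate layout are hypothetical.

```python
import math

CUTOFF = 4.0  # distance cutoff in Å, from the figure caption


def contacting_residues(ligand_atoms, residues, cutoff=CUTOFF):
    """Return labels of residues with any heavy atom within `cutoff`
    of any ligand heavy atom.

    ligand_atoms: list of (x, y, z) heavy-atom coordinates of one pose.
    residues: dict mapping a residue label to its heavy-atom coordinates.
    """
    return {
        label
        for label, atoms in residues.items()
        if any(math.dist(a, b) <= cutoff for a in atoms for b in ligand_atoms)
    }


def interaction_frequencies(poses, residues, cutoff=CUTOFF):
    """Fraction of poses in which each residue contacts the ligand
    (assumed normalization: divide the contact count by the number of poses)."""
    counts = {label: 0 for label in residues}
    for ligand_atoms in poses:
        for label in contacting_residues(ligand_atoms, residues, cutoff):
            counts[label] += 1
    return {label: count / len(poses) for label, count in counts.items()}
```

In practice the coordinates would come from the docked poses and the corresponding protein conformation; this sketch treats the residue coordinates as fixed across poses for brevity.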
Fig. 9 Ligand binding residues identified in Pgp. Binding residues in Pgp showing the highest (top 10) interaction frequencies (from Fig. 8 and S14†) with the tested compounds are shown in color within a cartoon representation of the protein backbone (left) and in stick representation (inset, right). The TM helices are individually labeled in the inset. The binding residues common to all compounds are shown in blue, residues common to binding of small substrates/high-affinity modulators in red, residues common to low-affinity modulators/large substrates in yellow, those common to small substrates/low-affinity modulators in orange, and residues showing preference for only small substrates are shown in green. High-affinity modulators share all binding residues with small substrates and show binding preference for the M1 subsite, whereas low-affinity modulators and large substrates show binding preference for residues forming the M2 subsite, lying below the M1 subsite and partially overlapping with it.
Top 20 binding residues for each compound, calculated based on the highest interaction frequencies
| Compound | Top 10 binding residues | Top 10–20 binding residues |
|---|---|---|
| Colchicine | L64,[ | F299, I302, Y306, F339, F724,[ |
| Doxorubicin | L221, F299, I302, Y306, L335, I336, F339, F979,[ | M68, M295, F332, A338,[ |
| Hoechst | F71, I302, Y306, F332, L335, F339,[ | M68, S218, L221, I336, A338, G342, F724,[ |
| Prazosin | M68, Y306,[ | L64, F71, F299,[ |
| Progesterone | M68, F71, Y306, F332, L335, I336, F339, F728, F974, F979 | F299, I302, Y303, Q721, F724, Y949, L971, M982, A983, V987 (ref. |
| Rhodamine | M68, F71, F332,[ | L64,[ |
| Verapamil | M68, F71, Y306, F332,[ | L64,[ |
| Vinblastine | L221, A225, M295, A298, F299, I302,[ | Y303, L335,[ |
| QZ59 | L221, M295,[ | G222, A225, I336,[ |
| Laniquidar | M68, F71, Y306, F332, L335, I336, F728, Y949, F974, F979 | L64, M67, F299, I302, F339, F724, M945, M982, A983, Q986 |
| Tariquidar | M68,[ | L64,[ |
| Zosuquidar | M68, F71, F332,[ | L64,[ |
| Diphenhydramine | M68, F71, F332, F724, F728, Y949, F953, L971, F974, F979 | M67, Y303, Y306, L335, I336, F339, A952, V970, S975, M982 |
| Trimethoprim | F299, I302, Y303, Y306, F339, Q721, Q834, F979, A983, V987 | M68, F332, L335, I336, N717, G718, L720, F724, F766, Q986 |
No data available.
Mutagenesis.
Cysteine-scanning mutagenesis.
Photoaffinity labeling.
X-ray crystal structure.
CryoEM structure.
Fig. 10 Differentiating between different classes of ligands. The binding site preference of different compounds was evaluated in terms of (A) the predicted binding affinities calculated for the extended ensemble in the M1 subsite, and (B) the respective populations in the M1 subsite. Ranking the compounds based on their binding affinities places the high-affinity modulators and the non-substrates at the two extremes, with low-affinity modulators and substrates lying between them. Comparison of the binding site population in the M1 subsite further distinguishes the high-affinity modulators and small substrates (showing high populations in the M1 subsite) from low-affinity modulators and large substrates (showing higher populations in the M2 subsite instead).
Fig. 11 Classification of ligand binding sites in Pgp identified by extended ensemble docking. (A) Different binding sites are shown in representative IF (left) and OF (right) conformations. The two TMD leaflets are shown in blue (TMD1) and pink (TMD2), respectively, and the NBDs are shown in green. The major substrate-modulator binding site (M1), as well as the low-affinity modulator/large substrate binding site (M2) are observed throughout the conformational transition of Pgp, whereas extracellular sites (E1 and E2) are only observed in the OF-like states. (B) Combining the predicted binding affinities of all compounds in the different binding sites obtained in the extended ensemble (Fig. 7 and S13†), we observe differences in the relative binding affinities of the poly-specific interaction sites (H/R/S), modulation sites (M1/M2) and extracellular sites (E). These differences may facilitate the transport of the molecules from the inside to the outside of the cell.