Literature DB >> 28293633

Combined Ligand/Structure-Based Virtual Screening and Molecular Dynamics Simulations of Steroidal Androgen Receptor Antagonists.

Yuwei Wang1, Rui Han1, Huimin Zhang1, Hongli Liu1, Jiazhong Li1, Huanxiang Liu1, Paola Gramatica2.   

Abstract

The antiandrogens, such as bicalutamide, targeting the androgen receptor (AR), are the main endocrine therapies for prostate cancer (PCa). But as drug resistance to antiandrogens emerges in advanced PCa, there presents a high medical need for exploitation of novel AR antagonists. In this work, the relationships between the molecular structures and antiandrogenic activities of a series of 7α-substituted dihydrotestosterone derivatives were investigated. The proposed MLR model obtained high predictive ability. The thoroughly validated QSAR model was used to virtually screen new dihydrotestosterones derivatives taken from PubChem, resulting in the finding of novel compounds CID_70128824, CID_70127147, and CID_70126881, whose in silico bioactivities are much higher than the published best one, even higher than bicalutamide. In addition, molecular docking, molecular dynamics (MD) simulations, and MM/GBSA have been employed to analyze and compare the binding modes between the novel compounds and AR. Through the analysis of the binding free energy and residue energy decomposition, we concluded that the newly discovered chemicals can in silico bind to AR with similar position and mechanism to the reported active compound and the van der Waals interaction is the main driving force during the binding process.

Entities:  

Mesh:

Substances:

Year:  2017        PMID: 28293633      PMCID: PMC5331419          DOI: 10.1155/2017/3572394

Source DB:  PubMed          Journal:  Biomed Res Int            Impact factor:   3.411


1. Introduction

According to the latest World Cancer Report 2014 [1], prostate cancer (PCa) has become the second most common cancer among men in the world. The morbidity rate of PCa has reached 15%, which is merely 1.7% lower than the leading lung cancer. It is reported that about 1100,000 people were diagnosed as new PCa patients in 2012 [2]. Additionally researchers pointed out that prostate cancer is not the privilege of men; women have similar prostate tissue, which also has the risk of cancer [3]. The androgen receptor (AR), a ligand inducible transcription factor in the nuclear hormone receptor super family [4], plays a critical role in the development and progress of PCa. Natural hormone testosterone (T) and dihydrotestosterone (DHT), known as androgens, are the endogenous ligands of AR. When bound to AR, androgens play significant roles in the sexual development, function, and musculoskeletal growth of male. The main mechanism of androgen action is to regulate the gene expression by means of binding to AR, changing the level of specific proteins in cells, and controlling cell behavior [5]. Therefore, a rational approach to cure PCa is the use of antiandrogens to prevent the interaction of T or DHT with AR. At present, androgen receptor antagonists, such as bicalutamide and flutamide, are used as main hormone therapies for prostate cancer [6]. Although these antiandrogens exhibit good efficacy in many cases and comprise an important part of effective therapeutics [7-10], the emergence of recurrent and metastatic forms of castration-resistant PCa (CRPC) becomes a major challenge, with a median survival of only 1~2 years [11]. A possible reason is that these antiandrogens have partial agonistic activities at high concentration in vitro [12]. Therefore, the discovery of new AR antagonists with high antiandrogen activities is highly expected. Here, in this study, to aid the research and development of steroidal antiandrogens, we investigated the relationships between a series of 7α-substituted dihydrotestosterone derivatives and corresponding antiandrogen activities. The vital features related to the bioactivities were explored, and a linear quantitative structure-activity relationship (QSAR) model was established according to OECD principles [13], using the QSARINS program [14, 15]. Then the QSAR model, thoroughly and strictly validated by various internal and external validation techniques, is used to virtually screen new dihydrotestosterones, without experimental bioactivities, downloaded from PubChem database [16]. Besides, molecular docking and molecular dynamics (MD) simulations are used to study the possible binding mode of compounds owning high in silico activities with androgen receptor. At last, the most active compounds with good binding affinities to AR, as highlighted by the Insubria graph [17], are proposed for experimental research group to test the antiandrogen activities in the future.

2. Materials and Methods

2.1. Data Set

The success of any QSAR model depends on accurate and clean training data, proper representative descriptor selection methods, suitable statistical methods, and, most critically, both internal and external validation of resulting methods [18, 19]. Here, in this work, a set of 36 7α-substituted dihydrotestosterones derivatives were taken from literatures [20, 21]. The skeleton structure of these derivatives is shown in Figure 1, in which R group represents amine, carboxylic acids, and halogens, and so forth.
Figure 1

The skeleton structure of 7α-substituted dihydrotestosterones derivatives.

These molecules were divided into a training set and a prediction set according to the structure diversity in QSARINS. Finally, 29 compounds were included in the training set and 7 compounds were in the prediction set (prediction set a). The experimental values, half maximal inhibitory concentration (IC50) expressed in nM, were converted into negative logarithmic units marked as pIC50, which was used as dependent variables in the QSAR analyses. The studied molecular structures and corresponding antiandrogen activities were listed in Table 1.
Table 1

The studied compounds and corresponding experimental and predicted pIC50 values.

NumberStructureExperimental pIC50Predicted pIC50
1a 6.045.91
2a 6.146.15
3a 5.705.81
4 6.776.88
5 6.186.23
6a 6.526.71
7 5.246.76
8 6.075.95
9 6.606.53
10 6.386.27
11 6.046.06
12a 6.486.48
13 6.285.80
14 5.545.47
15 6.476.17
16 5.475.81
17a 5.745.51
18 5.895.80
19 5.465.61
20 5.966.35
21a 6.386.12
22 6.386.07
23 6.726.64
24 6.526.46
25 5.805.84
26 6.146.30
27 6.526.46
28 6.316.48
29 6.356.18
30 6.326.06
31 6.035.80
32 5.745.60
33 5.205.50
34 5.185.24
35 6.706.59
36 6.286.59
37b 6.026.38
38b 5.665.49

aThe prediction set a; bthe prediction set b (Bradbury et al., 2011).

2.2. Descriptors Calculation

To describe a molecule, the molecular structures were firstly sketched in HyperChem program [22] and then were optimized to the minimum energy conformation by using AM1 method. After minimization, we submit the structures to DRAGON 5.5 software [23] to calculate 2914 descriptors including zero-, one-, two-, and three-dimensional (0D, 1D, 2D, and 3D), charge descriptors, and molecular properties. The related theories of the molecular descriptors are provided by DRAGON software, and the calculation procedure is clarified in detail, in the Handbook of Molecular Descriptors [24]. In order to facilitate the successive feature selection process, the constant and near constant descriptors were removed. Besides, if pairwise correlation of two descriptors is larger than 0.98, the one showing the highest pairwise correlation with others will be excluded. Finally, 358 descriptors remained for the next variable selection process.

2.3. QSAR Model Generation

After descriptor calculation, genetic algorithm (GA) implemented in QSARINS software was used to select descriptors. The final model was built by using MLR method based on the selected descriptors, named GA-MLR. The first step of GA is to produce a set of solutions randomly which is called initial population. Each solution, a model based on the contained descriptors by using multiple linear regressions method, is called a chromosome. Subsequently, the fitness function, Friedman LOF, is used to evaluate the fitness of these individuals:Here, SSE represents the sum of squares of errors, c is the number of basis function, d is the smoothness factor (default 0.5), p is the number of features in the model, and n is the number of samples for model construction. In the successful selection stage, the fittest individuals evolve to the next generation. Then crossover and mutation operators were performed to generate new individuals. Finally a new population is formed consisting of the fittest chromosomes. The above evolution continues until the stop conditions are satisfied. The related parameters that control the GA performance are list as follows: population size (200), maximum generations (10000), and mutation probability (0.05).

2.4. Model Validation

QSARINS is based on GA-MLR method and performed various tools to a rigorous internal and external validation, based on the different validation criteria, as well as for the check of model applicability domain. The robustness and stability of the built model were validated by several statistical parameters, such as determination coefficient (R2), leave-one-out (LOO) cross-validation QLOO2, and root mean squared error (RMSE). Besides, leave-many-out (LMO) cross-validation method was also performed and QLMO2 was reported. Randomization technology, by reordering the independent variable, was used to exclude the possibility of the chance correlation. Generally, the correlation coefficient of the built QSAR model should exceed Y randomized generated model. After Y scrambling was carried out with iterations of 5,000, the average value of squared correlation coefficient of the randomized models R2 and QLOO2 was reported. A good QSAR model should also have satisfactory predictive ability. The best way to evaluate the predictive ability of a model is its validation by new compounds, called prediction set, which do not participate in the process of model building. After the activities of the prediction set samples were predicted, the agreement between the experimental and predicted values was calculated as a measure of a QSAR model quality. Here we adopt several ways to calculate this agreement, Q2, Q2 [25], Q2 [26], and CCC [27-29]. All the above external validation parameters were calculated in the software QSARINS and were combined to evaluate the predictive ability of the proposed model.

2.5. Applicability Domain

To validate the practical applicability of a model to a new chemical, the applicability domain (AD), a theoretical domain which is defined by means of the selected descriptors in the process of modeling, should be defined properly. In this research, the AD was quantitatively assessed by the leverage approach [30, 31]. The leverage (hat, h) was calculated by h = x(XX)x  (i = 1,…, m), where x was the descriptor row-vector of the query compound i and X was the n∗p matrix of the training set (p is the number of model descriptors). The limit of model domain was quantitatively defined by the leverage cutoff (h), set as 3(p + 1)/n. A leverage greater than h means that the query was outside of the model structural AD, so the predictions were extrapolations of the model and could be less reliable. The AD for chemicals with experimental data was verified by the Williams plot, where the hat values versus the standardized residuals were plotted, while the AD for chemicals without experimental data, which were analyzed in the virtual screening, was verified by the Insubria graph where the hat values were plotted versus the predicted responses [14, 18].

2.6. Virtual Screening of Potent Steroidal Antiandrogens

To explore more 7α-substituted dihydrotestosterones and to find similar derivatives with high antiandrogen activities, the studied skeleton structure was used as a query to search PubChem database for new dihydrotestosterones, without experimental bioactivities. Then the established MLR model, after thoroughly being validated internally and externally, was used to predict the antiandrogenic activities of these new dihydrotestosterones, verifying the AD. Besides, molecular docking was employed to investigate the possible interaction mechanisms of the samples owning high in silico antiandrogenic activities with AR. Particularly, comprehensively considering the docking speed and accuracy, LigandFit, which is commonly used as a flexible docking method executed in the commercial software Discovery Studio 2.5 [32], was applied into the progress of structure-based virtual screening. The protein structure of AR was firstly downloaded from RCSB Protein Data Bank [33] (PDB entry code: 1T65) and imported in docking process. All ligands and water molecules were removed at first, the charge and polar hydrogen atoms were added, and the incomplete residues were corrected.

2.7. Molecular Dynamics (MD) Simulations

The molecular dynamics (MD) simulations were carried out using the Amber 14 software package [34]. MD is a commonly used methodology in exploring the interaction between ligand and protein. We have investigated the interaction mechanisms of R-bicalutamide/S-1 with WT/W741L AR using molecular dynamics simulations [35]. The docked structures of AR (PDB ID: 1T65) with the reported most active compound number 4 and novel chemicals with high in silico activities were used as the initial structures for MD simulations. During the process of docking, taking into consideration the fact that these residues collide significantly with the compounds, Helix 12 of AR was removed in the model as executed in the literature [36]. All missing hydrogen atoms of the AR were added by the LEaP module of the Amber 14 package. To maintain the electroneutrality of all the studied complexes, the appropriate number of chloride counterions was added. Then each complex was immersed into a cubic periodic box of TIP3P water model [37] with at least 10 Å distance around the complex. For the ligand, the GAFF parameter assignments [38] were made by using Antechamber program and the partial charges were assigned by using the AM1-BCC method [39]. Amber 14 package and the Amberff03 force field were used for all molecular dynamics simulations. Sander program was carried out for the energy minimization and equilibration protocol. First, energy minimization of four complexes was done through three stages, using the steepest descent method switched to a conjugate gradient every 2500 steps for a total of 5000 steps with a nonbonded cutoff of 10 Å. In the first stage, to enable the added TIP3P water molecules to adjust to their proper orientations, the AR and ligand were restrained with 5.0 kcal mol−1 Å−2. In the second stage, to enable the AR to find a better way of accommodating ligand, the protein backbone was restrained with 3.0 kcal mol−1 Å−2. In the third stage, the entire solvated system was minimized without any restraint. Additionally, gradual heating, density, and equilibration protocols were performed. During the 100 ps heating procedure, the system was gradually heated from 0 to 310 K, and then the density was at 310 K for 400 ps, and at last the equilibration was at 310 K for 400 ps. Afterwards, four 20 ns production MD simulations were carried out with the PMEMD program without any restraints in the isothermal isobaric ensemble (NPT, P = 1 atm, T = 310 K) MD. The time step was set at 2 fs. 10 Å cutoff was applied to treat nonbonding interactions. During the simulations periodic boundary conditions were employed and all electrostatic interactions were calculated using the particle mesh Ewald (PME) method. The SHAKE algorithm was used to restrain all bond lengths containing hydrogen atoms. All of the coordinate trajectories were recorded every 2 ps throughout all MD runs. To analyze the energy and structure, a total of 500 snapshots of the simulated structures stripped in the last 2 ns stable MD production trajectory at 4 ps intervals were extracted.

2.8. Binding Free Energy Calculations

For each protein-ligand complex, the binding free energy was analyzed by the MM/GBSA method [40]. To compare the AR binding free energies with different ligands, MM/GBSA calculation was applied to the 500 snapshots extracted from the final 2 ns of the MD trajectories. The total free energy of binding free energy was composed of the following molecular species (complex):where Gcomplex, Gprotein, and Gligand are the free energy of complex, receptor, and ligand, respectively. The free energy for each species (complex, ligand, or receptor) can be decomposed into a gas phase energy (ΔEMM), a solvation-free energy (ΔGsol), and an entropy term (TΔS).where the ΔEMM is the sum of the internal energy of bonds, angle, and torsion (ΔEval), electrostatic interaction energy (ΔEele), and van der Waals interaction energy (ΔEvdw). ΔGsol is solvation-free energy and can be divided into two parts, the polar solvation-free energy (ΔGp) and the nonpolar solvation-free energy (ΔGnp). The polar solvation-free energy ΔGp is determined by generalized Born (GB) equation. The values of the dielectric constant for solute and solvent were set as 1 and 80. ΔGnp is the nonpolar solvation contribution and was calculated with constants 0.0072 kcal mol−1 Å−2 for surface tension proportionality constant γ and 0.92 kcal mol−1 Å−2 for the nonpolar free energy for a point solute β. SASA is the solvent accessible surface area and is determined by recursively approximating a sphere around each atom, starting from icosahedra (ICOSA method). TΔS is the entropy term, including the translational, rotational, and vibrational terms of the solute molecules.

2.9. Energy Decomposition

Furthermore, to obtain the contribution of each residue to the binding process, we performed binding free energy decomposition. The MM/GBSA approach was used to calculate the per-residue free energy decomposition, which is based on the same 500 snapshots we have extracted from the last 2 ns of the stable MD trajectory.

2.10. Normal Mode Calculation

Entropy was analyzed by normal mode with AMBER14 NMODE module. Due to the high computational cost in the entropy calculation, 50 snapshots were extracted from the last 2 ns trajectory of the simulation with 40 ps time intervals.

3. Results and Discussion

3.1. The Linear MLR Model

The training set samples, 29 compounds as listed in Table 1, were used to build QSAR model by using GA-MLR methods, and the remaining compounds were used to evaluate the predictive ability of the built model. GA provided a series of linear equations containing different descriptor combinations with different performance, but similarly satisfactory. An excellent QSAR model should have high fitting ability, high cross-validated QLOO2, high external predictive ability, and little difference between internal and external predictive ability. Based on the above principles, a four-descriptor model was selected as the final model. The involved descriptors and corresponding physical-chemical meanings were listed in Table 2. The corresponding model equation and statistic parameters are listed as follows:From the linear equation and statistic parameters, we could see that the fitting ability of the final model was relatively high with R2 of 0.760 and the final model was stable with QLOO2 of 0.656 and QLMO2 of 0.662. About the predictive ability of the final model, we could find that Q2 and Q2 have similar high values. Compared with Q2 and Q2, the value of Q2 was higher. Besides, the value of CCC was as high as 0.891, surpassing the threshold value of 0.85 as suggested in literature [29] for predictive model. Additionally, the RMSE values for the training set and prediction set were similarly very low. All these parameters indicated the higher external prediction ability of the final model. The interrelation coefficients of the selected descriptors were presented in the Table 3. It could be seen that the highest intercorrelation coefficient K was −0.61 between IC5 and HATS3e, which indicated that the used variables were independent. All results proved that the selected model was reliable, stable, and predictive.
Table 2

The selected descriptors used to build QSAR model and corresponding meanings.

DescriptorMeaningDescriptor type
IC5Information content index (neighborhood symmetry of 5-order)Information indices
GATS5eGeary autocorrelation, lag 5/weighted by atomic Sanderson electronegativities2D autocorrelations
DISPpd COMMA2 value/weighted by atomic polarizabilitiesGeometrical descriptors
HATS3uLeverage-weighted autocorrelation of lag 3/unweightedGETAWAY descriptors
Table 3

The correlation coefficients (K) of the selected descriptors in the model.

IC5GATS5eDISPpHATS3u
IC51
GATS5e0.071
DISPp0.250.361
HATS3u−0.6250.19−0.151
Y randomization technique was carried out with iterations of 5000 in QSARINS. Figure 2 showed the plot of R2 and Q2 values versus K, automatically obtained in QSARINS. From Figure 2, we could find that the R2 and Q2 values of the final model were much higher than the models from scrambled Y-column, because the relationship between molecular structure and response was broken. This result indicated that the relationships between structures of 7α-substituted dihydrotestosterones and corresponding pIC50 values did exist in the proposed model, and it was really not obtained by chance.
Figure 2

The distribution of R2 and Q2 of 5000 iterated Y-scrambled models in comparison to the proposed model performances.

The predicted pIC50 values by MLR model were listed in Table 1. Figure 3 was the scatter plot of the experimental versus the predicted pIC50 values. It was obvious that, in Figure 3, all predicted pIC50 values were close to the line y = x, which indicated that the linear model can accurately predict the antiandrogenic values of these derivatives.
Figure 3

The scatter plot of the experimental versus the predicted pIC50 by MLR model.

The model applicability domain was evaluated by means of leverage analysis, namely, Williams plot, shown in Figure 4, in which the standardized residuals (σ) and leverage values (h) were plotted. In Figure 4, we could see that all compounds were inside the model structural applicability domain (h = 0.517) and reasonably well predicted with standard residue smaller than 2.5σ.
Figure 4

The Williams plot of final MLR model.

After the MLR model was built, we luckily found two new 7α-substituted dihydrotestosterones, showed in Table 1 (marked as “b,” prediction set b), with experimental antiandrogen activities from another published literature [41]. These two compounds were additionally used to validate our model. Both of them (numbered in the Williams plot of Figure 4) were located in the applicability domain of the MLR model with h value of 0.332 for compound 37 and 0.237 for compound 38. The predictions on them were quite near to their experimental values. In Figure 3, these two samples (in red) were very near to the line y = x. These results further indicated the high predictive ability of the proposed MLR model. By interpreting the meaning of the descriptors used in the model, we could extract vital structural features, to some extent, responsible for the antiandrogenic activities of these steroidal derivatives. IC5 was calculated as the mean information content as follows: IC5 = −∑(A/nAT) · log2(A/nAT), where g runs over the equivalence classes, A was the cardinality of the gth equivalence class, and nAT was the total number of atoms. This index represented a measure of structural complexity per vertex. GATS5e belonged to 2D autocorrelations and was Geary autocorrelation, lag 5/weighted by atomic Sanderson electronegativities. This descriptor was favorable to the antiandrogen activities of these steroidal derivatives. DISPp, geometrical descriptors, indicated the displacement between the geometric center and the center of the polarizability, calculated with respect to the molecular principal axes. HATS3u is a GETAWAY descriptor [42], representing the leverage-weighted autocorrelation of lag 5/weighted by atomic polarizabilities. With the increase of these two descriptors, the bioactivities of the studied compounds decreased.

3.2. Virtual Screening

From PubChem database, we found 110 new 7α-substituted dihydrotestosterone derivatives, without experimental data. By exploring the leverage h values, 77.27% of them were located in the structural applicability domain of the proposed MLR model. Figure 5 is the Insubria graph of these dihydrotestosterones, the plot of leverage values versus predicted pIC50, which was proposed especially for exploring the unknown samples. In Figure 5, most chemicals are in the range of the hat cutoff (h = 0.517). Inside the model AD, the most active compound is CID_70128824, which has in silico PIC50 of 7.37, higher than the reported most active compound number 4 (pIC50 = 6.77). Outside the model AD, we luckily obtained several samples with higher in silico activities, especially CID_70126881 and CID_70127147, which showed excellent in silico antiandrogen activities as high as 7.90 and 7.76, respectively, even higher than bicalutamide and hydroxyflutamide. The ID of these compounds and corresponding predicted pIC50 values are listed in Table 4. Though only these three compounds were highlighted here, other samples with high in silico activities, especially those located in the model structural AD, were also worthy of our attention.
Figure 5

Insubria graph (plot of hat values versus predicted values for the complete compounds).

Table 4

The 110 new compounds from PubChem database and corresponding predicted activities.

NumberMolIDPredADa
1CID_444219996.32Y
2CID_444220086.49Y
3CID_444220146.53Y
4CID_444220206.47Y
5CID_444220316.34Y
6CID_444220346.15Y
7CID_444220376.14Y
8CID_444220416.07Y
9CID_444220435.82Y
10CID_444220445.67Y
11CID_444220455.62Y
12CID_444220476.69N
13CID_444220536.16Y
14CID_444220546.91Y
15CID_444220586.90Y
16CID_444220646.12Y
17CID_444220675.94Y
18CID_444220756.25Y
19CID_444220806.33Y
20CID_444336446.53Y
21CID_444336456.20Y
22CID_678542577.51N
23CID_697581126.69Y
24CID_701262167.35N
25CID_701262317.17Y
26CID_701262476.91Y
27CID_701262975.60Y
28CID_701262985.83Y
29CID_701263056.56Y
30CID_701263277.42N
31CID_701264916.40Y
32CID_701267826.27Y
33CID_701267846.24Y
34CID_701267986.48N
35CID_701268026.87N
36CID_701268376.36Y
37CID_701268686.75N
38CID_701268817.90N
39CID_701269796.39Y
40CID_701269916.89Y
41CID_701271446.82Y
42CID_701271477.76N
43CID_701271816.03Y
44CID_701271836.35Y
45CID_701271856.87N
46CID_701271886.96N
47CID_701271926.88Y
48CID_701271946.81Y
49CID_701272696.35Y
50CID_701272877.48N
51CID_701272966.92N
52CID_701272976.77Y
53CID_701272986.82Y
54CID_701274446.80Y
55CID_701274467.54N
56CID_701275675.89Y
57CID_701277147.00N
58CID_701277216.57Y
59CID_701277226.82Y
60CID_701277606.91N
61CID_701277716.66Y
62CID_701278187.06N
63CID_701278217.17N
64CID_701280626.59Y
65CID_701280687.19N
66CID_701280786.06Y
67CID_701280836.83Y
68CID_701280846.81Y
69CID_701282096.75Y
70CID_701282386.83Y
71CID_701284506.33Y
72CID_701284527.23N
73CID_701284566.71Y
74CID_701284626.84Y
75CID_701285336.79Y
76CID_701285346.92Y
77CID_701285747.18N
78CID_701286087.18N
79CID_701288247.37Y
80CID_701288286.46Y
81CID_701288307.00Y
82CID_701288477.46N
83CID_701289026.50Y
84CID_701289046.57Y
85CID_701289875.60Y
86CID_701290656.69N
87CID_701291617.05Y
88CID_701291986.53Y
89CID_701292047.14N
90CID_701293756.18Y
91CID_701293776.74Y
92CID_701293807.12Y
93CID_98049166.38Y
94CID_98267766.33Y
95CID_98272856.48Y
96CID_98283926.12Y
97CID_98285876.86N
98CID_98294266.45Y
99CID_98910336.14Y
100CID_98933525.94Y
101CID_99332266.34Y
102CID_99351046.56Y
103CID_99351966.57Y
104CID_99357886.50Y
105CID_99363476.45Y
106CID_99368036.36Y
107CID_99558746.20Y
108CID_99571226.62Y
109CID_99576926.57Y
110CID_99581616.19Y

aAD: model structural applicability domain; Y: compound inside the model structural AD; N: compound outside the model structural AD.

To further explore the possible binding mode of the screened compounds, molecular docking was employed to study the interaction between compounds owning high in silico activities (especially CID_70128824, CID_70126881, and CID_70127147), together with the reported most active compound 4 as a comparison, and androgen receptor (PDB ID: 1T65) by using LigandFit module in Discovery Studio 2.5. Firstly, DHT was extracted from crystal structure and redocked into ligand binding pocket to obtain the optimal docking parameters. Secondly, the ligand binding site was defined with the same parameters as DHT. At this point, the radius of SBD_Site_Sphere was set to 10 Å. The other parameters were set by default. To obtain reasonable conformations of different complex, the top-ranked compounds with lowest RMSD values were extracted. The binding mode of compound 4, CID_70128824, CID_70126881, and CID_70127147 with AR were presented in Figure 6. From this Figure, it could be seen that the docked pose of CID_70128824, CID_70126881, and CID_70127147 located in the same position with similar orientation in the AR ligand binding site to compound 4. All these results indicated that though two of them were outside of the model AD, these three compounds CID_70128824, CID_70126881, and CID_70127147 might have good performance to antagonize androgen receptor and have the potency for further research and development for PCa therapy.
Figure 6

The molecular binding models of compound 4 (cyan), CID_70128824 (yellow), CID_70126881 (pink), and CID_70127147 (blue) in the AR ligand binding site.

3.3. MD Simulations

3.3.1. System Equilibration

In order to verify whether the studied systems reach equilibrium, the root mean square deviations (RMSDs) of all the backbone atoms of the protein, the C atoms for the residues of the active site (residues within 5 Å around ligand), and the heavy atoms of ligand from the initial structure were monitored to examine the dynamic stability of the systems and plotted against time, as shown in Figure 7. The three RMSDs have small fluctuations after 15 ns, implying that the studied systems have reached stability. We used the last 2 ns to analyze the energy and binding modes for the four complexes.
Figure 7

Time series of (a) the RMSDs of backbone atoms of androgen receptor, (b) the RMSD of C atoms for the residues around 5 Å of the ligand, and (c) the RMSD of the heavy atoms of ligand.

3.3.2. Validation of the MD Simulations

We calculated the binding free energy by MM/GBSA method between the four ligands and AR to validate the reliability of the MD simulation. Table 5 lists the binding free energy and all of the energy terms for the four compounds. From Table 5, the four complexes had different binding free energy; the ranking order is CID_70126881 (−41.62 kcal mol−1), CID_70127147 (−33.06 kcal mol−1), CID_70128824 (−31.86 kcal mol−1), and compound 4 (−22.69 kcal mol−1). Different binding free energy means different binding affinity between the four complexes. We have proved that the antiandrogen activity of compounds CID_70126881, CID_70127147, and CID_70128824 are higher than the published best one (compound 4) by MLR model. In addition, the ranking of the calculated binding free energy was consistent with their in silico bioactivities order.
Table 5

The calculated binding free energies (kcal mol−1) of four systems.

ComplexContribution
ΔEeleΔEvdwΔGpΔGnpΔEMMΔGsolΔEbindTΔSΔGbind
Compound 4−29.58−58.9344.95−6.74−88.5038.22−50.2827.59−22.69
CID_70128824−3.32−70.5624.14−7.25−73.8816.89−56.9925.13−31.86
CID_70127147−16.79−71.3433.50−7.80−88.1325.69−62.4429.38−33.06
CID_70126881−10.01−69.3225.14−8.85−79.3316.29−63.0521.43−41.62

3.3.3. Analysis of the Interaction Mechanism

According to the calculated binding free energy, CID_70126881 holds the strongest binding affinity; on the contrary, compound 4 has the lowest binding affinity. As can be seen from Table 5, the nonpolar interactions (ΔGnonpolar) including van der Waals (Evdw) and nonpolar solvation (ΔGnp) terms are the driving force for the binding of the four ligands to AR, and the total polar contributions (ΔGp) are unfavorable for their binding. In addition, CID_70126881, CID_70127147, and CID_70128824 have almost the same van der Waals interactions toward AR (−69.32 kcal mol−1, −71.34 kcal mol−1, and −70.56 kcal mol−1 for CID_70126881-AR, CID_70127147-AR, and CID_70128824-AR), while compound 4 has a low van der Waals value, which may partly explain the reduced binding affinity of compound 4 and prove that the newly discovered chemicals could possess higher antiandrogen activities. To obtain the detailed interaction between four ligands and AR, the decomposition of binding free energy, which is calculated by MM/GBSA method, was executed to identify key residues during the binding process. The result of energy decomposition contains van der Waals, electrostatic, solvation-free energy, and total energy contribution terms, respectively, for four systems, shown in Figure 8. All the residues with great energy contributions were almost more than 1.5 kcal mol−1. As shown in Figure 8(a), residues L701, L704, M780, L873, T877, L880, and L881 of AR make a significant contribution to the CID_70126881-AR binding, as well as those for the CID_70127147-AR which are L704, N705, L712, W741, M745, and F876 (Figure 8(b)). In Figure 8(c), residues L704, N705, G708, L712, F764, and F891 of AR make a substantial contribution to the CID_70128824-AR binding. However, only two key residues (L704 and N705) were the major energy contributions to compound 4-AR binding as shown in Figure 8(d). As mentioned previously, the vast majority of key residues of AR were nonpolar; it was reasonable to speculate that these residues can form greater van der Waals interactions with hydrophobic ligand and exhibit more favorable nonpolar interaction contribution to the binding free energy.
Figure 8

Energy decomposition of key residues in four complexes.

The MD simulation, together with the docking results, confirmed that the newly discovered chemicals CID_70126881, CID_70127147, and CID_70128824 share similar binding mode with the reported compound 4, and the in silico antiandrogen activities of them are higher through the calculated binding free energy and decomposition of binding free energy.

4. Conclusions

In this study, the relationships between a series of 7α-substituted dihydrotestosterone derivatives and corresponding antiandrogen activities were explored. A reliable, stable, and robust linear MLR model with four descriptors was built and validated in QSARINS. The predictive ability of the final model, fully evaluated by using two different prediction sets, is excellent enough to be used to virtually screen novel 7α-substituted dihydrotestosterones from PubChem database. After antiandrogenic activity prediction, molecular docking, and molecular dynamic simulations, CID_70126881, CID_70127147, and CID_70128824, as the most potent chemicals with good binding affinities to androgen receptor, were proposed. Of course, bioassay experimental researches are needed to evaluate the virtual screening results. This study provides the theoretical basis and specific chemicals for AR antagonists, which can help the experimental research groups to search for potential antiandrogens.
  27 in total

1.  Fast, efficient generation of high-quality atomic charges. AM1-BCC model: II. Parameterization and validation.

Authors:  Araz Jakalian; David B Jack; Christopher I Bayly
Journal:  J Comput Chem       Date:  2002-12       Impact factor: 3.376

2.  Development and testing of a general amber force field.

Authors:  Junmei Wang; Romain M Wolf; James W Caldwell; Peter A Kollman; David A Case
Journal:  J Comput Chem       Date:  2004-07-15       Impact factor: 3.376

3.  A multicenter randomized trial comparing the luteinizing hormone-releasing hormone analogue goserelin acetate alone and with flutamide in the treatment of advanced prostate cancer. The International Prostate Cancer Study Group.

Authors:  C J Tyrrell; J E Altwein; F Klippel; E Varenhorst; G Lunglmayr; F Boccardo; I M Holdaway; J M Haefliger; J P Jordaan
Journal:  J Urol       Date:  1991-11       Impact factor: 7.450

Review 4.  Use of the nonsteroidal anti-androgen Casodex in advanced prostatic carcinoma.

Authors:  G T Kennealey; B J Furr
Journal:  Urol Clin North Am       Date:  1991-02       Impact factor: 2.241

5.  Small-molecule androgen receptor downregulators as an approach to treatment of advanced prostate cancer.

Authors:  Robert H Bradbury; Neil J Hales; Alfred A Rabow; Graeme E Walker; David G Acton; David M Andrews; Peter Ballard; Nigel A N Brooks; Nicola Colclough; Alan Girdwood; Urs J Hancox; Owen Jones; David Jude; Sarah A Loddick; Andrew A Mortlock
Journal:  Bioorg Med Chem Lett       Date:  2011-07-02       Impact factor: 2.823

6.  A concordance correlation coefficient to evaluate reproducibility.

Authors:  L I Lin
Journal:  Biometrics       Date:  1989-03       Impact factor: 2.571

Review 7.  Biology of progressive, castration-resistant prostate cancer: directed therapies targeting the androgen-receptor signaling axis.

Authors:  Howard I Scher; Charles L Sawyers
Journal:  J Clin Oncol       Date:  2005-11-10       Impact factor: 44.544

8.  Discovery and structure-activity relationships of new steroidal compounds bearing a carboxy-terminal side chain as androgen receptor pure antagonists.

Authors:  Kazutaka Tachibana; Ikuhiro Imaoka; Hitoshi Yoshino; Nobuaki Kato; Mitsuaki Nakamura; Masateru Ohta; Hiromitsu Kawata; Kenji Taniguchi; Nobuyuki Ishikura; Masahiro Nagamuta; Etsuro Onuma; Haruhiko Sato
Journal:  Bioorg Med Chem Lett       Date:  2007-08-22       Impact factor: 2.823

9.  Testosterone activates mitogen-activated protein kinase and the cAMP response element binding protein transcription factor in Sertoli cells.

Authors:  Charity Fix; Cynthia Jordan; Patricia Cano; William H Walker
Journal:  Proc Natl Acad Sci U S A       Date:  2004-07-19       Impact factor: 11.205

10.  Bilateral orchiectomy with or without flutamide for metastatic prostate cancer.

Authors:  M A Eisenberger; B A Blumenstein; E D Crawford; G Miller; D G McLeod; P J Loehrer; G Wilding; K Sears; D J Culkin; I M Thompson; A J Bueschen; B A Lowe
Journal:  N Engl J Med       Date:  1998-10-08       Impact factor: 91.245

View more
  3 in total

1.  Comparison of Machine Learning Models for the Androgen Receptor.

Authors:  Kimberley M Zorn; Daniel H Foil; Thomas R Lane; Wendy Hillwalker; David J Feifarek; Frank Jones; William D Klaren; Ashley M Brinkman; Sean Ekins
Journal:  Environ Sci Technol       Date:  2020-10-21       Impact factor: 9.028

2.  Predicted Adsorption Affinity for Enteric Microbial Metabolites to Metal and Carbon Nanomaterials.

Authors:  Bregje W Brinkmann; Ankush Singhal; G J Agur Sevink; Lisette Neeft; Martina G Vijver; Willie J G M Peijnenburg
Journal:  J Chem Inf Model       Date:  2022-07-25       Impact factor: 6.162

3.  Structural Changes Due to Antagonist Binding in Ligand Binding Pocket of Androgen Receptor Elucidated Through Molecular Dynamics Simulations.

Authors:  Sugunadevi Sakkiah; Rebecca Kusko; Bohu Pan; Wenjing Guo; Weigong Ge; Weida Tong; Huixiao Hong
Journal:  Front Pharmacol       Date:  2018-05-15       Impact factor: 5.810

  3 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.