Thomas Scior, Jorge Lozano-Aponte, Subhash Ajmani, Eduardo Hernández-Montero, Fabiola Chávez-Silva, Emanuel Hernández-Núñez, Rosa Moo-Puc, Andres Fraguela-Collar, Gabriel Navarrete-Vázquez.
Abstract
In view of the serious health problems concerning infectious diseases in heavily populated areas, we followed the strategy of lead compound diversification to evaluate the nearby chemical space for new organic compounds. To this end, twenty derivatives of nitazoxanide (NTZ) were synthesized and tested for activity against Entamoeba histolytica parasites. To ensure drug-likeness and activity relatedness of the new compounds, the synthetic work was assisted by a quantitative structure-activity relationships (QSAR) study. We circumvented many of the inherent downsides, well known to QSAR practitioners, thanks to workarounds that we proposed in a prior QSAR publication. To gain further mechanistic insight on a molecular level, ligand-enzyme docking simulations were carried out, since NTZ is known to inhibit the protozoal pyruvate ferredoxin oxidoreductase (PFOR) enzyme as its biomolecular target.
Year: 2015 PMID: 25872791 PMCID: PMC5396257 DOI: 10.2174/1573409911666150414145937
Source DB: PubMed Journal: Curr Comput Aided Drug Des ISSN: 1573-4099 Impact factor: 1.606
Listing of molecular structures of NTZ and its eleven thiazole derivatives (T series). The inhibition concentration pIC50 (against E. histolytica) is given and used as input data for QSAR. Note: tizoxanide (TIZ) is the hydrolysis product of NTZ: deacetyl-nitazoxanide. Note: T17 was recently used in a published study without E. histolytica activity data (cf. Table 1 on page 1627 in [33]). T18 also appeared recently without E. histolytica activity data (cf. Fig. 1 on page 3159 in another article by GNV [34]). Molecules of the test set are denoted by an asterisk (*).
Listing of molecular structures of NTZ and its nine benzothiazole derivatives (B series). The inhibition concentration pIC50 (against E. histolytica) is given and used as input data for QSAR. Note: B01 and B02 combined with E. histolytica activity data were already documented by the group leader GNV (cf. Table 1 on page 3169 in [27]). Molecules of the test set are denoted by an asterisk (*).
Matrix of interactions representing docking results (H-bonds, donors or acceptors). The gray boxes represent the interaction with the corresponding amino acid of the PFOR model of E. histolytica. The experimental pIC50 and the computed ΔGbinding energies are included on the right side (T03 does not show a qualitative relationship between both values).
Statistical parameters and descriptor definitions of the regression model (equation SA-1). Cross-validation standard error: cvSE; external-validation standard error: predSE; cross-validated Zscore: Zscore_cv; cross-validated alpha: alpha_cv.
| Parameter | Value | Parameter | Value | Parameter | Value |
|---|---|---|---|---|---|
| Train/Test (n) | 18/4 | Q2F1 | 0.72 | cvSE | 0.97 |
| Descriptors (k) | 4 | Q2F2 | 0.70 | predSE | 0.68 |
| R2 | 0.80 | Q2F3 | 0.69 | Zscore | 3.99 |
| Q2 (LOO*) | 0.55 | r2m | 0.39 | Zscore_cv | 2.42 |
| pred R2 | 0.72 | alpha | 0.00 | | |
| SEE | 0.65 | r2m (LOO*) | 0.48 | alpha_cv | 0.01 |

| Descriptor | Definition |
|---|---|
| aasC_Cnt | Count of atom-type E-State: :C:- |
| aOm_Cnt | Count of atom-type E-State: :O- |
| ssNH_Sum | Sum of atom-type E-State: -NH- |
| BTZ indicator | A binary variable, either 0 or 1, to indicate the absence or presence of a benzothiazole scaffold in a given molecule |
Note: (LOO*) is a particular parameter only applied to the initial training set (18 compounds).
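
The validation statistics listed above (R2, leave-one-out Q2, and the external Q2F1, Q2F2 and Q2F3) can be illustrated with a minimal sketch. It assumes the commonly used textbook definitions of these metrics and an ordinary least-squares model. The data are synthetic stand-ins with the same shape as the study (22 compounds, 4 descriptors, 18/4 split), not the published descriptors or activities, and all function names are hypothetical.

```python
import numpy as np


def fit_mlr(X, y):
    """Ordinary least-squares fit with intercept; returns the coefficient vector."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict(coef, X):
    """Apply a fitted model to a descriptor matrix."""
    return np.column_stack([np.ones(len(X)), X]) @ coef


def r2(y, y_hat):
    """Coefficient of determination for fitted values."""
    return 1.0 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)


def q2_loo(X, y):
    """Leave-one-out cross-validated Q2 on the training set."""
    n = len(y)
    press = 0.0
    for i in range(n):
        keep = np.arange(n) != i          # drop compound i, refit, predict it
        coef = fit_mlr(X[keep], y[keep])
        press += (y[i] - predict(coef, X[i:i + 1])[0]) ** 2
    return 1.0 - press / np.sum((y - y.mean()) ** 2)


def external_q2(y_train, y_test, y_test_hat):
    """Q2F1, Q2F2 and Q2F3 for an external test set (assumed standard definitions)."""
    press = np.sum((y_test - y_test_hat) ** 2)
    f1 = 1.0 - press / np.sum((y_test - y_train.mean()) ** 2)
    f2 = 1.0 - press / np.sum((y_test - y_test.mean()) ** 2)
    tss_train = np.sum((y_train - y_train.mean()) ** 2)
    f3 = 1.0 - (press / len(y_test)) / (tss_train / len(y_train))
    return f1, f2, f3


# Synthetic stand-in data: 22 compounds, 4 descriptors, last one a 0/1 indicator.
rng = np.random.default_rng(0)
X = rng.normal(size=(22, 4))
X[:, 3] = (X[:, 3] > 0).astype(float)
y = 5.0 + X @ np.array([0.4, -0.6, 0.3, 0.8]) + rng.normal(scale=0.5, size=22)

X_tr, y_tr, X_te, y_te = X[:18], y[:18], X[18:], y[18:]
coef = fit_mlr(X_tr, y_tr)
print("R2       :", round(r2(y_tr, predict(coef, X_tr)), 2))
print("Q2 (LOO) :", round(q2_loo(X_tr, y_tr), 2))
print("Q2F1-F3  :", [round(v, 2) for v in external_q2(y_tr, y_te, predict(coef, X_te))])
```

With only four test compounds, the three external metrics can differ noticeably because each one uses a different reference variance in its denominator.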
Listing of detected QSAR shortcomings and pitfalls which were reported in the literature [3].
| Shortcoming / pitfall | Description |
|---|---|
| Small sample and limited chemical variability | There is no ideal sample size for QSAR, but results clearly become more representative with larger samples. Here, the number of compounds is small (n=22) and the chemical variety is limited. It essentially consists of ester (-O-CO-), ether (-O-), nitro (-NO2) and chloride substituents, all of which occupy different positions on the phenyl ring of the molecules. |
| Composition of training and test sets | With a small series, the distribution of compounds between the sets may not be representative in terms of activity and chemical variability. |
| Meaningless descriptor selections | pKa is a common QSAR descriptor with a high correlation to biological activity, which is not the case for the dipole moment (DM, dipole). DM takes on different values with changing conformations. When DM is calculated for artificially held planar molecules, it is loaded onto the Z-axis only, while even small torsional changes are reflected by huge changes in DM values. |
| Non-constant coefficients and constants | The calculated pKa values are not equal to literature reports. pKa takes different values in different programs; moreover, each software package considers different ionized forms, which makes its selection for QSAR equations difficult. |
| Starting geometries for 3D-QSAR | The active conformation is not necessarily identical to the observed crystal structure, and no NTZ-PFOR complex has been solved. Two pieces of information were therefore taken into account to assess the active conformation of the NTZ scaffold: (1) its crystallographic record deposited in CCDC [47] and (2) the final pose of NTZ docked into the ligand binding site of the cofactor TPP-PFOR complex. Both geometries turned out to be practically the same; see Supplementary Material (Fig. SD2-C). |
| Errors of descriptor calculations (acidity, dissociation) | The experimental acidity value of NTZ is reported as pKa≈6 [32] for the conjugate acid / neutral thiazole system ([B-H+] / [B]), which corresponds to approx. 90% neutral species under physiological conditions. The calculated value, pKa≈8 [76], however, inverts the cationic/nonionic proportions (10% neutral species). Without an experimental value at hand, the (wrong) cationic forms would have been taken as input for the QSAR and docking studies. |
| Lipole-dipole collinearity | The algorithm of Lipole calculation is derived from the dipole moment equation (DM = q*r, q = atomic partial charge, r = VDW atomic radius), with atomic lipophilic values replacing atomic partial charges. Despite different scales and units (charges and lipophilic fragments, same VDW radii), the identical calculation protocol generates collinearity. |
| Linearity hypothesis | The a priori assumption of linearity might be the main drawback in QSAR studies. Since data sampling is not complete, because no scientist would seek to explore the weaker, less active or more toxic data segments, it is often not clear if linearity is a first principle of nature or just appears due to insufficient data spread. Outliers and activity cliffs are first signs of nonlinear relationships between independent variables and response (biological activity, dependent variable). |
| Ligand based alignment (LBA) | The X-ray (crystal) conformation of NTZ may not constitute the biologically active conformation. The hitherto unsolved structure of the NTZ-PFOR complex constitutes a disadvantage in the case of higher dimensional QSAR, where reliable conformational data is required. Results based on 2D descriptors (connectivities, drawings, SMILES, etc.) do not need special information, while ligands can be superposed on their more rigid substructure or common scaffold (LBA). |
| Multiple solutions | We generated different equations based on different conformations and methodologies. It is not clear whether modeling based on the NTZ X-ray conformation reflects realistic molecule geometries for binding site interaction, because the NTZ-liganded binding site complex has not been elucidated. According to the pKa value of NTZ (here: 6.2, located on the exothiazolic amide N), it can be inferred that all molecules treated here exert their activity in the anionic form. A new QSAR equation generation step, based on descriptors calculated for the anionic compounds (without H at the exothiazolic amide N, same training set), gave smaller R2 values than those obtained with X-ray data. The ideal case is to consider the anionic form as directly related to biological activity, because a small structural change can be reflected in large differences in descriptor magnitudes. This last QSAR equation generated with ionic compounds ( |
| Prodrugs and active metabolites | Some publications describe NTZ as a prodrug, albeit the biological activity of NTZ itself has been reported, too. Nevertheless, upon hydrolysis of the acetyl group, the metabolite TIZ shows comparable antiprotozoal potency. |
| Incompatible concepts and contradictions | Sometimes, linear equations in 2D QSAR include conformation-dependent descriptors in a way where spatial information about structural requirements for the ligands and the binding site remains unknown. Hence, conformation-dependent descriptors contribute to establish the “rules” governing the relations between structures and activities, without any reason to be present in the equation except for chance correlations: “… because the relevant features only appear in molecules that also contain the wrong features” [3]. |
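
The species percentages quoted in the rows on descriptor calculation errors and multiple solutions follow from the Henderson-Hasselbalch relation. The sketch below is illustrative only: the source says "physiological conditions" without naming a pH, and pH 7.0 is assumed here because it reproduces the quoted figures of about 90% and 10%. The function name is hypothetical.

```python
def deprotonated_fraction(pka, ph):
    """Fraction of the deprotonated form from the Henderson-Hasselbalch relation.

    For a conjugate acid / neutral base pair (BH+ / B) this is the neutral
    fraction; for a neutral acid / anion pair (HA / A-) it is the anionic fraction.
    """
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


ASSUMED_PH = 7.0  # assumption: not stated in the source

# Experimental (pKa ~ 6) versus calculated (pKa ~ 8) value for the BH+ / B pair.
for pka in (6.0, 8.0):
    percent = 100.0 * deprotonated_fraction(pka, ASSUMED_PH)
    print(f"pKa {pka:.1f}: {percent:.1f}% neutral")  # 90.9% and 9.1%

# pKa 6.2 assigned to the exothiazolic amide N-H (HA / A- pair).
percent_anion = 100.0 * deprotonated_fraction(6.2, ASSUMED_PH)
print(f"pKa 6.2: {percent_anion:.1f}% anionic")
```

At pH 7.4 the same relation gives roughly 96% and 20% neutral species for pKa 6 and pKa 8, so the qualitative inversion between the experimental and the calculated value holds either way.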