
A Derived QSAR Model for Predicting Some Compounds as Potent Antagonist against Mycobacterium tuberculosis: A Theoretical Approach.

Shola Elijah Adeniji, Sani Uba, Adamu Uzairu, David Ebuka Arthur.

Abstract

The emergence of multidrug-resistant strains of M. tuberculosis has driven the development of more potent antituberculosis agents. Novel compounds are usually synthesized by trial and error, which is time consuming and expensive. QSAR is a theoretical approach that has the potential to reduce this problem in discovering new potent drugs against M. tuberculosis. This approach was employed to develop a multivariate QSAR model that correlates the chemical structures of 2,4-disubstituted quinoline analogues with their observed activities. To build a robust QSAR model, Genetic Function Approximation (GFA) was employed as a tool for selecting the best descriptors that could efficiently predict the activities of the inhibitory agents. The developed model was influenced by the molecular descriptors AATS5e, VR1_Dzs, SpMin7_Bhe, TDB9e, and RDF110s. In the internal validation test, the derived model had a correlation coefficient ($R^2$) of 0.9265, an adjusted correlation coefficient ($R^2_{adj}$) of 0.9045, and a leave-one-out cross-validation coefficient ($Q^2_{cv}$) of 0.8512, while in the external validation test it had an $R^2_{test}$ of 0.8034 and a Y-randomization coefficient ($cR^2_p$) of 0.6633. The proposed QSAR model provides a valuable approach for modification of the lead compound and for the design and synthesis of more potent antitubercular agents.


Year:  2019        PMID: 31186969      PMCID: PMC6521565          DOI: 10.1155/2019/5173786

Source DB:  PubMed          Journal:  Adv Prev Med


1. Introduction

Tuberculosis (TB) is the deadliest bacterial disease and is caused by a species of bacteria known as Mycobacterium tuberculosis. In 2013, the World Health Organization (WHO) estimated 1.5 million deaths, 9.0 million people living with tuberculosis, and 360,000 people who were HIV positive [1]. At present, pyrazinamide (PZA), para-aminosalicylic acid (PAS), isoniazid (INH), and rifampicin (RMP) are the drugs administered to patients suffering from tuberculosis. The resistance of M. tuberculosis to these drugs has led to the development of a new approach that is fast and precise and can predict the biological activity of new compounds against M. tuberculosis. Quantitative structure-activity relationship (QSAR) modeling, a theoretical approach, is one of the most widely used computational methods for designing drugs and predicting their activities [2]. A QSAR model is a mathematical linear equation that relates the molecular structures of compounds to their biological activities. In this research, a data set of 2,4-disubstituted quinoline derivatives that had been synthesized and evaluated as anti-Mycobacterium tuberculosis agents [3] was selected for QSAR study. A few researchers [4-7] have established relationships for antitubercular inhibitors such as quinolone, chalcone, pyrrole, and 7-methyljuglone using the QSAR approach. However, no QSAR study has yet related the structures and activities of 2,4-disubstituted quinoline derivatives as potent antitubercular agents. Therefore, this study aimed to establish a valid QSAR model that correlates the structures of 2,4-disubstituted quinoline derivatives with their activities against Mycobacterium tuberculosis and predicts those activities.

2. Material and Method

2.1. Data Set and Data Collection

The 2,4-disubstituted quinoline derivatives used in this research as potent anti-Mycobacterium tuberculosis agents were selected from the literature [3]. The chemical structures of these compounds, alongside their biological activities, are presented in Table 1. The percentage activities were converted to logarithmic units (pA) using the equation described in [5].
Table 1

Molecular structures of inhibitory compounds and their derivatives as antitubercular agents.

S/N | Molecular structure | Observed Activity (%) | Observed Activity (pA) | Calculated Activity | Residual | Leverage
1 a | (E)-2-(2-(2-methylpropylidene)hydrazinyl)-N-phenylquinoline-4-carboxamide | 11 | 6.8191 | 7.22456 | -0.40546 | 0.186966
2 | (E)-N-phenyl-2-(2-propylidenehydrazinyl)quinoline-4-carboxamide | 12 | 6.8418 | 6.713561 | 0.128239 | 0.267393
3 | (E)-2-(2-benzylidenehydrazinyl)-N-phenylquinoline-4-carboxamide | 11 | 6.8601 | 6.664744 | 0.195356 | 0.072612
4 | (E)-2-(2-(4-methoxybenzylidene)hydrazinyl)-N-phenylquinoline-4-carboxamide | 99 | 9.4979 | 9.73193 | -0.23403 | 0.15548
5 | (E)-2-(2-(4-methoxybenzylidene)hydrazinyl)-N-phenylquinoline-4-carboxamide | 14 | 6.9772 | 6.896778 | 0.080422 | 0.328411
6 | (E)-N-benzyl-2-(2-(pyridin-3-ylmethylene)hydrazinyl)quinoline-4-carboxamide | 23 | 7.2608 | 6.510442 | 0.750358 | 0.055405
7 a | (E)-N-benzyl-2-(2-(furan-2-ylmethylene)hydrazinyl)quinoline-4-carboxamide | 20 | 7.1707 | 6.972982 | 0.197718 | 0.407733
8 a | (E)-N-benzyl-2-(2-(thiophen-2-ylmethylene)hydrazinyl)quinoline-4-carboxamide | 30 | 7.4233 | 7.152527 | 0.270773 | 0.378878
9 a | (E)-2-(2-(anthracen-9-ylmethylene)hydrazinyl)-N-benzylquinoline-4-carboxamide | 20 | 7.2838 | 6.985668 | 0.298132 | 0.085176
10 | (E)-N-benzyl-2-(2-((4-methoxynaphthalen-1-yl)methylene)hydrazinyl)quinoline-4-carboxamide | 16 | 7.1472 | 7.67865 | -0.53145 | 0.343511
11 | (E)-N-benzyl-2-(2-(2-methylpropylidene)hydrazinyl)quinoline-4-carboxamide | 42 | 7.6035 | 7.71263 | -0.10913 | 0.084914
12 | (E)-N-benzyl-2-(2-propylidenehydrazinyl)quinoline-4-carboxamide | 27 | 7.2938 | 6.495725 | 0.798075 | 0.096543
13 | (E)-N-benzyl-2-(2-benzylidenehydrazinyl)quinoline-4-carboxamide | 99 | 9.6090 | 9.62779 | -0.01879 | 0.089973
14 | (E)-N-benzyl-2-(2-(4-methoxybenzylidene)hydrazinyl)quinoline-4-carboxamide | 21 | 7.2630 | 7.88645 | -0.62345 | 0.067538
15 | (E)-N-(5-phenylpentyl)-2-(2-(pyridin-4-ylmethylene)hydrazinyl)quinoline-4-carboxamide | 30 | 7.4772 | 7.411826 | 0.065374 | 0.101346
16 a | (E)-N-(5-phenylpentyl)-2-(2-(pyridin-3-ylmethylene)hydrazinyl)quinoline-4-carboxamide | 10 | 6.8909 | 6.781862 | 0.109038 | 0.218861
17 | (E)-2-(2-(furan-2-ylmethylene)hydrazinyl)-N-(5-phenylpentyl)quinoline-4-carboxamide | 15 | 7.0807 | 7.17282 | -0.09212 | 0.090942
18 | (E)-N-(5-phenylpentyl)-2-(2-(thiophen-2-ylmethylene)hydrazinyl)quinoline-4-carboxamide | 21 | 7.2747 | 7.224153 | 0.050547 | 0.079898
19 a | (Z)-2-(2-(anthracen-9-ylmethylene)hydrazinyl)-N-(5-phenylpentyl)quinoline-4-carboxamide | 23 | 7.4091 | 7.67409 | -0.26499 | 0.075513
20 a | (E)-2-(2-((4-methoxynaphthalen-1-yl)methylene)hydrazinyl)-N-(5-phenylpentyl)quinoline-4-carboxamide | 40 | 7.7412 | 7.3187 | 0.4225 | 0.154686
21 a | (E)-2-(2-(2-methylpropylidene)hydrazinyl)-N-(5-phenylpentyl)quinoline-4-carboxamide | 42 | 7.6688 | 7.273758 | 0.395042 | 0.0423
22 | (E)-2-(2-benzylidenehydrazinyl)-N-(5-phenylpentyl)quinoline-4-carboxamide | 21 | 6.2688 | 6.3256 | -0.0568 | 0.05984
23 | (E)-2-(2-(4-methoxybenzylidene)hydrazinyl)-N-(5-phenylpentyl)quinoline-4-carboxamide | 40 | 7.6970 | 7.73765 | -0.04065 | 0.357197
24 a | (E)-(2-(2-((4-methoxynaphthalen-1-yl)methylene)hydrazinyl)quinolin-4-yl)(morpholino)methanone | 7 | 6.7741 | 5.816571 | 0.957529 | 0.214607
25 | (E)-(2-(2-benzylidenehydrazinyl)quinolin-4-yl)(morpholino)methanone | 3 | 6.2513 | 6.039603 | 0.211697 | 0.200793
26 | (E)-(2-(2-(4-methoxybenzylidene)hydrazinyl)quinolin-4-yl)(morpholino)methanone | 10 | 6.8414 | 6.809542 | 0.031858 | 0.432707
27 a | (E)-(4-methylpiperazin-1-yl)(2-(2-(pyridin-4-ylmethylene)hydrazinyl)quinolin-4-yl)methanone | 28 | 7.3673 | 7.357741 | 0.009559 | 0.263698
28 | (E)-(2-(2-(furan-2-ylmethylene)hydrazinyl)quinolin-4-yl)(4-methylpiperazin-1-yl)methanone | 21 | 7.1891 | 7.39202 | -0.20292 | 0.255295
29 | (E)-(4-methylpiperazin-1-yl)(2-(2-(thiophen-2-ylmethylene)hydrazinyl)quinolin-4-yl)methanone | 10 | 6.8291 | 6.508441 | 0.320659 | 0.06229
30 | (E)-(2-(2-(anthracen-9-ylmethylene)hydrazinyl)quinolin-4-yl)(4-methylpiperazin-1-yl)methanone | 10 | 6.9253 | 6.914677 | 0.010623 | 0.81434
31 a | (E)-(2-(2-((4-methoxynaphthalen-1-yl)methylene)hydrazinyl)quinolin-4-yl)(4-methylpiperazin-1-yl)methanone | 18 | 7.2022 | 7.50052 | -0.29832 | 0.279776
32 | (E)-(4-methylpiperazin-1-yl)(2-(2-(2-methylpropylidene)hydrazinyl)quinolin-4-yl)methanone | 52 | 7.7696 | 7.486908 | 0.282692 | 0.409976
33 | (E)-(2-(2-benzylidenehydrazinyl)quinolin-4-yl)(4-methylpiperazin-1-yl)methanone | 9 | 6.7716 | 7.25273 | -0.48113 | 0.25708
34 | (E)-(2-(2-(4-methoxybenzylidene)hydrazinyl)quinolin-4-yl)(4-methylpiperazin-1-yl)methanone | 30 | 7.4420 | 7.49224 | -0.05024 | 0.055855
35 | (E)-N-phenyl-2-(2-(thiophen-2-ylmethylene)hydrazinyl)quinoline-4-carboxamide | 26 | 7.3209 | 7.025132 | 0.295768 | 0.517231
36 | (E)-N-phenyl-2-(2-(pyridin-4-ylmethylene)hydrazinyl)quinoline-4-carboxamide | 14 | 6.9809 | 7.16429 | -0.18339 | 0.249575

Note. Superscript “a” represents the test set.

2.2. Structure Optimization

To obtain a stable conformer at minimum energy, all the molecules were geometrically optimized with Spartan 14 V1.1.4. A Molecular Mechanics Force Field (MMFF) calculation was first used to remove strain energy, and the structures were then refined with Density Functional Theory (DFT) using the B3LYP functional [5].

2.3. Molecular Descriptor Calculation

A descriptor is a mathematical quantity that describes a property of a molecule and is used to correlate the structure of the compound with its biological activity. Descriptors for all the inhibitory compounds were calculated using the PaDEL-Descriptor software V2.20.

2.4. Normalization of Data and Pretreatment

The values of the calculated descriptors were normalized using (2) so that each variable has an equal opportunity at the outset to influence the model [8]:

$$Y = \frac{Y_1 - Y_{min}}{Y_{max} - Y_{min}} \quad (2)$$

where $Y_1$ is the descriptor value for each molecule and $Y_{min}$ and $Y_{max}$ are the minimum and maximum values of each descriptor column of Y. After normalization, the data were further subjected to pretreatment in order to remove noise and redundant data.
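As an illustration of the normalization step, the following minimal sketch applies min-max scaling column by column with NumPy. The function names and the toy matrix are hypothetical, and the pretreatment shown (dropping constant columns) is only an assumed example of removing redundant data, since the text does not specify the exact pretreatment rules.

```python
import numpy as np

def min_max_normalize(descriptors):
    """Rescale every descriptor column to [0, 1] with (Y1 - Ymin) / (Ymax - Ymin)."""
    values = np.asarray(descriptors, dtype=float)
    col_min = values.min(axis=0)
    col_max = values.max(axis=0)
    span = col_max - col_min
    # A constant column has zero span; divide by 1 instead so it maps to 0.
    safe_span = np.where(span == 0.0, 1.0, span)
    return (values - col_min) / safe_span

def drop_constant_columns(descriptors, tol=1e-12):
    """Assumed pretreatment step: discard columns whose variance is (near) zero."""
    values = np.asarray(descriptors, dtype=float)
    keep = values.var(axis=0) > tol
    return values[:, keep], keep

# Toy descriptor matrix (3 molecules x 3 descriptors), not data from the paper.
toy = np.array([[1.0, 5.0, 2.0],
                [3.0, 5.0, 4.0],
                [5.0, 5.0, 8.0]])
normalized = min_max_normalize(toy)
filtered, kept_mask = drop_constant_columns(normalized)
```

In this toy case the second column is constant, so it carries no information and is removed by the assumed pretreatment step.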

2.5. Data Division into Training and Test Set

Kennard and Stone's algorithm was employed in this study to divide the data set into two subsets, a training set and a test set, in a proportion of 70% to 30%. The training set was used to develop the QSAR model, while the test set was used to confirm the developed model [9].
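The data division can be sketched as follows. This is a minimal NumPy version of the Kennard-Stone selection (start from the two most distant compounds, then repeatedly add the compound farthest from those already chosen), assuming Euclidean distance in descriptor space; the function name and the toy data are hypothetical and this is not the software used in the study.

```python
import numpy as np

def kennard_stone_split(X, n_train):
    """Return (train_indices, test_indices) chosen by the Kennard-Stone rule."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    if not 2 <= n_train <= n:
        raise ValueError("n_train must be between 2 and the number of samples")
    # Pairwise Euclidean distances between all compounds.
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    # Seed the training set with the two most distant compounds.
    i, j = np.unravel_index(np.argmax(dist), dist.shape)
    selected = [int(i), int(j)]
    remaining = [k for k in range(n) if k not in selected]
    while len(selected) < n_train:
        # Distance of each remaining compound to its nearest selected compound.
        nearest = dist[np.ix_(remaining, selected)].min(axis=1)
        pick = remaining[int(np.argmax(nearest))]
        selected.append(pick)
        remaining.remove(pick)
    return selected, remaining

# Toy one-descriptor example, not data from the paper.
toy = np.array([[0.0], [1.0], [2.0], [10.0]])
train_idx, test_idx = kennard_stone_split(toy, n_train=3)
```

For a data set of 36 compounds, `int(round(0.7 * 36))` gives 25 training compounds, which leaves 11 for the test set.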

2.6. Development of the Model

Multilinear regression (MLR) is the strategy used to develop the QSAR model. The MLR approach expresses a direct relationship between the dependent variable Y (activity) and the independent variables X (descriptors). In MLR analysis, the mean of the dependent variable Y relies on X. The MLR equation below is used to relate more than one independent variable (descriptor) to a single response variable (activity):

$$Y = k_1 x_1 + k_2 x_2 + \dots + k_n x_n + C$$

where Y represents the dependent variable, the $x_i$ represent the independent variables, the $k_i$ are the regression coefficients for each $x_i$, and C is the regression intercept [9].
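A minimal sketch of fitting such an MLR equation by ordinary least squares is given below. It uses `numpy.linalg.lstsq`; the helper names and the toy data are hypothetical, and the sketch is not the Materials Studio routine used in the study.

```python
import numpy as np

def fit_mlr(X, y):
    """Least-squares fit of Y = k1*x1 + ... + kn*xn + C; returns (k, C)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Append a column of ones so the last coefficient is the intercept C.
    design = np.column_stack([X, np.ones(len(y))])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[:-1], coef[-1]

def predict_mlr(X, k, C):
    """Evaluate the fitted MLR equation for new descriptor rows."""
    return np.asarray(X, dtype=float) @ k + C

# Toy data generated from Y = 2*x1 - 3*x2 + 1, not data from the paper.
toy_X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]])
toy_y = np.array([1.0, 3.0, -2.0, 0.0, 2.0])
k, C = fit_mlr(toy_X, toy_y)
```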

2.7. Generation of QSAR Model and Validation

The combinations of optimum descriptors for the training set were obtained from the descriptor pool using the Genetic Function Approximation technique. The antitubercular activities were placed in the last column of the respective spreadsheets in Microsoft Excel 2010, which were later imported into the Materials Studio software version 8.0 to generate the QSAR model by the multilinear regression (MLR) approach and to evaluate the internal validation parameters [9].

2.8. Determination of Outlier and Influential Molecule (Applicability Domain)

The applicability domain approach was employed to identify outliers and influential molecules. Any compound whose standardized residual falls outside the applicability domain range of ±3 is said to be an outlier. To define and describe the applicability domain of the built QSAR models, the leverage $h_i$ approach was employed, defined as follows [10]:

$$h_i = x_i (X^T X)^{-1} x_i^T$$

where $x_i$ is the descriptor row vector of compound i, X is the n × k descriptor matrix of the training set compounds used to build the model, and $X^T$ is the transpose of X. The warning leverage $h^*$ is the limit value used to check for influential molecules. It is defined as

$$h^* = \frac{3(j + 1)}{m}$$

where j is the number of descriptors in the built model and m is the number of compounds that make up the training set.
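The leverage calculation can be sketched with NumPy as below. The leverage of a compound is taken as the quadratic form of its descriptor row with the inverse of the training set cross-product matrix, and the warning leverage is assumed to follow the conventional definition 3(j + 1)/m. Function names and the toy matrix are hypothetical.

```python
import numpy as np

def leverages(X_train, X_query=None):
    """h_i = x_i (X^T X)^-1 x_i^T for each row x_i of X_query (default: X_train)."""
    X_train = np.asarray(X_train, dtype=float)
    X_query = X_train if X_query is None else np.asarray(X_query, dtype=float)
    # pinv is used instead of inv so a near-singular matrix does not raise.
    xtx_inv = np.linalg.pinv(X_train.T @ X_train)
    return np.einsum("ij,jk,ik->i", X_query, xtx_inv, X_query)

def warning_leverage(n_descriptors, n_train):
    """Assumed conventional threshold h* = 3 (j + 1) / m."""
    return 3.0 * (n_descriptors + 1) / n_train

# Toy training matrix (3 compounds x 2 descriptors), not data from the paper.
toy_X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
h = leverages(toy_X)
```

A compound whose leverage exceeds the warning value would be flagged as influential, and a test compound can be checked by passing its descriptors as `X_query`.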

2.9. Assessment of Y-Randomization

The Y-randomization test is a confirmatory test to show that the developed QSAR model is reliable and robust and was not obtained by chance. This test was performed on the training set data as described in [11]. Multilinear regression (MLR) models were generated by randomly shuffling the dependent variable (activity data) while keeping the independent variables (descriptors) unaltered. The randomized models are expected to have significantly low $R^2$ and $Q^2$ values over a number of trials, which confirms that the developed QSAR model is robust. The Y-randomization coefficient ($cR_p^2$) is another important parameter, which should be more than 0.5 to pass this test:

$$cR_p^2 = R \times \sqrt{R^2 - \bar{R}_r^2}$$

Here $cR_p^2$ is the Y-randomization coefficient, R is the correlation coefficient of the model subjected to Y-randomization, and $\bar{R}_r$ is the average R of the random models.
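A minimal sketch of the Y-randomization procedure is shown below: the activities are shuffled, an MLR model is refitted for each shuffle, and the coefficient is computed from the original and the averaged random statistics. The sketch assumes that the average of the squared correlation coefficients of the random models is what enters the formula; the function names, trial count, seed, and synthetic data are all hypothetical.

```python
import numpy as np

def r_squared(X, y):
    """R^2 of an ordinary least-squares MLR fit with intercept."""
    design = np.column_stack([X, np.ones(len(y))])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    centered = y - y.mean()
    return 1.0 - (resid @ resid) / (centered @ centered)

def y_randomization(X, y, n_trials=10, seed=0):
    """Return (R^2 of original model, mean R^2 of shuffled models, cRp^2)."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    r2 = r_squared(X, y)
    # Shuffle only the activities; the descriptor matrix stays unaltered.
    r2_random = np.array([r_squared(X, rng.permutation(y)) for _ in range(n_trials)])
    mean_r2_random = r2_random.mean()
    c_rp2 = np.sqrt(r2) * np.sqrt(max(r2 - mean_r2_random, 0.0))
    return r2, mean_r2_random, c_rp2

# Synthetic data with a genuine linear relationship, not data from the paper.
rng = np.random.default_rng(1)
demo_X = rng.normal(size=(30, 3))
demo_y = demo_X @ np.array([1.0, 2.0, 3.0]) + 0.01 * rng.normal(size=30)
r2, mean_r2_random, c_rp2 = y_randomization(demo_X, demo_y, n_trials=20)
```

For a model that reflects a real structure-activity relationship, the shuffled fits should score far lower than the original fit, which pushes the coefficient toward the original R.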

2.10. External Validation of the Model

The developed QSAR model was further subjected to external validation using the Golbraikh and Tropsha criteria [11, 12] listed below: (i) $\lvert r_0^2 - r_0'^2 \rvert$ (threshold value < 0.3); (ii) $(r^2 - r_0^2)/r^2$ (threshold value < 0.1); (iii) $(r^2 - r_0'^2)/r^2$ (threshold value < 0.1); (iv) $k$ (threshold value $0.85 \le k \le 1.15$); (v) $k'$ (threshold value $0.85 \le k' \le 1.15$). Here $r^2$ is the squared correlation coefficient of the plot of observed activity against calculated activity, $r_0^2$ is the squared correlation coefficient of the plot of observed activity against calculated activity at zero intercept, $r_0'^2$ is the squared correlation coefficient of the plot of calculated activity against observed activity at zero intercept, $k$ is the slope of the plot of observed activity against calculated activity at zero intercept, and $k'$ is the slope of the plot of calculated activity against observed activity at zero intercept.
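The criteria above can be evaluated with a few lines of NumPy, as in the sketch below. It assumes one common convention for the zero-intercept quantities (slopes from regression through the origin and the corresponding coefficients of determination); other conventions exist in the literature, so the numbers it produces need not match those of a particular software package. The function name and the toy activities are hypothetical.

```python
import numpy as np

def golbraikh_tropsha(y_obs, y_calc):
    """Compute r^2, r0^2, r0'^2, k, k' and check them against the thresholds."""
    y_obs = np.asarray(y_obs, dtype=float)
    y_calc = np.asarray(y_calc, dtype=float)
    r2 = np.corrcoef(y_obs, y_calc)[0, 1] ** 2
    # Slopes of the regression lines forced through the origin.
    k = (y_obs @ y_calc) / (y_calc @ y_calc)        # observed vs calculated
    k_prime = (y_obs @ y_calc) / (y_obs @ y_obs)    # calculated vs observed
    # Coefficients of determination for the zero-intercept lines (assumed convention).
    r0_2 = 1.0 - np.sum((y_obs - k * y_calc) ** 2) / np.sum((y_obs - y_obs.mean()) ** 2)
    r0p_2 = 1.0 - np.sum((y_calc - k_prime * y_obs) ** 2) / np.sum((y_calc - y_calc.mean()) ** 2)
    checks = {
        "abs(r0^2 - r0'^2) < 0.3": abs(r0_2 - r0p_2) < 0.3,
        "(r^2 - r0^2)/r^2 < 0.1": (r2 - r0_2) / r2 < 0.1,
        "(r^2 - r0'^2)/r^2 < 0.1": (r2 - r0p_2) / r2 < 0.1,
        "0.85 <= k <= 1.15": 0.85 <= k <= 1.15,
        "0.85 <= k' <= 1.15": 0.85 <= k_prime <= 1.15,
    }
    return {"r2": r2, "r0_2": r0_2, "r0p_2": r0p_2, "k": k, "k_prime": k_prime,
            "checks": checks, "passes": all(checks.values())}

# Toy activities, not data from the paper.
toy_obs = np.array([6.0, 7.0, 8.0, 9.0])
perfect = golbraikh_tropsha(toy_obs, toy_obs)        # calculated equals observed
biased = golbraikh_tropsha(toy_obs, 2.0 * toy_obs)   # calculated is twice observed
```

The second toy case has a perfect correlation but fails the slope criteria, which is exactly the situation these zero-intercept checks are meant to catch.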

2.11. Affirmation of the Built Model

The fitting ability, stability, reliability, predictiveness, and robustness of the developed models were evaluated by internal and external validation parameters. The validation parameters were compared with the accepted threshold value for any QSAR model [10-13] shown in Table 6.
Table 6

Validation parameters for each model using multilinear regression (MLR).

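S/NO | Validation Parameters | Formula | Threshold | Model 1 | Model 2 | Model 3 | Model 4
Internal Validation
1 | Friedman LOF | $\mathrm{SEE} / \left(1 - \frac{C + d \times p}{M}\right)^2$ | | 0.03167 | 0.03253 | 0.03561 | 0.04567
2 | R-squared | $1 - \frac{\sum (Y_{exp} - Y_{pred})^2}{\sum (Y_{exp} - \bar{Y}_{training})^2}$ | $R^2 > 0.6$ | 0.9265 | 0.8765 | 0.8454 | 0.8123
3 | Adjusted R-squared | $1 - \frac{(1 - R^2)(n - 1)}{n - p - 1}$ | $R^2_{adj} > 0.6$ | 0.9045 | 0.8464 | 0.8277 | 0.7800
4 | Cross-validated R-squared ($Q^2_{cv}$) | $1 - \frac{\sum (Y_{pred} - Y_{exp})^2}{\sum (Y_{exp} - \bar{Y}_{training})^2}$ | $Q^2 > 0.6$ | 0.8512 | 0.8154 | 0.7574 | 0.7245
5 | Significant regression | | | Yes | Yes | Yes | Yes
6 | Critical SOR F-value (95%) | $\frac{\sum (Y_{pred} - \bar{Y}_{training})^2 / p}{\sum (Y_{pred} - Y_{exp})^2 / (N - p - 1)}$ | F(test) > 2.09 | 3.6465 | 3.6542 | 3.75443 | 3.8743
7 | Replicate points | | | 0 | 0 | 0 | 0
8 | Computed observed error | | | 0 | 0 | 0 | 0
9 | Min expt. error for non-significant LOF (95%) | | | 0.03432 | 0.0354 | 0.04632 | 0.0485
Model Randomization
10 | Average of the correlation coefficient for randomized data ($\bar{R}_r$) | | $\bar{R}_r < 0.5$ | 0.3866 | 0.3265 | 0.4644 | 0.4875
11 | Average of determination coefficient for randomized data ($\bar{R}_r^2$) | | $\bar{R}_r^2 < 0.5$ | 0.1465 | 0.1843 | 0.2541 | 0.2533
12 | Average of leave-one-out cross-validated determination coefficient for randomized data ($\bar{Q}_r^2$) | | $\bar{Q}_r^2 < 0.5$ | -1.3325 | -1.3522 | -1.4023 | -1.4854
13 | Coefficient for Y-randomization ($cR_p^2$) | $R \times \sqrt{R^2 - \bar{R}_r^2}$ | $cR_p^2 > 0.6$ | 0.7443 | 0.7103 | 0.6587 | 0.5873
External Validation
14 | Slope of the plot of observed activity against calculated activity at zero intercept ($k$) | $\frac{\sum Y_{obs} Y_{cal}}{\sum Y_{cal}^2}$ | $0.85 < k < 1.15$ | 1.0016 | 1.04732 | 1.0054 | 1.1134
15 | Slope of the plot of calculated against observed activity at zero intercept ($k'$) | $\frac{\sum Y_{obs} Y_{cal}}{\sum Y_{obs}^2}$ | $0.85 < k' < 1.15$ | 0.81233 | 0.9432 | 0.6432 | 0.96433
16 | $\lvert r_0^2 - r_0'^2 \rvert$ | | < 0.3 | 0.01643 | 0.07433 | 0.05322 | 0.04324
17 | $(r^2 - r_0^2)/r^2$ | | < 0.1 | 0.00243 | 0.00573 | 0.07843 | 0.0643
18 | $(r^2 - r_0'^2)/r^2$ | | < 0.1 | 0.05332 | 0.06453 | 0.07637 | 0.8633
19 | $R^2_{test}$ | $1 - \frac{\sum (Y_{ext} - \hat{Y}_{ext})^2}{\sum (Y_{ext} - \bar{Y})^2}$ | $R^2_{pred} > 0.6$ | 0.8034 | 0.75433 | 0.6765 | 0.6123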
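The main internal validation statistics in Table 6 can be illustrated with the sketch below, which computes the R-squared, the adjusted R-squared, and the leave-one-out cross-validated R-squared for an MLR model. It assumes the standard adjusted R-squared, 1 − (1 − R²)(n − 1)/(n − p − 1), and uses the mean activity of the full training set in the denominator of the cross-validated statistic. Function names and the synthetic data are hypothetical.

```python
import numpy as np

def _fit(X, y):
    """Least-squares MLR coefficients, intercept stored last."""
    design = np.column_stack([X, np.ones(len(y))])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

def _predict(X, coef):
    return np.column_stack([X, np.ones(X.shape[0])]) @ coef

def internal_validation(X, y):
    """Return (R^2, adjusted R^2, leave-one-out Q^2) for an MLR model."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape
    total_ss = np.sum((y - y.mean()) ** 2)
    resid = y - _predict(X, _fit(X, y))
    r2 = 1.0 - np.sum(resid ** 2) / total_ss
    r2_adj = 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)
    # Leave-one-out: refit without compound i, then predict compound i.
    press = 0.0
    for i in range(n):
        coef = _fit(np.delete(X, i, axis=0), np.delete(y, i))
        press += (y[i] - _predict(X[i:i + 1], coef)[0]) ** 2
    q2 = 1.0 - press / total_ss
    return r2, r2_adj, q2

# Synthetic data, not data from the paper.
rng = np.random.default_rng(0)
demo_X = rng.normal(size=(12, 2))
demo_y = 1.5 * demo_X[:, 0] - 0.5 * demo_X[:, 1] + 2.0 + 0.1 * rng.normal(size=12)
r2, r2_adj, q2 = internal_validation(demo_X, demo_y)
```

Because each left-out compound is predicted by a model that never saw it, the cross-validated value cannot exceed the fitted R-squared, which is why it is the stricter of the two checks.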

3. Results and Discussion

A theoretical approach was employed to derive a QSAR model for predicting the activities of 2,4-disubstituted quinoline analogues against Mycobacterium tuberculosis. The Kennard-Stone algorithm employed in this research divided the 36 studied compounds into a training set of 25 compounds and a test set of 11 compounds. The model was built on the basis of the training set, while validation of the model was assessed with the test set. The descriptors that best predict the activities of the inhibitory compounds were selected with Genetic Function Approximation (GFA), while the multilinear regression (MLR) method was used as the modeling technique in generating the QSAR model. GFA-MLR led to the selection of five (5) descriptors and four (4) QSAR models (Model 1, Model 2, Model 3, and Model 4). The observed activities, the calculated activities of the inhibitors, the residual values, and the leverage value for each compound are reported in Table 1. The low residual values between observed and calculated activities indicate that the generated model has a high predictive ability. The calculated descriptors for the training set and test set used in generating model 1 are reported in Table 2 for the purpose of reproducibility.
Table 2

Calculated descriptors for training set in generating model 1.

Molecule | AATS5e | VR1_Dzs | SpMin7_Bhe | TDB9e | RDF110s | Calculated Activity
Training set
10 2.3115470.50405564.515520.527200520.35062637.67865
11 2.67309062.6813634.27717759.042756317.71263
12 2.5208330.50146857.739721.291889672.96E-696.495725
13 2.0705130.39914457.396822.438356990.196202189.62779
14 4.7125510.45285266.027746.521048290.358503137.88645
15 2.8348230.44281668.010634.115336893.170709447.411826
17 2.2500860.43256969.732244.345197542.696860827.17282
18 1.966490.41377763.862020.967857650.097692947.224153
2 1.7397120.41377762.705254.065518311.087680866.713561
22 2.0179310.41377757.967743.160247234.02E-056.3256
23 3.220530.46748563.019046.863459245.292706527.73765
25 2.443220.45182459.4202618.60363611.720120236.039603
26 1.9519680.50405563.870782.642302190.480138136.809542
28 2.250.4111952.983391.40036721.32E-1787.39202
29 2.1367520.4111956.140891.682882941.32E-856.508441
3 2.5403680.44923762.498346.734396581.569418596.664744
30 2.330070.43899161.123753.136655260.399828776.914677
32 2.2820510.71726969.051350.800404632.87E-177.486908
33 4.4916670.71726970.413452.292834681.64E-057.25273
34 2.69287059.1039937.24309783.369245977.49224
35 4.9349980.75531672.896438.059352171.922313717.025132
36 4.8088260.74506977.785298.302827690.380526867.16429
4 2.1773380.50405560.254782.272492290.001902679.73193
5 2.497643063.149213.57104094.024223926.896778
6 2.3296020.42323657.110633.943856940.22442066.510442
Test set
1 1.8431370.39914458.859830.5883527.75E-1017.22456
16 2.535225066.262768.9963742.65041656.781862
19 2.166170.44157758.022579.2412660.62301997.67409
20 3.5732780.46489963.501655.4428462.62060167.3187
21 6.7298420.77097778.825037.6317466.25049217.273758
24 2.2230390.50146858.211139.0462090.00373055.816571
27 2.0311110.4111956.346575.8808330.25626247.357741
31 2.4996220.42278559.887932.5652460.228847.50052
7 2.9117650.50146855.174251.2621442.64E-1826.972982
8 1.5714290.58888906.09E-174.24E-2987.152527
9 2.568603063.931437.5765141.22814576.985668

The names and symbols of the descriptors selected by the GFA approach are presented in Table 3. The combination of selected descriptors (2D and 3D) reported in model 1 indicates that these types of descriptors are able to characterize the structure of the antitubercular molecules and give better information about it.
Table 3

List of some descriptors used in the QSAR optimization model.

S/NO | Descriptor symbol | Name of descriptor | Class
1 | AATS5e | Average Broto-Moreau autocorrelation - lag 5 / weighted by Sanderson electronegativities | 2D
2 | VR1_Dzs | Randic-like eigenvector-based index from Barysz matrix / weighted by I-state | 2D
3 | SpMin7_Bhe | Smallest absolute eigenvalue of Burden modified matrix - n 7 / weighted by relative Sanderson electronegativities | 2D
4 | TDB9e | 3D topological distance based autocorrelation - lag 9 / weighted by Sanderson electronegativities | 3D
5 | RDF110s | Radial distribution function - 110 / weighted by relative I-state | 3D

The statistics of the selected descriptors reported in model 1 are presented in Table 4. The Variance Inflation Factor (VIF) of each descriptor was calculated in order to check for orthogonality. The VIF values shown in Table 4 are all less than 4, which indicates that the descriptors are statistically significant and orthogonal.
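For reference, the VIF of a descriptor is 1/(1 − R²), where R² comes from regressing that descriptor on all the other descriptors in the model. A minimal NumPy sketch under that standard definition follows; the function name and the toy matrix are hypothetical.

```python
import numpy as np

def variance_inflation_factors(X):
    """VIF_j = 1 / (1 - R_j^2), with R_j^2 from regressing column j on the others."""
    X = np.asarray(X, dtype=float)
    vifs = []
    for j in range(X.shape[1]):
        target = X[:, j]
        others = np.delete(X, j, axis=1)
        design = np.column_stack([others, np.ones(len(target))])
        coef, *_ = np.linalg.lstsq(design, target, rcond=None)
        resid = target - design @ coef
        centered = target - target.mean()
        r2 = 1.0 - (resid @ resid) / (centered @ centered)
        # A perfectly collinear column gives r2 = 1 and an infinite VIF.
        vifs.append(np.inf if r2 >= 1.0 else 1.0 / (1.0 - r2))
    return np.array(vifs)

# Two perfectly uncorrelated toy descriptors, not data from the paper.
toy = np.array([[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]])
toy_vif = variance_inflation_factors(toy)
```

A VIF of 1 means a descriptor shares no linear information with the others, and larger values indicate growing collinearity.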
Table 4

Statistical parameters that influence the model.

Descriptor | Standard regression coefficient (bj) | Mean Effect (ME) | P-Value (Confidence interval) | VIF | Standard Error
AATS5e | -0.3532 | -0.4429 | 0.000546 | 2.1943 | 0.00654
VR1_Dzs | 0.2376 | 0.3552 | 0.0236 | 2.3743 | 0.53182
SpMin7_Bhe | -0.1343 | -0.8826 | 4.34E-04 | 1.6456 | 0.7866E-05
TDB9e | 0.5789 | 0.5196 | 2.12E-05 | 1.0491 | 0.00867
RDF110s | 0.94224 | -0.4405 | 0.0135 | 2.7860 | 3.65E-05

The mean effect (ME) and standard regression coefficient (bj) values reported in Table 4 give vital information on the effect of each descriptor and its degree of contribution to the developed model. The signs and magnitudes of the mean effect values indicate the direction in which each descriptor influences the activity of a compound and its individual strength. Table 4 also gives the P-value of each descriptor in the model at the 95% confidence level. Since every P-value is below 0.05, the null hypothesis that there is no association between the descriptors and the activities of the molecules is rejected, and the alternative hypothesis that there is a relationship between the descriptors used in generating the model and the activities of the compounds is accepted. The Pearson correlation coefficients calculated for the descriptors in the model are reported in Table 5. The low correlation coefficients between the descriptors imply that there is no significant intercorrelation among them.
Table 5

Pearson's correlation coefficient for the descriptor used in the QSAR model.

Inter-correlation
 | AATS5e | VR1_Dzs | SpMin7_Bhe | TDB9e | RDF110s
AATS5e | 1 | | | |
VR1_Dzs | 0.414812 | 1 | | |
SpMin7_Bhe | 0.668151 | 0.498043 | 1 | |
TDB9e | 0.1092 | -0.67462 | -0.04264 | 1 |
RDF110s | 0.061763 | -0.6067 | 0.095274 | 0.0728009 | 1

The external and internal validation parameters used to assure that the developed models are stable and robust are reported in Table 6. These parameters are in agreement with the threshold values reported in Table 6, which confirms the robustness and stability of the model. Based on these validation parameters, model 1 was selected as the optimum model and used to predict the activities of the 2,4-disubstituted quinoline derivatives. The QSAR model generated in this research was compared with models obtained in the literature [4, 5], for which the external validation of the test set was found to be Rpred = 0.8842 [5] and Rpred = 0.7690 [4]. The validation parameters reported in this work and those reported in the literature are all in agreement with the thresholds presented in Table 6, which confirms the robustness of the generated model. The Y-randomization test was also conducted, and the Y-randomization coefficient ($cR_p^2$) has a significant value of 0.7443, greater than 0.5, as reported in Table 7. This supports the claim that the generated model is powerful and was not obtained by chance.
Table 7

Y-randomization parameters test.

Model | R | R2 | Q2
Original | 0.9265 | 0.9045 | 0.8512
Random 1 | 0.3454 | 0.1193 | -1.0841
Random 2 | 0.4868 | 0.2370 | -1.0985
Random 3 | 0.4408 | 0.1943 | -0.9815
Random 4 | 0.5575 | 0.3108 | -0.5503
Random 5 | 0.2957 | 0.0874 | -1.1088
Random 6 | 0.5562 | 0.3093 | -0.7285
Random 7 | 0.7724 | 0.5966 | 0.0328
Random 8 | 0.2752 | 0.0757 | -1.1166
Random 9 | 0.74823 | 0.5598 | -0.0362
Random 10 | 0.5557 | 0.3088 | -0.4448
Random Models Parameters
Average r | 0.3866
Average r2 | 0.1465
Average Q2 | -0.3325
cRp2 | 0.7443

The graphs of calculated activities plotted against observed activities of the training and test sets are presented in Figures 1 and 2. The correlation coefficient ($R^2$) of 0.9265 for the training set and of 0.8034 for the test set recorded in this work are in line with the accepted QSAR threshold values reported in Table 6. This affirms the stability, reliability, and predictive power of the built model. The plot of standardized residual activity against observed activity shown in Figure 3 indicates that there is no computational inaccuracy in the derived QSAR model, as the residual values fall within the accepted limit of ±2 on the residual activity axis.
Figure 1

Plot of calculated activity against observed activity of training set.

Figure 2

Plot of calculated activity against observed activity of test set.

Figure 3

Plot of standardized residual activity versus observed activity.

The standardized residual activities plotted against the leverage values, known as the Williams plot, are shown in Figure 4. The plot clearly shows that all the compounds fall within the ±3 boundary of standardized cross-validated residuals. Hence, it can be inferred that no outlier is present in the data set. However, compound number 30 has a leverage value (0.81434, Table 1) greater than the calculated warning leverage (h* = 0.60). Therefore the compound is an influential molecule.
Figure 4

The Williams plot of the standardized residuals versus the leverage value.

3.1. D-Optimal Design

D-optimal design was carried out in order to determine the optimal design location and maximize the efficiency of estimating a specified model. This was achieved using Statgraphics 18 software. From the results presented in Table 8, the R-squared statistic indicates that the model as fitted explains 80.9278% of the variability in observed activities. The correlation coefficient equals 0.899599, indicating a moderately strong relationship between the variables (descriptors). The standard error of the estimate shows the standard deviation of the residuals to be 0.345508. This value can be used to construct prediction limits for new observations. The mean absolute error (MAE) of 0.25514 is the average absolute value of the residuals. The Durbin-Watson (DW) statistic tests the residuals to determine whether there is any significant correlation based on the order in which they occur in the data file. Since the P-value (0.3302) is greater than 0.05, there is no indication of serial autocorrelation in the residuals at the 95.0% confidence level.
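The residual diagnostics named above follow standard definitions, sketched here with NumPy: the mean absolute error is the average absolute residual, and the Durbin-Watson statistic is the sum of squared successive residual differences divided by the sum of squared residuals. The function names and the toy residuals are hypothetical, and the sketch does not reproduce the Statgraphics P-value calculation.

```python
import numpy as np

def mean_absolute_error(residuals):
    """Average absolute value of the residuals."""
    return float(np.mean(np.abs(np.asarray(residuals, dtype=float))))

def durbin_watson(residuals):
    """DW = sum((e_t - e_{t-1})^2) / sum(e_t^2); values near 2 suggest no lag-1 autocorrelation."""
    e = np.asarray(residuals, dtype=float)
    return float(np.sum(np.diff(e) ** 2) / np.sum(e ** 2))

# Toy residual sequences, not data from the paper.
alternating = [1.0, -1.0, 1.0, -1.0]   # strong negative serial correlation
constant = [1.0, 1.0, 1.0, 1.0]        # strong positive serial correlation
dw_alternating = durbin_watson(alternating)
dw_constant = durbin_watson(constant)
mae_alternating = mean_absolute_error(alternating)
```

The statistic ranges from 0 to 4, so the reported value of 1.81474 sits close to the no-autocorrelation midpoint of 2.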
Table 8

D optimal validation parameters.

D optimal validation parameter | Value
Correlation Coefficient | 0.899599
R-squared | 80.9278 percent
R-squared (adjusted for d.f.) | 80.0986 percent
Standard Error of Est. | 0.345508
Mean absolute error | 0.25514
Durbin-Watson statistic | 1.81474 (P=0.3302)
Lag 1 residual autocorrelation | 0.0925989

The observed versus predicted plot presented in Figure 5 shows the observed values of Y on the vertical axis and the predicted values of Y on the horizontal axis. The points are randomly scattered around the diagonal line, which indicates that the model fits well. The Prediction Variance Plot presented in Figure 6 shows how the standard error of the predicted response varies across the design region. The standard error displayed is the square root of the unscaled prediction variance. A surface plot is created for the first two design factors, AATS5e and RDF110s, with all other factors held constant. For an optimal design, the standard error must be lowest near the center of the design region and increase as the location moves away from the center in any direction. The Prediction Profile graph presented in Figure 7 displays the standard error of the predicted response as a function of each design factor as the factors are moved from a specified reference point. The reference location in the design region was AATS5e = 3.34, RDF110s = 4.52, SpMin7_Bhe = 65.38, TDB9e = 18.89, and VR1_Dzs = 0.38. At this location, the standard error of prediction equals 0.345508. The plot shows the location of each factor in standardized units. In standardized units, the specified low value equals -0.4, the center is 0, and the specified high value equals 0.4. The lines on the plot show how the standard error changes as the factors are moved away from the reference location. It can be clearly seen that the standard errors remain small within the low to high range (-0.4 to 0.4) but start to increase rapidly outside that range.
Figure 5

Plot of observed versus predicted values.

Figure 6

Variance plot shows how the standard error of the predicted response varies across the design region.

Figure 7

Prediction profile graph displays the standard error of the predicted response.

4. Conclusion

A theoretical approach was employed in this study on selected molecular descriptors to derive a model that correlates the structures of 2,4-disubstituted quinoline derivatives, as potent inhibitors against Mycobacterium tuberculosis, with their respective biological activities. The derived model was subjected to internal and external validation tests to confirm that the built QSAR model is significant, robust, and reliable. From the results, it is concluded that the activities of 2,4-disubstituted quinoline derivatives can be modeled using the molecular descriptors AATS5e, VR1_Dzs, SpMin7_Bhe, TDB9e, and RDF110s. The built QSAR model will be a vital tool for pharmaceutical and medicinal chemists to design and synthesize novel antitubercular drugs with better activities against M. tuberculosis.
  3 in total

Review 1.  Chem-bioinformatics and QSAR: a review of QSAR lacking positive hydrophobic terms.

Authors:  C Hansch; A Kurup; R Garg; H Gao
Journal:  Chem Rev       Date:  2001-03       Impact factor: 60.622

2.  Some case studies on application of "r(m)2" metrics for judging quality of quantitative structure-activity relationship predictions: emphasis on scaling of response data.

Authors:  Kunal Roy; Pratim Chakraborty; Indrani Mitra; Probir Kumar Ojha; Supratik Kar; Rudra Narayan Das
Journal:  J Comput Chem       Date:  2013-01-08       Impact factor: 3.376

3.  QSAR Modeling and Molecular Docking Analysis of Some Active Compounds against Mycobacterium tuberculosis Receptor (Mtb CYP121).

Authors:  Shola Elijah Adeniji; Sani Uba; Adamu Uzairu
Journal:  J Pathog       Date:  2018-05-10