Shola Elijah Adeniji, Sani Uba, Adamu Uzairu, David Ebuka Arthur.
Abstract
The emergence of multidrug-resistant strains of M. tuberculosis has driven the development of more potent antituberculosis agents. Novel compounds are usually synthesized by trial and error, which is time-consuming and expensive. QSAR is a theoretical approach with the potential to reduce this cost in the discovery of new potent drugs against M. tuberculosis. It was employed here to develop a multivariate QSAR model that correlates the chemical structures of 2,4-disubstituted quinoline analogues with their observed activities. To build a robust QSAR model, Genetic Function Approximation (GFA) was used to select the descriptors that best predict the activities of the inhibitory agents. The developed model depends on five molecular descriptors: AATS5e, VR1_Dzs, SpMin7_Bhe, TDB9e, and RDF110s. Internal validation of the derived model gave a squared correlation coefficient (R2) of 0.9265, an adjusted squared correlation coefficient (R2 adj) of 0.9045, and a leave-one-out cross-validation coefficient (Q2 cv) of 0.8512, while external validation gave an R2 test of 0.8034 and a Y-randomization coefficient (cR2 p) of 0.6633. The proposed QSAR model provides a valuable approach for modifying the lead compound and for designing and synthesizing more potent antitubercular agents.
Year: 2019 PMID: 31186969 PMCID: PMC6521565 DOI: 10.1155/2019/5173786
Source DB: PubMed Journal: Adv Prev Med
Molecular structures of inhibitory compounds and their derivatives as antitubercular agents.
| S/N | Molecular structure | Observed Activity | Observed Activity | Calculated Activity | Residual | Leverage |
|---|---|---|---|---|---|---|
| | | 11 | 6.8191 | 7.22456 | -0.40546 | 0.186966 |
| | | 12 | 6.8418 | 6.713561 | 0.128239 | 0.267393 |
| | | 11 | 6.8601 | 6.664744 | 0.195356 | 0.072612 |
| | | 99 | 9.4979 | 9.73193 | -0.23403 | 0.15548 |
| | | 14 | 6.9772 | 6.896778 | 0.080422 | 0.328411 |
| | | 23 | 7.2608 | 6.510442 | 0.750358 | 0.055405 |
| | | 20 | 7.1707 | 6.972982 | 0.197718 | 0.407733 |
| | | 30 | 7.4233 | 7.152527 | 0.270773 | 0.378878 |
| | | 20 | 7.2838 | 6.985668 | 0.298132 | 0.085176 |
| | | 16 | 7.1472 | 7.67865 | -0.53145 | 0.343511 |
| | | 42 | 7.6035 | 7.71263 | -0.10913 | 0.084914 |
| | | 27 | 7.2938 | 6.495725 | 0.798075 | 0.096543 |
| | | 99 | 9.6090 | 9.62779 | -0.01879 | 0.089973 |
| | | 21 | 7.2630 | 7.88645 | -0.62345 | 0.067538 |
| | | 30 | 7.4772 | 7.411826 | 0.065374 | 0.101346 |
| | | 10 | 6.8909 | 6.781862 | 0.109038 | 0.218861 |
| | | 15 | 7.0807 | 7.17282 | -0.09212 | 0.090942 |
| | | 21 | 7.2747 | 7.224153 | 0.050547 | 0.079898 |
| | | 23 | 7.4091 | 7.67409 | -0.26499 | 0.075513 |
| | | 40 | 7.7412 | 7.3187 | 0.4225 | 0.154686 |
| | | 42 | 7.6688 | 7.273758 | 0.395042 | 0.0423 |
| | | 21 | 6.2688 | 6.3256 | -0.0568 | 0.05984 |
| | | 40 | 7.6970 | 7.73765 | -0.04065 | 0.357197 |
| | | 7 | 6.7741 | 5.816571 | 0.957529 | 0.214607 |
| | | 3 | 6.2513 | 6.039603 | 0.211697 | 0.200793 |
| | | 10 | 6.8414 | 6.809542 | 0.031858 | 0.432707 |
| | | 28 | 7.3673 | 7.357741 | 0.009559 | 0.263698 |
| | | 21 | 7.1891 | 7.39202 | -0.20292 | 0.255295 |
| | | 10 | 6.8291 | 6.508441 | 0.320659 | 0.06229 |
| | | 10 | 6.9253 | 6.914677 | 0.010623 | 0.81434 |
| | | 18 | 7.2022 | 7.50052 | -0.29832 | 0.279776 |
| | | 52 | 7.7696 | 7.486908 | 0.282692 | 0.409976 |
| | | 9 | 6.7716 | 7.25273 | -0.48113 | 0.25708 |
| | | 30 | 7.4420 | 7.49224 | -0.05024 | 0.055855 |
| | | 26 | 7.3209 | 7.025132 | 0.295768 | 0.517231 |
| | | 14 | 6.9809 | 7.16429 | -0.18339 | 0.249575 |
Note. Superscript “a” represents the test set.
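
In the table above, each Residual equals the second Observed Activity value minus the Calculated Activity, and the Leverage column is the quantity a Williams plot compares against a warning threshold. The following Python sketch illustrates both checks on the first three rows. The cutoff h* = 3(p + 1)/n is a commonly used convention and is not stated in the table; the training-set size of 25 is an assumption made for illustration, and the function names are hypothetical.

```python
# (observed, calculated, residual, leverage) copied from the first three table rows
rows = [
    (6.8191, 7.22456, -0.40546, 0.186966),
    (6.8418, 6.713561, 0.128239, 0.267393),
    (6.8601, 6.664744, 0.195356, 0.072612),
]


def residual(observed, calculated):
    """Residual as tabulated: observed activity minus calculated activity."""
    return observed - calculated


def warning_leverage(n_descriptors, n_training):
    """Williams-plot cutoff h* = 3(p + 1) / n (assumed convention)."""
    return 3 * (n_descriptors + 1) / n_training


# Largest gap between the recomputed and the tabulated residuals.
max_residual_mismatch = max(
    abs(residual(obs, calc) - res) for obs, calc, res, _ in rows
)

# Five descriptors (named in the abstract); 25 training compounds is an assumption.
h_star = warning_leverage(n_descriptors=5, n_training=25)

# Rows whose leverage exceeds the cutoff would be flagged as influential.
flagged = [lev for _, _, _, lev in rows if lev > h_star]

print(f"max residual mismatch: {max_residual_mismatch:.2e}")
print(f"warning leverage h*: {h_star:.2f}; flagged rows: {len(flagged)}")
```

Under these assumptions a compound would be flagged only when its leverage exceeds the computed cutoff; a different training-set size changes the cutoff accordingly.
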
Validation parameters for each model using multilinear regression (MLR).
| S/NO | Validation Parameters | Formula | Threshold | Model 1 | Model 2 | Model 3 | Model 4 |
|---|---|---|---|---|---|---|---|
| | Friedman LOF | | | 0.03167 | 0.03253 | 0.03561 | 0.04567 |
| | R-squared | | R2 > 0.6 | 0.9265 | 0.8765 | 0.8454 | 0.8123 |
| | Adjusted R-squared | | Radj2 > 0.6 | 0.9045 | 0.8464 | 0.8277 | 0.7800 |
| | Cross validated R-squared | | Q2 > 0.6 | 0.8512 | 0.8154 | 0.7574 | 0.7245 |
| | Significant Regression | | | Yes | Yes | Yes | Yes |
| | Critical SOR F-value (95%) | | F(test) > 2.09 | 3.6465 | 3.6542 | 3.75443 | 3.8743 |
| | Replicate points | | | 0 | 0 | 0 | 0 |
| | Computed observed error | | | 0 | 0 | 0 | 0 |
| | Min expt. error for non-significant LOF (95%) | | | 0.03432 | 0.0354 | 0.04632 | 0.0485 |
| | Average of the correlation coefficient for randomized data | | | 0.3866 | 0.3265 | 0.4644 | 0.4875 |
| | Average of determination coefficient for randomized data | | | 0.1465 | 0.1843 | 0.2541 | 0.2533 |
| | Average of leave one out cross-validated determination coefficient for randomized data | | | -1.3325 | -1.3522 | -1.4023 | -1.4854 |
| | Coefficient for Y-randomization | | cRp2 > 0.6 | 0.7443 | 0.7103 | 0.6587 | 0.5873 |
| | Slope of the plot of Observed activity against Calculated activity values at zero intercept | | 0.85<k<1.15 | 1.0016 | 1.04732 | 1.0054 | 1.1134 |
| | Slope of the plot of Calculated against Observed activity at zero intercept | | 0.85<k<1.15 | 0.81233 | 0.9432 | 0.6432 | 0.96433 |
| | | | <0.3 | 0.01643 | 0.07433 | 0.05322 | 0.04324 |
| | | | <0.1 | 0.00243 | 0.00573 | 0.07843 | 0.0643 |
| | | | <0.1 | 0.05332 | 0.06453 | 0.07637 | 0.8633 |
| | | | Rpred2 > 0.6 | 0.8034 | 0.75433 | 0.6765 | 0.6123 |
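
The R-squared, adjusted R-squared, and cross-validated R-squared rows above follow standard definitions, which the Python sketch below spells out. The observed, fitted, and leave-one-out values in it are hypothetical toy numbers, not data from this study, and the function names are illustrative. The formula cells of the table were not recovered, so the textbook forms are assumed here.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((y - y_hat) ** 2 for y, y_hat in zip(observed, predicted))
    ss_tot = sum((y - mean_obs) ** 2 for y in observed)
    return 1.0 - ss_res / ss_tot


def adjusted_r_squared(r2, n_samples, n_descriptors):
    """Penalise R2 for the number of descriptors p: 1 - (1 - R2)(n - 1)/(n - p - 1)."""
    return 1.0 - (1.0 - r2) * (n_samples - 1) / (n_samples - n_descriptors - 1)


def q2_loo(observed, loo_predicted):
    """Q2 = 1 - PRESS / SS_tot, where each prediction comes from a model
    refitted without that compound (leave-one-out)."""
    return r_squared(observed, loo_predicted)


# Hypothetical toy values, only to exercise the functions.
observed = [1.0, 2.0, 3.0, 4.0]
fitted = [1.1, 1.9, 3.2, 3.8]
loo_predicted = [1.3, 1.8, 3.4, 3.6]

r2 = r_squared(observed, fitted)
r2_adj = adjusted_r_squared(r2, n_samples=len(observed), n_descriptors=1)
q2 = q2_loo(observed, loo_predicted)

# Thresholds as listed in the table: each statistic should exceed 0.6.
passes = all(value > 0.6 for value in (r2, r2_adj, q2))
print(f"R2={r2:.4f}  R2adj={r2_adj:.4f}  Q2={q2:.4f}  passes={passes}")
```

Because leave-one-out predictions are made for compounds the model has not seen, Q2 is normally lower than R2, which is the pattern all four models in the table show.
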
Calculated descriptors for training set in generating model 1.
| Molecule | Descriptor | | | | | Calculated Activity |
|---|---|---|---|---|---|---|
| | 2.311547 | 0.504055 | 64.51552 | 0.52720052 | 0.3506263 | 7.67865 |
| | 2.67309 | 0 | 62.68136 | 34.2771775 | 9.04275631 | 7.71263 |
| | 2.520833 | 0.501468 | 57.73972 | 1.29188967 | 2.96E-69 | 6.495725 |
| | 2.070513 | 0.399144 | 57.39682 | 2.43835699 | 0.19620218 | 9.62779 |
| | 4.712551 | 0.452852 | 66.02774 | 6.52104829 | 0.35850313 | 7.88645 |
| | 2.834823 | 0.442816 | 68.01063 | 4.11533689 | 3.17070944 | 7.411826 |
| | 2.250086 | 0.432569 | 69.73224 | 4.34519754 | 2.69686082 | 7.17282 |
| | 1.96649 | 0.413777 | 63.86202 | 0.96785765 | 0.09769294 | 7.224153 |
| | 1.739712 | 0.413777 | 62.70525 | 4.06551831 | 1.08768086 | 6.713561 |
| | 2.017931 | 0.413777 | 57.96774 | 3.16024723 | 4.02E-05 | 6.3256 |
| | 3.22053 | 0.467485 | 63.01904 | 6.86345924 | 5.29270652 | 7.73765 |
| | 2.44322 | 0.451824 | 59.42026 | 18.6036361 | 1.72012023 | 6.039603 |
| | 1.951968 | 0.504055 | 63.87078 | 2.64230219 | 0.48013813 | 6.809542 |
| | 2.25 | 0.41119 | 52.98339 | 1.4003672 | 1.32E-178 | 7.39202 |
| | 2.136752 | 0.41119 | 56.14089 | 1.68288294 | 1.32E-85 | 6.508441 |
| | 2.540368 | 0.449237 | 62.49834 | 6.73439658 | 1.56941859 | 6.664744 |
| | 2.33007 | 0.438991 | 61.12375 | 3.13665526 | 0.39982877 | 6.914677 |
| | 2.282051 | 0.717269 | 69.05135 | 0.80040463 | 2.87E-17 | 7.486908 |
| | 4.491667 | 0.717269 | 70.41345 | 2.29283468 | 1.64E-05 | 7.25273 |
| | 2.69287 | 0 | 59.10399 | 37.2430978 | 3.36924597 | 7.49224 |
| | 4.934998 | 0.755316 | 72.89643 | 8.05935217 | 1.92231371 | 7.025132 |
| | 4.808826 | 0.745069 | 77.78529 | 8.30282769 | 0.38052686 | 7.16429 |
| | 2.177338 | 0.504055 | 60.25478 | 2.27249229 | 0.00190267 | 9.73193 |
| | 2.497643 | 0 | 63.1492 | 13.5710409 | 4.02422392 | 6.896778 |
| | 2.329602 | 0.423236 | 57.11063 | 3.94385694 | 0.2244206 | 6.510442 |
| | | | | | | |
| | 1.843137 | 0.399144 | 58.85983 | 0.588352 | 7.75E-101 | 7.22456 |
| | 2.535225 | 0 | 66.26276 | 8.996374 | 2.6504165 | 6.781862 |
| | 2.16617 | 0.441577 | 58.02257 | 9.241266 | 0.6230199 | 7.67409 |
| | 3.573278 | 0.464899 | 63.50165 | 5.442846 | 2.6206016 | 7.3187 |
| | 6.729842 | 0.770977 | 78.82503 | 7.631746 | 6.2504921 | 7.273758 |
| | 2.223039 | 0.501468 | 58.21113 | 9.046209 | 0.0037305 | 5.816571 |
| | 2.031111 | 0.41119 | 56.34657 | 5.880833 | 0.2562624 | 7.357741 |
| | 2.499622 | 0.422785 | 59.88793 | 2.565246 | 0.22884 | 7.50052 |
| | 2.911765 | 0.501468 | 55.17425 | 1.262144 | 2.64E-182 | 6.972982 |
| | 1.571429 | 0.588889 | 0 | 6.09E-17 | 4.24E-298 | 7.152527 |
| | 2.568603 | 0 | 63.93143 | 7.576514 | 1.2281457 | 6.985668 |
List of some descriptors used in the QSAR optimization model.
| S/NO | Descriptors symbols | Name of descriptor(s) | Class |
|---|---|---|---|
| 1 | AATS5e | Average Broto-Moreau autocorrelation - lag 5 / weighted by Sanderson electronegativities | 2D |
| 2 | VR1_Dzs | Randic-like eigenvector-based index from Barysz matrix / weighted by I-state | 2D |
| 3 | SpMin7_Bhe | Smallest absolute eigenvalue of Burden modified matrix - n 7 / weighted by relative Sanderson electronegativities | 2D |
| 4 | TDB9e | 3D topological distance based autocorrelation - lag 9 / weighted by Sanderson electronegativities | 3D |
| 5 | RDF110s | Radial distribution function - 110 / weighted by relative I-state | 3D |
Statistical parameters that influence the model.
| Descriptor | Standard regression coefficient | Mean Effect (ME) | P-Value | VIF | Standard Error |
|---|---|---|---|---|---|
| | -0.3532 | -0.4429 | 0.000546 | 2.1943 | 0.00654 |
| | 0.2376 | 0.3552 | 0.0236 | 2.3743 | 0.53182 |
| | -0.1343 | -0.8826 | 4.34E-04 | 1.6456 | 0.7866E-05 |
| | 0.5789 | 0.5196 | 2.12E-05 | 1.0491 | 0.00867 |
| | 0.94224 | -0.4405 | 0.0135 | 2.7860 | 3.65E-05 |
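
The VIF column above measures how much of each descriptor is explained by the other descriptors: VIF = 1/(1 - R_j^2), where R_j^2 is obtained by regressing descriptor j on the remaining ones. The Python sketch below inverts that relation for the tabulated VIF values. The cutoff of 10 is a widely used rule of thumb and an assumption here, not a value given in the table, and the function names are illustrative.

```python
def vif_from_r2(r2_j):
    """Variance inflation factor for a descriptor whose regression on the
    other descriptors has coefficient of determination r2_j."""
    if not 0.0 <= r2_j < 1.0:
        raise ValueError("r2_j must lie in [0, 1)")
    return 1.0 / (1.0 - r2_j)


def r2_from_vif(vif):
    """Inverse relation: share of a descriptor's variance explained by the others."""
    return 1.0 - 1.0 / vif


# VIF values copied from the table, in row order.
table_vifs = [2.1943, 2.3743, 1.6456, 1.0491, 2.7860]

implied_r2 = [r2_from_vif(v) for v in table_vifs]

# Rule-of-thumb cutoff (assumed): VIF below 10 suggests acceptable collinearity.
all_below_cutoff = all(v < 10.0 for v in table_vifs)

for v, r2_j in zip(table_vifs, implied_r2):
    print(f"VIF={v:.4f} -> implied R_j^2={r2_j:.4f}")
print("all below cutoff:", all_below_cutoff)
```

A VIF of 1 corresponds to a descriptor that is uncorrelated with the others, so values close to 1 indicate nearly independent descriptors.
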
Pearson's correlation coefficients for the descriptors used in the QSAR model.
| Inter-correlation | | | | | |
|---|---|---|---|---|---|
| | 1 | | | | |
| | 0.414812 | 1 | | | |
| | 0.668151 | 0.498043 | 1 | | |
| | 0.1092 | -0.67462 | -0.04264 | 1 | |
| | 0.061763 | -0.6067 | 0.095274 | 0.0728009 | 1 |
Y-randomization parameters test.
| Model | R | R2 | Q2 |
|---|---|---|---|
| Original | 0.9265 | 0.9045 | 0.8512 |
| Random 1 | 0.3454 | 0.1193 | -1.0841 |
| Random 2 | 0.4868 | 0.2370 | -1.0985 |
| Random 3 | 0.4408 | 0.1943 | -0.9815 |
| Random 4 | 0.5575 | 0.3108 | -0.5503 |
| Random 5 | 0.2957 | 0.0874 | -1.1088 |
| Random 6 | 0.5562 | 0.3093 | -0.7285 |
| Random 7 | 0.7724 | 0.5966 | 0.0328 |
| Random 8 | 0.2752 | 0.0757 | -1.1166 |
| Random 9 | 0.74823 | 0.5598 | -0.0362 |
| Random 10 | 0.5557 | 0.3088 | -0.4448 |
| Average R | 0.3866 | | |
| Average R2 | 0.1465 | | |
| Average Q2 | -0.3325 | | |
| cRp2 | 0.7443 | | |
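
The Y-randomization test above refits the model after shuffling the activity values and compares the scrambled fits with the original one. The Python sketch below shows the procedure under simplifying assumptions: a one-descriptor least-squares line stands in for the multilinear model, the data are synthetic, and cRp2 is computed with the commonly used definition cRp2 = R * sqrt(R2 - mean random R2), which the source table does not spell out. All names in the code are hypothetical.

```python
import math
import random


def simple_r2(x, y):
    """R2 of a one-descriptor least-squares line (stand-in for the MLR fit)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)


def y_randomization(x, y, n_trials=10, seed=0):
    """Shuffle y n_trials times, refit each time, and summarise the result."""
    rng = random.Random(seed)
    r2_random = []
    for _ in range(n_trials):
        shuffled = list(y)
        rng.shuffle(shuffled)  # break the descriptor-activity relationship
        r2_random.append(simple_r2(x, shuffled))
    r2_model = simple_r2(x, y)
    mean_r2_random = sum(r2_random) / n_trials
    # Clamp at zero so a model no better than chance gives cRp2 = 0.
    gap = max(r2_model - mean_r2_random, 0.0)
    c_rp2 = math.sqrt(r2_model) * math.sqrt(gap)
    return r2_model, mean_r2_random, c_rp2


# Synthetic data: a clear linear trend with a small alternating perturbation.
x = [float(i) for i in range(20)]
y = [2.0 * xi + 1.0 + 0.1 * (-1) ** i for i, xi in enumerate(x)]

r2_model, mean_r2_random, c_rp2 = y_randomization(x, y, n_trials=10)
print(f"R2={r2_model:.4f}  mean random R2={mean_r2_random:.4f}  cRp2={c_rp2:.4f}")
```

Ten shuffles are used to mirror the ten random models in the table; a robust model should keep its R2 well above the scrambled average, which pushes cRp2 above the 0.6 threshold listed earlier.
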
Figure 1. Plot of calculated activity against observed activity of training set.
Figure 2. Plot of calculated activity against observed activity of test set.
Figure 3. Plot of standardized residual activity versus observed activity.
Figure 4. The Williams plot of the standardized residuals versus the leverage value.
D optimal validation parameters.
| D optimal Validation parameters | Value |
|---|---|
| Correlation Coefficient | 0.899599 |
| R-squared | 80.9278 percent |
| R-squared (adjusted for d.f.) | 80.0986 percent |
| Standard Error of Est. | 0.345508 |
| Mean absolute error | 0.25514 |
| Durbin-Watson statistic | 1.81474 (P=0.3302) |
| Lag 1 residual autocorrelation | 0.0925989 |
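
The Durbin-Watson statistic and the lag 1 residual autocorrelation reported above are both computed from the ordered residuals, and for longer series they are approximately related by DW = 2(1 - r1). The Python sketch below shows the standard formulas on a hypothetical toy residual series and applies the approximation to the tabulated lag 1 value. The function names are illustrative, and the residuals behind the table are not available here.

```python
def durbin_watson(residuals):
    """DW = sum of squared successive differences / sum of squared residuals."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e * e for e in residuals)
    return numerator / denominator


def lag1_autocorrelation(residuals):
    """Sample autocorrelation of the residuals at lag 1."""
    n = len(residuals)
    mean = sum(residuals) / n
    numerator = sum(
        (residuals[t] - mean) * (residuals[t - 1] - mean) for t in range(1, n)
    )
    denominator = sum((e - mean) ** 2 for e in residuals)
    return numerator / denominator


# Hypothetical toy residuals with a strictly alternating sign.
toy = [1.0, -1.0, 1.0, -1.0]
dw_toy = durbin_watson(toy)
r1_toy = lag1_autocorrelation(toy)

# Approximation DW = 2(1 - r1) applied to the tabulated lag 1 autocorrelation.
dw_from_table_r1 = 2.0 * (1.0 - 0.0925989)

print(f"toy DW={dw_toy:.2f}, toy r1={r1_toy:.2f}")
print(f"DW implied by tabulated r1: {dw_from_table_r1:.4f}")
```

Values of DW near 2 indicate little serial correlation in the residuals, and the value implied by the tabulated lag 1 autocorrelation is close to the reported 1.81474.
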
Figure 5. Plot of observed versus predicted values.
Figure 6. Variance plot showing how the standard error of the predicted response varies across the design region.
Figure 7. Prediction profile graph displaying the standard error of the predicted response.