Hamid Rafiei1, Marziyeh Khanzadeh2, Shahla Mozaffari2, Mohammad Hassan Bostanifar1, Zhila Mohajeri Avval2, Reza Aalizadeh3, Eslam Pourbasheer2.
Abstract
A quantitative structure-activity relationship (QSAR) study was employed to predict the inhibitory activities of Hepatitis C virus (HCV) NS5B polymerase inhibitors. A data set consisting of 72 compounds was selected, and different types of molecular descriptors were then calculated. The whole data set was split into a training set (80% of the data set) and a test set (20% of the data set) using principal component analysis. The stepwise (SW) and genetic algorithm (GA) techniques were used as variable selection tools. The multiple linear regression (MLR) method was then used to linearly correlate the selected descriptors with the inhibitory activities. Several validation techniques, including leave-one-out and leave-group-out cross-validation and the Y-randomization method, were used to evaluate the internal capability of the derived models. The external prediction ability of the derived models was further analyzed using the modified r², concordance correlation coefficient values, and the Golbraikh and Tropsha acceptable model criteria. Based on the derived results (GA-MLR), some new insights into the molecular structural requirements for achieving better inhibitory activity were obtained.
Keywords: HCV; QSAR; genetic algorithms; multiple linear regression
Year: 2016 PMID: 27065774 PMCID: PMC4822051 DOI: 10.17179/excli2015-731
Source DB: PubMed Journal: EXCLI J ISSN: 1611-2156 Impact factor: 4.068
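The internal validation steps named in the abstract (leave-one-out cross-validation, Y-randomization) and the concordance correlation coefficient can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the data are synthetic stand-ins (58 training compounds, roughly 80% of 72, with six descriptors), the function names are hypothetical, and the formulas are the common textbook definitions rather than the authors' exact implementation.

```python
import numpy as np

def fit_mlr(X, y):
    """Least-squares MLR with an intercept; returns [b0, b1, ..., bp]."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_mlr(coef, X):
    """Apply the fitted intercept and slopes to a descriptor matrix."""
    return coef[0] + X @ coef[1:]

def r2(y, y_hat):
    """Coefficient of determination relative to the mean of y."""
    return 1.0 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)

def q2_loo(X, y):
    """Leave-one-out Q^2: refit without each compound, then predict it."""
    n = len(y)
    y_cv = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        y_cv[i] = predict_mlr(fit_mlr(X[keep], y[keep]), X[i:i + 1])[0]
    return r2(y, y_cv)

def ccc(y, y_hat):
    """Lin's concordance correlation coefficient (population moments)."""
    cov = np.mean((y - y.mean()) * (y_hat - y_hat.mean()))
    return 2.0 * cov / (y.var() + y_hat.var() + (y.mean() - y_hat.mean()) ** 2)

# Synthetic stand-in data (NOT the paper's descriptors or pIC50 values).
rng = np.random.default_rng(0)
n_train, n_desc = 58, 6
X = rng.normal(size=(n_train, n_desc))
true_coef = np.array([1.0, -2.0, 0.5, 1.5, -1.0, 2.0])
y = 6.0 + X @ true_coef + rng.normal(scale=0.3, size=n_train)

coef = fit_mlr(X, y)
r2_train = r2(y, predict_mlr(coef, X))
q2 = q2_loo(X, y)

# Y-randomization: shuffle the activities and refit. A sound model should
# score far worse on shuffled responses than on the real ones.
random_scores = []
for _ in range(20):
    y_perm = rng.permutation(y)
    r2_perm = r2(y_perm, predict_mlr(fit_mlr(X, y_perm), X))
    random_scores.append((r2_perm, q2_loo(X, y_perm)))
random_scores = np.array(random_scores)

print("R2 train:", round(r2_train, 3), "Q2 LOO:", round(q2, 3))
print("max randomized R2:", round(random_scores[:, 0].max(), 3))
print("max randomized Q2:", round(random_scores[:, 1].max(), 3))
```

For an external test set, `ccc` would be called on the observed and predicted test-set activities. A large gap between the real and the randomized scores is the pattern a Y-randomization test looks for.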
Table 1. Chemical structures and the corresponding observed and predicted pIC50 values by the GA-MLR method
Table 2. Golbraikh and Tropsha acceptable model criteria for SW-MLR and GA-MLR
Figure 1. Principal component analysis with PC1 and PC2, with the test set for the GA-MLR result
Figure 2. PC1-PC2 loadings plot using the six descriptors for the best model (GA-MLR)
Figure 3. The predicted pIC50 values by the GA-MLR modeling vs. the experimental pIC50 values
Table 3. Correlation coefficient matrix of the selected descriptors with their VIF values
Figure 4. R²train and Q²LOO values after several Y-randomization tests for GA-MLR
Figure 5. The Williams plot for the predictive GA-MLR model
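The quantities behind the VIF table and the Williams plot listed above can be sketched as follows. This is a hedged illustration on synthetic data, not the paper's calculation: the function names are hypothetical, the VIF is taken as the diagonal of the inverse descriptor correlation matrix, the leverage includes an intercept column, and the warning leverage h* = 3(p + 1)/n is the convention commonly used in QSAR work, assumed here rather than confirmed by the source.

```python
import numpy as np

def vif(X):
    """Variance inflation factors: diagonal of the inverse correlation matrix."""
    corr = np.corrcoef(X, rowvar=False)
    return np.diag(np.linalg.inv(corr))

def leverages(X_train, X_query):
    """Leverage h_i = a_i^T (A^T A)^-1 a_i, with A built from the training set."""
    A_train = np.column_stack([np.ones(len(X_train)), X_train])
    A_query = np.column_stack([np.ones(len(X_query)), X_query])
    G = np.linalg.inv(A_train.T @ A_train)
    return np.einsum("ij,jk,ik->i", A_query, G, A_query)

def standardized_residuals(X, y):
    """OLS residuals divided by the residual standard deviation."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    dof = len(y) - A.shape[1]
    return resid / np.sqrt(np.sum(resid ** 2) / dof)

# Synthetic stand-in data (NOT the paper's descriptors or activities).
rng = np.random.default_rng(1)
n_train, n_desc = 58, 6
X = rng.normal(size=(n_train, n_desc))
y = 6.0 + X @ np.array([1.0, -2.0, 0.5, 1.5, -1.0, 2.0])
y = y + rng.normal(scale=0.3, size=n_train)

vif_values = vif(X)
h_train = leverages(X, X)
std_resid = standardized_residuals(X, y)
h_star = 3 * (n_desc + 1) / n_train  # assumed warning leverage

# Williams plot coordinates are (h_i, standardized residual_i). Compounds with
# h_i > h_star or |residual| > 3 are usually flagged for closer inspection.
flagged = np.where((h_train > h_star) | (np.abs(std_resid) > 3))[0]
print("VIF:", np.round(vif_values, 2))
print("h*:", round(h_star, 3), "flagged compounds:", flagged)
```

Test-set compounds can be placed on the same plot by calling `leverages(X, X_test)`, so that their leverage is measured against the training-set descriptor space.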