Junlin Zhou1, Juan Hao1, Lianxin Peng1, Huaichuan Duan1, Qing Luo1, Hailian Yan1, Hua Wan2, Yichen Hu1, Li Liang1, Zhenjian Xie1, Wei Liu1, Gang Zhao1, Jianping Hu1.
Abstract
Integrase (IN) is a key enzyme in the life cycle of human immunodeficiency virus type 1 (HIV-1): it mediates the integration of viral DNA into host DNA, which makes it an attractive target for the development of anti-HIV drugs. A total of 1785 potential HIV-1 IN inhibitors were collected from the ChEMBL, Binding Database, DrugBank, and PubMed databases, as well as from 40 references. The data set was divided into a training set and a test set by random sampling. Exploring the correlation between molecular descriptors and inhibitory activity showed that both the classification and the specific activity of inhibitors are predicted more accurately when molecular descriptors are combined with molecular fingerprints; the fingerprints supply additional substructure information that improves predictive ability. Based on the training set, two machine learning methods, recursive partitioning (RP) and naive Bayes (NB), were used to build classifiers of HIV-1 IN inhibitors. On the test set, the RP model correctly predicted 82.5% of inhibitors and 86.3% of noninhibitors. The NB model correctly predicted 88.3% of inhibitors and 87.2% of noninhibitors, with a correlation coefficient of 85.2%. These results show that the NB model performs slightly better than the RP model; the key molecular fragments were also identified. Additionally, CoMFA and CoMSIA models with good activity-prediction ability were constructed by exploring the structure-activity relationship, which should help in the design and optimization of HIV-1 IN inhibitors.
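To make the naive Bayes classification step concrete, the following is a minimal sketch of a generic Bernoulli naive Bayes classifier over binary fingerprint bits with Laplace smoothing. It is not the authors' implementation: the function names, the smoothing parameter `alpha`, and the 4-bit toy fingerprints are all hypothetical and chosen only for illustration.

```python
import math


def train_bernoulli_nb(X, y, alpha=1.0):
    """Fit a Bernoulli naive Bayes model on binary fingerprint vectors.

    Returns {label: (log_prior, per-bit probability that the bit is set)}.
    """
    n_bits = len(X[0])
    model = {}
    for label in sorted(set(y)):
        rows = [x for x, t in zip(X, y) if t == label]
        # Laplace-smoothed estimate of P(bit = 1 | class) for every bit.
        p_on = [
            (sum(r[j] for r in rows) + alpha) / (len(rows) + 2 * alpha)
            for j in range(n_bits)
        ]
        model[label] = (math.log(len(rows) / len(X)), p_on)
    return model


def predict(model, x):
    """Return the class with the highest log posterior for one fingerprint."""
    def score(label):
        log_prior, p_on = model[label]
        return log_prior + sum(
            math.log(p if bit else 1.0 - p) for bit, p in zip(x, p_on)
        )
    return max(model, key=score)


# Hypothetical toy data: 4-bit fingerprints, 1 = inhibitor, 0 = noninhibitor.
X_train = [
    [1, 1, 0, 0], [1, 0, 0, 0], [1, 1, 1, 0],
    [0, 0, 1, 1], [0, 0, 0, 1], [0, 1, 1, 1],
]
y_train = [1, 1, 1, 0, 0, 0]

model = train_bernoulli_nb(X_train, y_train)
print(predict(model, [1, 1, 0, 0]))
print(predict(model, [0, 0, 1, 1]))
```

In a real workflow the rows of `X_train` would be fingerprint bit vectors computed for the training-set compounds, and the held-out test set would be scored with `predict` to obtain the inhibitor and noninhibitor hit rates.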
Year: 2021 PMID: 33868450 PMCID: PMC8035010 DOI: 10.1155/2021/5559338
Source DB: PubMed Journal: Comput Math Methods Med ISSN: 1748-670X Impact factor: 2.238
Figure 1. Distributions of eight molecular descriptors for inhibitors and noninhibitors.
Figure 2. Changes in the C value of the training and test sets with tree depth.
Figure 3. Decision tree with a depth of 9.
Cross validation of naive Bayesian classification.
| Model name | ROC score | ROC rating | TP | FN | FP | TN | SE | SP | Accuracy |
|---|---|---|---|---|---|---|---|---|---|
| Naive Bayesian model | 0.897 | Good | 627 | 55 | 210 | 893 | 0.919 | 0.81 | 0.852 |
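The sensitivity (SE) and specificity (SP) in this row can be recomputed from the four confusion-matrix counts. The sketch below uses the standard definitions SE = TP/(TP+FN) and SP = TN/(TN+FP); the function name is hypothetical. Under these definitions, the final value in the row (0.852) coincides with the overall accuracy (TP+TN)/(TP+FN+FP+TN).

```python
def confusion_metrics(tp, fn, fp, tn):
    """Return sensitivity, specificity, and overall accuracy from raw counts."""
    sensitivity = tp / (tp + fn)            # fraction of inhibitors recovered
    specificity = tn / (tn + fp)            # fraction of noninhibitors recovered
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    return sensitivity, specificity, accuracy


# Counts taken from the cross-validation table for the naive Bayesian model.
se, sp, acc = confusion_metrics(tp=627, fn=55, fp=210, tn=893)
print(f"SE = {se:.3f}, SP = {sp:.3f}, accuracy = {acc:.3f}")
# prints: SE = 0.919, SP = 0.810, accuracy = 0.852
```

The four counts sum to 1785, the total number of compounds reported in the abstract.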
Figure 4. Potentially advantageous molecular fingerprint structures for HIV-1 IN inhibitors derived from naive Bayesian classification.
Figure 5. Structures and pIC50 values of quinolinone acid inhibitors against HIV-1 IN. Training-set and test-set compounds are shown in black and red, respectively. IC50 values are in units of μM.
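Because the IC50 values are given in μM while the models are fitted to pIC50, the conversion is worth spelling out. By the usual definition, pIC50 = -log10(IC50 in mol/L), which for a value in μM becomes 6 - log10(IC50 in μM). The helper below is a minimal sketch with a hypothetical function name; the example concentrations are illustrative and are not taken from the figure.

```python
import math


def pic50_from_micromolar(ic50_um):
    """Convert an IC50 given in micromolar to pIC50 (= -log10 of molar IC50)."""
    if ic50_um <= 0:
        raise ValueError("IC50 must be positive")
    return 6.0 - math.log10(ic50_um)


# Illustrative values only: 1 μM corresponds to pIC50 6, 10 nM to pIC50 8.
for ic50 in (1.0, 0.01):
    print(ic50, round(pic50_from_micromolar(ic50), 2))
```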
Figure 6. Correlation between experimental pIC50 values and the values predicted by the (a) CoMFA and (b) CoMSIA models.