Tianyi Qiu, Dingfeng Wu, Jingxuan Qiu, Zhiwei Cao.
Abstract
Nuclear receptors (NRs) are a class of proteins responsible for sensing steroid and thyroid hormones and certain other molecules. NRs can therefore regulate the expression of specific genes and are associated with various diseases, which makes them essential drug targets. Approaches that can predict the inhibitory ability of compounds against different NR targets should be particularly helpful for drug development. In this study, proteochemometric (PCM) modelling was introduced to analyse the bioactivity between chemical compounds and NR targets. The results illustrate the ability of our PCM model to perform high-throughput NR-inhibitor screening, as evaluated on both internal (AUC > 0.870) and external (AUC > 0.746) validation sets. Moreover, in silico predicted bioactive compounds were clustered according to structural similarity, and a series of representative molecular scaffolds was derived for five major NR targets. Through scaffold analysis, the essential bioactive scaffolds of different NR targets can be detected and compared. Overall, the methods and molecular scaffolds proposed in this article can not only help the screening of potential therapeutic NR-inhibitors but also guide future NR-related drug discovery.
Keywords: Cheminformatics; Molecular scaffold; Nuclear receptor; Proteochemometric modelling
Year: 2018 PMID: 29651663 PMCID: PMC5897275 DOI: 10.1186/s13321-018-0275-x
Source DB: PubMed Journal: J Cheminform ISSN: 1758-2946 Impact factor: 5.514
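As a rough illustration of the proteochemometric setup named in the abstract, the sketch below encodes each compound-target pair as one feature vector by concatenating a compound descriptor with a protein descriptor. The descriptor values, the compound names (`cpd_A`, `cpd_B`), and the helper `build_pcm_matrix` are hypothetical placeholders and are not taken from the article; only the target names NR1C1 and NR1C2 appear in the source.

```python
from typing import Dict, List, Tuple

# Hypothetical toy descriptors; real PCM descriptors are much longer vectors.
compound_desc: Dict[str, List[float]] = {
    "cpd_A": [1.0, 0.0, 1.0],
    "cpd_B": [0.0, 1.0, 1.0],
}
protein_desc: Dict[str, List[float]] = {
    "NR1C1": [0.2, 0.7],
    "NR1C2": [0.3, 0.6],
}

# Each record is (compound, target, label); label 1 = active, 0 = inactive.
pairs: List[Tuple[str, str, int]] = [
    ("cpd_A", "NR1C1", 1),
    ("cpd_A", "NR1C2", 0),
    ("cpd_B", "NR1C1", 0),
]


def build_pcm_matrix(
    pairs: List[Tuple[str, str, int]],
    compound_desc: Dict[str, List[float]],
    protein_desc: Dict[str, List[float]],
) -> Tuple[List[List[float]], List[int]]:
    """Concatenate compound and protein descriptors for every pair."""
    X: List[List[float]] = []
    y: List[int] = []
    for compound, target, label in pairs:
        # One row per compound-target pair: [compound features | protein features]
        X.append(compound_desc[compound] + protein_desc[target])
        y.append(label)
    return X, y


X, y = build_pcm_matrix(pairs, compound_desc, protein_desc)
```

Because the protein descriptor is part of every row, a single classifier trained on `X` and `y` can in principle score the same compound differently against different targets, which is the property that distinguishes this setup from one model per target.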
Table 1. 10-fold cross-validation results of different machine learning methods
| Method | Accuracy | Precision | Recall | F1 score | AUC |
|---|---|---|---|---|---|
| RF | 0.740 | 0.761 | 0.768 | 0.762 | 0.829 |
| RC | 0.624 | 0.643 | 0.713 | 0.674 | –ᵃ |
| LR | 0.453 | 0.490 | 0.000 | 0.000 | 0.452 |
| DT | 0.701 | 0.726 | 0.727 | 0.726 | 0.700 |
| SVC | 0.583 | 0.569 | 0.984 | 0.720 | –ᵃ |
Results in Table 1 were calculated based on descriptor T1.
ᵃ This value cannot be calculated here (continuous predicted values are needed to calculate the AUC).
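The footnote's point, that AUC needs continuous predicted values while the other metrics only need hard class labels, can be illustrated with a minimal sketch. The toy labels and scores, the 0.5 threshold, and the function names below are assumptions made for illustration; they are not the authors' data or code. AUC is computed here in its rank-based form, as the fraction of active/inactive pairs in which the active compound receives the higher score (ties count as one half).

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall and F1 from hard 0/1 predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Rank-based AUC: P(score of a random active > score of a random inactive)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one active and one inactive sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (assumed): three actives, three inactives.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]          # continuous model outputs
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # hard labels at an assumed 0.5 cutoff

metrics = classification_metrics(y_true, y_pred)
auc_from_scores = roc_auc(y_true, scores)
auc_from_labels = roc_auc(y_true, y_pred)  # ranking information is lost
```

Feeding hard labels into `roc_auc` still returns a number, but it describes only a single operating point and no longer reflects how well the model ranks compounds, which is why a classifier that exposes only class labels has no meaningful AUC entry.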
Fig. 1. Performance of PCM modelling. a Cross-validation performance of the PCM model constructed with the RF classifier using four different protein descriptors. b AUC values of the PCM model constructed with the RF classifier under different cutoffs for bioactivity data; these results were obtained with descriptor T1. *Precision score denotes the area under the precision-recall curve
Fig. 2. Scaffold clustering of NR-inhibitors. Background colors and spot colors represent experimentally confirmed activity and model-predicted activity, respectively. Red denotes active compounds, green denotes inactive compounds, and white denotes a scaffold that contains both active and inactive compounds. a Scaffold clustering of NR1C1-inhibitors. b Examples of compounds containing scaffold S10. c Scaffold clustering of NR1C2-inhibitors. d Scaffold clustering of NR1C3-inhibitors. e Scaffold clustering of NR1H2-inhibitors. f Scaffold clustering of NR2B1-inhibitors
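The three-way color coding described in the Fig. 2 caption can be sketched as a simple grouping step. The example assumes that each compound has already been assigned a scaffold identifier by some earlier step that is not shown; the records, the scaffold IDs other than S10, and the function `scaffold_status` are hypothetical and serve only as illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def scaffold_status(compounds: Iterable[Tuple[str, bool]]) -> Dict[str, str]:
    """Label each scaffold as 'active', 'inactive' or 'mixed'.

    compounds: (scaffold_id, is_active) records, one per compound.
    """
    seen: Dict[str, Set[bool]] = defaultdict(set)
    for scaffold_id, is_active in compounds:
        seen[scaffold_id].add(bool(is_active))

    status: Dict[str, str] = {}
    for scaffold_id, flags in seen.items():
        if flags == {True}:
            status[scaffold_id] = "active"    # red in the caption's scheme
        elif flags == {False}:
            status[scaffold_id] = "inactive"  # green
        else:
            status[scaffold_id] = "mixed"     # white
    return status


# Hypothetical records: (scaffold ID, activity flag).
records = [
    ("S10", True),
    ("S10", True),
    ("S11", False),
    ("S12", True),
    ("S12", False),
]

status = scaffold_status(records)
```

Running the same function once on experimentally confirmed labels and once on model-predicted labels would give the two layers that the caption assigns to the background and spot colors.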
Fig. 3. Clustering tree of nuclear receptors. The 7 NR subtypes are marked in different colors, and the 11 NR proteins used in this study are marked in red, together with their data distributions