Tianhong Gu, Xiaoyan Yang, Minjie Li, Milin Wu, Qiang Su, Wencong Lu, Yuhui Zhang.
Abstract
A secondary development program developed in this work was used to obtain the physicochemical properties of DPP-IV inhibitors. After the molecular descriptors were computed, a two-stage feature selection method called mRMR-BFS (minimum redundancy maximum relevance-backward feature selection) was adopted. Support vector regression (SVR) was then used to build a model that maps DPP-IV inhibitors to their corresponding inhibitory activity. The squared correlation coefficients for the training set under leave-one-out cross-validation (LOOCV) and for the test set are 0.815 and 0.884, respectively. An online server for predicting the inhibitory activity (pIC50) of DPP-IV inhibitors, as described in this paper, is given in the introduction.
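The first stage of mRMR-BFS ranks descriptors so that each new pick is relevant to the activity but not redundant with descriptors already chosen. The following is a minimal sketch of that greedy ranking idea only, not the authors' implementation: it uses the absolute Pearson correlation as a stand-in for the relevance and redundancy measures (an assumption of this sketch), and the function names `pearson` and `mrmr_rank` as well as the toy data are hypothetical. The backward feature selection stage is not shown.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def mrmr_rank(columns, target, k):
    """Greedy mRMR-style ranking (illustrative only).

    columns: dict mapping descriptor name -> list of values, one per compound
    target:  list of activities (for example pIC50), one per compound
    k:       number of descriptors to keep
    Score = relevance - mean redundancy, with |Pearson r| used for both
    terms (an assumption of this sketch).
    """
    relevance = {name: abs(pearson(col, target)) for name, col in columns.items()}
    selected = []
    remaining = set(columns)

    def score(name):
        if not selected:
            return relevance[name]
        redundancy = sum(
            abs(pearson(columns[name], columns[s])) for s in selected
        ) / len(selected)
        return relevance[name] - redundancy

    while remaining and len(selected) < k:
        # sorted() makes tie-breaking deterministic (alphabetical order)
        best = max(sorted(remaining), key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data (hypothetical): d2 is a scaled copy of d1, so it adds no new information.
toy = {
    "d1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "d2": [2.0, 4.0, 6.0, 8.0, 10.0],
    "d3": [1.0, -1.0, 1.0, -1.0, 2.0],
}
y = [1.1, 1.9, 3.2, 3.9, 5.1]
ranked = mrmr_rank(toy, y, k=2)
print(ranked)
```

In this toy case the redundancy penalty keeps the scaled copy `d2` out of the selection even though it is as relevant as `d1`.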
Year: 2013 PMID: 23865065 PMCID: PMC3705804 DOI: 10.1155/2013/798743
Source DB: PubMed Journal: Biomed Res Int Impact factor: 3.411
Figure 1. Molecular structure of cyanopyrrolidine amides as DPP-IV inhibitors.
Figure 2. The program interface for the computation of molecular descriptors.
Symbols for molecular descriptors involved in the model.
| Molecular descriptor | Type | Description |
|---|---|---|
| OComposition | Elemental analysis functions | O Composition |
| MaximalProjectionArea | Geometry | Calculates the maximal projection area |
| MinimalProjectionArea | Geometry | Calculates the minimal projection area |
| BasicpKa | pKa | Constant denoting basic pKa |
| RingBondCount | Topology | Ring bond count |
| AliphaticRingCount | Topology | Aliphatic ring count |
Figure 3. Predicted versus experimental pIC50 for the training (circles for fitting, triangles for CV) and test (stars) sets.
Experimental and predicted pIC50 for the training and test sets.
| No. | pIC50(exp) | pIC50(Pred) | pIC50(LOOCV) |
|---|---|---|---|
| 1 | 7.00 | 7.11 | 7.17 |
| 2T | 7.20 | 7.30 | — |
| 3 | 7.35 | 7.36 | 7.33 |
| 4 | 7.33 | 7.23 | 7.16 |
| 5T | 7.01 | 7.00 | — |
| 6 | 7.14 | 7.04 | 6.92 |
| 7 | 7.14 | 7.03 | 6.84 |
| 8 | 6.71 | 7.01 | 7.14 |
| 9T | 6.64 | 6.80 | — |
| 10 | 7.06 | 7.13 | 7.14 |
| 11 | 6.91 | 7.01 | 7.28 |
| 12 | 6.62 | 6.73 | 6.89 |
| 13 | 6.60 | 6.70 | 6.78 |
| 14T | 6.85 | 6.73 | — |
| 15 | 6.67 | 6.70 | 6.70 |
| 16 | 6.60 | 6.70 | 6.70 |
| 17 | 6.94 | 6.86 | 6.86 |
| 18 | 6.74 | 6.79 | 6.79 |
| 19T | 6.52 | 6.73 | — |
| 20 | 8.70 | 8.27 | 8.18 |
| 21 | 8.30 | 8.34 | 8.34 |
| 22 | 7.46 | 7.39 | 7.39 |
| 23 | 7.40 | 7.50 | 7.43 |
| 24T | 8.22 | 8.24 | — |
| 25 | 8.15 | 8.25 | 8.57 |
| 26 | 8.30 | 8.24 | 8.25 |
| 27 | 8.05 | 8.13 | 8.14 |
| 28 | 8.22 | 8.11 | 8.05 |
| 29 | 8.15 | 8.05 | 7.90 |
| 30T | 8.00 | 7.78 | — |
| 31 | 7.66 | 7.77 | 8.11 |
| 32T | 8.15 | 7.80 | — |
| 33 | 7.82 | 7.93 | 8.17 |
| 34T | 7.77 | 7.54 | — |
| 35T | 7.51 | 7.46 | — |
| 36 | 8.10 | 8.00 | 7.85 |
| 37 | 7.72 | 7.82 | 8.00 |
| 38T | 7.43 | 7.09 | — |
| 39 | 7.96 | 7.93 | 7.93 |
| 40 | 8.10 | 8.17 | 8.18 |
| 41 | 7.51 | 7.40 | 7.30 |
| 42 | 7.92 | 7.89 | 7.89 |
| 43 | 7.51 | 7.47 | 7.47 |
| 44 | 7.92 | 7.93 | 7.93 |
| 45 | 7.80 | 7.70 | 7.55 |
| 46 | 7.60 | 7.76 | 7.84 |
| 47 | 7.85 | 7.75 | 7.26 |
| 48T | 7.89 | 7.98 | — |
T indicates the test samples.
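As an illustration of how agreement between experimental and predicted values can be summarized, the sketch below computes the squared Pearson correlation and the root-mean-square error for the 12 test samples listed in the table. The exact definition of the squared correlation coefficient used in the paper is not stated in this record, so the result of this sketch is not guaranteed to reproduce the reported test-set value; the helper names `squared_pearson` and `rmse` are hypothetical.

```python
from math import sqrt

# (experimental, predicted) pIC50 for the 12 test samples (rows marked with T)
test_set = [
    (7.20, 7.30), (7.01, 7.00), (6.64, 6.80), (6.85, 6.73),
    (6.52, 6.73), (8.22, 8.24), (8.00, 7.78), (8.15, 7.80),
    (7.77, 7.54), (7.51, 7.46), (7.43, 7.09), (7.89, 7.98),
]


def squared_pearson(pairs):
    """Squared Pearson correlation between the two values in each pair."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    syy = sum((y - my) ** 2 for _, y in pairs)
    return (sxy * sxy) / (sxx * syy)


def rmse(pairs):
    """Root-mean-square error of predicted versus experimental values."""
    return sqrt(sum((x - y) ** 2 for x, y in pairs) / len(pairs))


r2_test = squared_pearson(test_set)
rmse_test = rmse(test_set)
print(f"squared Pearson r (test): {r2_test:.3f}")
print(f"RMSE (test): {rmse_test:.3f}")
```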
Figure 4. Dispersion plot of the residuals for the training and test sets.
Comparative statistical parameters obtained by the secondary development program and the Gaussian program concerning the same compounds.
| Program | Training set (fitting) | Training set (LOOCV) | Test set |
|---|---|---|---|
| The secondary development program developed in this work | 0.953 | 0.815 | 0.884 |
| Gaussian, HyperChem 7.5, JChem for Excel package, Dragon | 0.969 | 0.868 | 0.891 |
q² (train-CV) of different methods.
| Method | SVR | BP-ANN | MLR |
|---|---|---|---|
| q² (train-CV) | 0.815 | 0.761 | 0.721 |