Guizhao Liang, Yonglan Liu, Bozhi Shi, Jun Zhao, Jie Zheng.
Abstract
Bioactive peptides and peptidomimetics play a pivotal role in the regulation of many biological processes such as cellular apoptosis, host defense, and biomineralization. In this work, we develop a novel structural matrix, the Index of Natural and Non-natural Amino Acids (NNAAIndex), to systematically characterize a total of 155 physicochemical properties of 22 natural and 593 non-natural amino acids, and then cluster the structural matrix into 6 representative property patterns: geometric characteristics, H-bond, connectivity, accessible surface area, integy moments index, and volume and shape. As a proof of principle, the NNAAIndex, combined with partial least squares regression or linear discriminant analysis, is used to develop QSAR models for the design of new peptidomimetics from three peptide datasets: 48 bitter-tasting dipeptides, 58 angiotensin-converting enzyme inhibitors, and 20 inorganic-binding peptides. A comparative analysis with other QSAR techniques demonstrates that the NNAAIndex method offers a stable and predictive modeling technique for in silico large-scale design of natural and non-natural peptides with desirable bioactivities for a wide range of applications.
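To make the descriptor idea concrete, the sketch below shows one way a peptide could be turned into a fixed-length feature vector before regression: each residue contributes its six factor scores, and the scores are concatenated in sequence order. The concatenation scheme, the function name `encode_peptide`, and the table `PLACEHOLDER_SCORES` are assumptions made for illustration. The numeric scores are invented placeholders, not NNAAIndex values.

```python
from typing import Dict, List, Sequence

# Hypothetical per-residue scores on six factors. These numbers are
# placeholders invented for illustration; they are NOT NNAAIndex values.
PLACEHOLDER_SCORES: Dict[str, List[float]] = {
    "V": [0.1, -0.2, 0.3, 0.0, -0.1, 0.2],
    "W": [1.2, 0.4, -0.5, 0.1, 0.3, -0.6],
}


def encode_peptide(residues: Sequence[str],
                   scores: Dict[str, List[float]]) -> List[float]:
    """Concatenate the six factor scores of each residue, in sequence order."""
    vector: List[float] = []
    for residue in residues:
        if residue not in scores:
            raise KeyError(f"no factor scores for residue {residue!r}")
        vector.extend(scores[residue])
    return vector


# A dipeptide yields 2 x 6 = 12 features.
vw_features = encode_peptide(["V", "W"], PLACEHOLDER_SCORES)
print(len(vw_features))
```

Under this scheme every dipeptide maps to 12 numbers, so a dataset of dipeptides becomes a rectangular matrix that a regression method such as partial least squares can consume.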
Year: 2013 PMID: 23935845 PMCID: PMC3720802 DOI: 10.1371/journal.pone.0067844
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Loading and communality of 155 variables on 6 factors.
| No. | Property | Factor 1 | Factor 2 | Factor 3 | Factor 4 | Factor 5 | Factor 6 | Communality |
| 1 | First Zagreb index M1 | 0.798 | 0.018 | 0.127 | −0.015 | −0.004 | −0.146 | 0.992 |
| 2 | Second Zagreb index M2 | 0.776 | 0.010 | 0.156 | −0.008 | −0.010 | −0.211 | 0.989 |
| 3 | Quadratic index | 0.582 | −0.060 | 0.294 | −0.013 | 0.002 | −0.409 | 0.949 |
| 4 | Narumi simple topological index (log) | 0.774 | 0.066 | 0.255 | −0.011 | −0.017 | −0.072 | 0.991 |
| 5 | Narumi harmonic topological index | 0.204 | 0.089 | 0.787 | −0.020 | −0.016 | 0.001 | 0.918 |
| 6 | Narumi geometric topological index | 0.165 | 0.056 | 0.759 | −0.029 | 0.004 | −0.072 | 0.929 |
| 7 | Total structure connectivity index | −0.347 | −0.038 | −0.297 | 0.059 | −0.010 | −0.059 | 0.936 |
| 8 | Pogliani index | 0.822 | 0.068 | 0.025 | −0.014 | −0.004 | −0.022 | 0.992 |
| 9 | Ramification index | 0.658 | −0.082 | −0.111 | −0.018 | 0.027 | −0.352 | 0.932 |
| 10 | Polarity number | 0.794 | −0.021 | 0.027 | −0.003 | −0.031 | −0.239 | 0.978 |
| 11 | Log of product of row sums (PRS) | 0.904 | 0.046 | 0.058 | −0.006 | −0.003 | −0.010 | 0.997 |
| 12 | Average vertex distance degree | 0.991 | 0.042 | 0.067 | −0.002 | 0.018 | 0.072 | 0.996 |
| 13 | Mean square distance index (Balaban) | −0.204 | −0.043 | −0.059 | 0.074 | −0.041 | 0.761 | 0.946 |
| 14 | Schultz molecular topological index (mti) | 1.119 | 0.027 | 0.093 | 0.015 | 0.015 | −0.086 | 0.990 |
| 15 | Gutman molecular topological index | 1.122 | 0.034 | 0.115 | 0.017 | 0.008 | −0.096 | 0.989 |
| 16 | Xu index | 0.782 | 0.053 | 0.059 | −0.023 | −0.001 | 0.075 | 0.995 |
| 17 | Superpendentic index | 0.799 | −0.051 | −0.265 | −0.012 | 0.027 | 0.036 | 0.975 |
| 18 | Wiener W index | 1.118 | 0.021 | 0.066 | 0.014 | 0.022 | −0.073 | 0.989 |
| 19 | Mean Wiener index | 0.768 | 0.053 | 0.044 | −0.020 | 0.017 | 0.392 | 0.988 |
| 20 | Harary H index | 0.835 | 0.032 | 0.102 | −0.007 | −0.017 | −0.135 | 0.993 |
| 21 | Quasi-Wiener index (Kirchhoff number) | 1.110 | 0.037 | 0.038 | 0.007 | 0.038 | −0.062 | 0.983 |
| 22 | Detour index | 1.094 | −0.020 | 0.157 | 0.027 | −0.012 | −0.141 | 0.978 |
| 23 | Hyper-detour index | 1.122 | −0.055 | 0.191 | 0.030 | −0.014 | −0.126 | 0.929 |
| 24 | Reciprocal hyper-detour index | 0.834 | 0.122 | −0.330 | −0.013 | 0.033 | 0.069 | 0.974 |
| 25 | Distance/detour index | 1.028 | 0.077 | −0.063 | 0.000 | 0.023 | −0.050 | 0.975 |
| 26 | All-path Wiener index | 0.985 | −0.041 | 0.245 | 0.051 | −0.065 | −0.157 | 0.844 |
| 27 | Wiener-type index from Z weighted distance matrix (Barysz matrix) | 1.129 | 0.015 | 0.064 | 0.008 | 0.019 | −0.056 | 0.986 |
| 28 | Wiener-type index from van der Waals weighted distance matrix | 1.105 | 0.039 | 0.048 | 0.003 | 0.043 | −0.056 | 0.986 |
| 29 | Wiener-type index from electronegativity weighted distance matrix | 1.131 | 0.019 | 0.060 | 0.006 | 0.026 | −0.070 | 0.986 |
| 30 | Wiener-type index from polarizability weighted distance matrix | 1.098 | 0.040 | 0.045 | 0.005 | 0.041 | −0.046 | 0.984 |
| 31 | Balaban distance connectivity index | −0.127 | 0.053 | −0.925 | −0.008 | 0.000 | −0.146 | 0.968 |
| 32 | Balaban-type index from mass weighted distance matrix | −0.243 | 0.089 | −0.960 | −0.028 | 0.056 | −0.158 | 0.905 |
| 33 | Balaban-type index from electronegativity weighted distance matrix | −0.277 | 0.027 | −0.985 | −0.032 | 0.008 | −0.146 | 0.943 |
| 34 | Maximal electrotopological positive variation | 0.626 | −0.052 | 0.038 | −0.103 | 0.046 | −0.202 | 0.812 |
| 35 | Molecular electrotopological variation | 0.557 | 0.078 | −0.153 | −0.020 | 0.028 | −0.045 | 0.947 |
| 36 | E-state topological parameter | 0.827 | 0.015 | −0.240 | −0.003 | 0.091 | −0.092 | 0.854 |
| 37 | Kier symmetry index | 0.829 | 0.017 | 0.101 | −0.023 | 0.014 | 0.037 | 0.972 |
| 38 | 1-path Kier alpha-modified shape index | 0.892 | 0.046 | −0.192 | −0.018 | 0.002 | 0.054 | 0.997 |
| 39 | 2-path Kier alpha-modified shape index | 0.840 | 0.106 | −0.131 | 0.000 | −0.036 | 0.337 | 0.959 |
| 40 | 3-path Kier alpha-modified shape index | 0.573 | 0.041 | −0.397 | −0.015 | −0.059 | 0.532 | 0.899 |
| 41 | Kier flexibility index | 0.785 | 0.086 | −0.296 | 0.002 | −0.038 | 0.388 | 0.955 |
| 42 | Path/walk 5 - Randic shape index | 0.027 | 0.040 | 0.341 | −0.078 | −0.115 | −0.176 | 0.795 |
| 43 | Eccentric connectivity index | 0.988 | 0.022 | 0.145 | −0.003 | 0.013 | 0.125 | 0.991 |
| 44 | Eccentricity | 0.987 | 0.017 | 0.101 | −0.003 | 0.018 | 0.154 | 0.991 |
| 45 | Average eccentricity | 0.738 | 0.006 | 0.116 | −0.016 | 0.011 | 0.500 | 0.981 |
| 46 | Eccentric | 0.649 | −0.092 | 0.137 | 0.006 | −0.015 | 0.711 | 0.914 |
| 47 | Mean distance degree deviation | 1.017 | 0.008 | 0.051 | 0.011 | −0.002 | 0.096 | 0.989 |
| 48 | Unipolarity | 0.965 | 0.028 | 0.086 | −0.007 | 0.027 | 0.156 | 0.989 |
| 49 | Centralization | 1.122 | 0.038 | 0.037 | 0.023 | 0.005 | −0.189 | 0.970 |
| 50 | Variation | 1.051 | −0.008 | 0.063 | 0.022 | −0.027 | 0.011 | 0.977 |
| 51 | Balaban centric index | 0.628 | −0.105 | −0.659 | 0.009 | 0.046 | −0.066 | 0.930 |
| 52 | Lopping centric index | 0.203 | 0.050 | −0.122 | −0.031 | 0.151 | 0.727 | 0.817 |
| 53 | Radial centric information index | 0.544 | −0.035 | 0.176 | −0.012 | −0.030 | 0.627 | 0.934 |
| 54 | Unsaturation index | 0.102 | 0.034 | 0.237 | −0.079 | 0.065 | 0.082 | 0.893 |
| 55 | Hydrophilic factor | 0.022 | 0.352 | −0.186 | −0.172 | −0.103 | 0.093 | 0.874 |
| 56 | Ghose-Crippen molar refractivity | 0.854 | −0.006 | 0.038 | −0.023 | −0.013 | −0.043 | 0.991 |
| 57 | Moriguchi octanol-water partition coeff. (logp) | 0.072 | −0.263 | 0.068 | −0.056 | −0.137 | −0.266 | 0.887 |
| 58 | Ghose-Crippen octanol-water partition coeff. (logp) | 0.498 | −0.496 | −0.058 | −0.027 | 0.069 | 0.047 | 0.910 |
| 59 | Verhaar model of Fish base-line toxicity from MLOGP | −0.188 | 0.195 | −0.062 | 0.070 | 0.114 | 0.334 | 0.859 |
| 60 | Sum of the atomic polarizabilities (including implicit hydrogens) with polarizabilities. | 0.929 | −0.032 | 0.022 | −0.028 | −0.017 | −0.036 | 0.992 |
| 61 | Sum of the absolute value of the difference between atomic polarizabilities of all bonded atoms in the molecule (including implicit hydrogens) with polarizabilities. | 0.937 | −0.041 | −0.103 | −0.049 | −0.005 | 0.007 | 0.911 |
| 62 | Total charge of the molecule (sum of formal charges). | 0.044 | −0.050 | 0.066 | 0.017 | 0.048 | −0.006 | 0.938 |
| 63 | Molecular refractivity (including implicit hydrogens). This property is an atomic contribution model | 0.852 | 0.003 | 0.035 | −0.008 | −0.011 | −0.051 | 0.991 |
| 64 | Molecular weight (including implicit hydrogens) in atomic mass units with atomic weights. | 0.765 | 0.103 | −0.019 | −0.001 | 0.007 | −0.090 | 0.962 |
| 65 | Log of the octanol/water partition coefficient (including implicit hydrogens) | 0.313 | −0.440 | −0.017 | −0.077 | −0.289 | −0.080 | 0.829 |
| 66 | Log of the aqueous solubility (mol/L). | −0.564 | 0.208 | 0.077 | −0.073 | −0.083 | 0.062 | 0.920 |
| 67 | Log of the octanol/water partition coefficient (including implicit hydrogens). | 0.481 | −0.573 | −0.048 | 0.058 | 0.007 | 0.091 | 0.901 |
| 68 | Polar surface area (Å²) calculated using group contributions to approximate the polar surface area from connection table information only | 0.282 | 0.488 | −0.165 | 0.089 | 0.003 | 0.025 | 0.930 |
| 69 | Van der Waals volume (Å³) calculated using a connection table approximation. | 0.903 | −0.017 | 0.020 | −0.031 | −0.012 | −0.025 | 0.996 |
| 70 | Area of van der Waals surface (Å²) calculated using a connection table approximation. | 0.895 | 0.033 | −0.119 | −0.021 | 0.001 | 0.005 | 0.993 |
| 71 | Atomic connectivity index (order 0). This is calculated as the sum of 1/sqrt( | 0.850 | 0.035 | −0.038 | −0.016 | 0.000 | −0.028 | 0.996 |
| 72 | Atomic valence connectivity index (order 0). This is calculated as the sum of 1/sqrt( | 0.887 | −0.003 | −0.044 | −0.009 | 0.004 | −0.078 | 0.988 |
| 73 | First kappa shape index: ( | 0.867 | 0.051 | −0.145 | −0.015 | −0.001 | 0.055 | 0.995 |
| 74 | First alpha modified shape index: | 0.881 | 0.046 | −0.241 | −0.003 | 0.025 | 0.019 | 0.969 |
| 75 | Kier molecular flexibility index: (kiera1) (kiera2)/ | 0.705 | 0.073 | −0.343 | 0.020 | −0.015 | 0.313 | 0.907 |
| 76 | Balaban's connectivity topological index | −0.136 | 0.051 | −0.926 | −0.009 | 0.000 | −0.147 | 0.969 |
| 77 | Number of hydrogen bond acceptor atoms | 0.373 | 0.627 | 0.039 | −0.068 | −0.050 | −0.073 | 0.845 |
| 78 | Number of acidic atoms. | 0.021 | −0.039 | 0.049 | −0.004 | 0.037 | −0.011 | 0.949 |
| 79 | Number of hydrogen bond donor atoms | 0.240 | 0.627 | 0.001 | −0.237 | −0.145 | −0.112 | 0.810 |
| 80 | Number of hydrophobic atoms | 0.806 | −0.252 | 0.078 | −0.026 | −0.020 | −0.007 | 0.980 |
| 81 | Approximation to the sum of VDW surface areas of acidic atoms | 0.022 | −0.039 | 0.051 | 0.002 | 0.039 | −0.018 | 0.945 |
| 82 | Approximation to the sum of VDW surface areas of pure hydrogen bond donors | −0.085 | −0.027 | −0.182 | −0.211 | −0.160 | −0.010 | 0.758 |
| 83 | Approximation to the sum of VDW surface areas of hydrophobic atoms | 0.869 | −0.240 | −0.083 | −0.053 | −0.014 | 0.013 | 0.974 |
| 84 | Approximation to the sum of VDW surface areas (Å²) of atoms typed as “other”. | 0.154 | 0.461 | −0.072 | 0.308 | 0.116 | −0.064 | 0.726 |
| 85 | Approximation to the sum of VDW surface areas (Å²) of polar atoms (atoms that are both hydrogen bond donors and acceptors), such as -OH. | 0.287 | 0.486 | −0.092 | −0.112 | −0.016 | 0.017 | 0.915 |
| 86 | Value of the potential energy | 0.259 | 0.070 | −0.099 | 0.134 | 0.011 | −0.193 | 0.869 |
| 87 | Angle bend potential energy | 0.185 | −0.048 | −0.140 | 0.106 | −0.107 | −0.072 | 0.713 |
| 88 | Value of the potential energy with all bonded terms disabled | 0.337 | 0.066 | −0.085 | 0.103 | 0.032 | −0.155 | 0.823 |
| 89 | Solvation energy | −0.093 | −0.198 | −0.131 | 0.008 | 0.028 | 0.033 | 0.836 |
| 90 | Bond stretch-bend cross-term potential energy | 0.187 | 0.088 | 0.057 | −0.200 | 0.009 | −0.103 | 0.859 |
| 91 | Local strain energy | 0.183 | −0.021 | −0.032 | 0.150 | −0.022 | 0.024 | 0.811 |
| 92 | Van der Waals component of the potential energy | 0.579 | −0.147 | −0.062 | 0.003 | −0.094 | −0.118 | 0.889 |
| 93 | Water accessible surface area calculated using a radius of 1.4 Å for the water molecule. A polyhedral representation is used for each atom in calculating the surface area. | 0.820 | 0.060 | −0.005 | −0.026 | 0.018 | 0.146 | 0.987 |
| 94 | Mass density: molecular weight divided by van der Waals volume as calculated in the vol descriptor. | −0.349 | 0.338 | −0.061 | 0.035 | 0.027 | −0.181 | 0.819 |
| 95 | Van der Waals volume calculated using a grid approximation (spacing 0.75 Å). | 0.915 | −0.005 | −0.004 | −0.024 | −0.002 | −0.019 | 0.996 |
| 96 | Van der Waals surface area. A polyhedral representation is used for each atom in calculating the surface area. | 0.906 | 0.013 | −0.033 | −0.021 | 0.008 | 0.050 | 0.995 |
| 97 | Amphiphilic moment | 0.077 | −0.119 | −0.048 | 0.013 | 0.897 | 0.069 | 0.939 |
| 98 | Hydrophobic volume (8 descriptors) | 0.559 | −0.008 | 0.059 | 0.022 | 0.090 | −0.010 | 0.974 |
| 99 | Hydrophobic volume (8 descriptors) | 0.569 | −0.023 | 0.052 | 0.031 | 0.065 | 0.004 | 0.980 |
| 100 | Hydrophobic volume (8 descriptors) | 0.605 | −0.054 | 0.018 | 0.031 | 0.076 | −0.030 | 0.983 |
| 101 | Hydrophobic volume (8 descriptors) | 0.604 | −0.053 | 0.016 | 0.019 | 0.104 | −0.051 | 0.985 |
| 102 | Hydrophobic volume (8 descriptors) | 0.578 | −0.028 | 0.035 | 0.012 | 0.132 | −0.056 | 0.985 |
| 103 | Hydrophobic volume (8 descriptors) | 0.521 | 0.030 | 0.127 | 0.014 | 0.128 | −0.034 | 0.973 |
| 104 | Hydrophobic volume (8 descriptors) | 0.390 | 0.046 | 0.161 | 0.031 | 0.025 | 0.118 | 0.942 |
| 105 | Hydrophobic volume (8 descriptors) | 0.245 | 0.063 | 0.070 | 0.083 | −0.046 | 0.025 | 0.900 |
| 106 | Lowest hydrophobic energy (3 descriptors) | −0.150 | 0.021 | −0.051 | −0.149 | 0.023 | 0.216 | 0.853 |
| 107 | Lowest hydrophobic energy (3 descriptors) | −0.129 | 0.030 | −0.050 | −0.153 | 0.020 | 0.216 | 0.864 |
| 108 | Lowest hydrophobic energy (3 descriptors) | −0.107 | 0.004 | −0.049 | −0.151 | 0.012 | 0.213 | 0.868 |
| 109 | Surface globularity | 0.582 | 0.223 | −0.024 | −0.015 | 0.021 | 0.411 | 0.901 |
| 110 | H-bond donor capacity (8 descriptors) | 0.283 | 0.518 | 0.154 | 0.209 | −0.153 | 0.296 | 0.843 |
| 111 | H-bond donor capacity (8 descriptors) | 0.338 | 0.340 | 0.116 | 0.186 | −0.185 | 0.347 | 0.922 |
| 112 | H-bond donor capacity (8 descriptors) | 0.302 | 0.456 | 0.033 | 0.140 | −0.096 | 0.213 | 0.957 |
| 113 | H-bond donor capacity (8 descriptors) | 0.208 | 0.786 | 0.016 | 0.081 | 0.020 | −0.004 | 0.973 |
| 114 | H-bond donor capacity (8 descriptors) | 0.075 | 1.067 | 0.021 | 0.017 | 0.085 | −0.169 | 0.967 |
| 115 | H-bond donor capacity (8 descriptors) | −0.002 | 1.205 | −0.007 | −0.039 | 0.141 | −0.316 | 0.940 |
| 116 | Hydrophilic-Lipophilic (2 descriptors) | −0.223 | 0.808 | −0.057 | −0.042 | −0.117 | −0.102 | 0.883 |
| 117 | Hydrophilic-Lipophilic (2 descriptors) | −0.237 | 0.909 | −0.071 | −0.058 | −0.082 | −0.175 | 0.844 |
| 118 | Hydrophobic integy moment (8 descriptors) | −0.123 | 0.224 | 0.018 | −0.019 | 0.928 | 0.137 | 0.895 |
| 119 | Hydrophobic integy moment (8 descriptors) | −0.065 | 0.194 | 0.012 | −0.083 | 0.889 | 0.136 | 0.866 |
| 120 | Hydrophobic integy moment (8 descriptors) | −0.060 | 0.367 | 0.052 | −0.054 | 0.514 | −0.005 | 0.730 |
| 121 | Hydrophilic integy moment (8 descriptors) | 0.153 | −0.080 | −0.026 | 0.035 | 0.829 | −0.058 | 0.903 |
| 122 | Hydrophilic integy moment (8 descriptors) | 0.112 | 0.003 | 0.045 | 0.008 | 0.805 | −0.070 | 0.952 |
| 123 | Hydrophilic integy moment (8 descriptors) | 0.116 | −0.085 | 0.014 | 0.007 | 0.759 | −0.059 | 0.957 |
| 124 | Hydrophilic integy moment (8 descriptors) | 0.156 | −0.281 | −0.037 | 0.013 | 0.665 | 0.046 | 0.916 |
| 125 | Hydrophilic integy moment (8 descriptors) | 0.187 | −0.592 | −0.050 | 0.064 | 0.413 | 0.282 | 0.792 |
| 126 | Surface rugosity | 0.714 | −0.206 | −0.017 | −0.005 | 0.016 | −0.168 | 0.917 |
| 127 | Interaction field surface area | 0.820 | 0.047 | −0.011 | −0.004 | 0.022 | 0.137 | 0.987 |
| 128 | Interaction field volume | 0.876 | 0.000 | −0.011 | 0.000 | 0.016 | 0.054 | 0.995 |
| 129 | Hydrophilic volume (8 descriptors) | 0.666 | 0.258 | 0.076 | 0.046 | −0.068 | 0.280 | 0.978 |
| 130 | Hydrophilic volume (8 descriptors) | 0.507 | 0.312 | 0.063 | 0.067 | −0.161 | 0.365 | 0.966 |
| 131 | Hydrophilic volume (8 descriptors) | 0.369 | 0.438 | −0.014 | 0.086 | −0.116 | 0.229 | 0.969 |
| 132 | Hydrophilic volume (8 descriptors) | 0.254 | 0.730 | 0.001 | 0.030 | −0.009 | 0.067 | 0.978 |
| 133 | Hydrophilic volume (8 descriptors) | 0.123 | 1.014 | 0.017 | −0.012 | 0.061 | −0.114 | 0.975 |
| 134 | Hydrophilic volume (8 descriptors) | 0.020 | 1.188 | −0.004 | −0.050 | 0.126 | −0.310 | 0.943 |
| 135 | Polar volume (8 descriptors) | 0.696 | 0.123 | 0.035 | −0.020 | −0.026 | 0.230 | 0.963 |
| 136 | Polar volume (8 descriptors) | 0.497 | 0.176 | −0.010 | −0.072 | −0.080 | 0.255 | 0.910 |
| 137 | Water accessible surface area of all atoms with positive partial charge (strictly greater than 0). | −0.080 | −0.009 | 0.037 | 0.959 | −0.025 | 0.026 | 0.833 |
| 138 | Water accessible surface area of all atoms with negative partial charge (strictly less than 0). | −0.083 | −0.014 | −0.018 | 0.997 | −0.002 | 0.023 | 0.977 |
| 139 | Water accessible surface area of all hydrophobic ( | 0.826 | 0.061 | −0.004 | −0.151 | 0.019 | 0.142 | 0.987 |
| 140 | Water accessible surface area of all polar ( | −0.084 | −0.013 | −0.010 | 1.007 | −0.005 | 0.024 | 0.982 |
| 141 | Positive charge weighted surface area, ASA+ times max. | −0.063 | 0.002 | 0.017 | 0.996 | −0.011 | 0.004 | 0.914 |
| 142 | Negative charge weighted surface area, ASA− times max. | −0.045 | 0.002 | −0.036 | 0.921 | 0.018 | −0.004 | 0.872 |
| 143 | Dipole moment calculated from the partial charges of the molecule. | −0.105 | −0.019 | −0.049 | 0.931 | 0.005 | 0.033 | 0.899 |
| 144 | 3D-Wiener index | 1.166 | −0.074 | 0.007 | −0.018 | −0.001 | 0.022 | 0.973 |
| 145 | 3D-Balaban index | 0.138 | −0.062 | −0.810 | 0.038 | −0.065 | 0.050 | 0.953 |
| 146 | 3D-Harary index | 1.084 | −0.105 | −0.045 | −0.013 | −0.056 | −0.084 | 0.969 |
| 147 | Average geometric distance degree | 1.033 | −0.038 | 0.009 | −0.034 | 0.004 | 0.145 | 0.979 |
| 148 | D/D index | 1.121 | −0.079 | −0.006 | −0.015 | −0.026 | −0.018 | 0.978 |
| 149 | Gravitational index G1 | 0.808 | 0.098 | −0.004 | 0.005 | −0.016 | −0.209 | 0.968 |
| 150 | Span R | 0.718 | −0.071 | 0.040 | −0.033 | 0.021 | 0.600 | 0.951 |
| 151 | Spherosity | −0.328 | −0.110 | 0.031 | −0.162 | 0.019 | 0.836 | 0.781 |
| 152 | Asphericity | 0.005 | −0.254 | 0.206 | 0.041 | 0.024 | 1.160 | 0.797 |
| 153 | Folding degree index | −0.519 | 0.010 | 0.464 | −0.019 | 0.158 | 0.512 | 0.762 |
| 154 | Aromaticity index | −0.030 | −0.126 | 0.236 | −0.165 | 0.052 | 0.186 | 0.819 |
| 155 | HOMA total | 0.310 | −0.025 | 0.229 | −0.075 | −0.017 | −0.031 | 0.892 |
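A common way to read a loading table like this one is to assign each property to the factor on which its absolute loading is largest. The sketch below applies that rule to six rows copied from the table. Whether the authors used exactly this rule is an assumption, and the factor names are taken in the order in which the abstract lists the six property patterns, which is also an assumption about the numbering.

```python
from typing import Dict, List

# Factor names in the order the six property patterns are listed
# (assumed to correspond to Factor 1..6).
FACTOR_NAMES: List[str] = [
    "geometric characteristics",
    "H-bond",
    "connectivity",
    "accessible surface area",
    "integy moments index",
    "volume and shape",
]

# Loadings copied from rows 1, 114, 31, 143, 97 and 152 of the table.
LOADINGS: Dict[str, List[float]] = {
    "First Zagreb index M1": [0.798, 0.018, 0.127, -0.015, -0.004, -0.146],
    "H-bond donor capacity (row 114)": [0.075, 1.067, 0.021, 0.017, 0.085, -0.169],
    "Balaban distance connectivity index": [-0.127, 0.053, -0.925, -0.008, 0.000, -0.146],
    "Dipole moment": [-0.105, -0.019, -0.049, 0.931, 0.005, 0.033],
    "Amphiphilic moment": [0.077, -0.119, -0.048, 0.013, 0.897, 0.069],
    "Asphericity": [0.005, -0.254, 0.206, 0.041, 0.024, 1.160],
}


def dominant_factor(loadings: List[float]) -> int:
    """Return the 1-based index of the factor with the largest |loading|."""
    best = max(range(len(loadings)), key=lambda i: abs(loadings[i]))
    return best + 1


assignments = {name: dominant_factor(row) for name, row in LOADINGS.items()}
for name, factor in assignments.items():
    print(f"{name}: Factor {factor} ({FACTOR_NAMES[factor - 1]})")
```

The sign of the loading is ignored on purpose: the Balaban distance connectivity index loads strongly but negatively on Factor 3, and it is still assigned there.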
Correlation coefficients among the six factors.
| Factor | Factor 1 | Factor 2 | Factor 3 | Factor 4 | Factor 5 | Factor 6 |
| Factor 1 | 1.000 | 0.324 | 0.097 | −0.004 | 0.247 | 0.382 |
| Factor 2 | 0.324 | 1.000 | −0.358 | 0.241 | −0.251 | 0.306 |
| Factor 3 | 0.097 | −0.358 | 1.000 | −0.076 | 0.173 | −0.128 |
| Factor 4 | −0.004 | 0.241 | −0.076 | 1.000 | −0.232 | −0.017 |
| Factor 5 | 0.247 | −0.251 | 0.173 | −0.232 | 1.000 | 0.178 |
| Factor 6 | 0.382 | 0.306 | −0.128 | −0.017 | 0.178 | 1.000 |
Figure 1. The 2- and 3-dimensional distributions of the 22 natural amino acids on the first 2 and 3 factors.
Panels A and B show the 2- and 3-dimensional distributions, respectively.
Figure 2. The differences between the 22 natural amino acids and the 593 non-natural amino acids on each factor.
Panels A, B, C, D, E and F show the differences for geometric characteristics, H-bond, connectivity, accessible surface area, integy moments index, and volume and shape, respectively.
Performance comparison among different QSAR models of bitter-tasting dipeptides (BTDs).
| No. | Descriptors | Data size | Correlation methods | A | R² | Q² | RMS |
| 1 | z-scales | 48 | PLS | 2 | 0.824 | nd | 0.260 |
| 2 | GRID | 48 | PLS | 1 | nd | 0.780 | nd |
| 3 | ISA-ECI | 48 | PLS | 2 | 0.847 | nd | nd |
| 4 | MS-WHIM | 48 | PLS | 3 | 0.704 | 0.633 | nd |
| 5 | MS-WHIM(extended) | 48 | PLS | 3 | 0.754 | 0.710 | 0.320 |
| 6 | VHSE | 48 | PLS | 3 | 0.910 | 0.816 | 0.200 |
| 7 | MARCH-INSIDE | 48 | PLS | 3 | 0.858 | nd | 0.230 |
| 8 | FASGAI | 48 | PLS | 3 | 0.886 | 0.723 | 0.220 |
| 9 | FASGAI | 48 | GA-PLS | 3 | 0.907 | 0.848 | 0.198 |
| 10 | FASGAI | 24/24 | GA-PLS | 2 | 0.936 | 0.761 | 0.172 |
| 11 | 3D-HoVAIF | 48 | PLS | 3 | 0.936 | 0.849 | nd |
| 12 | SZOTT | 48 | PLS | 2 | 0.908 | 0.736 | 0.195 |
| 13 | ST-SCALE | 48 | PLS | 5 | 0.855 | 0.774 | 0.400 |
| 14 | NNAAIndex | 48 | PLS | 2 | 0.863 | 0.765 | 0.238 |
| 15 | NNAAIndex | 48 | GA-PLS | 1 | 0.864 | 0.830 | 0.234 |
| 16 | NNAAIndex | 24/24 | GA-PLS | 2 | 0.898 | 0.772 | 0.198 |
A is the number of principal components.
R² is the cumulative squared multiple correlation coefficient.
Q² is the cumulative cross-validated squared multiple correlation coefficient obtained by a leave-one-out procedure.
RMS is the root mean square error of the model fit.
nd indicates that the value was not reported.
Two numbers separated by a slash denote the numbers of samples in the training and test sets, respectively.
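The three statistics defined in the notes above can be illustrated with a small sketch. It swaps in a one-descriptor ordinary least-squares line for the PLS models compared in the table, purely to keep the example short. The toy data, the function names, and the choice of dividing by n in the RMS are assumptions, not details taken from the paper.

```python
from math import sqrt
from typing import List, Tuple


def fit_line(x: List[float], y: List[float]) -> Tuple[float, float]:
    """Ordinary least-squares slope and intercept of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r2_q2_rms(x: List[float], y: List[float]) -> Tuple[float, float, float]:
    """Return (R2, leave-one-out Q2, RMS error of the fit)."""
    n = len(y)
    mean_y = sum(y) / n
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)

    # R2 and RMS use the model fitted on all samples.
    slope, intercept = fit_line(x, y)
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))

    # Q2 uses PRESS: each sample is predicted by a model that never saw it.
    press = 0.0
    for i in range(n):
        s, b = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - (s * x[i] + b)) ** 2

    # RMS divides by n here; other conventions subtract degrees of freedom.
    return 1.0 - ss_res / ss_tot, 1.0 - press / ss_tot, sqrt(ss_res / n)


# Toy data invented for illustration only.
X_TOY = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
Y_TOY = [0.1, 0.9, 2.2, 2.8, 4.1, 5.0]
r2, q2, rms = r2_q2_rms(X_TOY, Y_TOY)
print(f"R2={r2:.3f}  Q2={q2:.3f}  RMS={rms:.3f}")
```

For a least-squares fit, Q² computed this way cannot exceed R², because a left-out sample is never predicted better than when it takes part in the fit. A large gap between the two is a usual sign of overfitting.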
Figure 3. Plot of the 50-random-permutation validation for the GA-PLS model of BTDs.
The intercepts of the R² and Q² regression lines with the ordinate axis are −0.026 and −0.183, which are below the limits of 0.300 for R² and 0.050 for Q², respectively.
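A response-permutation check of this kind can be sketched as follows. The activities are shuffled, the model is refitted, and the resulting R² is plotted against the absolute correlation between the shuffled and the original activities; the intercept of the regression line through those points estimates the R² obtainable by chance. The sketch again uses a one-descriptor least-squares line instead of GA-PLS and tracks only R² (Q² would be handled the same way). The toy data, the seed, and the inclusion of the unpermuted model as an extra point are assumptions.

```python
import random
from typing import List, Tuple


def fit_line(x: List[float], y: List[float]) -> Tuple[float, float]:
    """Ordinary least-squares slope and intercept of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(x: List[float], y: List[float]) -> float:
    """R2 of the least-squares line fitted on (x, y)."""
    slope, intercept = fit_line(x, y)
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def pearson(a: List[float], b: List[float]) -> float:
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    var_a = sum((ai - mean_a) ** 2 for ai in a)
    var_b = sum((bi - mean_b) ** 2 for bi in b)
    return cov / (var_a * var_b) ** 0.5


def permutation_intercept(x: List[float], y: List[float],
                          n_perm: int = 50, seed: int = 0) -> Tuple[float, float]:
    """Return (intercept of the R2-vs-|correlation| line, original R2)."""
    rng = random.Random(seed)
    original_r2 = r_squared(x, y)
    correlations = [1.0]        # the unpermuted model sits at correlation 1
    r2_values = [original_r2]
    for _ in range(n_perm):
        y_perm = y[:]
        rng.shuffle(y_perm)
        correlations.append(abs(pearson(y, y_perm)))
        r2_values.append(r_squared(x, y_perm))
    _, intercept = fit_line(correlations, r2_values)
    return intercept, original_r2


# Toy data invented for illustration only.
X_TOY = [float(i) for i in range(12)]
Y_TOY = [0.2, 1.1, 1.9, 3.2, 3.9, 5.1, 6.0, 6.8, 8.1, 9.0, 9.9, 11.2]
r2_intercept, r2_original = permutation_intercept(X_TOY, Y_TOY)
print(f"original R2={r2_original:.3f}  permutation intercept={r2_intercept:.3f}")
```

An intercept far below the R² of the real model indicates that the fit is unlikely to be an artefact of chance correlation, which is the argument the caption makes with its intercept limits.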
Figure 4. The newly designed peptide mimetics for BTDs.
The forty-sixth sample in the training set, WW, is used as a template to design molecules. The activities of WW, 206-108, 206-439 and 206-206 are 3.60, 9.78, 9.20, and 9.17, respectively.
Performance comparison among different QSAR models of angiotensin-converting enzyme (ACE) inhibitors.
| No. | Descriptors | Data size | Correlation methods | A | R² | Q² | RMS |
| 1 | z-scales | 58 | PLS | 2 | 0.770 | nd | nd |
| 2 | GRID | 58 | PLS | 1 | 0.744 | nd | 0.500 |
| 3 | ISA-ECI | 58 | PLS | 2 | 0.700 | nd | nd |
| 4 | MS-WHIM(rotameric) | 58 | PLS | 6 | 0.657 | 0.541 | nd |
| 5 | MS-WHIM(extended) | 58 | PLS | 2 | 0.708 | 0.637 | 0.540 |
| 6 | VHSE | 58 | SMR-PLS | 1 | 0.770 | 0.745 | 0.480 |
| 7 | FASGAI | 58 | PLS | 1 | 0.760 | 0.728 | 0.495 |
| 8 | FASGAI | 58 | GA-PLS | 1 | 0.796 | 0.775 | 0.456 |
| 9 | FASGAI | 29/29 | GA-PLS | 1 | 0.869 | 0.835 | 0.357 |
| 10 | 3D-HoVAIF | 58 | GA-PLS | 3 | 0.857 | 0.811 | 0.376 |
| 11 | T-scale | 58 | SMR-PLS | 2 | 0.845 | 0.786 | 0.390 |
| 12 | NNAAIndex | 58 | PLS | 2 | 0.749 | 0.719 | 0.511 |
| 13 | NNAAIndex | 58 | GA-PLS | 2 | 0.803 | 0.779 | 0.453 |
| 14 | NNAAIndex | 29/29 | GA-PLS | 1 | 0.852 | 0.832 | 0.369 |
A is the number of principal components.
R² is the cumulative squared multiple correlation coefficient.
Q² is the cumulative cross-validated squared multiple correlation coefficient obtained by a leave-one-out procedure.
RMS is the root mean square error of the model fit.
nd indicates that the value was not reported.
Two numbers separated by a slash denote the numbers of samples in the training and test sets, respectively.
Figure 5. Plot of the 50-random-permutation validation for the GA-PLS model of ACE inhibitors.
The intercepts of the R² and Q² regression lines with the ordinate axis are 0.029 and −0.270, which are below the limits of 0.300 for R² and 0.050 for Q², respectively.
Figure 6. The newly designed peptide mimetics for ACE inhibitors.
The first sample in the training set, VW, is used as a template to design molecules. The activities of VW, 512-439, 512-534 and 512-524 are 5.80, 8.33, 8.04, and 8.04, respectively.
Figure 7. The newly designed mica-binding peptide mimetics.
The 13th sample in the training set, TLTRVGW, is used as a template to design molecules. The predicted scores of TLTRVGW, TLTRV108W, TLTRV439W, TLTRV534W and TLTRV524W are 0.94, 1.00, 1.00, and 1.00, respectively.