Jian Song1,2,3, Jijun Tang1,2,4, Fei Guo1,2.
Abstract
Matrix metalloproteases (MMPs) are a family of zinc-dependent proteinases that play complex and diverse roles in metabolism and are vital for physiological development. In this paper, we present a novel method to identify peptides binding to seven matrix metalloproteases. First, we propose a novel sampling criterion for constructing a training set for each new peptide motif. Then, we select nine physicochemical properties of amino acids and compute their auto-cross covariance to extract features for both natural and non-natural amino acids. Finally, we adopt random forest to predict the binding value of each peptide motif with each of the seven MMPs. Our method is validated on 1300 known peptide motifs binding to seven MMPs and achieves a high Pearson product-moment correlation coefficient (PCC) and a low root mean squared error (RMSE) on all seven MMPs, most notably 0.9181 and 9.3827, respectively, on MMP-7. We predict binding values of 4000 peptide motifs and identify peptides that preferentially bind to MMP-2 and MMP-7. We report 4 novel inhibitor candidates (Asp-Ile-Phe, Asp-Ile-Tyr, Asp-Ile-Lys and Hser-Gly-Phe) with high potency and selectivity for MMP-2, as well as 6 novel inhibitor candidates (Chg-Ile-Ile, Chg-Ile-Leu, Chg-Ile-Glu, Chg-Ile-Met, Chg-Val-Ile and Chg-Val-Leu) that selectively bind to MMP-7. Our findings facilitate the identification of inhibitors with good potency and desirable selectivity, providing significant insights into candidate inhibitor drugs.
Keywords: MMPs; auto-cross covariance; peptide inhibitors; random forest
Year: 2018 PMID: 29989088 PMCID: PMC6036742 DOI: 10.7150/ijbs.24588
Source DB: PubMed Journal: Int J Biol Sci ISSN: 1449-2288 Impact factor: 6.580
Scheme 1. The optional non-natural and natural amino acids for three positions. The consists of 11 non-natural and natural amino acids made of substituted succinyl hydroxamate ZBG (highlighted in pink). Each was assigned a unique single-letter code (inset).
Figure 1. The overall method flow.
Five categories of 20 natural amino acids and 5 non-natural amino acids
| Category | Amino Acids |
|---|---|
| Amino acids with positive charged side chains | R, H, K, B |
| Amino acids with negative charged side chains | D, E, J, Z |
| Amino acids with polar uncharged side chains | S, T, N, Q |
| Amino acids with hydrophobic side chains | A, I, L, M, F, W, Y, V |
| Special cases | C, G, P, U, |
Nine physicochemical properties of 20 natural amino acids and 5 non-natural amino acids.
| AA | Code | MW | SV | SE | P | HB | CSI | ECC | SPH | HY |
|---|---|---|---|---|---|---|---|---|---|---|
| ALA | A | 90.12 | 7.11 | 14.35 | 7.58 | 2 | 24 | 16 | 0.576 | 3.794 |
| GLY | G | 75.08 | 5.21 | 10.52 | 5.44 | 2 | 19 | 13 | 0.828 | 2.870 |
| ILE | I | 132.21 | 11.90 | 23.00 | 12.86 | 2 | 61 | 37 | 0.557 | 2.926 |
| LEU | L | 132.21 | 11.90 | 23.00 | 12.86 | 2 | 63 | 38 | 0.500 | 2.926 |
| PRO | P | 116.16 | 9.71 | 18.23 | 10.34 | 2 | 51 | 27 | 0.651 | 2.001 |
| VAL | V | 118.18 | 10.31 | 20.12 | 11.10 | 2 | 43 | 27 | 0.410 | 3.150 |
| PHE | F | 166.22 | 14.31 | 24.12 | 15.10 | 2 | 131 | 68 | 0.831 | 2.456 |
| TRP | W | 205.26 | 17.30 | 28.22 | 18.11 | 3 | 195 | 95 | 0.852 | 3.198 |
| TYR | Y | 182.22 | 14.82 | 25.44 | 15.56 | 3 | 157 | 82 | 0.787 | 3.446 |
| ASP | D | 134.13 | 9.13 | 18.00 | 9.49 | 4 | 63 | 38 | 0.708 | 4.320 |
| GLU | E | 148.16 | 10.73 | 20.89 | 11.25 | 4 | 85 | 50 | 0.770 | 4.068 |
| ARG | R | 176.26 | 14.59 | 28.36 | 15.50 | 4 | 139 | 79 | 0.863 | 8.560 |
| HIS | H | 156.19 | 12.10 | 21.55 | 12.59 | 4 | 106 | 55 | 0.644 | 3.857 |
| LYS | K | 148.24 | 13.20 | 26.04 | 14.25 | 2 | 98 | 57 | 0.836 | 6.438 |
| SER | S | 106.12 | 7.62 | 15.68 | 8.03 | 4 | 36 | 23 | 0.652 | 4.875 |
| THR | T | 120.15 | 9.22 | 18.56 | 9.80 | 4 | 43 | 27 | 0.450 | 4.508 |
| CYS | C | 122.19 | 8.20 | 15.43 | 9.23 | 2 | 36 | 23 | 0.668 | 4.875 |
| MET | M | 150.25 | 11.39 | 21.19 | 12.75 | 2 | 74 | 44 | 0.810 | 3.032 |
| ASN | N | 133.15 | 9.62 | 18.78 | 10.04 | 4 | 63 | 38 | 0.734 | 5.574 |
| GLN | Q | 147.18 | 11.21 | 21.66 | 11.80 | 4 | 85 | 50 | 0.787 | 5.271 |
| CHG | B | 158.25 | 14.50 | 26.88 | 15.63 | 2 | 101 | 53 | 0.477 | 2.587 |
| HSER | J | 120.15 | 9.22 | 18.56 | 9.80 | 4 | 54 | 33 | 0.735 | 4.508 |
| HPE | Z | 180.25 | 15.90 | 27.00 | 16.86 | 2 | 163 | 84 | 0.762 | 2.342 |
| CPA3 | U | 157.24 | 14.20 | 25.94 | 15.24 | 2 | 106 | 55 | 0.595 | 1.574 |
MW, Molecular Weight; SV, Sum of Atomic Van Der Waals Volumes; SE, Sanderson Electronegativity; P, Polarizability; HB, Number of hydrogen bonds; CSI, Connectivity Index; ECC, Eccentricity; SPH, Sphericity; HY, Hydrophilic factor.
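As a rough illustration of how auto-cross covariance (ACC) turns per-residue properties into a fixed-length feature vector, the sketch below uses three of the nine properties (MW, HB, HY) for three residues, copied from the table above. The z-score normalization, the lag range and the function names are assumptions made for this example; the exact ACC variant used by the authors is not specified here.

```python
from statistics import mean, pstdev

# MW, HB and HY for Asp (D), Ile (I) and Phe (F), copied from the table above.
PROPS = {
    "D": [134.13, 4, 4.320],
    "I": [132.21, 2, 2.926],
    "F": [166.22, 2, 2.456],
}

def standardize(props):
    """Z-score each property across the listed residues (assumed normalization)."""
    cols = list(zip(*props.values()))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # guard against a constant column
    return {aa: [(v - m) / s for v, m, s in zip(vals, mus, sds)]
            for aa, vals in props.items()}

def acc_features(peptide, props, max_lag):
    """Auto (j1 == j2) and cross (j1 != j2) covariance terms for lags 1..max_lag."""
    rows = [props[aa] for aa in peptide]
    n, k = len(rows), len(rows[0])
    means = [mean(r[j] for r in rows) for j in range(k)]
    feats = []
    for lag in range(1, max_lag + 1):
        for j1 in range(k):
            for j2 in range(k):
                total = sum(
                    (rows[i][j1] - means[j1]) * (rows[i + lag][j2] - means[j2])
                    for i in range(n - lag)
                )
                feats.append(total / (n - lag))
    return feats

features = acc_features("DIF", standardize(PROPS), max_lag=2)
print(len(features))  # 3 properties x 3 properties x 2 lags = 18
```

Under the same assumptions, all nine properties and two lags would give 9 x 9 x 2 = 162 features per tripeptide, independent of which residues occupy the three positions.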
Validation on 1300 peptide motifs using random forest with 1500 trees.
| MMP | PCC (Leave-One-Out) | RMSE (Leave-One-Out) | PCC (Two-fold CV) | RMSE (Two-fold CV) |
|---|---|---|---|---|
| MMP-2 | 0.8212 | 17.7916 | 0.7836 | 19.3735 |
| MMP-3 | 0.7682 | 17.2379 | 0.7117 | 18.9166 |
| MMP-7 | 0.9181 | 9.3827 | 0.9053 | 10.0559 |
| MMP-8 | 0.8910 | 12.4845 | 0.8756 | 13.2831 |
| MMP-9 | 0.9124 | 12.0189 | 0.8893 | 13.4283 |
| MMP-13 | 0.8708 | 16.2411 | 0.8448 | 17.6808 |
| MMP-14 | 0.7247 | 16.7657 | 0.6910 | 17.5881 |
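The two metrics reported above, PCC and RMSE, and the leave-one-out protocol can be sketched in plain Python as follows. The toy data and the nearest-neighbour predictor are hypothetical stand-ins chosen to keep the example dependency-free; they are not the random forest with 1500 trees used in the paper.

```python
from math import sqrt

def pcc(y_true, y_pred):
    """Pearson product-moment correlation coefficient."""
    n = len(y_true)
    mean_t, mean_p = sum(y_true) / n, sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / sqrt(var_t * var_p)

def rmse(y_true, y_pred):
    """Root mean squared error."""
    n = len(y_true)
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

def leave_one_out(xs, ys, fit_predict):
    """Predict each sample from a model fitted on all the other samples."""
    preds = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        preds.append(fit_predict(train_x, train_y, xs[i]))
    return preds

def nearest_neighbour(train_x, train_y, query):
    """Placeholder model (NOT a random forest): copy the closest sample's value."""
    dists = [sum((a - b) ** 2 for a, b in zip(x, query)) for x in train_x]
    return train_y[dists.index(min(dists))]

# Hypothetical feature vectors and binding values, for illustration only.
xs = [[0.1], [0.4], [0.9], [1.3]]
ys = [10.0, 25.0, 40.0, 70.0]

loo_preds = leave_one_out(xs, ys, nearest_neighbour)
print(pcc(ys, loo_preds), rmse(ys, loo_preds))
```

Any regressor exposing the same `fit_predict(train_x, train_y, query)` shape could be substituted for the placeholder.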
Validation on 1300 peptide motifs using random forest with 1500 trees with Sampling Criteria.
| MMP | PCC (Relevant Samples) | RMSE (Relevant Samples) | RMSE (Irrelevant Samples) |
|---|---|---|---|
| MMP-2 | 0.8195 | 17.8608 | 35.1576 |
| MMP-3 | 0.7680 | 17.2457 | 29.8699 |
| MMP-7 | 0.9181 | 9.3803 | 29.5143 |
| MMP-8 | 0.8908 | 12.4952 | 33.2383 |
| MMP-9 | 0.9095 | 12.2094 | 37.6389 |
| MMP-13 | 0.8680 | 16.4052 | 44.0620 |
| MMP-14 | 0.7246 | 16.7671 | 25.9060 |
Validation of binding values of inhibitors of MMP-2 with distinct computational methods.
| Method | PCC | RMSE |
|---|---|---|
| | 0.5547 | 25.9458 |
| | - | 33.6099 |
| | 0.7240 | 21.5097 |
| | 0.8212 | 17.7916 |
Figure 2. Position-specific scoring histogram of the top 50 motifs by binding value among 1300 samples against seven MMPs. For each MMP, we select the peptides with the top 50 predicted binding values from the 1300-member library. Each bar represents the frequency of each amino acid type at each position among the top 50 predicted binding peptides of the specific MMP. The x axis denotes the nominal positions of a binding peptide. The y axis and the height of a letter denote its frequency at that position, indicating its contribution to the binding value at that position.
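The per-position frequencies behind such a histogram can be computed with a simple counter. In this sketch the input is the six MMP-7 candidates named in the abstract, written with the single-letter codes from the property table (B for Chg); treating the written residue order as positions 1 to 3 and the function name are assumptions for illustration.

```python
from collections import Counter

def position_frequencies(peptides):
    """Return, for each position, the fraction of peptides carrying each residue."""
    length = len(peptides[0])
    freqs = []
    for pos in range(length):
        counts = Counter(p[pos] for p in peptides)
        freqs.append({aa: c / len(peptides) for aa, c in counts.items()})
    return freqs

# Chg-Ile-Ile, Chg-Ile-Leu, Chg-Ile-Glu, Chg-Ile-Met, Chg-Val-Ile, Chg-Val-Leu
top_peptides = ["BII", "BIL", "BIE", "BIM", "BVI", "BVL"]

freqs = position_frequencies(top_peptides)
for pos, table in enumerate(freqs, start=1):
    print(pos, sorted(table.items(), key=lambda kv: -kv[1]))
```

For a real figure the input would be the top 50 (or top 100) predicted peptides of one MMP rather than this six-peptide list.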
Figure 3. Averaged inhibition contributions across the three permuted positions. Each bar represents the averaged inhibition value of the relevant residue across the 1300-member library. The asterisk (*) highlights the residue contributing the highest inhibition average in each graph.
Comparison with Experimental method of top average binding values on position
| MMP | Yao | Ours |
|---|---|---|
| MMP-2 | Z | Z |
| MMP-3 | Sf, U, Z | U, Z |
| MMP-7 | U | U |
| MMP-8 | Z, L | Z |
| MMP-9 | Z | Z |
| MMP-13 | Z | Z |
| MMP-14 | L | L |
Figure 4. Position-specific scoring histogram of the top 100 motifs by binding value among 4000 samples against seven MMPs. For each MMP, we select the peptides with the top 100 predicted binding values from the 4000-member library. Each bar represents the frequency of each amino acid type at each position among the top 100 predicted binding peptides of the specific MMP. The x axis denotes the nominal positions of a binding peptide. The y axis and the height of a letter denote its frequency at that position, indicating its contribution to the binding value at that position.
Inhibitors predicted by the computational method to have high potency and selectivity for MMP-2 and MMP-7
| No. | MMP-2 | MMP-7 |
|---|---|---|
| 1 | HSER-LEU-HIS * | CHG-ILE-ILE |
| 2 | ASP-ILE-PHE | CHG-ILE-LEU |
| 3 | ASP-ILE-TYR | CHG-ILE-GLU |
| 4 | ASP-ILE-LYS | CHG-ILE-MET |
| 5 | HSER-GLY-PHE | CHG-VAL-ILE |
| 6 | | CHG-VAL-LEU |
Predicted binding values against seven MMPs for inhibitors with high selectivity for MMP-2
| Peptide | MMP-2 | MMP-3 | MMP-7 | MMP-8 | MMP-9 | MMP-13 | MMP-14 |
|---|---|---|---|---|---|---|---|
| Hser-Leu-His | 60.5067 | 19.7844 | 16.7174 | 1.22158 | 9.5357 | 1.6049 | 8.3165 |
| Asp-Ile-Phe | 65.9515 | 16.5326 | 18.1990 | 0.39054 | 15.2081 | 1.7737 | 11.3050 |
| Asp-Ile-Tyr | 66.5144 | 18.0813 | 17.9488 | 0.10264 | 15.7835 | 0.9748 | 8.8563 |
| Asp-Ile-Lys | 61.4983 | 14.9883 | 17.0719 | 0.51577 | 10.3562 | 3.1173 | 14.9769 |
| Hser-Gly-Phe | 64.4088 | 17.8065 | 16.4288 | 2.24950 | 13.1266 | 5.9196 | 13.8823 |
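One simple way to read selectivity off the table above is the gap between the predicted value for the target MMP and the best value among the other six. The margin defined below is an illustrative assumption, not the selectivity criterion used by the authors; the numbers are copied from the table.

```python
MMPS = ["MMP-2", "MMP-3", "MMP-7", "MMP-8", "MMP-9", "MMP-13", "MMP-14"]

# Predicted binding values, copied from the table above (columns follow MMPS).
PREDICTED = {
    "Hser-Leu-His": [60.5067, 19.7844, 16.7174, 1.22158, 9.5357, 1.6049, 8.3165],
    "Asp-Ile-Phe": [65.9515, 16.5326, 18.1990, 0.39054, 15.2081, 1.7737, 11.3050],
    "Asp-Ile-Tyr": [66.5144, 18.0813, 17.9488, 0.10264, 15.7835, 0.9748, 8.8563],
    "Asp-Ile-Lys": [61.4983, 14.9883, 17.0719, 0.51577, 10.3562, 3.1173, 14.9769],
    "Hser-Gly-Phe": [64.4088, 17.8065, 16.4288, 2.24950, 13.1266, 5.9196, 13.8823],
}

def selectivity_margin(values, target):
    """Target value minus the highest value among the off-target MMPs."""
    t = MMPS.index(target)
    best_other = max(v for i, v in enumerate(values) if i != t)
    return values[t] - best_other

margins = {name: selectivity_margin(vals, "MMP-2") for name, vals in PREDICTED.items()}
for name, margin in sorted(margins.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {margin:.4f}")
```

A larger margin means the peptide is predicted to bind MMP-2 much more strongly than any other MMP in the panel.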
Predicted binding values against seven MMPs for inhibitors with high selectivity for MMP-7
| Peptide | MMP-2 | MMP-3 | MMP-7 | MMP-8 | MMP-9 | MMP-13 | MMP-14 |
|---|---|---|---|---|---|---|---|
| Chg-Ile-Ile | 24.7546 | 21.6090 | 64.3935 | 8.0101 | 24.6581 | 21.9287 | 24.5430 |
| Chg-Ile-Leu | 19.2488 | 18.2610 | 62.8014 | 6.1657 | 24.1854 | 24.0525 | 22.9100 |
| Chg-Ile-Glu | 63.9129 | 27.2415 | 60.7542 | 18.0563 | 17.8181 | 2.32008 | 25.0958 |
| Chg-Ile-Met | 48.4514 | 29.6585 | 63.2750 | 15.3248 | 21.0132 | 11.3537 | 22.0503 |
| Chg-Val-Ile | 22.8662 | 24.8929 | 61.1206 | 7.2634 | 19.4140 | 19.4965 | 21.4045 |
| Chg-Val-Leu | 18.8152 | 23.2073 | 60.0980 | 5.5433 | 18.4461 | 21.3667 | 19.1914 |