Seungpyo Hong, Taesu Chung, Dongsup Kim.
Abstract
SH3 domains mediate signal transduction by recognizing short peptides. Understanding the driving forces of peptide recognition will help us predict the binding specificity of domain-peptide recognition and understand the molecular interaction networks of cells. However, accurate calculation of the binding energy is a tough challenge. In this study, we propose three ideas for improving our ability to predict the binding energy between SH3 domains and peptides: (1) utilizing the structural ensembles sampled from a molecular dynamics simulation trajectory, (2) utilizing multiple peptide templates, and (3) optimizing the sequence-structure mapping. We tested these three ideas on ten previously studied SH3 domains for which SPOT analysis data were available. The results indicate that calculating binding energy using the structural ensemble was most effective, clearly increasing the prediction accuracy, while the second and third ideas tended to give better binding energy predictions. We applied our method to the five SH3 targets in the DREAM4 Challenge and selected the best-performing method.
Year: 2010 PMID: 20856816 PMCID: PMC2939891 DOI: 10.1371/journal.pone.0012654
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Figure 1. Ensemble Based Binding Energy Calculation Method.
Our method comprises three steps: structure sampling, energy matrix generation, and binding energy calculation. Initial complex structures were generated by superimposing the peptides from crystal structures onto the modeled SH3 domains. For each initial complex, near-binding-state conformations were sampled by molecular dynamics simulation. The sampled structures were used to calculate the contribution of each amino acid to the binding energy at each position, and these contributions were converted into energy matrices. The resulting energy matrices were used to calculate the binding energy of peptides.
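The last step, scoring a peptide against an energy matrix, can be illustrated with a short sketch. It assumes that per-position contributions simply add up, which the caption does not state explicitly; the function name `binding_energy` and all matrix values are hypothetical.

```python
from typing import Dict, List

# Hypothetical energy matrix: one dict per peptide position, mapping an
# amino acid to its energy contribution (illustrative values only).
EnergyMatrix = List[Dict[str, float]]


def binding_energy(peptide: str, matrix: EnergyMatrix) -> float:
    """Sum per-position contributions, assuming positions are independent."""
    if len(peptide) != len(matrix):
        raise ValueError("peptide length must match the number of matrix positions")
    return sum(matrix[i][aa] for i, aa in enumerate(peptide))


toy_matrix: EnergyMatrix = [
    {"P": -1.0, "A": 0.0, "R": 0.5},
    {"P": 0.25, "A": 0.0, "R": -0.5},
    {"P": -2.0, "A": 0.0, "R": 1.0},
]

# Score the toy peptide "PRP": -1.0 + -0.5 + -2.0
energy = binding_energy("PRP", toy_matrix)
```

With a matrix of this shape, scoring a large peptide library reduces to table lookups, so no further structural calculation is needed per sequence.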
Effect of structural ensemble sampled from MD simulation trajectory.
| SH3 Domain | Single Conformation | Best Conformation | Multiple Conformations |
|  | 0.32±0.03 | 0.37 |  |
|  | 0.33±0.13 |  | 0.43 |
|  | 0.41±0.06 | 0.48 |  |
|  | 0.23±0.10 | 0.33 |  |
|  | 0.31±0.07 |  | 0.38 |
|  | 0.35±0.05 | 0.42 |  |
|  | 0.57±0.06 | 0.65 | 0.65 |
|  | 0.34±0.16 |  | 0.47 |
The Pearson's correlation coefficients between the predicted binding energies and SPOT data are shown.
*Average correlation coefficient of 11 conformations.
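The correlation measure used in these tables is the standard Pearson coefficient. A minimal sketch follows; the function name `pearson` and the sample numbers are illustrative and not taken from the study, and the sign convention relating predicted energies to SPOT intensities is not specified in this excerpt.

```python
import math
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(var_x * var_y)


# Made-up predicted scores and measured signals for four peptides.
predicted = [1.0, 2.0, 3.0, 4.0]
measured = [2.0, 4.0, 6.0, 8.0]
r = pearson(predicted, measured)
```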
Figure 2. Performance Dependency on Number of Averaged Energies.
Of the 11 conformations sampled via molecular dynamics simulation, the average of the n lowest energies was used as the binding energy. At n = 0, the average performance obtained when a single conformation was used for the calculation is plotted. ‘+’: ABP1, ‘×’: Amphiphysin, ‘*’: Endophilin, empty box: MYO5, filled box: RVS167, empty circle: SHO1, filled circle: LSB3, triangle: YSC84, line: averaged performance.
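The averaging scheme described in this caption can be sketched as follows. The function name `ensemble_energy` and the energy values are illustrative assumptions; only the rule "average the n lowest energies" comes from the caption.

```python
from typing import Sequence


def ensemble_energy(energies: Sequence[float], n: int) -> float:
    """Average the n lowest energies computed over the sampled conformations."""
    if not 1 <= n <= len(energies):
        raise ValueError("n must be between 1 and the number of conformations")
    lowest = sorted(energies)[:n]
    return sum(lowest) / n


# Made-up energies of one peptide evaluated on four sampled conformations.
conformation_energies = [-3.0, -1.0, -5.0, -2.0]
avg_two_lowest = ensemble_energy(conformation_energies, 2)  # uses -5.0 and -3.0
avg_all = ensemble_energy(conformation_energies, 4)
```

Setting n to 1 keeps only the single most favorable conformation, while setting n to the ensemble size gives the plain mean.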
Effect of using multiple peptide templates.
| Domain | Energy Based Selection | Best Peptide Template | Peptide Template | Average (Class I) | Average (Class II) |
|  | 0. | 0.39 | 4 (II) | 0.36 | 0.32 |
|  | 0.20 | 0.43 | 8 (II) | 0.03 |  |
|  |  | 0.53 | 9 (II) | 0.30 | 0.41 |
|  |  | 0.36 | 2 (I) | 0.20 | 0.17 |
|  |  | 0.38 | 7 (II) | 0.25 | 0.30 |
|  | 0.29 | 0.48 | 6 (II) | 0.26 |  |
|  |  | 0.48 | 7 (II) | 0.26 | 0.36 |
|  |  | 0.43 | 3 (I) | 0.36 | 0.32 |
|  |  | 0.52 | 3 (I) | 0.45 | 0.30 |
|  | 0.23 | 0.26 | 2 (I) |  | 0.21 |
|  |  | 0.65 | 9 (II) | 0.19 | 0.59 |
|  |  | 0.47 | 4 (II) | 0.39 | 0.42 |
|  |  | 0.71 | 9 (II) | 0.35 | 0.66 |
|  |  | 0.52 | 9 (II) | 0.07 | 0.46 |
|  |  | 0.40 | 3 (I) | 0.26 | 0.25 |
|  |  | 0.57 | 9 (II) | 0.19 | 0.49 |
The Pearson's correlation coefficients between the predicted binding energies and SPOT data are shown.
*When sequences are separated into Class I and Class II, the class is marked in parentheses. Class I has the (R/K)xxPxxP motif and Class II has the PxxPx(R/K) motif. ABP1, Amphiphysin, Endophilin, and MYO5 do not have the canonical SH3 motifs.
Peptides 1 to 3 have Class I orientation, and peptides 4 to 9 have Class II orientation; the class is marked in parentheses.
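The two canonical motifs translate directly into regular expressions, with `x` read as any residue. The sketch below is illustrative: the function name `motif_classes` and the example sequences are assumptions, and how the authors actually separated sequences into classes is not described in this excerpt.

```python
import re
from typing import List

# Class I: (R/K)xxPxxP, Class II: PxxPx(R/K); "x" matches any residue.
CLASS_I = re.compile(r"[RK]..P..P")
CLASS_II = re.compile(r"P..P.[RK]")


def motif_classes(sequence: str) -> List[str]:
    """Return the canonical SH3 motif classes found anywhere in the sequence."""
    classes = []
    if CLASS_I.search(sequence):
        classes.append("I")
    if CLASS_II.search(sequence):
        classes.append("II")
    return classes


# Made-up example peptides.
class_one = motif_classes("RALPPLP")
class_two = motif_classes("PPLPPR")
no_motif = motif_classes("AAAAAA")
```

A sequence can match both patterns, which is why the function returns a list rather than a single label.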
Effect of sequence-structure mapping.
| Domain | Alignment (best peptide) | Without alignment (best peptide) |
|  |  | 0.36 (−3, II) |
|  | 0.43 |  |
|  | 0.53 |  |
|  |  | 0.31 (−3, I) |
|  |  | 0.34 (−3, I) |
|  | 0.48 |  |
|  |  | 0.50 (−3, I) |
|  |  | 0.24 (0, II) |
|  |  | 0.43 (−3, I) |
|  |  | 0.69 (0, II) |
|  |  | 0.38 (−3, I) |
|  | 0.57 | 0.57 (0, II) |
The Pearson's correlation coefficients between the predicted binding energies and SPOT data are shown.
When sequences are separated into Class I and Class II, the class is marked in parentheses. ABP1, Amphiphysin, Endophilin, and MYO5 do not have the canonical SH3 motifs.
*Pearson's correlation coefficient for the best peptide template when alignments are adjusted.
**Pearson's correlation coefficient for the best peptide template when the alignment is fixed to that of the canonical PxxP motif. The offset and class of the peptide template are indicated in parentheses.
Cases when the class of the best peptide template is inconsistent with the class of sequence motifs. The best peptide belonging to the sequence motif is indicated in parentheses in the second column. The correlation of fixed alignment for that peptide is shown in the third column.
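One plausible way to adjust the sequence-structure mapping is to slide the template's energy matrix along the peptide and keep the placement with the lowest summed energy. This is only a sketch under that assumption; the excerpt does not say how the adjusted alignment was chosen, and `best_alignment`, `score_window`, and the matrix values are hypothetical.

```python
from typing import Dict, List, Tuple

EnergyMatrix = List[Dict[str, float]]


def score_window(sequence: str, matrix: EnergyMatrix, start: int) -> float:
    """Sum matrix contributions for the window beginning at index `start`."""
    return sum(matrix[i][sequence[start + i]] for i in range(len(matrix)))


def best_alignment(sequence: str, matrix: EnergyMatrix) -> Tuple[int, float]:
    """Try every placement of the matrix on the sequence; return (start, energy)."""
    width = len(matrix)
    if len(sequence) < width:
        raise ValueError("sequence is shorter than the energy matrix")
    scored = [
        (score_window(sequence, matrix, start), start)
        for start in range(len(sequence) - width + 1)
    ]
    energy, start = min(scored)  # lowest energy wins; ties go to the smaller start
    return start, energy


# Hypothetical two-position matrix over a two-letter alphabet.
toy_matrix: EnergyMatrix = [
    {"P": -1.0, "A": 0.0},
    {"P": -1.0, "A": 0.5},
]
start, energy = best_alignment("APPA", toy_matrix)
```

A fixed alignment corresponds to calling `score_window` with a single predetermined `start` instead of searching over all placements.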
Comparison to Other Binding Energy Calculation Methods.
| SH3 domain | Fernandez-Ballester | Our Method | Hou | Our Method |
| ABP1 | 0.83 |  | – | – |
| BOI1 |  | 0.55 |  | 0.72 |
| LSB3 | 0.96 |  | 0.91 |  |
| MYO5 |  | 0.74 | 0.59 |  |
| RVS167 | 0.70 |  | 0.78 |  |
| SHO1 | 0.83 |  | – | – |
| YSC84 | – | – | 0.89 |  |
Area under ROC curves (AROC) are shown.
*The methods of Fernandez-Ballester [19] and Hou used different data sets [11]. Accordingly, our method was compared with the two methods separately.
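The area under the ROC curve can be computed without drawing the curve, as the fraction of binder/non-binder pairs that the score ranks correctly. The sketch below assumes that a lower predicted energy indicates a binder; the function name `auroc` and the numbers are illustrative, and the evaluation code used in the study is not shown in this excerpt.

```python
from typing import Sequence


def auroc(binder_energies: Sequence[float], nonbinder_energies: Sequence[float]) -> float:
    """Fraction of (binder, non-binder) pairs in which the binder scores lower.

    Ties count as half a correct pair (Mann-Whitney formulation of the AUC).
    """
    if not binder_energies or not nonbinder_energies:
        raise ValueError("both classes need at least one example")
    correct = 0.0
    for b in binder_energies:
        for n in nonbinder_energies:
            if b < n:
                correct += 1.0
            elif b == n:
                correct += 0.5
    return correct / (len(binder_energies) * len(nonbinder_energies))


# Made-up predicted energies.
area = auroc([-3.0, -2.0], [-1.0, -2.5])  # 3 of 4 pairs ranked correctly
```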
Figure 3. Distribution of Predicted Binding Energy for DREAM4 Target.
Binding energies were calculated for randomly generated sequences (upper panel, dashed lines), for random sequences with canonical SH3 binding peptide motifs (lower panel, dashed lines), and for sequences derived from the DREAM4 Gold Standard (solid lines).
Figure 4. DREAM4 Gold Standard and Predicted PSFM.
Position-specific frequency matrices (PSFMs) are represented with WebLogo [22]. Gold standards were disclosed for three of the five challenge targets; they are displayed in the upper panel. The PSFMs of the 1000 sequences with the lowest energies are displayed in the lower panel. Target 1: Homology to FISH, Target 2: Intersection-1-5, Target 3: PACSIN1. For target 2, the first position of the DREAM4 gold standard is matched with the fourth position in our prediction.
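Building a PSFM from the lowest-energy sequences can be sketched in a few lines. The helper names `lowest_energy_sequences` and `psfm` and the toy data are assumptions for illustration; the caption specifies only that the 1000 lowest-energy sequences were used.

```python
from collections import Counter
from typing import Dict, List, Sequence, Tuple


def lowest_energy_sequences(scored: Sequence[Tuple[str, float]], k: int) -> List[str]:
    """Return the k sequences with the lowest predicted energy."""
    ranked = sorted(scored, key=lambda pair: pair[1])
    return [seq for seq, _ in ranked[:k]]


def psfm(sequences: Sequence[str]) -> List[Dict[str, float]]:
    """Per-position amino acid frequencies for equally long sequences."""
    if not sequences:
        raise ValueError("need at least one sequence")
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("all sequences must have the same length")
    total = len(sequences)
    matrix = []
    for pos in range(length):
        counts = Counter(seq[pos] for seq in sequences)
        matrix.append({aa: count / total for aa, count in counts.items()})
    return matrix


# Made-up (sequence, energy) pairs; k would be 1000 in the setting of the caption.
scored = [("PAP", -3.0), ("PRP", -2.0), ("AAA", 1.0), ("RAP", -1.0)]
top = lowest_energy_sequences(scored, 3)
frequencies = psfm(top)
```

The resulting list of per-position frequency dictionaries is the kind of input a sequence-logo tool visualizes.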