Sergey A Samsonov, Joan Teyra, Gerd Anders, M Teresa Pisabarro.
Abstract
BACKGROUND: The correlated mutations concept is based on the assumption that interacting protein residues coevolve, so that a mutation in one of the interacting counterparts is compensated by a mutation in the other. Approaches based on this concept have been widely used for protein contact prediction since the 1990s. Previously, we have shown that water-mediated interactions play an important role in protein interfaces. We have observed that current "dry" correlated mutations approaches might not properly predict certain interactions in protein interfaces because these interactions are water-mediated.
Year: 2009 PMID: 19368710 PMCID: PMC2676287 DOI: 10.1186/1472-6807-9-22
Source DB: PubMed Journal: BMC Struct Biol ISSN: 1472-6807
Figure 1. Water contacts of residues in PDB. Fractions of residues found to be in contact with water in protein interfaces (white) and in whole proteins (grey) in the PDB.
Correlation between vectors per residue type in the DRY and WET matrices.
| Residue | p-value | Adjusted R² |
| Ala | 0.90 | -0.05 |
| Arg | 4·10⁻³ | 0.35 |
| Asn | 4·10⁻⁵ | 0.65 |
| Asp | 6·10⁻⁴ | 0.46 |
| Cys | 0.14 | 0.07 |
| Gln | 5·10⁻⁴ | 0.47 |
| Glu | 4·10⁻⁴ | 0.49 |
| Gly | 0.53 | -0.03 |
| His | 0.02 | 0.22 |
| Ile | 8·10⁻⁴ | 0.44 |
| Leu | 6·10⁻³ | 0.31 |
| Lys | 8·10⁻³ | 0.29 |
| Met | 6·10⁻³ | 0.31 |
| Phe | 0.02 | 0.24 |
| Pro | 0.62 | -0.04 |
| Ser | 2·10⁻³ | 0.39 |
| Thr | 0.07 | 0.12 |
| Trp | 0.18 | 0.05 |
| Tyr | 0.71 | -0.05 |
| Val | 4·10⁻³ | 0.33 |
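The adjusted R² values in the table above can be negative, which is expected whenever the plain R² is close to zero. A minimal sketch of the calculation follows, assuming a simple least-squares fit of one vector on the other with a single predictor; the function name `adjusted_r2` and the vector length of 20 used in the note below are illustrative assumptions, not details taken from the paper.

```python
from math import fsum


def adjusted_r2(x, y, n_predictors=1):
    """Adjusted R^2 for a least-squares fit of y on x (hypothetical helper).

    For a single predictor, R^2 equals the squared Pearson correlation.
    """
    n = len(x)
    if n != len(y):
        raise ValueError("x and y must have the same length")
    if n <= n_predictors + 1:
        raise ValueError("not enough points for the adjustment")

    mean_x = fsum(x) / n
    mean_y = fsum(y) / n
    sxx = fsum((a - mean_x) ** 2 for a in x)
    syy = fsum((b - mean_y) ** 2 for b in y)
    sxy = fsum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("constant vector: correlation is undefined")

    r2 = sxy ** 2 / (sxx * syy)
    # Penalise R^2 for the number of fitted parameters.
    return 1.0 - (1.0 - r2) * (n - 1) / (n - n_predictors - 1)
```

Under the assumption of 20 points per vector, a fit with R² = 0 gives 1 − 19/18 ≈ −0.056, so small negative entries indicate no detectable linear relationship rather than an error.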
Dataset used for intradomain contact predictions.
| PFAM ID | PDB IDᵃ | R (Å) | Nᵇ | % idᶜ | Lᵈ | Ran accᵉ | Accᶠ | Rᵍ | Opt αʰ | Xd dryⁱ | OptXd αʲ | Xd wet (opt α)ᵏ |
| PF00014 | | 1.70 | 151 | 33 | 52 | 0.096 | 0.346 | 3.61 | 1 | 9.37 | 1 | 11.16 |
| PF03705 | | 2.00 | 85 | 20 | 57 | 0.081 | 0.241 | 2.65 | 0.5, 4, 10 | 6.14 | 2 | 7.63 |
| PF00062 | | 2.00 | 22 | 46 | 127 | 0.043 | 0.078 | 1.91 | 0, 0.5 | 2.68 | 0 | 2.68 |
| PF00018 | | 2.60 | 61 | 28 | 56 | 0.088 | 0.357 | 4.06 | 0.5 | 12.99 | 0 | 12.99 |
| PF03900 | | 1.76 | 21 | 25 | 74 | 0.062 | 0.237 | 3.82 | 2 | 9.18 | 0.2 | 9.99 |
| PF00034 | | 1.10 | 35 | 17 | 89 | 0.061 | 0.250 | 4.10 | 1 | 9.13 | 0.1 | 10.34 |
| PF01568 | | 1.82 | 88 | 18 | 113 | 0.044 | 0.050 | 1.14 | 0.2, 0.5 | 10.62 | 2 | 12.53 |
| PF00127 | | 1.60 | 31 | 29 | 89 | 0.055 | 0.102 | 1.85 | 2 | 0.50 | 1 | 4.82 |
| PF01814 | | 1.30 | 295 | 12 | 49 | 0.098 | 0.400 | 4.08 | 0.5, 2 | 8.39 | 2 | 13.14 |
| PF00017 | | 1.80 | 59 | 28 | 93 | 0.058 | 0.212 | 3.66 | 0 – 0.5 | 5.98 | 1 | 8.37 |
| PF01320 | | 2.00 | 45 | 47 | 86 | 0.056 | 0.233 | 4.15 | 0.2 | 16.04 | 0 | 16.04 |
| PF08666 | | 1.65 | 171 | 14 | 66 | 0.074 | 0.273 | 3.69 | 0 | 10.25 | 0 | 10.25 |
| PF01337 | | 2.76 | 30 | 25 | 89 | 0.065 | 0.178 | 2.87 | 0, 0.1 | 4.55 | 0.1 | 4.72 |
| PF00595 | | 2.30 | 56 | 19 | 85 | 0.062 | 0.233 | 3.75 | 0.5 – 2 | 10.16 | 1 | 11.67 |
| PF00531 | | 2.10 | 92 | 14 | 82 | 0.066 | 0.250 | 3.79 | 0 – 0.5 | 7.67 | 0.2 | 7.95 |
| PF00397 | | 2.00 | 73 | 32 | 30 | 0.143 | 0.467 | 3.26 | 2 – 20 | 6.59 | 2 | 8.81 |
| PF01335 | | 1.40 | 40 | 21 | 76 | 0.072 | 0.237 | 3.88 | 0.1, 0.2 | 5.66 | 0.2 | 5.96 |
| PF00619 | | 1.30 | 61 | 16 | 85 | 0.066 | 0.209 | 3.43 | 0.2 – 2 | 5.09 | 2 | 9.42 |
| PF02213 | | 2.35 | 112 | 28 | 58 | 0.083 | 0.241 | 2.91 | 0.5 – 2 | 7.37 | 0.5 | 7.77 |
| PF05743 | | 1.85 | 28 | 27 | 118 | 0.035 | 0.068 | 1.98 | 0.1 | 7.22 | 0 | 7.22 |
| PF00536 | | 1.95 | 69 | 28 | 74 | 0.076 | 0.395 | 5.19 | 0.2 – 2 | 15.53 | 2 | 16.36 |
| PF03114 | | 2.30 | 29 | 19 | 195 | 0.021 | 0.074 | 3.53 | 0.2 | 2.41 | 20 | 3.99 |
| PF00169 | | 1.70 | 139 | 10 | 112 | 0.050 | 0.071 | 1.43 | 0, 0.2, 0.5 | 5.46 | 2 | 7.53 |
| PF08416 | | 1.50 | 49 | 28 | 132 | 0.040 | 0.106 | 2.65 | 2, 4 | 0.53 | 0.1 | 1.24 |
| PF01981 | | 1.20 | 69 | 43 | 116 | 0.049 | 0.172 | 3.52 | 0.1 – 0.5 | 7.63 | 20 | 12.38 |
| PF03992 | | 1.90 | 116 | 15 | 65 | 0.068 | 0.125 | 1.84 | 0.5 | 3.34 | 0 | 3.34 |
| PF00907 | | 1.70 | 23 | 49 | 183 | 0.032 | 0.033 | 1.03 | 0 – 20 | 3.30 | 2 | 6.03 |
| PF02237 | | 1.60 | 47 | 21 | 48 | 0.094 | 0.167 | 1.77 | 0.5 – 2 | -2.83 | 0.5 | 0.22 |
| PF08031 | | 1.98 | 64 | 34 | 34 | 0.135 | 0.235 | 1.74 | 0.1, 0.2 | -0.05 | 2 | 3.37 |
| PF02861 | | 1.80 | 165 | 21 | 51 | 0.098 | 0.440 | 4.49 | 1, 4, 10, 20 | 9.55 | 20 | 13.21 |
| PF02834 | | 1.94 | 106 | 14 | 85 | 0.048 | 0.119 | 2.48 | 4 – 20 | -0.51 | 4, 10 | 3.21 |
| PF01423 | | 1.55 | 128 | 23 | 60 | 0.079 | 0.167 | 2.11 | 0.2, 0.5 | 5.78 | 0.1, 0.2 | 7.14 |
| PF01472 | | 1.80 | 106 | 24 | 78 | 0.058 | 0.128 | 2.21 | 1 – 20 | 3.57 | 2, 4 | 11.45 |
| PF01909 | | 1.80 | 119 | 14 | 91 | 0.059 | 0.133 | 2.26 | 0.1 – 1 | 4.97 | 0.2 | 6.01 |
| PF09261 | | 1.95 | 79 | 31 | 78 | 0.069 | 0.205 | 2.97 | 0.1, 0.2 | 4.87 | 0.1 | 6.64 |
| PF01315 | | 1.28 | 28 | 19 | 117 | 0.041 | 0.207 | 5.05 | 1, 2 | 7.70 | 2 | 10.28 |
| PF04545 | | 1.80 | 128 | 31 | 54 | 0.096 | 0.370 | 3.86 | 0, 0.1, 1, 10, 20 | 12.37 | 10, 20 | 12.76 |
| PF00984 | | 1.55 | 24 | 17 | 98 | 0.048 | 0.184 | 3.83 | 0.5 – 20 | 8.27 | 0.2 | 9.78 |
| PF01658 | | 1.90 | 20 | 31 | 105 | 0.049 | 0.096 | 1.96 | 0.1 – 20 | 1.93 | 0.5 | 6.28 |
| PF00745 | | 1.95 | 34 | 23 | 99 | 0.048 | 0.100 | 2.08 | 0.1 – 0.5 | 3.17 | 0.1 | 4.17 |
| PF03099 | | 1.60 | 65 | 14 | 117 | 0.043 | 0.121 | 2.81 | 0 | 13.7 | 0.2 | 14.20 |
| PF01985 | | 1.37 | 50 | 23 | 84 | 0.064 | 0.167 | 2.60 | 0 – 0.2 | 6.96 | 0 | 6.96 |
| PF08436 | | 1.90 | 77 | 57 | 94 | 0.049 | 0.213 | 4.34 | 0 – 0.1 | 6.91 | 10 | 10.15 |
| PF02881 | | 1.90 | 52 | 19 | 85 | 0.063 | 0.119 | 1.89 | 0 – 20 | 3.94 | 2 | 5.78 |
| PF01966 | | 1.76 | 158 | 12 | 91 | 0.057 | 0.333 | 5.85 | 0 – 0.2 | -0.79 | 2 | 2.20 |
| PF00191 | | 1.42 | 178 | 28 | 66 | 0.076 | 0.273 | 3.59 | 0 – 0.2 | -0.35 | 10 | 1.05 |
| PF00317 | | 1.90 | 79 | 23 | 90 | 0.056 | 0.178 | 3.17 | 0.5 – 2 | 10.01 | 0.5 | 13.16 |
| PF00046 | | 1.90 | 184 | 37 | 60 | 0.082 | 0.333 | 4.07 | 1, 2 | 6.07 | 2 | 8.60 |
| PF00077 | | 1.90 | 48 | 27 | 108 | 0.049 | 0.093 | 1.89 | 2 | -1.37 | 1 | 3.63 |
| PF00042 | | 1.40 | 73 | 18 | 101 | 0.046 | 0.163 | 3.56 | 1, 2 | 6.89 | 2 | 7.19 |
ᵃ PDB ID; ᵇ Number of sequences; ᶜ Average pairwise sequence similarity (%); ᵈ Reference sequence length; ᵉ Random accuracy; ᶠ Accuracy for optimal α; ᵍ Improvement ratio over random prediction for optimal α; ʰ α corresponding to the highest accuracy; ⁱ Values for α = 0; ʲ α corresponding to the highest Xd; ᵏ Xd highest value.
Prediction parameters dependence on the number of analyzed contacts.
| Predicted contacts analyzed | Accuracy | Improvement ratio over random prediction |
| L | 0.15 ± 0.09 | 2.24 ± 0.95 |
| L/2 | 0.18 ± 0.10 | 2.67 ± 1.08 |
| L/3 | 0.19 ± 0.12 | 2.81 ± 1.52 |
| L/5 | 0.21 ± 0.16 | 3.16 ± 1.79 |
| L/10 | 0.23 ± 0.20 | 3.55 ± 2.81 |
L is the length of the reference sequence. The value α = 0.5 has been used.
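The two quantities reported above can be illustrated with a short sketch. It assumes the usual definitions: accuracy is the fraction of the top-ranked predicted pairs that are true contacts, random accuracy is the fraction of all candidate pairs that are true contacts, and the improvement ratio is their quotient. The function name, the dictionary-based input, and the `min_separation` filter are hypothetical choices made for illustration, not the authors' implementation.

```python
def contact_prediction_stats(scores, contacts, seq_len, divisor=1, min_separation=1):
    """Accuracy, random accuracy and improvement ratio for the top L/divisor pairs.

    scores   : dict mapping (i, j) residue index pairs (i < j) to a correlation score
    contacts : set of (i, j) pairs that are true contacts in the reference structure
    seq_len  : length L of the reference sequence
    """
    # Keep only pairs that are far enough apart in sequence.
    candidates = [(pair, s) for pair, s in scores.items()
                  if pair[1] - pair[0] >= min_separation]
    if not candidates:
        raise ValueError("no candidate pairs at this sequence separation")

    # Analyse the L/divisor highest-scoring pairs (at least one).
    n_top = max(1, seq_len // divisor)
    ranked = sorted(candidates, key=lambda item: item[1], reverse=True)[:n_top]

    hits = sum(1 for pair, _ in ranked if pair in contacts)
    accuracy = hits / len(ranked)

    # Expected accuracy when pairs are picked at random from the candidates.
    random_accuracy = sum(1 for pair, _ in candidates if pair in contacts) / len(candidates)
    ratio = accuracy / random_accuracy if random_accuracy > 0 else float("inf")
    return accuracy, random_accuracy, ratio
```

Calling the function with `divisor=1, 2, 3, 5, 10` corresponds to the rows L, L/2, L/3, L/5 and L/10; analysing fewer, higher-ranked pairs is what raises the mean accuracy while widening its spread.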
Figure 2. Dependence on α. A) Wet prediction ratio. B) Relative harmonic weighted difference statistic (Xd).
Accuracy, improvement ratio over random prediction and wet prediction ratio for different sequence separations.
| Predicted contacts analyzed | Accuracy (separation 6) | R (separation 6) | Wet ratio (separation 6) | Accuracy (separation 12) | R (separation 12) | Wet ratio (separation 12) | Accuracy (separation 24) | R (separation 24) | Wet ratio (separation 24) |
| L | 0.061 | 3.07 | 1.01 | 0.051 | 3.02 | 1.02 | 0.042 | 2.97 | 1.06 |
| L/2 | 0.079 | 4.18 | 1.11 | 0.070 | 4.34 | 1.14 | 0.050 | 3.76 | 1.10 |
| L/3 | 0.087 | 4.56 | 1.14 | 0.071 | 4.49 | 1.01 | 0.060 | 4.61 | 1.14 |
| L/5 | 0.099 | 5.49 | 1.05 | 0.085 | 5.71 | 1.08 | 0.068 | 5.18 | 1.04 |
| L/10 | 0.122 | 6.68 | 1.14 | 0.103 | 6.89 | 1.13 | 0.078 | 6.31 | 1.00 |
L is the length of the reference sequence. R is improvement over random prediction. The value α = 0.5 has been used.
Dataset used for interdomain contact predictions.
| Interacting partners | PFAM | PDB IDᵃ | Nᵇ | % idenᶜ | L1ᵈ | L2ᵉ | Xd dryᶠ | OptXd αᵍ | Xd wet (opt α)ʰ |
| Tyrosine kinase SH3/SH2 domains | PF00018/PF00017 | | 19 | 35 | 57 | 83 | 1.86 | 0.2 | 3.25 |
| Alcohol dehydrogenase N-/C-domains | PF08240/PF00107 | | 89 | 23 | 128 | 143 | 3.52 | 0.2 | 3.64 |
| Mg superoxide dismutase | PF00081/PF02777 | | 23 | 44 | 82 | 107 | 4.76 | 0.2 | 5.04 |
| Immunoglobulin heavy/light chains | PF00047/PF00047 | | 116 | 36 | 107 | 114 | 13.56 | 0 | 13.56 |
| Ornithine transferase N-/C-domains | PF02729/PF00185 | | 20 | 30 | 142 | 178 | 4.47 | 0.1 | 4.94 |
| NFKB factor RHD/TIG domains | PF00554/PF01833 | | 21 | 40 | 199 | 100 | 4.56 | 0.5 | 4.62 |
| STAT alpha/binding domains | PF01017/PF02864 | | 32 | 38 | 180 | 251 | 4.30 | 0.2 | 4.42 |
| Mur-ligase catalytic/C-terminal domains | PF01225/PF08245 | | 26 | 25 | 82 | 208 | 1.84 | 0.1 | 2.12 |
| Dynamin central/N-domains | PF00350/PF01031 | | 32 | 40 | 174 | 89 | 0.04 | 0.2 | 0.14 |
| Trk C-/N-domains | PF02254/PF02080 | | 42 | 20 | 114 | 72 | 0.53 | 1 | 0.78 |
ᵃ PDB ID of the reference structure; ᵇ Number of sequences in the multiple sequence alignment; ᶜ Average pairwise sequence similarity (%); ᵈ, ᵉ Lengths of the reference sequences; ᶠ Values for α = 0; ᵍ α value corresponding to the highest Xd; ʰ Xd highest value.
Figure 3. Predictions for interdomain dataset. Relative harmonic weighted difference statistic (Xd) dependence on α.
Figure 4. Proportion of residue pairs in distance bins for the SH2-SH3 interaction. All residue pairs are shown in black, correlated pairs with α = 0 in white, and correlated pairs with α = 0.2 in grey. Reference structure used is PDB ID 2SRC.
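A comparison of binned distance distributions like the one in Figure 4 can be summarised in a single number by a harmonic weighted difference. The sketch below uses a form often seen in the correlated-mutations literature, Xd = Σᵢ (P_ic − P_ia) / (dᵢ · n), where P_ic and P_ia are the proportions of correlated and of all pairs in bin i, dᵢ is the upper edge of the bin, and n is the number of bins. The 4 Å bin width, the 60 Å cut-off, the normalisation of dᵢ by the cut-off, and the use of fractions rather than percentages are all assumptions; how the "relative" variant named in the captions is normalised is not stated here, so the sketch does not attempt it.

```python
def bin_proportions(distances, bin_width=4.0, max_distance=60.0):
    """Fraction of pairs falling in each distance bin (pairs beyond the cut-off are ignored)."""
    n_bins = int(round(max_distance / bin_width))
    counts = [0] * n_bins
    for d in distances:
        if 0 <= d < max_distance:
            counts[int(d // bin_width)] += 1
    total = sum(counts)
    if total == 0:
        raise ValueError("no distances inside the binned range")
    return [c / total for c in counts]


def harmonic_weighted_difference(correlated, all_pairs, bin_width=4.0, max_distance=60.0):
    """Sum over bins of (P_ic - P_ia) / (d_i * n), with d_i normalised by the cut-off (assumed)."""
    p_corr = bin_proportions(correlated, bin_width, max_distance)
    p_all = bin_proportions(all_pairs, bin_width, max_distance)
    n_bins = len(p_all)
    xd = 0.0
    for i, (pc, pa) in enumerate(zip(p_corr, p_all), start=1):
        upper_edge = i * bin_width / max_distance  # normalised upper limit of bin i
        xd += (pc - pa) / (upper_edge * n_bins)
    return xd
```

Because each bin is weighted by the inverse of its distance, an excess of correlated pairs at short distances gives a positive value, identical distributions give zero, and an excess at long distances gives a negative value. The absolute scale depends on the assumed normalisation and on whether proportions are expressed as fractions or percentages.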