Vladimir Potapov, Mati Cohen, Yuval Inbar, Gideon Schreiber.
Abstract
BACKGROUND: Accurate evaluation and modelling of residue-residue interactions within and between proteins is a key
Year: 2010 PMID: 20624289 PMCID: PMC2912888 DOI: 10.1186/1471-2105-11-374
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. Four-distance definition of inter-residue interactions for the Asn-Gln pair. (a) To define the interaction, one pair of side chain atoms is chosen in the first residue (Asn) and another pair is chosen in the second residue (Gln). The four distances (OE1-OD1, OE1-ND2, NE2-OD1, NE2-ND2) define the mutual positions of the chosen side chain atoms. (b) A 2-dimensional projection (OD1-NE2 versus ND2-OE1 distance) of the 4-distance distribution for the Asn-Gln pair. Peaks in the histogram indicate preferred distance combinations. The peak indicated by the arrow corresponds to the arrangement of side chain atoms in panel (a). The histogram was built with a bin size of 0.25 Å, without smoothing, and is shown as a contour plot generated with MATLAB. All 4-distance distributions can be viewed on the supporting web site http://bioinfo.weizmann.ac.il/hunter/.
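The four-distance description in the caption can be illustrated with a minimal sketch: compute the four inter-atomic distances for a residue pair and count occurrences per 4-dimensional bin. The function names and the coordinates are hypothetical and chosen for illustration only; this is not the authors' code, and the unsmoothed counting shown here is only one of the histogram variants mentioned in the source.

```python
import math
from collections import Counter

def four_distances(pair_a, pair_b):
    """Return the four distances between two atoms of residue A and two atoms of residue B."""
    return tuple(math.dist(a, b) for a in pair_a for b in pair_b)

def bin_key(distances, bin_size=0.25):
    """Map each distance (Å) to the index of its histogram bin."""
    return tuple(int(d // bin_size) for d in distances)

def build_histogram(observations, bin_size=0.25):
    """Count how often each 4-distance bin occurs across observed residue pairs."""
    return Counter(bin_key(four_distances(a, b), bin_size) for a, b in observations)

# Hypothetical coordinates in Å: (OD1, ND2) of Asn and (OE1, NE2) of Gln.
asn = ((0.0, 0.0, 0.0), (2.2, 0.0, 0.0))
gln = ((0.0, 3.0, 0.0), (2.2, 3.0, 0.0))

# Two identical observations land in the same 4-dimensional bin.
hist = build_histogram([(asn, gln), (asn, gln)])
```

Peaks such as those in panel (b) would correspond to bins with high counts once many residue pairs from real structures are accumulated.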
Performance of the ScSc knowledge-based term in modelling
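| Atom set | Histogram type | Bin size, Å | Side chain RMSD, Å (reduced RL) | Side chain RMSD, Å (full RL) |
|---|---|---|---|---|
| Atom Set 1 | Smoothed | 0.25 | 2.37 | 2.95 |
| Atom Set 1 | Smoothed | 0.50 | 2.28 | 2.89 |
| Atom Set 1 | Smoothed | 0.75 | 2.41 | 3.00 |
| Atom Set 1 | Smoothed | 1.00 | 2.47 | 3.04 |
| Atom Set 1 | Non-smoothed | 0.25 | 2.69 | 3.32 |
| Atom Set 1 | Non-smoothed | 0.50 | 2.58 | 3.21 |
| Atom Set 1 | Non-smoothed | 0.75 | 2.55 | 3.08 |
| Atom Set 1 | Non-smoothed | 1.00 | 2.46 | 3.04 |
| Atom Set 2 | Smoothed | 0.25 | 2.42 | 3.11 |
| Atom Set 2 | Smoothed | 0.50 | 2.38 | 3.03 |
| Atom Set 2 | Smoothed | 0.75 | 2.43 | 3.16 |
| Atom Set 2 | Smoothed | 1.00 | 2.53 | 3.17 |
| Atom Set 2 | Non-smoothed | 0.25 | 2.67 | 3.47 |
| Atom Set 2 | Non-smoothed | 0.50 | 2.62 | 3.27 |
| Atom Set 2 | Non-smoothed | 0.75 | 2.53 | 3.14 |
| Atom Set 2 | Non-smoothed | 1.00 | 2.43 | 3.06 |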
To determine the best combination of parameters for the ScSc term, a set of 30 proteins was remodelled using the ScSc term only, with different atom sets and bin sizes, and with or without histogram smoothing. The side chain RMSD was calculated over side chain atoms, excluding Cβ atoms. Results for both the reduced and full rotamer libraries (RL) are presented.
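As an illustration of the RMSD measure described in this legend, the following sketch computes a side chain RMSD that skips backbone atoms and Cβ. It assumes that the native and modelled structures share the same backbone frame (so no superposition is performed) and ignores the equivalence of symmetric atoms; the data layout and names are hypothetical, not taken from Hunter.

```python
import math

# Standard PDB backbone atom names; CB is excluded as well, as stated in the legend.
BACKBONE = {"N", "CA", "C", "O", "OXT"}

def side_chain_rmsd(native, model, excluded=BACKBONE | {"CB"}):
    """RMSD (Å) over side chain atoms present in both structures.

    `native` and `model` map (residue_id, atom_name) -> (x, y, z) in Å.
    """
    keys = [k for k in native if k in model and k[1] not in excluded]
    if not keys:
        raise ValueError("no comparable side chain atoms")
    squared = sum(math.dist(native[k], model[k]) ** 2 for k in keys)
    return math.sqrt(squared / len(keys))

# Hypothetical single-residue example: CG is displaced by 1 Å, OD1 by 2 Å.
native = {(1, "CA"): (0.0, 0.0, 0.0), (1, "CB"): (1.0, 0.0, 0.0),
          (1, "CG"): (2.0, 0.0, 0.0), (1, "OD1"): (3.0, 0.0, 0.0)}
model = {(1, "CA"): (0.0, 0.0, 0.0), (1, "CB"): (1.0, 5.0, 0.0),
         (1, "CG"): (2.0, 1.0, 0.0), (1, "OD1"): (3.0, 0.0, 2.0)}

rmsd = side_chain_rmsd(native, model)  # the CB displacement does not contribute
```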
Figure 2. Six 2-dimensional projections of the 4-distance distribution for the Asn-Gln pair. Interaction between Asn and Gln is defined by four distances between the OD1 and ND2 atoms of Asn and the OE1 and NE2 atoms of Gln. The 4-distance data for Asn-Gln were collected from high-resolution protein structures (see Methods for details) and are displayed in six 2-dimensional projections. Each projection represents the joint distribution of two particular distances. The peaks in the distributions indicate preferred mutual arrangements of the atoms under consideration. The histograms were built with a bin size of 0.25 Å without smoothing and are shown as contour plots generated with MATLAB.
Contribution of different terms to side chain prediction accuracy
| Terms | Side chain RMSD, Å (all) | Side chain RMSD, Å (buried) | Side chain RMSD, Å (exposed) |
|---|---|---|---|
| EScSc, EScMc, Erot, Elj | 1.47 | 0.73 | 1.72 |
| Erot, Elj | 1.52 | 0.77 | 1.78 |
| EScSc, EScMc | 2.28 | 1.39 | 2.61 |
| EScSc, Erot | 1.87 | 1.36 | 2.05 |
| EScSc, Elj | 2.04 | 0.97 | 2.41 |
| Elj | 2.13 | 1.04 | 2.51 |
| Erot | 2.25 | 2.05 | 2.29 |
The contribution of the different terms to Hunter was evaluated by rebuilding side chain conformations on a test set of 94 proteins. The full rotamer library was used with Hunter (see Methods for details).
Discrimination of native structures in Decoys 'R' Us multiple decoys sets
| Protein | Resolution, Å | Rank of native structure | Z-score of native structure | Number of decoys | RMSD range, Å |
|---|---|---|---|---|---|
| 1sn3 | 1.20 | 1 | 5.3 | 660 | 1.3 - 9.1 |
| 4rxn | 1.20 | 1 | 5.7 | 677 | 1.4 - 8.1 |
| 4pti | 1.50 | 1 | 4.4 | 686 | 1.4 - 9.3 |
| 1ctf | 1.70 | 1 | 4.7 | 630 | 1.3 - 9.1 |
| 1r69 | 2.00 | 1 | 6.6 | 676 | 0.9 - 8.3 |
| 3icb | 2.30 | 15 | 2.5 | 654 | 0.9 - 9.4 |
| 2cro | 2.35 | 1 | 4.1 | 673 | 0.8 - 8.3 |
| 4icb | 1.60 | 1 | 6.6 | 500 | 4.8 - 14.1 |
| 2cro | 2.35 | 1 | 5.4 | 501 | 4.3 - 12.6 |
| 1fc2 | 2.80 | 33 | 1.6 | 501 | 3.1 - 10.6 |
| 1hdd-C | 2.80 | 1 | 5.3 | 501 | 2.8 - 12.9 |
| smd3 | ? | 1 | 7.1 | 1200 | 8.5 - 17.0 |
| 1bg8-A | 2.20 | 1 | 6.1 | 1200 | 6.0 - 15.8 |
| 1bl0 | 2.30 | 1 | 6.0 | 972 | 3.6 - 18.2 |
| 1eh2 | NMR | 1 | 4.4 | 2413 | 4.0 - 15.3 |
| 1jwe | NMR | 1 | 8.3 | 1407 | 7.8 - 20.9 |
| 4icb | 1.60 | 1 | 5.1 | 1988 | 4.7 - 12.9 |
| 1ctf | 1.70 | 1 | 6.2 | 1999 | 5.4 - 12.8 |
| 1fca | 1.80 | 1 | 4.2 | 1986 | 5.1 - 11.4 |
| 1pgb | 1.92 | 1 | 6.2 | 1997 | 5.8 - 12.9 |
| 1beo | 2.20 | 1 | 6.6 | 1998 | 7.0 - 15.6 |
| 1dkt-A | 2.90 | 1 | 5.5 | 1995 | 6.7 - 14.0 |
| 1nkl | NMR | 1 | 5.2 | 1995 | 5.3 - 13.6 |
| 1trl-A | NMR | 1 | 5.8 | 1998 | 5.4 - 12.5 |
| 1igd | 1.10 | 1 | 8.1 | 501 | 3.1 - 12.6 |
| 2ovo | 1.50 | 1 | 5.6 | 348 | 4.4 - 13.4 |
| 4pti | 1.50 | 1 | 5.2 | 344 | 4.9 - 13.2 |
| 1ctf | 1.70 | 1 | 7.6 | 496 | 3.6 - 12.5 |
| 1b0n-B | 1.90 | 1 | 5.3 | 498 | 2.4 - 6.0 |
| 1shf-A | 1.90 | 1 | 7.7 | 437 | 4.4 - 12.3 |
| 2cro | 2.35 | 1 | 7.7 | 501 | 3.9 - 13.5 |
| 1fc2 | 2.80 | 49 | 1.5 | 501 | 4.0 - 8.4 |
| 1bba | NMR | 501 | -3.8 | 501 | 2.8 - 8.9 |
| 1dtk | NMR | 3 | 3.6 | 216 | 4.3 - 12.6 |
Each decoy structure and the native structure were scored and ranked using Hunter's ScSc term only. A rank of 1 corresponds to the structure with the lowest score. The Z-score of the native structure was calculated to estimate Hunter's ability to discriminate the native structure from decoy structures. Resolution refers to the X-ray structure.
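The ranking and Z-score described in this legend can be sketched as follows. The sign convention (a positive Z-score when the native scores lower than the average) is inferred from the table, where natives ranked first have positive values; computing the mean and standard deviation over all structures, native included, is an assumption, as are the function name and the example scores.

```python
import statistics

def rank_and_zscore(native_score, decoy_scores):
    """Return (rank, Z-score) of the native structure; lower scores are better.

    Rank 1 means no decoy scores lower than the native. The Z-score is
    (mean - native) / standard deviation, taken over all structures (assumption).
    """
    all_scores = [native_score] + list(decoy_scores)
    rank = 1 + sum(1 for s in decoy_scores if s < native_score)
    mean = statistics.fmean(all_scores)
    spread = statistics.pstdev(all_scores)
    if spread == 0:
        raise ValueError("all scores are identical; Z-score is undefined")
    return rank, (mean - native_score) / spread

# Hypothetical scores: the native is the lowest-scoring structure.
rank, z = rank_and_zscore(-10.0, [-4.0, -2.0, 0.0, 2.0, 4.0])

# Hypothetical counter-example: every decoy scores lower than the native.
worst_rank, worst_z = rank_and_zscore(5.0, [-4.0, -2.0, 0.0])
```

Under this convention a native ranked last, such as 1bba in the table, receives a negative Z-score.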
Discrimination of native structures in CASP 7/8 decoys sets
| Target | Resolution, Å | Rank of native structure | Number of decoys | Z-score of native structure | RMSD range, Å |
|---|---|---|---|---|---|
| T0288 | 1.1 | 2 | 373 | 2.3 | 1.6 - 11.2 |
| T0359 | 1.4 | 1 | 383 | 2.7 | 1.8 - 12.8 |
| T0340 | 1.5 | 1 | 416 | 2.0 | 0.7 - 8.9 |
| T0324_D1 | 1.5 | 1 | 342 | 3.4 | 1.6 - 13.3 |
| T0324_D2 | 1.5 | 1 | 400 | 3.2 | 1.2 - 10.7 |
| T0305 | 1.6 | 1 | 321 | 2.3 | 0.9 - 19.8 |
| T0332 | 1.6 | 1 | 343 | 2.9 | 1.5 - 10.9 |
| T0366 | 1.7 | 1 | 414 | 2.7 | 0.9 - 8.0 |
| T0291 | 1.8 | 1 | 252 | 1.9 | 2.0 - 19.2 |
| T0290 | 1.8 | 2 | 196 | 2.3 | 0.5 - 13.4 |
| T0313 | 1.9 | 1 | 366 | 2.9 | 2.4 - 19.0 |
| T0311 | 1.9 | 1 | 403 | 2.8 | 1.4 - 11.1 |
| T0295_D1 | 1.9 | 1 | 295 | 2.2 | 1.1 - 15.1 |
| T0295_D2 | 1.9 | 1 | 235 | 2.2 | 1.1 - 23.9 |
| T0303_D1 | 1.9 | 1 | 194 | 3.9 | 1.7 - 14.1 |
| T0308 | 2.0 | 1 | 385 | 2.1 | 1.3 - 13.9 |
| T0317 | 2.0 | 1 | 341 | 3.1 | 1.6 - 13.1 |
| T0346 | 2.0 | 6 | 348 | 1.8 | 0.4 - 13.6 |
| T0339_D2 | 2.1 | 1 | 399 | 2.8 | 1.7 - 15.7 |
| T0345 | 2.1 | 1 | 314 | 2.2 | 0.8 - 16.0 |
| T0367 | 2.2 | 1 | 337 | 3.3 | 2.0 - 17.2 |
| T0292_D1 | 2.2 | 4 | 298 | 2.2 | 1.3 - 10.8 |
| T0292_D2 | 2.2 | 1 | 174 | 4.3 | 3.0 - 13.8 |
| T0315 | 2.2 | 1 | 384 | 2.8 | 0.9 - 13.0 |
| T0334 | 2.5 | 1 | 328 | 2.0 | 1.5 - 45.5 |
| T0326 | 2.5 | 1 | 281 | 2.2 | 3.1 - 29.3 |
| T0328 | 2.8 | 1 | 317 | 2.9 | 1.7 - 40.1 |
| T0488-D1 | 1.3 | 1 | 341 | 3.3 | 1.1 - 5.6 |
| T0508-D1 | 1.5 | 1 | 283 | 3.5 | 1.3 - 11.8 |
| T0459-D1 | 1.7 | 1 | 296 | 3.5 | 1.4 - 8.4 |
| T0423-D1 | 1.7 | 1 | 349 | 3.2 | 1.2 - 15.1 |
| T0454-D1 | 1.8 | 22 | 464 | 1.5 | 0.8 - 6.2 |
| T0504-D3 | 1.8 | 1 | 241 | 3.8 | 1.2 - 21.6 |
| T0445-D1 | 1.8 | 1 | 174 | 6.1 | 1.4 - 11.2 |
| T0392-D1 | 1.8 | 1 | 327 | 2.2 | 1.2 - 8.1 |
| T0447-D1 | 1.9 | 1 | 116 | 4.3 | 1.3 - 32.5 |
| T0505-D1 | 1.9 | 1 | 246 | 3.0 | 1.3 - 10.3 |
| T0506-D1 | 1.9 | 1 | 297 | 3.9 | 1.4 - 11.3 |
| T0432-D1 | 1.9 | 1 | 169 | 3.9 | 1.4 - 17.1 |
| T0388-D1 | 2.0 | 1 | 133 | 3.5 | 1.1 - 10.7 |
| T0402-D1 | 2.0 | 1 | 310 | 4.2 | 1.5 - 22.6 |
| T0418-D1 | 2.0 | 1 | 335 | 3.4 | 1.1 - 11.2 |
| T0418-D2 | 2.0 | 1 | 344 | 4.2 | 1.5 - 10.2 |
| T0422-D2 | 2.0 | 1 | 299 | 3.1 | 1.6 - 12.3 |
| T0491-D1 | 2.0 | 12 | 319 | 2.3 | 1.6 - 10.9 |
| T0428-D1 | 2.0 | 3 | 335 | 2.4 | 0.7 - 9.4 |
| T0426-D1 | 2.1 | 3 | 168 | 2.7 | 0.5 - 6.0 |
| T0396-D1 | 2.1 | 1 | 403 | 2.8 | 1.4 - 13.9 |
| T0398-D1 | 2.1 | 1 | 273 | 2.5 | 0.6 - 33.5 |
| T0398-D2 | 2.1 | 1 | 301 | 2.9 | 0.6 - 10.6 |
| T0453-D1 | 2.1 | 3 | 322 | 3.2 | 1.3 - 8.0 |
| T0435-D1 | 2.2 | 1 | 296 | 3.4 | 1.8 - 12.6 |
| T0400-D1 | 2.2 | 1 | 270 | 3.9 | 1.3 - 11.3 |
| T0452-D1 | 2.2 | 1 | 279 | 4.5 | 1.7 - 13.2 |
| T0452-D2 | 2.2 | 1 | 319 | 3.2 | 1.0 - 15.0 |
| T0486-D1 | 2.3 | 1 | 302 | 4.2 | 1.3 - 9.9 |
| T0404-D1 | 2.4 | 1 | 317 | 3.7 | 0.9 - 7.4 |
| T0479-D1 | 2.4 | 4 | 317 | 2.7 | 1.2 - 7.7 |
| T0456-D2 | 2.5 | 2 | 331 | 2.6 | 2.7 - 8.8 |
| T0390-D1 | 2.7 | 1 | 287 | 2.5 | 1.4 - 16.7 |
| T0455-D1 | 2.7 | 1 | 297 | 3.3 | 1.4 - 8.3 |
| T0416-D1 | 2.7 | 1 | 265 | 2.6 | 1.4 - 42.1 |
| T0450-D1 | 2.7 | 3 | 263 | 2.6 | 1.5 - 53.5 |
| T0458-D1 | 2.7 | 7 | 342 | 1.9 | 0.6 - 9.3 |
| T0438-D1 | 2.8 | 5 | 254 | 2.8 | 1.3 - 11.2 |
| T0438-D2 | 2.8 | 11 | 311 | 1.7 | 1.0 - 12.5 |
| T0442-D1 | 2.8 | 11 | 250 | 1.4 | 1.1 - 25.1 |
| T0442-D2 | 2.8 | 66 | 315 | 0.8 | 0.7 - 21.6 |
| T0444-D1 | 2.8 | 7 | 314 | 2.2 | 0.9 - 7.8 |
| T0461-D1 | 2.8 | 1 | 307 | 2.9 | 1.6 - 10.0 |
| T0441-D2 | 2.9 | 1 | 271 | 3.1 | 1.8 - 7.2 |
| T0470-D1 | 2.9 | 36 | 334 | 1.3 | 1.7 - 11.0 |
| T0470-D2 | 2.9 | 16 | 324 | 1.6 | 1.0 - 13.4 |
See the legend of Table 3 for details.
Figure 3. Discriminating the native structures in four CASP 7/8 decoy sets (T0359, T0340, T0488, T0508) using Hunter. Each structure in a particular decoy set was scored with Hunter's ScSc term only. All obtained scores were converted to Z-scores. The Z-score of each decoy was plotted against its Cα-RMSD to the native structure. The native structure (0 Å RMSD) is shown as a large circle and has the lowest score (indicated as ranks in Tables S4 and S5). In most cases no funnel-like shape is observed.
Figure 4. Monte Carlo side chain optimization of . The model of the complex (PDB ID 1BRS) with the lowest score has a side chain RMSD of 1.44 Å (filled triangle; gray dots in the plot are the conformations sampled during a MCSA run). The native structure (filled circle) can never be reached in side chain modelling because a discrete rotamer library is used. Instead, the best rotameric structure for the complex would have an RMSD of 0.43 Å (filled square). None of the conformations in the region between 0.43 Å and 1.44 Å is ever sampled. To investigate this problem, a MCSA run was started from the best rotameric structure (see inset). As the inset shows, such a run nevertheless converges to the same region as a standard run.
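To make the search procedure in this caption concrete, here is a generic Monte Carlo simulated annealing (MCSA) sketch over discrete rotamer indices using the Metropolis acceptance rule. It is not Hunter's implementation: the cooling schedule, the single-residue move, the parameter values and the toy energy function are all assumptions made for illustration.

```python
import math
import random

def anneal_rotamers(n_rotamers, energy, steps=2000, t_start=2.0, t_end=0.05, seed=0):
    """Search rotamer assignments by simulated annealing.

    `n_rotamers[i]` is the number of rotamers available to residue i and
    `energy` scores a full assignment (a list of rotamer indices).
    """
    rng = random.Random(seed)
    state = [rng.randrange(n) for n in n_rotamers]
    current = energy(state)
    best_state, best = list(state), current
    for step in range(steps):
        # Geometric cooling from t_start down to t_end (assumed schedule).
        temp = t_start * (t_end / t_start) ** (step / max(steps - 1, 1))
        pos = rng.randrange(len(state))
        old = state[pos]
        state[pos] = rng.randrange(n_rotamers[pos])
        trial = energy(state)
        # Metropolis rule: always accept improvements, sometimes accept worse moves.
        if trial <= current or rng.random() < math.exp((current - trial) / temp):
            current = trial
            if current < best:
                best_state, best = list(state), current
        else:
            state[pos] = old  # reject the move and restore the previous rotamer
    return best_state, best

# Toy energy: squared distance of each rotamer index from a hypothetical optimum.
targets = [3, 0, 5, 2]

def toy_energy(state):
    return float(sum((s - t) ** 2 for s, t in zip(state, targets)))

counts = [6, 4, 8, 5]
best_state, best_energy = anneal_rotamers(counts, toy_energy)
```

Because every state is a combination of library rotamers, a search of this kind can at best reach the best rotameric structure and never the exact native conformation, which is the limitation the caption points out.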
Comparing performance of side chain modelling methods
| Method | RMSD, Å (all) | RMSD, Å (buried) | RMSD, Å (exposed) | Contact score | Chi-angle accuracy, % | Chi-angle accuracy, % |
|---|---|---|---|---|---|---|
| Hunter | 1.47 | 0.73 | 1.72 | 39 | 79 | 60 |
| OPUS-Rota | 1.56 | 0.91 | 1.80 | 35 | 77 | 56 |
| SCAP | 1.72 | 1.00 | 1.96 | 37 | 69 | 46 |
| SCCOMP | 1.72 | 1.03 | 1.96 | 34 | 69 | 49 |
| SCWRL4.0 | 1.65 | 0.87 | 1.93 | 35 | 76 | 55 |
Side chains were rebuilt using five different methods in a set of 94 high-resolution protein structures. Average statistics on side chain prediction accuracy were calculated in terms of side chain RMSD, contact score, and chi-angle prediction accuracy. Side chain RMSD was calculated for side chain atoms except Cβ. A residue was considered exposed or buried if its relative solvent-exposed area was greater or less than 15%, respectively. The contact score reflects the number of correctly predicted atom-atom contacts and is calculated as described in Methods.
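The buried/exposed split used in this legend can be sketched as a simple threshold on relative solvent-exposed area. How a residue at exactly 15% is treated is not stated in the source, so assigning it to the buried class is an assumption, as are the per-residue averaging, the function names and the example values.

```python
def classify_exposure(relative_exposure_percent, threshold=15.0):
    """Label a residue by its relative solvent-exposed area (in percent)."""
    return "exposed" if relative_exposure_percent > threshold else "buried"

def mean_rmsd_by_class(residues, threshold=15.0):
    """Average per-residue RMSD for all, buried and exposed residues.

    `residues` is an iterable of (relative_exposure_percent, rmsd) pairs.
    """
    groups = {"all": [], "buried": [], "exposed": []}
    for exposure, rmsd in residues:
        groups["all"].append(rmsd)
        groups[classify_exposure(exposure, threshold)].append(rmsd)
    # Classes with no residues are reported as None rather than zero.
    return {name: (sum(v) / len(v) if v else None) for name, v in groups.items()}

# Hypothetical residues: (relative exposure in %, side chain RMSD in Å).
example = [(5.0, 0.5), (10.0, 1.0), (40.0, 2.0), (80.0, 2.5)]
summary = mean_rmsd_by_class(example)
```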
Figure 5. Per-residue side chain modelling accuracy. Per-residue RMSDs after modelling using Hunter were collected for all side chain conformations on the set of 94 models. Each per-residue RMSD was normalized (see Table S7 for details), and an average normalized RMSD was calculated. Hydrophobic residues are on the right side of the plot while polar ones are on the left side.
Figure 6. Evaluating side chain prediction accuracy using Hunter in X-ray and NMR structures. (a) Side chain conformations of the barstar structure were modelled using Hunter and compared to those determined in two different X-ray crystal structures. For most buried side chains, the conformation predicted with Hunter is in good agreement with the conformations observed in the crystal structures. For a number of exposed residues (circled in the figure), Hunter's side chain conformation differs from those observed in the crystal structures; however, these conformations also differ between the two structures. (b) Side chains were rebuilt with Hunter starting from the X-ray or NMR structures of barnase (PDB IDs 1A2P and 1FW7, respectively; the backbone RMSD between the X-ray and NMR structures is 1.07 Å). RMSD values: X-ray versus NMR, 1.45 Å; X-ray modelled versus NMR modelled, 1.49 Å; X-ray versus NMR modelled, 1.54 Å; X-ray modelled versus NMR, 1.52 Å; X-ray versus X-ray modelled, 0.85 Å; NMR versus NMR modelled, 1.15 Å.
Hunter's performance for side chain predictions in homology modelling of structures from CASP7
| Method | Number of targets | RMSD, Å | Contact score | Chi-angle accuracy, % | Chi-angle accuracy, % |
|---|---|---|---|---|---|
| Hunter | 28 | | | | |
| TS004 | 28 | 4.1 | 17.1 | 60 | 37 |
| Hunter | 26 | | | | |
| TS020 | 26 | 3.7 | 18.9 | 60 | 38 |
| Hunter | 27 | | | | |
| TS186 | 27 | 4.0 | 16.0 | 60 | 37 |
| Hunter | 5 | | | | |
| TS191 | 5 | 3.1 | 20.8 | 64 | 39 |
| Hunter | 6 | | | | |
| TS397 | 6 | 3.8 | 16.5 | 59 | 38 |
| Hunter | 6 | | | | |
| TS556 | 6 | 4.0 | 17.8 | 60 | 38 |
The results of the top six groups in CASP7 (according to Read and Chavali [39]) were used to examine Hunter's side chain modelling accuracy in homology modelling. The best models submitted by each group for every target were collected, and the average side chain prediction accuracy was determined (side chain RMSD, contact score, chi-angle prediction accuracy). Hunter was then used to rebuild side chains on the same set of submissions, and the resulting side chain prediction accuracy was evaluated. For each set of submissions two rows are given. The first row shows the side chain modelling accuracy obtained with Hunter when backbone coordinates are taken from the best model submitted by the corresponding group for each target. The second row shows the side chain modelling performance of the group itself in CASP7. Group names are given as in the CASP7 experiment.