Sebastian Kmiecik, Dominik Gront, Andrzej Kolinski.
Abstract
BACKGROUND: Although experimental methods for determining protein structure provide high-resolution structures, they cannot keep pace with the rate at which amino acid sequences are being determined on the scale of entire genomes. For the considerable fraction of proteins whose structures will not be determined experimentally, computational methods can provide valuable information. The value of structural models in biological research depends critically on their quality. Developing high-accuracy computational methods that reliably generate structural models of near-experimental quality remains an important unsolved problem in protein structure modeling.
Year: 2007 PMID: 17603876 PMCID: PMC1933428 DOI: 10.1186/1472-6807-7-43
Source DB: PubMed Journal: BMC Struct Biol ISSN: 1472-6807
The CABS decoys set
| PDB ID | % α | % β | L | % of low energy structures |
| 2gr8A | 16 | 50 | 78 | 99 |
| 2cklA | 44 | 14 | 98 | 93 |
| 2gmkA | 19 | 41 | 103 | 75 |
| 2gu3A | 19 | 48 | 128 | 79 |
| 2grrB | 64 | 0 | 157 | 92 |
| 2cl4X | 52 | 2 | 250 | 68 |
| 2cjpA | 45 | 16 | 320 | 38 |
Columns: the PDB code, the α-helix content (%), the β-strand content (%), the protein length (L), and the percentage of correctly built structures (those for which the minimization did not result in abnormally high energy values).
Figure 1. Accuracy-representative models from the CABS decoys set. Three example 2GU3A models at various distances from the native structure (the lowest-energy model at 0.6 Å, an intermediate one at 1.5 Å, and the worst one at 3 Å from the native). Models are plotted in gray; the reference native structure is drawn as a thin dark line.
Figure 2. Illustration of the secondary-structure-dependent character of the differences between the accuracy-representative models. Per-residue deviation from the native structure for the three example 2GU3A decoys (after the best superimposition of the entire structures; see Figure 1). The secondary structure is depicted symbolically on the sequence axis (helices in black and strands in grey).
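The per-residue profile described in this caption can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes the decoy has already been superimposed on the native structure, that both are given as equal-length lists of Cα coordinates in Å, and the function names `per_residue_deviation` and `global_rmsd` are hypothetical.

```python
import math

def per_residue_deviation(model_ca, native_ca):
    """Distance (in Å) between matching Cα atoms of two traces.

    Assumption: the two traces are already optimally superimposed;
    no fitting is performed here.
    """
    if len(model_ca) != len(native_ca):
        raise ValueError("traces must have the same number of residues")
    return [math.dist(m, n) for m, n in zip(model_ca, native_ca)]

def global_rmsd(deviations):
    """Root-mean-square of the per-residue distances."""
    return math.sqrt(sum(d * d for d in deviations) / len(deviations))

# Toy three-residue traces (invented coordinates, not from 2GU3A).
native = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
model = [(0.0, 0.0, 0.0), (3.8, 1.0, 0.0), (7.6, 0.0, 2.0)]

profile = per_residue_deviation(model, native)
print(profile)               # one distance per residue
print(global_rmsd(profile))  # single number for the whole chain
```

Plotting `profile` against the residue index gives a curve of the kind shown in the figure, while `global_rmsd` collapses the same distances into the single value quoted per model.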
Figure 3. Results of 1000-iteration minimization for the CABS decoys. For each protein, the energy is plotted as a function of Cα RMSD for all decoys (left panels) and after excluding decoys with abnormally high energy values resulting from structural inaccuracies (right panels). In the left panels, the energies of the native structures are marked with asterisks. The native structures were subjected to the same procedure of rebuilding from the Cα traces as the decoys. Proteins are ordered by chain length (Table 1), from the smallest at the top (2GR8A) to the largest at the bottom (2CJPA).
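The split between the left and right panels can be sketched as a simple energy filter. This is only an illustration under stated assumptions: the source does not say how "abnormally high" is decided, so the fixed `cutoff` below is a hypothetical criterion, and the function names and toy values are invented.

```python
def split_by_energy(decoys, cutoff):
    """Separate (rmsd, energy) pairs into kept and dropped lists.

    Assumption: a decoy counts as abnormally high in energy when its
    energy exceeds `cutoff`; the real criterion is not given in the text.
    """
    kept = [d for d in decoys if d[1] <= cutoff]
    dropped = [d for d in decoys if d[1] > cutoff]
    return kept, dropped

def percent_kept(decoys, cutoff):
    """Percentage of decoys that pass the energy filter."""
    if not decoys:
        raise ValueError("empty decoy set")
    kept, _ = split_by_energy(decoys, cutoff)
    return 100.0 * len(kept) / len(decoys)

# Toy (Cα RMSD in Å, energy) pairs; values are invented.
decoys = [(0.6, -300.0), (1.5, -280.0), (3.0, 5000.0), (2.0, -250.0)]

kept, dropped = split_by_energy(decoys, cutoff=0.0)
print(kept)                              # points for a right-panel style plot
print(percent_kept(decoys, cutoff=0.0))  # analogue of the last column of the CABS table
```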
The 7 protein subsets from the MOULDER testing set
| PDB ID | SS type | L | RMSD range (Å) | Median RMSD (Å) | Average ΔRMSD (best method [12], ranking) | ΔRMSD, whole subset |
| 2mtaC | α | 81 | 2.2–42.7 | 6.7 | 0.56 (0.30, | 0.26 |
| 1onc_ | α, β | 101 | 2.2–22.8 | 10.5 | 0.37 (0.25, | 0.30 |
| 1bbhA | α | 127 | 2.5–20.8 | 6.5 | 0.51 (0.05, | 0.76 |
| 1mdc_ | β | 130 | 1.9–16.4 | 9.3 | 0.37 (0.13, | 0.02 |
| 1dxtB | α | 143 | 2.0–34.1 | 7.2 | 0.56 (0.31, | 0.08 |
| 2fbjL | β | 210 | 2.4–22.5 | 8.8 | 1.51 (0.32, | 0.53 |
| 2cmd_ | α, β | 310 | 2.5–20.2 | 5.8 | 1.26 (0.31, | 1.58 |
Columns: the PDB code; the secondary structure type; the protein length (L); the range of Cα RMSD (Å); the median RMSD (Å); the average ΔRMSD of our method, followed in brackets by the average ΔRMSD of the best method [12] and a ranking, that is, the number of methods that outperformed our procedure (23 individual assessment methods were tested; SVMod, which uses a composite score from the individual methods, was not taken into account [12]); and the ΔRMSD on the whole subset.
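The two quantities reported in this table can be sketched as follows. The text does not define ΔRMSD, so this sketch assumes the usual model-assessment definition: the RMSD of the decoy with the lowest energy minus the lowest RMSD present in the set. The ranking follows the caption (the number of methods with a better, that is smaller, ΔRMSD). The function names and all numbers are illustrative and are not taken from the table.

```python
def delta_rmsd(energies, rmsds):
    """RMSD of the lowest-energy decoy minus the lowest RMSD in the set.

    Assumed definition; 0.0 means the energy picked the most accurate decoy.
    """
    if not energies or len(energies) != len(rmsds):
        raise ValueError("need two non-empty lists of equal length")
    best_scored = min(range(len(energies)), key=energies.__getitem__)
    return rmsds[best_scored] - min(rmsds)

def ranking(own_delta, other_deltas):
    """Number of other methods with a strictly smaller ΔRMSD."""
    return sum(1 for d in other_deltas if d < own_delta)

# Toy decoy set (invented values): the lowest-energy decoy is the second one.
energies = [-120.0, -150.0, -90.0]
rmsds = [2.2, 2.8, 6.7]

own = delta_rmsd(energies, rmsds)
print(own)                               # about 0.6 Å
print(ranking(own, [0.30, 0.70, 0.90]))  # one method does better
```

A ΔRMSD of zero means that the energy function selected the most accurate decoy available, so smaller values in the table indicate better discrimination.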
Figure 4. Results of 1000-iteration minimization for the MOULDER decoys. For each subset of decoys, the energy is plotted as a function of Cα RMSD for the best-scored decoys.