Abstract
BACKGROUND: Electron cryomicroscopy is a fast-developing technique for determining the 3-dimensional structures of large protein complexes. Using this technique, protein density maps can be generated at 6 to 10 Å resolution. At such resolutions, secondary structure elements such as helices and beta-strands appear as skeletons and can be computationally detected. However, it is not known which segment of the protein sequence corresponds to which of the skeletons. The topology in this paper refers to the linear order and the directionality of the secondary structures. For a protein with N helices and M strands, there are (N!·2^N)(M!·2^M) different topologies, each of which maps N helix segments and M strand segments on the protein sequence to N helix and M strand skeletons. Since the backbone position is not available in the skeleton, each topology of the skeletons corresponds to additional freedom to position the atoms in the skeletons.
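The topology definition above can be illustrated with a minimal sketch that enumerates every order and direction for one type of secondary structure. The function name `enumerate_topologies` and the 0/1 direction encoding are assumptions made for illustration; this is not the authors' implementation.

```python
from itertools import permutations, product


def enumerate_topologies(n):
    """Yield (order, directions) pairs for n sequence segments and n skeletons.

    order[i] is the 1-based sequence segment assigned to skeleton i
    (hypothetical convention); directions[i] is 0 or 1 for the relative
    direction of that assignment.
    """
    for order in permutations(range(1, n + 1)):
        for directions in product((0, 1), repeat=n):
            yield order, directions


# Three helices give 3! * 2**3 = 48 helix topologies.
helix_topologies = list(enumerate_topologies(3))
print(len(helix_topologies))
```

For a protein that also contains strands, the same enumeration would be run separately for the strands and the two sets combined, which gives the product form of the count.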
Year: 2009 PMID: 19208142 PMCID: PMC2648730 DOI: 10.1186/1471-2105-10-S1-S40
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. Protein density map and the detected helix skeletons. The density map was simulated from protein 1AGW (PDB Id) at 10 Å resolution using EMAN [31, 32]. The helix skeletons (sticks) were detected using Helix Tracer [13].
Figure 2. Generation of the secondary structure topologies for the skeletons. The three major steps in the first stage are included in two boxes with solid lines. The additional steps in the second stage are marked with a box in dashed lines.
Figure 3. Secondary structure mutation of 1DV5 (PDB Id). The location and orientation of the three helix skeletons (cylinders R1, R2 and R3) are shown. The three segments of the sequence that form helices (H1, H2 and H3) are labelled on the sequence. A mutated topology of the secondary structures was generated by swapping the sequence assignment for (R1, R3). Amino acids in boxes were deleted during the mutation; amino acids in red were padded. The thicker side chains belong to the native structure. The thinner side chains belong to the mutated structure after side chain relaxation. For viewing clarity, only the side chains in the vicinity of the secondary structure interaction are shown.
The topologies of the secondary structures with lower contact energy than that of the native topology: Stage 1, with the assumption of knowing the backbone Cα atom positions.
| PDB | NAA | PctN | N | Nb | Nm | Neff | Pcteff |
| | 56 | 58.93% | 4 | 0 | 384 | 9 | 2.34% |
| | 80 | 46.25% | 3 | 0 | 48 | 2 | 4.17% |
| | 77 | 48.05% | 4 | 0 | 384 | 1 | 0.26% |
| | 58 | 44.83% | 2 | 2 | 64 | 13 | 20.31% |
| | 91 | 56.04% | 4 | 0 | 384 | 4 | 1.04% |
| | 35 | 71.43% | 3 | 0 | 48 | 5 | 10.42% |
| | 93 | 53.76% | 4 | 0 | 384 | 3 | 0.78% |
| | 58 | 44.83% | 2 | 2 | 64 | 4 | 6.25% |
| | 61 | 88.52% | 4 | 0 | 384 | 2 | 0.52% |
| | 107 | 85.98% | 3 | 0 | 48 | 4 | 8.33% |
| | 36 | 47.22% | 1 | 3 | 96 | 27 | 28.12% |
| | 72 | 61.11% | 5 | 0 | 3840 | 167 | 4.35% |
| | 81 | 66.67% | 5 | 0 | 3840 | 6 | 0.16% |
| | 131 | 83.97% | 5 | 0 | 3840 | 2 | 0.05% |
| | 78 | 65.38% | 5 | 0 | 3840 | 2 | 0.05% |
| | 87 | 55.17% | 5 | 0 | 3840 | 7 | 0.18% |
| | 117 | 78.63% | 6 | 0 | 46080 | 9 | 0.02% |
| | 72 | 68.06% | 4 | 2 | 3072 | 27 | 0.88% |
| | 88 | 64.77% | 6 | 0 | 46080 | 39 | 0.08% |
| | 190 | 88.95% | 6 | 0 | 46080 | 2 | 0.00% |
| | 75 | 88.00% | 2 | 0 | 8 | 0 | 0.00% |
| | 105 | 86.67% | 4 | 0 | 384 | 0 | 0.00% |
| | 64 | 71.88% | 1 | 3 | 96 | 1 | 1.04% |
| | 137 | 42.34% | 3 | 0 | 48 | 5 | 10.42% |
| | 68 | 80.88% | 4 | 0 | 384 | 5 | 1.30% |
| | 59 | 86.44% | 2 | 0 | 8 | 0 | 0.00% |
| | 118 | 60.17% | 5 | 0 | 3840 | 13 | 0.34% |
| | 85 | 56.47% | 6 | 0 | 46080 | 102 | 0.22% |
| | 77 | 61.04% | 2 | 4 | 3072 | 28 | 0.91% |
| | 58 | 67.24% | 3 | 3 | 2304 | 19 | 0.82% |
NAA: the number of amino acids;
PctN: the percentage of amino acids in the secondary structures (helices and strands);
N: the number of alpha helices; Nb: the number of beta strands;
Nm: the number of secondary structure mutations; cross mutations between a helix and a strand are ignored;
Neff: the number of mutated topologies with lower effective contact energy than that of the native topology;
Pcteff: the percentage of mutated topologies with lower effective contact energy than that of the native topology, as evaluated by the multi-well function.
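The Nm and Pcteff columns follow from the counting rule stated in the abstract, and a short sketch makes the arithmetic checkable against the table. The function names `topology_count` and `pct_eff` are hypothetical and chosen only for this illustration.

```python
from math import factorial


def topology_count(n_helices, n_strands):
    """Number of topologies when helices and strands are permuted separately.

    Each group of k elements contributes k! orderings times 2**k directions;
    helix/strand cross assignments are not counted.
    """
    helix_part = factorial(n_helices) * 2 ** n_helices
    strand_part = factorial(n_strands) * 2 ** n_strands
    return helix_part * strand_part


def pct_eff(n_eff, n_m):
    """Percentage of topologies with lower energy than the native one."""
    return 100.0 * n_eff / n_m


# First table row: 4 helices, 0 strands, 9 lower-energy topologies.
n_m = topology_count(4, 0)
print(n_m, round(pct_eff(9, n_m), 2))
```

The same functions reproduce the mixed rows, for example 2 helices and 2 strands give 64 topologies.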
Constructed atomic structures for the skeletons in 1LRE. The structures are ranked by contact energy (5th column), and the 32066 structures with negative contact energy are included in the table. The structure with the smallest RMSD to native (4.781 Å) is ranked 17th.
| Rank | Topology | Shift | Rot | CE | RMSD |
| 1 | 123100 | [-1, 1, 1] | [1.57, 1.57, 1.57] | -2.066887 | 7.224 |
| 2 | 123100 | [-1, 1, -1] | [1.57, 1.57, 3.66] | -2.066817 | 7.517 |
| ... | ... | ... | ... | ... | ... |
| 17 | 123000 | [1, 0, 1] | [5.76, 3.14, 1.57] | -2.004894 | 4.781 |
| ... | ... | ... | ... | ... | ... |
| 32066 | 123011 | [1, -1, -1] | [1.05, 1.05, 0] | 1.09E-4 | 12.979 |
Rank: the rank of the structure by the contact energy (5th column)
Topology: the topology Id, the 1st half of the digits: permutation of the assignment, the last half of the digits: directions (0 or 1) of the assignment for each helix;
Shift: the amino acid position shift from the true helix segment for each assignment, "-" left, "+" right;
Rot: rotation angle around the skeleton axis for each helix, in radians;
CE: Effective contact energy of the constructed helices;
RMSD: the root mean square deviation of the Cα atoms between the constructed candidate structure and the native structure, in Å.
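The topology Id encoding described in the legend can be unpacked with a small helper. This is a sketch under the assumption that every index is a single digit (fewer than ten helices); `decode_topology_id` is a hypothetical name, not part of the original software.

```python
def decode_topology_id(topology_id):
    """Split a topology Id into (permutation, directions).

    The first half of the digits is the permutation of the assignment and
    the last half holds the direction (0 or 1) for each helix. Assumes
    single-digit indices, which holds for the Ids shown in the table.
    """
    digits = str(topology_id)
    if len(digits) % 2 != 0:
        raise ValueError("topology Id must have an even number of digits")
    half = len(digits) // 2
    permutation = [int(d) for d in digits[:half]]
    directions = [int(d) for d in digits[half:]]
    if any(d not in (0, 1) for d in directions):
        raise ValueError("direction digits must be 0 or 1")
    return permutation, directions


# Top-ranked structure in the table: Id 123100.
print(decode_topology_id("123100"))
```

Passing the Id as a string avoids losing any leading zero that an integer representation would drop.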
The constructed structure with the smallest RMSD to native for the four tested proteins.
| Protein | Assignments | Rank | Topology | Shift | Rot | CE | RMSD |
| 1LRE | 32066 | 17 | 123000 | [1, 0, 1] | [5.76, 3.14, 1.57] | -2.004894 | 4.781 |
| | 391833 | 31 | 12340000 | [-1, -1, 1, 0] | [0, 0, 3.66, 5.76] | -1.751024 | 4.718 |
| | 98755 | 16 | 123000 | [-1, 0, -1] | [4.71, 5.23, 1.05] | -2.414033 | 4.341 |
| | 192935 | 5 | 124000 | [-1, 0, 1] | [5.23, 3.66, 2.09] | -2.784552 | 4.665 |
Protein: the PDB Id;
Assignments: the total number of assignments with negative contact energy;
Rank: the rank of the structure with the smallest RMSD to native;
Topology: the topology Id, the 1st half of the digits: permutation of the assignment, the last half of the digits: directions (0 or 1) of the assignment for each helix;
Shift: the amino acid position shift from the true helix segment for each assignment, "-" left, "+" right;
Rot: rotation angle around the skeleton axis for each helix, in radians;
CE: Effective contact energy of the constructed helices;
RMSD: the root mean square deviation of the Cα atoms between the constructed candidate structure and the native structure, in Å.
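The RMSD column can be illustrated with a minimal sketch of a Cα root mean square deviation. It assumes that both structures list the same Cα atoms in the same order and already share one coordinate frame, so no superposition is performed; whether the original work superimposed the structures first is not stated here. The name `ca_rmsd` is hypothetical.

```python
import math


def ca_rmsd(coords_a, coords_b):
    """RMSD in the input units (e.g. Å) between two matched Cα coordinate lists.

    Each argument is a sequence of (x, y, z) tuples. No alignment is applied;
    the structures are assumed to be in the same frame already.
    """
    if len(coords_a) != len(coords_b) or len(coords_a) == 0:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared / len(coords_a))


# Toy coordinates, not taken from any PDB entry.
native = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0)]
candidate = [(0.0, 1.0, 0.0), (3.8, 1.0, 0.0)]
print(ca_rmsd(native, candidate))
```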
Figure 4. Top 100 structures for the skeletons in 1LRE (PDB Id). The constructed atomic structures for the helix skeletons were ranked by contact energy (top panel). A topology Id (bottom panel) has six digits. The first three digits represent the permutation of the assignment, and the last three represent the relative direction (0 or 1) between the sequence segment and the skeleton for each of the skeletons. The RMSD to the native structure (middle panel) is shown for each of the 100 constructed structures. The constructed structure with the smallest RMSD to native (the 17th of the 100) is marked in red in both the topology and the RMSD panels.