Kamal Al Nasr, Weitao Sun, Jing He.
Abstract
BACKGROUND: Recent advances in electron cryo-microscopy have made it possible to obtain protein density maps at about 6-10 Å resolution. Although it is hard to derive the protein chain directly from such a low-resolution map, the locations of secondary structures such as helices and strands can be detected computationally. It has been demonstrated that such low-resolution maps can be used during protein structure prediction to improve the predicted structures.
Year: 2010 PMID: 20122218 PMCID: PMC3009517 DOI: 10.1186/1471-2105-11-S1-S44
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. Helical skeletons and topologies. (A) The density map (grey) was simulated to 10 Å resolution using protein 1B5L from the Protein Data Bank (PDB). The helical skeletons (red, S1 to S5) were detected using Helix Tracer [8]. (B) The helix segments are highlighted (black) on the protein sequence. Two alternative topologies are shown as diagrams: (C) the correct topology and (D) a wrong topology, in which the N to C direction is labelled for the loop (arrow) and for the skeleton (cross and dot). The true assignment is labelled on the skeleton, with H1 being the first helix segment on the protein sequence.
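The notion of a topology in Figure 1 can be illustrated with a short enumeration. This is a minimal sketch, not the authors' implementation: it assumes a topology is fully described by which sequence helix segment is assigned to each skeleton plus one of two chain directions per skeleton. The function name `enumerate_topologies` and the `+1`/`-1` direction encoding are hypothetical, as is the choice of five helix segments in the example.

```python
from itertools import permutations, product


def enumerate_topologies(n_helices, n_skeletons):
    """Yield every assignment of sequence helix segments to skeletons.

    Assumption (not from the source): a topology is a tuple holding one
    (helix_index, direction) pair per skeleton, where direction is +1 or -1
    for the two ways the N-to-C chain can run along that skeleton.
    """
    helices = range(1, n_helices + 1)
    # Choose and order the helix segments that occupy skeletons S1..Sk.
    for assigned in permutations(helices, n_skeletons):
        # Each occupied skeleton can be traversed in either direction.
        for directions in product((+1, -1), repeat=n_skeletons):
            yield tuple(zip(assigned, directions))


# Five skeletons as in Figure 1, assuming five helix segments on the sequence.
topologies = list(enumerate_topologies(5, 5))
print(len(topologies))
print(topologies[0])
```

Under this encoding, the first topology yielded assigns H1 to S1, H2 to S2, and so on, all in the forward direction; the correct and wrong topologies of panels (C) and (D) would be two different elements of the same list.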
Figure 2. The highest ranked structure with the correct topology for 1B5L (PDB ID). The native structure (grey ribbon) and the predicted structure (red ribbon) were superimposed on the protein density map. In the predicted structure, the connection between the two helices is simply drawn as a straight line that is smoothed by the ribbon representation. The amino acid labels and the side chains are shown for one of the five helices. The dotted line (grey) represents the missing loop in the native structure.
The test of the structure prediction for the helical skeletons
| No | ID | #AA (a) | #Helices (b) | #Sticks (c) | #Possible topologies (d) | #Valid topologies (e) | #Generated structures (f) | Rank (g) | RMSD (h) | Prct (i) |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | | 55 | 3 | 3 | 48 | 6 | 3000 | 10 | 4.78 | 0.33% |
| 2 | | 149 | 3 | 3 | 48 | 32 | 16000 | 15 | 3.72 | 0.09% |
| 3 | | 73 | 3 | 3 | 48 | 16 | 8000 | 3 | 3.96 | 0.04% |
| 4 | | 168 | 3 | 3 | 48 | 16 | 8000 | 27 | 4.67 | 0.34% |
| 5 | | 81 | 3 | 3 | 48 | 18 | 9000 | 10 | 4.55 | 0.11% |
| 6 | | 120 | 3 | 3 | 48 | 4 | 2000 | 4 | 11.17 | 0.20% |
| 7 | | 68 | 3 | 3 | 48 | 14 | 7000 | 17 | 4.03 | 0.24% |
| 8 | | 87 | 4 | 3 | 192 | 48 | 24000 | 93 | 3.8 | 0.39% |
| 9 | | 127 | 4 | 3 | 192 | 160 | 80000 | 572 | 4.75 | 0.72% |
| 10 | | 85 | 4 | 3 | 192 | 40 | 20000 | 12 | 2.8 | 0.06% |
| 11 | | 126 | 4 | 3 | 192 | 120 | 60000 | 42 | 4.24 | 0.07% |
| 12 | | 75 | 4 | 3 | 192 | 104 | 52000 | 47 | 4.2 | 0.09% |
| 13 | | 181 | 4 | 3 | 192 | 48 | 24000 | 110 | 3.57 | 0.46% |
| 14 | | 88 | 4 | 3 | 192 | 40 | 20000 | 9 | 3.69 | 0.05% |
| 15 | | 116 | 4 | 3 | 192 | 72 | 36000 | 147 | 7.3 | 0.41% |
| 16 | | 164 | 4 | 3 | 192 | 124 | 62000 | 114 | 5.48 | 0.18% |
| 17 | | 80 | 5 | 3 | 480 | 450 | 225000 | 30 | 3.51 | 0.01% |
| 18 | | 159 | 5 | 3 | 480 | 96 | 48000 | 60 | 6.69 | 0.13% |
| 19 | | 73 | 6 | 3 | 960 | 438 | 219000 | 35 | 3.43 | 0.02% |
| 20 | | 72 | 4 | 4 | 384 | 66 | 33000 | 31 | 4.26 | 0.09% |
| 21 | | 61 | 4 | 4 | 384 | 64 | 32000 | 21 | 5.58 | 0.07% |
| 22 | | 77 | 4 | 4 | 384 | 20 | 10000 | 3 | 4.75 | 0.03% |
| 23 | | 100 | 5 | 4 | 1920 | 468 | 234000 | 339 | 4.75 | 0.14% |
| 24 | | 118 | 6 | 4 | 5760 | 139 | 69500 | 144 | 5.33 | 0.21% |
| 25 | | 136 | 6 | 4 | 5760 | 144 | 72000 | 8 | 4.84 | 0.01% |
| 26 | | 186 | 6 | 4 | 5760 | 419 | 209500 | 288 | 7.22 | 0.14% |
| 27 | | 121 | 6 | 4 | 5760 | 400 | 200000 | 17 | 3.78 | 0.01% |
| 28 | | 146 | 6 | 4 | 5760 | 304 | 152000 | 129 | 5.25 | 0.08% |
| 29 | | 108 | 7 | 4 | 13440 | 768 | 384000 | 3599 | 4.15 | 0.94% |
| 30 | | 153 | 8 | 4 | 26880 | 1215 | 607500 | 8 | 5.59 | 0.00% |
| 31 | | 144 | 5 | 5 | 3840 | 16 | 8000 | 31 | 4.91 | 0.39% |
| 32 | | 161 | 5 | 5 | 3840 | 157 | 78500 | 204 | 5.3 | 0.26% |
| 33 | | 194 | 6 | 5 | 23040 | 3840 | 1920000 | 2179 | 4.44 | 0.11% |
| 34 | | 172 | 6 | 5 | 23040 | 438 | 219000 | 759 | 5.44 | 0.35% |
| 35 | | 142 | 7 | 6 | 322560 | 7734 | 3867000 | 4707 | 4.65 | 0.12% |
a: the number of amino acids in the protein
b: the number of helices in the protein
c: the number of skeletons detected by Helix Tracer
d: the number of all possible topologies
e: the number of valid topologies after applying distance and length screening
f: the number of structures generated for all valid topologies
g: the highest rank of the structure that has the correct topology
h: the Root Mean Square Deviation (RMSD) of Cα atoms of the structure that has the highest rank with the correct topology
i: the percentage of the highest rank among all generated structures
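The counts in column d and the percentages in column i can be reproduced with simple arithmetic. The sketch below rests on an observation about the table values rather than a formula stated in the text: every listed count of possible topologies equals the number of ordered selections of skeleton-many helices out of all helices, multiplied by two directions per skeleton. The percentage is taken as rank divided by generated structures. The function names are hypothetical.

```python
from math import perm


def possible_topologies(n_helices, n_skeletons):
    """Ordered choice of helices for the skeletons, times 2 directions each.

    This closed form is inferred from the table values, not quoted from
    the paper.
    """
    return perm(n_helices, n_skeletons) * 2 ** n_skeletons


def rank_percentage(rank, n_generated):
    """Rank of the best correct-topology structure as a percentage."""
    return 100.0 * rank / n_generated


# (helices, skeletons, possible, generated, rank) copied from rows 1, 23, 35.
rows = [
    (3, 3, 48, 3000, 10),
    (5, 4, 1920, 234000, 339),
    (7, 6, 322560, 3867000, 4707),
]

for n_helices, n_skeletons, listed, generated, rank in rows:
    computed = possible_topologies(n_helices, n_skeletons)
    pct = rank_percentage(rank, generated)
    print(n_helices, n_skeletons, computed, computed == listed, f"{pct:.2f}%")
```

In the rows above, column f is also always 500 times column e, which suggests a fixed number of structures generated per valid topology.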
Structure prediction of the local regions in two large proteins
| ID | #AA (a) | #Helices (b) | #Sticks (c) | #Possible topologies (d) | #Valid topologies (e) | #Generated structures (f) | Rank (g) | RMSD (h) | Prct (i) |
|---|---|---|---|---|---|---|---|---|---|
| | | 14 | 4 | 384384 | 69738 | 6973800 | 10448 | 3.96 | 0.15% |
| | | 14 | 4 | 384384 | 84733 | 8473300 | 14673 | 4.11 | 0.17% |
| | 290 | 14 | 8 | | | 5.9E+13 | 3741775 | | 0% |
| | | 20 | 4 | 1860480 | 184255 | 18425500 | 40485 | 6.47 | 0.22% |
| | | 20 | 4 | 1860480 | 280708 | 28070800 | 18412 | 4.65 | 0.07% |
| | 322 | 20 | 8 | | | 5.17E+14 | 32104299 | | 0% |
a: the number of amino acids in the protein
b: the number of helices in the protein
c: the number of skeletons used for structure prediction in the region
d: the number of all possible topologies in the region
e: the number of valid topologies after applying distance and length screening
f: the number of structures generated in the region or the total number of the structures evaluated for two regions
g: the highest rank of the structure with the correct topology among the generated structures
h: the RMSD of the highest ranked structure with the correct topology among the generated structures
i: the percentage of the rank for the structure with the highest rank and the correct topology
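The large totals reported for the two whole proteins appear to follow from combining the two local regions. The sketch below treats each total as the product of the structures generated in the two regions; this reading is an assumption based on footnote f and on the numbers agreeing, not a statement from the text. The helper name `combined_structures` is hypothetical.

```python
from math import perm


def combined_structures(generated_region_1, generated_region_2):
    """Pairs of structures, one per region (assumed interpretation)."""
    return generated_region_1 * generated_region_2


# Possible topologies for one four-skeleton region, using the same
# ordered-selection-times-directions count that matches the table.
print(perm(14, 4) * 2 ** 4)
print(perm(20, 4) * 2 ** 4)

# Structures generated in the two regions of each protein, from the table.
total_14 = combined_structures(6973800, 8473300)
total_20 = combined_structures(18425500, 28070800)
print(f"{total_14:.1E}")
print(f"{total_20:.2E}")
```

Splitting eight skeletons into two regions of four keeps each region's search in the range of a few hundred thousand to a few million topologies, whereas the combined count grows as the product of the two.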
Figure 3. The predicted structure for eight of the fourteen helices in two regions. The native structure (PDB ID: 1A0P, grey) and the highest ranked structure with the correct topology (red) are shown.
Figure 4. The structure prediction process for the skeletons.