Shuai Cheng Li, Dongbo Bu, Xin Gao, Jinbo Xu, Ming Li.
Abstract
MOTIVATION: The 3D structure of a protein sequence can be assembled from the substructures corresponding to small segments of this sequence. For each small sequence segment, only a few substructures are likely. We call them the 'structural alphabet' for this segment. Classical approaches such as ROSETTA used sequence profile and secondary structure information to predict structural fragments. In contrast, we utilize more structural information, such as solvent accessibility and contact capacity, for finding structural fragments.
Year: 2008 PMID: 18586712 PMCID: PMC2718643 DOI: 10.1093/bioinformatics/btn165
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
Proteins for the Structure Space and Training Set
| A. Structure Space | ||||||
| 1ci4a | 1zm8a | 1j79a | 1rlja | 1zhva | 1wlya | 2a14a |
| 2gc9a | 1lg7a | 1wkoa | 1jfla | 1t9ha | 1lm5a | 1kxoa |
| 1xfia | 1rqpa | 1m15a | 1z96a | 1mla | 1ail | 1yksa |
| 1q25a | 1mj5a | 2erba | 2bsya | 1lst | 1g8aa | 1wzca |
| 1y9wa | 1xkpc | 1v4va | 1se8a | 1p9ha | 1r17a | 1qfta |
| 1aol | 1ju3a | 1rsga | 1atg | 1s5aa | ||
| B. Training Set | ||||||
| 1olra | 2byca | 1yb5a | 1pbwa | 1v0ea | 1orva | 1jb7b |
| 2ftra | 1fj2a | 1fp2a | 2foma | 1xtta | 1suua | 1xuua |
| 1w2wb | 1viaa | 1r9wa | 1fj2a | 1dmga | 2ah5a | 1tc5a |
| 2az4a | 1mzwb | 1ef1c | 1uvqc | 1ikta | 1xfsa | 1zava |
| 1vk5a | 1oyga | |||||
The first four characters are the PDB code. The fifth character is the chain ID, which is omitted for single-chain entries.
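The naming convention above can be illustrated with a small parser. This is a minimal sketch, not part of the original work; the function name `split_chain_id` is hypothetical and introduced only for illustration.

```python
from typing import Optional, Tuple


def split_chain_id(entry: str) -> Tuple[str, Optional[str]]:
    """Split a table entry such as '1ci4a' into (PDB code, chain ID).

    Hypothetical helper: the chain ID is None when the entry has only
    four characters, i.e. a single-chain structure.
    """
    entry = entry.strip()
    if len(entry) not in (4, 5):
        raise ValueError(f"expected 4 or 5 characters, got {entry!r}")
    pdb_code = entry[:4]
    chain_id = entry[4] if len(entry) == 5 else None
    return pdb_code, chain_id


print(split_chain_id("1ci4a"))  # ('1ci4', 'a')
print(split_chain_id("1ail"))   # ('1ail', None)
```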
Position coverage for CBM versus FRazor (FR)'s score function
| θ (Å) | α-Helix CBM | α-Helix FR | β-Sheet CBM | β-Sheet FR | Loop CBM | Loop FR | Overall CBM | Overall FR |
|---|---|---|---|---|---|---|---|---|
| 0.5 | 94.2 | 95.1 | 10.0 | 37.6 | 26.6 | 38.7 | 49.4 | 55.1 |
| 1 | 98.2 | 98.6 | 56.4 | 89.6 | 55.5 | 78.1 | 72.2 | 88.2 |
| 1.5 | 99.7 | 99.7 | 89.3 | 98.2 | 81.3 | 93.3 | 89.9 | 96.7 |
| 2 | 100 | 100 | 99.7 | 99.8 | 96.9 | 98.9 | 98.6 | 99.4 |
| 2.5 | 100 | 100 | 99.9 | 99.9 | 99.7 | 99.7 | 99.8 | 99.8 |
| 3 | 100 | 100 | 100 | 100 | 99.9 | 100 | 99.9 | 100 |
| 3.5 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |
Position coverage (%) is displayed. The first column θ(Å) is the native threshold. The fragment candidate list size (k) is 25. The fragment length is 9.
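The caption does not spell out how position coverage is computed. The sketch below assumes a common reading: a sequence position counts as covered when at least one of its k candidate fragments has an RMSD to the native fragment no larger than θ. That definition, the function name `position_coverage`, and the toy RMSD values are all assumptions made for illustration, not the authors' implementation.

```python
from typing import Sequence


def position_coverage(rmsds_per_position: Sequence[Sequence[float]],
                      theta: float) -> float:
    """Percentage of positions with at least one candidate within theta (Å).

    Assumed definition: rmsds_per_position[i] holds the RMSDs (to the
    native fragment) of the candidate fragments proposed at position i.
    """
    if not rmsds_per_position:
        raise ValueError("no positions given")
    covered = sum(
        1 for candidates in rmsds_per_position
        if any(r <= theta for r in candidates)
    )
    return 100.0 * covered / len(rmsds_per_position)


# Toy data: 4 positions, 2 candidate fragments each (made-up RMSDs in Å).
toy = [[0.5, 1.25], [1.5, 2.0], [1.0, 3.0], [2.5, 2.75]]
print(position_coverage(toy, theta=1.0))  # 50.0
print(position_coverage(toy, theta=2.0))  # 75.0
```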
Position coverage percentage (%) for CBM versus FRazor (FR) at threshold value 1Å
| k | α-Helix CBM | α-Helix FR | β-Sheet CBM | β-Sheet FR | Loop CBM | Loop FR | Overall CBM | Overall FR |
|---|---|---|---|---|---|---|---|---|
| 5 | 90.5 | 96.6 | 34.2 | 65.6 | 40.3 | 59.8 | 60.7 | 75.1 |
| 10 | 97.2 | 97.5 | 42.4 | 79.1 | 46.1 | 67.9 | 65.1 | 81.5 |
| 15 | 97.8 | 99.3 | 49.5 | 82.1 | 50.6 | 70.5 | 68.6 | 85.0 |
| 20 | 98.1 | 98.0 | 53.6 | 85.1 | 53.5 | 73.0 | 70.8 | 86.4 |
| 25 | 98.2 | 98.6 | 56.4 | 89.6 | 55.5 | 78.1 | 72.2 | 86.4 |
| 30 | 98.3 | 98.7 | 59.9 | 90.8 | 57.4 | 79.6 | 73.6 | 88.2 |
| 35 | 98.5 | 98.8 | 61.5 | 92.0 | 58.5 | 81.1 | 74.5 | 90.0 |
| 40 | 98.7 | 99.0 | 63.5 | 92.9 | 59.5 | 82.3 | 75.4 | 90.8 |
The first column is the fragment candidate list size. The fragment length is 9.
Fragment coverage and local fit score at a threshold of 1Å
| k | Fragment coverage (%), CBM | Fragment coverage (%), FRazor | Local fit score (Å), CBM | Local fit score (Å), FRazor |
|---|---|---|---|---|
| 5 | 29.2 | 37.9 | 1.860 | 1.542 |
| 10 | 33.1 | 43.3 | 1.592 | 1.338 |
| 15 | 35.5 | 46.8 | 1.468 | 1.240 |
| 20 | 37.0 | 49.6 | 1.393 | 1.176 |
| 25 | 38.2 | 51.5 | 1.342 | 1.133 |
| 30 | 39.3 | 53.2 | 1.301 | 1.097 |
| 35 | 40.1 | 54.6 | 1.272 | 1.072 |
| 40 | 40.8 | 55.6 | 1.247 | 1.050 |
k, in the first column, is the fragment candidate list size. The fragment length is 9. The threshold value is 1Å.
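Neither metric is defined in the caption, so the following is only a sketch under stated assumptions: fragment coverage is taken to be the percentage of all candidate fragments whose RMSD to the native fragment is within the threshold, and the local fit score is taken to be the average, over positions, of the best (lowest) candidate RMSD. Both definitions, the function names, and the toy values are hypothetical.

```python
from typing import Sequence


def fragment_coverage(rmsds_per_position: Sequence[Sequence[float]],
                      theta: float) -> float:
    """Assumed definition: % of all candidate fragments with RMSD <= theta."""
    all_rmsds = [r for candidates in rmsds_per_position for r in candidates]
    if not all_rmsds:
        raise ValueError("no candidate fragments given")
    good = sum(1 for r in all_rmsds if r <= theta)
    return 100.0 * good / len(all_rmsds)


def local_fit_score(rmsds_per_position: Sequence[Sequence[float]]) -> float:
    """Assumed definition: mean over positions of the lowest candidate RMSD."""
    if not rmsds_per_position:
        raise ValueError("no positions given")
    best_per_position = [min(candidates) for candidates in rmsds_per_position]
    return sum(best_per_position) / len(best_per_position)


# Toy data: 4 positions, 2 candidate fragments each (made-up RMSDs in Å).
toy = [[0.5, 1.25], [1.5, 2.0], [1.0, 3.0], [2.5, 2.75]]
print(fragment_coverage(toy, theta=1.0))  # 25.0
print(local_fit_score(toy))               # 1.375
```

Under these assumed definitions, a longer candidate list can only lower (improve) the local fit score, which matches the downward trend in the table as k grows.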
Customized fragment lists versus independent fragment libraries
| Size | Fragment coverage (%), KFL | Fragment coverage (%), FRazor | Local fit score (Å), KFL | Local fit score (Å), FRazor |
|---|---|---|---|---|
| 25 | – | 45.3 | – | 0.763 |
| 50 | 36.2 | 40.5 | 0.754 | 0.667 |
| 100 | 40.7 | 55.7 | 0.673 | 0.589 |
| 150 | 43.3 | 58.6 | 0.633 | 0.554 |
| 200 | 44.0 | 60.4 | 0.603 | 0.531 |
| 250 | 46.3 | 61.8 | 0.585 | 0.515 |
The first column is the fragment candidate list size for FRazor and the library size for Kolodny's fragment libraries (KFL). Fragment coverage (%) at a threshold of 0.5Å is shown for KFL in column 2 and for FRazor's distance function in column 3.
Decoy quality comparison between ROSETTA and FRazor
| Name | L | α,β | ROSETTA % | ROSETTA Best | ROSETTA Avg | FRazor % | FRazor Best | FRazor Avg |
|---|---|---|---|---|---|---|---|---|
| 1FC2 | 43 | 2,0 | 20.5 | 2.59 | 7.3 | 38.6 | 2.60 | 6.4 |
| 1ENH | 54 | 2,0 | 39.5 | 3.06 | 7.3 | 53.8 | 2.61 | 6.4 |
| 2GB1 | 56 | 1,4 | 89.8 | 1.88 | 4.3 | 90.6 | 2.04 | 4.4 |
| 2CRO | 65 | 5,0 | 40.6 | 3.02 | 6.7 | 67.2 | 2.57 | 5.8 |
| 1CTF | 68 | 3,3 | 9.2 | 3.42 | 9.1 | 11.0 | 3.14 | 8.4 |
| 4ICB | 76 | 4,0 | 2.8 | 4.74 | 9.4 | 2.6 | 4.81 | 9.6 |
The first column is the protein name, given as its PDB code. L is the sequence length. The third column is the number of α-helices and β-strands. Columns 4–6 give the percentage (%) of good decoys with RMSD <6.0Å, the RMSD of the best decoy (Best), and the average RMSD (Avg) of all decoys generated by ROSETTA. Columns 7–9 give the corresponding values for FRazor.
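The three per-method statistics described in the caption can be computed from a list of decoy RMSDs as sketched below. The function name `summarize_decoys` and the toy RMSD values are hypothetical; only the 6.0Å cutoff and the meaning of %, Best, and Avg come from the caption.

```python
from statistics import mean
from typing import Dict, Sequence


def summarize_decoys(rmsds: Sequence[float],
                     good_cutoff: float = 6.0) -> Dict[str, float]:
    """Summarize decoy quality from RMSDs (Å) to the native structure.

    percent_good: share of decoys with RMSD strictly below good_cutoff
    best:         lowest RMSD among all decoys
    avg:          mean RMSD over all decoys
    """
    if not rmsds:
        raise ValueError("no decoys given")
    good = sum(1 for r in rmsds if r < good_cutoff)
    return {
        "percent_good": 100.0 * good / len(rmsds),
        "best": min(rmsds),
        "avg": mean(rmsds),
    }


# Toy example with made-up RMSDs; 6.0 itself does not count as "good".
print(summarize_decoys([2.5, 5.0, 6.0, 8.5]))
# {'percent_good': 50.0, 'best': 2.5, 'avg': 5.5}
```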
Fig. 1. Best decoys generated by ROSETTA and FRazor for the Cro repressor protein 2CRO.