Abstract
We examine the ability of current state-of-the-art methods in protein structure prediction to discriminate topologically distant folds encoded by highly similar (>90% sequence identity) designed proteins in blind protein structure prediction experiments. We detail the corresponding prognosis for the protein fold recognition field and highlight the features of the methodologies that successfully deciphered this folding riddle.
Year: 2009 PMID: 20209018 PMCID: PMC2832337 DOI: 10.3410/B1-69
Source DB: PubMed Journal: F1000 Biol Rep ISSN: 1757-594X
Figure 1. Difficulties in fold recognition for the redesigned streptococcal protein domains GA95 versus GB95
(a) Only four of more than 150 participating groups recognized the difference in fold caused by three nonidentical residues in the 56-residue proteins: HHpred (cyan), Feig (black), FOLDpro (blue), and Coma (others are in orange). The results are shown in global distance test (GDT) plot format, in which the alpha carbon atoms of the predicted model and experimental structure are spatially aligned within distance cutoffs of 0.5 Å, 1 Å, 1.5 Å, and so on up to 10 Å, such that lower lines denote higher accuracy. A common trend among these four groups was to predict the alternate fold as a lower-confidence model. Most groups correctly identified the GB95 (T0499) fold, yet most models were no better than random for GA95 (T0498). The ability of four automated servers to disentangle this riddle provides a positive outlook for the fold recognition field. (b) Predictions made by our group (purple) for T0498 and T0499 compared with the experimental structures for GA95 and GB95 (cyan), respectively. While our predictions were among the very best for T0499/GB95 (the only group with all five submissions in the top 10), the incorrect fold assignment led to highly inaccurate predictions for T0498/GA95. Like so many other protein structure prediction groups, we failed to predict that profoundly similar sequences would produce different folds. CA, alpha carbon; GA95, the artificial protein with GA fold and 95% sequence identity to GB95; GB95, the artificial protein with GB fold and 95% sequence similarity to GA95.
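The GDT-style curve described in the caption can be illustrated with a minimal Python sketch. It assumes the model and experimental alpha carbon coordinates are already superimposed and paired residue by residue; the full GDT procedure searches over many superpositions to maximize the count at each cutoff, which this sketch does not attempt. The function name `gdt_curve` and the toy coordinates are hypothetical and are not taken from the CASP8 data.

```python
import math

def gdt_curve(model_ca, native_ca, cutoffs=None):
    """Percent of CA atoms lying within each distance cutoff.

    Assumption: both coordinate lists are already superimposed and
    listed in the same residue order (one (x, y, z) tuple per residue).
    """
    if len(model_ca) != len(native_ca):
        raise ValueError("model and native must have the same number of CA atoms")
    if not model_ca:
        raise ValueError("at least one CA atom is required")
    if cutoffs is None:
        # 0.5 Å steps from 0.5 Å up to 10 Å, as in the caption
        cutoffs = [0.5 * k for k in range(1, 21)]

    # Per-residue CA-CA distance between model and experimental structure
    distances = [math.dist(m, n) for m, n in zip(model_ca, native_ca)]
    total = len(distances)

    # One (cutoff, percent of residues within cutoff) point per cutoff
    return [
        (cutoff, 100.0 * sum(d <= cutoff for d in distances) / total)
        for cutoff in cutoffs
    ]

# Toy example with made-up coordinates (four residues)
model = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
native = [(0.0, 0.0, 0.4), (1.0, 0.0, 0.9), (2.0, 0.0, 3.0), (3.0, 0.0, 12.0)]
for cutoff, percent in gdt_curve(model, native):
    print(f"{cutoff:4.1f} Å  {percent:5.1f}% of residues")
```

A model with the correct fold reaches a high percentage at small cutoffs, whereas a model with the wrong topology leaves most residues outside even the larger cutoffs.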
Figure 2. Putative causes of CASP8 fold recognition failure and success for redesigned streptococcal protein GA95 versus GB95
A sequence-to-structure cross of GA95 and GB95 is presented to demonstrate determinants of fold recognition from side chain packing of the nonidentical residues (red). The lack of profound steric clashes when the side chain identities from T0498 are applied to the structure of GB95 (left) misleads predictors into identifying an incorrect fold topology. Conversely, the clash that occurs between F30 and A20 when the side chain identities from T0499 are applied to the structure of GA95 (right) signals an incorrect fold to predictors. GA95, the artificial protein with GA fold and 95% sequence identity to GB95; GB95, the artificial protein with GB fold and 95% sequence similarity to GA95.
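The idea of a steric clash between two side chains can be sketched as a simple distance test: two atoms are flagged when they sit closer than the sum of their van der Waals radii minus a tolerance. This is a generic illustration, not the procedure used to produce the figure. The function name `find_clashes`, the 0.4 Å tolerance, and the coordinates are assumptions; the radii are the commonly used Bondi values.

```python
import math

# Bondi van der Waals radii in Å for common heavy atoms
VDW_RADIUS = {"C": 1.70, "N": 1.55, "O": 1.52, "S": 1.80}

def find_clashes(atoms_a, atoms_b, tolerance=0.4):
    """Return atom pairs closer than the sum of their vdW radii minus tolerance.

    Each atom is a (name, element, (x, y, z)) tuple. The tolerance is an
    assumed allowance for normal close packing, not a value from the source.
    """
    clashes = []
    for name_a, elem_a, xyz_a in atoms_a:
        for name_b, elem_b, xyz_b in atoms_b:
            limit = VDW_RADIUS[elem_a] + VDW_RADIUS[elem_b] - tolerance
            distance = math.dist(xyz_a, xyz_b)
            if distance < limit:
                clashes.append((name_a, name_b, round(distance, 2)))
    return clashes

# Hypothetical coordinates standing in for a phenylalanine ring carbon
# and two alanine carbons; they are not taken from the GA95 structure.
phe_atoms = [("CZ", "C", (0.0, 0.0, 0.0))]
ala_atoms = [("CB", "C", (2.5, 0.0, 0.0)), ("CA", "C", (4.0, 0.0, 0.0))]
print(find_clashes(phe_atoms, ala_atoms))
```

A threaded sequence that produces many such pairs is physically implausible on that backbone, whereas a sequence that produces none gives a predictor no packing-based reason to reject the fold.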