Saulo H P de Oliveira, Eleanor C Law, Jiye Shi, Charlotte M Deane.
Abstract
Motivation: Most current de novo structure prediction methods randomly sample protein conformations and thus require large amounts of computational resource. Here, we consider a sequential sampling strategy, building on ideas from recent experimental work which shows that many proteins fold cotranslationally.
Year: 2018 PMID: 29136098 PMCID: PMC6030820 DOI: 10.1093/bioinformatics/btx722
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
Fig. 1. Number of decoys required by SAINT2 to produce a correct answer or a ‘best’ model. We generated 100 000 decoys for each target in the training dataset and estimated both the number of decoys required to produce a correct answer (A) and to produce a ‘best’ model (B). A correct answer is one with TM-Score to the native structure greater than 0.5, and a ‘best’ model is a decoy within 0.05 TM-Score units of the best possible solution produced in the 100 000 decoy ensemble (see Supplementary Fig. S3 for more details). Proteins are coloured according to their SCOP classes.
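The two criteria in the Fig. 1 caption can be illustrated with a minimal Python sketch. It is not the authors' estimation procedure (that is described in Supplementary Fig. S3, which is not reproduced here). It only assumes a list of per-decoy TM-scores in generation order and reports the first decoy that meets each criterion. The function names, the example scores, and the choice to treat "within 0.05" as inclusive are all assumptions made for illustration.

```python
from typing import Optional, Sequence

CORRECT_TM = 0.5    # caption: a correct answer has TM-score > 0.5
BEST_MARGIN = 0.05  # caption: a 'best' model is within 0.05 of the ensemble maximum


def first_correct(tm_scores: Sequence[float]) -> Optional[int]:
    """Return the 1-based index of the first decoy with TM-score > 0.5, or None."""
    for i, score in enumerate(tm_scores, start=1):
        if score > CORRECT_TM:
            return i
    return None


def first_best(tm_scores: Sequence[float]) -> Optional[int]:
    """Return the 1-based index of the first decoy within 0.05 TM-score
    units of the best decoy in the ensemble (inclusive; an assumption)."""
    if not tm_scores:
        return None
    cutoff = max(tm_scores) - BEST_MARGIN
    for i, score in enumerate(tm_scores, start=1):
        if score >= cutoff:
            return i
    return None


if __name__ == "__main__":
    # Hypothetical TM-scores for five decoys, in the order they were generated.
    scores = [0.31, 0.42, 0.55, 0.48, 0.61]
    print(first_correct(scores), first_best(scores))
```

A single ordered run like this gives one observation per target; estimating the expected number of decoys would require repeating it over many orderings or subsamples of the ensemble.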
Fig. 2. Correlation between the number of decoys required by SAINT2 to produce a correct answer and three sequence-based features. The x-axis represents a feature: protein length (A), number of predicted loop positions (B), and SCOP class (C). The y-axis is the number of decoys required to generate a correct answer for the 25 targets in our Training dataset where SAINT2 produced a correct answer.
Fig. 3. Comparison of the TM-Score Best for a validation set of 41 soluble proteins (left) and 24 transmembrane proteins (right) obtained using SAINT2 Forward (x-axis) against SAINT2 Non-sequential (y-axis). Points below the diagonal indicate cases where sequential prediction performs better than non-sequential prediction. Point size indicates protein length and point colour indicates the protein SCOP class.