
Accurate prediction of solvent accessibility using neural networks-based regression.

Rafał Adamczak, Aleksey Porollo, Jarosław Meller.

Abstract

Accurate prediction of relative solvent accessibilities (RSAs) of amino acid residues in proteins may be used to facilitate protein structure prediction and functional annotation. Toward that goal we developed a novel method for improved prediction of RSAs. Contrary to other machine learning-based methods from the literature, we do not impose a classification problem with arbitrary boundaries between the classes. Instead, we seek a continuous approximation of the real-value RSA using nonlinear regression, with several feed-forward and recurrent neural networks, which are then combined into a consensus predictor. A set of 860 protein structures derived from the PFAM database was used for training, whereas validation of the results was carefully performed on several nonredundant control sets comprising a total of 603 structures, which were derived from new Protein Data Bank structures and had no homology to proteins included in the training set. Two classes of alternative predictors were developed for comparison with the regression-based approach: one based on the standard classification approach and the other based on a semicontinuous approximation with the so-called thermometer encoding. Furthermore, a weighted approximation, with errors being scaled by the observed levels of variability in RSA for equivalent residues in families of homologous structures, was applied in order to improve the results. The effects of including evolutionary profiles and the growth of sequence databases were assessed. In accord with the observed levels of variability in RSA for different ranges of RSA values, the regression accuracy is higher for buried than for exposed residues, with overall 15.3-15.8% mean absolute errors and correlation coefficients between the predicted and experimental values of 0.64-0.67 on different control sets.
The new method outperforms classification-based algorithms when the real value predictions are projected onto two-class classification problems with several commonly used thresholds to separate exposed and buried residues. For example, classification accuracy of about 77% is consistently achieved on all control sets with a threshold of 25% RSA. A web server that enables RSA prediction using the new method and provides customizable graphical representation of the results is available at http://sable.cchmc.org. Copyright 2004 Wiley-Liss, Inc.
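
The evaluation ideas named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the ten equal-width bins used for the thermometer encoding, the convention that a residue counts as exposed when its RSA is at or above the threshold, and the four example values are all assumptions made for illustration only.

```python
from math import sqrt


def thermometer_encode(rsa, n_bins=10):
    """Cumulative 0/1 encoding of an RSA value given in percent (0-100).

    Unit k (1-based) is switched on when rsa reaches k * 100 / n_bins.
    The bin count and bin edges are assumptions, not taken from the paper.
    """
    step = 100.0 / n_bins
    return [1 if rsa >= (k + 1) * step else 0 for k in range(n_bins)]


def mean_absolute_error(predicted, observed):
    """Average absolute difference between predicted and observed RSA."""
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(observed)


def pearson_correlation(predicted, observed):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(observed)
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    return cov / sqrt(var_p * var_o)


def two_class_accuracy(predicted, observed, threshold=25.0):
    """Project real-valued RSA onto buried/exposed and score agreement.

    A residue is labelled exposed when RSA >= threshold (assumed convention).
    """
    hits = sum(
        (p >= threshold) == (o >= threshold)
        for p, o in zip(predicted, observed)
    )
    return hits / len(observed)


# Made-up RSA values (percent) for four residues, for illustration only.
observed = [5.0, 40.0, 80.0, 20.0]
predicted = [10.0, 30.0, 60.0, 30.0]

mae = mean_absolute_error(predicted, observed)
corr = pearson_correlation(predicted, observed)
acc = two_class_accuracy(predicted, observed, threshold=25.0)
code_25 = thermometer_encode(25.0)
```

The sketch shows why a real-valued predictor can be scored at any threshold after the fact: the same predictions are reused with a different `threshold` argument, whereas a classifier trained on fixed class boundaries would have to be retrained.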

Year:  2004        PMID: 15281128     DOI: 10.1002/prot.20176

Source DB:  PubMed          Journal:  Proteins        ISSN: 0887-3585


