SSpro/ACCpro 5: almost perfect prediction of protein secondary structure and relative solvent accessibility using profiles, machine learning and structural similarity.

Abstract

MOTIVATION: Accurately predicting protein secondary structure and relative solvent accessibility is important for the study of protein evolution, structure and function and as a component of protein 3D structure prediction pipelines. Most predictors use a combination of machine learning and profiles, and thus must be retrained and assessed periodically as the number of available protein sequences and structures continues to grow.
RESULTS: We present newly trained modular versions of the SSpro and ACCpro predictors of secondary structure and relative solvent accessibility together with their multi-class variants SSpro8 and ACCpro20. We introduce a sharp distinction between the use of sequence similarity alone, typically in the form of sequence profiles at the input level, and the additional use of sequence-based structural similarity, which uses similarity to sequences in the Protein Data Bank to infer annotations at the output level, and study their relative contributions to modern predictors. Using sequence similarity alone, SSpro's accuracy is between 79 and 80% (79% for ACCpro) and no other predictor seems to exceed 82%. However, when sequence-based structural similarity is added, the accuracy of SSpro rises to 92.9% (90% for ACCpro). Thus, by combining both approaches, these problems now appear to be essentially solved, as an accuracy of 100% cannot be expected for several well-known reasons. These results also point to several open technical challenges, including (i) achieving accuracy on the order of ≥ 80% without using any similarity to known proteins and (ii) achieving accuracy on the order of ≥ 85% using sequence similarity alone.
AVAILABILITY AND IMPLEMENTATION: SSpro, SSpro8, ACCpro and ACCpro20 programs, data and web servers are available through the SCRATCH suite of protein structure predictors at http://scratch.proteomics.ics.uci.edu.
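
The distinction drawn in the abstract, between a profile-based prediction and annotations inferred at the output level from similar sequences of known structure, can be illustrated with a toy sketch. This is not the SSpro/ACCpro implementation: the function names, the three-letter H/E/C alphabet, the "-" gap symbol, the majority-vote rule and the example strings are all assumptions made for illustration. The sketch assumes the template annotations have already been aligned to the query, and it measures accuracy as the fraction of residues assigned the correct class.

```python
from collections import Counter
from typing import Sequence


def per_residue_accuracy(predicted: str, observed: str) -> float:
    """Fraction of residues whose predicted class equals the observed class."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("sequences must not be empty")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)


def combine_with_templates(
    ml_prediction: str,
    template_annotations: Sequence[str],
    gap: str = "-",
) -> str:
    """Hypothetical output-level combination (illustrative assumption only).

    For each residue, take the majority class among the aligned template
    annotations that cover it; where no template covers the residue,
    fall back to the machine-learning prediction.
    """
    for annotation in template_annotations:
        if len(annotation) != len(ml_prediction):
            raise ValueError("templates must be aligned to the query length")

    combined = []
    for i, ml_class in enumerate(ml_prediction):
        votes = Counter(t[i] for t in template_annotations if t[i] != gap)
        if votes:
            combined.append(votes.most_common(1)[0][0])
        else:
            combined.append(ml_class)
    return "".join(combined)


# Made-up example data (not from the paper).
observed = "CHHHHEEC"
ml_prediction = "CHHCHECE"             # profile-based prediction, 3 errors
templates = ["-HHHH---", "--HHHEE-"]   # annotations from similar known structures

combined = combine_with_templates(ml_prediction, templates)
print(per_residue_accuracy(ml_prediction, observed))  # 0.625
print(per_residue_accuracy(combined, observed))       # 0.875
```

In this toy case the residues covered by a template are corrected, while the last residue, which no template covers, keeps the erroneous machine-learning prediction.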

Substances:
Proteins
Solvents

Year: 2014 PMID: 24860169 PMCID: PMC4215083 DOI: 10.1093/bioinformatics/btu352

Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
