
Context-based features enhance protein secondary structure prediction accuracy.

Ashraf Yaseen, Yaohang Li.

Abstract

We report a new approach that uses statistical context-based scores as encoded features to train neural networks, improving secondary structure prediction accuracy. The context-based scores are pseudo-potentials derived by evaluating statistical, high-order inter-residue interactions; they estimate how favorable it is for a residue to adopt a given secondary structure conformation within its amino acid environment. Encoding these scores as training and prediction features addresses a long-standing difficulty in neural network-based secondary structure prediction: taking into account the interdependency among the secondary structures of neighboring residues. Our computational results show that the context-based scores are effective features for enhancing prediction accuracy. An overall 7-fold cross-validated Q3 accuracy of 82.74% and a segment overlap (SOV) accuracy of 86.25% are achieved on a set of more than 7987 protein chains with at most 25% sequence identity. The Q3 accuracy on the CB513, Manesh215, and Carugo338 benchmarks, as well as on CASP9 protein chains, is higher than that of widely used secondary structure prediction servers, including Psipred, Profphd, Jpred, Porter (ab initio), and Netsurf. The improvement in SOV accuracy is larger: more than 4% over the server with the best SOV accuracy. A Q8 accuracy of >70% (71.5%) is also obtained in eight-state secondary structure prediction. Most of the Q3 accuracy improvement comes from correctly identifying β-sheets and α-helices. When the context-based scores are incorporated, 15.5% more residues are predicted with >90% confidence. These high-confidence predictions are usually highly accurate (approximately 95% on average). The three- and eight-state prediction servers (SCORPION) implementing our methods are available online.
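
For readers unfamiliar with the Q3 and Q8 measures quoted in the abstract, the following is a minimal sketch of how per-residue accuracy is commonly computed: the percentage of residues whose predicted state matches the observed state. The function names, the example strings, and the eight-to-three state reduction (H, G, I to helix; E, B to strand; everything else to coil) are assumptions chosen for illustration. The reduction is one common convention and is not necessarily the mapping used by the authors. SOV is a segment-based measure and is not covered here.

```python
# Assumed 8-to-3 state reduction (one common DSSP convention; the mapping
# used in the paper is not stated in the abstract).
DSSP_8_TO_3 = {
    "H": "H", "G": "H", "I": "H",   # helices -> H
    "E": "E", "B": "E",             # strands/bridges -> E
    "T": "C", "S": "C", "-": "C",   # turns, bends, other -> C (coil)
}


def reduce_to_three_states(ss8: str) -> str:
    """Map an eight-state secondary structure string to three states."""
    return "".join(DSSP_8_TO_3[state] for state in ss8)


def q_accuracy(predicted: str, observed: str) -> float:
    """Percentage of residues whose predicted state equals the observed state.

    Gives Q8 on eight-state strings and Q3 on three-state strings.
    """
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("sequences must not be empty")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(observed)


# Hypothetical example strings, not data from the paper.
observed_ss8 = "-HHHHGGT-EEEB-"
predicted_ss8 = "-HHHHHHT-EEEE-"

q8 = q_accuracy(predicted_ss8, observed_ss8)
q3 = q_accuracy(
    reduce_to_three_states(predicted_ss8),
    reduce_to_three_states(observed_ss8),
)
print(f"Q8 = {q8:.2f}%, Q3 = {q3:.2f}%")
```

In this toy case the three mismatches (G predicted as H twice, B predicted as E once) disappear after the reduction, which illustrates why Q3 is typically higher than Q8 for the same predictor.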

Year: 2014   PMID: 24571803   DOI: 10.1021/ci400647u

Source DB: PubMed   Journal: J Chem Inf Model   ISSN: 1549-9596   Impact factor: 4.956


  14 in total

1.  MUFOLD-SS: New deep inception-inside-inception networks for protein secondary structure prediction.

Authors:  Chao Fang; Yi Shang; Dong Xu
Journal:  Proteins       Date:  2018-03-12

2.  Deep Ensemble Learning with Atrous Spatial Pyramid Networks for Protein Secondary Structure Prediction.

Authors:  Yuzhi Guo; Jiaxiang Wu; Hehuan Ma; Sheng Wang; Junzhou Huang
Journal:  Biomolecules       Date:  2022-06-02

3.  Improving prediction of secondary structure, local backbone angles, and solvent accessible surface area of proteins by iterative deep learning.

Authors:  Rhys Heffernan; Kuldip Paliwal; James Lyons; Abdollah Dehzangi; Alok Sharma; Jihua Wang; Abdul Sattar; Yuedong Yang; Yaoqi Zhou
Journal:  Sci Rep       Date:  2015-06-22       Impact factor: 4.379

4.  Bayesian model of protein primary sequence for secondary structure prediction.

Authors:  Qiwei Li; David B Dahl; Marina Vannucci; Jerry W Tsai
Journal:  PLoS One       Date:  2014-10-14       Impact factor: 3.240

5.  FLEXc: protein flexibility prediction using context-based statistics, predicted structural features, and sequence information.

Authors:  Ashraf Yaseen; Mais Nijim; Brandon Williams; Lei Qian; Min Li; Jianxin Wang; Yaohang Li
Journal:  BMC Bioinformatics       Date:  2016-08-31       Impact factor: 3.169

6.  Sixty-five years of the long march in protein secondary structure prediction: the final stretch?

Authors:  Yuedong Yang; Jianzhao Gao; Jihua Wang; Rhys Heffernan; Jack Hanson; Kuldip Paliwal; Yaoqi Zhou
Journal:  Brief Bioinform       Date:  2018-05-01       Impact factor: 11.622

7.  Protein secondary structure prediction using a small training set (compact model) combined with a Complex-valued neural network approach.

Authors:  Shamima Rashid; Saras Saraswathi; Andrzej Kloczkowski; Suresh Sundaram; Andrzej Kolinski
Journal:  BMC Bioinformatics       Date:  2016-09-13       Impact factor: 3.169

8.  SLALOM, a flexible method for the identification and statistical analysis of overlapping continuous sequence elements in sequence- and time-series data.

Authors:  Roman Prytuliak; Friedhelm Pfeiffer; Bianca Hermine Habermann
Journal:  BMC Bioinformatics       Date:  2018-01-26       Impact factor: 3.169

9.  CNNH_PSS: protein 8-class secondary structure prediction by convolutional neural network with highway.

Authors:  Jiyun Zhou; Hongpeng Wang; Zhishan Zhao; Ruifeng Xu; Qin Lu
Journal:  BMC Bioinformatics       Date:  2018-05-08       Impact factor: 3.169

10.  Boosting the accuracy of protein secondary structure prediction through nearest neighbor search and method hybridization.

Authors:  Spencer Krieger; John Kececioglu
Journal:  Bioinformatics       Date:  2020-07-01       Impact factor: 6.937
