Prediction of protein structure classes by incorporating different protein descriptors into general Chou's pseudo amino acid composition.

Loris Nanni, Sheryl Brahnam, Alessandra Lumini.

Abstract

Successful protein structure identification enables researchers to estimate the biological functions of proteins, yet it remains a challenging problem. The most common way to determine an unknown protein's structural class is to perform expensive and time-consuming manual experiments. Because amino-acid sequences are widely available in the post-genomic age, an unknown protein's structural class can instead be predicted with machine learning methods from its amino-acid sequence and/or its secondary structural elements. Following recent research in this area, we propose a new machine learning system that combines several protein descriptors extracted from different protein representations, such as the position-specific scoring matrix (PSSM), the amino-acid sequence, and secondary structural sequences. The prediction engine of our system is an ensemble of support vector machines (SVMs), where each SVM is trained on a different descriptor. The outputs of the SVMs are combined by the sum rule. Our final ensemble produces a success rate that is substantially better than previously reported results on three well-established datasets. The MATLAB code and datasets used in our experiments are freely available for future comparison at http://www.dei.unipd.it/node/2357.
Copyright © 2014 Elsevier Ltd. All rights reserved.
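
The sum-rule fusion named in the abstract can be illustrated with a minimal sketch. This is not the authors' MATLAB code: the function name `fuse_by_sum_rule`, the class labels, and the score values are all hypothetical, and the scores stand in for whatever per-class outputs each descriptor-specific SVM would produce. The sketch only shows the fusion step: per-class scores from each classifier are added, and the class with the largest total is predicted.

```python
from typing import Dict, Sequence


def fuse_by_sum_rule(score_sets: Sequence[Dict[str, float]]) -> str:
    """Add per-class scores across classifiers and return the top-scoring class.

    Each element of `score_sets` maps a class label to the score that one
    classifier assigned to it (hypothetical inputs, for illustration only).
    """
    if not score_sets:
        raise ValueError("need scores from at least one classifier")
    classes = set(score_sets[0])
    totals = {label: 0.0 for label in classes}
    for scores in score_sets:
        if set(scores) != classes:
            raise ValueError("all classifiers must score the same classes")
        for label, score in scores.items():
            totals[label] += score
    # Sort labels first so that ties are broken deterministically.
    return max(sorted(totals), key=lambda label: totals[label])


# Made-up scores from three classifiers, one per protein representation.
pssm_scores = {"all-alpha": 0.6, "all-beta": 0.1, "alpha/beta": 0.2, "alpha+beta": 0.1}
sequence_scores = {"all-alpha": 0.2, "all-beta": 0.1, "alpha/beta": 0.5, "alpha+beta": 0.2}
secondary_scores = {"all-alpha": 0.2, "all-beta": 0.1, "alpha/beta": 0.5, "alpha+beta": 0.2}

predicted = fuse_by_sum_rule([pssm_scores, sequence_scores, secondary_scores])
print(predicted)
```

In this made-up example the PSSM-based classifier alone would favour "all-alpha", but the summed scores favour "alpha/beta", which is the kind of disagreement an ensemble is meant to resolve.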

Keywords:  Ensemble of classifiers; Machine learning; Protein descriptors; Protein structure class; Support vector machines

Year: 2014    PMID: 25026218    DOI: 10.1016/j.jtbi.2014.07.003

Source DB: PubMed    Journal: J Theor Biol    ISSN: 0022-5193    Impact factor: 2.691


