Support Vector Machine-based classification of protein folds using the structural properties of amino acid residues and amino acid residue pairs.

Mohammad Tabrez Anwar Shamim, Mohammad Anwaruddin, H A Nagarajaram.

Abstract

MOTIVATION: Fold recognition is a key step in the protein structure discovery process, especially when traditional sequence comparison methods fail to yield convincing structural homologies. Although many methods have been developed for protein fold recognition, their accuracies remain low. This can be attributed to insufficient exploitation of fold discriminatory features.
RESULTS: We have developed a new method for protein fold recognition using structural information of amino acid residues and amino acid residue pairs. Since protein fold recognition can be treated as a protein fold classification problem, we have developed a Support Vector Machine (SVM) based classifier that uses secondary structural state and solvent accessibility state frequencies of amino acids and amino acid pairs as feature vectors. Among the individual properties examined, secondary structural state frequencies of amino acids gave an overall accuracy of 65.2% for fold discrimination, which is better than the accuracy of any method reported so far in the literature. Combining secondary structural state frequencies with solvent accessibility state frequencies of amino acids and amino acid pairs further improved the fold discrimination accuracy to more than 70%, which is approximately 8% higher than the best available method. In this study we have also tested, for the first time, the all-together multi-class method of Crammer and Singer for protein fold classification. Our studies reveal that the three multi-class classification methods, namely one versus all, one versus one and the Crammer and Singer method, yield similar predictions.
AVAILABILITY: Dataset and stand-alone program are available upon request.
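
To illustrate the kind of feature vector the abstract describes, the following is a minimal sketch of secondary structural state frequencies of amino acids. It is not the authors' program: the function name `ss_state_frequencies`, the three-state alphabet H/E/C, and the normalization by sequence length are all assumptions made for illustration, since the abstract does not specify them.

```python
from collections import Counter
from itertools import product

# The 20 standard amino acids (one-letter codes).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# Assumed three-state secondary structure alphabet: helix, strand, coil.
SS_STATES = "HEC"


def ss_state_frequencies(sequence, ss_string):
    """Return a 60-dimensional vector (20 amino acids x 3 states).

    Each entry is the fraction of residues that are amino acid `a`
    observed in secondary structural state `s`. The normalization by
    sequence length is an assumption, not taken from the paper.
    """
    if len(sequence) != len(ss_string):
        raise ValueError("sequence and ss_string must have equal length")
    if not sequence:
        raise ValueError("sequence must not be empty")
    # Count every (amino acid, state) combination along the chain.
    counts = Counter(zip(sequence, ss_string))
    n = len(sequence)
    return [counts[(a, s)] / n for a, s in product(AMINO_ACIDS, SS_STATES)]


# Hypothetical toy input: three residues with their assigned states.
features = ss_state_frequencies("AAC", "HHE")
```

A vector of this form could then be passed to any SVM implementation; solvent accessibility states and residue pairs would extend the same counting scheme with additional alphabets.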

Year:  2007        PMID: 17989092     DOI: 10.1093/bioinformatics/btm527

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  22 in total

1.  Characterizing the regularity of tetrahedral packing motifs in protein tertiary structure.

Authors:  Ryan Day; Kristin P Lennox; David B Dahl; Marina Vannucci; Jerry W Tsai
Journal:  Bioinformatics       Date:  2010-11-02       Impact factor: 6.937

2.  Identifying anticancer peptides by using a generalized chaos game representation.

Authors:  Li Ge; Jiaguo Liu; Yusen Zhang; Matthias Dehmer
Journal:  J Math Biol       Date:  2018-10-05       Impact factor: 2.259

3.  Protein Science Meets Artificial Intelligence: A Systematic Review and a Biochemical Meta-Analysis of an Inter-Field.

Authors:  Jalil Villalobos-Alva; Luis Ochoa-Toledo; Mario Javier Villalobos-Alva; Atocha Aliseda; Fernando Pérez-Escamirosa; Nelly F Altamirano-Bustamante; Francine Ochoa-Fernández; Ricardo Zamora-Solís; Sebastián Villalobos-Alva; Cristina Revilla-Monsalve; Nicolás Kemper-Valverde; Myriam M Altamirano-Bustamante
Journal:  Front Bioeng Biotechnol       Date:  2022-07-07

4.  Structural alphabets for protein structure classification: a comparison study.

Authors:  Quan Le; Gianluca Pollastri; Patrice Koehl
Journal:  J Mol Biol       Date:  2008-12-25       Impact factor: 5.469

5.  Genome-wide polycomb target gene prediction in Drosophila melanogaster.

Authors:  Jia Zeng; Brian D Kirk; Yufeng Gou; Qinghua Wang; Jianpeng Ma
Journal:  Nucleic Acids Res       Date:  2012-03-13       Impact factor: 16.971

6.  Structural similarity and classification of protein interaction interfaces.

Authors:  Nan Zhao; Bin Pang; Chi-Ren Shyu; Dmitry Korkin
Journal:  PLoS One       Date:  2011-05-12       Impact factor: 3.240

7.  Computational prediction of type III secreted proteins from gram-negative bacteria.

Authors:  Yang Yang; Jiayuan Zhao; Robyn L Morgan; Wenbo Ma; Tao Jiang
Journal:  BMC Bioinformatics       Date:  2010-01-18       Impact factor: 3.169

8.  Automatic structure classification of small proteins using random forest.

Authors:  Pooja Jain; Jonathan D Hirst
Journal:  BMC Bioinformatics       Date:  2010-07-01       Impact factor: 3.169

9.  Sequence-based classification using discriminatory motif feature selection.

Authors:  Hao Xiong; Daniel Capurso; Saunak Sen; Mark R Segal
Journal:  PLoS One       Date:  2011-11-10       Impact factor: 3.240

10.  Prediction of vitamin interacting residues in a vitamin binding protein using evolutionary information.

Authors:  Bharat Panwar; Sudheer Gupta; Gajendra P S Raghava
Journal:  BMC Bioinformatics       Date:  2013-02-07       Impact factor: 3.169
