
Classification of protein quaternary structure with support vector machine.

Shao-Wu Zhang, Quan Pan, Hong-Cai Zhang, Yun-Long Zhang, Hai-Yu Wang.

Abstract

MOTIVATION: As the gap between the rapidly growing number of known sequences and the slowly accumulating number of known structures widens, automatic classification based on primary sequences and known three-dimensional structures becomes indispensable. Classifying protein quaternary structure from the primary sequence can provide useful information for biologists, so a fully automatic and reliable classification system is needed. This work seeks effective methods for extracting attributes, and an algorithm for classifying quaternary structure, from primary sequences.
RESULTS: The support vector machine (SVM) and the covariant discriminant algorithm are introduced for the first time to predict quaternary structure properties from protein primary sequences. Both algorithms take into account the amino acid composition and the auto-correlation functions based on the amino acid index profile of the primary sequence. We analyzed 472 amino acid indices and selected, as examples, the four with the best performance. Five attribute parameter data sets (COMP, FASG, NISK, WOLS and KYTJ) were thus established from the protein primary sequences. The COMP attribute data set consists of the amino acid composition alone; the FASG, NISK, WOLS and KYTJ attribute data sets consist of the amino acid composition together with the auto-correlation functions of the corresponding amino acid residue index. In the jackknife test, the overall accuracies of SVM are 78.5, 87.5, 83.2, 81.7 and 81.9% for the COMP, FASG, NISK, WOLS and KYTJ data sets, respectively, which are 19.6, 7.8, 15.5, 13.1 and 15.8% higher, respectively, than those of the covariant discriminant algorithm in the same test. The results show that SVM can be applied to discriminate between the primary sequences of homodimers and non-homodimers, and that the two protein sequence descriptors (amino acid composition and auto-correlation functions) reflect quaternary structure information. Compared with Robert Garian's previous investigation, the performance of SVM is almost equal to that of the decision tree models, and the methods of extracting feature vectors from the primary sequences are superior to Garian's binning function method.
AVAILABILITY: Programs are available on request from the authors.
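The two sequence descriptors named in the abstract can be illustrated with a short sketch. This is not the authors' program (which is available only on request). It assumes one common form of the auto-correlation function, r_n = (1/(L-n)) * sum over i of h_i * h_(i+n), where h_i is the index value of residue i and L is the sequence length. It also assumes that the KYTJ data set corresponds to the Kyte-Doolittle hydropathy index. The function names and the `max_lag` parameter are hypothetical.

```python
from collections import Counter

# The 20 standard amino acids, in a fixed order for the composition vector.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy values (assumed here to be the "KYTJ" index).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def composition(seq):
    """Return the 20 amino acid frequencies of seq (fractions of its length)."""
    if not seq:
        raise ValueError("sequence must not be empty")
    counts = Counter(seq)
    return [counts[aa] / len(seq) for aa in AMINO_ACIDS]


def autocorrelation(seq, index, max_lag):
    """Return r_1..r_max_lag of the index profile of seq.

    Assumed form: r_n = (1 / (L - n)) * sum_{i=1}^{L-n} h_i * h_{i+n}.
    Residues missing from the index raise KeyError.
    """
    profile = [index[aa] for aa in seq]
    length = len(profile)
    if max_lag >= length:
        raise ValueError("max_lag must be smaller than the sequence length")
    return [
        sum(profile[i] * profile[i + lag] for i in range(length - lag))
        / (length - lag)
        for lag in range(1, max_lag + 1)
    ]


def feature_vector(seq, index=KYTE_DOOLITTLE, max_lag=5):
    """Concatenate composition and auto-correlation into one feature vector."""
    return composition(seq) + autocorrelation(seq, index, max_lag)
```

Under these assumptions, a COMP-style vector would be `composition(seq)` alone, while a KYTJ-style vector would be `feature_vector(seq)`. Either could then be passed to any SVM implementation and evaluated with a leave-one-out (jackknife) loop.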

Year:  2003        PMID: 14668222     DOI: 10.1093/bioinformatics/btg331

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  11 in total

1.  Quat-2L: a web-server for predicting protein quaternary structural attributes.

Authors:  Xuan Xiao; Pu Wang; Kuo-Chen Chou
Journal:  Mol Divers       Date:  2010-02-11       Impact factor: 2.943

2.  Machine learning for in silico virtual screening and chemical genomics: new strategies. (Review)

Authors:  Jean-Philippe Vert; Laurent Jacob
Journal:  Comb Chem High Throughput Screen       Date:  2008-09       Impact factor: 1.339

3.  Protein sequences classification by means of feature extraction with substitution matrices.

Authors:  Rabie Saidi; Mondher Maddouri; Engelbert Mephu Nguifo
Journal:  BMC Bioinformatics       Date:  2010-04-08       Impact factor: 3.169

4.  Genome-wide polycomb target gene prediction in Drosophila melanogaster.

Authors:  Jia Zeng; Brian D Kirk; Yufeng Gou; Qinghua Wang; Jianpeng Ma
Journal:  Nucleic Acids Res       Date:  2012-03-13       Impact factor: 16.971

5.  NOXclass: prediction of protein-protein interaction types.

Authors:  Hongbo Zhu; Francisco S Domingues; Ingolf Sommer; Thomas Lengauer
Journal:  BMC Bioinformatics       Date:  2006-01-19       Impact factor: 3.169

6.  SVM-based prediction of caspase substrate cleavage sites.

Authors:  Lawrence J K Wee; Tin Wee Tan; Shoba Ranganathan
Journal:  BMC Bioinformatics       Date:  2006-12-18       Impact factor: 3.169

7.  osFP: a web server for predicting the oligomeric states of fluorescent proteins.

Authors:  Saw Simeon; Watshara Shoombuatong; Nuttapat Anuwongcharoen; Likit Preeyanon; Virapong Prachayasittikul; Jarl E S Wikberg; Chanin Nantasenamat
Journal:  J Cheminform       Date:  2016-12-20       Impact factor: 5.514

8.  Classification of protein quaternary structure by functional domain composition.

Authors:  Xiaojing Yu; Chuan Wang; Yixue Li
Journal:  BMC Bioinformatics       Date:  2006-04-04       Impact factor: 3.169

9.  Quad-PRE: a hybrid method to predict protein quaternary structure attributes.

Authors:  Yajun Sheng; Xingye Qiu; Chen Zhang; Jun Xu; Yanping Zhang; Wei Zheng; Ke Chen
Journal:  Comput Math Methods Med       Date:  2014-05-18       Impact factor: 2.238

10.  QuaBingo: A Prediction System for Protein Quaternary Structure Attributes Using Block Composition.

Authors:  Chi-Hua Tung; Chi-Wei Chen; Ren-Chao Guo; Hui-Fuang Ng; Yen-Wei Chu
Journal:  Biomed Res Int       Date:  2016-08-17       Impact factor: 3.411
