
Predicting Protein Model Quality from Sequence Alignments by Support Vector Machines.

Xin Deng, Jilong Li, Jianlin Cheng.

Abstract

Assessing the quality of a protein structure model is essential for protein structure prediction. Here, we developed a Support Vector Machine (SVM) method that predicts the quality score (GDT-TS score) of a protein structure model from features extracted from the sequence alignment used to generate the model. The method takes either a query-single-template pairwise alignment or a query-multitemplate alignment as input. For the pairwise alignment scheme, the input features fed into the SVM predictor include the normalized e-value of the given alignment, the percentage of identical residue pairs in the alignment, the percentage of query residues aligned with template residues, and the sum of the BLOSUM scores of all aligned residues divided by the number of aligned positions. Similarly, for the multiple-alignment scheme, the input features include the percentage of target-sequence residues aligned with residues in one or more templates, the percentage of aligned target-sequence residues that are identical to the aligned residue in any one template, the average BLOSUM score of aligned residues, and the average Gonnet160 score of aligned residues. An SVM regression predictor was trained on the training data to predict the GDT-TS scores of the models from the input features. The Root Mean Square Error (RMSE) and the Absolute Mean Error (ABS) between predicted and real GDT-TS scores were calculated to evaluate performance. Five-fold cross-validation was used to select the best parameter values based on the average RMSE and ABS over the five folds. The RMSE and ABS of the optimized SVM predictor on the testing data were close to 0.1. This performance indicates that integrating sequence alignment features with an SVM is effective for protein model quality assessment.
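As an illustration only, the pairwise-alignment features and the two error measures described in the abstract might be computed along the following lines. This is a minimal sketch, not the authors' implementation: the function names, the gap character `-`, the choice of aligned (gap-free) columns as the denominator for identity, and the toy `match_score` function standing in for a real BLOSUM lookup are all assumptions. The normalized e-value is omitted because the abstract does not say how it is normalized.

```python
import math


def pairwise_features(query_aln, template_aln, score=None):
    """Return (identity, coverage, mean_score) for a gapped pairwise alignment.

    Assumptions (not taken from the paper): gaps are written as '-', and
    identity is measured over columns where both sequences have a residue.
    """
    if len(query_aln) != len(template_aln):
        raise ValueError("aligned strings must have equal length")
    # Columns where neither the query nor the template has a gap.
    aligned = [(q, t) for q, t in zip(query_aln, template_aln)
               if q != "-" and t != "-"]
    query_len = sum(1 for q in query_aln if q != "-")
    if not aligned or query_len == 0:
        return 0.0, 0.0, 0.0
    # Fraction of aligned residue pairs that are identical.
    identity = sum(1 for q, t in aligned if q == t) / len(aligned)
    # Fraction of query residues aligned with a template residue.
    coverage = len(aligned) / query_len
    # Sum of substitution scores divided by the number of aligned positions.
    mean_score = (sum(score(q, t) for q, t in aligned) / len(aligned)
                  if score else 0.0)
    return identity, coverage, mean_score


def rmse(predicted, real):
    """Root mean square error between predicted and real scores."""
    return math.sqrt(sum((p - r) ** 2 for p, r in zip(predicted, real))
                     / len(predicted))


def mean_abs_error(predicted, real):
    """Mean absolute error (called ABS in the abstract)."""
    return sum(abs(p - r) for p, r in zip(predicted, real)) / len(predicted)


def match_score(a, b):
    """Hypothetical stand-in for a BLOSUM lookup: +1 match, -1 mismatch."""
    return 1 if a == b else -1


if __name__ == "__main__":
    print(pairwise_features("ACDEFG-H", "ACE-FGKH", match_score))
    print(rmse([0.5, 0.7], [0.6, 0.5]), mean_abs_error([0.5, 0.7], [0.6, 0.5]))
```

In practice, feature vectors of this kind would be passed to an SVM regression library together with the known GDT-TS scores of the training models, and `rmse` and `mean_abs_error` would be averaged over the five cross-validation folds to compare parameter settings.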

Keywords:  Protein model quality; Protein structure model; Protein structure prediction; Sequence alignment; Support vector machine

Year:  2013        PMID: 26752865      PMCID: PMC4705550          DOI: 10.4172/jpb.S9-001

Source DB:  PubMed          Journal:  J Proteomics Bioinform        ISSN: 0974-276X


