Will my protein crystallize? A sequence-based predictor.

Pawel Smialowski, Thorsten Schmidt, Jürgen Cox, Andreas Kirschner, Dmitrij Frishman.

Abstract

We propose a machine-learning approach to sequence-based prediction of protein crystallizability in which we exploit subtle differences between proteins whose structures were solved by X-ray analysis [or by both X-ray and nuclear magnetic resonance (NMR) spectroscopy] and those proteins whose structures were solved by NMR spectroscopy alone. Because the NMR technique is usually applied to relatively small proteins, the sequence length distributions of the X-ray and NMR datasets were adjusted to avoid predictions biased by protein size. As the feature space for classification, we used frequencies of mono-, di-, and tripeptides represented by the original 20-letter amino acid alphabet as well as by several reduced alphabets in which amino acids were grouped by their physicochemical and structural properties. The classification algorithm was constructed as a two-layered structure in which the outputs of primary support vector machine classifiers operating on peptide frequencies were combined by a second-level Naive Bayes classifier. Due to the application of metamethods for cost sensitivity, our method is able to handle real datasets with unbalanced class representation. An overall prediction accuracy of 67% [65% on the positive (crystallizable) and 69% on the negative (noncrystallizable) class] was achieved in a 10-fold cross-validation experiment, indicating that the proposed algorithm may be a valuable tool for more efficient target selection in structural genomics. A Web server for protein crystallizability prediction called SECRET is available at http://webclu.bio.wzw.tum.de:8080/secret. © 2005 Wiley-Liss, Inc.
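The feature space described in the abstract (mono-, di-, and tripeptide frequencies over the full and a reduced alphabet) can be illustrated with a minimal Python sketch. The six-group reduced alphabet, the function names, and the feature-key format below are assumptions made for illustration; the abstract does not list the groupings the authors used, and the SVM and Naive Bayes classification layers are not shown.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Hypothetical reduced alphabet: an illustrative physicochemical grouping,
# NOT the groupings used in the paper (the abstract does not specify them).
REDUCED_GROUPS = {
    "h": "AVLIMC",  # aliphatic / hydrophobic
    "a": "FWYH",    # aromatic
    "p": "STNQ",    # polar, uncharged
    "+": "KR",      # positively charged
    "-": "DE",      # negatively charged
    "s": "GP",      # special backbone conformation
}


def reduce_sequence(seq, groups=REDUCED_GROUPS):
    """Map each residue to its group symbol (standard 20 residues only)."""
    mapping = {aa: symbol for symbol, members in groups.items() for aa in members}
    return "".join(mapping[aa] for aa in seq.upper())


def kmer_frequencies(seq, k, alphabet):
    """Relative frequency of every possible k-mer over `alphabet` in `seq`."""
    total = len(seq) - k + 1  # number of overlapping windows of length k
    counts = Counter(seq[i:i + k] for i in range(max(total, 0)))
    kmers = ("".join(p) for p in product(alphabet, repeat=k))
    return {kmer: (counts[kmer] / total if total > 0 else 0.0) for kmer in kmers}


def feature_vector(seq):
    """Concatenate 1-, 2-, and 3-mer frequencies for both alphabets."""
    representations = (
        ("full", seq.upper(), AMINO_ACIDS),
        ("reduced", reduce_sequence(seq), "".join(REDUCED_GROUPS)),
    )
    features = {}
    for k in (1, 2, 3):
        for name, text, alphabet in representations:
            for kmer, freq in kmer_frequencies(text, k, alphabet).items():
                features[f"{name}{k}:{kmer}"] = freq
    return features


if __name__ == "__main__":
    example = "MKVLAAGE"  # made-up sequence for demonstration
    vector = feature_vector(example)
    print(reduce_sequence(example), len(vector))
```

In a setup like the one the abstract describes, each block of frequencies (for example, dipeptides over one alphabet) would feed its own primary classifier, and the outputs of those classifiers would then be combined by a second-level classifier.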

Year:  2006        PMID: 16315316     DOI: 10.1002/prot.20789

Source DB:  PubMed          Journal:  Proteins        ISSN: 0887-3585


