Will my protein crystallize? A sequence-based predictor.

Pawel Smialowski, Thorsten Schmidt, Jürgen Cox, Andreas Kirschner, Dmitrij Frishman.

Abstract

We propose a machine-learning approach to sequence-based prediction of protein crystallizability in which we exploit subtle differences between proteins whose structures were solved by X-ray analysis [or by both X-ray and nuclear magnetic resonance (NMR) spectroscopy] and those proteins whose structures were solved by NMR spectroscopy alone. Because the NMR technique is usually applied to relatively small proteins, sequence length distributions of the X-ray and NMR datasets were adjusted to avoid predictions biased by protein size. As feature space for classification, we used frequencies of mono-, di-, and tripeptides represented by the original 20-letter amino acid alphabet as well as by several reduced alphabets in which amino acids were grouped by their physicochemical and structural properties. The classification algorithm was constructed as a two-layered structure in which the outputs of primary support vector machine classifiers operating on peptide frequencies were combined by a second-level Naive Bayes classifier. Due to the application of metamethods for cost sensitivity, our method is able to handle real datasets with unbalanced class representation. An overall prediction accuracy of 67% [65% on the positive (crystallizable) and 69% on the negative (noncrystallizable) class] was achieved in a 10-fold cross-validation experiment, indicating that the proposed algorithm may be a valuable tool for more efficient target selection in structural genomics. A Web server for protein crystallizability prediction called SECRET is available at http://webclu.bio.wzw.tum.de:8080/secret. © 2005 Wiley-Liss, Inc.
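
The feature space described in the abstract (mono-, di-, and tripeptide frequencies over the full and reduced amino acid alphabets) can be illustrated with a minimal sketch. The abstract does not list the reduced alphabets that were used, so the grouping `REDUCED_GROUPS` below is a hypothetical example, and the function names `reduce_sequence` and `peptide_frequencies` are introduced here for illustration only. The sketch covers feature extraction alone, not the SVM and Naive Bayes layers.

```python
from collections import Counter
from itertools import product

FULL_ALPHABET = "ACDEFGHIKLMNPQRSTVWY"

# Hypothetical reduced alphabet (an assumption for illustration, not the
# grouping used in the paper): each label stands for a group of residues.
REDUCED_GROUPS = {
    "h": "AVLIMC",  # aliphatic / hydrophobic
    "a": "FWYH",    # aromatic
    "p": "STNQ",    # polar, uncharged
    "+": "KR",      # positively charged
    "-": "DE",      # negatively charged
    "s": "GP",      # special backbone conformations
}


def reduce_sequence(seq, groups=REDUCED_GROUPS):
    """Map each residue to its group label; unknown residues raise KeyError."""
    lookup = {aa: label for label, members in groups.items() for aa in members}
    return "".join(lookup[aa] for aa in seq.upper())


def peptide_frequencies(seq, k, alphabet):
    """Relative frequency of every possible k-mer over `alphabet`.

    The result has one entry per possible k-mer (len(alphabet) ** k),
    so every sequence yields a feature vector of the same length.
    """
    n_windows = len(seq) - k + 1
    if n_windows < 1:
        raise ValueError("sequence is shorter than k")
    counts = Counter(seq[i:i + k] for i in range(n_windows))
    kmers = ("".join(p) for p in product(alphabet, repeat=k))
    return {kmer: counts[kmer] / n_windows for kmer in kmers}


if __name__ == "__main__":
    seq = "MKVLAAGE"  # toy sequence, not a real protein
    mono = peptide_frequencies(seq, 1, FULL_ALPHABET)
    reduced = reduce_sequence(seq)
    di = peptide_frequencies(reduced, 2, "".join(REDUCED_GROUPS))
    print(len(mono), len(di))
```

Reduced alphabets keep the di- and tripeptide feature vectors small: with the six groups assumed here, tripeptides give 216 features instead of the 8000 produced by the 20-letter alphabet.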

Substances:
Amino Acids
Proteins

Year: 2006 PMID: 16315316 DOI: 10.1002/prot.20789

Source DB: PubMed Journal: Proteins ISSN: 0887-3585

Cited

30 in total

1. Target selection for structural genomics based on combining fold recognition and crystallisation prediction methods: application to the human proteome.

Authors: James E Bray
Journal: J Struct Funct Genomics Date: 2012-02-22

2. Predicting protein crystallization propensity from protein sequence.

Authors: György Babnigg; Andrzej Joachimiak
Journal: J Struct Funct Genomics Date: 2010-02-23

3. The challenge of protein structure determination--lessons from structural genomics.

Authors: Lukasz Slabinski; Lukasz Jaroszewski; Ana P C Rodrigues; Leszek Rychlewski; Ian A Wilson; Scott A Lesley; Adam Godzik
Journal: Protein Sci Date: 2007-11 Impact factor: 6.725

4. Critical evaluation of bioinformatics tools for the prediction of protein crystallization propensity. (Review)

Authors: Huilin Wang; Liubin Feng; Geoffrey I Webb; Lukasz Kurgan; Jiangning Song; Donghai Lin
Journal: Brief Bioinform Date: 2018-09-28 Impact factor: 11.622

5. Improving the chances of successful protein structure determination with a random forest classifier.

Authors: Samad Jahandideh; Lukasz Jaroszewski; Adam Godzik
Journal: Acta Crystallogr D Biol Crystallogr Date: 2014-02-15

6. Intrinsic disorder and functional proteomics. (Review)

Authors: Predrag Radivojac; Lilia M Iakoucheva; Christopher J Oldfield; Zoran Obradovic; Vladimir N Uversky; A Keith Dunker
Journal: Biophys J Date: 2006-12-08 Impact factor: 4.033

7. Computational crystallization. (Review)

Authors: Irem Altan; Patrick Charbonneau; Edward H Snell
Journal: Arch Biochem Biophys Date: 2016-01-11 Impact factor: 4.013

8. The "Sticky Patch" Model of Crystallization and Modification of Proteins for Enhanced Crystallizability. (Review)

Authors: Zygmunt S Derewenda; Adam Godzik
Journal: Methods Mol Biol Date: 2017

9. Lessons from structural genomics. (Review)

Authors: Thomas C Terwilliger; David Stuart; Shigeyuki Yokoyama
Journal: Annu Rev Biophys Date: 2009 Impact factor: 12.981

10. Predicting protein-ATP binding sites from primary sequence through fusing bi-profile sampling of multi-view features.

Authors: Ya-Nan Zhang; Dong-Jun Yu; Shu-Sen Li; Yong-Xian Fan; Yan Huang; Hong-Bin Shen
Journal: BMC Bioinformatics Date: 2012-05-31 Impact factor: 3.169