
Predicting experimental properties of proteins from sequence by machine learning techniques.

Pawel Smialowski, Antonio J Martin-Galiano, Jürgen Cox, Dmitrij Frishman.

Abstract

Efficient target selection methods are an important prerequisite for increasing the success rate and reducing the cost of high-throughput structural genomics efforts. There is high demand for sequence-based methods capable of identifying experimentally tractable proteins and filtering out potentially difficult targets at different stages of the structural genomics pipeline. Simple empirical rules based on anecdotal evidence are increasingly being superseded by rigorous machine learning algorithms. Although simpler methods are easier for humans to understand, more sophisticated, formalized algorithms offer superior classification power. The rapidly growing corpus of experimental success and failure data gathered by structural genomics consortia creates a unique opportunity for retrospective data mining with machine learning techniques and leads to higher-quality classifiers. For example, current solubility prediction methods reach accuracies of over 70%. Furthermore, automated feature selection gives better insight into the nature of the correlation between amino acid sequence and experimental outcome. In this review we summarize methods for predicting experimental success in cloning, expression, soluble expression, purification and crystallization of proteins, with a special focus on publicly available resources. We also describe experimental data repositories and the machine learning techniques used for classification and feature selection.

Year:  2007        PMID: 17430194     DOI: 10.2174/138920307780363398

Source DB:  PubMed          Journal:  Curr Protein Pept Sci        ISSN: 1389-2037            Impact factor:   3.272


  3 in total

1.  Participation of protein sequence termini in crystal contacts.

Authors:  Oliviero Carugo
Journal:  Protein Sci       Date:  2011-11-09       Impact factor: 6.725

2.  Characterization of OglDREB2A gene from African rice (Oryza glaberrima), comparative analysis and its transcriptional regulation under salinity stress.

Authors:  Abubakar Mohammad Gumi; Pritam Kanti Guha; Abhishek Mazumder; Pawan Jayaswal; Tapan Kumar Mondal
Journal:  3 Biotech       Date:  2018-01-23       Impact factor: 2.406

3.  Small-scale, semi-automated purification of eukaryotic proteins for structure determination.

Authors:  Ronnie O Frederick; Lai Bergeman; Paul G Blommel; Lucas J Bailey; Jason G McCoy; Jikui Song; Louise Meske; Craig A Bingman; Megan Riters; Nicholas A Dillon; John Kunert; Jung Whan Yoon; Ahyoung Lim; Michael Cassidy; Jason Bunge; David J Aceti; John G Primm; John L Markley; George N Phillips; Brian G Fox
Journal:  J Struct Funct Genomics       Date:  2007-11-06
