
Mining the structural genomics pipeline: identification of protein properties that affect high-throughput experimental analysis.

Chern-Sing Goh, Ning Lan, Shawn M Douglas, Baolin Wu, Nathaniel Echols, Andrew Smith, Duncan Milburn, Gaetano T Montelione, Hongyu Zhao, Mark Gerstein.

Abstract

Structural genomics projects represent major undertakings that will change our understanding of proteins. They generate unique datasets that, for the first time, present a standardized view of proteins in terms of their physical and chemical properties. By analyzing these datasets, we are able to discover correlations between a protein's characteristics and its progress through each stage of the structural genomics pipeline, from cloning through expression and purification to, ultimately, structural determination. First, we use tree-based analyses (decision trees and random forest algorithms) to discover the most significant protein features that influence a protein's amenability to high-throughput experimentation. Based on this, we identify potential bottlenecks in various stages of the structural genomics process through specialized "pipeline schematics". We find that the most significant properties of a protein are: (i) whether it is conserved across many organisms; (ii) the percentage composition of charged residues; (iii) the occurrence of hydrophobic patches; (iv) the number of binding partners it has; and (v) its length. Conversely, a number of other properties that might have been thought to be important, such as nuclear localization signals, are not significant. Thus, using our tree-based analyses, we are able to identify combinations of features that best differentiate the small group of proteins for which a structure has been determined from all the currently selected targets. This information may prove useful in optimizing high-throughput experimentation. Further information is available from http://mining.nesg.org/.
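Three of the properties named in the abstract (length, percentage of charged residues, and hydrophobic patches) can be derived from the amino acid sequence alone. The following is a minimal sketch of such a feature extractor, not the authors' implementation. The function name `sequence_features`, the residue sets, and the definition of a "hydrophobic patch" as a run of at least `min_patch` consecutive hydrophobic residues are all assumptions made for illustration; the abstract does not specify them.

```python
# Illustrative sketch only. The residue groupings and the patch definition
# below are assumptions, not taken from the paper.
CHARGED = set("DEKR")          # assumed: Asp, Glu, Lys, Arg (His excluded)
HYDROPHOBIC = set("AVILMFWC")  # assumed hydrophobic residue set


def sequence_features(seq, min_patch=5):
    """Return length, percent charged residues, and hydrophobic patch count.

    A "patch" is assumed to be a run of at least `min_patch` consecutive
    hydrophobic residues; each run is counted once.
    """
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")

    length = len(seq)
    charged_pct = 100.0 * sum(r in CHARGED for r in seq) / length

    patches = 0
    run = 0
    for residue in seq:
        if residue in HYDROPHOBIC:
            run += 1
            if run == min_patch:  # count the run once, when it first qualifies
                patches += 1
        else:
            run = 0

    return {
        "length": length,
        "charged_pct": charged_pct,
        "hydrophobic_patches": patches,
    }


if __name__ == "__main__":
    print(sequence_features("DEKRAAAAAG"))
```

Feature dictionaries of this kind could then be passed, together with per-target pipeline outcomes, to a decision tree or random forest learner of the sort the abstract describes. Conservation across organisms and the number of binding partners require external data and are not covered by this sketch.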

Year:  2004        PMID: 14741208     DOI: 10.1016/j.jmb.2003.11.053

Source DB:  PubMed          Journal:  J Mol Biol        ISSN: 0022-2836            Impact factor:   5.469


  50 in total

1.  Conference report--structural genomics: parsing the architecture of proteins highlights of the ABRF 2004--integrating technologies in proteomics and genomics, February 28-March 2, 2004; Portland, Oregon.

Authors:  Sara M Mariani
Journal:  MedGenMed       Date:  2004-04-16

2.  Multiple post-translational modifications affect heterologous protein synthesis.

Authors:  Alexander A Tokmakov; Atsushi Kurotani; Tetsuo Takagi; Mitsutoshi Toyama; Mikako Shirouzu; Yasuo Fukami; Shigeyuki Yokoyama
Journal:  J Biol Chem       Date:  2012-06-06       Impact factor: 5.157

3.  Protein production and crystallization at the joint center for structural genomics.

Authors:  Scott A Lesley; Ian A Wilson
Journal:  J Struct Funct Genomics       Date:  2005

4.  Effect of N-terminal solubility enhancing fusion proteins on yield of purified target protein.

Authors:  Martin Hammarström; Esmeralda A Woestenenk; Niklas Hellgren; Torleif Härd; Helena Berglund
Journal:  J Struct Funct Genomics       Date:  2006-07-19

5.  The challenge of protein structure determination--lessons from structural genomics.

Authors:  Lukasz Slabinski; Lukasz Jaroszewski; Ana P C Rodrigues; Leszek Rychlewski; Ian A Wilson; Scott A Lesley; Adam Godzik
Journal:  Protein Sci       Date:  2007-11       Impact factor: 6.725

6.  Effect of low-complexity regions on protein structure determination.

Authors:  Ryan M Bannen; Craig A Bingman; George N Phillips
Journal:  J Struct Funct Genomics       Date:  2008-02-27

7.  High-throughput crystallization-to-structure pipeline at RIKEN SPring-8 Center.

Authors:  Michihiro Sugahara; Yukuhiko Asada; Katsumi Shimizu; Hitoshi Yamamoto; Neratur K Lokanath; Hisashi Mizutani; Bagautdin Bagautdinov; Yoshinori Matsuura; Midori Taketa; Yuichi Kageyama; Naoko Ono; Yuko Morikawa; Yukiko Tanaka; Hiroki Shimada; Takanobu Nakamoto; Mitsuaki Sugahara; Masaki Yamamoto; Naoki Kunishima
Journal:  J Struct Funct Genomics       Date:  2008-08-02

8.  Correlation Between Protein Primary Structure and Soluble Expression Level of HSA dAb in Escherichia coli.

Authors:  Yankun Yang; Guoqiang Liu; Meng Liu; Zhonghu Bai; Xiuxia Liu; Xiaofeng Dai; Wenwen Guo
Journal:  Food Technol Biotechnol       Date:  2018-03       Impact factor: 3.918

9.  Understanding the physical properties that control protein crystallization by analysis of large-scale experimental data.

Authors:  W Nicholson Price; Yang Chen; Samuel K Handelman; Helen Neely; Philip Manor; Richard Karlin; Rajesh Nair; Jinfeng Liu; Michael Baran; John Everett; Saichiu N Tong; Farhad Forouhar; Swarup S Swaminathan; Thomas Acton; Rong Xiao; Joseph R Luft; Angela Lauricella; George T DeTitta; Burkhard Rost; Gaetano T Montelione; John F Hunt
Journal:  Nat Biotechnol       Date:  2009-01       Impact factor: 54.908

10.  Characteristics affecting expression and solubilization of yeast membrane proteins.

Authors:  Michael A White; Kathleen M Clark; Elizabeth J Grayhack; Mark E Dumont
Journal:  J Mol Biol       Date:  2006-10-06       Impact factor: 5.469
