
Discrimination of soluble and aggregation-prone proteins based on sequence information.

Yaping Fang, Jianwen Fang.

Abstract

Understanding the factors that govern protein solubility is key to understanding its mechanisms and may provide insight into protein aggregation and misfolding-related diseases such as Alzheimer's disease. In this work, we attempt to identify factors important to protein solubility using feature selection. First, we calculate 1438 features, including physicochemical properties and statistics, for each protein. A Random Forest algorithm is then used to select the minimal, most informative subset of features based on predictive performance. A predictive model is built on the 17 selected features. Compared with previous models, our model achieves better performance, with sensitivity 0.82, specificity 0.85, accuracy (ACC) 0.84, area under the ROC curve (AUC) 0.91, and Matthews correlation coefficient (MCC) 0.67. Furthermore, a model trained on a redundancy-reduced dataset (sequence identity ≤ 30%) achieves the same performance as the model without redundancy reduction. Our results provide not only a reliable model for predicting protein solubility but also a list of features important to protein solubility. The predictive model is implemented as a freely available web application.
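For readers unfamiliar with the reported metrics, the following minimal Python sketch shows how sensitivity, specificity, ACC, and MCC are conventionally computed from confusion-matrix counts. The function name `binary_metrics` and the counts in the example are hypothetical: they are chosen only so that the output lands close to the values quoted in the abstract, and they are not the authors' data or code. AUC is omitted because it requires per-protein prediction scores rather than counts.

```python
from math import sqrt


def binary_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Standard binary-classification metrics from confusion-matrix counts.

    tp/fn: positive-class proteins predicted correctly / incorrectly.
    tn/fp: negative-class proteins predicted correctly / incorrectly.
    """
    def ratio(num: float, den: float) -> float:
        # Return 0.0 instead of raising when a class is empty.
        return num / den if den else 0.0

    sensitivity = ratio(tp, tp + fn)            # true positive rate
    specificity = ratio(tn, tn + fp)            # true negative rate
    accuracy = ratio(tp + tn, tp + fn + tn + fp)
    # Matthews correlation coefficient: ranges from -1 to 1.
    mcc_den = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = ratio(tp * tn - fp * fn, mcc_den)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


# Hypothetical counts for a balanced test set of 200 proteins (assumption,
# not taken from the paper).
metrics = binary_metrics(tp=82, fn=18, tn=85, fp=15)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these illustrative counts, sensitivity is 0.82 and specificity is 0.85 by construction, and the accuracy and MCC come out near the reported 0.84 and 0.67. This only shows that the four reported figures are mutually consistent with a roughly balanced test set; it says nothing about the actual dataset used.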


Year:  2013        PMID: 23440081      PMCID: PMC3627541          DOI: 10.1039/c3mb70033j

Source DB:  PubMed          Journal:  Mol Biosyst        ISSN: 1742-2051


