A statistical model for improved membrane protein expression using sequence-derived features.

Shyam M Saladi, Nauman Javed, Axel Müller, William M Clemons.

Abstract

The heterologous expression of integral membrane proteins (IMPs) remains a major bottleneck in the characterization of this important protein class. IMP expression levels are currently unpredictable, which renders the pursuit of IMPs for structural and biophysical characterization challenging and inefficient. Experimental evidence demonstrates that changes within the nucleotide or amino acid sequence for a given IMP can dramatically affect expression levels, yet these observations have not resulted in generalizable approaches to improve expression levels. Here, we develop a data-driven statistical predictor named IMProve that, using only sequence information, increases the likelihood of selecting an IMP that expresses in Escherichia coli. The IMProve model, trained on experimental data, combines a set of sequence-derived features into an IMProve score, where higher values indicate a higher probability of successful expression. The model is rigorously validated against a variety of independent data sets that contain a wide range of experimental outcomes from various IMP expression trials. The results demonstrate that use of the model can more than double the number of successfully expressed targets at any experimental scale. IMProve can immediately be used to identify favorable targets for characterization. Most notably, IMProve demonstrates for the first time that IMP expression levels can be predicted directly from sequence.
© 2018 by The American Society for Biochemistry and Molecular Biology, Inc.
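
The general idea in the abstract, sequence-derived features combined into one score and candidates ranked so that higher scores are tried first, can be illustrated with a minimal sketch. This is not the published IMProve model: the three features, the weights in `WEIGHTS`, the weighted-sum form, and all function names are hypothetical choices made only for illustration.

```python
# Illustrative sketch only. The features, weights, and linear form below are
# hypothetical and are NOT the published IMProve model or its parameters.

HYDROPHOBIC = set("AILMFVWY")  # assumed hydrophobic residue set for this example

def sequence_features(nt_seq, aa_seq):
    """Compute a few simple features from nucleotide and amino acid sequences."""
    nt = nt_seq.upper()
    aa = aa_seq.upper()
    return {
        "gc_content": sum(base in "GC" for base in nt) / len(nt),
        "hydrophobic_fraction": sum(res in HYDROPHOBIC for res in aa) / len(aa),
        "positive_fraction": sum(res in "KR" for res in aa) / len(aa),
    }

# Hypothetical weights; a real predictor would learn these from expression data.
WEIGHTS = {
    "gc_content": -0.5,
    "hydrophobic_fraction": 1.0,
    "positive_fraction": 2.0,
}

def linear_score(features, weights=WEIGHTS):
    """Combine features into a single score as a weighted sum."""
    return sum(weights[name] * value for name, value in features.items())

def rank_candidates(candidates):
    """Return (name, score) pairs sorted from highest to lowest score.

    `candidates` maps a name to a (nucleotide sequence, amino acid sequence) pair.
    """
    scored = [
        (name, linear_score(sequence_features(nt, aa)))
        for name, (nt, aa) in candidates.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

Under these assumptions, a caller would pass a dictionary of candidate sequences to `rank_candidates` and attempt expression starting from the top of the returned list.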

Keywords:  computational biology; machine-learning; membrane biogenesis; membrane biophysics; membrane protein; prediction; protein expression; structural biology

Year: 2018; PMID: 29378850; PMCID: PMC5880134; DOI: 10.1074/jbc.RA117.001052

Source DB: PubMed; Journal: J Biol Chem; ISSN: 0021-9258; Impact factor: 5.157


  3 in total

1.  Learned protein embeddings for machine learning.

Authors:  Kevin K Yang; Zachary Wu; Claire N Bedbrook; Frances H Arnold
Journal:  Bioinformatics       Date:  2018-08-01       Impact factor: 6.937

2.  Towards generalizable predictions for G protein-coupled receptor variant expression.

Authors:  Charles P Kuntz; Hope Woods; Andrew G McKee; Nathan B Zelt; Jeffrey L Mendenhall; Jens Meiler; Jonathan P Schlebach
Journal:  Biophys J       Date:  2022-06-17       Impact factor: 3.699

3.  TMCrys: predict propensity of success for transmembrane protein crystallization.

Authors:  Julia K Varga; Gábor E Tusnády
Journal:  Bioinformatics       Date:  2018-09-15       Impact factor: 6.937
