Machine learning in computational biology to accelerate high-throughput protein expression.

Anand Sastry, Jonathan Monk, Hanna Tegel, Mathias Uhlen, Bernhard O Palsson, Johan Rockberg, Elizabeth Brunk.

Abstract

MOTIVATION: The Human Protein Atlas (HPA) enables the simultaneous characterization of thousands of proteins across various tissues to pinpoint their spatial location in the human body. This has been achieved through transcriptomics and high-throughput immunohistochemistry-based approaches, where over 40 000 unique human protein fragments have been expressed in E. coli. These datasets enable quantitative tracking of entire cellular proteomes and present new avenues for understanding molecular-level properties influencing expression and solubility.
RESULTS: Combining computational biology and machine learning identifies protein properties that hinder the HPA high-throughput antibody production pipeline. We predict protein expression and solubility with accuracies of 70% and 80%, respectively, based on a subset of key properties (aromaticity, hydropathy and isoelectric point). We guide the selection of protein fragments based on these characteristics to optimize high-throughput experimentation.
AVAILABILITY AND IMPLEMENTATION: We present the machine learning workflow as a series of IPython notebooks hosted on GitHub (https://github.com/SBRG/Protein_ML). The workflow can be used as a template for analysis of further expression and solubility datasets.
CONTACT: ebrunk@ucsd.edu or johanr@biotech.kth.se.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
© The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com
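
As an illustration of the three sequence properties named in the RESULTS (aromaticity, hydropathy and isoelectric point), the following is a minimal Python sketch of how such features might be computed from an amino-acid sequence. It is not taken from the authors' notebooks. The function names are hypothetical, hydropathy is assumed to mean the mean Kyte-Doolittle (GRAVY) score, aromaticity is assumed to be the fraction of F, W and Y residues, and the pKa values are approximate textbook figures chosen only for illustration.

```python
# Kyte-Doolittle hydropathy scale (one value per standard amino acid).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}
AROMATIC = set("FWY")

# Assumption: approximate textbook pKa values; published pKa sets differ.
POS_PKA = {"K": 10.5, "R": 12.5, "H": 6.0}
NEG_PKA = {"D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1}
N_TERM_PKA, C_TERM_PKA = 9.0, 2.0


def aromaticity(seq):
    """Fraction of residues that are F, W or Y."""
    return sum(aa in AROMATIC for aa in seq) / len(seq)


def gravy(seq):
    """Mean Kyte-Doolittle hydropathy; raises KeyError on non-standard letters."""
    return sum(KD[aa] for aa in seq) / len(seq)


def net_charge(seq, ph):
    """Approximate net charge at a given pH (Henderson-Hasselbalch)."""
    pos = 1.0 / (1.0 + 10 ** (ph - N_TERM_PKA))   # free N-terminus
    neg = 1.0 / (1.0 + 10 ** (C_TERM_PKA - ph))   # free C-terminus
    for aa in seq:
        if aa in POS_PKA:
            pos += 1.0 / (1.0 + 10 ** (ph - POS_PKA[aa]))
        elif aa in NEG_PKA:
            neg += 1.0 / (1.0 + 10 ** (NEG_PKA[aa] - ph))
    return pos - neg


def isoelectric_point(seq, tol=1e-4):
    """pH at which the net charge crosses zero, found by bisection on [0, 14]."""
    lo, hi = 0.0, 14.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if net_charge(seq, mid) > 0:
            lo = mid   # still positively charged: the pI lies at higher pH
        else:
            hi = mid
    return (lo + hi) / 2.0


def features(seq):
    """Feature vector (aromaticity, GRAVY, pI) for one sequence."""
    return aromaticity(seq), gravy(seq), isoelectric_point(seq)
```

Calling `features(fragment_sequence)` on each fragment would give a three-column table that could be passed to any standard classifier; which model and which exact feature definitions the authors used is documented in their GitHub repository, not here.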

Year: 2017  PMID: 28398465  PMCID: PMC5870730  DOI: 10.1093/bioinformatics/btx207

Source DB: PubMed  Journal: Bioinformatics  ISSN: 1367-4803  Impact factor: 6.937

