
Large-scale design and refinement of stable proteins using sequence-only models.

Jedediah M Singer, Scott Novotney, Devin Strickland, Hugh K Haddox, Nicholas Leiby, Gabriel J Rocklin, Cameron M Chow, Anindya Roy, Asim K Bera, Francis C Motta, Longxing Cao, Eva-Maria Strauch, Tamuka M Chidyausiku, Alex Ford, Ethan Ho, Alexander Zaitzeff, Craig O Mackenzie, Hamed Eramian, Frank DiMaio, Gevorg Grigoryan, Matthew Vaughn, Lance J Stewart, David Baker, Eric Klavins.

Abstract

Engineered proteins generally must possess a stable structure in order to achieve their designed function. Stable designs, however, are astronomically rare within the space of all possible amino acid sequences. As a consequence, many designs must be tested computationally and experimentally in order to find stable ones, which is expensive in terms of time and resources. Here we use a high-throughput, low-fidelity assay to experimentally evaluate the stability of approximately 200,000 novel proteins. These include a wide range of sequence perturbations, providing a baseline for future work in the field. We build a neural network model that predicts protein stability given only sequences of amino acids, and compare its performance to the assayed values. We also report another network model that is able to generate the amino acid sequences of novel stable proteins given requested secondary sequences. Finally, we show that the predictive model, despite weaknesses including a noisy data set, can be used to substantially increase the stability of both expert-designed and model-generated proteins.


Year:  2022        PMID: 35286324      PMCID: PMC8920274          DOI: 10.1371/journal.pone.0265020

Source DB:  PubMed          Journal:  PLoS One        ISSN: 1932-6203            Impact factor:   3.240


