De novo protein design by deep network hallucination.

Ivan Anishchenko, Samuel J Pellock, Tamuka M Chidyausiku, Theresa A Ramelot, Sergey Ovchinnikov, Jingzhou Hao, Khushboo Bafna, Christoffer Norn, Alex Kang, Asim K Bera, Frank DiMaio, Lauren Carter, Cameron M Chow, Gaetano T Montelione, David Baker.

Abstract

There has been considerable recent progress in protein structure prediction using deep neural networks to predict inter-residue distances from amino acid sequences (refs. 1-3). Here we investigate whether the information captured by such networks is sufficiently rich to generate new folded proteins with sequences unrelated to those of the naturally occurring proteins used in training the models. We generate random amino acid sequences, and input them into the trRosetta structure prediction network to predict starting residue-residue distance maps, which, as expected, are quite featureless. We then carry out Monte Carlo sampling in amino acid sequence space, optimizing the contrast (Kullback-Leibler divergence) between the inter-residue distance distributions predicted by the network and background distributions averaged over all proteins. Optimization from different random starting points resulted in novel proteins spanning a wide range of sequences and predicted structures. We obtained synthetic genes encoding 129 of the network-'hallucinated' sequences, and expressed and purified the proteins in Escherichia coli; 27 of the proteins yielded monodisperse species with circular dichroism spectra consistent with the hallucinated structures. We determined the three-dimensional structures of three of the hallucinated proteins, two by X-ray crystallography and one by NMR, and these closely matched the hallucinated models. Thus, deep networks trained to predict native protein structures from their sequences can be inverted to design new proteins, and such networks and methods should contribute alongside traditional physics-based models to the de novo design of proteins with new functions.
© 2021. The Author(s), under exclusive licence to Springer Nature Limited.
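
The sampling loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: `toy_predict_distogram` is a hypothetical stand-in for the trRosetta network (it just maps residue identities to random per-pair distance-bin probabilities), the background distribution is assumed uniform rather than averaged over all proteins, and the number of bins, the temperature, and the step count are arbitrary illustrative values. Only the overall shape of the procedure (random start, single-residue mutations, Metropolis acceptance on the mean Kullback-Leibler divergence) follows the text.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
N_BINS = 8  # assumed number of distance bins (illustrative only)


def toy_predict_distogram(seq, weights):
    """Hypothetical stand-in for a structure-prediction network.

    Returns an (L, L, N_BINS) array of distance-bin probabilities,
    one distribution per residue pair.
    """
    idx = np.array([AMINO_ACIDS.index(a) for a in seq])
    per_residue = weights[idx]  # shape (L, N_BINS)
    logits = per_residue[:, None, :] + per_residue[None, :, :]
    logits = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    probs = np.exp(logits)
    return probs / probs.sum(axis=-1, keepdims=True)


def mean_kl(pred, background, eps=1e-9):
    """Average KL(pred || background) over all residue pairs."""
    kl = np.sum(pred * (np.log(pred + eps) - np.log(background + eps)), axis=-1)
    return float(kl.mean())


def hallucinate(length=20, steps=500, temperature=0.01, seed=0):
    """Monte Carlo search in sequence space that maximizes the mean KL divergence.

    Returns (best_sequence, starting_score, best_score).
    """
    rng = np.random.default_rng(seed)
    weights = rng.normal(size=(len(AMINO_ACIDS), N_BINS))  # toy "network" parameters
    background = np.full(N_BINS, 1.0 / N_BINS)  # assumption: uniform background
    alphabet = list(AMINO_ACIDS)

    seq = list(rng.choice(alphabet, size=length))  # random starting sequence
    score = mean_kl(toy_predict_distogram(seq, weights), background)
    start_score = score
    best_seq, best_score = seq.copy(), score

    for _ in range(steps):
        trial = seq.copy()
        trial[rng.integers(length)] = rng.choice(alphabet)  # single-residue mutation
        trial_score = mean_kl(toy_predict_distogram(trial, weights), background)
        # Metropolis criterion: always accept improvements,
        # accept a worse sequence with probability exp(delta / T).
        delta = trial_score - score
        if delta >= 0 or rng.random() < np.exp(delta / temperature):
            seq, score = trial, trial_score
            if score > best_score:
                best_seq, best_score = seq.copy(), score

    return "".join(best_seq), start_score, best_score


if __name__ == "__main__":
    sequence, start, best = hallucinate()
    print(sequence, round(start, 3), round(best, 3))
```

In this toy setting the score rises simply because the stand-in predictor rewards particular residue types; with a real structure-prediction network, a higher divergence from the background would correspond to sharper, more protein-like predicted distance maps.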

Year: 2021; PMID: 34853475; PMCID: PMC9293396; DOI: 10.1038/s41586-021-04184-w

Source DB: PubMed; Journal: Nature; ISSN: 0028-0836; Impact factor: 69.504


References: 44 in total

1.  Distance-based protein folding powered by deep learning.

Authors:  Jinbo Xu
Journal:  Proc Natl Acad Sci U S A       Date:  2019-08-09       Impact factor: 11.205

2.  Fast and Flexible Protein Design Using Deep Graph Neural Networks.

Authors:  Alexey Strokach; David Becerra; Carles Corbi-Verge; Albert Perez-Riba; Philip M Kim
Journal:  Cell Syst       Date:  2020-09-23       Impact factor: 10.304

3.  Improved protein structure prediction using predicted interresidue orientations.

Authors:  Jianyi Yang; Ivan Anishchenko; Hahnbeom Park; Zhenling Peng; Sergey Ovchinnikov; David Baker
Journal:  Proc Natl Acad Sci U S A       Date:  2020-01-02       Impact factor: 11.205

4.  Improved protein structure prediction using potentials from deep learning.

Authors:  Andrew W Senior; Richard Evans; John Jumper; James Kirkpatrick; Laurent Sifre; Tim Green; Chongli Qin; Augustin Žídek; Alexander W R Nelson; Alex Bridgland; Hugo Penedones; Stig Petersen; Karen Simonyan; Steve Crossan; Pushmeet Kohli; David T Jones; David Silver; Koray Kavukcuoglu; Demis Hassabis
Journal:  Nature       Date:  2020-01-15       Impact factor: 49.962

5.  De Novo Protein Design for Novel Folds Using Guided Conditional Wasserstein Generative Adversarial Networks.

Authors:  Mostafa Karimi; Shaowen Zhu; Yue Cao; Yang Shen
Journal:  J Chem Inf Model       Date:  2020-09-30       Impact factor: 4.956

6.  Low-N protein engineering with data-efficient deep learning.

Authors:  Surojit Biswas; Grigory Khimulya; Ethan C Alley; Kevin M Esvelt; George M Church
Journal:  Nat Methods       Date:  2021-04-07       Impact factor: 28.547

7.  Computational Protein Design with Deep Learning Neural Networks.

Authors:  Jingxue Wang; Huali Cao; John Z H Zhang; Yifei Qi
Journal:  Sci Rep       Date:  2018-04-20       Impact factor: 4.379

8.  Deep generative models for T cell receptor protein sequences.

Authors:  Kristian Davidsen; Branden J Olson; William S DeWitt; Jean Feng; Elias Harkins; Philip Bradley; Frederick A Matsen
Journal:  Elife       Date:  2019-09-05       Impact factor: 8.140

9.  Protein structure prediction using multiple deep neural networks in the 13th Critical Assessment of Protein Structure Prediction (CASP13).

Authors:  Andrew W Senior; Richard Evans; John Jumper; James Kirkpatrick; Laurent Sifre; Tim Green; Chongli Qin; Augustin Žídek; Alexander W R Nelson; Alex Bridgland; Hugo Penedones; Stig Petersen; Karen Simonyan; Steve Crossan; Pushmeet Kohli; David T Jones; David Silver; Koray Kavukcuoglu; Demis Hassabis
Journal:  Proteins       Date:  2019-12

10.  Generating functional protein variants with variational autoencoders.

Authors:  Alex Hawkins-Hooker; Florence Depardieu; Sebastien Baur; Guillaume Couairon; Arthur Chen; David Bikard
Journal:  PLoS Comput Biol       Date:  2021-02-26       Impact factor: 4.475

Cited by: 29 in total

1.  Large-scale design and refinement of stable proteins using sequence-only models.

Authors:  Jedediah M Singer; Scott Novotney; Devin Strickland; Hugh K Haddox; Nicholas Leiby; Gabriel J Rocklin; Cameron M Chow; Anindya Roy; Asim K Bera; Francis C Motta; Longxing Cao; Eva-Maria Strauch; Tamuka M Chidyausiku; Alex Ford; Ethan Ho; Alexander Zaitzeff; Craig O Mackenzie; Hamed Eramian; Frank DiMaio; Gevorg Grigoryan; Matthew Vaughn; Lance J Stewart; David Baker; Eric Klavins
Journal:  PLoS One       Date:  2022-03-14       Impact factor: 3.240

Review 2.  Design principles of protein switches.

Authors:  Robert G Alberstein; Amy B Guo; Tanja Kortemme
Journal:  Curr Opin Struct Biol       Date:  2021-09-16       Impact factor: 6.809

Review 3.  Structure-based protein design with deep learning.

Authors:  Sergey Ovchinnikov; Po-Ssu Huang
Journal:  Curr Opin Chem Biol       Date:  2021-09-20       Impact factor: 8.822

4.  A Method for Assessing the Robustness of Protein Structures by Randomizing Packing Interactions.

Authors:  Shilpa Yadahalli; Lakshmi P Jayanthi; Shachi Gosavi
Journal:  Front Mol Biosci       Date:  2022-06-27

5.  Interpreting Neural Networks for Biological Sequences by Learning Stochastic Masks.

Authors:  Johannes Linder; Alyssa La Fleur; Zibo Chen; Ajasja Ljubetič; David Baker; Sreeram Kannan; Georg Seelig
Journal:  Nat Mach Intell       Date:  2022-01-25

6.  Scientists are using AI to dream up revolutionary new proteins.

Authors:  Ewen Callaway
Journal:  Nature       Date:  2022-09       Impact factor: 69.504

Review 7.  The road to fully programmable protein catalysis.

Authors:  Sarah L Lovelock; Rebecca Crawshaw; Sophie Basler; Colin Levy; David Baker; Donald Hilvert; Anthony P Green
Journal:  Nature       Date:  2022-06-01       Impact factor: 69.504

8.  Ig-VAE: Generative modeling of protein structure by direct 3D coordinate generation.

Authors:  Raphael R Eguchi; Christian A Choe; Po-Ssu Huang
Journal:  PLoS Comput Biol       Date:  2022-06-27       Impact factor: 4.779

9.  AlphaFold Models of Small Proteins Rival the Accuracy of Solution NMR Structures.

Authors:  Roberto Tejero; Yuanpeng Janet Huang; Theresa A Ramelot; Gaetano T Montelione
Journal:  Front Mol Biosci       Date:  2022-06-13

10.  AlphaFold2 models indicate that protein sequence determines both structure and dynamics.

Authors:  Hao-Bo Guo; Alexander Perminov; Selemon Bekele; Gary Kedziora; Sanaz Farajollahi; Vanessa Varaljay; Kevin Hinkle; Valeria Molinero; Konrad Meister; Chia Hung; Patrick Dennis; Nancy Kelley-Loughnane; Rajiv Berry
Journal:  Sci Rep       Date:  2022-06-23       Impact factor: 4.996
