Machine learning to navigate fitness landscapes for protein engineering.

Chase R Freschlin, Sarah A Fahlberg, Philip A Romero.

Abstract

Machine learning (ML) is revolutionizing our ability to understand and predict the complex relationships between protein sequence, structure, and function. Predictive sequence-function models are enabling protein engineers to efficiently search the sequence space for useful proteins with broad applications in biotechnology. In this review, we highlight the recent advances in applying ML to protein engineering. We discuss supervised learning methods that infer the sequence-function mapping from experimental data and new sequence representation strategies for data-efficient modeling. We then describe the various ways in which ML can be incorporated into protein engineering workflows, including purely in silico searches, ML-assisted directed evolution, and generative models that can learn the underlying distribution of the protein function in a sequence space. ML-driven protein engineering will become increasingly powerful with continued advances in high-throughput data generation, data science, and deep learning.
Copyright © 2022 Elsevier Ltd. All rights reserved.

Year: 2022; PMID: 35413604; PMCID: PMC9177649; DOI: 10.1016/j.copbio.2022.102713

Source DB: PubMed; Journal: Curr Opin Biotechnol; ISSN: 0958-1669; Impact factor: 10.279


