
Integration of the Rosetta suite with the python software stack via reproducible packaging and core programming interfaces for distributed simulation.

Alexander S Ford, Brian D Weitzner, Christopher D Bahl.

Abstract

The Rosetta software suite for macromolecular modeling is a powerful computational toolbox for protein design, structure prediction, and protein structure analysis. The development of novel Rosetta-based scientific tools requires two orthogonal skill sets: deep domain-specific expertise in protein biochemistry and technical expertise in development, deployment, and analysis of molecular simulations. Furthermore, the computational demands of molecular simulation necessitate large-scale cluster-based or distributed solutions for nearly all scientifically relevant tasks. To reduce the technical barriers to entry for new development, we integrated Rosetta with modern, widely adopted computational infrastructure. This allows simplified deployment in large-scale cluster and cloud computing environments, and effective reuse of common libraries for simulation execution and data analysis. To achieve this, we integrated Rosetta with the Conda package manager; this simplifies installation into existing computational environments and packaging as Docker images for cloud deployment. Then, we developed programming interfaces to integrate Rosetta with the PyData stack for analysis and distributed computing, including the popular tools Jupyter, Pandas, and Dask. We demonstrate the utility of these components by generating a library of a thousand de novo disulfide-rich miniproteins in a hybrid simulation that included cluster-based design and interactive notebook-based analyses. Our new tools enable users who would otherwise lack access to the necessary computational infrastructure to perform state-of-the-art molecular simulation and design with Rosetta.
© 2019 The Protein Society.
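
The hybrid workflow described in the abstract (many independent design tasks run in parallel, then collected into a table for interactive analysis) can be illustrated with a minimal sketch. It is not the authors' implementation: the standard-library `concurrent.futures` pool stands in for a Dask cluster, a plain list of dictionaries stands in for a Pandas DataFrame, and `design_one`, `run_designs`, and `best` are hypothetical names invented for this illustration. The scores are random placeholders, not Rosetta output.

```python
from concurrent.futures import ThreadPoolExecutor
import random


def design_one(seed):
    """Hypothetical stand-in for one independent design trajectory.

    A real task would call into Rosetta; this toy only draws a
    reproducible placeholder score from a seeded generator.
    """
    rng = random.Random(seed)
    return {"seed": seed, "score": round(rng.uniform(-120.0, -60.0), 2)}


def run_designs(seeds, workers=4):
    """Fan independent tasks out to a worker pool and gather the records.

    The thread pool is a local substitute for a distributed scheduler.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(design_one, seeds))


def best(records, n=3):
    """Keep the n lowest-scoring records (assumed here to be the most favorable)."""
    return sorted(records, key=lambda r: r["score"])[:n]


# Cluster-style step: run 20 placeholder designs in parallel.
records = run_designs(range(20))

# Notebook-style step: filter the collected table interactively.
top = best(records)
for row in top:
    print(row["seed"], row["score"])
```

Because each task depends only on its own seed, the same pattern carries over to a distributed scheduler, where the mapping step would submit tasks to remote workers instead of local threads.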

Keywords: de novo protein design; Rosetta; conda; containerization; dask; disulfide-rich miniprotein; elastic cloud services; high performance computing; jupyter; python

Year:  2019        PMID: 31495995      PMCID: PMC6933847          DOI: 10.1002/pro.3721

Source DB:  PubMed          Journal:  Protein Sci        ISSN: 0961-8368            Impact factor:   6.725

