Multi-scale structural analysis of proteins by deep semantic segmentation.

Raphael R Eguchi, Po-Ssu Huang.

Abstract

MOTIVATION: Recent advances in computational methods have facilitated large-scale sampling of protein structures, leading to breakthroughs in protein structural prediction and enabling de novo protein design. Establishing methods to identify candidate structures that can lead to native folds or designable structures remains a challenge, since few existing metrics capture high-level structural features such as architectures, folds and conformity to conserved structural motifs. Convolutional Neural Networks (CNNs) have been successfully used in semantic segmentation, a subfield of image classification in which a class label is predicted for every pixel. Here, we apply semantic segmentation to protein structures as a novel strategy for fold identification and structure quality assessment.
RESULTS: We train a CNN that assigns each residue in a multi-domain protein to one of 38 architecture classes designated by the CATH database. Our model achieves a high per-residue accuracy of 90.8% on the test set (95.0% average per-class accuracy; 87.8% average per-structure accuracy). We demonstrate that individual class probabilities can be used as a metric that indicates the degree to which a randomly generated structure assumes a specific fold, as well as a metric that highlights non-conformative regions of a protein belonging to a known class. These capabilities yield a powerful tool for guiding structural sampling for both structural prediction and design.
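The three accuracy figures quoted above can be read as different ways of averaging the same per-residue predictions. The abstract does not define them, so the sketch below assumes the usual semantic-segmentation conventions: per-residue accuracy pools all residues in the test set, per-class accuracy averages the fraction of correctly labeled residues within each true class, and per-structure accuracy averages the fraction correct within each protein. The function name `segmentation_accuracies` and the toy labels are hypothetical and are not taken from the authors' code.

```python
from collections import defaultdict


def segmentation_accuracies(structures):
    """Return (per_residue, per_class, per_structure) accuracy.

    `structures` is a list of (true_labels, predicted_labels) pairs,
    one pair per protein, with one integer class label per residue.
    The definitions used here are assumptions, not the paper's.
    """
    total = correct = 0
    class_total = defaultdict(int)
    class_correct = defaultdict(int)
    structure_scores = []

    for true, pred in structures:
        if len(true) != len(pred):
            raise ValueError("label sequences must have equal length")
        if not true:
            continue  # skip empty structures
        hits = 0
        for t, p in zip(true, pred):
            class_total[t] += 1
            if t == p:
                class_correct[t] += 1
                hits += 1
        total += len(true)
        correct += hits
        structure_scores.append(hits / len(true))

    if total == 0:
        raise ValueError("no residues to score")

    # Pooled over every residue in the set.
    per_residue = correct / total
    # Mean of the per-class fractions, so rare classes count equally.
    per_class = sum(class_correct[c] / n for c, n in class_total.items()) / len(class_total)
    # Mean of the per-protein fractions, so short proteins count equally.
    per_structure = sum(structure_scores) / len(structure_scores)
    return per_residue, per_class, per_structure


# Toy example with 3 classes (the paper uses 38 CATH architecture classes).
toy = [
    ([0, 0, 1, 1], [0, 0, 1, 0]),  # one residue of class 1 mislabeled
    ([2, 2], [2, 2]),              # fully correct
]
per_residue, per_class, per_structure = segmentation_accuracies(toy)
print(per_residue, per_class, per_structure)
```

Because the three averages weight residues, classes and structures differently, they need not agree, which is consistent with the abstract reporting three distinct values.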
AVAILABILITY AND IMPLEMENTATION: The trained classifier network, parser network, and entropy calculation scripts are available for download at https://git.io/fp6bd, with detailed usage instructions provided at the download page. A step-by-step tutorial for setup is provided at https://goo.gl/e8GB2S. All Rosetta commands, RosettaRemodel blueprints, and predictions for all datasets used in the study are available in the Supplementary Information.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
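As an illustration of how per-residue class probabilities might be turned into the two kinds of metric described in the Results, the sketch below computes the mean probability of a target class over a structure and the Shannon entropy of each residue's class distribution. This is not the authors' published script: the abstract does not say which entropy is used or how probabilities are aggregated, so the choice of Shannon entropy, the mean as the aggregate, and the function names are all assumptions.

```python
import math


def shannon_entropy(probs):
    """Shannon entropy (in nats) of one residue's class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def per_residue_entropy(prob_rows):
    """Entropy for each residue; high values mark ambiguous regions."""
    return [shannon_entropy(row) for row in prob_rows]


def mean_class_probability(prob_rows, target_class):
    """Average probability assigned to `target_class` across residues.

    Assumed here as a simple score for how strongly a structure
    resembles one class; the paper may aggregate differently.
    """
    if not prob_rows:
        raise ValueError("no residues to score")
    return sum(row[target_class] for row in prob_rows) / len(prob_rows)


# Toy example: 3 residues, 4 classes (hypothetical probabilities).
probs = [
    [0.97, 0.01, 0.01, 0.01],  # confident residue
    [0.25, 0.25, 0.25, 0.25],  # maximally ambiguous residue
    [0.70, 0.10, 0.10, 0.10],
]
print(per_residue_entropy(probs))
print(mean_class_probability(probs, target_class=0))
```

Under these assumptions, a uniform distribution over k classes gives the maximum entropy log(k), and a residue assigned to a single class with probability 1 gives zero.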
© The Author(s) 2019. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

Year:  2020        PMID: 31424530      PMCID: PMC7075530          DOI: 10.1093/bioinformatics/btz650

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


