
Data-driven supervised learning of a viral protease specificity landscape from deep sequencing and molecular simulations.

Manasi A Pethe, Aliza B Rubenstein, Sagar D Khare.

Abstract

Biophysical interactions between proteins and peptides are key determinants of molecular recognition specificity landscapes. However, an understanding of how molecular structure and residue-level energetics at protein-peptide interfaces shape these landscapes remains elusive. We combine information from yeast-based library screening, next-generation sequencing, and structure-based modeling in a supervised machine learning approach to report the comprehensive sequence-energetics-function mapping of the specificity landscape of the hepatitis C virus (HCV) NS3/4A protease, whose function, site-specific cleavages of the viral polyprotein, is a key determinant of viral fitness. We screened a library of substrates in which five residue positions were randomized and measured cleavability of ∼30,000 substrates (∼1% of the library) using yeast display and fluorescence-activated cell sorting followed by deep sequencing. Structure-based models of a subset of experimentally derived sequences were used in a supervised learning procedure to train a support vector machine to predict the cleavability of 3.2 million substrate variants by the HCV protease. The resulting landscape allows identification of previously unidentified HCV protease substrates, and graph-theoretic analyses reveal extensive clustering of cleavable and uncleavable motifs in sequence space. Specificity landscapes of known drug-resistant variants are similarly clustered. The described approach should enable the elucidation and redesign of specificity landscapes of a wide variety of proteases, including human-origin enzymes. Our results also suggest a possible role for residue-level energetics in shaping plateau-like functional landscapes predicted from viral quasispecies theory.
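The library arithmetic and the idea of clustering in sequence space can be illustrated with a minimal sketch. It assumes that the five randomized positions each range over the 20 standard amino acids (which gives the 3.2 million variants quoted above) and that two sequences are linked when they differ by a single substitution. The abstract does not specify the graph construction, so that linking rule, the helper names `hamming` and `clusters`, and the toy sequences are all hypothetical illustrations rather than the authors' method or data.

```python
from itertools import combinations

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
RANDOMIZED_POSITIONS = 5

# Size of the full library under the assumption above: 20^5 variants.
library_size = len(AMINO_ACIDS) ** RANDOMIZED_POSITIONS
# Fraction covered by ~30,000 measured substrates (roughly 1%).
measured_fraction = 30_000 / library_size


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def clusters(seqs: list[str]) -> list[set[str]]:
    """Connected components of the graph whose edges join sequences
    exactly one substitution apart (assumed linking rule)."""
    parent = {s: s for s in seqs}

    def find(s: str) -> str:
        # Union-find lookup with path halving.
        while parent[s] != s:
            parent[s] = parent[parent[s]]
            s = parent[s]
        return s

    for a, b in combinations(seqs, 2):
        if hamming(a, b) == 1:
            parent[find(a)] = find(b)

    groups: dict[str, set[str]] = {}
    for s in seqs:
        groups.setdefault(find(s), set()).add(s)
    return list(groups.values())


# Hypothetical toy set of five-residue variants (not data from the paper).
toy_cleavable = ["DEMEE", "DEMEC", "DELEC", "AAAAA", "AAAAG"]
toy_clusters = clusters(toy_cleavable)
```

The all-pairs comparison is quadratic in the number of sequences, so it suits only small illustrative sets; at the scale of the full landscape one would enumerate each sequence's single-substitution neighbors instead.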


Keywords: machine learning; molecular modeling; protease; sequence-function mapping; substrate specificity


Year: 2018; PMID: 30587591; PMCID: PMC6320525; DOI: 10.1073/pnas.1805256116

Source DB: PubMed; Journal: Proc Natl Acad Sci U S A; ISSN: 0027-8424; Impact factor: 11.205



