
Accurate prediction for atomic-level protein design and its application in diversifying the near-optimal sequence space.

Menachem Fromer, Chen Yanover.

Abstract

The task of engineering a protein to assume a target three-dimensional structure is known as protein design. Computational search algorithms are devised to predict a minimal energy amino acid sequence for a particular structure. In practice, however, an ensemble of low-energy sequences is often sought. Primarily, this is performed because an individual predicted low-energy sequence may not necessarily fold to the target structure because of both inaccuracies in modeling protein energetics and the nonoptimal nature of search algorithms employed. Additionally, some low-energy sequences may be overly stable and thus lack the dynamic flexibility required for biological functionality. Furthermore, the investigation of low-energy sequence ensembles will provide crucial insights into the pseudo-physical energy force fields that have been derived to describe structural energetics for protein design. Significantly, numerous studies have predicted low-energy sequences, which were subsequently synthesized and demonstrated to fold to desired structures. However, the characterization of the sequence space defined by such energy functions as compatible with a target structure has not been performed in full detail. This issue is critical for protein design scientists to successfully continue using these force fields at an ever-increasing pace and scale. In this paper, we present a conceptually novel algorithm that rapidly predicts the set of lowest energy sequences for a given structure. Based on the theory of probabilistic graphical models, it performs efficient inspection and partitioning of the near-optimal sequence space, without making any assumptions of positional independence. We benchmark its performance on a diverse set of relevant protein design examples and show that it consistently yields sequences of lower energy than those derived from state-of-the-art techniques. 
Thus, we find that previously presented search techniques depict the low-energy space less precisely. Examination of the predicted ensembles indicates that, for each structure, the amino acid identity at a majority of positions must be chosen extremely selectively so as not to incur significant energetic penalties. We investigate this high degree of similarity and demonstrate how more diverse near-optimal sequences can be predicted in order to systematically overcome this bottleneck for computational design. Finally, we exploit this in-depth analysis of a collection of the lowest energy sequences to suggest an explanation for previously observed experimental design results. The novel methodologies introduced here accurately portray the sequence space compatible with a protein structure and further supply a scheme to yield heterogeneous low-energy sequences, thus providing a powerful instrument for future work on protein design.
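To make the notion of a "set of lowest energy sequences" concrete, the following minimal sketch enumerates every sequence of a tiny, made-up pairwise energy model and returns the k lowest. It is not the graphical-model algorithm of the paper: exhaustive enumeration is only feasible for toy problems, and the alphabet, the energy tables, and the names `energy` and `lowest_energy_sequences` are all hypothetical, chosen purely for illustration.

```python
import heapq
import itertools

# Hypothetical toy model: 3 design positions, 2 candidate amino acids each.
ALPHABET = ("A", "L")

# Singleton energies E_i(a), one table per position (made-up integer units).
SINGLE = [
    {"A": 0, "L": 10},
    {"A": 5, "L": 0},
    {"A": 0, "L": 2},
]

# Pairwise energies E_ij(a, b) for interacting position pairs (made-up values).
PAIR = {
    (0, 1): {("A", "A"): 0, ("A", "L"): -10, ("L", "A"): 3, ("L", "L"): 0},
    (1, 2): {("A", "A"): 0, ("A", "L"): 0, ("L", "A"): 0, ("L", "L"): -5},
}


def energy(seq):
    """Total energy: sum of singleton terms plus sum of pairwise terms."""
    total = sum(SINGLE[i][a] for i, a in enumerate(seq))
    total += sum(table[(seq[i], seq[j])] for (i, j), table in PAIR.items())
    return total


def lowest_energy_sequences(k):
    """Return the k lowest-energy (energy, sequence) pairs by brute force."""
    all_seqs = itertools.product(ALPHABET, repeat=len(SINGLE))
    scored = ((energy(s), "".join(s)) for s in all_seqs)
    return heapq.nsmallest(k, scored)


if __name__ == "__main__":
    for e, s in lowest_energy_sequences(3):
        print(s, e)
```

In this toy model, choosing each position independently by its singleton energy alone gives `ALA`, whereas the pairwise term between the last two positions makes `ALL` the true minimum. That is a small illustration of why positional independence cannot be assumed when ranking sequences.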

Year:  2009        PMID: 19003998     DOI: 10.1002/prot.22280

Source DB:  PubMed          Journal:  Proteins        ISSN: 0887-3585


  5 in total

1.  M are better than one: an ensemble-based motif finder and its application to regulatory element prediction.

Authors:  Chen Yanover; Mona Singh; Elena Zaslavsky
Journal:  Bioinformatics       Date:  2009-02-17       Impact factor: 6.937

2.  Mode Estimation for High Dimensional Discrete Tree Graphical Models.

Authors:  Chao Chen; Han Liu; Dimitris N Metaxas; Tianqi Zhao
Journal:  Adv Neural Inf Process Syst       Date:  2014-12

3.  Generative models of conformational dynamics. (Review)

Authors:  Christopher James Langmead
Journal:  Adv Exp Med Biol       Date:  2014       Impact factor: 2.622

4.  A critical analysis of computational protein design with sparse residue interaction graphs.

Authors:  Swati Jain; Jonathan D Jou; Ivelin S Georgiev; Bruce R Donald
Journal:  PLoS Comput Biol       Date:  2017-03-30       Impact factor: 4.475

5.  Tradeoff between stability and multispecificity in the design of promiscuous proteins.

Authors:  Menachem Fromer; Julia M Shifman
Journal:  PLoS Comput Biol       Date:  2009-12-24       Impact factor: 4.475
