
Fast and accurate prediction of protein side-chain conformations.

Shide Liang, Dandan Zheng, Chi Zhang, Daron M Standley.

Abstract

SUMMARY: We developed a fast and accurate side-chain modeling program [Optimized Side Chain Atomic eneRgy (OSCAR)-star] based on orientation-dependent energy functions and a rigid rotamer model. The average computing time was 18 s per protein for 218 test proteins with higher prediction accuracy (1.1% increase for χ1 and 0.8% increase for χ1+2) than the best performing program developed by other groups. We show that the energy functions, which were calibrated to tolerate the discrete errors of rigid rotamers, are appropriate for protein loop selection, especially for decoys without extensive structural refinement.

AVAILABILITY: OSCAR-star and the 218 test proteins are available for download at http://sysimm.ifrec.osaka-u.ac.jp/OSCAR

CONTACT: standley@ifrec.osaka-u.ac.jp

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Year:  2011        PMID: 21873640      PMCID: PMC3187653          DOI: 10.1093/bioinformatics/btr482

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


1 INTRODUCTION

Over the past two decades, much effort has been spent improving the accuracy or speed of side-chain modeling methods. Most methods exploit a limited number of representative conformations, called rotamers, at each residue position and use efficient search algorithms to find a low-energy rotamer combination for the whole protein. Despite their efficiency, rigid rotamers inherently carry a discrete error and are poorly suited to physics-based force fields, which are sensitive to small atomic clashes: the calculated energies can be quite different for the native conformation and near-native rotamers. Force fields thus have to be modified, either by scaling the atomic radii (Dahiyat and Mayo, 1997) or by using softer Lennard–Jones repulsive terms (Yanover ), to reduce the influence of steric clashes. Alternatively, knowledge-based, coarse-grained energy functions have been developed that can tolerate rigid rotamers while achieving high accuracy (Liang and Grishin, 2002). A third approach is to use extremely detailed rotamer libraries, or flexible rotamer models, in combination with accurate energy functions at the cost of speed (Peterson ). Recently, we developed a side-chain modeling program combining accurate, orientation-dependent, Optimized Side Chain Atomic eneRgy (OSCAR-o) with a flexible rotamer model (Liang ). The prediction accuracy was significantly higher (2.2% for χ1 and 4.0% for χ1+2) than that of the next-best method, but the run time was as long as 28 min for a single protein. In this study, we adapted OSCAR to a rigid rotamer model by modifying the distance-dependent component for fast side-chain modeling. The parameters of the orientation-dependent functions were optimized so that decoy proteins with low RMSD (root mean square deviation) from native structures could be distinguished from a pool of decoys (obtained by perturbing the energy functions and then modeling the entire protein). The proposed methodology (OSCAR-star) is very fast while maintaining high accuracy.
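The sensitivity of physics-based terms to the discrete error of rigid rotamers can be illustrated with a toy calculation. The sketch below is not the OSCAR energy function: it uses a generic 12-6 Lennard–Jones form, and the distances, well depth and the 0.9 radius scale are illustrative assumptions chosen only to show how scaling the radii shrinks the energy gap between a native contact and a slightly clashing near-native rotamer.

```python
def lennard_jones(r, r_min, eps=0.2, radius_scale=1.0):
    """Generic 12-6 Lennard-Jones energy with its minimum (-eps) at radius_scale * r_min."""
    x = (radius_scale * r_min) / r
    return eps * (x ** 12 - 2.0 * x ** 6)


def energy_gap(native_r, rotamer_r, r_min, radius_scale=1.0):
    """Energy of the near-native rotamer contact minus that of the native contact."""
    return (lennard_jones(rotamer_r, r_min, radius_scale=radius_scale)
            - lennard_jones(native_r, r_min, radius_scale=radius_scale))


# Hypothetical numbers for one atom pair (all values are assumptions):
R_MIN = 4.0       # optimal separation in angstroms
NATIVE_R = 3.9    # native side chain, close to the optimum
ROTAMER_R = 3.2   # near-native rigid rotamer, slightly too close

for scale in (1.0, 0.9):
    gap = energy_gap(NATIVE_R, ROTAMER_R, R_MIN, radius_scale=scale)
    print(f"radius scale {scale:.1f}: rotamer - native energy gap = {gap:+.3f}")
```

With unscaled radii, the small displacement of the rigid rotamer produces a large repulsive penalty. With scaled radii the same displacement costs far less, which is the effect the modified force fields aim for.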

2 RESULTS

2.1 Parameter optimization

The parameters of the distance-dependent energy functions (OSCAR-dstar) were initialized to the corresponding OSCAR-d values, which had previously been optimized by maximizing the energy gap between the native conformation and rotamers at each modeled position (Liang ). To model a side chain at a given position, OSCAR-dstar evaluated a limited number of rigid rotamers and selected the one with the lowest energy. The original OSCAR-d parameters were sensitive to the discrete errors of rigid rotamers, and the mean RMSD of the lowest energy rotamers was as large as 0.785 Å for a training set of 40 000 side chains per residue type (the rotamer interior energy was calculated in the same way as in OSCAR-d). We then optimized the parameters by Monte Carlo (MC) simulation to improve the accuracy. Consequently, the RMSD dropped to 0.734 Å, and the accuracy (90.9% for χ1 and 80.8% for χ1+2) for single residues in 30 test proteins was similar to that of OSCAR-d with a flexible rotamer model. In the next step, the optimized OSCAR-dstar potential was multiplied by an orientation-dependent function to yield OSCAR-star. The parameters of the orientation-dependent function were optimized by simultaneously minimizing the RMSD of the lowest energy rotamer at each modeled position and the RMSD of the lowest energy decoy obtained by perturbing the energy functions and then modeling the entire protein. As a result, the prediction accuracy of OSCAR-star increased by 0.6 and 0.7% for χ1 and χ1+2, respectively, compared with OSCAR-dstar when modeling all residues in each of the 218 test proteins.
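The single-residue optimization described above can be sketched in a few lines. This is a minimal illustration, not the published procedure: the linear weighted-feature energy, the Metropolis acceptance rule, the temperature, the step size and the synthetic training set are all assumptions made for the example. Only the objective, the mean RMSD of the lowest energy rotamer over the training positions, follows the text.

```python
import math
import random


def lowest_energy_rmsd(weights, position):
    """RMSD to native of the rotamer with the lowest weighted-feature energy."""
    best = min(position,
               key=lambda rot: sum(w * f for w, f in zip(weights, rot["features"])))
    return best["rmsd"]


def mean_rmsd(weights, training_set):
    """Objective: mean RMSD of the lowest energy rotamer over all positions."""
    return sum(lowest_energy_rmsd(weights, p) for p in training_set) / len(training_set)


def optimize_weights(training_set, weights, steps=500, step_size=0.1,
                     temperature=0.01, seed=0):
    """Monte Carlo search over weights; returns the best weights and their score."""
    rng = random.Random(seed)
    current = list(weights)
    current_score = mean_rmsd(current, training_set)
    best, best_score = list(current), current_score
    for _ in range(steps):
        trial = list(current)
        i = rng.randrange(len(trial))
        trial[i] += rng.uniform(-step_size, step_size)   # perturb one parameter
        trial_score = mean_rmsd(trial, training_set)
        delta = trial_score - current_score
        # Metropolis rule (an assumption): always accept improvements,
        # occasionally accept a worse score to escape local minima.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_score = trial, trial_score
            if current_score < best_score:
                best, best_score = list(current), current_score
    return best, best_score


def make_toy_training_set(n_positions=200, n_rotamers=5, seed=1):
    """Synthetic data: feature 0 tracks the RMSD, feature 1 is pure noise."""
    rng = random.Random(seed)
    training_set = []
    for _ in range(n_positions):
        position = []
        for _ in range(n_rotamers):
            rmsd = rng.uniform(0.0, 2.0)
            features = (rmsd + rng.gauss(0.0, 0.3), rng.gauss(0.0, 1.0))
            position.append({"features": features, "rmsd": rmsd})
        training_set.append(position)
    return training_set


training_set = make_toy_training_set()
initial_weights = [0.0, 1.0]   # starts by trusting only the noise feature
initial_score = mean_rmsd(initial_weights, training_set)
best_weights, best_score = optimize_weights(training_set, initial_weights)
print(f"mean RMSD before: {initial_score:.3f}, after: {best_score:.3f}")
```

The score is evaluated on the selected rotamer rather than on the energy itself, so the objective is piecewise constant in the parameters. That is one reason a stochastic search such as MC is a natural fit here, whereas gradient-based fitting would not apply directly.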

2.2 Comparison with other methods

We compared the performance of OSCAR-star with other top-ranked side-chain modeling programs (Table 1) such as CISRR (Cao ), SCWRL4 (Krivov ), LGA (Liang and Grishin, 2002), NCN (Peterson ), OPUS_Rota (Lu ), OSCAR-d and OSCAR-o. OSCAR-star had better prediction accuracies than other programs except OSCAR-o and was faster than all but three programs: SCWRL4, OPUS_Rota and OSCAR-dstar. In other words, OSCAR-star was more accurate than all of the faster side-chain modeling programs. According to a paired t-test, the χ1 accuracy difference between OSCAR-star and each of these three faster programs was statistically significant (P<0.0001).
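For readers unfamiliar with the paired t-test used above, the following sketch computes the test statistic from per-protein accuracies of two programs evaluated on the same proteins. The accuracy values are invented for illustration and are not taken from the 218-protein benchmark. The P-value is not computed here; it would come from a t-distribution with the returned degrees of freedom (for example via `scipy.stats.ttest_rel`, if SciPy is available).

```python
import math
import statistics


def paired_t_statistic(a, b):
    """Return (t, degrees_of_freedom) for paired samples a and b.

    Assumes the paired differences are not all identical
    (otherwise the standard deviation is zero).
    """
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equally sized samples with at least 2 pairs")
    diffs = [x - y for x, y in zip(a, b)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)          # sample standard deviation
    t = mean_diff / (sd_diff / math.sqrt(len(diffs)))
    return t, len(diffs) - 1


# Hypothetical per-protein chi1 accuracies (%) for two programs on the same proteins.
program_a = [88.0, 85.5, 90.1, 86.4, 87.9, 89.2]
program_b = [87.1, 85.0, 88.9, 86.0, 86.8, 88.5]

t_value, dof = paired_t_statistic(program_a, program_b)
print(f"t = {t_value:.3f} with {dof} degrees of freedom")
```

Pairing matters because accuracy varies much more from protein to protein than between programs. Testing the per-protein differences removes that shared variation.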
Table 1. Comparison of side-chain modeling programs in prediction accuracy and running time for 218 independent test proteins

Program(a) | All residues: χ1 (%) | All residues: χ1+2 (%) | All residues: RMSD (Å) | Core residues: χ1 (%) | Core residues: χ1+2 (%) | Core residues: RMSD (Å) | CPU time/protein(b)
CISRR | 84.7 | 73.1 | 1.49 | 92.6 | 85.9 | 0.95 | 23 s
SCWRL4(c) | 85.1 | 74 | 1.48 | 93 | 86.9 | 0.96 | 7 s
LGA(c) | 86.1 | 72.3 | 1.42 | 93.9 | 85.9 | 0.91 | 5 m 53 s
NCN(c) | 86.3 | 74.3 | 1.48 | 93.8 | 87.9 | 0.87 | 20 m 50 s
OSCAR-d(c) | 86.6 | 75.3 | 1.41 | 95.5 | 90.4 | 0.7 | 9 m 26 s
OPUS_Rota(c) | 86.6 | 75.7 | 1.4 | 94.3 | 87.6 | 0.86 | 7 s
OSCAR-dstar | 87.1 | 75.7 | 1.37 | 93.9 | 86.3 | 0.87 | 14 s
OSCAR-star | 87.7 | 76.4 | 1.35 | 94.4 | 87.3 | 0.85 | 18 s
OSCAR-o(c) | 88.8 | 79.7 | 1.24 | 95.9 | 91.9 | 0.62 | 27 m 49 s

(a) The list of programs is sorted according to χ1 accuracy. Default parameters/arguments were used in the calculations.

(b) OPUS_Rota was run on one Intel Xeon 3.0 GHz processor and other programs were run on one AMD Opteron 2.7 GHz processor.

(c) The prediction accuracies of SCWRL4, LGA, NCN, OPUS_Rota, OSCAR-d and OSCAR-o were obtained from our previous work (Liang ).

The performance of a side-chain modeling program is affected by the energy function, structural representation and search algorithm. Efficient search algorithms save time but help little to improve the prediction accuracy. For rigid-rotamer-based side-chain modeling programs such as OSCAR-star, the MC simulation time is less than that used to calculate rotamer–backbone and rotamer–rotamer interaction energies in the initial stage (see Methods in Supplementary Material). Orientation-dependent energy functions are essential for high accuracy. For example, OPUS_Rota, the most accurate side-chain modeling program after the OSCAR methods (Table 1), uses orientation-dependent statistical energy functions. On the other hand, flexible rotamer models, which are time consuming, cannot achieve accurate predictions without high-quality energy functions. In fact, the three programs using flexible rotamers, CISRR, SCWRL4 and NCN, have lower accuracies than the rigid-rotamer-based OPUS_Rota and OSCAR-star. OSCAR-o, which uses both orientation-dependent energy functions and a flexible rotamer model, is the most accurate but also the slowest method. With a state-of-the-art search algorithm, SCWRL4 is the fastest, even though a flexible rotamer model is used.
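The division of labor described above, an expensive energy tabulation followed by a cheap MC search, can be sketched as follows. This is an illustrative toy, not the OSCAR-star implementation: the energy tables are filled with random numbers in place of real rotamer–backbone and rotamer–rotamer energies, and the starting state, annealing schedule and Metropolis rule are assumptions.

```python
import math
import random


def pair_energy(pair_e, i, ri, j, rj):
    """Look up the precomputed energy between rotamer ri at i and rj at j."""
    return pair_e[(i, j)][ri][rj] if i < j else pair_e[(j, i)][rj][ri]


def total_energy(assignment, self_e, pair_e):
    """Sum of rotamer-backbone terms plus all rotamer-rotamer terms."""
    n = len(assignment)
    energy = sum(self_e[i][assignment[i]] for i in range(n))
    for i in range(n):
        for j in range(i + 1, n):
            energy += pair_e[(i, j)][assignment[i]][assignment[j]]
    return energy


def delta_energy(assignment, i, new_rot, self_e, pair_e):
    """Energy change for switching residue i to new_rot, using table lookups only."""
    old_rot = assignment[i]
    delta = self_e[i][new_rot] - self_e[i][old_rot]
    for j, rj in enumerate(assignment):
        if j != i:
            delta += pair_energy(pair_e, i, new_rot, j, rj)
            delta -= pair_energy(pair_e, i, old_rot, j, rj)
    return delta


def greedy_start(self_e):
    """Start every residue at its best rotamer-backbone energy."""
    return [min(range(len(e)), key=e.__getitem__) for e in self_e]


def mc_search(self_e, pair_e, steps=5000, temperature=1.0, cooling=0.999, seed=0):
    """Simulated-annealing style MC over rotamer assignments."""
    rng = random.Random(seed)
    assignment = greedy_start(self_e)
    energy = total_energy(assignment, self_e, pair_e)
    best, best_energy = list(assignment), energy
    for _ in range(steps):
        i = rng.randrange(len(self_e))
        new_rot = rng.randrange(len(self_e[i]))
        if new_rot == assignment[i]:
            continue
        delta = delta_energy(assignment, i, new_rot, self_e, pair_e)
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            assignment[i] = new_rot
            energy += delta
            if energy < best_energy:
                best, best_energy = list(assignment), energy
        temperature *= cooling
    return best, best_energy


def make_toy_tables(n_res=8, n_rot=4, seed=2):
    """Random stand-ins for the precomputed energy tables (purely synthetic)."""
    rng = random.Random(seed)
    self_e = [[rng.uniform(-1.0, 1.0) for _ in range(n_rot)] for _ in range(n_res)]
    pair_e = {(i, j): [[rng.uniform(-1.0, 1.0) for _ in range(n_rot)]
                       for _ in range(n_rot)]
              for i in range(n_res) for j in range(i + 1, n_res)}
    return self_e, pair_e


self_e, pair_e = make_toy_tables()
best, best_energy = mc_search(self_e, pair_e)
print("best rotamer assignment:", best, "energy:", round(best_energy, 3))
```

Because each MC move only needs table lookups for one residue against the others, the search itself is cheap once the tables exist. This is consistent with the observation that the initial energy calculation dominates the run time for rigid rotamers.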

2.3 Protein loop selection with OSCAR-star

We have previously demonstrated that OSCAR-o has higher accuracy than other energy functions in selecting near-native conformations from loop decoys (Liang ). Here, we compared the performance of OSCAR-star with OSCAR-o for the RAPPER decoy set (de Bakker ), in which every loop target contained 1000 decoys optimized by side-chain modeling and 50 top scored decoys further optimized by energy minimization. We modeled side-chain conformations of loop residues with OSCAR-o/OSCAR-star before each energy calculation. For the decoys without energy minimization, OSCAR-star demonstrated better performance than OSCAR-o in 7 out of 11 loop lengths from 2 to 12 and equal accuracy for five-residue loops. For the energy-minimized decoys, OSCAR-star was effective for long loops but poor for short loops compared with the more accurate OSCAR-o. The relatively coarse-grained OSCAR-star was superior to OSCAR-o, which was sensitive to incomplete sampling and atomic clashes, for decoys without energy minimization. Moreover, it took 5 min for OSCAR-star to model side-chain conformations of 1000 decoys for an eight-residue loop target compared with 5 h for OSCAR-o. OSCAR-star is thus appropriate for the initial stage of loop modeling. Side-chain conformations can be modeled very quickly at candidate loop backbones, which makes it possible to sample loop conformations extensively (>1000 decoys). The top ranked decoys can then be energy minimized and evaluated by more accurate force fields such as OSCAR-o.

Funding: DMS was supported by the Funding Program for World-Leading Innovative R&D on Science and Technology (FIRST), Japan Society for the Promotion of Science (JSPS).

Conflict of Interest: none declared.
REFERENCES

1. Shide Liang; Nick V Grishin. Side-chain modeling with an optimized scoring function. Protein Sci, 2002-02.

2. Paul I W de Bakker; Mark A DePristo; David F Burke; Tom L Blundell. Ab initio construction of polypeptide fragments: Accuracy of loop decoy discrimination by an all-atom statistical potential and the AMBER force field with the Generalized Born solvation model. Proteins, 2003-04-01.

3. Ronald W Peterson; P Leslie Dutton; A Joshua Wand. Improved side-chain prediction accuracy using an ab initio potential energy function and a very large rotamer library. Protein Sci, 2004-03.

4. Yang Cao; Lin Song; Zhichao Miao; Yun Hu; Liqing Tian; Taijiao Jiang. Improved side-chain modeling by coupling clash-detection guided iterative search with rotamer relaxation. Bioinformatics, 2011-01-06.

5. Chen Yanover; Ora Schueler-Furman; Yair Weiss. Minimizing and learning energy functions for side-chain prediction. J Comput Biol, 2008-09.

6. Mingyang Lu; Athanasios D Dousis; Jianpeng Ma. OPUS-Rota: a fast and accurate method for side-chain modeling. Protein Sci, 2008-06-12.

7. Shide Liang; Chi Zhang; Daron M Standley. Protein loop selection using orientation-dependent force fields derived by parameter optimization. Proteins, 2011-05-13.

8. B I Dahiyat; S L Mayo. Probing the role of packing specificity in protein design. Proc Natl Acad Sci U S A, 1997-09-16.

9. Shide Liang; Yaoqi Zhou; Nick Grishin; Daron M Standley. Protein side chain modeling with orientation-dependent atomic force fields derived by series expansions. J Comput Chem, 2011-03-04.

10. Georgii G Krivov; Maxim V Shapovalov; Roland L Dunbrack. Improved prediction of protein side-chain conformations with SCWRL4. Proteins, 2009-12.