The FALC-Loop web server for protein loop modeling.

Junsu Ko, Dongseon Lee, Hahnbeom Park, Evangelos A Coutsias, Julian Lee, Chaok Seok.

Abstract

The FALC-Loop web server provides an online interface for protein loop modeling by employing an ab initio loop modeling method called FALC (fragment assembly and analytical loop closure). The server may be used to construct loop regions in homology modeling, to refine unreliable loop regions in experimental structures or to model segments of designed sequences. The FALC method is computationally less expensive than typical ab initio methods because the conformational search space is effectively reduced by the use of fragments derived from a structure database. The analytical loop closure algorithm allows efficient search for loop conformations that fit into the protein framework starting from the fragment-assembled structures. The FALC method shows prediction accuracy comparable to other state-of-the-art loop modeling methods. Top-ranked model structures can be visualized on the web server, and an ensemble of loop structures can be downloaded for further analysis. The web server can be freely accessed at http://falc-loop.seoklab.org/.

Year:  2011        PMID: 21576220      PMCID: PMC3125760          DOI: 10.1093/nar/gkr352

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


INTRODUCTION

Protein loops are often responsible for the functional specificity of a given protein by contributing to recognition of interaction partners, enzymatic reactions with substrates or conformational changes relevant to function. The special properties of protein loops originate from the variable loop structures that occur as a result of substitutions, insertions or deletions in sequence during evolution. Many available loop modeling web servers use database search methods (1–3) that search for loops of related sequences in the structure database. When loops with reasonable sequence similarity are not found, one may have to rely on ab initio methods. However, typical ab initio methods that rely mainly on intensive energy optimizations are very time consuming and therefore may not be suitable for a web-based service, where predictions have to be produced in a relatively short time. In this article, we introduce the FALC-Loop server, a protein loop modeling web server that implements an efficient ab initio loop modeling method, FALC (fragment assembly and analytical loop closure) (4). The FALC method is faster than typical ab initio methods because the use of fragments derived from a structure database drastically reduces the conformational search space and a knowledge-based potential allows fast scoring of the generated conformations. The fragment-assembled structures are not geometrically consistent with a given framework protein, but the backbone loop dihedral angles can be adjusted efficiently to fit into the framework by solving the analytical loop closure equation (5,6). The prediction accuracy of the FALC method is comparable to that of other ab initio methods owing to its excellent loop sampling performance (4). Combining the efficient loop sampling method with a more intensive energy optimization can improve the prediction accuracy, but at the cost of a large increase in computation time (Park, H. and Seok, C., manuscript in preparation).

FALC-LOOP METHOD

A flowchart of the FALC-Loop modeling procedure is shown in Figure 1. The FALC-Loop server employs the loop modeling method that combines fragment assembly and analytical loop closure developed in Ref. (4).
Figure 1.

A flowchart of the FALC-Loop modeling procedure.

First, 4000 candidate loop structures are generated by fragment assembly. For each residue of the target loops, 200 fragment structures of length 5 (for loops of five residues or fewer) or length 7 (for loops longer than five residues) with similar sequence features are collected from the ASTRAL SCOP (version 1.63) structure database (7–9), filtered to a maximum pairwise sequence identity of 25% (4362 chains, 905 684 residues). The collected fragments are assembled by sequentially adding randomly chosen fragments starting from the N-terminal region of the loop, requiring that the fragments have similar torsion angles at junctions. The average length of the joined segments is about two residues. Second, the analytical loop closure algorithm (5,6) is applied to fit the candidate structures into the rest of the protein structure by rotating the six backbone torsion angles of three randomly chosen residues. In a variant method called FALCm, an additional step is taken in which an energy function designed to keep the torsion angles within the allowed regions of the Ramachandran map is minimized while the loop closure restraint is satisfied simultaneously (4). Only the backbone conformations are generated up to this stage. Third, for each of the model sets generated by the FALC and FALCm methods, 1000 backbone-only models are selected from the closed loop candidate structures by scoring with the DFIRE-β potential (4,10). Side chain conformations are then built and optimized for the resulting 2000 models using our in-house version of SCWRL (4). These models are scored by the DFIRE potential, and the top-ranked models are reported.
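The fragment assembly step can be illustrated with a minimal Python sketch. It is not the FALC implementation (which, according to the server description, is written in C). The fragment representation (a list of (phi, psi) pairs), the single flat fragment library, the 30-degree junction tolerance and all function names are assumptions made for illustration. Only the fragment-length rule and the N-terminal-first growth with torsion matching at junctions are taken from the text.

```python
import random

def fragment_length(loop_length):
    """Fragment length rule from the text: 5 for loops of <= 5 residues, else 7."""
    return 5 if loop_length <= 5 else 7

def angle_diff(a, b):
    """Smallest absolute difference between two angles, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)

def junction_ok(built, fragment, tol=30.0):
    """Hypothetical junction test: the first residue of the new fragment must
    have (phi, psi) within `tol` degrees of the last residue already built."""
    (phi0, psi0), (phi1, psi1) = built[-1], fragment[0]
    return angle_diff(phi0, phi1) <= tol and angle_diff(psi0, psi1) <= tol

def assemble(loop_length, library, rng, max_tries=1000):
    """Grow a list of (phi, psi) pairs from the N-terminal end by appending
    randomly chosen fragments that overlap the last built residue."""
    built = list(rng.choice(library))
    tries = 0
    while len(built) < loop_length:
        tries += 1
        if tries > max_tries:
            return None  # give up; a caller could retry from a new start
        frag = rng.choice(library)
        if junction_ok(built, frag):
            built.extend(frag[1:])  # skip the overlapping junction residue
    return built[:loop_length]

# Toy library: three helical-like fragments for an 8-residue loop.
toy_library = [[(-60.0, -45.0)] * fragment_length(8) for _ in range(3)]
toy_loop = assemble(8, toy_library, random.Random(0))
print(len(toy_loop))
```

The torsion-only candidates produced this way would still have to be closed onto the framework, which is the role of the analytical loop closure step and is not sketched here.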

Performance of the method

The FALC method was shown in Ref. (4) to outperform several of the best previous loop sampling methods. For example, it shows better loop sampling performance than the recently published method SOS (11) when tested on 30 loops [Table I of Ref. (4)]: the average minimum RMSD from the native structure improves from 1.2 to 0.8 Å and from 2.3 to 1.8 Å for 8- and 12-residue loops, respectively. The loop sampling method was also tested on 317 loops and gave better results than RAPPER (12) [Table III of Ref. (4)]. In Ref. (4), the FALC-Loop method was also shown to give more accurate loop models than RAPPER combined with DFIRE scoring (13) [Table IV of Ref. (4)]. The FALC-Loop server also performs better than the well-known loop modeling server ModLoop (14) for longer loops of 8 and 12 residues, as shown in Table 1, when tested on the 30 loops listed in Table I of Ref. (4). (Homologous proteins were removed from the database during fragment library generation for this comparison.) The performance of the FALC-Loop method (RMSD = 3.1, 3.4 and 3.8 Å for 10, 11 and 12 residues, respectively) [Table IV of Ref. (4)] is also comparable to that of the commercially available programs Prime (Schrödinger, LLC), MODELLER (Accelrys Software, Inc.), ICM (Molsoft, LLC) and Sybyl (Tripos, Inc.), which give 3–5 Å for 10–12 residues (15), although different benchmark sets were used. However, it may be less accurate than other loop modeling methods that employ more extensive energy optimizations, such as ROSETTA (16,17).
Table 1.

The average RMSD of the loop conformations predicted by ModLoop and FALC-Loop (FALC and FALCm) when tested on the 30 loops listed in Table I of Ref. (4)

Loop length (aa)    Average RMSD from native (Å)
                    ModLoop    FALC    FALCm
4                   0.66       0.87    0.93
8                   2.46       2.34    1.87
12                  4.48       3.13    3.07
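As a quick check of the comparison stated in the text, the Table 1 values can be tabulated and compared in a few lines of Python. The dictionary layout and the function name are illustrative assumptions; the numbers are transcribed from Table 1.

```python
# Average RMSD from native (Å), transcribed from Table 1.
TABLE_1 = {
    4:  {"ModLoop": 0.66, "FALC": 0.87, "FALCm": 0.93},
    8:  {"ModLoop": 2.46, "FALC": 2.34, "FALCm": 1.87},
    12: {"ModLoop": 4.48, "FALC": 3.13, "FALCm": 3.07},
}

def best_method(row):
    """Name of the method with the lowest average RMSD in one table row."""
    return min(row, key=row.get)

for length, row in sorted(TABLE_1.items()):
    # Positive gain means the better FALC variant beats ModLoop.
    gain = row["ModLoop"] - min(row["FALC"], row["FALCm"])
    print(f"{length:>2} residues: best = {best_method(row)}, "
          f"ModLoop minus best FALC variant = {gain:+.2f} Å")
```

The comparison is consistent with the statement above: ModLoop has the lowest average RMSD for 4-residue loops, whereas both FALC variants have lower values for 8- and 12-residue loops.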

FALC-LOOP WEB SERVER

Hardware and software

The FALC-Loop server runs on a Linux server with a dual-core 2.8 GHz Intel Xeon processor. The web application uses Python and the MySQL database. The loop prediction pipeline is implemented in Python and combines the fragment assembly program, implemented in C, with the algorithms for loop closure, side chain optimization and DFIRE scoring, implemented in Fortran 90. Jmol (http://www.jmol.org) is used for visualization of predicted structures.

Input

The FALC-Loop server accepts as input a protein structure and the positions and sequences of one or more loops. The maximum sizes of the protein chain and the loops are set to 1500 and 50 amino acids, respectively, for efficient service. For a protein larger than the maximum size, the user may truncate parts of the protein structure that are far from the loops of interest. Typical computation time is about 3 h for loops of 8–12 residues in protein chains of fewer than 500 residues. The protein structure has to be provided in the PDB format. The structure file is expected to contain coordinates of all residues except those in the loop regions. The server reads the SEQRES and ATOM lines in the PDB file and identifies stretches of residues with missing ATOM lines as loops. If the PDB file does not contain SEQRES lines, a separate SEQ file must be provided in the FASTA format. After submission of a structure file and an optional sequence file, the loops identified by the server are displayed. Once the loops to be modeled are selected, the job is added to the modeling queue. The modeling progress can be checked by following the link to the report page or through the Queue page.
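The loop detection described above (residues listed in SEQRES but lacking ATOM lines) can be sketched in Python as follows. This is not the server's code. It makes the simplifying assumption, often untrue for real PDB files, that ATOM residue numbers equal 1-based SEQRES positions; a robust tool would align the two sequences instead. The function names and the toy records are hypothetical.

```python
def seqres_length(pdb_lines, chain):
    """Count the residues listed in SEQRES records for one chain."""
    return sum(len(line[19:].split()) for line in pdb_lines
               if line.startswith("SEQRES") and line[11] == chain)

def observed_residues(pdb_lines, chain):
    """Residue numbers that have at least one ATOM record."""
    return {int(line[22:26]) for line in pdb_lines
            if line.startswith("ATOM") and line[21] == chain}

def missing_stretches(pdb_lines, chain="A"):
    """Return (start, end) ranges of SEQRES positions without ATOM records.
    ASSUMPTION: ATOM residue numbers equal 1-based SEQRES positions."""
    total = seqres_length(pdb_lines, chain)
    seen = observed_residues(pdb_lines, chain)
    gaps, start = [], None
    for pos in range(1, total + 1):
        if pos not in seen:
            if start is None:
                start = pos          # a new gap opens here
        elif start is not None:
            gaps.append((start, pos - 1))
            start = None
    if start is not None:            # gap runs to the C-terminus
        gaps.append((start, total))
    return gaps

def atom_line(serial, res_name, chain, res_num, xyz=(0.0, 0.0, 0.0)):
    """Format a minimal C-alpha ATOM record in fixed PDB columns (demo only)."""
    x, y, z = xyz
    return (f"ATOM  {serial:5d}  CA  {res_name:3s} {chain}{res_num:4d}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}")

# Toy input: six residues in SEQRES, coordinates present for 1, 2, 5 and 6.
names = ["GLY", "ALA", "SER", "THR", "LEU", "VAL"]
demo = [f"SEQRES {1:3d} A {len(names):4d}  {' '.join(names)}"]
demo += [atom_line(i, names[n - 1], "A", n)
         for i, n in enumerate([1, 2, 5, 6], start=1)]
print(missing_stretches(demo))
```

In this toy input, residues 3 and 4 have no ATOM records, so they form the single stretch that would be offered as a loop to model.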

Output

The FALC-Loop output consists of two pages: the Modeling Report page and a data download page. On the Modeling Report page (Figure 2A–C), the top five models obtained from each of the methods FALC and FALCm are presented. Static structure images both with and without the protein framework can be viewed on the web page. Structures can also be examined using the Jmol structure viewer by clicking the ‘View in Jmol’ link. The loop structures are colored according to their rank by the DFIRE potential. The PDB files used to draw the images can be downloaded from the DOWNLOAD link. A table summarizes the DFIRE scores and the following RMSD measures relative to the first model: L-RMSD (C-α RMSD of the loop after superimposition of the loop structures), A-RMSD (C-α RMSD of the loop with the framework held fixed) and C-RMSD (C-α RMSD of the protein structure). The DFIRE scores can be used as a guideline if the stabilities of different loop conformations need to be compared, although it is in general challenging to estimate model quality from such scores. The RMSD measures may be used to get a quick idea of the relative differences between the models. Each loop conformation can also be downloaded from the table.
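Because all models share the same framework, an A-RMSD-style measure can be computed directly from the loop C-α coordinates without any superposition. The following Python sketch shows that calculation under this reading of the definition. The function name and the toy coordinates are assumptions. L-RMSD would additionally require an optimal superposition of the two loops (for example with the Kabsch algorithm), which is not shown.

```python
import math

def ca_rmsd(coords_a, coords_b):
    """RMSD between two equal-length lists of (x, y, z) C-alpha coordinates,
    with no superposition applied (both loops sit in the same framework)."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal length")
    sq = sum((xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
             for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b))
    return math.sqrt(sq / len(coords_a))

# Toy example: the second loop is the first one shifted by (0, 3, 4) Å,
# so every C-alpha atom is displaced by exactly 5 Å.
first_model = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
other_model = [(0.0, 3.0, 4.0), (1.0, 3.0, 4.0), (2.0, 3.0, 4.0)]
print(ca_rmsd(first_model, other_model))
```

A rigid shift like the one in the toy example gives a large A-RMSD even though the loop shape is unchanged, which is why the report lists both the fixed-framework and the superimposed measures.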
Figure 2.

FALC-Loop output page. The Modeling Report page shows (A) job information, loop information and five top-ranking loop models obtained by (B) the FALC method and (C) the FALCm method. The static images for the loop structures are shown with and without the framework structure. DFIRE scores and RMSDs from the best model are tabulated. The structures can also be viewed using the Jmol structure viewer with and without the framework (E and F) by clicking the ‘View in Jmol’ link. (D) On the download page, fragment libraries, structures from the intermediate stages, as well as the final structures and DFIRE energy scores can be downloaded.

The FALC-Loop server provides additional data on the download page (Figure 2D). The ensemble of the 2000 final models and the DFIRE scores can be used for analysis of alternative structures. Other data may be used for further research, such as method development for fragment assembly (fragment libraries), loop closure (fragment-assembled structures) or side chain optimization (closed backbone-only structures).
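One possible way to analyze alternative structures in the downloaded ensemble is to walk through the models in order of increasing score and keep only those that differ structurally from the models already kept. The Python sketch below shows this greedy selection. It is an illustration, not part of the server: the in-memory (score, coordinates) representation, the 1 Å cutoff, the assumption that a lower DFIRE score is better and the function names are all hypothetical.

```python
import math

def rmsd(a, b):
    """RMSD between two equal-length lists of (x, y, z) points, no fitting."""
    sq = sum((p - q) ** 2 for ra, rb in zip(a, b) for p, q in zip(ra, rb))
    return math.sqrt(sq / len(a))

def distinct_low_energy(models, cutoff=1.0, n=5):
    """Greedy pick of up to n models, lowest score first, each more than
    `cutoff` Å from every model already picked.
    ASSUMPTION: a lower score means a more favorable model."""
    picked = []
    for score, coords in sorted(models, key=lambda m: m[0]):
        if all(rmsd(coords, kept) > cutoff for _, kept in picked):
            picked.append((score, coords))
        if len(picked) == n:
            break
    return picked

# Toy ensemble of one-atom "loops": the second model nearly duplicates the
# first, the third is a genuinely different conformation.
ensemble = [
    (-9.0, [(0.1, 0.0, 0.0)]),
    (-10.0, [(0.0, 0.0, 0.0)]),
    (-8.0, [(3.0, 0.0, 0.0)]),
]
alternatives = distinct_low_energy(ensemble)
print([score for score, _ in alternatives])
```

In the toy ensemble the near-duplicate of the best-scoring model is skipped, leaving two distinct conformations.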

CONCLUSIONS

The FALC-Loop web server is a protein loop modeling server that employs an efficient ab initio loop modeling method with aspects of knowledge-based methods, such as the use of structure fragments derived from a structure database and scoring by a knowledge-based potential. Unlike web servers based on database search methods, the server does not require related loops to be available in the structure database for high-accuracy prediction. Therefore, the FALC-Loop server may also be applied to modeling designed loops, loops in multiple states, etc.

FUNDING

National Research Foundation of Korea funded by the Ministry of Education, Science and Technology (2010-0000220 to J.L., 305-20100007 to C.S.); National Institutes of Health (R01 GM 090205-02 to E.A.C.); Center for Marine Natural Products and Drug Discovery (CMDD), one of the MarineBio21 programs funded by the Ministry of Land, Transport, and Maritime Affairs of Korea (to J.K. and H.P.). Funding for open access charge: Seoul National University. Conflict of interest statement. None declared.
  15 in total

1.  Ab initio construction of polypeptide fragments: efficient generation of accurate, representative ensembles.

Authors:  Mark A DePristo; Paul I W de Bakker; Simon C Lovell; Tom L Blundell
Journal:  Proteins       Date:  2003-04-01

2.  ModLoop: automated modeling of loops in protein structures.

Authors:  András Fiser; Andrej Sali
Journal:  Bioinformatics       Date:  2003-12-12       Impact factor: 6.937

3.  A kinematic view of loop closure.

Authors:  Evangelos A Coutsias; Chaok Seok; Matthew P Jacobson; Ken A Dill
Journal:  J Comput Chem       Date:  2004-03       Impact factor: 3.376

4.  Protein loop modeling by using fragment assembly and analytical loop closure.

Authors:  Julian Lee; Dongseon Lee; Hahnbeom Park; Evangelos A Coutsias; Chaok Seok
Journal:  Proteins       Date:  2010-09-24

5.  Prediction of protein solvent accessibility using fuzzy k-nearest neighbor method.

Authors:  Jaehyun Sim; Seung-Yeon Kim; Julian Lee
Journal:  Bioinformatics       Date:  2005-04-06       Impact factor: 6.937

6.  Modeling protein loops with knowledge-based prediction of sequence-structure alignment.

Authors:  Hung-Pin Peng; An-Suei Yang
Journal:  Bioinformatics       Date:  2007-09-07       Impact factor: 6.937

7.  Protein-protein docking with backbone flexibility.

Authors:  Chu Wang; Philip Bradley; David Baker
Journal:  J Mol Biol       Date:  2007-08-02       Impact factor: 5.469

8.  Loopholes and missing links in protein modeling.

Authors:  Karen A Rossi; Carolyn A Weigelt; Akbar Nayeem; Stanley R Krystek
Journal:  Protein Sci       Date:  2007-07-27       Impact factor: 6.725

9.  SuperLooper--a prediction server for the modeling of loops in globular and membrane proteins.

Authors:  Peter W Hildebrand; Andrean Goede; Raphael A Bauer; Bjoern Gruening; Jochen Ismer; Elke Michalsky; Robert Preissner
Journal:  Nucleic Acids Res       Date:  2009-05-08       Impact factor: 16.971

10.  A self-organizing algorithm for modeling protein loops.

Authors:  Pu Liu; Fangqiang Zhu; Dmitrii N Rassokhin; Dimitris K Agrafiotis
Journal:  PLoS Comput Biol       Date:  2009-08-21       Impact factor: 4.475
