José Ramón López-Blanco, Alejandro Jesús Canosa-Valls, Yaohang Li, Pablo Chacón.
Abstract
Modeling loops is a critical and challenging step in protein modeling and prediction. We have developed a quick online service (http://rcd.chaconlab.org) for ab initio loop modeling combining a coarse-grained conformational search with a full-atom refinement. Our original Random Coordinate Descent (RCD) loop closure algorithm has been greatly improved to enrich the sampling distribution towards near-native conformations. These improvements include a new workflow optimization, MPI-parallelization and fast backbone angle sampling based on neighbor-dependent Ramachandran probability distributions. The server starts by efficiently searching the vast conformational space from only the loop sequence information and the environment atomic coordinates. The generated closed loop models are subsequently ranked using a fast distance-orientation dependent energy filter. Top-ranked loops are refined with the Rosetta energy function to obtain accurate all-atom predictions that can be interactively inspected in a user-friendly web interface. Using standard benchmarks, the average root mean squared deviation (RMSD) is 0.8 and 1.4 Å for 8- and 12-residue loops, respectively, in the challenging modeling scenario in which the side chains of the loop environment are fully remodeled. These results are not only very competitive with those obtained with public state-of-the-art methods, but are also obtained ∼10-fold faster.
Year: 2016 PMID: 27151199 PMCID: PMC4987936 DOI: 10.1093/nar/gkw395
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Distribution of loop lengths in the protein chain list generated by the PISCES server on April 13, 2016, containing 18 275 chains with 2.0 Å resolution, 90% sequence identity and 0.25 R-factor cutoffs.
Figure 2. Sample results page provided by the server for a bacterial hydrolase loop (PDB-ID 1qwl). In this case, for validation purposes only, the native loop (yellow) is displayed superimposed on the predicted lowest-energy model in the JSmol visualization panel. On the right, the 20 top-ranked loop models are sorted by energy and can be easily selected to activate visualization and customize the representation. The RMSD versus ICOSA energy plots and the Ramachandran distributions are shown at the bottom.
Loop-prediction performance of RCD+ and other state-of-the-art methods
| Length (c) | Statistic | Native (a): HLP (b) Std. | Native: Galaxy PS2 | Native: RCD+ 1 (d) | Native: RCD+ 5 | Native: RCD+ 20 | Modeling: HLP SS | Modeling: Galaxy PS2 | Modeling: Rosetta NGK | Modeling: RCD+ 1 | Modeling: RCD+ 5 | Modeling: RCD+ 20 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 8 | Median | 0.6 | 0.6 | 0.5 | 0.3 | 0.3 | 0.9 | 1.1 | 0.4 | 0.5 | 0.4 | 0.4 |
| 8 | Average | 1.2 | 0.9 | 0.6 | 0.5 | 0.4 | 1.3 | 1.3 | 0.5 | 0.8 | 0.7 | 0.6 |
| 8 | Sigma | 1.5 | 0.7 | 0.3 | 0.3 | 0.2 | 1.5 | 1.0 | 0.3 | 1.0 | 1.0 | 0.8 |
| 8 | # (e) | 13 | 14 | 17 | 18 | 19 | 11 | 9 | 17 | 15 | 18 | 18 |
| 12 | Median | 0.6 | 1.4 | 0.6 | 0.4 | 0.4 | 0.9 | 1.6 | 0.8 | 0.6 | 0.6 | 0.5 |
| 12 | Average | 1.2 | 1.6 | 1.0 | 0.9 | 0.7 | 1.4 | 2.1 | 1.7 | 1.4 | 0.8 | 0.8 |
| 12 | Sigma | 1.2 | 1.3 | 1.7 | 1.0 | 1.0 | 1.4 | 1.7 | 1.8 | 1.6 | 0.9 | 0.9 |
| 12 | # (e) | 12 | 7 | 16 | 16 | 17 | 11 | 4 | 11 | 13 | 17 | 17 |
(a) Sampling scenarios: (i) Native, the side chains of the loop environment are kept, or (ii) Modeling, which includes the refinement of the loop environment side chains.
(b) HLP, HLP-SS, Galaxy-PS2 and Rosetta-NGK root mean squared deviations (RMSDs) were taken from Supplementary Tables S1, S2, S4 and S5 of (8) and calculated considering the main-chain atoms N, Cα, C and O.
(c) Number of residues in the loop.
(d) RMSD of the lowest Rosetta-energy loop predicted with RCD+, together with the best RMSD among the 5 and 20 loops of lowest Rosetta energy.
(e) Number of sub-angstrom cases.
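The quantities reported in the table can be illustrated with a minimal Python sketch. It is not the server's code. It assumes that each loop is given as a list of residues mapping atom names to coordinates, that the RMSD is taken over the main-chain atoms N, Cα (`CA`), C and O without re-superimposing the loop, that sigma is a population standard deviation, and that "sub-angstrom" means RMSD below 1.0 Å. The function names `backbone_rmsd`, `best_of_top` and `summarize` are hypothetical.

```python
import math
import statistics

# Main-chain atoms used for the RMSD, as stated in the table footnotes.
BACKBONE = ("N", "CA", "C", "O")


def backbone_rmsd(model, native):
    """RMSD over N, CA, C and O atoms of two loops of equal length.

    Each loop is a list of residues; each residue is a dict mapping an
    atom name to an (x, y, z) tuple. No superposition is performed
    (assumption: both loops share the frame of the fixed protein).
    """
    if len(model) != len(native):
        raise ValueError("loops must have the same number of residues")
    sq_sum = 0.0
    count = 0
    for res_m, res_n in zip(model, native):
        for name in BACKBONE:
            sq_sum += sum((a - b) ** 2 for a, b in zip(res_m[name], res_n[name]))
            count += 1
    return math.sqrt(sq_sum / count)


def best_of_top(models, n):
    """Best RMSD among the n lowest-energy models.

    `models` is a list of (energy, rmsd) pairs. With n=1 this is the RMSD
    of the lowest-energy model; n=5 and n=20 mirror the table columns.
    """
    ranked = sorted(models, key=lambda m: m[0])
    return min(rmsd for _, rmsd in ranked[:n])


def summarize(rmsds):
    """Median, average, sigma and sub-angstrom count for a benchmark set."""
    return {
        "median": statistics.median(rmsds),
        "average": statistics.mean(rmsds),
        "sigma": statistics.pstdev(rmsds),  # assumption: population sigma
        "sub_angstrom": sum(1 for r in rmsds if r < 1.0),
    }
```

For one benchmark target, `best_of_top(models, 1)`, `best_of_top(models, 5)` and `best_of_top(models, 20)` would give the three per-target values behind the "1", "5" and "20" columns, and `summarize` applied to those values across all targets of a given loop length would give the four rows of statistics.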
Figure 3. Illustrative cases of the server performance with benchmark test cases. In all cases, the first-ranked model (lowest energy) is depicted in blue, the native loop in yellow and the protein environment in gray. Alternative solutions found in the 12th best (1oth) and 2nd best (1a8d) top-ranked predictions are colored in cyan.