Srivatsan Raman, Yuanpeng J Huang, Binchen Mao, Paolo Rossi, James M Aramini, Gaohua Liu, Gaetano T Montelione, David Baker.
Abstract
Conventional NMR structure determination requires nearly complete assignment of the cross peaks of a refined NOESY peak list. Depending on the size of the protein and the quality of the spectral data, this can be a time-consuming manual process requiring several rounds of peak list refinement and structure determination. Programs such as Aria, CYANA, and AutoStructure can generate models using unassigned NOESY data but are very sensitive to the quality of the input peak lists and can converge to inaccurate structures if the signal-to-noise of the peak lists is low. Here, we show that models with high accuracy and reliability can be produced by combining the strengths of the high-resolution structure prediction program Rosetta with global measures of the agreement between structure models and experimental data. A first round of models generated using CS-Rosetta (Rosetta supplemented with backbone chemical shift information) is filtered on the basis of goodness-of-fit with unassigned NOESY peak lists using the DP-score, and the best-fitting models are subjected to high-resolution refinement with the Rosetta rebuild-and-refine protocol. This hybrid approach uses both local backbone chemical shift data and the unassigned NOESY data to direct Rosetta trajectories toward the native structure and produces more accurate models than AutoStructure/CYANA or CS-Rosetta alone, particularly when using raw unedited NOESY peak lists. We also show that when accurate manually refined NOESY peak lists are available, Rosetta refinement can consistently increase the accuracy of models generated using CYANA and AutoStructure.
Year: 2010 PMID: 20000319 PMCID: PMC2841443 DOI: 10.1021/ja905934c
Source DB: PubMed Journal: J Am Chem Soc ISSN: 0002-7863 Impact factor: 15.419
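The filtering step described in the abstract (rank first-round CS-Rosetta models by DP-score, then pass the best-fitting ones to refinement) can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the `Model` record, the `select_for_refinement` helper, and the `n_keep` cutoff are hypothetical names, and the sketch assumes that a higher DP-score means better agreement with the NOESY peak list. It does not call Rosetta or any DP-score software.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Model:
    """Hypothetical record for one first-round model (not a Rosetta type)."""
    name: str
    energy: float    # Rosetta all-atom energy; lower is better
    dp_score: float  # fit to the unassigned NOESY peak list; assumed higher is better


def select_for_refinement(models: List[Model], n_keep: int) -> List[Model]:
    """Return the n_keep models with the best (highest) DP-score."""
    if n_keep < 1:
        raise ValueError("n_keep must be at least 1")
    # Sort by DP-score, best first, and keep the top n_keep for refinement.
    ranked = sorted(models, key=lambda m: m.dp_score, reverse=True)
    return ranked[:n_keep]


# Made-up values, for illustration only.
first_round = [
    Model("m1", energy=-210.0, dp_score=0.42),
    Model("m2", energy=-250.0, dp_score=0.71),
    Model("m3", energy=-190.0, dp_score=0.65),
    Model("m4", energy=-230.0, dp_score=0.30),
]
selected = select_for_refinement(first_round, n_keep=2)
print([m.name for m in selected])
```

In this sketch the Rosetta energy is carried along but not used for selection; the Figure 1 caption indicates that energy and DP-score are also combined when discriminating models, and how they are weighted is not specified here.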
Figure 1. Model generation from raw and refined peak lists with CYANA/AutoStructure and Rosetta for protein SR213. (A) Rosetta all-atom energy vs rmsd to the X-ray structure. Dark blue points are CYANA/AutoStructure models from raw peak lists, with energy set to an arbitrary value. Red points are Rosetta models after the CS-DP-Rosetta protocol using raw peak lists. Purple points are CYANA/AutoStructure models from refined peak lists, with energy set to an arbitrary value. Light blue points are Rosetta models generated by the AssignNOE-Rosetta refinement protocol starting from the purple points. (A′) Rosetta all-atom energy + DP-score vs rmsd to the X-ray structure for Rosetta models after the CS-DP-Rosetta protocol from raw peak lists (red points in panel A). Note that the Rosetta energy function correctly assigns very low energies to the models less than 2 Å from the native structure (light blue in panel A); adding the DP-score improves discrimination of models somewhat further from the native structure (2–3 Å). (B–E) Superposition of the X-ray structure (dark blue) with the best CYANA/AutoStructure model from raw peak lists (B), the best Rosetta model after the CS-DP-Rosetta protocol using raw peak lists (C), the best CYANA/AutoStructure model from refined peak lists (D), and the best Rosetta model after the AssignNOE-Rosetta model generation protocol using refined peak lists (E). The arrows in panel A indicate the models chosen for superposition in panels B–E.
Improvement in Model Accuracy Using Unassigned NOESY Peak Lists
(A) CS-DP-Rosetta (raw NOESY peak lists); rmsd values in Å

| Protein name (length) | CS-DP-Rosetta model | CYANA/AutoStructure model | CS-Rosetta model |
|---|---|---|---|
| CcR55 (116 aa) | 2.42 (1.86) | 1.71 (1.68) | 7.40 (5.68) |
| SR213 (123 aa) | 2.93 (2.37) | 8.03 (7.76) | 6.15 (3.65) |
| StR65 (100 aa) | 1.40 (1.10) | 2.84 (1.45) | 7.44 (5.91) |
Column 2 in sections A and B is the median rmsd to native of the 10 lowest energy models. Column 3 in sections A and B is the median rmsd to native in the CYANA/AutoStructure ensemble using the raw and refined peak lists, respectively. Column 4 in section A is the median rmsd of the 10 lowest energy models generated using CS-Rosetta (without DP-score filtering); in section B it is the median rmsd to the X-ray structure of all the conformers in the PDB-deposited NMR ensemble. The numbers in parentheses denote the lowest rmsd model in the ensemble. All rmsd values are computed with reference to the X-ray structure over the core residues as identified by FindCore [11]. The numbers of core residues are as follows: CcR55, 85 aa; SR213, 103 aa; StR65, 77 aa; HR41, 125 aa; and SsR10, 107 aa. The rmsd values over the full length are shown in Table S2 in the Supporting Information. The protein names are NESG target IDs; detailed protein sequence data for these targets are available from the SPINE database [13,14].
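The two numbers reported per table cell (median rmsd of the 10 lowest energy models, with the lowest rmsd in parentheses) can be reproduced from a list of (energy, rmsd) pairs along the following lines. This is a minimal sketch with made-up numbers; `summarize_rmsd` is a hypothetical helper, and it assumes the value in parentheses is taken over the same low-energy subset, which the footnote does not state explicitly.

```python
from statistics import median
from typing import List, Tuple


def summarize_rmsd(models: List[Tuple[float, float]],
                   n_lowest: int = 10) -> Tuple[float, float]:
    """Return (median rmsd, lowest rmsd) over the n_lowest lowest-energy models.

    Each model is an (energy, rmsd_to_native) pair, with rmsd in Å.
    """
    if not models:
        raise ValueError("no models given")
    # Pick the n_lowest models by energy (lower energy is better).
    lowest_energy = sorted(models, key=lambda m: m[0])[:n_lowest]
    rmsds = [rmsd for _, rmsd in lowest_energy]
    return median(rmsds), min(rmsds)


# Made-up (energy, rmsd) pairs, for illustration only.
example = [(-100.0, 1.0), (-90.0, 3.0), (-95.0, 2.0), (-10.0, 0.5)]
med, best = summarize_rmsd(example, n_lowest=3)
print(f"{med:.2f} ({best:.2f})")
```

Note that the lowest-rmsd model overall (0.5 Å in the made-up data) is not reported because its energy falls outside the selected subset; this mirrors selection by energy rather than by knowledge of the native structure.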
Figure 2. Blind structure determinations with the CS-DP-Rosetta protocol: (A) VpR247, (B) AR3436A, (C) HR4394C. (Left) Experimentally solved NMR ensemble. (Right) Ensemble of lowest energy structures from the CS-DP-Rosetta protocol. Refined peak lists were used for VpR247 and AR3436A; raw peak lists were used for HR4394C.
Figure 3. Superposition of the AssignNOE-Rosetta model (red) with the starting model generated by CYANA/AutoStructure using refined peak lists (light green) and the X-ray structure (dark blue): (A) CcR55, (B) StR65 (flexible loop residues 14–22 not shown), (C) HR41, (D) SsR10.