Binchen Mao, Roberto Tejero, David Baker, Gaetano T Montelione.
Abstract
We have found that refinement of protein NMR structures using Rosetta with experimental NMR restraints yields more accurate protein NMR structures than those that have been deposited in the PDB using standard refinement protocols. Using 40 pairs of NMR and X-ray crystal structures determined by the Northeast Structural Genomics Consortium, for proteins ranging in size from 5-22 kDa, restrained Rosetta refined structures fit better to the raw experimental data, are in better agreement with their X-ray counterparts, and have better phasing power compared to conventionally determined NMR structures. For 37 proteins for which NMR ensembles were available and which had similar structures in solution and in the crystal, all of the restrained Rosetta refined NMR structures were sufficiently accurate to be used for solving the corresponding X-ray crystal structures by molecular replacement. The protocol for restrained refinement of protein NMR structures was also compared with restrained CS-Rosetta calculations. For proteins smaller than 10 kDa, restrained CS-Rosetta, starting from extended conformations, provides slightly more accurate structures, while for proteins in the size range of 10-25 kDa the less CPU-intensive restrained Rosetta refinement protocols provided equally or more accurate structures. The restrained Rosetta protocols described here can improve the accuracy of protein NMR structures and should find broad and general use for studies of protein structure and function.
Year: 2014 PMID: 24392845 PMCID: PMC4129517 DOI: 10.1021/ja409845w
Source DB: PubMed Journal: J Am Chem Soc ISSN: 0002-7863 Impact factor: 15.419
Summary of NMR Structure Quality Statistics for 40 NMR Structures Using Different Refinement Protocols
| structural metric | parameter range | PDB | R3 | R3rst |
|---|---|---|---|---|
| NOE restraint violations per conformer (Å) | 0.1–0.2 Å | 5.0 ± 7.0 | 15.6 ± 8.7 | 6.6 ± 5.5 |
| | 0.2–0.5 Å | 1.9 ± 4.5 | 31.7 ± 20.3 | 4.0 ± 4.1 |
| | >0.5 Å | 0.1 ± 0.2 | 74.3 ± 56.0 | 1.3 ± 1.2 |
| dihedral restraint violations per conformer (°) | <10° | 5.2 ± 6.9 | 7.9 ± 7.0 | 1.1 ± 1.5 |
| | >10° | 0.2 ± 0.6 | 5.9 ± 6.6 | 0.8 ± 1.2 |
| Ensemble RMSD (Å) | bb_ord | 0.79 ± 0.69 | 1.05 ± 0.84 | 0.80 ± 0.81 |
| | hvy_ord | 1.19 ± 0.64 | 1.43 ± 0.80 | 1.10 ± 0.79 |
| | bb_all | 2.92 ± 1.85 | 3.38 ± 1.80 | 3.20 ± 1.70 |
| | hvy_all | 3.46 ± 1.85 | 3.90 ± 1.82 | 3.70 ± 1.70 |
| RPF scores | recall | 0.94 ± 0.07 | 0.92 ± 0.07 | 0.94 ± 0.06 |
| | precision | 0.90 ± 0.06 | 0.90 ± 0.06 | 0.90 ± 0.06 |
| | DP score | 0.79 ± 0.08 | 0.76 ± 0.07 | 0.79 ± 0.08 |
| PSVS | Verify3D | –2.11 ± 1.12 | –1.28 ± 0.91 | –1.44 ± 0.93 |
| | Prosa | –0.57 ± 1.03 | –0.18 ± 0.98 | –0.26 ± 1.01 |
| | Procheck_bb | –0.34 ± 1.68 | 0.14 ± 1.46 | 0.63 ± 1.58 |
| | Procheck_all | –0.94 ± 1.85 | 1.23 ± 1.43 | 1.40 ± 1.60 |
| | Molprobity clash score | –2.10 ± 1.20 | 0.84 ± 0.38 | 0.55 ± 0.57 |
Structure quality scores were analyzed by PSVS.[21] Constraint violations were calculated with the program PDBStat.[46] Knowledge-based statistics were calculated using the programs Verify3D,[25] ProsaII,[24] ProCheck,[23] and MolProbity,[26,27] normalized to Z = 0 for a set of 252 high-resolution X-ray crystal structures.[21]
Structure quality scores were calculated for the NMR structures available from the PDB (PDB) and for the unrestrained Rosetta (R3) and restrained Rosetta (R3rst) structures. For each statistic, the mean and standard deviation were computed across the 40 NMR structures and are formatted as mean ± sd.
Computed following superimposition of atoms with well-defined atomic positions, as determined by the dihedral angle order parameter method[49] as implemented in PSVS. bb_ord, backbone atoms (N, Cα, C′) in well-ordered residues; hvy_ord, all heavy (N, C, O, S) atoms in well-ordered residues; bb_all, backbone atoms of all residues; hvy_all, all heavy atoms in all residues.
RPF-DP scores were computed for 35 NMR structures for which NOESY peak list data are available, and provide a statistical assessment of the consistency of the 3D NMR structure ensemble with the NOESY peak list as provided by the RPF software.[16,17]
Figure 1. The number of restraint violations is significantly reduced by incorporating NMR restraints into Rosetta refinement. Restraint violations were assessed using PDBStat[46] across the complete set of 40 NESG NMR structures used in this study. (A) Boxplot of the number of distance restraint violations between 0.1 Å and 0.2 Å. (B) Boxplot of the number of distance restraint violations between 0.2 Å and 0.5 Å. (C) Boxplot of the number of distance restraint violations larger than 0.5 Å. (D) Boxplot of the number of dihedral angle restraint violations between 1 deg and 10 deg. (E) Boxplot of the number of dihedral angle restraint violations larger than 10 deg.
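The binning used in panels (A) to (C) can be illustrated with a minimal sketch. It assumes the per-restraint violation sizes (in Å) have already been computed for each conformer; the function names and the handling of values that fall exactly on a bin edge are assumptions made for illustration, not the behavior of PDBStat.

```python
def bin_distance_violations(violations):
    """Count distance-restraint violations for one conformer.

    `violations` is a hypothetical list of violation sizes in Å.
    Bins follow the caption: 0.1-0.2 Å, 0.2-0.5 Å and >0.5 Å.
    Assumption: a value on a bin edge goes to the lower bin, and
    violations of 0.1 Å or less are not counted.
    """
    counts = {"0.1-0.2": 0, "0.2-0.5": 0, ">0.5": 0}
    for v in violations:
        if v > 0.5:
            counts[">0.5"] += 1
        elif v > 0.2:
            counts["0.2-0.5"] += 1
        elif v > 0.1:
            counts["0.1-0.2"] += 1
    return counts


def mean_counts_per_conformer(ensemble):
    """Average the binned counts over all conformers of an ensemble."""
    totals = {"0.1-0.2": 0, "0.2-0.5": 0, ">0.5": 0}
    for conformer_violations in ensemble:
        for label, n in bin_distance_violations(conformer_violations).items():
            totals[label] += n
    return {label: total / len(ensemble) for label, total in totals.items()}
```

Averaging over conformers in this way gives a "violations per conformer" figure for one structure; the table statistics then summarize such values across the 40 structures.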
Figure 2. Rosetta-refined structures have RPF-DP scores, comparing the structure against the unassigned NOESY peak list, similar to those of structures deposited in the PDB. (A) Boxplots of Recall scores for structures deposited in the PDB or refined with Rosetta protocols. (B) Boxplots of Precision scores for structures deposited in the PDB or refined with Rosetta protocols. (C) Boxplots of DP-scores for structures deposited in the PDB or refined with Rosetta protocols. (D) DP-score scatterplot. DP-scores of the PDB NMR structures are plotted on the X-axis, while the DP-scores of both the unrestrained Rosetta refined structures (R3, red solid triangles) and the restrained Rosetta refined structures (R3rst, blue solid rectangles) are plotted on the Y-axis. The black dashed line indicates y = x. Data are presented for 35 NMR structures for which NOESY peak list data are available.
Figure 3. Knowledge-based structure quality scores are much improved after Rosetta refinement. (A) Boxplot of Procheck[23] backbone dihedral angle G-factor Z-scores for structures refined with different protocols. (B) Scatterplot of Procheck[23] backbone dihedral angle G-factor Z-scores. (C) Boxplot of Procheck[23] all dihedral angle G-factor Z-scores for structures refined with different protocols. (D) Scatterplot of Procheck[23] all dihedral angle G-factor Z-scores. (E) Boxplot of Molprobity clashscore[26,27] Z-scores for structures refined with different protocols. (F) Scatterplot of Molprobity clashscore[26,27] Z-scores. In the scatter plots (B), (D) and (F), the Z-scores of unrestrained Rosetta refined structures (R3) are plotted on the X-axis, while the Z-scores of restrained Rosetta refined structures (R3rst) are plotted on the Y-axis.
Figure 4. Restrained Rosetta refined structures are more similar to their corresponding X-ray crystal structures than PDB NMR structures. GDT.TS values of PDB NMR structures to corresponding X-ray structures are plotted on the X-axis, and GDT.TS values of both unrestrained Rosetta refined structures (R3, red solid triangles) and restrained Rosetta refined structures (R3rst, blue solid rectangles) to their corresponding X-ray structures are plotted on the Y-axis. Data are summarized for 39 NESG NMR/X-ray pairs. The two green dashed lines indicate GDT.TS values of PDB NMR structures equal to 0.7 and 0.85, respectively. The black dashed line indicates y = x, and the two gray dashed lines indicate y = x + 0.05 and y = x – 0.05, respectively.
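For readers unfamiliar with the GDT.TS metric, the following is a simplified sketch of the commonly used definition: the average, over cutoffs of 1, 2, 4 and 8 Å, of the fraction of Cα atoms lying within the cutoff of their counterparts in the reference structure. It assumes the two structures have already been superimposed and that per-residue Cα distances are supplied; the full metric searches over many superpositions to maximize each fraction, which this sketch does not attempt. The function name and input format are illustrative assumptions.

```python
def gdt_ts(ca_distances, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Simplified GDT.TS on a 0-1 scale for one fixed superposition.

    `ca_distances` is a hypothetical list of Cα-Cα distances (Å) between
    equivalent residues of the model and the reference structure.
    """
    n = len(ca_distances)
    if n == 0:
        raise ValueError("at least one residue pair is required")
    # Fraction of residues within each distance cutoff.
    fractions = [sum(1 for d in ca_distances if d <= c) / n for c in cutoffs]
    # GDT.TS is the mean of the per-cutoff fractions.
    return sum(fractions) / len(cutoffs)
```

On this 0-1 scale, the values of 0.7 and 0.85 marked in the figure correspond to 70 and 85 on the percentage scale often quoted elsewhere.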
Figure 5. The agreement between NMR structures and their X-ray counterparts is generally improved following restrained Rosetta refinement. Top: Plot of differences of RMSD to X-ray crystal structures before and after restrained Rosetta refinement. The NESG NMR/X-ray pair target index is plotted on the X-axis, and the differences between the RMSD of PDB NMR structures to their corresponding X-ray structures and the RMSD of restrained Rosetta refined structures to their corresponding X-ray structures are plotted on the Y-axis in units of Ångstroms. The four subpanels summarize data for well-defined (lower half) and not well-defined (upper half) residues, and for backbone (left) and sidechain (right) atoms. Well-defined residues are those with S(phi)+S(psi) ≥ 1.8.[21,49] Data are summarized for 39 NESG NMR/X-ray pairs. Bottom: Superimposition of X-ray, NMR and restrained Rosetta refined structures. Left – HR3646E; middle – HR4435B; right – DhR29B. The structures are color coded as: magenta – X-ray crystal structure; cyan – NMR structure deposited in PDB; blue – restrained Rosetta refined structure. For DhR29B, only the last two C-terminal beta strands are plotted.
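The well-defined residue criterion can be sketched as follows, assuming the dihedral angle order parameter takes its usual circular form, S = |Σ exp(iθ_j)| / N over the N conformers of the ensemble (S = 1 when all conformers share the same angle, S near 0 when the angles are scattered). The function names and input layout are illustrative assumptions and do not reproduce the PSVS implementation.

```python
import math


def order_parameter(angles_deg):
    """Circular order parameter S for one dihedral angle across an ensemble.

    `angles_deg` is a hypothetical list with one angle (degrees) per conformer.
    """
    n = len(angles_deg)
    if n == 0:
        raise ValueError("at least one angle is required")
    # Sum the unit vectors for each angle, then normalize the resultant length.
    x = sum(math.cos(math.radians(a)) for a in angles_deg)
    y = sum(math.sin(math.radians(a)) for a in angles_deg)
    return math.hypot(x, y) / n


def is_well_defined(phi_deg, psi_deg, threshold=1.8):
    """Apply the caption's criterion S(phi) + S(psi) >= 1.8 to one residue."""
    return order_parameter(phi_deg) + order_parameter(psi_deg) >= threshold
```

A residue whose phi and psi angles vary by only a few degrees across the ensemble gives a sum close to 2 and is classified as well-defined; a residue in a disordered tail gives a much lower sum.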
Figure 6. For Rosetta refinement of NMR structures, preserving ensemble information is beneficial for MR success. Scatterplot of Phaser[47] TFZ scores (A) and DP-scores[16,17] (B) for two different protocols for selecting models for MR. Decoy(Energy) Rosetta-refined structure ensembles are composed of the 20 lowest Rosetta energy decoys from the entire pool of decoys generated from all the NMR conformers. Decoy(Conformer+Energy) Rosetta-refined structure ensembles are composed of the lowest Rosetta energy decoy generated from each NMR conformer. The scores of structures picked by the Decoy(Energy) protocol are plotted on the X-axis, and the scores of structures picked by the Decoy(Conformer+Energy) protocol are plotted on the Y-axis. Unrestrained Rosetta refined structures are represented by red solid triangles and restrained Rosetta refined structures are represented by blue solid rectangles. Data are summarized for 38 NESG NMR/X-ray pairs used in the crystallographic MR study.
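The two model-selection protocols compared in this figure differ only in how decoys are picked from the refinement pool, which a short sketch can make concrete. It assumes each decoy is recorded with the index of the NMR conformer it was generated from and its Rosetta energy; the dictionary keys and function names are hypothetical and are not taken from the Rosetta software.

```python
def select_decoy_energy(decoys, n=20):
    """Decoy(Energy): the n lowest-energy decoys from the whole pool."""
    return sorted(decoys, key=lambda d: d["energy"])[:n]


def select_decoy_conformer_energy(decoys):
    """Decoy(Conformer+Energy): the lowest-energy decoy for each conformer."""
    best = {}
    for d in decoys:
        cid = d["conformer"]
        if cid not in best or d["energy"] < best[cid]["energy"]:
            best[cid] = d
    # Return one decoy per starting conformer, ordered by conformer index.
    return [best[cid] for cid in sorted(best)]
```

The first selection can draw all of its models from a few starting conformers, whereas the second keeps one model per conformer and so retains the spread of the original NMR ensemble.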
Figure 7. Restrained Rosetta refined NMR structures provide better templates for MR, and generally yield crystal structures with better Rfree scores. Dotplot of Rfree values of MR structures using MR templates for 38 NESG NMR/X-ray pairs deposited in the PDB or refined by Rosetta. The MR structures were solved either by Phenix[65] or Arp/WARP.[63,64] The Rfree values are plotted on the Y-axis. PDB NMR structures (PDB), unrestrained Rosetta refined structures (R3) and restrained Rosetta refined structures (R3rst) are colored black, red and green, respectively. Each subpanel represents one NESG target, and the subpanels are organized in ascending order of the resolution of the X-ray crystal structure, from the bottom left corner to the top right corner.
Comparison of Unrestrained (R3) and Restrained (R3rst) Rosetta Refinement of Monomeric NMR Structures with Results of Restrained CS-Rosetta, Based on GDT.TS to Corresponding X-ray Crystal Structures
| target | length | MW (kDa) | PDB | R3 | R3rst | CS-Rrst |
|---|---|---|---|---|---|---|
| ER382A | 53 | 6.0 | 0.77 | 0.80 | 0.81 | 0.91 |
| HR4435B | 53 | 6.1 | 0.71 | 0.77 | 0.79 | 0.80 |
| GmR137 | 70 | 7.5 | 0.78 | 0.79 | 0.80 | 0.84 |
| ZR18 | 83 | 9.4 | 0.77 | 0.76 | 0.78 | 0.90 |
| UuR17A | 101 | 11.9 | 0.72 | 0.73 | 0.76 | 0.75 |
| HR3646E | 111 | 12.3 | 0.75 | 0.79 | 0.83 | 0.82 |
| PsR293 | 117 | 13.7 | 0.81 | 0.82 | 0.83 | 0.83 |
| SR213 | 123 | 14.5 | 0.81 | 0.83 | 0.85 | 0.86 |
| HR5546A | 133 | 14.6 | 0.79 | 0.79 | 0.79 | 0.80 |
| StR70 | 134 | 15.1 | 0.76 | 0.75 | 0.79 | 0.84 |
| SgR209C | 147 | 16.8 | 0.81 | 0.80 | 0.84 | 0.87 |
| HR41 | 167 | 19.5 | 0.82 | 0.78 | 0.84 | 0.74 |
| SgR145 | 194 | 21.3 | 0.64 | 0.62 | 0.64 | 0.65 |
Number of residues, excluding short disordered purification tags.
Molecular weight (kDa).
GDT.TS score for NMR structures deposited in PDB.
GDT.TS score for unrestrained Rosetta refined structures.
GDT.TS score for restrained Rosetta refined structures.
GDT.TS scores for restrained CS-Rosetta structures generated from extended structures.
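The size dependence described in the abstract can be checked against the table with a short calculation. The sketch below copies the R3rst and CS-Rrst values from the table, takes the column after the residue count as the molecular weight in kDa (as the table notes indicate), and averages GDT.TS separately for targets below and above 10 kDa. The 10 kDa split follows the abstract; the variable and function names are illustrative.

```python
# (target, molecular weight in kDa, GDT.TS R3rst, GDT.TS CS-Rrst), from the table.
TARGETS = [
    ("ER382A", 6.0, 0.81, 0.91), ("HR4435B", 6.1, 0.79, 0.80),
    ("GmR137", 7.5, 0.80, 0.84), ("ZR18", 9.4, 0.78, 0.90),
    ("UuR17A", 11.9, 0.76, 0.75), ("HR3646E", 12.3, 0.83, 0.82),
    ("PsR293", 13.7, 0.83, 0.83), ("SR213", 14.5, 0.85, 0.86),
    ("HR5546A", 14.6, 0.79, 0.80), ("StR70", 15.1, 0.79, 0.84),
    ("SgR209C", 16.8, 0.84, 0.87), ("HR41", 19.5, 0.84, 0.74),
    ("SgR145", 21.3, 0.64, 0.65),
]


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def mean_gdt_by_size(targets, split_kda=10.0):
    """Mean GDT.TS for R3rst and CS-Rrst, below and at-or-above split_kda."""
    small = [t for t in targets if t[1] < split_kda]
    large = [t for t in targets if t[1] >= split_kda]
    return {
        "small": {"R3rst": mean(t[2] for t in small), "CS-Rrst": mean(t[3] for t in small)},
        "large": {"R3rst": mean(t[2] for t in large), "CS-Rrst": mean(t[3] for t in large)},
    }


summary = mean_gdt_by_size(TARGETS)
```

For the four targets below 10 kDa the means come out at roughly 0.80 (R3rst) and 0.86 (CS-Rrst), while for the nine larger targets both are close to 0.80, in line with the trend stated in the abstract.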