Anders S Christensen, Troels E Linnet, Mikael Borg, Wouter Boomsma, Kresten Lindorff-Larsen, Thomas Hamelryck, Jan H Jensen.
Abstract
We present the ProCS method for the rapid and accurate prediction of protein backbone amide proton chemical shifts, which are sensitive probes of the geometry of key hydrogen bonds that determine protein structure. ProCS is parameterized against quantum mechanical (QM) calculations and reproduces high-level QM results obtained for a small protein with an RMSD of 0.25 ppm (r = 0.94). ProCS is interfaced with the PHAISTOS protein simulation program and is used to infer statistical protein ensembles that reflect experimentally measured amide proton chemical shift values. Such chemical shift-based structural refinements, starting from high-resolution X-ray structures of Protein G, ubiquitin, and SMN Tudor Domain, result in average chemical shifts, hydrogen bond geometries, and trans-hydrogen bond ((h3)J(NC')) spin-spin coupling constants that are in excellent agreement with experiment. We show that the structural sensitivity of the QM-based amide proton chemical shift predictions is needed to obtain this agreement. The ProCS method thus offers a powerful new tool for refining the structures of hydrogen bonding networks to high accuracy, with many potential applications such as the study of protein flexibility in ligand binding.
Year: 2013 PMID: 24391900 PMCID: PMC3877219 DOI: 10.1371/journal.pone.0084123
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Correlation coefficients (r) and RMSD of five chemical shift predictors against chemical shifts derived from quantum mechanics (B3LYP/cc-pVTZ/PCM) and against experimental values.
| Method | Exp'tl r | Exp'tl RMSD (ppm) | QM r | QM RMSD (ppm) |
| ProCS | 0.54 | 0.63 | 0.94 | 0.25 |
| SHIFTS | 0.64 | 0.37 | 0.59 | 0.70 |
| SHIFTX | 0.69 | 0.37 | 0.71 | 0.62 |
| SPARTA+ | 0.69 | 0.42 | 0.68 | 0.56 |
| CamShift | 0.64 | 0.32 | 0.59 | 0.66 |
a The crystal structure of human parathyroid hormone, residues 1–34 at 0.9 Å resolution (PDB-code 1ET1[23]) is used as input structure in all chemical shift calculations.
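The two statistics reported in the table above, the correlation coefficient r and the RMSD, can be computed from paired lists of predicted and reference chemical shifts. The following is a minimal sketch using only the Python standard library; the function names and the example shift values are hypothetical and are not taken from the paper.

```python
import math


def pearson_r(predicted, reference):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(predicted)
    if n != len(reference) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_p = sum(predicted) / n
    mean_q = sum(reference) / n
    cov = sum((p - mean_p) * (q - mean_q) for p, q in zip(predicted, reference))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_q = sum((q - mean_q) ** 2 for q in reference)
    if var_p == 0 or var_q == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_p * var_q)


def rmsd(predicted, reference):
    """Root mean square deviation, in the same units as the input (ppm here)."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    squared = sum((p - q) ** 2 for p, q in zip(predicted, reference))
    return math.sqrt(squared / len(predicted))


if __name__ == "__main__":
    # Hypothetical amide proton shifts in ppm, for illustration only.
    predicted_shifts = [8.1, 7.6, 8.9, 7.2]
    reference_shifts = [8.3, 7.5, 9.2, 7.0]
    print("r =", round(pearson_r(predicted_shifts, reference_shifts), 3))
    print("RMSD =", round(rmsd(predicted_shifts, reference_shifts), 3), "ppm")
```

Reporting both numbers is useful because they can disagree: a predictor can correlate well with the reference while being offset or scaled, which raises the RMSD without lowering r.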
Figure 1. Correlation between chemical shift predictions from five different NMR prediction methods and quantum mechanical chemical shifts for human parathyroid hormone, residues 1–34 (PDB code: 1ET1).
Blue lines represent a 1-to-1 correlation.
Reproduction of experimental amide proton chemical shift values based on 13 X-ray structures with a crystallographic resolution of 1.35 Å or less.
| Method | ⟨r⟩ | ⟨RMSD⟩ |
| ProCS | 0.58 | 1.13 ppm |
| SHIFTS | 0.56 | 0.64 ppm |
| SHIFTX | 0.71 | 0.51 ppm (c) |
| SPARTA+ | 0.79 | 0.40 ppm |
| CamShift | 0.74 | 0.46 ppm |
a ⟨r⟩ denotes the average correlation coefficient over the 13 structures.
b ⟨RMSD⟩ denotes the average root mean square deviation over the 13 structures.
c For SHIFTX, three structures displayed overfitting behavior with . These structures are excluded from the average values.
Statistics for three different types of protein simulations.
| Structures (a) | ProCS 1H RMSD | CamShift 1H RMSD | Mean bond length deviation (b) | (h3)J(NC') RMSD |
| Ubiquitin Ensembles: ProCS + OPLS | 0.79 ppm | - | 0.03 Å | 0.17 Hz |
| Ubiquitin Ensembles: CamShift + OPLS | - | 0.50 ppm | 0.37 Å | 0.17 Hz |
| Ubiquitin Ensembles: OPLS (no chemical shifts) | 1.56 ppm | 0.60 ppm | 0.41 Å | 0.18 Hz |
| 1UBQ X-ray starting structure | 1.22 ppm | 0.51 ppm | - | 0.22 Hz |
| SMN Tudor Domain Ensembles: ProCS + OPLS | 0.93 ppm | - | 0.09 Å | 0.24 Hz |
| SMN Tudor Domain Ensembles: CamShift + OPLS | - | 0.46 ppm | 0.17 Å | 0.23 Hz |
| SMN Tudor Domain Ensembles: OPLS (no chemical shifts) | 1.47 ppm | 0.61 ppm | 0.22 Å | 0.23 Hz |
| 1MHN X-ray starting structure | 1.09 ppm | 0.65 ppm | - | 0.24 Hz |
| Protein G Ensembles: ProCS + OPLS | 0.69 ppm | - | 0.06 Å | 0.14 Hz |
| Protein G Ensembles: CamShift + OPLS | - | 0.52 ppm | 0.38 Å | 0.18 Hz |
| Protein G Ensembles: OPLS (no chemical shifts) | 1.54 ppm | 0.68 ppm | 0.37 Å | 0.20 Hz |
| 1PGB X-ray starting structure | 1.21 ppm | 0.55 ppm | - | 0.17 Hz |
a The ensembles are obtained from MCMC simulations using either the OPLS-AA/L force field energy with the GB/SA solvent model (OPLS) or the OPLS energy plus a chemical shift energy term from either ProCS or CamShift. Values are calculated over four runs for each of three proteins (Ubiquitin, Protein G and SMN Tudor Domain) or for their static X-ray structures.
b The mean bond length deviation denotes the mean absolute difference between the mean hydrogen bond length observed in the sampled structures and the mean hydrogen bond length observed in the corresponding X-ray structure noted below.
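Footnote b can be read in more than one way. The sketch below assumes one reading: for each simulation run, the hydrogen bond lengths of all sampled structures are averaged, the absolute difference to the X-ray mean is taken, and these differences are averaged over the runs. The function name, the nested-list data layout and the example lengths are hypothetical.

```python
from statistics import fmean


def mean_bond_length_deviation(runs, xray_lengths):
    """Mean absolute difference (in Å) between per-run and X-ray mean H-bond lengths.

    runs: list of runs; each run is a list of sampled structures; each
          structure is a list of hydrogen bond lengths in Å.
    xray_lengths: hydrogen bond lengths in Å from the X-ray structure.
    """
    if not runs or not xray_lengths:
        raise ValueError("need at least one run and one X-ray bond length")
    xray_mean = fmean(xray_lengths)
    deviations = []
    for run in runs:
        # Pool every bond length sampled in this run before averaging.
        pooled = [length for structure in run for length in structure]
        deviations.append(abs(fmean(pooled) - xray_mean))
    return fmean(deviations)


if __name__ == "__main__":
    # Two hypothetical runs with two sampled structures each.
    example_runs = [
        [[1.95, 2.05], [2.00, 2.10]],
        [[1.85, 1.95], [1.90, 2.00]],
    ]
    example_xray = [1.90, 2.00]
    print(round(mean_bond_length_deviation(example_runs, example_xray), 3), "Å")
```

Taking the absolute value per run before averaging keeps a run that overestimates the bond length from cancelling one that underestimates it.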
Figure 2. Distribution of average hydrogen bond lengths throughout Monte Carlo simulations on Ubiquitin, Protein G and SMN Tudor Domain.
Histograms are normalized (to an area of 1) to fit identical axes. Vertical lines indicate average values obtained from experimental X-ray structures (PDB-codes are noted in the figure legends). The blue histogram represents the simulation with only the molecular mechanics energy from the OPLS-AA/L force field with the GB/SA solvent model (but no chemical shift energy term). Green and yellow histograms indicate the use of OPLS force field plus an additional chemical shift energy term from ProCS or CamShift, respectively. *1OGW contains fluoro leucine at residues 50 and 67. **1IGD is a closely related homologue (see text).
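Normalizing a histogram to an area of 1 means dividing each bin count by the total number of counted values times the bin width, so that histograms from simulations of different lengths can share one axis. A minimal sketch follows; the function name, bin range and sample values are hypothetical.

```python
def normalized_histogram(values, lower, upper, n_bins):
    """Return per-bin densities whose sum times the bin width equals 1.

    Values outside [lower, upper] are ignored.
    """
    if n_bins < 1 or upper <= lower:
        raise ValueError("need n_bins >= 1 and upper > lower")
    width = (upper - lower) / n_bins
    counts = [0] * n_bins
    kept = 0
    for value in values:
        if lower <= value <= upper:
            # Clamp so that value == upper falls into the last bin.
            index = min(int((value - lower) / width), n_bins - 1)
            counts[index] += 1
            kept += 1
    if kept == 0:
        raise ValueError("no values inside the histogram range")
    return [count / (kept * width) for count in counts]


if __name__ == "__main__":
    # Hypothetical average hydrogen bond lengths in Å.
    lengths = [1.92, 1.95, 1.97, 2.01, 2.03, 2.08]
    densities = normalized_histogram(lengths, 1.8, 2.2, 8)
    bin_width = (2.2 - 1.8) / 8
    print("area =", round(sum(densities) * bin_width, 6))
```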
Figure 3. Deviation in hydrogen bonding geometries between the experimental X-ray structure and samples obtained from Markov Chain Monte Carlo (MCMC) simulations using the OPLS-AA/L force field with the GB/SA solvent model with either no chemical shift energy term or a chemical shift energy from either ProCS or CamShift.
Data is calculated over all amide-amide bonding pairs for which experimental spin-spin coupling constants were present. (A) shows the distribution of the deviations found in the MCMC ensembles from the experimental hydrogen bond length found in the X-ray structure. (B) shows the correlation of deviations in hydrogen bond lengths and H···O=C bond angles from the experimental X-ray structures.
Figure 4. Reproducing experimental spin-spin coupling constants via different structural ensembles and experimental X-ray structures.
Squares denote the average coupling constant observed for that hydrogen bond in the ensemble, and error bars represent the standard deviation observed throughout the simulations. Crosses represent the spin-spin coupling constants calculated using the static experimental X-ray structure. Results from simulations on ubiquitin are displayed in A, SMN Tudor domain in B and Protein G in C. The left column displays simulations using the OPLS-AA/L force field with the GB/SA solvent model (OPLS) plus the ProCS energy term; the second column is from OPLS plus the CamShift energy term; the third column is for the simulation with only the OPLS force field energy. Values in the rightmost column are computed from the corresponding X-ray structure.
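The squares and error bars described in this caption correspond to a per-hydrogen-bond mean and standard deviation over the sampled structures. The sketch below assumes each sampled structure is given as a dictionary from a hydrogen bond label to its calculated coupling constant in Hz; the labels, the values and the choice of the population standard deviation are assumptions, since the source does not state which estimator was used.

```python
from statistics import fmean, pstdev


def summarize_couplings(ensemble):
    """Return {hbond_label: (mean, std)} over all sampled structures.

    ensemble: list of dicts, one per sampled structure, each mapping a
              hydrogen bond label to a coupling constant in Hz.
    """
    if not ensemble:
        raise ValueError("ensemble must contain at least one structure")
    summary = {}
    for label in ensemble[0]:
        values = [structure[label] for structure in ensemble]
        # pstdev is the population standard deviation (an assumption here).
        summary[label] = (fmean(values), pstdev(values))
    return summary


if __name__ == "__main__":
    # Hypothetical coupling constants (Hz) for two hydrogen bonds.
    sampled = [
        {"hbond_1": -0.50, "hbond_2": -0.30},
        {"hbond_1": -0.70, "hbond_2": -0.35},
        {"hbond_1": -0.60, "hbond_2": -0.25},
    ]
    for label, (mean, std) in summarize_couplings(sampled).items():
        print(label, round(mean, 3), round(std, 3))
```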
Statistics for selected ubiquitin ensembles and X-ray structures (a).
| PDB-ID | 1H RMSD (CamShift) | r (CamShift) | 1H RMSD (ProCS) | r (ProCS) | RMSD | Q-factor |
| | 0.29 | 0.84 | 0.68 | 0.86 | 0.12 | 0.04 |
| | 0.34 | 0.82 | 0.98 | 0.77 | 0.13 | 0.07 |
| | 0.23 | 0.91 | 0.71 | 0.82 | 0.12 | 0.22 |
| | 0.44 | 0.74 | 1.35 | 0.64 | 0.14 | 0.25 |
| | 0.38 | 0.81 | 0.92 | 0.77 | 0.14 | 0.38 |
| | 0.41 | 0.79 | 1.00 | 0.71 | 0.30 | 0.06 |
| | 0.40 | 0.77 | 0.92 | 0.72 | 0.22 | 0.22 |
| | 0.40 | 0.77 | 0.97 | 0.73 | 0.33 | 0.25 |
| | 0.36 | 0.73 | 0.84 | 0.73 | 0.17 | 0.26 |
| | 0.32 | 0.79 | 0.17 | 0.98 | 0.14 | 0.27 |
| | 0.32 | 0.90 | 1.15 | 0.86 | 0.17 | 0.27 |
| | 0.48 | 0.78 | 1.11 | 0.78 | 0.18 | 0.29 |
a Chemical shift RMSD and r values are calculated for the residues for which spin-spin coupling constants have been measured [12].
b ERNST method/CHARMM27 + NOE + RDC [41]
c OPLS/AA-L + NOE + RDC [42]
d Backrub method/Rosetta all-atom energy + RDC [42]
e MUMO method/CHARMM22 + NOE + RDC [43]
f DER method/CHARMM22 + NOE + [44]
g NOE + RDC [45]
h X-ray 1.80 Å structure [46]
i X-ray 1.80 Å structure [47]
j X-ray 1.32 Å structure (synthetic protein with fluoro-LEU at residues 50 and 67) [48]
k The methods presented here