Benjamin P Martin, Christopher J Brandon, James J P Stewart, Sonja B Braun-Sand.
Abstract
Using the semiempirical method PM7, an attempt has been made to quantify the error in prediction of the in vivo structure of proteins relative to X-ray structures. Three important contributory factors are the experimental limitations of X-ray structures, the difference between the crystal and solution environments, and the errors due to PM7. The geometries of 19 proteins from the Protein Data Bank that had small R values, that is, high accuracy structures, were optimized and the resulting drop in heat of formation was calculated. Analysis of the changes showed that about 10% of this decrease in heat of formation was caused by faults in PM7, the balance being attributable to the X-ray structure and the difference between the crystal and solution environments. A previously unknown fault in PM7 was revealed during tests to validate the geometries generated using PM7. Clashscores generated by the Molprobity molecular mechanics structure validation program showed that PM7 was predicting unrealistically close contacts between nonbonding atoms in regions where the local geometry is dominated by very weak noncovalent interactions. The origin of this fault was traced to an underestimation of the core-core repulsion between atoms at distances smaller than the equilibrium distance.
Keywords: PDB; PM7; X-ray structures; errors; geometry optimization; protein; semiempirical
Year: 2015 PMID: 25973843 PMCID: PMC4744657 DOI: 10.1002/prot.24826
Source DB: PubMed Journal: Proteins ISSN: 0887-3585
Heats of Formation (ΔH f), in kcal mol−1, of Single‐Point Gas Phase, ΔH f(PDBg), and Solvated, ΔH f(PDBaq), Protein Structures and Fully Optimized Gas Phase, ΔH f(PM7g), and Solvated, ΔH f(PM7aq), Structures
| PDB ID | Residues | Atoms | Resolution (Å) | ΔH f(PDBg) | ΔH f(PDBaq) | ΔH f(PM7g) | ΔH f(PM7aq) |
|---|---|---|---|---|---|---|---|
| 3NIR | 48 | 933 | 0.48 | −9677.2 | −10043.6 | −10588.31 | −10838.52 |
| 3W5H | 272 | 6142 | 0.78 | −60719.18 | −63125.27 | −66451.1 | −68066.68 |
| 3W7Y | 51 | 2401 | 0.92 | −26334.05 | −27663.67 | −28741.15 | −29467.43 |
| 3WCQ | 97 | 1715 | 0.97 | −13737.64 | −15306.84 | −15253.8 | −16812.44 |
| 3WDN | 125 | 2818 | 0.86 | −31166.42 | −33656.06 | −33933.21 | −35935.5 |
| 3ZOJ | 279 | 4754 | 0.88 | −32530.8 | −34379.14 | −36033.83 | −37029.43 |
| 4AQO | 92 | 1653 | 0.99 | −15879.93 | −16890.7 | −17594.74 | −18379.29 |
| 4AR6 | 54 | 1235 | 0.92 | −13883.15 | −15511.66 | −15190.45 | −16644.27 |
| 4BCT | 201 | 3966 | 0.98 | −38694.79 | −39895.79 | −42054.08 | −42859.72 |
| 4BY8 | 21 | 296 | 0.94 | −1306.82 | −1426.6 | −1640.21 | −1743 |
| 4EIC | 93 | 1742 | 0.84 | −15171.68 | −16249.14 | −16669.42 | −17557.05 |
| 4FRC | 260 | 4963 | 0.98 | −41891.22 | −43962.08 | −46017.09 | −46922 |
| 4FU5 | 260 | 5018 | 0.98 | −40137.78 | −43578.26 | −44362.36 | −47569.54 |
| 4G78 | 152 | 2802 | 0.92 | −22818.52 | −25191.37 | −26087.12 | −27525.25 |
| 4HGU | 40 | 812 | 0.98 | −7900.6 | −8596 | −9088.4 | −9353.19 |
| 4HS1 | 85 | 1586 | 0.87 | −11805.16 | −13040.06 | −13879.69 | −14204.75 |
| 4KQP | 232 | 4844 | 0.95 | −50179.49 | −52062.3 | −54089.1 | −55551.98 |
| 4LFS | 35 | 803 | 0.97 | −7126.4 | −7851.03 | −7859.9 | −8328.35 |
| 4MZC | 111 | 2223 | 0.95 | −19006 | −20433.1 | −20963.86 | −21833.4 |
Total Change in ΔH f per Atom for Gas, ε(totg), and Solution, ε(totaq), Phase Resulting from Geometry Optimization in kcal mol−1 atom−1
| PDB ID | ε(totg) | ε(totaq) |
|---|---|---|
| 3NIR | 0.98 | 0.85 |
| 3W5H | 0.93 | 0.80 |
| 3W7Y | 1.00 | 0.75 |
| 3WCQ | 0.88 | 0.88 |
| 3WDN | 0.98 | 0.81 |
| 3ZOJ | 0.74 | 0.56 |
| 4AQO | 1.04 | 0.90 |
| 4AR6 | 1.06 | 0.92 |
| 4BCT | 0.85 | 0.75 |
| 4BY8 | 1.13 | 1.07 |
| 4EIC | 0.86 | 0.75 |
| 4FRC | 0.83 | 0.60 |
| 4FU5 | 0.84 | 0.80 |
| 4G78 | 1.17 | 0.83 |
| 4HGU | 1.46 | 0.93 |
| 4HS1 | 1.31 | 0.73 |
| 4KQP | 0.81 | 0.72 |
| 4LFS | 0.91 | 0.59 |
| 4MZC | 0.88 | 0.63 |
| Average | 0.98 | 0.78 |
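
The per-atom values above can be checked against the heats of formation in the first table. The following is a minimal sketch, assuming that ε(tot) is the single-point heat of formation minus the optimized heat of formation, divided by the number of atoms. This reading is inferred from the table captions rather than stated explicitly, but it reproduces the tabulated values for the three entries shown. The names `PROTEINS`, `per_atom_drop`, and `total_changes` are illustrative only.

```python
# Values copied from the heats-of-formation table (kcal/mol).
# PDB ID: (atoms, dHf(PDBg), dHf(PDBaq), dHf(PM7g), dHf(PM7aq))
PROTEINS = {
    "3NIR": (933, -9677.2, -10043.6, -10588.31, -10838.52),
    "3W5H": (6142, -60719.18, -63125.27, -66451.1, -68066.68),
    "4MZC": (2223, -19006.0, -20433.1, -20963.86, -21833.4),
}


def per_atom_drop(dhf_start, dhf_optimized, n_atoms):
    """Drop in heat of formation per atom on optimization (kcal/mol/atom).

    Assumed definition: (starting value - optimized value) / atom count.
    """
    return (dhf_start - dhf_optimized) / n_atoms


def total_changes(entry):
    """Return (gas-phase, solvated) per-atom drops, rounded to 2 decimals."""
    n_atoms, pdb_g, pdb_aq, pm7_g, pm7_aq = entry
    return (
        round(per_atom_drop(pdb_g, pm7_g, n_atoms), 2),
        round(per_atom_drop(pdb_aq, pm7_aq, n_atoms), 2),
    )


for pdb_id, entry in PROTEINS.items():
    eps_g, eps_aq = total_changes(entry)
    print(f"{pdb_id}: eps(totg) = {eps_g:.2f}, eps(totaq) = {eps_aq:.2f}")
```

For 3NIR, for example, the gas-phase drop is (−9677.2 + 10588.31)/933 ≈ 0.98 kcal mol−1 atom−1, matching the table row.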
The Change in ΔH f per Atom for Gas, ε(PDBg), and Solution, ε(PDBaq), Phase in Moving from the PDB Geometry to the HIM in kcal mol−1 atom−1
| PDB ID | ε(PDBg) | ε(PDBaq) |
|---|---|---|
| 3NIR | 0.97 | 0.84 |
| 3W5H | 0.93 | 0.80 |
| 3W7Y | 1.00 | 0.74 |
| 3WCQ | 0.88 | 0.87 |
| 3WDN | 0.97 | 0.80 |
| 3ZOJ | 0.73 | 0.54 |
| 4AQO | 1.03 | 0.89 |
| 4AR6 | 1.05 | 0.91 |
| 4BCT | 0.84 | 0.74 |
| 4BY8 | 1.12 | 1.06 |
| 4EIC | 0.85 | 0.74 |
| 4FRC | 0.82 | 0.58 |
| 4FU5 | 0.83 | 0.79 |
| 4G78 | 1.16 | 0.82 |
| 4HGU | 1.46 | 0.92 |
| 4HS1 | 1.30 | 0.72 |
| 4KQP | 0.80 | 0.71 |
| 4LFS | 0.91 | 0.58 |
| 4MZC | 0.87 | 0.62 |
| Average | 0.97 | 0.77 |
The change attributable to errors in PM7 is 0.12 kcal mol−1 atom−1.
Figure 2. Relative heat of formation for leucine and valine at different H‐H separations. See text for details. [Color figure can be viewed in the online issue, which is available at wileyonlinelibrary.com.]
Figure 1. Nonbonding close contact between hydrogen atoms from Leu104 and Val110 of 4MZC that was generated during optimization with PM7. The optimized interatomic distance was 1.78 Å. The noncovalent interaction surface as generated by Jmol version 14.2.7 is shown in the figure (Jmol: an open‐source Java viewer for chemical structures in 3D. http://www.jmol.org/s).
Clashscores for Protein Structures Predicted using PM7
| Proteins | H Only | All |
|---|---|---|
| 3NIR | 0.0 | 20.0 |
| 3W5H | 6.8 | 25.4 |
| 3W7Y | 4.5 | 22.2 |
| 3WCQ | 6.2 | 45.5 |
| 3WDN | 5.2 | 18.6 |
| 3ZOJ | 1.0 | 17.5 |
| 4AQO | 8.1 | 22.6 |
| 4AR6 | 7.4 | 19.9 |
| 4BCT | 4.1 | 20.4 |
| 4BY8 | 10.6 | 3.5 |
| 4EIC | 0.0 | 18.4 |
| 4FRC | 1.8 | 24.6 |
| 4FU5 | 2.3 | 36.6 |
| 4G78 | 13.3 | 17.9 |
| 4HGU | 1.8 | 22.8 |
| 4HS1 | 4.0 | 12.7 |
| 4KQP | 2.6 | 25.3 |
| 4LFS | 0.0 | 10.4 |
| 4MZC | 5.2 | 14.6 |
| Average | 4.5 | 21.0 |
H Only: After optimization of hydrogen atom positions only. All: After unconstrained optimization.