Literature DB >> 31458687

Comparison of Charge Derivation Methods Applied to Amino Acid Parameterization.

Pietro G A Aronica1, Stephen J Fox1, Chandra S Verma1,2,3.   

Abstract

When using non-natural amino acids in computational simulations of proteins, it is necessary to ensure appropriate parameterization of the new amino acids toward the creation of appropriate input files. In particular, the charges on the atoms may have to be derived de novo and ad hoc for the new species. As there are many variables in the charge derivation process, an investigation was devised to compare different approaches and determine their effect on simulations. This was done with the purpose to identify the methods which produced results compatible with the existing parameters. It was found in this study that all analyzed charge derivation methods reproduce with sufficient accuracy the literature values and can be used with confidence when parameterizing novel species.

Entities:  

Year:  2018        PMID: 31458687      PMCID: PMC6641686          DOI: 10.1021/acsomega.8b00438

Source DB:  PubMed          Journal:  ACS Omega        ISSN: 2470-1343


Introduction

The reliability of molecular dynamics (MD) simulations is dependent on the underlying force field,[1] which is a collection of parameters including the strengths of various interactions, the charges on the atoms, the lengths of bonds, and so forth. Therefore, obtaining high-quality parameters is one of the first steps in ensuring that simulations are reliable. MD packages such as AMBER,[2] CHARMM,[3] GROMACS,[4] NAMD,[5] and many others use different varieties of the basic force field,[6] some of which may be specific to classes of biological molecules or to the computational program being used. These force fields may have different parameters that result from the differences in the methods employed for their derivation. Parameters have been derived for the standard sets of amino acids and nucleic acids, sugars and lipids, and also for many kinds of functional groups in small molecule ligands. Indeed, general algorithms have been developed to parameterize new molecules that enable speed in high-throughput studies. However, while the parameters for the covalent interactions such as bond lengths, angles, dihedrals, and noncovalent van der Waals parameters are transferable from the general datasets that already exist,[7] the same is not the case for charges, which are very much dependent on the context and hence need to be calculated for every new molecule. Although some general rules have been derived for their transferability,[8] the paucity of experimental data to benchmark the charge datasets (either directly or through the prediction of properties dependent on charges) can make this issue problematic. The matter is further complicated because a wide variety of methods have been used for the purpose of charge derivation, regularly producing subtly different results. Therefore, we have decided to review and compare the existing methods in representative test cases and establish how charges derived with different variables measure with those reported in the literature. We hope this can help understand which of the current approaches is best suited for parameterizing nonstandard amino acids. The charges for a series of representative natural amino acids were derived following as much as possible the procedure reported in the literature; the values that we used as the benchmark are taken from the AMBER force field FF14SB[9] (which are the same as in FF12SB and FF99SB[10]). Two methods were used for the derivation of charges: the restrained electrostatic potential (RESP) fitting procedure,[11] which was originally employed in the derivation of the AMBER force field charges, and AM1-BCC,[12] which is used by the antechamber tool part of AMBERTools. MD simulations were then carried out separately using these derived charges and compared to simulations carried out using the original charges from AMBER.

Methods

Charge Derivation Methods

Perhaps the most commonly used technique in the derivation of charges is RESP[13] fitting, where point charges are assigned to atoms based on a grid of electrostatic potential points derived through quantum mechanical (QM) methods. It is an extension of the ESP method and is more powerful, creating charges of higher quality because it is less dependent on the molecular conformation.[14] Although it is an obvious approximation to treat charges as fixed, immutable, and located in a single point, it is necessary within the context of MD. There are other methods for deriving charges such as from Mulliken analysis[15] or NBO analysis,[16] but for our purposes, we only considered RESP as it is the one used in the charge derivation of FF99SB and its successors,[11] and our aim was to be as compatible as possible with the established procedures. One final remark must be made about polarization and the effects that arise from it, which are generally not accounted for in the RESP approach. Some force fields, such as Amoeba,[17] do try to incorporate these effects in their parameters, but this was not done for the FF14SB force field and its predecessors. Our investigation aims to follow the protocols laid out in the literature and therefore does not explicitly take into account polarization. This could be an interesting future venue of exploration to further improve our understanding on how to parameterize amino acids. We used the programs NWChem[18] and Red Server[19] to perform the calculations necessary for RESP, each of which present different approaches to charge derivation. NWChem is a suite of tools that allows for many types of QM calculations; for the purposes of this study, we used the geometry optimization and the ESP charge fitting protocols to generate atomic charges for our systems. NWChem can be downloaded and run as an executable on common desktop machines and is relatively quick and easy to use. However, only a single conformation can be used as an input at any one time. It has been shown that charges vary with molecular conformation,[20] thus necessitating the use of multiple conformations to obtain robust results.[21] To investigate the effect of conformation on the derived charges, the Red Server package was also used; it is an online server and it is a little more complicated to use than NWChem. Both NWChem and Red Server allow for the use of several different basis sets and QM methods for the RESP fitting procedure. For the derivation of charges, we focused on perhaps the most commonly used basis set, 6-31G*, and employed both Hartree–Fock (HF) and the density functional theory exchange correlation function, B3LYP. Although the former method is the one most traditionally used for this task, because of a fortuitous cancellation of errors,[22] B3LYP is also employed.[23] The other charge derivation procedure we tested is the semiempirical QM approach using the AM1-BCC charge method,[12] as used by programs such as MOPAC.[24] It is meant to emulate RESP fitting and provide a faster alternative; conceptually, it is similar to RESP but the implementation is much quicker. It is an approximation which can satisfactorily describe the underlying electronic features and although not as accurate as RESP,[25] it does so in a fraction of the time. This method is also available as part of AMBER and is employed to derive charges of small molecule ligands.[26] As previously mentioned, the conformation of the molecule has a great effect on the derived electrostatic potential:[20] the same molecule in different orientations may produce different charge distributions depending on local polarization, steric clashes, and other similar effects. In order to obviate this problem, the literature protocol uses multiple conformations to describe each amino acid.[11,27] These conformations, differing in backbone dihedrals, were obtained through a statistical analysis of amino acids in crystallized proteins. It was possible to do this for all natural amino acids, but it is, for obvious reasons, much harder to apply to non-natural ones, for which there would not be a large enough database to mine statistically relevant conformations. In our investigation, as we have only examined natural amino acids, we used the conformations reported in the literature with the intention of determining how this approach can then be applied to residues for which there is less data available.

Test Case Methodology

The amino acids isoleucine (Ile, I), lysine (Lys, K), glutamic acid (Glu, E), glutamine (Gln, Q), tyrosine (Tyr, Y), and phenylalanine (Phe, F) were chosen as they represent nonpolar, polar, positive, negative, and aromatic amino acids. Each amino acid (X) was capped using an acetyl group (Ace) at the N-terminus and an N-methylamide (Nme) at the C-terminus, resulting in Ace-X-Nme. Following the spirit of the derivation of charges for FF14SB,[11] different combinations of φ, ψ, χ,1 and χ2 dihedrals were used to generate αR or C5 conformations, as shown in Figure .
Figure 1

αR (left) and C5 (right) conformations of isoleucine.

αR (left) and C5 (right) conformations of isoleucine. The RESP and AM1-BCC charges were generated separately for each of the two different conformations and were also subsequently averaged. As a result, 11 sets of derived charges were obtained: AM1-BCC αR, AM1-BCC C5, the average of these two sets of charges, RESP αR with HF, RESP αR with B3LYP, RESP C5 with HF, RESP C5 with B3LYP, the average of the single conformation HF charges, the average of the single conformation B3LYP charges, the multiconformation HF charges, and the multiconformation B3LYP charges. In each case, if possible, the backbone atoms of N, H, C, and O were restrained to the literature values[11] (for neutral residues: N = −0.4157, H = 0.2719, C = 0.5973, and O = −0.5679; for positively charged residues: N = −0.3479, H = 0.2747, C = 0.7341, and O = −0.5894; and for negatively charged residues: N = −0.5163, H = 0.2936, C = 0.5366, and O = −0.5819), as the philosophy is to have a charge derivation that is as modular as possible. For AM1-BCC, which does not do this automatically, this step was not performed, which leads to those four atoms having different charges from the other methods. All chemically equivalent atoms, such as hydrogens attached to the same carbon or carbons occupying the same relative position in a ring were constrained, using symmetry, to have the same charge. The obtained charges were all very similar to each other across individual amino acids. This means that, for example, the charges of each atom of the glutamic acid residue were consistent and differed usually only in the second significant digit onward; the carboxylic carbon had a strongly positive charge compensated by the negative charges of the oxygens, whereas the β-carbon and γ-carbon and their hydrogens were weakly charged, and so on. Similar patterns could be seen for all the other amino acids, with functional groups carrying the expected charge distributions, that is, similar to those reported in the literature. This is perhaps a testament to the robustness and versatility of both RESP and AM1-BCC. All of the charges were obtained easily and automatically; by far, the fastest method was AM1-BCC, which operated within seconds, although the inability to restrain backbone charges is a drawback. NWChem and Red Server both took similar amounts of time, with the expected advantages and disadvantages inherent to obtaining and running a program on a machine and accessing a web server, respectively. The only exception is the multiconformational charge derivation of lysine, carried out on the Red Server, which did not work with any combination of parameters (neither with B3LYP or HF) nor by changing the convergence criteria. The program never managed to reach a minimized structure, and thus it was not considered in this study. Once the charges were obtained, it was a matter of testing how the properties of the system were affected by the different charge models. Free energy was chosen as a method to measure this, as the computations of the free energies of binding between two molecules are highly dependent on the nonbonded parts of the force field equation, that is, the Coulombic and van der Waals terms. In order to evaluate the effect of the differing charges from the different models, two approaches have been used: the molecular mechanics Poisson–Boltzmann (generalized born) surface area [MMPB(GB)SA] approach[28,29] and the multistate Bennett’s acceptance ratio (MBAR).[30] MMPBSA is a widely used approach which is fast and has been shown to produce trends in binding affinities that are consistent with experimentally observed trends;[31] in contrast, MBAR is much more accurate, but is also much more computationally expensive. The system we have chosen to use as a test case is oligopeptide permease A (OppA; Figure ).[32] This protein transports oligopeptides into the cells, and the crystal structures of this protein in complex with tripeptides of the sequence KXK, where X is any of the 20 natural amino acids, were available. This makes it an ideal test case, as it allows us to have bound structures of proteins and peptides where the amino acid is at the interface and is therefore presumed to be important for the interaction. For each of the six selected amino acids, the relevant OppA complex was obtained from the pdb repository (PDB reference codes: KIK, 1b3g.pdb; KKK, 2olb.pdb; KEK, 1jeu.pdb; KFK, 1b40.pdb, KQK, 1b5j.pdb; KYK, 1b58.pdb).
Figure 2

OppA (green) with KIK bound in the pocket (purple).

OppA (green) with KIK bound in the pocket (purple). In each complex, the crystallographic waters were retained, and then the whole system was solvated with an octahedral box which was truncated to ensure that no part of the system was less than 8 Å from the edge of the box; the system was neutralized with Cl– or Na+ ions where necessary. The parameter file was modified to change the charge of the residue X to the values calculated in this study; all other amino acids were assigned the original FF14SB charges. The resulting system was minimized for 1500 steps of steepest descent followed by a further 3500 steps of conjugate gradient. Equilibration consisted of heating from 0 to 300 K in the NVT ensemble for 2.5 ns followed by a further 20 ns at 300 K in the NPT ensemble. Production simulations were carried out for 100 ns in the NPT ensemble at 300 K and 1 atm pressure, using the Langevin thermostat and Berendsen barostat, respectively. Periodic boundary conditions were employed using an 8 Å radius; inside the radius, the nonbonded terms were calculated explicitly, while the particle mesh Ewald method was used outside that radius. The interaction energy between the tripeptide and the receptor was then measured using MMPBSA with all the differing charge models, using the PB4 method.[33] This was done to remove any intrinsic variation of the binding energy due to simulation variability. We are aware that the interaction in this system is mediated by water, and although water at the interface between the ligand and protein can be of fundamental importance when calculating the binding free energy within this system,[34,35] in this case, our focus is on only the difference between the binding free energies due to the differing charge models and hence not on the absolute values. The presence of interfacial waters is less important in this situation as we are not attempting to replicate the experimental values or obtain water-mediated dynamics; any error that derives from neglecting water molecules will be consistent across different systems. Another venue of analysis was the calculation of the solvation energies, which are dependent on the nonbonded terms (as well as on the conformation of the molecule). A highly accurate free energy perturbation approach for the calculation of the free energy of binding differences is MBAR. These calculations were carried out using 19 lambda windows from 0.05 to 0.95, extrapolating to the end points. Production simulations were 1 ns in the NPT ensemble at 1 atm and 300 K using the Berendsen barostat and the Langevin thermostat, respectively. The system setup was the same as that used above for MMGBSA calculations. Finally, some structural features that depend on charges were also analyzed and compared across different derivation methods.

Results and Discussion

Comparison of Charges from the Different Charge Models

Each charge derivation method generates, at the atomic positions, point charges that recreate the QM (or semiempirical) electrostatic potential. The charges for the six different amino acids are shown in the supplemental information for the different methods used. The largest difference in single point charges between the methods is in the sidechain carbons, most notably at CB, which consistently has the largest standard deviation between the methods. The maximum difference in partial charge on any single atom type between the methods was 0.51. Charges can also differ in sign; an example is the β-hydrogen of isoleucine, whose charge ranges from −0.0817 to 0.0696. In general, there is a relatively small difference between the partial charges obtained from the different approaches used here; an example is shown in Table for isoleucine. More details on the actual values from each method for each amino acid are shown in the Supporting Information.
Table 1

Average Charges for Each Atom in Isoleucinea

 averagestandard deviationrangemaxmin
N–0.4157*0.0000*0.0000*–0.4157*–0.4157*
H0.2719*0.0000*0.0000*0.2719*0.2719*
CA–0.08860.09520.27480.0397–0.2351
HA0.11040.04100.13350.18390.0504
CB0.17170.17210.43700.3723–0.0647
HB0.01180.04980.14460.0817–0.0629
CG2–0.18540.07450.2263–0.0941–0.3204
HG210.04360.01500.05940.08820.0288
HG220.04360.01500.05940.08820.0288
HG230.04360.01500.05940.08820.0288
CG10.02830.09020.22900.1436–0.0854
HG12–0.01030.04330.11100.0482–0.0628
HG13–0.01030.04330.11100.0482–0.0628
CD1–0.08630.03590.1260–0.0427–0.1687
HD110.02030.01340.03990.0381–0.0018
HD120.02030.01340.03990.0381–0.0018
HD130.02030.01340.03990.0381–0.0018
C0.5973*0.0000*0.0000*0.5973*0.5973*
O–0.5679*0.0000*0.0000*–0.5679*–0.5679*

The exact partial charges for all the amino acids derived using the different methods are listed in the Supporting Information. The backbone charges (N, H, C, O, marked with an asterisk) include data only from methods that can restrain charges, therefore excluding AM1-BCC methods. The other atoms include AM1-BCC charges.

The exact partial charges for all the amino acids derived using the different methods are listed in the Supporting Information. The backbone charges (N, H, C, O, marked with an asterisk) include data only from methods that can restrain charges, therefore excluding AM1-BCC methods. The other atoms include AM1-BCC charges.

Structural Comparisons

Salt Bridge Analysis

In addition to energetic differences, we have also looked at two structural features dependent on charges. The first is a salt bridge interaction that is present in the OppA–KEK system, where the central glutamate of the peptide interacts with Arg404. As this is an interaction that is directly dependent on the charges of the positively and negatively charged residues, it is reasonable to suggest that it might be susceptible to variations in atomic partial charges due to different methods of charge derivation. We therefore compared the distance between the carboxylic and guanidinium groups of these two residues to see if there are any significant differences in the salt bridge. It is to be noted that in this interaction the charges of the arginine residue do not change between different systems, as they remain those of the FF14SB literature values; it is the glutamate charges that change. The salt bridge is shown in Figure , while the averages are summarized in Table .
Figure 3

A close-up of the interaction between the glutamate in the KEK peptide and arginine 404.

Table 2

Average Distance Across All Methods of the Carboxylic Group in KEK and the Guanidium Group in Arginine 404a

 averagestandard deviation (within method)
FF14SB2.860.17
AM1-BCC, single conformation αR2.830.17
AM1-BCC single conformation C52.850.16
AM1-BCC average of conformations2.850.16
RESP, single conformation αR, B3LYP2.950.19
RESP, single conformation C5, B3LYP2.910.20
RESP, average of B3LYP conformations2.930.21
RESP, single conformation αR, HF2.930.23
RESP, single conformation C5, HF2.830.14
RESP, average of HF conformations2.840.13
RESP, multiconformation, B3LYP2.960.24
RESP, multiconformation, HF2.900.20
Standard deviation (across methods)0.06 

The comparison includes the standard deviation for all measurements within the method, and the standard deviation across all methods. All values are in angstrom.

A close-up of the interaction between the glutamate in the KEK peptide and arginine 404. The comparison includes the standard deviation for all measurements within the method, and the standard deviation across all methods. All values are in angstrom. The distance between the center of mass of the carboxylic oxygens of the glutamate and the center of mass of the nitrogen atoms of the guanidine of the arginine over the simulation for each charge model is shown in Figure . The data follows a Boltzmann distribution, with an average value between 2.86 (for FF14SB) and 2.96 Å (for RESP, multiconformational, HF). As it can be seen in the table, the standard deviation between the charge methods is an order of magnitude smaller than that from each ensemble, showing no noticeable discrepancies between the different simulations.
Figure 4

A plot of the distance of the carboxylic acid and the guanidinium group across all frames of all runs within a method. The distribution of distances follows a predictable pattern and clusters around a common value, indicating that the different charge derivation methods do not have a large effect on this property.

A plot of the distance of the carboxylic acid and the guanidinium group across all frames of all runs within a method. The distribution of distances follows a predictable pattern and clusters around a common value, indicating that the different charge derivation methods do not have a large effect on this property. This shows that there is very little difference in the salt bridge in terms of its length or interaction due to differences in the charges.

Root Mean Square Fluctuation Analysis

Another type of structural analysis that we have carried was the measurement of the mobility of the residues as measured using the root mean square fluctuation (RMSF) as a metric. RMSF values were averaged over the set of three simulations. Because the ligand is buried, conventional MD was considered sufficient for sampling of the ligand bound in the cavity. The data is shown in Table . The RMSF values are small and without significant variations across the different charge derivation methods.
Table 3

RMSF of the Central Residue of the KXK Peptide Averaged Across the Entire Simulation and all Runs, Shown for Each Methoda

 GLUGLNTYRPHEILELYS
FF14SB0.961.221.530.931.291.85
AM1-BCC, single conformation αR0.851.211.300.921.951.65
AM1-BCC single conformation C50.931.531.511.231.511.98
AM1-BCC average of conformations0.791.281.921.101.531.92
RESP, single conformation αR, B3LYP0.851.461.721.251.551.87
RESP, single conformation C5, B3LYP0.771.261.531.091.191.94
RESP, average of B3LYP conformations0.781.171.601.081.331.97
RESP, single conformation αR, HF1.001.351.431.201.551.91
RESP, single conformation C5, HF0.881.061.591.011.592.01
RESP, average of HF conformations0.821.261.741.081.661.79
RESP, multiconformation, B3LYP0.991.351.590.991.36n/a
RESP, multiconformation, HF0.871.552.020.981.54n/a
Average0.881.311.621.071.511.89
standard deviation0.080.150.200.110.200.11

All values are in angstrom.

All values are in angstrom. With these analyses, we believe we have shown that structural features that are likely to be dependent on charges appear to be agnostic to different charge derivation methods. Regardless of which variables are used, all produce values that are similar to those reported in the literature. Our next step in the investigation was to study what may be of more interest in a comparison with the experiment: free energies of binding, which are related to the interaction energy between the protein and the ligand. When employing non-natural amino acids, it is often done with the aim of producing a better binder, and thus it is crucial to ensure that new parameters do not skew the free energy calculations.

Free Energy Calculations

MMPBSA Results

The MD simulations carried out using the original and derived charges were processed to compute the free energies of interactions between tripeptides carrying each of the six amino acids chosen using MMPBSA calculations. Entropy was not considered when using MMPBSA, as it is notoriously complex and resource-intensive to address.[36] The ΔΔG value describes the difference between each derivation method and that data obtained by using the original FF14SB values. The closer that ΔΔG is to zero, the more it can be assumed that small differences in charge have a minimal impact on the overall free energy. When calculating the averages of the ΔΔG values to obtain a mean for a particular residue or across a particular residue, it was carried out on the absolute values, ignoring the sign. By and large, the results in Table seem to suggest that there is little difference in using one method over another for charge derivation, and all methods reproduce values similar to those obtained using literature charges. The ΔΔG ranges from 0.25 to approximately 1 kcal mol–1 in terms of the absolute average, which is well within the error obtained for a typical MMPBSA or MMGBSA calculation.[29] Although this difference is theoretically not negligible (as 2 kcal mol–1 would imply a difference of about one order of magnitude in terms of Kd), it is too small to conclusively attribute it to the differences in charge derivation, and likely arises due to the inherent imprecision of MMPBSA.[37] Thus, we conclude that as long as MMPBSA is used as the tool for the analysis of free energy changes, all the examined charge derivation methods should give results that are comparable with each other and similar to the literature values.
Table 4

ΔΔG MMPBSA Data, Obtained by Subtracting from the FF14SB MMPBSA Free Energy the MMPBSA Free Energy Calculated Using a Different Method Parameters on the FF14SB Runsa

 GLUGLNTYRPHEILELYS  
FF14SB–57.0 ± 10.37–29.16 ± 9.95–43.85 ± 6.14–67.10 ± 9.06–31.8 ± 5.98–43.75 ± 6.30  
ΔΔG      averagestandard deviation
AM1-BCC, αR conformation0.910.320.771.060.090.450.600.37
AM1-BCC C5 conformation0.650.101.000.710.040.330.470.38
AM1-BCC average0.730.210.860.920.060.400.530.36
RESP, αR conformation, B3LYP2.701.220.440.580.520.310.960.91
RESP, C5 conformation, B3LYP0.890.931.231.420.240.080.800.53
RESP, B3LYP average1.901.150.871.040.320.090.900.65
RESP, αR conformation, HF0.720.300.251.180.410.380.540.35
RESP, C5 conformation, HF1.280.330.380.730.300.100.520.42
RESP, HF average0.210.040.341.010.320.230.360.34
RESP, multiconformation, B3LYP2.541.591.100.640.16n/a1.210.91
RESP, multiconformation, HF0.780.550.250.720.23n/a0.500.26
Average1.210.610.680.910.240.26  
standard deviation0.820.520.360.260.150.14  

All values are in kcal mol–1.

All values are in kcal mol–1. Another level of analysis we have done is comparing MMPBSA results between different residues, as MMPBSA is often used to obtain trends between ligands. To examine the effect of the charge model for this purpose, we have calculated the ΔΔG between the peptide with isoleucine and the other ligands in turn. From Figure , we observe that there is very little difference between the charge models, all yielding the same trends for the ligands. The greatest deviation is between isoleucine and glutamic acid, with a standard deviation of 1.2 kcal mol–1. All other ΔΔG’s have a standard deviation less than 0.65 kcal mol–1 between the different charge models (full values in the Supporting Information), which is comparable to thermal fluctuations; the overall trend clearly remains unchanged.
Figure 5

Comparison of the calculated binding energy within a method across different residues, showing the difference between the isoleucine value and that of different amino acids.

Comparison of the calculated binding energy within a method across different residues, showing the difference between the isoleucine value and that of different amino acids. Additionally, we have investigated the sidechain contribution to the MMPBSA binding energy, specifically the electrostatic component, which depends the most on charges. The other components (polar solvation and van der Waals) are not shown here. The results are shown below in Table .
Table 5

Electrostatic Contribution of the Sidechain of Residue X in KXK, Where X is the Six Analyzed Amino Acids, to the PB Binding Energy Measured in FF14SB Simulations with Parameter Files of Different Methods, Averaged Across all Runsa

 GLUGLNTYRPHEILELYS
FF14SB61.89–10.59–3.32–1.52–5.98–92.08
AM1-BCC, single conformation αR61.95–10.73–2.28–4.78–3.95–90.91
AM1-BCC single conformation C561.90–10.14–4.25–7.94–3.25–90.09
AM1-BCC average of conformations61.92–10.45–3.29–6.34–3.63–90.51
RESP, single conformation αR, B3LYP44.25–16.04–4.94–8.19–9.99–102.14
RESP, single conformation C5, B3LYP64.17–13.00–6.06–3.66–9.23–78.31
RESP, average of B3LYP conformations54.19–14.51–5.50–5.97–9.58–90.20
RESP, single conformation αR, HF48.44–16.27–4.800.99–11.78–102.08
RESP, single conformation C5, HF59.86–15.23–8.95–2.25–8.59–101.12
RESP, average of HF conformations54.15–15.76–6.89–0.65–10.20–101.64
RESP, multiconformation, B3LYP57.71–9.63–2.33–8.93–6.30n/a
RESP, multiconformation, HF60.06–11.11–2.90–1.06–7.33n/a
average57.54–12.79–4.63–4.19–7.48–93.91
standard deviation6.142.612.003.322.867.76

All values in kcal mol–1.

All values in kcal mol–1. The size of the error bars and the lack of obvious overarching trends complicated the task of interpreting the data, but some conclusions could be drawn from the analysis. One observation that can be made is that the values obtained with AM1-BCC calculations are those that mirror the FF14SB ones most closely, although only for the polar residues: for glutamine and glutamic acid, the electrostatic contribution of those methods is very similar to the one obtained using literature charges. To a lesser extent, this is also true of lysine and tyrosine, whereas this is not really the case for isoleucine and phenylalanine. It can perhaps mean that AM1-BCC most closely replicates the charge distribution of more strongly charged residues. However, it should be noticed that the correlation between the electrostatic contribution to the binding energy and the binding energy itself is mediocre at best. The highest R2 value is for isoleucine (0.77), but it is much lower for the other amino acids: it goes from poor (glutamic acid, 0.34 and lysine, 0.26) to very poor (tyrosine, 0.11; phenylalanine, 0.02; and glutamine, 0.01). In other words, the electrostatic component is not necessarily predictive of the overall binding energy, and there are other factors that affect the free energy. Furthermore, the free energy results, as it can be seen above in Table , are very consistent across different charge models. A further level of analysis that we have performed is correlating the obtained MMPBSA results with the sum of the absolute charge values for each derivation method. For each method and each amino acid, the absolute value of the charge was summed to obtain an overall value, which was averaged across each method and across each residue. The obtained number can be a property of the charge derivation method; even if the overall sum of the charges is 0, as it is the case for neutral amino acids, the sum of absolute values is a measure of the magnitude of individual charges. The average values for each amino acid across different methods yield unsurprising results: isoleucine and phenylalanine, which contain no heteroatoms in the sidechain, have lower sums than other residues, which have greater sums in proportion to the number of oxygens and nitrogens they include. The standard deviation is also fairly narrow, indicating that the results are consistent for each residue. This data is shown in Table .
Table 6

Average Sum of Absolute Charge Values for Each Amino Acid, Across Different Methods

amino acidaveragestandard deviation
GLU5.060.42
GLN5.440.40
PHE3.550.24
TYR4.880.38
ILE3.010.25
LYS4.870.60
This sort of analysis can reveal important information regarding the trends and correlations within different methods. For example, when averaging only the values of those methods that involve HF and averaging only those that use B3LYP, the former are generally of larger magnitude and have smaller standard deviations, as shown in Table . This is due to the well-known overpolarization of HF.[38]
Table 7

Average of the Sum of the Absolute Charges for Each Amino Acid, of the HF and B3LYP Methods

amino acidaverage B3LYPstandard deviationaverage HFstandard deviation
GLU4.870.575.150.40
GLN5.190.445.790.36
PHE3.290.033.620.15
TYR4.660.425.240.31
ILE3.040.383.010.30
LYS4.350.534.900.13
It is however more interesting to consider the average for one charge derivation method across different residues and comparing that to the binding affinity data to obtain correlation values, which is shown in Tables a and 8bb. Although it may seem strange to consider together charged, polar, and nonpolar residues alike, focusing on each method across different amino acids can reveal if there are trends in how the charges affect the interaction of the peptide. The average values across the method were correlated with the ΔG MMPBSA results obtained by using different charges on the FF14SB runs. As shown by the R2, the correlation goes from poor (glutamine and phenylalanine) to very poor (tyrosine and lysine). We believe that this supports our observation that there is very little difference between charge derivation methods: even in the case of glutamine, the change in the sum of charges given by the differences in charge derivation methods correlates poorly with the obtained values, and all other residues have even weaker correlations. Thus, although they may yield charges of different magnitude and of different distributions, all charge derivation methods end up giving similar results when analyzed with MMPBSA. Even when considering known differences within methods, this does not translate to significantly different MMPBSA results.
Table 8a

Average Sum of Absolute Charge Values for Each Method, across Different Amino Acids

methodaveragestandard deviation
FF14SB4.230.87
AM1-BCC, single conformation αR4.621.03
AM1-BCC single conformation C54.661.02
AM1-BCC average of conformations4.641.02
RESP, single conformation αR, B3LYP4.641.07
RESP, single conformation C5, B3LYP4.110.81
RESP, average of B3LYP conformations4.330.90
RESP, single conformation αR, HF4.791.08
RESP, single conformation C5, HF4.681.14
RESP, average of HF conformations4.721.09
RESP, multiconformation, B3LYP3.750.87
RESP, multiconformation, HF4.151.07
Table 8b

Values Used To Obtain the Per-Residue Correlation with MMPBSA Data (Shown in Table a)

amino acidR2
GLU0.28
GLN0.42
PHE0.33
TYR0.04
ILE0.06
LYS0.00

MBAR Results

The other free energy approach we used to evaluate the effect of charges was MBAR. We calculated the hydration energy of the Ace-X-Nme system. Table presents the average of the absolute ΔΔG values from MBAR. The average ΔΔG for the MBAR results across all the residues and charge models is 0.28 kcal mol–1 with a standard deviation of 0.18 kcal mol–1. Tyrosine has the largest differences, with a max ΔΔG of 1.06 kcal mol–1 (RESP, average of B3LYP conformations) and an average of 0.63 kcal mol–1. The next highest difference is for phenylalanine with a max ΔΔG of 0.81 kcal mol–1 (AM1-BCC, single conformation αR) and an average of 0.28 kcal mol–1, which is much lower than that for tyrosine. Charged residues, and those containing a phenyl ring, have larger average differences and standard deviations. Of the various methods examined, no single approach produced the smallest differences across the amino acids. Overall, the average differences between the different methods are quite comparable, however, AM1-BCC single conformation C5 and RESP multiconformational HF from the Red Server performed the best, and the averaged B3LYP approach performed the worst. The experimental uncertainty for binding free energies is approximately 0.6 kcal mol–1,[39] setting the upper limit for in silico approaches. Differences in predicted binding free energy results within this range could therefore be considered indistinguishable. On the basis of this conclusion, all charge methods produce fairly comparable results, although some appear to be better.
Table 9

MBAR Data for the Transformation between FF14SB Charges and the Charges Derived Using Different Methods

 GLUGLNTYRPHEILELYSaverage
AM1-BCC, single conformation αR0.260.260.040.810.410.200.33
AM1-BCC single conformation C50.270.020.230.270.110.160.18
AM1-BCC average of conformations0.170.200.370.590.050.220.27
RESP, single conformation αR, B3LYP0.300.000.320.460.220.090.23
RESP, single conformation C5, B3LYP0.280.370.570.240.300.120.31
RESP, average of B3LYP conformations0.310.091.060.080.060.700.38
RESP, single conformation αR, HF0.280.370.570.240.300.120.31
RESP, single conformation C5, HF0.080.290.910.440.010.380.35
RESP, average of HF conformations0.400.060.730.480.060.010.29
RESP, multiconformation, B3LYP0.080.040.700.260.24n/a0.26
RESP, multiconformation, HF0.290.220.330.040.01n/a0.18
average0.230.150.630.280.170.210.28
standard deviation0.120.140.260.210.120.210.18

Conclusions

Given the evidence that we have presented above, we can conclude that any of the approaches that we have used to derive charges can reproduce, within error, the values that are observed when using the FF14SB charges. In other words, if we were to use these methods to obtain charges for an unknown, non-natural amino acid, we would do so with the confidence that our results would be comparable and compatible with parameters of existing residues. That is not to say that there are not any differences between the methods. For example, from the analysis of the electrostatic contribution to the binding energy, the AM1-BCC methods seem to replicate the values obtained with FF14SB the most closely for polar residues. From the MBAR data, the methods that use HF instead of B3LYP have greater agreement with FF14SB results and therefore appear to be superior for charge derivation. However, all methods generally produce results that are within error. This is because the general approach, which includes steps such as restraining the backbone and maintain equivalence between chemically identical atoms, is robust enough to account for minor variations due to differing initial structures, or differing basis sets and quantum methods. The resulting charges may vary, and this may have an impact on the simulation, but such small differences are eclipsed by other interactions which do not depend on charges, by the inherent variability of MD runs, and by the intrinsic inaccuracy of the methods commonly used to analyze the simulations. It is our opinion that an accurate assigning of the atom types that determine the internal parameters is far more important than the minute differences in charge, as the former control bonds, angles, torsions, and van der Waals interactions which will have a larger effect on the overall simulation. It should be noted however that we have only tested relatively easy systems that involved natural amino acids, which are well-known and lack problematic features. The largely aliphatic and well-documented nature of the residues did not include functional groups which may have been described by our approaches in a less robust manner. Future efforts should focus on determining what is the best approach to parametrize systems involving highly polarizable or highly charged species,[40] metals,[41] or halides,[42] which may all present unique challenges. In conclusion, the current study provides compelling evidence that for the investigated class of chemical species, all examined charge derivation approaches produce equivalent results.
  25 in total

1.  The experimental uncertainty of heterogeneous public K(i) data.

Authors:  Christian Kramer; Tuomo Kalliokoski; Peter Gedeck; Anna Vulpetti
Journal:  J Med Chem       Date:  2012-05-29       Impact factor: 7.446

2.  Scalable molecular dynamics with NAMD.

Authors:  James C Phillips; Rosemary Braun; Wei Wang; James Gumbart; Emad Tajkhorshid; Elizabeth Villa; Christophe Chipot; Robert D Skeel; Laxmikant Kalé; Klaus Schulten
Journal:  J Comput Chem       Date:  2005-12       Impact factor: 3.376

3.  Evaluating the molecular mechanics poisson-boltzmann surface area free energy method using a congeneric series of ligands to p38 MAP kinase.

Authors:  David A Pearlman
Journal:  J Med Chem       Date:  2005-12-01       Impact factor: 7.446

Review 4.  CHARMM: the biomolecular simulation program.

Authors:  B R Brooks; C L Brooks; A D Mackerell; L Nilsson; R J Petrella; B Roux; Y Won; G Archontis; C Bartels; S Boresch; A Caflisch; L Caves; Q Cui; A R Dinner; M Feig; S Fischer; J Gao; M Hodoscek; W Im; K Kuczera; T Lazaridis; J Ma; V Ovchinnikov; E Paci; R W Pastor; C B Post; J Z Pu; M Schaefer; B Tidor; R M Venable; H L Woodcock; X Wu; W Yang; D M York; M Karplus
Journal:  J Comput Chem       Date:  2009-07-30       Impact factor: 3.376

5.  Statistically optimal analysis of samples from multiple equilibrium states.

Authors:  Michael R Shirts; John D Chodera
Journal:  J Chem Phys       Date:  2008-09-28       Impact factor: 3.488

6.  Systematic placement of structural water molecules for improved scoring of protein-ligand interactions.

Authors:  David J Huggins; Bruce Tidor
Journal:  Protein Eng Des Sel       Date:  2011-07-19       Impact factor: 1.650

7.  Crystallographic and calorimetric analysis of peptide binding to OppA protein.

Authors:  S H Sleigh; P R Seavers; A J Wilkinson; J E Ladbury; J R Tame
Journal:  J Mol Biol       Date:  1999-08-13       Impact factor: 5.469

8.  ff14SB: Improving the Accuracy of Protein Side Chain and Backbone Parameters from ff99SB.

Authors:  James A Maier; Carmenza Martinez; Koushik Kasavajhala; Lauren Wickstrom; Kevin E Hauser; Carlos Simmerling
Journal:  J Chem Theory Comput       Date:  2015-07-23       Impact factor: 6.006

9.  Improved side-chain torsion potentials for the Amber ff99SB protein force field.

Authors:  Kresten Lindorff-Larsen; Stefano Piana; Kim Palmo; Paul Maragakis; John L Klepeis; Ron O Dror; David E Shaw
Journal:  Proteins       Date:  2010-06

10.  An efficient computational method for calculating ligand binding affinities.

Authors:  Atsushi Suenaga; Noriaki Okimoto; Yoshinori Hirano; Kazuhiko Fukui
Journal:  PLoS One       Date:  2012-08-20       Impact factor: 3.240

View more

北京卡尤迪生物科技股份有限公司 © 2022-2023.