Literature DB >> 19690373

Polarizable atomic multipole X-ray refinement: application to peptide crystals.

Michael J Schnieders1, Timothy D Fenn, Vijay S Pande, Axel T Brunger.   

Abstract

Recent advances in computational chemistry have produced force fields based on a polarizable atomic multipole description of biomolecular electrostatics. In this work, the Atomic Multipole Optimized Energetics for Biomolecular Applications (AMOEBA) force field is applied to restrained refinement of molecular models against X-ray diffraction data from peptide crystals. A new formalism is also developed to compute anisotropic and aspherical structure factors using fast Fourier transformation (FFT) of Cartesian Gaussian multipoles. Relative to direct summation, the FFT approach can give a speedup of more than an order of magnitude for aspherical refinement of ultrahigh-resolution data sets. Use of a sublattice formalism makes the method highly parallelizable. Application of the Cartesian Gaussian multipole scattering model to a series of four peptide crystals using multipole coefficients from the AMOEBA force field demonstrates that AMOEBA systematically underestimates electron density at bond centers. For the trigonal and tetrahedral bonding geometries common in organic chemistry, an atomic multipole expansion through hexadecapole order is required to explain bond electron density. Alternatively, the addition of interatomic scattering (IAS) sites to the AMOEBA-based density captured bonding effects with fewer parameters. For a series of four peptide crystals, the AMOEBA-IAS model lowered R(free) by 20-40% relative to the original spherically symmetric scattering model.

Entities:  

Mesh:

Substances:

Year:  2009        PMID: 19690373      PMCID: PMC2733883          DOI: 10.1107/S0907444909022707

Source DB:  PubMed          Journal:  Acta Crystallogr D Biol Crystallogr        ISSN: 0907-4449


Introduction

The number of X-ray crystal structures in the Protein Data Bank (PDB) with a resolution of higher than 1.0 Å continues to increase rapidly (Berman et al., 2000 ▶). In late 2002, there were already over 100 structures available at subatomic resolution (Afonine & Urzhumtsev, 2004 ▶), while as of early 2009 the number had more than tripled to well over 300. Examples include the proteins lysozyme at 0.65 Å (Wang et al., 2007 ▶), aldose reductase at 0.66 Å (Howard et al., 2004 ▶) and serine protease at 0.78 Å (Kuhn et al., 1998 ▶), as well as nucleic acid structures such as B-DNA at 0.74 Å (Kielkopf et al., 2000 ▶), Z-DNA at 0.60 Å (Tereshko et al., 2001 ▶) and an RNA tetraplex at 0.61 Å (Deng et al., 2001 ▶). Crystals that diffract to high resolution are ideal for studying valence-electron distributions (Jelsch et al., 2000 ▶; Muzet et al., 2003 ▶; Zarychta et al., 2007 ▶; Volkov et al., 2007 ▶; Coppens & Volkov, 2004 ▶) that dictate the electrostatic properties of macromolecules. Electrostatics, in turn, is one of the driving forces in protein and nucleic acid folding, which should be understood in detail in order to predict biomolecular thermodynamics and kinetics (Snow et al., 2002 ▶, 2005 ▶; Sorin & Pande, 2005 ▶; Pande et al., 2003 ▶). In this work, we contribute an improved theory and algorithm for computing the anisotropic and aspherical valence-electron density of molecules for X-ray crystal structure refinement. Calculation of structure factors is generally based on scattering factors defined by the isolated-atom model (IAM), which assumes that the electron density around each atom is spherically symmetric. However, subatomic resolution diffraction data capture aspherical features of the electron density that result from bonding and the local chemical environment. The difference between the IAM and the true electron density is defined as the deformation density. For example, aspherical electron-density models of diamond, silicon and germanium developed by DeMarco and Weiss and later by Dawson explained the peaks of deformation density at bond midpoints observed in the experimental data (Dawson, 1967a ▶,b ▶,c ▶; DeMarco & Weiss, 1965 ▶). In these works, the IAM was augmented by atom-centered spherical harmonic expansions, whose physical consequence was to redistribute electron density from nonbonding lobes into the tetragonal arrangement of bond centers. A variety of radial functions have been used in combination with atom-centered spherical harmonic expansions. Modified Gaussians were promoted by Dawson (1967a ▶), a set of harmonic oscillator wavefunctions by Kurki-Suonio (1968 ▶) and more recently a formalism based on Slater-type orbitals (STO) was described by Stewart and coworkers (Epstein et al., 1977 ▶; Cromer et al., 1976 ▶; Stewart, 1979 ▶, 1977 ▶) and by Hansen & Coppens (1978 ▶), which represents the current standard (Jelsch et al., 2005 ▶; Zarychta et al., 2007 ▶; Volkov et al., 2007 ▶; Coppens, 2005 ▶). However, spherical harmonics are not the only basis set available to describe the angular dependence of the deformation density. We first present a formulation of anisotropic and aspherical atomic densities based on Cartesian Gaussian multipoles, which leads to much simpler formulae for the calculation of structure factors via direct summation in reciprocal space than the STO-based theory of Hansen & Coppens (1978 ▶). We also demonstrate that Cartesian Gaussian multipoles allow the computation of structure factors via fast Fourier transformation (FFT) of the real-space electron density (Cooley & Tukey, 1965 ▶). The latter approach, originally proposed by Ten Eyck (1973 ▶, 1977 ▶), is the basis of the efficient macromolecular refinement algorithms (Brünger, 1989 ▶; Afonine & Urzhum­tsev, 2004 ▶; Afonine et al., 2007 ▶; Agarwal, 1978 ▶) implemented in programs such as CNS (Brünger et al., 1998 ▶; Brunger, 2007 ▶) and PHENIX (Adams et al., 2002 ▶). The sublattice method implemented in CNS lends itself to efficient parallelization (Brünger, 1989 ▶). Boys originally proposed Cartesian Gaussian functions as basis functions to solve the many-electron Schrödinger equation (Boys, 1950 ▶). The advantage of Gaussians over STOs in this context is that two-electron integrals have analytic forms, which has led to the adoption of Gaussian basis sets for many ab initio calculations (Hehre et al., 1969 ▶, 1970 ▶). We note that the equivalence of spherical harmonics and Cartesian tensors is well known, with key relationships having been presented by Stone (1996 ▶) and Applequist (1989 ▶, 2002 ▶). We apply Cartesian Gaussian multipoles to restrained crystallographic refinements based on the Atomic Multipole Optimized Energetics for Biomolecular Applications (AMOEBA) force-field electrostatic model (Ponder & Case, 2003 ▶; Ren & Ponder, 2002 ▶, 2003 ▶, 2004 ▶; Schnieders et al., 2007 ▶; Schnieders & Ponder, 2007 ▶). The AMOEBA electrostatic model is based on the superposition of permanent atomic multipoles truncated at quadrupoles and induced dipoles. Permanent electrostatics represents the electron density of a group of atoms in the absence of interactions with the environment, which may include other parts of the molecule or solvent. Groups are chosen to be relatively rigid in order to avoid conformational variability in the permanent multipole moments. Conversely, the induced dipoles of AMOEBA represent polarization, the response of the electron density to the local electric field. Force fields are widely used to restrain macromolecular refinement by contributing forces to local optimizations and molecular dynamics (Brünger et al., 1987 ▶), with the latter used within simulated-annealing algorithms to promote global optimization (Brünger, 1988 ▶, 1991 ▶; Brünger et al., 1989 ▶, 1990 ▶, 1997 ▶; Kuriyan et al., 1989 ▶; Adams et al., 1997 ▶; Brünger & Rice, 1997 ▶). Up to now, force fields in crystallography have been largely limited to the geometric and repulsive terms and have had no influence on the atomic scattering factors. Therefore, refinement using a scattering model based on AMOEBA electrostatics is novel and lends insight into the progress being made in the development of precise, transferable force fields. Another limitation of the use of force fields for restraining X-­ray refinement has been the lack of proper treatment of long-range electrostatic interactions, which is overcome in this work via use of particle-mesh Ewald summation (PME; Darden et al., 1993 ▶; Essmann et al., 1995 ▶; Sagui et al., 2004 ▶). In addition to AMOEBA, polarizable force fields are being studied by a number of other groups. Maple and coworkers have pursued a model similar to AMOEBA, but with the permanent moments truncated at dipole order, which has shown promising results for protein–ligand complexes (Friesner et al., 2005 ▶; Maple et al., 2005 ▶). As an alternative to induced dipoles, Patel and Brooks employed a fluctuating-charge model of polarization (Patel & Brooks, 2006 ▶), while Lamoureux and Roux have demonstrated success using classical Drude oscillators (Lamoureux et al., 2006 ▶; Lamoureux & Roux, 2003 ▶). In addition to polarization, Gresh and coworkers have developed a methodology to include nonclassical effects such as electrostatic penetration and charge transfer (Gresh, 2006 ▶; Gresh et al., 2007 ▶; Piquemal et al., 2006 ▶, 2007 ▶). Although classical potentials can be validated against a range of experimental observables, for example small-molecule solvation energies (Shirts et al., 2003 ▶; Shirts & Pande, 2005 ▶), high-resolution diffraction data can pinpoint deficiencies in an electrostatics model with high precision. For example, we show that truncation of permanent atomic multipoles at quadrupole order limits the ability of the AMOEBA model to place charge density at bond midpoints. We use an efficient solution to this limitation by refining partial charges at bond centers as originally proposed by Afonine et al. (2007 ▶).

Theory

Subgrid fast Fourier transform

The starting point for this work is the subgrid fast Fourier transform algorithm (SGFFT), which will be briefly summarized (Brünger, 1989 ▶). In FFT-based methods, the electron density is computed over a lattice chosen to be fine enough to avoid aliasing effects at a given resolution. This computation can be made more efficient by an artificial increase in the atomic displacement parameters (ADPs) of all atoms. The optimum choice in CNS v.1.2 (Brunger, 2007 ▶) for the ADP offset and grid size follows the work of Bricogne (2001 ▶). An important point is that the electron density is only computed within a cutoff radius around each atom. As the resolution increases, the cutoff is increased based on an empirical scheme to maintain agreement between direct-summation structure factors and derivatives and the SGFFT calculation (Brunger, 2007 ▶). Structure factors are computed by FFT of the electron density of an asymmetric unit of atoms (Agarwal, 1978 ▶). The SGFFT is based on factorizing this computation into smaller FFTs that are computed separately on sublattices, which allows efficient parallelization since these tasks are independent (Brünger, 1989 ▶; Kay Diederichs, private communication). CNS v.1.21 has implemented this approach via an OpenMP environment (courtesy Kay Diederichs, University of Konstanz; available at http://cns-online.org). Crystallographic symmetry is then applied to the structure factors, and the target function and its derivatives with respect to structure factors are evaluated. Symmetry operators are applied to the derivatives of the target function with respect to the structure factors followed by inverse Fourier transform. Using the chain rule, derivatives of the target function with respect to atomic parameters are then computed by multiplication and summation over the local neighborhood around each atom of the derivatives of the electron density with respect to atomic parameters. Although the original SGFFT method was developed with an isolated-atom description of electron density and isotropic ADPs, it is generalizable to aspherical Cartesian Gaussian multipoles and anisotropic ADPs. All that is needed are formulae for the electron density and the derivatives of the electron density with respect to atomic parameters, which then can be inserted into equations (29) and (40) of Brünger (1989 ▶). In the following sections, we develop these necessary formulae.

Isolated-atom Gaussian density

The key mathematical property of Gaussians with respect to efficient calculation of structure factors is that they are an eigenfunction of the Fourier transform (FT). In other words, a Gaussian in real space transforms to a Gaussian in reciprocal space and vice versa. Consider the canonical spherically symmetric Gaussian atomic scattering factor (Agarwal, 1978 ▶),where a and b are constant parameters fitted to ab initio calculations on isolated atoms (this work is based on a sum of six Gaussians; n = 6; Su & Coppens, 1998 ▶), κ is an expansion/contraction parameter used to adjust the width of the density and r is a position vector relative to the center of the atom. Its FT is given bywhere s is the reciprocal-lattice vector and we have used the FT definition given in Appendix A . The reciprocal-lattice vector is s = h A −1 = (A −1) h, where h is a column vector with the Miller indices of a Bragg reflection and A is the fractional­ization matrix that transforms coordinates r with respect to a Cartesian basis to fractional coordinates r frac as defined in a crystallographic basis set. The Debye–Waller factor (Waller, 1923 ▶) is given byin reciprocal space, where each element of the symmetric positive-definite matrix U is defined via a Cartesian basis consistent with PDB ANISOU records (Trueblood et al., 1996 ▶; Grosse-Kunstleve & Adams, 2002 ▶). Multiplication of (3) by the atomic form factor from (2) gives the scattering factorbased on U that are defined bywhere U add is the artificial isotropic increase or decrease in the ADP discussed above and I 3 is a 3 × 3 identity matrix. Removal of U add analytically from each structure factor after the FT is straightforward. The only difference, therefore, between each U is the isolated-atom scattering parameter b . Application of the inverse FT to (4) gives the real-space anisotropic electron densitywhere |U | is the determinant of matrix U and U −1 is its inverse. This expression can also be viewed as the convolution of the Gaussian form factor of (1) with the inverse Fourier transform of the Debye–Waller factor of (3). Although the underlying isolated-atom scattering factor is spherically symmetric, convolution with anisotropic ADPs can lead to an angular dependence in ρ((r). Using the relationship that B = 8π2 U, one can show that (6) reduces to the isotropic density expression reported by Brünger in equation (16) of Brünger (1989 ▶) if all diagonal elements of U are equal to U iso + b/8π2 + U add with zero off-diagonal components.

Polarizable atomic multipole electron density

For the derivation of an atomic multipole expansion from a collection of point charges we begin with the Taylor expansion of the electric potential V(r) at r arising from n partial point charges that represent the electron density of an atom,where Δ is the position of partial charge c , ∇α = ∂/∂r α is one component of the del operator, α ∈ {x, y, z} and the Greek subscripts {α, β} represent the use of the Einstein summation convention for summing over tensor elements (Stone, 1996 ▶). We omit the constant factor of 1/4π∊0 throughout for com­pactness. Let the monopole, dipole and traceless quadrupole moments be defined aswhere removal of the trace in the definition of the quadrupole moment is allowed because the potential satisfies the Laplace equation (i.e. ∇2 V = 0). Substitution of the relationships in (8) into the final expression of (7) gives the electric potential in terms of a Cartesian multipole expansion, which we truncate at quadrupole orderWe now replace the Coulomb potential of (9) with the potential from the sum of Gaussians from (1), which is given byand findWe now introduce unique superscripts on the charge, dipole and quadrupole Gaussian basis sets, denoted by {n, n, n Θ} and {κ, κ, κΘ}, to allow them to differ in number and width.The potential of the charge density of (12) quickly approaches the Coulomb potential as r increases since the error function goes to unity such that at large r this potential satisfies the Laplace equation and the use of a traceless quadrupole tensor is still justified. Application of the Laplace operator to both sides of (12) gives the negative of a continuous charge density based on Cartesian Gaussian multipoles, In crystallography the convention is that electron density is positive, so we will keep the negative sign. Therefore, a negative partial charge equates to positive scattering density. Inclusion of ADPs is described by convolution of (13) with the real-space temperature factor,Based on the convolution differentiation rulethe solution to (14) is given by substituting for f(r) in (13) with the corresponding ρ(r) from (6) to giveHowever, since q only represents partial atomic charges, the contributions from valence and core electrons need to be added. Additionally, the AMOEBA force field divides each atomic dipole moment into permanent (d) and induced (u) contributions to account for polarization. Therefore, we construct the total atomic electron density at a location r relative to the center of atom j by adding the contribution of core and valence electron density to (16) and splitting the dipole into permanent and induced components to givewhere P ( is the integer number of core electrons (carbon has two) and P ( is the integer number of valence electrons (carbon has four). The superscripts on the anisotropic Gaussian form factors ρ ((r) have been made explicit for our model. We make the reasonable choice of using the isolated-atom scattering parameters for both core and valence electron densities. The width of the core electron density is frozen at the isolated-atom description (κ = 1) based on the observation that chemical bonding does not perturb it significantly (Hansen & Coppens, 1978 ▶). On the other hand, the width of the valence electron density expands or contracts relative to the isolated-atom model owing to a gain or reduction, respectively, of electron density from or to covalently bonded atoms. This effect is modeled by the width parameter of the valence density κ. In this work, the dipole and quadrupole densities are described by a single Gaussian (n = n Θ = 1) based on a and b parameters set to unity. The widths of the dipole and quadrupole densities are controlled by the κ and κΘ parameters. In this work, the width parameters {κ, κ, κ} are optimized against the diffraction data for each AMOEBA multipole type. The multipole moments are fixed by the AMOEBA force field and are not refined against the data. The partial derivatives through second order of the anisotropic and aspherical density defined in (6), which are required for the real-space multipolar density given in (17), arewhere u α is a unit vector in the α direction with α ∈ {x, y, z}. In addition, the third-, fourth- and fifth-order terms of the expansion are presented as supplementary information along with a Mathematica notebook.1 To the best of our knowledge, (17) is the first expression reported in the literature for a real-space form factor that is the convolution of an atomic multipolar electron density with anisotropic ADPs. This equation opens the door to exploring precise polarizable atomic multipole refinements in tandem with efficient computation of structure factors via FFT. Given a molecular conformation, the AMOEBA permanent multipole moments for each atom in the global coordinate frame (q, d, Θ) are converted via rotation from a local frame. For example, as shown in Fig. 1, ▶ the z axis of the local frame for the carbonyl O atom of the peptide bond is in the direction of the bond to the carbonyl C atom. Its positive x axis is located in the O=CCα plane in the direction of the Cα atom and the y axis is chosen to give a right-handed coordinate system (Ren & Ponder, 2002 ▶). The induced dipole (u) on each atom is determined via a self-consistent field (SCF) calculation, where the field is a sum of contributions from the permanent atomic multipoles and induced dipoles. The AMOEBA polarization model is described in greater detail in work by Ren & Ponder (2002 ▶).
Figure 1

The local multipole frame of the carbonyl O atom of the peptide backbone is shown. The positive z axis is along the C=O bond and the x axis is chosen in the O=C—Cα plane in the direction of the Cα atom. The y axis is directed into the page in order to achieve a right-handed coordinate system. Also shown are the nonzero multipole moments of the O atom and a qualitative representation of their shape. The d Cartesian Gaussian dipole (in Debye units) places electron density along the C=O bond, while the trace of the Cartesian Gaussian quadrupole (in Buckingham units) positions electron density approximately at lone-pair positions.

Derivatives of the electron density

Atomic coordinates

As a simplification, the derivation up to this point has assumed that the atomic center was the origin of the coordinate system. However, for this section on the derivatives with respect to atomic coordinates we place atom j at r in the global frame. In order to keep the derivation manageable, we split the total electron density into that produced by permanent charges ρperm and that of induced charges ρind,The derivative of the permanent multipole electron density of atom j with respect to the α coordinate of atom j is given bywhere the derivative of the dipole and quadrupole densities are each composed of two terms owing to the chain rule. As described above, the dipole and quadrupole moments of each atom are implicitly a function of its coordinates and the coordinates of a few of its bonded neighbors (atoms k) that define the local frame of the multipole. Therefore, the derivative of the permanent multipole electron density of atom j with respect to the α coordinate of atoms k must also be considered,where the derivatives of spherically symmetric terms are zero with respect to the coordinates of atom k because they have no dependence on the orientation of the local frame. Note that the partial derivative of an anisotropic and aspherical density tensor with respect to an atomic coordinate is the negative of the partial derivatives given in (18), simply due to the negative sign on r . The derivatives of the polarizable density with respect to atomic coordinates are very specific to the AMOEBA electrostatic model and are discussed in Appendix B . However, we note that computing the derivatives of a polarizable density with respect to atomic coordinates is O(n 2logn) using PME, which quickly becomes the most expensive part of the overall calculation.

ADPs

The derivative of the anisotropic electron density of atom j with respect to an anisotropic displacement parameter U is given byand requires the partial derivatives of the Cartesian Gaussian tensors with respect to ADP components. Introducing a few relationships facilitates their presentation. Firstly, based on the equalitywe havewhere the Kronecker delta δτυ is unity for diagonal elements of U and zero otherwise. Differentiating an identity from matrix algebra U −1 U = I gives the following relationshipwhich makes it possible to differentiate U instead of its inverse. This is preferred since only one or two elements of ∂U/∂U τυ are equal to unity and the rest are zero. Specifically, a single element is equal to unity if τ equals υ, while two elements are equal to unity otherwise, since U τυ and U υτ represent the same variable in this case. For convenience, we define a 3 × 3 matrix J (τυ),and based on the chain rule we haveDifferentiating (6) with respect to U τυ and using (24), (27) and the product rule gives

Gaussian width

The Gaussian width parameter κ controls radial expansion and contraction of the Cartesian Gaussian multipoles. Analogous parameters are used to optimize the STOs within the Hansen and Coppens scattering model (Hansen & Coppens, 1978 ▶). The derivative of the electron density with respect to this parameter is similar to the gradient for the ADP parameters. Two chain-rule terms are necessary. Firstly, the gradient of the normalizing termwhereSecondly, the gradient of the inverse ADP matrix is most conveniently expressed using the gradient of the original ADP matrix,whereFor convenience the matrix J (κ) is defined to more compactly represent this result, Differentiating (6) with respect to κ and using (29), (33) and the product rule givestogether with the third- and fourth-order terms available as supplementary information1.

Fourier transform of the polarizable atomic multipole electron density

Remarkably, the FT of the anisotropic and aspherical density given in (17) is simplywhere the dipole and quadrupole terms in (35) depend on the FT of the partial derivatives defined in (18). Through fifth order the reciprocal-space tensors areand in compressed tensor notation the general expression for order u + v + w isThis expression is considerably more compact than any reported previously for an aspherical scattering factor in reciprocal space, particularly the formulation based on STOs and spherical harmonics (Hansen & Coppens, 1978 ▶). Notably, our formulation has no dependence on cumbersome Fourier Bessel transforms of Slater-type functions (Dawson, 1967a ▶; Hansen & Coppens, 1978 ▶; Su & Coppens, 1990 ▶). Our equation (35) has been implemented by ‘direct summation’ for com­parison to the performance of the FFT algorithm.

Scattering models

Four scattering models were implemented by modifying and combining the CNS (Brünger et al., 1998 ▶) and TINKER (Ponder, 2004 ▶) code bases. The scattering models were added to the CNS code base, while TINKER was used to compute AMOEBA chemical forces and to supply CNS with polarizable multipoles in the global frame.

Isolated atom

The first scattering model (‘IAM’) is the conventional IAM based on the relativistic elastic scattering factors described by Su & Coppens (1998 ▶).

Isolated atom with inter-atomic scattering

The second scattering model (‘IAM–IAS’) augments the IAM with inter-atomic scattering sites at bond centers (Afonine et al., 2007 ▶). Unlike the model of Afonine and coworkers, our implementation does not include IAS sites at lone pairs or at the center of aromatic rings. We have neglected these sites based on the rationale that the AMOEBA electrostatic model is sufficient to capture these details of the electron density, which we provide further evidence for below when discussing the refinement of a Tyr-Gly-Gly tripeptide. In our approach, chemically equivalent bonds are constrained to use the same IAS parameters. Charge density that is added to or removed from bond centers is exactly balanced by changing the net charge of the bond-defining atoms. For example, a bond charge of −0.2 e requires atomic charge increments that sum to 0.2 e. In this way, all molecules retain their original net charge. Each bond type requires three parameters: the charge increments of both atoms and the Gaussian width of the scattering site. Bond types are defined based on the concatenation of the AMOEBA force-field atom types.

AMOEBA

The third scattering model (‘AMOEBA’) is based on the polarizable atomic multipoles of the AMOEBA force field. Each chemically unique multipole type requires three Gaussian width parameters as described in §2. The induced dipoles were iterated to self-consistency using PME whenever any atomic coordinates were changed during refinement (Darden et al., 1993 ▶; Sagui et al., 2004 ▶; Essmann et al., 1995 ▶).

AMOEBA with inter-atomic scattering

The final scattering model (‘AMOEBA–IAS’) augments AMOEBA electrostatics with inter-atomic scattering sites. It became clear during the course of this study that an atomic multipole expansion truncated at quadrupole order is insufficient to capture bond charge density for most molecular geometries. This is consistent with theoretical observations by Stone and coworkers that the convergence of a distributed multipole analysis (DMA) may be improved by using both atoms and bond centers as expansion sites (Stone & Alderton, 1985 ▶; Stone, 2005 ▶). Furthermore, experimental data from the X-ray scattering of diamond and silicon, simple examples of tetrahedral bonding geometry, are explained by the superposition of one atomic octopole moment and one atomic hexadecapole moment (Dawson, 1967a ▶,b ▶). The characteristics of the four scattering models are further clarified below with respect to four peptide test cases. The following computational details were constant across all of the refinements. The isotropic ADP offset U add was set to 1/(4π2), which is equivalent to B add = 8π2 U add = 2, the FFT grid factor to 0.33 (as appropriate for crystal structures at sub­atomic resolution), and the electron-density cutoff around each atom was 18 (specified by the E lim parameter in CNS). These conservative parameters led to close agreement between direct summation and FFT computation of structure factors. The CNS parameter w A that controls the weighting of X-ray target function relative to the force-field energy was set to 1.0, although we also tested 0.2.This raised R free values by less than 0.1% and lowered the AMOEBA potential energy differences between refinements presented below, but did not alter any trends or our conclusions. It should be noted that force-field restraints are not necessarily required for refinement at subatomic high resolution. However, their use in this study gives an insight into the relative energetic cost of the structural changes arising from differences in the four scattering models. A modified version of the refine.inp CNS task file was used for all refinements using the MLI target function.

Applications

To demonstrate the behavior of X-ray refinements based on Cartesian Gaussian multipoles, we present two sets of applications. The first set is simply to illustrate the performance of direct summation versus FFT and SGFFT computation of structure factors as a function of system size. The second set describes refinements on a series of four peptide crystals that diffract to 0.59 Å resolution or better. All examples use the AMOEBA force field for chemical forces, instead of the default CNS force field based on Engh & Huber parameters (Engh & Huber, 1991 ▶). Although the refinements were performed in the native space group of each crystal, AMOEBA energies and gradients as computed using the TINKER code base required expanding to P1. This did not increase the number of refined variables, but suggests the need for an AMOEBA code that takes advantage of crystal symmetry.

Runtime scaling on protein data sets

Evaluation of the target function and its derivatives by direct-summation calculation of structure factors via (35) and (36) is O(N atoms × N reflections × N symm). Alternatively, the FFT algorithm based on (17) and (18) is O(N grid × logN grid), where the number of grid points N grid depends on the resolution of the diffraction data. Aspherical refinements based on the Hansen–Coppens formalism are currently limited to direct summation, since the real-space form of the electron density convolved with ADPs is unknown. Therefore, the performance of X-ray refinements based on Cartesian Gaussian multipoles and FFT is of particular interest. The results are summarized in Table 1 ▶ and are plotted in Fig. 2 ▶. Although the performance difference is only about a factor of two for the small protein crambin, over an order of magnitude improvement is achieved for both ribonuclease A and aldose reductase. Parallelization with the SGFFT method results in a further significant speedup (a speedup of a factor of nearly four relative to a single processor on a four-processor machine).
Table 1

Comparison of computational efficiency of direct-summation, FFT and SGFFT methods for the computation of the Cartesian Gaussian multipole scattering factors

The permanent multipole expansion was truncated at atomic quadrupoles and polarization was included via induced dipoles. The FFT method shows a speedup factor of 1.8–14.5 relative to direct summation. Parallelization by SGFFT provided an additional factor of 3.7–3.9 using four processors. All calculations were performed on a MacPro workstation with 2 × 2.66 GHz Dual Core Intel Xeon processors.

PDB codeAtomsReflectionsNsymmAtoms × reflections × Nsymm × 10−6Direct (s)FFT (s)Direct/FFTSGFFT (s)Direct/SGFFT
1ejg6421122092144.149.928.11.87.36.8
2vb125441871651476.1301.891.53.323.612.8
1fn842941585501680.8245.145.85.412.419.8
1dy5483515942221541.6505.637.013.79.752.1
1us0686551126527019.72346.2162.314.542.355.5
Figure 2

The scaling of the Cartesian Gaussian multipole model, truncated at quadrupole order, is plotted on a log–log scale for computation of the intensity-based maximum-likelihood target function (MLI) for direct summation, FFT and SGFFT. Direct summation scales linearly with the product of the number of atoms, the number of reflections and the number of symmetry operators. Computation of the crystallographic target function by FFT of the Cartesian Gaussian multipole electron density shows a speedup of a factor of between 1.8 and 14.5 compared with direct summation. A further speedup factor of nearly four is achieved using the SGFFT method on a four-processor machine.

Refinement of peptide crystals

In principle, a more precise scattering model based on Cartesian Gaussian multipoles with coefficients from the AMOEBA electrostatics model should improve the quality of refinements relative to the IAM as judged by both R free and the potential energy of the asymmetric unit. Furthermore, the quality of the AMOEBA potential energy function can also be assayed, since it is reasonable to expect that potential energy and R free should be correlated. The peptide crystals studied include YG2 (Pichon-Pesme et al., 2000 ▶), cyclic P2A4 (Dittrich et al., 2002 ▶) and AYA with three waters or with an ethanol molecule (Chęcińska, Forster et al., 2006 ▶; Chęcińska, Mebs et al., 2006 ▶). Detailed descriptions of the unit-cell parameters, number of atoms, resolution and measured reflections are given in Table 2 ▶. The refinement results are summarized in Table 3 ▶ and compared with previous work below.
Table 2

Refinement systems

MoleculeSpace group and unit-cell parameters (Å, °)Non-H atomsH atomsBondsdmin (Å)Reflections
YG2P212121, a = 7.98, b = 9.54, c = 18.322219400.434766
P2A4P212121, a = 10.13, b = 12.50, c = 19.503536720.3724878
AYA + 3 watersP21, a = 8.12, b = 9.30, c = 12.53, β = 91.212627500.595019
AYA + ethanolP21, a = 8.85, b = 9.06, c = 12.36, β = 94.562627520.595258
Table 3

Refinement statistics and the relative AMOEBA potential energy per asymmetric unit are given for four small peptide crystals using the IAM, IAM–IAS, AMOEBA and AMOEBA–IAS scattering models

In all cases, the lowest R free was found using the AMOEBA–IAS scattering model. Furthermore, the structure with the lowest AMOEBA potential energy per asymmetric unit also corresponded to AMOEBA–IAS refinement.

    Rwork/Rfree (%) 
MoleculeScattering modelNparamNdata/NparamIobs/σ(Iobs) > 0Iobs/σ(Iobs) > 3Energy (kcal mol−1)
YGGIAM27417.44.73/4.744.41/4.6036.5
 IAM–IAS34913.73.93/4.013.59/3.867.2
 AMOEBA35513.44.50/4.564.16/4.376.8
 AMOEBA–IAS43011.13.54/3.723.17/3.500.0
PPAAAAIAM33973.44.25/4.223.65/3.7332.2
 IAM–IAS41759.73.56/3.483.00/3.0118.3
 AMOEBA41759.74.24/4.233.69/3.7712.9
 AMOEBA–IAS49550.33.42/3.422.86/2.940.0
AYA + 3 watersIAM34214.72.75/2.792.67/2.7117.5
 IAM–IAS41112.22.24/2.472.16/2.394.1
 AMOEBA42311.92.40/2.552.31/2.474.7
 AMOEBA–IAS49210.21.72/2.031.64/1.950.0
AYA + ethanolIAM34215.43.30/3.503.20/3.3323.1
 IAM–IAS42312.42.32/2.662.21/2.4914.8
 AMOEBA43512.13.42/3.753.32/3.583.7
 AMOEBA–IAS51610.21.90/2.251.79/2.080.0

1 kcal mol−1 = 4.186 kJ mol−1.

YG2

The R free values of the IAM and IAM–IAS refinements of YG2 (4.60 and 3.86%, respectively) are slightly lower than those reported by Afonine and coworkers (4.72 and 4.06%, respectively; Afonine et al., 2007 ▶). The R free value of the AMOEBA–IAS refinement (3.50%) is a significant improvement. The R work value (3.17%) of the AMOEBA–IAS refinement is also lower and is comparable to multipolar refinements reported by Volkov and coworkers using transferred or refined multipole coefficients (3.66% and 3.42%, respectively; Volkov et al., 2007 ▶). Cross-validation-based comparisons are unavailable in this case. We note that the AMOEBA–IAS refinement used a reflections-to-parameters ratio of 11.1, which is slightly higher than the value of 10.6 reported by Volkov and coworkers using refined multipole coefficients. This is computed based on the number of reflections reported in Table 2 ▶ and the number of parameters given in Table 3 ▶. Electron-density maps of the tyrosine ring for the four scattering models are shown in Fig. 3 ▶, which lend visual insight into their properties. The non-H atom positions are apparent in the 2F o − F c contours for each refinement. The standard IAM scattering model underestimates the electron density at bond centers and at the oxygen lone-pair sites, as shown by the F o − F c con­tours. Our IAM–IAS scattering model explains the electron density at bond centers, but does not capture lone-pair electron density. Conversely, the AMOEBA model places electron density approximately at the lone-pair positions but not at bond centers. Finally, the AMOEBA–IAS model explains much of the lone-pair and bonding electron densities.
Figure 3

(a) IAM, (b) IAM–IAS, (c) AMOEBA and (d) AMOEBA–IAM refinements, respectively, for GY2. The F o − F c and 2F o − F c σA-weighted electron-density maps are contoured at 3.5σ and shown in green and gray, respectively. Both the IAM and AMOEBA models fail to explain the electron density at bond centers seen in the data. In addition, the IAM model does not account for lone-pair density on the O atom.

P2A4

The R free values of our IAM and IAM–IAS refinements of P2A4 (3.73 and 3.01%, respectively) agree closely with the values of Afonine and coworkers (3.63 and 3.23%, respectively; Afonine et al., 2007 ▶). The R free value of the AMOEBA–IAS refinement (2.94%) is lower by 0.07%, which is the least amount of improvement seen for AMOEBA–IAS relative to IAM–IAS in this study. The R work value (2.86%) of the AMOEBA–IAS refinement is slightly higher, but com­parable to those reported by Volkov and coworkers using transferred or refined multipole coefficients (2.60% and 2.53%), although this work uses a higher reflections-to-parameters ratio (50.3 compared with 43.6; Volkov et al., 2007 ▶). As for YG2, cross-validation was not performed. The similarity of the R values for YG2 and P2A4 between the AMOEBA–IAS refinements and the multipolar refinements of Volkov and coworkers is consistent with the principle that bond scattering sites capture density that is represented by higher order atomic moments missing in the AMOEBA model (octopole and hexadecapole). In Fig. 4 ▶ the precision of the R work and R free values computed using discrete FTs are compared with analytic direct summation for P2A4 under the AMOEBA scattering model. Agreement to four decimal places is seen for B add values between 0 and 3 Å2, which serves as validation of the correctness of (17) and (35). These results support the conclusion that FFT-based computation of structure factors is appropriate for anisotropic and aspherical scattering models.
Figure 4

The precision of numerical computation of the R work and R free values via FFT is compared with analytic direct summation as a function of the isotropic increase B add in ADP parameters for P2A4 under the AMOEBA scattering model. Note that B add = 8π2 U add.

AYA

The AYA data sets were chosen because of the extremely low temperature achieved during the measurement of structure factors (9 K for AYA + three waters and 20 K for AYA + ethanol). For AYA + water, Chęcińska and coworkers (Chęcińska, Forster et al., 2006 ▶; Chęcińska, Mebs et al., 2006 ▶) originally reported an R value of 2.4%, which is in agreement with the R value of our IAM refinement (2.67%). Addition of IAS lowered the R free statistic from 2.71% to 2.39%, while addition of polarizable atomic multipole electron density showed a further improvement to an R free of 1.95%. For AYA + ethanol the R work value of the IAM (3.20%) is comparable to that reported originally by Chęcińska and coworkers (2.9%). IAM–IAS lowered R free from 3.33 to 2.49%, while AMOEBA–IAS achieved 2.08%.

Refinement summary

The results for all four peptide refinements are summarized in Fig. 5 ▶. In every case, use of the AMOEBA–IAS scattering model relative to the IAM scattering model lowered both R free and the potential energy of the crystal. When the IAM scattering model is used, molecular conformations are highly strained to compensate. For example, H—C atom bonds are too short because the IAM model centers electron density at the hydrogen nucleus. In the crystal structures, this electron density is shifted towards the C atom. As the description of the electron density is improved, the molecular conformation relaxes by approximately 16 kJ mol−1 per residue. The precise amount of relaxation depends on the weighting between the crystallographic target and the force field. Unrestrained refinements with an IAM scattering model could adopt even more unphysical conformations. This suggests that accurate chemical restraints are necessary even for ultrahigh-resolution refinements unless an anisotropic and aspherical scattering model is used.
Figure 5

The improvement arising from the AMOEBA–IAS scattering model, relative to the IAM model, is plotted as a function of relative percentage improvement in R free value and the relative AMOEBA potential energy per residue. For all data sets, the best R free value and lowest potential energy per residue were achieved using the AMOEBA–IAS scattering model. 1 kcal mol−1 = 4.186 kJ mol−1.

In Fig. 6 ▶, we present plots of the IAS sites that were refined for each peptide system. Their Gaussian full-width at half-maximum (FWHM) is plotted against charge magnitude for both the IAM–IAS and the AMOEBA–IAS models. The majority of the charges under the IAM–IAS model and all of the charges under the AMOEBA–IAS model refined to negative partial charge values (or positive scattering density), which is consistent with the physical concentration of charge density at chemical bonds. The similarity of the refined charges between the IAM–IAS and the AMOEBA–IAS models suggests that an atomic multipole description of electron density truncated at quadrupole order underestimates density at trigonal and tetrahedral bond centers.
Figure 6

For the inter-atomic scattering sites of the IAM–IAS (a) and AMOEBA–IAS (b) scattering models, the refined Gaussian full-width at half-maximum (FWHM) is plotted versus partial charge magnitude. The majority of charges for the IAM–IAS model and all charges for the AMOEBA–IAS are negative. The sub-angstrom FWHM values are consistent with very localized bond densities.

Conclusions

Cartesian Gaussian multipoles offer an efficient alternative to the Hansen and Coppens formulation of aspherical scattering. They eliminate the use of Slater-type functions and allow structure factors to be computed by FFT. Numerical tests show that that FFT and direct-summation implementations of Cartesian Gaussian multipoles agree to high precision. For subatomic resolution biomolecular data sets such as ribo­nuclease A and aldose reductase, parallelized computation of structure factors using the SGFFT method results in a speedup of one to two orders of magnitude compared with direct summation. Ideally, a force-field electrostatics model should be accurate enough to explain the electron density observed in X-ray diffraction experiments. Although the AMOEBA polarizable multipole force field energetic model shows promise, truncation of the permanent moments at quadrupole order systematically underestimates electron density at bond centers. Our results suggest that the added computational expense of including hexadecapole moments in the atomic scattering factor computation is justified. As supplementary information we have provided a Mathematica notebook and formulae that allow computation of Cartesian Gaussian multipoles up to the fourth order in anticipation of further improvements to force fields. In the near future, we will present the results of applying our polarizable atomic multipole refinement methodology to macromolecules. For ultrahigh-resolution macromolecular data sets, such as HEWL at 0.65 Å (Wang et al., 2007 ▶), our scattering model significantly improves refinement statistics, as it does for the simpler peptide cases presented here. Equally exciting will be the use of the AMOEBA force field and in particular the electrostatic forces to orient water molecules in the absence of clear H-atom electron density. We anticipate that refinement of hydrogen-bonding networks will enhance the usefulness of X-ray crystallography experiments with respect to explaining pK a shifts, ligand-binding affinities and enzymatic mechanisms. Supplementary material file. DOI: 10.1107/S0907444909022707/dz5164sup1.pdf Supplementary material file. DOI: 10.1107/S0907444909022707/dz5164sup2.pdf
  44 in total

1.  The Protein Data Bank.

Authors:  H M Berman; J Westbrook; Z Feng; G Gilliland; T N Bhat; H Weissig; I N Shindyalov; P E Bourne
Journal:  Nucleic Acids Res       Date:  2000-01-01       Impact factor: 16.971

2.  Accurate protein crystallography at ultra-high resolution: valence electron distribution in crambin.

Authors:  C Jelsch; M M Teeter; V Lamzin; V Pichon-Pesme; R H Blessing; C Lecomte
Journal:  Proc Natl Acad Sci U S A       Date:  2000-03-28       Impact factor: 11.205

3.  Conformational flexibility of B-DNA at 0.74 A resolution: d(CCAGTACTGG)(2).

Authors:  C L Kielkopf; S Ding; P Kuhn; D C Rees
Journal:  J Mol Biol       Date:  2000-02-25       Impact factor: 5.469

4.  On a fast calculation of structure factors at a subatomic resolution.

Authors:  P V Afonine; A Urzhumtsev
Journal:  Acta Crystallogr A       Date:  2003-12-23       Impact factor: 2.290

Review 5.  Force fields for protein simulations.

Authors:  Jay W Ponder; David A Case
Journal:  Adv Protein Chem       Date:  2003

6.  Atomistic protein folding simulations on the submillisecond time scale using worldwide distributed computing.

Authors:  Vijay S Pande; Ian Baker; Jarrod Chapman; Sidney P Elmer; Siraj Khaliq; Stefan M Larson; Young Min Rhee; Michael R Shirts; Christopher D Snow; Eric J Sorin; Bojan Zagrovic
Journal:  Biopolymers       Date:  2003-01       Impact factor: 2.505

Review 7.  How well can simulation predict protein folding kinetics and thermodynamics?

Authors:  Christopher D Snow; Eric J Sorin; Young Min Rhee; Vijay S Pande
Journal:  Annu Rev Biophys Biomol Struct       Date:  2005

8.  Triclinic lysozyme at 0.65 A resolution.

Authors:  Jiawei Wang; Miroslawa Dauter; Randy Alkire; Andrzej Joachimiak; Zbigniew Dauter
Journal:  Acta Crystallogr D Biol Crystallogr       Date:  2007-11-16

9.  Cross-validated maximum likelihood enhances crystallographic simulated annealing refinement.

Authors:  P D Adams; N S Pannu; R J Read; A T Brünger
Journal:  Proc Natl Acad Sci U S A       Date:  1997-05-13       Impact factor: 11.205

10.  Ultrahigh resolution drug design I: details of interactions in human aldose reductase-inhibitor complex at 0.66 A.

Authors:  E I Howard; R Sanishvili; R E Cachau; A Mitschler; B Chevrier; P Barth; V Lamour; M Van Zandt; E Sibley; C Bon; D Moras; T R Schneider; A Joachimiak; A Podjarny
Journal:  Proteins       Date:  2004-06-01
View more
  22 in total

1.  Generalized and efficient algorithm for computing multipole energies and gradients based on Cartesian tensors.

Authors:  Dejun Lin
Journal:  J Chem Phys       Date:  2015-09-21       Impact factor: 3.488

Review 2.  Computational insights for the discovery of non-ATP competitive inhibitors of MAP kinases.

Authors:  Michael J Schnieders; Tamer S Kaoud; Chunli Yan; Kevin N Dalby; Pengyu Ren
Journal:  Curr Pharm Des       Date:  2012       Impact factor: 3.116

Review 3.  Classical electrostatics for biomolecular simulations.

Authors:  G Andrés Cisneros; Mikko Karttunen; Pengyu Ren; Celeste Sagui
Journal:  Chem Rev       Date:  2013-08-27       Impact factor: 60.622

4.  Structural Insights into Hearing Loss Genetics from Polarizable Protein Repacking.

Authors:  Mallory R Tollefson; Jacob M Litman; Guowei Qi; Claire E O'Connell; Matthew J Wipfler; Robert J Marini; Hernan V Bernabe; William T A Tollefson; Terry A Braun; Thomas L Casavant; Richard J H Smith; Michael J Schnieders
Journal:  Biophys J       Date:  2019-07-03       Impact factor: 4.033

5.  Perspective: Quantum mechanical methods in biochemistry and biophysics.

Authors:  Qiang Cui
Journal:  J Chem Phys       Date:  2016-10-14       Impact factor: 3.488

6.  Extracting water and ion distributions from solution x-ray scattering experiments.

Authors:  Hung T Nguyen; Suzette A Pabit; Lois Pollack; David A Case
Journal:  J Chem Phys       Date:  2016-06-07       Impact factor: 3.488

7.  DiSCaMB: a software library for aspherical atom model X-ray scattering factor calculations with CPUs and GPUs.

Authors:  Michał L Chodkiewicz; Szymon Migacz; Witold Rudnicki; Anna Makal; Jarosław A Kalinowski; Nigel W Moriarty; Ralf W Grosse-Kunstleve; Pavel V Afonine; Paul D Adams; Paulina Maria Dominiak
Journal:  J Appl Crystallogr       Date:  2018-02-01       Impact factor: 3.304

8.  Ambient-Potential Composite Ewald Method for ab Initio Quantum Mechanical/Molecular Mechanical Molecular Dynamics Simulation.

Authors:  Timothy J Giese; Darrin M York
Journal:  J Chem Theory Comput       Date:  2016-05-26       Impact factor: 6.006

9.  Force Fields for Small Molecules.

Authors:  Fang-Yu Lin; Alexander D MacKerell
Journal:  Methods Mol Biol       Date:  2019

10.  Limiting assumptions in molecular modeling: electrostatics.

Authors:  Garland R Marshall
Journal:  J Comput Aided Mol Des       Date:  2013-01-26       Impact factor: 3.686

View more

北京卡尤迪生物科技股份有限公司 © 2022-2023.