
Improved fitting of solution X-ray scattering data to macromolecular structures and structural ensembles by explicit water modeling.

Alexander Grishaev, Liang Guo, Thomas Irving, Ad Bax.

Abstract

A new procedure, AXES, is introduced for fitting small-angle X-ray scattering (SAXS) data to macromolecular structures and ensembles of structures. By using explicit water models to account for the effect of solvent, and by restricting the adjustable fitting parameters to those that dominate experimental uncertainties, including sample/buffer rescaling, detector dark current, and, within a narrow range, hydration layer density, superior fits between experimental high resolution structures and SAXS data are obtained. AXES results are found to be more discriminating than standard Crysol fitting of SAXS data when evaluating poorly or incorrectly modeled protein structures. AXES results for ensembles of structures previously generated for ubiquitin show improved fits over fitting of the individual members of these ensembles, indicating these ensembles capture the dynamic behavior of proteins in solution.

Year:  2010        PMID: 20958032      PMCID: PMC2974370          DOI: 10.1021/ja106173n

Source DB:  PubMed          Journal:  J Am Chem Soc        ISSN: 0002-7863            Impact factor:   15.419


Solution small-angle X-ray scattering (SAXS) data contain valuable information on macromolecular size and shape and are increasingly used in biomolecular structure studies, not only as a stand-alone tool but also as a complement to NMR and X-ray crystallography.[1–4] Recent methods permit direct refinement against X-ray scattering data in combination with other experimental restraints, taking advantage of the sensitivity of these data to the molecular shape.[5–8] In particular, the long-range translational information encoded in SAXS data is proving to be a valuable complement to the global orientational restraints contained in NMR residual dipolar couplings. As these solution data reflect the composition of the entire structural ensemble, they are also particularly useful in the investigation of flexible and intrinsically disordered systems, which often challenge a detailed structural characterization by X-ray crystallography and conventional NMR.[9] The utility of SAXS data in structural studies critically hinges on the ability to accurately predict such data from all-atom structural models. Important progress in this area has been made over the past two decades, leading to an established formalism for such calculations and culminating in the Crysol software package,[10] the de facto standard. Crysol models solution scattering data from a uniform orientational average:

$$ I(q) = \left\langle \left| F_{\mathrm{mol}}(q) - F_{\mathrm{disp}}(q) + \delta\rho\, F_{\mathrm{surf}}(q) \right|^{2} \right\rangle_{\Omega} $$

where F_mol, F_disp, and δρ F_surf stand for the complex scattering amplitudes of the macromolecule, the displaced solvent, and the increased density (by δρ) of the surface water layer. The scattering vector is defined as q = 4π sin θ/λ, where 2θ is the scattering angle and λ is the incident radiation wavelength.
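The orientational average ⟨...⟩_Ω can be illustrated with a small numerical sketch. It is not the Crysol or AXES implementation: it assumes point scatterers with constant, hypothetical net weights (standing in for the combined molecular, displaced-solvent, and surface-layer terms), averages |F|² over a golden-angle spiral of directions, and compares the result with the closed-form Debye sum for the same points. The function names and coordinates are illustrative only.

```python
import cmath
import math

def fibonacci_directions(n):
    """Return n approximately uniform unit vectors (golden-angle spiral)."""
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    directions = []
    for k in range(n):
        z = 1.0 - (2.0 * k + 1.0) / n      # evenly spaced in z
        r = math.sqrt(1.0 - z * z)
        phi = k * golden_angle
        directions.append((r * math.cos(phi), r * math.sin(phi), z))
    return directions

def intensity_orientational(q, coords, weights, n_dirs=2000):
    """Average |F|^2 over discrete directions, F = sum_j w_j exp(i q u.r_j)."""
    total = 0.0
    for ux, uy, uz in fibonacci_directions(n_dirs):
        amplitude = 0j
        for (x, y, z), w in zip(coords, weights):
            amplitude += w * cmath.exp(1j * q * (ux * x + uy * y + uz * z))
        total += abs(amplitude) ** 2
    return total / n_dirs

def intensity_debye(q, coords, weights):
    """Exact orientational average for point scatterers (Debye sum)."""
    total = 0.0
    for ri, wi in zip(coords, weights):
        for rj, wj in zip(coords, weights):
            x = q * math.dist(ri, rj)
            total += wi * wj * (1.0 if x == 0.0 else math.sin(x) / x)
    return total

# Hypothetical toy "molecule": three points (in Angstrom) with net weights.
coords = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (0.0, 0.0, 6.0)]
weights = [1.0, 1.0, -0.5]

numeric = intensity_orientational(0.5, coords, weights)
exact = intensity_debye(0.5, coords, weights)
```

Rotating the direction of q around a fixed molecule is equivalent to rotating the molecule relative to the beam, so the two functions should agree closely for a sufficiently dense set of directions.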
Other methods have been formulated to improve on the treatment of orientational averaging and solvent representation.[11−14] However, Crysol’s speed, simplicity, and often superior ability to obtain a very good fit of the experimental scattering data to the atomic coordinates make its use very attractive. Several adjustable parameters are used by Crysol when calculating predicted data that best match the experimental curve. Next to the adjustable overall scaling factor between the measured and fitted data, these include the effective atomic radii multiplier which scales the solvent volume displaced by each atom, the electron density contrast of the surface solvent layer, and the total displaced solvent volume, in practice equivalent to the variation of the electron density of the displaced solvent relative to bulk water. The necessity for introducing these parameters as variables rather than constants that are kept fixed for all proteins or nucleic acids is not immediately obvious from first principles but becomes clear when investigating the reproducibility of experimental scattering data collected for distinct samples of the same macromolecule on different instruments. Such comparisons often indicate that the characteristic features of the measured scattering curves are well conserved, but in particular the scattered intensity at larger angles (“higher-q features”) varies relative to the extrapolated intensity at zero angle, I(0). Crysol’s adjustable parameters are very effective at absorbing this variability as they can adjust the level of the higher-q features of the predicted data relative to the low-q intensities. Here, we reformulate the approach to fitting SAXS data by explicitly taking into account the sources of experimental data variability. 
For this purpose, the measured scattering intensity difference is written as

$$ I_{\mathrm{expt}}(q) = I_{\mathrm{sample}}(q) - \alpha\, I_{\mathrm{buffer}}(q) + c $$

where the variable sample/buffer rescaling factor α ≈ 1 accounts for the uncertainty in the measurements of transmitted and incident intensities and for the uncertainty with which the solute volume fraction in the sample is known from its concentration. The second variable, c, accounts for variability of the detector’s dark current and effects such as X-ray fluorescence. Uncertainties in α and c appear responsible for much of the systematic difference between repeated experimental data sets. In our analysis, the scattering intensity predicted from the atomic coordinates is modeled using three averages: the Ω average is taken over a discrete pseudouniform set of molecular frame orientations relative to the incident beam; the “solv” average is taken over the displaced and surface water sets; and “ens” denotes an average over the ensemble of macromolecular structures, when available. In our approach, the scattering amplitudes of the surface and displaced solvent are calculated by summations over explicit individual water molecules, as detailed in the Supporting Information (SI). Explicit and realistic representation of the solvent is particularly useful for molecular shapes that strongly deviate from being globular, including rods, toroids, dumbbells, random coils, and other highly anisometric shapes. A second advantage of this approach is its natural ability to predict the scattering intensities for an arbitrarily dynamic ensemble with the same ease as for a single static structure. Our approach thus uses the same number of adjustable parameters as Crysol but replaces the atomic radii multiplier and total excluded volume, which are applied to the structure-predicted data, with the sample/buffer rescaling factor and the constant offset, which are applied to the measured data.
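A minimal sketch of why α and c matter, assuming the measured difference takes the form I_sample(q) − α·I_buffer(q) + c. The function name and the two-point curves are synthetic illustrations, not data or code from the paper.

```python
def buffer_subtracted(i_sample, i_buffer, alpha=1.0, c=0.0):
    """Return I_sample(q) - alpha * I_buffer(q) + c at each q point (assumed form)."""
    if len(i_sample) != len(i_buffer):
        raise ValueError("sample and buffer curves must share the same q grid")
    return [s - alpha * b + c for s, b in zip(i_sample, i_buffer)]

# Synthetic two-point curves: strong solute signal at low q, weak at high q.
i_sample = [1000.0, 12.0]
i_buffer = [10.0, 10.0]

nominal = buffer_subtracted(i_sample, i_buffer, alpha=1.00)
rescaled = buffer_subtracted(i_sample, i_buffer, alpha=1.01)

# Ratio of high-q to low-q intensity before and after a 1% change in alpha.
ratio_nominal = nominal[1] / nominal[0]
ratio_rescaled = rescaled[1] / rescaled[0]
relative_change = (ratio_rescaled - ratio_nominal) / ratio_nominal
```

In this toy case a 1% change in α shifts the high-q level relative to the low-q level by about 5%, because the solute signal at high q is only a small difference between two nearly equal intensities.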
A measure of the discrepancy, D, between the predicted and measured scattering data is formulated in terms of the following quantities: A is the overall scaling parameter; σ_expt are the experimental uncertainties of the individual data points; R(α) is a regularizer that keeps the fitted α parameter close to the target value for the concentration-based volume fraction of the displaced solvent, α_0; and σ_α ≈ 10^−2 denotes the uncertainty of this parameter. Fitting of the SAXS data is carried out by our webserver program AXES (Analysis of X-ray scattering data for Ensemble Structures; http://spin.niddk.nih.gov/bax/nmrserver/), using a Powell minimization of the penalty function against the adjustable parameters for both the experimental (α, c) and the predicted (A, δρ) data. Superior SAXS data fit quality is illustrated for a set of small, well-studied proteins for which high-resolution structures were available from X-ray crystallography and solution NMR.[15–18] SAXS data for hen egg white lysozyme, cytochrome c, the B3 domain of protein G (GB3), and ubiquitin were acquired at the BIOCAT and BESSRC beamlines at the Advanced Photon Source synchrotron and fitted up to q values of ∼1 Å^−1. We limit the fitting of our SAXS data to q < 1 Å^−1 because, on one hand, scattering data above 1 Å^−1 become increasingly similar for different proteins[19] and, on the other, the ability to accurately model such data is hampered by coordinate uncertainties, macromolecular dynamics, inhomogeneity of the surface solvent distribution, the effects of inelastic (Compton) scattering, and the limited accuracy of the commonly used neutral-atom form factors. Improvements in the data fit quality (Figure 1; Table 1; SI) indicate that the AXES program yields χ decreases of 10–50% over Crysol analysis, also for larger systems.
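One plausible reading of the discrepancy described above can be sketched in Python. The sketch assumes a σ_expt-weighted sum of squared residuals plus a quadratic regularizer R(α) = ((α − α_0)/σ_α)²; the exact published expression may differ in normalization. The function names are hypothetical. Only the scale A is optimized here, in closed form, whereas AXES minimizes over α, c, A, and δρ with a Powell search.

```python
import math

def best_scale(i_pred, i_expt, sigma):
    """Weighted least-squares scale A minimizing sum(((A*p - e)/s)^2)."""
    numerator = sum(p * e / s ** 2 for p, e, s in zip(i_pred, i_expt, sigma))
    denominator = sum(p * p / s ** 2 for p, s in zip(i_pred, sigma))
    return numerator / denominator

def discrepancy(i_pred, i_expt, sigma, scale_a, alpha, alpha_0, sigma_alpha=1e-2):
    """Assumed form: weighted misfit plus a quadratic penalty keeping alpha near alpha_0."""
    misfit = sum(((scale_a * p - e) / s) ** 2
                 for p, e, s in zip(i_pred, i_expt, sigma))
    regularizer = ((alpha - alpha_0) / sigma_alpha) ** 2
    return misfit + regularizer

def chi(i_pred, i_expt, sigma, scale_a):
    """Root-mean-square weighted residual (an assumed definition of chi)."""
    n = len(i_expt)
    total = sum(((scale_a * p - e) / s) ** 2
                for p, e, s in zip(i_pred, i_expt, sigma))
    return math.sqrt(total / n)

# Synthetic example: the "experiment" is exactly twice the prediction.
i_pred = [1.0, 2.0, 3.0]
i_expt = [2.0, 4.0, 6.0]
sigma = [1.0, 1.0, 1.0]

a_fit = best_scale(i_pred, i_expt, sigma)
d_on_target = discrepancy(i_pred, i_expt, sigma, a_fit, alpha=1.00, alpha_0=1.00)
d_off_target = discrepancy(i_pred, i_expt, sigma, a_fit, alpha=1.01, alpha_0=1.00)
```

In a full fit, i_expt would itself depend on α and c, so the regularizer is what prevents α from drifting freely to absorb modeling errors.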
When normalizing the fitting error to the very low statistical error associated with the high photon counts obtained for our synchrotron measurements, the residual error in the fit becomes dominated by small systematic errors resulting from fluctuations in temperature, beam position, transmission, and beam path length, as well as by the imperfections in the data modeling noted above, resulting in χ > 1. Using the same input data and number of adjustable fitting parameters, the lower χ values obtained with AXES fitting compared to Crysol reflect smaller imperfections in the data modeling. As currently implemented, AXES is more than an order of magnitude slower than Crysol due to the need to average the scattering amplitudes involving the displaced and surface solvent over ca. 20 independent configurations. Applications of AXES-like methodology for direct inclusion in structure refinement programs[5–7,20] will require a significant speedup; several possible avenues for such speedup are currently under development.
Figure 1

Comparison of experimental (black) with predicted (red) SAXS data generated by (A) standard Crysol and (B) AXES fitting. From top to bottom, data sets correspond to GB3, cytochrome c, lysozyme, and ubiquitin (PDB entries 1IGD, 1CRC, 193L, and 1D3Z). Data sets are arbitrarily offset vertically for visual purposes.

Table 1

Fitting Statistics (χ Values) Obtained with Crysol and AXES for Four Proteins

Program    Lysozyme    Cytochrome c    GB3     Ubiquitin
Crysol     1.19        1.93            4.70    3.60
AXES       0.98        0.90            1.76    3.05

More important than the drop in χ statistics afforded by AXES is the question of whether the program can discriminate well against poor models on the basis of SAXS data. For this purpose, we fit the experimental GB3 data to 2000 models generated de novo by the program Rosetta.[21] All Rosetta-generated structures are by their very nature quite compact and have comparable radii of gyration (Rg = 11.1 ± 0.4 Å), but many have high Rosetta energies indicative of incorrect folds, deviating by 5 Å or more from the X-ray reference structure (PDB entry 1IGD).[15] Fits of the SAXS data by the standard approach in many cases yield χ values that are much lower, by up to 70%, for poor models (i.e., high rmsd to 1IGD) than for the X-ray reference structure (Figure 2A), indicative of overfitting. In contrast, AXES does not yield significantly better fits for any of the poor Rosetta models (Figure 2B). At the same time, the AXES results illustrate that, for a subset of the poor structures, SAXS data alone cannot discriminate these from the reference structure. When restricting ourselves to models that all have the correct fold, as generated by chemical-shift-guided CS-Rosetta[22] (blue dots in Figure 2), AXES correctly assigns higher relative χ values (1.6 ± 0.5) to the CS-Rosetta models than to the experimental structure, whereas the inverse applies for the standard SAXS fitting procedure (χ = 0.93 ± 0.18; Figure 2A).
Figure 2

Normalized χ values when fitting 2000 GB3 models, generated by Rosetta (red) or CS-Rosetta (blue) modeling, to experimental SAXS data using (A) standard Crysol and (B) AXES fits. The horizontal axis corresponds to the backbone Cα rmsd between the model and the X-ray structure (PDB entry 1IGD); χ values are normalized relative to the value χ_1IGD obtained when fitting the X-ray structure (horizontal black line).

An important feature of AXES is its ability to directly fit structural ensembles. Remarkably, fits to the previously extensively studied dynamic ensemble representations of ubiquitin[23,24] yield lower χ values when these entire ensembles are fitted simultaneously than when each member of the ensemble is fitted separately and the resulting χ values are averaged (χ_ensemble = 5.06 and 4.98 for PDB entries 1XQQ[23] and 2K39,[24] respectively, vs ⟨χ⟩ = 5.36 for 1XQQ and ⟨χ⟩ = 6.01 for 2K39; SI), despite far fewer adjustable parameters in the fitting procedure (4 for the ensemble fit; 4N for an N-member ensemble). Even though the AXES fit to the static, lowest energy NMR structure (1D3Z;[25] χ = 3.05) suggests that this model is a better representation of the average ubiquitin structure in solution, the fact that fits to the entire 1XQQ and 2K39 ensembles are better than those to their individual members indicates that these ensembles correctly capture dynamic processes in the protein. The ability to evaluate such ensemble fits is becoming increasingly important as experimental structural biology shifts from an average-model view of macromolecular structure to more realistic multistate representations.
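The difference between fitting an ensemble as a whole and averaging the χ values of its members can be sketched as follows. This is an illustration under simplifying assumptions, not the AXES procedure: members are weighted equally, only the overall scale is fitted, χ is taken as the root-mean-square weighted residual, and all numbers are synthetic toy values rather than ubiquitin data.

```python
import math

def best_scale(i_pred, i_expt, sigma):
    """Weighted least-squares scale between predicted and measured curves."""
    numerator = sum(p * e / s ** 2 for p, e, s in zip(i_pred, i_expt, sigma))
    denominator = sum(p * p / s ** 2 for p, s in zip(i_pred, sigma))
    return numerator / denominator

def chi(i_pred, i_expt, sigma):
    """Root-mean-square weighted residual after fitting the overall scale."""
    a = best_scale(i_pred, i_expt, sigma)
    total = sum(((a * p - e) / s) ** 2 for p, e, s in zip(i_pred, i_expt, sigma))
    return math.sqrt(total / len(i_expt))

def ensemble_average(member_curves):
    """Equal-weight average of the predicted intensities at each q point."""
    return [sum(column) / len(column) for column in zip(*member_curves)]

# Two hypothetical ensemble members whose average matches the toy data.
members = [[10.0, 4.0, 1.0], [10.0, 2.0, 3.0]]
i_expt = [10.0, 3.0, 2.0]
sigma = [0.5, 0.5, 0.5]

chi_ensemble = chi(ensemble_average(members), i_expt, sigma)
chi_members = [chi(m, i_expt, sigma) for m in members]
mean_member_chi = sum(chi_members) / len(chi_members)
```

Here the ensemble-averaged curve fits the toy data exactly even though neither member does, which is the qualitative pattern the ensemble comparison in the text looks for.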
References (10 of 22 listed)

1.  Is the first hydration shell of lysozyme of higher density than bulk water?

Authors:  Franci Merzel; Jeremy C Smith
Journal:  Proc Natl Acad Sci U S A       Date:  2002-04-16       Impact factor: 11.205

2.  Refinement of multidomain protein structures by combination of solution small-angle X-ray scattering and NMR data.

Authors:  Alexander Grishaev; Justin Wu; Jill Trewhella; Ad Bax
Journal:  J Am Chem Soc       Date:  2005-11-30       Impact factor: 15.419

3.  Toward high-resolution de novo structure prediction for small proteins.

Authors:  Philip Bradley; Kira M S Misura; David Baker
Journal:  Science       Date:  2005-09-16       Impact factor: 47.728

4.  A physical picture of atomic motions within the Dickerson DNA dodecamer in solution derived from joint ensemble refinement against NMR and large-angle X-ray scattering data.

Authors:  Charles D Schwieters; G Marius Clore
Journal:  Biochemistry       Date:  2007-02-06       Impact factor: 3.162

5.  Consistent blind protein structure generation from NMR chemical shift data.

Authors:  Yang Shen; Oliver Lange; Frank Delaglio; Paolo Rossi; James M Aramini; Gaohua Liu; Alexander Eletsky; Yibing Wu; Kiran K Singarapu; Alexander Lemak; Alexandr Ignatchenko; Cheryl H Arrowsmith; Thomas Szyperski; Gaetano T Montelione; David Baker; Ad Bax
Journal:  Proc Natl Acad Sci U S A       Date:  2008-03-07       Impact factor: 11.205

6.  Global molecular structure and interfaces: refining an RNA:RNA complex structure using solution X-ray scattering data.

Authors:  Xiaobing Zuo; Jingbu Wang; Trenton R Foster; Charles D Schwieters; David M Tiede; Samuel E Butcher; Yun-Xing Wang
Journal:  J Am Chem Soc       Date:  2008-02-27       Impact factor: 15.419

7.  Structural characterization of flexible proteins using small-angle X-ray scattering.

Authors:  Pau Bernadó; Efstratios Mylonas; Maxim V Petoukhov; Martin Blackledge; Dmitri I Svergun
Journal:  J Am Chem Soc       Date:  2007-04-06       Impact factor: 15.419

8.  A rapid coarse residue-based computational method for x-ray solution scattering characterization of protein folds and multiple conformational states of large protein complexes.

Authors:  Sichun Yang; Sanghyun Park; Lee Makowski; Benoît Roux
Journal:  Biophys J       Date:  2009-06-03       Impact factor: 4.033

9.  Mixing and matching detergents for membrane protein NMR structure determination.

Authors:  Linda Columbus; Jan Lipfert; Kalyani Jambunathan; Daniel A Fox; Adelene Y L Sim; Sebastian Doniach; Scott A Lesley
Journal:  J Am Chem Soc       Date:  2009-06-03       Impact factor: 15.419

10.  The low ionic strength crystal structure of horse cytochrome c at 2.1 A resolution and comparison with its high ionic strength counterpart.

Authors:  R Sanishvili; K W Volz; E M Westbrook; E Margoliash
Journal:  Structure       Date:  1995-07-15       Impact factor: 5.006
