Airlie J McCoy, Randy J Read.
Abstract
Developments in protein crystal structure determination by experimental phasing are reviewed, emphasizing the theoretical continuum between experimental phasing, density modification, model building and refinement. Traditional notions of the composition of the substructure and the best coefficients for map generation are discussed. Pitfalls such as determining the enantiomorph, identifying centrosymmetry (or pseudo-symmetry) in the substructure and crystal twinning are discussed in detail. An appendix introduces combined real-imaginary log-likelihood gradient map coefficients for SAD phasing and their use for substructure completion as implemented in the software Phaser. Supplementary material includes animated probabilistic Harker diagrams showing how maximum-likelihood-based phasing methods can be used to refine parameters in the case of SIR and MIR; it is hoped that these will be useful for those teaching best practice in experimental phasing methods.
Year: 2010 PMID: 20382999 PMCID: PMC2852310 DOI: 10.1107/S0907444910006335
Source DB: PubMed Journal: Acta Crystallogr D Biol Crystallogr ISSN: 0907-4449
Figure 1. Harker diagrams. (a) SIR Harker diagram where H₁ is the calculated substructure structure factor for the single derivative. The black and red circles have radii given by the observed structure-factor amplitudes for the native and the derivative, respectively. (b) SAD Harker diagram where H⁺ and H⁻* are the calculated substructure structure factors and H⁺ − H⁻* is the expected vector difference between the true structure factors F⁺ and F⁻*. (c) MIR Harker diagram where H₁ and H₂ are the calculated substructure structure factors for the first and second derivatives, respectively. The black, red and blue circles have radii given by the observed structure-factor amplitude for the native, the first derivative and the second derivative, respectively. In the absence of measurement errors and errors in the substructure, the red and blue circles would intersect at one point on the black circle.
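The SIR construction in panel (a) can be sketched numerically. The following is a minimal illustration, not code from the paper or from Phaser: it assumes error-free amplitudes and the idealized relation F_PH = F_P + H₁, and it uses the law of cosines to find the two native phases at which the circles intersect. The function name `sir_phase_choices` and its arguments are hypothetical.

```python
import cmath
import math


def sir_phase_choices(f_native, f_deriv, h):
    """Return the two native phases (radians) allowed by an ideal SIR Harker diagram.

    f_native, f_deriv: observed amplitudes |F_P| and |F_PH| (assumed error-free).
    h: calculated substructure structure factor H1 as a complex number.
    Assumes F_PH = F_P + H1, so that
        |F_PH|^2 = |F_P|^2 + |H1|^2 + 2 |F_P| |H1| cos(phi_P - phi_H).
    Returns an empty list when the circles do not intersect.
    """
    h_amp = abs(h)
    if f_native == 0 or h_amp == 0:
        return []
    cos_delta = (f_deriv**2 - f_native**2 - h_amp**2) / (2.0 * f_native * h_amp)
    if abs(cos_delta) > 1.0:
        return []  # no intersection: errors in data or substructure
    delta = math.acos(cos_delta)
    phi_h = cmath.phase(h)
    # The twofold ambiguity of SIR: the solutions are symmetric about the phase of H1.
    return [phi_h + delta, phi_h - delta]
```

With real data the circles rarely intersect exactly, which is why the probabilistic treatment shown in the later figures replaces the sharp circles with distributions.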
Figure 2. SIR probabilistic Harker diagram (notation as in Fig. 1). (a) Contour plot showing components of the PDF. The component arising from the native is shown in black contours and the component arising from the derivative is shown in red contours centred on H₁ (the point at the base of the red arrow). The dashed black and red circles indicate the measured values of the observed structure-factor amplitudes for the native and the derivative, respectively. (b) The PDF [the product of the two components in (a)] is shown in dark red contours. The ‘best F’ F_B is shown as a black arrow. (c) Three-dimensional plot of the value of the PDF. The likelihood is the volume under the PDF surface. (d) Plot of the likelihood as a function of the occupancy of the substructure (increasing amplitude of H₁). The maximum likelihood is marked with a dot. All other panels in this figure show the values of the parameters at the point of maximum likelihood. (e) The PDF for the phases of the true structure factor F is shown in red and the PDF reconstructed from the four Hendrickson–Lattman (Hendrickson & Lattman, 1970) coefficients (HL) is shown as a black curve. (f) Bar chart showing the relative values of the four HL coefficients A, B, C and D.
Figure 3. SAD probabilistic Harker diagram (adapted from McCoy, 2004 with notation as in Fig. 1). (a) Contour plot showing components of the PDF. The component P(F⁻*|H⁻*) is shown in blue contours centred on H⁻* (blue arrow) and the anomalous component P(F⁺_obs|F⁻*, H⁺, H⁻*) is shown in red contours centred on H⁺ − H⁻*, the expected vector difference between F⁺ and F⁻*. The black and red circles indicate the observed structure-factor amplitudes for F⁻ and F⁺, respectively. (b) The product of the two components in (a) is shown in magenta contours. (c) Three-dimensional plot of the value of the PDF under the black circle in (b). The likelihood is given as the integral of the height of the surface under the black circle. (d) Plot of the likelihood as a function of the occupancy of the substructure (increasing value of |H⁻*| and |H⁺ − H⁻*|). The maximum likelihood is marked with a dot. All other panels in this figure show the values of the parameters at the point of maximum likelihood. (e) The PDF for the phases of F⁻* is shown in magenta and the PDF reconstructed from the four HL coefficients is shown as a black curve. (f) Bar chart showing the relative values of the four HL coefficients A, B, C and D.
Figure 4. MIR probabilistic Harker diagram (notation as in Fig. 1). (a) Contour plot showing components of the PDF. The component arising from the native is shown in black contours, the component arising from the first derivative is shown in red contours centred on H₁ (the point at the base of the red arrow) and the component arising from the second derivative is shown in blue contours centred on H₂ (the point at the tip of the blue arrow). The dashed black, red and blue circles indicate the measured values of the observed structure-factor amplitudes for the native, first and second derivatives, respectively. (b) The PDF [the product of the three components in (a)] is shown in dark magenta contours. The ‘best F’ F_B is shown as a black arrow. (c) Three-dimensional plot of the value of the PDF. The likelihood is given as the volume under the surface. (d) Plot of the likelihood as a function of the occupancy of the substructure for the second derivative (increasing amplitude of H₂). The maximum likelihood is marked with a dot. All other panels in this figure show the values of the parameters at the point of maximum likelihood. (e) The PDF for the phases of the true structure factor F is shown in dark magenta and the PDF reconstructed from the four HL coefficients is shown as a black curve. (f) Bar chart showing the relative values of the four Hendrickson–Lattman coefficients A, B, C and D.
Figure 5. The difference between the ‘centroid’ and ‘most probable’ structure factors. (a) Cut the centre out of a paper plate. (b) Balance the disc on a pen. The centre of mass is at the centre. (c) Now clip two unequal weights to the edge of the plate. (d) The balancing point is between the two weights (analogous to the ‘centroid’ structure factor) and not on the heaviest weight (analogous to the ‘most probable’ structure factor).
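The paper-plate analogy can be reproduced numerically. The sketch below is illustrative only and is not taken from the paper: it assumes the phase distribution has the standard Hendrickson–Lattman form P(φ) ∝ exp(A cos φ + B sin φ + C cos 2φ + D sin 2φ), samples it on a regular grid, and compares the centroid of the unit phase vectors with the most probable phase. The function name `centroid_and_mode` and the grid size are hypothetical choices.

```python
import cmath
import math


def centroid_and_mode(a, b, c, d, n=3600):
    """Centroid and most probable phase of a Hendrickson-Lattman phase distribution.

    Returns (centroid, mode_phase):
      centroid   - complex number; its modulus acts as a figure of merit (0..1)
                   and its argument is the 'centroid' phase.
      mode_phase - grid phase (radians) at which the distribution is highest.
    """
    phis = [2.0 * math.pi * k / n for k in range(n)]
    log_p = [
        a * math.cos(p) + b * math.sin(p) + c * math.cos(2 * p) + d * math.sin(2 * p)
        for p in phis
    ]
    peak = max(log_p)
    # Subtract the peak before exponentiating to avoid overflow for large coefficients.
    weights = [math.exp(v - peak) for v in log_p]
    total = sum(weights)
    centroid = sum(w * cmath.exp(1j * p) for w, p in zip(weights, phis)) / total
    mode_phase = phis[log_p.index(peak)]
    return centroid, mode_phase
```

For a bimodal distribution with two unequal peaks (for example A = 1, C = 2, B = D = 0), the mode sits on the taller peak, while the centroid lies inside the unit circle between the two peaks, like the balancing point between the two weights on the plate.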
Figure 6. Phasing in both hands. The anomalous scattering component is always advanced. For example, data collected at a wavelength of 1.7 Å from an iron-containing protein will have a significant anomalous signal from both the Fe atoms and the S atoms in methionine and cysteine. Non-anomalous contributions to the scattering come from C, N and O atoms. The total structure factor has an anomalous component that is not perpendicular to the normal scattering component, leading to an anomalous difference in the structure factors for F⁺ and F⁻. Only in one hand will the observed direction of the anomalous difference match the calculated direction of the difference (|F⁺| > |F⁻|).
Changing the hand of substructure sites
For nonchiral space groups the other hand of the heavy-atom sites is found by the operation (x, y, z) → (−x, −y, −z), except for three space groups (I4₁, I4₁22 and I4₁32) where there is also a change of origin. For the chiral space groups the change of hand of the heavy-atom sites with the operation (x, y, z) → (−x, −y, −z) is accompanied by a change of space group to the other chiral form.
| System | Chiral | Nonchiral |
|---|---|---|
| Triclinic | ||
| Monoclinic | ||
| Orthorhombic | ||
| Tetragonal | ||
| Trigonal | ||
| Hexagonal | ||
| Cubic | ||
For I4₁ the origin is shifted to (½, 0, 0).
For I4₁22 the origin is shifted to (½, 0, ¼).
For I4₁32 the origin is shifted to (¼, ¼, ¼).
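The change of hand described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the procedure of any particular program: the origin shift t is assumed to be applied as (x, y, z) → t − (x, y, z), with the result wrapped into the unit cell, and the caller is assumed to handle separately the switch to the other chiral space group where one is needed. The names `ORIGIN_SHIFT` and `invert_sites` are hypothetical.

```python
from fractions import Fraction as Fr

# Origin shifts from the table footnotes; all other space groups use (0, 0, 0).
ORIGIN_SHIFT = {
    "I41": (Fr(1, 2), Fr(0), Fr(0)),
    "I4122": (Fr(1, 2), Fr(0), Fr(1, 4)),
    "I4132": (Fr(1, 4), Fr(1, 4), Fr(1, 4)),
}


def invert_sites(sites, space_group):
    """Return the other hand of fractional heavy-atom sites.

    sites: iterable of (x, y, z) fractional coordinates.
    space_group: name without subscripts, e.g. "P212121" or "I4132".
    Assumption: the origin shift t is applied as t - (x, y, z), and the
    result is wrapped into the range [0, 1).
    """
    shift = ORIGIN_SHIFT.get(space_group, (0, 0, 0))
    return [tuple((t - x) % 1 for x, t in zip(site, shift)) for site in sites]
```

Applying `invert_sites` twice with the same space group returns the original coordinates modulo lattice translations, which is a convenient sanity check.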