Abstract
Serial crystallography, using either femtosecond X-ray pulses from free-electron laser sources or short synchrotron-radiation exposures, has the potential to reveal metalloprotein structural details while minimizing damage processes. However, deriving a self-consistent set of Bragg intensities from numerous still-crystal exposures remains a difficult problem, with optimal protocols likely to be quite different from those well established for rotation photography. Here several data processing issues unique to serial crystallography are examined. It is found that the limiting resolution differs for each shot, an effect that is likely to be due to both the sample heterogeneity and pulse-to-pulse variation in experimental conditions. Shots with lower resolution limits produce lower-quality models for predicting Bragg spot positions during the integration step. Also, still shots by their nature record only partial measurements of the Bragg intensity. An approximate model that corrects to the full-spot equivalent (with the simplifying assumption that the X-rays are monochromatic) brings the distribution of intensities closer to that expected from an ideal crystal, and improves the sharpness of anomalous difference Fourier peaks indicating metal positions.
Keywords: X-ray free-electron laser; mosaicity; partiality; postrefinement; serial femtosecond crystallography
Year: 2015 PMID: 25723925 PMCID: PMC4344359 DOI: 10.1107/S1600577514028203
Source DB: PubMed Journal: J Synchrotron Radiat ISSN: 0909-0495 Impact factor: 2.616
Figure 1. Geometric definition of partiality, accounting for the mosaic structure of the crystal. For a still shot taken with monochromatic X-rays of wavelength λ, a reciprocal lattice point (blue ball centered on Q) partially intersects the Ewald sphere. The intersection area, which is actually a spherical cap, is approximated by a circle whose radius is determined by the distance from Q to the Ewald sphere and by the resolution-dependent radius of the reciprocal lattice point, as described in the text. Partiality is defined as the intersection area-to-ball volume ratio for lattice point Q, normalized by the intersection area-to-ball volume ratio of the F000 spot at reciprocal space origin O.
Figure 2. Partiality estimates for Bragg spots integrated on a single thermolysin image, plotted as a function of the ratio.
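The geometric definition in Figure 1 can be illustrated with a short sketch. The symbol names `r_h`, `r_s` and `r_s0` are labels chosen for this example, and the circle radius sqrt(r_s² − r_h²) assumes a flat cut through the ball at distance `r_h` from its centre; neither is taken from the paper's own equations, so this is an approximation under stated assumptions rather than the published model.

```python
import math


def partiality(r_h: float, r_s: float, r_s0: float) -> float:
    """Approximate partiality of a reciprocal lattice point on a still shot.

    r_h  : distance from the ball centre Q to the Ewald sphere (assumed name)
    r_s  : radius of the ball representing lattice point Q (assumed name)
    r_s0 : radius of the ball at the reciprocal-space origin O (assumed name)
    """
    if r_s <= 0 or r_s0 <= 0:
        raise ValueError("radii must be positive")
    if abs(r_h) >= r_s:
        return 0.0  # the ball does not intersect the Ewald sphere

    # Intersection approximated by a flat circle cut through the ball (assumption).
    area = math.pi * (r_s**2 - r_h**2)
    volume = 4.0 / 3.0 * math.pi * r_s**3

    # Same ratio for the origin spot, whose centre lies on the Ewald sphere (r_h = 0).
    area0 = math.pi * r_s0**2
    volume0 = 4.0 / 3.0 * math.pi * r_s0**3

    return (area / volume) / (area0 / volume0)
```

With equal radii the expression reduces to 1 − (r_h/r_s)², so under these assumptions the partiality depends only on the ratio of the two distances.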
Table 1. Processing outcome on XFEL still shots from thermolysin
| Protocol number | 4 | 6 | 6F | 7POST | 7F,POST |
|---|---|---|---|---|---|
| Protocol choice | | | | | |
| Model restraints | Spot positions only | Spot positions + angular deviations | Spot positions + angular deviations | Spot positions + angular deviations | Spot positions + angular deviations |
| Postrefinement and partiality correction | No | No | No | Yes | Yes |
| Each-lattice resolution bin cutoff [ | 0.5 | 0.5 | None | 0.5 | None |
| Indexing results | | | | | |
| # Total hits with >15 Bragg spots | 14041 | 14041 | 14041 | 14041 | 14041 |
| # Integrated and merged lattices | 12097 | 12550 | 13756 | 12551 | 13733 |
| Model accuracy | | | | | |
| Half-width mosaicity | 0.292 | 0.168 | 0.213 | 0.168 | 0.213 |
| Mosaic block size | 4320 | 4220 | 4370 | 4220 | 4370 |
| Integrated data results | | | | | |
| Individual image CC | 32.0% | 40.2% | 40.1% | 40.2% | 40.1% |
| No. of measurements, 51–2.2 Å | 6605566 | 5036076 | 11905131 | 4290566 | 9915864 |
| Positive measurements, 51–2.2 Å | 4297065 | 3626262 | 7249271 | 3187835 | 6201772 |
| Negative measurements | 35% | 28% | 39% | 26% | 37% |
| Structure factor merging | | | | | |
| Merging resolution range (Å) | 51–2.2 (2.28–2.2) | 51–2.2 (2.28–2.2) | 51–2.2 (2.28–2.2) | 51–2.2 (2.28–2.2) | 51–2.2 (2.28–2.2) |
| Unique Miller indices | 17198 (1405) | 17297 (1488) | 17513 (1700) | 17227 (1425) | 17513 (1700) |
| Multiplicity of observation | 250 (3.0) | 210 (3.6) | 414 (53) | 185 (3.2) | 354 (44) |
| Completeness | 98.2% (82.7%) | 98.8% (87.6%) | 100% (100%) | 98.4% (83.9%) | 100% (100%) |
| | 36.1 (2.3) | 56.7 (3.2) | 55.9 (4.2) | 74.9 (3.5) | 72.7 (4.0) |
| CC1/2 correlation of semi-datasets | 72.2% (4.1%) | 87.2% (42.7%) | 92.1% (14.6%) | 90.2% (34.0%) | 92.8% (16.0%) |
| | 33.9% (95.2%) | 32.0% (89.7%) | 26.7% (69.6%) | 29.3% (89.7%) | 26.7% (78.0%) |
| CCiso | 86.8% (18.1%) | 94.7% (40.0%) | 95.1% (23.3%) | 94.8% (42.1%) | 95.2% (30.1%) |
| | 23.6% (79.0%) | 18.0% (73.8%) | 17.7% (63.8%) | 23.4% (76.1%) | 22.5% (69.3%) |
| Structure factor quality tests | | | | | |
| Wilson | 12.2 | 17.2 | 18.3 | 17.7 | 20.6 |
| | 1.293 | 1.518 | 1.471 | 1.697 | 1.628 |
| | 0.302 | 0.376 | 0.366 | 0.425 | 0.412 |
| | 0.137 | 0.202 | 0.193 | 0.252 | 0.238 |
| | 0.201 | 0.121 | 0.133 | 0.071 | 0.082 |
| | 0.271 | 0.198 | 0.213 | 0.112 | 0.147 |
| Quality of refined structure | | | | | |
| Refinement resolution range (Å) | 51–2.2 (2.34–2.2) | 51–2.2 (2.34–2.2) | 51–2.2 (2.34–2.2) | 51–2.2 (2.34–2.2) | 51–2.2 (2.34–2.2) |
| | 24.5% (35.2%) | 20.8% (33.5%) | 21.2% (32.0%) | 20.6% (36.3%) | 19.5% (33.0%) |
| | 29.6% (39.9%) | 26.3% (44.0%) | 26.0% (39.0%) | 24.1% (45.8%) | 24.3% (42.0%) |
| Zn2+ anomalous-difference peak height | 2.9 | 5.8 | 7.2 | 7.4 | 8.7 |
| MolProbity clashscore (Chen | 8.41 | 2.16 | 3.23 | 1.08 | 0.86 |
| Protein atom | 15.6 | 18.0 | 20.4 | 19.1 | 21.3 |
| Solvent atom | 23.3 | 28.8 | 29.7 | 27.9 | 30.5 |
| Number of autobuilt water molecules | 311 | 295 | 248 | 236 | 232 |
| Overall/local map C.C. to | 77.0%/81.3% | 81.3%/84.4% | 82.0%/85.0% | 82.5%/85.2% | 83.2%/85.7% |
| Automated model building after MR-SAD | | | | | |
| No. of mainchain/sidechain (of total 316) | 310/299 | 309/309 | 309/309 | 312/305 | 312/306 |
| | 24.0%/28.8% | 23.7%/28.2% | 22.6%/26.2% | 23.0%/26.5% | 22.6%/26.2% |
For the thermolysin data analysis, candidate Bragg spots were chosen with a minimum spot area of 2 square pixels.
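The "CC1/2 correlation of semi-datasets" entry in the table compares merged intensities from two halves of the data. The sketch below shows one way such a number can be computed, assuming the observations of each Miller index are split alternately into two halves and averaged. The function name `cc_half`, the input layout, and the alternating split are all assumptions made for illustration; the split actually used for the table is not described in this record.

```python
import math
from statistics import mean


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def cc_half(observations):
    """observations: dict mapping a Miller index (h, k, l) to a list of intensities.

    Hypothetical helper: alternate measurements go to semi-dataset 1 and 2.
    """
    half1, half2 = [], []
    for values in observations.values():
        a, b = values[0::2], values[1::2]
        if a and b:  # need at least one measurement in each half
            half1.append(mean(a))
            half2.append(mean(b))
    return pearson(half1, half2)
```

Miller indices measured only once contribute to neither half, which is one reason a low multiplicity in the outer shell can make the correlation unstable.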
Figure 3. Resolution limits and positional accuracy of the thermolysin integration model. (a) Limiting resolution for 1000 randomly selected shots from runs 21–27 of the L498 experiment, collected at a sample-to-detector distance of 171.0 mm, and thus restricted to 2.6 Å at the detector edge, and 2.05 Å in the detector corners. Data for the strongest-diffracting samples are therefore limited by a sharp cutoff due to detector geometry rather than the intrinsic sample diffraction. Horizontal axis: limits based on bright spots picked by a spotfinding algorithm (Zhang et al., 2006); blue bars represent a histogram of resolution limits determined with ‘method 2’ from that paper. Vertical axis: limits based on a Wilson plot of the integrated intensities. (b) Displacement (in pixels) between Bragg spot positions predicted by the lattice model used for integration, and the center of mass positions actually measured for bright spotfinder-picked spots. Blue traces: displacement for 20 randomly selected shots, with bright spots from each shot grouped into resolution bins; black dots identify the highest-resolution bin for each individual shot. Red curve: aggregate displacement over the 1000 images analyzed in panel (a).
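The detector-geometry cutoff mentioned in panel (a) follows from Bragg's law applied to a flat detector. The sketch below computes the resolution recorded at a given radius, assuming a detector normal to the beam and centred on it. Only the 171.0 mm distance comes from the caption; the wavelength and detector half-width are hypothetical placeholders, so the printed values are not expected to reproduce the 2.6 Å and 2.05 Å limits quoted above.

```python
import math


def resolution_at_radius(radius_mm: float, distance_mm: float, wavelength_A: float) -> float:
    """Bragg spacing d (in Å) recorded at a radial distance from the beam centre.

    Assumes a flat detector perpendicular to the incident beam.
    """
    two_theta = math.atan2(radius_mm, distance_mm)  # scattering angle 2θ
    return wavelength_A / (2.0 * math.sin(two_theta / 2.0))  # Bragg's law


distance = 171.0    # mm, from the figure caption
half_width = 90.0   # mm, hypothetical detector half-width
wavelength = 1.3    # Å, hypothetical wavelength

edge = resolution_at_radius(half_width, distance, wavelength)
# On a square detector the corner lies sqrt(2) times farther from the centre.
corner = resolution_at_radius(half_width * math.sqrt(2.0), distance, wavelength)
print(f"edge: {edge:.2f} Å, corner: {corner:.2f} Å")
```

Because the corner is farther from the beam centre than the edge, it always records a finer spacing, which is why the caption quotes two separate limits.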
Figure 4. Bragg spot predictions are more accurate when the orientational model is refined against Ewald sphere distance. Two protocols are evaluated: (a) refinement of indexed spots against observed positions only, and (b) additional refinement of the model against the angular deviation of the reciprocal lattice point from the Ewald sphere, corresponding to protocols 4 and 6 of Sauter et al. (2014), respectively. Plots represent a random sampling of processing results for simulated PSI data, in which the modeled orientation can be compared against the known true orientation from the simulation. Horizontal axis: residual misorientation angle R after removal of the small misorientation R along the axis parallel to the beam direction (r.m.s. R misorientation is 0.017° for both panels). Vertical axis: fraction of Bragg spots predicted by the model but not present in the simulated data (blue), and fraction of Bragg spots in the simulation that are not modeled (red).
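The two fractions on the vertical axis of Figure 4 amount to a set comparison between predicted and simulated spots. A minimal sketch, assuming spots are identified by their Miller indices and that each fraction is normalized by the size of its own set (the caption does not state the denominators, and `prediction_errors` is a hypothetical helper):

```python
def prediction_errors(predicted, simulated):
    """Return (fraction predicted but absent, fraction present but not modeled).

    predicted, simulated: iterables of Miller indices (h, k, l).
    """
    predicted, simulated = set(predicted), set(simulated)
    # Spots the model predicts that the simulation does not contain.
    false_pos = len(predicted - simulated) / len(predicted) if predicted else 0.0
    # Spots in the simulation that the model fails to predict.
    false_neg = len(simulated - predicted) / len(simulated) if simulated else 0.0
    return false_pos, false_neg
```

For example, four predicted spots of which two are absent from a three-spot simulation give a first fraction of 0.5, and one unmodeled simulated spot gives a second fraction of 1/3.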
Figure 5. Data quality statistics for the merged structure factor intensities from thermolysin. (a, b, c) Cumulative distribution function N(L) of the local statistic L = (I₁ − I₂)/(I₁ + I₂), where I₁ and I₂ are unrelated intensities (Padilla & Yeates, 2003). (d, e, f) Cumulative distribution function N(z), where z = I/〈I〉. Identical data were processed with the protocols listed in Table 1: (a, d) protocol 4, lattice model is not restrained against proximity to the Ewald sphere; (b, e) protocols 6 and 6F, proximity restraints are applied, with and without a separate resolution cutoff for each lattice; and (c, f) protocols 7POST and 7F,POST, which are the same as protocols 6 and 6F except that crystal orientation is postrefined to maximize agreement with a set of reference intensities as described in the text. Agreement between the merged intensities (thick lines) and the theoretical distribution (thin lines) demonstrates that such statistics offer useful metrics for evaluating different processing protocols, with the postrefined model giving the best agreement with theoretical expectation.
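The statistics in Figure 5 can be tried out on synthetic data. The sketch below draws ideal acentric intensities from an exponential (Wilson) distribution and evaluates the L statistic of Padilla & Yeates and the normalized intensity z = I/〈I〉. Pairing consecutive random values stands in for the local pairing of reflections in reciprocal space, which is a simplification made here; the helper names are hypothetical.

```python
import random


def l_statistic(i1: float, i2: float) -> float:
    """L = (I1 - I2) / (I1 + I2) for two unrelated intensities."""
    return (i1 - i2) / (i1 + i2)


def empirical_cdf(values, x: float) -> float:
    """Fraction of values less than or equal to x."""
    return sum(v <= x for v in values) / len(values)


rng = random.Random(0)  # fixed seed so the run is repeatable
# Ideal acentric, untwinned intensities follow an exponential distribution.
intensities = [rng.expovariate(1.0) for _ in range(20000)]

mean_i = sum(intensities) / len(intensities)
z = [i / mean_i for i in intensities]

# Simplification: pair consecutive values instead of neighbouring reflections.
abs_l = [abs(l_statistic(a, b)) for a, b in zip(intensities[0::2], intensities[1::2])]

mean_abs_l = sum(abs_l) / len(abs_l)            # ideal value 1/2
mean_l2 = sum(v * v for v in abs_l) / len(abs_l)  # ideal value 1/3
print(f"<|L|> = {mean_abs_l:.3f}, <L^2> = {mean_l2:.3f}")
print(f"N(z=1) = {empirical_cdf(z, 1.0):.3f}")  # ideal acentric N(z) = 1 - exp(-z)
```

For ideal data the cumulative distribution of |L| is a straight line, N(|L|) = |L|, so departures from that line or from 1 − exp(−z) indicate intensities that deviate from the ideal-crystal expectation.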