Chun Hong Yoon1, Hasan DeMirci2,3,4, Raymond G Sierra1, E Han Dao2,3, Radman Ahmadi1, Fulya Aksit2, Andrew L Aquila1, Alexander Batyuk1, Halilibrahim Ciftci1, Serge Guillet1, Matt J Hayes1, Brandon Hayes1, Thomas J Lane1,3, Meng Liang1, Ulf Lundström3, Jason E Koglin1, Paul Mgbam1, Yashas Rao1, Theodore Rendahl1, Evan Rodriguez1, Lindsey Zhang1, Soichi Wakatsuki1,3, Sébastien Boutet1, James M Holton4,5, Mark S Hunter1.
Abstract
We provide a detailed description of selenobiotinyl-streptavidin (Se-B SA) co-crystal datasets recorded using the Coherent X-ray Imaging (CXI) instrument at the Linac Coherent Light Source (LCLS) for selenium single-wavelength anomalous diffraction (Se-SAD) structure determination. Se-B SA was chosen as the model system because of the high affinity between biotin and streptavidin; the sulfur atom in the biotin molecule (C10H16N2O3S) is substituted with selenium. The dataset was collected at three different transmissions (100, 50, and 10%) using a serial sample chamber setup, which allows two sample chambers, a front chamber and a back chamber, to operate simultaneously. Diffraction patterns from Se-B SA were recorded to a resolution of 1.9 Å. The dataset is publicly available through the Coherent X-ray Imaging Data Bank (CXIDB) and also on LCLS compute nodes as a resource for research and algorithm development.
Year: 2017 PMID: 28440794 PMCID: PMC5404608 DOI: 10.1038/sdata.2017.55
Source DB: PubMed Journal: Sci Data ISSN: 2052-4463 Impact factor: 6.444
Figure 1. Phasing versus data volume.
The correlation to the final map (red) and the anomalous peak RMS (blue) are shown as a function of the data volume used for the Se-SAD data. Even using 100% of the data, only ~0.1% of the 200,000 attempts to phase the data were successful. A key to successful phasing was finding the non-crystallographic symmetry (NCS) operator, which occurred only when the entire data set was included. Adapted from Hunter et al.[2]
Figure 2. Overview of the selenium single-wavelength anomalous diffraction experiments.
(a) Data were collected simultaneously in two sample chambers using a serial SFX setup. (b) The X-rays are focused using only Be lenses, with the Kirkpatrick-Baez mirrors (KB1) moved out of the beam. The X-rays enter the first sample chamber (SC1) and scatter from the Se-B SA crystal. The unscattered X-rays exit through the central hole in the CSPAD detector and are then refocused for a second scattering experiment in the downstream sample chamber (SC3), followed by the diagnostic (Diag).
Description of the subset of recorded PVs that were used during the analysis of the Se-SAD data.
| PV name | Unit | Description |
| GDET:FEE1:241:ENRC | mJ | Pulse energy measurement at an upstream gas detector |
| GDET:FEE1:242:ENRC | mJ | Second pulse energy measurement at an upstream gas detector |
| GDET:FEE1:361:ENRC | mJ | Pulse energy measurement at a downstream gas detector |
| GDET:FEE1:362:ENRC | mJ | Second pulse energy measurement at a downstream gas detector |
| GDET:FEE1:363:ENRC | mJ | Duplicate of the 361 measurement with 10% dynamic range |
| GDET:FEE1:364:ENRC | mJ | Duplicate of the 362 measurement with 10% dynamic range |
| CXI:DS1:MMS:06:RBV | mm | Upstream detector stage readback value |
| CXI:DS2:MMS:06:RBV | mm | Downstream detector stage readback value |
| XRT:DIA:MMS:02:RBV | mm | 20 μm thick Si attenuator motor |
| XRT:DIA:MMS:03:RBV | mm | 40 μm thick Si attenuator motor |
| XRT:DIA:MMS:04:RBV | mm | 80 μm thick Si attenuator motor |
| XRT:DIA:MMS:11:RBV | mm | 160 μm thick Si attenuator motor |
| XRT:DIA:MMS:06:RBV | mm | 320 μm thick Si attenuator motor |
| XRT:DIA:MMS:07:RBV | mm | 640 μm thick Si attenuator motor |
| XRT:DIA:MMS:08:RBV | mm | 1,280 μm thick Si attenuator motor |
| XRT:DIA:MMS:09:RBV | mm | 2,560 μm thick Si attenuator motor |
| XRT:DIA:MMS:10:RBV | mm | 5,120 μm thick Si attenuator motor |
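The nine Si attenuators listed above form a binary-weighted series (20 μm, doubling up to 5,120 μm), so any total thickness that is a multiple of 20 μm up to 10,220 μm can be selected. The following is a minimal sketch of how a target transmission could map onto a filter combination under a simple Beer-Lambert model. The function names are hypothetical and not part of any LCLS software, and the attenuation length `att_len_um` is a placeholder that must be looked up for the actual photon energy; the value used in the example is not taken from the paper.

```python
import math

# Si attenuator thicknesses from the PV table, in micrometres (20 um * 2**i).
THICKNESSES_UM = [20, 40, 80, 160, 320, 640, 1280, 2560, 5120]


def transmission(thickness_um, att_len_um):
    """Beer-Lambert transmission through Si of the given total thickness."""
    return math.exp(-thickness_um / att_len_um)


def filters_for_transmission(target, att_len_um):
    """Return the filters whose summed thickness is closest to the ideal one."""
    if not 0.0 < target <= 1.0:
        raise ValueError("target transmission must be in (0, 1]")
    wanted_um = -att_len_um * math.log(target)
    # Number of 20 um steps, clamped to what nine binary filters can provide.
    steps = min(max(round(wanted_um / 20), 0), 2 ** len(THICKNESSES_UM) - 1)
    # Each set bit of `steps` selects one filter of the binary series.
    return [t for i, t in enumerate(THICKNESSES_UM) if (steps >> i) & 1]


if __name__ == "__main__":
    ATT_LEN_UM = 250.0  # hypothetical attenuation length, for illustration only
    for target in (1.0, 0.5, 0.1):
        chosen = filters_for_transmission(target, ATT_LEN_UM)
        print(target, chosen, transmission(sum(chosen), ATT_LEN_UM))
```

Because the thicknesses double at each step, the selection reduces to the binary representation of the required number of 20 μm steps, which is why no search over combinations is needed.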
Figure 3. Diffraction pattern with predicted Bragg peak positions.
The inset shows a zoomed-in subpanel beyond 2 Å resolution. Adapted from Hunter et al.[2]
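Number of diffraction patterns extracted from front and back chambers.
| Transmission | Patterns recorded (front/back) | Hits (front/back) | Indexed patterns (front/back) |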
| 100% | 2,520,580/1,174,470 | 324,803/369,691 | 125,755/88,289 |
| 50% | 1,026,168/1,109,319 | 239,442/377,793 | 115,993/158,865 |
| 10% | 1,041,249/730,055 | 111,444/144,620 | 43,188/27,104 |
| Total | 7,601,841 | 1,567,793 | 559,194 |
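The totals in the last row can be cross-checked against the per-transmission entries, and the same counts give overall hit and indexing rates. The sketch below assumes the three numeric columns are, in order, patterns recorded, hits, and indexed patterns, each given as front/back chamber pairs; the variable and function names are illustrative only.

```python
# Counts copied from the table above as (front, back) pairs.
# Column meaning (recorded / hits / indexed) is an assumption.
COUNTS = {
    "100%": {"recorded": (2_520_580, 1_174_470),
             "hits": (324_803, 369_691),
             "indexed": (125_755, 88_289)},
    "50%": {"recorded": (1_026_168, 1_109_319),
            "hits": (239_442, 377_793),
            "indexed": (115_993, 158_865)},
    "10%": {"recorded": (1_041_249, 730_055),
            "hits": (111_444, 144_620),
            "indexed": (43_188, 27_104)},
}

# Values from the "Total" row of the table.
REPORTED_TOTALS = {"recorded": 7_601_841, "hits": 1_567_793, "indexed": 559_194}


def column_totals(counts):
    """Sum front and back chambers over all transmissions, per column."""
    totals = {"recorded": 0, "hits": 0, "indexed": 0}
    for row in counts.values():
        for column, (front, back) in row.items():
            totals[column] += front + back
    return totals


def rates(totals):
    """Return (hit rate, indexing rate) as fractions."""
    return totals["hits"] / totals["recorded"], totals["indexed"] / totals["hits"]


if __name__ == "__main__":
    totals = column_totals(COUNTS)
    hit_rate, indexing_rate = rates(totals)
    print(totals == REPORTED_TOTALS)
    print(f"hit rate {hit_rate:.1%}, indexing rate {indexing_rate:.1%}")
```

Under that column interpretation, the per-row sums reproduce the reported totals exactly.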
Cheetah hit finding parameters in the front chambers.
| 5 | Threshold 150, Peak size 3–12, SNR 6 | 0–1,300 | 8 | Radial background subtraction |
| 10 | Threshold 150, Peak size 3–12, SNR 4 | 0–1,300 | 8 | Radial background subtraction |
| 10 | Threshold 150, Peak size 3–12, SNR 4 | 0–1,300 | 8 | Radial background subtraction |
Cheetah hit finding parameters in the back chambers.
| 10 | Threshold 150, Peak size 3–15, SNR 4 | 0–1,300 | 8 | Radial background subtraction |
| 10 | Threshold 150, Peak size 3–12, SNR 4 | 0–1,300 | 8 | Radial background subtraction |
| 10 | Threshold 150, Peak size 3–15, SNR 4 | 0–1,300 | 8 | Radial background subtraction |
CrystFEL processing parameters.
| CrystFEL (‘zaef’) | Threshold 500, min grad 500,000, SNR 5.5 | 3.5,5,5.5 | 7 |
| CrystFEL (‘zaef’) | Threshold 550, min grad 1,100,000, SNR 5 | 3,4,5 | 7 |
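As a rough illustration, the peak-search settings in the table could be passed to CrystFEL's `indexamajig` as command-line options. The sketch below only builds the argument list and does not run anything. It assumes that the three comma-separated values in the third column are integration radii (`--int-radius`); the meaning of the last column (7) cannot be identified from the table as shown, so it is left out, as are the input, geometry, and indexing options that a real invocation would need. The helper name is hypothetical.

```python
def indexamajig_peak_args(threshold, min_gradient, min_snr, int_radius):
    """Build indexamajig options for a 'zaef' peak search (illustrative only)."""
    return [
        "--peaks=zaef",
        f"--threshold={threshold}",
        # Newer CrystFEL releases name this option --min-squared-gradient.
        f"--min-gradient={min_gradient}",
        f"--min-snr={min_snr}",
        # ASSUMPTION: the third table column holds the three integration radii.
        "--int-radius=" + ",".join(f"{r:g}" for r in int_radius),
    ]


# Parameters from the two table rows, in the order they appear.
FIRST_ROW = indexamajig_peak_args(500, 500_000, 5.5, (3.5, 5, 5.5))
SECOND_ROW = indexamajig_peak_args(550, 1_100_000, 5, (3, 4, 5))

if __name__ == "__main__":
    print(" ".join(FIRST_ROW))
    print(" ".join(SECOND_ROW))
```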
Selenobiotinyl streptavidin crystallography figures of merit.
| PDB ID | 5JD2 |
| Beamline | LCLS (CXI) |
| Space group | P2₁ |
| Cell dimensions | |
| a, b, c (Å) | 50.7, 98.4, 53.1 |
| α, β, γ (°) | 90, 112.7, 90 |
| Resolution (Å) | 32.51–1.90 (1.97–1.90) |
| Rsplit | 0.048 (0.395) |
| I/σ(I) | 14.0 (2.7) |
| Completeness (%) | 100 (100) |
| SFX multiplicity of observations | 1447.6 (1003.3) |
| CC* | 1.000 (0.930) |
| CC1/2 | 0.998 (0.762) |
| CCano | 0.177 (0.003) |
| Wilson B factor (Å²) | 29.58 |
| Refinement | |
| No. reflections | 38,327 (3,817) |
| Rwork/Rfree | 0.166/0.199 (0.231/0.253) |
| Ramachandran favored (%) | 89.3 |
| Ramachandran allowed (%) | 10.5 |
| Ramachandran outliers (%) | 0.2 |
| No. atoms | |
| Protein | 3,630 |
| Ligand/Ion | 64 |
| Water | 265 |
| B-factors (Å²) | |
| Protein | 34.0 |
| Ligand/Ion | 38.9 |
| Water | 43.5 |
| R.m.s. deviations | |
| Bond lengths (Å) | 0.006 |
| Bond angles (°) | 1.04 |
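The values 1.000 (0.930) listed directly above the CC1/2 row are consistent with CC* derived from CC1/2 through the standard relation CC* = sqrt(2·CC1/2 / (1 + CC1/2)). A small sketch of that check follows; the function name is illustrative, and agreement is expected only to within the rounding of the tabulated CC1/2.

```python
import math


def cc_star(cc_half):
    """Estimate CC* from the half-dataset correlation CC1/2."""
    return math.sqrt(2.0 * cc_half / (1.0 + cc_half))


if __name__ == "__main__":
    # CC1/2 values from the table: overall and highest-resolution shell.
    for label, cc_half in (("overall", 0.998), ("outer shell", 0.762)):
        print(label, cc_star(cc_half))
```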
Figure 4. Figures of merit plot: CC* (red) and Rsplit (blue) versus resolution.
CC* estimates the correlation between the merged dataset and the unmeasured true intensities; it stays above 90% in every resolution shell out to the reported resolution limit. Rsplit measures the discrepancy between intensities merged independently from two halves of the data; it stays below 40% over the same range. Both plots indicate that the merged intensities are of high quality.
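For reference, Rsplit as commonly defined in serial crystallography compares intensities merged from two halves of the data: Rsplit = (1/√2) · Σ|I_a − I_b| / (½ Σ(I_a + I_b)). The sketch below implements that definition for two equal-length lists of half-dataset intensities over the same reflections. The function name and the intensities in the example are made up for illustration and are not values from this dataset.

```python
import math


def r_split(half_a, half_b):
    """Rsplit between two half-dataset intensity lists (same reflections)."""
    if len(half_a) != len(half_b):
        raise ValueError("both halves must cover the same reflections")
    numerator = sum(abs(a - b) for a, b in zip(half_a, half_b))
    denominator = 0.5 * sum(a + b for a, b in zip(half_a, half_b))
    # The 1/sqrt(2) factor accounts for each half holding half of the data.
    return numerator / denominator / math.sqrt(2.0)


if __name__ == "__main__":
    # Hypothetical intensities, for illustration only.
    print(r_split([10.0, 20.0], [12.0, 18.0]))
```

Evaluating the function per resolution shell, rather than over all reflections at once, gives a curve of the kind plotted in the figure.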