I Poudyal1, M Schmidt1, P Schwander1. 1. Department of Physics, University of Wisconsin-Milwaukee, 3135 N. Maryland Ave., Milwaukee, Wisconsin 53211, USA.
Abstract
X-ray free-electron lasers (XFELs) open the possibility of obtaining diffraction information from a single biological macromolecule. This is because XFELs can generate extremely intense x-ray pulses that are so short that diffraction data can be collected before the sample is destroyed. By collecting a sufficient number of single-particle diffraction patterns, the three-dimensional electron density of a molecule can be reconstructed ab initio. The quality of the reconstruction depends largely on the number of patterns collected at the experiment. This paper provides an estimate of the number of diffraction patterns required to reconstruct the electron density at a targeted spatial resolution. This estimate is verified by simulations for realistic x-ray fluences, repetition rates, and experimental conditions available at modern XFELs. Employing the bacterial phytochrome as a model system, we demonstrate that sub-nanometer resolution is within reach.
X-ray free-electron lasers (XFELs) open the possibility of obtaining diffraction information from a single biological macromolecule. This is because XFELs can generate extremely intense x-ray pulses that are so short that diffraction data can be collected before the sample is destroyed. By collecting a sufficient number of single-particle diffraction patterns, the three-dimensional electron density of a molecule can be reconstructed ab initio. The quality of the reconstruction depends largely on the number of patterns collected at the experiment. This paper provides an estimate of the number of diffraction patterns required to reconstruct the electron density at a targeted spatial resolution. This estimate is verified by simulations for realistic x-ray fluences, repetition rates, and experimental conditions available at modern XFELs. Employing the bacterial phytochrome as a model system, we demonstrate that sub-nanometer resolution is within reach.
X-rays have been used for more than sixty years to determine the structures of proteins and other biologically important macromolecules. Protein structures are determined by the interpretation of electron density maps obtained from measured structure factors. Since the interaction of x-rays with matter is weak, crystals are widely used to determine these structure factors. When the crystals are exposed to x-ray radiation, diffraction is amplified along specific directions that are determined by Bragg's law. In this way, structure factor amplitudes can be measured with sufficient precision. However, the phase of the structure factors is not experimentally accessible and needs to be retrieved from additional experiments. X-ray free-electron lasers (XFELs) can provide enormous incident intensities so that diffraction is sufficiently strong to access structure factor amplitudes from single particles. The particles are destroyed shortly after exposure with the incident radiation. However, XFEL pulses are short enough to collect a diffraction signal before the object is damaged. This is the so-called “diffraction-before-destruction” principle. In a single-particle imaging (SPI) experiment, a large number of two-dimensional (2D) diffraction patterns (snapshots) of single molecules are recorded by a pixel area detector. These snapshots are extremely noisy and taken in random and unknown orientations. Therefore, the orientations of the molecules relative to each other have to be determined from the snapshots. Multiple algorithmic methods of orientation recovery have been developed to assign orientations to single-particle x-ray diffraction patterns. The oriented patterns are merged into a three-dimensional (3D) diffraction volume with phases initially unknown. The phases can be retrieved from the diffraction volume by iterative phasing. Finally, from the phased diffraction volume, the electron density map is determined.Electron density reconstructions from experimental SPI datasets collected at XFELs have achieved resolutions in the regime of a few tens of nanometers. Diffraction up to a resolution of 5.9 Å was already observed, but a reconstruction of a 3D diffraction volume was not attempted due to the small number of diffraction patterns. This immediately raises the central question of how many diffraction patterns must be collected for a 3D reconstruction of the electron density for a given resolution. In other words, how many diffraction patterns are required to obtain a diffraction volume at a signal-to-noise ratio (SNR) sufficient to reach the desired resolution by iterative phasing? Clearly, this depends on the molecule under investigation and experimental conditions such as the wavelength, beam size, and incident x-ray fluence. To answer this important question is the primary focus of this paper.Since the beamtime at XFELs is expensive, and sparsely available, it is important to have a sound estimate of the required number of diffraction patterns to design such an experiment. In contrast to XFELs, electron microscopes are now ubiquitously available. Using cryogenic electron microscopy (cryo-EM), the structures of single molecules have been determined at near-atomic resolution, which surpasses the resolution reached at XFELs to date. However, the duration of the ultrashort x-ray pulses is faster than the molecular fluctuations. Accordingly, the XFEL provides a snapshot of a molecule “frozen in time” during x-ray exposure. Ambient temperatures are necessary to keep the molecules alive and enable protein dynamics. Using single-particle imaging at the XFEL, such dynamics can be probed with unprecedented time resolution down to the femtosecond regime. In contrast, cryo-EM requires quenching the sample to cryogenic temperatures, which may alter the structure, and with cryo-EM, the time resolution is limited to a few milliseconds.So far, the SPI techniques at XFELs were applied to large biological assemblies, primarily viruses. Here, we estimate by simulation how the SPI approach could be applied to a much smaller protein at a more relevant, molecular resolution. We selected the phytochrome, a light regulated enzyme, as a suitable model system. Phytochromes are red light photoreceptors characterized in plants, fungi, and bacteria and undergo large structural changes after red light absorption. The full-length, functional bacterial phytochromes (BphPs) consist of multiple domains. The PAS (Period ARNT Sim), GAF (cGMP phosphodiesterase/adenylyl cyclase/FhIA), and PHY (phytochrome-specific) domains form the photosensory core module (PCM). An effector domain has enzymatic activity, which is covalently linked to the PHY domain. The PHY domain has a tongue-like structure, which contacts the GAF domain to seal the biliverdin (BV) chromophore pocket as shown in Fig. 1. Upon photoexcitation, phytochromes interconvert between a dark-adapted red-light absorbing state, Pr, and a photoactivated far red-light absorbing state, Pfr. The sensory tongue probes the configuration of the BV chromophore and transmits the signal to the PHY domain. The structure of the tongue undergoes substantial changes between the Pr and Pfr states. In the Pr state, the tongue assumes a loop to -strand conformation, whereas in the Pfr state, it assumes a loop to -helix conformation. Accordingly, large-scale conformational changes with amino acid displacements across several tens of Å between the Pr and Pfr states are required. However, the molecular details of the structural changes during the Pr to Pfr transition and their long-range effects on the effector domains are not well understood. Such changes may not be accommodated by the crystal lattice and thus hidden in crystallographic methods so that SPI is required. With the advancement in x-ray technology, we anticipate that single-particle x-ray experiments on the full-length phytochromes can be conducted at sub-nanometer resolution. With this, the structural dynamics of the Pr to Pfr transition in the full-length functional BphPs could be observed. Here, we estimate how many diffraction patterns are required to determine a 3D diffraction volume for an intact (full-length) phytochrome in the Pr form that can be successfully phased to calculate the three-dimensional electron density map at a targeted resolution.
FIG. 1.
Pr and Pfr structures of Idiomarina sp. and D. radiodurans phytochrome. (a) Full-length dark-adapted red-light absorbing Pr state (pdb code 5llw) of the Idiomarina sp. phytochrome. Individual domains are colored in yellow, green, magenta, brown, and gray for PAS, GAF, PHY, and coiled-coil and di-guanylyl cyclase (DGC) effector domains, respectively. (b) The photosensory core module (PCM) from the D. radiodurans phytochrome in the photoactivated far red-light absorbing Pfr state (pdb code 5c5k). The PAS, GAF, and PHY domain constitute the PCM and are represented with the same color code as in (a). The PHY domains are displaced substantially in the Pfr structure [blue curved arrow (b)]. The sensory tongue is marked in both structures, and the biliverdin (BV) chromophores (orange) are shown as ball-and-stick models.
Pr and Pfr structures of Idiomarina sp. and D. radiodurans phytochrome. (a) Full-length dark-adapted red-light absorbing Pr state (pdb code 5llw) of the Idiomarina sp. phytochrome. Individual domains are colored in yellow, green, magenta, brown, and gray for PAS, GAF, PHY, and coiled-coil and di-guanylyl cyclase (DGC) effector domains, respectively. (b) The photosensory core module (PCM) from the D. radiodurans phytochrome in the photoactivated far red-light absorbing Pfr state (pdb code 5c5k). The PAS, GAF, and PHY domain constitute the PCM and are represented with the same color code as in (a). The PHY domains are displaced substantially in the Pfr structure [blue curved arrow (b)]. The sensory tongue is marked in both structures, and the biliverdin (BV) chromophores (orange) are shown as ball-and-stick models.
METHODS
The simulations reported in this paper were done for the full-length Idiomarina sp. phytochrome molecule [protein data bank (PDB) entry 5llw] whose structure was recently determined. This structure has an approximate diameter D of 164 Å and consists of about 11 000 atoms. Diffraction patterns were simulated according to the formalism in Appendix A and implemented in Python. The simulations were done for a photon energy of 2.48 keV corresponding to a wavelength of . The resolution at the edge of the detector was 10 Å. The phases needed to recover the electron density can be retrieved by sampling the continuous diffraction pattern at sufficiently small intervals in reciprocal space. To retain the phase information, these intervals must be smaller than 1/(2D), i.e., oversampled at least twice with respect to the molecular diameter. This determines the size of the Shannon pixels in the diffraction pattern. For phytochrome, detector pixels are required to reach a resolution of 10 . The simulated signal for each detector pixel was converted to the expected number of photons for an incident x-ray fluence of photons/cm2 achievable at an XFEL. The measured photon counts follow Poisson statistics. Accordingly, diffraction patterns were simulated by adding Poisson noise (“shot noise”) to the calculated diffraction signal.For the simulation, randomly oriented diffraction patterns were generated using uniform random rotation quaternions as described in Appendix B. The (3D) diffraction volume of the molecule was obtained by orienting the noisy diffraction patterns relative to each other. To retrieve the electron density, the entire diffraction volume (reciprocal space) needs to be covered. Only scattering vectors ending on the Ewald sphere contribute to the diffraction pattern of a molecule in a particular orientation. Accordingly, each diffraction pattern accesses a spherical cap of the diffraction volume centered at the origin of the reciprocal space. Consequently, a large number of diffraction patterns (snapshots) from many different molecular orientations are required to fully sample the reciprocal space. The ensemble of all snapshots is then merged into a diffraction volume using a (cone-gridding) algorithm, which has been previously used for SPI at the XFEL. The diffraction volume is phased by an iterative phasing algorithm. As a result, the phases of the structure factors are recovered, and the electron density is determined. The resolution is validated using Fourier Shell Correlation (FSC), a method now widely accepted in cryo-EM single-particle imaging.
NUMBER OF SNAPSHOTS
Let D be the particle diameter and d be the aimed resolution. We define a dimensionless quantity R = “number of resolution elements” as follows:
The number of Shannon voxels in the outermost shell covered by a single diffraction pattern for oversampling by a factor of two is
Accordingly, the number of Shannon voxels in the resolution shell is
The probability of an outermost shell voxel hit by a randomly oriented diffraction pattern is therefore
Let denote the mean number of expected photons per Shannon pixel of a diffraction pattern at the resolution shell. As single photons are counted by the detector, the signal follows Poisson statistics, . Accordingly, the signal-to-noise ratio (SNR) is
Due to the weak scattering of x-rays from a single molecule, the SNR of a single diffraction snapshot is way too low for any high-resolution information. It is therefore necessary to obtain information from many snapshots by averaging. The number of times M a voxel must be hit by a diffraction pattern in order to reach a desired SNR is
As an example, for (phytochrome at 10 Å resolution), each voxel must be visited at least 500 times to achieve a SNR of 1.The probability P for a single voxel visited at least M times by an ensemble of nS snapshots is estimated using the following sum of Binomial distributions:
Under the assumption that individual Shannon voxels are visited independently (justification given in Appendix C), the joint probability to observe all voxels at least M times is
The factor in front of in the equation is due to Friedel's symmetry. Using these relations, we can estimate the total number of diffraction patterns needed to cover the entire diffraction volume at any desired probability . For the special case when M = 1, i.e., at very high signal levels, an estimation of the number of snapshots was proposed previously and is in agreement with our formulation.Equation (7) cannot be analytically solved for nS. Instead, one can calculate the right-hand side with increasing nS until the desired probability P is reached. An implementation of an efficient algorithm in Python is listed in Appendix D. However, an analytical formula for the number of snapshots can be obtained by using the de Moivre–Laplace theorem, which approximates the binomial distribution by a Gaussian,
This approximation allows us to write nS explicitly,
Since , the probability P becomes 0.5 for . We call nS the characteristic number of snapshots, the number required for a single voxel being visited at least M times with probability 50%. Together with Eqs. (4) and (6), this yields
To verify the approximation, we calculated the exact probability P according to Eq. (7) as a function of for different values of M. The result is depicted in Fig. 2(a). All curves admit a value of for , in close agreement with the approximation (9).
FIG. 2.
Probability P and correction factor C. (a) Probability P of visiting a single voxel at least M times as a function of (R = 16). For the characteristic number of snapshots , all curves admit a probability P close to . (b) Log –log plot of the correction-factor as a function of for different numbers of resolution elements R and SNR = 1.0. The horizontal dashed line corresponds to a value of = 1.15. For the exact number of snapshots is within 15% of the characteristic number .
Probability P and correction factor C. (a) Probability P of visiting a single voxel at least M times as a function of (R = 16). For the characteristic number of snapshots , all curves admit a probability P close to . (b) Log –log plot of the correction-factor as a function of for different numbers of resolution elements R and SNR = 1.0. The horizontal dashed line corresponds to a value of = 1.15. For the exact number of snapshots is within 15% of the characteristic number .Now, for all voxels being jointly visited at least M times, the number of snapshots must be larger of course. With a joint probability Eq. (8), a single-voxel probability must be reached instead, a value much closer to one. We express the ratio of the exact number of snapshots , derived from Eq. (8), to the characteristic number of snapshots from Eq. (9) by a correction factor C
. A plot of C as a function of the mean number of photons for a SNR value of 1.0 is shown in Fig. 2(b). From this plot, we observe that approaches as the number of photons is lowered. The exact number can easily be estimated from the characteristic number by multiplication with the proper correction factor taken from the graph, without need to solve Eq. (8).According to Eq. (9), the most important parameter for estimating the number of diffraction patterns is the mean number of expected photons per Shannon pixel at the desired resolution d. Different methods to calculate are given in Appendix E. An estimate of the number of snapshots for the full-length phytochrome molecule to reach a SNR of 1.0 at different resolutions and various experimental conditions is tabulated in Table I. The conditions are similar to the experimental conditions available at the existing XFELs such as the Linac Coherent Light Source (LCLS) or the European XFEL (EuXFEL).
TABLE I.
Estimated number of snapshots required to reach SNR = 1 at different resolutions for the full-length phytochrome molecule at various experimental conditions. is the mean number of photons per Shannon pixel at the desired resolution d.
Hard x-ray (larger beam) energy = 6.0 keV, λ = 2.07 Å, beam size = 0.5 μm× 0.5 μm, and fluence = 4.0×1020 ph/cm2
n
# Snapshots
n
# Snapshots
n
# Snapshots
30 Å
1.8 × 10−2
1774
1.0 × 10−1
488
8.4 × 10−3
3394
25 Å
6.7× 10−2
5000
4.8 × 10−2
1007
3.8 × 10−3
8302
10 Å
2.0 × 10−3
38 244
2.6 × 10−2
4323
2.2 × 10−3
35 063
5 Å
a
a
6.6 × 10−3
26 978
5.0 × 10−4
284 978
3 Å
a
a
4.3 × 10−3
66 075
3.5 × 10−4
672 010
Not accessible due to the wavelength or unpractically high scattering angle.
Estimated number of snapshots required to reach SNR = 1 at different resolutions for the full-length phytochrome molecule at various experimental conditions. is the mean number of photons per Shannon pixel at the desired resolution d.Not accessible due to the wavelength or unpractically high scattering angle.
RESULTS
Simulated diffraction patterns of the full-length phytochrome are presented in Fig. 3. Figure 3(a) shows a noise-free diffraction pattern, and Fig. 3(b) shows a diffraction pattern corresponding to an incident photon fluence of photons/cm2 and a photon energy of 2.48 keV (wavelength 5 Å). Only ∼200 photons are scattered from a single phytochrome molecule. Using our formalism [Eq. (8)], 38 000 diffraction snapshots are required to reach a SNR of 1.0 at 10 Å resolution. Accordingly, we simulated noisy 38 000 patterns that were subsequently merged into a 3D diffraction volume. Central slices through the reconstructed 3D volume are shown in Fig. 4(a), while Fig. 4(b) shows a section through the noise-free volume, derived from the atomic model. In the simulation, the central three voxels of the diffraction volume were set to zero, which takes into account the experimentally inaccessible central area of the detectors used at the XFELs. Iterative phasing was used to recover the electron density from the diffraction volume using the combination of the hybrid-input-output (HIO) algorithm and shrink-wrap algorithm. The HIO algorithm was applied for the first fifty iterations with feedback parameter . After this, the shrink-wrap algorithm was used with an adaptive support constraint determined anew for each iteration cycle. For that, the present electron density was convoluted with a Gaussian of width . Voxels that contain electron densities larger than 14% of the maximum were assigned to the new support constraint. The initial width was set to six voxels and reduced by 5% after each iteration until a minimum of one voxel was reached. This algorithm converged after a few hundred iterations. The reconstructed electron density at 10 Å is shown in Fig. 5(a).
FIG. 3.
Simulated diffraction patterns. (a) Noise-free diffraction pattern with 10 Å resolution at the edge of the detector and 5 Å wavelength. (b) Diffraction pattern for a photon fluence of photons/cm2. The total number of scattered photons is 196, and the average photon count at 10 Å resolution is 0.002. The color code corresponds to the number of photons per pixel.
FIG. 4.
Reconstructed and exact diffraction volume. (a) Central slices of the reconstructed volume, reconstructed from 38 000 noisy patterns. (b) Central slices through the exact, noise-free diffraction volume. The slices correspond to the xy, yz, and zx planes.
FIG. 5.
Electron density obtained by iterative phasing and resolution validation by Fourier Shell Correlation (FSC). (a) Reconstructed electron density from 38 000 noisy diffraction patterns of the full-length phytochrome targeted at 10 Å resolution displayed at the 3σ contour level with atomic model superimposed using Chimera. (b) FSC from (a) for all diffraction patterns (blue) and half the number of patterns (red). (c) Reconstructed electron density from 66 000 noisy diffraction patterns targeted at 3 Å resolution and beam parameters different from (a), as explained in the text. (d) FSC from (c) for all diffraction patterns (blue) and half the number of patterns (red). The horizontal dashed lines show the established thresholds for FSC (0.143 and 0.5). The vertical dashed line corresponds to the target resolution.
Simulated diffraction patterns. (a) Noise-free diffraction pattern with 10 Å resolution at the edge of the detector and 5 Å wavelength. (b) Diffraction pattern for a photon fluence of photons/cm2. The total number of scattered photons is 196, and the average photon count at 10 Å resolution is 0.002. The color code corresponds to the number of photons per pixel.Reconstructed and exact diffraction volume. (a) Central slices of the reconstructed volume, reconstructed from 38 000 noisy patterns. (b) Central slices through the exact, noise-free diffraction volume. The slices correspond to the xy, yz, and zx planes.Electron density obtained by iterative phasing and resolution validation by Fourier Shell Correlation (FSC). (a) Reconstructed electron density from 38 000 noisy diffraction patterns of the full-length phytochrome targeted at 10 Å resolution displayed at the 3σ contour level with atomic model superimposed using Chimera. (b) FSC from (a) for all diffraction patterns (blue) and half the number of patterns (red). (c) Reconstructed electron density from 66 000 noisy diffraction patterns targeted at 3 Å resolution and beam parameters different from (a), as explained in the text. (d) FSC from (c) for all diffraction patterns (blue) and half the number of patterns (red). The horizontal dashed lines show the established thresholds for FSC (0.143 and 0.5). The vertical dashed line corresponds to the target resolution.The resolution and reproducibility of the reconstructed electron density were accessed using Fourier Shell Correlation (FSC). For this, we randomly split the diffraction patterns into two disjoint sets “1” and “2” and processed each set independently, resulting in two electron density maps. The FSC is calculated from the Fourier transformation of the two maps using
where and are the Fourier transforms of maps 1 and 2, respectively, q is the magnitude of the scattering vector, and * denotes the complex conjugate. The resolution limit is defined by the value where the FSC drops below a certain threshold. Conventional thresholds used by the cryo-EM community are 0.143 and 0.5. These thresholds are represented by horizontal dashed lines in Fig. 5(b). The FSC drops below 0.5 at around 10 Å [Fig. 5(b), blue line]. This demonstrates that the number of diffraction patterns estimated by Eq. (8), with and SNR = 1, is sufficient to reach the targeted resolution. Now, we want to test if a smaller number of snapshots could be sufficient to obtain the electron density at the same resolution. To address this, the reconstruction workflow is repeated with half the number of patterns (19 000), which corresponds to a SNR of 0.67. The FSC [Fig. 5(b), red line] reveals that instead of 10 Å, only about 20 Å is reached in this case.To evaluate whether near-atomic resolution could be realistically reached by an XFEL experiment, the same pipeline is repeated for a target resolution of 3 Å. For that, a higher photon energy of 8.27 keV is used instead, which corresponds to a wavelength of 1.5 Å. A smaller x-ray focal spot with 100 nm diameter is chosen, which yields a photon fluence of photons/cm2. A total of ∼2000 photons/pattern are scattered per phytochrome molecule. According to our formalism, 66 000 noisy patterns are required to reach a resolution of 3 Å. The reconstructed electron density at 3 Å is shown in Fig. 5(c). Details of the structure can be identified from the inset of Fig. 5(c). We also validated the resolution of 3 Å by FSC [Fig. 5(d)]. With half the number of patterns (SNR = 0.65), the resolution reaches only about 8 Å [Fig. 5(d), red line].Finally, we analyze the effect of different types of backgrounds on the quality of the reconstruction. For that, we consider a uniform background with the same magnitude as the phytochrome signal at 10 Å and a uniform background three times the magnitude. Additionally, we take a q-dependent background assuming a helium gas, using the atomic scattering factors of He. The magnitude of the helium background is set equal to the signal of the phytochrome at 10 Å. The diffraction patterns, including the background, are converted to photon counts by addition of Poisson noise as described above. We repeat the merging and reconstruction processes to obtain 3D electron density maps. The FSC with a background is depicted in Fig. 6 and compared with the FSC without any background. Addition of a uniform background equal to the molecular signal (Fig. 6, red line) and the helium background (magenta line) has a small effect, and the target resolution of 10 Å can still be reached with 38 000 diffraction patterns. For a background three times the molecular signal, however, the resolution is reduced to 12 Å (blue line). We conclude that a background comparable to the molecular signal does not need substantially more diffraction patterns to reach the target resolution.
FIG. 6.
Effect of different types of backgrounds on the resolution. Fourier Shell Correlation (FSC) for a uniform background with the same magnitude as the phytochrome signal at 10 Å (red) and a uniform background three times the magnitude (blue). The FSC of a q-dependent background representing a gas of helium with a magnitude equal to the signal at 10 Å (magenta). The FSC without a background is included for comparison (black).
Effect of different types of backgrounds on the resolution. Fourier Shell Correlation (FSC) for a uniform background with the same magnitude as the phytochrome signal at 10 Å (red) and a uniform background three times the magnitude (blue). The FSC of a q-dependent background representing a gas of helium with a magnitude equal to the signal at 10 Å (magenta). The FSC without a background is included for comparison (black).
SUMMARY
We estimated the appropriate number of snapshots required in order to reconstruct the three-dimensional electron density of a biological molecule at any desired resolution. Quantitative results are derived as a function of the desired resolution, the molecular size, the expected average number of photons per Shannon pixel, and the SNR in the resolution shell. Using this formalism, we demonstrated that a SNR of 1.0 and a joint probability are sufficient to reach the desired resolution by iterative phasing validated by Fourier Shell Correlation (FSC). The derivation assumed that orientation recovery does not require additional snapshots. Indeed, a study of single-particle diffraction imaging recently reported successful reconstruction of electron density from diffraction patterns at a signal level of less than 100 photons on the average pattern. We, therefore, conclude that for a protein like the phytochrome, where 200 photons per diffraction pattern can be expected in an XFEL experiment (see Fig. 3), additional snapshots for orientation recovery are not required.Our formalism can be extended to incorporate other experimental conditions. In this work, we studied the effect of the uniform background and q-dependent background from a gas of helium. However, any form of background or other nuisances such as x-ray streaks and variations in the detector response can be incorporated. By including such signals in the simulations, the impact on the required number of snapshots can be evaluated following the same framework outlined in this paper.
OUTLOOK
The primary challenge of Single-Particle Imaging (SPI) at XFELs is to reach sub-nanometer resolution to visualize the atomic details of biological macromolecules. Most importantly, a sufficiently large number of single-particle diffraction patterns must be collected during the allocated experimental beamtime. With the new generation of high repetition-rate XFELs, which deliver pulse energies of 10 mJ and higher, improvement in sample delivery technology, and specialized detectors, we expect to collect tens of millions of snapshots during a single shift. Combined with noise-robust data analysis algorithms for single-particle detection and orientation recovery, the goal may be reached in the near future. According to our estimates (see Table I), it should be possible to reach sub-nanometer resolution even for smaller proteins with molecular masses similar to that of the phytochrome. However, the main advantage of SPI unfolds in the presence of structural variability, associated with the biological function. Manifold-based machine learning algorithms applied to a large ensemble of single-particle snapshots allow us to reveal the concerted structural changes exercised by the sample. This enables us to map conformational spectra together with energy landscapes, determine possible functional pathways, and compile 3D molecular movies. Performing single-particle diffraction in time-resolved mode can further advance these promising opportunities for understanding the biological function and bring structural biology to a new level.The relevance of our formalism is that it establishes a sound mathematical formalism to determine the number of snapshots required to answer a specific biological question at any resolution. These estimates will be useful for beamtime proposals to assert that datasets of sufficient quality can be collected. Similarly, it can help beamline scientists to decide whether enough snapshots can be collected during the allocated beamtime. In this way, the feasibility of single-particle experiments can be judged in an objective way and can be used to guide experiments at new and existing XFELs for a broad class of biological macromolecules.
import numpy as np
def logFactorial(n):
if n < 20:
value = np.log(np.math.factorial(n))
else:
value = 0.5*np.log(2*np.pi*n) + n*np.log(n/np.e)
return value
def pnm(p,N,M):
#this function gives the probability to observe a voxel at least M times
#from an ensemble of N snapshots
# p = probability to hit a voxel for a single snapshot
# N = Number of Snapshots
# M = Redundancy
if N < M:
s = 1
else:
s = 0
lp = np.log(p)
lmp = np.log(1-p)
for k in np.arange(M):
s = s + np.exp(logFactorial(N) - logFactorial(N-k) - logFactorial(k) + k*lp + (N-k)*lmp)
return np.maximum(1-s,0)
def numberOfSnapShots(d,D,nPhotons,SNR,P_tilde):
#nS number of Snapshots
#d is the Resolution
# D is the Diameter of a Molecule
#nPhotons is the Number of Photons
#P_tilde is the combined probability
#Number of Resolution elements
R = D/d
#number of voxels at Resolution Shell
nV_Shell = 16*np.pi*R**2
#probability per Shannon voxel
p = 1./(4*R)
M = np.ceil(SNR**2/nPhotons)
# P -> Probability to observe a voxel at least M times from an ensemble of nS snapshot
Authors: Carlos Sanchez-Cano; Ramon A Alvarez-Puebla; John M Abendroth; Tobias Beck; Robert Blick; Yuan Cao; Frank Caruso; Indranath Chakraborty; Henry N Chapman; Chunying Chen; Bruce E Cohen; Andre L C Conceição; David P Cormode; Daxiang Cui; Kenneth A Dawson; Gerald Falkenberg; Chunhai Fan; Neus Feliu; Mingyuan Gao; Elisabetta Gargioni; Claus-C Glüer; Florian Grüner; Moustapha Hassan; Yong Hu; Yalan Huang; Samuel Huber; Nils Huse; Yanan Kang; Ali Khademhosseini; Thomas F Keller; Christian Körnig; Nicholas A Kotov; Dorota Koziej; Xing-Jie Liang; Beibei Liu; Sijin Liu; Yang Liu; Ziyao Liu; Luis M Liz-Marzán; Xiaowei Ma; Andres Machicote; Wolfgang Maison; Adrian P Mancuso; Saad Megahed; Bert Nickel; Ferdinand Otto; Cristina Palencia; Sakura Pascarelli; Arwen Pearson; Oula Peñate-Medina; Bing Qi; Joachim Rädler; Joseph J Richardson; Axel Rosenhahn; Kai Rothkamm; Michael Rübhausen; Milan K Sanyal; Raymond E Schaak; Heinz-Peter Schlemmer; Marius Schmidt; Oliver Schmutzler; Theo Schotten; Florian Schulz; A K Sood; Kathryn M Spiers; Theresa Staufer; Dominik M Stemer; Andreas Stierle; Xing Sun; Gohar Tsakanova; Paul S Weiss; Horst Weller; Fabian Westermeier; Ming Xu; Huijie Yan; Yuan Zeng; Ying Zhao; Yuliang Zhao; Dingcheng Zhu; Ying Zhu; Wolfgang J Parak Journal: ACS Nano Date: 2021-03-02 Impact factor: 15.881
Authors: Eduardo R Cruz-Chú; Ahmad Hosseinizadeh; Ghoncheh Mashayekhi; Russell Fung; Abbas Ourmazd; Peter Schwander Journal: Struct Dyn Date: 2021-02-18 Impact factor: 2.920
Authors: Lena Worbs; Nils Roth; Jannik Lübke; Armando D Estillore; P Lourdu Xavier; Amit K Samanta; Jochen Küpper Journal: J Appl Crystallogr Date: 2021-11-16 Impact factor: 3.304