Single-particle structure recovery without crystals or radiation damage is a revolutionary possibility offered by X-ray free-electron lasers, but it involves formidable experimental and data-analytical challenges. Many of these difficulties were encountered during the development of cryogenic electron microscopy of biological systems. Electron microscopy of biological entities has now reached a spatial resolution of about 0.3 nm, with a rapidly emerging capability to map discrete and continuous conformational changes and the energy landscapes of biomolecular machines. Nonetheless, single-particle imaging by X-ray free-electron lasers remains important for a range of applications, including the study of large "electron-opaque" objects and time-resolved examination of key biological processes at physiological temperatures. After summarizing the state of the art in the study of structure and conformations by cryogenic electron microscopy, we identify the primary opportunities and challenges facing X-ray-based single-particle approaches, and possible means for circumventing them.
Determining the structure and function of biological nanomachines is a key goal of
molecular biology. Electron-microscopic single-particle imaging approaches, augmented by
powerful algorithmic techniques, have served this goal by revealing the three-dimensional
(3D) structure (Frank, 2006), discrete (Scheres, 2012) and continuous conformational changes,
and energy landscapes (Dashti ) from 2D snapshots of individual objects in unknown orientational and
conformational states. These set a high bar for competing techniques. In
order to delineate the opportunities for single-particle structure determination by methods
based on X-ray free-electron lasers (XFELs), it is necessary to begin with a summary of the
state of the art in cryogenic electron microscopy (cryo-EM).
STRUCTURE OF BIOLOGICAL ENTITIES BY CRYOGENIC ELECTRON MICROSCOPY
Electron microscopy exploits the scattering of high-energy (100–300 keV) electrons passing
through the sample of interest. Biological entities are captured in their “native” state in
vitreous ice by plunge-freezing (Frank, 2006). This and the need to minimize radiation damage
limit the signal-to-noise ratio (SNR) of cryo-EM snapshots to the range of 0.1–0.01. Each
electron micrograph represents a 2D “projection view” of a single particle in an unknown
orientational and conformational state. The reliable extraction of conformational information
has become possible only recently (Scheres, 2012 and Dashti ).

Over fifty years of intensive research has resulted in a series of well-characterized
data-analytical steps for recovering reliable 3D structural information from ultralow-signal
electron micrographs of biological entities (Frank, 2006). Over this period, 3D structure
recovery by cryo-EM has moved systematically from relying on a few high-contrast (initially
“stained”) snapshots to utilizing large
collections of ultralow-signal snapshots obtained under exceptionally uniform imaging
conditions. This trend has been driven by the need to preserve sample integrity and to
minimize variations extraneous to the sample itself, often at the expense of signal-to-noise ratio. The
realization that statistically meaningful structural information requires averaging over
large homogeneous ensembles, particularly for the important class of conformationally flexible
molecular machines (Moore, 2012), has fueled the
development of powerful data-analytical approaches capable of extracting reliable
information from large collections of ultralow-signal snapshots.

The past few years have witnessed a significant improvement in the highest resolution with
which 3D structure can be determined by cryo-EM, to a level where secondary structure can be
discerned. Specifically, the advent of direct-detection electron-counting techniques has
resulted in near-atomic resolutions (see, e.g., Amunts ), limited mostly by intrinsic sample
heterogeneity.
At the same time, it has been suggested that “the golden age of structure determination is
drawing to a close, with the focus shifting to structural changes during function” (Moore, 2012). The growing realization that biological
“function” involves conformational change rather than an immutable static structure has
accelerated efforts to develop data-analytical methods capable of mapping complex
conformational changes from large heterogeneous datasets of ultralow-signal 2D snapshots.

Bayesian clustering techniques (Scheres and Scheres, 2012) have
been highly successful in sorting discrete conformational states. Template-based methods have
yielded exciting evidence of continuous conformational changes during function (Fischer ). Most recently,
manifold-based approaches have enabled ab initio, mathematically rigorous
extraction of continuous conformational changes, without recourse to templates, pre-selected
regions of interest, or other ad hoc assumptions (Dashti ). It has thus been possible to
compile molecular movies of the ribosome, map its energy landscape, and identify the
trajectory on the energy landscape associated with function (Fig. 1). Cryo-EM is thus strongly
positioned to lead the ongoing shift of emphasis from determining structure to elucidating
function at the near-atomic level.
FIG. 1.
(a) 3D structure of ribosome shown in three standard views. (b) Energy landscape of
ribosome. (Low-energy regions appear blue.) Conformational changes occurring between each
of the seven numbered points along the trajectory are shown in letters. Movies showing the
continuous conformational changes along the low-energy trajectory are available in Dashti . Reprinted with
permission from Dashti et al., Proc. Natl. Acad. Sci. U. S. A.
111(49), 17492–17497 (2014). Copyright 2014 by PNAS.
DETERMINING THE 3D STRUCTURE OF BIOLOGICAL ENTITIES BY XFEL
Since the initial suggestion by Solem and Baldwin over three decades ago (Solem and Baldwin, 1982), the impressive simulations of
Neutze fifteen years ago, and the advent of XFELs capable of producing intense, ultrashort
pulses of hard X-rays,
the biostructural community has anticipated with excitement the 3D reconstructions of
individual biological entities without radiation damage, or the need for crystals. Results
are now beginning to emerge (Ekeberg ), albeit with resolutions in the 100 nm range. These indicate the relative
infancy of the field, whose progress will likely mirror the trajectory followed by cryo-EM at
the corresponding stage in its development.

Experimental challenges in single-particle imaging by XFEL are similar to those faced by
the early cryo-EM community: mitigation of shot-to-shot variations in the characteristics of
the incident radiation; reproducible, artifact-free introduction of individual biological
entities in their native states into the beam; and the availability of well-characterized,
linear detectors of sufficient dynamic range. From an algorithmic point of view, XFEL-based
data-analytical methods initially focused on 3D reconstruction from ultralow-signal snapshots
with Poisson noise (Fung and Loh and Elser, 2009). The initial concentration on giant viruses,
however, requires the algorithmic capability to deal with snapshots recording millions of
scattered photons, a substantial proportion of which do not emanate from the object at all, but
change unpredictably from shot to shot. These difficulties have led to the realization that an
international collaborative effort is required for further experimental and algorithmic
progress (Aquila ). The advent of systematic efforts to identify optimum conditions for
single-particle imaging by XFELs will likely help to reduce artifacts due to shot-to-shot
variations in extraneous scattering and detector nonlinearities. Such efforts must occur in
tandem with the development of algorithms able to recover reliable 3D structure in the
presence of remaining artifacts. The drive to experimentally reduce and algorithmically deal
with artifacts represents a new and important direction in single-particle structure
determination by XFELs.
Present algorithms for 3D single-particle structure determination by XFEL
Since each diffraction snapshot is a 2D section through the center of a 3D diffraction
volume, 3D structure recovery seems, at first sight, a classical tomographic problem.
However, the snapshots emanate from unknown orientations of weakly scattering biological objects
(Shneerson ). In 2009, two apparently different Bayesian approaches (Fung and Loh and Elser,
2009), later shown to be fundamentally the same (Moths and Ourmazd, 2011), demonstrated the
capability to recover 3D structure from a collection of simulated snapshots at the SNR
expected from a single 500 kDa biological molecule. Since then, a growing number of
publications have replicated the
same capability (Tegze and Bortel, 2012 and Kassemeyer, ). Some of
these approaches have demonstrated success with ultralow-signal experimental snapshots
obtained by cryo-EM (Schwander ), or even a conventional X-ray source (Philipp ).

The same level of success, however, has not been achieved with experimental XFEL
snapshots of biological entities, even from giant viruses. Since such large objects scatter
millions of photons onto the detector, the scattering is visible to the naked eye as
emanating from an icosahedral object (Fig. 2). The relative lack of success is
thus, at first sight, surprising. When 3D structure recovery has been demonstrated (Xu and Ekeberg ), it has been
based on a few (typically 10–200) individual snapshots, carefully selected from large
collections containing ≥10⁵ snapshots, so as to minimize the shot-to-shot
variations in imaging conditions and detector response. In order to compete with alternative
methods of structure recovery at nanometer and sub-nanometer levels and because of the
need to compile ensemble averages as mentioned earlier, it is essential to develop the
ability to recover 3D structure from large datasets.
FIG. 2.
Some artifacts in a typical XFEL diffraction pattern from a large virus, obtained with a
liquid-jet injector. Features marked (a) and (b) are due to scattering from the injector
nozzle. Features (c) and (d) stem from movements in the position of the liquid jet
containing the sample. Dark lines marked (e) represent the effect of electronic noise and
dead pixels.
Two observations are appropriate at this point. First, when extraneous scattering and detector response dominate,
they affect all methods based on similarity (Fung ; Loh and Elser, 2009; and Giannakis ), as
well as those based on angular correlations (Saldin and Kirian ). Second, increases in the
incident X-ray intensity are unlikely to solve this problem, because object and extraneous
scattering effects often scale
together, while detector nonlinearities tend to become worse at higher intensities.
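The second observation can be illustrated with a deliberately simple scaling model. This is a sketch under stated assumptions, not a description of any particular experiment: the symbols $I_0$ (incident intensity), $a_{\mathrm{obj}}$ and $a_{\mathrm{ext}}$ (hypothetical per-pixel scattering factors for the object and for extraneous sources) are introduced here for illustration only, and both contributions are assumed to be strictly proportional to $I_0$.

```latex
\begin{align}
  % Assumed: both contributions to the counts in a pixel scale linearly with I_0
  n_{\mathrm{obj}} &= I_0\, a_{\mathrm{obj}}, &
  n_{\mathrm{ext}} &= I_0\, a_{\mathrm{ext}}, \\
  % The object-to-extraneous ratio is therefore independent of I_0
  \frac{n_{\mathrm{obj}}}{n_{\mathrm{ext}}} &= \frac{a_{\mathrm{obj}}}{a_{\mathrm{ext}}}, \\
  % whereas only the Poisson (shot-noise) SNR of the total counts improves
  \mathrm{SNR}_{\mathrm{Poisson}} &= \frac{n_{\mathrm{obj}} + n_{\mathrm{ext}}}{\sqrt{n_{\mathrm{obj}} + n_{\mathrm{ext}}}}
    = \sqrt{I_0\,\bigl(a_{\mathrm{obj}} + a_{\mathrm{ext}}\bigr)}.
\end{align}
```

Within this model, a brighter pulse reduces counting noise but leaves the relative weight of the extraneous scattering unchanged, so similarity measures remain dominated by it.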
Future algorithms for 3D structure determination by XFEL
Future single-particle algorithms must be able to remove extraneous artifacts, which vary
from shot to shot. These artifacts include: (i) variations in the incident X-ray beam intensity,
inclination, and position; (ii) scattering from upstream apertures and the sample injector; (iii)
variations in beam-sample impact parameter; (iv) multi-particle hits; and (v) detector
nonlinearities (Fig. 2). Unfortunately, each of these
factors changes from shot to shot (Fig. 3).
FIG. 3.
Representative selection of snapshots showing shot-to-shot variations in effects
extraneous to the particle under observation.
Algorithms designed to extract the object orientation from each 2D snapshot rely on some
measure of similarity (or angular correlation) between snapshots. However, in the presence
of strong extraneous artifacts, such measures reveal
the changes in the artifacts, rather than the particle orientation and/or conformation.
Manifold-based approaches (Giannakis ; Schwander ; Hosseinizadeh ; and Schwander ) offer a geometric means of visualizing the effect of
extraneous artifacts. In the absence of extraneous effects and conformational
heterogeneities, the points representing a collection of snapshots from sightings of an
object in different orientations lie on a specific hypersurface (“manifold”) (Giannakis ; Hosseinizadeh ; and Schwander ). A
two-dimensional representation of the hypersurface produced by simulated snapshots of an
icosahedral virus is shown in Fig. 4(a), to be compared with the hypersurface
produced by experimental snapshots of a large virus [Fig. 4(b)].
This comparison makes it clear that the dominant similarity relationships between the
snapshots contain little information about the orientation of the virus, reflecting, instead, the
changes in extraneous effects. Fig. 5, for example,
shows that the arc length along the experimental manifold of Fig. 4(b) reflects changes in the incident beam intensity.
FIG. 4.
(a) Manifold of simulated diffraction snapshots from an icosahedral virus. Each point in
this plot represents a diffraction pattern. (b) 2D representation of the manifold of the
experimental XFEL snapshots from a large icosahedral virus.
FIG. 5.
The total intensity of the snapshots (vertical axis) is correlated with the position
along the parabolic manifold [arc length of parabola in Fig. 4(b)].
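The effect described above, where the dominant coordinate of a similarity-based embedding tracks beam intensity rather than orientation, can be reproduced with a toy simulation. The sketch below is illustrative only: it uses principal component analysis as a simple linear stand-in for the manifold-embedding methods cited in the text, and the pattern model, the function names `simulate` and `leading_embedding_coordinate`, and all parameter values are assumptions made for this example.

```python
import numpy as np

def leading_embedding_coordinate(snapshots):
    """Return the first principal-component coordinate of each snapshot."""
    X = snapshots.reshape(len(snapshots), -1).astype(float)
    X -= X.mean(axis=0)                         # center each pixel over the ensemble
    U, S, _ = np.linalg.svd(X, full_matrices=False)
    return U[:, 0] * S[0]

def simulate(n_shots=400, size=32, seed=0):
    """Toy snapshots: a fixed pattern, rotated in-plane, with varying beam intensity."""
    rng = np.random.default_rng(seed)
    y, x = np.indices((size, size)) - (size - 1) / 2.0
    r = np.hypot(x, y)
    phi = np.arctan2(y, x)
    intensity = rng.uniform(0.2, 1.0, n_shots)      # assumed shot-to-shot intensity
    angle = rng.uniform(0.0, 2.0 * np.pi, n_shots)  # assumed in-plane orientation
    shots = np.empty((n_shots, size, size))
    for i in range(n_shots):
        pattern = np.exp(-r / 8.0) * (1.0 + 0.3 * np.cos(2.0 * (phi - angle[i])))
        shots[i] = rng.poisson(1000.0 * intensity[i] * pattern)  # Poisson counting noise
    return shots, intensity, angle

shots, intensity, angle = simulate()
coord = leading_embedding_coordinate(shots)
total = shots.sum(axis=(1, 2))                      # total intensity per snapshot
corr = abs(np.corrcoef(coord, total)[0, 1])
print(f"|correlation| between leading coordinate and total intensity: {corr:.3f}")
```

In this toy model, the intensity variation carries far more variance than the orientation-dependent modulation, so the leading coordinate orders the snapshots by total intensity and says almost nothing about orientation.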
Post facto algorithmic correction of extraneous effects: Dynamic flat-fielding
One can regard the shot-to-shot changes in extraneous effects as changes in the detector
response (background and gain). Corrections to detector response are traditionally implemented via
so-called flat-fielding methods. As normally implemented, this involves a one-time
correction to variations in detector response across a uniformly illuminated detector (Seibert ).

In the case of XFEL snapshots, the incident imaging conditions vary from shot to shot,
requiring a different “flat-field” correction for each snapshot. As little is known about
the imaging conditions for each shot, this appears infeasible.

As described below, however, it is possible to design a “dynamic flat-fielding”
correction for each snapshot based on the intrinsic properties of diffraction patterns
themselves. The main assumption is that single-particle diffraction snapshots are
collected in uniformly random orientations.

Provided that the snapshots are corrected for X-ray polarization, an average over a sufficiently large
number of snapshots must be azimuthally uniform. In the presence of pixel-to-pixel
variations in detector response, this symmetry is lost. One can therefore correct the
combined effect of extraneous artifacts by applying a flat-field to each snapshot
specifically designed to restore azimuthal symmetry to the ensemble-average of snapshots
obtained under closely similar imaging conditions. This can be done with the help of
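manifold-based approaches, as outlined below.

First, one sorts the data-points along the parabolic manifold shown in Fig. 4(b), in order to
identify subsets, within each of which the imaging conditions are closely similar. This
“classification” can be performed by using bins along the arc-length $s$ of the parabola
[Fig. 4(b)]. Specifically, one computes the angular average of the diffracted intensities
$I(q,\phi)$, as a function of spatial frequency $q$ and azimuthal angle $\phi$, over the
snapshots within an interval $\Delta s$, that is,

$$\langle I(q) \rangle_{\Delta s} = \frac{1}{2\pi} \int_0^{2\pi} \langle I(q,\phi) \rangle_{\Delta s} \, d\phi, \qquad \langle I(q,\phi) \rangle_{\Delta s} = \frac{1}{N} \sum_{i=1}^{N} I_i(q,\phi), \qquad (1)$$

where $N$ is the number of diffraction patterns in the selected interval $\Delta s$. Next, a
flat-field correction for the interval $\Delta s$ is designed with

$$F_{\Delta s}(q,\phi) = \frac{\langle I(q) \rangle_{\Delta s}}{\langle I(q,\phi) \rangle_{\Delta s}}. \qquad (2)$$

This flat-field correction is applied to every image $I_i$ in the interval $\Delta s$, in
order to restore azimuthal symmetry to the ensemble-average over that interval, viz.,

$$I_i'(q,\phi) = F_{\Delta s}(q,\phi) \, I_i(q,\phi), \qquad (3)$$

where $I_i'$ represents the corrected intensity.

The procedure outlined above corrects for detector variations within a single class in the
interval $\Delta s$. In order to correct globally for all intervals of $s$ along the parabolic
manifold, one makes $\langle I(q) \rangle_{\Delta s}$ in Eq. (2) independent of the interval
by computing

$$\langle I(q) \rangle = \frac{1}{M} \sum_{\Delta s} \langle I(q) \rangle_{\Delta s}, \qquad (4)$$

with $M$ the number of intervals. Finally, each image in the dataset is corrected as follows:

$$I_i'(q,\phi) = \frac{\langle I(q) \rangle}{\langle I(q,\phi) \rangle_{\Delta s}} \, I_i(q,\phi), \qquad (5)$$

where $I_i$ is a diffraction pattern in the interval $\Delta s$. This establishes identical
azimuthal averages, and thus a global correction for all values of $s$. The result is shown
in Fig. 6. Horizontal and vertical streaks in the raw image are eliminated, and differences
in overall intensity between the top and bottom panels of the image are corrected.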
FIG. 6.
(a) A typical raw XFEL diffraction pattern obtained from a biological object injected by
a continuous liquid jet. (b) The flat-field correction for this snapshot deduced from the
collection of diffraction patterns themselves. (c) Corrected snapshot after dynamic
flat-fielding.
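The dynamic flat-fielding procedure described in this section can be sketched in a few lines of NumPy. This is a minimal illustration under several assumptions that are not taken from the source: snapshots are taken to be already resampled onto a polar grid (spatial frequency by azimuthal angle) and corrected for polarization, the bin labels along the manifold arc length are taken as given, and the global radial profile is formed as a plain mean over bins. The function name `dynamic_flat_field` and the synthetic test data are hypothetical.

```python
import numpy as np

def dynamic_flat_field(polar_shots, bin_labels, eps=1e-12):
    """Apply a per-bin flat-field that restores azimuthal symmetry.

    polar_shots: (n_shots, n_q, n_phi) intensities on a polar grid (assumed input).
    bin_labels:  (n_shots,) integer bin index along the manifold arc length.
    """
    shots = np.asarray(polar_shots, dtype=float)
    labels = np.asarray(bin_labels)
    bins = np.unique(labels)
    # Ensemble average over each bin, as a function of (q, phi)
    bin_mean = {b: shots[labels == b].mean(axis=0) for b in bins}
    # Azimuthal average of each bin's ensemble average, as a function of q
    radial = {b: bin_mean[b].mean(axis=1) for b in bins}
    # Global radial profile: plain mean over bins (an assumption of this sketch)
    global_radial = np.mean([radial[b] for b in bins], axis=0)
    corrected = np.empty_like(shots)
    for b in bins:
        # Flat-field for this bin; eps guards against division by zero
        flat = global_radial[:, None] / np.maximum(bin_mean[b], eps)
        corrected[labels == b] = shots[labels == b] * flat
    return corrected

# Synthetic check: two bins with different artifact maps and overall intensity
rng = np.random.default_rng(0)
n_q, n_phi = 16, 36
profile = np.exp(-np.arange(n_q) / 5.0)[:, None]        # toy radial falloff
labels = np.repeat([0, 1], 50)
gains = [1.0 + 0.3 * rng.random((n_q, n_phi)),          # toy artifact map, bin 0
         2.0 + 0.3 * rng.random((n_q, n_phi))]          # toy artifact map, bin 1
shots = np.stack([gains[b] * profile * rng.uniform(0.9, 1.1, (n_q, n_phi))
                  for b in labels])
corrected = dynamic_flat_field(shots, labels)
mean0 = corrected[labels == 0].mean(axis=0)
mean1 = corrected[labels == 1].mean(axis=0)
```

After correction, the ensemble average of each bin is azimuthally uniform by construction, and the averages of different bins coincide. Individual snapshots keep their shot-specific fluctuations, which is where the orientation information resides.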
FUTURE DIRECTIONS
XFEL-based single-particle methods can determine the structure of large objects, which are
difficult to examine by other means. Determining the conformational spectra
(“movies”) of molecular processes and the energy landscapes traversed in the course of
function also represents a compelling goal. XFEL-based single-particle techniques have the
potential to provide unprecedented access to such information under physiologically relevant
conditions (temperature, pH, etc.).

Important challenges, however, remain. Experimentally, we must learn to reduce the effect
of shot-to-shot variations in imaging conditions to a level where structural and
conformational information can be reliably extracted. Algorithmically, we
must develop the capability to remove remaining artifacts and mine the information content
of large datasets to obtain statistically meaningful information. Finally, we must learn how
to combine these capabilities with time-resolved single-particle approaches, to gain
dynamical information on biologically important processes. Timing uncertainties, caused, for
example, by “pump-probe jitter” when an optical pump is combined with an XFEL probe pulse,
or by non-uniform reaction initiation in solution, can limit the available information. In
order to exploit the exquisite short-pulse capability of the XFEL, we must develop the
ability to extract accurate information from datasets collected in the presence of large
timing uncertainty. The magnitude of these prizes and the challenges they pose require
concerted international cooperation. It is gratifying that such efforts are gathering
momentum.