Joseph Caesar, Cyril F Reboul, Chiara Machello, Simon Kiesewetter, Molly L Tang, Justin C Deme, Steven Johnson, Dominika Elmlund, Susan M Lea, Hans Elmlund.
Abstract
We here introduce the third major release of the SIMPLE (Single-particle IMage Processing Linux Engine) open-source software package for analysis of cryogenic transmission electron microscopy (cryo-EM) movies of single particles (Single-Particle Analysis, SPA). Development of SIMPLE 3.0 has been focused on real-time data processing using minimal CPU computing resources to allow easy and cost-efficient scaling of processing as data rates escalate. Our stream SPA tool implements the steps of anisotropic motion correction and CTF estimation, rapid template-based particle identification and 2D clustering with automatic class rejection. SIMPLE 3.0 additionally features an easy-to-use web-based graphical user interface (GUI) that can be run on any device (workstation, laptop, tablet or phone) and supports a remote multi-user environment over the network. The new project-based execution model automatically records the executed workflow and represents it as a flow diagram in the GUI. This facilitates meta-data handling and greatly simplifies usage. Using SIMPLE 3.0, it is possible to automatically obtain a clean SP data set amenable to high-resolution 3D reconstruction directly upon completion of the data acquisition, without the need for extensive image processing post collection. Only minimal standard CPU computing resources are required to keep up with a rate of ∼300 Gatan K3 direct electron detector movies per hour. SIMPLE 3.0 is available for download from simplecryoem.com.
Keywords: Cryo-EM; Real-time; Single-particle; Stream image processing
Year: 2020 PMID: 33294840 PMCID: PMC7695977 DOI: 10.1016/j.yjsbx.2020.100040
Source DB: PubMed Journal: J Struct Biol X ISSN: 2590-1524
Fig. 1. Schematic overview of steps in SPA.
Typical CPU resources required to keep up with data generated by the given detectors at the rate shown. All benchmarks were performed on machines with AMD EPYC 7551P processors, 192 GB RAM and an SSD-backed BeeGFS filesystem. These minimal resources can easily be housed within a single processing machine using modern CPU hardware.
| Detector | Movie Dimensions | Movie Frames | Movies/hour | CPU Threads |
|---|---|---|---|---|
| K2 | 3838 × 3710 | 32 | 100 | 16 |
| K3 (super-resolution) | 11520 × 8184 | 40 | 300 | 88 |
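The table can be turned into a rough per-movie compute budget with a few lines of arithmetic. The sketch below is only an illustration built from the table's own numbers: the `Benchmark` class and its property names are hypothetical, and "thread-seconds per movie" is an upper bound on the thread time available per movie if every thread is kept busy, not a measured per-movie cost.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Benchmark:
    """One row of the benchmark table (class and field names are illustrative)."""
    detector: str
    width: int
    height: int
    frames: int
    movies_per_hour: int
    cpu_threads: int

    @property
    def pixels_per_movie(self) -> int:
        # Raw pixel count of one movie: frame area times number of frames.
        return self.width * self.height * self.frames

    @property
    def seconds_between_movies(self) -> float:
        # Average arrival interval at the stated acquisition rate.
        return 3600 / self.movies_per_hour

    @property
    def thread_seconds_per_movie(self) -> float:
        # Thread time available per movie when all threads are busy.
        return self.cpu_threads * self.seconds_between_movies


BENCHMARKS = [
    Benchmark("K2", 3838, 3710, 32, 100, 16),
    Benchmark("K3 (super-resolution)", 11520, 8184, 40, 300, 88),
]

if __name__ == "__main__":
    for b in BENCHMARKS:
        print(
            f"{b.detector}: {b.pixels_per_movie:,} pixels/movie, "
            f"one movie every {b.seconds_between_movies:.0f} s, "
            f"{b.thread_seconds_per_movie:.0f} thread-seconds/movie"
        )
```

Read this way, a K3 super-resolution movie carries roughly eight times the pixels of a K2 movie and arrives three times as often, which is consistent with the larger thread count in the table.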
Fig. 2. Graphical User Interface (GUI). (a) Project window with workflow graph outlining the executed processes. Each box represents a process with the execution directory as heading and a process indicator (running, finished, failed) just below. The clickable eye icon in the lower left corner of each box links to (b) viewable outputs. In this example, the (b) panel shows micrograph (left), background subtracted power spectrum/fitted CTF model (middle) and picked particle coordinates (right) generated by 1_preprocess_stream. (c(i)) Following stream 2D analysis (process #2, executed after #1 stream pre-processing), the viewer links to the class averages produced. (c(ii)) The class averages can be closely inspected and link to (d) a particle viewer via the eye icon in the upper right corner of each class average, allowing visualization of the particles associated with each class and inspection of their associated statistics. (e) The folder icon in each box allows inspection of the output files produced in the execution directory. Outputs that can be rendered on screen link to viewers. (f) The process icon (cyclic double arrow) links to the task control window, where input parameters are arranged in dropdown menus according to their categorization. Only dropdown menus with required inputs are expanded by default. In this example, the categories are job parameters, search controls that modify optimization behavior, filter controls that modify Fourier filtering behavior, mask controls and computer controls used to change how the task is executed, e.g. the number of threads. (g) The text file icon allows inspection of the log file, to which all SIMPLE 3.0 subprocesses concatenate their output. The log file is used to report subprocess exceptions and should be inspected when the process indicator is in the “failed” state. (h) When 3D volumes are available they can be visualized, and the volume viewer supports 3D rendering over a remote connection.
Shown here is the output from 7_initial_3Dmodel, which in addition to the initial 3D reconstruction shows the class averages used and the associated re-projections of the volume for validation.
Fig. 3. Schematic overview of 2D stream processing. Red/green dots indicate good/bad classes. In the final class averages the red dots indicate an additional 5 classes manually deselected in addition to those automatically rejected during stream processing. The resolution of the best class averages is estimated at 7.9 Å. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
Fig. 4. Optics Group Assignment. Plot of beam shift in x and y for 7,428 movies collected using EPU, colored by optics group assignment. Hierarchical clustering is used to group movies based on beam shift, before each group is further divided into sub-populations based on the location identifier in the EPU filename. The user may limit the maximum population of each group and/or apply an offset to the optics group number to aid dataset combination. The data shown were collected using a 1.2/1.3 Quantifoil grid with two shots per hole using AFIS beam shift collection in EPU 2.7 (Thermo Fisher Scientific, The Netherlands).
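The grouping steps named in the Fig. 4 caption can be sketched as follows. This is a minimal illustration, not SIMPLE's implementation: the caption does not state the linkage criterion or cut-off, so single linkage with a user-supplied distance `threshold` is an assumption, and the location identifier is passed in as a ready-made string rather than parsed from an EPU filename. The function names, `max_population` and `offset` are hypothetical.

```python
from collections import defaultdict
from math import hypot


def single_linkage_groups(points, threshold):
    """Single-linkage agglomerative clustering cut at `threshold` (assumed criterion).

    Equivalent to the connected components of the graph linking any two
    points closer than `threshold`. O(n^2), which is fine for a sketch.
    """
    parent = list(range(len(points)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, (xi, yi) in enumerate(points):
        for j in range(i + 1, len(points)):
            xj, yj = points[j]
            if hypot(xi - xj, yi - yj) < threshold:
                parent[find(j)] = find(i)  # merge the two clusters
    return [find(i) for i in range(len(points))]


def assign_optics_groups(movies, threshold, max_population=None, offset=0):
    """movies: list of (beam_shift_x, beam_shift_y, location_id) tuples.

    Returns one optics group number per movie, numbered from offset + 1.
    """
    roots = single_linkage_groups([(x, y) for x, y, _ in movies], threshold)

    # Step 1 + 2: bucket by beam-shift cluster, then by location identifier.
    buckets = defaultdict(list)
    for idx, (root, (_, _, loc)) in enumerate(zip(roots, movies)):
        buckets[(root, loc)].append(idx)

    # Step 3: optionally cap the population of each group, then number them.
    groups = [0] * len(movies)
    next_group = offset + 1
    for key in sorted(buckets):
        members = buckets[key]
        step = max_population or len(members)
        for start in range(0, len(members), step):
            for idx in members[start:start + step]:
                groups[idx] = next_group
            next_group += 1
    return groups
```

Applying a non-zero `offset` keeps group numbers from colliding when two datasets are merged, which is the use case the caption mentions for dataset combination.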
Fig. 5. FSC curves and local-resolution colored volumes for the same particles (and half set assignments) processed with either SIMPLE 3.0 or RELION 3.1. (a) Blue curves are for data where patched motion correction and CTF estimation were performed in SIMPLE 3.0, before (dark blue) and after (light blue) CTF refinement and Bayesian Polishing in RELION 3.1. Orange curves are for the same particles extracted from RELION 3.1 motion-corrected and CTFFIND 4 CTF-estimated movies, before (dark orange) and after (light orange) CTF refinement and Bayesian Polishing in RELION 3.1. All volumes were refined in three independent calculations and the values shown are the mean ± SD of the FSC values obtained. (b) Example volumes from each protocol are shown colored by local resolution (calculated in RELION 3.1). (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
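The mean ± SD summary described in the Fig. 5 caption amounts to a per-shell average over the independent refinements. A minimal sketch follows, assuming each FSC curve is a list of values sampled on the same resolution shells. The caption does not say whether the sample or population standard deviation was used, so the sample form (`statistics.stdev`) is an assumption, and the function name `fsc_mean_sd` is hypothetical.

```python
from statistics import mean, stdev


def fsc_mean_sd(curves):
    """Per-shell mean and sample SD across independent FSC curves.

    curves: list of FSC curves (at least two), each a list of per-shell
    values on the same resolution shells.
    Returns a list of (mean, sd) tuples, one per shell.
    """
    if len(curves) < 2:
        raise ValueError("need at least two curves to compute a standard deviation")
    if len({len(c) for c in curves}) != 1:
        raise ValueError("all FSC curves must have the same number of shells")
    # zip(*curves) regroups the values shell by shell.
    return [(mean(shell), stdev(shell)) for shell in zip(*curves)]


if __name__ == "__main__":
    # Made-up placeholder values for three refinements, not data from the paper.
    demo = [
        [1.0, 0.8, 0.2],
        [1.0, 0.7, 0.3],
        [1.0, 0.9, 0.1],
    ]
    for shell, (m, s) in enumerate(fsc_mean_sd(demo)):
        print(f"shell {shell}: {m:.3f} +/- {s:.3f}")
```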