Takanori Nakane, Yasumasa Joti, Kensuke Tono, Makina Yabashi, Eriko Nango, So Iwata, Ryuichiro Ishitani, Osamu Nureki.
Abstract
A data processing pipeline for serial femtosecond crystallography at SACLA was developed, based on Cheetah [Barty et al. (2014). J. Appl. Cryst. 47, 1118-1131] and CrystFEL [White et al. (2016). J. Appl. Cryst. 49, 680-689]. The original programs were adapted for data acquisition through the SACLA API, thread and inter-node parallelization, and efficient image handling. The pipeline consists of two stages. The first, online stage analyses all images in real time, with a latency of less than a few seconds, to provide feedback on hit rate and detector saturation. The second, offline stage converts hit images into HDF5 files and runs CrystFEL for indexing and integration. The size of the filtered, compressed output is comparable to that of a synchrotron data set. The pipeline enables real-time feedback and rapid structure solution during beamtime.
Keywords: SACLA; parallelization; real-time processing; serial femtosecond crystallography; spot finding
Year: 2016 PMID: 27275146 PMCID: PMC4886989 DOI: 10.1107/S1600576716005720
Source DB: PubMed Journal: J Appl Crystallogr ISSN: 0021-8898 Impact factor: 3.304
Figure 1. Architecture of the data processing environment at SACLA. The online pipeline runs on the online analysis server. Eight image acquisition threads retrieve image data from the corresponding data handling servers and fill the frame buffer (shown by arrows). Once a frame is complete, a Cheetah worker thread is dispatched to process it. For simplicity, only two worker threads are drawn. The offline pipeline runs on HPC nodes.
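The buffering scheme in this caption can be illustrated with a minimal Python sketch. It is not the Cheetah or SACLA API code: `FrameBuffer`, `acquire`, `count_parts` and `run_demo` are hypothetical names, the acquisition function only fabricates placeholder strings instead of reading from a data handling server, and the per-frame analysis is a stub. The only details taken from the caption are the eight acquisition threads and the dispatch of a worker once a frame is complete.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

N_SOURCES = 8  # one acquisition thread per data handling server (from the caption)


class FrameBuffer:
    """Collects the pieces of each frame; dispatches a worker once all have arrived."""

    def __init__(self, pool, process_frame):
        self._lock = threading.Lock()
        self._pending = {}  # frame tag -> {source index: data}
        self._pool = pool
        self._process_frame = process_frame
        self.futures = []

    def put(self, tag, source, data):
        with self._lock:
            parts = self._pending.setdefault(tag, {})
            parts[source] = data
            if len(parts) < N_SOURCES:
                return  # frame still incomplete
            del self._pending[tag]
            # The frame is complete: hand it to a worker thread.
            self.futures.append(self._pool.submit(self._process_frame, tag, parts))


def acquire(buffer, source, tags):
    """Hypothetical stand-in for reading from one data handling server."""
    for tag in tags:
        buffer.put(tag, source, f"data-{source}-{tag}")


def count_parts(tag, parts):
    """Stub for the per-frame analysis (spot finding in the real pipeline)."""
    return tag, len(parts)


def run_demo(n_frames=5, n_workers=2):
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        buffer = FrameBuffer(pool, count_parts)
        threads = [
            threading.Thread(target=acquire, args=(buffer, s, range(n_frames)))
            for s in range(N_SOURCES)
        ]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
        return sorted(f.result() for f in buffer.futures)
```

In this sketch a frame is dispatched by whichever acquisition thread delivers its last piece, so no separate scheduler thread is needed; whether the actual pipeline works this way is not stated in the caption.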
Figure 2. Typical flow of data in the pipeline. The online pipeline finds spots on images and provides real-time feedback on hit rates and detector saturation. The offline pipeline runs spot finding again and outputs hit images in the HDF5 format. The images are then processed by CrystFEL. In time-resolved experiments, excited and non-excited images are classified by the photodiode status. Users optimize parameters based on the pipeline output and re-run CrystFEL before final merging. If the refined parameters are known beforehand, they can be used in the pipeline and the stream files from the pipeline can be merged immediately (dotted line).
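The classification step for time-resolved experiments could look roughly like the following Python sketch. It assumes that each image carries a numeric photodiode reading and that a simple threshold separates the two groups; the function name, the tuple layout and the threshold are illustrative assumptions, not the pipeline's actual interface.

```python
from collections import defaultdict


def classify_by_photodiode(frames, threshold):
    """Split image tags into excited / non-excited groups.

    frames    -- iterable of (tag, photodiode_reading) pairs (assumed layout)
    threshold -- reading above which the pump laser is taken to have fired
                 (hypothetical value, to be chosen per experiment)
    """
    groups = defaultdict(list)
    for tag, reading in frames:
        label = "excited" if reading > threshold else "non-excited"
        groups[label].append(tag)
    return dict(groups)
```

For example, `classify_by_photodiode([(1, 0.9), (2, 0.1)], threshold=0.5)` places tag 1 in the excited group and tag 2 in the non-excited group, after which each group can be merged separately.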
Figure 3. (a) Screenshot of the monitor for the real-time pipeline. The number of spots in each image is plotted in red, while the number of saturated spots is plotted in blue. The hit rate is represented by the blue line. (b) Screenshot of the GUI for the offline pipeline.
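A hit-rate trace like the one shown in the monitor can be derived from the per-image spot counts. The Python sketch below assumes that an image counts as a hit when its spot count reaches a minimum, and that the rate is averaged over a sliding window of recent images; the class name, the `min_spots` criterion and the window length are assumptions for illustration, not the definitions used by the pipeline.

```python
from collections import deque


class HitRateMonitor:
    """Running hit rate over the most recent `window` images."""

    def __init__(self, min_spots, window):
        self.min_spots = min_spots          # assumed hit criterion
        self._hits = deque(maxlen=window)   # oldest entries drop out automatically

    def update(self, n_spots):
        """Record one image and return the hit rate over the current window."""
        self._hits.append(n_spots >= self.min_spots)
        return sum(self._hits) / len(self._hits)
```

Calling `update` once per analysed image yields a value between 0 and 1 that can be plotted alongside the spot counts.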
Figure 4. Screenshot of the extended hdfsee viewer. A table of images in a stream file is displayed on the left. On the right, spots identified by CrystFEL are circled in black, while predicted and integrated spots are circled in red.