I Munro1, E García1, M Yan1,2, S Guldbrand1, S Kumar1,3, K Kwakwa1, C Dunsby1,4, M A A Neil1, P M W French1.
Abstract
Super-resolved microscopy techniques have revolutionized the ability to study biological structures below the diffraction limit. Single molecule localization microscopy (SMLM) techniques are widely used because they are relatively straightforward to implement and can be realized at relatively low cost, e.g. compared to laser scanning microscopy techniques. However, while the data analysis can be readily undertaken using open source or other software tools, large SMLM data volumes and the complexity of the algorithms used often lead to long image data processing times that can hinder the iterative optimization of experiments. There is increasing interest in high throughput SMLM, but its further development and application are inhibited by the data processing challenges. We present here a widely applicable approach to accelerating SMLM data processing via a parallelized implementation of ThunderSTORM on a high-performance computing (HPC) cluster and quantify the speed advantage for a four-node cluster (with 24 cores and 128 GB RAM per node) compared to a high specification (28 cores, 128 GB RAM, SSD-enabled) desktop workstation. This data processing speed can be readily scaled by accessing more HPC resources. Our approach is not specific to ThunderSTORM and can be adapted for a wide range of SMLM software.
LAY DESCRIPTION
Optical microscopy is now able to provide images with a resolution far beyond the diffraction limit thanks to relatively new super-resolved microscopy (SRM) techniques, which have revolutionized the ability to study biological structures. One approach to SRM is to randomly switch on and off the emission of fluorescent molecules in an otherwise conventional fluorescence microscope. If only a sparse subset of the fluorescent molecules labelling a sample can be switched on at a time, then each emitter will be, on average, spaced further apart than the diffraction-limited resolution of the conventional microscope, and the separate bright spots in the image corresponding to each emitter can be localised to high precision by finding the centre of each feature using a computer program. Thus, a precise map of the emitter positions can be recorded by sequentially mapping the localisation of different subsets of emitters as they are switched on and others switched off. Typically, this approach, described as single molecule localisation microscopy (SMLM), results in large image data sets that can take many minutes to hours to process, depending on the size of the field of view and whether the SMLM analysis employs a computationally intensive iterative algorithm. Such a slow workflow makes it difficult to optimise experiments and to analyse large numbers of samples. Faster SMLM experiments would be generally useful, and automated high throughput SMLM studies of arrays of samples, such as cells, could be applied to drug discovery and other applications. However, the time required to process the resulting data would be prohibitive on a normal computer. To address this, we have developed a method to run standard SMLM data analysis software tools in parallel on a high-performance computing (HPC) cluster. This can be used to accelerate the analysis of individual SMLM experiments or it can be scaled to analyse high throughput SMLM data by extending it to run on an arbitrary number of HPC processors in parallel. In this paper we outline the design of our parallelised SMLM software for HPC and quantify the speed advantage when implementing it on four HPC nodes compared to a powerful desktop computer.
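The centre-finding step described above can be illustrated with a minimal sketch. This is an intensity-weighted centroid estimator applied to a synthetic spot, not the paper's actual fitting code (the paper uses, e.g., Gaussian fitting in ThunderSTORM); the ROI size and spot parameters below are made up for the example.

```python
import numpy as np

def localize_centroid(roi):
    """Estimate the sub-pixel position of a single emitter as the
    intensity-weighted centroid of a small region of interest (ROI)."""
    roi = roi.astype(float)
    roi = roi - roi.min()          # crude background subtraction
    total = roi.sum()
    ys, xs = np.indices(roi.shape)
    y = (ys * roi).sum() / total   # intensity-weighted mean row
    x = (xs * roi).sum() / total   # intensity-weighted mean column
    return x, y

# Synthetic 7x7 diffraction-limited spot centred at (x, y) = (3.0, 3.0)
ys, xs = np.indices((7, 7))
spot = np.exp(-((xs - 3.0) ** 2 + (ys - 3.0) ** 2) / (2 * 1.2 ** 2))
x, y = localize_centroid(spot)
```

In a real SMLM pipeline this estimator would run on every candidate spot in every camera frame, which is why the total processing time scales with the number of frames and emitters.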
Keywords: Automated image analysis; high-performance computing; super-resolved microscopy
Year: 2018 PMID: 30508256 PMCID: PMC6378585 DOI: 10.1111/jmi.12772
Source DB: PubMed Journal: J Microsc ISSN: 0022-2720 Impact factor: 1.758
Figure 1: Schematic of the SMLM data analysis workflow.
Figure 2: Schematic of the data flow for parallel HPC analysis of SMLM data.
Figure 3: Schematic of the software structure for HPC SMLM data analysis.
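The data flow sketched in Figures 2 and 3 — splitting the frame stack into contiguous chunks, analysing each chunk as an independent job, and merging the per-chunk localization tables — can be illustrated as follows. This is a hypothetical single-machine stand-in, not the paper's software: a thread pool replaces the cluster's batch jobs, and a trivial brightest-pixel "localizer" replaces ThunderSTORM; all function names are ours.

```python
from concurrent.futures import ThreadPoolExecutor
import numpy as np

def analyze_chunk(first_frame, frames):
    """Stand-in for the SMLM fitting engine run on one chunk of frames:
    takes the brightest pixel of each frame as a toy 'localization'
    and returns (frame, x, y) rows."""
    rows = []
    for i, frame in enumerate(frames):
        y, x = np.unravel_index(np.argmax(frame), frame.shape)
        rows.append((first_frame + i, float(x), float(y)))
    return rows

def parallel_localize(stack, n_jobs=4):
    """Split the stack into contiguous chunks, analyse them concurrently
    (on a real cluster each chunk would be a separate batch job) and
    merge the per-chunk tables, preserving global frame numbers."""
    bounds = np.linspace(0, len(stack), n_jobs + 1, dtype=int)
    with ThreadPoolExecutor(max_workers=n_jobs) as pool:
        futures = [pool.submit(analyze_chunk, int(b0), stack[b0:b1])
                   for b0, b1 in zip(bounds[:-1], bounds[1:]) if b1 > b0]
        return [row for f in futures for row in f.result()]

# Toy stack: 100 noisy frames, each with one bright pixel at (y=5, x=7)
rng = np.random.default_rng(0)
stack = rng.poisson(2.0, size=(100, 32, 32)).astype(float)
stack[np.arange(100), 5, 7] += 100.0
table = parallel_localize(stack)
```

Because each frame's localizations are independent, the merge step only needs to keep the global frame numbers consistent across chunks; subsequent steps such as drift correction then operate on the merged table.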
Figure 4: Images of acetylated tubulin in an NIH-3T3 mouse embryonic fibroblast: (A) wide-field fluorescence image; (B, C) automatically processed (averaged shifted histogram) visualizations of the Gaussian NWLS fitting localization table (B) and the phasor-based localization table (C), with lateral drift correction. (Scale bar is 10 μm; inset images are 10 μm × 10 μm.)
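The phasor-based localization compared against Gaussian fitting above estimates an emitter's sub-pixel position from the phases of the first Fourier coefficients of the ROI, avoiding iterative fitting. A minimal sketch of the idea, assuming a single emitter per ROI (the function name is ours, not from the paper or ThunderSTORM):

```python
import numpy as np

def localize_phasor(roi):
    """Phasor-based localization: the phases of the first Fourier
    coefficients along x and y encode the emitter's position."""
    h, w = roi.shape
    f = np.fft.fft2(roi)
    ang_x = np.angle(f[0, 1])      # first coefficient along x
    ang_y = np.angle(f[1, 0])      # first coefficient along y
    if ang_x > 0:
        ang_x -= 2 * np.pi         # map phase into one full period
    if ang_y > 0:
        ang_y -= 2 * np.pi
    x = abs(ang_x) * w / (2 * np.pi)
    y = abs(ang_y) * h / (2 * np.pi)
    return x, y

# Synthetic 7x7 spot centred at (x, y) = (3.0, 3.0)
ys, xs = np.indices((7, 7))
spot = np.exp(-((xs - 3.0) ** 2 + (ys - 3.0) ** 2) / (2 * 1.2 ** 2))
x, y = localize_phasor(spot)
```

Because it needs only one 2D FFT (or two 1D sums) per ROI instead of an iterative optimisation, this class of algorithm is much faster than Gaussian NWLS or MLE fitting, which is consistent with the timing differences reported in the tables below.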
Comparison of processing times with iterative fitting to a Gaussian PSF and with phasor-based localization in ThunderSTORM for SMLM data analysis, using the desktop computer and using the HPC cluster in parallel mode

| 13.4 GB data set (5000 frames) | Phasor | Gaussian NWLS |
|---|---|---|
| Desktop PC (14 cores) | 9 min 30 s | 39 min 17 s |
| HPC cluster (parallel mode) | 2 min 30 s | 2 min 36 s |
| Localizations (retained after filtering) | 11 168 044 (11 093 056) | 10 981 212 (10 981 146) |
Figure 5: Reconstructed images (with expanded regions to show more detail) of alpha-tubulin (green) and acetylated tubulin (red) in NIH-3T3 mouse embryonic fibroblasts generated by parallel HPC analysis using the 4 × 24-core node cluster. Images reconstructed from (A) the full 25 000 frames and (B) a subset containing the first 10 000 frames of the acquired data. (Scale bar is 10 μm; inset images are 10 μm × 10 μm.)
Comparison of processing times for the SMLM data represented in Figure 5, for different stages of SMLM data analysis, using the desktop computer and the four-node HPC cluster in parallel mode

| AF488 dSTORM (67.3 GB) | HPC NWLS, 25k frames | HPC NWLS, 10k frames | Desktop NWLS, 25k frames | Desktop NWLS, 10k frames | Desktop phasor, 25k frames | Desktop phasor, 10k frames |
|---|---|---|---|---|---|---|
| Fitting time | 16 min 34 s | 10 min 18 s | 159 min 07 s | 122 min 57 s | 37 min 36 s | 20 min 26 s |
| Post-processing time | 8 min 06 s | 7 min 28 s | 4 min 24 s | 3 min 34 s | 3 min 48 s | 3 min 39 s |
| Localizations (retained after filtering) | 19 308 944 (19 308 355) | 16 975 777 (16 975 196) | 19 308 944 (19 308 355) | 16 975 777 (16 975 196) | 19 981 786 (19 546 283) | 17 645 143 (17 269 087) |

Note: Timings are shown for Gaussian NWLS fitting on both the HPC cluster (four nodes, four jobs per node) and the desktop PC, and for phasor fitting on the desktop PC. Each algorithm is applied to either the full 25 000 frames or the subset containing the first 10 000 frames.
Figure 6: Depth-resolved dSTORM images of an NIH-3T3 cell with alpha-tubulin labelled with Alexa Fluor 647, reconstructed using ThunderSTORM from a 10 000-frame data set (29 GB) using (B) phasor-based localization, (C) Gaussian nonlinear weighted least squares (NWLS) fitting and (D) Gaussian fitting using maximum likelihood estimation (MLE). The full-size image (A) uses NWLS and shows the 10 μm × 10 μm area zoomed in (C) and (D). (Scale bar 10 μm; depth colour scale: –400 to +400 nm.)
Comparison of processing times for different stages of SMLM data analysis using the desktop computer and using the four-node HPC cluster in parallel mode

| 3D STORM (10 000 frames, 29 GB) | Desktop PC, phasor | Desktop PC, Gaussian NWLS | HPC (four nodes, four jobs/node), NWLS | HPC (four nodes, four jobs/node), MLE |
|---|---|---|---|---|
| Fitting time | 9 min 26 s | 116 min 32 s | 11 min 31 s | 229 min 52 s |
| Post-processing time | 3 min 27 s | 3 min 33 s | 4 min 29 s | 4 min 18 s |
| Localizations (retained after filtering) | 8 764 013 (8 707 036) | 8 677 386 (6 754 187) | 8 677 386 (6 754 187) | 8 683 023 (6 767 722) |