Tekin Bicer1, Doğa Gürsoy2, Vincent De Andrade2, Rajkumar Kettimuthu1,3, William Scullin4, Francesco De Carlo2, Ian T Foster1,3,5.
Abstract
BACKGROUND: Modern synchrotron light sources and detectors produce data at such scale and complexity that large-scale computation is required to unleash their full power. One widely used imaging technique that generates data at tens of gigabytes per second is computed tomography (CT). Although CT experiments generate data rapidly, analysis and reconstruction of the collected data may require hours or even days of computation on a medium-sized workstation, hindering the scientific progress that depends on the analysis results.
Keywords: Big data; High-throughput; Reconstruction; Tomography
Year: 2017 PMID: 28261544 PMCID: PMC5313579 DOI: 10.1186/s40679-017-0040-7
Source DB: PubMed Journal: Adv Struct Chem Imaging ISSN: 2198-0926
Fig. 1 Reconstructed 3D image of a shale sample [46]. The input dataset consists of 90 projections, each with 2K × 2K pixels. a, b The 3D reconstructed image using the SIRT and Gridrec algorithms, respectively. Reconstruction with SIRT takes 353 s for 80 iterations using two threads; reconstruction with Gridrec takes only 9 s using one thread. However, SIRT with 80 iterations provides a higher-quality image than does Gridrec
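The iterative SIRT algorithm compared in Fig. 1 can be sketched in a few lines. This is a hedged toy illustration, not the paper's Trace implementation: it applies the standard simultaneous update x ← x + C Aᵀ R (b − A x), where R and C hold the inverse row and column sums of the projection matrix A, on a small hypothetical linear system standing in for the tomographic projector.

```python
import numpy as np

def sirt(A, b, n_iter=80):
    """Simultaneous Iterative Reconstruction Technique on a dense toy system.

    A      : (m, n) nonnegative projection matrix
    b      : (m,) measured projection data
    n_iter : number of iterations (the paper's SIRT runs use 80)
    """
    m, n = A.shape
    R = 1.0 / np.maximum(A.sum(axis=1), 1e-12)  # inverse row sums
    C = 1.0 / np.maximum(A.sum(axis=0), 1e-12)  # inverse column sums
    x = np.zeros(n)
    for _ in range(n_iter):
        # Back-project the weighted residual and update all voxels at once.
        x = x + C * (A.T @ (R * (b - A @ x)))
    return x

# Hypothetical 3x2 system: two "voxels", three "ray" measurements.
A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
x_true = np.array([1.0, 2.0])
b = A @ x_true           # consistent, noise-free data
x = sirt(A, b, n_iter=200)
print(np.round(x, 3))    # converges toward x_true
```

The many matrix-vector products per iteration are why SIRT costs minutes where the single-pass, FFT-based Gridrec finishes in seconds, which is the trade-off the figure illustrates.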
Fig. 2 Execution flow (steps 1–5) of Trace middleware with sinogram-level group communication
Fig. 3 Data organization of the replicated reconstruction object: Trace and the cache-optimized Trace-OC
Fig. 4 Execution times (s) and L1 cache misses with respect to different numbers of projections and columns using SIRT. a Execution times with varying number of projections. b Number of L1 cache misses with varying number of projections. c Execution times with varying number of columns. d Number of L1 cache misses with varying number of columns
Fig. 5 Breakdown of iterative reconstruction times (s) with respect to varying parallelization configurations. Here ppn stands for processes per node, and t is the number of initialized threads per process. For example, with configuration ppn2-t6, Trace-OC initiates two processes on each compute node, each with six worker threads (i.e., a total of 12 threads per compute node). We used 32 compute nodes for the reconstruction. a Mouse brain, b shale sample
Fig. 6 Execution times (s) for reconstructing a single sinogram of the mouse brain dataset with different numbers of compute nodes. The number of iterations is set to five, and the ppn2-t6 configuration is used