David S Smith, John C Gore, Thomas E Yankeelov, E Brian Welch.
Abstract
Compressive sensing (CS) has been shown to enable dramatic acceleration of MRI acquisition in some applications. Because CS reconstruction is iterative, it can be more time-consuming than traditional inverse Fourier reconstruction. We have accelerated our CS MRI reconstruction by factors of up to 27 by using a split Bregman solver combined with a graphics processing unit (GPU) computing platform. The increases in speed we find are similar to those we measure for matrix multiplication on this platform, suggesting that the split Bregman methods parallelize efficiently. We demonstrate that the combination of the rapid convergence of the split Bregman algorithm and the massively parallel strategy of GPU computing can enable real-time CS reconstruction even of acquisition data matrices of dimension 4096² or more, depending on available GPU VRAM. Reconstruction of two-dimensional data matrices of dimension 1024² and smaller took ~0.3 s or less, showing that this platform also provides very fast iterative reconstruction for small-to-moderate size images.
Year: 2012 PMID: 22481908 PMCID: PMC3296267 DOI: 10.1155/2012/864827
Source DB: PubMed Journal: Int J Biomed Imaging ISSN: 1687-4188
Algorithm 1: GPU-based Split Bregman Compressive Sensing Reconstruction.
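The body of the algorithm is not reproduced in this record. As a rough illustration only, the sketch below shows the two kinds of per-iteration work found in a generic split Bregman total-variation solver: an elementwise shrinkage and a linear solve that the FFT diagonalizes. Both are highly parallel, which may be why such solvers map well onto a GPU. It is written in Python with NumPy rather than the MATLAB and Jacket setup named in this record. The function names, the periodic finite differences, the 0/1 sampling `mask` in unshifted FFT layout, and the parameters `mu`, `lam`, and `gamma` are all assumptions made for illustration, not the published implementation.

```python
import numpy as np

def shrink(v, threshold):
    """Elementwise soft-thresholding; valid for real or complex arrays."""
    mag = np.abs(v)
    # Pull every entry toward zero by `threshold`; small entries become zero.
    return v * (np.maximum(mag - threshold, 0.0) / np.maximum(mag, 1e-12))

def dx(u):
    return u - np.roll(u, 1, axis=1)   # periodic difference along columns

def dxt(v):
    return v - np.roll(v, -1, axis=1)  # adjoint of dx

def dy(u):
    return u - np.roll(u, 1, axis=0)   # periodic difference along rows

def dyt(v):
    return v - np.roll(v, -1, axis=0)  # adjoint of dy

def apply_system(u, mask, mu, lam, gamma):
    """Apply mu*F^H M F + lam*(Dx^T Dx + Dy^T Dy) + gamma*I to u."""
    data = np.fft.ifft2(mask * np.fft.fft2(u, norm="ortho"), norm="ortho")
    return mu * data + lam * (dxt(dx(u)) + dyt(dy(u))) + gamma * u

def solve_system(rhs, mask, mu, lam, gamma):
    """Invert apply_system: every term is diagonal in Fourier space."""
    rows, cols = rhs.shape
    # Eigenvalues of the periodic second-difference operators.
    ey = 2.0 - 2.0 * np.cos(2.0 * np.pi * np.arange(rows) / rows)
    ex = 2.0 - 2.0 * np.cos(2.0 * np.pi * np.arange(cols) / cols)
    denom = mu * mask + lam * (ey[:, None] + ex[None, :]) + gamma
    return np.fft.ifft2(np.fft.fft2(rhs, norm="ortho") / denom, norm="ortho")
```

In a full solver these two steps would alternate with a Bregman update of the auxiliary variables. The sketch stops short of that loop, so it says nothing about convergence or parameter choice.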
Figure 1: Sample reconstructions from the timing experiment. (a) shows the original T1-weighted breast image, (b) is the result of replacing the missing Fourier coefficients with zeros (minimum energy reconstruction), and (c) shows the CS reconstruction. (d) shows the 50% undersampling pattern of Fourier data retained (white is acquired; black is omitted). Entire lines of the Fourier domain were chosen to be consistent with the constraints of a 2D MRI acquisition.
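The zero-filled reconstruction in panel (b) and the line-wise sampling pattern in panel (d) can be sketched in a few lines of Python with NumPy. This is a minimal illustration under stated assumptions: the caption does not say how the lines were selected, so the random choice of rows, the fully sampled low-frequency band (`n_center`), the choice of rows as the line direction, and the synthetic rectangle used as a test image are all hypothetical.

```python
import numpy as np

def line_mask(shape, fraction=0.5, n_center=8, seed=0):
    """Boolean k-space mask that keeps whole rows (unshifted FFT layout)."""
    rows, cols = shape
    n_keep = int(round(fraction * rows))
    # Order row indices from lowest to highest spatial frequency.
    order = np.argsort(np.abs(np.fft.fftfreq(rows)), kind="stable")
    center, rest = order[:n_center], order[n_center:]
    rng = np.random.default_rng(seed)
    extra = rng.choice(rest, size=max(n_keep - n_center, 0), replace=False)
    keep = np.zeros(rows, dtype=bool)
    keep[center] = True   # always keep the low-frequency lines
    keep[extra] = True    # plus randomly chosen higher-frequency lines
    return np.repeat(keep[:, None], cols, axis=1)

def zero_filled_recon(image, mask):
    """Replace unsampled Fourier coefficients with zeros and invert."""
    kspace = np.fft.fft2(image, norm="ortho")
    return np.fft.ifft2(np.where(mask, kspace, 0.0), norm="ortho")

# Synthetic stand-in for an image; not the breast image from the figure.
image = np.zeros((64, 64))
image[16:48, 20:44] = 1.0
mask = line_mask(image.shape)
recon = zero_filled_recon(image, mask)
```

Because the transform is unitary here, discarding coefficients can only lower the signal energy, which is the sense in which zero filling is a minimum energy reconstruction.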
Measured peak CPU and GPU performance in GFLOPS for a matrix multiply experiment using MATLAB and Jacket. The GPU achieves a factor of ~5 improvement over the multicore CPU.
| Precision | CPU, 1 core (GFLOPS) | CPU, 6 cores (GFLOPS) | GPU (GFLOPS) |
|---|---|---|---|
| Single | 23.6 | 121 | 650 |
| Double | 11.8 | 60.2 | 311 |
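The "factor of ~5" can be checked directly from the table. The short Python sketch below computes the GPU-to-multicore ratio and the 6-core scaling for each precision. The helper `matmul_gflops` shows one common way to turn a timing into GFLOPS, counting 2n³ operations for an n-by-n product; whether the original measurement used that convention is an assumption.

```python
def matmul_gflops(n, seconds):
    """GFLOPS for an n-by-n matrix product, counting 2*n**3 operations."""
    return 2.0 * n**3 / seconds / 1e9

# Peak rates copied from the table (GFLOPS).
peak = {
    "single": {"cpu_1_core": 23.6, "cpu_6_cores": 121.0, "gpu": 650.0},
    "double": {"cpu_1_core": 11.8, "cpu_6_cores": 60.2, "gpu": 311.0},
}

# GPU advantage over the 6-core CPU, and CPU scaling from 1 to 6 cores.
gpu_vs_multicore = {p: r["gpu"] / r["cpu_6_cores"] for p, r in peak.items()}
multicore_scaling = {p: r["cpu_6_cores"] / r["cpu_1_core"] for p, r in peak.items()}

for precision in peak:
    print(precision,
          f"GPU/6-core = {gpu_vs_multicore[precision]:.2f}",
          f"6-core/1-core = {multicore_scaling[precision]:.2f}")
```

Both precisions give a GPU-to-multicore ratio a little above 5, and going from one core to six also yields a factor of about 5.1.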
Single-precision GPU CS MRI reconstruction times for test images ranging from 16² to 8192² with and without CPU multithreading. Effective speedup is the multicore CPU time relative to the GPU time.
| Square image size | CPU time, 1 core (s) | CPU time, 6 cores (s) | GPU time (s) | Effective speedup |
|---|---|---|---|---|
| 16 | 0.011 | 0.011 | 0.060 | 0.18 |
| 32 | 0.016 | 0.017 | 0.060 | 0.28 |
| 64 | 0.038 | 0.046 | 0.063 | 0.73 |
| 128 | 0.13 | 0.099 | 0.061 | 1.62 |
| 256 | 0.49 | 0.28 | 0.066 | 4.24 |
| 512 | 2.0 | 1.1 | 0.12 | 9.17 |
| 1024 | 8.4 | 3.4 | 0.34 | 10.00 |
| 2048 | 35 | 14 | 1.3 | 10.77 |
| 4096 | 160 | 68 | 8.3 | 8.19 |
| 8192 | 670 | 270 | 140 | 1.93 |
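As a worked check of the effective-speedup column, the Python sketch below recomputes it as 6-core CPU time divided by GPU time from the rounded times in the table, and also computes the single-core ratio. The variable and function names are illustrative only.

```python
# (image size, CPU 1 core s, CPU 6 cores s, GPU s, reported effective speedup)
single_precision = [
    (16, 0.011, 0.011, 0.060, 0.18),
    (32, 0.016, 0.017, 0.060, 0.28),
    (64, 0.038, 0.046, 0.063, 0.73),
    (128, 0.13, 0.099, 0.061, 1.62),
    (256, 0.49, 0.28, 0.066, 4.24),
    (512, 2.0, 1.1, 0.12, 9.17),
    (1024, 8.4, 3.4, 0.34, 10.00),
    (2048, 35, 14, 1.3, 10.77),
    (4096, 160, 68, 8.3, 8.19),
    (8192, 670, 270, 140, 1.93),
]

def speedup(cpu_seconds, gpu_seconds):
    """Ratio of CPU time to GPU time; values above 1 favor the GPU."""
    return cpu_seconds / gpu_seconds

multicore = {n: speedup(c6, g) for n, _, c6, g, _ in single_precision}
single_core = {n: speedup(c1, g) for n, c1, _, g, _ in single_precision}

# Smallest image size at which the GPU beats the 6-core CPU.
crossover = min(n for n, s in multicore.items() if s > 1.0)
# Image size with the largest single-core speedup.
best_size = max(single_core, key=single_core.get)
```

With these numbers the GPU overtakes the 6-core CPU at 128², and the largest single-core ratio is about 27 at 2048², which appears consistent with the "up to 27" figure quoted in the abstract.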
Double-precision GPU CS MRI reconstruction times for test images ranging from 16² to 4096² with and without CPU multithreading. Effective speedup is the multicore CPU time relative to the GPU time.
| Square image size | CPU time, 1 core (s) | CPU time, 6 cores (s) | GPU time (s) | Effective speedup |
|---|---|---|---|---|
| 16 | 0.011 | 0.011 | 0.062 | 0.18 |
| 32 | 0.017 | 0.017 | 0.062 | 0.27 |
| 64 | 0.040 | 0.046 | 0.065 | 0.71 |
| 128 | 0.13 | 0.10 | 0.064 | 1.5 |
| 256 | 0.49 | 0.28 | 0.070 | 4.0 |
| 512 | 2.0 | 1.2 | 0.12 | 10 |
| 1024 | 8.4 | 3.4 | 0.35 | 10 |
| 2048 | 35 | 15 | 1.4 | 11 |
| 4096 | 160 | 69 | 15 | 5 |
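The double-precision results stop at 4096², and the abstract ties the largest feasible matrix size to available GPU VRAM. A rough Python sketch of the per-array memory footprint may help put that in context. It assumes complex-valued arrays (8 bytes per element in single precision, 16 in double). The record states neither how many such arrays the solver holds at once nor how much VRAM the test GPU had, so the `n_arrays` argument is purely hypothetical.

```python
BYTES_PER_COMPLEX = {"single": 8, "double": 16}

def array_mib(n, precision):
    """Size in MiB of one n-by-n complex array at the given precision."""
    return n * n * BYTES_PER_COMPLEX[precision] / 2**20

def working_set_mib(n, precision, n_arrays):
    """Footprint of `n_arrays` such arrays; n_arrays is a hypothetical count."""
    return n_arrays * array_mib(n, precision)

for n in (1024, 2048, 4096, 8192):
    print(n, array_mib(n, "single"), array_mib(n, "double"))
```

One 8192² complex array takes 512 MiB in single precision and 1 GiB in double, so an iterative solver that keeps several working arrays could plausibly exceed the memory of a single GPU at that size.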
Figure 2: Speedup factors for the CS MRI reconstruction as a function of image size with and without CPU multithreading. The gray, dashed horizontal line shows a speedup of one.
Figure 3: Linear growth of yearly Google Scholar hits for “compressed sensing algorithm” over the last decade. The steadily growing number indicates the increasing difficulty of keeping pace with algorithmic improvements in compressed sensing.