Martin C W Leong1, Kit-Hang Lee1, Bowen P Y Kwan2, Yui-Lun Ng1, Zhiyu Liu1, Nassir Navab3, Wayne Luk2, Ka-Wai Kwok4. 1. Department of Mechanical Engineering, The University of Hong Kong, Pok Fu Lam, Hong Kong. 2. Department of Computing, Imperial College London, London, SW7 2AZ, UK. 3. Computer Aided Medical Procedures and Augmented Reality, Technischen Universität München, 85748, Munich, Germany. 4. Department of Mechanical Engineering, The University of Hong Kong, Pok Fu Lam, Hong Kong. kwokkw@hku.hk.
Abstract
PURPOSE: Intensity-based image registration has been proven essential in many applications accredited to its unparalleled ability to resolve image misalignments. However, long registration time for image realignment prohibits its use in intra-operative navigation systems. There has been much work on accelerating the registration process by improving the algorithm's robustness, but the innate computation required by the registration algorithm has been unresolved. METHODS: Intensity-based registration methods involve operations with high arithmetic load and memory access demand, which supposes to be reduced by graphics processing units (GPUs). Although GPUs are widespread and affordable, there is a lack of open-source GPU implementations optimized for non-rigid image registration. This paper demonstrates performance-aware programming techniques, which involves systematic exploitation of GPU features, by implementing the diffeomorphic log-demons algorithm. RESULTS: By resolving the pinpointed computation bottlenecks on GPU, our implementation of diffeomorphic log-demons on Nvidia GTX Titan X GPU has achieved ~ 95 times speed-up compared to the CPU and registered a 1.3-M voxel image in 286 ms. Even for large 37-M voxel images, our implementation is able to register in 8.56 s, which attained ~ 258 times speed-up. Our solution involves effective employment of GPU computation units, memory, and data bandwidth to resolve computation bottlenecks. CONCLUSION: The computation bottlenecks in diffeomorphic log-demons are pinpointed, analyzed, and resolved using various GPU performance-aware programming techniques. The proposed fast computation on basic image operations not only enhances the computation of diffeomorphic log-demons, but is also potentially extended to speed up many other intensity-based approaches. Our implementation is open-source on GitHub at https://bit.ly/2PYZxQz .
PURPOSE: Intensity-based image registration has been proven essential in many applications accredited to its unparalleled ability to resolve image misalignments. However, long registration time for image realignment prohibits its use in intra-operative navigation systems. There has been much work on accelerating the registration process by improving the algorithm's robustness, but the innate computation required by the registration algorithm has been unresolved. METHODS: Intensity-based registration methods involve operations with high arithmetic load and memory access demand, which supposes to be reduced by graphics processing units (GPUs). Although GPUs are widespread and affordable, there is a lack of open-source GPU implementations optimized for non-rigid image registration. This paper demonstrates performance-aware programming techniques, which involves systematic exploitation of GPU features, by implementing the diffeomorphic log-demons algorithm. RESULTS: By resolving the pinpointed computation bottlenecks on GPU, our implementation of diffeomorphic log-demons on Nvidia GTX Titan X GPU has achieved ~ 95 times speed-up compared to the CPU and registered a 1.3-M voxel image in 286 ms. Even for large 37-M voxel images, our implementation is able to register in 8.56 s, which attained ~ 258 times speed-up. Our solution involves effective employment of GPU computation units, memory, and data bandwidth to resolve computation bottlenecks. CONCLUSION: The computation bottlenecks in diffeomorphic log-demons are pinpointed, analyzed, and resolved using various GPU performance-aware programming techniques. The proposed fast computation on basic image operations not only enhances the computation of diffeomorphic log-demons, but is also potentially extended to speed up many other intensity-based approaches. Our implementation is open-source on GitHub at https://bit.ly/2PYZxQz .
Authors: Terry S Yoo; Michael J Ackerman; William E Lorensen; Will Schroeder; Vikram Chalana; Stephen Aylward; Dimitris Metaxas; Ross Whitaker Journal: Stud Health Technol Inform Date: 2002
Authors: Stefan Klein; Marius Staring; Keelin Murphy; Max A Viergever; Josien P W Pluim Journal: IEEE Trans Med Imaging Date: 2009-11-17 Impact factor: 10.048