Malte Brunn, Naveen Himthani, George Biros, Miriam Mehl, Andreas Mang.
Abstract
We present a Gauss-Newton-Krylov solver for large deformation diffeomorphic image registration. We extend the publicly available CLAIRE library to multi-node multi-graphics processing unit (GPU) systems and introduce novel algorithmic modifications that significantly improve performance. Our contributions comprise (i) a new preconditioner for the reduced-space Gauss-Newton Hessian system, (ii) a highly optimized multi-node multi-GPU implementation exploiting device direct communication for the main computational kernels (interpolation, high-order finite difference operators, and fast Fourier transforms), and (iii) a comparison with state-of-the-art CPU and GPU implementations. We solve a 256³-resolution image registration problem in five seconds on a single NVIDIA Tesla V100, with a performance speedup of 70% compared to the state-of-the-art. In our largest run, we register 2048³-resolution images (25 B unknowns; approximately 152× larger than the largest problem solved in state-of-the-art GPU implementations) on 64 nodes with 256 GPUs on TACC's Longhorn system.
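As a rough sanity check on the problem sizes quoted above, the following sketch counts unknowns for a cubic grid. It assumes three unknowns per grid point (for example, the three components of a 3D velocity field); the abstract does not state this, so treat the function `num_unknowns` and its `components` parameter as hypothetical. Under that assumption the count is consistent with the quoted figure of about 25 B.

```python
def num_unknowns(n: int, components: int = 3) -> int:
    """Count unknowns on an n x n x n grid.

    ASSUMPTION (not stated in the abstract): each grid point carries
    `components` unknowns, e.g. the 3 components of a velocity field.
    """
    return components * n ** 3


if __name__ == "__main__":
    # Grid sizes taken from the abstract: 256^3 and 2048^3.
    for n in (256, 2048):
        print(f"{n}^3 grid: {num_unknowns(n):,} unknowns")
```

With three components per grid point, the 2048³ grid gives 3 × 2048³ = 25,769,803,776 unknowns.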
Keywords: Gauss–Newton–Krylov solver; PDE-constrained optimization; diffeomorphic image registration; multi-node multi-GPU; preconditioning
Year: 2020 PMID: 35295823 PMCID: PMC8923614 DOI: 10.1109/sc41405.2020.00042
Source DB: PubMed Journal: Int Conf High Perform Comput Netw Storage Anal ISSN: 2167-4337