PURPOSE: Recently, several attempts have been made to transfer deep learning to medical image reconstruction. An increasing number of publications follow the concept of embedding the computed tomography (CT) reconstruction as a known operator into a neural network. However, most of the approaches presented lack an efficient CT reconstruction framework fully integrated into deep learning environments. As a result, many approaches resort to workarounds for problems that can be solved mathematically and unambiguously. METHODS: PYRO-NN is a generalized framework to embed known operators into the prevalent deep learning framework Tensorflow. The current status includes state-of-the-art parallel-, fan-, and cone-beam projectors and back-projectors, accelerated with CUDA and provided as Tensorflow layers. On top of this, the framework provides a high-level Python API to conduct filtered back-projection (FBP) and iterative reconstruction experiments with data from real CT systems. RESULTS: The framework provides all necessary algorithms and tools to design end-to-end neural network pipelines with integrated CT reconstruction algorithms. The high-level Python API allows the layers to be used in the same simple way as native Tensorflow layers. All algorithms and tools are referenced to a scientific publication and are compared to existing non-deep learning reconstruction frameworks. To demonstrate the capabilities of the layers, the framework comes with baseline experiments, which are described in the supplementary material. The framework is available as open-source software under the Apache 2.0 licence at https://github.com/csyben/PYRO-NN. CONCLUSIONS: PYRO-NN builds on the prevalent deep learning framework Tensorflow and allows end-to-end trainable neural networks to be set up in the medical image reconstruction context. We believe that the framework will be a step toward reproducible research and give the medical physics community a toolkit to elevate medical image reconstruction with new deep learning techniques.
In recent years, major breakthroughs have made deep learning prevalent in more and more fields. It has revolutionized classification and regression tasks in speech and image recognition1, 2, 3 and many other areas. Even in the medical domain, where interpretability and reliability are among the most important driving forces, deep learning has led to astonishing results.4 One of the most cited papers of recent years introduced the U‐net,5 which outperforms classical machine learning algorithms in segmentation tasks. Subsequently, the U‐net architecture was adopted for many more tasks, for example, artifact correction, image fusion, image‐to‐image translation, and even medical image reconstruction.6, 7, 8 However, this domain is fundamentally different from those in which the advent of deep learning began, and the question arises as to whether these learned signal reconstruction pipelines are reliable and stable enough for a critical area such as medical imaging.9 Two special issues of IEEE Transactions on Medical Imaging (TMI), “Deep learning in medical imaging”10 (2016) and “Machine Learning for Image Reconstruction”11 (2018), discuss the increasing relevance of deep learning methods in medical image reconstruction.
The presented approaches can be divided into pre‐ or post‐processing approaches and fully end‐to‐end trained methods. For the first type, the actual reconstruction pipeline is based on well‐known signal reconstruction algorithms, omitting the end‐to‐end capability due to its complexity. For the second type, the end‐to‐end pipeline can be modeled under two different paradigms. 
One way is to learn the whole signal processing pipeline; an exceptionally clear representative of this paradigm is AUTOMAP.12 In direct contrast to this is the emerging paradigm of embedding known operators.13 This preserves the end‐to‐end learning capability but includes the known operations of the reconstruction chain to preserve the credibility of the signals, reduce the error bound of the learning process, and decrease the number of parameters and thus the amount of necessary training data. This paradigm is becoming increasingly popular: multiple publications embed the CT reconstruction as a known operator into the network architecture in order to benefit from the end‐to‐end training capability of deep learning.14, 15, 16, 17, 18, 19, 20, 21 However, publications that follow this path are still fewer than those that use deep learning only as pre‐ or post‐processing. We believe that a major reason for this is the non‐trivial implementation of known operators in existing deep learning frameworks. 
Even publications that successfully take on this challenge often refer to their own implementations as prototypical15 or provide frameworks at the level of abstract wrappers.16, 22 An efficient and publicly usable solution integrated into one of the popular deep learning frameworks, however, is still missing.
To strengthen the paradigm of known operators, advance research in medical image reconstruction, and avoid reimplementations and incompatibilities, we started to work on the open‐source software framework PYRO‐NN, which allows known algorithms to be integrated easily into the deep learning framework Tensorflow.23 We provide multiple forward and backward projectors for CT, implemented in CUDA and based on scientific publications, together with a high‐level Python API for simple use of state‐of‐the‐art CT reconstruction, even for different setups of real CT scanners. The deep integration into Tensorflow at the C++/CUDA level makes it possible to handle performance and memory issues and, additionally, allows easier customization of the algorithms than wrapper alternatives.22, 24 Furthermore, the high‐level Python API offers an easy link between deep learning and community‐driven frameworks. For the CT domain, this makes a wide range of tools (e.g., filters, redundancy weights) available.24, 25, 26, 27
We believe that this framework will help the community leverage the power of end‐to‐end training of machine learning algorithms directly from the data, while continuing to apply mathematically sound solutions to uniquely solvable problems.
Materials and Methods
The framework concept is designed to include native C++ and CUDA‐based algorithms in the deep learning framework Tensorflow. In detail, PYRO‐NN provides network layers as CUDA implementations to generate parallel‐, fan‐, and cone‐beam x‐ray projections and to reconstruct them within any neural network constructed with Tensorflow. Because the projection and reconstruction operations are linear, we provide the analytical gradients of all of these layers with respect to their inputs, which allows for fully end‐to‐end trainable networks. Furthermore, PYRO‐NN provides filters and weights based on scientific publications to allow proper filtered‐backprojection (FBP) reconstructions. The PYRO‐NN API is inspired by the CONRAD26 framework in order to adopt its ability to reconstruct data from real clinical scanners; by using PyConrad,27 many more tools and phantoms can easily be used in the deep learning context. The current state of the framework features a CT reconstruction pipeline, while the basic design allows the whole concept to be transferred to other signal reconstruction domains within one framework and, therefore, points out a direction for future development and community contribution.
Software design/rationale
The development speed in the deep learning community is tremendous. As with the research itself, toolkits and frameworks evolve at the same pace, which often causes interoperability conflicts between self‐developed solutions and version mismatches between different frameworks and toolkits. To ensure robust version control, the framework is included directly in the build process of the Tensorflow sources.
PYRO‐NN‐layers
The known operators can be implemented as CUDA kernels with an additional C++ class that follows the design of the Tensorflow API and embeds the kernel as a Tensorflow layer. Unlike other frameworks that simply wrap the implementation at the Python level, this provides the advantage of full control over device resources such as memory utilization and implementation efficiency. Separating the operator implementation (a native CUDA kernel) from the information control class allows an easy extension toward other deep learning frameworks. The integration of PyTorch is planned for the future. The integration of known operators can be found at: https://github.com/csyben/PYRO-NN-Layers.
PYRO‐NN
We provide a high‐level Python API that allows convenient use of the known operators as normal Tensorflow layers and offers additional helper functions. The provided Python package automatically invokes the relevant algorithms to compute the gradient with respect to the input of the layer in an efficient way. Providing the gradient is necessary to enable a gradient flow through the entire network and, thus, to allow fully end‐to‐end trainable networks with known operators. The package can be installed via pip or from: https://github.com/csyben/PYRO-NN.
Altogether, these design decisions offer the community a generic, version‐stable framework to easily include known operators in neural networks. The source code is publicly available under the Apache 2.0 licence to be directly compatible with Tensorflow and to allow uncomplicated community contributions to existing projects. A detailed description of the software architecture and the build process can be found in the supplementary material, Section 1.
CT reconstruction in neural networks
Based on the generic design of the framework, the current state provides all necessary algorithms and tools for analytical parallel‐, fan‐, and cone‐beam reconstruction. The necessary algorithms are implemented within Tensorflow as dedicated layers, while the respective tools, for example, filters and weights, are provided at the Python level to supply a high‐level API for CT reconstruction. In the following, we introduce the mapping of the known operator to a layer for our case study of CT reconstruction, followed by a description of the provided algorithms and tools.
The known operator
For the task of reconstructing object information from acquired x‐ray projections, an efficient analytical method is well known: FBP. To embed this method into a neural network, the whole acquisition and reconstruction procedure of a CT system needs to be described with discrete linear algebra. The acquisition of projection data of the object can be described with
$$A x = p, \qquad (1)$$
where $A$ is the matrix describing the geometry, the so‐called system matrix, which can be algorithmically implemented as the forward‐projection operator. The object is denoted by $x$, and $p$ are the acquired projections of object $x$ under the system described by $A$. The reconstruction according to the FBP algorithm can be conducted using the Moore‐Penrose pseudoinverse of the system matrix, which gives
$$x = A^\top (A A^\top)^{-1} p = A^\top F^H K F p, \qquad (2)$$
where $A^\top$ is the adjoint system matrix, which can be algorithmically implemented as the back‐projection operator. According to the FBP, the inverse bracket describes a filter operation, which is conducted by a multiplication with the diagonal filter matrix $K$ in the Fourier domain. Consequently, $F$ is the Fourier transform and $F^H$ the respective adjoint, that is, the inverse operation. Hence, the forward and backward model can be expressed completely as discrete linear algebra, allowing fully end‐to‐end trainable networks. As the publications by Würfl et al. and Syben et al. show, $A$ and $A^\top$ are each other's operators for calculating the gradient; therefore, the gradient flow through these layers can be ensured.17, 19
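The pseudoinverse reconstruction described above, x = Aᵀ(AAᵀ)⁻¹p for projections p = Ax, can be illustrated with a toy example. The following is a minimal sketch in plain Python, assuming a hypothetical 2 × 3 system matrix whose values are chosen only for illustration; it is not taken from PYRO‐NN, where A is never stored explicitly.

```python
def matvec(M, v):
    """Multiply matrix M (list of rows) with vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def transpose(M):
    """Return the transpose of matrix M."""
    return [list(col) for col in zip(*M)]


def matmul(M, N):
    """Multiply two matrices given as lists of rows."""
    Nt = transpose(N)
    return [[sum(a * b for a, b in zip(row, col)) for col in Nt] for row in M]


def inverse_2x2(M):
    """Invert a 2 x 2 matrix in closed form."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


# Hypothetical toy system matrix: 2 line integrals through a 3-pixel object.
A = [[1.0, 1.0, 0.0],
     [0.0, 1.0, 1.0]]
x_true = [1.0, 2.0, 3.0]

p = matvec(A, x_true)                   # acquisition: p = A x
At = transpose(A)                       # back-projection operator A^T
filt = inverse_2x2(matmul(A, At))       # (A A^T)^{-1}, the "filter" step
x_rec = matvec(At, matvec(filt, p))     # reconstruction: A^T (A A^T)^{-1} p
p_again = matvec(A, x_rec)              # re-projecting reproduces the data

print("projections:   ", p)
print("reconstruction:", x_rec)
print("re-projection: ", p_again)
```

Because this toy system has fewer measurements than unknowns, the reconstruction is the minimum‐norm object that is consistent with the projections, so it reproduces p exactly without necessarily matching x_true.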
The operator as a layer
From iterative reconstruction, it is known that the system matrix is usually too large to store in memory; therefore, we compute the operator on the fly using ray‐based algorithms. There are several ways to do this. We implement the ray‐driven forward‐projection and the voxel‐driven back‐projection as native CUDA kernels for the integration into Tensorflow. Note that when a ray‐driven forward‐projection algorithm is used to compute the result of the multiplication with $A$, the voxel‐driven back‐projection algorithm is not the exact adjoint operation $A^\top$. The two form a so‐called unmatched projector/back‐projector pair. The implications of matched projectors and shear‐warp projectors for convergence and runtime are subject to future work and are briefly discussed in Section 3.
The forward‐projection operators that generate projections from the input volume are implemented as CUDA kernels in a ray‐driven manner. For each detector pixel, a ray is cast through the scene, accumulating the absorption values along the line. We provide forward projectors for two‐dimensional (2D) parallel‐ and fan‐beam geometry based on ray vectors and the respective geometry parameters. Furthermore, a three‐dimensional (3D) cone‐beam forward projector based on projection matrices is implemented according to Galigekere et al.28 The CUDA kernels are parallelized over the detector pixels, each computing the line integral along its ray.
The back‐projection operators that reconstruct simulated or real projection data are implemented as CUDA kernels in a voxel‐driven manner. For each pixel/voxel to be reconstructed, the projections of the point onto all projection images are accumulated. The framework provides the respective 2D parallel‐ and fan‐beam back‐projection algorithms based on geometry parameters and ray vectors. As for the forward projection, the 3D cone‐beam back‐projection is based on projection matrices, here according to Scherl et al.29 Note that for runtime efficiency, the distance weighting for the cone‐beam circular trajectory geometry is included within the kernel. Currently, only circular trajectories are supported by default. For other reconstruction setups, we therefore recommend adapting the projectors accordingly. The back‐projection kernel is parallelized over the voxels; each voxel position is projected onto the detector for every projection, and the measured line integral is interpolated at the resulting detector coordinates.
For the 3D cone‐beam case, the framework offers the possibility to choose between a texture and a kernel interpolation mode. While texture interpolation is associated with very short computing times, Tensorflow’s memory management in combination with CUDA implies that the data must be kept twice in memory. For kernel interpolation, the situation is exactly the other way around: the computations are slower, but no additional memory is needed. As both options are provided, the user can decide on a per‐application basis. Furthermore, as the 3D cone‐beam operators are based on projection matrices, calibrated matrices from real systems can be used, as shown in the CONRAD framework.26
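The ray‐driven and voxel‐driven schemes described above can be sketched for the 2D parallel‐beam case. The following plain‐Python sketch is a CPU illustration under several assumptions: all function names, the sampling step, and the geometry conventions are hypothetical, and it does not reproduce the framework's CUDA kernels. It also evaluates the inner products ⟨Ax, y⟩ and ⟨x, By⟩, which coincide for a matched pair and generally differ slightly for an unmatched one.

```python
import math


def bilinear(img, col, row):
    """Bilinearly interpolate img at fractional index (col, row); outside is 0."""
    n = len(img)
    c0, r0 = math.floor(col), math.floor(row)
    dc, dr = col - c0, row - r0
    val = 0.0
    for r, wr in ((r0, 1.0 - dr), (r0 + 1, dr)):
        for c, wc in ((c0, 1.0 - dc), (c0 + 1, dc)):
            if 0 <= r < n and 0 <= c < n:
                val += wr * wc * img[r][c]
    return val


def linear(values, idx):
    """Linearly interpolate a detector row at fractional index idx; outside is 0."""
    i0 = math.floor(idx)
    d = idx - i0
    val = 0.0
    for i, w in ((i0, 1.0 - d), (i0 + 1, d)):
        if 0 <= i < len(values):
            val += w * values[i]
    return val


def forward_ray_driven(img, angles, step=0.5):
    """Cast one ray per (angle, detector pixel) and accumulate samples along it."""
    n = len(img)
    center = (n - 1) / 2.0
    n_steps = int(round(2 * n / step)) + 1      # ray segment covers the image diagonal
    sino = []
    for theta in angles:
        cs, sn = math.cos(theta), math.sin(theta)
        proj = []
        for k in range(n):                      # on a GPU: one thread per detector pixel
            s = k - center
            acc = 0.0
            for j in range(n_steps):
                t = -n + j * step
                x = s * cs - t * sn             # sample point on the ray
                y = s * sn + t * cs
                acc += bilinear(img, x + center, y + center)
            proj.append(acc * step)
        sino.append(proj)
    return sino


def backward_voxel_driven(sino, angles, n):
    """Project every pixel onto each detector row and accumulate interpolated values."""
    center = (n - 1) / 2.0
    img = [[0.0] * n for _ in range(n)]
    for r in range(n):                          # on a GPU: one thread per pixel/voxel
        for col in range(n):
            x, y = col - center, r - center
            for theta, proj in zip(angles, sino):
                s = x * math.cos(theta) + y * math.sin(theta)
                img[r][col] += linear(proj, s + center)
    return img


def dot(a, b):
    """Inner product of two 2D arrays given as lists of rows."""
    return sum(u * v for ra, rb in zip(a, b) for u, v in zip(ra, rb))


n = 9
angles = [i * math.pi / 12 for i in range(12)]
# Disc phantom and an arbitrary positive sinogram (illustrative values only).
x_img = [[1.0 if (r - 4) ** 2 + (q - 4) ** 2 <= 9 else 0.0 for q in range(n)]
         for r in range(n)]
y_sino = [[1.0 + 0.1 * k for k in range(n)] for _ in angles]

Ax = forward_ray_driven(x_img, angles)
By = backward_voxel_driven(y_sino, angles, n)
lhs, rhs = dot(Ax, y_sino), dot(x_img, By)
print("<Ax, y> =", lhs, " <x, By> =", rhs)
print("relative gap:", abs(lhs - rhs) / abs(lhs))
```

The two outer loops marked in the comments are the ones a GPU implementation would distribute over threads: detector pixels for the forward projector and pixels/voxels for the back‐projector.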
High‐level python API
To supply the community with an easy‐to‐use version of the described layers, we provide the necessary structure and additional tools like filters, weights, phantoms, etc., within the Python framework. In the following, the outline of the necessary structure to utilize the layers is shown, followed by a short introduction of the provided tools.
Reconstruction and geometry
The high‐level Python API wraps the provided reconstruction layers in Tensorflow. In doing so, the framework automatically registers the respective adjoint operation for the gradient computation. All attributes necessary for the provided forward‐projection and back‐projection layers are covered by a base geometry class and corresponding specialized derived geometry classes, for example, a cone‐beam geometry class that depends on projection matrices.
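Why registering the adjoint is sufficient for the gradient computation can be checked numerically. The sketch below uses a hypothetical stand‐in class (KnownOperatorLayer is not part of the PYRO‐NN or Tensorflow API) whose forward pass applies a small explicit matrix A and whose backward pass applies Aᵀ, and compares the result with central finite differences of a squared‐error loss.

```python
class KnownOperatorLayer:
    """Hypothetical stand-in for a projection layer (not the PYRO-NN API)."""

    def __init__(self, matrix):
        self.A = matrix

    def forward(self, x):
        # y = A x
        return [sum(a * v for a, v in zip(row, x)) for row in self.A]

    def backward(self, grad_out):
        # Gradient w.r.t. the input of y = A x is A^T grad_out (adjoint operation).
        return [sum(row[j] * g for row, g in zip(self.A, grad_out))
                for j in range(len(self.A[0]))]


def loss(layer, x, target):
    """Squared-error loss 0.5 * ||A x - target||^2."""
    return 0.5 * sum((y - t) ** 2 for y, t in zip(layer.forward(x), target))


# Illustrative values only.
layer = KnownOperatorLayer([[1.0, 2.0, 0.0],
                            [0.0, 1.0, 3.0]])
x = [0.5, -1.0, 2.0]
target = [1.0, 1.0]

# Analytic gradient: back-propagate the residual through the adjoint.
residual = [y - t for y, t in zip(layer.forward(x), target)]
analytic = layer.backward(residual)

# Numerical gradient by central finite differences.
eps = 1e-6
numeric = []
for j in range(len(x)):
    x_plus = list(x)
    x_minus = list(x)
    x_plus[j] += eps
    x_minus[j] -= eps
    numeric.append((loss(layer, x_plus, target) - loss(layer, x_minus, target)) / (2 * eps))

print("analytic:", analytic)
print("numeric: ", numeric)
```

In the framework itself the matrix is never formed; the forward and backward passes are the projector and back‐projector kernels, but the chain rule is applied in the same way.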
Phantoms
PYRO‐NN contains a set of simple geometric objects, for example, circles, ellipsoids, spheres, and rectangles, from which more complex numerical phantoms can easily be composed. Furthermore, the framework provides an analytical description of the 2D Shepp–Logan phantom30 as well as a 3D extension of it based on the CONRAD implementation.26
Trajectories
The trajectory describes the geometric scanner setup over the whole scan. For the 2D parallel‐ and fan‐beam cases, the trajectory is described by the central ray vector for each projection. For the 3D cone‐beam case, the trajectory is described by a set of projection matrices, which allows calibrated projection matrices from real scanner systems to be used. Within the high‐level Python API, we provide basic methods to compute the respective rays or projection matrices based on a given geometry. The open‐source concept of the whole framework allows users to contribute further trajectories.
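For the 2D case, a circular trajectory reduces to one central ray vector per projection angle. A minimal sketch follows, assuming equally spaced angles starting at zero and a (cos, sin) orientation convention; the function name and its parameters are hypothetical and do not mirror the framework's helper functions.

```python
import math


def central_ray_vectors(number_of_projections, angular_range):
    """Return one unit-length central ray vector per projection angle."""
    step = angular_range / number_of_projections
    return [(math.cos(i * step), math.sin(i * step))
            for i in range(number_of_projections)]


# Four projections over 180 degrees (illustrative values only).
rays = central_ray_vectors(4, math.pi)
for i, (rx, ry) in enumerate(rays):
    print(f"projection {i}: ray = ({rx:+.3f}, {ry:+.3f})")
```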
Filters
To allow a basic reconstruction in the context of neural networks, PYRO‐NN provides the Ramp and Ram‐Lak filters implemented according to Kak and Slaney.31 The filters can be directly assigned as weights to a multiplication layer and act as a multiplication with a diagonal matrix in the Fourier domain, as shown in Eq. (2).
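As an illustration of such a filter, the discrete Ram‐Lak kernel given by Kak and Slaney has the spatial samples h(0) = 1/(4τ²), h(nτ) = 0 for even n, and h(nτ) = −1/(n²π²τ²) for odd n, where τ is the detector spacing. The sketch below builds this kernel and takes its discrete Fourier transform to obtain the entries of a diagonal filter matrix. The function names and the kernel length are assumptions made for illustration, not the framework's implementation.

```python
import cmath
import math


def ram_lak_kernel(half_width, spacing=1.0):
    """Spatial Ram-Lak samples h(n * spacing) for n = -half_width..half_width."""
    kernel = []
    for n in range(-half_width, half_width + 1):
        if n == 0:
            kernel.append(1.0 / (4.0 * spacing ** 2))
        elif n % 2 == 0:
            kernel.append(0.0)
        else:
            kernel.append(-1.0 / (math.pi ** 2 * n ** 2 * spacing ** 2))
    return kernel


def dft(signal):
    """Naive discrete Fourier transform (sufficient for short kernels)."""
    n = len(signal)
    return [sum(v * cmath.exp(-2j * math.pi * k * m / n) for m, v in enumerate(signal))
            for k in range(n)]


half_width = 8
h = ram_lak_kernel(half_width)
# Circularly shift the kernel so that the sample n = 0 sits at index 0.
h_shifted = h[half_width:] + h[:half_width]
# Magnitudes of the spectrum: the diagonal entries of the filter matrix.
K = [abs(c) for c in dft(h_shifted)]

print("spatial kernel center:", h[half_width])
print("filter diagonal (first half):", [round(v, 4) for v in K[:half_width + 1]])
```

The resulting spectrum rises approximately linearly with frequency, and its value at zero frequency is small but not exactly zero because the kernel is truncated.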
Correction weights
In order to support fan‐ and cone‐beam reconstructions for the short‐scan case, the framework contains geometric and redundancy correction weights implemented according to Kak and Slaney.31
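To make the two kinds of weights concrete, the sketch below shows the cosine weight for a flat detector, D / sqrt(D² + s²), and a Parker‐type redundancy weight for a fan‐beam short scan. Both are widely used forms; whether PYRO‐NN uses exactly these expressions and sign conventions is an assumption, and the function and parameter names are hypothetical.

```python
import math


def cosine_weight(s, source_detector_distance):
    """Geometric weight for detector coordinate s on a flat detector."""
    return source_detector_distance / math.sqrt(source_detector_distance ** 2 + s ** 2)


def parker_weight(beta, gamma, delta):
    """Parker-type weight for view angle beta, fan angle gamma, half fan angle delta."""
    if beta < 0.0:
        return 0.0
    if beta < 2.0 * (delta - gamma):
        # Ramp-up region at the start of the short scan.
        return math.sin(math.pi / 4.0 * beta / (delta - gamma)) ** 2
    if beta <= math.pi - 2.0 * gamma:
        # Rays that are measured only once keep full weight.
        return 1.0
    if beta <= math.pi + 2.0 * delta:
        # Ramp-down region at the end of the short scan.
        return math.sin(math.pi / 4.0 * (math.pi + 2.0 * delta - beta) / (delta + gamma)) ** 2
    return 0.0


# A ray (beta, gamma) and its conjugate ray (beta + pi + 2 * gamma, -gamma)
# measure the same line, so their redundancy weights should sum to one.
delta = 0.3
beta, gamma = 0.1, 0.1
w_ray = parker_weight(beta, gamma, delta)
w_conjugate = parker_weight(beta + math.pi + 2.0 * gamma, -gamma, delta)
print("weights:", w_ray, w_conjugate, "sum:", w_ray + w_conjugate)
print("cosine weight at s = 100, D = 1000:", cosine_weight(100.0, 1000.0))
```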
Network architectures
Following the paradigm of precision learning,13 different network architectures can be set up or even derived, as shown by Syben et al.21 We provide various examples within the framework to assist users in getting started. The experiments in the supplementary material cover a baseline network able to reconstruct a short‐scan cone‐beam CT according to the Feldkamp–Davis–Kress (FDK) algorithm32 (Section 2.A), a reconstruction with raw data and projection matrices from a real system (Section 2.B), an example of learning the correct reconstruction filter discretization (Section 2.C), and a novel baseline network to perform iterative reconstruction within a few lines of code (Fig. 1; Section 2.D). A detailed description of the experiments can be found in the supplementary material, Sections 2.A–2.D. In addition, executable experiments are made available online as a Code Ocean Capsule.33
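The idea behind the iterative reconstruction network, treating the reconstruction itself as the trainable parameter, can be sketched with plain gradient descent. The example below assumes an unregularized least‐squares objective 0.5‖Ax − p‖², a small explicit system matrix, and a hand‐picked learning rate; the objective and optimizer actually used in the supplementary experiment may differ.

```python
def forward(A, x):
    """Forward projection: A x."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]


def backward(A, r):
    """Back-projection: A^T r."""
    return [sum(row[j] * v for row, v in zip(A, r)) for j in range(len(A[0]))]


# Hypothetical toy system and object (illustrative values only).
A = [[1.0, 1.0, 0.0],
     [0.0, 1.0, 1.0],
     [1.0, 0.0, 1.0]]
x_true = [1.0, 2.0, 3.0]
p = forward(A, x_true)                  # "measured" projections

x = [0.0, 0.0, 0.0]                     # the trainable reconstruction
learning_rate = 0.2
for _ in range(500):
    residual = [a - b for a, b in zip(forward(A, x), p)]
    gradient = backward(A, residual)    # gradient of 0.5 * ||A x - p||^2
    x = [xi - learning_rate * gi for xi, gi in zip(x, gradient)]

print("reconstruction:", [round(v, 6) for v in x])
```

Each update needs only one forward projection and one back‐projection, which is what makes the projector layers usable inside a standard training loop.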
Figure 1
PYRO‐NN iterative reconstruction network. The training procedure solves: . The sought reconstruction is obtained when the optimal training state is reached.
Discussion
Recently, there have been several different attempts to transfer the astonishing capability of deep learning into the field of medical image reconstruction. To do so while addressing concerns about the reliability of learned reconstruction pipelines, the idea of embedding known operators into neural networks is increasingly pursued, as the growing number of publications shows.
In this paper, we present a framework providing the known operators for CT reconstruction and all necessary tools to conduct experiments on real scenarios. We believe that such an open‐source framework will lower the barriers to such approaches and will elevate the research in the medical image reconstruction domain. To encourage this research, we provide baseline experiments in the supplementary material and example code within the framework as a starting point for new research ideas.
To the best of our knowledge, PYRO‐NN is the first framework that provides CT reconstruction algorithms as native CUDA kernels within neural networks. This allows full control over the device resources, in contrast to CT algorithms wrapped at the Python level. We chose to implement the projector and back‐projector as an unmatched projector pair. The implications of unmatched pairs have already been analyzed in the context of iterative reconstruction by Zeng et al.,34 who concluded that unmatched pairs can be beneficial due to the algorithmic speedup, while the convergence of the algorithm has to be kept in mind. While we have not noticed negative effects on the training process in our experiments,19, 21 we want to investigate the implications of unmatched projector pairs for the training procedure in future work.
As the combination of deep neural networks and CT reconstruction can easily exceed the GPU’s memory, especially in the 3D case, the provided algorithms allow the user to trade off computational efficiency against memory efficiency. Furthermore, the concept of the framework enables problem‐specific solutions, since the algorithms and the gradients can be changed by the user at any time. Additionally, as the core of PYRO‐NN is an extension of the existing Tensorflow build process, every known operator that allows the calculation of sub‐gradients can easily be modeled as a Tensorflow layer. Besides the actual CUDA implementation, only an information control class following the Tensorflow API guidelines is needed. Therefore, the setup allows an easy extension toward other frameworks like PyTorch, as the CUDA kernel implementation of the known operator stays untouched.
We provide the known operators for CT reconstruction at the CUDA level, with the respective necessary tools like filters and weights at the Python level. Nevertheless, the framework design allows an easy extension to other fields, for example, magnetic resonance imaging (MRI). With the increasing number of publications being supplemented by open‐source reference implementations, we believe that with the help of the community PYRO‐NN can grow beyond its application to CT reconstruction.
Conclusion and Outlook
PYRO‐NN is an open‐source software framework developed to advance the use of known operators within neural networks. It makes it possible to transfer the power of deep learning to medical image reconstruction while making use of existing knowledge about the physical principles. Currently, the framework provides state‐of‐the‐art CT reconstruction algorithms within the Tensorflow deep learning environment, supported by the necessary tools for the reconstruction pipeline. This allows existing CT reconstruction algorithms to be used in combination with neural networks in an end‐to‐end trainable fashion. The generic design of the framework makes it easy to extend it to other modalities. We hope that our open‐source framework will encourage other groups to join these efforts, making the framework a valuable element in the field of deep learning for medical image reconstruction. The main objective of the framework is to enable the community to use CT reconstruction algorithms in end‐to‐end neural networks and to elevate the research in medical image reconstruction. The software package is available at https://github.com/csyben/PYRO-NN and https://github.com/csyben/PYRO-NN-Layers.
Appendix S1: Supplementary Material.