K A Saneera Hemantha Kulathilake1, Nor Aniza Abdullah1, Aznul Qalid Md Sabri2, Khin Wee Lai3.
Abstract
Computed Tomography (CT) is a widely used medical imaging modality in clinical medicine because it produces excellent visualizations of fine structural details of the human body. In clinical procedures, it is desirable to acquire CT scans with minimal X-ray flux to avoid exposing patients to high radiation doses. However, these Low-Dose CT (LDCT) scanning protocols compromise the signal-to-noise ratio of the CT images, introducing noise and artifacts across the image space. Various restoration methods have therefore been published over the past three decades to produce high-quality CT images from LDCT acquisitions. More recently, in contrast to conventional LDCT restoration methods, Deep Learning (DL)-based LDCT restoration approaches have become common owing to their data-driven nature, high performance, and fast execution. This study therefore aims to elaborate on the role of DL techniques in LDCT restoration and to critically review the applications of DL-based approaches to this task. To achieve this aim, different aspects of DL-based LDCT restoration applications were analyzed, including DL architectures, performance gains, functional requirements, and the diversity of objective functions. The outcome of the study highlights existing limitations and future directions for DL-based LDCT restoration. To the best of our knowledge, no previous review has specifically addressed this topic.
Keywords: Deep Learning; Denoising; Generative adversarial networks; Medical datasets; Optimization; Structure preservation
Year: 2021 PMID: 34777967 PMCID: PMC8164834 DOI: 10.1007/s40747-021-00405-x
Source DB: PubMed Journal: Complex Intell Systems ISSN: 2199-4536
Fig. 1 A diagram of the CT imaging
Fig. 2 Visuals of CT degradations. a, b Normal-dose and quantum-noise-corrupted abdomen CT image (the liver metastasis marked with a red circle is unclear) [77]; c, d normal-dose and quantum-noise-corrupted abdomen CT images with streak artifacts [73]
Fig. 3 Classification of DL methods used for LDCT restoration
Fig. 4 Functional difference of DL techniques: a model based on the discriminative approach; b model based on the generative approach
Analysis of discriminative model-based LDCT restoration applications
| References | Network design | Input | Model depth | Shortcut | BN | Objective function | Strength | Weaknesses |
|---|---|---|---|---|---|---|---|---|
| CNN200, Chen et al. [ | CNN | 33 × 33 sparse patches | 3 | No | No | L3 | Simple architecture | Over smoothing [ |
| Du et al. [ | SCN | 100 × 100 patches from LDCT | Stacked competitive blocks | No | No | L3 | Multi-scale processing with great structure preservation | Competitive blocks increase the training parameters due to redundant feature maps |
| AAPM-Net, Kang et al. [ | Deep CNN | 55 × 55 × 15 patches from contourlet transformed image | 24 | 6 bypass connections, concatenation connection | Yes | L3 | Performs better than model-based iterative reconstruction | Over-smoothing; considerable computation time due to the wavelet transform |
| Yang et al. [ | ResNet | 128 × 128 cropped images and noise patches | 12 | Bypass connections | Yes | L3 | Both 2D and 3D network models are independent of the input size | The model overfits during testing |
| Wu et al. [ | Cascaded CNN | LDCT images | Cascaded CNN | Cascade connections | Yes | L3 | The cascaded CNN suppresses blocky and streak artifacts | False lesion artifact |
| WaveResNet, Kang et al. [ | Wavelet-based ResNet | 55 × 55 × 15 patches from contourlet transformed image | 24 | Bypass connections, convolution blocks, concatenation connection | Yes | L3 | High network convergence | Limited generalizability |
| Wang et al. [ | RLNet [ | 38 × 38 × 22 | 13 | No | Yes | L3 | Outperforms statistical reconstruction methods; reduces high-frequency noise artifacts | Lack of generalizability (tested only for two clinical tasks) |
| GRCNN, Gou et al. [ | CNN [ | LDCT images | * | No | Yes | L1, L16 | Adapts to various noise types and preserves subtle structures | Finding the optimal model depth is laborious |
| DRL-E-MP, Gholizadeh-Ansari et al. [ | Dilated CNN based on RLNet [ | 40 × 40 sized overlapping patches | 8 dilated convolution layers | Concatenation connections | Yes | L3, L8 | Dilated convolution captures more contextual detail using fewer layers | The perceptual loss was determined using VGG16, which was trained on natural images |
| DP-ResNet, Yin et al. [ | Projection domain (SD-Net) and image domain (ID-Net)-based ResNets | Overlapping 3D blocks of noisy cone-beam projections for SD-Net | For SD-Net = 9, for ID-Net = 11 | For SD-Net = Bypass connections, for ID-Net = Bypass connections and a contracting path connection | SD-Net = Yes | L3, L4 | Incorporating the projection domain and image domain has enhanced the restoration results | False lesion artifact |
| DCRN, Ming et al. [ | Densely connected ResNet | LDCT images | 12 | Skip connections via the concatenation layer | Yes | L3 | Preserving the structural details | Dense-connections increased the depth of the network, and it leads to a long training time |
| TLR-CNN, Zhong et al. [ | ResNet with Transfer Learning | 40 × 40 sized patches | 15 | No | Yes | L3 | Generalizability and performance enhancement via transfer learning | Streaking artifacts and noise remain in the processed images |
| Shiri et al. [ | ResNet with dilated convolution | 512 × 512 voxels | 20 | Bypass connections | Yes | L3 | Provides clear visualizations of lung lesions when reducing the radiation dose index by up to 89% | Fails to recover the correct structural details of the lesion |
| TS-RCNN, Huang et al. [ | Two-stage ResNet with stationary wavelet transform | 512 × 512 × 4 | 9 | Bypass connections | Yes | L3, L8 | Preservation of the structural details | Weak texture details |
| CT-ReCNN, Jiang et al. [ | Multi-scale parallel ResNet with dilated convolution | 100 × 100 sized patches | Shallow channel: 5, deep channel: 13 | Bypass connections | Yes | L2 (among residual images) | Preserves multi-scale detailed features and texture details of lung images | Complex architecture |
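Most of the discriminative models above are trained on small, often overlapping patches (33 × 33, 40 × 40, 100 × 100) cropped from full 512 × 512 slices rather than on whole images. A minimal NumPy sketch of overlapping patch extraction — the function name and the stride value are illustrative, not taken from any reviewed paper:

```python
import numpy as np

def extract_patches(image, patch_size=40, stride=20):
    """Slide a window over a 2-D slice and collect overlapping patches.

    A stride smaller than patch_size yields overlapping patches, as used
    by several reviewed models (e.g. 40 x 40 overlapping patches).
    """
    h, w = image.shape
    patches = []
    for r in range(0, h - patch_size + 1, stride):
        for c in range(0, w - patch_size + 1, stride):
            patches.append(image[r:r + patch_size, c:c + patch_size])
    return np.stack(patches)

# With patch_size=40 and stride=20 a 512 x 512 slice gives 24 x 24 positions.
slice_ = np.zeros((512, 512), dtype=np.float32)
patches = extract_patches(slice_, patch_size=40, stride=20)
```

This is one reason training set sizes in the literature are quoted both in slices and in patches: a few hundred slices can yield on the order of a million patches.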
Fig. 5 Generic architecture of the CNN model
Fig. 6 Generic architecture of the SCN model
Fig. 7 Generic architecture of the ResNet model
Fig. 8 Different shortcut connections. a CNN with sequential convolution layers; b ResNet with a convolution block and skip connection, where Y_l is the input to the l-th residual unit, Y_(l+1) is the output of the (l+1)-th unit, and F(Y_l) is the residual mapping of the stacked convolutional layers; c DenseNet with dense connections, which concatenates the outputs passed from previous layers; d Inception-ResNet connection, where Y, C, I, and F represent the input, convolution, inception filtering, and network operations, respectively
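The skip connection of Fig. 8b can be written as Y_(l+1) = Y_l + F(Y_l): the unit learns only the residual mapping F, which in LDCT restoration is typically the noise component. A toy NumPy sketch, with per-channel linear maps standing in for convolutions and random, untrained weights:

```python
import numpy as np

rng = np.random.default_rng(0)

def linear(x, w):
    # Toy stand-in for a convolution layer: a per-channel linear map.
    return x @ w

def residual_unit(y_l, w1, w2):
    """Fig. 8b as an equation: Y_(l+1) = Y_l + F(Y_l).

    F is two stacked layers with a ReLU in between; the identity shortcut
    adds the unchanged input back, so the unit only learns the residual.
    """
    f = linear(np.maximum(linear(y_l, w1), 0.0), w2)  # residual mapping F(Y_l)
    return y_l + f                                    # Y_(l+1)

c = 8                                   # feature channels
y = rng.standard_normal((16, c))        # 16 "pixels", c channels each
w1 = rng.standard_normal((c, c))
w2 = np.zeros((c, c))                   # F == 0, so the unit is the identity
out = residual_unit(y, w1, w2)
```

Because the shortcut passes the input through unchanged, a freshly initialized (or zeroed) residual branch leaves the signal intact, which is what makes very deep restoration networks trainable.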
Fig. 9 Generic architecture of the DenseNet model
Fig. 10 Generic architecture of generative models used for LDCT restoration: a autoencoder; b U-Net
Analysis of generative model-based LDCT restoration applications
| References | Network design | Input | Model depth | Shortcut | Strength | Weaknesses |
|---|---|---|---|---|---|---|
| RED-CNN, Chen et al. [ | Residual EnDec | LDCT images | 5 convolution and 5 de-convolution layers | Long skip connections | Enhances low-contrast regions | Texture loss and blurring due to the MSE-based objective function; false lesion issue |
| Liu, Zhang [ | Stacked sparse denoising autoencoder | 8 × 8 sized patches | 3 stacked sparse denoising autoencoders (6 hidden layers) | No | Preserves texture details that decay during down-sampling | The model still distorts some subtle structures |
| Q-AE, Fan et al. [ | Quadratic Autoencoder | 64 × 64 sized patches | 5 quadratic convolution and 5 quadratic deconvolution layers | Bypass connections | Low computational cost due to the lower number of training parameters | Determining the depth of the network is laborious |
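The autoencoder rows above compress a noisy patch to a hidden code and decode it back to a cleaned patch (Fig. 10a). A single-layer NumPy forward pass illustrating the idea — weights are random and untrained, and dimensions are illustrative; a real model would be trained to minimize a reconstruction loss against the normal-dose patch:

```python
import numpy as np

rng = np.random.default_rng(1)

def dae_forward(noisy_patch, w_enc, b_enc, w_dec, b_dec):
    """Denoising autoencoder: encode a noisy 8x8 patch into a hidden code,
    then decode back to a reconstructed patch."""
    x = noisy_patch.reshape(-1)               # flatten to a 64-vector
    h = np.tanh(w_enc @ x + b_enc)            # hidden code (bottleneck)
    return (w_dec @ h + b_dec).reshape(8, 8)  # reconstructed patch

n_in, n_hidden = 64, 32                       # 8x8 patch, 32 hidden units
w_enc = rng.standard_normal((n_hidden, n_in)) * 0.1
w_dec = rng.standard_normal((n_in, n_hidden)) * 0.1
b_enc, b_dec = np.zeros(n_hidden), np.zeros(n_in)

clean = rng.random((8, 8))
noisy = clean + 0.1 * rng.standard_normal((8, 8))  # simulated quantum noise
recon = dae_forward(noisy, w_enc, b_enc, w_dec, b_dec)
```

Stacking several such layers, or replacing the linear maps with (quadratic) convolutions, gives the stacked sparse DAE and Q-AE variants in the table.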
Fig. 11 Variants of GAN architectures: a Vanilla GAN, b WGAN, c Cycle-GAN, and d LS-GAN
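The GAN variants in Fig. 11 differ mainly in their adversarial objectives (losses L5, L6, and L18 in the loss-function table). A hedged NumPy sketch of the generator-side terms, with stand-in arrays for the discriminator outputs:

```python
import numpy as np

def vanilla_g_loss(d_fake):
    """Vanilla GAN (non-saturating form): minimize -log D(G(z)).
    d_fake holds sigmoid discriminator outputs in (0, 1)."""
    return -np.mean(np.log(d_fake))

def lsgan_g_loss(d_fake):
    """LS-GAN: push D(G(z)) toward the real label (1) with least squares.
    d_fake holds unbounded (linear) discriminator outputs."""
    return 0.5 * np.mean((d_fake - 1.0) ** 2)

def wgan_g_loss(critic_fake):
    """WGAN: maximize the critic score on fakes, i.e. minimize its negative
    (the earth mover / Wasserstein formulation)."""
    return -np.mean(critic_fake)

# Stand-in discriminator/critic outputs for two generated patches:
d = np.array([0.8, 0.6])
losses = (vanilla_g_loss(d), lsgan_g_loss(d), wgan_g_loss(d))
```

The discriminator losses mirror these with the real/fake labels swapped; Cycle-GAN adds the cyclic and identity terms (L10, L11) on top of one of these adversarial objectives.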
Analysis of GAN-based LDCT restoration applications
| Publication | GAN model | Input | Generator design | Generator shortcut | BN | Discriminator design | Losses in objective function | Strength(s) | Weakness(es) |
|---|---|---|---|---|---|---|---|---|---|
| Wolterink et al. [ | Vanilla GAN | Voxels of size 65 × 65 × 19 | CNN with 7 convolution layers | No | Yes | CNN with 11 convolution layers | L2, L5 | Texture preservation | Quality degradation due to the usage of Wasserstein distance [ Ring artifact |
| WGAN-VGG, Yang et al. [ | W-GAN | Patches of size 64 × 64 | CNN with 8 convolution layers | No | No | CNN with 6 convolution layers [ | L6, L8 | VGG loss prevents severe distortions | False lesion issue [ |
| SAGAN, Yi, Babyn [ | Vanilla GAN | Patches of size 256 × 256 | UNet256 [ | 9 bypass connections in 9 residual blocks | Yes | Patch-GAN structure of [ | L5, L9 | Ensures sharpness | The sensitivity of the sharpness metric is low in some low-contrast regions; false lesion issue [ |
| CPCE-2D/3D, Shan et al. [ | Vanilla GAN | Patches of size 64 × 64 | 3D conveying-path-based convolutional encoder-decoder network; 4 encoder and 4 deconvolution layers | Three conveying paths between the corresponding encoder and decoder layers | No | CNN with 6 convolutional layers | L5, L8 | The 2D-to-3D transfer learning method is efficient and stable [ | Loss of texture and tissue details [ |
| SMGAN, You et al. [ | W-GAN | Voxels of size 80 × 80 × 11 | 3D CNN with 8 convolution layers | No | No | CNN with 6 convolutional layers | L4, L6, L7 | It successfully preserves the critical features to be diagnosed | Edge blurring and structural dissimilarities appeared in some cases |
| Kang et al. [ | Cycle GAN | Patches of size 56 × 56 | CNN with 20 convolution layers | 6 bypass connections in 6 convolution blocks | Yes | Patch-GAN structure of [ | L5, L10, L11 | Resistance to mode collapse and robust to cardiac motion and contrast change | Lack of texture preservation due to the low stability of the DL model [ |
| m-WGAN, Hu et al. [ | W-GAN | Patches of size 64 × 64 | ResNet with 16 Residual blocks (38 Convolution layers) | 16 Bypass connection in 16 residual blocks | Yes | CNN with 8 convolution layers | L3, L6 | Minimizing the over-smoothing | The noise and artifacts remain in the restored images |
| Park et al. [ | Vanilla GAN | Patches of size 128 × 128 | U-Net-like deep convolutional framelet | 3 bypass connections; contracting paths connect low-frequency bands | Yes | Patch-GAN structure with 4 convolution layers | L5, L17 | Preserves detailed information during down-sampling; the fidelity term protects subtle details | Gaussian noise components cannot be successfully removed by the proposed method [ |
| VAGAN, Du et al. [ | Vanilla GAN | Patches of size 64 × 64 | Residual encoder-decoder network with 14 convolution blocks | 14 bypass connections in 14 convolution blocks | No | Attentive discriminator with 8 convolution layers | L5, L8, L12, L13, L14 | The network pays special attention to noisy regions and background structures [ | VAGAN's PSNR performance is weak |
| CycleGAN-BM3D, Tang et al. [ | Cycle GAN | Patches of size 128 × 128 | Encoder-decoder network with 18 convolution layers | 6 bypass connections in residual blocks | No | CNN with 5 convolution layers | L5, L10, L15 | BM3D-based prior information prevents the generation of incorrect details | Cycle GAN alters local image details [ |
| HFSGAN, Yang et al. [ | Least Square GAN | Images with size 512 × 512 | Two nested U-Nets | 7 Bypass connections between corresponding encoder and decoder layers | Yes | Patch-GAN structure of [ | L4, L17, L18 | The inception-based discriminator boosts the accuracy of the generator Preserving more texture details | Running two parallel U-net models has increased the model complexity |
| Ma et al. [ | Least Square GAN | Patches of size 55 × 55 | ResNet with 14 convolution layers | 5 bypass connections in the 5 residual blocks | No | CNN with 6 convolution layers | L4, L7, L18 | Minimizes vanishing gradients and allows content learning with SSIM | Lack of performance in pixel-level similarity assessment |
| Chi et al. [ | Least Square GAN | Patches of size 256 × 256 | Modified U-Net with inception residual blocks (4 convolution and 4 deconvolution layers) | 4 bypass connections to concatenate inception residual blocks with deconvolution layers | Yes | Multi-layer CNN | L3 (between noise and content), L5, L8 | Preserves structural details and eliminates false lesions | Noise is still retained in the restored CT images |
| SACNN, Li et al. [ | W-GAN | Stacked 3 patches of size 64 × 64 | CNN with 8 convolution layers and 3 3D self-attention blocks | 3D self-attention blocks | No | CNN with 6 convolution layers | L6, L8 | Improves the denoising performance of both GAN- and CNN-based networks | Low PSNR and SSIM in some cases due to the absence of a pixel-wise loss function |
| Gu, Ye [ | Cycle-GAN | Patches of size 128 × 128 | U-Net generator with 10 convolution layers and AdaIN layers | Bypass connections linking encoder layers with corresponding decoder layers | No | Patch-GAN structure with 6 convolution layers | L10, L11, L18 | One generator; stable training with fewer parameters | Structure preservation needs further improvement |
| DRWGAN-PF, Yin et al. [ | W-GAN | Patches of size 128 × 128 | CNN with 8 convolution layers; the 2nd and 6th layers use dilated convolution | 3 bypass connections between the 3 convolution blocks | Yes | CNN with 6 convolution layers | L2, L6, L8* | Trained on unpaired data | Perceptual loss is determined using natural images (VGG-19 net) |
*Multi-perceptual loss—average perceptual loss of 5 levels in VGG-19 net
Common datasets used in the reviewed literature
| ID | Dataset | Anatomy | Remarks | Related studies |
|---|---|---|---|---|
| D01 | NBIA/NCIA dataset [ | Many organs, including the chest | The National Biomedical Imaging Archive consists of 7015 NDCT images of 256 × 256 size. URL: | [ |
| D02 | AAPM-Mayo | Abdomen | Mayo clinic AAPM Low- Dose CT Grand Challenge dataset consists of 2378 full and quarter dose CT images from 10 patients of 512 × 512 size. URL: | [ |
| D03 | Piglet dataset [ | Whole-body | Images were obtained under four dose levels and each dose level consists of 850 images of 512 × 512 size | [ |
| D04 | Data Science Bowl 2017 | Lung | The dataset consists of over a thousand high-resolution LDCT images of high-risk lung cancer patients. | [ |
| D05 | 3D-IRCADb | Different organs | 1375 clinical NDCT images of 10 patients. URL: | [ |
| D06 | Luna-16 | Lung | 888 clinical NDCT scans are available with annotations. URL: | [ |
| D06 | Cardiac CT | Cardiac | Cardiac CT scans of 28 patients | [ |
| D07 | MGH dataset [ | Abdomen, chest, and head | Massachusetts General Hospital (MGH) dataset consists of 40 cadaver scans obtained under four dose levels | [ |
| D08 | Cardiac CT | Cardiac | Two sets of 50 CT scans of mitral valve prolapse and coronary artery disease patients | [ |
| D09 | Dental CT | Dental CT | CT images were reconstructed using sinogram data in axial, sagittal, and coronal planes | [ |
| D10 | Liver simulated dataset | Liver and portal vein | 2480 NDCT images of liver and portal vein of 62 patients | [ |
| D11 | Brain clinical dataset | Brain | 200 brain CT images of two different dose levels (100 from each.) | [ |
| D12 | Piglet dataset | Whole-body | 360 data were scanned under four dose levels | [ |
| D13 | COVID-19 | Lung | 1141 volumetric chest CT exams were obtained from 9 medical centers. Among them, 312 data were marked as PCR-positive | [ |
Fig. 12 Steps of LDCT reconstruction algorithms
Loss functions used in the ML-based literature
| Abbreviation | Name: description | Strength |
|---|---|---|
| L1 | Pixel difference: the difference between the center pixel of a patch in the ground-truth image and the pixel value generated by the network | Preserves the gray content in the generated images |
| L2 | Squared error: element-wise data-fidelity loss in image space; keeps the properties of differentiability, symmetry, and convexity, but suffers from blurring | Ensures pixel-level similarity between generated and real images |
| L3 | Average/mean squared error: computes the MSE between the true and generated images | Ensures differentiability, symmetry, and convexity between the generated and real images |
| L4 | L1 loss: computes the mean absolute error between two images | Keeps symmetry and convexity and reduces blurring; however, it forms blocky artifacts |
| L5 | Adversarial loss: consists of a generator loss and a discriminator loss | Improves the stability of the training process |
| L6 | Adversarial loss with Wasserstein distance: determines the GAN loss based on the earth mover's distance | Leverages training convergence and improves the stability of the training process |
| L7 | Structural loss: finds the best visual patterns in the images | Discourages blurring and preserves structure and texture details |
| L8 | Perceptual loss: determines similarity in feature space, computed from features extracted by pre-trained networks such as the VGG-19 network [ | Structure and texture preservation |
| L9 | Sharpness loss: assesses the sharpness of the features represented in the images to be processed | Ensures the sharpness of features |
| L10 | Cyclic loss: permits one-to-one correspondence between the noisy and denoised images | Enforces the inverse relationship of the Cycle-GAN model |
| L11 | Identity loss: minimizes the alteration between the input and generated images of each generator in the Cycle-GAN | Ensures that properly generated output images no longer change when used as inputs to the same network |
| L12 | Attentive block loss: the loss between the image generated by the attentive block and the prior/attention knowledge | Creates the optimal attention map |
| L13 | Multi-scale loss: computes additional structural and contextual information at different scales | Preserves structural and contextual information at different scales |
| L14 | Attention loss: the loss between the features extracted from interior layers of the discriminator and the attention map produced at each time step | Determines the optimal learning of the latent noise (attention feature) distribution |
| L15 | BM3D prior loss: L1 loss between the BM3D prior image and the generated image | Forces the generator to produce the desired output |
| L16 | Gradient loss: determines high-level feature-based information | Leverages training convergence |
| L17 | Detail loss: restoration loss between the high-frequency sub-bands of the ground-truth and generated images | Preserves the edges |
| L18 | Least-square loss: GAN loss of the least-square GAN model | Improves the stability of the training process |
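Several of the tabulated losses are simple to state in code. A NumPy sketch of L3 (mean squared error), L4 (mean absolute error), and a single-window SSIM from which the structural loss L7 is commonly taken as 1 − SSIM; the constants c1 and c2 and the single-window simplification are illustrative (practical SSIM averages over sliding local windows):

```python
import numpy as np

def mse_loss(pred, target):
    """L3: mean squared error between generated and ground-truth images."""
    return np.mean((pred - target) ** 2)

def mae_loss(pred, target):
    """L4: L1 loss (mean absolute error); less blur-prone than MSE."""
    return np.mean(np.abs(pred - target))

def ssim_global(pred, target, c1=1e-4, c2=9e-4):
    """Single-window SSIM; the structural loss L7 is often 1 - SSIM."""
    mu_x, mu_y = pred.mean(), target.mean()
    var_x, var_y = pred.var(), target.var()
    cov = ((pred - mu_x) * (target - mu_y)).mean()
    return ((2 * mu_x * mu_y + c1) * (2 * cov + c2)) / (
        (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2))

# Identical images: zero pixel losses and an SSIM of 1.
x = np.linspace(0.0, 1.0, 64).reshape(8, 8)
pixel_losses = (mse_loss(x, x), mae_loss(x, x))
ssim_self = ssim_global(x, x)
```

Many reviewed objectives combine such terms, e.g. adversarial + λ1 · perceptual + λ2 · (1 − SSIM), with the weights chosen per study.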
Summary of functional aspects of reviewed studies
| Article | Test dataset(s) | Model feature for noise and artifact reduction | MSSIM | PSNR | Border (B) or edges (E) | Organs and structures | Textures | Lesions |
|---|---|---|---|---|---|---|---|---|
| Chen et al. [ | D01 | Deep CNN | 0.94 ± 0.02 | 40.57 ± 1.58 | B | Fine structural details | × | × |
| Du et al. [ | D01, D02 | Competitive blocks in the stacked competitive network | 0.95 | 41.92 | × | Fine structural details, intercostal vein | √ | √ |
| Kang et al. [ | D02 | Direction wavelet transform determines the directional artifacts | × | 34.8 ± 1.5 | B | Liver, blood vessels | √ | √ |
| Yang et al. [ | D02 | Optimal model depth and dropouts | 0.94 ± 0.002 | 39.53 ± 0.30 | E | Continuous structures, blood vessels | √ | × |
| Wu et al. [ | D02 | Applying the cascading for CNN boosts PSNR | 0.75 ± 0.001 | 42.2 ± 0.05 | × | Fine structural details | × | √ |
| Kang et al. [ | D02 | Feed forwards and RNN network structure suppressed the streaking artifacts | 0.89 | 38.54 | E | Liver, pelvic bone, blood vessels | √ | √ |
| Wang et al. [ | D02 | Statistical iterative reconstruction framework and IRL prior information | × | 26.77 ± 12.84 | E | All structural details | × | × |
| Gou et al. [ | D02 | The gradient regularization of the structure details | 0.91 ± 0.04 | 36.62 ± 3.245 | E | Soft tissues | × | √ |
| Gholizadeh-Ansari et al. [ | D01, D03 | MSE and perceptual loss combined objective function keep more texture details | 0.84 | 35.57 | E | Lung, pulmonary vasculature | √ | × |
| Yin et al. [ | D01, D02 | Sinogram domain network (SD-net), and image domain network (ID-net) | 0.95 ± 0.01 | 41.97 ± 1.49 | E | Tissues, blood vessels | √ | × |
| Ming et al. [ | D01 | DenseNet [ | 0.94 | 33.67 | × | Bone groove | × | × |
| Zhong et al. [ | D01, D02 | Stacked residual blocks | 0.96 ± 0.012 | 43.51 ± 0.95 | × | Organs in the thorax | × | × |
| Shiri et al. [ | D13 | ResNet with dilated convolution | 0.97 | 34.0 | × | Lung regions | × | √ |
| Huang et al. [ | D02, D05 | Stationary wavelet transformation and ResNet | 0.88 | 31.0 | E | Liver, renal inferior vena cava | √ | √ |
| Jiang et al. [ | D02 | Multi-scale parallel ResNet. Dilated convolution | 0.94 | 32.75 | × | Detailed structures of Lung | √ | × |
| Chen et al. [ | D01, D02 | Shortcut connections have preserved the structure | 0.97 ± 0.008 | 44.42 ± 1.21 | B | Adrenal gland, blood vessels | √ | √ |
| Liu and Zhang [ | D01 | Stacked sparse denoising autoencoder | × | 42.32 ± 0.36 | E | Fine structural detail, blood vessels | × | × |
| Fan et al. [ | D02 | Quadratic autoencoder | 0.91 ± 0.05 | 30.82 ± 2.76 | × | Lymph nodes, blood vessels | √ | √ |
| Wolterink et al. [ | D06 | GAN network | × | 40.45 ± 4.05 | × | × | × | √ |
| Yang et al. [ | D02 | Wasserstein distance and perceptual loss-based objective function | 0.78 ± 0.03 | 24.17 ± 0.07 | × | × | √ | √ |
| Yi and Babyn [ | D01, D03, D04 | Long skip connections | 0.86 ± 0.01 | 27.51 ± 0.74 | E | Fine structural details | × | × |
| Shan et al. [ | D02, D07 | Conveying path-based encoder and decoder model. Transfer learning is applied from the 2D model to the 3D model | 0.90 ± 0.03 | 29.63 ± 1.68 | × | Fine structural details, intrahepatic blood vessels | √ | √ |
| You et al. [ | D02 | Structure sensitive objective function | 0.78 ± 0.04 | 26.38 ± 1.01 | E | Structural details, blood vessels | √ | √ |
| Kang et al. [ | D08 | Cycle GAN | × | × | E | Structural details | × | × |
| Hu et al. [ | D09 | Deep generator with cascaded ResNet | 0.95 ± 0.02 | 30.79 ± 3.03 | E | Trabecular bone, tooth root | × | × |
| Park et al. [ | D10 | Fidelity term in the objective function | 0.89 ± 0.02 | 34.34 ± 0.92 | E | Liver, morphological structures of the tissues, portal vein | × | × |
| Du et al. [ | D02 | Visual attention network | 0.98 | 42.05 | × | Fine structural details | × | √ |
| Tang et al. [ | D12 | Cycle GAN and BM3D prior | 0.97 ± 0.04 | 34.35 ± 0.14 | E | Tissue, blood vessels | √ | × |
| Yang et al. [ | D02, D03 | Generator with high-frequency domain-based U-Net and discriminator with inception module | 0.94 ± 0.004 | 32.29 ± 0.99 | E | Fine structural details | × | √ |
| Ma et al. [ | D02 | Adopting the least-square loss | 0.91 | 32.70 | × | Structural details | √ | × |
| Chi et al. [ | D02 | Generator with inception residual mapping-based U-Net and perceptual multi-level joint discriminator | 0.98 | 44.40 | × | Structural details | × | √ |
| Li et al. [ | D02 | Plane and depth attention modules | 0.78 | 22.17 | E | Structural details, blood vessels | √ | √ |
| Gu and Ye [ | D01 | Switchable generator using AdaIN layers | 0.66 | 30.87 | E | Structural details | × | × |
| Yin et al. [ | D06 | Multi-perceptual loss and residual connections in the generator network | 0.70 | 30.46 | E | Structural details | √ | × |
×—indicates “not tested” state
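The MSSIM and PSNR columns above follow the standard formulas. A NumPy sketch of PSNR (MSSIM is the mean of SSIM over local windows, as sketched after the loss-function table); the data_range parameter is the assumed dynamic range of the images:

```python
import numpy as np

def psnr(pred, target, data_range=1.0):
    """Peak signal-to-noise ratio in dB: 10 * log10(MAX^2 / MSE)."""
    mse = np.mean((pred - target) ** 2)
    if mse == 0.0:
        return float("inf")  # identical images
    return 10.0 * np.log10(data_range ** 2 / mse)

# A uniform error of 0.1 on a [0, 1] image gives MSE = 0.01, i.e. 20 dB.
clean = np.zeros((4, 4))
noisy = np.full((4, 4), 0.1)
value = psnr(noisy, clean)
```

Since PSNR depends only on MSE, a pixel-wise-trained model can score well on PSNR while losing texture, which is why the reviewed studies report MSSIM alongside it.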
Summary of training and execution efficiency of reviewed studies
| Article | Hardware platform | Handling learning rate | Data augmentation | Training dataset size | Batch size | Epochs | Training time (h) | Execution time (s) |
|---|---|---|---|---|---|---|---|---|
| Chen et al. [ | GPU | Time-based | Rotation (45°), flipping (vertical and horizontal), scaling (2, 0.5) | 200 slices (10⁶ patches) | × | × | 17 | 2.05 |
| Kang et al. [ | NVIDIA GeForce Titan, 6 GB | Time-based | Random flipping and rotation | 200 slices (10⁶ patches) | 10 | 50 | × | 1.6 |
| Yang et al. [ | NVIDIA GTX 1080, 8 GB | Step-based | × | 5080 slices | 1 | 150 | × | 0.3 (2D model), 0.25 (3D model) |
| Kang et al. [ | NVIDIA GeForce GTX 1080 Ti | Time-based | Flipping (vertical and horizontal) | 3236 slices | 10 | × | 20 | 2.05 |
| Wang et al. [ | Titan X GPU | Step-based | × | 22,800 slices for T-net and 433,200 slices for A-net | 256 | × | 8.10 for 2D model, 10.15 for 3D model | × |
| Yin et al. [ | NVIDIA GTX 1080, 8 GB | Time-based | × | 10⁶ and 10⁷ patches | × | × | 4 (10⁶ patches), 10 (10⁷ patches) | 0.13 for SD-net, 1 for ID-net |
| Ming et al. [ | NVIDIA GeForce GTX 1080 Titan | Time-based | Rotation, flipping | 100 slices | 1 | 64 | 6.57 | 0.002 |
| Zhong et al. [ | NVIDIA GeForce GTX Titan Xp | Time-based for first 25 epochs and fixed learning rate for next 25 epochs | Rotation, flipping, scaling | 5761 slices | 128 | 50 | 29.5 (30 residual blocks) | 0.1 |
| Shiri et al. [ | NVIDIA GTX 2080 Titan | Adaptive | × | 1141 slices | × | 10 | 50 | × |
| Huang et al. [ | NVIDIA GTX1080 Titan | Time-based | × | 1944 slices | 4 (stage 1), 16 (stage 2) | 55 (stage 1), 30 (stage 2) | 4 | < 3.68 |
| Chen et al. [ | NVIDIA GTX1080 | Time-based | Rotation (45°), flipping (vertical and horizontal), scaling (2, 0.5) | 10⁷ patches | × | × | 30 | 3.68 |
| Liu, Zhang [ | NVIDIA GTX 980 Titan | BFGS algorithm | Rotation (30°), flipping (vertical and horizontal), scaling (2, 0.5) | 300 slices (10⁶ patches) | × | × | 6.5 (6 layers) | 1.7 |
| Fan et al. [ | × | Fixed | × | 64,000 patches | 50 | 20 | × | 1.29 |
| Wolterink et al. [ | NVIDIA Titan X | Exponential | × | × | 48 | 100 | × | < 10 |
| You et al. [ | NVIDIA Titan Xp | Adaptive | × | 100,100 patches | 64 | 100 | 15 (2D model), 26 (3D model) | 0.534 (2D model), 4.864 (3D model) |
| Du et al. [ | NVIDIA Titan V | Fixed | × | 4000 slices (140,000 patches) | 32 | 20 | × | 0.08 |
| Chi et al. [ | NVIDIA Titan Xp, 12 GB | Time-based | × | 4036 slices | × | × | × | 1.84 |
| Yin et al. [ | NVIDIA RTX 2070 | Fixed | × | 2400 slices (11,648 patches) | 32 | 100 | 14 | × |
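The "Time-based", "Step-based", and "Exponential" learning-rate strategies in the table correspond to standard decay rules. A minimal sketch; the decay constants below are illustrative, not taken from the reviewed papers:

```python
import math

def time_based(lr0, epoch, decay=0.01):
    """Time-based decay: the rate shrinks continuously with the epoch index."""
    return lr0 / (1.0 + decay * epoch)

def step_based(lr0, epoch, drop=0.5, every=10):
    """Step-based decay: the rate is multiplied by `drop` every `every` epochs."""
    return lr0 * drop ** (epoch // every)

def exponential(lr0, epoch, k=0.05):
    """Exponential decay: smooth geometric shrinkage of the rate."""
    return lr0 * math.exp(-k * epoch)

# After 20 epochs the step schedule has halved the initial rate twice.
lr_after_20 = step_based(0.1, 20)
```

"Adaptive" handling in the table refers instead to optimizers such as Adam that scale per-parameter step sizes from gradient statistics rather than following a fixed schedule.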