Huibin Zhang, Qiusheng Lian, Jianmin Zhao, Yining Wang, Yuchi Yang, Suqin Feng.
Abstract
Deep convolutional neural networks (CNNs) have been very successful in image denoising. However, as the depth of a plain network grows, its performance may degrade; conversely, insufficient depth limits the network's ability to extract image features and makes it difficult to fuse shallow features into deep image information. In this work, we propose an improved deep convolutional U-Net framework (RatUNet) for image denoising. RatUNet improves U-Net as follows: (1) RatUNet uses the residual blocks of ResNet to deepen the network and avoid performance saturation. (2) RatUNet improves the down-sampling method, which benefits the extraction of image features. (3) RatUNet improves the up-sampling method, which is used to restore image details. (4) RatUNet improves the skip connections of U-Net to fuse shallow feature information into deep image details, which further helps to restore a clean image. (5) To better handle edge information, RatUNet uses depthwise and polarized self-attention mechanisms to guide the CNN during denoising. Extensive experiments show that RatUNet is more efficient and performs better than existing state-of-the-art denoising methods, particularly on the SSIM metric, where it achieves very high scores. Visualization results show that images denoised by RatUNet are smoother and sharper than those produced by other methods.
Keywords: Attention mechanism; Convolutional neural networks; Image denoising; RatUNet; U-Net
Year: 2022 PMID: 35634105 PMCID: PMC9138094 DOI: 10.7717/peerj-cs.970
Source DB: PubMed Journal: PeerJ Comput Sci ISSN: 2376-5992
Figure 1: U-Net architecture.
Figure 2: Residual block architecture.
(A) The standard residual block of a residual network; (B) the improved residual block. s denotes the stride.
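The core computation that Figure 2 depicts can be sketched in a few lines. This is a minimal pure-Python illustration of a residual block, y = ReLU(F(x) + x), where F is two stacked convolutions, not the paper's actual implementation; the function names and the toy 1-D three-tap convolution are assumptions for illustration.

```python
def relu(v):
    return [max(0.0, x) for x in v]

def conv3(v, w):
    """Toy 1-D 3-tap convolution with zero padding; kernel w has length 3."""
    padded = [0.0] + list(v) + [0.0]
    return [sum(w[k] * padded[i + k] for k in range(3)) for i in range(len(v))]

def residual_block(v, w1, w2):
    """Two conv3 layers (as in the standard block) plus the identity skip path."""
    f = conv3(relu(conv3(v, w1)), w2)               # F(x): conv -> ReLU -> conv
    return relu([fi + xi for fi, xi in zip(f, v)])  # F(x) + x, then ReLU

x = [1.0, 2.0, 3.0, 4.0]
identity = [0.0, 1.0, 0.0]          # kernel that returns its input unchanged
y = residual_block(x, identity, identity)
# With identity kernels, F(x) = x, so the block outputs 2x.
```

Because the skip path carries x forward unchanged, gradients can bypass F entirely; this is what lets very deep stacks of such blocks train without the saturation the abstract mentions.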
Figure 3: RatUNet architecture.
The numbers 128, 256, and 512 in the figure denote channel counts, and the rectangular blocks represent standard residual blocks, each composed of two 3 × 3 convolutions.
Figure 4: Depthwise attention mechanism (Howard et al., 2017).
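The depthwise idea cited from Howard et al. (2017) factorizes a convolution into a per-channel spatial filter followed by a 1×1 cross-channel mix. The sketch below is a hedged, pure-Python 1-D illustration of that factorization; the function names and shapes are assumptions, not the paper's code.

```python
def depthwise_1d(channels, kernels):
    """Depthwise step: one 3-tap kernel per channel; zero padding keeps length."""
    out = []
    for ch, w in zip(channels, kernels):
        padded = [0.0] + list(ch) + [0.0]
        out.append([sum(w[k] * padded[i + k] for k in range(3))
                    for i in range(len(ch))])
    return out

def pointwise(channels, mix):
    """Pointwise (1x1) step: mix[o][c] weights input channel c into output o."""
    length = len(channels[0])
    return [[sum(mix[o][c] * channels[c][i] for c in range(len(channels)))
             for i in range(length)] for o in range(len(mix))]

x = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]        # 2 channels, length 3
dw = depthwise_1d(x, [[0.0, 1.0, 0.0]] * 2)   # identity spatial kernels
y = pointwise(dw, [[1.0, 1.0]])               # one output channel = channel sum
```

Splitting the spatial and cross-channel steps this way cuts the parameter count relative to a full convolution, which is the main appeal of depthwise designs.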
Figure 5: Polarized self-attention mechanism.
(A) The original attention mechanism (Liu et al., 2021); (B) the improved attention mechanism after removing LayerNorm.
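Polarized self-attention (Liu et al., 2021) keeps full resolution along one axis while softmax-normalizing along the other, in separate channel and spatial branches. The toy below illustrates only the softmax gating of a channel branch on a tiny feature map; it omits the learned projections and the spatial branch, and every name in it is an illustrative assumption rather than the authors' code.

```python
import math

def softmax(v):
    m = max(v)                       # subtract max for numerical stability
    e = [math.exp(x - m) for x in v]
    s = sum(e)
    return [x / s for x in e]

def channel_attention(channels):
    """Gate each channel by a softmax over per-channel average activations."""
    means = [sum(ch) / len(ch) for ch in channels]
    gates = softmax(means)
    return [[g * x for x in ch] for g, ch in zip(gates, channels)]

feats = [[1.0, 1.0], [1.0, 1.0]]     # two identical channels
out = channel_attention(feats)
# Identical channels receive equal gates of 0.5, so every value becomes 0.5.
```

The softmax makes the gates compete: a channel with stronger average activation suppresses the others, which is the basic mechanism attention uses to emphasize informative features such as edges.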
Average PSNR (dB)/SSIM values of state-of-the-art methods for grayscale image denoising at noise levels σ = 15, 25, and 50 on the benchmark datasets Set12 and BSD68. Red indicates the best performance; blue the second best.
| Dataset | Noise (σ) | Metric | BM3D | WNNM | TNRD | DnCNN | IRCNN | FC-AIDE | NLRN | GCDN | MWCNN | RatUNet |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Set12 | 15 | PSNR | 32.37 | 32.70 | 32.50 | 32.86 | 32.77 | 32.99 | – | 33.14 | 33.15 | – |
| Set12 | 15 | SSIM | 0.8952 | 0.8982 | 0.8958 | 0.9031 | 0.9008 | 0.9006 | 0.9070 | 0.9072 | – | – |
| Set12 | 25 | PSNR | 29.97 | 30.28 | 30.06 | 30.44 | 30.38 | 30.57 | – | 30.78 | 30.79 | – |
| Set12 | 25 | SSIM | 0.8504 | 0.8557 | 0.8512 | 0.8622 | 0.8601 | 0.8557 | 0.8689 | 0.8687 | – | – |
| Set12 | 50 | PSNR | 26.72 | 27.05 | 26.81 | 27.18 | 27.14 | 27.42 | 27.64 | 27.60 | – | – |
| Set12 | 50 | SSIM | 0.7676 | 0.7775 | 0.7680 | 0.7829 | 0.7804 | 0.7768 | 0.7980 | 0.7957 | – | – |
| BSD68 | 15 | PSNR | 31.07 | 31.37 | 31.42 | 31.73 | 31.63 | 31.78 | – | 31.83 | 31.86 | – |
| BSD68 | 15 | SSIM | 0.8717 | 0.8766 | 0.8769 | 0.8907 | 0.8881 | 0.8907 | 0.8932 | 0.8933 | – | – |
| BSD68 | 25 | PSNR | 28.57 | 28.83 | 28.92 | 29.23 | 29.15 | 29.31 | 29.41 | 29.35 | 29.41 | – |
| BSD68 | 25 | SSIM | 0.8013 | 0.8087 | 0.8093 | 0.8278 | 0.8249 | 0.8281 | 0.8331 | 0.8332 | – | – |
| BSD68 | 50 | PSNR | 25.62 | 25.87 | 25.97 | 26.23 | 26.19 | 26.38 | 26.47 | 26.38 | – | – |
| BSD68 | 50 | SSIM | 0.6864 | 0.6982 | 0.6994 | 0.7189 | 0.7171 | 0.7181 | 0.7298 | – | 0.7366 | – |
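The PSNR figures reported in the tables follow the standard definition PSNR = 10·log10(MAX²/MSE). The helper below is a generic sketch of that formula for 8-bit images (peak value 255), not the authors' evaluation script; the flattened pixel-list interface is an assumption for illustration.

```python
import math

def psnr(clean, denoised, peak=255.0):
    """Peak signal-to-noise ratio between two images given as flat pixel lists."""
    mse = sum((a - b) ** 2 for a, b in zip(clean, denoised)) / len(clean)
    if mse == 0:
        return float("inf")          # identical images: no distortion
    return 10.0 * math.log10(peak ** 2 / mse)

# A uniform error of 1 gray level gives MSE = 1 and PSNR of about 48.13 dB.
value = psnr([10.0, 20.0, 30.0], [11.0, 21.0, 31.0])
```

Because PSNR is a log of a ratio, the roughly 0.3 dB gaps between methods in the tables correspond to small but consistent reductions in mean squared error.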
Figure 6: (A–F) Visual comparison results for grayscale image denoising methods on the image “Monarch” from the Set12 dataset with noise level 50.
Average PSNR (dB) values of state-of-the-art methods for color image denoising at noise levels σ = 15, 25, and 50 on the benchmark datasets CBSD68, Kodak24, and McMaster. Red indicates the best performance; blue the second best.
| Dataset | Noise (σ) | CBM3D | DnCNN | IRCNN | FFDNet | BRDNet | ADNet | RNAN | RatUNet |
|---|---|---|---|---|---|---|---|---|---|
| CBSD68 | 15 | 33.52 | 33.90 | 33.86 | 33.87 | 34.10 | 33.99 | – | – |
| CBSD68 | 25 | 30.71 | 31.24 | 31.16 | 31.21 | 31.43 | 31.31 | – | – |
| CBSD68 | 50 | 27.38 | 27.95 | 27.86 | 27.42 | 28.16 | 28.04 | – | – |
| Kodak24 | 15 | 34.28 | 34.60 | 34.69 | 34.63 | – | 34.76 | – | – |
| Kodak24 | 25 | 32.15 | 32.14 | 32.18 | 32.13 | 32.41 | 32.26 | – | – |
| Kodak24 | 50 | 28.46 | 28.95 | 28.93 | 28.98 | 29.22 | 29.10 | – | – |
| McMaster | 15 | 34.06 | 33.45 | 34.58 | 34.66 | – | 34.93 | – | – |
| McMaster | 25 | 31.66 | 31.52 | 32.18 | 32.35 | – | 32.56 | – | – |
| McMaster | 50 | 28.51 | 28.62 | 28.91 | 29.18 | – | 29.36 | – | – |
Figure 7: (A–F) Color image denoising results for the image “163085” from the CBSD68 dataset with noise level 50 for different methods.
Training time comparison on different GPUs.
| Method | GPU | Stream processor unit | Memory capacity (GB) | Training time (h) |
|---|---|---|---|---|
| FFDNet | Nvidia Titan X | 3,584 | 12 | 48 |
| FOCNet | Nvidia Titan Xp | 3,840 | 12 | 48 |
| NLRN | Nvidia Titan Xp | 3,840 | 12 | 72 |
| MWCNN | Nvidia GTX 1080 | 2,560 | 8 | 48 |
| DnCNN | Nvidia GTX 1070ti | 2,432 | 8 | 15 |
| RatUNet | Nvidia GTX 1070ti | 2,432 | 8 | 11 |
Denoising performance of the network model with ReLU vs. PReLU activation functions.
| Dataset | Metric | ReLU | PReLU |
|---|---|---|---|
| Set12 | PSNR | 30.81 | 30.85 |
| Set12 | SSIM | 0.8730 | 0.8736 |
| BSD68 | PSNR | 29.39 | 29.43 |
| BSD68 | SSIM | 0.8408 | 0.8419 |
Figure 8: Comparison of the iterative convergence of the loss function for our initialization method and the Kaiming initialization method.
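Figure 8 uses Kaiming (He) initialization as the baseline. For reference, standard Kaiming-normal initialization for a ReLU layer draws weights from N(0, sqrt(2/fan_in)); the sketch below is a generic illustration of that baseline, not the paper's proposed variant, and the function name and shapes are assumptions.

```python
import math
import random

def kaiming_normal(fan_in, fan_out, rng=random):
    """He initialization for ReLU layers: weights ~ N(0, sqrt(2 / fan_in))."""
    std = math.sqrt(2.0 / fan_in)    # gain sqrt(2) compensates for ReLU zeroing
    return [[rng.gauss(0.0, std) for _ in range(fan_in)]
            for _ in range(fan_out)]

random.seed(0)
w = kaiming_normal(fan_in=9, fan_out=4)   # e.g. a 3x3 kernel on one channel
# Each row holds one output unit's weights, drawn with std sqrt(2/9) ~ 0.47.
```

Scaling the variance by fan_in keeps activation magnitudes roughly constant from layer to layer, which is why the choice of initialization affects the convergence curves compared in the figure.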