Ojasvi Yadav, Koustav Ghosal, Sebastian Lutz, Aljosa Smolic.
Abstract
We address the problem of exposure correction of dark, blurry and noisy images captured in low-light conditions in the wild. Classical image-denoising filters work well in the frequency space but are constrained by several factors, such as the correct choice of thresholds and frequency estimates. On the other hand, traditional deep networks are trained end to end in the RGB space by formulating this task as an image translation problem. However, that is done without any explicit constraints on the inherent noise of the dark images and thus produces noisy and blurry outputs. To this end, we propose a DCT/FFT-based multi-scale loss function, which, when combined with traditional losses, trains a network to translate the important features for visually pleasing output. Our loss function is end-to-end differentiable, scale-agnostic and generic; i.e., it can be applied to both RAW and JPEG images in most existing frameworks without additional overhead. Using this loss function, we report significant improvements over the state of the art using quantitative metrics and subjective tests.
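As a rough, single-scale illustration of the idea in the abstract (not the authors' code; the paper's loss is multi-scale and would be computed with a deep-learning framework's differentiable FFT), a frequency-domain L1 loss can be sketched in pure Python. The naive 2-D DFT below is O(N⁴) and only meant to show what is being compared; all function names are illustrative:

```python
import cmath

def dft2(img):
    """Naive 2-D discrete Fourier transform of a small image,
    given as a list of lists of floats. Illustration only (O(N^4))."""
    H, W = len(img), len(img[0])
    out = [[0j] * W for _ in range(H)]
    for u in range(H):
        for v in range(W):
            s = 0j
            for x in range(H):
                for y in range(W):
                    s += img[x][y] * cmath.exp(-2j * cmath.pi * (u * x / H + v * y / W))
            out[u][v] = s
    return out

def fft_l1_loss(pred, gt):
    """L1 distance between the frequency representations of a
    predicted image and the ground truth (single scale)."""
    P, G = dft2(pred), dft2(gt)
    H, W = len(pred), len(pred[0])
    return sum(abs(P[u][v] - G[u][v]) for u in range(H) for v in range(W)) / (H * W)
```

In practice this term would be added to a pixel-space loss (L1, or L1 + GAN for JPEG) and evaluated at several image scales; a DCT variant replaces the Fourier transform with the discrete cosine transform.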
Keywords: Computational photography; Deep learning; Exposure correction; Frequency transform; Loss function
Year: 2021 PMID: 34721702 PMCID: PMC8549936 DOI: 10.1007/s11760-021-01915-4
Source DB: PubMed Journal: Signal Image Video Process ISSN: 1863-1703 Impact factor: 2.157
Fig. 3 JPEG results. For the blue border crop, the FFT loss output has the least noise. For the yellow border crop, both the DCT loss and the FFT loss give sharper text on the book. For the red border crop, the colours are most accurate for the DCT loss. For the green border crop, the FFT loss has the fewest artefacts (Color figure online)
Fig. 1 Framework. A standard encoder–decoder architecture (yellow) is coupled with a GAN component (green). The Pix2Pix framework used for JPEG images roughly follows this pipeline with additional skip connections. For RAW images, we use the framework of [3], which does not use a GAN component, i.e., it uses only the yellow section of the pipeline (Color figure online)
Results for our RAW exposure correction experiments
| Loss | PSNR | SSIM |
|---|---|---|
| L1 (SoA) | 28.60 | 0.767 |
| L1 + DCT | 28.61 | 0.769 |
| L1 + FFT | | |
Bold values indicate the best score for each metric in the respective experiment
For both PSNR and SSIM, higher scores are better
Fig. 2 RAW results. In the blue border crop, the pavement cross is sharper for the FFT loss output. For the yellow border crop, the L1 loss (SoA) output has green artefacts at the bottom while the FFT loss does not. For the red border crop, the colours are more accurate for the FFT loss. For the green border crop, the window pane is sharper for the FFT loss (Color figure online)
Results for our JPEG exposure correction experiments
| Loss | PSNR | SSIM |
|---|---|---|
| L1 + GAN (SoA) | 23.9487 | 0.7623 |
| L1 + GAN + DCT | | |
| L1 + GAN + FFT | 24.4624 | 0.7727 |
Bold values indicate the best score for each metric in the respective experiment
For both PSNR and SSIM, higher scores are better
Fig. 4 Just objectionable difference (JOD) [32] between the L1 (SoA), FFT and DCT losses, for all images together and for the RAW and JPEG images separately
Results for additional applications
| Model | Default settings | With frequency loss added |
|---|---|---|
| SRCNN | 28.86/0.92 | |
| Gaussian-clean | 30.30/0.87 | |
| DeblurGAN-v2 | 29.18/0.89 | |
| LBAM* | 26.11/0.86 | |
| ViDeNN-spatial | 31.5 | |
Bold values indicate the best score for each metric in the respective experiment
The models were trained with and without the frequency loss function, while keeping model parameters and datasets constant. The numbers signify PSNR/SSIM scores, and higher scores are better
*Our frequency loss computation was modified to suit classic inpainting loss functions: instead of computing the loss between the ground truth (GT) and the output (O) directly, the loss is calculated with a mask (M), between GT·M and O·M and between GT·(1−M) and O·(1−M)
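The masked variant described in the footnote can be sketched as follows (again an illustrative pure-Python sketch, not the authors' implementation; the naive DFT stands in for a framework's FFT, and mask entries are assumed to be 0 or 1):

```python
import cmath

def dft2(img):
    """Naive 2-D DFT of a small image (list of lists of floats)."""
    H, W = len(img), len(img[0])
    return [[sum(img[x][y] * cmath.exp(-2j * cmath.pi * (u * x / H + v * y / W))
                 for x in range(H) for y in range(W))
             for v in range(W)] for u in range(H)]

def freq_l1(a, b):
    """L1 distance between the frequency representations of two images."""
    A, B = dft2(a), dft2(b)
    return sum(abs(x - y) for ra, rb in zip(A, B) for x, y in zip(ra, rb))

def masked_freq_loss(gt, out, mask):
    """Inpainting adaptation: frequency loss between GT*M and O*M,
    plus between GT*(1-M) and O*(1-M)."""
    def mul(img, m, inv=False):
        return [[p * ((1 - v) if inv else v) for p, v in zip(pr, mr)]
                for pr, mr in zip(img, m)]
    return (freq_l1(mul(gt, mask), mul(out, mask))
            + freq_l1(mul(gt, mask, True), mul(out, mask, True)))
```

Splitting the loss this way lets the hole region (1−M) and the known region (M) each contribute their own frequency-domain error term, matching how classic inpainting losses weight the two regions separately.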