
Dictionary learning based noisy image super-resolution via distance penalty weight model.

Yulan Han, Yongping Zhao, Qisong Wang.

Abstract

In this study, we address the problem of noisy image super-resolution. In practice, the acquired low resolution (LR) image is usually noisy, while most existing algorithms assume that the LR image is noise-free. To address this situation, we present an algorithm for noisy image super-resolution that achieves image super-resolution and denoising simultaneously. In the training stage of our method, the LR example images are noise-free, and the dictionary pair does not need to be retrained for different input LR images, even if the noise variance varies. For each input LR image patch, the corresponding high resolution (HR) image patch is reconstructed through a weighted average of similar HR example patches. To reduce computational cost, we use the atoms of a learned sparse dictionary as the examples instead of the original example patches. We propose a distance penalty model for calculating the weights, which also performs a second selection among the similar atoms at the same time. Moreover, LR example patches with their mean pixel value removed are also used to learn the dictionary, rather than just their gradient features. Based on this, we can reconstruct an initial estimated HR image and a denoised LR image. Combined with iterative back projection, the two reconstructed images are used to obtain the final estimated HR image. We validate our algorithm on natural images and compare it with previously reported algorithms. Experimental results show that our proposed method is more robust to noise.

Year:  2017        PMID: 28759633      PMCID: PMC5536359          DOI: 10.1371/journal.pone.0182165

Source DB:  PubMed          Journal:  PLoS One        ISSN: 1932-6203            Impact factor:   3.240


1 Introduction

Single image super-resolution (SR) is a classical problem in computer vision. In general, it uses signal processing techniques to recover a high resolution (HR) image from a single low resolution (LR) image. SR methods can be broadly classified into three categories: interpolation-based methods, reconstruction-based methods, and example-based methods. Interpolation-based SR methods such as [1, 2] have been used in various applications and offer speed and computational simplicity. However, they usually fail to generate fine details in discontinuous regions and often blur edges and other high-frequency features in practice [3]. Reconstruction-based methods usually integrate one or more sophisticated priors, such as the gradient profile prior [4], edge prior [5], and total variation [6], to estimate the missing details. Recently, sparsity-based regularization [7-10] has also been shown to be particularly effective for the ill-posed SR problem. These methods usually achieve impressive results in preserving sharp edges and suppressing aliasing artifacts. However, their performance depends heavily on a rational prior imposed on the up-sampled image [11]. Over the years, many example-based SR methods [12-14] have been proposed with promising results and have become the mainstream approach in SR. These methods assume that the missing high-frequency details can be estimated by learning mapping relationships among the LR-HR patch pairs of an external database and the input LR patches. Two kinds of relationship models exist for these methods. The first is the relationship between LR patches and the corresponding HR patches in the database. After Freeman et al. [15] used a Markov network to model this relationship, regression functions [16] were employed to exploit the relationship between HR and LR patch pairs. In addition, supervised or semi-supervised learning models have been introduced into some of the algorithms [17-19]. 
Recently, a mapping between LR-HR image pairs was learned using a deep convolutional neural network [20], with favorable results. D. Dai et al. [21] jointly learned a collection of regressors from LR to HR patches, which collectively yielded the smallest error over all training data. The second is the relationship between LR example patches and input LR patches. Most of these methods [22, 23] are based on Nearest Neighbor Embedding (NNE). In these methods, a fixed number of nearest neighbors are extracted from the database for each input LR patch, and the corresponding HR patches are then used to estimate the output HR patch by a linear combination determined by the LR patch and its neighbors. Although these algorithms have produced successful results, they depend heavily on the number of neighbors, which is difficult to determine. To address this problem, [24] uses a dynamic k-nearest neighbor algorithm, where k is small for a test point with highly relevant neighbors and large otherwise. Some researchers calculate the distance between the input patch and each of its neighbors. The neighbors will be abandoned when the distance is smaller than mean value. Yang [25] exploited sparse coding to perform image SR. The algorithm assumes that LR-HR patch pairs share the same sparse coefficients with respect to their respective dictionaries, which are jointly learned from a set of external training images. It can be considered as neighbor embedding in the sparse domain without choosing the number of neighbors. Since then, sparse coding has been applied to the SR problem [21-23] and has achieved impressive results. Zeyde [26] used dimensionality reduction and orthogonal matching pursuit for sparse representation to improve efficiency. S. Wang [27] proposed a semi-coupled dictionary learning model, under which a pair of dictionaries and a mapping function describing the relationship between the sparse coefficients of LR-HR patch pairs are learned simultaneously. 
In [28], kernel ridge regression is employed to connect the sparse coefficients of LR-HR patch pairs. Kaibing Zhang [29] determined the relationship between LR image patches and HR image patches by assuming that they share the same sparse coefficients. R. Timofte et al. [30] proposed a fast image SR method called anchored neighbourhood regression (ANR), which learns sparse dictionaries and regressors anchored to dictionary atoms. This algorithm is faster while making no compromise on quality. R. Timofte et al. [31] then produced an improved variant of ANR with enhanced features and anchored regressors. Instead of learning the regressors on the dictionary, their method uses the full training material. It achieved improved quality while remaining very fast. S. Gu [32] proposed a convolutional sparse coding based SR method to address the consistency issue. In addition, research shows that image structures tend to repeat themselves within and across scales. The methods in [33-35] exploit the self-similarity of structures in natural images and extract the database directly from the LR input image instead of an external database. Good reconstruction quality relies on considerable additional memory and running time to build counterparts across different scales in a recursive scheme, so their application is limited. Although these algorithms can achieve good performance, most SR algorithms, including other learning-based methods, assume that the input LR image is noise-free. This assumption does not hold in real applications, and the algorithms are less robust for noisy image SR. Super-resolution for noisy images is therefore another challenge. Compared with SR on clean LR input images, less attention has been paid to developing effective SR algorithms for noisy ones. J. Xie [36] first employs an adaptively regularized shock filter to tackle the jagged noise, and then performs SR for depth images. 
The disadvantage of such a scheme is that artifacts can be created in the denoising process and magnified in the super-resolution process. Therefore, researchers began to study simultaneous denoising and super-resolution. In [37], LR training images are magnified by a TV regularization model with a constraint before the dictionary training stage. However, the level of noise that the method can handle is small. Furthermore, it focuses on magnification only. Based on the current state of research, we aim to design an algorithm that completes SR and denoising in the same framework to deal with noisy image patches. Sparse representation concentrates the signal energy in only a few atoms. Because of this property, some sparse coding based SR algorithms such as [25] show a certain robustness to noisy images. In addition, sparse representation has been successfully employed in image denoising [38, 39], image restoration [40, 41] and other processing tasks [42, 43]. The dictionary plays an important role in the sparse representation process. A predefined analytical dictionary (e.g., wavelet dictionary, Gabor dictionary) makes the coding fast and explicit, but it is less effective at modeling the complex local structures of natural images. A synthesis dictionary (e.g., K-SVD dictionary) can be learned from example natural images; it is more expensive to compute but can better model complex local image structures [44]. In recent years, many dictionary learning methods have been proposed and have achieved good performance. Feng et al. [45] propose to jointly learn the projection matrix for dimensionality reduction and the discriminative dictionary for face representation. Zhang et al. [46] propose a semi-supervised label consistent dictionary learning framework for machine fault classification. Inspired by these, we introduce sparse theory into our research. The overall procedure is illustrated in Fig 1. The input LR image and example images are first cropped into patches. 
The example images are noise-free. Then the features of the example patch pairs are extracted and used to learn the dictionary pair. For each input LR patch, its features allow us to simultaneously find similar dictionary atom pairs and calculate the distance b between the input LR patch and its similar atoms. Next, the input LR image patch feature, the LR dictionary atoms and the distance b are used to compute the weight ω. After the weight is computed, we can obtain the estimated HR image patch and the denoised LR image patch from the corresponding HR dictionary atoms. All the estimated HR patches are then assembled into an estimated HR image, with overlapping regions averaged. In the same way, we obtain the denoised LR image from all the denoised LR patches. Finally, combined with iterative back projection (IBP), the estimated HR image and the denoised LR image are used to obtain the final output HR image.
Fig 1

The flowchart of the proposed SR algorithm.

The contributions can be summarized as follows. (1) Unlike conventional methods, the proposed algorithm can process noisy images and performs image super-resolution and denoising simultaneously. Furthermore, in the training stage of our method, the LR example images are noise-free. For different input LR images, even if the noise variance varies, the dictionary pair does not need to be retrained. (2) The core idea of our proposed method is that the estimated HR patch is a weighted average of similar HR example patches. To reduce the computational cost of finding similar patches among millions of examples, the example patches are replaced by a learned sparse dictionary, which concentrates the signal energy in only a few atoms. (3) A penalty function is applied to a least squares regression regularized by the l2-norm to model the weights, so that the objective function treats each similar atom unequally. The penalty is determined by the similarity between the input LR patch and each similar atom of the LR dictionary. When the similarity is strong, we make the penalty small, which forces a large weight. Conversely, when the similarity is weak, we make the penalty large, which forces a small or zero weight. (4) LR example patches with their mean pixel value subtracted are used for training the dictionary, rather than just their gradient features as in other work such as [25]. In the training stage, for each LR example patch, we first subtract its mean pixel value and then concatenate it with its corresponding HR example patch into a single vector. All the new vectors are used as new HR examples to learn the HR dictionary. Thus, the HR dictionary represents the textures not only of the HR example patches but also of the noise-free LR example patches. Therefore, in the reconstruction stage, the HR dictionary can also be used to recover denoised input LR patches. This is different from conventional learning methods. 
Combined with iterative back projection (IBP), the denoised LR patches are used to enhance robustness to noise. The remainder of this paper is organized as follows. The proposed algorithm is presented in detail in Section 2. Experimental results and comparisons are presented in Section 3. Section 4 concludes the paper.

2 The proposed method

First, let us recall the image degradation model shown in Eq (1). An observed LR image Y is a degraded version of an HR image X of the same scene:

$$Y = GHX + v \qquad (1)$$

where G is the down-sampling operator with scaling factor s, H is the blurring operator, and v is the noise. The task of SR reconstruction is to recover X from Y as accurately as possible. Conventional SR methods assume that the image is noise-free.
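The degradation model can be illustrated with a minimal NumPy sketch. The function name `degrade`, the 3-tap blur kernel, and the plain decimation used for G are assumptions made for illustration only; the experiments in this paper rely on MATLAB's "imresize" instead.

```python
import numpy as np

def degrade(hr, scale, sigma, blur_kernel=None, rng=None):
    """Simulate Y = G H X + v for a 2-D grayscale image `hr`.

    H: separable blur with a small 1-D kernel (illustrative assumption).
    G: keep every `scale`-th pixel in both directions.
    v: zero-mean Gaussian noise with standard deviation `sigma`.
    """
    if blur_kernel is None:
        blur_kernel = np.array([0.25, 0.5, 0.25])  # hypothetical kernel
    k = np.asarray(blur_kernel, dtype=float)
    k = k / k.sum()
    x = np.asarray(hr, dtype=float)
    pad = len(k) // 2
    # H: blur along columns, then rows, replicating edges at the borders
    xp = np.pad(x, pad, mode="edge")
    rows = sum(w * xp[:, i:i + x.shape[1]] for i, w in enumerate(k))
    blurred = sum(w * rows[i:i + x.shape[0], :] for i, w in enumerate(k))
    # G: decimate by the scaling factor
    low = blurred[::scale, ::scale]
    # v: additive zero-mean Gaussian noise
    if rng is None:
        rng = np.random.default_rng()
    return low + sigma * rng.standard_normal(low.shape)
```

Calling `degrade(hr, 2, 0.0)` gives a noise-free LR image such as those used for training, while a positive `sigma` gives a noisy test input.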

2.1 Example database

From the HR example images, LR example images are first obtained, and these are considered noise-free. For each HR example image, its corresponding LR image is determined by Eq (2), that is, by blurring and down-sampling without added noise. A set of vectorized HR patches is taken from the example HR images and a set of vectorized LR patches is taken from the corresponding example LR images. Consequently, we obtain a database of HR-LR patch pairs $(p_h, p_l)$ (Eq (3)).
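As a rough sketch of how such a database of vectorized patch pairs could be assembled, the snippet below pairs every LR patch with the HR patch that covers the same region. The helper names `extract_patches` and `build_pair_database` are hypothetical, and the sketch assumes the HR image is exactly `scale` times larger than the LR image in each dimension.

```python
import numpy as np

def extract_patches(img, size, step):
    """Return vectorized size x size patches taken every `step` pixels."""
    img = np.asarray(img, dtype=float)
    h, w = img.shape
    patches = [
        img[r:r + size, c:c + size].ravel()
        for r in range(0, h - size + 1, step)
        for c in range(0, w - size + 1, step)
    ]
    return np.array(patches)

def build_pair_database(hr, lr, scale, lr_size, lr_step):
    """Pair each LR patch with the HR patch covering the same region."""
    p_l = extract_patches(lr, lr_size, lr_step)
    p_h = extract_patches(hr, lr_size * scale, lr_step * scale)
    if len(p_l) != len(p_h):
        raise ValueError("HR image must be exactly `scale` times the LR size")
    return p_h, p_l
```

Each row of `p_h` and the row with the same index in `p_l` form one example pair.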

2.2 Distance penalty weight model

For super-resolution, given an LR image Y generated from the HR image X by Eq (1), the task is to recover the unknown X from Y with the help of example patch pairs. The algorithm operates patch by patch. Similar to [25], Y is first divided into overlapping patches

$$Y = \{y_1, y_2, \ldots, y_N\} \qquad (4)$$

where $y_i$ is a vectorized LR image patch and N is the number of patches of Y. The estimated HR image X can be represented as

$$X = \{x_1, x_2, \ldots, x_N\} \qquad (5)$$

where $x_i$ is the estimated HR image patch corresponding to $y_i$. According to Eq (1), the relationship can be described by

$$y_i = GHx_i + v \qquad (6)$$

where v is the noise, which we assume to be Gaussian with zero mean and variance σ². Thus, the purpose of super-resolution becomes estimating the HR image patch $x_i$ from the input LR image patch $y_i$. As is well known, each $x_i$ can be approximated by a weighted average of HR example patches that have similar structures. Based on this core idea, the problem is to find the patches in the database that are similar to $x_i$ and to calculate the weights. Due to the repetition of local structures in images, there exists a subset of k patch pairs $\{(p_h^j, p_l^j)\}_{j=1}^{k}$ in the database whose HR patches have structures similar to $x_i$. That is,

$$x_i \approx \sum_{j=1}^{k} \omega_j p_h^j \qquad (7)$$

where ω = [ω_1, ω_2, …, ω_j, …, ω_k] is the weight vector and k is the number of patch pairs in this subset. There are many methods to determine the weights, such as setting them to be inversely proportional to the distance between patches. These methods rely heavily on the number of similar patches and cannot suppress noise. We now discuss a new weight model in detail. According to the degradation model, Eqs (1) and (7), we have

$$y_i \approx GH \sum_{j=1}^{k} \omega_j p_h^j + v \qquad (8)$$

From Eq (8), since each LR example patch is the blurred and down-sampled version of its HR counterpart ($p_l^j = GHp_h^j$), we can obtain

$$y_i \approx \sum_{j=1}^{k} \omega_j p_l^j + v$$

where v is assumed to be Gaussian noise with zero mean and variance σ². Thus,

$$\Big\| y_i - \sum_{j=1}^{k} \omega_j p_l^j \Big\|_2^2 \le \varepsilon$$

where ε is related to σ². We can see that the LR patch $y_i$ can be represented by the same weight vector ω over the similar LR example patches, with an error ε. That is to say, we can get the weight from the input LR image patch and the similar LR example patches with a controlled error. 
Based on the above discussion, we formulate the weight solution as a least squares regression regularized by the l2-norm (Eq (12)). The objective function in Eq (12) treats all patches equally, so it is not flexible enough to obtain accurate weights for the input patch. Motivated by this, we introduce a distance penalty into the least squares problem (Eq (13)) through the term b ⋅ ω, where ⋅ denotes the point-wise vector product and b = [b_1, b_2, …, b_j, …, b_k]. Here $b_j$ is the distance between $y_i$ and the j-th similar example patch $p_l^j$. When the similarity between $y_i$ and $p_l^j$ is strong, we make $b_j$ small, which forces a large $\omega_j$ at the same time. Conversely, when the similarity is weak, we make $b_j$ large, which forces a small or zero $\omega_j$. It is simply determined by the squared Euclidean distance. Eq (13) can be written as

$$\min_{\omega} \Big\| y_i - \sum_{j=1}^{k} \omega_j p_l^j \Big\|_2^2 + \lambda \| b \cdot \omega \|_2^2 \qquad (14)$$

where λ is a regularization parameter. According to Eq (10), λ is tied to the noise variance through a positive constant γ (Eq (15)), so we set λ = γσ² when σ ≠ 0. Thus, the main task in the reconstruction stage is to find the patches in the database that are similar to $y_i$ and to compute the weight. The squared Euclidean distance can be adopted to quantify the similarity. The HR patches corresponding to the similar LR patches are assumed to have structures similar to $x_i$. However, it is difficult to find similar patches for each input patch among millions of example patch pairs, and the repeated computation takes a lot of time. A sparse dictionary concentrates the signal energy in only a few atoms, and some sparse coding based SR algorithms [25] show a certain robustness to noisy images, so we use a learned sparse dictionary instead of the examples. That is, we find similar patch pairs among the dictionary atom pairs. Two dictionaries $D_h$ and $D_l$ are trained to have the same sparse coding for each HR and LR patch pair. Similar to Yang [25] and Chang [22], we subtract the mean pixel value from each HR example patch, so that the dictionary $D_h$ represents image textures rather than absolute intensities. In the reconstruction stage, the mean value of each estimated patch is then predicted from its LR version. 
We also employ first- and second-order derivatives as the features extracted from the LR example patches for training. Thus, $D_l$ represents the gradient features of images rather than absolute intensities. The four filters used are given in Eq (16). In addition, to enhance robustness to noise, we also subtract the mean pixel value from each LR example patch and concatenate the LR example patch with its corresponding HR example patch into a single vector, which is also used to learn $D_h$. Thus, the dictionary $D_h$ represents the textures not only of the HR example patches but also of the noise-free LR example patches. In the reconstruction stage, $D_h$ can therefore also be used to recover denoised input LR patches. This is different from conventional learning methods. From the above, the training set is obtained by Eq (17), where $(p_h, p_l)$ are the original HR-LR patch pairs in Eq (3), the mean values of $p_h$ and $p_l$ are subtracted from the respective patches, and F(⋅) is the operator that computes the four gradient vectors by Eq (16) and concatenates them into a single vector. The set $(P_h, P_l)$ is used to jointly train the dictionaries as in Eq (18), where N and M are the vector dimensions of $P_h$ and $P_l$, respectively. To solve the problem easily, Eq (18) can be rewritten as Eq (19). The minimization of Eq (19) is a typical patch-based sparse coding problem, and many methods can be used to solve it. Yang [25] proposed the framework and obtained good results, but solving this sparse model takes a large amount of time. Zeyde [26] improved the execution speed through dimensionality reduction of the patches by PCA and the use of Orthogonal Matching Pursuit for sparse coding. For sparse dictionary learning, we use the approach of Zeyde [26]. The gradient features (see Eq (16)) of the LR example patches are used to learn the LR dictionary, so $D_l$ represents the image gradient features. Therefore, the weight model is rewritten over the similar dictionary atoms as Eq (20), where ω is the weight. Problem (20) is an l2-norm constrained problem, and we solve it for the weight analytically. 
The closed-form solution is given by Eq (21), where B is a k × k diagonal matrix defined in Eq (22). The final optimal weight is obtained by rescaling this solution.
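To make the role of the penalty concrete, here is a minimal sketch of a closed-form solve for a penalized least squares problem of the form min ||y − Aω||² + λ||b ⋅ ω||². Setting the gradient to zero gives (AᵀA + λB)ω = Aᵀy with B = diag(b_j²). This is one plausible reading, not necessarily the exact form of Eqs (21) and (22); the function name and the sum-to-one rescaling are assumptions.

```python
import numpy as np

def distance_penalty_weights(y, atoms, b, lam):
    """Minimize ||y - atoms @ w||^2 + lam * ||b * w||^2 over w.

    y:     feature vector of the input LR patch, shape (d,)
    atoms: the k similar LR atoms as columns, shape (d, k)
    b:     per-atom distance penalties, shape (k,)
    lam:   regularization weight, e.g. gamma * sigma**2
    """
    atoms = np.asarray(atoms, dtype=float)
    y = np.asarray(y, dtype=float)
    B = np.diag(np.asarray(b, dtype=float) ** 2)
    # Stationarity condition: (A^T A + lam * B) w = A^T y
    w = np.linalg.solve(atoms.T @ atoms + lam * B, atoms.T @ y)
    # Assumed normalization: rescale so the weights sum to one
    s = w.sum()
    return w / s if abs(s) > 1e-12 else w
```

With two equally correlated atoms, the atom with the larger penalty receives a weight close to zero, which is the "second selection" effect described above.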

2.3 Reconstruction

Based on the above discussion, for each input $y_i$, we start by extracting its gradient features and finding k similar atom pairs. Because the dictionary atoms are learned basis vectors, we find the similar atoms based on the correlation between the LR dictionary atoms and the input LR patch rather than the Euclidean distance. We now describe how to compute the correlation. The features of $y_i$ can be represented over the LR dictionary $D_l$ (whose columns are the LR dictionary atoms; nd is the dictionary size) as in Eq (23), where β = [β_1, β_2, …, β_j, …, β_nd] and $\beta_j$ is the correlation between the input patch and the j-th atom. Eq (23) shows that every dictionary atom makes its own contribution to representing the input patch. The contribution of the j-th atom can be evaluated by $\beta_j$. In other words, $\beta_j$ measures the similarity between the input patch and the j-th dictionary atom: the larger $\beta_j$ is, the more similar they are, and a small $\beta_j$ means that there is little similarity. We can solve for β by Eq (24), which returns the correlation. In Eq (20), we use the distance b as the penalty. When the similarity between the input patch and an atom is strong, we make the corresponding b small, which forces a large weight at the same time. Conversely, when the similarity is weak, we make b large, which forces a small or zero weight. Therefore, we use the reciprocal of β to compute the penalty. The atom pairs corresponding to the k largest correlation coefficients constitute the set of similar atom pairs. b in Eq (20) is determined element-wise by

$$b = \frac{1}{\mathrm{Sort}(\mathrm{abs}(\beta), k)} \qquad (25)$$

where Sort(a, num) is a function returning the num largest values of vector a and abs(⋅) is the absolute value operation. This scheme finds the similar atoms and computes the distances simultaneously. If σ = 0, after finding the similar atoms, we set b = 1. After this, we can easily obtain the weight by solving Eq (20). According to Section 2.2, the reconstructed vector represents both the estimated HR patch and the denoised LR patch corresponding to $y_i$, each with its mean pixel value subtracted. 
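The selection of similar atoms and the reciprocal penalty can be sketched as follows. The sketch assumes unit-norm atoms stored as columns and uses the inner product between each atom and the input feature vector as the correlation; the paper's own Eq (24) may compute β differently. The function name `select_similar_atoms` and the small `eps` guard are hypothetical.

```python
import numpy as np

def select_similar_atoms(y, D_l, k, eps=1e-12):
    """Pick the k atoms most correlated with y and derive their penalties.

    Assumes the columns of D_l are unit-norm atoms and uses the inner
    product D_l^T y as the correlation (an assumption for this sketch).
    Returns (indices of the k atoms, penalties b = 1 / |correlation|).
    """
    beta = np.asarray(D_l, dtype=float).T @ np.asarray(y, dtype=float)
    mag = np.abs(beta)
    idx = np.argsort(mag)[::-1][:k]        # indices of the k largest |beta_j|
    b = 1.0 / np.maximum(mag[idx], eps)    # strong correlation -> small penalty
    return idx, b
```

The returned indices select the same columns in both dictionaries, so the LR atoms feed the weight computation and the paired HR atoms feed the reconstruction.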
Based on this reconstructed vector, we obtain the estimated HR patch and the denoised LR patch by adding back their mean values (Eq (26)), using all-one column vectors whose lengths w1 and w2 are the sizes of the HR patch and the LR patch, respectively; E(⋅) is the mean evaluation operator. The noise here is assumed to be zero-mean, so by Eq (27) the noise has little effect on the image mean. The means of the HR patch and of the denoised LR patch can therefore be estimated by the mean of the input LR patch $y_i$, and Eq (26) can be written as Eq (28). All estimated HR patches are assembled into an HR image, with overlapping regions averaged. In the same way, we obtain a denoised LR image from the denoised patches. In order to strengthen the reconstruction constraint Eq (1), we compute the final estimated HR image X* by Eq (29). The iterative back-projection (IBP) method [32] is used to solve this optimization problem (Eq (30)), where $X_t$ is the estimate of the HR image at the t-th iteration, ↑ denotes up-scaling by factor s, and p is a symmetric Gaussian filter. The entire SR process is summarized as Algorithm 1.

Algorithm 1: The Proposed SR Algorithm
Input: the sparse dictionaries $D_h$ and $D_l$; input LR image Y; number of similar atoms k; a positive constant γ.
Output: HR image X*.
1: for each patch $y_i$ of Y do
2:   Extract the gradient features of $y_i$ by Eq (16).
3:   Find k similar atom pairs and compute b by Eq (25).
4:   Solve Eq (21) for the weight ω.
5:   Generate the estimated HR patch and the denoised patch by Eq (28).
6: end for
7: Assemble the estimated HR patches and the denoised patches into an estimated HR image and a denoised LR image, respectively.
8: Perform IBP (Eq (30)) to obtain the HR image X*.
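The back-projection step can be sketched generically as below. This is not the exact operator set of Eq (30): block averaging stands in for the blur and down-sampling GH, nearest-neighbour replication stands in for the up-scaling ↑, and a 3-tap separable filter stands in for the Gaussian filter p. All function names are hypothetical.

```python
import numpy as np

def downscale(x, s):
    """Stand-in for GH: average each s x s block (box blur + decimation)."""
    h, w = x.shape
    return x.reshape(h // s, s, w // s, s).mean(axis=(1, 3))

def smooth(x, k=(0.25, 0.5, 0.25)):
    """Separable symmetric low-pass filter standing in for p."""
    xp = np.pad(x, 1, mode="edge")
    rows = sum(wt * xp[:, i:i + x.shape[1]] for i, wt in enumerate(k))
    return sum(wt * rows[i:i + x.shape[0], :] for i, wt in enumerate(k))

def back_project(x0, y, s, iters=20):
    """Iterate X <- X + smooth(upscale(Y - downscale(X)))."""
    x = np.asarray(x0, dtype=float).copy()
    y = np.asarray(y, dtype=float)
    for _ in range(iters):
        err = y - downscale(x, s)             # residual in the LR domain
        up = np.kron(err, np.ones((s, s)))    # nearest-neighbour up-scaling
        x += smooth(up)                       # project the residual back
    return x
```

In the setting of this paper, `x0` would be the initial estimated HR image and `y` the LR image used in the reconstruction constraint; each iteration shrinks the mismatch between `downscale(x, s)` and `y`.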

3 Experiments

In this section, we show the robustness of the proposed algorithm to noise and compare it with the state-of-the-art methods [20, 22, 25, 26, 31, 32]. In the training stage, we used 77 standard natural images as the training set. For testing, we used Set5 [20, 31], Set14 [20, 31] and B100 [20, 31] to evaluate performance for upscaling factors ×2, ×3 and ×4. Set5 and Set14 contain 5 and 14 images, respectively, for super-resolution evaluation. B100 contains 100 test images from the Berkeley Segmentation Dataset BSDS300. All LR images (training or test) are generated from the original HR images. First, the original HR images are blurred and down-sampled using the MATLAB function “imresize”, which applies a smoothing filter before down-sampling. Similar to [7], the noise is generated by the MATLAB function “randn”, scaled by σ, and added to the blurred and down-sampled test images. It should be noted that the LR example images used for dictionary training are noise-free. For color images, the SR algorithms are applied only to the luminance channel, because humans are more sensitive to luminance changes. Therefore, we first convert the images to the YCbCr color space and then apply our method to the Y channel. The color layers (Cb, Cr) are interpolated using bicubic interpolation.
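For reference, the luminance extraction can be sketched with the ITU-R BT.601 studio-range conversion, which is the convention commonly used in SR evaluation code. The paper does not state which conversion was used, so the coefficients below are an assumption, and the function name is hypothetical.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert R, G, B in [0, 1] to 8-bit-range Y, Cb, Cr (ITU-R BT.601)."""
    y = 16.0 + 65.481 * r + 128.553 * g + 24.966 * b
    cb = 128.0 - 37.797 * r - 74.203 * g + 112.0 * b
    cr = 128.0 + 112.0 * r - 93.786 * g - 18.214 * b
    return y, cb, cr
```

The function works on scalars and, unchanged, on NumPy arrays of equal shape; only the returned `y` plane would be passed to the SR algorithm.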

3.1 Parameters

In this section, we analyze the main parameters of our algorithm. The standard settings are the Set5 [20, 31] dataset, a dictionary size of 1024, γ = 0.08, and k = 24 for upscaling factor ×2 or k = 8 for upscaling factors ×3 and ×4. Peak signal-to-noise ratio (PSNR) and reconstruction time were used as the objective criteria.
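PSNR follows the standard definition 10·log10(peak² / MSE). The short sketch below assumes 8-bit images (peak value 255) given as nested lists; the function name is hypothetical.

```python
import math

def psnr(reference, estimate, peak=255.0):
    """PSNR in dB between two equally sized images given as nested lists."""
    diffs = [
        (a - b) ** 2
        for row_r, row_e in zip(reference, estimate)
        for a, b in zip(row_r, row_e)
    ]
    mse = sum(diffs) / len(diffs)
    # Identical images have zero error, so PSNR is unbounded
    return math.inf if mse == 0 else 10.0 * math.log10(peak ** 2 / mse)
```

For example, an estimate that is off by one gray level at every pixel has an MSE of 1 and a PSNR of about 48.13 dB.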

3.1.1 Regularization parameter

γ is a key regularization parameter of our method. Here, we evaluate the effect of different values of γ and choose an appropriate one. The results on Set5 are shown in Fig 2. The experimental setting is a dictionary size of 1024 and k = 24 for upscaling factor ×2, k = 8 for upscaling factors ×3 and ×4. We can see that the curves are not monotonic, and PSNR peaks at γ = 0.08. The optimal γ for reconstruction quality differs slightly between datasets (0.06 for Set14 and B100 compared to 0.08 for Set5). The results on Set14 and B100 are shown in S1–S6 Figs. Therefore, we suggest setting γ to around 0.08 in practice. In all of the following experiments, we set γ to 0.08 for convenience.
Fig 2

γ versus average PSNR on Set5.

(A) upscaling factor ×2; (B) upscaling factor ×3; (C) upscaling factor ×4.

3.1.2 Dictionary size

In this experiment, the dictionary size is varied from 32 up to 2048, while the training samples are extracted from the same training images mentioned previously. Fig 3 shows the relation between our method’s performance and the dictionary size when γ = 0.08 and k = 24 for upscaling factor ×2, k = 8 for upscaling factors ×3 and ×4. Noise has little effect on reconstruction time, so we only show the reconstruction time for σ = 10. We can see that the larger the learned dictionary, the better the reconstruction quality. However, this comes with a higher computational cost. This result agrees with those of [25, 47]. Similar results are obtained on Set14 and B100, as shown in S7–S12 Figs. In practice, we suggest choosing the dictionary size as a tradeoff between reconstruction quality and computation. The dictionary size is 1024 in our following experiments.
Fig 3

Dictionary size influence on performance on average on Set5.

(A) upscaling factor ×2; (B) upscaling factor ×3; (C) upscaling factor ×4.

3.1.3 Number of similar atoms

The proposed method finds the similar atom pairs for each input patch, so its performance depends on the number of similar atoms k. The effect of k is shown in Fig 4 for a dictionary size of 1024 and γ = 0.08. Here, we again only show the reconstruction time for σ = 10. We can see that k = 24 gives the best reconstruction quality when the upscaling factor is ×2, while the PSNR peaks at k = 8 when the upscaling factor is ×3 or ×4. Moreover, the average reconstruction time increases distinctly as k increases, because a larger k increases the cost of the matrix inversion in Eq (21). Similar results are obtained on Set14 and B100, as shown in S13–S18 Figs. Therefore, in resource-limited systems, a reasonable choice of k depends on the tradeoff between reconstruction quality and computational time. We use k = 24 when the upscaling factor is ×2 and k = 8 when the upscaling factor is ×3 or ×4 in our further experiments.
Fig 4

Number of similar atoms influence on performance on average on Set5.

(A) upscaling factor ×2; (B) upscaling factor ×3; (C) upscaling factor ×4.

3.1.4 Patch size and overlap

Intuitively, a patch size that is too large or too small tends to produce over-smoothed results or unwanted artifacts, as also noted in [25, 29], and a larger overlap leads to better SR results [25]. Therefore, the patch size is set to 6×6, 6×6 and 8×8 for upscaling factors ×2, ×3 and ×4, respectively, and the overlap is set to 4, 3 and 4, respectively.
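The interaction between patch size and overlap can be illustrated by listing the patch start positions along one image axis. The helper name `patch_positions` and the choice to add a final patch flush with the image border are assumptions of this sketch, not details taken from the paper.

```python
def patch_positions(length, size, overlap):
    """Start indices of patches of `size` with `overlap` along one axis."""
    step = size - overlap
    if step <= 0:
        raise ValueError("overlap must be smaller than the patch size")
    starts = list(range(0, length - size + 1, step))
    # Add a final patch so the last pixels are always covered
    if starts and starts[-1] != length - size:
        starts.append(length - size)
    return starts
```

With a patch size of 6 and an overlap of 4, the step is 2 pixels, so a 10-pixel axis is covered by patches starting at 0, 2 and 4. A larger overlap means more patches per pixel, which improves the averaging in overlapping regions at the price of more computation.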

3.2 Performance evaluation

In this section, we analyze the performance of our algorithm through quantitative and qualitative comparison with state-of-the-art methods including NE [22], SCSR [25], Zeyde [26], A+ [31], SRCNN [20], and CSC [32]. We also report the reconstruction times of the algorithms. The code for the compared methods was downloaded from the authors’ homepages. Peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) were used as the objective criteria. The parameters were analyzed in the previous section. Apart from the patch size and overlap (see Section 3.1.4), the other parameters are fixed (γ = 0.08, dictionary size = 1024, k = 24 for upscaling factor ×2, k = 8 for upscaling factors ×3 and ×4).

3.2.1 Quality

Tables 1–3 list the PSNR and SSIM comparisons. When σ = 0, CSC [32] achieves the best performance in most cases, but the noise-free case does not reflect real applications. When σ ≠ 0, the results consistently demonstrate the superiority of our proposed algorithm over the other approaches on Set5, Set14 and B100. The average PSNR of the recent method CSC [32] is between 0.24 dB (Set14, upscaling factor ×4, σ = 5) and 7.4 dB (Set5, upscaling factor ×2, σ = 20) behind our method. Compared with CSC on B100, the average PSNR improvement ranges from a minimum of 0.52 dB (upscaling factor ×4, σ = 5) to a maximum of 6.18 dB (upscaling factor ×2, σ = 20). In addition, our method improves by 3.62 dB on average (Set5, upscaling factor ×2, σ = 20) over the next most robust method, SCSR [25]. Figs 5–8 provide a visual assessment. We can see that our method achieves quality similar to that of the top methods when σ = 0, and it is the most robust to noise.
Table 1

Comparisons of average PSNR (dB) and SSIM (σ = 0).

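Each cell gives PSNR / SSIM.

| Dataset | Scale | NE [22] | SCSR [25] | Zeyde [26] | A+ [31] | SRCNN [20] | CSC [32] | Ours |
|---|---|---|---|---|---|---|---|---|
| Set5 | ×2 | 35.77 / 0.949 | 36.04 / 0.951 | 35.78 / 0.949 | 36.55 / 0.954 | 36.34 / 0.952 | 36.62 / 0.955 | 35.65 / 0.948 |
| Set5 | ×3 | 31.84 / 0.896 | 31.40 / 0.887 | 31.90 / 0.897 | 32.59 / 0.909 | 32.39 / 0.887 | 32.66 / 0.909 | 31.57 / 0.895 |
| Set5 | ×4 | 29.61 / 0.840 | - / - | 29.69 / 0.843 | 30.28 / 0.860 | 30.09 / 0.853 | 30.36 / 0.859 | 29.49 / 0.841 |
| Set14 | ×2 | 31.76 / 0.899 | 31.71 / 0.903 | 31.81 / 0.899 | 32.28 / 0.906 | 32.18 / 0.904 | 32.31 / 0.907 | 31.71 / 0.901 |
| Set14 | ×3 | 28.60 / 0.808 | 28.07 / 0.803 | 28.67 / 0.808 | 29.13 / 0.819 | 29.00 / 0.815 | 29.15 / 0.821 | 28.26 / 0.811 |
| Set14 | ×4 | 26.81 / 0.733 | - / - | 26.88 / 0.734 | 27.32 / 0.749 | 26.61 / 0.725 | 27.30 / 0.750 | 26.55 / 0.738 |
| B100 | ×2 | 30.41 / 0.871 | 31.04 / 0.884 | 30.40 / 0.868 | 30.77 / 0.877 | 31.14 / 0.885 | 31.27 / 0.888 | 30.76 / 0.881 |
| B100 | ×3 | 27.85 / 0.771 | 27.81 / 0.772 | 27.87 / 0.770 | 28.18 / 0.780 | 28.21 / 0.780 | 28.31 / 0.786 | 27.85 / 0.778 |
| B100 | ×4 | 26.47 / 0.697 | - / - | 26.55 / 0.697 | 26.77 / 0.709 | 26.71 / 0.702 | 26.83 / 0.711 | 26.51 / 0.703 |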
Table 3

The results of average PSNR (dB) and SSIM on the Set14 and B100 dataset.

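Each cell gives PSNR / SSIM.

| Dataset | Scale | σ | NE [22] | SCSR [25] | Zeyde [26] | A+ [31] | SRCNN [20] | CSC [32] | Ours |
|---|---|---|---|---|---|---|---|---|---|
| Set14 | ×2 | 5 | 28.74 / 0.7514 | 29.31 / 0.7981 | 29.01 / 0.7647 | 28.71 / 0.7737 | 28.61 / 0.7435 | 28.36 / 0.7275 | 29.69 / 0.8205 |
| Set14 | ×2 | 10 | 25.08 / 0.5551 | 26.59 / 0.6478 | 25.46 / 0.5750 | 24.78 / 0.5400 | 24.45 / 0.5264 | 24.31 / 0.5180 | 27.80 / 0.7381 |
| Set14 | ×2 | 15 | 22.29 / 0.4204 | 24.31 / 0.5213 | 22.71 / 0.4394 | 21.89 / 0.4309 | 21.42 / 0.3848 | 21.35 / 0.3821 | 26.38 / 0.6732 |
| Set14 | ×2 | 20 | 20.15 / 0.3302 | 22.47 / 0.4259 | 20.57 / 0.3466 | 19.70 / 0.3140 | 19.14 / 0.2945 | 19.10 / 0.2933 | 25.35 / 0.6220 |
| Set14 | ×3 | 5 | 26.86 / 0.6903 | 26.55 / 0.6891 | 27.08 / 0.7035 | 26.96 / 0.6859 | 26.99 / 0.6942 | 26.70 / 0.6703 | 27.16 / 0.7220 |
| Set14 | ×3 | 10 | 24.19 / 0.5240 | 23.95 / 0.5215 | 24.52 / 0.5441 | 23.92 / 0.5079 | 23.90 / 0.5014 | 23.53 / 0.4856 | 25.78 / 0.6663 |
| Set14 | ×3 | 15 | 21.89 / 0.4032 | 21.64 / 0.3990 | 22.24 / 0.4221 | 21.43 / 0.3827 | 21.29 / 0.3781 | 20.96 / 0.3606 | 24.67 / 0.6075 |
| Set14 | ×3 | 20 | 19.99 / 0.3196 | 19.72 / 0.3142 | 20.35 / 0.3355 | 19.43 / 0.2981 | 19.20 / 0.2900 | 18.88 / 0.2767 | 23.77 / 0.5579 |
| Set14 | ×4 | 5 | 25.57 / 0.6398 | - / - | 25.76 / 0.6526 | 25.76 / 0.6416 | 25.89 / 0.6575 | 25.49 / 0.6241 | 25.73 / 0.6788 |
| Set14 | ×4 | 10 | 23.42 / 0.4985 | - / - | 23.42 / 0.5174 | 23.24 / 0.4865 | 23.45 / 0.5078 | 22.84 / 0.4607 | 24.64 / 0.6171 |
| Set14 | ×4 | 15 | 21.39 / 0.3896 | - / - | 21.69 / 0.4076 | 21.01 / 0.3713 | 21.08 / 0.3836 | 20.51 / 0.3448 | 23.70 / 0.5686 |
| Set14 | ×4 | 20 | 19.66 / 0.3115 | - / - | 19.96 / 0.3265 | 19.15 / 0.2910 | 19.08 / 0.2958 | 18.57 / 0.2655 | 22.91 / 0.5283 |
| B100 | ×2 | 5 | 28.00 / 0.7264 | 28.81 / 0.7719 | 28.19 / 0.7380 | 27.96 / 0.7196 | 28.02 / 0.721 | 27.83 / 0.7076 | 28.95 / 0.7917 |
| B100 | ×2 | 10 | 24.66 / 0.5279 | 26.28 / 0.6200 | 25.02 / 0.5480 | 24.36 / 0.5123 | 24.17 / 0.5059 | 24.07 / 0.4997 | 27.29 / 0.7037 |
| B100 | ×2 | 15 | 22.01 / 0.3951 | 24.10 / 0.4941 | 22.42 / 0.4136 | 21.63 / 0.3792 | 21.25 / 0.3661 | 21.23 / 0.3654 | 26.08 / 0.6378 |
| B100 | ×2 | 20 | 19.95 / 0.3077 | 22.32 / 0.4000 | 20.37 / 0.3233 | 19.52 / 0.2925 | 19.03 / 0.277 | 19.02 / 0.2779 | 25.20 / 0.5873 |
| B100 | ×3 | 5 | 26.32 / 0.6518 | 26.79 / 0.6728 | 26.49 / 0.6638 | 26.34 / 0.6463 | 26.46 / 0.6586 | 26.20 / 0.6351 | 26.85 / 0.7010 |
| B100 | ×3 | 10 | 23.86 / 0.4878 | 23.74 / 0.4874 | 24.15 / 0.5067 | 23.58 / 0.4716 | 23.60 / 0.4767 | 23.27 / 0.4538 | 25.64 / 0.6273 |
| B100 | ×3 | 15 | 21.66 / 0.3703 | 21.47 / 0.3674 | 21.99 / 0.3882 | 21.22 / 0.3510 | 21.10 / 0.3481 | 20.81 / 0.3326 | 24.66 / 0.5702 |
| B100 | ×3 | 20 | 19.82 / 0.2902 | 19.59 / 0.2855 | 20.17 / 0.3051 | 19.29 / 0.2706 | 19.07 / 0.2635 | 18.78 / 0.2526 | 23.84 / 0.5223 |
| B100 | ×4 | 5 | 25.3 / 0.6015 | - / - | 25.46 / 0.6133 | 25.36 / 0.5991 | 25.53 / 0.6171 | 25.2 / 0.5857 | 25.72 / 0.6414 |
| B100 | ×4 | 10 | 23.23 / 0.4615 | - / - | 23.49 / 0.4800 | 23.01 / 0.4478 | 23.23 / 0.4690 | 22.69 / 0.4269 | 24.74 / 0.5815 |
| B100 | ×4 | 15 | 21.26 / 0.3562 | - / - | 21.56 / 0.3736 | 20.86 / 0.3378 | 20.95 / 0.3500 | 20.43 / 0.3161 | 23.89 / 0.5358 |
| B100 | ×4 | 20 | 19.56 / 0.2820 | - / - | 19.86 / 0.2966 | 19.06 / 0.2622 | 18.99 / 0.2669 | 18.52 / 0.2412 | 23.15 / 0.4980 |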
Fig 5

Comparisons with various image super-resolution methods on “coastguard” from Set14 with upscaling factor ×2 (σ = 0, PSNR in dB).

(A) Ground truth HR; (B) NE [22]; (C) SCSR [25]; (D) Zedye [26]; (E) A+ [31]; (F) SRCNN [20]; (G) CSC [32]; (H) ours.

Fig 6

Comparisons with various image super-resolution methods on “16077” from B100 with upscaling factor ×2 (σ = 10, PSNR in dB).

(A) Ground truth HR; (B) NE [22]; (C) SCSR [25]; (D) Zedye [26]; (E) A+ [31]; (F) SRCNN [20]; (G) CSC [32]; (H) ours.

Fig 7

Comparisons with various image super-resolution methods on “241004” from B100 with upscaling factor ×3 (σ = 10, PSNR in dB).

(A) Ground truth HR; (B) NE [22]; (C) SCSR [25]; (D) Zedye [26]; (E) A+ [31]; (F) SRCNN [20]; (G) CSC [32]; (H) ours.

Fig 8

Comparisons with various image super-resolution methods on “208001” from B100 with upscaling factor ×4 (σ = 10, PSNR in dB).

(A) Ground truth HR; (B) NE [22]; (C) Zedye [26]; (D) A+ [31]; (E) SRCNN [20]; (F) CSC [32]; (G) ours.
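
As a reading aid for the PSNR figures above, the following minimal sketch shows how a noise level σ and a PSNR value are commonly computed. It assumes 8-bit intensities (peak value 255), zero-mean Gaussian noise without clipping, and evaluation on a single channel; the function names `add_gaussian_noise` and `psnr` are illustrative and are not taken from the paper, whose exact evaluation protocol may differ.

```python
import numpy as np


def add_gaussian_noise(image, sigma, seed=0):
    """Return the image plus zero-mean Gaussian noise of standard deviation sigma.

    Assumption: intensities are on a 0-255 scale and no clipping is applied.
    """
    rng = np.random.default_rng(seed)
    return image.astype(np.float64) + rng.normal(0.0, sigma, image.shape)


def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB between two images of equal shape."""
    diff = reference.astype(np.float64) - estimate.astype(np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(peak ** 2 / mse)


# Usage: a stronger noise level lowers the PSNR of the degraded image.
reference = np.full((64, 64), 128.0)
for sigma in (5, 10, 15, 20):
    noisy = add_gaussian_noise(reference, sigma)
    print(f"sigma = {sigma:2d}: PSNR = {psnr(reference, noisy):.2f} dB")
```

Because PSNR is logarithmic in the mean squared error, a gap of several dB, as reported at σ = 20, corresponds to a large reduction in reconstruction error.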

3.2.2 Reconstruction time

The average reconstruction time on the test images in Set5 was compared at σ = 10; in practice, the noise level has little effect on the timing results. All experiments were conducted on the same computer, and the results are summarized in Table 4. The reconstruction time varies considerably with the upscaling factor. Our algorithm takes less than 10 s, which compares favorably with SCSR, CSC, and SRCNN. SCSR is the slowest method.
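Table 4

Comparisons of average reconstruction time (s) on Set5.

| Scale | NE [22] | SCSR [25] | Zedye [26] | A+ [31] | SRCNN [20] | CSC [32] | Ours |
|---|---|---|---|---|---|---|---|
| ×2 | 4.78 | 193.26 | 6.82 | 0.88 | 7.54 | 139.03 | 3.21 |
| ×3 | 2.78 | 44.31 | 3.01 | 0.57 | 7.47 | 78.46 | 1.24 |
| ×4 | 1.63 | - | 1.96 | 0.42 | 6.39 | 48.24 | 0.75 |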
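
An average reconstruction time of this kind can be measured with a small harness such as the one below. This is only a sketch: `average_runtime` and the `reconstruct` callable are hypothetical names, and the paper does not state how its timings were collected beyond using the same computer for all methods.

```python
import time


def average_runtime(reconstruct, images, repeats=1):
    """Mean wall-clock time in seconds per image for a reconstruction callable."""
    if not images:
        raise ValueError("images must not be empty")
    total = 0.0
    for image in images:
        for _ in range(repeats):
            start = time.perf_counter()
            reconstruct(image)  # result discarded; only the elapsed time matters
            total += time.perf_counter() - start
    return total / (len(images) * repeats)


# Usage with a placeholder workload standing in for an SR method.
elapsed = average_runtime(lambda image: sum(image), [list(range(10000))] * 5)
print(f"average time per image: {elapsed:.6f} s")
```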

3.3 Effect of IBP

In our algorithm, the denoised LR patches are combined with iterative back projection (IBP) to improve SR performance. According to [47], IBP plays an important role in improving SR performance. However, if the input is a noisy image, the IBP model propagates the noise to the HR image. Experimental results show that applying IBP directly to the noisy input LR image degrades performance. The results are listed in Table 5, where IBP is run for 20 iterations. The comparison shows a clear advantage for our method. Similar results are obtained on Set14 and B100; they are given in S1 Table.
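Table 5

Effect of IBP on average PSNR (dB) and SSIM (Set5). Each cell gives PSNR / SSIM; × denotes no IBP and √ denotes IBP applied directly to the noisy input LR image.

| Scale | IBP | σ = 5 | σ = 10 | σ = 15 | σ = 20 |
|---|---|---|---|---|---|
| ×2 | × | 31.48 / 0.831 | 27.76 / 0.665 | 25.03 / 0.531 | 22.93 / 0.432 |
| ×2 | √ | 29.93 / 0.753 | 25.16 / 0.526 | 21.95 / 0.383 | 19.58 / 0.293 |
| ×2 | ours | 32.49 / 0.873 | 29.69 / 0.800 | 27.92 / 0.741 | 26.67 / 0.691 |
| ×3 | × | 29.19 / 0.801 | 26.59 / 0.660 | 24.33 / 0.537 | 22.47 / 0.442 |
| ×3 | √ | 28.39 / 0.730 | 24.52 / 0.523 | 21.62 / 0.385 | 19.39 / 0.296 |
| ×3 | ours | 29.72 / 0.828 | 27.74 / 0.756 | 26.24 / 0.693 | 25.07 / 0.638 |
| ×4 | × | 27.65 / 0.765 | 25.58 / 0.646 | 23.64 / 0.537 | 21.97 / 0.449 |
| ×4 | √ | 27.19 / 0.706 | 23.93 / 0.524 | 21.28 / 0.392 | 19.17 / 0.303 |
| ×4 | ours | 28.10 / 0.783 | 26.48 / 0.718 | 25.16 / 0.663 | 24.11 / 0.615 |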
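
The noise propagation described above follows from the IBP update itself, which repeatedly pushes the HR estimate toward consistency with whatever LR image it is given. The sketch below illustrates a generic IBP loop under simplifying assumptions: block averaging stands in for the blur-and-decimate operator, nearest-neighbour replication back-projects the residual, and the step size of 0.5 is arbitrary. None of these choices, nor the function names, are taken from the paper.

```python
import numpy as np


def downsample(hr, scale):
    """Block-average downsampling (assumed stand-in for blur + decimation)."""
    h, w = hr.shape
    return hr.reshape(h // scale, scale, w // scale, scale).mean(axis=(1, 3))


def upsample(lr, scale):
    """Nearest-neighbour upsampling used to back-project the LR residual."""
    return np.repeat(np.repeat(lr, scale, axis=0), scale, axis=1)


def iterative_back_projection(hr_init, lr_target, scale, n_iter=20, step=0.5):
    """Refine hr_init so that its downsampled version approaches lr_target."""
    hr = hr_init.astype(np.float64).copy()
    for _ in range(n_iter):
        residual = lr_target - downsample(hr, scale)  # LR-domain inconsistency
        hr += step * upsample(residual, scale)        # back-project into HR
    return hr


# Usage: the LR residual shrinks after 20 iterations.
rng = np.random.default_rng(0)
lr_target = downsample(rng.uniform(0, 255, (12, 12)), 3)
hr_init = upsample(lr_target, 3) + 10.0
hr_final = iterative_back_projection(hr_init, lr_target, scale=3)
print(np.abs(lr_target - downsample(hr_final, 3)).max())
```

If `lr_target` is the noisy input, the refined HR estimate is driven to reproduce that noise after downsampling, which is consistent with the drop seen in the √ rows of Table 5. Passing a denoised LR image as `lr_target` instead corresponds to the strategy described in this section.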

3.4 Effect of distance penalty

A distance penalty is applied in modelling the weight. To check how much the penalty improves SR performance, we ran our method with and without the penalty on the Set5 dataset for different values of γ. The results are shown in Fig 9. Our method with the distance penalty clearly performs better. Similar results are obtained on Set14 and B100; they are shown in S19–S24 Figs.
Fig 9

Effect of distance penalty on average PSNR (dB) (Set5).

(A) upscaling factor ×2; (B) upscaling factor ×3; (C) upscaling factor ×4.
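
To make the role of γ concrete, the sketch below computes penalized weights for a weighted average of HR atoms. It is an illustration only: it assumes a locality-penalized least-squares weight (in the style of locality-constrained linear coding) with weights that sum to one, which is not necessarily the exact weight model defined earlier in the paper. The names `penalized_weights`, `reconstruct_hr_patch` and the ridge term `eps` are hypothetical.

```python
import numpy as np


def penalized_weights(lr_patch, lr_atoms, gamma, eps=1e-8):
    """Weights minimizing ||y - sum_k w_k d_k||^2 + gamma * sum_k (dist_k * w_k)^2
    subject to sum_k w_k = 1 (assumed model, not the paper's exact formulation)."""
    diff = lr_atoms - lr_patch              # (K, n) offsets from patch to atoms
    dist_sq = np.sum(diff ** 2, axis=1)     # squared distance to each atom
    k = len(lr_atoms)
    # eps * I is a small ridge that keeps the linear system well posed.
    system = diff @ diff.T + gamma * np.diag(dist_sq) + eps * np.eye(k)
    w = np.linalg.solve(system, np.ones(k))
    return w / w.sum()


def reconstruct_hr_patch(lr_patch, lr_atoms, hr_atoms, gamma):
    """Weighted average of the HR atoms paired with the selected LR atoms."""
    return penalized_weights(lr_patch, lr_atoms, gamma) @ hr_atoms


# Usage: the atom closest to the input patch dominates the weights.
lr_patch = np.array([1.0, 0.0])
lr_atoms = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]])
hr_atoms = np.array([[10.0], [20.0], [30.0]])
print(penalized_weights(lr_patch, lr_atoms, gamma=1.0))
print(reconstruct_hr_patch(lr_patch, lr_atoms, hr_atoms, gamma=1.0))
```

In this assumed form, a larger `gamma` suppresses the weights of distant atoms, which acts like the second selection among similar atoms mentioned in the abstract; `gamma = 0` removes the penalty.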

4 Conclusion

In this research, we proposed an algorithm for noisy image super-resolution based on sparse representation. Most existing methods become less effective on noisy inputs because they assume that the input LR image is noise-free. The proposed algorithm achieves image super-resolution and denoising simultaneously. For different input LR images, even if the noise variance varies, the dictionary pair does not need to be retrained. The core idea of the proposed algorithm is that each HR image patch is reconstructed as a weighted average of similar HR example patches. In particular, the atoms of a learned sparse dictionary, rather than the example patches themselves, are used to compute the weights and reconstruct the HR patch. This strategy reduces computation time and suppresses noise. In addition, LR example patches with the mean pixel value subtracted are used to learn the dictionary, rather than just their gradient features, which helps IBP to further improve the SR performance. The experimental results show that our method is more robust to noise.

Supporting information

S1 Fig. γ versus average PSNR on Set14 (upscaling factor ×2). (TIF)
S2 Fig. γ versus average PSNR on Set14 (upscaling factor ×3). (TIF)
S3 Fig. γ versus average PSNR on Set14 (upscaling factor ×4). (TIF)
S4 Fig. γ versus average PSNR on B100 (upscaling factor ×2). (TIF)
S5 Fig. γ versus average PSNR on B100 (upscaling factor ×3). (TIF)
S6 Fig. γ versus average PSNR on B100 (upscaling factor ×4). (TIF)
S7 Fig. Influence of dictionary size on average performance on Set14 (upscaling factor ×2). (TIF)
S8 Fig. Influence of dictionary size on average performance on Set14 (upscaling factor ×3). (TIF)
S9 Fig. Influence of dictionary size on average performance on Set14 (upscaling factor ×4). (TIF)
S10 Fig. Influence of dictionary size on average performance on B100 (upscaling factor ×2). (TIF)
S11 Fig. Influence of dictionary size on average performance on B100 (upscaling factor ×3). (TIF)
S12 Fig. Influence of dictionary size on average performance on B100 (upscaling factor ×4). (TIF)
S13 Fig. Influence of the number of similar atoms on average performance on Set14 (upscaling factor ×2). (TIF)
S14 Fig. Influence of the number of similar atoms on average performance on Set14 (upscaling factor ×3). (TIF)
S15 Fig. Influence of the number of similar atoms on average performance on Set14 (upscaling factor ×4). (TIF)
S16 Fig. Influence of the number of similar atoms on average performance on B100 (upscaling factor ×2). (TIF)
S17 Fig. Influence of the number of similar atoms on average performance on B100 (upscaling factor ×3). (TIF)
S18 Fig. Influence of the number of similar atoms on average performance on B100 (upscaling factor ×4). (TIF)
S19 Fig. Effect of distance penalty on average PSNR (dB) on Set14 (upscaling factor ×2). (TIF)
S20 Fig. Effect of distance penalty on average PSNR (dB) on Set14 (upscaling factor ×3). (TIF)
S21 Fig. Effect of distance penalty on average PSNR (dB) on Set14 (upscaling factor ×4). (TIF)
S22 Fig. Effect of distance penalty on average PSNR (dB) on B100 (upscaling factor ×2). (TIF)
S23 Fig. Effect of distance penalty on average PSNR (dB) on B100 (upscaling factor ×3). (TIF)
S24 Fig. Effect of distance penalty on average PSNR (dB) on B100 (upscaling factor ×4). (TIF)
S1 Table. Effect of IBP on average PSNR (dB) and SSIM (Set14 and B100). (PDF)

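Table 2

The results of PSNR (dB) and SSIM on the Set5 dataset. Each cell gives PSNR / SSIM.

| Scale | σ | Set5 image | NE [22] | SCSR [25] | Zedye [26] | A+ [31] | SRCNN [20] | CSC [32] | Ours |
|---|---|---|---|---|---|---|---|---|---|
| ×2 | 5 | baby | 31.5 / 0.7556 | 32.96 / 0.8287 | 31.9 / 0.7753 | 31.11 / 0.7399 | 33.47 / 0.8482 | 30.63 / 0.7190 | 34.19 / 0.8760 |
| ×2 | 5 | bird | 31.84 / 0.8063 | 32.21 / 0.8683 | 32.25 / 0.8226 | 31.48 / 0.7918 | 33.34 / 0.8836 | 30.89 / 0.7704 | 34.46 / 0.9134 |
| ×2 | 5 | butterfly | 28.35 / 0.8372 | 29.05 / 0.8819 | 28.70 / 0.8492 | 28.97 / 0.8385 | 26.87 / 0.8588 | 28.43 / 0.8222 | 28.08 / 0.8936 |
| ×2 | 5 | head | 30.71 / 0.7022 | 32.04 / 0.763 | 31.08 / 0.7193 | 30.42 / 0.6898 | 32.34 / 0.774 | 29.99 / 0.6693 | 32.87 / 0.7915 |
| ×2 | 5 | woman | 30.39 / 0.7867 | 31.33 / 0.8465 | 30.68 / 0.803 | 30.31 / 0.7762 | 30.6 / 0.8575 | 29.82 / 0.7575 | 31.65 / 0.8913 |
| ×2 | 5 | average | 30.56 / 0.7776 | 31.52 / 0.8377 | 30.92 / 0.7939 | 30.46 / 0.7672 | 31.32 / 0.8444 | 29.95 / 0.7477 | 32.25 / 0.8732 |
| ×2 | 10 | baby | 26.17 / 0.5044 | 28.54 / 0.6364 | 26.66 / 0.5307 | 25.73 / 0.4834 | 25.43 / 0.4713 | 25.21 / 0.4587 | 31.41 / 0.7946 |
| ×2 | 10 | bird | 26.30 / 0.5702 | 28.51 / 0.6943 | 26.77 / 0.5946 | 25.84 / 0.5487 | 25.52 / 0.5373 | 25.24 / 0.5208 | 30.99 / 0.8345 |
| ×2 | 10 | butterfly | 25.04 / 0.6845 | 26.33 / 0.7572 | 25.41 / 0.7002 | 24.95 / 0.6777 | 24.52 / 0.6597 | 24.3 / 0.6547 | 26.08 / 0.8381 |
| ×2 | 10 | head | 25.92 / 0.4649 | 28.34 / 0.5936 | 26.4 / 0.4899 | 25.53 / 0.4463 | 25.44 / 0.4454 | 24.99 / 0.4192 | 30.84 / 0.7124 |
| ×2 | 10 | woman | 25.83 / 0.5698 | 27.68 / 0.6765 | 26.23 / 0.5912 | 25.44 / 0.5528 | 25.14 / 0.5433 | 24.92 / 0.5311 | 29.11 / 0.8199 |
| ×2 | 10 | average | 25.85 / 0.5588 | 27.88 / 0.6716 | 26.29 / 0.5813 | 25.50 / 0.5418 | 25.21 / 0.5314 | 24.93 / 0.517 | 29.69 / 0.7999 |
| ×2 | 15 | baby | 22.85 / 0.3496 | 25.62 / 0.4863 | 23.35 / 0.3726 | 22.39 / 0.3303 | 21.91 / 0.3108 | 21.83 / 0.3081 | 29.71 / 0.7343 |
| ×2 | 15 | bird | 22.93 / 0.4126 | 25.52 / 0.5495 | 23.41 / 0.4358 | 22.45 / 0.3915 | 22.03 / 0.3753 | 21.81 / 0.3643 | 28.98 / 0.7723 |
| ×2 | 15 | butterfly | 22.29 / 0.5741 | 23.88 / 0.6497 | 22.64 / 0.5892 | 21.92 / 0.5626 | 21.37 / 0.5399 | 21.3 / 0.5392 | 24.23 / 0.7802 |
| ×2 | 15 | head | 22.81 / 0.3150 | 25.64 / 0.4546 | 23.3 / 0.3372 | 22.4 / 0.2983 | 22.18 / 0.2933 | 21.7 / 0.2709 | 29.50 / 0.6567 |
| ×2 | 15 | woman | 22.69 / 0.4311 | 24.98 / 0.5442 | 23.12 / 0.4505 | 22.24 / 0.4137 | 21.81 / 0.3971 | 21.67 / 0.3933 | 27.18 / 0.7592 |
| ×2 | 15 | average | 22.71 / 0.4165 | 25.13 / 0.5369 | 23.16 / 0.4371 | 22.28 / 0.3993 | 21.86 / 0.3833 | 21.66 / 0.3752 | 27.92 / 0.7406 |
| ×2 | 20 | baby | 20.5 / 0.2556 | 23.46 / 0.3792 | 20.98 / 0.2741 | 20.03 / 0.2389 | 19.45 / 0.2196 | 19.39 / 0.2191 | 28.51 / 0.6856 |
| ×2 | 20 | bird | 20.55 / 0.3105 | 23.34 / 0.4398 | 21.02 / 0.3304 | 20.05 / 0.2914 | 19.57 / 0.2754 | 19.35 / 0.2665 | 27.67 / 0.7219 |
| ×2 | 20 | butterfly | 20.12 / 0.4935 | 21.95 / 0.5666 | 20.46 / 0.5071 | 19.64 / 0.4785 | 19.04 / 0.4555 | 19.03 / 0.4557 | 22.87 / 0.7262 |
| ×2 | 20 | head | 20.58 / 0.2253 | 23.58 / 0.3529 | 21.05 / 0.2429 | 20.15 / 0.211 | 19.82 / 0.2049 | 19.29 / 0.185 | 28.47 / 0.6121 |
| ×2 | 20 | woman | 20.41 / 0.3415 | 22.93 / 0.4475 | 20.83 / 0.3576 | 19.94 / 0.3247 | 19.42 / 0.3062 | 19.28 / 0.3051 | 25.86 / 0.7077 |
| ×2 | 20 | average | 20.43 / 0.3253 | 23.05 / 0.437 | 20.87 / 0.3424 | 19.96 / 0.3089 | 19.46 / 0.2923 | 19.27 / 0.2863 | 26.67 / 0.6907 |
| ×3 | 5 | baby | 30.70 / 0.7523 | 30.42 / 0.7483 | 31.06 / 0.7713 | 30.33 / 0.7349 | 30.5 / 0.7495 | 29.95 / 0.7142 | 32.05 / 0.8407 |
| ×3 | 5 | bird | 30.53 / 0.8043 | 30.22 / 0.8024 | 30.86 / 0.8214 | 30.39 / 0.793 | 30.45 / 0.8084 | 29.96 / 0.7724 | 31.43 / 0.8756 |
| ×3 | 5 | butterfly | 24.97 / 0.7859 | 24.8 / 0.7846 | 25.20 / 0.8006 | 25.94 / 0.8088 | 26.15 / 0.8044 | 25.53 / 0.7841 | 24.97 / 0.8280 |
| ×3 | 5 | head | 30.08 / 0.6777 | 30.01 / 0.6748 | 30.4 / 0.6938 | 29.78 / 0.6638 | 30.09 / 0.6871 | 29.42 / 0.6436 | 31.40 / 0.7442 |
| ×3 | 5 | woman | 28.33 / 0.7781 | 28.08 / 0.776 | 28.61 / 0.7943 | 28.6 / 0.7719 | 28.51 / 0.7823 | 28.2 / 0.7483 | 28.80 / 0.8534 |
| ×3 | 5 | average | 28.92 / 0.7597 | 28.71 / 0.7572 | 29.23 / 0.7763 | 29.01 / 0.7545 | 29.14 / 0.7663 | 28.61 / 0.7325 | 29.73 / 0.8284 |
| ×3 | 10 | baby | 26.08 / 0.5299 | 25.85 / 0.5257 | 26.54 / 0.5576 | 25.53 / 0.5034 | 25.59 / 0.5091 | 25.05 / 0.4762 | 29.78 / 0.7625 |
| ×3 | 10 | bird | 26.08 / 0.5952 | 25.83 / 0.5913 | 26.51 / 0.6215 | 25.55 / 0.5685 | 25.58 / 0.5798 | 25.03 / 0.5394 | 28.99 / 0.7993 |
| ×3 | 10 | butterfly | 23.22 / 0.6601 | 23.06 / 0.6602 | 23.5 / 0.6784 | 23.46 / 0.6665 | 23.45 / 0.6573 | 22.92 / 0.6341 | 23.42 / 0.7635 |
| ×3 | 10 | head | 25.89 / 0.4726 | 25.81 / 0.468 | 26.36 / 0.498 | 25.39 / 0.4488 | 25.61 / 0.4677 | 24.87 / 0.4199 | 29.64 / 0.6749 |
| ×3 | 10 | woman | 25.17 / 0.5863 | 24.94 / 0.5839 | 25.54 / 0.6100 | 24.86 / 0.5673 | 24.80 / 0.5734 | 24.34 / 0.5371 | 26.89 / 0.7812 |
| ×3 | 10 | average | 25.29 / 0.5688 | 25.10 / 0.5658 | 25.69 / 0.5931 | 24.96 / 0.5509 | 25.01 / 0.5575 | 24.44 / 0.5213 | 27.74 / 0.7563 |
| ×3 | 15 | baby | 22.95 / 0.3812 | 22.72 / 0.3763 | 23.41 / 0.406 | 22.34 / 0.3541 | 22.25 / 0.3506 | 21.79 / 0.3285 | 28.13 / 0.6977 |
| ×3 | 15 | bird | 23.00 / 0.4435 | 22.75 / 0.4375 | 23.44 / 0.4683 | 22.35 / 0.4125 | 22.23 / 0.4114 | 21.78 / 0.3835 | 27.33 / 0.7353 |
| ×3 | 15 | butterfly | 21.34 / 0.5608 | 21.11 / 0.5585 | 21.62 / 0.5781 | 21.12 / 0.556 | 20.99 / 0.5446 | 20.57 / 0.5246 | 22.10 / 0.6994 |
| ×3 | 15 | head | 22.98 / 0.3324 | 22.86 / 0.3273 | 23.44 / 0.3556 | 22.39 / 0.3083 | 22.34 / 0.3114 | 21.71 / 0.2778 | 28.22 / 0.6178 |
| ×3 | 15 | woman | 22.53 / 0.4519 | 22.28 / 0.4478 | 22.91 / 0.4733 | 21.99 / 0.4278 | 21.84 / 0.4257 | 21.43 / 0.4002 | 25.43 / 0.7168 |
| ×3 | 15 | average | 22.56 / 0.434 | 22.34 / 0.4295 | 22.96 / 0.4563 | 22.04 / 0.4117 | 21.93 / 0.4087 | 21.46 / 0.3829 | 26.24 / 0.6934 |
| ×3 | 20 | baby | 20.68 / 0.2856 | 20.43 / 0.2805 | 21.11 / 0.3056 | 20.04 / 0.2606 | 19.83 / 0.2528 | 19.4 / 0.2375 | 26.78 / 0.6397 |
| ×3 | 20 | bird | 20.73 / 0.3403 | 20.47 / 0.3337 | 21.16 / 0.4998 | 20.04 / 0.3109 | 19.8 / 0.3023 | 19.39 / 0.2825 | 26.07 / 0.6782 |
| ×3 | 20 | butterfly | 19.63 / 0.4847 | 19.36 / 0.4802 | 19.9 / 0.262 | 19.15 / 0.4708 | 18.98 / 0.4612 | 18.62 / 0.4435 | 21.18 / 0.6453 |
| ×3 | 20 | head | 20.83 / 0.243 | 20.69 / 0.238 | 21.27 / 0.3791 | 20.21 / 0.2215 | 19.96 / 0.2154 | 19.35 / 0.1913 | 26.97 / 0.5650 |
| ×3 | 20 | woman | 20.47 / 0.3614 | 20.19 / 0.3558 | 20.82 / 0.3615 | 19.82 / 0.3361 | 19.58 / 0.3286 | 19.19 / 0.3108 | 24.35 / 0.6598 |
| ×3 | 20 | average | 20.47 / 0.3430 | 20.23 / 0.3376 | 20.85 / 0.3616 | 19.85 / 0.3200 | 19.63 / 0.3121 | 19.19 / 0.2931 | 25.07 / 0.6376 |
| ×4 | 5 | baby | 29.79 / 0.7405 | - | 30.1 / 0.7579 | 29.57 / 0.7276 | 29.95 / 0.7566 | 29.18 / 0.7045 | 30.65 / 0.8066 |
| ×4 | 5 | bird | 29.19 / 0.7878 | - | 29.43 / 0.8029 | 29.22 / 0.7823 | 29.38 / 0.8053 | 28.86 / 0.7608 | 29.67 / 0.8355 |
| ×4 | 5 | butterfly | 22.92 / 0.7246 | - | 23.13 / 0.7414 | 23.70 / 0.7624 | 24.22 / 0.7734 | 23.46 / 0.7344 | 23.05 / 0.7582 |
| ×4 | 5 | head | 29.4 / 0.6561 | - | 29.69 / 0.6717 | 29.27 / 0.6494 | 29.67 / 0.6788 | 28.93 / 0.6309 | 30.39 / 0.7093 |
| ×4 | 5 | woman | 26.52 / 0.7496 | - | 26.76 / 0.7654 | 26.96 / 0.7522 | 26.83 / 0.7688 | 26.71 / 0.7282 | 26.82 / 0.8062 |
| ×4 | 5 | average | 27.56 / 0.7317 | - | 27.82 / 0.7479 | 27.74 / 0.7348 | 28.01 / 0.7566 | 27.43 / 0.7118 | 28.10 / 0.7832 |
| ×4 | 10 | baby | 25.73 / 0.5442 | - | 26.14 / 0.5697 | 25.25 / 0.5196 | 25.67 / 0.5523 | 24.72 / 0.4864 | 28.66 / 0.7342 |
| ×4 | 10 | bird | 25.61 / 0.6081 | - | 25.93 / 0.6310 | 25.16 / 0.5832 | 25.51 / 0.6208 | 24.61 / 0.5504 | 27.56 / 0.7612 |
| ×4 | 10 | butterfly | 21.78 / 0.6258 | - | 22.00 / 0.6427 | 22.02 / 0.6405 | 22.41 / 0.6546 | 21.62 / 0.6038 | 21.96 / 0.7053 |
| ×4 | 10 | head | 25.63 / 0.4798 | - | 26.04 / 0.5038 | 25.24 / 0.4619 | 25.77 / 0.5052 | 24.7 / 0.431 | 28.82 / 0.6504 |
| ×4 | 10 | woman | 24.21 / 0.5843 | - | 24.49 / 0.6056 | 24.05 / 0.5707 | 24.15 / 0.599 | 23.59 / 0.5376 | 25.39 / 0.7394 |
| ×4 | 10 | average | 24.59 / 0.5684 | - | 24.92 / 0.5906 | 24.34 / 0.5552 | 24.70 / 0.5864 | 23.85 / 0.5218 | 26.48 / 0.7181 |
| ×4 | 15 | baby | 22.79 / 0.4029 | - | 23.19 / 0.4264 | 22.21 / 0.3755 | 22.37 / 0.3935 | 21.59 / 0.3433 | 27.19 / 0.6783 |
| ×4 | 15 | bird | 22.82 / 0.4648 | - | 23.17 / 0.488 | 22.23 / 0.434 | 22.40 / 0.4625 | 21.58 / 0.3985 | 26.11 / 0.7021 |
| ×4 | 15 | butterfly | 20.38 / 0.5401 | - | 20.57 / 0.5544 | 20.20 / 0.5375 | 20.43 / 0.549 | 19.73 / 0.5025 | 20.83 / 0.6482 |
| ×4 | 15 | head | 22.85 / 0.3508 | - | 23.25 / 0.3722 | 22.36 / 0.3301 | 22.60 / 0.3628 | 21.63 / 0.2942 | 27.53 / 0.6037 |
| ×4 | 15 | woman | 21.98 / 0.4594 | - | 22.28 / 0.4788 | 21.53 / 0.438 | 21.57 / 0.4564 | 20.98 / 0.4057 | 24.16 / 0.6822 |
| ×4 | 15 | average | 22.16 / 0.4436 | - | 22.49 / 0.4640 | 21.71 / 0.4230 | 21.87 / 0.4448 | 21.10 / 0.3888 | 25.16 / 0.6629 |
| ×4 | 20 | baby | 20.59 / 0.3081 | - | 20.96 / 0.3271 | 19.97 / 0.2812 | 19.91 / 0.2885 | 19.25 / 0.2515 | 25.98 / 0.6304 |
| ×4 | 20 | bird | 20.68 / 0.3633 | - | 21.01 / 0.3819 | 20.03 / 0.3308 | 20.01 / 0.3485 | 19.28 / 0.2957 | 25.01 / 0.6514 |
| ×4 | 20 | butterfly | 18.99 / 0.4705 | - | 19.16 / 0.4819 | 18.54 / 0.4563 | 18.65 / 0.4652 | 18.03 / 0.4251 | 19.97 / 0.5984 |
| ×4 | 20 | head | 20.77 / 0.264 | - | 21.15 / 0.2819 | 20.21 / 0.2444 | 20.19 / 0.2643 | 19.31 / 0.2065 | 26.38 / 0.5609 |
| ×4 | 20 | woman | 20.1 / 0.3704 | - | 20.41 / 0.3865 | 19.53 / 0.3461 | 19.45 / 0.3570 | 18.87 / 0.3154 | 23.22 / 0.6335 |
| ×4 | 20 | average | 20.23 / 0.3553 | - | 20.54 / 0.3719 | 19.66 / 0.3318 | 19.64 / 0.3446 | 18.95 / 0.2988 | 24.11 / 0.6149 |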