Literature DB >> 30183722

Joint bayesian convolutional sparse coding for image super-resolution.

Abstract

We propose a convolutional sparse coding (CSC) for super resolution (CSC-SR) algorithm with a joint Bayesian learning strategy. Due to the unknown parameters in solving CSC-SR, the performance of the algorithm depends on the choice of the parameter. To this end, a coupled Beta-Bernoulli process is employed to infer appropriate filters and sparse coding maps (SCM) for both low resolution (LR) image and high resolution (HR) image. The filters and the SCMs are learned in a joint inference. The experimental results validate the advantages of the proposed approach over the previous CSC-SR and other state-of-the-art SR methods.

Entities: Chemical Disease Species

Mesh：

Year: 2018 PMID： 30183722 PMCID： PMC6124716 DOI： 10.1371/journal.pone.0201463

Source DB: PubMed Journal: PLoS One ISSN： 1932-6203 Impact factor: 3.240

Introduction

Image super-resolution (SR) arms to reconstruct a high resolution (HR) image from a single or several low resolution (LR) image of the scene together [1-5]. The resolution limitations of low-cost imaging sensors is overcome by SR methods, and the degradation in the LR images caused by blur and the motion of camera or scene are utilized to reconstruct the HR image by SR methods. Since the motion parameters are estimated along with HR image solely from the LR images, SR is a difficult inverse problem, especially the single image SR. To reconstruct a HR image from the LR image, information usually was lost during the down-sampling procedure, and more prior knowledge should be exploited. The errors are easily caused in estimating the missing pixels of HR image based on some simple assumptions that does not hold in many practical systems. For better reconstructing complicated structures in natural images, local priors are exploited in image patches by the single image super resolution (SISR) for estimating the HR image [6-9]. Example-based methods are one of the most important SISR methods. The existing example-based methods can be categorized into sparse coding-based and mapping-based methods [10-12]. Sparse coding-based methods train a couple of dictionaries for LR and HR image patches, and there are many approaches are proposed to build mapping functions between the LR and HR patches [13-16]. The mapping functions are learned using LR and HR patch pairs [6], [17-18]. For obtaining the final results, the pixels in the overlapped patches are averaged by the previous example-based methods. However, the pixels in the overlapped patches usually are not consistent, and the averaging process may destroy the inconsistency. Some approaches are proposed to overcome this problem and show the improved performance in image reconstruction [19]. Recently, the convolutional sparse coding based super-resolution (CSC-SR) method [19] utilizes a global sparse coding method for preserving the consistency better. CSC has been applied in unsupervised learning for visual features [20], [21]. It represents the input signal by the linear combination of N sparse feature maps corresponding to N filters. The number of filters in decomposing LR image is different from the number of filters in reconstructing HR image in CSC-SR method. By processing the global image rather than in each local patch, the previous CSC-SR has shown its outperformance over the sparse coding (SC) based ones. Although CSC-SR adopts a more adaptive decomposition-reconstruction strategy, this model still utilizes the fixed number of filters. Therefore, these parameters should be assigned a priori and a structure of latent space representing the input data should be applied to solve CSC problem. In that case, the sparse coding maps and the filters should be learned optimal. Otherwise, the errors in estimating the parameters may lead to instability in reconstructing the HR image. To address the above problems, we present a novel convolutional sparse coding based super-resolution method. There are two stages in the proposed method. Firstly, we learn the filters and the sparse coding maps adaptively by modeling them using a Beta-Bernoulli distribution in decomposing the LR image. Secondly, we use the same distributions associated with the sparse coding maps when reconstructing the HR image. In each stage, the sparse coding maps are learned by minimizing the CSC problem with a variance Bayesian inference process. The Bernoulli distributions have an ability of controlling the frequency with the factors. Since learning processes on LR image and HR image are based on the same distribution, the filters and the sparse coding maps in each stage are inferred simultaneously under a joint inference process. The experimental results of the proposed method validate the competitive results with the state-of-the-art methods.

Related works

Due to the consistency between the overlapped patches, the existing sparse coding-based methods [13-16] may smooth out the consistency including the high frequency edges and structures of the image. To preserve the consistency, a convolutional sparse coding is proposed to encode the whole image in [22]. The CSC based super-resolution (CSC-SR) method learns the sparse coding maps of the LR and HR image separately by solving the CSC problem. We use 3×3 low pass filter with all coefficients being 1/9 to extract the smooth component of the LR image, which is the same as the work [19]. Denote by y the LR image, by Y the smooth component of y, by Y the residual component of y. Y represents high frequency edges and texture structures. In CSC-SR, after the smooth component of LR image Y extracted by a low pass filter, and a group of LR filters are learned to decompose the residual component Y: where Y is the residual component including high frequency edges and texture structures, are the filters, are the sparse coding maps. ‖•‖ represents the Frobenius norm and ‖•‖1 represents l-1 norm. ⊗ is convolution operator. And we use the initial sparse coding map provided on the website, http://www4.comp.polyu.edu.hk/~cslzhang/papers.htm. Gu et al. [19] use a stochastic average based alternating direction multiplier algorithm (SA-ADMM) to efficiently solve the CSC model (1) with a large number of filters. The HR image is also decomposed into a smooth component and residual component. X denotes the residual component which represents high frequency edges and structures of HR image. The N1-dimensional sparse coding space of can be transformed to N2-dimensional sparse coding space by multiply a trained mapping function [19]. By amplifying the transformed matrix with k factor, the column vectors of the transformed matrix are the initial the sparse coding map set . The filters and the mapping functions can be learned by solving another CSC minimization problem, where represents the j-th sparse coding map of HR image, M represents the mapping function matrix of size N1×N2, is the set of filters of HR image. After solving the minimization problem Eq (2) by SA-ADMM algorithm, the mapping function M and the filers can be learned and finally the residual component with high frequency texture structure of the HR image can be reconstruct by the summation of the convolution results of HR filters and the sparse coding maps . By amplifying the smooth component of LR image Y with k factor, the smooth component of HR image X is obtained. The reconstructed HR image x is represented by .

Joint bayesian convolutional sparse coding for image super-resolution

The performance of convolutional-sparse-coding (CSC)-based super resolution still depends on the appropriate choice of the unknown parameters [19]. To address the limitation, we present a joint Bayesian convolutional sparse coding framework for image super resolution (JB-CSC SR). Firstly, rather than the non-parameter Bayesian based sparse coding, we develop a new beta process to build the Bayesian based CSC model. Secondly, since we use different numbers of the filters and the corresponding sparse coding maps to decompose/reconstruct the LR and HR images, two CSC problems are solved by jointly learning a mapping function between the sparse coding maps in the two feature spaces.

Bayesian based convolutional sparse coding model for decomposing LR image

In this paper, we propose a new CSC model for super resolution with Bayesian prior. As shown in S1 Fig, the SR model is a linear combination of N atoms with the corresponding coefficients. Likewise, the traditional CSC model is the linear combination of the convolutions with the corresponding linear coefficients, where all the linear coefficients equal to one. Therefore, adaptive linear coefficients should be added in the CSC model. For conveniently describing our model, we transform the residual component of LR image , the filters and the sparse coding maps in Eq (1) into 1-dimensional, thus, we have the residual component Y∈R, filters , sparse coding maps , where P = m×n, S = s×s. The number of filters for decomposing the LR image is N1. is the i-th 1-dimensional version of the convolution . With the base measure ℏ0 and the parameters a,b>0, a representation of Beta Process is as follows, where is unit point mass at . A draw ℏ from the process is a set of N probabilities , each associated with that are drawn i.i.d. from the base measure ℏ0. Considering π to be a Bernoulli distribution parameter, we use ℏ to draw a binary vector , , . In the limit N1→∞, the number of the non-zero elements in z itself is derived from Poisson(a/b) that control the number of used convolutions . Concatenating the convolutions together, we have the convolution matrix . Since the residual component Y can be represented as the linear combination of the convolutions , it can be expressed in the following term, where is the coefficient vector for linearly representing y. ε∈ℝ denotes the error term, . With the binary vector , the parameters in Eq (5) can be expressed as, where ⊙ represents the Hadamard vector product, , I, and I represent the identity matrix with size of N1×N1, S×S, P×P, respectively. γ~Gamma(c,d), γ~Gamma(e,f) are drawn from the Gamma distributions, respectively. Following the model Eq (5) and the general structure of beta process described in [23], the convolutional sparse coding model with Bayesian prior can be expressed as, where w is defined as in Eq (5). We rewrite the formula Eq (6) as follows, Eq (7) is a typical convolutional sparse coding minimization problem with Bayesian priors. The weighted linear combination of the convolutions, , is equivalent to the summation of the convolution of the filters and the weighted sparse coding maps , i = 1⋯N1. Analogous to [19], we adopt the SA-ADMM algorithm to solve the minimization problem Eq (7) to train the filters and learn the new sparse coding maps with the Bayesian prior.

Bayesian based convolutional sparse coding model for reconstructing HR image

Authors in [19] train the mapping function to obtain the sparse coding maps of the HR image from the sparse coding maps of the LR image. However, inappropriate choosing the unknown parameters, including the number of the convolutions for HR image and the parameters of the mapping function, may lead to instability of the performance of the algorithm. To address this problem, we build the convolutional sparse coding (CSC) model with Bayesian prior to learn the HR filters to reconstruct the HR image. For convenient expression, given the sparse coding maps learned from Eq (7), we have . N2 is the number of filters for reconstructing the HR image. The authors in [19] transform A from N1-dimension feature space to the N2-dimension feature space by right-multiplying it with the mapping function , (, m is the j-th column of M), they have . And then by zooming each column of AM with k factor, the sparse coding maps for HR image is obtained. Different from [19], we apply Bayesian prior to the mapping function M, the HR filters, and the sparse coding maps together. Let k denote the zooming factor, thus we represent the 1-dimensional version of the residual component of HR image representing high frequency edges and structures as X∈ℝ. We represent the mapping function as , the HR filters as , and the sparse coding maps as . Let denote the j-th 1-dimensional version of the convolution . We initialize each column of mapping function M as . With the same base measure ℏ0 and the same parameters a,b>0 in Eq (3), we have a representation of Beta Process as follows where is unit point mass at . Similar to the Beta Process in Sub-section Ⅲ-A, we also have ℏ to draw a binary vector , , . The number of the non-zero elements in z itself is drawn from Poisson(a/b) and it controls the number of used convolutions , when N2→∞. Given the binary vector z, we have the diagonal matrix . By multiplying the mapping function M with Λ, we have , thus the columns of the mapping function is weighted by the elements of z. We represent the sparse coding maps as for HR image. First, we transform A with N1 sparse coding maps for LR image to AMΛ with N2 sparse coding maps by the new mapping function MΛ. By zooming AMΛ with k factor, we have the sparse coding maps , where x represents the horizontal coordinate in the image domain. Given the convolution matrix , the residual component x of HR image can be represented as a linear combination of the convolutions, , where and ε∈ℝ represent the coefficient vector and the error term, respectively. The coefficient vector w can be decomposed by the binary vector . Similar to Eq (5), we have, , I, I represent the identity matrix with size of N2×N2, kS×kS, kP×kP, respectively. γ and γ are the defined as in Eq (5). Following the parameter settings in Eq (10), the convolutional sparse coding model with Bayesian prior for learning the HR filters can be expressed as, where w is defined as in Eq (10). represents the weighted linear combination of the convolutions, we rewrite the formula Eq (11) as follows, Eq (12) is also a convolutional sparse coding minimization problem with Bayesian priors. We solve it by SA-ADMM algorithm. After training the HR filters , the sparse coding maps , and the mapping function MΛ, the residual component of HR image is formulated as the summation of the convolutions of HR filters and the sparse coding maps: . The estimation of HR image is reconstructed by combing the residual component and the smooth component. The training framework of the proposed JB-CSC super-resolution method is shown in S2 Fig.

Variance bayesian inference process

Gibbs sampling is usually used to perform Bayesian inference to sample the parameters from the conditional distribution of each hidden variable, given the other observations [24]. The authors in [25] update the parameters according to the posterior distributions over the model for Gibbs sampler. The inference process performs updating. It is also called sampling over these posteriors. Inspired by [23], [25], we derive analytical expressions for the Gibbs sample, starting with the convolutions of the LR image instead of the dictionary in their models [25]. For our model, we derive the following about the posterior distribution over a LR convolution to sampling the i-th convolution, Let denote the contribution of the convolution to the residual component of LR image y, then we have, With , the posterior distribution in Eq (14) can be rewritten as, Given Eq (15), the posterior distribution over a convolution can be written as where , . Once the convolutions have sampled, we sample the parameter . By the distribution of the i-th convolution, the posterior probability distribution over can be expressed as With the prior probability of given by π, we can write the following posterior probability, The prior probability of is given by 1−π, the posterior probability can be derived as, Analogous to the K-SVD algorithm, we derive the expression of sampling , Having sampled the convolution and the binary vector, we adopt the method of sampling of s, γ, π, γ, according to [25].

Summary of algorithm

The proposed model consists of two stages: first, we solve the CSC model problem to learn LR filters and the sparse coding maps; second, we solve the CSC problem to learn the HR filters, the sparse coding maps, and the mapping function for reconstructing the HR image. By incorporating the Bayesian prior and the inference process into the convolutional sparse coding model, we summarize the algorithm of the proposed method in Algorithm 1.

Algorithm 1

1. Input the train LR image Y, decompose it into smooth component Y and residual component y; 2. Initialize the parameters z, s, ε, , and as in Eq (5). 2. Solving Eq (6) by SA-ADMM algorithm to obtain and the weighted sparse coding maps while updating the parameters for z by Eq (20), and updating s,ε, accordingly. Then the smooth component Y is zoomed into X with k factor. 3. Initialize , and the parameters z, s, ε, , and as in Eq (10); 4. Transform the sparse coding maps to AMΛ, where , by zooming it with k factor, get the initial sparse coding maps for HR image; 5. Solving Eq (12) by SA-ADMM algorithm to obtain the filters , sparse coding maps , and the mapping function M while updating the parameters for z analogous to Eq (20), and updating s, ε accordingly. 6. Reconstruct the residual component of HR image by . The final HR image is obtained by

Experimental results

In this section, we compare the proposed joint Bayesian convolutional sparse coding (JB-CSC) method with four state-of-the-art SR methods, including the Beta process joint dictionary learning method (BPJDL) [16], convolutional neural network based method (CNN-SR) [18], convolutional sparse coding based method (CSC-SR) [19], and adjusted ANR (A-ANR) [26]. We use the source code provided in the authors' website, and use the parameters which they recommend. The original ground truth images are downsized by bi-cubic interpolation to generate LR-HR image pairs for both training and evaluation.

Parameter setting

To have a fair comparison between CSC-SR [19] and our model, we adopt the same training set provided in [19], the same regularization parameters, and the same implementation on the image boundary. The beta-distribution parameters are set as a = N1 for LR image and a = N2 for HR image, and b = 1 for both HR and LR image. The hyper-parameters in Eq (5) and Eq (10) were set as c = d = e = f = 10−6, which is provided in [24]. All these parameters in the Bayesian inference process have not been tuned. We conduct all the comparison experiments on a PC with Intel Core i7 25.GHz CPU 4GB RAM on Matlab platform.

Comparison with state-of-the-arts

We compare the proposed JB-CSC method with other recent SR methods on the widely used test images from Set5 [27] and Set14 [28] for different zooming factors. The test images of Set5 and Set14 are available at https://sites.google.com/site/jbhuang0604/publications/struct_sr. These two test image datasets are both publicly released and popularly usedfor image super-resolution works [1-6], [8-26] without any consent or any copyright. We show the SR results of BPJDL [16], A-ANR [26], CNN-SR [18], convolutional sparse coding based method (CSC-SR) [19], and the proposed JB-CSC method in S3–S6 Figs. In S3 Fig, we show the SR results of image Bird with zooming factor k = 2. In S4, S5 and S6 Figs we show the SR results of image Foreman, Face Butterfly with zooming factor k = 3. And the PSNR values by the competing methods are shown in Table 1.

Table 1

PSNR results of different methods.

	Zooming factor k = 2					Zooming factor k = 3
	BPJDL	CNN	A-ANR	CSC	JB-CSC	BPJDL	CNN	A-ANR	CSC	JB-CSC
Butterfly	31.43	32.20	31.94	31.96	32.02	26.42	27.58	27.22	27.11	27.60
Face	35.75	35.60	35.72	35.71	35.76	33.45	33.57	33.74	33.80	33.94
Bird	40.99	40.63	40.98	41.49	41.64	34.53	34.92	34.48	35.78	35.90
Woman	35.23	34.93	35.27	35.31	35.37	30.50	30.92	31.19	31.27	31.42
Foreman	36.49	36.19	36.91	36.64	36.79	32.91	33.34	34.22	34.24	34.33
Coast	30.60	30.49	30.55	30.65	30.71	27.07	27.19	27.27	27.27	27.38
Flowers	32.91	33.04	33.03	33.15	33.22	28.62	28.98	29.05	29.05	29.15
Zebra	33.62	33.30	33.67	33.77	33.83	28.73	28.90	29.06	29.30	29.43
Lena	36.58	36.48	36.57	36.66	36.76	33.13	33.39	33.50	33.62	33.75
Bridge	27.77	27.70	27.78	27.84	27.91	24.99	25.07	25.17	25.20	25.33
Baby	38.54	38.41	38.43	38.48	38.53	35.15	35.00	35.14	35.28	35.44
Peppers	36.71	36.75	37.06	36.90	36.97	34.02	34.35	34.71	34.72	34.86
Man	30.80	30.82	30.88	30.97	31.02	28.05	28.18	28.29	28.34	28.44
Barbara	28.68	28.59	28.70	28.77	28.81	26.82	26.65	26.47	26.67	26.78
Ave	34.01	33.94	34.11	34.16	34.24	30.31	30.57	30.68	30.83	30.97

In S3 Fig, we can see from the small window in the right-bottom of the images, the SR results by CNN-SR over-smooth the edge. And very little ringing artifacts are generated by BPJDL, A-ANR, and CSC, since the zooming factor is low. The PSNR values of the results at this factor k = 2 show the outperformance of our method. In S4 Fig, in the small window of the images, the SR results of the competing methods, such as BPJDL and A-ANR show that more edges are over-smoothed. Comparing CSC-SR and the proposed method, the edges of the image can be preserved better than CSC-SR. In S5 Fig, the results of the proposed method are compared with the results of BPJDL, A-ANR, CNN-SR, and CSC-SR. In the small windows, we can see that more textures of the hair are preserved much better in BPJDL. This comparison verifies that the models with Bayesian prior, such as BPJDL and the proposed method, have the outperformance in preserving the texture details of the images. In S6 Fig, the proposed method is compared with the competing methods with the zooming factor k = 4. In (b), we can see that the result of BPJDL shows that the ringing artifacts increase as the zooming factor increases. As shown in (d), (c), (e), (f) of S6 Fig, the ringing artifacts remain in the results of A-ANR, CSC-SR and the proposed method while not in the result of CNN-SR. However, the edges are over-smoothed by CNN-SR. The PSNR value of the proposed method is higher than the competing methods.

Conclusion

In this paper, we present a convolutional sparse coding based super resolution method with a joint Bayesian learning strategy (JB-CSC). JB-CSC employs a coupled Beta-Bernoulli process to incorporate the Bayesian prior into the convolutional sparse coding model, which avoids instability caused by estimating the unknown parameter. Different from the CSC method, the filters and sparse feature maps for both low resolution (LR) image and high resolution (HR) are learned adaptively through the Bayesian learning strategy. The experimental results validate the advantages of the proposed approach, compared with the previous CSC-SR and other state-of-the-art SR methods.

The difference between the sparse coding model and the convolutional sparse coding model.