Literature DB >> 36010816

Stochastic Model of Block Segmentation Based on Improper Quadtree and Optimal Code under the Bayes Criterion.

Yuta Nakahara1, Toshiyasu Matsushima2.   

Abstract

Most previous studies on lossless image compression have focused on improving preprocessing functions that reduce the redundancy of pixel values in real images. In contrast, we assume a stochastic generative model directly on the pixel values and aim to achieve the theoretical limit of the assumed model. In this study, we propose a stochastic model based on improper quadtrees and theoretically derive the optimal code for the proposed model under the Bayes criterion. In general, Bayes-optimal codes require computation of exponential order in the data length; however, by assuming a novel prior distribution, we construct an algorithm that requires only polynomial-order computation without losing optimality.


Keywords:  Bayes code; lossless image compression; quadtree; stochastic generative model

Year:  2022        PMID: 36010816      PMCID: PMC9407622          DOI: 10.3390/e24081152

Source DB:  PubMed          Journal:  Entropy (Basel)        ISSN: 1099-4300            Impact factor:   2.738


1. Introduction

There are two approaches to lossless image compression. (These two approaches are detailed in Section 1 of our previous study [1].) Most previous studies (e.g., [2,3,4]) adopt an approach in which a preprocessing function outputs a code length assignment vector from the past pixel values. This vector determines the code length of the next pixel value or, typically, of a value equivalent to it in the sense that there exists a one-to-one mapping computable by both the encoder and the decoder. The vector and the mapped pixel value are then passed to a subsequent entropy coding process such as [5,6]. In this approach, the elements of the code length assignment vector are non-negative and sum to one; therefore, the vector superficially resembles a probability distribution. However, it does not directly govern the stochastic generation of the original pixel values. Hence, we cannot define the entropy of the source of the pixel values, and we cannot discuss the theoretical optimality of the preprocessing function and the one-to-one mapping. In contrast, we adopt an approach in which we estimate a stochastic generative model, with an unknown parameter and a model variable m, that is directly and explicitly assumed on the original pixel values [1,7,8,9]. Therefore, we can discuss the theoretical optimality of the entire algorithm with respect to the entropy defined from the assumed stochastic model. In particular, by assuming prior distributions on the unknown parameter and on the model variable m, we can achieve the theoretically optimal coding under the Bayes criterion in statistical decision theory (see, e.g., [10]). Such codes are known as Bayes codes [11] in information theory. The Bayes code asymptotically achieves the entropy of the true stochastic model, and its convergence speed attains the theoretical limit [12]. Bayes codes have shown remarkable performance in text compression (e.g., [13]). Therefore, we adopt this approach.
We assume that the target image herein has non-stationarity; that is, the properties of the pixel values differ among positions in the image. For such images, researchers have performed quadtree block segmentation as a component of the preprocessing and one-to-one mapping in the former approach, and its practical efficiency has been reported in many previous studies (e.g., [4,14]). In the latter approach, we proposed a stochastic generative model that contains a quadtree as a model variable m. By assuming a prior distribution on it, we derived the optimal code under the Bayes criterion, and we constructed a polynomial-order algorithm to calculate it without loss of optimality [1]. However, in all these studies [1,4,14], the class of quadtrees is restricted to that of proper trees, whose inner nodes have exactly four children. In this paper, we propose a stochastic generative model based on an improper quadtree m and derive the optimal code under the Bayes criterion. In general, codes optimal under the Bayes criterion require a summation whose computational cost is exponential in the data length. However, we herein construct an algorithm that requires only polynomial-order computation without losing optimality by applying a theory of probability distributions on general rooted trees [15] to the improper quadtree representing the block segmentation.

2. Proposed Stochastic Generative Model

Let X denote the set of possible pixel values. For example, X = {0, 1} for binary images and X = {0, 1, ..., 255} for 8-bit grayscale images. Let h and w denote the height and width of an image, respectively. Although our model can represent any rectangular image, we assume h = w in the following for simplicity of notation. Then, let x_t denote the t-th pixel value in raster-scan order; x_t lies at the ⌊t/w⌋-th row and the (t mod w)-th column. In addition, let x^t denote the sequence of pixel values x_0, x_1, ..., x_t. Note that all indices start from zero herein. We assume that x_t is generated from a probability distribution p(x_t | x^{t-1}, m, θ) depending on an unknown model m and unknown parameters θ. (For t = 0, we assume that x_0 follows p(x_0 | m, θ).) We define m and θ in the following. Following [1], a block s is a set of pixel indices forming a square region of the image, and the four children of s are the index sets of its upper-left, upper-right, lower-left, and lower-right quarters. We define the model m as a quadtree whose nodes are such blocks. In contrast to [1], an inner node of m may have any nonempty subset of the four children; that is, m may be an improper quadtree. Each node s of m has its own parameter θ_s, and θ denotes the collection of these parameters. Notably, compared with an equivalent model represented by a proper tree with added dummy child nodes, this reduces the number of parameters. Under the model m and the parameters θ, we assume that the t-th pixel value x_t is generated according to p(x_t | θ_s), where s is the minimal block of m that contains x_t. Thus, the pixel value x_t given the past sequence x^{t-1} depends only on the parameter θ_s of the minimal block s that contains x_t. Note that we do not assume a specific form of p(x_t | θ_s) at this point. For example, we can assume the Bernoulli distribution for binary images and the Gaussian distribution (with an appropriate normalization and quantization) for grayscale images.
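As an illustration of the model structure (not the authors' implementation), the following Python sketch represents an improper quadtree whose nodes each carry their own Bernoulli parameter: a node may refine any subset of its four quadrants, including a single one, and a pixel is governed by the deepest block containing it. All class and function names here are hypothetical.

```python
import random

class Block:
    """A square block of an improper quadtree with its own Bernoulli parameter."""
    def __init__(self, x, y, size, theta, children=None):
        self.x, self.y, self.size = x, y, size   # top-left corner and side length
        self.theta = theta                        # Bernoulli parameter of this block
        self.children = children or {}            # quadrant index (0..3) -> Block

def quadrant(node, px, py):
    """Index (0..3) of the quadrant of `node` that contains pixel (px, py)."""
    h = node.size // 2
    return (1 if px >= node.x + h else 0) + (2 if py >= node.y + h else 0)

def minimal_block(node, px, py):
    """Deepest block containing (px, py); undivided quadrants fall back to the parent."""
    q = quadrant(node, px, py)
    if q in node.children:
        return minimal_block(node.children[q], px, py)
    return node

def sample_image(root, rng):
    """Sample a binary image: each pixel follows its minimal block's Bernoulli law."""
    w = root.size
    return [[1 if rng.random() < minimal_block(root, px, py).theta else 0
             for px in range(w)] for py in range(w)]

# Example: an 8x8 root block (theta = 0.1) whose upper-left quadrant alone is
# refined (theta = 0.9) -- a single-child node that a proper quadtree forbids.
root = Block(0, 0, 8, 0.1, children={0: Block(0, 0, 4, 0.9)})
img = sample_image(root, random.Random(0))
```

The single-child node in the example is exactly the case where the improper quadtree saves parameters: a proper quadtree would need dummy siblings, each with its own parameter.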

3. The Bayes Code for Proposed Model

Since the true m and θ are unknown, we assume the prior distributions p(m) and p(θ | m). Then, we estimate the true generative probability under the Bayes criterion in statistical decision theory (see, e.g., [10]). Subsequently, we use the estimate as the coding probability of an entropy code such as the range coder [16]. Such a code is known as a Bayes code [11] in information theory. The expected code length of the Bayes code converges to the entropy of the true model for sufficiently large data lengths, and its convergence speed achieves the theoretical limit [12]. The Bayes code has shown remarkable performance in text compression (e.g., [13]). According to the general formula in [11], the Bayes-optimal coding probability for x_t is the posterior predictive distribution

q(x_t | x^{t-1}) = Σ_m p(m | x^{t-1}) ∫ p(x_t | x^{t-1}, m, θ) p(θ | x^{t-1}, m) dθ.   (4)

Proposition 1 implies that we should use the coding probability that is a weighted mixture of p(x_t | x^{t-1}, m, θ) over every block segmentation pattern m and parameters θ according to the posteriors p(m | x^{t-1}) and p(θ | x^{t-1}, m). (For t = 0, the mixture weights are the priors p(m) and p(θ | m), which corresponds to the initialization of the algorithm.) Notably, the model class is generalized from the set of proper quadtrees to the set of improper quadtrees, although (4) has a similar form to Formula (5) in [1].
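For a single block whose source is Bernoulli with a conjugate beta prior, the mixture over θ in (4) has a familiar closed form (the posterior predictive used again in Section 5). A minimal sketch, with hypothetical names and the common Beta(1/2, 1/2) default, illustrates how the sequential coding probability and total code length are computed:

```python
import math

def beta_bernoulli_code(bits, a=0.5, b=0.5):
    """Total code length (in bits) of the Bayes code for a Bernoulli source
    with a Beta(a, b) prior: q(x_t = 1 | x^{t-1}) = (a + n1) / (a + b + t)."""
    n1 = 0
    total_bits = 0.0
    for t, x in enumerate(bits):
        p1 = (a + n1) / (a + b + t)   # posterior predictive probability of '1'
        total_bits -= math.log2(p1 if x == 1 else 1.0 - p1)
        n1 += x
    return total_bits

# A strongly biased sequence compresses to well under 1 bit/symbol:
rate = beta_bernoulli_code([0] * 90 + [1] * 10) / 100
```

Because the prior is conjugate, no numerical integration is needed; this is what makes the integral in (4) feasible in the experiments.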

4. Polynomial Order Algorithm to Calculate Bayes-Optimal Coding Probability

Unfortunately, the Bayes-optimal coding probability (4) contains a computationally hard calculation. (Herein, we assume that the integral with respect to θ is feasible; examples of feasible settings are described in the next section.) The cost of the summation over m increases exponentially with respect to the image size. Therefore, we propose a polynomial-order algorithm to calculate (4) without loss of optimality by applying a theory of probability distributions on general rooted trees [15] to the improper quadtree m. In this section, we focus on the procedure of the constructed algorithm; its validity is described in Appendix A.

First, we assume the following prior distributions as p(m) and p(θ | m). For each block s and each division pattern of s (i.e., each subset of its four children), we assign a hyperparameter that intuitively represents the conditional probability that s has that division pattern under the condition that s is contained in m, and we define p(m) as the product of these probabilities over the nodes of m. The above prior actually satisfies Σ_m p(m) = 1. Although this is proved for any rooted tree in [15], we briefly describe a proof restricted to our model in Appendix A to make this paper self-contained. Note that the above assumption does not restrict the expressive capability of the general prior, in the sense that each model m can still be assigned a non-zero probability p(m). For p(θ | m), we assume that each element θ_s of the parameters depends only on s and is independent of both the other elements and the model m.

From Assumptions 1 and 3, the following lemma holds: the optimal coding probability for x_t depends on m only through the minimal block s that contains x_t and its division pattern. Therefore, it could easily be calculated if these were known. At last, the Bayes-optimal coding probability can be calculated by a recursive function over the nodes on the path of the perfect quadtree from the root block to the pixel x_t; the definition of the path is the same as that in [1]. We define a recursive function on this path, and the following theorem holds.

The Bayes-optimal coding probability (4) is obtained as the value of this recursive function at the root. Although Theorem 1 is proved by applying Corollary 2 of Theorem 7 in [15], we briefly describe a proof restricted to our model in Appendix A to make this paper self-contained. Theorem 1 means that the summation with respect to m in (4) can be replaced by summations along the path and over the division patterns of its nodes, whose cost is only of polynomial order. The proposed algorithm recursively calculates a weighted mixture of the coding probability for the case where the block s is not divided at x_t and the coding probability for the case where s is divided.
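The flavor of this recursion can be sketched for the simplified "divide all four quadrants or none" case (the full algorithm additionally mixes over every subset of the quadrants; the exact update rules are in the paper and [15]). Each node on the path from the root block to the pixel keeps beta-Bernoulli counts and a posterior weight of being a leaf, and the coding probability is a weighted mixture along the path, so the cost per pixel is linear in the tree depth. All names below are hypothetical:

```python
import math

class Node:
    """One block on the path: sufficient statistics plus a leaf-posterior weight."""
    def __init__(self):
        self.n0 = self.n1 = 0     # Beta(1/2, 1/2)-Bernoulli counts of 0s and 1s
        self.g = 0.5              # posterior probability that this block is a leaf
        self.children = {}        # quadrant index (0..3) -> Node (created lazily)

    def q_local(self, x):
        """Posterior predictive of x if this block were a leaf."""
        p1 = (0.5 + self.n1) / (1.0 + self.n0 + self.n1)
        return p1 if x == 1 else 1.0 - p1

def code_pixel(node, px, py, size, x, max_depth, depth=0):
    """Coding probability q(x) mixed along the path; updates counts and weights."""
    ql = node.q_local(x)
    if depth == max_depth:                    # deepest block: no further division
        q = ql
    else:
        h = size // 2
        quad = (1 if px % size >= h else 0) + (2 if py % size >= h else 0)
        child = node.children.setdefault(quad, Node())
        qc = code_pixel(child, px, py, h, x, max_depth, depth + 1)
        q = node.g * ql + (1.0 - node.g) * qc  # mixture: "leaf here" vs "divided"
        node.g = node.g * ql / q               # Bayesian update of the leaf weight
    node.n0 += 1 - x
    node.n1 += x
    return q
```

The weight update is the standard sequential form of a Bayes mixture: after each pixel, the "this block is a leaf" hypothesis is reweighted by how well it predicted, which is what lets one recursion replace the exponential sum over all trees.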

5. Experiments

In this section, we perform four experiments. Three of them are similar to the experiments in [1]; the fourth is newly added. In Experiments 1, 2, and 3, we assume binary pixel values, which is the simplest setting, to focus on the effect of the improper quadtrees. In Experiment 4, we assume 8-bit grayscale pixel values to show that our method is also applicable to grayscale images. The purpose of the first experiment is to confirm the Bayes optimality of the proposed code on synthetic images generated from the proposed model. The purpose of the second experiment is to show an example image suited to our model. The purpose of the third experiment is to compare the average coding rates of our proposed algorithm with those of current image coding procedures on real images. The purpose of the fourth experiment is to show that our method is applicable to grayscale images. In Experiments 1 and 2, the within-block distribution is the Bernoulli distribution with the parameter of the minimal block s containing the pixel. Each parameter is i.i.d. distributed with the beta distribution, which is the conjugate prior of the Bernoulli distribution; therefore, the integral in (4) has a closed form. The hyperparameter of the model prior takes a common value for every block and division pattern, and the hyperparameters of the beta distribution are fixed. For comparison, we used the previous method based on proper quadtrees, whose hyperparameters are the same as in the experiments in [1], and the standard methods known as JBIG [17] and JBIG2 [18].

5.1. Experiment 1

The setting of Experiment 1 is as follows. The images are square, and we generate 1000 of them according to the following procedure.
1. Generate m according to (5).
2. Generate θ according to p(θ | m).
3. Generate the pixel values according to p(x_t | x^{t-1}, m, θ).
4. Repeat Steps 1 to 3 for 1000 times.
Examples of the generated images are shown in Figure 5. Subsequently, we compress these 1000 images. The size of the image is saved in the header of the compressed file using 4 bytes. The coding probability calculated by the proposed algorithm is quantized into a fixed number of levels and substituted into the range coder [16]. Table 1 shows the coding rates (bit/pel) averaged over all the images. Our proposed code attains the minimum coding rate, as expected from its Bayes optimality.
Figure 5

Examples of the generated images in Experiment 1.

Table 1

The average coding rates (bit/pel).

Improper Quadtree (Proposal) | Proper Quadtree [1] | JBIG [17] | JBIG2 [18]
0.619                        | 0.624               | 1.811     | 0.962

5.2. Experiment 2

In Experiment 2, we compress camera.tif from [19], binarized with a threshold of 128. The settings of the header and the range coder are the same as those in Experiment 1. Figure 6 visualizes the maximum a posteriori (MAP) estimates of the model based on the improper quadtree and on the proper quadtree [1], which are by-products of the compression. They are obtained by applying Theorem 3 in [15] and the algorithm in Appendix B of the preprint of the full version of [15], which is available on arXiv. The improper quadtree represents the non-stationarity with fewer regions (i.e., fewer parameters) than the proper quadtree [1]. Table 2 shows that the coding rate of our proposed model for camera.tif is lower than those of the previous method based on the proper quadtree [1] and of JBIG [17], without any special tuning. However, JBIG2 [18] showed the lowest coding rate. An improvement of our method for real images is described in the next experiment.
Figure 6

The original image (left), the MAP estimated model based on the proper quadtree [1] (middle), and that based on the improper quadtree (right).

Table 2

The coding rates for the camera.tif in [19] (bit/pel).

Improper Quadtree (Proposal) | Proper Quadtree [1] | JBIG [17] | JBIG2 [18]
0.318                        | 0.323               | 0.348     | 0.293

5.3. Experiment 3

In Experiment 3, we compare the proposed algorithm with the proper-quadtree-based algorithm [1], JBIG [17], and JBIG2 [18] on real images from [19], binarized in the same manner as in Experiment 2. The settings of the header and the range coder are the same as those in Experiments 1 and 2. The difference from Experiments 1 and 2 lies in the stochastic generative model assumed on each block s. We assume another model, represented as a Bernoulli distribution that depends on the four neighboring pixels. (If the indices go out of the image, we use the nearest past pixel in Manhattan distance.) Therefore, the model has a kind of Markov property. In other words, there are 2^4 = 16 parameters for each block s of model m, and one of them is selected by the values of the four neighboring pixels observed in the past. Each parameter is i.i.d. distributed with the beta distribution. The results are shown in Table 3. The algorithms labeled Improper-i.i.d. and Proper-i.i.d. are the same as those in Experiments 1 and 2; the algorithms labeled Improper-Markov and Proper-Markov are the aforementioned ones.
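The 16-context selection can be sketched as follows. The mapping of neighbours to bits and the out-of-image fallback (zero here, rather than the nearest past pixel used in the paper) are illustrative assumptions:

```python
def context_index(img, px, py):
    """4-bit context from the causal neighbours (left, upper-left, upper,
    upper-right) of pixel (px, py); selects one of 16 Bernoulli parameters."""
    def pix(x, y):
        # Only already-decoded (past) pixels are usable; others fall back to 0.
        if 0 <= y < len(img) and 0 <= x < len(img[0]) and (y < py or (y == py and x < px)):
            return img[y][x]
        return 0
    neigh = [pix(px - 1, py), pix(px - 1, py - 1), pix(px, py - 1), pix(px + 1, py - 1)]
    return sum(b << i for i, b in enumerate(neigh))
```

Each block then maintains 16 independent beta-Bernoulli estimators, one per context index, instead of the single estimator used in the i.i.d. setting.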
Table 3

The coding rates for the binarized images from [19] (bit/pel).

Images    | Proper-i.i.d. | Improper-i.i.d. | JBIG [17] | Proper-Markov | JBIG2 [18] | Improper-Markov
bird      | 0.121 | 0.113 | 0.149 | 0.099 | 0.090 | 0.067
bridge    | 0.390 | 0.382 | 0.386 | 0.373 | 0.353 | 0.300
camera    | 0.323 | 0.318 | 0.348 | 0.310 | 0.293 | 0.255
circles   | 0.100 | 0.090 | 0.102 | 0.060 | 0.045 | 0.030
crosses   | 0.140 | 0.132 | 0.083 | 0.110 | 0.027 | 0.027
goldhill1 | 0.371 | 0.364 | 0.359 | 0.353 | 0.321 | 0.280
horiz     | 0.075 | 0.070 | 0.078 | 0.022 | 0.018 | 0.004
lena1     | 0.254 | 0.243 | 0.217 | 0.216 | 0.169 | 0.141
montage   | 0.176 | 0.165 | 0.164 | 0.163 | 0.114 | 0.087
slope     | 0.091 | 0.083 | 0.096 | 0.056 | 0.038 | 0.021
squares   | 0.005 | 0.004 | 0.076 | 0.010 | 0.016 | 0.003
text      | 0.468 | 0.465 | 0.301 | 0.468 | 0.229 | 0.280
avg.      | 0.209 | 0.202 | 0.197 | 0.187 | 0.143 | 0.125
Improper-Markov outperforms the other methods in terms of the average coding rate. The effect of the improper quadtree is probably amplified because the number of parameters for each block is increased. However, JBIG2 [18] still outperforms our algorithms on text only. We consider that this is because JBIG2 [18] is designed for text images such as faxes, in contrast to our general-purpose algorithm. Note that our algorithm has room for improvement by tuning the hyperparameters of the beta distribution for each context.

5.4. Experiment 4

Through Experiment 4, we show that our method is applicable to grayscale images. Herein, we assume two types of stochastic generative models within the blocks of the proper quadtree and of the improper quadtree. The first is the i.i.d. Gaussian distribution. The second is the two-dimensional autoregressive (AR) model [7] on the four neighboring pixels. (If the indices go out of the image, we use the nearest past pixel in Manhattan distance.) For both models, the pixel value is normalized and quantized in a similar manner to [7]. The prior distribution for each model is assumed to be a Gauss-gamma distribution, with the identity matrix used in the covariance hyperparameter. The results are shown in Table 4. (The values for the previous studies [2,4,20,21] are cited from [21].)
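For the i.i.d. Gaussian case, the conjugate Gauss-gamma analysis gives closed-form posterior updates (the Bayes predictive is then a Student-t distribution). The following sketch shows the standard conjugate updates; the function name and default hyperparameter values are illustrative, not those used in the paper:

```python
def gauss_gamma_posterior(xs, mu0=0.0, kappa0=1.0, alpha0=1.0, beta0=1.0):
    """Posterior hyperparameters of a Gauss-gamma (normal-gamma) prior on the
    mean and precision of a Gaussian, after observing the samples xs.
    The Bayes predictive is then a Student-t with 2 * alpha_n degrees of freedom."""
    n = len(xs)
    xbar = sum(xs) / n
    ss = sum((x - xbar) ** 2 for x in xs)              # within-sample scatter
    kappa_n = kappa0 + n
    mu_n = (kappa0 * mu0 + n * xbar) / kappa_n          # precision-weighted mean
    alpha_n = alpha0 + n / 2
    beta_n = beta0 + 0.5 * ss + kappa0 * n * (xbar - mu0) ** 2 / (2 * kappa_n)
    return mu_n, kappa_n, alpha_n, beta_n
```

The AR variant replaces the scalar mean by a regression on the four causal neighbours, with a matrix-valued (here identity-scaled) covariance hyperparameter, but the conjugate structure and closed-form updates are analogous.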
Table 4

The coding rates for the grayscale images from [19] (bit/pel).

Images          | JPEG2000 [20] | JPEG-LS [2] | MRP [4] | Vanilc [21] | Proper-Gaussian | Improper-Gaussian | Proper-AR | Improper-AR
bird            | 3.630 | 3.471 | 3.238 | 2.749 | 4.086 | 4.055 | 3.461 | 3.422
bridge          | 6.012 | 5.790 | 5.584 | 5.596 | 6.353 | 6.294 | 5.696 | 5.678
camera          | 4.570 | 4.314 | 3.998 | 3.995 | 4.651 | 4.589 | 4.163 | 4.121
circles         | 0.928 | 0.153 | 0.132 | 0.043 | 1.190 | 0.915 | 1.030 | 0.826
crosses         | 1.066 | 0.386 | 0.051 | 0.016 | 1.603 | 1.240 | 0.898 | 0.625
goldhill1       | 5.516 | 5.281 | 5.098 | 5.090 | 5.796 | 5.738 | 5.220 | 5.196
horiz           | 0.231 | 0.094 | 0.016 | 0.015 | 1.091 | 0.922 | 0.279 | 0.216
lena1           | 4.755 | 4.581 | 4.189 | 4.123 | 5.312 | 5.259 | 4.433 | 4.394
montage         | 2.983 | 2.723 | 2.353 | 2.363 | 3.818 | 3.734 | 2.940 | 2.850
slope           | 1.342 | 1.571 | 0.859 | 0.960 | 3.721 | 3.683 | 1.728 | 1.602
squares         | 0.163 | 0.077 | 0.013 | 0.007 | 0.335 | 0.205 | 0.323 | 0.202
text            | 4.215 | 1.632 | 3.175 | 0.621 | 4.310 | 3.691 | 4.176 | 3.732
Whole avg.      | 2.951 | 2.506 | 2.392 | 2.132 | 3.522 | 3.360 | 2.862 | 2.739
Natural avg.    | 4.897 | 4.687 | 4.421 | 4.311 | 5.240 | 5.187 | 4.595 | 4.562
Artificial avg. | 1.561 | 0.948 | 0.943 | 0.575 | 2.295 | 2.056 | 1.625 | 1.436
The coding rates of the proper-quadtree-based algorithm are improved by our proposed method for all the images in this data set and for both settings of the stochastic generative model assumed within the blocks. This indicates the superiority of the improper-quadtree-based model over the proper-quadtree-based model. The method labeled Improper-AR showed a lower average coding rate than JPEG2000 when averaging over the whole image set, and a lower average coding rate than JPEG-LS when averaging over the natural images. Although it does not outperform recent methods such as MRP and Vanilc, we consider that this is due to the suitability of the stochastic generative model assumed within the blocks, which is out of the scope of this paper.

6. Conclusions

We proposed a novel stochastic model based on the improper quadtree, which effectively represents variable-block-size segmentation of images. Then, we constructed a Bayes code for the proposed stochastic model. Moreover, we introduced an algorithm that implements it in polynomial order of the data size without loss of optimality. Experiments on both synthetic and real images demonstrated the flexibility of our stochastic model and the efficiency of our algorithm. As a result, the derived algorithm showed a better average coding rate than that of JBIG2 [18].