
First- and Second-Order Hypothesis Testing for Mixed Memoryless Sources.

Te Sun Han, Ryo Nomura.

Abstract

The first- and second-order optimum achievable exponents in the simple hypothesis testing problem are investigated. The optimum achievable exponent for the type II error probability, under the constraint that the type I error probability is allowed asymptotically up to ε, is called the ε-optimum exponent. In this paper, we first give the second-order ε-optimum exponent in the case where the null hypothesis and the alternative hypothesis are a mixed memoryless source and a stationary memoryless source, respectively. We next generalize this setting to the case where the alternative hypothesis is also a mixed memoryless source, and address the first-order ε-optimum exponent in this setting. In addition, an extension of our results to more general settings, such as hypothesis testing with mixed general sources, and a relationship with the general compound hypothesis testing problem are also discussed.


Keywords:  general source; hypothesis testing; information spectrum; mixed source; optimum exponent

Year:  2018        PMID: 33265265      PMCID: PMC7512691          DOI: 10.3390/e20030174

Source DB:  PubMed          Journal:  Entropy (Basel)        ISSN: 1099-4300            Impact factor:   2.524


1. Introduction

Let X = {X^n}_{n=1}^∞ and X̄ = {X̄^n}_{n=1}^∞ be two general sources (cf. Han [1]), where we use the term general source to denote a sequence of random variables X^n (respectively, X̄^n) indexed by block length n, where each component of X^n (respectively, X̄^n) takes values in an alphabet 𝒳 and may vary depending on n. We consider the hypothesis testing problem with null hypothesis X, alternative hypothesis X̄, and acceptance region A_n ⊂ 𝒳^n. The probabilities of type I error and type II error are defined, respectively, as α_n = Pr{X^n ∉ A_n} and β_n = Pr{X̄^n ∈ A_n}. We focus mainly on how to determine the ε-optimum exponent, defined as the supremum of achievable exponents R for the type II error probability under the constraint that the type I error probability is allowed asymptotically up to a constant ε. The classical but fundamental result in this setting is the so-called Stein's lemma [2], which gives the ε-optimum exponent in the case where both the null and alternative hypotheses are stationary memoryless sources. The lemma shows that the ε-optimum exponent is given by D(X‖X̄), the divergence between the stationary memoryless sources X and X̄. Chen [3] has generalized this lemma to the case where both X and X̄ are general sources, and established the general formula for the ε-optimum exponent in terms of divergence spectra. The ε-optimum exponent derived in [3] is called in this paper the first-order ε-optimum exponent. On the other hand, second-order asymptotics have also been investigated in several contexts of information theory [4,5,6,7,8,9] to analyze the finer asymptotic behavior of the form e^{−(nR+√n S)}. Strassen [4] has first introduced the notion of the second-order ε-optimum achievable exponent in the hypothesis testing problem in the case where both X and X̄ are stationary memoryless sources. The results in [4] have also revealed that the asymptotic normality of the divergence density rate (or likelihood ratio rate) plays an important role in computing the second-order ε-optimum exponent. In this paper, on the other hand, we investigate hypothesis testing for mixed memoryless sources.
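To make Stein's lemma concrete, the following Python sketch (an illustration added here, not part of the original paper; the Bernoulli parameters, block length, and threshold margin are arbitrary choices) computes the exact type I and type II error probabilities of a log-likelihood-ratio test for two stationary memoryless binary sources and compares the resulting type II exponent with the divergence:

```python
import math

def kl(p, q):
    """Binary KL divergence D(Ber(p) || Ber(q)) in nats."""
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))

def log_binom_pmf(n, k, r):
    """log of the Binomial(n, r) pmf at k, via lgamma to avoid overflow."""
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(r) + (n - k) * math.log(1 - r))

def logsumexp(xs):
    m = max(xs)
    return m + math.log(sum(math.exp(v - m) for v in xs))

p, q, n = 0.5, 0.8, 500            # null: Ber(p) i.i.d.; alternative: Ber(q) i.i.d.
D = kl(p, q)                       # Stein exponent, about 0.223 nats
t = D - 0.05                       # accept H0 when the normalized LLR is >= t

# The block log-likelihood ratio depends on x only through the number of ones k:
# LLR(k) = k*log(p/q) + (n-k)*log((1-p)/(1-q)); accept H0 iff LLR(k)/n >= t.
a1, a0 = math.log(p / q), math.log((1 - p) / (1 - q))
k_max = math.floor(n * (a0 - t) / (a0 - a1))   # acceptance region: k <= k_max

# Type I error: null sample rejected; type II error: alternative sample accepted.
alpha = math.exp(logsumexp([log_binom_pmf(n, k, p) for k in range(k_max + 1, n + 1)]))
beta = math.exp(logsumexp([log_binom_pmf(n, k, q) for k in range(0, k_max + 1)]))
exponent = -math.log(beta) / n     # lies between the threshold t and D

print(alpha, exponent, D)
```

Increasing n while shrinking the threshold margin drives the exponent toward D, which is the content of Stein's lemma.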
The class of mixed sources is quite important, because all stationary sources can be regarded as mixed sources consisting of stationary ergodic sources. Therefore, the analysis of mixed sources is primitive but fundamental, and thus we first focus on the case where the null hypothesis is a mixed memoryless source and the alternative hypothesis is a stationary memoryless source. In this direction, Han [1] has first derived the single-letter formula for the first-order ε-optimum exponent in the case with mixed memoryless source X and stationary memoryless source X̄. The first main result in this paper is to establish the single-letter second-order ε-optimum exponent in the same setting by invoking the relevant asymptotic normality. The result is a substantial generalization of that of Strassen [4]. Second, we generalize this setting to the case where both the null and alternative hypotheses are mixed memoryless, to establish the single-letter first-order ε-optimum exponent. It should be emphasized that our results described here are valid for mixed memoryless sources with general mixture in the sense that the mixing weight for component sources may be an arbitrary probability measure. For the case of mixed general sources with finite discrete mixture, we reveal the deep relationship with the compound hypothesis testing problem. We notice that the compound hypothesis testing problem is important from both theoretical and practical points of view. We show that the first-order 0-optimum (respectively, exponentially r-optimum) exponent for mixed general hypothesis testing coincides with the 0-optimum (respectively, exponentially r-optimum) exponent in compound general hypothesis testing. The present paper is organized as follows. In Section 2, we fix the problem setting and review the general formula (Theorem 1) for the first-order ε-optimum exponent.
This is used to prove Theorem 5, which establishes a first-order single-letter formula for hypothesis testing in the case where both the null and alternative hypotheses are mixed memoryless. Moreover, we give the general formula (Theorem 2) for the second-order ε-optimum exponent, which is used to prove Theorem 4, which establishes a second-order single-letter formula for hypothesis testing in the case where the null hypothesis is mixed memoryless and the alternative hypothesis is stationary memoryless. In Section 3, we establish the single-letter second-order ε-optimum exponent in the case with mixed memoryless source X and stationary memoryless source X̄ (cf. Theorem 4). Furthermore, in Section 4, we consider the case where both the null and alternative hypotheses are mixed memoryless sources, and derive the single-letter first-order ε-optimum exponent (cf. Theorem 5). Section 5 is devoted to an extension from mixed memoryless sources to mixed general sources. Finally, in Section 6, we define the optimum exponent for the compound general hypothesis testing problem and discuss its relationship with hypothesis testing with mixed general sources. We conclude the paper in Section 7.

2. General Formulas for ε-Hypothesis Testing

In this section, we first review the first-order general formula and then give the second-order general formula. Throughout this paper, two lemmas from [1] (Lemmas 4.1.1 and 4.1.2) play an important role, where we write P_Z for the probability distribution of a random variable Z; proofs of these lemmas are found in [1]. We define the first- and second-order ε-optimum exponents as follows. Rate R is said to be ε-achievable if there exists an acceptance region A_n such that lim sup_{n→∞} α_n ≤ ε and lim inf_{n→∞} (1/n) log(1/β_n) ≥ R, and the first-order ε-optimum exponent is the supremum of all ε-achievable rates R (First-order ε-optimum exponent). The right-hand side of Equation (5) specifies the asymptotic behavior of the type II error probability of the form e^{−nR}. Chen [3] has derived the general limiting formula for the first-order ε-optimum exponent in terms of the divergence spectrum (Chen [3] (Theorem 1)); it is utilized to establish Theorem 5 in Section 4. Moreover, we consider the second-order ε-optimum exponent as follows. Rate S is said to be (ε, R)-achievable if there exists an acceptance region A_n such that lim sup_{n→∞} α_n ≤ ε and lim inf_{n→∞} (1/√n) log(e^{−nR}/β_n) ≥ S, and the second-order ε-optimum exponent is the supremum of all (ε, R)-achievable rates S (Second-order ε-optimum exponent). The right-hand side of Equation (9) specifies the asymptotic behavior of the type II error probability of the form e^{−(nR+√n S)}. The general limiting formula for the second-order ε-optimum exponent, the second-order counterpart of Theorem 1, is utilized to establish Theorem 4 in Section 3.2, which gives a second-order single-letter formula for hypothesis testing in the case where the null hypothesis is mixed memoryless and the alternative hypothesis is stationary memoryless. See Appendix A for the proof. □
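The divergence spectrum appearing in Chen's general formula can be visualized numerically. The sketch below (illustrative only; the Bernoulli sources, block length, and trial count are arbitrary assumptions) estimates by Monte Carlo the distribution function of the normalized divergence density rate under the null hypothesis; for stationary memoryless sources this random variable concentrates at the divergence, so the spectrum is asymptotically a step function located there:

```python
import math, random

def spectrum_cdf(p, q, n, R, trials, rng):
    """Monte Carlo estimate of Pr{ (1/n) * sum_i log(p(X_i)/q(X_i)) <= R }
    when X_1, ..., X_n are i.i.d. Bernoulli(p) (the null hypothesis)."""
    a1, a0 = math.log(p / q), math.log((1 - p) / (1 - q))
    hits = 0
    for _ in range(trials):
        z = sum(a1 if rng.random() < p else a0 for _ in range(n)) / n
        if z <= R:
            hits += 1
    return hits / trials

rng = random.Random(0)
p, q, n = 0.5, 0.8, 2000
D = p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))  # ~0.2231

lo = spectrum_cdf(p, q, n, D - 0.1, 500, rng)   # below the divergence: ~0
hi = spectrum_cdf(p, q, n, D + 0.1, 500, rng)   # above the divergence: ~1
print(lo, hi)
```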

3. Mixed Memoryless Sources

3.1. First-Order ε-Optimum Exponent

In the previous section, we have demonstrated the “limiting” formulas for general hypothesis testing. In this and the subsequent sections, we consider special but insightful cases and compute the optimum exponents in single-letter forms. Let Θ be an arbitrary probability space with a general probability measure w. Then, the hypothesis testing problem to be considered in this section is stated as follows. The null hypothesis is a mixed stationary memoryless source X, obtained by mixing, with respect to w, stationary memoryless sources X_θ (θ ∈ Θ), where the generic random variable of each X_θ takes values in 𝒳. The alternative hypothesis is a stationary memoryless source X̄ with generic random variable taking values in 𝒳. We assume 𝒳 to be a finite alphabet hereafter. To investigate this special case, we first introduce an expurgated parameter set on the basis of types, where the type T of a sequence x = (x_1, …, x_n) ∈ 𝒳^n is the empirical distribution of x, that is, T(a) = N(a|x)/n with N(a|x) the number of i such that x_i = a. Let 𝒫_n denote the set of all possible types of sequences of length n; then, it is well known that |𝒫_n| ≤ (n+1)^{|𝒳|}. Now, for each sequence we define a certain parameter set; since X_θ^n is an i.i.d. source for each θ, this set depends only on the type of the sequence x, and therefore we may index it by the type T instead of by x. Moreover, we define the “expurgated” set of parameters. Then, we have the following lemma (Han [1]). Next, we introduce two basic “decomposition” lemmas: the Upper Decomposition Lemma (see Appendix B for the proof) and the Lower Decomposition Lemma (see Appendix C for the proof). These Lemmas 3–5 are used later in order to establish Theorems 3–5. First, Theorem 3, concerning the first-order ε-optimum exponent for mixed memoryless sources, has earlier been given by Han [1] (First-order ε-optimum exponent: Han [1]). If Θ is a singleton, the formula reduces to the divergence D(X‖X̄), which is nothing but Stein's lemma [2]; this can be verified by a contradiction argument based on the definitions involved.
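The role of types above rests on the fact that an i.i.d. component source assigns the same probability to all sequences of the same type, and hence so does any mixture of i.i.d. sources. The following Python sketch (illustrative; the two binary components and the mixing weights are arbitrary assumptions) checks this numerically:

```python
from collections import Counter

def seq_type(x, alphabet):
    """Empirical distribution (type) of a sequence x over the given alphabet."""
    n = len(x)
    counts = Counter(x)
    return {a: counts.get(a, 0) / n for a in alphabet}

def iid_prob(x, p):
    """Probability of x under an i.i.d. source with symbol distribution p."""
    prob = 1.0
    for s in x:
        prob *= p[s]
    return prob

def mixed_prob(x, components, weights):
    """P_{X^n}(x) = sum_theta w(theta) * P_{X_theta^n}(x) for a finite mixture."""
    return sum(w * iid_prob(x, p) for p, w in zip(components, weights))

# Two i.i.d. binary component sources mixed with weights (0.3, 0.7).
components = [{0: 0.9, 1: 0.1}, {0: 0.4, 1: 0.6}]
weights = [0.3, 0.7]

x1 = (0, 0, 1, 0, 1, 0)   # type: {0: 2/3, 1: 1/3}
x2 = (1, 0, 0, 1, 0, 0)   # a permutation of x1, hence the same type

t1, t2 = seq_type(x1, [0, 1]), seq_type(x2, [0, 1])
p1, p2 = mixed_prob(x1, components, weights), mixed_prob(x2, components, weights)
print(t1, p1)
```

Because x1 and x2 share a type, every component (and therefore the mixture) assigns them identical probability, which is what allows parameter sets to be indexed by types rather than by individual sequences.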

3.2. Second-Order ε-Optimum Exponent

Next, we establish the second-order ε-optimum exponent for mixed sources, which is the first main result in this paper (Second-order ε-optimum exponent; see Appendix D for the proof). If Θ is a singleton, the formula can be evaluated explicitly by combining it with Theorem 3. Here, let us consider the following canonical equation for S. In view of the foregoing equations, the canonical equation is a useful expression for the second-order ε-optimum rate [7,10,11,12]; Equation (35) is the hypothesis testing counterpart of these results.
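A canonical equation of this kind can be solved numerically once the mixture is specified. The snippet below is a hypothetical instance, not a formula taken from the paper: reflecting the role of asymptotic normality, it takes the left-hand side to be a weighted mixture of Gaussian distribution functions with assumed component variances, sets it equal to ε, and solves for S by bisection:

```python
import math

def gauss_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def canonical_lhs(S, weights, variances):
    """Mixture of Gaussian CDFs evaluated at S (hypothetical canonical form)."""
    return sum(w * gauss_cdf(S / math.sqrt(V)) for w, V in zip(weights, variances))

def solve_canonical(eps, weights, variances, lo=-50.0, hi=50.0, tol=1e-10):
    """Bisection for S with canonical_lhs(S) = eps; the lhs is increasing in S."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if canonical_lhs(mid, weights, variances) < eps:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical two-component mixture with distinct divergence variances.
weights, variances = [0.4, 0.6], [0.25, 1.0]
eps = 0.1
S = solve_canonical(eps, weights, variances)
print(S)   # negative here, since eps < 1/2
```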

4. Mixed Memoryless Alternative Hypothesis

In this section, we consider the case where not only the null hypothesis but also the alternative hypothesis is a mixed memoryless source, and establish the single-letter formula for the first-order ε-optimum exponent, by which we intend to generalize Theorem 3. Let {X̄_σ}_{σ∈Σ} be a family of probability distributions on 𝒳, where Σ is equipped with a probability measure. We assume here that Σ is a compact space and that P_{X̄_σ} is continuous as a function of σ. The hypothesis testing problem considered in this section is stated as follows: the null hypothesis is a mixed memoryless source X as defined by Equations (13) and (14) in Section 3.1; the alternative hypothesis is another mixed memoryless source X̄ obtained by mixing the components X̄_σ over Σ. Let us now consider, for each P ∈ 𝒫(𝒳) (the set of probability distributions on 𝒳), an equation defined via the essential infimum with respect to the mixing measure on Σ. Since the solution of this equation depends on P, we may write it as a function of P. Notice here that, as we have assumed that Σ is compact and P_{X̄_σ} is continuous in σ, such a function indeed exists. Now, to avoid technical subtleties, we assume here that this function may be chosen so as to be continuous. For example, in the special case where the family forms a closed convex subset of 𝒫(𝒳), it is not difficult to verify that the function is uniquely determined and continuous (or even differentiable), which follows from strict convexity. Another simple example is the case where Σ is a countable set. With this notation, we have the second main result in this paper (First-order ε-optimum exponent). In the case where Σ is a singleton, the above theorem coincides with Theorem 3; therefore, this theorem is a direct generalization of Theorem 3. This also means that when both Θ and Σ are singletons, the theorem coincides with Stein's lemma (see Remark 1).
Remark 2 is also valid for this theorem. To show the theorem, for a distribution on 𝒳, let the set of δ-typical sequences with respect to that distribution be the set of all x ∈ 𝒳^n whose empirical distribution deviates from it by at most δ in each coordinate, where N(a|x) denotes the number of i such that x_i = a, and δ > 0 is an arbitrary constant. Then, it is well known that the probability of the δ-typical set tends to one as n → ∞. In the sequel, we use upper and lower bounds of this probability in the forms (45) and (46) for each σ ∈ Σ, where the correction term vanishes as n → ∞ and the constants involved are independent of n; proofs of Equations (45) and (46) appear in Appendix E. We then prove the theorem by using Equations (45) and (46) as follows. In view of Theorem 1 and Remark 6, it suffices to show the two inequalities (47) and (48). Proof of Equation (47): similarly to the derivation of Equation (A23) with Lemma 4, and using the definition of the δ-typical set together with Equation (45), we bound the probability in question for every σ. We then define two subsets of the parameter space; by their definition, there exists a small positive constant for which the required estimate holds for sufficiently large n and sufficiently small δ. Noting that the divergence density rate is the arithmetic average of n i.i.d. variables, the weak law of large numbers yields the desired convergence, so that, from Equations (54) and (57), the right-hand side of Equation (49) is upper bounded as required, which completes the proof of Equation (47). Proof of Equation (48): similarly to the derivation of Equation (A32) with Lemma 5, we use the definition of the δ-typical set and Equation (46), again partition the parameter space into two sets, and invoke the weak law of large numbers once more. Summing up, we obtain the desired bound, which completes the proof of Equation (48). □ Theorem 3 is a special case of Theorem 5 when Σ is a singleton. To illustrate the significance of Theorem 5, let us now consider the following special case.
Then, by virtue of Theorem 5, we have the following simplified result in this special case: the formula (40) can be rewritten as (66). One inequality follows from (67) and (69), and the reverse inequality from (71); as a consequence, (66) follows from (67), (69) and (71). □ One may wonder whether it is possible to deal with the second-order ε-optimum problem as well, using the arguments developed above for the first-order ε-optimum problem with mixed memoryless sources X and X̄. To do so, however, it seems that we need some novel techniques, which remain to be studied.

5. Hypothesis Testing with Mixed General Sources

We have so far investigated ε-hypothesis testing for mixed memoryless sources. In this section, we deal with a more general setting, namely hypothesis testing with mixed general sources, which inherits the crux of the argument for mixed memoryless sources (cf. Theorem 5). This leads us to a primitive but insightful “general” observation. To do so, we consider the case where both the null hypothesis and the alternative hypothesis are finite mixtures of general sources, as follows: the null hypothesis is a mixed general source consisting of K general (not necessarily memoryless) sources with positive mixing weights summing to one, and the alternative hypothesis is another mixed general source consisting of L general (not necessarily memoryless) sources, again with positive mixing weights summing to one. In this general setting, it is hard to derive a compact formula for the first-order ε-optimum exponent with ε > 0. Instead, we can obtain the following theorem in the special case of ε = 0; in particular, under an additional condition on the component sources, a special case of Corollary 1 is recovered (see Appendix F for the proof). Furthermore, we can also consider the exponentially r-optimum exponent in hypothesis testing with two mixed general sources as above (First-order exponentially r-optimum exponent). Then, it is not difficult to verify that a result analogous to Theorem 6 holds, which is a generalization of [1] (Remark 4.4.3); in particular, if the null and alternative hypotheses consist of stationary memoryless sources, the optimum exponent can be evaluated by virtue of Hoeffding's theorem.
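A finite mixture is tractable because its probability is sandwiched between a single weighted component and the componentwise maximum, so the normalized log-probability of the mixture tracks that of the dominant component to within (1/n) log(1/w_i) → 0. The following Python sketch (illustrative; the binary components, weights, and block length are arbitrary assumptions) verifies these elementary bounds numerically:

```python
import math, random

def iid_log_prob(x, p):
    """Log-probability of a binary sequence x under an i.i.d. Bernoulli(p) source."""
    ones = sum(x)
    return ones * math.log(p) + (len(x) - ones) * math.log(1 - p)

def mixed_log_prob(x, params, weights):
    """log of sum_i w_i * P_i(x), computed stably in the log domain."""
    logs = [math.log(w) + iid_log_prob(x, p) for p, w in zip(params, weights)]
    m = max(logs)
    return m + math.log(sum(math.exp(v - m) for v in logs))

rng = random.Random(1)
params, weights = [0.2, 0.7], [0.5, 0.5]
n = 1000
x = tuple(1 if rng.random() < params[0] else 0 for _ in range(n))  # drawn from component 0

lp_mix = mixed_log_prob(x, params, weights)
lp_comp = iid_log_prob(x, params[0])

# Elementary sandwich: w_0 * P_0(x) <= P(x) <= max_i P_i(x), hence the
# normalized log-probabilities differ by at most (1/n) * log(1/w_0).
gap_per_symbol = abs(lp_mix - lp_comp) / n
print(gap_per_symbol)
```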

6. Hypothesis Testing with Compound General Sources

In this section, let us consider the compound hypothesis testing problem with finitely many null hypotheses and finitely many alternative hypotheses, each of which is a general source. As is well known, this problem is expected to have a primitive but “general” relationship, at the structural level, to that of hypothesis testing with mixed sources. Specifically, compound hypothesis testing is the problem in which some pair of general sources occurs as the (null hypothesis, alternative hypothesis) pair, and the tester does not know which pair is actually in effect. This means that the acceptance region A_n cannot depend on the indices i and j of the pair. The type I error probabilities of the compound hypothesis testing are given for each general null hypothesis, and the type II error probabilities are likewise given for each general alternative hypothesis. Then, the following achievability is of our interest: rate R is said to be 0-achievable for the compound hypothesis testing if there exists an acceptance region A_n for which the required error conditions hold simultaneously for all i and j (First-order 0-optimum exponent). The first-order 0-optimum exponent is then characterized, under a suitable assumption on the component sources (see Appendix G for the proof). From Theorems 6 and 8, we immediately obtain the first-order 0-optimum exponent for the compound hypothesis testing. Similarly to Definition 5, we can also define the exponentially r-optimum exponent for the compound hypothesis testing problem; then, using an argument similar to the proof of Theorem 8, the analogous theorem can be shown. Combining Theorems 7 and 9, we immediately obtain the corresponding corollary. In particular, if the null and alternative hypotheses consist of stationary memoryless sources, the exponent is given in single-letter form, which corresponds to Equation (79).
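For intuition about the compound problem (an illustration added here, with arbitrary Bernoulli components, and not the paper's general formula): when every hypothesis is stationary memoryless, a single acceptance region must cope with every (null, alternative) pair simultaneously, so the natural binding constraint is the worst pairwise divergence min_{i,j} D(X_i‖X̄_j). The snippet tabulates this pairwise matrix:

```python
import math

def kl_bernoulli(p, q):
    """D(Ber(p) || Ber(q)) in nats."""
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))

nulls = [0.3, 0.5]          # Bernoulli parameters of the K = 2 null sources
alts = [0.8, 0.9]           # Bernoulli parameters of the L = 2 alternatives

# Pairwise divergence matrix D(X_i || X_bar_j); a compound test must handle
# every pair at once, so the smallest entry is the candidate exponent.
dmat = [[kl_bernoulli(p, q) for q in alts] for p in nulls]
worst = min(min(row) for row in dmat)
pair = min(((i, j) for i in range(2) for j in range(2)),
           key=lambda ij: dmat[ij[0]][ij[1]])
print(dmat, worst, pair)
```

Here the binding pair is (null Ber(0.5), alternative Ber(0.8)), the two closest sources in divergence.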

7. Concluding Remarks

Thus far, we have investigated the first- and second-order ε-optimum exponents in the hypothesis testing problem. First, we have studied the second-order ε-optimum problem with a mixed memoryless null hypothesis and a stationary memoryless alternative hypothesis. As we have shown in the analysis of the second-order ε-optimum exponent, we use, as a key property, the asymptotic normality of the divergence density rate for each of the component sources. We also observe that the canonical representation, first introduced in [11], is still effective for expressing the second-order ε-optimum exponent for mixed memoryless sources in the hypothesis testing problem. The first-order ε-optimum exponent in the case with mixed memoryless null and alternative hypotheses has also been established. One may wonder whether we can apply the same approach to the derivation of the second-order ε-optimum exponent in this setting. Notice that one of our key techniques to derive the first-order ε-optimum exponent is an expansion argument; a more careful evaluation of this expansion would be needed to compute the second-order ε-optimum exponent. This remains as future work. Our final goal is the problem of hypothesis testing in which both the null and alternative hypotheses are general stationary sources. This paper characterizes the first- and second-order performance of hypothesis testing for mixed memoryless sources as a simple but crucial step toward this goal. Finally, the relationship between the first-order 0-optimum (respectively, exponentially r-optimum) exponent in hypothesis testing with mixed general sources and the 0-optimum (respectively, exponentially r-optimum) exponent in compound hypothesis testing has also been demonstrated.
  1 in total

1.  Analysis on Optimal Error Exponents of Binary Classification for Source with Multiple Subclasses.

Authors:  Hiroto Kuramata; Hideki Yagi
Journal:  Entropy (Basel)       Date:  2022-04-30       Impact factor: 2.738

