
A Deformed Exponential Statistical Manifold.

Francisca Leidmar Josué Vieira¹, Luiza Helena Félix de Andrade², Rui Facundo Vigelis³, Charles Casimiro Cavalcante⁴.

Abstract

Consider μ a probability measure and P_μ the set of μ-equivalent strictly positive probability densities. To endow P_μ with the structure of a C^∞-Banach manifold, we use the φ-connection by an open arc, where φ is a deformed exponential function that is zero up to a certain point and strictly increasing from then on. This deformed exponential function has the q-deformed exponential and κ-exponential functions as particular cases. Moreover, we find the tangent space of P_μ at a point p and, as a consequence, the tangent bundle of P_μ. We define a divergence using the q-exponential function and prove that it is related to the q-divergence already known from the literature. We also show that the q-exponential and κ-exponential functions can be used to generalize the Rényi divergence.


Keywords: deformed exponential manifold; exponential arcs; information geometry; statistical manifold; φ-family

Year: 2019 PMID: 33267210 PMCID: PMC7514985 DOI: 10.3390/e21050496

Source DB: PubMed Journal: Entropy (Basel) ISSN: 1099-4300 Impact factor: 2.524

1. Introduction

Let P_μ be the set of μ-equivalent strictly positive probability densities, where μ is a given probability measure. To give P_μ a geometric structure, Amari considered the parametric case, where the construction depends on a parameter belonging to the Euclidean space [1,2]. The case of non-parametric statistical models was initially studied by Pistone and Sempi [3]. In this case, P_μ was equipped with the structure of a C^∞-Banach manifold using the Orlicz space associated with an Orlicz function. In a later work [4], Pistone and Cena proved that a probability distribution z belongs to the maximal exponential model at the probability distribution p if and only if z is connected to p by an open exponential arc. Moreover, the new manifold structure obtained from the connection by an open exponential arc is equivalent to the one defined in [3,5]. Conditions for connecting two probability densities by an open exponential arc were recently studied in [6].

The deformed exponential function was first introduced by Naudts in [7] and studied in more detail later in [8,9]. In [10], the authors propose a generalization of the exponential family based on replacing the exponential function exp with a deformed exponential function φ. A φ-family of probability distributions is then proposed. This family was modeled on Musielak–Orlicz spaces, and a Banach manifold structure for P_μ is obtained. As a consequence of this model, a more general form of the Kullback–Leibler divergence was obtained and called the φ-divergence. Furthermore, the arcs for the deformed exponential function were investigated, and necessary and sufficient conditions were provided for connecting any two probability distributions by a φ-arc [11]. This result was later generalized in [12,13]. A generalization of exponential arcs was defined in [14], where it was also proved that a probability distribution z belongs to the φ-family if and only if z is connected to p by an open φ-arc.
An example of a deformed exponential function is the q-exponential, which was used by Loaiza and Quiceno [15] to define an atlas modeled on essentially bounded function spaces. The charts of this atlas are defined in terms of connections by a one-dimensional q-exponential model and of the q-deformations of cumulant maps [4]. Moreover, the tangent space and the tangent bundle were constructed using equivalence classes.

In this paper we endow P_μ with the structure of a C^∞-Banach manifold using a deformed exponential function. This deformed exponential function is zero up to a certain point and from then on behaves like the “classical” exponential function, which is strictly increasing. Particular cases of this function are the q-deformed exponential and the κ-exponential. To build this structure, as in [15], we divide P_μ into equivalence classes using the connection provided by generalized exponential arcs as defined in [14]. We also define a set that is the connected component of P_μ and will be the generalized φ-family of probability distributions. Moreover, by means of the derivative of the transition map, we find the tangent space and, consequently, the tangent bundle. In addition, we define a divergence using the q-exponential function, which is related to the q-divergence defined in [15]. Finally, we show that the κ-exponential and q-exponential functions can be used in the generalization of Rényi’s divergence.

The rest of the paper is organized as follows. In Section 2 we revisit some important results about the q-exponential statistical manifold and give a brief introduction to Musielak–Orlicz spaces. Section 3 contains our main results: we discuss generalized open exponential arcs and build generalized φ-families of probability distributions. Afterwards, in Section 4, we find the derivative of the transition map and, as a consequence, the tangent space and tangent bundle. In Section 5 we define a divergence using the q-exponential function, and we use those results to prove that the q-exponential and κ-exponential functions can be used to generalize Rényi’s divergence. Finally, in Section 6 we state our conclusions and future perspectives.

2. Background and Preliminary Results

The deformed exponential function that we will use to equip P_μ with the structure of a C^∞-Banach manifold has the q-exponential function as a particular case, and the parametrization domain is obtained from a Musielak–Orlicz space. For this reason, this section briefly presents results on the q-exponential manifold and on Musielak–Orlicz spaces.

2.1. A q-Exponential Statistical Banach Manifold

In the same way as in [15], we consider a probability space and . The q-deformed exponential function is given by [16] We say that satisfies that there are

Consider the following partition of into equivalence classes: are related () if and only if there exists a one-dimensional q-exponential model connecting p and z, according to Equation (2). As a consequence, the measures and are equivalent and the essentially bounded function spaces and are equal. We need to define a family of q-deformations of the moment-generating functional denoted by , that is, where We also define a family of cumulant-generating functionals where Notice that , where is the open unit ball in . Some properties of the functional are described in the theorem below.

([15], Theorem 9). The cumulant generating function The function is a probability density on , since ; The functional is analytic in .

The function is used to define the q-exponential models where Moreover, the set is a Banach space and is the open unit ball of . Since we obtain . Therefore and consequently is well defined. The inverse of is given by [15] The transition map , where is the range of , is expressed as [15] where with and . The map is injective and the set is open in the -topology, where . Hence, the transition map is a topological homeomorphism and consequently the collection of pairs is a -atlas modeled on . Then, is a -Banach manifold, since is a parametrization.

There exists a relation between the constructed manifold and the Tsallis relative entropy. In fact, let us consider, for and , the following function where . Given p and z in , the Tsallis divergence, also called the q-divergence of z with respect to p, is expressed by

([15], Proposition 16). Taking p, z in , with equality iff .
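For readers who want to experiment with the q-deformed exponential numerically, the following is a minimal sketch. It assumes the standard Tsallis form exp_q(u) = [1 + (1 − q)u]_+^{1/(1−q)} with 0 < q < 1, together with its inverse, the q-logarithm ln_q(x) = (x^{1−q} − 1)/(1 − q); the function names `exp_q` and `ln_q` are illustrative and are not notation taken from the paper.

```python
import math


def exp_q(u: float, q: float) -> float:
    """q-deformed exponential, assumed form [1 + (1 - q) u]_+ ** (1 / (1 - q))."""
    if not 0.0 < q <= 1.0:
        raise ValueError("this sketch only covers 0 < q <= 1")
    if q == 1.0:
        return math.exp(u)  # the ordinary exponential is the q = 1 case
    base = 1.0 + (1.0 - q) * u
    if base <= 0.0:
        # For 0 < q < 1 the function vanishes for u <= -1 / (1 - q)
        # and is strictly increasing afterwards.
        return 0.0
    return base ** (1.0 / (1.0 - q))


def ln_q(x: float, q: float) -> float:
    """q-logarithm, the inverse of exp_q on the region where exp_q is positive."""
    if x <= 0.0:
        raise ValueError("ln_q is only defined for x > 0")
    if q == 1.0:
        return math.log(x)
    return (x ** (1.0 - q) - 1.0) / (1.0 - q)
```

With q = 0.5, `exp_q` is zero on (−∞, −2] and strictly increasing on (−2, ∞), which is the qualitative behaviour the abstract attributes to the deformed exponential.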

2.2. Musielak–Orlicz Spaces and φ-Families of Probability Distributions

Consider a σ-finite, non-atomic measure space. Let , where is the linear space of all real-valued, measurable functions on T, with equality μ-a.e. . The map is a Musielak–Orlicz function if, for μ-a.e. (almost everywhere) , the following conditions hold [17]: is convex and lower semi-continuous; and ; is measurable for each . Since items (1) and (2) hold, it follows that is not equal to 0 or ∞ in the interval .

Consider the functional , for any . The Musielak–Orlicz space, the Musielak–Orlicz class and the Morse–Transue space associated with a Musielak–Orlicz function are defined, respectively, by and Consider the Luxemburg norm and the Orlicz norm where is the Fenchel conjugate of The Musielak–Orlicz space equipped with either of these two norms is a Banach space. The norms above are equivalent and the inequalities hold for all . For more details see [18,19].

Define the Musielak–Orlicz function as where is a measurable function such that is μ-integrable and we write , and , in the place of , and respectively. The parametrization was defined in [10] as where for each , and The map is called the normalizing function, and it is defined in such a way that is in . We have that , and are open for any measurable such that and are in . The transition map is a -isomorphism and consequently is a parametrization. In the next section, we will use the generalized open exponential arcs to build a parametrization of .
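The Luxemburg norm can be made concrete with a small numerical sketch. It assumes the usual definition, the infimum of all λ > 0 such that the modular of u/λ is at most 1, and, purely for illustration, replaces the non-atomic measure space by a finite set of points with positive weights. The names `luxemburg_norm`, `weights` and `Phi` are hypothetical.

```python
from typing import Callable, Sequence


def luxemburg_norm(
    u: Sequence[float],
    weights: Sequence[float],
    Phi: Callable[[int, float], float],
    rel_tol: float = 1e-12,
) -> float:
    """Approximate inf{lam > 0 : sum_t w_t * Phi(t, u_t / lam) <= 1} by bisection.

    Assumption: the measure is a finite weighted counting measure, which is only
    a discretization of the non-atomic setting used in the text.
    """
    if all(x == 0.0 for x in u):
        return 0.0

    def modular(lam: float) -> float:
        return sum(w * Phi(t, x / lam) for t, (x, w) in enumerate(zip(u, weights)))

    # Find an upper bound: the modular is non-increasing in lam for convex Phi
    # with Phi(t, 0) = 0.
    hi = 1.0
    while modular(hi) > 1.0:
        hi *= 2.0
    lo = 0.0
    for _ in range(200):  # bounded number of bisection steps
        if hi - lo <= rel_tol * hi:
            break
        mid = 0.5 * (lo + hi)
        if modular(mid) <= 1.0:
            hi = mid
        else:
            lo = mid
    return hi
```

For the choice Phi(t, x) = x², the result should agree with the weighted Euclidean norm, which gives a simple sanity check; a t-dependent choice such as a shifted deformed exponential can be passed in the same way.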

3. Construction of Generalized φ-Families of Probability Distributions

Let , be a σ-finite, non-atomic measure space and consider a deformed exponential function . In other words, is convex for μ-a.e. and the limits , for μ-a.e. hold. In this work we consider two additional conditions on the deformed exponential : for all where given a measurable function such that , we have

For a measurable function , we define the q-deformed exponential function as , where and . In this case, the q-deformed exponential function satisfies condition (a1) with . In the next example, we prove that the q-deformed exponential function satisfies condition (a2) for . Given If If By the convexity property of Then, any positive function

Now, we provide an example of a deformed exponential function that satisfies condition (a1) but does not satisfy condition (a2). Consider the function where the measure μ is σ-finite and non-atomic. Note that φ is convex, and satisfies where According to [ Let us define we obtain Hence, we can write On the other hand, we also have which shows that (a2) is not satisfied.

We say that p and z in for each According to the result proved in [11], we have that for each . Indeed, for , we clearly have ; for , the convexity of the function ensures that . Integrating the inequality we obtain Since satisfies then , for .

Now we will define, using generalized exponential arcs, the sets that are important for the construction of the generalized φ-family of probability distributions. Let us define as p and z are φ-connected by an open arc, we have that , for each . Hence, , i.e., For , where , consider the set We will show that the set is a generalized φ-family of probability distributions. Consider the partition of into equivalence classes using the following relation: given p, z ∈ we say that if and only if p and z are φ-connected by an open arc. This equivalence relation is necessary to define an atlas modeled on Banach spaces. Consider the Musielak–Orlicz space, given as and the set

The set is a closed subspace.
Clearly Given there exist such that and Considering we have that . Finally, given we obtain , since . It remains to show that is closed. For this, let convergent μ-a.e. for . This implies that there exists a subsequence , such that a.e. . Then, for each we can find with a.e. , for each The compactness of ensures that the cover admits a finite subcover. Let be the set of elements that constitute the finite subcover. Taking it follows that a.e. , for each Passing to the limit, we obtain a.e. , for each Therefore, and consequently is closed. □

Define the set The set Let . Then, there exists , such that for each and . Considering , we have that for any it occurs and consequently . Given we denote . The inequality implies For we can write Then, we have for any . Hence, and since is a subspace, we obtain . As a consequence, is contained in and therefore the set is open. □

The set defined in (14) is important to guarantee that may be in . Now, we establish a relationship between the connection by an open arc and similar to the one proved in [14]. Fix Since z is φ-connected to p by an open arc, there exists an interval , such that , for each . Considering , we have where and . Therefore . Another conclusion that arises from the fact that q is φ-connected to p by an open arc is that . Hence, , for each and and . Conversely, taking , we get , and consequently with . □

One should notice that as a consequence of Proposition 2, given φ-connected by an open arc, the random variable . In fact, this follows for two reasons: as it follows that and as z is φ-connected to p by an open arc we have for each . Since the function φ-arc is injective, in Proposition 2 only the case Let is then well defined. Moreover, is strictly increasing. Proposition 2 ensures that , where and . Then, we can find such that is μ-integrable. Given , taking , we obtain and consequently is μ-integrable, for every and for each . This proves that is well defined.
By the dominated convergence theorem, the map is continuous, and . Hence, given , we have that is strictly increasing. □

Fix Given there exists , such that and , μ-a.e. for each . Then, for each which ensures that q is φ-connected to p by an open arc. Conversely, take φ-connected to p by an open arc. In this way there exists , where and , for each . Note that because , for each and is non-decreasing. Suppose that , there exists such that Equations (15) and (16) ensure that for each . Therefore, by Lemma 3 there exists a unique satisfying , such that . Since is such that , it follows that and consequently is unique. Hence, for each , which is a contradiction. □

By Corollary 3 the sets are the connected components of . Then, we need to find a domain for the parametrization in such a way that the image is . We will make some considerations similar to the ones presented in [10]. Remark that, for , is not necessarily in . Define , such that the density is contained in . We have that the maximal open domain of is contained in . Note that is well defined, since . It can then be proved that is convex, and as a consequence is continuous, since is open by Lemma 2. Let be the operator acting on the set of real-valued functions given by , where is the right-derivative of Also, notice that the function can assume both positive and negative values. Consider the closed subspace Observe that the image of will be contained in , since the domain of is restricted to a . By the convexity property of we have Hence, we have that Thus, it follows that in order to be in . Given a measurable function such that is a probability density in . Consider the set where and Given Given , we have and Hence, for each , which implies in . In addition, for each and therefore, . □

The set Consider the sets and Define the functions The function f is well defined and continuous, since is continuous; the map g is well defined in and continuous, since and are continuous. Moreover, given , in particular and .
By the continuity of f and g respectively, there exist , such that for each , we have and for each , we have . Taking , we obtain that and consequently is open in . □

Clearly . Consider the measurable functions , where and belong to . The parametrizations and have a transition map given as Given and being the normalizing functions associated with and respectively, and the functions and are such that So, we have Multiplying Equation (18) by and integrating with respect to the measure once the function v is in , we obtain and we can write Therefore Hence, the transition map can be expressed as for every Showing that w and are in and the spaces and have equivalent norms we obtain that this transition map will be of class . In the next corollary we have that Musielak–Orlicz spaces are equal. The proof follows the one provided in [14].

Let We have that z is φ-connected to p by an open arc. Then, by Corollary 3, we have that . The result follows immediately from [10]. □

It follows from Corollary 1 that is of class , and consequently, the set is open in ([14], Proposition 8).

The relation given in Definition 2 is an equivalence relation. Since the reflexivity and symmetry properties follow immediately from the definition, we will only prove transitivity. Let be , such that, with , , and . Consider is defined with , where , . Therefore z and s are φ-connected. □

As a consequence of Corollary 3 and Proposition 6 we have that the φ-families are maximal, in the sense that or if , then . Hence, we can write the following proposition. The collection
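The role of the normalizing function in this section can be illustrated numerically. The sketch below rests on several assumptions that are not recoverable from the text above: the density is taken to have the form φ(c + u − ψ(u)·u₀) as in the φ-family construction of [10], u₀ is set to the constant function 1, φ is the q-deformed exponential with 0 < q < 1, and the measure space is a finite weighted set. The helper names (`exp_q`, `ln_q`, `normalizing_value`) are hypothetical.

```python
from typing import Sequence


def exp_q(u: float, q: float) -> float:
    """Assumed q-deformed exponential [1 + (1 - q) u]_+ ** (1 / (1 - q)), 0 < q < 1."""
    base = 1.0 + (1.0 - q) * u
    return base ** (1.0 / (1.0 - q)) if base > 0.0 else 0.0


def ln_q(x: float, q: float) -> float:
    """q-logarithm, inverse of exp_q where exp_q is positive."""
    return (x ** (1.0 - q) - 1.0) / (1.0 - q)


def normalizing_value(
    c: Sequence[float], u: Sequence[float], weights: Sequence[float], q: float
) -> float:
    """Find psi with sum_t w_t * exp_q(c_t + u_t - psi) = 1 by bisection.

    Assumption: u0 = 1, so the normalization is a plain shift by psi.
    """

    def excess(psi: float) -> float:
        total = sum(w * exp_q(ci + ui - psi, q) for ci, ui, w in zip(c, u, weights))
        return total - 1.0

    lo, hi = -1.0, 1.0
    while excess(lo) < 0.0:  # total mass decreases as psi grows
        lo *= 2.0
    while excess(hi) > 0.0:
        hi *= 2.0
    for _ in range(200):  # bounded number of bisection steps
        mid = 0.5 * (lo + hi)
        if excess(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Starting from a density p with c = ln_q(p), a constant perturbation u ≡ k should be absorbed entirely by the normalization (ψ = k), while a non-constant u yields a new density in the same family.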

4. The Tangent Bundle

In the previous section, the expression of the transition map was important to guarantee that could be equipped with a -Banach structure. Now, we will use the transition map to find the tangent space of at the point and the tangent bundle. Given , we consider the triple where is the φ-family, is the parametrization and v is a vector in which is contained in the vector space . Let us define the following equivalence relation: The class is called the tangent vector of at p, and the set of all classes is called the tangent space and is denoted by . For more details we refer the reader to [20].

The vector is the velocity vector of a curve in the parametrization domain. In fact, let and be charts about and a curve such that , for some . Taking , we have that . Moreover, and . Using random variables we have that . Hence, by the chain rule we can write We will denote as the tangent bundle, which is defined as the disjoint union of , that is,

The local representation of the tangent bundle Given we have that the derivative of the map evaluated at w in the direction of is of the form In fact, by the convexity of , we have that Since we have that is μ-integrable, and consequently, is μ-integrable. Then, from the dominated convergence theorem it follows that (21) holds. The tangent bundle is then denoted by Its charts are expressed as which was defined in the collection of open subsets of . Then, since Equation (21) holds, the transition mappings are given for by □

5. Divergence in Statistical Manifolds

This section is divided into two parts. The first is devoted to the definition of the φ-divergence for the case where φ is the deformed exponential defined in Section 3, and to the definition of a divergence using the q-exponential. In the second part, we prove that the q-exponential and κ-exponential functions can be used to generalize the divergence of Rényi [13,21].

5.1. The φ-Divergence and q-Divergence

Defining the divergence associated with the normalizing function requires the convexity of . This is guaranteed by the fact that is a subspace and is convex [10]. In this way, the Bregman divergence associated with is given by [22,23,24] Then, we can define the divergence related to the generalized φ-family as . Given , we have that and as a consequence . Supposing is continuously differentiable, it follows that the divergence does not depend on the parametrization of . This allows us to define the divergence between the probability densities and , for as Note that the divergence is well defined inside the same φ-family. The condition if p and z are not in the same φ-family extends the divergence for . We will denote this divergence by and call it the φ-divergence [10]. Given , we have that , then is strictly convex in , and therefore is always non-negative and is equal to zero if and only if .

In the following example, we find the φ-divergence for the case in which the deformed exponential function is the q-deformed exponential function. Consider the q-exponential where and Therefore The divergence in (25) is related to the q-divergence defined in (6). In fact, Then and we can define the metric as where is the set of vector fields and the set of functions . This map is well defined, since and Notice that considering we will have that the divergence in (25) coincides with the q-divergence defined in [15], the metric in (26) coincides with the metric given in [25], and the family of covariant derivatives (connections) given by where and coincides with the family of covariant derivatives (connections) given in [25]. The notation means the derivative of A in the direction of w at the point z when .
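As a numerical point of reference for the q-divergence mentioned above, the sketch below evaluates the standard Tsallis relative entropy on finite probability vectors. The formula used, D_q(p‖z) = −Σ p·ln_q(z/p) = (1 − Σ p^q z^{1−q})/(1 − q), is an assumption: it is the usual form in the literature and may differ in normalization or argument order from Equation (6), which is not recoverable here. The function name `tsallis_divergence` is illustrative.

```python
import math
from typing import Sequence


def tsallis_divergence(p: Sequence[float], z: Sequence[float], q: float) -> float:
    """Standard Tsallis relative entropy of strictly positive probability vectors.

    Assumed form: (1 - sum_i p_i**q * z_i**(1 - q)) / (1 - q) for q != 1,
    and the Kullback-Leibler divergence for q == 1.
    """
    if any(x <= 0.0 for x in p) or any(x <= 0.0 for x in z):
        raise ValueError("both arguments must be strictly positive")
    if q == 1.0:
        return sum(pi * math.log(pi / zi) for pi, zi in zip(p, z))
    mix = sum(pi ** q * zi ** (1.0 - q) for pi, zi in zip(p, z))
    return (1.0 - mix) / (1.0 - q)
```

For 0 < q < 1 the value is non-negative and vanishes when p = z, mirroring the property stated for the φ-divergence, and it approaches the Kullback–Leibler divergence as q tends to 1.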

5.2. Generalization of Divergence of Rényi and

Now, we recall that the Rényi divergence is related to the φ-divergence, and we will see that condition (a2) is a necessary and sufficient condition for the existence of the generalization of the Rényi divergence. Consequently, we prove that the q-deformed exponential and κ-exponential functions can be used in the generalization of the Rényi divergence. A generalization of the Rényi divergence of order was defined in [12] as where satisfies Equation (12). This generalization in the case is defined as the limit and The limits in (28) and (29), under some conditions, are finite-valued and converge to the φ-divergence: In the next proposition we have that condition (a2) is a necessary and sufficient condition to connect two probability densities of by an open arc.

([12], Proposition 1). Let μ be a non-atomic measure. Consider

In Example 1, where the measure was assumed to be non-atomic, we have that the q-exponential function satisfies condition (a2). Then, by Proposition 9 and Equation (27), we conclude that this function can be used in the generalization of the Rényi divergence. Analogously, the function given in Example 2 cannot be used in the generalization of the Rényi divergence. Supposing that is non-atomic, the next proposition presents an equivalent criterion for a deformed exponential function to satisfy condition (a2).

([12], Proposition 3). Let

In the next example, we will show a class of deformed exponential functions that can be used in the generalization of the Rényi divergence. We will show that the Kaniadakis κ-exponential Its inverse, the so-called κ-logarithm We will verify that there exists Some manipulations imply that the derivative of Consequently, the difference If Then, Therefore, by Proposition 10, the Kaniadakis κ-exponential satisfies condition (a2). As a consequence of Example 4 and Proposition 9, we have that can be used in the generalization of the Rényi divergence.
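The Kaniadakis κ-exponential and κ-logarithm discussed in the example can be checked numerically with a short sketch. It assumes the standard definitions exp_κ(u) = (κu + √(1 + κ²u²))^{1/κ} and ln_κ(x) = (x^κ − x^{−κ})/(2κ) for κ ≠ 0, with the ordinary exponential and logarithm as the κ = 0 case; the function names are illustrative.

```python
import math


def exp_kappa(u: float, kappa: float) -> float:
    """Kaniadakis kappa-exponential, (kappa*u + sqrt(1 + kappa^2 u^2)) ** (1/kappa).

    Written via asinh, since asinh(x) = log(x + sqrt(1 + x^2)), which avoids
    cancellation for large negative arguments.
    """
    if kappa == 0.0:
        return math.exp(u)
    return math.exp(math.asinh(kappa * u) / kappa)


def ln_kappa(x: float, kappa: float) -> float:
    """Kaniadakis kappa-logarithm, the inverse of exp_kappa on (0, infinity)."""
    if x <= 0.0:
        raise ValueError("ln_kappa is only defined for x > 0")
    if kappa == 0.0:
        return math.log(x)
    return (x ** kappa - x ** (-kappa)) / (2.0 * kappa)
```

Unlike the q-exponential with 0 < q < 1, the κ-exponential is strictly positive on the whole real line and satisfies exp_κ(u)·exp_κ(−u) = 1, which is a convenient consistency check.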

6. Conclusions

In this paper we constructed a parametrization of the statistical Banach manifold using a deformed exponential function. We found the tangent space of P_μ at p, and we also constructed the tangent bundle of P_μ. We defined the φ-divergence where φ is the q-exponential function, and we established a relation between this divergence and the q-divergence defined in [15]. Another important contribution is that the q-exponential and κ-exponential functions can be used to generalize the divergence of Rényi. A perspective for future work is to define the parallel transport, now that the tangent space has been found. We also intend to construct a parametrization for using a deformed exponential function satisfying (a1) in the case where for each measurable function , with , there exists a measurable function , such that , for each .
