
Inequalities for Jensen-Sharma-Mittal and Jeffreys-Sharma-Mittal Type f-Divergences.

Paweł A. Kluza

Abstract

In this paper, we introduce new divergences, called the Jensen–Sharma–Mittal and Jeffreys–Sharma–Mittal divergences, in relation to convex functions. Some theorems, which give the lower and upper bounds for the two newly introduced divergences, are provided. The obtained results imply some new inequalities corresponding to known divergences. Some examples, which show that these are generalizations of the Rényi, Tsallis, and Kullback–Leibler types of divergences, are provided in order to show a few applications of the new divergences.


Keywords:  Csiszár f-divergence; Jeffreys–Sharma–Mittal divergence; Jensen–Sharma–Mittal divergence; Sharma–Mittal f-divergence; convex function

Year:  2021        PMID: 34945994      PMCID: PMC8700545          DOI: 10.3390/e23121688

Source DB:  PubMed          Journal:  Entropy (Basel)        ISSN: 1099-4300            Impact factor:   2.524


1. Introduction

The Sharma–Mittal entropy was introduced as a new measure of information with two parameters [1]. It has previously been studied in the context of multi-dimensional harmonic oscillator systems [2]. This entropy can also be formulated in terms of exponential families, to which many common statistical distributions, including Gaussians and discrete multinomials (that is, normalized histograms), belong. In physical applications it plays a major role in the field of thermo-statistics [3]. The Sharma–Mittal entropy is also applied to the analysis of the results of machine learning methods [4,5]. Additionally, the divergence based on this entropy can serve as a cost function in the context of so-called Twin Gaussian Processes [6]. It was originally shown in [7] that the Sharma–Mittal entropy generalizes both the Tsallis and the Rényi entropy, which arise as its limiting cases. In [8], the authors suggested a physical meaning of the Sharma–Mittal entropy, namely the free energy difference between the equilibrium and the off-equilibrium distribution. Recently, a manuscript was published showing, in opposition to the work [8], that outside of convenient thermodynamic systems the Sharma–Mittal entropy does not reduce only to the Kullback–Leibler entropy. In [9], Verma and Merigó present the use of the Sharma–Mittal entropy in an intuitionistic fuzzy environment. Additionally, in [5], Koltcov et al. demonstrate that the Sharma–Mittal entropy is a tool for selecting both the number of topics and the values of hyper-parameters while simultaneously controlling for semantic stability, which none of the existing metrics can do. Other applications of this entropy include interesting results in the cosmological setting, such as black hole thermodynamics [10]; namely, it helps to describe the currently accelerating universe by using the vacuum energy in a suitable manner [11]. In addition, the authors of [12] have established the relation between anomalous diffusion processes and the Sharma–Mittal entropy. This paper builds on publications in which we introduced new types of f-divergences [13,14,15,16]. Here we generalize Sharma–Mittal-type divergences in order to obtain new types of divergences, and hence inequalities from which new results and generalizations for known divergences can be derived, estimating the lower and upper bounds that determine the level of the uncertainty measure.
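As a quick illustration of these limiting cases, the following sketch (ours, not from the paper; it assumes SymPy and uses the Sharma–Mittal divergence formula recalled in Section 2, with S standing for the inner sum over the distributions) checks symbolically that the divergence reduces to the Rényi and Tsallis forms:

```python
import sympy as sp

# S stands for the inner sum  sum_i p_i**alpha * q_i**(1 - alpha)  (S > 0).
alpha, beta, S = sp.symbols('alpha beta S', positive=True)

# Sharma-Mittal divergence of order alpha and degree beta.
SM = (S**((1 - beta) / (1 - alpha)) - 1) / (beta - 1)

# beta -> 1 recovers the Renyi divergence  log(S)/(alpha - 1); expected output: 0.
print(sp.simplify(sp.limit(SM, beta, 1) - sp.log(S) / (alpha - 1)))

# beta -> alpha recovers the Tsallis divergence  (S - 1)/(alpha - 1); expected output: 0.
print(sp.simplify(sp.limit(SM, beta, alpha) - (S - 1) / (alpha - 1)))
```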

2. Sharma–Mittal Type Divergences

Throughout, $\mathbb{R}_+$ and $\mathbb{R}_{++}$ denote the sets of non-negative and positive numbers, respectively, i.e., $\mathbb{R}_+ = [0, \infty)$ and $\mathbb{R}_{++} = (0, \infty)$. Let $P = (p_1, \ldots, p_n)$ and $Q = (q_1, \ldots, q_n)$ with $p_i, q_i \in \mathbb{R}_{++}$, $\sum_{i=1}^{n} p_i = \sum_{i=1}^{n} q_i = 1$. The relative entropy (also called the Kullback–Leibler divergence) is defined by (see [17])

$$D(P \,\|\, Q) = \sum_{i=1}^{n} p_i \log \frac{p_i}{q_i}. \qquad (1)$$

In the above definition, based on continuity arguments, we use the conventions $0 \log \frac{0}{q} = 0$ and $p \log \frac{p}{0} = +\infty$. Additionally, $0 \log \frac{0}{0} = 0$.

Let $f : \mathbb{R}_+ \to \mathbb{R}$ be a convex function on $\mathbb{R}_+$, and let $P$, $Q$ be as above. The Csiszár f-divergence is defined by (see [15])

$$I_f(P, Q) = \sum_{i=1}^{n} q_i f\!\left(\frac{p_i}{q_i}\right), \qquad (2)$$

with the conventions $0 f\!\left(\frac{0}{0}\right) = 0$ and $0 f\!\left(\frac{a}{0}\right) = a \lim_{t \to \infty} \frac{f(t)}{t}$ for $a > 0$ (see [18,19,20]).

The Tsallis divergence of order $\alpha$ ($\alpha > 0$, $\alpha \neq 1$) is defined by (see [17])

$$T_\alpha(P \,\|\, Q) = \frac{1}{\alpha - 1} \left( \sum_{i=1}^{n} p_i^{\alpha} q_i^{1-\alpha} - 1 \right).$$

The Rényi divergence of order $\alpha$ ($\alpha > 0$, $\alpha \neq 1$) is defined by (see [17,21])

$$R_\alpha(P \,\|\, Q) = \frac{1}{\alpha - 1} \log \left( \sum_{i=1}^{n} p_i^{\alpha} q_i^{1-\alpha} \right).$$

The Sharma–Mittal divergence of order $\alpha$ and degree $\beta$ is defined by (see [4])

$$SM_{\alpha,\beta}(P \,\|\, Q) = \frac{1}{\beta - 1} \left[ \left( \sum_{i=1}^{n} p_i^{\alpha} q_i^{1-\alpha} \right)^{\frac{1-\beta}{1-\alpha}} - 1 \right] \qquad (3)$$

for all $\alpha, \beta > 0$, $\alpha \neq 1$ and $\beta \neq 1$.

Let $\varphi$ be a convex function on an interval $I \subset \mathbb{R}$. Let $x_i \in I$ and $\lambda_i \geq 0$ with $\sum_{i=1}^{n} \lambda_i = 1$ for $i = 1, \ldots, n$. Jensen's inequality is as follows (see [22]):

$$\varphi\!\left(\sum_{i=1}^{n} \lambda_i x_i\right) \leq \sum_{i=1}^{n} \lambda_i \varphi(x_i). \qquad (4)$$

When the function $\varphi$ is convex and the function $\psi$ is convex and increasing, then the composition $\psi \circ \varphi$ is convex. We assume that the probabilities $p_i > 0$ and $q_i > 0$ for $i = 1, \ldots, n$. It is known (see [4]) that the Sharma–Mittal divergence tends to the Rényi divergence if $\beta \to 1$, to the Tsallis divergence if $\beta \to \alpha$, and to the Kullback–Leibler divergence if both $\alpha$ and $\beta$ tend to 1.

Let $h$ be a differentiable function. Then the Sharma–Mittal h-divergence is defined as follows:

$$SM_{h,\alpha,\beta}(P \,\|\, Q) = \frac{1}{\beta - 1}\, h\!\left( \left( \sum_{i=1}^{n} p_i^{\alpha} q_i^{1-\alpha} \right)^{\frac{1-\beta}{1-\alpha}} \right) \qquad (5)$$

for all $\alpha, \beta > 0$, $\alpha \neq 1$ and $\beta \neq 1$. If we assume that $h(x) = x - 1$, then (5) becomes the Sharma–Mittal divergence (3). When $h(x) = \log x$ for all $x > 0$, then (5) becomes the Rényi divergence of order $\alpha$: substituting $h(x) = \log x$, we have

$$\frac{1}{\beta - 1} \cdot \frac{1-\beta}{1-\alpha} \log \left( \sum_{i=1}^{n} p_i^{\alpha} q_i^{1-\alpha} \right) = \frac{1}{\alpha - 1} \log \left( \sum_{i=1}^{n} p_i^{\alpha} q_i^{1-\alpha} \right) = R_\alpha(P \,\|\, Q).$$

Hence, from (5), let $h$ be differentiable with respect to its argument and assume that $h(1) = 0$ and $h'(1) = 1$. Then,

$$\lim_{\beta \to 1} SM_{h,\alpha,\beta}(P \,\|\, Q) = h'(1)\, \frac{1}{\alpha - 1} \log \left( \sum_{i=1}^{n} p_i^{\alpha} q_i^{1-\alpha} \right) = R_\alpha(P \,\|\, Q).$$

Hence, the Sharma–Mittal h-divergence tends to the Rényi divergence of order $\alpha$. If, additionally, $\alpha$ tends to 1, then, based on the proof of the corresponding limit for the Rényi divergence, it tends to the Kullback–Leibler divergence (1).

Now we define a new generalized Sharma–Mittal divergence as follows:

$$SM_{h,f_\alpha,\beta}(P \,\|\, Q) = \frac{1}{\beta - 1}\, h\!\left( \left( \sum_{i=1}^{n} q_i f_\alpha\!\left(\frac{p_i}{q_i}\right) \right)^{\frac{1-\beta}{1-\alpha}} \right), \qquad (6)$$

where $h$ is an increasing, non-negative and differentiable function for $x > 0$. We assume that $\{f_\alpha\}$ is a given family of functions such that $f_\alpha(1) = 1$ for $\alpha > 0$, $\alpha \neq 1$, which are increasing and non-negative for $t \geq 0$, and such that for every $t$ the function $\alpha \mapsto f_\alpha(t)$ is differentiable. According to [16], if we substitute the function $f_\alpha(t) = t^\alpha$ from the family, then (6) reduces to the Sharma–Mittal h-divergence (5). We assume that $h(1) = 0$ and $h'(1) = 1$. Then, as before,

$$\lim_{\beta \to 1} SM_{h,f_\alpha,\beta}(P \,\|\, Q) = \frac{1}{\alpha - 1} \log \left( \sum_{i=1}^{n} q_i f_\alpha\!\left(\frac{p_i}{q_i}\right) \right). \qquad (7)$$

The function $f_\alpha$ is the generalization of the function $f$ which is used, for example, in the Csiszár f-divergence. The condition $h(1) = 0$, $h'(1) = 1$ means that the limit (7) of the generalized Sharma–Mittal divergence is equal to the generalized $f_\alpha$-Rényi divergence introduced in [16]; hence, we have implications for generalized forms of entropies. Additionally, when $h(x) = x - 1$ and $f_\alpha(t) = t^\alpha$ in (6), we recover the Sharma–Mittal divergence (3) itself. This work is more theoretical than practical; therefore, the implications are formulated in the mathematical setting, i.e., a general model is constructed from which the known specific cases follow.
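The definitions above translate directly into code. The following minimal numerical sketch (ours; the function names and test distributions are our choices, assuming NumPy and strictly positive distributions) evaluates the divergences and confirms the limiting cases numerically:

```python
import numpy as np

def kl(p, q):
    """Kullback-Leibler divergence D(P||Q), Equation (1)."""
    return float(np.sum(p * np.log(p / q)))

def tsallis(p, q, alpha):
    """Tsallis divergence of order alpha (alpha > 0, alpha != 1)."""
    return float((np.sum(p**alpha * q**(1 - alpha)) - 1) / (alpha - 1))

def renyi(p, q, alpha):
    """Renyi divergence of order alpha (alpha > 0, alpha != 1)."""
    return float(np.log(np.sum(p**alpha * q**(1 - alpha))) / (alpha - 1))

def sharma_mittal(p, q, alpha, beta):
    """Sharma-Mittal divergence of order alpha and degree beta, Equation (3)."""
    s = np.sum(p**alpha * q**(1 - alpha))
    return float((s**((1 - beta) / (1 - alpha)) - 1) / (beta - 1))

p = np.array([0.2, 0.5, 0.3])
q = np.array([0.4, 0.4, 0.2])
a = 0.7

print(sharma_mittal(p, q, a, a + 1e-9), tsallis(p, q, a))   # beta -> alpha: Tsallis
print(sharma_mittal(p, q, a, 1 + 1e-9), renyi(p, q, a))     # beta -> 1: Renyi
print(sharma_mittal(p, q, 1 + 1e-6, 1 + 2e-6), kl(p, q))    # alpha, beta -> 1: KL
```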

3. Jensen–Sharma–Mittal and Jeffreys–Sharma–Mittal Divergences

The Jensen–Shannon divergence (Jensen–Shannon entropy) is defined as follows (see [17]):

$$JS(P \,\|\, Q) = \frac{1}{2}\, D\!\left(P \,\Big\|\, \frac{P+Q}{2}\right) + \frac{1}{2}\, D\!\left(Q \,\Big\|\, \frac{P+Q}{2}\right).$$

The Jeffreys divergence (Jeffreys entropy) is defined as follows (see [17]):

$$J(P \,\|\, Q) = D(P \,\|\, Q) + D(Q \,\|\, P) = \sum_{i=1}^{n} (p_i - q_i) \log \frac{p_i}{q_i}.$$

We introduce a new generalized Jensen–Sharma–Mittal divergence, defined, with the same assumptions as before, by

$$JSM_{h,f_\alpha,\beta}(P \,\|\, Q) = \frac{1}{2}\, SM_{h,f_\alpha,\beta}\!\left(P \,\Big\|\, \frac{P+Q}{2}\right) + \frac{1}{2}\, SM_{h,f_\alpha,\beta}\!\left(Q \,\Big\|\, \frac{P+Q}{2}\right). \qquad (8)$$

We similarly introduce a new generalized Jeffreys–Sharma–Mittal divergence as follows:

$$JeSM_{h,f_\alpha,\beta}(P \,\|\, Q) = SM_{h,f_\alpha,\beta}(P \,\|\, Q) + SM_{h,f_\alpha,\beta}(Q \,\|\, P). \qquad (9)$$

Taking into account the inequality from [17] describing the relation between the Jensen–Shannon and Jeffreys divergences,

$$JS(P \,\|\, Q) \leq \frac{1}{4}\, J(P \,\|\, Q),$$

we could formulate analogous relations between the divergences introduced here. We define the Jensen–Sharma–Mittal h-divergence by substituting $f_\alpha(t) = t^\alpha$ in (8); then, with $m_i = \frac{p_i + q_i}{2}$, it takes the form

$$JSM_{h,\alpha,\beta}(P \,\|\, Q) = \frac{1}{2(\beta - 1)} \left[ h\!\left( \left( \sum_{i=1}^{n} p_i^{\alpha} m_i^{1-\alpha} \right)^{\frac{1-\beta}{1-\alpha}} \right) + h\!\left( \left( \sum_{i=1}^{n} q_i^{\alpha} m_i^{1-\alpha} \right)^{\frac{1-\beta}{1-\alpha}} \right) \right].$$

In the same way, we define the Jeffreys–Sharma–Mittal h-divergence:

$$JeSM_{h,\alpha,\beta}(P \,\|\, Q) = SM_{h,\alpha,\beta}(P \,\|\, Q) + SM_{h,\alpha,\beta}(Q \,\|\, P).$$

Additionally, if the function $h(x) = x - 1$, then we obtain the Jensen–Sharma–Mittal and the Jeffreys–Sharma–Mittal divergences of order $\alpha$ and degree $\beta$, respectively. When $h(x) = \log x$ in (8) and (9) and we substitute $f_\alpha$ for $f$, then we obtain the generalized Jensen–Rényi and Jeffreys–Rényi divergences, respectively, defined in [16].

The following theorem is a generalization and refinement of the inequalities for some known divergences; it provides lower and upper bounds for the generalized Jeffreys–Sharma–Mittal divergence, allowing a more accurate estimation of this uncertainty measure.

Theorem 1. Let $P$, $Q$, $h$, and the family $\{f_\alpha\}$ satisfy the assumptions above. Then the generalized Jeffreys–Sharma–Mittal divergence (9) admits explicit lower and upper bounds, expressed through $h$, $h'$, the family $\{f_\alpha\}$, and the parameters $\alpha$ and $\beta$.

Proof. Taking into account the assumptions, we first formulate an estimate for the inner Csiszár-type sums appearing in (9). The function $h$ is increasing and convex; therefore, from Jensen's inequality (4) and this estimate, we obtain inequalities for $SM_{h,f_\alpha,\beta}(P \,\|\, Q)$; in the same way, we obtain the analogous inequalities for $SM_{h,f_\alpha,\beta}(Q \,\|\, P)$. From (9), taking into account (2), these estimates, and the definition of the Jeffreys divergence, the upper bound for the generalized Jeffreys–Sharma–Mittal divergence follows. By using the convexity of the function $h$, the tangent-line inequality $h(x) \geq h(y) + h'(y)(x - y)$ is valid for all $x, y$ in its domain; from (6), the relevant derivative can be computed explicitly. Since the function $\log$ is concave and increasing, this yields a lower estimate for $SM_{h,f_\alpha,\beta}(P \,\|\, Q)$ and, similarly, a second one for $SM_{h,f_\alpha,\beta}(Q \,\|\, P)$. Adding these and using the definition (9), we obtain the lower bound of the generalized Jeffreys–Sharma–Mittal divergence. Combining the upper and lower estimates, we obtain the expected inequalities. □

When we substitute $f_\alpha(t) = t^\alpha$ for $f_\alpha$, the above bounds specialize to bounds for the Jeffreys–Sharma–Mittal h-divergence.

We now formulate the theorem thanks to which the estimation of the generalized Jensen–Sharma–Mittal divergence will be possible.

Theorem 3. Let $P$, $Q$, $h$, and the family $\{f_\alpha\}$ satisfy the assumptions above. Then the generalized Jensen–Sharma–Mittal divergence (8) admits explicit lower and upper bounds of the same type.

Proof. Let us consider the function given by the inner sum of the first term of (8). Using the assumptions that the function $h$ is differentiable and convex, we formulate the tangent-line inequality for it; taking into account the concavity of the function $\log$, we bound the resulting expression from below. We do the same with the function given by the inner sum of the second term of (8). Combining these estimates and using the definition (8), the lower bound of the generalized Jensen–Sharma–Mittal divergence follows. For the upper bound, we consider the same two functions; for the convex and increasing function $h$, we have from Jensen's inequality (4) that each of them is bounded from above. Then, combining these estimates and the definition (8) with the proper transformations, we obtain the inequality which is the upper bound of the generalized Jensen–Sharma–Mittal divergence. Taking the lower and upper estimates together, we obtain the asserted inequalities. □

When we substitute $f_\alpha(t) = t^\alpha$, the bounds specialize analogously to the Jensen–Sharma–Mittal h-divergence. It can be seen that the lower bounds for both the Jeffreys–Sharma–Mittal and the Jensen–Sharma–Mittal divergences are of the same form, and, taking into account the inequality between the Jensen–Shannon and Jeffreys divergences recalled above, the two theorems can be compared with each other.
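A minimal numerical sketch of the new divergences (ours; it assumes the reconstructed forms of (6), (8), and (9) given above, so it is an illustration rather than the paper's reference implementation) can be used to check the classical limiting cases, for instance that with $h = \log$ and $f_\alpha(t) = t^\alpha$ the constructions approach the Jensen–Shannon and Jeffreys divergences as $\alpha, \beta \to 1$, together with the inequality $JS \leq \frac{1}{4} J$:

```python
import numpy as np

def sm_gen(p, q, alpha, beta, h, f):
    """Generalized Sharma-Mittal divergence (6): h of the Csiszar-type sum
    sum_i q_i * f(p_i / q_i), raised to the power (1 - beta)/(1 - alpha)."""
    s = np.sum(q * f(p / q))
    return h(s**((1 - beta) / (1 - alpha))) / (beta - 1)

def jensen_sm(p, q, alpha, beta, h, f):
    """Generalized Jensen-Sharma-Mittal divergence (8)."""
    m = (p + q) / 2
    return 0.5 * (sm_gen(p, m, alpha, beta, h, f)
                  + sm_gen(q, m, alpha, beta, h, f))

def jeffreys_sm(p, q, alpha, beta, h, f):
    """Generalized Jeffreys-Sharma-Mittal divergence (9)."""
    return sm_gen(p, q, alpha, beta, h, f) + sm_gen(q, p, alpha, beta, h, f)

p = np.array([0.2, 0.5, 0.3])
q = np.array([0.4, 0.4, 0.2])
a, b = 1 + 1e-6, 1 + 2e-6           # alpha, beta close to 1, both != 1
h = np.log                          # h = log: the Renyi-type normalization
f = lambda t: t**a                  # f_alpha(t) = t**alpha

js = jensen_sm(p, q, a, b, h, f)    # ~ Jensen-Shannon divergence
je = jeffreys_sm(p, q, a, b, h, f)  # ~ Jeffreys divergence
print(js, je, js <= je / 4)         # the relation JS <= J/4 holds
```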

4. Applications

In this section, we show how the theory developed above works in several particular cases.

4.1. Bounds for Sharma–Mittal Divergences

For the functions $h(x) = x - 1$ and $f_\alpha(t) = t^\alpha$, based on Theorems 1 and 3, we obtain the lower and upper bounds for the Jeffreys–Sharma–Mittal and Jensen–Sharma–Mittal divergences, respectively. The above lower and upper bounds give two-sided estimates of these divergences; substituting different values of the parameters $\alpha$, $\beta$, such that $\alpha, \beta > 0$ and $\alpha, \beta \neq 1$, yields a family of concrete estimates for particular orders and degrees.
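Continuing the sketch above (same assumed definitions and test distributions), the specialized divergences can be tabulated over a parameter grid with $h(x) = x - 1$, the Sharma–Mittal case:

```python
# Reuses sm_gen, jensen_sm, jeffreys_sm, p, q from the previous sketch.
h = lambda x: x - 1                       # h(x) = x - 1: Sharma-Mittal case
for a in (0.5, 0.7, 1.5):
    for b in (0.5, 0.9, 2.0):
        f = lambda t, a=a: t**a           # f_alpha(t) = t**alpha, alpha = a
        print(f"alpha={a}, beta={b}: "
              f"Jensen-SM={jensen_sm(p, q, a, b, h, f):.6f}, "
              f"Jeffreys-SM={jeffreys_sm(p, q, a, b, h, f):.6f}")
```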

4.2. Bounds for Tsallis Divergences

When we make the same assumptions as for the Sharma–Mittal divergences, with the additional condition that $\beta = \alpha$, we obtain the bounds for the Tsallis-type divergences.
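Under the same assumed definitions, setting $\beta = \alpha$ indeed collapses the generalized divergence to the Tsallis form:

```python
# Reuses sm_gen, p, q and h(x) = x - 1 from the sketches above.
a = 0.7
f = lambda t: t**a
lhs = sm_gen(p, q, a, a, h, f)                    # beta = alpha
rhs = (np.sum(p**a * q**(1 - a)) - 1) / (a - 1)   # Tsallis divergence of order a
print(np.isclose(lhs, rhs))                       # True
```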

4.3. Bounds for Kullback–Leibler Divergences

When we have the same situation as in the case of the Tsallis divergence, that is, $h(x) = x - 1$ and $f_\alpha(t) = t^\alpha$ with $\beta = \alpha$, and additionally both $\alpha$ and $\beta$ approach 1, then we obtain new upper bounds for the Jeffreys and Jensen–Shannon divergences, respectively. The last inequality is equivalent to the relation between the Jensen–Shannon and Jeffreys divergences recalled in Section 3.
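As a final sanity check under the same assumptions, letting $\alpha = \beta \to 1$ recovers the Jeffreys and Jensen–Shannon divergences from the symmetrized and Jensen-type constructions, respectively:

```python
# Reuses jensen_sm, jeffreys_sm, p, q and h(x) = x - 1 from the sketches above.
a = 1 + 1e-6
f = lambda t: t**a
m = (p + q) / 2
jeffreys = np.sum(p * np.log(p / q)) + np.sum(q * np.log(q / p))
jshannon = 0.5 * (np.sum(p * np.log(p / m)) + np.sum(q * np.log(q / m)))
print(jeffreys_sm(p, q, a, a, h, f), jeffreys)   # approximately equal
print(jensen_sm(p, q, a, a, h, f), jshannon)     # approximately equal
```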

5. Summary

In this paper, new types of entropy have been defined which are generalizations of others known and used so far in information theory. The manuscript deals mostly with issues in the field of pure mathematics; therefore, the standard axioms of entropy used in thermodynamics could, in this case, be extended by other assumptions and properties. These divergences have been introduced with a view to the new physical interpretations that could be generated. The generalized Sharma–Mittal and, consequently, the Jensen–Sharma–Mittal and Jeffreys–Sharma–Mittal divergences have been defined in order to obtain better estimates for known entropies, which will allow a more accurate determination of the dispersion measure of different distributions. The derived inequalities provide both upper and lower bounds for the considered f-divergences. As a consequence, we obtain specific estimates for some new measures of order. Hence, they provide much wider interpretation possibilities for comparing probability distributions in the sense of mutual distances in different spaces. In the era of advancing quantum mechanics, scientists are striving to build a quantum computer with very high computing power. The obtained results, despite their mathematical and analytical complexity, can very quickly generate specific numerical intervals which estimate the newly introduced entropies. Therefore, results such as those in this paper will be very useful in developing information theory. This work belongs to the area of pure mathematics; it is therefore more theoretical than practical, and it makes it possible to recover the existing known entropies by means of the newly defined generalizations. These generalizations can be used for interpreting various physical phenomena. The aim of this manuscript was to provide some new theoretical solutions for physicists who, with their knowledge and experience, will be able to look for new applications.
Related articles (1 in total):

1.  Estimating Topic Modeling Performance with Sharma-Mittal Entropy.

Authors:  Sergei Koltcov; Vera Ignatenko; Olessia Koltsova
Journal:  Entropy (Basel)       Date:  2019-07-05       Impact factor: 2.524

