
On a Generalization of the Jensen-Shannon Divergence and the Jensen-Shannon Centroid.

Abstract

The Jensen-Shannon divergence is a renowned bounded symmetrization of the Kullback-Leibler divergence which does not require probability densities to have matching supports. In this paper, we introduce a vector-skew generalization of the scalar α-Jensen-Bregman divergences and derive from it the vector-skew α-Jensen-Shannon divergences. We prove that the vector-skew α-Jensen-Shannon divergences are f-divergences and study the properties of these novel divergences. Finally, we report an iterative algorithm to numerically compute the Jensen-Shannon-type centroids for a set of probability densities belonging to a mixture family: this includes the case of the Jensen-Shannon centroid of a set of categorical distributions or normalized histograms.


Keywords: Bregman divergence; Jensen diversity; Jensen–Bregman divergence; Jensen–Shannon centroid; Jensen–Shannon divergence; capacitory discrimination; difference of convex (DC) programming; f-divergence; information geometry; mixture family

Year: 2020 PMID: 33285995 PMCID: PMC7516653 DOI: 10.3390/e22020221

Source DB: PubMed Journal: Entropy (Basel) ISSN: 1099-4300 Impact factor: 2.524

1. Introduction

Let $(\mathcal{X},\mathcal{F},\mu)$ be a measure space [1] where $\mathcal{X}$ denotes the sample space, $\mathcal{F}$ the $\sigma$-algebra of measurable events, and $\mu$ a positive measure; for example, the measure space defined by the Lebesgue measure $\mu_L$ with Borel $\sigma$-algebra $\mathcal{B}(\mathbb{R}^d)$ for $\mathcal{X}=\mathbb{R}^d$, or the measure space defined by the counting measure $\mu_c$ with the power set $\sigma$-algebra $2^{\mathcal{X}}$ on a finite alphabet $\mathcal{X}$. Denote by $L^1(\mathcal{X},\mathcal{F},\mu)$ the Lebesgue space of measurable functions, $\mathcal{P}_1$ the subspace of positive integrable functions $f$ such that $\int_{\mathcal{X}} f(x)\,\mathrm{d}\mu(x)=1$ and $f(x)>0$ for all $x\in\mathcal{X}$, and $\overline{\mathcal{P}}_1$ the subspace of non-negative integrable functions $f$ such that $\int_{\mathcal{X}} f(x)\,\mathrm{d}\mu(x)=1$ and $f(x)\geq 0$ for all $x\in\mathcal{X}$. We refer to the book of Deza and Deza [2] and the survey of Basseville [3] for an introduction to the many types of statistical divergences met in information sciences and their justifications.

The Kullback–Leibler Divergence (KLD) is an oriented statistical distance (commonly called the relative entropy in information theory [4]) defined between two densities $p$ and $q$ (i.e., the Radon–Nikodym densities of $\mu$-absolutely continuous probability measures $P$ and $Q$) by
$$\mathrm{KL}(p:q) := \int p\log\frac{p}{q}\,\mathrm{d}\mu.$$
Although $\mathrm{KL}(p:q)\geq 0$ with equality iff. $p=q$ $\mu$-a.e. (Gibbs' inequality [4]), the KLD may diverge to infinity depending on the underlying densities. Since the KLD is asymmetric, several symmetrizations [5] have been proposed in the literature. A well-grounded symmetrization of the KLD is the Jensen–Shannon Divergence [6] (JSD), also called capacitory discrimination in the literature (e.g., see [7]):
$$\mathrm{JS}(p,q) := \frac{1}{2}\left(\mathrm{KL}\left(p:\frac{p+q}{2}\right)+\mathrm{KL}\left(q:\frac{p+q}{2}\right)\right) = \mathrm{JS}(q,p).$$
The Jensen–Shannon divergence can be interpreted as the total KL divergence to the average distribution $\frac{p+q}{2}$. The Jensen–Shannon divergence was historically implicitly introduced in [8] (Equation (19)) to calculate distances between random graphs. A nice feature of the Jensen–Shannon divergence is that it can be applied to densities with arbitrary support (i.e., $p,q\in\overline{\mathcal{P}}_1$, with the conventions that $0\log 0=0$ and $\log\frac{0}{0}=0$); moreover, the JSD is always upper bounded by $\log 2$. Let $\mathcal{X}_p=\mathrm{supp}(p)$ and $\mathcal{X}_q=\mathrm{supp}(q)$ denote the supports of the densities $p$ and $q$, respectively, where $\mathrm{supp}(p):=\{x\in\mathcal{X} : p(x)>0\}$. The JSD saturates to $\log 2$ whenever the supports $\mathcal{X}_p$ and $\mathcal{X}_q$ are disjoint.
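We can rewrite the JSD as
$$\mathrm{JS}(p,q) = h\left(\frac{p+q}{2}\right)-\frac{h(p)+h(q)}{2},$$
where $h(p) := -\int p\log p\,\mathrm{d}\mu$ denotes Shannon's entropy. Thus, the JSD can also be interpreted as the entropy of the average distribution minus the average of the entropies. The square root of the JSD is a metric [9] satisfying the triangle inequality, but the square root of the Jeffreys divergence (JD, defined below) is not a metric (nor is any positive power of the Jeffreys divergence, see [10]). In fact, the JSD can be interpreted as a Hilbert metric distance, meaning that there exists some isometric embedding of the space of densities equipped with $\sqrt{\mathrm{JS}}$ into a Hilbert space [11,12]. Other principled symmetrizations of the KLD have been proposed in the literature: for example, Naghshvar et al. [13] proposed the extrinsic Jensen–Shannon divergence and demonstrated its use for variable-length coding over a discrete memoryless channel (DMC).

Another symmetrization of the KLD sometimes met in the literature [14,15,16] is the Jeffreys divergence [17,18] (JD) defined by
$$J(p,q) := \mathrm{KL}(p:q)+\mathrm{KL}(q:p) = \int (p-q)\log\frac{p}{q}\,\mathrm{d}\mu = J(q,p).$$
However, we point out that this Jeffreys divergence lacks sound information-theoretical justifications.

For two positive but not necessarily normalized densities $\tilde{p}$ and $\tilde{q}$, we define the extended Kullback–Leibler divergence as follows:
$$\mathrm{KL}^+(\tilde{p}:\tilde{q}) := \int\left(\tilde{p}\log\frac{\tilde{p}}{\tilde{q}}+\tilde{q}-\tilde{p}\right)\mathrm{d}\mu.$$
The Jensen–Shannon divergence and the Jeffreys divergence can both be extended to positive (unnormalized) densities without changing their formula expressions:
$$\mathrm{JS}^+(\tilde{p},\tilde{q}) := \frac{1}{2}\left(\mathrm{KL}^+\left(\tilde{p}:\frac{\tilde{p}+\tilde{q}}{2}\right)+\mathrm{KL}^+\left(\tilde{q}:\frac{\tilde{p}+\tilde{q}}{2}\right)\right) = \frac{1}{2}\left(\mathrm{KL}\left(\tilde{p}:\frac{\tilde{p}+\tilde{q}}{2}\right)+\mathrm{KL}\left(\tilde{q}:\frac{\tilde{p}+\tilde{q}}{2}\right)\right),$$
$$J^+(\tilde{p},\tilde{q}) := \mathrm{KL}^+(\tilde{p}:\tilde{q})+\mathrm{KL}^+(\tilde{q}:\tilde{p}) = \int(\tilde{p}-\tilde{q})\log\frac{\tilde{p}}{\tilde{q}}\,\mathrm{d}\mu.$$
However, the extended $\mathrm{JS}^+$ divergence is upper-bounded by $\frac{\log 2}{2}\int(\tilde{p}+\tilde{q})\,\mathrm{d}\mu$ instead of $\log 2$ for normalized densities (i.e., when $\int(\tilde{p}+\tilde{q})\,\mathrm{d}\mu=2$).

Let $(pq)_\alpha(x) := (1-\alpha)p(x)+\alpha q(x)$ denote the statistical weighted mixture with component densities $p$ and $q$ for $\alpha\in[0,1]$. The asymmetric $\alpha$-skew Jensen–Shannon divergence can be defined for a scalar parameter $\alpha\in(0,1)$ by considering the weighted mixture $(pq)_\alpha$ as follows:
$$\mathrm{JS}_a^{\alpha}(p:q) := (1-\alpha)\,\mathrm{KL}(p:(pq)_\alpha)+\alpha\,\mathrm{KL}(q:(pq)_\alpha).$$
Let us introduce the α-skew K-divergence [6,19] $K_\alpha(p:q)$ by:
$$K_\alpha(p:q) := \mathrm{KL}(p:(1-\alpha)p+\alpha q) = \mathrm{KL}(p:(pq)_\alpha).$$
Then, both the Jensen–Shannon divergence and the Jeffreys divergence can be rewritten [20] using $K_\alpha$ as follows:
$$\mathrm{JS}(p,q) = \frac{1}{2}\left(K_{\frac{1}{2}}(p:q)+K_{\frac{1}{2}}(q:p)\right),\qquad J(p,q) = K_1(p:q)+K_1(q:p),$$
since $(pq)_1=q$, $\mathrm{KL}(p:q)=K_1(p:q)$ and $(pq)_{\frac{1}{2}}=(qp)_{\frac{1}{2}}$. We can thus define the symmetric α-skew Jensen–Shannon divergence [20] for $\alpha\in(0,1)$ as follows:
$$\mathrm{JS}^{\alpha}(p,q) := \frac{1}{2}K_\alpha(p:q)+\frac{1}{2}K_\alpha(q:p) = \mathrm{JS}^{\alpha}(q,p).$$
The ordinary Jensen–Shannon divergence is recovered for $\alpha=\frac{1}{2}$.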
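As an illustration, the following minimal Python sketch evaluates these divergences for discrete distributions, assuming the standard definitions KL(p:q) = Σ p log(p/q), (pq)_α = (1−α)p + αq, K_α(p:q) = KL(p:(pq)_α), and the convention 0 log 0 = 0. The function names (`kl`, `mix`, `skew_k`, `js_alpha`, and so on) are hypothetical helpers chosen for this example and do not come from the paper.

```python
import math

def kl(p, q):
    """KL(p:q) for discrete distributions, with the convention 0 log 0 = 0."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0.0:
            if qi == 0.0:
                return math.inf  # p puts mass where q has none
            total += pi * math.log(pi / qi)
    return total

def entropy(p):
    """Shannon entropy h(p) in nats."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0.0)

def mix(p, q, alpha):
    """Weighted mixture (pq)_alpha = (1 - alpha) p + alpha q."""
    return [(1.0 - alpha) * pi + alpha * qi for pi, qi in zip(p, q)]

def js(p, q):
    """Jensen-Shannon divergence: average KL divergence to the mid-mixture."""
    m = mix(p, q, 0.5)
    return 0.5 * (kl(p, m) + kl(q, m))

def skew_k(p, q, alpha):
    """alpha-skew K-divergence K_alpha(p:q) = KL(p : (pq)_alpha)."""
    return kl(p, mix(p, q, alpha))

def js_alpha(p, q, alpha):
    """Symmetric alpha-skew Jensen-Shannon divergence."""
    return 0.5 * (skew_k(p, q, alpha) + skew_k(q, p, alpha))

def jeffreys(p, q):
    """Jeffreys divergence J(p,q) = KL(p:q) + KL(q:p)."""
    return kl(p, q) + kl(q, p)
```

For two histograms with disjoint supports, such as `[1.0, 0.0]` and `[0.0, 1.0]`, `js` stays finite while `kl` returns infinity, which is the property that motivates the JSD in the first place.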
In general, skewing divergences (e.g., using the divergence $K_\alpha$ instead of the KLD) has been experimentally shown to perform better in applications such as some natural language processing (NLP) tasks [21]. The α-Jensen–Shannon divergences are Csiszár f-divergences [22,23,24]. An f-divergence is defined for a convex function $f$, strictly convex at 1 and satisfying $f(1)=0$, as:
$$I_f(p:q) := \int p(x)\,f\left(\frac{q(x)}{p(x)}\right)\mathrm{d}\mu(x) \geq f(1) = 0.$$
We can always symmetrize f-divergences by taking the conjugate convex function $f^*(x)=x f\left(\frac{1}{x}\right)$ (related to the perspective function): $I_{f+f^*}(p,q)$ is a symmetric divergence. The f-divergences are convex statistical distances which are provably the only separable invariant divergences in information geometry [25], except for binary alphabets $\mathcal{X}$ (see [26]). The Jeffreys divergence is an f-divergence for the generator $f(x)=(x-1)\log x$, and the $\alpha$-Jensen–Shannon divergences are f-divergences for the generator family $f_\alpha(x)=-\frac{1}{2}\left(\log((1-\alpha)+\alpha x)+x\log\left((1-\alpha)+\frac{\alpha}{x}\right)\right)$. The f-divergences are upper-bounded by $f(0)+f^*(0)$. Thus, the f-divergences are finite when $f(0)+f^*(0)<\infty$.

The main contributions of this paper are summarized as follows: First, we generalize the Jensen–Bregman divergence by skewing a weighted separable Jensen–Bregman divergence with a k-dimensional vector $\alpha\in[0,1]^k$ in Section 2. This yields a generalization of the symmetric skew $\alpha$-Jensen–Shannon divergences to a vector-skew parameter. This extension retains the key properties of being upper-bounded and of applying to densities with potentially different supports. The proposed generalization also gives a better understanding of the “mechanism” of the Jensen–Shannon divergence itself. We also show how to directly obtain the weighted vector-skew Jensen–Shannon divergence from the decomposition of the KLD as the difference of the cross-entropy minus the entropy (i.e., KLD as the relative entropy).
Second, we prove that weighted vector-skew Jensen–Shannon divergences are f-divergences (Theorem 1), and show how to build families of symmetric Jensen–Shannon-type divergences which can be controlled by a vector of parameters in Section 2.3, generalizing the work of [20] from scalar skewing to vector skewing. This may prove useful in applications by providing additional tuning parameters (which can be set, for example, by using cross-validation techniques). Third, we consider the calculation of the Jensen–Shannon centroids in Section 3 for densities belonging to mixture families. Mixture families include the family of categorical distributions and the family of statistical mixtures sharing the same prescribed components. Mixture families are well-studied manifolds in information geometry [25]. We show how to compute the Jensen–Shannon centroid using a concave–convex numerical iterative optimization procedure [27]. The experimental results graphically compare the Jeffreys centroid with the Jensen–Shannon centroid for grey-valued image histograms.

2. Extending the Jensen–Shannon Divergence

2.1. Vector-Skew Jensen–Bregman Divergences and Jensen Diversities

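Recall our notational shortcut: $(ab)_\alpha := (1-\alpha)a+\alpha b$. For a k-dimensional vector $\alpha\in[0,1]^k$, a weight vector $w$ belonging to the $(k-1)$-dimensional open simplex $\Delta_k$, and a scalar $\gamma\in(0,1)$, let us define the following vector skew α-Jensen–Bregman divergence ($\alpha$-JBD) following [28]:
$$\mathrm{JB}_F^{\alpha,\gamma,w}(\theta_1:\theta_2) := \sum_{i=1}^k w_i\,B_F\left((\theta_1\theta_2)_{\alpha_i}:(\theta_1\theta_2)_\gamma\right)\geq 0,$$
where $B_F$ is the Bregman divergence [29] induced by a strictly convex and smooth generator $F$:
$$B_F(\theta_1:\theta_2) := F(\theta_1)-F(\theta_2)-\langle\theta_1-\theta_2,\nabla F(\theta_2)\rangle,$$
with $\langle\cdot,\cdot\rangle$ denoting the Euclidean inner product $\langle x,y\rangle=x^\top y$ (dot product). Expanding the Bregman divergence formulas in the expression of the $\alpha$-JBD and using the fact that
$$(\theta_1\theta_2)_{\alpha_i}-(\theta_1\theta_2)_\gamma = (\gamma-\alpha_i)(\theta_1-\theta_2),$$
we get the following expression:
$$\mathrm{JB}_F^{\alpha,\gamma,w}(\theta_1:\theta_2) = \left(\sum_{i=1}^k w_i F\left((\theta_1\theta_2)_{\alpha_i}\right)\right)-F\left((\theta_1\theta_2)_\gamma\right)-\left\langle\sum_{i=1}^k w_i(\gamma-\alpha_i)(\theta_1-\theta_2),\nabla F\left((\theta_1\theta_2)_\gamma\right)\right\rangle. \tag{21}$$
The inner product term of Equation (21) vanishes when
$$\gamma = \sum_{i=1}^k w_i\alpha_i =: \bar\alpha.$$
Thus, when $\gamma=\bar\alpha$ (assuming at least two distinct components in $\alpha$ so that $\gamma\in(0,1)$), we get the simplified formula for the vector-skew $\alpha$-JBD:
$$\mathrm{JB}_F^{\alpha,w}(\theta_1:\theta_2) = \left(\sum_{i=1}^k w_i F\left((\theta_1\theta_2)_{\alpha_i}\right)\right)-F\left((\theta_1\theta_2)_{\bar\alpha}\right).$$
This vector-skew Jensen–Bregman divergence is always finite and amounts to a Jensen diversity [30] $J_F$ induced by Jensen’s inequality gap:
$$\mathrm{JB}_F^{\alpha,w}(\theta_1:\theta_2) = J_F\left((\theta_1\theta_2)_{\alpha_1},\ldots,(\theta_1\theta_2)_{\alpha_k};w_1,\ldots,w_k\right) := \sum_{i=1}^k w_i F\left((\theta_1\theta_2)_{\alpha_i}\right)-F\left((\theta_1\theta_2)_{\bar\alpha}\right)\geq 0.$$
The Jensen diversity is a quantity which arises as a generalization of the cluster variance when clustering with Bregman divergences instead of the ordinary squared Euclidean distance; see [29,30] for details. In the context of Bregman clustering, the Jensen diversity has been called the Bregman information [29] and is motivated by rate distortion theory: Bregman information measures the minimum expected loss when encoding a set of points using a single point when the loss is measured using a Bregman divergence. In general, a k-point measure is called a diversity measure (for $k>2$), while a distance/divergence is the special case of a 2-point measure.

Conversely, in 1D, we may start from Jensen’s inequality for a strictly convex function $F$:
$$\sum_{i=1}^k w_i F(\theta_i)\geq F\left(\sum_{i=1}^k w_i\theta_i\right).$$
Let us notationally write $[k]:=\{1,\ldots,k\}$, and define $\theta_m:=\min_{i\in[k]}\theta_i$ and $\theta_M:=\max_{i\in[k]}\theta_i>\theta_m$ (i.e., assuming at least two distinct values). We have the barycenter $\bar\theta=\sum_i w_i\theta_i=:(\theta_m\theta_M)_\gamma$, which can be interpreted as the linear interpolation of the extremal values for some $\gamma\in(0,1)$. Let us write $\theta_i=(\theta_m\theta_M)_{\alpha_i}$ for $i\in[k]$ and proper values of the $\alpha_i$s. Then, it comes that
$$\bar\theta = \sum_i w_i\theta_i = \sum_i w_i(\theta_m\theta_M)_{\alpha_i} = (\theta_m\theta_M)_{\sum_i w_i\alpha_i},$$
so that $\gamma=\sum_i w_i\alpha_i=\bar\alpha$.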
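The cancellation of the inner product term can be checked numerically. The sketch below is a 1D illustration under stated assumptions: it uses the generator F(x) = x log x − x as an arbitrary example of a strictly convex smooth function, and the helper names (`bregman`, `interp`, `vector_skew_jbd`, `jensen_diversity`) are hypothetical. It compares the weighted sum of Bregman divergences to the reference point (θ1θ2)_γ with the Jensen gap, which should agree when γ is the weighted average of the skewing parameters.

```python
import math

def F(x):
    """Illustrative strictly convex generator F(x) = x log x - x on (0, inf)."""
    return x * math.log(x) - x

def dF(x):
    """Derivative F'(x) = log x."""
    return math.log(x)

def bregman(a, b):
    """Scalar Bregman divergence B_F(a:b) = F(a) - F(b) - (a - b) F'(b)."""
    return F(a) - F(b) - (a - b) * dF(b)

def interp(a, b, alpha):
    """Linear interpolation (ab)_alpha = (1 - alpha) a + alpha b."""
    return (1.0 - alpha) * a + alpha * b

def vector_skew_jbd(t1, t2, alphas, weights, gamma):
    """Weighted sum of Bregman divergences to the reference point (t1 t2)_gamma."""
    ref = interp(t1, t2, gamma)
    return sum(w * bregman(interp(t1, t2, a), ref)
               for a, w in zip(alphas, weights))

def jensen_diversity(t1, t2, alphas, weights):
    """Jensen gap: sum_i w_i F((t1 t2)_{alpha_i}) - F((t1 t2)_{alpha_bar})."""
    alpha_bar = sum(w * a for a, w in zip(alphas, weights))
    return (sum(w * F(interp(t1, t2, a)) for a, w in zip(alphas, weights))
            - F(interp(t1, t2, alpha_bar)))
```

Calling `vector_skew_jbd` with `gamma` set to the weighted average of `alphas` should return the same value as `jensen_diversity`, up to floating-point rounding.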

2.2. Vector-Skew Jensen–Shannon Divergences

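Let $f(x)=x\log x-x$ be a strictly convex and smooth function on $(0,\infty)$. Then, the Bregman divergence induced by this univariate generator is
$$B_f(p:q) = p\log\frac{p}{q}+q-p = \mathrm{kl}^+(p:q),$$
the extended scalar Kullback–Leibler divergence. We extend the scalar-skew Jensen–Shannon divergence as follows:
$$\mathrm{JS}^{\alpha,w}(p:q) := \int \mathrm{JB}_f^{\alpha,w}(p(x):q(x))\,\mathrm{d}\mu(x) = \sum_{i=1}^k w_i\,\mathrm{KL}\left((pq)_{\alpha_i}:(pq)_{\bar\alpha}\right) = h\left((pq)_{\bar\alpha}\right)-\sum_{i=1}^k w_i\,h\left((pq)_{\alpha_i}\right),$$
for $h$, the Shannon’s entropy [4] (a strictly concave function [4]).

Definition (Weighted vector-skew $(\alpha,w)$-Jensen–Shannon divergence). For a vector $\alpha\in[0,1]^k$ and a unit positive weight vector $w\in\Delta_k$, the $(\alpha,w)$-Jensen–Shannon divergence between two densities $p$ and $q$ is defined by
$$\mathrm{JS}^{\alpha,w}(p:q) := \sum_{i=1}^k w_i\,\mathrm{KL}\left((pq)_{\alpha_i}:(pq)_{\bar\alpha}\right) = h\left((pq)_{\bar\alpha}\right)-\sum_{i=1}^k w_i\,h\left((pq)_{\alpha_i}\right),$$
with $\bar\alpha=\sum_{i=1}^k w_i\alpha_i$.

This definition generalizes the ordinary JSD; we recover the ordinary Jensen–Shannon divergence when $k=2$, $\alpha_1=0$, $\alpha_2=1$, and $w_1=w_2=\frac{1}{2}$ with $\bar\alpha=\frac{1}{2}$: $\mathrm{JS}(p,q)=\mathrm{JS}^{(0,1),(\frac{1}{2},\frac{1}{2})}(p:q)$.

Let $\mathrm{KL}_{\alpha,\beta}(p:q) := \mathrm{KL}\left((pq)_\alpha:(pq)_\beta\right)$. Then, we have $\mathrm{KL}_{\alpha,\beta}(q:p)=\mathrm{KL}_{1-\alpha,1-\beta}(p:q)$. Using this $(\alpha,\beta)$-KLD, we have the following identity:
$$\mathrm{JS}^{\alpha,w}(p:q) = \sum_{i=1}^k w_i\,\mathrm{KL}_{\alpha_i,\bar\alpha}(p:q) = \sum_{i=1}^k w_i\,\mathrm{KL}_{1-\alpha_i,1-\bar\alpha}(q:p) = \mathrm{JS}^{1_k-\alpha,w}(q:p),$$
since $\sum_{i=1}^k w_i(1-\alpha_i)=1-\bar\alpha$, where $1_k$ is a k-dimensional vector of ones.

A very interesting property is that the vector-skew Jensen–Shannon divergences are f-divergences [22].

Theorem 1. The vector-skew Jensen–Shannon divergences $\mathrm{JS}^{\alpha,w}(p:q)$ are f-divergences.

Proof. We use the convention $I_f(p:q)=\int p\,f\left(\frac{q}{p}\right)\mathrm{d}\mu$. First, let us observe that the positively weighted sum of f-divergences is an f-divergence: $\sum_{i=1}^k w_i I_{f_i}(p:q)=I_f(p:q)$ for the generator $f(u)=\sum_{i=1}^k w_i f_i(u)$. Now, let us express the divergence $\mathrm{KL}_{\alpha,\beta}(p:q)$ as an f-divergence:
$$\mathrm{KL}_{\alpha,\beta}(p:q) = I_{f_{\alpha,\beta}}(p:q),$$
with generator
$$f_{\alpha,\beta}(u) = \left((1-\alpha)+\alpha u\right)\log\frac{(1-\alpha)+\alpha u}{(1-\beta)+\beta u}.$$
Thus, it follows that
$$\mathrm{JS}^{\alpha,w}(p:q) = \sum_{i=1}^k w_i\,\mathrm{KL}\left((pq)_{\alpha_i}:(pq)_{\bar\alpha}\right) = \sum_{i=1}^k w_i\,I_{f_{\alpha_i,\bar\alpha}}(p:q) = I_{\sum_{i=1}^k w_i f_{\alpha_i,\bar\alpha}}(p:q).$$
Therefore, the vector-skew Jensen–Shannon divergence is an f-divergence for the following generator:
$$f_{\alpha,w}(u) = \sum_{i=1}^k w_i\left(\alpha_i u+(1-\alpha_i)\right)\log\frac{\alpha_i u+(1-\alpha_i)}{\bar\alpha u+(1-\bar\alpha)},$$
where $\bar\alpha=\sum_{i=1}^k w_i\alpha_i$. When $\alpha=(0,1)$ and $w=\left(\frac{1}{2},\frac{1}{2}\right)$, we recover the f-divergence generator for the JSD:
$$f_{\mathrm{JS}}(u) = \frac{1}{2}\log\frac{2}{1+u}+\frac{1}{2}u\log\frac{2u}{1+u}.$$
Observe that $f^*_{\alpha,w}(u)=u f_{\alpha,w}\left(\frac{1}{u}\right)=f_{1_k-\alpha,w}(u)$, where $1_k-\alpha:=(1-\alpha_1,\ldots,1-\alpha_k)$. We also refer the reader to Theorem 4.1 of [31], which defines skew f-divergences from any f-divergence. □

Since the vector-skew Jensen divergence is an f-divergence, we easily obtain Fano and Pinsker inequalities following [

Next, we show that $\mathrm{KL}_{\alpha,\beta}$ (and $\mathrm{JS}^{\alpha,w}$) are separable convex divergences. Since the f-divergences are separable convex, the $\mathrm{KL}_{\alpha,\beta}$ divergences and the $\mathrm{JS}^{\alpha,w}$ divergences are separable convex. For the sake of completeness, we report a simple explicit proof below.

(Separable convexity). The divergence $\mathrm{KL}_{\alpha,\beta}(p:q)$ is strictly separable convex for $\alpha\neq\beta$.

Proof. Let us calculate the second partial derivative of the scalar divergence $\mathrm{kl}_{\alpha,\beta}(x:y)=(xy)_\alpha\log\frac{(xy)_\alpha}{(xy)_\beta}$ with respect to $x$, and show that it is strictly positive:
$$\frac{\partial^2}{\partial x^2}\mathrm{kl}_{\alpha,\beta}(x:y) = \frac{(\beta-\alpha)^2y^2}{(xy)_\alpha\,(xy)_\beta^2} > 0,$$
for $x,y>0$. Thus, $\mathrm{KL}_{\alpha,\beta}$ is strictly convex on the left argument. Similarly, since $\mathrm{KL}_{\alpha,\beta}(y:x)=\mathrm{KL}_{1-\alpha,1-\beta}(x:y)$, we deduce that $\mathrm{KL}_{\alpha,\beta}$ is strictly convex on the right argument. Therefore, the divergence $\mathrm{KL}_{\alpha,\beta}$ is separable convex.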
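□ It follows that the divergence $\mathrm{JS}^{\alpha,w}$ is strictly separable convex, since it is a convex combination of weighted $\mathrm{KL}_{\alpha_i,\bar\alpha}$ divergences.

Another way to derive the vector-skew JSD is to decompose the KLD as the difference of the cross-entropy $h^\times$ minus the entropy $h$ (i.e., KLD is also called the relative entropy):
$$\mathrm{KL}(p:q) = h^\times(p:q)-h(p),$$
where $h^\times(p:q):=-\int p\log q\,\mathrm{d}\mu$ and $h(p)=h^\times(p:p)$ (self cross-entropy). Since $\alpha_1h^\times(p_1:q)+\alpha_2h^\times(p_2:q)=h^\times(\alpha_1p_1+\alpha_2p_2:q)$ (for $\alpha_2=1-\alpha_1$), it follows that
$$\sum_{i=1}^k w_i\,\mathrm{KL}\left((pq)_{\alpha_i}:(pq)_\gamma\right) = h^\times\left(\sum_{i=1}^k w_i(pq)_{\alpha_i}:(pq)_\gamma\right)-\sum_{i=1}^k w_i\,h\left((pq)_{\alpha_i}\right).$$
Here, the “trick” is to choose $\gamma=\bar\alpha$ in order to “convert” the cross-entropy into an entropy: $h^\times\left(\sum_i w_i(pq)_{\alpha_i}:(pq)_\gamma\right)=h\left((pq)_{\bar\alpha}\right)$ when $\gamma=\bar\alpha$, because $\sum_i w_i(pq)_{\alpha_i}=(pq)_{\bar\alpha}$. Then, we end up with
$$\mathrm{JS}^{\alpha,w}(p:q) = h\left((pq)_{\bar\alpha}\right)-\sum_{i=1}^k w_i\,h\left((pq)_{\alpha_i}\right).$$
When $\alpha=(\alpha_1,\alpha_2)$ with $\alpha_1=0$ and $\alpha_2=1$ and $w=\left(\frac{1}{2},\frac{1}{2}\right)$, we have $\bar\alpha=\frac{1}{2}$, and we recover the Jensen–Shannon divergence:
$$\mathrm{JS}(p,q) = \frac{1}{2}\left(\mathrm{KL}\left(p:\frac{p+q}{2}\right)+\mathrm{KL}\left(q:\frac{p+q}{2}\right)\right) = h\left(\frac{p+q}{2}\right)-\frac{h(p)+h(q)}{2}. \tag{48}$$
Notice that Equation (13) is the usual definition of the Jensen–Shannon divergence, while Equation (48) is the reduced formula of the JSD, which can be interpreted as a Jensen gap for Shannon entropy, hence its name: the Jensen–Shannon divergence.

Moreover, if we consider the cross-entropy/entropy extended to positive densities $\tilde{p}$ and $\tilde{q}$:
$$h_+^\times(\tilde{p}:\tilde{q}) := -\int(\tilde{p}\log\tilde{q}-\tilde{q})\,\mathrm{d}\mu,\qquad h_+(\tilde{p}) := h_+^\times(\tilde{p}:\tilde{p}) = -\int(\tilde{p}\log\tilde{p}-\tilde{p})\,\mathrm{d}\mu,$$
we get:
$$\mathrm{JS}_+^{\alpha,w}(\tilde{p}:\tilde{q}) = \sum_{i=1}^k w_i\,\mathrm{KL}^+\left((\tilde{p}\tilde{q})_{\alpha_i}:(\tilde{p}\tilde{q})_{\bar\alpha}\right) = h_+\left((\tilde{p}\tilde{q})_{\bar\alpha}\right)-\sum_{i=1}^k w_i\,h_+\left((\tilde{p}\tilde{q})_{\alpha_i}\right).$$

Next, we shall prove that our generalization of the skew Jensen–Shannon divergence to vector-skewing is always bounded. We first start with a lemma bounding the KLD between two mixtures sharing the same components:

(KLD between two w-mixtures). For $\alpha\in[0,1]$ and $\beta\in(0,1)$, we have
$$\mathrm{KL}\left((pq)_\alpha:(pq)_\beta\right) \leq \log\max\left\{\frac{1-\alpha}{1-\beta},\frac{\alpha}{\beta}\right\} \leq \log\frac{1}{\beta(1-\beta)}.$$

Proof. For $p(x),q(x)>0$, we have
$$\frac{(1-\alpha)p(x)+\alpha q(x)}{(1-\beta)p(x)+\beta q(x)} \leq \max\left\{\frac{1-\alpha}{1-\beta},\frac{\alpha}{\beta}\right\}.$$
Indeed, by considering the two cases $\alpha\geq\beta$ (or equivalently, $1-\alpha\leq 1-\beta$) and $\alpha\leq\beta$ (or equivalently, $1-\alpha\geq 1-\beta$), we check that $(1-\alpha)p(x)\leq\max\left\{\frac{1-\alpha}{1-\beta},\frac{\alpha}{\beta}\right\}(1-\beta)p(x)$ and $\alpha q(x)\leq\max\left\{\frac{1-\alpha}{1-\beta},\frac{\alpha}{\beta}\right\}\beta q(x)$. Thus, we have $\log\frac{(pq)_\alpha(x)}{(pq)_\beta(x)}\leq\log\max\left\{\frac{1-\alpha}{1-\beta},\frac{\alpha}{\beta}\right\}$. Therefore, it follows that:
$$\mathrm{KL}\left((pq)_\alpha:(pq)_\beta\right) \leq \int(pq)_\alpha\log\max\left\{\frac{1-\alpha}{1-\beta},\frac{\alpha}{\beta}\right\}\mathrm{d}\mu = \log\max\left\{\frac{1-\alpha}{1-\beta},\frac{\alpha}{\beta}\right\}.$$
Notice that we can interpret $\log\max\left\{\frac{1-\alpha}{1-\beta},\frac{\alpha}{\beta}\right\}$ as the ∞-Rényi divergence [36,37] between the following two two-point distributions: $(1-\alpha,\alpha)$ and $(1-\beta,\beta)$. See Theorem 6 of [36].

A weaker upper bound is $\mathrm{KL}\left((pq)_\alpha:(pq)_\beta\right)\leq\log\frac{1}{\beta(1-\beta)}$. Indeed, let us form a partition of the sample space $\mathcal{X}$ into two dominance regions: $R_p:=\{x\in\mathcal{X} : q(x)\leq p(x)\}$ and $R_q:=\{x\in\mathcal{X} : q(x)>p(x)\}$. We have $(pq)_\alpha(x)\leq p(x)$ and $(pq)_\beta(x)\geq(1-\beta)p(x)$ for $x\in R_p$, and $(pq)_\alpha(x)\leq q(x)$ and $(pq)_\beta(x)\geq\beta q(x)$ for $x\in R_q$. It follows that
$$\mathrm{KL}\left((pq)_\alpha:(pq)_\beta\right) \leq \int_{R_p}(pq)_\alpha\log\frac{p}{(1-\beta)p}\,\mathrm{d}\mu+\int_{R_q}(pq)_\alpha\log\frac{q}{\beta q}\,\mathrm{d}\mu \leq -\log(1-\beta)-\log\beta.$$
That is, $\mathrm{KL}\left((pq)_\alpha:(pq)_\beta\right)\leq\log\frac{1}{\beta(1-\beta)}$. Notice that we allow $\alpha$ but not $\beta$ to take the extreme values (i.e., $\alpha\in[0,1]$ and $\beta\in(0,1)$). □

In fact, it is known that for both $\alpha,\beta\in(0,1)$, computing $\mathrm{KL}\left((pq)_\alpha:(pq)_\beta\right)$ amounts to computing a Bregman divergence for the Shannon negentropy generator, since $\{(pq)_\gamma : \gamma\in(0,1)\}$ defines a mixture family [38] of order 1 in information geometry. Hence, it is always finite, as Bregman divergences are always finite (but not necessarily bounded).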
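By using the fact that
$$\mathrm{JS}^{\alpha,w}(p:q) = \sum_{i=1}^k w_i\,\mathrm{KL}\left((pq)_{\alpha_i}:(pq)_{\bar\alpha}\right),$$
we conclude that the vector-skew Jensen–Shannon divergence is upper-bounded:

(Bounded $(w,\alpha)$-Jensen–Shannon divergence). $\mathrm{JS}^{\alpha,w}$ is bounded by $\log\frac{1}{\bar\alpha(1-\bar\alpha)}$, where $\bar\alpha=\sum_{i=1}^k w_i\alpha_i\in(0,1)$.

Proof. We have $\mathrm{JS}^{\alpha,w}(p:q)=\sum_i w_i\,\mathrm{KL}\left((pq)_{\alpha_i}:(pq)_{\bar\alpha}\right)$. Since $0\leq\mathrm{KL}\left((pq)_{\alpha_i}:(pq)_{\bar\alpha}\right)\leq\log\frac{1}{\bar\alpha(1-\bar\alpha)}$, it follows that we have
$$0 \leq \mathrm{JS}^{\alpha,w}(p:q) \leq \log\frac{1}{\bar\alpha(1-\bar\alpha)}.$$
Notice that we also have
$$\mathrm{JS}^{\alpha,w}(p:q) \leq \sum_{i=1}^k w_i\log\max\left\{\frac{1-\alpha_i}{1-\bar\alpha},\frac{\alpha_i}{\bar\alpha}\right\}.$$
□

The vector-skew Jensen–Shannon divergence is symmetric if and only if for each index $i\in[k]$ there exists a matching index $\sigma(i)$ such that $\alpha_{\sigma(i)}=1-\alpha_i$ and $w_{\sigma(i)}=w_i$. For example, we may define the symmetric scalar α-skew Jensen–Shannon divergence as
$$\mathrm{JS}_s^{\alpha}(p,q) = \frac{1}{2}\mathrm{KL}\left((pq)_\alpha:(pq)_{\frac{1}{2}}\right)+\frac{1}{2}\mathrm{KL}\left((pq)_{1-\alpha}:(pq)_{\frac{1}{2}}\right) = h\left((pq)_{\frac{1}{2}}\right)-\frac{h\left((pq)_\alpha\right)+h\left((pq)_{1-\alpha}\right)}{2},$$
since it holds that $(ab)_{1-\alpha}=(ba)_\alpha$ for any $a,b$. Note that $\mathrm{JS}_s^{\alpha}(p,q)\neq\mathrm{JS}^{\alpha}(p,q)$ in general.

We can always symmetrize a vector-skew Jensen–Shannon divergence by doubling the dimension of the skewing vector. Let $\alpha=(\alpha_1,\ldots,\alpha_k)$ and $w$ be the parameters of an asymmetric vector-skew JSD, and let $\alpha'=(1-\alpha_1,\ldots,1-\alpha_k)$. The concatenated skewing vector $(\alpha,\alpha')$ with weight vector $\left(\frac{w}{2},\frac{w}{2}\right)$ satisfies the matching condition above (with $\bar\alpha=\frac{1}{2}$), so that
$$\mathrm{JS}^{(\alpha,\alpha'),\left(\frac{w}{2},\frac{w}{2}\right)}(p:q) = \mathrm{JS}^{(\alpha,\alpha'),\left(\frac{w}{2},\frac{w}{2}\right)}(q:p).$$
Since the vector-skew Jensen–Shannon divergence is an f-divergence for the generator $f_{\alpha,w}$ (Theorem 1), we can also symmetrize it with the generator $\frac{1}{2}\left(f_{\alpha,w}+f^*_{\alpha,w}\right)$, since $f^*_{\alpha,w}=f_{1_k-\alpha,w}$. For example, consider the ordinary Jensen–Shannon divergence with $w=\left(\frac{1}{2},\frac{1}{2}\right)$ and $\alpha=(0,1)$: doubling yields the weights $\left(\frac{1}{4},\frac{1}{4},\frac{1}{4},\frac{1}{4}\right)$ and the skewing vector $(0,1,1,0)$, which again gives the ordinary JSD.

As a side note, let us notice that our notation $(pq)_\alpha$ allows one to compactly write the following property: We have $q=(qq)_\lambda$ for any $\lambda\in[0,1]$, and $\left((p_1p_2)_\lambda(q_1q_2)_\lambda\right)_\alpha=\left((p_1q_1)_\alpha(p_2q_2)_\alpha\right)_\lambda$ for any $\alpha,\lambda\in[0,1]$. Clearly, $q=(1-\lambda)q+\lambda q=(qq)_\lambda$ for any $\lambda\in[0,1]$. Now, we have
$$\left((p_1p_2)_\lambda(q_1q_2)_\lambda\right)_\alpha = (1-\alpha)\left((1-\lambda)p_1+\lambda p_2\right)+\alpha\left((1-\lambda)q_1+\lambda q_2\right) = (1-\lambda)(p_1q_1)_\alpha+\lambda(p_2q_2)_\alpha.$$
□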
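The properties of this section can be checked numerically on discrete distributions. The following Python sketch assumes the definition JS^{α,w}(p:q) = Σ w_i KL((pq)_{α_i} : (pq)_ᾱ) with ᾱ = Σ w_i α_i; the function names, including the `symmetrize` helper that doubles the skewing vector, are hypothetical and chosen for this example only.

```python
import math

def kl(p, q):
    """Discrete KL(p:q) with the convention 0 log 0 = 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)

def entropy(p):
    """Shannon entropy in nats."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0.0)

def mix(p, q, alpha):
    """Mixture (pq)_alpha = (1 - alpha) p + alpha q."""
    return [(1.0 - alpha) * pi + alpha * qi for pi, qi in zip(p, q)]

def vector_skew_js(p, q, alphas, weights):
    """Weighted sum of KL divergences from (pq)_{alpha_i} to (pq)_{alpha_bar}."""
    alpha_bar = sum(w * a for a, w in zip(alphas, weights))
    center = mix(p, q, alpha_bar)
    return sum(w * kl(mix(p, q, a), center) for a, w in zip(alphas, weights))

def vector_skew_js_entropy(p, q, alphas, weights):
    """Same quantity written as a Jensen gap of the Shannon entropy."""
    alpha_bar = sum(w * a for a, w in zip(alphas, weights))
    return (entropy(mix(p, q, alpha_bar))
            - sum(w * entropy(mix(p, q, a)) for a, w in zip(alphas, weights)))

def symmetrize(alphas, weights):
    """Double the skewing vector: append 1 - alpha_i, and halve all weights."""
    doubled_alphas = list(alphas) + [1.0 - a for a in alphas]
    doubled_weights = [0.5 * w for w in weights] * 2
    return doubled_alphas, doubled_weights
```

With `alphas=[0.0, 1.0]` and `weights=[0.5, 0.5]` the function `vector_skew_js` reduces to the ordinary JSD, and the parameters returned by `symmetrize` give a divergence that takes the same value when `p` and `q` are swapped.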

2.3. Building Symmetric Families of Vector-Skewed Jensen–Shannon Divergences

We can build infinitely many vector-skew Jensen–Shannon divergences. For example, consider and . Then, , and Interestingly, we can also build infinitely many families of symmetric vector-skew Jensen–Shannon divergences. For example, consider these two examples that illustrate the construction process: Consider . Let denote the weight vector, and the skewing vector. We have . The vector-skew JSD is symmetric iff. (with ) and . In that case, we have , and we obtain the following family of symmetric Jensen–Shannon divergences: Consider , weight vector , and skewing vector for . Then, , and we get the following family of symmetric vector-skew JSDs: We can similarly carry on the construction of such symmetric JSDs by increasing the dimensionality of the skewing vector. In fact, we can define with

3. Jensen–Shannon Centroids on Mixture Families

3.1. Mixture Families and Jensen–Shannon Divergences

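Consider a mixture family in information geometry [25]. That is, let us give a prescribed set of $D+1$ linearly independent probability densities $p_0(x),\ldots,p_D(x)$ defined on the sample space $\mathcal{X}$. A mixture family $\mathcal{M}$ of order $D$ consists of all strictly convex combinations of these component densities:
$$\mathcal{M} := \left\{m(x;\theta) := \sum_{i=1}^D\theta_i p_i(x)+\left(1-\sum_{i=1}^D\theta_i\right)p_0(x) \ : \ \theta_i>0,\ \sum_{i=1}^D\theta_i<1\right\}.$$
For example, the family of categorical distributions (sometimes called “multinoulli” distributions) is a mixture family [25]:
$$\mathcal{M} = \left\{m_\theta(x) = \sum_{i=1}^D\theta_i\,\delta(x-x_i)+\left(1-\sum_{i=1}^D\theta_i\right)\delta(x-x_0)\right\},$$
where $\delta(x)$ is the Dirac distribution (i.e., $\delta(x)=1$ for $x=0$ and $\delta(x)=0$ for $x\neq 0$). Note that the mixture family of categorical distributions can also be interpreted as an exponential family.

Notice that the linear independence assumption on the probability densities ensures that the model is identifiable: $\theta\leftrightarrow m(x;\theta)$.

The KL divergence between two densities of a mixture family $\mathcal{M}$ amounts to a Bregman divergence for the Shannon negentropy generator $F(\theta)=-h(m_\theta)$ (see [38]):
$$\mathrm{KL}(m_{\theta_1}:m_{\theta_2}) = B_F(\theta_1:\theta_2) = B_{-h}(\theta_1:\theta_2).$$
On a mixture manifold $\mathcal{M}$, the mixture density $(1-\alpha)m_{\theta_1}+\alpha m_{\theta_2}$ of two mixtures $m_{\theta_1}$ and $m_{\theta_2}$ of $\mathcal{M}$ also belongs to $\mathcal{M}$:
$$(1-\alpha)m_{\theta_1}+\alpha m_{\theta_2} = m_{(\theta_1\theta_2)_\alpha}\in\mathcal{M},$$
where we extend the notation $(\theta_1\theta_2)_\alpha:=(1-\alpha)\theta_1+\alpha\theta_2$ to vectors $\theta_1$ and $\theta_2$: $(\theta_1\theta_2)_\alpha^i=(\theta_1^i\theta_2^i)_\alpha$.

Thus, the vector-skew JSD amounts to a vector-skew Jensen diversity for the Shannon negentropy convex function $F(\theta)=-h(m_\theta)$:
$$\mathrm{JS}^{\alpha,w}(m_{\theta_1}:m_{\theta_2}) = \sum_{i=1}^k w_i\,B_F\left((\theta_1\theta_2)_{\alpha_i}:(\theta_1\theta_2)_{\bar\alpha}\right) = \sum_{i=1}^k w_i F\left((\theta_1\theta_2)_{\alpha_i}\right)-F\left((\theta_1\theta_2)_{\bar\alpha}\right). \tag{82}$$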

3.2. Jensen–Shannon Centroids

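Given a set of $n$ mixture densities $m_{\theta_1},\ldots,m_{\theta_n}$ of $\mathcal{M}$, we seek to calculate the skew-vector Jensen–Shannon centroid (or barycenter for non-uniform weights) defined as $m_{\theta^*}$, where $\theta^*$ is the minimizer of the following objective function (or loss function):
$$L(\theta) := \sum_{j=1}^n\omega_j\,\mathrm{JS}^{\alpha,w}(m_{\theta_j}:m_\theta), \tag{88}$$
where $\omega\in\Delta_n$ is the weight vector of densities (uniform weight for the centroid and non-uniform weight for a barycenter). This definition of the skew-vector Jensen–Shannon centroid is a generalization of the Fréchet mean (the Fréchet mean may not be unique, as is the case on the sphere for two antipodal points, whose Fréchet means with respect to the geodesic metric distance form a great circle) [39] to non-metric spaces. Since the divergence $\mathrm{JS}^{\alpha,w}$ is strictly separable convex, it follows that the Jensen–Shannon-type centroids are unique when they exist.

Plugging Equation (82) into Equation (88), we get that the calculation of the Jensen–Shannon centroid amounts to the following minimization problem:
$$\min_\theta\ \sum_{j=1}^n\omega_j\left(\sum_{i=1}^k w_i F\left((\theta_j\theta)_{\alpha_i}\right)-F\left((\theta_j\theta)_{\bar\alpha}\right)\right). \tag{89}$$
This optimization is a Difference of Convex (DC) programming optimization, for which we can use the ConCave–Convex procedure [27,40] (CCCP). Indeed, let us define the following two convex functions:
$$A(\theta) := \sum_{j=1}^n\sum_{i=1}^k\omega_j w_i F\left((\theta_j\theta)_{\alpha_i}\right),\qquad B(\theta) := \sum_{j=1}^n\omega_j F\left((\theta_j\theta)_{\bar\alpha}\right).$$
Both functions $A(\theta)$ and $B(\theta)$ are convex since $F$ is convex. Then, the minimization problem of Equation (89) can be rewritten as:
$$\min_\theta\ A(\theta)-B(\theta).$$
This is a DC programming optimization problem which can be solved iteratively by initializing $\theta^{(0)}$ to an arbitrary value (say, the centroid of the $\theta_j$s), and then by updating the parameter at step $t$ using the CCCP [27] as follows:
$$\theta^{(t+1)} = (\nabla A)^{-1}\left(\nabla B(\theta^{(t)})\right).$$
Compared to a gradient descent local optimization, there is no required step size (also called “learning” rate) in CCCP.

We have $\nabla A(\theta)=\sum_{j=1}^n\sum_{i=1}^k\omega_j w_i\alpha_i\nabla F\left((\theta_j\theta)_{\alpha_i}\right)$ and $\nabla B(\theta)=\sum_{j=1}^n\omega_j\bar\alpha\nabla F\left((\theta_j\theta)_{\bar\alpha}\right)$.

The CCCP converges to a local optimum $\theta^*$ where the support hyperplanes of the function graphs of $A$ and $B$ at $\theta^*$ are parallel to each other, as depicted in Figure 1. The set of stationary points is $\{\theta : \nabla A(\theta)=\nabla B(\theta)\}$. In practice, the delicate step is to invert $\nabla A$.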
Next, we show how to implement this algorithm for the Jensen–Shannon centroid of a set of categorical distributions (i.e., normalized histograms with all non-empty bins).

Figure 1

The ConCave–Convex Procedure (CCCP) iteratively updates the parameter $\theta$ by aligning the support hyperplanes at $\theta^{(t)}$. In the limit case of convergence to $\theta^*$, the support hyperplanes at $\theta^*$ are parallel to each other. CCCP finds a local minimum.

3.2.1. Jensen–Shannon Centroids of Categorical Distributions

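To illustrate the method, let us consider the mixture family of categorical distributions [25]:
$$\mathcal{M} = \left\{m_\theta(x) = \sum_{i=1}^D\theta_i\,\delta(x-x_i)+\left(1-\sum_{i=1}^D\theta_i\right)\delta(x-x_0)\right\}.$$
The Shannon negentropy is
$$F(\theta) = -h(m_\theta) = \sum_{i=1}^D\theta_i\log\theta_i+\left(1-\sum_{i=1}^D\theta_i\right)\log\left(1-\sum_{i=1}^D\theta_i\right).$$
We have the partial derivatives
$$\nabla F(\theta) = \left[\frac{\partial F}{\partial\theta_i}\right]_i,\qquad \frac{\partial F}{\partial\theta_i} = \log\frac{\theta_i}{1-\sum_{j=1}^D\theta_j}.$$
Inverting the gradient $\nabla F$ requires us to solve the equation $\nabla F(\theta)=\eta$ so that we get $\theta=(\nabla F)^{-1}(\eta)$. We find that
$$\nabla F^*(\eta) = (\nabla F)^{-1}(\eta) = \left[\frac{\exp(\eta_i)}{1+\sum_{j=1}^D\exp(\eta_j)}\right]_i.$$
Table 1 summarizes the dual view of the family of categorical distributions, either interpreted as an exponential family or as a mixture family.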

Table 1

Two views of the family of categorical distributions with $d$ choices: an exponential family or a mixture family of order $D=d-1$. Note that the Bregman divergence associated with the exponential family view corresponds to the reverse Kullback–Leibler (KL) divergence, while the Bregman divergence associated with the mixture family view corresponds to the KL divergence.

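	Exponential Family	Mixture Family
pdf	$p_\theta(x)=\prod_{i=1}^d p_i^{t_i(x)}$, $p_i=\Pr(x=e_i)$, $t_i(x)\in\{0,1\}$, $\sum_{i=1}^d t_i(x)=1$	$m_\theta(x)=\sum_{i=1}^d p_i\,\delta_{e_i}(x)$
primal $\theta$	$\theta_i=\log\frac{p_i}{p_d}$	$\theta_i=p_i$
$F(\theta)$	$\log\left(1+\sum_{i=1}^D\exp(\theta_i)\right)$	$\sum_{i=1}^D\theta_i\log\theta_i+\left(1-\sum_{i=1}^D\theta_i\right)\log\left(1-\sum_{i=1}^D\theta_i\right)$
dual $\eta=\nabla F(\theta)$	$\eta_i=\frac{e^{\theta_i}}{1+\sum_{j=1}^D\exp(\theta_j)}$	$\eta_i=\log\frac{\theta_i}{1-\sum_{j=1}^D\theta_j}$
primal $\theta=\nabla F^*(\eta)$	$\theta_i=\log\frac{\eta_i}{1-\sum_{j=1}^D\eta_j}$	$\theta_i=\frac{e^{\eta_i}}{1+\sum_{j=1}^D\exp(\eta_j)}$
$F^*(\eta)$	$\sum_{i=1}^D\eta_i\log\eta_i+\left(1-\sum_{j=1}^D\eta_j\right)\log\left(1-\sum_{j=1}^D\eta_j\right)$	$\log\left(1+\sum_{i=1}^D\exp(\eta_i)\right)$
Bregman divergence	$B_F(\theta:\theta')=\mathrm{KL}^*(p_\theta:p_{\theta'})=\mathrm{KL}(p_{\theta'}:p_\theta)$	$B_F(\theta:\theta')=\mathrm{KL}(m_\theta:m_{\theta'})$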

We have $\mathrm{JS}(p_1,p_2)=J_F(\theta_1,\theta_2)$ for $p_1=m_{\theta_1}$ and $p_2=m_{\theta_2}$, where
$$J_F(\theta_1,\theta_2) := \frac{F(\theta_1)+F(\theta_2)}{2}-F\left(\frac{\theta_1+\theta_2}{2}\right)$$
is the Jensen divergence [40]. Thus, to compute the Jensen–Shannon centroid of a set of $n$ densities $p_1,\ldots,p_n$ of a mixture family (with $p_i=m_{\theta_i}$), we need to solve the following optimization problem for a density $p=m_\theta$:
$$\theta^* = \arg\min_\theta\ \sum_{i=1}^n J_F(\theta_i,\theta).$$
The CCCP algorithm for the Jensen–Shannon centroid proceeds by initializing $\theta^{(0)}=\frac{1}{n}\sum_{i=1}^n\theta_i$ (center of mass of the natural parameters), and iteratively updates as follows:
$$\theta^{(t+1)} = (\nabla F)^{-1}\left(\frac{1}{n}\sum_{i=1}^n\nabla F\left(\frac{\theta_i+\theta^{(t)}}{2}\right)\right).$$
We iterate until the absolute difference between two successive $\theta^{(t)}$ and $\theta^{(t+1)}$ goes below a prescribed threshold value. The convergence of the CCCP algorithm is linear [41] to a local minimum that is a fixed point of the equation
$$\theta = M_H\left(\frac{\theta_1+\theta}{2},\ldots,\frac{\theta_n+\theta}{2}\right),$$
where $M_H(v_1,\ldots,v_n):=H^{-1}\left(\frac{1}{n}\sum_{i=1}^n H(v_i)\right)$ is a vector generalization of the formula of the quasi-arithmetic means [30,40] obtained for the generator $H=\nabla F$. Algorithm 1 summarizes the method for approximating the Jensen–Shannon centroid of a given set of categorical distributions (given a prescribed number of iterations). In the pseudo-code, we used the notation ${}^{(t)}\theta$ instead of $\theta^{(t)}$ in order to highlight the conversion procedures of the natural parameters to/from the mixture weight parameters by using superscript notations for coordinates.

Figure 2 displays the results of the calculations of the Jeffreys centroid [18] and the Jensen–Shannon centroid for two normalized histograms obtained from grey-valued images of Lena and Barbara. Figure 3 shows the Jeffreys centroid and the Jensen–Shannon centroid for the Barbara image and its negative image. Figure 4 demonstrates that the Jensen–Shannon centroid is well defined even if the input histograms do not have coinciding supports. Notice that on the parts of the support where only one distribution is defined, the JS centroid is a scaled copy of that defined distribution.
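The iteration described above can be sketched in a few lines of Python. This is a minimal illustration rather than the paper's Algorithm 1: it assumes uniform weights, histograms whose bins are all strictly positive, the parameterization θ = (first D bins) with the last bin as the reference component, and a fixed number of iterations instead of a convergence threshold. The function names are hypothetical.

```python
import math

def grad_negentropy(theta):
    """eta_i = log(theta_i / (1 - sum_j theta_j)) for the categorical mixture family."""
    rest = 1.0 - sum(theta)
    return [math.log(t / rest) for t in theta]

def inv_grad_negentropy(eta):
    """theta_i = exp(eta_i) / (1 + sum_j exp(eta_j)), the inverse of grad_negentropy."""
    z = 1.0 + sum(math.exp(e) for e in eta)
    return [math.exp(e) / z for e in eta]

def js_centroid(histograms, iterations=100):
    """CCCP sketch for the Jensen-Shannon centroid of positive normalized histograms."""
    n = len(histograms)
    thetas = [h[:-1] for h in histograms]  # mixture parameters: drop the last bin
    dim = len(thetas[0])
    # Initialize at the center of mass of the parameters.
    theta = [sum(t[i] for t in thetas) / n for i in range(dim)]
    for _ in range(iterations):
        eta = [0.0] * dim
        for t in thetas:
            mid = [(a + b) / 2.0 for a, b in zip(t, theta)]
            g = grad_negentropy(mid)
            eta = [e + gi / n for e, gi in zip(eta, g)]
        theta = inv_grad_negentropy(eta)  # no step size is needed
    return theta + [1.0 - sum(theta)]

def js(p, q):
    """Jensen-Shannon divergence between two discrete distributions."""
    m = [(a + b) / 2.0 for a, b in zip(p, q)]
    def kl(u, v):
        return sum(a * math.log(a / b) for a, b in zip(u, v) if a > 0.0)
    return 0.5 * (kl(p, m) + kl(q, m))

def average_js(histograms, c):
    """Objective value: average JS divergence from the histograms to c."""
    return sum(js(h, c) for h in histograms) / len(histograms)
```

Since CCCP never increases the objective, `average_js` evaluated at the returned centroid should not exceed its value at the arithmetic mean of the input histograms, which is the starting point of the iteration.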

Figure 2

The Jeffreys centroid (grey histogram) and the Jensen–Shannon centroid (black histogram) for two grey normalized histograms of the Lena image (red histogram) and the Barbara image (blue histogram). Although these Jeffreys and Jensen–Shannon centroids look quite similar, observe that there is a major difference between them in the range where the blue histogram is zero.

Figure 3

The Jeffreys centroid (grey histogram) and the Jensen–Shannon centroid (black histogram) for the grey normalized histogram of the Barbara image (red histogram) and its negative image (blue histogram which corresponds to the reflection around the vertical axis of the red histogram).

Figure 4

Jensen–Shannon centroid (black histogram) for the clamped grey normalized histogram of the Lena image (red histograms) and the clamped grey normalized histogram of the Barbara image (blue histograms). Notice that on the part of the sample space where only one distribution is non-zero, the JS centroid scales that histogram portion.

3.2.2. Special Cases

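Let us now consider two special cases:

For the special case of $D=1$, the categorical family is the Bernoulli family, and we have $F(\theta)=\theta\log\theta+(1-\theta)\log(1-\theta)$ (binary negentropy), $F'(\theta)=\log\frac{\theta}{1-\theta}$ (and $F''(\theta)=\frac{1}{\theta(1-\theta)}>0$) and $(F')^{-1}(\eta)=\frac{e^\eta}{1+e^\eta}$. The CCCP update rule to compute the binary Jensen–Shannon centroid becomes
$$\theta^{(t+1)} = (F')^{-1}\left(\sum_i w_i F'\left(\frac{\theta^{(t)}+\theta_i}{2}\right)\right).$$

Since the skew-vector Jensen–Shannon divergence formula holds for positive densities:
$$\mathrm{JS}_+^{\alpha,w}(\tilde{p}:\tilde{q}) = \sum_{i=1}^k w_i\,\mathrm{KL}^+\left((\tilde{p}\tilde{q})_{\alpha_i}:(\tilde{p}\tilde{q})_{\bar\alpha}\right),$$
we can relax the computation of the Jensen–Shannon centroid by considering 1D separable minimization problems. We then normalize the positive JS centroids to get an approximation of the probability JS centroids. This approach was also considered when dealing with the Jeffreys’ centroid [18]. In 1D, we have $F(\theta)=\theta\log\theta-\theta$, $F'(\theta)=\log\theta$ and $(F')^{-1}(\eta)=e^\eta$.

In general, calculating the negentropy for a mixture family with continuous densities sharing the same support is not tractable because of the log-sum term of the differential entropy. However, the following remark emphasizes an extension of the mixture family of categorical distributions: Consider a mixture family mutually non-intersecting( Note that the term Notice that we can truncate an exponential family [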
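For the Bernoulli case, the update rule involves only the logit function and its inverse. The following Python sketch assumes the update θ ← (F′)⁻¹(Σ w_i F′((θ + θ_i)/2)) with F′(θ) = log(θ/(1−θ)), parameters strictly inside (0,1), and a fixed iteration count; the names `logit`, `sigmoid`, and `binary_js_centroid` are hypothetical.

```python
import math

def logit(t):
    """F'(theta) = log(theta / (1 - theta)) for the binary negentropy."""
    return math.log(t / (1.0 - t))

def sigmoid(eta):
    """(F')^{-1}(eta) = exp(eta) / (1 + exp(eta)), written in a stable form."""
    return 1.0 / (1.0 + math.exp(-eta))

def binary_js_centroid(thetas, weights, iterations=200):
    """CCCP sketch for the Jensen-Shannon barycenter of Bernoulli parameters."""
    theta = sum(w * t for t, w in zip(thetas, weights))  # start at the weighted mean
    for _ in range(iterations):
        eta = sum(w * logit(0.5 * (theta + t)) for t, w in zip(thetas, weights))
        theta = sigmoid(eta)
    return theta
```

By symmetry, two equally weighted parameters such as 0.2 and 0.8 should give a centroid of 0.5.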

3.2.3. Some Remarks and Properties

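In general, the entropy and cross-entropy between densities of a mixture family (whether the distributions have disjoint supports or not) can be calculated in closed-form.

The entropy of a density belonging to a mixture family $\mathcal{M}$ is $h(m_\theta)=-F(\theta)$, and the cross-entropy between two mixture densities $m_{\theta_1}$ and $m_{\theta_2}$ is $h^\times(m_{\theta_1}:m_{\theta_2})=-F(\theta_2)-(\theta_1-\theta_2)^\top\eta_2=F^*(\eta_2)-\theta_1^\top\eta_2$.

Let us write the KLD as the difference between the cross-entropy minus the entropy [4]:
$$\mathrm{KL}(m_{\theta_1}:m_{\theta_2}) = h^\times(m_{\theta_1}:m_{\theta_2})-h(m_{\theta_1}) = B_F(\theta_1:\theta_2) = F(\theta_1)-F(\theta_2)-(\theta_1-\theta_2)^\top\nabla F(\theta_2).$$
Following [45], we deduce that $h(m_\theta)=-F(\theta)+c$ and $h^\times(m_{\theta_1}:m_{\theta_2})=-F(\theta_2)-(\theta_1-\theta_2)^\top\eta_2+c$ for a constant $c$. Since $F(\theta)=-h(m_\theta)$ by definition, it follows that $c=0$ and that $h^\times(m_{\theta_1}:m_{\theta_2})=-F(\theta_2)-(\theta_1-\theta_2)^\top\eta_2=F^*(\eta_2)-\theta_1^\top\eta_2$, where $\eta=\nabla F(\theta)$. □

Thus, we can numerically compute the Jensen–Shannon centroids (or barycenters) of a set of densities belonging to a mixture family. This includes the case of categorical distributions and the case of Gaussian Mixture Models (GMMs) with prescribed Gaussian components [38] (although in this case, the negentropy needs to be stochastically approximated using Monte Carlo techniques [46]). When the densities do not belong to a mixture family (say, the Gaussian family, which is an exponential family [25]), we face the problem that the mixture of two densities does not belong to the family anymore. One way to tackle this problem is to project the mixture onto the Gaussian family. This corresponds to an m-projection (mixture projection) which can be interpreted as a Maximum Entropy projection of the mixture [25,47].

Notice that we can perform fast k-means clustering without centroid calculations using a generalization of the k-means++ probabilistic initialization [48,49]. See [50] for details of the generalized k-means++ probabilistic initialization defined according to an arbitrary divergence.

Finally, let us notice some decompositions of the Jensen–Shannon divergence and the skew Jensen divergences. We have the following decomposition for the Jensen–Shannon divergence: where and Similarly, the α-skew Jensen divergence can be decomposed as the sum of the information Notice that the information

Finally, let us briefly mention the Jensen–Shannon diversity [30] which extends the Jensen–Shannon divergence to a weighted set of densities as follows:
$$\mathrm{JS}(p_1,\ldots,p_k;w_1,\ldots,w_k) := \sum_{i=1}^k w_i\,\mathrm{KL}(p_i:\bar{p}) = h(\bar{p})-\sum_{i=1}^k w_i\,h(p_i),$$
where $\bar{p}=\sum_{i=1}^k w_ip_i$.
The Jensen–Shannon diversity plays the role of the variance of a cluster with respect to the KLD. Indeed, let us state the compensation identity [51]: For any $q$, we have
$$\sum_{i=1}^k w_i\,\mathrm{KL}(p_i:q) = \sum_{i=1}^k w_i\,\mathrm{KL}(p_i:\bar{p})+\mathrm{KL}(\bar{p}:q).$$
Thus, the cluster center defined as the minimizer of $\sum_{i=1}^k w_i\,\mathrm{KL}(p_i:q)$ is the centroid $\bar{p}$, and
$$\sum_{i=1}^k w_i\,\mathrm{KL}(p_i:\bar{p}) = \mathrm{JS}(p_1,\ldots,p_k;w_1,\ldots,w_k).$$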
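The compensation identity is easy to verify numerically. The short Python sketch below assumes the standard form of the identity, Σ w_i KL(p_i:q) = Σ w_i KL(p_i:p̄) + KL(p̄:q) with p̄ = Σ w_i p_i, for discrete distributions; the helper names are hypothetical.

```python
import math

def kl(p, q):
    """Discrete KL(p:q) with the convention 0 log 0 = 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)

def entropy(p):
    """Shannon entropy in nats."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0.0)

def barycenter(ps, ws):
    """Weighted arithmetic mean p_bar = sum_i w_i p_i."""
    return [sum(w * p[i] for p, w in zip(ps, ws)) for i in range(len(ps[0]))]

def js_diversity(ps, ws):
    """Jensen-Shannon diversity: weighted KL divergences to the barycenter."""
    p_bar = barycenter(ps, ws)
    return sum(w * kl(p, p_bar) for p, w in zip(ps, ws))

def weighted_kl_to(ps, ws, q):
    """Cluster cost sum_i w_i KL(p_i : q) for a candidate center q."""
    return sum(w * kl(p, q) for p, w in zip(ps, ws))
```

Because KL(p̄:q) is non-negative and vanishes only at q = p̄, the identity shows directly why `weighted_kl_to` is minimized at the barycenter.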

4. Conclusions and Discussion

The Jensen–Shannon divergence [6] is a renowned symmetrization of the Kullback–Leibler oriented divergence that enjoys the following three essential properties: it is always bounded, it applies to densities with potentially different supports, and it extends to unnormalized densities while enjoying the same formula expression. This JSD plays an important role in machine learning and in deep learning for studying Generative Adversarial Networks (GANs) [52]. Traditionally, the JSD has been skewed with a scalar parameter [19,53] $\alpha\in(0,1)$. In practice, it has been experimentally demonstrated that skewing divergences may significantly improve the performance of some tasks (e.g., [21,54]).

In general, we can symmetrize the KLD by taking an abstract mean $M$ (we require a symmetric mean $M(x,y)=M(y,x)$ with the in-betweenness property: $\min\{x,y\}\leq M(x,y)\leq\max\{x,y\}$) between the two orientations $\mathrm{KL}(p:q)$ and $\mathrm{KL}(q:p)$:
$$\mathrm{KL}_M(p,q) := M\left(\mathrm{KL}(p:q),\mathrm{KL}(q:p)\right).$$
We recover the Jeffreys divergence by taking the arithmetic mean twice (i.e., $J(p,q)=2A\left(\mathrm{KL}(p:q),\mathrm{KL}(q:p)\right)$ where $A(x,y)=\frac{x+y}{2}$), and the resistor average divergence [55] by taking the harmonic mean (i.e., $R_{\mathrm{KL}}(p,q)=H\left(\mathrm{KL}(p:q),\mathrm{KL}(q:p)\right)$ where $H(x,y)=\frac{2xy}{x+y}$). When we take the limit of Hölder power means, we get the following extremal symmetrizations of the KLD:
$$\mathrm{KL}^{\min}(p,q) = \min\{\mathrm{KL}(p:q),\mathrm{KL}(q:p)\},\qquad \mathrm{KL}^{\max}(p,q) = \max\{\mathrm{KL}(p:q),\mathrm{KL}(q:p)\}.$$

In this work, we showed how to vector-skew the JSD while preserving the above three properties. These new families of weighted vector-skew Jensen–Shannon divergences may allow one to fine-tune the dissimilarity in applications by replacing the skewing scalar parameter of the JSD by a vector parameter (informally, adding some “knobs” for tuning a divergence). We then considered computing the Jensen–Shannon centroids of a set of densities belonging to a mixture family [25] by using the convex–concave procedure [27].

In general, we can vector-skew any arbitrary divergence $D$ by using two k-dimensional vectors $\alpha\in[0,1]^k$ and $\beta\in[0,1]^k$ (with $\alpha\neq\beta$) by building a weighted separable divergence as follows:
$$D^{\alpha,\beta,w}(p:q) := \sum_{i=1}^k w_i\,D\left((pq)_{\alpha_i}:(pq)_{\beta_i}\right) = D^{1_k-\alpha,1_k-\beta,w}(q:p),\quad\alpha\neq\beta.$$
This bi-vector-skew divergence unifies the Jeffreys divergence with the Jensen–Shannon $\alpha$-skew divergence by setting the following parameters:
$$\mathrm{KL}^{(0,1),(1,0),(1,1)}(p:q) = \mathrm{KL}(p:q)+\mathrm{KL}(q:p) = J(p,q),$$
$$\mathrm{KL}^{(0,1),(\alpha,1-\alpha),\left(\frac{1}{2},\frac{1}{2}\right)}(p:q) = \frac{1}{2}\mathrm{KL}\left(p:(pq)_\alpha\right)+\frac{1}{2}\mathrm{KL}\left(q:(qp)_\alpha\right) = \mathrm{JS}^{\alpha}(p,q).$$
We have shown in this paper that interesting properties may occur when the skewing vector $\beta$ is purposely correlated to the skewing vector $\alpha$: namely, for the bi-vector-skew Bregman divergences with $\beta=(\bar\alpha,\ldots,\bar\alpha)$ and $\bar\alpha=\sum_i w_i\alpha_i$, we obtain an equivalent Jensen diversity for the Jensen–Bregman divergence, and, as a byproduct, a vector-skew generalization of the Jensen–Shannon divergence.
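The bi-vector-skew construction can be illustrated with a short Python sketch, assuming the weighted separable form Σ w_i D((pq)_{α_i} : (pq)_{β_i}) applied to discrete distributions with full support. The function `bi_vector_skew` and its argument names are hypothetical; it accepts any base divergence `div` as a Python callable.

```python
import math

def kl(p, q):
    """Discrete KL(p:q) with the convention 0 log 0 = 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)

def mix(p, q, alpha):
    """Mixture (pq)_alpha = (1 - alpha) p + alpha q."""
    return [(1.0 - alpha) * pi + alpha * qi for pi, qi in zip(p, q)]

def bi_vector_skew(div, p, q, alphas, betas, weights):
    """Weighted separable divergence sum_i w_i div((pq)_{alpha_i} : (pq)_{beta_i})."""
    return sum(w * div(mix(p, q, a), mix(p, q, b))
               for a, b, w in zip(alphas, betas, weights))

def jeffreys_via_skew(p, q):
    """Jeffreys divergence: alphas=(0,1), betas=(1,0), weights=(1,1)."""
    return bi_vector_skew(kl, p, q, [0.0, 1.0], [1.0, 0.0], [1.0, 1.0])

def js_alpha_via_skew(p, q, alpha):
    """Symmetric alpha-skew JSD: alphas=(0,1), betas=(alpha,1-alpha), weights=(1/2,1/2)."""
    return bi_vector_skew(kl, p, q, [0.0, 1.0], [alpha, 1.0 - alpha], [0.5, 0.5])
```

Passing the divergence as a callable keeps the sketch generic: any other base divergence with the same two-argument signature could be skewed in the same way.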

3 in total

1. An invariant form for the prior probability in estimation problems.

Authors: H JEFFREYS
Journal: Proc R Soc Lond A Math Phys Sci Date: 1946

2. Information entropy, information distances, and complexity in atoms.

Authors: K Ch Chatzisavvas; Ch C Moustakidis; C P Panos
Journal: J Chem Phys Date: 2005-11-01 Impact factor: 3.488

3. Entropy and distance of random graphs with application to structural pattern recognition.

Authors: A K Wong; M You
Journal: IEEE Trans Pattern Anal Mach Intell Date: 1985-05 Impact factor: 6.226
