
Multiplicative Decomposition of Heterogeneity in Mixtures of Continuous Distributions.

Abraham Nunes¹, Martin Alda¹, Thomas Trappenberg².

Abstract

A system's heterogeneity (diversity) is the effective size of its event space, and can be quantified using the Rényi family of indices (also known as Hill numbers in ecology or Hannah-Kay indices in economics), which are indexed by an elasticity parameter q≥0. Under these indices, the heterogeneity of a composite system (the γ-heterogeneity) is decomposable into heterogeneity arising from variation within and between component subsystems (the α- and β-heterogeneity, respectively). Since the average heterogeneity of a component subsystem should not be greater than that of the pooled system, we require that γ≥α. There exists a multiplicative decomposition for Rényi heterogeneity of composite systems with discrete event spaces, but less attention has been paid to decomposition in the continuous setting. We therefore describe multiplicative decomposition of the Rényi heterogeneity for continuous mixture distributions under parametric and non-parametric pooling assumptions. Under non-parametric pooling, the γ-heterogeneity must often be estimated numerically, but the multiplicative decomposition holds such that γ≥α for q>0. Conversely, under parametric pooling, γ-heterogeneity can be computed efficiently in closed-form, but the γ≥α condition holds reliably only at q=1. Our findings will further contribute to heterogeneity measurement in continuous systems.


Keywords: Gaussian mixture; decomposition; diversity; heterogeneity

Year: 2020 PMID: 33286629 PMCID: PMC7517460 DOI: 10.3390/e22080858

Source DB: PubMed Journal: Entropy (Basel) ISSN: 1099-4300 Impact factor: 2.524

1. Introduction

Measurement of heterogeneity is important across many scientific disciplines. Ecologists are interested in the heterogeneity of ecosystems’ biological composition (biodiversity) [1], economists are interested in the heterogeneity of resource ownership (wealth equality) [2], and medical researchers and physicians are interested in the heterogeneity of diseases and their presentations [3]. Using Rényi heterogeneity [3,4,5], which for categorical random variables corresponds to ecologists’ Hill numbers [6] and economists’ Hannah–Kay indices [7], one can measure a system’s heterogeneity as its effective number of distinct configurations. The heterogeneity of a mixture or ensemble of systems is often known as γ-heterogeneity, and is generated by variation occurring within and between constituent subsystems. A good heterogeneity measure will facilitate decomposition of γ-heterogeneity into α (within-subsystem) and β (between-subsystem) components. Under this decomposition, we require that γ≥α, since it is counterintuitive that the heterogeneity of the overall ensemble should be less than any of its constituents, let alone the “average” subsystem [8,9]. Such a decomposition was introduced by Jost [9] for systems represented on discrete event spaces (such as representations of organisms by species labels). However, many data are better modeled by continuous embeddings, including word semantics [10,11,12], genetic population structure [13], and natural images [14]. Unfortunately, considerably less is understood about how to decompose Rényi heterogeneity in such cases where data are represented on non-categorical spaces [4]. Although there are decomposable functional diversity indices expressed in numbers equivalent, they require categorical partitioning of the data (in order to supply species (dis)similarity matrices) [15,16,17,18] and setting sensitivity or threshold parameters for (dis)similarities [16,18].
For many research applications, such as those in psychiatry [3,4,19] or involving unsupervised learning [13,14], we may not have categorical partitions of the observable space that are valid, reliable, and of semantic relevance. If we are to apply Rényi heterogeneity to such continuous-space systems, then we must demonstrate that its multiplicative decomposition of γ-heterogeneity into α and β components is retained. Therefore, our present work extends the Jost [9] multiplicative decomposition of Rényi heterogeneity to the analysis of continuous systems, and provides conditions under which the γ≥α condition is satisfied. In Section 2, we introduce decomposition of the Rényi heterogeneity in categorical and continuous systems. Specifically, we highlight that the most important decision guiding the availability of a decomposition is how one defines the distribution over the mixture of subsystems. We show that, for non-parametrically pooled systems (i.e., finite mixture models, illustrated in Section 3), the γ≥α condition can hold for all values of the Rényi elasticity parameter q>0, but that γ-heterogeneity will generally require numerical estimation. Section 4 introduces decomposition of Rényi heterogeneity under parametric assumptions on the pooled system’s distribution. In this case, which amounts to a Gaussian mixed-effects model (as commonly implemented in biomedical meta-analyses), we show that γ≥α will hold at q=1, though not necessarily at q≠1. Finally, in Section 5, we discuss the implications of our findings and scenarios in which parametric or non-parametric pooling assumptions might be particularly useful.

2. Background

2.1. Categorical Rényi Heterogeneity Decomposition

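In this section, we consider the definition and decomposition of Rényi heterogeneity for a composite random variable (or “system”) that we call a discrete mixture (Definition 1).

Definition 1 (Discrete mixture). A random variable or system X is called a discrete mixture when it is defined on an n-dimensional discrete state space $\mathcal{X} = \{1, \ldots, n\}$ and is composed of N component subsystems $X_1, \ldots, X_N$ with probability distributions $\mathbf{p}_i = (p_{i1}, \ldots, p_{in})$ and weights $w_i \geq 0$ satisfying $\sum_{i=1}^{N} w_i = 1$.

Let X be a discrete mixture. The Rényi heterogeneity for the ith component is

$$\Pi_q(X_i) = \left( \sum_{j=1}^{n} p_{ij}^{q} \right)^{\frac{1}{1-q}},$$

which is the effective number of states in $X_i$. Assuming the pooled distribution over discrete mixture X is a weighted average of subsystem distributions, $\bar{\mathbf{p}} = \sum_{i=1}^{N} w_i \mathbf{p}_i$, the γ-heterogeneity is thus

$$\Pi_q^{\gamma}(X) = \left( \sum_{j=1}^{n} \bar{p}_j^{q} \right)^{\frac{1}{1-q}},$$

which we interpret as the effective number of states in the pooled system X. Jost [9] proposed the following decomposition of γ-heterogeneity:

$$\Pi_q^{\gamma}(X) = \Pi_q^{\alpha}(X)\, \Pi_q^{\beta}(X), \qquad (3)$$

where $\Pi_q^{\alpha}(X)$ and $\Pi_q^{\beta}(X)$ are summary measures of heterogeneity due to variation within and between subsystems, respectively. Since the γ factor has units of effective number of states in the pooled system, and α has units of effective number of states per component, then

$$\Pi_q^{\beta}(X) = \frac{\Pi_q^{\gamma}(X)}{\Pi_q^{\alpha}(X)} \qquad (4)$$

yields the effective number of components in X. For discrete mixtures, Jost [9] specified the functional form for α-heterogeneity as

$$\Pi_q^{\alpha}(X) = \begin{cases} \left( \sum_{i=1}^{N} \frac{w_i^{q}}{\sum_{k=1}^{N} w_k^{q}} \sum_{j=1}^{n} p_{ij}^{q} \right)^{\frac{1}{1-q}} & q \neq 1 \\ \exp\left( -\sum_{i=1}^{N} w_i \sum_{j=1}^{n} p_{ij} \log p_{ij} \right) & q = 1, \end{cases}$$

which allows the decomposition in Equation (3) to satisfy the following desiderata: (i) the α and β components are independent [20]; (ii) the within-group heterogeneity is a lower bound on total heterogeneity [8], that is, γ≥α; (iii) the α-heterogeneity is a form of average heterogeneity over groups; and (iv) the α and β components are both expressed in numbers equivalent. Specifically, Jost [9] proved that γ≥α is guaranteed for all q when the weights are equal, $w_i = 1/N$ for all $i \in \{1, \ldots, N\}$, or for unequal weights if the elasticity is set to the Shannon limit of q→1.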
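The discrete decomposition described above can be sketched in a few lines. This is a minimal illustration rather than the authors' implementation: the function names `renyi_heterogeneity` and `decompose` are hypothetical, and the α term assumes the Jost-style form in which component terms are weighted by $w_i^q / \sum_k w_k^q$ (with the exponential of the weighted mean Shannon entropy at q=1).

```python
import numpy as np

def renyi_heterogeneity(p, q):
    """Rényi heterogeneity (Hill number) of a discrete distribution p at order q."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]  # zero-probability states contribute nothing
    if np.isclose(q, 1.0):
        return float(np.exp(-np.sum(p * np.log(p))))  # Shannon limit
    return float(np.sum(p ** q) ** (1.0 / (1.0 - q)))

def decompose(P, w, q):
    """Return (gamma, alpha, beta) for component distributions P (one per row)
    and component weights w (assumed non-negative and summing to one)."""
    P = np.asarray(P, dtype=float)
    w = np.asarray(w, dtype=float)
    gamma = renyi_heterogeneity(w @ P, q)  # heterogeneity of the pooled distribution
    if np.isclose(q, 1.0):
        # exponential of the weighted mean Shannon entropy of the components
        safe = np.where(P > 0, P, 1.0)
        entropies = -np.sum(P * np.log(safe), axis=1)
        alpha = float(np.exp(np.sum(w * entropies)))
    else:
        wq = w ** q
        powers = np.sum(np.where(P > 0, P ** q, 0.0), axis=1)
        alpha = float((np.sum(wq * powers) / np.sum(wq)) ** (1.0 / (1.0 - q)))
    return gamma, alpha, gamma / alpha

# Two equally weighted components with disjoint support, each uniform on two states.
P = [[0.5, 0.5, 0.0, 0.0],
     [0.0, 0.0, 0.5, 0.5]]
w = [0.5, 0.5]
for q in (0.5, 1.0, 2.0):
    print(q, decompose(P, w, q))
```

In this toy case the pooled system has four equally likely states, each component has two, and β recovers the two non-overlapping components at every order tried.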

2.2. Continuous Rényi Heterogeneity Decomposition

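Let X be a non-parametric continuous mixture according to Definition 2. Despite individual mixture components in X potentially having parametric probability density functions, we call this a “non-parametric” mixture because the distribution over pooled components does not assume the form of a known parametric family.

Definition 2 (Non-parametric continuous mixture). A non-parametric continuous mixture is a random variable X defined on an n-dimensional continuous space $\mathcal{X} \subseteq \mathbb{R}^n$, composed of N subsystems $X_1, \ldots, X_N$ with probability density functions $f_1, \ldots, f_N$ and weights $w_i \geq 0$ satisfying $\sum_{i=1}^{N} w_i = 1$, whose pooled density is

$$\bar{f}(\mathbf{x}) = \sum_{i=1}^{N} w_i f_i(\mathbf{x}). \qquad (6)$$

The continuous Rényi heterogeneity for the ith subsystem of X is

$$\Pi_q(X_i) = \left( \int_{\mathcal{X}} f_i^{q}(\mathbf{x})\, d\mathbf{x} \right)^{\frac{1}{1-q}},$$

whose interpretation is given by Proposition 1 (see Proposition A3 in Nunes et al. [5] for the proof), which we henceforth call the “effective volume” of the event space or domain of $X_i$.

Proposition 1 (Rényi Heterogeneity of a Continuous Random Variable). The Rényi heterogeneity of a continuous random variable X defined on event space […]

Given the pooled distribution as defined in Equation (6), the Rényi heterogeneity over the mixture, which is the γ-heterogeneity, is

$$\Pi_q^{\gamma}(X) = \left( \int_{\mathcal{X}} \bar{f}^{q}(\mathbf{x})\, d\mathbf{x} \right)^{\frac{1}{1-q}}. \qquad (8)$$

The γ-heterogeneity is thus the total effective volume of X’s domain. The α-heterogeneity represents the effective volume per mixture component in X, and is computed as follows:

$$\Pi_q^{\alpha}(X) = \begin{cases} \left( \sum_{i=1}^{N} \frac{w_i^{q}}{\sum_{k=1}^{N} w_k^{q}} \int_{\mathcal{X}} f_i^{q}(\mathbf{x})\, d\mathbf{x} \right)^{\frac{1}{1-q}} & q \neq 1 \\ \exp\left( -\sum_{i=1}^{N} w_i \int_{\mathcal{X}} f_i(\mathbf{x}) \log f_i(\mathbf{x})\, d\mathbf{x} \right) & q = 1. \end{cases} \qquad (9)$$

Given Equations (8) and (9), the following theorem provides conditions under which γ≥α is satisfied for a non-parametric continuous mixture. The proof is analogous to that given by Jost [9] for discrete mixtures, and is detailed in Appendix A.

Theorem 1. If X is a non-parametric continuous mixture (Definition 2), with γ-heterogeneity specified by Equation (8) and α-heterogeneity specified by Equation (9), then γ≥α under the following conditions: (i) q=1, for any component weights; (ii) q>0, when the component weights are equal, $w_i = 1/N$ for all $i \in \{1, \ldots, N\}$.

If $\int_{\mathcal{X}} f_i^{q}(\mathbf{x})\, d\mathbf{x}$ is analytically tractable for all $i \in \{1, \ldots, N\}$, then a closed form expression for α will be available. If $\int_{\mathcal{X}} \bar{f}^{q}(\mathbf{x})\, d\mathbf{x}$ is also analytically tractable, then γ will be too. However, this will depend entirely on the functional form of $\bar{f}$, and will rarely be the case using real world data. In the majority of cases, γ will have to be computed numerically.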
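To make the “effective volume” reading concrete, the sketch below evaluates the continuous Rényi heterogeneity, $(\int f^q\,dx)^{1/(1-q)}$ with the limit $\exp(-\int f \log f\,dx)$ at q=1, on a one-dimensional grid. The helper names are hypothetical and the trapezoidal rule is only one possible integrator; it is an illustration, not the method used in the paper.

```python
import numpy as np

def trapezoid(y, x):
    """Composite trapezoidal rule, written out to avoid NumPy version differences."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def continuous_renyi_heterogeneity(density, grid, q):
    """Numerical Rényi heterogeneity (q >= 0) of a 1-D density evaluated on a grid."""
    f = np.asarray(density(grid), dtype=float)
    positive = f > 0
    if np.isclose(q, 1.0):
        # Shannon limit: exponential of the differential entropy
        integrand = np.where(positive, -f * np.log(np.where(positive, f, 1.0)), 0.0)
        return float(np.exp(trapezoid(integrand, grid)))
    integrand = np.where(positive, f ** q, 0.0)
    return trapezoid(integrand, grid) ** (1.0 / (1.0 - q))

# A uniform density on [0, 3] should have effective volume 3 at every order q.
grid = np.linspace(0.0, 3.0, 1001)
uniform = lambda x: np.full_like(x, 1.0 / 3.0)
for q in (0.5, 1.0, 2.0):
    print(q, continuous_renyi_heterogeneity(uniform, grid, q))
```

For the uniform density the result equals the length of the support regardless of q, which is the sense in which the index measures the size of the domain rather than the number of categories.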

3. Rényi Heterogeneity Decomposition under a Non-Parametric Pooling Distribution

Definition 3 defines a general Gaussian mixture X as a weighted combination of component Gaussian random variables, without identifying the functional form of the composition. The non-parametric Gaussian mixture, where the distribution over X is a simple model average over its Gaussian components, is specified in Definition 4.

Definition 3 (Gaussian mixture). The n-dimensional Gaussian mixture X is a weighted combination of the set of n-dimensional Gaussian random variables $\{X_i\}_{i=1}^{N}$, where $X_i \sim \mathcal{N}(\boldsymbol{\mu}_i, \boldsymbol{\Sigma}_i)$, with component weights $w_i \geq 0$ satisfying $\sum_{i=1}^{N} w_i = 1$.

Definition 4 (Non-parametric Gaussian mixture). We define the random variable X as a non-parametric Gaussian mixture if it is a Gaussian mixture (Definition 3) whose probability density function is defined as

$$\bar{f}(\mathbf{x}) = \sum_{i=1}^{N} w_i\, \mathcal{N}(\mathbf{x} \mid \boldsymbol{\mu}_i, \boldsymbol{\Sigma}_i),$$

where $\mathcal{N}(\mathbf{x} \mid \boldsymbol{\mu}_i, \boldsymbol{\Sigma}_i)$ is the Gaussian density with mean $\boldsymbol{\mu}_i$ and covariance matrix $\boldsymbol{\Sigma}_i$.

We now introduce the Rényi heterogeneity of a single n-dimensional Gaussian random variable (Proposition 2) and subsequently characterize the γ-, α-, and β-heterogeneity values for a non-parametric Gaussian mixture.

Proposition 2 (Rényi Heterogeneity of a Multivariate Gaussian). The Rényi heterogeneity of an n-dimensional Gaussian random variable X with mean $\boldsymbol{\mu}$ and covariance matrix $\boldsymbol{\Sigma}$ is

$$\Pi_q(X) = \begin{cases} (2\pi e)^{\frac{n}{2}} |\boldsymbol{\Sigma}|^{\frac{1}{2}} & q = 1 \\ (2\pi)^{\frac{n}{2}}\, q^{\frac{n}{2(q-1)}}\, |\boldsymbol{\Sigma}|^{\frac{1}{2}} & q \in (0,1) \cup (1,\infty). \end{cases} \qquad (12)$$

The proof of Proposition 2 is included in Appendix A. Unfortunately, a closed form solution such as Equation (12) cannot be obtained for the γ-heterogeneity of a non-parametric Gaussian mixture,

$$\Pi_q^{\gamma}(X) = \left( \int_{\mathbb{R}^n} \left( \sum_{i=1}^{N} w_i\, \mathcal{N}(\mathbf{x} \mid \boldsymbol{\mu}_i, \boldsymbol{\Sigma}_i) \right)^{q} d\mathbf{x} \right)^{\frac{1}{1-q}}, \qquad (13)$$

which must be computed numerically to yield the effective size of the mixture’s domain. This process may be computationally expensive, particularly in high dimensions. Conversely, Equation (9), which yields the effective size of the domain per mixture component, can be evaluated in closed form for a Gaussian mixture:

$$\Pi_q^{\alpha}(X) = \begin{cases} (2\pi e)^{\frac{n}{2}} \exp\left( \frac{1}{2} \sum_{i=1}^{N} w_i \log |\boldsymbol{\Sigma}_i| \right) & q = 1 \\ (2\pi)^{\frac{n}{2}}\, q^{\frac{n}{2(q-1)}} \left( \sum_{i=1}^{N} \frac{w_i^{q}}{\sum_{k=1}^{N} w_k^{q}} |\boldsymbol{\Sigma}_i|^{\frac{1-q}{2}} \right)^{\frac{1}{1-q}} & q \in (0,1) \cup (1,\infty). \end{cases} \qquad (14)$$

The β-heterogeneity, which returns the effective number of components in the mixture, can then be computed using Equation (4). Example 1 demonstrates an important property of considering X as a non-parametric Gaussian mixture: that low-probability regions of the domain between well-separated components will have little to no effect on the γ- or β-heterogeneity estimates.

Example 1 (Decomposition of Rényi heterogeneity in a univariate Gaussian mixture). Consider three non-parametric Gaussian mixtures […]

Assuming sufficiently accurate approximation of the integral in Equation (13), the γ-heterogeneity in Example 1 appears to reach a limit corresponding to the sum of effective domain sizes under all mixture components, and the β-heterogeneity reaches a limit corresponding to the number of individual mixture components. Unfortunately, computation of γ-heterogeneity in a non-parametric Gaussian mixture will yield results whose accuracy will depend on the error of numerical integration, and which may consume significant computational resources when evaluated for large N (many components) and large n (high dimension). Monte Carlo integration may be preferable for high dimensional mixture distributions, but running samplers can still be costly if the γ-heterogeneity must be estimated many times. Although the non-parametric pooling approach may be the only available method for many distribution classes, a computationally efficient parametric pooling approach exists for Gaussian mixtures, to which we now turn our attention.
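The limiting behaviour described for the non-parametric mixture can be reproduced with a small univariate sketch. The settings here are assumptions made for illustration and are not the parameters of Example 1: the function names, the grid resolution, and the padding are hypothetical; γ is approximated by a Riemann sum on a fine grid; and α uses the closed-form Gaussian heterogeneity $\sqrt{2\pi\sigma^2}\,q^{1/(2(q-1))}$ (or $\sqrt{2\pi e\,\sigma^2}$ at q=1) with the Jost-style $w_i^q$ weighting.

```python
import numpy as np

def gaussian_heterogeneity(var, q):
    """Closed-form Rényi heterogeneity (q > 0) of a univariate Gaussian."""
    if np.isclose(q, 1.0):
        return float(np.sqrt(2.0 * np.pi * np.e * var))
    return float(np.sqrt(2.0 * np.pi * var) * q ** (1.0 / (2.0 * (q - 1.0))))

def mixture_pdf(x, means, variances, weights):
    """Density of the non-parametric (finite) Gaussian mixture at points x."""
    x = np.asarray(x, dtype=float)[:, None]
    comp = np.exp(-0.5 * (x - means) ** 2 / variances) / np.sqrt(2.0 * np.pi * variances)
    return comp @ weights

def nonparametric_decomposition(means, variances, weights, q, n_grid=20001, pad=12.0):
    """Return (gamma, alpha, beta); gamma is integrated numerically on a grid."""
    means, variances, weights = (np.asarray(a, dtype=float)
                                 for a in (means, variances, weights))
    sd = np.sqrt(variances)
    grid = np.linspace(np.min(means - pad * sd), np.max(means + pad * sd), n_grid)
    dx = grid[1] - grid[0]
    f = mixture_pdf(grid, means, variances, weights)
    pos = f > 0  # guard against underflow between distant components
    comp = np.array([gaussian_heterogeneity(v, q) for v in variances])
    if np.isclose(q, 1.0):
        gamma = float(np.exp(-np.sum(f[pos] * np.log(f[pos])) * dx))
        alpha = float(np.exp(np.sum(weights * np.log(comp))))
    else:
        gamma = float((np.sum(f[pos] ** q) * dx) ** (1.0 / (1.0 - q)))
        wq = weights ** q
        alpha = float((np.sum(wq * comp ** (1.0 - q)) / np.sum(wq)) ** (1.0 / (1.0 - q)))
    return gamma, alpha, gamma / alpha

# Two unit-variance, equally weighted components moved progressively apart.
for separation in (0.0, 2.0, 5.0, 40.0):
    print(separation, nonparametric_decomposition([0.0, separation], [1.0, 1.0], [0.5, 0.5], q=1.0))
```

With coincident components β is 1, and as the means separate β approaches the number of components (2 here) and then stops growing. The grid approach scales poorly with dimension, which is the computational cost noted in the text.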

4. Rényi Heterogeneity Decomposition under a Parametric Pooling Distribution

This section introduces the parametric Gaussian mixture (Definition 5). This is essentially an ensemble of individual Gaussian distributions whose means and covariance matrices are weighted and pooled to obtain the mean and covariance matrix of the mixture as a whole. We subsequently provide conditions under which decomposition of the parametric Gaussian mixture’s heterogeneity satisfies the requirement that α-heterogeneity be a lower bound on γ-heterogeneity (Theorem 2). Parametric Gaussian mixtures are an important class of models commonly used in mixed-effects meta-analyses [21], where one models the effect size of each of N studies as Gaussians whose means are themselves Gaussian distributed around a “true” effect size with some variance. The variance of the true effect is often taken as an index of between-study heterogeneity, but unfortunately variance does not satisfy the replication principle [4]. A parametric Gaussian mixture can also be used to measure the effective number of natural images embedded in the real valued latent space of a variational autoencoder (a probabilistic deep learning model used to learn compressed representations of high-dimensional data) [5].

Definition 5 (Parametric Gaussian mixture). We define the random variable X as an n-dimensional parametric Gaussian mixture if it is a Gaussian mixture (Definition 3) whose probability density function is defined as

$$\bar{f}(\mathbf{x}) = \mathcal{N}(\mathbf{x} \mid \bar{\boldsymbol{\mu}}, \bar{\boldsymbol{\Sigma}}),$$

with pooled mean vector

$$\bar{\boldsymbol{\mu}} = \sum_{i=1}^{N} w_i \boldsymbol{\mu}_i,$$

and pooled covariance matrix

$$\bar{\boldsymbol{\Sigma}} = \sum_{i=1}^{N} w_i \left( \boldsymbol{\Sigma}_i + \boldsymbol{\mu}_i \boldsymbol{\mu}_i^{\top} \right) - \bar{\boldsymbol{\mu}} \bar{\boldsymbol{\mu}}^{\top}. \qquad (17)$$

The efficiency of assuming a parametric, rather than non-parametric, Gaussian mixture is that γ-heterogeneity for the former may be computed in closed form using Equation (12) (it is simply a function of Equation (17)). However, the critical difference between the parametric and non-parametric Gaussian mixture assumptions is that γ-heterogeneity (and therefore β-heterogeneity) will depend on the component means $\boldsymbol{\mu}_i$, according to the following Lemma.

Lemma 1 (Relationship of γ-Heterogeneity to Component Dispersion). Let X and […] with equality if […]

Lemma 1, whose proof is detailed in Appendix A, implies that the resulting β-heterogeneity of a parametric Gaussian mixture will increase as the mixture component means are spread further apart. This follows from the fact that Equation (14), which is computed component-wise, remains a valid expression of the α-heterogeneity in a parametric Gaussian mixture. Before stating the conditions under which α is a lower bound on γ for a parametric Gaussian mixture (Theorem 2), we introduce the following Lemma, whose proof is left to Appendix A.

Lemma 2. If […]

Theorem 2. The Rényi β-heterogeneity of order […]

Proof. Recall that α is independent of the mean-vectors of components in X (Equation (14)). Furthermore, it follows from Lemma 1 that, if $\boldsymbol{\mu}_i = \mathbf{0}_n$ for all i, where $\mathbf{0}_n$ is an n × 1 zero vector, then for any parametric Gaussian mixture with means […], we will have […], where equality is obtained if […] are also zero vectors, or the covariance of mean vectors in […], is otherwise singular. Thus, it suffices to prove our theorem under the assumption that $\boldsymbol{\mu}_i = \mathbf{0}_n$, where the pooled covariance of X is redefined as $\bar{\boldsymbol{\Sigma}} = \sum_{i=1}^{N} w_i \boldsymbol{\Sigma}_i$. The expression for β is […] which, after simplification, can be appreciated to satisfy Lemma 2. ☐

Although Theorem 2 highlights the reliability and flexibility of using elasticity q=1, we must emphasize that q=1 may not be the only condition under which γ≥α, as suggested by Example 2. Indeed, Example 2 suggests that the integrity of this bound at elasticity values q≠1 may depend in various ways on the unique combination of component-wise parameters in a parametric Gaussian mixture.

Example 2 (Decomposition of Rényi Heterogeneity in a Parametric Gaussian Mixture). Consider a parametric Gaussian mixture X with four components defined on […] which “skews” the distribution of weights over components in X according to the value of a skew parameter […]

Figure 3 illustrates the effect of progressively separating the locations (i.e., means) of mixture components on the resulting β-heterogeneity of parametric and non-parametric univariate Gaussian mixtures. We implemented mixtures with varying numbers of equally weighted components. Each Gaussian component had unit variance, since our comparison is primarily concerned with separation of component means. The mean of each component was set as a function of a separation parameter, so that as this parameter is increased, mixture components become progressively further separated.
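The parametric pooling route can be sketched as follows. This assumes the pooled moments follow the law of total covariance (pooled mean $\sum_i w_i \mu_i$ and pooled covariance $\sum_i w_i(\Sigma_i + \mu_i\mu_i^\top) - \bar\mu\bar\mu^\top$) and that α keeps the component-wise, $w_i^q$-weighted form; the function names are hypothetical and the example values are illustrative, not those of Example 2.

```python
import numpy as np

def pooled_moments(means, covs, weights):
    """Pooled mean and covariance of a parametric Gaussian mixture."""
    means = np.asarray(means, dtype=float)    # shape (N, n)
    covs = np.asarray(covs, dtype=float)      # shape (N, n, n)
    w = np.asarray(weights, dtype=float)      # shape (N,)
    mu_bar = w @ means
    outer = np.einsum("ij,ik->ijk", means, means)        # mu_i mu_i^T per component
    second_moment = np.einsum("i,ijk->jk", w, covs + outer)
    return mu_bar, second_moment - np.outer(mu_bar, mu_bar)

def gaussian_heterogeneity(cov, q):
    """Closed-form Rényi heterogeneity (q > 0) of an n-dimensional Gaussian."""
    cov = np.atleast_2d(np.asarray(cov, dtype=float))
    n = cov.shape[0]
    _, logdet = np.linalg.slogdet(cov)
    log_base = 0.5 * n * np.log(2.0 * np.pi) + 0.5 * logdet
    if np.isclose(q, 1.0):
        return float(np.exp(log_base + 0.5 * n))
    return float(np.exp(log_base + n * np.log(q) / (2.0 * (q - 1.0))))

def parametric_decomposition(means, covs, weights, q):
    """Return (gamma, alpha, beta) under parametric (single-Gaussian) pooling."""
    w = np.asarray(weights, dtype=float)
    _, cov_bar = pooled_moments(means, covs, w)
    gamma = gaussian_heterogeneity(cov_bar, q)
    comp = np.array([gaussian_heterogeneity(c, q) for c in np.asarray(covs, dtype=float)])
    if np.isclose(q, 1.0):
        alpha = float(np.exp(np.sum(w * np.log(comp))))
    else:
        wq = w ** q
        alpha = float((np.sum(wq * comp ** (1.0 - q)) / np.sum(wq)) ** (1.0 / (1.0 - q)))
    return gamma, alpha, gamma / alpha

# Two equally weighted bivariate components with identity covariance.
covs = [np.eye(2), np.eye(2)]
for d in (0.0, 4.0, 40.0):
    print(d, parametric_decomposition([[0.0, 0.0], [d, 0.0]], covs, [0.5, 0.5], q=1.0))
```

Unlike the non-parametric case, no integration is needed, and β keeps increasing with the separation d instead of levelling off at the number of components.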

Figure 3

Comparison of β-heterogeneity for univariate Gaussian mixtures with varying numbers of components (blue, purple, and gold lines). Individual Gaussian components have unit variance, and component means are set by a separation parameter. Solid lines show the β-heterogeneity computed for the non-parametric Gaussian mixture using a second-order asymptotic approximation to the integral in Equation (13). Dotted markers show β-heterogeneity of respective non-parametric Gaussian mixtures with the γ-heterogeneity estimated numerically. Dashed lines show the respective β-heterogeneities for parametric Gaussian mixtures.

The γ-heterogeneity values of parametric Gaussian mixtures were computed by pooling component means and variances according to Definition 5, to which we applied Equation (12). The γ-heterogeneity values of non-parametric Gaussian mixtures (Equation (13)) were computed using numerical integration, as well as in closed form using second-order asymptotic approximation. In all cases, the α-heterogeneity reduced simply to the Rényi heterogeneity of a single univariate Gaussian with unit variance. Figure 3 further highlights that the β-heterogeneity of uniformly weighted non-parametric Gaussian mixtures tends to approach the number of individual components in the system. Conversely, the β-heterogeneity of parametric Gaussian mixtures continues increasing. In fact, one can show that, as the separation between mixture components becomes large, the β-heterogeneity approaches a linear rate of growth (Appendix B).
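The linear growth can be seen directly from the pooled variance. The derivation below is a sketch under assumptions that are not taken from the source: N equally weighted univariate components with unit variance, and means parametrized as $\mu_i = c_i t$ for fixed offsets $c_i$ and a separation scale $t \geq 0$. It is not a reproduction of Appendix B.

```latex
% Illustrative assumptions: equal weights 1/N, unit variances, means \mu_i = c_i t.
\begin{align*}
\bar{\sigma}^2(t)
  &= \frac{1}{N}\sum_{i=1}^{N}\left(1 + c_i^2 t^2\right)
     - \left(\frac{1}{N}\sum_{i=1}^{N} c_i t\right)^{2}
   = 1 + t^2 V_c,
  \qquad
  V_c = \frac{1}{N}\sum_{i=1}^{N} c_i^2 - \left(\frac{1}{N}\sum_{i=1}^{N} c_i\right)^{2}, \\
\Pi_q^{\gamma}(t)
  &= \sqrt{2\pi}\; q^{\frac{1}{2(q-1)}} \sqrt{1 + t^2 V_c},
  \qquad
  \Pi_q^{\alpha} = \sqrt{2\pi}\; q^{\frac{1}{2(q-1)}}
  \qquad (q \neq 1), \\
\Pi_q^{\beta}(t)
  &= \frac{\Pi_q^{\gamma}(t)}{\Pi_q^{\alpha}}
   = \sqrt{1 + t^2 V_c}
   \;\sim\; t \sqrt{V_c}
  \quad \text{as } t \to \infty .
\end{align*}
```

The q-dependent factor cancels in the ratio (at q=1 it is replaced by $\sqrt{2\pi e}$ in both terms), so under these assumptions β depends only on the dispersion of the means and grows asymptotically in proportion to t.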

5. Discussion

This paper provided approaches for multiplicative decomposition of heterogeneity in continuous mixture distributions, thereby extending the earlier work on discrete space heterogeneity decomposition presented by Jost [9]. Two approaches were offered, dependent upon whether the distribution over the pooled system is defined either parametrically or non-parametrically. Our results improve the understanding of heterogeneity measurement in non-categorical systems by providing conditions under which decomposition of heterogeneity into α and β components conforms to the intuitive property that γ≥α. If one defines the pooled mixture non-parametrically, as in a finite mixture model, heterogeneity is decomposable such that γ≥α for all q>0 (if component weights are uniform, or at q=1, otherwise), and β may be interpreted as the discrete number of distinct mixture components (Section 2.2 and Section 3). This has the advantage of conforming with the original discrete decomposition by Jost [9], insofar as probability mass in the mixture is recorded only where it is observed in the data, and not elsewhere, as would be assumed under a parametric model of the pooled system. Consequently, one achieves a more precise estimate of the size of the pooled system’s base of support. The primary limitation arises from the need to numerically integrate the γ-heterogeneity, which can become prohibitively expensive in higher dimensions. Future work should investigate the error bounds on numerically integrated γ. A more computationally efficient approach for decomposition of continuous Rényi heterogeneity is to assume that the pooled mixture has an overall parametric distribution. A common application for which this assumption is generally made is in mixed-effects meta-analysis [21]. An important departure from the non-parametric pooling approach of finite mixture models is that non-trivial probability mass may now be assigned to regions not covered by any of the constituent component distributions.
From another perspective, one may appreciate that the non-parametric approach to pooling is insensitive to the distance between component distributions, and rather only measures the effective volume of event space to which component distributions assign probability. Conversely, assumption of a parametric distribution over the mixture (in the case of Section 4, a Gaussian) incorporates the distance between the component distributions into the calculation of γ-heterogeneity. This would be appropriate in scenarios where one assumes that the observed components undersample the true distribution on the pooled system. For example, in the case of mixed-effects meta-analysis, the available research studies for inclusion may differ significantly in terms of their means, but one might assume that there is a significant probability of a new study yielding an effect somewhere in between. Specifying a parametric distribution over the pooled system would capture this assumption. One limitation of the present study is the use of a Gaussian model for the pooled system distribution. This was chosen on account of (A) its prevalence in the scientific literature and (B) analytical tractability. Future work should expand these results to other distributions. Notwithstanding this, we have demonstrated the decomposition of Rényi heterogeneity into its α and β components for continuous systems. There are (broadly) two approaches, based on whether parametric assumptions are made about the pooled system distribution. Under these assumptions applied to Gaussian mixture distributions, we provided conditions under which the criterion that γ≥α is satisfied. Future studies should evaluate this method as an alternative approach for the measurement of meta-analytic heterogeneity, and expand these results to other parametric distributions over the pooled system.


1. Measuring diversity: the importance of species similarity.

2. Principal components analysis corrects for stratification in genome-wide association studies.

3. Partitioning diversity into independent alpha and beta components.

4. Diversity partitioning of Rao's quadratic entropy.

5. Meta-analysis in clinical trials.

6. We need an operational framework for heterogeneity in psychiatric research

Review 7. Beyond Lumping and Splitting: A Review of Computational Approaches for Stratifying Psychiatric Disorders.

8. Distance-based functional diversity measures and their decomposition: a framework based on Hill numbers.

Review 9. The definition and measurement of heterogeneity.