
Multistep estimators of the between-study variance: The relationship with the Paule-Mandel estimator.

Abstract

A wide variety of estimators of the between-study variance are available in random-effects meta-analysis. Many, but not all, of these estimators are based on the method of moments. The DerSimonian-Laird estimator is widely used in applications, but the Paule-Mandel estimator is an alternative that is now recommended. Recently, DerSimonian and Kacker have developed two-step moment-based estimators of the between-study variance. We extend these two-step estimators so that multiple (more than two) steps are used. We establish the surprising result that the multistep estimator tends towards the Paule-Mandel estimator as the number of steps becomes large. Hence, the iterative scheme underlying our new multistep estimator provides a hitherto unknown relationship between two-step estimators and Paule-Mandel estimator. Our analysis suggests that two-step estimators are not necessarily distinct estimators in their own right; instead, they are quantities that are closely related to the usual iterative scheme that is used to calculate the Paule-Mandel estimate. The relationship that we establish between the multistep and Paule-Mandel estimator is another justification for the use of the latter estimator. Two-step and multistep estimators are perhaps best conceptualized as approximate Paule-Mandel estimators.


Keywords: estimation; iterative scheme; method of moments; random-effects meta-analysis


Year: 2018 PMID: 29700839 PMCID: PMC6055723 DOI: 10.1002/sim.7665

Source DB: PubMed Journal: Stat Med ISSN: 0277-6715 Impact factor: 2.373

INTRODUCTION

Meta‐analysis statistically combines effect size estimates from different studies in order to calculate a quantitative summary of the evidence base. Two important outcomes from a meta‐analysis are the estimates of the overall effect size and the between‐study variance (the variance of the studies' true effect sizes). Between‐study heterogeneity refers to the possibility that there is more variation in the studies' observed effect sizes than what would be expected by sampling variability alone1, 2 and is often present in meta‐analyses.3, 4, 5 Characteristics of the included studies (eg, differences between populations from which participants were sampled or treatments across studies) can be incorporated as moderators in meta‐regressions to explore and explain the between‐study heterogeneity.6, 7, 8 However, random‐effects meta‐analyses are often used to account for, but not explain, between‐study heterogeneity. A wide variety of estimators are available for the between‐study variance. Two recent papers9, 10 review existing research on these estimators and recommended either the Paule‐Mandel (PM) estimator11 or the restricted maximum likelihood (REML) estimator.12 However, the DerSimonian‐Laird (DL) estimator is most often used in practice.5, 13, 14 The popularity of the DL estimator is due to its simplicity, because it is calculated from an easily computed noniterative method and also because it is already familiar to applied meta‐analysts. In this paper, we focus on estimators that are motivated by the method of moments, which includes the DL and PM estimators, but not REML. In particular, we use the general method of moments estimator (ie, with an arbitrary set of weights for the effect sizes) proposed by DerSimonian and Kacker15 to develop a new multistep DL estimator. 
This idea extends the two‐step DL (DL2) estimator, which was also proposed by DerSimonian and Kacker.15 The usual (one‐step) DL estimator uses the inverse of the studies' within‐study sampling variances as weights to estimate the between‐study variance. In the two‐step estimation procedure, the usual DL estimate is calculated in the first step and this estimate is then included in the weights of the second step. Full details of the DL2 estimator are provided in Section 3. The statistical properties of the DL2 estimator are largely unknown, because the method has rarely been the topic of further study. Bhaumik et al16 studied the statistical properties of the DL2 estimator and concluded that for rare events, both the DL2 and PM estimators are negatively biased. It was our initial intuition that allowing the number of steps to tend to infinity in our new multistep estimator would define a new type of estimator. However, working empirically to begin with and then mathematically, we will demonstrate that the PM estimator is obtained if the number of steps tends towards infinity. Hence, we will instead establish the relationship between the two‐step estimators and the PM estimator, which is another justification for the use of the PM estimator. The rest of the paper is set out as follows. We continue by describing the random‐effects model for meta‐analysis in Section 2. In Section 3, we describe three existing moment‐based estimators: DL, DL2, and PM. Our new multistep estimator is introduced in Section 4. Subsequently, we apply these estimators to three contrasting examples in Section 5, where we empirically show that the multistep estimator tends towards the PM estimator as the number of steps becomes large and that this convergence occurs quickly in practice. Section 6 contains mathematics that formally establishes the relationship between the multistep estimators and the PM estimator. 
We explore the use of meta‐regression models in Section 7, and we conclude with a short discussion in Section 8.

THE RANDOM‐EFFECTS MODEL

The random‐effects model assumes that the effect size estimates $y_i$, $i=1,\ldots,n$, are extracted from separate studies. This model can be written as

$$y_i = \mu + \mu_i + \epsilon_i, \qquad (1)$$

where $\mu$ is the average true effect size, $\mu_i$ is a random effect indicating the difference between the $i$th study's true effect size and $\mu$, and $\epsilon_i$ is the $i$th study's sampling error. It is commonly assumed that $\mu_i \sim N(0,\tau^2)$, where $\tau^2$ is the between‐study variance, and $\epsilon_i \sim N(0,\sigma_i^2)$, where $\sigma_i^2$ is the within‐study sampling variance of the $i$th study. Furthermore, all $\mu_i$ and $\epsilon_i$ are assumed to be mutually independent. The within‐study sampling variances are usually estimated in practice and then assumed to be known in the analysis. We will emphasize that the $\sigma_i^2$ are estimated by writing $\hat{\sigma}_i^2$ for their estimates. The parameter $\mu$ is usually of primary interest. The usual method for making inferences about $\mu$ initially estimates $\tau^2$ and then treats the resulting estimate as fixed and known.9, 17 Hence, the conventional weights in the random‐effects model, $w_i = 1/(\hat{\sigma}_i^2 + \hat{\tau}^2)$, are treated as fixed and known and the usual inferential procedure for $\mu$ is straightforward.8 However, the estimate of the between‐study variance, $\hat{\tau}^2$, is our primary interest here, with moment‐based estimators as our focus.
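As a minimal sketch of the conventional inferential step described above, the following Python function computes the inverse‐variance weighted estimate of $\mu$ and its standard error while treating a supplied value of $\tau^2$ as fixed and known. The function name and the data are hypothetical and chosen only for illustration; the standard error $1/\sqrt{\sum w_i}$ is the conventional formula and is not taken from this paper.

```python
import math


def random_effects_summary(y, v, tau2):
    """Weighted mean of y and its standard error, treating tau2 as known.

    y    : effect size estimates y_i
    v    : estimated within-study variances (sigma_i^2 hat)
    tau2 : between-study variance estimate, treated as fixed
    """
    w = [1.0 / (vi + tau2) for vi in v]  # w_i = 1 / (sigma_i^2 + tau^2)
    sum_w = sum(w)
    mu_hat = sum(wi * yi for wi, yi in zip(w, y)) / sum_w
    se = math.sqrt(1.0 / sum_w)  # conventional standard error of mu_hat
    return mu_hat, se


# Hypothetical data, for illustration only.
y = [0.10, 0.80, -0.30, 0.55]
v = [0.04, 0.09, 0.02, 0.12]
print(random_effects_summary(y, v, tau2=0.18))
```

The weights depend on $\tau^2$ only through the total variances $\hat{\sigma}_i^2 + \tau^2$, which is why the choice of estimator for $\tau^2$ matters for the inference about $\mu$.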

MOMENT‐BASED METHODS FOR ESTIMATING THE BETWEEN‐STUDY VARIANCE

Most of the moment‐based estimators for $\tau^2$ are special cases of a general method of moments estimator.15 To derive this general estimation method, DerSimonian and Kacker15 propose methodology for estimating $\tau^2$ using an arbitrary set of weights $a_i$, $i=1,\ldots,n$, where all $a_i$ are fixed positive constants. To estimate $\tau^2$, DerSimonian and Kacker15 propose equating $\sum_{i=1}^{n} a_i (y_i - \bar{y}_a)^2$, where $\bar{y}_a = \sum a_i y_i / \sum a_i$, to its expected value. As explained by DerSimonian and Kacker,15 this results in the estimating equation

$$\hat{\tau}^2 = \frac{\sum a_i (y_i - \bar{y}_a)^2 - \left( \sum a_i \hat{\sigma}_i^2 - \sum a_i^2 \hat{\sigma}_i^2 / \sum a_i \right)}{\sum a_i - \sum a_i^2 / \sum a_i}, \qquad (2)$$

where negative estimates from (2) are truncated to zero (because $\tau^2 \ge 0$). An often overlooked point is that the calculation of the expectation of $\sum a_i (y_i - \bar{y}_a)^2$, which gives rise to the estimating Equation 2, ignores the uncertainty in the $\hat{\sigma}_i^2$ and has taken $\hat{\sigma}_i^2 = \sigma_i^2$. Although we have emphasized, when presenting Equation 2, that the estimates $\hat{\sigma}_i^2$ are used in the calculation, this does not clearly convey the fact that the estimation does not take their uncertainty into account. Kulinskaya and Dollinger18 and Hoaglin19 criticize moment‐based methods for this type of reason, because ignoring uncertainty in $\hat{\sigma}_i^2$ may cause bias in the estimate of $\tau^2$, especially if the sample sizes of the studies are small. By ignoring the uncertainty in the within‐study variances, we have that $E[\hat{\tau}^2] = \tau^2$ before truncation to zero, but the truncated estimator is positively biased.20, 21
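The general estimating equation can be sketched in a few lines of Python. This is an illustrative implementation of Equation 2 under the assumptions stated in the text (fixed positive weights, within‐study variances treated as known); the function name `general_mom_tau2` and the data are hypothetical.

```python
def general_mom_tau2(y, v, a):
    """General method of moments estimate of tau^2 (Equation 2), truncated at zero.

    y : effect size estimates y_i
    v : estimated within-study variances (sigma_i^2 hat)
    a : fixed positive weights a_i
    """
    sum_a = sum(a)
    y_bar = sum(ai * yi for ai, yi in zip(a, y)) / sum_a
    # Weighted sum of squares that is equated to its expectation.
    q_a = sum(ai * (yi - y_bar) ** 2 for ai, yi in zip(a, y))
    # Expected value of q_a when tau^2 = 0.
    subtracted = (sum(ai * vi for ai, vi in zip(a, v))
                  - sum(ai ** 2 * vi for ai, vi in zip(a, v)) / sum_a)
    denom = sum_a - sum(ai ** 2 for ai in a) / sum_a
    return max(0.0, (q_a - subtracted) / denom)


# Hypothetical data, for illustration only.
y = [0.10, 0.80, -0.30, 0.55]
v = [0.04, 0.09, 0.02, 0.12]

# Inverse-variance weights give the DL estimator; equal weights give an
# estimator based on an unweighted sum of squares.
print(general_mom_tau2(y, v, [1.0 / vi for vi in v]))
print(general_mom_tau2(y, v, [1.0] * len(y)))
```

Only the weights change between estimators in this family, which is what makes the two‐step and multistep constructions possible.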

The DerSimonian‐Laird (DL) estimator

The DL estimator,1 $\hat{\tau}^2_{DL}$, is obtained by taking $a_i = 1/\hat{\sigma}_i^2$ in Equation 2. We then have $\sum a_i \hat{\sigma}_i^2 - \sum a_i^2 \hat{\sigma}_i^2 / \sum a_i = n-1$, so that Equation 2 simplifies when using this standard set of weights. Negative estimates are again truncated to zero. Uncertainty in $\hat{\sigma}_i^2$ is, as in Equation 2, neglected by treating the weights as fixed constants. This may result in bias when estimating $\tau^2$ using the DL estimator, especially if the sample sizes of the studies are small.18, 19
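The simplification for the DL weights can be checked directly. The short derivation below uses only Equation 2 and the weights $a_i = 1/\hat{\sigma}_i^2$:

```latex
\begin{align*}
\sum_{i=1}^{n} a_i \hat{\sigma}_i^2
  &= \sum_{i=1}^{n} \frac{\hat{\sigma}_i^2}{\hat{\sigma}_i^2} = n, \\
\frac{\sum_{i=1}^{n} a_i^2 \hat{\sigma}_i^2}{\sum_{i=1}^{n} a_i}
  &= \frac{\sum_{i=1}^{n} 1/\hat{\sigma}_i^2}{\sum_{i=1}^{n} 1/\hat{\sigma}_i^2} = 1, \\
\hat{\tau}^2_{DL}
  &= \max\left\{ 0,\;
     \frac{\sum_{i=1}^{n} a_i (y_i - \bar{y}_a)^2 - (n-1)}
          {\sum_{i=1}^{n} a_i - \sum_{i=1}^{n} a_i^2 / \sum_{i=1}^{n} a_i} \right\}.
\end{align*}
```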

The two‐step DerSimonian‐Laird estimator

DerSimonian and Kacker15 propose an alternative estimator that is an extension of the DL estimator. The usual DL estimate $\hat{\tau}^2_{DL}$, described in the previous section, is calculated in the first step. The two‐step DL (DL2) estimator adds a second step by incorporating $\hat{\tau}^2_{DL}$ into the weights and computes $\hat{\tau}^2_{DL_2}$ using estimating Equation 2 with $a_i = 1/(\hat{\sigma}_i^2 + \hat{\tau}^2_{DL})$. To describe the two‐step DL estimator more explicitly, and also to define the PM and multistep DL estimators below, it is convenient to define the quantity

$$Q(\tau^2) = \sum_{i=1}^{n} w_i(\tau^2) \left( y_i - \bar{y}(\tau^2) \right)^2, \qquad (3)$$

where $w_i(\tau^2) = 1/(\hat{\sigma}_i^2 + \tau^2)$ and $\bar{y}(\tau^2) = \sum w_i(\tau^2) y_i / \sum w_i(\tau^2)$. Then $Q(0)$ is the usual Q statistic used in meta‐analysis.22, 23 From Equation 2 with $a_i = w_i(\hat{\tau}^2_{DL})$, we have

$$\hat{\tau}^2_{DL_2} = \frac{Q(\hat{\tau}^2_{DL}) - \left( \sum a_i \hat{\sigma}_i^2 - \sum a_i^2 \hat{\sigma}_i^2 / \sum a_i \right)}{\sum a_i - \sum a_i^2 / \sum a_i}, \qquad (4)$$

where we again truncate negative estimates to zero. The weights $a_i = 1/(\hat{\sigma}_i^2 + \hat{\tau}^2_{DL})$ are intuitively appealing, because we then weight by estimates of the studies' total precisions, which are also the standard weights when making inferences about $\mu$ in the random‐effects model.8, 24 Using these weights raises further statistical issues, because they are now functions of both the $\hat{\sigma}_i^2$ and the estimated between‐study variance $\hat{\tau}^2_{DL}$. There is statistical error in both of these estimated variance components, and so treating the weights as fixed constants continues to have the potential to have unfortunate implications for the estimation. It is possible to use other estimators in the first step, and DerSimonian and Kacker15 also propose using the Cochran analysis of variance (ANOVA) estimator22, 25 that is based on an unweighted sum of squares for this purpose. However, the DL estimator is so common in application that we only explore the use of two‐step and multistep estimators that use this particular estimator. Nonetheless, our main results will apply regardless of the type of estimator used in the first step, as we will explain below. Hence, the generalizability of our results is not restricted by using the DL estimator in the first step; the results also apply if, for instance, the Cochran ANOVA estimator is used in the first step.

The Paule‐Mandel (PM) estimator

Another moment‐based estimator for $\tau^2$ is the PM estimator.11 This estimation method exploits the fact that $Q(\tau^2) \sim \chi^2_{n-1}$ under the assumptions made in the random‐effects model (normal sampling distribution for all $y_i$ and known within‐study variances $\sigma_i^2$). Hence, $\hat{\tau}^2_{PM}$ is obtained by matching $Q(\tau^2)$ to its expected value and is the solution to

$$Q(\hat{\tau}^2_{PM}) = n-1. \qquad (5)$$

For any given dataset, $Q(\tau^2)$ is a monotonically decreasing continuous function of $\tau^2$. As a consequence, Equation 5 always provides a unique estimate15, 26, 27, 28 if $Q(0) \ge (n-1)$. If $Q(0) < (n-1)$, then no positive solution to the estimating Equation 5 exists, and we take $\hat{\tau}^2_{PM} = 0$. The estimating Equation 5 is nonlinear and so must be solved numerically, but this is straightforward in practice. An empirical Bayes estimator for estimating $\tau^2$ 29, 30 was developed independently, but this has subsequently been shown to be equivalent to the PM estimator.9, 28 Unlike the DL and DL2 estimators and other moment‐based estimators, the PM estimator does not directly use estimating Equation 2. This is because the general method of moments treats the weights $a_i$ as fixed (and therefore known) constants, but the PM estimator uses weights that are explicitly unknown (because $\tau^2$ is unknown). The PM estimator is motivated using the method of moments, but otherwise, there is no direct connection between the PM estimator and other moment‐based estimators. We introduce our new multistep estimator in the next section, and we will illustrate the relationship between the PM and the two‐step estimator.
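Because $Q(\tau^2)$ is continuous and monotonically decreasing, Equation 5 can be solved by any bracketing root finder. The sketch below uses plain bisection, which is a choice made here for simplicity and is not the algorithm used by the authors or by the metafor package; the function names and data are hypothetical.

```python
def q_stat(tau2, y, v):
    """Q(tau^2) from Equation 3 with weights w_i = 1 / (v_i + tau^2)."""
    w = [1.0 / (vi + tau2) for vi in v]
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    return sum(wi * (yi - y_bar) ** 2 for wi, yi in zip(w, y))


def pm_tau2(y, v, iterations=200):
    """Paule-Mandel estimate: the solution of Q(tau^2) = n - 1, or zero."""
    target = len(y) - 1
    if q_stat(0.0, y, v) <= target:
        return 0.0  # no positive solution exists
    lo, hi = 0.0, 1.0
    while q_stat(hi, y, v) > target:  # expand until the root is bracketed
        hi *= 2.0
    for _ in range(iterations):  # bisection on the decreasing function Q
        mid = 0.5 * (lo + hi)
        if q_stat(mid, y, v) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical data, for illustration only.
y = [0.10, 0.80, -0.30, 0.55]
v = [0.04, 0.09, 0.02, 0.12]
tau2_pm = pm_tau2(y, v)
print(tau2_pm, q_stat(tau2_pm, y, v))  # second value should be close to n - 1 = 3
```

Bisection is slower than Newton‐Raphson but cannot overshoot, which makes it a convenient reference solution when checking other iterative schemes.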

THE MULTISTEP DERSIMONIAN AND LAIRD ESTIMATOR

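In this section, we develop the multistep DL estimator as a natural extension of the DL2 estimator. From Equation 4, we have that the DL2 estimator is simply the estimate from the more general estimating Equation 2 where the weights are $a_i = 1/(\hat{\sigma}_i^2 + \hat{\tau}^2_{DL})$. The key observation is that the two‐step estimator uses weights that are the reciprocal of the estimated total study variances, where the between‐study variance is estimated using the usual DL estimator. A natural way to extend this estimator to define a three‐step estimator is to use weights that are the reciprocal of the estimated total study variances, where the between‐study variance is estimated using the DL2 estimator. Writing $w_{i,k} = 1/(\hat{\sigma}_i^2 + \hat{\tau}^2_{DL_k})$ for the weights based on the $k$th step estimate, we define $\hat{\tau}^2_{DL_3}$ to be

$$\hat{\tau}^2_{DL_3} = \frac{Q(\hat{\tau}^2_{DL_2}) - \left( \sum w_{i,2} \hat{\sigma}_i^2 - \sum w_{i,2}^2 \hat{\sigma}_i^2 / \sum w_{i,2} \right)}{\sum w_{i,2} - \sum w_{i,2}^2 / \sum w_{i,2}},$$

where, as before, we truncate negative estimates to zero. We can then define a four‐step estimator in a similar way, using Equation 2 with weights $a_i = 1/(\hat{\sigma}_i^2 + \hat{\tau}^2_{DL_3})$, and then a five‐step estimator using weights $a_i = 1/(\hat{\sigma}_i^2 + \hat{\tau}^2_{DL_4})$, and so on. In general, we define the $(k+1)$th step DL estimator as

$$\hat{\tau}^2_{DL_{k+1}} = \frac{Q(\hat{\tau}^2_{DL_k}) - \left( \sum w_{i,k} \hat{\sigma}_i^2 - \sum w_{i,k}^2 \hat{\sigma}_i^2 / \sum w_{i,k} \right)}{\sum w_{i,k} - \sum w_{i,k}^2 / \sum w_{i,k}} \qquad (6)$$

for $k \ge 1$, where $\hat{\tau}^2_{DL_1}$ is defined to be the usual DL estimator $\hat{\tau}^2_{DL}$. As usual, we truncate the resulting estimate from Equation 6 to zero if the solution is negative. Written explicitly in terms of this truncation, the $(k+1)$th step DL estimator is

$$\hat{\tau}^2_{DL_{k+1}} = \max\left\{ 0,\; \frac{Q(\hat{\tau}^2_{DL_k}) - \left[ \sum w_{i,k} \hat{\sigma}_i^2 - \sum w_{i,k}^2 \hat{\sigma}_i^2 / \sum w_{i,k} \right]}{\sum w_{i,k} - \sum w_{i,k}^2 / \sum w_{i,k}} \right\}. \qquad (7)$$

In practice, we compute $\hat{\tau}^2_{DL_{k+1}}$ recursively by first computing $\hat{\tau}^2_{DL_1}$, then $\hat{\tau}^2_{DL_2}$, then $\hat{\tau}^2_{DL_3}$, and so on until we reach the required value of $k$. However, all of these estimators are available in closed form and so it is in principle also possible to write $\hat{\tau}^2_{DL_{k+1}}$ in this way. Assuming that the limit exists, we define $\hat{\tau}^2_{DL_\infty} = \lim_{k \to \infty} \hat{\tau}^2_{DL_k}$. We will see below that, whenever convergence occurs, $\hat{\tau}^2_{DL_\infty} = \hat{\tau}^2_{PM}$, so that instead of defining a new estimator, we establish the relationship between existing estimates by taking this limit.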
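The recursion in Equation 7 can be sketched as follows. This is an illustrative Python version and not the authors' R code (which is available from the OSF link given in the Examples section); the function names, the stopping tolerance, and the step cap are assumptions made here. Starting the recursion from $\tau^2 = 0$ reproduces the usual DL estimate as the first step, because the weights are then $1/\hat{\sigma}_i^2$.

```python
def next_step(tau2, y, v):
    """One application of Equation 7, using weights 1 / (v_i + tau2)."""
    w = [1.0 / (vi + tau2) for vi in v]
    sum_w = sum(w)
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sum_w
    q = sum(wi * (yi - y_bar) ** 2 for wi, yi in zip(w, y))
    bracket = (sum(wi * vi for wi, vi in zip(w, v))
               - sum(wi ** 2 * vi for wi, vi in zip(w, v)) / sum_w)
    denom = sum_w - sum(wi ** 2 for wi in w) / sum_w
    return max(0.0, (q - bracket) / denom)


def multistep_dl(y, v, max_steps=100, tol=1e-4):
    """Return the sequence [DL_1, DL_2, ...] of multistep DL estimates.

    Iteration stops when two successive estimates differ by less than tol,
    or after max_steps steps (the sequence is not guaranteed to converge).
    """
    estimates = [next_step(0.0, y, v)]  # DL_1: the usual DL estimate
    for _ in range(max_steps - 1):
        estimates.append(next_step(estimates[-1], y, v))
        if abs(estimates[-1] - estimates[-2]) < tol:
            break
    return estimates


# Hypothetical data, for illustration only.
y = [0.10, 0.80, -0.30, 0.55]
v = [0.04, 0.09, 0.02, 0.12]
for k, est in enumerate(multistep_dl(y, v), start=1):
    print(k, round(est, 4))
```

The step cap matters because the sequence can oscillate rather than converge for some datasets, so a loop that waits for convergence alone might not terminate.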

EXAMPLES

In this section, we apply the DL, PM, DL2, and multistep DL estimators to three contrasting examples. Having illustrated our main findings empirically using these examples, we will demonstrate them mathematically in Section 6.

Characteristics of the 3 examples

Our first example is a meta‐analysis by Bangert‐Drowns et al31 studying the effect of school‐based writing‐to‐learn interventions on academic achievement. This meta‐analysis consists of 48 estimated standardized mean differences (ie, Hedges' g). The second example is obtained from Sterne et al32 and is a meta‐analysis on the effectiveness of intravenous magnesium in acute myocardial infarction. This meta‐analysis consists of 16 estimated log odds ratios. The third example is a meta‐analysis on the efficacy of two treatments for post‐traumatic stress disorder.33 This meta‐analysis consists of 10 standardized mean differences. The metafor package34 was used to calculate the DL and PM estimators, and we used our own bespoke code to recursively calculate the multistep DL estimator. R code for applying these estimators to the examples is available via https://osf.io/paqzm/.

Results

Table 1 shows the DL, DL2, DLₖ, and PM estimates of τ 2 for all three examples. For each example, we calculated the multistep DL estimator until the (k+1)th step DL estimator was the same as the kth step estimator up to 4 decimal places. Convergence was taken to have been reached at this point, so that any further steps would result in the same estimate to this level of numerical accuracy. From Table 1, we can see that this convergence was reached in 6, 10, and 4 steps for examples one, two, and three, respectively. Furthermore, we can see that in each case, the DL2 estimate is closer to the PM estimate than the DL estimate and that the DLₖ estimate converges to the PM estimate. The way in which this convergence occurred was different for each example. For the first example obtained from Bangert‐Drowns et al,31 the DL estimate was notably less than the PM estimate. Then the DL2 estimate took a large step towards the PM estimate, and after this, convergence was quickly reached. For the second example obtained from Sterne et al,32 the DL estimate was notably greater than the PM estimate and, once again, the DL2 estimate took a large step towards the PM estimate (and in fact “overshot” it). Convergence of the multistep DL estimator was reasonably fast, although the sequence produced by the DLₖ estimates was not monotone until k ≥ 7. For the third example obtained from Ho et al,33 the DL and PM estimates are similar and convergence was very quickly reached.

Table 1

The DerSimonian‐Laird (DL), two‐step DerSimonian‐Laird (DL₂), multistep DerSimonian‐Laird (DLₖ, where k refers to the kth step), and Paule‐Mandel (PM) estimates for the three example datasets

Estimate	Bangert‐Drowns et al31	Sterne et al32	Ho et al33
DL	0.0455	0.2239	0.0076
DL₂	0.0652	0.1587	0.0078
DL₃	0.0684	0.1841	0.0079
DL₄	0.0688	0.1736	0.0079
DL₅	0.0689	0.1778
DL₆	0.0689	0.1761
DL₇		0.1768
DL₈		0.1765
DL₉		0.1766
DL₁₀		0.1766
PM	0.0689	0.1766	0.0079

Conclusions

Although the way in which the multistep DL estimator converged to the PM estimator was different in each case, all three examples illustrated the surprising finding that $\hat{\tau}^2_{DL_\infty} = \hat{\tau}^2_{PM}$. A large number of simulations (see https://osf.io/dpuzs/ for R code), in which the study weights in the first step were based on either the DL estimate or the Cochran ANOVA estimate, confirmed that multistep estimators converge to the PM estimator. Hence, convergence was not only a property of the selected data sets, and it also occurred if the DL estimator was not used in the first step. Our findings are in agreement with the observation by DerSimonian and Kacker15 that two‐step estimators better approximate the method of Paule and Mandel, and the conclusion by Bhaumik et al16 that the performance of the DL2 and PM estimators is similar. This is because we have observed that DL2 is the second step in an iterative scheme that takes us from $\hat{\tau}^2_{DL}$ to $\hat{\tau}^2_{PM}$.

PROVING (WHEN CONVERGENCE OCCURS) THAT THE MULTISTEP ESTIMATOR CONVERGES TO THE PAULE‐MANDEL ESTIMATOR

As explained above, in addition to our three examples, many simulated datasets have shown that multistep estimators converge to the PM estimator. In this section, we provide mathematical proofs to formally establish this limit. We will explain why it is not necessary that the DL estimator is used in the first step, so that our findings apply to multistep estimators regardless of the nature of the estimation used in the first step.

Lemma: agreement with respect to truncation to zero of the DerSimonian and Laird and Paule‐Mandel estimators

We start by proving the lemma that the DL and the PM estimators always agree in the sense that, for any given dataset, they are either both zero (if $Q(0) \le (n-1)$) or both positive (if $Q(0) > (n-1)$). It is conceptually appealing that these two estimators agree in this way, and this is easily proved, but we do not think that this result has been stated previously. Proof: If $Q(0) < (n-1)$, where $Q(\tau^2)$ is defined in Equation 3, then the PM estimator is truncated to zero as explained in Section 3.3. Furthermore, the first term in the numerator of Equation 2 is also $Q(0)$ when the DL weights of $a_i = 1/\hat{\sigma}_i^2$ are used. As noted in Section 3.1, the subtracted term in the numerator of Equation 2 is then $n-1$. Hence, the DL estimator is also truncated to zero if $Q(0) < (n-1)$. If $Q(0) = (n-1)$, then immediately from their estimating equations, both the DL and PM estimators are zero. Finally, if $Q(0) > (n-1)$, then no truncation for either estimator is required, so that the DL and PM estimators are both positive.

Proving that if convergence of the multistep estimator occurs, then it is to the Paule‐Mandel estimate

Having established our lemma, we will prove that the estimate of the multistep estimator equals the PM estimate if convergence occurs. We will prove this first for cases where the convergence is to a positive estimate and then to an estimate of zero.

The case where the estimate converged to is positive

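Assume that convergence occurs and the resulting estimate is positive, so that $\hat{\tau}^2_{DL_\infty} > 0$. We substitute $\hat{\tau}^2_{DL_k} = \hat{\tau}^2_{DL_{k+1}} = \hat{\tau}^2_{DL_\infty}$ into Equation 6, where this equation correctly describes the iteration from DLₖ to DLₖ₊₁ (because the estimate is positive and no truncation is necessary). Then, writing $w_i = 1/(\hat{\sigma}_i^2 + \hat{\tau}^2_{DL_\infty})$, solving the resulting equation for $Q(\hat{\tau}^2_{DL_\infty})$ results in

$$Q(\hat{\tau}^2_{DL_\infty}) = \sum w_i (\hat{\sigma}_i^2 + \hat{\tau}^2_{DL_\infty}) - \frac{\sum w_i^2 (\hat{\sigma}_i^2 + \hat{\tau}^2_{DL_\infty})}{\sum w_i} = n - 1,$$

which from Equation 5 means that $\hat{\tau}^2_{DL_\infty} = \hat{\tau}^2_{PM}$.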

The case where the estimate converged to is zero

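Assume that convergence occurs and the resulting estimate is either zero or truncated to zero, so that $\hat{\tau}^2_{DL_\infty} = 0$. If we substitute $\hat{\tau}^2_{DL_k} = \hat{\tau}^2_{DL_{k+1}} = 0$ into Equation 7, the term in square brackets of (7) simplifies to $(n-1)$ and this equation becomes

$$0 = \max\left\{ 0,\; \frac{Q(0) - (n-1)}{c} \right\}, \qquad (8)$$

where $c = \sum a_i - \sum a_i^2 / \sum a_i > 0$ with $a_i = 1/\hat{\sigma}_i^2$. Equation 8 is satisfied only if $Q(0) - (n-1) \le 0$, from which the lemma in Section 6.1 implies that both the DL and PM estimators are zero (which is also the assumed value of $\hat{\tau}^2_{DL_\infty}$). Hence, if the convergence of the multistep estimator is to zero, then the PM estimate is also zero, so that $\hat{\tau}^2_{DL_\infty} = \hat{\tau}^2_{PM}$.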

Failure of convergence of the multistep estimator

Although we have observed convergence of the multistep estimators in thousands of simulated datasets (see https://osf.io/dpuzs/), it is possible to create examples where the multistep estimator does not converge. As a concrete example of nonconvergence, imagine a meta‐analysis with 4 effect sizes $y_1 = -0.2$, $y_2 = 0.1$, $y_3 = -0.05$, and $y_4 = -0.3$, with corresponding within‐study variances $\hat{\sigma}_1^2, \ldots, \hat{\sigma}_4^2$ [values not recoverable]. The DL estimate is $\hat{\tau}^2_{DL} = 0.016$. Using this in estimating Equation 4 gives $\hat{\tau}^2_{DL_2} = 0$. Hence, $\hat{\tau}^2_{DL_3}$ is then the usual DL estimator and, instead of achieving convergence, the multistep estimator oscillates between 0.016 and 0, and does not converge to $\hat{\tau}^2_{PM}$. The difficulties in achieving convergence in this example would appear to arise because the DL and PM estimates differ so substantially and also because the within‐study variances are of different magnitudes (so that the calculation is sensitive to the value of $\tau^2$ when this is small). This example is a counterexample to the conjecture that the multistep estimator always converges to the PM estimator.

Conclusions

Regardless of whether or not the convergence of the multistep estimator is to a positive estimate, we have proved that if convergence occurs, then this is to the PM estimate. Simulating thousands of meta‐analyses (see https://osf.io/dpuzs/) did not reveal any convergence problems, suggesting that these problems only occur in rare cases such as the artificial one described above. We conclude that, in practice, multistep estimators converge to the PM estimate and also that they cannot converge to anything other than the PM estimate. Although the finding that multistep estimators may not converge reduces the utility of our analysis, our analytical results are more general than might be supposed, because they are not limited to using the DL estimator in the first step. All that is necessary for our results is that subsequent steps weight by the reciprocal of the estimated total study variances where the estimated between‐study variance is the estimate at the previous step. Hence, our work establishes a link between multistep estimators per se and the PM estimator rather than between just the DL and PM estimators.

The relationship with an established Newton‐Raphson method for calculating the Paule‐Mandel estimate

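DerSimonian and Kacker15 propose a Newton‐Raphson algorithm for calculating the PM estimate (see their Appendix A). This algorithm sets $\hat{\tau}^2_{PM}$ to zero if $Q(0) \le (n-1)$. If $Q(0) > (n-1)$, then $\hat{\tau}^2_{PM} > 0$ and an initial value for the algorithm must be chosen. Then the Newton‐Raphson algorithm takes $\hat{\tau}^2_{k+1} = \hat{\tau}^2_k + \delta_k$, where

$$\delta_k = \frac{Q(\hat{\tau}^2_k) - (n-1)}{\sum w_i^2 (y_i - \bar{y})^2}, \qquad (9)$$

where $w_i = 1/(\hat{\sigma}_i^2 + \hat{\tau}^2_k)$ and $\bar{y} = \sum w_i y_i / \sum w_i$. Negative estimates are truncated to zero, and the algorithm keeps iterating until convergence is reached. Jackson et al35 explain how to generalize this Newton‐Raphson procedure so that it can be applied to meta‐regression models. We can also calculate the corresponding $\delta_k$ when using Equation 6 in the iterative scheme that produces our multistep estimators as $\delta_k = \hat{\tau}^2_{DL_{k+1}} - \hat{\tau}^2_{DL_k}$. From Equation 6, with $w_i = 1/(\hat{\sigma}_i^2 + \hat{\tau}^2_{DL_k})$, this is

$$\delta_k = \frac{Q(\hat{\tau}^2_{DL_k}) - \left( \sum w_i \hat{\sigma}_i^2 - \sum w_i^2 \hat{\sigma}_i^2 / \sum w_i \right)}{\sum w_i - \sum w_i^2 / \sum w_i} - \hat{\tau}^2_{DL_k}.$$

Putting the right‐hand side over a common denominator results in

$$\delta_k = \frac{Q(\hat{\tau}^2_{DL_k}) - (n-1)}{\sum w_i - \sum w_i^2 / \sum w_i}. \qquad (10)$$

Equation 10 also illustrates why the multistep estimator converges to the PM estimator in practice. This is because the multistep estimator converges if and only if $\delta_k \to 0$, so that $Q(\hat{\tau}^2_{DL_k}) \to n-1$ and $\hat{\tau}^2_{DL_k} \to \hat{\tau}^2_{PM}$. If instead the PM estimate has not been converged to, Equation 10 shows that the estimator takes a step in the direction of the PM estimate in the kth step, because if $\hat{\tau}^2_{DL_k} < \hat{\tau}^2_{PM}$, then $\delta_k > 0$, and if $\hat{\tau}^2_{DL_k} > \hat{\tau}^2_{PM}$, then $\delta_k < 0$. Comparing Equations 9 and 10, we can also see that the iterative scheme for the multistep estimator is closely related to the established Newton‐Raphson method for calculating $\hat{\tau}^2_{PM}$. In Appendix A, we show that the expectation of the denominator of Equation 9, under the model $y_i \sim N(\mu, \sigma_i^2 + \tau^2)$ and where the $y_i$ are independent (where we suppress the distinction between $\sigma_i^2$ and $\hat{\sigma}_i^2$), is equal to the denominator of Equation 10. This is reminiscent of the relationship between Fisher's scoring and Newton‐Raphson methods in maximum likelihood estimation. This is because Fisher's scoring algorithm solves the likelihood‐based estimating equation by replacing the observed information in the denominator in a Newton‐Raphson procedure by its expectation (the expected information). 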
This observation provides us with intuition into why multistep estimators tend towards the PM estimator as the number of steps becomes large.
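The comparison between Equations 9 and 10 can be made concrete with a small sketch that evaluates both step sizes at the same current value of $\tau^2$. The two steps share the numerator $Q(\tau^2) - (n-1)$ and differ only in the denominator, so they always point in the same direction. The function name and data are hypothetical; the Newton‐Raphson denominator is zero when all $y_i$ are identical, a case this sketch does not handle.

```python
def step_sizes(tau2, y, v):
    """Return (Newton-Raphson step, multistep step) at the current tau2.

    Both steps share the numerator Q(tau2) - (n - 1); they differ only in
    the denominator (Equation 9 versus Equation 10).
    """
    w = [1.0 / (vi + tau2) for vi in v]
    sum_w = sum(w)
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sum_w
    q = sum(wi * (yi - y_bar) ** 2 for wi, yi in zip(w, y))
    numerator = q - (len(y) - 1)
    newton_denom = sum(wi ** 2 * (yi - y_bar) ** 2 for wi, yi in zip(w, y))
    multistep_denom = sum_w - sum(wi ** 2 for wi in w) / sum_w
    return numerator / newton_denom, numerator / multistep_denom


# Hypothetical data, for illustration only.
y = [0.10, 0.80, -0.30, 0.55]
v = [0.04, 0.09, 0.02, 0.12]
for tau2 in (0.0, 0.1, 0.5):
    print(tau2, step_sizes(tau2, y, v))
```

Replacing the data‐dependent denominator by its expectation is the same substitution that turns Newton‐Raphson into Fisher scoring, which is the analogy drawn in the text.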

THE RANDOM‐EFFECTS META‐REGRESSION MODEL

For ease of exposition, we have presented our main results for random‐effects meta‐analyses, but these are readily extended to meta‐regression models where study level covariate effects are included in the model. To establish that our results generalize in this way, we consider meta‐regression models with an arbitrary number of covariates in this section. All of the results in this section simplify to those shown previously. The random‐effects meta‐regression model is an extension of model 1, where we assume that

$$y_i = x_i \beta + \mu_i + \epsilon_i,$$

where $x_i$ is the $1 \times p$ row vector of covariates associated with this study and $\beta$ is the $p \times 1$ column vector of regression parameters of interest. Unless an intercept‐free regression is required, the first “covariate” in each study is taken to be one to include the intercept. A matrix formulation of this standard model is

$$Y \sim N(X\beta,\; \Delta + \tau^2 I), \qquad (11)$$

where $Y$ is a column vector containing the $y_i$, $X$ is the $n \times p$ design matrix (sometimes referred to as the model matrix) whose $i$th row is $x_i$, $\Delta$ is the diagonal matrix containing the within‐study variances, and $I$ is the $n \times n$ identity matrix. The parameter $\tau^2$ in model 11 is called the residual between‐study variance and describes the heterogeneity in the effect size estimates that is not explained by the covariates.

The general method of moments for meta‐regression

Jackson et al35 generalize the general method of moments (Equation (2)) to the meta‐regression setting. They define $A = \mathrm{diag}(a_i)$, a diagonal matrix containing the weights, and $B = A - AX(X^T A X)^{-1} X^T A$. They also define the Q statistic

$$Q_a = Y^T B Y.$$

Jackson et al35 use the subscript $a$ to emphasize that the weights $a_i$ are used, and so use the notation $Q_a$ for this quadratic form. This Q statistic reduces to the quadratic form in the numerator of Equation 2 in the meta‐analysis setting. Jackson et al35 show that the meta‐regression version of the generalized method of moments in Equation 2 is

$$\hat{\tau}^2 = \frac{Q_a - \mathrm{tr}(B\Delta)}{\mathrm{tr}(B)}, \qquad (12)$$

where $\mathrm{tr}(\cdot)$ denotes the trace of a matrix and $\mathrm{tr}(B) > 0$. As in the meta‐analysis setting, we truncate when the solution to Equation 12 is negative.

Paule‐Mandel and DerSimonian and Laird estimators for meta‐regression

The Paule‐Mandel estimator

The PM‐type estimator in the meta‐regression setting proposed by Jackson et al35 uses weights $a_i = 1/(\hat{\sigma}_i^2 + \tau^2)$ when computing the Q statistic. We denote the resulting Q statistic using the notation $Q(\tau^2)$ in order to emphasize the dependence of the weights on the unknown parameter $\tau^2$. This is a direct generalization of $Q(\tau^2)$ in Equation 3. Since $Q(\tau^2)$ follows a $\chi^2$‐distribution with $n-p$ degrees of freedom, the PM estimator is obtained by solving

$$Q(\hat{\tau}^2_{PM}) = n - p. \qquad (13)$$

If $Q(0) \le n-p$, then $\hat{\tau}^2_{PM} = 0$.

The DerSimonian and Laird estimator

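The standard weights of $a_i = 1/\hat{\sigma}_i^2$ produce a DL‐type estimator of $\tau^2$ when using Equation 12, so that this estimator is just a special case of the general method of moments. We then have $A = \Delta^{-1}$ so that $B = \Delta^{-1} - \Delta^{-1} X (X^T \Delta^{-1} X)^{-1} X^T \Delta^{-1}$. Hence, with these weights, the numerator of Equation 12 becomes

$$Q(0) - \mathrm{tr}(B\Delta) = Q(0) - \mathrm{tr}(\Delta^{1/2} B \Delta^{1/2}),$$

where this final equality is because $\mathrm{tr}(CD) = \mathrm{tr}(DC)$, where $C$ and $D$ are square matrices of the same size, and because $\Delta^{1/2} \Delta^{1/2} = \Delta$. We can then further simplify this expression by taking

$$\mathrm{tr}(\Delta^{1/2} B \Delta^{1/2}) = n - p.$$

This identity is because $\mathrm{tr}(\Delta^{1/2} B \Delta^{1/2}) = \mathrm{tr}(I) - \mathrm{tr}(\Delta^{-1/2} X (X^T \Delta^{-1} X)^{-1} X^T \Delta^{-1/2})$, where $\mathrm{tr}(I) = n$ and $\mathrm{tr}(\Delta^{-1/2} X (X^T \Delta^{-1} X)^{-1} X^T \Delta^{-1/2}) = p$. This final equality follows from the observation that the hat matrix corresponding to a design matrix $X$ is given by $X (X^T X)^{-1} X^T$, where $\mathrm{tr}(X (X^T X)^{-1} X^T) = \mathrm{tr}(X^T X (X^T X)^{-1})$. For an identifiable regression, $X^T X (X^T X)^{-1}$ is a $p \times p$ identity matrix, which results in the well‐known result that the trace of the hat matrix is $p$. Then we simply observe that $\Delta^{-1/2} X (X^T \Delta^{-1} X)^{-1} X^T \Delta^{-1/2}$ is the hat matrix corresponding to the design matrix $\Delta^{-1/2} X$, so that its trace is also $p$. The numerator of Equation 12 therefore simplifies to $Q(0) - (n-p)$ for the DL estimator.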

Multistep estimators for meta‐regression

We can motivate multistep estimators of $\tau^2$ for meta‐regression in exactly the same way as in meta‐analysis. For example, using the DL estimator, we first calculate $\hat{\tau}^2_{DL}$ using Equation 12 and weights of $a_i = 1/\hat{\sigma}_i^2$, truncating the estimate to zero if the solution is negative. We can then calculate $\hat{\tau}^2_{DL_2}$ using Equation 12 and weights of $a_i = 1/(\hat{\sigma}_i^2 + \hat{\tau}^2_{DL})$, from which we can then calculate $\hat{\tau}^2_{DL_3}$ and so on. In general, we calculate $\hat{\tau}^2_{DL_{k+1}}$ using Equation 12 with weights of $a_i = 1/(\hat{\sigma}_i^2 + \hat{\tau}^2_{DL_k})$. Any negative solutions are truncated to zero. This process generalizes the multistep estimators for meta‐analysis described in Section 4. Let $A_k$ denote the diagonal matrix containing the weights when computing the $(k+1)$th step DL estimator, for $k \ge 1$. Let $B_k$ denote the corresponding matrix $B$ computed using $A_k$. From Equation 12, we can then write

$$\hat{\tau}^2_{DL_{k+1}} = \frac{Q(\hat{\tau}^2_{DL_k}) - \mathrm{tr}(B_k \Delta)}{\mathrm{tr}(B_k)} \qquad (14)$$

for $k \ge 1$, where we truncate the resulting estimate to zero if the solution is negative. Equation 14 is a direct generalization of Equation 6 for meta‐regression. Written explicitly in terms of the truncation, the $(k+1)$th step estimator is

$$\hat{\tau}^2_{DL_{k+1}} = \max\left\{ 0,\; \frac{Q(\hat{\tau}^2_{DL_k}) - \mathrm{tr}(B_k \Delta)}{\mathrm{tr}(B_k)} \right\}, \qquad (15)$$

and Equation 15 is a direct generalization of Equation 7.

Lemma: Agreement with respect to truncation to zero of the DL and PM estimators

In this section, we generalize the lemma in Section 6.1 for meta‐regression. As explained above, the PM estimator is positive if and only if $Q(0) > n-p$. As also explained above, the numerator of Equation 12 simplifies to $Q(0) - (n-p)$ when using the DL estimator ($a_i = 1/\hat{\sigma}_i^2$). Hence, the DL estimator is also positive if and only if $Q(0) > n-p$. If instead $Q(0) \le n-p$, then both the DL and PM estimators are zero. We therefore have established that the type of weak agreement described in Section 6.1 between the PM and DL estimators also applies in the meta‐regression setting.

Proving that if convergence occurs, then it is to the Paule‐Mandel estimate

Although we do not prove that the multistep estimator always converges for the meta‐regression model, we have also simulated thousands of datasets and conducted meta‐regression analyses with one continuous covariate (see https://osf.io/5wqvd/ for R code) and did not observe any convergence problems. We have established that artificial examples can be created where the multistep estimator does not converge but that this nonconvergence is unlikely to occur in practice in both meta‐analysis and meta‐regression. In the remainder of this section, we generalize the results in Section 6.2 for meta‐regression.

Assume that convergence occurs and the resulting estimate is positive, so that $\hat{\tau}^2_{DL_\infty} > 0$. We substitute $\hat{\tau}^2_{DL_k} = \hat{\tau}^2_{DL_{k+1}} = \hat{\tau}^2_{DL_\infty}$ into Equation 14, where this equation correctly describes the iteration from DLₖ to DLₖ₊₁ (because the estimate is positive and no truncation is necessary). Then, writing $B_\infty$ for the matrix $B$ computed using the weights $1/(\hat{\sigma}_i^2 + \hat{\tau}^2_{DL_\infty})$, solving the resulting equation for $Q(\hat{\tau}^2_{DL_\infty})$ results in

$$Q(\hat{\tau}^2_{DL_\infty}) = \mathrm{tr}\left( B_\infty (\Delta + \hat{\tau}^2_{DL_\infty} I) \right) = n - p,$$

where the final equality follows from an argument involving a hat matrix that is very similar to the one made in Section 7.2.2. Equation 13 implies that $\hat{\tau}^2_{DL_\infty} = \hat{\tau}^2_{PM}$.

Assume that convergence occurs and the resulting estimate is either zero or truncated to zero, so that $\hat{\tau}^2_{DL_\infty} = 0$. If we substitute $\hat{\tau}^2_{DL_k} = \hat{\tau}^2_{DL_{k+1}} = 0$ into Equation 15, then this equation becomes

$$0 = \max\left\{ 0,\; \frac{Q(0) - (n-p)}{c} \right\},$$

where $c = \mathrm{tr}(B_0) > 0$ and $B_0$ is the matrix $B$ computed using the weights $1/\hat{\sigma}_i^2$. This equation is satisfied only if $Q(0) - (n-p) \le 0$, from which the lemma in Section 7.4 implies that both the DL and PM estimators are zero (which is also the assumed value of $\hat{\tau}^2_{DL_\infty}$). Hence, if the convergence of the multistep estimator is to zero, then the PM estimate is also zero, so that $\hat{\tau}^2_{DL_\infty} = \hat{\tau}^2_{PM}$. We have therefore established that multistep estimates also converge to the PM estimator in meta‐regression models whenever convergence occurs.

DISCUSSION

Two‐step estimators have recently been presented as estimators of the between‐study variance. We have extended these two‐step estimators to a multistep estimator and have shown, by means of empirical examples, simulations, and analytical arguments, that the multistep estimator converges to the PM estimator if the number of steps is sufficiently large. This convergence occurs quickly in practice. Although examples can be produced where the multistep estimator does not converge, we have shown that the PM estimator is obtained in the limit when convergence is obtained and that convergence problems seldom occur in practice. Hence, our analysis suggests that the two‐step estimators are better conceptualized as part of the usual iterative scheme that is used to calculate the PM estimate. Our findings also clarify why previous work15, 16 observed that the DL2 estimator was closer to the PM estimator than the DL estimator was. We therefore suggest that the two‐step estimators, as well as the proposed multistep estimator, should be seen not as truly distinct estimators but as steps in an iterative procedure that results in the PM estimator.

Now that REML and the PM estimator are computationally feasible and established in standard software, we align ourselves with those who argue that these estimators should be preferred over the DL estimator.9, 10 The case for REML becoming the default estimation method is now strong. However, the PM estimator is a viable alternative that is currently the best estimator that uses the method of moments. An advantage of the PM estimator compared to REML is that, in a small proportion of meta‐analyses, REML suffers from convergence problems.5 A byproduct of our work is the development of a new iterative scheme that can be used to calculate the PM estimator.

Our work is a good example of scientists exploring an issue of interest with the expectation of discovering something new and then making new, but unanticipated, discoveries. However, discovering the link between the multistep and PM estimators is in some respects even more satisfying than inventing a new class of estimators of the between‐study variance. We have already explained that the PM estimator has been found to be equivalent to the empirical Bayes estimator, and our results provide another justification for the use of the PM estimator. This estimator would therefore seem to have a very wide variety of justifications and connections with other approaches, which suggests that it has a useful role in both methodological and applied work.

The estimation equation for the multistep estimator in Equation 6 closely resembles a fixed‐point iteration problem,36 because the estimate of the between‐study variance in the previous step is included in the weights of the estimation equation in the next step. Studying the multistep estimator using methods for fixed‐point iteration may yield further insights into the characteristics of meta‐analysis datasets where convergence problems occur. We leave this as an opportunity for future research, which would probably be best undertaken by experts in numerical analysis.

We have considered the random‐effects models for meta‐analysis and meta‐regression. Both of these models assume that the outcome data are independent. More sophisticated models that allow for correlated data include multivariate meta‐analysis37 and network meta‐analysis.38 Jackson et al39 have already developed PM estimators for network meta‐analysis, but our connection between multistep and PM estimators provides an alternative possibility for motivating them. There is currently no PM estimator for the between‐study covariance matrix in multivariate meta‐analysis, but two extensions of the DL estimator have been proposed.40, 41 Generalizing one or both of these estimators to allow an arbitrary set of weights, and so developing a general method of moments estimator, could then motivate the development of multistep estimators in the context of multivariate meta‐analysis. When convergence is reached as the number of steps becomes large, PM estimators of the between‐study covariance matrix could then be defined in this limit. However, considerable methodological development is needed to extend our work to the network and multivariate meta‐analysis settings, because this would first require the development of a generalized method of moments for correlated outcome data. We therefore leave this as a tantalizing possibility for further work. Enthusiasm for this idea is nevertheless likely to be tempered by the finding that the multistep estimator does not always converge. Matters will become more complicated in the multivariate setting, and some convention for defining a PM estimator in this way when convergence is not obtained would be needed.

To summarize, we have extended the two‐step estimator so that multiple steps can be used and reproduced the PM estimator in the limit when the number of steps is sufficiently large. The PM estimator therefore has another justification as a result of its relationship with the proposed multistep estimator. We suggest that the meta‐analysis community should no longer consider the two‐step and multistep estimators to be truly distinct estimators but should instead regard these types of estimators as approximate PM estimators.

27 in total

1. Hartung-Knapp method is not always conservative compared with fixed-effect meta-analysis.

Authors: Anna Wiksten; Gerta Rücker; Guido Schwarzer
Journal: Stat Med Date: 2016-02-04 Impact factor: 2.373

2. Random-effects model for meta-analysis of clinical trials: an update.

Authors: Rebecca DerSimonian; Raghu Kacker
Journal: Contemp Clin Trials Date: 2006-05-12 Impact factor: 2.226

3. Assessing the amount of heterogeneity in random-effects meta-analysis.

Authors: Guido Knapp; Brad J Biggerstaff; Joachim Hartung
Journal: Biom J Date: 2006-04 Impact factor: 2.207

4. Confidence intervals for the amount of heterogeneity in meta-analysis.

Authors: Wolfgang Viechtbauer
Journal: Stat Med Date: 2007-01-15 Impact factor: 2.373

5. Indirect and mixed-treatment comparison, network, or multiple-treatments meta-analysis: many names, many benefits, many concerns for the next generation evidence synthesis tool.

Authors: Georgia Salanti
Journal: Res Synth Methods Date: 2012-06-11 Impact factor: 5.273

6. Meta-analysis in clinical trials.

Authors: R DerSimonian; N Laird
Journal: Control Clin Trials Date: 1986-09

7. A random-effects regression model for meta-analysis.

Authors: C S Berkey; D C Hoaglin; F Mosteller; G A Colditz
Journal: Stat Med Date: 1995-02-28 Impact factor: 2.373

8. Commentary: Heterogeneity in meta-analysis should be expected and appropriately quantified.

Authors: Julian P T Higgins
Journal: Int J Epidemiol Date: 2008-10 Impact factor: 7.196

9. Multivariate meta-analysis: potential and promise.

Authors: Dan Jackson; Richard Riley; Ian R White
Journal: Stat Med Date: 2011-01-26 Impact factor: 2.373

10. An accurate test for homogeneity of odds ratios based on Cochran's Q-statistic.

Authors: Elena Kulinskaya; Michael B Dollinger
Journal: BMC Med Res Methodol Date: 2015-06-10 Impact factor: 4.615


