
An integrative shrinkage estimator for random-effects meta-analysis of rare binary events.

Lie Li, Ou Bai, Xinlei Wang.

Abstract

Meta-analysis has been a powerful tool for inferring the treatment effect between two experimental conditions from multiple studies of rare binary events. Recently, under a random-effects (RE) model, Bhaumik et al. developed a simple average (SA) estimator and showed that with the continuity correction factor 0.5, the SA estimator was the least biased among a set of commonly used estimators. In this paper, under various RE models that allow for treatment groups with equal and unequal variability (in either direction), we develop an integrative shrinkage (iSHRI) estimator based on the SA estimator, which aims to improve estimation efficiency in terms of mean squared error (MSE) that accounts for the bias-variance tradeoff. Through simulation, we find that iSHRI has better performance in general when compared with existing methods, in terms of bias, MSE, type I error and confidence interval coverage. Data examples of rosiglitazone meta-analysis are provided as well, where iSHRI yields competitive results.

Keywords:  Bias; Estimation efficiency; Log odds ratio; Mean squared error; Sparse binary data

Year:  2018        PMID: 30023448      PMCID: PMC6046515          DOI: 10.1016/j.conctc.2018.04.004

Source DB:  PubMed          Journal:  Contemp Clin Trials Commun        ISSN: 2451-8654


Introduction

In medical research, when events of interest are rare, a single randomized study rarely has sufficient power to provide reliable information regarding the treatment effect between two experimental conditions (say, treatment vs. control). Therefore, meta-analysis is often used to combine information from multiple studies of rare binary events. In the past, various meta-analysis approaches have been developed to estimate the overall treatment effect, based on either fixed-effect (FE) models [e.g., 18] or random-effects (RE) models [e.g., 10]. Note that FE models assume a common treatment effect across different studies, while RE models allow the treatment effects to vary from study to study. When dealing with rare binary events, under the FE assumption, commonly used methods for estimating the treatment effect include the Mantel-Haenszel method [MH, 18] with a constant zero-cell correction factor of 0.5, the Peto method [30], and inverse variance methods [10,14]. A previous study [3] compared the performance of twelve FE methods; the general recommendation is to use the MH method with appropriate continuity corrections, which is consistent with the findings of Sweeting et al. [25]. In practice, RE models seem to be less restrictive, especially for clinical trials, because doses and follow-up times can differ across individual studies. Through a meta-analysis of multiple rosiglitazone studies, Shuster et al. [22] pointed out that the performance based on RE models is superior to that based on FE models. In this paper, we focus on meta-analysis of rare binary events based on RE models. For RE meta-analysis, Shuster [21] showed that the widely used DerSimonian and Laird [DSL, 10] method can be highly biased for rare events, and suggested using the simple (unweighted) average of estimates from individual studies. Recently, Bhaumik et al. 
[2] formally proposed a simple average (SA) estimator based on a RE model, which is the unweighted average of the estimated log odds ratios (with a positive continuity correction factor a) in individual studies. Bhaumik et al. [2] showed that, when a = 0.5, the SA estimator (SA_0.5) is asymptotically unbiased and has superior bias performance when compared with existing estimators, including the MH, empirical logit [EL, 10], and DSL methods. However, Li and Wang [17] pointed out that the RE model assumed there is restrictive in the sense that it forces the variability in the treatment group to be no less than that in the control group, and, more importantly, that SA_0.5 fails to minimize the mean squared error (MSE), an established measure of estimation performance that takes into account the bias-variance tradeoff. Based on the SA estimator, we aim to develop a shrinkage estimator with smaller MSE to improve estimation efficiency. Shrinkage methods, which shrink some "standard" estimator toward zero or another fixed value, have been widely used in various fields [24,[26], [27], [28], [29]]. Many shrinkage methods [7,8,29] were developed under a rigorous Bayesian framework via empirical Bayes (EB) approaches, which shrink a point estimate from the sample toward the prior. By contrast, others were derived directly from statistical decision theory (by minimizing some loss function), of which an early example is the famous James-Stein estimator [6,16]. A recent example is that, to estimate the failure rate of rare events from multiple heterogeneous systems, Xiao and Xie [28] developed a plug-in shrinkage estimator based on Poisson distributions, which shrinks the maximum likelihood estimator (MLE) of the failure rate toward a predetermined point in the parameter space, and so lifts degenerate zero estimates to reasonable positive values. This estimator can be expressed as a weighted average of the MLE and a data-dependent point. 
In the context of meta-analysis of rare binary events, we develop an integrative shrinkage estimator (iSHRI) by using the SA method in Bhaumik et al. [2] to combine shrinkage estimators from individual studies; it shrinks the estimated log odds ratios (with a = 0.5) toward a predetermined point. The shrinkage factor is obtained by minimizing the MSE and then plugging in estimates from the data. We thoroughly compare the bias and MSE of the proposed method with those of existing methods via simulation. We further examine their performance in hypothesis testing and interval estimation using the type I error and the coverage probability of confidence intervals (CIs). In addition, two data examples of rosiglitazone meta-analysis are provided for further comparison.
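The bias-variance logic behind an MSE-minimizing shrinkage factor can be sketched numerically. The values below (var, delta) are illustrative, not taken from the paper; the sketch only checks that the analytic minimizer c = d²/(d² + Var) of the shrinkage MSE agrees with a grid search.

```python
import numpy as np

def shrinkage_mse(c, var, delta):
    # MSE of c*theta_hat + (1 - c)*theta0 when theta_hat is unbiased
    # with variance `var`, and delta = theta - theta0 is the distance
    # between the true effect and the shrinkage target.
    return c**2 * var + (1 - c)**2 * delta**2

# Illustrative values (assumed, not from the paper).
var, delta = 0.5, 1.0
c_opt = delta**2 / (delta**2 + var)            # analytic minimizer
grid = np.linspace(0.0, 1.0, 100001)
c_grid = grid[np.argmin(shrinkage_mse(grid, var, delta))]
```

At the minimizer, the MSE of the shrunken estimator never exceeds the variance of the unshrunken one, which is the efficiency gain the paper targets.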

An integrative shrinkage estimator

Consider a meta-analysis consisting of K randomized studies. In the kth study, let n_Tk and n_Ck be the number of subjects in the treatment and control groups, respectively. Assume X_Tk ~ Bin(n_Tk, p_Tk) and X_Ck ~ Bin(n_Ck, p_Ck), where X_Tk (X_Ck) is the number of observed events of interest in the kth study, and p_Tk (p_Ck) is the probability of observing an event in the treatment (control) group. To measure the treatment effect in study k, the log odds ratio is used throughout this paper, i.e., θ_k = log{p_Tk(1 − p_Ck) / [p_Ck(1 − p_Tk)]}. We denote the mean treatment effect across component studies by θ, and the between-study heterogeneity among individual treatment effects by τ². Below we consider three RE models, to accommodate realistic situations where treatment and control groups can have equal variability or unequal variability (in either direction). All three models involve random terms μ_k's and ε_k's, where the μ_k's are normally distributed, ε_k ~ N(0, τ²), and any two components of (μ_1, …, μ_K, ε_1, …, ε_K) are assumed to be independent. Here, μ_k can represent the log odds of the control or treatment group or the average of the two groups, depending on the model specification below. Model I is the one used in Bhaumik et al. [2], which implicitly assumes that the variance of logit(p_Tk) is greater than that of logit(p_Ck): logit(p_Ck) = μ_k, logit(p_Tk) = μ_k + θ + ε_k. Model II is the one used in Smith et al. [23], which assumes equal variances between the two groups: logit(p_Ck) = μ_k − (θ + ε_k)/2, logit(p_Tk) = μ_k + (θ + ε_k)/2. Model III assumes the variance in the treatment is less than that in the control: logit(p_Tk) = μ_k, logit(p_Ck) = μ_k − θ − ε_k. Although originally developed under Model I, the SA estimator can be used with any of the three models to estimate the mean treatment effect θ, given by θ̂_SA(a) = (1/K) Σ_{k=1}^{K} θ̂_k(a), where θ̂_k(a) is an estimator of the individual treatment effect in the kth study, θ̂_k(a) = log{(X_Tk + a)(n_Ck − X_Ck + a) / [(X_Ck + a)(n_Tk − X_Tk + a)]}, and a is a positive continuity correction factor. Bhaumik et al. [2] proved that θ̂_SA(a) is asymptotically unbiased when a = 0.5 under Model I; that is, E{θ̂_SA(0.5)} → θ as n → ∞, where n = min_k {n_Tk, n_Ck} is the overall minimum number of subjects. It can be shown that the asymptotic unbiasedness of θ̂_SA(0.5) also holds under Models II and III. This theoretical property ensures that SA_0.5 (i.e., θ̂_SA(0.5)) performs well in terms of bias for large sample sizes. 
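As a concrete sketch of the SA estimator (with hypothetical counts), the continuity-corrected per-study log odds ratio and its unweighted average can be computed as follows; `log_or` and `sa_estimator` are illustrative names, and the formula is the standard corrected log odds ratio described above.

```python
import numpy as np

def log_or(x_t, n_t, x_c, n_c, a=0.5):
    # Continuity-corrected log odds ratio for one study's 2x2 table;
    # adding a to every cell keeps zero-event studies usable.
    return np.log(((x_t + a) * (n_c - x_c + a)) /
                  ((x_c + a) * (n_t - x_t + a)))

def sa_estimator(tables, a=0.5):
    # Simple average (SA) estimator: unweighted mean of the per-study
    # corrected log odds ratios; a = 0.5 gives SA_0.5.
    return float(np.mean([log_or(*t, a=a) for t in tables]))

# Hypothetical rare-event tables (x_t, n_t, x_c, n_c), including zero cells.
tables = [(1, 100, 0, 120), (0, 80, 2, 90), (3, 200, 1, 210)]
theta_sa = sa_estimator(tables)
```

Note that unlike inverse-variance weighting, no study is dropped or down-weighted for having zero events.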
However, Li and Wang [17] proved that SA_0.5 is suboptimal in terms of MSE, and showed via simulation that it can have poor MSE performance especially for small sample sizes. Motivated by Xiao and Xie [28], in the kth study, we consider a shrinkage estimator of θ_k based on θ̂_k(0.5), denoted by θ̂_k^s = c θ̂_k(0.5) + (1 − c) θ_0, where θ_0 is a fixed point in the parameter space of θ, and c is a shrinkage factor. Then our integrative shrinkage estimator for the overall treatment effect θ can be given by θ̂_iSHRI = (1/K) Σ_{k=1}^{K} θ̂_k^s = c θ̂_SA(0.5) + (1 − c) θ_0. Clearly, the iSHRI estimator shrinks SA_0.5 toward the predetermined point θ_0. Based on the asymptotic unbiasedness of θ̂_SA(0.5), it is easy to show E(θ̂_iSHRI) → cθ + (1 − c)θ_0. Further, we can show that the variance of θ̂_iSHRI is c² Var{θ̂_SA(0.5)}. Thus, regardless of the model specification, θ̂_iSHRI has the asymptotic MSE c² Var{θ̂_SA(0.5)} + (1 − c)²(θ − θ_0)². To minimize the asymptotic MSE of θ̂_iSHRI, we ignore the lower-order bias term and set the shrinkage factor c to

  c = (θ − θ_0)² / [(θ − θ_0)² + Var{θ̂_SA(0.5)}],    (3)

satisfying 0 ≤ c ≤ 1. The minimized asymptotic MSE of iSHRI, denoted by mAMSE, is always less than or equal to the asymptotic MSE of SA_0.5:

  mAMSE = Var{θ̂_SA(0.5)} (θ − θ_0)² / [(θ − θ_0)² + Var{θ̂_SA(0.5)}] ≤ Var{θ̂_SA(0.5)}.

To be able to calculate the shrinkage factor in (3), we need to estimate θ, τ², and the remaining unknown quantities from the data, as well as choose an appropriate value of θ_0. We estimate θ using SA_0.5. Based on some preliminary simulation, we find that the choice of estimator for τ² (e.g., the popular DSL estimator or other estimators introduced in Bhaumik et al. [2]) has little effect on the performance of iSHRI; thus, we fix it at a convenient default value for simplicity. We also adopt the estimators in Gart et al. [12] for the remaining unknown quantities. Further, we use SA_0.25, i.e., θ̂_SA(0.25), as a surrogate of the fixed point θ_0, for the following reasons. First, our simulation results (not reported here) show that the obvious choice θ_0 = 0 does not work well, and there seems to be no other plausible constant that can be used to set θ_0. Second, iSHRI is a weighted average of SA_0.5 and θ_0; thus, it would be reasonable to find a good estimator of θ to replace θ_0. Hitchcock [15] and Gart and Zweifel [13] recommended a = 0.25 as an alternative continuity correction to estimate the log odds ratio, based on the minimum logit method. Further, Li and Wang [17] show that SA_0.25 performs well in general. Note that using SA_0.25 to replace θ_0 would lead to an increase in the variance of θ̂_iSHRI, so the derivation of the optimal shrinkage factor would become analytically difficult. To avoid this difficulty, we directly replace θ_0 by θ̂_SA(0.25) in (3). This plug-in strategy was also adopted by Xiao and Xie [28], where it turns out to work well; we will show via simulation that it works well in the context of meta-analysis of rare binary events, too. Based on these estimators, the data-dependent shrinkage factor is

  ĉ = {θ̂_SA(0.5) − θ̂_SA(0.25)}² / [{θ̂_SA(0.5) − θ̂_SA(0.25)}² + V̂],

where V̂ denotes the estimated variance of θ̂_SA(0.5). From now on, we refer to the estimator ĉ θ̂_SA(0.5) + (1 − ĉ) θ̂_SA(0.25) as the iSHRI estimator.
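The plug-in construction can be sketched end to end. This is a hedged sketch only: the paper derives a model-based variance for the averaged log odds ratios that is not reproduced in this extract, so a standard Gart-type (delta-method) approximation stands in for it below, and `ishri_sketch` is an illustrative name.

```python
import numpy as np

def corr_log_or(x_t, n_t, x_c, n_c, a):
    # Continuity-corrected log odds ratio for one 2x2 table.
    return np.log(((x_t + a) * (n_c - x_c + a)) /
                  ((x_c + a) * (n_t - x_t + a)))

def ishri_sketch(tables):
    # Shrink SA_0.5 toward SA_0.25 with c = d^2 / (d^2 + v), the
    # MSE-minimizing factor for an unbiased estimator. The variance v
    # uses a corrected-count delta-method approximation (assumed here,
    # not the paper's exact model-based expression).
    th_05 = np.array([corr_log_or(*t, a=0.5) for t in tables])
    th_025 = np.array([corr_log_or(*t, a=0.25) for t in tables])
    sa_05, sa_025 = th_05.mean(), th_025.mean()
    # Var of the average: mean per-study variance divided by K.
    v = np.mean([1/(xt + .5) + 1/(nt - xt + .5) + 1/(xc + .5) + 1/(nc - xc + .5)
                 for xt, nt, xc, nc in tables]) / len(tables)
    d2 = (sa_05 - sa_025)**2
    c = d2 / (d2 + v)
    return float(c * sa_05 + (1 - c) * sa_025)
```

Because ĉ lies in [0, 1], the result is always a convex combination of SA_0.5 and SA_0.25.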

Simulation

We conducted a simulation study to compare the performance of the proposed method iSHRI with various existing methods including SA_0.5, DSL, EL, MH, and GLMM [4], among which the first four were included in the comparison done by Bhaumik et al. [2] and the last, GLMM, stands for the popular estimator based on generalized linear mixed models, which has been reported to work well for the analysis of sparse binary data. For the purpose of point estimation, we examined the bias and MSE of these methods; for the purpose of hypothesis testing and interval estimation, we evaluated the type I error rate for testing H0: θ = 0 and the per cent coverage of 95% CIs. In our simulation, we set the number of studies K = 10, 20 and 50, varied the mean treatment effect θ over a grid of values, and set μ = −5 for rare binary events (i.e., about 0.0067 in the probability magnitude). We simulated the group sizes n_Tk and n_Ck randomly, to mimic realistic situations where sample sizes of the two groups can be similar or very different. In this way, for the rare events simulated with μ = −5, n would not be large enough for SA_0.5 to become unbiased in most simulated cases. We generated the responses X_Tk and X_Ck from Bin(n_Tk, p_Tk) and Bin(n_Ck, p_Ck), respectively, for k = 1, …, K, where the p_Tk's and p_Ck's were computed from Models I, II and III, respectively. For each setting, we simulated 1000 datasets to compute the empirical bias, MSE, type I error rate and per cent coverage of all the methods.
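One simulated meta-analysis under an equal-variance (Model II-style) parameterization can be sketched as follows. The between-study spread of μ_k, the τ² value, and the sample-size range are assumptions for illustration, not the paper's exact settings; μ = −5 matches the stated baseline probability of about 0.0067.

```python
import numpy as np

rng = np.random.default_rng(1)

def simulate_meta(K=10, theta=0.0, mu=-5.0, tau2=0.1):
    # Equal-variance (Model II-style) generation: the study-specific
    # effect is split symmetrically around the study baseline mu_k.
    # tau2 = 0.1 and the 50-500 size range are assumed values.
    expit = lambda z: 1.0 / (1.0 + np.exp(-z))
    n_t = rng.integers(50, 500, size=K)
    n_c = rng.integers(50, 500, size=K)
    mu_k = rng.normal(mu, 0.3, size=K)          # assumed baseline spread
    th_k = rng.normal(theta, np.sqrt(tau2), size=K)
    x_t = rng.binomial(n_t, expit(mu_k + th_k / 2))
    x_c = rng.binomial(n_c, expit(mu_k - th_k / 2))
    return list(zip(x_t, n_t, x_c, n_c))

data = simulate_meta()
```

Repeating this 1000 times per setting and applying each estimator reproduces the structure of the comparison, if not the paper's exact numbers.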

Comparing bias

The results for the bias comparison are shown in Fig. 1. The number of studies K does not appear to affect the relative performance of the different methods, while the model specification matters a lot. GLMM is the least biased under Model I, which assumes that the variability in the treatment is larger than that in the control. MH is the best under Model II, in which equal variability is assumed. Under Model III, which assumes the variability in the treatment is smaller than that in the control, iSHRI is the least biased except for large positive θ, where MH and GLMM outperform iSHRI; however, MH and GLMM are the worst two for many other θ values. Overall, iSHRI works the best in terms of bias because it is always the best or among the top-performing group in all the settings. Note that the performance of SA_0.5 is consistently worse than that of iSHRI. This is because in these simulated settings, SA_0.5 is not asymptotically unbiased due to generally small n, as mentioned before.
Fig. 1

Comparison of bias.

Another interesting observation is that for Model I, bias is large for large negative θ; and for Model III, bias is large for large positive θ. This is because in these cases, the number of events of interest can be extremely low in one group (treatment or control): p_Tk is very small for Model I with large negative θ, and p_Ck is very small for Model III with large positive θ. From Fig. 1, we can see that the bias of all methods is large when the number of events is extremely low. As will be shown later, the poor bias performance leads to large MSE in Fig. 2 and low CI coverage rates in Fig. 3 for these cases.
Fig. 2

Comparison of MSE.

Fig. 3

Per cent coverage of CIs.


Comparing MSE

Fig. 2 shows the results of the MSE comparison. Clearly, no single estimator beats the others in all the settings. Under Model I, iSHRI seems to be the best for K = 10 or 20, and GLMM seems to be the best for K = 50. Under Model II, the performance of the different methods is less heterogeneous, and iSHRI performs well when θ moves away from zero. Under Model III, iSHRI offers either the best or close to the best MSE performance in all cases. Overall, iSHRI has very strong performance in terms of MSE.

Evaluating type I error

For testing H0: θ = 0, we used the statistic T = θ̂/ŜE(θ̂), where θ̂ is an estimator of θ and ŜE(θ̂) stands for the estimated standard error (SE) of θ̂. The methods described in Bhaumik et al. [2] were used to estimate the SEs of SA_0.5, DSL, EL and MH. For iSHRI, it is hard to obtain a closed-form formula for the SE, so we used the delete-d jackknife method [11] to estimate the SEs. For GLMM, we used an R package [1] to estimate the SEs. In our preliminary simulation, we found that when K is not large, based on the statistic T, a t-test works better than a z-test for all the methods except SA_0.5; when K is large, the two tests are equivalent. Thus, we performed the z-test for SA_0.5 and the t-test for the other methods. Using data generated under θ = 0, we computed empirical type I error rates at the significance level 0.05, and report the results in Table 1. Unlike the other methods, the rates of SA_0.5 are consistently smaller than the target level 0.05, meaning that SA_0.5 is quite conservative in rejecting H0. Among all the methods, the rates of iSHRI are closest to 0.05 in nearly all the settings, although they are slightly above 0.05. By contrast, GLMM has inflated rates under Models II and III; DSL has inflated rates under Models I and III; and the rates of EL and MH are highly inflated in all the cases.
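A delete-d jackknife standard error can be sketched as below. The value of d used in the paper is not given in this extract; d = 1 (the classical leave-one-out jackknife) is shown as a special case, and `jackknife_se` is an illustrative name.

```python
import numpy as np
from itertools import combinations

def jackknife_se(values, estimator=np.mean, d=1):
    # Delete-d jackknife SE of `estimator` applied to per-study values:
    # recompute the estimator on every size-(K-d) subset, then scale
    # the spread of the replicates by (K - d) / d.
    values = np.asarray(values, dtype=float)
    K = len(values)
    reps = np.asarray([estimator(values[list(idx)])
                       for idx in combinations(range(K), K - d)])
    var = (K - d) / d * np.mean((reps - reps.mean())**2)
    return float(np.sqrt(var))
```

For the sample mean with d = 1, this reduces to the familiar jackknife variance s²/K, which gives a quick sanity check.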
Table 1

Comparison of type I error rates.

Model   K    SA_0.5   iSHRI   GLMM   DSL    EL     MH
I       10   0.03     0.06    0.07   0.07   0.18   0.26
I       20   0.02     0.07    0.07   0.12   0.30   0.38
I       50   0.02     0.07    0.06   0.20   0.48   0.59
II      10   0.02     0.05    0.08   0.06   0.11   0.19
II      20   0.02     0.06    0.09   0.06   0.15   0.22
II      50   0.01     0.06    0.12   0.06   0.17   0.26
III     10   0.02     0.06    0.12   0.07   0.19   0.26
III     20   0.02     0.06    0.20   0.13   0.33   0.39
III     50   0.02     0.07    0.36   0.21   0.53   0.60

Comparing CI coverage

To compute the 95% CIs, we used θ̂ ± z_{0.025} ŜE(θ̂) for SA_0.5, and θ̂ ± t_{0.025} ŜE(θ̂) for the other methods, where t_{0.025} is the critical value of the t distribution with the corresponding degrees of freedom. Then for each method considered, we computed the empirical proportion of the CIs that contain the true θ value among 1000 replicates in each setting. To clearly show the differences among the estimators, especially in the neighborhood of the nominal level 95% (represented by the black solid line), we focus on coverage rates from 50% to 100% in Fig. 3. Under Model I, GLMM appears to be the best and its coverage is quite close to 95%; iSHRI performs as well as GLMM when θ is large positive or close to zero. Under Model II, iSHRI is the best for K = 10 and 20; it is also the best for K = 50 except for large positive/negative θ. Under Model III, iSHRI is the best except in some cases with large positive θ, where SA_0.5 provides coverage closer to 95%. Note that although GLMM performs well under Model I, it performs poorly under Model III; and the other methods, including DSL, EL and MH, do not perform well in general. Thus, the performance of iSHRI is competitive in terms of CI coverage.
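The CI construction, and the exponentiation to the odds-ratio scale used in the data examples, can be sketched as follows. The exact degrees of freedom are not specified in this extract; K − 1 is used here as a plausible choice, and `or_ci` is an illustrative name.

```python
import numpy as np
from scipy.stats import t, norm

def or_ci(theta_hat, se, K, use_t=True, level=0.95):
    # CI on the log-odds-ratio scale (z for SA_0.5, t for the other
    # methods, per the text), then exponentiated to an OR interval.
    # df = K - 1 is an assumed choice.
    alpha = 1.0 - level
    crit = t.ppf(1 - alpha / 2, df=K - 1) if use_t else norm.ppf(1 - alpha / 2)
    lo = theta_hat - crit * se
    hi = theta_hat + crit * se
    return np.exp(lo), np.exp(hi)
```

Since the t critical value exceeds the z critical value for any finite df, the t-based interval is always the wider of the two for the same SE.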

Data examples

Nissen and Wolski [19] conducted a meta-analysis of 48 randomized trials to evaluate potential adverse effects of rosiglitazone therapy, including the risks of myocardial infarction (MI) and cardiovascular death (CVD). Later, Nissen and Wolski [20] updated the analysis with 56 randomized trials including the previous 48 trials. In this section, we applied the six methods, namely, SA_0.5, iSHRI, GLMM, DSL, EL and MH, to evaluate adverse effects of rosiglitazone with respect to MI and CVD based on these two studies.

MI data

Table 2 summarizes the results from the different methods for meta-analysis of the MI data of the 48 and 56 trials, respectively, including the estimated treatment effect θ̂, the estimated odds ratio (OR), and the 95% CI of the OR. We used the same methods described in Section 3 to compute the CIs for each method. The results of DSL and EL are the same because the Q statistic [9] is 0 in both analyses. From Table 2, we find that, for both the 48 and 56 trials, all the methods report positive estimates of θ that fall in the interval [0, 0.5]; and only MH provides CIs of the OR that do not include 1, meaning that all the methods except MH do not reject H0: θ = 0.
Table 2

Results for MI data in rosiglitazone meta-analysis.

Study                    Para     SA_0.5          iSHRI           GLMM            DSL             EL              MH
Nissen and Wolski [19]   θ̂       0.021           0.107           0.299           0.180           0.180           0.356
                         OR       1.021           1.113           1.348           1.198           1.198           1.427
                         95% CI   (0.653,1.598)   (0.855,1.448)   (0.963,1.886)   (0.887,1.619)   (0.887,1.619)   (1.021,1.995)
Nissen and Wolski [20]   θ̂       0.029           0.108           0.226           0.159           0.159           0.250
                         OR       1.030           1.114           1.253           1.172           1.172           1.284
                         95% CI   (0.678,1.564)   (0.888,1.397)   (0.985,1.595)   (0.937,1.467)   (0.937,1.467)   (1.009,1.634)
We conducted the Brown-Forsythe Levene-type (BFL) test for equal variances [5] using sample log odds (with the correction factor 0.5); for the MI data in both studies, the test did not reject the null at the significance level 0.05. Thus, both MI studies may fit in our simulation settings under Model II. Based on the simulation results in Section 3, for such settings, iSHRI and MH are the least biased, and all the methods achieve similarly small MSE values. Meanwhile, iSHRI performs the best in maintaining the nominal type I error as well as the nominal CI coverage, suggesting that iSHRI is the best choice for the purpose of testing. Even for the purpose of estimation, we believe iSHRI is a much safer choice than MH. Although the BFL test did not reject variance equality, there is still a small chance that the variance in the treatment is larger than that in the control based on their variance estimates; and under Model I, MH severely overestimates the treatment effect and its MSE is also the largest among all. Note that MH does give the largest positive estimates for both MI studies. In addition, between the two studies, we find that iSHRI gives the most consistent estimates of θ (0.107 vs. 0.108) while MH gives estimates with the largest difference (0.356 vs. 0.250). To summarize, we decide to rely on the results from iSHRI, and conclude that the (small positive) effect of rosiglitazone on MI is not statistically significant.
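The BFL diagnostic described above can be sketched with SciPy's median-centered Levene test, which is the Brown-Forsythe variant. The helper name `bfl_equal_variance` and the example tables are illustrative.

```python
import numpy as np
from scipy.stats import levene

def bfl_equal_variance(tables, a=0.5):
    # Brown-Forsythe (Levene-type, median-centered) test comparing the
    # variability of corrected sample log odds between the treatment
    # and control arms across studies; returns the p-value.
    lo_t = [float(np.log((xt + a) / (nt - xt + a))) for xt, nt, _, _ in tables]
    lo_c = [float(np.log((xc + a) / (nc - xc + a))) for _, _, xc, nc in tables]
    stat, p = levene(lo_t, lo_c, center='median')
    return p

# Hypothetical rare-event tables (x_t, n_t, x_c, n_c).
tables = [(1, 100, 0, 120), (0, 80, 2, 90), (3, 200, 1, 210)]
p_value = bfl_equal_variance(tables)
```

A p-value below 0.05 would point toward an unequal-variability model (I or III); otherwise the equal-variance Model II is plausible, mirroring the model-selection logic used for the MI and CVD data.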

CVD data

As shown in Table 3, for the CVD data, SA_0.5 gives slightly negative estimates in both studies; DSL and EL return slightly negative estimates in the updated study but positive estimates in the first study; and the other methods report positive estimates in both studies. Comparing the estimates of θ between the two studies, SA_0.5 and iSHRI show the smallest changes while GLMM and MH change a lot, even though their signs do not change. Note that for any adverse effect, θ is expected to be nonnegative; otherwise the "adverse" effect is actually beneficial. All the CIs cover 1, suggesting that H0: θ = 0 should not be rejected.
Table 3

Results for CVD data in rosiglitazone meta-analysis.

Study                    Para     SA_0.5          iSHRI           GLMM            DSL             EL              MH
Nissen and Wolski [19]   θ̂       −0.037          0.054           0.439           0.087           0.087           0.529
                         OR       0.964           1.055           1.551           1.092           1.092           1.698
                         95% CI   (0.583,1.594)   (0.859,1.297)   (0.897,2.681)   (0.724,1.647)   (0.724,1.647)   (0.970,2.972)
Nissen and Wolski [20]   θ̂       −0.065          0.011           0.009           −0.057          −0.057          0.026
                         OR       0.937           1.011           1.009           0.944           0.944           1.026
                         95% CI   (0.590,1.487)   (0.831,1.231)   (0.759,1.341)   (0.729,1.222)   (0.729,1.222)   (0.770,1.367)
The BFL test for equal variances in the study of the 48 trials returns a p-value of 0.040, suggesting Model I; it returns a p-value of 0.246 in the study of the 56 trials, suggesting Model II. Thus, the CVD data of the 48/56 trials may fit in our simulation settings under Model I/II, respectively. Based on our simulation results, it is easy to see that iSHRI and SA_0.5 are the best two methods in terms of bias and MSE under both models in the corresponding settings. However, in terms of the type I error and CI coverage, iSHRI is better than SA_0.5. Thus, based on the results from iSHRI, we believe the rosiglitazone effect on CVD is very close to zero, and there is no strong evidence to support its existence.

Discussion

For meta-analysis of rare binary events, an integrative shrinkage estimator of treatment effects, called iSHRI, has been developed based on the SA estimator proposed in Bhaumik et al. [2]. We have shown that iSHRI should have smaller MSE than SA_0.5 if individual sample sizes are sufficiently large. Further, the performance of iSHRI has been compared with commonly used methods in terms of bias, MSE, type I error, and CI coverage under various settings simulated to mimic realistic situations. We find that for relatively small sample sizes, the performance of a meta-analysis method depends largely on the variability direction (i.e., which group has larger variability) as well as the size and direction of the mean treatment effect θ. However, iSHRI has generally satisfactory performance in both estimation and testing, which seems to be better than or comparable to other top methods in most cases. Data examples involving rosiglitazone meta-analysis confirmed the usefulness of iSHRI as a safe and reliable method, especially when there exists uncertainty about the variability direction, as is typical in practice. We end this paper by pointing out several major differences between our iSHRI method and that proposed in Xiao and Xie [28]. Firstly, the underlying research problems are different, though both involve rare events. Xiao and Xie [28] consider the problem of estimating the failure rate for highly reliable systems while we consider the problem of estimating the treatment effect from multiple independent studies in the context of meta-analysis of rare binary events. Thus, the application scopes of the two methods are completely different. Secondly, due to the different problem set-ups, Xiao and Xie [28] rely on Poisson distributions while we rely on binomial-normal hierarchical models for statistical inference. 
In addition, in terms of technical detail, Xiao and Xie [28] shrink the maximum likelihood estimator of the failure rate, while we shrink the nonparametric simple average estimator of the treatment effect; our derivation is therefore different from theirs.
References (15 in total)

1. Mantel N, Haenszel W. Statistical aspects of the analysis of data from retrospective studies of disease. J Natl Cancer Inst. 1959.

2. Sweeting MJ, Sutton AJ, Lambert PC. What to add to nothing? Use and avoidance of continuity corrections in meta-analysis of sparse data. Stat Med. 2004.

3. Bradburn MJ, Deeks JJ, Berlin JA, Localio AR. Much ado about nothing: a comparison of the performance of meta-analytical methods with rare events. Stat Med. 2007.

4. Xu S. An empirical Bayes method for estimating epistatic effects of quantitative trait loci. Biometrics. 2007.

5. DerSimonian R, Laird N. Meta-analysis in clinical trials. Control Clin Trials. 1986.

6. Smith TC, Spiegelhalter DJ, Thomas A. Bayesian approaches to random-effects meta-analysis: a comparative study. Stat Med. 1995.

7. Greenland S, Robins JM. Estimation of a common effect parameter from sparse follow-up data. Biometrics. 1985.

8. Nissen SE, Wolski K. Rosiglitazone revisited: an updated meta-analysis of risk for myocardial infarction and cardiovascular mortality. Arch Intern Med. 2010.

9. Shuster JJ, Jones LS, Salmon DA. Fixed vs random effects meta-analysis in rare event studies: the rosiglitazone link with myocardial infarction and cardiac death. Stat Med. 2007.

10. Li L, Wang X. Meta-analysis of rare binary events in treatment groups with unequal variability. Stat Methods Med Res. 2017.