Literature DB >> 28620945

Measuring the statistical validity of summary meta-analysis and meta-regression results for use in clinical practice.

Brian H Willis1, Richard D Riley2.   

Abstract

An important question for clinicians appraising a meta-analysis is: are the findings likely to be valid in their own practice; that is, does the reported effect accurately represent the effect that would occur in their own clinical population? To this end we advance the concept of statistical validity, where the parameter being estimated equals the corresponding parameter for a new independent study. Using a simple ('leave-one-out') cross-validation technique, we demonstrate how we may test meta-analysis estimates for statistical validity using a new validation statistic, Vn, and derive its distribution. We compare this with the usual approach of investigating heterogeneity in meta-analyses and demonstrate the link between statistical validity and homogeneity. Using a simulation study, the properties of Vn and the Q statistic are compared for univariate random effects meta-analysis and a tailored meta-regression model, where information from the setting (included as model covariates) is used to calibrate the summary estimate to the setting of application. Their properties are found to be similar when there are 50 studies or more, but for fewer studies Vn has greater power but a higher type 1 error rate than Q. The power and type 1 error rate of Vn are also shown to depend on the within-study variance, between-study variance, study sample size, and the number of studies in the meta-analysis. Finally, we apply Vn to two published meta-analyses and conclude that it usefully augments standard methods when deciding upon the likely validity of summary meta-analysis estimates in clinical practice.
© 2017 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd.


Keywords:  data interpretation; decision making; meta-analysis; models, statistical; validity


Year:  2017        PMID: 28620945      PMCID: PMC5575530          DOI: 10.1002/sim.7372

Source DB:  PubMed          Journal:  Stat Med        ISSN: 0277-6715            Impact factor:   2.373


Introduction

The capacity to aggregate multiple studies and provide a summary estimate for translation into practice was one of the motivations that drove the development of meta‐analysis. In this regard, it has achieved undoubted success; however, the blight of heterogeneity, which so often affects meta‐analyses, can potentially affect the applicability of results in individual clinical settings, such as individual practices, hospitals, regions, or even countries. Although methods have been developed to quantify and ascertain the effects of heterogeneity, more recently, particularly in the field of predictive modeling, the focus has been on developing statistical approaches that increase the validity of meta‐analysis results when applied in new populations. When evaluating diagnostic and prognostic tests, Riley and colleagues 1 examine approaches to translate test accuracy meta‐analysis results to a new population, and propose cross‐validation and prediction intervals to evaluate calibration performance of each approach. Debray and colleagues provide a framework for the use of individual patient data (IPD) from multiple studies in prediction modeling using logistic regression, and demonstrate that different model intercepts may be needed in new settings to ensure good predictive performance 2. Similarly, Snell et al. use IPD from multiple countries to develop and validate a breast cancer prognostic model, and show that it calibrates far better in each country when the baseline hazard is recalibrated (or ‘tailored’) to each country's population 3. A common theme to all of these approaches is the use of cross‐validation 4 where the k primary studies in the meta‐analysis provide all the data used in the validation process. The cross‐validation process involves comparing the meta‐analysis estimate from k − 1 studies with that estimate from the omitted study; this is repeated k times, each time omitting a different study. 
The need to examine and improve the validity of meta‐analysis results should not be confined to the prediction modeling field. Indeed, assessing whether meta‐analysis results translate into practice should be the concern of all reviewers and statisticians producing summary results from a body of evidence, whether it is for the purpose of diagnosis, treatment, prognosis, or otherwise. With this in mind, in this article we propose a general method for assessing the statistical validity of meta‐analysis results when applied in clinical practice. The predominant question we aim to address is: when should we apply a summary meta‐analysis estimate to an independent setting? Specifically, if μma is the parameter for the true summary effect of interest as estimated by the meta‐analysis model and μsetting is the parameter for the true effect in an independent setting of interest, then does μma = μsetting? When the two are equal, we propose that the summary estimate from the meta‐analysis model can be described as having statistical validity. However, if the two are not equal, then meta‐analysis results may need to be modified (or ‘tailored’) to the setting of interest in order to ensure statistical validity. We will also examine statistical validity in the context of heterogeneity, considering some of the methods used to establish heterogeneity. We develop this in the following sections. We describe the meta‐analysis and tailored meta‐regression models in section 2. In section 3, we develop a new validation statistic, Vn, and derive its associated distribution applied to meta‐analysis and meta‐regression models. We consider how statistical validity relates to heterogeneity and compare Vn with Cochran's Q statistic. The properties of Vn and Q are examined more closely in a simulation study in section 4, and then application is made to two case examples in section 5. In the discussion in section 6, we consider its use and shortcomings.

Meta‐analysis and mixed‐models

Suppose there are multiple studies that each evaluate a particular effect of interest (e.g. an intervention effect, the sensitivity of a test, or the performance of a prognostic model). Of interest is the summary (mean) effect across studies and, for the purposes of this paper, how to apply or tailor such summary meta‐analysis estimates to clinical practice. Given the potential variation in the true effects between primary studies, we develop methods from the viewpoint of a random effects model.

Meta‐analysis model

For the observed mean effect, y_i, in a primary study, we use the following univariate random effects model to aggregate the primary studies in the meta‐analysis:

y_i = μ + u_i + e_i,  u_i ~ N(0, τ²),  e_i ~ N(0, s_i²)   (1)

where μ is the mean (summary) effect across the studies and the key parameter to be estimated, u_i is the study‐specific deviation from the overall mean effect with unexplained between‐study variance τ², and e_i is the sampling error with variance s_i² assumed known for each study i. This will be used as the base model in the meta‐analysis, where the key result for translation to clinical practice is the summary effect estimate, μ̂.

Tailored meta‐regression model

Heterogeneity across settings may be explained by study‐level covariates, and such covariates may be important when applying summary meta‐analysis results in clinical practice. We consider the effects of covariates on the meta‐analysis by incorporating these within a meta‐regression model. When there are p − 1 covariates, we have

y_i = x_i β + u_i + e_i,  u_i ~ N(0, τ²),  e_i ~ N(0, s_i²)   (2)

where x_i is the row vector with p elements (the first element is 1) associated with study i, β is the p‐vector of coefficients to be estimated (the first element being the intercept term), and u_i and e_i are defined as previously. This model allows us to obtain summary results ‘tailored’ for particular populations of interest, defined by their set of covariate values x and estimated by x β̂_(i), where β̂_(i) denotes the estimate of β derived from (2) with the ith study omitted.
Therefore, following estimation of model (2), the key result for translation to clinical practice is the tailored estimate for the covariate values describing the setting of interest. The merits of using information from the setting of interest have recently been described by Willis and Hyde 5, 6, particularly in the case of diagnosis. Essentially, it helps tailor the summary meta‐analysis estimate to the setting for clinical application, to potentially improve its validity. Models (1) and (2) can be estimated using standard techniques, such as the method of moments and restricted maximum likelihood (REML). All meta‐analysis and meta‐regression models in this article are fitted in R using the package metafor 7.
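The models above are fitted in the article with REML via metafor in R. As a language‐neutral illustration of model (1), the following is a minimal sketch of the DerSimonian–Laird method‐of‐moments estimator (a simpler alternative to REML, not the estimator used in the paper); all function names and the toy data are our own.

```python
import numpy as np

def dersimonian_laird(y, v):
    """Random effects meta-analysis, model (1), fitted by the
    DerSimonian-Laird method of moments (an alternative to REML).
    y: observed study effects; v: known within-study variances s_i^2."""
    y, v = np.asarray(y, float), np.asarray(v, float)
    k = len(y)
    w = 1.0 / v                            # fixed-effect weights
    mu_fe = np.sum(w * y) / np.sum(w)      # fixed-effect pooled mean
    Q = np.sum(w * (y - mu_fe) ** 2)       # Cochran's Q statistic
    c = np.sum(w) - np.sum(w ** 2) / np.sum(w)
    tau2 = max(0.0, (Q - (k - 1)) / c)     # between-study variance estimate
    w_re = 1.0 / (v + tau2)                # random-effects weights
    mu = np.sum(w_re * y) / np.sum(w_re)   # summary effect estimate
    se = np.sqrt(1.0 / np.sum(w_re))       # its standard error
    return mu, se, tau2, Q

# toy data: five study effect estimates with their within-study variances
y = [-0.5, -0.2, -0.8, -0.4, -0.1]
v = [0.04, 0.09, 0.05, 0.08, 0.10]
mu, se, tau2, Q = dersimonian_laird(y, v)
```

The summary estimate is an inverse‐variance weighted mean, so it always lies within the range of the study estimates.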

Examining summary meta‐analysis estimates in clinical practice

Cross‐validation approach

To evaluate whether meta‐analysis results may be translated into practice requires the development of methods that allow the cross‐validation of the derived summary estimates in new independent settings. In prognostic modeling, Royston and colleagues proposed what they called an ‘internal–external cross‐validation’ procedure to establish the generality of a prognostic model developed across different studies 4. More recently, this method has been elaborated upon by Riley 1 and Debray 2. Essentially, the procedure bears similarity to the ‘jackknife’ method 8: each primary study is omitted, in turn, from the meta‐analysis, and a summary estimate is derived from the remaining studies, which is then compared with the observed estimate in the corresponding omitted study. When there are k independent studies, the procedure generates k different meta‐analysis estimates, and thus k different validations. As such, the primary studies selected for the meta‐analysis are themselves being used as the basis for independent validation.

Validation statistic, Vn

In the remainder of this article, we focus on developing a statistic to test the validity of summary meta‐analysis estimates for clinical practice, within the context of the aforementioned cross‐validation approach. Let y_i be the observed mean effect estimate of interest in the ith study, and μ̂_(i) be the summary meta‐analysis estimate (from either model (1) or model (2)) generated from using k − 1 studies with the ith study omitted. Therefore, following the cross‐validation exercise, we have a dataset containing k values of y_i and μ̂_(i). In this context, we propose the validation statistic, Vn:

Vn = Σ_{i=1..k} [y_i − μ̂_(i)]² / [var(y_i) + var(μ̂_(i))]

where var(y_i) is the variance of y_i and var(μ̂_(i)) is the variance of μ̂_(i). Assuming y_i to be normally distributed, the Vn statistic may be used as an overall test of the null hypothesis that θ_(i) = θ_i for all i, where θ_(i) is the parameter (true predicted effect) that underlies the μ̂_(i) predicted by the meta‐analysis model, and θ_i is the parameter (true effect) in the omitted study i. By our definition, when the null hypothesis is true, the meta‐analysis/regression estimate is a statistically valid estimate for a new setting. Thus, if we define a p value <0.05 for Vn as significant, then p < 0.05 implies that there is sufficient evidence to conclude the meta‐analysis/regression estimate is not statistically valid. In section 3.3, the distribution of Vn is derived for meta‐analysis/regression models.
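Given the leave‐one‐out dataset, the statistic can be computed with a simple loop. The sketch below uses a DerSimonian–Laird moments fit in place of the paper's REML estimation, takes var(y_i) to include the between‐study variance τ̂² (our modeling assumption; the paper's appendix gives the exact forms), and back‐calculates the illustrative within‐study variances from the confidence intervals in Table 4. All names are our own.

```python
import numpy as np

def re_fit(y, v):
    """Random-effects summary estimate and its variance
    (DerSimonian-Laird moments fit, standing in for REML)."""
    y, v = np.asarray(y, float), np.asarray(v, float)
    k = len(y)
    w = 1.0 / v
    mu_fe = np.sum(w * y) / np.sum(w)
    Q = np.sum(w * (y - mu_fe) ** 2)
    c = np.sum(w) - np.sum(w ** 2) / np.sum(w)
    tau2 = max(0.0, (Q - (k - 1)) / c)
    w_re = 1.0 / (v + tau2)
    mu = np.sum(w_re * y) / np.sum(w_re)
    return mu, 1.0 / np.sum(w_re), tau2

def vn_statistic(y, v):
    """Vn = sum_i (y_i - mu_(i))^2 / (var(y_i) + var(mu_(i))),
    where mu_(i) is the summary estimate with study i left out.
    Here var(y_i) is taken as v_i + tau2 (an assumption)."""
    y, v = np.asarray(y, float), np.asarray(v, float)
    k = len(y)
    vn = 0.0
    for i in range(k):
        keep = np.arange(k) != i              # leave study i out
        mu_i, var_mu_i, tau2_i = re_fit(y[keep], v[keep])
        var_yi = v[i] + tau2_i                # marginal variance of y_i
        vn += (y[i] - mu_i) ** 2 / (var_yi + var_mu_i)
    return vn

# log relative risks from Table 4 (Berkey data); variances
# back-calculated from the 95% confidence intervals there
y = [-1.62, -1.59, -1.44, -1.37, -1.35, -0.89, -0.79, -0.47, -0.34, -0.22]
v = [0.223, 0.195, 0.020, 0.073, 0.417, 0.327, 0.007, 0.058, 0.013, 0.052]
vn = vn_statistic(y, v)
```

The resulting value would then be referred to the distribution derived in section 3.3.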

Distribution of Vn for meta‐analysis and meta‐regression summary estimates

Here, we give an outline of the derivation of the asymptotic distribution of Vn, recognising that Vn is a quadratic form, and using an approach described in previous studies 9, 10. A more detailed description of the derivation is given in appendix 1. Assuming a continuous outcome, let the ith study have observed mean effect y_i and variance var(y_i) = σ_i²/n_i, where σ_i² is the variance of the patient‐level observations in each study with sample size n_i. By writing the weights w_i = 1/[var(y_i) + var(μ̂_(i))], Vn may be written as

Vn = Σ_{i=1..k} w_i [y_i − μ̂_(i)]²

If we define the k × k matrix B appropriately (see appendix 1), the term inside the squared brackets may be written as By, where y is the k‐vector with elements (y_i), and Vn may be written in the following matrix form:

Vn = y′B′wBy

The diagonal matrix w has diagonal elements w_i, and in general, the diagonal elements of B are all 1. The advantage of writing Vn in this form is that the null hypothesis, θ_(i) = θ_i for all i, is equivalent to Bθ = 0. By transforming y into a vector of standard normal variables and applying the spectral decomposition theorem, it may be shown that

Vn ~ Σ_{i=1..k} λ_i χ²_1,i   (6)

for eigenvalues λ_i of Σ^{1/2}B′wBΣ^{1/2}, where Σ is the diagonal matrix with diagonal elements var(y_i). Thus, Vn has a distribution which is a linear combination of χ² variables of degree 1. This is an exact distribution if the var(y_i) are all known. In practice, the σ_i² are estimated from the sample data, so it is an asymptotic distribution. Note that Vn has the same form of distribution for both the univariate random effects meta‐analysis and tailored meta‐regression, but both B and w differ between the two cases (see appendix 1).

Farebrother's algorithm to implement Vn

To apply Vn, the distribution specified in (6) needs to be known. The distribution of a linear combination of chi‐squared variables has received considerable attention over the years. In general, there is no closed form for the distribution, so it has to be obtained numerically, and a number of approaches have been described. Satterthwaite in 1946 described an approximate method based on the observation that when the non‐zero eigenvalues all equal one, the distribution simplifies to a single chi‐squared distribution with k degrees of freedom 11. This suggests that a single chi‐squared distribution with an ‘effective’ number of degrees of freedom may provide a suitable approximation. Other approaches include inverting the characteristic function (Davies 12, 13) and applying numerical integration to a weighted sum of chi‐squared variables (Fleiss 14). Ruben made a notable development when he demonstrated that a linear combination of chi‐squared variables could be written as an infinite series 15. Importantly, he also showed that the truncation error after n terms had an upper bound which was dependent on a chi‐squared distribution, the coefficients of the terms in the expansion, and n, all of which could be estimated accurately 15. Thus, an estimate of the exact distribution may be obtained from the truncated series with n terms, such that n is set to make the truncation error arbitrarily small 9. Ruben's method is incorporated within Farebrother's algorithm 16, and it is this algorithm we use when estimating the distribution of Vn. The version of Farebrother's algorithm applied below is from the package CompQuadForm in R 10. The R source code used to estimate the distribution of Vn for a case example may be found in appendix 2.
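The paper uses Farebrother's implementation of Ruben's series from CompQuadForm in R. As a rough cross‐check of the same tail probability, the characteristic function of a weighted sum of independent χ²₁ variables can also be inverted numerically, in the spirit of Imhof's and Davies' approaches. The sketch below is our own illustration (central case, all degrees of freedom equal to 1), not the paper's code.

```python
import numpy as np
from scipy import integrate

def imhof_sf(x, lam, upper=50.0):
    """P(sum_r lam_r * chi2_1 > x) via numerical inversion of the
    characteristic function (Imhof's formula, central case, df = 1).
    The infinite integral is truncated at `upper`; the integrand's
    amplitude decays like u**(-1 - k/2), so the tail is small."""
    lam = np.asarray(lam, float)

    def integrand(u):
        theta = 0.5 * np.sum(np.arctan(lam * u)) - 0.5 * x * u
        rho = np.prod((1.0 + (lam * u) ** 2) ** 0.25)
        return np.sin(theta) / (u * rho)

    val, _ = integrate.quad(integrand, 1e-12, upper, limit=400)
    return 0.5 + val / np.pi

# sanity check: with lam = [1, 1] the sum is chi-squared with 2 df,
# whose survival function is exp(-x/2)
p = imhof_sf(5.99146, [1.0, 1.0])
```

Farebrother's series and Davies' exact inversion control the truncation error more carefully; this sketch is only meant to show the shape of the computation.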

Heterogeneity and statistical validity

The Vn statistic may be used as a test of the null hypothesis H0: θ_(i) = θ_i for all i. For the base meta‐analysis model (1), if we define θ_(i) as the weighted mean of the θ_j for j ≠ i, then the null hypothesis is equivalent to the following for each i:

θ_i = Σ_{j≠i} w_j θ_j / Σ_{j≠i} w_j

It can be seen from this that the null hypothesis is only satisfied when θ_i = θ_j for all i, j, that is, when there is no heterogeneity. In short, we have the intuitive result that the base meta‐analysis model will provide a summary estimate which is statistically valid in all clinical settings only when the studies comprising the meta‐analysis are homogeneous. Writing Vn in matrix form is useful when considering the null hypothesis. As stated earlier, the null hypothesis equates to Bθ = 0, where θ is the k‐vector of parameters θ_i; in other words, the above equations are equivalent to requiring θ to lie in the kernel, or null space, of B. Using Gauss‐Jordan elimination, B may be reduced to echelon form, and the above result follows readily. Heterogeneity is likely to exist in most meta‐analyses, and thus, in general, individual clinical settings will have a true effect that differs from the mean (summary) effect; thus, one would expect Vn to lead to the null hypothesis being rejected in most applications of meta‐analysis. In contrast, Vn is likely to be more useful when considering the case of tailored meta‐analysis results, as derived from the tailored meta‐regression model in (2) where covariates are included. Reducing B to echelon form (see appendix 1 for the definition of B), it follows that if θ_(i) = θ_i for all i, then for p − 1 covariates the θ_i must satisfy a system of linear constraints (equation (8)). The dimension of the kernel of B, otherwise known as the nullity, is p; this follows from the rank‐nullity theorem, because rank(B) = k − p. Therefore, if θ_1, .., θ_p are all known, then this constrains θ to values in p‐space (equation (8)). The values of the coefficients depend on the within‐study variance, the between‐study variance, and the values of the covariates for each study; when the covariates are continuous, there are an infinite number of possible solutions.
In summary, within the k‐dimensional space of all possible values of (θ_1, …, θ_k), there is a p‐dimensional sub‐space where the (θ_1, …, θ_k) are statistically valid. Because in most cases we expect to find heterogeneity, that is, the primary studies differ in their true effects, there is still the potential for the meta‐regression model to provide predicted summary estimates that are statistically valid. When the covariates are discrete, the possible values of (θ_1, …, θ_k) that are statistically valid share the same p‐dimensional sub‐space as continuous covariates. However, the number of possible values of (θ_1, …, θ_k) is constrained by the number of levels, in contrast to a continuous covariate, which may be thought of as having an infinite number of levels. Thus, for a meta‐regression model that includes only a single dichotomous covariate, each of the θ_i can take only one of two values for them to be statistically valid. For example, if θ_1 = 2.5 and θ_2 = 3.8, then each of the other θ_i will be either 2.5 or 3.8. Although there is strict heterogeneity (not all the θ_i are equal), the data divide into two homogeneous sub‐groups. Thus, similar to meta‐analysis, statistically valid estimates arise when the sub‐groups of studies are homogeneous. In essence, the process of adding covariates to a meta‐regression model in order to ‘explain’ heterogeneity is one of identifying homogeneous sub‐groups, and this makes statistically valid estimates more likely. In the limit, when the covariates are continuous, this is equivalent to there being an infinite number of homogeneous sub‐groups.

Comparison of the Q statistic with Vn

In meta‐analysis, Cochran's Q statistic 17 is classically used to identify heterogeneity 18. Specifically, Q is used to test the null hypothesis of homogeneity, namely, H0: θ_i = θ for all i 18 or, equivalently, H0: τ² = 0, where τ² is the between‐study variance. For a meta‐regression model, the use of the Q statistic may be extended to detecting residual heterogeneity. In this instance, Q is used to test H0: τ² = 0, where τ² is the residual between‐study variance, and this corresponds to identifying homogeneous sub‐groups of studies. Thus, if a covariate has m levels, then for each level the sub‐group of studies corresponding to that level satisfies θ_i = θ_j for all studies i, j in the sub‐group. Whether the Vn statistic is applied to a meta‐analysis or meta‐regression model, it is a test of the null hypothesis H0: θ_(i) = θ_i for all i, which we defined as statistical validity. However, as we have deduced already, this is equivalent to testing H0: τ² = 0 in the case of meta‐analysis and H0: τ² = 0 (residual) in the case of meta‐regression. In essence, Vn and Q test the same hypotheses but from different standpoints, one from the direction of statistical validity of predictions, the other from homogeneity. The key to our definition of statistical validity is the comparison of the parameter for the model estimate with that of an independent study. This derives from our goal of determining whether the model produces an estimate which accurately represents that seen in a new clinical population. Specifically, we would like to know how close the meta‐analysis/regression estimate of the effect is to the effect seen in an independent study. By incorporating a leave‐one‐out cross‐validation approach, the Vn statistic directly captures the out‐of‐sample prediction error. In contrast, Cochran's Q statistic measures the deviation of the studies from the overall summary estimate or regression line. As all the studies are used to derive these summary measures, the Q statistic does not directly quantify how well these summary estimates generalise to independent settings.
In the next section, the statistical properties of Vn are studied and compared with Cochran's Q statistic.

A simulation study

In order to study the properties of Vn and compare them with the Q statistic, meta‐analyses and meta‐regression models were simulated for different values of the following parameters: τ² (between‐study variance), σ² (variance of patient‐level observations), n (sample size for each individual primary study), and k (number of primary studies in each meta‐analysis/meta‐regression). The following values were used: τ² [0, 0.05, 0.1, 0.25, and 0.5]; σ² [0.0001, 0.01, 0.1, and 1]; n [50, 100, 250, 500, and 1000]; and k [5, 10, 25, and 50]. The σ² and n were varied between meta‐analyses/regressions but not between the primary studies within the meta‐analysis/regression, to allow us to study the contributions of these separately. In practice, Vn and Q are calculated using the asymptotic estimates σ̂² and τ̂² for the parameters σ² and τ², respectively. To simulate these for each primary study, y_i was calculated by taking the mean of the n simulated observations drawn from N(θ_i, σ²). The σ̂_i² was estimated as the variance of the observations around y_i, and τ̂² was estimated by using the y_i and σ̂_i² to fit the meta‐analysis/regression models via the metafor package 7. The models were fitted using REML. When evaluating the type 1 errors of Vn and Q using a meta‐analysis model, we simulated a single θ and set τ² = 0. The meta‐regression models were simulated using a single continuous covariate for each study. Hence, when investigating the type 1 error of the meta‐regression models, two values of θ_i for i = k − 1, k were simulated, and the remaining k − 2 values of θ_i were deduced as linear combinations of θ_{k−1} and θ_k according to (8). To evaluate the power of Vn and Q, the true effect (parameter) for each primary study, θ_i, was simulated from a normal distribution with between‐study variance τ² for i = 1, .., k. These simulated effects were checked to ensure that they did not satisfy the null hypothesis, and rejected otherwise. For different combinations of (τ², σ², n, k), the type 1 error rate and the power of Vn and Q were determined for a critical value of p = 0.05.
The rate of type I error and the power were estimated based on 40 000 meta‐analysis/meta‐regression replications for each (τ2, , n, k) combination.
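The simulation design above can be illustrated on a much smaller scale. The sketch below runs far fewer replications than the paper's 40 000, evaluates only Cochran's Q (the paper's REML‐based Vn pipeline is not reproduced here), and fixes one cell of the design; all names are our own.

```python
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(1)

def simulate_q_type1(reps=2000, k=10, n=100, sigma2=0.1, theta=0.0):
    """Simulate meta-analyses under homogeneity (tau2 = 0) and
    estimate the type 1 error rate of Cochran's Q at the 5% level."""
    crit = chi2.ppf(0.95, k - 1)
    rejections = 0
    for _ in range(reps):
        # n patient-level observations per study, common true effect
        obs = rng.normal(theta, np.sqrt(sigma2), size=(k, n))
        y = obs.mean(axis=1)                 # study effect estimates y_i
        v = obs.var(axis=1, ddof=1) / n      # estimated var(y_i)
        w = 1.0 / v
        mu_fe = np.sum(w * y) / np.sum(w)    # fixed-effect pooled mean
        Q = np.sum(w * (y - mu_fe) ** 2)     # Cochran's Q
        rejections += Q > crit
    return rejections / reps

rate = simulate_q_type1()
```

With the within‐study variances estimated rather than known, the empirical rate typically lands slightly above the nominal 5%, consistent with Table 1.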

Type 1 error rates for Vn and Q

In Table 1, the rates of type 1 error for Vn and Q when applied to the meta‐analysis model in (1) are given. Here, there is no heterogeneity (τ² = 0), that is, θ_i = θ for all i. In general, the type 1 error rate is dependent on the average sample size, n, in the individual studies and the number of studies in the meta‐analysis, k. This reflects the estimates σ̂² and τ̂² being less prone to sampling error as n and k increase, respectively.
Table 1

Rate of type 1 error for Vn and Q for meta‐analysis.

n       k = 5           k = 10          k = 25          k = 50
        Vn      Q       Vn      Q       Vn      Q       Vn      Q
50      0.083   0.061   0.075   0.065   0.078   0.075   0.088   0.085
100     0.080   0.058   0.066   0.056   0.066   0.060   0.065   0.064
250     0.073   0.052   0.062   0.052   0.056   0.054   0.057   0.057
500     0.076   0.052   0.060   0.052   0.052   0.051   0.055   0.054
1000    0.072   0.051   0.059   0.049   0.052   0.050   0.050   0.051

Probabilities are derived from simulations based on 40 000 meta‐analysis replications with σ² fixed at 0.1.

n = individual study sample size; k = number of studies.

When there are fewer studies (k < 50) and smaller sample sizes n, the type 1 error rate of Q is closer to the threshold of 5% than that of Vn. The null distribution of Q applied to meta‐analysis is a chi‐squared distribution with k − 1 degrees of freedom when the within‐study variances are known exactly 19, and for n = 1000, where the sampling error of σ̂² is minimised, the type 1 error rate for Q is consistently around the 5% mark across different numbers of studies. Table 2 provides the type 1 error rates for Vn and Q when applied to the meta‐regression model in (2), where one continuous covariate was included. As in the meta‐analysis model, Q has type 1 error rates closer to the 5% threshold than Vn. When n = 50 and k = 5, Vn has a type 1 error rate as high as 9.3% compared with 5.3% for Q. In general, for both statistics, when the sample size of the individual studies is small (n = 50), the type 1 error rate increases with k.
Table 2

Rate of type 1 error for Vn and Q for meta‐regression (with one covariate).

n       k = 5           k = 10          k = 25          k = 50
        Vn      Q       Vn      Q       Vn      Q       Vn      Q
50      0.093   0.055   0.084   0.064   0.079   0.071   0.085   0.085
100     0.090   0.053   0.073   0.055   0.065   0.061   0.066   0.065
250     0.085   0.054   0.067   0.051   0.057   0.053   0.057   0.055
500     0.087   0.051   0.069   0.051   0.058   0.053   0.053   0.051
1000    0.087   0.050   0.069   0.052   0.056   0.051   0.051   0.052

Probabilities are derived from simulations based on 40 000 meta‐regression replications with σ² fixed at 0.1.

n = individual study sample size; k = number of studies.


Power of Vn and Q

In Figure 1, the power of Vn is plotted against the ratio τ/se (where se = σ/√n is the within‐study standard error) for models (1) and (2). In both the meta‐analysis and meta‐regression models, the power of Vn increases with the number of studies for a given τ/se. Thus, heterogeneity and large meta‐analyses increase the power of Vn.
Figure 1

Power of Vn. Meta‐analysis in left panel and meta‐regression in right. In both panels, we have τ varying, σ = 1, and n = 100.

Comparing the left and right panels, the power curves for Vn are shifted slightly to the right in the meta‐regression model when there are 10 studies or fewer. In short, rejecting statistical validity when it is known to be false, for a given τ/se, is more likely when Vn is applied to a meta‐analysis model than to a meta‐regression model with one covariate. In Figure 2, the power of Vn and Q are compared for each of the models when there are 5 and 25 studies, respectively. When there are 5 studies, Vn has greater power than Q for a given τ/se. This difference is more pronounced in the meta‐regression model. When there are 25 studies in the analyses, the power curves for Vn and Q are closer, but Vn maintains greater power than Q over a wide range of τ/se. In short, Vn is more useful if we may assume heterogeneity, so that estimates are more likely to be invalid.
Figure 2

Comparison of power of Vn and Q for meta‐analysis and meta‐regression (with 1 covariate).

Also of note is how the difference in type 1 error rates between the two statistics compares with the difference in power. When a meta‐analysis has 25 studies, the type 1 error rate of Vn is 0.001–0.006 higher than that of Q (this difference is 0.004–0.008 for meta‐regression). Although power varies with τ/se, for a similar‐sized meta‐analysis and τ/se < 1.5, the power of Vn is 0.002–0.014 higher than that of Q (the difference for meta‐regression is 0.003–0.016). Essentially, the higher type 1 error rate of Vn compared with Q is compensated by a corresponding increase in power.

Interpretation

Central to proposing Vn is its use in testing the null hypothesis of statistical validity, but its interpretation has to involve weighing up the results in the context of other evidence. Before we interpret Vn, an important consideration is to decide whether the null hypothesis (statistical validity) is likely to be true. We know from section 3.6 that this is equivalent to deciding whether the studies are homogeneous, and the cumulative evidence provided by such measures as the Q statistic 18, I² 18, and the width of prediction intervals 19 contributes to this decision. However, we should also be aware that none of these is without shortcomings when making such decisions 20, 21. Depending on whether statistical validity seems likely, we then interpret the results of Vn in terms of the type 1 error rate or power (specifically, the type 2 error rate). Here, the simulation study which evaluated these for both Vn and Q may be used to inform decision‐making when interpreting results. If Vn is significant (p < 0.05), then either the meta‐analysis/regression estimates are invalid or there is a type 1 error. For τ/se > 3 and a 5% level of significance, the power of Vn is above 85% when there are 5 studies and above 99% when there are 10 studies. Thus, if model estimates are statistically invalid and τ/se is large enough, Vn will nearly always be significant. A type 1 error arises when the estimates are valid but Vn is significant. When there are few studies (k = 5) and the sample size is small (n = 50), Vn can have a type 1 error rate as high as 9% at a level of significance of 5%. Because validity is dependent on homogeneity (as we showed in section 3.5), we may use the Q statistic in such instances, as it maintains a type 1 error rate of around 5% even when k and n are small. If Vn is not significant but statistical invalidity seems likely, then we should consider the type 2 error rate (1 − power). The power of Vn is dependent on k and τ/se, and these are required for interpretation.
We may estimate τ̂ directly from the meta‐analysis model and estimate the average standard error, se̅, of studies in the meta‐analysis based on that proposed by Higgins and Thompson 18, namely se̅² = (k − 1)Σw_i / [(Σw_i)² − Σw_i²], where w_i = 1/s_i². Thus, any non‐significant results should be interpreted by weighing up whether they are consistent with the null hypothesis being true or with there being a type 2 error, given our estimate τ̂/se̅. In order to gain some insight into the values of τ̂/se̅ that may be expected, we screened the Cochrane Database of Systematic Reviews for reviews published in September 2016. Including the two cases that follow, we found 16 reviews which provided summary 2 × 2 table data and had 5 or more primary studies (see Table 3). The median number of primary studies was 7 [lowest = 5, highest = 19] per review, and the median τ̂/se̅ was 1.00 [lowest = 0, highest = 3.44]. Cochrane reviews tend to be of a higher quality than other reviews, so they may not be representative; this may also account for the median primary study count being only 7.
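The average standard error can be computed directly from the within‐study variances. A minimal sketch, assuming the Higgins–Thompson 'typical' within‐study variance se̅² = (k − 1)Σw_i / [(Σw_i)² − Σw_i²] with w_i = 1/s_i² (our reading of the formula referred to above; function names are our own):

```python
import numpy as np

def typical_se(v):
    """Average ('typical') within-study standard error of Higgins &
    Thompson, computed from the within-study variances v_i = s_i^2."""
    w = 1.0 / np.asarray(v, float)     # inverse-variance weights
    k = len(w)
    s2 = (k - 1) * np.sum(w) / (np.sum(w) ** 2 - np.sum(w ** 2))
    return np.sqrt(s2)

# when every study has the same variance, the typical se reduces
# to that common standard error (0.25 -> se of 0.5)
se = typical_se([0.25, 0.25, 0.25, 0.25])
```

This reduction to the common standard error in the equal‐variance case is a convenient sanity check on the formula.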
Table 3

Values for τ̂ and se̅ from a sample of meta‐analyses.

Study        k    Outcome      τ̂       se̅      τ̂/se̅
Fraquelli    5    Log RR       0.000   0.440   0.000
Martineau    7    Log OR       0.000   0.637   0.000
Clarke       9    Log OR       0.000   1.394   0.000
Wong         10   Log OR       0.000   1.087   0.000
Sheppard     7    Log RR       0.204   0.329   0.620
Greenough    7    Log RR       0.263   0.410   0.642
Prabhakar    6    Log RR       0.304   0.425   0.716
Sng          7    Log RR       0.475   0.484   0.980
Chin         9    Log RR       0.470   0.462   1.018
van Driel    6    Log OR       0.328   0.315   1.040
Kakkos       11   Log OR       0.906   0.782   1.158
Leeflang     7    Logit PPV    0.494   0.409   1.209
Wilkinson    6    Log RR       0.296   0.180   1.649
Bighelli     6    Log RR       0.557   0.292   1.911
Theron       19   Logit sens   1.069   0.471   2.271
Berkey       13   Log RR       0.560   0.163   3.444

RR = relative risk; OR = odds ratio; PPV = positive predictive value; sens = sensitivity.

For four studies, τ̂ = 0 and hence τ̂/se̅ = 0, where both Vn and Q have low power. But in such instances, the power becomes less relevant because τ̂ = 0 suggests homogeneity. This reinforces the need to decide whether statistical validity is likely to be true before interpreting the results. When τ̂/se̅ = 1, the power of Vn is 37.5% for k = 5 and 52% for k = 10. This rises to 62% and 83%, respectively, when τ̂/se̅ = 1.5. There were four reviews in which τ̂/se̅ > 1.5, and two of these had 6 studies. The lists of reviews are available in an online appendix. We will now apply these principles to the following case examples.

Case examples

We now use two data sets to illustrate the use of Vn in testing the statistical validity of summary meta‐analysis results. In each case, we first fit the meta‐analysis model then the tailored meta‐regression model. Note all parameters were estimated from fitting the models using REML.

Berkey

The first dataset is from Berkey et al. 22, who reviewed the primary studies evaluating the efficacy of the Bacillus Calmette–Guérin (BCG) vaccination in preventing tuberculosis (TB) 22. For both the meta‐analysis model and the tailored meta‐regression model, Vn is calculated as Vn = Σᵢ [log(RRᵢ) − log(R̂R₍ᵢ₎)]² / [var(log(RRᵢ)) + τ̂²₍ᵢ₎ + var(log(R̂R₍ᵢ₎))], where log(R̂R₍ᵢ₎) is the summary estimate from the meta‐analysis/regression model fitted with the ith study omitted and log(RRᵢ) is the individual study estimate for study i. The individual study variance is var(log(RRᵢ)) = 1/tp + 1/cp − 1/(tp + tn) − 1/(cp + cn), where tp and tn are the numbers of TB positive and negative patients among those who were vaccinated, and cp and cn are the numbers of TB positive and negative patients among those who were not vaccinated. The τ̂²₍ᵢ₎ is the model's between‐study variance estimated with the ith study omitted. In Table 4 the individual study estimates for log(RR) are given alongside the corresponding leave‐one‐out meta‐analysis and tailored meta‐regression estimates. This shows directly how well the meta‐analysis and tailored meta‐regression estimates predict the effect observed in the excluded study.
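The leave-one-out calculation can be sketched in code. This is an illustrative sketch rather than the authors' implementation: it uses the DerSimonian–Laird estimator for τ² (the paper uses REML), and it assumes Vn standardises each squared leave-one-out prediction error by the within-study variance plus the omitted-model's τ̂² and summary-estimate variance.

```python
import math

def log_rr_and_var(tp, tn, cp, cn):
    """log relative risk and its variance from 2x2 counts
    (tp/tn: TB positive/negative among vaccinated; cp/cn: among controls)."""
    log_rr = math.log((tp / (tp + tn)) / (cp / (cp + cn)))
    var = 1/tp + 1/cp - 1/(tp + tn) - 1/(cp + cn)
    return log_rr, var

def dl_fit(y, v):
    """DerSimonian-Laird random-effects fit: returns (pooled estimate,
    its variance, tau^2). REML, as used in the paper, differs slightly."""
    w = [1/vi for vi in v]
    y_fixed = sum(wi*yi for wi, yi in zip(w, y)) / sum(w)
    q = sum(wi*(yi - y_fixed)**2 for wi, yi in zip(w, y))
    c = sum(w) - sum(wi**2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(y) - 1)) / c)
    ws = [1/(vi + tau2) for vi in v]
    est = sum(wi*yi for wi, yi in zip(ws, y)) / sum(ws)
    return est, 1/sum(ws), tau2

def v_n(y, v):
    """Leave-one-out validation statistic: each study's squared deviation
    from the model refitted without it, standardised by the assumed
    denominator (within-study + between-study + summary-estimate variance)."""
    total = 0.0
    for i in range(len(y)):
        y_loo = [yj for j, yj in enumerate(y) if j != i]
        v_loo = [vj for j, vj in enumerate(v) if j != i]
        est, var_est, tau2 = dl_fit(y_loo, v_loo)
        total += (y[i] - est)**2 / (v[i] + tau2 + var_est)
    return total
```

With perfectly homogeneous study estimates the statistic is zero; heterogeneous estimates inflate it, mirroring the link between statistical validity and homogeneity.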
Table 4

Meta‐analysis and tailored meta‐regression estimates with study estimates using data from Berkey 22.

No.  Study                  Year  Lat  Study estimate        MA estimate           TMR estimate
1    Vandiviere et al.      1973  19   −1.62 (−2.55, −0.70)  −0.66 (−1.01, −0.30)  −0.22 (−0.44, 0.00)
2    Ferguson & Simes       1949  55   −1.59 (−2.45, −0.72)  −0.65 (−1.01, −0.30)  −1.31 (−1.76, −0.85)
3    Hart & Sutherland      1977  52   −1.44 (−1.72, −1.16)  −0.63 (−0.97, −0.28)  −1.17 (−1.64, −0.70)
4    Rosenthal et al.       1961  42   −1.37 (−1.90, −0.84)  −0.65 (−1.01, −0.29)  −0.92 (−1.18, −0.65)
5    Rosenthal et al.       1960  42   −1.35 (−2.61, −0.08)  −0.69 (−1.05, −0.32)  −0.96 (−1.23, −0.69)
6    Aronson                1948  44   −0.89 (−2.01, 0.23)   −0.71 (−1.08, −0.33)  −1.04 (−1.33, −0.74)
7    Stein & Aronson        1953  44   −0.79 (−0.95, −0.62)  −0.71 (−1.10, −0.32)  −1.10 (−1.42, −0.79)
8    Coetzee & Berjak       1968  27   −0.47 (−0.94, 0.00)   −0.74 (−1.13, −0.36)  −0.55 (−0.81, −0.29)
9    Comstock et al.        1974  18   −0.34 (−0.56, −0.12)  −0.76 (−1.14, −0.37)  −0.27 (−0.64, 0.10)
10   Frimodt‐Miller et al.  1973  13   −0.22 (−0.66, 0.23)   −0.76 (−1.14, −0.39)  −0.11 (−0.54, 0.31)
11   Comstock et al.        1976  33   −0.02 (−0.54, 0.51)   −0.78 (−1.14, −0.41)  −0.75 (−0.93, −0.57)
12   TPT Madras             1980  13   0.01 (−0.11, 0.14)    −0.79 (−1.15, −0.44)  −0.22 (−0.68, 0.25)
13   Comstock & Webster     1969  33   0.45 (−0.98, 1.88)    −0.76 (−1.12, −0.40)  −0.73 (−0.94, −0.52)

The study estimate is the log(relative risk) for the individual study. The meta‐analysis (MA) estimate for a study is that derived from aggregating the remaining studies. The tailored meta‐regression (TMR) estimate for a study is derived from regressing the remaining studies with the covariate Lat but inserting the Lat value for the excluded study. For example, the MA estimate for study 1 is derived from aggregating studies 2–13. The TMR estimate for study 1 is derived from regressing studies 2–13 but inserting Lat = 19.

All estimates are for the log(RR) with 95% confidence intervals in brackets. Lat = latitude.

From Table 5, Vn (59.96; p < 0.0001) is significant and, as τ̂/s̄e = 3.44, the power of Vn is likely to be close to 100% at a 5% level of significance. This strongly suggests the meta‐analysis estimate is unlikely to be valid in a new setting. The other statistics in Table 5 also suggest that there is heterogeneity, which would be consistent with the estimate being invalid.
Table 5

Comparison of Vn with Q and I² when applied to two case examples.

Cases   Outcome      k    95% PI           Vn      p‐value   Q        p‐value   I²
1—MA    Log(RR)      13   (−1.87, +0.44)   59.96   <0.0001   152.23   <0.0001   92.2%
1—MR*   Log(RR)      13   (−1.67, −0.45)   25.77   0.0037    30.73    0.0012    68.4%
2—MA    Logit(PPV)   7    (−1.84, +0.33)   16.76   0.0083    15.39    0.0175    59.75%
2—MR#   Logit(PPV)   7    (−1.19, −0.58)   6.04    0.484     4.86     0.433     0%

Case 1 (Berkey et al. [22]) and Case 2 (Leeflang et al. [23]). The results are given for the meta‐analysis (MA) and meta‐regression (MR) (with 1 covariate). k = the number of studies.

* Includes 1 covariate (the latitude); prediction interval estimated for a latitude of 45°.

# Includes 1 covariate (the logit(prevalence)); prediction interval estimated for a prevalence of 10%.

PPV = positive predictive value; RR = relative risk; 95% PI = 95% prediction interval.

When undertaking a meta‐regression, the reported efficacy in terms of the relative risk was associated with the latitude of the setting in which the study was conducted 22. Each fitted tailored meta‐regression equation with the ith study omitted took the form log(R̂R₍ᵢ₎) = α̂₍ᵢ₎ + β̂₍ᵢ₎ × Lat, where α̂₍ᵢ₎ and β̂₍ᵢ₎ are estimated from fitting the model. It is of interest to know whether the summary estimates from this tailored meta‐regression are valid in particular countries. Therefore, the cross‐validation approach is useful, with Vn used to test the calibration between the meta‐regression predicted effects and the actual study effects. Although including the latitude as a covariate in the model has helped explain some of the heterogeneity (both Q and I² have decreased), Vn (25.77; p = 0.0037) remains significant. As noted above, τ̂/s̄e = 3.44, and for a tailored meta‐regression model the power of Vn is still around 100%; this points strongly to the tailored meta‐regression estimate being invalid. For the estimate to be statistically valid, we would need to interpret the significant result for Vn in the context of the probability of a type 1 error. Notwithstanding that Table 2 shows that, for a level of significance of 0.05, the true type 1 error rate of Vn can be higher than this, a p‐value of 0.0037 suggests a type 1 error is still very unlikely. This supports our judgment that the tailored meta‐regression estimates are likely to be statistically invalid.
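To make the tailored prediction step concrete, here is a minimal sketch (not the authors' code) of a leave-one-out meta-regression on a single covariate such as latitude. It uses weighted least squares with a fixed τ² passed in for simplicity, whereas the paper estimates τ² by REML.

```python
def wls_meta_regression(y, v, x, tau2=0.0):
    """Weighted least-squares fit of y = a + b*x with weights 1/(v_i + tau2).
    tau2 is held fixed here for simplicity (the paper uses REML)."""
    w = [1/(vi + tau2) for vi in v]
    sw = sum(w)
    xbar = sum(wi*xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi*yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi*(xi - xbar)**2 for wi, xi in zip(w, x))
    sxy = sum(wi*(xi - xbar)*(yi - ybar) for wi, xi, yi in zip(w, x, y))
    b = sxy / sxx
    a = ybar - b * xbar
    return a, b

def loo_prediction(y, v, x, i, tau2=0.0):
    """Tailored prediction for study i: regress the remaining studies on
    the covariate, then insert study i's covariate value."""
    idx = [j for j in range(len(y)) if j != i]
    a, b = wls_meta_regression([y[j] for j in idx], [v[j] for j in idx],
                               [x[j] for j in idx], tau2)
    return a + b * x[i]
```

Comparing each `loo_prediction` with the corresponding observed study estimate is exactly the comparison tabulated in the TMR column of Table 4.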

Leeflang

The second dataset is derived from a Cochrane systematic review which appraised studies that had evaluated the galactomannan assay for diagnosing invasive aspergillosis in immunocompromised patients 23. Here, we use the dataset with the threshold for a positive test result set at an optical density index (ODI) of 0.5. As, in general, it is the probability of disease/non‐disease given the test result which is most useful to clinicians, the outcome of interest chosen in this example was the positive predictive value (PPV). Thus, the Vn statistic for both the meta‐analysis and meta‐regression model is calculated as Vn = Σᵢ [logit(PPVᵢ) − logit(P̂PV₍ᵢ₎)]² / [var(logit(PPVᵢ)) + τ̂²₍ᵢ₎ + var(logit(P̂PV₍ᵢ₎))]. The individual study variance is var[logit(PPVᵢ)] = 1/[(tp + fp)PPV(1 − PPV)], where tp and fp are the numbers of true and false positive patients. The individual study estimates for logit(PPV), with the corresponding meta‐analysis and tailored meta‐regression estimates, are given in Table 6.
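For example, a small helper (hypothetical, using only the counts from a study's 2 × 2 table) that returns the logit(PPV) and its variance:

```python
import math

def logit_ppv_and_var(tp, fp):
    """logit(PPV) and its variance from true-positive and false-positive
    counts: var[logit(PPV)] = 1 / ((tp + fp) * PPV * (1 - PPV)).
    A continuity correction of 0.5 (as applied to one study in Table 6)
    would be added to the counts when tp or fp is zero."""
    ppv = tp / (tp + fp)
    return math.log(ppv / (1 - ppv)), 1.0 / ((tp + fp) * ppv * (1 - ppv))

# Hypothetical study: 30 true positives, 20 false positives -> PPV = 0.6
logit_ppv, var_logit = logit_ppv_and_var(30, 20)
```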
Table 6

Meta‐analysis and tailored meta‐regression estimates with study estimates using data from Leeflang 23.

Author      Year  lgtprev  Study estimate         MA estimate           TMR estimate
Allan       2005  −4.82    −3.09 (−5.93, −0.26)#  −0.69 (−1.18, −0.20)  −2.65 (−4.06, −1.24)
Florent     2006  −2.56    −1.58 (−2.34, −0.82)   −0.59 (−1.02, −0.15)  −0.99 (−1.42, −0.56)
Kawazu      2004  −2.53    −0.74 (−1.46, −0.02)   −0.77 (−1.38, −0.15)  −1.25 (−1.68, −0.82)
Foy         2007  −2.21    −0.15 (−1.24, +0.94)   −0.84 (−1.38, −0.29)  −0.95 (−1.27, −0.63)
Yoo         2005  −2.10    −0.73 (−1.42, −0.05)   −0.77 (−1.39, −0.15)  −0.81 (−1.19, −0.43)
Weisser     2005  −1.95    −0.94 (−1.52, −0.36)   −0.72 (−1.34, −0.10)  −0.62 (−0.98, −0.26)
Suankratay  2006  −0.66    +0.21 (−0.52, +0.94)   −0.93 (−1.27, −0.59)  +0.26 (−1.20, +1.73)

The study estimate is the logit(PPV) for the individual study. The meta‐analysis (MA) estimate for a study is that derived from aggregating the remaining studies. The tailored meta‐regression (TMR) estimate for a study is derived from regressing the remaining studies with the covariate lgtprev but inserting the lgtprev value for the excluded study. For example, the MA estimate for study 1 is derived from aggregating studies 2–7. The TMR estimate for study 1 is derived from regressing studies 2–7 but inserting lgtprev = −4.82.

All estimates are for the logit(PPV) with 95% confidence intervals in brackets.

PPV = positive predictive value; lgtprev = logit(prevalence).

# Estimate includes a continuity correction of 0.5.

From Table 5, for the meta‐analysis model, Q (15.39; df = 6; p = 0.0175), I² = 59.75%, and the 95% prediction interval for the summary logit(PPV) is (−1.84, +0.33) (equivalent to a PPV of 14–58%); these all suggest the studies are heterogeneous. As might be expected, Vn (16.76; p = 0.0083) is also significant, which leads to the conclusion that any summary estimates are unlikely to be valid. From Bayes' theorem, the PPV is known to depend on the disease prevalence. We implemented a tailored meta‐regression approach in which the prevalence of disease was assumed to be known for each primary study 5, 6, to study the effects of such information on the potential validity of any estimates. Thus, for the Vn statistic, each fitted tailored meta‐regression model took the form logit(P̂PV₍ᵢ₎) = α̂₍ᵢ₎ + β̂₍ᵢ₎ × lgtprev, where α̂₍ᵢ₎ and β̂₍ᵢ₎ are estimated from fitting the model with the ith study omitted. In contrast to the first example, Vn (6.04; p = 0.484) is non‐significant. Should this be considered consistent with the null hypothesis of statistical validity, or is it a type 2 error? When there are few studies (k = 7), Vn has greater power and therefore a lower type 2 error rate than Q.
From Table 5, τ̂/s̄e = 1.21, and inspecting the power curves in Figure 2 we see the power is around 58% (a type 2 error rate of 42%) for a level of significance of 0.05. However, in this instance the power for Vn will be much higher, as power also depends on the study sample sizes: a simulation study based on τ̂/s̄e = 1.21 and an average sample size per study of 128 for the 7 studies demonstrates the power to be 90.0%, giving a type 2 error rate of 10%. This suggests that the estimates are likely to be valid. This is also supported by the other measures, as I² = 0% and the prediction interval has narrowed to (−1.19, −0.58), equivalent to a PPV of 23–36%, suggesting that the heterogeneity has been explained by the addition of the covariate to the model. Based on this evidence, it would be reasonable to implement a PPV estimate from this model in an independent setting and hence in practice. Both of these examples demonstrate why clinicians should not automatically assume that summary meta‐analysis results are applicable to their population, and that even when a summary estimate appears valid, this should be judged in the context of the properties of the statistic and the other evidence used to make that decision.
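The conversion from the logit scale back to the PPV percentages quoted above is just the inverse-logit transform; a quick sketch:

```python
import math

def expit(x):
    """Inverse logit: maps logit(PPV) back to the PPV scale."""
    return 1.0 / (1.0 + math.exp(-x))

# 95% prediction intervals for logit(PPV) from Table 5
ma_interval = (expit(-1.84), expit(0.33))    # meta-analysis: ~14% to ~58%
tmr_interval = (expit(-1.19), expit(-0.58))  # tailored model: ~23% to ~36%
```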

Discussion

One of the issues facing medical research in general is determining how well research results translate into practice. To truly address this, we need a separate evaluation of the research in practice settings which are independent from the research settings. The terms validity or external validation are often reserved for when the research findings may be generalised or translated into clinical practice. With regard to meta‐analysis results, however, it is often impractical to conduct further independent studies to assess whether the aggregate estimates are valid. Judgments on validity are therefore generally based on an assessment of quality, in which a combination of qualitative and quantitative characteristics of the constituent studies is weighed up. Yet the need for further studies is partly circumvented in meta‐analysis by making use of the existing studies in a similar way to the jack‐knife method 7. Such 'cross‐validation' is not new and was implemented by Lachenbruch 24 and Stone 25, but it has only recently gained some traction in meta‐analysis. However, to date, its main use in meta‐analysis has been in prediction modeling to evaluate predictive performance and for model recalibration 2, 3, 4.

To address this, we considered the concept of statistical validity in relation to estimates generated by meta‐analysis and meta‐regression models. In particular, we defined it as when the model parameter for the effect measure of interest equates to that in an independent setting. From this, and using the previously described cross‐validation procedure, we derived a statistic, Vn, and its asymptotic distribution to test the viability of statistical validity. Homogeneity plays a central role and is integral to statistical validity when evaluating meta‐analysis or meta‐regression models with discrete categorical covariates. As part of the derivation of Vn, it was demonstrated that, in meta‐analysis, statistical validity follows only when the studies are homogeneous.
Furthermore, when meta‐analysis is extended to include discrete covariates in a tailored meta‐regression model, statistically valid estimates arise when the individual sub‐groups of studies are homogeneous, equivalent to the covariates being used to 'explain' the heterogeneity. However, the more general case is when the p covariates are continuous, in which case the set of statistically valid estimates spans a (p + 1)‐dimensional sub‐space of the k‐space of parameters (θ₁, …, θₖ). Here the individual parameters, θᵢ, may all have different values (by definition, heterogeneity), but the model still provides statistically valid estimates. The link between homogeneity and statistical validity reinforces the need to explore meta‐analyses/regressions for heterogeneity using standard methods 17, 18. As such, Cochran's Q statistic, a measure often used to identify heterogeneity, was compared with Vn. There are clear similarities in that both are 'weighted sum of squares' statistics. However, because Vn is estimated using cross‐validation, it directly measures the out‐of‐sample prediction error, which is important to statistical validity; Q does not. In terms of the type 1 error rate and power, they are similar when there are 50 or more studies. However, when there are fewer studies, Vn has greater power and Q has a lower type 1 error rate. Clearly defining the power and type 1 error rates is important, as both these statistics are being used to test hypotheses. Furthermore, they provide a basis for recommendations on the use of Vn (and Q) when making decisions on the validity of meta‐analysis/regression estimates. The power of Vn depends not only on the number of studies but also on τ/s̄e, so there is a trade‐off between the level of heterogeneity and the average precision of the studies.
Thus, when the meta‐analysis consists predominantly of high‐precision studies, Vn may detect differences between the model estimate and those observed which are too small to be clinically relevant but are, nonetheless, statistically significant. However, our overview of Cochrane reviews showed this not to be a large problem. As in previous studies 20, 26, similar shortcomings were demonstrated here with the Q statistic. This motivated the proposal of statistics, such as the I² statistic, that are aimed at measuring the extent of heterogeneity rather than its presence 27. However, this too is similarly affected by study precision and the number of studies in the meta‐analysis 20, 27, 28. Such drawbacks should inform the interpretation of these statistics and also suggest that they should not be used in isolation when evaluating heterogeneity or statistical validity. Like many statistics, Vn provides little information when used in isolation; its usefulness depends upon the context in which it is applied. As statistical validity is intrinsically linked to homogeneity, if homogeneity seems likely then a significant Vn result should be interpreted in terms of its potential type 1 error rate. In such an instance, the Q statistic, with its lower type 1 error rate, is likely to be the more informative of the two. However, if heterogeneity is considered likely at the outset (as with many meta‐analyses), then Vn's greater power means that it could be more useful than the Q statistic, particularly for τ̂/s̄e < 1.5. In this context, reviewers may be able to use Vn to support a recommendation which discourages the wider application of the meta‐analysis results. But how should Vn be interpreted when the inclusion of covariates in meta‐regression analyses seems to 'explain' the between‐study heterogeneity? One of the risks in this instance is that legitimate exploratory analyses could lead to recommendations on validity.
When the objective is to determine potential sources of variation, and not to make assertions on the 'applicability' of results, then the Q statistic and I² can be used to inform such analyses. In any event, although a non‐significant Vn should be interpreted against the likelihood of a type 2 error, it should not be used to assert statistical validity in meta‐regression analyses when the covariates were identified as part of an exploratory phase. The risks of such data fishing or dredging are well documented 29 but are particularly apposite here, where the results could be applied in clinical practice. The best way to mitigate this risk is to pre‐specify the covariates that are likely to be causal before embarking upon any meta‐regression analyses. In this context, the assertion of statistical validity is more likely to be justified. As an alternative to the hypothesis‐testing approach of the Vn statistic, Riley et al. propose using 95% prediction intervals to quantify the potential error in the predictions of true study values when applying meta‐analysis/regression models to new settings 1. This allows the magnitude of error to be examined, which may inform clinical relevance. Indeed, the issue of clinical relevance is pertinent to any method used to assess validity. Altman and Royston allude to this when they stress the importance of differentiating between statistical and clinical validity 30. In the latter case, biased estimates (which are statistically invalid) may be acceptable to clinicians under certain clinical conditions. However, there are potential limitations to the Riley et al. 1 approach, too. Prediction intervals are not universally accepted in the meta‐analysis field, and recent work suggests frequentist equations for the prediction interval have poor coverage 31.
Furthermore, unlike the Vn statistic, the 'error' in the proposed prediction interval, as estimated using Riley's method 1, does not account for the variance of the meta‐analysis model's predicted summary estimate. In this study, we confined our investigations to using study‐level data, either as part of meta‐analyses or as covariates included in meta‐regression analyses. As IPD (individual patient data) become increasingly available, there is the potential to use IPD meta‐analyses to improve the summary estimates translated into practice. An important part of this process would be to evaluate their statistical validity, and this is worthy of future research.

In conclusion, for summary estimates from meta‐analysis to be useful in practice, they need to be statistically valid. As a direct measure of statistical validity, we have proposed the Vn statistic, which is applicable whenever the meta‐analysis model (1) or the tailored meta‐regression model (2) is applied to combine effect estimates from multiple studies. It does have limitations as a hypothesis test, and these should be noted. However, as statistical validity relates to identifying homogeneous sub‐groups of studies, these limitations may be partly circumvented by using it alongside other statistics such as the Q statistic. As such, Vn provides a useful summary of the likely statistical validity of results from meta‐analysis/regression models when applied to clinical practice.