Thomas Pfeiffer, Lars Bertram, John P A Ioannidis.
Abstract
Meta-analyses play an important role in synthesizing evidence from diverse studies and datasets that address similar questions. A major obstacle for meta-analyses arises from biases in reporting. In particular, it is speculated that findings which do not achieve formal statistical significance are less likely to be reported than statistically significant findings. Moreover, the patterns of bias can be complex and may also depend on the timing of the research results and their relationship with previously published work. In this paper, we present an approach that is specifically designed to analyze large-scale datasets on published results. Such datasets are currently emerging in diverse research fields, particularly in molecular medicine. We use our approach to investigate a dataset on Alzheimer's disease (AD) that covers 1167 results from case-control studies on 102 genetic markers. We observe that initial studies on a genetic marker tend to be substantially more biased than subsequent replications. The chances for initial, statistically non-significant results to be published are estimated to be about 44% (95% CI, 32% to 63%) relative to statistically significant results, while statistically non-significant replications have almost the same chance to be published as statistically significant replications (84%; 95% CI, 66% to 107%). Early replications tend to be biased against initial findings, an observation previously termed the Proteus phenomenon: the chances for non-significant studies going in the same direction as the initial result are estimated to be lower than the chances for non-significant studies opposing the initial result (73%; 95% CI, 55% to 96%). Such dynamic patterns in bias are difficult to capture by conventional methods, which typically assume that simple publication bias operates.
Our approach captures and corrects for complex dynamic patterns of bias, and thereby helps to generate conclusions from published results that are more robust against the presence of different coexisting types of selective reporting.
Year: 2011 PMID: 21479240 PMCID: PMC3066227 DOI: 10.1371/journal.pone.0018362
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Figure 1. Weight functions for the publication bias models.
The models specify the probability that a result is published as a function of the Z-value associated with the outcome. Weights are assumed to be stepwise constant and are expressed relative to the outer intervals, where the weight function is set to one. (A) High-resolution model. Using a stepwise constant weight function with many intervals allows studying the shape of publication bias. In our model we use 16 intervals. Model 1 is a simplified version with only three intervals and a single free weight function parameter. (B) Model 2 (two categories). The weight function is assumed to differ for initial and subsequent studies. Initial studies with Z-values that fall into the mid interval are subject to weight $w_I$; subsequent ones are subject to weight $w_S$. Model 1 (see Methods), in contrast, uses a single one-parameter weight function that applies to all studies. (C) Proteus model. The weight functions are assumed to differ for initial studies, early replications, and subsequent publications. The shape for early replications depends on the result of the initial study (solid black line: initial result with z≤0, dashed grey line: initial result with z>0). This allows investigating whether early replication studies are more likely to be published if they oppose initial results. Model 3, described in detail in the Methods section, matches the Proteus model in complexity, but does not take into account that the sign of the initial study might have an impact on the weight function for early replications.
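The stepwise constant weight functions of panels (A) and (B) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the cut-off of 1.96 is an assumed two-sided 5% threshold (the caption does not state the boundary used), and the treatment of values lying exactly on a boundary is arbitrary.

```python
from bisect import bisect_right

Z_CRIT = 1.96  # ASSUMPTION: two-sided 5% threshold; not stated in the caption


def stepwise_weight(z, cutoffs, weights):
    """Return the weight of the interval that contains z.

    cutoffs: sorted interior boundaries of the intervals.
    weights: one weight per interval, so len(weights) == len(cutoffs) + 1.
    """
    if len(weights) != len(cutoffs) + 1:
        raise ValueError("need exactly one weight per interval")
    return weights[bisect_right(cutoffs, z)]


def model1_weight(z, w_mid, z_crit=Z_CRIT):
    """Three intervals: outer weights fixed at one, one free mid weight."""
    return stepwise_weight(z, [-z_crit, z_crit], [1.0, w_mid, 1.0])


def model2_weight(z, is_initial, w_initial, w_subsequent, z_crit=Z_CRIT):
    """Two categories: initial and subsequent studies get separate mid weights."""
    w_mid = w_initial if is_initial else w_subsequent
    return model1_weight(z, w_mid, z_crit)
```

With illustrative mid weights of 0.44 for initial and 0.84 for subsequent studies (the relative publication chances quoted in the abstract), `model2_weight(0.3, True, 0.44, 0.84)` returns the initial-study weight and `model2_weight(2.5, True, 0.44, 0.84)` returns one, because the outer intervals serve as the reference. A high-resolution variant only needs a longer `cutoffs` list passed to `stepwise_weight`.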
Figure 2. Results for the high-resolution models and the one category model (Model 1).
Estimates for the high-resolution random-effects model are shown in solid lines, estimates for the fixed-effect model in dashed lines. Error bars show standard errors of the estimate. (A) High-resolution model. For both the fixed-effect and random-effects model, the weights drop rapidly as |Z| decreases. At about |Z| = 1.64, the weights reach a minimum and remain relatively constant in the central intervals. This is what one would expect for publication bias: non-significant results are subject to a similar bias, irrespective of the Z-value. (B) Model 1. In the random-effects model, bias in the mid interval is estimated as w. Under the fixed-effect model, we estimate w. The estimated bias tends to be stronger in the fixed-effect model than in the random-effects model, because in the random-effects model a high between-study variance offers an additional explanation for an excess of formally significant results.
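The interplay between between-study variance and selection can be illustrated numerically. The sketch below is a deliberately simplified illustration and not the likelihood used in the paper: it assumes standardized Z-values with a common standard error, an assumed significance cut-off of 1.96, and hypothetical function and parameter names.

```python
from statistics import NormalDist


def prob_significant(mean_z, between_var_ratio=0.0, z_crit=1.96):
    """P(|Z| > z_crit) when Z ~ Normal(mean_z, 1 + between_var_ratio).

    between_var_ratio is the between-study variance divided by the
    within-study variance (zero corresponds to a fixed-effect model).
    """
    dist = NormalDist(mu=mean_z, sigma=(1.0 + between_var_ratio) ** 0.5)
    return 1.0 - (dist.cdf(z_crit) - dist.cdf(-z_crit))


def published_share_significant(p_sig, w_mid):
    """Share of significant results among published results.

    Non-significant results are published with weight w_mid relative to
    significant ones, as in a three-interval weight function.
    """
    return p_sig / (p_sig + w_mid * (1.0 - p_sig))


# Two routes to an excess of significant results in the published record:
no_effect = prob_significant(0.0)                          # fixed effect, no bias
heterogeneous = prob_significant(0.0, 1.0)                 # extra between-study variance
selected = published_share_significant(no_effect, 0.5)     # selection only
```

Both `heterogeneous` and `selected` exceed `no_effect`, which is why a model that allows for between-study variance needs less selection to explain the same surplus of significant findings.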
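Estimates of the weight function parameters.
| Random-effects model | Unbiased | Model 1 | Model 2 | Model 3 | Proteus |
| log w, initial studies | - | - |  |  |  |
| log w, early replications (first parameter) | - | - | - |  |  |
| log w, early replications (second parameter) | - | - | - |  |  |
| log w, subsequent studies | - |  |  |  |  |
| Log likelihood L | 0 | 4.4 | 10.6 | 11.4 | 13.9 |
| Number of parameters k | 0 | 1 | 2 | 4 | 4 |
| AIC | 0 |  |  |  |  |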
Standard errors of the estimates, calculated from a numerically approximated information matrix, are given in parentheses. Parameter $w_I$ describes bias in initial studies, two further weights describe bias in early replication studies in Model 3 and the Proteus model (Fig. 1 and Methods), and $w_S$ describes bias in subsequent studies. In Model 1, $w_S$ describes bias in all studies; in Model 2, it describes bias in all but initial studies. Further details are given in Fig. 1 and Methods. The AIC is given by $2k - 2L$, where $L$ is the maximized value of the log likelihood function and $k$ is the number of parameters. The values given in the table are differences relative to the values for the unbiased model. Model 1 shows a clear indication of selection bias. Model 2 shows that bias is larger for initial studies on a marker than for subsequent ones. The Proteus model indicates that non-significant studies opposing the initial result tend to be more likely to be published than non-significant studies confirming it. Note that the two early-replication parameters, written here as $\log w_c$ (studies confirming the initial result) and $\log w_o$ (studies opposing it), are both estimated relative to the outer intervals, and therefore the errors in the estimates are correlated. The difference between the two parameters is $\log w_c - \log w_o = -0.32$. The standard error of the difference can be calculated from the variances and the covariance of the two estimates, as determined by the information matrix, and is given by $\sqrt{\mathrm{var}(\log w_c) + \mathrm{var}(\log w_o) - 2\,\mathrm{cov}(\log w_c, \log w_o)} = 0.14$. Thus non-significant studies confirming the initial result are published with a probability of $e^{-0.32} \approx 73\%$ relative to non-significant studies opposing the initial result, with a 95% confidence interval ranging from 55% to 96%. Model 3 shows that the direction of the second study does not matter per se. Selection bias for the early replication studies falls in between bias for initial and for subsequent results.
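The arithmetic in these notes can be retraced with a short Python sketch. It is an illustration under stated assumptions rather than the authors' code: the function names are hypothetical, the interval is assumed to be a normal-approximation 95% interval on the log scale, the log-scale difference is taken as -0.32 (the sign follows from the 73% figure), and the table row reading 0, 4.4, 10.6, 11.4, 13.9 is assumed to hold the log likelihood differences to the unbiased model.

```python
import math


def se_of_difference(var_a, var_b, cov_ab):
    """Standard error of (a - b) for two correlated estimates a and b."""
    return math.sqrt(var_a + var_b - 2.0 * cov_ab)


def relative_probability_ci(log_diff, se, z=1.96):
    """Back-transform a log-scale difference and its normal-approximation CI.

    Returns (estimate, lower, upper) on the ratio scale.
    """
    return (
        math.exp(log_diff),
        math.exp(log_diff - z * se),
        math.exp(log_diff + z * se),
    )


def delta_aic(delta_loglik, k):
    """AIC difference to the unbiased model, using AIC = 2k - 2L."""
    return 2 * k - 2 * delta_loglik


# Difference and standard error as quoted in the notes.
est, lower, upper = relative_probability_ci(-0.32, 0.14)
print(f"{est:.0%} (95% CI, {lower:.0%} to {upper:.0%})")  # 73% (95% CI, 55% to 96%)

# ASSUMPTION: these pairs are (log likelihood difference, k) from the table.
for name, d_loglik, k in [("Model 1", 4.4, 1), ("Model 2", 10.6, 2),
                          ("Model 3", 11.4, 4), ("Proteus", 13.9, 4)]:
    print(name, round(delta_aic(d_loglik, k), 1))
```

Under that reading of the table, a more negative AIC difference indicates a better trade-off between fit and number of parameters. The individual variances and the covariance entering `se_of_difference` are not reported in this record, so the function is shown without concrete inputs.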