Richard Kunert
Abstract
Recently, many psychological effects have been surprisingly difficult to reproduce. This article asks why, and investigates whether conceptually replicating an effect in the original publication is related to the success of independent, direct replications. Two prominent accounts of low reproducibility make different predictions in this respect. One account suggests that psychological phenomena are dependent on unknown contexts that are not reproduced in independent replication attempts. By this account, internal replications indicate that a finding is more robust and, thus, that it is easier to independently replicate it. An alternative account suggests that researchers employ questionable research practices (QRPs), which increase false positive rates. By this account, the success of internal replications may just be the result of QRPs and, thus, internal replications are not predictive of independent replication success. The data of a large reproducibility project support the QRP account: replicating an effect in the original publication is not related to independent replication success. Additional analyses reveal that internally replicated and internally unreplicated effects are not very different in terms of variables associated with replication success. Moreover, social psychological effects in particular appear to lack any benefit from internal replications. Overall, these results indicate that, in this dataset at least, the influence of QRPs is at the heart of failures to replicate psychological findings, especially in social psychology. Variable, unknown contexts appear to play only a relatively minor role. I recommend practical solutions for how QRPs can be avoided.
Keywords: False positives; Publication bias; QRP; Replication; Reproducibility
Year: 2016; PMID: 27068542; PMCID: PMC5050250; DOI: 10.3758/s13423-016-1030-9
Source DB: PubMed; Journal: Psychon Bull Rev; ISSN: 1069-9384
Fig. 1. Comparing the reproducibility of internally replicated and unreplicated effects in an empirical dataset. Left panel: P-values obtained by independent replication teams. The dotted line represents the threshold for considering an effect statistically significant (P = .05). Note that the bottom quartile (25 %) of the right distribution in the left panel, relating to previously internally unreplicated effects, is at P = .0017 and, thus, not visible here. Right panel: Reduction in effect size between original study and independent replication. Violin plots display density, i.e., thicker parts represent more data points.
Comparison of internally replicated and internally unreplicated effects
| | Internal replication present | Internal replication absent | Bayes factor | Posterior median [95 % Credible Interval]ᵃ |
|---|---|---|---|---|
| Reproducibility | | | | |
| Independent replications | 12 out of 42 | 22 out of 54 | BF0+ = 8.72 | −0.52 [−1.39; 0.31] |
| Effect size reduction (simple subtraction)ᵇ | (SD = 0.20) | (SD = 0.22) | BF0+ = 4.15 | −0.00 [−0.09; 0.08] |
| Effect size reduction (Cohen’s | (SD = 0.26) | (SD = 0.27) | BF0+ = 2.43 | 0.00 [−0.10; 0.10] |
| Reproducibility predictors | | | | |
| Field of study | 13 × cognitive; 29 × social | 29 × cognitive; 25 × social | BF+0 = 5.76 | 0.22 [0.03; 0.40] |
| Effect type | 20 × main effect; 16 × interaction | 29 × main effect; 21 × interaction | BF0+ = 3.13 | 0.02 [−0.18; 0.23] |
| Original study | (SD = .016) | (SD = .016) | BF0+ = 2.78 | 0.00 [−0.00; 0.01] |
| Original effect size | (SD = .15) | (SD = .22) | BF+0 = 1.42 | 0.07 [−0.01; 0.14] |
| Independent replication powerᵇ | (SD = .08) | (SD = .09) | BF0+ = 3.64 | 0.01 [−0.02; 0.04] |
| Surprisingness of original effectᶜ | (SD = 0.98) | (SD = 0.83) | BF0+ = 1.36 | 0.21 [−0.17; 0.60] |
| Challenge of conducting replicationᵇ,ᵈ | (SD = 0.79) | (SD = 0.82) | BF0+ = 4.74 | −0.03 [−0.36; 0.31] |
| Formal power analysis in original publication present/absent | 0 × present; 42 × absent | 2 × present; 52 × absent | BF0+ = 22.21 | −0.03 [−0.11; 0.04] |
| Sample size of original studyᵉ | (SD = 55.77) | (SD = 124.12) | BF0+ = 4.41 | −6.25 [−25.94; 14.50] |
ᵃ Positive values represent support for the alternative hypothesis representing the unknown moderator account.
ᵇ Data not normally distributed. No satisfactory data transformation could be found. The reader should therefore focus on parameter estimation, which does not assume normality.
ᶜ Based on the mean of three raters using a Likert scale from 1 (not at all surprising) to 6 (extremely surprising).
ᵈ Based on a combination of three standardized mean ratings as in Open Science Collaboration (2015).
ᵉ Natural logarithm of raw data due to non-normal distribution of raw values. Raw data results in BF0+ = 1.74. Analysis excludes one study with an unusual sample size (N = 230,025).
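The "Independent replications" row can be summarized descriptively with a few lines of Python. This is a minimal sketch and not the Bayesian analysis reported in the table: it only derives the raw replication rates and the sample log odds ratio from the published counts. The function name `log_odds_ratio` is introduced here for illustration and does not come from the source.

```python
import math

def log_odds_ratio(success_a, total_a, success_b, total_b):
    """Sample log odds ratio of group A relative to group B."""
    odds_a = success_a / (total_a - success_a)
    odds_b = success_b / (total_b - success_b)
    return math.log(odds_a / odds_b)

# (successful independent replications, number of effects) from the table.
present = (12, 42)  # internal replication present
absent = (22, 54)   # internal replication absent

rate_present = present[0] / present[1]
rate_absent = absent[0] / absent[1]
lor = log_odds_ratio(*present, *absent)

print(f"internal replication present: {rate_present:.1%}")
print(f"internal replication absent:  {rate_absent:.1%}")
print(f"sample log odds ratio: {lor:.2f}")
```

A negative log odds ratio means that, in this sample, internally replicated effects were independently replicated less often than internally unreplicated ones.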
Comparison of internally replicated and internally unreplicated effects for different fields of study
| | Internal replication present | Internal replication absent | Bayes factor | Posterior median [95 % Credible Interval]ᵃ |
|---|---|---|---|---|
| Social psychology | | | | |
| Independent replications | 5 out of 29 | 8 out of 25 | BF0+ = 7.60 | −0.76 [−2.03; 0.45] |
| Effect size reduction (simple subtraction)ᵇ | | | BF0+ = 7.15 | −0.06 [−0.15; 0.04] |
| Effect size reduction (Cohen’s | | | BF0+ = 6.96 | −0.06 [−0.17; 0.04] |
| Cognitive psychology | | | | |
| Independent replications | 7 out of 13 | 14 out of 29 | BF0+ = 1.92 | 0.21 [−1.05; 1.49] |
| Effect size reduction (simple subtraction)ᵇ | | | BF0+ = 1.26 | 0.08 [−0.10; 0.26] |
| Effect size reduction (Cohen’s | | | BF+0 = 1.38 | 0.13 [−0.11; 0.38] |
ᵃ Positive values represent support for the alternative hypothesis representing the unknown moderator account.
ᵇ Data not normally distributed. No satisfactory data transformation could be found. The reader should therefore focus on parameter estimation, which does not assume normality.
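As a quick plausibility check, the per-field replication counts can be compared against the pooled counts from the overall comparison. The sketch below uses only the published "Independent replications" counts; the names `by_field`, `overall`, and `pooled` are hypothetical helpers introduced for illustration.

```python
# (replicated, total) pairs copied from the "Independent replications" rows.
by_field = {
    "social": {"present": (5, 29), "absent": (8, 25)},
    "cognitive": {"present": (7, 13), "absent": (14, 29)},
}
# Pooled counts from the comparison across both fields.
overall = {"present": (12, 42), "absent": (22, 54)}

def pooled(group):
    """Sum replicated and total counts across fields for one group."""
    replicated = sum(by_field[field][group][0] for field in by_field)
    total = sum(by_field[field][group][1] for field in by_field)
    return replicated, total

for field, groups in by_field.items():
    for group, (replicated, total) in groups.items():
        rate = replicated / total
        print(f"{field:9s} {group:7s} {replicated:2d}/{total:2d} = {rate:.1%}")

for group in overall:
    # The per-field counts should add up to the pooled counts.
    print(group, "matches pooled counts:", pooled(group) == overall[group])
```

The per-field rates make the contrast in the table visible: among social psychological effects the raw replication rate is lower when an internal replication was present, whereas among cognitive effects the two rates are close.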