Abstract
The inferential inadequacies of statistical significance testing are now widely recognized. There is, however, no consensus on how to move research into a 'post p < 0.05' era. We present a potential route forward via the Analysis of Credibility (AnCred), a novel methodology that allows researchers to go beyond the simplistic dichotomy of significance testing and extract more insight from new findings. Using standard summary statistics, AnCred assesses the credibility of significant and non-significant findings on the basis of their evidential weight, and in the context of existing knowledge. The outcome is expressed in quantitative terms of direct relevance to the substantive research question, providing greater protection against misinterpretation. Worked examples are given to illustrate how AnCred extracts additional insight from the outcome of typical research study designs. Its ability to cast light on the use of p-values, the interpretation of non-significant findings and the so-called 'replication crisis' is also discussed.
Keywords: Bayesian methods; credibility; replication crisis; significance testing; statistical inference
Year: 2018 PMID: 29410818 PMCID: PMC5792895 DOI: 10.1098/rsos.171047
Source DB: PubMed Journal: R Soc Open Sci ISSN: 2054-5703 Impact factor: 2.963
Figure 1. The sceptical critical prior interval (CPI) used to challenge claims of statistical significance. Large studies have relatively low scepticism limits (SLs), making sceptical challenge harder to sustain using existing evidence.
Figure 2. The advocacy CPI for assessing claims of statistical non-significance. Non-significant findings from large studies have relatively low advocacy limits (ALs), limiting the range of effect sizes available to advocates to challenge the claim of non-significance.
Figure 3. Small studies (a) have CPIs so broad they may encompass the most probable effect size, M. Their statistical significance will then lack intrinsic credibility. In contrast, large studies (b) have relatively narrow CPIs less likely to encompass M, and thus more likely to make their statistical significance intrinsically credible.
Prototypical examples of AnCred assessments of the credibility of statistically significant and non-significant findings in both the presence and absence of prior evidence. OR, odds ratio.
| trial variant | intervention response rate | central estimate (OR) | p-value | 95% CI lower bound (OR) | 95% CI upper bound (OR) | AnCred: scepticism/advocacy limit (OR) | AnCred: intrinsically credible if no prior evidence? |
|---|---|---|---|---|---|---|---|
| statistically significant results | | | | | | | |
| S1 | 52% | 4.33 | 0.001 | 1.78 | 10.5 | SL = 2.0 | yes |
| S2 | 40% | 2.67 | 0.03 | 1.09 | 6.52 | SL = 7.3 | no |
| statistically non-significant results | | | | | | | |
| S3 | 36% | 2.25 | 0.08 | 0.91 | 5.55 | AL ≥ 100 | no |
| S4 | 26% | 1.41 | 0.48 | 0.55 | 3.59 | AL = 4.8 | no |
| S5 | 18% | 0.88 | 0.80 | 0.32 | 2.39 | none available | yes |
| statistically significant results | | | | | | | |
| L1 | 29% | 1.63 | 0.001 | 1.22 | 2.18 | SL = 1.24 | yes |
| L2 | 26% | 1.41 | 0.025 | 1.045 | 1.89 | SL = 1.69 | no |
| statistically non-significant results | | | | | | | |
| L3 | 25% | 1.33 | 0.06 | 0.99 | 1.80 | AL ≫ 100 | no |
| L4 | 23% | 1.20 | 0.25 | 0.88 | 1.62 | AL = 2.9 | no |
| L5 | 19% | 0.95 | 0.74 | 0.67 | 1.33 | none available | yes |
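
The scepticism and advocacy limits in the table can be recomputed from the 95% CI bounds alone. The Python sketch below is illustrative only: the closed-form expressions for SL and AL are not stated in this record and are assumed here, the function names are hypothetical, and only odds ratios in the direction of benefit (OR > 1) are handled. The intrinsic-credibility check follows the Figure 3 caption: a significant finding lacks intrinsic credibility when its sceptical CPI, taken here as (1/SL, SL), encompasses the central estimate M.

```python
import math


def scepticism_limit(lower, upper):
    """Assumed closed form for the scepticism limit (SL) of a significant
    odds ratio whose 95% CI (lower, upper) lies wholly above 1."""
    if not (1.0 < lower < upper):
        raise ValueError("expected a 95% CI wholly above OR = 1")
    ln_l, ln_u = math.log(lower), math.log(upper)
    return math.exp((ln_u - ln_l) ** 2 / (4.0 * math.sqrt(ln_u * ln_l)))


def advocacy_limit(lower, upper):
    """Assumed closed form for the advocacy limit (AL) of a non-significant
    odds ratio whose 95% CI straddles 1. Returns None when the central
    estimate (geometric mean of the bounds) is not above 1, i.e. when no
    advocacy prior favouring benefit is available."""
    if not (0.0 < lower < 1.0 < upper):
        raise ValueError("expected a 95% CI that includes OR = 1")
    ln_l, ln_u = math.log(lower), math.log(upper)
    if ln_l + ln_u <= 0.0:
        return None
    return math.exp(-(ln_u + ln_l) * (ln_u - ln_l) ** 2 / (2.0 * ln_u * ln_l))


def intrinsically_credible(estimate, lower, upper):
    """A significant finding is treated as intrinsically credible when its
    central estimate lies outside the sceptical CPI, i.e. above the SL."""
    return estimate > scepticism_limit(lower, upper)


if __name__ == "__main__":
    # (label, central estimate, CI lower, CI upper) taken from the table.
    significant = [("S1", 4.33, 1.78, 10.5), ("S2", 2.67, 1.09, 6.52),
                   ("L1", 1.63, 1.22, 2.18), ("L2", 1.41, 1.045, 1.89)]
    for label, m, lo, hi in significant:
        sl = scepticism_limit(lo, hi)
        print(label, "SL =", round(sl, 2),
              "intrinsically credible:", intrinsically_credible(m, lo, hi))

    non_significant = [("S4", 0.55, 3.59), ("S5", 0.32, 2.39),
                       ("L4", 0.88, 1.62), ("L5", 0.67, 1.33)]
    for label, lo, hi in non_significant:
        al = advocacy_limit(lo, hi)
        print(label, "AL =", "none available" if al is None else round(al, 2))
```

Under these assumptions the computed limits agree with the tabulated SL and AL values to the precision shown, and the comparison of M with SL gives the yes/no pattern listed for S1, S2, L1 and L2.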