Tom A de Graaf, Alexander T Sack.
Abstract
Experiments often challenge the null hypothesis that an intervention, for instance application of non-invasive brain stimulation (NIBS), has no effect on an outcome measure. In conventional statistics, a positive result rejects that hypothesis, but a null result is meaningless. Informally, however, researchers often do find null results meaningful to a greater or lesser extent. We present a model to guide interpretation of null results in NIBS research. Along a "gradient of surprise," from Replication nulls through Exploration nulls to Hypothesized nulls, null results can be less or more surprising in the context of prior expectations, research, and theory. This influences to what extent we should credit a null result in this greater context. Orthogonal to this, experimental design choices create a "gradient of interpretability," along which null results of an experiment, considered in isolation, become more informative. This is determined by target localization procedure, neural efficacy checks, and power and effect size evaluations. Along the latter gradient, we concretely propose three "levels of null evidence." With caveats, these proposed levels C, B, and A classify how informative an empirical null result is along concrete criteria. Lastly, to further inform, and help formalize, the inferences drawn from null results, Bayesian statistics can be employed. We discuss how this increasingly common alternative to traditional frequentist inference does allow quantification of the support for the null hypothesis, relative to support for the alternative hypothesis. It is our hope that these considerations can contribute to the ongoing effort to disseminate null findings alongside positive results to promote transparency and reduce publication bias.
Keywords: TES; TMS; bayes; inference; localization; negative; null
Year: 2018 PMID: 30618550 PMCID: PMC6297282 DOI: 10.3389/fnins.2018.00915
Source DB: PubMed Journal: Front Neurosci ISSN: 1662-453X Impact factor: 4.677
FIGURE 1. Orthogonal gradients of surprise and interpretability. A null result in an experiment aiming to replicate a well-established finding is very surprising. A null result that was hypothesized is not at all surprising. A null result in an exploratory study with no prior expectations can be received neutrally (middle of the continuum, not shown). The level of surprise (vertical axis) essentially reflects how a null finding relates to our prior expectations. From a Bayesian perspective, even without Bayesian statistics, we should let this level of surprise (if justified based on theory or previous research) influence to what extent we let the result “change our beliefs.” This explicitly refers to interpretation of a null result in the greater context of knowledge, theory, and prior research. Orthogonal to this, one might evaluate the experiment and its parameters in terms of localization procedure, neural efficacy checks, and power and effect size, together making up a gradient of interpretability (horizontal axis). This continuum reflects how informative we should consider the null result in isolation, ignoring expectations, theory, or previous research. The figure displays our view on how design choices impact the interpretability of a null finding along this dimension (toward the right is more informative, see legend top-right). At the bottom, we schematically visualize how the collective of such “leftward” design choices can place a null result into the “Level C evidence” category, which means the null result in isolation is not very informative, while the collective of such “rightward” design choices can place a null result in the highest “Level A evidence” category, which means a null result appears informative and should be taken seriously. A few caveats are important. This figure aims to visualize concepts discussed in more detail in the main text, and how they relate to each other. The visualization of Level C through Level A evidence is meant to make intuitive how they roughly fit into this overview; our proposal for what exactly differentiates Level A through C evidence is in the main text. Lastly, we do not suggest that every design decision “on the right” in this figure is always best for every experiment, or that experiments yielding Level C evidence are somehow inferior. Note also that the figure reflects how design factors influence how informative null findings are; it does not apply to positive findings in a straightforward way.
FIGURE 2. Bayesian analysis to assess support for a null hypothesis. (A) Results of two fictional within-subject conditions in a (random-walk) generated dataset with 40 virtual observers. There does not seem to be a meaningful difference in RT between the two conditions. (B) A traditional paired-samples t-test provides no reason to reject the null hypothesis (P > 0.05), but it cannot provide evidence to accept the null hypothesis. The Bayesian paired-samples t-test equivalent yields a BF01 = 5.858. This means that one is 5.858 times more likely to obtain the current data if the null hypothesis is true than if the alternative hypothesis is true. In the recommended interpretational framework, this constitutes “moderate” evidence for the null hypothesis (BF > 3 but < 10; a BF above 10 would constitute “strong” evidence). (C) A plot of the prior distribution (dashed) and posterior distribution (solid), reflecting probability density (vertical axis) of effect sizes in the population (Cohen’s d, horizontal axis). In these tests, the prior distribution is conventionally centered around 0, but its width can be set by the user to reflect strength of prior expectations. The width of the posterior distribution reflects confidence about effect size based on the prior and the data: the horizontal bar outlines the “credible interval,” which contains 95% of the posterior distribution density. This width/interval will be smaller with increasing sample size. The median of the posterior distribution (0.002) is a point estimate of the real effect size. BF10 here is simply the inverse of BF01, and the “wedge-chart” visualizes how much more likely one is to obtain the current data given H0 versus H1. (D) Since setting a prior (width) is not always straightforward, one can plot how much the outcome of the analysis (BF01, vertical axis) depends on the selected prior width (horizontal axis). At the top are the BF values for a few points on the plot: the user-selected prior (in this case the JASP-provided default Cauchy prior width of 0.707), the wide prior, and the ultrawide prior. Clearly, the evidence for the null hypothesis (the likelihood of obtaining the current data under the null hypothesis relative to the alternative) ranges from moderate to strong for most reasonable priors.
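The Bayes factor in panel (B) and the prior-sensitivity check in panel (D) can be sketched numerically. What follows is a minimal Python sketch, assuming the JZS formulation of the Bayesian t-test (a Cauchy prior on effect size, integrated numerically). It is not the JASP implementation. The function names, the t statistic of 0, and the wide and ultrawide widths of 1 and √2 are illustrative assumptions, since the underlying dataset is not reproduced here.

```python
import math


def jzs_bf10(t: float, n: int, r: float = 0.707, steps: int = 20000) -> float:
    """BF10 for a one-sample or paired t-test, assuming a Cauchy(0, r) prior on Cohen's d.

    t is the observed t statistic, n the number of observers (pairs),
    r the prior width. All names here are hypothetical, not a library API.
    """
    nu = n - 1  # degrees of freedom

    def integrand(g: float) -> float:
        # Likelihood of t given a normal prior on d with variance g * r^2 ...
        k = 1.0 + n * g * r * r
        likelihood = k ** -0.5 * (1.0 + t * t / (k * nu)) ** (-(nu + 1) / 2.0)
        # ... weighted by an inverse-chi-square(1) density on g, which makes
        # the marginal prior on d a Cauchy distribution with scale r.
        prior_g = (2.0 * math.pi) ** -0.5 * g ** -1.5 * math.exp(-1.0 / (2.0 * g))
        return likelihood * prior_g

    # Map g in (0, inf) onto u in (0, 1) via g = u / (1 - u).
    # The midpoint rule never evaluates the endpoints u = 0 or u = 1.
    h = 1.0 / steps
    marginal_h1 = 0.0
    for i in range(steps):
        u = (i + 0.5) * h
        g = u / (1.0 - u)
        marginal_h1 += integrand(g) / (1.0 - u) ** 2 * h

    marginal_h0 = (1.0 + t * t / nu) ** (-(nu + 1) / 2.0)
    return marginal_h1 / marginal_h0


def evidence_label(bf: float) -> str:
    """Label a Bayes factor using the thresholds quoted in the caption (3 and 10)."""
    if bf > 10:
        return "strong"
    if bf > 3:
        return "moderate"
    return "weaker than moderate"


if __name__ == "__main__":
    # Assumed values: t = 0 (no observed difference) and 40 observers.
    # Widths 1.0 and sqrt(2) are assumed to match the wide and ultrawide priors.
    for name, width in (("default", 0.707), ("wide", 1.0), ("ultrawide", math.sqrt(2))):
        bf01 = 1.0 / jzs_bf10(t=0.0, n=40, r=width)
        print(f"{name:9s} r={width:.3f}  BF01={bf01:.3f}  ({evidence_label(bf01)})")
```

With a t statistic near zero and 40 observers, BF01 at the default width falls in the “moderate” range (between 3 and 10), and it grows as the prior widens, because a wider prior spreads the alternative hypothesis over larger effects that the data do not support.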