Johnny van Doorn, Don van den Bergh, Udo Böhm, Fabian Dablander, Koen Derks, Tim Draws, Alexander Etz, Nathan J Evans, Quentin F Gronau, Julia M Haaf, Max Hinne, Šimon Kucharský, Alexander Ly, Maarten Marsman, Dora Matzke, Akash R Komarlu Narendra Gupta, Alexandra Sarafoglou, Angelika Stefan, Jan G Voelkel, Eric-Jan Wagenmakers.
Abstract
Despite the increasing popularity of Bayesian inference in empirical research, few practical guidelines provide detailed recommendations for how to apply Bayesian procedures and interpret the results. Here we offer specific guidelines for four different stages of Bayesian statistical reasoning in a research setting: planning the analysis, executing the analysis, interpreting the results, and reporting the results. The guidelines for each stage are illustrated with a running example. Although the guidelines are geared towards analyses performed with the open-source statistical software JASP, most guidelines extend to Bayesian inference in general.
Keywords: Bayesian inference; Scientific reporting; Statistical software
Year: 2021 PMID: 33037582 PMCID: PMC8219590 DOI: 10.3758/s13423-020-01798-5
Source DB: PubMed Journal: Psychon Bull Rev ISSN: 1069-9384
A summary of the guidelines for the different stages of a Bayesian analysis, with a focus on analyses conducted in JASP.
| Stage | Recommendation |
|---|---|
| Planning | Write the methods section in advance of data collection |
| | Distinguish between exploratory and confirmatory research |
| | Specify the goal: estimation, testing, or both |
| | If the goal is testing, decide on a one-sided or two-sided procedure |
| | Choose a statistical model |
| | Determine which model checks will need to be performed |
| | Specify which steps can be taken to deal with possible model violations |
| | Choose a prior distribution |
| | Consider how to assess the impact of prior choices on the inferences |
| | Specify the sampling plan |
| | Consider a Bayes factor design analysis |
| | Preregister the analysis plan for increased transparency |
| | Consider specifying a multiverse analysis |
| Executing | Check the quality of the data (e.g., assumption checks) |
| | Annotate the JASP output |
| Interpreting | Beware of the common pitfalls |
| | Use the correct interpretation of Bayes factor and credible interval |
| | When in doubt, ask for advice (e.g., on the JASP forum) |
| Reporting | Mention the goal of the analysis |
| | Include a plot of the prior and posterior distribution, if available |
| | If testing, report the Bayes factor, including its subscripts |
| | If estimating, report the posterior median and the credible interval |
| | Include which prior settings were used |
| | Justify the prior settings (particularly for informed priors in a testing scenario) |
| | Discuss the robustness of the result |
| | If relevant, report the results from both estimation and hypothesis testing |
| | Refer to the statistical literature for details about the analyses used |
| | Consider a sequential analysis |
| | Report the results of any multiverse analyses, if conducted |
| Make the .jasp file and data available online |
Note that the stages have a predetermined order, but the individual recommendations can be rearranged where necessary.
Fig. 4. A graphical representation of a Bayes factor classification table. As the Bayes factor deviates from 1, which indicates equal support for H₀ and H₁, more support is gained for either H₀ or H₁. Bayes factors between 1 and 3 are considered weak evidence, Bayes factors between 3 and 10 are considered moderate evidence, and Bayes factors greater than 10 are considered strong evidence. The Bayes factors are also represented as probability wheels, where the ratio of white (i.e., support for H₀) to red (i.e., support for H₁) surface is a function of the Bayes factor. The probability wheels further underscore the continuous scale of evidence that Bayes factors represent. These classifications are heuristic and should not be misused as an absolute rule for all-or-nothing conclusions.
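The heuristic categories in the Fig. 4 caption can be written down as a small lookup. The following is a minimal sketch, not JASP code: the function names are hypothetical, the handling of values exactly at 3 and 10 is an assumption (the caption does not say which category the boundaries belong to), and the wheel share BF/(1 + BF) assumes that the wheel displays the two hypotheses in proportion to the Bayes factor.

```python
def classify_bayes_factor(bf10):
    """Return (favored hypothesis, heuristic label) for a Bayes factor BF10.

    Thresholds follow the caption: 1-3 weak, 3-10 moderate, >10 strong.
    Boundary handling (3 counts as moderate, 10 as moderate) is an assumption.
    """
    if bf10 <= 0:
        raise ValueError("A Bayes factor must be positive.")
    # Evidence for H0 is read off the reciprocal, BF01 = 1 / BF10.
    strength = max(bf10, 1.0 / bf10)
    if bf10 > 1:
        favored = "H1"
    elif bf10 < 1:
        favored = "H0"
    else:
        favored = "neither"
    if strength == 1:
        label = "no evidence"
    elif strength < 3:
        label = "weak"
    elif strength <= 10:
        label = "moderate"
    else:
        label = "strong"
    return favored, label


def wheel_share_h1(bf10):
    """Fraction of the probability wheel assigned to H1 (assumed: BF / (1 + BF))."""
    return bf10 / (1.0 + bf10)


if __name__ == "__main__":
    for bf in (0.05, 0.5, 1.0, 2.0, 5.0, 30.0):
        print(bf, classify_bayes_factor(bf), round(wheel_share_h1(bf), 3))
```

Because the labels are heuristic, a report would normally state the Bayes factor itself and use the label only as a reading aid.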
Fig. 7. The Bayes factor robustness plot. The maximum BF₊₀ is attained when setting the prior width r to 0.38. The plot indicates BF₊₀ for the user-specified prior (r = 1/√2), wide prior (r = 1), and ultrawide prior (r = √2). The evidence for the alternative hypothesis is relatively stable across a wide range of prior distributions, suggesting that the analysis is robust. However, the evidence in favor of H₊ is not particularly strong and will not convince a skeptic.
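A robustness check of this kind amounts to recomputing the Bayes factor over a range of prior widths r. The sketch below is an illustration under stated assumptions rather than the JASP implementation: it evaluates the two-sided Bayes factor BF₁₀ (not the one-sided BF₊₀ shown in the figure) for a two-sample t test with a Cauchy(0, r) prior on δ, using the standard representation of the Cauchy prior as a normal prior whose variance g is integrated out numerically. The function name `jzs_bf10`, the integration grid, and the t value and group sizes in the usage loop are all hypothetical and are not taken from the stereogram data.

```python
import math


def jzs_bf10(t, n1, n2, r=math.sqrt(2) / 2, steps=4000):
    """Two-sided BF10 for a two-sample t test, Cauchy(0, r) prior on delta.

    Writes delta | g ~ Normal(0, g) with g = r^2 * u and
    u ~ InverseGamma(1/2, 1/2), then integrates over u with the
    trapezoid rule on a log grid (u = exp(x)).
    """
    nu = n1 + n2 - 2                # degrees of freedom
    n_eff = n1 * n2 / (n1 + n2)     # effective sample size

    def integrand(u):
        a = 1.0 + n_eff * r * r * u
        # Marginal likelihood of t given g, up to a constant shared with H0.
        like = a ** -0.5 * (1.0 + t * t / (a * nu)) ** (-(nu + 1) / 2)
        # Prior density of u times the Jacobian u of the log substitution.
        prior_jac = u ** -0.5 * math.exp(-1.0 / (2.0 * u)) / math.sqrt(2.0 * math.pi)
        return like * prior_jac

    lo, hi = -10.0, 30.0            # grid limits for x = log(u)
    dx = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * integrand(math.exp(lo + i * dx))
    m1 = total * dx
    m0 = (1.0 + t * t / nu) ** (-(nu + 1) / 2)
    return m1 / m0


if __name__ == "__main__":
    # Hypothetical summary statistics, for illustration only.
    t_obs, n1, n2 = 2.0, 40, 40
    for label, r in (("user", math.sqrt(2) / 2), ("wide", 1.0), ("ultrawide", math.sqrt(2))):
        print(f"{label:10s} r = {r:.3f}  BF10 = {jzs_bf10(t_obs, n1, n2, r):.3f}")
```

If the printed values stay in the same evidence category across the three widths, the conclusion does not hinge on the particular prior width chosen.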
Fig. 1. Model misspecification is also a problem for Bayesian analyses. The four scatterplots in the top panel show Anscombe’s quartet (Anscombe, 1973); the bottom panel shows the corresponding inference, which is identical for all four scatterplots. Except for the leftmost scatterplot, all data violate the assumptions of the linear correlation analysis in important ways.
Fig. 2. Descriptive plots allow a visual assessment of the assumptions of the t test for the stereogram data. The top row shows descriptive plots for the raw fuse times, and the bottom row shows descriptive plots for the log-transformed fuse times. The left column shows boxplots, including the jittered data points, for each of the experimental conditions. The middle and right columns show Q-Q plots of the dependent variable, split by experimental condition. Here we see that the log-transformed dependent variable is more appropriate for the t test, due to its distribution and absence of outliers. Figures from JASP.
Fig. 3. JASP menu for the Bayesian two-sample t test. The left input panel offers the analysis options, including the specification of the alternative hypothesis and the selection of plots. The right output panel shows the corresponding analysis output. The prior and posterior plot is explained in more detail in Fig. 6. The input panel specifies the one-sided analysis for hypothesis testing; a two-sided analysis for estimation can be obtained by selecting “Group 1 ≠ Group 2” under “Alt. Hypothesis”.
Fig. 6. Bayesian two-sample t test for the parameter δ. The probability wheel on top visualizes the evidence that the data provide for the two rival hypotheses. The two gray dots indicate the prior and posterior density at the test value (Dickey & Lientz, 1970; Wagenmakers, Lodewyckx, Kuriyal, & Grasman, 2010). The median and the 95% central credible interval of the posterior distribution are shown in the top right corner. The left panel shows the one-sided procedure for hypothesis testing and the right panel shows the two-sided procedure for parameter estimation. Both figures from JASP.
Fig. 5. Updating the unconditional prior distribution to the unconditional posterior distribution for the stereogram example. The left panel shows the unconditional prior distribution, which is a mixture between the prior distributions under H₀ and H₁. The prior distribution under H₀ is a spike at the null value, indicated by the dotted line; the prior distribution under H₁ is a Cauchy distribution, indicated by the gray mass. The mixture proportion is determined by the prior model probabilities p(H₀) and p(H₁). The right panel shows the unconditional posterior distribution, after updating the prior distribution with the data D. This distribution is a mixture between the posterior distributions under H₀ and H₁, where the mixture proportion is determined by the posterior model probabilities p(H₀ | D) and p(H₁ | D). Since p(H₁ | D) ≈ 0.7 (i.e., the data provide support for H₁ over H₀), about 70% of the unconditional posterior mass consists of the posterior mass under H₁, indicated by the gray mass. Thus, the unconditional posterior distribution provides information about plausible values for δ, while taking into account the uncertainty of H₁ being true. In both panels, the dotted line and gray mass have been rescaled such that the height of the dotted line and the highest point of the gray mass reflect the prior (left) and posterior (right) model probabilities.
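The mixture proportions described in the Fig. 5 caption follow from Bayes' rule on the model level: posterior odds equal the Bayes factor times the prior odds. A minimal sketch of that arithmetic is given below. The function name is hypothetical, and the Bayes factor of 7/3 in the usage example is an illustrative value chosen because, with equal prior model probabilities, it yields a posterior probability of 0.7 for H₁; it is not a number reported in this record.

```python
def posterior_model_probabilities(bf10, prior_h1=0.5):
    """Return (p(H0 | D), p(H1 | D)) from BF10 and the prior probability of H1."""
    if bf10 <= 0:
        raise ValueError("A Bayes factor must be positive.")
    if not 0.0 < prior_h1 < 1.0:
        raise ValueError("prior_h1 must lie strictly between 0 and 1.")
    prior_odds = prior_h1 / (1.0 - prior_h1)
    posterior_odds = bf10 * prior_odds          # Bayes' rule for models
    post_h1 = posterior_odds / (1.0 + posterior_odds)
    return 1.0 - post_h1, post_h1


if __name__ == "__main__":
    # Illustrative Bayes factor (assumption), equal prior model probabilities.
    p_h0, p_h1 = posterior_model_probabilities(7 / 3)
    # p_h0 is the weight of the spike at the null value;
    # p_h1 is the weight of the continuous posterior under H1.
    print(f"spike weight p(H0 | D) = {p_h0:.2f}")
    print(f"slab weight  p(H1 | D) = {p_h1:.2f}")
```

The two returned weights are the heights to which the dotted line and the gray mass are rescaled in the right panel, so a different prior model probability would shift both.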