Stefan Wiens, Mats E Nilsson.
Abstract
Because of the continuing debates about statistics, many researchers may feel confused about how to analyze and interpret data. Current guidelines in psychology advocate the use of effect sizes and confidence intervals (CIs). However, researchers may be unsure about how to extract effect sizes from factorial designs. Contrast analysis is helpful because it can be used to test specific questions of central interest in studies with factorial designs. It weighs several means and combines them into one or two sets that can be tested with t tests. The effect size produced by a contrast analysis is simply the difference between means. The CI of the effect size informs directly about direction, hypothesis exclusion, and the relevance of the effects of interest. However, any interpretation in terms of precision or likelihood requires the use of likelihood intervals or credible intervals (Bayesian). These various intervals and even a Bayesian t test can be obtained easily with free software. This tutorial reviews these methods to guide researchers in answering the following questions: When I analyze mean differences in factorial designs, where can I find the effects of central interest, and what can I learn about their effect sizes?
Keywords: Bayesian analysis; analysis of variance; confidence interval; contrast analysis; null hypothesis significance testing
Year: 2016; PMID: 29805179; PMCID: PMC5952862; DOI: 10.1177/0013164416668950
Source DB: PubMed; Journal: Educ Psychol Meas; ISSN: 0013-1644; Impact factor: 2.821
Figure 1. Means and contrast weights in a 2 × 2 analysis of variance.
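To illustrate how contrast weights turn four cell means into a single mean difference, here is a minimal Python sketch for a 2 × 2 between-subjects design. The data, the cell order, the weights (+0.5, +0.5, −0.5, −0.5), and the function name `contrast_test` are hypothetical and are not taken from Figure 1. The sketch assumes independent groups with a pooled error term, and it requires NumPy and SciPy.

```python
import numpy as np
from scipy import stats


def contrast_test(groups, weights, conf=0.95):
    """Test one contrast across independent groups using a pooled error term."""
    weights = np.asarray(weights, dtype=float)
    if not np.isclose(weights.sum(), 0.0):
        raise ValueError("contrast weights must sum to zero")
    groups = [np.asarray(g, dtype=float) for g in groups]
    means = np.array([g.mean() for g in groups])
    ns = np.array([g.size for g in groups])

    # Pooled within-cell variance (mean square error) and its degrees of freedom.
    df = int(ns.sum() - len(groups))
    ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)
    mse = ss_within / df

    # Contrast estimate: with weights summing to +1 and -1 this is a mean difference.
    estimate = float(weights @ means)
    se = float(np.sqrt(mse * np.sum(weights ** 2 / ns)))
    t_value = estimate / se
    p_value = float(2 * stats.t.sf(abs(t_value), df))

    # Confidence interval around the mean difference.
    t_crit = float(stats.t.ppf(0.5 + conf / 2, df))
    ci = (estimate - t_crit * se, estimate + t_crit * se)
    return {"estimate": estimate, "se": se, "t": t_value,
            "df": df, "p": p_value, "ci": ci}


# Hypothetical ratings for the four cells, in the order A1B1, A1B2, A2B1, A2B2.
cells = [[4, 5, 6], [5, 6, 7], [2, 3, 4], [3, 4, 5]]
# Main effect of A: mean of the two A1 cells minus mean of the two A2 cells.
result = contrast_test(cells, weights=[0.5, 0.5, -0.5, -0.5])
print(result)
```

Because the positive weights sum to +1 and the negative weights to −1, the estimate (2.0 for these made-up numbers, tested on 8 degrees of freedom) stays in the original units of the ratings, and the CI is an interval around that mean difference.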
Figure 2. (A) Left: The actual mean happiness rating (of zero) with the 95% CI. Right: The nonrejection area (dotted) for a one-sample t test (at 2-tailed α = 5%). If a sample mean falls outside of this area, it is significantly different from zero. Note that the CI around an actual sample mean of zero (left) is identical to the nonrejection area in a one-sample t test (which assumes that mean = 0). (B) Across many studies, 95% of the sample means should fall within the nonrejection area (dotted) defined by the true mean and the true standard deviation. Here, several sample means (and the 95% CIs) are shown along the x axis. In the long run, 95% of these sample means should fall no more than ±2 steps from the true mean. Because we do not know these true values, we compute CIs with a similar interval (about ±2 steps). Note that the CIs differ slightly because we need to estimate the true standard deviation from each sample. Critically, the distance from the true mean to the sample means (left) is symmetric to the distance from the sample means to the true mean (right). Thus, 95% of all CIs would capture the true mean, but 5% would not (the dashed CI). Unfortunately, for individual studies, we do not know whether the CI misses the true mean (i.e., the dashed CI in the figure), that is, whether the CI is “red,” as illustrated nicely by the ESCI software (Cumming, 2012).
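The long-run claim in panel B can be checked with a small simulation. The sketch below is only an illustration: the true mean, true standard deviation, sample size, number of studies, and random seed are arbitrary assumptions, and the data are drawn from a normal distribution. It requires NumPy and SciPy.

```python
import numpy as np
from scipy import stats

# Assumed population values and study design (arbitrary choices for illustration).
true_mean, true_sd = 0.0, 1.0
n, n_studies = 20, 10_000

rng = np.random.default_rng(1)
samples = rng.normal(true_mean, true_sd, size=(n_studies, n))

# Each simulated study estimates its own mean and standard error.
means = samples.mean(axis=1)
sems = samples.std(axis=1, ddof=1) / np.sqrt(n)

# 95% CI per study: mean plus or minus the critical t value times the standard error.
t_crit = stats.t.ppf(0.975, df=n - 1)
lower = means - t_crit * sems
upper = means + t_crit * sems

# Proportion of CIs that capture the true mean.
coverage = float(np.mean((lower <= true_mean) & (true_mean <= upper)))
print(f"Proportion of CIs capturing the true mean: {coverage:.3f}")
```

The proportion should land close to .95, while any single interval either contains the true mean or does not, which is the point the caption makes about individual studies.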
Figure 3. Illustration showing that confidence intervals (CIs) are more informative than null hypothesis significance testing (NHST). For each study, NHST was used to determine whether the mean differed significantly from zero. The NHST found that the results of studies a and b were not significant, whereas the results of the other studies were significant (p < .05). These verdicts are less informative than what can be learned directly from the CIs (see comments). Note that effect sizes in the gray area are considered practically unimportant (null range).
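One way to make the comparison concrete is to read a CI against both zero and the null range. The following Python sketch does this; the function name `describe_ci`, the three category labels, and the example numbers are hypothetical and do not reproduce the comments shown in Figure 3.

```python
def describe_ci(lower, upper, null_low, null_high):
    """Summarize a CI relative to zero and to a range of unimportant effects."""
    if lower > upper or null_low > null_high:
        raise ValueError("interval bounds must be ordered as (low, high)")

    # NHST-style verdict: the CI excludes zero exactly when the test is significant.
    excludes_zero = lower > 0 or upper < 0

    # Relevance verdict: where the CI lies relative to the null range.
    if lower >= null_low and upper <= null_high:
        relevance = "entirely inside the null range"
    elif lower > null_high or upper < null_low:
        relevance = "entirely outside the null range"
    else:
        relevance = "overlaps the null range"
    return excludes_zero, relevance


# Hypothetical CIs, judged against a null range of -0.2 to 0.2.
examples = {
    "wide, includes zero": (-0.5, 1.0),
    "narrow, includes zero": (-0.1, 0.1),
    "narrow, excludes zero": (0.05, 0.15),
    "clearly positive": (0.5, 1.5),
}
for label, (low, high) in examples.items():
    print(label, describe_ci(low, high, -0.2, 0.2))
```

The two CIs that include zero are both nonsignificant, yet only the narrow one supports the conclusion that any effect is practically unimportant. Likewise, both CIs that exclude zero are significant, but only one of them lies outside the null range.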
Figure 4. Excerpts from the output of a Bayesian one-sample t test in JASP. The Bayes Factor (BF0+) shows that the present data are 4.2 times more likely under the H0 than under the alternative hypothesis (H+). In the original pie chart, the area data|H+ is shown in red. In the line chart, the dotted line shows the prior distribution that captures the a priori belief that the true effect is positive and small rather than large. The solid line shows the posterior distribution that captures the beliefs after the data are incorporated. In the line chart, the BF0+ is the likelihood for H0 under the posterior (upper gray circle) divided by the likelihood for H0 under the prior (lower gray circle, Wagenmakers et al., 2010). The whiskers above the distributions show the 95% credible interval for the standardized effect.
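Written as a formula, the ratio of the two gray circles is the Savage-Dickey density ratio (Wagenmakers et al., 2010). The symbol δ for the standardized effect is an assumption here, since the caption does not name it. The second expression simply inverts the reported value of 4.2 to express the evidence for H+ over H0.

```latex
\mathrm{BF}_{0+}
  = \frac{p(\delta = 0 \mid \text{data}, H_{+})}{p(\delta = 0 \mid H_{+})},
\qquad
\mathrm{BF}_{+0} = \frac{1}{\mathrm{BF}_{0+}} = \frac{1}{4.2} \approx 0.24
```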