Rink Hoekstra, Henk A L Kiers, Addie Johnson.
Abstract
A valid interpretation of most statistical techniques requires that one or more assumptions be met. In published articles, however, little information tends to be reported on whether the data satisfy the assumptions underlying the statistical techniques used. This could be due to self-selection: only manuscripts with data fulfilling the assumptions are submitted. Another explanation could be that violations of assumptions are rarely checked for in the first place. We studied whether and how 30 researchers checked fictitious data for violations of assumptions in their own working environment. Participants were asked to analyze the data as they would their own data, for which often-used and well-known techniques such as the t-procedure, ANOVA and regression (or non-parametric alternatives) were required. It was found that the assumptions of the techniques were rarely checked, and that if they were, it was regularly by means of a statistical test. Interviews afterward revealed a general lack of knowledge about assumptions, the robustness of the techniques with regard to the assumptions, and how (or whether) assumptions should be checked. These data suggest that checking for violations of assumptions is not a well-considered choice, and that the use of statistics can be described as opportunistic.
Keywords: analyzing data; assumptions; homogeneity; normality; robustness
Year: 2012 PMID: 22593746 PMCID: PMC3350940 DOI: 10.3389/fpsyg.2012.00137
Source DB: PubMed Journal: Front Psychol ISSN: 1664-1078
Figure 1. An example of one of the research question descriptions. In this example, participants were supposed to answer this question by means of a regression analysis.
An overview of the properties of the six scenarios.
| Scenario | Technique to be used | Effect size | p-value | Violations of assumption |
|---|---|---|---|---|
| 1 | t-test | Medium | 0.04 | Normality |
| 2 | t-test | Very small | 0.86 | None |
| 3 | Regression analysis | Large | 0.00 | None |
| 4 | Regression analysis | Medium | 0.01 | None |
| 5 | ANOVA | Large | 0.05 | Homogeneity |
| 6 | ANOVA | Close to 0 | 0.58 | None |
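The two assumptions named in the table, normality and homogeneity of variance, can be inspected descriptively before any test is run. The following is a minimal sketch, not the procedure used in the study: the helper names `skewness` and `variance_ratio` are hypothetical, the data are made up for illustration, and what counts as a worrying value is left to the analyst.

```python
import statistics


def skewness(xs):
    """Moment-based sample skewness; values near 0 suggest symmetry."""
    n = len(xs)
    mean = statistics.fmean(xs)
    m2 = sum((x - mean) ** 2 for x in xs) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in xs) / n  # third central moment
    return m3 / m2 ** 1.5


def variance_ratio(*groups):
    """Largest sample variance divided by the smallest across groups."""
    variances = [statistics.variance(g) for g in groups]
    return max(variances) / min(variances)


# Made-up illustrative data (assumption: NOT the study's fictitious data sets).
symmetric = [1.0, 2.0, 3.0, 4.0, 5.0]
right_skewed = [1.0, 1.0, 1.0, 2.0, 10.0]
group_a = [4.0, 5.0, 6.0]
group_b = [0.0, 5.0, 10.0]

print("skewness, symmetric sample:   ", skewness(symmetric))
print("skewness, right-skewed sample:", skewness(right_skewed))
print("variance ratio, b vs. a:      ", variance_ratio(group_a, group_b))
```

A skewness far from zero or a large variance ratio is only a prompt to look more closely, for example with a histogram or residual plot; it does not settle whether a given technique is robust to the violation.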
Figure 2. The frequency of whether two assumptions were checked at all, whether they were checked correctly, and whether a preliminary test was used for three often-used techniques, in percentages of the total number of cases. Between brackets are 95% CIs for the percentages.
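A 95% CI for a percentage such as those in Figure 2 can be computed in several ways, and the source does not say which method was used. The sketch below uses the Wilson score interval as one common choice; the function name `wilson_interval` and the example counts are assumptions made for illustration only.

```python
import math


def wilson_interval(successes, n, z=1.959964):
    """Wilson score interval for a binomial proportion.

    z defaults to the standard normal quantile for a two-sided 95% interval.
    Returns (lower, upper) as proportions between 0 and 1.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0 <= successes <= n:
        raise ValueError("successes must lie between 0 and n")
    p_hat = successes / n
    denom = 1 + z ** 2 / n
    centre = (p_hat + z ** 2 / (2 * n)) / denom
    half_width = (
        z * math.sqrt(p_hat * (1 - p_hat) / n + z ** 2 / (4 * n ** 2)) / denom
    )
    return centre - half_width, centre + half_width


# Hypothetical counts: an assumption was checked in 10 of 30 cases.
low, high = wilson_interval(10, 30)
print(f"{100 * 10 / 30:.1f}% [{100 * low:.1f}%, {100 * high:.1f}%]")
```

Unlike the simple normal-approximation interval, the Wilson interval stays within 0 and 100% even when the observed percentage is close to either bound, which matters when an assumption is checked in very few cases.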
Figure 3. Percentages of participants giving each of the explanations for not checking assumptions as a function of assumption and technique. Error bars indicate 95% CIs.