Jerome Olsen, Johanna Mosen, Martin Voracek, Erich Kirchler.
Abstract
The replicability of research findings has recently been disputed across multiple scientific disciplines. In constructive reaction, the research culture in psychology is facing fundamental changes, but investigations of research practices that led to these improvements have almost exclusively focused on academic researchers. By contrast, we investigated the statistical reporting quality and selected indicators of questionable research practices (QRPs) in psychology students' master's theses. In a total of 250 theses, we investigated utilization and magnitude of standardized effect sizes, along with statistical power, the consistency and completeness of reported results, and possible indications of p-hacking and further testing. Effect sizes were reported for 36% of focal tests (median r = 0.19), and only a single formal power analysis was reported for sample size determination (median observed power 1 - β = 0.67). Statcheck revealed inconsistent p-values in 18% of cases, while 2% led to decision errors. There were no clear indications of p-hacking or further testing. We discuss our findings in the light of promoting open science standards in teaching and student supervision.
Keywords: academic theses; effect size; p-value; questionable research practices; statistical power; statistical reporting
Year: 2019 PMID: 31903199 PMCID: PMC6936276 DOI: 10.1098/rsos.190738
Source DB: PubMed Journal: R Soc Open Sci ISSN: 2054-5703 Impact factor: 2.963
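The abstract distinguishes p-values that are merely inconsistent with their test statistic from those that change the significance decision. The following is a minimal sketch of that distinction, not the statcheck package itself (statcheck is an R package and handles t, F, r, χ² and z tests). It assumes a two-sided z-test only, so that the Python standard library suffices; the function names, the rounding tolerance and the α = 0.05 default are illustrative assumptions.

```python
from statistics import NormalDist


def two_sided_p_from_z(z: float) -> float:
    """Two-sided p-value for a standard normal test statistic."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def check_reported_p(z: float, reported_p: float, decimals: int = 3,
                     stat_decimals: int = 2, alpha: float = 0.05) -> dict:
    """Compare a reported p-value with the one implied by the reported z.

    Assumption: the reported z was rounded to `stat_decimals` places and the
    reported p to `decimals` places, so both get half a rounding step of slack.
    """
    half_step = 0.5 * 10 ** (-stat_decimals)
    # A smaller |z| gives a larger p, so the bounds swap direction.
    p_upper = two_sided_p_from_z(max(abs(z) - half_step, 0.0))
    p_lower = two_sided_p_from_z(abs(z) + half_step)
    tolerance = 0.5 * 10 ** (-decimals)

    recomputed_p = two_sided_p_from_z(z)
    consistent = (p_lower - tolerance) <= reported_p <= (p_upper + tolerance)
    # A decision error is an inconsistency that flips the significance verdict.
    decision_error = (not consistent) and (
        (reported_p < alpha) != (recomputed_p < alpha)
    )
    return {
        "recomputed_p": recomputed_p,
        "consistent": consistent,
        "decision_error": decision_error,
    }


if __name__ == "__main__":
    # Hypothetical reported results, not taken from the theses.
    print(check_reported_p(z=2.20, reported_p=0.28, decimals=2))
    print(check_reported_p(z=2.20, reported_p=0.02, decimals=2))
```

In the first hypothetical case the reported value is non-significant while the recomputed one is significant, so the inconsistency counts as a decision error. In the second, both values fall below α, so it is an inconsistency only.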
Coded characteristics.
| category | coded information |
|---|---|
| formal characteristics | year |
| | length |
| | thesis topic |
| classification | type: |
| | ‘non-empirical’, ‘qualitative’, ‘descriptive’, ‘exploratory’ (excluded) |
| | ‘inferential’ (included) |
| hypothesis | focal hypothesis (if there was no clear focal hypothesis, the first hypothesis was used) |
| variables | dependent variable |
| sample | sample size criterion: ‘none’, ‘power analysis’, ‘rule-of-thumb’ |
| | type of sample: ‘student’, ‘non-student, no further specification’, ‘employed’, ‘self-employed’, ‘children’ |
| | sample size |
| | mean age and standard deviation |
| analysis | test type: |
| | ‘test of categorical data ( |
| | ‘test of mean difference ( |
| | ‘analysis of variance ( |
| | ‘correlation’ |
| | ‘linear regression’ |
| | ‘non-parametric test’ |
| | ‘complex analysis’ |
| | test statistics (depending on method): |
| | reported test value, d.f., cell means and standard deviations, independent/dependent |
| | did the model include covariates: ‘yes’/‘no’ |
| | reported as: |
| | ‘exact, 3 decimals’ |
| | ‘exact, 2 decimals’ |
| | ‘significant threshold (e.g. <0.05)’ |
| | ‘non-significant threshold (e.g. >0.05)’ |
| | ‘missing’ |
| | reported |
| effect size | was an effect size reported for the focal hypothesis test: ‘yes’/‘no’ |
| | were effect sizes generally reported: ‘yes’/‘no’ |
| | type of effect size: ‘ |
| | reported effect size |
| reporting errors | reporting errors: ‘ |
| data use | data were also used in other thesis: ‘yes’/‘no’ |
| | focal hypothesis was the same: ‘yes’/‘no’ |
| exploration | further testing beyond hypothesis testing: ‘yes’/‘no’ |
| | exploration was reported in designated section: ‘yes’/‘no’ |
Figure 1. Violin plot of effect sizes.
Figure 2. Power plots. (a) Observed post hoc power of theses. (b) Median hypothetical power for various effect sizes from r = 0 to r = 1, i.e. calculated based on each occurring sample size and all possible effect sizes. Solid lines indicate the samples' power for conventional effect sizes of r = 0.10, r = 0.20 and r = 0.30; the dotted line represents the power for the median observed effect estimate of r = 0.19.
Figure 3. Distribution of 226 p-values that were recalculated with statcheck or, if not possible, reported as exact values. The non-included 24 p-values were either not reported (n = 4) or reported as smaller or larger than a common threshold (e.g. ‘<0.05’, n = 20), and could not be recalculated.
Figure 4. Correlation of sample size and effect size r.
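The median hypothetical power described for Figure 2(b) can be illustrated with a short sketch. It assumes a two-sided test of a zero correlation at α = 0.05 and uses the Fisher z approximation, which may differ from the procedure the authors actually used. The sample sizes are hypothetical placeholders rather than the sample sizes of the theses, and the function names are illustrative.

```python
import math
from statistics import NormalDist, median

_STD_NORMAL = NormalDist()


def correlation_power(r: float, n: int, alpha: float = 0.05) -> float:
    """Approximate power of a two-sided test of rho = 0 (Fisher z approximation)."""
    if n <= 3:
        raise ValueError("the Fisher z approximation needs n > 3")
    if abs(r) >= 1.0:
        return 1.0  # a perfect correlation is always detected
    # atanh(r) is the Fisher z transform; its standard error is 1 / sqrt(n - 3).
    shift = math.atanh(abs(r)) * math.sqrt(n - 3)
    z_crit = _STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)
    upper_tail = 1.0 - _STD_NORMAL.cdf(z_crit - shift)
    lower_tail = _STD_NORMAL.cdf(-z_crit - shift)
    return upper_tail + lower_tail


def median_power(r: float, sample_sizes: list[int], alpha: float = 0.05) -> float:
    """Median power across a set of studies for one assumed effect size r."""
    return median([correlation_power(r, n, alpha) for n in sample_sizes])


if __name__ == "__main__":
    # Hypothetical sample sizes, for illustration only.
    hypothetical_ns = [40, 60, 85, 120, 200]
    for effect in (0.10, 0.19, 0.20, 0.30):
        power = median_power(effect, hypothetical_ns)
        print(f"r = {effect:.2f}: median power = {power:.2f}")
```

Evaluating `median_power` over a grid of r values from 0 to 1 traces a curve of the kind the caption describes. At r = 0 the function returns α, as expected for a true null effect.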