Sabine Hoffmann, Felix Schönbrodt, Ralf Elsas, Rory Wilson, Ulrich Strasser, Anne-Laure Boulesteix.
Abstract
For a given research question, there is usually a large variety of possible analysis strategies that are acceptable according to the scientific standards of the field, and there are concerns that this multiplicity of analysis strategies plays an important role in the non-replicability of research findings. Here, we define a general framework on common sources of uncertainty arising in computational analyses that lead to this multiplicity, and apply this framework within an overview of approaches proposed across disciplines to address the issue. Armed with this framework, and a set of recommendations derived therefrom, researchers will be able to recognize strategies applicable to their field and use them to generate findings that are more likely to be replicated in future studies, ultimately improving the credibility of the scientific process.
Keywords: interdisciplinary perspective; metaresearch; open science; replicability crisis; uncertainty
Year: 2021 PMID: 33996122 PMCID: PMC8059606 DOI: 10.1098/rsos.201925
Source DB: PubMed Journal: R Soc Open Sci ISSN: 2054-5703 Impact factor: 2.963
Figure 1. The multiplicity of analysis strategies arising from data preprocessing, model and method choices to obtain an estimate of the parameter of interest θ and values of the outcome variable, for two research questions in epidemiology and hydroclimatology, respectively.
Figure 2. Sources of uncertainty in explanatory, mechanistic predictive and agnostic predictive modelling. Data preprocessing, parameter, model and method uncertainty are epistemic sources of uncertainty arising from a lack of knowledge in the specification of the analysis strategy. Measurement and sampling uncertainty are random sources of uncertainty that lead to variability in the results when the same analysis strategy is applied to different datasets. The model structure describes the association between the p input variables and the outcome of interest Y; θ is a parameter and e represents a probabilistic error term.
Description of the six sources of uncertainty arising in empirical research.
| source of uncertainty | description |
|---|---|
| measurement uncertainty | randomness arising from the operationalization or the measurement of the input and the output variables |
| data preprocessing uncertainty | uncertain decisions in the selection of the data to analyse and in the definition, the cleaning and the transformation of the input and the output variables |
| parameter uncertainty | uncertain decisions in the specification of input parameters |
| model uncertainty | uncertain decisions in the specification of the model structure to describe the phenomenon of interest |
| method uncertainty | uncertain decisions in the choice of a method and method settings |
| sampling uncertainty | randomness arising from the selection of a sample from a larger population of interest |
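The epistemic sources in the table above (data preprocessing, parameter, model and method uncertainty) each open up a set of defensible choices, and every combination of such choices is one possible analysis strategy. A minimal sketch of how quickly this multiplicity grows, using purely illustrative choice labels that are not taken from the paper:

```python
from itertools import product

# Hypothetical choices a researcher might face at each uncertain step
# of an analysis (illustrative labels only, not from the paper).
choices = {
    "preprocessing": ["drop outliers", "winsorize", "keep all"],
    "parameter":     ["default settings", "tuned settings"],
    "model":         ["linear", "log-linear", "spline"],
    "method":        ["maximum likelihood", "Bayesian"],
}

# Every combination of defensible choices is one analysis strategy;
# the full 'multiverse' is the Cartesian product of the choice sets.
strategies = list(product(*choices.values()))
print(len(strategies))  # 3 * 2 * 3 * 2 = 36 possible strategies
```

Even with only a handful of options per step, the number of acceptable strategies multiplies into the dozens, which is the combinatorial effect Figure 1 illustrates.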
Figure 3. The impact of random sources of uncertainty and of the multiplicity of possible analysis strategies on the replicability of research findings. The result of interest is the parameter θ in explanatory modelling, the outcome in mechanistic predictive modelling and the predictive performance in agnostic predictive modelling. The yellow colour represents the results of the chosen analysis strategy, a strategy selected because it presents the most 'favourable' results. The traditional confidence interval (the bars around the estimate 'x'), which accounts only for sampling uncertainty, fails to capture the true uncertainty in the estimate.
Figure 4. Overview of solutions to the replication crisis that address the multiplicity of analysis strategies by reducing, reporting, integrating or accepting uncertainty. For an interactive version of this graphic with assorted references, see https://shiny.psy.lmu.de/multiplicity/index.html.
Six steps researchers can take to make their research findings more replicable and credible.
| phase | step |
|---|---|
| before the analysis | (1) be aware of the multiplicity of possible analysis strategies |
| | (2) if possible, reduce sources of uncertainty in the study design |
| during the analysis | (3) if possible, integrate remaining sources of uncertainty into the analysis |
| | (4) report the results of alternative analysis strategies |
| after the analysis | (5) acknowledge the inherent uncertainty in your findings |
| | (6) publish all research code, data and material |
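Steps (3) and (4) can be sketched in miniature: instead of reporting the single most 'favourable' estimate, run every defensible strategy and report the full spread. The example below uses simulated data and three hypothetical preprocessing choices for handling extreme values; the function names and cutoffs are illustrative assumptions, not procedures from the paper.

```python
import random
import statistics

random.seed(1)
# Simulated outcome data with a few extreme values (illustrative only).
data = [random.gauss(10, 2) for _ in range(200)] + [40, 45, 50]

# Three defensible preprocessing choices for handling the extremes
# (hypothetical: names and the cutoff of 30 are assumptions).
def keep_all(xs):
    return xs

def drop_above_30(xs):
    return [x for x in xs if x <= 30]

def winsorize_at_30(xs):
    return [min(x, 30) for x in xs]

# Steps (3)/(4): run every strategy and report all estimates rather
# than selecting the single most 'favourable' one.
estimates = {f.__name__: statistics.mean(f(data))
             for f in (keep_all, drop_above_30, winsorize_at_30)}
for name, est in sorted(estimates.items()):
    print(f"{name}: {est:.2f}")

# The spread across strategies is a crude indicator of the uncertainty
# that a traditional confidence interval ignores (cf. Figure 3).
spread = max(estimates.values()) - min(estimates.values())
```

Reporting all three estimates together with the spread makes the dependence of the result on the preprocessing choice visible to the reader, which is the point of step (4).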