Statistical significance and its critics: practicing damaging science, or damaging scientific practice?

Deborah G Mayo, David Hand.

Abstract

While the common procedure of statistical significance testing and its accompanying concept of p-values have long been surrounded by controversy, renewed concern has been triggered by the replication crisis in science. Many blame statistical significance tests themselves, and some regard them as sufficiently damaging to scientific practice to warrant being abandoned. We take a contrary position, arguing that the central criticisms arise from misunderstanding and misusing the statistical tools, and that in fact the purported remedies themselves risk damaging science. We argue that banning the use of p-value thresholds in interpreting data does not diminish but rather exacerbates data-dredging and biasing selection effects. If an account cannot specify outcomes that will not be allowed to count as evidence for a claim (that is, if all thresholds are abandoned), then there is no test of that claim. The contributions of this paper are: (1) to explain the rival statistical philosophies underlying the ongoing controversy; (2) to elucidate and reinterpret statistical significance tests, and to explain how this reinterpretation ameliorates common misuses and misinterpretations; and (3) to argue that recent recommendations to replace, abandon, or retire statistical significance undermine a central function of statistics in science: to test whether observed patterns in the data are genuine or due to background variability.
© The Author(s) 2022.

Keywords:  Data-dredging; Error probabilities; Fisher; Neyman and Pearson; P-values; Statistical significance tests

Year:  2022        PMID: 35578622      PMCID: PMC9096069          DOI: 10.1007/s11229-022-03692-0

Source DB:  PubMed          Journal:  Synthese        ISSN: 0039-7857            Impact factor:   1.595

