David Trafimow, Valentin Amrhein, Corson N Areshenkoff, Carlos J Barrera-Causil, Eric J Beh, Yusuf K Bilgiç, Roser Bono, Michael T Bradley, William M Briggs, Héctor A Cepeda-Freyre, Sergio E Chaigneau, Daniel R Ciocca, Juan C Correa, Denis Cousineau, Michiel R de Boer, Subhra S Dhar, Igor Dolgov, Juana Gómez-Benito, Marian Grendar, James W Grice, Martin E Guerrero-Gimenez, Andrés Gutiérrez, Tania B Huedo-Medina, Klaus Jaffe, Armina Janyan, Ali Karimnezhad, Fränzi Korner-Nievergelt, Koji Kosugi, Martin Lachmair, Rubén D Ledesma, Roberto Limongi, Marco T Liuzza, Rosaria Lombardo, Michael J Marks, Gunther Meinlschmidt, Ladislas Nalborczyk, Hung T Nguyen, Raydonal Ospina, Jose D Perezgonzalez, Roland Pfister, Juan J Rahona, David A Rodríguez-Medina, Xavier Romão, Susana Ruiz-Fernández, Isabel Suarez, Marion Tegethoff, Mauricio Tejo, Rens van de Schoot, Ivan I Vankov, Santiago Velasco-Forero, Tonghui Wang, Yuki Yamada, Felipe C M Zoppino, Fernando Marmolejo-Ramos.
Abstract
We argue that making accept/reject decisions on scientific hypotheses, including a recent call for changing the canonical alpha level from p = 0.05 to p = 0.005, is deleterious for the finding of new discoveries and the progress of science. Given that blanket and variable alpha levels both are problematic, it is sensible to dispense with significance testing altogether. There are alternatives that address study design and sample size much more directly than significance testing does; but none of the statistical tools should be taken as the new magic method giving clear-cut mechanical answers. Inference should not be based on single studies at all, but on cumulative evidence from multiple independent studies. When evaluating the strength of the evidence, we should consider, for example, auxiliary assumptions, the strength of the experimental design, and implications for applications. To boil all this down to a binary decision based on a p-value threshold of 0.05, 0.01, 0.005, or anything else, is not acceptable.
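The point about binary decisions can be illustrated with a minimal Python sketch. It is not taken from the article: the study names and p-values are hypothetical, invented purely for illustration, and the helper functions `binary_decisions` and `threshold_dependent` are assumed names. The sketch only shows that the same evidence can receive opposite verdicts depending on which of the thresholds named in the abstract (0.05, 0.01, 0.005) is applied.

```python
# Hypothetical p-values, invented for illustration only (not from the article).
HYPOTHETICAL_P_VALUES = {
    "study_a": 0.003,
    "study_b": 0.008,
    "study_c": 0.03,
    "study_d": 0.06,
}

# The three thresholds mentioned in the abstract.
THRESHOLDS = (0.05, 0.01, 0.005)


def binary_decisions(p_values, thresholds):
    """Return, per study, the reject/retain verdict each threshold would give."""
    return {
        name: {alpha: ("reject" if p < alpha else "retain") for alpha in thresholds}
        for name, p in p_values.items()
    }


def threshold_dependent(decisions):
    """Return the studies whose verdict changes with the threshold chosen."""
    return [
        name
        for name, verdicts in decisions.items()
        if len(set(verdicts.values())) > 1
    ]


decisions = binary_decisions(HYPOTHETICAL_P_VALUES, THRESHOLDS)
for name, verdicts in decisions.items():
    print(name, verdicts)
print("verdict depends on threshold:", threshold_dependent(decisions))
```

In this made-up data, two of the four studies flip between "reject" and "retain" purely as a function of the cutoff, although the underlying evidence is unchanged. The sketch says nothing about which threshold, if any, is appropriate.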
Keywords: decision making; null hypothesis testing; p-value; significance testing; statistical significance
Year: 2018; PMID: 29867666; PMCID: PMC5962803; DOI: 10.3389/fpsyg.2018.00699
Source DB: PubMed; Journal: Front Psychol; ISSN: 1664-1078