Thomas V Perneger (1), Christophe Combescure (2). (1) Division of Clinical Epidemiology, Faculty of Medicine, University of Geneva, and Geneva University Hospitals, 6 rue Gabrielle-Perret-Gentil, CH-1211 Geneva, Switzerland. Electronic address: thomas.perneger@hcuge.ch. (2) Division of Clinical Epidemiology, Faculty of Medicine, University of Geneva, and Geneva University Hospitals, 6 rue Gabrielle-Perret-Gentil, CH-1211 Geneva, Switzerland.
Abstract
OBJECTIVES: Published P-values provide a window into the global enterprise of medical research. The aim of this study was to use the distribution of published P-values to estimate the relative frequencies of null and alternative hypotheses and to seek irregularities suggestive of publication bias.

STUDY DESIGN AND SETTING: This cross-sectional study included P-values published in 120 medical research articles in 2016 (30 each from the BMJ, JAMA, Lancet, and New England Journal of Medicine). The observed distribution of P-values was compared with expected distributions under the null hypothesis (i.e., uniform between 0 and 1) and the alternative hypothesis (strictly decreasing from 0 to 1). P-values were categorized according to conventional levels of statistical significance and in one-percent intervals.

RESULTS: Among 4,158 recorded P-values, 26.1% were highly significant (P < 0.001), 9.1% were moderately significant (P ≥ 0.001 to < 0.01), 11.7% were weakly significant (P ≥ 0.01 to < 0.05), and 53.2% were nonsignificant (P ≥ 0.05). We noted three irregularities: (1) a high proportion of P-values < 0.001, especially in observational studies, (2) an excess of P-values equal to 1, and (3) about twice as many P-values less than 0.05 compared with those more than 0.05. The last finding was seen in both randomized trials and observational studies, and in most types of analyses, except for heterogeneity tests and interaction tests. Under plausible assumptions, we estimate that about half of the tested hypotheses were null and the other half were alternative.

CONCLUSION: This analysis suggests that statistical tests published in medical journals are not a random sample of null and alternative hypotheses but that selective reporting is prevalent. In particular, significant results are about twice as likely to be reported as nonsignificant results.
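The comparison with the null distribution can be illustrated with a short calculation. The sketch below is not the authors' estimation procedure; it only uses the percentages reported in the abstract and the stated fact that P-values are uniform between 0 and 1 under the null hypothesis, so the expected share of each significance category, if every tested hypothesis were null, equals the width of that category. The names `observed`, `bounds`, `expected_under_null`, and `compare` are hypothetical and chosen for illustration.

```python
# Observed shares of the 4,158 recorded P-values, as reported in the abstract.
observed = {
    "P < 0.001": 0.261,
    "0.001 <= P < 0.01": 0.091,
    "0.01 <= P < 0.05": 0.117,
    "P >= 0.05": 0.532,
}

# Category bounds as [low, high). These follow the conventional significance
# levels named in the abstract.
bounds = {
    "P < 0.001": (0.0, 0.001),
    "0.001 <= P < 0.01": (0.001, 0.01),
    "0.01 <= P < 0.05": (0.01, 0.05),
    "P >= 0.05": (0.05, 1.0),
}


def expected_under_null(low, high):
    """Expected share of P-values in [low, high) when P is uniform on [0, 1]."""
    return high - low


def compare(observed, bounds):
    """Return (label, observed share, all-null expected share, ratio) per category."""
    rows = []
    for label, share in observed.items():
        low, high = bounds[label]
        expected = expected_under_null(low, high)
        rows.append((label, share, expected, share / expected))
    return rows


rows = compare(observed, bounds)
for label, obs, exp, ratio in rows:
    print(f"{label:<20} observed={obs:.3f} "
          f"expected_if_all_null={exp:.3f} ratio={ratio:.1f}")
```

Ratios far above 1 in the significant categories, and a ratio below 1 in the nonsignificant category, are what one would expect when a sizeable share of tested hypotheses are alternative or when significant results are preferentially reported; this baseline alone cannot separate the two explanations.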