Gary L Gadbury, David B Allison.
Abstract
Much has been written regarding p-values below certain thresholds (most notably 0.05) denoting statistical significance and the tendency of such p-values to be more readily publishable in peer-reviewed journals. Intuition suggests that there may be a tendency to manipulate statistical analyses to push a "near significant p-value" to a level that is considered significant. This article presents a method for detecting the presence of such manipulation (herein called "fiddling") in a distribution of p-values from independent studies. Simulations are used to illustrate the properties of the method. The results suggest that the method has low type I error and that power approaches acceptable levels as the number of p-values being studied approaches 1000.
Year: 2012 PMID: 23056287 PMCID: PMC3466248 DOI: 10.1371/journal.pone.0046363
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Pseudo-Code to Generate p-Values Under the Level 2 Null Hypothesis of No Fiddling.
| Pseudo-Code | Comments |
| Compute | |
| Compute μ = .8 | |
| Compute σ = .4 | |
| For i = 1 to |
|
| Compute B = Bernoulli( | |
| Compute Z = Normal(0,1) | |
| Compute λ = Normal(μ,σ) | |
| Compute T = B*Z+(1−B)*(Z+λ) | |
| Compute pi = 2*(1-CDF_Normal(0,1,|T|)) | This is for two-tailed testing. |
| Next i |
The code above was implemented in R (www.r-project.org).
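The pseudo-code above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' R implementation: the number of p-values and the Bernoulli probability are missing from the table as extracted, so `n` and `null_prob` are hypothetical parameters, and the values passed in the usage line are placeholders.

```python
import random
from statistics import NormalDist


def simulate_null_pvalues(n, null_prob, mu=0.8, sigma=0.4, seed=None):
    """Draw n two-tailed p-values with no fiddling.

    n and null_prob are NOT given in the extracted table; they are
    hypothetical parameters introduced for this sketch.
    """
    rng = random.Random(seed)
    std_normal = NormalDist(0.0, 1.0)
    pvalues = []
    for _ in range(n):
        # B = 1 marks a true null test, B = 0 a test with a shifted mean.
        b = 1 if rng.random() < null_prob else 0
        z = rng.gauss(0.0, 1.0)        # Z = Normal(0,1)
        lam = rng.gauss(mu, sigma)     # lambda = Normal(mu, sigma)
        t = b * z + (1 - b) * (z + lam)
        # Two-tailed p-value from the standard normal CDF.
        pvalues.append(2.0 * (1.0 - std_normal.cdf(abs(t))))
    return pvalues


# Placeholder inputs: 1000 p-values, half of them from true nulls.
pvals = simulate_null_pvalues(n=1000, null_prob=0.5, seed=1)
```

Setting `null_prob=1.0` makes every test a true null, in which case the p-values should be approximately uniform on [0, 1]; this is a convenient sanity check on the generator.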
Pseudo-Code to Generate p-Values When the Level 2 Null Hypothesis of No Fiddling is False.
| Pseudo-Code | Comments |
| Compute | |
| Compute μ = .8 | |
| Compute σ = .4 | |
| Compute ρ = .95 | |
| Compute α = .05 | |
| Compute ι = .025 | |
| Compute | |
| For i = 1 to |
|
| Compute B = Bernoulli( | |
| Compute Z = Normal(0,1) | |
| Compute λ = Normal(μ,σ) | |
| For r = 1 to | |
| Compute Zr = ρ^(1/2)*Z + (1−ρ)^(1/2)*Normal(0,1) | This formula presupposes that ρ is positive. |
| Compute Tr = B*Zr+(1−B)*(Zr+λ) | |
| Next r | |
| Compute Tmax = max(|T1|…|T | |
| Compute pi = 2*(1-CDF_Normal(0,1,|T1|)) | This is for two-tailed testing. |
| Compute pmin = 2*(1-CDF_Normal(0,1,|Tmax|)) | |
| If (pi>α AND pi≤α+ι) pi = pmin. | |
| Next i |
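A Python sketch of the fiddling pseudo-code above is given below, again as an illustration rather than the authors' code. The number of p-values, the Bernoulli probability, and the number of re-analyses per test are missing from the extracted table, so `n`, `null_prob`, and `n_reanalyses` are hypothetical parameters. The re-analysis statistic is read as √ρ·Z + √(1−ρ)·Normal(0,1), which has unit variance and gives any two re-analyses of the same test a correlation of ρ.

```python
import math
import random
from statistics import NormalDist


def simulate_fiddled_pvalues(n, null_prob, n_reanalyses, mu=0.8, sigma=0.4,
                             rho=0.95, alpha=0.05, iota=0.025, seed=None):
    """Draw n two-tailed p-values where near-significant ones are fiddled.

    n, null_prob and n_reanalyses are NOT given in the extracted table;
    they are hypothetical parameters introduced for this sketch.
    """
    if n_reanalyses < 1:
        raise ValueError("n_reanalyses must be at least 1")
    rng = random.Random(seed)
    std_normal = NormalDist(0.0, 1.0)

    def two_tailed(t):
        return 2.0 * (1.0 - std_normal.cdf(abs(t)))

    pvalues = []
    for _ in range(n):
        b = 1 if rng.random() < null_prob else 0
        z = rng.gauss(0.0, 1.0)
        lam = rng.gauss(mu, sigma)
        t_stats = []
        for _ in range(n_reanalyses):
            # Correlated re-analysis: shared component z plus fresh noise.
            z_r = math.sqrt(rho) * z + math.sqrt(1.0 - rho) * rng.gauss(0.0, 1.0)
            t_stats.append(b * z_r + (1 - b) * (z_r + lam))
        p = two_tailed(t_stats[0])                        # first analysis
        p_min = two_tailed(max(abs(t) for t in t_stats))  # best re-analysis
        # Fiddle only when the first p-value lands just above alpha.
        if alpha < p <= alpha + iota:
            p = p_min
        pvalues.append(p)
    return pvalues


# Placeholder inputs: 1000 p-values, half true nulls, 5 re-analyses each.
fiddled = simulate_fiddled_pvalues(n=1000, null_prob=0.5, n_reanalyses=5, seed=1)
```

With `n_reanalyses=1` the smallest p-value equals the first one, so the function reduces to the no-fiddling case. With more re-analyses, the interval (α, α+ι] is depleted because some of its p-values are replaced by smaller ones.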
Contingency Table Generated From the Simulation of Two Level 2 Null Data Sets With N Tests Each.
| | # p-values ∈ (0.05,0.075] | # p-values ∈ (0.075,0.1] |
| Null 1 | N11 | N12 |
| Null 2 | N21 | N22 |
Contingency Table Generated From the Simulation of a Level 2 Null Hypothesis Data Set and a Level 2 Alternative Hypothesis Data Set (Fiddling).
| | # p-values ∈ (0.05,0.075] | # p-values ∈ (0.075,0.1] |
| Null 1 | N11 | N12 |
| Alternative | N21 | N22 |
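Both contingency tables above have the same 2×2 layout, and the results tables in this record report a chi-square test and Fisher's test on such tables. The sketch below shows one way to compute both from the four cell counts using only the Python standard library. It assumes a Pearson chi-square statistic without continuity correction and a two-sided Fisher test; whether the authors used these exact variants is not stated here. The counts in the usage lines are made up for illustration.

```python
import math


def chi_square_2x2(n11, n12, n21, n22):
    """Pearson chi-square test (1 df, no continuity correction).

    Returns (statistic, p_value).
    """
    n = n11 + n12 + n21 + n22
    row1, row2 = n11 + n12, n21 + n22
    col1, col2 = n11 + n21, n12 + n22
    denom = row1 * row2 * col1 * col2
    if denom == 0:
        raise ValueError("every row and column must have a positive total")
    stat = n * (n11 * n22 - n12 * n21) ** 2 / denom
    # For 1 df, P(chi2 > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    return stat, math.erfc(math.sqrt(stat / 2.0))


def fisher_exact_2x2(n11, n12, n21, n22):
    """Two-sided Fisher exact test p-value.

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2 = n11 + n12, n21 + n22
    col1 = n11 + n21
    n = row1 + row2
    total = math.comb(n, col1)

    def prob(k):
        return math.comb(row1, k) * math.comb(row2, col1 - k) / total

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(n11)
    # Small relative tolerance so ties with the observed table are included.
    p = sum(prob(k) for k in range(lo, hi + 1) if prob(k) <= p_obs * (1 + 1e-7))
    return min(1.0, p)


# Made-up counts: row 1 = a null data set, row 2 = a possibly fiddled one.
stat, p_chi = chi_square_2x2(30, 28, 14, 31)
p_fisher = fisher_exact_2x2(30, 28, 14, 31)
```

A depleted first column in the second row (fewer p-values in (0.05, 0.075] relative to (0.075, 0.1]) is the pattern these tests are meant to flag.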
Results of a Testing Scenario for Evaluating Simulated Data.
| N | Type I Error (Chi-Sq Test) | Type I Error (Fisher’s Test) | Power (Chi-Sq Test) | Power (Fisher’s Test) |
| 400 | 0.0250 | 0.0326 | 0.1896 | 0.2308 |
| 600 | 0.0300 | 0.0384 | 0.3202 | 0.3632 |
| 800 | 0.0370 | 0.0456 | 0.4494 | 0.4882 |
| 1000 | 0.0356 | 0.0418 | 0.5354 | 0.5780 |
| 2000 | 0.0366 | 0.0402 | 0.8838 | 0.8952 |
Two tests were conducted to evaluate type I error and power for various sample sizes.
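Type I error and power figures like those in the table above are rejection rates over repeated simulations. The sketch below illustrates that bookkeeping only. It bins two sets of p-values into (0.05, 0.075] and (0.075, 0.1], tests the resulting 2×2 table, and reports the fraction of repetitions that reject. The two-sided chi-square test without continuity correction, the 0.05 rejection level, and the toy generators `uniform_set` and `depleted_set` are all assumptions made for illustration; they stand in for the level 2 null and fiddling simulations and are not the authors' procedure.

```python
import math
import random


def window_counts(pvalues):
    """Count p-values in (0.05, 0.075] and in (0.075, 0.1]."""
    low = sum(1 for p in pvalues if 0.05 < p <= 0.075)
    high = sum(1 for p in pvalues if 0.075 < p <= 0.1)
    return low, high


def chi_square_pvalue(n11, n12, n21, n22):
    """Pearson chi-square p-value (1 df, no continuity correction)."""
    n = n11 + n12 + n21 + n22
    denom = (n11 + n12) * (n21 + n22) * (n11 + n21) * (n12 + n22)
    if denom == 0:
        return 1.0  # degenerate table: treated here as no evidence
    stat = n * (n11 * n22 - n12 * n21) ** 2 / denom
    return math.erfc(math.sqrt(stat / 2.0))


def rejection_rate(draw_first, draw_second, n_reps, level=0.05):
    """Fraction of repetitions in which the 2x2 test rejects at `level`."""
    rejections = 0
    for _ in range(n_reps):
        n11, n12 = window_counts(draw_first())
        n21, n22 = window_counts(draw_second())
        if chi_square_pvalue(n11, n12, n21, n22) < level:
            rejections += 1
    return rejections / n_reps


rng = random.Random(0)


def uniform_set(n=2000):
    # Toy null data set: uniform p-values.
    return [rng.random() for _ in range(n)]


def depleted_set(n=2000):
    # Toy fiddled data set: each p-value in (0.05, 0.075] is pushed
    # below 0.05 with probability 0.5.
    out = []
    for p in uniform_set(n):
        if 0.05 < p <= 0.075 and rng.random() < 0.5:
            p = 0.04
        out.append(p)
    return out


# Null vs null estimates type I error; null vs fiddled estimates power.
type_one_error = rejection_rate(uniform_set, uniform_set, n_reps=300)
power = rejection_rate(uniform_set, depleted_set, n_reps=300)
```

Replacing the toy generators with simulations of the level 2 null and alternative would give estimates comparable in kind, though not necessarily in value, to the tabulated ones.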
Results of the Performance of Two Tests for Detecting Fiddling in a Distribution of p-Values.
| N | Type I Error (Test 1) | Type I Error (Test 2) | Power (Test 1) | Power (Test 2) |
| 400 | 0.0216 | 0.0216 | 0.4350 | 0.4350 |
| 600 | 0.0224 | 0.0220 | 0.6132 | 0.6132 |
| 800 | 0.0208 | 0.0198 | 0.7424 | 0.7416 |
| 1000 | 0.0200 | 0.0198 | 0.8222 | 0.8218 |
| 2000 | 0.0156 | 0.0152 | 0.9846 | 0.9842 |
Test 1 considers the total of N p-values and Test 2 considers only those in the interval (0.05, 0.1]. Type I error and power for the two tests are reported for various sample sizes.
Figure 1. Boxplots of OBJ and DIFF for the conditions of fiddling or no fiddling.
Boxplots compare the distributions (from 1000 simulations) of comparison statistics from mixture models fitted to a distribution of p-values for which no fiddling has occurred (i.e., a level 2 null distribution) and to a distribution of p-values for which fiddling did occur. OBJ is the objective function calculated at maximum likelihood estimates of parameters of the model, and DIFF is a difference in the fitted model to 0.05 versus 0.1.
Two-way Contingency Table for the Mixture Model Approach.
| | # | # |
| Expected | E1 | E2 |
| Actual |
|
|
Figure 2. Steps A–F for the simulation procedure as described in section 5.3.
The steps are repeated 1000 times for various sample sizes given in step H from section 5.3. For the final step I in section 5.3, the dashed line above for step B is redirected to instead connect the P.alt P-values to fitting the mixture model.
Results for the Scenario in Which the Mixture Model was Fit to the p-Values From P.null.
| N | Type I Error (Chi-Sq Test) | Type I Error (Fisher’s Test) | Power (Chi-Sq Test) | Power (Fisher’s Test) |
| 400 | 0.021 | 0.032 | 0.339 | 0.417 |
| 600 | 0.017 | 0.034 | 0.564 | 0.608 |
| 800 | 0.019 | 0.027 | 0.711 | 0.756 |
| 1000 | 0.022 | 0.029 | 0.791 | 0.817 |
| 2000 | 0.021 | 0.028 | 0.976 | 0.982 |
Type I error and power are reported for two different tests for contingency table data.
Results for the Scenario in Which the Mixture Model was Fit to the p-Values From P.alt.
| N | Type I Error (Chi-Sq Test) | Type I Error (Fisher’s Test) | Power (Chi-Sq Test) | Power (Fisher’s Test) |
| 400 | 0.017 | 0.024 | 0.343 | 0.416 |
| 600 | 0.019 | 0.030 | 0.578 | 0.635 |
| 800 | 0.021 | 0.031 | 0.716 | 0.758 |
| 1000 | 0.024 | 0.031 | 0.817 | 0.842 |
| 2000 | 0.021 | 0.028 | 0.980 | 0.982 |
Type I error and power are reported for various sample sizes for two different tests using contingency table data.