John Concato, John A Hartigan.
Abstract
A threshold probability value of 'p≤0.05' is commonly used in clinical investigations to indicate statistical significance. To allow clinicians to better understand evidence generated by research studies, this review defines the p value, summarizes the historical origins of the p value approach to hypothesis testing, describes various applications of p≤0.05 in the context of clinical research and discusses the emergence of p≤5×10⁻⁸ and other values as thresholds for genomic statistical analyses. Corresponding issues include a conceptual approach of evaluating whether data do not conform to a null hypothesis (ie, no exposure-outcome association). Importantly, and in the historical context of when p≤0.05 was first proposed, the 1-in-20 chance of a false-positive inference (ie, falsely concluding the existence of an exposure-outcome association) was offered only as a suggestion. In current usage, however, p≤0.05 is often misunderstood as a rigid threshold, sometimes with a misguided 'win' (p≤0.05) or 'lose' (p>0.05) approach. Also, in contemporary genomic studies, a threshold of p≤10⁻⁸ has been endorsed as a boundary for statistical significance when analyzing numerous genetic comparisons for each participant. A value of p≤0.05, or other thresholds, should not be employed reflexively to determine whether a clinical research investigation is trustworthy from a scientific perspective. Rather, and in parallel with conceptual issues of validity and generalizability, quantitative results should be interpreted using a combined assessment of strength of association, p values, CIs, and sample size.
Keywords: Biostatistics; Clinical Research; Research Design
Year: 2016 PMID: 27489256 PMCID: PMC5099183 DOI: 10.1136/jim-2016-000206
Source DB: PubMed Journal: J Investig Med ISSN: 1081-5589 Impact factor: 2.895
Examples of type of variables and selected statistical test(s)
Bivariate (unadjusted) analysis

| First variable (*) | Second variable (*) | Statistical test(s) |
|---|---|---|
| Binary (unpaired) | Binary | χ², Fisher's exact |
| Binary (unpaired) | Continuous | Student's t-test |
| Binary | ‘Moving’ binary (survival curves) | Log-rank |
| Continuous | Continuous | Correlation (r), linear regression |

| | | |
|---|---|---|
| Binary | Multiple logistic regression | Ordinal |
| Continuous | ANOVA; ANCOVA | – |
| Continuous | Multiple linear regression | Ordinal, binary |
| Integer count | Poisson regression | (contingency tables) |
| ‘Moving’ binary (eg, survival curve) | Proportional hazard function analysis (Cox regression) | – |

*Independent and dependent variables, when applicable, are not distinguished in this table. Paired indicates that the study design links (matches) particular participants in compared groups.
ANCOVA, analysis of covariance; ANOVA, analysis of variance.
Figure 1. In the first study ‘A’, with n=87, the relative risk is 2.5 (95% CI 0.99 to 6.5) and the p value is 0.062. In the second study ‘B’, with n=89, the relative risk is 2.7 (95% CI 1.1 to 7.0) and the p value is 0.037. The two studies are quite similar from an overall perspective, but ‘B’ is statistically significant, whereas ‘A’ is not.
Figure 2. p Values and CIs provide concordant information regarding statistical significance. A 95% CI that excludes the null value of one, as with a p value of ≤0.05, indicates a statistically significant result. A 95% CI that includes the null value of one, as with a p value of >0.05, indicates a statistically non-significant result. A p value of exactly 0.05 occurs when a CI ends at 1.0 and is considered statistically significant.
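
The concordance between CIs and p values can be illustrated with a minimal sketch that back-calculates an approximate p value from a reported 95% CI. It assumes the interval was constructed as a Wald interval on the log scale (estimate ± 1.96 SE of the log relative risk); that is an assumption for illustration, not necessarily how the published values were obtained, and the function names are hypothetical. Because the inputs are rounded and the original test may differ, the results will not reproduce the reported p values exactly, but they fall on the same side of 0.05.

```python
import math

def p_from_ratio_ci(estimate, lower, upper, z_crit=1.959964):
    """Approximate two-sided p value for a ratio measure from its 95% CI.

    Assumption: the CI is exp(log(estimate) +/- z_crit * SE), i.e. a Wald
    interval on the log scale, tested against a null ratio of 1.
    """
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(estimate) / se
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|))
    return math.erfc(abs(z) / math.sqrt(2))

def ci_excludes_null(lower, upper, null=1.0):
    """True when the whole interval lies on one side of the null value."""
    return lower > null or upper < null

# Relative risks and 95% CIs as reported for studies 'A' and 'B' in figure 1
studies = {"A": (2.5, 0.99, 6.5), "B": (2.7, 1.1, 7.0)}
for name, (rr, lo, hi) in studies.items():
    p = p_from_ratio_ci(rr, lo, hi)
    print(f"Study {name}: approx p = {p:.3f}, CI excludes 1: {ci_excludes_null(lo, hi)}")
```

Under these assumptions, the study whose CI includes 1 yields an approximate p value above 0.05, and the study whose CI excludes 1 yields one below 0.05.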
Examples of large and small quantitative differences, and corresponding p values (reference 25)

Large quantitative difference

| Outcome A=0.333 | Outcome B=0.250 | p Value |
|---|---|---|
| 1/3 | 1/4 | 0.81 |
| 10/30 | 10/40 | 0.45 |
| 100/300 | 100/400 | 0.02 |
| 1000/3000 | 1000/4000 | <0.0000001 |

Small quantitative difference

| Outcome A=0.288 | Outcome B=0.282 | p Value |
|---|---|---|
| 288/1000 | 282/1000 | 0.77 |
| 2880/10,000 | 2820/10,000 | 0.35 |
| 28,800/100,000 | 28,200/100,000 | 0.003 |

p Values calculated using the χ² test, for demonstration purposes.
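
The dependence of the p value on sample size, at a fixed difference in proportions, can be checked with a short sketch. It assumes a Pearson χ² test on a 2×2 table without continuity correction (the source states only that a χ² test was used, so the absence of a correction is an assumption), and the function name is hypothetical. With 1 degree of freedom the p value equals erfc(√(χ²/2)), so only the standard library is needed.

```python
import math

def chi2_p_two_proportions(events_a, n_a, events_b, n_b):
    """Pearson chi-square (1 df, no continuity correction) for two proportions.

    Returns (chi2, p). Assumption: uncorrected test; with very small counts
    (e.g. 1/3 vs 1/4) the chi-square approximation is unreliable and is used
    here for demonstration only.
    """
    a, b = events_a, n_a - events_a   # group A: events, non-events
    c, d = events_b, n_b - events_b   # group B: events, non-events
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper tail of a chi-square with 1 df equals the two-sided normal tail
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p

rows = [
    (1, 3, 1, 4), (10, 30, 10, 40), (100, 300, 100, 400), (1000, 3000, 1000, 4000),
    (288, 1000, 282, 1000), (2880, 10_000, 2820, 10_000), (28_800, 100_000, 28_200, 100_000),
]
for ea, na, eb, nb in rows:
    chi2, p = chi2_p_two_proportions(ea, na, eb, nb)
    print(f"{ea}/{na} vs {eb}/{nb}: chi2 = {chi2:.3f}, p = {p:.3g}")
```

Within each block of rows the observed proportions are identical; only the sample size changes, and the p value falls as the sample grows.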