| Literature DB >> 34106935 |
Otília Menyhart1,2, Boglárka Weltz2,3, Balázs Győrffy1,2,4.
Abstract
Scientists from nearly all disciplines face the problem of simultaneously evaluating many hypotheses. Conducting multiple comparisons increases the likelihood that a non-negligible proportion of associations will be false positives, clouding real discoveries. Drawing valid conclusions requires taking into account the number of performed statistical tests and adjusting the statistical confidence measures accordingly. Several strategies exist to overcome the problem of multiple hypothesis testing. We aim to summarize critical statistical concepts and widely used correction approaches while also drawing attention to frequently misinterpreted notions of statistical inference. We provide a step-by-step description of each multiple-testing correction method with clear examples and present an easy-to-follow guide for selecting the most suitable correction technique. To facilitate multiple-testing corrections, we developed a fully automated solution requiring neither programming skills nor the use of a command line. Our registration-free online tool is available at www.multipletesting.com and compiles the five most frequently used adjustment tools, including the Bonferroni, Holm (step-down), and Hochberg (step-up) corrections, and allows calculation of False Discovery Rates (FDR) and q-values. The current summary provides a much-needed practical synthesis of basic statistical concepts regarding multiple hypothesis testing in comprehensible language with well-illustrated examples. The web tool will fill the gap for life science researchers by providing a user-friendly substitute for command-line alternatives.
Year: 2021 PMID: 34106935 PMCID: PMC8189492 DOI: 10.1371/journal.pone.0245824
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.752
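The three family-wise error rate corrections named in the abstract (Bonferroni, Holm step-down, Hochberg step-up) can be sketched as adjusted p-value calculations. This is an independent illustration of the standard procedures, not the implementation used by www.multipletesting.com.

```python
def bonferroni(pvals):
    """Bonferroni: multiply each p-value by the number of tests, cap at 1."""
    n = len(pvals)
    return [min(1.0, n * p) for p in pvals]

def holm(pvals):
    """Holm step-down: rank p-values ascending, multiply the k-th smallest
    by (n - k + 1), and enforce monotonicity from the smallest p upward."""
    n = len(pvals)
    order = sorted(range(n), key=lambda i: pvals[i])
    adj = [0.0] * n
    running_max = 0.0
    for rank, i in enumerate(order):          # rank 0 = smallest p-value
        running_max = max(running_max, (n - rank) * pvals[i])
        adj[i] = min(1.0, running_max)
    return adj

def hochberg(pvals):
    """Hochberg step-up: same multipliers as Holm, but monotonicity is
    enforced from the largest p-value downward, giving a less conservative
    adjustment (valid under independence or positive dependence)."""
    n = len(pvals)
    order = sorted(range(n), key=lambda i: pvals[i])
    adj = [0.0] * n
    running_min = 1.0
    for rank in range(n - 1, -1, -1):         # start from the largest p-value
        i = order[rank]
        running_min = min(running_min, (n - rank) * pvals[i])
        adj[i] = min(1.0, running_min)
    return adj
```

For example, with p-values [0.01, 0.04, 0.03, 0.005], Holm yields [0.03, 0.06, 0.06, 0.02] while Hochberg yields [0.03, 0.04, 0.04, 0.02], illustrating that the step-up procedure is never more conservative than the step-down one.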
Formulas and explanations of statistical concepts.
| Statistical concept | Formula | Explanation |
|---|---|---|
| Family-wise error rate | αFW = 1 − (1 − αPC)^C | where C refers to the number of comparisons performed, and αPC refers to the per-contrast error rate, usually 0.05 |
| Bonferroni correction | p'i = n × pi ≤ α | the p-value of each test (pi) is multiplied by the number of performed statistical tests (n). If the corrected p-value (p'i) is lower than the significance level α (usually 0.05), the null hypothesis is rejected and the result is significant |
| Šidák correction | p'i = 1 − (1 − pi)^n ≤ α | where pi refers to the p-value of each test, and n refers to the number of performed statistical tests |
| False discovery rate | FDR = E[V / R] | where E denotes the expected value; V is the number of falsely rejected null hypotheses among all rejected tests (R), so the FDR is the expected proportion of incorrect discoveries |
| FDR at threshold t | FDR(t) = (π0 × m × t) / S(t) | where t represents a threshold between 0 and 1 under which p-values are considered significant, m is the total number of p-values (p1, p2, …, pm), π0 is the estimated proportion of true nulls (π0 = m0 / m), and S(t) is the number of all rejected hypotheses at t |
| FDR under dependence | FDRi ≤ (n × pi) / (Ri × c(n)) | where Ri is the rank of pi and c(n) is a function of the number of tests that depends on the correlation between the tests. If the tests are positively correlated, c(n) = 1 |
| Proportion of false positives | PFP = E(V) / E(R) | where V is the number of falsely rejected null hypotheses and R the number of all rejected tests; V and R are estimated individually |
| q-value | q(pi) = min over t ≥ pi of FDR(t) | the q-value is defined as the minimum FDR that can be achieved when calling that "feature" significant |
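The FDR(t) and q-value rows of the table combine into a short sketch. The proportion of true nulls π0 is taken as a given input here (in a real analysis it would be estimated, e.g. with Storey's λ-based method, which is omitted for brevity); with π0 = 1 the result coincides with the Benjamini-Hochberg adjusted p-value.

```python
def q_values(pvals, pi0=1.0):
    """q(pi) = min over t >= pi of FDR(t), with FDR(t) = pi0 * m * t / S(t).

    pi0 is assumed to be supplied by the caller; pi0 = 1.0 reduces this to
    the Benjamini-Hochberg adjusted p-value."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down: at t = p_(rank), S(t) = rank + 1
    # hypotheses are rejected, and the minimum over t >= pi is tracked.
    for rank in range(m - 1, -1, -1):
        i = order[rank]
        fdr_at_t = pi0 * m * pvals[i] / (rank + 1)
        running_min = min(running_min, fdr_at_t)
        q[i] = min(1.0, running_min)
    return q
```

For example, `q_values([0.005, 0.009, 0.05, 0.5])` gives q-values of 0.018 for both of the first two tests, because calling the second "feature" significant at t = 0.009 rejects two hypotheses and thus achieves a lower FDR than rejecting the first alone.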