Regina Brinster, Anna Köttgen, Bamidele O Tayo, Martin Schumacher, Peggy Sekula.
Abstract
BACKGROUND: When many (up to millions of) statistical tests are conducted in discovery set analyses such as genome-wide association studies (GWAS), approaches controlling the family-wise error rate (FWER) or the false discovery rate (FDR) are required to reduce the number of false positive decisions. Some methods were developed specifically for high-dimensional settings and partially rely on estimating the proportion of true null hypotheses. However, these approaches are also applied in low-dimensional settings such as replication set analyses, which might be restricted to a small number of specific hypotheses. The aim of this study was to compare different approaches in low-dimensional settings using (a) real data from the CKDGen Consortium and (b) a simulation study.
Keywords: False discovery rate; Low-dimensional setting; Q-value method; Simulation study
Year: 2018 PMID: 29499647 PMCID: PMC5833079 DOI: 10.1186/s12859-018-2081-x
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Statistical hypothesis test with possible test decisions related to the unknown truth (notation)
| | Test decision | | |
|---|---|---|---|
| | | | Total |
| Underlying truth | | | |
| | | | |
| Total | | | |
Algorithms of methods controlling family-wise error rate (FWER) and false discovery rate (FDR)
Let m be the number of hypotheses H1, …, Hm to test and p1, …, pm their respective m p-values. The p-values ranked in increasing order are defined as p(1) ≤ … ≤ p(m). The overall significance threshold is called α. Furthermore, let π̂0 be the estimated proportion of true null hypotheses.
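As a minimal sketch of this notation, the following Python shows one FWER procedure (Bonferroni correction) and one FDR procedure (Benjamini-Hochberg step-up) in their standard textbook form. The function names and the example p-values are hypothetical and are not taken from the study. The other procedures compared here (Hommel, Benjamini-Yekutieli, two-stage, q-value and LFDR methods) are not covered.

```python
def bonferroni(pvalues, alpha=0.05):
    """Reject H_i when p_i <= alpha / m (controls the FWER)."""
    m = len(pvalues)
    return [p <= alpha / m for p in pvalues]


def benjamini_hochberg(pvalues, alpha=0.05):
    """Step-up BH: find the largest k with p_(k) <= k * alpha / m and
    reject the hypotheses with the k smallest p-values (controls the FDR)."""
    m = len(pvalues)
    # Indices of the hypotheses, ordered by increasing p-value.
    order = sorted(range(m), key=lambda i: pvalues[i])
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank * alpha / m:
            k_max = rank
    # Decisions are returned in the original order of the hypotheses.
    reject = [False] * m
    for idx in order[:k_max]:
        reject[idx] = True
    return reject


# Hypothetical p-values for m = 8 hypotheses (illustration only).
pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
print(sum(bonferroni(pvals)), sum(benjamini_hochberg(pvals)))
```

For these made-up values, Bonferroni compares every p-value with α / m = 0.00625, whereas Benjamini-Hochberg lets the threshold grow with the rank, so it can reject at least as many hypotheses.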
Fig. 1 CKDGen data example – Number of significant p-values (regions) in replication set. Applied procedures controlling the type I error: Bonferroni correction (BO), Hommel’s procedure (HO), Benjamini-Yekutieli’s procedure (BY), Strimmer’s LFDR method (LFDR), Benjamini-Hochberg’s procedure (BH), Two-stage procedure (TSBH), Strimmer’s q-value method (qv Str), Storey’s q-value method (qv Sto). Results are ordered by number of significant p-values leading to a separation of FDR methods from FWER methods (indicated by dashed line). Additional significant p-values from one approach to another are indicated by decreasing gray shades within the bars
Fig. 2 Simulation – Number of repetitions with at least 1 false positive decision and average specificity for π0 = 100% (a). Average power and specificity for β1 = 2.5 and π0 = 75% (b), 50% (c), 25% (d). Applied procedures controlling the type I error: Bonferroni correction, Hommel’s procedure, Benjamini-Hochberg’s procedure, Two-stage procedure, Benjamini-Yekutieli’s procedure, Storey’s q-value method, Strimmer’s q-value method, Strimmer’s LFDR method. Power is defined as the proportion of correctly rejected hypotheses and specificity as the proportion of correctly maintained hypotheses. Both proportions potentially range from 0 to 1. Simulations for each scenario were repeated 100 times
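The two performance measures in this caption can be sketched for a single simulation repetition as follows. This is an illustration under assumptions: the denominators are taken to be the number of false null hypotheses (power) and the number of true null hypotheses (specificity), and all names and example vectors are hypothetical.

```python
def power_and_specificity(reject, is_null):
    """reject[i]: H_i was rejected; is_null[i]: H_i is truly a null hypothesis.

    Assumed denominators: false nulls for power, true nulls for specificity.
    """
    n_null = sum(is_null)
    n_alt = len(is_null) - n_null
    correctly_rejected = sum(r and not h for r, h in zip(reject, is_null))
    correctly_maintained = sum((not r) and h for r, h in zip(reject, is_null))
    # Power is undefined when every hypothesis is a true null (pi0 = 100%).
    power = correctly_rejected / n_alt if n_alt else float("nan")
    specificity = correctly_maintained / n_null if n_null else float("nan")
    return power, specificity


def any_false_positive(reject, is_null):
    """True if at least one true null hypothesis was rejected."""
    return any(r and h for r, h in zip(reject, is_null))


# Hypothetical repetition with m = 8 hypotheses and pi0 = 50%.
reject = [True, True, False, False, True, False, False, False]
is_null = [False, False, False, False, True, True, True, True]
print(power_and_specificity(reject, is_null), any_false_positive(reject, is_null))
```

Averaging the first function over repetitions would give quantities of the kind plotted in panels (b) to (d). Counting repetitions where the second returns true corresponds to the count in panel (a).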
Fig. 3 Simulation – Observed estimates of π0 for Storey’s (qv) and Strimmer’s q-value methods (fdr) for π0 = 100% (a) and for β1 = 2.5 and π0 = 75% (b), 50% (c), 25% (d)
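To make the idea of estimating π0 concrete, here is a minimal sketch of Storey's estimator at a single fixed tuning value λ: the number of p-values above λ divided by m(1 − λ), capped at 1. The fixed λ = 0.5, the function name and the example p-values are assumptions for illustration. The software used in the study may choose λ differently (for example by smoothing over a grid), and Strimmer's estimator is not shown.

```python
def storey_pi0(pvalues, lam=0.5):
    """Storey-type estimate of the proportion of true null hypotheses.

    p-values of true nulls are roughly uniform on [0, 1], so about
    pi0 * m * (1 - lam) of them are expected to exceed lam.
    """
    m = len(pvalues)
    n_above = sum(p > lam for p in pvalues)
    return min(1.0, n_above / (m * (1.0 - lam)))


# Hypothetical p-values for m = 8 hypotheses (illustration only).
pvals = [0.002, 0.01, 0.03, 0.2, 0.45, 0.55, 0.7, 0.9]
print(storey_pi0(pvals))  # 3 p-values above 0.5 -> 3 / (8 * 0.5) = 0.75
```

With m = 8 and λ = 0.5, this estimate can only take the values 0, 0.25, 0.5, 0.75 and 1, which illustrates how coarse such an estimate becomes when only a few hypotheses are tested.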