Jennifer Listgarten, Christoph Lippert, Eun Yong Kang, Jing Xiang, Carl M Kadie, David Heckerman.
Abstract
MOTIVATION: Approaches for testing sets of variants, such as a set of rare or common variants within a gene or pathway, for association with complex traits are important. In particular, set tests allow for aggregation of weak signal within a set, can capture interplay among variants and reduce the burden of multiple hypothesis testing. Until now, these approaches did not address confounding by family relatedness and population structure, a problem that is becoming more important as larger datasets are used to increase power.
Year: 2013 PMID: 23599503 PMCID: PMC3673214 DOI: 10.1093/bioinformatics/btt177
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
Type I error estimates for FaST-LMM-Set using one million tests across various levels of significance, α
| Significance level | | | |
|---|---|---|---|
| FaST-LMM-Set | | | |
| Non-truncated ML | | | |
The first row shows results for our LRT-based method; the second row (‘non-truncated ML’) shows results when fitting the null distribution parameters using maximum likelihood with all test statistics; the third row shows results using a null distribution. Results significantly different from expected according to the binomial test (P < 0.05) are denoted with an asterisk.
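The calibration check described in this caption can be sketched in a few lines. This is a minimal illustration, not the authors' code: it assumes the null P-values are available as a plain list, and the helper names `binom_log_pmf`, `binom_two_sided` and `type1_error` are hypothetical. It estimates the type I error at a level α, then uses an exact two-sided binomial test to ask whether the observed number of false positives differs from the count expected under a calibrated null.

```python
import math


def binom_log_pmf(k, n, p):
    """Log of the Binomial(n, p) probability mass at k, for 0 < p < 1."""
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log1p(-p))


def binom_two_sided(k, n, p):
    """Exact two-sided P-value: total mass of outcomes no more likely than k."""
    cutoff = binom_log_pmf(k, n, p) + 1e-9  # small slack for floating-point ties
    total = 0.0
    for i in range(n + 1):
        log_pmf = binom_log_pmf(i, n, p)
        if log_pmf <= cutoff:
            total += math.exp(log_pmf)
    return min(1.0, total)


def type1_error(null_pvalues, alpha):
    """Return (false positives, empirical type I error, binomial P-value)."""
    n = len(null_pvalues)
    hits = sum(1 for pv in null_pvalues if pv < alpha)
    # Under a calibrated null, hits ~ Binomial(n, alpha).
    return hits, hits / n, binom_two_sided(hits, n, alpha)
```

The explicit loop over all outcomes keeps the sketch dependency-free but scales linearly with the number of tests; for one million tests a library routine such as `scipy.stats.binomtest` would be the more practical choice.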
Fig. 1. Quantile–quantile plot of observed and expected log10 P-values on the null-only WTCCC datasets (same data as used for Table 1) for FaST-LMM-Set. Dashed red error bars denote the 99% confidence interval around the solid red diagonal. Points shown are for null-only data (generated by permuting individuals in the SNPs to be tested; see Section 2) and only for the non-unity P-values (those assumed to belong to the non-zero degree of freedom component of the null distribution). The portion of the expected distribution of P-values shown is uniform on the interval [,1], where is the estimated mixing weight in the null distribution.
Power experiments
| α | LRT | Score | P-value (binomial test) |
|---|---|---|---|
| | 44 | 26 | 0.03 |
| | 60 | 39 | 0.03 |
| | 172 | 138 | 0.05 |
| | 556 | 509 | 0.14 |
| | 2419 | 2195 | 0.0009 |
Number of tests with P-values less than α. The last column shows the results of a binomial test comparing the number of tests found by LRT as compared with the score test. The first row denotes the Bonferroni threshold for the WTCCC dataset.
Validation of methods on WTCCC Crohn’s disease
| Method | In meta-analysis | Supported by literature | No support found |
|---|---|---|---|
| FaST-LMM-Set | 15 | 1 | 0 |
| FaST-LMM-Set-Score | 7 | 0 | 0 |
| FaST-LMM-Set (uncorrected) | 17 | 3 | 6 |
FaST-LMM-Set denotes our newly developed method, which corrects for confounding and uses our LRT approach; FaST-LMM-Set (uncorrected) is the same but does not correct for confounding with a second variance component; FaST-LMM-Set-Score is the same as FaST-LMM-Set but uses a score test (as described in Section 2) instead of an LRT. Columns: ‘in meta-analysis’ shows the number of significant sets validated by a meta-analysis (Franke ); ‘supported by literature’ denotes the number of significant sets found by a literature search; ‘no support found’ denotes the number of sets supported neither by the meta-analysis nor a literature search.
Genomic control factor, λ, of univariate tests for confounding-corrected and naïve methods
| Method | GAW14 | WTCCC |
|---|---|---|
| Uncorrected | 3.80 | 1.30 |
| FaST-LMM | 1.01 | 1.08 |
FaST-LMM denotes a one-component (to correct for confounding) LMM, testing one SNP fixed effect (Listgarten ); Uncorrected refers to no correction for confounding (linear regression).
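Assuming the statistic tabulated above is the genomic control factor λ (the usual summary of test-statistic inflation), the following sketch shows one common way to compute it from a list of univariate P-values. The function name `genomic_control_lambda` is hypothetical and the code is not taken from the paper. Each P-value is converted back to a 1 degree-of-freedom chi-square statistic, and the median statistic is divided by the median expected under the null (about 0.455). Values near 1 suggest a calibrated test; values well above 1 suggest inflation, for example from uncorrected confounding.

```python
from statistics import NormalDist, median

# Median of a chi-square distribution with 1 degree of freedom (about 0.455).
CHI2_1_MEDIAN = NormalDist().inv_cdf(0.75) ** 2


def genomic_control_lambda(pvalues):
    """Median observed 1-df chi-square statistic over its expected median.

    P-values must lie in (0, 1].
    """
    norm = NormalDist()
    # A two-sided P-value p corresponds to the statistic x = Phi^{-1}(p / 2)^2.
    stats = [norm.inv_cdf(pv / 2.0) ** 2 for pv in pvalues]
    return median(stats) / CHI2_1_MEDIAN
```

Using the lower tail `p / 2` rather than `1 - p / 2` gives the same squared quantile while avoiding loss of precision for very small P-values.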
Pearson correlation of log10(P)-values with set size
| Dataset | FaST-LMM-Set (uncorrected) | FaST-LMM-Set |
|---|---|---|
| GAW14 | 0.001 (0.98) | |
| WTCCC | 0.025 ( |
FaST-LMM-Set denotes our newly developed method; FaST-LMM-Set (uncorrected) is the same but does not correct for confounding with a second variance component. The P-value is reported in parentheses next to each correlation coefficient. Significant entries are bolded. We excluded P-values from the zero degree-of-freedom component of our one-sided test, as their inclusion would violate the assumptions of the Pearson correlation test.
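The set-size check described here can be illustrated with a short sketch. It is an assumption-laden example rather than the authors' procedure: the function names `pearson_r` and `logp_size_correlation` are hypothetical, and P-values from the zero degree-of-freedom component are assumed to be exactly 1, so they are dropped with a simple `pv < 1.0` filter before log10(P) is correlated with set size.

```python
import math


def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def logp_size_correlation(pvalues, set_sizes):
    """Correlate log10(P) with set size, skipping the P = 1 point mass."""
    kept = [(math.log10(pv), size)
            for pv, size in zip(pvalues, set_sizes) if pv < 1.0]
    logp, sizes = zip(*kept)
    return pearson_r(logp, sizes)
```

This returns only the coefficient; a routine such as `scipy.stats.pearsonr` also reports the accompanying P-value.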