Genevieve L Wojcik, W H Linda Kao, Priya Duggal.
Abstract
BACKGROUND: Despite the success of genome-wide association studies (GWAS), "missing heritability" remains for many traits. One contributing factor may be that markers are examined one at a time rather than as groups that are biologically meaningful in aggregate. To address this problem, a variety of gene- and pathway-level methods have been developed to identify putative biologically relevant associations. A simulation was conducted to systematically assess the performance of these methods. Using genetic data from 4,500 individuals in the Wellcome Trust Case Control Consortium (WTCCC), case-control status was simulated based on an additive polygenic model. Gene-level methods were evaluated on their sensitivity, specificity, and proportion of false positives. Pathway-level methods were evaluated on the relationship between the proportion of causal genes within the pathway and the strength of association.
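As a rough illustration of what simulating case-control status under an additive polygenic model can look like, the sketch below sums per-allele effects on the log-odds scale and draws a binary status. The logistic link, the `baseline_log_odds` parameter, and the function name `simulate_status` are assumptions made for this example; the abstract does not describe the simulation at this level of detail.

```python
import math
import random

def simulate_status(genotypes, log_odds_ratios, baseline_log_odds=0.0, seed=0):
    """Draw case (1) / control (0) labels from an additive logistic model.

    genotypes: one list per individual of risk-allele counts (0, 1 or 2)
    log_odds_ratios: per-SNP log odds ratios; 0.0 marks a non-causal SNP
    (the logistic link and the baseline are assumptions of this sketch)
    """
    rng = random.Random(seed)
    statuses = []
    for person in genotypes:
        # Additive model: every copy of a risk allele adds its log(OR).
        logit = baseline_log_odds + sum(
            g * b for g, b in zip(person, log_odds_ratios)
        )
        p_case = 1.0 / (1.0 + math.exp(-logit))
        statuses.append(1 if rng.random() < p_case else 0)
    return statuses

# Toy usage: three individuals, two SNPs, only the first SNP is causal.
toy_genotypes = [[0, 1], [1, 2], [2, 0]]
toy_effects = [math.log(1.5), 0.0]
print(simulate_status(toy_genotypes, toy_effects, baseline_log_odds=-1.0))
```

Non-causal SNPs carry a log odds ratio of zero, so they contribute nothing to the risk score and any association they show is due to chance or correlation with causal SNPs.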
Year: 2015 PMID: 25887572 PMCID: PMC4391470 DOI: 10.1186/s12863-015-0191-2
Source DB: PubMed Journal: BMC Genet ISSN: 1471-2156 Impact factor: 2.797
Performance of gene-level methods
| Method | Sensitivity (%) | Specificity (%) | False positives (%) | False negatives (%) |
|---|---|---|---|---|
| Fisher | 59.18 | 88.64 | 5.89 | 40.82 |
| Sidak | 18.37 | 97.73 | 0.11 | 81.63 |
| Simes | 46.94 | 97.73 | 1.33 | 53.06 |
| FDR | 24.49 | 97.73 | 0.13 | 75.51 |
| | | | | |
| TPM | 63.04 | 92.86 | 4.93 | 36.96 |
| GATES | 24.49 | 98.00 | 0.17 | 75.51 |
| WGATES | 26.53 | 98.00 | 0.16 | 73.47 |
| HYST | 24.49 | 98.00 | 0.16 | 75.51 |
| WHYST | 24.49 | 98.00 | 0.16 | 75.51 |
| | | | | |
| VEGAS | 20.41 | 100.0 | 0.16 | 79.59 |
| VEGAS [10%] | 28.57 | 98.00 | 0.40 | 71.43 |
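Several of the gene-level methods listed above combine the SNP P-values within a gene into a single gene-level P-value. The sketch below shows the textbook forms of the Fisher, Sidak, and Simes combinations, assuming independent SNPs and P-values in (0, 1]. The function names are illustrative, and the software evaluated in the study may differ, for example by adjusting for linkage disequilibrium.

```python
import math

def fisher_combined(pvalues):
    """Fisher: -2 * sum(ln p) is chi-square with 2k df under the null."""
    k = len(pvalues)
    half_stat = -sum(math.log(p) for p in pvalues)  # test statistic / 2
    # The chi-square survival function has a closed form for even df = 2k:
    # exp(-x/2) * sum_{j=0}^{k-1} (x/2)^j / j!
    term, total = 1.0, 1.0
    for j in range(1, k):
        term *= half_stat / j
        total += term
    return math.exp(-half_stat) * total

def sidak_combined(pvalues):
    """Sidak: correct the smallest P-value for k independent tests."""
    k = len(pvalues)
    return 1.0 - (1.0 - min(pvalues)) ** k

def simes_combined(pvalues):
    """Simes: minimum of k * p_(i) / i over the sorted P-values."""
    k = len(pvalues)
    ranked = sorted(pvalues)
    return min(k * p / rank for rank, p in enumerate(ranked, start=1))

# Toy gene with three SNP P-values (made-up numbers).
snp_pvalues = [0.01, 0.04, 0.5]
print(fisher_combined(snp_pvalues))
print(sidak_combined(snp_pvalues))
print(simes_combined(snp_pvalues))
```

Fisher pools evidence across all SNPs, whereas Sidak looks only at the smallest P-value, so the two can rank the same gene quite differently.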
Stratified sensitivities by effect sizes and number of causal SNPs under simulation
| Method | | | | | |
|---|---|---|---|---|---|
| Fisher | 66% | 17% | 50% | 64% | 60% |
| Sidak | 18% | 33% | 12% | 18% | 20% |
| Simes | 50% | 17% | 50% | 50% | 45% |
| FDR | 27% | 17% | 25% | 27% | 25% |
| | | | | | |
| TPM | 68% | 20% | 57% | 63% | 65% |
| GATES | 25% | 17% | 12% | 18% | 35% |
| GATES [Weighted] | 27% | 17% | 25% | 18% | 30% |
| HYST | 25% | 17% | 12% | 18% | 40% |
| Weighted GATES/HYST | 25% | 17% | 12% | 18% | 35% |
| | | | | | |
| VEGAS | 23% | 17% | 0% | 27% | 25% |
| VEGAS [10%] | 32% | 17% | 0% | 32% | 40% |
Sensitivity and specificity calculated using subset of 49 true positive and 50 true negative genes. False positive and false negative percentages calculated using entire dataset of ~17,000 genes.
*OR = Odds Ratio.
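The footnote on sensitivity and specificity can be made concrete with a small calculation. The counts below are not reported in the source. They are assumptions chosen because they reproduce two tabulated values: 29 detected out of 49 true-positive genes matches the 59.18% sensitivity listed for Fisher, and 49 correctly rejected out of 50 true-negative genes matches the 98.00% specificity listed for GATES.

```python
def sensitivity(true_positives, false_negatives):
    """Share of truly associated genes that the method flags."""
    return true_positives / (true_positives + false_negatives)

def specificity(true_negatives, false_positives):
    """Share of truly null genes that the method leaves unflagged."""
    return true_negatives / (true_negatives + false_positives)

# Assumed counts, inferred from the percentages rather than reported.
print(round(100 * sensitivity(29, 20), 2))  # 59.18
print(round(100 * specificity(49, 1), 2))   # 98.0
```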
Correlation for pathway-level results between rankings within each method and the proportion of associated genes within the pathway using only the 10 larger pathways evaluated, as well as correlation with mean ranking across all programs
| Method | | Input data | Correlation with proportion of associated genes | Correlation with mean ranking across programs |
|---|---|---|---|---|
| ALIGATOR | [ | SNP P-values | −0.6 (−0.89, 0.05) | 0.75 (0.22, 0.94) |
| GenGen | [ | SNP P-values | −0.64 (−0.91, −0.02) | 0.82 (0.41, 0.96) |
| GSA-SNP | [ | SNP P-values | −0.59 (−0.89, 0.06) | 0.78 (0.28, 0.94) |
| GSEA-SNP | [ | Raw Genotypes | −0.6 (−0.89, 0.04) | 0.73 (0.18, 0.93) |
| MAGENTA | [ | SNP P-values | −0.63 (−0.9, 0) | 0.9 (0.62, 0.98) |
| MGFM | [ | SNP P-values | −0.53 (−0.87, 0.15) | 0.7 (0.13, 0.92) |
| SRT | [ | Raw Genotypes | −0.43 (−0.84, 0.27) | 0.55 (−0.12, 0.88) |
| | | | | |
| GRASS | [ | Raw Genotypes | −0.49 (−0.86, 0.2) | 0.53 (−0.15, 0.87) |
| HYST | [ | SNP P-values | −0.57 (−0.88, 0.09) | 0.84 (0.46, 0.96) |
| PST | [ | Raw Genotypes | −0.26 (−0.76, 0.44) | 0.55 (−0.12, 0.88) |
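The source does not say how the intervals in parentheses were computed. As a hedged check, the sketch below applies the standard Fisher z-transformation with n = 10 (the number of pathways named in the caption) and a 95% level, both of which are assumptions. The function name `fisher_z_interval` is illustrative.

```python
import math

def fisher_z_interval(r, n, z_crit=1.959964):
    """Approximate confidence interval for a correlation via Fisher's z.

    r: correlation coefficient, n: number of paired observations,
    z_crit: standard normal quantile (1.959964 gives a 95% interval)
    """
    z = math.atanh(r)
    se = 1.0 / math.sqrt(n - 3)
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)

lower, upper = fisher_z_interval(-0.6, 10)
print(round(lower, 2), round(upper, 2))  # -0.89 0.05
```

This reproduces the interval tabulated for ALIGATOR. Because the tabulated coefficients are themselves rounded, other rows may differ from this calculation in the last digit.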
Figure 1. Heatmap of results for pathway-level methods by the proportion of associated genes within the gene sets. The results are P-values for all pathways using the methods for a complete assessment of performance. Pathways with similar performances cluster together along the y-axis, as indicated by the dendrogram. The proportion of associated genes (at least one SNP with P < 0.01) is indicated along the x-axis from left (0%) to right (33%). Color intensity reflects signal strength (lower P-values), which increases with the proportion of associated genes for most methods.