Jaehoon Lee, Soyeon Ahn, Sohee Oh, Bruce Weir, Taesung Park.
Abstract
BACKGROUND: Current genome-wide association (GWA) analysis focuses mainly on single genetic variants, and so may miss variants that have small individual effects but large joint effects. Considering multiple SNPs jointly in GWA analysis can increase power. When multiple SNPs are considered jointly, the corresponding SNP-level association measures are likely to be correlated because of linkage disequilibrium (LD) among the SNPs.
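The effect of LD-induced correlation on a combined statistic can be illustrated with a minimal sketch. It is not the authors' SNP-PRAGE method. It assumes jointly normal SNP-level z-scores with unit variance and a hypothetical correlation model in which the correlation between SNPs i and j decays as rho^|i-j|. The function names and the decay model are illustrative assumptions. Under these assumptions the variance of the summed z-scores is the sum of all entries of the correlation matrix, so dividing the sum by sqrt(m) as if the SNPs were independent gives a miscalibrated test.

```python
import math

def ar1_correlation(m, rho):
    """Hypothetical LD model (an assumption): corr(SNP i, SNP j) = rho**|i-j|."""
    return [[rho ** abs(i - j) for j in range(m)] for i in range(m)]

def variance_of_sum(corr):
    """Variance of a sum of unit-variance scores: the sum of all correlations."""
    return sum(sum(row) for row in corr)

def naive_false_positive_rate(m, rho, z_crit=1.959964):
    """Null rejection rate of sum(z)/sqrt(m) at a nominal two-sided 5% level.

    The naive statistic assumes standard deviation 1, but under correlation
    its true standard deviation is sqrt(variance_of_sum / m).
    """
    true_sd = math.sqrt(variance_of_sum(ar1_correlation(m, rho)) / m)
    # P(|N(0, true_sd^2)| > z_crit) = erfc(z_crit / (true_sd * sqrt(2)))
    return math.erfc(z_crit / (true_sd * math.sqrt(2)))

# Independent SNPs recover the nominal level; positive correlation inflates it.
independent = naive_false_positive_rate(10, 0.0)
correlated = naive_false_positive_rate(10, 0.5)
```

With `rho = 0` the rate equals the nominal 0.05, and any positive `rho` pushes it above 0.05. This shows one mechanism only and is not meant to reproduce the simulation results reported in this record.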
Year: 2011 PMID: 22784568 PMCID: PMC3287477 DOI: 10.1186/1752-0509-5-S2-S11
Source DB: PubMed Journal: BMC Syst Biol ISSN: 1752-0509
Figure 1. Distribution of gene-level measures by gene size for the hypertension data from the Korean population. The x-axis is gene size (the number of SNPs within the gene) and the y-axis is the mean of the gene-level summaries for genes of the same size.
Figure 2. Variance of the gene-level measure across gene sets. In the left plot (a), the x-axis is gene set size (the number of genes in the gene set) and the y-axis is the sample variance of the gene-level summaries in the gene set. The right plot (b) shows a boxplot of the variance of the gene-level measures. The red line represents the total sample variance in the data.
KARE result: top 5 gene sets with smallest q-value associated with hypertension phenotype from Z-statistic method
| Gene set | No. genes | No. SNPs | p-value | q-value |
|---|---|---|---|---|
| ST_JNK_MAPK_PATHWAY | 36 | 2410 | 1.13 × 10⁻⁴ | 6.38 × 10⁻² |
| HSA00563_GLYCOSYLPHOSPHATIDYLINOSITOL_ANCHOR_BIOSYNTHESIS | 18 | 700 | 2.67 × 10⁻⁴ | 6.57 × 10⁻² |
| FASPATHWAY | 28 | 1489 | 8.82 × 10⁻⁴ | 1.44 × 10⁻¹ |
| HSA05060_PRION_DISEASE | 117 | 762 | 2.42 × 10⁻³ | 2.53 × 10⁻¹ |
| HSA04520_ADHERENS_JUNCTION | 64 | 4150 | 2.58 × 10⁻³ | 2.53 × 10⁻¹ |
KARE result: top 5 gene sets with smallest q-value associated with hypertension phenotype from SNP-PRAGE
| Gene set | No. genes | No. SNPs | p-value | q-value |
|---|---|---|---|---|
| ST_JNK_MAPK_PATHWAY | 36 | 1701 | 2.40 × 10⁻⁵ | 9.48 × 10⁻³ |
| ST_ERK1_ERK2_MAPK_PATHWAY | 24 | 1765 | 1.61 × 10⁻⁴ | 3.16 × 10⁻² |
| HSA05214_GLIOMA | 52 | 2301 | 3.92 × 10⁻⁴ | 5.16 × 10⁻² |
| HSA05050_DENTATORUBROPALLIDOLUYSIAN_ATROPHY | 14 | 997 | 7.97 × 10⁻⁴ | 7.57 × 10⁻² |
| EXTRINSICPATHWAY | 13 | 579 | 9.58 × 10⁻⁴ | 7.57 × 10⁻² |
WTCCC result: top 5 gene sets with smallest q-value associated with bipolar disorder phenotype from Z-statistic method
| Gene set | No. genes | No. SNPs | p-value | q-value |
|---|---|---|---|---|
| EICOSANOID_SYNTHESIS | 15 | 669 | 6.85 × 10⁻⁴ | 3.33 × 10⁻¹ |
| HSA04510_FOCAL_ADHESION | 171 | 10281 | 2.50 × 10⁻³ | 1.00 |
| HSA01030_GLYCAN_STRUCTURES_BIOSYNTHESIS_1 | 91 | 7475 | 4.01 × 10⁻³ | 1.00 |
| BADPATHWAY | 17 | 1045 | 4.91 × 10⁻³ | 1.00 |
| HSA05223_NON_SMALL_CELL_LUNG_CANCER | 43 | 2933 | 5.49 × 10⁻³ | 1.00 |
WTCCC result: top 5 gene sets with smallest q-value associated with bipolar disorder phenotype from SNP-PRAGE
| Gene set | No. genes | No. SNPs | p-value | q-value |
|---|---|---|---|---|
| AGPCRPATHWAY | 12 | 616 | 5.2 × 10⁻⁵ | 1.45 × 10⁻³ |
| DREAMPATHWAY | 13 | 600 | 8.5 × 10⁻⁵ | 1.45 × 10⁻³ |
| CK1PATHWAY | 16 | 1079 | 3.1 × 10⁻⁴ | 3.52 × 10⁻³ |
| BIOGENIC_AMINE_SYNTHESIS | 16 | 914 | 1.0 × 10⁻³ | 8.52 × 10⁻³ |
| BADPATHWAY | 21 | 1045 | 5.6 × 10⁻³ | 1.51 × 10⁻¹ |
Simulated gene sets based on MSigDB pathways
| Simulated gene set | No. genes | Gene size | Reference gene set |
|---|---|---|---|
| Set1 | 20 | 9~12 SNPs | HSA04060_CYTOKINE_CYTOKINE_RECEPTOR_INTERACTION |
| Set2 | 20 | 12~20 SNPs | HSA04010_MAPK_SIGNALING_PATHWAY |
| Set3 | 20 | 20~30 SNPs | HSA04810_REGULATION_OF_ACTIN_CYTOSKELETON |
| Set4 | 20 | 26~40 SNPs | HSA04510_FOCAL_ADHESION |
| Set5 | 20 | 36~49 SNPs | HSA04080_NEUROACTIVE_LIGAND_RECEPTOR_INTERACTION |
Figure 3. QQ plots of the set-level summary for various set sizes. Under the assumption that there is no causal set effect, panels (a), (b), and (c) show the QQ plot of the set summary for set sizes 5, 10, and 20, respectively.
Type 1 error (when effect size is 0) in simulation studies
| Causal gene set | Gene set size | Gene size | Significance level | Z-statistic method (1) | Z-statistic method (2) | Z-statistic method (3) | Z-statistic method (4) | Z-statistic method (5) | SNP-PRAGE (1) | SNP-PRAGE (2) | SNP-PRAGE (3) | SNP-PRAGE (4) | SNP-PRAGE (5) | GLOSSI | GSEA |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Set1 | 20 genes | 9~12 | 0.05 | .005 | .003 | .004 | .004 | .003 | .057 | .053 | .054 | .054 | .053 | .052 | .051 |
| Set2 | 20 genes | 20~30 | 0.05 | .083 | .087 | .084 | .080 | .080 | .051 | .052 | .052 | .050 | .052 | .051 | .049 |
| Set3 | 20 genes | 36~49 | 0.05 | .430 | .641 | .760 | .864 | .891 | .047 | .049 | .050 | .050 | .051 | .049 | .052 |
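As a reading aid for the type 1 error table above, the following minimal sketch shows how an empirical rejection rate is typically computed from simulated null p-values and how much Monte Carlo noise to expect around the nominal level. The number of simulation replicates is not given in this record, so `n_replicates` is a hypothetical parameter, and both function names are illustrative.

```python
import math

def empirical_rate(p_values, alpha=0.05):
    """Fraction of simulation replicates whose p-value falls below alpha."""
    return sum(p < alpha for p in p_values) / len(p_values)

def monte_carlo_se(alpha, n_replicates):
    """Binomial standard error of an estimated rejection rate
    when the true rate is alpha (n_replicates is an assumed value)."""
    return math.sqrt(alpha * (1 - alpha) / n_replicates)

# Toy example: two of four made-up p-values fall below 0.05.
toy_rate = empirical_rate([0.01, 0.2, 0.04, 0.5])

# With a hypothetical 1000 replicates, estimates within roughly two
# standard errors of 0.05 are consistent with a well-calibrated test.
se_1000 = monte_carlo_se(0.05, 1000)
```

An estimate several standard errors away from the significance level in either direction would suggest miscalibration rather than simulation noise.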
Power (when effect size is 0.3 or 0.6) in the simulation studies
| Effect size (β) | Causal gene set | Gene set size | Gene size | Significance level | Z-statistic method (1) | Z-statistic method (2) | Z-statistic method (3) | Z-statistic method (4) | Z-statistic method (5) | SNP-PRAGE (1) | SNP-PRAGE (2) | SNP-PRAGE (3) | SNP-PRAGE (4) | SNP-PRAGE (5) | GLOSSI | GSEA |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.3 | Set1 | 20 | 9~12 | 0.05 | .81 | .81 | .74 | .67 | .59 | .92 | .92 | .94 | .95 | .95 | .92 | .95 |
| 0.3 | Set3 | 20 | 20~30 | 0.05 | .85 | .78 | .79 | .79 | .76 | .81 | .81 | .83 | .82 | .83 | .82 | .83 |
| 0.3 | Set5 | 20 | 36~49 | 0.05 | .98 | .99 | .99 | .99 | .99 | .74 | .74 | .75 | .75 | .76 | .74 | .73 |
| 0.6 | Set1 | 20 | 9~12 | 0.05 | .84 | .83 | .78 | .69 | .62 | .97 | .98 | .98 | .98 | .97 | .98 | .98 |
| 0.6 | Set3 | 20 | 20~30 | 0.05 | .86 | .89 | .86 | .88 | .88 | .84 | .85 | .86 | .86 | .87 | .84 | .85 |
| 0.6 | Set5 | 20 | 36~49 | 0.05 | 1.0 | .99 | 1.0 | 1.0 | 1.0 | .79 | .80 | .80 | .82 | .82 | .79 | .79 |
Computing time for simulation data analysis
| Process | Z-statistic method | SNP-PRAGE | GLOSSI (100 permutations) | GSEA (1000 permutations) |
|---|---|---|---|---|
| Single SNP analysis | 40 sec | 40 sec | 34 min | 26 min 15 sec |
| Gene set analysis | 0.3 sec | 52 sec | 0.5 sec | 2 min 10 sec |
| Total analysis | 40.3 sec | 1 min 32 sec | 34 min 0.5 sec | 28 min 25 sec |