Nathan L Tintle, Bryce Borchers, Marshall Brown, Airat Bekmetjev.
Abstract
Recently, gene set analysis (GSA) has been extended from use on gene expression data to use on single-nucleotide polymorphism (SNP) data in genome-wide association studies (GWAS). When GSA has been demonstrated on SNP data, two popular statistics from gene expression data analysis (gene set enrichment analysis [GSEA] and Fisher's exact test [FET]) have been used. However, GSEA and FET have shown a lack of power and robustness in the analysis of gene expression data. The purpose of this work is to investigate whether the same issues also arise in the analysis of SNP data. Ultimately, we conclude that GSEA and FET are not optimal for the analysis of SNP data when compared with the SUMSTAT method. In analysis of real SNP data from the Framingham Heart Study, we find that SUMSTAT finds many more gene sets to be significant when compared with other methods. In an analysis of simulated data, SUMSTAT demonstrates high power and better control of the type I error rate. GSA is a promising approach to the analysis of SNP data in GWAS, and use of the SUMSTAT statistic instead of GSEA or FET may increase power and robustness.
Year: 2009 PMID: 20018093 PMCID: PMC2796000 DOI: 10.1186/1753-6561-3-s7-s96
Source DB: PubMed Journal: BMC Proc ISSN: 1753-6561
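The contrast the abstract draws between FET and SUMSTAT can be illustrated with a minimal sketch. It rests on several assumptions that the record itself does not state: SUMSTAT is taken to be the sum of gene-level statistics over the set, FET is taken to be a one-sided hypergeometric test on the count of genes exceeding a cutoff, and the null distribution is built by resampling random gene sets (a simplification; it is not claimed to be the authors' permutation scheme). The toy statistics, the gene names, and the cutoff of 5.9 are hypothetical.

```python
import math
import random


def fet_upper_tail(k, n_set, n_sig, n_total):
    """One-sided Fisher's exact test: P(X >= k) under the hypergeometric null."""
    upper = min(n_set, n_sig)
    favourable = sum(
        math.comb(n_sig, i) * math.comb(n_total - n_sig, n_set - i)
        for i in range(k, upper + 1)
    )
    return favourable / math.comb(n_total, n_set)


def sumstat(stats, members):
    """Assumed SUMSTAT form: sum of gene-level statistics over the set."""
    return sum(stats[g] for g in members)


def sumstat_pvalue(stats, members, n_perm=2000, seed=0):
    """Empirical p-value against random gene sets of the same size.

    Simplification: resamples genes instead of permuting phenotypes.
    """
    rng = random.Random(seed)
    genes = list(stats)
    observed = sumstat(stats, members)
    hits = sum(
        sumstat(stats, rng.sample(genes, len(members))) >= observed
        for _ in range(n_perm)
    )
    return (hits + 1) / (n_perm + 1)


# Toy data (hypothetical): 100 genes with small background statistics.
stats = {f"g{i}": 0.5 + 0.3 * (i % 5) for i in range(100)}
gene_set = [f"g{i}" for i in range(10)]
for g in gene_set:
    stats[g] = 3.0        # weakly associated: elevated but below the cutoff
for i in range(90, 95):
    stats[f"g{i}"] = 8.0  # strongly associated genes outside the set

cutoff = 5.9  # hypothetical threshold on the gene-level statistic
significant = {g for g, s in stats.items() if s > cutoff}
k = len(significant & set(gene_set))  # set genes passing the cutoff

fet_p = fet_upper_tail(k, len(gene_set), len(significant), len(stats))
sumstat_p = sumstat_pvalue(stats, gene_set)
print(f"FET p = {fet_p:.3f}, SUMSTAT p = {sumstat_p:.3f}")
```

In this toy setting no gene in the set passes the cutoff, so the cutoff-based test has nothing to count and returns a p-value of 1, while the summed statistic still picks up the consistent weak signal. That is one plausible reading of why a sum-based statistic could gain power on sets with many weakly associated genes.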
Figure 1. Venn diagram of sets identified as significant by four different GSA methods for Framingham Heart Study data. The numbers represent the significant sets in each non-overlapping region. The total number of sets depicted is 1412 (706 gene sets for each of the two phenotypes). Of these, 1340 sets were not identified as significant by any method, leaving 72 that were significant by at least one method.
Percent of sets found as significant for the simulated data
| Set type | FET (5.9) | FET (9.2) | FET (13.8) | FET (18.4) | GSEA | SUMSQ | SUMSTAT |
|---|---|---|---|---|---|---|---|
| Pseudo-gene sets | |||||||
| No associated genes | 1.4 | 2.8 | 0.0 | 0.9 | 3.2 | 3.7 | 3.7 |
| 1-9 weakly associated genes | 9.8 | 3.3 | 1.6 | 3.3 | 8.5 | 8.2 | 8.2 |
| 10+ weakly associated genes | 10.6 | 7.7 | 1.9 | 1.9 | 11.5 | 11.5 | 15.4 |
| 1-2 strongly associated genes, but no weakly associated genes | 0.0 | 0.0 | 5.3 | 0.0 | 0.0 | 0.0 | 0.0 |
| Real gene sets | |||||||
| Many weakly associated genes | 51.7 | 49.2 | 13.3 | 5.0 | 58.3 | 60.0 | 70.8 |
| Some strongly associated genes | 2.5 | 3.8 | 7.5 | 33.8 | 6.3 | 36.3 | 23.8 |
| Null sets (no associated genes) | 0.0 | 6.7 | 1.7 | 3.3 | 1.7 | 5.0 | 3.3 |
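As a reading aid for the real-gene-set rows above, the following sketch finds the column with the highest percentage in each row. It assumes the column order is four FET columns (labelled 5.9, 9.2, 13.8, 18.4) followed by GSEA, SUMSQ and SUMSTAT; the helper name `top_method` is hypothetical. For the null row, a high percentage indicates more false positives, not more power.

```python
# Column labels, assuming four FET columns followed by the three other methods.
methods = ["FET (5.9)", "FET (9.2)", "FET (13.8)", "FET (18.4)",
           "GSEA", "SUMSQ", "SUMSTAT"]

# Percent of real gene sets found significant, copied from the table.
real_sets = {
    "Many weakly associated genes": [51.7, 49.2, 13.3, 5.0, 58.3, 60.0, 70.8],
    "Some strongly associated genes": [2.5, 3.8, 7.5, 33.8, 6.3, 36.3, 23.8],
    "Null sets (no associated genes)": [0.0, 6.7, 1.7, 3.3, 1.7, 5.0, 3.3],
}


def top_method(percentages):
    """Return (method, percent) for the column with the highest percentage."""
    idx = max(range(len(percentages)), key=percentages.__getitem__)
    return methods[idx], percentages[idx]


for scenario, row in real_sets.items():
    name, pct = top_method(row)
    print(f"{scenario}: highest is {name} at {pct}%")
```

Read this way, SUMSTAT leads when many genes are weakly associated, while SUMSQ leads when a few genes are strongly associated.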
The cytogenetic band sets found to be significant by SUMSTAT (FDR 5%)
| 2q34 |
| 2q36 |
| 3p14 |
| 3p26 |
| 4q22 |
| 4q32 |
| 5q14 |
| 5q23 |
| 5p14 |
| 9p24 |
| 9q21 |
| 10p14 |
| 10p15 |
| 11q21 |
| 12p12 |
| 12q23 |
| 13q12 |
| 13q22 |
| 14q13 |
| 18q12 |
| 18q21 |
| 18q22 |
| 1p31 |
| 2q24 |
| 3p26 |
| 4p15 |
| 5p13 |
| 6p24 |
| 6p25 |
| 9p24 |
| 9q |
| 9q21 |
| 10p12 |
| 10p15 |
| 12q15 |
| 18q21 |
| 18q22 |
| 20p12 |
| 21q21 |
The molecular function gene sets found to be significant by SUMSTAT (FDR 5%)
| Cation Transmembrane Transporter Activity |
| Glutamate Receptor Activity |
| Hematopoietin Interferon Class D200 Domain Cytokine Receptor Activity |
| Ionotropic Glutamate Receptor Activity |
| Low density lipoprotein activity |
| Sialyltransferase Activity |
| Transmembrane receptor protein kinase activity |
| Cyclic nucleotide phosphodiesterase activity |
| G-protein coupled receptor activity |
| Gated Channel activity |
| Glutamate receptor activity |
| GTPase regulator activity |
| Guanyl nucleotide exchange factor activity |
| Ionotropic glutamate receptor activity |
| Lipoprotein binding |
| Low-density lipoprotein binding |
| Phosphoric diester hydrolase activity |
| Phosphoric ester hydrolase activity |
| Transmembrane receptor protein phosphatase activity |
| 3-5-cyclic nucleotide phosphodiesterase activity |
| Cation channel activity |
| Interleukin binding |
| GTPase activator activity |
| Ion transmembrane transport activity |
| Phosphoprotein phosphatase activity |
| GABA receptor activity |
| Metal ion transmembrane transporter activity |
| Protein tyrosine phosphatase activity |
| Growth factor binding |
| Metabotropic glutamate GABA-B-like receptor activity |
| Delayed rectifier potassium channel activity |