Brooke L Fridley, Gregory D Jenkins, Diane E Grill, Richard B Kennedy, Gregory A Poland, Ann L Oberg.
Abstract
Gene set analysis (GSA) has been used in the analysis of microarray data to aid interpretation and to increase statistical power. With the advent of next-generation sequencing, GSA is even more relevant, as studies are often conducted on a small number of samples. We propose the use of soft truncation thresholding and the Gamma Method (GM) to determine significant gene sets (GSs), where a generalized linear model is used to assess per-gene significance. The approach was compared to other methods using an extensive simulation study and RNA-seq data from a smallpox vaccine study. The GM was found to outperform other proposed methods. Application of the GM to the smallpox vaccine study found several GSs to be moderately associated with response, including focal adhesion (p = 0.04) and extracellular matrix receptor interaction (p = 0.05). The application of GSA to RNA-seq data will provide new insights into the genomic basis of complex traits.
Year: 2013 PMID: 24104466 PMCID: PMC3793215 DOI: 10.1038/srep02898
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Summary of power across all 1440 non-null simulation scenarios for sample sizes of N = 500 and N = 100, with 1000 simulated data sets per scenario. The GM with various soft truncation threshold (STT) values is compared to ten previously proposed self-contained GSA methods. Table entries are sorted by descending mean power for the scenarios with sample size of 500.
| Method | STT | N = 500 Min. | N = 500 1st Qu. | N = 500 Median | N = 500 Mean | N = 100 Min. | N = 100 1st Qu. | N = 100 Median | N = 100 Mean |
|---|---|---|---|---|---|---|---|---|---|
| Gamma Method (GM) | 0.1 | 0.264 | 1 | 1 | 0.991 | 0.072 | 0.994 | 1 | 0.923 |
| GM | 0.05 | 0.331 | 1 | 1 | 0.993 | 0.073 | 0.990 | 1 | 0.921 |
| GM | 0.15 | 0.210 | 1 | 1 | 0.990 | 0.073 | 0.995 | 1 | 0.920 |
| Global model with fixed effects (GMFE) | | 0.223 | 1 | 1 | 0.985 | 0.068 | 0.998 | 1 | 0.906 |
| GM | 0.2 | 0.171 | 1 | 1 | 0.988 | 0.068 | 0.992 | 1 | 0.916 |
| GM | 0.01 | 0.449 | 1 | 1 | 0.993 | 0.067 | 0.96 | 1 | 0.903 |
| Global model using random effects (GMRE) | | 0.111 | 1 | 1 | 0.983 | 0.065 | 0.945 | 1 | 0.896 |
| Fisher's Method/Gamma Method (FM) | 1/e | 0.096 | 1 | 1 | 0.980 | 0.059 | 0.943 | 1 | 0.889 |
| PCA using principal components that explain 80% of variation (PCA80) | | 0.101 | 1 | 1 | 0.974 | 0.06 | 0.856 | 1 | 0.855 |
| Stouffer's Method (SM) | | 0.062 | 1 | 1 | 0.933 | 0.051 | 0.674 | 1 | 0.816 |
| FTS. GS Modified Tail Strength (MTS) | | 0.057 | 1 | 1 | 0.924 | 0.045 | 0.574 | 1 | 0.787 |
| PCA using top five principal components (PCA1.5) | | 0.07 | 0.958 | 1 | 0.920 | 0.056 | 0.631 | 0.993 | 0.788 |
| Tail Strength (TS) | | 0.078 | 0.981 | 1 | 0.867 | 0.059 | 0.621 | 1 | 0.797 |
| Kolmogorov-Smirnov (KS) | | 0.051 | 0.960 | 1 | 0.819 | 0.047 | 0.394 | 0.999 | 0.738 |
| PCA using top principal component (PCA1) | | 0.056 | 0.591 | 1 | 0.808 | 0.051 | 0.314 | 0.991 | 0.704 |
*198 and 396 scenarios could not be fit due to the size of the gene set for N = 500 and N = 100, respectively.
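To make the STT column concrete, here is a minimal sketch of a Gamma Method combination of per-gene p-values. It is not the authors' implementation, and it rests on several assumptions. The STT is taken to determine the gamma shape parameter a through STT = 1 - G_a(a), where G_a is the Gamma(a, 1) distribution function; this definition reproduces STT = 1/e for Fisher's Method (a = 1), as in the FM row. The set-level p-value is taken from a Gamma(n·a, 1) reference distribution, which assumes independent genes; correlated genes would generally call for a permutation-based null instead. The function names are hypothetical, and the numerical routines are simple series and bisection solvers intended for p-values that are not extremely small.

```python
import math
from typing import Sequence


def gamma_cdf(x: float, shape: float) -> float:
    """CDF of Gamma(shape, scale=1) at x, via the incomplete-gamma power series."""
    if x <= 0.0:
        return 0.0
    term = 1.0 / shape
    total = term
    k = 1
    while term > 1e-17 * total:
        term *= x / (shape + k)
        total += term
        k += 1
    log_prefix = shape * math.log(x) - x - math.lgamma(shape)
    return min(1.0, total * math.exp(log_prefix))


def gamma_sf(x: float, shape: float) -> float:
    """Upper tail probability P(Gamma(shape, 1) > x)."""
    return 1.0 - gamma_cdf(x, shape)


def shape_for_stt(stt: float) -> float:
    """Solve 1 - G_a(a) = stt for the shape a (assumed STT definition)."""
    if not 0.001 <= stt <= math.exp(-1.0) + 1e-12:
        raise ValueError("this sketch supports 0.001 <= STT <= 1/e")
    lo, hi = 1e-4, 1.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if gamma_sf(mid, mid) < stt:
            lo = mid  # tail mass at the mean too small: need a larger shape
        else:
            hi = mid
    return 0.5 * (lo + hi)


def gamma_quantile_upper(p: float, shape: float) -> float:
    """Return x such that P(Gamma(shape, 1) > x) = p, by bisection."""
    lo, hi = 0.0, 1.0
    while gamma_sf(hi, shape) > p and hi < 512.0:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if gamma_sf(mid, shape) > p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def gamma_method(pvalues: Sequence[float], stt: float) -> float:
    """Combine per-gene p-values into one gene-set p-value (independence null)."""
    if not pvalues or any(not 0.0 < p <= 1.0 for p in pvalues):
        raise ValueError("p-values must lie in (0, 1]")
    shape = shape_for_stt(stt)
    statistic = sum(gamma_quantile_upper(p, shape) for p in pvalues)
    return gamma_sf(statistic, len(pvalues) * shape)


if __name__ == "__main__":
    gene_pvalues = [0.01, 0.2, 0.5]  # hypothetical per-gene p-values
    for stt in (0.05, 0.15, math.exp(-1.0)):
        print(f"STT = {stt:.3f}: gene-set p = {gamma_method(gene_pvalues, stt):.4f}")
```

Under this definition a smaller STT corresponds to a smaller shape parameter, which places relatively more weight on the smallest per-gene p-values in the set.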
Figure 1. Power comparison between the Gamma Method with STT = 0.15 and 0.05 (GM.15, GM.05), Fisher's Method (FM), the global model with random effects (GMRE), principal components analysis using the components that explain 80% of the variation (PCA.80), and the Kolmogorov-Smirnov test (KS).
Figure 2. Power comparison between the Gamma Method with various STT values.
STT values ranged from 0.20 to 0.01.
Top GSs associated with response to smallpox vaccine for various STT values. Results with p < 0.05 from GSA using the GM with any of the STT values are presented.
| Gene Set | GSA P-value (STT = 0.05) | GSA P-value (STT = 0.10) | GSA P-value (STT = 0.15) | GSA P-value (STT = 0.20) | GSA P-value (STT = 1/e) | N Genes in KEGG | N Genes in Analysis | Coverage of Pathway |
|---|---|---|---|---|---|---|---|---|
| Biotin metabolism | 0.0005 | 0.0005 | 0.0005 | 0.0005 | 0.002 | 2 | 2 | 100% |
| Pentose and glucuronate interconversions | 0.018 | 0.021 | 0.025 | 0.031 | 0.119 | 28 | 8 | 29% |
| Non-homologous end-joining | 0.022 | 0.026 | 0.031 | 0.039 | 0.121 | 14 | 12 | 86% |
| Focal adhesion | 0.039 | 0.053 | 0.065 | 0.076 | 0.103 | 201 | 148 | 74% |
| D-Glutamine and D-glutamate metabolism | 0.041 | 0.041 | 0.042 | 0.041 | 0.050 | 4 | 4 | 100% |
| ECM-receptor interaction | 0.046 | 0.064 | 0.080 | 0.093 | 0.129 | 84 | 59 | 70% |
| Lysine biosynthesis | 0.058 | 0.055 | 0.048 | 0.042 | 0.044 | 4 | 3 | 75% |
*FDR q-value was 0.02.
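The coverage column can be checked directly from the two gene counts. The short sketch below assumes coverage is the number of genes in the analysis divided by the number of genes in the KEGG pathway, rounded to a whole percent; the variable and function names are illustrative.

```python
# (N genes in KEGG, N genes in analysis, reported coverage in percent), from the table above
pathways = {
    "Biotin metabolism": (2, 2, 100),
    "Pentose and glucuronate interconversions": (28, 8, 29),
    "Non-homologous end-joining": (14, 12, 86),
    "Focal adhesion": (201, 148, 74),
    "D-Glutamine and D-glutamate metabolism": (4, 4, 100),
    "ECM-receptor interaction": (84, 59, 70),
    "Lysine biosynthesis": (4, 3, 75),
}


def coverage_percent(n_kegg: int, n_analysis: int) -> int:
    """Share of a pathway's KEGG genes that entered the analysis, in whole percent."""
    return round(100 * n_analysis / n_kegg)


for name, (n_kegg, n_analysis, reported) in pathways.items():
    computed = coverage_percent(n_kegg, n_analysis)
    print(f"{name}: computed {computed}%, reported {reported}%")
```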
Gene-level results (p < 0.15) for GSs with p < 0.05
| Gene Set | Gene | P-value | Gene Set | Gene | P-value |
|---|---|---|---|---|---|
| Biotin metabolism | | 0.0005 | Focal adhesion | | 0.0278 |
| Pentose & glucuronate interconversions | | 0.1160 | | | 0.0311 |
| | | 0.0115 | | | 0.0385 |
| Non-homologous end-joining | | 0.0164 | | | 0.0430 |
| | | 0.0613 | | | 0.0468 |
| | | 0.0802 | | | 0.0493 |
| D-Glutamine & D-glutamate metabolism | | 0.0176 | | | 0.0496 |
| | | 0.1061 | | | 0.0521 |
| ECM-receptor interaction | | 0.0117 | | | 0.0626 |
| | | 0.0168 | | | 0.0776 |
| | | 0.0208 | | | 0.0788 |
| | | 0.0278 | | | 0.0801 |
| | | 0.0311 | | | 0.0814 |
| | | 0.0947 | | | 0.0899 |
| | | 0.1277 | | | 0.0947 |
| | | 0.1288 | | | 0.0996 |
| | | 0.1452 | | | 0.1028 |
| Focal adhesion | | 0.0090 | | | 0.1089 |
| | | 0.0117 | | | 0.1170 |
| | | 0.0131 | | | 0.1252 |
| | | 0.0132 | | | 0.1277 |
| | | 0.0168 | | | 0.1288 |
| | | 0.0208 | | | 0.1440 |
| | | 0.0219 | | | |
Figure 3. Dendrogram of the top 25 GSs associated with response to smallpox vaccine, visualizing the relationships and overlap between gene sets.
GSs containing a large set of genes in common would be clustered close together while GSs with no genes in common would not be clustered together.
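One way to quantify the overlap described in this caption is a Jaccard distance between gene sets, which could then feed a hierarchical clustering. The sketch below is only an illustration: the source does not state which distance or linkage was used for the dendrogram, and the gene sets and gene identifiers here are hypothetical placeholders.

```python
from itertools import combinations


def jaccard_distance(a: set, b: set) -> float:
    """1 - |A ∩ B| / |A ∪ B|: 0 for identical sets, 1 for sets with no genes in common."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


# Hypothetical gene sets; real ones would come from a pathway database such as KEGG.
gene_sets = {
    "set_A": {"g1", "g2", "g3", "g4"},
    "set_B": {"g2", "g3", "g4", "g5"},
    "set_C": {"g8", "g9"},
}

# Pairwise distances; small values mean the two sets would sit close together in a dendrogram.
distances = {
    (x, y): jaccard_distance(gene_sets[x], gene_sets[y])
    for x, y in combinations(sorted(gene_sets), 2)
}

for pair, d in distances.items():
    print(pair, round(d, 3))
```

Here set_A and set_B share three of their five distinct genes and so are close, while set_C shares no genes with either and sits at the maximum distance.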