Joseph Beyene, Pingzhao Hu, Jemila S Hamid, Elena Parkhomenko, Andrew D Paterson, David Tritchler.
Abstract
Evaluation of the association between single-nucleotide polymorphisms (SNPs) and disease outcomes is widely used to identify genetic risk factors for complex diseases. Although this analysis paradigm has made significant progress in many genetic studies, many challenges remain, such as the requirement of a large sample size to achieve adequate power. Here we use rheumatoid arthritis (RA) as an example and explore a new analysis strategy: pathway-based analysis to search for related genes and SNPs contributing to the disease.

We first propose applying a measure of explained variation to quantify the predictive ability of a given SNP. We then use gene set enrichment analysis to evaluate the enrichment of specific pathways, where a pathway is considered enriched if it consists of genes that are associated with the phenotype of interest above and beyond what is expected by chance. The results are also compared with score tests for association that adjust for population stratification.

Our study identified some significantly enriched pathways, such as "cell adhesion molecules," which are known to play a key role in RA. Our results showed that pathway-based analysis may identify other biologically interesting loci (e.g., rs1018361) related to RA: the gene closest to this marker (CTLA4) has previously been shown to be associated with RA and belongs to the significant pathways we identified, even though the marker did not reach genome-wide significance in univariate single-marker analysis.
Year: 2009 PMID: 20017994 PMCID: PMC2795901 DOI: 10.1186/1753-6561-3-s7-s128
Source DB: PubMed Journal: BMC Proc ISSN: 1753-6561
Significant pathways identified by three test statistic methods
| Gene sets/pathways | No. genes (a) | Rank (b) | FDR | DirEV rank | DirEV FDR | IndirEV rank | IndirEV FDR |
|---|---|---|---|---|---|---|---|
| Hsa04612 antigen processing and presentation | 57 | 1 | <1×10⁻⁸ | 1 | <1×10⁻⁸ | 1 | <1×10⁻⁸ |
| Hsa04940 Type I diabetes mellitus | 40 | 2 | <1×10⁻⁸ | 2 | <1×10⁻⁸ | 2 | <1×10⁻⁸ |
| Wieland hepatitis B-induced | 87 | 3 | <1×10⁻⁸ | 3 | 0.004 | 3 | 0.001 |
| Ctla4 pathway | 18 | 4 | 6×10⁻⁵ | 6 | 0.006 | 4 | 0.003 |
| Ami pathway | 22 | 5 | 0.01 | 7 | 0.006 | 10 | 0.015 |
| Csk pathway | 22 | 6 | 0.01 | 10 | 0.011 | 7 | 0.012 |
| Sana Ifng endothelial up | 67 | 7 | 0.011 | 8 | 0.006 | 9 | 0.013 |
| Th1th2 pathway | 17 | 8 | 0.011 | 9 | 0.009 | 8 | 0.013 |
| Inflam pathway | 28 | 9 | 0.012 | 4 | 0.005 | 6 | 0.006 |
| Hsa04514 cell adhesion molecule | 115 | 10 | 0.02 | 5 | 0.005 | 5 | 0.004 |

(a) Number of genes found in our data. (The actual number of genes defined in the set is larger than these numbers.)

(b) The rank is based on the FDR q-value for all 1,900 tested gene sets in each method.
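Footnote b ranks gene sets by FDR q-value within each method. The source does not say which FDR procedure was used, so the sketch below swaps in the Benjamini-Hochberg adjustment as a familiar stand-in. The function names `bh_qvalues` and `rank_by_q` and the example p-values are hypothetical.

```python
from typing import Dict, List, Sequence


def bh_qvalues(pvalues: Sequence[float]) -> List[float]:
    """Benjamini-Hochberg adjusted p-values (an assumed stand-in for the FDR used)."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices, smallest p first
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone q-values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        q[i] = running_min
    return q


def rank_by_q(names: Sequence[str], qvalues: Sequence[float]) -> Dict[str, int]:
    """Rank gene sets by ascending q-value; ties keep their input order."""
    order = sorted(range(len(names)), key=lambda i: qvalues[i])
    return {names[i]: r for r, i in enumerate(order, start=1)}


# Hypothetical p-values for four gene sets (not taken from the study).
example_p = [0.01, 0.04, 0.03, 0.005]
example_q = bh_qvalues(example_p)
print(example_q)
print(rank_by_q(["set_1", "set_2", "set_3", "set_4"], example_q))
```

With 1,900 gene sets tested per method, the same adjustment would simply be applied to a list of 1,900 p-values. How tied q-values are ordered (for example, the three sets reported at <1×10⁻⁸) is not stated in the source.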
Distribution of selected genes and SNPs in each of the two regions and ten gene sets/pathways
| Gene sets/pathways | MHC region (a): No. genes | MHC region: % (b) | MHC region: No. SNPs | Rest of genome: No. genes | Rest of genome: % (b) | Rest of genome: No. SNPs |
|---|---|---|---|---|---|---|
| Hsa04612 antigen processing and presentation | 24 | 42.1 | 5 | 33 | 57.9 | 0 |
| Hsa04940 Type I diabetes mellitus | 22 | 55.0 | 6 | 18 | 45.0 | 0 |
| Wieland hepatitis B-induced | 15 | 17.2 | 4 | 72 | 82.8 | 0 |
| Ctla4 pathway | 2 | 11.1 | 0 | 16 | 88.9 | 1 |
| Ami pathway | 2 | 9.1 | 0 | 20 | 90.9 | 0 |
| Csk pathway | 2 | 9.1 | 0 | 20 | 90.9 | 0 |
| Sana Ifng endothelial up | 11 | 16.4 | 2 | 56 | 83.6 | 0 |
| Th1th2 pathway | 2 | 11.8 | 0 | 15 | 88.2 | 0 |
| Inflam pathway | 4 | 14.3 | 2 | 24 | 85.7 | 0 |
| Hsa04514 cell adhesion molecule | 20 | 17.4 | 5 | 95 | 82.6 | 1 |

(a) This region covers approximately 0.2% of the whole genome.

(b) The percentage of genes in a given gene set that were found in the region.
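The percentage columns follow directly from the gene counts: each value is the number of genes in the region divided by the total number of genes found for that gene set. The short Python check below reproduces a few of the MHC-region percentages. The counts are copied from the table, while the helper name `percent_in_region` is an assumption made for this example.

```python
def percent_in_region(n_region: int, n_rest: int) -> float:
    """Share of a gene set's genes that fall in the region, as a percentage."""
    total = n_region + n_rest
    if total == 0:
        raise ValueError("gene set has no genes")
    return round(100.0 * n_region / total, 1)


# (genes in MHC region, genes in rest of genome), copied from the table above.
counts = {
    "Hsa04612 antigen processing and presentation": (24, 33),
    "Hsa04940 Type I diabetes mellitus": (22, 18),
    "Ctla4 pathway": (2, 16),
    "Hsa04514 cell adhesion molecule": (20, 95),
}

for name, (n_mhc, n_rest) in counts.items():
    print(f"{name}: {percent_in_region(n_mhc, n_rest)}% of genes in the MHC region")
```

The two percentages in each row sum to 100 because every gene found for a set is assigned to exactly one of the two regions.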