Ren-Hua Chung, Ying-Erh Chen.
Abstract
Pathway analysis provides a powerful approach for identifying the joint effect on disease of genes grouped into biologically based pathways. It is also an attractive approach for secondary analysis of genome-wide association study (GWAS) data, which may still yield new results from these valuable datasets. Most current pathway analysis methods focus on testing the cumulative main effects of genes in a pathway. However, for complex diseases, gene-gene interactions are expected to play a critical role in disease etiology. We extended a random forest-based method for pathway analysis by incorporating a two-stage design. We used simulations to verify that the proposed method has the correct type I error rates. We also used simulations to show that, in the presence of gene-gene interactions, the method is more powerful than both the original random forest-based pathway approach and the set-based test implemented in PLINK. Finally, we applied the method to a breast cancer GWAS dataset and a lung cancer GWAS dataset and identified pathways that have implications for breast and lung cancers.
Year: 2012 PMID: 22586488 PMCID: PMC3346727 DOI: 10.1371/journal.pone.0036662
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Type I error rates for the TRF-pathway, RF-pathway, and PLINK set-based tests.
| Scenario | TRF-pathway (α = 0.05) | TRF-pathway (α = 0.01) | RF-pathway (α = 0.05) | RF-pathway (α = 0.01) | PLINK (α = 0.05) | PLINK (α = 0.01) |
| Scenario 1 | 0.048 | 0.012 | 0.049 | 0.009 | 0.052 | 0.012 |
| Scenario 2 | 0.054 | 0.011 | 0.049 | 0.010 | 0.053 | 0.011 |
| Scenario 3 | 0.058 | 0.011 | 0.047 | 0.010 | 0.047 | 0.009 |
| Scenario 4 | 0.049 | 0.009 | 0.051 | 0.008 | 0.050 | 0.010 |
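The simulated rates above can be sanity-checked against Monte Carlo sampling noise. The following is a minimal sketch, assuming a hypothetical replicate count `n_reps` (this excerpt does not state how many null simulations were run) and a normal approximation to the binomial. The function name `mc_interval` is illustrative and does not come from the paper.

```python
import math

def mc_interval(alpha, n_reps, z=1.96):
    """Approximate 95% range for an empirical rejection rate under the null.

    Assumes each of n_reps null replicates rejects independently with
    probability alpha (normal approximation to the binomial).
    """
    se = math.sqrt(alpha * (1.0 - alpha) / n_reps)
    return alpha - z * se, alpha + z * se

# TRF-pathway columns of the type I error table (values copied from the table).
trf_rates = {
    0.05: [0.048, 0.054, 0.058, 0.049],
    0.01: [0.012, 0.011, 0.011, 0.009],
}

n_reps = 1000  # hypothetical; replace with the actual number of replicates

for alpha, rates in trf_rates.items():
    lo, hi = mc_interval(alpha, n_reps)
    for scenario, rate in enumerate(rates, start=1):
        status = "within" if lo <= rate <= hi else "outside"
        print(f"alpha={alpha}, scenario {scenario}: "
              f"{rate:.3f} is {status} ({lo:.4f}, {hi:.4f})")
```

The interval narrows as `n_reps` grows, so whether a given rate counts as consistent with the nominal level depends on the actual number of replicates used in the study.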
Figure 1. Power comparison of the TRF-pathway with PLINK and RF-pathway at the 0.05 and 0.01 significance levels.
Pathway analysis results for the breast cancer GWAS data.
| Pathway | No. Genes | No. SNPs | TRF P-value | RF P-value | PLINK P-value |
| T cell receptor signaling pathway (hsa04660) | 97 | 105 | 0.001 | 0.168 | 0.035 |
| Maturity onset diabetes of the young (hsa04950) | 25 | 27 | 0.003 | 0.043 | 0.048 |
| Prostate cancer (hsa05215) | 82 | 90 | 0.004 | 0.143 | 0.012 |
| Aminoacyl-tRNA biosynthesis (hsa00970) | 39 | 56 | 0.009 | 0.016 | 0.252 |
No. Genes: number of genes in the pathway.
No. SNPs: number of SNPs used in step 3 of the TRF-pathway algorithm.
TRF P-value: P-value for the TRF-pathway test.
RF P-value: P-value for the RF-pathway test.
PLINK P-value: P-value for the PLINK set-based test.
Pathway analysis results for the lung cancer GWAS data.
| Pathway | No. Genes | No. SNPs | TRF P-value | RF P-value | PLINK P-value |
| Cyanoamino acid metabolism (hsa00460) | 7 | 19 | 0.001 | 0.092 | 0.192 |
| Fc gamma R-mediated phagocytosis (hsa04666) | 88 | 133 | 0.002 | 0.010 | 0.381 |
| p53 signaling pathway (hsa04115) | 66 | 50 | 0.006 | 0.064 | 0.208 |
| Pentose phosphate pathway (hsa00030) | 22 | 25 | 0.008 | 0.208 | 0.506 |
No. Genes: number of genes in the pathway.
No. SNPs: number of SNPs used in step 3 of the TRF-pathway algorithm.
TRF P-value: P-value for the TRF-pathway test.
RF P-value: P-value for the RF-pathway test.
PLINK P-value: P-value for the PLINK set-based test.