Marcio Almeida, Juan M Peralta, Vidya Farook, Sobha Puppala, John W Kent, Ravindranath Duggirala, John Blangero.
Abstract
The new generation of sequencing platforms opens new horizons in the genetics field. It is now possible to exhaustively assay all genetic variants in an individual and search for phenotypic associations. The whole genome sequencing approach, when applied to a large human sample like the San Antonio Family Study, detects a very large number (>25 million) of single nucleotide variants along with other more complex variants. The analytical challenges imposed by this number of variants are formidable, suggesting that methods are needed to reduce the overall number of statistical tests. In this study, we develop a single degree-of-freedom test of variants in a gene pathway employing a random effect model that uses an empirical pathway-specific genetic relationship matrix as the focal covariance kernel. The empirical pathway-specific genetic relationship matrix uses all variants (or a chosen subset) from the member genes of a given biological pathway. Using SOLAR's pedigree-based variance components modeling, which also allows for arbitrary fixed effects, such as principal components, to deal with latent population structure, we employ a likelihood ratio test of the pathway-specific genetic relationship matrix model. We examine all gene pathways in the KEGG database using our method in the first replicate of the Genetic Analysis Workshop 18 simulation of systolic blood pressure. Our random effect approach was able to detect true association signals in causal gene pathways. Those pathways could easily be further dissected by the independent analysis of all markers.
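A minimal sketch of how an empirical pathway-specific genetic relationship matrix could be computed from genotype calls. It assumes the widely used allele-frequency-standardized estimator; the abstract does not say which estimator the authors used, and the function name `pathway_grm` and the 0/1/2 genotype coding are illustrative assumptions.

```python
import numpy as np


def pathway_grm(genotypes: np.ndarray) -> np.ndarray:
    """Empirical relationship matrix from the variants of one pathway.

    genotypes: array of shape (n_individuals, n_variants) holding allele
    counts coded 0/1/2 (an assumed coding, not taken from the source).
    """
    g = np.asarray(genotypes, dtype=float)

    # Sample allele frequency of each variant.
    p = g.mean(axis=0) / 2.0

    # Monomorphic variants carry no information and would divide by zero.
    keep = (p > 0.0) & (p < 1.0)
    g, p = g[:, keep], p[keep]
    if g.shape[1] == 0:
        raise ValueError("no polymorphic variants in this pathway")

    # Center by the mean allele count and scale by the binomial standard deviation.
    z = (g - 2.0 * p) / np.sqrt(2.0 * p * (1.0 - p))

    # Average the per-variant cross-products over all retained variants.
    return z @ z.T / z.shape[1]
```

Restricting the columns of `genotypes` to a chosen subset, for example variants below a MAF threshold, would give the "chosen subset" form of the matrix.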
Year: 2014 PMID: 25519354 PMCID: PMC4143680 DOI: 10.1186/1753-6561-8-S1-S100
Source DB: PubMed Journal: BMC Proc ISSN: 1753-6561
Figure 1. Deviation of PSGRMs from the expected kinship matrix. Dispersion plot comparing kinship estimators and the number of variants in a gene pathway. Panel A was constructed using the entire set of pairwise comparisons and panel B with only unrelated individuals.
Figure 2. Q-Q plot of the PSGRM-based variance component tests. LRT results obtained from testing the addition of the pathway-specific variance component term to the variance component model.
Likelihood ratio tests of gene pathways. Results of the likelihood ratio test using the pathway-specific variance component term. The table only shows pathways with p-values lower than 0.01.
| Pathway | # Causal genes | Variance explained | | | | |
|---|---|---|---|---|---|---|
| CAUSAL_SET | 15 | 0.2151 | 0.141 | 0.0019184 | 0.095 | 0.0000002 |
| Cytokine_interaction | 2 | 0.032 | 0.127 | 0.0130248 | 0.131 | 0.0006748 |
| Glutathione_metabolism | 2 | 0.027 | 0.221 | 0.0000136 | 0.074 | 0.00306 |
| CAUSAL_1_to_9 | 1 | 0.0779 | 0.248 | 0.0000019 | 0.039 | 0.004348 |
| | 0 | 0 | 0.244 | 0.0000099 | 0.051 | 0.0081785 |
| CAUSAL_1_to_49 | 1 | 0.0779 | 0.265 | 0.0000131 | 0.038 | 0.178527 |
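The likelihood ratio test behind these results compares the variance component model with and without the pathway-specific term. A sketch of turning two maximized log-likelihoods into a p-value follows. It assumes the 50:50 mixture of a point mass at zero and a 1-df chi-square that is commonly used when a variance component is tested on its boundary; the source does not state the reference distribution, and `lrt_pvalue` is a hypothetical helper, not a SOLAR function.

```python
import math


def lrt_pvalue(loglik_null: float, loglik_alt: float, boundary: bool = True):
    """Return (statistic, p-value) for a single degree-of-freedom LRT.

    loglik_null: maximized log-likelihood without the pathway-specific term.
    loglik_alt:  maximized log-likelihood with the pathway-specific term.
    boundary:    if True, use the assumed 50:50 mixture of chi-square(0)
                 and chi-square(1); otherwise use a plain chi-square(1).
    """
    # Numerical noise can make the statistic slightly negative; clamp at zero.
    stat = max(0.0, 2.0 * (loglik_alt - loglik_null))

    # Survival function of a 1-df chi-square: P(X > x) = erfc(sqrt(x / 2)).
    p_chi2 = math.erfc(math.sqrt(stat / 2.0))

    if not boundary:
        return stat, p_chi2
    # Under the mixture, half of the null mass sits exactly at zero.
    return stat, (0.5 * p_chi2 if stat > 0.0 else 1.0)
```

Because each pathway contributes one such test, the number of tests scales with the number of pathways rather than with the number of variants.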
Tests of the pathway-specific variance component term calculated using different frequency spectra. Association results of the CAUSAL sets using variants with different MAF spectra. The columns "Complete set", "MAF <0.05" and "MAF <0.01" represent, respectively, the use of all variants near genes, all variants with MAF < 0.05, and all variants with MAF < 0.01.
| Gene set | Complete set | MAF <0.05 | MAF <0.01 |
|---|---|---|---|
| CAUSAL_1_to_49 | 0.178527 | 0.173653 | 0.0455696 |
| CAUSAL_1_to_9 | 0.004348 | 0.000155 | 0.0111827 |
| CAUSAL_SET | 0.0000002 | 0.168555 | 0.0548205 |