Abstract
Rare variants are believed to play an important role in disease etiology. Recent advances in high-throughput sequencing technology enable investigators to systematically characterize the genetic effects of both common and rare variants. We introduce several approaches that simultaneously test the effects of common and rare variants within a single-nucleotide polymorphism (SNP) set based on logistic regression models and logistic kernel machine models. Gene-environment interactions and SNP-SNP interactions are also considered in some of these models. We illustrate the performance of these methods using the unrelated-individuals data from Genetic Analysis Workshop 17. Three true disease genes (FLT1, PIK3C3, and KDR) were consistently selected using the proposed methods. In addition, compared to logistic regression models, the logistic kernel machine models were more powerful, presumably because they reduced the effective number of parameters through regularization. Our results also suggest that a screening step is effective in decreasing the number of false-positive findings, which are often a major concern in association studies.
Year: 2011 PMID: 22373133 PMCID: PMC3287933 DOI: 10.1186/1753-6561-5-S9-S91
Source DB: PubMed Journal: BMC Proc ISSN: 1753-6561
Methods in the testing stage
| Method | Model | Kernel | Common variants | Rare variants |
|---|---|---|---|---|
| Logistic regression | Logistic regression | NA | Genotypes | WScombined |
| Logistic common score | Logistic regression | NA | Common score | WScombined |
| Linear rare WScombined | Kernel machine | Linear | Genotypes | WScombined |
| Linear rare WSnonsyn | Kernel machine | Linear | Genotypes | WSnonsyn |
| Quadratic rare WScombined | Kernel machine | Quadratic | Genotypes | WScombined |
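
The Kernel column distinguishes linear from quadratic kernel machine models. The source does not give the kernel formulas, so the following is only a minimal sketch under stated assumptions: genotypes are coded as minor-allele counts (0, 1, 2), the linear kernel is the inner product of two individuals' genotype vectors, and the quadratic kernel is (1 + inner product)^2, a form commonly used in kernel machine association tests. The function names and the toy genotype matrix are hypothetical.

```python
def linear_kernel(genotypes):
    """Assumed linear kernel: K[i][j] = inner product of genotype rows i and j."""
    n = len(genotypes)
    return [
        [sum(a * b for a, b in zip(genotypes[i], genotypes[j])) for j in range(n)]
        for i in range(n)
    ]


def quadratic_kernel(genotypes):
    """Assumed quadratic kernel: K[i][j] = (1 + inner product)^2.

    Expanding the square introduces pairwise SNP products, which is one way
    a kernel can capture SNP-SNP interactions without adding explicit
    interaction parameters.
    """
    return [[(1 + k) ** 2 for k in row] for row in linear_kernel(genotypes)]


# Toy data (hypothetical): 3 individuals x 3 SNPs, minor-allele counts.
toy_genotypes = [
    [0, 1, 2],
    [1, 0, 0],
    [2, 1, 0],
]

K_lin = linear_kernel(toy_genotypes)
K_quad = quadratic_kernel(toy_genotypes)
```

Both matrices are symmetric with one row and one column per individual, so their size depends on the sample size rather than on the number of SNPs in the set.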
Figure 1. Frequently selected genes and their selection frequencies. For each gene, the height of the bar represents the number of times it was selected across the 100 screening-testing pairs.
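
The bar heights described in the caption are simple tallies. As an illustrative sketch only, assuming each screening-testing pair yields a list of selected gene names, the selection frequency per gene could be counted as below. The helper name and the four toy pairs are hypothetical and are not results from the study.

```python
from collections import Counter


def selection_frequencies(selected_per_pair):
    """Count how many screening-testing pairs selected each gene.

    Each inner list is converted to a set so that a gene is counted at most
    once per pair.
    """
    counts = Counter()
    for selected in selected_per_pair:
        counts.update(set(selected))
    return counts


# Toy input (hypothetical): genes selected in 4 screening-testing pairs.
toy_pairs = [
    ["FLT1", "KDR"],
    ["FLT1", "PIK3C3"],
    ["FLT1", "KDR", "KDR"],
    [],
]

freq = selection_frequencies(toy_pairs)
```

With 100 pairs, as in the figure, each count would range from 0 to 100.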