Joshua N Sampson, Hongyu Zhao.
Abstract
BACKGROUND: One common goal of a case/control genome-wide association study (GWAS) is to find SNPs associated with a disease. Traditionally, the first step in such studies is to assign a genotype to each SNP in each subject, based on a statistic summarizing fluorescence measurements. When the distributions of the summary statistics are not well separated by genotype, the act of genotype assignment can lead to more potential problems than acknowledged by the literature.
Year: 2009 PMID: 19236714 PMCID: PMC2679732 DOI: 10.1186/1471-2105-10-68
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Programs available for genotyping SNPs.
| Name | Summary Statistic | MM¹ | Data² | Data³ | Notes |
| RLMM | {Θ | No | T | M | |
| BRLMM | | No | E-U | M | Assumes genotypes in "training" data are known. "Training" data only uses high quality SNPs. Incorporates info from other SNPs as a |
| CRLMM | {Θ | No | T | L | |
| CHIAMO | | Yes | E-L, T* | W | CHIAMO is a Bayesian hierarchical mixture model and is greatly simplified by this brief summary. |
| SNiPer-HD | | No | E-U | W | Assumes genotypes in "training" data are unknown and requires the EM algorithm. "Training" data should only use high quality SNPs. |
| Moorhead | | N/A | E-U | W | Originally for MIP, but applicable to Affymetrix. Plagnol demonstrated how to link genotype probabilities between cases and controls. |
| logiCALL | | No | E-L | W-F | Designed to lower the false positive rate; assigns calls based on cumulative distribution functions, not density functions. |
¹ Indicates use of mismatched probes.
² Parameters were estimated from Experimental (E) or Training (T) data. For experimental data, under the null, case and control genotype proportions could be Linked (L) or Unlinked (U). * indicates optional.
³ Distance can be Mahalanobis (M), Weighted likelihood (W), or unweighted Likelihood (L).
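As an illustration of the Mahalanobis option in the distance footnote, the following is a minimal sketch of calling a genotype by assigning a two-dimensional summary statistic to the nearest genotype cluster. It is not the implementation of RLMM, BRLMM, or any other program in the table; the function names and the cluster means and covariances are hypothetical values chosen only for the example.

```python
def mahalanobis_sq(x, mean, cov):
    """Squared Mahalanobis distance of a 2-D point x from a cluster."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det <= 0:
        raise ValueError("covariance must be positive definite")
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    # The inverse of [[a, b], [c, d]] is [[d, -b], [-c, a]] / det.
    return (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det


def call_genotype(x, clusters):
    """Return the genotype whose cluster is closest to x."""
    return min(clusters, key=lambda g: mahalanobis_sq(x, *clusters[g]))


# Hypothetical cluster parameters: genotype -> (mean, covariance).
CLUSTERS = {
    "AA": ((2.0, 0.0), ((0.10, 0.0), (0.0, 0.05))),
    "AB": ((0.0, 0.0), ((0.10, 0.0), (0.0, 0.05))),
    "BB": ((-2.0, 0.0), ((0.10, 0.0), (0.0, 0.05))),
}

if __name__ == "__main__":
    print(call_genotype((1.8, 0.1), CLUSTERS))
```

A point that lies between two clusters is still forced into one of them by this rule, which is the situation the abstract identifies as problematic when the clusters are not well separated.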
Figure 1(b). The bias, p̂ - p0 (y-axis), depends on P (x-axis) and is shown for different values of p0.
Figure 2. After fixing μ0 = 0, = 1, and μ1 = 1, we plot the bias, p̂ - p0 (y-axis), against (x-axis), for different values of p0.
Figure 3 (Dependency of Calls). The density of . The deviation from the Y = X line indicates that is not distributed as a binomial(n, p) variable.
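The kind of bias described in these captions can be illustrated with a small calculation. The sketch below is a simplification under stated assumptions, not the authors' derivation: it assumes only two groups whose summary statistics are normal with means μ0 and μ1, a common standard deviation `sigma` (treating the unnamed quantity fixed at 1 in the caption as this standard deviation is an assumption), and calls made by assigning each subject to the group with the higher posterior probability. The function names are hypothetical.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def call_bias(p0, mu0=0.0, mu1=1.0, sigma=1.0):
    """Expected called proportion of group 1 minus its true proportion p0.

    Assumes mu1 > mu0 and a shared standard deviation sigma (assumption).
    """
    if not 0.0 < p0 < 1.0:
        raise ValueError("p0 must lie strictly between 0 and 1")
    # Threshold where p0 * f1(x) = (1 - p0) * f0(x); call group 1 above it.
    t = (mu0 + mu1) / 2.0 + sigma ** 2 / (mu1 - mu0) * math.log((1.0 - p0) / p0)
    called = (p0 * (1.0 - normal_cdf((t - mu1) / sigma))
              + (1.0 - p0) * (1.0 - normal_cdf((t - mu0) / sigma)))
    return called - p0


if __name__ == "__main__":
    for p0 in (0.1, 0.3, 0.5, 0.7, 0.9):
        print(p0, round(call_bias(p0), 4))
```

Under these assumptions the called proportion is unbiased only at p0 = 0.5; a rare group is under-called and a common group is over-called, because poorly separated subjects are preferentially assigned to the more frequent group.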
The percentage of influential genes among the top 100 most significant SNPs, as ranked by LogiCALL, a likelihood ratio test, and a standard test.
| | | RR = 1.5 | | | RR = 2.0 | | | RR = 2.5 | | |
| Shift | Difference | LogiCALL | LR | Standard | LogiCALL | LR | Standard | LogiCALL | LR | Standard |
| 0.5 | 0 | 4.4 | 4.6 | 4.5 | 41.8 | 41.8 | 41.3 | 78.8 | 79.2 | 79.3 |
| 0.5 | 0.2 | 4.4 | 4.6 | 4.5 | 42.0 | 41.8 | 41.3 | 78.8 | 79.3 | 79.4 |
| 0.3 | 0 | 4.4 | 4.6 | 4.6 | 42.0 | 41.9 | 41.2 | 78.8 | 79.2 | 79.4 |
| 0.3 | 0.2 | 4.4 | 1.9 | 0.2 | 41.6 | 29.5 | 14.4 | 78.8 | 67.3 | 42.7 |
| 0.2 | 0 | 4.4 | 4.4 | 2.1 | 42.0 | 41.4 | 25.9 | 78.8 | 78.9 | 47.1 |
| 0.2 | 0.2 | 3.7 | 0.1 | 0.0 | 38.9 | 3.4 | 0.0 | 75.3 | 21.3 | 0.0 |
Simulated data sets are fully described in the Methods section. The shift is the distance between the and μand . The Difference is the distance between and . RR is the genotype relative risk for subjects homozygous for the minor allele.
The proportion of 'p-values' less than traditional α-levels (0.005, 0.001, 0.0005) is listed for four tests of association.
| Method | n | p < 0.005 | p < 0.001 | p < 0.0005 |
| BeadStudio | 3487 | 0.015 | 0.005 | 0.004 |
| logiCALL | 5533 | 0.004 | 0.002 | 0.002 |
| LR | 5533 | 0.087 | 0.061 | 0.052 |
| LR-(same) | 3014 | 0.006 | 0.002 | 0.001 |
1) BeadStudio (default settings for missing assignments and omitting SNPs).
2) logiCALL.
3) Likelihood ratio (LR) using all SNPs.
4) Likelihood ratio (LR-same) using only SNPs where calls were the same for restricted and unrestricted parameter sets.
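A table of this kind can be produced by counting how many p-values fall below each α-level. The sketch below is illustrative only: the function name and the example p-values are hypothetical, and it reads the tabulated values as proportions rather than percentages, which is an assumption.

```python
def tail_proportions(p_values, alphas=(0.005, 0.001, 0.0005)):
    """Proportion of p-values strictly below each alpha-level."""
    if not p_values:
        raise ValueError("p_values must not be empty")
    n = len(p_values)
    return {a: sum(p < a for p in p_values) / n for a in alphas}


if __name__ == "__main__":
    # Hypothetical p-values, not data from the study.
    example = [0.0001, 0.002, 0.5, 0.9]
    print(tail_proportions(example))
```

For a well-calibrated test applied to SNPs with no true association, each proportion should be close to its α-level, so values well above α suggest an inflated false positive rate.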