Liping Tong, Bamidele Tayo, Jie Yang, Richard S Cooper.
Abstract
We compare single-nucleotide polymorphism (SNP) based and gene-based association studies using 697 unrelated individuals. The Benjamini-Hochberg procedure was applied to control the false discovery rate across all multiple comparisons. We use a linear model for the SNP-based association study. For the gene-based study, we consider three methods: the first is based on a linear model, the second is similarity based, and the third is a new two-step procedure. Power results using a subset of SNPs show that the SNP-based association test is more powerful than the gene-based ones. However, in some situations a gene-based study is able to detect associated variants that were missed in a SNP-based study. Finally, we apply these methods to a replicate of the quantitative trait Q1 and the binary trait D (D = 1, affected; D = 0, unaffected) for a genome-wide gene search.
Year: 2011 PMID: 22373242 PMCID: PMC3287878 DOI: 10.1186/1753-6561-5-S9-S41
Source DB: PubMed Journal: BMC Proc ISSN: 1753-6561
Number of positive genes and estimated FDR
| Trait | Method | No. Pos | FDR | No. Pos | FDR | No. Pos | FDR |
|---|---|---|---|---|---|---|---|
| Q1 | SNP | 35.185 | 0.849 | 51.625 | 0.874 | 65.875 | 0.884 |
| | GL | 31.790 | 0.791 | 47.935 | 0.837 | 65.100 | 0.869 |
| | G2 | 26.785 | 0.777 | 39.730 | 0.823 | 47.255 | 0.841 |
| Q2 | SNP | 3.975 | 0.281 | 11.060 | 0.486 | 29.905 | 0.655 |
| | GL | 4.175 | 0.297 | 12.350 | 0.503 | 29.315 | 0.662 |
| | G2 | 4.370 | 0.351 | 13.755 | 0.545 | 27.760 | 0.684 |
| Q4 | SNP | 0.325 | – | 1.215 | – | 3.210 | – |
| | GL | 0.365 | – | 1.120 | – | 3.370 | – |
| | G2 | 0.415 | – | 1.325 | – | 2.900 | – |
| D | SNP | 3.695 | 0.324 | 12.690 | 0.519 | 33.940 | 0.580 |
| | GL | 5.225 | 0.261 | 17.355 | 0.374 | 37.050 | 0.476 |
| | GS | 4.945 | 0.274 | 13.615 | 0.377 | 34.330 | 0.437 |
| | G2 | 5.575 | 0.284 | 19.160 | 0.416 | 33.880 | 0.467 |
SNP, GL, G2, and GS are the SNP-based, gene-based linear, gene-based two-step, and gene-based similarity methods, respectively. α is the predefined value in the BH procedure; each pair of “No. Pos” and FDR columns corresponds to one α level. “No. Pos” is the average number of positives over 200 replicates, and FDR is the observed false discovery rate.
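To make the role of α concrete, the following is a minimal sketch of the standard Benjamini-Hochberg step-up rule and of averaging the number of positives across replicates. It is not the authors' code: the function names `benjamini_hochberg` and `average_positives` and the input layout (one list of p-values per replicate) are assumptions made for illustration.

```python
def benjamini_hochberg(p_values, alpha):
    """Return one boolean per hypothesis: True if rejected at FDR level alpha.

    Standard BH step-up rule: sort the p-values, find the largest rank k
    (1-based) with p_(k) <= k * alpha / m, and reject the k smallest.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * alpha / m:
            k_max = rank
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected


def average_positives(replicates, alpha):
    """Mean number of BH rejections over replicates (hypothetical input:
    each replicate is a list of p-values, one per SNP or gene)."""
    counts = [sum(benjamini_hochberg(p, alpha)) for p in replicates]
    return sum(counts) / len(counts)
```

Because the rule is step-up, a p-value that misses its own rank threshold can still be rejected when a larger p-value meets a later threshold.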
Figure 1. ROC plot for trait D. The x-axis is the average observed false positive rate over 200 replicates and 50 nonassociated genes. The y-axis is the average observed true positive rate (or power) over 200 replicates and 36 truly associated genes. SNP, SNP-based method; GL, gene-based linear method; GS, gene-based similarity method; G2, gene-based two-step method.
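A single point on such a ROC plot can be sketched as follows, assuming that a gene is called positive when its p-value falls at or below a chosen threshold. The names `roc_point` and `average_roc_point` and the p-value inputs are hypothetical; the source does not describe how the curve was computed beyond the axis definitions.

```python
def roc_point(p_values, is_associated, threshold):
    """Return (FPR, TPR) for one replicate.

    Assumes at least one associated and one nonassociated gene,
    so that both denominators are nonzero.
    """
    calls = [p <= threshold for p in p_values]
    n_assoc = sum(is_associated)
    n_null = len(is_associated) - n_assoc
    true_pos = sum(1 for c, a in zip(calls, is_associated) if c and a)
    false_pos = sum(1 for c, a in zip(calls, is_associated) if c and not a)
    return false_pos / n_null, true_pos / n_assoc


def average_roc_point(replicates, is_associated, threshold):
    """Average the per-replicate (FPR, TPR) pairs over all replicates."""
    points = [roc_point(p, is_associated, threshold) for p in replicates]
    n = len(points)
    return sum(x for x, _ in points) / n, sum(y for _, y in points) / n
```

Sweeping `threshold` from 0 to 1 and plotting the averaged pairs would trace one curve per method.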
Power to identify each associated gene for trait D
| Gene | Observed FPR = 0.05 | Observed FPR = 0.1 | ||||||
|---|---|---|---|---|---|---|---|---|
| SNP | GL | GS | G2 | SNP | GL | GS | G2 | |
| 0.155 | 0.12 | 0.005 | 0.25 | 0.04 | 0.285 | |||
| 0.705 | 0.83 | 0.335 | 0.785 | 0.915 | 0.505 | |||
| 0.14 | 0.17 | 0.005 | 0.2 | 0.26 | 0.01 | |||
| 0.1 | 0.13 | 0.015 | 0.185 | 0.22 | 0.03 | |||
| 0.2 | 0.06 | 0.41 | 0.585 | 0.12 | 0.585 | |||
| 0.04 | 0.075 | 0.055 | 0.095 | 0.145 | 0.135 | |||
| 0.035 | 0.025 | 0.06 | 0.075 | 0.09 | 0.075 | |||
| 0.085 | 0.095 | 0.105 | 0.2 | 0.21 | 0.175 | |||
| 0.44 | 0.08 | 0.4 | 0.565 | 0.14 | 0.58 | |||
| 0.125 | 0.225 | 0.165 | 0.3 | 0.36 | 0.31 | |||
| 0.4 | 0.255 | 0.035 | 0.41 | 0.07 | 0.595 | |||
| 0.24 | 0.09 | 0.145 | 0.375 | 0.205 | 0.3 | |||
| 0 | 0 | 0.26 | 0.05 | 0 | 0.335 | |||
| 0.05 | 0.02 | 0.065 | 0.085 | 0.06 | 0.115 | |||
| 0.175 | 0.045 | 0.065 | 0.25 | 0.095 | 0.13 | |||
SNP, GL, GS, and G2 are the SNP-based, gene-based linear, gene-based similarity, and gene-based two-step methods, respectively. The boldface values are the maximum power of these four methods.
Comparison of number of significant genes using trait Q1
| | SNP True | SNP False | SNP Total | GL True | GL False | GL Total | G2 True | G2 False | G2 Total |
|---|---|---|---|---|---|---|---|---|---|
| 1 × 10⁻⁵ | 1 | 2 | 3 | 2 | 1 | 3 | 1 | 3 | 4 |
| 1 × 10⁻⁴ | 1 | 4 | 5 | 2 | 6 | 8 | 1 | 4 | 5 |
| 1 × 10⁻³ | 2 | 19 | 21 | 2 | 23 | 25 | 1 | 12 | 13 |
| 0.01 | 2 | 73 | 75 | 3 | 101 | 104 | 1 | 75 | 76 |
| 0.10 | 4 | 389 | 393 | 5 | 397 | 402 | 4 | 303 | 307 |
| 0.25 | 5 | 718 | 723 | 6 | 695 | 701 | 7 | 626 | 633 |
| 0.50 | 6 | 1,266 | 1,272 | 7 | 1,246 | 1,253 | 8 | 969 | 977 |
SNP, GL, and G2 are the SNP-based, gene-based linear, and gene-based two-step methods, respectively. The “True” columns give the number of truly associated genes detected, the “False” columns give the number of false-positive genes, and the “Total” columns give the total number of positive genes detected.
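The counts in this table can be turned into a false discovery proportion (false positives divided by total positives) for each method. The sketch below does this for the 0.001 (1 × 10⁻³) row only; the helper name `false_discovery_proportion` is an assumption, and the resulting ratios are simple arithmetic on the tabulated counts rather than values reported by the authors.

```python
# (true, false, total) counts copied from the 1e-3 row of the table.
row = {"SNP": (2, 19, 21), "GL": (2, 23, 25), "G2": (1, 12, 13)}


def false_discovery_proportion(n_true, n_false):
    """False positives as a fraction of all positives; None if there are none."""
    total = n_true + n_false
    return n_false / total if total else None


fdp = {
    method: false_discovery_proportion(n_true, n_false)
    for method, (n_true, n_false, _total) in row.items()
}

for method, value in fdp.items():
    print(f"{method}: {value:.3f}")
```

On this single replicate, all three methods have more than 90% false positives among their detections at this threshold.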