Guolian Kang, Bo Jiang, Yuehua Cui.
Abstract
The study of gene-based genetic associations has recently gained popularity. Focusing on genes as testing units can yield biological insight into the etiology of a complex disease. Several gene-based methods, such as the minimum p-value (or maximum test statistic) method and the entropy-based method, have been developed and can have more power than a single nucleotide polymorphism (SNP)-based analysis. The objective of this study is to compare the performance of the entropy-based method with that of the minimum p-value method and the single SNP-based analysis, and to explore their strengths and weaknesses. Simulation studies show that: 1) all three methods can reasonably control the false-positive rate; 2) the minimum p-value method outperforms the entropy-based and the single SNP-based methods when only one disease-related SNP occurs within the gene; 3) the entropy-based method outperforms the other methods when there are more than two disease-related SNPs in the gene; and 4) the entropy-based method is computationally more efficient than the minimum p-value method. Application to a real data set shows that more significant genes were identified by the entropy-based method than by the other two methods.
Keywords: Entropy; Gene-centric; Genome-wide association study; Minimum p-value method; Monte Carlo
Year: 2013 PMID: 24294105 PMCID: PMC3731815 DOI: 10.2174/13892029113149990001
Source DB: PubMed Journal: Curr Genomics ISSN: 1389-2029 Impact factor: 2.236
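To illustrate the minimum p-value (maximum test statistic) idea named in the abstract, the following is a minimal Python sketch, not the authors' implementation. It assumes genotypes coded as minor-allele counts (0, 1, 2), a simple standardized mean-difference statistic per SNP chosen here only for illustration, and permutation of case/control labels to obtain the gene-level p-value. The names `allele_stat`, `max_stat`, and `maxt_pvalue` and the parameters `n_perm` and `seed` are hypothetical.

```python
import random


def allele_stat(genotypes, labels):
    """Squared standardized difference in mean allele count, cases vs controls.

    Illustrative per-SNP statistic (an assumption, not the paper's test).
    """
    cases = [g for g, y in zip(genotypes, labels) if y == 1]
    controls = [g for g, y in zip(genotypes, labels) if y == 0]
    n1, n0 = len(cases), len(controls)
    n = len(genotypes)
    mean_all = sum(genotypes) / n
    var_all = sum((g - mean_all) ** 2 for g in genotypes) / (n - 1)
    if var_all == 0 or n1 == 0 or n0 == 0:
        return 0.0  # monomorphic SNP or empty group: no evidence
    diff = sum(cases) / n1 - sum(controls) / n0
    return diff ** 2 / (var_all * (1 / n1 + 1 / n0))


def max_stat(gene, labels):
    """Gene-level statistic: the largest per-SNP statistic in the gene."""
    return max(allele_stat(snp, labels) for snp in gene)


def maxt_pvalue(gene, labels, n_perm=1000, seed=0):
    """Permutation p-value for the maximum statistic over SNPs in one gene.

    gene   : list of SNPs, each a list of allele counts per individual
    labels : list of 0 (control) / 1 (case), same order as the genotypes
    """
    rng = random.Random(seed)
    observed = max_stat(gene, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any genotype-phenotype link
        if max_stat(gene, shuffled) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_perm + 1)
```

Permuting labels rather than genotypes keeps the correlation among SNPs within the gene intact, so the maximum is compared against a null distribution that reflects the same linkage disequilibrium. The cost of this sketch grows with `n_perm` times the number of SNPs.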
Single-SNP Disease Model
| Disease Model | | | |
|---|---|---|---|
| Additive | (2 | | |
| Dominant | | | |
| Recessive | | | |
f0, f1, and f2 are the penetrances of the three genotypes.
In additive and dominant models, λ = λ1, and in a recessive model, λ = λ2.
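The table cells did not survive extraction, so the exact formulas used in the study cannot be read from this record. As a hedged illustration only, the sketch below uses a commonly seen genotype relative risk parametrization in which λ1 = f1/f0 and λ2 = f2/f0; this parametrization, the function name `penetrances`, and the example values are assumptions, not taken from the source.

```python
def penetrances(model, f0, lam):
    """Return (f0, f1, f2) under an ASSUMED relative-risk parametrization.

    additive : f1 = lam * f0, f2 = (2 * lam - 1) * f0   (lam = lambda1)
    dominant : f1 = f2 = lam * f0                       (lam = lambda1)
    recessive: f1 = f0, f2 = lam * f0                   (lam = lambda2)
    """
    if model == "additive":
        return f0, lam * f0, (2 * lam - 1) * f0
    if model == "dominant":
        return f0, lam * f0, lam * f0
    if model == "recessive":
        return f0, f0, lam * f0
    raise ValueError("model must be 'additive', 'dominant', or 'recessive'")


# Hypothetical baseline penetrance and relative risk, for illustration only.
for name in ("additive", "dominant", "recessive"):
    print(name, penetrances(name, f0=0.1, lam=1.4))
```

Under this assumed form, a single λ determines all three penetrances once the baseline f0 is fixed, which is why one relative-risk value per model suffices in the note above.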
Analysis of the Preeclampsia Data Set Using the SNP-Based, Gene-based Entropy, and MaxT Methods
| Gene (No. of SNPs) | maxT | Entropy | SNP | SNP-based Method |
|---|---|---|---|---|
| 0.0379 | rs5456814 | 0.0165 | ||
| 0.0282 | rs28787657 | |||
| 0.5812 | rs28886771 | 0.0021 | ||
| rs634043464 | 0.0067 | |||
| 0.7919 | rs41410456 | 0.0330 | ||
| 0.1150 | rs634850223 | 0.0280 | ||
| 0.0527 | rs634820282 | 0.032 | ||
| 0.1312 | 0.1902 | rs40893937 | ||
| 0.3695 | 0.0547 | rs9678181 |
maxT: data were obtained using the maximum test statistic method.
Entropy: data were obtained using the entropy-based method.
Only SNPs with the smallest P-values within the corresponding genes are listed.
Bold formatting of data indicates significant p-values.
The Estimated Type I Error Rate Under the Null Hypothesis of No Association by Using MS Program
| SS | MS Program | LD-based Programs | ||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | maxT | Entropy | SNP | maxT | Entropy | SNP | maxT | Entropy | SNP | maxT | Entropy | SNP |
| 400 | 0.05 | 0.06 | 0.03 | 0.05 | 0.06 | 0.027 | 0.06 | 0.06 | 0.06 | 0.04 | 0.04 | 0.04 |
| 800 | 0.05 | 0.05 | 0.02 | 0.04 | 0.06 | 0.019 | 0.05 | 0.06 | 0.04 | 0.06 | 0.05 | 0.05 |
SS, sample size; maxT, maximum test statistic method; Entropy, entropy-based method; SNP, single-SNP-based method.
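Estimated type I error rates and powers such as those tabulated here are proportions of simulated replicates in which the null hypothesis is rejected. The sketch below shows that calculation together with its binomial Monte Carlo standard error. The function name `rejection_rate`, the significance level default, and the example p-values are hypothetical; the number of replicates used in the study is not stated in this record.

```python
import math


def rejection_rate(p_values, alpha=0.05):
    """Proportion of replicates with p <= alpha, plus its binomial standard error.

    Under the null this estimates the type I error rate;
    under an alternative it estimates power.
    """
    n = len(p_values)
    rate = sum(p <= alpha for p in p_values) / n
    se = math.sqrt(rate * (1 - rate) / n)
    return rate, se


# Hypothetical p-values from four simulated replicates.
rate, se = rejection_rate([0.01, 0.20, 0.50, 0.04])
print(rate, se)
```

The standard error gives a rough sense of how far an estimate near 0.05 can drift from the nominal level by simulation noise alone.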
The Estimated Power of Gene-based Association Tests, Assuming One Disease-related SNP Occurs Within the Gene, Under Different Sample Sizes and Different Disease Models
| Disease Model | GRR | maxT (N=400) | Entropy (N=400) | SNP (N=400) | maxT (N=800) | Entropy (N=800) | SNP (N=800) |
|---|---|---|---|---|---|---|---|
| Additive | 1.4 | 1 | 0.56 | 0.60 | 0.95 | 0.92 | 0.94 |
| | 1.6 | 1 | 0.91 | 0.955 | 1 | 1 | 1 |
| | 1.8 | 1 | 0.975 | 0.990 | 1 | 1 | 1 |
| Dominant | 1.4 | 0.47 | 0.39 | 0.36 | 0.65 | 0.62 | 0.74 |
| | 1.6 | 0.75 | 0.65 | 0.73 | 0.94 | 0.90 | 0.95 |
| | 1.8 | 0.88 | 0.89 | 0.90 | 0.99 | 0.99 | 0.99 |
| Recessive | 1.4 | 0.22 | 0.26 | 0.20 | 0.29 | 0.29 | 0.37 |
| | 1.6 | 0.32 | 0.34 | 0.34 | 0.64 | 0.74 | 0.77 |
| | 1.8 | 0.54 | 0.63 | 0.59 | 0.86 | 0.92 | 0.98 |
GRR, genotype relative risk; maxT, maximum test statistic method; Entropy, entropy-based method; SNP, single-SNP-based method.
The Estimated Power of Gene-based Association Tests, Assuming that Two Disease-related SNPs Occur Within a Gene, Under Different Sample Sizes and Different Disease Models
| Disease Model | (BL,GE) | maxT (N=400) | Entropy (N=400) | SNP (N=400) | maxT (N=800) | Entropy (N=800) | SNP (N=800) |
|---|---|---|---|---|---|---|---|
| Model 1 | (1,0.5) | 0.31 | 0.42 | 0.19 | 0.61 | 0.76 | 0.37 |
| | (1,0.7) | 0.54 | 0.71 | 0.35 | 0.87 | 0.93 | 0.72 |
| | (1,0.9) | 0.78 | 0.89 | 0.61 | 0.99 | 1 | 0.96 |
| Model 2 | (1,0.5) | 0.20 | 0.29 | 0.19 | 0.52 | 0.54 | 0.49 |
| | (1,0.7) | 0.34 | 0.45 | 0.38 | 0.66 | 0.77 | 0.79 |
| | (1,0.9) | 0.52 | 0.65 | 0.59 | 0.90 | 0.96 | 0.97 |
| Model 3 | (1,0.5) | 0.17 | 0.25 | 0.10 | 0.51 | 0.49 | 0.54 |
| | (1,0.7) | 0.43 | 0.56 | 0.43 | 0.66 | 0.77 | 0.76 |
| | (1,0.9) | 0.41 | 0.59 | 0.50 | 0.84 | 0.91 | 0.92 |
BL, baseline effect; GE, genotypic effect; maxT, maximum test statistic method; Entropy, entropy-based method; SNP, single-SNP-based method.
The Estimated Power of Gene-based Association Tests, Assuming Three Disease-related SNPs Occur Within a Gene, Under Different Sample Sizes and Different Disease Models
| Disease Model | (BL,GE) | maxT (N=400) | Entropy (N=400) | SNP (N=400) | maxT (N=800) | Entropy (N=800) | SNP (N=800) |
|---|---|---|---|---|---|---|---|
| Model 1 | (1,0.5) | 0.54 | 0.56 | 0.42 | 0.92 | 0.88 | 0.81 |
| | (1,0.7) | 0.87 | 0.77 | 0.63 | 1 | 1 | 1 |
| | (1,0.9) | 0.95 | 0.94 | 0.87 | 1 | 1 | 1 |
| Model 2 | (1,0.5) | 0.56 | 0.50 | 0.33 | 0.94 | 0.91 | 0.81 |
| | (1,0.7) | 0.87 | 0.76 | 0.73 | 1 | 0.99 | 0.99 |
| | (1,0.9) | 0.96 | 0.96 | 0.91 | 1 | 1 | 1 |
| Model 3 | (1,0.5) | 0.06 | 0.05 | 0 | 0.01 | 0.08 | 0.03 |
| | (1,0.7) | 0.08 | 0.13 | 0.05 | 0.06 | 0.16 | 0.02 |
| | (1,0.9) | 0.04 | 0.19 | 0.03 | 0.05 | 0.20 | 0.05 |
BL, baseline effect; GE, genotypic effect; maxT, maximum test statistic method; Entropy, entropy-based method; SNP, single-SNP-based method.