Abstract
A sample size with sufficient statistical power is critical to the success of genetic association studies that aim to detect causal genes of complex human diseases. Genome-wide association studies require much larger sample sizes to achieve adequate statistical power. We estimated the statistical power as the number of markers analyzed increases and compared the sample sizes required in case-control studies and case-parent studies. We computed the effective sample size and statistical power using the Genetic Power Calculator. An analysis using a larger number of markers requires a larger sample size. Testing one single-nucleotide polymorphism (SNP) marker requires 248 cases, whereas testing 500,000 and 1 million SNP markers requires 1,206 and 1,255 cases, respectively, under the assumption of an odds ratio of 2, 5% disease prevalence, 5% minor allele frequency, complete linkage disequilibrium (LD), a 1:1 case/control ratio, and a 5% type I error rate in an allelic test. Under a dominant model, a smaller sample size is required to achieve 80% power than under other genetic models. We found that a much smaller sample size was required with a strong effect size, a common SNP, and increased LD. In addition, studying a common disease in a case-control study with a 1:4 case-control ratio is one way to achieve higher statistical power. We also found that case-parent studies require more samples than case-control studies. Although we have not covered all plausible cases in study design, the estimates of sample size and statistical power computed under various assumptions in this study may be useful for determining the sample size when designing a population-based genetic association study.
Keywords: case-control studies; case-parent study; genetic association studies; sample size; statistical power
Year: 2012 PMID: 23105939 PMCID: PMC3480678 DOI: 10.5808/GI.2012.10.2.117
Source DB: PubMed Journal: Genomics Inform ISSN: 1598-866X
Number of cases required to achieve 80% power according to the different genetic models in a case-control study
Assumptions: 5% minor allele frequency, 5% disease prevalence, complete linkage disequilibrium (D'=1), 1:1 case-control ratio, and 5% type I error rates for single marker analyses.
ORhet/ORhomo, odds ratios of heterozygotes/rare homozygotes.
Fig. 1. The statistical power for the allelic test in a case-control study according to (A) minor allele frequency (MAF), (B) disease prevalence, (C) linkage disequilibrium (LD), and (D) case-to-control ratio (M, MAF; P, prevalence; D, LD; R, case-control ratio; A1=1.3, A2=1.5, A3=2, and A4=2.5 for heterozygous odds ratios).
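The effect of the case-to-control ratio in panel (D) can be illustrated with a rough sketch. The Python below is not the Genetic Power Calculator used in the study; it assumes a normal approximation to the two-sided allelic test, and the function name `allelic_power`, the allele frequencies (0.095 in cases, 0.048 in controls), and the 200 cases are hypothetical values chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def allelic_power(p_case, p_ctrl, n_cases, controls_per_case=1.0, alpha=0.05):
    """Approximate power of a two-sided allelic test.

    Assumption: the allelic test is approximated by a two-proportion z-test
    on allele counts (2 alleles per person), which is a simplification.
    """
    norm = NormalDist()
    alleles_case = 2 * n_cases
    alleles_ctrl = 2 * n_cases * controls_per_case
    # Standard error of the difference in allele frequencies.
    se = sqrt(p_case * (1 - p_case) / alleles_case
              + p_ctrl * (1 - p_ctrl) / alleles_ctrl)
    z_crit = norm.inv_cdf(1 - alpha / 2)
    shift = abs(p_case - p_ctrl) / se
    # Probability of exceeding the critical value in either tail.
    return norm.cdf(shift - z_crit) + norm.cdf(-shift - z_crit)


# Hypothetical inputs: power with 1, 2, and 4 controls per case.
for ratio in (1, 2, 4):
    print(ratio, round(allelic_power(0.095, 0.048, 200, ratio), 3))
```

With the number of cases held fixed, adding controls shrinks the standard error of the allele-frequency difference, so the approximate power rises with the ratio, although each additional control contributes less than the previous one.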
Sample sizes with 80% power by increasing number of SNP markers in case-control and case-parent studies
Assumptions: 5% minor allele frequency, 5% disease prevalence, complete linkage disequilibrium (D'=1), and 5% type I error rates for allelic test.
SNP, single-nucleotide polymorphism; ORA, odds ratio of heterozygotes under an additive model; CC, case-control study; CP, case-parent study.
Fig. 2. The statistical power according to the number of single-nucleotide polymorphism (SNP) markers for the allelic test in (A) a case-control study and (B) a case-parent study (a1 = 0.05, a2 = 1×10⁻⁷, and a3 = 5×10⁻⁸ denote the significance thresholds according to the number of markers; A1.3, A1.5, A2, and A2.5 denote the odds ratios of heterozygotes). MAF, minor allele frequency; D', linkage disequilibrium.
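The thresholds a2 and a3 are consistent with dividing a 5% type I error rate by 500,000 and 1 million markers (a Bonferroni-style correction; this reading is an inference from the numbers, not a statement in the record). The sketch below shows how a stricter threshold inflates the number of cases needed for 80% power in a 1:1 case-control allelic test. It is a normal-approximation sketch, not the Genetic Power Calculator, so it is not expected to reproduce the reported figures exactly. All function names are hypothetical, the effect sizes are treated as genotype relative risks rather than odds ratios, and the rare-homozygote risk of 4 (a multiplicative model) is an assumption.

```python
from math import ceil
from statistics import NormalDist


def case_control_allele_freqs(maf, prevalence, rr_het, rr_hom):
    """Risk-allele frequency in cases and controls under Hardy-Weinberg.

    Assumption: rr_het and rr_hom are genotype relative risks versus the
    common homozygote, used as a stand-in for the odds ratios.
    """
    q = 1.0 - maf
    geno = (q * q, 2.0 * maf * q, maf * maf)  # common hom., het., rare hom.
    rel = (1.0, rr_het, rr_hom)
    # Baseline penetrance chosen so that the population prevalence matches.
    base = prevalence / sum(g * r for g, r in zip(geno, rel))
    pen = [base * r for r in rel]
    case = [g * f / prevalence for g, f in zip(geno, pen)]
    ctrl = [g * (1.0 - f) / (1.0 - prevalence) for g, f in zip(geno, pen)]
    return case[2] + 0.5 * case[1], ctrl[2] + 0.5 * ctrl[1]


def bonferroni_alpha(n_markers, family_alpha=0.05):
    """Per-marker significance threshold for n_markers tests."""
    return family_alpha / n_markers


def cases_for_power(p_case, p_ctrl, alpha, power=0.80):
    """Cases needed (1:1 design) for a two-sided two-proportion z-test."""
    norm = NormalDist()
    z = norm.inv_cdf(1 - alpha / 2) + norm.inv_cdf(power)
    variance = p_case * (1 - p_case) + p_ctrl * (1 - p_ctrl)
    alleles_per_group = z * z * variance / (p_case - p_ctrl) ** 2
    return ceil(alleles_per_group / 2)  # two alleles per person


# Assumed scenario: MAF 5%, prevalence 5%, relative risks 2 and 4.
p_case, p_ctrl = case_control_allele_freqs(0.05, 0.05, 2.0, 4.0)
for n_markers in (1, 500_000, 1_000_000):
    alpha = bonferroni_alpha(n_markers)
    print(n_markers, alpha, cases_for_power(p_case, p_ctrl, alpha))
```

Because the required sample size scales with the square of the sum of the two normal quantiles, moving from 500,000 to 1 million markers adds comparatively few cases, whereas the step from one marker to a genome-wide threshold is large.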