Robert Delongchamp, Mohammed F Faramawi, Eleanor Feingold, Dongjun Chung, Saly Abouelenein.
Abstract
A power calculation for a study with a quantitative outcome requires information on the outcome distribution under the alternative hypothesis. In genetic studies, specifying the alternative distribution concisely is difficult because power depends on the genotype frequencies and the average effect of each genotype. In a GWAS, investigators evaluate hundreds of thousands of associations, so it is unrealistic to specify gene frequencies and gene effects for each test, and some simplification is needed. Software packages are available to calculate power, but many have limited flexibility, a steep learning curve, or both. In this review, we describe for researchers and graduate students the essentials of a power calculation for testing an association between a quantitative trait and genotypes. In addition, we provide code for several software packages, both free and commercial, to calculate this power. The calculations can be carried out in virtually any computer language that computes the cumulative distribution function of a non-central F-distribution.
Keywords: genome-wide association study; quantitative trait loci; sample size; statistics as topic
Year: 2018 PMID: 31259311 PMCID: PMC6599598 DOI: 10.20897/ejeph/3925
Source DB: PubMed Journal: Eur J Environ Public Health
ANOVA table testing for changes in phenotype associated with the genotypes of a SNP
| Source of Variation | Degrees of Freedom | Sum of Squares | Expected Mean Squared Error |
|---|---|---|---|
| Covariates | * | * | * |
| Genotypes | | | |
| Error | | | |
| Total | | | |
Figure 1. Computer code in SAS, Matlab, R, Stata, and Mathematica that computes power for an ANOVA (Table 1)
Figure 2. Power of an ANOVA for several sample sizes as a function of the proportion of the phenotype variation that is explained by the genotypes
Figure 3. Effect size detectable with a power of 0.8 in the dominant/recessive genetic model as a function of the minor allele frequency for several sample sizes
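Figures 2 and 3 relate power to the proportion of phenotype variation explained by the genotypes and to the minor allele frequency. The formulas behind those figures are not reproduced in this record, so the following is only a minimal sketch under stated assumptions: Hardy-Weinberg genotype frequencies, a two-group dominant or recessive model in which carriers differ from non-carriers by a mean shift `effect`, and a residual standard deviation `resid_sd`. The function names and this parametrization are illustrative and may differ from the authors' definitions.

```python
def genotype_frequencies(maf):
    """Genotype frequencies (major/major, heterozygote, minor/minor).

    Assumption: Hardy-Weinberg equilibrium with minor allele frequency `maf`.
    """
    if not 0.0 <= maf <= 0.5:
        raise ValueError("minor allele frequency must lie in [0, 0.5]")
    q = maf
    p = 1.0 - q
    return p * p, 2.0 * p * q, q * q


def variance_explained(maf, effect, model="dominant", resid_sd=1.0):
    """Proportion of phenotype variance explained by a two-group genetic model.

    Assumption (hypothetical parametrization): carriers have a mean that is
    `effect` higher than non-carriers, and the residual SD is `resid_sd`.
    """
    _, p_het, p_minor_hom = genotype_frequencies(maf)
    if model == "dominant":
        carrier = p_het + p_minor_hom   # at least one minor allele
    elif model == "recessive":
        carrier = p_minor_hom           # two minor alleles
    else:
        raise ValueError("model must be 'dominant' or 'recessive'")
    # Variance of a two-point mean shift: effect^2 * P(carrier) * P(non-carrier)
    genetic_var = effect ** 2 * carrier * (1.0 - carrier)
    return genetic_var / (genetic_var + resid_sd ** 2)
```

Under these assumptions, a rarer minor allele gives a smaller explained proportion for the same `effect`, which is consistent with the detectable effect size in Figure 3 depending on the minor allele frequency.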
Critical values of the F-distribution with 1 and 1000 degrees of freedom, by software package
| Significance level | Mathematica | SAS | Matlab | R | Stata |
|---|---|---|---|---|---|
| 1E-03 | 10.891865559 | 10.891866 | 10.891866 | 10.891866 | 10.891866 |
| 1E-04 | 15.259521389 | 15.259521 | 15.259521 | 15.259521 | 15.259521 |
| 1E-05 | 19.712947049 | 19.712947 | 19.712947 | 19.712947 | 19.712948 |
| 1E-06 | 24.228934152 | . | 24.228934 | 24.228934 | 24.228933 |
| 1E-07 | 28.794927827 | . | 28.794928 | 28.794928 | 28.794928 |
| 1E-08 | 33.403406313 | . | 33.403406 | 33.403406 | 33.403408 |
| 1E-09 | 38.049531722 | . | 38.049532 | 38.049532 | 38.049530 |
| 1E-10 | 42.730026564 | . | 42.730026 | 42.730026 | 42.730026 |
| 1E-11 | 47.442581611 | . | 47.442581 | 47.442581 | 47.442581 |
| 1E-12 | 52.185519872 | . | 52.185566 | 52.185566 | 52.185520 |
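The critical values in the table, and the power calculation the abstract describes, can be approximated with the Python standard library alone. The sketch below is not the authors' code from Figure 1. It assumes a textbook continued-fraction evaluation of the regularized incomplete beta function, a bisection search for the critical value, and a Poisson-mixture series for the non-central F upper tail. All function names are hypothetical, and the non-centrality parameter `lam` is passed in directly because its definition is not given in this record.

```python
import math


def _betacf(a, b, x, max_iter=1000, eps=1e-14):
    """Continued fraction for the incomplete beta function (modified Lentz)."""
    tiny = 1e-300
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c = 1.0
    d = 1.0 - qab * x / qap
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        even = m * (b - m) * x / ((qam + 2 * m) * (a + 2 * m))
        odd = -(a + m) * (qab + m) * x / ((a + 2 * m) * (qap + 2 * m))
        for aa in (even, odd):
            d = 1.0 + aa * d
            d = 1.0 / (d if abs(d) > tiny else tiny)
            c = 1.0 + aa / c
            c = c if abs(c) > tiny else tiny
            delta = d * c
            h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h


def betainc(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    log_front = (math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                 + a * math.log(x) + b * math.log1p(-x))
    front = math.exp(log_front)
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _betacf(a, b, x) / a
    return 1.0 - front * _betacf(b, a, 1.0 - x) / b


def f_sf(f, d1, d2):
    """Upper tail P(F > f) of a central F-distribution with (d1, d2) df."""
    if f <= 0.0:
        return 1.0
    return betainc(d2 / 2.0, d1 / 2.0, d2 / (d2 + d1 * f))


def f_crit(alpha, d1, d2):
    """Critical value with upper-tail probability alpha, found by bisection."""
    lo, hi = 0.0, 1.0
    while f_sf(hi, d1, d2) > alpha:   # expand until the root is bracketed
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if f_sf(mid, d1, d2) > alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def ncf_sf(f, d1, d2, lam):
    """Upper tail of a non-central F: Poisson(lam/2) mixture of beta tails."""
    mu = lam / 2.0
    if mu == 0.0:
        return f_sf(f, d1, d2)
    y = d2 / (d2 + d1 * f)
    j_max = int(mu + 10.0 * math.sqrt(mu) + 50.0)  # truncation point (assumption)
    total = 0.0
    for j in range(j_max + 1):
        log_w = -mu + j * math.log(mu) - math.lgamma(j + 1.0)
        total += math.exp(log_w) * betainc(d2 / 2.0, d1 / 2.0 + j, y)
    return total


def f_test_power(alpha, d1, d2, lam):
    """Power of the F-test at level alpha for non-centrality lam."""
    return ncf_sf(f_crit(alpha, d1, d2), d1, d2, lam)


if __name__ == "__main__":
    for exponent in (3, 8):
        print(f"alpha=1E-{exponent:02d}", f_crit(10.0 ** -exponent, 1, 1000))
```

Calling `f_crit(alpha, 1, 1000)` for the significance levels in the table gives values that can be compared against the package results above. For power, one common choice (an assumption here, not taken from the source) is `lam = n * r2 / (1 - r2)`, where `r2` is the proportion of phenotype variation explained by the genotypes.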