Maryam Onifade, Marie-Hélène Roy-Gagnon, Marie-Élise Parent, Kelly M Burkett.
Abstract
BACKGROUND: Mixed models are used to correct for confounding due to population stratification and hidden relatedness in genome-wide association studies. This class of models includes linear mixed models and generalized linear mixed models. Existing mixed model approaches to correct for population substructure have been previously investigated with both continuous and case-control response variables. However, they have not been investigated in the context of extreme phenotype sampling (EPS), where genetic covariates are only collected on samples having extreme response variable values. In this work, we compare the performance of existing binary trait mixed model approaches (GMMAT, LEAP and CARAT) on EPS data. Since linear mixed models are commonly used even with binary traits, we also evaluate the performance of a popular linear mixed model implementation (GEMMA).
Keywords: Extreme phenotype sampling; Generalized linear mixed models; Genome-wide association study; Population stratification; Type 1 error
Year: 2022 PMID: 35120458 PMCID: PMC8815214 DOI: 10.1186/s12864-022-08297-y
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Estimated type I error rates for the three mixed model approaches for binary traits (LEAP, GMMAT and CARAT), the LMM method (GEMMA) and logistic regression with principal component based correction (PCA)
| Cohort Sample Size (N) | Sub-sample Size (0.2N) | LEAP | GMMAT | CARAT | GEMMA | PCA |
|---|---|---|---|---|---|---|
| 5000 | 1000 | 0.0405 | 0.04135 | 0.102 | 0.061 | 0.0575 |
| 10000 | 2000 | 0.0417 | 0.0475 | 0.089 | 0.046 | 0.0605 |
| 20000 | 4000 | 0.0450 | 0.0515 | 0.0945* | 0.052 | 0.0555 |
*Based on m=1999 simulations
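
An estimated type I error rate is the fraction of null simulations with a p-value below the nominal level, so it carries Monte Carlo error. The following minimal sketch shows how one might gauge whether a tabulated rate is compatible with the nominal 0.05. It assumes m = 2000 simulations per cell (inferred from the footnote, not stated in this record) and uses a normal approximation to the binomial; the function names are hypothetical.

```python
import math


def type1_error_interval(alpha=0.05, m=2000, z=1.96):
    """Approximate 95% range for an estimated type I error rate.

    Assumes the true rate equals alpha and m independent null simulations
    (m = 2000 is an assumption, not a value confirmed by the source).
    """
    se = math.sqrt(alpha * (1 - alpha) / m)  # binomial standard error
    return alpha - z * se, alpha + z * se


def is_consistent(rate, alpha=0.05, m=2000):
    """True if the estimated rate lies inside the approximate 95% range."""
    lo, hi = type1_error_interval(alpha, m)
    return lo <= rate <= hi


if __name__ == "__main__":
    lo, hi = type1_error_interval()
    print(f"approximate 95% range around 0.05: ({lo:.4f}, {hi:.4f})")
    print(is_consistent(0.0475), is_consistent(0.102))
```

Rates outside this range suggest inflation or deflation beyond simulation noise, subject to the assumed m.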
Fig. 1. Type 1 error rates for the three mixed model methods (LEAP, GMMAT and CARAT). The allele frequency in population 1, p1, was fixed at 0.5. The allele frequency in population 2, p2, ranged from 0.5 to 0.9. The x-axis corresponds to the p2 value. The orange line represents GMMAT, the blue line represents LEAP, and the green line represents CARAT. The horizontal line indicates the alpha value of 0.05
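
To illustrate the kind of allele frequency difference the caption describes, here is a minimal sketch of drawing genotypes for a single variant in two populations. It is not the authors' simulation procedure: it assumes Hardy-Weinberg equilibrium within each population (each genotype is the sum of two independent allele draws), and the function names, sample sizes and seed are hypothetical.

```python
import random


def simulate_genotypes(n1, n2, p1=0.5, p2=0.7, seed=0):
    """Draw 0/1/2 genotypes for two populations with allele frequencies p1, p2.

    Assumption: Hardy-Weinberg equilibrium within each population.
    """
    rng = random.Random(seed)

    def draw(p):
        # Sum of two independent Bernoulli(p) allele draws
        return int(rng.random() < p) + int(rng.random() < p)

    pop1 = [draw(p1) for _ in range(n1)]
    pop2 = [draw(p2) for _ in range(n2)]
    return pop1, pop2


def allele_frequency(genotypes):
    """Sample allele frequency from 0/1/2 genotype counts."""
    return sum(genotypes) / (2 * len(genotypes))


if __name__ == "__main__":
    # p1 fixed at 0.5; p2 stepped from 0.5 to 0.9 as on the figure's x-axis
    for p2 in (0.5, 0.6, 0.7, 0.8, 0.9):
        g1, g2 = simulate_genotypes(5000, 5000, p1=0.5, p2=p2)
        print(p2, round(allele_frequency(g1), 3), round(allele_frequency(g2), 3))
```

The larger the gap between p1 and p2, the stronger the stratification a correction method has to absorb.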
Estimated type I error rates for the rare variant mixed model methods implemented in SMMAT
| SMMAT Method | Estimated Type I Error Rate |
|---|---|
| Burden (SMMAT-B) | 0.0617 |
| SKAT (SMMAT-S) | 0.1039 |
| SKAT-O (SMMAT-O) | 0.1024 |
| Efficient (SMMAT-E) | 0.0883 |
Fig. 2. Quantile-Quantile plots of the −log10 of the p-values from the four SMMAT rare variant association tests (Burden, SKAT, SKAT-O and Efficient)
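
For readers unfamiliar with how such a plot is built, the sketch below computes the coordinates of a −log10 p-value QQ plot. It is a generic construction, not the authors' plotting code: the expected quantiles use the common (i − 0.5)/n plotting position, which is an assumption, and `qq_points` is a hypothetical helper name.

```python
import math


def qq_points(pvalues):
    """Return (expected, observed) -log10 p-value pairs for a QQ plot.

    Expected values come from the uniform distribution on (0, 1), using the
    plotting position (i - 0.5) / n for the i-th smallest p-value.
    All p-values must be strictly positive.
    """
    n = len(pvalues)
    # Smallest p-value first, i.e. largest -log10 value first
    observed = sorted((-math.log10(p) for p in pvalues), reverse=True)
    expected = [-math.log10((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(expected, observed))


if __name__ == "__main__":
    for expected, observed in qq_points([0.8, 0.03, 0.4, 0.0005, 0.2]):
        print(f"{expected:.3f}\t{observed:.3f}")
```

Points lying above the diagonal at the upper end indicate more small p-values than expected under the null, which is how inflated type I error shows up visually.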
Estimated power for detecting a causal variant of two different effect sizes (β=0.15 and β=0.25) in the presence of population stratification
| Method | Estimated Power (β=0.15) | Estimated Power (β=0.25) |
|---|---|---|
| LEAP | 0.31 | 0.48 |
| GMMAT | 0.34 | 0.51 |
| GEMMA | 0.44 | 0.52 |
| PCA | 0.35 | 0.49 |
Fig. 3. Quantile-Quantile plot of population stratification adjusted GMMAT, LEAP, GEMMA and uncorrected PLINK in the GWAS analysis of the case-control dataset