Ali Toosi, Rohan L Fernando, Jack C M Dekkers.
Abstract
BACKGROUND: Population stratification and cryptic relationships have been the main sources of excess false positives and false negatives in population-based association studies. Many methods have been developed to model these confounding factors and minimize their impact on the results of genome-wide association studies. Most of these methods apply a two-stage approach: (1) the sample dataset is tested for population structure, and (2) the effects of any structure are corrected, either by modeling it or by running a separate analysis within each sub-population. The objective of this study was to evaluate the impact of population structure on the accuracy and power of genome-wide association studies using a Bayesian multiple regression method.
Year: 2018 PMID: 29914353 PMCID: PMC6006859 DOI: 10.1186/s12711-018-0402-1
Source DB: PubMed Journal: Genet Sel Evol ISSN: 0999-193X Impact factor: 4.297
Simulated QTL positions (cM) and effects
| Chromosome | QTL position (cM) | % of phenotypic variance explained by QTL |
|---|---|---|
| 1 | 60 | 0.01 |
| 1 | 61 | 0.01 |
| 1 | 95 | 0.01 |
| 2 | 121 | 0.01 |
| 2 | 125 | 0.01 |
| 2 | 160 | 0.01 |
| 3 | 205 | 0.01 |
| 3 | 215 | 0.01 |
| 3 | 225 | 0.01 |
| 3 | 240 | 0.01 |
| 1 | 75 | 0.03 |
| 2 | 120 | 0.03 |
| 2 | 180 | 0.03 |
| 3 | 270 | 0.03 |
| 1 | 15 | 0.06 |
Fig. 1 Scatter plots of the first two principal components of the genome-wide markers in the admixed (left) and the purebred (right) populations. Numbers in brackets show the percentage of variance explained by the corresponding PC. Different colors represent different breed compositions.
Fig. 2 Q–Q plots of the observed distribution of −log10(P-values) on the null chromosomes, under different analysis approaches, versus their expected distribution. PB: purebred population; ADMX: admixed population; SMA: single marker association; SMA_BC: SMA with breed composition; MLM: mixed linear model association; MLM_BC: MLM with breed composition.
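The points in a Q–Q plot like the one described above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes that null P-values are uniformly distributed and uses the common i/(n+1) plotting position for the expected quantiles. The function name `qq_points` and the example P-values are hypothetical.

```python
import math

def qq_points(p_values):
    """Return (expected, observed) -log10(P) pairs, sorted ascending.

    Assumption: under the null, the i-th smallest of n P-values is
    expected to be about i / (n + 1).
    """
    n = len(p_values)
    # Observed -log10(P), smallest value first
    observed = sorted(-math.log10(p) for p in p_values)
    # Expected -log10(P) from the uniform plotting positions
    expected = sorted(-math.log10(i / (n + 1)) for i in range(1, n + 1))
    return list(zip(expected, observed))

# Hypothetical P-values, for illustration only
points = qq_points([0.5, 0.05, 0.005])
for exp, obs in points:
    print(f"expected={exp:.3f} observed={obs:.3f}")
```

Points that rise above the diagonal (observed larger than expected) indicate inflated test statistics, which is the pattern such plots are used to detect on null chromosomes.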
Accuracy, power, false positive rate and positive predictive value (PPV) for the SMA and MLM analyses with the NCHR method of finding thresholds in the ADMX population
| | SMA | SMA_BC | MLM | MLM_BC |
|---|---|---|---|---|
| Accuracy | 0.86 (0.003) | 0.87 (0.005) | 0.86 (0.005) | 0.87 (0.004) |
| Power | 0.40 (0.027) | 0.63 (0.038) | 0.40 (0.049) | 0.64 (0.035) |
| False positive rate | 0.08 (0.003) | 0.10 (0.007) | 0.08 (0.006) | 0.11 (0.006) |
| PPV | 0.34 (0.016) | 0.43 (0.014) | 0.36 (0.031) | 0.43 (0.012) |
Numbers in brackets are SE of means
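The four metrics reported in these tables have standard definitions in terms of true/false positives and negatives. The sketch below uses those textbook definitions; this excerpt does not state what unit is classified (marker, window, or region), so that choice is left open, and the function name `classification_metrics` and the counts in the example are hypothetical.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute accuracy, power, false positive rate and PPV from counts.

    tp/fn: units truly associated with a QTL that were / were not declared.
    fp/tn: units with no QTL that were / were not declared.
    Assumes every denominator is non-zero.
    """
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,           # share of correct calls
        "power": tp / (tp + fn),                 # true positive rate
        "false_positive_rate": fp / (fp + tn),   # false calls among nulls
        "ppv": tp / (tp + fp),                   # true calls among positives
    }

# Hypothetical counts, for illustration only (not taken from the study)
metrics = classification_metrics(tp=6, fp=5, tn=85, fn=4)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because true nulls usually far outnumber true QTL, accuracy can remain high even when power or PPV is low, which is why the tables report all four quantities.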
Accuracy, power, false positive rate and positive predictive value (PPV) for SMA and MLM analyses with the SLIDE method of finding thresholds in the ADMX population
| | SMA | SMA_BC | MLM | MLM_BC |
|---|---|---|---|---|
| Accuracy | 0.58 (0.047) | 0.92 (0.003) | 0.69 (0.078) | 0.92 (0.002) |
| Power | 0.72 (0.034) | 0.30 (0.026) | 0.63 (0.063) | 0.27 (0.021) |
| False positive rate | 0.44 (0.056) | 0.007 (0.001) | 0.30 (0.096) | 0.004 (0.001) |
| PPV | 0.25 (0.030) | 0.85 (0.018) | 0.34 (0.057) | 0.90 (0.017) |
Numbers in brackets are SE of means
Accuracy, power, false positive rate and positive predictive value (PPV) for BMR analysis in the ADMX population
| | BMR | BMR_BC |
|---|---|---|
| Accuracy | 0.89 (0.002) | 0.89 (0.002) |
| Power | 0.65 (0.016) | 0.58 (0.017) |
| False positive rate | 0.08 (0.001) | 0.07 (0.001) |
| PPV | 0.51 (0.007) | 0.51 (0.008) |
Numbers in brackets are SE of means
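The "mean (SE)" entries in the tables can be reproduced from per-replicate values with the usual standard error of the mean, the sample standard deviation divided by the square root of the number of replicates. The sketch below assumes that definition; the function name `mean_and_se` and the replicate values are hypothetical, and the number of replicates used in the study is not given in this excerpt.

```python
import math
import statistics

def mean_and_se(values):
    """Return (mean, standard error of the mean) for replicate values.

    Uses the sample standard deviation (n - 1 denominator), so at least
    two replicates are required.
    """
    n = len(values)
    mean = statistics.fmean(values)
    se = statistics.stdev(values) / math.sqrt(n)
    return mean, se

# Hypothetical power estimates from three simulation replicates
mean, se = mean_and_se([0.60, 0.65, 0.70])
print(f"{mean:.2f} ({se:.3f})")
```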