Ling-Yun Chang1, Sajjad Toghiani2, Ashley Ling2, Sammy E Aggrey3,4, Romdhane Rekaya2,4.
Abstract
BACKGROUND: The availability of high-density (HD) marker panels, genome-wide variants and sequence data creates an unprecedented opportunity to dissect the genetic basis of complex traits, enhance genomic selection (GS) and identify causal variants of disease. The disproportional increase in the number of parameters in the genetic association model compared to the number of phenotypes has led to further deterioration in statistical power and an increase in co-linearity and false positive rates. At best, HD panels do not significantly improve GS accuracy and, at worst, reduce accuracy. This is true for both regression and variance component approaches. To remedy this situation, some form of single nucleotide polymorphism (SNP) filtering or external information is needed. Current methods for prioritizing SNP markers (i.e. BayesB, BayesCπ) are sensitive to the increased co-linearity in HD panels, which could limit their performance.
Keywords: Genomic selection; High density; SNP prioritizing
Year: 2018 PMID: 29304753 PMCID: PMC5756446 DOI: 10.1186/s12863-017-0595-2
Source DB: PubMed Journal: BMC Genet ISSN: 1471-2156 Impact factor: 2.797
Descriptive statistics of simulation schemes
| Parameter | Value |
|---|---|
| Historical Population (HP) | |
| Number of generations | 315 |
| Mutation rate for markers | 10⁻⁴ |
| Mutation rate for QTL | 10⁻⁴ |
| Founder Population (G0) | |
| Number of generations | 3 |
| Number of males | 1500 |
| Number of females | 15,000 |
| Selection Population (G3) | |
| Number of chromosomes | 10 |
| Length per chromosome (cM) | 100 |
| Number of markers per generation | 200,000/400,000 |
| Marker distribution | Evenly spaced |
| Number of QTL per generation | 100 |
| QTL distribution | Randomly distributed |
| QTL effect | Sampled from gamma with shape 0.4 |
| Heritability | 0.4 |
| Genetic variance | 0.4 |
| Residual variance | 0.6 |
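The variance settings above imply a heritability of 0.4 / (0.4 + 0.6) = 0.4. The following is a minimal sketch of how phenotypes with these parameters could be generated; it is not the simulation software used in the study. The function name `simulate_phenotypes`, the gamma scale of 1.0, the uniform allele frequencies, the independent loci and the number of animals are all assumptions made for illustration.

```python
import random
import statistics


def simulate_phenotypes(n_animals=500, n_qtl=100, shape=0.4,
                        genetic_var=0.4, residual_var=0.6, seed=1):
    """Return (true genetic values, phenotypes) for a toy population."""
    rng = random.Random(seed)
    # QTL effects drawn from a gamma distribution with shape 0.4
    # (the scale of 1.0 is an assumption; effects are rescaled below).
    effects = [rng.gammavariate(shape, 1.0) for _ in range(n_qtl)]
    # Hypothetical allele frequencies; loci are treated as independent.
    freqs = [rng.uniform(0.1, 0.9) for _ in range(n_qtl)]

    raw = []
    for _ in range(n_animals):
        value = 0.0
        for effect, p in zip(effects, freqs):
            # Genotype coded as the number of copies (0, 1 or 2) of one allele.
            copies = (rng.random() < p) + (rng.random() < p)
            value += effect * copies
        raw.append(value)

    # Centre and rescale so the genetic values have variance genetic_var.
    mean_raw = statistics.fmean(raw)
    scale = genetic_var ** 0.5 / statistics.pstdev(raw)
    genetic = [(v - mean_raw) * scale for v in raw]
    # Add an independent normal residual with variance residual_var.
    phenotypes = [g + rng.gauss(0.0, residual_var ** 0.5) for g in genetic]
    return genetic, phenotypes


genetic, phenotypes = simulate_phenotypes()
heritability = 0.4 / (0.4 + 0.6)
```

Rescaling the genetic values after summing the QTL contributions keeps the realised genetic variance at the target value regardless of the gamma scale chosen.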
Preselected SNPs based on different cutoff values for the FST scores and different simulation scenarios
| Panel density | QTL effects¹ | Quantile² | FST score³ | Selected SNPs⁴ |
|---|---|---|---|---|
| 200 K | Gamma⁵ | 99.5 | 0.02 | 935 |
| 200 K | Gamma | 99.0 | 0.01 | 1956 |
| 200 K | Gamma | 97.5 | 0.004 | 4932 |
| 200 K | Predefined⁶ | 99.5 | 0.009 | 1076 |
| 200 K | Predefined | 99.0 | 0.007 | 2171 |
| 200 K | Predefined | 97.5 | 0.005 | 5620 |
| 400 K | Gamma | 99.5 | 0.015 | 2078 |
| 400 K | Gamma | 99.0 | 0.009 | 3586 |
| 400 K | Gamma | 97.5 | 0.004 | 10,178 |
| 400 K | Predefined | 99.5 | 0.009 | 2036 |
| 400 K | Predefined | 99.0 | 0.007 | 4646 |
| 400 K | Predefined | 97.5 | 0.004 | 10,651 |
¹ Distribution used for the simulation of the QTL effects, ² quantiles of the FST score distribution, ³ cutoff point for the fixation index (FST), ⁴ number of selected SNPs based on the FST score cutoff, ⁵ Gamma distribution with shape parameter equal to 0.4, and ⁶ QTL effects pre-defined to explain at least 0.5% of genetic variance each
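The cutoff idea in this table can be sketched as a simple upper-tail selection on per-SNP scores. This is only an illustration under stated assumptions: the function `select_top_quantile` and the toy scores are hypothetical, and the FST scores are taken as already computed. The counts reported in the table are not exact fractions of the panel size, so the authors' procedure presumably differs in detail from this sketch.

```python
def select_top_quantile(scores, quantile):
    """Return (cutoff, indices) of SNPs scoring at or above the quantile."""
    n = len(scores)
    # Size of the upper tail, e.g. 0.5% of the panel for quantile=99.5.
    n_keep = max(1, round(n * (100.0 - quantile) / 100.0))
    # The cutoff is the smallest score that still falls in the upper tail.
    cutoff = sorted(scores, reverse=True)[n_keep - 1]
    selected = [i for i, score in enumerate(scores) if score >= cutoff]
    return cutoff, selected


# Hypothetical scores for a 1,000-SNP panel (not data from the study).
scores = [i / 1000 for i in range(1000)]
cutoff, selected = select_top_quantile(scores, 97.5)
print(cutoff, len(selected))
```

Tied scores at the cutoff are all retained, so the number of selected SNPs can exceed the nominal tail size when scores are not distinct.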
Fig. 1 Distribution of the simulated quantitative trait loci (QTL) along the ten chromosomes when their effects were simulated from a gamma distribution (a) or predefined (c) and their associated FST scores distribution (b) and (d) for the 200 K marker panel scenario. Horizontal dashed lines indicate the 99.5 (red), 99.0 (blue), and 97.5 (green) quantiles of the FST distribution
Number of selected SNPs, number of tagged QTL, percentage of genetic variance explained, and accuracies of genomic and phenotype prediction under different quantiles of the distribution of FST scores, sampling distributions for the QTL effects and densities of the marker panel using the proposed method. Standard errors of accuracies are listed in parentheses
| | All SNPs: Gamma² | All SNPs: Predefined³ | 97.5 quantile¹: Gamma | 97.5 quantile: Predefined | 99.0 quantile: Gamma | 99.0 quantile: Predefined | 99.5 quantile: Gamma | 99.5 quantile: Predefined |
|---|---|---|---|---|---|---|---|---|
| 200 K SNP marker panel | | | | | | | | |
| Selected SNP | 200 K | 200 K | 4932 | 5620 | 1956 | 2171 | 935 | 1076 |
| Tagged QTL⁴ | 95 | 97 | 33 | 69 | 18 | 47 | 13 | 31 |
| % GV⁵ | 91.29 | 98.60 | 83.70 | 71.27 | 73.57 | 49.69 | 64.08 | 35.10 |
| Acc_P⁶ | 0.462 | 0.445 | 0.503 | 0.490 | 0.472 | 0.415 | 0.434 | 0.359 |
| | (0.018) | (0.012) | (0.017) | (0.014) | (0.015) | (0.018) | (0.028) | (0.032) |
| Acc_G⁷ | 0.777 | 0.741 | 0.853 | 0.830 | 0.797 | 0.704 | 0.725 | 0.617 |
| | (0.017) | (0.012) | (0.019) | (0.023) | (0.017) | (0.031) | (0.037) | (0.026) |
| 400 K SNP marker panel | | | | | | | | |
| Selected SNP | 400 K | 400 K | 10,173 | 10,651 | 3586 | 4646 | 2078 | 2037 |
| Tagged QTL | 95 | 99 | 38 | 74 | 20 | 53 | 13 | 34 |
| % GV | 96.73 | 99.01 | 84.03 | 75.09 | 73.83 | 56.66 | 66.12 | 43.79 |
| Acc_P | 0.456 | 0.438 | 0.506 | 0.485 | 0.473 | 0.433 | 0.448 | 0.350 |
| | (0.015) | (0.017) | (0.014) | (0.017) | (0.029) | (0.021) | (0.039) | (0.028) |
| Acc_G | 0.775 | 0.735 | 0.860 | 0.813 | 0.807 | 0.722 | 0.765 | 0.685 |
| | (0.020) | (0.012) | (0.015) | (0.012) | (0.041) | (0.025) | (0.059) | (0.052) |
¹ quantile of the distribution of the FST scores, ² QTL effects sampled from a Gamma distribution, ³ QTL effects pre-defined to explain at least 0.5% of genetic variance (GV), ⁴ QTL with r² > 0.7 with at least one selected SNP, ⁵ GV = Genetic Variance, ⁶ accuracy of phenotype prediction, ⁷ accuracy of genomic prediction
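The "tagged QTL" criterion in the footnote (r² > 0.7 with at least one selected SNP) can be illustrated as follows. This is a sketch under assumptions: r² is taken here as the squared Pearson correlation between genotype codes (0, 1, 2), which may differ from the LD measure used by the authors, and the function names and toy genotypes are hypothetical.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two genotype vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a monomorphic locus carries no LD information
    return sxy * sxy / (sxx * syy)


def count_tagged_qtl(qtl_genotypes, snp_genotypes, threshold=0.7):
    """Count QTL whose r² with at least one selected SNP exceeds threshold."""
    return sum(
        1
        for qtl in qtl_genotypes
        if any(r_squared(qtl, snp) > threshold for snp in snp_genotypes)
    )


# Toy genotypes for six animals (illustrative only, not study data).
qtl_genotypes = [[0, 1, 2, 1, 0, 2], [0, 0, 1, 1, 2, 2]]
snp_genotypes = [[0, 1, 2, 1, 0, 2], [1, 1, 1, 0, 1, 1]]
n_tagged = count_tagged_qtl(qtl_genotypes, snp_genotypes)
```

In this toy example only the first QTL is tagged, because it is perfectly correlated with the first SNP, while the second QTL has r² well below 0.7 with both SNPs.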
Fig. 2 Distribution of the simulated QTL (in blue) and the preselected SNPs (in red) across the 10 chromosomes using the 99.5 (a) and 97.5 (b) quantiles of the FST scores under the predefined QTL effect and the 200 K marker panel simulation scenario. (* indicates the top 10% QTL)
Number of selected SNPs, number of tagged QTL, percentage of genetic variance explained, and accuracies of genomic and phenotype prediction under different π values, sampling distributions for the QTL effects and densities of the marker panel using the BayesB method. Standard errors of accuracies are listed in parentheses
| | (1-π) = 0.90: Gamma¹ | (1-π) = 0.90: Predefined² | (1-π) = 0.95: Gamma | (1-π) = 0.95: Predefined | (1-π) = 0.98: Gamma | (1-π) = 0.98: Predefined | (1-π) = 0.99: Gamma | (1-π) = 0.99: Predefined |
|---|---|---|---|---|---|---|---|---|
| 200 K marker density | | | | | | | | |
| # SNP | 20 K | 20 K | 10 K | 10 K | 4 K | 4 K | 2 K | 2 K |
| Tagged QTL³ | 78 | 98 | 63 | 97 | 54 | 94 | 48 | 91 |
| % GV⁴ | 89.31 | 98.16 | 86.43 | 97.88 | 84.30 | 95.76 | 83.88 | 93.20 |
| Acc_P⁵ | 0.473 | 0.463 | 0.478 | 0.471 | 0.489 | 0.487 | 0.499 | 0.500 |
| | (0.018) | (0.009) | (0.018) | (0.009) | (0.018) | (0.008) | (0.018) | (0.007) |
| Acc_G⁶ | 0.797 | 0.770 | 0.807 | 0.785 | 0.827 | 0.810 | 0.845 | 0.833 |
| | (0.017) | (0.008) | (0.017) | (0.007) | (0.018) | (0.007) | (0.018) | (0.005) |
| 400 K marker density | | | | | | | | |
| # SNP | 40 K | 40 K | 20 K | 20 K | 8 K | 8 K | 4 K | 4 K |
| Tagged QTL | 86 | 99 | 75 | 98 | 59 | 97 | 53 | 96 |
| % GV | 92.36 | 98.46 | 91.88 | 98.16 | 91.20 | 97.78 | 91.03 | 96.69 |
| Acc_P | 0.465 | 0.450 | 0.470 | 0.457 | 0.478 | 0.469 | 0.488 | 0.481 |
| | (0.015) | (0.018) | (0.015) | (0.018) | (0.014) | (0.018) | (0.013) | (0.019) |
| Acc_G | 0.790 | 0.756 | 0.799 | 0.767 | 0.813 | 0.787 | 0.829 | 0.807 |
| | (0.019) | (0.013) | (0.017) | (0.013) | (0.016) | (0.014) | (0.015) | (0.014) |
¹ QTL effects sampled from a Gamma distribution, ² QTL effects pre-defined to explain at least 0.5% of genetic variance (GV), ³ QTL with r² > 0.7 with at least one selected SNP, ⁴ GV = Genetic Variance, ⁵ accuracy of phenotype prediction, ⁶ accuracy of genomic prediction
Number of selected SNPs, number of tagged QTL, percentage of genetic variance explained, and accuracies of genomic and phenotype prediction under different π values, sampling distributions for the QTL effects and densities of the marker panel using the BayesC method. Standard errors of accuracies are listed in parentheses
| | (1-π) = 0.90: Gamma¹ | (1-π) = 0.90: Predefined² | (1-π) = 0.95: Gamma | (1-π) = 0.95: Predefined | (1-π) = 0.98: Gamma | (1-π) = 0.98: Predefined | (1-π) = 0.99: Gamma | (1-π) = 0.99: Predefined |
|---|---|---|---|---|---|---|---|---|
| 200 K marker density | | | | | | | | |
| # SNP | 20 K | 20 K | 10 K | 10 K | 4 K | 4 K | 2 K | 2 K |
| Tagged QTL³ | 76 | 97 | 61 | 96 | 53 | 94 | 46 | 91 |
| % GV⁴ | 88.84 | 97.66 | 86.56 | 97.53 | 86.30 | 95.74 | 85.76 | 93.32 |
| Acc_P⁵ | 0.453 | 0.451 | 0.467 | 0.459 | 0.484 | 0.477 | 0.496 | 0.493 |
| | (0.019) | (0.009) | (0.019) | (0.009) | (0.018) | (0.008) | (0.018) | (0.008) |
| Acc_G⁶ | 0.769 | 0.751 | 0.791 | 0.766 | 0.821 | 0.794 | 0.842 | 0.821 |
| | (0.017) | (0.009) | (0.018) | (0.008) | (0.018) | (0.009) | (0.018) | (0.006) |
| 400 K marker density | | | | | | | | |
| # SNP | 40 K | 40 K | 20 K | 20 K | 8 K | 8 K | 4 K | 4 K |
| Tagged QTL | 85 | 99 | 68 | 98 | 53 | 97 | 48 | 95 |
| % GV | 92.05 | 98.97 | 91.59 | 98.37 | 90.98 | 96.95 | 90.16 | 95.81 |
| Acc_P | 0.444 | 0.441 | 0.456 | 0.447 | 0.472 | 0.459 | 0.485 | 0.472 |
| | (0.013) | (0.017) | (0.013) | (0.017) | (0.014) | (0.017) | (0.014) | (0.018) |
| Acc_G | 0.754 | 0.740 | 0.773 | 0.749 | 0.802 | 0.769 | 0.824 | 0.791 |
| | (0.017) | (0.011) | (0.017) | (0.011) | (0.017) | (0.012) | (0.016) | (0.012) |
¹ QTL effects sampled from a Gamma distribution, ² QTL effects pre-defined to explain at least 0.5% of genetic variance (GV), ³ QTL with r² > 0.7 with at least one selected SNP, ⁴ GV = Genetic Variance, ⁵ accuracy of phenotype prediction, ⁶ accuracy of genomic prediction
Comparison of best accuracies between BayesB, BayesC, and the proposed method under different sampling distributions for the QTL effects and densities of the marker panel
| | 200 K marker panel: Gamma¹ | 200 K marker panel: Predefined² | 400 K marker panel: Gamma | 400 K marker panel: Predefined |
|---|---|---|---|---|
| Diff_acc_G³ | | | | |
| BayesB | −0.94 | 0.36 | −3.60 | −0.74 |
| BayesC | −1.29 | −1.08 | −4.19 | −2.71 |
| Diff_acc_P⁴ | | | | |
| BayesB | −0.80 | 2.04 | −3.56 | −0.82 |
| BayesC | −1.39 | 0.61 | −4.15 | −2.68 |
¹ QTL effects sampled from a Gamma distribution, ² QTL effects pre-defined to explain at least 0.5% of genetic variance, ³ percentage difference in genomic prediction accuracy compared to the proposed method, ⁴ percentage difference in phenotype prediction accuracy compared to the proposed method
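The Diff_acc_G values are consistent with a relative difference, 100 × (best accuracy of the method − best accuracy of the proposed method) / best accuracy of the proposed method, taking the best Acc_G of each method from the tables above. The sketch below reproduces that arithmetic; the function name `pct_diff` and the dictionary layout are illustrative choices, not part of the original analysis.

```python
def pct_diff(accuracy, reference):
    """Percentage difference of an accuracy relative to a reference."""
    return 100.0 * (accuracy - reference) / reference


# Best Acc_G per scenario, copied from the tables above. Order:
# 200 K Gamma, 200 K Predefined, 400 K Gamma, 400 K Predefined.
best_acc_g = {
    "proposed": [0.853, 0.830, 0.860, 0.813],  # 97.5 quantile
    "BayesB": [0.845, 0.833, 0.829, 0.807],    # (1-π) = 0.99
    "BayesC": [0.842, 0.821, 0.824, 0.791],    # (1-π) = 0.99
}

diff_acc_g = {
    method: [
        round(pct_diff(acc, ref), 2)
        for acc, ref in zip(values, best_acc_g["proposed"])
    ]
    for method, values in best_acc_g.items()
    if method != "proposed"
}
print(diff_acc_g)
```

A negative value means the method's best accuracy is lower than that of the proposed method in the same scenario.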