Ling-Yun Chang1,2, Sajjad Toghiani1,3, El Hamidi Hay3, Samuel E Aggrey4,5, Romdhane Rekaya1,5.
Abstract
A dramatic increase in the density of marker panels was expected to increase the accuracy of genomic selection (GS); unfortunately, little to no improvement has been observed. Including all variants in the association model dramatically increases the dimensionality of the problem, which could reduce statistical power. Using all single nucleotide polymorphisms (SNPs) to compute the genomic relationship matrix (G) does not necessarily increase accuracy, as the additive relationships can be accurately estimated using a much smaller number of markers. Due to these limitations, variant prioritization has become a necessity to improve accuracy. The fixation index (FST), a measure of population differentiation, has been used to identify genome segments and variants under selection pressure, and using variants prioritized in this way has increased the accuracy of GS. Additionally, FST can be used to weight the relative contribution of prioritized SNPs in computing G. In this study, relative weights based on FST scores were developed and incorporated into the calculation of G, and their impact on the estimation of variance components and on accuracy was assessed. The results showed that prioritizing SNPs based on their FST scores increased the genetic similarity between training and validation animals and improved the accuracy of GS by more than 5%.
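One way to picture how group-level weights could enter G is the sketch below. It assumes a VanRaden-style weighted relationship matrix, G = Z D Z' / Σ 2p(1−p)d, in which the prioritized SNPs share a fraction `share_prioritized` of the total weight and the remaining SNPs share the rest. The function name, the normalization, and the even split of weight within each group are illustrative assumptions, not the formulas used in the study.

```python
import numpy as np

def weighted_grm(genotypes, prioritized_mask, share_prioritized):
    """Weighted genomic relationship matrix (illustrative sketch).

    genotypes: (n_animals, n_snps) array coded 0/1/2.
    prioritized_mask: boolean array, True for prioritized SNPs.
    share_prioritized: fraction (0..1) of the total weight given to the
        prioritized group, e.g. 0.75 for a (75,25) scenario.
    """
    M = np.asarray(genotypes, dtype=float)
    mask = np.asarray(prioritized_mask, dtype=bool)
    n_top = int(mask.sum())
    n_rest = mask.size - n_top
    if n_top == 0 or n_rest == 0:
        raise ValueError("both SNP groups must be non-empty")

    p = M.mean(axis=0) / 2.0   # allele frequency per SNP
    Z = M - 2.0 * p            # centered genotypes

    # Assumption: weight is split evenly among the SNPs within each group.
    d = np.where(mask,
                 share_prioritized / n_top,
                 (1.0 - share_prioritized) / n_rest)

    scale = np.sum(2.0 * p * (1.0 - p) * d)
    return (Z * d) @ Z.T / scale

# Toy data: 6 animals, 40 SNPs, the first 10 treated as "prioritized".
rng = np.random.default_rng(0)
geno = rng.integers(0, 3, size=(6, 40))
top = np.zeros(40, dtype=bool)
top[:10] = True
G_75_25 = weighted_grm(geno, top, 0.75)
```

Under these assumptions, setting `share_prioritized` to the proportion of prioritized SNPs in the panel gives every SNP the same weight and recovers the unweighted matrix, while a value of 1.0 is equivalent to building G from the prioritized SNPs alone.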
Keywords: accuracy; genomic selection; high density; sequence data
Year: 2019 PMID: 31726712 PMCID: PMC6895924 DOI: 10.3390/genes10110922
Source DB: PubMed Journal: Genes (Basel) ISSN: 2073-4425 Impact factor: 4.096
Variance components and heritability (SE) for different weighting scenarios of the prioritized 20K and the remaining 380K SNPs when the full panel (400K SNPs) was used to compute the genomic relationship matrix (average over 5 replicates).
| Scenario 2 | Weighting (%): 20K 1 | Weighting (%): 380K | Genetic Variance | Residual Variance | Heritability |
|---|---|---|---|---|---|
| 1 = (100,0) | 100 | 0 | 0.196 (0.026) | 0.671 (0.042) | 0.228 (0.033) |
| 2 = (90,10) | 90 | 10 | 0.213 (0.018) | 0.648 (0.032) | 0.247 (0.023) |
| 3 = (75,25) | 75 | 25 | 0.232 (0.015) | 0.633 (0.025) | 0.268 (0.018) |
| 4 = (50,50) | 50 | 50 | 0.257 (0.016) | 0.618 (0.021) | 0.294 (0.018) |
| 5 = (25,75) | 25 | 75 | 0.279 (0.021) | 0.619 (0.021) | 0.311 (0.023) |
| 6 = (PS 3,PS) | PS | PS | 0.251 (0.032) | 0.629 (0.037) | 0.285 (0.037) |
| 7 = Equal weights | Equal weights | Equal weights | 0.247 (0.027) | 0.692 (0.016) | 0.263 (0.025) |
1 Top 20K SNPs based on FST scores; 2 (x,y) are the percentages of the weights allocated to the prioritized top 20K and the remaining 380K SNPs, respectively; 3 contribution proportional to the SNP FST score.
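The heritability column can be sanity-checked against the variance components, assuming the usual ratio of genetic variance to the sum of genetic and residual variance (the model is not spelled out here, so this definition is an assumption). Because the table reports averages over 5 replicates, the ratio of the averaged components need not match the averaged heritability exactly. The helper name below is hypothetical.

```python
def heritability(genetic_var, residual_var):
    """Narrow-sense heritability, assuming only additive genetic and residual variance."""
    return genetic_var / (genetic_var + residual_var)

# (genetic variance, residual variance, reported heritability) from the table above
table = {
    "(100,0)": (0.196, 0.671, 0.228),
    "(90,10)": (0.213, 0.648, 0.247),
    "(75,25)": (0.232, 0.633, 0.268),
    "(50,50)": (0.257, 0.618, 0.294),
    "(25,75)": (0.279, 0.619, 0.311),
    "(PS,PS)": (0.251, 0.629, 0.285),
    "Equal weights": (0.247, 0.692, 0.263),
}

for scenario, (vg, ve, reported) in table.items():
    print(f"{scenario}: computed {heritability(vg, ve):.3f}, reported {reported:.3f}")
```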
Distribution of off-diagonal elements (OD) of the genomic relationship matrix corresponding to the training and validation individuals using all 400K SNPs and for different weighting scenarios for the prioritized 1 (20K) and nonprioritized (380K) SNPs (in %).
| OD | | |
|---|---|---|
| OD < −0.05 | 2.32 | 1.61 |
| −0.05 < OD < −0.03 | 9.85 | 8.39 |
| −0.03 < OD < −0.01 | 28.18 | 29.35 |
| −0.01 < OD < 0.01 | 33.48 | 36.14 |
| 0.01 < OD < 0.03 | 17.21 | 16.52 |
| 0.03 < OD < 0.05 | 5.52 | 4.86 |
| OD > 0.05 | 3.46 | 4.86 |
1 SNPs selected based on FST scores.
Distribution of off-diagonal elements (OD) of the genomic relationship matrix corresponding to the training and validation individuals using the prioritized 1 20K SNPs and for different weighting scenarios (in %).
| OD | Weights | No Weight | Scenario (100,0) 2 | Scenario (90,10) | Scenario (75,25) | Scenario (50,50) | Scenario (25,75) |
|---|---|---|---|---|---|---|---|
| OD < −0.05 | 0.92 | 0.00 | 2.32 | 1.66 | 0.92 | 0.25 | 0.03 |
| −0.05 < OD < −0.03 | 4.60 | 0.77 | 9.85 | 8.72 | 6.81 | 3.67 | 1.42 |
| −0.03 < OD < −0.01 | 30.26 | 29.77 | 28.18 | 29.16 | 30.45 | 31.72 | 31.22 |
| −0.01 < OD < 0.01 | 43.93 | 52.43 | 33.49 | 35.54 | 38.86 | 44.57 | 49.81 |
| 0.01 < OD < 0.03 | 13.52 | 11.52 | 17.21 | 16.74 | 15.74 | 13.64 | 11.89 |
| 0.03 < OD < 0.05 | 4.04 | 3.30 | 5.52 | 5.01 | 4.36 | 3.68 | 3.36 |
| OD > 0.05 | 2.73 | 2.21 | 3.46 | 3.19 | 2.86 | 2.49 | 2.27 |
1 SNPs selected based on FST scores; 2 (x,y) are the percentages of the weights allocated to the prioritized top 20K and the remaining 380K SNPs, respectively.
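A distribution like the ones tabulated above could be obtained by binning the training-by-validation block of G. The sketch below is only an illustration: the function name and index arguments are hypothetical, and values that fall exactly on a bin boundary are assigned to the upper bin, a choice the table does not specify.

```python
import numpy as np

# Interior bin boundaries matching the OD classes in the tables.
INNER_EDGES = np.array([-0.05, -0.03, -0.01, 0.01, 0.03, 0.05])

def od_distribution(G, train_idx, valid_idx):
    """Percentage of training-validation relationships falling in each OD class."""
    block = np.asarray(G, dtype=float)[np.ix_(train_idx, valid_idx)].ravel()
    # Class 0 is OD < -0.05, class 6 is OD >= 0.05.
    classes = np.searchsorted(INNER_EDGES, block, side="right")
    counts = np.bincount(classes, minlength=INNER_EDGES.size + 1)
    return 100.0 * counts / block.size

# Toy 4 x 4 matrix: animals 0-1 are "training", animals 2-3 are "validation".
G_toy = np.eye(4)
G_toy[0, 2] = G_toy[2, 0] = -0.06
G_toy[0, 3] = G_toy[3, 0] = 0.00
G_toy[1, 2] = G_toy[2, 1] = 0.02
G_toy[1, 3] = G_toy[3, 1] = 0.10
percentages = od_distribution(G_toy, [0, 1], [2, 3])
```

Only the rectangular block linking the two groups is used, so diagonal elements and within-group relationships do not enter the percentages.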
Figure 1. Accuracy of genomic prediction for different weighting scenarios for the contribution of the 20K prioritized SNPs and the remaining 380K markers (x,y). Horizontal lines indicate the accuracy using only the top 20K SNPs with (red) or without (green) SNP weights.
Residual variance and log-likelihood of the model and the parameters (intercept and slope) of the regression of the estimated on the true breeding values for different weighting scenarios.
| Scenario | Residual Variance | Intercept | Slope | −2LogL |
|---|---|---|---|---|
| 1 = (100,0) | 0.671 (0.04) | −1.208 (0.04) | 0.664 (0.03) | 26,409.13 (310.78) |
| 2 = (90,10) | 0.648 (0.03) | −1.228 (0.04) | 0.675 (0.02) | 26,397.40 (275.55) |
| 3 = (75,25) | 0.633 (0.03) | −1.244 (0.04) | 0.683 (0.02) | 26,404.90 (233.44) |
| 4 = (50,50) | 0.618 (0.02) | −1.240 (0.05) | 0.682 (0.02) | 26,489.84 (180.68) |
| 5 = (25,75) | 0.619 (0.02) | −1.185 (0.06) | 0.651 (0.02) | 26,746.99 (106.85) |
| 6 = (PS,PS) | 0.629 (0.04) | −1.205 (0.04) | 0.662 (0.03) | 26,619.76 (232.62) |
| 7 = Equal weights | 0.692 (0.02) | −0.921 (0.13) | 0.505 (0.05) | 27,378.30 (82.38) |
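The intercept and slope in this table come from regressing estimated on true breeding values, which requires the true values to be known (as they would be in simulated data). A minimal sketch of that summary follows; the function name is hypothetical, and reporting accuracy as the correlation between true and estimated breeding values is an assumption about the definition used.

```python
import numpy as np

def validation_summary(true_bv, estimated_bv):
    """Intercept and slope of the regression of estimated on true breeding
    values, plus their correlation as a measure of accuracy."""
    true_bv = np.asarray(true_bv, dtype=float)
    estimated_bv = np.asarray(estimated_bv, dtype=float)
    # polyfit returns coefficients from highest degree down: slope, then intercept.
    slope, intercept = np.polyfit(true_bv, estimated_bv, deg=1)
    accuracy = np.corrcoef(true_bv, estimated_bv)[0, 1]
    return intercept, slope, accuracy

# Toy check: estimates that are an exact linear function of the true values.
true_values = np.array([0.0, 1.0, 2.0, 3.0])
estimates = 0.5 * true_values - 1.0
intercept, slope, accuracy = validation_summary(true_values, estimates)
```

A slope below 1, as in every scenario of the table, means the estimated breeding values are less spread out than the true ones.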