Kacper Zukowski, Tomasz Suchocki, Anna Gontarek, Joanna Szyda.
Abstract
The study focuses on the impact of different sets of single nucleotide polymorphisms (SNPs), selected from the available data set, on the prediction of genome-wide breeding values (GBVs) of animals. Correlations between breeding values estimated as additive polygenic effects (EBVs) and GBVs, as well as correlations between true breeding values (TBVs) and GBVs, are used as the major criteria for comparing SNP selection schemes and GBV estimation models.

The analysed data is the simulated data set from the XII QTL Workshop. Five different SNP data sets are considered. For prediction of EBVs a standard mixed animal model is applied, whereas GBVs are defined as the sum of additive effects of SNPs estimated for the different SNP data sets using three models: model 1 with fixed SNP effects, model 2 with fixed SNP effects and a random additive polygenic effect, and model 3 with random effects of uncorrelated SNP genotypes.

The additive polygenic and residual variance components estimated by the EBV model amount to 1.36 and 3.12, respectively. Differences between models are expressed by comparing the ranking of individuals based on EBV and on GBV, and by correlations. Among the 100 individuals with the highest EBVs, depending on the model and data set, there are only between 11 and 37 individuals with the highest GBVs. The highest correlation between GBV and EBV amounts to 0.787 and is observed for model 3 with 3,328 SNPs selected based on their minor allele frequency; the lowest correlation, 0.519, is attributed to model 2 with 300 SNPs. Correlations between GBV estimates obtained from different models with the same number of SNPs range between 0.916 and 0.998, whereas correlations between different SNP data sets using the same model fall under 0.850.

These results indicate that successful application of high-throughput SNP genotyping technologies for prediction of breeding values is a very promising approach, but before the method can be routinely applied, further methodological improvements regarding model construction and SNP selection are required.
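The definition of a GBV as the sum of additive SNP effects can be illustrated with a minimal Python sketch. The 0/1/2 genotype coding (copies of one allele) and the function name `genomic_breeding_value` are assumptions made for illustration; the abstract does not state how genotypes were coded or how the SNP effects were estimated in practice.

```python
def genomic_breeding_value(genotypes, snp_effects):
    """Sum of additive SNP effects for one animal.

    genotypes   -- allele counts per SNP (assumed coding: 0, 1 or 2)
    snp_effects -- estimated additive effect per SNP, in the same order
    """
    if len(genotypes) != len(snp_effects):
        raise ValueError("genotypes and snp_effects must have equal length")
    # Each SNP contributes (allele count) * (additive effect) to the GBV.
    return sum(g * a for g, a in zip(genotypes, snp_effects))


# Hypothetical example values, not taken from the workshop data set.
example_gbv = genomic_breeding_value([2, 1, 0], [0.5, -0.2, 0.1])
```

The SNP effects passed in would come from whichever estimation model is used (fixed or random SNP effects); the sketch covers only the final summation step.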
Year: 2009 PMID: 19278539 PMCID: PMC2654494 DOI: 10.1186/1753-6561-3-s1-s13
Source DB: PubMed Journal: BMC Proc ISSN: 1753-6561
Differences in top 100 ranking of individuals.
| SNP data set | Model 1 | Model 2 | Model 3 |
|---|---|---|---|
| GBVSNP6000 | 41 | NA | NA |
| GBVSNP3328 | 36 | 35 | 35 |
| GBVSNP1200 | 30 | 32 | 37 |
| GBVSNP600 | 26 | 18 | 31 |
| GBVSNP300 | 23 | 11 | 25 |
Number of the 100 best individuals as ranked by each GBV model that are also among the 100 best individuals as ranked by the EBV model, calculated for individuals from the first four generations. GBVs are calculated for different SNP data sets, as indicated by the SNP count in each row label. NA, not available.
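The overlap count reported in this table can be sketched in a few lines of Python. This is a minimal illustration, assuming each set of breeding values is held in a dictionary keyed by animal ID; the function name `top_k_overlap` is hypothetical, and ties are broken by sort order, which the source does not specify.

```python
def top_k_overlap(ebv, gbv, k=100):
    """Count animals that are in the top k by EBV and also in the top k by GBV.

    ebv, gbv -- dicts mapping animal ID to breeding value (assumed layout)
    """
    # Rank animal IDs by descending breeding value and keep the best k.
    top_ebv = set(sorted(ebv, key=ebv.get, reverse=True)[:k])
    top_gbv = set(sorted(gbv, key=gbv.get, reverse=True)[:k])
    return len(top_ebv & top_gbv)
```

With `k=100`, a returned value of 37 would correspond to a table entry of 37.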
Figure 1. Differences in ranking of individuals based on EBV and on GBVs. Individual differences in ranks based on EBV and different GBV models and for different SNP data sets, calculated for animals from the first four generations and sorted in ascending order. Model 1 is represented by black curves, model 2 by red curves, and model 3 by green curves. The best (lowest differences) and the worst (highest differences) models are represented by dashed curves.
Correlation between EBV and GBV.
| SNP data set | Model 1 | Model 2 | Model 3 |
|---|---|---|---|
| GBVSNP6000 | 0.761 | NA | NA |
| GBVSNP3328 | 0.750 | 0.758 | 0.787 |
| GBVSNP1200 | 0.742 | 0.714 | 0.777 |
| GBVSNP600 | 0.720 | 0.643 | 0.745 |
| GBVSNP300 | 0.665 | 0.519 | 0.694 |
Correlations between EBV and GBV calculated for individuals from the first four generations and for different SNP data sets, as indicated by the SNP count in each row label. NA, not available.
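A correlation of this kind can be computed as sketched below. The source does not name the correlation coefficient used, so the Pearson product-moment correlation is an assumption here, as is the function name `pearson_r`.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences of values."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squares of deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)
```

Here `x` would hold the EBVs and `y` the GBVs of the same animals, listed in the same order.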
Figure 2. Correlations between GBVs. Correlations (r) between GBVs estimated by different models and for different SNP data sets. Models are indicated in parentheses, followed by the number of SNPs used.
Residual variances.
| SNP data set | Model 1 | Model 2 | Model 3 |
|---|---|---|---|
| GBVSNP6000 | 0.20 | NA | NA |
| GBVSNP3328 | 0.91 | 0.64 | 3.00 |
| GBVSNP1200 | 2.29 | 2.29 | 3.11 |
| GBVSNP600 | 2.82 | 2.82 | 3.23 |
| GBVSNP300 | 3.31 | 3.31 | 3.45 |
Residual variances calculated for the GBV estimation models and for different SNP data sets, as indicated by the SNP count in each row label. NA, not available.