Matias Bermann, Daniela Lourenco, Ignacy Misztal.
Abstract
The objectives of this study were to develop an efficient algorithm for calculating prediction error variances (PEVs) for genomic best linear unbiased prediction (GBLUP) models using the Algorithm for Proven and Young (APY), extend it to single-step GBLUP (ssGBLUP), and apply this algorithm for approximating the theoretical reliabilities for single- and multiple-trait models in ssGBLUP. The PEV with APY was calculated by block sparse inversion, efficiently exploiting the sparse structure of the inverse of the genomic relationship matrix with APY. Single-step GBLUP reliabilities were approximated by combining reliabilities with and without genomic information in terms of effective record contributions. Multi-trait reliabilities relied on single-trait results adjusted using the genetic and residual covariance matrices among traits. Tests involved two datasets provided by the American Angus Association. A small dataset (Data1) was used for comparing the approximated reliabilities with the reliabilities obtained by the inversion of the left-hand side of the mixed model equations. A large dataset (Data2) was used for evaluating the computational performance of the algorithm. Analyses with both datasets used single-trait and three-trait models. The number of animals in the pedigree ranged from 167,951 in Data1 to 10,213,401 in Data2, with 50,000 and 20,000 genotyped animals for single-trait and multiple-trait analysis, respectively, in Data1 and 335,325 in Data2. Correlations between estimated and exact reliabilities obtained by inversion ranged from 0.97 to 0.99, whereas the intercept and slope of the regression of the exact on the approximated reliabilities ranged from 0.00 to 0.04 and from 0.93 to 1.05, respectively. For the three-trait model with the largest dataset (Data2), the elapsed time for the reliability estimation was 11 min. 
The computational complexity of the proposed algorithm increased linearly with the number of genotyped animals and with the number of traits in the model. This algorithm can efficiently approximate the theoretical reliability of genomic estimated breeding values in ssGBLUP with APY for large numbers of genotyped animals at a low cost.
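The sparsity that the abstract refers to can be illustrated with a small sketch. It assumes the standard APY definition from the literature (not restated in the text above): non-core animals are conditioned on core animals, so the non-core block of the inverse is diagonal. The function name `apy_inverse`, the toy marker matrix, and the choice of four core animals are all hypothetical, and the code builds a dense matrix for readability only; it does not reproduce the paper's block sparse inversion for PEV.

```python
import numpy as np

def apy_inverse(G, core):
    """Toy dense APY inverse of G (assumed standard APY formula).

    core: indices of core animals; all other animals are non-core.
    """
    n = G.shape[0]
    core = np.asarray(core)
    noncore = np.setdiff1d(np.arange(n), core)

    Gcc_inv = np.linalg.inv(G[np.ix_(core, core)])
    Gcn = G[np.ix_(core, noncore)]
    P = Gcc_inv @ Gcn  # core x non-core regression coefficients

    # Conditional variance of each non-core animal given the core animals:
    # m_i = g_ii - g_ic Gcc^{-1} g_ci
    m = G[noncore, noncore] - np.einsum("ij,ij->j", Gcn, P)
    m_inv = 1.0 / m

    Ginv = np.zeros((n, n))
    Ginv[np.ix_(core, core)] = Gcc_inv + (P * m_inv) @ P.T
    Ginv[np.ix_(core, noncore)] = -P * m_inv
    Ginv[np.ix_(noncore, core)] = (-P * m_inv).T
    Ginv[noncore, noncore] = m_inv  # diagonal non-core block
    return Ginv, core, noncore

# Made-up genotypes (12 animals, 200 markers), for illustration only.
rng = np.random.default_rng(0)
n_animals, n_markers = 12, 200
Z = rng.choice([0.0, 1.0, 2.0], size=(n_animals, n_markers))
Z -= Z.mean(axis=0)
# Relationship-like matrix; the small ridge keeps it positive definite.
G = Z @ Z.T / n_markers + 0.01 * np.eye(n_animals)

Ginv_apy, core, noncore = apy_inverse(G, np.arange(4))
nn_block = Ginv_apy[np.ix_(noncore, noncore)]
off_diag_nonzeros = np.count_nonzero(nn_block - np.diag(np.diag(nn_block)))
print("non-zero off-diagonals in the non-core block:", off_diag_nonzeros)
```

Only the core block and the core-by-non-core blocks are dense here, so under this assumed formula storage grows with the number of core animals times the number of genotyped animals rather than with the square of the number of genotyped animals.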
Keywords: BIF accuracy; accuracy approximation; genomic evaluation; large-scale evaluation; prediction error variance
Year: 2022 PMID: 34877603 PMCID: PMC8827023 DOI: 10.1093/jas/skab353
Source DB: PubMed Journal: J Anim Sci ISSN: 0021-8812 Impact factor: 3.159
Traits, number of animals in the pedigree, number of animals with records, and number of genotyped animals for each dataset
| | Data1_st | Data1_mt | Data2_st | Data2_mt |
|---|---|---|---|---|
| Trait(s) | PWG | BW, WW, PWG | PWG | BW, WW, PWG |
| Animals in the pedigree | 167,951 | 172,089 | 10,213,401 | 10,213,401 |
| Animals with records | 76,758 | 78,641 | 4,218,407 | 8,681,659 |
| Genotyped animals, core | 10,523 | 10,523 | 10,523 | 10,523 |
| Genotyped animals, noncore | 39,477 | 9,477 | 324,802 | 324,802 |
¹ Data1_st, Data1 for single-trait analysis; Data1_mt, Data1 used for multi-trait analysis; Data2_st, Data2 for single-trait analysis; Data2_mt, Data2 used for multi-trait analysis.
² The traits are birth weight (BW), weaning weight (WW), and postweaning gain (PWG).
Correlation, intercept, slope, and mean absolute change (MAC) between the exact and estimated reliabilities for Data1_st and Data1_mt
| Dataset | Trait | Group | Correlation | Intercept | Slope | MAC |
|---|---|---|---|---|---|---|
| Data1_st | PWG | Genotyped | 0.98 | 0.02 | 0.94 | 0.01 |
| Data1_st | PWG | Non-genotyped | 0.97 | 0.01 | 1.05 | 0.03 |
| Data1_mt | BW | Genotyped | 0.98 | 0.04 | 0.93 | 0.01 |
| Data1_mt | BW | Non-genotyped | 0.98 | 0.00 | 0.98 | 0.02 |
| Data1_mt | WW | Genotyped | 0.98 | 0.02 | 0.94 | 0.01 |
| Data1_mt | WW | Non-genotyped | 0.99 | 0.00 | 0.97 | 0.02 |
| Data1_mt | PWG | Genotyped | 0.98 | 0.02 | 0.96 | 0.01 |
| Data1_mt | PWG | Non-genotyped | 0.99 | 0.00 | 1.01 | 0.01 |
¹ BW, birth weight; PWG, postweaning gain; WW, weaning weight.
² Data1_st, Data1 for single-trait analysis; Data1_mt, Data1 used for multi-trait analysis.
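The four statistics in the table above can be computed along the following lines. This is a minimal sketch: the function name `compare_reliabilities` and the input values are made up for illustration and are not data from the study. The regression is of the exact on the approximated reliabilities, as stated in the abstract. The source does not define MAC, so the code assumes it is the mean absolute difference between exact and approximated reliabilities.

```python
import numpy as np

def compare_reliabilities(exact, approx):
    """Correlation, intercept, slope, and MAC of exact vs. approximated values."""
    exact = np.asarray(exact, dtype=float)
    approx = np.asarray(approx, dtype=float)

    corr = np.corrcoef(exact, approx)[0, 1]
    # Regress exact (y) on approximated (x); polyfit returns [slope, intercept].
    slope, intercept = np.polyfit(approx, exact, deg=1)
    # Assumed definition of MAC: mean absolute difference between the two.
    mac = np.mean(np.abs(exact - approx))
    return {"correlation": corr, "intercept": intercept,
            "slope": slope, "mac": mac}

# Made-up reliabilities, chosen so that exact = 0.02 + 0.95 * approx.
approx = np.array([0.2, 0.4, 0.6, 0.8])
exact = 0.02 + 0.95 * approx
stats = compare_reliabilities(exact, approx)
print(stats)
```

Under this reading, an intercept near 0 and a slope near 1 indicate that the approximation is neither inflated nor deflated across the range of reliabilities, while MAC summarizes the typical size of the per-animal discrepancy.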
Figure 1. Scatter plots comparing reliability obtained from the inverse of the mixed model equations against estimated reliability for the genotyped animals in Data1_mt. Abbreviations: BW, birth weight; PWG, postweaning gain; ssGBLUP, single-step genomic best linear unbiased prediction; WW, weaning weight.
Correlation, intercept, slope, and mean absolute change (MAC) between the reliabilities obtained by inversion with and for Data1_st and Data1_mt
| Dataset | Trait | Group | Correlation | Intercept | Slope | MAC |
|---|---|---|---|---|---|---|
| Data1_st | PWG | Genotyped | 0.99 | 0.06 | 0.92 | 0.01 |
| Data1_st | PWG | Non-genotyped | 0.99 | 0.00 | 1.00 | 1.0 × 10⁻³ |
| Data1_mt | BW | Genotyped | 0.98 | −0.03 | 1.05 | 0.01 |
| Data1_mt | BW | Non-genotyped | 0.98 | 0.00 | 0.99 | 0.02 |
| Data1_mt | WW | Genotyped | 0.98 | −0.03 | 1.07 | 0.01 |
| Data1_mt | WW | Non-genotyped | 0.98 | 0.00 | 0.98 | 0.02 |
| Data1_mt | PWG | Genotyped | 0.98 | −0.07 | 1.10 | 0.01 |
| Data1_mt | PWG | Non-genotyped | 0.99 | 0.00 | 0.99 | 0.02 |
¹ BW, birth weight; PWG, postweaning gain; WW, weaning weight.
² Data1_st, Data1 for single-trait analysis; Data1_mt, Data1 used for multi-trait analysis.
Wall clock time in minutes of each step for estimating reliabilities for each dataset¹
| Step | Single trait, Data1 | Single trait, Data2 | Multiple trait, Data1 | Multiple trait, Data2 |
|---|---|---|---|---|
| Sorting pedigree | 0.003 | 0.23 | 0.003 | 0.23 |
| Approximation of pedigree reliabilities | 0.009 | 0.46 | 0.031 | 1.37 |
| Calculation of GBLUP | 0.28 | 1.85 | 0.36 | 4.92 |
| Approximation of reliabilities of | 0.008 | 0.44 | 0.025 | 1.32 |
| Propagation to non-genotyped animals | 0.001 | 0.05 | 0.002 | 0.16 |
| Multiple-trait adjustment | n/a | n/a | 0.03 | 3.23 |
| Total time | 0.31 | 3.32 | 0.42 | 11.11 |
¹ The memory requirements in gigabytes (GB) are inside parentheses.
² GBLUP, genomic best linear unbiased prediction.
³ refers to the numerator relationship matrix for genotyped animals.