Andres Legarra, Antonio Reverter.
Abstract
BACKGROUND: Cross-validation tools are increasingly used to validate and compare genetic evaluation methods, but the analytical properties of cross-validation methods are rarely described. There is also a lack of cross-validation tools for complex problems such as the prediction of indirect effects (e.g. maternal effects) or for breeding schemes with small progeny group sizes.
Year: 2018 PMID: 30400768 PMCID: PMC6219059 DOI: 10.1186/s12711-018-0426-6
Source DB: PubMed Journal: Genet Sel Evol ISSN: 0999-193X Impact factor: 4.297
Table 1 Summary statistics for age and body weight (YWT) in yearling records used in the beef cattle data example
| Sex | N | Variable | Mean | SD | Min. | Max. |
|---|---|---|---|---|---|---|
| Cows | 995 | Age (days) | 361.77 | 12.68 | 323 | 400 |
| Cows | 995 | BWT (kg) | 209.73 | 30.54 | 115 | 299 |
| Bulls | 1116 | Age (days) | 359.10 | 20.54 | 302 | 416 |
| Bulls | 1116 | BWT (kg) | 243.71 | 29.17 | 138 | 353 |
Table 2 Set of 16 statistics used to compare predictions based on the whole and partial beef cattle datasets
| Statistic | Description |
|---|---|
| | REML estimate of heritability for each ‘Partial’ dataset (each random 50% missing) |
| | Regression of whole on partial EBV (expectation of 1.0) |
| | |
| | |
| | Regression of partial on whole EBV (expectation depends on accuracies) |
| | |
| | |
| | Correlation between whole and partial EBV (expectation depends on accuracies) |
| | |
| | |
| | Correlation between the partial EBV and the adjusted phenotypes for the reference samples |
| | Correlation between the partial EBV and the adjusted phenotypes for the validation samples (NB. This is the conventional measure of accuracy in cross-validation genomic selection studies) |
| | Difference between whole and partial EBV (in absolute value) computed within reference samples |
| | Difference between whole and partial EBV (in absolute value) computed within validation samples |
| | Variance of the difference between whole and partial EBV computed within reference samples |
| | Variance of the difference between whole and partial EBV computed within validation samples |
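The whole-versus-partial comparisons described in this table can be illustrated with a short sketch. It is not the authors' code: the function name `compare_ebv`, the dictionary keys, and the simulated inputs are hypothetical, and it assumes that "difference (in absolute value)" means the mean of the absolute per-animal differences and that sample (n − 1) variances are used.

```python
import numpy as np

def compare_ebv(ebv_whole, ebv_partial):
    """Compare EBV from the whole dataset with EBV from a partial dataset.

    Hypothetical helper: names and conventions are assumptions, not the paper's.
    """
    w = np.asarray(ebv_whole, dtype=float)
    p = np.asarray(ebv_partial, dtype=float)
    cov_wp = np.cov(w, p, ddof=1)[0, 1]   # sample covariance between whole and partial EBV
    diff = w - p
    return {
        # slope of the regression of whole EBV on partial EBV
        "b_whole_on_partial": cov_wp / np.var(p, ddof=1),
        # slope of the regression of partial EBV on whole EBV
        "b_partial_on_whole": cov_wp / np.var(w, ddof=1),
        # Pearson correlation between whole and partial EBV
        "corr_whole_partial": np.corrcoef(w, p)[0, 1],
        # assumed reading of "difference (in absolute value)"
        "mean_abs_diff": np.mean(np.abs(diff)),
        # variance of the whole-minus-partial difference
        "var_diff": np.var(diff, ddof=1),
    }

# Simulated placeholder values, not data from the study.
rng = np.random.default_rng(1)
true_bv = rng.normal(size=200)
ebv_whole = true_bv + rng.normal(scale=0.3, size=200)
ebv_partial = true_bv + rng.normal(scale=0.6, size=200)
is_validation = np.arange(200) % 2 == 0   # arbitrary split for illustration

for label, subset in (("reference", ~is_validation), ("validation", is_validation)):
    print(label, compare_ebv(ebv_whole[subset], ebv_partial[subset]))
```

Calling the same function on the reference subset, the validation subset, or all animals gives the subset-specific versions of each statistic.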
Table 3 Summary metrics (mean, standard deviation, minimum and maximum) for the 16 statistics across the 1000 partial datasets (each one setting a random 50% as missing phenotypes) and obtained using either the pedigree-based NRM or the SNP-based GRM
| Statistic | NRM Mean | NRM SD | NRM Min. | NRM Max. | GRM Mean | GRM SD | GRM Min. | GRM Max. |
|---|---|---|---|---|---|---|---|---|
| | 0.260 | 0.021 | 0.211 | 0.371 | 0.433 | 0.044 | 0.316 | 0.598 |
| | 0.957 | 0.064 | 0.741 | 1.206 | 0.961 | 0.083 | 0.718 | 1.275 |
| | 0.970 | 0.059 | 0.763 | 1.180 | 0.954 | 0.077 | 0.729 | 1.231 |
| | 0.925 | 0.082 | 0.688 | 1.272 | 0.975 | 0.099 | 0.685 | 1.372 |
| | 0.751 | 0.077 | 0.522 | 1.189 | 0.710 | 0.066 | 0.519 | 0.967 |
| | 1.079 | 0.090 | 0.840 | 1.541 | 0.955 | 0.079 | 0.730 | 1.238 |
| | 0.423 | 0.056 | 0.253 | 0.743 | 0.462 | 0.046 | 0.329 | 0.667 |
| | 0.751 | 0.024 | 0.665 | 0.809 | 0.823 | 0.013 | 0.772 | 0.864 |
| | 0.909 | 0.013 | 0.859 | 0.943 | 0.952 | 0.006 | 0.934 | 0.967 |
| | 0.550 | 0.035 | 0.425 | 0.637 | 0.668 | 0.021 | 0.584 | 0.736 |
| | 0.849 | 0.012 | 0.804 | 0.892 | 0.898 | 0.015 | 0.852 | 0.944 |
| | 0.076 | 0.022 | 0.011 | 0.156 | 0.312 | 0.021 | 0.227 | 0.373 |
| | 2.253 | 0.266 | 1.684 | 3.902 | 2.905 | 0.288 | 2.344 | 4.476 |
| | 3.865 | 0.167 | 3.441 | 4.422 | 6.726 | 0.216 | 5.932 | 7.575 |
| | 8.303 | 1.988 | 4.585 | 24.081 | 13.798 | 2.977 | 8.839 | 32.127 |
| | 23.893 | 2.003 | 19.174 | 30.920 | 73.330 | 4.676 | 57.355 | 91.677 |
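The replicate scheme behind these summary columns (a random 50% of phenotypes set to missing, repeated many times, then summarised by mean, SD, minimum and maximum) might be sketched as follows. The helper names `mask_random_half` and `summarize` are hypothetical, the sample (n − 1) SD is an assumption, and the genetic evaluation that would run between the two steps is not shown.

```python
import numpy as np

def mask_random_half(n_records, rng):
    """Return a boolean array that is True for records whose phenotype is set to missing."""
    is_validation = np.zeros(n_records, dtype=bool)
    chosen = rng.choice(n_records, size=n_records // 2, replace=False)
    is_validation[chosen] = True
    return is_validation

def summarize(values):
    """Mean, SD, minimum and maximum of one statistic across replicates."""
    v = np.asarray(values, dtype=float)
    return {
        "mean": v.mean(),
        "sd": v.std(ddof=1),   # assumption: sample SD with n - 1 in the denominator
        "min": v.min(),
        "max": v.max(),
    }

# Illustration only: the "statistic" here is a placeholder (the fraction masked),
# standing in for a value that a genetic evaluation would produce per replicate.
rng = np.random.default_rng(0)
replicate_values = []
for _ in range(1000):
    is_validation = mask_random_half(2111, rng)   # 995 cows + 1116 bulls
    replicate_values.append(is_validation.mean())

print(summarize(replicate_values))
```

Drawing the mask without replacement keeps the validation share fixed at half of the records (rounded down) in every replicate.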
Fig. 1 Mean value for the 16 statistics across the 1000 partial (random 50%) beef cattle datasets obtained using either the pedigree-based NRM or the SNP-based GRM. Double-ended arrows indicate ± 1 standard deviation (SD). Refer to Tables 2 and 3 for a description of the statistics and the actual values, respectively
Fig. 2 Heatmap of the correlation matrix among the 16 statistics obtained using the pedigree-based NRM (left panel) and the SNP-based GRM (right panel). Refer to Table 2 for a description of the statistics and to Supplementary Tables 1 and 2 for the actual correlation values
Fig. 3 Scatter plot of the relationship of the absolute difference in EBV between the whole and partial datasets with the correlation of the EBV based on the partial data with the adjusted phenotypes (left panel) and the correlation between EBV based on the whole and partial data (right panel) across the 1000 partial (random 50%) beef cattle datasets