Ibrahim Jibrila, Jeremie Vandenplas, Jan Ten Napel, Rob Bergsma, Roel F Veerkamp, Mario P L Calus.
Abstract
BACKGROUND: Empirically assessing the impact of preselection on genetic evaluation of preselected animals requires comparing scenarios that take different approaches into account, including scenarios without preselection. However, preselection is almost always performed in animal breeding programs, so it is difficult to have a dataset without preselection. Hence, most studies on preselection have used simulated datasets, and have concluded that genomic estimated breeding values (GEBV) from subsequent single-step genomic best linear unbiased prediction (ssGBLUP) evaluations are unbiased. The aim of this study was to investigate the impact of genomic preselection (GPS) on accuracy and bias in subsequent ssGBLUP evaluations, using data from a commercial pig breeding program.
Year: 2022 PMID: 35764921 PMCID: PMC9238012 DOI: 10.1186/s12711-022-00727-5
Source DB: PubMed Journal: Genet Sel Evol ISSN: 0999-193X Impact factor: 5.100
Data used in subsequent ssGBLUP<sup>a</sup> evaluations following each preselection scenario, after quality control
| Data in the subsequent ssGBLUP evaluation/preselection scenario | With records on animals in the validation generation | | | Without records on animals in the validation generation | | |
|---|---|---|---|---|---|---|
| | Reference<sup>b</sup> | VGP<sup>c</sup> | MGP<sup>d</sup> | Reference<sup>b</sup> | VGP<sup>c</sup> | MGP<sup>d</sup> |
| Sire line (number of validation animals per trait is approximately 1383) | ||||||
| Number of animals in the pedigree | 81,875 | 60,950 | 12,777 | 81,875 | 60,950 | 12,777 |
| Number of animals with record for at least one trait<sup>e</sup> | 75,129 | 54,217 | 6065 | 52,846 | 52,846 | 4694 |
| Number of animals with genotypes | 33,506 | 23,315 | 5131 | 33,506 | 23,315 | 5131 |
| Number of SNP genotyped | 20,550 | 20,963 | 20,926 | 20,550 | 20,963 | 20,926 |
| Dam line (number of validation animals per trait is approximately 2051 on average) | ||||||
| Number of animals in the pedigree | 160,426 | 124,031 | 33,485 | 160,426 | 124,031 | 33,485 |
| Number of animals with record for at least one trait | 139,403 | 103,018 | 12,514 | 100,710 | 100,710 | 10,206 |
| Number of animals with genotypes | 50,895 | 36,369 | 9072 | 50,895 | 36,369 | 9072 |
| Number of SNP genotyped | 19,199 | 19,256 | 20,647 | 19,199 | 19,256 | 20,647 |
<sup>a</sup> Single-step genomic best linear unbiased prediction
<sup>b</sup> In the reference scenario, the subsequent ssGBLUP evaluation used the entire available data until the validation generation
<sup>c</sup> Validation generation preselection (VGP) scenario, in which all animals in the validation generation without progeny in the data were discarded
<sup>d</sup> Multi-generation preselection (MGP) scenario, in which all animals in the validation and training generations without progeny in the data were discarded
<sup>e</sup> About 87% and 70% of the animals in the sire and dam lines, respectively, had records for the four traits used in this study, and even larger numbers had records for any two and three traits. We decided to keep any animal with a record for at least one of the traits (92% and 87% of the animals in the sire and dam lines, respectively) because every animal in the analyses would benefit from records on relatives and records of correlated traits (see Additional file 1: Table S1), in addition to its own record on the primary trait
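The VGP and MGP definitions in the footnotes above amount to a filter on the animals kept for the subsequent evaluation. The following is a minimal sketch of that filter, not the authors' pipeline: the pedigree layout (dictionaries with `id`, `sire`, `dam`, `generation`), the function name `apply_preselection`, and the way training generations are passed in are all assumptions made for illustration.

```python
def apply_preselection(pedigree, scenario, validation_gen, training_gens=()):
    """Return the animals kept under a preselection scenario.

    pedigree       : list of dicts with keys 'id', 'sire', 'dam', 'generation'
                     (hypothetical layout, not taken from the source)
    scenario       : 'reference', 'VGP' or 'MGP'
    validation_gen : label of the validation generation
    training_gens  : labels of the training generations (only used for MGP)
    """
    # An animal "has progeny in the data" if it appears as a sire or dam.
    parents = {a["sire"] for a in pedigree} | {a["dam"] for a in pedigree}
    parents.discard(None)

    if scenario == "reference":
        culled_gens = set()                      # nothing is discarded
    elif scenario == "VGP":
        culled_gens = {validation_gen}           # cull in validation generation only
    elif scenario == "MGP":
        culled_gens = {validation_gen, *training_gens}
    else:
        raise ValueError(f"unknown scenario: {scenario!r}")

    # Keep an animal unless it is in a culled generation and has no progeny.
    return [
        a for a in pedigree
        if a["generation"] not in culled_gens or a["id"] in parents
    ]
```

Progeny status is taken from the full, unfiltered pedigree, so an animal is judged on the progeny it has in the original data rather than on what remains after culling.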
Fig. 1 Overview of groups of animals used in subsequent ssGBLUP for each of the considered GPS scenarios
Means and SD (in brackets)<sup>a</sup> of precorrected phenotypes of the traits used in this study after each GPS scenario
| Trait/preselection scenario | Within the validation generation only | | Across the entire dataset | |
|---|---|---|---|---|
| | Reference<sup>b</sup> | VGP<sup>c</sup>/MGP<sup>d</sup> | VGP | MGP |
| Sire line | ||||
| ADGT (g/day) | 0.26 (2.01) | 1.03 (1.59) | − 0.17 (2.05) | 0.77 (1.71) |
| ADGL (g/day) | 0.07 (1.97) | 0.81 (1.78) | − 0.21 (1.95) | 0.61 (1.67) |
| Backfat thickness (mm) | − 0.32 (1.20) | − 0.35 (1.15) | − 0.02 (1.32) | − 0.13 (1.25) |
| Loin depth (mm) | 0.34 (1.32) | 0.35 (1.29) | 0.23 (1.34) | 0.27 (1.30) |
| Dam line | ||||
| ADGT (g/day) | 0.40 (1.87) | 1.16 (1.69) | − 0.12 (1.75) | 0.71 (1.60) |
| ADGL (g/day) | 0.21 (1.84) | 0.98 (1.61) | − 0.19 (1.81) | 0.58 (1.62) |
| Backfat thickness (mm) | 0.10 (1.32) | 0.06 (1.28) | − 0.03 (1.43) | − 0.09 (1.41) |
| Loin depth (mm) | 0.31 (1.42) | 0.19 (1.39) | 0.28 (1.39) | 0.33 (1.37) |
ADGT: average daily gain during performance testing; ADGL: average daily gain throughout life
<sup>a</sup> Both means and SD are in additive genetic SD units
<sup>b</sup> In the reference scenario, the subsequent ssGBLUP evaluation utilized the entire available data until the validation generation
<sup>c</sup> Validation generation preselection (VGP) scenario, in which all animals in the validation generation without progeny in the data were discarded
<sup>d</sup> Multi-generation preselection (MGP) scenario, in which all animals in the validation and training generations without progeny in the data were discarded
Performance of ssGBLUP<sup>a</sup> in the subsequent evaluations in the sire line (SE in brackets)
| Measure/preselection scenario | With records on animals in the validation generation | | | Without records on animals in the validation generation | | |
|---|---|---|---|---|---|---|
| | Reference<sup>b</sup> | VGP<sup>c</sup> | MGP<sup>d</sup> | Reference | VGP | MGP |
| Average daily gain during performance testing, size of validation population = 1382 | ||||||
| Estimated heritability<sup>e</sup> | 0.24 (0.01) | 0.25 (0.01) | 0.33 (0.02) | 0.24 (0.01) | 0.24 (0.01) | 0.35 (0.03) |
| Validation accuracy<sup>f</sup> | 0.51 (0.02) | 0.51 (0.02) | 0.50 (0.02) | 0.47 (0.02) | 0.47 (0.02) | 0.44 (0.02) |
| Level bias<sup>g</sup> | − 0.09 (0.02) | − 0.15 (0.02) | − 0.01 (0.02) | − 0.11 (0.02) | − 0.11 (0.02) | − 0.02 (0.02) |
| Dispersion bias<sup>h</sup> | 0.48 (0.02) | 0.49 (0.02) | 0.48 (0.02) | 0.48 (0.02) | 0.48 (0.02) | 0.46 (0.03) |
| Average daily gain throughout life, size of validation population = 1383 | ||||||
| Estimated heritability | 0.26 (0.01) | 0.28 (0.01) | 0.33 (0.03) | 0.27 (0.01) | 0.27 (0.01) | 0.35 (0.03) |
| Validation accuracy | 0.57 (0.02) | 0.56 (0.02) | 0.55 (0.02) | 0.52 (0.02) | 0.52 (0.02) | 0.48 (0.02) |
| Level bias | − 0.10 (0.02) | − 0.17 (0.02) | − 0.06 (0.02) | − 0.14 (0.02) | − 0.14 (0.02) | − 0.08 (0.02) |
| Dispersion bias | 0.48 (0.02) | 0.49 (0.02) | 0.50 (0.02) | 0.47 (0.02) | 0.47 (0.02) | 0.49 (0.02) |
| Backfat thickness, size of validation population = 1383 | ||||||
| Estimated heritability | 0.58 (0.01) | 0.58 (0.01) | 0.58 (0.02) | 0.58 (0.01) | 0.58 (0.01) | 0.60 (0.03) |
| Validation accuracy | 0.69 (0.01) | 0.68 (0.01) | 0.67 (0.01) | 0.63 (0.02) | 0.63 (0.02) | 0.56 (0.02) |
| Level bias | − 0.02 (0.01) | − 0.03 (0.01) | − 0.03 (0.01) | − 0.05 (0.01) | − 0.05 (0.01) | − 0.09 (0.01) |
| Dispersion bias | 0.48 (0.01) | 0.47 (0.01) | 0.47 (0.01) | 0.44 (0.01) | 0.44 (0.01) | 0.42 (0.02) |
| Loin depth, size of validation population = 1383 | ||||||
| Estimated heritability | 0.55 (0.01) | 0.55 (0.01) | 0.55 (0.03) | 0.55 (0.01) | 0.55 (0.01) | 0.57 (0.03) |
| Validation accuracy | 0.68 (0.01) | 0.67 (0.01) | 0.65 (0.02) | 0.62 (0.02) | 0.62 (0.02) | 0.54 (0.02) |
| Level bias | 0.01 (0.01) | 0.00 (0.01) | 0.00 (0.01) | 0.00 (0.01) | 0.00 (0.01) | − 0.01 (0.01) |
| Dispersion bias | 0.50 (0.01) | 0.50 (0.01) | 0.48 (0.02) | 0.48 (0.02) | 0.48 (0.02) | 0.45 (0.02) |
<sup>a</sup> Single-step genomic best linear unbiased prediction
<sup>b</sup> In the reference scenario, the subsequent ssGBLUP evaluation used the entire available data until the validation generation
<sup>c</sup> Validation generation preselection (VGP) scenario, in which all animals in the validation generation without progeny in the data were discarded
<sup>d</sup> Multi-generation preselection (MGP) scenario, in which all animals in the validation and training generations without progeny in the data were discarded
<sup>e</sup> The heritability was estimated from an equivalent pedigree-based animal model in ASReml
<sup>f</sup> Validation accuracy was computed as the weighted Pearson correlation coefficient between progeny yield deviation and genomic estimated breeding value of all validation animals
<sup>g</sup> Level bias was computed as the weighted mean difference between progeny yield deviation and half of the genomic estimated breeding value across all validation animals, expressed in additive genetic standard deviation units of the trait
<sup>h</sup> Dispersion bias was measured by the weighted regression coefficient of progeny yield deviation on genomic estimated breeding value of all validation animals
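The three validation measures defined in the footnotes above (weighted correlation, weighted mean difference, weighted regression coefficient) can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' code: the function name `weighted_validation_metrics` is hypothetical, and the source does not say here how the per-animal weights were derived, so they are simply passed in.

```python
from math import sqrt


def weighted_validation_metrics(pyd, gebv, weights, sigma_a):
    """Return (accuracy, level_bias, dispersion_bias) for one trait.

    pyd     : progeny yield deviations of the validation animals
    gebv    : their (genomic) estimated breeding values
    weights : non-negative weight per animal (assumed given; their
              derivation is not described in this section)
    sigma_a : additive genetic SD of the trait, used to scale level bias
    """
    total = sum(weights)
    w = [x / total for x in weights]  # normalise weights to sum to 1

    mean_pyd = sum(wi * p for wi, p in zip(w, pyd))
    mean_gebv = sum(wi * g for wi, g in zip(w, gebv))

    # Weighted (co)variances around the weighted means.
    cov = sum(wi * (p - mean_pyd) * (g - mean_gebv)
              for wi, p, g in zip(w, pyd, gebv))
    var_pyd = sum(wi * (p - mean_pyd) ** 2 for wi, p in zip(w, pyd))
    var_gebv = sum(wi * (g - mean_gebv) ** 2 for wi, g in zip(w, gebv))

    # Weighted Pearson correlation between PYD and GEBV.
    accuracy = cov / sqrt(var_pyd * var_gebv)
    # Weighted mean of (PYD - 0.5 * GEBV), in additive genetic SD units.
    level_bias = (mean_pyd - 0.5 * mean_gebv) / sigma_a
    # Slope of the weighted regression of PYD on GEBV.
    dispersion_bias = cov / var_gebv
    return accuracy, level_bias, dispersion_bias
```

As a sanity check on toy data, setting every PYD to exactly half of the corresponding GEBV returns an accuracy of 1, a level bias of 0 and a dispersion coefficient of 0.5, up to floating-point error.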
Performance of ssGBLUP<sup>a</sup> in the subsequent evaluations in the dam line (SE in brackets)
| Measure/preselection scenario | With records on animals in the validation generation | | | Without records on animals in the validation generation | | |
|---|---|---|---|---|---|---|
| | Reference<sup>b</sup> | VGP<sup>c</sup> | MGP<sup>d</sup> | Reference | VGP | MGP |
| Average daily gain during performance testing, size of validation population = 2323 | ||||||
| Estimated heritability<sup>e</sup> | 0.31 (0.01) | 0.32 (0.01) | 0.40 (0.02) | 0.30 (0.01) | 0.30 (0.01) | 0.38 (0.02) |
| Validation accuracy<sup>f</sup> | 0.35 (0.02) | 0.31 (0.02) | 0.29 (0.02) | 0.28 (0.02) | 0.28 (0.02) | 0.23 (0.02) |
| Level bias<sup>g</sup> | − 0.05 (0.02) | − 0.14 (0.02) | 0.04 (0.02) | 0.03 (0.02) | 0.03 (0.02) | 0.14 (0.02) |
| Dispersion bias<sup>h</sup> | 0.46 (0.03) | 0.43 (0.03) | 0.41 (0.03) | 0.44 (0.03) | 0.44 (0.03) | 0.43 (0.04) |
| Average daily gain throughout life, size of validation population = 2405 | ||||||
| Estimated heritability | 0.31 (0.01) | 0.33 (0.01) | 0.43 (0.02) | 0.31 (0.01) | 0.31 (0.01) | 0.44 (0.02) |
| Validation accuracy | 0.46 (0.02) | 0.42 (0.02) | 0.42 (0.02) | 0.38 (0.02) | 0.38 (0.02) | 0.35 (0.02) |
| Level bias | − 0.06 (0.01) | − 0.16 (0.01) | − 0.01 (0.01) | 0.00 (0.01) | 0.00 (0.01) | 0.08 (0.01) |
| Dispersion bias | 0.45 (0.02) | 0.42 (0.02) | 0.42 (0.02) | 0.43 (0.02) | 0.43 (0.02) | 0.43 (0.02) |
| Backfat thickness, size of validation population = 2312 | ||||||
| Estimated heritability | 0.51 (0.01) | 0.51 (0.01) | 0.51 (0.02) | 0.51 (0.01) | 0.51 (0.01) | 0.53 (0.02) |
| Validation accuracy | 0.52 (0.01) | 0.50 (0.02) | 0.50 (0.02) | 0.45 (0.02) | 0.45 (0.02) | 0.42 (0.02) |
| Level bias | 0.02 (0.01) | − 0.01 (0.01) | − 0.03 (0.01) | 0.02 (0.01) | 0.02 (0.01) | − 0.01 (0.01) |
| Dispersion bias | 0.43 (0.01) | 0.41 (0.01) | 0.41 (0.01) | 0.42 (0.02) | 0.42 (0.02) | 0.41 (0.02) |
| Loin depth, size of validation population = 1164 | ||||||
| Estimated heritability | 0.50 (0.01) | 0.50 (0.01) | 0.55 (0.02) | 0.49 (0.01) | 0.49 (0.01) | 0.53 (0.02) |
| Validation accuracy | 0.62 (0.02) | 0.60 (0.02) | 0.59 (0.02) | 0.55 (0.02) | 0.56 (0.02) | 0.49 (0.02) |
| Level bias | − 0.02 (0.02) | − 0.03 (0.02) | 0.02 (0.02) | − 0.04 (0.02) | − 0.04 (0.02) | 0.03 (0.02) |
| Dispersion bias | 0.54 (0.02) | 0.54 (0.02) | 0.52 (0.02) | 0.53 (0.02) | 0.53 (0.02) | 0.51 (0.03) |
<sup>a</sup> Single-step genomic best linear unbiased prediction
<sup>b</sup> In the reference scenario, the subsequent ssGBLUP evaluation utilized the entire available data until the validation generation
<sup>c</sup> Validation generation preselection (VGP) scenario, in which all animals in the validation generation without progeny in the data were discarded
<sup>d</sup> Multi-generation preselection (MGP) scenario, in which all animals in the validation and training generations without progeny in the data were discarded
<sup>e</sup> The heritability was estimated from an equivalent pedigree-based animal model in ASReml
<sup>f</sup> Validation accuracy was computed as the weighted Pearson correlation coefficient between progeny yield deviation and genomic estimated breeding value of all validation animals
<sup>g</sup> Level bias was computed as the weighted mean difference between progeny yield deviation and half of the genomic estimated breeding value across all validation animals, expressed in additive genetic standard deviation units of the trait
<sup>h</sup> Dispersion bias was measured by the weighted regression coefficient of progeny yield deviation on genomic estimated breeding value of all validation animals
Performance of PBLUP<sup>a</sup> in the subsequent evaluations in the sire line (SE in brackets)
| Measure/preselection scenario | With records on animals in the validation generation | | | Without records on animals in the validation generation | | |
|---|---|---|---|---|---|---|
| | Reference<sup>b</sup> | VGP<sup>c</sup> | MGP<sup>d</sup> | Reference | VGP | MGP |
| Average daily gain during performance testing, size of validation population = 1382 | ||||||
| Estimated heritability | 0.24 (0.01) | 0.25 (0.01) | 0.33 (0.02) | 0.24 (0.01) | 0.24 (0.01) | 0.35 (0.03) |
| Validation accuracy<sup>e</sup> | 0.51 (0.02) | 0.50 (0.02) | 0.49 (0.02) | 0.41 (0.02) | 0.41 (0.02) | 0.40 (0.02) |
| Level bias<sup>f</sup> | − 0.04 (0.02) | − 0.11 (0.02) | 0.01 (0.02) | − 0.01 (0.02) | − 0.01 (0.02) | 0.01 (0.02) |
| Dispersion bias<sup>g</sup> | 0.53 (0.02) | 0.54 (0.03) | 0.48 (0.02) | 0.55 (0.03) | 0.55 (0.03) | 0.49 (0.03) |
| Average daily gain throughout life, size of validation population = 1383 | ||||||
| Estimated heritability | 0.26 (0.01) | 0.28 (0.01) | 0.33 (0.03) | 0.27 (0.01) | 0.27 (0.01) | 0.35 (0.03) |
| Validation accuracy | 0.58 (0.02) | 0.56 (0.02) | 0.54 (0.02) | 0.47 (0.02) | 0.47 (0.02) | 0.44 (0.02) |
| Level bias | − 0.06 (0.02) | − 0.14 (0.02) | − 0.04 (0.02) | − 0.05 (0.02) | − 0.05 (0.02) | − 0.05 (0.02) |
| Dispersion bias | 0.55 (0.02) | 0.55 (0.02) | 0.51 (0.02) | 0.56 (0.03) | 0.56 (0.03) | 0.54 (0.03) |
| Backfat thickness, size of validation population = 1383 | ||||||
| Estimated heritability | 0.58 (0.01) | 0.58 (0.01) | 0.58 (0.02) | 0.58 (0.01) | 0.58 (0.01) | 0.60 (0.03) |
| Validation accuracy | 0.67 (0.01) | 0.66 (0.02) | 0.66 (0.02) | 0.48 (0.02) | 0.48 (0.02) | 0.46 (0.02) |
| Level bias | − 0.03 (0.01) | − 0.03 (0.01) | − 0.03 (0.01) | − 0.09 (0.01) | − 0.09 (0.01) | − 0.10 (0.01) |
| Dispersion bias | 0.50 (0.01) | 0.50 (0.02) | 0.50 (0.02) | 0.46 (0.02) | 0.46 (0.02) | 0.43 (0.02) |
| Loin depth, size of validation population = 1383 | ||||||
| Estimated heritability | 0.55 (0.01) | 0.55 (0.01) | 0.55 (0.03) | 0.55 (0.01) | 0.55 (0.01) | 0.57 (0.03) |
| Validation accuracy | 0.66 (0.02) | 0.65 (0.02) | 0.64 (0.02) | 0.49 (0.02) | 0.49 (0.02) | 0.46 (0.02) |
| Level bias | 0.00 (0.01) | 0.00 (0.01) | 0.00 (0.01) | 0.01 (0.01) | 0.01 (0.01) | 0.00 (0.01) |
| Dispersion bias | 0.50 (0.02) | 0.49 (0.02) | 0.49 (0.02) | 0.48 (0.02) | 0.48 (0.02) | 0.46 (0.02) |
<sup>a</sup> Pedigree-based best linear unbiased prediction
<sup>b</sup> In the reference scenario, the subsequent PBLUP evaluation utilized the entire available data until the validation generation
<sup>c</sup> Validation generation preselection (VGP) scenario, in which all animals in the validation generation without progeny in the data were discarded
<sup>d</sup> Multi-generation preselection (MGP) scenario, in which all animals in the validation and training generations without progeny in the data were discarded
<sup>e</sup> Validation accuracy was computed as the weighted Pearson correlation coefficient between progeny yield deviation and estimated breeding value of all validation animals
<sup>f</sup> Level bias was computed as the weighted mean difference between progeny yield deviation and half of the estimated breeding value across all validation animals, expressed in additive genetic standard deviation units of the trait
<sup>g</sup> Dispersion bias was measured by the weighted regression coefficient of progeny yield deviation on estimated breeding value of all validation animals
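For reading the level and dispersion bias rows, it may help to write the footnote definitions as formulas. The notation is introduced here and is not the authors': $y_i$ is the progeny yield deviation of validation animal $i$, $\hat{u}_i$ its estimated breeding value, $w_i$ its weight, $\sigma_a$ the additive genetic SD of the trait, and $\bar{y}_w$, $\bar{u}_w$ the weighted means.

```latex
\begin{align}
\bar{y}_w &= \frac{\sum_i w_i\, y_i}{\sum_i w_i}, \qquad
\bar{u}_w = \frac{\sum_i w_i\, \hat{u}_i}{\sum_i w_i} \\[4pt]
\text{validation accuracy} &=
  \frac{\sum_i w_i\,(y_i - \bar{y}_w)(\hat{u}_i - \bar{u}_w)}
       {\sqrt{\sum_i w_i\,(y_i - \bar{y}_w)^2 \;\sum_i w_i\,(\hat{u}_i - \bar{u}_w)^2}} \\[4pt]
\text{level bias} &=
  \frac{1}{\sigma_a}\cdot
  \frac{\sum_i w_i \left(y_i - \tfrac{1}{2}\hat{u}_i\right)}{\sum_i w_i}
  = \frac{\bar{y}_w - \tfrac{1}{2}\bar{u}_w}{\sigma_a} \\[4pt]
\text{dispersion bias} &=
  \frac{\sum_i w_i\,(y_i - \bar{y}_w)(\hat{u}_i - \bar{u}_w)}
       {\sum_i w_i\,(\hat{u}_i - \bar{u}_w)^2}
\end{align}
```

Under the usual assumption that a progeny yield deviation estimates half of the parent's breeding value, an unbiased evaluation would be expected to give a level bias near 0 and a regression coefficient near 0.5 rather than 1, which is the benchmark against which the dispersion values in these tables can be read.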
Performance of PBLUP<sup>a</sup> in the subsequent evaluations in the dam line (SE in brackets)
| Measure/preselection scenario | With records on animals in the validation generation | | | Without records on animals in the validation generation | | |
|---|---|---|---|---|---|---|
| | Reference<sup>b</sup> | VGP<sup>c</sup> | MGP<sup>d</sup> | Reference | VGP | MGP |
| Average daily gain during performance testing, size of validation population = 2323 | ||||||
| Estimated heritability | 0.31 (0.01) | 0.32 (0.01) | 0.40 (0.02) | 0.30 (0.01) | 0.30 (0.01) | 0.38 (0.02) |
| Validation accuracy<sup>e</sup> | 0.35 (0.02) | 0.30 (0.02) | 0.30 (0.02) | 0.24 (0.02) | 0.24 (0.02) | 0.21 (0.02) |
| Level bias<sup>f</sup> | − 0.04 (0.02) | − 0.16 (0.02) | 0.01 (0.02) | 0.08 (0.02) | 0.08 (0.02) | 0.13 (0.02) |
| Dispersion bias<sup>g</sup> | 0.52 (0.03) | 0.45 (0.03) | 0.42 (0.03) | 0.50 (0.04) | 0.50 (0.04) | 0.45 (0.04) |
| Average daily gain throughout life, size of validation population = 2405 | ||||||
| Estimated heritability | 0.31 (0.01) | 0.33 (0.01) | 0.43 (0.02) | 0.31 (0.01) | 0.31 (0.01) | 0.44 (0.02) |
| Validation accuracy | 0.48 (0.01) | 0.43 (0.02) | 0.43 (0.02) | 0.34 (0.02) | 0.34 (0.02) | 0.31 (0.02) |
| Level bias | − 0.05 (0.01) | − 0.18 (0.01) | − 0.03 (0.01) | 0.05 (0.02) | 0.05 (0.02) | 0.07 (0.01) |
| Dispersion bias | 0.51 (0.02) | 0.47 (0.02) | 0.44 (0.02) | 0.51 (0.03) | 0.51 (0.03) | 0.44 (0.03) |
| Backfat thickness, size of validation population = 2312 | ||||||
| Estimated heritability | 0.51 (0.01) | 0.51 (0.01) | 0.51 (0.02) | 0.51 (0.01) | 0.51 (0.01) | 0.53 (0.02) |
| Validation accuracy | 0.52 (0.02) | 0.50 (0.02) | 0.50 (0.02) | 0.37 (0.02) | 0.37 (0.02) | 0.36 (0.02) |
| Level bias | 0.02 (0.01) | 0.00 (0.01) | − 0.03 (0.01) | 0.04 (0.01) | 0.04 (0.01) | 0.00 (0.01) |
| Dispersion bias | 0.45 (0.02) | 0.43 (0.02) | 0.42 (0.02) | 0.41 (0.02) | 0.41 (0.02) | 0.39 (0.02) |
| Loin depth, size of validation population = 1164 | ||||||
| Estimated heritability | 0.50 (0.01) | 0.50 (0.01) | 0.55 (0.02) | 0.49 (0.01) | 0.49 (0.01) | 0.53 (0.02) |
| Validation accuracy | 0.58 (0.02) | 0.56 (0.02) | 0.56 (0.02) | 0.43 (0.02) | 0.43 (0.02) | 0.41 (0.02) |
| Level bias | 0.00 (0.02) | − 0.01 (0.02) | 0.04 (0.02) | − 0.02 (0.02) | − 0.02 (0.02) | 0.04 (0.02) |
| Dispersion bias | 0.55 (0.02) | 0.54 (0.02) | 0.51 (0.02) | 0.57 (0.03) | 0.57 (0.03) | 0.52 (0.03) |
<sup>a</sup> Pedigree-based best linear unbiased prediction
<sup>b</sup> In the reference scenario, the subsequent PBLUP evaluation utilized the entire available data until the validation generation
<sup>c</sup> Validation generation preselection (VGP) scenario, in which all animals in the validation generation without progeny in the data were discarded
<sup>d</sup> Multi-generation preselection (MGP) scenario, in which all animals in the validation and training generations without progeny in the data were discarded
<sup>e</sup> Validation accuracy was computed as the weighted Pearson correlation coefficient between progeny yield deviation and estimated breeding value of all validation animals
<sup>f</sup> Level bias was computed as the weighted mean difference between progeny yield deviation and half of the estimated breeding value across all validation animals, expressed in additive genetic standard deviation units of the trait
<sup>g</sup> Dispersion bias was measured by the weighted regression coefficient of progeny yield deviation on estimated breeding value of all validation animals