Sunduimijid Bolormaa1, Iona M MacLeod2, Majid Khansefid2, Leah C Marett3,4, William J Wales3,4, Filippo Miglior5,6, Christine F Baes6,7, Flavio S Schenkel6, Erin E Connor8,9, Coralia I V Manzanilla-Pech10, Paul Stothard11, Emily Herman11, Gert J Nieuwhof2,12, Michael E Goddard2,13, Jennie E Pryce2,14.
Abstract
BACKGROUND: Sharing individual phenotype and genotype data between countries is complex and fraught with potential errors, whereas sharing summary statistics of genome-wide association studies (GWAS) is relatively straightforward and would thus be especially useful for traits that are expensive or difficult to measure, such as feed efficiency. Here we examined: (1) the sharing of individual cow data from international partners; and (2) the use of sequence variants selected from GWAS of international cow data to evaluate the accuracy of genomic estimated breeding values (GEBV) for residual feed intake (RFI) in Australian cows.
Year: 2022 PMID: 36068488 PMCID: PMC9450441 DOI: 10.1186/s12711-022-00749-z
Source DB: PubMed Journal: Genet Sel Evol ISSN: 0999-193X Impact factor: 5.100
List of datasets and GBLUP analyses used in this study
| Dataset code | Analysesa | Heifer or cow | Country or continent |
|---|---|---|---|
| AUSh | Univariate | Growing heifer | Australia |
| AUSc | Univariate | Lactating cow | Australia |
| USA | na | Lactating cow | United States |
| CAN | na | Lactating cow | Canada |
| DNK | na | Lactating cow | Denmark |
| GBR | na | Lactating cow | United Kingdom |
| CHE | na | Lactating cow | Switzerland |
| NLD | na | Lactating cow | the Netherlands |
| EU | Univariate | Lactating cow | DNK + GBR + CHE + NLD |
| NA | Univariate | Lactating cow | USA + CAN |
| OVE | Univariate | Lactating cow | EU + NA |
| AUShAUSc | Bivariate | Heifer + cow | AUSh and AUSc |
| AUScEU | Bivariate or univariate | Lactating cow | AUSc and EU |
| AUScNA | Bivariate or univariate | Lactating cow | AUSc and NA |
| AUScOVE | Bivariate or univariate | Lactating cow | AUSc and OVE |
| AUShAUScEU | Trivariate | Heifer + cow | AUSh, AUSc and EU |
| AUShAUScNA | Trivariate | Heifer + cow | AUSh, AUSc and NA |
| AUShAUScOVE | Trivariate | Heifer + cow | AUSh, AUSc and OVE |
a na: not applicable; bivariate or trivariate: analyses using two or three datasets, respectively, with each dataset treated as a separate trait
Number of records (and number of animals in parentheses) for dry matter intake and traits used to calculate residual feed intake per country
| Dataset | DMI | ECM | BW | ∆BW | DMIRFI |
|---|---|---|---|---|---|
| AUSh | 46,144 (824) | 46,144 (824) | 46,144 (824) | 46,144 (824) | |
| AUSc | 20,384 (584) | 20,384 (584) | 20,384 (584) | 20,384 (584) | 20,384 (584) |
| USA | 126,863 (673) | 18,485 (676) | 127,693 (672) | 127,693 (672) | 17,568 (671) |
| CAN | 137,517 (755) | 38,543 (785) | 150,208 (492) | 150,208 (492) | 13,278 (473) |
| DNK | 34,284 (439) | 30,648 (431) | 30,777 (429) | 30,777 (429) | 27,559 (425) |
| CHE | 3887 (95) | 2192 (127) | 4167 (95) | 4167 (95) | 297 (63) |
Acronyms as described in Table 1
DMI: dry matter intake; ECM: energy corrected milk; BW: body weight; ∆BW: change in body weight; DMIRFI: dry matter intake used to calculate RFI when ECM, BW, and ∆BW had no missing values
Mean of days in milk (DIM) (with standard deviation in parentheses), and mean and standard deviation of raw and corrected dry matter intake (DMI) and residual feed intake (RFI) per country and dataset
| Dataset | DIM | Raw DMI mean | Raw DMI SD | DMIa number | DMIa mean | DMIa SD | RFIa number | RFIa mean | RFIa SD |
|---|---|---|---|---|---|---|---|---|---|
| AUSh | | 8.29 | 1.34 | 824 | −0.01 | 0.84 | 824 | −0.02 | 0.42 |
| AUSc | 108 (35) | 22.93 | 3.80 | 584 | −0.02 | 2.10 | 584 | 0.00 | 1.29 |
| USA | 81 (67) | 21.52 | 4.47 | 673 | 0.09 | 1.97 | 671 | 0.01 | 1.46 |
| CAN | 121 (77) | 20.54 | 5.47 | 755 | 0.04 | 3.20 | 473 | −0.07 | 3.04 |
| DNK | 140 (83) | 21.93 | 3.57 | 439 | 0.34 | 1.87 | 425 | 0.12 | 1.17 |
| CHE | 120 (79) | 21.48 | 3.62 | 95 | 0.10 | 2.88 | 63 | 0.09 | 1.20 |
| GBR | 143 (83) | 16.70 | 5.25 | 564 | −0.31 | 2.85 | 211 | −0.14 | 2.26 |
| NLD | na | na | na | na | na | na | 597 | 0.00 | 0.97 |
| EU | na | na | na | 1098 | −0.02 | 2.52 | 1296 | 0.02 | 1.33 |
| NA | na | na | na | 1428 | 0.07 | 2.69 | 1144 | −0.02 | 2.25 |
| OVE | na | na | na | 2526 | 0.03 | 2.62 | 2440 | 0.00 | 1.82 |
Acronyms as described in Tables 1 and 2
na: not applicable
aCorrected phenotypes for dry matter intake (DMI) and residual feed intake (RFI) in GBLUP
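The pooled datasets are defined in Table 1 as sums of country datasets (EU = DNK + GBR + CHE + NLD, NA = USA + CAN, OVE = EU + NA), so the pooled animal counts in the table above can be cross-checked against the per-country rows. The following is a minimal sketch of that check; the variable and function names are illustrative only, and the counts are transcribed by hand from the table.

```python
# Per-country animal counts transcribed from the table above as
# (corrected DMI, RFI). None marks an "na" entry.
counts = {
    "USA": (673, 671),
    "CAN": (755, 473),
    "DNK": (439, 425),
    "CHE": (95, 63),
    "GBR": (564, 211),
    "NLD": (None, 597),
}

# Pooled dataset definitions taken from Table 1.
groups = {"EU": ["DNK", "GBR", "CHE", "NLD"], "NA": ["USA", "CAN"]}
groups["OVE"] = groups["EU"] + groups["NA"]


def pooled(group, column):
    """Sum one count column over the countries in a pooled dataset.

    Countries with an "na" entry contribute zero animals.
    """
    return sum(counts[country][column] or 0 for country in groups[group])


pooled_counts = {g: (pooled(g, 0), pooled(g, 1)) for g in groups}

for name, (n_dmi, n_rfi) in pooled_counts.items():
    print(f"{name}: DMI animals = {n_dmi}, RFI animals = {n_rfi}")
```

With these inputs the pooled sums agree with the EU, NA, and OVE rows of the table, which supports reading those rows as simple concatenations of the country datasets.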
Fig. 1 (a) Principal component decomposition of the genomic relationship matrix constructed from HD SNP genotypes for 3711 animals from seven different countries, and (b) neighbor-joining tree representing the genetic distances between animals from different countries
Observed and predicted heterozygosities for each genotype set
| Dataset | Number of animals | HEpr | HEo | HEoHW |
|---|---|---|---|---|
| AUSc | 584 | 0.33 | 0.34 | 0.33 |
| AUSh | 687 | 0.33 | 0.34 | 0.34 |
| USA | 671 | 0.33 | 0.33 | 0.32 |
| CAN | 473 | 0.32 | 0.32 | 0.32 |
| DNK | 425 | 0.33 | 0.33 | 0.33 |
| CHE | 63 | 0.32 | 0.33 | 0.34 |
| GBR | 211 | 0.33 | 0.34 | 0.33 |
| NLD | 597 | 0.33 | 0.33 | 0.33 |
Acronyms as described in Table 1
Number of animals: number of animals used to calculate residual feed intake (RFI); HEpr: heterozygosity predicted from the GRM; HEo: observed heterozygosity; HEoHW: heterozygosity when SNP are assumed to be in Hardy–Weinberg equilibrium
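To make two of the column definitions concrete, the sketch below computes observed heterozygosity and the heterozygosity expected under Hardy–Weinberg equilibrium from a small genotype matrix. It assumes genotypes are coded as 0, 1, or 2 copies of one allele, uses the standard expectation 2p(1 − p) averaged over SNP, and runs on a made-up toy matrix; none of this is taken from the authors' pipeline, and the GRM-based prediction (HEpr) is not reproduced here.

```python
def observed_heterozygosity(genotypes):
    """Fraction of heterozygous calls (coded 1) among all genotype calls."""
    calls = [g for animal in genotypes for g in animal]
    return sum(1 for g in calls if g == 1) / len(calls)


def hwe_heterozygosity(genotypes):
    """Mean of 2p(1 - p) over SNP, with p the allele frequency per SNP."""
    n_animals = len(genotypes)
    n_snps = len(genotypes[0])
    total = 0.0
    for j in range(n_snps):
        # Allele frequency: allele copies divided by 2 * number of animals.
        p = sum(animal[j] for animal in genotypes) / (2 * n_animals)
        total += 2 * p * (1 - p)
    return total / n_snps


# Hypothetical toy data: 4 animals (rows) by 3 SNP (columns).
toy = [
    [0, 1, 2],
    [1, 1, 0],
    [2, 1, 1],
    [1, 0, 1],
]

he_obs = observed_heterozygosity(toy)
he_hwe = hwe_heterozygosity(toy)
print(f"observed = {he_obs:.3f}, expected under HWE = {he_hwe:.3f}")
```

Close agreement between the two quantities, as in the table above, is what would be expected for populations without strong inbreeding or substructure.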
Number of animals, variance explained by 50k genotypes and residuals, and genomic heritability estimates for DMI and RFI within each dataset
| Dataset | RFI: number of animals | RFI: Vg | RFI: Ve | RFI: h2 (s.e.) | DMI: number of animals | DMI: Vg | DMI: Ve | DMI: h2 (s.e.) |
|---|---|---|---|---|---|---|---|---|
| AUSh | 824 | 0.37 | 0.65 | 0.36 (0.087) | 824 | 0.37 | 0.65 | 0.36 (0.086) |
| AUSc | 584 | 0.19 | 0.82 | 0.19 (0.088) | 584 | 0.33 | 0.67 | 0.33 (0.090) |
| EU | 1296 | 0.29 | 0.60 | 0.32 (0.053) | 1098 | 0.23 | 0.55 | 0.29 (0.053) |
| NA | 1144 | 0.26 | 0.71 | 0.27 (0.051) | 1428 | 0.34 | 0.52 | 0.40 (0.045) |
| OVE | 2440 | 0.26 | 0.68 | 0.27 (0.034) | 2526 | 0.25 | 0.58 | 0.30 (0.032) |
| AUScEU | 1880 | 0.28 | 0.65 | 0.30 (0.043) | 1682 | 0.25 | 0.61 | 0.29 (0.044) |
| AUScNA | 1728 | 0.23 | 0.75 | 0.23 (0.041) | 2012 | 0.32 | 0.58 | 0.36 (0.038) |
| AUScOVE | 3024 | 0.24 | 0.70 | 0.26 (0.030) | 3110 | 0.25 | 0.61 | 0.29 (0.029) |
Acronyms as described in Tables 1 and 2
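The heritability column can be related to the two variance columns through the usual definition h2 = Vg / (Vg + Ve). The sketch below applies that definition to three RFI rows of the table; it is an illustration rather than the authors' estimation procedure, and because Vg and Ve are published to two decimals, a recomputed value can differ from the published one in the last digit for some rows.

```python
def genomic_heritability(vg, ve):
    """Proportion of phenotypic variance explained by the genotypes."""
    return vg / (vg + ve)


# (Vg, Ve) for RFI, transcribed from the table above.
rfi_variances = {
    "AUSh": (0.37, 0.65),
    "AUSc": (0.19, 0.82),
    "NA": (0.26, 0.71),
}

h2 = {
    dataset: round(genomic_heritability(vg, ve), 2)
    for dataset, (vg, ve) in rfi_variances.items()
}

for dataset, value in h2.items():
    print(f"{dataset}: h2 = {value:.2f}")
```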
Genetic correlations (SE) for RFI from the trivariate model under different scenarios
| Dataset | Number of animals | AUSc-AUSh | AUSc-EU/NA/OVE | AUSh-EU/NA/OVE |
|---|---|---|---|---|
| AUScAUShEU | 2567 | 0.30 (0.26) | 0.68 (0.31) | 0.28 (0.22) |
| AUScAUShNA | 2415 | 0.37 (0.27) | 0.90 (0.39) | 0.19 (0.24) |
| AUScAUShOVE | 3711 | 0.34 (0.26) | 0.95 (0.28) | 0.25 (0.18) |
Acronyms as described in Tables 1 and 2
SE: standard error
Fig. 2 Box plot showing accuracies of GEBV for RFI from single- and multi-variate GREML analyses, using cohort and random cross-fold validation approaches. As training populations, AUSc (Australian cows) was used for the single-variate analysis; AUScAUSh, AUScEU, AUScNA, and AUScOVE (AUSc with Australian heifers (AUSh), European cows (EU), North American cows (NA), and overseas cows (OVE), respectively) were used for the bi-variate analyses; and AUScAUShEU, AUScAUShNA, and AUScAUShOVE (EU, NA, and OVE cow datasets added to the AUScAUSh dataset) were used for the tri-variate analyses
Fig. 3 Mean accuracies of GEBV for RFI when the Australian cow and overseas cow datasets are treated as a single trait, using cohort and random cross-fold validation approaches. The standard error (SE) bars are approximate estimates. As training populations, AUSc (Australian cows) was used for the single-variate analysis; AUScEU, AUScNA, and AUScOVE (AUSc with European cows (EU), North American cows (NA), and overseas cows (OVE), respectively) were used for the bi-variate analyses
Average weighted accuracies of GEBV for RFI and regression slopes of corrected phenotypes on GEBV (bias) of the four cohort and random fourfold cross-validation subsets
| Training set | Number of traitsa | Variant set | Accuracy | Bias | Differenceb |
|---|---|---|---|---|---|
| Cohort | |||||
| AUSc | 1 | 50k | 0.29 | 1.13 | |
| AUSc | 1 | HD | 0.29 | 1.10 | 0.0 |
| AUScAUSh | 2 | 50k | 0.25 | 0.90 | −3.5 |
| AUScOVE | 2 | 50k | 0.52 | 1.31 | 22.9 |
| AUScAUShOVE | 3 | 50k | 0.50 | 1.28 | 21.1 |
| AUScOVE | 1 | 50k | 0.50 | 0.99 | 21.8 |
| AUSc | 1 | 50k + s-tr_G | 0.34 | 1.28 | 5.6 |
| AUSc | 1 | 50k + m-tr_G | 0.38 | 1.10 | 9.0 |
| AUSc | 1 | 50k + sm-tr_G | 0.39 | 1.15 | 10.8 |
| Random | |||||
| AUSc | 1 | 50k | 0.32 | 1.07 | |
| AUSc | 1 | HD | 0.32 | 1.14 | 0.5 |
| AUScAUSh | 2 | 50k | 0.29 | 0.84 | −3.0 |
| AUScOVE | 2 | 50k | 0.53 | 1.26 | 21.3 |
| AUScAUShOVE | 3 | 50k | 0.50 | 1.17 | 18.4 |
| AUScOVE | 1 | 50k | 0.52 | 0.93 | 20.1 |
| AUSc | 1 | 50k + s-tr_G | 0.34 | 1.17 | 1.8 |
| AUSc | 1 | 50k + m-tr_G | 0.37 | 1.10 | 5.5 |
| AUSc | 1 | 50k + sm-tr_G | 0.38 | 1.15 | 6.6 |
Acronyms as described in Table 1
aNumber of traits (datasets) used in each analysis
bDifference in accuracy of GEBV (%) between using the Australian cow dataset (AUSc) with 50k GRM and the corresponding dataset
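Footnote b describes the last column as a difference in accuracy, in %, relative to the AUSc analysis with the 50k GRM. A minimal sketch of that calculation for the cohort validation rows is given below, reading the difference as percentage points (accuracy difference × 100), which is an interpretation rather than something the footnote states outright. The function name is illustrative, and because the published accuracies are rounded to two decimals, the recomputed differences only approximate the tabulated ones (for example, about 23 versus the published 22.9 for the two-trait AUScOVE analysis).

```python
# Baseline: AUSc, one trait, 50k variant set, cohort validation.
BASELINE_ACCURACY = 0.29


def accuracy_gain_points(accuracy, baseline=BASELINE_ACCURACY):
    """Difference from the baseline accuracy, in percentage points."""
    return round((accuracy - baseline) * 100, 1)


# Cohort-validation accuracies transcribed from the table above.
cohort_accuracy = {
    "AUScAUSh, 2 traits, 50k": 0.25,
    "AUScOVE, 2 traits, 50k": 0.52,
    "AUScAUShOVE, 3 traits, 50k": 0.50,
    "AUSc, 1 trait, 50k + sm-tr_G": 0.39,
}

gains = {name: accuracy_gain_points(acc) for name, acc in cohort_accuracy.items()}

for name, gain in gains.items():
    print(f"{name}: {gain:+.1f} points")
```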