Literature DB >> 30147871

The impact of genomic relatedness between populations on the genomic estimated breeding values.

Peipei Ma1,2, Ju Huang1, Weijia Gong3, Xiujin Li1, Hongding Gao4, Qin Zhang1, Xiangdong Ding1, Chonglong Wang5.   

Abstract

In genomic selection, prediction accuracy is highly driven by the size of animals in the reference population (RP). Combining related populations from different countries and regions or using a related population with large size of RP has been considered to be viable strategies in cattle breeding. The genetic relationship between related populations is important for improving the genomic predictive ability. In this study, we used 122 French bulls as test individuals. The genomic estimated breeding values (GEBVs) evaluated using French RP, America RP and Chinese RP were compared. The results showed that the GEBVs were in higher concordance using French RP and American RP compared with using Chinese population. The persistence analysis, kinship analysis and the principal component analysis (PCA) were performed for 270 French bulls, 270 American bulls and 270 Chinese bulls to interpret the results. All the analyses illustrated that the genetic relationship between French bulls and American bulls was closer compared with Chinese bulls. Another reason could be the size of RP in China was smaller than the other two RPs. In conclusion, using RP of a related population to predict GEBVs of the animals in a target population is feasible when these two populations have a close genetic relationship and the related population is large.

Entities:  

Keywords:  Genomic prediction; Genomic relationship; Joint population prediction

Year:  2018        PMID: 30147871      PMCID: PMC6094871          DOI: 10.1186/s40104-018-0279-4

Source DB:  PubMed          Journal:  J Anim Sci Biotechnol        ISSN: 1674-9782


Short communication

Since genomic selection (GS) was first described by Meuwissen et al. [1], with the constantly decreasing genotyping cost, this technology has revolutionized breeding of both livestock and crops in the last few years. The size of reference population (RP) and the relationship between the reference and candidate population were reported to be the important factors affecting accuracy of genomic prediction [2-4]. The advantage of using GS has been limited due to limited size of RP. Firstly, a low number of progeny-test proven bulls were available in each country especially in countries which mainly relied on importing bull semen from the other countries, e.g. China [5]. Secondly, it is not economically feasible to genotype all the animals as RP since the contribution of the cows may be less than the cost for genotyping them [6]. To gain accuracy of GEBV, two strategies were used in practice. One strategy is to combine the reference data from several countries. The other one is to use the RP from a commercial institute e.g. CDCB (https://www.uscdcb.com/what-we-do/genomics). However, it was reported that the relationship between RP and candidate individual was a crucial factor for prediction accuracy in genomic prediction [7, 8]. Therefore, it is necessary to investigate the relationship between populations before applying these strategies. The objectives of this study were 1) to investigate the correlations between genomic estimated breeding values (GEBVs) for French bulls using Chinese, French and American RP separately; 2) to explore the reasons led to different GEBVs, by analyzing the linkage disequilibrium (LD) phase persistence, genetic relatedness, and population structure among French, American and Chinese populations.

Materials and methods

Data

A total of 122 French bulls were used as test set in this study. The GEBVs of milk yield, fat percentage, protein percentage, confirmation and feet_legs evaluated using American RP and French RP separately was provided by Gènes DIFFSUION. The GEBVs of these 122 French bulls using Chinese RPs were estimated in this study. The Chinese RP consisted of 1,568 Chinese cows with both genotype and phenotype. De-regressed proof (DRP) was used as the response variable for genomic prediction in this study. Genotypes of 270 French bulls, 270 American bulls and 270 Chinese bulls were used to compare the relationship among three populations. These 270 French bulls were the progenies of the imported French bulls and cows. So did the American bulls. The Chinese bulls were randomly selected from the native population. All the animals were genotyped using Illumina Bovine SNP50 BeadChip (Illumina, San Diego, CA, USA). After deleting SNPs with a minor allele frequency smaller than 0.01, 45,404 SNPs on 29 autosomes were retained.

Model

GBLUP [9] was used for prediction of GEBV using Chinese RP. The model is as follows:where is a vector of DRP from Chinese population, μ is the overall mean, g is a vector of GEBV, 1 is a vector of ones, Z is the design matrix for linking g to , and e is a vector of the random residuals. Random effects were assumed to be normally distributed as g~N(0,) and e~N(0,),where is the additive genetic variance, is the residual variance, G is the genomic relationship matrix constructed with all the markers using the formula G = MM′/ ∑ 2p(1 − p) [9]. The genotypes in M were coded as 0, 1, and 2 for A1A1, A1A2 and A2A2 and then centralized by subtracting 2p [9], where p was the allele frequency of A2 and was calculated based on the genotypes from the individuals used in the model. DMU package [10] was used to estimate variance components and obtain solutions of the mixed model equations.

Validation of genomic predictive ability

The Spearman’s rank correlation coefficient between GEBVs predicted using different RPs was used as a measurement of concordance of GEBVs. The correlation coefficient between GEBVs evaluated from Chinese RPs and from French RPs was named as CORCF. Accordingly, CORCA was used for that between Chinese RPs and American RPs and CORFA for that between French RPs and American RPs.

The measurement of relatedness between different populations

To examine the genetic relatedness between different RPs, three measurements of genetic distance were performed for 270 French bulls, 270 American bulls and 270 Chinese bulls: 1) the persistence of LD phase between two populations. It was calculated as the correlation of linkage disequilibrium (r2) of adjacent marker pairs on each autosome [11, 12]. The persistence of LD phase between each pair of these three populations was named PERCF, PERCA, PERFA. 2) the number of pair of related individuals between different populations. All pair-wise relationship can be classified as monozygotic twins, 1st-, 2nd- or 3rd- degree relatives by the estimation of kinship coefficients using Kinship-based INference for Genome-wide association study (KING) software package [13]. 3) the principal components (PCs) of marker genotype data. Principal components analysis (PCA) was performed on genotype using KING [13]. We used the plot of PC2 against PC1 as the description of genetic similarity among three populations.

Results

The comparison of genomic prediction using different RPs

The spearman’s rank correlation coefficient between GEBVs using RP from two of three countries is shown in Table 1. For all traits, the correlation between GEBVs using French RP and using American RP (CORFA) is much larger than the correlation between GEBVs using Chinese RP and using American RP (CORCA) or French RP (CORCF). CORFA for fat percentage achieved the highest (0.862) while CORCA for milk yield was the lowest (0.060). CORCF ranged from 0.133 (for feet_legs) to 0.442 (for conformation). CORCA was similar as CORCF and ranged from 0.060 (for milk yield) to 0.420 (for protein percentage). CORFA was much larger than CORCF and CORCA and ranged from 0.472 (for feet_legs) to 0.862 (for fat percentage).
Table 1

Spearman’s rank correlation coefficient between GEBVs evaluated using different RP

TraitCORCFaCORCAbCORFAc
Milk yield0.1570.1600.752
Fat percentage0.1620.1670.862
Protein percentage0.4030.4200.805
Conformation0.4420.4480.643
Mammary system0.3590.4020.765
Feet_legs0.2330.3080.472

aThe correlation between GEBVs estimated using Chinese RP and using American RP

bThe correlation between GEBVs estimated using Chinese RP and using French RP

cThe correlation between GEBVs estimated using French RP and using American RP

Spearman’s rank correlation coefficient between GEBVs evaluated using different RP aThe correlation between GEBVs estimated using Chinese RP and using American RP bThe correlation between GEBVs estimated using Chinese RP and using French RP cThe correlation between GEBVs estimated using French RP and using American RP The plot of GEBVs of milk yield using different RPs is presented in Fig. 1. The trends of GEBVs using American RP and French RP are similar while the trends of GEBVs of using Chinese RP are relative different from the GEBVs using the other two RPs.
Fig. 1

The genomic estimated breeding values (GEBVs) of milk yield for 122 French bulls estimated using different reference population (RP)

The genomic estimated breeding values (GEBVs) of milk yield for 122 French bulls estimated using different reference population (RP)

LD and persistence of LD phase

The LD of each chromosome from each population and persistence of LD phase (PER) between populations are shown in Table 2. The mean r2 of adjacent SNP pairs within each chromosome ranged from 0.13 (chromosomes 27 and 28) to 0.19 (chromosomes 6 and 14) for Chinese RP, 0.14 (chromosomes 27 and 28) to 0.20 (chromosomes 6 and 14) in both France and USA RPs. The mean r2 across all chromosomes were 0.16 in China and 0.17 in France and USA. The persistence of LD phase between France and USA RPs was apparently higher than that between China and the other two countries. The PERCF ranged from 0.893 of chromosome 28 to 0.959 of chromosome 14. The PERCA ranged from 0.931 of chromosome 9 to 0.973 of chromosome 14. The PERFA ranged from 0.942 of chromosome 19 to 0.974 of chromosome 29.
Table 2

Linkage disequilibrium (LD) of adjacent markers for each Bos Taurus autosome (BTA)

BTANo. of SNPsMean r2Persistencea
ChinaFranceUSAChina- FranceChina-USAFrance-USA
134310.180.190.190.9340.9520.968
228290.170.180.170.9300.9520.962
325500.170.190.180.9200.9460.950
425700.160.170.170.9090.9330.948
522710.150.160.160.9260.9530.968
625750.190.200.200.9340.9540.966
723520.170.190.180.9240.9440.961
824300.170.180.180.9120.9440.949
920950.160.170.170.8980.9310.960
1022060.170.190.190.9270.9380.955
1122950.160.170.160.9300.9570.966
1217730.150.160.160.9200.9470.958
1318500.160.170.170.9070.9540.960
1418310.190.200.200.9590.9730.972
1517620.150.160.150.9230.9410.967
1617260.170.180.190.9260.9540.948
1716000.160.170.170.9220.9440.957
1813760.160.170.160.9300.9440.965
1914200.150.160.160.9050.9540.942
2015680.170.190.180.9190.9490.966
2114830.160.170.170.9420.9640.966
2213240.150.160.150.9340.9590.968
2310920.140.160.150.9210.9490.968
2413120.160.170.160.9320.9590.969
2510040.150.160.160.9260.9330.972
2611160.150.170.160.9040.9430.963
279810.130.140.140.9320.9450.967
289810.130.140.140.8930.9390.955
2910870.140.150.150.9450.9650.974
Mean52,890b0.160.170.170.9240.9490.962

aThe correlation of r2 of adjacent SNP pairs between two populations

bSum over 29 autosomes

Linkage disequilibrium (LD) of adjacent markers for each Bos Taurus autosome (BTA) aThe correlation of r2 of adjacent SNP pairs between two populations bSum over 29 autosomes

The kinship coefficients and classification of all pair-wise relationship

The number of pairs of related individuals in each relationship group which was determined by KING software was listed in Table 3. There was 1 pair of individuals in 1st-degree, 1 in 2nd-degree and 596 in 3rd-degree based on genomic relationship between Chinese population and French population. Based on genomic relationship between Chinese population and American population, there were 2 pairs of individuals in 1st-degree, 0 in 2nd-degree and 1,174 in 3rd-degree. Compared with genomic relationship between Chinese population and French population or American population, there were much more pairs of individuals in 1st, 2nd and 3rd degrees based on genomic relationship between French population and American population, which meant there were more related individuals in these two populations.
Table 3

The number of pairs of related individuals between different populations

RelationshipCriteriaNo. of pairs between CFaNo. of pairs between CAbNo. of pairs between FAc
MZ twin[0.354,]000
1st-degree[0.177,0.354]1232
2nd-degree[0.0884,0.177]101057
3rd-degree[0.0442,0.0884]59611744447
Unrelated[,0.0442]71,49271,72466,554
Total72,09072,90072,090

aChinese population and French population

bChinese population and American population

cFrench population and American population

The number of pairs of related individuals between different populations aChinese population and French population bChinese population and American population cFrench population and American population

The principal component analysis (PCA)

Figure 2 illustrates that the relationship between French population and American population was closer than the relationship between them and Chinese population.
Fig. 2

The principal components of marker genotype data.The first principle component (PC1) versus the second principle component (PC2) calculated using marker genotype data

The principal components of marker genotype data.The first principle component (PC1) versus the second principle component (PC2) calculated using marker genotype data

Discussion

In this study, we investigated the difference on GEBVs for French Holstein bulls using references from different countries. The genomic relatedness between different populations were investigated to illustrate the results. The results showed that the correlation between GEBVs estimated using French RP and using American RP was higher than the correlation between GEBVs estimated using Chinese RP and French/American RP. The LD phase persistence analysis, kinship coefficients and the PCA showed that the relationship between French population and American population was closer than that between Chinese population and American or French population. For combined RP, a close relationship between populations reflects a similar LD structures among populations which enabled the joint prediction feasible. Lund et al. [14] used European Holsteins as joint reference to predict Nordic Holstein, Dutch Holstein, French Holstein and German Holstein and found reliability improved by up to 10% compared with using separate RP. A joint Nordic Red dairy cattle RP was intended to improve the accuracy of genomic prediction in the previous study [15]. However, the results showed that the prediction for Swedish and Finnish population was improved slightly when the Danish Red dairy cattle were added into the RP since the relationship between Finnish Red and Swedish Red was closer compared with the relationship between Danish Red and the other two populations [15]. Similar pattern was observed when G matrix was used to measure the relationship among three countries in our study and the report from Brøndum et al. [15]. Higher related individuals were observed between Swedish and Finnish Red in their study and between American population and French population in our study. It is consisted with the conclusion from previous studies that the prediction ability was improved by including related individuals in the RP [16, 17]. The average of kinship among individuals from different countries was calculated, and the results showed the average relationship of any two countries was similar with the others (data not shown). One of the reasons could be that too many small values diluted the close relationship, which illustrated that the average of kinship matrix was not suitable as the criterion to measure the relationship between populations. Another reason leading to the spearman’s rank correlation coefficient between GEBVs using Chinese RP and using American/French RP smaller than the other two correlations could be that the size of RP was different. Since Chinese RP in this study only included 1568 individuals, which may be not as informative as proven bulls from the other two countries. The combined RP between Nordic Holstein population, which is one member of Eurogenomics, and Chinese Holstein population had been utilized to investigate the improvement of reliability of genomic prediction in previous studies [5, 18]. The results showed the reliability of genomic prediction for Chinese population was improved greatly while little improvement for Nordic population [5]. Therefore, the size of RP should be considered when joint-population prediction was conducted besides taking the relationship between different populations into account. There is possibility to improve the genomic prediction ability for populations with a small number of RP even if the relationship between the added population and target population is distant.

Conclusions

Information from the other related populations was applied to improve the predictive ability. However, our results showed that the GEBVs were in different rank when a loose related population was used as RP. Integrating results from previous studies, we concluded that it was feasible to predict the GEBVs of a target population using RP of a related population in the condition that there was a close genetic relationship between these two populations and the size of related population is large.
  15 in total

1.  The impact of genetic relationship information on genome-assisted breeding values.

Authors:  D Habier; R L Fernando; J C M Dekkers
Journal:  Genetics       Date:  2007-12       Impact factor: 4.562

Review 2.  Mapping genes for complex traits in domestic animals and their use in breeding programmes.

Authors:  Michael E Goddard; Ben J Hayes
Journal:  Nat Rev Genet       Date:  2009-06       Impact factor: 53.242

3.  Efficient methods to compute genomic predictions.

Authors:  P M VanRaden
Journal:  J Dairy Sci       Date:  2008-11       Impact factor: 4.034

4.  Impact of relationships between test and training animals and among training animals on reliability of genomic prediction.

Authors:  X Wu; M S Lund; D Sun; Q Zhang; G Su
Journal:  J Anim Breed Genet       Date:  2015-05-23       Impact factor: 2.380

5.  Reliability of direct genomic values for animals with different relationships within and to the reference population.

Authors:  M Pszczola; T Strabel; H A Mulder; M P L Calus
Journal:  J Dairy Sci       Date:  2012-01       Impact factor: 4.034

6.  Increasing imputation and prediction accuracy for Chinese Holsteins using joint Chinese-Nordic reference population.

Authors:  P Ma; M S Lund; X Ding; Q Zhang; G Su
Journal:  J Anim Breed Genet       Date:  2014-08-06       Impact factor: 2.380

7.  Linkage disequilibrium and persistence of phase in Holstein-Friesian, Jersey and Angus cattle.

Authors:  A P W de Roos; B J Hayes; R J Spelman; M E Goddard
Journal:  Genetics       Date:  2008-07-13       Impact factor: 4.562

8.  Extent of linkage disequilibrium in Holstein cattle in North America.

Authors:  M Sargolzaei; F S Schenkel; G B Jansen; L R Schaeffer
Journal:  J Dairy Sci       Date:  2008-05       Impact factor: 4.034

9.  Comparison on genomic predictions using three GBLUP methods and two single-step blending methods in the Nordic Holstein population.

Authors:  Hongding Gao; Ole F Christensen; Per Madsen; Ulrik S Nielsen; Yuan Zhang; Mogens S Lund; Guosheng Su
Journal:  Genet Sel Evol       Date:  2012-07-06       Impact factor: 4.297

10.  Consistency of linkage disequilibrium between Chinese and Nordic Holsteins and genomic prediction for Chinese Holsteins using a joint reference population.

Authors:  Lei Zhou; Xiangdong Ding; Qin Zhang; Yachun Wang; Mogens S Lund; Guosheng Su
Journal:  Genet Sel Evol       Date:  2013-03-21       Impact factor: 4.297

View more

北京卡尤迪生物科技股份有限公司 © 2022-2023.