Shaopan Ye, Ning Gao, Rongrong Zheng, Zitao Chen, Jinyan Teng, Xiaolong Yuan, Hao Zhang, Zanmou Chen, Xiquan Zhang, Jiaqi Li, Zhe Zhang.
Abstract
Genomic prediction with imputed whole-genome sequencing (WGS) data is an attractive approach to improve predictive ability at low cost. However, high accuracy has not yet been realized using this method in livestock. In this study, we imputed 435 individuals from 600K single nucleotide polymorphism (SNP) chip data to WGS data using different reference panels. We also investigated the prediction accuracy of genomic best linear unbiased prediction (GBLUP) using imputed WGS data from different reference panels, linkage disequilibrium (LD)-based marker pruning, and pre-selected variants based on genome-wide association study (GWAS) results. Results showed that the imputation accuracies from 600K to WGS data were 0.873 ± 0.038, 0.906 ± 0.036, and 0.979 ± 0.010 for the internal, external, and combined reference panels, respectively. For most traits in chickens, the prediction accuracy of imputed WGS data obtained from the internal reference panel was greater than or equal to that of the combined reference panel; the external reference panel had the lowest prediction accuracy. Compared with 600K chip data, GBLUP with imputed WGS data yielded only a small increase (1-3%) in prediction accuracy. Using only variants selected from imputed WGS data based on GWAS results gave almost no increase for most traits and even increased the bias of the regression coefficient. The degree of LD among the selected variants and among the remaining variants affected prediction accuracy differently. For average daily gain (ADG), residual feed intake (RFI), intestine length (IL), and body weight in 91 days (BW91), the accuracy of GBLUP increased as the degree of LD of selected variants decreased, but the opposite relationship occurred for the remaining variants. However, for breast muscle weight (BMW) and average daily feed intake (ADFI), the accuracy of GBLUP increased as the degree of LD of selected variants increased, and the degree of LD of remaining variants had only a small effect on prediction accuracy.
Overall, the optimal imputation strategy to obtain WGS data for genomic prediction should consider the relationship between selected individuals and target population individuals to avoid heterogeneity of imputation. LD-based marker pruning can be used to improve the accuracy of genomic prediction using imputed WGS data.
Keywords: GWAS; LD-based marker pruning; chickens; genomic prediction; imputed WGS data
Year: 2019 PMID: 31379929 PMCID: PMC6650575 DOI: 10.3389/fgene.2019.00673
Source DB: PubMed Journal: Front Genet ISSN: 1664-8021 Impact factor: 4.599
Figure 1. Average imputation accuracies of Beagle 4.1 using different reference panels per chromosome. The imputation accuracy was assessed by the correlation between imputed and masked true genotypes per SNP.
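The per-SNP accuracy measure described in the Figure 1 caption can be illustrated with a minimal sketch. It assumes genotypes are coded as allele dosages (0, 1, 2) and that the masked true genotypes are available as a matrix. The function name `per_snp_imputation_accuracy` and the toy arrays are hypothetical and are not taken from the study.

```python
import numpy as np

def per_snp_imputation_accuracy(true_geno, imputed_geno):
    """Pearson correlation between true and imputed genotypes, one value per SNP.

    Both inputs are (n_individuals, n_snps) arrays of allele dosages.
    SNPs without variation in either array get NaN because the
    correlation is undefined there.
    """
    t = np.asarray(true_geno, dtype=float)
    m = np.asarray(imputed_geno, dtype=float)
    t_c = t - t.mean(axis=0)          # center each SNP column
    m_c = m - m.mean(axis=0)
    cov = (t_c * m_c).sum(axis=0)
    denom = np.sqrt((t_c ** 2).sum(axis=0) * (m_c ** 2).sum(axis=0))
    with np.errstate(divide="ignore", invalid="ignore"):
        return np.where(denom > 0, cov / denom, np.nan)

# Toy data (assumed): 4 individuals, 3 SNPs; one genotype of SNP 3 is mis-imputed.
true_geno = np.array([[0, 1, 2], [1, 1, 0], [2, 0, 1], [0, 2, 1]])
imputed_geno = np.array([[0, 1, 2], [1, 1, 0], [2, 0, 0], [0, 2, 1]])
acc = per_snp_imputation_accuracy(true_geno, imputed_geno)
```

Averaging `acc` over the SNPs of a chromosome would give a per-chromosome summary of the kind plotted in the figure.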
Prediction accuracy and regression coefficients of 21 traits in chicken using genomic best linear unbiased prediction (GBLUP) with different genotype data.
| Traits¹ | 600K array²: accuracy⁶ | 600K array²: regression⁷ | WGS (internal)³: accuracy⁶ | WGS (internal)³: regression⁷ | WGS (external)⁴: accuracy⁶ | WGS (external)⁴: regression⁷ | WGS (combined)⁵: accuracy⁶ | WGS (combined)⁵: regression⁷ |
|---|---|---|---|---|---|---|---|---|
| ADG | [missing] | 0.98 (0.05) | 0.33 (0.01) | 0.99 (0.06) | 0.33 (0.01) | 1.02 (0.06) | 0.33 (0.01) | 0.99 (0.06) |
| ADFI | 0.40 (0.01) | 1.01 (0.04) | [missing] | 1.00 (0.04) | 0.42 (0.01) | 1.02 (0.04) | 0.42 (0.01) | 1.00 (0.04) |
| RFI | 0.45 (0.01) | 1.03 (0.04) | [missing] | 1.03 (0.04) | 0.46 (0.01) | 1.03 (0.04) | 0.47 (0.01) | 1.03 (0.04) |
| FCR | 0.26 (0.01) | 0.92 (0.06) | 0.26 (0.01) | 0.95 (0.06) | [missing] | 0.98 (0.06) | 0.26 (0.01) | 0.94 (0.06) |
| CW | 0.30 (0.01) | 1.01 (0.06) | [missing] | 0.99 (0.06) | 0.29 (0.01) | 1.01 (0.06) | 0.30 (0.01) | 0.99 (0.06) |
| EWG | 0.28 (0.01) | 0.97 (0.07) | [missing] | 0.99 (0.07) | 0.27 (0.01) | 1.00 (0.08) | 0.28 (0.01) | 0.99 (0.07) |
| EW | 0.26 (0.01) | 0.96 (0.07) | [missing] | 1.00 (0.08) | 0.26 (0.02) | 1.00 (0.08) | [missing] | 1.00 (0.08) |
| BMW | 0.26 (0.02) | 1.04 (0.09) | [missing] | 1.04 (0.08) | 0.25 (0.01) | 1.04 (0.08) | [missing] | 1.04 (0.08) |
| DW | 0.20 (0.01) | 1.10 (0.14) | [missing] | 1.11 (0.14) | 0.20 (0.01) | 1.12 (0.14) | 0.21 (0.01) | 1.10 (0.12) |
| AFW | 0.36 (0.01) | 1.04 (0.05) | 0.36 (0.01) | 1.02 (0.04) | 0.36 (0.01) | 1.04 (0.05) | 0.36 (0.01) | 1.02 (0.04) |
| AFP | 0.32 (0.01) | 1.00 (0.04) | [missing] | 1.00 (0.04) | 0.32 (0.01) | 0.98 (0.04) | [missing] | 1.00 (0.04) |
| GW | 0.23 (0.01) | 1.15 (0.10) | 0.23 (0.01) | 1.18 (0.10) | 0.23 (0.01) | 1.17 (0.10) | [missing] | 1.16 (0.10) |
| IL | 0.24 (0.01) | 1.09 (0.07) | [missing] | 1.07 (0.07) | 0.24 (0.01) | 1.08 (0.07) | [missing] | 1.07 (0.07) |
| BW45 | 0.28 (0.01) | 1.06 (0.07) | [missing] | 1.06 (0.07) | 0.25 (0.01) | 1.09 (0.08) | [missing] | 1.05 (0.07) |
| BW49 | 0.27 (0.01) | 1.07 (0.06) | [missing] | 1.06 (0.05) | 0.26 (0.01) | 1.08 (0.06) | 0.28 (0.01) | 1.05 (0.05) |
| BW56 | 0.29 (0.01) | 1.19 (0.09) | [missing] | 1.18 (0.08) | 0.27 (0.01) | 1.19 (0.09) | [missing] | 1.18 (0.08) |
| BW63 | 0.26 (0.01) | 1.22 (0.10) | [missing] | 1.23 (0.11) | 0.24 (0.02) | 1.28 (0.12) | [missing] | 1.25 (0.11) |
| BW70 | 0.26 (0.01) | 1.06 (0.07) | [missing] | 1.06 (0.07) | 0.24 (0.01) | 1.08 (0.07) | [missing] | 1.04 (0.07) |
| BW77 | 0.29 (0.01) | 0.99 (0.06) | [missing] | 1.03 (0.07) | 0.29 (0.01) | 1.06 (0.07) | 0.29 (0.01) | 1.02 (0.07) |
| BW84 | 0.32 (0.01) | 1.05 (0.05) | [missing] | 1.05 (0.05) | 0.31 (0.01) | 1.07 (0.06) | 0.32 (0.01) | 1.06 (0.05) |
| BW91 | 0.30 (0.01) | 1.01 (0.05) | [missing] | 0.98 (0.04) | 0.30 (0.01) | 1.00 (0.05) | [missing] | 0.98 (0.04) |
¹ These traits were average daily gain (ADG), average daily feed intake (ADFI), residual feed intake (RFI), feed conversion ratio (FCR), carcass weight (CW), breast muscle weight (BMW), eviscerated weight with giblets (EWG), eviscerated weight (EW), drumstick weight (DW), abdominal fat weight (AFW), abdominal fat percentage (AFP), gizzard weight (GW), intestine length (IL), body weight in 45 days (BW45), body weight in 49 days (BW49), body weight in 56 days (BW56), body weight in 63 days (BW63), body weight in 70 days (BW70), body weight in 77 days (BW77), body weight in 84 days (BW84), and body weight in 91 days (BW91).
² 600K array: the 600K Affymetrix® Axiom® HD genotyping array.
³ WGS (internal): the imputed whole-genome sequencing data obtained from the internal reference panel.
⁴ WGS (external): the imputed whole-genome sequencing data obtained from the external reference panel.
⁵ WGS (combined): the imputed whole-genome sequencing data obtained from the combined reference panel.
⁶ Accuracy: the Pearson correlation between the predicted genetic values (ĝ) and the observed phenotypes (p) corrected for fixed effects. Means and standard errors (in parentheses) of the correlations are shown in the table. Bold font was used to represent values that were higher than others.
⁷ Regression: the regression of the predicted genetic values (ĝ) and the observed phenotypes (p) corrected for fixed effects. Means and standard errors (in parentheses) of the regression coefficients are shown in the table.
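The two quantities reported in the table can be computed as in the following minimal sketch. The direction of the regression is an assumption: the slope is taken here as corrected phenotypes regressed on predicted genetic values, the usual convention in which a value near 1 indicates little bias. The function name `accuracy_and_regression` and the toy vectors are hypothetical.

```python
import numpy as np

def accuracy_and_regression(g_hat, p_corrected):
    """Return (r, b) for predicted genetic values and corrected phenotypes.

    r: Pearson correlation between g_hat and p_corrected.
    b: slope of p_corrected regressed on g_hat (assumed direction).
    """
    g = np.asarray(g_hat, dtype=float)
    p = np.asarray(p_corrected, dtype=float)
    r = np.corrcoef(g, p)[0, 1]
    b = np.cov(p, g, ddof=1)[0, 1] / np.var(g, ddof=1)
    return r, b

# Toy validation set (assumed values, not from the study).
g_hat = np.array([0.10, -0.30, 0.50, 0.20, -0.40])
p_corrected = np.array([0.30, -0.20, 0.40, 0.00, -0.60])
r, b = accuracy_and_regression(g_hat, p_corrected)
```

Repeating this over cross-validation replicates and summarizing the resulting values is one way to obtain means and standard errors; the source does not state how its replicates were formed.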
Figure 2. Impact of linkage disequilibrium (LD)-based marker pruning on the predictive ability of imputed whole-genome sequencing (WGS) and chip data. Different R-squared cutoffs of LD (0.99, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, and 0.1) were used to prune markers of imputed WGS and chip data. The predictive ability was assessed by the Pearson correlation between the predicted genetic values and the observed phenotypes corrected for fixed effects per trait. These traits were average daily gain (ADG), average daily feed intake (ADFI), residual feed intake (RFI), feed conversion ratio (FCR), carcass weight (CW), breast muscle weight (BMW), eviscerated weight with giblets (EWG), eviscerated weight (EW), drumstick weight (DW), abdominal fat weight (AFW), abdominal fat percentage (AFP), gizzard weight (GW), intestine length (IL), body weight in 45 days (BW45), body weight in 49 days (BW49), body weight in 56 days (BW56), body weight in 63 days (BW63), body weight in 70 days (BW70), body weight in 77 days (BW77), body weight in 84 days (BW84), and body weight in 91 days (BW91).
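LD-based pruning with an R-squared cutoff can be sketched as a simple greedy pass over the markers. This illustrates the general idea only and is not the procedure used in the study, whose software and window settings are not given here. Production tools such as PLINK (`--indep-pairwise`) work in sliding windows, whereas this sketch compares all pairs and therefore suits only small marker sets. The function name `ld_prune` and the toy genotypes are hypothetical.

```python
import numpy as np

def ld_prune(geno, r2_cutoff):
    """Greedy LD pruning on a (n_individuals, n_snps) dosage matrix.

    A SNP is kept only if its squared correlation with every SNP kept
    so far is below r2_cutoff. Monomorphic SNPs are skipped because
    their correlation is undefined. Returns the kept column indices.
    """
    x = np.asarray(geno, dtype=float)
    polymorphic = np.flatnonzero(x.std(axis=0) > 0)
    if polymorphic.size == 0:
        return []
    r2 = np.atleast_2d(np.corrcoef(x[:, polymorphic], rowvar=False) ** 2)
    kept = []
    for j in range(polymorphic.size):
        if all(r2[j, k] < r2_cutoff for k in kept):
            kept.append(j)
    return [int(polymorphic[j]) for j in kept]

# Toy data (assumed): SNP 1 duplicates SNP 0; SNP 2 has r^2 = 0.25 with SNP 0.
geno = np.array([[0, 0, 2], [1, 1, 0], [2, 2, 1],
                 [0, 0, 1], [1, 1, 2], [2, 2, 0]])
kept_loose = ld_prune(geno, 0.5)   # duplicate removed, SNP 2 retained
kept_strict = ld_prune(geno, 0.2)  # SNP 2 also removed
```

Lowering the cutoff from 0.99 toward 0.1, as in the figure, removes progressively more markers.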
Figure 3. Impact of pre-selected variants on the predictive ability of GBLUP using imputed WGS data. Different p-value cutoffs from 2 to 5 (presumably on the -log10 scale) were used to select variants from imputed WGS data based on GWAS results for GBLUP. The red line is the prediction accuracy of GBLUP with all markers of imputed WGS data. The predictive ability was assessed by the Pearson correlation between the predicted genetic values and the observed phenotypes corrected for fixed effects per trait. These traits were average daily gain (ADG), average daily feed intake (ADFI), residual feed intake (RFI), feed conversion ratio (FCR), carcass weight (CW), breast muscle weight (BMW), eviscerated weight with giblets (EWG), eviscerated weight (EW), drumstick weight (DW), abdominal fat weight (AFW), abdominal fat percentage (AFP), gizzard weight (GW), intestine length (IL), body weight in 45 days (BW45), body weight in 49 days (BW49), body weight in 56 days (BW56), body weight in 63 days (BW63), body weight in 70 days (BW70), body weight in 77 days (BW77), body weight in 84 days (BW84), and body weight in 91 days (BW91).
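Pre-selecting variants from GWAS results can be sketched as a threshold on the association p-values. The sketch assumes that the cutoffs of 2 to 5 are -log10(p) values, since raw p-values cannot exceed 1; that reading is an interpretation, not something the caption states. The function name `select_by_gwas` and the toy p-values are hypothetical.

```python
import numpy as np

def select_by_gwas(p_values, neg_log10_cutoff):
    """Split variant indices into (selected, remaining) by -log10(p).

    A variant is selected when -log10(p) >= neg_log10_cutoff
    (assumed interpretation of the cutoff scale).
    """
    score = -np.log10(np.asarray(p_values, dtype=float))
    selected = np.flatnonzero(score >= neg_log10_cutoff)
    remaining = np.flatnonzero(score < neg_log10_cutoff)
    return selected, remaining

# Toy GWAS p-values for five variants (assumed values).
p_values = np.array([0.5, 1e-3, 2e-6, 0.04, 5e-3])
selected_2, remaining_2 = select_by_gwas(p_values, 2.0)
selected_5, remaining_5 = select_by_gwas(p_values, 5.0)
```

A stricter cutoff keeps fewer variants, so the genomic relationship matrix is built from a smaller marker set.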
Figure 4. Impact of the linkage disequilibrium (LD)-based marker pruning of selected or remaining variants on prediction accuracy. An LD R-squared cutoff of 0.1 was fixed to prune the selected (or remaining) variants; the remaining (or selected) variants were pruned with different LD R-squared cutoffs (0.99, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, and 0.1), and the two sets were then merged for GBLUP. The red line is the prediction accuracy of GBLUP with all markers of imputed WGS data. The predictive ability was assessed by the Pearson correlation between the predicted genetic values and the observed phenotypes corrected for fixed effects per trait. These traits were average daily gain (ADG), intestine length (IL), breast muscle weight (BMW), residual feed intake (RFI), body weight in 91 days (BW91), and average daily feed intake (ADFI).
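The GBLUP step applied to each marker set can be sketched as follows. Several choices are assumptions, not details from the study: the genomic relationship matrix follows VanRaden's first method, the heritability `h2` is treated as known instead of being estimated, and the overall mean is approximated by the training-set mean. The function names and the simulated data are hypothetical.

```python
import numpy as np

def vanraden_g(geno):
    """Genomic relationship matrix from a (n_individuals, n_snps) dosage matrix."""
    m = np.asarray(geno, dtype=float)
    p = m.mean(axis=0) / 2.0                    # allele frequencies
    z = m - 2.0 * p                             # centered genotypes
    return z @ z.T / (2.0 * np.sum(p * (1.0 - p)))

def gblup_predict(G, y, train_idx, test_idx, h2):
    """Predict genetic values of test individuals from training phenotypes."""
    lam = (1.0 - h2) / h2                       # residual-to-genetic variance ratio
    y_train = y[train_idx]
    mu = y_train.mean()                         # simple mean as the fixed effect
    G_tt = G[np.ix_(train_idx, train_idx)]
    G_vt = G[np.ix_(test_idx, train_idx)]
    alpha = np.linalg.solve(G_tt + lam * np.eye(len(train_idx)), y_train - mu)
    return G_vt @ alpha

# Simulated data (assumed): 20 individuals, 50 SNPs, random phenotypes.
rng = np.random.default_rng(0)
geno = rng.integers(0, 3, size=(20, 50))
y = rng.normal(size=20)
G = vanraden_g(geno)
train_idx, test_idx = np.arange(15), np.arange(15, 20)
g_hat_test = gblup_predict(G, y, train_idx, test_idx, h2=0.3)
```

Building `G` from a merged set of pruned selected and pruned remaining variants, instead of from all markers, changes only the input to `vanraden_g`.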