Liuhong Chen, Changxi Li, Stephen Miller, Flavio Schenkel.
Abstract
BACKGROUND: Genomic prediction in multiple populations can be viewed as a multi-task learning problem in which each task is to derive the prediction equation for one population, and learning performance can be improved by sharing information across populations. The goal of this study was to develop a multi-task Bayesian learning model for multi-population genomic prediction, with a strategy to share information effectively across populations. Simulation studies and real data from Holstein and Ayrshire dairy breeds, with phenotypes on five milk production traits, were used to evaluate the proposed multi-task Bayesian learning model and to compare it with a single-task model and a simple data pooling method.
Year: 2014 PMID: 24884927 PMCID: PMC4024655 DOI: 10.1186/1471-2156-15-53
Source DB: PubMed Journal: BMC Genet ISSN: 1471-2156 Impact factor: 2.797
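
The two baselines named in the abstract, a single-task model fitted per population and simple data pooling, can be sketched as follows. This is only an illustration under stated assumptions: ridge regression stands in for the actual prediction models, and the genotypes, marker effects, sample sizes and the `ridge_effects` helper are all hypothetical. The multi-task Bayesian model itself is not shown, because the abstract does not give its form.

```python
import numpy as np

def ridge_effects(X, y, lam):
    """Ridge estimate of marker effects, (X'X + lam*I)^-1 X'y, on centred data."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    p = X.shape[1]
    return np.linalg.solve(Xc.T @ Xc + lam * np.eye(p), Xc.T @ yc)

rng = np.random.default_rng(0)
p = 50                       # number of markers (hypothetical)
n_small, n_large = 40, 400   # one small and one large population (hypothetical)

# Hypothetical marker effects: a shared part plus a population-specific part.
shared = rng.normal(size=p)
beta_a = shared + 0.5 * rng.normal(size=p)
beta_b = shared + 0.5 * rng.normal(size=p)

# Simulated genotypes coded 0/1/2 and phenotypes with unit residual variance.
X_a = rng.binomial(2, 0.5, size=(n_small, p)).astype(float)
X_b = rng.binomial(2, 0.5, size=(n_large, p)).astype(float)
y_a = X_a @ beta_a + rng.normal(size=n_small)
y_b = X_b @ beta_b + rng.normal(size=n_large)

# Single-task: each population gets its own prediction equation.
single_a = ridge_effects(X_a, y_a, lam=10.0)
single_b = ridge_effects(X_b, y_b, lam=10.0)

# Simple pooling: stack both populations and fit one shared equation.
pooled = ridge_effects(np.vstack([X_a, X_b]),
                       np.concatenate([y_a, y_b]), lam=10.0)
```

Single-task fitting ignores the other population entirely, while pooling forces identical marker effects on both. A multi-task model sits between these two extremes by sharing information without assuming the effects are the same.
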
Number of animals used for genomic prediction
| | | |
|---|---|---|
| Training set | 393 | 2084 |
| Validation set | 65 | 214 |
| Total | 458 | 2298 |
Accuracy {expressed as correlations between true breeding values (TBV) and genomic estimated breeding values [GEBV; r(TBV, GEBV)]}, and slopes [b(TBV, GEBV)] of regression of TBV on GEBV for genomic prediction with 20 simulated QTL
| | ||||||
|---|---|---|---|---|---|---|
| r(TBV, GEBV) | ||||||
| ρ = 0.2 | ||||||
| 800 k | 0.71 ± 0.02 | 0.56 ± 0.04 | 0.75 ± 0.02 | 0.91 ± 0.01 | 0.90 ± 0.01 | 0.91 ± 0.01 |
| 400 k | 0.64 ± 0.04 | 0.53 ± 0.04 | 0.73 ± 0.03 | 0.90 ± 0.01 | 0.86 ± 0.01 | 0.90 ± 0.01 |
| 200 k | 0.60 ± 0.05 | 0.50 ± 0.04 | 0.68 ± 0.02 | 0.88 ± 0.01 | 0.84 ± 0.01 | 0.88 ± 0.01 |
| 100 k | 0.57 ± 0.04 | 0.47 ± 0.04 | 0.63 ± 0.03 | 0.84 ± 0.01 | 0.81 ± 0.02 | 0.84 ± 0.01 |
| ρ = 0.8 | ||||||
| 800 k | 0.67 ± 0.05 | 0.76 ± 0.02 | 0.83 ± 0.01 | 0.92 ± 0.01 | 0.92 ± 0.01 | 0.93 ± 0.01 |
| 400 k | 0.66 ± 0.05 | 0.72 ± 0.02 | 0.80 ± 0.01 | 0.90 ± 0.01 | 0.89 ± 0.01 | 0.90 ± 0.01 |
| 200 k | 0.66 ± 0.04 | 0.68 ± 0.02 | 0.76 ± 0.01 | 0.86 ± 0.01 | 0.86 ± 0.01 | 0.86 ± 0.01 |
| 100 k | 0.61 ± 0.04 | 0.63 ± 0.04 | 0.72 ± 0.02 | 0.83 ± 0.01 | 0.83 ± 0.01 | 0.84 ± 0.01 |
| b(TBV, GEBV) | ||||||
| ρ = 0.2 | ||||||
| 800 k | 1.06 ± 0.05 | 0.73 ± 0.05 | 1.04 ± 0.05 | 0.98 ± 0.02 | 1.01 ± 0.01 | 0.98 ± 0.01 |
| 400 k | 1.06 ± 0.05 | 0.76 ± 0.06 | 1.06 ± 0.05 | 0.99 ± 0.02 | 1.00 ± 0.01 | 0.98 ± 0.01 |
| 200 k | 1.02 ± 0.06 | 0.75 ± 0.05 | 1.03 ± 0.04 | 1.00 ± 0.01 | 0.99 ± 0.01 | 0.98 ± 0.01 |
| 100 k | 1.00 ± 0.06 | 0.75 ± 0.05 | 1.03 ± 0.06 | 1.00 ± 0.01 | 1.00 ± 0.02 | 0.99 ± 0.01 |
| ρ = 0.8 | ||||||
| 800 k | 1.10 ± 0.06 | 0.90 ± 0.03 | 1.10 ± 0.04 | 0.99 ± 0.02 | 1.00 ± 0.02 | 0.99 ± 0.02 |
| 400 k | 1.09 ± 0.06 | 0.89 ± 0.04 | 1.07 ± 0.04 | 0.99 ± 0.02 | 0.99 ± 0.01 | 0.99 ± 0.02 |
| 200 k | 1.14 ± 0.06 | 0.89 ± 0.06 | 1.04 ± 0.05 | 0.98 ± 0.02 | 1.00 ± 0.02 | 0.98 ± 0.03 |
| 100 k | 1.08 ± 0.08 | 0.85 ± 0.06 | 1.06 ± 0.05 | 0.98 ± 0.03 | 0.99 ± 0.03 | 0.98 ± 0.03 |
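
The two statistics reported in these tables, the accuracy r(TBV, GEBV) and the slope b(TBV, GEBV) of the regression of TBV on GEBV, can be computed as in the minimal sketch below. The function name `accuracy_and_slope` and the toy values are assumptions for illustration, not code or data from the study.

```python
import numpy as np

def accuracy_and_slope(tbv, gebv):
    """Return (r, b): the Pearson correlation between TBV and GEBV,
    and the slope of the regression of TBV on GEBV."""
    tbv = np.asarray(tbv, dtype=float)
    gebv = np.asarray(gebv, dtype=float)
    r = np.corrcoef(tbv, gebv)[0, 1]
    # Slope of TBV regressed on GEBV: cov(TBV, GEBV) / var(GEBV).
    cov = np.cov(tbv, gebv, ddof=1)
    b = cov[0, 1] / cov[1, 1]
    return r, b

# Toy values (not from the study): GEBV rank the animals perfectly
# but are shrunk to half the scale of the TBV.
tbv = [1.0, 2.0, 3.0, 4.0, 5.0]
gebv = [0.5, 1.0, 1.5, 2.0, 2.5]
r, b = accuracy_and_slope(tbv, gebv)
print(f"r = {r:.2f}, b = {b:.2f}")
```

In this toy case the correlation is 1 while the slope is 2, which shows that the two statistics measure different things: r reflects how well GEBV rank the animals, and b reflects whether the spread of GEBV matches that of TBV. A slope close to 1 is usually read as GEBV being neither inflated nor deflated.
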
Accuracy {expressed as correlations between true breeding values (TBV) and genomic estimated breeding values [GEBV; r(TBV, GEBV)]}, and slopes [b(TBV, GEBV)] of regression of TBV on GEBV for genomic prediction with 200 simulated QTL
| | ||||||
|---|---|---|---|---|---|---|
| r(TBV, GEBV) | ||||||
| ρ = 0.2 | ||||||
| 800 k | 0.46 ± 0.02 | 0.44 ± 0.04 | 0.47 ± 0.03 | 0.77 ± 0.01 | 0.76 ± 0.01 | 0.77 ± 0.01 |
| 400 k | 0.46 ± 0.02 | 0.43 ± 0.03 | 0.47 ± 0.02 | 0.76 ± 0.01 | 0.76 ± 0.01 | 0.76 ± 0.01 |
| 200 k | 0.46 ± 0.02 | 0.42 ± 0.04 | 0.47 ± 0.02 | 0.75 ± 0.01 | 0.75 ± 0.01 | 0.75 ± 0.01 |
| 100 k | 0.45 ± 0.02 | 0.41 ± 0.03 | 0.46 ± 0.03 | 0.74 ± 0.01 | 0.74 ± 0.01 | 0.74 ± 0.01 |
| ρ = 0.8 | ||||||
| 800 k | 0.54 ± 0.04 | 0.57 ± 0.02 | 0.56 ± 0.03 | 0.74 ± 0.02 | 0.75 ± 0.01 | 0.75 ± 0.02 |
| 400 k | 0.54 ± 0.03 | 0.56 ± 0.03 | 0.55 ± 0.03 | 0.74 ± 0.02 | 0.74 ± 0.02 | 0.74 ± 0.02 |
| 200 k | 0.54 ± 0.03 | 0.56 ± 0.02 | 0.55 ± 0.03 | 0.73 ± 0.02 | 0.73 ± 0.02 | 0.73 ± 0.02 |
| 100 k | 0.53 ± 0.03 | 0.52 ± 0.02 | 0.54 ± 0.03 | 0.72 ± 0.02 | 0.72 ± 0.02 | 0.72 ± 0.02 |
| b(TBV, GEBV) | ||||||
| ρ = 0.2 | ||||||
| 800 k | 1.14 ± 0.08 | 0.89 ± 0.09 | 1.16 ± 0.09 | 1.07 ± 0.01 | 1.09 ± 0.01 | 1.07 ± 0.01 |
| 400 k | 1.14 ± 0.09 | 0.91 ± 0.08 | 1.13 ± 0.09 | 1.07 ± 0.01 | 1.09 ± 0.01 | 1.07 ± 0.01 |
| 200 k | 1.14 ± 0.09 | 0.88 ± 0.08 | 1.14 ± 0.09 | 1.08 ± 0.01 | 1.09 ± 0.01 | 1.08 ± 0.01 |
| 100 k | 1.15 ± 0.09 | 0.89 ± 0.09 | 1.15 ± 0.09 | 1.09 ± 0.02 | 1.11 ± 0.03 | 1.09 ± 0.02 |
| ρ = 0.8 | ||||||
| 800 k | 1.12 ± 0.11 | 1.00 ± 0.06 | 1.07 ± 0.09 | 1.01 ± 0.02 | 1.00 ± 0.02 | 1.01 ± 0.02 |
| 400 k | 1.11 ± 0.11 | 1.02 ± 0.07 | 1.08 ± 0.09 | 1.01 ± 0.02 | 1.00 ± 0.02 | 1.01 ± 0.02 |
| 200 k | 1.12 ± 0.11 | 1.04 ± 0.07 | 1.11 ± 0.09 | 1.01 ± 0.02 | 1.01 ± 0.02 | 1.01 ± 0.02 |
| 100 k | 1.11 ± 0.11 | 0.99 ± 0.07 | 1.10 ± 0.10 | 1.01 ± 0.02 | 1.01 ± 0.02 | 1.01 ± 0.02 |
Accuracy {expressed as correlations between true breeding values (TBV) and genomic estimated breeding values [GEBV; r(TBV, GEBV)]}, and slopes [b(TBV, GEBV)] of regression of TBV on GEBV for genomic prediction with simulated QTL genotypes included for training and validation
| No. of QTL | ρ | | | | | | |
|---|---|---|---|---|---|---|---|
| r(TBV, GEBV) | |||||||
| 20 | 0.2 | 0.76 ± 0.04 | 0.64 ± 0.04 | 0.92 ± 0.01 | 0.96 ± 0.01 | 0.93 ± 0.01 | 0.97 ± 0.01 |
| 20 | 0.8 | 0.71 ± 0.05 | 0.88 ± 0.02 | 0.93 ± 0.01 | 0.97 ± 0.01 | 0.97 ± 0.01 | 0.97 ± 0.01 |
| 200 | 0.2 | 0.57 ± 0.03 | 0.53 ± 0.03 | 0.60 ± 0.04 | 0.77 ± 0.01 | 0.76 ± 0.01 | 0.78 ± 0.01 |
| 200 | 0.8 | 0.54 ± 0.04 | 0.65 ± 0.02 | 0.61 ± 0.04 | 0.76 ± 0.02 | 0.78 ± 0.02 | 0.77 ± 0.01 |
| b(TBV, GEBV) | |||||||
| 20 | 0.2 | 1.01 ± 0.06 | 0.83 ± 0.08 | 0.99 ± 0.03 | 1.00 ± 0.01 | 1.00 ± 0.02 | 1.00 ± 0.01 |
| 20 | 0.8 | 1.04 ± 0.08 | 0.89 ± 0.06 | 0.97 ± 0.03 | 0.98 ± 0.01 | 1.01 ± 0.02 | 0.99 ± 0.01 |
| 200 | 0.2 | 1.23 ± 0.10 | 0.89 ± 0.06 | 1.16 ± 0.08 | 1.00 ± 0.03 | 1.03 ± 0.03 | 1.01 ± 0.03 |
| 200 | 0.8 | 1.10 ± 0.09 | 1.06 ± 0.06 | 1.12 ± 0.09 | 0.98 ± 0.03 | 0.97 ± 0.03 | 0.98 ± 0.02 |
Accuracy of genomic prediction of breeding values for milk production traits
| | ||||||
|---|---|---|---|---|---|---|
| No. of SNP: 28,206 | ||||||
| Milk yield | 0.52 | 0.44 | 0.54 | 0.66 | 0.65 | 0.66 |
| Fat yield | 0.64 | 0.55 | 0.66 | 0.63 | 0.64 | 0.63 |
| Protein yield | 0.70 | 0.60 | 0.70 | 0.69 | 0.69 | 0.68 |
| Fat % | 0.66 | 0.58 | 0.72 | 0.74 | 0.74 | 0.74 |
| Protein % | 0.48 | 0.51 | 0.55 | 0.67 | 0.68 | 0.67 |
| No. of SNP: 246,668 | ||||||
| Milk yield | 0.54 | 0.53 | 0.55 | 0.64 | 0.64 | 0.64 |
| Fat yield | 0.67 | 0.62 | 0.67 | 0.63 | 0.64 | 0.63 |
| Protein yield | 0.72 | 0.68 | 0.72 | 0.66 | 0.66 | 0.66 |
| Fat % | 0.66 | 0.65 | 0.69 | 0.77 | 0.77 | 0.78 |
| Protein % | 0.51 | 0.42 | 0.53 | 0.71 | 0.68 | 0.70 |