Grum Gebreyesus, Mogens S Lund, Bart Buitenhuis, Henk Bovenhuis, Nina A Poulsen, Luc G Janss.
Abstract
BACKGROUND: Accurate genomic prediction requires a large reference population, which is problematic for traits that are expensive to measure. Traits related to milk protein composition are not routinely recorded due to costly procedures and are considered to be controlled by a few quantitative trait loci of large effect. The amount of variation explained may vary between genomic regions, leading to heterogeneous (co)variance patterns across the genome. Genomic prediction models that can efficiently take such heterogeneity of (co)variances into account can result in improved prediction reliability. In this study, we developed and implemented novel univariate and bivariate Bayesian prediction models, based on estimates of heterogeneous (co)variances for genome segments (BayesAS). Available data consisted of milk protein composition traits measured on cows and de-regressed proofs of total protein yield derived for bulls. Single-nucleotide polymorphisms (SNPs), from 50K SNP arrays, were grouped into non-overlapping genome segments. A segment was defined as one SNP, or a group of 50, 100, or 200 adjacent SNPs, or one chromosome, or the whole genome. Traditional univariate and bivariate genomic best linear unbiased prediction (GBLUP) models were also run for comparison. Reliabilities were calculated through a resampling strategy and using a deterministic formula.
Year: 2017 PMID: 29207947 PMCID: PMC5718071 DOI: 10.1186/s12711-017-0364-8
Source DB: PubMed Journal: Genet Sel Evol ISSN: 0999-193X Impact factor: 4.297
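The segment definitions described in the abstract (one SNP, 50, 100, or 200 adjacent SNPs, one chromosome, or the whole genome) can be illustrated with a minimal sketch. The function name `assign_segments` and its arguments are hypothetical, and the choice to restart fixed-size segments at each chromosome boundary is an assumption; the abstract does not say how chromosome boundaries were handled.

```python
from itertools import groupby


def assign_segments(chromosomes, size):
    """Return one segment id per SNP.

    chromosomes: per-SNP chromosome labels, in map order.
    size: an int (number of adjacent SNPs per segment), "chromosome",
          or "genome".
    Assumption: fixed-size segments restart at each chromosome, so the
    last segment on a chromosome may hold fewer than `size` SNPs.
    """
    if size == "genome":
        # The whole genome is a single segment.
        return [0] * len(chromosomes)

    segment_ids = []
    next_id = 0
    for _, group in groupby(chromosomes):
        n_snps = len(list(group))
        if size == "chromosome":
            segment_ids.extend([next_id] * n_snps)
            next_id += 1
        else:
            # Consecutive blocks of `size` SNPs share one id.
            segment_ids.extend(next_id + i // size for i in range(n_snps))
            next_id += -(-n_snps // size)  # ceiling division
    return segment_ids


# Toy map: five SNPs on chromosome 1, three on chromosome 2.
toy_map = [1] * 5 + [2] * 3
print(assign_segments(toy_map, 2))
print(assign_segments(toy_map, "chromosome"))
```

With `size=1` every SNP becomes its own segment, which corresponds to the single-SNP setting listed in the abstract.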
Heritability estimates and genome-wide correlations and covariances with total milk protein yield
| Traitᵃ | h² | SE | Covariance | SE | Correlation | SE |
|---|---|---|---|---|---|---|
| | 0.14 | 0.07 | 0.01 | 0.05 | 0.04 | 0.16 |
| | 0.14 | 0.09 | −0.02 | 0.05 | −0.07 | 0.16 |
| | 0.33 | 0.09 | −0.08 | 0.06 | −0.16 | 0.12 |
| | 0.69 | 0.09 | 0.06 | 0.05 | 0.09 | 0.07 |
| G-κ-CN | 0.41 | 0.09 | 0.0008 | 0.04 | 0.0006 | 0.10 |
| | 0.15 | 0.09 | 0.05 | 0.05 | 0.15 | 0.16 |
| | 0.52 | 0.10 | 0.04 | 0.05 | 0.07 | 0.09 |
| Protein % | 0.54 | 0.09 | −0.08 | 0.06 | −0.14 | 0.10 |

Heritability (h²) estimates are from the univariate GBLUP analysis; covariances and correlations are from the bivariate GBLUP model.

ᵃ Protein composition expressed as a fraction of the total milk protein percentage by weight (wt/wt), protein % expressed as percentage of the total milk yield; individual proteins comprise only the peaks identified as intact proteins and isoforms, i.e., αS1-CN (comprises αS1-CN 8P + 9P), αS2-CN (comprises αS2-CN 11P + 12P), κ-CN (comprises κ-CN G 1P + unglycosylated κ-CN 1P), where P = phosphorylated serine group. G-κ-CN = glycosylated κ-CN; αS1-CN-8P = αS1-CN with 8 phosphorylated serine groups
Prediction reliability from univariate and bivariate GBLUP models
| Traitᵃ | ST-GBLUP | MT-GBLUP |
|---|---|---|
| | 0.11 | 0.10 |
| | 0.03 | 0.03 |
| | 0.03 | 0.06 |
| | 0.16 | 0.16 |
| G-κ-CN | 0.14 | 0.14 |
| | 0.12 | 0.11 |
| | 0.21 | 0.21 |
| Protein % | 0.10 | 0.12 |

ᵃ Protein composition expressed as a fraction of the total milk protein percentage by weight (wt/wt), protein % expressed as percentage of the total milk yield; individual proteins comprise only the peaks identified as intact proteins and isoforms, i.e., αS1-CN (comprises αS1-CN 8P + 9P), αS2-CN (comprises αS2-CN 11P + 12P), κ-CN (comprises κ-CN G 1P + unglycosylated κ-CN 1P), where P = phosphorylated serine group. G-κ-CN = glycosylated κ-CN; αS1-CN-8P = αS1-CN with 8 phosphorylated serine groups
Fig. 1 Proportion of genomic variance explained by each chromosome. Proportion of the genomic variance in the milk protein composition traits explained by each chromosome from the ST-BayesAS model taking chromosomes as segments
Fig. 2 Covariance between each protein composition trait and total protein yield explained by 100-SNP genomic segments
Fig. 3 Prediction reliability across MT-BayesAS models. Reliability of models according to segment sizes of 1, 50, 100, and 200 SNPs, chromosome, and whole genome. G-κ-CN = glycosylated κ-CN; αS1-CN-8P = αS1-CN with eight phosphorylated serine groups
Prediction reliability from univariate and bivariate BayesAS models
| Traitᵃ | BayesAS-1SNP MT | BayesAS-1SNP ST | BayesAS-100SNP MT | BayesAS-100SNP ST | BayesAS-Genome MT | BayesAS-Genome ST |
|---|---|---|---|---|---|---|
| | 0.10 | 0.09 | 0.13 | 0.09 | 0.10 | 0.09 |
| | 0.04 | 0.02 | 0.06 | 0.03 | 0.03 | 0.03 |
| | 0.03 | 0.03 | 0.18 | 0.16 | 0.03 | 0.03 |
| | 0.38 | 0.37 | 0.68 | 0.63 | 0.16 | 0.16 |
| G-κ-CN | 0.41 | 0.39 | 0.76 | 0.70 | 0.13 | 0.14 |
| | 0.11 | 0.09 | 0.14 | 0.14 | 0.11 | 0.11 |
| | 0.39 | 0.39 | 0.52 | 0.50 | 0.21 | 0.19 |
| Protein % | 0.14 | 0.14 | 0.18 | 0.17 | 0.12 | 0.11 |

ᵃ Protein composition expressed as a fraction of the total milk protein percentage by weight (wt/wt), protein % expressed as percentage of the total milk yield; individual proteins comprise only the peaks identified as intact proteins and isoforms, i.e., αS1-CN (comprises αS1-CN 8P + 9P), αS2-CN (comprises αS2-CN 11P + 12P), κ-CN (comprises κ-CN G 1P + unglycosylated κ-CN 1P), where P = phosphorylated serine group. G-κ-CN = glycosylated κ-CN; αS1-CN-8P = αS1-CN with 8 phosphorylated serine groups
Model reliability for bulls across the MT-BayesAS models
| Traitᵃ | 1 SNP | 50 SNPs | 100 SNPs | 200 SNPs | Chromosome | Genome |
|---|---|---|---|---|---|---|
| | 0.05 | 0.06 | 0.04 | 0.06 | 0.05 | 0.06 |
| | 0.06 | 0.06 | 0.06 | 0.06 | 0.06 | 0.07 |
| | 0.12 | 0.32 | 0.32 | 0.26 | 0.21 | 0.14 |
| | 0.56 | 0.71 | 0.71 | 0.68 | 0.56 | 0.21 |
| G-κ-CN | 0.42 | 0.56 | 0.56 | 0.54 | 0.39 | 0.15 |
| | 0.07 | 0.07 | 0.08 | 0.08 | 0.08 | 0.06 |
| | 0.37 | 0.50 | 0.51 | 0.49 | 0.27 | 0.19 |
| Protein % | 0.23 | 0.22 | 0.22 | 0.21 | 0.19 | 0.18 |

Columns give the MT-BayesAS model reliability for each segment size.

ᵃ Protein composition expressed as a fraction of the total milk protein percentage by weight (wt/wt), protein % expressed as percentage of the total milk yield; individual proteins comprise only the peaks identified as intact proteins and isoforms, i.e., αS1-CN (comprises αS1-CN 8P + 9P), αS2-CN (comprises αS2-CN 11P + 12P), κ-CN (comprises κ-CN G 1P + unglycosylated κ-CN 1P), where P = phosphorylated serine group. G-κ-CN = glycosylated κ-CN; αS1-CN-8P = αS1-CN with 8 phosphorylated serine groups
Fig. 4 Reliability of prediction using various proportions of genomic segments. Predictions were based on post-Gibbs analyses of samples from the MT-100-BayesA model. Segments were ranked based on explained covariance separately for each training set
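The ranking step mentioned in the Fig. 4 caption could be sketched as follows. This is only an illustration: the function name `top_segments` is hypothetical, and ranking by the absolute value of the explained covariance is an assumption, since the caption does not state how negative covariances were treated.

```python
import math


def top_segments(explained_cov, proportion):
    """Return indices of the top-ranked segments.

    explained_cov: per-segment explained covariance for one training set.
    proportion: fraction of segments to keep, in (0, 1].
    Assumption: segments are ranked by absolute explained covariance.
    """
    ranked = sorted(
        range(len(explained_cov)),
        key=lambda s: abs(explained_cov[s]),
        reverse=True,
    )
    # Keep at least one segment, rounding the count up.
    n_keep = max(1, math.ceil(proportion * len(ranked)))
    return ranked[:n_keep]


# Toy values for four segments (not taken from the study).
toy_cov = [0.01, -0.30, 0.05, 0.20]
print(top_segments(toy_cov, 0.5))
```

Because the caption says the ranking was done separately for each training set, a function like this would be called once per training set rather than once on the full data.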