Literature DB >> 29373954

Genomic selection models for directional dominance: an example for litter size in pigs.

Luis Varona^1,2, Andrés Legarra³, William Herring⁴, Zulma G Vitezica⁵.

Abstract

BACKGROUND: The quantitative genetics theory argues that inbreeding depression and heterosis are founded on the existence of directional dominance. However, most procedures for genomic selection that have included dominance effects assumed prior symmetrical distributions. To address this, two alternatives can be considered: (1) assume the mean of dominance effects different from zero, and (2) use skewed distributions for the regularization of dominance effects. The aim of this study was to compare these approaches using two pig datasets and to confirm the presence of directional dominance.
RESULTS: Four alternative models were implemented in two datasets of pig litter size that consisted of 13,449 and 11,581 records from 3631 and 2612 sows genotyped with the Illumina PorcineSNP60 BeadChip. The models evaluated included (1) a model that does not consider directional dominance (Model SN), (2) a model with a covariate b for the average individual homozygosity (Model SC), (3) a model with a parameter λ that reflects asymmetry in the context of skewed Gaussian distributions (Model AN), and (4) a model that includes both b and λ (Model Full). The results of the analysis showed that posterior probabilities of a negative b or a positive λ under Models SC and AN were higher than 0.99, which indicate positive directional dominance. This was confirmed with the predictions of inbreeding depression under Models Full, SC and AN, that were higher than in the SN Model. In spite of differences in posterior estimates of variance components between models, comparison of models based on LogCPO and DIC indicated that Model SC provided the best fit for the two datasets analyzed.
CONCLUSIONS: Our results confirmed the presence of positive directional dominance for pig litter size and suggested that it should be taken into account when dominance effects are included in genomic evaluation procedures. The consequences of ignoring directional dominance may affect predictions of breeding values and can lead to biased prediction of inbreeding depression and performance of potential mates. A model that assumes Gaussian dominance effects that are centered on a non-zero mean is recommended, at least for datasets with similar features to those analyzed here.

Entities: Chemical Disease Species

Mesh：

Year: 2018 PMID： 29373954 PMCID： PMC5787328 DOI： 10.1186/s12711-018-0374-1

Source DB: PubMed Journal: Genet Sel Evol ISSN： 0999-193X Impact factor: 4.297

Background

Since the availability of dense genotyping panels [1], genomic prediction [2, 3] has become a very successful strategy for the prediction of breeding values of candidates for selection. Genomic prediction methods are based on the evaluation of the additive substitution effects of markers that capture a large part of the dominance and higher-order interaction effects [4]. However, estimating dominance effects may be relevant because their estimates can be used to allocate mates among candidates for selection [5, 6]. Two approaches have been suggested to estimate the effects of dominance in genomic prediction methods. The first [7] directly models the additive (a) and dominance (d) effects, while for the second, Vitezica et al. [8] proposed to include allele substitution (α) and dominance deviation (δ) effects in order to compute appropriate breeding values. However, both these approaches impose a Gaussian regularization of additive and dominance effects that forces a symmetric distribution of the posterior estimates. Nevertheless, the classical theory of quantitative genetics [9] argues that inbreeding depression and heterosis are based on the presence of directional dominance (i.e., a higher percentage of positive than negative dominance effects) and this contrasts with the assumption of symmetry of the above-described procedures. This discrepancy can be overcome in at least two ways: (1) by assuming that the mean of dominance effects differ from zero, which leads to the inclusion of a covariate for the average individual homozygosity in the statistical model, and (2) by using skewed distributions for the regularization of dominance effects. The first approach can be called regression on genomic inbreeding and it was empirically used by Sun et al. [6], Silió et al. [10], Aliloo et al. [11] and Zeng et al. [12], and proved by Xiang et al. [13]. As far as we are aware, the second approach was never applied in the field of animal genetics, although it may be of considerable value since it ensures that the most frequent dominance effects are close to zero. In contrast, regression on genomic inbreeding implies that the mean and mode of the dominance effects are equal, although they may differ from zero. In the statistical literature, there is a broad corpus on the specification of skewed distributions [14, 15] and, among them, the family of skew-elliptical distributions defined by Sahu et al. [16] can be easily implemented in Bayesian regression using Markov chain Monte Carlo (MCMC) techniques [17]. The objectives of this study were: (1) to develop a genomic best linear unbiased prediction (BLUP) model that uses a prior skewed distribution for dominance effects; (2) to compare it with the model with inclusion of a covariate for inbreeding proposed by Xiang et al. [13]; and (3) to confirm the presence of directional dominance for pig litter size.

Methods

Data

The data used in this study were from two unrelated pig lines provided by Genus plc (Hendersonville, TN, USA). Genotypes for all sows were generated using the Illumina PorcineSNP60 BeadChip (Illumina, San Diego). After quality control, i.e. excluding genotypes from single nucleotide polymorphisms (SNPs) with a minor allele frequency lower than 0.05 and a call rate lower than 0.95 in each population, 37,900 and 37,011 genotypes for SNPs remained for lines 1 and 2, respectively. Individuals with a call rate lower than 0.95 were also removed. Finally, the number of sows included in the analysis were 3631 and 2612 for lines 1 and 2, respectively. In total, 13,449 and 11,581 records on litter size (number of piglets born alive) were available for these sows, with an average litter size of 11.7 ± 2.9 and 12.4 ± 3.0 for lines 1 and 2, respectively.

Genomic prediction models

The first step was the definition of a Full Model that included both approaches for directional dominance, i.e. regression on genomic inbreeding and a skewed distribution for SNP effects). Then, a subset of model parameters was set to 0 in order to identify reduced models. The Full Model was:where is the vector of phenotypic records, is the general mean, is a covariate that can be interpreted as inbreeding depression or heterosis, is a vector of order of parity effects (4 levels − 1st, 2nd, 3rd and > 3rd), is a vector of farm-year-month of farrowing effects (3163 levels for line 1 and 4293 for line 2), is a vector of permanent environmental effects (3631 and 2612 levels for lines 1 and 2, respectively), and are vectors of additive and dominance effects (37,900 and 37,011 levels), and is a vector of residuals. Furthermore, is a vector of the average SNP homozygosity of the individuals and , , , and are incidence matrices that link the phenotypic records with , , , and , respectively. Under the Bayesian paradigm, prior distributions were uniform for , and for each element of , univariate Gaussian for each element of , and , and skew Gaussian for each element of . Finally, prior distributions for the variances of farm-year-month of farrowing (), permanent environmental (), additive (), dominance (), and residual effects () were scaled inverted Chi square (see “Appendix” for a full description of the Bayesian inference). It should be noted that directional dominance comprises two model parameters, one covariate for the average SNP homozygosity () and one asymmetry parameter () that is involved in the skew Gaussian prior distribution of dominance effects. Based on the Full Model, three reduced models were defined as follows: Model SC: symmetric dominance effect with the inbreeding depression covariate, i.e. the asymmetry parameter () was set to zero. This model was equivalent to that defined by Xiang et al. [13]. Model AN: asymmetric dominance effect without the inbreeding depression covariate, i.e. covariate was set to zero. Model SN: symmetric dominance effect without the inbreeding depression covariate, i.e. both asymmetry parameter and covariate were set to zero. This model was equivalent to that defined by Su et al. [5]. The four models (Full, SC, AN, SN) were analyzed using a Gibbs sampler [18] (see “Appendix” for a full description) that provided posterior distributions for all unknowns in the model, i.e. individual breeding values () and dominance deviations (), additive and dominance variances ( and ), and the expected inbreeding depression per percentage of inbreeding (). Models SC, AN and SN were analyzed by five chains of 75,000 iterations, after discarding the first 25,000. Each chain used a different random seed. As the convergence of the Full Model was clearly the worst, the Gibbs sampler implementation for this model was set to five chains of 250,000 iterations, after a burn-in of the first 50,000. Convergence and effective sample size were checked using the standard procedures [19] with CODA package [20] and by visual inspection of the chains. Finally, models were compared using the deviance information criteria (DIC) [21] and the logarithm of the conditional predictive ordinate (LogCPO) [22] (see “Appendix” for a full description).

Results

The results of the convergence and the effective size of the MCMC chains are presented in Additional file 1: Tables S1 and S2. The average number of iterations required until convergence was computed using the Raftery and Lewis approach [23] and ranged from 93.0 (for parameter in the SC Model for line 1) to 9204.0 ( in the Full Model for line 1). The estimated effective sample size (EFS) of the MCMC chains [24] ranged from 82 ( in the SN Model for line 2) to 16,510 ( in the Full Model for line 1). Finally, the required numbers of samples to achieve an accuracy of 0.1 for the 0.5 quantile with a probability of 0.95 were calculated using the Raftery and Lewis approach [23] and ranged from 3210 ( in the SC Model for line 1) to 290,280 ( in the Full Model for line 1). Posterior mean estimates (and posterior deviations) of variance components, asymmetry parameters, and expected inbreeding depression and results of the comparison of models are in Tables 1 and 2 for lines 1 and 2, respectively. Posterior estimates of the variance of the additive effects () under Model SN were equal to 0.394 × 10−4 and 0.678 × 10−4 for lines 1 and 2, respectively. Compared with the SN Model, these estimates were slightly lower for the AN Model (0.345 × 10−4 and 0.617 × 10−4), those for the SC Model were moderately higher (0.439 × 10−4 and 0.701 × 10−4) (Tables 1 and 2) and those for the Full Model were similar (0.381 × 10−4 and 0.615 × 10−4).

Table 1

	Model
	SN	AN	SC	Full
\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$b$$\end{document}b	–	–	− 12.153 (1.746)	− 7.950 (7.527)
\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\sigma_{a}^{2}$$\end{document}σa2 (× 10⁻⁴)	0.394 (0.062)	0.345 (0.061)	0.439 (0.065)	0.381 (0.067)
\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\sigma_{d}^{2}$$\end{document}σd2 (× 10⁻⁴)	0.369 (0.101)	0.769 (0.108)	0.122 (0.095)	0.536 (0.236)
\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\sigma_{r}^{2}$$\end{document}σr2	0.160 (0.043)	0.161 (0.043)	0.160 (0.043)	0.161 (0.043)
\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\sigma_{c}^{2}$$\end{document}σc2	0.478 (0.089)	0.308 (0.084)	0.572 (0.089)	0.394 (0.116)
\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\lambda_{d}$$\end{document}λd (× 10⁻³)	–	0.380 (0.078)	–	0.135 (0.244)
\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\sigma_{e}^{2}$$\end{document}σe2	6.569 (0.099)	6.570 (0.099)	6.567 (0.099)	6.568 (0.098)
\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${\text{I}}_{\text{D}}$$\end{document}ID	− 0.016 (0.005)	− 0.044 (0.006)	− 0.045 (0.006)	− 0.045 (0.006)
\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${\text{V}}_{\text{A}}$$\end{document}VA	0.679 (0.101)	0.862 (0.1174)	0.832 (0.110)	0.859 (0.158)
\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${\text{V}}_{\text{D}}$$\end{document}VD	0.597 (0.165)	1.326 (0.211)	0.415 (0.165)	1.013 (0.343)
\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${\text{h}}^{2}$$\end{document}h2	0.080 (0.011)	0.093 (0.015)	0.097 (0.011)	0.095 (0.015)
\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${\text{d}}^{2}$$\end{document}d2	0.070 (0.018)	0.143 (0.019)	0.048 (0.018)	0.111 (0.034)
LogCPO	− 32,508.88	− 32,513.61	− 32,498.72	− 32,517.83
DIC	64,939.52	64,948.21	64,920.97	64,947.08

is the covariate with individual homozygosity, and are the variance of the additive and dominance SNP effects, is the variance of the permanent environmental effects, is the variance of the farm-year-month effects, is the asymmetry parameters for the dominance effects, is the residual variance, is the inbreeding depression per percentage of inbreeding, and are the additive and dominance variance, and are the heritability and the ratio of dominance variance, LogCPO is the logarithm of the conditional predictive ordinate and DIC is the deviance information criterion

Table 2

	Model
	SN	AN	SC	Full
\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$b$$\end{document}b	–	–	− 6.479 (2.289)	1.726 (5.845)
\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\sigma_{a}^{2}$$\end{document}σa2 (× 10⁻⁴)	0.678 (0.091)	0.617 (0.092)	0.701 (0.095)	0.615 (0.096)
\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\sigma_{d}^{2}$$\end{document}σd2 (× 10⁻⁴)	0.430 (0.170)	0.872 (0.169)	0.334 (0.154)	0.993 (0.322)
\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\sigma_{r}^{2}$$\end{document}σr2	0.299 (0.060)	0.296 (0.062)	0.299 (0.061)	0.297 (0.060)
\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\sigma_{c}^{2}$$\end{document}σc2	0.580 (0.123)	0.380 (0.114)	0.614 (0.118)	0.333 (0.148)
\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\lambda_{d}$$\end{document}λd (× 10⁻³)	–	0.249 (0.096)	–	0.307 (0.209)
\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\sigma_{e}^{2}$$\end{document}σe2	6.630 (0.109)	6.635 (0.110)	6.631 (0.109)	6.635 (0.108)
\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${\text{I}}_{\text{D}}$$\end{document}ID	− 0.008 (0.005)	− 0.028 (0.008)	− 0.025 (0.008)	− 0.029 (0.008)
\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${\text{V}}_{\text{A}}$$\end{document}VA	1.100 (0.125)	1.170 (0.171)	1.152 (0.135)	1.198 (0.174)
\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${\text{V}}_{\text{D}}$$\end{document}VD	0.669 (0.263)	1.377 (0.274)	0.574 (0.241)	1.537 (0.464)
\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${\text{h}}^{2}$$\end{document}h2	0.119 (0.013)	0.118 (0.015)	0.124 (0.013)	0.120 (0.015)
\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${\text{d}}^{2}$$\end{document}d2	0.072 (0.027)	0.139 (0.024)	0.061 (0.024)	0.152 (0.040)
LogCPO	− 28,176.11	− 28,176.62	− 28,174. 84	− 28,180.68
DIC	56,250.06	56,251.39	56,247.95	56,258.66

Posterior mean (and posterior standard deviation) estimates for variance components, asymmetry parameters, inbreeding depression, ratios of additive and dominance variation and criteria for model comparison for line 1 is the covariate with individual homozygosity, and are the variance of the additive and dominance SNP effects, is the variance of the permanent environmental effects, is the variance of the farm-year-month effects, is the asymmetry parameters for the dominance effects, is the residual variance, is the inbreeding depression per percentage of inbreeding, and are the additive and dominance variance, and are the heritability and the ratio of dominance variance, LogCPO is the logarithm of the conditional predictive ordinate and DIC is the deviance information criterion Posterior mean (and posterior standard deviation) estimates for variance components, asymmetry parameters, inbreeding depression, ratios of additive and dominance variation and criteria for model comparison for line 2 is the covariate with individual homozygosity, and are the variance of the additive and dominance SNP effects, is the variance of the permanent environmental effects, is the variance of the farm-year-month effects, is the asymmetry parameters for the dominance effects, is the residual variance, is the inbreeding depression per percentage of inbreeding, and are the additive and dominance variance, and are the heritability and the ratio of dominance variance, LogCPO is the logarithm of the conditional predictive ordinate and DIC is the deviance information criterion A different pattern was observed for the variance of dominance effects (), with posterior mean estimates being equal to 0.369 × 10−4 and 0.430 × 10−4 for lines 1 and 2, respectively, for Model SN (Tables 1 and 2). Models that allowed for asymmetry of dominance effects (AN and Full) provided higher posterior mean estimates of the variance of dominance effects (0.769 × 10−4 and 0.536 × 10−4 for line 1 and 0.872 × 10−4 and 0.993 × 10−4 for line 2, respectively) than the SC Model (0.122 × 10−4 and 0.334 × 10−4 for lines 1 and 2, respectively). Because of the above results, estimates of additive genetic variance () and narrow sense heritability () were higher for Models AN and Full than for Models SN and SC (Tables 1 and 2). In contrast, posterior mean estimates of the variance of dominance deviations () and percentage of dominance variation () were lower for Models SN and SC than for Models AN and Full (Tables 1 and 2). Estimates of the variance of farm-year month effects () and of residuals () were consistent between models, ranging from 0.160 to 0.161 for line 1 and from 0.296 to 0.299 for line 2 for the farm-year-month variance and from 6.567 to 6.570 and from 6.630 to 6.635 for the residual variance for lines 1 and 2, respectively. However, the estimates of the variance of permanent environmental effects () differed substantially between models (Tables 1 and 2), with posterior mean estimates for the SN and SC Models being the highest (0.478 and 0.572 for line 1 and 0.580 and 0.614 for line 2, respectively) and decreasing when asymmetry was allowed, reaching the lowest estimates for Models AN (0.308 and 0.380 for lines 1 and 2, respectively) and Full (0.394 and 0.333). Posterior mean estimates of the asymmetry parameter for dominance effects () were all positive (Tables 1 and 2 and Fig. 1) and ranged from 0.135 (line 1 and Model Full) to 0.380 (line 1 and Model AN). However, it should also be noted that posterior probabilities of a positive value for were higher than 0.999 for Model AN, while the highest posterior density regions at 95% (HPD95) for included zero for the Full Model for both lines.

Fig. 1

Posterior distribution of the asymmetry parameter (λ) under Models AN and Full for lines 1 and 2

Posterior distribution of the asymmetry parameter (λ) under Models AN and Full for lines 1 and 2 The regression coefficient on individual homozygosity () was estimated with Models SC and Full (Tables 1 and 2 and Fig. 2). With the SC Model, posterior mean estimates of were clearly negative (− 12.15 and − 7.95 for lines 1 and 2, respectively), but equal to − 5.72 and 1.73 for lines 1 and 2 for the Full Model. It should also be noted that posterior standard deviations were higher for the Full than for the SC Model. The HPD95 regions for included zero for the Full Model, but posterior probabilities of negative values were always higher than 0.99 for Model SC.

Fig. 2

Posterior distribution of the covariate for individual homozygosity (b) under Models SC and Full for lines 1 and 2

Posterior distribution of the covariate for individual homozygosity (b) under Models SC and Full for lines 1 and 2 Results for the expected inbreeding depression () per percentage of inbreeding are in Tables 1 and 2 and Fig. 3. Posterior mean (and posterior standard deviation) estimates of for the SN Model were − 0.016 (0.005) piglets for line 1 and − 0.008 (0.005) piglets for line 2. However, posterior mean (and posterior standard deviation) estimates for remaining Models (AN, SC and Full) were remarkably lower, being − 0.044 (0.006), − 0.045 (0.006), and − 0.045 (0.006) for line 1 and − 0.028 (0.008), − 0.025 (0.008) and − 0.029 (0.008) for line 2.

Fig. 3

Posterior distribution of the expected inbreeding depression for an inbreeding level of 0.10 for lines 1 and 2

Posterior distribution of the expected inbreeding depression for an inbreeding level of 0.10 for lines 1 and 2 Correlations of estimates for the SNP additive () and dominance () effects and for breeding values () and dominance deviations () between the four models of analysis are in Additional file 2: Tables S3 and S4. Correlations of estimates of the additive and dominance effects between models were always higher than 0.990 and correlations of estimates of breeding values and dominance deviations between models were also close to 1. However, it should be noted that the correlations between the estimated breeding values from the SN Model and the dominance deviations from the AN Model with the estimates from the remaining models were remarkably lower than those from the other models. In the first case, they ranged from 0.933 to 0.944 in line 1 and from 0.794 to 0.842 in line 2 and in the second case, from 0.769 to 0.944 in line 1 and from 0.702 to 0.857 in line 2. Results of the model comparison tests (logCPO and DIC) are also in Tables 1 and 2. In both lines, the model with the best fit for both tests was the SC Model, followed by the SN and AN Models. The Full Model had the worst fit.

Discussion

The advent of dense genotyping information has allowed the development of models for genomic evaluation [3] that have revolutionized the field of animal breeding during the last decade. Most models for genomic evaluation are designed to deal with the classical statistical problem of large and small , because the number of parameters to evaluate is frequently larger than the number of phenotypic data. The most common approach for dealing with this problem is the use of some kind of regularization of the effects of SNPs [25]. Several approaches have been suggested, ranging from simple Gaussian regularization [2] to more complex models that involve t-shaped [2], double exponential [26, 27], or mixtures of distributions [2, 28, 29]. However, all these methods of regularization use symmetric distributions that, from a Bayesian perspective, imply that marker effects are centered at zero. This assumption seems reasonable for the additive or substitution effects, but it is not so clear for dominance effects. In fact, the classical theory of quantitative genetics attributes the phenomenon of inbreeding depression (or heterosis) to the presence of directional dominance or, in other words, a positive average of dominance effects, jointly with a decrease (or increase) in the degree of heterozygosity [9]. In this study, we considered two approaches to model directional dominance in genomic evaluation methods. The first assumed a prior distribution for dominance effects that allowed a mean that was different from 0, i.e. Model SC, following the work of Xiang et al. [13]; the second assumed that dominance effects followed a skew Gaussian distribution that has a higher probability of positive (or negative) effects, i.e. Model AN. Finally, both approaches were combined into a Full Model. All models were implemented using a Gibbs sampler. The analysis of the MCMC chains indicated that convergence was achieved with the proposed burn-in for all models and both lines (25,000 iterations for AN, SC and SN Models and 50,000 for the Full Model). Nevertheless, the EFS was heterogeneous across parameters and models. In general, the EFS of the variance of dominance effects was smaller than that of the variance of additive effects, and the EFS of the parameters related with directional dominance ( and ) were very large for the AN and SN Models and remarkably smaller for the Full Model. Nevertheless, the sizes of the five Gibbs sampler chains (5 × 75,000 iterations for AN, SN and SC Models and 5 × 200,000 for the Full Model) were always larger than the length required for estimation of the 0.5 quantile of the posterior distributions with an accuracy of 0.1 and with a probability of 0.95, based on the Raftery and Lewis approach [23].

Evidence of directional dominance

Results from Models SC and AN provided clear evidence of directional dominance for both lines (Figs. 1 and 2); posterior distributions of the regression coefficient on individual homozygosity ( in Model SC) and the asymmetry parameter ( in Model AN) did not include zero in the highest posterior density at 99%. These results confirm the presence of directional dominance for litter size in pigs and they are in line with extensive reports on positive estimates for inbreeding depression and heterosis in the literature [30, 31]. However, results from the Full Model were not so clear because it suffered from some degree of statistical confounding of and , as observed in the strong posterior correlation (0.91) between the Gibbs samples of and (see Additional file 3: Figures S1 and S2). As a consequence, their posterior distributions were wider and they included zero in the HPD at 95% for and (Figs. 1 and 2) and convergence and EFS for both these parameters were worse than with the SC and AN Models (see Additional file 1: Tables S1 and S2). Models that allow the presence of directional dominance (SC, AN, Full) were able to predict the expected inbreeding depression () in populations that had a low range of levels of genealogical inbreeding. This approximation uses the classical additive model of inbreeding depression [9] but replacing dominance effects of causal polymorphisms with dominance effects of SNPs. In this approach, a linear relationship between inbreeding and inbreeding depression is assumed. Results were presented as the expected inbreeding depression per percentage of inbreeding. In the analyzed populations, the expected inbreeding depression coefficients under these models were around − 0.045 and between − 0.025 and − 0.028 piglets in lines 1 and 2, respectively. These results concur with those of Vitezica et al. [32], who also reported larger estimates of inbreeding depression in line 1 than in line 2 and they are close to the estimates of inbreeding depression for litter size in other pig populations [33-35]. In contrast, the estimates provided by the SN Model were substantially closer to 0, i.e. − 0.016 and − 0.008 piglets for lines 1 and 2, respectively. This may indicate that models that do not allow for directional dominance, such as the SN Model, cannot predict the magnitude of inbreeding depression (or heterosis) correctly and, thus, lead to biases if they are used for the prediction of future mate performance and mate allocation [5]. Nevertheless, there were some remarkable differences between the results obtained for the two lines, which are interesting to analyze further. Evidence of directional dominance was larger for line 1 than for line 2 for both approaches (estimates of − 12.15 vs. − 6.48 for in Model SC and of 0.38 vs. 0.25 for in Model AN), although posterior estimates of the dominance variance were lower for line 1 for all models. This suggests that the magnitude of directional dominance (or inbreeding depression) is not necessarily related to the amount of dominance variance estimated from resemblance between relatives. In fact, in the presence of inbreeding, the total genetic variance is split into five components [36, 37]: the additive and dominance genetic variances in the base population, the dominance genetic variance between homozygous individuals, the covariance between additive and dominance effects between homozygous individuals, and the square of the inbreeding depression. Traditional approaches to estimate dominance variance using genealogical [38, 39] or genomic dominance relationships [32] only take the additive and dominance variance in the base population into account and ignore the remaining variance components. It is possible that the presence of directional dominance also allows some of the other variance components that are not considered under the assumption of multivariate normality to be captured. Of particular significance is the fact that the estimates of the variance of dominance effects differed substantially between models. Lower estimates were obtained with the SC Model, whereas estimates from Models SN, AN and Full were higher. The cause of the inflation of dominance effects under the last three models may be the restrictions imposed by the assumed prior distributions. Under Model SN, the prior distribution forced the mean and mode of effects to be centered at zero. Thus, if directional dominance exists, specific estimates of the effects of SNPs would attempt to accommodate this, which would lead to an increase in the variance of dominance effects. Model AN allowed the presence of more positive (or negative) dominance effects but it forced the mode of the distributions to be close to zero. Estimates of the effects of SNPs for Model AN may be even larger than for Model SN, but as the prior distribution forced them to have a mode close to zero, the variance of dominance effects was also inflated in Model AN. Furthermore, the increase in the variance of dominance effects in Models SN, AN and Full with respect to Model SC was compensated by a corresponding decrease in the permanent environmental variance, as pointed out in other studies [40, 41]. Thus, the estimate of the permanent environmental variance was the largest for Model SC for both lines. The differences between models were also reflected in the correlations of estimates of breeding values and dominance deviations between models. Although the correlations for estimates of SNP additive and dominance effects between models were very high (see Additional file 2: Tables S3 and S4), the correlations for estimates of breeding values and dominance deviations provided some exceptions. For breeding values, estimates from the model that did not consider directional dominance (Model SN) had lower correlations with estimates from the other models (SC, AN and Full). This suggests that the inclusion of directional dominance with either of the two approaches would result in substantial changes in the ranking of individuals based on estimates of breeding values, which may have consequences for breeding decisions. In addition, the correlations between estimates of dominance deviations from Model AN with those from the other models were also lower (0.70–0.94), which may imply that the use of skewed prior distributions affects estimation of dominance deviations and the prediction of performance of future individuals (or crosses).

Comparison of models

The best model based on the two criteria used for comparison of models was the SC Model, followed by the SN and AN Models; the Full Model provided the worst fit for both lines. Model SN does not consider directional dominance and thus, it was penalized relative to Model SC. Models AN and Full were equally able to capture directional dominance since they led to similar estimates of inbreeding depression. However, they were penalized because the number of unknowns in these models is larger than in Model SC; they estimate and one auxiliary variable for the dominance effect of each SNP. In the light of these results, the main finding of our study is that Model SC, as defined by Xiang et al. [13], is recommended for the analysis of traits when directional dominance (or inbreeding depression) is expected and when resulting estimates of dominance effects are used for prediction of performance of future mates and mate allocation [5]. This recommendation is strengthened by the ease with which the SC Model can be formulated based on the genomic dominance relationship matrix [8], which helps to reduce the computational burden and directly provides predictions of additive and dominance effects for each individual. However, the application of skewed distributions should not be completely discarded for new lines of research. First, we assumed that the additive and dominance effects were independent, although it is possible to use multivariate asymmetric distributions [16], as in the models of Wellman and Bennewitz [42], which consider a relationship between the magnitudes of additive and dominance effects. Second, the assumption of Gaussian distributions can be replaced by the asymmetric version of any other distribution, such as t-shape or double exponential distribution, leading to asymmetric versions of the Bayes B [3] or Bayesian Lasso [26] approaches. These approaches may avoid the large increase in the variance of dominance effects since most of the estimates of the dominance effects of SNPs will be forced to be zero [3] or closer to zero [26] than with a prior Gaussian distribution. Finally, all the approaches described here assume that directional dominance is homogeneous along the genome. However, there is evidence in the literature of local differences in the causes of inbreeding depression across the genome [43, 44]. Further research is needed to investigate this phenomenon and, also, to model additional causes of inbreeding depression (or heterosis), such as epistatic interactions [45].

Conclusions

The results of our study confirm the presence of positive directional dominance for litter size in two lines of pigs. Ignoring this in genomic evaluation models with dominance effects alters the prediction of breeding values and may cause bias in the prediction of inbreeding depression (or heterosis) and of the performance of future mates. These effects can be avoided by using two alternative models, one that includes a non-zero mean of dominance effects and another that uses skewed prior distributions for them, with the latter providing a better fit. Thus, this approach should be recommended for modeling dominance effects, at least for datasets that have similar features as those analyzed here.

Additional files

Convergence and effective sample size of the Gibbs sampler. These tables include the results of the required burn-in and required sample size and the estimates of the effective sample size of the Gibbs sampler for Models SN, SC, AN and Full in lines 1 (Table S1) and 2 (Table S2). Correlations between estimates from Models SN, AC, AN and Full. These tables present the correlations between estimates of SNP additive (a) and dominance (b) effects, individual breeding values (c) and dominance deviations (d) with Models SN, SC, AN and Full in lines 1 (Table S3) and 2 (Table S4). Plots of the Gibbs sampler chains for the Full Model. These figures include the plots of the five Gibbs sampler chains for the covariate with individual heterozygosity (b) and the asymmetry parameter (λ) and bivariate plots of the Gibbs samples for the Full Model in lines 1 (Figure S1) and 2 (Figure S2).

28 in total

1. Predicting quantitative traits with regression models for dense molecular markers and pedigree.

Authors: Gustavo de los Campos; Hugo Naya; Daniel Gianola; José Crossa; Andrés Legarra; Eduardo Manfredi; Kent Weigel; José Miguel Cotes
Journal: Genetics Date: 2009-03-16 Impact factor: 4.562

2. The contribution of dominance and inbreeding depression in estimating variance components for litter size in Pannon White rabbits.

Authors: I Nagy; G Gorjanc; I Curik; J Farkas; H Kiszlinger; Zs Szendrő
Journal: J Anim Breed Genet Date: 2012-12-28 Impact factor: 2.380

3. Genetic evaluation methods for populations with dominance and inbreeding.

Authors: I J de Boer; I Hoeschele
Journal: Theor Appl Genet Date: 1993-04 Impact factor: 5.699

4. Bayesian analysis of quantitative traits using skewed distributions.

Authors: L Varona; N Ibañez-Escriche; R Quintanilla; J L Noguera; J Casellas
Journal: Genet Res (Camb) Date: 2008-04 Impact factor: 1.588

5. On the additive and dominant variance and covariance of individuals within the genomic selection scope.

Authors: Zulma G Vitezica; Luis Varona; Andres Legarra
Journal: Genetics Date: 2013-10-11 Impact factor: 4.562

6. The contribution of dominance to the understanding of quantitative genetic variation.

Authors: Robin Wellmann; Jörn Bennewitz
Journal: Genet Res (Camb) Date: 2011-04-12 Impact factor: 1.588

7. Estimation of dominance variance in purebred Yorkshire swine.

Authors: M S Culbertson; J W Mabry; I Misztal; N Gengler; J K Bertrand; L Varona
Journal: J Anim Sci Date: 1998-02 Impact factor: 3.159

8. Detecting inbreeding depression for reproductive traits in Iberian pigs using genome-wide data.

Authors: María Saura; Almudena Fernández; Luis Varona; Ana I Fernández; Maria Ángeles R de Cara; Carmen Barragán; Beatriz Villanueva
Journal: Genet Sel Evol Date: 2015-01-17 Impact factor: 4.297

9. Improvement of prediction ability for genomic selection of dairy cattle by including dominance effects.

Authors: Chuanyu Sun; Paul M VanRaden; John B Cole; Jeffrey R O'Connell
Journal: PLoS One Date: 2014-08-01 Impact factor: 3.240

10. Genomic BLUP including additive and dominant variation in purebreds and F1 crossbreds, with an application in pigs.

Authors: Zulma G Vitezica; Luis Varona; Jean-Michel Elsen; Ignacy Misztal; William Herring; Andrès Legarra
Journal: Genet Sel Evol Date: 2016-01-29 Impact factor: 4.297

5 in total

1. Inbreeding depression load for litter size in Entrepelado and Retinto Iberian pig varieties1.

Authors: Joaquim Casellas; Noelia Ibáñez-Escriche; Luis Varona; Juan P Rosas; Jose L Noguera
Journal: J Anim Sci Date: 2019-04-29 Impact factor: 3.159

2. Improving genomic predictions with inbreeding and nonadditive effects in two admixed maize hybrid populations in single and multienvironment contexts.

Authors: Morgane Roth; Aurélien Beugnot; Tristan Mary-Huard; Laurence Moreau; Alain Charcosset; Julie B Fiévet
Journal: Genetics Date: 2022-04-04 Impact factor: 4.402

3. Estimate of inbreeding depression on growth and reproductive traits in a Large White pig population.

Authors: Yu Zhang; Yue Zhuo; Chao Ning; Lei Zhou; Jian-Feng Liu
Journal: G3 (Bethesda) Date: 2022-07-06 Impact factor: 3.542

Review 4. Heterosis and Hybrid Crop Breeding: A Multidisciplinary Review.

Authors: Marlee R Labroo; Anthony J Studer; Jessica E Rutkoski
Journal: Front Genet Date: 2021-02-24 Impact factor: 4.599

5. Optimizing Genomic-Enabled Prediction in Small-Scale Maize Hybrid Breeding Programs: A Roadmap Review.

Authors: Roberto Fritsche-Neto; Giovanni Galli; Karina Lima Reis Borges; Germano Costa-Neto; Filipe Couto Alves; Felipe Sabadin; Danilo Hottis Lyra; Pedro Patric Pinho Morais; Luciano Rogério Braatz de Andrade; Italo Granato; Jose Crossa
Journal: Front Plant Sci Date: 2021-07-01 Impact factor: 5.753

5 in total