
A fast and efficient Gibbs sampler for BayesB in whole-genome analyses.

Hao Cheng, Long Qu, Dorian J Garrick, Rohan L Fernando.

Abstract

BACKGROUND: In whole-genome analyses, the number p of marker covariates is often much larger than the number n of observations. Bayesian multiple regression models are widely used in genomic selection to address this problem of p ≫ n. The primary difference between these models is the prior assumed for the effects of the covariates. Usually in the BayesB method, a Metropolis-Hastings (MH) algorithm is used to jointly sample the marker effect and the locus-specific variance, which may make BayesB computationally intensive. In this paper, we show how the Gibbs sampler without the MH algorithm can be used for the BayesB method.
RESULTS: We consider three different versions of the Gibbs sampler to sample the marker effect and locus-specific variance for each locus. Among the Gibbs samplers that were considered, the most efficient sampler is about 2.1 times as efficient as the MH algorithm proposed by Meuwissen et al. and 1.7 times as efficient as that proposed by Habier et al.
CONCLUSIONS: The three Gibbs samplers presented here were twice as efficient as Metropolis-Hastings samplers and gave virtually the same results.

Year:  2015        PMID: 26467850      PMCID: PMC4606519          DOI: 10.1186/s12711-015-0157-x

Source DB:  PubMed          Journal:  Genet Sel Evol        ISSN: 0999-193X            Impact factor:   4.297


Background

In whole-genome analyses, the number p of marker covariates is often much larger than the number n of observations. Bayesian multiple regression models are widely used in genomic selection to address this problem of p ≫ n. The primary difference between these models is the prior assumed for the effects of the covariates. These priors and their effects on inference have been recently reviewed by Gianola [1]. In most Bayesian analyses of whole-genome data, inferences are based on Markov chains constructed to have a stationary distribution equal to the posterior distribution of the unknown parameters of interest [2]. This is often done by employing a Gibbs sampler, where samples are drawn from the full-conditional distributions of the parameters [3]. It can be shown that in BayesA, introduced by Meuwissen et al. [4], the prior for each marker effect follows a scaled t distribution [5]. However, when the prior for the marker effect is specified as a t distribution, its full-conditional is not of a known form. Fortunately, this prior can also be specified as a normally distributed marker effect conditional on a locus-specific variance, which is given a scaled inverted chi-square distribution. When marginalized over the variance, this gives a t distribution for the marker effect [5]. Thus, the posterior for the marker effect would be identical under both these priors. The second form of the prior, however, is more convenient because it results in the full-conditional for the marker effect having a normal distribution. BayesA is a special case of BayesB, also introduced by Meuwissen et al. [4], where the prior for each marker effect follows a mixture distribution with a point mass at zero with probability π and a univariate-t distribution with probability 1 − π [5]. When π = 0, BayesB becomes BayesA. When the marker effect is non-null, as in BayesA, the second form of the prior leads to the full-conditional of the marker effect being normal. Nevertheless, Meuwissen et al. [4] used a Metropolis–Hastings (MH) algorithm to jointly sample the marker effect and the locus-specific variance because they argued that “the Gibbs sampler will not move through the entire sampling space” for BayesB. In their MH algorithm, they use the prior distribution of the locus-specific variance as the proposal distribution. When π is high, the proposed values for the marker effect will be zero with high probability. Thus, for each locus, 100 cycles of the MH algorithm were used in their paper, which makes BayesB computationally intensive. Habier et al. [6] used an alternative proposal, where the marker effect was zero with probability 0.5, which leads to a more efficient MH algorithm. For each locus, five cycles of MH were used to sample marker effects in this efficient MH method. In this paper, we will show how Gibbs samplers without the MH algorithm can be used for the BayesB method. Recall that by introducing a locus-specific variance into BayesA, the full-conditional for the marker effect becomes normal. Similarly, in this paper we show that by introducing a variable δ_j in BayesB, indicating whether the marker effect for locus j is zero or non-zero, the marker effect and locus-specific variance can be sampled using Gibbs. We consider three different versions of the Gibbs sampler to sample each marker effect, its locus-specific variance and its indicator variable δ_j. The objectives of this paper are to introduce these samplers and study their performance.

Methods

Statistical methods

BayesB introduced by Meuwissen et al. [4] assumes that each locus-specific variance follows a mixture distribution. However, following Gianola [5], we prefer to specify the mixture at the level of the marker effect instead of the locus-specific variance. In this formulation, the prior for the marker effect α_j is a mixture with a point mass at zero and a univariate normal distribution conditional on σ²_j:

α_j = 0 with probability π, and α_j | σ²_j ~ N(0, σ²_j) with probability 1 − π,

where σ²_j follows a scaled inverted chi-square distribution and π is treated as known. Employing the concept of data augmentation, it is convenient to write the marker effect as α_j = δ_j β_j, where we introduce a Bernoulli variable δ_j with probability of success 1 − π and a normally distributed variable β_j with mean zero and variance σ²_j, which has a scaled inverted chi-square distribution. As shown below, Gibbs sampling can be used to draw samples for these unknowns.
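As a quick illustration, the augmented prior α_j = δ_j β_j can be sampled directly. The following is a minimal sketch in Python/NumPy; the hyper-parameter values (π, ν, S²) are purely illustrative assumptions, not values from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
pi = 0.95            # probability that the marker effect is zero (illustrative)
nu, S2 = 4.0, 0.01   # scaled inverse chi-square hyper-parameters (illustrative)

n_draws = 100_000
# sigma2_j ~ scaled inverse chi-square(nu, S2), i.e. nu * S2 / chi2(nu)
sigma2 = nu * S2 / rng.chisquare(nu, size=n_draws)
delta = rng.binomial(1, 1.0 - pi, size=n_draws)  # Bernoulli(1 - pi)
beta = rng.normal(0.0, np.sqrt(sigma2))          # beta_j | sigma2_j ~ N(0, sigma2_j)
alpha = delta * beta                             # alpha_j = delta_j * beta_j

print(round(float(np.mean(alpha == 0.0)), 2))    # fraction of zero effects: 0.95
```

Marginally over δ_j and β_j, the draws of α_j reproduce the mixture: a point mass at zero with probability π and, given σ²_j, a normal component otherwise.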

Gibbs samplers for BayesB

Here, we present three Gibbs samplers for BayesB. The first is a single-site Gibbs sampler, where all parameters are sampled from their full conditional distributions. The second is a joint Gibbs sampler, where (β_j, δ_j) are sampled from their joint full-conditional distribution, because β_j and δ_j are highly dependent. Carlin and Chib [7] have shown that the prior used for parameters that are not in the model does not affect the Bayes factor. Thus, this prior, which they call a pseudo prior, can be chosen to improve mixing of the sampler. Following Carlin and Chib, the third sampler is a Gibbs sampler where a pseudo prior is used for β_j when δ_j is zero. Godsill [8] has shown that the marginal posterior for a parameter in the model does not depend on the choice of pseudo priors. It has been suggested to choose the full conditional distribution of β_j when it is in the model as the pseudo prior [7, 8]. This choice is justified by showing that using this pseudo prior is equivalent to sampling δ_j from its marginal full-conditional distribution [8]. However, in BayesB, use of the exact full conditional distribution as the pseudo prior would require MH to sample σ²_e and σ²_j. Thus, in this paper, a distribution close to the full conditional is used.

BayesB model with data augmentation

y_i = μ + Σ_{j=1}^{k} x_ij β_j δ_j + e_i,   (1)

where y_i is the phenotype for individual i, μ is the overall mean, k is the number of SNPs, x_ij is the genotype covariate at locus j for animal i (coded as 0, 1, 2), β_j is the allele substitution effect for locus j, δ_j is an indicator variable and e_i is the random residual effect for individual i.

Priors

The prior for μ is a constant. The prior for e_i is N(0, σ²_e), and σ²_e has a scaled inverted chi-square prior with scale parameter S²_e and degrees of freedom ν_e. The prior for δ_j is Bernoulli with probability of success 1 − π. The prior for β_j is N(0, σ²_j). The prior for σ²_j is a scaled inverted chi-square distribution with scale parameter S²_β and degrees of freedom ν_β.
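To make the model and priors concrete, the sketch below simulates a small data set from model (1); the sizes and hyper-parameter values are illustrative assumptions, not those used in the paper:

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 200, 500                     # individuals, SNPs (illustrative sizes)
pi, sigma2_e, mu = 0.95, 1.0, 2.5   # illustrative values
nu_b, S2_b = 4.0, 0.01              # scaled inverse chi-square hyper-parameters

X = rng.integers(0, 3, size=(n, k)).astype(float)  # genotypes coded 0, 1, 2
X -= X.mean(axis=0)                                # center each column

sigma2_j = nu_b * S2_b / rng.chisquare(nu_b, size=k)  # locus-specific variances
delta = rng.binomial(1, 1 - pi, size=k)               # indicator variables
beta = rng.normal(0.0, np.sqrt(sigma2_j))             # beta_j | sigma2_j
e = rng.normal(0.0, np.sqrt(sigma2_e), size=n)        # residuals
y = mu + X @ (delta * beta) + e                       # model (1)
```

Centering the genotype columns matters later: it makes each x_j orthogonal to the column vector of ones, which is used in the pseudo-prior sampler.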

Single-site Gibbs sampler

The full conditional distributions of μ, σ²_e and σ²_j are well known [4, 9, 10]. Thus, they are presented here without derivations. The full conditional of μ is a normal distribution with mean Σ_i(y_i − Σ_j x_ij β_j δ_j)/n and variance σ²_e/n, where n is the number of individuals. The full conditional distributions of σ²_e and σ²_j are both scaled inverted chi-square distributions; for σ²_e, the scale parameter is (e′e + ν_e S²_e)/(n + ν_e) and the degrees of freedom parameter is n + ν_e; for σ²_j, the scale parameter is (β²_j + ν_β S²_β)/(1 + ν_β) and the degrees of freedom parameter is 1 + ν_β, where e = y − 1μ − Σ_j x_j β_j δ_j. Next, we derive the full conditional distributions of β_j and δ_j. These full conditional distributions are proportional to the joint distribution of all parameters and y, which can be written as

f(y | μ, β, δ, σ²_e) [Π_j f(β_j | σ²_j) f(σ²_j) Pr(δ_j | π)] f(σ²_e).   (2)

The full conditional distribution of β_j is now obtained by dropping factors that do not involve β_j, which gives

f(β_j | ELSE) ∝ exp[−(w_j − x_j β_j δ_j)′(w_j − x_j β_j δ_j)/(2σ²_e)] exp[−β²_j/(2σ²_j)],

where w_j = y − 1μ − Σ_{l≠j} x_l β_l δ_l. When δ_j = 1,

f(β_j | ELSE) ∝ exp[−(x_j′x_j + σ²_e/σ²_j)(β_j − β̂_j)²/(2σ²_e)],   (3)

where β̂_j = x_j′w_j/(x_j′x_j + σ²_e/σ²_j). Now, (3) can be recognized as the kernel of a normal distribution with mean β̂_j and variance σ²_e/(x_j′x_j + σ²_e/σ²_j). When δ_j = 0,

f(β_j | ELSE) ∝ exp[−w_j′w_j/(2σ²_e)] exp[−β²_j/(2σ²_j)],

and dropping the factor exp[−w_j′w_j/(2σ²_e)], which is free of β_j, gives

f(β_j | ELSE) ∝ exp[−β²_j/(2σ²_j)],

which is the kernel of a normal distribution with null mean and variance σ²_j. Thus,

β_j | ELSE ~ N(β̂_j, σ²_e/(x_j′x_j + σ²_e/σ²_j)) when δ_j = 1, and β_j | ELSE ~ N(0, σ²_j) when δ_j = 0,

where ELSE stands for all the other parameters and y. This means that when δ_j = 1, the sampling of β_j is identical to that in BayesA; when δ_j = 0, β_j is sampled from its prior. This is different from the original implementation of BayesB introduced by Meuwissen et al. [4]. Similarly, the full conditional distribution of δ_j can be obtained from (2) by dropping all factors free of δ_j, which gives

Pr(δ_j | ELSE) ∝ f(w_j | β_j, δ_j, σ²_e) Pr(δ_j | π).

Thus,

Pr(δ_j = 1 | ELSE) = (1 − π) f(w_j | β_j, δ_j = 1, σ²_e) / [(1 − π) f(w_j | β_j, δ_j = 1, σ²_e) + π f(w_j | δ_j = 0, σ²_e)],

where w_j = y − 1μ − Σ_{l≠j} x_l β_l δ_l.
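A single-site update for one locus can be sketched as follows. The inputs (w_j, the variances and π) are illustrative, not the paper's data; β_j is drawn given δ_j, then δ_j given the new β_j:

```python
import numpy as np

rng = np.random.default_rng(2)

# Illustrative inputs for one locus j: w_j is y adjusted for the mean
# and for all loci other than j.
n = 100
x_j = rng.integers(0, 3, n).astype(float)
x_j -= x_j.mean()                      # centered genotype covariate
w_j = rng.normal(0.0, 1.0, n)
pi, sigma2_e, sigma2_j = 0.95, 1.0, 0.01
delta_j = 1                            # current state of the indicator

# Step 1: sample beta_j | delta_j.
xx = x_j @ x_j
if delta_j == 1:                       # normal with mean x'w/c and variance s2e/c
    c = xx + sigma2_e / sigma2_j
    beta_j = rng.normal(x_j @ w_j / c, np.sqrt(sigma2_e / c))
else:                                  # delta_j = 0: sample from the prior
    beta_j = rng.normal(0.0, np.sqrt(sigma2_j))

# Step 2: sample delta_j | beta_j from its Bernoulli full conditional.
log_l1 = -0.5 * np.sum((w_j - x_j * beta_j) ** 2) / sigma2_e  # delta_j = 1
log_l0 = -0.5 * np.sum(w_j ** 2) / sigma2_e                   # delta_j = 0
m = max(log_l0, log_l1)                # guard against underflow
p1 = (1.0 - pi) * np.exp(log_l1 - m)
p0 = pi * np.exp(log_l0 - m)
delta_j = int(rng.binomial(1, p1 / (p0 + p1)))
```

In a full sampler this pair of draws is repeated over all k loci within each iteration, with w_j updated as each locus changes.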

Joint Gibbs sampler

The same priors as in the single-site Gibbs sampler are used here. The only difference is that (β_j, δ_j) are sampled from their joint full conditional distribution, which can be written as the product of the full conditional distribution of β_j given δ_j and the marginal full conditional distribution of δ_j:

f(β_j, δ_j | ELSE−) = f(β_j | δ_j, ELSE−) Pr(δ_j | ELSE−),   (4)

where ELSE− stands for all the other parameters except β_j, together with y. Thus, δ_j is first sampled from Pr(δ_j | ELSE−). Then β_j is sampled from f(β_j | δ_j, ELSE−), which is identical to the sampling of β_j in BayesB with the single-site Gibbs sampler. The marginal full conditional for δ_j can be written as:

Pr(δ_j | ELSE−) ∝ f(w_j | δ_j, σ²_j, σ²_e) Pr(δ_j | π),   (5)

where w_j = y − 1μ − Σ_{l≠j} x_l β_l δ_l. Now, f(w_j | δ_j, σ²_j, σ²_e) is a multivariate normal distribution with null mean and variance x_j x_j′ σ²_j δ_j + I σ²_e. When δ_j = 1, it becomes a multivariate normal with null mean and variance x_j x_j′ σ²_j + I σ²_e; when δ_j = 0, it becomes a multivariate normal with null mean and variance I σ²_e. Thus, samples can be drawn using

Pr(δ_j = 1 | ELSE−) = (1 − π) f(w_j | δ_j = 1, σ²_j, σ²_e) / [(1 − π) f(w_j | δ_j = 1, σ²_j, σ²_e) + π f(w_j | δ_j = 0, σ²_e)].   (6)

However, evaluating the multivariate normal density is computationally intense. An efficient way is to use the univariate distribution of x_j′w_j, which contains all the information from w_j about δ_j, instead of the distribution of w_j, which is multivariate. Thus, (5) can be written as

Pr(δ_j | ELSE−) ∝ f(x_j′w_j | δ_j, σ²_j, σ²_e) Pr(δ_j | π),

where f(x_j′w_j | δ_j = 1, σ²_j, σ²_e) is a univariate normal distribution with null mean and variance (x_j′x_j)² σ²_j + x_j′x_j σ²_e, and f(x_j′w_j | δ_j = 0, σ²_e) is a univariate normal distribution with null mean and variance x_j′x_j σ²_e.
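The marginal draw of δ_j using the univariate statistic x_j′w_j can be sketched as follows (illustrative inputs, not the paper's data set):

```python
import numpy as np

rng = np.random.default_rng(3)

# Illustrative inputs for one locus j.
n = 100
x_j = rng.integers(0, 3, n).astype(float)
x_j -= x_j.mean()
w_j = rng.normal(0.0, 1.0, n)
pi, sigma2_e, sigma2_j = 0.95, 1.0, 0.01

xx = x_j @ x_j
t = x_j @ w_j                      # univariate statistic x_j' w_j

def log_norm0(t, v):
    """Log density of N(0, v) evaluated at t."""
    return -0.5 * (np.log(2.0 * np.pi * v) + t * t / v)

# Variance of x_j'w_j under each value of the indicator:
log_l1 = log_norm0(t, xx * xx * sigma2_j + xx * sigma2_e)  # delta_j = 1
log_l0 = log_norm0(t, xx * sigma2_e)                       # delta_j = 0
m = max(log_l0, log_l1)
p1 = (1.0 - pi) * np.exp(log_l1 - m)
p0 = pi * np.exp(log_l0 - m)
delta_j = int(rng.binomial(1, p1 / (p0 + p1)))
# beta_j is then drawn given delta_j exactly as in the single-site sampler.
```

Working with the scalar x_j′w_j replaces an n-dimensional multivariate normal evaluation by two univariate densities per locus.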

Gibbs sampler with pseudo priors

Here, following Carlin and Chib [7], a pseudo prior is used for β_j when δ_j is zero. They proposed to use the full conditional distribution of β_j when δ_j = 1 as the pseudo prior for β_j when δ_j = 0 [7, 8], which results in the prior for β_j as

f(β_j | δ_j = 1, σ²_j) = N(β_j; 0, σ²_j), and f(β_j | δ_j = 0) = f̃(β_j),

where f̃(β_j) denotes the pseudo prior. We show below that the posterior mean of the marker effect, E(β_j δ_j | y), does not depend on the pseudo prior. This posterior mean can be written as:

E(β_j δ_j | y) = Σ_{δ_j} ∫ β_j δ_j f(y | β_j, δ_j) f(β_j | δ_j) Pr(δ_j) dβ_j / f(y),   (7)

where conditioning on the remaining parameters is suppressed for brevity. The numerator in (7) is free of the pseudo prior: the term with δ_j = 0 vanishes because then β_j δ_j = 0, and the term with δ_j = 1 involves only the prior N(0, σ²_j). Furthermore, it can be seen from the model equation (1) that the value of y is free of β_j when δ_j = 0. Thus, the marginal distribution of y, the denominator of (7), does not depend on the pseudo prior, which is the distribution of β_j when δ_j = 0. As both the numerator and denominator of (7) are free of the pseudo prior, it follows that the posterior mean of β_j δ_j does not depend on the pseudo prior for β_j. We show here that, given this pseudo prior, the full conditional for δ_j is identical to the marginal full conditional distribution (6) of δ_j, which is used in the joint Gibbs sampler. The full conditional probability of δ_j can be written as:

Pr(δ_j = 1 | ELSE) = 1/(1 + h₀/h₁),   (8)

where

h_d = f(w_j | β_j, δ_j = d, σ²_e) f(β_j | δ_j = d) Pr(δ_j = d | π)

with d = 0 or 1. The ratio in the denominator of (8) is:

h₀/h₁ = [π f(w_j | δ_j = 0, σ²_e)] / [(1 − π) f(w_j | δ_j = 1, σ²_j, σ²_e)]   (9)
× f̃(β_j) / f(β_j | w_j, δ_j = 1),   (10)

where the identity f(w_j | β_j, δ_j = 1, σ²_e) f(β_j | σ²_j) = f(w_j | δ_j = 1, σ²_j, σ²_e) f(β_j | w_j, δ_j = 1) has been used, and f(w_j | β_j, δ_j = 0, σ²_e) = f(w_j | δ_j = 0, σ²_e) because w_j is free of β_j when δ_j = 0. In the above equation, (9) is identical to the ratio Pr(δ_j = 0 | ELSE−)/Pr(δ_j = 1 | ELSE−) obtained from (5) in the joint Gibbs sampler, because f(w_j | δ_j = 0, σ²_e) is also a multivariate normal with null mean and variance Iσ²_e. We show below that (10) is identical to 1. Our proposed prior for β_j when δ_j = 0 (the pseudo prior) can be written as:

f̃(β_j) = f(β_j | w_j, δ_j = 1) = N(β_j; x_j′w_j/c_j, σ²_e/c_j), where c_j = x_j′x_j + σ²_e/σ²_j.   (11)

After replacing the pseudo prior in (10) with (11), (10) becomes f(β_j | w_j, δ_j = 1)/f(β_j | w_j, δ_j = 1), which is identical to 1. Thus, the ratio in (8) is identical to the corresponding ratio in (6), which proves that the full conditional probability (8) of δ_j, when the proposed prior is used, is identical to (6), the marginal full conditional probability of δ_j, which is used in the joint Gibbs sampler. Use of the exact full conditional distribution as the pseudo prior in BayesB, however, would require MH to sample σ²_e and σ²_j. Thus, a distribution close to the full conditional is used here.
Here, we use a normal distribution with mean x_j′w_j/c̃_j and variance σ̃²_e/c̃_j, where c̃_j = x_j′x_j + σ̃²_e/σ̃²_β, and σ̃²_e and σ̃²_β are the means of the prior distributions for the residual and the marker effect variances, respectively. Next, we will show the derivation of the full conditionals, which are proportional to the joint distribution of all parameters and y. Here, the joint distribution of all parameters and y can be written as:

f(y | μ, β, δ, σ²_e)   (12)
× f(μ)   (13)
× f(σ²_e)   (14)
× Π_j [f(β_j | σ²_j)]^{δ_j}   (15)
× Π_j [f̃(β_j)]^{1−δ_j}   (16)
× Π_j Pr(δ_j | π)   (17)
× Π_j f(σ²_j).   (18)

It is easy to see that the full conditional distribution of σ²_e, which does not involve (16), is the same as that in the single-site Gibbs sampler. Although μ also appears in (16), (16) has no effect on the full conditional of μ because the columns of X, which are always centered, are orthogonal to the column vector of ones, so that x_j′1 = 0. Thus, the full conditional of μ is the same as that in the single-site Gibbs sampler. When δ_j = 1, the full conditional distribution of β_j is identical to that in the single-site Gibbs sampler. When δ_j = 0, (16) is the only part that includes β_j. Thus, the full conditional distribution of β_j is:

β_j | ELSE ~ N(x_j′w_j/c̃_j, σ̃²_e/c̃_j).

When δ_j = 1, the full conditional distribution of σ²_j is the same as that in the single-site Gibbs sampler. When δ_j = 0, (18) is the only part that contains σ²_j, which means that it should be sampled from its prior. Thus, when δ_j = 1, the full conditional distribution of σ²_j is a scaled inverted chi-square distribution with scale parameter (β²_j + ν_β S²_β)/(1 + ν_β) and degrees of freedom parameter 1 + ν_β; when δ_j = 0, it is a scaled inverted chi-square distribution with scale parameter S²_β and degrees of freedom parameter ν_β. The full conditional distribution of δ_j can be obtained from the joint distribution of all parameters and y by dropping all factors free of δ_j, which gives

Pr(δ_j | ELSE) ∝ f(w_j | β_j, δ_j, σ²_e) [f(β_j | σ²_j)]^{δ_j} [f̃(β_j)]^{1−δ_j} Pr(δ_j | π).

Compared to the full conditional distribution for δ_j in the single-site Gibbs sampler, the difference is the extra factors [f(β_j | σ²_j)]^{δ_j} and [f̃(β_j)]^{1−δ_j}, because they cannot be cancelled out as in the single-site Gibbs sampler. Thus,

Pr(δ_j = 1 | ELSE) = (1 − π) f(w_j | β_j, δ_j = 1, σ²_e) f(β_j | σ²_j) / [(1 − π) f(w_j | β_j, δ_j = 1, σ²_e) f(β_j | σ²_j) + π f(w_j | δ_j = 0, σ²_e) f̃(β_j)],

where w_j = y − 1μ − Σ_{l≠j} x_l β_l δ_l and f̃(β_j) is the pseudo prior given above.
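The modified pseudo prior can be sketched as follows: the full-conditional form with the unknown variances replaced by the means of their priors (denoted here prior_mean_s2e and prior_mean_s2b; all numbers are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(4)

# Illustrative inputs for one locus j (not the paper's data).
n = 100
x_j = rng.integers(0, 3, n).astype(float)
x_j -= x_j.mean()
w_j = rng.normal(0.0, 1.0, n)

# Prior means of the residual and marker-effect variances (assumed values),
# playing the roles of the constants substituted into the pseudo prior.
prior_mean_s2e = 1.0
prior_mean_s2b = 0.01

c = x_j @ x_j + prior_mean_s2e / prior_mean_s2b
pseudo_mean = (x_j @ w_j) / c          # mimics the full-conditional mean
pseudo_var = prior_mean_s2e / c        # mimics the full-conditional variance
beta_j = rng.normal(pseudo_mean, np.sqrt(pseudo_var))  # draw when delta_j = 0
```

Because the constants replace σ²_e and σ²_j in the pseudo prior, those variances keep scaled inverted chi-square full conditionals and no MH step is needed.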

Data

Real genotypic data and simulated phenotypic data were used here to compare BayesB using MH, efficient MH or the three different Gibbs samplers described above. The genotypic data included 3961 individuals with 55,734 SNPs. The heritability of the simulated trait was 0.25. The training data contained 3206 individuals and the remaining individuals were used for testing. A chain of length 50,000 was used to estimate the parameters of interest. Prediction accuracies were calculated for the different samplers. The effective sample sizes [11], which estimate the number of independent samples from a chain, were calculated to compare convergence rates of the different methods. Computing times for the different methods with the same number of iterations were also compared.
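Effective sample size is estimated from the autocorrelations of the chain. Below is a simple sketch that truncates the autocorrelation sum at the first non-positive lag; this is one common convention and not necessarily the estimator of [11]:

```python
import numpy as np

def effective_sample_size(chain):
    """n / (1 + 2 * sum of positive-lag autocorrelations), truncated at the
    first non-positive lag (a simple, common convention)."""
    x = np.asarray(chain, dtype=float)
    x = x - x.mean()
    n = len(x)
    acov = np.correlate(x, x, mode="full")[n - 1:] / n
    rho = acov / acov[0]
    s = 0.0
    for r in rho[1:]:
        if r <= 0.0:
            break
        s += r
    return n / (1.0 + 2.0 * s)

rng = np.random.default_rng(5)
iid = rng.normal(size=5_000)           # well-mixed "chain"
ar = np.empty(5_000)                   # AR(1) chain with correlation 0.9
ar[0] = 0.0
for t in range(1, len(ar)):
    ar[t] = 0.9 * ar[t - 1] + rng.normal()
print(effective_sample_size(iid) > effective_sample_size(ar))  # True
```

A poorly mixing chain (the AR(1) example) yields far fewer effective samples than an independent sequence of the same length, which is exactly what Table 1 quantifies for the five samplers.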

Results

The number of effective samples per second of computing time was obtained for BayesB using MH, efficient MH or the three different Gibbs samplers. These three Gibbs samplers were almost twice as efficient as Metropolis–Hastings (Table 1). The prediction accuracies for the different samplers, which are calculated as the correlation between estimated breeding values and simulated phenotypes, are all equal to 0.296. Posterior means of the residual variance for these four samplers are all equal to 2.508. Posterior means of the genetic variance for the different samplers are almost equal, ranging from 0.955 to 0.957.
Table 1

Efficiency of alternative MCMC samplers for BayesB

Sampler                  MH       Efficient MH   Single-site Gibbs   Joint Gibbs   Gibbs with pseudo prior
Computing time (s)       90,009   70,714         52,452              44,726        47,043
Effective sample size    25,262   24,588         24,684              26,757        25,036
Effective samples/s      0.280    0.347          0.471               0.598         0.532

Results are given for the computing time in seconds to obtain 50,000 samples, the effective sample size and the effective samples/s for BayesB using Metropolis–Hastings (MH), efficient MH, the single-site Gibbs sampler, the joint Gibbs sampler and the Gibbs sampler with pseudo priors.

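The last row of Table 1 can be re-derived from the first two rows (the published values agree up to rounding):

```python
# Re-derive effective samples per second from Table 1 (values as published).
times = {"MH": 90_009, "Efficient MH": 70_714, "Single-site Gibbs": 52_452,
         "Joint Gibbs": 44_726, "Gibbs with pseudo prior": 47_043}
ess = {"MH": 25_262, "Efficient MH": 24_588, "Single-site Gibbs": 24_684,
       "Joint Gibbs": 26_757, "Gibbs with pseudo prior": 25_036}
rate = {name: ess[name] / times[name] for name in times}
for name, r in rate.items():
    print(f"{name}: {r:.3f}")
# Joint Gibbs is the most efficient: 0.598/0.280 is about 2.1 times MH,
# and 0.598/0.347 is about 1.7 times efficient MH.
```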

Discussion

In the joint Gibbs sampler, β_j and δ_j are sampled jointly, which addresses the problem of dependence between β_j and δ_j. Thus, the joint sampler had the largest effective sample size. However, in the single-site Gibbs sampler, β_j and δ_j are sampled from their full conditionals, and thus, due to the dependence between β_j and δ_j, the single-site Gibbs sampler had the smallest effective sample size. These differences in effective sample size, however, were small. In the Gibbs sampler with pseudo priors, β_j and δ_j are also sampled from their full conditionals. Recall that we have shown that the posterior mean of the marker effects does not depend on the pseudo prior. Furthermore, Godsill [8] has shown that the marginal posteriors for parameters in the model do not depend on the pseudo prior, which is the prior for β_j when δ_j = 0. As suggested by Carlin and Chib [7], when the full conditional distribution of β_j when δ_j = 1 is chosen to be the pseudo prior, we have shown that the samples of δ_j and β_j are identically distributed to those from the joint Gibbs sampler. Thus, the Gibbs sampler with pseudo priors will have a similar effective sample size as the joint Gibbs sampler. However, when the full conditional distribution of β_j when δ_j = 1 is used as the pseudo prior in BayesB, the full conditional distributions of σ²_e and σ²_j are not of known forms because σ²_e and σ²_j appear in the pseudo prior for the marker effect. In contrast to BayesB, in the model used by Godsill [8] to justify the use of full conditional distributions as pseudo priors, hyper-parameters such as these variances were omitted for simplicity [8]. Here, we have replaced σ²_e and σ²_j in the pseudo prior with constants such that the full conditionals for σ²_e and σ²_j have scaled inverted chi-square distributions. This modification gives a pseudo prior whose distribution is close to that of the full conditional. In the Gibbs sampler with this pseudo prior, the effective sample size was smaller than in the joint Gibbs sampler but still larger than in the single-site Gibbs sampler.

Conclusions

When an MH algorithm is used to jointly sample the marker effect and the locus-specific variance, the BayesB method is computationally intensive. After introducing a variable δ_j that indicates whether the marker effect for a locus is zero or non-zero, the marker effect and locus-specific variance can be sampled using a Gibbs sampler without MH. Among the Gibbs samplers considered here, the joint Gibbs sampler is the most efficient. This sampler is about 2.1 times as efficient as the MH algorithm proposed by Meuwissen et al. [4] and 1.7 times as efficient as that proposed by Habier et al. [6].
References

1.  Prediction of total genetic value using genome-wide dense marker maps.

Authors:  T H Meuwissen; B J Hayes; M E Goddard
Journal:  Genetics       Date:  2001-04       Impact factor: 4.562

2.  Bayesian methods applied to GWAS.

Authors:  Rohan L Fernando; Dorian Garrick
Journal:  Methods Mol Biol       Date:  2013

3.  Additive genetic variability and the Bayesian alphabet.

Authors:  Daniel Gianola; Gustavo de los Campos; William G Hill; Eduardo Manfredi; Rohan Fernando
Journal:  Genetics       Date:  2009-07-20       Impact factor: 4.562

4.  Priors in whole-genome regression: the bayesian alphabet returns.

Authors:  Daniel Gianola
Journal:  Genetics       Date:  2013-05-01       Impact factor: 4.562

5.  Extension of the bayesian alphabet for genomic selection.

Authors:  David Habier; Rohan L Fernando; Kadir Kizilkaya; Dorian J Garrick
Journal:  BMC Bioinformatics       Date:  2011-05-23       Impact factor: 3.169

