Abstract
The distribution of allele frequencies of a large number of biallelic sites is known as the "allele-frequency spectrum" or "site-frequency spectrum" (SFS). Without selection and in regions of relatively high recombination rates, sites may be assumed to be independently and identically distributed. With a beta equilibrium distribution of allelic proportions and binomial sampling, a beta-binomial compound likelihood results for each site. The likelihood of the data and the posterior distribution of the two parameters, the scaled mutation rate θ and the mutation bias α, are investigated in the general case and for small scaled mutation rates θ. In the general case, an expectation-maximization (EM) algorithm is derived to obtain maximum likelihood estimates of both parameters. With an appropriate prior distribution, a Markov chain Monte Carlo sampler to integrate the posterior distribution is also derived. As far as I am aware, previous maximum likelihood or Bayesian estimators of θ explicitly or implicitly assume small scaled mutation rates, i.e., θ≪1. For θ≪1, maximum-likelihood estimators are also derived for both parameters using a Taylor series expansion of the beta-binomial distribution. The estimator of θ is a variant of the Ewens-Watterson estimator and of the maximum likelihood estimator derived with the Poisson Random Field approach. With a conjugate prior distribution, marginal and conditional beta posterior distributions are also derived for both parameters.
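To make the beta-binomial compound likelihood concrete, the following is a minimal sketch in Python. It assumes one common parameterization that the abstract does not spell out: the allelic proportion follows Beta(αθ, (1−α)θ), and a sample of m haplotypes per site is drawn binomially. The function names and the layout of the SFS as a list of site counts are hypothetical and chosen for illustration only; this is not the EM algorithm or the MCMC sampler described in the abstract.

```python
from math import lgamma


def log_beta(a, b):
    """Logarithm of the beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def log_binom(n, k):
    """Logarithm of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def beta_binomial_logpmf(y, m, theta, alpha):
    """Log-probability of observing y copies of the focal allele in m haplotypes.

    ASSUMPTION: the equilibrium allelic proportion is Beta(alpha*theta,
    (1-alpha)*theta); the abstract does not state the parameterization.
    """
    a = alpha * theta
    b = (1.0 - alpha) * theta
    return log_binom(m, y) + log_beta(y + a, m - y + b) - log_beta(a, b)


def sfs_log_likelihood(counts, theta, alpha):
    """Log-likelihood of a site-frequency spectrum under independent sites.

    counts[y] is the number of sites with y copies of the focal allele,
    for y = 0..m, where m = len(counts) - 1 (hypothetical input layout).
    """
    m = len(counts) - 1
    return sum(
        n * beta_binomial_logpmf(y, m, theta, alpha)
        for y, n in enumerate(counts)
        if n > 0
    )
```

Because sites are treated as independent and identically distributed, the log-likelihood is a count-weighted sum over the spectrum. With a made-up toy spectrum such as `counts = [900, 30, 10, 5, 55]` (m = 4), `sfs_log_likelihood(counts, theta, alpha)` can be evaluated on a grid of θ and α values to inspect the likelihood surface before applying a proper optimizer.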
Keywords: Beta–binomial; EM-algorithm; Markov chain Monte Carlo algorithm; Mutation–drift equilibrium; Posterior; Stirling distribution
Year: 2014 PMID: 25453604 DOI: 10.1016/j.tpb.2014.10.002
Source DB: PubMed Journal: Theor Popul Biol ISSN: 0040-5809 Impact factor: 1.570