Tristan L Stark, David A Liberles.
Abstract
Amino acid substitution models are commonly used for phylogenetic inference, for ancestral sequence reconstruction, and for the inference of positive selection. All commonly used models explicitly assume that each site evolves independently, an assumption that is violated by both linkage and protein structural and functional constraints. We introduce two new models for amino acid substitution which incorporate linkage between sites, each based on the (population-genetic) Moran model. The first model is a generalized population process tracking arbitrarily many sites which undergo mutation, with individuals replaced according to their fitnesses. This model provides a reasonably complete framework for simulations but is numerically and analytically intractable. We also introduce a second model which includes several simplifying assumptions but for which some theoretical results can be derived. We analyze the simplified model to determine conditions under which linkage is likely to have meaningful effects on sitewise substitution probabilities, as well as conditions under which the effects are likely to be negligible. These findings are an important step in the generation of tractable phylogenetic models that parameterize selective coefficients for amino acid substitution while accounting for linkage of sites leading to both hitchhiking and background selection.
Keywords: mutation-selection model; phylogenetic model; positive directional selection; protein evolution
Year: 2021 PMID: 34581792 PMCID: PMC8557849 DOI: 10.1093/gbe/evab225
Source DB: PubMed Journal: Genome Biol Evol ISSN: 1759-6653 Impact factor: 3.416
Fig. 1. Gumbel approximation versus exact solution for cumulative distribution of time to fixation given eventual fixation in the two-allele process with , Ns = 10. Units of time are expected time to replace one member of the population.
Fig. 2. Probability that two mutations fix together conditional on both reaching fixation eventually for various fixed time-between-mutations t (units of average time to replace an individual in the population) as a function of f1 (fitness of first mutation to arise relative to wild-type) and f2 (fitness of second mutation to arise relative to wild-type) with N = 100.
Fig. 3. Probability that two mutations (arising according to a Poisson process) fix together conditional on exactly two mutations arising, and both reaching fixation eventually on a branch of length t (units of average time to replace an individual in the population) as a function of their (unordered) fitnesses relative to the wild-type f1, f2 with N = 10.
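As a rough illustration of the two-allele Moran process that the captions refer to, the sketch below simulates a single mutant of relative fitness f in a population of size N and compares the simulated fixation frequency with the standard closed-form fixation probability for this process. It is not the authors' code. The function names, the parameter values, and the choice to count time in individual replacement events are assumptions made for this example only.

```python
import random


def fixation_probability(f, n):
    """Closed-form fixation probability of one mutant with relative
    fitness f in a two-allele Moran process of population size n."""
    if f == 1.0:
        return 1.0 / n  # neutral case
    return (1.0 - 1.0 / f) / (1.0 - f ** (-n))


def simulate_fixation(f, n, rng):
    """Run one trajectory starting from a single mutant.

    Each step picks a parent with probability proportional to fitness
    and replaces a uniformly chosen individual (assumed update rule).
    Returns (fixed, steps), where steps counts replacement events.
    """
    mutants = 1
    steps = 0
    while 0 < mutants < n:
        steps += 1
        total_fitness = f * mutants + (n - mutants)
        parent_is_mutant = rng.random() < f * mutants / total_fitness
        dead_is_mutant = rng.random() < mutants / n
        if parent_is_mutant and not dead_is_mutant:
            mutants += 1
        elif dead_is_mutant and not parent_is_mutant:
            mutants -= 1
    return mutants == n, steps


def estimate(f, n, trials, seed=0):
    """Estimate the fixation probability and the mean number of steps
    to fixation, conditional on fixation, from repeated simulation."""
    rng = random.Random(seed)
    fixed_times = []
    for _ in range(trials):
        fixed, steps = simulate_fixation(f, n, rng)
        if fixed:
            fixed_times.append(steps)
    p_hat = len(fixed_times) / trials
    mean_time = sum(fixed_times) / len(fixed_times) if fixed_times else float("nan")
    return p_hat, mean_time


if __name__ == "__main__":
    # Hypothetical parameter values, not taken from the figures.
    f, n = 1.5, 20
    p_hat, mean_time = estimate(f, n, trials=4000)
    print("simulated fixation probability:", p_hat)
    print("closed-form fixation probability:", fixation_probability(f, n))
    print("mean steps to fixation, given fixation:", mean_time)
```

Collecting the step counts of the trajectories that fix gives an empirical distribution of time to fixation conditional on fixation, which is the quantity plotted in Fig. 1. Converting steps to the captions' time units depends on how the paper scales time, which this sketch does not attempt to reproduce.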