Dominik Schrempf, Bui Quang Minh, Arndt von Haeseler, Carolin Kosiol.
Abstract
Molecular phylogenetics has neglected polymorphisms within present and ancestral populations for a long time. Recently, multispecies coalescent based methods have increased in popularity; however, their application is limited to a small number of species and individuals. We introduced a polymorphism-aware phylogenetic model (PoMo), which overcomes this limitation and scales well with the increasing amount of sequence data while accounting for present and ancestral polymorphisms. PoMo circumvents handling of gene trees and directly infers species trees from allele frequency data. Here, we extend the PoMo implementation in IQ-TREE and integrate search for the statistically best-fit mutation model, the ability to infer mutation rate variation across sites, and assessment of branch support values. We exemplify an analysis of a hundred species with ten haploid individuals each, showing that PoMo can perform inference on large data sets. While PoMo is more accurate than standard substitution models applied to concatenated alignments, it is almost as fast. We also provide bmm-simulate, a software package that allows simulation of sequences evolving under PoMo. The new options consolidate the value of PoMo for phylogenetic analyses with population data.
Keywords: boundary mutation model; incomplete lineage sorting; phylogenetics; polymorphism-aware phylogenetic model; species tree
Year: 2019 PMID: 30825307 PMCID: PMC6526911 DOI: 10.1093/molbev/msz043
Source DB: PubMed Journal: Mol Biol Evol ISSN: 0737-4038 Impact factor: 16.240
Fig. 1. Branch score distance of concatenation approach and IQ-TREE-PoMo with N = 10 and weighted binomial sampling for Yule trees with 100 species and ten individuals each. The tree height measured in coalescent units is , where N is the effective population size. The HKY model was used for both inference methods. The heterozygosity is per site. Each gene spans 1,000 sites. The error bars are standard deviations of ten replicate analyses.
Fig. 2. Relative errors of the transition to transversion ratio , the heterozygosity , and the shape parameter of the Γ distributed mutation rate heterogeneity . The true shape parameter is (A), (B), and (C), respectively.
Fig. 3. Branch score distance of weighted binomial and weighted hypergeometric sampling for Yule trees of height with twelve species and ten individuals each. Heterozygosity varies between 0.01 and 0.1.
Fig. 4. Phylogeny inferred from primate data (Prado-Martinez et al. 2013). Both the UFBoot2 (Hoang et al. 2018) and SH-aLRT (Guindon et al. 2010) branch support tests yielded 100% support values.
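The captions above name weighted binomial and weighted hypergeometric sampling without defining them. The following is a minimal sketch, assuming the common reading of these terms: each PoMo state with i copies of an allele in a virtual population of size N receives a tip weight equal to the probability of drawing the observed allele counts from that state, with replacement (binomial) or without replacement (hypergeometric). The function names and the restriction to two alleles are illustrative assumptions and are not taken from IQ-TREE.

```python
from math import comb


def binomial_weights(k, n, N):
    """Tip weights for PoMo states i = 0..N (assumed definition).

    k: observed copies of the focal allele, n: sampled individuals,
    N: virtual population size. Sampling is with replacement.
    """
    weights = []
    for i in range(N + 1):
        p = i / N  # frequency of the focal allele in state i
        weights.append(comb(n, k) * p**k * (1 - p) ** (n - k))
    return weights


def hypergeometric_weights(k, n, N):
    """Same as above, but sampling without replacement (requires n <= N)."""
    if n > N:
        raise ValueError("hypergeometric sampling needs n <= N")
    total = comb(N, n)
    # math.comb returns 0 when the lower index exceeds the upper one,
    # so impossible states automatically get weight 0.
    return [comb(i, k) * comb(N - i, n - k) / total for i in range(N + 1)]


if __name__ == "__main__":
    # Ten sampled individuals, three carrying the focal allele, N = 10.
    print(binomial_weights(3, 10, 10))
    print(hypergeometric_weights(3, 10, 10))
```

Under these assumptions, binomial weights spread the observation over all polymorphic states, whereas hypergeometric weights with n = N place all weight on the single state that matches the observed counts.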