Jakub Truszkowski, Nick Goldman.
Abstract
We prove that maximum likelihood phylogenetic inference is consistent on gapped multiple sequence alignments (MSAs) as long as substitution rates across each edge are greater than zero, under mild assumptions on the structure of the alignment. Under these assumptions, maximum likelihood will asymptotically recover the tree with edge lengths corresponding to the mean number of substitutions per site on each edge. This refutes Warnow's recent suggestion (Warnow 2012) that maximum likelihood phylogenetic inference might be statistically inconsistent when gaps are treated as missing data, even if the MSA is correct. We also derive a simple new proof of maximum likelihood consistency of ungapped alignments.
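For orientation, the textbook route from Kullback–Leibler divergence to maximum likelihood consistency can be sketched as follows. This is a generic outline and not necessarily the authors' proof. The symbols are assumptions introduced here: \theta^* stands for the true tree with its edge lengths, x_1, \dots, x_n for alignment columns assumed to be i.i.d. draws from P_{\theta^*} (the ungapped setting), and x^{\mathrm{obs}}, x^{\mathrm{mis}} for the observed and gapped entries of a column.

```latex
% Assumed notation (not from the source): \theta^* is the true tree with
% edge lengths; x_1,\dots,x_n are i.i.d. alignment columns from P_{\theta^*}.
\begin{align}
  % Law of large numbers for the per-site log-likelihood of a candidate \theta
  \frac{1}{n}\sum_{i=1}^{n} \log P_{\theta}(x_i)
    &\xrightarrow{\ \text{a.s.}\ }
    \mathbb{E}_{\theta^*}\!\left[\log P_{\theta}(X)\right], \\
  % The gap to the true parameter is a Kullback--Leibler divergence
  \mathbb{E}_{\theta^*}\!\left[\log P_{\theta^*}(X)\right]
    - \mathbb{E}_{\theta^*}\!\left[\log P_{\theta}(X)\right]
    &= D_{\mathrm{KL}}\!\left(P_{\theta^*} \,\middle\|\, P_{\theta}\right) \;\ge\; 0, \\
  % Treating gaps as missing data: sum over the unobserved states
  P_{\theta}\!\left(x^{\mathrm{obs}}\right)
    &= \sum_{x^{\mathrm{mis}}} P_{\theta}\!\left(x^{\mathrm{obs}}, x^{\mathrm{mis}}\right).
\end{align}
```

The divergence in the second line is zero only when P_{\theta} = P_{\theta^*}, so if the model is identifiable the limiting objective has its unique maximum at \theta^*. In a gapped alignment the columns need not be identically distributed, because the pattern of missing entries varies from column to column; this is presumably where assumptions on the structure of the alignment become necessary.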
Keywords: Kullback–Leibler divergence; maximum likelihood; missing data; multiple sequence alignment; phylogeny
Year: 2015 PMID: 26615177 PMCID: PMC4748752 DOI: 10.1093/sysbio/syv089
Source DB: PubMed Journal: Syst Biol ISSN: 1063-5157 Impact factor: 15.683