Guanghong Zuo, Qiang Li, Bailin Hao.
Abstract
Alignment-free comparison of genomes in the composition vector (CV) approach to prokaryotic phylogeny relies on an enlarged alphabet of K-tuples. We summarize what is known about the choice of K and examine the results of using CVs with subtraction of a statistical background for K=3-9 and raw CVs without subtraction for K=1-12. The evaluation criterion is direct comparison with taxonomy. For prokaryotes, the best performance is obtained for K=5 and 6 with subtraction and for K=11, 12, or even larger without subtraction. In general, CVs with subtraction are slightly better and less CPU-intensive, but CVs without subtraction may provide complementary information.
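The abstract does not spell out how a CV is built, so the following is only a minimal sketch of the two variants it compares, under stated assumptions. It assumes the subtraction scheme commonly described for the CV method: the expected frequency of a K-tuple is predicted from its (K-1)- and (K-2)-tuple frequencies, and the vector component is the relative deviation from that prediction. It also assumes a correlation-based distance D = (1 - C)/2. The function names (`kmer_freqs`, `composition_vector`, `cv_distance`) and the toy sequence are hypothetical and are not taken from the paper or its software.

```python
from collections import Counter
from math import sqrt


def kmer_freqs(seq, k):
    """Relative frequencies of overlapping k-tuples in seq."""
    total = len(seq) - k + 1
    if total <= 0:
        return {}
    counts = Counter(seq[i:i + k] for i in range(total))
    return {word: n / total for word, n in counts.items()}


def composition_vector(seq, k, subtract=True):
    """Sparse CV of seq as a dict {k-tuple: component}.

    subtract=False returns raw k-tuple frequencies.
    subtract=True returns (f - f0) / f0, where f0 is the assumed
    Markov-type prediction
        f0(a1..ak) = f(a1..a(k-1)) * f(a2..ak) / f(a2..a(k-1)).
    Components with f0 == 0 are treated as zero and omitted.
    """
    f_k = kmer_freqs(seq, k)
    if not subtract:
        return f_k
    if k < 3:
        raise ValueError("subtraction needs k >= 3 in this sketch")
    f_k1 = kmer_freqs(seq, k - 1)
    f_k2 = kmer_freqs(seq, k - 2)
    alphabet = sorted(set(seq))
    cv = {}
    # Every word with f0 > 0 is an observed (k-1)-tuple plus one letter
    # whose (k-1)-suffix is also observed; such a word may have f == 0.
    for prefix, f_prefix in f_k1.items():
        for letter in alphabet:
            word = prefix + letter
            f_suffix = f_k1.get(word[1:])
            if f_suffix is None:
                continue  # f0 == 0, component defined as zero
            f0 = f_prefix * f_suffix / f_k2[word[1:-1]]
            cv[word] = (f_k.get(word, 0.0) - f0) / f0
    return cv


def cv_distance(a, b):
    """Assumed distance D = (1 - C) / 2, with C the cosine of the angle
    between the two sparse vectors."""
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("distance is undefined for a zero vector")
    dot = sum(v * b.get(word, 0.0) for word, v in a.items())
    return (1.0 - dot / (norm_a * norm_b)) / 2.0


if __name__ == "__main__":
    toy = "ABABABBA"  # toy string, not real sequence data
    with_sub = composition_vector(toy, 3, subtract=True)
    raw = composition_vector(toy, 3, subtract=False)
    print(len(with_sub), len(raw))
```

Storing only non-zero components in a dictionary keeps the vectors sparse. This matters for the larger K values without subtraction, where the number of possible K-tuples grows exponentially with K.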
Keywords: Alignment-free; Composition vector; Prokaryote phylogeny and taxonomy; Subtraction procedure; Whole-genome-based
Year: 2014 PMID: 25205031 DOI: 10.1016/j.compbiolchem.2014.08.021
Source DB: PubMed Journal: Comput Biol Chem ISSN: 1476-9271 Impact factor: 2.877