Roland Wittler (1, 2, 3). Affiliations: (1) Genome Informatics, Faculty of Technology, Bielefeld University, Bielefeld, Germany. (2) Center for Biotechnology, Bielefeld University, Bielefeld, Germany. (3) Center for Biotechnology, Bielefeld Institute for Bioinformatics Infrastructure, Bielefeld University, Bielefeld, Germany.
Abstract
BACKGROUND: The increasing amount of available genome sequence data enables large-scale comparative studies. A common task is the inference of phylogenies, which is challenging if close reference sequences are not available, if genome sequences are incompletely assembled, or if the number of genomes is too high for multiple sequence alignment to finish in reasonable time. RESULTS: We present a new whole-genome based approach to infer phylogenies that is alignment- and reference-free. In contrast to other methods, it does not rely on pairwise comparisons to determine distances from which the edges of a tree are inferred. Instead, a colored de Bruijn graph is constructed, and information on common subsequences is extracted to infer phylogenetic splits. CONCLUSIONS: The new methodology for large-scale phylogenomics shows high potential. Application to different datasets confirms the robustness of the approach. A comparison to other state-of-the-art whole-genome based methods indicates comparable or higher accuracy and efficiency.
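To illustrate the idea of deriving splits from shared subsequences instead of pairwise distances, here is a minimal sketch, not the author's implementation. It assumes that plain k-mers stand in for the nodes of a colored de Bruijn graph, that the "color" of a k-mer is the set of genomes containing it, and that a split is weighted simply by the number of k-mers supporting it. The function names (`kmers`, `color_sets`, `split_weights`), the toy sequences, and the weighting scheme are all hypothetical. Reverse complements and the combination of a color set with its complement are ignored for brevity.

```python
from collections import defaultdict


def kmers(seq, k):
    """Yield all overlapping k-mers of a sequence."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def color_sets(genomes, k):
    """Map each k-mer to the set of genome names (colors) that contain it."""
    colors = defaultdict(set)
    for name, seq in genomes.items():
        for kmer in kmers(seq, k):
            colors[kmer].add(name)
    return colors


def split_weights(genomes, k):
    """Count, for each color set, how many distinct k-mers carry exactly that set.

    Each color set is read as one side of a candidate split
    (genomes sharing the k-mer versus all others).
    Assumption: weight = number of supporting k-mers.
    """
    all_names = frozenset(genomes)
    weights = defaultdict(int)
    for names in color_sets(genomes, k).values():
        side = frozenset(names)
        if side == all_names:
            continue  # a k-mer present in every genome separates nothing
        weights[side] += 1
    return dict(weights)


if __name__ == "__main__":
    # Hypothetical toy sequences, for illustration only.
    toy = {"A": "ACGTAC", "B": "ACGTAG", "C": "TTGTAG"}
    for side, weight in sorted(split_weights(toy, 3).items(), key=lambda x: -x[1]):
        print(sorted(side), weight)
```

Note that no genome is ever compared directly to another: every k-mer is visited once and votes for the group of genomes that share it, which is the property the abstract contrasts with distance-based methods.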