Mateusz Kaduk (1,2), Erik Sonnhammer (1,2). (1) Department of Biochemistry and Biophysics, Stockholm University. (2) Science for Life Laboratory (SciLifeLab), Tomtebodavagen 23, Solna, Sweden.
Abstract
Motivation: The initial step in many orthology inference methods is the computationally demanding establishment of all pairwise protein similarities across all analysed proteomes. Because this step scales quadratically with the number of proteomes, it has become a major bottleneck. The Hieranoid algorithm offers a remedy: it reduces the complexity to linear by hierarchically aggregating ortholog groups from InParanoid along a species tree.
Results: We have further developed the Hieranoid algorithm in many ways; major improvements have been made to the construction of multiple sequence alignments and consensus sequences. Hieranoid version 2 was evaluated with standard benchmarks, which reveal a dramatic improvement in the coverage/accuracy tradeoff over version 1, such that it now compares favourably with the best methods. The new parallelized cluster mode allows Hieranoid to be run on large data sets in a much shorter time than InParanoid, yet at similar accuracy.
Contact: mateusz.kaduk@scilifelab.se.
Availability and Implementation: Perl code is freely available at http://hieranoid.sbc.su.se/.
Supplementary information: Supplementary data are available at Bioinformatics online.
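The scaling argument in the Motivation can be illustrated with a small counting sketch. It is not the Hieranoid implementation (which is written in Perl); it only assumes a strictly binary species tree, encoded here as nested tuples, and compares the number of proteome-versus-proteome comparisons in an all-versus-all scheme with the number of comparisons when one is made per internal tree node. The toy tree, the species names, and all function names are hypothetical.

```python
from itertools import combinations


def leaves(tree):
    """Return the species names at the leaves of a nested-tuple tree."""
    if isinstance(tree, str):
        return [tree]
    return [name for child in tree for name in leaves(child)]


def all_vs_all_comparisons(species):
    """Every unordered pair of species: n * (n - 1) / 2 comparisons."""
    return list(combinations(sorted(species), 2))


def hierarchical_comparisons(tree):
    """One comparison per internal node of a binary tree, in post-order.

    Each entry pairs the species under the left child with the species
    under the right child, standing in for the aggregated groups that
    would be compared at that node (an assumption of this sketch).
    """
    if isinstance(tree, str):
        return []  # a leaf needs no comparison
    left, right = tree
    return (
        hierarchical_comparisons(left)
        + hierarchical_comparisons(right)
        + [(tuple(leaves(left)), tuple(leaves(right)))]
    )


# Hypothetical five-species tree, used only for illustration.
toy_tree = ((("human", "mouse"), "chicken"), ("fly", "worm"))

pairwise = all_vs_all_comparisons(leaves(toy_tree))
along_tree = hierarchical_comparisons(toy_tree)
print(len(pairwise), "all-versus-all comparisons")
print(len(along_tree), "comparisons along the tree")
```

For n species the all-versus-all count is n(n-1)/2, whereas a binary tree with n leaves has n-1 internal nodes, so the number of comparisons along the tree grows linearly with the number of proteomes. The sketch counts comparisons only and says nothing about the cost of each one.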