Jorge González-Domínguez (1), Yongchao Liu (2), Juan Touriño (1), Bertil Schmidt (3).
1. Campus de Elviña, Grupo de Arquitectura de Computadores, Universidade da Coruña, 15071 A Coruña, Spain.
2. School of Computational Science and Engineering, Georgia Institute of Technology, 266 Ferst Drive, Atlanta, GA 30332, USA.
3. Institut für Informatik, Johannes Gutenberg Universität Mainz, Staudingerweg 9, Mainz 55128, Germany.
Abstract
MSAProbs is a state-of-the-art protein multiple sequence alignment tool based on hidden Markov models. It can achieve high alignment accuracy at the expense of relatively long runtimes for large-scale input datasets. In this work we present MSAProbs-MPI, a distributed-memory parallel version of the multithreaded MSAProbs tool that is able to reduce runtimes by exploiting the compute capabilities of common multicore CPU clusters. Our performance evaluation on a cluster with 32 nodes (each containing two Intel Haswell processors) shows reductions in execution time of over one order of magnitude for typical input datasets. Furthermore, MSAProbs-MPI using eight nodes is faster than the GPU-accelerated QuickProbs running on a Tesla K20. Another strong point is that MSAProbs-MPI can deal with large datasets for which MSAProbs and QuickProbs might fail due to time and memory constraints, respectively.
Availability and implementation: Source code in C++ and MPI running on Linux systems, as well as a reference manual, are available at http://msaprobs.sourceforge.net
Contact: jgonzalezd@udc.es
Supplementary information: Supplementary data are available at Bioinformatics online.