Hyungro Lee (1), Minsu Lee (2), Wazim Mohammed Ismail (1), Mina Rho (3), Geoffrey C Fox (1), Sangyoon Oh (4), Haixu Tang (1)
1. School of Informatics and Computing, Indiana University, Bloomington, IN, USA.
2. Department of Computer Science and Engineering, Ewha Womans University, Seoul, Korea.
3. Department of Computer Science and Engineering, Hanyang University, Seoul, Korea.
4. Department of Software Convergence Technology, Ajou University, Suwon, Korea.
Abstract
MGEScan-long terminal repeat (LTR) and MGEScan-non-LTR are programs that have been used successfully to identify LTR and non-LTR retrotransposons in eukaryotic genome sequences. However, these programs are neither supported by easy-to-use interfaces nor well suited to data visualization in general data formats. Here, we present MGEScan, a user-friendly system that combines these two programs with the Galaxy workflow system and accelerates them with MPI and Python threading on compute clusters. Together, MGEScan and Galaxy let researchers identify transposable elements through a graphical user interface with ready-to-use workflows. MGEScan also visualizes custom annotation tracks for mobile genetic elements in public genome browsers. Concurrent processing and MPI on four virtual cores yield a maximum speed-up of 3.26× in execution time. MGEScan provides four operational modes: as a command line tool, as a Galaxy Toolshed, on a Galaxy-based web server, and on a virtual cluster on the Amazon cloud.
Availability and implementation: MGEScan tutorials and source code are available at http://mgescan.readthedocs.org/
Contact: hatang@indiana.edu or syoh@ajou.ac.kr
Supplementary information: Supplementary data are available at Bioinformatics online.
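To put the reported speed-up in context, the short sketch below derives two standard figures from the numbers given in the abstract (3.26× on four virtual cores): the parallel efficiency, and the parallelizable fraction implied by Amdahl's law. The function names are hypothetical and chosen for illustration, and the Amdahl estimate is an assumption of this sketch, not an analysis reported by the authors; it ignores communication overhead and assumes all four cores are fully used.

```python
def parallel_efficiency(speedup: float, workers: int) -> float:
    """Fraction of ideal linear speed-up actually achieved."""
    return speedup / workers


def amdahl_parallel_fraction(speedup: float, workers: int) -> float:
    """Parallelizable fraction p implied by Amdahl's law.

    Amdahl's law: S = 1 / ((1 - p) + p / n).
    Solving for p gives p = (1 - 1/S) / (1 - 1/n).
    This is an illustrative model, not a measurement from the paper.
    """
    return (1.0 - 1.0 / speedup) / (1.0 - 1.0 / workers)


# Values taken from the abstract: 3.26x speed-up on four virtual cores.
SPEEDUP = 3.26
CORES = 4

efficiency = parallel_efficiency(SPEEDUP, CORES)
fraction = amdahl_parallel_fraction(SPEEDUP, CORES)

print(f"parallel efficiency: {efficiency:.3f}")
print(f"implied parallel fraction (Amdahl): {fraction:.3f}")
```

Under these assumptions, the efficiency works out to about 81.5% of ideal linear scaling, and the implied parallelizable fraction is roughly 0.92.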