Anuj Gupta (1,2), I King Jordan (1,2,3), Lavanya Rishishwar (1,2,3)
1. School of Biology, Georgia Institute of Technology, Atlanta, GA 30332, USA.
2. Applied Bioinformatics Laboratory, Atlanta, GA 30332, USA.
3. PanAmerican Bioinformatics Institute, Cali, Valle del Cauca 760043, Colombia.
Abstract
Rapid and accurate identification of the sequence type (ST) of bacterial pathogens is critical for epidemiological surveillance and outbreak control. Cheaper and faster next-generation sequencing (NGS) technologies are now preferred over the traditional method of amplicon sequencing for multilocus sequence typing (MLST). However, data generated by NGS platforms require quality control, genome assembly and sequence similarity searching before an isolate's ST can be determined. These steps are computationally intensive and time-consuming, which makes them poorly suited to real-time molecular epidemiology. Here, we present stringMLST, an assembly- and alignment-free, lightweight, platform-independent program capable of rapidly typing bacterial isolates directly from raw sequence reads. The program implements a simple hash table data structure to find exact matches between short sequence strings (k-mers) and an MLST allele library. We show that stringMLST is more accurate than, and an order of magnitude faster than, contemporary genome-based ST detection tools.
AVAILABILITY AND IMPLEMENTATION: The source code and documentation are available at http://jordan.biology.gatech.edu/page/software/stringMLST
CONTACT: lavanya.rishishwar@gatech.edu
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
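The hash-table idea described in the abstract can be illustrated with a minimal sketch. This is not the stringMLST source code; it only shows the general technique of indexing allele k-mers in a hash table and counting exact k-mer hits from raw reads. The function names (`build_kmer_index`, `call_alleles`), the loci `locusA` and `locusB`, the toy sequences, and the choice of k = 5 are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def build_kmer_index(alleles, k):
    """Map each k-mer to the set of (locus, allele_id) pairs whose sequence contains it."""
    index = defaultdict(set)
    for (locus, allele_id), seq in alleles.items():
        for i in range(len(seq) - k + 1):
            index[seq[i:i + k]].add((locus, allele_id))
    return index


def call_alleles(reads, index, k):
    """Count exact k-mer hits per allele and pick the best-supported allele per locus."""
    votes = defaultdict(Counter)
    for read in reads:
        for i in range(len(read) - k + 1):
            # Hash lookup only: no assembly and no alignment is performed.
            for locus, allele_id in index.get(read[i:i + k], ()):
                votes[locus][allele_id] += 1
    return {locus: counts.most_common(1)[0][0] for locus, counts in votes.items()}


# Hypothetical toy allele library: two loci, two alleles each (not real MLST data).
alleles = {
    ("locusA", 1): "ACGTACGTTAGC",
    ("locusA", 2): "ACGTACCTTAGC",
    ("locusB", 1): "TTGACCATGGCA",
    ("locusB", 2): "TTGACCTTGGCA",
}
# Hypothetical profile table mapping an allele combination to a sequence type.
profiles = {(2, 1): 7, (1, 1): 3}

k = 5
index = build_kmer_index(alleles, k)
reads = ["GTACCTTAG", "GACCATGGC"]  # toy reads drawn from locusA allele 2 and locusB allele 1
calls = call_alleles(reads, index, k)
sequence_type = profiles.get((calls["locusA"], calls["locusB"]))
print(calls, sequence_type)
```

A real tool would also need to handle reverse-complement reads, sequencing errors, ties between alleles, and much larger values of k; those details are omitted from this sketch.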