MOTIVATION: The imperfect sequence data produced by next-generation sequencing technologies have motivated the development of a number of short-read error correctors in recent years. The majority of methods focus on the correction of substitution errors, which are the dominant error source in data produced by Illumina sequencing technology. Existing tools score high in either recall or precision, but not consistently high in both measures.

RESULTS: In this article, we present Musket, an efficient multistage k-mer-based corrector for Illumina short-read data. We use the k-mer spectrum approach and introduce three correction techniques in a multistage workflow: two-sided conservative correction, one-sided aggressive correction and voting-based refinement. Our performance evaluation results, in terms of correction quality and de novo genome assembly measures, reveal that Musket is consistently one of the top-performing correctors. In addition, Musket is multi-threaded using a master-slave model and demonstrates superior parallel scalability compared with all other evaluated correctors, as well as a highly competitive overall execution time.

AVAILABILITY: Musket is available at http://musket.sourceforge.net.
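The general k-mer spectrum idea named in the abstract can be illustrated with a minimal sketch. This is not Musket's implementation and does not reproduce its three correction stages. It only shows the underlying principle: k-mers that occur often across the reads are treated as trusted, and a read is repaired when exactly one single-base substitution makes all of its k-mers trusted. The function names (`kmer_spectrum`, `correct_read`) and the `min_count` threshold are hypothetical and chosen for illustration.

```python
from collections import Counter


def kmer_spectrum(reads, k):
    """Count every k-mer occurring in the given reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def all_trusted(seq, counts, k, min_count):
    """True if every k-mer of seq occurs at least min_count times."""
    return all(
        counts[seq[i:i + k]] >= min_count
        for i in range(len(seq) - k + 1)
    )


def correct_read(read, counts, k, min_count):
    """Illustrative correction (an assumption, not Musket's algorithm):
    apply a single substitution only if it is the unique one that makes
    every k-mer in the read trusted; otherwise return the read unchanged."""
    if all_trusted(read, counts, k, min_count):
        return read
    candidates = []
    for pos, base in enumerate(read):
        for alt in "ACGT":
            if alt == base:
                continue
            candidate = read[:pos] + alt + read[pos + 1:]
            if all_trusted(candidate, counts, k, min_count):
                candidates.append(candidate)
    # Stay conservative: correct only when the fix is unambiguous.
    return candidates[0] if len(candidates) == 1 else read


if __name__ == "__main__":
    true_read = "ACGTTGCAAGGCT"
    noisy_read = "ACGTTGAAAGGCT"  # one substitution (C -> A) at index 6
    reads = [true_read] * 5 + [noisy_read]
    spectrum = kmer_spectrum(reads, k=4)
    print(correct_read(noisy_read, spectrum, k=4, min_count=3))
```

Returning the read unchanged when several substitutions are possible favors precision over recall, which is the trade-off the abstract says existing tools struggle to balance.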