KAnalyze: a fast versatile pipelined k-mer toolkit.

Peter Audano, Fredrik Vannberg.

Abstract

MOTIVATION: Converting nucleotide sequences into short overlapping fragments of uniform length, k-mers, is a common step in many bioinformatics applications. While existing software packages count k-mers, few are optimized for speed, offer an application programming interface (API) or a graphical interface, or contain features that make them extensible and maintainable. We designed KAnalyze to compete with the fastest k-mer counters, to produce reliable output and to support future development efforts through well-architected, documented and testable code. Currently, KAnalyze can output k-mer counts in a sorted tab-delimited file or stream k-mers as they are read. KAnalyze can process large datasets with 2 GB of memory. This project is implemented in Java 7, and the command line interface (CLI) is designed to integrate into pipelines written in any language.
RESULTS: As a k-mer counter, KAnalyze outperforms Jellyfish, DSK and a pipeline built on Perl and Linux utilities. Through extensive unit and system testing, we have verified that KAnalyze produces the correct k-mer counts over multiple datasets and k-mer sizes.
AVAILABILITY AND IMPLEMENTATION: KAnalyze is available on SourceForge: https://sourceforge.net/projects/kanalyze/.
© The Author 2014. Published by Oxford University Press.
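
The k-mer decomposition described in the abstract can be illustrated with a minimal Python sketch. It is not KAnalyze's implementation (which is written in Java) and does not use its API; the function names `count_kmers` and `format_counts` are hypothetical, and skipping windows that contain ambiguous bases such as N is an assumption made for this example only. The sketch slides a window of length k along one sequence, counts each fragment, and renders the counts as sorted tab-delimited lines, loosely mirroring the output format the abstract mentions.

```python
from collections import Counter


def count_kmers(sequence: str, k: int) -> Counter:
    """Count every overlapping k-mer in one nucleotide sequence."""
    if k <= 0:
        raise ValueError("k must be a positive integer")
    sequence = sequence.upper()
    counts = Counter()
    # A sequence of length n yields n - k + 1 overlapping windows.
    for start in range(len(sequence) - k + 1):
        kmer = sequence[start:start + k]
        # Assumption for this sketch: skip windows with ambiguous bases (e.g. N).
        if set(kmer) <= set("ACGT"):
            counts[kmer] += 1
    return counts


def format_counts(counts: Counter) -> str:
    """Render counts as sorted, tab-delimited 'kmer<TAB>count' lines."""
    return "\n".join(f"{kmer}\t{count}" for kmer, count in sorted(counts.items()))


if __name__ == "__main__":
    # Illustrative input only; real datasets would be read from sequence files.
    print(format_counts(count_kmers("ACGTACGN", 3)))
```

Holding all counts in an in-memory `Counter` is only practical for small inputs. A tool that handles large datasets within a fixed memory budget, as the abstract says KAnalyze does, needs a different strategy, which this sketch does not attempt.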

Year:  2014        PMID: 24642064      PMCID: PMC4080738          DOI: 10.1093/bioinformatics/btu152

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  5 in total

1.  A fast, lock-free approach for efficient parallel counting of occurrences of k-mers.

Authors:  Guillaume Marçais; Carl Kingsford
Journal:  Bioinformatics       Date:  2011-01-07       Impact factor: 6.937

2.  DSK: k-mer counting with very low memory usage.

Authors:  Guillaume Rizk; Dominique Lavenier; Rayan Chikhi
Journal:  Bioinformatics       Date:  2013-01-16       Impact factor: 6.937

3.  Mutation identification by direct comparison of whole-genome sequencing data from mutant and wild-type individuals using k-mers.

Authors:  Karl J V Nordström; Maria C Albani; Geo Velikkakam James; Caroline Gutjahr; Benjamin Hartwig; Franziska Turck; Uta Paszkowski; George Coupland; Korbinian Schneeberger
Journal:  Nat Biotechnol       Date:  2013-03-10       Impact factor: 54.908

4.  Best practices for scientific computing.

Authors:  Greg Wilson; D A Aruliah; C Titus Brown; Neil P Chue Hong; Matt Davis; Richard T Guy; Steven H D Haddock; Kathryn D Huff; Ian M Mitchell; Mark D Plumbley; Ben Waugh; Ethan P White; Paul Wilson
Journal:  PLoS Biol       Date:  2014-01-07       Impact factor: 8.029

5.  UniPROBE: an online database of protein binding microarray data on protein-DNA interactions.

Authors:  Daniel E Newburger; Martha L Bulyk
Journal:  Nucleic Acids Res       Date:  2008-10-08       Impact factor: 16.971

  14 in total

1.  KCMBT: a k-mer Counter based on Multiple Burst Trees.

Authors:  Abdullah-Al Mamun; Soumitra Pal; Sanguthevar Rajasekaran
Journal:  Bioinformatics       Date:  2016-06-09       Impact factor: 6.937

2.  SPRISS: Approximating Frequent K-mers by Sampling Reads, and Applications.

Authors:  Diego Santoro; Leonardo Pellegrina; Matteo Comin; Fabio Vandin
Journal:  Bioinformatics       Date:  2022-05-18       Impact factor: 6.931

3.  Familial long-read sequencing increases yield of de novo mutations.

Authors:  Michelle D Noyes; William T Harvey; David Porubsky; Arvis Sulovari; Ruiyang Li; Nicholas R Rose; Peter A Audano; Katherine M Munson; Alexandra P Lewis; Kendra Hoekzema; Tuomo Mantere; Tina A Graves-Lindsay; Ashley D Sanders; Sara Goodwin; Melissa Kramer; Younes Mokrab; Michael C Zody; Alexander Hoischen; Jan O Korbel; W Richard McCombie; Evan E Eichler
Journal:  Am J Hum Genet       Date:  2022-03-14       Impact factor: 11.043

4.  PPE37 Is Essential for Mycobacterium tuberculosis Heme-Iron Acquisition (HIA), and a Defective PPE37 in Mycobacterium bovis BCG Prevents HIA.

Authors:  Michael V Tullius; Susana Nava; Marcus A Horwitz
Journal:  Infect Immun       Date:  2019-01-24       Impact factor: 3.441

5.  These are not the k-mers you are looking for: efficient online k-mer counting using a probabilistic data structure.

Authors:  Qingpeng Zhang; Jason Pell; Rosangela Canino-Koning; Adina Chuang Howe; C Titus Brown
Journal:  PLoS One       Date:  2014-07-25       Impact factor: 3.240

6.  Vipie: web pipeline for parallel characterization of viral populations from multiple NGS samples.

Authors:  Jake Lin; Lenka Kramna; Reija Autio; Heikki Hyöty; Matti Nykter; Ondrej Cinek
Journal:  BMC Genomics       Date:  2017-05-15       Impact factor: 3.969

7.  Enhanced Fitness of a Helicobacter pylori babA Mutant in a Murine Model.

Authors:  M Lorena Harvey; Aung Soe Lin; Lili Sun; Tatsuki Koyama; Jennifer H B Shuman; John T Loh; Holly M Scott Algood; Matthew B Scholz; Mark S McClain; Timothy L Cover
Journal:  Infect Immun       Date:  2021-07-12       Impact factor: 3.441

8.  Taxonomer: an interactive metagenomics analysis portal for universal pathogen detection and host mRNA expression profiling.

Authors:  Steven Flygare; Keith Simmon; Chase Miller; Yi Qiao; Brett Kennedy; Tonya Di Sera; Erin H Graf; Keith D Tardif; Aurélie Kapusta; Shawn Rynearson; Chris Stockmann; Krista Queen; Suxiang Tong; Karl V Voelkerding; Anne Blaschke; Carrie L Byington; Seema Jain; Andrew Pavia; Krow Ampofo; Karen Eilbeck; Gabor Marth; Mark Yandell; Robert Schlaberg
Journal:  Genome Biol       Date:  2016-05-26       Impact factor: 13.583

9.  PhyloPythiaS+: a self-training method for the rapid reconstruction of low-ranking taxonomic bins from metagenomes.

Authors:  Ivan Gregor; Johannes Dröge; Melanie Schirmer; Christopher Quince; Alice C McHardy
Journal:  PeerJ       Date:  2016-02-08       Impact factor: 2.984

10.  Assessment of k-mer spectrum applicability for metagenomic dissimilarity analysis.

Authors:  Veronika B Dubinkina; Dmitry S Ischenko; Vladimir I Ulyantsev; Alexander V Tyakht; Dmitry G Alexeev
Journal:  BMC Bioinformatics       Date:  2016-01-16       Impact factor: 3.169
