A novel data structure to support ultra-fast taxonomic classification of metagenomic sequences with k-mer signatures.

Xinan Liu, Ye Yu, Jinpeng Liu, Corrine F Elliott, Chen Qian, Jinze Liu.

Abstract

Motivation: Metagenomic read classification is a critical step in the identification and quantification of microbial species sampled by high-throughput sequencing. Although many algorithms have been developed to date, they incur significant memory and/or computational costs. Due to the growing popularity of metagenomic data in both basic science and clinical applications, as well as the increasing volume of data being generated, efficient and accurate algorithms are in high demand.
Results: We introduce MetaOthello, a probabilistic hashing classifier for metagenomic sequencing reads. The algorithm employs a novel data structure, called l-Othello, to support efficient querying of a taxon using its k-mer signatures. MetaOthello is an order of magnitude faster than the current state-of-the-art algorithms Kraken and Clark, and requires only one-third of the RAM. In comparison to Kaiju, a metagenomic classification tool using protein sequences instead of genomic sequences, MetaOthello is three times faster and exhibits 20-30% higher classification sensitivity. We report comparative analyses of both scalability and accuracy using a number of simulated and empirical datasets.
Availability and implementation: MetaOthello is a stand-alone program implemented in C++. The current version (1.0) is accessible via https://doi.org/10.5281/zenodo.808941.
Contact: liuj@cs.uky.edu.
Supplementary information: Supplementary data are available at Bioinformatics online.
© The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com
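
Illustrative sketch (not part of the published abstract): the idea of classifying a read by its k-mer signatures can be shown with a toy example. This is a minimal sketch under stated assumptions, not the MetaOthello implementation: a plain Python dict stands in for l-Othello, so it reproduces neither its memory savings nor its probabilistic behaviour, and the function names (`kmers`, `build_index`, `classify`), the majority-vote rule, and the toy sequences are all hypothetical.

```python
from collections import Counter


def kmers(seq, k):
    """Yield every length-k substring of seq, left to right."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def build_index(references, k):
    """Map each k-mer to the one taxon it occurs in.

    k-mers seen in more than one taxon are dropped, so every
    remaining key acts as a taxon-specific signature.
    (Assumption: a dict replaces the l-Othello structure.)
    """
    index = {}
    shared = set()
    for taxon, seq in references.items():
        for km in set(kmers(seq, k)):
            if km in shared:
                continue
            if km in index and index[km] != taxon:
                # Seen in a different taxon: no longer a signature.
                del index[km]
                shared.add(km)
            else:
                index[km] = taxon
    return index


def classify(read, index, k):
    """Return the taxon with the most signature k-mer hits, or None."""
    votes = Counter(index[km] for km in kmers(read, k) if km in index)
    if not votes:
        return None
    return votes.most_common(1)[0][0]


# Toy reference sequences (hypothetical, for illustration only).
toy_refs = {"taxon_A": "ACGTACGTTTGA", "taxon_B": "GGCCTTAAGGCA"}
toy_index = build_index(toy_refs, k=4)
toy_call = classify("CGTACGTT", toy_index, k=4)
```

A read whose k-mers match no signature is left unclassified (`None`); how the real tool handles such reads is not described in the abstract.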

Year:  2018        PMID: 29036588      PMCID: PMC5870563          DOI: 10.1093/bioinformatics/btx432

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


References (25 in total; first 10 shown)

1.  Community structure and metabolism through reconstruction of microbial genomes from the environment.

Authors:  Gene W Tyson; Jarrod Chapman; Philip Hugenholtz; Eric E Allen; Rachna J Ram; Paul M Richardson; Victor V Solovyev; Edward M Rubin; Daniel S Rokhsar; Jillian F Banfield
Journal:  Nature       Date:  2004-02-01       Impact factor: 49.962

2.  Integrative analysis of environmental sequences using MEGAN4.

Authors:  Daniel H Huson; Suparna Mitra; Hans-Joachim Ruscheweyh; Nico Weber; Stephan C Schuster
Journal:  Genome Res       Date:  2011-06-20       Impact factor: 9.043

3.  Higher classification sensitivity of short metagenomic reads with CLARK-S.

Authors:  Rachid Ounit; Stefano Lonardi
Journal:  Bioinformatics       Date:  2016-08-18       Impact factor: 6.937

4.  Metagenomic microbial community profiling using unique clade-specific marker genes.

Authors:  Nicola Segata; Levi Waldron; Annalisa Ballarini; Vagheesh Narasimhan; Olivier Jousson; Curtis Huttenhower
Journal:  Nat Methods       Date:  2012-06-10       Impact factor: 28.547

5.  Structure, function and diversity of the healthy human microbiome.

Authors:  Human Microbiome Project Consortium
Journal:  Nature       Date:  2012-06-13       Impact factor: 49.962

6.  NBC: the Naive Bayes Classification tool webserver for taxonomic classification of metagenomic reads.

Authors:  Gail L Rosen; Erin R Reichenberger; Aaron M Rosenfeld
Journal:  Bioinformatics       Date:  2010-11-08       Impact factor: 6.937

7.  Scalable metagenomic taxonomy classification using a reference genome database.

Authors:  Sasha K Ames; David A Hysom; Shea N Gardner; G Scott Lloyd; Maya B Gokhale; Jonathan E Allen
Journal:  Bioinformatics       Date:  2013-07-04       Impact factor: 6.937

8.  Centrifuge: rapid and sensitive classification of metagenomic sequences.

Authors:  Daehwan Kim; Li Song; Florian P Breitwieser; Steven L Salzberg
Journal:  Genome Res       Date:  2016-10-17       Impact factor: 9.043

9.  Strain/species identification in metagenomes using genome-specific markers.

Authors:  Qichao Tu; Zhili He; Jizhong Zhou
Journal:  Nucleic Acids Res       Date:  2014-02-12       Impact factor: 16.971

10.  Fast and sensitive taxonomic classification for metagenomics with Kaiju.

Authors:  Peter Menzel; Kim Lee Ng; Anders Krogh
Journal:  Nat Commun       Date:  2016-04-13       Impact factor: 14.919

Cited by (11 in total; first 10 shown)

1.  Benchmarking Metagenomics Tools for Taxonomic Classification. [Review]

Authors:  Simon H Ye; Katherine J Siddle; Daniel J Park; Pardis C Sabeti
Journal:  Cell       Date:  2019-08-08       Impact factor: 41.582

2.  Nanopore sequencing of a monkeypox virus strain isolated from a pustular lesion in the Central African Republic.

Authors:  Mathias Vandenbogaert; Aurélia Kwasiborski; Ella Gonofio; Stéphane Descorps-Declère; Benjamin Selekon; Andriniaina Andy Nkili Meyong; Rita Sem Ouilibona; Antoine Gessain; Jean-Claude Manuguerra; Valérie Caro; Emmanuel Nakoune; Nicolas Berthet
Journal:  Sci Rep       Date:  2022-06-24       Impact factor: 4.996

3.  IDseq-An open source cloud-based pipeline and analysis service for metagenomic pathogen detection and monitoring.

Authors:  Katrina L Kalantar; Tiago Carvalho; Charles F A de Bourcy; Boris Dimitrov; Greg Dingle; Rebecca Egger; Julie Han; Olivia B Holmes; Yun-Fang Juan; Ryan King; Andrey Kislyuk; Michael F Lin; Maria Mariano; Todd Morse; Lucia V Reynoso; David Rissato Cruz; Jonathan Sheu; Jennifer Tang; James Wang; Mark A Zhang; Emily Zhong; Vida Ahyong; Sreyngim Lay; Sophana Chea; Jennifer A Bohl; Jessica E Manning; Cristina M Tato; Joseph L DeRisi
Journal:  Gigascience       Date:  2020-10-15       Impact factor: 6.524

4.  To Petabytes and beyond: recent advances in probabilistic and signal processing algorithms and their application to metagenomics.

Authors:  R A Leo Elworth; Qi Wang; Pavan K Kota; C J Barberan; Benjamin Coleman; Advait Balaji; Gaurav Gupta; Richard G Baraniuk; Anshumali Shrivastava; Todd J Treangen
Journal:  Nucleic Acids Res       Date:  2020-06-04       Impact factor: 16.971

5.  Strain-level metagenomic assignment and compositional estimation for long reads with MetaMaps.

Authors:  Alexander T Dilthey; Chirag Jain; Sergey Koren; Adam M Phillippy
Journal:  Nat Commun       Date:  2019-07-11       Impact factor: 14.919

6.  Unique k-mers as Strain-Specific Barcodes for Phylogenetic Analysis and Natural Microbiome Profiling.

Authors:  Valery V Panyukov; Sergey S Kiselev; Olga N Ozoline
Journal:  Int J Mol Sci       Date:  2020-01-31       Impact factor: 5.923

7.  Specific Microbial Taxa and Functional Capacity Contribute to Chicken Abdominal Fat Deposition.

Authors:  Hai Xiang; Jiankang Gan; Daoshu Zeng; Jing Li; Hui Yu; Haiquan Zhao; Ying Yang; Shuwen Tan; Gen Li; Chaowei Luo; Zhuojun Xie; Guiping Zhao; Hua Li
Journal:  Front Microbiol       Date:  2021-03-17       Impact factor: 5.640

8.  SeqOthello: querying RNA-seq experiments at scale.

Authors:  Ye Yu; Jinpeng Liu; Xinan Liu; Yi Zhang; Eamonn Magner; Erik Lehnert; Chen Qian; Jinze Liu
Journal:  Genome Biol       Date:  2018-10-19       Impact factor: 13.583

9.  Sc-ncDNAPred: A Sequence-Based Predictor for Identifying Non-coding DNA in Saccharomyces cerevisiae.

Authors:  Wenying He; Ying Ju; Xiangxiang Zeng; Xiangrong Liu; Quan Zou
Journal:  Front Microbiol       Date:  2018-09-12       Impact factor: 5.640

10.  Microbial Diversity and Metabolic Potential in the Stratified Sansha Yongle Blue Hole in the South China Sea.

Authors:  Peiqing He; Linping Xie; Xuelei Zhang; Jiang Li; Xuezheng Lin; Xinming Pu; Chao Yuan; Ziwen Tian; Jie Li
Journal:  Sci Rep       Date:  2020-04-06       Impact factor: 4.379
