MetaCache: context-aware classification of metagenomic reads using minhashing.

André Müller¹, Christian Hundt¹, Andreas Hildebrandt¹, Thomas Hankeln², Bertil Schmidt¹.

Abstract

MOTIVATION: Metagenomic shotgun sequencing studies are becoming increasingly popular, with prominent examples including the sequencing of human microbiomes and diverse environments. A fundamental computational problem in this context is read classification, i.e. the assignment of each read to a taxonomic label. Due to the large number of reads produced by modern high-throughput sequencing technologies and the rapidly increasing number of available reference genomes, corresponding software tools suffer from long runtimes, large memory requirements, or low accuracy.
RESULTS: We introduce MetaCache, a novel software tool for read classification using the big data technique minhashing. Our approach performs context-aware classification of reads by computing representative subsamples of k-mers within both probed reads and locally constrained regions of the reference genomes. As a result, MetaCache consumes significantly less memory than the state-of-the-art read classifiers Kraken and CLARK while achieving highly competitive sensitivity and precision at comparable speed. For example, using NCBI RefSeq draft and completed genomes with a total length of around 140 billion bases as reference, MetaCache's database consumes only 62 GB of memory, while both Kraken and CLARK fail to construct their respective databases on a workstation with 512 GB RAM. Our experimental results further show that classification accuracy continuously improves as the amount of reference genome data increases.
AVAILABILITY AND IMPLEMENTATION: MetaCache is open source software written in C++ and can be downloaded at http://github.com/muellan/metacache.
CONTACT: bertil.schmidt@uni-mainz.de.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
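
The minhashing idea described under RESULTS can be illustrated with a minimal Python sketch. It is not MetaCache's implementation (which is written in C++); the function names, the BLAKE2b hash, and the parameters k, s and window are assumptions chosen for illustration only. Each reference genome is cut into overlapping windows, each window is reduced to its s smallest k-mer hash values, and a read is assigned to the genome whose window shares the most sketch values with the read's own sketch.

```python
import hashlib
from collections import Counter, defaultdict


def kmer_hash(kmer):
    """Stable 64-bit hash of a k-mer (the hash function is an assumption)."""
    digest = hashlib.blake2b(kmer.encode("ascii"), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def sketch(seq, k, s):
    """Min-hash sketch: the s smallest distinct k-mer hash values of seq."""
    hashes = {kmer_hash(seq[i:i + k]) for i in range(len(seq) - k + 1)}
    return set(sorted(hashes)[:s])


def build_index(genomes, k, s, window):
    """Map each sketch value to the (genome, window number) pairs holding it.

    Assumes window >= k. Consecutive windows overlap by k - 1 bases so that
    no k-mer is lost at a window boundary.
    """
    index = defaultdict(set)
    stride = window - k + 1
    for name, seq in genomes.items():
        starts = range(0, max(len(seq) - k + 1, 1), stride)
        for w, start in enumerate(starts):
            for feature in sketch(seq[start:start + window], k, s):
                index[feature].add((name, w))
    return index


def classify(read, index, k, s):
    """Return the genome of the window sharing the most sketch values, or None."""
    hits = Counter()
    for feature in sketch(read, k, s):
        for location in index.get(feature, ()):
            hits[location] += 1
    if not hits:
        return None
    (name, _window), _count = hits.most_common(1)[0]
    return name
```

When s is much smaller than the number of k-mers per window, the index stores only a small subsample of all k-mers, which is the property the abstract credits for the reduced memory footprint. Reverse-complement handling and a taxonomy-aware decision rule are omitted from this sketch.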

Year: 2017 PMID: 28961782 DOI: 10.1093/bioinformatics/btx520

Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937

Cited

7 in total

1. MSC: a metagenomic sequence classification algorithm.

Authors: Subrata Saha; Jethro Johnson; Soumitra Pal; George M Weinstock; Sanguthevar Rajasekaran
Journal: Bioinformatics Date: 2019-09-01 Impact factor: 6.937

2. RainDrop: Rapid activation matrix computation for droplet-based single-cell RNA-seq reads.

Authors: Stefan Niebler; André Müller; Thomas Hankeln; Bertil Schmidt
Journal: BMC Bioinformatics Date: 2020-07-01 Impact factor: 3.169

3. Assembling Reads Improves Taxonomic Classification of Species.

Authors: Quang Tran; Vinhthuy Phan
Journal: Genes (Basel) Date: 2020-08-17 Impact factor: 4.096

4. LEMMI: a continuous benchmarking platform for metagenomics classifiers.

Authors: Mathieu Seppey; Mosè Manni; Evgeny M Zdobnov
Journal: Genome Res Date: 2020-07-02 Impact factor: 9.043

5. Downregulation of growth plate genes involved with the onset of femoral head separation in young broilers.

Authors: Adriana Mércia Guaratini Ibelli; Jane de Oliveira Peixoto; Ricardo Zanella; João José de Simoni Gouveia; Maurício Egídio Cantão; Luiz Lehmann Coutinho; Jorge Augusto Petroli Marchesi; Mariane Spudeit Dal Pizzol; Débora Ester Petry Marcelino; Mônica Corrêa Ledur
Journal: Front Physiol Date: 2022-08-08 Impact factor: 4.755

6. Locality-sensitive hashing enables efficient and scalable signal classification in high-throughput mass spectrometry raw data.

Authors: Konstantin Bob; David Teschner; Thomas Kemmer; David Gomez-Zepeda; Stefan Tenzer; Bertil Schmidt; Andreas Hildebrandt
Journal: BMC Bioinformatics Date: 2022-07-20 Impact factor: 3.307

7. expam: high-resolution analysis of metagenomes using distance trees.

Authors: Sean M Solari; Remy B Young; Vanessa R Marcelino; Samuel C Forster
Journal: Bioinformatics Date: 2022-10-14 Impact factor: 6.931