DREAM-Yara: an exact read mapper for very large databases with short update time.

Temesgen Hailemariam Dadi¹, Enrico Siragusa², Vitor C Piro³,⁴, Andreas Andrusch⁵, Enrico Seiler¹, Bernhard Y Renard³, Knut Reinert¹.

Abstract

Motivation: Mapping-based approaches have become limited in their application to very large sets of references, since computing an FM-index for very large databases (e.g. >10 GB) has become a bottleneck. This affects many analyses that need such an index as an essential step for approximate matching of NGS reads to reference databases. For instance, in typical metagenomics analyses, the reference sequences have become too large to compute a single full-text index on standard machines. Even on large-memory machines, computing such an index takes about 1 day of computing time. As a result, indices are rarely updated. Hence, it is desirable to create an alternative way of indexing while preserving fast search times.
Results: To solve the index construction and update problem, we propose the DREAM (Dynamic seaRchablE pArallel coMpressed index) framework and provide an implementation. The first main contribution is the introduction of an approximate search distributor via a novel use of Bloom filters. We combine several Bloom filters to form an interleaved Bloom filter and use this new data structure to quickly exclude reads for parts of the databases where they cannot match. This allows us to keep the databases in several indices, which can be easily rebuilt if parts are updated, while maintaining a fast search time. The second main contribution is an implementation of DREAM-Yara, a distributed version of a fully sensitive read mapper under the DREAM framework.
Availability and implementation: https://gitlab.com/pirovc/dream_yara/.
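
The interleaved Bloom filter mentioned in the Results can be illustrated with a minimal Python sketch. This is not the authors' implementation: the class and function names (`InterleavedBloomFilter`, `candidate_bins`), the BLAKE2b-based hashing, and all parameter values are assumptions chosen for illustration. The sketch also assumes that bins are selected with the k-mer counting lemma (a read of length m with at most e errors shares at least m - k + 1 - e*k k-mers with its origin), which the abstract does not spell out. Each row holds one bit per database bin, so a single lookup answers the membership question for all bins at once.

```python
import hashlib


class InterleavedBloomFilter:
    """Illustrative sketch: one Bloom filter per bin, stored interleaved.

    rows[p] is an integer whose bit j is the bit at hash position p
    of the Bloom filter that belongs to bin j.
    """

    def __init__(self, num_bins, bits_per_bin, num_hashes=3):
        self.num_bins = num_bins
        self.bits_per_bin = bits_per_bin
        self.num_hashes = num_hashes
        self.rows = [0] * bits_per_bin

    def _positions(self, kmer):
        # Hypothetical hashing scheme: BLAKE2b with a different salt per hash function.
        for seed in range(self.num_hashes):
            digest = hashlib.blake2b(
                kmer.encode(), digest_size=8, salt=bytes([seed])
            ).digest()
            yield int.from_bytes(digest, "big") % self.bits_per_bin

    def add(self, kmer, bin_index):
        for pos in self._positions(kmer):
            self.rows[pos] |= 1 << bin_index

    def query(self, kmer):
        """Return a bitmask; bit j is set if bin j may contain the k-mer."""
        mask = (1 << self.num_bins) - 1
        for pos in self._positions(kmer):
            mask &= self.rows[pos]
        return mask


def kmers(seq, k):
    return (seq[i:i + k] for i in range(len(seq) - k + 1))


def build_filter(bins, k, bits_per_bin=4096):
    """bins is a list of lists of reference sequences, one inner list per bin."""
    ibf = InterleavedBloomFilter(len(bins), bits_per_bin)
    for bin_index, sequences in enumerate(bins):
        for seq in sequences:
            for kmer in kmers(seq, k):
                ibf.add(kmer, bin_index)
    return ibf


def candidate_bins(ibf, read, k, max_errors):
    """Bins that cannot be excluded for this read (assumed k-mer counting lemma)."""
    counts = [0] * ibf.num_bins
    for kmer in kmers(read, k):
        hits = ibf.query(kmer)
        for bin_index in range(ibf.num_bins):
            if (hits >> bin_index) & 1:
                counts[bin_index] += 1
    # A non-positive threshold excludes nothing, so every bin stays a candidate.
    threshold = len(read) - k + 1 - k * max_errors
    return [j for j, count in enumerate(counts) if count >= threshold]


# Toy data: two bins with one made-up reference sequence each.
toy_bins = [["ACGTACGTTAGCCGATAGGCTTAACG"], ["TTGACCAGTGGATCCATGCAAGTCCA"]]
toy_filter = build_filter(toy_bins, k=4)
toy_read = toy_bins[0][0][3:23]
print(candidate_bins(toy_filter, toy_read, k=4, max_errors=1))
```

Bloom filters produce false positives but no false negatives, so a bin reported here may turn out not to contain the read, while a bin that truly contains it within the error bound is never excluded. In a DREAM-style setup, only the per-bin indices of the reported bins would then need to be searched, and updating one bin would only require rebuilding that bin's index and its bits in the filter.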

Year: 2018 PMID: 30423080 DOI: 10.1093/bioinformatics/bty567

Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937

Cited by: 11 in total

1. ReadBouncer: precise and scalable adaptive sampling for nanopore sequencing.

Authors: Jens-Uwe Ulrich; Ahmad Lutfi; Kilian Rutzen; Bernhard Y Renard
Journal: Bioinformatics Date: 2022-06-24 Impact factor: 6.931

2. SPRISS: Approximating Frequent K-mers by Sampling Reads, and Applications.

Authors: Diego Santoro; Leonardo Pellegrina; Matteo Comin; Fabio Vandin
Journal: Bioinformatics Date: 2022-05-18 Impact factor: 6.931

3. To Petabytes and beyond: recent advances in probabilistic and signal processing algorithms and their application to metagenomics.

Authors: R A Leo Elworth; Qi Wang; Pavan K Kota; C J Barberan; Benjamin Coleman; Advait Balaji; Gaurav Gupta; Richard G Baraniuk; Anshumali Shrivastava; Todd J Treangen
Journal: Nucleic Acids Res Date: 2020-06-04 Impact factor: 16.971

4. Featherweight long read alignment using partitioned reference indexes.

Authors: Hasindu Gamaarachchi; Sri Parameswaran; Martin A Smith
Journal: Sci Rep Date: 2019-03-13 Impact factor: 4.379

5. Where did you come from, where did you go: Refining metagenomic analysis tools for horizontal gene transfer characterisation.

Authors: Enrico Seiler; Kathrin Trappe; Bernhard Y Renard
Journal: PLoS Comput Biol Date: 2019-07-23 Impact factor: 4.475

6. Data structures based on k-mers for querying large collections of sequencing data sets. (Review)

Authors: Camille Marchet; Christina Boucher; Simon J Puglisi; Paul Medvedev; Mikaël Salson; Rayan Chikhi
Journal: Genome Res Date: 2020-12-16 Impact factor: 9.043

7. Technology dictates algorithms: recent developments in read alignment. (Review)

Authors: Mohammed Alser; Jeremy Rotman; Onur Mutlu; Serghei Mangul; Dhrithi Deshpande; Kodi Taraszka; Huwenbo Shi; Pelin Icer Baykal; Harry Taegyun Yang; Victor Xue; Sergey Knyazev; Benjamin D Singer; Brunilda Balliu; David Koslicki; Pavel Skums; Alex Zelikovsky; Can Alkan
Journal: Genome Biol Date: 2021-08-26 Impact factor: 13.583