Journaled string tree - a scalable data structure for analyzing thousands of similar genomes on your laptop.

René Rahn, David Weese, Knut Reinert.

Abstract

MOTIVATION: Next-generation sequencing (NGS) has revolutionized biomedical research in the past decade and led to a continuous stream of developments in bioinformatics, addressing the need for fast and space-efficient solutions for analyzing NGS data. Researchers often need to analyze a set of genomic sequences that stem from closely related species or from individuals of the same species, so the analyzed sequences are similar. For analyses where local changes in the examined sequence induce only local changes in the results, it is desirable to avoid examining identical or similar regions repeatedly.
RESULTS: In this work, we provide a data type that exploits the data parallelism inherent in a set of similar sequences by analyzing shared regions only once. In real-world experiments, we show that algorithms that would otherwise scan each reference sequentially can be sped up by a factor of 115.
© The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

Year:  2014        PMID: 25028723     DOI: 10.1093/bioinformatics/btu438

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  11 in total

Review 1.  Pangenome Graphs.

Authors:  Jordan M Eizenga; Adam M Novak; Jonas A Sibbesen; Simon Heumos; Ali Ghaffaari; Glenn Hickey; Xian Chang; Josiah D Seaman; Robin Rounthwaite; Jana Ebler; Mikko Rautiainen; Shilpa Garg; Benedict Paten; Tobias Marschall; Jouni Sirén; Erik Garrison
Journal:  Annu Rev Genomics Hum Genet       Date:  2020-05-26       Impact factor: 8.929

Review 2.  Searching and Indexing Genomic Databases via Kernelization.

Authors:  Travis Gagie; Simon J Puglisi
Journal:  Front Bioeng Biotechnol       Date:  2015-02-09

3.  Sequence Factorization with Multiple References.

Authors:  Sebastian Wandelt; Ulf Leser
Journal:  PLoS One       Date:  2015-09-30       Impact factor: 3.240

4.  A representation of a compressed de Bruijn graph for pan-genome analysis that enables search.

Authors:  Timo Beller; Enno Ohlebusch
Journal:  Algorithms Mol Biol       Date:  2016-07-18       Impact factor: 1.405

Review 5.  Computational pan-genomics: status, promises and challenges.

Authors: 
Journal:  Brief Bioinform       Date:  2018-01-01       Impact factor: 11.622

6.  Bit-parallel sequence-to-graph alignment.

Authors:  Mikko Rautiainen; Veli Mäkinen; Tobias Marschall
Journal:  Bioinformatics       Date:  2019-10-01       Impact factor: 6.937

7.  Indexes of large genome collections on a PC.

Authors:  Agnieszka Danek; Sebastian Deorowicz; Szymon Grabowski
Journal:  PLoS One       Date:  2014-10-07       Impact factor: 3.240

Review 8.  Visual programming for next-generation sequencing data analytics.

Authors:  Franco Milicchio; Rebecca Rose; Jiang Bian; Jae Min; Mattia Prosperi
Journal:  BioData Min       Date:  2016-04-27       Impact factor: 2.522

9.  seq-seq-pan: building a computational pan-genome data structure on whole genome alignment.

Authors:  Christine Jandrasits; Piotr W Dabrowski; Stephan Fuchs; Bernhard Y Renard
Journal:  BMC Genomics       Date:  2018-01-15       Impact factor: 3.969

10.  Founder Reconstruction Enables Scalable and Seamless Pangenomic Analysis.

Authors:  Tuukka Norri; Bastien Cazaux; Saska Dönges; Daniel Valenzuela; Veli Mäkinen
Journal:  Bioinformatics       Date:  2021-07-14       Impact factor: 6.937
