TwoPaCo: an efficient algorithm to build the compacted de Bruijn graph from many complete genomes.

Ilia Minkin, Son Pham, Paul Medvedev.

Abstract

MOTIVATION: de Bruijn graphs have been proposed as a data structure to facilitate the analysis of related whole-genome sequences, in both population and comparative genomics settings. However, current approaches do not scale well to many genomes of large size (such as mammalian genomes).
RESULTS: In this article, we present TwoPaCo, a simple, scalable, low-memory algorithm for the direct construction of the compacted de Bruijn graph from a set of complete genomes. We demonstrate that it can construct the graph for 100 simulated human genomes in less than a day, and for eight real primate genomes in under 2 h, on a typical shared-memory machine. We believe that this progress will enable novel biological analyses of hundreds of mammalian-sized genomes.
AVAILABILITY AND IMPLEMENTATION: Our code and data are available for download from github.com/medvedevgroup/TwoPaCo.
CONTACT: ium125@psu.edu.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
© The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com

Year:  2017        PMID: 27659452     DOI: 10.1093/bioinformatics/btw609

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  19 in total

1.  The design and construction of reference pangenome graphs with minigraph.

Authors:  Heng Li; Xiaowen Feng; Chong Chu
Journal:  Genome Biol       Date:  2020-10-16       Impact factor: 13.583

2.  Pangenome Graphs. (Review)

Authors:  Jordan M Eizenga; Adam M Novak; Jonas A Sibbesen; Simon Heumos; Ali Ghaffaari; Glenn Hickey; Xian Chang; Josiah D Seaman; Robin Rounthwaite; Jana Ebler; Mikko Rautiainen; Shilpa Garg; Benedict Paten; Tobias Marschall; Jouni Sirén; Erik Garrison
Journal:  Annu Rev Genomics Hum Genet       Date:  2020-05-26       Impact factor: 8.929

3.  The effect of genome graph expressiveness on the discrepancy between genome graph distance and string set distance.

Authors:  Yutong Qiu; Carl Kingsford
Journal:  Bioinformatics       Date:  2022-06-24       Impact factor: 6.931

4.  Scalable, ultra-fast, and low-memory construction of compacted de Bruijn graphs with Cuttlefish 2.

Authors:  Jamshed Khan; Marek Kokot; Sebastian Deorowicz; Rob Patro
Journal:  Genome Biol       Date:  2022-09-08       Impact factor: 17.906

5.  Multiplex de Bruijn graphs enable genome assembly from long, high-fidelity reads.

Authors:  Anton Bankevich; Andrey V Bzikadze; Mikhail Kolmogorov; Dmitry Antipov; Pavel A Pevzner
Journal:  Nat Biotechnol       Date:  2022-02-28       Impact factor: 68.164

6.  Faucet: streaming de novo assembly graph construction.

Authors:  Roye Rozov; Gil Goldshlager; Eran Halperin; Ron Shamir
Journal:  Bioinformatics       Date:  2018-01-01       Impact factor: 6.937

7.  A space and time-efficient index for the compacted colored de Bruijn graph.

Authors:  Fatemeh Almodaresi; Hirak Sarkar; Avi Srivastava; Rob Patro
Journal:  Bioinformatics       Date:  2018-07-01       Impact factor: 6.937

8.  Cuttlefish: fast, parallel and low-memory compaction of de Bruijn graphs from large-scale genome collections.

Authors:  Jamshed Khan; Rob Patro
Journal:  Bioinformatics       Date:  2021-07-12       Impact factor: 6.937

9.  Constructing small genome graphs via string compression.

Authors:  Yutong Qiu; Carl Kingsford
Journal:  Bioinformatics       Date:  2021-07-12       Impact factor: 6.937

10.  seq-seq-pan: building a computational pan-genome data structure on whole genome alignment.

Authors:  Christine Jandrasits; Piotr W Dabrowski; Stephan Fuchs; Bernhard Y Renard
Journal:  BMC Genomics       Date:  2018-01-15       Impact factor: 3.969
