
Superbubbles revisited.

Fabian Gärtner, Lydia Müller, Peter F Stadler.

Abstract

BACKGROUND: Superbubbles are distinctive subgraphs in directed graphs that play an important role in assembly algorithms for high-throughput sequencing (HTS) data. Their practical importance derives from the fact that they are connected to their host graph by a single entrance and a single exit vertex, which allows them to be handled independently. Efficient algorithms for the enumeration of superbubbles are therefore important for the processing of HTS data. Superbubbles can be identified within the strongly connected components of the input digraph after transforming them into directed acyclic graphs (DAGs). The algorithm by Sung et al. (IEEE ACM Trans Comput Biol Bioinform 12:770-777, 2015) achieves this task in O(m log m) time. The extraction of superbubbles from the transformed components was later improved by Brankovic et al. (Theor Comput Sci 609:374-383, 2016), resulting in an overall O(m + n)-time algorithm.
RESULTS: A re-analysis of the mathematical structure of superbubbles showed that the construction of auxiliary DAGs from the strongly connected components in the work of Sung et al. missed some details that can lead to the reporting of false positive superbubbles. We propose an alternative, even simpler auxiliary graph that solves the problem and retains the linear running time for general digraphs. Furthermore, we describe a simpler, space-efficient O(m + n)-time algorithm for detecting superbubbles in DAGs that uses only simple data structures.
IMPLEMENTATION: We present a reference implementation of the algorithm that accepts many commonly used formats for the input graph and provides convenient access to the improved algorithm: https://github.com/Fabianexe/Superbubble.
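To make the single-entrance, single-exit property concrete, the following is a minimal Python sketch of a naive detector. It is not the linear-time algorithm described in this paper, nor the code in the reference implementation. It follows the simple per-entrance traversal commonly attributed to Onodera et al., which takes O(n(m + n)) time overall. The function names `superbubble_exit` and `find_superbubbles` and the edge-list input format are assumptions made for illustration, and the input is assumed to be a simple digraph without parallel edges.

```python
from collections import defaultdict


def superbubble_exit(s, children, parents):
    """Return the exit t of a superbubble entered at s, or None if there is none."""
    visited = set()   # vertices already expanded
    seen = set()      # vertices discovered but not yet expanded
    stack = [s]
    while stack:
        v = stack.pop()
        visited.add(v)
        seen.discard(v)
        if not children.get(v):
            return None                      # dead end (tip) inside the candidate
        for u in children[v]:
            if u == s:
                return None                  # cycle back to the entrance
            seen.add(u)
            # Expand u only once every path into it has been accounted for.
            if all(p in visited for p in parents.get(u, ())):
                stack.append(u)
        # Exactly one open vertex left: every path from s has merged again.
        if len(stack) == 1 and len(seen) == 1:
            t = stack[0]
            return None if s in children.get(t, ()) else t
    return None


def find_superbubbles(edges):
    """Naively try every vertex as an entrance; returns sorted (entrance, exit) pairs."""
    children, parents = defaultdict(list), defaultdict(list)
    vertices = set()
    for u, v in edges:
        children[u].append(v)
        parents[v].append(u)
        vertices.update((u, v))
    result = []
    for s in vertices:
        t = superbubble_exit(s, children, parents)
        if t is not None:
            result.append((s, t))
    return sorted(result)


# A diamond s -> {a, b} -> t followed by a single edge t -> x.
diamond = [("s", "a"), ("s", "b"), ("a", "t"), ("b", "t"), ("t", "x")]
print(find_superbubbles(diamond))  # [('s', 't'), ('t', 'x')]
```

Under the definition used by this traversal, the single edge t -> x also counts as a (trivial) superbubble. The sketch only serves to illustrate the definition; for anything beyond toy inputs, the linear-time approach of the paper is the appropriate choice.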


Keywords:  Genome assembly; Linear time algorithm; Superbubble; de Bruijn graph

Year:  2018        PMID: 30519278      PMCID: PMC6271648          DOI: 10.1186/s13015-018-0134-3

Source DB:  PubMed          Journal:  Algorithms Mol Biol        ISSN: 1748-7188            Impact factor:   1.405


  5 in total

1.  An Eulerian path approach to DNA fragment assembly.

Authors:  P A Pevzner; H Tang; M S Waterman
Journal:  Proc Natl Acad Sci U S A       Date:  2001-08-14       Impact factor: 11.205

2.  An O(m log m)-Time Algorithm for Detecting Superbubbles.

Authors:  Wing-Kin Sung; Kunihiko Sadakane; Tetsuo Shibuya; Abha Belorkar; Iana Pyrogova
Journal:  IEEE/ACM Trans Comput Biol Bioinform       Date:  2015 Jul-Aug       Impact factor: 3.710

3.  Velvet: algorithms for de novo short read assembly using de Bruijn graphs.

Authors:  Daniel R Zerbino; Ewan Birney
Journal:  Genome Res       Date:  2008-03-18       Impact factor: 9.043

4.  Superbubbles, Ultrabubbles, and Cacti.

Authors:  Benedict Paten; Jordan M Eizenga; Yohei M Rosen; Adam M Novak; Erik Garrison; Glenn Hickey
Journal:  J Comput Biol       Date:  2018-02-20       Impact factor: 1.479

5.  Coordinate systems for supergenomes.

Authors:  Fabian Gärtner; Christian Höner Zu Siederdissen; Lydia Müller; Peter F Stadler
Journal:  Algorithms Mol Biol       Date:  2018-09-24       Impact factor: 1.405

  1 in total

1.  A tri-tuple coordinate system derived for fast and accurate analysis of the colored de Bruijn graph-based pangenomes.

Authors:  Jindan Guo; Erli Pang; Hongtao Song; Kui Lin
Journal:  BMC Bioinformatics       Date:  2021-05-27       Impact factor: 3.169

