
Superbubbles, Ultrabubbles, and Cacti.

Benedict Paten, Jordan M Eizenga, Yohei M Rosen, Adam M Novak, Erik Garrison, Glenn Hickey.

Abstract

A superbubble is a type of directed acyclic subgraph with single distinct source and sink vertices. In genome assembly and genetics, the possible paths through a superbubble can be considered to represent the set of possible sequences at a location in a genome. Bidirected and biedged graphs are generalizations of digraphs that are increasingly being used to more fully represent genome assembly and variation problems. In this study, we define snarls and ultrabubbles, generalizations of superbubbles for bidirected and biedged graphs, and give an efficient algorithm for the detection of these more general structures. Key to this algorithm is the cactus graph, which, we show, encodes the nested decomposition of a graph into snarls and ultrabubbles within its structure. We propose and demonstrate empirically that this decomposition on bidirected and biedged graphs solves a fundamental problem by defining genetic sites for any collection of genomic variations, including complex structural variations, without the need for any single reference genome coordinate system. Further, the nesting of the decomposition gives a natural way to describe and model variations contained within large variations, a case not currently dealt with by existing formats [e.g., variant call format (VCF)].
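
To make the superbubble definition in the abstract concrete, here is a minimal Python sketch of the classic traversal for ordinary directed graphs: starting from a candidate source, it looks for the single sink that closes an acyclic, self-contained subgraph. This is not the snarl, ultrabubble, or cactus graph algorithm of the paper; it is a simple per-source check in the style of earlier superbubble detectors (commonly attributed to Onodera et al.). The function name `find_superbubble_exit` and the dict-of-lists graph encoding are assumptions made for illustration, and the graph is assumed to have no parallel edges.

```python
from collections import defaultdict


def find_superbubble_exit(graph, source):
    """Return the sink of the superbubble entered at `source`, or None.

    `graph` maps each vertex to a list of its children (a plain digraph;
    this hypothetical encoding is an assumption of the sketch).
    """
    # Build the parent sets once so we can test "all parents visited".
    parents = defaultdict(set)
    for v, children in graph.items():
        for u in children:
            parents[u].add(v)

    visited = set()   # vertices already expanded
    seen = set()      # vertices reached but not yet expanded
    stack = [source]
    while stack:
        v = stack.pop()
        visited.add(v)
        seen.discard(v)
        children = graph.get(v, ())
        if not children:
            return None            # dead end (tip): no single sink
        for u in children:
            if u == source:
                return None        # cycle back to the source
            seen.add(u)
            if parents[u] <= visited:
                stack.append(u)    # every way into u lies inside the bubble
        # Exactly one frontier vertex left and nothing else pending:
        # that vertex is the sink.
        if len(stack) == 1 and seen == {stack[0]}:
            sink = stack[0]
            if source in graph.get(sink, ()):
                return None        # sink loops back to the source
            return sink
    return None


# A small nested example: the bubble a..d sits inside the bubble s..t.
nested = {
    "s": ["a", "t"],
    "a": ["b", "c"],
    "b": ["d"],
    "c": ["d"],
    "d": ["t"],
    "t": [],
}
outer_sink = find_superbubble_exit(nested, "s")
inner_sink = find_superbubble_exit(nested, "a")
```

In this example the traversal from `s` closes at `t` and the traversal from `a` closes at `d`, which illustrates on a plain digraph the kind of nesting that the abstract describes for snarls and ultrabubbles.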

Keywords:  genome assembly; genome graphs; genomic variation; sequence analysis; variant discovery

Year:  2018        PMID: 29461862      PMCID: PMC6067107          DOI: 10.1089/cmb.2017.0251

Source DB:  PubMed          Journal:  J Comput Biol        ISSN: 1066-5277            Impact factor:   1.479


  References: 9 in total

1.  An Eulerian path approach to DNA fragment assembly.

Authors:  P A Pevzner; H Tang; M S Waterman
Journal:  Proc Natl Acad Sci U S A       Date:  2001-08-14       Impact factor: 11.205

2.  An O(m log m)-Time Algorithm for Detecting Superbubbles.

Authors:  Wing-Kin Sung; Kunihiko Sadakane; Tetsuo Shibuya; Abha Belorkar; Iana Pyrogova
Journal:  IEEE/ACM Trans Comput Biol Bioinform       Date:  2015 Jul-Aug       Impact factor: 3.710

3.  The fragment assembly string graph.

Authors:  Eugene W Myers
Journal:  Bioinformatics       Date:  2005-09-01       Impact factor: 6.937

4.  On the Number of Husimi Trees: I.

Authors:  F Harary; G E Uhlenbeck
Journal:  Proc Natl Acad Sci U S A       Date:  1953-04       Impact factor: 11.205

5.  Velvet: algorithms for de novo short read assembly using de Bruijn graphs.

Authors:  Daniel R Zerbino; Ewan Birney
Journal:  Genome Res       Date:  2008-03-18       Impact factor: 9.043

6.  Maximum likelihood genome assembly.

Authors:  Paul Medvedev; Michael Brudno
Journal:  J Comput Biol       Date:  2009-08       Impact factor: 1.479

7.  Cactus graphs for genome comparisons.

Authors:  Benedict Paten; Mark Diekhans; Dent Earl; John St John; Jian Ma; Bernard Suh; David Haussler
Journal:  J Comput Biol       Date:  2011-03       Impact factor: 1.479

8.  Breakpoint graphs and ancestral genome reconstructions.

Authors:  Max A Alekseyev; Pavel A Pevzner
Journal:  Genome Res       Date:  2009-02-13       Impact factor: 9.043

9.  A global reference for human genetic variation.

Authors:  Adam Auton; Lisa D Brooks; Richard M Durbin; Erik P Garrison; Hyun Min Kang; Jan O Korbel; Jonathan L Marchini; Shane McCarthy; Gil A McVean; Gonçalo R Abecasis
Journal:  Nature       Date:  2015-10-01       Impact factor: 49.962

  Cited by: 16 in total

1.  Pangenome Graphs. (Review)

Authors:  Jordan M Eizenga; Adam M Novak; Jonas A Sibbesen; Simon Heumos; Ali Ghaffaari; Glenn Hickey; Xian Chang; Josiah D Seaman; Robin Rounthwaite; Jana Ebler; Mikko Rautiainen; Shilpa Garg; Benedict Paten; Tobias Marschall; Jouni Sirén; Erik Garrison
Journal:  Annu Rev Genomics Hum Genet       Date:  2020-05-26       Impact factor: 8.929

2.  The effect of genome graph expressiveness on the discrepancy between genome graph distance and string set distance.

Authors:  Yutong Qiu; Carl Kingsford
Journal:  Bioinformatics       Date:  2022-06-24       Impact factor: 6.931

3.  A spectrum of free software tools for processing the VCF variant call format: vcflib, bio-vcf, cyvcf2, hts-nim and slivar.

Authors:  Erik Garrison; Zev N Kronenberg; Eric T Dawson; Brent S Pedersen; Pjotr Prins
Journal:  PLoS Comput Biol       Date:  2022-05-31       Impact factor: 4.779

4.  Pangenomics enables genotyping of known structural variants in 5202 diverse genomes.

Authors:  Jouni Sirén; Jean Monlong; Xian Chang; Adam M Novak; Jordan M Eizenga; Charles Markello; Jonas A Sibbesen; Glenn Hickey; Pi-Chuan Chang; Andrew Carroll; Namrata Gupta; Stacey Gabriel; Thomas W Blackwell; Aakrosh Ratan; Kent D Taylor; Stephen S Rich; Jerome I Rotter; David Haussler; Erik Garrison; Benedict Paten
Journal:  Science       Date:  2021-12-17       Impact factor: 63.714

5.  Superbubbles revisited.

Authors:  Fabian Gärtner; Lydia Müller; Peter F Stadler
Journal:  Algorithms Mol Biol       Date:  2018-12-01       Impact factor: 1.405

6.  Progressive Cactus is a multiple-genome aligner for the thousand-genome era.

Authors:  Joel Armstrong; Glenn Hickey; Mark Diekhans; Ian T Fiddes; Adam M Novak; Alden Deran; Qi Fang; Duo Xie; Shaohong Feng; Josefin Stiller; Diane Genereux; Jeremy Johnson; Voichita Dana Marinescu; Jessica Alföldi; Robert S Harris; Kerstin Lindblad-Toh; David Haussler; Elinor Karlsson; Erich D Jarvis; Guojie Zhang; Benedict Paten
Journal:  Nature       Date:  2020-11-11       Impact factor: 49.962

7.  A tri-tuple coordinate system derived for fast and accurate analysis of the colored de Bruijn graph-based pangenomes.

Authors:  Jindan Guo; Erli Pang; Hongtao Song; Kui Lin
Journal:  BMC Bioinformatics       Date:  2021-05-27       Impact factor: 3.169

8.  LazyB: fast and cheap genome assembly.

Authors:  Thomas Gatter; Sarah von Löhneysen; Jörg Fallmann; Polina Drozdova; Tom Hartmann; Peter F Stadler
Journal:  Algorithms Mol Biol       Date:  2021-06-01       Impact factor: 1.405

9.  Distance indexing and seed clustering in sequence graphs.

Authors:  Xian Chang; Jordan Eizenga; Adam M Novak; Jouni Sirén; Benedict Paten
Journal:  Bioinformatics       Date:  2020-07-01       Impact factor: 6.937

10.  Coordinate systems for supergenomes.

Authors:  Fabian Gärtner; Christian Höner Zu Siederdissen; Lydia Müller; Peter F Stadler
Journal:  Algorithms Mol Biol       Date:  2018-09-24       Impact factor: 1.405
