A Flow Procedure for Linearization of Genome Sequence Graphs.

David Haussler (1), Maciej Smuga-Otto (1), Jordan M Eizenga (1), Benedict Paten (1), Adam M Novak (1), Sergei Nikitin (2), Maria Zueva (2), Dmitrii Miagkov (2)

Abstract

Efforts to incorporate human genetic variation into the reference human genome have converged on the idea of a graph representation of genetic variation within a species, a genome sequence graph. A sequence graph represents a set of individual haploid reference genomes as paths in a single graph. When that set of reference genomes is sufficiently diverse, the sequence graph implicitly contains all frequent human genetic variations, including translocations, inversions, deletions, and insertions. In representing a set of genomes as a sequence graph, one encounters certain challenges. One of the most important is the problem of graph linearization, essential both for efficiency of storage and access, and for natural graph visualization and compatibility with other tools. The goal of graph linearization is to order nodes of the graph in such a way that operations such as access, traversal, and visualization are as efficient and effective as possible. A new algorithm for the linearization of sequence graphs, called the flow procedure (FP), is proposed in this article. Comparative experimental evaluation of the FP against other algorithms shows that it outperforms its rivals in the metrics most relevant to sequence graphs.
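The ordering goal described in the abstract can be made concrete with a small sketch. The Python below is not the flow procedure itself. It only scores a given node order on two of the quantities named in the keywords, feedback arcs and cut width, under simplifying assumptions that are not taken from the article: the graph is a plain directed, unweighted graph, a feedback arc is an edge that points backward in the order, and cut width is the largest number of edges spanning any gap between adjacent positions. The function name `linearization_metrics` and the toy graph are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

Edge = Tuple[str, str]


def linearization_metrics(order: List[str], edges: Iterable[Edge]) -> Dict[str, int]:
    """Score a node ordering of a directed graph (simplified, unweighted).

    Assumed definitions (illustrative only):
      feedback arc: an edge whose source is placed after its target
      cut width:    the maximum number of edges spanning any gap between
                    two adjacent positions in the ordering
    """
    position = {node: i for i, node in enumerate(order)}
    feedback_arcs = 0
    # crossings[i] counts the edges that span the gap between positions i and i + 1
    crossings = [0] * max(len(order) - 1, 0)
    for src, dst in edges:
        a, b = position[src], position[dst]
        if a > b:
            feedback_arcs += 1
        for gap in range(min(a, b), max(a, b)):
            crossings[gap] += 1
    return {"feedback_arcs": feedback_arcs, "cut_width": max(crossings, default=0)}


# Toy graph: a backbone A -> B -> C -> D plus an edge A -> C that skips B,
# loosely analogous to a deletion variant.
edges = [("A", "B"), ("B", "C"), ("C", "D"), ("A", "C")]

good = linearization_metrics(["A", "B", "C", "D"], edges)
bad = linearization_metrics(["C", "A", "D", "B"], edges)
# good -> {"feedback_arcs": 0, "cut_width": 2}
# bad  -> {"feedback_arcs": 2, "cut_width": 3}
```

On this toy input the backbone order has no backward edges and a smaller cut width than the shuffled order, which is the kind of difference a linearization algorithm tries to achieve.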

Keywords:  backbone; cut width; feedback arcs; flow procedure; grooming; linearization; sequence graph.

Year: 2018; PMID: 29792514; PMCID: PMC6067104; DOI: 10.1089/cmb.2017.0248

Source DB: PubMed; Journal: J Comput Biol; ISSN: 1066-5277; Impact factor: 1.479



