
A tri-tuple coordinate system derived for fast and accurate analysis of the colored de Bruijn graph-based pangenomes.

Jindan Guo, Erli Pang, Hongtao Song, Kui Lin.

Abstract

BACKGROUND: With the rapid development of accurate sequencing and assembly technologies, an increasing number of high-quality chromosome-level and haplotype-resolved genome assemblies have been produced, creating great opportunities for computational pangenomics. Although genome graphs are among the most useful models for pangenome representation, their structural complexity makes it difficult to present genome information as intuitively as a linear reference genome does. Thus, efficiently and accurately analyzing the spatial structure of a genome graph and coordinating its information remains a substantial challenge.
RESULTS: We developed a new method, a colored superbubble (cSupB), that can overcome the complexity of graphs and organize a set of species- or population-specific haplotype sequences of interest. Based on this model, we propose a tri-tuple coordinate system that combines an offset value, topological structure and sample information. Additionally, cSupB provides a novel method that utilizes complete topological information and efficiently detects small indels (< 50 bp) for highly similar samples, which can be validated by simulated datasets. Moreover, we demonstrated that cSupB can adapt to the complex cycle structure.
CONCLUSIONS: Although the solution is made suitable for increasingly complex genome graphs by relaxing the directed acyclic graph constraint, the cSupB motif and the cSupB method can be extended to any colored directed acyclic graph. We anticipate that our method will facilitate the analysis of individual haplotype variants and population genomic diversity. We have developed a C++ program implementing our method, available at https://github.com/eggleader/cSupB.
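
To make the abstract's terms concrete, the following is a minimal sketch of a colored de Bruijn graph built from two toy haplotypes, together with one possible coordinate that combines an offset, a topological structure and a sample. It is not the authors' cSupB implementation: the function names, the `TriTuple` field names, the structure label format and the toy sequences are all assumptions made for illustration, and the sketch only handles a single simple bubble in which each (k-1)-mer occurs once per sample.

```python
from collections import defaultdict
from typing import NamedTuple


class TriTuple(NamedTuple):
    """Hypothetical coordinate; the field names are assumptions, not the paper's."""
    offset: int       # steps from the entrance node of the structure
    structure: str    # label of the enclosing bubble-like structure
    sample: str       # sample (color) whose path carries the node


def build_colored_dbg(samples, k):
    """Map each edge between (k-1)-mers to the set of samples containing that k-mer."""
    edges = defaultdict(set)
    for name, seq in samples.items():
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            edges[(kmer[:-1], kmer[1:])].add(name)
    return dict(edges)


def branch_and_merge_nodes(edges):
    """Return nodes with several successors and nodes with several predecessors."""
    succ, pred = defaultdict(set), defaultdict(set)
    for src, dst in edges:
        succ[src].add(dst)
        pred[dst].add(src)
    branches = {n for n, s in succ.items() if len(s) > 1}
    merges = {n for n, p in pred.items() if len(p) > 1}
    return branches, merges


def bubble_coordinates(samples, k, entrance, exit_node):
    """Give a TriTuple to every node strictly between entrance and exit on each path.

    Assumes each (k-1)-mer occurs at most once per sample (no repeats or cycles).
    """
    label = f"{entrance}>{exit_node}"
    coords = {}
    for name, seq in samples.items():
        nodes = [seq[i:i + k - 1] for i in range(len(seq) - k + 2)]
        start, end = nodes.index(entrance), nodes.index(exit_node)
        for offset, node in enumerate(nodes[start + 1:end], start=1):
            coords[(name, node)] = TriTuple(offset, label, name)
    return coords


# Toy haplotypes that differ by a single base (invented data).
samples = {"s1": "ATGGCGTGCA", "s2": "ATGGCTTGCA"}
k = 4
edges = build_colored_dbg(samples, k)
branches, merges = branch_and_merge_nodes(edges)
entrance, exit_node = next(iter(branches)), next(iter(merges))
coords = bubble_coordinates(samples, k, entrance, exit_node)
print(coords[("s1", "CGT")])
```

In this toy graph the two samples share the flanking edges and diverge between one branch node and one merge node, so the same offset inside the same structure can be distinguished by sample. Handling nested structures or cycles, which the abstract says cSupB addresses, is outside the scope of this sketch.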


Keywords:  Coordinate system; Genome graph; Variant detection


Year:  2021        PMID: 34044757     DOI: 10.1186/s12859-021-04149-w

Source DB:  PubMed          Journal:  BMC Bioinformatics        ISSN: 1471-2105            Impact factor:   3.169


