
The combination of direct and paired link graphs can boost repetitive genome assembly.

Wenyu Shi, Peifeng Ji, Fangqing Zhao.

Abstract

Currently, most paired-link-based scaffolding algorithms intrinsically mask the sequences between two linked contigs and bypass the direct link information embedded in the original de Bruijn assembly graph. This limitation substantially complicates the scaffolding process and leaves these algorithms unable to resolve the assembly of repetitive contigs. Here we present a novel algorithm, inGAP-sf, for effectively generating high-quality and continuous scaffolds. inGAP-sf achieves this with a new strategy that combines direct link and paired link graphs: direct links are used to increase graph connectivity and decrease graph complexity, while paired links are employed to supervise the traversal of the direct link graph. This combination greatly facilitates the assembly of regions enriched in short repeats. Moreover, a new comprehensive decision model is developed to eliminate the noise routes that accompany the introduced direct links. Through extensive evaluations on both simulated and real datasets, we demonstrate that inGAP-sf outperforms most genome scaffolding algorithms by generating more accurate and continuous assemblies, especially for short repetitive regions.
© The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
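
The idea of letting paired links supervise a walk over the direct link graph can be illustrated with a toy example. The sketch below is not the inGAP-sf implementation or its decision model; the function name `extend_path`, the dictionary-based graph layout, and the support counts are all hypothetical, chosen only to show how a short repeat contig `R` shared by two paths might be resolved.

```python
def extend_path(start, direct, paired, max_steps=100):
    """Walk the direct link graph from `start`, using paired links at branches.

    direct: dict mapping a contig to its successor contigs (hypothetical layout).
    paired: dict mapping (upstream, downstream) contig pairs to a read-pair
            support count (hypothetical layout).
    """
    path = [start]
    for _ in range(max_steps):  # bound the walk so cycles cannot loop forever
        candidates = direct.get(path[-1], [])
        if not candidates:
            break
        if len(candidates) == 1:
            nxt = candidates[0]
        else:
            # Score each branch by paired-link support from contigs already on the path.
            scores = {c: sum(paired.get((p, c), 0) for p in path) for c in candidates}
            best = max(scores.values())
            winners = [c for c, s in scores.items() if s == best]
            if best == 0 or len(winners) > 1:
                break  # no supervision available or a tie: stop rather than guess
            nxt = winners[0]
        path.append(nxt)
    return path


# Toy graph: repeat contig R lies between A..B and between C..D.
direct = {"A": ["R"], "C": ["R"], "R": ["B", "D"]}
paired = {("A", "B"): 5, ("C", "D"): 4}

print(extend_path("A", direct, paired))  # ['A', 'R', 'B']
print(extend_path("C", direct, paired))  # ['C', 'R', 'D']
```

In this toy setting the direct links alone leave the branch at `R` ambiguous, and the paired links alone would skip over `R`; using both recovers each path through the repeat. Stopping on ties or missing support is a simplification made here and stands in only loosely for the noise-route filtering mentioned in the abstract.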


Year:  2017        PMID: 27924003      PMCID: PMC5399794          DOI: 10.1093/nar/gkw1191

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


