SWALO: scaffolding with assembly likelihood optimization.

Atif Rahman, Lior Pachter.

Abstract

Scaffolding, i.e. ordering and orienting contigs, is an important step in genome assembly. We present a method for scaffolding using second-generation sequencing reads based on likelihoods of genome assemblies. A generative model for sequencing is used to obtain maximum likelihood estimates of gaps between contigs and to estimate whether linking contigs into scaffolds would lead to an increase in the likelihood of the assembly. We then link contigs if they can be unambiguously joined or if the corresponding increase in likelihood is substantially greater than that of other possible joins of those contigs. The method is implemented in a tool called Swalo with approximations to make it efficient and applicable to large datasets. Analysis on real and simulated datasets reveals that it consistently makes more correct joins than other scaffolders, or a similar number, while linking very few contigs incorrectly, thus outperforming other scaffolders and demonstrating that substantial improvement in genome assembly may be achieved through the use of statistical models. Swalo is freely available for download at https://atifrahman.github.io/SWALO/.
© The Author(s) 2021. Published by Oxford University Press on behalf of Nucleic Acids Research.
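
The two ideas in the abstract, a maximum likelihood gap estimate and a join decision based on how much the likelihood improves, can be illustrated with a toy sketch. This is not Swalo's implementation: it assumes insert sizes follow a normal distribution with known mean `mu` and standard deviation `sigma`, it ignores the fact that only some inserts can span a given gap, and the names `gap_log_likelihood`, `estimate_gap`, `pick_join` and the threshold `min_margin` are hypothetical.

```python
import math


def log_normal_pdf(x, mu, sigma):
    """Log density of a normal distribution (assumed insert-size model)."""
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2 * sigma ** 2)


def gap_log_likelihood(gap, spans, mu, sigma):
    """Log-likelihood of a gap size given read pairs bridging two contigs.

    Each entry of `spans` is the part of the insert that lies on the two
    contigs, so the implied insert size is span + gap.
    """
    return sum(log_normal_pdf(s + gap, mu, sigma) for s in spans)


def estimate_gap(spans, mu, sigma, min_gap=-100, max_gap=2000):
    """Grid search for the integer gap that maximizes the log-likelihood."""
    return max(
        range(min_gap, max_gap + 1),
        key=lambda g: gap_log_likelihood(g, spans, mu, sigma),
    )


def pick_join(gains, min_margin=5.0):
    """Choose a join only if it clearly beats the alternatives.

    `gains` maps a candidate neighbouring contig to the increase in
    assembly log-likelihood from joining it. `min_margin` is an
    illustrative threshold, not a value taken from the paper.
    """
    ranked = sorted(gains.items(), key=lambda kv: kv[1], reverse=True)
    if not ranked or ranked[0][1] <= 0:
        return None  # no candidate improves the likelihood
    if len(ranked) == 1:
        return ranked[0][0]  # only one possible join
    best, best_gain = ranked[0]
    second_gain = ranked[1][1]
    return best if best_gain - second_gain > min_margin else None


if __name__ == "__main__":
    spans = [150, 200, 250]  # made-up example data
    print(estimate_gap(spans, mu=500, sigma=50))
    print(pick_join({"contig_B": 12.0, "contig_C": 3.0}))
```

Under this simplified model the estimate reduces to `mu` minus the mean span, so the grid search is only there to show the likelihood being maximized directly; a more faithful model would also condition on which inserts are able to bridge the gap.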

Year:  2021        PMID: 34417615      PMCID: PMC8599790          DOI: 10.1093/nar/gkab717

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


