Repeat-aware evaluation of scaffolding tools.

Igor Mandric, Sergey Knyazev, Alex Zelikovsky.

Abstract

Summary: Genomic sequences are assembled into a variable but large number of contigs that should be scaffolded (ordered and oriented) to facilitate comparative or functional analysis. Finding a scaffolding is computationally challenging due to misassemblies, inconsistent coverage across the genome and long repeats. An accurate assessment of scaffolding tools should take into account multiple locations of the same contig on the reference scaffolding rather than matching a repeat to a single best location. This makes mapping inferred scaffoldings onto the reference a computationally challenging problem. This paper formulates the repeat-aware scaffolding evaluation problem, which is to find a mapping of the inferred scaffolding onto the reference that maximizes the number of correct links, and proposes a scalable algorithm capable of handling large whole-genome datasets. Our novel scaffolding validation framework has been applied to assess most state-of-the-art scaffolding tools on a representative subset of the Genome Assembly Gold-standard Evaluations (GAGE) datasets and some novel simulated datasets.

Availability and implementation: The source code of this evaluation framework is available at https://github.com/mandricigor/repeat-aware. The documentation is hosted at https://mandricigor.github.io/repeat-aware.

Supplementary information: Supplementary data are available at Bioinformatics online.
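To illustrate why a single best location can undercount correct links, the following minimal sketch compares two ways of scoring one inferred scaffold against a reference. It is not the authors' algorithm or the repeat-aware tool's interface. It assumes a heavily simplified model: the reference is one ordered list of contig names, orientation, gap sizes and reverse traversal are ignored, a link counts as correct only when the two contigs map to consecutive reference positions, and several scaffold entries may map to the same reference position. All function names are hypothetical.

```python
from collections import defaultdict


def occurrences(reference):
    """Map each contig name to every position where it occurs in the reference."""
    occ = defaultdict(list)
    for pos, name in enumerate(reference):
        occ[name].append(pos)
    return occ


def candidate_positions(occ, contig):
    """All reference positions of a contig, or [None] if it is absent."""
    return occ.get(contig) or [None]


def correct_links_single_location(reference, scaffold):
    """Baseline: pin every contig to its first reference position."""
    occ = occurrences(reference)
    pos = [candidate_positions(occ, c)[0] for c in scaffold]
    return sum(
        1
        for a, b in zip(pos, pos[1:])
        if a is not None and b is not None and b == a + 1
    )


def correct_links_repeat_aware(reference, scaffold):
    """Choose one position per scaffold entry to maximize correct links.

    Dynamic program along the scaffold: best[p] is the largest number of
    correct links seen so far when the current contig is mapped to p.
    """
    if not scaffold:
        return 0
    occ = occurrences(reference)
    best = {p: 0 for p in candidate_positions(occ, scaffold[0])}
    for contig in scaffold[1:]:
        prev_max = max(best.values())
        nxt = {}
        for p in candidate_positions(occ, contig):
            score = prev_max  # the incoming link is not counted as correct
            if p is not None and best.get(p - 1) is not None:
                # the previous contig can sit immediately before position p
                score = max(score, best[p - 1] + 1)
            nxt[p] = score
        best = nxt
    return max(best.values())


if __name__ == "__main__":
    reference = ["A", "R", "B", "C", "R", "D"]  # "R" is a repeat
    scaffold = ["C", "R", "D"]
    print(correct_links_single_location(reference, scaffold))
    print(correct_links_repeat_aware(reference, scaffold))
```

In this toy case the baseline pins the repeat R to its first occurrence and counts neither link of the scaffold C, R, D as correct, while the repeat-aware mapping places R at its second occurrence and counts both. A realistic evaluation would also have to handle orientation, multiple reference scaffolds and the scale of whole-genome datasets, which this sketch does not attempt.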

Year:  2018        PMID: 29547882      PMCID: PMC6061838          DOI: 10.1093/bioinformatics/bty131

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


