SEQuel: improving the accuracy of genome assemblies.

Roy Ronen, Christina Boucher, Hamidreza Chitsaz, Pavel Pevzner.

Abstract

MOTIVATION: Assemblies of next-generation sequencing (NGS) data, although accurate, still contain a substantial number of errors that need to be corrected after the assembly process. We develop SEQuel, a tool that corrects errors (i.e. insertions, deletions and substitution errors) in the assembled contigs. Fundamental to the algorithm behind SEQuel is the positional de Bruijn graph, a graph structure that models k-mers within reads while incorporating the approximate positions of reads into the model.
RESULTS: SEQuel reduced the number of small insertions and deletions in the assemblies of standard multi-cell Escherichia coli data by almost half, and corrected between 30% and 94% of the substitution errors. Further, we show SEQuel is imperative to improving single-cell assembly, which is inherently more challenging due to higher error rates and non-uniform coverage; over half of the small indels and substitution errors in the single-cell assemblies were corrected. We apply SEQuel to the recently assembled Deltaproteobacterium SAR324 genome, which is the first bacterial genome with a comprehensive single-cell genome assembly, and make over 800 changes (insertions, deletions and substitutions) to refine this assembly.
AVAILABILITY: SEQuel can be used as a post-processing step in combination with any NGS assembler and is freely available at http://bix.ucsd.edu/SEQuel/.
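
The positional de Bruijn graph named in the abstract can be illustrated with a small sketch. This is not SEQuel's implementation: it assumes that each read comes with an approximate start position on a contig, that a node is a (k-mer, position) pair, and that two occurrences of the same k-mer share a node only when their positions differ by at most a tolerance `delta`. The function names, the `delta` parameter and the toy reads are all hypothetical.

```python
from collections import defaultdict


def positional_kmers(read, start, k):
    """Yield (k-mer, approximate position) pairs for a read placed at `start`."""
    for i in range(len(read) - k + 1):
        yield read[i:i + k], start + i


def build_positional_graph(aligned_reads, k, delta):
    """Build a toy positional de Bruijn graph (illustrative assumption only).

    aligned_reads: iterable of (sequence, approximate start position).
    Returns (nodes, edges): nodes is a set of (k-mer, representative position)
    pairs; edges maps (node_u, node_v) to the number of reads supporting it.
    """
    clusters = defaultdict(list)  # k-mer -> representative positions seen so far
    edges = defaultdict(int)

    def node_for(kmer, pos):
        # Reuse an existing node if the same k-mer was seen near this position.
        for rep in clusters[kmer]:
            if abs(rep - pos) <= delta:
                return (kmer, rep)
        clusters[kmer].append(pos)
        return (kmer, pos)

    for seq, start in aligned_reads:
        prev = None
        for kmer, pos in positional_kmers(seq, start, k):
            node = node_for(kmer, pos)
            if prev is not None:
                edges[(prev, node)] += 1
            prev = node

    nodes = {(kmer, rep) for kmer, reps in clusters.items() for rep in reps}
    return nodes, dict(edges)


# Toy input: two overlapping reads near position 0 and one copy far away.
reads = [("ACGTAC", 0), ("CGTACG", 1), ("ACGTAC", 100)]
nodes, edges = build_positional_graph(reads, k=3, delta=2)
```

In this toy input the k-mer `ACG` occurs at positions 0, 4 and 100 and is kept as three separate nodes, whereas a position-free de Bruijn graph would glue all three occurrences into one. Keeping distant copies apart is the property the abstract attributes to incorporating approximate read positions into the model.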

Year:  2012        PMID: 22689760      PMCID: PMC3371851          DOI: 10.1093/bioinformatics/bts219

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937

