A preprocessor for shotgun assembly of large genomes.

Michael Roberts, Brian R Hunt, James A Yorke, Randall A Bolanos, Arthur L Delcher.

Abstract

The whole-genome shotgun (WGS) assembly technique has been remarkably successful in efforts to determine the sequence of bases that make up a genome. WGS assembly begins with a large collection of short fragments that have been selected at random from a genome. The sequence of bases at each end of the fragment is determined, albeit imprecisely, resulting in a sequence of letters called a "read." Each letter in a read is assigned a quality value, which estimates the probability that a sequencing error occurred in determining that letter. Reads are typically cut off after about 500 letters, where sequencing errors become endemic. We report on a set of procedures that (1) corrects most of the sequencing errors, (2) changes quality values accordingly, and (3) produces a list of "overlaps," i.e., pairs of reads that plausibly come from overlapping parts of the genome. Our procedures, which we call collectively the "UMD Overlapper," can be run iteratively and as a preprocessor for other assemblers. We tested the UMD Overlapper on Celera's Drosophila reads. When we replaced Celera's overlap procedures in the front end of their assembler, it was able to produce a significantly improved genome.
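The ideas of a per-letter quality value and a list of read "overlaps" can be illustrated with a minimal sketch. It is not the UMD Overlapper's method, which the abstract does not specify. It assumes a Phred-style quality scale (error probability 10^(-Q/10)) and uses a naive all-pairs suffix-prefix comparison that tolerates a fixed number of mismatches. The function names and the `min_len` and `max_mismatch` parameters are hypothetical, chosen only for illustration.

```python
from itertools import permutations


def error_probability(q):
    """Convert a quality value to an error probability.

    Assumes a Phred-style scale; the abstract does not name the scale.
    """
    return 10 ** (-q / 10)


def suffix_prefix_overlap(a, b, min_len=3, max_mismatch=0):
    """Length of the longest suffix of `a` that matches a prefix of `b`
    with at most `max_mismatch` mismatches; 0 if none reaches `min_len`."""
    for length in range(min(len(a), len(b)), min_len - 1, -1):
        mismatches = sum(x != y for x, y in zip(a[-length:], b[:length]))
        if mismatches <= max_mismatch:
            return length
    return 0


def candidate_overlaps(reads, min_len=3, max_mismatch=0):
    """Return (i, j, length) for every ordered pair of reads where the end
    of read i plausibly overlaps the start of read j (naive O(n^2) scan)."""
    result = []
    for i, j in permutations(range(len(reads)), 2):
        length = suffix_prefix_overlap(reads[i], reads[j], min_len, max_mismatch)
        if length:
            result.append((i, j, length))
    return result


if __name__ == "__main__":
    reads = ["ACGTAC", "TACGGA", "GGATTT"]  # toy reads, not real data
    print(candidate_overlaps(reads))
    print(error_probability(20))
```

Allowing a nonzero `max_mismatch` is a crude stand-in for tolerating sequencing errors. A practical overlapper for reads of about 500 letters would need an indexed search rather than an all-pairs scan, and would likely weight mismatches by the quality values instead of counting them equally.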

Year: 2004; PMID: 15579242; DOI: 10.1089/cmb.2004.11.734

Source DB: PubMed; Journal: J Comput Biol; ISSN: 1066-5277; Impact factor: 1.479


  13 in total

1.  New assembly, reannotation and analysis of the Entamoeba histolytica genome reveal new genomic features and protein content information.

Authors:  Hernan A Lorenzi; Daniela Puiu; Jason R Miller; Lauren M Brinkac; Paolo Amedeo; Neil Hall; Elisabet V Caler
Journal:  PLoS Negl Trop Dis       Date:  2010-06-15

2.  The genome and transcriptome of the enteric parasite Entamoeba invadens, a model for encystation.

Authors:  Gretchen M Ehrenkaufer; Gareth D Weedall; Daryl Williams; Hernan A Lorenzi; Elisabet Caler; Neil Hall; Upinder Singh
Journal:  Genome Biol       Date:  2013-07-26       Impact factor: 13.583

3.  Mash: fast genome and metagenome distance estimation using MinHash.

Authors:  Brian D Ondov; Todd J Treangen; Páll Melsted; Adam B Mallonee; Nicholas H Bergman; Sergey Koren; Adam M Phillippy
Journal:  Genome Biol       Date:  2016-06-20       Impact factor: 13.583

4.  The effects of sampling on the efficiency and accuracy of k-mer indexes: Theoretical and empirical comparisons using the human genome.

Authors:  Meznah Almutairy; Eric Torng
Journal:  PLoS One       Date:  2017-07-07       Impact factor: 3.240

5.  Gerbil: a fast and memory-efficient k-mer counter with GPU-support.

Authors:  Marius Erbert; Steffen Rechner; Matthias Müller-Hannemann
Journal:  Algorithms Mol Biol       Date:  2017-03-31       Impact factor: 1.405

6.  Improving the performance of minimizers and winnowing schemes.

Authors:  Guillaume Marçais; David Pellow; Daniel Bork; Yaron Orenstein; Ron Shamir; Carl Kingsford
Journal:  Bioinformatics       Date:  2017-07-15       Impact factor: 6.937

7.  Improving de novo sequence assembly using machine learning and comparative genomics for overlap correction.

Authors:  Lance E Palmer; Mathaeus Dejori; Randall Bolanos; Daniel Fasulo
Journal:  BMC Bioinformatics       Date:  2010-01-15       Impact factor: 3.169

8.  A whole-genome assembly of the domestic cow, Bos taurus.

Authors:  Aleksey V Zimin; Arthur L Delcher; Liliana Florea; David R Kelley; Michael C Schatz; Daniela Puiu; Finnian Hanrahan; Geo Pertea; Curtis P Van Tassell; Tad S Sonstegard; Guillaume Marçais; Michael Roberts; Poorani Subramanian; James A Yorke; Steven L Salzberg
Journal:  Genome Biol       Date:  2009-04-24       Impact factor: 13.583

9.  Aggressive assembly of pyrosequencing reads with mates.

Authors:  Jason R Miller; Arthur L Delcher; Sergey Koren; Eli Venter; Brian P Walenz; Anushka Brownley; Justin Johnson; Kelvin Li; Clark Mobarry; Granger Sutton
Journal:  Bioinformatics       Date:  2008-10-24       Impact factor: 6.937

10.  Improving Phrap-based assembly of the rat using "reliable" overlaps.

Authors:  Michael Roberts; Aleksey V Zimin; Wayne Hayes; Brian R Hunt; Cevat Ustun; James R White; Paul Havlak; James Yorke
Journal:  PLoS One       Date:  2008-03-19       Impact factor: 3.240
