Efficiently detecting polymorphisms during the fragment assembly process.

Daniel Fasulo, Aaron Halpern, Ian Dew, Clark Mobarry.

Abstract

MOTIVATION: Current genomic sequence assemblers assume that the input data is derived from a single, homogeneous source. However, recent whole-genome shotgun sequencing projects have violated this assumption, resulting in input fragments covering the same region of the genome whose sequences differ due to polymorphic variation in the population. While single-nucleotide polymorphisms (SNPs) do not pose a significant problem to state-of-the-art assembly methods, these methods do not handle insertion/deletion (indel) polymorphisms of more than a few bases.
RESULTS: This paper describes an efficient method for detecting sequence discrepancies due to polymorphism that avoids resorting to global use of more costly, less stringent affine sequence alignments. Instead, the algorithm uses graph-based methods to determine the small set of fragments involved in each polymorphism and performs more sophisticated alignments only among fragments in that set. Results from the incorporation of this method into the Celera Assembler are reported for the D. melanogaster, H. sapiens, and M. musculus genomes.
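
The idea of reserving affine-gap alignment for a small candidate set can be illustrated with a minimal sketch. This is not the Celera Assembler's implementation: the abstract does not specify the graph-based selection step, so the candidate set is simply passed in, and the names `affine_global_score` and `score_candidate_set`, the scoring parameters, and the use of global rather than overlap alignment are all assumptions made for illustration.

```python
from itertools import combinations

NEG_INF = float("-inf")


def affine_global_score(a, b, match=1, mismatch=-1, gap_open=-3, gap_extend=-1):
    """Global alignment score with affine gaps (Gotoh-style recurrence).

    A gap of length k costs gap_open + k * gap_extend, so one long indel
    is penalized less than many scattered single-base gaps.
    All scoring values here are illustrative assumptions.
    """
    n, m = len(a), len(b)
    # M: a[i-1] aligned to b[j-1]; X: a[i-1] aligned to a gap; Y: b[j-1] aligned to a gap
    M = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    X = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    Y = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    M[0][0] = 0
    for i in range(1, n + 1):
        X[i][0] = gap_open + i * gap_extend
    for j in range(1, m + 1):
        Y[0][j] = gap_open + j * gap_extend

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            M[i][j] = max(M[i - 1][j - 1], X[i - 1][j - 1], Y[i - 1][j - 1]) + s
            # Extend an existing gap, or open a new one from another state.
            X[i][j] = max(X[i - 1][j] + gap_extend,
                          M[i - 1][j] + gap_open + gap_extend,
                          Y[i - 1][j] + gap_open + gap_extend)
            Y[i][j] = max(Y[i][j - 1] + gap_extend,
                          M[i][j - 1] + gap_open + gap_extend,
                          X[i][j - 1] + gap_open + gap_extend)
    return max(M[n][m], X[n][m], Y[n][m])


def score_candidate_set(fragments, candidate_ids):
    """Run the costly affine alignment only on pairs inside the candidate set.

    `fragments` maps a fragment id to its sequence; `candidate_ids` is the
    (hypothetical) output of whatever step flagged a suspected polymorphism.
    """
    return {
        (i, j): affine_global_score(fragments[i], fragments[j])
        for i, j in combinations(sorted(candidate_ids), 2)
    }


if __name__ == "__main__":
    # Toy data: f2 carries a 3-base insertion relative to f1; f3 is unrelated
    # and is never aligned because it is outside the candidate set.
    fragments = {"f1": "ACGTACGT", "f2": "ACGTTTTACGT", "f3": "GGGG"}
    print(score_candidate_set(fragments, {"f1", "f2"}))
```

The quadratic-time alignment runs once per pair in the candidate set rather than once per pair of overlapping fragments in the whole data set, which is the cost saving the abstract describes.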

Year: 2002   PMID: 12169559   DOI: 10.1093/bioinformatics/18.suppl_1.s294

Source DB: PubMed   Journal: Bioinformatics   ISSN: 1367-4803   Impact factor: 6.937

