
Alignment-free estimation of nucleotide diversity.

Bernhard Haubold, Floyd A Reed, Peter Pfaffelhuber.

Abstract

MOTIVATION: Sequencing capacity is currently growing more rapidly than CPU speed, leading to an analysis bottleneck in many genome projects. Alignment-free sequence analysis methods tend to be more efficient than their alignment-based counterparts. They may, therefore, be important in the long run for keeping sequence analysis abreast with sequencing.
RESULTS: We derive and implement an alignment-free estimator of the number of pairwise mismatches, π_m. Our implementation of π_m, pim, is based on an enhanced suffix array and inherits the superior time and memory efficiency of this data structure. Simulations demonstrate that π_m is accurate if mutations are distributed randomly along the chromosome. While real data often deviate from this ideal, π_m remains useful for identifying regions of low genetic diversity using a sliding window approach. We demonstrate this by applying it to the complete genomes of 37 strains of Drosophila melanogaster, and to the genomes of two closely related Drosophila species, D. simulans and D. sechellia. In both cases, we detect the diversity minimum and discuss its biological implications.
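
For orientation, the quantity being estimated (the mean number of pairwise mismatches per site) and the sliding window scan can be sketched for sequences that are already aligned. This is the classical alignment-based calculation, not the alignment-free estimator implemented in pim; the function names, the toy sequences, and the window settings are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def pairwise_mismatches(seqs):
    """Mean number of mismatches per site over all pairs of aligned sequences."""
    n_sites = len(seqs[0])
    pairs = list(combinations(seqs, 2))
    # Count differing positions for every pair, then average over pairs and sites.
    total = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return total / (len(pairs) * n_sites)


def sliding_pi(seqs, window, step):
    """Yield (start, diversity) for each window along the alignment."""
    n_sites = len(seqs[0])
    for start in range(0, n_sites - window + 1, step):
        chunk = [s[start:start + window] for s in seqs]
        yield start, pairwise_mismatches(chunk)


# Toy aligned sample (assumption: equal-length, gap-free sequences).
toy = ["ACGTACGTAC", "ACGTACGTTC", "ACGAACGATC"]
windows = list(sliding_pi(toy, window=5, step=5))

# The window with the smallest value marks the region of lowest diversity.
lowest = min(windows, key=lambda w: w[1])
print(windows, lowest)
```

On whole genomes the same scan would be run with much larger windows, and the window with the smallest value corresponds to the diversity minimum mentioned in the abstract.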

Year: 2010    PMID: 21156730    DOI: 10.1093/bioinformatics/btq689

Source DB: PubMed    Journal: Bioinformatics    ISSN: 1367-4803    Impact factor: 6.937


  7 in total

1.  A geometric interpretation for local alignment-free sequence comparison.

Authors:  Ehsan Behnam; Michael S Waterman; Andrew D Smith
Journal:  J Comput Biol       Date:  2013-07       Impact factor: 1.479

2.  An alignment-free test for recombination.

Authors:  Bernhard Haubold; Linda Krause; Thomas Horn; Peter Pfaffelhuber
Journal:  Bioinformatics       Date:  2013-09-23       Impact factor: 6.937

3.  A novel hierarchical clustering algorithm for gene sequences.

Authors:  Dan Wei; Qingshan Jiang; Yanjie Wei; Shengrui Wang
Journal:  BMC Bioinformatics       Date:  2012-07-23       Impact factor: 3.169

4.  Fractal MapReduce decomposition of sequence alignment.

Authors:  Jonas S Almeida; Alexander Grüneberg; Wolfgang Maass; Susana Vinga
Journal:  Algorithms Mol Biol       Date:  2012-05-02       Impact factor: 1.405

5.  Estimating evolutionary distances between genomic sequences from spaced-word matches.

Authors:  Burkhard Morgenstern; Bingyao Zhu; Sebastian Horwege; Chris André Leimeister
Journal:  Algorithms Mol Biol       Date:  2015-02-11       Impact factor: 1.405

6.  An improved alignment-free model for DNA sequence similarity metric.

Authors:  Junpeng Bao; Ruiyu Yuan; Zhe Bao
Journal:  BMC Bioinformatics       Date:  2014-09-28       Impact factor: 3.169

7.  Alignment-free population genomics: an efficient estimator of sequence diversity.

Authors:  Bernhard Haubold; Peter Pfaffelhuber
Journal:  G3 (Bethesda)       Date:  2012-08-01       Impact factor: 3.154

