
Estimating mutation distances from unaligned genomes.

Bernhard Haubold, Peter Pfaffelhuber, Mirjana Domazet-Lošo, Thomas Wiehe.

Abstract

Alignment-free distance measures are generally less accurate but more efficient than traditional alignment-based metrics. In the context of genome sequence analysis, the efficiency gain is often so substantial that it outweighs the loss in accuracy. However, a further disadvantage of alignment-free distances is that their relationship to evolutionary events such as substitutions is generally unknown. We have therefore derived an estimator of the number of substitutions per site between two unaligned DNA sequences, K(r). Simulations show that this estimator works well with "ideal" data. We compare K(r) to two alternative alignment-free distances: a k-tuple distance and a measure of relative entropy based on average common substring length. All three measures are applied to 27 primate mitochondrial genomes, eight whole genomes of Streptococcus agalactiae strains, and 12 whole genomes of Drosophila species. In each case, the cluster diagrams based on K(r) are equivalent to or significantly better than those based on the two alternative measures. This is because, in contrast to the alternative measures, K(r) is derived from an explicit model of evolution. The computation of K(r) is efficiently implemented in the program kr, which can be downloaded freely from the internet.

Year:  2009        PMID: 19803738     DOI: 10.1089/cmb.2009.0106

Source DB:  PubMed          Journal:  J Comput Biol        ISSN: 1066-5277            Impact factor:   1.479


  33 in total

1.  Alignment-free detection of horizontal gene transfer between closely related bacterial genomes.

Authors:  Mirjana Domazet-Lošo; Bernhard Haubold
Journal:  Mob Genet Elements       Date:  2011-09-01

2.  An alignment-free test for recombination.

Authors:  Bernhard Haubold; Linda Krause; Thomas Horn; Peter Pfaffelhuber
Journal:  Bioinformatics       Date:  2013-09-23       Impact factor: 6.937

3.  Prot-SpaM: fast alignment-free phylogeny reconstruction based on whole-proteome sequences.

Authors:  Chris-Andre Leimeister; Jendrik Schellhorn; Svenja Dörrer; Michael Gerth; Christoph Bleidorn; Burkhard Morgenstern
Journal:  Gigascience       Date:  2019-03-01       Impact factor: 6.524

4.  Sequence Comparison Without Alignment: The SpaM Approaches.

Authors:  Burkhard Morgenstern
Journal:  Methods Mol Biol       Date:  2021

5.  Large local analysis of the unaligned genome and its application.

Authors:  Lianping Yang; Xiangde Zhang; Tianming Wang; Hegui Zhu
Journal:  J Comput Biol       Date:  2013-01       Impact factor: 1.479

6.  Identity: rapid alignment-free prediction of sequence alignment identity scores using self-supervised general linear models.

Authors:  Hani Z Girgis; Benjamin T James; Brian B Luczak
Journal:  NAR Genom Bioinform       Date:  2021-02-01

7.  On the transformation of MinHash-based uncorrected distances into proper evolutionary distances for phylogenetic inference.

Authors:  Alexis Criscuolo
Journal:  F1000Res       Date:  2020-11-10

8.  Spaced words and kmacs: fast alignment-free sequence comparison based on inexact word matches.

Authors:  Sebastian Horwege; Sebastian Lindner; Marcus Boden; Klas Hatje; Martin Kollmar; Chris-André Leimeister; Burkhard Morgenstern
Journal:  Nucleic Acids Res       Date:  2014-05-14       Impact factor: 16.971

9.  CAM: an alignment-free method to recover phylogenies using codon aversion motifs.

Authors:  Justin B Miller; Lauren M McKinnon; Michael F Whiting; Perry G Ridge
Journal:  PeerJ       Date:  2019-06-04       Impact factor: 2.984

10.  Alignment-free population genomics: an efficient estimator of sequence diversity.

Authors:  Bernhard Haubold; Peter Pfaffelhuber
Journal:  G3 (Bethesda)       Date:  2012-08-01       Impact factor: 3.154
