
Sequence Comparison Without Alignment: The SpaM Approaches.

Burkhard Morgenstern

Abstract

Sequence alignment is at the heart of DNA and protein sequence analysis. For the data volumes now produced by massively parallel sequencing technologies, however, pairwise and multiple alignment methods are often too slow. Fast alignment-free approaches to sequence comparison have therefore become popular in recent years. Most of these approaches are based on word frequencies, for words of a fixed length, or on word-matching statistics. Other approaches use the length of maximal word matches. While these methods are very fast, most of them rely on ad hoc measures of sequence similarity or dissimilarity that are hard to interpret. In this chapter, I describe a number of alignment-free methods that we have developed in recent years. Our approaches are based on spaced-word matches ("SpaM"), i.e., on inexact word matches that are allowed to contain mismatches at certain pre-defined positions. Unlike most previous alignment-free approaches, our approaches are able to accurately estimate phylogenetic distances between DNA or protein sequences using a stochastic model of molecular evolution.
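The idea of a spaced-word match can be illustrated with a minimal Python sketch. It is not the chapter's implementation: the function names are hypothetical, the pattern is assumed to be a string in which '1' marks a position that must match and '0' marks a position where a mismatch is allowed, and the Jukes-Cantor correction is assumed here only as one simple example of a stochastic model, since the abstract does not name the model used. The sketch also keeps every match it finds, including matches that occur by chance, which a practical tool would need to filter out.

```python
from math import log


def spaced_words(seq, pattern):
    """Yield (start, word) pairs; word keeps only the characters at '1' positions."""
    match_idx = [i for i, c in enumerate(pattern) if c == "1"]
    for start in range(len(seq) - len(pattern) + 1):
        yield start, "".join(seq[start + i] for i in match_idx)


def spaced_word_matches(seq_a, seq_b, pattern):
    """Return all (i, j) where the windows at seq_a[i:] and seq_b[j:] agree at every '1' position."""
    index = {}
    for pos, word in spaced_words(seq_b, pattern):
        index.setdefault(word, []).append(pos)
    return [
        (i, j)
        for i, word in spaced_words(seq_a, pattern)
        for j in index.get(word, ())
    ]


def mismatch_rate(seq_a, seq_b, pattern, matches):
    """Fraction of mismatches at the '0' positions, taken over all given matches."""
    free_idx = [k for k, c in enumerate(pattern) if c == "0"]
    total = len(free_idx) * len(matches)
    if total == 0:
        return None  # nothing to count
    mismatches = sum(
        seq_a[i + k] != seq_b[j + k] for i, j in matches for k in free_idx
    )
    return mismatches / total


def jukes_cantor(p):
    """Jukes-Cantor distance for a DNA mismatch fraction p (assumed model; needs p < 0.75)."""
    return -0.75 * log(1.0 - 4.0 * p / 3.0)


if __name__ == "__main__":
    a, b, pattern = "ACGTACGT", "ACTTACGA", "1101"
    matches = spaced_word_matches(a, b, pattern)
    p = mismatch_rate(a, b, pattern, matches)
    print(matches, p, jukes_cantor(p))
```

On sequences this short, chance matches dominate the mismatch count, so the resulting number only shows the mechanics and should not be read as a meaningful distance.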


Keywords:  Alignment free; FSWM; Genome comparison; Phylogenomics; Phylogeny; SpaM; Spaced words


Year:  2021        PMID: 33289890     DOI: 10.1007/978-1-0716-1036-7_8

Source DB:  PubMed          Journal:  Methods Mol Biol        ISSN: 1064-3745


