
Local decoding of sequences and alignment-free comparison.

Gilles Didier, Ivan Laprevotte, Maude Pupin, Alain Hénaut.

Abstract

Subword composition plays an important role in many sequence analyses. Here we define and study the "local decoding of order N of sequences," an alternative that avoids some drawbacks of "subwords of length N" approaches while retaining information about environments of length N in the sequences ("decoding" is taken here in the sense of hidden Markov modeling, i.e., associating a state with every position of the sequence). We present an algorithm for computing the local decoding of order N of a given set of sequences. Its complexity is linear in the total length of the set, in both time and memory, whatever the order N. To illustrate a use of local decoding, we propose a very basic dissimilarity measure between sequences that can be computed both from the local decoding of order N and from the composition in subwords of length N. The accuracies of these two dissimilarities are evaluated, over several datasets, by computing their linear correlations with a reference alignment-based distance. These accuracies are also compared with that obtained from another recent alignment-free comparison.
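
The abstract does not give the formula of its dissimilarity measure, so the following is only a minimal sketch of the general idea of a subword-composition dissimilarity and of the evaluation by linear correlation. The function names (`subword_counts`, `composition_dissimilarity`, `pearson`) and the choice of an L1 distance between subword frequency profiles are assumptions made for illustration, not the authors' definitions, and the sketch does not implement local decoding itself.

```python
from collections import Counter
from math import sqrt


def subword_counts(seq, n):
    """Count every overlapping subword of length n in seq."""
    return Counter(seq[i:i + n] for i in range(len(seq) - n + 1))


def composition_dissimilarity(a, b, n):
    """Assumed measure: L1 distance between subword frequency profiles.

    This is an illustrative stand-in, not the measure defined in the paper.
    """
    ca, cb = subword_counts(a, n), subword_counts(b, n)
    ta, tb = sum(ca.values()), sum(cb.values())
    if ta == 0 or tb == 0:
        raise ValueError("each sequence must contain at least n symbols")
    # A Counter returns 0 for subwords it has never seen.
    return sum(abs(ca[w] / ta - cb[w] / tb) for w in set(ca) | set(cb))


def pearson(xs, ys):
    """Linear (Pearson) correlation between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length >= 2")
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(vx * vy)
```

Under these assumptions, one would compute `composition_dissimilarity` for every pair of sequences in a dataset and pass the resulting list, together with the matching reference alignment-based distances, to `pearson` to obtain the kind of accuracy score the abstract describes.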


Year:  2006        PMID: 17061922     DOI: 10.1089/cmb.2006.13.1465

Source DB:  PubMed          Journal:  J Comput Biol        ISSN: 1066-5277            Impact factor:   1.479


  7 in total

1.  Automatic detection of anchor points for multiple sequence alignment.

Authors:  Florian Pitschi; Claudine Devauchelle; Eduardo Corel
Journal:  BMC Bioinformatics       Date:  2010-09-02       Impact factor: 3.169

2.  MS4--Multi-Scale Selector of Sequence Signatures: an alignment-free method for classification of biological sequences.

Authors:  Eduardo Corel; Florian Pitschi; Ivan Laprevotte; Gilles Grasseau; Gilles Didier; Claudine Devauchelle
Journal:  BMC Bioinformatics       Date:  2010-07-30       Impact factor: 3.169

3.  PICS-Ord: unlimited coding of ambiguous regions by pairwise identity and cost scores ordination.

Authors:  Robert Lücking; Brendan P Hodkinson; Alexandros Stamatakis; Reed A Cartwright
Journal:  BMC Bioinformatics       Date:  2011-01-07       Impact factor: 3.169

4.  Comparing sequences without using alignments: application to HIV/SIV subtyping.

Authors:  Gilles Didier; Laurent Debomy; Maude Pupin; Ming Zhang; Alexander Grossmann; Claudine Devauchelle; Ivan Laprevotte
Journal:  BMC Bioinformatics       Date:  2007-01-02       Impact factor: 3.169

5.  Estimating evolutionary distances between genomic sequences from spaced-word matches.

Authors:  Burkhard Morgenstern; Bingyao Zhu; Sebastian Horwege; Chris André Leimeister
Journal:  Algorithms Mol Biol       Date:  2015-02-11       Impact factor: 1.405

6.  DIALIGN-TX: greedy and progressive approaches for segment-based multiple sequence alignment.

Authors:  Amarendran R Subramanian; Michael Kaufmann; Burkhard Morgenstern
Journal:  Algorithms Mol Biol       Date:  2008-05-27       Impact factor: 1.405

7.  Comparison study on k-word statistical measures for protein: from sequence to 'sequence space'.

Authors:  Qi Dai; Tianming Wang
Journal:  BMC Bioinformatics       Date:  2008-09-23       Impact factor: 3.169

