A probabilistic measure for alignment-free sequence comparison.

Tuan D Pham, Johannes Zuegg.

Abstract

MOTIVATION: Alignment-free sequence comparison methods are still in the early stages of development compared to those of alignment-based sequence analysis. In this paper, we introduce a probabilistic measure of similarity between two biological sequences without alignment. The method is based on the concept of comparing the similarity/dissimilarity between two constructed Markov models.
RESULTS: The method was tested on six DNA sequences, namely the thrA, thrB and thrC genes of the threonine operons from Escherichia coli K-12 and from Shigella flexneri, and on one random sequence having the same base composition as thrA from E. coli. The results were compared with those obtained from the CLUSTAL W algorithm (alignment-based) and the chaos game representation (alignment-free). The method was further tested on a more complex set of 40 DNA sequences and compared with other existing alignment-free sequence similarity measures.
AVAILABILITY: All datasets and computer codes written in MATLAB are available upon request from the first author.
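
The abstract does not state the order of the Markov models, how they are estimated, or the formula used to compare them. The following Python sketch is therefore only an illustration of the general idea, not the authors' measure (their code is in MATLAB and available on request). It assumes first-order chains over the alphabet ACGT, add-one smoothing, and a symmetrised difference of average log-likelihoods; the function names `fit_markov`, `avg_log_likelihood` and `dissimilarity` are hypothetical.

```python
import math

ALPHABET = "ACGT"  # assumption: plain DNA alphabet, other symbols are skipped


def fit_markov(seq, pseudocount=1.0):
    """Estimate first-order transition probabilities P(b | a) from one sequence.

    Add-one smoothing (an assumption of this sketch) keeps every
    probability strictly positive so that logarithms are always defined.
    """
    counts = {a: {b: pseudocount for b in ALPHABET} for a in ALPHABET}
    for a, b in zip(seq, seq[1:]):
        if a in counts and b in counts:
            counts[a][b] += 1
    model = {}
    for a in ALPHABET:
        total = sum(counts[a].values())
        model[a] = {b: counts[a][b] / total for b in ALPHABET}
    return model


def avg_log_likelihood(seq, model):
    """Average log-probability per transition of seq under the given model."""
    pairs = [(a, b) for a, b in zip(seq, seq[1:]) if a in model and b in model]
    if not pairs:
        raise ValueError("sequence has no usable transitions")
    return sum(math.log(model[a][b]) for a, b in pairs) / len(pairs)


def dissimilarity(x, y):
    """Symmetrised likelihood gap between the models fitted to x and to y.

    Each term measures how much worse a sequence is explained by the
    other sequence's model than by its own model.
    """
    mx, my = fit_markov(x), fit_markov(y)
    d_xy = avg_log_likelihood(x, mx) - avg_log_likelihood(x, my)
    d_yx = avg_log_likelihood(y, my) - avg_log_likelihood(y, mx)
    return 0.5 * (d_xy + d_yx)
```

Under these assumptions, `dissimilarity(x, x)` is exactly zero and the value grows as the transition statistics of the two sequences diverge, with no alignment step involved. A pairwise matrix of such values could then be passed to any standard clustering or tree-building routine.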

Year: 2004    PMID: 15271780    DOI: 10.1093/bioinformatics/bth426

Source DB: PubMed    Journal: Bioinformatics    ISSN: 1367-4803    Impact factor: 6.937


  23 in total

1.  Protein sequence analysis based on hydropathy profile of amino acids.

Authors:  Xiao-li Xie; Li-fei Zheng; Ying Yu; Li-ping Liang; Man-cai Guo; John Song; Zhi-fa Yuan
Journal:  J Zhejiang Univ Sci B       Date:  2012-02       Impact factor: 3.066

2.  Sequence analysis by iterated maps, a review.

Authors:  Jonas S Almeida
Journal:  Brief Bioinform       Date:  2013-10-25       Impact factor: 11.622

3.  Large local analysis of the unaligned genome and its application.

Authors:  Lianping Yang; Xiangde Zhang; Tianming Wang; Hegui Zhu
Journal:  J Comput Biol       Date:  2013-01       Impact factor: 1.479

4.  Integrating overlapping structures and background information of words significantly improves biological sequence comparison.

Authors:  Qi Dai; Lihua Li; Xiaoqing Liu; Yuhua Yao; Fukun Zhao; Michael Zhang
Journal:  PLoS One       Date:  2011-11-10       Impact factor: 3.240

5.  Genome BLAST distance phylogenies inferred from whole plastid and whole mitochondrion genome sequences.

Authors:  Alexander F Auch; Stefan R Henz; Barbara R Holland; Markus Göker
Journal:  BMC Bioinformatics       Date:  2006-07-19       Impact factor: 3.169

6.  An investigation into inter- and intragenomic variations of graphic genomic signatures.

Authors:  Rallis Karamichalis; Lila Kari; Stavros Konstantinidis; Steffen Kopecki
Journal:  BMC Bioinformatics       Date:  2015-08-07       Impact factor: 3.169

7.  Prediction of high-risk types of human papillomaviruses using statistical model of protein "sequence space".

Authors:  Cong Wang; Yabing Hai; Xiaoqing Liu; Nanfang Liu; Yuhua Yao; Pingan He; Qi Dai
Journal:  Comput Math Methods Med       Date:  2015-04-20       Impact factor: 2.238

8.  A hybrid distance measure for clustering expressed sequence tags originating from the same gene family.

Authors:  Keng-Hoong Ng; Chin-Kuan Ho; Somnuk Phon-Amnuaisuk
Journal:  PLoS One       Date:  2012-10-11       Impact factor: 3.240

9.  Pattern-based phylogenetic distance estimation and tree reconstruction.

Authors:  Michael Höhl; Isidore Rigoutsos; Mark A Ragan
Journal:  Evol Bioinform Online       Date:  2007-02-25       Impact factor: 1.625

10.  Comparison study on k-word statistical measures for protein: from sequence to 'sequence space'.

Authors:  Qi Dai; Tianming Wang
Journal:  BMC Bioinformatics       Date:  2008-09-23       Impact factor: 3.169
