
A statistical method for alignment-free comparison of regulatory sequences.

Miriam R Kantorovitz, Gene E Robinson, Saurabh Sinha.

Abstract

MOTIVATION: The similarity of two biological sequences has traditionally been assessed within the well-established framework of alignment. Here we focus on the task of identifying functional relationships between cis-regulatory sequences that are non-orthologous or greatly diverged. 'Alignment-free' measures of sequence similarity are required in this regime.
RESULTS: We investigate the use of a new score for alignment-free sequence comparison, called the D2z score. It is based on comparing the frequencies of all fixed-length words in the two sequences. An important, novel feature of the D2z score is that it is comparable across sequence pairs drawn from arbitrary background distributions. We present a method that gives a quadratic improvement over the naïve method in the time complexity of calculating the D2z score. We then evaluate the D2z score on several tissue-specific families of cis-regulatory modules (in Drosophila and human). The new score is highly successful in discriminating functionally related regulatory sequences from unrelated sequence pairs. The performance of the D2z score is compared to five other alignment-free similarity measures and shown to be consistently superior to all of them.
AVAILABILITY: Our implementation of the D2z score will be made freely available as source code, upon publication of this article, at: http://veda.cs.uiuc.edu/d2z/.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
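
The word-frequency comparison described under RESULTS can be illustrated with a minimal Python sketch. It counts all overlapping words of a fixed length k in each sequence, takes the inner product of the two count vectors (the number of k-word matches, commonly called D2), and then standardizes that value as a z-score. The function names (`kmer_counts`, `d2`, `empirical_z`) and the parameters `n_shuffles` and `seed` are hypothetical and introduced here for illustration only. The null mean and standard deviation are estimated by shuffling the sequences, which is a simple stand-in and not the authors' method: the abstract describes a faster calculation that works for arbitrary background distributions, whereas shuffling only preserves single-letter composition.

```python
from collections import Counter
import random
import statistics


def kmer_counts(seq, k):
    """Count every overlapping word of length k in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def d2(seq_a, seq_b, k):
    """Number of k-word matches: sum over words w of count_a(w) * count_b(w)."""
    counts_a = kmer_counts(seq_a, k)
    counts_b = kmer_counts(seq_b, k)
    return sum(n * counts_b[w] for w, n in counts_a.items())


def empirical_z(seq_a, seq_b, k, n_shuffles=200, seed=0):
    """Standardize D2 against a shuffle-based null distribution.

    Assumption: shuffling is used only for illustration; it keeps the
    letter composition of each sequence but ignores higher-order
    background structure.
    """
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    observed = d2(seq_a, seq_b, k)
    null_scores = []
    for _ in range(n_shuffles):
        a, b = list(seq_a), list(seq_b)
        rng.shuffle(a)
        rng.shuffle(b)
        null_scores.append(d2("".join(a), "".join(b), k))
    mean = statistics.mean(null_scores)
    sd = statistics.pstdev(null_scores)
    if sd == 0:
        return 0.0  # degenerate null: every shuffle gives the same D2
    return (observed - mean) / sd


if __name__ == "__main__":
    # Two toy sequences; real cis-regulatory modules are far longer.
    print(d2("ACGTACGT", "ACGTTGCA", 2))
    print(empirical_z("ACGTACGT", "ACGTTGCA", 2))
```

Because the z-score is expressed in units of the null standard deviation, values from sequence pairs with different compositions become roughly comparable, which is the property the abstract highlights for the authors' score.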

Year:  2007        PMID: 17646303     DOI: 10.1093/bioinformatics/btm211

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


