
Approximation of sojourn-times via maximal couplings: motif frequency distributions.

Manuel E Lladser, Stephen R Chestnut.

Abstract

Sojourn-times provide a versatile framework to assess the statistical significance of motifs in genome-wide searches, even under non-Markovian background models. However, the large state spaces encountered in genomic sequence analyses make the exact calculation of sojourn-time distributions computationally intractable in long sequences. Here, we use coupling and analytic combinatoric techniques to approximate these distributions in the general setting of Polish state spaces, which encompass discrete state spaces. Our approximations are accompanied by explicit, easy-to-compute error bounds for the total variation distance. Broadly speaking, if T_n is the random number of times a Markov chain visits a certain subset T of states in its first n transitions, then we can usually approximate the distribution of T_n for n of order (1 − α)^(−m), where m is the largest integer for which the exact distribution of T_m is accessible and 0 ≤ α ≤ 1 is an ergodicity coefficient associated with the probability transition kernel of the chain. This gives access to approximations of sojourn-times in the intermediate regime where n is perhaps too large for exact calculations, but too small to rely on Normal approximations or on the stationarity assumptions underlying Poisson and compound Poisson approximations. As proof of concept, we approximate the distribution of the number of matches with a motif in promoter regions of C.
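To make the quantity T_n concrete, the following is a minimal sketch, assuming a small finite state space, of how the exact distribution of T_n can be computed by dynamic programming over the joint law of the current state and the visit count. This is a brute-force illustration of the object being approximated, not the coupling method of the paper. The function names, the two-state example chain, and the choice of a pairwise-overlap quantity as the ergodicity coefficient are all assumptions made for illustration; the abstract does not define α.

```python
def sojourn_distribution(P, init, target, n):
    """Exact law of T_n = #{1 <= k <= n : X_k in target} for a finite chain.

    P      : transition matrix as a list of rows (hypothetical example input)
    init   : initial distribution of X_0
    target : set of state indices playing the role of the subset T
    n      : number of transitions
    Returns a list d with d[t] = P(T_n = t) for t = 0..n.
    """
    s = len(P)
    # f[x][t] = P(X_k = x, T_k = t); at k = 0 no visit has been counted yet.
    f = [[init[x]] + [0.0] * n for x in range(s)]
    for _ in range(n):
        g = [[0.0] * (n + 1) for _ in range(s)]
        for x in range(s):
            for t in range(n + 1):
                mass = f[x][t]
                if mass == 0.0:
                    continue
                for y in range(s):
                    # A transition into the target set adds one visit.
                    hit = 1 if y in target else 0
                    g[y][t + hit] += mass * P[x][y]
        f = g
    # Marginalize over the final state.
    return [sum(f[x][t] for x in range(s)) for t in range(n + 1)]


def overlap_coefficient(P):
    """Smallest pairwise overlap between rows of P.

    ASSUMPTION: used here only as one plausible ergodicity coefficient in
    [0, 1]; the abstract does not specify how alpha is defined.
    """
    s = len(P)
    return min(
        sum(min(P[x][z], P[y][z]) for z in range(s))
        for x in range(s)
        for y in range(s)
    )


# Hypothetical two-state chain: state 1 plays the role of the subset T.
P_example = [[0.9, 0.1], [0.5, 0.5]]
dist = sojourn_distribution(P_example, init=[1.0, 0.0], target={1}, n=3)
alpha = overlap_coefficient(P_example)
```

The cost of this recursion grows with the number of states squared times n squared, which is why exact calculation becomes impractical for the large state spaces and long sequences mentioned above.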


Year: 2013    PMID: 23739838    DOI: 10.1007/s00285-013-0690-6

Source DB: PubMed    Journal: J Math Biol    ISSN: 0303-6812    Impact factor: 2.259


