An overview on the distribution of word counts in Markov chains.

S Schbath.

Abstract

In this paper, we give an overview of the existing results on the statistical distribution of word counts in a Markovian sequence of letters. Results concerning the number of overlapping occurrences, the number of renewals, and the number of clumps are presented. Counts of single words and of multiple words are both considered. Most of the results are approximations that hold as the length of the sequence tends to infinity. We will see that Gaussian approximations give way to (compound) Poisson approximations for rare words. When DNA sequences or proteins are modeled by stationary Markov chains, these results can be used to study the statistical frequency of motifs in a given sequence.
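The three counts named in the abstract can be illustrated with a minimal Python sketch. It is not taken from the paper. It assumes the usual definitions: every occurrence is counted, overlaps included; a renewal is an occurrence that does not overlap the previous renewal when scanning left to right; and a clump is a maximal run of mutually overlapping occurrences. The function names `occurrence_starts` and `count_word` are hypothetical, chosen for illustration only.

```python
def occurrence_starts(seq, word):
    """Return the start positions of all occurrences of word in seq,
    including occurrences that overlap each other."""
    if not word:
        raise ValueError("word must be non-empty")
    starts = []
    pos = seq.find(word)
    while pos != -1:
        starts.append(pos)
        # Advance by one position only, so overlapping matches are found.
        pos = seq.find(word, pos + 1)
    return starts


def count_word(seq, word):
    """Count overlapping occurrences, renewals, and clumps of word in seq.

    Assumed definitions (not quoted from the paper):
      - overlapping: every occurrence counts;
      - renewals: occurrences that do not overlap the previous renewal;
      - clumps: maximal runs of occurrences in which each one overlaps
        the one before it.
    """
    starts = occurrence_starts(seq, word)
    m = len(word)
    renewals = 0
    clumps = 0
    next_free = 0  # first position at which a new renewal may start
    prev = None    # start of the previous occurrence
    for s in starts:
        if s >= next_free:
            renewals += 1
            next_free = s + m
        if prev is None or s >= prev + m:
            # No overlap with the previous occurrence: a new clump begins.
            clumps += 1
        prev = s
    return {"overlapping": len(starts), "renewals": renewals, "clumps": clumps}


if __name__ == "__main__":
    print(count_word("ATATATAGGATAT", "ATA"))
```

For the sequence `ATATATAGGATAT` and the word `ATA`, the occurrences start at positions 0, 2, 4 and 9, so the sketch reports 4 overlapping occurrences, 3 renewals (positions 0, 4, 9) and 2 clumps (positions 0 to 6 and 9 to 11). The three counts differ only for words that can overlap themselves, which is why the abstract treats them separately.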

Year:  2000        PMID: 10890396     DOI: 10.1089/10665270050081469

Source DB:  PubMed          Journal:  J Comput Biol        ISSN: 1066-5277            Impact factor:   1.479


  8 in total

1.  Computational approaches to identify promoters and cis-regulatory elements in plant genomes. (Review)

Authors:  Stephane Rombauts; Kobe Florquin; Magali Lescot; Kathleen Marchal; Pierre Rouzé; Yves van de Peer
Journal:  Plant Physiol       Date:  2003-07       Impact factor: 8.340

2.  Normal and compound Poisson approximations for pattern occurrences in NGS reads.

Authors:  Zhiyuan Zhai; Gesine Reinert; Kai Song; Michael S Waterman; Yihui Luan; Fengzhu Sun
Journal:  J Comput Biol       Date:  2012-06       Impact factor: 1.479

3.  The power of detecting enriched patterns: an HMM approach.

Authors:  Zhiyuan Zhai; Shih-Yen Ku; Yihui Luan; Gesine Reinert; Michael S Waterman; Fengzhu Sun
Journal:  J Comput Biol       Date:  2010-04       Impact factor: 1.479

4.  Statistical modelling of bacterial promoter sequences for regulatory motif discovery with the help of transcriptome data: application to Listeria monocytogenes.

Authors:  Ibrahim Sultan; Vincent Fromion; Sophie Schbath; Pierre Nicolas
Journal:  J R Soc Interface       Date:  2020-10-07       Impact factor: 4.118

5.  Abundant oligonucleotides common to most bacteria.

Authors:  Colin F Davenport; Burkhard Tümmler
Journal:  PLoS One       Date:  2010-03-23       Impact factor: 3.240

6.  Nucleotide frequency variation across human genes.

Authors:  Elizabeth Louie; Jurg Ott; Jacek Majewski
Journal:  Genome Res       Date:  2003-11-12       Impact factor: 9.043

7.  MOST+: A de novo motif finding approach combining genomic sequence and heterogeneous genome-wide signatures.

Authors:  Yizhe Zhang; Yupeng He; Guangyong Zheng; Chaochun Wei
Journal:  BMC Genomics       Date:  2015-06-11       Impact factor: 3.969

8.  cWords - systematic microRNA regulatory motif discovery from mRNA expression data.

Authors:  Anders Jacobsen; Simon H Rasmussen; Anders Krogh
Journal:  Silence       Date:  2013-05-20