Rose Hoberman, David Sankoff, Dannie Durand.
Abstract
Statistical validation of gene clusters is imperative for many important applications in comparative genomics which depend on the identification of genomic regions that are historically and/or functionally related. We develop the first rigorous statistical treatment of max-gap clusters, a cluster definition frequently used in empirical studies. We present exact expressions for the probability of observing an individual cluster of a set of marked genes in one genome, as well as upper and lower bounds on the probability of observing a cluster of h homologs in a pairwise whole-genome comparison. We demonstrate the utility of our approach by applying it to a whole-genome comparison of E. coli and B. subtilis. Code for statistical tests is available at.
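To make the max-gap notion concrete, the following minimal Python sketch checks whether a set of marked genes forms a single max-gap cluster and computes, by brute-force enumeration, the probability of that happening under random placement. It is an illustration only, not the authors' code or their closed-form expressions. It assumes that a genome is a linear sequence of n genes, that the m marked genes are placed uniformly at random, and that the gap between two adjacent marked genes is the number of unmarked genes between them, which must not exceed g. The function names are hypothetical.

```python
from itertools import combinations


def is_max_gap_cluster(positions, g):
    """Return True if the marked positions form one max-gap cluster.

    Assumption: the gap between adjacent marked genes is the number of
    unmarked genes lying between them, and every gap must be <= g.
    """
    ordered = sorted(positions)
    return all(b - a - 1 <= g for a, b in zip(ordered, ordered[1:]))


def brute_force_cluster_probability(n, m, g):
    """Probability that m marked genes, placed uniformly at random among
    n positions on a linear genome, form a single max-gap cluster.

    Enumerates every placement, so it is only practical for small n and m.
    """
    total = 0
    clustered = 0
    for placement in combinations(range(n), m):
        total += 1
        if is_max_gap_cluster(placement, g):
            clustered += 1
    return clustered / total


if __name__ == "__main__":
    # Two marked genes among five, no gap allowed: only adjacent pairs count.
    print(brute_force_cluster_probability(5, 2, 0))
```

Enumeration like this is useful mainly as a sanity check on small cases; for realistic genome sizes an analytical expression or sampling would be needed.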
Year: 2005 PMID: 16241899 DOI: 10.1089/cmb.2005.12.1083
Source DB: PubMed Journal: J Comput Biol ISSN: 1066-5277 Impact factor: 1.479