
Statistics for approximate gene clusters.

Katharina Jahn, Sascha Winter, Jens Stoye, Sebastian Böcker.   

Abstract

BACKGROUND: Genes that are co-localized in multiple genomes can be strong indicators of either functional constraints on genome organization or remnant ancestral gene order. The computational detection of these patterns, usually referred to as gene clusters, has become increasingly sensitive over the past decade. The most powerful approaches allow for various types of imperfect cluster conservation: cluster locations may be internally rearranged; individual cluster locations may contain only a subset of the cluster genes and may be disrupted by uninvolved genes; and cluster locations may not occur at all in some or even most of the studied genomes. Detecting such low-quality clusters increases the risk of mistaking faint patterns that occur merely by chance for genuine findings. It is therefore crucial to estimate the significance of computational gene cluster predictions and to discriminate between true conservation and coincidental clustering.
RESULTS: In this paper, we present an efficient and accurate approach to estimating the significance of gene cluster predictions under the approximate common intervals model. Given a single gene cluster prediction, we calculate the probability of observing it with the same or a higher degree of conservation under the null hypothesis of random gene order, and add a correction factor to account for multiple testing. Our approach considers all parameters that define the quality of gene cluster conservation: the number of genomes in which the cluster occurs, the number of involved genes, the degree of conservation in the different genomes, and the frequency of the clustered genes within each genome. We apply our approach to evaluate gene cluster predictions in a large set of well-annotated genomes.
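
To make the idea of a significance estimate under random gene order concrete, the following is a minimal sketch and not the method of the paper. It assumes a deliberately simplified null model: one genome that is a uniformly random permutation of distinct genes, and a fixed window of consecutive positions. Under that assumption the number of cluster genes in the window is hypergeometric. The multiple-testing step shown is a plain Bonferroni bound over window positions, used here only as a stand-in for the correction factor mentioned above. The function names and parameters are hypothetical.

```python
from math import comb


def window_tail_probability(n_genes, cluster_size, window_len, min_hits):
    """Probability that a fixed window of `window_len` positions contains at
    least `min_hits` of the `cluster_size` cluster genes, assuming the genome
    is a uniformly random permutation of `n_genes` distinct genes
    (hypergeometric tail; a simplifying assumption, not the paper's model)."""
    total = comb(n_genes, window_len)
    upper = min(cluster_size, window_len)
    favourable = sum(
        comb(cluster_size, k) * comb(n_genes - cluster_size, window_len - k)
        for k in range(min_hits, upper + 1)
    )
    return favourable / total


def bonferroni(p_value, n_tests):
    """Conservative multiple-testing correction: scale by the number of
    tests and cap the result at 1."""
    return min(1.0, p_value * n_tests)


if __name__ == "__main__":
    # Hypothetical numbers, chosen only for illustration.
    n_genes, cluster_size, window_len, min_hits = 4000, 8, 10, 6
    p_single = window_tail_probability(n_genes, cluster_size, window_len, min_hits)
    n_windows = n_genes - window_len + 1  # one test per window start position
    print(p_single, bonferroni(p_single, n_windows))
```

The sketch ignores several parameters the abstract lists, namely the number of genomes in which the cluster occurs and the frequency of the clustered genes within each genome, so it should be read as an illustration of the null-hypothesis reasoning only.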

Year: 2013    PMID: 24564620    PMCID: PMC3908651    DOI: 10.1186/1471-2105-14-S15-S14

Source DB: PubMed    Journal: BMC Bioinformatics    ISSN: 1471-2105    Impact factor: 3.169


