
Identification of functional links between genes using phylogenetic profiles.

Jie Wu, Simon Kasif, Charles DeLisi.

Abstract

MOTIVATION: Genes with identical patterns of occurrence across the phyla tend to function together in the same protein complexes or participate in the same biochemical pathway. However, the requirement that the profiles be identical (i) severely restricts the number of functional links that can be established by such phylogenetic profiling; (ii) limits detection to very strong functional links, failing to capture relations between genes that are not in the same pathway, but nevertheless subserve a common function and (iii) misses relations between analogous genes. Here we present and apply a method for relaxing the restriction, based on the probability that a given arbitrary degree of similarity between two profiles would occur by chance, with no biological pressure. Function is then inferred at any desired level of confidence.
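As an illustration of scoring profile similarity against chance, the sketch below computes the probability that two genes share at least k genomes when their occurrences are placed at random. The abstract does not state the expression the authors derive; a hypergeometric null model is assumed here as one standard choice, and the function names and the example counts are hypothetical.

```python
from math import comb


def cooccurrence_pmf(k, n_genomes, x, y):
    """Probability of exactly k shared genomes under an assumed hypergeometric null.

    n_genomes: total number of genomes in the profile
    x, y:      number of genomes in which gene A and gene B occur
    """
    lowest = max(0, x + y - n_genomes)  # overlap forced by the pigeonhole principle
    highest = min(x, y)                 # overlap cannot exceed the rarer gene
    if k < lowest or k > highest:
        return 0.0
    return comb(x, k) * comb(n_genomes - x, y - k) / comb(n_genomes, y)


def cooccurrence_pvalue(k, n_genomes, x, y):
    """Probability of k or more shared genomes by chance (upper tail)."""
    return sum(cooccurrence_pmf(j, n_genomes, x, y)
               for j in range(k, min(x, y) + 1))


if __name__ == "__main__":
    # Hypothetical example: 44 genomes, each gene present in 12, sharing 9.
    p = cooccurrence_pvalue(9, 44, 12, 12)
    print(f"p-value = {p:.3g}; linked at the 0.01 level: {p < 0.01}")
```

Under this assumption, a pair would be called linked when the tail probability falls below the chosen significance level, so profiles need not be identical to count as evidence of a functional link.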
RESULTS: We derive an expression for the probability distribution of a given number of chance co-occurrences of a pair of non-homologous orthologs across a set of genomes. The method is applied to 2905 clusters of orthologous genes (COGs) from 44 fully sequenced microbial genomes representing all three domains of life. Among the results are the following. (1) Of the 51 000 annotated intrapathway gene pairs, 8935 are linked at a significance level of 0.01. This is over 30-fold greater than the 271 intrapathway pairs obtained at the same confidence level when identical profiles are used. (2) Of the 540 000 interpathway gene pairs, some 65 000 are linked at the 0.01 level of significance, some 12 standard deviations beyond the number expected by chance at this confidence level. We speculate that many of these links involve nearest-neighbor pathways, and discuss some examples. (3) The difference in the percentage of linked interpathway and intrapathway genes is highly significant, consistent with the intuitive expectation that genes in the same pathway are generally under greater selective pressure than those that are not. (4) The method appears to recover metabolic networks well. This is illustrated by the TCA cycle, which is recovered as a highly connected, weighted-edge network of 30 of its 31 COGs. (5) The fraction of pairs having a common pathway is a symmetric function of the Hamming distance between their profiles. This finding, that the functional correlation between profiles with near-maximum Hamming distance is as large as that between profiles with near-zero Hamming distance, and as statistically significant, is plausibly explained if the former group represents analogous genes.
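To make result (5) concrete, the sketch below computes the Hamming distance between binary presence/absence profiles. The profiles and the function name are hypothetical, chosen only to show the two extremes: identical profiles give distance zero, while fully complementary profiles give the maximum distance, the pattern the abstract suggests may arise from analogous genes.

```python
def hamming_distance(profile_a, profile_b):
    """Number of genomes in which exactly one of the two genes is present."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same set of genomes")
    return sum(a != b for a, b in zip(profile_a, profile_b))


if __name__ == "__main__":
    # Hypothetical profiles over 8 genomes (1 = gene present, 0 = absent).
    gene_a = [1, 1, 0, 0, 1, 0, 1, 0]
    gene_b = list(gene_a)                # identical occurrence pattern
    gene_c = [1 - bit for bit in gene_a] # complementary occurrence pattern

    print(hamming_distance(gene_a, gene_b))
    print(hamming_distance(gene_a, gene_c))
```

A gene pair with the complementary pattern never co-occurs, which is what one might expect if two unrelated genes fill the same role in different organisms.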


Year:  2003        PMID: 12912833     DOI: 10.1093/bioinformatics/btg187

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937

