Elhanan Borenstein, Tomer Shlomi, Eytan Ruppin, Roded Sharan.
Abstract
The rate of conservation of a gene in evolution is believed to be correlated with its biological importance. Recent studies have devised various conservation measures for genes and have shown that they are correlated with several biological characteristics of functional importance. Specifically, the state-of-the-art propensity for gene loss (PGL) measure was shown to be strongly correlated with a gene's essentiality and its number of protein-protein interactions (PPIs). The observed correlation between conservation and functional importance varies, however, between conservation measures, underscoring the need for accurate and general measures of the rate of gene conservation. Here we develop a novel maximum-likelihood approach to computing the rate at which a gene is lost in evolution, motivated by the same principles as those underlying PGL. However, in contrast to PGL, which considers only the most parsimonious ancestral states of the internal nodes of the phylogenetic tree relating the species, our approach weighs all possible ancestral states in a probabilistic manner and includes the branch length information as part of the probabilistic model. In application to data from 16 eukaryotic genomes, our approach shows higher correlations with experimental data than PGL, including data on gene lethality, level of connectivity in a PPI network and coherence within functionally related genes.
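To make the idea of weighing all ancestral states concrete, the following is a minimal sketch, not the authors' implementation. It assumes a simple loss-only model that the abstract does not spell out: the gene is present at the root, it is lost along a branch of length t with probability 1 - exp(-r t), and it is never regained. The likelihood of a presence/absence pattern is summed over all internal-node states with Felsenstein's pruning recursion, and the rate r is then estimated by a plain grid search. The tree encoding and all function names (`transition`, `partial`, `log_likelihood`, `ml_rate`) are hypothetical.

```python
import math

def transition(parent, child, rate, t):
    """P(child state | parent state) on a branch of length t (assumed loss-only model)."""
    if parent == 0:
        return 1.0 if child == 0 else 0.0  # assumption: a lost gene is never regained
    survive = math.exp(-rate * t)
    return survive if child == 1 else 1.0 - survive

def partial(node, rate):
    """Return (L0, L1): likelihood of the leaves below `node` given its state is 0 or 1."""
    if "present" in node:  # leaf with an observed state
        return (0.0, 1.0) if node["present"] else (1.0, 0.0)
    result = [1.0, 1.0]
    for child, t in node["children"]:
        below = partial(child, rate)
        for s in (0, 1):
            # Sum over both possible child states instead of fixing a parsimonious one.
            result[s] *= sum(transition(s, k, rate, t) * below[k] for k in (0, 1))
    return tuple(result)

def log_likelihood(tree, rate):
    """Log-likelihood of the phyletic pattern, assuming the gene is present at the root."""
    lik = partial(tree, rate)[1]
    return math.log(lik) if lik > 0 else float("-inf")

def ml_rate(tree, step=0.001, max_rate=5.0):
    """Grid-search estimate of the loss rate (a crude stand-in for a real optimizer)."""
    rates = [i * step for i in range(1, int(max_rate / step) + 1)]
    return max(rates, key=lambda r: log_likelihood(tree, r))

# Toy tree: two leaves on branches of length 1; the gene is kept in A and lost in B.
toy = {"children": [({"present": 1}, 1.0), ({"present": 0}, 1.0)]}
print(round(ml_rate(toy), 3))
```

For this toy tree the likelihood is exp(-r)(1 - exp(-r)), so the estimate should land near ln 2. Branch lengths enter only through the product r t, which is how a model of this kind can distinguish one loss on a long branch from several losses on short ones.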
Year: 2006 PMID: 17158152 PMCID: PMC1802574 DOI: 10.1093/nar/gkl792
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. A simple phylogenetic tree, demonstrating the difference between the GLR and PGL measures. Presence or absence of a gene is indicated by 1 or 0, respectively, next to the corresponding node (based on a parsimonious reconstruction of the ancestral phyletic pattern). Dotted lines represent branches in which the gene was lost. (a) The gene is lost in one branch of length l. (b) The gene is lost in two branches, each of length l/2.
Figure 2. The phylogenetic tree used in our analysis, relating 16 eukaryotes. Estimated divergence times (in millions of years ago) are shown for all internal nodes. The number in parentheses next to each branch indicates the expected number of gene losses (see Methods).
Figure 3. ROC curves for GLR and PGL illustrating the increased specificity and sensitivity of GLR in predicting the lethality of genes. The areas under the ROC curves are 0.78 and 0.72 for GLR and PGL, respectively.
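As an illustration of how an area under the ROC curve such as those quoted for Figure 3 can be computed, here is a minimal sketch using the rank-based definition: the AUC equals the probability that a randomly chosen positive (lethal) gene scores higher than a randomly chosen negative one, with ties counted as one half. The function name `roc_auc` and the toy numbers are hypothetical and are not data from the paper. The sketch also assumes that a lower loss rate is taken as evidence for lethality, so the rates are negated to form scores.

```python
def roc_auc(scores, labels):
    """AUC via pairwise comparison of positive and negative scores (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical loss rates for four genes and whether each knockout is lethal (1) or not (0).
loss_rates = [0.10, 0.40, 0.35, 0.80]
lethal = [1, 1, 0, 0]

# Assumption: a lower loss rate predicts lethality, so negate the rates to get scores.
auc = roc_auc([-r for r in loss_rates], lethal)
print(auc)  # 0.75
```

An AUC of 0.5 corresponds to a random ranking and 1.0 to a perfect one, which is the scale on which the reported values of 0.78 and 0.72 should be read.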
Correlation between the GLR or PGL measure and the connectivity level in a PPI network
| | Yeast | Worm |
|---|---|---|
| GLR | −0.429 ( | −0.524 ( |
| PGL | −0.316 ( | not significant |