Co-evolutionary rates of functionally related yeast genes.

Leonardo Mariño-Ramírez, Olivier Bodenreider, Natalie Kantz, I King Jordan.

Abstract

Evolutionary knowledge is often used to facilitate computational attempts at gene function prediction. One rich source of evolutionary information is the relative rates of gene sequence divergence, and in this report we explore the connection between gene evolutionary rates and function. We performed a genome-scale evaluation of the relationship between evolutionary rates and functional annotations for the yeast Saccharomyces cerevisiae. Non-synonymous (dN) and synonymous (dS) substitution rates were calculated for 1,095 orthologous gene sets common to S. cerevisiae and six other closely related yeast species. Differences in evolutionary rates between pairs of genes (ΔdN & ΔdS) were then compared to their functional similarities (sGO), which were measured using Gene Ontology (GO) annotations. Substantial and statistically significant correlations were found between ΔdN and sGO, whereas there is no apparent relationship between ΔdS and sGO. These results are consistent with a mode of action for natural selection that is based on similar rates of elimination of deleterious protein coding sequence variants for functionally related genes. The connection between gene evolutionary rates and function was stronger than seen for phylogenetic profiles, which have previously been employed to inform functional inference. The co-evolution of functionally related yeast genes points to the relevance of specific function for the efficacy of natural selection and underscores the utility of gene evolutionary rates for functional predictions.

Keywords:  Functional inference; Co-evolution; gene ontology; genome evolution; natural selection

Year:  2006        PMID: 18345352      PMCID: PMC2674680     

Source DB:  PubMed          Journal:  Evol Bioinform Online        ISSN: 1176-9343            Impact factor:   1.625


Many post-genomic research efforts are aimed at uncovering relationships among genes, and the yeast Saccharomyces cerevisiae has served as a model system for such investigations (Cherry et al. 1998). A particular emphasis has been placed on high-throughput experimental attempts to elucidate various kinds of interactions between pairs of genes (or proteins), such as physical protein-protein interactions (Krogan et al. 2006), synthetic lethal gene pairs (Tong et al. 2004) and regulatory interactions between transcription factors and promoters (Harbison et al. 2004). The characterization of such relationships has the potential to reveal important clues as to the function of individual genes. Perhaps even more importantly, this line of inquiry can reveal higher order relationships, which define groups of genes that function as integrated biological systems (Ideker et al. 2001). In addition to the kinds of experimental approaches mentioned above, computational analyses have also been brought to bear on the discovery of functional relationships between genes. These include classic information transfer techniques that rely on sequence similarity searches, using BLAST (Altschul et al. 1997) for instance, as well as more recently developed techniques that seek to exploit information on the co-occurrence of genes in different organisms (Pellegrini et al. 1999). What many of these computational approaches share is a reliance, either implicit or explicit, on evolutionary information. Information transfer via BLAST rests on the fact that molecular evolution is a conservative process marked by the preservation of biochemical function among related genes. Phylogenetic profile methods, which evaluate patterns of gene presence and absence across sets of species, work because natural selection tends to maintain functionally related genes as coherent sets within evolutionary lineages.
In this manuscript, we report an attempt to assess the utility of an additional source of evolutionary information for functional inference, namely the relative rates of gene evolution. Our approach is based on a growing body of literature that points to the connections between various phenotypic aspects of genes and their rates of evolution (Wall et al. 2005; Wolf et al. 2006). Among other findings, these studies have uncovered co-evolutionary connections between particular phenotypes and rates of gene evolution. For instance, genes that encode physically interacting proteins tend to evolve at similar rates (Fraser et al. 2002), as do genes that are co-expressed across similar tissue types (Jordan et al. 2004). It stands to reason that, as a general principle, genes with similar functional affinities should have similar (average) rates of evolution. We set out to test this notion by comparing the relative rates of evolution between orthologs, detected for S. cerevisiae and six closely related yeast species, with their Gene Ontology (GO) functional annotations.
A total of 1,095 sets of orthologous yeast genes were identified by using all-against-all reciprocal BLASTP searches (E-value cutoff of 1e−10) between S. cerevisiae and six closely related species with complete whole-genome draft sequences (Cliften et al. 2003; Kellis et al. 2003): S. paradoxus, S. mikatae, S. kudriavzevii, S. bayanus, S. castelli and S. kluyveri. Protein sequences of each orthologous set were aligned using ClustalW (Thompson et al. 1994), and the protein alignments were used to guide in-frame alignments of the corresponding protein-coding DNA sequences. For each set of seven aligned orthologous genes, pairwise non-synonymous (dN) and synonymous (dS) substitution rates were computed between S. cerevisiae and each of the other six species using the modified Nei-Gojobori method (Nei and Gojobori 1986) implemented in the program PAML (Yang 1997).
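The protein-guided in-frame alignment step described above can be illustrated with a minimal Python sketch. It threads a coding sequence onto one gapped row of a protein alignment, emitting one codon per aligned residue and three gap characters per alignment gap. The function name thread_codons and the toy sequences are hypothetical and not taken from the study; the sketch assumes the coding sequence starts in frame and encodes the protein one codon per residue.

```python
def thread_codons(aligned_protein: str, cds: str) -> str:
    """Insert gaps into a coding sequence so it mirrors a gapped protein row.

    Assumption: `cds` is in frame and its first codon encodes the first
    residue of the (ungapped) protein. Any trailing stop codon is ignored.
    """
    n_residues = sum(1 for ch in aligned_protein if ch != "-")
    if len(cds) < 3 * n_residues:
        raise ValueError("coding sequence is too short for the protein")

    pieces = []
    pos = 0  # current offset into the coding sequence
    for ch in aligned_protein:
        if ch == "-":
            pieces.append("---")           # one protein gap = three DNA gaps
        else:
            pieces.append(cds[pos:pos + 3])  # copy the codon for this residue
            pos += 3
    return "".join(pieces)


# Hypothetical toy input: a protein row "M-K" and its coding sequence.
toy_aligned = thread_codons("M-K", "ATGAAA")
```

Because gaps are only ever inserted in multiples of three, the resulting DNA alignment keeps every codon intact, which is what codon-based dN and dS estimation requires.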
The resulting evolutionary distance values were used to calculate pairwise distance differences (ΔdN & ΔdS) between S. cerevisiae genes, across each of the six between-species comparisons. Specifically, for any pair of S. cerevisiae genes i and j: $\Delta dN_{ij} = |dN_i - dN_j|$ and $\Delta dS_{ij} = |dS_i - dS_j|$. This approach allowed us to evaluate the differences in evolutionary distances for pairs of genes over a range of phylogenetic distances from S. cerevisiae.
A modified version of the semantic similarity method (Lord et al. 2003) was used to quantitatively assess the functional relationships between S. cerevisiae genes. Functional similarity coefficients between pairs of GO biological process terms, $s(c_i, c_j)$, were calculated by using an information content based approach. This approach takes into account both the frequency of biological process GO terms in the Saccharomyces Genome Database (SGD; http://www.yeastgenome.org/) and the structure of the GO directed acyclic graph (DAG). The DAG was used to relate query terms by their closest parent term, i.e. the lowest common subsumer (lcs). For any term $c$, its information content, $\ln p(c)$, is a function of its number of occurrences normalized by the total number of occurrences of all GO biological process terms in the SGD. Term-term functional similarities were measured using the information content of the query terms, $\ln p(c_i)$ and $\ln p(c_j)$, and of their lowest common subsumer parent term, $\ln p_{\mathrm{lcs}}(c_i, c_j)$ (Lin, 1998):
$$s(c_i, c_j) = \frac{2 \ln p_{\mathrm{lcs}}(c_i, c_j)}{\ln p(c_i) + \ln p(c_j)}$$
For any gene pair ij, all term-term similarity values were aggregated at the level of gene products to yield sGO by using the average highest similarity aggregation scheme as follows (Azuaje et al. 2005). Given m and n distinct GO terms for each gene in the pair ij,
$$sGO_{ij} = \frac{1}{m + n} \left( \sum_{k=1}^{m} \max_{l} s(c_k, c_l) + \sum_{l=1}^{n} \max_{k} s(c_k, c_l) \right)$$
where $c_k$ ranges over the m terms of gene i and $c_l$ over the n terms of gene j. Thus, we were able to quantify functional similarities as well as evolutionary rate differences for all pairwise relationships among the 1,095 orthologous S. cerevisiae genes.
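To make the similarity calculation described above more concrete, here is a minimal Python sketch of Lin's (1998) term-term similarity combined with an average highest similarity aggregation over gene pairs. The toy ontology, the term names, the probabilities in P and all function names are hypothetical and chosen only for illustration; they are not the GO graph or the SGD annotation frequencies used in the study. The sketch also assumes that the lowest common subsumer is the shared ancestor with the smallest p(c), which is one common way to operationalize it.

```python
import math

# Hypothetical toy DAG: each term maps to its direct parents.
PARENTS = {
    "root": [],
    "metabolism": ["root"],
    "transport": ["root"],
    "glycolysis": ["metabolism"],
    "tca_cycle": ["metabolism"],
}

# Hypothetical normalized occurrence frequencies p(c); the root has p = 1.
P = {"root": 1.0, "metabolism": 0.5, "transport": 0.5,
     "glycolysis": 0.2, "tca_cycle": 0.1}


def ancestors(term):
    """Return the term itself plus all of its ancestors in the DAG."""
    seen, stack = {term}, [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def lcs(c1, c2):
    """Shared ancestor with the lowest p(c), i.e. the most informative one."""
    return min(ancestors(c1) & ancestors(c2), key=lambda c: P[c])


def lin_similarity(c1, c2):
    """Lin (1998): 2 ln p(lcs) / (ln p(c1) + ln p(c2))."""
    denom = math.log(P[c1]) + math.log(P[c2])
    if denom == 0.0:  # both terms are the root, which carries no information
        return 0.0
    return 2.0 * math.log(P[lcs(c1, c2)]) / denom


def gene_similarity(terms_i, terms_j):
    """Average of the best match for every term of either gene."""
    best_i = [max(lin_similarity(a, b) for b in terms_j) for a in terms_i]
    best_j = [max(lin_similarity(a, b) for a in terms_i) for b in terms_j]
    return (sum(best_i) + sum(best_j)) / (len(terms_i) + len(terms_j))


# Hypothetical gene annotations.
s_example = gene_similarity(["glycolysis", "transport"], ["tca_cycle"])
```

In this toy example, terms that only share the root contribute a similarity of zero, while sibling terms under a more specific parent receive an intermediate score, so a gene pair's sGO rises with the specificity of the processes the two genes share.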
We then compared function with evolutionary rate to determine whether functionally related genes have more similar evolutionary rates on average. Gene pairs were sorted in ascending order according to the pairwise distance difference (ΔdN & ΔdS), grouped into 10 bins, and average binned distance differences as well as average functional similarities (sGO) were calculated. For all six between-species comparisons, a clear linear trend exists between ΔdN and sGO (Figure 1), whereby ΔdN is negatively correlated with sGO (Figure 2a). Five out of the six ΔdN-sGO correlations are statistically significant at P < 0.01 (Figure 2b). In other words, genes that are more functionally similar tend to have smaller non-synonymous distance differences, on average, than genes with increasingly different functions. The only ΔdN-sGO correlation that was not significant was observed for the comparison between S. cerevisiae and S. paradoxus (Figure 2b). Among the six species we analyzed, S. paradoxus is the most closely related to S. cerevisiae; therefore, the lack of significance for this particular pair probably reflects the low resolution afforded by the small evolutionary distances between the two species. Consistent with this interpretation, the strength of the ΔdN-sGO negative correlation, as well as its statistical significance, tends to increase together with the distance between the species being compared (Figure 2). ΔdS, on the other hand, shows virtually no correlation with sGO. The magnitudes of the ΔdS-sGO correlations are uniformly lower than seen for ΔdN; the slopes of the trend lines are notably shallower, and the signs of the correlation coefficients and trend line slopes both fluctuate between positive and negative (Figure 1 and Figure 2).
Figure 1

Average pairwise distance differences (x-axis) for 10 bins, with ΔdN shown in red and ΔdS shown in blue, are plotted against average pairwise GO functional similarities (sGO on the y-axis). The error bars correspond to 99% confidence intervals. Distances were calculated between orthologous genes of S. cerevisiae and a) S. paradoxus, b) S. mikatae, c) S. kudriavzevii, d) S. bayanus, e) S. castelli, f) S. kluyveri.

Figure 2

a) Pearson correlation (r) values are shown for the plots of distance difference (ΔdN & ΔdS) vs. GO functional similarity (sGO) in Figure 1. b) Statistical significance (−logP) values are shown for the correlations in panel a. The P < 0.01 confidence level (−logP = 2) is shown. ΔdN related values are shown in red and ΔdS related values are shown in blue. Species are ordered left-to-right in terms of increasing evolutionary distance from S. cerevisiae.
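The binning and correlation procedure behind Figures 1 and 2 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' code: the text says that gene pairs were sorted by distance difference and grouped into 10 bins, and the sketch assumes equal-count bins. The gene names, rate values and the toy_sgo function are hypothetical, constructed so that functional similarity falls as the rate difference grows.

```python
from itertools import combinations
from math import sqrt


def pairwise_points(rates, similarity):
    """Return (rate difference, functional similarity) for every gene pair."""
    return [
        (abs(rates[a] - rates[b]), similarity(a, b))
        for a, b in combinations(sorted(rates), 2)
    ]


def bin_means(points, n_bins=10):
    """Sort by rate difference, split into equal-count bins, average each bin."""
    ordered = sorted(points)
    n = len(ordered)
    means = []
    for b in range(n_bins):
        chunk = ordered[b * n // n_bins:(b + 1) * n // n_bins]
        if chunk:  # skip empty bins when there are fewer points than bins
            means.append((
                sum(d for d, _ in chunk) / len(chunk),
                sum(s for _, s in chunk) / len(chunk),
            ))
    return means


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical toy data: ten genes with evenly spaced dN values.
toy_dn = {f"gene{k}": 0.02 * k for k in range(10)}


def toy_sgo(a, b):
    """Hypothetical similarity that decreases linearly with the dN difference."""
    return 1.0 - abs(toy_dn[a] - toy_dn[b])


binned = bin_means(pairwise_points(toy_dn, toy_sgo))
r = pearson([d for d, _ in binned], [s for _, s in binned])
```

With real data, toy_sgo would be replaced by a lookup of the sGO value for each gene pair, and the same three steps would be repeated once per between-species comparison and once each for dN and dS.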

In summary, genes with similar functions tend to have similar non-synonymous evolutionary rates, on average, while synonymous substitution rates show no such relationship with function. This is not surprising, given that non-synonymous substitutions, which change the encoded amino acid, have a more profound effect on protein structure and function than synonymous substitutions, which do not result in an amino acid change. Natural selection operates based on function and, at the molecular level, acts primarily to remove deleterious protein coding sequence variants. Nevertheless, the distinction between the patterns observed for ΔdN and ΔdS underscores a demonstrable connection between the particular effects of natural selection and the specific annotated function of yeast genes.
Phylogenetic profiles have also been successfully employed to guide computationally based functional inferences, under the assumption that functionally related genes will have similar patterns of presence and absence across different species. We sought to compare the relationships between phylogenetic profiles and the same GO-based semantic measure of functional similarity that we found to be related to non-synonymous evolutionary rates. The phylogenetic profiles we analyzed are binary presence (1) and absence (0) vectors over a defined set of species. Two different sources of phylogenetic profiles were used in this analysis: (i) Marcotte group profiles (Pellegrini et al. 1999) and (ii) COG database profiles (Tatusov et al. 2003). The Marcotte profiles were based on an evaluation of 16 species, and the similarities between profiles were scored using a log-likelihood ratio as previously described (Lee et al. 2004). The COG profiles were based on the presence and absence of orthologs among 71 species, and these profiles were compared here using Jaccard and Hamming similarity measures.
As with the evolutionary rates, phylogenetic profile similarities were binned in ascending order, and average sGO values were compared to average profile similarities. All three comparisons yield a positive correlation between profile and functional similarity (Figure 3). In other words, genes that are functionally related tend to have more similar evolutionary histories in terms of gene gain and loss. However, the magnitude and significance of this effect were not nearly as strong as seen for the comparison between function and evolutionary rate. In fact, the Marcotte profiles did not yield a significantly positive correlation with sGO (Figure 3a). This may be attributable to the relative sparseness of this dataset; only ~3,000 profile comparisons over 16 species were available compared to >500,000 comparisons over 71 species for the COG data set. Indeed, COG-based profiles were significantly correlated with sGO for the Jaccard similarity measure but not when Hamming similarities were used (Figure 3b and c). The different results observed for the Jaccard and Hamming measures reflect the fact that most binary phylogenetic profiles contain many absent (0) signals, and these shared absences dominate the Hamming measure, which simply counts all positions as similar or different. The Jaccard measure attains more sensitivity by ignoring vector positions that are scored as absent for both genes. Even in this case, though, the strength of the correlation is not as great as typically observed for ΔdN-sGO.
Figure 3

Phylogenetic profile similarity (x-axis) versus GO functional similarity (sGO on the y-axis). sGO is compared to a) Marcotte profiles, b) COG profiles evaluated via Jaccard similarity and c) COG profiles evaluated via Hamming similarity. Pearson correlation (r) and significance (P) values are shown in the inset of each plot.
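The contrast between the Hamming and Jaccard measures discussed above can be seen in a small Python sketch. It uses the standard textbook definitions of the two measures for binary vectors, which is an assumption, since the exact formulas applied to the COG profiles are not given here. The profiles and function names are hypothetical toy examples.

```python
def hamming_similarity(a, b):
    """Fraction of positions where two binary profiles agree (1/1 or 0/0)."""
    return sum(1 for x, y in zip(a, b) if x == y) / len(a)


def jaccard_similarity(a, b):
    """Shared presences divided by positions present in at least one profile."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return both / either if either else 0.0


# Hypothetical sparse profiles over 10 species: each gene is present in a
# single, different species and absent everywhere else.
gene_a = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
gene_b = [0, 1, 0, 0, 0, 0, 0, 0, 0, 0]

ham = hamming_similarity(gene_a, gene_b)
jac = jaccard_similarity(gene_a, gene_b)
```

Here the two genes never co-occur, yet the Hamming similarity is high because the eight shared absences count as agreement, whereas the Jaccard similarity is zero because those positions are ignored.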

We have demonstrated that functionally related yeast genes co-evolve with respect to the evolutionary rate at non-synonymous coding sequence positions. This effect is highly significant for all but the most closely related species comparison. For the data analyzed here, the correlation between function and evolutionary rate is stronger than that seen for function and phylogenetic profiles. Rates of gene evolution are, for the most part, determined by the strength of purifying natural selection, which involves the removal of deleterious variants. As such, the results that we report here point to a close coupling between the particular function of a gene and the efficacy of purifying selection. The robust correlations between ΔdN and sGO also indicate that evolutionary rate comparisons can be used to aid functional inference and prediction.
References (10 of 19 shown)

1.  Global mapping of the yeast genetic interaction network.

Authors:  Amy Hin Yan Tong; Guillaume Lesage; Gary D Bader; Huiming Ding; Hong Xu; Xiaofeng Xin; James Young; Gabriel F Berriz; Renee L Brost; Michael Chang; YiQun Chen; Xin Cheng; Gordon Chua; Helena Friesen; Debra S Goldberg; Jennifer Haynes; Christine Humphries; Grace He; Shamiza Hussein; Lizhu Ke; Nevan Krogan; Zhijian Li; Joshua N Levinson; Hong Lu; Patrice Ménard; Christella Munyana; Ainslie B Parsons; Owen Ryan; Raffi Tonikian; Tania Roberts; Anne-Marie Sdicu; Jesse Shapiro; Bilal Sheikh; Bernhard Suter; Sharyl L Wong; Lan V Zhang; Hongwei Zhu; Christopher G Burd; Sean Munro; Chris Sander; Jasper Rine; Jack Greenblatt; Matthias Peter; Anthony Bretscher; Graham Bell; Frederick P Roth; Grant W Brown; Brenda Andrews; Howard Bussey; Charles Boone
Journal:  Science       Date:  2004-02-06       Impact factor: 47.728

2.  Investigating semantic similarity measures across the Gene Ontology: the relationship between sequence and annotation.

Authors:  P W Lord; R D Stevens; A Brass; C A Goble
Journal:  Bioinformatics       Date:  2003-07-01       Impact factor: 6.937

3.  Conservation and coevolution in the scale-free human gene coexpression network.

Authors:  I King Jordan; Leonardo Mariño-Ramírez; Yuri I Wolf; Eugene V Koonin
Journal:  Mol Biol Evol       Date:  2004-07-28       Impact factor: 16.240

4.  Simple methods for estimating the numbers of synonymous and nonsynonymous nucleotide substitutions.

Authors:  M Nei; T Gojobori
Journal:  Mol Biol Evol       Date:  1986-09       Impact factor: 16.240

5.  CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice.

Authors:  J D Thompson; D G Higgins; T J Gibson
Journal:  Nucleic Acids Res       Date:  1994-11-11       Impact factor: 16.971

6.  Finding functional features in Saccharomyces genomes by phylogenetic footprinting.

Authors:  Paul Cliften; Priya Sudarsanam; Ashwin Desikan; Lucinda Fulton; Bob Fulton; John Majors; Robert Waterston; Barak A Cohen; Mark Johnston
Journal:  Science       Date:  2003-05-29       Impact factor: 47.728

7.  Transcriptional regulatory code of a eukaryotic genome.

Authors:  Christopher T Harbison; D Benjamin Gordon; Tong Ihn Lee; Nicola J Rinaldi; Kenzie D Macisaac; Timothy W Danford; Nancy M Hannett; Jean-Bosco Tagne; David B Reynolds; Jane Yoo; Ezra G Jennings; Julia Zeitlinger; Dmitry K Pokholok; Manolis Kellis; P Alex Rolfe; Ken T Takusagawa; Eric S Lander; David K Gifford; Ernest Fraenkel; Richard A Young
Journal:  Nature       Date:  2004-09-02       Impact factor: 49.962

8.  Sequencing and comparison of yeast species to identify genes and regulatory elements.

Authors:  Manolis Kellis; Nick Patterson; Matthew Endrizzi; Bruce Birren; Eric S Lander
Journal:  Nature       Date:  2003-05-15       Impact factor: 49.962

9.  Global landscape of protein complexes in the yeast Saccharomyces cerevisiae.

Authors:  Nevan J Krogan; Gerard Cagney; Haiyuan Yu; Gouqing Zhong; Xinghua Guo; Alexandr Ignatchenko; Joyce Li; Shuye Pu; Nira Datta; Aaron P Tikuisis; Thanuja Punna; José M Peregrín-Alvarez; Michael Shales; Xin Zhang; Michael Davey; Mark D Robinson; Alberto Paccanaro; James E Bray; Anthony Sheung; Bryan Beattie; Dawn P Richards; Veronica Canadien; Atanas Lalev; Frank Mena; Peter Wong; Andrei Starostine; Myra M Canete; James Vlasblom; Samuel Wu; Chris Orsi; Sean R Collins; Shamanta Chandran; Robin Haw; Jennifer J Rilstone; Kiran Gandi; Natalie J Thompson; Gabe Musso; Peter St Onge; Shaun Ghanny; Mandy H Y Lam; Gareth Butland; Amin M Altaf-Ul; Shigehiko Kanaya; Ali Shilatifard; Erin O'Shea; Jonathan S Weissman; C James Ingles; Timothy R Hughes; John Parkinson; Mark Gerstein; Shoshana J Wodak; Andrew Emili; Jack F Greenblatt
Journal:  Nature       Date:  2006-03-22       Impact factor: 49.962

10.  The COG database: an updated version includes eukaryotes.

Authors:  Roman L Tatusov; Natalie D Fedorova; John D Jackson; Aviva R Jacobs; Boris Kiryutin; Eugene V Koonin; Dmitri M Krylov; Raja Mazumder; Sergei L Mekhedov; Anastasia N Nikolskaya; B Sridhar Rao; Sergei Smirnov; Alexander V Sverdlov; Sona Vasudevan; Yuri I Wolf; Jodie J Yin; Darren A Natale
Journal:  BMC Bioinformatics       Date:  2003-09-11       Impact factor: 3.169
