Abstract
Ongoing debates about the functional importance of gene duplications have recently been intensified by a heated discussion of the "ortholog conjecture" (OC). Under the OC, which is central to functional annotation of genomes, orthologous genes are functionally more similar than paralogous genes at the same level of sequence divergence. However, a recent study challenged the OC by reporting a greater functional similarity, in terms of gene ontology (GO) annotations and expression profiles, among within-species paralogs compared to orthologs. These findings were taken to indicate that functional similarity of homologous genes is primarily determined by the cellular context of the genes, rather than evolutionary history. Subsequent studies suggested that the OC appears to be generally valid when applied to mammalian evolution but the complete picture of evolution of gene expression also has to incorporate lineage-specific aspects of paralogy. The observed complexity of gene expression evolution after duplication can be explained through selection for gene dosage effect combined with the duplication-degeneration-complementation model. This paper discusses expression divergence of recent duplications occurring before functional divergence of proteins encoded by duplicate genes.
Year: 2014 PMID: 25197576 PMCID: PMC4150538 DOI: 10.1155/2014/516508
Source DB: PubMed Journal: Genet Res Int ISSN: 2090-3162
Figure 1. Expression and sequence similarity of orthologous and paralogous genes. (a) Z-score expression similarity averaged across 4 tissues. (b) Rank-based expression similarity averaged across 4 tissues. (c) Kendall's τ rank correlation coefficient. (d) Pearson linear correlation coefficient. The raw data are taken from Rogozin and coworkers [34]; see Table 1 for more details about procedures used in this study.
Table 1. Analysis of the duplication-degeneration-complementation (DDC) model using expression profiles of within-species paralogs (gene X versus genes Y1/Y2).
| 16 | 15 | 46 |
Kendall's τ rank correlation coefficient was used to measure the similarity between expression profiles of pairs of human-mouse paralogs (I analyzed cases in which one genome contains one gene copy X and the other genome contains two copies, Y1 and Y2). The number of cases where the expression profile E_x shows a greater similarity to the combined expression profile E_y (E_y = E_y1 + E_y2), as predicted by the DDC model (the first column), is compared with the number of cases where E_x shows a greater similarity to E_y1, E_y2, or both (the second and third columns) using the binomial test. The ortholog-paralog cluster construction protocol included, first, an all-against-all comparison of protein sequences from the analyzed human and mouse genomes using the BLASTP program, with masking of low sequence complexity regions using the SEG program [34]. At the second step, orthologs were identified using symmetrical best hits. Paralogs were delineated from within-species and between-species BLASTP hits (E-value < 10^−20) using the single linkage clustering procedure (the 50% identity score was used as a threshold) [34]. The RPKM values, that is, reads per kilobase of exon model per million mapped reads [33], were calculated from the count values for each of four tissues shared by human and mouse (heart, kidney, liver, and lung) [34]. The expression data and clusters of orthologs and paralogs are available at ftp://ftp.ncbi.nlm.nih.gov/pub/managdav/paper_suppl/ortholog_conjecture/.
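The comparison described in this footnote can be sketched in a few lines of Python. This is only an illustrative sketch, not the procedure used in the study: the function names (`kendall_tau`, `classify_triplet`, `binomial_two_sided`), the choice of the tau-b variant, the collapsing of the second and third columns into a single "single" category, and the 0.5 null proportion are all assumptions made here, and the example profiles are invented four-tissue values (heart, kidney, liver, lung).

```python
from itertools import combinations
from math import comb, sqrt


def kendall_tau(x, y):
    """Kendall's tau-b between two equal-length expression profiles.

    Assumption: the tau-b variant (tie-corrected) is used; the source
    does not say which variant was applied.
    """
    concordant = discordant = ties_x = ties_y = 0
    for i, j in combinations(range(len(x)), 2):
        dx = x[i] - x[j]
        dy = y[i] - y[j]
        if dx == 0 and dy == 0:
            continue  # tied in both profiles: excluded from both terms
        if dx == 0:
            ties_x += 1
        elif dy == 0:
            ties_y += 1
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    untied = concordant + discordant
    denom = sqrt((untied + ties_x) * (untied + ties_y))
    # A constant profile has no defined correlation; return 0.0 by convention.
    return (concordant - discordant) / denom if denom else 0.0


def classify_triplet(e_x, e_y1, e_y2):
    """Label one X / Y1 / Y2 triplet.

    "combined": E_x is closer to E_y1 + E_y2 (the DDC-style outcome).
    "single":   E_x is closer to one of the individual copies.
    "tie":      neither is strictly closer (hypothetical extra category).
    """
    e_y = [a + b for a, b in zip(e_y1, e_y2)]
    tau_combined = kendall_tau(e_x, e_y)
    tau_single = max(kendall_tau(e_x, e_y1), kendall_tau(e_x, e_y2))
    if tau_combined > tau_single:
        return "combined"
    if tau_single > tau_combined:
        return "single"
    return "tie"


def binomial_two_sided(k, n, p=0.5):
    """Exact two-sided binomial p-value for k successes in n trials."""
    pmf = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    observed = pmf[k]
    # Sum all outcomes that are no more likely than the observed one.
    return min(1.0, sum(q for q in pmf if q <= observed * (1 + 1e-9)))


if __name__ == "__main__":
    # Invented RPKM-like profiles: the two copies split the ancestral pattern.
    e_x = [8.0, 4.0, 16.0, 2.0]
    e_y1 = [8.0, 4.0, 0.0, 0.0]
    e_y2 = [0.0, 0.0, 16.0, 2.0]
    print(classify_triplet(e_x, e_y1, e_y2))
    # Hypothetical counts of "combined" versus "single" outcomes.
    print(binomial_two_sided(3, 20))
```

With only four tissues per profile, τ takes few distinct values, so ties between the combined and single-copy comparisons are likely; how such ties are assigned affects the counts that enter the binomial test.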
Figure 2. Schematic representation of the “protein dosage rebalancing” hypothesis [34]. This synthetic model is a combination of the dosage effect and DDC models: many recent gene duplications (or gene copy-number variations (CNVs) at the population level) have a positive effect in some tissues and/or environmental conditions, whereas they also have a negative effect in some other tissues and/or environmental conditions [3, 7, 21–23]. Balancing of positive and negative dosage effects under natural selection may be an important factor driving diversification of expression patterns (rebalancing of expression) of duplicate genes in the course of fixation of gene duplications. This process is similar to the conventional dosage effect hypothesis [3]. After the gene duplication is fixed in a population, preservation of this gene duplication may be largely explained by the DDC model (maintenance of duplicate genes due to differential loss or reduction of expression in various tissues).