Igor B Rogozin, David Managadze, Svetlana A Shabalina, Eugene V Koonin.
Abstract
The ortholog conjecture (OC), which is central to functional annotation of genomes, posits that orthologous genes are functionally more similar than paralogous genes at the same level of sequence divergence. However, a recent study challenged the OC by reporting a greater functional similarity, in terms of Gene Ontology (GO) annotations and expression profiles, among within-species paralogs compared with orthologs. These findings were taken to indicate that functional similarity of homologous genes is primarily determined by the cellular context of the genes, rather than evolutionary history. However, several subsequent studies suggest that GO annotations and microarray data could artificially inflate functional similarity between paralogs from the same organism. We sought to test the OC using approaches distinct from those used in previous studies. Analysis of a large RNAseq data set from multiple human and mouse tissues shows that expression similarity (correlation coefficients, ranks, or Z-scores) between orthologs is substantially greater than that for between-species paralogs with the same sequence divergence, in agreement with the OC and the results of recent detailed analyses. These findings are further corroborated by a fine-grained analysis in which expression profiles of orthologs and paralogs were compared separately for individual gene families. Expression profiles of within-species paralogs are more strongly correlated than profiles of orthologs, but we show that this is caused by high background noise, that is, correlation between profiles of unrelated genes in the same organism. Z-scores and rank scores show a nonmonotonic dependence of expression profile similarity on sequence divergence. This complexity of gene expression evolution after duplication might be at least partially caused by selection for protein dosage rebalancing following gene duplication.
Keywords: duplicated genes; duplication–degeneration–complementation model; neofunctionalization model; neutral evolution; rebalancing dosage effect model; selection; subfunctionalization model
Year: 2014 PMID: 24610837 PMCID: PMC4007545 DOI: 10.1093/gbe/evu051
Source DB: PubMed Journal: Genome Biol Evol ISSN: 1759-6653 Impact factor: 3.416
Figure. Construction of clusters of orthologous and paralogous genes from human and mouse. Solid lines show symmetrical best BlastP hits between human and mouse proteins (predicted orthologs). Dotted lines illustrate the identification of paralogs using single linkage clustering of within-species and between-species BlastP hits.
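The clustering described in this caption can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the hit dictionaries, the gene names (`h1`, `m1`, and so on), the scores, and the function names are all hypothetical, and the hits are assumed to have been filtered already by whatever BlastP cutoff is in use. Symmetrical best hits are taken as predicted orthologs, and single linkage clustering is implemented with a union-find structure so that any chain of hits places genes in the same cluster.

```python
from collections import defaultdict


def best_hit(hits):
    """Map each query to its top-scoring subject.

    hits: {query: [(subject, score), ...]} (hypothetical, pre-filtered input).
    """
    return {q: max(subs, key=lambda s: s[1])[0] for q, subs in hits.items() if subs}


def reciprocal_best_hits(hits_ab, hits_ba):
    """Return (a, b) pairs where a's best hit is b and b's best hit is a."""
    best_ab, best_ba = best_hit(hits_ab), best_hit(hits_ba)
    return {(a, b) for a, b in best_ab.items() if best_ba.get(b) == a}


def single_linkage_clusters(edges):
    """Group nodes into connected components (single linkage) via union-find."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, v in edges:
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            parent[root_u] = root_v

    clusters = defaultdict(list)
    for node in list(parent):
        clusters[find(node)].append(node)
    return sorted(sorted(members) for members in clusters.values())


# Toy data (invented for illustration only).
human_vs_mouse = {
    "h1": [("m1", 500), ("m2", 200)],
    "h2": [("m2", 450), ("m1", 210)],
    "h3": [("m3", 300)],
}
mouse_vs_human = {
    "m1": [("h1", 500), ("h2", 210)],
    "m2": [("h2", 450), ("h1", 200)],
    "m3": [("h3", 300)],
}
within_species = [("h1", "h2"), ("m1", "m2")]

orthologs = reciprocal_best_hits(human_vs_mouse, mouse_vs_human)
between_species = [(q, s) for q, subs in human_vs_mouse.items() for s, _ in subs]
clusters = single_linkage_clusters(within_species + between_species)
```

In this toy example, `h1`/`m1` and `h2`/`m2` are predicted orthologs, while `h1`/`m2` and `h2`/`m1` fall in the same cluster without being symmetrical best hits, which is how between-species paralogs would be identified under these assumptions.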
Figure. Expression similarity of orthologous and paralogous genes (solid lines) and background noise for randomly shuffled profiles of orthologs and paralogs (dashed lines). (A) Kendall τ rank correlation coefficient, (B) Pearson linear correlation coefficient, (C) Z-score similarity averaged across four tissues, and (D) rank-based similarity averaged across four tissues.
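As a rough illustration of panels (A) and (B), the sketch below computes the Pearson and Kendall τ coefficients for pairs of expression profiles and estimates a background level by re-pairing the profiles at random. Several things here are assumptions rather than details from the source: the profiles are invented, the Kendall τ variant is the simple τ-a without tie correction, and "randomly shuffled profiles" is interpreted as shuffling which gene is paired with which. The Z-score and rank-based measures of panels (C) and (D) are not reproduced because their exact definitions are not given in this record.

```python
import math
import random


def pearson(x, y):
    """Pearson linear correlation; undefined (division by zero) for constant profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def kendall_tau(x, y):
    """Kendall tau-a: (concordant - discordant) / number of pairs, no tie correction."""
    n = len(x)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            score += (prod > 0) - (prod < 0)
    return score / (n * (n - 1) / 2)


def mean_similarity(profiles_a, profiles_b, similarity):
    """Average similarity over profiles paired by position."""
    values = [similarity(a, b) for a, b in zip(profiles_a, profiles_b)]
    return sum(values) / len(values)


def shuffled_background(profiles_a, profiles_b, similarity, seed=0):
    """Mean similarity after randomly re-pairing the second set of profiles."""
    rng = random.Random(seed)
    shuffled = list(profiles_b)
    rng.shuffle(shuffled)
    return mean_similarity(profiles_a, shuffled, similarity)


# Hypothetical expression profiles (one value per tissue) for three gene pairs.
human_profiles = [[1.0, 2.0, 3.0, 4.0, 5.0], [5.0, 3.0, 4.0, 1.0, 2.0], [2.0, 9.0, 5.0, 1.0, 3.0]]
mouse_profiles = [[1.5, 2.5, 2.0, 4.5, 6.0], [4.0, 3.5, 5.0, 0.5, 1.0], [1.0, 8.0, 6.0, 2.0, 3.5]]

observed_tau = mean_similarity(human_profiles, mouse_profiles, kendall_tau)
background_tau = shuffled_background(human_profiles, mouse_profiles, kendall_tau)
```

Comparing `observed_tau` with `background_tau` mirrors the solid versus dashed lines in the figure; in practice the shuffle would be repeated many times and over far more gene pairs than this toy set.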
Figure. Expression similarity of orthologous and paralogous genes from the same clusters. The distance between each pair of orthologs and each pair of paralogs from the same cluster was chosen to be the same or similar (according to the χ² test, the 0.95 level of significance). (A) Kendall τ rank correlation coefficient, (B) Pearson linear correlation coefficient, (C) Z-score similarity averaged across four tissues, and (D) rank-based similarity averaged across four tissues.
Analysis of Balanced Subsets of Orthologs and Paralogs (the Background Noise Is Approximately the Same for Orthologs and Paralogs According to the Student’s t-Test, the 0.95 Significance Level)
| Correlation Coefficient | Balanced Subsets | Expression Similarity: Ortholog > Paralog | Expression Similarity: Ortholog < Paralog | P (Sign Test) |
|---|---|---|---|---|
| Pearson linear correlation coefficient | All (72) | 55 | 17 | <0.001 |
| Pearson linear correlation coefficient | Significant only (16) | 13 | 3 | 0.011 |
| Kendall τ rank correlation coefficient | All (139) | 137 | 2 | <0.001 |
| Kendall τ rank correlation coefficient | Significant only (38) | 38 | 0 | <0.001 |
Note.—Expression similarity Ortholog > Paralog is the number of cases when the mean expression similarity was greater (All) or significantly greater (Significant only, Student’s t-test, 0.95 significance level) for orthologs than for paralogs. Expression similarity Ortholog < Paralog is the number of cases when the mean expression similarity was greater (All) or significantly greater (Significant only) for paralogs than for orthologs. The significance of the difference for each pair of these numbers was estimated using the sign test.
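The sign test applied to each pair of counts can be sketched as an exact binomial test with success probability 0.5. The function name and the `two_sided` switch are illustrative choices, and the source does not state whether a one-sided or two-sided test was used. For the 13 versus 3 row, the one-sided tail is 697/65536 ≈ 0.0106, which rounds to the tabulated 0.011, whereas the two-sided value would be about 0.021.

```python
from math import comb


def sign_test_p(n_pos, n_neg, two_sided=False):
    """Exact sign test: P-value under H0 that each sign is equally likely.

    n_pos: cases with Ortholog > Paralog; n_neg: cases with Ortholog < Paralog.
    One-sided by default (alternative: positives are more frequent).
    """
    n = n_pos + n_neg
    upper = sum(comb(n, i) for i in range(n_pos, n + 1)) / 2 ** n
    if not two_sided:
        return upper
    lower = sum(comb(n, i) for i in range(0, n_pos + 1)) / 2 ** n
    return min(1.0, 2 * min(upper, lower))


# Counts taken from the table above.
table_rows = {
    "Pearson, all": (55, 17),
    "Pearson, significant only": (13, 3),
    "Kendall tau, all": (137, 2),
    "Kendall tau, significant only": (38, 0),
}
p_values = {label: sign_test_p(*counts) for label, counts in table_rows.items()}
```

Under this sketch, the three remaining rows all give one-sided P-values well below 0.001, in line with the table.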