Avazeh T Ghanbarian, Laurence D Hurst.
Abstract
When considering the evolution of a gene's expression profile, we commonly assume that this is unaffected by its genomic neighborhood. This is, however, in contrast to what we know about the lack of autonomy between neighboring genes in gene expression profiles in extant taxa. Indeed, in all eukaryotic genomes genes with similar expression profiles tend to cluster, reflecting chromatin-level dynamics. Does it follow that if a gene increases expression in a particular lineage then its genomic neighbors will also increase in expression, or is gene expression evolution autonomous? To address this, we consider the evolution of human gene expression since the human-chimp common ancestor, allowing for both variation in estimation of current expression level and error in Bayesian estimation of the ancestral state. We find that in all tissues and both sexes, the change in gene expression of a focal gene on average predicts the change in gene expression of neighbors. The effect is highly pronounced in the immediate vicinity (<100 kb) but extends much further. Sex-specific expression change is also genomically clustered. As genes increasing their expression in humans tend to avoid nuclear lamina domains and be enriched for the gene activator 5-hydroxymethylcytosine, we conclude that, most probably owing to chromatin-level control of gene expression, a change in gene expression of one gene likely affects the expression evolution of neighbors, which we term expression piggybacking, an analog of hitchhiking.
Keywords: gene clustering; gene expression evolution; sex-biased evolution
Year: 2015 PMID: 25743543 PMCID: PMC4476153 DOI: 10.1093/molbev/msv053
Source DB: PubMed Journal: Mol Biol Evol ISSN: 0737-4038 Impact factor: 16.240
Relationship between Z of a focal gene and Z of the nearest downstream neighbor for six male tissues. In this instance we consider genes to be nearest downstream neighbors if the distance between the start codons is <100 kb. This contrasts slightly with the data in table 1, where the distance is defined as the minimum distance between gene bodies. Trends are robust to alternative definitions. Data are split into equal-sized bins (of 500 genes) defined after rank ordering with respect to the Z score of the focal gene. The value on the X axis represents the mean Z of the genes in that bin. The value on the Y axis indicates the mean (±SEM) for the relevant flanking genes. The presented statistics are from Spearman correlation on the raw data.
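The binning described in this caption can be sketched as follows. This is a minimal illustration, not the authors' code: the function names (`spearman`, `bin_by_focal_z`) and the input layout (two parallel lists of Z scores) are assumptions, and the last bin is simply allowed to be smaller when the gene count is not a multiple of the bin size.

```python
import math


def _ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho, computed as the Pearson correlation of the ranks."""
    rx, ry = _ranks(list(x)), _ranks(list(y))
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    return sxy / math.sqrt(sxx * syy)


def bin_by_focal_z(focal_z, neighbor_z, bin_size=500):
    """Rank-order pairs by focal Z, then summarize each bin.

    Returns one (mean focal Z, mean neighbor Z, SEM of neighbor Z) per bin.
    """
    pairs = sorted(zip(focal_z, neighbor_z), key=lambda p: p[0])
    summary = []
    for start in range(0, len(pairs), bin_size):
        chunk = pairs[start:start + bin_size]
        fz = [p[0] for p in chunk]
        nz = [p[1] for p in chunk]
        mean_n = sum(nz) / len(nz)
        if len(nz) > 1:
            var = sum((v - mean_n) ** 2 for v in nz) / (len(nz) - 1)
            sem = math.sqrt(var / len(nz))
        else:
            sem = float("nan")  # SEM is undefined for a single gene
        summary.append((sum(fz) / len(fz), mean_n, sem))
    return summary
```

The bins are only used for plotting; as the caption states, the reported correlation is computed on the raw, unbinned pairs, which here would be `spearman(focal_z, neighbor_z)`.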
Spearman Correlation between Focal Gene’s Z Score and Z Score of Its Closest Nonoverlapping Downstream Neighbor.
| Tissue | Male P | Male ρ | Female P | Female ρ |
|---|---|---|---|---|
| Brain | 8.71E−07 | 0.05504 | 2.81E−08 | 0.06247 |
| Cerebellum | 1.71E−19 | 0.10246 | 9.25E−21 | 0.10539 |
| Kidney | 3.97E−126 | 0.26420 | 3.37E−07 | 0.05751 |
| Heart | 4.13E−66 | 0.19308 | 7.14E−20 | 0.10423 |
| Liver | 5.91E−12 | 0.07786 | NA | NA |
| Testis | 6.92E−83 | 0.21132 | NA | NA |
Note.—All statistics are significant after Bonferroni testing.
Numbers of clusters of a given size compared with those expected under a random null. The observed number of clusters containing a given number of genes is shown by red stars; boxplots show the variation in the number of clusters across 1,000 random sets.
Spearman Correlation between Z of Divergent, Convergent, and Cooriented Closest Gene Pairs.
| Tissue/Gender | Divergent | Divergent | Convergent | Convergent | Cooriented | Cooriented |
|---|---|---|---|---|---|---|
| Brain/male | 0.07738 | 0.105396 | 0.03449 | 0.05483 | ||
| Cerebellum/male | 0.09616 | 0.11123 | 0.11086 | |||
| Kidney/male | 0.27214 | 0.23963 | 0.27992 | |||
| Heart/male | 0.19287 | 0.17496 | 0.20693 | |||
| Liver/male | 0.11054 | 0.05694 | 0.07485 | |||
| Testis/male | 0.24745 | 0.20458 | 0.20261 | |||
| Brain/female | 0.06569 | 0.07510 | 0.05010 | |||
| Cerebellum/female | 0.14002 | 0.07796 | 0.09887 | |||
| Kidney/female | 0.004371 | 0.06349 | 0.032372 | 0.04601 | 0.05684 | |
| Heart/female | 0.13200 | 0.10040 | 0.10359 |
Note.—Results significant after Bonferroni testing are highlighted in italic.
P-Values of Monte Carlo Simulations Comparing Spearman’s Correlation ρ Score between Z Score of Focal Gene and Z Score of Its Downstream Neighbor across Divergent, Convergent, and Cooriented Subsets against ρ of a Randomly Selected Set of Genes of the Same Size as Those Subsets.
| Tissue | Male Divergent | Male Convergent | Male Cooriented | Female Divergent | Female Convergent | Female Cooriented |
|---|---|---|---|---|---|---|
| Brain | 0.12059 | 0.87421 | 0.86861 | 0.37086 | 0.20748 | 0.20998 |
| Cerebellum | 0.70893 | 0.40776 | 0.40526 | 0.03330 | 0.91901 | 0.92151 |
| Kidney | 0.41026 | 0.94881 | 0.95150 | 0.36286 | 0.70813 | 0.71763 |
| Heart | 0.55744 | 0.86571 | 0.86821 | 0.12109 | 0.68713 | 0.67243 |
| Liver | 0.05359 | 0.88301 | 0.88571 | NA | NA | NA |
| Testis | 0.03550 | 0.72293 | 0.71803 | NA | NA | NA |
Note.—Taking the divergent orientation as an example, let tsND be the number of genes in divergent orientation that remain in a specific tissue and sex after removing zero Z scores, and let tsρ be Spearman's ρ between those focal genes and their divergent downstream neighbors. The ρ of each of 10,000 random sets of linked gene pairs of size tsND, selected from the pool of all genes in this study regardless of their orientation, is then calculated and compared with tsρ in the corresponding tissue and sex. If M is the number of random sets with ρ as great as or greater than tsρ, the Monte Carlo P-value is (M+1)/10,001. No observations are significant after Bonferroni testing.
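The Monte Carlo P-value defined in this note, (M+1)/(number of random sets + 1), can be sketched as below. This is an illustrative reading of the note rather than the authors' implementation: `monte_carlo_p`, the `(focal Z, neighbor Z)` tuple layout, and sampling without replacement via `random.Random.sample` are all assumptions.

```python
import math
import random


def _ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho, computed as the Pearson correlation of the ranks."""
    rx, ry = _ranks(list(x)), _ranks(list(y))
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    return sxy / math.sqrt(sxx * syy)


def monte_carlo_p(observed_rho, all_pairs, subset_size, n_sets=10_000, seed=0):
    """Empirical P-value (M + 1) / (n_sets + 1).

    all_pairs: (focal Z, neighbor Z) tuples for every linked gene pair,
               regardless of orientation (assumed layout).
    M counts the random sets whose rho is >= observed_rho.
    """
    rng = random.Random(seed)
    m = 0
    for _ in range(n_sets):
        sample = rng.sample(all_pairs, subset_size)  # without replacement
        rho = spearman([p[0] for p in sample], [p[1] for p in sample])
        if rho >= observed_rho:
            m += 1
    return (m + 1) / (n_sets + 1)
```

Adding 1 to both numerator and denominator means the smallest attainable P-value is 1/(n_sets + 1), which is why 10,000 random sets give a floor of 1/10,001.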
Spearman Correlation between Z Score of Focal Gene and Z Score of Its Closest Downstream Neighbor across Divergent, Convergent, and Cooriented Closest Gene Pairs Which Are Closer than 1 kb.
| 0.10085 | 0.08288 | 0.81912 | 0.01280 | 0.95651 | −0.00366 | |
| 0.01006 | 0.13001 | 0.01738 | 0.13288 | 0.02453 | 0.15090 | |
| 0.39189 | 0.31211 | 0.00327 | 0.19567 | |||
| 0.22392 | 0.31661 | 0.00752 | 0.17813 | |||
| 0.17669 | 0.20270 | 0.07196 | 0.69872 | 0.02606 | ||
| 0.33586 | 0.34886 | 0.04807 | 0.13197 | |||
| 0.36058 | 0.04629 | 0.86790 | −0.00929 | 0.43382 | 0.05267 | |
| 0.21838 | 0.00461 | 0.15838 | 0.05900 | 0.12635 | ||
| 0.12010 | 0.07853 | 0.64196 | −0.02613 | 0.72420 | −0.0237 | |
| 0.00250 | 0.15248 | 0.00302 | 0.16604 | 0.02574 | 0.14933 |
Note.—Results significant after Bonferroni testing are highlighted in italic.
P-Values of Monte Carlo Simulation Comparing Spearman’s Correlation ρ Score between Focal Gene and Its Downstream Neighbor across Divergent, Convergent, and Cooriented Subsets to a Randomly Selected Subset of the Same Size for Gene Pairs Closer than 1 kb.
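| Tissue | Male Divergent | Male Convergent | Male Cooriented | Female Divergent | Female Convergent | Female Cooriented |
|---|---|---|---|---|---|---|
| Brain | 0.13399 | 0.71823 | 0.72053 | 0.33787 | 0.79582 | 0.79852 |
| Cerebellum | 0.64264 | 0.60364 | 0.59444 | 0.33907 | 0.85431 | 0.84622 |
| Kidney | 0.17848 | 0.87671 | 0.87581 | 0.07129 | 0.84862 | 0.84202 |
| Heart | 0.91831 | 0.15938 | 0.15298 | 0.78032 | 0.62664 | 0.63754 |
| Liver | 0.02850 | 0.76262 | 0.75932 | NA | NA | NA |
| Testis | 0.57334 | 0.42326 | 0.43336 | NA | NA | NA |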
Note.—The steps and the number of repetitions of the Monte Carlo simulation are the same as explained in table 5. No observation is significant after Bonferroni testing.
Spearman Correlation between Focal Gene’s Z Scores and Z of Its Overlapping Downstream Neighbor on the Opposite Strand.
| Tissue | Male P | Male ρ | Female P | Female ρ |
|---|---|---|---|---|
| Brain | 0.00392 | 0.10783* | 0.00368 | 0.10886* |
| Cerebellum | 8.37E−14 | 0.27613* | 8.45E−06 | 0.16696* |
| Kidney | 2.75E−26 | 0.38295* | 0.01655 | 0.08992* |
| Heart | 4.90E−15 | 0.28986* | 1.18E−06 | 0.18234* |
| Liver | 0.00019 | 0.13979* | NA | NA |
| Testis | <2.2E−16 | 0.3942* | NA | NA |
Note.—Those incidences marked with an asterisk have a higher correlation than seen in the comparable nonoverlapping case (shown in table 1). All observations are significant after Bonferroni testing. As the underlying data are strand-specific transcriptomics, employing overlapping sequence from opposite strands obviates problems with mismapping that could otherwise cause artifactual signals of high correlation.
Spearman Correlation between Focal Gene’s Z Scores and Mean of Its Closest Up and Downstream Neighbors, at Least One of Which Overlaps the Focal Gene.
| Tissue | Male P | Male ρ | Female P | Female ρ |
|---|---|---|---|---|
| Brain | 0.00013 | 0.11001* | 0.0002 | 0.10724* |
| Cerebellum | 1.18E−24 | 0.29169* | 1.52E−11 | 0.19365* |
| Kidney | <2.2E−16 | 0.41596* | 0.00126 | 0.09303* |
| Heart | 2.93E−29 | 0.31778* | 4.58E−13 | 0.20841* |
| Liver | 7.60E−07 | 0.14236* | NA | NA |
| Testis | <2.2E−16 | 0.4018* | NA | NA |
Note.—Those incidences marked with an asterisk have a higher correlation than seen in the comparable nonoverlapping case (shown in table 2). All observations are significant after Bonferroni testing.
Monte Carlo Simulation of Overlapping Genes’ Z.
| Tissue | Male P | Female P |
|---|---|---|
| Brain | 0.005999 | 0.0095 |
| Cerebellum | 0.000099 | 0.003 |
| Kidney | 0.000099 | 0.0132 |
| Heart | 0.000499 | 0.0004 |
| Liver | 0.007399 | NA |
| Testis | 0.000099 | NA |
Note.—Spearman's ρ of overlapping genes is compared against that of randomly selected sets of gene pairs of the same size over 1,000 repetitions. The number of instances in which ρ of the randomly selected set is equal to or higher than ρ of the overlapping set was counted to calculate empirical P-values. All observations are significant after Bonferroni testing.
Spearman Correlation between Focal Gene’s Z Score and Mean of Its Closest Nonoverlapping Neighbors on Both Sides.
| Tissue | Male P | Male ρ | Female P | Female ρ |
|---|---|---|---|---|
| Brain | 2.95E−10 | 0.08015 | 8.70E−12 | 0.08727 |
| Cerebellum | 1.96E−31 | 0.15009 | 1.51E−33 | 0.15433 |
| Kidney | 1.44E−155 | 0.33054 | 6.07E−10 | 0.07925 |
| Heart | 2.03E−86 | 0.24993 | 2.16E−28 | 0.14318 |
| Liver | 8.86E−17 | 0.10676 | NA | NA |
| Testis | 4.43E−118 | 0.28520 | NA | NA |
Note.—All statistics are significant after Bonferroni testing.
Correlation between Z of each focal gene and Z of the nearest downstream neighbor more than a given minimum physical distance away. (a) We plot data considering increments of minimum distance of 1 Mb at a time up to a maximum of 30 Mb. (b) We consider 10-kb increments up to a maximum of 1 Mb. For each focal gene we extract the nearest downstream neighbor that is at least the distance x away, x being the units on the x axis. From the resulting list of focal and neighbor Z scores, we then consider the correlation between these. Correlations significant at the 0.05 level are shown in red, otherwise in blue. The blue horizontal lines indicate 1.96 SD limits determined by randomization (which should in principle correspond with the P from Spearman’s ρ), with the black line indicating the mean of the null expectation from randomization (which should be around zero).
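The neighbor extraction and randomization band described in this caption can be sketched as follows. Several details are assumptions made for illustration only: genes are represented as `(position, Z)` tuples on a single chromosome, "downstream" is taken to mean increasing coordinate, and the null is generated by shuffling the neighbor Z scores, since the caption does not specify the randomization scheme.

```python
import math
import random
from bisect import bisect_left


def _ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho, computed as the Pearson correlation of the ranks."""
    rx, ry = _ranks(list(x)), _ranks(list(y))
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    return sxy / math.sqrt(sxx * syy)


def neighbor_pairs(genes, min_distance):
    """Pair each focal gene with the nearest gene at least min_distance away.

    genes: (position, z) tuples on one chromosome (assumed layout).
    Focal genes with no qualifying downstream gene are dropped.
    """
    if min_distance <= 0:
        raise ValueError("min_distance must be positive")
    genes = sorted(genes)
    positions = [pos for pos, _ in genes]
    pairs = []
    for pos, z in genes:
        j = bisect_left(positions, pos + min_distance)
        if j < len(genes):
            pairs.append((z, genes[j][1]))
    return pairs


def null_band(pairs, n_iter=1000, seed=0):
    """Return (lower, mean, upper) of rho under shuffling, using mean ± 1.96 SD."""
    rng = random.Random(seed)
    focal = [p[0] for p in pairs]
    neigh = [p[1] for p in pairs]
    rhos = []
    for _ in range(n_iter):
        rng.shuffle(neigh)  # break the focal/neighbor link
        rhos.append(spearman(focal, neigh))
    mean = sum(rhos) / n_iter
    sd = math.sqrt(sum((r - mean) ** 2 for r in rhos) / (n_iter - 1))
    return mean - 1.96 * sd, mean, mean + 1.96 * sd
```

Calling `neighbor_pairs` once per value of x and plotting `spearman` of the resulting pairs against x would give a curve of the kind the caption describes, with `null_band` supplying the horizontal reference lines.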
Z scores of genes in and out of lamina domains across six tissues. All pairwise comparisons are highly significant (before multitest correction, Mann–Whitney U test P < 10^−9 except brain, P = 4 × 10^−4). Z scores of genes in lamina domains are shown as red boxplots and those of the remaining genes as green boxplots. Genes with very high or very low Z are excluded from the plot as outliers to improve presentation but are included in the Mann–Whitney U test.
Number of Positive and Negative Z Score Genes Overlapping at Least One H3K4me3 Peak.
| Number of | ||||||||
|---|---|---|---|---|---|---|---|---|
| 12,418 | 5,923 | 6,495 | 5,108 | 4,812.38 | 4,981.5 | 5,277.12 | 3.806E−09 | |
| 12,098 | 5,605 | 6,493 | 4,702 | 4,548.21 | 5,115 | 5,268.78 | 0.00185 | |
| 12,098 | 5,605 | 6,493 | 4,920.5 | 4,759.71 | 5,353 | 5,513.79 | 0.00146 |
Number of Highly Positive and Negative Z Score Genes Overlapping at Least One H3K4me3 Peak.
| 6,164 | 3,708 | 2,456 | 3,206.5 | 3,132.91 | 2,001.5 | 2,075.089 | 0.03727 | |
| 4,679 | 2,941 | 1,738 | 2,389 | 2,394.47 | 1,420.5 | 1,415.027 | 0.8544 | |
| 4,679 | 2,941 | 1,738 | 2,520 | 2,516.10 | 1,483 | 1,486.902 | 0.8984 |
Note.—Genes with Z score higher than 1 are considered highly positive Z and the ones with Z score lower than −1 are studied as highly negative Z.
Observed Number of Concerted Genes Is Higher than Expected.
| 0.4916 | 0.49996 | 0.4999 | 0.4999 | 0.4999 | 0.4999 | 0.015356 | 200.0482 | 1216 | 5159 | <<0.001 | |
| 0.4804 | 0.49996 | 0.4999 | 0.4999 | 0.4999 | 0.4999 | 0.015006 | 195.4874 | 1165 | 4808 | <<0.001 | |
Note.—Concerted genes are either Z+ or Z− across all six tissues. So the expected number is the mean expectation of the number of concerted genes against a null of independent evolution in all tissues. The total number of genes included in this analysis is 13,027.
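The null expectation described in this note can be illustrated with a short calculation: if the direction of change evolves independently in each tissue, the probability that a gene has the same sign in all tissues is the product of the per-tissue proportions, and the expected count is that product times the number of genes. The function name and the example proportions below are hypothetical and are not taken from the table.

```python
import math


def expected_concerted(proportions, n_genes):
    """Expected number of genes with the same sign of Z in every tissue.

    proportions: per-tissue fraction of genes with that sign (assumed input).
    Independence across tissues is the null hypothesis being illustrated.
    """
    return n_genes * math.prod(proportions)


# Hypothetical example: six tissues, half of the genes Z+ in each.
expected = expected_concerted([0.5] * 6, 6400)
```

With six tissues and proportions near 0.5, the product is close to 0.5^6 ≈ 0.0156, so only about 1.6% of genes would be expected to be concerted in a given direction by chance.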
Monte Carlo Simulation’s P-Value and the Number of Clusters of Concerted Genes of the Same Direction of Evolution of Expression Are Shown by Cluster Size.
| Randomization | |||||
|---|---|---|---|---|---|
| 2 | 3 | 4 | 5 | 6 | |
| 9.999E−05/137 | 9.999E−05/29 | 9.999E−05/9 | 0.0059/2 | 0.0158/1 | |
| 9.999E−05/137 | 9.999E−05/26 | 9.999E−05/8 | 1/0 | 1/0 | |
Note.—The numbers of Z+ and Z− concerted genes are kept unchanged, but their order is randomized; this is repeated for 1,000 iterations. In each iteration, concerted gene clusters are identified, and the number of clusters of each size is compared with the observed number of clusters of that size. If the randomized number equals or exceeds the observed number, the Monte Carlo counter is incremented. At the end of the simulation, the P-value is calculated.
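The randomization in this note can be sketched as below. This is a hedged illustration under several assumptions: genes are given as a list of labels in genomic order (for example `"+"`, `"-"`, and `"0"` for non-concerted genes), a cluster of size k is taken to be a run of exactly k consecutive genes with the target label, and the P-value uses the (M+1)/(iterations+1) form given for the other Monte Carlo tables, since the note itself does not state the formula.

```python
import random
from collections import Counter
from itertools import groupby


def run_counts(labels, target):
    """Count runs of consecutive `target` labels, keyed by run length."""
    counts = Counter()
    for label, run in groupby(labels):
        if label == target:
            counts[len(list(run))] += 1
    return counts


def cluster_p_values(labels, target, sizes, n_iter=1000, seed=0):
    """Empirical P-value per cluster size from shuffling the gene order.

    Shuffling keeps the number of genes with each label fixed and only
    randomizes their order, as described in the note.
    """
    rng = random.Random(seed)
    observed = run_counts(labels, target)
    hits = {k: 0 for k in sizes}
    shuffled = list(labels)
    for _ in range(n_iter):
        rng.shuffle(shuffled)
        simulated = run_counts(shuffled, target)
        for k in sizes:
            # Counter returns 0 for sizes that never occur.
            if simulated[k] >= observed[k]:
                hits[k] += 1
    return {k: (hits[k] + 1) / (n_iter + 1) for k in sizes}
```

Under this form, a cluster size that is never observed always yields P = 1, because every randomized count is at least zero.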
Spearman Correlation between Female and Mean of Male Z Scores Per Tissue.
| Tissue | ρ | P |
|---|---|---|
| Brain | 0.52967 | <<0.0001 |
| Cerebellum | 0.32532 | <<0.0001 |
| Kidney | 0.45401 | <<0.0001 |
| Heart | 0.43073 | <<0.0001 |
The extent of local correlation in sex-biased expression change for four tissues. The method is the same as that for figure 3, except that here we employ standardized residuals of the orthogonal regression on Z between sexes (rather than Z). We consider all focal genes and the correlation between residuals of Z scores for these genes and the nearest downstream gene on the same chromosome a minimum of x base pairs away. Correlations significant at the 0.05 level are shown in red, otherwise in blue. The blue horizontal lines indicate 1.96 SD limits determined by randomization, with the black line indicating the mean of the null expectation (which should be around zero).
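One way to obtain per-gene sex-bias residuals of the kind used here is sketched below, using a standard major axis (SMA) fit as named in the related table titles. This is an assumption-laden illustration, not the authors' procedure: the function name is hypothetical, residuals are taken vertically (female Z minus the fitted line) and scaled by their sample SD, and the source does not say which residual definition was used.

```python
import math


def sma_standardized_residuals(male_z, female_z):
    """Standardized residuals of female Z around an SMA line fitted on male Z.

    SMA slope = sign(covariance) * SD(female) / SD(male); the line passes
    through the two means. Residuals are divided by their sample SD.
    """
    n = len(male_z)
    mx = sum(male_z) / n
    my = sum(female_z) / n
    sxx = sum((a - mx) ** 2 for a in male_z)
    syy = sum((b - my) ** 2 for b in female_z)
    sxy = sum((a - mx) * (b - my) for a, b in zip(male_z, female_z))
    slope = math.copysign(math.sqrt(syy / sxx), sxy)
    intercept = my - slope * mx
    resid = [b - (intercept + slope * a) for a, b in zip(male_z, female_z)]
    mean_r = sum(resid) / n  # zero up to rounding, since the line passes through the means
    sd = math.sqrt(sum((r - mean_r) ** 2 for r in resid) / (n - 1))
    return [(r - mean_r) / sd for r in resid]
```

A positive residual then marks a gene whose Z in females is higher than the between-sex trend predicts from its Z in males, and these residuals would replace Z in the neighbor correlation.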
Spearman Ranked Correlation of Z Score of Focal Gene with Mean Z Score of All Its Nonoverlapping Neighboring (within ±100 kb) Genes.
| Tissue | Male P | Male ρ | Female P | Female ρ |
|---|---|---|---|---|
| Brain | 7.75E−08 | 0.04780 | 6.93E−17 | 0.07465 |
| Cerebellum | 8.67E−61 | 0.14784 | 1.17E−41 | 0.12111 |
| Kidney | 1.32E−274 | 0.30926 | 2.81E−15 | 0.07078 |
| Heart | 8.82E−160 | 0.23968 | 2.07E−44 | 0.12681 |
| Liver | 8.51E−26 | 0.09458 | NA | NA |
| Testis | 6.27E−187 | 0.25247 | NA | NA |
Note.—All statistics are significant after Bonferroni testing.
Spearman Correlation between Sex Bias Standard Residual of Standard Major Axis Estimation between Z of Male and Female for a Focal Gene and Standard Residual of Its Nearest Downstream Neighbor.
| 0.03995 | 0.10407 | |||
| 0.03109 | 0.02304 | 0.15636 | ||
| 0.04638 | 0.13913 | |||
| 0.09465 | 0.01206 | 0.08883 |
Note.—Incidences significant after Bonferroni testing are shown in italic.
Spearman Correlation between Standard Residual of Standard Major Axis Estimation between Z of Male and Female for a Focal Gene and Mean Standard Residual of Its Two Nearest Neighbors.
| 0.05452 | 0.07649 | |||
| 0.01433 | 0.03082 | 0.12738 | ||
| 0.06346 | 0.14127 | |||
| 0.12348 | 0.11740 |
Note.—Incidences significant after Bonferroni testing are shown in italic.
Spearman Correlation between Standard Residual of Standard Major Axis Estimation between Z of Male and Female of the Focal Gene and the Mean of Standard Residual of All Its Neighbors within 100 kb of the Focal Gene.
| Tissue | P | ρ |
|---|---|---|
| Brain | 4.00E−08 | 0.04817 |
| Cerebellum | 0.00848 | 0.02310 |
| Kidney | 1.71E−39 | 0.11504 |
| Heart | 1.87E−05 | 0.03755 |
Note.—All incidences are significant after Bonferroni testing.