Dong-Dong Wu, Xin Wang, Yan Li, Lin Zeng, David M Irwin, Ya-Ping Zhang.
Abstract
New genes, which provide material for evolutionary innovation, have been studied extensively for many years in animals, where they commonly show an expression bias toward the testis. Thus, the testis is a major source for the generation of new genes in animals. The source tissue for new genes in plants is unclear. Here, we find that new genes in plants show an expression bias toward mature pollen and are also enriched in a gene coexpression module that correlates with mature pollen in Arabidopsis thaliana. Transposable elements are significantly enriched in the new genes, and the high activity of transposable elements in the vegetative nucleus, compared with the germ cells, suggests that new genes are most easily generated in the vegetative nucleus of mature pollen. We propose an "out of pollen" hypothesis for the origin of new genes in flowering plants.
Keywords: Arabidopsis thaliana; young gene evolution; “Out of pollen” hypothesis
Year: 2014 PMID: 25237051 PMCID: PMC4224333 DOI: 10.1093/gbe/evu206
Source DB: PubMed Journal: Genome Biol Evol ISSN: 1759-6653 Impact factor: 3.416
Figure. (A) Phylogenetic tree used to deduce gene ages with ProteinHistorian. Age indices 0–4 are shown above each branch. The actual ages are shown beside the ancestral nodes: 147.8, 936, 1,628, and 4,200 Myr. (B) The 15 tissues with the lowest transcriptome age index (TAI) values. TAI = ∑(E × AGE)/∑E, where E and AGE are the expression value and the age of each gene, respectively. TAI1 is calculated using the true phylogenetic time (Myr) as AGE; TAI2 is calculated using the phylostratum number, as described in Domazet-Loso et al. (2007), as AGE. The full data are in supplementary table S1, Supplementary Material online.
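The TAI definition in the caption above is an expression-weighted mean of gene ages. A minimal Python sketch follows, assuming per-gene expression values and ages are available as parallel lists. The function name `tai` and the toy numbers are hypothetical and are not taken from the study's data; the rank values used for the TAI2-style call assume that a larger number means an older gene.

```python
def tai(expression, ages):
    """Expression-weighted mean gene age: sum(E * AGE) / sum(E)."""
    if len(expression) != len(ages):
        raise ValueError("expression and ages must have the same length")
    total_expression = sum(expression)
    if total_expression == 0:
        raise ValueError("total expression must be non-zero")
    weighted_age = sum(e * a for e, a in zip(expression, ages))
    return weighted_age / total_expression


# Hypothetical toy data: three genes in one tissue.
expr = [10.0, 30.0, 60.0]

# TAI1-style: AGE given as phylogenetic time in Myr.
ages_myr = [936.0, 1628.0, 4200.0]
tai1 = tai(expr, ages_myr)

# TAI2-style: AGE given as a rank (assumed here: larger = older).
ages_rank = [2, 3, 4]
tai2 = tai(expr, ages_rank)

print(tai1, tai2)
```

With AGE measured in Myr, a tissue whose expression is dominated by young genes gets a lower TAI under this definition.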
Figure. (A) Dendrogram showing the clustering of genes by WGCNA. Modules are labeled by different colors. (B) Enrichment of duplicated genes and orphan genes among the modules. Stars mark modules with significant enrichment. The enrichment score of young genes in each module is the proportion of young genes in the module divided by the proportion of other genes in the module. (C) Heatmap of the expression of genes in the blue module. The arrow marks mature pollen. The Z score is defined as the actual value minus the group mean, divided by the standard deviation.
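The two quantities defined in the caption above can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' code: the enrichment score is read here as (fraction of all young genes that fall in the module) divided by (fraction of all other genes that fall in the module), and the Z score uses the sample standard deviation, since the caption does not say which estimator was used. The function names and numbers are hypothetical.

```python
from statistics import mean, stdev


def enrichment_score(young_in_module, young_total, other_in_module, other_total):
    """Ratio of the two proportions (assumed reading of the caption)."""
    young_fraction = young_in_module / young_total
    other_fraction = other_in_module / other_total
    return young_fraction / other_fraction


def z_scores(values):
    """(value - group mean) / standard deviation, per value.

    Assumption: sample standard deviation (n - 1 denominator).
    """
    mu = mean(values)
    sd = stdev(values)
    return [(v - mu) / sd for v in values]


# Hypothetical counts: 50 of 200 young genes and 100 of 1000 other genes
# fall in the module.
score = enrichment_score(50, 200, 100, 1000)

# Hypothetical expression of one gene across three tissues.
z = z_scores([1.0, 2.0, 3.0])

print(score, z)
```

A score above 1 means young genes are over-represented in the module relative to other genes under this reading.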
Figure. (A) Relative expression level of new genes at the four stages of pollen development. The relative expression level is the mean expression level of new genes divided by the mean expression level of all genes genome-wide at each stage. (B) TAI values of the four stages. (C and D) Enrichment of TEs in the regions of duplicated genes and orphan genes. (E) Proportion of the regions covered by transposons upstream (0–2 kb) and downstream (0–2 kb) of young genes and in the coding regions of young duplicated genes and orphan genes. (F and G) Levels of DNA methylation at the CG, CHH, and CHG sites (H = A, C, or T) of duplicated genes and orphan genes in the vegetative nucleus (VN) and sperm cells (SCs).
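The relative expression level described for panel A is a ratio of two means per stage. A minimal Python sketch, assuming expression is stored as a dictionary of gene-to-value mappings per stage; the stage labels, gene identifiers, and values are hypothetical placeholders.

```python
from statistics import mean


def relative_expression(expr_by_stage, new_gene_ids):
    """Mean expression of new genes divided by the genome-wide mean, per stage."""
    result = {}
    for stage, expr in expr_by_stage.items():
        new_mean = mean([expr[g] for g in new_gene_ids])
        genome_mean = mean(list(expr.values()))
        result[stage] = new_mean / genome_mean
    return result


# Hypothetical toy data: four genes, two of them "new", at two stages.
expr_by_stage = {
    "stage_1": {"g1": 2.0, "g2": 4.0, "g3": 6.0, "g4": 8.0},
    "stage_2": {"g1": 9.0, "g2": 11.0, "g3": 4.0, "g4": 6.0},
}
new_genes = ["g1", "g2"]

rel = relative_expression(expr_by_stage, new_genes)
print(rel)
```

A value above 1 at a given stage indicates that new genes are, on average, expressed more highly than the genome-wide mean at that stage.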
Gene Ontology Analysis of Genes in the Blue Module
| GO Term | Gene Count | P Value | Benjamini–Hochberg Corrected P Value |
|---|---|---|---|
| GO:0045449∼regulation of transcription | 304 | 3.67E-10 | 5.66E-07 |
| GO:0006355∼regulation of transcription, DNA-dependent | 182 | 1.19E-09 | 9.14E-07 |
| GO:0051252∼regulation of RNA metabolic process | 182 | 1.84E-09 | 9.48E-07 |
| GO:0006350∼transcription | 205 | 2.52E-08 | 9.70E-06 |
| GO:0032870∼cellular response to hormone stimulus | 84 | 1.28E-07 | 3.94E-05 |
| GO:0009755∼hormone-mediated signaling | 84 | 1.28E-07 | 3.94E-05 |
| GO:0006508∼proteolysis | 165 | 2.31E-07 | 5.94E-05 |
| GO:0006511∼ubiquitin-dependent protein catabolic process | 60 | 3.25E-07 | 7.17E-05 |
| GO:0044257∼cellular protein catabolic process | 100 | 5.66E-07 | 9.70E-05 |
| GO:0051603∼proteolysis involved in cellular protein catabolic process | 99 | 6.29E-07 | 9.71E-05 |
| GO:0043632∼modification-dependent macromolecule catabolic process | 98 | 7.63E-07 | 1.07E-04 |
| GO:0019941∼modification-dependent protein catabolic process | 98 | 7.63E-07 | 1.07E-04 |
| GO:0007242∼intracellular signaling cascade | 131 | 5.58E-07 | 1.08E-04 |
| GO:0044265∼cellular macromolecule catabolic process | 102 | 1.71E-06 | 2.20E-04 |
| GO:0030163∼protein catabolic process | 100 | 1.94E-06 | 2.30E-04 |
| GO:0009057∼macromolecule catabolic process | 113 | 3.90E-06 | 4.30E-04 |
| GO:0030528∼transcription regulator activity | 287 | 1.43E-05 | 0.012428 |
| GO:0009725∼response to hormone stimulus | 134 | 1.61E-04 | 0.016387 |
| GO:0006468∼protein amino acid phosphorylation | 153 | 1.88E-04 | 0.017933 |
| GO:0048545∼response to steroid hormone stimulus | 11 | 2.27E-04 | 0.02042 |
| GO:0009742∼brassinosteroid mediated signaling | 11 | 2.27E-04 | 0.02042 |
| GO:0043401∼steroid hormone mediated signaling | 11 | 2.27E-04 | 0.02042 |
| GO:0003700∼transcription factor activity | 252 | 5.08E-05 | 0.021912 |
| GO:0000160∼two-component signal transduction system (phosphorelay) | 42 | 3.05E-04 | 0.025773 |
| GO:0009719∼response to endogenous stimulus | 140 | 3.75E-04 | 0.029989 |
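
The corrected column in the table reflects a Benjamini–Hochberg adjustment for multiple testing. The sketch below shows the standard step-up procedure in Python on hypothetical P values. It is not expected to reproduce the table's corrected values, because the total number of GO terms tested and the enrichment tool used are not stated in this section; the function name is an assumption for illustration.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(pvalues)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw P values for four tests.
raw = [0.01, 0.04, 0.03, 0.005]
adj = benjamini_hochberg(raw)
print(adj)
```

Each adjusted value is the smallest false discovery rate at which the corresponding test would be called significant.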