Abstract
A growing body of evidence suggests that DNA methylation is functionally divergent among different taxa. The recently discovered functional methylation system in the honeybee Apis mellifera presents an attractive invertebrate model system to study the evolution and function of DNA methylation. In the honeybee, DNA methylation is mostly targeted toward transcription units (gene bodies) of a subset of genes. Here, we report an intriguing covariation of length and epigenetic status of honeybee genes. Hypermethylated and hypomethylated genes in the honeybee differ dramatically in length, for both exons and introns. By analyzing orthologs in Drosophila melanogaster, Acyrthosiphon pisum, and Ciona intestinalis, we show that genes that were short and long in the past are now preferentially situated in the hyper- and hypomethylated classes, respectively, in the honeybee. Moreover, we demonstrate that a subset of high-CpG genes are conspicuously longer than expected under the evolutionary relationship alone and that they are enriched in specific functional categories. We suggest that gene length evolution in the honeybee is partially driven by evolutionary forces related to regulation of gene expression, which in turn is associated with DNA methylation. However, lineage-specific patterns of gene length evolution suggest that there may exist additional forces underlying the observed interaction between DNA methylation and gene lengths in the honeybee.
Year: 2010 PMID: 20924039 PMCID: PMC2975444 DOI: 10.1093/gbe/evq060
Source DB: PubMed Journal: Genome Biol Evol ISSN: 1759-6653 Impact factor: 3.416
Fig. 1. (A) The distribution of CpG O/E in Apis mellifera genes. A mixture of two distributions (represented by two blue curves) fits the observed distribution of CpG O/E (the red curve represents the sum of the two distributions). Accordingly, honeybee genes are classified into low- and high-CpG O/E genes (see text). (B) Length differences between low- and high-CpG O/E genes, shown as a boxplot. Note that gene lengths are shown only up to 80 kb for display purposes. Analyzing exons and introns separately leads to similar patterns of bimodal distributions (Supplementary Material online).
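The classification described in the caption can be sketched in two steps: compute CpG O/E per gene, then fit a two-component mixture to the resulting values. The sketch below is illustrative only. It assumes the commonly used normalization (CpG frequency divided by the product of C and G frequencies) and a Gaussian mixture fitted by expectation-maximization; the source does not state which normalization or which mixture family was used, and the function names `cpg_oe` and `fit_two_gaussians` are hypothetical.

```python
import math


def cpg_oe(seq):
    """CpG observed/expected ratio for one sequence.

    Assumed definition: (CpG count / length) / (C freq * G freq).
    """
    s = seq.upper()
    n = len(s)
    c, g, cpg = s.count("C"), s.count("G"), s.count("CG")
    if c == 0 or g == 0:
        return float("nan")  # ratio is undefined without both C and G
    return (cpg / n) / ((c / n) * (g / n))


def fit_two_gaussians(values, iters=200):
    """Fit a two-component Gaussian mixture by EM (assumed mixture family).

    Needs at least two distinct values. Returns (weights, means, variances),
    with component 0 started from the lower half of the data.
    """
    xs = sorted(values)
    n = len(xs)
    half = n // 2
    mu = [sum(xs[:half]) / half, sum(xs[half:]) / (n - half)]
    mean_all = sum(xs) / n
    var_all = sum((x - mean_all) ** 2 for x in xs) / n
    var = [var_all, var_all]
    w = [0.5, 0.5]
    for _ in range(iters):
        # E-step: posterior probability of each component for each value
        resp = []
        for x in xs:
            dens = [
                w[k]
                * math.exp(-((x - mu[k]) ** 2) / (2 * var[k]))
                / math.sqrt(2 * math.pi * var[k])
                for k in range(2)
            ]
            total = dens[0] + dens[1]
            resp.append((dens[0] / total, dens[1] / total))
        # M-step: update weights, means, and variances
        for k in range(2):
            nk = sum(r[k] for r in resp)
            w[k] = nk / n
            mu[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var[k] = max(
                sum(r[k] * (x - mu[k]) ** 2 for r, x in zip(resp, xs)) / nk,
                1e-6,  # floor keeps the variance from collapsing to zero
            )
    return w, mu, var
```

A gene would then be assigned to the low- or high-CpG class according to which fitted component gives it the higher posterior probability.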
Length Difference between Low- and High-CpG Genes in Honeybee
| | Low-CpG | High-CpG | Ratio | P | Low-CpG | High-CpG | Ratio | P | Low-CpG | High-CpG | Ratio | P |
| Gene body | 2,815 (35.5) | 15,118 (497.1) | 5.37 | <2.2 × 10⁻¹⁶ | 5,301 (65.9) | 13,495 (265.6) | 2.54 | <2.2 × 10⁻¹⁶ | 4,962 (66.6) | 4,961 (70.9) | 1.00 | 0.99 |
| Exons | 1,626 (21.9) | 1,837 (23.5) | 1.13 | 2.9 × 10⁻¹⁶ | 2,001 (19.5) | 1,849 (18.5) | 0.92 | 1.4 × 10⁻⁸ | 1,204 (13.3) | 971 (14.2) | 0.81 | 0.0065 |
| Introns | 1,189 (62.1) | 13,281 (400.7) | 11.12 | <2.2 × 10⁻¹⁶ | 3,300 (54.1) | 11,646 (261.4) | 3.53 | <2.2 × 10⁻¹⁶ | 3,758 (56.7) | 3,990 (63.9) | 1.06 | <2.2 × 10⁻¹⁶ |
NOTE.—Mean lengths in base pairs for each class are presented (standard errors are shown in parentheses). Significance values were assessed using a t-test. Data from C. intestinalis and Ac. pisum are also shown for comparison.
Ratio of high-CpG/low-CpG genes.
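As a worked check on the Ratio column, the sketch below recomputes the high-CpG/low-CpG ratio from the tabulated means and derives a t statistic from the reported means and standard errors. The source says only that a t-test was used; the unequal-variance (Welch-style) statistic here is an assumption, and the helper names are hypothetical.

```python
import math


def length_ratio(mean_low, mean_high):
    """Ratio of mean lengths, high-CpG over low-CpG."""
    return mean_high / mean_low


def t_from_summary(mean_low, se_low, mean_high, se_high):
    """t statistic from two means and their standard errors.

    Assumption: unequal-variance (Welch-style) form; the source does not
    say which t-test variant was applied.
    """
    return (mean_high - mean_low) / math.sqrt(se_low ** 2 + se_high ** 2)


# First gene-body entry of the table: 2,815 (35.5) vs 15,118 (497.1)
ratio = length_ratio(2815, 15118)
t_stat = t_from_summary(2815, 35.5, 15118, 497.1)
print(round(ratio, 2), round(t_stat, 1))
```

For that entry the recomputed ratio rounds to the tabulated 5.37, and the large t statistic is in line with the very small reported P value.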
Gene Length Distribution among the 1:1:1:1 Orthologs between the Honeybee, the Fruitfly, the Pea Aphid, and the Sea Squirt
| | Low-CpG | High-CpG | P | Ortholog Low-CpG | Ortholog High-CpG | P | Ortholog Low-CpG | Ortholog High-CpG | P | Ortholog Low-CpG | Ortholog High-CpG | P |
| Gene body | 3,034 (55.4) | 19,046 (1847.2) | <2.2 × 10⁻¹⁶ | 3,646 (124.7) | 11,459 (3,646.5) | <2.2 × 10⁻¹⁶ | 6,216 (149.3) | 15,293 (812.6) | <2.2 × 10⁻¹⁶ | 5,559 (135.1) | 6,644 (264.3) | 0.0003 |
| Exons | 1,792 (29.6) | 2,014 (67.9) | 0.0027 | 2,220 (36.6) | 2,753 (92.5) | 1.1 × 10⁻⁷ | 2,123 (31.0) | 2,178 (72.8) | 0.487 | 1,603 (23.2) | 1,721 (50.0) | 0.032 |
| Introns | 1,242 (34.3) | 17,032 (1838.1) | <2.2 × 10⁻¹⁶ | 1,426 (109.6) | 8,706 (775.1) | <2.2 × 10⁻¹⁶ | 4,093 (136.8) | 13,115 (795.1) | <2.2 × 10⁻¹⁶ | 3,956 (120.4) | 4,923 (230.4) | 0.0002 |
NOTE.—Mean values for each class are presented (standard errors are shown in parentheses). Significance was assessed using a t-test (N = 2,026).
Orthologs of low-CpG genes in A. mellifera.
Orthologs of high-CpG genes in A. mellifera.
Fig. 2. (A) Gene lengths in Apis mellifera and Drosophila melanogaster are highly correlated. Ortholog length in D. melanogaster explains 41% of the observed variation in A. mellifera gene lengths in a linear regression model (see text). Note that the lengths are log-transformed to improve normality. (B) Residuals from the regression model in panel A. Residuals from low-CpG genes (red triangles) tend to be negative, whereas those from high-CpG genes (green circles) tend to be positive, demonstrating that low-CpG genes are shorter and high-CpG genes are longer than expected from the linear regression model alone. Note that there exists a subset of high-CpG genes with particularly large residuals (denoted as darker green circles). These genes include those related to specific developmental functions (see text).
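The regression described in this caption can be sketched as an ordinary least-squares fit of log-transformed honeybee gene length on log-transformed fruitfly ortholog length, followed by extraction of the residuals. This is a minimal sketch, not the authors' code: the base of the logarithm (base 10 here), the function name, and the toy lengths are all assumptions made for illustration.

```python
import math


def fit_log_length_regression(dmel_lengths, amel_lengths):
    """OLS of log10(A. mellifera length) on log10(D. melanogaster length).

    Returns (slope, intercept, r_squared, residuals). A positive residual
    means the honeybee gene is longer than the fit predicts.
    """
    x = [math.log10(v) for v in dmel_lengths]
    y = [math.log10(v) for v in amel_lengths]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [b - (intercept + slope * a) for a, b in zip(x, y)]
    ss_res = sum(r * r for r in residuals)
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    r_squared = 1 - ss_res / ss_tot
    return slope, intercept, r_squared, residuals


# Toy ortholog lengths in base pairs (made up, not data from the paper)
dmel = [1000, 2000, 4000, 8000, 16000]
amel = [1500, 5000, 3500, 30000, 20000]
slope, intercept, r2, res = fit_log_length_regression(dmel, amel)
```

In this framing, the fraction of variance explained reported in the caption corresponds to `r_squared`, and the genes highlighted in panel B would be those with the largest positive residuals.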
Fig. 3. (A) Gene lengths decrease as expression breadths increase in honeybee genes. (B) Experimentally determined levels of CG methylation increase with expression breadths in honeybee genes. Data on expression breadths were obtained from Foret et al. (2009), who combined microarray profiling of six tissues: antennae, brain, larvae, ovary, thorax, and hypopharyngeal gland. Data on experimentally verified CpG methylation are from Zemach et al. (2010). Gene lengths are shown in kilobases.
Genes with the Greatest Deviations in Length from Associations Predicted by Phylogenetic Analysis (Top 100 Residuals Genes in fig. 2) Are Enriched in Specific GO Terms
| GO Biological Process Term | Accession | Fold Enrichment | Significance |
| Postembryonic development | GO:0009791 | 4.23 | 1.00 × 10⁻⁴ |
| Imaginal disc development | GO:0007444 | 4.15 | 3.27 × 10⁻⁴ |
| Appendage morphogenesis | GO:0035107 | 5.44 | 4.92 × 10⁻⁴ |
| Imaginal disc-derived appendage morphogenesis | GO:0035114 | 5.44 | 4.92 × 10⁻⁴ |
| Appendage development | GO:0048736 | 5.37 | 5.73 × 10⁻⁴ |
| Imaginal disc-derived appendage development | GO:0048737 | 5.37 | 5.73 × 10⁻⁴ |
| Postembryonic organ development | GO:0048569 | 4.84 | 7.04 × 10⁻⁴ |
Significance values are corrected for multiple testing using the Benjamini method.
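For readers unfamiliar with this kind of correction, the sketch below shows one common form, the Benjamini-Hochberg step-up adjustment. That the table's "Benjamini" correction is this exact procedure is an assumption, and the input P values in the example are made up rather than taken from the paper.

```python
def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted P values, returned in the input order.

    Each raw P value is scaled by m / rank (rank 1 = smallest), and a running
    minimum taken from the largest rank downward keeps the result monotone.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw P values for four GO terms
raw = [0.01, 0.04, 0.03, 0.005]
adj = benjamini_hochberg(raw)
```

Terms would then be reported as enriched when their adjusted value falls below the chosen false discovery rate threshold.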