
Loss of Gene Body Methylation in Eutrema salsugineum Is Associated with Reduced Gene Expression.

Aline Muyle1, Brandon S Gaut1.   

Abstract

Gene body methylation (gbM) is typically characterized by DNA methylation in the CG context within coding regions and is associated with constitutive genes that have moderate to high expression levels. A recent study discovered the loss of gbM in two plant species (Eutrema salsugineum and Conringia planisiliqua), illustrating that gbM is not necessary for survival and reproduction. The same paper stated there was no detectable effect of gbM loss on gene expression (GE). Here, we reanalyzed the GE data and accounted for experimental variability in expression level estimates. We show that the loss of gbM in E. salsugineum is associated with a small but highly significant decrease in GE relative to the closely related species Arabidopsis thaliana. Our results are consistent with various evolutionary analyses that suggest gbM has a function, perhaps as a homeostatic effect on GE.

Year:  2019        PMID: 30398664      PMCID: PMC6340462          DOI: 10.1093/molbev/msy204

Source DB:  PubMed          Journal:  Mol Biol Evol        ISSN: 0737-4038            Impact factor:   16.240


Since its first genomic characterization in Arabidopsis thaliana (Cokus et al. 2008; Lister et al. 2008), gene body methylation (hereafter gbM) has been a puzzling phenomenon. Typically, plants methylate cytosines in three contexts: CG, CHG, and CHH (where H = A, C, or T). When all three contexts are methylated, gene expression (hereafter GE) is silenced. In contrast, A. thaliana gbM is characterized by elevated methylation levels in the CG context, but neither in the CHG nor in the CHH context of coding regions. This high CG methylation does not silence genes. In fact, gbM genes tend to be intermediately expressed (Zhang et al. 2006; Lister et al. 2008) and expressed across more tissues than unmethylated (UM) genes (Lister et al. 2008; Kawakatsu et al. 2016). Importantly, not all genes have gbM; only 20% of A. thaliana genes contain CG methylation above background levels (Takuno and Gaut 2012).

The puzzling aspect about gbM is its function. Some argue that gbM must have some function (Zilberman 2017), based on four observations rooted in evolutionary analyses. First, gbM levels are highly conserved between orthologs (Takuno and Gaut 2013; Seymour et al. 2014), even across ∼300 My of land plant evolution (Takuno et al. 2016). Second, gbM genes tend to evolve more slowly than UM genes (Takuno and Gaut 2012, 2013) with lower levels of polymorphism (Takuno et al. 2017), consistent with their enrichment for important functions (Zhang et al. 2006; Takuno and Gaut 2012). Third, most gbM genes are not particularly GC rich, suggesting that elevated CG methylation levels are not a simple consequence of cytosine availability (Takuno and Gaut 2013). In fact, gbM is maintained against mutational biases that decrease GC content over time (Takuno and Gaut 2013). Finally, gbM status seems to have a modest effect on GE, based on an evolutionary comparison between closely related Arabidopsis species (Takuno et al. 2017). This comparison focused on a small subset of orthologs that did not have conserved gbM levels between species. It revealed that gbM genes tend to be more highly expressed than their UM orthologous counterparts, but the overall trend was not convincingly significant. In another study comparing A. thaliana Swedish accessions, gbM genes were found to be more heavily methylated in northern latitudes, which was associated with a higher expression level (Dubin et al. 2015).

Other studies have suggested that gbM has no function and is instead a by-product of transposition and/or methylation pathways (Roudier et al. 2009; Teixeira and Colot 2009; Kawakatsu et al. 2016; Bewick and Schmitz 2017). This argument is consistent with observations that: 1) the loss of gbM in an A. thaliana mutant does not substantially alter GE (Roudier et al. 2009; Bewick et al. 2016) and 2) gbM variation across A. thaliana accessions does not strongly affect GE (Kawakatsu et al. 2016). Two recent papers have seemingly strengthened the argument against gbM functionality, because they found that two flowering plants (Eutrema salsugineum and Conringia planisiliqua) have no gbM throughout their genome. The identification of these species is important for two reasons. First, they have provided important insights into the mechanism that produces gbM, which had been mysterious (Bewick et al. 2017). Second, they provide prima facie evidence against gbM function, because these two plant species seem to exist happily without it. In support of this argument, Bewick et al. (2016) compared GE between genes that have gbM in A. thaliana and their UM orthologs in E. salsugineum. They concluded that the two sets of genes had similar transcription levels, again suggesting that gbM has no function.

Here, we use the data of Bewick et al. (2016) to revisit GE analyses and to test whether an effect of gbM can be detected. We began by defining gbM genes and UM genes in A. thaliana (see Materials and Methods) using the same statistical approach and thresholds as Bewick et al. (2016). We then identified 12,189 orthologous UM genes in E. salsugineum, based on best-hits between species, again using the same methods as Bewick et al. (2016). These ortholog pairs were then separated into two groups (table 1) for inclusion in GE analyses. Group 1 consists of 4,221 ortholog pairs that changed gbM status between species; that is, the gene was gbM in A. thaliana and UM in E. salsugineum. Group 2 includes 7,968 ortholog pairs that did not change gbM status, because they were UM in both species. These Group 2 genes can be viewed as a “control” set and are necessary for comparative analyses.
Table 1.

Definition of the Two Gene Groups.

Ortholog Pairs | Methylation in A. thaliana | Methylation in E. salsugineum | Change in Methylation Status | Number of Orthologous Genes
Group 1 | gbM | UM | Yes | 4,221
Group 2 | UM | UM | No | 7,968

Note.—The first group consists of genes that have changed methylation status between A. thaliana and E. salsugineum. The second group has conserved methylation status as UM genes.
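The grouping in table 1 can be sketched in a few lines of Python. This is only an illustration of the rule described in the text, not the authors' pipeline: the function name `assign_group`, the status labels, and the example ortholog pairs are all hypothetical.

```python
from collections import Counter


def assign_group(status_athaliana, status_esalsugineum):
    """Return 1 or 2 for an ortholog pair, or None if it fits neither group.

    Statuses are plain strings ("gbM" or "UM"); these labels are assumptions
    made for this sketch.
    """
    if status_esalsugineum != "UM":
        return None  # both groups in table 1 require a UM ortholog in E. salsugineum
    if status_athaliana == "gbM":
        return 1  # methylation status changed between species
    if status_athaliana == "UM":
        return 2  # UM in both species (the "control" set)
    return None


# Hypothetical ortholog pairs: (pair id, A. thaliana status, E. salsugineum status)
pairs = [
    ("pair_a", "gbM", "UM"),
    ("pair_b", "UM", "UM"),
    ("pair_c", "UM", "UM"),
    ("pair_d", "other", "UM"),
]

groups = {pair_id: assign_group(a, e) for pair_id, a, e in pairs}
counts = Counter(g for g in groups.values() if g is not None)
print(groups)
print(counts)
```

As a consistency check on the reported numbers, the two group sizes sum to the number of ortholog pairs: 4,221 + 7,968 = 12,189.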

Given these two gene groups, we gathered GE data from Bewick et al. (2016) and contrasted expression levels between A. thaliana and E. salsugineum for Group 1 orthologs (fig. 1). These orthologs differed significantly in GE, with lower expression in E. salsugineum (one-sided Wilcoxon test, W = 95218000, P-value < 2.2e–16). However, we also found significantly lower expression in the full data set of 26,248 E. salsugineum genes compared with the 27,066 A. thaliana genes (one-sided Wilcoxon test, W = 3913700000, P-value < 2.2e–16). We concluded that these results are likely due to inherent experimental biases that cause global differences in expression levels between the two species. This global difference in GE between the two species may be one reason why Bewick et al. (2016) did not find a link between gbM and GE.
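For readers unfamiliar with the test, a one-sided Wilcoxon rank-sum comparison can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: it uses the Mann-Whitney form of W (rank sum of the first sample minus its minimum possible value), a normal approximation without tie or continuity correction, and made-up expression values. The function names and the order in which the samples are passed are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def rank_sum_w(x, y):
    """Return W: the number of (x_i, y_j) pairs with x_i > y_j, ties counted as 0.5."""
    labelled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    rank_sum_x = 0.0
    i = 0
    while i < len(labelled):
        # Find the block of tied values starting at position i.
        j = i
        while j < len(labelled) and labelled[j][0] == labelled[i][0]:
            j += 1
        mid_rank = (i + 1 + j) / 2  # average of ranks i+1 .. j
        n_x_in_block = sum(1 for k in range(i, j) if labelled[k][1] == 0)
        rank_sum_x += mid_rank * n_x_in_block
        i = j
    return rank_sum_x - len(x) * (len(x) + 1) / 2


def one_sided_p_greater(x, y):
    """Approximate P-value for the alternative 'x tends to exceed y'."""
    n1, n2 = len(x), len(y)
    w = rank_sum_w(x, y)
    mean_w = n1 * n2 / 2
    sd_w = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)  # no tie correction in this sketch
    z = (w - mean_w) / sd_w
    return 1.0 - NormalDist().cdf(z)


# Made-up log expression values, for illustration only.
a_thaliana = [5.1, 4.8, 6.0, 5.5]
e_salsugineum = [3.9, 4.2, 4.9, 3.5]

w = rank_sum_w(a_thaliana, e_salsugineum)
p = one_sided_p_greater(a_thaliana, e_salsugineum)
print(w, p)
```

With thousands of genes per sample, as in the comparisons above, W grows with the product of the two sample sizes, which is why the reported statistics are so large.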
Fig. 1.

Expression levels (log scale) for Arabidopsis thaliana and Eutrema salsugineum in the two gene Groups defined in table 1. Each boxplot shows the median; the hinges are the first and third quartiles (the 25th and 75th percentiles), and the whiskers extend from each hinge to the largest or smallest value no further than 1.5 times the interquartile range (the distance between the first and third quartiles).

To account for the global species difference in expression, which is (again) likely due to experimental biases since all genes were affected, we applied a linear model with mixed effects to the expression data (see Materials and Methods). In addition to accounting for GE differences between species, the model can address a specific hypothesis: if gbM modulates GE, we expect the 4,221 Group 1 genes to exhibit more substantial differences in GE between species than the 7,968 Group 2 genes (table 1). After applying the model, we found that the average expression in A. thaliana for Group 2 genes is 2.08 FPKM on a log scale (the intercept in table 2). The same genes (Group 2) were expressed at significantly lower levels in E. salsugineum (0.32 FPKM less on a log scale, on average; the species effect in table 2). In A. thaliana, Group 1 genes were significantly more expressed than Group 2 genes (0.49 FPKM more on a log scale, on average; the Group effect in table 2), consistent with the known high to intermediate expression level of gbM genes. Finally, after taking into account the species effect, Group 1 orthologs have significantly lower expression levels in E. salsugineum compared with A. thaliana (the interaction effect in table 2). Although the change in expression level is small (–0.14 FPKM on a log scale), the fact that it is significant shows that this difference is consistent across genes in the data set. The results also hold after excluding 1,321 lowly expressed genes. Therefore, we conclude that the loss of gbM in E. salsugineum Group 1 genes is associated with a small but significant decrease in expression level relative to the same genes, which are gbM in A. thaliana.
Table 2.

Results of the Linear Regression Model with Mixed Effects (see Materials and Methods for details).

Fixed Effect | Estimate | 95% Confidence Interval | t-Value | P-value
Average expression in A. thaliana for Group 2 genes (intercept) | 2.08 | 1.96–2.21 | |
Difference in expression between E. salsugineum and A. thaliana for Group 2 genes (species effect) | –0.32 | –0.33 to –0.31 | –55.373 | <0.001
Difference in expression between Group 1 and Group 2 genes in A. thaliana (Group effect) | 0.49 | 0.44–0.54 | 18.67 | <0.001
Additional difference in expression between E. salsugineum and A. thaliana for Group 1 genes (interaction effect) | –0.14 | –0.16 to –0.13 | –15.63 | <0.001

Note.—See table 1 for the definition of gene Groups and for gene numbers. For each fixed effect of the model and their interaction, the estimated average change in expression level is shown in log scale, along with the 95% confidence interval, t-value, and P-value. The first line (intercept) shows the average expression (FPKM in log scale) for one species and one gene Group; subsequent lines show differences in expression observed with that intercept and whether the differences are significant.
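To make the table easier to read, the four fixed-effect estimates can be combined into a predicted average log expression for each species and gene Group. The sketch below only does this arithmetic with the point estimates from table 2; the function and variable names are illustrative, and the final fold-change line additionally assumes a natural-log scale, which the text does not state.

```python
from math import exp

# Point estimates copied from table 2 (log-scale FPKM).
intercept = 2.08        # A. thaliana, Group 2
species_effect = -0.32  # E. salsugineum vs. A. thaliana, Group 2
group_effect = 0.49     # Group 1 vs. Group 2, A. thaliana
interaction = -0.14     # additional species difference for Group 1


def predicted_log_expression(is_esalsugineum, is_group1):
    """Sum the fixed effects that apply to one species/Group combination."""
    return (intercept
            + species_effect * is_esalsugineum
            + group_effect * is_group1
            + interaction * is_esalsugineum * is_group1)


cells = {
    ("A. thaliana", 2): predicted_log_expression(0, 0),
    ("E. salsugineum", 2): predicted_log_expression(1, 0),
    ("A. thaliana", 1): predicted_log_expression(0, 1),
    ("E. salsugineum", 1): predicted_log_expression(1, 1),
}
for key, value in cells.items():
    print(key, round(value, 2))

# Between-species difference within each Group; their gap is the interaction.
diff_group1 = cells[("E. salsugineum", 1)] - cells[("A. thaliana", 1)]
diff_group2 = cells[("E. salsugineum", 2)] - cells[("A. thaliana", 2)]
print(round(diff_group1, 2), round(diff_group2, 2), round(diff_group1 - diff_group2, 2))

# ASSUMPTION: if the log is natural, the interaction corresponds to this ratio.
fold_change_if_natural_log = exp(interaction)
print(round(fold_change_if_natural_log, 3))
```

Under these estimates, Group 1 genes drop by 0.46 log units between species while Group 2 genes drop by 0.32, and the 0.14 gap between those two drops is the interaction effect attributed to gbM loss.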

Our result is important for at least two reasons. First, by utilizing E. salsugineum, a species naturally devoid of gbM, we have studied the effect of gbM loss from 4,221 Group 1 genes simultaneously. A previous paper took a similar approach but could study only hundreds of genes (Takuno et al. 2017). It is worth noting, however, that both studies have detected a modest but consistent association between gbM and GE, with a trend toward higher expression levels for gbM genes. Second, by contrasting the GE effects between two groups (that is, genes that did and did not differ in gbM status; table 1), we have effectively controlled for experimental effects. We suspect that our conclusions differ from Bewick et al. (2016) in part due to the use of Group 2 as a control to contrast to Group 1 genes. We believe these results bolster the argument, which is largely based on evolutionary analyses, that gbM has some function, specifically with regard to effects on GE. Although the loss of gbM in E. salsugineum is associated with only a small reduction in GE at the scale of individual genes, the effect is significant and consistent across 4,221 genes.

We note that natural selection can act on functional genomic features that have small effects on fitness, such as codon usage bias (Duret and Mouchiroud 1999), as long as the species effective population size (Ne) is large enough for selection to act efficaciously. One reason why E. salsugineum lost gbM, whereas most angiosperms have preserved it, could be the relatively small Ne of this species and its specialized halophytic niche (Wang et al. 2018). The question remains, however, as to the exact nature and mechanism linking gbM and GE. Zilberman (2017) argues persuasively that the primary effect of gbM is likely to be homeostatic, by (for example) preventing aberrant transcription within genes or restricting access to histone H2A.Z, which is associated with gene responsiveness. If the effect of gbM is to better regulate GE, it may have been missed up to now due to technical limitations. Indeed, current estimates of GE are averaged across thousands of cells, and because the effect of gbM is small, it may be necessary to consider single cell approaches to better understand it (Ji et al. 2015; Zilberman 2017). Whatever the function, the effect of gbM is modest enough to be difficult to detect on experimental time-scales, but strong enough to be conserved among plant orthologs.

Materials and Methods

Data Set

We used data from Bewick et al. (2016) and Niederhuth et al. (2016) from five Brassicaceae species: A. thaliana, Arabidopsis lyrata, Brassica rapa, Capsella rubella, and E. salsugineum (supplementary table S1, Supplementary Material online). The methylation data consisted of the methylation status of each cytosine based on a binomial test (Lister et al. 2008). The RNA-seq data were reported by Bewick et al. (2016) as the FPKM level for each gene for six A. thaliana replicates (three leaf and three aerial tissue) and two E. salsugineum replicates from leaf tissue (supplementary table S1, Supplementary Material online).

gbM Inference

For each gene, the methylation state was inferred using the same approach as Bewick et al. (2016) and Niederhuth et al. (2016); for details, see Supplementary Material.

Analysis of Expression and Methylation Levels

GE levels were analyzed using a linear regression model with mixed effects using the R package lme4 (Bates et al. 2015). FPKM expression levels were log transformed. To account for inter-gene variability, a random gene effect was included in the model (see eq. 1). Similarly, a random tissue effect was included (eq. 1) to account for variability in expression between leaf and aerial RNA-seq data. The aim of this model was to compare the expression level of gbM genes in A. thaliana to genes that lost gbM in E. salsugineum, while taking into account any global difference in expression levels between the two species. In order to achieve this, we defined a fixed effect called “gene Group” (table 1). A species fixed effect was also used in the model to account for global differences in expression levels between the two species. The resulting model was written to examine an effect of gene Group on expression level after taking the species effect into account; that is, with the terms described above:

log(FPKM) ~ species + gene Group + species × gene Group + (1 | gene) + (1 | tissue)    (eq. 1)

Significance for fixed effects and their interaction was determined by comparing the fit of the full model to nested models that first removed the interaction and then removed one effect at a time. P-values for each effect and their interaction were computed via the Wald-statistics approximation using the sjPlot R package (Lüdecke 2018).
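The logic of the species-by-Group interaction can be illustrated without fitting a mixed model. The sketch below is a deliberate simplification, not the lme4 analysis described above: it ignores the random gene and tissue effects and simply computes a difference-in-differences of mean log expression on made-up FPKM values. The record layout and function names are hypothetical, and zero FPKM values (which cannot be log transformed directly) are not handled.

```python
from math import log
from statistics import mean

# Made-up records for illustration: (species, gene Group, FPKM).
records = [
    ("A. thaliana", 1, 14.0), ("A. thaliana", 1, 12.0),
    ("A. thaliana", 2, 8.0), ("A. thaliana", 2, 9.0),
    ("E. salsugineum", 1, 8.5), ("E. salsugineum", 1, 8.0),
    ("E. salsugineum", 2, 6.0), ("E. salsugineum", 2, 6.5),
]


def cell_mean(species, group):
    """Mean log(FPKM) for one species and one gene Group."""
    return mean(log(fpkm) for s, g, fpkm in records if s == species and g == group)


def species_difference(group):
    """E. salsugineum minus A. thaliana, within one gene Group."""
    return cell_mean("E. salsugineum", group) - cell_mean("A. thaliana", group)


# Group 2 captures the global species difference (the "control" contrast);
# whatever Group 1 shows beyond that plays the role of the interaction term.
global_species_shift = species_difference(2)
interaction_estimate = species_difference(1) - global_species_shift
print(global_species_shift, interaction_estimate)
```

In a balanced design this difference-in-differences coincides with the interaction coefficient of an ordinary linear model; the mixed model additionally accounts for repeated measurements of the same gene and for tissue.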

Supplementary Material

Supplementary data are available at Molecular Biology and Evolution online.
References

1.  Genome-wide high-resolution mapping and functional analysis of DNA methylation in arabidopsis.

Authors:  Xiaoyu Zhang; Junshi Yazaki; Ambika Sundaresan; Shawn Cokus; Simon W-L Chan; Huaming Chen; Ian R Henderson; Paul Shinn; Matteo Pellegrini; Steve E Jacobsen; Joseph R Ecker
Journal:  Cell       Date:  2006-08-31       Impact factor: 41.582

2.  Body-methylated genes in Arabidopsis thaliana are functionally important and evolve slowly.

Authors:  Shohei Takuno; Brandon S Gaut
Journal:  Mol Biol Evol       Date:  2011-08-02       Impact factor: 16.240

Review 3.  Crop Epigenomics: Identifying, Unlocking, and Harnessing Cryptic Variation in Crop Genomes.

Authors:  Lexiang Ji; Drexel A Neumann; Robert J Schmitz
Journal:  Mol Plant       Date:  2015-01-29       Impact factor: 13.164

4.  Evolutionary patterns of genic DNA methylation vary across land plants.

Authors:  Shohei Takuno; Jin-Hua Ran; Brandon S Gaut
Journal:  Nat Plants       Date:  2016-01-25       Impact factor: 15.793

5.  Highly integrated single-base resolution maps of the epigenome in Arabidopsis.

Authors:  Ryan Lister; Ronan C O'Malley; Julian Tonti-Filippini; Brian D Gregory; Charles C Berry; A Harvey Millar; Joseph R Ecker
Journal:  Cell       Date:  2008-05-02       Impact factor: 41.582

6.  Demographic expansion and genetic load of the halophyte model plant Eutrema salsugineum.

Authors:  Xiao-Juan Wang; Quan-Jun Hu; Xin-Yi Guo; Kun Wang; Da-Fu Ru; Dmitry A German; Elizabeth A Weretilnyk; Richard J Abbott; Martin Lascoux; Jian-Quan Liu
Journal:  Mol Ecol       Date:  2018-06-22       Impact factor: 6.185

7.  DNA methylation in Arabidopsis has a genetic basis and shows evidence of local adaptation.

Authors:  Manu J Dubin; Pei Zhang; Dazhe Meng; Marie-Stanislas Remigereau; Edward J Osborne; Francesco Paolo Casale; Philipp Drewe; André Kahles; Geraldine Jean; Bjarni Vilhjálmsson; Joanna Jagoda; Selen Irez; Viktor Voronin; Qiang Song; Quan Long; Gunnar Rätsch; Oliver Stegle; Richard M Clark; Magnus Nordborg
Journal:  Elife       Date:  2015-05-05       Impact factor: 8.140

8.  Evolution of DNA methylation patterns in the Brassicaceae is driven by differences in genome organization.

Authors:  Danelle K Seymour; Daniel Koenig; Jörg Hagmann; Claude Becker; Detlef Weigel
Journal:  PLoS Genet       Date:  2014-11-13       Impact factor: 5.917

Review 9.  An evolutionary case for functional gene body methylation in plants and animals.

Authors:  Daniel Zilberman
Journal:  Genome Biol       Date:  2017-05-09       Impact factor: 13.583

10.  Shotgun bisulphite sequencing of the Arabidopsis genome reveals DNA methylation patterning.

Authors:  Shawn J Cokus; Suhua Feng; Xiaoyu Zhang; Zugen Chen; Barry Merriman; Christian D Haudenschild; Sriharsa Pradhan; Stanley F Nelson; Matteo Pellegrini; Steven E Jacobsen
Journal:  Nature       Date:  2008-02-17       Impact factor: 49.962
