Parent-of-origin-dependent gene expression in mammals and flowering plants results from differing chromatin imprints (genomic imprinting) between maternally and paternally inherited alleles. Imprinted gene expression in the endosperm of seeds is associated with localized hypomethylation of maternally but not paternally inherited DNA, with certain small RNAs also displaying parent-of-origin-specific expression. To understand the evolution of imprinting mechanisms in Oryza sativa (rice), we analyzed imprinting divergence among four cultivars that span both japonica and indica subspecies: Nipponbare, Kitaake, 93-11, and IR64. Most imprinted genes are imprinted across cultivars and enriched for functions in chromatin and transcriptional regulation, development, and signaling. However, 4 to 11% of imprinted genes display divergent imprinting. Analyses of DNA methylation and small RNAs revealed that endosperm-specific 24-nt small RNA-producing loci show weak RNA-directed DNA methylation, frequently overlap genes, and are imprinted four times more often than genes. However, imprinting divergence most often correlated with local DNA methylation epimutations (9 of 17 assessable loci), which were largely stable within subspecies. Small insertion/deletion events and transposable element insertions accompanied 4 of the 9 locally epimutated loci and associated with imprinting divergence at another 4 of the remaining 8 loci. Correlating epigenetic and genetic variation occurred at key regulatory regions-the promoter and transcription start site of maternally biased genes, and the promoter and gene body of paternally biased genes. Our results reinforce models for the role of maternal-specific DNA hypomethylation in imprinting of both maternally and paternally biased genes, and highlight the role of transposition and epimutation in rice imprinting evolution.
Parent-of-origin-dependent gene expression in mammals and flowering plants results from differing chromatin imprints (genomic imprinting) between maternally and paternally inherited alleles. Imprinted gene expression in the endosperm of seeds is associated with localized hypomethylation of maternally but not paternally inherited DNA, with certain small RNAs also displaying parent-of-origin-specific expression. To understand the evolution of imprinting mechanisms in Oryza sativa (rice), we analyzed imprinting divergence among four cultivars that span both japonica and indica subspecies: Nipponbare, Kitaake, 93-11, and IR64. Most imprinted genes are imprinted across cultivars and enriched for functions in chromatin and transcriptional regulation, development, and signaling. However, 4 to 11% of imprinted genes display divergent imprinting. Analyses of DNA methylation and small RNAs revealed that endosperm-specific 24-nt small RNA-producing loci show weak RNA-directed DNA methylation, frequently overlap genes, and are imprinted four times more often than genes. However, imprinting divergence most often correlated with local DNA methylation epimutations (9 of 17 assessable loci), which were largely stable within subspecies. Small insertion/deletion events and transposable element insertions accompanied 4 of the 9 locally epimutated loci and associated with imprinting divergence at another 4 of the remaining 8 loci. Correlating epigenetic and genetic variation occurred at key regulatory regions-the promoter and transcription start site of maternally biased genes, and the promoter and gene body of paternally biased genes. Our results reinforce models for the role of maternal-specific DNA hypomethylation in imprinting of both maternally and paternally biased genes, and highlight the role of transposition and epimutation in rice imprinting evolution.
Genomic imprinting results in biased expression from one parentally inherited allele, even though both alleles are present in the same cell and may be genetically identical. Here, maternally and paternally inherited alleles are differentiated postfertilization by a mark that is epigenetic or, in other words, not directly related to DNA sequence. This epigenetic mark is referred to as the “imprint” (1). To date, locus-specific imprinting of largely nonimprinted chromosomes is only observed in species with a “placental habit,” namely placental mammals and flowering plants (2). The phenomenon appears to have evolved independently in both clades, around the same time as the placenta in mammals (3) and its equivalent, the endosperm, in plants (2). Imprinting still predominantly occurs in placental tissue (4–9) and seems crucial for offspring development (4, 10, 11). Several imprinted genes are important regulators of development (1, 4, 12).Mammalian imprinting appears to have evolved via the spread of long terminal repeat (LTR) retrotransposons (3). Though the origins of plant imprinting are less clear, current evidence also suggests a role for transposable elements (TEs) in imprint initiation (1, 13–16). Not only do TEs create novel genetic variation at genes and promoters but TE-targeted chromatin modifications can affect transcription of neighboring genes (17, 18). Chromatin modification of nearby TE-related sequences is required for parent-of-origin–specific expression of several imprinted genes (7, 19–23). Additionally, allele-specific imprinting variation among Arabidopsis thaliana ecotypes has been associated with epigenetic variation at TEs when the underlying genetic sequences show little sequence divergence, suggesting that epimutation plays an additional role in the ongoing evolution of imprinting (7, 24).Studies in A. thaliana, maize, and rice indicate that DNA methylation and histone 3 lysine 27 trimethylation (H3K27me3) are the major imprints for endosperm-expressed genes (5, 16, 20, 25–27). Plant endosperm is a triploid tissue, formed by fusion of a homodiploid female central cell and haploid male sperm cell in a fertilization event independent of but concurrent with that which creates the embryo (28, 29). DNA methylation-based imprints are established prefertilization by glycosylase-mediated DNA demethylation in the central cell and DNA methylation maintenance in developing sperm cells (30–35). Thus, they are observed as site-specific hypomethylation of maternally inherited but not paternally inherited endosperm DNA (26, 36, 37) and considered to be a cause rather than a consequence of imprinting (38). In mammals, differential DNA methylation is considered the canonical cause of imprinting (39, 40), with H3K27me3 serving as a noncanonical mechanism at a subset of paternally expressed genes (41). Both mechanisms serve to maintain mammalian imprints postfertilization (40, 42).Like mammalian H3K27me3 imprints, plant H3K27me3 imprints are initiated and maintained by Polycomb group (PcG) proteins and are more often associated with paternally expressed than maternally expressed imprinted genes (5, 25, 43–46). While some endosperm H3K27me imprints are triggered by DNA hypomethylation (22, 25, 47, 48), others are observed in its absence, suggesting either that the causal imprint is H3K27me3 (49) or a yet undiscovered epigenetic mark, or that the genes are regulated by unknown long-distance regulatory elements, or that their endosperm epigenetic state does not reflect gametic imprints (50). Small RNAs (sRNAs) are also associated with imprint formation at some loci in the endosperm of A. thaliana, maize, and rice, such that maternally or paternally biased sRNA expression overlaps the silenced allele (12, 26, 43, 51, 52).Here, we further dissect mechanisms central to plant imprinting and evolution in rice, a staple food crop for much of the world. In order to identify the contributions of genetic and epigenetic variation in ongoing divergence of rice cultivars, we identified divergence in imprinted expression across Kitaake, Nipponbare, 93-11, and IR64 rice cultivars of Oryza sativa L., using reciprocal crosses and cultivar-specific single-nucleotide polymorphisms (SNPs). By exploring differences in DNA methylation and sRNA production among cultivars, we find that genes with diverged imprinting differ in DNA methylation of key regulatory regions, and that these changes in DNA methylation may or may not be associated with genetic changes such as the transposition of class I or class II elements.
Results
Approximately 10% of Genes Expressed in Rice Endosperm Are Imprinted, with About 4 to 11% of Those Genes Displaying Imprinting Divergence across Four Rice Cultivars.
We performed one reciprocal set of intra-japonica crosses (i.e., Nipponbare × Kitaake and Kitaake × Nipponbare), one set of intra-indica crosses (i.e., IR64 × 93-11 and 93-11 × IR64), and two sets of intersubspecies crosses (Nipponbare × IR64 and IR64 × Nipponbare, and Nipponbare × 93-11 and 93-11 × Nipponbare). Genome-wide comparison of identified SNPs revealed that Nipponbare and Kitaake japonica cultivars are the most closely related of the four, with the next most closely related, 93-11 and IR64 indica cultivars, differing at 3 times as many nucleotide positions while indica and japonica cultivars differ at 10 times as many positions ( and Dataset S1). F1 seeds dissected at day 7 to 8 after pollination were genotyped to verify hybridity prior to endosperm and embryo RNA extraction for strand-specific RNA-sequencing (RNA-seq) libraries. Duplex-specific nuclease treatment of endosperm libraries enhanced transcript detection without altering the ratio of maternal SNPs to paternal SNPs ().Though 35 to 42% of annotated rice genes were detected in endosperm libraries, only ∼20% of annotated genes had sufficient read depth at informative SNPs for reliable detection of expression biases in intersubspecies crosses (Fig. 1 ). Only 2 to 6% of annotated genes (5 to 17% of detectable transcripts) could be evaluated in intrasubspecies crosses. Cultivar-specific expression bias prevented clear imprinting inference at a further third of the transcripts. Of the remaining genes, which are the genes truly assessable for imprinting, parental biases varied along a continuum, as previously noted in rice (6). We found that roughly 10% of assessable genes appear to be imprinted in a moderate, strong, or complete manner, as defined by significant deviation (Fisher’s exact test, P < 0.01) of the maternally derived fraction of sorted reads from the expected ratio of 0.67 (Fig. 1). Having identified several hundred candidate imprinted genes (Fig. 1), we inferred imprinting conservation and divergence from reciprocal cross data as described in , resolving many initial candidates for imprinting divergence to technical artifacts. Finally, we defined a set of 217 conserved paternally expressed genes (cPEGs), 175 conserved maternally expressed genes (cMEGs), and 19 genes with diverged imprinting (Fig. 1, , and Dataset S2). Our list of conserved imprinted rice genes overlaps the majority of those from other groups (). Differences are likely due to variation in SNPs, cultivars, analytical pipelines, developmental stages, and growth conditions. Of the 19 genes with divergent imprinting, variation between imprinted and biallelic states is the most common form of divergence, with only one gene varying between maternally biased and paternally biased expression.
Fig. 1.
Detection of imprinted genes in reciprocal crosses. (A) Imprinted gene expression was defined as significant deviation (Fisher’s exact test, P < 0.01) from the biallelic parental ratio for endosperm (dashed line), and classified as moderate, strong (over 90% uniparental), or complete (over 99% uniparental). (B) Genes were grouped based on assessment of parental biases. (C) Hundreds of genes exhibited significant parental biases. (D) For MEGs and PEGs where imprinting could be assessed across Nipponbare, 93-11, and IR64 cultivars, imprinting was more often conserved than diverged.
Detection of imprinted genes in reciprocal crosses. (A) Imprinted gene expression was defined as significant deviation (Fisher’s exact test, P < 0.01) from the biallelic parental ratio for endosperm (dashed line), and classified as moderate, strong (over 90% uniparental), or complete (over 99% uniparental). (B) Genes were grouped based on assessment of parental biases. (C) Hundreds of genes exhibited significant parental biases. (D) For MEGs and PEGs where imprinting could be assessed across Nipponbare, 93-11, and IR64 cultivars, imprinting was more often conserved than diverged.
Functional Gene Ontology Term Enrichment of Imprinted Genes Reveals Regulatory Functions.
cMEGs and cPEGs were functionally annotated with the agriGO toolkit (53), and gene ontology (GO) term enrichment was compared with the pool of assessable endosperm genes. cPEGs were similar to those in both maize (54) and A. thaliana (7) in being enriched for functions in regulation, including that of chromatin and transcription (). cMEGs were enriched for functions in transcriptional regulation, development, and signaling (), and thus similar to those in maize (54) but not to those in A. thaliana, where only a weak enrichment for transcription factors was seen (7). However, the lack of GO annotations for 38% of cMEGs, compared with just 16% of cPEGs, suggests that the enrichment analysis may not be as illustrative of the general roles of cMEGs. Twenty of the 175 MEGs (11%) lacked GO annotations because they were TE genes (11 retrotransposons, 4 Pong MITE DNA transposons, 2 MuDR DNA transposons, 1 CACTA DNA transposon, and 2 unclassified). By contrast, only one of the 217 PEGs encoded an annotated TE protein (an LTR retrotransposon). Although the list of genes with diverged imprinting was too small to draw definitive conclusions about GO term enrichment, both MEGs and PEGs appeared enriched for similar functions compared with their conserved counterparts, with roles in transcription, chromatin regulation, protein modification, and protein–protein interactions for PEGs, and transposition and chromatin and transcriptional regulation for MEGs ().
sRNA Populations in Rice Endosperm Are Similar across Cultivars and Are Four Times More Likely to Be Imprinted than Genes.
Having previously described a unique sRNA population in the endosperm of Kitaake and Nipponbare japonica cultivars (26), we performed strand-specific sequencing of sRNA () from dissected endosperm and embryos of 7- to 8-d-old seeds of IR64 and 93-11 rice cultivars, as well as the reciprocal crosses used for detection of imprinted genes. Endosperm sRNA populations of IR64 and 93-11 indica cultivars also predominantly originate from the relatively small number of euchromatic loci we identified in Kitaake and Nipponbare. These “siren” loci (small-interfering RNA in endosperm) are often associated with high sRNA abundance in all four cultivars (), but a minority show divergent levels of expression among cultivars (Fig. 2). Of 801 defined siren loci (Dataset S3), imprinting was observed for 41% of the 186 with significant read depth at informative SNPs and a lack of confounding variety-specific bias in crosses of at least three cultivars (). More siren loci were paternally biased than maternally biased, with 27% compared with 11% (Fig. 2).
Fig. 2.
Siren loci, which are loci associated with a relatively high production of 24-nt sRNAs in rice endosperm, can be imprinted and associate with gene silencing. (A) Venn diagram showing the overlap of significantly high 24-nt sRNA expression in siren loci of different rice cultivars. (B) Where imprinting could be assessed, a larger proportion of siren loci are imprinted compared with genes. (C) Most siren loci overlap genes or nearby intergenic regions and this trend is more marked for siren loci conserved among cultivars. (D) Where imprinted siren loci overlap genes, 24-nt sRNAs are expressed from the silenced parental allele of the gene. (E) Siren loci show greater CHH methylation in endosperm (endo; orange) than embryo (emb; green), unlike other 24-nt sRNA-producing regions of endosperm. Average fractional methylation of intersubspecies reciprocal crosses and the corresponding difference between parental alleles are shown for 50-bp windows, using embryo–endosperm CG differentially methylated regions as a control. Parental alleles differ most at imprinted siren loci compared with other endosperm sRNA-producing regions, though a paucity of data points reduces statistical power. sRNA-producing windows were defined where the sum of 24-nt reads across the four intersubspecies crosses was ≥30. Window methylation is based on ≥15 cytosines per cross, with differences between parental alleles only calculated where at least one allele is significantly methylated (>0.15) in each direction of the reciprocal cross. Data points are displayed in colors based on parent-of-origin–specific bias (blue, paternal; red, maternal; gray, biallelic) when <1,000 per violin. Different letter groupings indicate Bonferroni-corrected P < 0.01 in pairwise Wilcoxon rank-sum tests. Markers indicate the median. To resolve differences between genomic features, sRNA-expressing windows exclude CG DMRs and those of siren loci additionally exclude repeats.
Siren loci, which are loci associated with a relatively high production of 24-nt sRNAs in rice endosperm, can be imprinted and associate with gene silencing. (A) Venn diagram showing the overlap of significantly high 24-nt sRNA expression in siren loci of different rice cultivars. (B) Where imprinting could be assessed, a larger proportion of siren loci are imprinted compared with genes. (C) Most siren loci overlap genes or nearby intergenic regions and this trend is more marked for siren loci conserved among cultivars. (D) Where imprinted siren loci overlap genes, 24-nt sRNAs are expressed from the silenced parental allele of the gene. (E) Siren loci show greater CHH methylation in endosperm (endo; orange) than embryo (emb; green), unlike other 24-nt sRNA-producing regions of endosperm. Average fractional methylation of intersubspecies reciprocal crosses and the corresponding difference between parental alleles are shown for 50-bp windows, using embryo–endosperm CG differentially methylated regions as a control. Parental alleles differ most at imprinted siren loci compared with other endosperm sRNA-producing regions, though a paucity of data points reduces statistical power. sRNA-producing windows were defined where the sum of 24-nt reads across the four intersubspecies crosses was ≥30. Window methylation is based on ≥15 cytosines per cross, with differences between parental alleles only calculated where at least one allele is significantly methylated (>0.15) in each direction of the reciprocal cross. Data points are displayed in colors based on parent-of-origin–specific bias (blue, paternal; red, maternal; gray, biallelic) when <1,000 per violin. Different letter groupings indicate Bonferroni-corrected P < 0.01 in pairwise Wilcoxon rank-sum tests. Markers indicate the median. To resolve differences between genomic features, sRNA-expressing windows exclude CG DMRs and those of siren loci additionally exclude repeats.Although they appear to originate from distinct noncoding transcripts, most siren loci reside within 5 kb of genes (Fig. 2 and ), with the 801 imprinted and nonimprinted siren loci overlapping the bodies of 519 genes. Of these 519 genes, we assessed parental bias at the 112 with good read depth over SNPs lacking variety-specific bias and found 40 (36%) were imprinted (22 MEGs and 18 PEGs). This is higher than expected by chance as only 10% of endosperm-expressed genes are estimated to be imprinted. Wherever informative SNPs indicated conserved imprinting of both siren loci and the genes they overlapped, we observed perfectly opposed parental biases of siren loci and overlapping genes (Fig. 2), supporting our previous results (26). Only six siren loci (3.2%) appeared to differ in parental bias among rice cultivars (Fig. 2 and ). Of these six siren loci, three overlapped genes but only two overlapped genes with detectable transcripts. Imprinting of the siren locus at MEG Os02g15350 correlated with gene imprinting while that of the siren locus at MEG Os08g28960 did not (), though it is possible that the lack of correlation at Os08g28960 is due to cultivar-specific bias or higher mismapping of 24-nt sRNA reads compared with RNA-seq reads.Overall, the correlation seen at conserved imprinted siren loci, along with the finding that biallelic siren loci tend to overlap poorly expressed genes (Fig. 2), indicates that siren locus expression associates with gene silencing. Observation of endosperm-specific RNA-directed DNA methylation (RdDM) due to siren-derived sRNA is complicated by low levels of endosperm non-CG DNA methylation relative to other tissues (55) as well as endosperm-specific maternal hypomethylation (). After attempting to resolve such complications, we find that CHG and CHH methylation vary differently across genomic features of seed tissues () but CHH methylation is significantly increased at siren loci and decreased at other endosperm 24-nt sRNA-producing loci compared with embryo (Fig. 2), suggestive of siren-mediated RdDM. Surprisingly, CHH methylation of imprinted siren loci shows an opposite direction of parent-of-origin–specific bias compared with sRNAs, with paternally biased sRNAs associated with maternal hypermethylation and maternally biased sRNAs associated with paternal hypermethylation, though data fell short of statistical significance (Fig. 2 and ).
DNA Methylation Profiles Are Highly Similar among Cultivars, Associating Imprinting with Maternal-Specific DNA Hypomethylation.
We previously obtained DNA methylation data from dissected endosperm and embryos of 7- to 8-d-old seeds of Nipponbare, Kitaake, and their reciprocal crosses (26). Using the same methods for IR64 and 93-11 rice cultivars (), we find that, like sRNA populations, DNA methylation features are largely conserved among the four rice cultivars (Fig. 3 and ). Furthermore, patterns of endosperm-specific maternal hypomethylation, indicated by differential methylation between embryo and endosperm (26), are conserved () even more so than general DNA methylation (Fig. 3 and ).
Fig. 3.
DNA methylation profiles tend to be conserved among cultivars, with DNA demethylation enriched at both maternally expressed genes and paternally expressed genes. (A) Kernel density plots quantify the difference in fractional DNA methylation between cultivars of different subspecies (Nipponbare–IR64 and Nipponbare–93-11) and cultivars of the same subspecies (IR64–93-11) across all 50-bp windows of the genome with significant CG methylation based on ≥19 cytosines. Significant CG methylation for embryo was defined as embryo fractional methylation level >0.7 for at least one cultivar in the comparison, while that for endosperm required endosperm fractional methylation >0.7 for at least one cultivar in the comparison as well as embryo fractional methylation >0.7 for both cultivars. The centering of the main peak at zero indicates that most methylated windows were methylated in both cultivars. Shoulders indicating differences between cultivars are greater in the embryo panel and are reflective of general DNA methylation while shoulders are smaller in the endosperm panel and are reflective of endosperm-specific demethylation. (B) Nipponbare embryo–endosperm DNA hypomethylation patterns of 175 cMEGs, 217 cPEGs, and 5,096 conserved biallelic genes were assigned to one of five categories: a TSS pattern, gene body pattern, promoter pattern, 3′ downstream pattern, and a pattern where none of the sequence within 1 kb of the gene is hypomethylated. For detailed category definitions, refer to .
DNA methylation profiles tend to be conserved among cultivars, with DNA demethylation enriched at both maternally expressed genes and paternally expressed genes. (A) Kernel density plots quantify the difference in fractional DNA methylation between cultivars of different subspecies (Nipponbare–IR64 and Nipponbare–93-11) and cultivars of the same subspecies (IR64–93-11) across all 50-bp windows of the genome with significant CG methylation based on ≥19 cytosines. Significant CG methylation for embryo was defined as embryo fractional methylation level >0.7 for at least one cultivar in the comparison, while that for endosperm required endosperm fractional methylation >0.7 for at least one cultivar in the comparison as well as embryo fractional methylation >0.7 for both cultivars. The centering of the main peak at zero indicates that most methylated windows were methylated in both cultivars. Shoulders indicating differences between cultivars are greater in the embryo panel and are reflective of general DNA methylation while shoulders are smaller in the endosperm panel and are reflective of endosperm-specific demethylation. (B) Nipponbare embryo–endosperm DNA hypomethylation patterns of 175 cMEGs, 217 cPEGs, and 5,096 conserved biallelic genes were assigned to one of five categories: a TSS pattern, gene body pattern, promoter pattern, 3′ downstream pattern, and a pattern where none of the sequence within 1 kb of the gene is hypomethylated. For detailed category definitions, refer to .We further analyzed our previous finding that PEGs are enriched for endosperm-specific maternal DNA hypomethylation in the gene body and promoter, while MEGs are enriched for hypomethylation in the promoter, transcription termini, and 3′ downstream region (26) (). As many endosperm transcripts differed in transcription start site (TSS) compared with reference gene annotations, we generated a curated set of endosperm-expressed transcription termini, updated using de novo assembled transcripts ( and Dataset S5). We then used Nipponbare data to categorize individual cMEGs, cPEGs, and conserved biallelic genes into five embryo–endosperm DNA hypomethylation patterns: TSS, gene body, promoter, 3′ downstream, and a final pattern where none of the sequence within 1 kb of the gene is hypomethylated (Fig. 3). Though all five patterns were observed within each subgroup, cMEGs and cPEGs show significantly greater overlap with regions of embryo–endosperm DNA hypomethylation than do biallelic genes (Fisher’s exact test, P = 10−35 and 10−30, respectively), with the TSS pattern dominant among cMEGs and the gene body pattern dominant among cPEGs (Fig. 3 and ). A role for the TSS hypomethylation pattern in endosperm-specific gene activation is supported by findings that the pattern is enriched in embryo-silenced cMEGs compared with embryo-expressed cMEGs (Fig. 4) and that, in general, expression levels are lowest for cMEGs with the TSS pattern compared with cPEGs and other cMEGs in the embryo (Fig. 4) as well as other nonendosperm tissues ().
Fig. 4.
Conserved maternally expressed genes with DNA methylation at the transcription start site are repressed in somatic tissues. (A) Curated endosperm genes were aligned at the TSS and transcription termination site (56), and the proportion of genes with a DMR present is plotted for each 50-bp interval within 3 kb of the alignment sites. Silencing in Nipponbare embryos was defined as no mapped RNA-seq reads while expression was defined as >3 mapped RNA-seq reads (0.31 reads per million reads mapped; the median embryo read count for maternally expressed genes). (B) Boxplots showing expression in Nipponbare embryos of cPEGs, cMEGs with TSS embryo–endosperm DMRs (i.e., with TSS DNA methylation in embryos), and cMEGs without TSS embryo–endosperm DMRs. (C) Boxplots showing expression in Nipponbare wild-type (WT) and OsMET1-2 mutant seedlings of cPEGs overlapped only with gene body embryo–endosperm DMRs within 1 kb of the gene, cMEGs with TSS embryo–endosperm DMRs, and cMEGs without TSS embryo–endosperm DMRs. N.S., not significant; *P < 0.05, ****P < 10−4; Mann–Whitney test.
Conserved maternally expressed genes with DNA methylation at the transcription start site are repressed in somatic tissues. (A) Curated endosperm genes were aligned at the TSS and transcription termination site (56), and the proportion of genes with a DMR present is plotted for each 50-bp interval within 3 kb of the alignment sites. Silencing in Nipponbare embryos was defined as no mapped RNA-seq reads while expression was defined as >3 mapped RNA-seq reads (0.31 reads per million reads mapped; the median embryo read count for maternally expressed genes). (B) Boxplots showing expression in Nipponbare embryos of cPEGs, cMEGs with TSS embryo–endosperm DMRs (i.e., with TSS DNA methylation in embryos), and cMEGs without TSS embryo–endosperm DMRs. (C) Boxplots showing expression in Nipponbare wild-type (WT) and OsMET1-2 mutant seedlings of cPEGs overlapped only with gene body embryo–endosperm DMRs within 1 kb of the gene, cMEGs with TSS embryo–endosperm DMRs, and cMEGs without TSS embryo–endosperm DMRs. N.S., not significant; *P < 0.05, ****P < 10−4; Mann–Whitney test.In rice, there are two orthologs of the A. thaliana CG methylation maintenance gene MET1, OsMET1-1 and OsMET1-2 (56). A previous study has shown that OsMET1-2 is the major gene responsible for the maintenance of CG methylation in rice: Loss of OsMET1-2 expression due to retrotransposon insertion into a key exon results in abnormal seed development and seedling lethality (56). To further assess the effects of CG methylation on imprinting gene expression, we compared the expression levels of cPEGs and cMEGs in wild-type and OsMET1-2 null mutant seedlings. cMEGs repressed in wild-type rice seedlings are activated in OsMET1-2 seedlings (Fig. 4 and ), with the greatest effects on those with endosperm-specific TSS hypomethylation. Though 12 of the 77 cMEGs with apparent endosperm-specific TSS hypomethylation were unexpectedly expressed (reads per kilobase of transcript per million reads mapped [RPKM] > 0.5) in wild-type seedlings (Fig. 4), 9 of them were predominantly transcribed from an alternative unmethylated TSS in seedlings () and tissue-specific TSS definition was likely imprecise at the remaining 3 cMEGs due to very low or very high expression ().A role for the gene body hypomethylation pattern in endosperm-specific gene silencing is supported by the highest embryo expression for cPEGs compared with cMEGs either with or without the TSS pattern (Fig. 4). However, though the prevalent model suggests that cPEG gene body hypomethylation recruits silencing machinery, cPEGs with endosperm-specific gene body demethylation are not suppressed in OsMET1-2 seedlings (Fig. 4). This result might reflect the specificity of PEG silencing machinery to the endosperm. The greater relevance of gene body methylation to regulation of PEGs than regulation of MEGs is highlighted by the overrepresentation of cPEGs with significant gene body methylation (the CG methylation levels of 66% of cPEGs are between 0.3 and 0.7; ) as compared with cMEGs (only 34% of cMEGs show gene body methylation between 0.3 and 0.7; ).
cis-Acting Genetic and Epigenetic Variation Is Associated with Imprinting Divergence.
We next assessed the relationship between cultivar-specific DNA sequence variation and epigenetic variation at the 19 loci with diverged imprinting by Sanger sequencing of cultivar alleles. For 17 of the 19 loci, we assembled contigs of Nipponbare, 93-11, and IR64 alleles, verifying known SNPs and identifying novel SNPs and insertion/deletion events (indels) up to 5 kb away from transcription termini (). Contigs of the remaining 2 loci were not assembled due to insurmountable difficulties in amplification and sequencing. To confirm associations of imprinting divergence with variation in maternal DNA hypomethylation, we mapped bisulfite-sequencing (BS-seq) libraries of endosperm and embryo DNA from each cultivar to the Sanger-sequenced loci and used embryo–endosperm CG hypomethylation as a proxy for maternal hypomethylation.Of the 17 loci, 9 display gene-proximal correlative endosperm DNA hypomethylation among cultivars (Fig. 5). Of the 8 loci lacking correlating methylation features (), 4 (3 PEGs and 1 MEG) possess cultivar-specific TEs in the promoter or gene body that mostly correlate with imprinted expression, except for one correlated with biallelic expression at PEG Os09g24220. The cultivar-specific TE in the gene body of PEG Os11g09329 is an intact autonomous Copia LTR retrotransposon, suggesting that its relatively recent transposition is associated with the emergence of imprinting at a previously biallelically expressed locus. It is possible that the epigenetic changes associated with these insertions act in a manner unrelated to DNA cytosine methylation; however, it is also possible that our estimation of DNA methylation over these repeats is obscured by the methylation of near-identical repeats elsewhere in the genome, or that the insertions themselves are not directly related to imprinting divergence but instead associated with more distal changes that have not yet been surveyed.
Fig. 5.
Snapshots of CG methylation in Nipponbare, 93-11, and IR64 cultivars across gene contigs showing imprinting divergence. (A) Os02g57200. (B) Os11g38990. (C) Os11g31630. (D) Os11g45295. (E) Os01g57890. (F) Os08g20500. (G) Os12g35590. (H) Os11g06650. (I) Os01g05510. Paternal expression bias of cultivar alleles is indicated by a blue circle, maternal bias by a red circle, and biallelic expression by a half-red half-blue circle. Red lines underlying endosperm tracks indicate regions of significant CG hypomethylation compared with embryo (Fisher’s exact test, P < 10−3; fractional methylation difference ≥ 0.3). Black dashed boxes indicate variation in embryo–endosperm DNA hypomethylation that is associated with imprinting divergence. Teal-colored boxes and lines highlight nearby genetic variation in A–E. Only 50-bp windows with >10 bisulfite-seq reads are shown.
Snapshots of CG methylation in Nipponbare, 93-11, and IR64 cultivars across gene contigs showing imprinting divergence. (A) Os02g57200. (B) Os11g38990. (C) Os11g31630. (D) Os11g45295. (E) Os01g57890. (F) Os08g20500. (G) Os12g35590. (H) Os11g06650. (I) Os01g05510. Paternal expression bias of cultivar alleles is indicated by a blue circle, maternal bias by a red circle, and biallelic expression by a half-red half-blue circle. Red lines underlying endosperm tracks indicate regions of significant CG hypomethylation compared with embryo (Fisher’s exact test, P < 10−3; fractional methylation difference ≥ 0.3). Black dashed boxes indicate variation in embryo–endosperm DNA hypomethylation that is associated with imprinting divergence. Teal-colored boxes and lines highlight nearby genetic variation in A–E. Only 50-bp windows with >10 bisulfite-seq reads are shown.Of the nine genes with correlative hypomethylation differences, Os02g57200 displays direct association of genetic and epigenetic variation: Paternally expressed Nipponbare and IR64 alleles possess characteristic endosperm-specific gene body hypomethylation, but this hypomethylation is disrupted in the biallelically expressed 93-11 allele by intronic insertion of a 5.6-kb Copia LTR retrotransposon (Fig. 5). We posit that this insertion prevents PcG-mediated repression of the 93-11 maternal allele, leading to biallelic expression. Similarly, paternal expression of the peptidyl-prolyl cis–trans isomerase Os11g38990 (Fig. 5) in Nipponbare correlates with promoter and gene body hypomethylation. In the 93-11 and IR64 alleles, the promoter and gene body are less hypomethylated and the gene is biallelically expressed, possibly due to the disruption of demethylation recognition sites by two small promoter indels (5 and 8 bp, indicated by the black dashed box in Fig. 5).On the other hand, maternal expression of the Os11g31630 Nipponbare allele correlates with promoter and TSS DNA methylation in the embryo and maternal-specific hypomethylation in the endosperm (Fig. 5). Lack of promoter and TSS DNA methylation in the embryo and endosperm of 93-11 and IR64 alleles associates with biallelic expression. This indica-specific epiallele may result from an upstream 300-bp DNA transposon PIF/Harbinger insertion that either forms a boundary that restricts spreading of DNA methylation from the promoter to the TSS or creates alternative transcription factor binding sites whose transcription prevents promoter and TSS DNA methylation. The TSS of the endosperm-expressed transcript variant for retrotransposon Os11g45295 (Fig. 5) also displays endosperm-specific hypomethylation at the maternally expressed Nipponbare allele and lack of methylation at biallelically expressed 93-11 and IR64 alleles, with the epiallele associating with an adjacent small indel.Imprinting at the homeodomain-containing protein Os01g57890 appears to effect maternal expression of the Nipponbare allele and paternal expression of IR64 and 93-11 indica alleles (). DNA methylation variation between Nipponbare and the indica cultivars spans the promoter, TSS, and part of the gene body, such that the Nipponbare allele is methylated and the 93-11 and IR64 alleles are unmethylated in embryos (Fig. 5). Nipponbare-specific embryo methylation correlates with an ∼250-bp LINE L1 element insertion. Based on similar methylation profiles of indica cultivars and wild rice species (Fig. 6), Os01g57890 was likely silenced by the PcG complex in the last common ancestor of the cultivars, with promoter methylation of the paternal allele protecting it from silencing. In Nipponbare, the silencing effect of TSS methylation caused by the LINE L1 insertion appears to be epistatic to the effect of promoter methylation in preventing PcG-mediated silencing, so the overall effect of the insertion-associated methylation is silencing of the paternal Nipponbare allele. Additionally, the Nipponbare LINE L1 insertion may erase genetic or epigenetic signals for PcG recruitment to unmethylated alleles and result in activation of the maternal Nipponbare allele by hypomethylation.
Fig. 6.
Snapshots of CG methylation across japonica, indica, and wild rice reveal that epialleles associated with imprinting divergence are stable within lineages. Red lines underlying endosperm tracks indicate regions of significant CG hypomethylation compared with embryo (Fisher’s exact test, P < 10−3; fractional methylation difference ≥ 0.3). Black dashed boxes indicate epialleles associated with imprinting divergence. Tracks consisting of dashed lines represent cultivars without informative read mapping. Only 50-bp windows with >10 bisulfite-seq reads are shown.
Snapshots of CG methylation across japonica, indica, and wild rice reveal that epialleles associated with imprinting divergence are stable within lineages. Red lines underlying endosperm tracks indicate regions of significant CG hypomethylation compared with embryo (Fisher’s exact test, P < 10−3; fractional methylation difference ≥ 0.3). Black dashed boxes indicate epialleles associated with imprinting divergence. Tracks consisting of dashed lines represent cultivars without informative read mapping. Only 50-bp windows with >10 bisulfite-seq reads are shown.Os08g20500, Os12g35590, Os11g06650, and Os01g05510 are the four genes where DNA hypomethylation variation in critical regulatory regions for imprinting was observed without significant indels (<20 bp; Fig. 5 ). Os08g20500 (Fig. 5), Os12g35590 (Fig. 5), and Os11g06650 (Fig. 5) are maternally expressed in 93-11 and IR64, and biallelically expressed in Nipponbare. Their imprinting divergence may be explained by epialleles at the TSS (Os08g20500 and Os12g35590) or promoter (Os11g06650), where maternally expressed alleles possess embryo CG methylation and low but significantly higher non-CG methylation () while biallelically expressed alleles are unmethylated. Although no large indel events were found at these loci, the density of SNPs between Nipponbare and indica alleles at these loci (∼1%) is higher than the genome average (∼0.45%). For Os01g05510 (Fig. 5), paternal bias of Nipponbare and IR64 alleles correlates with endosperm-specific hypomethylation in the middle of the gene body. The 93-11 allele possesses additional DNA methylation in this region that is not targeted for hypomethylation and may prevent PcG targeting despite hypomethylation at flanking sites. SNPs within this region only exist between Nipponbare and indica cultivars, so the epiallele between 93-11 and IR64 cannot be explained by sequence variation.
Epialleles Associated with Imprinting Divergence Are Usually Stably Inherited.
As DNA methylation is known to be faithfully transmitted through generations with a low rate of epimutation, we examined the stability of the epialleles associated with imprinting divergence using publicly available rice BS-seq data from different cultivars of japonica and indica rice as well as wild rice species (57–59). In line with expectations, we found that cultivars within the same grouping (i.e., japonica, indica, or wild rice) mostly had very similar epialleles at six of the seven assessable loci, with two major types of epialleles identified for each locus (Fig. 6). Notable exceptions included the saline-tolerant indica cultivar Pokkali possessing a japonica-like unmethylated TSS for retrotransposon genes Os08g20500 and Os11g45295, while the TSS of homeobox domain–containing protein gene Os01g57890 was only methylated in a subset of japonica cultivars composed of Nipponbare and Kitaake. For the seventh locus, Os01g05510, we found three epialleles: the unmethylated state of Nipponbare, IR64, and Nagina22, the highly methylated state of Dianjingyou1, TNG67, Oryza nivara, and 93-11, and the intermediately methylated state of Oryza rufipogon and Pokkali. The lack of association between epialleles and subspecies groups may result from either occasional reversion events or introgressions during the long breeding histories of cultivars.
Discussion
Though rapid genome-wide detection of plant imprinted genes has demonstrated that gene imprinting evolves as cultivars and species diverge, little is known about the forces driving imprinting divergence. Using cultivars from both indica and japonica subspecies that carry importance in the rice community (37, 60, 61), we show that the extent of imprinting divergence among rice cultivars is similar to that among maize cultivars (54) and A. thaliana ecotypes (7), providing an opportunity to dissect mechanisms of imprinting divergence. At PEG Os11g09329, we identify the possible birth of a gene imprint due to a relatively recent retrotransposition event. We did not observe associated changes in parent-of-origin–specific DNA methylation at the locus, either because the imprint is formed at a distal location or by other chromatin modifications, or because analysis of DNA methylation at such TEs is obscured by near-identical sequences elsewhere in the genome. We also found imprinting divergence without local differences in parent-of-origin–specific DNA methylation to be associated with the presence of specific DNA transposons at Os09g24220 and Os05g47870. Here, the cut-and-paste mechanism of DNA transposons makes it difficult to distinguish between imprint birth and death. At other loci, genetic changes more often caused imprint death through reduced parent-of-origin–specific differences in DNA methylation, either by apparent deletion as seen in Os11g45295 and Os01g69910 or insertion as in Os02g57200, Os11g38990, and Os11g31630. Where TE insertion correlated with imprinting divergence, TE class or family did not predict imprinting state, suggesting that the surrounding chromatin environment plays a role in defining the targets of imprinting mechanisms. Epiallelic variation without genetic divergence is also a key contributor to imprint evolution, observed at four of nine genes where DNA methylation differences correlate with imprinting. We found that such DNA methylation epialleles are stably transmitted even as lineages diverge, producing consistent effects on gene expression as measured by imprinted expression.These results reinforce models developed by identifying dominant DNA methylation and H3K27me3 histone methylation patterns at rice MEGs and PEGs, where DNA methylation is a causal imprint (25, 26, 43). Our finding that the OsMET1-2 mutation derepresses MEGs in nonendosperm tissues such as rice seedlings is consistent with experiments using the DNA methylation inhibitor 5-aza-2′-deoxycytidine (43). The case study of Os01g57890, which is maternally expressed from the Nipponbare allele and paternally expressed from indica alleles, suggests that silencing by TSS methylation is epistatic to other regulatory mechanisms such as PcG-mediated silencing and, as such, is a stronger predictor of gene silencing. Data from Os01g57890 and Os11g45295 also support the previous finding that endosperm-specific TSS hypomethylation mediates switching between alternative TSSs (25). Our finding that the OsMET1-2 mutation did not silence PEGs suggests either that DNA methylation alone does not form a causal imprint for PEGs or that PEG regulation in OsMET1-2 mutants is affected by a redistribution of other chromatin marks during genome-wide suppression of DNA methylation maintenance, as seen in A. thaliana (47). Unlike a recent study in A. thaliana (62), we did not find rice PEGs enriched for maternal-specific CHG methylation in the endosperm but it is possible that such CHG methylation and associated PcG-mediated marks serve as a transitory imprint in the central cell.Another point of difference with A. thaliana is that rice endosperm siren loci appear to differ in genomic distribution from similar “siren-like” loci that dominate ovary and egg sRNA populations (63). Siren sRNAs in Brassica rapa and A. thaliana are expressed via the canonical RNA POLYMERASE IV (Pol IV)–RNA-DEPENDENT RNA POLYMERASE 2 (RDR2) pathway from the same loci in maternal and seed tissues, relatively highly in ovules and seed coat, moderately in the endosperm, and lowly in embryos and mature seeds (64). We show that rice endosperm siren sRNAs, like those of Brassicaceous ovules (64, 65) but unlike those of rice eggs and ovary (66), are associated with greater CHH methylation compared with the relatively lowly expressed nonsiren 24-nt sRNAs, suggestive of siren-mediated RdDM. However, our finding that imprinted siren sRNAs mediate as much or more trans RdDM than cis RdDM, despite frequent imprinted gene expression from the trans allele of genes which overlap siren loci, suggests either that siren-mediated RdDM is ineffective at gene silencing in rice endosperm or that imprinted genes overlapped by siren loci progress toward silencing of both alleles during rice endosperm development. It is unclear whether Brassicaceous endosperm siren sRNAs also associate with RdDM but they and other endosperm 24-nt sRNAs are not required for proper seed development in A. thaliana (64, 67). In B. rapa, siren and/or other 24-nt sRNAs are essential for seed development (64) but endosperm effects are yet to be dissociated from those in maternal reproductive tissues.Although imprinted sRNAs associate with parent-of-origin–specific gene silencing in the endosperm across A. thaliana (7, 51), maize (52), and rice (12, 26, 43), we find that siren locus expression is not a strong driver of rice gene imprinting divergence. Our observation that rice endosperm sRNAs are imprinted more often than coding transcripts suggests that epigenetically regulated DNA is imprinted more often than can be estimated based on coding regions. This concurs with indications that epigenetic regulation is essential for balancing parental genome dosage in the A. thaliana endosperm (68). Unlike in A. thaliana (51, 64), sRNAs associated with imprinted rice genes are more likely to be paternally biased than maternally biased. This may be due to general differences in epigenetic regulation between A. thaliana and rice (55) such that some imprinted rice genes may behave more like A. thaliana TEs, which show an overall paternal bias in associated sRNA (51). This may also be due to the overlap of endosperm and ovule siren loci in A. thaliana (64) but not rice (66).Further work is necessary to establish whether PcG-mediated histone methylation acts as a causal imprint in rice, as it appears to at A. thaliana PEGs (62, 69). PcG activity may also explain imprinting of MEGs that do not display parent-of-origin–specific DNA methylation patterns (49). We expect technical advances in the detection of plant distal regulatory elements (70, 71) will help resolve the origin of imprinting at some loci for which the mechanism of imprinting is yet unknown.
Materials and Methods
See for full methods.
Next-Generation Sequencing Data.
Strand-specific RNA-sequencing libraries were constructed as previously described (72), using ribosomal RNA-depleted total RNA as input and duplex-specific nuclease treatment of the final library (73). Bisulfite- and sRNA-sequencing libraries were constructed as previously described (26). Using the Illumina HiSeq 2000 platform, RNA-seq and BS-seq libraries were sequenced as 100-bp single-end reads and sRNA-seq libraries as 50-bp single-end reads. We used previously published DNA methylation data for japonica cultivars TNG67 (58) and Dianjingyou1 (59), wild rice species O. rufipogon (59) and O. nivara (59), and indica cultivars Nagina22 (57) and Pokkali (57).
Data Processing.
Resequencing identified SNPs between IR64 and Nipponbare (Dataset S1), while previously published lists were used for SNPs between Kitaake and Nipponbare (26) and between 93-11 and Nipponbare (74, 75). We used our previous methods for read trimming (26), allele-specific read mapping (5, 26), calculation of fractional DNA methylation (36), and definition of differentially methylated regions (DMRs) (Dataset S6) and sRNA-producing loci (26). Scaled Venn diagrams were generated using eulerAPE v3 (76). Kernel density plots were made using 50-bp windows of the genome.
Analysis of Imprinted Expression in Endosperm.
Imprinted genes identified based on deviation of RNA-seq read counts from expected biallelic ratios (Fisher’s exact test, P < 0.01; Fig. 1) were defined as conserved or diverged among cultivars after resolving for variety-specific biases, read complexity, mismapping, erroneous SNPs, alternative transcript variants, developmental resetting, and potential contamination with maternal tissue (). sRNA-producing siren loci were also classified as imprinted (maternal fraction >0.77 or <0.5 in both reciprocal crosses; Fisher’s exact test, P < 0.001), biallelic (maternal fraction >0.617 and <0.717 in reciprocal crosses), or unclear, and assessed for imprinting conservation or divergence (). Functional GO term enrichment of imprinted genes used the agriGO toolkit (53).
Sanger Sequencing and Assembly of Cultivar Alleles.
Primers in amplified ∼1.7-kb overlapping fragments of the gene body, plus flanking regions either up to 5 kb away or until neighboring transcription termini. PCRs used Phusion High-Fidelity DNA Polymerase (NEB; M0530), Q5 High-Fidelity DNA Polymerase (NEB; M0491), or LongAmp Taq DNA Polymerase (NEB; M0323) with melting temperature between 68 and 78 °C and optional addition of 1 or 2 M betaine. PCR amplicons purified by SPRI beads or the CloneJET PCR Cloning Kit (Fisher Scientific; K1232) were Sanger-sequenced with forward and reverse primers either from the amplicon or pJET1.2 and assembled into contigs (Dataset S4) using the CodonCode Aligner (https://www.codoncode.com/).
Authors: Yeonhee Choi; Mary Gehring; Lianna Johnson; Mike Hannon; John J Harada; Robert B Goldberg; Steven E Jacobsen; Robert L Fischer Journal: Cell Date: 2002-07-12 Impact factor: 41.582
Authors: Ming Luo; Jennifer M Taylor; Andrew Spriggs; Hongyu Zhang; Xianjun Wu; Scott Russell; Mohan Singh; Anna Koltunow Journal: PLoS Genet Date: 2011-06-23 Impact factor: 5.917
Authors: Chenxin Li; Hengping Xu; Fang-Fang Fu; Scott D Russell; Venkatesan Sundaresan; Jonathan I Gent Journal: Genome Res Date: 2020-01-02 Impact factor: 9.043
Authors: Jonathan I Gent; Kaitlin M Higgins; Kyle W Swentowsky; Fang-Fang Fu; Yibing Zeng; Dong Won Kim; R Kelly Dawe; Nathan M Springer; Sarah N Anderson Journal: Plant Cell Date: 2022-09-27 Impact factor: 12.085