
Locus-specific control of the de novo DNA methylation pathway in Arabidopsis by the CLASSY family.

Ming Zhou¹, Ana Marie S Palanca¹, Julie A Law².

Abstract

DNA methylation is essential for gene regulation, transposon silencing and imprinting. Although the generation of specific DNA methylation patterns is critical for these processes, how methylation is regulated at individual loci remains unclear. Here we show that a family of four putative chromatin remodeling factors, CLASSY (CLSY) 1-4, are required for both locus-specific and global regulation of DNA methylation in Arabidopsis thaliana. Mechanistically, these factors act in connection with RNA polymerase-IV (Pol-IV) to control the production of 24-nucleotide small interfering RNAs (24nt-siRNAs), which guide DNA methylation. Individually, the CLSYs regulate Pol-IV-chromatin association and 24nt-siRNA production at thousands of distinct loci, and together, they regulate essentially all 24nt-siRNAs. Depending on the CLSYs involved, this regulation relies on different repressive chromatin modifications to facilitate locus-specific control of DNA methylation. Given the conservation between methylation systems in plants and mammals, analogous pathways may operate in a broad range of organisms.

Year: 2018 PMID: 29736015 PMCID: PMC6317521 DOI: 10.1038/s41588-018-0115-y

Source DB: PubMed Journal: Nat Genet ISSN: 1061-4036 Impact factor: 38.330

Introduction

The use of small non-coding RNAs to silence transposons and other foreign genetic elements via the deposition of repressive chromatin modifications is a highly conserved strategy employed by eukaryotic organisms to ensure genome stability[1,2]. Unlike in animals and fungi, where the biogenesis of these non-coding RNAs is initiated by Pol-II, in plants they are generated by two plant-specific RNA polymerases, Pol-IV and Pol-V. These polymerases evolved from Pol-II[3,4] and play central roles in the RNA-directed DNA Methylation (RdDM) pathway[5,6]. Briefly, Pol-IV generates short single-stranded RNAs[7,8] that are copied into double-stranded RNAs by RNA-DEPENDENT RNA POLYMERASE 2 (RDR2) and cleaved into 24nt-siRNAs by DICER-LIKE PROTEIN 3 (DCL3)[9]. These 24nt-siRNAs are then loaded into ARGONAUTE (AGO) effector complexes, including AGO4, AGO6 and AGO9[10]. Pol-V generates longer non-coding transcripts[11] that serve as scaffolds for the recruitment of additional RdDM factors including 24nt-siRNA-loaded ARGONAUTE proteins[12]–[14]. Ultimately, these interactions lead to the recruitment of DOMAINS REARRANGED METHYLTRANSFERASE 2 (DRM2)[15,16] and the deposition of DNA methylation throughout the genome. Once established, maintenance pathways take over to ensure the faithful inheritance of DNA methylation patterns[5]. Despite the existence of robust maintenance pathways, DNA methylation patterns are not static, and can differ between cell types[17]–[22], tissues[23]–[26], and even generations, depending on the organism[27]. The processes through which such differences in DNA methylation profiles arise, or are modulated during development, remain poorly understood. Yet, they are clearly important, as aberrant patterns of DNA methylation can result in developmental defects in plants[28,29] and are associated with numerous diseases in humans, including cancer[30,31]. 
To gain insight into the regulation of DNA methylation patterns, we investigated the functions of four SNF2-related, putative chromatin remodeling factors, CLSY1–4, in connection with the Pol-IV and SAWADEE HOMEODOMAIN HOMOLOG1 (SHH1)[32]–[35] components of the RdDM pathway. CLSY1, the founding member of the CLSY family, was initially identified from a genetic screen for the spreading of gene silencing and was linked to Pol-IV function based on reduced 24nt-siRNA levels at several genomic loci and immunolocalization experiments[36]. Consistent with these observations, CLSY1 was subsequently found to co-purify with Pol-IV[33,35] and SHH1[33], to facilitate de novo DNA methylation[37], and to play a weak role in controlling DNA methylation at RdDM targets[38]. However, the global effects of clsy1 mutants on 24nt-siRNA levels, the functional connections between CLSY1, SHH1 and Pol-IV, and an in-depth analysis of the effects of clsy1 mutants on DNA methylation patterns and gene silencing remain to be determined. Furthermore, the roles of CLSY2, CLSY3 and CLSY4, which also co-purify with Pol-IV, remain completely unknown.

Results

The CLSY family controls 24nt-siRNA levels in a locus-specific manner

To examine the roles of the CLSY family in the RdDM pathway, T-DNA insertion mutants for each CLSY gene were obtained. Gene expression profiling in these mutants confirmed disruption of the corresponding transcripts and demonstrated that there are no obvious compensatory gene expression effects between family members (Supplementary Fig. 1a and Supplementary Table 1). The effects of these mutants on 24nt-siRNAs were then determined by small RNA profiling (Supplementary Table 2) and compared to a Pol-IV mutant (nrpd1, hereafter termed pol-iv) as well as three wild-type controls. After determining loci that produce small RNAs based on both unique- and multi-mapping reads (Supplementary Fig. 1b and Supplementary Table 3), a core set of 13,253 24nt-siRNA clusters was identified using ShortStack[39] (Supplementary Tables 3 and 4a). These core clusters were detected in all three wild-type replicates and account for more than 92% of the mapped 24nt-siRNAs in each experiment (Supplementary Fig. 1c). As expected based on previous studies[40,41], the expression of these 24nt-siRNA clusters is highly dependent on Pol-IV (Supplementary Fig. 1d, e). In each clsy mutant, largely non-overlapping subsets of reduced 24nt-siRNA clusters were identified using DESeq2[42] (fold change (FC)≥2 and false discovery rate (FDR)≤0.01; Fig. 1a, Supplementary Fig. 1f, and Supplementary Table 4). The clsy1 mutant affected the most 24nt-siRNA clusters, while clsy3 and clsy4 displayed an intermediate effect, and clsy2 only affected a small number of loci (Fig. 1a). Quantification of 24nt-siRNA levels over these reduced 24nt-siRNA clusters revealed strong decreases that are specific to each mutant and approach the levels observed in pol-iv (Fig. 1b and Supplementary Fig. 1g). Further attesting to the robustness of these phenotypes, similar results were observed using only uniquely mapping reads (Supplementary Fig. 1h) or using data from an independent, biological replicate (Supplementary Fig. 1i). In addition to depending on different CLSY family members, these four groups of 24nt-siRNA clusters also differ in their wild-type expression levels (Fig. 1b and Supplementary Fig. 1g) as well as their size (Supplementary Fig. 1j), which may contribute to their differential regulation. In total, the clsy-dependent 24nt-siRNA clusters identified here represent approximately 25% of the 24nt-siRNA producing loci genome-wide (Fig. 1a), which account for 62.7% of all the 24nt-siRNAs present in wild-type plants (Supplementary Fig. 1k). Similar differential expression analyses for 21nt- and 22nt-siRNA clusters, which include miRNAs, revealed essentially no down-regulated clusters (Supplementary Table 5). Taken together, these findings demonstrate that the CLSY proteins act as potent, locus-specific regulators of 24nt-siRNA expression.
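
The cutoff used to call reduced clusters (FC≥2 and FDR≤0.01) can be sketched as a simple filter over a differential-expression table. This is a minimal illustration rather than the authors' pipeline: it assumes the table carries DESeq2's `log2FoldChange` and `padj` columns, and the cluster IDs, values, and the `reduced_clusters` helper are hypothetical.

```python
import math

# Hypothetical DESeq2-style results for a mutant vs. wild-type comparison.
# Cluster IDs and numbers are invented for illustration only.
results = [
    {"cluster": "c1", "log2FoldChange": -3.2, "padj": 1e-8},
    {"cluster": "c2", "log2FoldChange": -0.4, "padj": 0.30},
    {"cluster": "c3", "log2FoldChange": -1.0, "padj": 0.01},
    {"cluster": "c4", "log2FoldChange": 2.1, "padj": 1e-5},
    {"cluster": "c5", "log2FoldChange": -2.5, "padj": None},  # no adjusted p-value
]

def reduced_clusters(rows, min_fold=2.0, max_fdr=0.01):
    """Return cluster IDs reduced at least `min_fold`-fold at FDR <= `max_fdr`."""
    log2_cutoff = -math.log2(min_fold)  # a 2-fold reduction is log2FC <= -1
    return [
        r["cluster"]
        for r in rows
        if r["padj"] is not None
        and r["padj"] <= max_fdr
        and r["log2FoldChange"] <= log2_cutoff
    ]

print(reduced_clusters(results))  # ['c1', 'c3']
```

Rows without an adjusted p-value are skipped here; how such rows were handled in the actual analysis is not stated in the text.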

Figure 1.

The CLSY family controls 24nt-siRNA levels in a locus-specific manner.

(a) Scaled Venn diagram based on the reduced 24nt-siRNA clusters provided in Supplementary Table 4 showing the relationships between loci with reduced 24nt-siRNA levels in the clsy single mutants. For readability, only overlaps >20 are labeled. A small number of overlaps between clsy2 and clsy3 are not shown due to spatial constraints, but an unscaled Venn diagram showing all the overlaps is present in Supplementary Figure 1f. (b) Boxplots showing 24nt-siRNA levels (reads per kilobase per million; rpkm) in each clsy single mutant compared to each other, wild-type (WT) controls, and pol-iv. Here, and in all subsequent figures, the boxplots show the interquartile range (IQR) with the median shown as the black line and the whiskers corresponding to 1.5 times the IQR. Above each plot, the numbers of clusters (n) are indicated and biological replicates for the WT controls are designated as WT_1, WT_2, and WT_3, with the average signal from these replicates designated as the WT_avg. These boxplots represent a single experiment, but confirmatory data from an independent biological replicate and from additional alleles are presented in Supplementary Figs. 1i and 3, respectively. Below each boxplot are genome browser screen shots showing the levels of 24nt-siRNAs (reads per 10 million; rp10m) at representative clsy-dependent 24nt-siRNA clusters. The scale for each panel is indicated in brackets, where k indicates 1000.
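
The two normalizations named in this legend, rpkm and rp10m, can be written out as follows. This is a sketch under assumptions: the function names are hypothetical, and which reads count toward the "total mapped" denominator is not specified here, so the totals below are placeholders.

```python
def rpkm(read_count, cluster_length_bp, total_mapped_reads):
    """Reads per kilobase of cluster per million mapped reads."""
    return read_count / (cluster_length_bp / 1_000) / (total_mapped_reads / 1_000_000)

def rp10m(read_count, total_mapped_reads):
    """Reads per 10 million mapped reads (the browser-track scale)."""
    return read_count * 10_000_000 / total_mapped_reads

# Hypothetical example: 500 reads on a 2,000 bp cluster, 20 million mapped reads.
print(rpkm(500, 2_000, 20_000_000))  # 500 / 2 kb / 20 M = 12.5
print(rp10m(500, 20_000_000))        # 500 * 10 M / 20 M = 250.0
```

Dividing by cluster length matters here because the clsy-dependent cluster classes differ in size, so raw read counts would not be comparable between them.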

To determine whether the 24nt-siRNA clusters regulated by the clsy single mutants represent the totality of loci controlled by these factors, all 6 combinations of clsy double mutants were generated and their small RNA profiles and reduced 24nt-siRNA clusters were determined (Supplementary Table 4, Fig. 2a, b and Supplementary Fig. 2a, b). This revealed two double mutants (clsy1,2 and clsy3,4) that showed clear synergistic relationships, affecting more loci (Fig. 2a) and displaying stronger reductions in 24nt-siRNA levels relative to their respective single mutants (Fig. 2b). Notably, these findings are consistent with previous phylogenetic analyses, as CLSY1 and CLSY2 form one subgroup while CLSY3 and CLSY4 form another[36]. As observed for 24nt-siRNA clusters dependent on individual CLSY proteins, the reductions in 24nt-siRNAs observed at the clsy1,2- and clsy3,4-dependent clusters were largely specific to the corresponding mutants (Fig. 2b and Supplementary Fig. 2c). In total, these clsy doubles control 67% of all 24nt-siRNA clusters (Fig. 2c), which equates to 88% of all 24nt-siRNAs present in wild-type plants (Supplementary Fig. 2d), revealing a second layer of locus-specific regulation that relies on distinct pairs of CLSY proteins.

Figure 2.

Specific CLSY pairs regulate 24nt-siRNAs at non-overlapping and spatially distinct genomic loci.

(a, c, and e) Scaled Venn diagrams showing the relationships between loci with reduced 24nt-siRNA levels in the indicated clsy single, double, and quadruple mutants. For readability, only overlaps >20 are labeled except for panel e where the % overlap between both samples is shown instead. (b and f) Boxplots showing 24nt-siRNA levels in each clsy single, double or quadruple mutant compared to each other, WT controls and pol-iv, from a single experiment. Confirmatory data using additional alleles are presented in Supplementary Fig. 3. (d) Chromosome 1 view of 24nt-siRNA clusters dependent on the genotypes indicated on the left, where the scale is the number of clusters per 100kb bin. The red region corresponds to pericentromeric DNA[56]. The pie charts represent the genome wide (i.e. Chr1–5) distributions. Chromosomal views for Chr2–5 are present in Supplementary Figure 2e.
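
The per-bin counts plotted in panel d (clusters per 100kb bin) can be sketched as below. Assigning each cluster to a bin by its midpoint is an assumption made for this illustration, and the coordinates and the `clusters_per_bin` helper are hypothetical.

```python
from collections import Counter

BIN_SIZE = 100_000  # 100 kb bins, as in the chromosome views

def clusters_per_bin(clusters, bin_size=BIN_SIZE):
    """Count clusters per (chromosome, bin index), assigning each by midpoint."""
    counts = Counter()
    for chrom, start, end in clusters:
        midpoint = (start + end) // 2
        counts[(chrom, midpoint // bin_size)] += 1
    return counts

# Hypothetical cluster coordinates: (chromosome, start, end).
clusters = [
    ("Chr1", 10_000, 10_400),
    ("Chr1", 95_000, 99_000),
    ("Chr1", 150_000, 150_900),
    ("Chr2", 5_000, 5_300),
]
counts = clusters_per_bin(clusters)
print(counts[("Chr1", 0)], counts[("Chr1", 1)], counts[("Chr2", 0)])  # 2 1 1
```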

To further examine the relationship between the clsy1,2- and clsy3,4-dependent 24nt-siRNA clusters, their overlap with each other and their genomic distributions were determined. Not only do these CLSY pairs regulate mutually exclusive sets of 24nt-siRNA clusters (Fig. 2c), they also show preferential enrichment for chromosome arms (clsy1,2-dependent clusters) or pericentromeric heterochromatin (clsy3,4-dependent clusters), revealing a striking division of labor amongst the CLSY family (Fig. 2d and Supplementary Fig. 2e). Notably, the remaining pol-iv-dependent 24nt-siRNA clusters, which were not significantly affected in either double mutant, show an even more extreme partitioning within the genome, with 78% residing in pericentromeric heterochromatin (Fig. 2d and Supplementary Fig. 2e). These clusters are lowly expressed (Supplementary Fig. 2c, d) and, like the clsy3,4-dependent 24nt-siRNA clusters, they tend to be larger in size (Supplementary Fig. 2f). To determine whether these remaining loci are redundantly controlled by all four CLSY proteins, a clsy quadruple mutant was generated. In this mutant, greater than 98% of all pol-iv-dependent 24nt-siRNA clusters were reduced (Fig. 2e) and the levels of 24nt-siRNAs at these clusters were near zero (Fig. 2f). Finally, the effects and locus-specificities of the clsy single, double and quadruple mutants on 24nt-siRNA levels were confirmed with additional mutant alleles for all four CLSY genes (Supplementary Fig. 3). Together, these findings demonstrate that the four CLSY proteins act individually as highly locus-specific regulators of 24nt-siRNAs and together as the master regulators of essentially all Pol-IV-dependent 24nt-siRNAs.

The CLSY family controls global DNA methylation patterns

To assess the effects of the clsy-dependent 24nt-siRNA losses on DNA methylation patterns, whole genome bisulfite sequencing experiments were conducted (Supplementary Table 6). In Arabidopsis, the patterns of DNA methylation can be broadly classified into two categories[43,44]: Methylation at transposons and repeats, which is established via the RdDM pathway and occurs in all sequence contexts (CG, CHG, and CHH, where H=A, T, or C), and gene body methylation, which is restricted to the CG context and is established via mechanisms that remain poorly understood[45]. Thus, to best evaluate the roles of the clsy mutants, differentially methylated regions (DMRs) for each genotype were determined independently for the CG, CHG, and CHH contexts (FC≥40%, 20%, or 10% for CG, CHG, and CHH DMRs, respectively, relative to three wild-type controls with an FDR≤0.01; Fig. 3a and Supplementary Table 7). Consistent with roles for the CLSY family in RdDM, this analysis revealed a high degree of overlap between hypo DMRs and reduced 24nt-siRNA clusters, especially for non-CG DMRs in the clsy double and quadruple mutants (Fig. 3a). Furthermore, even at DMRs that failed to overlap with reduced 24nt-siRNA clusters, 24nt-siRNA levels were still decreased (Supplementary Fig. 4). Thus, at non-CG DMRs, reduced DNA methylation is highly correlated with 24nt-siRNA losses. In contrast, a similar analysis at CG DMRs showed minimal overlap with reduced 24nt-siRNA clusters in the clsy mutants (Fig. 3a) and revealed that the vast majority of these regions have little to no 24nt-siRNAs (Supplementary Fig. 4), suggesting they likely represent natural variation in methylation at body-methylated genes rather than defects in targeting methylation at RdDM loci. Nonetheless, the small subset of CG DMRs that do overlap with reduced 24nt-siRNA clusters (Supplementary Fig. 4a) showed a clear reduction in 24nt-siRNAs, nearly phenocopying pol-iv mutants. 
Together, these comparisons reveal the subset of loci where reductions in 24nt-siRNA levels result in the most significant changes in DNA methylation for each sequence context.
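
The context-specific DMR cutoffs (40%, 20%, and 10% for CG, CHG, and CHH, with FDR≤0.01) can be expressed as a small decision rule. This sketch reads the cutoffs as absolute differences in fractional methylation relative to wild type, which is an assumption, and it takes the FDR as a given input rather than computing it; the `is_hypo_dmr` helper and the example values are hypothetical.

```python
# Minimum methylation loss per context, read here as an absolute difference
# in fractional methylation (an assumption about how the cutoffs are applied).
MIN_LOSS = {"CG": 0.40, "CHG": 0.20, "CHH": 0.10}

def is_hypo_dmr(context, wt_level, mutant_level, fdr, max_fdr=0.01):
    """True if a region loses enough methylation in `context` at FDR <= max_fdr."""
    loss = wt_level - mutant_level
    return fdr <= max_fdr and loss >= MIN_LOSS[context]

# Hypothetical regions: (context, wild-type level, mutant level, FDR).
print(is_hypo_dmr("CHH", 0.30, 0.05, 1e-4))  # True: loses 25 points, cutoff is 10
print(is_hypo_dmr("CG", 0.90, 0.60, 1e-4))   # False: loses 30 points, cutoff is 40
print(is_hypo_dmr("CHH", 0.30, 0.05, 0.05))  # False: fails the FDR cutoff
```

The looser cutoff for CHH reflects the much lower baseline methylation in that context, so the same absolute loss that is negligible at a CG site can represent most of the signal at a CHH site.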

Figure 3.

24nt-siRNA losses in clsy mutants result in reduced DNA methylation.

(a) Table showing the numbers of hypo DMRs in the genotypes and methylation contexts indicated, where H=A, T, or C. The number of these DMRs that overlap (∩) with reduced 24nt-siRNA clusters (“DMR ∩ ↓ 24nt-siRNA clusters”) is also indicated and shaded from light blue to red based on the percentage of total DMRs represented. (b-d) Scaled Venn diagrams of hypo CHH DMRs showing the relationships between loci regulated by the clsy single, double, and quadruple mutants, respectively. For readability, only overlaps >20 are labeled except for panel d where the % overlap is shown instead. For panel b, a small number of overlaps are not shown due to spatial constraints, but an unscaled Venn diagram showing all the overlaps is present in Supplementary Figure 5a. (e) Boxplots showing the levels of CHH methylation at the hypo CHH DMRs identified in each clsy single, double or quadruple mutant as compared to each other, WT controls, and pol-iv. These boxplots represent a single experiment including three independent WT controls.

As expected based on the presence of pathways controlling the maintenance of DNA methylation in the CG and CHG contexts[5], the largest effects on DNA methylation observed in the RdDM mutants were in the CHH context. Consistent with their 24nt-siRNA phenotypes, each clsy single mutant affected DNA methylation at largely distinct sets of DMRs. Once again, clsy1 was the strongest with 1,238 CHH DMRs, clsy3 and clsy4 had 338 and 161, respectively, and clsy2 was the weakest with just 74 (Fig. 3a, b and Supplementary Fig. 5a). Further paralleling the effects observed for 24nt-siRNAs, the clsy double mutants showed additive effects at mutually exclusive sets of CHH DMRs (Fig. 3a, c) and the quadruple mutant showed the strongest effect, overlapping with >90% of the CHH DMRs identified in pol-iv (Fig. 3a, d). Quantification of DNA methylation levels at all the non-CG DMRs (Fig. 3e and Supplementary Fig. 5b), as well as the CG DMRs overlapping with reduced 24nt-siRNA clusters (Supplementary Fig. 5c), revealed the strongest reductions in DNA methylation levels in the corresponding mutant backgrounds. In addition, quantification of DNA methylation levels at all the reduced 24nt-siRNA clusters, not just those corresponding to DMRs, revealed similar trends: CG methylation levels were minimally affected, while stronger reductions were observed in the non-CG contexts in a genotype-specific manner (Supplementary Fig. 5d). Together, these findings demonstrate that the locus-specific reductions in 24nt-siRNA levels observed in the clsy single, double and quadruple mutants result in locus-specific decreases in DNA methylation.
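
The overlap percentages reported between DMRs and reduced 24nt-siRNA clusters, or between two sets of DMRs, amount to asking what fraction of one set of genomic intervals touches any interval in another set. The following is a minimal sketch with hypothetical coordinates and helper names; it requires at least 1 bp of overlap, which is an assumption about the criterion used.

```python
def overlaps(a, b):
    """True if two (chrom, start, end) half-open intervals share at least 1 bp."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]

def fraction_overlapping(query, reference):
    """Fraction of query regions that overlap any reference region."""
    if not query:
        return 0.0
    hits = sum(any(overlaps(q, r) for r in reference) for q in query)
    return hits / len(query)

# Hypothetical hypo DMRs and reduced 24nt-siRNA clusters.
dmrs = [("Chr1", 100, 200), ("Chr1", 500, 600), ("Chr2", 100, 200), ("Chr3", 0, 50)]
reduced = [("Chr1", 150, 400), ("Chr2", 190, 300), ("Chr1", 600, 700)]

print(fraction_overlapping(dmrs, reduced))  # 0.5
```

The nested scan is quadratic and only suitable for small examples; genome-scale comparisons would normally use sorted intervals or a dedicated interval tool.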

Figure 5.

The CLSY proteins are required for Pol-IV chromatin association at 24nt-siRNA producing loci.

(a and b) Profile plots showing Pol-IV enrichment at all the different classes of clsy-dependent 24nt-siRNA clusters in a WT background (the pNRPD1::NRPD1–3xFLAG line) or the indicated clsy mutant backgrounds, respectively, from two sets of ChIP-seq data (see Supplementary Table 10). The asterisk (*) indicates that these lines are also homozygous for both the NRPD1–3xFLAG transgene and the nrpd1 mutant.

The CLSY family is required for DNA methylation-mediated silencing

Given the known roles of DNA methylation in gene silencing, transcriptome profiling experiments were conducted to identify RdDM targets up-regulated in pol-iv and clsy mutants (Supplementary Tables 1, 8 and 9). These analyses revealed a total of 177 genes, repeats, and unannotated transcripts up-regulated at least 2-fold in pol-iv mutants. Although the clsy single mutants displayed weak expression phenotypes, at least one locus regulated predominantly by each mutant was identified (Fig. 4a, Supplementary Fig. 6a, and Supplementary Table 9). Of these single mutants, clsy4 was by far the strongest. However, the vast majority of pol-iv loci were redundantly controlled by all four CLSY proteins, as the clsy quadruple mutant regulated approximately 50% of all pol-iv up-regulated loci and nearly 80% of those were at least 5-fold up-regulated (Fig. 4a and Supplementary Table 9). To determine the extent to which the observed changes in gene expression correlate with altered 24nt-siRNA and DNA methylation profiles, these features were plotted side-by-side for all 177 loci (+/− 2kb) in the pol-iv and clsy quadruple mutants (Fig. 4b). In aggregate, these loci showed lower levels of 24nt-siRNAs and DNA methylation. For approximately half of the genes, and the majority of unannotated transcripts and repeats, discrete regions with more strongly reduced 24nt-siRNAs and DNA methylation levels were apparent either within the transcript itself or in the flanking 2kb regions (Fig. 4b). Indeed, further characterization of these loci revealed a high degree of overlap (80–100%) with the previously identified reduced 24nt-siRNA clusters and hypo DMRs (Supplementary Fig. 6b, c and Supplementary Tables 4 and 7). In contrast, similar reductions were not observed in the clsy2 single mutant, which is the weakest mutant overall and thus served as a negative control (Supplementary Fig. 6d). 
Nonetheless, like the pol-iv and clsy quadruple mutants, two of the three loci up-regulated in the clsy2 mutant were associated with reduced 24nt-siRNA clusters and hypo DMRs (Supplementary Fig. 6e). Together, these findings support the conclusion that these up-regulated loci in the clsy mutants are normally silenced by DNA methylation that is controlled by the RdDM pathway.

Figure 4.

The CLSY family controls the expression of RdDM targets.

(a) Plot showing the expression level of pol-iv-up-regulated loci (represented as horizontal slashes) in the clsy single, double and quadruple mutants. The slashes in all genotypes are colored based on the expression level of up-regulated loci in pol-iv and the number of up-regulated loci in each mutant is indicated above. (b) Heatmaps and profile plots showing the expression levels of the up-regulated TAIR10 genes (n=115), unannotated transcripts (un. txn; n=26), and TAIR10 repeats (n=36) shown in a as well as the corresponding 24nt-siRNA and DNA methylation levels at these same loci. For the mRNA and 24nt-siRNA analyses, the Log2 fold change in expression is plotted and for the DNA methylation analysis, the percent difference in methylation is plotted. Color bars indicating the scales are shown below. The heatmaps include 2kb flanking the transcription start site (S) and the transcription termination site (T) and were ranked based on the 24nt-siRNA and mCHH values in both mutants (pol-iv and the clsy quad). The profiles of the genes, un. txn, and repeats are in black, light blue, and grey, respectively. (c) Boxplots showing the number of leaves produced before flowering in FWA-transformed T0 plants (Left) or untransformed plants (Right). The number of independent transformants (or untransformed plants) used for each genotype is shown below the boxplots. p-values ≤1e−4 calculated using Wilcoxon rank-sum tests relative to the WT_3 control are shown above. (d) Genome browser screen shot showing the levels of 24nt-siRNAs (rp10m) and DNA methylation at the endogenous FWA gene in the indicated genotypes. For each set of data, the scale is indicated in brackets, with CG, CHG, and CHH methylation shown in green, blue, and red, respectively. The region showing the most prominent reduction in CHH methylation is highlighted in grey. The expression data presented in panels a, b, and d correspond to two biological replicates of each genotype.

As an additional test of the CLSY specificities, their roles in the establishment of DNA methylation were assessed using a well-vetted de novo methylation assay involving the transformation of an unmethylated FWA transgene into each mutant background[46]. In this assay, failure to methylate and silence the incoming transgene results in an increase in the number of leaves produced prior to flowering. Compared to the untransformed controls, several of the FWA-transformed clsy mutants showed delayed flowering (Fig. 4c). In addition to clsy1, which was previously shown to display a late-flowering phenotype in FWA assays[37], clsy2 mutants also showed a slight delay, while clsy3 and clsy4 flowered at or near the normal number of leaves. This phenotype was enhanced in the clsy1,2 double, which flowered nearly as late as the clsy quadruple and pol-iv mutants. Notably, the specificities observed for this de novo assay match those observed at the endogenous FWA gene, where 24nt-siRNA production depends on CLSY1 and CLSY2 (Fig. 4d). These findings represent the first examples wherein bona fide components of the RdDM pathway (i.e. CLSY3 and CLSY4) are not required to establish methylation in the FWA de novo assay and demonstrate that the locus specificity observed for the CLSY family extends to the establishment phase of the RdDM pathway.

The CLSY family is required for Pol-IV chromatin association

To gain mechanistic insights into the roles of the CLSY proteins, enrichment of Pol-IV at 24nt-siRNA-producing loci was determined by chromatin immunoprecipitation and sequencing (ChIP-seq) experiments using a previously characterized tagged Pol-IV line (pNRPD1::NRPD1–3xFLAG[34]) crossed into various clsy mutant backgrounds (Supplementary Table 10). In a wild-type background, Pol-IV was enriched at all classes of clsy-dependent 24nt-siRNA clusters and, consistent with previous Pol-IV ChIP-seq experiments[34], Pol-IV was most enriched at highly expressed 24nt-siRNA clusters (e.g. clsy1-dependent loci) and less enriched at lowly expressed clusters (e.g. clsy4-dependent loci) (Fig. 5a). In the clsy1,2 or clsy3,4 mutant backgrounds, Pol-IV enrichment was specifically reduced at the loci regulated by these factors, and in the clsy quadruple mutant Pol-IV enrichment was depleted at all 24nt-siRNA loci (Fig. 5b and Supplementary Fig. 7a, b). In the clsy single mutants, reductions in Pol-IV enrichment were most clearly observed at clsy1- and clsy3-dependent loci (Supplementary Fig. 7c). For the clsy2 mutant, where only a few reduced 24nt-siRNA clusters were identified (n=45), or the clsy4 mutant, where the reduced 24nt-siRNA clusters are lowly expressed even in wild-type plants (Fig. 1b), global reductions were difficult to observe. However, individual examples of Pol-IV reduction in these mutants were identified (Supplementary Fig. 7d), and in both cases these weaker mutants (clsy2 and clsy4) enhanced their stronger mutant counterparts (clsy1 and clsy3, respectively; Supplementary Fig. 7a). Taken together, these findings demonstrate that the CLSY proteins are required for the locus-specific association of Pol-IV at chromatin.

The CLSY proteins rely on different chromatin modifications

In addition to the CLSY family, one other Pol-IV-associated factor, the methyl-H3K9 reader SHH1, is known to regulate 24nt-siRNA expression and function at the level of Pol-IV chromatin association[32]–[35]. Consistent with previous results[33,34], 24nt-siRNA profiling revealed that ~50% of the core 24nt-siRNA clusters were at least 2-fold reduced in shh1 mutants (Fig. 6a). Comparison of shh1-dependent 24nt-siRNA clusters and hypo CHH DMRs with those identified in the clsy1,2 or clsy3,4 double mutants shows a nearly complete and highly specific overlap between shh1 and clsy1,2 (Fig. 6a-d), revealing a genetic connection between these mutants. Further supporting this relationship, analysis of 24nt-siRNA levels over all pol-iv-dependent clusters demonstrated that shh1 and either the shh1,clsy1 double or the shh1,clsy1,2 triple mutants have similarly reduced 24nt-siRNA levels, while the shh1,clsy3,4 triple mutant phenocopies the clsy quadruple and pol-iv mutants (Fig. 6e). Based on these findings, the hypothesis that CLSY1 and CLSY2 are required for the association of SHH1 with Pol-IV in vivo was tested by a series of co-immunoprecipitation experiments. Indeed, this interaction was specifically disrupted in clsy1,2 mutants, with less than ~12.5% of the wild-type level of NRPD1, the largest subunit of Pol-IV, co-purifying with SHH1 (Fig. 6f and Supplementary Fig. 8). Given the known connections between SHH1 and H3K9 methylation, the dependence of 24nt-siRNA production at CLSY1- and CLSY2-regulated loci on H3K9 methylation was also determined. In the suvh4,5,6 triple mutant, where H3K9 methylation levels are globally reduced, but not eliminated[47], 24nt-siRNA levels at clsy1,2-dependent, but not clsy3,4-dependent loci, were significantly reduced (Supplementary Table 2, Fig. 6g and Supplementary Fig. 9). 
As the reductions in 24nt-siRNA levels in the suvh4,5,6 mutant were not as strong as those observed in the clsy1,2 and shh1 mutants, publicly available data[48] was used to further investigate the relationship between the residual H3K9 di-methylation and 24nt-siRNA abundances in this mutant. At clsy1,2-dependent loci, regions that retain more H3K9 di-methylation in the suvh4,5,6 mutant also retain more 24nt-siRNAs (Supplementary Fig. 9a, b), further supporting the notion that 24nt-siRNAs at these loci are regulated in an H3K9me-dependent manner. Finally, consistent with previous observations that H3K9 methylation depends on CG methylation[47,49,50], 24nt-siRNA levels at clsy1,2-dependent loci were also reduced in the met1 and ddm1 mutants (Fig. 6g). Although some roles for CG methylation independent of H3K9 methylation cannot be excluded, these findings support a model in which CLSY1 and CLSY2 mediate the interaction between SHH1 and Pol-IV to control 24nt-siRNA production at clsy1,2-dependent loci in a highly H3K9 methylation-dependent manner.

Figure 6.

The CLSY1/2 and CLSY3/4 proteins regulate Pol-IV in connection with repressive chromatin marks.

(a and c) Scaled Venn diagrams of reduced 24nt-siRNA clusters and hypo CHH DMRs, respectively, showing the relationships between loci regulated by the shh1 single and clsy1,2 or clsy3,4 double mutants. For readability, only overlaps >20 are labeled. (b, e and g) Boxplots showing the levels of 24nt-siRNAs at the reduced 24nt-siRNA clusters identified in the clsy double mutants, b and g, or pol-iv single mutant, e, in the genotypes indicated below. In g, the asterisks (*) indicate a p-value <2.2e−16 calculated using a Wilcoxon rank-sum test relative to the WT_avg control for all samples except for met1, which was calculated relative to the MET1-WT control. The p-values for all other samples are >0.05. These boxplots represent a single experiment including three independent WT controls. (d) Boxplot showing the levels of CHH methylation at the hypo CHH DMRs identified in the shh1 single mutant as compared to the clsy double mutants and pol-iv. This boxplot represents a single experiment including three independent WT controls. (f) Cropped Western blots showing the levels of NRPD1–3xFLAG or SHH1–3xMyc from co-immunoprecipitation (co-IP) experiments in the genetic backgrounds indicated above each lane. For each blot the antibody (α) used is indicated in the upper right corner and the sizes of the protein markers are indicated on the left. An asterisk (*) marks a background band present in the α-Myc IP and the bands corresponding to the NRPD1–3xFLAG and SHH1–3xMyc proteins are marked with arrows. For the IP titrations, the gradient triangles represent a series of 2-fold dilutions starting from undiluted IP samples. Uncropped images are shown in Supplementary Fig. 8.
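
The ~12.5% figure quoted for the co-IP follows from the 2-fold dilution series described in panel f: three successive 2-fold dilutions of the undiluted sample give 1/8 of the starting amount. The sketch below is a worked illustration only; the number of lanes and the `dilution_series` helper are hypothetical and not taken from the blots.

```python
def dilution_series(n_steps, fold=2):
    """Relative amounts for a serial dilution starting from undiluted (1.0)."""
    return [1 / fold**i for i in range(n_steps)]

series = dilution_series(4)  # four lanes is a hypothetical choice
print([f"{x:.1%}" for x in series])  # ['100.0%', '50.0%', '25.0%', '12.5%']
```

A mutant signal weaker than the most dilute wild-type lane in such a series would therefore be reported as less than that lane's fraction of the wild-type level.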

Alternatively, the genetic interactions between shh1, suvh4,5,6, and the clsy mutants clearly demonstrate that CLSY3 and CLSY4 facilitate Pol-IV function independent of both SHH1 and H3K9 methylation (Fig. 6 and Supplementary Fig. 9c, d). Thus, we sought to determine whether CLSY3 and CLSY4 rely on any other epigenetic features to facilitate Pol-IV localization. To this end, 24nt-siRNA levels at clsy3,4-dependent loci were profiled in mutants controlling DNA methylation in all three contexts (drm1,2, cmt2, and cmt3), as well as mutants controlling the deposition of several known repressive histone modifications (suvh4,5,6 and atxr5,6; Supplementary Table 2). Of these mutants, only those controlling methylation in the CG context, ddm1 and met1, showed significantly reduced 24nt-siRNA levels (Fig. 6g), demonstrating that 24nt-siRNA production at loci controlled by CLSY3 and CLSY4 depends on CG methylation. However, it remains unknown whether these CLSYs rely directly on CG methylation or if they instead depend on other chromatin modifications or heterochromatin features that, like H3K9 methylation, rely on CG methylation.

Discussion

A major unanswered question in the field of epigenetics is how specific patterns of DNA methylation are generated and modulated, a critical step in deciphering epigenetic processes in both normal development and disease. As Pol-IV “kicks off” the RdDM pathway by initiating the biogenesis of 24nt-siRNAs, which ultimately guide DNA methylation in a sequence-specific manner, understanding the regulation of this polymerase is essential to determining how specific DNA methylation patterns are generated. Previously, we identified the CLSY proteins as components of the Pol-IV complex(es)[35] and here we show they act as locus-specific regulators of both 24nt-siRNA production and DNA methylation. This locus-specific behavior differs from that of previously characterized RdDM factors, as none rival the degree or comprehensive nature of the specificities displayed by the CLSY family. Overall, these findings not only shed light on the regulation of Pol-IV, but also uncover a long-sought layer of complexity within the RdDM pathway that enables the locus-specific control of DNA methylation patterns.

Investigation into the locus-specific behavior of the CLSYs revealed that different chromatin modifications are required for the production of 24nt-siRNAs depending on the CLSY proteins involved. For loci regulated by CLSY3 and CLSY4, CG methylation is required, but the connections (direct or indirect) between CG methylation and CLSY3 and CLSY4 remain to be elucidated. Perhaps further characterization of factors like HISTONE DEACETYLASE 6, which participate in both the CG methylation and RdDM pathways[51,52], will shed light on these connections. For loci regulated by CLSY1 and CLSY2, our analyses provide a direct link to H3K9 methylation, as these two CLSY proteins are required for the association between the H3K9me2 reader, SHH1, and the Pol-IV complex. Finally, for the remaining loci that are redundantly controlled by all four CLSYs, it remains unclear whether different modes of regulation are employed as these 24nt-siRNA clusters are expressed at low levels in all mutants tested (Fig. 6g).

Together, these results reveal that specific chromatin features, including, but not limited to, CG and H3K9 methylation, can be leveraged to generate locus-specific control over DNA methylation. Indeed, such mechanisms appear to be conserved between plants and animals, as a similar, though less locus-specific, mechanism was recently identified in Drosophila wherein the core transcriptional machinery was shown to be linked to repressive histone marks in connection with the H3K9me3 reader, Rhino[53]. Furthermore, given the widespread conservation of SNF2 chromatin remodeling factors in general, and the specific conservation of the CLSY family in crops including rice[54,55] and maize[54], we anticipate that our findings will be informative for understanding the mechanisms governing the establishment of specific DNA methylation patterns in diverse organisms.

Online methods

Plant Materials

All plant materials used in this study were in the Columbia-0 (Col-0) ecotype and unless otherwise specified, plants were grown in Salk greenhouses with long-day conditions. Newly characterized CLSY T-DNA insertion mutant lines include: clsy1–10 (SALK_204860C)[57], clsy3–2 (SALK_204501C)[57], clsy4–2 (WiscDsLox472B9)[58], clsy2–1 (GABI-Kat line 554E02)[59], clsy2–2 (SAIL_484_F03)[60], clsy3–1 (SALK_040366) and clsy4–1 (SALK_003876)[57]. Unless otherwise specified, the clsy1–7, clsy2–2, clsy3–1, and clsy4–1 alleles were utilized. Previously published mutant lines include: clsy1–7 (SALK_018319)[61], nrpd1–4 (SALK_083051)[62], shh1–1 (SALK_074540C)[35], drm1–2,drm2–2 (drm1,2; SALK_031705 and SALK_150863, respectively)[63], cmt2–7 (WiscDsLox7E02)[47], cmt3–11 (SALK_148381)[63], met1–3 (CS16387)[64], ddm1–2 (EMS allele)[65], atxr5,atxr6 (atxr5,6; SALK_130607 and SAIL_240_H01, respectively)[66], and suvh4,suvh5,suvh6 (suvh4,5,6; SALK_41474, GABI-Kat 263C05, Garlic_1244_F04.b.1a, respectively)[67]. The pNRPD1::NRPD1–3xFLAG and pSHH1::SHH1–3xMyc transgenic lines were previously characterized in Law et al.[35].

Small RNA isolation, library preparation, sequencing and data processing

Small RNA isolation:

4 un-opened flower buds (stage 12 and younger) from individual mutant plants as well as 3 individual wild-type (WT) controls were collected, frozen in liquid nitrogen and kept at −80°C until use. The total RNA extraction and small RNA enrichment were performed as previously described[68] with the following minor modifications: (1) for the small RNA enrichment step an equal volume of 20% polyethylene glycol 8000/2M NaCl was added to each total RNA sample and (2) the ZR-small RNA ladder (Zymo Research, Cat# R1090) was used to determine the gel region corresponding to the 17–29 nucleotide (nt) size range. The resulting small RNAs were then used for library preparation with the NEBNext Multiplex Small RNA Library Prep Set for Illumina (New England Biolabs, Cat# E7300) following the user’s manual. The final library products were further purified using an 8% polyacrylamide gel to excise 130–160nt products relative to the pBR322 DNA-MspI Digest ladder (New England Biolabs, Cat# E7323AA). The libraries were pooled and sequenced (single end 50bp, SE50) on a HiSeq 2500 machine (Illumina).

Small RNA data processing and mapping:

The adapter sequences in the de-multiplexed small RNA (smRNA) sequencing reads were trimmed using cutadapt (v1.9.1) and reads longer than 15nt were kept for further analyses[69]. The trimmed smRNA reads were then mapped to the Arabidopsis genome (TAIR10) using ShortStack (v3.8.1)[39], allowing 1 mismatch (--mismatches 1) and employing either the multi-mapping mode (--mmap f) or the no multi-mapping, none mode (--mmap n). Subsequently, a custom JSON filter (JSON_findPerfectMatches_and_TerminalMisMatches_v3) was employed to keep only perfectly matching reads and reads with a single mismatch at their 3’ terminus, as such mismatches were recently identified as a feature of Pol-IV-dependent RNAs[7]. The smRNA reads passing this custom filter were then used to call small RNA clusters using ShortStack with the --mincov 20, --pad 100, --dicermin 21 and --dicermax 24 options. The numbers of 21–24nt smRNA clusters identified were extracted using a custom perl script (splitpancakesbysize_shortStack_v3.8.1.pl) and are presented in Supplementary Table 3. To facilitate further analysis, the smRNA reads passing the JSON filter (bam file format) were used to generate a “Tag Directory” using the makeTagDirectory script from the HOMER (Hypergeometric Optimization of Motif EnRichment) package[70]. The Tag Directory was then split into sub-TagDirectories by smRNA size (20–25nt) using a custom perl script (splitTagDirectoryByLength.dev2.pl).
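The read-level rule applied by the custom filter (keep perfect matches and reads whose only mismatch is at the 3’ terminus) can be illustrated with a minimal Python sketch. This is not the authors' script: the function name and the simplified input (a read and the equal-length reference segment it aligned to, both written 5’ to 3’ on the read's strand, with no gaps) are assumptions made for illustration.

```python
def passes_terminal_mismatch_filter(read: str, reference: str) -> bool:
    """Return True for a perfect match or a single mismatch at the 3' end.

    Hypothetical helper: both sequences are given 5'->3' on the read's
    strand and must have the same length (ungapped alignment assumed).
    """
    if not read or len(read) != len(reference):
        return False
    # Positions (0-based, from the 5' end) where read and reference differ.
    mismatches = [
        i
        for i, (r, g) in enumerate(zip(read.upper(), reference.upper()))
        if r != g
    ]
    if not mismatches:
        return True  # perfectly matching read
    # Keep the read only if its sole mismatch is the last (3' terminal) base.
    return mismatches == [len(read) - 1]
```

A real implementation would read the mismatch position from the alignment records (and account for strand) rather than comparing strings directly.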

Differential expression analysis:

To identify a core set of 24nt-siRNA clusters in WT plants, common clusters from three WT replicates (WT_1, WT_2, and WT_3) were determined and the overlapping regions of each cluster were kept and merged using the mergePeaks.pl script from HOMER. All differential expression analyses were conducted based on these core clusters using DESeq2[42] as follows: First, the raw read counts (24nt) for each cluster in each genotype, including all the corresponding WT controls, were calculated using the annotatePeaks.pl script (-raw -len 1) from HOMER. These read counts were then normalized using DESeq2 with modifications to the size factor estimation in order to relate counts to total mapped reads (i.e. smRNA reads of all sizes passing the JSON filter) rather than reads associated with specific features (e.g. 24nt reads). Specifically, size factors were first calculated for all the WT replicates using the DESeq2 default method. Then, these values were compared against the corresponding number of total mapped reads in order to derive an average number of mapped reads per size factor unit. With this average value, the number of mapped reads per sample was used to calculate the size factors for the individual mutants. The derived size factors and the matrix of raw read counts for each cluster in all the mutants, as well as the WT replicates, were then used as the input for DESeq2 to call mutant-dependent differential expression of 24nt-siRNA clusters (fold change (FC) ≥2, false discovery rate (FDR) ≤0.01). For 21nt- and 22nt-smRNAs, core clusters for each size class were determined as described above and reduced clusters were identified using DESeq2 (FC≥2 and FDR≤0.01).
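The modified size factor calculation described above can be sketched in Python as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, and taking the "average" as the arithmetic mean of the per-replicate ratios (mapped reads divided by the DESeq2 size factor) is an assumption, since the text does not state how the average was computed.

```python
from statistics import mean
from typing import Sequence


def reads_per_size_factor_unit(
    wt_size_factors: Sequence[float], wt_mapped_reads: Sequence[int]
) -> float:
    """Average number of mapped reads per unit of DESeq2 size factor.

    Assumption: the average is the arithmetic mean of the per-replicate
    ratios (mapped reads / size factor) over the WT replicates.
    """
    return mean(
        reads / sf for sf, reads in zip(wt_size_factors, wt_mapped_reads)
    )


def size_factor_from_mapped_reads(mapped_reads: int, reads_per_unit: float) -> float:
    """Size factor for a mutant library, scaled by its total mapped reads."""
    return mapped_reads / reads_per_unit


# Illustrative (made-up) numbers: WT size factors from the DESeq2 default
# method and the total mapped reads for the same libraries.
per_unit = reads_per_size_factor_unit([1.0, 2.0], [10_000_000, 20_000_000])
mutant_sf = size_factor_from_mapped_reads(15_000_000, per_unit)
```

The resulting values would then be supplied to DESeq2 in place of its default size factors, so that counts are related to total mapped reads rather than to 24nt reads alone.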

Visualization and analysis of 24nt-siRNA levels:

Downstream analyses were performed using HOMER and other tools as described below. Genome browser tracks of 24nt-siRNAs were generated using the HOMER makeUCSCfile script (-fragLength 24 -norm 10000000). For each boxplot, normalized smRNA read counts for the specified 24nt-siRNA clusters were calculated using the HOMER annotatePeaks.pl script (-rpkm -len 1) and the boxplot was drawn in R using RStudio (v1.0.136). For each heatmap, the HOMER annotatePeaks.pl script (-size 10000, -hist 600, -ghist and -len 24) was used to calculate the values for each set of 24nt-siRNA clusters. A pseudocount of 1 was then added to all the data, which was then log2 transformed and visualized using the Morpheus online tool. To generate the Venn diagrams, the unique identifiers of each mutant-dependent 24nt-siRNA cluster were imported and visualized using online tools for unscaled (VENNY2.1) or scaled (VennMaster[71]) Venn diagrams. For the chromosome-wide views of reduced 24nt-siRNA clusters, the pericentromeric heterochromatin genomic features were marked in the IGV genome browser based on previously published regions[56] and the distribution of mutant-dependent reduced 24nt-siRNA clusters was determined by bedmap[72] (--count --bp-ovr 1) in 100kb bins.
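The heatmap transformation (adding a pseudocount of 1, then taking log2) is simple enough to show directly. The sketch below assumes the HOMER histogram output has already been parsed into a list of numeric rows; the function name is hypothetical.

```python
import math
from typing import List


def log2_with_pseudocount(
    matrix: List[List[float]], pseudocount: float = 1.0
) -> List[List[float]]:
    """Add a pseudocount to every value and log2-transform the matrix."""
    return [[math.log2(value + pseudocount) for value in row] for row in matrix]
```

The pseudocount keeps bins with zero reads defined (they map to 0 after the transform) instead of producing an undefined log2 of zero.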

DNA isolation, MethylC-seq library construction, sequencing and data processing

DNA isolation:

0.1g of un-opened flower buds (stage 12 and younger) were collected from the same individual plants as used for the smRNA-seq analyses and genomic DNA was isolated using the DNeasy Plant Mini Kit (Qiagen, Cat# 69104). 2.0μg of purified genomic DNA was then used to generate MethylC-seq libraries as described in Li et al.[73]. The resulting libraries were pooled and sequenced (single end 50bp, SE50) on a HiSeq 2500 machine (Illumina).

MethylC-seq data processing:

MethylC-seq reads were trimmed and analyzed using BS-Seeker2 (v2.0.9). Briefly, reads were mapped against the C-to-T converted TAIR10 reference genome using the bs_seeker2-align.py script with the bowtie aligner, allowing 2 mismatches (-m 2). Clonal reads were removed using the MarkDuplicates function within picard tools (http://broadinstitute.github.io/picard). The mapped reads were then used to calculate the methylation level at each cytosine using the bs_seeker2-call_methylation.py script, requiring a minimum coverage of 4 reads (-r 4). From these analyses, the mappability, coverage, global percent CG, CHG, and CHH methylation levels, and non-conversion rates for each library were determined (see Supplementary Table 6). In addition, wiggle (wig) files showing the percent CG, CHG and CHH methylation for each genotype at single nucleotide resolution were generated using a custom perl script (Bsseeker2_2_wiggleV2.pl) based on the BS-Seeker2 CGmap output files.
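As an illustration of the per-cytosine calculation with a coverage cutoff, the following Python sketch computes percent methylation as the fraction of reads supporting a methylated cytosine among all reads covering that position. The function name is hypothetical, and the definition of the methylation level used here is the conventional one for bisulfite sequencing, stated as an assumption rather than taken from the BS-Seeker2 source.

```python
from typing import Optional


def percent_methylation(
    methylated_reads: int, total_reads: int, min_coverage: int = 4
) -> Optional[float]:
    """Percent methylation at one cytosine, or None if coverage is too low.

    Assumption: level = reads reporting C / all reads covering the site.
    """
    if total_reads < min_coverage:
        return None  # position excluded, mirroring a minimum-coverage cutoff
    return 100.0 * methylated_reads / total_reads
```

Returning `None` for low-coverage sites keeps them distinguishable from sites that are covered but unmethylated (0.0).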

DMR calling:

To call differentially methylated regions (DMRs) several custom perl and R scripts were used (Bsseeker2_methylCall2Cytosine.pl, CytosineTo100bpBin.pl, GetOnlyCommonBins.pl, DMRFtestFDR.R, SplittingDMRs.pl, and SplittingDMR2Bed.pl). These scripts identified DMRs in the CG, CHG or CHH contexts based on pair-wise comparisons between each mutant and three independent WT data sets in 100bp non-overlapping bins using the following criteria: (1) only bins with ≥4 cytosines in the specified context were included, (2) only bins in which there was sufficient coverage in both genotypes being compared were included (i.e. ≥4 reads over the required 4 cytosines in the specific context), and (3) only bins with a difference of at least 40%, 20% or 10% methylation in the CG, CHG, and CHH contexts, respectively, with an adjusted p-value of ≤0.01 relative to all three WT controls were called as DMRs.
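The three bin-level criteria can be summarized in a short Python sketch for a single mutant versus WT comparison. This is not the authors' pipeline and rests on several labeled assumptions: the 40%, 20% and 10% thresholds are interpreted as absolute differences in percent methylation; the bin-level methylation is taken as pooled methylated reads over pooled total reads; the adjusted p-value is supplied by the caller rather than computed; and all names are hypothetical.

```python
from typing import List, Tuple

Site = Tuple[int, int]  # (methylated reads, total reads) at one cytosine

# Assumed interpretation: minimum absolute difference in percent methylation.
MIN_DIFFERENCE = {"CG": 40.0, "CHG": 20.0, "CHH": 10.0}


def pooled_percent_methylation(sites: List[Site]) -> float:
    """Pooled methylation level of a bin (assumed definition)."""
    methylated = sum(m for m, _ in sites)
    total = sum(t for _, t in sites)
    return 100.0 * methylated / total


def is_candidate_dmr(
    context: str,
    mutant_sites: List[Site],
    wt_sites: List[Site],
    adjusted_p: float,
    min_sites: int = 4,
    min_reads: int = 4,
    max_adjusted_p: float = 0.01,
) -> bool:
    """Apply criteria (1)-(3) to one 100bp bin for one mutant/WT pair.

    mutant_sites and wt_sites describe the same cytosines in the same order.
    """
    if len(mutant_sites) != len(wt_sites):
        raise ValueError("site lists must describe the same cytosines")
    # Criteria (1) and (2): at least `min_sites` cytosines that each have
    # at least `min_reads` reads in both genotypes.
    covered = [
        (m, w)
        for m, w in zip(mutant_sites, wt_sites)
        if m[1] >= min_reads and w[1] >= min_reads
    ]
    if len(covered) < min_sites:
        return False
    # Criterion (3): methylation difference and adjusted p-value cutoffs.
    difference = abs(
        pooled_percent_methylation([m for m, _ in covered])
        - pooled_percent_methylation([w for _, w in covered])
    )
    return difference >= MIN_DIFFERENCE[context] and adjusted_p <= max_adjusted_p
```

In the procedure described above, a bin is called a DMR only when it passes relative to all three WT controls, so a check like this would be repeated once per WT data set and the results combined.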

Visualization and analysis of DNA methylation levels:

The overlaps between the clsy DMRs and reduced 24nt-siRNA clusters were determined using bedops[72] (--element-of 1) and the heatmap indicating the percent overlap in Figure 3a was generated using the Morpheus online tool. The overlaps between DMRs called in different genotypes were determined using bedops (--element-of 1) and visualized as Venn diagrams generated as described for the smRNA analyses. The DNA methylation levels over reduced 24nt-siRNA clusters or DMRs were determined using the HOMER tool suite. For these analyses, Tag Directories were made from each of the methyl CG, CHG, and CHH wig files in two steps. First, the wig files were converted into the tag format recognized by HOMER using a custom script (parseWig_noChr.v2.pl) and then the Tag Directories were generated using the HOMER makeTagDirectory script (-precision 3 -t). Using these Tag Directories, the percent methylation over the desired genomic regions (e.g. reduced 24nt-siRNA clusters or DMRs) was determined using the HOMER annotatePeaks.pl script (-ratio -len 1). These methylation levels were then used to generate boxplots in R using RStudio.

RNA isolation, real-time PCR, mRNAseq library construction, sequencing and data processing

RNA isolation:

4 un-opened flower buds (stage 12 and younger) were collected from the same individual plants as used for the smRNA-seq and MethylC-seq analyses and total RNA was isolated using the Quick-RNA MiniPrep kit (Zymo Research, Cat# R1055). For the Reverse Transcriptase quantitative PCR (RT-qPCR) assays, 1.0μg of DNase I-treated total RNA was reverse transcribed using the High-Capacity cDNA Reverse Transcription Kit with RNase Inhibitor (Applied Biosystems, Cat#4374967). The RT-qPCR assays were conducted using the iTaq Universal SYBR Green Mix (Bio-Rad, Cat#172–5124) with the CFX384 Real-Time System (Bio-Rad). The cDNA levels of target genes were normalized to ACTIN2 and the error bars represent the standard error between three technical replicates. The primer pairs for the CLSY genes are listed in Supplementary Table 11. For the RNA-seq libraries, 2.0μg of total RNA from each genotype was used to generate mRNA-seq libraries using the NEBNext Ultra RNA Library Prep Kit (New England Biolabs, Cat# E7530). All size selection and clean-up steps were performed using Sera-Mag Magnetic SpeedBeads (Thermo Scientific, Cat# 65152105050250). The resulting libraries were pooled and sequenced (single end 50bp, SE50) on a HiSeq 2500 machine (Illumina).

mRNA-seq data processing:

mRNA-seq reads were mapped to the TAIR10 reference genome using STAR (v2.5.0c)[74] allowing 2 mismatches (--outFilterMismatchNmax 2) and including only uniquely mapped reads (--outFilterMultimapNmax 1). The sorted bam files were then used to generate Tag Directories using the HOMER makeTagDirectory script and the TAIR10 annotation was used to obtain the raw read counts for each gene (or repeat) using the HOMER analyzeRepeats.pl script with different options for genes (rna tair10 -raw -condenseGenes -len 1) or repeats (repeats tair10 -raw -len 1). Differentially expressed genes and repeats were then determined by DESeq2 using the default parameters and employing a FC threshold of ≥2 with an FDR ≤0.05 compared to all WT controls. To identify previously un-annotated transcripts regulated by the RdDM pathway, the mRNA-seq data was re-analyzed using TopHat2 (v2.1.1)[75]. Briefly, the mRNA-seq reads were mapped to the TAIR10 genome by TopHat2 and the output bam files were used to identify transcript assemblies using Cufflinks (v2.2.1) without using the TAIR10 annotation. The resulting transcript assemblies were merged using Cuffmerge to get the de novo transcript units (in GTF form), which were further converted into bed format using the gtfToGenePred and genePredToBed scripts. The converted bed file was then used to obtain raw read counts for each transcript using the HOMER annotatePeaks.pl script (-raw -len 1). The differentially expressed transcripts were then determined by DESeq2 as described above for genes and repeats. These differentially expressed transcripts were then compared with TAIR10 genes and repeats, and non-overlapping transcripts were designated as un-annotated transcripts.
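The final step, designating transcripts that do not overlap any annotated gene or repeat as un-annotated, amounts to an interval-overlap test. The Python sketch below is a minimal illustration assuming BED-style half-open coordinates and ignoring strand; the type alias and function names are hypothetical, and the tool actually used for this comparison is not stated in the text.

```python
from typing import List, Tuple

Interval = Tuple[str, int, int]  # (chromosome, start, end), BED-style half-open


def overlaps(a: Interval, b: Interval) -> bool:
    """True if two intervals on the same chromosome share at least 1bp."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def unannotated_transcripts(
    transcripts: List[Interval], annotated_features: List[Interval]
) -> List[Interval]:
    """Keep transcripts that overlap no annotated gene or repeat."""
    return [
        t
        for t in transcripts
        if not any(overlaps(t, feature) for feature in annotated_features)
    ]
```

This quadratic scan is fine for a few hundred differentially expressed transcripts; for genome-scale sets, sorted intervals or a dedicated tool would be the more practical choice.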

Visualization and analysis of pol-iv-dependent up-regulated loci:

To visualize loci up-regulated in pol-iv mutants (including genes, repeats and unannotated transcripts) that are also up-regulated in the clsy mutants, a profile plot of pol-iv-dependent up-regulated loci was generated as follows: (1) From the DESeq2 output files, the up-regulated loci in pol-iv were determined (FC ≥2 and FDR ≤0.05). (2) Then FC and FDR values for this set of loci in each mutant were extracted from the DESeq2 output files and filtered with the threshold (FC ≥2 and FDR ≤0.05). The FC values passing the filter were kept and all other values were replaced with “NA”. (3) The resulting data matrix was organized by tidyr (gather), color-coded based on the FC value in pol-iv and visualized by ggplot2 (geom_point) in RStudio. To determine the correlation between 24nt-siRNAs, DNA methylation and gene expression, the set of 177 up-regulated loci in pol-iv was used to generate heatmaps and profile plots using deepTools (v2.4.0)[76]. For the mRNA-seq data, the sorted bam files derived from the STAR mapping were first compared to WT controls using the bamCompare tool (--ratio=log2 --scaleFactorsMethod SES -bs 10). For 24nt-siRNA data, the 24nt-siRNA bedGraph files generated by HOMER were converted into bigwig format using the bedGraphToBigWig script with default options and then compared to WT controls using the bigwigCompare tool (--ratio=log2 -bs 10). For DNA methylation data, the wig files were first converted into bigwig format using the wigToBigWig tool and then the difference between the mutants and WT controls was calculated using the bigwigCompare tool (--ratio=subtract). The resulting bigwig files were then used to calculate a matrix using the computeMatrix tool (scale-regions -a 2000 --regionBodyLength 2000 -b 2000 -bs=100). Finally, the data was plotted using the plotHeatmap tool. To determine the overlaps between the up-regulated loci indicated in Fig. 4a, and reduced 24nt-siRNA clusters and hypo DMRs identified in the pol-iv, clsy quadruple, and clsy2 mutants, the bedops --element-of 1 function was used, and to determine the number of DMRs overlapping with each locus, the bedmap counts function was used (--count --bp-ovr 1).
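Step (2) of the profile plot procedure, keeping FC values that pass the thresholds and replacing all others with “NA”, can be sketched in Python as follows. The analysis described here was done in R, so this is only an illustration of the masking logic: `None` stands in for “NA”, FC is assumed to be a linear (not log2) fold change, and the function name and input layout are hypothetical.

```python
from typing import Dict, Optional, Tuple


def mask_fold_changes(
    results: Dict[str, Tuple[float, float]],
    min_fc: float = 2.0,
    max_fdr: float = 0.05,
) -> Dict[str, Optional[float]]:
    """Keep FC values passing both cutoffs; replace the rest with None ("NA").

    `results` maps a locus identifier to (fold change, FDR) for one mutant.
    """
    return {
        locus: (fc if fc >= min_fc and fdr <= max_fdr else None)
        for locus, (fc, fdr) in results.items()
    }
```

Applying this to each mutant's results for the pol-iv up-regulated loci yields one column per mutant, which is the kind of matrix that is then reshaped to long format for plotting.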

FWA transformation assay

A previously described FWA plasmid[35] was used for floral dipping[77] into the following genotypes: Col-0, clsy1–7, clsy2–1, clsy3–1, clsy4–1, clsy1–7,2–1, clsy3–1,4–1, clsy1–7,2–1,3–1,4–1 and nrpd1–4. The resulting T0 seeds were selected on Linsmaier and Skoog (LS) media with 0.6% agar and Basta (25mg/L) for one week and the resistant plants were transferred to soil and grown in a growth chamber at 22°C, on a 16h light and 8h dark cycle, with 70% humidity. The number of rosette leaves produced prior to bolting was determined and plotted using R in RStudio and the p-values were calculated using Wilcoxon rank sum tests.

ChIP, ChIP-seq library preparation, sequencing and data processing

ChIP:

A FLAG-tagged Pol-IV line, pNRPD1::NRPD1–3xFLAG in an nrpd1–4 mutant background[35], was crossed into the following mutants: clsy1–7, clsy2–2, clsy3–1, clsy4–1, clsy1–7,2–2, clsy3–1,4–1, clsy1–7,2–2,3–1,4–1 and shh1–1. The progeny of these crosses were screened by drug-resistance to select for lines homozygous for the tagged Pol-IV transgene and genotyped by PCR to isolate lines homozygous for each mutant background, including the nrpd1–4 allele. The ChIP was performed as previously described in Law et al.[34]. For each genotype, 2.0g of un-opened flower buds (stage 12 and younger) were collected, ground to a fine powder in liquid nitrogen, and crosslinked with 1% formaldehyde (Sigma, Cat# F8775) for 20min at room temperature with slow rotation. The chromatin was then fragmented to ~500bp by sonication and the lysate was incubated with anti-FLAG M2 Magnetic beads (Sigma, Cat# M8823) at 4°C for 2h. The beads were washed 5 times, for 5min at 4°C and eluted twice using 150μL of 3xFLAG peptide [0.1 mg/mL] (Sigma, Cat# F4799) at room temperature, rotating for 15min each time. The crosslinking was reversed by incubation at 65°C overnight, and the DNA was purified using a Phenol:Chloroform:Isoamyl Alcohol kit (Thermo Scientific, Cat# 17908). ChIP libraries were prepared from the resulting DNA using the NEBNext Ultra II DNA Library Prep Kit (New England Biolabs, Cat# 7645) and sequenced (single end 50bp, SE50) on a HiSeq 2500 machine (Illumina).

ChIP-seq data analysis:

Pol-IV ChIP sequencing data were aligned to the TAIR10 reference genome using bowtie (v1.1.0)[78] allowing 2 mismatches (-v 2) and including multi-mapping reads (--all --best --strata). Pol-IV ChIP enrichment relative to WT controls at the identified 24nt-siRNA clusters was visualized using deepTools (v2.4.0)[76]. Briefly, the sorted bam files derived from bowtie mapping were compared to WT controls using the bamCompare tool (--ratio=log2 --scaleFactorsMethod SES -bs=10) and the resulting bigwig files were used to generate a data matrix using the computeMatrix tool (reference-point --referencePoint center -a 5000 -b 5000 -bs=10). Finally, the data was plotted using the plotHeatmap or plotProfile tools. The H3 and H3K9me2 ChIP sequencing data sets were downloaded from the Sequence Read Archive (http://www.ncbi.nlm.nih.gov/sra) under accession numbers GSM2837360 and GSM2837359[48], and were mapped and analyzed as described for the Pol-IV ChIP, including visualization using deepTools.

Co-IP and Western blotting

For these experiments, the plant lines described above in which the pNRPD1::NRPD1–3xFLAG construct was crossed into the clsy1,2 or clsy3,4 mutants were super-transformed with a previously described Myc-tagged SHH1 plasmid, pSHH1::SHH1–3xMyc[35], using the floral dip method[77]. The resulting T0 seeds were selected on LS media with 0.6% agar and hygromycin (25mg/L) for one week and the resistant plants were then transferred to soil and grown under long-day conditions at 22°C. The two tagged control lines, pSHH1::SHH1–3xMyc and pNRPD1::NRPD1–3xFLAG, were also grown under the same conditions. Approximately 0.5g of flower buds were collected from each genotype and ground into a fine powder in liquid nitrogen with 1mL Lysis buffer (50mM Tris, pH 7.6; 150mM NaCl; 5mM MgCl2; 10% Glycerol; 0.1% NP40) containing protease inhibitors. The lysate was cleared by centrifugation at 13,000rpm for 10min at 4°C. The supernatants were incubated with 2.0μL anti-c-Myc 4A6 antibody (Millipore, Cat# 05–724) and 30μL protein G Dynabeads (Invitrogen, Cat# 10004D) at 4°C for 2h rotating slowly. The beads were then washed 5 times, for 5min, with 1mL of Lysis buffer and resuspended in 50μL SDS-PAGE loading buffer. 16μL of input and bead eluate were resolved on a 7.5% TGX Precast Protein Gel (Bio-Rad, Cat# 3450005). The proteins were then detected by Western blotting using either the anti-FLAG M2 Monoclonal Antibody-Peroxidase Conjugated antibody (Sigma, Cat# A8592) at a dilution of 1:5,000 or the anti-c-Myc 4A6 antibody at a dilution of 1:2,000. Goat anti-mouse IgG horseradish peroxidase (Bio-Rad, Cat# 170–6516) was used at a dilution of 1:10,000 as the secondary antibody. All Western blots were developed using the ECL2 Western Blotting Substrate (Pierce, Cat# 80196).

77 in total

Review 1. Form, function, and regulation of ARGONAUTE proteins.

Authors: Allison Mallory; Hervé Vaucheret
Journal: Plant Cell Date: 2010-12-23 Impact factor: 11.277

2. An atypical component of RNA-directed DNA methylation machinery has both DNA methylation-dependent and -independent roles in locus-specific transcriptional gene silencing.

Authors: Jun Liu; Ge Bai; Cuijun Zhang; Wei Chen; Jinxing Zhou; Suwei Zhang; Qing Chen; Xin Deng; Xin-Jian He; Jian-Kang Zhu
Journal: Cell Res Date: 2011-11-08 Impact factor: 25.617

3. Epigenetic differences between shoots and roots in Arabidopsis reveals tissue-specific regulation.

Authors: Nicolas Widman; Suhua Feng; Steven E Jacobsen; Matteo Pellegrini
Journal: Epigenetics Date: 2013-10-29 Impact factor: 4.528

4. Maintenance of CpG methylation is essential for epigenetic inheritance during plant gametogenesis.

Authors: Hidetoshi Saze; Ortrun Mittelsten Scheid; Jerzy Paszkowski
Journal: Nat Genet Date: 2003-05 Impact factor: 38.330

5. PolIVb influences RNA-directed DNA methylation independently of its role in siRNA biogenesis.

Authors: Rebecca A Mosher; Frank Schwach; David Studholme; David C Baulcombe
Journal: Proc Natl Acad Sci U S A Date: 2008-02-19 Impact factor: 11.205

6. Polymerase IV occupancy at RNA-directed DNA methylation sites requires SHH1.

Authors: Julie A Law; Jiamu Du; Christopher J Hale; Suhua Feng; Krzysztof Krajewski; Ana Marie S Palanca; Brian D Strahl; Dinshaw J Patel; Steven E Jacobsen
Journal: Nature Date: 2013-05-01 Impact factor: 49.962

7. An SNF2 protein associated with nuclear RNA silencing and the spread of a silencing signal between cells in Arabidopsis.

Authors: Lisa M Smith; Olga Pontes; Iain Searle; Nataliya Yelina; Faridoon K Yousafzai; Alan J Herr; Craig S Pikaard; David C Baulcombe
Journal: Plant Cell Date: 2007-05-25 Impact factor: 11.277

8. Epigenetic remodeling of meiotic crossover frequency in Arabidopsis thaliana DNA methyltransferase mutants.

Authors: Nataliya E Yelina; Kyuha Choi; Liudmila Chelysheva; Malcolm Macaulay; Bastiaan de Snoo; Erik Wijnker; Nigel Miller; Jan Drouaud; Mathilde Grelon; Gregory P Copenhaver; Christine Mezard; Krystyna A Kelly; Ian R Henderson
Journal: PLoS Genet Date: 2012-08-02 Impact factor: 5.917

9. GABI-Kat SimpleSearch: new features of the Arabidopsis thaliana T-DNA mutant database.

Authors: Nils Kleinboelting; Gunnar Huep; Andreas Kloetgen; Prisca Viehoever; Bernd Weisshaar
Journal: Nucleic Acids Res Date: 2011-11-12 Impact factor: 16.971

10. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2.

Authors: Michael I Love; Wolfgang Huber; Simon Anders
Journal: Genome Biol Date: 2014 Impact factor: 13.583
