
Mutation of a major CG methylase alters genome-wide lncRNA expression in rice.

Juzuo Li1, Ning Li1, Ling Zhu1, Zhibin Zhang1, Xiaochong Li1, Jinbin Wang1, Hongwei Xun2, Jing Zhao1, Xiaofei Wang1, Tianya Wang1, Hongyan Wang3, Bao Liu1, Yu Li4, Lei Gong1.   

Abstract

Plant long non-coding RNAs (lncRNAs) function in diverse biological processes, and lncRNA expression is under epigenetic regulation, including by cytosine DNA methylation. However, it remains unclear whether 5-methylcytosine (5mC) plays a similar role in different sequence contexts (CG, CHG, and CHH). In this study, we characterized and compared the genome-wide lncRNA profiles (including long intergenic non-coding RNAs [lincRNAs] and long noncoding natural antisense transcripts [lncNATs]) of a null mutant of the rice DNA methyltransferase 1, OsMET1-2 (designated OsMET1-2-/-) and its isogenic wild type (OsMET1-2+/+). The En/Spm transposable element (TE) family, which was heavily methylated in OsMET1-2+/+, was transcriptionally de-repressed in OsMET1-2-/- due to genome-wide erasure of CG methylation, and this led to abundant production of specific lncRNAs. In addition, RdDM-mediated CHH methylation was increased in the 5'-upstream genomic regions of lncRNAs in OsMET1-2-/-. The positive correlation between the expression of lincRNAs and that of their proximal protein-coding genes was also analyzed. Our study shows that CG methylation negatively regulates the TE-related expression of lncRNAs and demonstrates that CHH methylation is also involved in the regulation of lncRNA expression.
© The Author(s) 2021. Published by Oxford University Press on behalf of Genetics Society of America.

Keywords:  DNA methylation; OsMET1-2; RNA-directed DNA methylation (RdDM); long non-coding RNAs (lncRNAs); small interference RNA (siRNA); transposable element

Year:  2021        PMID: 33617633      PMCID: PMC8049413          DOI: 10.1093/g3journal/jkab049

Source DB:  PubMed          Journal:  G3 (Bethesda)        ISSN: 2160-1836            Impact factor:   3.154


Introduction

Long non-coding RNAs (lncRNAs) are mRNA-like long RNA transcripts (usually >200 nt in length) that do not encode proteins because they lack discernible open-reading frames (Zhu and Wang 2012; Quinn and Chang 2016; Kopp and Mendell 2018). LncRNAs are expressed across diverse plant and animal species and are involved in the regulation of various biological processes, such as reproduction (Lee and Bartolomei 2013; Zhang ), nutrient absorption (Franco-Zorrilla ), and response to stimuli (Bhan ; Qin ). With the development of high‐throughput sequencing technologies, many lncRNA transcripts have been identified in different species by transcriptome reassembly (Liu ; Wang ; Kyriakou ; Uszczynska-Ratajczak ; Akay ). LncRNAs can be classified into long intergenic non-coding RNAs (lincRNAs) and long noncoding natural antisense transcripts (lncNATs) according to their genomic locations and transcriptional direction relative to the closest neighboring protein-coding genes (PCgenes) (Derrien ). Following the advancing steps of lncRNA identification and characterization in animal models (Wang ; Bakhtiarizadeh ; Scott ; Wang b), many studies have explored tissue lncRNA in different plant species, including representative angiosperms and gymnosperms (Liu ; Wang ; Zhang ; Wang ; Lu ; Jain ; Wang a; Deng ; Huang ; Wang ; Xu ; Yuan ; Zhang ; Zhao ; Deng ; Hou ; Jiang ; Zheng ). The features of lncRNAs in these plant species have been extensively characterized in terms of their biogenesis, intrinsic regulation, responses to stresses, regulation of PCgene expression, and involvement in speciation. Plant lncRNAs are typically transcribed by RNA polymerase II, which is similar to that characterized in animal species; additionally, lncRNAs can also be transcribed by plant-specific RNA polymerase V (Wierzbicki ). In terms of intrinsic regulation, most lncRNAs exhibit lower expression levels and strong tissue‐specific expression patterns relative to PCgenes (Liu ; Wang ). 
It is also recognized that whole-genome expression of plant lncRNAs is responsive to multiple stress conditions (Wang ; Lu ; Deng ; Yuan ) and that specific lncRNAs function as novel positive regulators of plant responses to different abiotic and biotic stresses (Jain et al. 2017; Qin ; Wang a; Zhang ). Another special type of stress, the genomic shock that results from genome merger and doubling in allopolyploid plant species, also induces changes in the lncRNA expression profile (Zhao ). Another intriguing dimension involves the regulation by lncRNAs of the expression of their neighboring PCgenes (Huang ; Xu ). Finally, from an evolutionary viewpoint, lncRNA profiles of phylogenetically related species suggest that abundant genome-specific and/or lineage-specific lncRNAs show weak evolutionary conservation throughout plant speciation (Liu ; Zhao ; Zheng ). The close association between transposable elements (TEs) and lncRNA expression has inspired a number of investigations into the regulation of lncRNA expression by DNA methylation (Wang ; Yan ; Chen ). Most of these studies have characterized the DNA methylation (in CG, CHG, and CHH contexts) around genomic regions that generate lncRNAs and have reached a consistent conclusion: CG and CHG methylation tends to be negatively correlated with lncRNA expression (Wang ; Xu ; Yan ). Notably, because no detailed analysis of DNA methylation mutants was involved, these previous studies are based on correlation analyses only and therefore do not reveal a causal relationship. In addition, although loss of function of DDM1 (decrease in DNA methylation 1, required for CG and CHG methylation of heterochromatic regions) was used to probe the effects of methylation on the expression of transcripts in some plant species (Corem ; Tan ; Long ), this approach could not distinguish the specific effect of CG methylation from that of CHG methylation on lncRNA expression. 
Overall, the question of whether and how contextual methylation (i.e., CG, CHG, and CHH) affects lncRNA expression remains unanswered. In this study, we characterized and compared genome-wide lncRNA profiles between a rice loss-of-function mutant for DNA methyltransferase 1, OsMET1-2 (OsMET1-2−/−), and its isogenic wild type (OsMET1-2+/+). We show that genome-wide CG hypomethylation in OsMET1-2−/− (Hu ) leads to massive generation of specific lincRNAs and lncNATs. We demonstrate that these novel lincRNAs and lncNATs derive primarily from hypomethylated En/Spm TEs that are heavily methylated in the wild type. We also find that RNA-directed DNA methylation (RdDM)-mediated CHH hypermethylation in the 5′-upstream genomic regions of lincRNAs is associated with their elevated transcription in OsMET1-2−/−. Using paired samples of OsMET1-2−/− and OsMET1-2+/+, we consistently show that the expression of cis-acting lincRNAs is positively correlated with that of their paired PCgenes in rice.

Materials and methods

Plant materials

The homozygous null mutant of OsMET1-2 (OsMET1-2−/−) and its isogenic wild type (OsMET1-2+/+) of Oryza sativa L. ssp. japonica cv. Nipponbare (Hu ) were used in this study. OsMET1-2+/+ and OsMET1-2−/− seeds were germinated and grown on plates with Murashige and Skoog (MS) medium in a plant incubator under controlled conditions of 24°C/16 h light and 20°C/8 h dark. Three biological replicates of each genotype, each consisting of five pooled 11-day-old seedlings, were collected and prepared for RNA isolation.

Library construction and next-generation sequencing

Total RNA was isolated from each biological replicate following standard procedures using the TRIzol reagent (Invitrogen). High-quality RNA was used for the subsequent library construction. Strand-specific whole transcriptome sequencing (containing both coding and non-coding RNAs) and small RNA sequencing libraries were constructed using the NEBNext Ultra Directional RNA Library Prep Kit for Illumina (NEB, USA) and the NEBNext Multiplex Small RNA Library Prep Set for Illumina (NEB, USA), respectively. The resulting libraries were sequenced on the Illumina HiSeq 2500 platform in paired-end 150 bp and single-end 50 bp mode, respectively, at the Novogene Company in Beijing.

Identification of lncRNAs and their adjacent PCgenes

Low-quality raw sequencing reads were filtered out, and contaminating adaptors within the reads were trimmed, thereby producing clean reads for mapping to the rice reference genome (MSU7.0; http://rice.plantbiology.msu.edu/) with HISAT2 (version 2.1.0; no mismatches allowed) (Kim ). The transcriptome was assembled and transcripts were quantified by StringTie (version 1.3.4d) (Pertea ). GffCompare (version 0.11.4, http://ccb.jhu.edu/software/stringtie/gffcompare.shtml) was used to compare the assembled transcripts to the rice annotation profiles and generate a classification code for each transcript, including “i/u/x” coded transcripts (Zhao ). Based on previous definitions and characterizations of lncRNAs (Derrien ), transcripts that originated from existing genes were removed, although they were retained if they were located on the opposite strand. In addition, transcripts <200 nt in length, transcripts expressed in only one replicate of each genotype, and transcripts with TPM (Transcripts Per Million as calculated by StringTie) <1 were also removed. After these initial filtering steps, blastx was used to evaluate the similarity of candidate transcripts to annotated proteins in the rice genome (abbreviated as rice-proteins) and the uniref90 (https://www.uniprot.org/help/uniref/) protein database (e-value <0.001). Furthermore, minimap2 (with default parameters) and TransDecoder (e-value <0.001) were used to scan the Rfam (http://rfam.xfam.org/) and Pfam (http://pfam.xfam.org/) databases. Candidate transcripts with matches in the aforementioned databases were excluded. In addition, the potential coding ability of novel transcripts was estimated using the CPC2 (http://cpc2.cbi.pku.edu.cn/) and CNCI programs (Sun ), and novel transcripts with potential coding ability were also removed. A final list of candidate lncRNA transcripts identified from each genotype, with their originating genomic locations, was used for further analyses. 
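The initial length and expression filters described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the `Transcript` record is hypothetical, "expressed" is assumed to mean TPM > 0 in a replicate, and the TPM cutoff is assumed to apply to the mean TPM across the replicates of one genotype (the text does not specify either detail).

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Transcript:
    """Hypothetical record for one assembled transcript in one genotype."""
    name: str
    length_nt: int
    tpm_per_replicate: list[float]


MIN_LENGTH_NT = 200            # transcripts <200 nt are removed
MIN_TPM = 1.0                  # transcripts with TPM <1 are removed
MIN_EXPRESSED_REPLICATES = 2   # assumption: expressed in more than one replicate


def passes_initial_filters(t: Transcript) -> bool:
    """Apply the length, replicate-support, and TPM filters to one transcript."""
    expressed = sum(1 for tpm in t.tpm_per_replicate if tpm > 0)
    return (
        t.length_nt >= MIN_LENGTH_NT
        and expressed >= MIN_EXPRESSED_REPLICATES
        and mean(t.tpm_per_replicate) >= MIN_TPM  # assumption: mean over replicates
    )


# Made-up example values, for illustration only.
candidates = [
    Transcript("t1", 850, [2.1, 1.4, 3.0]),   # kept
    Transcript("t2", 150, [5.0, 6.0, 4.0]),   # too short
    Transcript("t3", 900, [4.0, 0.0, 0.0]),   # expressed in only one replicate
    Transcript("t4", 1200, [0.3, 0.5, 0.2]),  # mean TPM below 1
]
kept = [t.name for t in candidates if passes_initial_filters(t)]
```

The database-similarity and coding-potential steps (blastx, Rfam/Pfam scans, CPC2, CNCI) depend on external tools and are not reproduced here.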
Based on their genomic locations, lncRNAs were further classified into lincRNAs and lncNATs. LncRNAs that were located completely within intergenic regions of the rice genome and did not intersect with PCgenes were defined as lincRNAs. By contrast, lncRNAs that were situated on the opposite strand from protein-coding genes and intersected with PCgenes by more than one nucleotide were defined as lncNATs. For each lincRNA, the closest PCgene within ±5 kb of its genomic position was defined as its paired PCgene. For each lncNAT, the PCgene on the opposite strand with which it intersected by at least one base was defined as its paired PCgene.
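The classification and pairing rules above can be illustrated with a small interval sketch. The `Feature` type, the function names, and the coordinates are hypothetical; the overlap threshold is taken as at least one base, and distance is measured as the gap between the nearest ends of the two features, both of which are assumptions made for this illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class Feature:
    """Hypothetical genomic feature with 1-based inclusive coordinates."""
    name: str
    chrom: str
    start: int
    end: int
    strand: str


def overlap_bp(a: Feature, b: Feature) -> int:
    """Number of shared bases between two features (0 if none)."""
    if a.chrom != b.chrom:
        return 0
    return max(0, min(a.end, b.end) - max(a.start, b.start) + 1)


def gap_bp(a: Feature, b: Feature) -> int:
    """Bases separating two non-overlapping features on the same chromosome."""
    return max(a.start, b.start) - min(a.end, b.end) - 1


def classify(lnc: Feature, genes: Sequence[Feature],
             window: int = 5000) -> Tuple[Optional[str], Optional[str]]:
    """Return (class, paired PCgene name) for one candidate lncRNA."""
    antisense = [g for g in genes
                 if g.strand != lnc.strand and overlap_bp(lnc, g) >= 1]
    if antisense:
        partner = max(antisense, key=lambda g: overlap_bp(lnc, g))
        return "lncNAT", partner.name
    if any(overlap_bp(lnc, g) > 0 for g in genes):
        return None, None  # sense-strand overlap: not a lincRNA or lncNAT
    nearby = [g for g in genes
              if g.chrom == lnc.chrom and gap_bp(lnc, g) <= window]
    partner = min(nearby, key=lambda g: gap_bp(lnc, g), default=None)
    return "lincRNA", partner.name if partner else None


genes = [Feature("g1", "chr1", 1000, 2000, "+"),
         Feature("g2", "chr1", 10000, 12000, "-")]
nat = classify(Feature("a", "chr1", 1500, 2500, "-"), genes)
linc_paired = classify(Feature("b", "chr1", 4000, 4500, "+"), genes)
linc_unpaired = classify(Feature("c", "chr1", 20000, 20500, "+"), genes)
```

In this toy example, transcript `b` lies 1,999 bases from `g1` and 5,499 bases from `g2`, so only `g1` falls within the ±5 kb window.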

Experimental validation of lncRNA

Twenty lncRNAs randomly selected from mutant-specific lincRNAs and lncNATs and from common lincRNAs and lncNATs were validated by reverse transcription polymerase chain reaction (RT-PCR) followed by Sanger sequencing. In brief, reverse transcription of total RNAs extracted from each genotype was performed using TransScript One-Step gDNA Removal and cDNA Synthesis SuperMix (TransGen Biotech). Primer pairs were designed to specifically amplify the reverse-transcribed cDNA of the target lncRNAs using Primer Premier 5 software (Lalitha 2000) (Supplementary Table S1). After amplification and electrophoresis, the PCR products were collected, cloned, and sequenced by Sangon Biotech (Shanghai, China). To verify the differential expression of selected lncRNAs (DElncRNA; see details in the following sections) in OsMET1-2+/+ and OsMET1-2−/−, quantitative real-time PCR (qRT-PCR) was performed on 20 randomly selected common DElncRNAs: 10 lincRNAs and 10 lncNATs. All qRT-PCR primers were designed using an Integrated DNA Technologies online tool (https://sg.idtdna.com/scitools/Applications/RealTimePCR/; Supplementary Table S2). Reverse-transcribed cDNA from each biological replicate of each genotype was used as a template for individual qRT-PCR amplification to quantify the lncRNA expression level. The 2^−ΔΔCt method was used to estimate relative expression, and ACTIN was used as the internal control gene.
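For readers unfamiliar with the 2^−ΔΔCt calculation, a minimal sketch follows. The function and argument names are hypothetical and the Ct values are invented for illustration; the wild type is assumed to be the calibrator sample and ACTIN the reference gene, as stated in the text.

```python
def relative_expression(ct_target_sample: float, ct_reference_sample: float,
                        ct_target_calibrator: float,
                        ct_reference_calibrator: float) -> float:
    """Relative expression by the 2^-ddCt method.

    dCt  = Ct(target) - Ct(reference gene), computed per sample
    ddCt = dCt(sample) - dCt(calibrator)
    """
    delta_sample = ct_target_sample - ct_reference_sample
    delta_calibrator = ct_target_calibrator - ct_reference_calibrator
    return 2.0 ** -(delta_sample - delta_calibrator)


# Invented Ct values: lncRNA vs ACTIN in the mutant (sample) and wild type (calibrator).
fold = relative_expression(
    ct_target_sample=22.0, ct_reference_sample=18.0,          # dCt = 4
    ct_target_calibrator=25.0, ct_reference_calibrator=18.0,  # dCt = 7
)
# ddCt = 4 - 7 = -3, so the lncRNA is 2**3 = 8-fold higher in the sample.
```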

Differential expression of PCgenes and lncRNAs

To identify differentially expressed lncRNAs (DElncRNAs) and PCgenes (DEPCgenes) in OsMET1-2−/−, DESeq2 (Love ) was used to calculate their normalized expression values in RPKM (reads per kilobase per million mapped reads) and to assess their differential expression based on raw read counts. DElncRNAs and DEPCgenes were defined based on a twofold expression difference between the genotypes and a false discovery rate-adjusted P<0.05.
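The RPKM definition and the differential-expression cutoffs above can be written out as a short sketch. This is not DESeq2 code: the function names are hypothetical, the log2 fold change and adjusted P-value are assumed to come from the upstream statistical test, and the twofold cutoff is assumed to be inclusive.

```python
import math


def rpkm(read_count: int, transcript_length_bp: int, total_mapped_reads: int) -> float:
    """Reads per kilobase of transcript per million mapped reads."""
    return read_count * 1e9 / (transcript_length_bp * total_mapped_reads)


def is_differentially_expressed(log2_fold_change: float, adjusted_p: float,
                                min_fold: float = 2.0,
                                max_adjusted_p: float = 0.05) -> bool:
    """Twofold change in either direction plus FDR-adjusted P below the cutoff."""
    return (abs(log2_fold_change) >= math.log2(min_fold)
            and adjusted_p < max_adjusted_p)


# Invented numbers: 500 reads on a 2 kb transcript in a 25-million-read library.
example_rpkm = rpkm(500, 2000, 25_000_000)   # 500 / 2 kb / 25 M = 10 RPKM
calls = [
    is_differentially_expressed(1.3, 0.01),    # up, significant
    is_differentially_expressed(0.8, 0.001),   # below twofold
    is_differentially_expressed(-2.0, 0.2),    # not significant
    is_differentially_expressed(-1.0, 0.049),  # down, significant
]
```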

Small RNA data analysis

Raw small RNA sequencing data (merged from three biological replicates per genotype) were filtered by removing adaptor contamination and low-quality reads. Reads derived from rRNA, tRNA, and were removed using SILVA (https://www.arb-silva.de/), GtRNAdb (http://gtrnadb.ucsc.edu/), Rfam, and snoPY (http://snoopy.med.miyazaki-u.ac.jp/). All potential miRNA reads were identified using the miRDeep-P prediction tool (Yang and Li 2011) and by blastn searches against known pre-miRNAs in miRBase (version 22.1) (Kozomara ). After removing potential miRNA reads, the remaining small interfering RNA (siRNA) reads were used as input for subsequent analyses. All siRNAs were mapped to the rice reference genome (MSU7.0) using Bowtie1 (Langmead ). To compare siRNA abundance between the genotypes, the counts of mapped 21–24 nt siRNAs from each genotype were normalized into RPM values (reads per million mapped reads).
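The RPM normalization in the last step can be sketched as below. The counts and library size are invented, the function name is hypothetical, and the denominator is assumed to be the total number of mapped siRNA reads in the library for that genotype.

```python
def reads_per_million(count: int, total_mapped_reads: int) -> float:
    """Scale a raw read count to reads per million mapped reads."""
    return count * 1_000_000 / total_mapped_reads


# Invented counts of mapped siRNAs by size class (nt) for one genotype.
sirna_counts = {21: 1200, 22: 800, 23: 300, 24: 5000}
library_size = 10_000_000  # assumed total mapped siRNA reads

sirna_rpm = {size: reads_per_million(n, library_size)
             for size, n in sirna_counts.items()}
```

Normalizing each genotype by its own library size makes the 21–24 nt siRNA abundances comparable despite different sequencing depths.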

Analysis of whole genome bisulfite sequencing data

Whole genome bisulfite sequencing (WGBS) data from OsMET1-2+/+ and OsMET1-2−/− were published previously and have been deposited at NCBI under the accession no. SRP043447 (Hu ). We estimated context-specific DNA methylation profiles and differentially methylated regions (DMRs) as described in our previous studies (Hu , 2020). The weighted mean DNA methylation levels in CG, CHG, and CHH contexts within and around genomic regions that contained expressed lncRNAs, PCgenes, and TEs were calculated.
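The text does not spell out the weighted mean formula. A common definition, assumed here for illustration, divides the total number of methylated-cytosine reads by the total number of reads covering the cytosines of one context in a region, so that deeply covered sites carry more weight. The function name and read counts below are hypothetical.

```python
from typing import Sequence, Tuple


def weighted_methylation_level(sites: Sequence[Tuple[int, int]]) -> float:
    """Weighted methylation level of a region for one sequence context.

    Each site is (methylated_reads, total_reads) for one cytosine.
    Assumed definition: sum of methylated reads / sum of total reads.
    """
    methylated = sum(m for m, _ in sites)
    covered = sum(t for _, t in sites)
    return methylated / covered if covered else float("nan")


# Invented read counts for three CG sites in one region.
region_sites = [(9, 10), (1, 10), (0, 20)]
weighted = weighted_methylation_level(region_sites)                       # 10 / 40
unweighted = sum(m / t for m, t in region_sites) / len(region_sites)      # mean of site fractions
```

In this toy region the weighted level (0.25) is lower than the simple mean of per-site fractions (about 0.33), because the unmethylated site has the deepest coverage.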

Anchoring paralogs of DEPCgenes

Paralogous gene duplicates in the rice genome were downloaded from the Rice Genome Annotation Project (http://rice.plantbiology.msu.edu/). Paired PCgenes of DElncRNAs that had no paralogous duplicates were discarded. Only PCgenes whose paralogous duplicates were not neighboring PCgenes of any other lncRNA were retained.

Results

Genome-wide identification and characterization of lncRNAs in OsMET1-2+/+ and OsMET1-2−/−

Strand-specific RNA sequencing and a stringent prediction pipeline were used to identify lncRNAs in a homozygous mutant of OsMET1-2 (OsMET1-2−/−) and its isogenic wild type (OsMET1-2+/+). After the removal of low-quality raw reads, 378 and 380 million paired-end reads were obtained for OsMET1-2+/+ and OsMET1-2−/−, respectively, and were used as input for the prediction pipeline (Figure 1A). In brief, 81,842 and 139,425 transcripts were obtained from OsMET1-2+/+ and OsMET1-2−/− using HISAT2 and StringTie as mapping and assembly tools, respectively. Following the removal of unqualified transcripts similar to annotated genic transcripts, transcripts of unexpectedly short length, and transcripts with very low expression, 38,611 and 38,795 transcripts remained in OsMET1-2+/+ and OsMET1-2−/−, respectively. To ensure the non-coding features of the identified lncRNAs, a final filtration step was performed to exclude transcripts with known and predicted coding potential. Finally, 932 and 1104 lncRNAs were identified in OsMET1-2+/+ and OsMET1-2−/−, respectively (Figure 1A; Supplementary File S1).
Figure 1

Identification and characterization of long non-coding RNA (lncRNAs) in OsMET1-2+/+ and OsMET1-2−/−. (A) The workflow of lncRNA identification pipeline developed in this study. The parenthesized numbers in blue and red denote the respective number of reads or transcripts input into the following step. The frames in gradient colors specify the detailed database(s) and/or tools adopted in respective step. (B) The Venn diagrams tabulating the numbers of lincRNA and lncNAT shared (common) in OsMET1-2+/+ (blue) and OsMET1-2−/− (red) and specifically identified in respective sample (wild type and mutant specific). The exact number of lncRNAs in each category is listed beneath respective category name. (C) Proportions of lncRNA transcripts (lincRNAs and lncNATs) and the adjacent PCgenes in OsMET1-2+/+ and OsMET1-2−/− categorized in terms of the exon numbers. (D) Proportions of lncRNA transcripts (lincRNAs and lncNATs) and the adjacent PCgenes in OsMET1-2+/+ and OsMET1-2−/− categorized in terms of the transcript length. (E) Cumulative frequency curves of the transcript abundances of lincRNA, lncNAT, and PCgenes. The x-axis tabulates each transcript category with respective log2FC (fold change) of Reads Per Kilobase per Million mapped reads (RPKM); the y-axis tabulates the accumulative frequency after adding each transcript category.

The genomic locations and transcription directions of the lncRNAs relative to their nearest neighboring PCgenes were determined, and the lncRNAs were then categorized into long intergenic non-coding RNA (lincRNA) and long non-coding natural antisense transcript (lncNAT). As shown in the Venn diagrams (Figure 1B), the 932 lncRNAs in OsMET1-2+/+ consisted of 729 lincRNAs and 203 lncNATs, and the 1104 lncRNAs of OsMET1-2−/− consisted of 880 lincRNAs and 224 lncNATs. Most lincRNAs and lncNATs were shared by the two genotypes (719 common lincRNAs and 201 common lncNATs; Figure 1B). 
Only a limited number of genotype-specific lncRNAs were found in most categories (10 wild type-specific lincRNAs, 2 wild type-specific lncNATs, and 23 mutant-specific lncNATs; Figure 1B). By contrast, an exceptionally large number (161) of mutant-specific lincRNAs was identified (Figure 1B). Compared with their respective PCgenes, both types of lncRNAs usually contained fewer exons (most consisted of a single exon; Figure 1C), produced shorter transcripts (Figure 1D), and had lower expression levels (Figure 1E). These lncRNA characteristics are consistent with those reported in other plant species (Li ; Wang ; Xu ). RT-PCR and qRT-PCR analyses confirmed the existence of randomly selected lncRNAs and validated their relative expression levels, further verifying the accuracy of our lncRNA predictions (Supplementary Figures S1 and S2).
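As a quick consistency check, the counts reported for Figure 1B can be recombined in a few lines. The dictionary names are hypothetical; the numbers are those given in the text.

```python
# Counts reported in the text for Figure 1B.
common = {"lincRNA": 719, "lncNAT": 201}
wild_type_specific = {"lincRNA": 10, "lncNAT": 2}
mutant_specific = {"lincRNA": 161, "lncNAT": 23}

# Per-genotype totals are the shared set plus the genotype-specific set.
wild_type_total = {k: common[k] + wild_type_specific[k] for k in common}
mutant_total = {k: common[k] + mutant_specific[k] for k in common}

# Share of the mutant's lncRNAs that are mutant-specific, in percent.
mutant_specific_share = {
    k: round(100 * mutant_specific[k] / mutant_total[k], 2) for k in common
}
```

The totals recover the 932 wild-type and 1104 mutant lncRNAs, and roughly 18.3% of mutant lincRNAs versus 10.27% of mutant lncNATs are mutant-specific.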

Genomic regions that generated mutant-specific lncRNAs showed greater hypomethylation than those that generated common lncRNAs

In addition to the large number of common lncRNAs shared between OsMET1-2+/+ and OsMET1-2−/−, sets of lincRNAs (18.30%; 161/880) and lncNATs (10.27%; 23/224) were specifically expressed in OsMET1-2−/− (Figure 1B). In our previous study, the loss-of-function mutation of OsMET1-2 caused genome-wide CG and CHG hypomethylation (Hu ). To test for an association between novel lncRNA expression and CG and CHG hypomethylation, we compared the CG and CHG methylation patterns of genomic regions that expressed novel or common lncRNAs in OsMET1-2−/− with their corresponding regions in OsMET1-2+/+ (Figure 2A; Supplementary Figure S3). As expected, the overall CG and CHG methylation level of lncRNA genomic regions was lower in OsMET1-2−/− than in OsMET1-2+/+ for both common and mutant-specific lncRNAs (Figure 2A; Supplementary Figure S3A). Genomic regions that generated mutant-specific lncRNAs in OsMET1-2−/− had higher CG and CHG methylation levels in OsMET1-2+/+ than regions that generated common lncRNAs (Figure 2A). This difference was confirmed statistically by a random sampling method in which the CG and CHG methylation levels of regions that encoded mutant-specific lncRNAs in OsMET1-2+/+ were significantly higher than those of randomly sampled regions (Figure 2B; Supplementary Figure S3B). Furthermore, genomic regions that generated mutant-specific lncRNAs showed greater CG and CHG hypomethylation in OsMET1-2−/− than regions that generated common lncRNAs (Figure 2A; Supplementary Figure S3A).
Figure 2

Genomic regions of CG hypomethylation in OsMET1-2+/+ expressing mutant-specific lncRNAs after null-mutation of OsMET1-2 gene. (A) The boxplots depict the CG methylation levels of genomic regions (core body and up-/downstream 2 kb flanking regions) expressing common and mutant-specific lncRNAs (including lincRNA and lncNAT) in respective OsMET1-2+/+ and OsMET1-2−/−. Wilcoxon test is adopted to test the statistical significance for paired two sample sets. One asterisk (*), two asterisks (**), and three asterisks (***) denote the significant P-values at the levels of 0.05, 0.01, and 0.001, respectively. (B) Boxplots of weighted mean CG methylation levels of random bootstrap sampled genomic regions and genomic regions expressing common and mutant-specific lncRNAs (lincRNAs and lncNATs) in OsMET1-2+/+. Independent two-sample t-test is used, in which significance levels are also denoted at the same cutoff P-values as above. (C) Density curves of the percentages of random bootstrap sampled intergenic (left) and anti-sense genic regions (right) overlapping with DMRs and arrow-marked observed percentage of common and mutant-specific lncRNAs (lincRNAs and lncNATs) derived from the DMRs. Within respective bootstrapping test, we randomly re-sample 1000 sets of genomic regions, the number and length of which are identical with respective lncRNAs (lincRNAs and lncNATs). Within each re-sampled set of genomic regions, the proportion of regions overlapping with DMRs is calculated. Respective 1000 proportions are summarized in each density curve. The original observed proportion of lncRNA occurred in DMRs is denoted by the arrow and respective statistical P-value for each bootstrapping test is also specified nearby each arrow.

To obtain further support, we also calculated the numbers of common and mutant-specific lncRNAs that co-localized with CG and CHG DMRs in OsMET1-2−/− for each type of lncRNA (Figure 2C; Supplementary Figure S3C). 
Relative to the number of randomly bootstrap-sampled intergenic and anti-sense genic regions that overlapped with DMRs (i.e., the reference distribution), the mutant-specific lincRNAs and lncNATs occurred in CG DMRs at significantly higher frequencies than expected, but a similar result was not found for common lncRNAs (Figure 2C). However, the result for CHG DMRs was more complicated: both mutant-specific and common lincRNAs were statistically enriched in CHG DMRs (Supplementary Figure S3C), but mutant-specific lncNATs were not. These observations suggest a potential association between novel lncRNA expression and CG hypomethylation. Nonetheless, this study found no statistical evidence to support an association between novel lncRNA expression and CHG hypomethylation.
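The resampling test described for Figure 2C can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' code: the genome is treated as a single linear sequence, random regions are not restricted to intergenic or anti-sense genic space, and the one-sided empirical P-value with a +1 correction is an assumed convention. All names and coordinates are hypothetical.

```python
import random
from typing import List, Sequence, Tuple

Interval = Tuple[int, int]  # 1-based inclusive (start, end)


def overlaps_any(start: int, end: int, intervals: Sequence[Interval]) -> bool:
    """True if [start, end] shares at least one base with any interval."""
    return any(start <= i_end and i_start <= end for i_start, i_end in intervals)


def resampling_p_value(region_lengths: List[int], observed_fraction: float,
                       dmrs: Sequence[Interval], genome_length: int,
                       n_resamples: int = 1000, seed: int = 0) -> float:
    """Empirical P-value that random regions overlap DMRs as often as observed.

    Each resample draws one random region per lncRNA, matching its length,
    and records the fraction of regions that overlap a DMR.
    """
    rng = random.Random(seed)
    at_least_as_extreme = 0
    for _ in range(n_resamples):
        hits = 0
        for length in region_lengths:
            start = rng.randint(1, genome_length - length + 1)
            if overlaps_any(start, start + length - 1, dmrs):
                hits += 1
        if hits / len(region_lengths) >= observed_fraction:
            at_least_as_extreme += 1
    return (at_least_as_extreme + 1) / (n_resamples + 1)


# Toy data: two DMRs on a 1 Mb sequence and twenty 500-bp lncRNA regions.
toy_dmrs = [(1000, 2000), (50000, 51000)]
toy_lengths = [500] * 20
p_enriched = resampling_p_value(toy_lengths, 0.9, toy_dmrs, 1_000_000)
p_not_enriched = resampling_p_value(toy_lengths, 0.0, toy_dmrs, 1_000_000)
```

With these toy inputs, an observed overlap fraction of 0.9 is never matched by chance, giving the smallest attainable P-value of 1/1001, whereas an observed fraction of 0 is always matched.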

TE-derived lncRNAs were de-repressed in OsMET1-2−/−

Genomic features that generated lincRNAs and lncNATs in both OsMET1-2+/+ and OsMET1-2−/− were further characterized. First, the two types of lncRNAs were categorized into four groups based on their locations in genic/intergenic regions with/without TEs (Supplementary Figure S4; the lack of coding ability of autonomous TE-related lncRNAs was confirmed by checking their incomplete ORFs; see Materials and methods section). Relative to mRNA regions (separated into 5′ UTR, CDS, and 3′ UTR), significantly more lncRNAs (especially lincRNAs) were generated by genomic regions associated with TEs (genic and intergenic TEs) in both OsMET1-2+/+ and OsMET1-2−/− (Figure 3A).
Figure 3

LncRNA and mRNA transcripts generated by TEs in OsMET1-2+/+ and OsMET1-2−/−. (A) Proportions of lncRNA transcripts (lincRNA and lncNAT) and genomic mRNA with at least one exon overlapping with TEs (at least 10 bp). (B) Proportions of common and mutant-specific lncRNAs (lincRNAs and lncNATs) overlapping with respective type of TEs (at least 10 bp). The parenthesized number denotes the total number of respective TE type in the genome.

Next, the proportions of common and mutant-specific lncRNAs expressed by specific TE types were summarized (Figure 3B). Overall, more lncRNAs were generated by Type II transposons (DNA transposons) than by Type I transposons (retro-transposons) for both common and mutant-specific lncRNAs (Figure 3B). In addition, mutant-specific lincRNAs and lncNATs were more highly expressed than common lncRNAs (Figure 3B). Notably, 49.69% of the mutant-specific lincRNAs were generated by the En/Spm DNA transposon family, significantly higher than the corresponding percentage of common lincRNAs (12.40%) (Chi-square test, P<0.001) (Figure 3B). Although miniature inverted-repeat TEs (MITEs) were the most abundant TE type in the rice genome (Figure 3B), MITEs did not generate significantly more mutant-specific lncNATs than common lncNATs (Chi-square test, P=0.09). Furthermore, detailed characterization of DNA methylation (in CG, CHG, and CHH contexts) of all TE types in OsMET1-2+/+ revealed that En/Spm harbored higher CG methylation levels than other DNA TEs (Table 1). Taken together, these results imply that CG-methylated TE types (e.g., En/Spm) may be more likely to be de-repressed and to express lncRNAs in OsMET1-2−/−.
Table 1

The weighted mean cytosine DNA methylation levels of protein coding genes, TE-related genes, all TE types, and each specific type of TEs in OsMET1-2+/+ and OsMET1-2−/−

Category                              CG                                                 CHG                                                CHH
                                      OsMET1-2+/+ (%)  OsMET1-2−/− (%)  Decreased (%)    OsMET1-2+/+ (%)  OsMET1-2−/− (%)  Decreased (%)    OsMET1-2+/+ (%)  OsMET1-2−/− (%)  Decreased (%)
Protein coding genes                  26.40            3.30             −87.60           9.20             6.70             −26.50           2.10             0.70             −65.50
TE-related genes                      85.40            18.40            −78.40           64.70            51.20            −20.90           4.90             2.20             −53.90
Total repeats                         83.60            18.30            −78.10           54.10            43.80            −19.10           26.30            10.70            −59.50
Retrotransposons (Class I/retro TE)   89.60            21.10            −76.50           65.00            51.00            −21.50           10.20            6.10             −40.60
  Copia                               87.50            19.30            −78.00           61.00            47.00            −23.00           7.60             4.90             −36.10
  Gypsy                               90.80            21.70            −76.10           68.70            53.70            −21.80           7.00             5.50             −20.60
  LTR-other                           84.90            23.30            −72.50           59.30            48.80            −17.70           10.90            9.50             −12.90
  Cassandra                           94.40            28.90            −69.40           74.20            60.10            −19.00           20.50            13.10            −36.20
  Caulimovirus                        94.60            24.90            −73.70           81.70            73.90            −9.60            3.30             4.30             29.10
  LINE                                82.70            17.20            −79.20           61.20            55.30            −9.70            4.60             2.50             −45.60
  SINE                                87.30            18.90            −78.40           54.60            42.90            −21.40           23.90            7.90             −66.90
Transposons (Class II/DNA TE)         78.20            16.60            −78.80           47.20            38.40            −18.50           22.00            9.20             −58.30
  En/Spm                              90.50            19.00            −79.00           54.10            37.00            −31.50           10.00            10.20            2.70
  MITEs                               83.40            18.00            −78.50           52.90            43.20            −18.30           33.80            12.90            −61.80
  hAT                                 79.60            14.50            −81.80           37.50            23.10            −38.30           12.30            5.20             −57.80
  Harbinger                           80.90            17.50            −78.40           53.10            46.30            −12.90           30.20            12.80            −57.70
  Stowaway                            77.20            17.20            −77.70           45.70            37.40            −18.10           25.40            9.20             −63.50
  Tourist                             79.40            18.40            −76.80           50.30            44.60            −11.40           24.50            9.50             −61.10
  MuDR                                87.50            21.10            −75.90           53.50            47.10            −12.00           16.30            6.10             −62.80
  DNA-other                           59.40            10.10            −83.00           34.60            29.40            −14.90           14.40            5.90             −59.30

Within each category, the proportional change in DNA methylation level (in CG, CHG, and CHH contexts) in OsMET1-2−/− relative to the respective level in OsMET1-2+/+ is recorded as “Increase or Decreased (%),” which is calculated as (OsMET1-2−/− − OsMET1-2+/+)/OsMET1-2+/+; negative values denote a decrease.
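The footnote formula can be expressed as a one-line function. The function name is hypothetical, and the inputs below are the rounded values printed in Table 1; because the published percentages were presumably computed from unrounded levels (an assumption), recomputed values can differ slightly from the table.

```python
def relative_change_percent(wild_type_level: float, mutant_level: float) -> float:
    """(mutant - wild type) / wild type, in percent; negative means a decrease."""
    return (mutant_level - wild_type_level) / wild_type_level * 100


# CG methylation of protein coding genes: 26.40% in wild type, 3.30% in the mutant.
cg_genes = relative_change_percent(26.40, 3.30)    # about -87.5 (Table 1 lists -87.60)

# CHH methylation of En/Spm: 10.00% in wild type, 10.20% in the mutant.
chh_enspm = relative_change_percent(10.00, 10.20)  # about +2.0 (Table 1 lists 2.70)
```

The positive value for En/Spm in the CHH context marks it as one of the few categories in which methylation increased rather than decreased in the mutant.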


RdDM-mediated CHH hypermethylation in the 5′-upstream regions of transcriptionally upregulated lincRNAs in OsMET1-2−/−

To examine the link between DNA methylation and lncRNA expression in different contexts (i.e., CG, CHG, and CHH), genomic regions that contained differentially expressed lncRNA (DElncRNA [lincRNA and lncNAT]) transcripts and their ±2 kb upstream and downstream regulatory regions were examined in OsMET1-2+/+ and OsMET1-2−/− (Figure 4A; Supplementary Figures S5 and S6). Genomic regions with statistically significantly upregulated and downregulated lncRNAs and with common and mutant-specific lncRNAs were considered separately (Figure 4A; Supplementary Figures S5 and S6).
Figure 4

Weighted mean CHH DNA methylation and siRNA abundance (Log2 transformed) of genomic regions (lincRNA bodies and their up-/downstream [±2 kb] regulatory regions) expressing common, mutant-specific, and differentially up- and downregulated lincRNA in OsMET1-2+/+ and OsMET1-2−/−. (A) Weighted mean CHH DNA methylation and siRNA abundance of genomic regions expressing respective featured lincRNAs. (B) Weighted mean CHH DNA methylation and siRNA abundance of genomic regions expressing En/Spm-derived featured lincRNAs. The gray blocks denote the 5′-upstream (∼250 bp upstream of the transcription start site) regulatory regions with co-localization of hypermethylated CHH and abundant siRNAs.

For DNA methylation in CG and CHG contexts, all lncRNA-related genomic regions were consistently hypomethylated in OsMET1-2−/−, and there were no region-specific DNA methylation changes (Supplementary Figures S5 and S6). The genomic regions that generated lincRNAs exhibited CHH hypomethylation in OsMET1-2−/− (Figure 4A). Specifically, CHH hypomethylation occurred in genomic regions that generated downregulated common lincRNAs and lncNATs (Figure 4A; Supplementary Figure S6). By contrast, in genomic regions that generated upregulated common and mutant-specific lincRNAs, CHH sites were hypermethylated in the 5′-upstream regulatory regions (∼250 bp) adjacent to transcription start sites in OsMET1-2−/− (Figure 4A). This phenomenon was not observed in regions that generated lncNATs (Supplementary Figure S6). Considering the important role of siRNAs in the establishment of CHH methylation by the RdDM pathway (Matzke and Mosher 2014), we sought to test whether these CHH-hypermethylated 5′-upstream regions were targeted by siRNAs. As expected, our small RNA sequencing and mapping results revealed significantly enriched siRNAs that co-localized with the hypermethylated regions associated with upregulated common and mutant-specific lincRNAs (Figure 4A).
Among the mutant-specific lincRNAs, 61.49% (99/161) displayed CHH hypermethylation in their 5′-upstream region, 74.53% (120/161) harbored enriched siRNAs in their 5′-upstream region, and 52.17% (84/161) exhibited concomitant CHH hypermethylation and abundant siRNAs in their 5′-upstream regions. Such high proportions were not observed for non-differentially expressed lincRNAs (hyper mCHH 32.99%, 193/585; abundant siRNAs 39.15%, 229/585; concomitant hyper mCHH and abundant siRNAs 14.19%, 83/585). Given our previous finding of En/Spm enrichment in mutant-specific lincRNAs (Figure 3B), we also characterized the weighted mean CHH methylation levels of En/Spm genomic regions that expressed upregulated common and mutant-specific lincRNA transcripts. Concomitant CHH hypermethylation and siRNA abundance was once again observed 5′-upstream of En/Spm genomic regions that expressed upregulated common and mutant-specific lincRNAs (Figure 4B). This observation was also supported by the compensatory CHH methylation that occurred specifically in En/Spm TEs after the null mutation of the OsMET1-2 gene (Table 1). Together, these results indicate that RdDM can produce compensatory CHH methylation within the 5′-upstream regulatory genomic regions (especially in the En/Spm TE regions) of transcriptionally upregulated lincRNAs in OsMET1-2−/−.
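The proportions quoted above can be recomputed directly from the reported counts. The sketch below does only that, plus a simple ratio of the two "concomitant" proportions; that ratio is an illustrative summary and is not a statistic reported in the paper. The dictionary keys are hypothetical labels.

```python
# Counts as reported in the text: lincRNAs showing each feature in their
# 5'-upstream region, out of the total number of lincRNAs in the group.
counts = {
    "mutant_specific": {"total": 161, "hyper_mCHH": 99, "siRNA": 120, "both": 84},
    "non_DE": {"total": 585, "hyper_mCHH": 193, "siRNA": 229, "both": 83},
}


def percent(k: int, n: int) -> float:
    """Percentage k/n rounded to two decimals, as in the text."""
    return round(100.0 * k / n, 2)


percentages = {
    group: {
        feature: percent(value, group_counts["total"])
        for feature, value in group_counts.items()
        if feature != "total"
    }
    for group, group_counts in counts.items()
}

# Ratio of the two "concomitant hyper mCHH and abundant siRNAs" proportions
# (illustrative only; roughly 3.7-fold).
fold_difference = (84 / 161) / (83 / 585)

for group, values in percentages.items():
    print(group, values)
print(round(fold_difference, 2))
```

A formal comparison of the two groups would need a proper test (for example, a contingency-table test), which the text does not describe for these counts.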

Expression of cis-acting lincRNAs is positively correlated with that of their paired PCgenes

Our earlier study reported extensive differential PCgene expression in OsMET1-2−/− relative to OsMET1-2+/+ (Hu ). Based on the DElncRNAs in the same sample set, it was possible to explore potential cis-regulatory effects of lncRNAs on the expression of their neighboring PCgenes. Specifically, we characterized the correlation between expression fold changes of DElncRNAs (including both common and mutant-specific lincRNAs and lncNATs) and those of their corresponding differentially expressed PCgenes (DEPCgenes) (Figure 5). To exclude confounding effects from other factors (including adjacent TEs and local differential methylation) that may have mediated an indirect correlation, we categorized the lncRNAs into four subgroups based on their locations relative to genomic TEs and CG DMRs. Subsequently, we calculated Pearson’s correlations and corresponding P-values for each subgroup of lincRNAs and lncNATs (Figure 5C). After excluding the effects of adjacent TEs and CG DMRs associated with the null mutation of the OsMET1-2 gene, the fold changes of DElincRNA expression in OsMET1-2−/− relative to OsMET1-2+/+ were significantly correlated with those of their corresponding DEPCgenes (n = 184; Pearson’s correlation = 0.604, P < 0.001; Figure 5, A and C). There was no significant correlation between the fold changes of DElncNATs and those of their corresponding DEPCgenes (Figure 5, B and C). PCgenes paired with DElincRNAs are enriched in arabinan/xylan catabolic processes and sodium ion transmembrane transport. Both arabinan and xylan are abundant in plant cell walls (Verhertbruggen ; Grantham ). These enrichments suggest that the correlation between lincRNA and PCgene expression may be involved in the abnormal growth of the mutant.
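The correlation analysis described above can be sketched in a few lines. This is a minimal illustration assuming paired log2 fold changes are already available; the input values are hypothetical, and the authors' actual software and P-value procedure are not stated in this section. The t statistic uses the standard relation t = r·sqrt((n − 2)/(1 − r²)).

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 3:
        raise ValueError("need two sequences of equal length (at least 3 values)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic (n - 2 degrees of freedom) for testing r against zero."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# Hypothetical log2 fold changes (mutant vs wild type) for lincRNA/PCgene pairs.
linc_log2fc = [1.2, -0.8, 2.5, 0.3, -1.9, 3.1]
pcgene_log2fc = [0.9, -0.2, 1.8, 0.6, -1.1, 2.0]
print(pearson_r(linc_log2fc, pcgene_log2fc))

# Plugging in the reported values (r = 0.604, n = 184) gives t of about 10.2.
print(round(t_statistic(0.604, 184), 1))
```

A t value of about 10.2 with 182 degrees of freedom lies far in the tail of the t distribution, which is in line with the reported P < 0.001.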
Figure 5

Expression of cis-acting lncRNAs is positively correlated with that of their neighboring PCgenes. (A) Scatter plot illustrating the positive correlation between the fold changes of DElincRNAs (differential expression of lincRNAs in OsMET1-2−/− vs OsMET1-2+/+; log2 transformed on the x-axis) and those of the respective DEPCgenes (differential expression of lincRNA-related PCgenes in OsMET1-2−/− vs OsMET1-2+/+; log2 transformed on the y-axis). The detailed Pearson’s correlation coefficients and their statistical significance are tabulated in panel C. (B) Scatter plot illustrating the absence of correlation between the fold changes of DElncNATs (differential expression of lncNATs in OsMET1-2−/− vs OsMET1-2+/+; log2 transformed on the x-axis) and those of the respective DEPCgenes (differential expression of lncNAT-related PCgenes in OsMET1-2−/− vs OsMET1-2+/+; log2 transformed on the y-axis). The detailed Pearson’s correlation coefficients and their statistical significance are tabulated in panel C. (C) LincRNA and lncNAT subgroups are categorized by their positions relative to TEs and CG DMRs: circles denote lncRNAs co-localizing with both TEs and CG DMRs; squares denote lncRNAs co-localizing only with TEs; diamonds denote lncRNAs co-localizing only with CG DMRs; and triangles denote lncRNAs co-localizing with neither TEs nor CG DMRs. Pearson’s correlation is calculated for paired lncRNAs and PCgenes in each subgroup. Three asterisks (***) represent P-values significant at the 0.001 level; raw non-significant P-values (>0.05) are specified.

To further verify the potential positive correlation between the expression of cis-acting lincRNAs and that of their paired PCgenes, another two groups of PCgenes were selected as negative controls. One included paralogs of the lincRNA-related PCgenes (see Materials and methods section), and the other included randomly selected rice genes.
If the expression of lincRNAs was positively correlated with that of their PCgenes, such a positive correlation should be present between lincRNAs and their PCgenes but absent in the two negative control groups. This hypothesis was tested using the same method described above (Figure 5), and a significant correlation was found only between the cis-acting lincRNAs and their corresponding paired PCgenes (Figure 6).
Figure 6

Scatter plots illustrating that cis-acting lincRNAs are positively correlated with the expression of their neighboring PCgenes, but not with that of the paralogs of those PCgenes or of randomly selected PCgenes. (A) Fold changes of DElincRNAs (differential expression of lincRNAs in OsMET1-2−/− vs OsMET1-2+/+; log2 transformed on the x-axis) plotted against those of their DEPCgenes and of the paralogs of those DEPCgenes (differential expression of lincRNA-related PCgenes and their paralogs in OsMET1-2−/− vs OsMET1-2+/+; log2 transformed on the y-axis); a positive correlation is detected only with the DEPCgenes. No corresponding correlation is detected between DElncNATs and their DEPCgenes or the respective paralogs of the DEPCgenes. The detailed Pearson’s correlation coefficients and their statistical significance are tabulated in panel C. (B) No significant correlation is detected between the lncRNAs and randomly selected PCgenes. Detailed Pearson’s correlation coefficients and categories are tabulated in panel C. (C) Pearson’s correlation coefficients between the fold changes of DElncRNAs (differential expression of lincRNAs and lncNATs in OsMET1-2−/− vs OsMET1-2+/+) and those of DEPCgenes, paralogs of the respective DEPCgenes (differential expression of lincRNA- and lncNAT-related PCgenes and their paralogs in OsMET1-2−/− vs OsMET1-2+/+), and randomly selected PCgenes are tabulated with the corresponding P-values.
LincRNA and lncNAT subgroups are categorized by their pairing with DEPCgenes, paralogs of the respective DEPCgenes, or randomly selected PCgenes: circles denote lincRNAs paired with their respective DEPCgenes; squares denote lincRNAs paired with the paralogs of their respective DEPCgenes; diamonds denote lncNATs paired with their respective DEPCgenes; triangles denote lncNATs paired with the paralogs of their respective DEPCgenes; crosses denote lincRNAs paired with randomly selected PCgenes; and pentagons denote lncNATs paired with randomly selected PCgenes. Pearson’s correlation is calculated for each subgroup. Two asterisks (**) represent P-values significant at the 0.01 level; raw non-significant P-values (>0.05) are specified.


Discussion

High-throughput sequencing technology has enabled researchers to characterize a large number of lncRNAs from various eukaryotic species (Kyriakou ; Wang b; Akay ). Major questions about lncRNA composition, biogenesis, tissue-specific expression, function, and association with epigenetic modifications have been explored and mostly answered in plant species (Liu ; Wang ; Hu ). Nonetheless, little evidence exists for the participation of context-specific DNA methylation in the regulation of plant lncRNA expression (Wang a; Xu ; Chen ). We therefore characterized and compared lncRNA expression (lincRNAs and lncNATs) between wild-type rice (OsMET1-2+/+) and its homozygous mutant OsMET1-2−/−, in which CG methylation has been dramatically reduced by null mutation of the OsMET1-2 gene (Hu ). In addition to clarifying the elusive relationship between CG methylation and lncRNA expression, we also demonstrated the involvement of CHH methylation in the regulation of lncRNA expression. Notably, compared with the OsDDM1 mutant, which exhibits a simultaneous decrease in CG and CHG methylation (Tan ), the limited CHG methylation variation in our rice OsMET1-2−/− mutant allows us to exclude potential confounding effects of CHG methylation from our association analyses.

Use of the wild type OsMET1-2+/+ and its OsMET1-2−/− mutant enabled us to provide strong evidence for the regulation of lncRNA expression by CG methylation: the heavily CG-methylated regions in OsMET1-2+/+ were induced to express novel mutant-specific lncRNAs in OsMET1-2−/− (Figure 2). Given that the CG methylation level was higher in TE regions than in genic regions (Table 1) (Feng ), we hypothesized that the novel mutant-specific lncRNAs may have originated from TE-rich regions. To test this hypothesis, we investigated the composition of genomic regions that generated mutant-specific lncRNAs.
A specific group of DNA transposons, the En/Spm DNA transposons, expressed more mutant-specific lncRNAs after the erasure of CG methylation in OsMET1-2−/− (Figure 3). It should be emphasized that the role of CHG methylation in the regulation of lncRNA expression remains ambiguous in the current study system. Future investigation in other mutants with abolished CHG methylation (e.g., the cmt3 mutant) could provide additional insight.

Another intriguing question arises: why does this specific type of TE promote the active expression of lncRNAs in response to the removal of CG methylation? Given the smaller number of En/Spm transposons relative to other TE types in the rice genome (Figure 3B), the contribution of En/Spm transposons to lncRNA transcription does not correlate with their genomic abundance. This suggests that active lincRNA expression from En/Spm transposons must be determined by other intrinsic properties. Although both En/Spm transposons and MITEs are enriched in intergenic regions (Ouyang and Buell 2004), significant mutant-specific lincRNA expression derives from En/Spm transposons but not from MITEs, implying that a biased distribution within intergenic regions is not the intrinsic factor either. Given the marked decrease in CG methylation in regions expressing mutant-specific En/Spm transposons in OsMET1-2−/− (79.00%, Table 1; Figure 2, B and C), greater erasure of CG methylation from En/Spm transposons than from other TE types may be one relevant intrinsic factor. However, SINE retrotransposons exhibited a degree of CG methylation erasure similar to that of En/Spm transposons (79.20%; Table 1), but they did not express more mutant-specific lincRNAs in OsMET1-2−/−. This suggests that other unknown intrinsic features of En/Spm transposons and/or other regulatory process(es) involved in their de-repression must influence mutant-specific lncRNA expression after the null mutation of OsMET1-2.
In addition to the previously reported co-localization of TEs with expressed lncRNAs in rice and other plant species (Wang a; Yan ), this study provides a clear example of the direct negative regulation of lncRNA expression by CG methylation of TEs in a monocot species.

In addition to enriched CG methylation, CHH methylation established by siRNAs through the RdDM pathway is another prominent epigenetic feature of plant intergenic TE regions (Xu ; Yan ). As previously reported (Hu ) and also illustrated in our study (Table 1; Figure 4), a decrease in CHH methylation within the bodies and regulatory regions of most TEs is accompanied by the erasure of CG methylation. However, an exceptional contrasting case is the compensatory increase in CHH methylation in the En/Spm transposons (2.70%; Table 1). The prima facie coincidence of lincRNA expression and compensatory CHH methylation in the same group of En/Spm transposons after null mutation is supported by the observed co-occurrence of siRNA enrichment and increased CHH methylation in the 5′-upstream regulatory regions of mutant-specific and upregulated common lincRNA transcripts (Figure 4). Our observations suggest that together with CG methylation, CHH methylation mediated by the RdDM pathway is also involved in regulating lncRNA expression, especially for lincRNAs. However, in contrast to the clear negative effects of CG methylation on lncRNA expression discussed above, the potential role of compensatory CHH methylation remains unclear. According to the canonical view of the silencing effects of CHH methylation on TE transcription (Matzke and Mosher 2014), the CHH hypermethylation we observed in lincRNA regulatory regions could compensate for the absence of inhibitory CG methylation by silencing TE transcription.
Such a prediction is consistent with the previously reported association between 5′-upstream CHH methylation and the expression of downstream neighboring PCgenes in other plant species (Gent ; Li ; Secco ). However, based on the recent recognition of RdDM-mediated CHH methylation as a signal that recruits certain transcriptional anti-silencers (Harris ), another possible scenario is that CHH methylation around the intergenic TE regions may counteract the repressive effects of CG methylation on lncRNA expression. Comparisons of lncRNA profiles from additional RdDM rice mutants will be necessary to determine whether intergenic lncRNA expression increases (supporting the former “collaborative negative model”) or decreases (supporting the latter “counteracting active model”) when the RdDM pathway is abolished. The exact role of CHH methylation in the regulation of lncRNA expression will then become clear.

LncRNAs have been reported to regulate the expression of both neighboring (cis) and distal (trans) PCgenes in animal models (Pauli ; Casero ; Zhu ). As in some other plant model species (Huang ; Xu ), cis-acting lincRNAs exhibited positive correlations with their neighboring PCgenes in our rice materials. Given the abnormal phenotypes of OsMET1-2−/− (Hu ), it will be interesting to construct lncRNA and/or PCgene mutants with which to characterize the specific functions of lncRNAs in the regulation of PCgene expression and to identify their potential roles in underpinning the observed phenotypes. As in other plant studies (Li ; Li ; Huang ; Gao ), the potential trans-acting functions of lncRNAs in the regulation of gene expression at independent or distant loci were not explored in this study. Any potential trans-action of lncRNAs on their partners, any possible physical interactions between them, and any effects of DNA methylation on these processes deserve further detailed exploration.

Data availability

The non-coding RNA sequencing data and small RNA sequencing data have been deposited in the NCBI and are available under accession PRJNA629903. LncRNA (lincRNA and lncNAT) profiles with information about location and coding ability are available in Supplementary File S1. Supplementary material is available at figshare: https://doi.org/10.25387/g3.14034515.
  62 in total

1.  Conservation and divergence of methylation patterning in plants and animals.

Authors:  Suhua Feng; Shawn J Cokus; Xiaoyu Zhang; Pao-Yang Chen; Magnolia Bostick; Mary G Goll; Jonathan Hetzel; Jayati Jain; Steven H Strauss; Marnie E Halpern; Chinweike Ukomadu; Kirsten C Sadler; Sriharsa Pradhan; Matteo Pellegrini; Steven E Jacobsen
Journal:  Proc Natl Acad Sci U S A       Date:  2010-04-15       Impact factor: 11.205

2.  Mouse transcriptome: neutral evolution of 'non-coding' complementary DNAs.

Authors:  Jun Wang; Jianguo Zhang; Hongkun Zheng; Jun Li; Dongyuan Liu; Heng Li; Ram Samudrala; Jun Yu; Gane Ka-Shu Wong
Journal:  Nature       Date:  2004-10-14       Impact factor: 49.962

Review 3.  Towards a complete map of the human long non-coding RNA transcriptome.

Authors:  Barbara Uszczynska-Ratajczak; Julien Lagarde; Adam Frankish; Roderic Guigó; Rory Johnson
Journal:  Nat Rev Genet       Date:  2018-09       Impact factor: 53.242

4.  HISAT: a fast spliced aligner with low memory requirements.

Authors:  Daehwan Kim; Ben Langmead; Steven L Salzberg
Journal:  Nat Methods       Date:  2015-03-09       Impact factor: 28.547

Review 5.  Long Noncoding RNA and Cancer: A New Paradigm.

Authors:  Arunoday Bhan; Milad Soleimani; Subhrangsu S Mandal
Journal:  Cancer Res       Date:  2017-07-12       Impact factor: 12.701

6.  Genome-wide analysis uncovers regulation of long intergenic noncoding RNAs in Arabidopsis.

Authors:  Jun Liu; Choonkyun Jung; Jun Xu; Huan Wang; Shulin Deng; Lucia Bernad; Catalina Arenas-Huertero; Nam-Hai Chua
Journal:  Plant Cell       Date:  2012-11-06       Impact factor: 11.277

7.  Stress induced gene expression drives transient DNA methylation changes at adjacent repetitive elements.

Authors:  David Secco; Chuang Wang; Huixia Shou; Matthew D Schultz; Serge Chiarenza; Laurent Nussaume; Joseph R Ecker; James Whelan; Ryan Lister
Journal:  Elife       Date:  2015-07-21       Impact factor: 8.140

8.  Systematic identification and characterization of cardiac long intergenic noncoding RNAs in zebrafish.

Authors:  Lei Wang; Xiao Ma; Xiaolei Xu; Yuji Zhang
Journal:  Sci Rep       Date:  2017-04-28       Impact factor: 4.379

9.  Utilizing sequence intrinsic composition to classify protein-coding and long non-coding transcripts.

Authors:  Liang Sun; Haitao Luo; Dechao Bu; Guoguang Zhao; Kuntao Yu; Changhai Zhang; Yuanning Liu; Runsheng Chen; Yi Zhao
Journal:  Nucleic Acids Res       Date:  2013-07-27       Impact factor: 16.971

10.  Genome-wide discovery and characterization of maize long non-coding RNAs.

Authors:  Lin Li; Steven R Eichten; Rena Shimizu; Katherine Petsch; Cheng-Ting Yeh; Wei Wu; Antony M Chettoor; Scott A Givan; Rex A Cole; John E Fowler; Matthew M S Evans; Michael J Scanlon; Jianming Yu; Patrick S Schnable; Marja C P Timmermans; Nathan M Springer; Gary J Muehlbauer
Journal:  Genome Biol       Date:  2014-02-27       Impact factor: 13.583
