Weimin Zhao1, Yulian Mu1, Lei Ma1, Chen Wang1, Zhonglin Tang1, Shulin Yang1, Rong Zhou1, Xiaoju Hu2, Meng-Hua Li3, Kui Li1. 1. The State Key Laboratory for Animal Nutrition, Institute of Animal Science, Chinese Academy of Agricultural Sciences, Beijing 100193, China. 2. 1] CAS Key Laboratory of Animal Ecology and Conservation Biology, Institute of Zoology, Chinese Academy of Sciences (CAS), Beijing 100101, China [2] University of Chinese Academy of Sciences, Beijing 100049, China. 3. CAS Key Laboratory of Animal Ecology and Conservation Biology, Institute of Zoology, Chinese Academy of Sciences (CAS), Beijing 100101, China.
Abstract
Long intergenic non-coding RNAs (lincRNAs) play important roles in many cellular processes. Here, we present the first systematic identification and characterization of lincRNAs in fetal porcine skeletal muscle. We obtained a total of 55.02 million 90-bp paired-end reads and assembled 54,550 transcripts using cufflinks. We developed a pipeline to identify 570 multi-exon lincRNAs by integrating a set of previous approaches. These putative porcine lincRNAs share many characteristics with mammalian lincRNAs, such as a relatively short length, small number of exons and low level of sequence conservation. We found that the porcine lincRNAs were preferentially located near genes mediating transcriptional regulation rather than those with developmental functions. We further experimentally analyzed the features of a conserved mouse lincRNA gene and found that isoforms 1 and 4 of this lincRNA were enriched in the cell nucleus and were associated with polycomb repressive complex 2 (PRC2). Our results provide a catalog of fetal porcine lincRNAs for further experimental investigation of the functions of these genes in the skeletal muscle developmental process.
Long intergenic non-coding RNAs (lincRNAs) play important roles in many cellular processes. Here, we present the first systematic identification and characterization of lincRNAs in fetal porcine skeletal muscle. We obtained a total of 55.02 million 90-bp paired-end reads and assembled 54,550 transcripts using cufflinks. We developed a pipeline to identify 570 multi-exon lincRNAs by integrating a set of previous approaches. These putative porcine lincRNAs share many characteristics with mammalian lincRNAs, such as a relatively short length, small number of exons and low level of sequence conservation. We found that the porcine lincRNAs were preferentially located near genes mediating transcriptional regulation rather than those with developmental functions. We further experimentally analyzed the features of a conserved mouse lincRNA gene and found that isoforms 1 and 4 of this lincRNA were enriched in the cell nucleus and were associated with polycomb repressive complex 2 (PRC2). Our results provide a catalog of fetal porcine lincRNAs for further experimental investigation of the functions of these genes in the skeletal muscle developmental process.
In mammals, a large proportion of the genome is composed of intergenic regions, yet little was known about the transcription of these regions at the time of completion of the human genome1. Several studies in the past decade, however, have revealed that most of these regions may represent novel transcribed regions234, where transcripts longer than 200 nucleotides in length are localized. Much of this newly discovered major class of transcripts has very weak or no protein-coding potential and, thus, was defined as long intergenic non-coding RNAs (lincRNAs)5. Since the identification of the first two imprinted lincRNAs, H19 and Xist (X inactive specific transcript), in the early 1990s67, lincRNAs have emerged as an exciting new molecules with potential roles in a variety of cellular processes, including gene regulation89, X-chromosome inactivation71011, reprogramming12, pluripotency maintenance13, embryonic development14 and paraspeckle formation1516.In recent decades, the main goal of pig breeding has been to improve the pig growth rate and muscularity17. Several studies have indicated that postnatal skeletal muscle growth is largely affected by prenatal skeletal muscle development1819. Therefore, understanding the network dynamics of the muscle transcriptome during earlier fetal stages will be of great importance in unraveling the complex mechanism underlying muscle development. The vast majority of transcripts behave as non-coding RNAs (ncRNAs)202122, which include a large portion of lincRNAs2324. However, previous studies on porcine fetal skeletal muscle have primarily focused on protein-coding genes25262728 and miRNAs29303132 instead of lincRNAs; consequently, transcriptional information for skeletal muscle growth in swine is incomplete. Of note, previous investigations found that polycomb repressive complex 2 (PRC2) plays a key role in regulating myogenesis33 and that approximately 20% of lincRNAs are bound by PRC234. These findings indicate that lincRNAs are most likely involved in the development of skeletal muscle.Thousands of lincRNAs have been identified in humans and mice535, some of which demonstrate strong evolutionary signals of inter-species conservation5. However, the assembly of porcine lincRNA sequences using conserved lincRNAs as seeds for the collection of expressed sequence tags (ESTs) is still laborious and ineffective36. In particular, lincRNA transcripts obtained from different sources make it difficult to study their roles in skeletal muscle development. More recently, RNA-seq technology37 and computational methods developed for transcriptome reconstruction3538 have facilitated comprehensive gene annotation and functional characterization of lincRNAs. These approaches have been successfully applied to identify and characterize lincRNAs in a given tissue or cell line394041.Muscle development in pig fetuses involves two major waves of fiber generation: primary fiber formation at 35–60 days post coitus (dpc) and secondary fiber formation at 54–90 dpc42. However, some studies have shown that the development of muscle fibers in the fetal pig is mostly complete by 70–75 dpc4344, which is consistent with additional recent findings that porcine myogenesis is almost complete before 77 dpc28. Moreover, it was shown that the stage ranging from 50 to 75 dpc is critical for the formation of various muscle phenotypes28. Thus, the 50–75 dpc period is a critical stage of fetal skeletal muscle development.In this study, we report the systematic identification and characterization of lincRNAs in porcine fetal skeletal muscle from paired-end RNA-seq data that were obtained from a pool of samples at 50, 55, 60, 65 and 75 dpc. We further characterized the basic features of a conserved mouse lincRNA, including its subcellular localization and its association with chromatin-modifying complexes. Our study paves the way for further studies exploring the functional roles of lincRNA during porcine skeletal muscle development.
Results
Read mapping and transcript assembly
A total of 55.02 million 90-bp pair-end reads were obtained after filtering out low-quality reads and removing the adaptor sequences. Approximately 73.1% of the total clean reads were mapped to the Sus scrofa genome assembly 10.2, and 54,550 assembled transcripts were produced.
Genomic information of porcine lincRNAs
We developed a highly stringent filtering pipeline (Figure 1) to identify porcine lincRNAs using an integrated experimental and computational approach. In total, our pipeline yielded 570 lincRNA transcripts, corresponding to 476 lincRNA genes. Some lincRNA genes were alternatively spliced, containing 1.2 isoforms per lincRNA locus on average. We found that 45.4% of the total lincRNA was transcribed near (<10 kb) known protein-coding genes. The average size of porcine lincRNA was approximately 1,043 bp, with a range of 200 to 834 nucleotides that span 2.5 exons on average, which is similar to that of human lincRNA39. The average exon length of lincRNA was 417 bp, and lincRNAs that contained two exons accounted for 65.7% of the total lincRNAs. We also found that our lincRNA genes contained canonical splice sites (GT-AG); these lincRNAs were distributed in all chromosomes except the Y chromosome.
Figure 1
Overview of the stringent filtering pipeline used to identify the resulting 570 lincRNAs.
At each step, the vertical arrow denotes the transcripts that passed the filter, and the box denotes those that were removed.
Comparison between lincRNAs and protein-coding genes
We also obtained 14,836 protein-coding transcripts that corresponded to 12,372 genes (an average of 1.2 isoforms per protein gene). The average length of these transcripts was 1,790 bp with an average of 6.8 exons, which was larger than the size of the lincRNA genes. However, the average exon length of the protein-coding genes was 261 bp, which was less than that of the lincRNA genes. Furthermore, the exon size distribution of protein genes was mostly within 200 bp (Figure 2A). We found that protein transcripts that contained only two exons accounted for 14.4% of the total protein-coding genes, which was far less than what was observed for the lincRNA genes (Figure 2B).
Figure 2
Comparison of features of porcine lincRNAs and protein-coding genes.
Conservation of porcine lincRNAs
To examine the sequence conservation of lincRNAs between pig and other mammals, we compared pig lincRNAs with those of mouse and human lincRNAs (Ensembl Genes 70) using BLASTN version 2.2.26+. The Ensembl database contains 11,325 human lincRNAs and 3,148 mouse lincRNAs. Only 28 (5%) and six (1.05%) pig lincRNAs overlapped with human and mouse lincRNAs (E-value ≤10−5), respectively. Additionally, four of the lincRNAs corresponded to two lincRNA genes overlapping between human and mouse. The lincRNAs conserved between pig, human and mouse spanned a modest portion of the transcript ranging from 28 to 2,282 nt (396 nt on average) and 35 to 1,930 nt (612 nt on average), respectively. We further found that six and three pig lincRNAs had sequence homology with human and mouse lincRNAs, respectively, restricted to the regions located in a single exon in which the conserved regions started from or ended at the intron-exon boundary (Figure 3). In these nine cases, both the pig lincRNA and the mammalian ortholog were also spliced, indicating that the relative position of the exon within the conserved region was conserved.
Figure 3
Representative images of two pig lincRNAs with conserved segments for mouse (A) and human (B).
Thick lines indicate an exon, and thin lines indicate an intron of the lincRNA. Boxes indicate the conserved region between the two lincRNAs.
In addition, we found that 364 and 137 porcine lincRNA loci could be synthetically mapped to the human and mouse genome using liftover with a value of 0.5 for the “Minimum ratio of bases that must remap”. Moreover, we found that 91 porcine lincRNA loci overlapped (>1 bp) with the human lincRNA loci, which also contained 19 lincRNAs of the above 28 lincRNAs. In addition, 20 porcine lincRNA loci overlapped (>1 bp) with the mouse lincRNA loci, which contained all of the above six lincRNAs.
Repetitive elements of porcine lincRNA
We found that 367 (65%) pig lincRNA transcripts harbored at least a partial repetitive element (RE) and that 12.64% of the lincRNA transcripts are composed of ≥50% RE-derived sequences (Figure 4). In general, the number of lincRNAs decreased with the increasing RE content ratio (Figure 4). The average size of the RE-derived fragments in the lincRNAs was 355 bp, whereas the average length of the 367 lincRNAs was 1,263 bp. Thus, on average, 28% of the lincRNA length is composed of REs. Short interspersed repetitive sequences (SINEs) and long interspersed repetitive sequences (LINEs) accounted for 57.64% of the total REs, and the long terminal repeat (LTR) occupied 15.52% of the total RE. In addition to the above major REs, 19.25% of REs were identified as simple repeat and low complexity sequences.
Figure 4
Percentage of pig lincRNA transcripts masked by RE (from 0 to 100%).
Nearest neighbor analysis of lincRNA genes
We found that 259 lincRNA loci were transcribed near (<10 kb) their protein-coding neighbor, and a total of 378 protein-coding neighbors were collected. Of these neighbors, 361 were assigned to 26 GO terms involved in the biological process (Table 1, P < 0.05). The 26 GO terms mainly referred to the development process, transcriptional regulation and the biosynthetic process. We further found that all of the GO terms contained a small number of genes, except those related to transcription regulation and the biosynthetic process.
Table 1
GO analysis of the closely neighboring protein-coding genes of lincRNA
No.
Terms
GO Accession
No. of genes
1
skeletal muscle organ development
GO:0060538
7
2
skeletal muscle tissue development
GO:0007519
7
3
anterior/posterior pattern formation
GO:0009952
9
4
pattern specification process
GO:0007389
12
5
regionalization
GO:0003002
10
6
segmentation
GO:0035282
5
7
regulation of transcription from RNA polymerase II promoter
GO:0006357
23
8
tongue development
GO:0043586
3
9
segment specification
GO:0007379
3
10
striated muscle tissue development
GO:0014706
7
11
cell fate determination
GO:0001709
4
12
regulation of transcription
GO:0045449
61
13
muscle tissue development
GO:0060537
7
14
leukocyte activation
GO:0045321
10
15
regulation of membrane potential
GO:0042391
7
16
negative regulation of protein catabolic process
GO:0042177
3
17
positive regulation of transcription from RNA polymerase II
GO:0045944
13
18
blood vessel morphogenesis
GO:0048514
9
19
muscle organ development
GO:0007517
9
20
regulation of transcription, DNA-dependent
GO:0006355
43
21
positive regulation of cellular biosynthetic process
GO:0031328
20
22
positive regulation of nitrogen compound metabolic process
GO:0051173
19
23
autonomic nervous system development
GO:0048483
3
24
cranial nerve development
GO:0021545
3
25
negative regulation of cellular protein metabolic process
GO:0032269
8
26
positive regulation of biosynthetic process
GO:0009891
20
Expression of lincRNAs at different developmental stages
We randomly selected 10 lincRNAs and examined their expression pattern at three important development stages. The results confirmed the expression of seven lincRNAs, of which six were detected at all time points (Figure 5) and showed differential expression between the fetal and adult periods. We further found that CUFF.15945 and CUFF.6127 were both higher in the 65 dpc period and were considerably decreased during muscle development; they had a distinctive expression pattern compared with the other four lincRNAs.
Figure 5
Developmental expression pattern of lincRNAs during muscle development (here and below, the values represent the means ± s.e.m., n = 6).
Characterization of ENSMUSG00000090086
We choose a mouse lincRNA gene locus (named ENSMUSG00000090086) that was conserved between pig and human to analyze its features during C2C12 cell differentiation. The ENSMUSG00000090086 gene contained seven transcript isoforms, and we found that iso3 and, in particular, iso1 and iso4 were considerably up-regulated during C2C12 cell differentiation; whereas iso2, iso5, iso6 and iso7 were not detected (Figure 6A). We separated the differentiated C2C12 cells into nuclear and cytoplasmic fractions and found that iso1 and iso4 were both expressed mainly in the nucleus, although iso4 showed weak expression a weak expression (Figure 6C, Supplementary Information). Further, in RNA immunoprecipitation (RIP) experiments, we confirmed that iso1 and iso4 were both significantly (P < 0.01) enriched with EZH2 antibody compared to the IgG nonspecific antibody (Figure 6D).
Figure 6
Features of the ENSMUSG00000090086 gene.
(A) Q-PCR analysis of ENSMUSG00000090086 gene expression during C2C12 cell culture in growth medium (GM) or differentiation medium (DM) for 1, 3 and 5 days (here and below, the values represent the means ± s.e.m., n = 6). (B) RT-PCR analysis of Myog and MHC expression to monitor the differentiation status at indicated times. HPRT was used as an endogenous control. (C) RT-PCR analysis of the relative expression of iso1 and iso4 as well as other control genes in the nuclear and cytoplasmic cell fractions. (D) The RIP result of iso1, iso4 and HOTAIR with the EZH2 antibody. RIP enrichment was measured by q–PCR, and the values were normalized to background levels and input samples (Percent Input Method). The interaction of HOTAIR with EZH2 is a known interaction that served as a positive control(**indicates P < 0.01, here and below, the values represent the means ± s.e.m., n = 3).
Discussion
The identification and characterization of porcine lincRNA, particularly in fetal skeletal muscle development, has been very limited compared with that of lincRNAs in humans3945 and other model organisms, such as zebrafish1446 and mouse535. To the best of our knowledge, this is the first report of the systematic identification and characterization of a reference catalog of 570 porcine lincRNAs by integrating RNA-seq data from fetal muscle tissues. We annotated the basic features of the pig lincRNAs, including the transcript structure, sequence conservation, transposable elements, nearest neighbor analysis and developmental expression. Below, we discuss these findings in more detail and relate our findings to results that have emerged from humans and other model organisms.Noncoding and protein-coding genes were distinguished by their coding potential capability. It has been reported that CPC can discriminate coding from noncoding transcripts with high accuracy47. Additionally, some reports have shown that the combination of a strict BlastX and Pfam (PfamA and PfamB) search could better reduce false negative and false positive results3946. Therefore, we performed this combination of steps to ensure that our resultant lincRNAs were of high quality.We noted that about 73.1% of the total clean reads were mapped to Sus scrofa genome assembly v.10.2, which showed a low efficiency of mapping. Here we only focus on the number of reads that could be mapped to all the 20 chromosomes (SSC1-18, X and Y), and, thus, we observed a low efficiency of mapping. Nevertheless, when the reads were mapped to the 20 chromosomes and unplaced scaffolds, the mapping rate was increased to 80.4%, which was in accordance with previous reports48.LincRNA genes are typically shorter (~1 kb) and have fewer exons (~2–3) than protein-coding genes3946 and, thus, have relatively simple compositions. Our putative porcine lincRNAs also display these properties, indicating that the lincRNAs identified here were reliable. However, an earlier study showed that porcine lincRNAs identified in the testis were 456 bp in size on average40, which is approximately half the size of the lincRNAs in human39 and swine characterized here. This inconsistency is possibly due to fewer reads and mapping problems. Our results showed that the porcine lincRNAs had 1.2 isoforms per locus, which was lower than that of human lincRNAs39. This difference does not seem to be attributable to the fewer reads, as zebrafish lincRNAs had more sequence reads compared to human lincRNAs, although they exhibited less efficient alternative splicing3946. We also found that our lincRNAs had canonical splice sites (GT/AG), which supported the fact that the lincRNAs were similar to protein-coding genes in some properties, such as chromatin modification and splicing signals3949.We found that almost half of the porcine lincRNA was transcribed near (<10 kb) protein-coding genes, which is consistent with the finding that lincRNA genes were preferentially found within 10 kb of protein-coding genes249. However, some studies showed different results50, which may be attributed to the diverse sources of lincRNA. It was demonstrated that lincRNAs were transcribed in close proximity to protein-coding genes and was possibly coordinated with transcriptional regulation of neighboring coding genes51. This was partially supported by evidence that mammalian lincRNAs (<10 kb) are more likely to be located near genes that mediate transcriptional regulation1439, which was also observed in the GO analysis in the nearest neighbors of porcine lincRNA genes.Transposable elements comprise a substantial fraction of the vertebrate genome1 and have been shown to be a major source of vertebrate lncRNAs52. In this study, most of the porcine lincRNAs were also composed of partial TE-derived sequences, which is consistent with the above conclusion. Moreover, we found that SINEs and LINEs account for half of the TE family in the porcine lincRNAs, which is also consistent with the results observed in the mammalian lncRNAs52.Most of the lincRNAs had a less conserved sequence with other mammalian lincRNAs, and some of them had more positional conservation than sequence conservation across vertebrates14. Our results showed that few pig lincRNAs overlapped with the human and mouse lincRNAs at the sequence level, but more pig lincRNAs were synthetically mapped to the mammalian genome, which is consistent with the above conclusions. However, a recent report showed that nearly 40% of pig lincRNAs had detectable sequence homology with human and mouse lincRNAs by BLASTN. We found that our pig lincRNAs were only obtained from the skeletal muscle, while the 6,621 pig lincRNAs in Zhou et al. (2014) were obtained from several tissues53. LincRNAs exhibited tissue-specific expression patterns more so than protein-coding genes, which may lead to identify different number and structure of lincRNAs across multiple tissues. Furthermore, the mouse lincRNAs from the NONCODE database (v4)54 and human lincRNAs from the Gencode database (v19)55 database was also different from the mouse and human lincRNAs database (Ensemble Genes 70) analysed in this study. Thus, we think the difference in the number and source of pig lincRNAs and the databases of mouse and human lincRNAs may have led to our poor results about lincRNAs sequence similarity between species.Conserved lincRNAs among mammals are generally thought to have important roles14. We found that the conserved ENSMUSG00000090086 gene was differentially expressed during C2C12 cell differentiation and was associated with PRC2. It was shown that the binding of differentially expressed lincRNAs to PRC2 could indicate a possible role of lincRNAs8. However, a recent study stated that the binding to PRC2 does not necessarily imply functionality for lincRNA56. The role of the ENSMUSG00000090086 gene needs to be further investigated.In conclusion, we have provided a resource of porcine lincRNA which will enable further studies of the function of these genes in the process of skeletal muscle development.
Methods
The methods were performed in accordance with the guidelines of the Good Experimental Practices adopted by the Institute of Animal Science.All experimental protocols were approved by the Institute of Animal Science of the Chinese Academy of Agricultural Sciences.
Animal and tissue preparation
All longissimus dorsi muscle samples, which were maintained in liquid nitrogen, were derived from our laboratory. Two Tongchen pig fetuses (one male and one female) at 50, 55, 60, 65 and 75 dpc were included in this study.
RNA extraction, library preparation and Solexa sequencing
Total RNA was isolated with TRIzol reagent (Invitrogen, Carlsbad, CA, USA), treated with DNase I (Qiagen, Beijing, China), and purified using an RNeasy MinElute Cleanup column (Qiagen). The total RNA integrity was assessed using Agilent 2100 Bioanalyzer (Agilent Technologies, Palo Alto, CA, USA), and only the samples with RNA Integrity Number (RIN) scores >8 were used for sequencing. Equal amounts of total RNA from the samples at the different stages (i.e., 50, 55, 60, 65 and 75 dpc) were pooled into one sample.PolyA+ RNA was purified using Magnetic Oligo (dT) Beads from 20 μg of the total RNA and was further fragmented before cDNA synthesis. First-strand cDNA was synthesized using Random Primer p(dN)6 and Superscript III (Invitrogen, Carlsbad, CA, USA), and the synthesis of double-stranded cDNA was performed using 10× second strand buffer, RNaseH, and DNA Polymerase I. Following the second-strand cDNA synthesis and adaptor ligation, 240–310 bp cDNA fragments were isolated. The cDNA libraries were then prepared following the manufacturer's instructions (Illumina, San Diego, CA, USA). The purified cDNA libraries were sequenced using a paired-end sequencing strategy on the Illumina HiSeq 2000 after quantification by the Agilent 2100 Bioanalyzer.
Transcriptome assembly
The raw reads were cleaned by filtering the adapter using cutadapt v1.157 and low-quality reads using Prinseq v0.17.358. The clean reads were then mapped to the pig reference genome (Sscrofa10.2) using the TopHat version 1.3.2 software59. Transcriptomes were assembled with Cufflinks version 1.3.038 supported through Galaxy (https://usegalaxy.org/u/jeremy/w/sort-sam-file-for-cufflinks).
Pipeline for the identification of multiple-exon lincRNA
We identified multiple-exon lincRNAs following the steps listed in the pipeline (Figure 1). The steps are detailed as follows:(1) Size selection: single-exon transcripts and the transcripts less than 200 bp were removed; (2) the remaining transcripts were removed if they had genomic positions that overlapped (>1 bp) with those of pig protein-coding genes obtained from NCBI RefSeq mRNAs (release 54, July 2012) with the accession prefixes NM_ and XM_ (hypothetical protein genes were not included) and Ensembl protein-coding genes (Ensembl release 68, July 2012); (3) the Coding Potential Calculator (CPC) tool47 was used to assess the coding potential of transcripts in both strands, and the remaining transcripts were removed if they had a CPC value >0 in either strand; (4) any remaining transcripts with similarity to known proteins against the Swiss-Prot database with an E-value ≤10−5 were removed using the NCBI BLAST version 2.2.26; (5) the remaining transcripts that contained a known protein-coding domain were removed. To accomplish this, we translated each transcript sequence in all six possible frames and used HMMER to exclude the transcripts whose corresponding translated protein sequences had a significant hit in the Pfam (PfamA and PfamB) database release 26.060; (6) the remaining transcripts that belonged to known classes of small RNAs (snRNA, snoRNAs, tRNAs, miRNA, etc.) were removed using Rfam61 (release 10.0); and (7) to filter the transcripts that were located in the UTR regions of the protein-gene due to incomplete assemblies, the remaining sequences were aligned against the NCBI RNA reference sequences (RefSeqs) only with the identifiers beginning with “NM_” prefixes via BLASTN. The sequences with more than 90% of their lengths overlapping in the UTR regions of the RNA RefSeqs were discarded.
Analysis of protein-coding transcript, TEs and GO
The assembled transcripts that had at least two exons were collected and were considered to be protein-coding transcripts when they had a sequence similarity of ≥96% and overlapped with ≥90% of the porcine protein genes from NCBI RefSeq mRNAs (hypothetical protein genes were not included; release 54, July 2012) and Ensemble (release 68, July 2012) using BLASTN version 2.2.26+ (ftp://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/LATEST/).We ran RepeatMasker program version open-4.0.3 (http://www.repeatmasker.org/) with options “cross_match” as the search engine and “pig” as the DNA source to identify transposable and repetitive DNA elements in the pig lincRNA sequences.For each lincRNA locus, the nearest upstream and downstream (within <10 kb) protein-coding neighbors (without overlap) were identified. The neighbor gene names were used as the gene list input into DAVID (http://david.abcc.ncifcrf.gov/) for GO analysis62. We selected the “GOTERM_BP_FAT” and set the value of EASE to 0.05 for the GO term enrichment analysis.
Reverse transcription polymerase chain reaction (RT-PCR) and real-time PCR
For RT-PCR, the total RNA was converted into cDNA using a Revert Aid First Strand cDNA Synthesis Kit (Thermo Fisher Scientific, Boston, MA, USA) with oligo dT and random hexamer primers included in the kit. The PCR reactions were performed as follows: initial denaturation at 95°C for 3 min, followed by 30 cycles of denaturation at 95°C for 15 s, annealing at 60°C for 30 s, and elongation at 72°C for 20 s. The real-time PCR was performed according to the SYBR Premix Ex Taq™ instructions (Takara, Shiga, Japan). The reaction volume contained 10 μl of 2× SYBR® Premix Ex Taq™, 0.4 μl of 50× ROX Reference Dye II, 0.5 μl of 10 μM forward and reverse primers, 2 μl of template cDNA and dH2O up to a final volume of 20 μl. The reactions were performed on an ABI 7500 instrument (Applied Biosystems, Carlsbad, CA, USA) as follows: 2 min at 95°C, followed by 35 cycles of 5 s at 95°C and 34 s at 60°C. All data were analyzed by the 2−ΔΔCT method using 7500 System (SDS) Software version 1.4.0. LincRNA primers (Table 2) were designed over suitable exon-exon junctions using primer3 (http://bioinfo.ut.ee/primer3-0.4.0/primer3/input.htm).
Table 2
LincRNA primers used in this study
Primers
Primer sequence
Amplicon size (bp)
Forward (5′-3′)
Reverse (3′-5′)
CUFF30734
CCCTCTTCATTTCACCAGGA
GCAGGCTGAGGACGAGAATA
192
CUFF.34686
ATATTCCTCCCGGGTTTCAC
CCACAGCCAGAATCATCCTT
218
CUFF.48491
GACTACCATCTTGGGGACCA
TGAAGAACCGGGGTTATCAG
210
CUFF.15945
CTAAGGCCCTCTGCAAACTG
TTTTGCTTCCAACTTTTCCA
239
CUFF.6127
CAATGCTGTGGCAACAAGAC
CAGACGAAAGCCAGAAGTCC
241
CUFF.41031
CCCAAGGTGGTGGTTTAGTG
CCCCGTTACTGTGGTACCTG
224
CUFF.3991
ATCCATCCAGCATCTTCTCG
GACACGTGCCAGGTACTGAG
227
CUFF.16680
ACACGTGTGCCAGTCAACAT
AGCCCATCAGTCCCTCTTCT
157
CUFF.1689
GTGCAGCCACTTCTTTCTCC
TTGCAGGTCCCATTCCATAC
228
CUFF.37686
AGGGCCCTCAGTTGTGATTT
GTGGGTGAAGGATTTCCCTA
174
Pig-TBP
GCGATTTGCTGCTGTAATCA
CCCCACCATGTTCTGAATCT
196
Iso1
GTATTTTCTGTGGCGGTTGG
GATGTCCAGAGGAAGCAAGG
282
Iso3
GGAATTGCCCTAACAGAACG
CTACCCGGCCTAGAGTGTTG
210
Iso4
ACTCCACCATCCAGTTCAGG
TTCGCTGAGGGCTGTTTAGT
216
Myog
ACTCCCTTACGTCCATCGTG
CAGGGCTGTTTTCTGGACAT
195
MHC
CGTCAAGGGTCTTCGTAAGC
ATTGTTCCTCAGCCTCCTCA
158
U6
CGCTTCGGCAGCACATATAC
TTCACGAATTTGCGTGTCAT
87
tRNA-lle
AGTGGCGCAATCGGTTAG
AGGCTCGAACTCACAACCTC
78
NEAT1
TTTGAGATGCAGTTGCTTGG
CTCCCCAGCTTCACTTCTTG
205
HOTAIR
GGGCTGCAGAATTCACTCTC
GACTTCCTTCCTTCCGCTCT
207
HPRT
GCCCCAAAATGGTTAAGGTT
TTGCGCTCATCTTAGGCTTT
208
Cell culture and differentiation
The mouseC2C12 cell line was provided by the Cell Resource Center of Peking Union Medical College (CRC/PUMC, Beijing, China) and was cultured in Dulbecco's modified Eagle's medium (DMEM) - high glucose (Invitrogen, Carlsbad, CA, USA) with 10% (v/v) fetal bovine serum (Invitrogen, Carlsbad, CA, USA) at 37°C in a 5% CO2 humidified incubator. Upon the induction of differentiation, the culture medium was switched to DMEM plus 2% horse serum when cells reached approximately 80% confluence. The medium was changed every two days.
Nuclear and cytoplasmic RNA fractionation
The cells cultured in differentiation medium were harvested in a T-25 flask, washed once with cold PBS, and centrifuged at 500 g for 3 min at 4°C. Cell pellets were resuspended by gentle pipetting in 200 μl of lysis buffer (10 mM Tris-HCl, pH = 8.0, 140 mM NaCl, 1.5 mM MgCl2, 10 mM EDTA, 0.5% IGEPAL® CA-630, and 40 U/ml RNase inhibitor) and incubated on ice for 5 min, followed by centrifugation at 500 g at 4°C for 3 min. The supernatant was transferred to a fresh 1.5 ml microcentrifuge tube, centrifuged at full speed (14,000 rpm) for 1 min, and lysed in 1 ml of TRIzol for cytoplasmic RNA isolation. The nuclear pellet was washed once with lysis buffer and was resuspended in 200 μl of lysis buffer to determine the nucleus viscosity; 1 ml of TRIzol was added for nuclear RNA isolation if necessary. An equal amount of nuclear and cytoplasmic RNA was reverse transcribed for further analysis.
RNA-binding protein immunoprecipitation (RIP)
Cells cultured in differentiation medium were prepared in four T-75 flasks for the RNA-binding protein immunoprecipitation (RIP) assay using an EZ-Magna RIP Kit (Millipore, Billerica, MA, USA) and following the manufacturer's instructions. An anti-EZH2 polyclonal antibody (ab3748, 1:100; Abcam, Cambridge, UK) and negative control rabbit IgG antibody was used to investigate the interactions between lincRNAs and PRC2. The final isolated RNA was reverse transcribed using random primers according to the RevertAid™ First-Strand cDNA Synthesis Kit. Data were analyzed using the Percent Input Method (http://www.lifetechnologies.com/cn/zh/home/life-science/epigenetics-noncoding-rna-research/chromatin-remodeling/chromatin-immunoprecipitation-chip/chip-analysis.html).
Statistical analysis
The results are reported as the mean ± standard deviation (SD). Statistical analysis was performed using a two-tailed unpaired Student's t-test. Differences were considered statistically significant at the p < 0.05 level and were considered very significant at p < 0.01. All experiments were performed three times.
Author Contributions
K.L. conceived and designed the experiments. W.Z. and Y.M. performed the experiments. L.M., C.W. and X.H. analyzed the data. Z.T., S.Y., R.Z. and X.H. helped interpret the results. W.Z. and M.-H.L. wrote the paper. All authors reviewed the manuscript.
Authors: Ewan Birney; John A Stamatoyannopoulos; Anindya Dutta; Roderic Guigó; Thomas R Gingeras; Elliott H Margulies; Zhiping Weng; Michael Snyder; Emmanouil T Dermitzakis; Robert E Thurman; Michael S Kuehn; Christopher M Taylor; Shane Neph; Christoph M Koch; Saurabh Asthana; Ankit Malhotra; Ivan Adzhubei; Jason A Greenbaum; Robert M Andrews; Paul Flicek; Patrick J Boyle; Hua Cao; Nigel P Carter; Gayle K Clelland; Sean Davis; Nathan Day; Pawandeep Dhami; Shane C Dillon; Michael O Dorschner; Heike Fiegler; Paul G Giresi; Jeff Goldy; Michael Hawrylycz; Andrew Haydock; Richard Humbert; Keith D James; Brett E Johnson; Ericka M Johnson; Tristan T Frum; Elizabeth R Rosenzweig; Neerja Karnani; Kirsten Lee; Gregory C Lefebvre; Patrick A Navas; Fidencio Neri; Stephen C J Parker; Peter J Sabo; Richard Sandstrom; Anthony Shafer; David Vetrie; Molly Weaver; Sarah Wilcox; Man Yu; Francis S Collins; Job Dekker; Jason D Lieb; Thomas D Tullius; Gregory E Crawford; Shamil Sunyaev; William S Noble; Ian Dunham; France Denoeud; Alexandre Reymond; Philipp Kapranov; Joel Rozowsky; Deyou Zheng; Robert Castelo; Adam Frankish; Jennifer Harrow; Srinka Ghosh; Albin Sandelin; Ivo L Hofacker; Robert Baertsch; Damian Keefe; Sujit Dike; Jill Cheng; Heather A Hirsch; Edward A Sekinger; Julien Lagarde; Josep F Abril; Atif Shahab; Christoph Flamm; Claudia Fried; Jörg Hackermüller; Jana Hertel; Manja Lindemeyer; Kristin Missal; Andrea Tanzer; Stefan Washietl; Jan Korbel; Olof Emanuelsson; Jakob S Pedersen; Nancy Holroyd; Ruth Taylor; David Swarbreck; Nicholas Matthews; Mark C Dickson; Daryl J Thomas; Matthew T Weirauch; James Gilbert; Jorg Drenkow; Ian Bell; XiaoDong Zhao; K G Srinivasan; Wing-Kin Sung; Hong Sain Ooi; Kuo Ping Chiu; Sylvain Foissac; Tyler Alioto; Michael Brent; Lior Pachter; Michael L Tress; Alfonso Valencia; Siew Woh Choo; Chiou Yu Choo; Catherine Ucla; Caroline Manzano; Carine Wyss; Evelyn Cheung; Taane G Clark; James B Brown; Madhavan Ganesh; Sandeep Patel; Hari Tammana; Jacqueline Chrast; Charlotte N Henrichsen; Chikatoshi Kai; Jun Kawai; Ugrappa Nagalakshmi; Jiaqian Wu; Zheng Lian; Jin Lian; Peter Newburger; Xueqing Zhang; Peter Bickel; John S Mattick; Piero Carninci; Yoshihide Hayashizaki; Sherman Weissman; Tim Hubbard; Richard M Myers; Jane Rogers; Peter F Stadler; Todd M Lowe; Chia-Lin Wei; Yijun Ruan; Kevin Struhl; Mark Gerstein; Stylianos E Antonarakis; Yutao Fu; Eric D Green; Ulaş Karaöz; Adam Siepel; James Taylor; Laura A Liefer; Kris A Wetterstrand; Peter J Good; Elise A Feingold; Mark S Guyer; Gregory M Cooper; George Asimenos; Colin N Dewey; Minmei Hou; Sergey Nikolaev; Juan I Montoya-Burgos; Ari Löytynoja; Simon Whelan; Fabio Pardi; Tim Massingham; Haiyan Huang; Nancy R Zhang; Ian Holmes; James C Mullikin; Abel Ureta-Vidal; Benedict Paten; Michael Seringhaus; Deanna Church; Kate Rosenbloom; W James Kent; Eric A Stone; Serafim Batzoglou; Nick Goldman; Ross C Hardison; David Haussler; Webb Miller; Arend Sidow; Nathan D Trinklein; Zhengdong D Zhang; Leah Barrera; Rhona Stuart; David C King; Adam Ameur; Stefan Enroth; Mark C Bieda; Jonghwan Kim; Akshay A Bhinge; Nan Jiang; Jun Liu; Fei Yao; Vinsensius B Vega; Charlie W H Lee; Patrick Ng; Atif Shahab; Annie Yang; Zarmik Moqtaderi; Zhou Zhu; Xiaoqin Xu; Sharon Squazzo; Matthew J Oberley; David Inman; Michael A Singer; Todd A Richmond; Kyle J Munn; Alvaro Rada-Iglesias; Ola Wallerman; Jan Komorowski; Joanna C Fowler; Phillippe Couttet; Alexander W Bruce; Oliver M Dovey; Peter D Ellis; Cordelia F Langford; David A Nix; Ghia Euskirchen; Stephen Hartman; Alexander E Urban; Peter Kraus; Sara Van Calcar; Nate Heintzman; Tae Hoon Kim; Kun Wang; Chunxu Qu; Gary Hon; Rosa Luna; Christopher K Glass; M Geoff Rosenfeld; Shelley Force Aldred; Sara J Cooper; Anason Halees; Jane M Lin; Hennady P Shulha; Xiaoling Zhang; Mousheng Xu; Jaafar N S Haidar; Yong Yu; Yijun Ruan; Vishwanath R Iyer; Roland D Green; Claes Wadelius; Peggy J Farnham; Bing Ren; Rachel A Harte; Angie S Hinrichs; Heather Trumbower; Hiram Clawson; Jennifer Hillman-Jackson; Ann S Zweig; Kayla Smith; Archana Thakkapallayil; Galt Barber; Robert M Kuhn; Donna Karolchik; Lluis Armengol; Christine P Bird; Paul I W de Bakker; Andrew D Kern; Nuria Lopez-Bigas; Joel D Martin; Barbara E Stranger; Abigail Woodroffe; Eugene Davydov; Antigone Dimas; Eduardo Eyras; Ingileif B Hallgrímsdóttir; Julian Huppert; Michael C Zody; Gonçalo R Abecasis; Xavier Estivill; Gerard G Bouffard; Xiaobin Guan; Nancy F Hansen; Jacquelyn R Idol; Valerie V B Maduro; Baishali Maskeri; Jennifer C McDowell; Morgan Park; Pamela J Thomas; Alice C Young; Robert W Blakesley; Donna M Muzny; Erica Sodergren; David A Wheeler; Kim C Worley; Huaiyang Jiang; George M Weinstock; Richard A Gibbs; Tina Graves; Robert Fulton; Elaine R Mardis; Richard K Wilson; Michele Clamp; James Cuff; Sante Gnerre; David B Jaffe; Jean L Chang; Kerstin Lindblad-Toh; Eric S Lander; Maxim Koriabine; Mikhail Nefedov; Kazutoyo Osoegawa; Yuko Yoshinaga; Baoli Zhu; Pieter J de Jong Journal: Nature Date: 2007-06-14 Impact factor: 49.962
Authors: Eduard Muráni; Mária Murániová; Siriluck Ponsuksili; Karl Schellander; Klaus Wimmers Journal: BMC Dev Biol Date: 2007-10-01 Impact factor: 1.978