Mei-Ju May Chen1, Li-Kai Chen2, Yu-Shing Lai3, Yu-Yu Lin4, Dung-Chi Wu3, Yi-An Tung1, Kwei-Yan Liu2, Hsueh-Tzu Shih2, Yi-Jyun Chen2, Yan-Liang Lin5, Li-Ting Ma5, Jian-Long Huang4, Po-Chun Wu4, Ming-Yi Hong3, Fang-Hua Chu5, June-Tai Wu6,7,8, Wen-Hsiung Li9,10,11, Chien-Yu Chen12,13,14.
Abstract
BACKGROUND: Recent advances in sequencing technology have opened a new era in RNA studies. Novel types of RNAs such as long non-coding RNAs (lncRNAs) have been discovered by transcriptomic sequencing and some lncRNAs have been found to play essential roles in biological processes. However, only limited information is available for lncRNAs in Drosophila melanogaster, an important model organism. Therefore, the characterization of lncRNAs and identification of new lncRNAs in D. melanogaster is an important area of research. Moreover, there is an increasing interest in the use of ChIP-seq data (H3K4me3, H3K36me3 and Pol II) to detect signatures of active transcription for reported lncRNAs.
Keywords: Active transcription; ChIP-seq; Drosophila melanogaster; Long non-coding RNA; RNA-seq
Year: 2016 PMID: 26969372 PMCID: PMC4787191 DOI: 10.1186/s12864-016-2457-0
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Summary statistics of datasets used in study
| Platforms | Types | Total number of datasets | Experimental condition | Number of datasets |
|---|---|---|---|---|
| Public RNA-seq (59 in total) | Paired-end without strand-specific | 30 | Time course/whole body | 30 |
| | Paired-end with strand-specific | 29 | Tissue/head | 9 |
| | | | Tissue/ovary | 2 |
| | | | Tissue/accessory glands | 1 |
| | | | Tissue/testis | 1 |
| | | | Tissue/carcass | 4 |
| | | | Tissue/digestive system | 4 |
| | | | Tissue/CNS | 2 |
| | | | Tissue/fat body | 3 |
| | | | Tissue/imaginal discs | 1 |
| | | | Tissue/salivary glands | 2 |
| In-house RNA-seq (2 in total) | Paired-end with poly(A)-enriched | 1 | Tissue/brain | 1 |
| | Paired-end with ribo-zero | 1 | Tissue/brain | 1 |
| ChIP-seq (32 in total) | H3K36me3 | 3 | Embryos | 1 |
| | | | Larvae | 1 |
| | | | Mixed Adult | 1 |
| | H3K4me3 | 14 | Embryos | 7 |
| | | | Larvae | 3 |
| | | | Pupae | 1 |
| | | | Adult Female | 1 |
| | | | Adult Male | 1 |
| | | | Mixed Adult | 1 |
| | RNA polymerase II | 15 | Embryos | 8 |
| | | | Larvae | 5 |
| | | | Pupae | 1 |
| | | | Mixed Adult | 1 |
Detailed information on these datasets is provided in Additional file 3: Table S2 and Table S5
Fig. 1 RT-qPCR experiments for a selected set of lncRNAs in brains. a 22 novel lncRNAs discovered in the present study were selected for validation. RpL32 (a coding gene) and roX1 (a non-coding gene) were included as positive controls. The horizontal line indicates −delta Ct ≥ 1. The rectangle marks the five lncRNAs with considerably low expression, which were tested again in the second RT-qPCR experiment shown in (b). b The five lncRNAs from the rectangle in (a) were tested again by RT-qPCR with twice the amount of template cDNA. Ten FlyBase lncRNAs were included for comparison. The three FlyBase lncRNAs highlighted by orange stars were selected because their RPKM values in our brain RNA-seq data were 0
Statistics of transcriptional direction in the lncRNA genes from different sources. The mRNA information was downloaded from the UCSC genome browser (Sep. 21st, 2015)
| Transcriptional direction | FlyBase + UCSC | Young et al. | Brown et al. | Present study | mRNA |
|---|---|---|---|---|---|
| Positive (+) | 1011 | 200 | 392 | 268 | 14,941 |
| Negative (-) | 988 | 192 | 380 | 194 | 15,321 |
| Unknown (*) | 0 | 191 | 0 | 0 | 0 |
| Total | 1999 | 583 | 772 | 462 | 30,262 |
Types of lncRNA transcripts
| Types | Number of lncRNAs | Averaged length (±sd) | Number of exons (counts of lncRNAs) | Transcriptional direction (counts of lncRNAs) |
|---|---|---|---|---|
| Intergenic | 2602 | 1002 (±1305.81) | single (1805); multiple (797) | +(1375); −(1227) |
| Exonic: anti-sense | 832 | 1161 (±1059.20) | single (373); multiple (459) | +(448); −(384) |
| Exonic: sense | 268 | 1380 (±1317.87) | single (154); multiple (114) | +(131); −(137) |
| Exonic: total | 1100 | | | |
| Intronic: anti-sense | 495 | 770 (±581.83) | single (292); multiple (203) | +(239); −(256) |
| Intronic: sense | 211 | 733 (±633.81) | single (149); multiple (62) | +(108); −(103) |
| Intronic: total | 706 | | | |
| Unknown | 191 | 813 (±782.66) | single (164); multiple (27) | NA |
| Total | 4599 | | | |
+: positive strand
−: negative strand
NA: not available
Fig. 2 Expression profiles at different developmental stages of fruit fly. a Averaged RPKM values at different developmental stages for lncRNAs and mRNAs. b Numbers of expressed transcripts (RPKM > 1) at different developmental stages for lncRNAs and mRNAs, respectively
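The RPKM > 1 criterion in this caption can be illustrated with a minimal sketch. It uses the standard definition of RPKM (reads per kilobase of transcript per million mapped reads); the function names `rpkm` and `count_expressed` and all read counts are hypothetical and are not taken from the study.

```python
from typing import Iterable


def rpkm(read_count: int, transcript_length_bp: int, total_mapped_reads: int) -> float:
    """Standard RPKM: reads per kilobase of transcript per million mapped reads."""
    if transcript_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and total mapped reads must be positive")
    # reads / (length / 1e3) / (total / 1e6) == reads * 1e9 / (length * total)
    return read_count * 1e9 / (transcript_length_bp * total_mapped_reads)


def count_expressed(rpkm_values: Iterable[float], threshold: float = 1.0) -> int:
    """Count transcripts whose RPKM is strictly greater than the threshold."""
    return sum(1 for value in rpkm_values if value > threshold)


# Hypothetical transcripts in a library of 20 million mapped reads.
values = [
    rpkm(50, 1000, 20_000_000),   # 2.5, counted as expressed
    rpkm(10, 1000, 20_000_000),   # 0.5, not counted
    rpkm(400, 2000, 20_000_000),  # 10.0, counted as expressed
]
print(count_expressed(values))
```

Counting per developmental stage would amount to applying `count_expressed` to the RPKM values of each stage separately.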
Fig. 3 Analysis of chromatin signatures (Pol II, H3K36me3 and H3K4me3) in the curated lncRNA genes
Fig. 4 RT-qPCR experiments of a selected set of lncRNAs in male adults. G1: high expression with chromatin signatures (11 lncRNAs); G2: low expression with chromatin signatures (11 lncRNAs); G3: high expression without chromatin signatures (10 lncRNAs); and G4: low expression without chromatin signatures (10 lncRNAs). The three negative controls (un-transcribed regions 1, 2, and 3) were all around zero. Stars highlight the lncRNAs that were not from the databases (orange stars: lncRNAs selected from Young et al. [18]; blue stars: lncRNAs from the present study). The horizontal line indicates the cutoff (−delta Ct ≥ 2) used to define a validated lncRNA. Green stars: transcripts that are now annotated as other types of transcripts by FlyBase and were therefore removed from the list of curated lncRNAs in the present study
Fig. 5 Distribution of exon numbers in the lncRNA/mRNA genes
Fig. 6 Procedures for discovering novel lncRNAs from RNA-seq data of the present study. The sequencing read datasets of poly(A)-enriched RNA and total RNA were each mapped to the reference genome sequence with TopHat and assembled into transcripts with Cufflinks. Putative lncRNAs were then discovered by Cuffcompare, followed by coding potential estimation and rRNA exclusion. Sequencing reads were again mapped to the set of putative lncRNAs to construct the final set of novel lncRNAs
Fig. 7 Rules for classifying lncRNAs. Black arrows (transcripts) represent coding genes and colored transcripts are lncRNAs. a lncRNAs with intronic overlaps. This group includes lncRNAs (dark green and light green transcripts) located in intronic regions of coding genes (black transcripts). b Intergenic lncRNAs. This group includes lncRNAs (red transcripts) located in regions between two coding genes (black transcripts). c lncRNAs with exonic overlaps. This group includes lncRNAs (dark blue and light blue transcripts) overlapping exonic regions of coding genes (the black transcript)
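The classification rules in this caption, combined with the sense/anti-sense split used in the table of lncRNA types, might be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the `Gene` class, the `classify` function, the 1-based inclusive coordinates, the precedence of exonic over intronic overlap, and the handling of transcripts with an unknown strand are all hypothetical choices.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Gene:
    """A coding gene described by its exons (1-based, inclusive coordinates)."""
    chrom: str
    strand: str
    exons: List[Tuple[int, int]]

    @property
    def start(self) -> int:
        return min(s for s, _ in self.exons)

    @property
    def end(self) -> int:
        return max(e for _, e in self.exons)


def overlaps(a_start: int, a_end: int, b_start: int, b_end: int) -> bool:
    """True if two inclusive intervals share at least one position."""
    return a_start <= b_end and b_start <= a_end


def classify(chrom: str, start: int, end: int, strand: str, genes: List[Gene]) -> str:
    """Label a lncRNA as intergenic, exonic or intronic (sense/anti-sense), or unknown."""
    if strand not in ("+", "-"):
        return "unknown"  # assumption: unstranded transcripts form their own class
    exonic = intronic = None
    for gene in genes:
        if gene.chrom != chrom or not overlaps(start, end, gene.start, gene.end):
            continue
        orientation = "sense" if gene.strand == strand else "anti-sense"
        if any(overlaps(start, end, s, e) for s, e in gene.exons):
            exonic = exonic or orientation
        else:
            intronic = intronic or orientation  # inside the gene span, no exon hit
    if exonic:  # assumption: exonic overlap takes precedence over intronic
        return f"exonic {exonic}"
    if intronic:
        return f"intronic {intronic}"
    return "intergenic"


# Hypothetical coding gene with two exons on the plus strand.
gene = Gene("2L", "+", [(100, 200), (400, 500)])
print(classify("2L", 150, 180, "-", [gene]))  # overlaps an exon, opposite strand
print(classify("2L", 250, 350, "+", [gene]))  # lies in the intron, same strand
print(classify("2L", 600, 700, "+", [gene]))  # outside the gene span
```

A linear scan over all genes is enough for illustration; a genome-scale annotation would normally use an interval index instead.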