Georges St Laurent, Dmitry Shtokalo, Michael R Tackett, Zhaoqing Yang, Tatyana Eremina, Claes Wahlestedt, Silvio Urcuqui-Inchima, Bernd Seilheimer, Timothy A McCaffrey, Philipp Kapranov.
Abstract
BACKGROUND: The function of RNA from the non-coding (the so-called "dark matter") regions of the genome has been a subject of considerable recent debate. Perhaps the greatest controversy concerns the function of RNAs found in introns of annotated transcripts, where most of the reads that map outside of exons are usually found. However, it has been reported that the levels of RNA in introns are minor relative to those of the corresponding exons, and that changes in the levels of intronic RNAs correlate tightly with those of adjacent exons. This would suggest that RNAs produced from the vast expanse of intronic space are just pieces of pre-mRNAs or excised introns en route to degradation.
Year: 2012 PMID: 23006825 PMCID: PMC3507791 DOI: 10.1186/1471-2164-13-504
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Distribution of mapped reads among different genomic annotations
| 0 | 172,810,082 | 100,490,763 | 37,454,298 | 4,040,580 | 58,995,885 | 21,353,240 | 63.8 % | 25,213,131 | 42.7 % | 67.0 % | 12,429,515 | 21.1 % |
| 3 | 188,573,448 | 111,280,605 | 39,613,334 | 4,490,786 | 67,176,485 | 24,307,053 | 63.8 % | 28,235,047 | 42.0 % | 65.9 % | 14,634,386 | 21.8 % |
| 6 | 201,295,896 | 116,218,502 | 47,449,481 | 4,055,092 | 64,713,929 | 22,823,246 | 64.7 % | 28,228,817 | 43.6 % | 67.4 % | 13,661,866 | 21.1 % |
| 12 | 170,486,061 | 93,829,203 | 39,479,285 | 3,924,386 | 50,425,532 | 18,187,717 | 63.9 % | 21,318,414 | 42.3 % | 66.1 % | 10,919,401 | 21.7 % |
| 24 | 157,234,367 | 91,301,394 | 34,072,429 | 4,477,595 | 52,751,370 | 18,760,773 | 64.4 % | 22,667,538 | 43.0 % | 66.7 % | 11,323,060 | 21.5 % |
| 48 | 147,848,170 | 85,053,947 | 32,412,171 | 3,634,177 | 49,007,599 | 17,832,101 | 63.6 % | 20,929,521 | 42.7 % | 67.1 % | 10,245,978 | 20.9 % |
*Exons were defined by the UCSC Genes track (Methods).
Figure 1. RNAseq profiles representing annotations with little intronic signal (A) and extensive intronic signal (B & C). The profiles are based on the control RNA samples. Positions of RT-PCR products presented in Figure 7 are shown (see Additional file 4: Table S3 for more details).
Figure 7. Detection of long transcripts in intronic regions using RT-PCR. Reactions were done with (“+”) and without (“-”) reverse transcriptase. The size range where the expected PCR products should fall is shown on the right. M: size standard. See text, Figures 1 & 5 and Methods for more details.
Figure 2. Correlation between levels of exonic and intronic RNAs. (A) Plot of normalized read densities for every intron and its corresponding exons: the exon-intron densities for every animal were combined, and the top half of the dataset based on the highest exonic densities is plotted. (B) Histogram of Spearman rank correlations obtained for every intron-exon pair throughout the time course of LPS treatment. (C) Histogram of minimal (min) and maximum (max) correlations (Y-axis) between an intron and all other introns of the same transcript. The X-axis shows the correlation of the intron with the corresponding exons of the transcript, as shown in panel B. (D) Boxplots of the ratio of maximum/minimum intronic RNAseq density of different introns in the same transcript for each time point. Intronic density was calculated as the average of the 7 animals at each time point. (E) Histogram of the maximum ratio of intron/exon densities for each intron that could be found in any of the 42 animals at any time point.
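The Spearman rank correlation used in panel (B) of Figure 2 can be illustrated with a minimal sketch. The function names and the example densities below are hypothetical, and whether the authors correlated per-animal values or per-time-point averages is not stated here, so the input layout is an assumption.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    if sx == 0 or sy == 0:
        return float("nan")  # undefined when one series is constant
    return cov / (sx * sy)


# Hypothetical densities of one intron and its exons over six time points.
intron_density = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
exon_density = [2.0, 4.0, 5.0, 9.0, 10.0, 20.0]
rho = spearman(intron_density, exon_density)
```

Because the measure depends only on ranks, an intron whose density rises and falls in step with its exons scores close to 1 regardless of the absolute difference in their levels.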
Annotation of abundant introns based on presence of known small RNAs
| | ||||
|---|---|---|---|---|
| Total introns* | 96 | 707 | 4,950 | 5,753 |
| UCSC snoRNAs | 34 | 19 | 23 | 76 |
| Ensembl snoRNAs | 48 | 31 | 17 | 96 |
| Non-mouse snoRNAs | 1 | 0 | 0 | 1 |
| Ensembl snRNAs | 2 | 7 | 8 | 17 |
| RNA repeats | 3 | 14 | 52 | 69 |
| Un-annotated | 8 | 636 | 4,850 | 5,494 |
| Un-annotated % | 8.3% | 90.0% | 98.0% | 95.5% |
*Introns overlapping an annotation were removed from the subsequent analysis. For example, introns overlapping UCSC snoRNAs were not used for comparison with Ensembl snoRNAs, and so on.
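The sequential removal described in the footnote can be sketched as follows. This is an illustration only: the function name is hypothetical, and it assumes the overlap of each intron with each annotation class has already been computed and is supplied as sets of intron identifiers.

```python
def annotate_sequentially(introns, annotation_sets):
    """Count introns per annotation class, removing hits before the next class.

    introns: iterable of intron identifiers.
    annotation_sets: ordered list of (class name, set of overlapping intron ids).
    """
    remaining = set(introns)
    counts = {}
    for name, overlapping in annotation_sets:
        hits = remaining & overlapping
        counts[name] = len(hits)
        remaining -= hits  # these introns are not compared with later classes
    counts["Un-annotated"] = len(remaining)
    return counts


# Hypothetical example: intron 2 overlaps both classes but is counted once.
example = annotate_sequentially(
    range(10),
    [("UCSC snoRNAs", {0, 1, 2}), ("Ensembl snoRNAs", {2, 3})],
)
```

Under this scheme the class counts and the un-annotated remainder sum to the total, which agrees with the table: in the first column, 96 - (34 + 48 + 1 + 2 + 3) = 8.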
Figure 3. Examples of novel intronic RNAs whose RNAseq densities in the control (not treated with LPS) sample are >10-fold higher than those of the corresponding exons. The PhastCons track from the UCSC browser represents the Euarchontoglires subset. One of these introns is close to an annotated snoRNA, Snora26 (B), which is interesting considering that multiple snoRNAs are sometimes encoded within the same locus.
Figure 4. An example of a known non-coding RNA, the primary precursor transcript of mir-21, upregulated at 3 hrs after LPS treatment.
Distribution of differentially expressed (DE) bins among different genomic annotations
| | ||||||||||
|---|---|---|---|---|---|---|---|---|---|---|
| Total number of DE bins | 8,316 | 17,348 | 4,107 | 6,746 | 5,276 | 5,264 | 6,714 | 5,464 | 6,217 | 3,939 |
| Exonic bins | 392 | 2,053 | 249 | 356 | 299 | 434 | 334 | 258 | 203 | 269 |
| Bins that overlap both exons and introns | 648 | 2,597 | 409 | 699 | 492 | 704 | 566 | 617 | 429 | 386 |
| Bins that overlap Ensembl exons | 127 | 297 | 79 | 92 | 79 | 110 | 101 | 104 | 72 | 58 |
| Bins that overlap exons of Ensembl retained introns | 27 | 78 | 29 | 26 | 24 | 20 | 31 | 32 | 25 | 25 |
| Bins that overlap both Ensembl exons and exons of Ensembl retained introns | 3 | 9 | 1 | 1 | 0 | 2 | 5 | 2 | 0 | 1 |
| Intronic bins | 6,530 | 10,274 | 3,041 | 5,063 | 4,062 | 3,515 | 5,359 | 3,995 | 5,070 | 2,953 |
| Bins that overlap EST (both exons and introns) | 75 | 223 | 41 | 65 | 57 | 71 | 50 | 53 | 65 | 46 |
| Bins that overlap EST (exons only) | 25 | 115 | 23 | 34 | 29 | 46 | 19 | 33 | 21 | 16 |
| Bins that overlap EST (introns only) | 248 | 527 | 73 | 123 | 86 | 96 | 95 | 99 | 91 | 56 |
| Intergenic bins | 241 | 1,175 | 162 | 287 | 148 | 266 | 154 | 271 | 241 | 129 |
| Total number of DE bins | 100.00 % | 100.00 % | 100.00 % | 100.00 % | 100.00 % | 100.00 % | 100.00 % | 100.00 % | 100.00 % | 100.00 % |
| Exonic bins | 4.71 % | 11.83 % | 6.06 % | 5.28 % | 5.67 % | 8.24 % | 4.97 % | 4.72 % | 3.27 % | 6.83 % |
| Bins that overlap both exons and introns | 7.79 % | 14.97 % | 9.96 % | 10.36 % | 9.33 % | 13.37 % | 8.43 % | 11.29 % | 6.90 % | 9.80 % |
| Bins that overlap Ensembl exons | 1.53 % | 1.71 % | 1.92 % | 1.36 % | 1.50 % | 2.09 % | 1.50 % | 1.90 % | 1.16 % | 1.47 % |
| Bins that overlap exons of Ensembl retained introns | 0.32 % | 0.45 % | 0.71 % | 0.39 % | 0.45 % | 0.38 % | 0.46 % | 0.59 % | 0.40 % | 0.63 % |
| Bins that overlap both Ensembl exons and exons of Ensembl retained introns | 0.04 % | 0.05 % | 0.02 % | 0.01 % | 0.00 % | 0.04 % | 0.07 % | 0.04 % | 0.00 % | 0.03 % |
| Intronic bins | 78.52 % | 59.22 % | 74.04 % | 75.05 % | 76.99 % | 66.77 % | 79.82 % | 73.11 % | 81.55 % | 74.97 % |
| Bins that overlap EST (both exons and introns) | 0.90 % | 1.29 % | 1.00 % | 0.96 % | 1.08 % | 1.35 % | 0.74 % | 0.97 % | 1.05 % | 1.17 % |
| Bins that overlap EST (exons only) | 0.30 % | 0.66 % | 0.56 % | 0.50 % | 0.55 % | 0.87 % | 0.28 % | 0.60 % | 0.34 % | 0.41 % |
| Bins that overlap EST (introns only) | 2.98 % | 3.04 % | 1.78 % | 1.82 % | 1.63 % | 1.82 % | 1.41 % | 1.81 % | 1.46 % | 1.42 % |
| Intergenic bins | 2.90 % | 6.77 % | 3.94 % | 4.25 % | 2.81 % | 5.05 % | 2.29 % | 4.96 % | 3.88 % | 3.27 % |
*Relative to the control time point.
Distribution of intronic differentially expressed (DE) bins
| | | ||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|
| Intronic bins** | 6,530 | 10,274 | 3,041 | 5,063 | 4,062 | 3,515 | 5,359 | 3,995 | 5,070 | 2,953 | |
| Introns overlapping intronic bins | 4,331 | 6,995 | 3,149 | 5,152 | 3,779 | 3,645 | 4,063 | 3,856 | 4,096 | 3,087 | 24,181 |
| Transcripts overlapping intronic bins | 5,433 | 8,463 | 4,709 | 7,557 | 5,387 | 5,555 | 5,674 | 5,522 | 5,814 | 4,870 | 20,061 |
| Loci overlapping intronic bins | 2,254 | 3,622 | 1,972 | 3,082 | 2,315 | 2,340 | 2,331 | 2,321 | 2,416 | 2,046 | 8,016 |
| Intronic bins** | 3,569 | 4,765 | 2,292 | 4,033 | 3,016 | 2,665 | 3,413 | 3,134 | 4,131 | 2,449 | |
| Introns overlapping intronic bins | 2,813 | 4,414 | 2,435 | 4,131 | 2,906 | 2,849 | 3,014 | 3,114 | 3,514 | 2,613 | 19,462 |
| Transcripts overlapping intronic bins | 3,890 | 6,165 | 3,727 | 6,098 | 4,253 | 4,522 | 4,422 | 4,626 | 5,054 | 4,186 | 18,209 |
| Loci overlapping intronic bins | 1,556 | 2,472 | 1,474 | 2,316 | 1,754 | 1,777 | 1,713 | 1,820 | 1,974 | 1,620 | 7,319 |
*Relative to the control time point; **After removal of bins overlapping exons of Ensembl annotations.
Figure 5. Examples of intronic RNAs downregulated at 3 hrs LPS with no changes in the exonic RNAs. Panel (C) shows a zoom-in around the DE bin marked with an arrow in panel (B). More details in the text. Positions of RT-PCR products presented in Figure 7 are shown (see Additional file 4: Table S3 for more details).
Number of introns that contain functional RNAs based on various criteria
| ≤ 0 | 30,141 | 111 | 0.009 | 28,617 | 1,258 | 266 | 5.1 % | |
| ≤ −0.3 | 8,989 | 39 | 0.019 | 8,529 | 384 | 76 | 5.1 % | |
| ≥ 10 | 18,863 | 104 | 2.34E-10 | 17,491 | 1,101 | 271 | 7.3 % | |
| ≥ 1 | 5,753 | 95 | 1.37E-42 | 4,936 | 594 | 223 | 14.2 % | |
| ≤ that of exons | 40,948 | 201 | 4.31E-16 | 39,146 | 1,107 | 695 | 4.4 % | |
| Presence of a DE bin | 25,739 | 159 | 1.34E-22 | 24,626 | 714 | 387 | 4.3 % | |
| | 82,481 | 349 | 4.49E-27 | 77,611 | 3,517 | 1,341 | 5.9 % | |
| | 35,979 | 229 | 1.90E-40 | 34,315 | 1,248 | 416 | 4.6 % | |
| | 10,111 | 103 | 2.02E-30 | 9,671 | 319 | 121 | 4.4 % | |
* With unique coordinates.
** Based on 534 Ensembl-specific snoRNAs (see Methods).
*** See Methods.
**** The following was done: 1. Average intronic density was calculated for each time point, introns were sorted by this value, and the intron with the minimal value at each time point was identified. 2. For each time point and each intron, the ratio of the density of this intron to the minimal intronic value in its locus was calculated. 3. Introns passing the threshold in at least one time point were retained.
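The three steps in the last footnote can be sketched in a few lines. The names, the example values and the threshold of 10 are assumptions for illustration, as is the choice to skip a locus whose minimal density is zero at a given time point; the footnote does not say how that case was handled.

```python
from collections import defaultdict


def introns_passing_ratio(density, locus_of, threshold):
    """Select introns whose density/minimum-in-locus ratio reaches the threshold.

    density: {intron id: {time point: average intronic density}}.
    locus_of: {intron id: locus id}.
    """
    # Step 1: minimal average intronic density per (locus, time point).
    min_density = defaultdict(lambda: float("inf"))
    for intron, by_time in density.items():
        for t, d in by_time.items():
            key = (locus_of[intron], t)
            min_density[key] = min(min_density[key], d)

    # Steps 2 and 3: ratio to the locus minimum, kept if any time point passes.
    passing = set()
    for intron, by_time in density.items():
        for t, d in by_time.items():
            lowest = min_density[(locus_of[intron], t)]
            # Assumption: a zero minimum gives an undefined ratio and is skipped.
            if lowest > 0 and d / lowest >= threshold:
                passing.add(intron)
                break
    return passing


# Hypothetical densities at time points 0 and 3 for three introns in two loci.
density = {
    "i1": {0: 1.0, 3: 2.0},
    "i2": {0: 5.0, 3: 30.0},
    "i3": {0: 4.0, 3: 4.0},
}
locus_of = {"i1": "geneA", "i2": "geneA", "i3": "geneB"}
selected = introns_passing_ratio(density, locus_of, threshold=10)
```

In this example only `i2` is selected: its ratio to the minimum in `geneA` is 5 at time point 0 but 15 at time point 3, and passing at a single time point is sufficient.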
Figure 6. Examples of lincRNA regions originally found in intergenic regions [33] that are actually part of longer intronic transcripts.
Distribution of informative reads and DE bins in annotated lincRNA regions
| 0 | 58,995,885 | 1,472,090 | 2.5 % | 420,117 | 0.7 % | | | | | |
| 3 | 67,176,485 | 1,619,443 | 2.4 % | 482,092 | 0.7 % | 19,433 | 214 | 1.1 % | 147 | 0.8 % |
| 6 | 64,713,929 | 1,626,306 | 2.5 % | 474,877 | 0.7 % | 8,912 | 89 | 1.0 % | 78 | 0.9 % |
| 12 | 50,425,532 | 1,268,052 | 2.5 % | 348,418 | 0.7 % | 8,376 | 84 | 1.0 % | 72 | 0.9 % |
| 24 | 52,751,370 | 1,342,859 | 2.5 % | 387,030 | 0.7 % | 10,128 | 130 | 1.3 % | 73 | 0.7 % |
| 48 | 49,007,599 | 1,278,355 | 2.6 % | 359,202 | 0.7 % | 8,688 | 86 | 1.0 % | 65 | 0.7 % |
*As defined by Additional file 2: Table S1 from Guttman et al., 2009 [33].
**Relative to the control time point.