Marine Lambert1,2, Sara Guellal1,2, Jeffrey Ho1,2, Abderrahim Benmoussa1,2, Benoit Laffont1,2, Richard Bélanger3, Patrick Provost1,2.
Abstract
Small RNA sequencing (sRNA-Seq) approaches have unveiled sequences derived from longer non-coding RNAs, such as transfer RNA (tRNA) and ribosomal RNA (rRNA) fragments, known as tRFs and rRFs, respectively. However, rRNAs and RNAs shorter than 16 nt are often depleted from library preparations/sequencing analyses, although they may be functional. Here, we sought to obtain a complete repertoire of small RNAs by sequencing the total RNA from 11 samples of 6 different eukaryotic organisms, from yeasts to human, in an extended 8- to 30-nt window of RNA length. The 8- to 15-nt window essentially contained fragments of longer non-coding RNAs, such as microRNAs, PIWI-associated RNAs (piRNAs), small nucleolar RNAs (snoRNAs), tRNAs and rRNAs. Notably, unusually short RNAs < 16 nt were more abundant than those > 16 nt in bilaterian organisms. A new RT-qPCR method confirmed that two unusually short rRFs of 12 and 13 nt were more abundant (~3-log difference) than two microRNAs. We propose not depleting rRNA and lowering the minimum RNA length threshold so that unusually short RNAs are included in sRNA-Seq analyses and datasets, as their abundance and diversity support their potential role and importance as biomarkers of disease and/or mediators of cellular function.
Keywords: RNA sequencing; non-coding RNA; small RNA; unusually short RNA
Year: 2022 PMID: 35645341 PMCID: PMC9149858 DOI: 10.3390/ncrna8030034
Source DB: PubMed Journal: Noncoding RNA ISSN: 2311-553X
Figure 1. Opening of the 8- to 15-nt window revealed a high abundance of unusually short RNAs in human, mouse and fly samples. (A–E) Length distribution of the small RNA reads (in nucleotides, nt) from human (A), mouse (B), D. melanogaster (C), A. thaliana (D), S. cerevisiae and S. pombe (E) samples in the 8- to 15-nt and the standard 16- to 30-nt windows of RNA length. (F–J) Relative proportion of the reads in either of the two windows of RNA length in each sample.
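As a minimal sketch of how the two length windows described in Figure 1 could be tallied, the following Python snippet counts reads by length and reports the share of reads falling in the 8- to 15-nt and 16- to 30-nt windows. The function names and the toy reads are illustrative assumptions and do not reproduce the authors' pipeline.

```python
from collections import Counter

SHORT_WINDOW = range(8, 16)      # 8- to 15-nt window
STANDARD_WINDOW = range(16, 31)  # standard 16- to 30-nt window


def length_distribution(reads):
    """Count reads per length, keeping only reads of 8 to 30 nt."""
    return Counter(len(r) for r in reads if 8 <= len(r) <= 30)


def window_proportions(reads):
    """Return the fraction of retained reads in each length window."""
    dist = length_distribution(reads)
    short = sum(n for length, n in dist.items() if length in SHORT_WINDOW)
    standard = sum(n for length, n in dist.items() if length in STANDARD_WINDOW)
    total = short + standard
    if total == 0:
        return {"8-15": 0.0, "16-30": 0.0}
    return {"8-15": short / total, "16-30": standard / total}


# Toy input (hypothetical read set; sequences taken from the tables of this record).
toy_reads = [
    "GACTCTTAGCGG",        # 12 nt
    "CGACTCTTAGCGG",       # 13 nt
    "GACTCTTAGCGG",        # 12 nt
    "GCATTGGTGGTTCAGTGG",  # 18 nt
]
print(length_distribution(toy_reads))
print(window_proportions(toy_reads))
```

In practice the reads would come from adapter-trimmed sequencing output rather than a hard-coded list.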
Figure 2. Most unusually short RNAs detected in the 8- to 15-nt window derive from rRNA in human, mouse, and fly samples. (A–E) Biotype distribution of the small RNAs (percentage of RPM) from human (A), mouse (B), D. melanogaster (C), A. thaliana (D), S. cerevisiae and S. pombe (E) samples in the 8- to 15-nt window. (F–J) The small RNA biotype distribution (percentage of RPM) of the corresponding samples in the standard 16- to 30-nt window.
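The "percentage of RPM" used in Figure 2 can be illustrated with a short Python sketch. It assumes that reads per million (RPM) are computed against the total number of reads counted in the window of interest; the normalization denominator actually used by the authors is not stated in this record, and the biotype counts below are hypothetical.

```python
def reads_per_million(count, total_reads):
    """Scale a raw read count to reads per million (RPM)."""
    return count * 1_000_000 / total_reads


def biotype_percentages(biotype_counts):
    """Convert raw read counts per biotype into percentages of total RPM."""
    total = sum(biotype_counts.values())
    if total == 0:
        return {biotype: 0.0 for biotype in biotype_counts}
    rpm = {b: reads_per_million(c, total) for b, c in biotype_counts.items()}
    rpm_total = sum(rpm.values())
    return {b: 100 * value / rpm_total for b, value in rpm.items()}


# Hypothetical counts for one sample in one length window.
toy_counts = {"rRF": 750, "tRF": 200, "miRNA": 50}
print(biotype_percentages(toy_counts))
```

Because RPM is a linear rescaling, the percentages equal those computed from raw counts within one sample; RPM matters when abundances are compared across samples of different sequencing depth.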
Figure 3. High abundance of fragments derived from longer, authentic non-coding RNAs detected by sRNA-Seq analysis in the 8- to 30-nt window. (A–E) Heatmap of length distribution abundance (log10 of RPM) of miRNA-miRNA (A), piRNA-piRF (B), sdRNA (C), tRF (D), and rRF (E) biotypes from human, mouse, D. melanogaster, A. thaliana, S. cerevisiae and S. pombe samples in the 8- to 30-nt window. Gray boxes display conditions with zero reads. For each biotype, a k-means clustering was generated according to the small RNA length abundance distribution (Euclidean distance, n = 3 clusters; blue, most abundant group; beige, group of intermediate abundance; green, least abundant group).
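The clustering step mentioned in the Figure 3 legend (k-means, Euclidean distance, 3 clusters) can be sketched in plain Python as follows. This is an illustrative implementation of Lloyd's algorithm under stated assumptions, not the authors' code: each input vector is assumed to hold the log10(RPM) values of one RNA length across samples, the deterministic initialization is a simplification chosen for reproducibility, and all function names are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def kmeans(vectors, k=3, max_iter=100):
    """Lloyd's k-means; returns (labels, centroids)."""
    if k < 2 or len(vectors) < k:
        raise ValueError("need k >= 2 and at least k vectors")
    # Deterministic start (an assumption): spread the initial centroids
    # over the vectors ordered by their summed abundance.
    order = sorted(range(len(vectors)), key=lambda i: sum(vectors[i]))
    step = (len(order) - 1) / (k - 1)
    centroids = [list(vectors[order[round(j * step)]]) for j in range(k)]
    labels = [None] * len(vectors)
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda c: euclidean(v, centroids[c]))
            for v in vectors
        ]
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = mean_vector(members)
    return labels, centroids


def name_clusters(labels, centroids):
    """Name the 3 clusters by decreasing summed centroid abundance."""
    ranked = sorted(range(len(centroids)),
                    key=lambda c: sum(centroids[c]), reverse=True)
    names = ["most abundant", "middle", "least abundant"]
    name_of = {c: names[rank] for rank, c in enumerate(ranked)}
    return [name_of[lab] for lab in labels]


# Hypothetical log10(RPM) values for six RNA lengths across two samples.
toy = [[5.0, 5.2], [0.5, 0.7], [3.0, 2.9], [4.8, 5.1], [0.4, 0.6], [3.2, 3.1]]
labels, centroids = kmeans(toy, k=3)
print(name_clusters(labels, centroids))
```

A library implementation with randomized restarts would be the more usual choice for real data; the fixed initialization here only keeps the example reproducible.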
Most abundant sdRNA sequences identified in the standard, 16- to 30-nt window by sRNA-Seq analysis.
| Sample | Length (nt) | Sequence | Reads * | Origin |
|---|---|---|---|---|
| Human | | | | |
| HUVEC | 17 | GTTTGTGATGACTTACA | 99.9 | 5′ end of SNORD30 |
| | 29 | TTGCTGTGATGACTATCTTAGGACACCTT | 94.4 | 5′ end of SNORD58C |
| | 29 | | 4.4 | 5′ end of SNORD58A |
| PMN | 17 | GTTTGTGATGACTTACA | 99.9 | 5′ end of SNORD30 |
| | 29 | TTGCTGTGATGACTATCTTAGGACACCTT | 54.2 | 5′ end of SNORD58C |
| | 29 | | 29.7 | 5′ end of SNORD58A |
| HEK293 | 17 | GTTTGTGATGACTTACA | 99.9 | 5′ end of SNORD30 |
| | 29 | TTGCTGTGATGACTATCTTAGGACACCTT | 94.6 | 5′ end of SNORD58C |
| | 29 | | 4.7 | 5′ end of SNORD58A |
| Mouse | | | | |
| Cerebellum | 17 | GTTCTGTGATGAGGCTC | 96 | 5′ end of SNORD83B, without the first 3 nt |
| | 29 | TTGCTGTGATGACTATCTTAGGACACCTT | 64 | 5′ end of SNORD58, without the first 3 nt |
| | 29 | | 17 | 5′ end of SNORD58, without the first 3 nt |
| PMN | 17 | GTTCTGTGATGAGGCTC | 98 | 5′ end of SNORD83B, without the first 3 nt |
| | 29 | TTGCTGTGATGACTATCTTAGGACACCTT | 73 | 5′ end of SNORD58, without the first 3 nt |
| | 29 | | 13 | 5′ end of SNORD58, without the first 3 nt |
| NIH/3T3 | 17 | GTTCTGTGATGAGGCTC | 99 | 5′ end of SNORD83B, without the first 3 nt |
| | 29 | TTGCTGTGATGACTATCTTAGGACACCTT | 47 | 5′ end of SNORD58, without the first 3 nt |
| | 29 | | 35 | 5′ end of SNORD58, without the first 3 nt |
| N2a | 17 | GTTCTGTGATGAGGCTC | 99 | 5′ end of SNORD83B, without the first 3 nt |
| | 29 | TTGCTGTGATGACTATCTTAGGACACCTT | 45 | 5′ end of SNORD58, without the first 3 nt |
| | 29 | | 26 | 5′ end of SNORD58, without the first 3 nt |
* % of reads from sequences having the same length. Nucleotide substitutions are in red.
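The "Reads *" column, defined as the percentage of reads among sequences of the same length, can be illustrated with the following Python sketch. The read counts and the second 17-nt sequence are hypothetical, and the function name is an assumption made for illustration only.

```python
from collections import defaultdict


def percent_within_length(seq_counts):
    """For each sequence, return its share (%) of all reads of the same length."""
    totals = defaultdict(int)
    for seq, n in seq_counts.items():
        totals[len(seq)] += n
    return {
        seq: 100 * n / totals[len(seq)]
        for seq, n in seq_counts.items()
        if totals[len(seq)] > 0
    }


# Hypothetical counts: two 17-nt sequences and one 29-nt sequence.
toy_counts = {
    "GTTTGTGATGACTTACA": 999,              # 17 nt, sequence from the table
    "ACGTACGTACGTACGTA": 1,                # 17 nt, made-up competitor
    "TTGCTGTGATGACTATCTTAGGACACCTT": 50,   # 29 nt, only sequence of its length
}
for seq, pct in percent_within_length(toy_counts).items():
    print(len(seq), seq, round(pct, 1))
```

With this definition the percentages of all sequences sharing one length sum to 100, so values from different length classes are not directly comparable.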
Most abundant piRNAs and piRFs identified by sRNA-Seq analysis.
| Sample | Length (nt) | Sequence | Origin (piRBase Name) |
|---|---|---|---|
| Human | | | |
| HUVEC | 15 | GACCAATGATGTGAA | piR-hsa-4433698 5′ end |
| | 23 | TCCTGTACTGAGCTGCCCCGAGA | piR-hsa-145507 |
| | 23 | TCCTGTACTGAGCTGCCCCGAGT | piR-hsa-145507 |
| PMN | 15 | TACAACTTTTGGCAA | piR-hsa-7695930 3′ end |
| | 14 | ACAACTTTTGGCAA | piR-hsa-7695930 3′ end |
| | 23 | TATTGCACTTGTCCCGGCCTGTA | piR-hsa-137098 |
| HEK293 | 14 | GATGGGTGACCGCC | piR-hsa-741077 fragment |
| | 13 | ATGGGTGACCGCC | piR-hsa-741077 fragment |
| | 23 | TCCTGTACTGAGCTGCCCCGAGA | piR-hsa-145507 |
| Mouse | | | |
| Cerebellum | 15 | GCATTGGTGGTTCAG | piR-mmu-10912946 5′ end |
| | 18 | GCATTGGTGGTTCAGTGG | piR-mmu-10912946 5′ end |
| | 23 | AACCCGTAGATCCGAACTTGTGA | piR-mmu-29307247 5′ end |
| | 23 | TCCTGTACTGAGCTGCCCCGAGA | piR-mmu-25873647 5′ end |
| PMN | 15 | GCATTGGTGGTTCAG | piR-mmu-10912946 5′ end |
| | 16 | AGCGGAGTAGAGCAGT | piR-mmu-23655655 5′ end |
| | 23 | TCCTGTACTGAGCTGCCCCGAGA | piR-mmu-25873647 5′ end |
| | 23 | TCCTGTACTGAGCTGCCCCGAGT | piR-mmu-25873647 5′ end |
| | 22 | CCTGTACTGAGCTGCCCCGAGA | piR-mmu-25873647 5′ end |
| | 23 | GTACCCTGTAGATCCGAATTTGT | piR-mmu-11542414 |
| NIH/3T3 | 16 | AGCGGAGTAGAGCAGT | piR-mmu-23655655 5′ end |
| | 22 | CCTGTACTGAGCTGCCCCGAGA | piR-mmu-25873647 5′ end |
| | 23 | TCCTGTACTGAGCTGCCCCGAGA | piR-mmu-25873647 5′ end |
| | 24 | GTCCTGTACTGAGCTGCCCCGAGA | piR-mmu-25873647 5′ end |
| N2a | 12 | TCGCTGTGATGA | piR-mmu-24106721 |
| | 23 | CACCCGTAGAACCGACCTTGCGT | piR-mmu-31228201 5′ end |
| | 27 | GGCTCTGTGGCGCAATGGATAGCGCAT | piR-mmu-5102689 |
| | 28 | TGGCCAAGGATGAGAACTCTAACCTGAC | piR-mmu-7884931 |
| | | | |
| | 13 | GAGGAAACTCTGG | piR-dme-108681 5′ end |
| | 15 | AAGGGAAGGGTATTG | piR-dme-5048778 5′ end |
| | 16 | AAAGGGAAGGGTATTG | piR-dme-5048778 5′ end |
| | 18 | CTGGGTCGGCCGGGGCGC | piR-dme-34359551 fragment |
| | 20 | TAGGGACGGTCGGGGGCATC | piR-dme-40694119 3′ end |
| | 21 | ATAGGGACGGTCGGGGGCATC | piR-dme-40694119 3′ end |
Most abundant sdRNA sequences identified in the 8- to 10-nt cluster by sRNA-Seq analysis.
| Sample | Length (nt) | Sequence | Reads * | Origin |
|---|---|---|---|---|
| Human | | | | |
| HUVEC | 9 | GGCTA | 97.9 | 5′ end of SNORD-like-snoRNA, alias: ZL45 |
| | 12 | TCGCTATG | 36.9 | 5′ end of SNORD14B |
| | 10 | GGACCA | 96.0 | 5′ end of SNORD114-12 |
| PMN | 9 | GGCTA | 99.7 | 5′ end of SNORD-like-snoRNA, alias: ZL45 |
| | 12 | TCGCTATG | 23.9 | 5′ end of SNORD14B |
| | 11 | CCCGTCTGACC | 22.0 | 3′ end of SNORD13 |
| HEK293 | 9 | GGCTA | 100.0 | 5′ end of SNORD-like-snoRNA, alias: ZL45 |
| | 10 | TGGCTA | 45.2 | 5′ end of SNORD-like-snoRNA, alias: ZL45 |
| | 11 | GTAAGTATATT | 41.4 | Middle of SNORA24L2 |
| Mouse | | | | |
| Cerebellum | 11 | CGCTGTG | 32.6 | 5′ end of SNORD14C, without the first nt |
| | 9 | ATTG | 7.9 | CD_40-1_ (chr16) 20684238,20684314 |
| | 12 | AATTGTGGTAAC | 13.6 | Middle of SCARNA10 |
| PMN | 11 | CGCTGTG | 37.9 | 5′ end of SNORD14C, without the first nt |
| | 12 | AATTGTGGTAAC | 8.3 | Middle of SCARNA10 |
| | 11 | ATTGTGGTAAC | 11.3 | Middle of SCARNA10 |
| NIH/3T3 | 11 | CGCTGTG | 19.5 | 5′ end of SNORD14C, without the first nt |
| | 11 | AGAGAGGTGAG | 18.1 | Middle of SNORA17 |
| | 12 | TGCTGTGATGAC | 39.6 | 5′ end of SNORD58C, without the first nt |
| N2a | 11 | CGCTGTG | 80.4 | 5′ end of SNORD14C, without the first nt |
| | 12 | AGGGATTGTGGG | 28.2 | 5′ end of SNORA71 |
| | 10 | GCG | 24.1 | SNORA74B |
| | | | | |
| | 12 | GTGGAGGTAAAG | 98.0 | 5′ end snoRNA:Psi18S-525f |
| | 9 | ATAGGGACG | 71.3 | snoRNA:Psi18S-525k (Dmel_CR34569) |
| | 10 | TTATAAACTG | 43.7 | PsiU2-38.40.42 (scaRNA:PsiU2-38.40.42) |
| | | | | |
| | 11 | AGATATG | 95.1 | 5′ end of SnoR18a |
| | 10 | AATATTGAAA | 31.4 | Middle of SnoR96 |
| | 11 | TAATATTGAAA | 1.4 | Middle of SnoR96 |
| | | | | |
| | 10 | CCTTCTGAAA | 22.1 | SnoRNA86 |
| | 11 | TCCTTCTGAAA | 26.6 | SnoRNA86 |
| | 10 | TCGGGGCTGA | 11.4 | SnoRNA86 |
| | | | | |
| | 9 | TCAACTGTA | 28.0 | SnR70 |
| | 10 | TGTTCTGATG | 35.5 | SnR81 |
| | 9 | TGTCTGAT | 6.7 | Snr41 |
* % of reads from sequences having the same length. Nucleotide substitutions are in red. The common motif at the 3′ end is highlighted in green.
Most abundant tRF sequences identified by sRNA-Seq analysis.
| Sample | Length (nt) | Sequence | Origin |
|---|---|---|---|
| Human | | | |
| HUVEC | 18 | GCATTGGTGGTTCAGTGG | 5′ end of tRNA-Gly-GCC-3-1 |
| | 18 | GCAT | 5′ end of tRNA-Gly-GCC-3-1 |
| | 15 | GCATTGGTGGTTCAG | 5′ end of tRNA-Gly-GCC-3-1 |
| PMN | 18 | GCATTGGTGGTTCAGTGG | 5′ end of tRNA-Gly-GCC-3-1 |
| | 15 | GCATTGGTGGTTCAG | 5′ end of tRNA-Gly-GCC-3-1 |
| | 14 | TAGAATTCTCGCCT | Middle of tRNA-Gly-CCC-1-1 |
| HEK293 | 15 | GCATTGGTGGTTCAG | 5′ end of tRNA-Gly-GCC-3-1 |
| | 18 | GCATTGGTGGTTCAGTGG | 5′ end of tRNA-Gly-GCC-3-1 |
| | 18 | GCAT | 5′ end of tRNA-Gly-GCC-3-1 |
| Mouse | | | |
| Cerebellum | 18 | GCATTGGTGGTTCAGTGG | 5′ end of tRNA-Gly-GCC-3-1 |
| | 15 | GCATTGGTGGTTCAG | 5′ end of tRNA-Gly-GCC-3-1 |
| | 14 | CTTCGTGGTCGCCA | Partial 3035a trf-3 |
| PMN | 18 | GCATTGGTGGTTCAGTGG | 5′ end of tRNA-Gly-GCC-3-1 |
| | 17 | CATTGGTGGTTCAGTGG | 5′ end of tRNA-Gly-GCC-3-1 |
| | 15 | GCATTGGTGGTTCAG | 5′ end of tRNA-Gly-GCC-3-1 |
| NIH/3T3 | 18 | GCATTGGTGGTTCAGTGG | 5′ end of tRNA-Gly-GCC-3-1 |
| | 15 | GCATTGGTGGTTCAG | 5′ end of tRNA-Gly-GCC-3-1 |
| N2a | 18 | GCATTGGTGGTTCAGTGG | 5′ end of tRNA-Gly-GCC-3-1 |
| | 15 | GCATTGGTGGTTCAG | 5′ end of tRNA-Gly-GCC-3-1 |
| | | | |
| | 30 | CATCGGTGGTTCAGTGGTAGAATGCTCGCC | 5′ end of tRNA-Gly-GCC-3-1 |
| | 28 | GCATCGGTGGTTCAGTGGTAGAATGCTC | 5′ end of tRNA-Gly-GCC-3-1 |
| | 17 | CCCGGGTTTCGGCACCA | 3023 trf-3 |
| | | | |
| | 15 | GGCTAGGTAACATAA | PT-261581 tRF-5 |
| | 16 | GGGGATGTAGCTCATA | 5′ end of tRNA-Ala-CGC-2-1 |
| | 16 | GGCGGATGTAGCCAAG | PT-218828 tRF-5 |
| | | | |
| | 13 | GCGGATTTAGCTC | trna9-PheGAA |
| | 13 | GCTTCAGTAGCTC | trna19-MetCAT |
| | 28 | TCCTTAGTTCGATCCTGAGTGCGAGCTC | tRNA-Cys-GCA-1-1 |
| | 29 | TCCGTGATAGTTTAATGGTCAGAATGGGC | trna1-AspGTC |
| | | | |
| | 8 | GCTTCAGT | trna49-LeuCAG |
| | 8 | GCGGATTT | trna17-SerGCT |
| | 10 | CCCTGGGTTC | trna15-AlaTGC |
Nucleotide substitutions are in red.
Most abundant rRF sequences identified by sRNA-Seq analysis.
| Sample | Length (nt) | Sequence |
|---|---|---|
| Human | | |
| HUVEC | 18 | TCGTACGACTCTTAGCGG |
| | 19 | CTCGTACGACTCTTAGCGG |
| | 18 | TCGTACGACTCTTAGCGG |
| | 12 | GACTCTTAGCGG |
| | 13 | CGACTCTTAGCGG |
| PMN | 12 | GACTCTTAGCGG |
| | 13 | CGACTCTTAGCGG |
| | 18 | TCGTACGACTCTTAGCGG |
| HEK293 | 12 | GACTCTTAGCGG |
| | 13 | CGACTCTTAGCGG |
| | 18 | TCGTACGACTCTTAGCGG |
| Mouse | | |
| Cerebellum | 12 | GACTCTTAGCGG |
| | 13 | CGACTCTTAGCGG |
| | 25 | CAAACGAGAACTTTGAAGGCCGAAG |
| PMN | 12 | GACTCTTAGCGG |
| | 13 | CGACTCTTAGCGG |
| | 18 | CGATACGACTCTTAGCGG |
| NIH/3T3 | 12 | GACTCTTAGCGG |
| | 13 | CGACTCTTAGCGG |
| | 18 | CGATACGACTCTTAGCGG |
| N2a | 12 | GACTCTTAGCGG |
| | 13 | CGACTCTTAGCGG |
| | 18 | CGATACGACTCTTAGCGG |
| | | |
| | 11 | ACTCTAAGCGG |
| | 12 | AACTCTAAGCGG |
| | 30 | TGCTTGGACTACATATGGTTGAGGGTTGTA |
| | | |
| | 12 | GAGTCTGGTAAT |
| | 14 | GGGATGGGTCGGCC |
| | 18 | TAGGATAGTGGCCTACCA |
| | | |
| | 13 | TTGACCTCAAATC |
| | 18 | TATCTGGTTGATCCTGCC |
| | 19 | GCGGCTGTCTGATCAGGCA |
| | | |
| | 13 | TAAAACTTTCAGC |
| | 13 | TTGACCTCAAATC |
| | 24 | TTTGACCTCAAATCAGGTAGGACT |
Figure 4. A 12-nt and a 13-nt rRF sequence are more abundant than microRNAs miR-25 and miR-30a in human and mouse samples. (A,B) Heatmap comparing the levels of the 12-nt and 13-nt rRFs versus microRNAs miR-25 and miR-30a, expressed either as reads per million (RPM) detected by sRNA-Seq (A) or as copy number detected by RT-qPCR (B). (C) Quantitation of 12- and 13-nt rRF levels by splinted ligation RT-qPCR analysis of total RNA extracted from HEK293 and N2a cells, in parallel to that of microRNAs miR-25 and miR-30a (n = 3 independent experiments). The detailed results of the statistical analyses are shown in Supplementary Table S14 (two-way ANOVA with Holm-Sidak’s post hoc test).