Bernardo Bonilauri, Fabiola Barbieri Holetz, Bruno Dallagiovanna.
Abstract
Ribosome profiling reveals the translational dynamics of mRNAs by capturing a snapshot of ribosomal footprints. Growing evidence shows that several long non-coding RNAs (lncRNAs) contain small open reading frames (smORFs) that are translated into functional peptides. Identifying bona fide translated smORFs remains a challenge for both experimental and bioinformatics approaches because of their unconventional characteristics. This motivated us to isolate human adipose-derived stem cells (hASC) from adipose tissue and perform ribosome profiling followed by bioinformatics analysis of the transcriptome, translatome, and ribosome-protected fragments of lncRNAs. Here, we demonstrate that 222 lncRNAs were associated with the translational machinery in hASC, including lncRNAs already shown to encode microproteins. The ribosomal occupancy of some transcripts was consistent with the translation of smORFs. In conclusion, we identified a subset of 15 lncRNAs containing 35 smORFs that likely encode functional microproteins, including four previously demonstrated smORF-derived microproteins, suggesting a possible dual role of these lncRNAs in hASC self-renewal.
Keywords: lncRNA; microprotein; ribosome; smORF; stem cells; translation
Year: 2021 PMID: 34827671 PMCID: PMC8615451 DOI: 10.3390/biom11111673
Source DB: PubMed Journal: Biomolecules ISSN: 2218-273X
Subject characteristics.
| SUBJECT | DONOR1 | DONOR2 | DONOR3 | Mean ± SD |
|---|---|---|---|---|
| Age | 46 | 20 | 48 | 38 ± 12.75 |
| Gender | F | F | F | F |
| Weight (kg) | 74.5 | 75 | 90 | 79.8 ± 7.19 |
| Height (cm) | 166 | 174 | 175 | 171 ± 4.02 |
| BMI | 27.04 | 24.77 | 29.39 | 27.06 ± 1.88 |
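The summary column above can be recomputed from the three donor values. The sketch below is illustrative only: the source does not say which SD formula was used, so the use of the population standard deviation (dividing by n, `statistics.pstdev`) is an assumption. It is chosen because it reproduces the reported 12.75 and 7.19, whereas the sample SD would not. BMI is computed with the standard formula, weight in kg divided by height in metres squared. Small differences from the table (for example, a height mean of 171.67 against the reported 171) may come from rounding or truncation in the source.

```python
from statistics import mean, pstdev

# Donor values copied from the table above (DONOR1, DONOR2, DONOR3).
donors = {
    "Age": [46, 20, 48],
    "Weight (kg)": [74.5, 75, 90],
    "Height (cm)": [166, 174, 175],
}


def summarize(values):
    """Return (mean, population SD), each rounded to two decimals.

    Assumption: population SD (divide by n), since it matches the table.
    """
    return round(mean(values), 2), round(pstdev(values), 2)


def bmi(weight_kg, height_cm):
    """Body mass index: weight (kg) divided by height (m) squared."""
    return round(weight_kg / (height_cm / 100) ** 2, 2)


summary = {name: summarize(vals) for name, vals in donors.items()}
bmis = [bmi(w, h) for w, h in zip(donors["Weight (kg)"], donors["Height (cm)"])]

for name, (m, sd) in summary.items():
    print(f"{name}: {m} ± {sd}")
print("BMI per donor:", bmis)
```

The per-donor BMI values computed this way agree with the BMI row of the table.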
Figure 1. Ribosome profiling was performed in human undifferentiated adipose-derived stem cells (n = 3) followed by massive sequencing (Ribo-seq). For deep analysis, our previous RNA-seq datasets from total fraction (Total RNA-seq) and polysomal fraction (Poly RNA-seq) were collected. Performing bioinformatics analysis, we combined Ribo-seq and RNA-seq datasets to search for the translated smORFs of lncRNAs in hASC. SVF: stromal vascular fraction.
Figure 2. General characteristics of the lncRNAs identified in the Ribo-seq of hASC. (A) Boxplot comparing the expression levels of all lncRNAs and protein-coding genes identified in Ribo-seq. (B) Comparison of transcript lengths between lncRNAs and mRNAs. (C) Comparison of GC content (%) between lncRNAs and mRNAs. (D) Density plots showing the number of exons in lncRNAs and mRNAs. (E) Cumulative distribution of the minimum free energy (MFE) of lncRNAs, protein-coding genes, 5′ untranslated regions (5′UTR), and 3′ untranslated regions (3′UTR). (F) Histogram showing the chromosomal distribution of the lncRNAs identified in the Ribo-seq data.
Identified smORFs within lncRNAs with ribosome occupancy and their predicted microproteins.
| lncRNA | GeneID | ORF Length (nt) | MP* Length (aa) | DeepLoc1.0 Pred. | DLoc† Score | Microprotein Sequence |
|---|---|---|---|---|---|---|
| CYTOR | ENSG00000222041.11 | 156 | 52 | Nucleus | 0.37 | MTDTENHDSAPSSTSTCCPPITAGMQLKDSLGPGSNRPLWTLRPLHLRVVCL |
| EBLN3P | ENSG00000281649.2 | 141 | 47 | Nucleus | 0.77 | MEEPMDTSEPLSALPFTGQQSFEPSGKFGQYPSMQMNHIQALGKWRT |
| EBLN3P | ENSG00000281649.2 | 78 | 26 | Extracellular | 0.52 | MYVTDPESPAAWDPCLPSVSPAELWN |
| GAS5 | ENSG00000234741.8 | 150 | 50 | Extracellular | 0.43 | MVLGADAVWLWIAPYGQLCPQGRMRIATEVLKSKPNSSHWHTGIRQKAGS |
| GAS5 | ENSG00000234741.8 | 81 | 27 | Extracellular | 1 | MTCLGKDMKTVPVIPFKGTCFIDVNVN |
| LINC00968 | ENSG00000246430.7 | 132 | 44 | Extracellular | 0.53 | MFLQKLKSCLVKAFHKMVCVWDQEDRRLLKKRTGTLTHFRLLHV |
| LINC01116 | ENSG00000163364.10 | 249 | 83 | Nucleus | 0.71 | MGPRFLADARGRGRVPGSRFSQAPIPAHARGPRPTHEAPTPIVEAPPGKEVRLPLQAAPRGMGNRQEMTRTASLRLCSRPSLC |
| MEG3 | ENSG00000214548.18 | 168 | 56 | Nucleus | 0.50 | MPFERLEAKSIKHSWENTTGGTTRFSYTLGSHGEDRREKKEVEREERAGETGEENN |
| MEG3 | ENSG00000214548.18 | 444 | 148 | Nucleus/ | 0.418/ | MRRLSIVMKNPWHSPHPQTHGSHSHTGPKATVSAAVAPVDIGKPGEGVEEISWPPAGSLGFCAQGSWSPKNFQKLTPHVPILLGFLDFSEAPAEGSRCSLECRGSPLTWLLESLLFLLLLPSSSSSSLSISPSLCPSPVPDLAIPGCP |
| MEG3 | ENSG00000214548.18 | 252 | 84 | Cytoplasm | 0.37 | MEAAEEALMGPTIPDPSLLPGGPLVSFLVWAEAITWMPTWEGTSNVGPQPLSSSKSLHSHGDTLHLFPRDRLDPETLDPGPPLE |
| MIR22HG | ENSG00000186594.14 | 279 | 93 | Mitochondria | 0.60 | MGWEGPNSRVDDTFWASWRAFAQIGPARSGFRLETLAGLRSRRLKQPKRLQEAVSVRFGG |
| MIR22HG | ENSG00000186594.14 | 66 | 22 | Mitochondria | 0.50 | MIRFGQVGEPLPRLAQQGAVLD |
| MSC-AS1 | ENSG00000235531.10 | 192 | 64 | Nucleus | 0.42 | MSLETTGPQERQALSVLLLPWKKPAPTMPSATSKSSLRPPQKQMLSCFLYSCRTTSNHPNTREH |
| SNHG1 | ENSG00000255717.7 | 87 | 29 | Extracellular | 0.47 | MSYWAPVCRIYAHVGTEESSVVAPTRAYW |
| SNHG1 | ENSG00000255717.7 | 153 | 51 | Extracellular | 0.73 | MFSPQELTGEGMGQDPSLCKASVTVMFQVGVHGLCSYRGDLVDNHSMMNTK |
| SNHG16 | ENSG00000163597.15 | 99 | 33 | Nucleus | 0.78 | MATPVGVEHGEQSQAFSDDGAVSLSFQSRKRIL |
| SNHG16 | ENSG00000163597.15 | 108 | 36 | Nucleus | 0.58 | MATPVGVEHGEQSQAFSDDGWLGGLKVLDEKMLSKR |
| SNHG29 | ENSG00000175061.18 | 405 | 135 | Mitochondria | 0.69 | MFPGSLSRGRRAAVEMAWLPGSCARVAFAAGAAARYWTAWQGSAGPNPAAVAEAHGSLFCGRATSARAWSLRRPGPGSPAHSGGVQTRENWVSWGRLAVWGTPRAVYVGKIVTVLLEDLFDCPDDTCNRKCRQKR |
| SNHG29 | ENSG00000175061.18 | 285 | 95 | Mitochondria | 0.66 | MFPGSLSRGRRAAVEMAWLPGSCARVAFAAGAAARYWTAWQGSAGPNPAAVAEAHGSLFCGRATSARAWSLRRPGPGSPAHSGGVQTRENWVANS |
| SNHG29 | ENSG00000175061.18 | 111 | 37 | Extracellular | 0.68 | MDHSFVVGPHLPEPGVCEGRDPVPRPTVGVCKPERTG |
| SNHG29 | ENSG00000175061.18 | 237 | 79 | Nucleus/ | 0.243/ | MDHSFVVGPHLPEPGVCEGRDPVPRPTVGVCKPERTGLQIREESASCLAAEYWSQEPAMRLYSQRMSVPRTSSCHQFGF |
| SNHG29 | ENSG00000175061.18 | 210 | 70 | Endoplasmic Reticulum | 0.49 | MLALCIRGHAQQIQEIYLATFSRKGTLGIIHYILEFFWVFFFFFETVLLYCPGWSVVAQSQLIASSITQA |
| SNHG29 | ENSG00000175061.18 | 54 | 18 | - | - | MYQRTCSADPRDIFGNFF |
| SNHG29 | ENSG00000175061.18 | 237 | 79 | Golgi Apparatus | 0.42 | MLSRSKRYIWQLFLEKAHWVSFITFLSFFGFFFFFLRQSCCIAQAGVWWHNHSSLHPQSPRPKQSSHLVAGTTAHSTPG |
| SNHG29 | ENSG00000175061.18 | 78 | 26 | Mitochondria | 0.97 | MLPRLVSGSWAQMVLLPQLPKAQAKL |
| SNHG5 | ENSG00000203875.12 | 105 | 35 | Mitochondria | 0.26 | MALSSVAQWSSSEDAKIHEKTSRTSGRIFNGKSLG |
| SNHG5 | ENSG00000203875.12 | 72 | 24 | Mitochondria | 0.57 | MQRYTKKLPEHLGEYLMENRLVKT |
| SNHG6 | ENSG00000245910.8 | 75 | 25 | Mitochondria | 0.86 | MPVWWRRRRLRARSWALRGARKPLR |
| SNHG8 | ENSG00000269893.8 | 156 | 52 | Mitochondria | 0.68 | MIIGPKLTALPKRQRSQDIGRSGAALETLKFTSMRGLECSLGRRASTCSPGP |
| SNHG8 | ENSG00000269893.8 | 108 | 36 | Mitochondria/ | 0.309/ | MDDGNIRLSRNPSGNGRSLFSIRQWTYRSWGNGCSE |
| ZFAS1 | ENSG00000177410.13 | 75 | 25 | Mitochondria | 0.34 | MDFGRGSHHWTSKEATCRHLQPSIS |
| ZFAS1 | ENSG00000177410.13 | 60 | 20 | - | - | MRVLEVEYIYTYKIETGDGI |
| ZFAS1 | ENSG00000177410.13 | 99 | 33 | Extracellular | 0.44 | MRVLEVEYIYTYKIGWEPRVPVCVDLGLIQSAL |
| ZFAS1 | ENSG00000177410.13 | 153 | 51 | Nucleus | 0.57 | MEYERSPLERKGQTLCFHESEDLAEPVPQGYCIHSLSLKGCAHFKNVIVRL |
| ZFAS1 | ENSG00000177410.13 | 99 | 33 | Extracellular | 0.51 | MRGALWKEKDRPCAFMKVKIWLNQFHKVTVYIA |
MP*: Microprotein, DLoc†: DeepLoc1.0.
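The columns of the table above are linked by simple arithmetic: in the rows shown, the ORF length in nucleotides equals three times the microprotein length in amino acids (so the stop codon does not appear to be counted), and the listed sequence should contain that many residues. A minimal sketch of such a consistency check follows. The function name `check_row` and the choice of checks are illustrative and not part of the original analysis. Only two rows are copied in as examples.

```python
def check_row(orf_nt, mp_aa, sequence):
    """Run basic consistency checks on one table row.

    Hypothetical helper for illustration only.
    """
    return {
        # ORF length (nt) should be 3 x microprotein length (aa).
        "orf_matches_aa": orf_nt == 3 * mp_aa,
        # The listed sequence should have exactly mp_aa residues.
        "sequence_matches_aa": len(sequence) == mp_aa,
        # A translated smORF product is expected to start with methionine.
        "starts_with_met": sequence.startswith("M"),
    }


# Two rows copied from the table: (lncRNA, ORF nt, MP aa, sequence).
rows = [
    ("CYTOR", 156, 52,
     "MTDTENHDSAPSSTSTCCPPITAGMQLKDSLGPGSNRPLWTLRPLHLRVVCL"),
    ("SNHG6", 75, 25, "MPVWWRRRRLRARSWALRGARKPLR"),
]

results = {name: check_row(nt, aa, seq) for name, nt, aa, seq in rows}

for name, checks in results.items():
    status = "ok" if all(checks.values()) else "check row"
    print(name, status, checks)
```

A row whose sequence was truncated during extraction would fail the `sequence_matches_aa` check, which makes this a quick way to spot copy errors in a table of this kind.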
Figure 3. LncRNA-encoded microproteins present in human adipose-derived stem cells. (A) Expression profile of the 68-aa microprotein NOBODY from LINC01420. (B) Expression profile of the 56-aa microprotein MTLN from LINC00116. (C) Expression profile of TERC, with a smORF-encoded microprotein of 121 aa. (D) Expression profile of TUG1, with a 5′UTR smORF-encoded microprotein of 154 aa. Light blue boxes indicate the smORF location in transcripts, and the dashed boxes represent the smORF location in the read coverage plot. The upper panel represents Ribo-seq coverage, the middle panel represents total fraction RNA-seq (Total RNA-seq) coverage, and the lower panel represents polysomal fraction RNA-seq (Poly RNA-seq) coverage; the y-axis represents transcript raw counts.
Figure 4. Putative smORF-derived microproteins within the EBLN3P lncRNA in hASC. (A,B) Ribosome coverage and the expression level of EBLN3P in different datasets, respectively. (C) Ribosome occupancy of smORFs located in exon 1 of EBLN3P, showing a putative 26-aa microprotein with a predicted extracellular localization. (D) Ribosome occupancy of smORFs located in exon 2 or 3 of EBLN3P, showing a putative 47-aa microprotein with a predicted nuclear localization. The upper panel represents Ribo-seq coverage from this study, the second panel represents Ribo-seq coverage from Marcon et al., the third panel represents Total RNA-seq coverage, and the lower panel represents Polysomal RNA-seq coverage; the y-axis represents transcript raw counts. Light blue boxes indicate the smORF location in transcripts, and the dashed boxes represent the smORF location in the read coverage plot.