Tanxi Cai1,2, Qing Zhang1,2, Bowen Wu1,2, Jifeng Wang1, Na Li1,2, Tingting Zhang1,2, Zhipeng Wang1,2, Jianjun Luo3, Xiaojing Guo1,2, Xiang Ding1,2, Zhensheng Xie1,2, Lili Niu1, Weihai Ning4, Zhen Fan5, Xiaowei Chen5, Xiangqian Guo6, Runsheng Chen2,3, Hongwei Zhang4, Fuquan Yang1,2.
Abstract
Advancements in omics-based technologies over the past few years have led to the discovery of numerous biologically relevant peptides encoded by small open reading frames (smORFs) embedded in long noncoding RNA (lncRNA) transcripts (referred to as microproteins here) in a variety of species. However, the mechanisms and modes of action that underlie the roles of microproteins have yet to be fully characterized. Herein, we provide the first experimental evidence of abundant microproteins in extracellular vesicles (EVs) derived from glioma cancer cells, indicating that the EV-mediated transfer of microproteins may represent a novel mechanism for intercellular communication. Intriguingly, when examining human plasma, 48, 11 and 3 microproteins were identified from purified EVs, whole plasma and EV-free plasma, respectively, suggesting that circulating microproteins are primarily enriched in EVs. Most importantly, the preliminary data showed that the expression profile of EV microproteins in glioma patients diverged from that in healthy donors, suggesting that circulating microproteins in EVs might have potential diagnostic applications in identifying patients with glioma.
Keywords: cancer; extracellular vesicles (EVs); lncRNA‐encoded microproteins; long noncoding RNA (lncRNA); small open reading frames (smORFs)
Year: 2021 PMID: 34276900 PMCID: PMC8275822 DOI: 10.1002/jev2.12123
Source DB: PubMed Journal: J Extracell Vesicles ISSN: 2001-3078
FIGURE 1. Schematic illustration of the workflow for MS‐based discovery of microproteins encoded by lncRNAs. (a) Construction of the putative microprotein database. Human lncRNA transcripts deposited in the NONCODE database were screened with ORFfinder and six‐frame translation to find all possible smORFs, which were then theoretically translated into putative microproteins. All microproteins were collected into the human microprotein database. (b) MS‐based identification of microproteins in cell culture‐derived and circulating EVs.
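The six‐frame translation step in panel (a) can be illustrated with a minimal Python sketch. This is not the authors' pipeline or ORFfinder itself: the function names, the ATG‐only start codon rule and the 10 to 100 amino acid length window are all assumptions chosen for illustration, and the source does not state the cutoffs actually used.

```python
from itertools import product

# Standard genetic code, built in TCAG order (the usual NCBI table 1 layout).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def find_smorfs(transcript, min_aa=10, max_aa=100):
    """Scan all six reading frames for ATG-initiated smORFs.

    min_aa and max_aa are hypothetical thresholds, not values from the paper.
    Returns a list of (strand, frame, peptide) tuples.
    """
    seq = transcript.upper().replace("U", "T")
    hits = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            segments = translate(s[frame:]).split("*")
            # The last segment is not closed by an in-frame stop codon, so skip it.
            for segment in segments[:-1]:
                start = segment.find("M")
                if start == -1:
                    continue
                peptide = segment[start:]
                if min_aa <= len(peptide) <= max_aa:
                    hits.append((strand, frame, peptide))
    return hits


# Toy transcript (invented): one 12-residue ORF in forward frame 2.
toy = "CC" + "ATG" + "GCT" * 11 + "TAA" + "G"
smorfs = find_smorfs(toy)
```

Several sequences in the table of patient‐specific microproteins do not begin with methionine, so a real search would also need to handle non‐ATG start codons; the sketch leaves that out for brevity.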
FIGURE 2. Identification of microproteins from glioma cancer cells and cell culture‐derived EVs. (a) TEM image of glioma cell culture‐derived EVs. (b) Western blot of commonly used EV marker proteins, including CD9, CD63, CD81 and Alix, with 15 µg of protein loaded. (c) Nanoparticle tracking analysis (NTA) of EVs released from glioma cells. (d) Microproteins identified from glioma cells and cell culture‐derived EVs, respectively. (e) Length distribution, sequence coverage and start codon usage of new microproteins. The values in the sequence coverage pie chart refer to the number of individual microproteins identified. (f) Evolutionary conservation of microproteins. A sequence similarity search between the amino acid sequences of the identified microproteins and those from other species was performed with the BLAST (Basic Local Alignment Search Tool) program in the UniProt database (https://www.uniprot.org/). The colour scale indicates the sequence identity.
FIGURE 3. Experimental validation for the presence of microproteins in cells and cell culture‐derived EVs. (a) Polysome fractionation of glioma cell lysate. Glioma cells were treated with 0.1 mg/ml cycloheximide, lysed and separated by sucrose gradient centrifugation. All fractions were collected. Fractions 1–4 were combined and marked as R1, fraction 5 was marked as R2, and fractions 6–12 were combined and marked as R3. Total RNA from R1, R2 and R3 was isolated and marked as unbound (free), monosome‐bound and polysome‐bound RNA, respectively. (b) Relative ratios of monosome‐ and polysome‐bound RNAs vs free RNAs for several microprotein‐encoding lncRNAs. (c) Diagram of the GFP fusion constructs used for transfection. The start codon ATGGTG of the gfp (GFPWT) gene is mutated to ATTGTT (GFPmut). (d) Expression of the NONHSAT115127‐GFP fusion protein in NONHSAT115127 5′ UTR‐smORF‐GFPmut‐transfected 293T cells. (e) Western blot analysis of cell lysates from 293T cells transfected with different GFP fusion constructs using anti‐GFP antibodies, with 15 µg of protein loaded. β‐tubulin was used as a protein loading control. (f) Diagram of the FLAG fusion construct used for transfection. (g) Expression of the NONHSAT115127‐FLAG fusion proteins in NONHSAT115127 5′ UTR‐smORF‐FLAG‐transfected cells. (h) Western blot analysis of cell lysates from 293T cells transfected with different FLAG fusion constructs using anti‐FLAG antibodies, with 15 µg of protein loaded. β‐tubulin was used as a protein loading control. (j) SDS‐PAGE of proteins co‐immunoprecipitated from NONHSAT115127 5′ UTR‐smORF‐GFP‐transfected cells via anti‐GFP antibodies. (k) Protein‐protein interaction networks. The networks were analyzed based on the proteins that were detected in all three replicates and significantly upregulated (fold change ≥ 4.0, p < 0.05) in the co‐IP fraction derived from the NONHSAT115127 5′ UTR‐smORF‐GFP‐transfected cells compared to the GFP‐transfected cells.
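The selection rule stated for panel (k), detection in all three replicates plus fold change ≥ 4.0 and p < 0.05, can be written as a small filter. The sketch below is an assumption‐laden illustration: the `CoIPProtein` record, its field names and the example values are hypothetical and are not data from the study.

```python
from dataclasses import dataclass


@dataclass
class CoIPProtein:
    """Hypothetical record for one protein quantified in the co-IP experiment."""
    name: str
    replicates_detected: int  # number of replicates in which it was detected
    fold_change: float        # smORF-GFP co-IP vs GFP-only control
    p_value: float


def is_enriched(protein, n_replicates=3, min_fold_change=4.0, max_p=0.05):
    """Apply the caption's criteria: all replicates, fold change >= 4.0, p < 0.05."""
    return (
        protein.replicates_detected == n_replicates
        and protein.fold_change >= min_fold_change
        and protein.p_value < max_p
    )


# Invented placeholder values, for demonstration only.
candidates = [
    CoIPProtein("protein_A", 3, 6.2, 0.010),  # passes every criterion
    CoIPProtein("protein_B", 2, 9.0, 0.001),  # missing in one replicate
    CoIPProtein("protein_C", 3, 3.9, 0.020),  # fold change below 4.0
    CoIPProtein("protein_D", 3, 4.0, 0.050),  # p is not strictly below 0.05
]
enriched = [p.name for p in candidates if is_enriched(p)]
```

Only proteins passing all three conditions would be carried forward into the interaction network analysis.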
FIGURE 4. Identification of microproteins from plasma EVs. (a) Microproteins identified in enriched EVs, EV‐free plasma and whole plasma from healthy donors. (b) and (c) Reproducibility of the identification of microproteins from three replicates of circulating EVs from healthy donors and glioma cancer patients, respectively. (d) Length distribution and sequence coverage of microproteins present in EVs from either healthy donors or glioma patients. The values in the sequence coverage pie chart refer to the number of individual microproteins identified. (e) Principal component analysis (PCA) demonstrated clear clustering of the plasma samples from healthy donors and those from glioma cancer patients.
List of microproteins specifically identified in EVs from glioma cancer patients
| ID | CHR | Start site | End site | Length | Sequence |
|---|---|---|---|---|---|
| NONHSAT108542 | chr6 (+) | 29693230 | 29697128 | 35 | MCTHDFITGHWMLIFIRSAGAQDTTLLPTGRNPLH |
| NONHSAT040552 | chr14 (−) | 106775156 | 106775618 | 37 | TTSVKGRFTISRDDSKSITYLQMNSLRAEDTAVYYCA |
| NONHSAT051106 | chr15 (−) | 101040452 | 101069470 | 37 | GCHPVASPHVTDCPQSEILTRNFGGWVSEGISGLTQR |
| NONHSAT071811 | chr2 (+) | 75711197 | 75714060 | 41 | MGMWRTFVSSSGIVNAPITALSKQAAGLYQSAGCGWGQIRE |
| NONHSAT072293 | chr2 (−) | 89170774 | 89171212 | 43 | KLENPPKLLIYAASSLPSGVPSRFSGSRSGTHFTHSHHQEPAT |
| NONHSAT072318 | chr2 (+) | 89929700 | 89930202 | 54 | TISSSLAGYQWKPGQALRLLIHGASTRTTNVPAWWSGSGFGENFSLIISRLEHE |
| NONHSAT083836 | chr22 (+) | 22915634 | 22915927 | 62 | KENGTLVTKGIETTTPSTQSNNNYAASSYLSLTPEQWKSHRSYSCQVTHKESTMEKTMAHAE |
| NONHSAT028900 | chr12 (−) | 57633328 | 57634498 | 63 | GEGKPRRGGGAGWEWAYDPCPKPGQEPGRRGPRRRALCINRWRGRLCLTRSLTAQLPAPLLSG |
| NONHSAT032697 | chr13 (+) | 28582866 | 28583699 | 71 | QGFLLRLCHDVGKREVVLTQGTEGVLAVAGGAGAAAHLRVKTTLERPSSLKFDIRIYIREFPTEAICSAGR |
| NONHSAT083819 | chr22 (+) | 22762293 | 22762516 | 73 | YELTQPPAVSVSPGQTARISCSGDVLRDNYADWYPQKPGQAPVLVIYKDGERPSGIPERFSGSTSGNTTALTI |
| NONHSAT072624 | chr2 (+) | 97355058 | 97355335 | 87 | MTQPPSSLSASVGDSVTITCRASQSFTNQLAWYQQKPGKAPKLLIYRVSSLQTGVPSLFSGSESGTDFTLTISSLQPDDVATYYCQQ |
| NONHSAT040529 | chr14 (−) | 106586375 | 106586826 | 88 | GLVQPGGSLRLSCAASGFTFSSSWMHWVCQAPEKGLEWVADIKCDGSEKYYVDSVKGRLTISRDNAKNSLYLQVNSLRAEDMTVYYCV |
| NONHSAT030405 | chr12 (−) | 104030743 | 104032248 | 89 | MKGVGAGDKMAKMKMGATGPHLQLQLKMSTDVVSVTCAGGRGRGTRWAGRVLLGTVSSPPAFTSWPCPGQPQLPWLCAPSPHPPETAWA |
| NONHSAT141981 | chr16 (+) | 31962087 | 31974361 | 95 | EVQLVESGGGLVQPGGSLRLSCAASGFTFSNRSTHWVRQAPGKGLEWVGHSVSKKKKKKKKKKKKGRFTISRDDSKNTLYLQMNSLKTEDTAVYY |
| NONHSAT072603 | chr2 (+) | 97059944 | 97060389 | 107 | MRIPAQLLAFLLLCLPGKEGEHWEMTQPPSSLSASVGDRVTVSCQASQSIYNYLNWYQQKPGKAPKFLTYRASSLQRGMPSQFSGSGYGRDFTLTVSSLQPEDFATY |
| NONHSAT040524 | chr14 (−) | 107034726 | 107034966 | 110 | ISCKGSGYSFSTYWVGWVRQIPGKGLEWMAIIYPSDSDTRYSPSFQGQVTISADKSISTAYLQWSSLKASDTAMYYCARHGYLSSGKGYFDYWGQGTQVTVSSGSASAPT |
| NONHSAT142116 | chr16 (−) | 33938336 | 33938799 | 113 | AEFSLLLFLKVSSVRCSWWSLPEALVQPGGSLRLSCAASGFTCSNAWMSWVRQAPGKGLEWVGRIKSKANGGTTDYAAPVKGRFTISRVDSKNTLYLQMNSLKTEDTAVYYCT |
| NONHSAT040427 | chr14 (−) | 105864215 | 106268900 | 125 | VQLLESGGGLVQPGGSLRLSCAASGFTFSSYAMSWVRQAPGKGLEWVSAISGSGGGTYYADSVKGRFTISRDNSKNTLYLQMNSLRAEDTAVYYCAKDRAPYSSSFDYWGQGTLVTVSSRSASAP |
| NONHSAT011749 | chr10 (−) | 22435424 | 22437929 | 210 | GVRWIQQNKHHKKIICERRRRCCENLVPETVSEQQKNRRKPKEHDPDGDGGGARGQHCHLPPTPGLKSERRLGGHKCRLLLSFAEAEKSPSGARRASRGKRAVTIPTGRPQRSLSADKAASKGAPRLPSHHVPFGRRKRKRPLPVRQPDNSGGQAAPRQHLCFPRAGKRSLRIPQKPEADTTETRGRSSRSDVYSGRERGGHLRLGEAAG |
ID: NONCODE transcript ID.
CHR: chromosome (strand).