Abstract
Noncoding RNAs (ncRNAs) are a large segment of the transcriptome that do not have apparent protein-coding roles, but they have been verified to play important roles in diverse biological processes, including disease pathogenesis. With the development of innovative technologies, an increasing number of novel ncRNAs have been uncovered; information about their prominent tissue-specific expression patterns, various interaction networks, and subcellular locations will undoubtedly enhance our understanding of their potential functions. Here, we summarize the principles and innovative methods for the identification of novel ncRNAs that have potential functional roles in cancer biology. Moreover, this review also provides alternative ncRNA databases based on high-throughput sequencing or experimental validation, and it briefly describes the current strategy for the clinical translation of cancer-associated ncRNAs to be used in diagnosis.
Keywords: Diagnostic kits; Functional ncRNA discovery; Novel ncRNAs; Sequencing technologies; Subcellular localization; ncRNA database
Year: 2020 PMID: 32778133 PMCID: PMC7416809 DOI: 10.1186/s13045-020-00945-8
Source DB: PubMed Journal: J Hematol Oncol ISSN: 1756-8722 Impact factor: 17.388
Fig. 1 Principle for novel ncRNA discovery. a Identification and classification of ncRNAs based on chromatin signatures. Most mRNA-like lincRNAs are generated from genomic regions with H3K4me3/H3K36me3 signatures; eRNAs originate from activated enhancers with H3K4me1/H3K27ac signatures; the junction site sequences of circSTATB1 were reverse transcribed and inserted into an enhancer with active H3K4me1 signatures. b Sno-lncRNAs maintain their stability by their classical stem-loop structures of snoRNAs. c Alternative splicing within circRNAs. d A number of novel small ncRNAs derived from rRNAs (rRFs), tRNAs (tsRNAs), and snoRNAs (sdRNAs) have also been found to be enriched in RNA-induced silencing complexes (RISCs) and function in a miRNA-like pathway
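The chromatin-signature rule in panel a can be sketched as a small lookup. This is only an illustrative sketch: the mark sets come from the caption, while the function name, the class labels, and the requirement that both marks of a pair be present are assumptions made here, not a published classifier.

```python
from typing import Set

# Mark pairs taken from the Fig. 1a caption; treating each pair as
# "both marks required" is an assumption of this sketch.
LINCRNA_MARKS = {"H3K4me3", "H3K36me3"}
ERNA_MARKS = {"H3K4me1", "H3K27ac"}


def classify_by_chromatin(marks: Set[str]) -> str:
    """Return a coarse, hypothetical ncRNA class for the marks seen at a locus."""
    if LINCRNA_MARKS <= marks:  # subset test: both lincRNA marks present
        return "lincRNA-like"
    if ERNA_MARKS <= marks:     # both active-enhancer marks present
        return "eRNA-like"
    return "unclassified"


if __name__ == "__main__":
    print(classify_by_chromatin({"H3K4me3", "H3K36me3"}))
    print(classify_by_chromatin({"H3K4me1", "H3K27ac"}))
```

In practice such calls would be made from ChIP-seq peak overlaps and combined with transcript evidence; the sketch only shows the decision rule.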
Characteristics of diverse sequencing methods
| Classification | Techniques | Short description | Strengths of the approach | Weakness | Ref |
|---|---|---|---|---|---|
| Microarrays | Tiling arrays | A method based on probes for discovering transcripts from specific genomic regions. | This approach can provide in-depth analysis of transcripts from target regions of the genome. | Suffers from potential noise as a result of weak binding or cross-hybridization of transcripts to probes. | [ |
| Microarrays | Microarrays | A method based on a large number of oligonucleotide probes for performing quick global or parallel expression analysis of the transcriptome. | Small size and high-throughput capabilities. | This method is not able to discover novel transcripts. | [ |
| RNA-seq | RNA-seq | A technique that is currently the most widespread sequencing technology for both detecting RNA expression and discovering novel RNAs. | The method provides global high-throughput detection and identification of RNAs greater than 200 nt. | Its standard procedure is not suitable for detection of RNAs less than 200 nt. It also suffers from sequence errors at the reverse-transcription step or primer bias. | [ |
| RNA-seq | RNA capture sequencing | A derivative technology combining RNA-seq with tiling arrays. | The method can specifically elevate the sequencing depth of target regions. | Suffers from the disadvantages of both tiling arrays and RNA-seq. | [ |
| scRNA-seq | Smart-seq | A scRNA-seq method based on a full-length cDNA amplification strategy. | Provides full-length cDNA amplification of polyadenylated RNAs. | The limitations are lack of strand-specific identification, inability to read transcripts longer than 4 kb, and restriction to polyadenylated RNAs. | [ |
| scRNA-seq | DP-seq | A scRNA-seq method using heptamer primers. | Suitable for smaller size samples or transcripts longer than 4 kb. This approach also suppresses highly expressed rRNAs in the cDNA library. | Captured RNAs are limited to polyadenylated RNAs. | [ |
| scRNA-seq | Quartz-seq | A scRNA-seq method which reduces background noise. | Reduces background noise by using special suppression PCR primers to reduce side products. | The method is limited to detecting polyadenylated RNAs. | [ |
| scRNA-seq | SUPeR-seq | A single-cell universal polyadenylated tail-independent RNA sequencing method. | Detects polyadenylated and nonpolyadenylated RNAs. Minimal rRNA contamination. | Relatively low sensitivity for nonpolyadenylated RNAs. | [ |
| scRNA-seq | RamDA-seq | A full-length total RNA-sequencing method for analyzing single cells. | High sensitivity for nonpolyadenylated RNAs. It can also uncover the dynamics of recursive splicing. | Unknown | [ |
| Small RNA-seq | Small RNA-seq | A type of RNA-seq that discriminates small RNAs from larger RNAs to better evaluate and discover novel small RNAs. | Specifically detects and discovers small or intermediate-sized RNAs with target sizes. | Adapter ligation bias leads to reverse transcription bias or amplification bias. | [ |
| Single-cell small-RNA sequencing | Small-seq | A method which detects small RNAs in a single cell. | The method can detect small RNAs in a single cell. | The limitations may be similar to those of small RNA-seq. | [ |
| Nascent RNA-seq | GRO-seq | A method labeling nascent RNAs with 5Br-UTP and immunoprecipitating RNAs for sequencing. | Detects nascent RNAs and provides a genome-wide view of the location, orientation, and density of Pol II-engaged transcripts. | The method is confounded by contamination due to nonspecific binding, which could possibly result in experimental bias. | [ |
| Nascent RNA-seq | SLAM-seq | A method distinguishing nascent RNA from total RNA via s4U-to-C conversion induced by nucleophilic substitution chemistry. | It is an enrichment-free method which can avoid contamination induced by affinity purification. | The oxidation condition caused certain oxidative damage to guanine, which may impact the accuracy of sequencing. | [ |
| Nascent RNA-seq | TimeLapse-seq | A method distinguishing nascent RNA from total RNA via s4U-to-C conversion induced by an oxidative nucleophilic aromatic substitution reaction. | It is an enrichment-free method which can avoid contamination induced by affinity purification. | The oxidation condition caused certain oxidative damage to guanine, which may impact the accuracy of sequencing. | [ |
| Nascent RNA-seq | AMUC-seq | A method distinguishing nascent RNA from total RNA via transforming s4U into a cytidine derivative using acrylonitrile. | More efficient and reliable because it has a minimal influence on the base-pairing manner of other nucleosides. | Unknown | [ |
| Identification of RNA-chromatin interaction | GRID-seq | A method that aims to comprehensively detect and determine the localization of all potential chromatin-interacting RNAs. | Uses a bivalent linker to ligate RNA to DNA in situ and provides exact profiles of the RNA-chromatin interactome. | Usable sequence length for mapping RNA is 18–23 bp; such a short sequence length can result in ambiguity in mapping. | [ |
| Identification of RNA-chromatin interaction | iMARGI | A method providing an in situ mapping of the RNA-genome interactome. | iMARGI requires fewer input cells and is suitable for paired-end sequencing. | Unknown | [ |
| Identification of RNA-chromatin interaction | ChAR-seq | A chromatin-associated RNA sequencing method that maps genome-wide RNA-to-DNA contacts. | Uncovers chromosome-specific dosage compensation ncRNAs and genome-wide trans-associated RNAs. | The method needs more than 100 million input cells. | [ |
| Identification of RNA-RNA interaction | CLASH | A relatively early method that uses UV cross-linking to capture direct RNA-RNA hybridization. | Avoids noise from protein intermediate-mediated interactions. | This method only detects RNA-RNA interactions based on proteins. | [ |
| Identification of RNA-RNA interaction | RIPPLiT | A transcriptome-wide method for probing the 3D conformations of RNAs stably associated with defined proteins. | The method can capture 3D RNP structural information independent of base pairing. | This method only detects RNA-RNA interactions based on proteins. | [ |
| Identification of RNA-RNA interaction | MARIO | A method identifying RNA-RNA interactions in the vicinity of all RNA-binding proteins using a biotin-linked reagent. | This method can identify RNA-RNA interactions in the vicinity of all RNA-binding proteins. | The method only detects RNA-RNA interactions based on proteins. | [ |
| Identification of RNA-RNA interaction | PARIS | Psoralen analysis of RNA interactions and structures with high throughput and resolution. | Directly measures RNA-RNA interactions independent of proteins in living cells. | Unknown | [ |
| Identification of RNA-RNA interaction | LIGR-seq | A method for the global-scale mapping of RNA-RNA interactions in vivo. | Provides global-scale mapping of RNA-RNA interactions independent of proteins in vivo. | Unknown | [ |
| Identification of RNA-RNA interaction | SPLASH | A method providing pairwise RNA-RNA partnering information genome-wide. | Maps pairwise RNA interactions in vivo with high sensitivity and specificity, genome-wide. | Unknown | [ |
| Identification of RNA-RNA interaction | RIC-seq | RNA in situ conformation sequencing technology for the global mapping of intra- and intermolecular RNA-RNA interactions. | The method performs RNA proximity ligation in situ and can facilitate the generation of 3D RNA interaction maps. | Unknown | [ |
| Identification of RNA-RNA interaction | RNA proximity sequencing | A method based on massive-throughput RNA barcoding of particles in water-in-oil emulsion droplets. | This method can detect multiple RNAs in proximity to each other without ligation and is suited to studying the spatial organization of RNAs in the nucleus. | Unknown | [ |
| RNAs in protein complexes or subcellular structures | FISSEQ | A method that offers in situ information of RNAs at high-throughput levels. | Provides information of RNAs at high-throughput levels. Visualization. | Unknown | [ |
| RNAs in protein complexes or subcellular structures | CeFra-seq | A method that physically isolates subcellular compartments and identifies their RNAs. | The method has high sensitivity for low-abundance transcripts. | The method is limited by isolation protocols and the purity of the resulting isolates. | [ |
| RNAs in protein complexes or subcellular structures | APEX-RIP | A method that maps organelle-associated RNAs in living cells via proximity biotinylation combined with protein-RNA crosslinking. | The technique can offer high specificity and sensitivity in targeting the transcriptome of membrane-bound organelles. | Unknown | [ |
Fig. 2 Technologies for novel ncRNA discovery. a Process diagrams of RNA-seq. RNA-seq with purposeful experimental treatments can be used to detect diverse species of ncRNAs, including lncRNAs, circRNAs, and small ncRNAs. b Process diagrams of scRNA-seq. (I) The schematic of single-cell RNA-seq. Single cells are isolated and lysed to release total RNAs. RNAs are then reverse transcribed into first-strand cDNAs using designed primers followed by amplification for RNA-seq. (II–IV) The detailed schematic of innovative and novel methods such as Smart-seq (II), SUPeR-seq (III), and RamDA-seq (IV). In Smart-seq, polyadenylated RNAs are reverse transcribed into a pool of cDNAs by oligo (dT) primers followed by adding nontemplate C nucleotide tails to the 3′ ends (II); however, SUPeR-seq uses random primers with fixed anchor sequences for cDNA synthesis, followed by adding poly(A) tails to the 3′ ends (III). (IV) RamDA-seq uses both oligo (dT) and random primers for cDNA synthesis. cDNA is synthesized by the RNA-dependent DNA polymerase activity of RNase H minus reverse transcriptase (RTase). DNase I selectively nicks the cDNA of the RNA:cDNA hybrid strand. The 3′ cDNA strand is displaced by the strand displacement activity of RTase mediated by the T4 gene 32 protein (gp32), starting from the nick randomly introduced by DNase I. cDNA is amplified and protected by gp32 from DNase I. NSR: not-so-random primer
Fig. 3 Process diagrams of representative nascent RNA-seq methods. a Schematic of GRO-seq. In this approach, nascent RNAs are labeled with 5Br-UTP and immunoprecipitated with an anti-Br-UTP antibody; the isolated RNAs subsequently undergo deep sequencing. b Schematic of methods based on base mutation for nascent RNA detection. Nascent RNAs are labeled with a thiol-labeled nucleoside (s4U or s6G), and these newly synthesized RNAs can then be isolated and treated with specific chemical reagents, such as thiol (SLAM-seq) and acrylonitrile (AMUC-seq), leading to a change in the base-pairing manner of metabolically incorporated nucleosides
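The readout of the base-mutation methods in panel b can be illustrated with a short sketch: reads from s4U-labeled (nascent) RNA appear in sequencing as T-to-C mismatches against the reference. The function names, the ungapped same-strand alignment, and the `min_conversions` threshold below are assumptions made for illustration, not part of any published pipeline.

```python
def count_t_to_c(reference: str, read: str) -> int:
    """Count positions where the reference has T and the read has C.

    Assumes the read is already aligned to the reference without gaps
    and on the same strand (a simplification for this sketch).
    """
    if len(reference) != len(read):
        raise ValueError("reference and read must have the same length")
    return sum(
        1
        for ref_base, read_base in zip(reference.upper(), read.upper())
        if ref_base == "T" and read_base == "C"
    )


def is_nascent(reference: str, read: str, min_conversions: int = 1) -> bool:
    """Flag a read as likely nascent; the threshold is a hypothetical choice."""
    return count_t_to_c(reference, read) >= min_conversions


if __name__ == "__main__":
    print(count_t_to_c("ATTGT", "ACTGC"))  # two T-to-C positions
    print(is_nascent("ATTG", "ATTG"))      # no conversions
```

A real analysis would also need to separate true conversions from SNPs and sequencing errors, which this sketch ignores.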
Fig. 4 Technologies for discovery of RNA-chromatin interaction. a Process diagrams of GRID-seq. Cells are fixed with disuccinimidyl glutarate (DSG) and formaldehyde. Then, nuclei are extracted, and DNA is digested in situ by the frequent 4-base cutter AluI. A specifically designed biotin-labeled bivalent linker, consisting of a single-stranded RNA (ssRNA) portion to ligate RNA and a double-stranded DNA (dsDNA) portion to ligate DNA, is used to link RNAs to AluI-digested genomic DNAs. DNA ligation to AluI-digested genomic DNA is performed in situ followed by affinity purification on streptavidin beads. Then, ssDNA is released from the beads, converted into dsDNA, cleaved by the type II restriction enzyme MmeI, and sequenced. b Overview of the ChAR-seq method. RNA-DNA contacts are preserved by crosslinking, followed by in situ ligation of the 3′ end of RNAs to the 5′ end of the ssDNA tail of a bivalent linker containing biotin and a DpnII-complementary overhang on the opposite end. After generating a strand of cDNA complementary to the RNA, the genomic DNA is then digested with DpnII and then re-ligated, capturing proximally associated bridge molecules and RNA. The chimeric molecules are reverse transcribed, purified, and sequenced
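Both methods in this figure yield chimeric reads in which an RNA-derived segment and a DNA-derived segment flank a known linker. A minimal sketch of the first computational step, splitting a read at the linker, is given below. The linker sequence, the RNA-linker-DNA orientation, and the function name are hypothetical placeholders; the actual linker designs and read layouts are defined by each protocol.

```python
from typing import Optional, Tuple


def split_chimeric_read(read: str, linker: str) -> Optional[Tuple[str, str]]:
    """Split a chimeric read into (rna_part, dna_part) around the linker.

    Assumes an RNA-linker-DNA layout with an exact linker match; returns
    None when the linker is absent or either flank is empty.
    """
    idx = read.find(linker)
    if idx == -1:
        return None
    rna_part = read[:idx]
    dna_part = read[idx + len(linker):]
    if not rna_part or not dna_part:
        return None
    return rna_part, dna_part


if __name__ == "__main__":
    demo_linker = "GGGCCC"  # placeholder sequence, not a real linker
    print(split_chimeric_read("ACGTAC" + demo_linker + "TTGACA", demo_linker))
```

Each flank would then be mapped separately, the RNA part to the transcriptome and the DNA part to the genome, to produce an RNA-to-chromatin contact.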
Fig. 5 Technologies for capturing RNA secondary structures and tertiary interactions. Schematics of CLASH, RIPPLiT, MARIO, PARIS, and RIC-seq. MNase, micrococcal nuclease
Fig. 6 Technologies for discovery of RNA location. a Schematics of FISSEQ and STARmap. FISSEQ begins with fixing cells on a glass slide and performing reverse transcription in situ with aminoallyl-dUTP and adapter sequence-tagged random hexamers. After RT, cDNA fragments are circularized at 60 °C. The circular templates are amplified using rolling-circle amplification (RCA) primers complementary to the adapter sequence in the presence of aminoallyl-dUTP and stably cross-linked. The nucleic acid amplicons in cells are then ready for sequencing and imaging. STARmap begins with labeling of cellular RNAs by pairs of DNA probes followed by enzymatic amplification so as to produce a DNA nanoball (amplicon). Tissue can then be transformed into a 3D hydrogel DNA chip by anchoring DNA amplicons via an in situ-synthesized polymer network. b Schematics of biochemical cell fractionation. Biological extracts including intact organelles are separated by density gradient or immunoprecipitation with specific antibodies. c Overview of APEX-RIP. Cells expressing APEX2 targeted to the mitochondria are cultured with the APEX substrate biotin-phenol. H2O2 initiates biotinylation of proximal endogenous proteins, which are subsequently crosslinked to nearby RNAs by formaldehyde. After cell lysis, biotinylated species are enriched by streptavidin pull-down, and coeluting RNAs are analyzed by RNA-seq
Database of ncRNAs
| Cancer or basis | Database | Species | Website | Short description | Ref |
|---|---|---|---|---|---|
| Cancer | Lnc2Cancer v2.0 | lncRNA | | An updated database that provides comprehensive experimentally supported associations between lncRNAs and human cancers. | [ |
| Cancer | TANRIC | lncRNA | | This database characterizes the expression profiles of lncRNAs in large patient cohorts of 20 cancer types, including TCGA and independent datasets (> 8000 samples overall). | [ |
| Cancer | lnCaNet | lncRNA | | This database provides a comprehensive co-expression data resource which reveals the interactions between lncRNAs and non-neighbouring cancer genes. | [ |
| Cancer | LncRNADisease 2.0 | lncRNA | | A database integrating comprehensive experimentally supported and predicted lncRNA-disease associations. | [ |
| Cancer | The Cancer LncRNome Atlas | lncRNA | | An academic research database to explore lncRNA alterations across multiple human cancer types. | [ |
| Cancer | SELER | lncRNA | | A database of super-enhancer-associated lncRNA-directed transcriptional regulation in human cancers. | [ |
| Cancer | CSCD | circRNA | | A database that focuses on distinguishing cancer-specific circRNAs from noncancerous circRNAs, and reports predicted cellular location, RBP sites, and ORFs. | [ |
| Cancer | Circ2Traits | circRNA | | Provides circRNA-disease associations based on the interaction of circRNAs with disease-related miRNAs and SNPs mapped on circRNA loci. | [ |
| Cancer | CircR2Disease | circRNA | | Provides a comprehensive resource for circRNA deregulation in various diseases, containing 725 associations between 661 circRNAs and 100 diseases. | [ |
| Cancer | CircRNA disease | circRNA | | A manually curated database of experimentally supported circRNA-disease associations. | [ |
| Cancer | MiOncoCirc | circRNA | | circRNA detection in 2093 clinical human cancer samples using exome capture sequencing. | [ |
| Cancer | CircRiC | circRNA | | A database focusing on lineage-specific circRNAs in 935 cancer cell lines, including drug response. | [ |
| Cancer | miRCancer | miRNA | | A database that currently documents more than 9000 relationships between 57,984 miRNAs and 196 human cancers. | [ |
| Cancer | SomamiR 2.0 | miRNA | | A database of cancer somatic mutations in microRNAs (miRNA) and their target sites that potentially alter the interactions between miRNAs and competing endogenous RNAs (ceRNA). | [ |
| Cancer | OncomiR | miRNA | | An online resource for exploring miRNA dysregulation in cancer. | [ |
| Cancer | miRCancerdb | miRNA | | An easy-to-use database to investigate the microRNA-dependent regulation of target genes involved in the development of cancer. | [ |
| Cancer | miR2Disease | miRNA | | A database aiming at providing a comprehensive resource of microRNA deregulation in various human diseases. | [ |
| Cancer | YM500v3 | small ncRNA | | A database which contains more than 8000 small RNA-seq datasets and focuses on piRNAs, tRFs, snRNAs, snoRNAs, and miRNAs. | [ |
| Cancer | tRF2Cancer | small ncRNA | | A web server to detect tRFs and their expression in multiple cancers. | [ |
| Cancer | MINTbase v2.0 | small ncRNA | | A framework for the interactive exploration of mitochondrial and nuclear tRNA fragments. | [ |
| Basis | LNCipedia | lncRNA | | A public database for lncRNA sequence and annotation. | [ |
| Basis | LNCediting | lncRNA | | This database provides a comprehensive resource for the functional prediction of RNA editing in lncRNAs. | [ |
| Basis | lncRNAdb v2.0 | lncRNA | | This database provides comprehensive annotations of eukaryotic lncRNAs. | [ |
| Basis | LncRNAWiki | lncRNA | | This database is a publicly editable and open-content platform for community curation of human lncRNAs. | [ |
| Basis | LncBook | lncRNA | | This database is a curated knowledgebase of human lncRNAs. | [ |
| Basis | MONOCLdb | lncRNA | | A database of 20,728 mouse lncRNA genes. | [ |
| Basis | NONCODE | lncRNA | | An interactive database that aims to present the most complete collection and annotation of ncRNAs, especially lncRNAs, from 17 species. | [ |
| Basis | CircAtlas | circRNA | | An integrated resource of one million highly accurate circular RNAs from 1070 vertebrate transcriptomes. | [ |
| Basis | circBase | circRNA | | A database containing thousands of recently identified circRNAs in eukaryotic cells. | [ |
| Basis | CIRCpedia v2 | circRNA | | A database for comprehensive circRNA annotation from over 180 RNA-seq datasets across six different species. | [ |
| Basis | TSCD | circRNA | | A tissue-specific circRNA database built from RNA-seq datasets that characterizes the features of circRNAs in human and mouse. | [ |
| Basis | starBase v2.0 | miRNA | | A database decoding miRNA-ceRNA, miRNA-ncRNA, and protein–RNA interaction networks from large-scale CLIP-Seq data. | [ |
| Basis | miRTarBase | miRNA | | A resource for experimentally validated microRNA-target interactions. | [ |
| Basis | miRmine | miRNA | | A database of human miRNA expression profiles. | [ |
| Basis | EVmiRNA | miRNA | | A database focusing on miRNA expression profiles in extracellular vesicles. | [ |
| Basis | miRGate | miRNA | | A curated database of human, mouse, and rat miRNA–mRNA targets. | [ |
| Basis | miRBase | miRNA | | A database containing microRNA sequences from 271 organisms: 38,589 hairpin precursors and 48,860 mature microRNAs. | [ |
| Basis | DIANA-TarBase v8 | miRNA | | A reference database devoted to the indexing of experimentally supported miRNA targets. | [ |
| Basis | DASHR 2.0 | small ncRNA | | A database that integrates human small ncRNA genes and mature products derived from all major RNA classes. | [ |
Developments in diagnostic kits for cancer diagnosis (EPO https://worldwide.espacenet.com)
| Species | Name | Expression in cancer | Diseases | Application | Patent number |
|---|---|---|---|---|---|
| circRNA | hsa_circ_0028185 | Up | Hepatocellular carcinoma | Cancer auxiliary diagnosis | CN111004850A (2020) |
| circRNA | hsa_circ_001477 | Up | Gastric cancer | Cancer diagnosis | CN110129324A (2019) |
| circRNA | hsa_circRNA_012515 | Up | Non-small cell lung cancer | Cancer diagnosis | CN110592223A (2019) |
| circRNA | hsa_circRNA_405124 or hsa_circ_0012152 | Up | Leukemia | Cancer early diagnosis | CN109593859A (2019) |
| circRNA | circ_104075 | Up | Liver cancer | Cancer diagnosis | CN109161595A (2019) |
| circRNA | circ3823 | Up | Colorectal cancer | Cancer early diagnosis | CN110592220A (2019) |
| circRNA | hsa_circ_0021977 | Up | Breast cancer | Cancer diagnosis | CN109022583A (2018) |
| circRNA | hsa_circ_0012755 | Up | Prostate cancer | Cancer diagnosis | CN108624688A (2018) |
| circRNA | circ_0047921, circ_0007761 and circ_0056285 | Up | Non-small cell lung cancer | Cancer early diagnosis | CN108179190A (2018) |
| circRNA | hsa-circRPL15-001 | Up | Chronic lymphocytic leukemia | Cancer diagnosis | CN109055564A (2018) |
| circRNA | hsa_circ_0117909 | Up | Acute lymphoblastic leukemia | Cancer diagnosis | CN107937522A (2017) |
| circRNA | hsa_circ_0005720 | Down | Acute lymphoblastic leukemia | Cancer diagnosis | CN107937522A (2017) |
| circRNA | cRNA-ZFR | Up | Bladder cancer | Cancer diagnosis | CN106011139A (2016) |
| lncRNA | lncRNA-AC006159.3 | Down | Colorectal cancer | Cetuximab-resistance diagnosis | CN108949993A (2018) |
| lncRNA | lncRNAXLOC_004122, Linc00467 and lncRNAA1049452 | Up | Breast cancer | Cancer bone metastasis diagnosis | CN107699619A (2017) |
| lncRNA | LncRNA GENE NO.9 | Up | Bladder cancer | Cancer diagnosis | CN107267636A (2017) |
| lncRNA | LINC00516 | Up | Lung cancer | Cancer or cancer metastasis diagnosis | CN108998528A (2018) |
| lncRNA | LSAMP-AS1 | Up | Gastric cancer | Cancer diagnosis | CN110628915A (2019) |
| miRNA | miRNA-4692 | Down | Hepatocellular carcinoma | Cancer diagnosis | (2018) |
| miRNA | miRNA-1266 | Up | Endometrial carcinoma | Cancer diagnosis | CN105907883A (2016) |
| miRNA | miR-320 | Down | Cervical cancer | Cancer early diagnosis | CN105506076A (2016) |
| miRNA | miRNA-2116 | Up | Lung adenocarcinoma | Cancer metastasis diagnosis | CN104774966A (2015) |
| miRNA | miRNA-410 | Up | Prostate cancer | Cancer diagnosis | CN104651492A (2015) |
| miRNA | miRNA-1262 | Up | Acute myeloid leukemia | Cancer diagnosis | CN105063052A (2015) |