Alexander R Gawronski, Michael Uhl, Yajia Zhang, Yen-Yi Lin, Yashar S Niknafs, Varune R Ramnarine, Rohit Malik, Felix Feng, Arul M Chinnaiyan, Colin C Collins, S Cenk Sahinalp, Rolf Backofen.
Abstract
Motivation: Long non-coding RNAs (lncRNAs) are defined as transcripts longer than 200 nt that are not translated into proteins. These transcripts are often processed (spliced, capped and polyadenylated), and some are known to have important biological functions. However, most lncRNAs have unknown or poorly understood functions. Nevertheless, because of their potential role in cancer, lncRNAs are receiving a lot of attention, and the need for computational tools to predict their possible mechanisms of action is greater than ever. Fundamentally, most of the known lncRNA mechanisms involve RNA–RNA and/or RNA–protein interactions. Through accurate predictions of each kind of interaction and integration of these predictions, it is possible to elucidate potential mechanisms for a given lncRNA.
Year: 2018 PMID: 29617966 PMCID: PMC6137976 DOI: 10.1093/bioinformatics/bty208
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
Fig. 1. Overview of the MechRNA pipeline. IntaRNA2 computes the optimal RNA–RNA interaction sites between the lncRNA and the accessible regions of targets/transcriptome. GraphProt predicts protein binding sites for all specified RBPs on all targets and the lncRNA. Information derived from these predictions, as well as correlation data, is used to generate candidate mechanisms. Finally, the candidate with the lowest joint P-value is selected for each lncRNA-target pair, and an output list of mechanisms is produced. (*) Since at the time of publication only 22 RBP CLIP-Seq datasets were available for non-splicing related, post-transcriptional regulation proteins
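The final selection step described in the caption can be sketched as follows. This is a minimal illustration, not the MechRNA implementation: the `Candidate` record, its field names and the `best_per_pair` function are hypothetical, and the joint P-value is assumed to have been computed already for each candidate mechanism.

```python
from typing import Dict, Iterable, List, NamedTuple, Tuple


class Candidate(NamedTuple):
    """Hypothetical record for one candidate mechanism (names are assumptions)."""
    lncrna: str
    target: str
    mechanism: str
    joint_p: float  # assumed to be precomputed upstream


def best_per_pair(candidates: Iterable[Candidate]) -> List[Candidate]:
    """Keep the candidate with the lowest joint P-value for each lncRNA-target pair."""
    best: Dict[Tuple[str, str], Candidate] = {}
    for cand in candidates:
        key = (cand.lncrna, cand.target)
        if key not in best or cand.joint_p < best[key].joint_p:
            best[key] = cand
    # Return the selected mechanisms ranked by joint P-value, most significant first.
    return sorted(best.values(), key=lambda c: c.joint_p)


# Placeholder inputs for illustration only; these are not results from the paper.
example = [
    Candidate("lncA", "geneX", "Direct", 1e-3),
    Candidate("lncA", "geneX", "Competitive", 1e-5),
    Candidate("lncA", "geneY", "Stabilization", 1e-4),
]
selected = best_per_pair(example)
```

Keying on the (lncRNA, target) pair mirrors the caption's statement that one mechanism is reported per pair; ties are resolved in favour of the first candidate seen, which is an arbitrary choice made for this sketch.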
Fig. 2. Example execution of the splitting algorithm with a max sequence length of 1000 nt, where the red interval is the one being processed. (i) The first iteration starts with the entire sequence, which is longer than the threshold. (ii) The first split occurs at the position with max ED at ∼1700 nt. (iii) The interval is still too long, so a second split is made at the next position of max ED at ∼250 nt. (iv) The interval is now below the threshold, so the iteration continues to the next interval. (v) This interval is over the threshold and is split at ∼800 nt. (vi) and (vii) The next two intervals are below the threshold. (final) The end result is four intervals, all below the length threshold and more accessible than their split positions
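The splitting procedure in this caption can be sketched in a few lines. This is a hedged illustration under stated assumptions rather than the authors' code: the function name `split_at_max_ed` is hypothetical, ED is assumed to be available as one value per position, and the split position itself is assumed to be excluded from both resulting intervals.

```python
from typing import List, Sequence, Tuple


def split_at_max_ed(ed: Sequence[float], max_len: int) -> List[Tuple[int, int]]:
    """Split a sequence into half-open intervals [start, end) of length <= max_len.

    Any interval that is too long is cut at its position of maximum ED.
    Assumption: the cut position is dropped from both halves.
    """
    if max_len < 1:
        raise ValueError("max_len must be at least 1")
    intervals: List[Tuple[int, int]] = []
    stack = [(0, len(ed))]
    while stack:
        start, end = stack.pop()
        if end - start <= max_len:
            if end > start:  # skip empty intervals left by a cut at a boundary
                intervals.append((start, end))
            continue
        # Position with the highest ED inside the current interval.
        cut = max(range(start, end), key=lambda i: ed[i])
        # Push the right half first so the left half is processed next,
        # matching the left-to-right order shown in the figure.
        stack.append((cut + 1, end))
        stack.append((start, cut))
    return intervals


# Toy ED profile (illustrative values only), with a length threshold of 4.
toy_ed = [0, 1, 0, 0, 9, 0, 0, 0, 5, 0]
toy_intervals = split_at_max_ed(toy_ed, 4)
```

An explicit stack is used instead of recursion so that long transcripts do not hit Python's recursion limit; each cut removes at least one position, so the loop always terminates.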
List of RBPs used in the analysis including the source CLIP-Seq data and model type
| Gene ID | Gene symbol | Protein | Model type | Protocol | Reference |
|---|---|---|---|---|---|
| ENSG00000092199 | HNRNPC | hnRNP C | Sequence | eCLIP | ( |
| ENSG00000165119 | HNRNPK | hnRNP K | Sequence | eCLIP | ( |
| ENSG00000066044 | ELAVL1 | HuR | Sequence | PAR-CLIP | ( |
| ENSG00000102081 | FMR1 | FMR-1 | Structure | eCLIP | ( |
| ENSG00000121774 | KHDRBS1 | Sam68 | Structure | eCLIP | ( |
| ENSG00000172660 | TAF15 | TAF15 | Sequence | PAR-CLIP | ( |
| ENSG00000092847 | AGO1 | argonaute | Structure | PAR-CLIP | ( |
| ENSG00000123908 | AGO2 | argonaute-2 | Structure | PAR-CLIP | ( |
| ENSG00000126070 | AGO3 | argonaute-3 | Structure | PAR-CLIP | ( |
| ENSG00000134698 | AGO4 | argonaute-4 | Structure | PAR-CLIP | ( |
| ENSG00000182944 | EWSR1 | EWS | Structure | eCLIP | ( |
| ENSG00000089280 | FUS | FUS | Sequence | PAR-CLIP | ( |
| ENSG00000159217 | IGF2BP1 | IGF2BP1 | Structure | PAR-CLIP | ( |
| ENSG00000073792 | IGF2BP2 | IGF2BP2 | Structure | PAR-CLIP | ( |
| ENSG00000136231 | IGF2BP3 | IGF2BP3 | Structure | PAR-CLIP | ( |
| ENSG00000155363 | MOV10 | MOV-10 | Sequence | PAR-CLIP | ( |
| ENSG00000055917 | PUM2 | Pumilio-2 | Sequence | eCLIP | ( |
| ENSG00000112531 | QKI | Hqk | Structure | eCLIP | ( |
| ENSG00000120948 | TARDBP | TDP-43 | Sequence | eCLIP | ( |
| ENSG00000116001 | TIA1 | TIA-1 | Sequence | eCLIP | ( |
| ENSG00000090905 | TNRC6A | TNRC6A | Structure | eCLIP | ( |
| ENSG00000197157 | SND1 | SND1 | Structure | eCLIP | ( |
Fig. 3. Illustration of the possible mechanisms that can be inferred from RNA–RNA and RNA–protein interactions
Descriptions of known lncRNA mechanisms
| Mechanism | Description | Example |
|---|---|---|
| | RBP interaction directly impacts the target or lncRNA | hnRNPL binding to DSCAM-AS1 ( |
| | RNA–RNA interaction directly impacts the target with no RBP involvement | TINCR stabilization of various mRNAs ( |
| | RNA–RNA interaction increases/decreases the affinity of RBP binding nearby | iNOS stabilization by AS via HuR ( |
| | RBP bound to the lncRNA is brought into the vicinity of the target through RNA–RNA interaction | MALAT1 localization of splicing factors ( |
| | RBP is sequestered from the target by the lncRNA | Gas5-AS binding transcription factors ( |
| | RBP and lncRNA compete for the same binding location on the target | 7SL disrupts HuR stabilization of TP53 ( |
| | A dsRNA binding protein interacts with stems created from lncRNA interaction | STAU1-mediated decay ( |
| | The lncRNA facilitates the formation of a complex between multiple proteins | HOTAIR and the polycomb complex ( |
Note: Mechanisms in italics are not included in the predictions.
Selected LncRNAs for MechRNA analysis
| LncRNA | Length (nt) | Target | Protein binding | Mechanism | Cancer type |
|---|---|---|---|---|---|
| 7SL | 299 | TP53 | HuR | Competitive | Prostate |
| PCAT1 | 1992 | BRCA2 | HuR | Competitive? | Prostate |
| ARlnc1 | 2786 | AR | Unknown | Unknown | Prostate |
| PCA3 | 3922 | Unknown | Unknown | Unknown | Prostate |
| PCAT29 | 694 | Unknown | Unknown | Unknown | Prostate |
| LINC00514 | 3385 | CLDN9 | Unknown | Unknown | NEPC |
| SSTR5-AS1 | 2864 | SSTR5 | Unknown | Unknown | NEPC |
| TINCR | 3733 | STAU1 | Many | Stabilization | Various |
Note: The lncRNAs vary in how much is known about their mechanisms, allowing MechRNA to be tested with various amounts of a priori data. The question mark for PCAT1 indicates that competitive binding is a hypothesis that has not yet been validated.
Fig. 4. Distribution of RNA–RNA and RNA–protein interactions for PCAT1. The circles represent proteins and have a diameter relative to their molecular masses. The curve above the exons is a histogram of the frequency of RNA–RNA binding of each position with other transcripts
Fig. 5. (A) IgG/HuR/AR/Ago2 proteins were immunoprecipitated with antibodies in LNCaP cells, and the bound BRCA2 RNA was detected by qPCR. The result confirms binding of BRCA2 mRNA to HuR protein in cells. (B) RWPE cells stably expressing lac-Z, PCAT1-FL or PCAT1-delta-1-250 were harvested. HuR was immunoprecipitated using anti-HuR antibody, and bound RNA (BRCA2) was detected by qPCR. As shown in A, HuR can bind to BRCA2. In the presence of FL-PCAT1, this binding is inhibited. In the presence of PCAT1-delta-1-250, there was no effect on HuR and BRCA2 binding. The result confirms the role of PCAT1 in mediating BRCA2-HuR binding
Selected lncRNA mechanism predictions for known cancer genes, chosen based on rank (joint P-value) and agreement with known roles of the cancer genes and RBPs
| lncRNA | Target | Iso. | RNA–RNA FE | RNA–RNA context | RNA–RNA cor. | RNA–RNA cor. FDR | RBP–target RBP | RBP–target cor. | RBP–target cor. FDR | RBP–lncRNA RBP | RBP–lncRNA cor. | RBP–lncRNA cor. FDR | Mechanism type | P-value |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | | 3 | −65.97 | 5′UTR | + | 1.3 | None | NA | NA | None | NA | NA | Direct | 2.6 |
| | | 2 | −31.63 | 3′UTR | − | 0.001 | IGF2BP2 | + | 1.3 | None | NA | NA | Competitive | 1.2 |
| | BMPR1A | 1 | −28.32 | 3′UTR | + | 0.182 | IGF2BP3 | + | 2.3 | None | NA | NA | Stabilization | 6.6 |
| | | 5 | −44.55 | 5′UTR | − | 0.023 | TAF15 | + | 0.111 | None | NA | NA | De-stabilization | 6.8 |
| | | 1 | −26.57 | 5′UTR | + | 1.3 | None | NA | NA | None | NA | NA | Direct | 4.8 |
| | | 1 | −60.4096 | 5′UTR | + | 0.083 | EWSR1 | − | 0.006 | None | NA | NA | Competitive | 5.5 |
| | TP53 | 7 | −33.18 | 3′UTR | + | 0.006 | HNRNPC | + | 0.081 | None | NA | NA | Stabilization | 1.6 |
| | | 1 | −32.97 | 3′UTR | + | 0.159 | KHDRBS1 | + | 4.1 | None | NA | NA | Stabilization | 2.2 |
| | | 5 | −27.25 | 3′UTR | + | 2.4 | None | NA | NA | None | NA | NA | Direct | 5.6 |
| | | 1 | −48.02 | 5′UTR | + | 0.013 | None | NA | NA | None | NA | NA | Direct | 1.7 |
| | | 1 | −27.7706 | 5′UTR | + | 1.4 | None | NA | NA | None | NA | NA | Direct | 2.0 |
| | DAXX | 7 | −106.10 | CDS | None | NA | None | NA | NA | IGF2BP2 | + | 0.046 | Localization | 2.7 |
| | | 1 | −30.86 | 3′UTR | + | 0.007 | ELAVL1 | + | 0.001 | None | NA | NA | Stabilization | 1.1 |
| | | 2 | −31.95 | 3′UTR | + | 0.039 | ELAVL1 | + | 0.052 | None | NA | NA | Stabilization | 3.0 |
| | | 4 | −36.99 | CDS | + | 0.029 | TAF15 | + | 0.002 | None | NA | NA | Stabilization | 7.2 |
| | | 2 | −32.85 | CDS | + | 4.6 | None | NA | NA | None | NA | NA | Direct | 8.6 |
| | | 3 | −66.93 | 5′UTR | None | NA | None | NA | NA | IGF2BP2 | + | 0.045 | Localization | 5.4 |
Note: Genes in boldface are oncogenes, genes in italics are tumor suppressors and genes in normal text are uncategorized. The first section indicates the target and how many isoforms (Iso.) it interacts with. The next three sections describe the interactions involved. For RNA–RNA, the free energy in kcal/mol (FE) and genomic context are included. For RBP–RNA, the protein name is provided. In all three cases the correlation (+ positive, − negative) and the correlation FDR are shown if applicable. The final section displays the mechanism categorization and the combined P-value.