Vincenza Barresi, Ilaria Cosentini, Chiara Scuderi, Salvatore Napoli, Virginia Di Bella, Giorgia Spampinato, Daniele Filippo Condorelli.
Abstract
Growing awareness of genome complexity has radically changed the study of the transcriptome, drawing attention to single RNAs generated from two or more genes that are adjacent according to the present consensus annotation. Transcripts of this kind were once thought to originate only from chromosomal rearrangements, but the discovery of readthrough transcription revealed a new class of fusion RNAs. In recent years, several possible intergenic cis-splicing mechanisms have been proposed to explain the origin of transcripts that contain exons of both the upstream and the downstream gene. In some cases, alternative mechanisms, such as trans-splicing and transcriptional slippage, have been proposed. Five databases containing validated and predicted Fusion Transcripts of Adjacent Genes (FuTAGs) are available to the scientific community. A comparative analysis revealed that two of them contain the majority of the records. This review provides a complete analysis of the most widely characterized FuTAGs, including their expression patterns in normal tissues and in cancer. Gene structure, intergenic splicing patterns and exon junction sequences have been determined and are reported here for well-characterized FuTAGs. The available functional data and the possible roles in cancer progression are discussed.
Keywords: Cis-SAGe; FuTAG; alternative transcription; cancer; fusion transcript; readthrough
Year: 2019 PMID: 31652751 PMCID: PMC6862657 DOI: 10.3390/ijms20215252
Source DB: PubMed Journal: Int J Mol Sci ISSN: 1422-0067 Impact factor: 5.923
Figure 1. Schematic representation of the structures of different fusion transcripts of adjacent genes (FuTAGs) according to (A) Lu et al. [12] and (B) Yuan et al. [9].
List of experimentally evaluated FuTAGs. Additional exons are highlighted in green letters. Chr: Chromosome; NM and NR: NCBI curated Refseq accession numbers for coding and non-coding transcripts, respectively.
| N. | FuTAG | Upstream Gene | Downstream Gene | Position (Chr) | Tissue/Cell Type | Normal Tissue Expression (GTEx) | NM, NR | ISP Mechanism According to Lu et al. [12] | Ensembl Code | Structure | Junction Exon Sequence | Reference |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | GALT-IL11Rα | | | 9p13 | Normal human cells: T cell clones and fetal bone marrow | Colon, adipocytes, ovary and testis | N.D. | Type I | ENSG00000258728 | ex10-ex2 | GAGCAG-ATGAGC | Magrangeas et al., [ |
| 2 | HHLA1-OC90 | | | 8q24.1–24.3 | Tera1 and NTera2D1 cell lines | N.D. | N.D. | N.D. | N.D. | N.D. | N.D. | Kowalski et al., [ |
| 3 | P2Y11 (PPAN)-SSF1 | | | 19p13.1 | HL-60 cell lines | Heart, thyroid, adrenal gland, ovary, prostate and testis | NM_001040664; NM_001198690 | Type III | ENSG00000243207 | ex12partial-ex2 | ATCGAG-GTGCCA | Communi et al., [ |
| 4 | TWE-PRIL (TNFSF12-TNFSF13) | | | 17p13.1 | T lymphocyte and monocyte cell lines | Kidney, liver and breast | NM_172089 | Type I | ENSG00000248871 | ex6-ex2 | TGTCAG-AGTTCC | Pradet-Balade et al., [ |
| 5 | SLC45A3-ELK4 | | | 1q32 | LNCaP and PC3 prostate cancer cell lines | N.D. | N.D. | Type II | N.D. | N.D. | N.D. | Kumar et al., [ |
| 6 | DEC205-DCL1 (or LY75-CD302) | | | 2q24 | Hodgkin and Reed-Sternberg cells | White blood cells, skeletal muscle, thyroid, adrenal gland | NM_001198759 | Type I | ENSG00000248672 | ex34-ex2 | CTCTGG-ACTGTC | Kato et al., [ |
| 7 | SCNN1A-TNFRSF1A | | | 12p13.31 | Breast cancer cell lines | N.D. | N.D. | Type I | N.D. | ex12-ex2 | GTCACG-GTGCTC | Varley et al., [ |
| 8 | CTSD-IFITM10 | | | 11p15.5 | Breast cancer cell lines | N.D. | N.D. | Type I | N.D. | ex8-ex2 | CTCAAG-GCCCAG | Varley et al., [ |
| 9 | STX16-NPEPL1 | | | 20q13.32 | Acute myeloid leukemia and gastrointestinal stromal tumors | Whole blood, lymph node, brain, cortex, cerebellum, spinal cord, heart, artery, skeletal muscle, small intestine, colon, adipocyte, kidney, liver, lung, spleen, stomach, esophagus, bladder, pancreas, thyroid, salivary gland, adrenal gland, pituitary, breast, skin, ovary, uterus, placenta, prostate, testis | NR_037945.1 | Type IV | ENSG00000254995 | ex8-ex1 (additional intergenic exon)-ex2 (additional intergenic exon)-ex3 (additional intergenic exon)-ex2-6-ex1 (additional intron exon)-ex7-12 | CACAAG-GACTTC_CACACT-TGCCTG_GGGAAG-GCTGGT_ATGGAG-CTCTGG_GGGAAG-AGGGCA_GGGGGT-ACTACC | Wen et al. [ |
| 10 | JMJD7-PLA2G4B | | | 15q15.1 | Human head and neck squamous cell carcinoma cell lines | White blood cells, lymph node, brain, heart, colon, adipocyte, kidney, liver, lung, thyroid, adrenal gland, breast, ovary, prostate, testis | NM_001198588; NM_005090 | N.D. | ENSG00000168970 | ex6-ex2 | GAGAAG-GCAGAG | Cheng et al., [ |
| 11 | miR-200c/141-PTPN6 | | | N.D. | Ovarian tumorigenesis | N.D. | N.D. | N.D. | N.D. | N.D. | N.D. | Batista et al., [ |
| 12 | DUS4L-BCAP29 | | | 7q22.3 | Gastric and prostate cancer tissues | N.D. | N.D. | Type I | N.D. | ex7-ex2 | CAGATG-GTGTGA | Tang et al., [ |
| 13 | TSNAX-DISC1 | | | 1q42.2 | Endometrial carcinoma tissues | Whole blood, brain, cortex, cerebellum, spinal cord, tibial nerve, heart, artery, skeletal muscle, small intestine, colon, adipocyte, kidney, liver, lung, spleen, stomach, esophagus, bladder, pancreas, thyroid, salivary gland, adrenal gland, pituitary, breast, skin, ovary, uterus, prostate, testis | NR_028393; NR_028394; NR_028395; NR_028396; NR_028397; NR_028398; NR_028399; NR_028400 | Type IV | ENSG00000270106 | ex4-ex (additional intergenic exon)-ex2 | ACTACA-AAGTTT_TATTTG-GCAGCC | Li et al., [ |
| 14 | PHOSPHO2-KLHL23 | | | 2q31.1 | Gastric cancer cell lines and tissues | N.D. | NM_001199290; NR_144936 | Type I | ENSG00000213160 | ex3-ex2 | AGTTGG-CCATGG | Choi et al., [ |
| 15 | RPL17-C18orf32 | | | 18q21.1 | Gastric cancer cell lines and tissues | N.D. | NM_001199355; NM_001199356 | Type I | ENSG00000215472 | ex6-ex2 | AAAAAG-TTGAGG | Choi et al., [ |
| 16 | PRR5-ARHGAP8 | | | 22q13.31 | Gastric cancer cell lines and tissues and bipolar disorder | White blood cells, brain, colon, adipocyte, kidney, lung, thyroid, adrenal gland, breast, ovary, prostate, testis | NM_181334 | N.D. | ENSG00000248405 | ex4-ex2 | ATGAGG-AGCTGC | Choi et al., [ |
| 17 | Kua-UVE1 (TMEM189-UBE2V1) | | | 20q13.2 | Colon cancer cell lines | Liver, thyroid, adrenal gland, breast, testis | NM_199203 | Type I | ENSG00000124208 | ex5-ex2 | CCACAG-GAGTAA | Thomson et al., [ |
| 18 | MASK-BP3 (ANKHD1-EIF4EBP3) | | | 5q31.3 | ? | White blood cells, lymph node, brain, heart, skeletal muscle, colon, adipocyte, kidney, liver, lung, thyroid, adrenal gland, breast, ovary, prostate, testis | NM_020690 | Type IV | ENSG00000254996 | ex33-ex (additional intergenic exon)-ex2 | CAGCAG-GCCAGT_CCAGAG-GCACCA | Poulin et al., [ |
| 19 | CTSC-RAB38 | | | 11q14.2 | Clear renal cell carcinoma | N.D. | N.D. | N.D. | N.D. | N.D. | N.D. | Grosso et al., [ |
| 20 | BC039389-GATM (WRB-SH3BGR or KLK4-KRSP1) | | | 21q22.2 | Kidney cancer | N.D. | NM_001317744; NM_001350300 | N.D. | ENSG00000285815 | N.D. | N.D. | Pflueger et al., [ |
| 21 | LHX6-NDUFA8 | | | N.D. | Cervical cancer tissues (PAP smear) | N.D. | N.D. | N.D. | N.D. | Variant 1: ex8-ex2 | ACTTGA-GTGAAA | Wu et al., [ |
| 22 | SLC2A11-MIF | | | N.D. | Cervical cancer tissues (PAP smear) | N.D. | N.D. | N.D. | N.D. | ex9-ex2 | GTTAGT-TACATC | Wu et al., [ |
| 23 | INS-IGF2 | | | 11p15.5 | NSCLC tissues | Whole blood, brain, cortex, cerebellum, spinal cord, tibial nerve, heart, artery, skeletal muscle, colon, adipocyte, kidney, liver, lung, stomach, esophagus, pancreas, thyroid, salivary gland, adrenal gland, pituitary, breast, ovary, testis | NM_001042376; NR_003512 | N.D. | ENSG00000129965 | ex2-ex1partial | TGCAGG-CCTCAG | Gao et al., [ |
| 24 | NFATC3-PLA2G15 | | | 16q22.1 | T-cell acute lymphoblastic leukemia and colorectal cancer | N.D. | N.D. | Type I | N.D. | ex9-ex2 | ATGATG-TCCCTG | Bond et al., [ |
| 25 | BCL2L2-PABPN1 | | | 14q11.2 | Bladder urothelial carcinoma tissues | Whole blood, brain, cortex, cerebellum, spinal cord, tibial nerve, heart, artery, skeletal muscle, small intestine, colon, adipocyte, kidney, liver, lung, spleen, stomach, esophagus, bladder, pancreas, thyroid, salivary gland, adrenal gland, pituitary, breast, skin, ovary, uterus, prostate, testis | NM_001199864 | Type I | ENSG00000258643 | ex3-ex2 | GGCTGG-GAGCTG | Zhu et al., [ |
| 26 | CHFR-GOLGA3 | | | 12q24.33 | Bladder urothelial carcinoma tissues | N.D. | N.D. | Type I | N.D. | N.D. | N.D. | Zhu et al., [ |
Figure 2. Timeline of the five public databases collecting FuTAGs, reporting for each the year of publication, the last update and the number of FuTAGs compared to total records.
Figure 3. Venn diagram comparing ChiTaRs v3.1, ConjoinG, the Tumor Fusion Gene Database, Fusion GDB and NCBI readthrough transcripts. The data contained in each dataset are available in Table S2.
Figure 4. Distribution of FuTAGs in human chromosomes, normalized for the total number of transcripts in each chromosome. The results of ChiTaRs v3.1, ConjoinG and NCBI readthroughs are compared.
Figure 5. (A) Averages (±SEM) of TPM values of all transcripts (All T), upstream ConjoinG transcripts (Upstream T), which include both the upstream parent gene and the upstream part of the fusion transcript, and downstream ConjoinG transcripts (Downstream T), which include both the downstream parent gene and the downstream part of the fusion transcript. N: Normal colonic mucosae; COAD: CIN-positive colon adenocarcinomas from TCGA. (B) Averages (±SEM) of TPM values of all 60,485 analysed transcripts (All T), upstream transcripts (Upstream T) and downstream transcripts (Downstream T). Up: Upregulated; Down: Downregulated.
Figure 6. (A) Percentage chromosomal distribution of the parent genes of upregulated FuTAGs, normalized for the total number of transcripts in each chromosome (normalized chromosomal distribution index, NCDI). (B) Percentage chromosomal distribution of 800 ConjoinG transcripts (chromosomal distribution index, CDI) and the corresponding NCDI (CDI normalized for the total number of transcripts in each chromosome).
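The normalization described for Figures 4 and 6 can be illustrated with a short sketch. The captions do not state the exact formula, so the code below assumes one plausible reading: CDI is the percentage of FuTAGs that fall on each chromosome, and NCDI is the per-chromosome FuTAG density (FuTAG count divided by the total number of transcripts on that chromosome), rescaled so that the values sum to 100. The function name and the input counts are hypothetical and are not taken from the review.

```python
from collections import Counter


def chromosomal_distribution(futag_chroms, total_transcripts):
    """Return {chromosome: (CDI, NCDI)}, both expressed as percentages.

    futag_chroms: one chromosome label per FuTAG (hypothetical input).
    total_transcripts: {chromosome: total number of annotated transcripts}.
    """
    # Count only FuTAGs that map to a chromosome with a known transcript total.
    counts = Counter(c for c in futag_chroms if c in total_transcripts)
    n_futags = sum(counts.values())
    if n_futags == 0:
        raise ValueError("no FuTAG maps to a listed chromosome")

    # Assumed normalization: FuTAGs per annotated transcript on each chromosome.
    density = {c: counts[c] / total_transcripts[c] for c in total_transcripts}
    total_density = sum(density.values())

    result = {}
    for c in total_transcripts:
        cdi = 100.0 * counts[c] / n_futags          # raw percentage
        ncdi = 100.0 * density[c] / total_density   # normalized percentage
        result[c] = (cdi, ncdi)
    return result


# Toy example: two chromosomes with equal FuTAG counts but different sizes.
example = chromosomal_distribution(["1", "1", "20", "20"], {"1": 200, "20": 100})
for chrom, (cdi, ncdi) in example.items():
    print(f"chr{chrom}: CDI={cdi:.1f}%  NCDI={ncdi:.1f}%")
```

In this toy input both chromosomes carry the same number of FuTAGs, so their CDI values are equal, while the chromosome with fewer annotated transcripts receives the larger NCDI. This is the kind of enrichment the normalized index is meant to expose.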
FuTAGs located in Chr20q and upregulated in comparison with normal mucosa in both parent genes. Data obtained by RNAseq have been explored in HTA 2.0.
| Conjoined Genes | Omics Technology | Alias | Upstream Gene FC ** | Downstream Gene FC ** | Readthrough FC ** | RNA * | Known Hybrid Protein * | Chr | Band |
|---|---|---|---|---|---|---|---|---|---|
| CGHSA0796 | RNAseq | | 1.58 | 3.36 | N/A | NO | NO | 20 | q11.22 |
| | HTA2.0 | | 4.52 | 1.53 | 4.52 | | | | |
| CGHSA0023 | RNAseq | | N/A | N/A | 2.08 | YES | CDS Predicted | 20 | q11.23 |
| | HTA2.0 | | 1.11 | 2.13 | N/A | | | | |
| CGHSA0579 | RNAseq | | 2.39 | 20.324 | N/A | YES | NO | 20 | q13.12 |
| | HTA2.0 | | 1.29 | −1.25 | N/A | | | | |
| CGHSA0573 | RNAseq | EPPIN-WFDC6 | 7.69 | 2.97 | 1.45 | NO | NO | 20 | q13.12 |
| | HTA2.0 | | −1.69 | −1.9 | −1.69 | | | | |
| CGHSA0217 | RNAseq | TMEM189-UBE2V1 | 1.59 | 1.42 | 1.24 | YES | YES | 20 | q13.13 |
| | HTA2.0 | | 1.72 | 1.41 | 1.72 | | | | |
| CGHSA0215 | RNAseq | | 2.07 | 2.54 | 3.05 | YES | YES | 20 | q13.32 |
| | HTA2.0 | | 1.85 | 1.1 | 1.51 | | | | |
| CGHSA0738 | RNAseq | PRELID3B-ATP5F1E | 1.36 | 1.33 | N/A | YES | NO | 20 | q13.32 |
| | HTA2.0 | | 3.13 | 2.79 | 7.11 | | | | |
| CGHSA0212 | RNAseq | | 1.42 | 3.08 | N/A | YES | CDS Predicted | 20 | q13.33 |
| | HTA2.0 | | 1.1 | −1.08 | N/A | | | | |
| CGHSA0570 | RNAseq | | 3.08 | 1.32 | N/A | NO | NO | 20 | q13.33 |
| | HTA2.0 | | −1.08 | 1.03 | N/A | | | | |
| CGHSA0214 | RNAseq | | 2.22 | 2.23 | N/A | YES | YES | 20 | q13.33 |
| | HTA2.0 | | −1.57 | 2.7 | N/A | | | | |
| CGHSA0577 | RNAseq | | 1.93 | 1.07 | N/A | Not Attempted Experimentally | NO | 20 | q13.33 |
| | HTA2.0 | | 2.32 | 1.09 | N/A | | | | |
* experimentally confirmed by Akiva et al., [8]. ** FC: Linear fold-change in the comparison tumor vs. normal tissue (only transcripts showing an FC value > 1.5 in one of the two parent genes in RNAseq data are reported). N/A: Not Available.
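The fold-change criterion in the footnote can be sketched as follows. The review does not define how the linear FC is computed. The code assumes the signed convention common in microarray analysis software, in which a tumor/normal ratio below 1 is reported as the negative reciprocal, because the table contains negative FC values. Both function names are hypothetical, and missing values (N/A) are represented as `None`.

```python
def linear_fold_change(tumor, normal):
    """Signed linear fold change of tumor vs. normal expression.

    Assumed convention: ratio >= 1 is returned as is; a ratio below 1 is
    returned as the negative reciprocal (e.g. 0.5 becomes -2.0).
    """
    if tumor <= 0 or normal <= 0:
        raise ValueError("expression values must be positive")
    ratio = tumor / normal
    return ratio if ratio >= 1 else -1.0 / ratio


def passes_parent_filter(fc_upstream, fc_downstream, threshold=1.5):
    """True if at least one parent gene has FC above the threshold.

    None stands for a value reported as N/A and is ignored.
    """
    values = [v for v in (fc_upstream, fc_downstream) if v is not None]
    return any(v > threshold for v in values)


# Illustrative expression values (not taken from the review).
print(linear_fold_change(20.0, 10.0))   # upregulated in tumor
print(linear_fold_change(10.0, 20.0))   # downregulated in tumor

# Parent-gene FC values of the kind listed in the RNAseq rows.
print(passes_parent_filter(1.58, 3.36))
print(passes_parent_filter(1.2, None))
```

The filter inspects only the two parent genes, as the footnote states. The readthrough FC is deliberately left out of the test.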
FuTAGs located in all chromosomes and upregulated (FC > 1.5) in CRC in comparison to normal mucosa. Data obtained by HTA 2.0.
| Transcript Cluster ID | FC > 1.5 (CRC vs. MU) | FDR | Chr Position | Gene Symbol | Description | FuTAG Reported In |
|---|---|---|---|---|---|---|
| TC02005002.hg.1 | 1.57 | 2 × 10⁻⁶ | 2q31.1 | KLHL23; PHOSPHO2-KLHL23 | kelch-like family member 23; PHOSPHO2-KLHL23 readthrough; NULL | [ |
| TC02005005.hg.1 | 2 | 1.7 × 10⁻⁷ | 2q33.1 | MOB4; HSPE1-MOB4 | MOB family member 4, phocein; HSPE1-MOB4 readthrough; NULL | |
| TC02002467.hg.1 | 2.32 | 2 × 10⁻⁶ | 2q24.2 | LY75-CD302; CD302; LY75 | LY75-CD302 readthrough; CD302 molecule; lymphocyte antigen 75; NULL | [ |
| TC05000726.hg.1 | 2.61 | 1.2 × 10⁻⁷ | 5q31.3 | EIF4EBP3; ANKHD1; ANKHD1-EIF4EBP3 | eukaryotic translation initiation factor 4E binding protein 3; ankyrin repeat and KH domain containing 1; ANKHD1-EIF4EBP3 readthrough; NULL | [ |
| TC05001690.hg.1 | 1.67 | 2 × 10⁻⁶ | 5q22.3 | TMED7-TICAM2; TICAM2; TMED7 | TMED7-TICAM2 readthrough; toll-like receptor adaptor molecule 2; transmembrane emp24 protein transport domain containing 7; NULL | [ |
| TC07003311.hg.1 | 1.75 | 1.4 × 10⁻⁵ | 7q11.23 | DTX2P1-UPK3BP1-PMS2P11; LOC100132832 | DTX2P1-UPK3BP1-PMS2P11 readthrough transcribed pseudogene; PMS2 postmeiotic segregation increased 2 (S. cerevisiae) pseudogene | |
| TC0X002317.hg.1 | 1.64 | 1 × 10⁻¹⁰ | Xq22.1 | RPL36A; RPL36A-HNRNPH2 | ribosomal protein L36a; RPL36A-HNRNPH2 readthrough; NULL | |
| TC0X002316.hg.1 | 4.2 | 4.1 × 10⁻¹² | Xq22.1 | HNRNPH2; RPL36A-HNRNPH2 | heterogeneous nuclear ribonucleoprotein H2 (H’); RPL36A-HNRNPH2 readthrough; NULL | |
| TC10002935.hg.1 | 2.17 | 5.4 × 10⁻⁹ | 10p12.2 | BMI1; COMMD3-BMI1 | BMI1 polycomb ring finger oncogene; COMMD3-BMI1 readthrough; NULL | |
| TC11000477.hg.1 | 2.26 | 1.7 × 10⁻⁸ | 11q12.1 | CNTF; ZFP91; ZFP91-CNTF | ciliary neurotrophic factor; ZFP91 zinc finger protein; ZFP91-CNTF readthrough (NMD candidate); zinc finger protein 91 homolog (mouse); ZFP91-CNTF readthrough (non-protein coding); NULL | |
| TC11000673.hg.1 | 1.58 | 6.5 × 10⁻¹³ | 11q13.2 | RBM14; RBM4; RBM14-RBM4; LOC101059993 | RNA binding motif protein 14; RNA binding motif protein 4; RBM14-RBM4 readthrough; uncharacterized LOC101059993; NULL | |
| TC11002132.hg.1 | 1.72 | 7.6 × 10⁻⁸ | 11q14.1 | NDUFC2-KCTD14; NDUFC2; KCTD14 | NDUFC2-KCTD14 readthrough; NADH dehydrogenase (ubiquinone) 1, subcomplex unknown, 2, 14.5kDa; potassium channel tetramerisation domain containing 14; NULL | |
| TC12001797.hg.1 | 3.66 | 1.9 × 10⁻¹² | 12q21.33 | POC1B; POC1B-GALNT4; GALNT4 | POC1 centriolar protein homolog B (Chlamydomonas); POC1B-GALNT4 readthrough; UDP-N-acetyl-alpha-D-galactosamine:polypeptide N-acetylgalactosaminyltransferase 4 (GalNAc-T4) | |
| TC13001721.hg.1 | 1.7 | 8.3 × 10⁻⁹ | 13q33.1 | ERCC5; BIVM-ERCC5 | excision repair cross-complementing rodent repair deficiency, complementation group 5; BIVM-ERCC5 readthrough; NULL | |
| TC14001267.hg.1 | 2.85 | 5.9 × 10⁻¹⁰ | 14q24.2 | SYNJ2BP-COX16; COX16; SYNJ2BP | SYNJ2BP-COX16 readthrough; COX16 cytochrome c oxidase assembly homolog (S. cerevisiae); synaptojanin 2 binding protein | |
| TC17000082.hg.1 | 1.83 | 3 × 10⁻¹¹ | 17p13.1 | RNASEK; C17orf49; RNASEK-C17orf49 | ribonuclease, RNase K; chromosome 17 open reading frame 49; RNASEK-C17orf49 readthrough | |
| TC17002881.hg.1 | 1.74 | 1 × 10⁻¹⁰ | 17q21.33 | NME2; NME1-NME2 | NME/NM23 nucleoside diphosphate kinase 2; NME1-NME2 readthrough; NULL | |
| TC18001003.hg.1 | 9.48 | 3 × 10⁻¹⁰ | 18q21.1 | SNORD58B; RPL17; RPL17-C18orf32 | small nucleolar RNA, C/D box 58B; ribosomal protein L17; RPL17-C18orf32 readthrough | |
| TC20001752.hg.1 | 1.72 | 4.3 × 10⁻⁹ | 20q13.13 | TMEM189; TMEM189-UBE2V1; UBE2V1 | transmembrane protein 189; TMEM189-UBE2V1 readthrough; ubiquitin-conjugating enzyme E2 variant 1; NULL | [ |
| TC6_apd_hap1000079.hg.1 | 4.49 | 1.8 × 10⁻¹³ | 6p21.33 | DDX39B; ATP6V1G2-DDX39B; OTTHUMG00000148789; BAT1 | DEAD (Asp-Glu-Ala-Asp) box polypeptide 39B; ATP6V1G2-DDX39B readthrough (NMD candidate); NULL | |