Guanglei Ji, Rongrong Ren, Xichao Fang.
Abstract
BACKGROUND: Thymoma is the most common tumor of the anterior mediastinum, and can be caused by infrequent malignancies arising from the epithelial cells of the thymus. Unfortunately, blood-based diagnostic markers are not currently available. High-throughput sequencing technologies, such as RNA-seq with next-generation sequencing, have facilitated the detection and characterization of both coding and non-coding RNAs (ncRNAs), which play significant roles in genomic regulation, transcriptional and post-transcriptional regulation, and imprinting and epigenetic modification. Knowledge about fusion genes and ncRNAs in thymomas is scarce.
MATERIAL AND METHODS: For this study, we gathered large-scale RNA-seq data from 25 thymoma samples and 25 healthy thymus specimens and analyzed them to identify fusion genes, lncRNAs, and miRNAs.
RESULTS: We found 21 fusion genes, including KMT2A-MAML2, HADHB-REEP1, COQ3-CGA, MCM4-SNTB1, and IFT140-ACTN4, as the most frequent and significant in thymomas. We also detected 65 differentially expressed lncRNAs in thymomas, including AFAP1-AS1, LINC00324, ADAMTS9-AS1, VLDLR-AS1, LINC00968, and NEAT1, that have been validated with the TCGA database. Moreover, we identified 1695 miRNAs from small RNA-seq data that were overexpressed in thymomas. Our network analysis of the lncRNA-mRNA-miRNA regulation axes identified a cluster of miRNAs upregulated in thymomas that can trigger the expression of target protein-coding genes and lead to the disruption of several biological pathways, including the PI3K-Akt signaling pathway, FoxO signaling pathway, and HIF-1 signaling pathway.
CONCLUSIONS: Our results show that overexpression of this miRNA cluster activates PI3K-Akt, FoxO, HIF-1, and Rap-1 signaling pathways, suggesting pathway inhibitors may be therapeutic candidates against thymoma.
Year: 2021 PMID: 34219124 PMCID: PMC8268976 DOI: 10.12659/MSM.929727
Source DB: PubMed Journal: Med Sci Monit ISSN: 1234-1010
Basic characteristics of considered small RNA-seq data.
| Run ID | Average spot length | Bases (Mbase) | Data size (MB) | Data file format | Sample type | WHO sub-type |
|---|---|---|---|---|---|---|
| SRR3341111 | 46 | 705.93 | 292.9 | bam,sra | Thymic neoplasm | A |
| SRR3341114 | 45 | 497.06 | 230.71 | bam,sra | Thymic neoplasm | A |
| SRR3341117 | 44 | 472.92 | 231.16 | bam,sra | Thymic neoplasm | A |
| SRR3341120 | 43 | 380.61 | 191.63 | bam,sra | Thymic neoplasm | A/B |
| SRR3341123 | 44 | 438.40 | 207.93 | bam,sra | Thymic neoplasm | B3 |
| SRR3341126 | 44 | 368.09 | 177.25 | bam,sra | Thymic neoplasm | B3 |
| SRR3341129 | 46 | 696.30 | 309.79 | bam,sra | Thymic neoplasm | A |
| SRR3341132 | 46 | 661.74 | 279.82 | bam,sra | Thymic neoplasm | B3 |
| SRR3341135 | 46 | 626.43 | 266.13 | bam,sra | Thymic neoplasm | A/B |
| SRR3341144 | 41 | 246.58 | 143.25 | bam,sra | Thymic neoplasm | C |
| SRR3341146 | 79 | 3156.80 | 1669.12 | bam,sra | Thymic neoplasm | C |
| SRR3341150 | 46 | 776.60 | 349.51 | bam,sra | Normal thymus | – |
| SRR3341151 | 46 | 689.80 | 315.14 | bam,sra | Normal thymus | – |
| SRR3341152 | 80 | 2253.16 | 1285.14 | bam,sra | Normal thymus | – |
| SRR3341153 | 45 | 700.13 | 334.67 | bam,sra | Normal thymus | – |
| SRR3341154 | 44 | 630.27 | 307.39 | bam,sra | Normal thymus | – |
| SRR3341155 | 79 | 2352.09 | 1394.3 | bam,sra | Normal thymus | – |
| SRR3341156 | 45 | 711.40 | 337.3 | bam,sra | Normal thymus | – |
| SRR3341157 | 44 | 636.42 | 307.0 | bam,sra | Normal thymus | – |
| SRR3341158 | 78 | 2289.89 | 1370.73 | bam,sra | Normal thymus | – |
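As a rough aid to reading the table above, the number of spots (reads) per run can be approximated by dividing the total bases by the average spot length. The sketch below is illustrative only: the function name is hypothetical, the three runs are copied from the table, and the result is approximate because the reported average spot length is rounded to a whole number.

```python
# Approximate read depth per run: spots ~= total bases / average spot length.
# Values are (average spot length, bases in Mbase), copied from the table above.
RUNS = {
    "SRR3341111": (46, 705.93),   # thymic neoplasm, WHO type A
    "SRR3341146": (79, 3156.80),  # thymic neoplasm, WHO type C
    "SRR3341150": (46, 776.60),   # normal thymus
}


def estimate_spots_millions(avg_spot_length, mbases):
    """Return the approximate number of spots, in millions (hypothetical helper)."""
    if avg_spot_length <= 0:
        raise ValueError("average spot length must be positive")
    return mbases / avg_spot_length


for run_id, (length, mbases) in RUNS.items():
    print(f"{run_id}: ~{estimate_spots_millions(length, mbases):.1f} M spots")
```

The runs with an average spot length near 80 carry several times more bases than the others, so any cross-sample comparison would need depth normalization.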
Figure 1. Methodological pipeline to process RNA-Seq data and analysis used in this study.
Characteristics of study samples of thymus tumor/cancer (n=25) and healthy thymus (n=25), with the number of missing values.
| Group | Parameter | Category | Number (%) or Median (IQR) | Missing values (%) |
|---|---|---|---|---|
| Thymoma | Age (years) | – | 55.5 (37–76) | 3 (12) |
| Thymoma | Sex | Male | 12 (54.54) | 3 (12) |
| Thymoma | | Female | 10 (45.45) | |
| Thymoma | WHO Histo-type | A-B3 | 16 (64) | 0 (0) |
| Thymoma | | C type | 9 (36) | |
| Thymoma | Disease stage | I–II | 12 (48) | 0 (0) |
| Thymoma | | III–IV | 13 (52) | 0 (0) |
| Healthy thymus | Age (years) | – | 2.39 (5 months–22 years) | 0 (0) |
| Healthy thymus | Sex | Male | 9 (2.18) | 0 (0) |
| Healthy thymus | | Female | 16 (2.51) | 0 (0) |
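The percentages for the tumor group appear to use the number of non-missing samples as the denominator: sex is reported for 22 of the 25 tumor samples, while histotype and stage are reported for all 25. The short sketch below reproduces that arithmetic; the helper name is hypothetical, and the choice of denominator is an inference from the table rather than something the source states.

```python
def percent_of_available(count, n_total, n_missing):
    """Percentage of `count` among samples with a recorded value (hypothetical helper)."""
    available = n_total - n_missing
    if available <= 0:
        raise ValueError("no samples with a recorded value")
    return 100.0 * count / available


# Tumor group, sex: 3 of 25 values are missing, so the denominator is 22.
print(f"Male:   {percent_of_available(12, 25, 3):.2f}%")
print(f"Female: {percent_of_available(10, 25, 3):.2f}%")

# Tumor group, WHO histotype: no missing values, so the denominator is 25.
print(f"A-B3:   {percent_of_available(16, 25, 0):.2f}%")
print(f"C type: {percent_of_available(9, 25, 0):.2f}%")
```

Standard rounding gives 54.55 for the male fraction, whereas the table prints 54.54, which suggests the source truncated rather than rounded.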
Summary of detected fusion genes and their statistics (arranged by significance level).
| S. No. | Detected fusion genes | chr1 | Breakpoint1 | chr2 | Breakpoint2 | Strands (1/2) | Thymoma samples | Healthy samples |
|---|---|---|---|---|---|---|---|---|
| 1 | KMT2A-MAML2 | 11 | 118436492 | 11 | 95976698 | (+/−) | 9/25 | 1/25 |
| 2 | HADHB-REEP1 | 2 | 26502983 | 2 | 86491164 | (+/−) | 6/25 | 0/25 |
| 3 | COQ3-CGA | 6 | 99842080 | 6 | 87804824 | (+/+) | 6/25 | 1/25 |
| 4 | MCM4-SNTB1 | 8 | 48878682 | 8 | 48878526 | (+/−) | 5/25 | 0/25 |
| 5 | IFT140-ACTN4 | 16 | 1616292 | 19 | 39138547 | (+/−) | 4/25 | 0/25 |
| 6 | ZAP70-GLI2 | 2 | 98349346 | 2 | 121555044 | (−/−) | 6/25 | 2/25 |
| 7 | UBR1-FLNB | 15 | 43340689 | 3 | 58104716 | (+/−) | 6/25 | 3/25 |
| 8 | PREPL-P4HTM | 2 | 44569556 | 3 | 49038871 | (−/+) | 4/25 | 3/25 |
| 9 | R3HDM1-KATNA1 | 2 | 136418963 | 6 | 149959696 | (+/−) | 4/25 | 3/25 |
| 10 | ANXA6-SLC36A3 | 5 | 150496688 | 5 | 150675829 | (−/−) | 4/25 | 3/25 |
| 11 | ADCY3-KDM3A | 2 | 25065253 | 2 | 86684194 | (+/−) | 3/25 | 3/25 |
| 12 | EP300-PRR5 | 22 | 41560134 | 22 | 45110471 | (+/+) | 3/25 | 3/25 |
| 13 | DTNB-PSIP1 | 2 | 25600422 | 9 | 15510115 | (+/+) | 3/25 | 2/25 |
| 14 | PDS5B-IL17RB | 13 | 33233417 | 3 | 53890871 | (+/+) | 3/25 | 2/25 |
| 15 | CLIP4-C5 | 2 | 29358532 | 9 | 123732529 | (+/−) | 3/25 | 1/25 |
| 16 | EML4-TMEM163 | 2 | 42396776 | 2 | 135309662 | (+/−) | 3/25 | 1/25 |
| 17 | MFSD2B-DNMT3A | 2 | 24236155 | 2 | 25458576 | (−/+) | 3/25 | 1/25 |
| 18 | R3HDM1-WDR6 | 2 | 136482840 | 3 | 49053386 | (−/−) | 3/25 | 1/25 |
| 19 | PIGS-SLC46A1 | 17 | 26898546 | 17 | 26733228 | (−/−) | 3/25 | 1/25 |
| 20 | ATP9B-SYNPR | 18 | 76903860 | 3 | 63466508 | (+/+) | 3/25 | 1/25 |
| 21 | ARHGEF4-USP45 | 2 | 131785518 | 6 | 99936076 | (−/+) | 2/25 | 1/25 |
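The table is described as arranged by significance level, but this section does not say which statistical test produced that ordering. As one possible way to compare detection frequencies such as 9/25 versus 1/25, the sketch below computes a one-sided Fisher's exact test using only the standard library. The choice of test and the function name are assumptions for illustration, not the authors' stated method.

```python
from math import comb


def fisher_one_sided(case_hits, case_total, ctrl_hits, ctrl_total):
    """One-sided Fisher's exact p-value for enrichment in cases.

    Probability of seeing at least `case_hits` positive case samples when the
    row and column totals of the 2x2 table are held fixed (hypergeometric tail).
    """
    total = case_total + ctrl_total
    hits = case_hits + ctrl_hits
    denominator = comb(total, case_total)
    upper = min(hits, case_total)
    tail = sum(
        comb(hits, k) * comb(total - hits, case_total - k)
        for k in range(case_hits, upper + 1)
    )
    return tail / denominator


# Counts copied from the table: (thymoma positives, healthy positives), 25 samples each.
FUSIONS = {
    "KMT2A-MAML2": (9, 1),
    "HADHB-REEP1": (6, 0),
    "EP300-PRR5": (3, 3),
}

for name, (tumor_hits, healthy_hits) in FUSIONS.items():
    p_value = fisher_one_sided(tumor_hits, 25, healthy_hits, 25)
    print(f"{name}: p = {p_value:.4f}")
```

With only 25 samples per group, fusions detected in three or four tumors and a similar number of healthy samples cannot be separated from chance by this test, which is worth keeping in mind when reading the lower rows of the table.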
Figure 2. Circos plot of fusion genes mapped to chromosome and breakpoints on genome.
lncRNAs dysregulated in thymoma discovered by RNA-seq and validated by TCGA.
| Transcript ID | Gene ID | Gene symbol | logFC | logCPM | p-Value | lncRNA type | Regulation type | FDR |
|---|---|---|---|---|---|---|---|---|
| ENST00000608442 | ENSG00000272620 | AFAP1-AS1 | 5.9488 | 3.5952 | 0.0000 | antisense | Up | 0.0000 |
| ENST00000315707 | ENSG00000178977 | LINC00324/C17ORF44 | 5.8974 | 3.6420 | 0.0000 | lincRNA | Up | 0.0000 |
| ENST00000601022 | ENSG00000241158 | ADAMTS9-AS1 | −5.7239 | 3.9160 | 0.0000 | antisense | Down | 0.0000 |
| ENST00000453601 | ENSG00000236404 | VLDLR-AS1 | 5.7215 | 3.6212 | 0.0000 | antisense | Up | 0.0000 |
| ENST00000523664 | ENSG00000246430 | LINC00968 | −5.3190 | 3.2842 | 0.0000 | lincRNA | Down | 0.0000 |
| ENST00000616315 | ENSG00000245532 | NEAT1 | −5.1874 | 3.0155 | 0.0000 | lincRNA | Down | 0.0000 |
| ENST00000425914 | ENSG00000224609 | HSD52 | −4.4465 | 2.2834 | 0.0000 | lincRNA | Down | 0.0000 |
| ENST00000423808 | ENSG00000232079 | LINC01697 | −4.1166 | 3.4826 | 0.0000 | lincRNA | Down | 0.0000 |
| ENST00000330148 | ENSG00000182366 | FAM87A | 4.0398 | 3.1122 | 0.0000 | lincRNA | Down | 0.0000 |
| ENST00000553575 | ENSG00000258498 | DIO3OS | 3.9456 | 5.7375 | 0.0000 | lincRNA | Up | 0.0001 |
| ENST00000609696 | ENSG00000225670 | CADM3-AS1 | −3.9167 | 4.9436 | 0.0000 | antisense | Down | 0.0001 |
| ENST00000484765 | ENSG00000242268 | LINC02082 | 3.9025 | 2.1784 | 0.0000 | lincRNA | Up | 0.0003 |
| ENST00000577000 | ENSG00000185168 | LINC00482 | −3.7068 | 2.9948 | 0.0000 | lincRNA | Down | 0.0001 |
| ENST00000500381 | ENSG00000246363 | LINC02458 | 3.4937 | 5.3952 | 0.0000 | lincRNA | Up | 0.0001 |
| ENST00000530198 | ENSG00000255176 | AP000941.1 | 3.4311 | 2.5965 | 0.0001 | antisense | Up | 0.0102 |
| ENST00000603439 | ENSG00000271579 | AC078880.3 | −3.3799 | 2.5707 | 0.0001 | lincRNA | Down | 0.0158 |
| ENST00000242109 | ENSG00000122548 | KIAA0087 | −3.3738 | 6.4171 | 0.0000 | lincRNA | Down | 0.0088 |
| ENST00000523301 | ENSG00000245812 | LINC02202 | 3.3481 | 5.4172 | 0.0000 | lincRNA | Up | 0.0001 |
| ENST00000575424 | ENSG00000262097 | LINC02185 | −3.2869 | 2.3678 | 0.0000 | lincRNA | Down | 0.0019 |
| ENST00000605631 | ENSG00000270412 | AL136084.2 | −3.2391 | 3.6568 | 0.0000 | antisense | Down | 0.0002 |
| ENST00000626206 | ENSG00000238018 | AC093110.1 | −3.1506 | 4.6319 | 0.0000 | antisense | Down | 0.0087 |
| ENST00000603514 | ENSG00000271334 | LINC02104 | 3.1179 | 2.7402 | 0.0000 | lincRNA | Up | 0.0000 |
| ENST00000418372 | ENSG00000233117 | LINC00702 | 3.0428 | 4.4037 | 0.0001 | lincRNA | Up | 0.0158 |
| ENST00000572479 | ENSG00000263257 | AC040173.1 | −3.0091 | 2.0691 | 0.0003 | lincRNA | Down | 0.0408 |
| ENST00000519795 | ENSG00000253978 | CTB-113P19.1 | 2.9814 | 2.1697 | 0.0001 | antisense | Up | 0.0158 |
| ENST00000609941 | ENSG00000272823 | AL445423.1 | 2.9430 | 2.9938 | 0.0000 | lincRNA | Up | 0.0049 |
| ENST00000505254 | ENSG00000249669 | CARMN | 2.9374 | 5.3167 | 0.0000 | lincRNA | Up | 0.0009 |
| ENST00000417193 | ENSG00000213373 | LINC00671 | −2.8571 | 3.3597 | 0.0000 | lincRNA | Down | 0.0033 |
| ENST00000574724 | ENSG00000228157 | AC007952.2 | 2.8459 | 3.0662 | 0.0004 | retained_intron | Up | 0.0463 |
| ENST00000556053 | ENSG00000259134 | LINC00924 | 2.8254 | 4.6589 | 0.0001 | lincRNA | Up | 0.0158 |
| ENST00000418029 | ENSG00000223901 | AP001469.1 | −2.8033 | 2.8406 | 0.0000 | antisense | Down | 0.0004 |
| ENST00000505807 | ENSG00000250208 | FZD10-AS1 | 2.7905 | 5.3926 | 0.0001 | lincRNA | Up | 0.0191 |
| ENST00000620326 | ENSG00000277631 | PGM5P3-AS1 | 2.7716 | 3.3111 | 0.0000 | lincRNA | Up | 0.0050 |
| ENST00000417354 | ENSG00000230630 | DNM3OS | 2.7346 | 5.5551 | 0.0001 | antisense | Up | 0.0098 |
| ENST00000619486 | ENSG00000274080 | AC005089.1 | 2.7190 | 3.0662 | 0.0000 | sense_intronic | Up | 0.0087 |
| ENST00000427825 | ENSG00000230623 | AC104461.1 | −2.6769 | 2.4333 | 0.0004 | lincRNA | Down | 0.0454 |
| ENST00000577295 | ENSG00000263586 | HID1-AS1 | 2.6724 | 3.0135 | 0.0000 | antisense | Up | 0.0029 |
| ENST00000430895 | ENSG00000234484 | AL032821.1 | −2.6391 | 2.1961 | 0.0002 | antisense | Down | 0.0265 |
| ENST00000446355 | ENSG00000237813 | AC002066.1 | 2.6230 | 2.1554 | 0.0003 | antisense | Up | 0.0398 |
| ENST00000563759 | ENSG00000260868 | LINC01960 | −2.6040 | 2.7845 | 0.0001 | lincRNA | Down | 0.0182 |
| ENST00000624364 | ENSG00000280339 | AP001528.2 | 2.5928 | 5.5979 | 0.0000 | TEC | Up | 0.0022 |
| ENST00000538077 | ENSG00000255874 | LINC00346 | 2.5205 | 5.8892 | 0.0004 | lincRNA | Up | 0.0478 |
| ENST00000434306 | ENSG00000234235 | BOK-AS1 | −2.5061 | 2.1082 | 0.0000 | antisense | Down | 0.0003 |
| ENST00000513055 | ENSG00000237187 | NR2F1-AS1 | 2.4916 | 6.0892 | 0.0002 | antisense | Up | 0.0265 |
| ENST00000573168 | ENSG00000262319 | AC007952.6 | 2.4867 | 2.9364 | 0.0002 | antisense | Up | 0.0272 |
| ENST00000600473 | ENSG00000269688 | AC008982.2 | −2.4289 | 3.8376 | 0.0001 | sense_intronic | Down | 0.0158 |
| ENST00000450804 | ENSG00000227467 | LINC01537 | 2.4021 | 4.2479 | 0.0000 | lincRNA | Up | 0.0050 |
| ENST00000435312 | ENSG00000230148 | HOXB-AS1 | 2.2070 | 6.9580 | 0.0001 | antisense | Up | 0.0120 |
| ENST00000567984 | ENSG00000260953 | AC009093.4 | 2.1585 | 2.9114 | 0.0001 | lincRNA | Up | 0.0158 |
| ENST00000437930 | ENSG00000180139 | ACTA2-AS1 | 2.0767 | 5.4572 | 0.0002 | antisense | Up | 0.0299 |
| ENST00000508887 | ENSG00000237125 | HAND2-AS1 | 2.0078 | 3.8768 | 0.0003 | antisense | Down | 0.0021 |
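To make the logFC column easier to interpret, the sketch below converts a few values to linear fold changes and checks that the sign of logFC agrees with the regulation label. It assumes logFC is a base-2 logarithm of thymoma over healthy expression, which is the usual convention for tables with logFC and logCPM columns but is not stated in this section. The helper names are hypothetical and the rows are copied from the table above.

```python
# (gene symbol, logFC, regulation label) copied from the table above.
ROWS = [
    ("AFAP1-AS1", 5.9488, "Up"),
    ("ADAMTS9-AS1", -5.7239, "Down"),
    ("NEAT1", -5.1874, "Down"),
    ("FAM87A", 4.0398, "Down"),
    ("DIO3OS", 3.9456, "Up"),
]


def fold_change(log_fc):
    """Linear fold change, assuming logFC is log base 2 (an assumption)."""
    return 2.0 ** log_fc


def direction(log_fc):
    """Regulation direction implied by the sign of logFC."""
    return "Up" if log_fc > 0 else "Down"


def label_mismatches(rows):
    """Gene symbols whose label disagrees with the sign of logFC."""
    return [symbol for symbol, log_fc, label in rows if direction(log_fc) != label]


for symbol, log_fc, label in ROWS:
    print(f"{symbol}: {fold_change(log_fc):.3g}-fold, labelled {label}")

print("Sign/label mismatches:", label_mismatches(ROWS))
```

Rows where the sign and the label disagree, such as FAM87A in this sample, may be worth checking against the original publication before reuse.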
Figure 3. (A–F) Boxplot of normalized expression [log2(TPM+1)] of thymoma samples (red color) and normal samples (gray color), and disease-free survival (DFS) analysis of patients in high- (red line) and low-expression (blue line) groups for the top 6 dysregulated lncRNAs.
Figure 4. Disease-free survival (DFS) map showing survival contribution of top 25 dysregulated lncRNAs in various cancers including thymoma (THYM) using TCGA datasets with a significance level of 0.05 estimated using the Mantel-Cox test.
Figure 5. Boxplot of first 10 differentially expressed miRNAs (DEMs) showing their normalized gene expression in thymoma versus healthy samples.
Figure 6. First 50 differentially expressed miRNAs (DEMs) and their logFC values.
lncRNAs and their predicted protein interactions using the UCSC-TFBS algorithm.
| ID | Gene Name | Species | UCSC_TFBS |
|---|---|---|---|
| 84740 | AFAP1-AS1 (AFAP1 antisense RNA 1) | | AML1, AP1FJ, AP2REP, AP4, AREB6, ARNT, ARP1, CDC5, CDPCR3HD, COMP1, CP2, CREBP1, ELK1, EN1, ER, FOXJ2, FREAC7, GCNF, GRE, HAND1E47, HFH3, HNF3B, HOXA3, HTF, LYF1, MEF2, MSX1, MYCMAX, MYOD, NFAT, NFY, NKX25, NKX61, P53, PAX4, PAX5, PPARA, PPARG, RFX1, RORA2, RP58, SREBP1, SRY, TAL1BETAITF2, TAXCREB, TBP, TCF11, USF, ZIC2, ZID |
| 401491 | VLDLR-AS1 (VLDLR antisense RNA 1) | | AHRARNT, AML1, AP1, AP1FJ, AP2GAMMA, AP4, AREB6, ARNT, ATF, BACH1, BACH2, BRACH, CART1, CDC5, CDP, CDPCR1, CEBP, CEBPB, CHX10, COMP1, COUP, CP2, E47, E4BP4, EN1, ER, EVI1, FOXD3, FOXJ2, FOXO4, FREAC4, FREAC7, GATA, GATA1, GCNF, GFI1, GR, HEN1, HFH1, HFH3, HLF, HMX1, HNF1, HNF3B, HOXA3, HSF2, HTF, ISRE, LHX3, LMO2COM, LUN1, LYF1, MAX, MEF2, MEIS1, MEIS1BHOXA9, MSX1, MYCMAX, MYOD, MYOGNF1, MZF1, NCX, NF1, NFAT, NFY, NKX22, NKX25, NKX3A, NKX61, NRSF, OCT, OCT1, OLF1, P300, P53, PAX4, PAX6, PBX1, POU3F2, POU6F1, PPARA, PPARG, RORA2, RP58, RREB1, RSRFC4, S8, SOX5, SOX9, SREBP1, SRF, SRY, STAT1, STAT3, STAT5B, TAL1BETAITF2, TATA, TAXCREB, TCF11, TCF11MAFG, TGIF, USF, XBP1, YY1, ZIC1, ZID |
| 284029 | LINC00324 (long intergenic non-protein coding RNA 324) | | AP1, AP4, AREB6, ARP1, BRACH, CART1, CEBP, CETS1P54, CMYB, CP2, E47, FOXJ2, FREAC4, GATA3, GRE, HAND1E47, HEN1, HNF4, ISRE, MEF2, MZF1, NFY, OCT1, OLF1, PBX1, PPARG, RORA2, STAT3, STAT5A, TAL1BETAE47, TAL1BETAITF2, TCF11MAFG, TST1, USF, ZIC1 |
| 729467 | HSD52 (uncharacterized LOC729467) | | AP1, ARP1, BACH1, BACH2, BRACH, CEBP, CEBPB, COUP, EGR3, EVI1, FOXD3, FOXO3, FOXO4, FREAC2, GATA1, HFH1, HNF3B, IK3, IRF2, IRF7, ISRE, LUN1, MEF2, MZF1, NFAT, NFE2, NKX22, OCT, OCT1, PAX4, POU3F2, PPARG, RP58, RSRFC4, SRF, STAT, TAXCREB, TCF11MAFG, YY1 |
Figure 7. Differentially expressed lncRNA-mRNA-miRNA interaction network in thymoma. Boxes represent lncRNAs, circles represent coding genes, and triangles represent miRNAs. Up arrows represent upregulated nodes, while down arrows represent downregulated nodes in thymomas. Lines between lncRNAs, mRNAs, and miRNAs represent regulatory relationships among them.