Liangliang Meng, Xiaoxi He, Xiao Zhang, Xiaobo Zhang, Yingtian Wei, Bin Wu, Jing Li, Yueyong Xiao.
Abstract
Adenocarcinoma is the most common type of lung cancer, and patient prognosis varies widely. RNA-binding proteins (RBPs) are thought to be closely associated with tumorigenesis and tumor development, but the exact mechanism is currently unknown. This study aimed to construct a robust prognostic model based on RNA-binding protein-related gene pair scores for better clinical guidance. The model was constructed from lung adenocarcinoma data in The Cancer Genome Atlas (TCGA) database. Prognosis-related RBP gene pair models were built from differentially expressed genes, and their accuracy was verified in subdatasets stratified by age, stage, and other variables. A total of 379 RNA-binding protein-related genes were differentially expressed in tumor tissue. From these genes, we constructed a prognostic model consisting of 33 gene pairs, which was significantly associated with survival in the TCGA dataset (P < 0.0001, hazard ratio (HR) = 4.380 (3.139 to 6.111)) and in the different subdatasets. As expected, the results were verified in the GEO validation cohort (P = 7.8 × 10⁻³, HR = 1.597 (1.095 to 2.325)). The signature was an independent prognostic factor in both univariate and multivariate Cox regression analyses (P < 0.001). CIBERSORT was applied to estimate the fractions of infiltrated immune cells in bulk tumor tissues. CD8 T cells, activated dendritic cells, regulatory T cells (Tregs), and activated CD4 memory T cells were present at significantly lower fractions in the high-risk group (P < 0.01). Patients in the high-risk group had a significantly higher tumor mutational burden (TMB) (P = 4.953e-04) and lower levels of immune cells (P = 3.473e-05) and stromal cells (P = 0.005) in the tumor microenvironment than those in the low-risk group.
Furthermore, the protein-protein interaction (PPI) network and several enrichment analyses revealed the interrelationships and potential functions of the RBP genes within the model. These results underline the importance of RNA-binding proteins in tumorigenesis and progression and support the RBP gene-related signature as a promising marker for prognosis prediction in lung adenocarcinoma.
Year: 2020 PMID: 33195699 PMCID: PMC7643376 DOI: 10.1155/2020/8896511
Source DB: PubMed Journal: Biomed Res Int Impact factor: 3.411
Figure 1. Abnormally expressed RBP genes in TCGA tumor tissues. Heat map (a) and volcano plot (b) of RBP gene expression.
Prognostic signature consisting of 33 RBP gene pairs.
| Gene A | Description | Gene B | Description | Coefficient |
|---|---|---|---|---|
| WDR46 | WD repeat domain 46 | RAE1 | RAE1 RNA export 1 homolog (S. pombe) | -0.29314 |
| SNRPA1 | Small nuclear ribonucleoprotein polypeptide A′ | PABPC1L | Poly(A) binding protein, cytoplasmic 1-like | 0.248844 |
| INTS8 | Integrator complex subunit 8 | SKIV2L | Superkiller viralicidic activity 2-like | 0.039139 |
| DCAF13 | DDB1 and CUL4-associated factor 13 | SMAD9 | SMAD family member 9 | 0.048633 |
| WDR4 | WD repeat domain 4 | PARS2 | Prolyl-tRNA synthetase 2, mitochondrial (putative) | 0.622802 |
| MRPS12 | Mitochondrial ribosomal protein S12 | DCP1A | Decapping mRNA 1A | 0.019737 |
| IGF2BP1 | Insulin-like growth factor 2 mRNA binding protein 1 | PABPC3 | Poly(A) binding protein, cytoplasmic 3 | 0.04353 |
| IGF2BP1 | Insulin-like growth factor 2 mRNA binding protein 1 | SRSF12 | Serine/arginine-rich splicing factor 12 | 0.214354 |
| IGF2BP1 | Insulin-like growth factor 2 mRNA binding protein 1 | ZC3H12D | Zinc finger CCCH-type containing 12D | 0.047965 |
| IGF2BP3 | Insulin-like growth factor 2 mRNA binding protein 3 | RPP40 | Ribonuclease P/MRP 40 kDa subunit | 0.030426 |
| IGF2BP3 | Insulin-like growth factor 2 mRNA binding protein 3 | OASL | 2′-5′-oligoadenylate synthetase-like | 0.098828 |
| SLU7 | SLU7 splicing factor homolog | DDX24 | DEAD (Asp-Glu-Ala-Asp) box polypeptide 24 | 0.295423 |
| MAGOHB | Mago-nashi homolog B (Drosophila) | PARS2 | Prolyl-tRNA synthetase 2, mitochondrial (putative) | 0.207791 |
| MRPL38 | Mitochondrial ribosomal protein L38 | TSEN54 | tRNA splicing endonuclease 54 homolog | 0.1538 |
| DDX52 | DEAD (Asp-Glu-Ala-Asp) box polypeptide 52 | STRBP | Spermatid perinuclear RNA-binding protein | 0.012441 |
| MRPL54 | Mitochondrial ribosomal protein L54 | OAS1 | 2′-5′-oligoadenylate synthetase 1, 40/46 kDa | -0.05311 |
| MRPL54 | Mitochondrial ribosomal protein L54 | DDX56 | DEAD (Asp-Glu-Ala-Asp) box helicase 56 | -0.20468 |
| MRPL54 | Mitochondrial ribosomal protein L54 | BZW2 | Basic leucine zipper and W2 domains 2 | -0.12583 |
| SKIV2L | Superkiller viralicidic activity 2-like | MRPS24 | Mitochondrial ribosomal protein S24 | -0.22893 |
| SKIV2L | Superkiller viralicidic activity 2-like | DCPS | Decapping enzyme, scavenger | -0.11877 |
| OAS1 | 2′-5′-oligoadenylate synthetase 1, 40/46 kDa | POP7 | Processing of precursor 7, ribonuclease P/MRP subunit | 0.014735 |
| RNPC3 | RNA-binding region (RNP1, RRM) containing 3 | ERI2 | ERI1 exoribonuclease family member 2 | -0.05978 |
| DNMT3B | DNA (cytosine-5-)-methyltransferase 3 beta | ZC3H8 | Zinc finger CCCH-type containing 8 | 0.139235 |
| SMAD9 | SMAD family member 9 | ERI2 | ERI1 exoribonuclease family member 2 | -0.09502 |
| TDRKH | Tudor and KH domain containing | URB1 | URB1 ribosome biogenesis 1 homolog | -0.13488 |
| TDRKH | Tudor and KH domain containing | ZC3HAV1L | Zinc finger CCCH-type, antiviral 1-like | -0.01879 |
| TDRKH | Tudor and KH domain containing | ZC3H12C | Zinc finger CCCH-type containing 12C | -0.25482 |
| PPIL4 | Peptidylprolyl isomerase (cyclophilin)-like 4 | DARS2 | Aspartyl-tRNA synthetase 2, mitochondrial | -0.37219 |
| TFB2M | Transcription factor B2, mitochondrial | RBPMS | RNA-binding protein with multiple splicing | 0.029924 |
| DARS2 | Aspartyl-tRNA synthetase 2, mitochondrial | HINT3 | Histidine triad nucleotide binding protein 3 | 0.126521 |
| PABPC3 | Poly(A) binding protein, cytoplasmic 3 | ZC3H12D | Zinc finger CCCH-type containing 12D | 0.157939 |
| FASTK | Fas-activated serine/threonine kinase | SPATS2 | Spermatogenesis associated, serine-rich 2-like | -0.04858 |
| FASTKD3 | FAST kinase domains 3 | ZC3H12C | Zinc finger CCCH-type containing 12C | -0.06448 |
Figure 2. Survival curves for different risk groups in TCGA and GEO datasets. Patients in each cohort were stratified into a high- or low-risk group according to the optimal cutoff value. Kaplan-Meier curves were used to compare survival between risk groups in the TCGA cohort (a), two TCGA validation subcohorts (b, c), and the GEO validation cohort (d), and in groups stratified by pathologic stage (e, f), age (g, h), gender (i, j), and N stage (k, l).
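The stratification and survival comparison described in the Figure 2 caption can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the cutoff value, the function names `split_by_cutoff` and `kaplan_meier`, and the toy cohort are all hypothetical, and the source does not say how the optimal cutoff was chosen, so it is simply passed in as a parameter.

```python
from collections import Counter


def split_by_cutoff(risk_scores, cutoff):
    """Label each patient 'high' or 'low' relative to a given cutoff."""
    return ["high" if r > cutoff else "low" for r in risk_scores]


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with an observed death.

    times  : follow-up time per patient
    events : 1 if death was observed at that time, 0 if censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    survival, curve = 1.0, []
    for t in sorted(deaths):
        # Patients still under observation just before time t.
        at_risk = sum(1 for u in times if u >= t)
        survival *= 1.0 - deaths[t] / at_risk
        curve.append((t, survival))
    return curve


# Hypothetical toy cohort (NOT data from the study).
risk = [0.2, 1.4, 0.9, 2.1, 0.1, 1.8]
time = [60, 12, 48, 8, 72, 20]  # months of follow-up
dead = [0, 1, 1, 1, 0, 1]

groups = split_by_cutoff(risk, cutoff=1.0)  # cutoff chosen arbitrarily here
curves = {}
for label in ("high", "low"):
    idx = [i for i, g in enumerate(groups) if g == label]
    curves[label] = kaplan_meier([time[i] for i in idx], [dead[i] for i in idx])
    print(label, curves[label])
```

Comparing the two curves formally would still require a test such as the log-rank test, which this sketch leaves out.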
Figure 3. Risk score analysis plots for the TCGA (a) (n = 477) and GEO (b) (n = 386) datasets. The top row shows risk curves with patients ranked by risk score, and the middle row shows the corresponding survival status plots, which indicate worse survival in the high-risk group. The heat maps at the bottom show each patient's score for each gene pair. The score of a gene pair in a sample is either 0 or 1.
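One way a binary gene-pair score could feed into a per-patient risk score is sketched below. The source states only that each pair scores 0 or 1 and lists a coefficient per pair, so two things here are assumptions: the comparison direction (1 when gene A is expressed above gene B) and the linear form (coefficient-weighted sum of pair scores). The three pairs are copied from the signature table; the expression values and the names `pair_score` and `risk_score` are hypothetical.

```python
# A few (gene A, gene B, coefficient) rows copied from the signature table.
PAIRS = [
    ("WDR46", "RAE1", -0.29314),
    ("SNRPA1", "PABPC1L", 0.248844),
    ("WDR4", "PARS2", 0.622802),
]


def pair_score(expr, gene_a, gene_b):
    """Assumed rule: 1 if gene A is expressed above gene B, else 0."""
    return 1 if expr[gene_a] > expr[gene_b] else 0


def risk_score(expr, pairs=PAIRS):
    """Assumed linear form: sum of coefficient * binary pair score."""
    return sum(coef * pair_score(expr, a, b) for a, b, coef in pairs)


# Hypothetical expression values for one sample (NOT from the study).
sample = {
    "WDR46": 5.1, "RAE1": 6.3,
    "SNRPA1": 7.0, "PABPC1L": 2.2,
    "WDR4": 4.4, "PARS2": 3.9,
}

scores = [pair_score(sample, a, b) for a, b, _ in PAIRS]
print(scores, risk_score(sample))
```

Because each pair score depends only on the ordering of two genes within the same sample, a score of this kind does not change under any per-sample monotone rescaling of expression values, which is a common motivation for pairwise signatures.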
Summary of the results of univariate and multivariate analyses of the risk factors for the overall survival of patients with lung adenocarcinoma in TCGA cohort and the GEO cohort.
| Dataset | Variable | Univariate HR (95% CI) | Univariate P value | Multivariate HR (95% CI) | Multivariate P value |
|---|---|---|---|---|---|
| TCGA (exploratory dataset) | Age | 0.997 (0.978-1.015) | 0.718 | 1.008 (0.989-1.028) | 0.402 |
| | Gender | 1.000 (0.694-1.441) | 1.000 | 0.980 (0.673-1.428) | 0.917 |
| | Stage | 1.648 (1.396-1.946) | <0.001 | 2.119 (1.250-3.590) | 0.005 |
| | T stage | 1.600 (1.285-1.994) | <0.001 | 1.083 (0.835-1.404) | 0.549 |
| | M stage | 1.748 (0.959-3.187) | 0.068 | 0.272 (0.071-1.039) | 0.057 |
| | N stage | 1.787 (1.455-2.195) | <0.001 | 0.759 (0.472-1.233) | 0.258 |
| | Risk-score | 3.967 (3.068-5.131) | <0.001 | 3.666 (2.791-4.815) | <0.001 |
| GSE72094 (validation dataset) | Age | 1.008 (0.985-1.031) | 0.517 | 1.004 (0.980-1.028) | 0.754 |
| | Gender | 1.885 (1.223-2.905) | 0.004 | 2.205 (1.417-3.430) | <0.001 |
| | Smoking | 1.261 (0.549-2.897) | 0.584 | 0.904 (0.390-2.095) | 0.815 |
| | Stage | 1.704 (1.393-2.083) | <0.001 | 1.855 (1.499-2.297) | <0.001 |
| | Risk-score | 2.202 (1.525-3.179) | <0.001 | 2.214 (1.533-3.198) | <0.001 |
Abbreviations: HR: hazard ratio; CI: confidence interval.
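In a Cox model the hazard ratio for a covariate is exp(β), and a Wald-type 95% CI is exp(β ± 1.96·SE). If the intervals in the table were computed that way (an assumption; the source does not state the interval method), each reported HR should sit near the geometric mean of its CI bounds, and the standard error of β can be back-calculated. The sketch below does this arithmetic for the two multivariate risk-score rows; the function names are hypothetical.

```python
import math


def log_hr_and_se(hr, lower, upper, z=1.96):
    """Back-calculate beta = ln(HR) and its SE, assuming a Wald-type CI."""
    beta = math.log(hr)
    se = (math.log(upper) - math.log(lower)) / (2 * z)
    return beta, se


def ci_midpoint(lower, upper):
    """Geometric mean of the CI bounds; equals the HR for a Wald-type CI."""
    return math.sqrt(lower * upper)


# Multivariate risk-score rows (HR, lower, upper) from the table above.
rows = {
    "TCGA": (3.666, 2.791, 4.815),
    "GSE72094": (2.214, 1.533, 3.198),
}

for name, (hr, lower, upper) in rows.items():
    beta, se = log_hr_and_se(hr, lower, upper)
    print(name, round(beta, 3), round(se, 3), round(ci_midpoint(lower, upper), 3))
```

A midpoint that differs noticeably from the reported HR would suggest a transcription error or a non-Wald interval.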
Figure 4. Relative fractions of infiltrated immune cells in the high- and low-risk groups in the TCGA dataset. Violin plots show differences in immune cell abundance between the two groups (∗P < 0.05, ∗∗P < 0.01, and ∗∗∗P < 0.001).
Figure 5. Differences in the tumor microenvironment (TME) and tumor mutational burden (TMB) between high- and low-risk groups: (a) immune score; (b) stromal score; (c) tumor purity; (d) tumor mutational burden.
Figure 6. Circle plots of GO and KEGG enrichment analysis results for genes in the signature. (a) GO enrichment analysis showed that these genes are mainly enriched in GO terms including ncRNA processing and rRNA metabolic process. (b) KEGG enrichment analysis revealed that genes within the signature were predominantly enriched in five pathways, including RNA degradation (P < 0.05 and Q < 0.05). BP: biological process; CC: cellular component; MF: molecular function.
Figure 7. GSEA of the TCGA cohort with oncogenic signature gene sets. Three gene sets were significantly enriched in the high-risk group (P < 0.05, FDR Q-value < 0.25). GSEA: gene set enrichment analysis.
Figure 8. Protein-protein interaction (PPI) network analysis of the genes that make up the signature. (a) PPI network between the genes built with STRING; different edge colors represent different types of protein-protein association. (b) PPI network plotted with Cytoscape; red indicates upregulated expression of the gene in tumor tissue and blue indicates downregulation.