Background: We sought to screen and verify the long non-coding ribonucleic acids (lncRNAs) related to immune infiltration in metastatic osteosarcoma (OS). Methods: We downloaded the RNA-sequencing expression data from The Cancer Genome Atlas (TCGA) database as the training data set. We downloaded the GSE39055 data set from the National Center for Biotechnology Information, Gene Expression Omnibus as the validation data set. The least absolute shrinkage and selection operator (LASSO) regression algorithm was used to screen the optimized lncRNA combinations. Kaplan-Meier curves were used to evaluate the associations between the lncRNAs and actual prognosis. The independent survival prognosis clinical factors were obtained by univariate and multivariate Cox analyses. A nomogram was established to explore the correlation between survival rate and risk information. The Tumor IMmune Estimation Resource was applied to estimate the composition of 6 subtypes of immune infiltration cells. Results: In total, 1,398 lncRNAs and 14,631 messenger RNAs were screened from TCGA data set, and divided into the low and high immunity groups. The Estimation of STromal and Immune cells in MAlignant Tumour tissues using Expression data (ESTIMATE) scores differed significantly between the samples in the two groups. Additionally, 5 optimized lncRNA combinations were obtained using the LASSO algorithm. Risk factors including age, metastatic tumor, and risk-score (RS) were related to the prognosis of OS patients. The survival rates predicted by the nomogram model were consistent with the actual survival rates of OS patients. Finally, we found that RS was negatively correlated with the proportion of immune cells. Conclusions: In total, 5 feature lncRNAs were identified as novel biomarkers for OS. Next, a RS nomogram model was constructed based on the 5 identified lncRNAs. This model predicted the survival rates and prognoses of OS patients well. 2022 Translational Cancer Research. All rights reserved.
Background: We sought to screen and verify the long non-coding ribonucleic acids (lncRNAs) related to immune infiltration in metastatic osteosarcoma (OS). Methods: We downloaded the RNA-sequencing expression data from The Cancer Genome Atlas (TCGA) database as the training data set. We downloaded the GSE39055 data set from the National Center for Biotechnology Information, Gene Expression Omnibus as the validation data set. The least absolute shrinkage and selection operator (LASSO) regression algorithm was used to screen the optimized lncRNA combinations. Kaplan-Meier curves were used to evaluate the associations between the lncRNAs and actual prognosis. The independent survival prognosis clinical factors were obtained by univariate and multivariate Cox analyses. A nomogram was established to explore the correlation between survival rate and risk information. The Tumor IMmune Estimation Resource was applied to estimate the composition of 6 subtypes of immune infiltration cells. Results: In total, 1,398 lncRNAs and 14,631 messenger RNAs were screened from TCGA data set, and divided into the low and high immunity groups. The Estimation of STromal and Immune cells in MAlignant Tumour tissues using Expression data (ESTIMATE) scores differed significantly between the samples in the two groups. Additionally, 5 optimized lncRNA combinations were obtained using the LASSO algorithm. Risk factors including age, metastatic tumor, and risk-score (RS) were related to the prognosis of OS patients. The survival rates predicted by the nomogram model were consistent with the actual survival rates of OS patients. Finally, we found that RS was negatively correlated with the proportion of immune cells. Conclusions: In total, 5 feature lncRNAs were identified as novel biomarkers for OS. Next, a RS nomogram model was constructed based on the 5 identified lncRNAs. This model predicted the survival rates and prognoses of OS patients well. 2022 Translational Cancer Research. All rights reserved.
Osteosarcoma (OS) is a common clinical malignant bone tumor (1,2). Despite advances in the treatment of OS, nearly 80% of patients retain the diseased limbs, the 5-year survival rate has increased from about 20% to nearly 80%, and >50% of patients still die from metastatic OS (3,4). In recent years, with the continuous improvement of multi-stage, multi-factor, and multi-step theories about the regulation of tumor metastasis factors, we have extended our understanding of the extremely complex pathological process of tumor metastasis, and discovered and confirmed that biomolecules are highly related to the process of tumor metastasis (5). Biomolecules have become a hot spot in the research field of tumor metastasis, as research on biomolecules allows us to better understand the mechanisms underlying tumor metastasis (6). Further, and more importantly, these molecules and their products may become targets for anti-tumor metastasis or indicators for observing the prognosis and metastasis of tumor patients (7,8).Among the various cell types related to the progression and development of tumors, the effect of tumor-infiltrating lymphocytes on prognosis has been extensively studied. Previous research has shown that the level extent of tumor infiltrating lymphocytes is a novel supplementary biomarker for predicting recurrence and mortality (9,10). In addition to lymphocytes, tumors usually contain different non-lymphocyte immune cells (11,12), which are regarded as having unique effects on prognosis in different types of cancer (13). However, traditional methods for examining tumor immune cell infiltration (e.g., immunohistochemistry or flow cytometry) do not completely assess the immune function of different types of cells. This is mainly because the existing technology is limited as to the number of immune signatures that can be detected.Long non-coding ribonucleic acid (lncRNA) is a type of RNA molecule whose transcript length exceeds 200 nt (14), and is a new type of gene regulatory factor. Studies have shown that lncRNA is closely associated with the occurrence, development, and prognosis of human diseases, especially tumors (15,16). Further research has shown that the abnormal expression of lncRNA might be associated with epithelial-mesenchymal transition, cell apoptosis, metastasis, invasion, and the poor prognosis of OS patients (17). Yu et al. (18) identified 5 metastasis-associated lncRNAs (including LAMA5-AS1, RP5-894D12.4, RP11-231|13.2, RP11-128N14.5 and RP11-346L1.2) which are regarded as potential prognostic indicators for OS. Deng et al. (19) established a four-methylated lncRNA signature (including MAP3K14, SNHG12, DSCR8 and IGF2BP2-AS1) for determining OS patient prognosis. In recent years, the identification of lncRNAs that mediate the immune microenvironment of OS patients has been a research hotspot (20,21). Previous research has shown that there are many different immune cell infiltration types in OS, not only in the tumor matrix, but also in the cancer nest. Additionally, OS prognosis is associated with the type and the number of immune cell infiltrations around the tumor (22,23).In the present study, we assessed the immune infiltration cells in OS tumor samples downloaded from The Cancer Genome Atlas (TCGA) based on a single-sample gene set enrichment analysis (ssGSEA) algorithm. We then divided the samples into a high immune cell infiltration group and a low immune cell infiltration group. After combining the sample metastasis information, we identified the RNAs associated with metastasis and immunity. The Cox regression analysis and least absolute shrinkage and selection operator (LASSO) algorithm were then used to identify the prognostic lncRNA markers. Finally, based on the prognostic lncRNA markers, we constructed and verified the survival prognosis prediction model. Compared with the previous studies, we adopted a larger sample size and the prediction performance of the prognostic model based on the five-lncRNA signature was quite good. Our study provided valuable lncRNAs as promising prognostic biomarkers for OS metastasis. We present the following article in accordance with the TRIPOD reporting checklist (available at https://tcr.amegroups.com/article/view/10.21037/tcr-22-1926/rc).
Methods
Data source and preprocessing
The study was conducted in accordance with the Declaration of Helsinki (as revised in 2013). We downloaded the RNA-sequencing expression data from TCGA database, including 265 samples from the Illumina HiSeq 2000 RNA Sequencing platform. After the 265 samples were matched with the clinical information of the OS samples, the samples with information on metastasis were retained. In total, 176 OS samples were included in the analysis, and these were used as the training data set. Additionally, the GSE39055 data set (24), which included 37 OS samples, was obtained from National Center for Biotechnology Information, Gene Expression Omnibus (25) (https://www.ncbi.nlm.nih.gov/geo/) based on the platform of Illumina HumanHT-12 WG-DASL V4.0 R2 expression beadchip. The GSE39055 data set was used as the validation data set.The lncRNAs and mRNAs in TCGA data set were annotated according to the annotation file of TCGA detection platform. Next, we obtained the Illumina HumanHT-12 WG-DASL V4.0 R2 expression beadchip platform from the Ensembl genome browser 96 (http://asia.ensembl.org/index.html), including probe, gene symbol, RNA type, and other information. We then re-annotated the detection probes in the downloaded expression profile data set to screen the corresponding expressions of the mRNAs and lncRNAs.
Construction and evaluation of OS immune grouping
A gene set variation analysis of the microarray and RNA-sequencing data (26) (version 1.36.3, http://www.bioconductor.org/packages/release/bioc/html/GSVA.html) based on the ssGSEA algorithm (27) was conducted to assess the type of immune infiltration in the OS samples. We grouped the immune infiltration types according to the results of the ssGSEA and named them the “Immunity_H” and “Immunity_L” groups. The estimate package in R (28) (version 3.6.1, http://127.0.0.1:29606/library/estimate/html/estimateScore.html) was applied to calculate the stromal, immune, and ESTIMATE scores. Next, the differences in the scores of the immune infiltration groups were compared to verify the correction of the immune infiltration groups. Additionally, CIBERSORT (29) (https://cibersort.stanford.edu/index.php) was used to calculate the proportions of various immune cells on the basis of the expression level of TCGA OS tumor samples. Next, the difference in the composition ratios of various immune cells between the immune infiltration groups was compared to verify the correction of the immune infiltration groups.
Screening of significantly differentially expressed RNAs
According to the Immunity_H and Immunity_L grouping and the information as to whether or not there was tumor metastasis, limma package (30) (version 3.6.1, https://bioconductor.org/packages/release/bioc/html/limma.html) was used to screen the differentially expressed genes (DEGs) with a cutoff of false discovery rate (FDR) <0.05 and |log2 fold change (FC)| >0.263. We then compared the sets of DEGs filtered between the two comparison groups and selected the intersection genes for further analysis.
Construction of prognostic model
A univariate Cox regression analysis was conducted using survival package of R (version 3.6.1, http://bioconductor.org/packages/survivalr/) to identify the lncRNAs related to survival prognosis in TCGA data set. The LASSO regression algorithm in the lars package (version 1.2) (31) of R (version 3.6.1, https://cran.r-project.org/web/packages/lars/index.html) was then used to perform the survival regression analysis to obtain the optimal lncRNAs based on the lncRNAs related to survival prognosis.According to the LASSO coefficient of each element in the optimized lncRNAs and their expression in TCGA training data set, we constructed the risk-score (RS) model. The following RS formula was used:where represented the LASSO coefficient of the target lncRNA, and represented the lncRNA expression in TCGA training data set.We also calculated the RSs in TCGA training data set and the GSE39055 validation data set. Additionally, we separated TCGA training data set and the GSE39055 data set into high- and low-risk groups based on the RS median value. Next, Kaplan-Meier (KM) curves of survival package (32) in R (version 3.6.1) were used to evaluate the correlations between the high- and low-risk groups and the actual prognosis information. Additionally, based on the lncRNAs related to the RS model in TCGA training data set, the KM curves of survival package (32) (version 2.41-1) in R (version 3.6.1) were used to assess the associations between the different expressions of lncRNAs and actual prognosis information.
Construction of nomogram survival rate model with the independent survival prognostic factors
The univariate and multivariate Cox regression analyses of survival package (version 2.41-1) in R (version 3.6.1) were used to screen the independent survival prognosis clinical factors with a cutoff of log-rank P value < 0.05. Next, to further study the prognostic independence between the clinical prognostic factors and RS factors, the rms package (version 5.1-2) in R3.6.1 (https://cran.r-project.org/web/packages/rms/index.html) (33) was used to construct nomogram 3- and 5-year survival rate prediction models on the basis of independent prognosis factors and risk information in the RS model. We then calculated the concordance index (C-index) coefficient of the nomogram prognostic model. The C-index is an evaluation index for assessing the predictive ability of the model (34), and was calculated using the survcomp package in R (version, 1.34.0 http://www.bioconductor.org/packages/release/bioc/html/survcomp.html) (35). A C-index >0.70 indicates that the model had good predictive ability.
Construction and functional analysis of the characteristic lncRNA correlation network
The cor function in R 3.6.1 (http://77.66.12.57/R-help/cor.test.html) was used to calculate the Pearson correlation coefficients (PCCs) between the lncRNAs in the RS model and the overlapping DEGs obtained by comparing the immunity_H and immunity_ L groups and the groups with and without tumor metastasis. Subsequently, we retained the pairs for which the absolute value of the PCC was >0.3 and the P value was <0.05 to construct the lncRNA-mRNA co-expression network, which was visualized using Cytoscape (36) (version 3.6.1, https://cytoscape.org/). Finally, we used the Database for Annotation, Visualization, and Integrated Discovery (DAVID) (37,38) (version 6.8, http://metascape.org/) to perform the Gene Ontology biological process (BP) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses of the DEGs in the co-expression network with a threshold of P value <0.05.
Correlations between the prognostic signals of the feature RNAs and immune cell subtype infiltration
The Tumor IMmune Estimation Resource (TIMER) contains many conventional cancer data sets in TCGA data set, which can estimate the composition of 6 types of immune infiltrating cells (i.e., B cells, cluster of differentiation (CD)4+ T cells, CD8+ T cells, neutrophils, macrophages, and dendritic cells). Thus, the TIMER (39) (https://cistrome.shinyapps.io/timer/) was used to analyze the immune cells related to OS based on TCGA training data set. We then calculated the correlation between each type of immune cell subtype and the RS value based on TCGA samples to observe the correlation between the prognostic signal based on each feature lncRNA and the infiltration of various immune cell subtypes.
Statistical analysis
All the statistical analysis was performed by R (version 4.1.2), and P value <0.05 was considered to be statistically significant.
Results
OS immune group construction and evaluation
A total of 1,398 lncRNAs and 14,631 mRNAs were obtained from the annotation of the lncRNAs and mRNAs in TCGA training data set. Next, the ssGSEA algorithm was used to evaluate the immune infiltration type of each sample based on the expression level of the OS tumor samples. As shows, the sample was clearly divided into two clusters: cluster 1 (comprising 73 samples) and cluster 2 (comprising 103 samples).
Figure 1
OS immune group construction and evaluation. (A) Heatmap of the immune evaluation based on ssGSEA. (B) Scores obtained by the ESTIMATE algorithm in the Immunity_H and Immunity_L groups. The red and blue points represent the samples in the Immunity_H and Immunity_L groups, respectively. (C) Distribution of immune cell composition in Immunity_H and Immunity_L groups. ***, P<0.001. OS, osteosarcoma.
OS immune group construction and evaluation. (A) Heatmap of the immune evaluation based on ssGSEA. (B) Scores obtained by the ESTIMATE algorithm in the Immunity_H and Immunity_L groups. The red and blue points represent the samples in the Immunity_H and Immunity_L groups, respectively. (C) Distribution of immune cell composition in Immunity_H and Immunity_L groups. ***, P<0.001. OS, osteosarcoma.Additionally, the immune score, including the ESTIMATE score, immune score, and stromal score, were obtained using the ESTIMATE algorithm (see ). We found that the ESTIMATE scores differed significantly between the samples in the Immunity_H group and Immunity_L group. Further, the scores of the samples in the Immunity_H group were significantly higher than those in the Immunity_L group. Additionally, the evaluation of the immune cell types based on CIBERSORT (see ) showed that important immune cell types, such as naive B cells, also differed significantly between the Immunity_H group and Immunity_L group based on the ssGSEA evaluation grouping. All these results indicated that the Immunity_H and Immunity_L groups obtained based on the ssGSEA evaluation grouping could be used for the subsequent grouping analysis.
Screening of significant DEGs
Based on the Immunity_H and Immunity_L groupings obtained by ssGSEA, and according to whether the sample was metastatic or not, a total of 3465 and 646 DEGs were screened from both the Immunity_H and Immunity_L groups, and the with or without tumor metastasis groups, respectively (see ). Next, we identified a total of 315 overlapping DEGs, including 53 lncRNAs and 262 mRNAs by comparing the two groups (see ). The overlapping DEGs were used in the further analyses. Specifically, the lncRNAs were used to construct prognostic models, and the mRNAs were used in the functional analyses of the feature lncRNAs.
Figure 2
Screening of the significant DEGs. (A) Volcano map of DEGs in with vs. without tumor metastasis groups and in Immunity_H vs. Immunity_L groups; the red and green dots indicate significantly upregulated and downregulated RNAs, respectively; the horizontal dashed line indicates FDR <0.05; the 2 vertical dashed lines indicate |log2FC| >0.263. (B) Venn diagram of the comparison of the 2 sets of DEGs. DEGs, differentially expressed genes; FDR, false discovery rate; FC, fold change.
Screening of the significant DEGs. (A) Volcano map of DEGs in with vs. without tumor metastasis groups and in Immunity_H vs. Immunity_L groups; the red and green dots indicate significantly upregulated and downregulated RNAs, respectively; the horizontal dashed line indicates FDR <0.05; the 2 vertical dashed lines indicate |log2FC| >0.263. (B) Venn diagram of the comparison of the 2 sets of DEGs. DEGs, differentially expressed genes; FDR, false discovery rate; FC, fold change.
Prognosis analysis and model construction
Based on the 53 lncRNAs from the overlapping DEGs, we identified 9 lncRNAs associated with prognosis using a univariate Cox regression analysis. Further, 5 optimal lncRNA combinations were identified using the LASSO algorithm. Additionally, we developed the following RS calculation formula based on the LASSO regression coefficient of the expression of 5 lncRNAs in TCGA training data set:We calculated the RSs of the samples in TCGA training data set and the GSE390555 validation data set. The distribution of RSs, and the distribution of clinical survival information are shown in .
Figure 3
Screening of the optimized lncRNA combinations. Distribution of RSs, survival prognostic information, and the heatmap of 5 characteristic lncRNAs in the samples in TCGA training set (A) and GSE39055 validation set (B). TCGA, The Cancer Genome Atlas; RSs, risk-scores; lncRNAs, long non-coding ribonucleic acids.
Screening of the optimized lncRNA combinations. Distribution of RSs, survival prognostic information, and the heatmap of 5 characteristic lncRNAs in the samples in TCGA training set (A) and GSE39055 validation set (B). TCGA, The Cancer Genome Atlas; RSs, risk-scores; lncRNAs, long non-coding ribonucleic acids.Additionally, the 5 lncRNAs in the prognosis model were separated into high expression and low expression groups based on the median value in TCGA training data set. Next, KM curves were used to analyze the relationship between the expression group and survival prognosis. As shows, the high expression of linc00243 (P=0.029), linc00545 (P=0.00016), linc00672 (P=0.00015), and linc00892 (P=0.001) were significantly related to good survival. While the low expression of linc00626 was significantly related to good survival (P=0.014).
Figure 4
The prognostic KM curves of lncRNA expression. The blue and red curves indicate the low and high expression sample groups, respectively. lncRNAs, long non-coding ribonucleic acids. KM, Kaplan-Meier.
The prognostic KM curves of lncRNA expression. The blue and red curves indicate the low and high expression sample groups, respectively. lncRNAs, long non-coding ribonucleic acids. KM, Kaplan-Meier.Additionally, KM curves were used to assess the correlations between the high and low risk groups and actual prognosis information in TCGA training data set and the validation data set, respectively. The results indicated that there was an obvious association between the different risk groups into which the samples had been divided based on the predictions of the RS model and the actual prognoses of patients in TCGA training data set and the GSE39055 validation data set (see ). Additionally, receiver operating characteristic (ROC) curves were used to examine the prediction results of the 1-, 3-, and 5-year survival rates based on the RS prognosis model in TCGA training data set and the GSE39055 validation data set. The results indicated that the RS prognosis model predicted the survival rates of patients in TCGA training data set and the GSE39055 validation data set well (see ).
Figure 5
Evaluation and comparison of the effectiveness of the prognostic risk prediction model. Prognosis-related KM curves of samples in TCGA (A) and GSE39055 (B). The blue and red curves represent the low- and high-risk samples, respectively. The 1-, 3-, and 5-year ROC curves of samples in TCGA (C) and GSE39055 (D). KM, Kaplan-Meier; TCGA, The Cancer Genome Atlas; AUC, Area Under Curve; ROC, receiver operating characteristic.
Evaluation and comparison of the effectiveness of the prognostic risk prediction model. Prognosis-related KM curves of samples in TCGA (A) and GSE39055 (B). The blue and red curves represent the low- and high-risk samples, respectively. The 1-, 3-, and 5-year ROC curves of samples in TCGA (C) and GSE39055 (D). KM, Kaplan-Meier; TCGA, The Cancer Genome Atlas; AUC, Area Under Curve; ROC, receiver operating characteristic.
Screening of independent clinical factors
In total, 3 independent clinical factors related to prognosis (i.e., age, metastatic, and RS model status) were screened by the univariate and multivariate Cox regression analyses (see ).
Table 1
The information of clinical factors
Clinical characteristics
TCGA (N=176)
Univariate Cox analysis
Multivariate Cox analysis
HR
95% CI
P
HR
95% CI
P
Age (years), mean ± SD
61.10±15.21
1.018
1.001–1.036
3.99E-02
1.035
1.011–1.059
4.08E-03
Gender (male/female)
72/104
1.06
0.642–1.750
8.20E–01
–
–
–
Pathologic tumor depth (cm), mean ± SD
6.35±3.68
1.135
1.051–1.225
9.96E-04
0.997
0.867–1.145
9.63E-01
Pathologic tumor length (cm), mean ± SD
11.89±7.25
1.062
1.031–1.093
3.91E-05
1.089
0.965–1.228
1.66E-01
Pathologic tumor width (cm), mean ± SD
8.85±5.51
1.092
1.043–1.144
1.37E-04
0.996
0.853–1.163
9.58E-01
Tumor recurrence (yes/no/–)
28/141/7
2.603
1.533–4.422
2.38E-04
0.971
0.446–2.116
9.41E-01
Metastatic tumor (yes/no)
56/120
3.014
1.834–4.954
4.80E-06
2.992
1.516–5.904
1.58E-03
Radiotherapy (yes/no/–)
64/110/2
0.865
0.517–1.447
5.80E-01
–
–
–
Tumor necrosis (no/slight/moderate/severe/–)
61/35/59/11/10
1.182
0.923–1.513
1.83E-01
–
–
–
PS model status (high/low)
88/88
3.213
1.860–5.550
8.68E-06
4.107
1.916–8.802
2.81E-04
TCGA, The Cancer Genome Atlas; HR, hazard ratio; CI, confidence interval; SD, standard deviation.
TCGA, The Cancer Genome Atlas; HR, hazard ratio; CI, confidence interval; SD, standard deviation.Additionally, to analyze the correlation between the independent clinical factors (i.e., age, a metastatic tumor, and RS model status) and survival prognosis, we conducted a nomogram survival rate model to examine the 3- and 5-year overall survival rates for the OS samples, and the results are shown in . Subsequently, the points of each indicator were summed, thereby predicting the survival probabilities at 3 and 5 years. A calibration curve was also constructed to assess the nomogram model results. As shown in , the 3 and 5 years survival time with C-index of 0.7459 and 0.7328 indicated that the nomogram model could well predict the survival rates for patients with OS.
Figure 6
Construction of nomogram survival rate model with independent prognostic factors. (A) Nomogram of survival rate prediction model. (B,C) Consistency line graph of nomogram-predicted and actual 3-year (B) and 5-year (C) survival rate; the horizontal axis represents the predicted survival rate, and the vertical axis represents the actual survival rate.
Construction of nomogram survival rate model with independent prognostic factors. (A) Nomogram of survival rate prediction model. (B,C) Consistency line graph of nomogram-predicted and actual 3-year (B) and 5-year (C) survival rate; the horizontal axis represents the predicted survival rate, and the vertical axis represents the actual survival rate.
Construction and functional analysis of the feature lncRNA correlation network
A total of 239 significant pairs related to OS with a threshold of PCC >3 and P value <0.05 were identified based on the expression levels of the lncRNAs in the RS model and the overlapping DEGs in TCGA training data set. Subsequently, we constructed a co-expression network of the lncRNAs and mRNAs involved with the 5 feature lncRNAs (i.e., linc00243, linc00892, linc00626, linc00545, and linc00672) and 143 mRNAs (e.g., LILRB4, MMP13, and CHRD) (see ).
Figure 7
The co-expression network of the feature lncRNAs and intersection mRNAs. The squares and circles represent the lncRNAs and intersection mRNAs, respectively. lncRNAs, long non-coding ribonucleic acids.
The co-expression network of the feature lncRNAs and intersection mRNAs. The squares and circles represent the lncRNAs and intersection mRNAs, respectively. lncRNAs, long non-coding ribonucleic acids.Next, we performed the DAVID-based BP and KEGG pathway enrichment analysis of the genes in the co-expression network, and a total of 16 significantly related BPs, such as signal transduction and positive regulation of interferon-gamma production, and 11 KEGG pathways, such as cytokine receptor interaction, were screened (see ).
Table 2
The significantly related BPs and KEGG pathways of genes contained in the co-expression network
Category
Term
Count
P value
Gene
BP
GO:0032729~positive regulation of interferon-gamma production
5
3.86E-04
IL18, IRF8, TNF
GO:0007165~signal transduction
21
3.96E-04
ARHGAP9, IL15RA, ARRDC5,
GO:0006869~lipid transport
5
2.56E-03
PTGES
GO:0032526~response to retinoic acid
4
3.57E-03
CD38, AQP3, DKK
GO:0006955~immune response
10
4.48E-03
C3, TMIGD2, CCL22
GO:0030514~negative regulation of BMP signaling pathway
4
4.65E-03
HTRA3, CHRD, DKK1
GO:0002250~adaptive immune response
6
5.17E-03
CD79A, CLEC10A, SH2D1B
GO:0006909~phagocytosis
4
5.58E-03
CEBPE, CEACAM4, IRF8
GO:0006954~inflammatory response
9
7.81E-03
C3, PSTPIP1, VNN1
GO:0050776~regulation of immune response
6
1.10E-02
NCR1, C3, CD200R1
GO:0030593~neutrophil chemotaxis
4
1.34E-02
CSF3R, CCL22, CCL17
GO:0045087~innate immune response
9
1.58E-02
GO:0019221~cytokine-mediated signaling pathway
5
1.71E-02
GO:0031295~T cell co-stimulation
4
2.09E-02
GO:0051897~positive regulation of protein kinase B signaling
4
2.53E-02
GO:0070374~positive regulation of ERK1 and ERK2 cascade
BP, biological process; GO, Gene Ontology; KEGG, Kyoto Encyclopedia of Genes and Genomes.
BP, biological process; GO, Gene Ontology; KEGG, Kyoto Encyclopedia of Genes and Genomes.
Correlations between the feature lncRNA prognostic signals and immune cell subtype infiltration
The online TIMER tool was applied to analyze the proportion of immune cells associated with OS according to the expression levels of TCGA OS samples. Next, we calculated the correlation between the prognostic signal RS value of the characteristic lncRNAs and the subtypes of immune infiltration cells. The results indicated that the RSs were negatively correlated to the proportion of kinds of immune cells (see ).
Figure 8
Scatter plot of the correlations between RS and various types of immune cells in TIMER. RS, risk-score.
Scatter plot of the correlations between RS and various types of immune cells in TIMER. RS, risk-score.
Discussion
OS is an aggressive malignant bone tumor with heterogeneous biology (40,41). A previous study indicated that the heterogeneous biology of OS is related to the tumor microenvironment (15). OS tissue is not only made up of OS cells, but also contains a variety of other cells, including stromal cells, fibroblasts, and immune cells (4). The immune environment of osteosarcoma mainly consists of myeloid (macrophages, monocytes, dendritic cells) and lymphoid cells, as well as few B lymphocytes and mast cells. The recruitment and differentiation of immune infiltrating cells are controlled by OS cells, inducing a local immunosuppressive environment which contributes to the tumor growth and metastasis. Therefore, improving the antitumor immune responses by means of reprogramming the immune microenvironment is a challenging opportunity to the osteosarcoma therapy (4). In our study, we explored the heterogeneity of OS and the relationship between tumor metastasis and immune infiltrating cells. We obtained the OS RNA-sequencing data and clinical information from TCGA, and the results indicated that there were significant differences between the high immune cell infiltration group and the low immune cell infiltration group in terms of immune, stromal, and ESTIMATE scores.Additionally, previous studies have reported that lncRNAs are involved in the occurrence, invasion, progression, and metastasis of OS. In our study, we identified 5 lncRNAs (i.e., linc00243, linc00892, linc00626, linc00545, and linc00672) that are related to the prognosis of OS patients. Similarly, Feng et al. (42) found linc00243 expression is significantly correlated to the prognosis and survival time of non-small cell lung cancer patients. Linc00892 is related to the microenvironment for bladder cancer patients, and is regarded as an important biomarker for bladder cancer metastasis and progression (43). Additionally, Linc00892 is related to some types of cancer, such as endometrial cancer (44,45), lung adenocarcinoma (46), and colorectal cancer (47), and is a novel biomarker for the prognosis and diagnosis of tumors.We also constructed a RS model to verify the function of lncRNAs in the prognosis of OS patients, and found that the expression of lncRNAs is associated with OS patients’ survival time. Further, univariate and multivariate Cox analyses were used to analyze the independent clinical factors, and then, a nomogram was constructed to predict the 3- and 5-year survival rates. We found that the survival rates predicted by the nomogram model were consistent with actual observations for OS patients at 3 and 5 years. All these results suggested that these 5 lncRNAs are key markers for treating OS in immunotherapy. However, further research on the functional verification of these key lncRNAs with clinical prognostic value and their involved multilayer genetic regulation networks is warranted to deeply understand the mechanism, which may contribute to its clinical application.We also found that the 5 feature lncRNAs were negatively correlated to immune cells, including CD8+ T cells, CD4+ T cells, B cells, macrophages, neutrophils, and myeloid dendritic cells. The antigen-antibody complementarity determining regions of the T-cell and B-cell receptors have an important function in the recognition of tumor-specific antigens (48,49). Studying the sequence features of tumor infiltrative T-cell and B-cell surface receptors is useful in analyzing the interactions between immune cells and tumor cells, and provides a novel method for the diagnosis and treatment of cancer. After surgery, cancer tissues often include a number of immune infiltration cells, which cause the RNA sequencing data of the tumor tissue to be mixed with various information about the tumor immune microenvironment.
Conclusions
Our research identified 5 feature lncRNAs as novel biomarkers for OS. We also constructed a RS nomogram model based on the 5 feature lncRNAs. This model predicted the survival rates and prognoses of OS patients well. The 5 lncRNAs for OS were related to the infiltration of immune cell subtypes.The article’s supplementary files as