Shengkang Dai (1,2), Desheng Yao (1)
1. Department of Gynecologic Oncology, Guangxi Medical University Cancer Hospital, Nanning, China.
2. People's Hospital of Baise, Baise, China.
Abstract
BACKGROUND: Several immune-associated long non-coding RNA (lncRNA) signatures have been reported as prognostic models in different types of cancers; however, the immune-associated lncRNA signature for predicting overall survival (OS) in cervical cancer is unknown. METHODS: The lncRNA expression profiles and clinical data of cervical cancer were acquired from The Cancer Genome Atlas (TCGA) dataset. Immune-associated genes were extracted from the Molecular Signatures Database (MSigDB), and the immune-associated lncRNAs were extracted for Cox regression analysis. Principal component analysis (PCA) was used to distinguish the high and low risk status of cervical cancer patients. Gene Set Enrichment Analysis (GSEA) was used for functional analyses. RESULTS: Cox regression analyses and the least absolute shrinkage and selection operator (LASSO) Cox regression model were used to construct an immune-associated ten-lncRNA signature (containing AL021807.1, AL109976.1, LINC02446, MIR4458HG, AC004540.2, AC009065.8, AC083809.1, AC055822.1, AP000904.1, and FBXL19-AS1) for predicting OS in cervical cancer. The signature segregated the cervical cancer patients into 2 groups (high-risk group and low-risk group). The Kaplan-Meier survival curves of AL021807.1, AL109976.1, LINC02446, and MIR4458HG were statistically significant (P<0.05) and the others (including AC004540.2, AC009065.8, AC083809.1, AC055822.1, AP000904.1, and FBXL19-AS1) were not statistically significant (P>0.05). The Kaplan-Meier survival curves of the signature were statistically significant (P=1.134e-10), and the 5-year survival rate was 0.444 in the high-risk group [95% confidence interval (CI): 0.334 to 0.590] and 0.884 in the low-risk group (95% CI: 0.807 to 0.969). The area under curve (AUC) of the receiver operating characteristic (ROC) curve of the signature was 0.833. The concordance index (C-index) of the signature was 0.788 (95% CI: 0.730 to 0.846, P=1.884778e-22). 
The PCA successfully distinguished the high-risk group and low-risk group based on the signature. The GSEA showed that the signature-related protein coding genes (PCGs) may participate in immunologic biological processes and pathways. CONCLUSIONS: This study revealed that the immune-associated ten-lncRNA signature is an independent factor for cervical cancer prognosis prediction, providing a bright future for immunotherapy of cervical cancer patients. 2021 Translational Cancer Research. All rights reserved.
Keywords:
Cervical cancer; immune; long non-coding RNA (lncRNA); signature; survival
According to the cancer statistics of China in 2015 (1), although great progress has been made in the prevention, diagnosis, and treatment of cervical cancer, its morbidity has trended upward in China in comparison with developed countries. Despite many studies in recent years, the global cancer statistics of 2018 show that, in contrast to high-income countries, cervical cancer is the leading cause of cancer-related death in 42 low-income and lower-middle-income countries (2). A reliable tool for predicting the prognosis of cervical cancer is urgently needed.
Principal component analysis (PCA) is a multivariate statistical method for investigating the correlation between multiple variables. It reveals the internal structure of the variables through a few principal components while retaining as much information as possible about the original variables.
Non-coding RNAs (ncRNAs) can be divided into short ncRNAs (shorter than 200 nucleotides) and long ncRNAs (longer than 200 nucleotides). Long non-coding RNAs (lncRNAs) comprise long intergenic ncRNA (lincRNA), intronic lncRNA, antisense lncRNA, transcribed pseudogene lncRNA, and enhancer RNA (eRNA) (3). They play important roles in regulating the innate immune response (4) and are closely related to the cancer-immunity cycle, including antigen release, antigen presentation, immune cell differentiation, immune cell migration, T cell infiltration, and the recognition and killing of cancer cells (5). LncRNAs are abnormally expressed in a variety of malignant tumors and exert carcinogenic or anticancer roles.
Previous studies have found that lncRNAs are closely related to cervical cancer and play an important role in its occurrence, invasion, metastasis, and drug resistance.
Many researchers have already established models for prognosis prediction based on clinical parameters (including α-fetoprotein, alanine aminotransferase, body mass index, tumor-node-metastasis stage, tumor grade, tumor size, age, gender, vascular tumor cell, number of tumors, and so on) (6-8). Recently, several signature prognostic models based on clinicopathological data have shown great advantages in predicting the prognosis of cervical cancer (9-13). Some immune-associated lncRNA signatures have been reported to be related to the prognosis of cancer patients (14-17). The findings of Ye et al. demonstrated the value of lncRNAs in evaluating the immune infiltrate of the tumor: the immune-associated lncRNA signature could predict the prognosis of cervical cancer and contribute to decisions regarding the immunotherapeutic strategy (18). These results provide strong evidence for immunotherapy in the clinical setting.
The occurrence and development of cervical cancer are closely related to inflammation and infection. Epidemiological results show that more than 90% of cervical cancer patients are infected with human papillomavirus (HPV). The combined action of many factors destroys cervical epithelial cells and induces an imbalance of the immune response. Currently, the immune-associated lncRNA signature for predicting overall survival (OS) in cervical cancer is unknown. This study is the first report on the immune-associated lncRNA signature for predicting OS in cervical cancer. We present the following article in accordance with the REMARK reporting checklist (available at https://dx.doi.org/10.21037/tcr-21-2390).
Methods
Databases
Recent studies have demonstrated that lncRNAs play a critical role in the immune system and have become a focus of immunology. The occurrence, development, and prognosis of various tumors are affected by lncRNAs. In this study, the lncRNA expression profiles and clinical data of cervical cancer were acquired from The Cancer Genome Atlas (TCGA) database (https://portal.gdc.cancer.gov/). Immune-related genes were acquired from the gene sets immune system process (M13664) and immune response (M19817) via the Molecular Signatures Database v 7.0 (MSigDB, http://www.broadinstitute.org/gsea/msigdb/index.jsp) (19,20). The study was conducted in accordance with the Declaration of Helsinki (as revised in 2013). Patients whose survival time was less than 1 month and whose clinical characteristics were deficient were excluded. The details are displayed in Table 1.
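Table 1
The detailed clinical characteristics

| Characteristics | Category | Number of participants |
|---|---|---|
| Age | ≤50 | 152 |
| Age | >50 | 89 |
| Stage | I | 136 |
| Stage | II | 53 |
| Stage | III | 37 |
| Stage | IV | 15 |
| Status | Alive | 180 |
| Status | Dead | 61 |
| Grade | G1 | 16 |
| Grade | G2 | 124 |
| Grade | G3 | 100 |
| Grade | G4 | 1 |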
Identification and Cox regression analyses of immune-associated lncRNAs
Immune-associated lncRNAs were identified from co-expression networks of lncRNAs and immune-associated genes. Univariate Cox regression analysis was used to identify the latent immune-associated lncRNAs with prognostic value in cervical cancer, filtered by the Pearson correlation coefficient ≥0.40 and the P value ≤0.001 via R version 3.6.1 (https://cran.r-project.org/bin/windows/base/old/3.6.1/). Then, the differentially expressed immune-associated lncRNAs were merged with survival time and status via R (survival) package in order to perform multivariate Cox regression analysis (21,22). Finally, the selected immune-associated lncRNAs (P≤0.01) were filtered by multivariate Cox regression analysis according to the Akaike Information Criterion (AIC) (23).
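The co-expression filter described above can be illustrated with a minimal Python sketch. It is not the authors' R code: the function names `pearson` and `coexpressed_lncrnas`, the toy expression values, and the use of the signed (rather than absolute) correlation are assumptions made for illustration, and the P value cut-off (≤0.001) is omitted for brevity.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def coexpressed_lncrnas(lncrna_expr, gene_expr, r_min=0.40):
    """Return lncRNA IDs correlated (r >= r_min) with at least one immune gene."""
    selected = []
    for lnc_id, lnc_values in lncrna_expr.items():
        for gene_values in gene_expr.values():
            if pearson(lnc_values, gene_values) >= r_min:
                selected.append(lnc_id)
                break  # one qualifying immune gene is enough
    return selected

# Toy, made-up expression values across five samples (not TCGA data).
lncrna_expr = {
    "lncA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "lncB": [5.0, 1.0, 4.0, 2.0, 3.0],
}
gene_expr = {"geneX": [2.0, 4.1, 5.9, 8.2, 9.8]}

selected = coexpressed_lncrnas(lncrna_expr, gene_expr)
print(selected)
```

In this toy input only `lncA` rises together with `geneX`, so it is the only lncRNA retained.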
Construction of the signature
The least absolute shrinkage and selection operator (LASSO) Cox regression model was used to construct an immune-associated 10-lncRNA signature. The risk score was calculated according to the formula $\text{risk score} = \sum_{i=1}^{n} \beta_{\text{gene}[i]} \times \text{Expression}_{\text{gene}[i]}$, where $\beta_{\text{gene}[i]}$ is the regression coefficient of the $i$-th lncRNA and $\text{Expression}_{\text{gene}[i]}$ is its expression level. The risk score of the 10 immune-associated lncRNAs was calculated for each participant (19,24,25). The participants were divided into 2 groups (low-risk group and high-risk group) according to the risk score. Kaplan-Meier survival curves were plotted in R version 3.6.1 (survival and survminer packages). The concordance index (C-index) was used to evaluate the prediction accuracy of the signature in cervical cancer patients. The multi-index receiver operating characteristic (ROC) curve was generated with the survivalROC R package.
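As an illustration of the risk score calculation, the following Python sketch computes the weighted sum for a few patients and splits them into two groups. The coefficients, expression values, and patient IDs are hypothetical, and the median cut-off is an assumption: the text does not state which threshold was used.

```python
from statistics import median

def risk_score(coefs, expression):
    """Sum of coefficient x expression over the signature lncRNAs."""
    return sum(beta * expression[gene] for gene, beta in coefs.items())

# Hypothetical coefficients and expression values, for illustration only.
coefs = {"lnc1": 0.5, "lnc2": -0.4, "lnc3": 0.8}
patients = {
    "P1": {"lnc1": 2.0, "lnc2": 1.0, "lnc3": 0.5},
    "P2": {"lnc1": 0.5, "lnc2": 3.0, "lnc3": 0.1},
    "P3": {"lnc1": 1.0, "lnc2": 0.5, "lnc3": 2.0},
    "P4": {"lnc1": 0.2, "lnc2": 2.0, "lnc3": 0.3},
}

scores = {pid: risk_score(coefs, expr) for pid, expr in patients.items()}

# Assumed cut-off: the cohort median of the risk scores.
cutoff = median(scores.values())
groups = {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}

print(scores)
print(groups)
```

A positive coefficient raises the score as expression increases, whereas a negative coefficient lowers it, which is why lncRNAs with HR >1 push patients toward the high-risk group.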
Statistical analysis
PCA was performed using an R package. Gene Ontology (GO) terms and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways associated with immunologic biological processes and pathways were extracted from the top 20 most significant terms via Gene Set Enrichment Analysis (GSEA) (version 4.0.3), with normalized enrichment score (NES) ≥1, nominal P value ≤0.05, and false discovery rate (FDR) q-value ≤0.25.
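The GSEA thresholds above amount to a simple three-condition filter, sketched below in Python. The term names and values are made up, and the dictionary keys (`nes`, `p`, `fdr_q`) are hypothetical field names rather than the column names of the GSEA output.

```python
def passes_gsea_filter(term, nes_min=1.0, p_max=0.05, fdr_max=0.25):
    """Apply the NES, nominal P value, and FDR q-value thresholds."""
    return (
        term["nes"] >= nes_min
        and term["p"] <= p_max
        and term["fdr_q"] <= fdr_max
    )

# Made-up enrichment results, for illustration only.
terms = [
    {"name": "TERM_A", "nes": 1.8, "p": 0.001, "fdr_q": 0.05},
    {"name": "TERM_B", "nes": 0.9, "p": 0.010, "fdr_q": 0.10},  # NES too low
    {"name": "TERM_C", "nes": 1.4, "p": 0.040, "fdr_q": 0.30},  # FDR too high
]

kept = [t["name"] for t in terms if passes_gsea_filter(t)]
print(kept)
```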
Results
Identification of immune-associated lncRNAs in cervical cancer
A total of 14,142 lncRNAs were identified from TCGA database among 306 cervical cancer samples and 3 non-cancer samples. A total of 331 immune-associated genes were extracted from MSigDB. A total of 685 immune-associated lncRNAs (Pearson correlation coefficient ≥0.40 and P≤0.001) were identified from co-expression networks of lncRNAs and immune-associated genes.
Acquisition of 10 immune-associated lncRNAs in cervical cancer patients
The differentially expressed immune-associated lncRNAs were merged with patient ID, survival time, and status via the R survival package (P≤0.01). Univariate Cox regression analysis was used to identify the latent lncRNAs, and 22 immune-associated lncRNAs were obtained for the multivariate Cox regression analysis (Figure 1). Finally, 10 immune-associated lncRNAs were identified for predicting the OS in cervical cancer patients, as presented in Table 2. The Kaplan-Meier survival curves of AL021807.1, AL109976.1, LINC02446, and MIR4458HG were statistically significant (P<0.05) (Figure 2). The Kaplan-Meier survival curves of AC004540.2, AC009065.8, AC083809.1, AC055822.1, AP000904.1, and FBXL19-AS1 were not statistically significant (P>0.05, Table S1).
Figure 1
Univariate Cox regression analysis of immune-associated lncRNAs hazard ratio. lncRNAs, long non-coding RNAs.
Table 2
Multivariate Cox regression analysis of immune-associated lncRNAs hazard ratio

| ID | Coef | HR | HR.95L | HR.95H | P value |
|---|---|---|---|---|---|
| AC099343.2 | 0.648 | 1.912 | 0.842 | 4.346 | 0.122 |
| LINC02446 | −0.396 | 0.673 | 0.4952 | 0.9146 | 0.0112 |
| AC083809.1 | 0.833 | 2.300 | 1.6162 | 3.272 | 0.000 |
| LINC01315 | −0.294 | 0.745 | 0.541 | 1.027 | 0.072 |
| MIR4458HG | −0.854 | 0.426 | 0.282 | 0.643 | 0.000 |
| AC090948.3 | −0.945 | 0.389 | 0.167 | 0.902 | 0.028 |
| AC116312.1 | 0.739 | 2.093 | 1.531 | 2.861 | 0.000 |
| AC004540.2 | 0.585 | 1.795 | 1.354 | 2.379 | 0.000 |
| AL021807.1 | −0.587 | 0.556 | 0.413 | 0.750 | 0.000 |
| AL109976.1 | −0.558 | 0.572 | 0.279 | 1.174 | 0.128 |
lncRNAs, long non-coding RNAs; Coef, regression coefficient; HR, hazard ratio; HR.95L, lower limit of the 95% CI of HR; HR.95H, upper limit of the 95% CI of HR; CI, confidence interval.
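In a Cox model, the hazard ratio is the exponential of the coefficient, HR = exp(Coef), so the Coef and HR columns of Table 2 can be checked against each other. The short Python sketch below does this for four rows copied from the table. The helper name `hazard_ratio` is introduced here for illustration only.

```python
import math

# (Coef, HR) pairs copied from Table 2.
table2 = {
    "AC099343.2": (0.648, 1.912),
    "LINC02446": (-0.396, 0.673),
    "AC083809.1": (0.833, 2.300),
    "MIR4458HG": (-0.854, 0.426),
}

def hazard_ratio(coef):
    """Hazard ratio implied by a Cox regression coefficient."""
    return math.exp(coef)

for lnc, (coef, reported_hr) in table2.items():
    computed = hazard_ratio(coef)
    print(f"{lnc}: exp({coef}) = {computed:.3f} (reported HR {reported_hr})")
```

A negative coefficient therefore corresponds to HR <1 (a protective association) and a positive coefficient to HR >1 (a harmful association).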
Figure 2
Kaplan-Meier survival curves of 4 immune-associated lncRNAs. lncRNAs, long non-coding RNAs.
Construction of the immune-associated 10-lncRNA signature in cervical cancer patients
According to the risk score of the LASSO Cox regression model, the participants were divided into 2 groups (low-risk group and high-risk group). There were 120 cervical cancer patients in the high-risk group and 121 in the low-risk group. The difference between the Kaplan-Meier survival curves of the two groups was statistically significant (P=1.134e-10), and the 5-year survival probability was 0.444 in the high-risk group [95% confidence interval (CI): 0.334 to 0.590] and 0.884 in the low-risk group (95% CI: 0.807 to 0.969) (Figure 3). The distribution of the risk score, the survival status with increasing risk score, and the expression of the 10 immune-associated lncRNAs in cervical cancer patients are displayed in Figure 4.
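The survival probabilities reported above come from Kaplan-Meier estimation. A minimal Python sketch of the product-limit estimator is given below to show how such a value is obtained. The follow-up data are made up (not the TCGA cohort), and the function names `kaplan_meier` and `survival_at` are illustrative. The study itself used the R survival and survminer packages.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct death time.

    times  -- follow-up time of each patient
    events -- 1 if death was observed at that time, 0 if censored
    """
    survival = 1.0
    curve = []
    death_times = sorted({t for t, e in zip(times, events) if e == 1})
    for t in death_times:
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        survival *= 1 - deaths / at_risk
        curve.append((t, survival))
    return curve

def survival_at(curve, t):
    """Step-function lookup of the survival probability at time t."""
    prob = 1.0
    for death_time, s in curve:
        if death_time <= t:
            prob = s
    return prob

# Made-up follow-up times in years and event indicators.
times = [1, 2, 2, 3, 4, 5, 6, 6]
events = [1, 1, 0, 1, 0, 1, 0, 0]

curve = kaplan_meier(times, events)
print(survival_at(curve, 5))
```

Censored patients contribute to the number at risk until their last follow-up but never count as deaths, which is what distinguishes this estimate from a simple proportion of survivors.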
Figure 3
Kaplan-Meier survival curves of the signature.
Figure 4
The evaluation of the immune-associated ten-lncRNA signature. (A) The expression heatmap of ten immune-associated lncRNAs; (B) The distributions of cervical cancer patients with increasing risk score of ten immune-associated lncRNAs; (C) Survival time and status of cervical cancer patients. lncRNAs, long non-coding RNAs.
Analysis of independent risk factors with clinical characteristics
A total of 241 cervical cancer patients with clinical characteristics (age, grade, and clinical stage) were included in the analysis of independent risk factors. In univariate and multivariate Cox regression analyses with these clinical characteristics, the immune-associated 10-lncRNA signature was shown to be an independent prognostic factor for predicting OS in cervical cancer patients (Figure 5). The multi-index ROC curves are shown in Figure 6. The area under the curve (AUC) of the ROC curve of the signature was 0.833. The C-index was 0.788 (95% CI: 0.730 to 0.846, P=1.884778e-22).
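The C-index measures how often a model ranks patients correctly: among comparable pairs, it is the fraction in which the patient who died earlier has the higher risk score. The Python sketch below implements Harrell's version on made-up data. The function name `concordance_index` and the toy values are assumptions for illustration and do not reproduce the study's calculation.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index: share of comparable pairs that are ranked correctly."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when patient i died before patient j's time.
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # ties in risk count as half
    return concordant / comparable

# Made-up survival times, event indicators (1 = death), and risk scores.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
risk = [0.9, 0.3, 0.5, 0.1]

c_index = concordance_index(times, events, risk)
print(c_index)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so the reported 0.788 lies between the two.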
Figure 5
Cox regression analyses with clinical characteristics. (A) Univariate Cox regression analyses with clinical characteristics (B) Multivariate Cox regression analyses with clinical characteristics.
Figure 6
The multi-index ROC curves for signature and clinical characteristics. ROC, receiver operating characteristic.
Functional analyses for the high-risk and low-risk groups
We used PCA to compare the distributions of the low-risk and high-risk groups based on all expressed genes, immune genes, immune-associated lncRNAs, and the 10 immune-associated lncRNAs of the signature (Figure 7). The GO terms and KEGG pathways that were significantly enriched were up-regulated in the low-risk group (Figure 8).
Figure 7
Different status was shown among low-risk groups and high-risk groups. (A) PCA among low-risk groups and high-risk groups based on all genes. (B) PCA among low-risk groups and high-risk groups based on all the immune genes. (C) PCA among low-risk groups and high-risk groups based on immune-associated lncRNAs. (D) PCA among low-risk groups and high-risk groups based on risk genes. PCA, principal component analysis; lncRNAs, long non-coding RNAs.
Figure 8
GSEA functional analyses of the signature-related PCGs associated with immunity. GSEA, Gene Set Enrichment Analysis; PCGs, protein coding genes.
Discussion
Immune status is closely associated with human health, especially in patients with cancer. The immune system plays an important role in cancer immunoediting (26). The immune response is characterized by its durable and adaptive features and by its capacity for self-propagation after activation (27). It has been reported that lncRNAs can control the differentiation of human monocytes into dendritic cells (DCs) (28). As the primary antigen-presenting cells (APCs) for T cells, DCs connect the innate and adaptive arms of the immune system (29). The lncRNAs acquired from CD8(+) T cells perform important roles in adaptive immunity (30), and lncRNAs have been defined as fine-tuners and key drivers in T cell differentiation (31).
In recent years, the identification of immune-related lncRNAs and the establishment of prognostic models based on them have provided some understanding of the immune status of patients with cancer. Seven immune-related lncRNAs were shown to be independent prognostic risk factors in low-grade glioma (16). The lncRNA OSTN-AS1 was shown to be a potential immune-related prognostic marker for triple-negative breast cancer (32). Immune-related lncRNAs have been associated with prognostication in kidney renal clear cell carcinoma (33). An immune-related 6-lncRNA signature was verified as an independent prognostic factor for glioblastoma multiforme (14). A 9-immune-related lncRNA signature was closely associated with the immune status of pancreatic cancer (17). A 9-immune-related lncRNA signature was correlated with prognostication in anaplastic gliomas based on different immune status (34).
In this study, a total of 685 statistically significant immune-associated lncRNAs were identified from co-expression networks of lncRNAs and immune-associated genes.
Through univariate Cox regression analysis, we identified 22 immune-related lncRNAs with potential prognostic value, among which, according to the hazard ratio (HR), 6 were high-risk immune-related lncRNAs (HR >1) and 16 were low-risk immune-related lncRNAs (HR <1). Finally, through Cox regression analyses and the LASSO Cox regression model, 10 immune-related lncRNAs (AC004540.2, AC083809.1, AP000904.1, FBXL19-AS1, LINC02446, AL109976.1, MIR4458HG, AC009065.8, AL021807.1, and AC055822.1) were found to have latent prognostic value in cervical cancer patients. Among these 10 immune-associated lncRNAs, only FBXL19-AS1 has previously been reported to be associated with cancer. The lncRNA FBXL19-AS1 is a potential oncogenic lncRNA and has been shown to act as a molecular sponge in colorectal cancer (35), osteosarcoma (36), breast cancer (37), and lung cancer (38,39).
Cervical cancer patients were divided into a high-risk group and a low-risk group according to the HR of the immune-related lncRNAs. The HR of some immune-related lncRNAs was larger than 1 (AC004540.2, AC083809.1, AP000904.1, and FBXL19-AS1), which meant that these lncRNAs may be harmful to the OS of cervical cancer patients. The HR of the others was less than 1 (LINC02446, AL109976.1, MIR4458HG, AC009065.8, AL021807.1, and AC055822.1), which meant that these lncRNAs may be protective factors for the OS of cervical cancer patients. The Kaplan-Meier survival curves showed that AL021807.1, AL109976.1, LINC02446, and MIR4458HG were associated with the OS of cervical cancer patients. On the Kaplan-Meier survival curves of the signature, the 5-year survival probability was 0.444 in the high-risk group (95% CI: 0.334 to 0.590) and 0.884 in the low-risk group (95% CI: 0.807 to 0.969), which meant that the OS of cervical cancer patients was better in the low-risk group than in the high-risk group.
The multi-index ROC curves showed that the immune-associated 10-lncRNA signature prognostic model could efficiently stratify the OS of cervical cancer patients as an independent factor, compared with other clinical characteristics. The area under the curve (AUC) of the prognostic model for 5-year survival was 0.833, which, together with the C-index (0.788, 95% CI: 0.730 to 0.846, P=1.884778e-22), indicated the good accuracy of the signature prognostic model. The PCA results showed that AC004540.2, AC083809.1, AP000904.1, and FBXL19-AS1 promote cervical cancer development, while LINC02446, AL109976.1, MIR4458HG, AC009065.8, AL021807.1, and AC055822.1 have a protective effect on cervical cancer. The signature could adequately distinguish the high-risk patients from the low-risk patients.
To discover the potential functions of the signature, co-expression networks of the signature-related protein coding genes (PCGs) were analyzed by GSEA. Surprisingly, we discovered that 5 of the top 20 GO terms and 3 of the top 20 KEGG pathways were up-regulated in the low-risk group of cervical cancer patients. These terms and pathways were directly or indirectly associated with immunity. All these enriched gene sets indicated that the signature-related PCGs may participate in immunologic biological processes and pathways. Patients in the high-risk group and the low-risk group had clearly different OS: the OS of the high-risk group was poorer than that of the low-risk group. This difference might be attributable to the enriched immunologic biological processes and pathways.
Conclusions
In summary, we identified an immune-associated 10-lncRNA signature prognostic model containing 10 lncRNAs (AC004540.2, AC083809.1, AP000904.1, FBXL19-AS1, LINC02446, AL109976.1, MIR4458HG, AC009065.8, AL021807.1, and AC055822.1). The signature is a dependable and measurable method to predict the prognosis of cervical cancer patients acting as an independent factor, and provides strong evidence for individualized immunotherapy for cervical cancer patients.