Lingyu Guo1,2, Tian An3, Zhixin Huang1,2, Ziyan Wan1,2, Tie Chong2. 1. Department of Medicine, Xi'an Jiaotong University, Xi'an, China. 2. Department of Urology, The Second Affiliated Hospital of Xi'an Jiaotong University, Xi'an, China. 3. Department of Dermatology and Plastic Surgery, The Second Affiliated Hospital of Shaanxi University of Traditional Chinese Medicine, Xianyang, China.
Abstract
Background: Clear cell renal cell carcinoma (ccRCC) is a common malignant tumor worldwide. Effective diagnostic and therapeutic targets for the recurrence and metastasis of ccRCC are still lacking, and this study sought to identify such targets. Methods: Gene Expression Omnibus (GEO) datasets were used to obtain differentially expressed genes (DEGs) between primary and metastatic ccRCC. We used The Cancer Genome Atlas (TCGA), GeneMANIA, cBioPortal, MethSurv, and TIMER to analyze the expression differences, mutation status, prognostic value, molecular function, and immune infiltration of hub genes in renal cell carcinoma (RCC). Results: We obtained a total of 35 DEGs shared by the two datasets. Six collagen family members were identified as hub genes. The expression level of collagen family members was closely related to ccRCC. Moreover, differences in the expression levels of collagen family members were closely related to the stage and prognosis of ccRCC. Members of the collagen family were responsible for more than 15% of the genetic alterations in ccRCC and are involved in multiple signaling pathways. The expression level of collagen family members was closely related to the infiltration of tumor-associated immune cells. Univariate and multivariate Cox regression identified COL5A1 as a prognosis-related gene. Conclusions: Our study implies that members of the collagen family may serve as biomarkers for ccRCC metastasis and prognosis. 2022 Translational Cancer Research. All rights reserved.
Renal cell carcinoma (RCC) is one of the most common malignancies of the urinary system worldwide, accounting for ~2–3% of all malignant tumors (1). Thanks to advances in diagnostic technology, many renal cancers are now diagnosed at an early stage, but many patients with RCC still have a poor prognosis because of distant metastasis and other factors. Recurrence and metastasis are the leading causes of death in patients with RCC, yet accurate and effective targets for them are still lacking (2). Renal carcinoma is morphologically highly heterogeneous and is divided into 16 histological subtypes in the 2016 World Health Organization (WHO) classification of tumors (3). The most common pathological types are clear cell renal cell carcinoma (ccRCC), papillary RCC, and chromophobe RCC, with ccRCC accounting for about 70–75%. Because renal cancer has no typical clinical symptoms or specific diagnostic markers in its early stage, 20–30% of patients already have distant metastasis or advanced disease at initial diagnosis. Metastatic RCC responds poorly to existing treatments such as radiotherapy, chemotherapy, and interferon (IFN) immunotherapy (4). Molecular targeted therapy is one of the main treatment strategies for metastatic RCC; the most common agents include the mammalian target of rapamycin (mTOR) inhibitor sirolimus, the tyrosine kinase inhibitor sunitinib, and the vascular endothelial growth factor (VEGF) inhibitor bevacizumab (5). However, most patients develop drug resistance 6–15 months after starting targeted therapy, resulting in a 5-year survival rate of ≤10% in patients with metastatic ccRCC (6). Therefore, finding new approaches to the diagnosis and treatment of RCC is an urgent clinical problem.
Collagen is widely distributed in human tissues. Its 28 types are encoded by different genes, are located in specific tissues, and perform a variety of biological functions (7). Previous studies have shown that members of the collagen family participate in regulating the growth and migration of cancer cells. The COL1A1 expression level can be used to predict prognosis and immunotherapy response in patients with gastric cancer (8). COL4A1 is an active oncogene in glioma and is associated with tumor stage and prognosis (9). COL6A3 polymorphisms are associated with lung cancer risk (10). COL10A1 can promote the proliferation and migration of breast cancer cells in vitro (11). DNA methylation regulates gene transcription and translation, and the methylation level of many genes is closely related to cancer progression. The relationship between collagen gene methylation and cancer has not been elucidated. In addition, the level of tumor immune cell infiltration significantly affects cancer progression and has attracted widespread attention (12). Collagens not only directly regulate the proliferation and metastasis of tumor cells but also affect the function of tumor-associated immune cells such as tumor-associated macrophages and T cells, suggesting that collagens play an important role in tumor immunity and could serve as targets for tumor immunotherapy (13). Studies have shown that collagen changes in melanoma can affect the motility of immune cells and thereby tumor progression (14). In vitro studies have confirmed that collagens can affect the motility of T cells and regulate the proportion of CD4 and CD8 T cells (15). Because ccRCC is highly heterogeneous, the prognosis of patients with ccRCC varies greatly. Some immune-related genes have been found to be related to the prognosis of patients with ccRCC (16) and can improve the accuracy of existing prognostic methods such as the TNM staging system (17). However, the relationship between collagen and immune cell infiltration in ccRCC has not been studied systematically.
In this study, we used a series of bioinformatics methods to explore the role of collagen in ccRCC metastasis. First, we analyzed Gene Expression Omnibus (GEO) datasets to find the collagen genes that are differentially expressed during ccRCC metastasis. We then assessed the relationship between collagen gene expression and ccRCC stage and prognosis. Finally, we explored the methylation levels of collagen genes in ccRCC and their relationship with tumor immune infiltration. We believe that this study will contribute to a clearer understanding of the role of the collagen gene family in ccRCC metastasis and provide a basis for screening prognostic markers and therapeutic targets. We present the following article in accordance with the STREGA reporting checklist (available at https://tcr.amegroups.com/article/view/10.21037/tcr-22-398/rc).
Methods
Differentially expressed genes (DEGs) screening
We selected two gene expression datasets containing ccRCC metastasis samples, GSE22541 and GSE105261, from the GEO database. GSE22541, from the GPL570 Affymetrix Human Genome U133 Plus 2.0 Array, includes 24 primary ccRCC tissues and 44 pulmonary metastases of ccRCC. GSE105261, from the GPL10558 Illumina HumanHT-12 V4.0 expression bead chip, includes 9 normal, 9 primary ccRCC, and 26 metastatic ccRCC tissues. The GEO2R tool was used for data analysis, with thresholds of |logFC| ≥1 and adjusted P<0.05.
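The screening rule above can be sketched in a few lines. This is a minimal illustration in Python rather than the GEO2R workflow itself; the `ProbeResult` type, the `filter_degs` helper, and the gene names and values are hypothetical and are not taken from GSE22541 or GSE105261.

```python
from typing import Dict, Iterable, List, NamedTuple


class ProbeResult(NamedTuple):
    gene: str
    log_fc: float  # log fold change, metastatic vs primary
    adj_p: float   # multiple-testing-adjusted P value


def filter_degs(results: Iterable[ProbeResult],
                min_abs_log_fc: float = 1.0,
                max_adj_p: float = 0.05) -> Dict[str, List[str]]:
    """Keep genes with |logFC| >= cutoff and adjusted P < cutoff, split by direction."""
    degs: Dict[str, List[str]] = {"up": [], "down": []}
    for r in results:
        if r.adj_p < max_adj_p and abs(r.log_fc) >= min_abs_log_fc:
            degs["up" if r.log_fc > 0 else "down"].append(r.gene)
    return degs


# Hypothetical values for illustration only.
toy = [
    ProbeResult("GENE_A", 2.3, 0.001),
    ProbeResult("GENE_B", -1.4, 0.020),
    ProbeResult("GENE_C", 0.8, 0.001),  # fails the fold-change cutoff
    ProbeResult("GENE_D", 1.9, 0.200),  # fails the adjusted-P cutoff
]
degs = filter_degs(toy)
print(degs)
```

Applying both cutoffs together keeps only genes whose change is both large and statistically supported.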
PPI network analysis
GeneMANIA (http://www.genemania.org) uses extensive genomic and proteomic data to find genes with similar functions (18). We used this website to predict protein interactions and to analyze the pathways of the common DEGs. CytoHubba is a Cytoscape plug-in for identifying hub nodes. We used it to search the DEG interaction network obtained above for hub genes.
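To make the hub-gene ranking more concrete, the following sketch computes maximal clique centrality (MCC), the CytoHubba ranking method reported in Table 1. It assumes the commonly cited definition, in which a node's score is the sum of (|C| − 1)! over every maximal clique C that contains it. The Bron-Kerbosch enumeration, the function names, and the toy graph with nodes A to D are illustrative assumptions, not the plug-in's implementation or the study's network.

```python
from math import factorial
from typing import Dict, List, Set


def maximal_cliques(adj: Dict[str, Set[str]]) -> List[Set[str]]:
    """Enumerate maximal cliques with the basic Bron-Kerbosch recursion."""
    cliques: List[Set[str]] = []

    def expand(r: Set[str], p: Set[str], x: Set[str]) -> None:
        if not p and not x:
            cliques.append(r)  # r cannot be extended, so it is maximal
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return cliques


def mcc_scores(adj: Dict[str, Set[str]]) -> Dict[str, int]:
    """Assumed MCC definition: sum of (|C| - 1)! over maximal cliques containing the node."""
    scores = {v: 0 for v in adj}
    for clique in maximal_cliques(adj):
        weight = factorial(len(clique) - 1)
        for v in clique:
            scores[v] += weight
    return scores


# Hypothetical toy network: a triangle A-B-C plus a pendant node D attached to A.
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("A", "D")]
adj: Dict[str, Set[str]] = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

scores = mcc_scores(adj)
print(sorted(scores.items(), key=lambda kv: -kv[1]))
```

Under this definition the weight grows factorially with clique size (a node in a single 10-member clique would already score 9! = 362,880), which may explain why scores in densely connected networks reach the millions.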
Gene enrichment analysis
The Database for Annotation, Visualization, and Integrated Discovery (DAVID) v6.8 (https://david.ncifcrf.gov/) can associate genes from the input list with biological annotations (19). We used the DAVID website to conduct enrichment analysis of gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways for DEGs.
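As an illustration of what an over-representation test measures, the sketch below computes a plain one-sided hypergeometric P value for a single annotation term. This is an assumption-laden simplification: DAVID reports its own, more conservative variant of the Fisher exact test, and the function name, background size, and counts in the example are hypothetical.

```python
from math import comb


def enrichment_p(k: int, n: int, K: int, N: int) -> float:
    """P(X >= k) when n genes are drawn from N background genes, K of which carry the term."""
    upper = min(n, K)
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


# Hypothetical example: 6 of 35 DEGs carry a term annotated to 200 of 20,000 background genes.
p_value = enrichment_p(k=6, n=35, K=200, N=20000)
print(f"{p_value:.3e}")
```

A small P value indicates that the term appears among the DEGs more often than random sampling from the background would predict.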
Gene expression analysis
Oncomine (https://www.oncomine.org) is a cancer microarray database and integrated data-mining platform (20). We compared the mRNA expression of collagen family members between renal cancer tissues and normal tissues, with screening thresholds of P<0.0001, |logFC| ≥2, and a gene rank in the top 10%. The Cancer Genome Atlas (TCGA) database includes gene expression, miRNA expression, methylation, mutation, and copy number data for 33 tumor types. We used TCGA-KIRC data to verify the expression levels of collagen family genes in ccRCC tissues.
Mutation analysis
The cBioPortal for Cancer Genomics (https://www.cbioportal.org/) provides a visual tool for research and analysis of cancer genetic data (21). cBioPortal helps users explore genetics, epigenetics, gene expression, and proteomics using molecular data derived from cancer tissue and cytology studies. In this study, the tool was used to analyze mutations of collagen family genes.
Identification of differentially expressed and prognosis-related collagens
The R survival package was used to perform survival analysis of TCGA data and to plot Kaplan-Meier (KM) curves. We then performed univariate Cox regression analysis of each collagen family gene against ccRCC overall survival (OS). Genes that were statistically significant in the univariate analysis were entered into a multivariate Cox regression analysis, and genes that were significant in both analyses were considered candidates significantly correlated with the prognosis of ccRCC.
Gene set enrichment analysis (GSEA)
LinkedOmics database (http://www.linkedomics.org/) contains multiple omics and clinical data for 32 cancer types (22). We selected ccRCC as the tumor type in the database website and screened genes related to collagen family genes based on Pearson correlation analysis by using the LinkFinder function of the website. Then, the LinkInterpreter functional module of the website was used to conduct GO and KEGG gene enrichment analysis.
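The correlation screen described above rests on the Pearson coefficient. The sketch below shows that calculation and a ranking of genes by their correlation with a target gene. It does not use LinkedOmics itself; the function names, gene labels, and expression values are hypothetical.

```python
from math import sqrt
from typing import Dict, List, Sequence, Tuple


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two samples of equal length >= 2")
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / sqrt(sxx * syy)


def rank_by_correlation(target: Sequence[float],
                        others: Dict[str, Sequence[float]]) -> List[Tuple[str, float]]:
    """Sort genes from most positively to most negatively correlated with the target."""
    scored = [(gene, pearson_r(target, values)) for gene, values in others.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical expression values across four samples.
target = [1.0, 2.0, 3.0, 4.0]
others = {
    "GENE_X": [2.0, 4.0, 6.0, 8.0],
    "GENE_Y": [8.0, 6.0, 4.0, 2.0],
    "GENE_Z": [1.0, 3.0, 2.0, 4.0],
}
ranked = rank_by_correlation(target, others)
print(ranked)
```

In practice each coefficient would also need a P value and a false discovery rate correction before genes are called positively or negatively correlated.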
DNA methylation analysis
MethSurv (https://biit.cs.ut.ee/methsurv/) is a web-based tool for survival analysis based on cytosine-phosphate-guanine (CpG) methylation patterns (23). We used TCGA methylation data contained in MethSurv to perform survival analysis of CpGs located near collagen family genes.
Immune infiltration and drug response analysis
The TIMER website (https://cistrome.shinyapps.io/timer/) provides a comprehensive and systematic analysis of immune infiltration across different cancer types (24). We first estimated the relationship between collagen family members’ expression levels and tumor purity, and between their expression levels and the infiltration of tumor-associated immune cells including B cells, CD4+ T cells, CD8+ T cells, neutrophils, macrophages, and dendritic cells. Subsequently, we used the SCNA module of the website to explore the correlation between tumor immune cell infiltration and gene copy number. GSCALite (http://bioinfo.life.hust.edu.cn/web/GSCALite/) is a tumor genome analysis platform that integrates genomic data for 33 tumor types from TCGA, drug response data from GDSC and CTRP, and normal tissue data from GTEx in a unified analysis workflow, and it offers a variety of analysis modules including methylation analysis and drug sensitivity analysis (25). In the current study, GSCALite was used to analyze the correlation between collagen family expression and drug sensitivity based on GDSC data.
Statistical analysis
Statistical analysis was carried out in R software (V4.0.2). We performed Cox regression analysis of collagen family gene expression against OS and obtained hazard ratios (HRs) and 95% confidence intervals (CIs). Results were considered statistically significant if the P value was less than 0.05.
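A Cox model estimates a log hazard ratio β with a standard error SE; the HR is exp(β), and a Wald 95% CI is exp(β ± 1.96 × SE). The sketch below inverts that relationship to recover an approximate Wald P value from a reported HR and CI, using the univariate values for COL1A2 and COL5A1 from Table 2 of this article. It assumes the reported intervals are Wald intervals, which the source does not state, and the function name is hypothetical. Because the published values are rounded, the recovered P values are only approximate.

```python
from math import erfc, log, sqrt
from typing import Tuple

Z_975 = 1.959964  # standard normal quantile for a two-sided 95% interval


def wald_from_hr(hr: float, ci_low: float, ci_high: float) -> Tuple[float, float, float, float]:
    """Return (beta, SE, z, two-sided P) implied by a hazard ratio and its 95% CI."""
    beta = log(hr)
    se = (log(ci_high) - log(ci_low)) / (2 * Z_975)
    z = beta / se
    p = erfc(abs(z) / sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return beta, se, z, p


# Univariate rows from Table 2: HR (95% CI).
col1a2 = wald_from_hr(1.109, 1.002, 1.227)
col5a1 = wald_from_hr(1.233, 1.111, 1.369)
print(f"COL1A2: z={col1a2[2]:.2f}, P={col1a2[3]:.3f}")
print(f"COL5A1: z={col5a1[2]:.2f}, P={col5a1[3]:.2e}")
```

The recovered values can be compared with the P values in Table 2 as a quick consistency check on the reported intervals.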
Ethical statement
The study was conducted in accordance with the Declaration of Helsinki (as revised in 2013).
Results
Identification of DEGs in ccRCC
A total of 375 up-regulated genes and 226 down-regulated genes were found in GSE22541. A total of 80 up-regulated genes and 35 down-regulated genes were found in GSE105261 (). Then, we obtained 29 up-regulated genes and 6 down-regulated genes in both data sets by Venn diagram (). Based on these lists of DEGs, we performed PPI network analysis (). DEGs are mainly involved in biological functions including extracellular matrix (ECM) structural constituent, collagen trimer, etc. Then, we applied the CytoHubba plug-in to obtain hub genes. The results showed that the top ten hub genes include COL3A1, COL1A1, COL5A2, COL1A2, POSTN, COL6A3, COL5A1, LUM, DCN, and THBS2 (). There were 6 genes in the collagen family. This result suggests that the collagen family plays a key role in the process of kidney cancer metastasis.
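The Venn step amounts to intersecting the up-regulated lists and the down-regulated lists of the two datasets separately, so a gene counts as common only if it changes in the same direction in both. A minimal sketch, with a hypothetical helper name and placeholder gene symbols:

```python
from typing import Iterable, Set, Tuple


def common_degs(up_a: Iterable[str], down_a: Iterable[str],
                up_b: Iterable[str], down_b: Iterable[str]) -> Tuple[Set[str], Set[str]]:
    """Return genes up-regulated in both datasets and genes down-regulated in both."""
    return set(up_a) & set(up_b), set(down_a) & set(down_b)


# Hypothetical placeholder symbols, not the study's gene lists.
up_1, down_1 = {"G1", "G2", "G3", "G4"}, {"G5", "G6"}
up_2, down_2 = {"G2", "G3", "G7"}, {"G6", "G1"}

common_up, common_down = common_degs(up_1, down_1, up_2, down_2)
print(sorted(common_up), sorted(common_down))
```

In this toy example G1 is excluded because it is up-regulated in one dataset and down-regulated in the other. In the study, the same operation yields 29 + 6 = 35 common DEGs.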
Figure 1
DEGs were identified from two gene expression profiles. (A,B) Volcano plots of upregulated (red) and downregulated (blue) DEGs between metastatic ccRCC samples and primary tumor samples in GSE22541 (A) and GSE105261 (B). (C,D) Venn diagram of upregulated and downregulated DEGs. (E) Protein-protein interaction of DEGs (GeneMANIA). DEGs, differentially expressed genes; ccRCC, clear cell renal cell carcinoma.
Table 1
Top 10 in network ranked by MCC method
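| Rank | Name | Score |
| --- | --- | --- |
| 1 | COL3A1 | 1864806 |
| 2 | COL1A1 | 1864802 |
| 3 | COL5A2 | 1864800 |
| 3 | COL1A2 | 1864800 |
| 3 | POSTN | 1864800 |
| 6 | COL6A3 | 1859760 |
| 7 | COL5A1 | 1854720 |
| 8 | LUM | 1819440 |
| 9 | DCN | 1088641 |
| 10 | THBS2 | 771120 |

MCC, maximal clique centrality.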
Next, we conducted gene enrichment analysis of the DEGs to understand their biological functions. The results showed that the DEGs mainly affected ECM organization, collagen catabolic process, collagen fibril organization, and ECM structural constituent. The main pathways involving the DEGs were ECM-receptor interaction, protein digestion and absorption, platelet activation, focal adhesion, amoebiasis, the PI3K/Akt signaling pathway, and beta-alanine metabolism ().
We obtained the mRNA expression of 6 collagen family members in renal carcinoma and normal tissues through the Oncomine database. The collagen family members we screened showed elevated expression levels in various tumor tissues, as well as in renal cancer tissues. These results suggest that members of the collagen family may play a role in cancer progression. Results showed that compared to normal tissues, the expression levels of COL1A2, COL3A1, and COL5A1 were elevated in more kidney cancer datasets, while COL6A3 was decreased (). Then we combined the data in the TCGA database, and the results were consistent with the previous results. The results showed that COL1A1, COL1A2, COL3A1, COL5A1, COL5A2, and COL6A3 were highly expressed in renal tumor samples ().
Figure 3
The mRNA expression of collagen family genes (ONCOMINE). The numbers in the figure represent the number of datasets with significant differences in gene expression, red representing up-regulated genes and blue representing down-regulated genes. CNS, central nervous system.
Figure 4
The expression of collagen family members in TCGA KIRC database. (A) COL1A1; (B) COL1A2; (C) COL3A1; (D) COL5A1; (E) COL5A2; (F) COL6A3. ***, P<0.001. TCGA, the Cancer Genome Atlas; KIRC, kidney renal clear cell carcinoma.
Genetic mutation analysis of collagen expression in ccRCC
Analysis of the ccRCC data in cBioPortal showed that COL3A1 and COL5A2 had the highest mutation rate among the collagen family members, at 8% (Figure S1A). We also examined mutations of collagen family members in different types of renal cancer, and the results showed that high mutation levels were prevalent across these types (Figure S1B-S1H). Altered expression of collagen family genes was also common in renal cancer, suggesting that mutations and altered expression of collagen family members play a role in ccRCC.
Survival analysis of collagen expression in ccRCC
We used RNA-seq data from the TCGA KIRC database for survival analysis. Patients were divided into high- and low-expression groups at the median expression value of each gene. The results showed that elevated expression of most collagen family members was associated with shorter survival. High expression of COL1A1, COL5A1, and COL6A3 was significantly associated with the OS of ccRCC (log-rank P<0.05) (), and high expression of COL1A1, COL1A2, COL5A1, and COL6A3 was significantly associated with the disease-specific survival (DSS) of ccRCC (log-rank P<0.05) (Figure S2A-S2F). These results suggest that collagen family members play an important role in the progression of ccRCC, affect the survival of patients with ccRCC, and may serve as prognostic markers of ccRCC.
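The survival comparison described above (median split, Kaplan-Meier estimate, log-rank test) can be sketched without external libraries. The study used the R survival package; the pure-Python functions below are an illustrative re-implementation under standard textbook definitions, and the follow-up times, event flags, and expression values are hypothetical.

```python
from math import erfc, sqrt
from statistics import median
from typing import List, Sequence, Tuple


def split_by_median(expr: Sequence[float]) -> List[bool]:
    """True for samples whose expression is above the cohort median."""
    cut = median(expr)
    return [e > cut for e in expr]


def kaplan_meier(times: Sequence[float], events: Sequence[int]) -> List[Tuple[float, float]]:
    """Kaplan-Meier survival estimate at each distinct event time."""
    curve, surv = [], 1.0
    for t in sorted({t for t, e in zip(times, events) if e}):
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if u == t and e)
        surv *= 1 - deaths / at_risk
        curve.append((t, surv))
    return curve


def logrank(times: Sequence[float], events: Sequence[int],
            high: Sequence[bool]) -> Tuple[float, float]:
    """Two-group log-rank test; returns (chi-square with 1 df, P value)."""
    o_minus_e, var = 0.0, 0.0
    for t in sorted({t for t, e in zip(times, events) if e}):
        n = sum(1 for u in times if u >= t)
        n1 = sum(1 for u, h in zip(times, high) if u >= t and h)
        d = sum(1 for u, e in zip(times, events) if u == t and e)
        d1 = sum(1 for u, e, h in zip(times, events, high) if u == t and e and h)
        o_minus_e += d1 - d * n1 / n  # observed minus expected events in the high group
        if n > 1:
            var += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    if var == 0:
        return 0.0, 1.0
    chi2 = o_minus_e ** 2 / var
    return chi2, erfc(sqrt(chi2 / 2))  # upper tail of chi-square with 1 df


# Hypothetical cohort: follow-up time, event flag (1 = death), gene expression.
times = [5, 8, 12, 20, 25, 30]
events = [1, 1, 1, 0, 1, 0]
expr = [9.1, 8.7, 8.9, 4.2, 5.0, 3.8]

high = split_by_median(expr)
curve = kaplan_meier(times, events)
chi2, p = logrank(times, events, high)
print(curve, chi2, p)
```

Splitting at the median gives two groups of similar size, which keeps the log-rank comparison balanced, at the cost of ignoring how far each patient's expression lies from the cutoff.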
Figure 5
The prognostic value of collagen family in ccRCC (KM plotter). (A) COL1A1; (B) COL1A2; (C) COL3A1; (D) COL5A1; (E) COL5A2; (F) COL6A3. HR, hazard ratio; ccRCC, clear cell renal cell carcinoma; KM, Kaplan-Meier.
Subsequently, univariate and multivariate Cox regression analyses were conducted. Univariate analysis showed that COL1A1 (HR: 1.161; P<0.001), COL1A2 (HR: 1.109; P<0.05), COL5A1 (HR: 1.233; P<0.001), and COL6A3 (HR: 1.191; P<0.001) were correlated with ccRCC OS, while multivariate analysis showed that COL1A2 (HR: 0.389; P<0.001) and COL5A1 (HR: 2.308; P<0.001) were correlated with ccRCC prognosis (). Overall, COL5A1 can be used as an independent prognostic factor for ccRCC.
Table 2
Cox analysis of collagen family in the TCGA
| Name | Total (N) | Univariate hazard ratio (95% CI) | Univariate P value | Multivariate hazard ratio (95% CI) | Multivariate P value |
| --- | --- | --- | --- | --- | --- |
| COL1A1 | 539 | 1.161 (1.066–1.263) | <0.001 | 1.314 (0.930–1.857) | 0.122 |
| COL1A2 | 539 | 1.109 (1.002–1.227) | 0.045 | 0.389 (0.288–0.524) | <0.001 |
| COL3A1 | 539 | 1.063 (0.962–1.176) | 0.230 | | |
| COL5A1 | 539 | 1.233 (1.111–1.369) | <0.001 | 2.308 (1.497–3.557) | <0.001 |
| COL5A2 | 539 | 1.087 (0.957–1.236) | 0.200 | | |
| COL6A3 | 539 | 1.191 (1.075–1.320) | <0.001 | 0.983 (0.803–1.203) | 0.866 |

TCGA, The Cancer Genome Atlas; CI, confidence interval.
GSEA analysis of COL5A1
In order to further understand the COL5A1-related molecular functions and possible molecular mechanisms involved in tumor progression, genes related to COL5A1 expression in tumors were screened and gene enrichment analysis was performed. We screened 7,089 genes that were positively correlated with COL5A1 expression, and 7,689 genes that were negatively correlated with COL5A1 expression () (P<0.05; false discovery rate <0.05). The gene heat map shows the genes with the top 50 correlations. Enrichment analysis of relevant genes obtained showed that genes associated with COL5A1 were primarily involved in extracellular structure organization, amoebiasis, ECM-receptor interaction, and valine, leucine and isoleucine degradation (Figure S3A,S3B).
Figure 6
Genes correlated with COL5A1 (LinkedOmics). (A) Volcano maps of top 50 genes correlated with COL5A1. (B) Heat maps of genes negatively correlated with COL5A1. (C) Heat maps of genes positively correlated with COL5A1.
The methylation level of gene promoters is closely related to survival in tumors. We used the TCGA KIRC methylation data contained in MethSurv to perform survival analysis of CpGs located near collagen family genes. We found that the methylation levels of collagen family members were altered in ccRCC and that CpG methylation sites were associated with ccRCC survival. The promoter methylation levels of COL1A1 and COL1A2 were significantly reduced in renal cancer, which to some extent explains the high expression of these two genes in ccRCC. In contrast, the promoter methylation level of COL6A3 was significantly increased (). In addition, certain CpG sites in collagen family genes were associated with ccRCC prognosis, including 14 sites in COL1A1, 10 in COL1A2, 2 in COL3A1, 42 in COL5A1, 6 in COL5A2, and 27 in COL6A3 (P<0.05) (). In conclusion, these results suggest that methylation levels of collagen family members are associated with the prognosis of ccRCC.
Figure 7
DNA methylation of collagen family members in MethSurv. (A) COL1A1; (B) COL1A2; (C) COL3A1; (D) COL5A1; (E) COL5A2; (F) COL6A3.
Table 3
The significant prognostic values of CpG in the collagen family members
| Name | CpG name | HR | 95% CI | LR test P value | UCSC RefGene Group | Relation to UCSC CpG Island |
| --- | --- | --- | --- | --- | --- | --- |
| COL1A1 | cg00060287 | 0.603 | (0.37–0.983) | 0.0332 | Body | Island |
| COL1A1 | cg02186748 | 1.761 | (1.176–2.636) | 0.0049 | TSS1500 | S_Shore |
| COL1A1 | cg03799835 | 0.518 | (0.353–0.761) | 0.0009 | Body | Open_Sea |
| COL1A1 | cg11027398 | 0.575 | (0.356–0.929) | 0.0172 | Body | Island |
| COL1A1 | cg14562086 | 0.564 | (0.343–0.928) | 0.0170 | TSS1500 | S_Shore |
| COL1A1 | cg14700325 | 0.546 | (0.362–0.824) | 0.0056 | Body | N_Shelf |
| COL1A1 | cg16781907 | 0.591 | (0.4–0.873) | 0.0076 | Body | N_Shelf |
| COL1A1 | cg18405262 | 0.47 | (0.316–0.701) | 0.0002 | Body | Open_Sea |
| COL1A1 | cg18618815 | 2.973 | (1.549–5.705) | 0.0002 | Body | N_Shore |
| COL1A1 | cg21847118 | 1.669 | (1.004–2.777) | 0.0373 | Body | Open_Sea |
| COL1A1 | cg22809726 | 2.639 | (1.476–4.72) | 0.0002 | 3'UTR | Open_Sea |
| COL1A1 | cg23950157 | 2.879 | (1.872–4.427) | 0.0000 | Body | N_Shore |
| COL1A1 | cg24540710 | 0.367 | (0.247–0.546) | 0.0000 | Body | Open_Sea |
| COL1A1 | cg27604897 | 0.612 | (0.416–0.901) | 0.0141 | Body | Open_Sea |
| COL1A2 | cg03920522 | 0.537 | (0.358–0.805) | 0.0037 | Body | Open_Sea |
| COL1A2 | cg08695855 | 1.864 | (1.079–3.221) | 0.0165 | TSS200 | Open_Sea |
| COL1A2 | cg09146903 | 1.503 | (1.013–2.229) | 0.0402 | TSS200 | Open_Sea |
| COL1A2 | cg10368049 | 0.376 | (0.216–0.655) | 0.0001 | TSS200 | Open_Sea |
| COL1A2 | cg14340196 | 0.585 | (0.359–0.952) | 0.0231 | Body | Open_Sea |
| COL1A2 | cg16872226 | 2.472 | (1.382–4.419) | 0.0007 | TSS200 | Open_Sea |
| COL1A2 | cg23348014 | 2.676 | (1.433–4.998) | 0.0005 | TSS1500 | Open_Sea |
| COL1A2 | cg24406898 | 0.654 | (0.446–0.959) | 0.0303 | TSS1500 | Open_Sea |
| COL1A2 | cg25300386 | 0.586 | (0.359–0.958) | 0.0249 | 1stExon;5'UTR | Open_Sea |
| COL1A2 | ch.7.1973356R | 2.145 | (1.442–3.191) | 0.0003 | Body | Open_Sea |
| COL3A1 | cg01942023 | 0.554 | (0.337–0.91) | 0.0134 | TSS1500 | Open_Sea |
| COL3A1 | cg20770175 | 0.541 | (0.325–0.899) | 0.0116 | Body | Open_Sea |
| COL5A1 | cg01753595 | 0.6 | (0.361–0.997) | 0.0376 | TSS1500 | N_Shore |
| COL5A1 | cg03298938 | 0.455 | (0.304–0.68) | 0.0002 | TSS1500 | Island |
| COL5A1 | cg03430597 | 0.552 | (0.332–0.917) | 0.0147 | Body | Island |
| COL5A1 | cg05328939 | 1.69 | (1.017–2.809) | 0.0324 | Body | Island |
| COL5A1 | cg05329720 | 2.851 | (1.558–5.215) | 0.0001 | Body | N_Shore |
| COL5A1 | cg07300559 | 2.34 | (1.332–4.108) | 0.0011 | Body | N_Shore |
| COL5A1 | cg08029329 | 1.858 | (1.105–3.125) | 0.0125 | Body | Island |
| COL5A1 | cg13438095 | 1.897 | (1.278–2.818) | 0.0012 | Body | Open_Sea |
| COL5A1 | cg13492737 | 1.747 | (1.05–2.907) | 0.0229 | Body | Open_Sea |
| COL5A1 | cg13496596 | 1.781 | (1.082–2.931) | 0.0162 | Body | S_Shore |
| COL5A1 | cg13499271 | 0.413 | (0.241–0.705) | 0.0004 | TSS1500 | N_Shore |
| COL5A1 | cg13516654 | 0.559 | (0.336–0.929) | 0.0170 | Body | Open_Sea |
| COL5A1 | cg13567205 | 1.95 | (1.315–2.892) | 0.0007 | Body | N_Shelf |
| COL5A1 | cg13596983 | 0.603 | (0.367–0.992) | 0.0361 | Body | Island |
| COL5A1 | cg13605536 | 2.049 | (1.258–3.337) | 0.0020 | Body | N_Shore |
| COL5A1 | cg13639452 | 1.958 | (1.319–2.908) | 0.0007 | Body | Open_Sea |
| COL5A1 | cg13698865 | 0.658 | (0.446–0.971) | 0.0335 | Body | Open_Sea |
| COL5A1 | cg13714791 | 2.596 | (1.475–4.566) | 0.0002 | Body | S_Shore |
| COL5A1 | cg13717540 | 1.82 | (1.082–3.063) | 0.0161 | Body | Open_Sea |
| COL5A1 | cg13754661 | 0.511 | (0.325–0.804) | 0.0023 | TSS1500 | Island |
| COL5A1 | cg13775295 | 0.527 | (0.353–0.787) | 0.0014 | Body | Open_Sea |
| COL5A1 | cg13854962 | 2.294 | (1.476–3.566) | 0.0001 | Body | S_Shelf |
| COL5A1 | cg13865347 | 2.73 | (1.526–4.885) | 0.0001 | Body | Open_Sea |
| COL5A1 | cg13913654 | 2.099 | (1.215–3.628) | 0.0038 | Body | Open_Sea |
| COL5A1 | cg13917918 | 1.791 | (1.175–2.732) | 0.0051 | Body | Open_Sea |
| COL5A1 | cg14070775 | 1.698 | (1.031–2.797) | 0.0282 | Body | Open_Sea |
| COL5A1 | cg14091896 | 0.605 | (0.368–0.994) | 0.0370 | Body | Open_Sea |
| COL5A1 | cg14194478 | 0.647 | (0.439–0.954) | 0.0267 | Body | Open_Sea |
| COL5A1 | cg14207613 | 1.962 | (1.29–2.985) | 0.0011 | Body | N_Shelf |
| COL5A1 | cg14227731 | 0.416 | (0.24–0.721) | 0.0006 | Body | Open_Sea |
| COL5A1 | cg14228756 | 1.8 | (1.115–2.906) | 0.0111 | Body | Open_Sea |
| COL5A1 | cg14237069 | 1.612 | (1.073–2.421) | 0.0255 | Body | N_Shore |
| COL5A1 | cg14274542 | 2.718 | (1.519–4.863) | 0.0001 | Body | Island |
| COL5A1 | cg14350693 | 1.627 | (1.083–2.443) | 0.0228 | Body | Island |
| COL5A1 | cg14355794 | 1.788 | (1.049–3.047) | 0.0227 | Body | Open_Sea |
| COL5A1 | cg14356362 | 0.566 | (0.34–0.944) | 0.0207 | Body | Island |
| COL5A1 | cg14399122 | 0.413 | (0.235–0.726) | 0.0006 | Body | Island |
| COL5A1 | cg14581018 | 1.909 | (1.283–2.839) | 0.0020 | Body | N_Shore |
| COL5A1 | cg14622967 | 1.528 | (1.022–2.284) | 0.0354 | Body | S_Shore |
| COL5A1 | cg14656180 | 2.356 | (1.363–4.074) | 0.0007 | Body | Open_Sea |
| COL5A1 | cg21208686 | 2.409 | (1.606–3.613) | 0.0000 | Body | S_Shore |
| COL5A1 | cg24354213 | 1.866 | (1.11–3.138) | 0.0118 | Body | Island |
| COL5A2 | cg02420724 | 0.529 | (0.318–0.882) | 0.0092 | TSS1500 | Open_Sea |
| COL5A2 | cg07875385 | 2.378 | (1.33–4.254) | 0.0012 | 1stExon;5'UTR | Open_Sea |
| COL5A2 | cg08247938 | 0.596 | (0.403–0.881) | 0.0086 | Body | Open_Sea |
| COL5A2 | cg09211763 | 2.544 | (1.423–4.55) | 0.0004 | 1stExon;5'UTR | Open_Sea |
| COL5A2 | cg10765212 | 1.508 | (1.021–2.227) | 0.0375 | TSS200 | Open_Sea |
| COL5A2 | cg12329318 | 0.341 | (0.187–0.623) | 0.0001 | Body | Open_Sea |
| COL6A3 | cg00002145 | 2.265 | (1.29–3.978) | 0.0017 | Body | Open_Sea |
| COL6A3 | cg00779216 | 2.361 | (1.344–4.149) | 0.0010 | Body | Island |
| COL6A3 | cg03372974 | 1.917 | (1.139–3.224) | 0.0086 | Body | Open_Sea |
| COL6A3 | cg05223158 | 0.44 | (0.255–0.761) | 0.0013 | Body | Open_Sea |
| COL6A3 | cg06284586 | 2.387 | (1.398–4.077) | 0.0005 | Body | Open_Sea |
| COL6A3 | cg08871711 | 1.77 | (1.065–2.941) | 0.0192 | Body | Open_Sea |
| COL6A3 | cg08950375 | 0.56 | (0.373–0.841) | 0.0071 | TSS1500 | Open_Sea |
| COL6A3 | cg08957605 | 2.286 | (1.277–4.09) | 0.0021 | Body | Open_Sea |
| COL6A3 | cg12681727 | 2.521 | (1.674–3.798) | 0.0000 | Body | Open_Sea |
| COL6A3 | cg13502931 | 0.515 | (0.347–0.764) | 0.0014 | Body | Open_Sea |
| COL6A3 | cg13537346 | 0.59 | (0.402–0.866) | 0.0076 | Body | Open_Sea |
| COL6A3 | cg14556851 | 1.647 | (1.002–2.708) | 0.0384 | Body | S_Shelf |
| COL6A3 | cg15747921 | 2.183 | (1.479–3.222) | 0.0001 | Body | Open_Sea |
| COL6A3 | cg17725364 | 2.14 | (1.27–3.604) | 0.0020 | Body | Island |
| COL6A3 | cg19696718 | 0.668 | (0.454–0.982) | 0.0397 | 5'UTR | Open_Sea |
| COL6A3 | cg20502977 | 0.473 | (0.27–0.831) | 0.0044 | Body | Open_Sea |
| COL6A3 | cg21136443 | 2.203 | (1.27–3.822) | 0.0021 | Body | N_Shelf |
| COL6A3 | cg21386952 | 2.33 | (1.548–3.507) | 0.0000 | Body | Open_Sea |
| COL6A3 | cg22944062 | 1.847 | (1.242–2.748) | 0.0020 | Body | N_Shelf |
| COL6A3 | cg23417677 | 2.248 | (1.316–3.841) | 0.0012 | Body | Open_Sea |
| COL6A3 | cg24830524 | 2.712 | (1.484–4.955) | 0.0002 | Body | Open_Sea |
| COL6A3 | cg25424742 | 1.557 | (1.037–2.338) | 0.0378 | Body | Open_Sea |
| COL6A3 | cg25591469 | 2.246 | (1.319–3.827) | 0.0011 | 5'UTR;1stExon | Open_Sea |
| COL6A3 | cg26278699 | 0.593 | (0.36–0.976) | 0.0304 | TSS200 | Open_Sea |
| COL6A3 | cg27049194 | 2.827 | (1.605–4.982) | 0.0001 | Body | Island |
| COL6A3 | cg27050057 | 1.764 | (1.061–2.933) | 0.0203 | Body | Open_Sea |
| COL6A3 | cg27451920 | 1.793 | (1.2–2.68) | 0.0059 | Body | S_Shore |

CpG, cytosine-phosphate-guanine; HR, hazard ratio; CI, confidence interval; LR, likelihood ratio; UCSC, University of California Santa Cruz.
Immune infiltration and drug response
We used the ccRCC data from the TIMER database to examine the correlation between collagen family members’ expression levels and the infiltration levels of tumor-infiltrating immune cells (TIICs). The results showed that the expression of collagen family members was positively correlated with the infiltration of the immune cells examined but negatively correlated with tumor purity (). Subsequently, we used the SCNA module of the database to examine the somatic copy number alterations of collagen family members, and the results showed that the arm-level deletion, arm-level gain, deep deletion, and high amplification of collagen family members were closely related to the level of immune cell infiltration in ccRCC (Figure S4A-S4F). These results suggest that members of the collagen family may influence the prognosis of ccRCC by regulating the level of tumor immune cell infiltration.
Figure 8
The correlation between collagens and immune cell infiltration in ccRCC (TIMER). (A) COL1A1; (B) COL1A2; (C) COL3A1; (D) COL5A1; (E) COL5A2; (F) COL6A3. ccRCC, clear cell renal cell carcinoma.
Previous studies have shown that the expression level of collagen family members is correlated with the prognosis of ccRCC, and that these expression changes may affect prognosis by regulating the level of tumor-associated immune cell infiltration through DNA methylation (26,27). Thus, members of the collagen family may have the potential to become targets for ccRCC therapy. Our analysis in the GSCALite database showed that the expression levels of collagen family members were correlated with tumor drug sensitivity. Ranked by the number of related drugs or small molecules, the genes were COL5A1 (14), COL5A2 (11), COL1A1 (9), COL1A2 (8), COL6A3 (7), and COL3A1 (3) (Figure S5). These results suggest that the collagen family, especially COL5A1 and COL5A2, may provide potential biomarkers for drug screening.
Discussion
ccRCC is a common malignant tumor that often leads to death through recurrence and metastasis (28). Treatment has improved with advances in technology, but there is still no effective treatment for recurrent and metastatic tumors. The lack of specific diagnostic and prognostic markers limits the early diagnosis and treatment of ccRCC. Therefore, the development of specific targets for the diagnosis and treatment of ccRCC is crucial. In this study, we identified 6 collagen family genes by analyzing 2 GEO ccRCC metastasis datasets. Further analysis showed that collagen family genes were highly expressed in ccRCC tissues and were closely related to the prognosis of ccRCC. Subsequently, we assessed the methylation level of collagen family genes in ccRCC, their relationship with tumor immune cell infiltration, and their association with drug sensitivity. The results indicate that collagen family genes can be used as prognostic markers of ccRCC and may help improve its diagnosis and treatment.
Previous studies have shown the prognostic value of collagens in a variety of tumors. Elevated COL1A2 expression is a predictor of gastric cancer prognosis (29). m6A methylation-mediated COL3A1 up-regulation promotes metastasis of triple-negative breast cancer (TNBC) (30). Furthermore, CircACAP2 promotes breast cancer proliferation and metastasis by targeting the miR-29a/b-3p-COL5A1 axis (31). COL5A2 acts as a potential clinical biomarker for gastric cancer and renal metastasis (32). These studies are consistent with our findings.
At present, it is widely believed that DNA methylation is closely related to the prognosis of tumors (33). A high methylation level of a gene promoter often leads to gene silencing, and methylation of key genes can affect tumor progression (34). Previous studies have shown that DNA methylation of TMEM130 promotes cell migration in breast cancer (35), that DIO3OS DNA methylation drives non-small cell lung cancer progression (36), and that ANGPTL4 DNA methylation promotes colorectal cancer metastasis by activating the ERK pathway (37). We assessed the methylation levels of the collagen family genes in ccRCC and found that the methylation levels of COL1A1 and COL1A2 were decreased, whereas that of COL6A3 was increased. In addition, multiple CpG sites of collagen family genes were associated with the prognosis of ccRCC.
Tumor immunotherapy is now effective against many tumor types, especially inoperable tumors. The infiltration level of tumor-associated immune cells directly affects the effect of tumor immunotherapy. Previous studies have shown that activation of the programmed death (PD)-1/PD-ligand (PD-L) pathway and regulatory T cells (Tregs) in the tumor microenvironment helps transformed cells evade immune surveillance and suppresses the antitumor immune response (38). In patients with TNBC, tumor-infiltrating lymphocytes (TILs) are associated with improved survival (39). Collagen promotes anti-PD-1/PD-L1 resistance in cancer through LAIR1-dependent CD8(+) T cell exhaustion (40). We found that collagen family genes are closely associated with the infiltration levels of various tumor-associated immune cells and may therefore serve as potential targets for tumor immunotherapy. In addition, the drug sensitivity analysis showed that collagen family genes, especially COL5A1 and COL5A2, were associated with sensitivity to multiple chemotherapeutic drugs in ccRCC. These results suggest that collagen family genes are closely associated with ccRCC prognosis and can be used as potential therapeutic targets for ccRCC.
Conclusions
In summary, we found that the collagen family genes are key genes for ccRCC metastasis. Both the expression levels and the methylation levels of collagen family genes affect the prognosis of ccRCC. In particular, COL5A1 can be used as an independent prognostic factor for ccRCC. In addition, collagen expression was associated with the level of tumor immune cell infiltration and with chemotherapy drug sensitivity. Therefore, our study suggests that collagen family genes can be used as prognostic and therapeutic targets for ccRCC.
Authors: Ethan Cerami; Jianjiong Gao; Ugur Dogrusoz; Benjamin E Gross; Selcuk Onur Sumer; Bülent Arman Aksoy; Anders Jacobsen; Caitlin J Byrne; Michael L Heuer; Erik Larsson; Yevgeniy Antipin; Boris Reva; Arthur P Goldberg; Chris Sander; Nikolaus Schultz Journal: Cancer Discov Date: 2012-05 Impact factor: 39.397
Authors: Brad T Sherman; Ming Hao; Ju Qiu; Xiaoli Jiao; Michael W Baseler; H Clifford Lane; Tomozumi Imamichi; Weizhong Chang Journal: Nucleic Acids Res Date: 2022-03-23 Impact factor: 19.160
Authors: Kenichi Harano; Ying Wang; Bora Lim; Robert S Seitz; Stephan W Morris; Daniel B Bailey; David R Hout; Rachel L Skelton; Brian Z Ring; Hiroko Masuda; Arvind U K Rao; Steven Van Laere; Francois Bertucci; Wendy A Woodward; James M Reuben; Savitri Krishnamurthy; Naoto T Ueno Journal: PLoS One Date: 2018-10-12 Impact factor: 3.240
Authors: Max Franz; Harold Rodriguez; Christian Lopes; Khalid Zuberi; Jason Montojo; Gary D Bader; Quaid Morris Journal: Nucleic Acids Res Date: 2018-07-02 Impact factor: 16.971
Authors: David H Peng; Bertha Leticia Rodriguez; Lixia Diao; Limo Chen; Jing Wang; Lauren A Byers; Ying Wei; Harold A Chapman; Mitsuo Yamauchi; Carmen Behrens; Gabriela Raso; Luisa Maren Solis Soto; Edwin Roger Parra Cuentes; Ignacio I Wistuba; Jonathan M Kurie; Don L Gibbons Journal: Nat Commun Date: 2020-09-09 Impact factor: 14.919