Literature DB >> 34273970

PODN is a prognostic biomarker and correlated with immune infiltrates in osteosarcoma.

Feng Yao1, Zhao Feng Zhu2, Jun Wen2, Fu Yong Zhang1, Zheng Zhang1,3, Lun Qing Zhu1, Guang Hao Su1,3, Quan Wen Yuan1, Yun Fang Zhen1, Xiao Dong Wang4,5.   

Abstract

BACKGROUND: Osteosarcoma was the most common primary bone malignancy in children and adolescents. It was imperative to identify effective prognostic biomarkers for this cancer. This study was aimed to identify potential crucial genes of osteosarcoma by integrated bioinformatics analysis.
METHODS: Identification of differentially expressed genes from public data gene expression profiles (GSE42352), functional and pathway enrichment analysis, protein-protein interaction (PPI) network construction and module analysis, Cox regression and survival analysis was conducted.
RESULTS: Totally 17 co-differential genes were found to be differentially expressed. These genes were enriched in biological processes, Kyoto Encyclopedia of Genes and Genomes (KEGG) and Gene Set Enrichment Analysis (GSEA) pathway of inflammatory immune response. PPI network was constructed with 63 differentially expressed genes that co-existed between the test set and the validation set. The area under the receiver operating characteristic curve (AUC value) was 0.855, which indicated that the expression of PODN had a good diagnostic value for osteosarcoma. Furthermore, Cox regression and survival analysis revealed 5 genes were statistically significant.
CONCLUSIONS: PODN was regarded as a potential biomarker for the diagnosis and prognosis of osteosarcoma, ACTA2, COL6A1, FAP, OLFML2B and COL6A3, can be used as potential prognostic indicators for osteosarcoma.
© 2021. The Author(s).

Entities:  

Keywords:  Biomarker; Diagnosis; Immune infiltrates; Osteosarcoma; PODN; Prognosis

Year:  2021        PMID: 34273970      PMCID: PMC8285818          DOI: 10.1186/s12935-021-02086-5

Source DB:  PubMed          Journal:  Cancer Cell Int        ISSN: 1475-2867            Impact factor:   5.722


Introduction

Osteosarcoma was a kind of highly aggressive tumor [1], most often developing in the long bone metaphysis, which have a relatively high incidence in adolescents (0.8–1.1 per 100,000 per year at age 15–19 years) [2, 3]. Compared with surgery alone, multimodal neoadjuvant chemotherapy (ChT) of high-grade localised OS increases disease-free survival probability from 10%–20% to > 60% [4], and no additional treatment have been found that can increase survival significantly [5, 6]. With the development of molecular biology, many studies have shown that high chromosome instability was one of the most important characteristic hallmark of osteosarcoma. Many studies on osteosarcoma find that mRNA play a potential biological role in osteosarcoma and can be used as prognostic markers and therapeutic targets. However, the biological function and related network of mRNA in osteosarcoma has not been fully elucidated. For instance, PODN gene encodes protein podocan. Podocan was a small proteoglycan (SLRP) family protein rich in leucine. It was not only a strong regulator of cell phenotype in extracellular matrix (ECM), but also a kind of ECM protein. ECM protein was involved in cell migration and proliferation, and high podocan level inhibits the proliferation of (SMC) in smooth muscle cells [7], so PODN may also be involved in the regulation of cell proliferation. PODN is associated with malignant tumors, YiBai et al. established a prognostic model of gastric cancer (GC) through PODN and other 6 genes, which can accurately predict the prognosis of GC [8]. In this study, bioinformatics methods were used to integrate the mRNA expression data, which was available in the GEO database, to identify differentially expressed mRNA between osteosarcoma and normal cell, and establish the related network of targeted molecules, aiming to provide new targets for diagnosis, treatments, and prognostic in osteosarcoma. Finally, it was determined that PODN has functional relevance in osteosarcoma and additional genes (ACTA2, COL6A1, FAP, OLFML2B and COL6A3) have prognostic significance for osteosarcoma.

Materials and methods

Identification of differential (co-expressed) genes from public microarray data

It was found that the expression of PODN had a significant difference between 15 normal samples and 103 osteosarcoma samples in the GSE42352 data set (test set), suggesting that PODN could be used as a diagnostic and prognostic biomarker for osteosarcoma. Then, 103 osteosarcoma samples in GSE42352 data set were grouped according to the median expression of PODN, including 51 cases in the low expression group and 52 cases in the high expression group. Difference analysis was performed by using the limma package [9], 103 differential genes was obtained by screen differential (co-expressed) genes according to multiple of fold change and P value (|logFC| > 1 & P < 0.05). Pheatmap package and ggplot2 package were used for visualization, heat map and volcano map of differential genes were performed. 101 osteosarcoma samples from the Target-OS data set (validation set) were grouped by the median expression of PODN, including 50 in the low expression group and 51 in the high expression group, and then, 526 differential genes were obtained in the same way. The intersection of two differential genes sets was obtained with 63 co-differential genes (|logFC| > 1 & P < 0.05). Thirty-eight differential lncRNAs were also obtained in the Target-OS data set (validation set) (|logFC| > 1 & P < 0.05).

GO, KEGG, and GSEA enrichment analysis

GO (Gene Ontology) includes BP (Biological Process), MF (Molecular Function) and CC (Cellular Component). GO analysis of differential genes (screening criteria was FDR < 0.1) was performed with the clusterProfiler package [10], using ggplot2 (https://cran.r-project.org/web/packages/ggplot2/index.html), R package for visualization. The ClueGO plugin of cytoscape software supports more than 200 species, database real-time update,multiple annotation data sets, multiset enrichment result comparison, and network graph presentation. The plug-in was also used for validation enrichment analysis, and the results were basically consistent. Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analysis used the clusterProfiler package and plotted the associated bar graph, bubble graph and enrichment graph. KEGG pathway enrichment analysis was performed on differentially expressed genes (DEGs) with EnsembleID among the differential genes in the test set to obtain an enrichment list, and pathways with P < 0.05 were considered significantly enriched. Gene Set Enrichment Analysis (GSEA) [11] used the expression matrix of differential genes grouped by high and low expression of PODN, which was analyzed by the ssGSEA package, and the selected reference gene set was msigdb.v7.0.entrez.gmt. and the ggplot2 package was used to visualize the GSEA collection.

Construction of PPI molecular interaction network and extraction of key gene modules

STRING database [12] was used to construct protein-protein interaction networks for differential genes. Currently, the database contains 18,838 human proteins and 25,914,693 core network interactions. In the present study, a protein–protein interaction (PPI) network was constructed by using the DEGs identified by the serial interaction body. The highest confidence interaction score was set to 0.4. The Hub genes of PPI network modules were screened by using MCODE and the cytoHubba plug-in and visualized.

Difference of immune infiltrating cells and correlation analysis of immunophenotype

CIBERSORT [13] is an algorithm widely used to characterize the cellular composition of complex tissues by gene expression levels in solid tumors. When using CIBERSORT, the LM22 signature algorithm was employed. LM22 is a special gene marker containing 547 genes that can differentiate 22 immune cell hypotypes downloaded from the CIBERSORT portal website (http://cibersort.stanford.edu/). In this study, we used the CIBERSORT package and calculated the infiltrating abundance of 22 immune cell with high and low PODN expression in 103 osteosarcoma samples by the LM22 algorithm, including different T cells, B cells, plasma cells, natural killer cells, and different myeloid lineages. Subsequently, we also analyzed the expression distribution correlation of 22 infiltrating immune cells at high and low PODN expression, and plotted the correlation heat map, with red representing positive correlation and blue representing negative correlation, and the darker the color, the closer the value is to 1 and the stronger the correlation. From the ImmPort database (https://immport.niaid.nih.gov/home) [14], a list of immune-related genes was obtained. Then, the intersection of the differential genes was acquried in the test set and the validation set with the immunophenotype-related genes and list genes respectively, and a Wayne diagram was made by the VennDiagram package. Subsequently, the pheatmap package of R language was used to correlate the differential gene sets of PODN high and low expression groups with immunophenotype-related genes, and the results with strong correlation were visualized to draw a correlation heat map.

Area under the ROC curve (AUC) to assess the diagnostic performance

The receiver operating characteristic curve (ROC curve) was plotted by pROC package, with the true positive rate (sensitivity) as the ordinate and the false positive rate (1-specificity) as the abscissa. It is used to analyze whether the expression level of PODN can well distinguish between 15 normal samples and 103 osteosarcoma samples, and determine to produce the optimal cut-off value of the highest likelihood ratio to decide the recognition threshold of PODN for osteosarcoma. The survivalROC package was also used to validate the accuracy of the risk model in predicting the survival prognosis of osteosarcoma. An area under the curve (AUC) greater than 0.5 can visually reflect the diagnostic value of biomarkers for the disease. At the same time, the closer the area under the ROC curve is to 1 and the closer to the (0, 1) point, the better the authenticity of the diagnostic test is demonstrated.

Cox regression and hazard model construction

At first, univariate Cox regression analysis of the Target database gene matrix was performed. Then, univariate regression analysis results was extracted by a risk factor collection with 17 co-differential genes, and further multivariate Cox regression analysis (narrow the range) of the gene collection was performed for predicting high-risk biomarkers for osteosarcoma.

Survival analysis

Target-OS data set survival analysis was performed using the survival package data, including overall survival time (OS), event-free (disease) survival time (DFS), and time to progression (TTP). To validating if the risk model has a good accuracy for the overall survival prognosis of osteosarcoma.

Statistical analysis

Excel (Microsoft) and R statistical software (version 4.0.2) were used to analyze data. P value less than 0.05 was considered statistically significant.

Results

Differential expression analysis

The GSE42352 data set of 15 normal samples and 103 osteosarcoma samples was used as the test set for differential analysis. Then, we found that PODN was significantly different between normal samples and osteosarcoma patients (Fig. 1A). Subsequently, we grouped the 103 tumor samples of the test set into high and low expression groups by PODN median value, including 51 cases in the low expression group and 52 cases in the high expression group. Differential analysis was performed by using the limma package and the differential results were visualized in the form of volcano maps and heat maps (Fig. 2A, B). And 101 osteosarcoma patients in the Target database of the validation set were divided into high and low expression groups by the median value of PODN expression, 50 and 51 cases respectively. Differential analysis was performed using the limma package, and differential (co-expressed) genes were screened according to multiple of difference between groups (fold change) and significance of difference (P value), setting a threshold of |logFC| > 1 & P < 0.05, to obtain 103 co-expressed genes. Pheatmap and ggplot2 packages were also used for visualization to draw the heat map and volcano map of differential genes. The median expression levels were grouped in 101 osteosarcoma samples from the Target-OS dataset (validation set), including 50 in the low expression group and 51 in the high expression group, and 526 differential genes were obtained in the same way (Fig. 2C, D). The intersection of two differential genes sets was obtained with 17 co-differential genes (Fig. 8A).
Fig. 1

Differential expression and prognostic significance of PODN in osteosarcoma. A PODN was significantly highly expressed in tumor tissues compared with normal tissues; B PODN was used to assess the diagnostic performance of PODN for osteosarcoma by area under the ROC curve (AUC) in the test set; C the effect of PODN on the overall survival prognosis (OS) of osteosarcoma was statistically significant; D the effect of PODN on the progression-free survival prognosis (DFS) of osteosarcoma disease was statistically significant

Fig. 2

Differential analysis of single and co-expressed genes. A The median expression levels of PODN in 103 osteosarcoma samples of GSE42352 data set (test set) were divided into high and low expression groups, and the significant differential genes between the two groups were displayed in the form of heat map; B the significant differential genes between the two groups of GSE42352 data set were displayed in the form of volcano diagram; C the median expression levels of PODN in 101 osteosarcoma samples of Target-OS data set were divided into high and low expression groups, and the significant differential genes between the two groups were displayed in the form of heat map; D the significant differential genes between the two groups of Target-OS data set were displayed in the form of volcano diagram

Fig. 8

Cox regression analysis and risk assessment analysis. A Wayne diagram of the intersection of test set and validation set differential genes; B Accuracy of each gene in the 5-gene set (ACTA2, COL6A1, FAP, OLFML2B, COL6A3) obtained by multivariate Cox analysis in predicting the survival prognosis of osteosarcoma; C ROC curve verifies the accuracy of the 5-gene set model as a whole in predicting the survival prognosis of osteosarcoma; D the effect of the 5-gene set as a risk model on the overall survival (OS) prognosis of osteosarcoma was statistically significant; E the risk curve (upper), survival status (middle), and risk heat map (lower) of the 5-gene set

Differential expression and prognostic significance of PODN in osteosarcoma. A PODN was significantly highly expressed in tumor tissues compared with normal tissues; B PODN was used to assess the diagnostic performance of PODN for osteosarcoma by area under the ROC curve (AUC) in the test set; C the effect of PODN on the overall survival prognosis (OS) of osteosarcoma was statistically significant; D the effect of PODN on the progression-free survival prognosis (DFS) of osteosarcoma disease was statistically significant Differential analysis of single and co-expressed genes. A The median expression levels of PODN in 103 osteosarcoma samples of GSE42352 data set (test set) were divided into high and low expression groups, and the significant differential genes between the two groups were displayed in the form of heat map; B the significant differential genes between the two groups of GSE42352 data set were displayed in the form of volcano diagram; C the median expression levels of PODN in 101 osteosarcoma samples of Target-OS data set were divided into high and low expression groups, and the significant differential genes between the two groups were displayed in the form of heat map; D the significant differential genes between the two groups of Target-OS data set were displayed in the form of volcano diagram GO (Gene Ontology) includes BP (Biological Process), MF (Molecular Function) and CC (Cellular Component). GO analysis of differential genes (Fig. 3 and Table 1) was performed with the clusterProfiler package [10] (screening criteria was FDR < 0.1) and visualized using the ggplot2 package. KEGG enrichment analysis (Fig. 4 and Table 2) was performed using the clusterProfiler package and plotting the correlation bar graph, bubble graph, enrichment graph. KEGG pathway enrichment analysis was performed on DEGs with EnsembleID among the differential genes in the test set to obtain an enrichment list, and pathways with P < 0.05 were considered significantly enriched.
Fig. 3

GO enrichment analysis of differential genes. GO enrichment analysis was performed for GSE42352 and Target database differential genes respectively. A The bar graph of GO enrichment analysis of GSE42352 data set using clusterProfiler package. Length represents the number of gene enrichment, color represents significance; B the bar graph of GO enrichment analysis of Target-OS data set using clusterProfiler package. Length represents the number of gene enrichment, color represents significance; C the bubble graph of GO enrichment analysis of GSE42352 data set using clusterProfiler package. Bubble size represents the number of gene enrichment, color represents significance; D the bubble graph of GO enrichment analysis of Target-OS data set using clusterProfiler package. Bubble size represents the number of gene enrichment, color represents significance; E the GO enrichment results of using GoPlot package to analyse test set were used to draw bubble diagram; F the GO enrichment results of using GoPlot package to analyse validation set were used to draw bubble diagram

Table 1

List of GO enrichment analysis performed on GSE42352 (gene enrichment number Top10) and target (gene enrichment number Top10) database co-expressed differential genes by clusterProfiler package

DatasetOntologyIDDescriptionP.adjustCount
GSE42352BPGO:0030198Extracellular matrix organization4.25E−1522
GSE42352BPGO:0043062Extracellular structure organization4.25E−1522
GSE42352BPGO:0019882Antigen processing and presentation1.98E−1015
GSE42352BPGO:0002478Antigen processing and presentation of exogenous peptide antigen8.65E−1013
GSE42352BPGO:0019884Antigen processing and presentation of exogenous antigen1.25E−0913
GSE42352BPGO:0048002Antigen processing and presentation of peptide antigen1.79E−0913
GSE42352BPGO:0034341Response to interferon-gamma3.09E−0913
GSE42352BPGO:0071346Cellular response to interferon-gamma1.19E−0812
GSE42352BPGO:0002697Regulation of immune effector process0.00028770212
GSE42352BPGO:0043312Neutrophil degranulation0.0004259412
GSE42352CCGO:0062023Collagen-containing extracellular matrix1.39E−1926
GSE42352CCGO:0010008Endosome membrane9.35E−0714
GSE42352CCGO:0030139Endocytic vesicle4.96E−0813
GSE42352CCGO:0005765Lysosomal membrane2.61E−0713
GSE42352CCGO:0098852Lytic vacuole membrane2.61E−0713
GSE42352CCGO:0005774Vacuolar membrane1.09E−0613
GSE42352CCGO:0042611MHC protein complex1.00E−1711
GSE42352CCGO:0005581Collagen trimer1.39E−1111
GSE42352CCGO:0030666Endocytic vesicle membrane1.26E−0811
GSE42352CCGO:0030133Transport vesicle2.34E−0511
GSE42352MFGO:0005201Extracellular matrix structural constituent3.04E−1718
GSE42352MFGO:0033218Amide binding1.06E−0815
GSE42352MFGO:0042277Peptide binding1.06E−0814
GSE42352MFGO:0005518Collagen binding1.21E−1010
GSE42352MFGO:0004175Endopeptidase activity0.0102820878
GSE42352MFGO:0042605Peptide antigen binding1.02E−087
GSE42352MFGO:0019838Growth factor binding8.98E−057
GSE42352MFGO:0003823Antigen binding0.0002301627
GSE42352MFGO:0005539Glycosaminoglycan binding0.0017343737
GSE42352MFGO:1901681Sulfur compound binding0.0026708047
Target-OSBPGO:0030198Extracellular matrix organization6.15E−1241
Target-OSBPGO:0043062Extracellular structure organization6.15E−1241
Target-OSBPGO:0031589Cell-substrate adhesion2.67E−0733
Target-OSBPGO:0001655Urogenital system development0.00013736527
Target-OSBPGO:0034329Cell junction assembly0.01081890525
Target-OSBPGO:0072001Renal system development0.0004637524
Target-OSBPGO:0001503Ossification0.01482297624
Target-OSBPGO:0001822Kidney development0.0005721123
Target-OSBPGO:0007178Transmembrane receptor protein serine/threonine kinase signaling pathway0.0084436323
Target-OSBPGO:0030111Regulation of Wnt signaling pathway0.01657874622
Target-OSCCGO:0062023Collagen-containing extracellular matrix1.28E−0938
Target-OSCCGO:0005925Focal adhesion2.57E−0937
Target-OSCCGO:0030055Cell-substrate junction2.83E−0937
Target-OSCCGO:0031252Cell leading edge9.36E−0528
TargetOSCCGO:0005743Mitochondrial inner membrane0.00198623827
Target-OSCCGO:0005759Mitochondrial matrix0.04319506422
Target-OSCCGO:0031253Cell projection membrane0.00229328521
Target-OSCCGO:0045121Membrane raft0.00343157220
Target-OSCCGO:0098857Membrane microdomain0.00343157220
Target-OSCCGO:0098589Membrane region0.00535740920
Target-OSMFGO:0050839Cell adhesion molecule binding1.46E−0843
Target-OSMFGO:0005201Extracellular matrix structural constituent3.19E−0823
Target-OSMFGO:0005178Integrin binding2.43E−0517
Target-OSMFGO:0019838Growth factor binding0.0001649516
Target-OSMFGO:0030020Extracellular matrix structural constituent conferring tensile strength0.000164959
Target-OSMFGO:0003779Actin binding0.00028630930
Target-OSMFGO:0048407Platelet-derived growth factor binding0.0005770355
Target-OSMFGO:0045296Cadherin binding0.00111876824
Target-OSMFGO:0051015Actin filament binding0.00224386817
Target-OSMFGO:0008237Metallopeptidase activity0.02974245314
Fig. 4

KEGG enrichment analysis. KEGG enrichment analysis was performed for GSE42352 and Target database differential genes, respectively. A The bubble graph of KEGG enrichment analysis of GSE42352 data set using clusterProfiler package. Bubble size represents the number of gene enrichment, color from blue to red, significance gradually increased; B the bar graph of KEGG enrichment analysis of GSE42352 data set using clusterProfiler package. Length represents the number of gene enrichment, color from blue to red, significance gradually increased; C the enrichment diagram of KEGG analysis of Target-42352 data set using clusterProfiler package, showing the relationship between gene ID and pathway; D the bubble graph of KEGG enrichment analysis of Proteet-OS data set using clusterProfiler package. Bubble size represents the number of gene enrichment, color from blue to red, significance gradually increased; E the bar graph of KEGG enrichment analysis of Target-OS data set using clusterProfiler package. Length represents the number of gene enrichment, color from blue to red, significance gradually increased; F the enrichment diagram of KEGG analysis of Target-OS data set using clusterProfiler package, showing the relationship between gene ID and pathway; G ECM-receptor interaction pathway; H relaxin signaling pathway pathway; I protein digestion and absorption pathway

Table 2

List of KEGG enrichment analysis performed on GSE42352 (gene enrichment number Top10) and target (gene enrichment number Top10) database co-expressed differential genes by clusterProfiler package

DatasetIDDescriptionPqvalueCount
Target-OShsa04974Protein digestion and absorption0.00070.000614
Target-OShsa04510Focal adhesion0.00420.003718
Target-OShsa04933AGE–RAGE signaling pathway in diabetic complications0.00420.003712
Target-OShsa04810Regulation of actin cytoskeleton0.00740.006518
Target-OShsa04512ECM-receptor interaction0.01700.015010
Target-OShsa04926Relaxin signaling pathway0.02400.021112
Target-OShsa03010Ribosome0.03920.034513
GSE42352hsa04612Antigen processing and presentation0.00000.000013
GSE42353hsa05150Staphylococcus aureus infection0.00000.000013
GSE42354hsa04145Phagosome0.00000.000013
GSE42355hsa05152Tuberculosis0.00000.000012
GSE42356hsa05416Viral myocarditis0.00000.000011
GSE42357hsa05323Rheumatoid arthritis0.00000.000011
GSE42358hsa05322Systemic lupus erythematosus0.00000.000011
GSE42359hsa04514Cell adhesion molecules0.00000.000011
GSE42360hsa05166Human T-cell leukemia virus 1 infection0.00000.000011
GSE42361hsa05168Herpes simplex virus 1 infection0.00390.002611
GO enrichment analysis of differential genes. GO enrichment analysis was performed for GSE42352 and Target database differential genes respectively. A The bar graph of GO enrichment analysis of GSE42352 data set using clusterProfiler package. Length represents the number of gene enrichment, color represents significance; B the bar graph of GO enrichment analysis of Target-OS data set using clusterProfiler package. Length represents the number of gene enrichment, color represents significance; C the bubble graph of GO enrichment analysis of GSE42352 data set using clusterProfiler package. Bubble size represents the number of gene enrichment, color represents significance; D the bubble graph of GO enrichment analysis of Target-OS data set using clusterProfiler package. Bubble size represents the number of gene enrichment, color represents significance; E the GO enrichment results of using GoPlot package to analyse test set were used to draw bubble diagram; F the GO enrichment results of using GoPlot package to analyse validation set were used to draw bubble diagram List of GO enrichment analysis performed on GSE42352 (gene enrichment number Top10) and target (gene enrichment number Top10) database co-expressed differential genes by clusterProfiler package KEGG enrichment analysis. KEGG enrichment analysis was performed for GSE42352 and Target database differential genes, respectively. A The bubble graph of KEGG enrichment analysis of GSE42352 data set using clusterProfiler package. Bubble size represents the number of gene enrichment, color from blue to red, significance gradually increased; B the bar graph of KEGG enrichment analysis of GSE42352 data set using clusterProfiler package. Length represents the number of gene enrichment, color from blue to red, significance gradually increased; C the enrichment diagram of KEGG analysis of Target-42352 data set using clusterProfiler package, showing the relationship between gene ID and pathway; D the bubble graph of KEGG enrichment analysis of Proteet-OS data set using clusterProfiler package. Bubble size represents the number of gene enrichment, color from blue to red, significance gradually increased; E the bar graph of KEGG enrichment analysis of Target-OS data set using clusterProfiler package. Length represents the number of gene enrichment, color from blue to red, significance gradually increased; F the enrichment diagram of KEGG analysis of Target-OS data set using clusterProfiler package, showing the relationship between gene ID and pathway; G ECM-receptor interaction pathway; H relaxin signaling pathway pathway; I protein digestion and absorption pathway List of KEGG enrichment analysis performed on GSE42352 (gene enrichment number Top10) and target (gene enrichment number Top10) database co-expressed differential genes by clusterProfiler package GSEA enrichment (Fig. 5) analysis used the expression matrix of differential genes grouped by high and low expression of PODN, which was analyzed by the ssGSEA package, and the selected reference gene set was msigdb.v7.0.entrez.gmt. And the GSEA set was visualized. This suggests that high PODN expression may be associated with the above enrichment pathways and molecular enrichment function, and may mediate the occurrence of cancer by indirect mechanisms that affect a variety of cytokines and metabolic enzymes of the above pathways or phenotypes.
Fig. 5

GSEA enrichment analysis. GSEA enrichment analysis was performed using the ssGSEA package on the GSE42352 database with the results of PODN high and low expression group difference analysis. A, B The forms of bubble graph (A) and enrichment chord graph (B) based on GSEA enrichment analysis of MSigDB database show the Top30 enrichment results obtained using ssGSEA method respectively; C, D The forms of bubble graph (C) and enrichment chord graph (D) based on GO functional enrichment analysis show the Top30 enrichment results obtained using ssGSEA method respectively; E, F the forms of bubble graph (E) and enrichment chord graph (F) based on KEGG database GSEA enrichment analysis show the Top30 enrichment results obtained using ssGSEA method respectively

GSEA enrichment analysis. GSEA enrichment analysis was performed using the ssGSEA package on the GSE42352 database with the results of PODN high and low expression group difference analysis. A, B The forms of bubble graph (A) and enrichment chord graph (B) based on GSEA enrichment analysis of MSigDB database show the Top30 enrichment results obtained using ssGSEA method respectively; C, D The forms of bubble graph (C) and enrichment chord graph (D) based on GO functional enrichment analysis show the Top30 enrichment results obtained using ssGSEA method respectively; E, F the forms of bubble graph (E) and enrichment chord graph (F) based on KEGG database GSEA enrichment analysis show the Top30 enrichment results obtained using ssGSEA method respectively

PPI network construction and Hub gene screening

Subsequently, a protein-protein interaction (PPI) network of differential genes using the STRING database for the 63 differential genes was constructed that co-existed between the test set and the validation set (Fig. 6A). Based on the protein interaction relationship predicted by the string database, one of the most closely linked gene sets was obtained using the MCODE plug-in analysis in Cytoscape (Fig. 6B), as well as the MCC analysis method in the cytoHubba plug-in resulting in a bar diagram of the Hub gene with tight Top30 relationship (Fig. 6C).
Fig. 6

Construction of PPI molecular interaction network. A PPI networks were constructed using Cytoscape software on the molecular interaction relationships analyzed by the String database; B the most closely related key gene sets obtained through the MCODE plug-in of Cytoscape software; C the most closely related Hub gene set with Top30 obtained by the MCC algorithm in the cytoHubba plug-in of Cytoscape software

Construction of PPI molecular interaction network. A PPI networks were constructed using Cytoscape software on the molecular interaction relationships analyzed by the String database; B the most closely related key gene sets obtained through the MCODE plug-in of Cytoscape software; C the most closely related Hub gene set with Top30 obtained by the MCC algorithm in the cytoHubba plug-in of Cytoscape software Infiltrating immune cell analysis was performed to calculate the infiltrating abundance of 22 immune cell with high and low PODN expression in 103 osteosarcoma samples, including different T cells, B cells, plasma cells, natural killer cells and different myeloid lineages, by using the CIBERSORT inclusion and LM22 algorithms. Subsequently, the differential infiltration expression of 22 infiltrating immune cells at high and low PODN expression was analyzed in the test set and validation set respectively, and plotted a violin plot to show the differential expression of each immune cell between groups (Fig. 7A, B) as well as the correlation heat map (Fig. 7C, D). Combined with the above analysis results, it was suggested that the immune cells with significant differences obtained by us in the high and low PODN expression groups may have a very important role in the immune microenvironment of osteosarcoma. Similarly, the occurrence and development of osteosarcoma may be related to inflammatory and metabolic pathways, and may improve the proportion of immune cell distribution by predicting the target small molecule drugs of these immune cells, so as to achieve the purpose of treating osteosarcoma.
Fig. 7

Differential immune cell infiltration and immunophenotype correlation analysis. A Using CIBERSORT package, the violin plot of 22 immune cell infiltration expression differences between PODN high and low expression in 103 osteosarcoma samples in the test set was calculated by LM22 algorithm, yellow represents the low expression group and red represents the high expression group, the p value represents the difference significance, and p < 0.05 was considered to have significant difference. B The violin plot of 22 immune cell infiltration expression differences between high and low PODN expression in 101 osteosarcoma samples in the validation set, yellow represents the low expression group and red represents the high expression group, p value represents the difference significance, and p < 0.05 is considered to have a significant difference. C Heat map of correlation of 22 immune cells between high and low expression of PODN in the test set; D heat map of correlation of 22 immune cells between high and low PODN expression set in the validation set; E Wayne diagram of intersection of DEGs and immune-related genes in the test set; F Wayne diagram of intersection of DEGs and immune-related genes in the validation set; G heat map of correlation of immune-related differential gene expression in the test set; H heat map of correlation of immune-related differential gene expression in the validation set

Differential immune cell infiltration and immunophenotype correlation analysis. A Using CIBERSORT package, the violin plot of 22 immune cell infiltration expression differences between PODN high and low expression in 103 osteosarcoma samples in the test set was calculated by LM22 algorithm, yellow represents the low expression group and red represents the high expression group, the p value represents the difference significance, and p < 0.05 was considered to have significant difference. B The violin plot of 22 immune cell infiltration expression differences between high and low PODN expression in 101 osteosarcoma samples in the validation set, yellow represents the low expression group and red represents the high expression group, p value represents the difference significance, and p < 0.05 is considered to have a significant difference. C Heat map of correlation of 22 immune cells between high and low expression of PODN in the test set; D heat map of correlation of 22 immune cells between high and low PODN expression set in the validation set; E Wayne diagram of intersection of DEGs and immune-related genes in the test set; F Wayne diagram of intersection of DEGs and immune-related genes in the validation set; G heat map of correlation of immune-related differential gene expression in the test set; H heat map of correlation of immune-related differential gene expression in the validation set Subsequently, the list of immune-related genes was obtained from the ImmPort database, took the intersection between differential genes and immune-related genes among the test set and validation set, and made a Wayne diagram using the VennDiagram package (Fig. 7E, F). Subsequently, the expression of these differential phenotype-related genes in two data sets was correlated, visualizing the results with strong correlation using pheatmap of R language, and displayed the correlation coefficient to draw the correlation heat map (Fig. 7G, H).

Diagnostic performance evaluation and survival analysis of PODN molecule for osteosarcoma

Finally, to assess the diagnostic performance of PODN for cancer, receiver operating characteristic curve (ROC curve) was plotted by pROC package for analyzing whether the expression level of PODN could well distinguish 15 normal samples from 103 tumor samples in the test set, and determine to produce the optimal cut-off value of the highest likelihood ratio to decide the recognition threshold of PODN for osteosarcoma. The area under the ROC curve (AUC value) was 0.855, indicating that the expression of PODN has a good diagnostic value for osteosarcoma (Fig. 1B). In order to further analyze the effect on the survival prognosis of osteosarcoma, the survival package was used to perform the prognostic survival analysis on the clinical overall survival time (OS), event-free (disease) survival time (DFS), and time to progression (TTP) of the validation set, and the results showed that OS and DFS were significantly different (Fig. 1C, D).

Cox regression analysis and risk score

We first did univariate Cox regression analysis of the Target database gene matrix, and we used 17 common immune-related differential genes as a risk factor set, extracted their univariate regression analysis results, and did further multivariate Cox regression analysis of this gene set. Finally, ACTA2, COL6A1, FAP, OLFML2B and COL6A3 were identified statistically significant (Table 3), indicating that these five differential genes were significantly correlated, and the effect of differential gene set on the prognosis and survival of osteosarcoma was significant, which could be used as the high-risk biomarkers for osteosarcoma.
Table 3

The hazard model obtained by multivariate Cox regression analysis based on 17 co-expressed differential genes common to GSE42352 and Target databases

idcoefexp(coef)se(coef)zPr( >|z|)
ACTA2− 0.34650.70720.1316− 2.63340.0085
COL6A1− 0.30900.73420.2067− 1.49540.1348
FAP− 0.41880.65780.1896− 2.20860.0272
OLFML2B− 0.21070.81010.1367− 1.54110.1233
COL6A30.76092.14020.25382.99800.0027
The hazard model obtained by multivariate Cox regression analysis based on 17 co-expressed differential genes common to GSE42352 and Target databases In order to further investigate the effect of the 5 gene set on the survival prognosis of osteosarcoma, we analyzed the overall survival prognosis (OS) (Fig. 8D) between high and low risk groups with it as a risk factor in the clinical survival data of Target database using the survival package. Meanwhile, we used survival ROC package to verify the accuracy of the gene set model in predicting the prognosis of osteosarcoma. We got the result of AUC = 0.75 (Fig. 8C), which indicated that this gene set has high accuracy for the overall survival prognosis of osteosarcoma. Finally, we plotted the risk curve (top), survival status (middle), and risk heatmap (bottom) for this set of gene sets (Fig. 8E). Cox regression analysis and risk assessment analysis. A Wayne diagram of the intersection of test set and validation set differential genes; B Accuracy of each gene in the 5-gene set (ACTA2, COL6A1, FAP, OLFML2B, COL6A3) obtained by multivariate Cox analysis in predicting the survival prognosis of osteosarcoma; C ROC curve verifies the accuracy of the 5-gene set model as a whole in predicting the survival prognosis of osteosarcoma; D the effect of the 5-gene set as a risk model on the overall survival (OS) prognosis of osteosarcoma was statistically significant; E the risk curve (upper), survival status (middle), and risk heat map (lower) of the 5-gene set

Discussion

We designed this experiment to analyze the huge credit generating data and found that the corresponding credit generating indexes after grouping according to different levels of PODN were marked with significantly differences, so we found the corresponding credit generating indexes after grouping according to different levels of PODN. Through bioinformatics analysis (identification of differentially expressed genes from public microarray data, functional and pathway enrichment analysis, PPI network construction and module analysis, Cox regression analysis), we found that the expression of ACTA2, COL6A1, FAP, OLFML2B and COL6A3 genes were significantly different with different PODN expression levels. Further survival analysis showed that they had good accuracy for the overall survival and prognosis of osteosarcoma. Better knowledge on cellular signaling in high-grade osteosarcoma may identify new possibilities for targeted treatment of this cancer [4]. As high-risk biomarkers of osteosarcoma, these five molecules were expected to become potential therapeutic targets, early diagnosis and prognostic indicators. PODN had an important role of profound clinical significance. For example, studies by YiSun et al. have shown that miR-3180-5p reduces the expression of PODN and leads to cyclin-dependent kinase 2 (cdk2)-mediated proliferation of HBSMCs in human bladder smooth muscle cells [15]. Podocan encoded by PODN can be found in a variety of cells. Ross, Bruggeman et al. find podocan in glomerular basement membrane. And, they elucidate that the imbalance of podocan expression in (HIVAN) of HIV-a-associated nephropathy may lead to the pathogenesis of (FSGS) in focal segmental glomerulosclerosis [16]. Didangelos et al. confirm the existence of Podocan in aortic intima by immunohistochemical staining [17]. Many related researches were carired to verify the importance of PODN. In this study, new molecules were found be aberrantly expressed in osteosarcoma, which suggested the potential value of PODN. The shortage of this study was that we only carried out bioinformatics analysis, without sufficient laboratory data to verify and unclear of specific molecular transmission signaling pathway. However, ACTA2, COL6A1, FAP, OLFML2B and COL6A3, these five potential targets have great research value for further study. Disclosure of these five crucial genes will greatly save the time and economic costs of osteosarcoma scholars and scientists for other researchers around the world. Our current research was just an introduction, under the guidance of our research data, we hope to see the publication of more and more valuable researches, and human beings will continue to advance their understanding and treatment of osteosarcoma.

Conclusions

It was identified that PODN was a prognostic biomarker and correlated with immune infiltrates in osteosarcoma, and was correlated with immune infiltrates in osteosarcoma. Furthermore, ACTA2, COL6A1, FAP, OLFML2B and COL6A3, can be used as prognosis biomarkers for osteosarcoma.
  17 in total

1.  Bone sarcomas: ESMO-PaedCan-EURACAN Clinical Practice Guidelines for diagnosis, treatment and follow-up.

Authors:  P G Casali; S Bielack; N Abecassis; H T Aro; S Bauer; R Biagini; S Bonvalot; I Boukovinas; J V M G Bovee; B Brennan; T Brodowicz; J M Broto; L Brugières; A Buonadonna; E De Álava; A P Dei Tos; X G Del Muro; P Dileo; C Dhooge; M Eriksson; F Fagioli; A Fedenko; V Ferraresi; A Ferrari; S Ferrari; A M Frezza; N Gaspar; S Gasperoni; H Gelderblom; T Gil; G Grignani; A Gronchi; R L Haas; B Hassan; S Hecker-Nolting; P Hohenberger; R Issels; H Joensuu; R L Jones; I Judson; P Jutte; S Kaal; L Kager; B Kasper; K Kopeckova; D A Krákorová; R Ladenstein; A Le Cesne; I Lugowska; O Merimsky; M Montemurro; B Morland; M A Pantaleo; R Piana; P Picci; S Piperno-Neumann; A L Pousa; P Reichardt; M H Robinson; P Rutkowski; A A Safwat; P Schöffski; S Sleijfer; S Stacchiotti; S J Strauss; K Sundby Hall; M Unk; F Van Coevorden; W T A van der Graaf; J Whelan; E Wardelmann; O Zaikova; J Y Blay
Journal:  Ann Oncol       Date:  2018-10-01       Impact factor: 32.976

2.  Incidence and survival of malignant bone sarcomas in England 1979-2007.

Authors:  Jeremy Whelan; Anne McTiernan; Nicola Cooper; Yuen K Wong; Matthew Francis; Sally Vernon; Sandra J Strauss
Journal:  Int J Cancer       Date:  2011-11-02       Impact factor: 7.396

3.  limma powers differential expression analyses for RNA-sequencing and microarray studies.

Authors:  Matthew E Ritchie; Belinda Phipson; Di Wu; Yifang Hu; Charity W Law; Wei Shi; Gordon K Smyth
Journal:  Nucleic Acids Res       Date:  2015-01-20       Impact factor: 16.971

4.  Novel small leucine-rich repeat protein podocan is a negative regulator of migration and proliferation of smooth muscle cells, modulates neointima formation, and is expressed in human atheroma.

Authors:  Randolph Hutter; Li Huang; Walter S Speidl; Chiara Giannarelli; Paul Trubin; Gerhard Bauriedel; Mary E Klotman; Valentin Fuster; Juan J Badimon; Paul E Klotman
Journal:  Circulation       Date:  2013-09-16       Impact factor: 29.690

5.  Bone cancer incidence by morphological subtype: a global assessment.

Authors:  Patricia C Valery; Mathieu Laversanne; Freddie Bray
Journal:  Cancer Causes Control       Date:  2015-06-09       Impact factor: 2.506

6.  Improvement in histologic response but not survival in osteosarcoma patients treated with intensified chemotherapy: a randomized phase III trial of the European Osteosarcoma Intergroup.

Authors:  Ian J Lewis; Marianne A Nooij; Jeremy Whelan; Matthew R Sydes; Robert Grimer; Pancras C W Hogendoorn; Muhammad A Memon; Simon Weeden; Barbara M Uscinska; Martine van Glabbeke; Anne Kirkpatrick; Esther I Hauben; Alan W Craft; Antonie H M Taminiau
Journal:  J Natl Cancer Inst       Date:  2007-01-17       Impact factor: 13.506

7.  Robust enumeration of cell subsets from tissue expression profiles.

Authors:  Aaron M Newman; Chih Long Liu; Michael R Green; Andrew J Gentles; Weiguo Feng; Yue Xu; Chuong D Hoang; Maximilian Diehn; Ash A Alizadeh
Journal:  Nat Methods       Date:  2015-03-30       Impact factor: 28.547

8.  STRING v11: protein-protein association networks with increased coverage, supporting functional discovery in genome-wide experimental datasets.

Authors:  Damian Szklarczyk; Annika L Gable; David Lyon; Alexander Junge; Stefan Wyder; Jaime Huerta-Cepas; Milan Simonovic; Nadezhda T Doncheva; John H Morris; Peer Bork; Lars J Jensen; Christian von Mering
Journal:  Nucleic Acids Res       Date:  2019-01-08       Impact factor: 16.971

9.  IR/IGF1R signaling as potential target for treatment of high-grade osteosarcoma.

Authors:  Marieke L Kuijjer; Elisabeth F P Peterse; Brendy E W M van den Akker; Inge H Briaire-de Bruijn; Massimo Serra; Leonardo A Meza-Zepeda; Ola Myklebost; A Bassim Hassan; Pancras C W Hogendoorn; Anne-Marie Cleton-Jansen
Journal:  BMC Cancer       Date:  2013-05-20       Impact factor: 4.430

10.  MiR 3180-5p promotes proliferation in human bladder smooth muscle cell by targeting PODN under hydrodynamic pressure.

Authors:  Yi Sun; De-Yi Luo; Yu-Chun Zhu; Liang Zhou; Tong-Xin Yang; Cai Tang; Hong Shen; Kun-Jie Wang
Journal:  Sci Rep       Date:  2016-09-09       Impact factor: 4.379

View more
  3 in total

1.  Pan-Cancer Analysis of OLFML2B Expression and Its Association With Prognosis and Immune Infiltration.

Authors:  Pengbo Hu; Xiuyuan Zhang; Yiming Li; Liang Xu; Hong Qiu
Journal:  Front Genet       Date:  2022-07-06       Impact factor: 4.772

2.  The Clinical Significance and Potential Molecular Mechanism of Upregulated CDC28 Protein Kinase Regulatory Subunit 1B in Osteosarcoma.

Authors:  Chaohua Mo; Le Xie; Chang Chen; Jie Ma; Yingxin Huang; Yanxing Wu; Yuanyuan Xu; Huizhi Peng; Zengwei Chen; Rongjun Mao
Journal:  J Oncol       Date:  2021-12-10       Impact factor: 4.375

3.  Identification of circadian clock genes as regulators of immune infiltration in Hepatocellular Carcinoma.

Authors:  Zhen Zhang; Zicheng Liang; Wenhui Gao; Shuxian Yu; Zongwei Hou; Kexin Li; Puhua Zeng
Journal:  J Cancer       Date:  2022-09-01       Impact factor: 4.478

  3 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.