Literature DB >> 34123594

Identification of potential gene signatures associated with osteosarcoma by integrated bioinformatics analysis.

Yutao Jia1, Yang Liu1, Zhihua Han2, Rong Tian1.   

Abstract

BACKGROUND: Osteosarcoma (OS) is the most primary malignant bone cancer in children and adolescents with a high mortality rate. This work aims to screen novel potential gene signatures associated with OS by integrated microarray analysis of the Gene Expression Omnibus (GEO) database.
MATERIAL AND METHODS: The OS microarray datasets were searched and downloaded from GEO database to identify differentially expressed genes (DEGs) between OS and normal samples. Afterwards, the functional enrichment analysis, protein-protein interaction (PPI) network analysis and transcription factor (TF)-target gene regulatory network were applied to uncover the biological function of DEGs. Finally, two published OS datasets (GSE39262 and GSE126209) were obtained from GEO database for evaluating the expression level and diagnostic values of key genes.
RESULTS: In total 1,059 DEGs (569 up-regulated DEGs and 490 down-regulated DEGs) between OS and normal samples were screened. Functional analysis showed that these DEGs were markedly enriched in 214 GO terms and 54 KEGG pathways such as pathways in cancer. Five genes (CAMP, METTL7A, TCN1, LTF and CXCL12) acted as hub genes in PPI network. Besides, METTL7A, CYP4F3, TCN1, LTF and NETO2 were key genes in TF-gene network. Moreover, Pax-6 regulated four key genes (TCN1, CYP4F3, NETO2 and CXCL12). The expression levels of four genes (METTL7A, TCN1, CXCL12 and NETO2) in GSE39262 set were consistent with our integration analysis. The expression levels of two genes (CXCL12 and NETO2) in GSE126209 set were consistent with our integration analysis. ROC analysis of GSE39262 set revealed that CYP4F3, CXCL12, METTL7A, TCN1 and NETO2 had good diagnostic values for OS patients. ROC analysis of GSE126209 set revealed that CXCL12, METTL7A, TCN1 and NETO2 had good diagnostic values for OS patients. ©2021 Jia et al.

Entities:  

Keywords:  Activating transcription factor; Bioinformatic; Diagnosis; Genes; Osteosarcoma

Year:  2021        PMID: 34123594      PMCID: PMC8164836          DOI: 10.7717/peerj.11496

Source DB:  PubMed          Journal:  PeerJ        ISSN: 2167-8359            Impact factor:   2.984


Introduction

Osteosarcoma (OS) is a type of primary malignant bone cancer that causes public health concern, especially in children and adolescents (Isakoff, Meltzer & Gorlick, 2015; Lindsey, Markel & Kleinerman, 2017). Several treatment strategies of OS such as surgical resection, traditional adjuvant chemotherapy and radiotherapy have been favored by clinical oncologists in the past few decades (Nagarajan et al., 2011). Accordingly, the 5-year survival rate has been raised to approximately 70% (Bielack et al., 2002). However, it is estimated that 80% of OS patients may suffer from the micro-metastasis, which cannot be detected at early diagnosis (Messerschmitt et al., 2009). Although multiple methods for the diagnosis and treatment of OS have been developed, new methods for the prevention and treatment of OS are still needed. The pathogenesis of OS progression remains not fully understood. Therefore, the identification of effective diagnostic makers and exploring underlying molecular etiology of OS is a pressing need. The emergence of high-throughput sequencing technology has become an effective way to illuminate the pathogenic genes in a variety of human diseases, which help to explore pathogenesis and develop biomarkers. Many research groups have screened multiple biomarkers associated with OS by analyzing gene expression data. For example, Sun et al. evaluated the difference of genes in the expression level between OS metastasis and OS non-metastasis and discovered that TGFB1, LFT3, KDM1A, and KRAS may participate in the occurrence of OS. Xiong et al. (2015) found that CCT3, COPS3 and WWP1 were involved in the OS development by integrating gene expression and genomic aberration data. Liu, Zhao & Chen (2017) constructed a co-expression network based on a Gene Expression Omnibus (GEO) dataset and identified many potential biomarkers such as CTLA4 and PBF for diagnosis and treatment of OS. However, the molecular mechanisms of OS initiation and development have not been fully explored. In this study, we retrieved GEO database and obtained four OS datasets. Subsequently, the differentially expressed genes (DEGs) between OS samples and normal samples were obtained and subjected to functional analysis. A protein–protein interaction (PPI) network was constructed followed by the establishment of transcription factor (TF)-target regulatory network. Following this, we downloaded two published GEO datasets of OS as the validation set for assessing the expression levels of key candidate genes. Finally, the receiver operating characteristic (ROC) analysis was conducted to evaluate the diagnostic values of key candidate genes. This study will discover novel gene signatures associated with OS, providing new trains of thought for the diagnosis and treatment of OS.

Materials and Methods

Data acquisition

The datasets were retrieved from the National Center for Biotechnology Information-GEO (http://www.ncbi.nlm.nih.gov/geo/) repository using the key terms of ‘osteosarcoma’ AND ‘Homo sapiens’[porgn]. All selected datasets in this study should meet the following criteria: (i) the datasets contained genome-wide expression data of tumor tissues and normal control tissues of OS patients; and (ii) all data were standardized or raw data. As shown in Table S1, a total of five datasets were obtained. Notably, the GSE9508 dataset contained over 50% missing data and was subsequently removed from the following analysis. Eventually, four datasets (GSE12865, GSE19276, GSE87624 and GSE99671) were included in this study, which included 118 OS tissues and 28 normal bone tissues. The GSE12865 series (GPL6244 platform) included a total of 14 samples (12 OS and two normal control tissues). The platform for GSE19276 was GPL6848 including 44 OS and five normal control bone tissue. The platform for GSE87624, consisting of 44 OS and three normal control bone tissue samples, was GPL11154. GSE99671 was in GPL20148 platform, which contained 18 OS and 18 normal control bone tissue samples. The platform and series matrix files were downloaded. The dataset information was listed in Table S1. The impact of different platforms on the results, we normalized the data through the log function, and centralized and standardized the scale function to eliminate the impact of the dimension on the data structure.

Data pre-processing and DEGs identification

The standardized data from included datasets were firstly processed as follows: (i) the probes that mapped to several genes were deleted; and (ii) if the gene was matched by multiple probes, the probe with the greatest gene expression value would be retained. Overall, there were overlapping 14981 genes among four datasets. Subsequently, MetaMA (https://cran.r-project.org/web/packages/metaMA/), an R package, was used to combine data from four GEO datasets. Individual p-values were obtained using Limma R package. The inverse normal method was used to combine P values in meta-analysis (Marot, Mayer & Jaffrézic, 2009). We carried out the multiple comparison correction by Benjamini & Hochberg approach to acquire false discovery rate (FDR). Herein, the DEGs between OS tissues and normal controls were defined according to the cutoff of false discovery rate (FDR) <  0.05 and those DEGs with different directionality were removed from this study. Finally, the hierarchical clustering analysis of top 100 DEGs was also carried out by pheatmap package in R software.

Functional enrichment analyses

To systematically explore the underlying biological functions of identified DEGs, the Metascape (http://metascape.org/gp/index.html), an online tool that integrates multiple data resources such as Gene Ontology (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG) and Universal Protein (Uniprot) database, was used to perform GO and KEGG pathway enrichment analysis of DEGs. GO analysis involved three categories: biological process (BP), cellular component (CC) and molecular function (MF). P values <  0.05 was set as the thresholds for significant enrichment analyses.

Protein–protein interaction (PPI) network analysis

The Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) database, a freely web-based analytic tool, can predict the interactions among proteins (Szklarczyk et al., 2019). Here, a PPI analysis was conducted to examine the interactive associations between protein products of DEGs. The Cytoscape software (http://www.cytoscape.org) was utilized to establish a PPI network. In addition, the CytoNCA (http://apps.cytoscape.org/apps/cytonca) was used to analyze topological characteristics of PPI network. The top 15 nodes were considered as hub genes according to the degree value.

TF-candidate gene network analysis

TFs can bind to specific DNA sequences in promoter region of target gene to regulate gene expression. The top 20 up- and down-regulated genes were regarded as the candidate genes. The DNA sequences (2 kb) in the upstream promoter region of these candidate genes were firstly downloaded from the University of California, Santa Cruz (http://www.genome.ucsc.edu/) databases. Then, we employed online match tool from TRANSFAC (http://genexplain.com/transfac) to predict potential TFs that targeted candidate genes. Notably, the TFs that had only one binding site with target genes were retained in this study. Finally, the Cytoscape software was used to build a transcriptional regulatory network and perform node degree analysis.

Evaluation the expression level and diagnostic values of key genes

Two published OS datasets were obtained from GEO database for the expression level evaluation of seven key DEGs (CAMP, METTL7A, TCN1, LTF, CXCL12, CYP4F3 and NETO2). Then, we performed a ROC analysis using the pROC package in R software (http://web.expasy.org/pROC/) to evaluate the diagnostic value of these seven DEGs. Accordingly, the area under the curve (AUC) was computed and the ROC curve was built. The AUC value > 0.8 showed a good diagnostic value for OS.

Results

Identification of DEGs

After data pre-processing, a total of 1,059 DEGs (569 up-regulated genes and 490 down-regulated genes) were identified between OS samples and normal controls according to methods described above. The clustering analysis indicated that top 100 DEGs could distinguish OS samples and controls from four datasets (Fig. S1). The top 20 up- and down-regulated genes were listed in Table 1.
Table 1

The list of top 20 up-regulated and down-regulated differentially expressed genes.

Gene symbolCombined.ESP_valueFDRUp/Down-regulation
NETO21.7735.48E−101.16E−07Up-regulation
RTKN1.6643.90E−096.57E−07Up-regulation
TMEM651.6691.08E−081.54E−06Up-regulation
MLLT111.5071.41E−081.84E−06Up-regulation
LAPTM4B1.4991.78E−082.21E−06Up-regulation
ZC3H81.5631.97E−082.41E−06Up-regulation
SLC35F21.5703.46E−083.90E−06Up-regulation
IRX21.5345.03E−085.19E−06Up-regulation
ZNF5931.4047.11E−086.78E−06Up-regulation
FLAD11.4517.54E−087.10E−06Up-regulation
KCNG11.4561.32E−071.16E−05Up-regulation
FGD11.3042.59E−072.04E−05Up-regulation
RPAP21.4802.63E−072.06E−05Up-regulation
DYRK41.4052.83E−072.20E−05Up-regulation
EDARADD1.3603.06E−072.33E−05Up-regulation
PDCD51.2853.77E−072.80E−05Up-regulation
TMEM971.3494.04E−072.95E−05Up-regulation
GNL21.3114.14E−073.01E−05Up-regulation
HOXB61.5274.47E−073.17E−05Up-regulation
ZZZ31.5045.79E−073.99E−05Up-regulation
CAMP−4.15800Down-regulation
AHSP−3.15600Down-regulation
OLFM4−3.10800Down-regulation
LTF−3.00600Down-regulation
ADH1C−2.81400Down-regulation
CXCL12−2.74600Down-regulation
BPI−2.55600Down-regulation
HBD−2.6158.88E−165.32E−13Down-regulation
TCN1−2.5221.11E−156.40E−13Down-regulation
FABP4−2.2967.77E−154.31E−12Down-regulation
RAB37−2.2792.02E−141.08E−11Down-regulation
FCN1−2.3202.95E−141.53E−11Down-regulation
TMEM154−2.2793.13E−141.55E−11Down-regulation
METTL7A−2.1815.44E−142.47E−11Down-regulation
CYP4F3−2.3046.66E−142.94E−11Down-regulation
CAT−2.1093.44E−131.43E−10Down-regulation
CHL1−1.9745.51E−132.17E−10Down-regulation
TMEM132C−1.8605.19E−121.85E−09Down-regulation
SERPINB2−3.0825.64E−121.97E−09Down-regulation
SLC28A3−2.1188.26E−122.69E−09Down-regulation

GO and KEGG enrichment analysis of DEGs

The GO enrichment analysis of DEGs showed that a total of 214 GO terms were enriched, including 194 GO-BP terms, 15 GO-CC terms and 5 GO-MF terms (Table S1). Specifically, for GO-BP analysis, these DEGs were strongly associated with positive regulation of cell death, myeloid cell activation involved in immune response and regulation of protein kinase activity. Meanwhile, protein domain specific binding and transcription factor binding were significantly enriched GO-MF terms. Many DEGs were primarily involved in multiple GO-CC terms, such as anchored component of membrane and tertiary granule. The top 20 clusters of significantly enriched GO terms were displayed in Fig. 1. In addition, these DEGs were markedly enriched in 20 KEGG pathways such as regulation of lipolysis in adipocytes, protein processing in endoplasmic reticulum and pathways in cancer (Table 2). Notably, the top 20 up-regulated genes did not enrich in any KEGG pathway. However, four of top 20 down-regulated genes played vital roles in multiple significantly enriched KEGG pathways, including FABP4 (fatty acid binding protein 4), CXCL12 (C-X-C motif chemokine ligand 12), CXCL12 (C-X-C motif chemokine ligand 12) and CAT (Catalase). More specifically, FABP4 was involved in regulation of lipolysis in adipocytes and CXCL12 participated in pathways in cancer and axon guidance (Table 2). CYP4F3 was closely correlated with arachidonic acid metabolism pathway and CAT was significantly enriched in biosynthesis of amino acids and AMPK signaling pathway (Table 2).
Figure 1

Top 20 significantly enriched Gene Ontology terms of differentially expressed genes.

Table 2

The top 20 significantly enriched KEGG pathways.

IDTermP_valueCountGene Symbols
hsa04923Regulation of lipolysis in adipocytes0.000110ADRB2, FABP4, PDE3B, PIK3CD, PLIN1, PTGER3, IRS2, ABHD5, PNPLA2, ADCY4
hsa04141Protein processing in endoplasmic reticulum0.000119BAG1, EIF2S1, HSPA2, LMAN1, MAN1A1, MAP3K5, P4HB, EIF2AK2, RPN2, RRBP1, SSR2, PREB, SEC24A, SEC61G, UBQLN2, DNAJB11, UBQLN4, DNAJC1, UBE2J2
hsa05200Pathways in cancer0.000234BCL2L1, CASP8, CDKN2A, CEBPA, CKS1B, COL4A1, DVL1, E2F1, EPAS1, FGF7, FGF13, FLT3, MTOR, GLI2, GLI3, GNAQ, PIK3CD, PTEN, PTGER2, PTGER3, RARB, RXRA, CXCL12, SKP2, STAT5A, TGFB3, TGFBR2, VEGFA, NCOA4, FGF16, PIAS2, ARHGEF11, LEF1, ADCY4, SHC1, CALM1, CHEK1, ELK1, FDPS, GPS2, MSX2, RANBP1, TLN1, VCAM1, TLN2
hsa01230Biosynthesis of amino acids0.000411CTH, ENO1, GAPDH, PC, SHMT1, TALDO1, TKT, TPI1, SDS, RPIA, AADAT, CAT, MDH2, ME1, MCEE
hsa04974Protein digestion and absorption0.001911COL2A1, COL4A1, COL5A2, COL10A1, COL11A1, COL17A1, CPA3, SLC7A8, COL18A1, COL27A1, SLC16A10
hsa04722Neurotrophin signaling pathway0.002213CALM1, MAPK14, MAP3K5, NFKBIE, NTF3, NTRK2, PIK3CD, PRKCD, RPS6KA2, SHC1, MAGED1, PRDM4, IRAK3
hsa04152AMPK signaling pathway0.002313CPT1A, ELAVL1, MTOR, LEP, PIK3CD, PPP2R5A, PPP2R5D, PRKAB2, IRS2, CREB5, CAMKK2, STRADB, CREB3L1, PRKCD, PRKCE, PTEN, RPS6KA2, TBC1D4, JAK2, NFKBIE, RXRA, SOCS2, CAT, ADCY4, CALM1, ELK1, FLOT2, PDE3B, PRKAR2B, SHC1, INPP5K, HSPA2
M00007Pentose phosphate pathway, non-oxidative phase, fructose 6P =>ribose 5P0.00263TALDO1, TKT, RPIA
hsa03060Protein export0.00285OXA1L, SRP54, SRP72, SEC61G, SRPRB
hsa04933AGE-RAGE signaling pathway in diabetic complications0.004111COL4A1, MAPK14, JAK2, PIK3CD, PRKCD, PRKCE, STAT5A, TGFB3, TGFBR2, VCAM1, VEGFA, CD247, MTOR, NFKBIE, RXRA, STAT6, IL27RA, LHB, SHC1, SOCS2, AOX1, BCL2L1, IL5RA, LEP, PIAS2
hsa04360Axon guidance0.004516EFNA1, EFNA3, EFNA5, EFNB1, EPHA3, EPHB2, EPHB6, FES, PIK3CD, CXCL12, SEMA7A, SEMA3A, RHOD, SEMA4C, NTNG2, PLXNA4
hsa04015Rap1 signaling pathway0.005318CALM1, MAPK14, EFNA1, EFNA3, EFNA5, FGF7, FGF13, GNAQ, ITGB3, PFN2, PIK3CD, TLN1, VEGFA, FGF16, RAPGEF3, APBB1IP, TLN2, ADCY4, BCL2L1, COL2A1, COL4A1, MTOR, ITGA7, JAK2, PPP2R5A, PPP2R5D, PTEN, RXRA, TLR4, CREB5, CREB3L1, THEM4
hsa00590Arachidonic acid metabolism0.00558ALOX15, GGT5, GPX3, GPX7, LTA4H, CYP4F3, PLA2G5, PTGES, GSTA4, GSTM3, SRM, GGCT
hsa04932Non-alcoholic fatty liver disease (NAFLD)0.006014CASP8, CEBPA, EIF2S1, LEP, MAP3K5, NDUFB9, NDUFS5, PIK3CD, PRKAB2, RXRA, UQCRH, IRS2, NDUFB11, NDUFA4L2
hsa00620Pyruvate metabolism0.00666GLO1, HAGH, MDH2, ME1, PC, LDHD
hsa04924Renin secretion0.00738ADRB2, CALM1, GNAQ, PDE1A, PDE1C, PDE3B, PTGER2, CLCA4
hsa04750Inflammatory mediator regulation of TRP channels0.010110CALM1, MAPK14, GNAQ, PIK3CD, PRKCD, PRKCE, PTGER2, TRPA1, TRPV4, ADCY4, ADRB2, ATP2B4, PPP2R5A, PPP2R5D, CREB5, RAPGEF3, CACNG7, CREB3L1, HSPA2, SHC1, PLA2G5, PTGIR, ARHGEF11, PPP1R14A, MYL6B, GPX3, GPX7, SLC26A4, ELK1, GNRH2, LHB
hsa01524Platinum drug resistance0.01448BCL2L1, CASP8, CDKN2A, GSTA4, GSTM3, MAP3K5, PIK3CD, PMAIP1, BCL2A1, DFFA, DFFB, EIF2S1, PARP2, DIABLO, DAB2IP, MAPK14, VCAM1, CREB5, CREB3L1, MLKL, CALM1, ITGB3, PLAT, VEGFA, TRPV4
hsa04071Sphingolipid signaling pathway0.014611MAPK14, MS4A2, GNAQ, MAP3K5, PIK3CD, PPP2R5A, PPP2R5D, PRKCE, PTEN, SPTLC2, SGMS1
hsa00983Drug metabolism-other enzymes0.01476CDA, CES1, DPYD, UCK2, CES2, UCKL1

Notes.

KEGG; Kyoto Encyclopedia of Genes and Genomes.

Notes. KEGG; Kyoto Encyclopedia of Genes and Genomes.

PPI network analysis

To determine the relationships among DEGs, a PPI network was built based on the STRING database, which included 109 nodes and 196 protein pairs (Fig. 2). The top 15 hub genes contain PPBP (pro-platelet basic protein; degree = 13), CAMP (cathelicidin antimicrobial peptide, degree = 13), LTF (lactotransferrin, degree = 12), BST1 (bone marrow stromal cell antigen 1, degree = 12), CXCR2 (C-X-C motif chemokine receptor 2, degree = 10), OLFM4 (olfactomedin 4, degree = 10), STOM (stomatin, degree = 10), TCN1 (transcobalamin 1, degree = 9), SLC4A1 (solute carrier family 4 member 1, degree = 9), LTA4H (leukotriene A4 hydrolase, degree = 9), S100A9 (S100 calcium binding protein A9, degree = 9), CXCL12 (degree = 8), CLEC12A (C-type lectin domain family 12 member A, degree = 8), RAB37 (RAB37, member RAS oncogene family, degree = 8) and METTL7A (methyltransferase like 7A, degree = 8). More notably, these genes were all down-regulated.
Figure 2

Protein–protein interaction networks of differentially expressed genes.

Red and green ellipses represent up-regulated and down-regulated genes, respectively. The black borders indicate top 20 up-regulated and down-regulated genes.

TF-target network analysis

TF exerts crucial roles in regulating the expression of target gene. Herein, we employed TRANSFAC to predict TFs that regulated 20 up-regulated and down-regulated DEGs. As shown in Fig. 3, the TF-target gene regulatory network contained 95 nodes (55 TFs and 40 genes) and 275 TF-gene pairs. The top 15 genes in TF-target network were CHL1 (cell adhesion molecule L1 like, down-regulation, degree = 16), SERPINB2 (serpin family B member 2, down-regulation, degree = 15), SLC28A3 (solute carrier family 28 member 3, down-regulation, degree = 11), ZC3H8 (zinc finger CCCH-type containing 8, up-regulation, degree = 10), LAPTM4B (lysosomal protein transmembrane 4 beta, up-regulation, degree = 10), DYRK4 (dual specificity tyrosine phosphorylation regulated kinase 4, up-regulation, degree = 9), METTL7A (solute carrier family 28 member 3, down-regulation, degree = 9), KCNG1 (potassium voltage-gated channel modifier subfamily G member 1, down-regulation, degree = 9), CYP4F3 (down-regulation, degree = 9), GNL2 (G protein nucleolar 2, up-regulation, degree = 8), HBD (hemoglobin subunit delta, down-regulation, degree = 8), TCN1 (down-regulation, degree = 8), ZZZ3 (zinc finger ZZ-type containing 3, up-regulation, degree = 8), NETO2 (neuropilin and tolloid like 2, up-regulation, degree = 8), and LTF (lactotransferrin, down-regulation, degree = 8). In addition, the top six TFs that covered the most downstream genes were exhibited in Table 3, which contained Pax-4, 1-Oct, Nkx2-5, HNF-4, FOXD3 and Pax-6. Interestingly, TCN1, CYP4F3, NETO2 and CXCL12 were regulated by Pax-6 (Table 3).
Figure 3

Transcription factor-top 20 up-regulated and down-regulated genes network.

Diamonds and ellipses represent transcription factors and top 20 up-regulated and down-regulated genes, respectively. Red and green ellipses represent up-regulated and down-regulated genes, respectively.

Table 3

The top 6 TF that has the most downstream genes.

TFNumberGene Symbol
Pax-431FLAD1, ZZZ3, GNL2, EDARADD, RTKN, LTF, CAMP, CHL1, ADH1C, TMEM154, IRX2, FABP4, LAPTM4B, SLC28A3, FGD1, CXCL12, HBD, TCN1, CAT, SLC35F2, DYRK4, METTL7A, OLFM4, AHSP, RAB37, TMEM97, SERPINB2, CYP4F3, PDCD5, BPI, KCNG1
1-Oct22FLAD1, GNL2, RPAP2, EDARADD, ZC3H8, CHL1, CAMP, IRX2, FABP4, TMEM65, SLC28A3, HBD, TCN1, CAT, TMEM132C, DYRK4, METTL7A, TMEM97, HOXB6, SERPINB2, BPI, KCNG1
Nkx2-519FLAD1, MLLT11, RPAP2, EDARADD, CAMP, CHL1, TMEM154, LAPTM4B, TMEM65, SLC28A3, FGD1, TCN1, SLC35F2, DYRK4, AHSP, SERPINB2, CYP4F3, KCNG1, BPI
HNF-416ZZZ3, ZNF593, GNL2, LTF, CHL1, CAMP, LAPTM4B, SLC28A3, FGD1, CXCL12, HBD, SLC35F2, METTL7A, RAB37, CYP4F3, KCNG1
FOXD314GNL2, RPAP2, EDARADD, ZC3H8, CHL1, LTF, FABP4, FGD1, HBD, DYRK4, METTL7A, OLFM4, NETO2, RAB37
Pax-613EDARADD, GNL2, CHL1, TMEM65, SLC28A3, CXCL12, HBD, TCN1, DYRK4, NETO2, SERPINB2, CYP4F3, KCNG1

Notes.

indicates the number of genes regulated by the TF

transcription factor

An external dataset (GSE39262) was obtained from GEO database, which contained 10 human osteosarcoma cell lines and five untransformed cell lines samples. The platform for this dataset was GPL96 [HG-U133A] Affymetrix Human Genome U133A Array. GSE126209 was downloaded from the GEO database, which inclued 12 osteosarcomas tumors and 11 adjacent normal tissues samples. The platform for this dataset was GPL20301 Illumina HiSeq 4000. Seven key genes (CAMP, METTL7A, TCN1, LTF, CXCL12, CYP4F3 and NETO2) were selected to verify in GSE39262. Among them, CAMP, METTL7A, TCN1, LTF and CXCL12 acted as hub genes in PPI network. METTL7A, CYP4F3, TCN1, LTF and NETO2 were key genes in TF-gene network. Moreover, Pax-6 regulated four key genes (TCN1, CYP4F3, NETO2 and CXCL12). The gene differential expression analysis of the GSE39262 dataset revealed that NETO2 was significantly up-regulated while CXCL12, METTL7A and TCN1 were significantly down-regulated, which were consistent with our integration analysis (Fig. 4; Table S2). The gene differential expression analysis of the GSE126209 dataset displayed that NETO2 was significantly up-regulated while CXCL12 was significantly down-regulated, which were consistent with our integration analysis (Fig. 5; Table S3). ROC analysis is a commonly used method to evaluate the value of genetic diagnosis and has been used in previously biomedical works (Le et al., 2019; Le, Yapp & Yeh, 2019; Thi, Trang & Khanh, 2020). Additionally, the results of the GSE39262 dataset showed that five genes had good diagnostic values for OS (CXCL12, CYP4F3, METTL7A, NETO2 and TCN1; Fig. 6). The AUC of CXCL12 was 1.000 and the specificity and sensitivity of this model were 100.0% and 100%, respectively. The AUC of CYP4F3 was 0.840 and the specificity and sensitivity of this model were 80.0% and 80.0%, respectively. The AUC of METTL7A was 0.900 and the specificity and sensitivity of this model were 100.0% and 70.0%, respectively. The AUC of NETO2 was 0.860 and the specificity and sensitivity of this model were 100.0% and 80.0%, respectively. The AUC of TCN1 was 0.820 and the specificity and sensitivity of this model were 80.0% and 90.0%, respectively. Ultimately, the results of the GSE126209 dataset showed that four genes had good diagnostic values for OS (CXCL12, METTL7A, NETO2 and TCN1; Fig. 7). CXCL12 was 1.000 and the specificity and sensitivity of this model were 100.0% and 100%, respectively. The AUC of METTL7A was 0.856 and the specificity and sensitivity of this model were 100.0% and 83.3%, respectively. The AUC of NETO2 was 0.833 and the specificity and sensitivity of this model were 100.0% and 75.0%, respectively. The AUC of TCN1 was 0.879 and the specificity and sensitivity of this model were 81.8% and 83.3%, respectively.
Figure 4

Box plots of seven differentially expressed genes in the GSE39262 dataset.

The x-axes represent control and case groups while the y-axes represent the relative expression levels of the genes. Seven genes included NETO2, CAMP, METTL7A, TCN1, LTF, CXCL12 and CYP4F3.

Figure 5

(A-G) Box plots of seven differentially expressed genes in GSE126209 dataset.

The x-axes represent control and case groups while the y-axes represent the relative expression levels of the genes. Seven genes included NETO2, CAMP, METTL7A, TCN1, LTF, CXCL12 and CYP4F3.

Figure 6

ROC curves of selected differentially expressed genes in the GSE39262 dataset.

The x-axes and the y-axes show 1-specificity and sensitivity, respectively. ROC, receiver operating characteristic. (A-G) The seven genes included NETO2, CAMP, METTL7A, TCN1, LTF, CXCL12 and CYP4F3.

Figure 7

ROC curves of selected differentially expressed genes in the GSE126209 dataset.

The x-axes and the y-axes show 1-specificity and sensitivity, respectively. ROC, receiver operating characteristic. (A-G) The seven genes included NETO2, CAMP, METTL7A, TCN1, LTF, CXCL12 and CYP4F3.

Protein–protein interaction networks of differentially expressed genes.

Red and green ellipses represent up-regulated and down-regulated genes, respectively. The black borders indicate top 20 up-regulated and down-regulated genes.

Transcription factor-top 20 up-regulated and down-regulated genes network.

Diamonds and ellipses represent transcription factors and top 20 up-regulated and down-regulated genes, respectively. Red and green ellipses represent up-regulated and down-regulated genes, respectively. Notes. indicates the number of genes regulated by the TF transcription factor

Box plots of seven differentially expressed genes in the GSE39262 dataset.

The x-axes represent control and case groups while the y-axes represent the relative expression levels of the genes. Seven genes included NETO2, CAMP, METTL7A, TCN1, LTF, CXCL12 and CYP4F3.

(A-G) Box plots of seven differentially expressed genes in GSE126209 dataset.

The x-axes represent control and case groups while the y-axes represent the relative expression levels of the genes. Seven genes included NETO2, CAMP, METTL7A, TCN1, LTF, CXCL12 and CYP4F3.

ROC curves of selected differentially expressed genes in the GSE39262 dataset.

The x-axes and the y-axes show 1-specificity and sensitivity, respectively. ROC, receiver operating characteristic. (A-G) The seven genes included NETO2, CAMP, METTL7A, TCN1, LTF, CXCL12 and CYP4F3.

ROC curves of selected differentially expressed genes in the GSE126209 dataset.

The x-axes and the y-axes show 1-specificity and sensitivity, respectively. ROC, receiver operating characteristic. (A-G) The seven genes included NETO2, CAMP, METTL7A, TCN1, LTF, CXCL12 and CYP4F3.

Discussion

OS is a common malignant bone tumor and originates from mesenchymal stromal cells (MSCs) (Xiao, Hogendoorn & Cleton-Jansen, 2013). The heterogeneous histopathological characteristics and complex genomic landscape of OS have been major challenges for elaborating underlying the molecular pathogenesis of OS. In this study, we included four OS datasets and identified 1,059 DEGs (569 up-regulated DEGs and 490 down-regulated DEGs) between OS and normal samples. These genes were significantly enriched in 54 KEGG pathways such as pathways in cancer. Moreover, CAMP, METTL7A, TCN1, LTF and CXCL12 served as hub genes in PPI network. METTL7A, CYP4F3, TCN1, LTF and NETO2 were key players in TF-target gene regulatory network. Interestingly, TCN1, CYP4F3, NETO2 and CXCL12 were all regulated by Pax-6. Additionally, the expression patterns of key genes (CAMP, METTL7A, TCN1, LTF, CXCL12, CYP4F3 and NETO2) were selected to verify in two published OS datasets (GSE39262 and GSE126209). CAMP, also known as hCAP18 or LL37, is an antimicrobial peptide gene in human (Larrick et al., 1995). The C-terminal of the protein product of CAMP contains a 37-amino acid-long peptide with broad spectrum-antibacterial activity (Vandamme, Luyten & Schoofs, 2012). There are positive expressions of CAMP in the multiple cell systems, such as epithelial cells, neutrophils and macrophages (Dhawan et al., 2015; Frew et al., 2014; Li et al., 2018). Wu et al. (2012) suggested that bone marrow stroma could express CAMP, which may be a potential ex vivo priming factor for hematopoietic stem progenitor cells to promote hematopoietic reconstitution after transplantation. Later, Coffelt et al. (2019) discovered that CAMP expression level was elevated in MSCs compared to that in ovarian cancer cells. Herein, our analysis showed that CAMP was the most down-regulated gene in patients suffering from OS. Besides, CAMP acted as a hub gene in PPI network, suggesting that this gene may be involved in the pathologic mechanism of OS. Although the underlying role of CAMP on the initiation and progression of OS has not been investigated, available evidence showes that CAMP plays significant roles in several cancers, including breast cancer, lung cancer and pancreatic cancer (García-Quiroz et al., 2016; Sainz Jr et al., 2015; Von Haussen et al., 2008). More notably, existing data indicated that CAMP had either carcinogenic or anti-cancer effects (Chen et al., 2018; Wu et al., 2010). Therefore, the influence of CAMP on OS occurrence and development needs to be further clarified in future. Our gene differential expression revealed that CXCL12 and TCN1 were down-regulated in OS patients, which were verified in a validation dataset. Moreover, these two genes also acted as hub genes in PPI network. In addition, up-regulated NETO2 and down-regulated CYP4F3 had high degree in TF-gene regulatory network. Interestingly, CXCL12, TCN1, NETO2 and CYP4F3, regulated by Pax-6, exhibited important diagnostic values for OS. CXCL12 is also called stromal cell-derived factor-1 (SDF-1) and can bind to G-protein-coupled chemokine receptor CXCR4 (Nagasawa, 2014). Increasing studies suggested that CXCL12/CXCR4 axis played pivotal roles in tumor growth and development (Balkwill, 2004; Lu et al., 2015; Perissinotto et al., 2005). Li et al. (2018). highlighted that epigenetic regulation of CXCL12 by DNA methyltransferase 1 was associated with the metastasis and immune response in OS. Previous reports also indicated that down-regulation of CXCR4 induced OS cell apoptosis via suppressing PI3K/Akt/NF-κβ pathway (Pollino et al., 2019). However, there is no directive evidence to support the involvement of TCN1, NETO2 and CYP4F3 in OS. Notably, Pax-6 is a highly conserved evolutionarily TF and belongs to paired box TF family (Mansouri & Gruss, 1996). Several studies have pointed out that Pax6 participated in the regulation of cancer cell proliferation and progression (Shyr et al., 2010; Zong et al., 2011). Yang et al. (2019) established a TF-top 20 DEGs regulatory network by integrating and analyzing three GEO datasets (GSE66673, GSE49003 and GSE37552), and found that Pax-6 down-regulated BMP6 expression in non-metastatic OS samples. Taken together, we inferred that CXCL12/TCN1/NETO2/CYP4F3-Pax-6 axis may be implicated in the pathogenesis of OS, and four genes (CXCL12, TCN1, NETO2 and CYP4F3) were novel diagnostic biomakers for OS. METTL7A and LTF are reported to act as tumor suppressor genes (Qi et al., 2017; Zhang et al., 2011; Zhang et al., 2015). Similarly, our findings showed that the expressions of METTL7A and LTF were decreased in OS samples. Moreover, these two genes were both hub genes in PPI analysis and key gene nodes in TF-gene regulatory analysis. These results implied that METTL7A and LTF may be correlated with underlying mechanisms of OS. However, the potential effects of METTL7A and LTF down-regulation on OS progression needs to be further investigated. Although we have identified multiple novel gene signatures associated with OS, there are still limitations in this work. Our conclusion was drawn based on an integrated bioinformatic analysis. Therefore, additional experiments are required to confirm our findings. In addition, a larger sample size verification will also improve the reliability of our conclusion. Moreover, the clinical information should be collected to evaluate the diagnostic value of biomarkers for OS patients. Finally, the biological significances of key biomarkers will be investigated in model systems or cell lines. In summary, a total of 1,059 DEGs were identified between OS and normal samples. Among them, up-regulation of NETO2 and down-regulation of METTL7A, TCN1, and CXCL12 may be potential gene signatures related to OS. Pax-6 was also probably associated with the pathological process of OS. However, a comprehensive bioinformatics analysis with larger sample size and in vivo or in vitro assays should be performed to confirm our results. Click here for additional data file. Click here for additional data file. Click here for additional data file. Click here for additional data file.
  37 in total

1.  Twenty years of follow-up of survivors of childhood osteosarcoma: a report from the Childhood Cancer Survivor Study.

Authors:  Rajaram Nagarajan; Anmmd Kamruzzaman; Kirsten K Ness; Victoria G Marchese; Charles Sklar; Ann Mertens; Yutaka Yasui; Leslie L Robison; Neyssa Marina
Journal:  Cancer       Date:  2010-10-04       Impact factor: 6.860

Review 2.  Emerging roles of the host defense peptide LL-37 in human cancer and its potential therapeutic applications.

Authors:  William K K Wu; Guangshun Wang; Seth B Coffelt; Aline M Betancourt; Chung W Lee; Daiming Fan; Kaichun Wu; Jun Yu; Joseph J Y Sung; Chi H Cho
Journal:  Int J Cancer       Date:  2010-10-15       Impact factor: 7.396

3.  Epigenetic Regulation of CXCL12 Plays a Critical Role in Mediating Tumor Progression and the Immune Response In Osteosarcoma.

Authors:  Hengyuan Li; Zenan Wang; Binghao Li; Zhan Wang; Hao Wu; Mingfeng Xue; Peng Lin; Shengdong Wang; Nong Lin; Xin Huang; Weibo Pan; Meng Liu; Xiaobo Yan; Hao Qu; Lingling Sun; Yan Wu; Wangsiyuan Teng; Xingzhi Zhou; Huabiao Chen; Mark C Poznansky; Zhaoming Ye
Journal:  Cancer Res       Date:  2018-05-07       Impact factor: 12.701

4.  Identification of Key Gene Modules in Human Osteosarcoma by Co-Expression Analysis Weighted Gene Co-Expression Network Analysis (WGCNA).

Authors:  Xiangsheng Liu; Ai-Xin Hu; Jia-Li Zhao; Feng-Li Chen
Journal:  J Cell Biochem       Date:  2017-05-23       Impact factor: 4.429

5.  The pro-inflammatory peptide LL-37 promotes ovarian tumor progression through recruitment of multipotent mesenchymal stromal cells.

Authors:  Seth B Coffelt; Frank C Marini; Keri Watson; Kevin J Zwezdaryk; Jennifer L Dembinski; Heather L LaMarca; Suzanne L Tomchuck; Kerstin Honer zu Bentrup; Elizabeth S Danka; Sarah L Henkle; Aline B Scandurro
Journal:  Proc Natl Acad Sci U S A       Date:  2009-02-20       Impact factor: 11.205

Review 6.  Osteosarcoma: Current Treatment and a Collaborative Pathway to Success.

Authors:  Michael S Isakoff; Stefan S Bielack; Paul Meltzer; Richard Gorlick
Journal:  J Clin Oncol       Date:  2015-08-24       Impact factor: 44.544

7.  Tumor suppressor PAX6 functions as androgen receptor co-repressor to inhibit prostate cancer growth.

Authors:  Chih-Rong Shyr; Meng-Yin Tsai; Shuyuan Yeh; Hong-Yo Kang; Yun-Chao Chang; Pei-Ling Wong; Chao-Cheng Huang; Ko-En Huang; Chawnshang Chang
Journal:  Prostate       Date:  2010-02-01       Impact factor: 4.104

8.  The bone marrow-expressed antimicrobial cationic peptide LL-37 enhances the responsiveness of hematopoietic stem progenitor cells to an SDF-1 gradient and accelerates their engraftment after transplantation.

Authors:  W Wu; C H Kim; R Liu; M Kucia; W Marlicz; N Greco; J Ratajczak; M J Laughlin; M Z Ratajczak
Journal:  Leukemia       Date:  2011-09-20       Impact factor: 11.528

9.  STRING v11: protein-protein association networks with increased coverage, supporting functional discovery in genome-wide experimental datasets.

Authors:  Damian Szklarczyk; Annika L Gable; David Lyon; Alexander Junge; Stefan Wyder; Jaime Huerta-Cepas; Milan Simonovic; Nadezhda T Doncheva; John H Morris; Peer Bork; Lars J Jensen; Christian von Mering
Journal:  Nucleic Acids Res       Date:  2019-01-08       Impact factor: 16.971

10.  Human cathelicidin production by the cervix.

Authors:  Lorraine Frew; Sofia Makieva; Andrew T M McKinlay; Brian J McHugh; Ann Doust; Jane E Norman; Donald J Davidson; Sarah J Stock
Journal:  PLoS One       Date:  2014-08-04       Impact factor: 3.240

View more
  5 in total

1.  Analysis of immune related gene expression profiles and immune cell components in patients with Barrett esophagus.

Authors:  Lin Shi; Renwei Guo; Zhuo Chen; Ruonan Jiao; Shuangshuang Zhang; Xuanxuan Xiong
Journal:  Sci Rep       Date:  2022-06-02       Impact factor: 4.996

2.  Construction and validation of a novel gene signature for predicting the prognosis of osteosarcoma.

Authors:  Jinpo Yang; Anran Zhang; Huan Luo; Chao Ma
Journal:  Sci Rep       Date:  2022-01-24       Impact factor: 4.996

3.  The role of SPI1-TYROBP-FCER1G network in oncogenesis and prognosis of osteosarcoma, and its association with immune infiltration.

Authors:  Jiahua Li; Hui Shi; Zhanyuan Yuan; Zhiheng Wu; Haohao Li; Yuelong Liu; Ming Lu; Ming Lu
Journal:  BMC Cancer       Date:  2022-01-25       Impact factor: 4.430

4.  Identification of key genes and biological pathways in Chinese lung cancer population using bioinformatics analysis.

Authors:  Ping Liu; Hui Li; Chunfeng Liao; Yuling Tang; Mengzhen Li; Zhouyu Wang; Qi Wu; Yun Zhou
Journal:  PeerJ       Date:  2022-01-31       Impact factor: 2.984

5.  Identification of Survival-Related Genes in Acute Myeloid Leukemia (AML) Based on Cytogenetically Normal AML Samples Using Weighted Gene Coexpression Network Analysis.

Authors:  Tingting Chen; Juan Zhang; Yinying Wang; Hebing Zhou
Journal:  Dis Markers       Date:  2022-09-29       Impact factor: 3.464

  5 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.