Junfeng Guo2, Hong Tang1, Pan Huang1, Junfeng Guo2, Youxing Shi1, Chengsong Yuan1, Taotao Liang1, Kanglai Tang1. 1. Department of Orthopaedics/Sports Medicine Center, State Key Laboratory of Trauma, Burn and Combined Injury, Southwest Hospital, Third Military Medical University, Chongqing, China. 2. Department of Stomatology, The 970th Hospital of the Joint Logistics Support Force, Yantai, China.
Abstract
Osteosarcoma is the most common malignant bone tumor in adolescents, and metastasis is the key reason for treatment failure and poor prognosis. Once metastasis occurs, the 5-year survival rate is only approximately 20%, and assessing and predicting the risk of osteosarcoma metastasis are still difficult tasks. In this study, cellular communication between tumor cells and nontumor cells was identified through comprehensive analysis of osteosarcoma single-cell RNA sequencing (scRNA-seq) and bulk RNA-seq data, illustrating the complex regulatory network in the osteosarcoma microenvironment. In line with the heterogeneity of osteosarcoma, we found subpopulations of osteosarcoma cells that highly expressed COL6A1, COL6A3 and MIF and were closely associated with lung metastasis. Then, BCDEG, a reliable risk regression model that could accurately assess the metastasis risk and prognosis of patients, was established, providing a new strategy for the diagnosis and treatment of osteosarcoma.
Osteosarcoma is the most common malignant bone tumor in adolescents, and metastasis is the key reason for treatment failure and poor prognosis. Once metastasis occurs, the 5-year survival rate is only approximately 20%, and assessing and predicting the risk of osteosarcoma metastasis are still difficult tasks. In this study, cellular communication between tumor cells and nontumor cells was identified through comprehensive analysis of osteosarcoma single-cell RNA sequencing (scRNA-seq) and bulk RNA-seq data, illustrating the complex regulatory network in the osteosarcoma microenvironment. In line with the heterogeneity of osteosarcoma, we found subpopulations of osteosarcoma cells that highly expressed COL6A1, COL6A3 and MIF and were closely associated with lung metastasis. Then, BCDEG, a reliable risk regression model that could accurately assess the metastasis risk and prognosis of patients, was established, providing a new strategy for the diagnosis and treatment of osteosarcoma.
Osteosarcoma (OS) is the most common primary malignant bone tumor, accounting for approximately 60% of all pediatric malignancies, and is the second leading cause of cancer-related death in adolescents (1–4). Seper et al. reported that the first confirmed finding of OS occurred in a dinosaur (a specimen of Centrosaurus apertus), dating from approximately 77.0–75.5 million years ago (5). Experts have noted that OS mainly occurs in the proximal tibia, proximal humerus, and distal femur and that its clinical manifestations mainly include local pain, swelling, and limited joint mobility (6, 7). For OS, local therapy alone is insufficient, as 80-90% of all patients with seemingly localized disease will develop metastasis (6).Tumor metastasis is the main cause of treatment failure and death in patients with OS. Metastasis to the lungs is the most common type and has a great impact on the prognosis of patients (8). Because of the extensive application of neoadjuvant chemotherapy, the prognosis of patients with localized OS has been significantly improved, with a 5-year survival rate reaching 60-70%. However, the survival rate of those with pulmonary metastasis is still poor, at only 20-30% (9, 10). This situation highlights the urgency of studying the biological behavior and genetic characteristics of OS metastasis. An accurate description of the process of tumor metastasis will help to enhance our understanding of OS and improve clinical treatment effects and patient survival (11, 12). In previous studies, researchers have utilized a variety of assessment methods to investigate OS. Sheen et al. used two radiomics features to develop a logistic regression imaging model of OS metastasis (13), but its sensitivity was slightly low, which probably led to delayed treatment. Flores et al. assessed the blood‐based biomarkers for prognostication in OS patients (14); however, circulating biomarkers are unstable and susceptible. Tissue biopsy is of great necessity in the diagnosis of OS, and this approach is more reliable than other methods for predicting OS progression and metastasis.The combined application of bone scintigraphy and CT is often used to identify metastases of OS (15). However, imagological examination alone is insufficient, and pulmonary micrometastases are often undiagnosed, leading to poor prognosis (16). Considering the deficiencies in early pulmonary metastasis diagnosis and resulting poor survival, there is an urgent need for pulmonary metastatic biomarkers for OS. Single-cell RNA sequencing (scRNA-seq), through the analysis of transcripts within an individual cell, can distinguish tumor cells from nontumor cells and allow exploration of the interrelationships between cells in the tumor microenvironment. Breakthroughs have been made in the study of tumor cell heterogeneity, proliferation and metastasis (17, 18). OS is highly heterogeneous and has many cell subpopulations with different biological functions that evolve during tumor progression (19). scRNA-seq of OS cells is conducive to identifying new cell types, exploring tumor heterogeneity and revealing different developmental trajectories (20), which can provide new theoretical support for further research on the molecular mechanisms of OS progression and metastasis.
Materials and Methods
Data Sources and Processing
Single-cell sequencing data and clinical information for 11 OS patients were downloaded from GSE152048. The 10X genomics data for each individual sample were loaded by the Seurat package (v4.0.0) in R software (v4.0.2) (21). Low-quality cells with ≤ 300 detected genes or ≥ 10% mitochondrial genes were removed. Feature counts for each cell were divided by the total and multiplied by 10,000. The log(x+1) method was used to perform natural-log transformation. The top 2,000 highly variable genes (HVGs) in the normalized expression matrix were identified, centered, and scaled before we performed principal component analysis (PCA) based on these HVGs. Batch effects were removed with the Harmony package (v1.0) of R based on the top 50 PCA components identified (22).Eighteen tumor-normal pairs of osteosarcoma RNA-sequencing data were downloaded from GSE99671. The edgeR package (v 3.30.3) was used to identify up- and down-regulated genes in osteosarcoma (log2FoldChange >1 and adjusted P-value < 0.1). Bulk RNA-seq and clinical data for 80 OS patients from the TARGET database and 53 from GSE21257 were collected. Patients from TARGET were randomly divided into two groups: a training group (40), and an internal verification group (40). The 53 patients from GSE21257 were used as an external training group. All the raw data are available in the GEO database (https://www.ncbi.nlm.nih.gov/geo/).
Identification of Cell Types
Based on harmony-corrected data, k-nearest neighbors were calculated, and a shared nearest neighbor (SNN) graph was constructed. Then, the modular function was optimized to realize cluster recognition based on the clustering algorithm. The identified clusters were visualized on a 2D map by employing the uniform manifold approximation and projection (UMAP) dimensional reduction technique. We first used the “SingleR” package (R software) as an auxiliary tool to identify cells, and then used the Wilcoxon rank-sum test with Bonferroni correction to identify marker genes for double annotation of cell clusters.
Cell-Cell Communication
Crosstalk between OS cells and other cells within the tumor microenvironment was calculated with the iTALK package (v0.1.0) in R (23). The top 50% of highly expressed genes were uploaded to the ligand-receptor database to determine their locations within an intercellular signaling network.
Gene Set Enrichment Analysis
Gene set enrichment analysis (GSEA) was used to detect whether the PI3K-AKT signaling pathway was significantly enriched in the subpopulations of OS cells. The enrichment score (ES) and statistical significance (nominal P value) of the ES were calculated; a positive ES suggested that PI3K-AKT signaling was activated in a subpopulation, whereas a negative ES suggested that PI3K-AKT signaling was not activated.
Pseudotemporal Ordering of Osteosarcoma Cells
Monocle (v2.16.0) aims to resolve cellular transitions during differentiation through pseudotemporal profiling of scRNA-seq data (24). After inputting the cell-gene matrix into the “newCellDataSet” function with its clustering information, it was computed into a lower dimensional space based on the discriminative dimensionality reduction with trees (DDRTree) method, a more recent manifold learning algorithm, and then OS cells were ordered according to pseudotime.
Generation of the Risk Regression Model
To establish a generalized linear model and minimize the risk of overfitting, we applied the least absolute shrinkage and selection operator (LASSO) penalty in the COX proportional hazards regression model (“glmnet” package, R software). In the modelling process, “lambda” is the variable and causes the gene coefficients to converge. As lambda increases, the coefficients for individual genes gradually approach zero. When the coefficient becomes zero, it means that the expression of the gene no longer affects the risk score and the gene can be removed from the model. We used 10-fold cross-validation to calculate partial-likelihood, a key metric for evaluating model performance. Value of lambda that gives minimum partial-likelihood is optimal, at which the model constructed of candidate genes with nonzero regression coefficients has the best performance. Risk scores for each patient were calculated using the following formula:where RS is the risk score for OS patients; N is the number of genes in the model; and the ith gene expression and coefficient are denoted by e
i and c
i respectively. The independent variables are the expression of the modelling genes for each patient, and the dependent variable is the risk score.According to the median scores for each group, the patients were divided into high-risk and low-risk subgroups.
Cell Culture
The human OS cell line HOS was kindly provided by Procell Life Science & Technology Co., Ltd (Wuhan, China) and cultured in Dulbecco’s Modified Eagle Medium/Nutrient Mixture F-12 (DMEM/F-12; Gibco, Grand Island, NY, USA) supplemented with 10% fetal bovine serum (FBS; HyClone, South Logan, UT, USA) and 1% penicillin-streptomycin (Gibco, Grand Island, NY, USA) at 37°C with 5% CO2.
SiRNA Transfect Cells
OS cells were grown to 60-80% confluence. In all, 30 pmol EFEMP2 siRNA, GALNT14 siRNA or negative control siRNA (GenePharma, Shanghai, China) was diluted to 150 µl Opti-MEM® Reduced-Serum Medium (Gibco, Grand Island, NY, USA) respectively and mixed with the diluted Lipofectamine® RNAiMAX Reagent (1:1 ratio; Thermo Fisher Scientific, Waltham, MA, USA). The siRNA-lipid complex was incubated for 5 minutes at room temperature and then added to the cells.
Quantitative RT-PCR
Total RNA was extracted using the SimplyP RNA Kit (Bioer, Hangzhou, China), and then reversed transcription was carried out using the PrimeScript RT reagent Kit (TaKaRa, Tokyo, Japan). Quantitative RT-PCR was performed with SsoAdvanced Universal SYBR Green Supermix reagent (Bio-Rad, Hercules, CA, USA) on a PCR system (Bio-Rad, Hercules, CA, USA) per the manufacturer’s instructions. The primers were designed and synthesized by Sangon Biotech (Shanghai, China), and the sense and antisense primers are listed in
.
Wound Healing
HOS cells (5x105) were seeded in 6-well plates and transfected by siRNA described above. When the cells reached 90-95% confluence as a monolayer, a scratch wound was made as a marked line in the cultures using a sterile pipette tip. Wells were washed three times with phosphate-buffered saline (PBS; 0.01 M) to remove loose or dead cells, and fresh serum-free medium was added. The cells were photographed at 0-, 12- and 24-hour time points after treatment. The wound area was quantitated using ImageJ software (25).
Statistical Analysis
Bilateral tests were performed for statistical tests. A P value less than 0.05 was considered statistically significant. R software version 4.0.0 (https://www.r-project.org/) was used for analysis. Some R packages, including “dplyr”, “Seurat”, “cowplot”, “Matrix”, “ggplot2”, “reshape2”, “iTALK”, “monocle”, “harmony”, and “survival”, were used in this study.
Results
Cell Type Identification in the Osteosarcoma Microenvironment
A total of 11 OS patients with scRNA-seq data were involved in this study (seven primary patients: P01-07, two recurrent patients: P08-09, and two lung-metastatic patients: P10-11) (
). After quality control and removal of the batch effect between samples, we used the UMAP technique to reduce dimensionality, and graphed the output on a 2D scatter plot (
). A total of 109,415 single cells were unbiasedly clustered into 14 major identities, and cluster-specific markers were used to annotate cell types: OS cells (COL1A1, CXCL12, MEPE and COL2A1, including osteoblastic OS cells and chondroblastic OS cells); macrophages (HLA-DRA, CD68 and CD14); mesenchymal stem cells (MSCs) (NES, HMGB2 and CCNB2); T cells (CCL5 and CD69); endothelial cells (CD34 and PECAM1); pericytes (ACTA2 and RGS5); B cells (JCHAIN and MZB1) and myoblasts (MYOG) (
). Detailed annotation information and references are given in the
.
Figure 1
scRNA-seq profiling of OS tumor microenvironments. (A) Uniform manifold approximation and projection (UMAP) visualization of cells clustered and color-coded according to the patient (left panel) or disease state (right panel). (B) UMAP plot showing annotated and color-coded cell types in the OS microenvironment. (C) Heatmap showing the expression of marker genes in the indicated cell types. The top bar color indicates the specific cell type cluster. (D) UMAP plot showing expression levels of marker genes for specific cell types.
scRNA-seq profiling of OS tumor microenvironments. (A) Uniform manifold approximation and projection (UMAP) visualization of cells clustered and color-coded according to the patient (left panel) or disease state (right panel). (B) UMAP plot showing annotated and color-coded cell types in the OS microenvironment. (C) Heatmap showing the expression of marker genes in the indicated cell types. The top bar color indicates the specific cell type cluster. (D) UMAP plot showing expression levels of marker genes for specific cell types.
Cellular Communication
To investigate and visualize the cell-cell interaction network in the OS microenvironment, the iTALK package of R was used to analyze the ligand-receptor interactions in different cell types (23). Based on the primary function of the ligand, iTALK further classifies intercellular crosstalk into 4 categories: checkpoint, cytokine, growth factor and other. In terms of the checkpoint category, OS cells of clusters 1 and 6 expressed higher levels of CD24, which could interact with the SIGLEC10 receptor of cluster 0 macrophages to regulate the antitumor immune response of macrophages and promote immune evasion (26). The cytokine category showed internal connections among macrophages. CCL3, CCL4 and CCL3L1 secreted by cluster 0 and 12 macrophages could interact with CCR1, which was highly expressed on the surface of cluster 4 macrophages, promoted their migration, and regulated the inflammatory response (27). Regarding the growth factor category, we also found that cluster 6 OS cells likely produce VEGFA, which binds to ITGB1 receptors on endothelial cells (ECs) to regulate their migration and promote new vessel formation and tumor metastasis (28–30). (
)
Figure 2
Circos plot showing cellular crosstalk within the OS microenvironment. The outside ring of the circos plot displays cell clusters, and the inside shows the details of each interacting ligand-receptor pair. The lines and arrowheads inside the circos plot were scaled to indicate the relative signal strength of the ligand and receptor, respectively, and different colors and types of lines were used to illustrate various categories.
Circos plot showing cellular crosstalk within the OS microenvironment. The outside ring of the circos plot displays cell clusters, and the inside shows the details of each interacting ligand-receptor pair. The lines and arrowheads inside the circos plot were scaled to indicate the relative signal strength of the ligand and receptor, respectively, and different colors and types of lines were used to illustrate various categories.
Identification of Different Invasive Osteosarcoma Cells
To study the heterogeneity among OS cells and find new cell subpopulations closely related to OS metastasis, data for 53,441 OS cells were extracted, and the cells were clustered into nine subclusters (named Osteosarcoma_0 ~ Osteosarcoma_8), and Osteosarcoma_3 and 8 were chondroblastic and the rest were osteoblastic cells (
and
). As illustrated in
, the genes CADM1, LINC00662, and RUVBL1, which have been shown to inhibit OS metastasis (31–33), were highly expressed by the cells in subclusters 6 and 7. In addition, Osteosarcoma_8 cells showed high expression levels of COL6A1, COL6A3 and MIF, and these genes have been confirmed to be associated with metastasis (34–37). In addition, many genes that play important roles in osteosarcoma display distinct expression patterns between Osteosarcoma_6, 7 and 8 (
).
Figure 3
Heterogeneity of OS cells. (A) Nine main OS cell subclusters were identified by UMAP analysis. (B) Violin plot showing expression levels of some vital genes that promoted or inhibited OS metastasis. (C) Dot plots showing expression of the nine signature genes across the nine subcellular clusters. (D) GSEA results regarding enrichment of the PI3K-AKT signaling pathway in subclusters 6, 7 and 8. (E) Pseudotemporal trajectory plot showing the dynamics of OS subclusters. (F) Heatmap showing expression alterations of stem cell-related genes under different cell fates defined by pseudotime analysis. (G) Pseudotime of OS cells. (H) Differentiation trajectories of subclusters 6, 7 and 8.
Heterogeneity of OS cells. (A) Nine main OS cell subclusters were identified by UMAP analysis. (B) Violin plot showing expression levels of some vital genes that promoted or inhibited OS metastasis. (C) Dot plots showing expression of the nine signature genes across the nine subcellular clusters. (D) GSEA results regarding enrichment of the PI3K-AKT signaling pathway in subclusters 6, 7 and 8. (E) Pseudotemporal trajectory plot showing the dynamics of OS subclusters. (F) Heatmap showing expression alterations of stem cell-related genes under different cell fates defined by pseudotime analysis. (G) Pseudotime of OS cells. (H) Differentiation trajectories of subclusters 6, 7 and 8.PI3K-AKT signaling is one of the key pathways promoting the metastasis in OS, and inhibition of PI3K-AKT signaling yields enhanced antitumor activity, impaired tumor growth, and reduced migratory and metastatic potential (38, 39). The GSEA results revealed that the PI3K-AKT pathway was activated in Osteosarcoma_8 (ES=0.35, P-value=0.004) but was inhibited in Osteosarcoma_6 (ES=-0.40, P-value=0.004) and Osteosarcoma_7 (ES=-0.39, P-value=0.001) (
). We further constructed an osteosarcoma-related gene set using the results of the edgeR package and performed GSEA enrichment analysis. Similarly, the results also confirmed that there were significant differences in cell states between Osteosarcoma _6, 7 and 8 (
). Thus, we demonstrated that subcluster 8 contained aggressive cells, while the cells in subclusters 6 and 7 showed limited associations with metastasis.
Defining the Transcriptional Dynamics of Osteosarcoma
Next, we explored the dynamics and regulators of cell fate decisions in OS cells with Monocle 2. The pseudotime trajectory was visualized and is shown in
. Strikingly, we noticed that Osteosarcoma_4 was mainly located at the beginning of the trajectory path, with high expression of many stem and progenitor cell marker genes (CD34, Oct-4, NGFR, SSEA3 and Nanog) (40–44). A set of stem cell-related genes were collected from the “CellMarker” database (http://biocc.hrbmu.edu.cn/CellMarker/index.jsp), and their expression gradually decreased with the cell trajectory (
). Therefore, Osteosarcoma_4 was identified as the principal progenitor. As cells progressed along a differentiation trajectory, they diverged along two separate paths, which represented two distinctive cell fates (
). Interestingly, Osteosarcoma_8, which represented the more aggressive subpopulation, mainly followed the right differentiation trajectory after the branch point. In contrast, the metastatically indolent OS cells in Osteosarcoma_6 and Osteosarcoma_7 predominantly followed the left trajectory (
). In summary, Osteosarcoma_8 OS cells showed different differentiation states from the cells of Osteosarcoma_6 and Osteosarcoma_7, which may be the reason for their different levels of invasiveness.
Establishment of the Risk Regression Model
Differential analysis was performed between weakly metastatic subclusters (Osteosarcoma_6 and Osteosarcoma_7) and the highly invasive subcluster (Osteosarcoma_8) of OS cells, and 813 metastasis-related genes were identified. As described in Materials and Methods, BCDEG, a model considering five metastasis-related genes (GALNT14, BNIP3, EFEMP2, CPE and DKK1), was developed (
). Previous studies have confirmed that silencing BNIP3 and DKK1 can effectively inhibit OS proliferation and promote apoptosis (45–47) and that CPE overexpression enhances OS growth, migration, and invasion (48, 49). Since the roles of EFEMP2 and GALNT14 in OS have not been explored, which is worthy of further study. We used siRNA to significantly suppress the expression of EFEMP2 and GALNT14 (
). Wound healing analysis showed that, within 24 hours, the scratches in the negative control group were close to healing, while those in the siRNA-EFEMP2 and the siRNA-GALNT14 groups were still relatively obvious (
). These findings suggest that inhibiting EFEMP2 and GALNT14 could decrease the invasion and metastasis ability of OS.
Figure 4
Establishment of the BCDEG model. (A) The error rate for model cross-validation. Each lambda value with error bars gives a confidence interval for the cross-validation error rate. The vertical dashed line on the left represents the minimum error, and the number of genes in each model is given at the top of the plot. (B) A model considering prognosis-related genes was produced by maximizing the appropriate penalized log-likelihood. Lasso penalty leads to convergence of gene coefficients, and the colored line shows the trend of gene coefficients as a function of lambda. The number of genes with nonzero regression coefficients is shown at the top of the plot. (C) siRNA-EFEMP2 significantly inhibited the expression of EFEMP2. (D) siRNA-GALNT14 significantly inhibited the expression of GALNT14. (E) Degree of wound healing at 0, 12 and 24 hours; the red dotted lines represent the edge of the scratch. (F) Survival curve for the training group. The prognosis of the low-risk group was better than that of the high-risk group. (G) Survival status plot of the training group ranked according to risk score. **p < 0.01 and ****p < 0.0001.
Establishment of the BCDEG model. (A) The error rate for model cross-validation. Each lambda value with error bars gives a confidence interval for the cross-validation error rate. The vertical dashed line on the left represents the minimum error, and the number of genes in each model is given at the top of the plot. (B) A model considering prognosis-related genes was produced by maximizing the appropriate penalized log-likelihood. Lasso penalty leads to convergence of gene coefficients, and the colored line shows the trend of gene coefficients as a function of lambda. The number of genes with nonzero regression coefficients is shown at the top of the plot. (C) siRNA-EFEMP2 significantly inhibited the expression of EFEMP2. (D) siRNA-GALNT14 significantly inhibited the expression of GALNT14. (E) Degree of wound healing at 0, 12 and 24 hours; the red dotted lines represent the edge of the scratch. (F) Survival curve for the training group. The prognosis of the low-risk group was better than that of the high-risk group. (G) Survival status plot of the training group ranked according to risk score. **p < 0.01 and ****p < 0.0001.The BCDEG model was applied to predict survival in the training group, and the results of Kaplan-Meier analysis showed that the risk score was significantly and negatively associated with the prognosis of OS patients (
). When patients were divided into groups according to risk score, the mortality rate was significantly higher in the high-risk group than the low-risk group, and almost all patients with the highest risk scores died with a very short survival period (
, P=0.001).
Excellent Performance of BCDEG
The remaining patients were used to verify the BCDEG model, and similar results were observed in the internal (
) and external validation groups (
). Time-dependent ROC curves were produced to assess the prognostic ability of BCDEG, and the AUCs at 5 years were 0.838, 0.724, and 0.729 in the training, internal validation, and external validation groups, respectively (
), indicating that the BCDEG model had a great ability to predict the survival of OS patients. We also wondered whether BCDEG could predict OS metastasis. We generated a box plot, which showed that the risk scores of metastatic patients were significantly higher than those of localized OS patients (
), and the sensitivity and specificity of prediction were 0.81 and 0.74, respectively.
Figure 5
Evaluation of the BCDEG model. (A) Survival curve for the internal verification group. (B) Survival curve for the external verification group. (C) Time-dependent ROC curves for three groups. The AUCs of the 3 groups were all greater than 0.7. (D) Box plot showing the difference in risk score between metastatic and localized OS patients. (E) Survival curve for the BCDEG model to predict the prognosis of liver hepatocellular carcinoma and colon adenocarcinoma patients. (F) Snapshots of the BCDEG web app.
Evaluation of the BCDEG model. (A) Survival curve for the internal verification group. (B) Survival curve for the external verification group. (C) Time-dependent ROC curves for three groups. The AUCs of the 3 groups were all greater than 0.7. (D) Box plot showing the difference in risk score between metastatic and localized OS patients. (E) Survival curve for the BCDEG model to predict the prognosis of liver hepatocellular carcinoma and colon adenocarcinoma patients. (F) Snapshots of the BCDEG web app.We also applied the BCDEG model for patients with liver hepatocellular carcinoma and colon adenocarcinoma from The Cancer Genome Atlas (TCGA). The results of TCGA-LIHC and TCGA-COAD stand in sharp contrast to OS patients, as that there were no significant differences between the survival of high-risk and low-risk groups (
), indicating that this model is specific for OS.To enable clinicians to have convenient access to this model, an open-source web app was generated. Users can calculate multiple patient risk scores online to evaluate prognosis and guide clinical treatment. The web app can be accessed at http://39.108.85.1:8072/common/home (
).
Discussion
OS is the most common primary malignant bone tumor in adolescents and children. A high rate of distant metastasis is its characteristic feature, and the occurrence of metastasis has a great impact on the prognosis of patients (50). The rapid development of scRNA-seq technology has enabled research on highly heterogeneous tumors, including OS, which is helpful for understanding the tumor microenvironment and exploring new cell populations (51).In this study, we identified alterations in single OS cells through comprehensive analysis of two different types of sequencing data, isolated 14 major cell populations in OS tissue, and annotated 8 cell types by extracting characteristic genes. By applying intercellular communication analysis software, we collected 26 reliable ligand-receptor signaling cell pairs and characterized the complex regulatory network in the multicellular tumor microenvironment. OS cells act on SIGLEC10 on macrophages and the ITGB1 receptor on the surface of ECs to achieve immune escape and promote angiogenesis (26, 52, 53), which provides a new therapeutic approach for inhibiting the growth and metastasis of OS.Deeply exploring the high heterogeneity of OS cells and finding new subpopulations with different biological functions have become goals of research (54–56). We further subdivided OS cells into 9 subpopulations. One subpopulation was highly invasive, whereas two were found to be metastatically inert by annotation of metastasis-related markers and pathway enrichment analysis. The developmental trajectory analysis showed that these two kinds of subpopulations had different differentiation fates during development. A highly aggressive subpopulation is often considered to play a key role in OS metastasis (57). Exploring the mechanisms underlying the origination and development of this population, further studying how to regulate or change the differentiation fate of OS cells, and promoting the differentiation of tumor cells toward a phenotype with low invasiveness are of great significance for effectively inhibiting the metastasis of OS and improving the prognosis of patients.ScRNA-seq gets more detailed information at the cellular level, and bulk RNA-seq and scRNA-seq have certain consistency. Combining the scRNA-seq with existing bulk data from large cohorts can help decipher the molecular mechanisms of OS progression and metastasis (58). Based on the above research results, we established a metastasis-associated risk assessment model for OS, which considered five genes: BNIP3, CPE, DKK1, EFEMP2 and GALNT14. The excellent effectiveness and specificity of this model were demonstrated not only in the training group, internal and external validation groups, but also via validation in patients with other types of tumors. Sheen et al. (13) developed an OS metastatic risk prediction model using metabolic imaging phenotypes, but its sensitivity was only 0.63. In general, sensitivity is important for OS risk models because metastasis is extremely unfavorable for the prognosis of OS patients, and the BCDEG model improves this situation. Overall, the model could effectively predict prognosis and evaluate the risk of tumor metastasis, providing a theoretical basis for the development of personalized diagnosis and treatment plans for patients.In terms of the limitations of these data, there were relatively few OS scRNA-seq samples, and there was a lack of corresponding normal tissue samples. Analysis of additional tumor and normal samples would be conducive to eliminating the heterogeneity caused by individual differences. Nevertheless, we provide moderate evidence that there are strong and meaningful associations between different cell types in the OS microenvironment, with different subpopulations of OS cells performing different biological functions.In summary, our research in-depth analyzed RNA-seq and bulk RNA-seq data in a complementary way to explore the intercellular relationships and internal mechanisms that affect the occurrence, progression and metastasis of OS. Moreover, BCDEG, an effective, reliable and easy-to-use prognostic model, was established. These results will aid in accurate treatment of OS and further improve the therapeutic effect and 5-year survival rate of patients.
Data Availability Statement
The original contributions presented in the study are included in the article/
. Further inquiries can be directed to the corresponding authors.
Author Contributions
Conceptualization, KT and TL. Methodology, KT and CY. Software, JG (1st author). Validation, JG (1st author) and HT. Formal analysis, HT and PH. Investigation, JG (4th author). Resources, YS. Data curation, JG (4th author) and PH. Writing-original draft preparation, JG (1st author) and JG (4th author). Writing-review and editing, KT. Visualization, JG (1st author) and TL. Supervision, CY. Project administration, TL. Funding acquisition, KT. All authors contributed to the article and approved the submitted version.
Funding
This study was supported by the Personalization Training Program for the Training Object of the Outstanding Talents of Army Medical University (2020, 4139Z2C2) and Chief Medical Expert of Chongqing (2019, 414Z333).
Conflict of Interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Publisher’s Note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
Authors: Seper Ekhtiari; Kentaro Chiba; Snezana Popovic; Rhianne Crowther; Gregory Wohl; Andy Kin On Wong; Darren H Tanke; Danielle M Dufault; Olivia D Geen; Naveen Parasu; Mark A Crowther; David C Evans Journal: Lancet Oncol Date: 2020-08 Impact factor: 41.316
Authors: Michael R Hicks; Julia Hiserodt; Katrina Paras; Wakana Fujiwara; Ascia Eskin; Majib Jan; Haibin Xi; Courtney S Young; Denis Evseenko; Stanley F Nelson; Melissa J Spencer; Ben Van Handel; April D Pyle Journal: Nat Cell Biol Date: 2017-12-18 Impact factor: 28.824
Authors: Sara R Martins-Neves; Áurio O Lopes; Anália do Carmo; Artur A Paiva; Paulo C Simões; Antero J Abrunhosa; Célia M F Gomes Journal: BMC Cancer Date: 2012-04-04 Impact factor: 4.430
Authors: Xuan D Ho; Hoang G Nguyen; Le H Trinh; Ene Reimann; Ele Prans; Gea Kõks; Katre Maasalu; Van Q Le; Van H Nguyen; Nghi T N Le; Phuong Phung; Aare Märtson; Freddy Lattekivi; Sulev Kõks Journal: Front Genet Date: 2017-11-30 Impact factor: 4.599
Authors: Sam Behjati; Patrick S Tarpey; Kerstin Haase; Hongtao Ye; Matthew D Young; Ludmil B Alexandrov; Sarah J Farndon; Grace Collord; David C Wedge; Inigo Martincorena; Susanna L Cooke; Helen Davies; William Mifsud; Mathias Lidgren; Sancha Martin; Calli Latimer; Mark Maddison; Adam P Butler; Jon W Teague; Nischalan Pillay; Adam Shlien; Ultan McDermott; P Andrew Futreal; Daniel Baumhoer; Olga Zaikova; Bodil Bjerkehagen; Ola Myklebost; M Fernanda Amary; Roberto Tirabosco; Peter Van Loo; Michael R Stratton; Adrienne M Flanagan; Peter J Campbell Journal: Nat Commun Date: 2017-06-23 Impact factor: 14.919