Literature DB >> 36082002

Construction of a prognostic model related to copper dependence in breast cancer by single-cell sequencing analysis.

Abstract

Purpose: To explore the clinical significance of copper-dependent-related genes (CDRG) in female breast cancer (BC).
Methods: CDRG were obtained by single-cell analysis of the GSE168410 dataset in the Gene Expression Omnibus (GEO) database. According to a 1:1 ratio, the Cancer Genome Atlas (TCGA) cohort was separated into a training and a test cohort randomly. Based on the training cohort, the prognostic model was built using COX and Lasso regression. The test cohort was used to validate the model. The GSE20685 dataset and GSE20711 dataset were used as two external validation cohorts to further validate the prognostic model. According to the median risk score, patients were classified as high-risk or low-risk. Survival analysis, immune microenvironment analysis, drug sensitivity analysis, and nomogram analysis were used to evaluate the clinical importance of this prognostic model.
Results: 384 CDRG were obtained by single-cell analysis. According to the prognostic model, patients were classified as high-risk or low-risk in both cohorts. The high-risk group had a significantly worse prognosis. The area under the curve (AUC) of the model was around 0.7 in the four cohorts. The immunological microenvironment was examined for a possible link between risk score and immune cell infiltration. Veliparib, Selumetinib, Entinostat, and Palbociclib were found to be more sensitive medications for the high-risk group after drug sensitivity analysis.
Conclusion: Our CDRG-based prognostic model can aid in the prediction of prognosis and treatment of BC patients.

Entities: Chemical

Keywords: bioinformatics; breast cancer; copper dependence; prognostic model; single-cell sequencing analysis

Year: 2022 PMID： 36082002 PMCID： PMC9445252 DOI： 10.3389/fgene.2022.949852

Source DB: PubMed Journal: Front Genet ISSN： 1664-8021 Impact factor: 4.772

Introduction

As the most common malignancy, BC accounts for 15.5% of female cancer deaths worldwide, which poses a heavy burden on global health (Sung et al., 2021). Based on genomic and transcriptomic sequences, BC can be classified into five molecular subtypes (Rainey et al., 2020). However, due to the heterogeneity of BC, although multiple prognostic tools have been developed, none of them can predict the prognosis of all types of BC (Krop et al., 2017; Faria et al., 2021). Moreover, the overall prognosis of BC is poor, especially for advanced patients (Tao et al., 2015). Advanced BC with distant organ metastases is considered incurable (Harbeck et al., 2019). Therefore, finding novel prognostic factors and treatment targets for BC to guide clinical practice is critical. Multiple cells in BC now can be studied accurately due to the advances in single-cell sequencing, which is a strong method for characterizing diverse cell types, and has been used to study a variety of cancers (Treutlein et al., 2016; Ziegenhain et al., 2017). At the same time, through cell clustering and annotation, we can understand the cellular differentiation and immune mechanisms of BC better (Hwang et al., 2018). Defects in the execution of cell death by tumor cells are one of the main reasons for their resistance to therapy (Hassannia et al., 2019). As a form of regulated cell death, copper-dependent death occurs through the direct binding of copper to fatty acylation components of the tricarboxylic acid cycle (TCA) (Tsvetkov et al., 2022). Copper has two roles in carcinogenesis: it promotes tumor development while also causing redox stress in cancer cells (Maung et al., 2021). High levels of copper promote drug resistance and repair of damaged DNA in cancer cells through the induction of MDC1 expression by copper chaperones (Jin et al., 2022). It has been shown that reducing copper uptake by knocking out human copper transporter protein 1 can inhibit prostate cancer cell proliferation and tumor growth (Xie and Peng, 2021). The study by Teng et al. also confirmed that copper deficiency may be a novel approach to the treatment of pancreatic cancer (Yu et al., 2019). Besides, copper can also regulate proteins involved in evading immune responses, such as the transmembrane protein programmed death ligand 1. Interaction between PD-L1 and PD-1 receptors on cytotoxic T lymphocytes prevents immune cells from attacking cancer cells (Voli et al., 2020). Therefore, limiting the availability of copper during carcinogenesis may be one way to slow cancer progression (Shanbhag et al., 2021). It is crucial to explore the role of CDRG in cancer. Nevertheless, whether these CDRG are associated with the prognosis of BC patients is uncertain. Herein, we first identified CDRG in BC by single-cell sequencing. Based on these CDRG, we constructed a prognostic model which could evaluate the prognosis of BC patients accurately. At the same time, the immune microenvironment and medication sensitivity of BC are likewise linked to CDRG. This study informed the treatment strategy for BC.

Methods

A flow chart of our work was shown in Figure 1.

FIGURE 1

The flow chart of data collection and analysis in this study.

Data collection

The “TCGAbiolinks” R package was used to download TCGA data (TCGA-BRCA; URL: https://portal.gdc.cancer.gov/; data: 31 May 2022; Version: v33.1). The TCGA database was used to download transcriptome and clinical data. The workflow type we used was Counts. 10 copper-dependent genes (Negative hits: MTF1, GLS, CDKN2A; Positive hits: FDX1, LIAS, LIPT1, DLD, DLAT, PDHA1, PDHB) were obtained from the study by Tsvetkov et al. (2022). We downloaded the BC single-cell sequencing dataset GSE168410 (Kester et al., 2022) from the GEO database (https://www.ncbi.nlm.nih.gov/geo/). We also downloaded two databases GSE20685 (Kao et al., 2011) and GSE20711 (Dedeurwaerder et al., 2011) from the GEO database as the external validation cohorts to validate our model.

Data processing of the GSE168410

First, we performed quality control on the data. As patient10 had less single-cell sequencing data, we excluded patient10 and used single-cell sequencing data from the remaining 11 patients for subsequent analysis. Cells with fewer than 5% mitochondrial genes and a total amount of genes over 300 were kept. Genes expressed in at least three cells were kept. Stacked histograms were used to show the proportion of cells in each sample. We screened out the 6,000 most fluctuating genes according to their fluctuating degrees in all samples. We used the “CellCycleScoring” function to judge the selected cell cycle and used the “ScaleData” function to eliminate the effect caused by the cell cycle. The LogNormalize method was used to normalize and integrate the samples. After the data was corrected, principal component analysis was used for dimensionality reduction of the data, and TSNE was used for cluster analysis. We annotated cell types using the “SingleR” package. We download the singler database, load “ref_Human_all.Rdata” into the environment, and define cell subsets according to the singler algorithm. And after annotating the cells, the differential genes of each cluster were obtained by FindAllMarkers detection. After importing copper-dependent genes, the proportion of them in each cell was calculated by the PercentageFeatureSet function. According to the median ratio of copper-dependent genes, we divided the cells into low_cuproptosis and high_cuproptosis cells. Then, we use the FindMarkers function to find the differential genes of low_cuproptosis and high_cuproptosis cells, and filter the genes to screen out the genes whose p-value is less than 0.05. We defined these genes as copper-dependence-related genes (CDRG).

Data processing of the TCGA

First, the data downloaded were preprocessed and combined using the Perl language to access the count file. The gene symbol was also transformed with Perl. Then, the corresponding gene expression was acquired by matching the transcriptome data from TCGA with CDRG. We excluded patients with incomplete clinical data and those with 0 days of follow-up. We then performed a subgroup analysis of the patients. We matched the CDRG expression data with the survival data, performed a univariate COX analysis, and screened out genes with a p-value less than 0.05, which were prognostically significant genes. The forest diagram was used to show the prognostic genes.

Construction of the prognostic model and nomogram

We used the “caret” package to randomly split the matched cohort into a training cohort and a test cohort in a 1:1 ratio. Subsequently, the prognostic CDRG were selected by the least absolute and selection operator (LASSO) regression. We then calculated the risk score for each patient and built a prognostic model. We divided BC patients into high- and low-risk groups based on median score. Between the two groups, we utilized a clinical correlation heatmap to analyze differences in clinical features and examined disparities in patient outcomes. The survival differences were then verified using a log-rank test. Thereafter, univariate and multivariate cox analyses were performed with the risk score and different clinical information. Subsequently, we plotted the time-dependent receiver operating characteristic (ROC) plots and calculated the area under the curve (AUC) to validate the predictive power of the constructed prognostic model. We combined clinical data and patient risk scores to construct a nomogram to further analyze the prognosis of BC patients. Finally, the nomogram’s accuracy in estimating patient outcomes was evaluated by prognostic ROC curves.

External validation of the prognostic model

GSE20685 cohort and GSE20711cohort were selected as two external validation cohorts. In both external validation cohorts, risk scores for each sample were calculated according to the formula of the model, and patients were divided into high- and low-risk groups based on the median. Next, survival analysis was performed to determine whether there was also a difference in prognosis between the two groups in the external validation cohorts. The ROC curve was used to evaluate the accuracy of the model.

Functional enrichment analysis

We performed the Gene Ontology (GO) analysis and the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis by the “clusterProfiler” package. We also performed the gene set variation analysis (GSVA) by the “GSVA” package. The results were kept if the p-value < 0.05. The bar charts were used to represent the results of the analysis.

Immunoassay and m6A analysis

To investigate the correlation between our risk model and the level of tumor infiltration, we designed the immune infiltration heatmap and the correlation map to visualize our data. The tumor infiltration methods we used were TIMER, CIBERSORT, QUANTISEQ, MCPCOUNTER, XCELL, and EPIC. The literature provided us with a list of m6a-related (Huang et al., 2020; Tang et al., 2020; Li et al., 2021) and immune checkpoint-related genes (Tian et al., 2020; Jiang et al., 2021; Shimada et al., 2021; Song et al., 2021). Boxplots were used to display the results of the analysis.

Drug sensitivity analysis

We used the expression matrix and drug processing information from the “Cancer Genome Project” (CGP, https://www.cancerrxgene.org/) to obtain the drugs associated with the model genes using the “pRROpheticPredict” function (Geeleher et al., 2014).

Results

Analysis of the GEO dataset

Supplementary Figure S1A shows the amount of gene expression per cell, the ratio of mitochondrial genes and CDRG in 11 samples. Cells were evenly distributed among the 11 samples. The number of genes and their expression levels are positively correlated (Supplementary Figure S1B). We marked the top 10 genes out of 6,000 hypervariable genes in red (Supplementary Figure S1C). We then integrated the 11 samples. The result showed that the integration could be used for subsequent analysis. After PCA dimensionality reduction, using the TSNE clustering technique, we divided all cells into 13 groups and annotated all cells. According to the surface marker genes of different cell types, the cells were annotated as Fibroblasts, MSC, Epithelial_cells, Tissue_stem_cells, Monocyte, Endothelial_cells, and T_cells (Figure 2A). Figure 2B shows the ratio of different cells in each patient. Then after using the “PercentageFeatureSet” function to input 10 copper-dependent genes, the proportion of them in each cell was obtained. According to the median ratio of copper-dependent genes, we divided the cells into low_cuproptosis and high_cuproptosis cells (Figures 2C,D). The cut-off value we used was 0.04336513. We found that the distribution of low_cuproptosis cells and high_cuproptosis cells in each cell cluster was relatively uniform (Figure 2E). Finally, between the two groups, we analyzed the differentially expressed genes and identified 384 CDRG.

FIGURE 2

Single-cell sequencing analysis (A,B), distribution of high_cuproptosis cells and low_cuproptosis cells (C,D,E), subgroup analysis (E), and univariate COX analysis (G). (A) Single-cell sequencing analysis of the GSE168410 dataset (n = 11). According to the surface marker genes of different cell types, the cells were annotated as Fibroblasts, MSC, Epithelial_cells, Tissue_stem_cells, Monocyte, Endothelial_cells, and T_cells. (B) The cell ratios of every sample at the single-cell level after quality control. (C,D) Distribution of high_cuproptosis cells and low_cuproptosis cells. (E) The distribution of high_cuproptosis cells and low_cuproptosis cells in each cell cluster was relatively uniform. (F) The BC patients were divided into five subgroups, including luminal A (46.86%), luminal B (17.62%), Her2-enriched (6.73%), basal-like (15.96%), and normal-like (12.82%). (G) Univariate COX analysis of the TCGA cohort. We selected 47 genes with prognostic significance (p < 0.05). Blue represented low-risk CDRG and red represented high-risk CDRG.

Analysis of the TCGA dataset

After subgroup analysis, the patients were divided into five subgroups (Figure 2F), including luminal A (46.86%), luminal B (17.62%), Her2-enriched (6.73%), basal-like (15.96%), and normal-like (12.82%). After matching transcriptomic and clinical data of CDRG in the TCGA database, we performed independent prognostic analysis, resulting in 47 genes with prognostic significance. Figure 2G shows the CDRG associated with prognosis.

Construction and evaluation of prognostic model

After performing the Lasso regression analysis, we screened 16 genes from CDRG associated with prognosis in the training cohort (Figures 3A,B). At the same time, we calculated and recorded the risk score for each patient. The risk score = DLAT*0.070163 + SNX3*0.488279 + TTC3*0.028921 + PHF20*0.107077 + RTN4*0.126126 + SURF4*0.084301 + SDC1*0.155383 + KDELR2*0.366251 + BAMBI*0.037788-ANXA5*0.47211-RBP1*0.1076-TPT1*0.07754 + MARVELD1* 0.198669—MDK*0.05575-RPLP1*0.07644-ETV6*0.09076. BC patients were divided into high-risk and low-risk groups based on the median risk score (54.869275). Table 1 showed the screened genes and their coefficients.

FIGURE 3

LASSO regression analysis. (A,B) Lasso regression was used to construct prognostic signatures in the training cohort (n = 525). When Lamda was 16, the curve converged.

TABLE 1

Genes used for model building and their Coefficients.

Gene	Coefficients
DLAT	0.070163
SNX3	0.488279
TTC3	0.028921
PHF20	0.107077
RTN4	0.126126
SURF4	0.084301
SDC1	0.155383
KDELR2	0.366251
BAMBI	0.037788
ANXA5	−0.47211
RBP1	−0.1076
TPT1	−0.07754
MARVELD1	0.198669
MDK	−0.05575
RPLP1	−0.07644
ETV6	−0.09076

Genes and their coefficients used to construct prognostic models. The risk score = DLAT*0.070163 + SNX3*0.488279 + TTC3*0.028921 + PHF20*0.107077 + RTN4* 0.126126 + SURF4*0.084301 + SDC1*0.155383 + KDELR2*0.366251 + BAMBI*0.037788-ANXA5*0.47211-RBP1*0.1076-TPT1*0.07754 + MARVELD1*0.198669–MDK*0.05575-RPLP1*0.07644-ETV6*0.09076.

LASSO regression analysis. (A,B) Lasso regression was used to construct prognostic signatures in the training cohort (n = 525). When Lamda was 16, the curve converged. Genes used for model building and their Coefficients. Genes and their coefficients used to construct prognostic models. The risk score = DLAT*0.070163 + SNX3*0.488279 + TTC3*0.028921 + PHF20*0.107077 + RTN4* 0.126126 + SURF4*0.084301 + SDC1*0.155383 + KDELR2*0.366251 + BAMBI*0.037788-ANXA5*0.47211-RBP1*0.1076-TPT1*0.07754 + MARVELD1*0.198669–MDK*0.05575-RPLP1*0.07644-ETV6*0.09076. We then evaluated the prognostic model. We performed survival analysis to explore the prognostic value of this feature. As can be seen, in both cohorts, the prognosis for patients in the high-risk group was much poorer (Figures 4A,B). The AUC at 1, 2, 3, 4, and 5 years of the training cohort were 0.645, 0.713, 0.764, 0.798, and 0.739, respectively (Figure 4C). The AUC at 1, 2, 3, 4 and 5 years of the test cohort were 0.722, 0.771, 0.708, 0.699, and 0.674, respectively (Figure 4D). Similarly, in both external validation cohorts, we also observed that patients with high-risk scores had a significantly worse prognosis than those with low-risk scores (Figures 4E,F). In order to further explore the accuracy of the prognostic model in the evaluation of the prognosis of BC patients, we conducted the ROC curve analysis in both external validation cohorts. The AUC at 1, 2, 3, 4, and 5 years of the GSE20685 cohort were 0.776, 0.747, 0.658, 0.621, and 0.637, respectively (Figure 3G). The AUC at 1, 2, 3, 4, and 5 years of the GSE20711 cohort were 0.977, 0.679, 0.742, 0.734, and 0.719, respectively (Figure 3H). The AUC in the four cohorts was greater than or near 0.7, demonstrating that the prognostic model was accurate and stable.

FIGURE 4

Evaluation of prognostic model. (A,B) In both training (A) and test cohorts (B, n = 525), the high-risk patients had a worse prognosis (p < 0.05). (C,D) We found that the AUC in both training (C) and test (D) cohorts were greater than or close to 0.7. (E,F) In both GSE20685 [(E), n = 382] and GSE20711 [(F), n = 89] cohorts, patients with high-risk scores had a significantly worse prognosis than those with low-risk scores (p < 0.05). (G,H) The AUC of the GSE20685 (G) and GSE20711 (H) cohorts was basically between 0.6 and 0.8. We then analyzed the distribution of gene expression and patient survival in the models between the high - and low-risk groups in training and test cohorts (Figures 5A,B). We found that with the increase in risk value, the proportion of BC patients who died increased (Figures 5C,D). Moreover, we found that genes ANXA5, RBP1, TPT1, MDK, RPLP1, and ETV6 were highly expressed in the low-risk group, while DLAT, SNX3, TTC3, PHF20, RTN4, SURF4, SDC1, KDELR2, BAMBI, and MARVELD1 were highly expressed in the high-risk group (Figures 5E,F).

FIGURE 5

Evaluation of prognostic model. (A,B) The risk score of both cohorts. The patients were divided into high-risk and low-risk groups based on the median risk score. (C,D) The correlation of risk score and survival status of patients in both cohorts. With the increase in risk value, the proportion of BC patients who died increased. (E,F) Heat map of expression of 16 model genes in high-risk and low-risk in both cohorts. The genes ANXA5, RBP1, TPT1, MDK, RPLP1, and ETV6 were highly expressed in the low-risk group, while DLAT, SNX3, TTC3, PHF20, RTN4, SURF4, SDC1, KDELR2, BAMBI, and MARVELD1 were highly expressed in the high-risk group.

Exploration of independent prognostic significance of the signature

Subsequently, to see if risk score and other clinical features were independent prognostic predictors of BC, we employed univariate and multivariate COX regressions. First, univariate COX regression revealed that age (Hazard Ratio (HR) = 1.033, p < 0.001), stage (HR = 1.860, p < 0.001), and risk score (HR = 1.012, p < 0.001) were independent prognostic indicators of BC in the training cohort. Multivariate COX regression showed that age (HR = 1.031, p = 0.002, stage (HR = 2.159, p < 0.001), and risk score (HR = 1.012, p < 0.001) were independent prognostic indicators of BC. In the test cohort, univariate COX regression showed that age (HR = 1.037, p < 0.001), stage (HR = 2.530, p < 0.001), and risk score (HR = 1.011, p < 0.001) were independent prognostic indicators of BC. Multivariate COX regression showed that age (HR = 1.037, p = 0.001), stage (HR = 2.68, p = 0.05), and risk score (HR = 1.010, p = 0.037) were independent prognostic indicators of BC.

Enrichment analysis

Then, we performed the enrichment analysis. The results of GO enrichment analysis showed that these genes were mainly related to the extracellular matrix organization, extracellular framework components, and protein expression (Figure 6A). The results of the KEGG enrichment analysis showed that these genes were mainly related to the TCA cycle, ribosome, and protein metabolism (Figure 6B). GSVA analysis was used to further explore the differences in KEGG pathways involved between high and low-risk groups. The results showed that ribosome-related pathways, cell adhesion molecule-related pathways, and glycolysis and gluconeogenesis-related pathways had significant differences between high and low-risk groups (Figure 6C).

FIGURE 6

GO enrichment analysis (A), KEGG enrichment analysis (B), and GSVA analysis (C) of CDRG. (A) The results of GO enrichment analysis showed that these genes were mainly related to the extracellular matrix organization, extracellular framework components, and protein expression. (B) The results of the KEGG enrichment analysis showed that these genes were mainly related to the TCA cycle, ribosome, and protein metabolism. (C) The results showed that ribosome-related pathways, cell adhesion molecule-related pathways, and glycolysis and gluconeogenesis-related pathways had significant differences between high and low-risk groups.

FIGURE 7

Immunoassay and m6A analysis. (A) Immune cell infiltration distribution. T cells, B cells, and Macrophage tended to be highly expressed mainly in the high-risk group. (B) Immune-related functions. The high-risk groups have a more active immune function. (C) The expression of immune checkpoint-related genes. The majority of immunological checkpoint genes were up-regulated. (D) the expression of m6A-related genes. In high-risk groups, most m6a-related genes were upregulated. To further understand the differences in immune microenvironments to guide immunotherapy, the immunological function of high-risk and low-risk populations was discussed. The results show that high-risk groups have a more active immune function (Figure 7B). In the high-risk group, the majority of immunological checkpoint genes were up-regulated (Figure 7C). Immune checkpoint blockade may be more beneficial to them. Meanwhile, between the two groups, Results showed that in high-risk groups, most m6a-related genes were upregulated (Figure 7D). To target treatment, drug sensitivity analyses were performed to identify drugs that were more effective in the high-risk group. Figure 8 illustrated that the candidates were Veliparib, Selumetinib, Entinostat, and Palbociclib.

FIGURE 8

Drug sensitivity analysis. The candidates are Veliparib (A), Selumetinib (B), Entinostat (C), and Palbociclib (D).

Construction of the nomogram

To further apply this prognostic model to BC prognostic assessment, we combined TCGA BC transcriptome data and clinical data to construct a nomogram related to the risk score. The prognostic model estimated that the 1, 3, and 5-years mortality of a BC patient was 0.0117, 0.0737, and 0.153, respectively (Figure 9A). The nomogram can better assess patient risk and guide subsequent clinical decisions. The ROC curves showed that the AUC at 1, 3 and 5 years of the training cohort were 0.741, 0.761, and 0.638, respectively (Figure 9B), and the AUC at 1, 3, and 5 years of the test cohort were 0.881, 0.745 and 0.714, respectively (Figure 9C).

FIGURE 9

The construction of a nomogram. (A) The nomogram. The mortality rate of the patient in 1, 3, and 5 years was estimated to be 0.0117, 0.0737, and 0.153. (B) ROC curve of the nomogram in the training cohort. The AUC in 1, 3, and 5 years were 0.741, 0.761, and 0.638, respectively. (C) ROC curve of the nomogram in the test cohort. The AUC in 1, 3, and 5 years were 0.881, 0.745, and 0.714, respectively.

Discussion

We performed an extensive bioinformatics analysis to explore the significance of copper-dependent-related genes in BC in this study. Using the GEO and TCGA datasets, a predictive model based on copper-dependent-related genes was effectively built in this study. By calculating the risk score, patients with BC could be classified into high-risk and low-risk groups. The high-risk group showed worse outcomes in both TCGA and GEO cohorts. Besides, the ROC curves showed that this signature showed high accuracy in evaluating the prognosis of BC patients at 1, 2, 3, 4, and 5 years. In addition, we also confirmed the roles of copper-dependent-related genes in the immune microenvironment were significantly different between them, which may provide new predictors for immunotherapy in BC patients. Drug sensitivity analysis identifies more sensitive drugs for high-risk groups, which can be valuable in stratifying treatment for BC. Female breast cancer is the most common cancer worldwide (Sung et al., 2021). Surgery, chemotherapy, radiotherapy, targeted therapy, and hormone therapy have become the main treatment strategies (Ferreira et al., 2019; Hirukawa et al., 2019; Kim et al., 2020). However, the overall prognosis of BC patients remains poor, especially for advanced patients (Tao et al., 2015). Meanwhile, despite the development of various prognostic indicators to aid clinical decision-making in BC patients, the application of predictors has been limited. The proposal of CDRG provides a novel approach to the treatment of BC. These CDRG are significant. Copper is a cofactor for a number of essential enzymes, while copper-induced cytotoxic mechanisms can also lead to cell death (Ge et al., 2022). Studies have demonstrated that copper-induced cell death is mediated by lipid acylation of proteins that are concentrated in the TCA cycle, where lipid acylation is required for enzyme function (Solmonson and DeBerardinis, 2018; Tsvetkov et al., 2022). Copper-dependent genes are regulators of copper death (Tsvetkov et al., 2022). In this study, the results of KEGG enrichment analysis showed that the copper-dependent genes we screened were mainly associated with the TCA cycle, and GSVA analysis also showed that glycolysis and gluconeogenesis-related pathways were significantly different between high-risk and low-risk groups, confirming that our screening for copper-dependent genes is meaningful. In addition, GO enrichment analysis, KEGG enrichment analysis and GSVA analysis showed that the differential functions and pathways between the high-risk group and the low-risk group were also concentrated in extracellular matrix organization and ECM receptor interaction pathways. Therefore, we further explored the content of immune matrix components in the tumor microenvironment. The results of immune analysis confirmed that T cells, B cells and macrophages were mainly highly expressed in high-risk groups, and the high-risk groups had more active immune functions and up-regulated expression of most immune checkpoint genes. This suggests that there are more immune matrix components in the tumor microenvironment of high-risk individuals. Sixteen genes in the prognostic model have been initially elucidated in the pathogenesis and progression of the disease. DLAT is involved in pyruvate metabolism and the TCA cycle (Chen et al., 2021). Goh et al. (2015) found that DLAT was up-regulated in gastric cancer cells and may be one of the potential drug targets in mitochondria. SNX3 is involved in intracellular protein trafficking and acts as a key factor driving tumor progression and metastasis in triple-negative breast cancer (Cicek et al., 2022). TTC3 is a ubiquitin E3 ligase that promotes the degradation of Akt ubiquitination and phosphorylation (Suizu et al., 2009). The study by Wu et al. (2021) constructed a 4-gene prognostic marker to evaluate and predict patients with soft tissue sarcoma, in which TTC3 is a key molecule. PHF20 is a multi-domain protein that regulates the activity and gene expression of P53 (Cui et al., 2012). Ma et al. (2020) identified PHF20 as a key driver of glioblastoma malignant behavior. RTN4 belongs to the reticulin-encoding gene family and is involved in the membrane trafficking of neuroendocrine cells (Wang et al., 2021a). Pathak et al. (Pathak et al., 2018) found that RTN4 is involved in carcinogenesis, and the knockdown of RTN4 enhanced the toxic effect of paclitaxel on cancer cells. SURF4 encodes a conserved integral membrane protein that interacts with ER-Golgi intermediate compartment proteins (Yan et al., 2022). Kim et al. (2018) found that SURF4 exhibited abnormal amplification and increased expression in tumor tissues. SDC1 mediates cell binding, cell signaling, and cytoskeletal organization (Jenkins et al., 2018). Yao et al. (2019) found that SDC1 is essential for pancreatic cancer maintenance and progression by regulating micropinocytosis. KDELR2 is a transmembrane protein (Wang et al., 2011). Mao et al. (2020) discovered that elevated KDELR2 expression in glioma patients was linked to a poor prognosis. BAMBI is a transmembrane glycoprotein whose overexpression performs an important part in the pathogenesis and development of osteosarcoma (Zhou et al., 2013). ANXA5 is a calcium-dependent phospholipid-binding annexin (Bouter et al., 2015). Yang et al. (2021) showed that ANXA5 is associated with chemoresistance in B-cell acute lymphoblastic leukemia. RBP is a carrier protein involved in the transport of retinol (Napoli, 2017). Gao et al. (2020) found that the RBP1-CKAP4 axis is a key regulator of autophagy machinery in oral squamous cell carcinoma. TPT1 is a protein involved in cell growth and proliferation (Bae et al., 2017). Li et al. (2019) found that the up-regulation of TPT1 can promote the metastasis of colorectal cancer. MARVELD1 is a nuclear protein (Wang et al., 2009). The study by Li et al. (2016) showed that gefitinib’s therapeutic efficacy in lung cancer can be improved by interfering with MARVELD1. MDK is a heparin-binding growth factor (Filippou et al., 2020). Yuan et al. (2015) found that lung cancer patients with high levels of MDK have a bad prognosis. RPLP1 is a ribosomal protein (Campos et al., 2020). Xie et al. (2021) found that liver cancer patients with high levels of RPLP1 have a bad prognosis. ETV6 encodes an ETS family transcription factor and is associated with susceptibility to acute lymphoblastic leukemia (Nishii et al., 2021). Our study, combining these 16 genes to construct a prognostic model, could improve our understanding of tumor cells. Programmed cell death is gaining increasing attention in the study of tumor therapy and the immune microenvironment (Wang et al., 2021b; Niu et al., 2022). Copper-dependent death is a newly proposed concept that occurs through the direct binding of copper to fatty acylation components of the tricarboxylic acid cycle (Tsvetkov et al., 2022). Tumor growth and metastasis have a high demand for metallic nutrients such as copper, which represents a metabolic vulnerability that can be exploited by limiting the availability of copper (Brewer, 2014; Ge et al., 2022). There are also studies that suggest copper consumption may play a role in cancer prevention (Pan et al., 2009). Tetrathiomolybdate, a less toxic copper chelator, is the primary drug used in copper depletion experiments in cancer models. It has achieved impressive results in a clinical trial (https://www.cancer.org/cancer/breast-cancer/understanding-a-breast-cancer-diagnosis/types-of- breast-cancer/triple-negative.html) in advanced breast cancer, with event-free survival rates of 90% (stage II/III) and 50% (stage IV) in patients with triple-negative breast cancer. However, studies of genes related to copper dependence in BC are lacking. For the first time, we provide the prognostic features of BC CDRG, which have crucial consequences for BC prognosis. Research has confirmed that cancer must evade anti-tumor immune responses in order to grow gradually (Gajewski et al., 2013). Tumor immune evasion has been recognized as a hallmark of cancer progression (Batlle and Massagué, 2019). Immunotherapy for cancer has recently advanced in the treatment of advanced tumors, however, a significant proportion of patients do not respond (Pardoll, 2012; Di Giacomo et al., 2013). BC is considered to be immunologically quiescent tumors, which greatly hinders their therapeutic response to immunotherapy. However, recent studies have shown that immune infiltration in the tumor immune microenvironment plays a decisive role in predicting the prognosis of BC (Baxevanis et al., 2021). Tumor microenvironment structure in selected BC correlates with genomic profiles indicative of immune escape (Danenberg et al., 2022). The results of pre-clinical trials suggest that immunotherapy may be a new approach to the clinical management of BC (Adams et al., 2019). Therefore, it is important to understand the immune microenvironment of BC. Our study found that the high-risk group is associated with immune cell infiltration and high expression of immune checkpoint genes, and therefore the high-risk group is more likely to benefit from immunotherapy. Immune checkpoint blockade (ICB) is expected to be a treatment modality for cancer patients (El-Khoueiry et al., 2017). However, many patients who receive immunotherapy do not respond to ICB, or patients initially respond to ICB but gradually become insensitive as the disease progresses (Pitt et al., 2016; Marin et al., 2022). M6A is considered to be the most significant and important modification of mRNA and noncoding RNA (He et al., 2019). The m6A regulators that regulate m6A modification may be involved in the growth, invasion, and metastasis of various cancers, as well as abnormal immune regulation (Gu et al., 2021; Uddin et al., 2021). Several studies have shown that the tumor microenvironment is closely linked to m6A modifications (Tang et al., 2020; Xu et al., 2020; Zhang et al., 2020). Our study found significant differences in the expression of m6a-related genes in high-risk and low-risk groups, which could help to screen patients for the benefits of immunotherapy. Drug-resistant treatment is a major challenge in the current treatment of BC (Bai et al., 2018). Our study screens drug candidates relevant to prognostic models. In addition, between high and low-risk groups, m6A-related gene expression varies substantially, which has implications for our further breast cancer treatment. However, our study has some limitations. Firstly, our model was constructed and validated based on retrospective data. Prospective clinical validation is needed henceforth. Secondly, The research data comes from the TCGA and GEO public databases. In the future, in vivo or in vitro basic experiments will be performed to confirm our findings, and we will further refine them in the future. This is the first copper-dependent-related gene prognostic model of breast cancer utilizing single-cell cluster analysis that we are aware of, and it informs the study of BC programmed deaths and also contributes to the treatment of BC patients.

Conclusion

Based on copper-dependent-related genes, the prognostic model was built in BC. We can accurately estimate the prognosis and immunological microenvironment of BC patients using this model. In addition, Our findings might lead to new approaches to BC therapy.

82 in total

Review 1. Review: Annexin-A5 and cell membrane repair.

Authors: A Bouter; R Carmeille; C Gounou; F Bouvet; S A Degrelle; D Evain-Brion; A R Brisson
Journal: Placenta Date: 2015-02-03 Impact factor: 3.481

2. Global Cancer Statistics 2020: GLOBOCAN Estimates of Incidence and Mortality Worldwide for 36 Cancers in 185 Countries.

Authors: Hyuna Sung; Jacques Ferlay; Rebecca L Siegel; Mathieu Laversanne; Isabelle Soerjomataram; Ahmedin Jemal; Freddie Bray
Journal: CA Cancer J Clin Date: 2021-02-04 Impact factor: 508.702

Review 3. Breast Cancer: Epidemiology and Etiology.

Authors: ZiQi Tao; Aimin Shi; Cuntao Lu; Tao Song; Zhengguo Zhang; Jing Zhao
Journal: Cell Biochem Biophys Date: 2015-06 Impact factor: 2.194

4. Reduction in Copper Uptake and Inhibition of Prostate Cancer Cell Proliferation by Novel Steroid-based Compounds.

Authors: Fang Xie; Fangyu Peng
Journal: Anticancer Res Date: 2021-12 Impact factor: 2.480

Review 5. Breast cancer.

Authors: Nadia Harbeck; Frédérique Penault-Llorca; Javier Cortes; Michael Gnant; Nehmat Houssami; Philip Poortmans; Kathryn Ruddy; Janice Tsang; Fatima Cardoso
Journal: Nat Rev Dis Primers Date: 2019-09-23 Impact factor: 52.329

6. PHF20 is an effector protein of p53 double lysine methylation that stabilizes and activates p53.

Authors: Gaofeng Cui; Sungman Park; Aimee I Badeaux; Donghwa Kim; Joseph Lee; James R Thompson; Fei Yan; Satoshi Kaneko; Zengqiang Yuan; Maria Victoria Botuyan; Mark T Bedford; Jin Q Cheng; Georges Mer
Journal: Nat Struct Mol Biol Date: 2012-08-05 Impact factor: 15.369

7. RPLP1 is highly expressed in hepatocellular carcinoma tissues and promotes proliferation, invasion and migration of human hepatocellular carcinoma Hep3b cells.

Authors: Changji Xie; Kai Cao; Dexin Peng; Lei Qin
Journal: Exp Ther Med Date: 2021-05-12 Impact factor: 2.447

Review 8. Dually modified transmembrane proteoglycans in development and disease.

Authors: Laura M Jenkins; Ben Horst; Carly L Lancaster; Karthikeyan Mythreye
Journal: Cytokine Growth Factor Rev Date: 2017-12-22 Impact factor: 17.660

Review 9. The balance between breast cancer and the immune system: Challenges for prognosis and clinical benefit from immunotherapies.

Authors: Constantin N Baxevanis; Sotirios P Fortis; Sonia A Perez
Journal: Semin Cancer Biol Date: 2019-12-24 Impact factor: 15.707

Review 10. Copper metabolism as a unique vulnerability in cancer.

Authors: Vinit C Shanbhag; Nikita Gudekar; Kimberly Jasmer; Christos Papageorgiou; Kamal Singh; Michael J Petris
Journal: Biochim Biophys Acta Mol Cell Res Date: 2020-10-20 Impact factor: 4.739