Identification of the potential crucial genes in invasive ductal carcinoma using bioinformatics analysis.

Chunguang Li, Liangtao Luo, Sheng Wei, Xiongbiao Wang.

Abstract

Invasive ductal carcinoma (IDC) is a common histological type of breast cancer. The aim of this study was to identify potential crucial genes associated with IDC and to provide valid biological information for further investigation. The gene expression profiles of GSE10780, which contains 42 histologically normal breast tissues and 143 IDC tissues, were downloaded from the GEO database. Functional and pathway enrichment analyses of differentially expressed genes (DEGs) were performed, and a protein-protein interaction (PPI) network was analyzed using Cytoscape. In total, 999 DEGs were identified, including 667 up-regulated and 332 down-regulated DEGs. Gene ontology analysis demonstrated that most DEGs were significantly enriched in mitotic cell cycle, adhesion and protein binding processes. Through PPI network analysis, a significant module was screened out, and the top 10 hub genes (CDK1, CCNB1, CENPE, CENPA, PLK1, CDC20, MAD2L1, HIST1H2BK, KIF2C and CCNA2) were identified from the PPI network. The expression levels of the 10 genes were validated in the Oncomine database. KIF2C, MAD2L1 and PLK1 were associated with overall survival. We also used cBioPortal to explore the genetic alterations of the hub genes and potential drugs. In conclusion, the present study identified DEGs between normal and IDC samples, which could improve our understanding of the molecular mechanisms in the development of IDC, and these candidate genes might be used as therapeutic targets for IDC.

Keywords:  bioinformatics analysis; differentially expressed genes; invasive ductal carcinoma

Year:  2017        PMID: 29467930      PMCID: PMC5805516          DOI: 10.18632/oncotarget.23239

Source DB:  PubMed          Journal:  Oncotarget        ISSN: 1949-2553


INTRODUCTION

Breast cancer is the most common malignancy diagnosed in women worldwide and the second leading cause of cancer death among women. It has become one of the most serious health issues in the world. Although the prognosis of patients is generally favorable owing to early detection and comprehensive treatment, the morbidity of breast cancer is rising. Additionally, the recurrence rate remains high, and 20%–30% of patients will develop distant metastases, with a median survival time of two years [1, 2]. Several studies have revealed that the rise in breast cancer cases is often due to inherent genetic susceptibility, and that the disease readily relapses even after surgical removal of the primary tumor [3]. Invasive breast carcinoma is a heterogeneous group of tumors that exhibit different morphological spectra and clinical behaviors. Treatment strategies are designed based on the histological characteristics of the tumor and other prognostic factors [4]. The type of breast cancer is one of the most vital characteristics and an important prognostic factor for breast cancer patients. To treat patients with invasive breast carcinoma, it is necessary to understand the specific biological characteristics of a given histological type [4]. According to reported data, invasive ductal carcinoma (IDC) is a common histological type of breast cancer, accounting for about 75% of all invasive breast carcinoma cases [5]. Although clinicians suggest that IDC always requires radical treatment, chemotherapy and radiotherapy, a lack of knowledge regarding the precise molecular targets for IDC still limits the ability to treat advanced disease [6]. Microarray is a high-throughput platform for the analysis of gene expression profiles. It has been widely used to investigate the regulatory networks underlying different types of cancer, with important clinical applications such as improving clinical diagnosis and discovering new drug targets.
Using microarray technology, several studies have explored gene expression profiles of breast cancer and demonstrated their prognostic significance. Analysis of BRCA1/2 mutations is already used in clinical practice as a prognostic marker for breast cancer [7]. In a recent meta-analysis, AMDC2, TSHZ2 and CLDN11 were significantly related to the disease-free survival of breast cancer patients [6]. However, the results are often inconsistent owing to sample heterogeneity. In the present study, we downloaded public microarray data from the Gene Expression Omnibus (GEO, http://www.ncbi.nlm.nih.gov/geo/) to identify DEGs between IDC samples and normal samples. Subsequently, the functions of the DEGs were analyzed by gene ontology (GO) annotation, pathway enrichment and protein-protein interaction (PPI) network construction using bioinformatics methods. By identifying DEGs and analyzing their biological functions and key pathways, we aim to better understand the mechanisms of IDC pathogenesis and to explore potential candidate biomarkers for early diagnosis, individualized prevention and therapy of IDC patients.

RESULTS

Identification of DEGs

In the present study, 143 IDC samples and 42 normal samples in the GSE10780 dataset were analyzed. Based on the cut-off criteria (adjusted P-value < 0.01 and |log2 fold change (FC)| > 1), a total of 999 DEGs were identified, including 667 up-regulated and 332 down-regulated DEGs. The DEG expression heat map is shown in Supplementary Figure 1, and the hierarchical cluster analysis demonstrated that the DEGs could be used to distinguish IDC samples from normal samples. We then analyzed GSE21422 to validate the results; the overlapping DEGs were identified using the Venn diagram in Supplementary Figure 2.
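The cut-off rule above can be sketched in a few lines of Python. This is only an illustration of the thresholding step, not the authors' pipeline (which used the limma package in R); the `GeneStat` record, the `classify_degs` helper and the gene names and values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class GeneStat:
    """Per-gene summary statistics (hypothetical container for illustration)."""
    symbol: str
    log2_fc: float   # log2 fold change, IDC versus normal
    adj_p: float     # adjusted P-value


def classify_degs(stats, p_cut=0.01, lfc_cut=1.0):
    """Split genes into up- and down-regulated DEGs using the stated cut-offs:
    adjusted P-value < p_cut and |log2 FC| > lfc_cut."""
    up, down = [], []
    for g in stats:
        if g.adj_p < p_cut and abs(g.log2_fc) > lfc_cut:
            (up if g.log2_fc > 0 else down).append(g.symbol)
    return up, down


# Made-up values, chosen so that each branch of the rule is exercised.
example = [
    GeneStat("GENE_A", 2.3, 1e-8),    # passes both cut-offs, up-regulated
    GeneStat("GENE_B", -1.7, 5e-4),   # passes both cut-offs, down-regulated
    GeneStat("GENE_C", 0.6, 1e-6),    # fails the fold-change cut-off
    GeneStat("GENE_D", 1.9, 0.03),    # fails the P-value cut-off
]
up_genes, down_genes = classify_degs(example)
print(up_genes, down_genes)
```

Both conditions must hold at once, so a gene with a very small P-value but a modest fold change (GENE_C here) is not counted as a DEG.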

Gene ontology enrichment analysis

Functional enrichment analysis of DEGs was performed using the DAVID online tool. The DEGs were categorized into three functional groups: biological process (BP), molecular function (MF) and cellular component (CC). GO analysis results (Table 1) showed that, in the biological process category, up-regulated genes were mainly enriched in cell division, sister chromatid cohesion, mitotic nuclear division, chromosome segregation and DNA replication, while down-regulated genes were enriched in cell adhesion, hemidesmosome assembly, response to drug, positive regulation of nitric oxide biosynthetic process and aging. For molecular function, up-regulated genes were significantly enriched in protein binding, microtubule binding, protein kinase binding, identical protein binding and protein heterodimerization activity, while down-regulated genes were enriched in heparin binding, transcriptional activator activity (RNA polymerase II core promoter proximal region sequence-specific binding), growth factor activity, protein homodimerization activity and Wnt-protein binding. In the cellular component analysis, up-regulated genes were mainly enriched in nucleoplasm, condensed chromosome kinetochore, cytosol, chromosome centromeric region and extracellular exosome, while down-regulated genes were enriched in extracellular space, extracellular region, proteinaceous extracellular matrix, extracellular exosome and cell surface. The results demonstrated that most DEGs were significantly enriched in mitotic cell cycle, adhesion and protein binding processes.
Table 1

The top 5 enriched gene ontology terms of differentially expressed genes

| Expression | Category | Term | Gene count | P value |
|---|---|---|---|---|
| Up-regulated | GOTERM_BP | GO:0051301~cell division | 47 | 8.76E-15 |
| | GOTERM_BP | GO:0007062~sister chromatid cohesion | 24 | 8.39E-13 |
| | GOTERM_BP | GO:0007067~mitotic nuclear division | 36 | 1.94E-10 |
| | GOTERM_BP | GO:0007059~chromosome segregation | 18 | 1.22E-09 |
| | GOTERM_BP | GO:0006260~DNA replication | 24 | 4.72E-08 |
| | GOTERM_MF | GO:0005515~protein binding | 378 | 6.99E-09 |
| | GOTERM_MF | GO:0008017~microtubule binding | 20 | 1.44E-04 |
| | GOTERM_MF | GO:0019901~protein kinase binding | 29 | 1.65E-04 |
| | GOTERM_MF | GO:0042802~identical protein binding | 47 | 1.76E-04 |
| | GOTERM_MF | GO:0046982~protein heterodimerization activity | 33 | 2.47E-04 |
| | GOTERM_CC | GO:0005654~nucleoplasm | 161 | 2.31E-12 |
| | GOTERM_CC | GO:0000777~condensed chromosome kinetochore | 20 | 7.16E-11 |
| | GOTERM_CC | GO:0005829~cytosol | 173 | 1.21E-09 |
| | GOTERM_CC | GO:0000775~chromosome, centromeric region | 15 | 4.51E-09 |
| | GOTERM_CC | GO:0070062~extracellular exosome | 144 | 1.80E-07 |
| Down-regulated | GOTERM_BP | GO:0007155~cell adhesion | 22 | 2.10E-05 |
| | GOTERM_BP | GO:0031581~hemidesmosome assembly | 5 | 3.05E-05 |
| | GOTERM_BP | GO:0042493~response to drug | 17 | 3.99E-05 |
| | GOTERM_BP | GO:0045429~positive regulation of nitric oxide biosynthetic process | 7 | 6.43E-05 |
| | GOTERM_BP | GO:0007568~aging | 12 | 8.28E-05 |
| | GOTERM_MF | GO:0008201~heparin binding | 14 | 1.00E-06 |
| | GOTERM_MF | GO:0001077~transcriptional activator activity, RNA polymerase II core promoter proximal region sequence-specific binding | 15 | 1.54E-05 |
| | GOTERM_MF | GO:0008083~growth factor activity | 11 | 1.85E-04 |
| | GOTERM_MF | GO:0042803~protein homodimerization activity | 25 | 3.21E-04 |
| | GOTERM_MF | GO:0017147~Wnt-protein binding | 5 | 0.001 |
| | GOTERM_CC | GO:0005615~extracellular space | 62 | 1.12E-13 |
| | GOTERM_CC | GO:0005576~extracellular region | 62 | 2.10E-10 |
| | GOTERM_CC | GO:0005578~proteinaceous extracellular matrix | 20 | 7.82E-08 |
| | GOTERM_CC | GO:0070062~extracellular exosome | 72 | 5.55E-05 |
| | GOTERM_CC | GO:0009986~cell surface | 23 | 7.34E-05 |

*BP, biological process; MF, molecular function; CC, cellular component.

P value < 0.01 was considered the threshold for a significant difference.

Pathway enrichment analysis

The significantly enriched KEGG pathways of the DEGs are shown in Table 2. Up-regulated genes were mainly enriched in cell cycle, DNA replication, viral carcinogenesis, systemic lupus erythematosus and pyrimidine metabolism pathways, while down-regulated genes were significantly enriched in regulation of lipolysis in adipocytes, PI3K-Akt signaling pathway, PPAR signaling pathway and cytokine-cytokine receptor interaction pathways.
Table 2

KEGG pathway analysis of differentially expressed genes

| Expression | Pathway | Gene count | P value |
|---|---|---|---|
| Up-regulated | hsa04110: Cell cycle | 22 | 3.75E-07 |
| | hsa03030: DNA replication | 9 | 6.68E-05 |
| | hsa05203: Viral carcinogenesis | 21 | 1.84E-04 |
| | hsa05322: Systemic lupus erythematosus | 16 | 2.71E-04 |
| | hsa00240: Pyrimidine metabolism | 13 | 8.41E-04 |
| | hsa04114: Oocyte meiosis | 12 | 0.004 |
| | hsa05034: Alcoholism | 16 | 0.005 |
| | hsa05161: Hepatitis B | 14 | 0.005 |
| | hsa03410: Base excision repair | 6 | 0.009 |
| Down-regulated | hsa04923: Regulation of lipolysis in adipocytes | 7 | 4.81E-04 |
| | hsa04151: PI3K-Akt signaling pathway | 15 | 0.004 |
| | hsa03320: PPAR signaling pathway | 6 | 0.007 |
| | hsa04060: Cytokine-cytokine receptor interaction | 11 | 0.008 |

P value < 0.01 was considered the threshold for a significant difference.

PPI network analysis

The 999 DEGs were submitted to the STRING database to predict protein interactions. With a combined score greater than 0.7, the PPI network consisted of 955 nodes and 2246 edges. The top 10 hub nodes with the highest degrees were screened out using the CytoHubba plug-in in Cytoscape. These hub genes were cyclin-dependent kinase 1 (CDK1), cyclin B1 (CCNB1), centromere protein E (CENPE), centromere protein A (CENPA), polo-like kinase 1 (PLK1), cell division cycle 20 (CDC20), MAD2 mitotic arrest deficient-like 1 (MAD2L1), histone cluster 1, H2bk (HIST1H2BK), kinesin family member 2C (KIF2C) and cyclin A2 (CCNA2). Moreover, the full network of 955 nodes and 2246 edges was analyzed using the MCODE plug-in, and the most significant module, which contained 13 nodes and 75 edges, was screened out (Figure 1). Strikingly, all of the genes in this module were up-regulated DEGs. According to GO enrichment analysis, in the biological process category the genes were mainly associated with sister chromatid cohesion, cell division, mitotic nuclear division, mitotic sister chromatid segregation and chromosome segregation. In molecular function, these genes were significantly enriched in protein binding, histone kinase activity, kinetochore binding, anaphase-promoting complex binding and centromeric DNA binding. In the cellular component analysis, they were mainly enriched in condensed chromosome kinetochore, cytosol, chromosome centromeric region, kinetochore and spindle pole (Table 3). KEGG pathway analysis demonstrated that these genes were mainly involved in cell cycle, oocyte meiosis and progesterone-mediated oocyte maturation pathways (Table 4).
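Ranking hub nodes by degree, as described above, amounts to counting how many distinct partners each protein has once low-confidence edges are dropped. The following is a minimal sketch of that idea, not a reimplementation of CytoHubba; the `top_hubs` function, the node names and the scores are hypothetical.

```python
from collections import Counter


def top_hubs(edges, min_score=0.7, k=10):
    """Return the k nodes with the highest degree.

    edges: iterable of (node_a, node_b, combined_score) tuples.
    Only edges with combined_score > min_score are kept; self-loops and
    duplicate pairs (in either orientation) are counted once.
    """
    degree = Counter()
    seen = set()
    for a, b, score in edges:
        if score <= min_score or a == b:
            continue
        pair = frozenset((a, b))
        if pair in seen:
            continue
        seen.add(pair)
        degree[a] += 1
        degree[b] += 1
    # Sort by descending degree, then by name for a stable order.
    return sorted(degree.items(), key=lambda kv: (-kv[1], kv[0]))[:k]


# Toy interaction list with made-up scores.
toy_edges = [
    ("A", "B", 0.90),
    ("A", "C", 0.80),
    ("A", "D", 0.95),
    ("B", "C", 0.75),
    ("C", "D", 0.40),   # below the 0.7 cut-off, ignored
    ("B", "A", 0.90),   # duplicate of A-B, ignored
]
hubs = top_hubs(toy_edges, k=2)
print(hubs)
```

Degree is only one of several ranking schemes a network tool may offer; it is used here because the text selects hubs "with the highest degrees".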
Figure 1

The significant module identified from the protein-protein interaction network

Table 3

The top 5 enriched gene ontology terms of the significant module

| Category | Term | Gene count | P value |
|---|---|---|---|
| GOTERM_BP | GO:0007062~sister chromatid cohesion | 22 | 4.90E-44 |
| GOTERM_BP | GO:0051301~cell division | 17 | 1.54E-21 |
| GOTERM_BP | GO:0007067~mitotic nuclear division | 13 | 3.57E-16 |
| GOTERM_BP | GO:0000070~mitotic sister chromatid segregation | 7 | 9.90E-12 |
| GOTERM_BP | GO:0007059~chromosome segregation | 8 | 4.43E-09 |
| GOTERM_MF | GO:0005515~protein binding | 24 | 2.95E-07 |
| GOTERM_MF | GO:0035173~histone kinase activity | 2 | 0.005 |
| GOTERM_MF | GO:0043515~kinetochore binding | 2 | 0.005 |
| GOTERM_MF | GO:0010997~anaphase-promoting complex binding | 2 | 0.008 |
| GOTERM_MF | GO:0019237~centromeric DNA binding | 2 | 0.009 |
| GOTERM_CC | GO:0000777~condensed chromosome kinetochore | 14 | 1.30E-24 |
| GOTERM_CC | GO:0005829~cytosol | 26 | 2.91E-19 |
| GOTERM_CC | GO:0000775~chromosome, centromeric region | 10 | 2.89E-17 |
| GOTERM_CC | GO:0000776~kinetochore | 9 | 1.09E-13 |
| GOTERM_CC | GO:0000922~spindle pole | 6 | 3.37E-07 |

BP, biological process; MF, molecular function; CC, cellular component.

P value < 0.01 was considered the threshold for a significant difference.

Table 4

KEGG pathway analysis of the significant module

| Pathway | Gene count | P value |
|---|---|---|
| hsa04110: Cell cycle | 7 | 8.05E-10 |
| hsa04114: Oocyte meiosis | 6 | 4.81E-08 |
| hsa04914: Progesterone-mediated oocyte maturation | 4 | 1.03E-04 |

P value < 0.01 was considered the threshold for a significant difference.

Validation of the hub genes

To confirm the reliability of the hub genes, we used ONCOMINE (www.oncomine.org), a cancer microarray database and web-based data-mining platform, to validate the expression levels of the 10 genes [8]. We performed differential analysis between IDC and normal samples using TCGA datasets. Consistently, the top 10 hub genes were significantly up-regulated in IDC (Supplementary Figure 3). To identify hub genes potentially associated with the overall survival of IDC patients, we evaluated the associations between hub gene expression and patient survival using Kaplan-Meier curves and the log-rank test. The results showed that 3 hub genes (KIF2C, MAD2L1 and PLK1) were associated with overall survival (Figure 2). The associations between the expression levels of the three hub genes and clinical features are summarized in Tables 5–7. The results demonstrated that MAD2L1 and KIF2C were significantly associated with patient age, ER and PR status, tumor stage and tumor size; PLK1 was related to ER and PR status, tumor stage and tumor size. The differential expression of MAD2L1, KIF2C and PLK1 among molecular subtypes is shown in Figure 3.
Figure 2

Three hub genes were associated with overall survival in invasive ductal carcinoma patients by using Kaplan-Meier curve and log-rank test

The patients were stratified into high level group and low level group according to median of each gene. (A) KIF2C. (B) MAD2L1. (C) PLK1.

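Table 5

Association of MAD2L1 expression level and clinical features

| Variables | Low expression | High expression | P value* |
|---|---|---|---|
| Age at diagnosis, y | | | 0.041 |
| < 35 | 9 | 17 | |
| 35-49 | 92 | 108 | |
| 50-64 | 162 | 169 | |
| ≥65 | 162 | 95 | |
| Sex | | | 0.362 |
| Male | 385 | 382 | |
| Female | 4 | 7 | |
| ER | | | P < 0.001 |
| Negative | 67 | 136 | |
| Positive | 307 | 231 | |
| Equivocal | 5 | 4 | |
| Unknown | 10 | 18 | |
| PR | | | P < 0.001 |
| Negative | 100 | 179 | |
| Positive | 272 | 188 | |
| Equivocal | 6 | 5 | |
| Unknown | 11 | 17 | |
| HER2 | | | 0.324 |
| Negative | 207 | 286 | |
| Positive | 62 | 65 | |
| Equivocal | 55 | 72 | |
| Unknown | 65 | 66 | |
| Stage | | | 0.001 |
| I | 90 | 49 | |
| II | 211 | 241 | |
| III | 68 | 86 | |
| IV | 11 | 6 | |
| Unknown | 9 | 7 | |
| Tumor size | | | P < 0.001 |
| T1 | 137 | 82 | |
| T2 | 212 | 255 | |
| T3 | 21 | 37 | |
| T4 | 18 | 14 | |
| Unknown | 1 | 1 | |
| Lymph node stage | | | 0.243 |
| N0 | 185 | 172 | |
| N1 | 134 | 141 | |
| N2 | 39 | 55 | |
| N3 | 22 | 16 | |
| Unknown | 9 | 5 | |
| Distant metastasis | | | 0.204 |
| M0 | 328 | 345 | |
| M1 | 11 | 8 | |
| Unknown | 50 | 36 | |

*P values calculated by Pearson Chi squared or Fisher's exact testing.

y: years.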
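As a worked illustration of the Pearson chi-squared test named in the footnote, the sketch below applies it to only the ER Negative and ER Positive rows of Table 5 (67/136 and 307/231). This is a simplification: it is not known from the text whether the Equivocal and Unknown rows entered the authors' test, so the statistic here need not match theirs. The helper name `chi_square_2x2` is hypothetical, and no continuity correction is applied.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-squared test for a 2x2 contingency table (no continuity
    correction). Returns (statistic, p_value) with 1 degree of freedom."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    # For 1 degree of freedom, the chi-squared survival function reduces to
    # erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Rows: ER negative, ER positive. Columns: low, high MAD2L1 expression.
er_table = [[67, 136], [307, 231]]
chi2, p = chi_square_2x2(er_table)
print(f"chi2 = {chi2:.2f}, p = {p:.2e}")
```

On these two rows the statistic is far above 10.83, the 1-degree-of-freedom critical value for P = 0.001, which agrees with the "P < 0.001" entry for ER in the table.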

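Table 7

Association of PLK1 expression level and clinical features

| Variables | Low expression | High expression | P value* |
|---|---|---|---|
| Age at diagnosis, y | | | 0.587 |
| < 35 | 12 | 14 | |
| 35–49 | 96 | 104 | |
| 50–64 | 162 | 169 | |
| ≥65 | 119 | 102 | |
| Sex | | | 0.761 |
| Male | 384 | 383 | |
| Female | 5 | 6 | |
| ER | | | P < 0.001 |
| Negative | 48 | 155 | |
| Positive | 329 | 209 | |
| Equivocal | 3 | 6 | |
| Unknown | 9 | 19 | |
| PR | | | P < 0.001 |
| Negative | 83 | 196 | |
| Positive | 293 | 167 | |
| Equivocal | 3 | 8 | |
| Unknown | 10 | 18 | |
| HER2 | | | 0.452 |
| Negative | 206 | 187 | |
| Positive | 57 | 70 | |
| Equivocal | 64 | 63 | |
| Unknown | 62 | 69 | |
| Stage | | | P < 0.001 |
| I | 98 | 41 | |
| II | 208 | 244 | |
| III | 67 | 87 | |
| IV | 7 | 10 | |
| Unknown | 9 | 7 | |
| Tumor size | | | P < 0.001 |
| T1 | 150 | 69 | |
| T2 | 208 | 259 | |
| T3 | 15 | 43 | |
| T4 | 15 | 17 | |
| Unknown | 1 | 1 | |
| Lymph node stage | | | 0.213 |
| N0 | 183 | 174 | |
| N1 | 140 | 135 | |
| N2 | 37 | 57 | |
| N3 | 20 | 18 | |
| Unknown | 9 | 5 | |
| Distant metastasis | | | 0.284 |
| M0 | 334 | 339 | |
| M1 | 7 | 12 | |
| Unknown | 48 | 38 | |

*P values calculated by Pearson Chi squared or Fisher's exact testing.

y: years.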

Figure 3

KIF2C, MAD2L1 and PLK1 expression among molecular subtypes of breast cancer

(A) KIF2C. (B) MAD2L1. (C) PLK1. TN: triple negative.

Mining genetic alterations and potential drugs of hub genes

We analyzed the 10 hub genes using cBioPortal to explore their genomic alterations in breast cancer and to identify potential drugs. Across the 10 breast cancer studies, the alteration frequency of the hub genes ranged from 0% to 51.7% (Figure 4). In the study of Curtis et al. and/or Pereira et al. [9], 634 cases (25%) had an alteration in at least one of the 10 genes queried. The frequency of alteration in each of the selected genes is shown in Figure 5. Mutual exclusivity analysis showed that 43 gene pairs had significantly co-occurring alterations (Supplementary Table 1). The drugs connected to the hub genes are shown in Figure 6. Among the hub genes, CCNA2, CDK1 and PLK1 were the targets of most drugs.
Figure 4

Summary of alterations for CDK1, CCNB1, CENPE, CENPA, PLK1, CDC20, MAD2L1, HIST1H2BK, KIF2C and CCNA2 in breast cancer

Figure 5

A visual summary of alteration across a set of breast samples (data taken from the study of Curtis et al. and/or Pereira et al.) based on a query of the hub genes.

Figure 6

A visual display of the drugs connected to CDK1, CCNB1, CENPE, CENPA, PLK1, CDC20, MAD2L1, HIST1H2BK, KIF2C and CCNA2 in breast cancer (based on the study of Curtis et al. and/or Pereira et al.)

DISCUSSION

IDC is a common histological type of breast cancer, and understanding its molecular mechanisms is important for treatment. Microarrays are now widely used to analyze changes in mRNA expression in breast cancer and to predict potential therapeutic targets. In the present study, we explored potential crucial genes associated with IDC using bioinformatics analysis. Based on the cut-off criteria, a total of 667 up-regulated and 332 down-regulated DEGs were identified from GSE10780. In the GO biological process category, the up-regulated genes were most significantly involved in cell division, while the down-regulated genes were significantly enriched in cell adhesion. This is consistent with the view that uncontrolled cell division is a hallmark of cancer, and that loss of cell-cell adhesion is an important step in the acquisition of the invasive, metastatic phenotype [10, 11]. In the molecular function category, up-regulated genes were mainly enriched in protein binding, microtubule binding and protein kinase binding. Many microtubule-binding proteins have been associated with oncogenesis. Microtubule end-binding protein 1 (EB1) has been shown to be up-regulated in both human breast cancer specimens and cell lines. The level of EB1 may indicate the malignancy of breast cancer and is reported to correlate with clinical characteristics, including higher histological grade, higher pathological tumor node metastasis (pTNM) stage and higher incidence of lymph node metastasis [12]. Down-regulated genes were most significantly enriched in heparin binding. A previous study reported that the expression of heparin-binding epidermal growth factor-like growth factor (HB-EGF) was inversely related to the biological aggressiveness of breast carcinoma, suggesting that HB-EGF may play an important role in breast carcinoma [13].
As for cellular component, up-regulated genes were mainly located in the cell nucleus, and down-regulated genes were mostly enriched in extracellular locations. This result indicates that DEGs may participate in DNA replication and cell adhesion. Overall, GO enrichment analysis suggested that DEGs might play crucial roles in oncogenesis through cell division, adhesion and binding-related mechanisms. Furthermore, pathway analysis revealed that up-regulated genes were mainly engaged in cell cycle and DNA replication, suggesting that DEGs may participate in cell proliferation. Down-regulated genes were enriched in regulation of lipolysis in adipocytes, the PI3K-Akt signaling pathway, the PPAR signaling pathway and cytokine-cytokine receptor interaction. Recent studies have demonstrated that obesity is associated with increased recurrence and reduced survival in breast cancer, and adipocyte lipolysis may play an important role in providing metabolic substrates to breast cancer cells. Further studies are needed to explore the complex metabolic symbiosis between tumor-surrounding adipocytes and cancer cells that stimulates their invasiveness [13, 14]. In addition, both the PI3K-Akt and PPAR signaling pathways have been confirmed to play crucial roles in cell proliferation, invasion and metastasis [15]. The PI3K-Akt signaling pathway has been demonstrated to be activated in breast cancer. In the present study, 15 genes in this pathway (IL6, FGF10, GNG11, KIT, PCK1, LAMB3, COL6A6, RELN, LAMC2, ANGPT1, TNN, PDGFD, EGF, PIK3R1, GHR) were down-regulated, and this result was validated using the TCGA database. These genes are not crucial genes in the PI3K-Akt signaling pathway; they may participate in other pathways that regulate tumorigenesis, and their down-regulation may promote the occurrence and development of tumors. Through PPI network construction, the module analysis revealed that the DEGs were mainly involved in cell division, cell cycle and binding-related mechanisms.
We listed the top 10 hub genes with the highest degrees: CDK1, CCNB1, CENPE, CENPA, PLK1, CDC20, MAD2L1, HIST1H2BK, KIF2C and CCNA2. The 10 hub genes were validated in the TCGA database. PLK1, MAD2L1 and KIF2C were significantly associated with overall survival and clinical features. In addition, all of these hub genes had alterations in breast cancer. CDK1 has been demonstrated to be a potential prognostic indicator. The protein encoded by this gene is a member of the Ser/Thr protein kinase family and is essential for the G1/S and G2/M phase transitions of the eukaryotic cell cycle. S. J. Kim et al. considered CDK1 to be strongly associated with the clinical outcomes of breast cancer patients and regarded it as a new independent prognostic factor [16, 17]. CCNB1 is a regulatory protein involved in mitosis and is expressed predominantly during the G2/M phase. CCNB1 is reported to be a powerful prognostic factor for the survival of ER+ breast cancer patients [18], and it is also involved in therapy resistance [19]. CENPE is a kinesin-like motor protein that accumulates in the G2 phase of the cell cycle, while CENPA is proposed to be a component of a modified nucleosome or nucleosome-like structure. CENPE and CENPA are reported to be AU-rich element (ARE)-containing genes involved in the mitotic cell cycle, and a recent study revealed that defects in ARE-mediated posttranscriptional control could lead to carcinogenesis. Recent studies have also shown that the survival of breast cancer patients is related to high levels of the mitotic ARE-mRNA signature [20]. PLK1 is a Ser/Thr protein kinase that performs important roles in the M phase of the cell cycle. PLK1 antagonizes p53 during the DNA damage response, and altered mRNA and protein expression of genes related to DNA damage, replication and repair, including DNA-dependent protein kinase (DNAPK) and topoisomerase II alpha (TOPO2A), was detected in PLK1-silenced tumor cells [21]. M Wierer et al. reported that PLK1 mediates estrogen receptor (ER)-regulated gene transcription in human breast cancer cells [22]. Evidence also indicates that treating breast cancer cells with siRNAs targeting PLK1 could improve their sensitivity to paclitaxel and Herceptin [23].
CDC20 acts as a regulatory protein and is an essential component of cell division. High CDC20 expression is reported to be associated with an aggressive course of disease and poor prognosis [24]. MAD2L1 is a component of the mitotic spindle assembly checkpoint, and it may play crucial roles in the progression of breast cancer. Interestingly, MAD2L1 can inhibit the activity of the anaphase-promoting complex by sequestering CDC20 until all chromosomes are aligned at the metaphase plate. Lowering the expression of MAD2L1 by siRNAs could reduce tumor cell growth and inhibit cell migration and invasion [25]. MAD2 overexpression has recently been shown to lead to tumor initiation and progression through the acquisition of chromosomal instability (CIN) in mice; tumors that experience transient MAD2 overexpression and consequent CIN show markedly elevated recurrence rates [26]. It is reported that measuring the expression of MAD2L1 may assist the prediction of breast cancer prognosis [25]. The potential mechanisms of MAD2L1 in breast cancer require further investigation. HIST1H2BK is a core component of the nucleosome and participates in the pathway in which activated PKN1 stimulates transcription of the androgen receptor-regulated genes KLK2 and KLK3. However, the role of HIST1H2BK in breast cancer remains unclear. KIF2C is a kinesin-like protein that functions as a microtubule-dependent molecular motor. It can depolymerize microtubules at the plus end, thereby promoting mitotic chromosome segregation. T Abdelfatah et al. revealed that overexpression of KIF2C protein is associated with unfavorable clinicopathological features and predicts poor clinical outcomes [27].
Because the underlying mechanisms are not clear, further investigations are needed to identify the role of KIF2C in breast cancer. CCNA2 functions as a regulator of the cell cycle, promoting transition through G1/S and G2/M. Several studies have demonstrated that CCNA2 has significant power to predict the survival of breast cancer patients, and CCNA2 has also been found to be closely associated with tamoxifen resistance [28, 29]. In the present study, only MAD2L1, KIF2C and PLK1 were associated with the overall survival of IDC patients. All three genes play important roles in the mitotic cell cycle. An uncontrolled cell cycle is an important step in the occurrence and development of cancer, but further studies are still required to explore the mechanisms. Using bioinformatics analysis, our study identified 667 up-regulated and 332 down-regulated DEGs. Among the 10 hub genes, MAD2L1, KIF2C and PLK1 were potential biomarkers for the prognosis of IDC patients. The results of the present study may provide valuable indications for basic and clinical research. However, further molecular biological experiments are needed to confirm the functions of the identified DEGs.

MATERIALS AND METHODS

Identification of DEGs in IDC

The gene expression profiles of GSE10780 were downloaded from the GEO database. GSE10780, which was submitted by Chen D et al., is based on the GPL570 platform (Affymetrix Human Genome U133 Plus 2.0 Array) and contains 185 samples, including 42 histologically normal breast tissues and 143 IDC tissues. Probes without annotation were filtered out, and the remaining probes were mapped to gene symbols. The gene expression profile data were preprocessed using the Robust Multi-array Average (RMA) algorithm in the affy package within Bioconductor (http://www.bioconductor.org) in R. After background correction, quantile normalization and probe summarization, we used the Linear Models for Microarray Data (LIMMA, http://www.bioconductor.org/packages/release/bioc/html/limma.html) package in R to identify DEGs by comparing expression values between the IDC and normal groups. The P-value of each gene symbol after the classical t-test was taken as the adjusted P-value, and adjusted P-value < 0.01 and |log2 fold change (FC)| > 1 were set as the cut-off criteria. In addition, five healthy tissue samples and five IDC samples from GSE21422 (also based on the GPL570 platform) were analyzed to validate the results.
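The text does not say which multiple-testing correction produced the adjusted P-values. As an assumption for illustration only, the sketch below shows the Benjamini-Hochberg procedure, which is the default adjustment in limma's `topTable`; the function name `benjamini_hochberg` and the raw P-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted P-values, returned in the input order.

    Each raw P-value is scaled by m / rank (rank 1 = smallest), and a running
    minimum taken from the largest rank downwards keeps the result monotone.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Made-up raw P-values for five genes.
raw = [0.001, 0.008, 0.039, 0.041, 0.60]
adj = benjamini_hochberg(raw)
print(adj)
```

With thousands of probes tested at once, the adjusted values are noticeably larger than the raw ones, which is why the cut-off is stated on the adjusted scale.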

Gene ontology and pathway enrichment analysis of DEGs

Functional and pathway enrichment analyses of DEGs were carried out using the Database for Annotation, Visualization and Integrated Discovery (DAVID, http://david.abcc.ncifcrf.gov/). In the present study, we performed Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis based on the DAVID online tools. The GO terms were classified into three categories: cellular component (CC), biological process (BP) and molecular function (MF). P-value < 0.01 was considered statistically significant. For KEGG pathway analysis, P-value < 0.01 was likewise set as the cut-off criterion to identify enriched pathways.
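To make the idea of an enrichment P-value concrete, the sketch below computes a plain hypergeometric upper-tail probability: the chance of seeing at least k annotated genes in a list of n genes drawn from a background of N genes, K of which carry the annotation. This is a simplified stand-in; DAVID reports a modified Fisher exact P-value (the EASE score), which is more conservative. The function name and the toy numbers are hypothetical.

```python
from math import comb


def hypergeom_upper_tail(k, n, K, N):
    """P(X >= k) for X ~ Hypergeometric(N, K, n).

    N: genes in the background, K: background genes with the annotation,
    n: genes in the submitted list, k: listed genes with the annotation.
    """
    total = comb(N, n)
    upper = min(n, K)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / total


# Toy numbers: 20 background genes, 5 annotated, a list of 4 genes, 2 hits.
p_toy = hypergeom_upper_tail(k=2, n=4, K=5, N=20)
print(round(p_toy, 4))
```

A term is called enriched when this probability falls below the chosen threshold (0.01 in the text), meaning the overlap is larger than would be expected from a random gene list of the same size.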

Construction of PPI network

The PPI network of the DEGs in our study was constructed using the Search Tool for the Retrieval of Interacting Genes (STRING) database. STRING is an online tool that predicts protein-protein interaction information and can provide a system-wide view of cellular processes. To evaluate the interactive relationships among DEGs, we established the PPI network using STRING, with "Confidence score > 0.7" selected as the cut-off criterion. The PPI network was then visualized with Cytoscape software (http://www.cytoscape.org/). Molecular Complex Detection (MCODE) was subsequently applied to screen the modules of the PPI network in Cytoscape. The criteria were set as follows: "degree cutoff = 2", "node score cutoff = 0.2", "k-core = 2" and "max.depth = 100". Hub proteins are a small number of proteins that have many interactions with other proteins; to screen out these important nodes in the PPI network, the CytoHubba plug-in was used in the present study. To confirm the reliability of the hub genes, we used ONCOMINE (www.oncomine.org), a cancer microarray database and web-based data-mining platform, to validate the expression levels of the 10 genes [8]. We performed differential analysis between IDC and normal samples using TCGA datasets. In addition, the RNA sequencing data and clinical information were downloaded from the TCGA database (https://cancergenome.nih.gov/). A total of 778 IDC cases were analyzed in our study. The RNA sequencing data were normalized using an R language package. Patient clinical information included sex (male and female), age at diagnosis (<35, 35-49, 50-64, ≥65 years), race (white, black, Asian), ER, PR and HER2 status, and tumor-node-metastasis (TNM) stage. The prognostic value of each differentially expressed mRNA was evaluated using Kaplan-Meier survival curves and the log-rank test. A P-value < 0.05 was defined as significant.
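The survival comparison described above (patients split at the median expression of a gene, then compared with a log-rank test) can be sketched as follows. This is a minimal two-group implementation under stated assumptions, not the authors' R workflow; the function name, the tie handling (values equal to the median go to the low group) and all data are hypothetical.

```python
import math
from statistics import median


def logrank_test(times, events, groups):
    """Two-group log-rank test.

    times: follow-up time per patient; events: 1 if death was observed,
    0 if censored; groups: 0 or 1. Returns (chi-squared statistic, P-value)
    with 1 degree of freedom.
    """
    event_times = sorted({t for t, e in zip(times, events) if e})
    observed1 = expected1 = variance = 0.0
    for t in event_times:
        at_risk = [g for tt, g in zip(times, groups) if tt >= t]
        n = len(at_risk)
        n1 = sum(at_risk)
        d = sum(1 for tt, e in zip(times, events) if tt == t and e)
        d1 = sum(1 for tt, e, g in zip(times, events, groups)
                 if tt == t and e and g == 1)
        observed1 += d1
        expected1 += d * n1 / n
        if n > 1:
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0
    stat = (observed1 - expected1) ** 2 / variance
    return stat, math.erfc(math.sqrt(stat / 2))


# Made-up cohort of eight patients.
expression = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
months = [60, 55, 50, 48, 20, 15, 12, 10]
died = [0, 1, 0, 1, 1, 1, 1, 1]

cutoff = median(expression)
high_group = [1 if x > cutoff else 0 for x in expression]
stat, p_value = logrank_test(months, died, high_group)
print(f"log-rank chi2 = {stat:.2f}, p = {p_value:.4f}")
```

At each observed death time the test compares the deaths seen in the high-expression group with the number expected if both groups shared the same hazard; censored patients contribute to the at-risk counts until their last follow-up.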

Exploring cancer genomics data by cBioPortal

The cBioPortal for Cancer Genomics (http://www.cbioportal.org/) provides visualization, analysis and download of large-scale cancer genomics data sets [30, 31]. In this study, we used cBioPortal to explore the genetic alterations of hub genes and potential drugs.

Human participants and animal rights

This article does not contain any studies with human participants or animals performed by any of the authors.
Table 6

Association of KIF2C expression level and clinical features

Variables              Low expression   High expression   P value*
Age at diagnosis, y                                       0.022
  <35                  9                17
  35-49                92               108
  50-64                160              71
  ≥65                  128              93
Sex                                                       0.362
  Male                 385              382
  Female               4                7
ER                                                        P < 0.001
  Negative             42               161
  Positive             331              207
  Equivocal            4                5
  Unknown              12               16
PR                                                        P < 0.001
  Negative             78               201
  Positive             294              166
  Equivocal            5                6
  Unknown              12               16
HER2                                                      0.3870
  Negative             199              194
  Positive             60               67
  Equivocal            66               61
  Unknown              64               67
Stage                                                     0.001
  I                    92               47
  II                   212              240
  III                  69               85
  IV                   8                9
  Unknown              8                8
Tumor size                                                P < 0.001
  T1                   142              77
  T2                   214              253
  T3                   16               42
  T4                   16               16
  Unknown              1                1
Lymph node stage                                          0.378
  N0                   184              173
  N1                   138              137
  N2                   39               55
  N3                   19               19
  Unknown              9                5
Distant metastasis                                        0.204
  M0                   332              341
  M1                   8                11
  Unknown              49               37

*P values calculated by Pearson Chi squared or Fisher's exact testing.

y: years.
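As a worked illustration of the Pearson Chi squared test named in the footnote, the sketch below recomputes the P value for the Sex row of Table 6 from its four counts. It assumes no continuity correction and covers only the 2 x 2 case (one degree of freedom), where the P value can be obtained from the complementary error function; the function names are our own.

```python
from math import erfc, sqrt


def pearson_chi2(table):
    """Pearson chi-squared statistic for a contingency table (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat


def p_value_df1(stat):
    """Upper-tail P value of a chi-squared statistic with 1 degree of freedom."""
    return erfc(sqrt(stat / 2))


# Sex row of Table 6: columns are low and high KIF2C expression.
sex_table = [
    [385, 382],  # Male, as labelled in the table
    [4, 7],      # Female, as labelled in the table
]

chi2 = pearson_chi2(sex_table)
p = p_value_df1(chi2)
print(round(chi2, 3), round(p, 3))
```

The result should land close to the 0.362 reported in the table. Rows with more than two categories need the chi-squared distribution with more degrees of freedom, which this sketch does not cover.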

  30 in total

1.  Gene expression profiling predicts clinical outcome of breast cancer.

Authors:  Laura J van 't Veer; Hongyue Dai; Marc J van de Vijver; Yudong D He; Augustinus A M Hart; Mao Mao; Hans L Peterse; Karin van der Kooy; Matthew J Marton; Anke T Witteveen; George J Schreiber; Ron M Kerkhoven; Chris Roberts; Peter S Linsley; René Bernards; Stephen H Friend
Journal:  Nature       Date:  2002-01-31       Impact factor: 49.962

2.  CCNB1 is a prognostic biomarker for ER+ breast cancer.

Authors:  Kun Ding; Wenqing Li; Zhiqiang Zou; Xianzhi Zou; Chengru Wang
Journal:  Med Hypotheses       Date:  2014-06-27       Impact factor: 1.538

3.  Breast cancer statistics, 2011.

Authors:  Carol DeSantis; Rebecca Siegel; Priti Bandi; Ahmedin Jemal
Journal:  CA Cancer J Clin       Date:  2011-10-03       Impact factor: 508.702

4.  Inhibiting polo-like kinase 1 enhances radiosensitization via modulating DNA repair proteins in non-small-cell lung cancer.

Authors:  Da Yao; Peigui Gu; Youyu Wang; Weibin Luo; Huiliang Chi; Jianjun Ge; Youhui Qian
Journal:  Biochem Cell Biol       Date:  2017-10-17       Impact factor: 3.626

5.  Determination of the specific activity of CDK1 and CDK2 as a novel prognostic indicator for early breast cancer.

Authors:  S J Kim; S Nakayama; Y Miyoshi; T Taguchi; Y Tamaki; T Matsushima; Y Torikoshi; S Tanaka; T Yoshida; H Ishihara; S Noguchi
Journal:  Ann Oncol       Date:  2007-10-22       Impact factor: 32.976

6.  Cks1 regulates cdk1 expression: a novel role during mitotic entry in breast cancer cells.

Authors:  Louise Westbrook; Marina Manuvakhova; Francis G Kern; Norman R Estes; Harish N Ramanathan; Jaideep V Thottassery
Journal:  Cancer Res       Date:  2007-12-01       Impact factor: 12.701

7.  Functional proteomics can define prognosis and predict pathologic complete response in patients with breast cancer.

Authors:  Ana M Gonzalez-Angulo; Bryan T Hennessy; Funda Meric-Bernstam; Aysegul Sahin; Wenbin Liu; Zhenlin Ju; Mark S Carey; Simen Myhre; Corey Speers; Lei Deng; Russell Broaddus; Ana Lluch; Sam Aparicio; Powel Brown; Lajos Pusztai; W Fraser Symmans; Jan Alsner; Jens Overgaard; Anne-Lise Borresen-Dale; Gabriel N Hortobagyi; Kevin R Coombes; Gordon B Mills
Journal:  Clin Proteomics       Date:  2011-07-08       Impact factor: 3.988

8.  Cdc20 and securin overexpression predict short-term breast cancer survival.

Authors:  H Karra; H Repo; I Ahonen; E Löyttyniemi; R Pitkänen; M Lintunen; T Kuopio; M Söderström; P Kronqvist
Journal:  Br J Cancer       Date:  2014-05-22       Impact factor: 7.640

9.  Erratum: The somatic mutation profiles of 2,433 breast cancers refine their genomic and transcriptomic landscapes.

Authors:  Bernard Pereira; Suet-Feung Chin; Oscar M Rueda; Hans-Kristian Moen Vollan; Elena Provenzano; Helen A Bardwell; Michelle Pugh; Linda Jones; Roslin Russell; Stephen-John Sammut; Dana W Y Tsui; Bin Liu; Sarah-Jane Dawson; Jean Abraham; Helen Northen; John F Peden; Abhik Mukherjee; Gulisa Turashvili; Andrew R Green; Steve McKinney; Arusha Oloumi; Sohrab Shah; Nitzan Rosenfeld; Leigh Murphy; David R Bentley; Ian O Ellis; Arnie Purushotham; Sarah E Pinder; Anne-Lise Børresen-Dale; Helena M Earl; Paul D Pharoah; Mark T Ross; Samuel Aparicio; Carlos Caldas
Journal:  Nat Commun       Date:  2016-06-06       Impact factor: 14.919

10.  Biomarker discovery to improve prediction of breast cancer survival: using gene expression profiling, meta-analysis, and tissue validation.

Authors:  Liwei Meng; Yingchun Xu; Chaoyang Xu; Wei Zhang
Journal:  Onco Targets Ther       Date:  2016-10-11       Impact factor: 4.147
