Bioinformatics analysis of gene expression data for the identification of critical genes in breast invasive carcinoma.

Yi Li, Yongsheng Wang

Abstract

Gene expression data were analyzed in order to identify critical genes in breast invasive carcinoma (BRCA). Data from 1,073 BRCA samples and 99 normal samples were analyzed, which were obtained from The Cancer Genome Atlas. Differentially expressed genes (DEGs) were identified using the significance analysis of microarrays method and a functional enrichment analysis was performed using the Database for Annotation, Visualization and Integrated Discovery. Relevant microRNAs (miRNAs), transcription factors (TFs) and associated small molecule drugs were revealed by Fisher's exact test. Furthermore, protein‑protein interaction (PPI) information was downloaded from the Human Protein Reference Database. Interactions with a Pearson's correlation coefficient >0.5 were identified and PPI networks were subsequently constructed. A survival analysis was also conducted according to the Kaplan‑Meier method. Initially, the 1,073 BRCA samples were clustered into seven groups, and 5,394 DEGs that were identified in ≥4 groups were selected. These DEGs were involved in the cell cycle, ubiquitin‑mediated proteolysis, oxidative phosphorylation and human immunodeficiency virus infection. In addition, TFs, including Sp1 transcription factor, DAN domain BMP antagonist family member 5, MYCN proto‑oncogene, bHLH transcription factor and cAMP responsive element binding protein (CREB)1, were identified in the BRCA groups. Seven PPI networks were subsequently constructed and the top 10 hub genes were acquired, including RB transcriptional corepressor 1, inhibitor of nuclear factor (NF)‑κB kinase subunit γ, NF‑κB subunit 2, transporter 1, ATP binding cassette subfamily B member, CREB binding protein and proteasome subunit α3. A significant difference in survival was observed between the two combined groups (groups‑2, ‑4 and ‑5 vs. groups‑1, ‑3, ‑6 and ‑7). In conclusion, numerous critical genes were detected in BRCA, and relevant miRNAs, TFs and small molecule drugs were identified. 
These findings may advance understanding regarding the pathogenesis of BRCA.

Year:  2017        PMID: 28990063      PMCID: PMC5779935          DOI: 10.3892/mmr.2017.7717

Source DB:  PubMed          Journal:  Mol Med Rep        ISSN: 1791-2997            Impact factor:   2.952


Introduction

Breast cancer is the most common cancer in women, affecting ~12% of women worldwide (1). It is also the leading cause of cancer-associated mortality in women (2,3). The main cause of breast cancer mortality is metastasis, which accounts for ~90% of breast cancer-associated mortality (4,5). Therefore, research has focused on breast cancer metastasis. The treatment methods of breast cancer include surgery, radiation, chemotherapy, hormone blocking therapy (6), and monoclonal antibodies (7). The choice of treatment for breast cancer depends on classification and special markers. Breast cancers are classified by several grading systems, such as histopathology, grade, stage, receptor status, and protein and gene expression patterns (8). Various signaling pathways have been implicated in metastasis regulation, including transforming growth factor-β signaling (9), Wnt signaling (10), Notch signaling (11) and epidermal growth factor signaling (12). Numerous metastasis-associated biomarkers have also been revealed. Parathyroid hormone-related protein (12–48) is a plasma biomarker that is associated with breast cancer bone metastasis (13). In addition, C-C chemokine receptor type 7 and C-X-C chemokine receptor type 4 are biomarkers that exhibit predictive value for axillary lymph node metastasis in breast cancer (14). MicroRNAs (miRNAs/miRs) serve important roles in the regulation of metastasis, including miR-31 (15) and miR-146 (16). Due to its complexity and heterogeneity, the molecular mechanisms underlying breast invasive carcinoma (BRCA) have yet to be completely elucidated. The identification of critical genes is an essential step toward fully understanding the metastatic mechanism of BRCA and may also provide novel therapeutic targets against BRCA. In the present study, BRCA gene expression data were acquired from The Cancer Genome Atlas (TCGA). Differential analysis, combined with network analysis, was performed to identify the critical genes in BRCA. 
In addition, relevant miRNAs, transcription factors (TFs) and associated small molecule drugs were investigated, which may provide information that aids the development of therapeutic strategies.

Materials and methods

Raw data and pretreatment

RNA-seq data for BRCA were downloaded from TCGA (https://cancergenome.nih.gov/) using TCGA-Assembler (17) and normalized with the TCC package (http://www.bioconductor.org/packages/release/bioc/html/TCC.html). Data from 1,073 BRCA samples and 99 normal samples were included for further analysis. The RNA-seq data had been generated on the Illumina HiSeq 2000 RNA Sequencing platform. Survival data were also collected from TCGA. Missing values were filled with ‘1’ and a log2 transformation was applied. BRCA samples were clustered using the NMF package (Algorithms and Framework for Nonnegative Matrix Factorization; https://cran.r-project.org/web/packages/NMF/index.html) (18), and the k value was chosen to give the optimal cophenetic correlation coefficient (19).
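
The missing-value and log2 steps described above can be illustrated with a minimal Python sketch. It is not the authors' code (the study used R packages). The function name `pretreat`, the use of `None` to mark a missing value, and the toy matrix are assumptions made for illustration, as is the order of operations (fill first, then transform) and the premise that all observed values are positive.

```python
import math

def pretreat(matrix, fill_value=1.0):
    """Replace missing entries (None) with fill_value, then take log2.

    Assumes every observed value is strictly positive; a zero would make
    math.log2 raise ValueError.
    """
    return [
        [math.log2(fill_value if value is None else value) for value in row]
        for row in matrix
    ]

# Hypothetical 2-gene x 3-sample matrix with two missing entries.
raw = [[8.0, None, 4.0],
       [2.0, 16.0, None]]
logged = pretreat(raw)
print(logged)
```

Filling with 1 before the transform means a missing entry becomes 0 on the log2 scale.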

Screening of differentially expressed genes (DEGs)

DEGs were identified using the significance analysis of microarrays method (20), which controls the false discovery rate in multiple testing. Adjusted P<0.05 and |log (fold change)| >1 were set as the cut-off criteria.
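
As an illustration of how such cut-offs are applied, the sketch below filters a table of per-gene statistics. It reads the fold-change criterion as an absolute log fold change above 1, the usual convention for differential expression, which is an assumption here. The function name `select_degs` and the gene labels and values are hypothetical, and the significance analysis of microarrays statistic itself is not reproduced.

```python
def select_degs(results, p_cutoff=0.05, lfc_cutoff=1.0):
    """Keep genes with adjusted P below p_cutoff and |log fold change| above lfc_cutoff.

    results maps a gene label to a tuple (adjusted_p, log_fold_change).
    """
    return [
        gene
        for gene, (adjusted_p, log_fc) in results.items()
        if adjusted_p < p_cutoff and abs(log_fc) > lfc_cutoff
    ]

# Hypothetical statistics, for illustration only.
toy_results = {
    "G1": (0.001, 2.3),   # significant, upregulated
    "G2": (0.20, 3.0),    # large change, not significant
    "G3": (0.01, -1.8),   # significant, downregulated
    "G4": (0.03, 0.4),    # significant, small change
}
degs = select_degs(toy_results)
print(degs)
```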

Functional enrichment analysis

Gene Ontology enrichment analysis and Kyoto Encyclopedia of Genes and Genomes pathway enrichment analysis were performed using the Database for Annotation, Visualization and Integrated Discovery (http://david.abcc.ncifcrf.gov/) (21). The P-value was adjusted by Bonferroni correction (22). Reactome analysis was performed using the Reactome Database (http://reactome.org/) (23).
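
For readers unfamiliar with the Bonferroni correction, a minimal Python sketch follows. It multiplies each raw P-value by the number of tests and caps the result at 1. The function name `bonferroni` and the example P-values are illustrative assumptions, and the number of tests used by the enrichment tool in the study is not stated in the text.

```python
def bonferroni(p_values):
    """Return Bonferroni-adjusted P-values: p * m, capped at 1, for m tests."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]

# Hypothetical raw P-values for three enrichment terms.
adjusted = bonferroni([0.01, 0.04, 0.5])
print(adjusted)
```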

Screening of relevant miRNAs, TFs and associated drugs

Regulatory relationships between miRNAs and target genes, and between TFs and target genes, were downloaded from the combinatorial Gene Regulatory Networks Builder (https://www.scbit.org/cgrnb/) (24,25). A total of 197,906 miRNA-target gene relationships were acquired, which included 699 mature miRNAs and 8,646 target genes. Furthermore, 210,637 TF-target gene relationships were obtained, which included 207 TFs and 16,862 target genes. Relationships between small molecule drugs and target genes were downloaded from DrugBank (http://www.drugbank.ca/). A total of 6,108 relationships were acquired, which included 348 small molecule drugs and 1,353 target genes. Relevant miRNAs, TFs and drugs were identified by Fisher's exact test (26). For a specific miRNA, TF or drug, the set of associated target genes was defined as M, whereas the set of DEGs was defined as N. The significance of the overlap between M and N was examined by Fisher's exact test on the 2×2 table of gene counts, where ‘a’ indicates genes included in both M and N, ‘b’ indicates genes included only in M, ‘c’ indicates genes included only in N, and ‘d’ indicates genes included in neither M nor N. Relevant miRNAs, TFs and drugs with P<0.05 were selected for each group of BRCA samples.
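
The overlap test can be sketched in Python from the four counts a, b, c and d defined above. This is a minimal illustration, not the authors' implementation. It assumes a one-sided test for over-representation (the probability of an overlap at least as large as a, with the table margins fixed); the text does not say whether a one-sided or two-sided P-value was used, and the function name `fisher_right_tail` and the example counts are hypothetical.

```python
from math import comb

def fisher_right_tail(a, b, c, d):
    """One-sided Fisher's exact P-value for an overlap of at least `a` genes.

    a: genes in both M and N      b: genes only in M
    c: genes only in N            d: genes in neither M nor N
    """
    n = a + b + c + d          # all genes considered
    size_m = a + b             # |M|, targets of the miRNA, TF or drug
    size_n = a + c             # |N|, the DEGs
    total = comb(n, size_n)    # ways to choose |N| genes out of n
    largest_overlap = min(size_m, size_n)
    tail = sum(
        comb(size_m, k) * comb(n - size_m, size_n - k)
        for k in range(a, largest_overlap + 1)
    )
    return tail / total

# Hypothetical counts: 3 shared genes, 1 only in M, 1 only in N, 3 in neither.
p_value = fisher_right_tail(3, 1, 1, 3)
print(round(p_value, 4))
```

The result depends on d, and therefore on which genes are taken as the background universe. That choice is not described in the text and would need to be fixed before comparing P-values across regulators.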

Construction of interaction networks

Protein-protein interaction (PPI) information was downloaded from the Human Protein Reference Database (http://www.hprd.org/) (27). A total of 39,240 PPIs involving 9,572 proteins were obtained. Interactions with a Pearson's correlation coefficient >0.5 were selected and PPI networks were subsequently constructed.
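
A minimal Python sketch of the edge filter is given below. It assumes that the correlation is computed between the expression profiles of the two interacting genes across the samples of one group, which the text implies but does not state explicitly. The names `pearson` and `filter_ppi` and the toy profiles are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson's correlation coefficient of two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant profile carries no correlation signal
    return sxy / sqrt(sxx * syy)

def filter_ppi(edges, expression, threshold=0.5):
    """Keep interactions whose two genes are measured and correlate above threshold."""
    return [
        (g1, g2)
        for g1, g2 in edges
        if g1 in expression and g2 in expression
        and pearson(expression[g1], expression[g2]) > threshold
    ]

# Hypothetical expression profiles over four samples.
toy_expression = {
    "A": [1, 2, 3, 4],
    "B": [2, 4, 6, 8],
    "C": [4, 3, 2, 1],
    "D": [1, 3, 2, 4],
}
toy_edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "E")]
kept = filter_ppi(toy_edges, toy_expression)
print(kept)
```

Because the criterion is a signed coefficient above 0.5, strongly anti-correlated pairs such as A and C are dropped in this sketch.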

Survival analysis

Differences in survival time among the BRCA groups were analyzed using the Kaplan-Meier (K-M) method in the survival package (version 2.41-3; https://cran.r-project.org/web/packages/survival/index.html).
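
The Kaplan-Meier estimator itself is simple enough to sketch in plain Python. The block below is an illustration of the estimator only, not of the R survival package used in the study. The function name `kaplan_meier` and the follow-up data are hypothetical, and the between-group significance test is not reproduced because the text does not name it.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    times:  follow-up time of each case
    events: 1 if the case died at that time, 0 if it was censored
    """
    records = sorted(zip(times, events))
    at_risk = len(records)
    survival = 1.0
    curve = []
    i = 0
    while i < len(records):
        current_time = records[i][0]
        deaths = 0
        leaving = 0
        # Gather every case whose follow-up ends at this time point.
        while i < len(records) and records[i][0] == current_time:
            deaths += records[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((current_time, survival))
        at_risk -= leaving
    return curve

# Hypothetical follow-up data: five cases, two of them censored.
curve = kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0])
print(curve)
```

Censored cases leave the risk set without lowering the curve, which is why the estimate changes only at times where a death occurs.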

Results

Classification of BRCA samples

Box plots of the normalized gene expression data are presented in Fig. 1 and indicate that the normalization performed well.
Figure 1.

Box plot of gene expression data for 100 samples randomly selected from the 1,073 samples.

The coefficient of variation was calculated for the expression level of each gene across the BRCA samples, and the top 1,500 genes, which were regarded as core genes, were selected for classification. The maximum cophenetic correlation coefficient was obtained when the k value was set to 7 (Fig. 2). Accordingly, seven BRCA groups were identified from the 1,052 samples using the NMF package. Group-1 included 53 samples, group-2 included 227 samples, group-3 included 124 samples, group-4 included 367 samples, group-5 included 134 samples, group-6 included 32 samples and group-7 included 115 samples.
Figure 2.

(A) Evaluation of k value; (B) correlation matrix of breast invasive carcinoma groups; (C) heatmap of the 1,500 core genes.
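
The core-gene selection step (ranking genes by coefficient of variation and keeping the top ones) can be sketched as follows. This is an illustrative Python version under stated assumptions: the population standard deviation is used, although the text does not say which variant was applied, and the function name `top_variable_genes` and the toy profiles are hypothetical. The NMF clustering itself is not reproduced here.

```python
from statistics import mean, pstdev

def top_variable_genes(expression, k):
    """Rank genes by coefficient of variation (SD / |mean|) and return the top k."""
    cv = {}
    for gene, values in expression.items():
        m = mean(values)
        if m != 0:  # the coefficient of variation is undefined for a zero mean
            cv[gene] = pstdev(values) / abs(m)
    return sorted(cv, key=cv.get, reverse=True)[:k]

# Hypothetical expression profiles over four samples.
toy_profiles = {
    "G1": [5, 5, 5, 5],   # constant, CV = 0
    "G2": [1, 9, 1, 9],   # highly variable, CV = 0.8
    "G3": [4, 6, 4, 6],   # mildly variable, CV = 0.2
}
core = top_variable_genes(toy_profiles, 2)
print(core)
```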

DEGs and biological functions

DEGs were identified in each BRCA group by comparison with the control samples. A total of 4,970 DEGs were identified in group-1, 6,355 DEGs in group-2, 6,008 DEGs in group-3, 5,018 DEGs in group-4, 7,793 DEGs in group-5, 4,976 DEGs in group-6 and 3,581 DEGs in group-7. A considerable number of overlapping genes were observed among groups-1 to -4 and groups-4 to -7 (Fig. 3). The 5,394 DEGs that were identified in ≥4 groups were selected and termed common DEGs. Functional enrichment analysis demonstrated that these genes were involved in the cell cycle, ubiquitin-mediated proteolysis, oxidative phosphorylation and human immunodeficiency virus infection (Table I).
Figure 3.

Venn diagram showing the overlap of differentially expressed genes in (A) groups-1 to -4 and (B) groups-4 to -7.

Table I.

Biological functions over-represented in the 5,394 differentially expressed genes.

| Source | Name | Bonferroni-adjusted P-value |
| --- | --- | --- |
| KEGG | Ubiquitin-mediated proteolysis | 2.35×10⁻¹⁰ |
| KEGG | Non-alcoholic fatty liver disease | 6.50×10⁻¹⁰ |
| KEGG | Oxidative phosphorylation | 6.79×10⁻¹⁰ |
| KEGG | Cell cycle | 9.26×10⁻¹⁰ |
| KEGG | RNA transport | 1.69×10⁻⁹ |
| KEGG | Proteasome | 3.30×10⁻⁹ |
| KEGG | Spliceosome | 2.10×10⁻⁸ |
| KEGG | Protein processing in endoplasmic reticulum | 9.19×10⁻⁸ |
| KEGG | Human T lymphotropic virus type 1 infection | 9.34×10⁻⁸ |
| KEGG | Ribosome biogenesis in eukaryotes | 9.34×10⁻⁸ |
| KEGG | DNA replication | 1.06×10⁻⁷ |
| KEGG | Shigellosis | 3.96×10⁻⁷ |
| REACTOME | Cell cycle, mitotic | 4.29×10⁻⁷ |
| REACTOME | Cell cycle | 5.01×10⁻⁷ |
| REACTOME | Human immunodeficiency virus infection | 1.18×10⁻⁶ |

KEGG, Kyoto Encyclopedia of Genes and Genomes.

The top 500 DEGs ranked by adjusted P-value were selected, which were used to generate a heatmap for the seven groups. Different expression patterns were observed among the seven groups (Fig. 4).
Figure 4.

Heatmap of the top 500 differentially expressed genes ranked by adjusted P-value. The X-axis indicates breast invasive carcinoma samples from groups-1 to -7, whereas the Y-axis indicates the 500 genes.

Relevant miRNAs, TFs and drugs

According to the threshold, 129, 153, 11, 77, 37, 71 and 202 miRNAs were obtained from groups-1 to -7, respectively. A total of 48, 56, 46, 45, 56, 49 and 48 TFs were obtained from groups-1 to -7, respectively. A total of 6, 1, 2, 1, 0, 1 and 2 small molecule drugs were identified in groups-1 to -7, respectively. The top 5 miRNAs, TFs and small molecule drugs (ranked by P-value) are listed in Table II. Numerous TFs were observed in >1 group, including Sp1 transcription factor and DAN domain BMP antagonist family member 5. In addition, some TFs were unique to certain groups, including MYCN proto-oncogene, bHLH transcription factor (MYCN) in group-7 and cAMP responsive element binding protein 1 (CREB1) in group-4. Furthermore, the small molecule drugs identified differed between the groups.
Table II.

Top five relevant microRNAs, transcription factors and small molecule drugs in each group.

| Group | MicroRNAs | Transcription factors | Drugs |
| --- | --- | --- | --- |
| 1 | miR-936, miR-130a, miR-26a, miR-30a, miR-301a | DAND5, PSG1, SP1, Elk-1, EGR3 | Basiliximab, Alemtuzumab, Efalizumab, Natalizumab, L-Isoleucine |
| 2 | miR-506, miR-1289, miR-552, miR-590-3p, miR-214 | DAND5, PSG1, SP1, E2F, E2F1 | Bortezomib |
| 3 | miR-1207-5p, miR-1224-3p, miR-1275, miR-518e*, miR-765 | E2F, E2F1, Elk-1, SP1, DAND5 | Bortezomib, Sunitinib |
| 4 | miR-454, miR-1245, miR-659, miR-518a-5p, miR-340 | DAND5, PSG1, SP1, Elk-1, CREB1 | Carfilzomib |
| 5 | miR-663, miR-519e, miR-940, miR-339-5p, miR-125a-5p | DAND5, PSG1, SP1, Elk-1, PAX5 | N/A |
| 6 | miR-214, miR-1290, miR-142-3p, miR-607, miR-134 | SP1, DAND5, PSG1, PAX5, Pax-5 | Biotin |
| 7 | miR-590-3p, miR-376b, miR-568, miR-1200, miR-302c* | SP1, DAND5, PSG1, MYCN, E2F | Spermine, L-Glutamine |

PPI network

PPIs were identified among the 5,394 DEGs; PPIs with a Pearson's correlation coefficient >0.5 were selected and seven PPI networks were constructed. The PPI network for group-1 contained 711 interactions and 748 genes. The PPI network for group-2 included 197 interactions and 256 genes. The PPI network for group-3 included 473 interactions and 533 genes. The PPI network for group-4 included 207 interactions and 258 genes. The PPI network for group-5 included 382 interactions and 453 genes. The PPI network for group-6 included 774 interactions and 830 genes. The PPI network for group-7 included 416 interactions and 463 genes. Degree was calculated for each node and the top 10 hub genes of the seven PPI networks are listed in Table III. Some hub genes were unique to certain groups, including RB transcriptional corepressor 1 (RB1) from group-1, inhibitor of nuclear factor (NF)-κB kinase subunit γ (IKBKG) from group-2, transporter 1, ATP binding cassette subfamily B member from group-4, small nuclear ribonucleoprotein polypeptide E from group-5, and CREB binding protein from group-6.
Table III.

Top 10 hub genes of the seven groups.

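| Group | Gene | Degree |
| --- | --- | --- |
| 1 | TRAF2 | 11 |
| 1 | SYK | 10 |
| 1 | RB1 | 10 |
| 1 | CEBPB | 10 |
| 1 | CSNK2B | 9 |
| 1 | PCNA | 8 |
| 1 | STAT1 | 8 |
| 1 | MCM3 | 8 |
| 1 | COPS6 | 8 |
| 1 | MCM2 | 8 |
| 2 | STAT1 | 6 |
| 2 | MCM3 | 6 |
| 2 | MCM7 | 6 |
| 2 | MCM2 | 6 |
| 2 | PCNA | 5 |
| 2 | IKBKG | 5 |
| 2 | PTBP1 | 5 |
| 2 | SNRPG | 5 |
| 2 | NFKB2 | 5 |
| 2 | TAP2 | 4 |
| 3 | ACTB | 14 |
| 3 | COPS6 | 9 |
| 3 | LYN | 7 |
| 3 | NFKB2 | 7 |
| 3 | MCM3 | 7 |
| 3 | ARPC4 | 7 |
| 3 | FYN | 7 |
| 3 | PSMB5 | 7 |
| 3 | SYK | 6 |
| 3 | PCNA | 6 |
| 4 | FYN | 10 |
| 4 | LYN | 7 |
| 4 | PCNA | 7 |
| 4 | MCM6 | 7 |
| 4 | MCM2 | 7 |
| 4 | STAT1 | 6 |
| 4 | MCM3 | 6 |
| 4 | MCM7 | 6 |
| 4 | TAP1 | 4 |
| 4 | PSMA3 | 4 |
| 5 | PCNA | 8 |
| 5 | SNRPE | 8 |
| 5 | SNRPF | 7 |
| 5 | SNRPD2 | 7 |
| 5 | MCM2 | 7 |
| 5 | TAF1 | 6 |
| 5 | LSM2 | 6 |
| 5 | MCM3 | 6 |
| 5 | MCM6 | 6 |
| 5 | COPS6 | 6 |
| 6 | COPS6 | 14 |
| 6 | TRAF2 | 13 |
| 6 | ACTB | 11 |
| 6 | FYN | 11 |
| 6 | CREBBP | 10 |
| 6 | C14orf1 | 10 |
| 6 | STAT1 | 9 |
| 6 | CSNK2B | 9 |
| 6 | PRPF40A | 9 |
| 6 | LYN | 8 |
| 7 | FYN | 11 |
| 7 | COPS6 | 9 |
| 7 | ACTB | 9 |
| 7 | LYN | 8 |
| 7 | SNRPD2 | 8 |
| 7 | SNRPF | 7 |
| 7 | CSNK2B | 7 |
| 7 | MCM2 | 7 |
| 7 | PCNA | 6 |
| 7 | PSMA3 | 6 |
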
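
The degree ranking behind the hub gene table can be sketched in a few lines of Python. This is a minimal illustration, assuming an undirected network stored as a list of gene pairs with no duplicate edges or self-interactions. The function name `top_hubs` and the placeholder node labels are hypothetical and do not correspond to interactions from the study.

```python
from collections import Counter

def top_hubs(edges, n=10):
    """Count the interactions of each node and return the n highest-degree nodes."""
    degree = Counter()
    for g1, g2 in edges:
        degree[g1] += 1
        degree[g2] += 1
    return degree.most_common(n)

# Placeholder network with four nodes.
toy_network = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C")]
hubs = top_hubs(toy_network, n=4)
print(hubs)
```

Ties in degree are common in small networks, as the table shows, so the order among equally ranked genes should not be over-interpreted.
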
Survival analysis

A total of 101 cases succumbed to BRCA, as follows: 3 cases from group-1, 32 from group-2, 10 from group-3, 38 from group-4, 11 from group-5, 1 from group-6 and 6 from group-7. The survival analysis demonstrated that groups-2, -4 and -5 had similar survival curves (Fig. 5A). Therefore, these groups were combined into m-group-2, whereas the remaining groups were combined into m-group-1. A significant difference in survival was observed between the two combined groups (P=0.0484; Fig. 5B). According to the available data, no common features, including gene expression patterns and clinical features, were observed in m-group-1 and m-group-2.
Figure 5.

Survival analysis results for (A) the seven groups and (B) the two combined groups.

Discussion

Expression data from 1,073 BRCA samples and 99 normal samples were analyzed in the present study. The BRCA samples were divided into seven groups, and a total of 5,394 common DEGs were obtained. The identified DEGs were involved in the cell cycle, ubiquitin-mediated proteolysis and oxidative phosphorylation. Subsequently, PPI networks were constructed for the DEGs and hub genes were selected. A combination of differential and network analyses was useful in identifying critical genes. Some of the identified hub genes have previously been implicated in breast cancer. TNF receptor-associated factor 2 mediates signal transduction from members of the TNF receptor superfamily and serves a role in the regulation of breast cancer cell invasion (28,29). Spleen associated tyrosine kinase (SYK) is a tumor suppressor, the expression of which is often absent or reduced in invasive breast cancer tissues (30,31). SYK may exert its function through integrin-mediated protein tyrosine phosphorylation (32). Proliferating cell nuclear antigen is considered a potential biomarker of BRCA (33). Polypyrimidine tract-binding protein 1 (PTBP1) has been reported to serve a role in the maintenance of breast cancer cell growth and malignant properties (34). The stability of PTBP1 is increased in breast cancer cell lines compared with matched controls (35). FYN proto-oncogene, Src family tyrosine kinase is a member of the protein-tyrosine kinase oncogene family and is implicated in the control of cell growth. In addition, it is considered an important molecule in drug resistance in breast cancer (36), is induced by Ras/phosphoinositide 3-kinase/Akt signaling and is required for enhanced invasion (37). RB1 is a negative regulator of the cell cycle and was the first identified tumor suppressor gene. In addition, RB1 is associated with epithelial-to-mesenchymal transition in triple-negative breast cancer (38); therefore, it may be used to predict drug response (39) and long-term survival (40).
NFKB2 is a subunit of the transcription factor complex NF-κB, whereas IKBKG is a regulatory subunit of the IκB kinase complex, which activates NF-κB. Both NFKB2 and IKBKG are upregulated in inflammatory breast cancer and may serve important roles in the disease (41). Relevant miRNAs, TFs and small molecule drugs were also identified in the present study. miR-130a (42), miR-26a (43) and miR-30a (44) have been reported to suppress breast cancer cell proliferation and migration. Furthermore, miR-340 inhibits breast cancer cell migration and invasion by targeting the oncoprotein c-Met (45). Conversely, upregulated miR-301a in breast cancer promotes tumor metastasis by targeting phosphatase and tensin homolog and activating Wnt/β-catenin signaling (46). In addition, miR-506 has been reported to regulate epithelial-mesenchymal transition in breast cancer cell lines (47), whereas miR-214 enhances the invasive ability of breast cancer cells by targeting p53 (48). The TF early growth response 3 is involved in the estrogen-signaling pathway in breast cancer cells (49). E2F transcription factor 1 is a proliferative marker of breast neoplasia (50). In addition, ELK1, ETS transcription factor, a member of the ETS oncogene family, is implicated in breast cancer cell survival (51). These factors may be used to develop novel therapies for the treatment of breast cancer. In addition, two small molecule drugs that are currently used to treat BRCA, bortezomib (52,53) and sunitinib (54), were identified as associated with the BRCA DEGs, which supports the reliability of the DEGs. In the present study, the two combined BRCA groups exhibited a significant difference in survival. Therefore, a comparative analysis of the two combined groups may reveal prognostic genes associated with breast cancer. In conclusion, numerous critical genes were identified in BRCA.
Further studies regarding these genes may improve understanding regarding the complex molecular mechanisms underlying BRCA and result in the identification of therapeutic targets. The identified relevant miRNAs and TFs in the present study may also provide information that aids future therapy development.
References

1.  Fyn is induced by Ras/PI3K/Akt signaling and is required for enhanced invasion/migration.

Authors:  Vipin Yadav; Mitchell F Denning
Journal:  Mol Carcinog       Date:  2010-12-10       Impact factor: 4.784

2.  Suppressive role of miR-502-5p in breast cancer via downregulation of TRAF2.

Authors:  Li-Li Sun; Jian Wang; Zhi-Juan Zhao; Ning Liu; Ai-Lian Wang; Hua-Yan Ren; Fan Yang; Ke-Xin Diao; Wei-Neng Fu; En-Hua Wan; Xiao-Yi Mi
Journal:  Oncol Rep       Date:  2014-03-21       Impact factor: 3.906

3.  microRNA-214 enhances the invasion ability of breast cancer cells by targeting p53.

Authors:  Fang Wang; Pengwei Lv; Xinwei Liu; Mingzhi Zhu; Xinguang Qiu
Journal:  Int J Mol Med       Date:  2015-03-03       Impact factor: 4.101

4.  Reduced expression of the Syk gene is correlated with poor prognosis in human breast cancer.

Authors:  Tatsuya Toyama; Hirotaka Iwase; Hiroko Yamashita; Yasuo Hara; Yoko Omoto; Hiroshi Sugiura; Zhenhuan Zhang; Yoshitaka Fujii
Journal:  Cancer Lett       Date:  2003-01-10       Impact factor: 8.679

5.  Transcription factor EGR3 is involved in the estrogen-signaling pathway in breast cancer cells.

Authors:  A Inoue; Y Omoto; Y Yamaguchi; R Kiyama; S-I Hayashi
Journal:  J Mol Endocrinol       Date:  2004-06       Impact factor: 5.098

6.  Role of the protein tyrosine kinase Syk in regulating cell-cell adhesion and motility in breast cancer cells.

Authors:  Xiaoying Zhang; Ulka Shrikhande; Bethany M Alicie; Qing Zhou; Robert L Geahlen
Journal:  Mol Cancer Res       Date:  2009-05-12       Impact factor: 5.852

7.  Identification of PTHrP(12-48) as a plasma biomarker associated with breast cancer bone metastasis.

Authors:  Charity L Washam; Stephanie D Byrum; Kim Leitzel; Suhail M Ali; Alan J Tackett; Dana Gaddy; Suzanne E Sundermann; Allan Lipton; Larry J Suva
Journal:  Cancer Epidemiol Biomarkers Prev       Date:  2013-03-05       Impact factor: 4.254

8.  Gene expression profiling identifies FYN as an important molecule in tamoxifen resistance and a predictor of early recurrence in patients treated with endocrine therapy.

Authors:  D Elias; H Vever; A-V Lænkholm; M F Gjerstorff; C W Yde; A E Lykkesfeldt; H J Ditzel
Journal:  Oncogene       Date:  2014-06-02       Impact factor: 9.867

9.  Considerations when using the significance analysis of microarrays (SAM) algorithm.

Authors:  Ola Larsson; Claes Wahlestedt; James A Timmons
Journal:  BMC Bioinformatics       Date:  2005-05-29       Impact factor: 3.169

10.  NF-kappa B genes have a major role in inflammatory breast cancer.

Authors:  Florence Lerebours; Sophie Vacher; Catherine Andrieu; Marc Espie; Michel Marty; Rosette Lidereau; Ivan Bieche
Journal:  BMC Cancer       Date:  2008-02-04       Impact factor: 4.430

