Sepideh Dashti, Mohammad Taheri, Soudeh Ghafouri-Fard.
Abstract
Breast cancer is a highly heterogeneous disorder characterized by dysregulated expression of numerous genes and signaling cascades. In the current study, we used a systems biology strategy to identify key genes and signaling pathways in breast cancer. We retrieved two microarray datasets (GSE65194 and GSE45827) from the NCBI Gene Expression Omnibus database. R packages were used to identify differentially expressed genes (DEGs) and to perform gene ontology and pathway enrichment analyses. The DEGs were integrated to construct a protein-protein interaction network. Next, hub genes were identified using the Cytoscape software, and lncRNA-mRNA co-expression analysis was performed to evaluate the potential roles of lncRNAs. Finally, the clinical importance of the obtained genes was assessed using Kaplan-Meier survival analysis. We detected 887 DEGs between breast cancer and normal samples, of which 730 were upregulated and 157 were downregulated. By combining the results of functional analysis, MCODE, CytoNCA and CytoHubba, two hub genes, MAD2L1 and CCNB1, were selected. We also identified 12 lncRNAs that were significantly correlated with MAD2L1 and CCNB1. According to the Kaplan-Meier plotter database, MAD2L1, CCNA2, RAD51-AS1 and LINC01089 have the greatest predictive potential among all candidate hub genes. Our study offers a framework for identifying mRNA-lncRNA networks in breast cancer and for detecting important pathways that could be used as therapeutic targets in this cancer.
Year: 2020 PMID: 33128008 PMCID: PMC7603345 DOI: 10.1038/s41598-020-76024-2
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Study design flowchart.
Figure 2. (a) A Venn diagram of 887 overlapping DEGs in GSE65194 and GSE45827. (b) Volcano plot of significant DEGs with |logFC| > 2. (c) Volcano plot of significant differentially expressed lncRNAs with |logFC| > 0.5.
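The fold-change cutoffs in the Figure 2 caption and the overlap shown in the Venn diagram can be illustrated with a minimal Python sketch. This is not the authors' R workflow. The adjusted P value cutoff of 0.05, the function names, and the toy gene names and values are all assumptions made for illustration; only the |logFC| > 2 threshold comes from the caption.

```python
from typing import Dict, Set, Tuple

# Each dataset maps a gene symbol to (log2 fold change, adjusted P value).
Dataset = Dict[str, Tuple[float, float]]


def select_degs(results: Dataset,
                min_abs_logfc: float = 2.0,
                max_adj_p: float = 0.05) -> Set[str]:
    """Return genes passing both cutoffs.

    min_abs_logfc=2.0 follows the figure caption; max_adj_p=0.05 is an
    assumed, conventional cutoff that the source does not state.
    """
    return {
        gene
        for gene, (logfc, adj_p) in results.items()
        if abs(logfc) > min_abs_logfc and adj_p < max_adj_p
    }


def overlapping_degs(first: Dataset, second: Dataset, **cutoffs: float) -> Set[str]:
    """Genes called differentially expressed in both datasets (Venn overlap)."""
    return select_degs(first, **cutoffs) & select_degs(second, **cutoffs)


# Hypothetical toy values, not taken from GSE65194 or GSE45827.
dataset_a: Dataset = {
    "GENE_A": (3.1, 1e-10),
    "GENE_B": (-2.4, 1e-6),
    "GENE_C": (1.2, 1e-8),
}
dataset_b: Dataset = {
    "GENE_A": (2.8, 1e-9),
    "GENE_B": (-1.5, 1e-4),
    "GENE_D": (2.5, 0.2),
}

print(sorted(overlapping_degs(dataset_a, dataset_b)))  # ['GENE_A']
```

Passing `min_abs_logfc=0.5` would apply the looser cutoff that the caption reports for lncRNAs.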
Figure 3. KEGG and GO enrichment analysis. (a) KEGG pathways (based on KEGG PATHWAY database[26,27]). (b) GO for DEGs, Biological process. (c) GO for DEGs, Cellular component. (d) GO for DEGs, Molecular function.
Key differentially expressed genes acquired by centrality analysis.
| Gene | logFC | adj.P.Val | MCODE score | Betweenness (CytoNCA) | Closeness (CytoNCA) | Degree (CytoNCA) | Eigenvector (CytoNCA) | EPC (CytoHubba) | MCC (CytoHubba) | MNC (CytoHubba) | Stress (CytoHubba) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| CDK1 | 3.729327 | 3.03E−28 | 46.020339 | 6576.003525 | 0.551637 | 131 | 0.1437205 | 55.3 | 9.22E+13 | 158 | 356,658 |
| CCNB1 | 3.115417 | 1.59E−27 | 46.020339 | 3303.18478 | 0.534146 | 116 | 0.1417335 | 55.9 | 9.22E+13 | 133 | 213,424 |
| CCNA2 | 2.219921 | 2.57E−14 | 46.020339 | 2534.615585 | 0.52019 | 105 | 0.1342352 | 53.4 | 9.22E+13 | 122 | 163,390 |
| CDC20 | 2.756389 | 1.63E−14 | 46.020339 | 2758.724299 | 0.496036 | 103 | 0.1345755 | 51.8 | 9.22E+13 | 116 | 116,042 |
| MAD2L1 | 3.346366 | 4.29E−26 | 46.020339 | 1131.260005 | 0.48079 | 94 | 0.1333797 | 52.9 | 9.22E+13 | 110 | 72,556 |
| KIF11 | 3.207367 | 4.39E−28 | 46.020339 | 1464.21417 | 0.487751 | 92 | 0.1298102 | 50.4 | 9.22E+13 | 104 | 95,940 |
| CENPA | 2.75168 | 8.64E−15 | 47.094949 | 1704.200531 | 0.481319 | 92 | 0.1282006 | 50.3 | 9.22E+13 | 100 | 74,860 |
| PCNA | 2.453675 | 1.03E−37 | 43.857039 | 2862.973827 | 0.489385 | 92 | 0.1150848 | 47.9 | 9.22E+13 | 106 | 108,266 |
| EZH2 | 2.942149 | 1.48E−27 | 42.305272 | 3357.588136 | 0.511085 | 91 | 0.1046336 | 47.1 | 9.22E+13 | 112 | 243,314 |
| KIF23 | 2.687969 | 7.39E−20 | 46.020339 | 1657.240201 | 0.493799 | 89 | 0.1254939 | 49 | 9.22E+13 | 97 | 99,424 |
| TOP2A | 4.595822 | 5.72E−29 | 46.020339 | 1122.678983 | 0.493243 | 88 | 0.1293264 | 51.4 | 9.22E+13 | 104 | 94,808 |
| UBE2C | 3.062068 | 1.20E−22 | 46.020339 | 1455.353604 | 0.466951 | 88 | 0.1219277 | 48.8 | 9.22E+13 | 101 | 67,434 |
| BIRC5 | 2.792394 | 5.43E−15 | 46.020339 | 1678.399139 | 0.493799 | 88 | 0.1287475 | 48.9 | 9.22E+13 | 99 | 108,288 |
| KIF2C | 2.389307 | 1.51E−15 | 46.94026 | 1182.010234 | 0.487751 | 88 | 0.1259678 | 48 | 9.22E+13 | 97 | 85,072 |
| RRM2 | 4.481084 | 1.30E−31 | 46.020339 | 1745.829126 | 0.497727 | 86 | 0.1228496 | 50.9 | 9.22E+13 | 101 | 110,592 |
| RACGAP1 | 2.400852 | 1.87E−23 | 46.020339 | 1361.211732 | 0.493799 | 86 | 0.1225891 | 49.8 | 9.22E+13 | 94 | 93,686 |
| KIF4A | 2.206078 | 2.43E−13 | 46.94026 | 1008.406749 | 0.478689 | 80 | 0.1192849 | 46 | 9.22E+13 | 88 | 65,850 |
| KPNA2 | 3.428273 | 1.09E−42 | 46.880102 | 961.5679185 | 0.47454 | 78 | 0.111362 | 46.1 | 9.22E+13 | 86 | 91,340 |
| TYMS | 2.878076 | 1.89E−21 | 47.774118 | 1340.60442 | 0.473514 | 76 | 0.1156782 | 48.1 | 9.22E+13 | 92 | 99,082 |
| RRM1 | 2.268661 | 2.13E−25 | 40.445411 | 1145.682149 | 0.45768 | 69 | 0.0981711 | 43.2 | 9.22E+13 | 80 | 69,510 |
logFC, log2 fold change; adj.P.Val, adjusted P value; EPC, Edge Percolated Component; MCC, Maximal Clique Centrality; MNC, Maximum Neighborhood Component.
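Two of the centrality measures in the table above, degree and closeness, can be sketched in plain Python on a toy undirected network. This is an illustrative sketch only: the node names and edges are hypothetical, and the closeness normalization used here, (reachable nodes − 1) / (sum of shortest-path distances), is a common textbook definition that may differ from the exact formulas implemented in CytoNCA or CytoHubba.

```python
from collections import deque
from typing import Dict, Iterable, Set, Tuple

Graph = Dict[str, Set[str]]


def build_graph(edges: Iterable[Tuple[str, str]]) -> Graph:
    """Build an undirected adjacency map from an edge list."""
    graph: Graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph


def degree(graph: Graph, node: str) -> int:
    """Number of direct interaction partners of a node."""
    return len(graph[node])


def closeness(graph: Graph, node: str) -> float:
    """(reachable nodes - 1) / sum of shortest-path distances from node."""
    distance = {node: 0}
    queue = deque([node])
    while queue:  # breadth-first search gives unweighted shortest paths
        current = queue.popleft()
        for neighbour in graph[current]:
            if neighbour not in distance:
                distance[neighbour] = distance[current] + 1
                queue.append(neighbour)
    total = sum(distance.values())
    return (len(distance) - 1) / total if total else 0.0


# Hypothetical protein-protein interaction edges, not from the study.
toy_network = build_graph([
    ("HUB", "P1"), ("HUB", "P2"), ("HUB", "P3"), ("P3", "P4"),
])

for name in sorted(toy_network, key=lambda n: degree(toy_network, n), reverse=True):
    print(name, degree(toy_network, name), round(closeness(toy_network, name), 3))
```

Ranking nodes by several such measures and keeping those that score highly on all of them is one way to shortlist hub candidates.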
Figure 4. (a) A Venn diagram of 26 overlapping genes between different calculation methods of CytoHubba and CytoNCA. (b) Heatmap correlation plot for 20 candidate genes (depicted using the pheatmap package of R software[25]).
Key lncRNAs co-expressed with MAD2L1 and CCNA2.
| Symbol | Pearson correlation | P value | Pearson correlation | P value |
|---|---|---|---|---|
|  | −0.68 | < 2.2e−16 | −0.73 | < 2.2e−16 |
| LINC01279 | −0.66 | < 2.2e−16 | −0.68 | < 2.2e−16 |
|  | −0.64 | < 2.2e−16 | −0.68 | < 2.2e−16 |
| LINC01089 | −0.64 | < 2.2e−16 | −0.67 | < 2.2e−16 |
|  | −0.64 | < 2.2e−16 | −0.63 | < 2.2e−16 |
|  | −0.61 | < 2.2e−16 | −0.63 | < 2.2e−16 |
| LINC02256 | −0.60 | < 2.2e−16 | −0.62 | < 2.2e−16 |
|  | −0.60 | < 2.2e−16 | −0.62 | < 2.2e−16 |
|  | −0.60 | < 2.2e−16 | −0.62 | < 2.2e−16 |
|  | −0.60 | < 2.2e−16 | −0.60 | < 2.2e−16 |
|  | −0.60 | < 2.2e−16 | −0.60 | < 2.2e−16 |
|  | 0.65 | < 2.2e−16 | 0.66 | < 2.2e−16 |
|  | 0.93 | < 2.2e−16 | 1 | < 2.2e−16 |
|  | 1 | < 2.2e−16 | 0.936209 | < 2.2e−16 |
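
The Pearson correlation coefficients reported for lncRNA-mRNA co-expression can be reproduced in principle with a few lines of Python. The sketch below is a generic implementation of the coefficient, not the authors' R code. The function name and the expression vectors are hypothetical and chosen only to show a strongly negative correlation.

```python
import math
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two vectors of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return covariance / (spread_x * spread_y)


# Hypothetical expression values across five samples (not study data).
mrna_expression = [1.0, 2.0, 3.0, 4.0, 5.0]
lncrna_expression = [9.5, 8.1, 6.2, 3.9, 2.4]

# A value near -1 indicates the lncRNA falls as the mRNA rises.
print(round(pearson(mrna_expression, lncrna_expression), 3))
```

A gene correlated with itself gives a coefficient of 1, which is why the table contains entries equal to 1.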
Relative expression of our candidate genes in different molecular subtypes of breast cancer and healthy breast tissue in GSE65194 and GSE45827.
| Symbol | Basal-like logFC | Basal-like adj.P.Val | HER2-enriched logFC | HER2-enriched adj.P.Val | Luminal A logFC | Luminal A adj.P.Val | Luminal B logFC | Luminal B adj.P.Val |
|---|---|---|---|---|---|---|---|---|
| MAD2L1 | 4.4 | 3.02E−61 | 3.57 | 6.58E−44 | 1.709 | 3.53E−13 | 3.297 | 4.45E−37 |
| CCNA2 | 3.91 | 9.31E−50 | 2.78 | 1.50E−28 | 0.644 | 0.00823 | 1.986 | 1.09E−15 |
| RAD51-AS1 | −1.6 | 1.47E−33 | −1.6 | 1.57E−30 | −0.9 | 6.18E−12 | −1.27 | 5.54E−21 |
| LINC01089 | −1 | 2.44E−18 | −0.8 | 6.34E−11 | −0.32 | 0.00964 | −0.71 | 5.79E−09 |
| EIF3J-DT | −0.9 | 1.02E−13 | −0.8 | 1.66E−10 | −0.04 | 0.77445 | −0.46 | 0.00031 |
| LINC02256 | −0.9 | 1.13E−21 | −0.9 | 3.56E−22 | −0.44 | 3.20E−06 | −0.57 | 1.22E−09 |
| TNFRSF14-AS1 | −0.8 | 8.62E−26 | −0.8 | 6.44E−25 | −0.34 | 6.84E−06 | −0.5 | 7.35E−11 |
| CARMN | −3.2 | 6.13E−22 | −3 | 4.42E−18 | −1.91 | 4.53E−08 | −2.41 | 5.23E−12 |
| EPB41L4A-AS1 | −1.6 | 1.78E−20 | −1.3 | 2.99E−13 | −0.7 | 0.0002 | −1.24 | 3.19E−11 |
| LINC01279 | −2.1 | 2.58E−08 | −0.9 | 0.022916 | 0.484 | 0.27445 | −0.66 | 0.10494 |
| MEG3 | −2.8 | 7.98E−40 | −2.1 | 2.33E−25 | −1.55 | 5.77E−14 | −1.91 | 8.24E−20 |
| FUT8-AS1 | −0.8 | 2.61E−31 | −0.8 | 7.15E−26 | −0.49 | 5.63E−11 | −0.55 | 1.80E−13 |
| PRINS | −1.2 | 4.18E−44 | −1.2 | 2.64E−38 | −0.92 | 4.31E−24 | −1.09 | 1.10E−31 |
| NCK1-DT | 1.58 | 1.22E−22 | 0.94 | 4.64E−09 | 0.625 | 0.00022 | 0.897 | 8.21E−08 |
logFC, log2 fold change; adj.P.Val, adjusted P value.
Recurrence free survival (RFS) and overall survival (OS) of candidate hub genes.
| Probe ID | Symbol | RFS multivariate HR | RFS multivariate CI | RFS multivariate logrank P | RFS univariate HR | RFS univariate CI | RFS univariate logrank P | OS multivariate HR | OS multivariate CI | OS multivariate logrank P | OS univariate HR | OS univariate CI | OS univariate logrank P |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 210794_s_at |  | 0.8 | 0.71–0.89 | 0.0001 | 0.73 | 0.65–0.81 | 1.30e−08 | 0.93 | 0.74–1.17 | 0.5396 | 0.82 | 0.66–1.02 | 0.0713 |
| 227061_at |  | 0.74 | 0.63–0.87 | 0.0003 | 0.69 | 0.59–0.81 | 4.2e−06 | 0.73 | 0.53–1.01 | 0.0553 | 0.72 | 0.53–0.99 | 0.0391 |
| 235124_at |  | 0.57 | 0.48–0.68 | 0 | 0.5 | 0.42–0.58 | < 1e−16 | 0.76 | 0.54–1.07 | 0.1189 | 0.62 | 0.45–0.85 | 0.0029 |
| 226369_at |  | 0.5 | 0.42–0.58 | 0 | 0.46 | 0.39–0.54 | < 1e−16 | 0.84 | 0.6–1.17 | 0.2993 | 0.72 | 0.53–0.99 | 0.0433 |
| 1560081_at |  | 0.45 | 0.38–0.53 | 0 | 0.41 | 0.35–0.49 | < 1e−16 | 0.7 | 0.5–0.99 | 0.0427 | 0.61 | 0.45–0.84 | 0.0023 |
| 232190_x_at |  | 0.63 | 0.54–0.74 | 0 | 0.58 | 0.5–0.68 | 1.4e−11 | 0.86 | 0.62–1.18 | 0.3494 | 0.76 | 0.55–1.04 | 0.0868 |
| 234423_x_at |  | 0.63 | 0.53–0.74 | 0 | 0.58 | 0.49–0.68 | 5.2e−12 | 1.26 | 0.91–1.75 | 0.1708 | 1.06 | 0.78–1.46 | 0.6976 |
| 216051_x_at |  | 0.7 | 0.63–0.78 | 0 | 0.69 | 0.62–0.77 | 2.4e−11 | 0.91 | 0.73–1.13 | 0.3928 | 0.86 | 0.69–1.06 | 0.1658 |
| 225698_at |  | 0.77 | 0.65–0.91 | 0.0023 | 0.65 | 0.55–0.76 | 3.3e−08 | 0.91 | 0.65–1.28 | 0.5966 | 0.75 | 0.55–1.03 | 0.0781 |
| 1558828_s_at |  | 0.62 | 0.53–0.73 | 0 | 0.59 | 0.51–0.69 | 5.3e−11 | 0.61 | 0.44–0.84 | 0.0028 | 0.56 | 0.41–0.77 | 0.0003 |
| 242889_x_at |  | 0.8 | 0.68–0.93 | 0.0046 | 0.75 | 0.64–0.88 | 0.00028 | 1.74 | 1.25–2.43 | 0.0011 | 1.54 | 1.11–2.12 | 0.0083 |
| 228799_at |  | 1.06 | 0.91–1.25 | 0.4498 | 1.18 | 1.01–1.38 | 0.035 | 0.95 | 0.68–1.32 | 0.751 | 1.09 | 0.8–1.49 | 0.5982 |
| 203418_at |  | 1.24 | 1.09–1.42 | 0.0011 | 1.84 | 1.64–2.05 | < 1e−16 | 1.36 | 1.05–1.77 | 0.0193 | 1.55 | 1.25–1.93 | 5.0e−05 |
| 203362_s_at |  | 1.66 | 1.47–1.87 | 0 | 1.86 | 1.67–2.08 | < 1e−16 | 1.8 | 1.42–2.28 | 0 | 2.02 | 1.62–2.51 | 1.8e−10 |
HR, hazard ratio; CI, confidence interval; RFS, recurrence free survival; OS, overall survival.
The multivariate analysis included MKI67, ESR1, and HER2 (ERBB2).
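The survival curves behind such an analysis are typically built with the Kaplan-Meier (product-limit) estimator, which can be sketched in plain Python. This is a generic illustration and not the implementation used by the Kaplan-Meier plotter database. It does not compute hazard ratios or logrank P values, and the follow-up times and event flags below are hypothetical.

```python
from typing import List, Sequence, Tuple


def kaplan_meier(times: Sequence[float],
                 events: Sequence[int]) -> List[Tuple[float, float]]:
    """Product-limit survival estimate.

    times  : follow-up time for each patient
    events : 1 if the event (e.g. recurrence or death) was observed,
             0 if the patient was censored at that time
    Returns (time, survival probability) at each distinct event time.
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")
    records = sorted(zip(times, events))
    at_risk = len(records)
    survival = 1.0
    curve: List[Tuple[float, float]] = []
    index = 0
    while index < len(records):
        current_time = records[index][0]
        observed = 0   # events at this time
        leaving = 0    # events plus censored patients at this time
        while index < len(records) and records[index][0] == current_time:
            observed += records[index][1]
            leaving += 1
            index += 1
        if observed:
            survival *= 1 - observed / at_risk
            curve.append((current_time, survival))
        at_risk -= leaving
    return curve


# Hypothetical follow-up data for five patients (not study data).
follow_up = [2, 3, 3, 5, 8]
event_seen = [1, 1, 0, 1, 0]

for time_point, probability in kaplan_meier(follow_up, event_seen):
    print(time_point, round(probability, 3))
```

In practice, patients would be split into high- and low-expression groups for a given probe, and one curve would be estimated per group before the groups are compared.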