Jinbao Yin1,2, Chen Lin1, Meng Jiang1, Xinbin Tang1, Danlin Xie1, Jingwen Chen1, Rongqin Ke3.
Abstract
As a highly prevalent disease among women worldwide, breast cancer urgently requires further elucidation of its molecular mechanisms to improve patient outcomes. Identifying hub genes involved in the pathogenesis and progression of breast cancer can help to unveil these mechanisms and provide novel diagnostic and prognostic markers. In this study, we integrated multiple bioinformatic methods and RNA in situ detection technology to identify and validate hub genes. EZH2 was recognized as a key gene by PPI network analysis. CENPL, ISG20L2, LSM4, and MRPL3 were identified as four novel hub genes through WGCNA and a literature search. Many studies on EZH2 in breast cancer have been reported, but none have addressed the roles of CENPL, ISG20L2, MRPL3, and LSM4 in breast cancer. These four novel hub genes were up-regulated in tumor tissues and associated with cancer progression. Receiver operating characteristic analysis and Kaplan-Meier survival analysis indicated that these four hub genes are promising candidates for diagnostic and prognostic biomarkers of breast cancer. Moreover, because these four newly identified hub genes appear to be aberrant molecules in the maintenance of breast cancer development, their exact functional mechanisms deserve further in-depth study.
Year: 2021 PMID: 34341433 PMCID: PMC8328991 DOI: 10.1038/s41598-021-95068-6
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Characteristics of the included GEO datasets.
| Dataset ID | Country | Normal | Tumor | Platform ID | Number of rows per platform |
|---|---|---|---|---|---|
| GSE21422 | Germany | 5 | 14 | GPL570 | 54,675 |
| GSE33447 | China | 4 | 12 | GPL14550 | 42,545 |
| GSE42568 | Ireland | 17 | 104 | GPL570 | 54,675 |
| GSE14999 | Italy | 61 | 68 | GPL3991 | 23,653 |
| GSE65194 | France | 11 | 153 | GPL570 | 54,675 |
| GSE15852 | Malaysia | 43 | 43 | GPL96 | 22,283 |
| GSE5764 | Czech Republic | 20 | 10 | GPL570 | 54,675 |
| GSE3744 | USA | 7 | 40 | GPL570 | 54,675 |
| In total | | 168 | 444 | | |
Figure 1. GO enrichment analysis and KEGG pathway analysis of 512 DEGs. (a) GO terms of biological process (BP); (b) GO terms of cellular component (CC); (c) GO terms of molecular function (MF); (d) KEGG pathway terms.
Hub genes for highly expressed genes ranked by different CytoHubba methods.
| Rank | MCC | MNC | Degree | BottleNeck | EcCentricity | Closeness | Radiality | Betweenness | Stress |
|---|---|---|---|---|---|---|---|---|---|
| 1 | FN1 | FN1 | FN1 | FN1 | FN1 | FN1 | FN1 | FN1 | |
| 2 | CDK1 | CDH1 | CDH1 | CDH1 | CDH1 | CDH1 | CDH1 | ||
| 3 | CCNB1 | FGF2 | FGF2 | CDH1 | CDH1 | FGF2 | FGF2 | PPARG | PPARG |
| 4 | FOXM1 | ERBB2 | ERBB2 | MMP9 | MMP9 | FGF2 | MMP9 | ||
| 5 | UBE2C | MMP9 | MMP9 | PPARG | PPARG | PPARG | ERBB2 | ERBB2 | FGF2 |
| 6 | AURKA | CDK1 | CDK1 | FGF2 | FGF2 | ERBB2 | PPARG | MMP9 | ERBB2 |
| 7 | CDKN3 | CCNB1 | PPARG | IGF1 | IGF1 | IGF1 | IGF1 | ||
| 8 | RRM2 | FOXM1 | CCNB1 | POSTN | FOS | SPP1 | FOS | FOS | |
| 9 | ASPM | PPARG | FOXM1 | FOS | DMD | SPP1 | FOS | IGF1 | IGF1 |
| 10 | TOP2A | AURKA | ERBB2 | DMD | MMP9 | FOS | DMD | SPP1 | |
Figure 2. Identification of candidate gene modules and 102 hub genes for breast cancer based on the TCGA_BRCA dataset through WGCNA. (a) Left: analysis of the scale-free fitting indices for various soft-thresholding powers (β); the red line indicates a scale-free topology model fit (signed R²) of 0.90. Right: mean connectivity analysis for various soft-thresholding powers (β value range 1–20); (b) Left: histogram showing the frequency distribution of the connectivity k when β = 5. Right: check of the scale-free topology when β = 5; log10(k) and log10(p(k)) are negatively correlated (correlation coefficient 0.97), supporting the scale-free property of the constructed gene network; (c) Clustering dendrograms of genes based on the topological overlap dissimilarity (1 − TOM) and merged gene set modules. Seven weighted gene co-expression network modules were constructed and are shown in different colors; (d) Heatmap of the correlation between module eigengenes and the breast cancer sample trait (Tumor). The numbers in each square of the heatmap indicate the Pearson correlation coefficient (top) and P value (bottom); (e) Scatter plot of gene significance for “Tumor” and module membership in the blue module. The red lines indicate MM value = 0.6 and GS value = 0.3; (f) Scatter plot of gene significance for “Tumor” and module membership in the brown module. The red lines indicate MM value = 0.6 and GS value = 0.3.
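The MM and GS cutoffs marked in panels (e) and (f) suggest a simple threshold filter. The following is a minimal Python sketch of such a filter, assuming that a hub gene is one whose absolute module membership (MM) and absolute gene significance (GS) both exceed the cutoffs. The exact criterion used in the study is not stated here, and the gene labels and scores are hypothetical placeholders rather than values from the paper.

```python
from typing import Dict, List, Tuple


def select_hub_genes(
    scores: Dict[str, Tuple[float, float]],
    mm_cutoff: float = 0.6,
    gs_cutoff: float = 0.3,
) -> List[str]:
    """Return genes whose |MM| and |GS| both exceed the cutoffs.

    `scores` maps a gene label to a (module membership, gene significance)
    pair. Using absolute values is an assumption of this sketch.
    """
    return sorted(
        gene
        for gene, (mm, gs) in scores.items()
        if abs(mm) > mm_cutoff and abs(gs) > gs_cutoff
    )


# Hypothetical scores for illustration only (not data from the study).
toy_scores = {
    "gene_a": (0.85, 0.45),    # passes both cutoffs
    "gene_b": (0.72, 0.10),    # fails the GS cutoff
    "gene_c": (0.40, 0.60),    # fails the MM cutoff
    "gene_d": (-0.65, -0.35),  # passes on absolute value
}
hub_genes = select_hub_genes(toy_scores)
print(hub_genes)
```

Genes that pass a filter of this kind would still need downstream validation, as the study does with expression, survival, and in situ analyses.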
Hub genes identified in the blue and brown modules associated with breast tumor via WGCNA analysis.
| Module | Hub genes |
|---|---|
| blue | PPM1G, CDCA4, TACC3, CCNF, TIMELESS, SRSF1, CENPU, |
| brown | TJP3, POLR3K, ATP6V0B, PAFAH1B3, DTYMK, FBXL19, VARS2, TRAF2, VAMP8, |
Four selected genes are annotated in bold.
Figure 3. Relationship between the expression levels of the four novel key genes and clinicopathological variables in breast cancer, based on the bc-GenExMiner platform.
Figure 4. Methylation level analyses and genetic alterations of novel hub genes for breast cancer. (a–d) The methylation levels of CENPL, ISG20L2, LSM4, and MRPL3 in breast cancer and normal tissues were examined using the DiseaseMeth 2.0 database based on the 450K (Illumina Infinium HumanMethylation450 BeadChip) platform; (e) Genetic alterations of CENPL, ISG20L2, MRPL3, and LSM4 were examined in the cBioPortal database.
Figure 5. Diagnostic value analysis and validation of the four novel hub genes in breast cancer. ROC curve analysis for CENPL, ISG20L2, LSM4, and MRPL3 based on (a) the TCGA dataset and (b) the GEO_BRCA dataset. Abbreviations: ROC, receiver operating characteristic; AUC, area under the ROC curve.
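For readers unfamiliar with the AUC, it can be read as the probability that a randomly chosen tumor sample has a higher expression value than a randomly chosen normal sample. The sketch below computes it in Python by direct pairwise comparison, counting ties as one half. It assumes that higher expression indicates tumor. The function name and the expression values are hypothetical and are not taken from the study.

```python
from typing import Sequence


def roc_auc(tumor_values: Sequence[float], normal_values: Sequence[float]) -> float:
    """AUC as the fraction of (tumor, normal) pairs ranked correctly.

    A pair counts 1 when the tumor value is higher, 0.5 when tied,
    and 0 otherwise. This equals the area under the empirical ROC curve.
    """
    if not tumor_values or not normal_values:
        raise ValueError("both groups need at least one sample")
    wins = 0.0
    for t in tumor_values:
        for n in normal_values:
            if t > n:
                wins += 1.0
            elif t == n:
                wins += 0.5
    return wins / (len(tumor_values) * len(normal_values))


# Hypothetical expression values for illustration only.
auc = roc_auc(tumor_values=[3.0, 4.0, 5.0], normal_values=[1.0, 2.0, 3.0])
print(round(auc, 3))
```

The pairwise form is quadratic in the number of samples, which is acceptable for a sketch; rank-based formulas give the same value more efficiently on large cohorts.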
Figure 6. Prognostic value analysis of the four novel hub genes in breast cancer based on (a) the TCGA_GEO BRCA dataset, (b) the METABRIC dataset, and (c) the bc-GenExMiner v4.6 platform. Expression levels of CENPL, ISG20L2, LSM4, and MRPL3 are significantly associated with the overall survival (OS) of breast cancer patients (all P < 0.05, HR > 1).
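Survival curves of the kind compared in this figure are typically drawn with the Kaplan-Meier product-limit estimator. A minimal Python sketch follows, assuming right-censored follow-up data. The function name, follow-up times, and event flags are hypothetical, and the sketch does not cover the group comparison or hazard ratio estimation reported above.

```python
from typing import List, Sequence, Tuple


def kaplan_meier(
    times: Sequence[float], events: Sequence[bool]
) -> List[Tuple[float, float]]:
    """Return (time, survival probability) at each observed event time.

    `events[i]` is True when the event (death) was observed at `times[i]`
    and False when the patient was censored at that time.
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")
    event_times = sorted({t for t, e in zip(times, events) if e})
    survival = 1.0
    curve = []
    for t in event_times:
        # Patients still under observation just before time t.
        at_risk = sum(1 for x in times if x >= t)
        deaths = sum(1 for x, e in zip(times, events) if x == t and e)
        survival *= 1.0 - deaths / at_risk
        curve.append((t, survival))
    return curve


# Hypothetical follow-up data: the second patient is censored at time 2.
km_curve = kaplan_meier(times=[1, 2, 3, 4], events=[True, False, True, True])
print(km_curve)
```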
Figure 7. Gene set enrichment analysis (GSEA) of potential hub genes in the METABRIC dataset. Gene sets related to tumor cell proliferation were significantly enriched in the high-expression group of each hub gene.
Figure 8. RNA in situ detection of five hub genes in different cell lines. (a–d) Expression abundance and spatial localization of each mRNA imaged in single cells. (a) Detection of the five hub genes in MCF10A cells; (b) detection of the five hub genes in MDA-MB-231 cells; (c) detection of the five hub genes in MCF7 cells; (d) detection of the five hub genes in SKBR3 cells. (e–i) Distribution of RCPs/cell for each probe in four cell lines (MCF10A, MCF7, MDA-MB-231, and SKBR3). NS denotes P > 0.05; * denotes P < 0.05; *** denotes P < 0.001.
Correlation analysis between the four novel hub genes (CENPL, ISG20L2, LSM4, and MRPL3) and EZH2 based on RNA in situ detection.
| Breast cancer cell line | Gene | Cor with EZH2 | P value |
|---|---|---|---|
| MCF7 | CENPL | 0.16 | 4.99e-09 |
| MCF7 | ISG20L2 | 0.38 | 2.34e-46 |
| MCF7 | LSM4 | 0.38 | 1.68e-46 |
| MCF7 | MRPL3 | 0.63 | 2.53e-143 |
| MDA-MB-231 | CENPL | 0.23 | 4.64e-17 |
| MDA-MB-231 | ISG20L2 | 0.39 | 1.11e-46 |
| MDA-MB-231 | LSM4 | 0.25 | 2.72e-18 |
| MDA-MB-231 | MRPL3 | 0.55 | 3.84e-106 |
| SKBR3 | CENPL | 0.09 | 9.64e-04 |
| SKBR3 | ISG20L2 | 0.46 | 1.85e-62 |
| SKBR3 | LSM4 | 0.12 | 3.63e-05 |
| SKBR3 | MRPL3 | 0.56 | 4.84e-114 |
Cor: Pearson correlation coefficient.
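As an illustration of the Pearson correlation coefficient, the Python sketch below computes it from paired per-cell measurements. It assumes each cell contributes one count for a hub gene and one for EZH2. The function name and the counts are hypothetical, and the P value calculation is omitted.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-cell signal counts for illustration only.
hub_gene_counts = [2, 5, 3, 8, 6]
ezh2_counts = [1, 4, 2, 7, 6]
r = pearson_r(hub_gene_counts, ezh2_counts)
print(round(r, 2))
```

With thousands of cells per line, even weak correlations such as 0.09 can reach small P values, so the coefficient and the P value should be read together.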