Haoxuan Jin, Xiaoyan Huang, Kang Shao, Guibo Li, Jian Wang, Huanming Yang, Yong Hou.
Abstract
The aim of the present study was to identify hub genes and provide insight into the tumorigenesis and development of breast cancer. To this end, an integrated bioinformatics analysis was performed. Gene expression profiles were obtained from the Gene Expression Omnibus (GEO) database, and differentially expressed genes (DEGs) were identified using the 'limma' package in R. Gene Ontology enrichment analysis and Kyoto Encyclopedia of Genes and Genomes pathway analysis were used to determine the functional annotations and potential pathways of the DEGs. Subsequently, protein-protein interaction network analysis and weighted correlation network analysis (WGCNA) were conducted to identify hub genes. To confirm the reliability of the identified hub genes, RNA expression profiles were obtained from The Cancer Genome Atlas (TCGA) breast cancer database, and WGCNA was used to screen for genes that were markedly correlated with breast cancer. By combining the results from the GEO and TCGA datasets, 15 hub genes were identified as being associated with breast cancer pathophysiology. Overall survival analysis was then performed to examine the association between hub gene expression and the overall survival time of patients with breast cancer. For every hub gene, patients with higher expression had significantly shorter overall survival than patients with lower expression of that gene.
Keywords: Gene Expression Omnibus; The Cancer Genome Atlas; bioinformatics analysis; breast cancer; hub gene
Year: 2019 PMID: 31423162 PMCID: PMC6607081 DOI: 10.3892/ol.2019.10411
Source DB: PubMed Journal: Oncol Lett ISSN: 1792-1074 Impact factor: 2.967
Figure 1. DEGs in each GEO dataset and common DEGs shared by the two GEO datasets. (A) Volcano plot of DEGs in each GEO dataset. Red dots represent genes that were significantly upregulated in tumor samples. Blue dots represent genes that were significantly downregulated in tumor samples. The dotted vertical lines indicate the significance thresholds used for filtering. (B) Common DEGs shared by the two datasets. (C) Gene expression heat map of the common DEGs that showed the same expression pattern in the two datasets. Red lines represent genes that were significantly upregulated in tumor samples. Blue lines represent genes that were significantly downregulated in tumor samples. DEG, differentially expressed gene; GEO, Gene Expression Omnibus; FC, fold change.
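The filtering and intersection steps summarized in Figure 1 can be illustrated with a minimal Python sketch. It is not the authors' code (the study used the 'limma' package in R), and the cutoffs, gene names and values below are hypothetical, since this record does not state the thresholds that were applied.

```python
# Hypothetical cutoffs; the record does not state the values actually used.
LOG2FC_CUTOFF = 1.0
ADJ_P_CUTOFF = 0.05


def classify(log2fc, adj_p):
    """Label one gene as 'up', 'down' or 'ns' (not significant)."""
    if adj_p < ADJ_P_CUTOFF and log2fc >= LOG2FC_CUTOFF:
        return "up"
    if adj_p < ADJ_P_CUTOFF and log2fc <= -LOG2FC_CUTOFF:
        return "down"
    return "ns"


def common_degs(dataset_a, dataset_b):
    """Return genes that are significant in both datasets with the same direction."""
    shared = {}
    for gene, (fc_a, p_a) in dataset_a.items():
        if gene not in dataset_b:
            continue
        label_a = classify(fc_a, p_a)
        label_b = classify(*dataset_b[gene])
        if label_a != "ns" and label_a == label_b:
            shared[gene] = label_a
    return shared


# Invented toy inputs: gene -> (log2 fold change, adjusted P-value).
dataset_a = {
    "GENE1": (2.3, 0.001),
    "GENE2": (-1.8, 0.01),
    "GENE3": (1.5, 0.20),   # fails the P-value cutoff in dataset A
    "GENE4": (1.2, 0.03),
}
dataset_b = {
    "GENE1": (1.9, 0.004),
    "GENE2": (-2.1, 0.002),
    "GENE3": (1.7, 0.01),
    "GENE4": (-1.4, 0.02),  # significant, but in the opposite direction
}

shared = common_degs(dataset_a, dataset_b)
```

Requiring the same direction in both datasets mirrors panel C, which keeps only common DEGs with the same expression pattern.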
Figure 2. Top enriched KEGG pathways and GO annotations of 322 common DEGs identified from the GSE10180 and GSE65194 datasets. (A) Top enriched KEGG pathways for the 322 DEGs. The size of the circle represents the number of genes enriched in the pathway. The color of the circle represents the P-value. (B) Top enriched GO terms for key DEGs classified into the MF, BP or CC groups. KEGG, Kyoto Encyclopedia of Genes and Genomes; GO, Gene Ontology; DEG, differentially expressed gene; MF, molecular function; BP, biological process; CC, cellular component; AMPK, 5′ adenosine monophosphate-activated kinase; PPAR, peroxisome proliferator-activated receptor; ECM, extracellular matrix.
Figure 3. PPI network analysis. (A) PPI network of 95 key differentially expressed genes. Nodes represent genes and edges represent protein-protein interactions. (B) Top three significant clusters selected from the PPI network. Red circles represent genes that were significantly upregulated in tumor samples. Blue circles represent genes that were significantly downregulated in tumor samples. PPI, protein-protein interaction.
Key differentially expressed genes identified from the protein-protein interaction network.
| A, MCODE cluster 1 | |||||
|---|---|---|---|---|---|
| Gene | MCODE score | Degree | Clustering coefficient | Topological coefficient | Expression |
| | 7.2 | 16 | 0.73333333 | 0.75 | Upregulated |
| | 8.836363636 | 11 | 0.92727273 | 0.85795455 | |
| | 9 | 9 | 1 | 0.86805556 | |
| | 9 | 10 | 0.93333333 | 0.85625 | |
| | 7.813186813 | 14 | 0.8021978 | 0.79017857 | |
| | 7.2 | 15 | 0.77142857 | 0.77083333 | |
| | 7.822222222 | 9 | 0.97222222 | 0.88888889 | |
| | 7.961538462 | 13 | 0.80769231 | 0.79326923 | |
| | 7.2 | 16 | 0.73333333 | 0.75 | |
| | 7.2 | 16 | 0.73333333 | 0.75 | |
| | 7.2 | 16 | 0.73333333 | 0.75 | |
| | 9 | 9 | 1 | 0.86805556 | |
| | 9 | 13 | 0.80769231 | 0.78846154 | |
| | 8.192307692 | 13 | 0.84615385 | 0.80288462 | |
| | 8 | 9 | 0.94444444 | 0.88194444 | |
| | 8 | 8 | 1 | 0.875 | |
| | 3.733333333 | 5 | 0.9 | 0.92 | Downregulated |
| | 3.733333333 | 5 | 0.9 | 0.92 | |
| | 3.733333333 | 5 | 0.9 | 0.92 | |
| | 3.733333333 | 5 | 0.9 | 0.92 | |
| | 4 | 4 | 1.0 | 1.0 | |
| | 4 | 4 | 1.0 | 1.0 | |
| | 3 | 3 | 1 | 1 | Downregulated |
| | 2.4 | 4 | 0.83333333 | 0.875 | |
| | 2.7 | 4 | 0.83333333 | 0.875 | |
| | 3.0 | 3 | 1 | 1 | |
| | 2.7 | 4 | 0.83333333 | 0.875 | |
MCODE, molecular complex detection.
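To make two of the table's columns concrete, the following Python sketch computes the degree and the local clustering coefficient of a node in an undirected network. It assumes the standard definition of the clustering coefficient (the fraction of possible links among a node's neighbors that actually exist); the record does not say which tool produced the table values, and the toy network and node names are invented. The MCODE score and the topological coefficient are not covered here.

```python
from itertools import combinations


def build_graph(edges):
    """Build an undirected adjacency map: node -> set of neighboring nodes."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph


def degree(graph, node):
    """Number of interaction partners of a node."""
    return len(graph[node])


def clustering_coefficient(graph, node):
    """Fraction of neighbor pairs of `node` that are themselves connected."""
    neighbors = graph[node]
    k = len(neighbors)
    if k < 2:
        return 0.0  # undefined for fewer than two neighbors; 0.0 by convention here
    links = sum(1 for a, b in combinations(sorted(neighbors), 2) if b in graph[a])
    return 2.0 * links / (k * (k - 1))


# Invented toy network with four proteins.
toy_edges = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"), ("P2", "P3"), ("P3", "P4")]
toy = build_graph(toy_edges)

p1_degree = degree(toy, "P1")                      # three partners
p1_clustering = clustering_coefficient(toy, "P1")  # two of three neighbor pairs linked
```

Under this definition, a clustering coefficient of 0.73333333 at degree 16 corresponds to 88 of the 120 possible neighbor pairs being connected.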
Figure 4. Weighted correlation co-expression network analysis of the GEO datasets and the TCGA dataset. (A) Gene dendrogram obtained by clustering the DEGs from the GEO datasets. A total of 3 modules (MEblue, MEturquoise and MEgrey) were marked with different colors (blue, turquoise and gray, respectively). (B) Association between the consensus MEs and phenotypes in the GEO datasets. (C) Gene dendrogram obtained by clustering the DEGs in the TCGA dataset. A total of 4 modules (MEturquoise, MEblue, MEbrown and MEgrey) were marked with different colors (turquoise, blue, brown and gray, respectively). (D) Correlations between the consensus MEs and phenotypes in the TCGA dataset. GEO, Gene Expression Omnibus; TCGA, The Cancer Genome Atlas; DEG, differentially expressed gene; ME, module eigengene.
Figure 5. Association between the expression of hub genes and the overall survival of patients with breast cancer. For each hub gene, expression above the median level was associated with a decreased overall survival time. HR, hazard ratio.
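The median split behind Figure 5 can be sketched in Python as follows. This is an illustrative sketch only: it assumes patients are grouped by whether expression is strictly above the median, and it uses a plain Kaplan-Meier estimator to describe the survival curve of one group. The sample identifiers and values are invented, and the hazard ratio and significance test reported in the figure are not reproduced here.

```python
from statistics import median


def split_by_median(expression):
    """Split samples into high (above median) and low (at or below median) groups."""
    cutoff = median(expression.values())
    high = [s for s, value in expression.items() if value > cutoff]
    low = [s for s, value in expression.items() if value <= cutoff]
    return high, low


def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each time with at least one death.

    events[i] is True if the patient died at times[i], False if censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Collect every patient whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            if data[i][1]:
                deaths += 1
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve


# Invented toy data: expression per sample, then follow-up for one group.
expression = {"s1": 1.0, "s2": 2.0, "s3": 3.0, "s4": 4.0}
high_group, low_group = split_by_median(expression)

curve = kaplan_meier([1, 2, 3, 4], [True, False, True, True])
```

Comparing the curves of the high and low groups, for example with a log-rank test or a Cox model, would be the next step in such an analysis.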