Jing Xu, Yuejin Yang.
Abstract
Objective: To explore the molecular mechanisms and identify candidate differentially expressed genes (DEGs) with predictive and prognostic potential that are detectable in the whole blood of patients with ST-segment elevation myocardial infarction (STEMI) and those with post-STEMI heart failure (HF).
Keywords: acute myocardial infarction; biomarker; heart failure; machine learning; microarray
Year: 2021 PMID: 34957234 PMCID: PMC8702808 DOI: 10.3389/fcvm.2021.736497
Source DB: PubMed Journal: Front Cardiovasc Med ISSN: 2297-055X
Figure 1. Overview of the analysis procedure. We downloaded GSE60993, GSE61144, and GSE66360 from the NCBI-GEO database and identified 90 upregulated and nine downregulated DEGs shared by the three datasets (|log2FC| ≥ 0.8 and adjusted p < 0.05). Gene ontology and pathway enrichment analyses were performed with ClueGO (version 2.5.7), CluePedia (version 1.5.7), and the DAVID database. A protein-protein interaction (PPI) network was constructed with STRING, and enriched hub genes were analyzed in Cytoscape. Logistic LASSO regression was used to build a machine learning model. The GSE59867 dataset was used to validate the hub genes in patients with post-STEMI HF.
DEGs shared by GSE60993, GSE61144, and GSE66360 in the comparison of STEMI patients with healthy controls.
| | |
|---|---|
| Up-regulated DEGs | |
| Down-regulated DEGs | |
DEGs were defined as genes with |log2FC| ≥ 0.8 and adjusted p < 0.05.
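The selection rule stated in the Figure 1 caption (|log2FC| ≥ 0.8, adjusted p < 0.05, shared by all three datasets) can be sketched in a few lines. This is a minimal illustration, not the authors' pipeline: the gene names, dataset labels, and values below are invented toy data, and the helper names `select_degs` and `shared_degs` are hypothetical.

```python
from typing import Dict, Set, Tuple

# Thresholds as stated in the Figure 1 caption.
LOG2FC_CUTOFF = 0.8
ADJ_P_CUTOFF = 0.05

# Hypothetical per-dataset result table: gene -> (log2FC, adjusted p).
Result = Dict[str, Tuple[float, float]]


def select_degs(result: Result) -> Tuple[Set[str], Set[str]]:
    """Split one dataset's genes into up- and down-regulated DEG sets."""
    up, down = set(), set()
    for gene, (log2fc, adj_p) in result.items():
        # Skip genes that fail either the significance or the effect-size cutoff.
        if adj_p >= ADJ_P_CUTOFF or abs(log2fc) < LOG2FC_CUTOFF:
            continue
        (up if log2fc > 0 else down).add(gene)
    return up, down


def shared_degs(results: Dict[str, Result]) -> Tuple[Set[str], Set[str]]:
    """Keep only genes that are DEGs in the same direction in every dataset."""
    per_dataset = [select_degs(r) for r in results.values()]
    up = set.intersection(*(u for u, _ in per_dataset))
    down = set.intersection(*(d for _, d in per_dataset))
    return up, down


# Toy data (invented): GENE_A passes everywhere, GENE_B fails the p cutoff
# in dataset_3, GENE_C fails the fold-change cutoff in dataset_1.
toy = {
    "dataset_1": {"GENE_A": (1.2, 0.001), "GENE_B": (-1.0, 0.01), "GENE_C": (0.5, 0.001)},
    "dataset_2": {"GENE_A": (0.9, 0.02), "GENE_B": (-0.85, 0.03), "GENE_C": (1.1, 0.001)},
    "dataset_3": {"GENE_A": (2.0, 1e-6), "GENE_B": (-1.5, 0.2), "GENE_C": (0.9, 0.01)},
}
up, down = shared_degs(toy)
print(sorted(up), sorted(down))
```

Intersecting per-dataset sets, rather than pooling samples, keeps a gene only when each dataset supports it independently, which is how the caption describes the 90 upregulated and nine downregulated DEGs.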
Figure 2. Gene ontology (GO) analysis and significant enrichment of the DEGs. GO analysis classified the selected genes into cellular component (CC), molecular function (MF), and KEGG pathway groups and ranked the significantly enriched terms. The left vertical axis and the bars represent the gene count per term; the right vertical axis and the gray dots represent the log2 p-value (see Supplementary Table S4 for details). A p < 0.05 was considered statistically significant.
Figure 3. Terms of biological process (BP) by GO analysis. (A) Representative functional BP groups selected by the hypergeometric test and the percentage of terms per group. (B) The ontology relations of the annotated terms. A p < 0.05 was considered statistically significant.
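For readers unfamiliar with the hypergeometric test named in the Figure 3 caption, the sketch below computes a one-sided enrichment p-value for a single term. It is an assumption-laden illustration: the one-sided (over-representation) tail, the function name `hypergeom_enrichment_p`, and the toy counts are choices made here, and no multiple-testing correction is applied.

```python
from math import comb


def hypergeom_enrichment_p(k: int, n: int, K: int, N: int) -> float:
    """Upper-tail hypergeometric p-value P(X >= k).

    k: DEGs annotated to the term
    n: total DEGs tested
    K: background genes annotated to the term
    N: total background genes
    """
    total = comb(N, n)  # number of ways to draw n genes from the background
    upper = min(n, K)   # the overlap cannot exceed either set size
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


# Toy counts (invented): 2 of 3 DEGs hit a term annotated to 4 of 10 genes.
p = hypergeom_enrichment_p(k=2, n=3, K=4, N=10)
print(p)
```

A term is called enriched when this probability falls below the chosen significance level, here p < 0.05 as in the caption.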
Figure 4. (A) Construction of the PPI network based on the DEGs. Red ellipses represent upregulated DEGs; green ellipses represent downregulated DEGs. (B) The hub gene cluster with the highest scores in the PPI network is shown as yellow ellipses. (C–E) Heatmaps with clustering analysis show the normalized expression values of the hub genes in the GSE60993, GSE61144, and GSE66360 datasets, respectively.
Detailed information on the hub genes.
| Gene symbol | Description | log2FC | p-value | log2FC | p-value | log2FC | p-value |
|---|---|---|---|---|---|---|---|
| ITGAM | Integrin subunit alpha M | 0.8986 | 1.09E−2 | 0.8506 | 1.55E−5 | 0.9132 | 3.46E−2 |
| CLEC4D | C-type lectin domain family 4 member D | 1.5638 | 1.62E−2 | 1.7151 | 3.10E−6 | 2.6189 | 2.52E−9 |
| SLC2A3 | Solute carrier family 2 member 3 | 0.9867 | 4.68E−4 | 1.0359 | 3.02E−4 | 1.1033 | 3.70E−6 |
| BST1 | Bone marrow stromal cell antigen 1 | 1.1451 | 4.12E−4 | 0.9381 | 4.77E−6 | 2.0780 | 2.23E−8 |
| MCEMP1 | Mast cell expressed membrane protein 1 | 1.7537 | 9.86E−4 | 1.8760 | 2.06E−5 | 1.3833 | 4.15E−7 |
| PLAUR | Plasminogen activator, urokinase receptor | 1.0007 | 6.35E−3 | 0.8102 | 4.71E−5 | 2.3098 | 1.05E−9 |
| GPR97 | Adhesion G protein-coupled receptor G3 | 1.4664 | 3.55E−3 | 1.2396 | 9.35E−5 | 1.3983 | 8.66E−7 |
| MMP25 | Matrix metallopeptidase 25 | 1.1087 | 1.72E−2 | 1.5248 | 8.97E−5 | 1.2299 | 6.56E−6 |
Description of the gene was obtained via Human Genome Resources at NCBI. The log.
Figure 5. Construction of the LASSO regression model and ROC curves of the hub genes. (A) Binomial deviance for different numbers of variables in the LASSO regression model for GSE66360. The red dots represent the binomial deviance; the gray lines represent the standard error (SE); the vertical dotted lines mark the optimal values by the minimum criterion and the 1-SE criterion. “Lambda” is the tuning parameter. (B) Coefficients of the LASSO regression model selected by the 1-SE criterion: 0.3086, 0.2593, 0.2251, 0.2072, and 0.0541 for SLC2A3, CLEC4D, GPR97, PLAUR, and BST1, respectively. (C) ROC curves of the LASSO regression model for training, testing, and 5-fold cross-validation in GSE66360.
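The two vertical dotted lines in panel A correspond to two common ways of choosing lambda from cross-validation results. The sketch below shows the usual convention: the minimum criterion takes the lambda with the lowest mean deviance, and the 1-SE criterion takes the largest lambda whose mean deviance is within one standard error of that minimum. The function name `pick_lambdas` and all numbers are hypothetical toy values, not results from GSE66360.

```python
from typing import Sequence, Tuple


def pick_lambdas(
    lambdas: Sequence[float],
    mean_dev: Sequence[float],
    se_dev: Sequence[float],
) -> Tuple[float, float]:
    """Return (lambda_min, lambda_1se) from cross-validated deviance curves."""
    best = min(mean_dev)
    # Index of the minimum-deviance lambda; ties go to the larger lambda.
    i_min = max(
        (i for i, d in enumerate(mean_dev) if d <= best),
        key=lambda i: lambdas[i],
    )
    # 1-SE rule: most regularized model within one SE of the minimum.
    threshold = mean_dev[i_min] + se_dev[i_min]
    eligible = [lam for lam, dev in zip(lambdas, mean_dev) if dev <= threshold]
    return lambdas[i_min], max(eligible)


# Toy cross-validation curve (invented values).
lambdas = [0.20, 0.10, 0.05, 0.02, 0.01]
mean_dev = [1.30, 1.05, 0.98, 0.95, 0.97]
se_dev = [0.06, 0.05, 0.05, 0.04, 0.05]

lam_min, lam_1se = pick_lambdas(lambdas, mean_dev, se_dev)
print(lam_min, lam_1se)
```

Because a larger lambda shrinks more coefficients to zero, the 1-SE choice typically yields a sparser model, consistent with the five nonzero coefficients reported in panel B.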
Figure 6. (A–H) Changes in expression of the investigated hub genes over time in patients with post-STEMI HF and patients without HF at different time points after STEMI (on admission, at discharge, and at 1 and 6 months). Red line: patients with HF; black line: patients without HF. The upper and lower bars represent the maximum and minimum gene expression values, respectively. Statistical significance: *p < 0.05; **p < 0.01; ***p < 0.001, Wilcoxon rank-sum test. (I,J) ROC curves for ITGAM and BST1, respectively. AUC, area under the curve; ROC, receiver operating characteristic. The bars represent the 95% CI.
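The Wilcoxon rank-sum test and the AUC used in Figure 6 are closely related: the AUC equals the Mann-Whitney U statistic divided by the number of case-control pairs. The sketch below illustrates that identity on invented expression values; the function name `rank_sum_auc` is hypothetical, and p-values and confidence intervals are deliberately left out.

```python
from typing import Sequence, Tuple


def rank_sum_auc(cases: Sequence[float], controls: Sequence[float]) -> Tuple[float, float]:
    """Return (U, AUC) for cases versus controls.

    U counts the case-control pairs in which the case value is higher,
    with ties counted as one half; AUC = U / (n_cases * n_controls).
    """
    u = 0.0
    for x in cases:
        for y in controls:
            if x > y:
                u += 1.0
            elif x == y:
                u += 0.5
    return u, u / (len(cases) * len(controls))


# Toy expression values (invented) for patients with and without HF.
hf = [3.1, 2.8, 2.5]
non_hf = [2.0, 2.5, 1.9]

u_stat, auc = rank_sum_auc(hf, non_hf)
print(u_stat, auc)
```

An AUC of 0.5 means the gene does not separate the two groups, while values near 1 mean that nearly every patient with HF has higher expression than every patient without HF.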