Abstract
The aim of the present study was to identify risk genes in myocardial infarction. Microarray dataset GSE34198, containing peripheral blood data from 49 myocardial infarction samples and 48 corresponding control samples, was downloaded from the Gene Expression Omnibus database to screen for differentially expressed genes (DEGs). The DEGs were used to construct a protein‑protein interaction (PPI) network of patient samples, from which feature genes were identified using the neighborhood scoring method. The recursive feature elimination (RFE) algorithm was employed to select risk genes among the feature genes, and these were subsequently used to construct a support vector machine (SVM) classifier to identify the specific signature of myocardial infarction samples. Another dataset, GSE61144, was downloaded to verify the efficacy of the classifier. A total of 724 downregulated and 483 upregulated DEGs were identified in patient samples compared with control samples in the GSE34198 dataset. The PPI network of myocardial infarction comprised 1,083 nodes (genes) and 46,363 edges (connections). Using the neighborhood scoring method, the 100 top-scoring genes were identified as the disease feature genes, which distinguished the myocardial infarction samples from the control samples. The RFE algorithm selected 15 risk genes, which were used to construct an SVM classifier with an average precision of 88% for the patient samples, as visualized by a confusion matrix. The predictive precision of the classifier on another microarray dataset, GSE61144, was 0.92, with an average true positive rate of 0.9278 and an average false positive rate of 0.2361. A‑kinase‑anchoring protein 12 (AKAP12) and glycine receptor α2 (GLRA2) were two of the risk genes in the SVM classifier.
Therefore, AKAP12 and GLRA2 may have roles in the development of myocardial infarction, potentially by influencing cardiac contractility and protecting against ischemia‑reperfusion injury, which may provide clues for developing diagnostic biomarkers or therapeutic targets for myocardial infarction.
Year: 2017 PMID: 29138828 PMCID: PMC5780094 DOI: 10.3892/mmr.2017.8044
Source DB: PubMed Journal: Mol Med Rep ISSN: 1791-2997 Impact factor: 2.952
Figure 1. Distribution of node degrees in the protein-protein interaction network. The x-axis represents the log (degree) value; the y-axis indicates the number of corresponding nodes in each log (degree) range.
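The degree distribution described in this caption can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `degree_histogram`, the base-10 logarithm, the bin width of 0.5 and the toy edge list are all assumptions, since the record does not state how the degrees were binned.

```python
import math
from collections import Counter

def degree_histogram(edges, bin_width=0.5):
    """Count nodes per log10(degree) bin for an undirected edge list.

    Assumptions (not from the source): base-10 logarithm, fixed bin width.
    """
    # Drop self-loops and duplicate edges before counting degrees
    unique = {frozenset(e) for e in edges if e[0] != e[1]}
    degree = Counter()
    for a, b in map(tuple, unique):
        degree[a] += 1
        degree[b] += 1
    bins = Counter()
    for d in degree.values():
        bins[math.floor(math.log10(d) / bin_width)] += 1
    # Key each bin by the lower edge of its log10(degree) range
    return {round(i * bin_width, 2): n for i, n in sorted(bins.items())}

# Hypothetical toy network, not the 1,083-node PPI network of the study
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("A", "E"), ("B", "C")]
hist = degree_histogram(edges)
```

In the toy network, node A has degree 4 and falls into the bin starting at 0.5, while the remaining four nodes (degree 1 or 2) fall into the bin starting at 0.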
Feature genes with top 10 neighbor scores.
| Node | NS | Log (fold change) | P-value |
|---|---|---|---|
| EHBP1 | 0.96 | 1.0153 | 0.0004 |
| EXOC6B | 0.96 | 0.9025 | 0.0016 |
| GRB10 | 0.92 | 0.9488 | 0.0009 |
| AKAP12 | 0.91 | 0.9764 | 0.0007 |
| SOX4 | 0.91 | 0.8647 | 0.0026 |
| GLRA3 | 0.91 | −0.8335 | 0.0036 |
| GLRA2 | 0.91 | −0.9855 | 0.0006 |
| PPP1R3A | 0.90 | −1.0402 | 0.0003 |
| FABP4 | 0.90 | 1.0953 | 0.0001 |
| MED13L | 0.90 | 0.7106 | 0.0132 |
NS, neighbor score; EHBP1, EH domain-binding protein 1; EXOC6B, exocyst complex component 6B; GRB10, growth factor receptor-bound protein 10; AKAP12, A-kinase-anchoring protein 12; SOX4, SRY-box 4; GLRA, glycine receptor α; PPP1R3A, protein phosphatase 1 regulatory subunit 3A; FABP4, fatty acid-binding protein 4; MED13L, mediator complex subunit 13-like.
Figure 2. Clustering analysis results for the top 100 feature genes. The x-axis represents the samples, with control samples marked in green and patient samples marked in red.
Figure 3. Feature elimination of the top 100 feature genes. The x-axis represents the number of feature genes and the y-axis indicates the corresponding prediction precision. The gene combination with the highest precision, a 15-gene combination, is marked in red.
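The elimination procedure behind this figure can be illustrated with a small sketch. The record does not describe the implementation used in the study, so everything here is an assumption: the linear SVM is trained with a Pegasos-style subgradient method written in plain Python (a swapped-in technique chosen only to keep the example self-contained), the bias term is omitted, the helper names `train_linear_svm` and `svm_rfe` are hypothetical, and the data are synthetic. The sketch also stops at a fixed number of features, whereas the figure indicates that the subset with the highest precision was chosen.

```python
import random

def train_linear_svm(X, y, lam=0.01, epochs=200, seed=0):
    """Pegasos-style subgradient descent for a linear SVM without bias.

    Labels must be -1 or +1. Returns the weight vector.
    """
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    order = list(range(len(X)))
    t = 0
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)  # decaying step size
            margin = y[i] * sum(wj * xj for wj, xj in zip(w, X[i]))
            w = [(1 - eta * lam) * wj for wj in w]  # regularization shrink
            if margin < 1:  # hinge loss is active for this sample
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
    return w

def svm_rfe(X, y, n_keep):
    """Repeatedly retrain and drop the feature with the smallest squared weight."""
    active = list(range(len(X[0])))
    while len(active) > n_keep:
        X_sub = [[row[j] for j in active] for row in X]
        w = train_linear_svm(X_sub, y)
        worst = min(range(len(active)), key=lambda k: w[k] ** 2)
        del active[worst]
    return active  # indices of the surviving features

# Synthetic data: 2 informative features followed by 4 noise features
rng = random.Random(1)
X, y = [], []
for i in range(40):
    label = 1 if i % 2 == 0 else -1
    informative = [label * 2.0 + rng.gauss(0, 0.5) for _ in range(2)]
    noise = [rng.gauss(0, 0.5) for _ in range(4)]
    X.append(informative + noise)
    y.append(label)

kept = svm_rfe(X, y, n_keep=2)
```

To mirror the figure more closely, one would record a cross-validated precision at each subset size and keep the size with the best score rather than passing a fixed `n_keep`.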
Risk genes in myocardial infarction samples.
| Gene | Log (fold change) | P-value |
|---|---|---|
| HES5 | −0.8925 | 0.0018 |
| ZNF417 | −0.8260 | 0.0040 |
| GLRA2 | −0.9855 | 0.0006 |
| OR8D2 | −0.8135 | 0.0045 |
| HOXA7 | 0.7150 | 0.0126 |
| FABP6 | 0.9234 | 0.0013 |
| MUSK | −0.7975 | 0.0054 |
| HTR6 | −0.7651 | 0.0076 |
| GRIP2 | −0.9973 | 0.0005 |
| OR51M1 | −0.8125 | 0.0046 |
| OR1C1 | −0.7755 | 0.0068 |
| KLRK1 | −0.9248 | 0.0013 |
| VEGFA | 0.8442 | 0.0032 |
| AKAP12 | 0.9764 | 0.0007 |
| RHEB | 0.9288 | 0.0012 |
HES5, hes family bHLH transcription factor 5; ZNF417, zinc-finger protein 417; GLRA2, glycine receptor α2; OR, olfactory receptor; OR8D2, OR family 8 subfamily D member 2 (gene/pseudogene); HOXA7, homeobox A7; FABP6, fatty acid-binding protein 6; MUSK, muscle-associated receptor tyrosine kinase; HTR6, 5-hydroxytryptamine receptor 6; GRIP2, glutamate receptor-interacting protein 2; OR51M1, OR family 51 subfamily M member 1; OR1C1, OR family 1 subfamily C member 1; KLRK1, killer cell lectin-like receptor K1; VEGFA, vascular endothelial growth factor A; AKAP12, A-kinase-anchoring protein 12; RHEB, Ras homolog mTORC1-binding.
Figure 4. ROC curve of the support vector machine classifier. The x-axis represents the false positive rate and the y-axis indicates the true positive rate. The simulation results (mean ROC) are marked by the dotted line. ROC, receiver operating characteristic.
Figure 5. Confusion matrix of the support vector machine classifier.
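For readers unfamiliar with how precision and the true and false positive rates follow from a confusion matrix, a minimal sketch is given below. The labels are invented for illustration and do not reproduce the study's results; the function names `confusion_counts` and `rates` are hypothetical, and the code assumes each denominator is non-zero.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (TP, FP, TN, FN) for binary labels."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn

def rates(tp, fp, tn, fn):
    """Derive summary metrics; assumes every denominator is non-zero."""
    return {
        "precision": tp / (tp + fp),  # share of predicted patients that are patients
        "tpr": tp / (tp + fn),        # true positive rate (sensitivity)
        "fpr": fp / (fp + tn),        # false positive rate
    }

# Hypothetical labels: 1 = myocardial infarction, 0 = control
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
counts = confusion_counts(y_true, y_pred)
metrics = rates(*counts)
```

With these toy labels the counts are 3 true positives, 1 false positive, 3 true negatives and 1 false negative, giving a precision and true positive rate of 0.75 and a false positive rate of 0.25.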
Figure 6. ROC curve of the support vector machine classifier verified by the GSE61144 dataset. The x-axis represents the false positive rate and the y-axis indicates the true positive rate. ROC, receiver operating characteristic; FPR, false positive rate; TPR, true positive rate; AUC, area under the curve.
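An ROC curve and its AUC can be derived from classifier scores as sketched below. This is an illustration under stated assumptions rather than the procedure used in the study: the scores and labels are invented, the function names `roc_points` and `auc` are hypothetical, and the area is computed with the trapezoidal rule.

```python
def roc_points(y_true, scores):
    """Return (FPR, TPR) points obtained by lowering the decision threshold.

    Labels are 1 (positive) or 0 (negative); higher scores mean more positive.
    """
    pos = sum(1 for t in y_true if t == 1)
    neg = len(y_true) - pos
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Consume all samples tied at this score before emitting a point
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points

def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(points, points[1:]))

# Hypothetical scores for six samples (1 = patient, 0 = control)
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
curve = roc_points(y_true, scores)
area = auc(curve)
```

For the toy data, 8 of the 9 patient-control pairs are ranked correctly, so the AUC is 8/9, roughly 0.889.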