
Integrated Microarray Analysis to Identify Genes and Small-Molecule Drugs Associated with Stroke Progression.

Shasha Cui1, Yunfeng Zhao2, Menghui Huang3, Huan Zhang4, Wei Zhao4, Zhenhua Chen4.   

Abstract

Several blood biomarkers are now considered increasingly important for stratifying risk, monitoring disease progression, and evaluating the response to therapy in ischemic stroke. The purpose of the present study was to identify the key genes associated with ischemic stroke progression and elucidate the potential therapeutic small molecules. Microarray datasets related to stroke for GSE58294, GSE22255, and GSE16561 were obtained from the Gene Expression Omnibus (GEO) database. Differentially expressed genes (DEGs) were filtered using the Limma package. DAVID was then searched to perform gene ontology (GO) and pathway enrichment analyses. Based on the DEGs, a protein-protein interaction (PPI) network was developed using Cytoscape, and MCODE was applied to conduct module analysis. Finally, to identify the potential drugs for ischemic stroke, the connectivity map (CMap) database was used. Sixty DEGs were identified after analyzing the three datasets. The GO data analysis revealed that the DEGs were significantly associated with biological processes, including positive regulation of programmed cell death, protein localization in organelles, and positive regulation of apoptosis. KEGG analysis showed that the DEGs were particularly enriched in the Fc epsilon RI signaling pathway, MAPK signaling pathway, and Huntington's disease. We selected five DEGs with high connectivity (CYBB, SYK, DUSP1, TNF, and SP1) that significantly predicted stroke progression. In addition, CMap prediction showed ten small molecules that could be used as adjuvants when treating ischemic stroke. The outcomes of the present study indicated that the five genes mentioned above can be considered potential targets for developing new medications that can modify the ischemic stroke process, and mycophenolic acid was the most promising small molecule to treat ischemic stroke.
Copyright © 2022 Shasha Cui et al.

Year:  2022        PMID: 36091596      PMCID: PMC9458405          DOI: 10.1155/2022/7634509

Source DB:  PubMed          Journal:  Evid Based Complement Alternat Med        ISSN: 1741-427X            Impact factor:   2.650


1. Introduction

In the U.S., stroke has become the third most common cause of death; additionally, this condition is the leading cause of disability according to the CDC. Most strokes (87%) are caused by ischemic events, which result in persistent neurological impairments and physical disabilities at high socioeconomic costs [1]. Physicians and patients should have access to prognostic information for the facilitation of patient care and for the allocation of healthcare resources. Prognostic biomarkers are becoming increasingly important to stratify risk, monitor disease progression, and evaluate the response to therapy. It is important to develop biomarkers for predicting the outcomes of treatment trials and developing neuroprotection targets. In this regard, easily measurable biomarkers can be used to predict mortality and function after stroke [2]. Concurrently, an association reportedly exists between incident strokes and circulating markers of inflammation and thrombosis, such as C-reactive protein, interleukin 6, and fibrinogen [3-5]. Brain damage arising from ischemia is caused by the immune system, and the damaged brain tissue consequently contributes to fatal infections. Inflammatory signaling is involved in all stages of the ischemic cascade, from early damaging events triggered by arterial occlusion to the late regenerative processes associated with postischemic tissue repair. Recent studies have revealed that stroke affects both the innate and adaptive immunity [1, 6]. Ischemic stroke is treated with intravenous thrombolysis and/or endovascular thrombectomy, which have been proven effective for reducing disability. Notably, in spite of their efficacy, these treatments are relatively expensive to perform and are consequently often unaffordable to patients, particularly in developing countries. 
Additionally, many treated patients still present with persistent infarctions, which highlights the need for further developments in the field of thrombolysis and thrombectomy [7]. Microarray technology based on mRNA expression profiles has facilitated the study of a wide range of diseases [8, 9]. Because these profiles measure the expression levels of several thousand genes simultaneously and serve as the basis for feature selection and classification, they can provide a better prognosis than prior models [10]. High-throughput technologies are also being used to elucidate, on a large scale, the mechanisms underlying the interactions between proteins and genes [11]. Various mRNA expression profiles have been reported in ischemic stroke. Most profiling studies in humans were performed during the acute phase or shortly after a stroke; consequently, they focus more on stroke severity and/or recovery mechanisms and less on stroke risk. To the best of our knowledge, gene expression changes specifically correlated with an increased risk of stroke have not been investigated in human studies. In the present study, we collected gene expression microarray datasets of ischemic stroke from the GEO database and identified the DEGs between patients with ischemic stroke and controls. The purpose of the current work was to identify potential molecular mechanisms, novel effective biomarkers, and therapeutic targets of ischemic stroke. Furthermore, CIBERSORT was used to analyze the proportions of immune cells in the samples from patients and controls, based on the gene expression profiles obtained from the public databases. The workflow of this investigation is shown in Figure 1.
Figure 1

Workflow of this investigation.

2. Materials and Methods

2.1. Microarray Data

The human genome microarray datasets GSE22255, GSE16561, and GSE58294 were downloaded from the GEO database [12-16]. Together, these datasets comprised 128 peripheral whole-blood samples from patients with ischemic stroke and 67 from normal controls.

2.2. Identification of DEGs

To screen the DEGs between ischemic stroke and normal samples, we used the statistical packages R and Bioconductor. In addition to the Series Matrix Files, we downloaded the SOFT annotation tables of the platforms. We used the R software to perform background corrections and normalization. After data analysis with the Limma package, the threshold for identifying DEGs was P < 0.05 [17, 18].
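The screening and overlap steps can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline (which used Limma in R): the gene names and statistics below are invented, and the |log2 FC| > 1 cutoff is taken from the Figure 2 caption rather than from this subsection, which states only P < 0.05.

```python
from typing import Dict, Set, Tuple

# Hypothetical per-dataset results: gene -> (log2 fold change, P value).
# All values are invented for illustration; none come from the study.
Results = Dict[str, Tuple[float, float]]


def filter_degs(results: Results, p_cutoff: float = 0.05,
                lfc_cutoff: float = 1.0) -> Set[str]:
    """Keep genes with P < p_cutoff and |log2 FC| > lfc_cutoff."""
    return {
        gene
        for gene, (lfc, p) in results.items()
        if p < p_cutoff and abs(lfc) > lfc_cutoff
    }


def overlapping_degs(*datasets: Results) -> Set[str]:
    """Genes that pass the filter in every dataset (the Venn intersection).

    Note: the direction of change is not compared across datasets here.
    """
    deg_sets = [filter_degs(d) for d in datasets]
    return set.intersection(*deg_sets) if deg_sets else set()


toy_a: Results = {"GENE_A": (1.5, 0.01), "GENE_B": (0.4, 0.001), "GENE_C": (-2.0, 0.03)}
toy_b: Results = {"GENE_A": (1.2, 0.04), "GENE_C": (-1.1, 0.20), "GENE_D": (2.2, 0.001)}
toy_c: Results = {"GENE_A": (1.3, 0.02), "GENE_C": (-1.8, 0.01)}

shared = overlapping_degs(toy_a, toy_b, toy_c)
print(sorted(shared))
```

In this toy input only GENE_A passes both cutoffs in all three datasets; GENE_C is dropped because its P value in the second dataset exceeds 0.05.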

2.3. GO and KEGG Pathway Enrichment Analyses on the DEGs

Gene enrichment and functional annotation analyses are frequently conducted using the DAVID database. The database combines analytical methods with biological data to provide a thorough and systematic annotation of the biological functions of large lists of proteins or genes [19]. DAVID was applied to investigate the GO annotations and KEGG pathway enrichment of the identified DEGs. The TXT result files of the GO and KEGG pathway enrichment analyses were downloaded [20, 21].
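The idea behind an enrichment P value can be illustrated with a plain hypergeometric upper tail. DAVID itself reports a modified, more conservative Fisher exact P value (the EASE score), so the sketch below is a simplified stand-in rather than DAVID's calculation, and the background and annotation sizes in the example are hypothetical.

```python
from math import comb


def hypergeom_upper_tail(k: int, n: int, K: int, N: int) -> float:
    """P(X >= k) for X ~ Hypergeometric.

    N: genes in the background, K: background genes annotated to the term,
    n: genes in the query list, k: query genes annotated to the term.
    """
    total = comb(N, n)
    upper = min(n, K)
    # math.comb returns 0 when the lower argument exceeds the upper one,
    # so impossible combinations contribute nothing to the sum.
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / total


# Hypothetical example: 60 query genes, 7 of them annotated to a term that
# covers 500 of 20000 background genes (background sizes are assumptions).
p_value = hypergeom_upper_tail(k=7, n=60, K=500, N=20000)
print(f"enrichment P = {p_value:.3g}")
```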

2.4. PPI Network Construction and Analysis of Modules

STRING (https://string-db.org/) is an online tool for retrieving interacting proteins and genes. We used STRING to analyze the PPI network of the DEGs and gain insight into the relationships among the genes. We screened the hub genes according to their degrees using the Cytoscape software. The MCODE plugin of Cytoscape was used with the default parameters “Degree Cutoff = 2, Node Score Cutoff = 0.2, K-Core = 2, and Max.Depth = 100” to analyze the PPI network modules. Furthermore, signaling pathway enrichment analysis was performed on the most significant modules [22-24].
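Ranking hub genes by degree amounts to counting how many distinct partners each node has in the interaction network. The following sketch assumes the network is available as a simple edge list; the node names are placeholders, and MCODE module detection is not reproduced here.

```python
from collections import Counter
from typing import Iterable, List, Tuple

Edge = Tuple[str, str]


def node_degrees(edges: Iterable[Edge]) -> Counter:
    """Degree of each node in an undirected network.

    Duplicate edges (in either orientation) and self-loops are ignored.
    """
    unique_pairs = {frozenset(edge) for edge in edges if edge[0] != edge[1]}
    degree: Counter = Counter()
    for pair in unique_pairs:
        for node in pair:
            degree[node] += 1
    return degree


def top_hubs(edges: Iterable[Edge], n: int = 5) -> List[str]:
    """Return the n highest-degree nodes, ties broken alphabetically."""
    degree = node_degrees(edges)
    return sorted(degree, key=lambda node: (-degree[node], node))[:n]


# Placeholder edge list; these are not interactions from the study.
toy_edges = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"), ("P2", "P3"), ("P3", "P2")]
print(top_hubs(toy_edges, n=2))
```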

2.5. Analysis and Validation of Key Genes

Using enrichment analysis, we confirmed the importance of these key genes in the progression and pathogenesis of ischemic stroke. We analyzed and visualized the biological processes of the key genes using the Biological Networks Gene Ontology tool (BiNGO), a plugin for Cytoscape [25].

2.6. Identification of Candidate Small Molecules

To identify the potential drugs for ischemic stroke, we used CMap to query the gene signature of the condition. CMap predicts, in silico, the drugs that can induce or reverse the biological state encoded in a particular gene expression signature. The overlapping DEGs were divided into two groups, upregulated and downregulated, and the CMap database was queried using these probe sets. Finally, the similarity enrichment score was determined, with a range of −1 to +1. A positive connectivity score indicates that a drug can induce the input signature in human cell lines, whereas a negative connectivity score indicates that the drug can reverse it. A negative connectivity score therefore indicates potential therapeutic value. To filter the instances with different connectivity scores, we ranked all the instances based on their P value, with P < 0.05 indicating statistical significance [26].

2.7. Estimation of Immune Cell Type Fractions

We estimated the proportions of immune cells in the ischemic stroke samples using the CIBERSORT method and the LM22 gene signature (http://cibersort.stanford.edu/). CIBERSORT is a deconvolution method that infers immune cell fractions from microarray gene expression profiles. CIBERSORT uses Monte Carlo sampling to derive a P value for each sample, which provides a measure of confidence in the results. The inferred fractions of the immune cell populations generated by CIBERSORT were considered accurate at a threshold of P < 0.05. For each sample, we separately calculated the immune cell proportions for each gene expression series.
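The Monte Carlo P value mentioned above can be read as an empirical P value: a goodness-of-fit statistic for the real sample is compared with the same statistic computed on many randomly generated null mixtures. The sketch below shows only that final counting step. The choice of statistic, the way the null values are generated, and the +1 correction are assumptions made for illustration and may differ from what CIBERSORT does internally.

```python
import random
from typing import Sequence


def empirical_p_value(observed: float, null_values: Sequence[float]) -> float:
    """Fraction of null statistics at least as large as the observed one.

    The +1 in numerator and denominator keeps the estimate above zero;
    this correction is an assumption of the sketch.
    """
    exceed = sum(1 for value in null_values if value >= observed)
    return (exceed + 1) / (len(null_values) + 1)


# Illustrative use with a simulated null distribution (not real data).
rng = random.Random(0)
null_stats = [rng.random() for _ in range(999)]
p = empirical_p_value(0.99, null_stats)
keep_sample = p < 0.05  # threshold used in the text
print(p, keep_sample)
```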

3. Results

3.1. Identification of DEGs

After preprocessing, 60 DEGs were identified in the ischemic stroke samples compared with the control samples. Figure 2(a) presents the volcano plots of the DEGs for ischemic stroke in each dataset. The three datasets were compared using a Venn diagram to identify the 60 overlapping DEGs (Figure 2(b)).
Figure 2

(a) Volcano plot of gene expression profile data between ischemic stroke (IC) and normal samples in each dataset. Red dots: significantly upregulated genes in IC; green dots: significantly downregulated genes in IC; black dots: nondifferentially expressed genes. P < 0.05 and |log2 FC| > 1 were considered significant. (b) Venn diagram of the 60 overlapping DEGs.

3.2. GO and KEGG Pathway Enrichment Analysis

The GO function and KEGG pathway enrichment analyses were performed using DAVID to characterize the overlapping DEGs among the three datasets. According to the GO enrichment analysis, terms in the biological process (BP), molecular function (MF), and cellular component (CC) categories were all enriched. For MF, these DEGs were enriched in oxidoreductase activity acting on NADH or NADPH, histone binding, and phosphoprotein phosphatase activity. In the BP category, these genes were significantly enriched in the positive regulation of apoptosis and protein localization in organelle. In the CC category, these DEGs were significantly associated with the mitochondrion, histone methyltransferase complex, and methyltransferase complex. In the KEGG pathway analysis, the DEGs were enriched in Huntington's disease, the MAPK signaling pathway, and the Fc epsilon RI signaling pathway (Figure 3(a), Table 1).
Figure 3

(a) Functional and signaling pathway analysis of the overlapping DEGs in IC, (b) protein-protein interaction network construction and module analysis, and (c) hub genes of DEGs for IC via MCODE.

Table 1

Functional and pathway enrichment analysis of the overlap DEGs.

Category        Term                                                             Count
GOTERM_BP_FAT   GO:0033365∼protein localization in organelle                     5
GOTERM_BP_FAT   GO:0043065∼positive regulation of apoptosis                      7
GOTERM_BP_FAT   GO:0043068∼positive regulation of programmed cell death          7
GOTERM_CC_FAT   GO:0005739∼mitochondrion                                         10
GOTERM_CC_FAT   GO:0035097∼histone methyltransferase complex                     2
GOTERM_CC_FAT   GO:0034708∼methyltransferase complex                             2
GOTERM_MF_FAT   GO:0016651∼oxidoreductase activity, acting on NADH or NADPH      4
GOTERM_MF_FAT   GO:0042393∼histone binding                                       3
GOTERM_MF_FAT   GO:0004721∼phosphoprotein phosphatase activity                   4
KEGG_PATHWAY    hsa04010:MAPK signaling pathway                                  5
KEGG_PATHWAY    hsa05016:Huntington's disease                                    4
KEGG_PATHWAY    hsa04664:Fc epsilon RI signaling pathway                         3

3.3. Construction of PPI Network and Screening of Modules

A PPI network comprising 38 nodes and 37 edges was constructed in Cytoscape based on information from the STRING database (Figure 3(b)). From this PPI network, the MCODE plugin with default settings identified a module in which five genes were assembled. We investigated the KEGG pathways associated with the assembled genes. The enriched KEGG pathways comprised the NF-kappa B signaling pathway, the TGF-beta signaling pathway, necroptosis, the NOD-like receptor signaling pathway, and osteoclast differentiation (Figure 3(c)). CYBB, SYK, DUSP1, TNF, and SP1, which showed a high degree of connectivity in the module, were selected as key genes.

3.4. Analysis and Confirmation of Key Genes

The BiNGO analysis of the biological processes revealed that the five key genes play an important role in high-affinity L-histidine transmembrane transporter activity, glycogen glucosyltransferase activity, high-affinity basic amino acid transmembrane transporter activity, histidine transport, and L-histidine transmembrane transporter activity (Figure 4(a)). To further explore the molecular mechanism of the key genes in ischemic stroke, we used GCBI (Gene-Cloud Biotechnology Information) analysis to identify the potential transcription factors and to create a regulatory network of the long noncoding RNAs, microRNAs, and mRNAs involved in key gene expression (Figures 4(b) and 5(a)).
Figure 4

(a) The biological process of the nine hub genes analyzed by BiNGO. The color depth of nodes represents the corrected P value. The size of nodes represents the number of genes involved, and (b) a regulatory network of lncRNA-miRNA-mRNA constructed by GCBI. Purple nodes: related lncRNA; blue nodes: targeted miRNA.

Figure 5

(a) The potential transcription factors that could be involved in regulating the expression of hub genes and (b) pop plot of the top 20 identified small molecules that could reverse the gene expression of IC.

3.5. Small-Molecule Drug Screening

To screen for small-molecule drugs, we queried CMap with the probe sets that differed consistently between ischemic stroke samples and healthy controls. A list of small molecules with highly significant correlations is provided in Figure 5(b) and Table 2. Mycophenolic acid, calmidazolium, zidovudine, clorsulon, and thioridazine showed the strongest negative correlations and are therefore the most likely to be effective in treating ischemic stroke.
Table 2

List of the 10 most significant small-molecule drugs that can reverse the gene expression status of ischemic stroke.

CMap name                  Enrichment   P
Mycophenolic acid          −0.95        0.00022
Calmidazolium              −0.87        0.03338
Zidovudine                 −0.862       0.00066
Clorsulon                  −0.826       0.00171
Flunarizine                −0.774       0.00527
Hycanthone                 −0.77        0.00567
Mecamylamine               −0.766       0.02612
Diphemanil methylsulfate   −0.759       0.0015
Dirithromycin              −0.752       0.03093
Prestwick-1082             −0.749       0.03219
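As a small worked example, the selection rule described in Section 2.6 (negative enrichment score, P < 0.05, ranked by P value) can be applied directly to the rows of Table 2. The values are copied from the table; the function name and data layout are illustrative choices, not part of CMap.

```python
from typing import List, Tuple

Row = Tuple[str, float, float]  # (CMap name, enrichment score, P value)

# Values transcribed from Table 2.
table2: List[Row] = [
    ("Mycophenolic acid", -0.95, 0.00022),
    ("Calmidazolium", -0.87, 0.03338),
    ("Zidovudine", -0.862, 0.00066),
    ("Clorsulon", -0.826, 0.00171),
    ("Flunarizine", -0.774, 0.00527),
    ("Hycanthone", -0.77, 0.00567),
    ("Mecamylamine", -0.766, 0.02612),
    ("Diphemanil methylsulfate", -0.759, 0.0015),
    ("Dirithromycin", -0.752, 0.03093),
    ("Prestwick-1082", -0.749, 0.03219),
]


def rank_candidates(rows: List[Row], p_cutoff: float = 0.05) -> List[Row]:
    """Keep molecules with a negative score and P < p_cutoff, sorted by P."""
    kept = [row for row in rows if row[1] < 0 and row[2] < p_cutoff]
    return sorted(kept, key=lambda row: row[2])


ranked = rank_candidates(table2)
for name, score, p in ranked:
    print(f"{name:26s} {score:7.3f} {p:.5f}")
```

All ten molecules pass the filter. Mycophenolic acid ranks first by both enrichment score and P value, whereas calmidazolium, second by enrichment score, ranks last by P value.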

3.6. Estimation of Immune Cell Type Fractions

Based on CIBERSORT, as shown in Figure 6, the fractions of CD8+ T cells, gamma delta T cells, resting dendritic cells, and follicular helper T cells were consistently lower in the normal samples than in the ischemic stroke samples, whereas the fractions of activated NK cells, M0 macrophages, activated mast cells, and neutrophils were significantly lower in the ischemic stroke samples.
Figure 6

Violin plot of immune cell fractions.

4. Discussion

A gene chip, or microarray, is a type of biochip that has recently garnered considerable attention because it allows biochemical information on gene expression profiles in hereditary diseases to be retrieved efficiently and on a large scale [27]. The GEO database, a large and comprehensive public resource for gene expression data, contains a wide variety of gene expression profiles [28]. In a recent study, Chen et al. constructed a ceRNA network with three DElncRNAs, three DEmiRNAs, and seven DEmRNAs for stroke [29]. Xu et al. performed a weighted gene coexpression network analysis and identified the key biomarkers and immune infiltration in female stroke patients. These results may facilitate the development of new diagnostic and treatment strategies for stroke patients. In the present study, we compared the gene expression profiles of 128 patients with ischemic stroke with those of 67 controls retrieved from the GEO database. Compared with the controls, 60 genes showed significantly different expression patterns in patients with ischemic stroke. These DEGs may play a role in the development of ischemic stroke, and molecules related to them may likewise be used therapeutically when treating ischemic stroke. To explore the potential functions of the identified DEGs, we performed functional and pathway enrichment analyses. Protein localization in organelles, positive regulation of apoptosis, and positive regulation of programmed cell death were the three most important biological processes of these DEGs. The molecular functions enriched among the overlapping DEGs were primarily oxidoreductase activity acting on NADH or NADPH, histone binding, and phosphoprotein phosphatase activity. The mitochondrion, the histone methyltransferase complex, and the methyltransferase complex were the three cellular components that showed the most substantial changes. 
Additionally, the overlapping DEGs were enriched in the Fc epsilon RI signaling pathway, Huntington's disease, and the MAPK signaling pathway. MAPK has been considered a key regulator of ischemic and hemorrhagic cerebrovascular disease, which indicates its potential as a target in stroke therapy. Under ischemic conditions, in primary cortical neurons and brain tissue, the NF-κB and MAPK signaling pathways play a pivotal role in regulating the expression and activation of the NLRP3 and NLRP1 inflammasomes. It is well known that inflammation after an ischemic stroke causes neuronal cell death and brain damage. Future treatments for ischemic stroke may therefore include therapeutic approaches that target inflammasome activity in neurons [30, 31]. A PPI network was constructed from the overlapping DEGs, and an effective network module was identified. CYBB, SYK, DUSP1, TNF, and SP1 were further analyzed, found to be significantly associated with the pathogenesis and prognosis of ischemic stroke, and selected as the key genes of this module. Compared with that in the controls, a significant reduction was observed in the expression of these genes in the ischemic stroke groups. Further analysis of the genes coexpressed with these key genes confirmed a significant association with ischemic stroke pathogenesis and prognosis. These findings support the current results that CYBB, SYK, DUSP1, TNF, and SP1 may have important functions in the disease, and these key genes are currently being investigated to further improve our understanding of ischemic stroke. The transcription factors associated with each key gene have been predicted, and a regulatory network of long noncoding RNAs, microRNAs, and mRNAs has been established. These regulatory networks will help elucidate the possible mechanisms through which these key genes are expressed and produce proteins associated with ischemic stroke progression. 
In arterial thrombosis as well as in ischemic stroke, platelet collagen receptor glycoprotein VI (GPVI) plays a key role, owing to which the associated signaling pathway can be considered an effective target for pharmacological interventions. Spleen tyrosine kinase (Syk) is a crucial signaling mediator downstream of GPVI and of other platelet and immune cell receptors. According to Van Eeuwijk et al., BI1002494 could be used in a well-established mouse model to treat ischemic stroke and prevent its recurrence. In addition to supporting stroke progression, tumor necrosis factor alpha (TNF-α) interferes with brain function. Furthermore, Liguz-Lecznar et al. reported that, 1 week after stroke, experience-dependent plasticity was reduced in the cortex adjacent to the stroke-induced lesion, accompanied by elevated TNF-α expression in the brain of ischemic mice. In the early poststroke period, impaired functional cortical plasticity could be rescued by inhibiting TNF-α R1 signaling [32, 33]. Our analysis of the overlapping genes and the CMap database revealed a set of small-molecule drugs that could reverse ischemic stroke-induced gene expression. Small molecules with negative enrichment values could restore the abnormal gene expression levels arising from ischemic stroke. This analysis will facilitate the discovery of new targeted therapeutic drugs for ischemic stroke treatment and management. Mycophenolic acid was the most significant small molecule (enrichment score = −0.95), and its efficacy and safety in ischemic stroke have not yet been investigated. Moreover, the correlation of ischemic stroke with calmidazolium (enrichment score = −0.87) remains relatively unclear. Further investigation is therefore required into the potential of the small molecules listed above in the treatment of ischemic stroke. 
To estimate the proportions of immune cells in ischemic stroke samples, the CIBERSORT method was used, and we found that the fractions of resting dendritic cells, follicular helper T cells, CD8+ T cells, and gamma delta T cells were significantly higher in the ischemic stroke samples than in the normal samples, whereas the fractions of neutrophils, M0 macrophages, activated mast cells, and activated NK cells were significantly lower. Further investigation is warranted to elucidate the correlation between the occurrence of ischemic stroke and immune infiltration. In conclusion, by mining the gene expression profiles of peripheral whole-blood samples from patients with ischemic stroke and by performing a comprehensive microarray analysis, we identified five key genes that help elucidate the molecular mechanism of the initiation and progression of ischemic stroke. CYBB, SYK, DUSP1, TNF, and SP1 could act as effective novel biomarkers for the diagnosis and treatment of ischemic stroke. We also identified several small-molecule drugs that may be of interest as potential new drugs for ischemic stroke. Furthermore, we compared the proportions of immune cells between the ischemic stroke and normal samples, which helped improve our understanding of the correlation between immune infiltration and ischemic stroke pathogenesis.
  33 in total

1.  DAVID: Database for Annotation, Visualization, and Integrated Discovery.

Authors:  Glynn Dennis; Brad T Sherman; Douglas A Hosack; Jun Yang; Wei Gao; H Clifford Lane; Richard A Lempicki
Journal:  Genome Biol       Date:  2003-04-03       Impact factor: 13.583

2.  affy--analysis of Affymetrix GeneChip data at the probe level.

Authors:  Laurent Gautier; Leslie Cope; Benjamin M Bolstad; Rafael A Irizarry
Journal:  Bioinformatics       Date:  2004-02-12       Impact factor: 6.937

3.  BiNGO: a Cytoscape plugin to assess overrepresentation of gene ontology categories in biological networks.

Authors:  Steven Maere; Karel Heymans; Martin Kuiper
Journal:  Bioinformatics       Date:  2005-06-21       Impact factor: 6.937

4.  Inhibition of Tnf-α R1 signaling can rescue functional cortical plasticity impaired in early post-stroke period.

Authors:  Monika Liguz-Lecznar; Renata Zakrzewska; Malgorzata Kossut
Journal:  Neurobiol Aging       Date:  2015-06-18       Impact factor: 4.673

5.  SCAMP4 is a novel prognostic marker and correlated with the tumor progression and immune infiltration in glioma.

Authors:  Xinqi Ge; Ziheng Wang; Rui Jiang; Shiqi Ren; Wei Wang; Bing Wu; Yu Zhang; Qianqian Liu
Journal:  Int J Biochem Cell Biol       Date:  2021-08-12       Impact factor: 5.085

Review 6.  Inflammation in atherosclerosis: from vascular biology to biomarker discovery and risk prediction.

Authors:  René R S Packard; Peter Libby
Journal:  Clin Chem       Date:  2008-01       Impact factor: 8.327

7.  Association of circulating inflammatory markers with recurrent vascular events after stroke: a prospective cohort study.

Authors:  William Whiteley; Caroline Jackson; Steff Lewis; Gordon Lowe; Ann Rumley; Peter Sandercock; Joanna Wardlaw; Martin Dennis; Cathie Sudlow
Journal:  Stroke       Date:  2010-12-02       Impact factor: 7.914

8.  Machine-learning approach identifies a pattern of gene expression in peripheral blood that can accurately detect ischaemic stroke.

Authors:  Grant C O'Connell; Ashley B Petrone; Madison B Treadway; Connie S Tennant; Noelle Lucke-Wold; Paul D Chantler; Taura L Barr
Journal:  NPJ Genom Med       Date:  2016-11-30       Impact factor: 8.617

9.  Comprehensive Analysis of Hub Genes Associated With Competing Endogenous RNA Networks in Stroke Using Bioinformatics Analysis.

Authors:  Xiuqi Chen; Danhong Wu
Journal:  Front Genet       Date:  2022-01-12       Impact factor: 4.599

10.  Gene expression in peripheral immune cells following cardioembolic stroke is sexually dimorphic.

Authors:  Boryana Stamova; Glen C Jickling; Bradley P Ander; Xinhua Zhan; DaZhi Liu; Renee Turner; Carolyn Ho; Jane C Khoury; Cheryl Bushnell; Arthur Pancioli; Edward C Jauch; Joseph P Broderick; Frank R Sharp
Journal:  PLoS One       Date:  2014-07-18       Impact factor: 3.240
