Immunoglobulin A (IgA) nephropathy (IgAN) is the most common glomerular disease. The major pathological changes associated with it affect cell proliferation, fibrosis, apoptosis, inflammation and extracellular matrix (ECM) organization. However, the molecular events underlying IgAN remain to be fully elucidated. In the present study, an integrated bioinformatics analysis was applied to further explore novel potential gene targets for IgAN. The mRNA expression profile datasets GSE93798 and GSE37460 were downloaded from the Gene Expression Omnibus database. After data preprocessing, differentially expressed genes (DEGs) were identified. Gene Ontology (GO) enrichment analysis of DEGs was performed. Protein-protein interaction (PPI) networks of the DEGs were built with the STRING online search tool and visualized by using Cytoscape, and hub genes were identified through the degree of connectivity in the PPI. The hub genes were subjected to Kyoto Encyclopedia of Genes and Genomes pathway analysis, and co-expression analysis was performed. A total of 298 DEGs between IgAN and control groups were identified, and 148 and 150 of these DEGs were upregulated and downregulated, respectively. The DEGs were enriched in distinct GO terms for Biological Process, including cell growth, epithelial cell proliferation, ERK1 and ERK2 cascades, regulation of apoptotic signaling pathway and ECM organization. The top 10 hub genes were then screened from the PPI network by Cytoscape. As novel hub genes, Fos proto-oncogene, AP-1 transcription factor subunit and early growth response 1 were determined to be closely associated with apoptosis and cell proliferation in IgAN. Tumor protein 53, integrin subunit β2 and fibronectin 1 may also be involved in the occurrence and development of IgAN. Co-expression analysis suggested that these hub genes were closely linked with each other. 
In conclusion, the present integrated bioinformatics analysis provided novel insight into the molecular events and novel candidate gene targets of IgAN.
Keywords:
IgA nephropathy; bioinformatics; gene expression; hub gene
Immunoglobulin A (IgA) nephropathy (IgAN) is a common primary renal disease worldwide (1,2) and has emerged as an important healthcare issue (3). Cell proliferation (4,5), fibrosis (6,7), apoptosis (8,9) and sustained inflammation (10) are involved in the pathogenesis of IgAN. Inhibition of human mesangial cell proliferation by targeting C3a/C5a receptors has been demonstrated to alleviate IgAN in mice (4). Glomerular endothelial proliferation has been reported to contribute to renal injury in IgAN (11,12). Cytotoxin-associated antigen A may induce cellular injury in glomerular mesangium through the proliferation and secretion of the extracellular matrix (ECM), which may have an important role in the pathogenesis of IgAN (13). Renal expression of microRNA (miR)-21-5p is associated with fibrosis and renal survival in patients with IgAN (6), and rapamycin may reduce apoptosis of podocytes under stimulated conditions of IgAN (8,9). However, the crucial genes involved in IgAN have remained elusive due to limited large-scale studies, and methods for the effective early diagnosis and treatment of IgAN remain unavailable.
Bioinformatics analysis is a powerful research method used to predict molecular mechanisms and associations among genes. This approach has been used to predict novel genes and pathways associated with tumors, including hepatocellular carcinoma (14), non-small cell lung cancer (15), osteosarcoma (16) and esophageal adenocarcinoma (17). Bioinformatics analysis has gradually provided insight into the molecular mechanisms of kidney disease (18). For instance, the gene expression profile of macrophages was recently analyzed through a bioinformatics analysis, indicating the induction of CCL2 and CD38 in macrophages from patients with lupus nephritis (19). To date, only a few bioinformatics analyses have been performed on IgAN. 
A distinct glomerular molecular signature associated with endocapillary proliferation has been identified in patients with IgAN through a gene expression profiling array (12). However, to the best of our knowledge, no integrated and in-depth data analysis associated with IgAN has been previously performed. Therefore, it is necessary to identify genes associated with IgAN through integrated bioinformatics analysis. In the present study, several gene expression profile datasets were downloaded from the Gene Expression Omnibus (GEO) database. After data integration processing, differentially expressed genes (DEGs) were identified, and Gene Ontology (GO) functional analysis was performed. The top upregulated and downregulated hub genes in five disease-associated biological functions were further analyzed. Protein-protein interaction (PPI) networks of the DEGs were built and the hub genes were identified. The present results indicated that these hub genes were involved in different pathological mechanisms in IgAN. The DEGs co-expressed with the top hub genes were then further analyzed. The present study aimed to discover potential novel candidate molecular targets in IgAN.
Materials and methods
Retrieval of IgAN-associated gene expression data
Human IgAN microarray datasets were searched and downloaded from the National Center for Biotechnology Information (NCBI) GEO database (http://www.ncbi.nlm.nih.gov/geo). The keyword ‘IgA nephropathy’ was used for accurate searching. The data selection criteria were as follows: i) All datasets were expression profiles, ii) all samples were kidney glomerular tissues, iii) the species was Homo sapiens, and iv) complete microarray raw data were available. Two datasets, namely GSE93798 (20) and GSE37460 (21,22), were finally selected for integrated analysis on the basis of the abovementioned criteria, with duplicate data excluded. The integrated datasets included 47 IgAN and 31 normal glomerular tissues. Original CEL files and platform probe annotation information files were subjected to further bioinformatics analysis.
Data preprocessing
The raw microarray data (.CEL files) were preprocessed and normalized with the RMA function in the Affy package in the R environment (version 3.2.3) (23), using quantile normalization and RMA background correction, with a background similar to the pure RMA background given in Affy version 1.1 and above (23,24). After the gene expression values were obtained, the ‘annotate’ software package was used to annotate the genes, and the expression matrices were merged. Batch effects between the microarrays were removed with the ComBat function in the SVA package, which uses empirical Bayes methods (25).
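The quantile normalization step mentioned above can be illustrated with a minimal Python sketch. It is not the Affy implementation (which additionally performs background correction and probe-set summarization); the function name and the toy input are assumptions made for illustration, and ties are handled naively.

```python
def quantile_normalize(columns):
    """Quantile-normalize equally long sample columns (illustrative sketch).

    Each column holds the expression values of one sample. After
    normalization every sample shares the same empirical distribution.
    Ties are broken by position, which a production tool would refine.
    """
    n = len(columns[0])
    # Sort each sample, then average across samples at each rank.
    sorted_cols = [sorted(col) for col in columns]
    rank_means = [
        sum(col[i] for col in sorted_cols) / len(columns) for i in range(n)
    ]
    normalized = []
    for col in columns:
        # Positions of the values in ascending order for this sample.
        order = sorted(range(n), key=lambda i: col[i])
        out = [0.0] * n
        for rank, idx in enumerate(order):
            # Replace each value by the mean of all samples at its rank.
            out[idx] = rank_means[rank]
        normalized.append(out)
    return normalized


# Two hypothetical samples with three probes each.
result = quantile_normalize([[5, 2, 3], [4, 1, 6]])
```

In this toy case both samples end up containing the same set of values (1.5, 3.5 and 5.5), assigned according to each probe's rank within its own sample.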
DEG analysis
The DEGs between the IgAN and normal tissues were analyzed with the Limma package in R (26). The linear fit method (using the lmFit function with default options), Bayesian analysis (using the eBayes function with default options) and the t-test algorithm were utilized to calculate the P-values and fold change (FC) values. The topTable function in the Limma package was used to screen the DEGs [parameters: adjust.method=‘fdr’, coef=1, adjusted (adj.) P-value=0.05, lfc=log(2,2) (i.e., 1), number=5,000 and sort.by=‘logFC’]. An adj.P-value <0.05 and |log2FC|≥1 were set as the cutoff parameters to screen any significantly upregulated or downregulated genes. The ggplot2 software package was used to visualize the DEGs.
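The screening rule described above (false discovery rate adjustment followed by the adj.P-value and |log2FC| cutoffs) can be sketched in Python as follows. This is an illustration only, not the Limma code: it assumes that raw P-values and log2FC values are already available, uses the Benjamini-Hochberg procedure (the method that ‘fdr’ denotes in R), and all function and gene names are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value to the smallest so that the
    # adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def select_degs(genes, log2fc, pvalues, fc_cutoff=1.0, alpha=0.05):
    """Split genes into up- and downregulated DEGs using the stated cutoffs."""
    adj = benjamini_hochberg(pvalues)
    up, down = [], []
    for gene, lfc, q in zip(genes, log2fc, adj):
        if q < alpha and abs(lfc) >= fc_cutoff:
            (up if lfc > 0 else down).append(gene)
    return up, down


# Hypothetical input: four genes with made-up statistics.
up, down = select_degs(
    ["geneA", "geneB", "geneC", "geneD"],
    [1.5, -2.0, 0.5, 3.0],
    [0.001, 0.002, 0.003, 0.9],
)
```

Here geneC is dropped because its fold change is too small, and geneD because its adjusted P-value is too large, despite its large fold change.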
Functional enrichment analysis
ClusterProfiler (version 3.5) is an R package for the biological term classification and enrichment analysis of gene clusters (27). The ClusterProfiler package was used to perform GO functional enrichment analysis for the DEGs. An adj.P-value <0.05 was set as the cutoff criterion for GO enrichment analysis.
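Over-representation analyses of this kind are generally based on a hypergeometric test, which asks how likely it is to observe at least the given number of DEGs in a GO term by chance. The following Python sketch shows that test in isolation, under the assumption of a simple gene universe; it is not the ClusterProfiler code, and the function and parameter names are hypothetical.

```python
from math import comb


def enrichment_pvalue(n_background, n_term, n_degs, n_overlap):
    """Upper-tail hypergeometric probability P(X >= n_overlap).

    n_background: genes in the annotated universe
    n_term:       universe genes annotated to the GO term
    n_degs:       DEGs that fall inside the universe
    n_overlap:    DEGs annotated to the GO term
    """
    upper = min(n_term, n_degs)
    total = comb(n_background, n_degs)
    # Sum the probabilities of every overlap at least as large as observed.
    tail = sum(
        comb(n_term, k) * comb(n_background - n_term, n_degs - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Toy universe: 10 genes, 4 in the term, 3 DEGs, 2 of them in the term.
p = enrichment_pvalue(10, 4, 3, 2)
```

The raw P-values obtained for all tested terms would then be adjusted for multiple testing before the adj.P-value <0.05 cutoff is applied.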
PPI network analysis and hub gene identification
The DEGs identified were subjected to PPI analysis by using the search functionality of STRING (http://string.embl.de/) (28) to explore the association between the DEGs, and a network interaction matrix was built. The minimum required interaction score of 0.7 was the cutoff threshold. The PPI network data matrix was downloaded for further analysis and visualization by using Cytoscape (version 6.3; http://www.cytoscape.org/) (29). CytoHubba (30) is a tool used to identify hub objects and subnetworks from a complex interactome. ‘Degree’ is a topological analysis method in CytoHubba. ‘Degree’ was used to discover featured nodes and identify the hub genes from all DEGs.
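The ‘Degree’ criterion ranks each protein by the number of interaction partners it has in the network. A minimal Python sketch of this idea is given below, assuming the interactions are available as (protein, protein, score) tuples; it is not the CytoHubba implementation, and the function name, input layout and protein names are assumptions made for illustration.

```python
from collections import Counter


def rank_by_degree(edges, min_score=0.7, top_n=10):
    """Rank nodes by degree after filtering edges on interaction score."""
    degree = Counter()
    seen = set()
    for a, b, score in edges:
        # Skip low-confidence interactions and self-loops.
        if score < min_score or a == b:
            continue
        # Treat the network as undirected and count each pair once.
        pair = frozenset((a, b))
        if pair in seen:
            continue
        seen.add(pair)
        degree[a] += 1
        degree[b] += 1
    # Highest degree first; ties are broken alphabetically.
    return sorted(degree.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]


# Hypothetical interactions between four placeholder proteins.
hubs = rank_by_degree(
    [
        ("P1", "P2", 0.90),
        ("P1", "P3", 0.80),
        ("P2", "P1", 0.95),  # duplicate of the first pair
        ("P3", "P4", 0.50),  # below the score threshold
        ("P1", "P4", 0.70),
    ],
    top_n=2,
)
```

In this toy network P1 has three distinct partners and is therefore ranked first.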
Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis
The top five hub genes in the five disease-associated GO terms were selected by consulting the literature in the NCBI database to find the potential KEGG pathway associations with IgAN. The KEGG tool (http://www.genome.jp/kegg/pathway.html) was used to further analyze the signaling pathways of the selected target genes.
Co-expressed gene analysis
Pearson correlation coefficients between each top hub gene and all other DEGs were calculated by using the package ‘Hmisc’ (version 4.1.1). The top five significant genes associated with each selected top hub gene were screened. The co-expression association between each top hub gene and other DEGs was further analyzed using Cytoscape.
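The co-expression screening can be sketched in Python as follows. The sketch computes Pearson correlation coefficients between one hub gene and all other genes and keeps the strongest associations. Ranking by the absolute coefficient is an assumption made here (for a fixed sample size it gives the same order as ranking by P-value); the function names, gene names and expression values are hypothetical, and this is not the Hmisc code.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long vectors.

    Raises ZeroDivisionError if either vector has zero variance.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def top_coexpressed(hub, expression, top_n=5):
    """Return the top_n genes most strongly correlated with the hub gene."""
    scores = [
        (gene, pearson(expression[hub], values))
        for gene, values in expression.items()
        if gene != hub
    ]
    # Strongest association first, regardless of sign.
    return sorted(scores, key=lambda kv: -abs(kv[1]))[:top_n]


# Hypothetical expression values across four samples.
expression = {
    "hub": [1, 2, 3, 4],
    "g1": [2, 4, 6, 8],
    "g2": [4, 3, 2, 1],
    "g3": [1, 3, 2, 4],
}
top2 = top_coexpressed("hub", expression, top_n=2)
```

In this example g1 and g2 are perfectly correlated and anti-correlated with the hub, respectively, so both are retained ahead of g3.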
Results
Identification of DEGs
A total of 78 samples, comprising 47 IgAN samples and 31 normal samples, in the two datasets were included for analysis in the present study. The detailed information of all of the samples is listed in Table SI. Based on the cutoff criteria of |log2 FC| ≥1.0 and adj.P-value ≤0.05, 298 DEGs in the IgAN group vs. control group were obtained. Among those DEGs, 148 and 150 genes were upregulated and downregulated, respectively. The results on the expression level analysis are presented in a volcano plot in Fig. 1A. As indicated in the graph, the data distribution of the upregulated, downregulated or insignificantly changed gene expression levels was normal. Transcriptional and immune response regulator, fatty acid binding protein 5, tropomyosin 1, colony-stimulating factor 1 receptor, erythrocyte membrane protein band 4.1 like 2 (EPB41L2), fibronectin 1 (FN1), hemoglobin subunit β, ECM protein 1, adipocyte enhancer-binding protein and transforming growth factor β receptor 2 (TGFBR2) were the 10 most significantly upregulated genes in IgAN, whereas FosB proto-oncogene, AP-1 transcription factor subunit (FOSB), FOS, activating transcription factor 3 (ATF3), early growth response 1 (EGR1), albumin (ALB), apolipoprotein L domain containing 1, EGR3, cytochrome P450 family 27 subfamily B member 1 (CYP27B1), solute carrier family 7 member 9 and dual specificity phosphatase 1 (DUSP1) were the 10 most substantially downregulated genes (Fig. 1B). Additional detailed information on all DEGs is listed in Table SII.
Figure 1.
Expression patterns of DEGs. (A) Volcano plot of all DEGs. Orange dots indicate high expression levels in IgAN tissues, whereas blue dots denote low expression levels. Gray dots correspond to the genes with a |log2FC| <1 or adj.P-value >0.05. The dots in the area above the horizontal dotted line have an adj.P-value <0.05. The dots outside the two vertical dotted lines have a |log2FC| ≥1. (B) Top 10 upregulated and downregulated DEGs in IgAN. |log2FC| >1 and adj.P-value <0.05 were set as the selection criteria. Log (FC) >0, upregulated; log (FC) <0, downregulated. DEG, differentially expressed gene; adj., adjusted; FC, fold change; IgAN, immunoglobulin A nephropathy; FOSB, FosB proto-oncogene, AP-1 transcription factor subunit; ATF3, activating transcription factor 3; EGR1, early growth response 1; ALB, albumin; APOLD1, apolipoprotein L domain containing 1; CYP27B1, cytochrome P450 family 27 subfamily B member 1; SLC7A9, solute carrier family 7 member 9; DUSP1, dual specificity phosphatase 1; TCIM, transcriptional and immune response regulator; FABP, fatty acid binding protein; TPM1, tropomyosin 1; CSF1R, colony-stimulating factor 1 receptor; EPB41L2, erythrocyte membrane protein band 4.1 like 2; FN1, fibronectin 1; HBB, hemoglobin subunit β; ECM1, extracellular matrix protein 1; AEBP, adipocyte enhancer-binding protein; TGFBR2, transforming growth factor β receptor 2.
PPI network and hub gene analysis
In the PPI network analysis, the average node degree of connectivity was 2.57, the average local clustering coefficient was 0.366 and the number of edges was 172. The P-value for clusters of interacted proteins was <1.0×10^−16. Finally, 10 top-ranked hub genes were identified from all the DEGs in the network by Degree analysis, as presented in Fig. 2A (colored nodes). These genes were tumor protein 53 (TP53), integrin subunit β2 (ITGB2), FN1, FOS, complement C3a receptor 1 (C3AR1), EGR1, ALB, Fc fragment of IgE receptor Ig (FCER1G), phosphatidylinositol-4,5-bisphosphate 3-kinase catalytic subunit α (PIK3CA) and SHC adaptor protein 1 (SHC1). In the network, these hub genes had higher Degree scores than the other genes (Fig. 2A). The genes were closely correlated with one another. Among the 10 hub genes, seven genes, namely TP53, ITGB2, FN1, C3AR1, FCER1G, PIK3CA and SHC1, were upregulated, and three genes, namely FOS, EGR1 and ALB, were downregulated (Fig. 2B). Of note, among the top 10 hub genes, ALB (31), FOS (32) and TP53 (33) are known to be involved in the pathological process of IgAN. The top five hub genes were considered for subsequent KEGG pathway analysis to narrow down the analysis.
Figure 2.
(A) PPI network and top hub genes. Hub genes were identified from all DEGs by using the Degree analysis method. The depth of colour from blue to red indicates the rank from low to high of the hub genes. The table on the right-hand side presents the rank scores by Degree of the top 10 hub genes. (B) Expression patterns of the top 10 hub genes screened out from the PPI networks in the control and IgAN groups. (C) Each cluster represents the top five genes co-expressed with a hub gene (red nodes). The red digits indicate correlation coefficients. All P-values of the correlation are <0.05. Regarding the definition of all gene names, please refer to Table SIV. PPI, protein-protein interaction; DEG, differentially expressed gene; IgAN, immunoglobulin A nephropathy; TP53, tumor protein 53; FOS, Fos proto-oncogene, AP-1 transcription factor subunit; EGR1, early growth response 1; ALB, albumin; ITGB2, integrin subunit β2; FN, fibronectin; C3AR, complement C3a receptor 1; FCER1G, Fc fragment of IgE receptor Ig; PIK3CA, phosphatidylinositol-4,5-bisphosphate 3-kinase catalytic subunit α; SHC1, SHC adaptor protein 1.
Co-expressed gene analysis
Pearson correlation analysis revealed the major DEGs co-expressed with TP53, ITGB2, FN1, FOS, C3AR1, EGR1, ALB, FCER1G, PIK3CA and SHC1 (Fig. 2C). The top five DEGs co-expressed with TP53 were EPB41L2, SWI/SNF related, matrix associated, actin-dependent regulator of chromatin, subfamily A, member 4, MIS18 binding protein 1, FKBP prolyl isomerase 1A and SHC1. The genes co-expressed with ITGB2 were neutrophil cytosolic factor 2, C3AR1, FCER1G, TYRO protein tyrosine kinase binding protein and lysosomal protein transmembrane 5. Collagen type I α2 chain, TGFB1, periostin, transgelin and complement C1q A chain were closely associated with FN1. Of these, ITGB2 is possibly involved in the pathological process of IgAN through ECM remodeling and apoptosis (34,35), whereas FN1 may be implicated in IgAN-associated fibrosis (36). Additional information on the top six novel hub genes is listed in Table I (37–69).
Table I.
Extended information on the function of the top 6 hub genes.
Gene | Function | Associated pathway | (Refs.)
TP53 | Deregulation of TP53 in multiple myeloma | Tp53 pathway | (37)
TP53 | Tumor suppressive | DNA damage-induced apoptosis | (38)
TP53 | Regulated inhibitor of apoptosis | Regulating podocyte apoptosis | (39)
TP53 | Contributes to the pathogenesis of dilated cardiomyopathy | DNA damage response/TP53 pathway | (40)
TP53 | Increases susceptibility to cervical cancer development | Interaction between the XRCC1 and TP53 genes | (41)
TP53 | Response to stress | Autophagy and apoptosis | (42)
ITGB2 | Inhibits osteosarcoma proliferation and metastasis | Wnt/β-catenin signalling | (43)
ITGB2 | Mediates cell invasion | Leukocyte-specific integrin β2 expression | (44)
ITGB2 | Promotes macrophage retention | Inflammatory | (45)
ITGB2 | Causes canine leukocyte adhesion deficiency | Missense mutation | (46)
ITGB2 | Inhibits TLR responses | NF-κB pathway and p38 MAPK activation | (47)
FN1 | Promotes apoptosis of epithelial cells | MiR-206/FN1 | (48)
FN1 | Suppresses apoptosis | NF-κB pathway | (49)
FN1 | Reverses the radioactive iodine resistance of papillary thyroid carcinoma cell | MiR-101-3p/FN1/PI3K/AKT signaling pathway | (50)
FN1 | Cell proliferation, senescence and apoptosis | PI3K/AKT signaling pathway | (51)
FOS | Chronic inflammation | Metabolic pathways | (52)
FOS | Inflammation | AP-1 and AKT/mTor pathways | (53)
FOS | Inflammatory injury | Oxidative stress-mediated FOS/IL8 signaling | (54)
FOS | Regulates cell cycle | P38 MAPK/AP-1 factors | (55)
FOS | Proliferation and apoptosis of hippocampal neurons | MAPK signaling pathway | (56)
FOS | Regulates cervical cancer cells growth | ERK1/2/c-Fos/c-Jun | (57)
C3AR1 | VEGFR2 survival and mitotic signaling | C3AR1/C5AR1 and IL-6R-GP130 | (58)
C3AR1 | Limits expansion and differentiation of alloreactive CD8+ T-cell immunity | C3AR1 signaling | (59)
C3AR1 | Enhances the formation of intestinal organoids | C3AR1 signaling | (60)
C3AR1 | Up-regulated genes in T2DM | Type 2 diabetes mellitus | (61)
EGR1 | Proliferation and fibrosis | TGF-β1 signaling | (62)
EGR1 | Prevents renal tubulointerstitial fibrosis | MiR-192/TGF-β1/FN | (63)
EGR1 | Reducing the expression of fibrosis and inflammatory cytokines | |
KEGG pathway analysis
For the KEGG pathway search, Homo sapiens was first selected as the organism. TP53, ITGB2, FN1, FOS and C3AR1 were then entered in the keyword dialog box. The pathways involved in cell cycle and proliferation, inflammation, apoptosis and focal adhesion were selected for presentation. TP53 was indicated to be involved in the P53 signaling pathway, which is associated with the cell cycle, apoptosis and inhibition of metastasis (Fig. 3A). ITGB2 was indicated to participate in the HIPPO signaling pathway, which regulates the expression of anti-apoptotic genes, and is also associated with focal adhesion (Fig. 3B). FN1 and ITGB2 are implicated in focal adhesion through the same signaling pathway. In addition, FN1 was indicated to participate in apoptosis and mesangial matrix expansion (Fig. 3C). FOS was also suggested to be involved in apoptosis (Fig. 3D). All of these pathways are closely linked to IgAN. However, the search in the KEGG pathway database failed to identify any signaling pathway in which C3AR1 is directly involved in the cell cycle and proliferation, inflammation, apoptosis or focal adhesion.
Figure 3.
KEGG pathway analysis for hub genes. (A) TP53 was significantly involved in the P53 signaling pathway. This figure was redrawn on the basis of the KEGG pathway hsa04115. (B) ITGB2 was indicated to participate in the HIPPO signaling pathway. This figure was redrawn on the basis of the KEGG pathways hsa04390, hsa04933 and hsa05133. (C) FN1 was indicated to be significantly involved in the ECM/PI3K/AKT signaling and mesangial matrix expansion pathways. This figure was redrawn on the basis of the KEGG pathways hsa04151, hsa04510 and hsa04933. (D) FOS was indicated to participate in the Wnt signaling, IL-17 signaling and apoptosis pathways. This figure was redrawn on the basis of the KEGG pathways hsa01522, hsa04010, hsa04310 and hsa04657. Regarding the definition of all gene names, please refer to Table SIV. KEGG, Kyoto Encyclopedia of Genes and Genomes; hsa, Homo sapiens; TP53, tumor protein 53; IL, interleukin; FN, fibronectin; ECM, extracellular matrix; FOS, Fos proto-oncogene, AP-1 transcription factor subunit; ITGB2, integrin subunit β2.
GO enrichment
The ClusterProfiler package was used for pathway enrichment analysis and GO analysis to reveal the biological functions based on the DEGs. The 15 most significant GO terms in the category biological process from the groups with adj.P<0.05 are presented in Fig. 4A and the 15 most significant GO terms in the category molecular function from the groups with adj.P<0.05 are presented in Fig. 4B. The five GO terms closely associated with pathological mechanisms are provided in Fig. 4C. A total of 25 DEGs were involved in cell growth (GO:0016049), and 22 DEGs were implicated in the regulation of cell growth and epithelial cell proliferation (GO:0050673). Furthermore, 19 DEGs participated in the ERK1 and ERK2 cascades (GO:0070371) and 18 DEGs functioned in the regulation of the apoptotic signaling pathway (GO:2001233). In addition, 17 DEGs had a role in the ECM organization (GO:0030198). The top upregulated and downregulated hub genes in these disease-associated processes and functions are provided in Fig. 4D. CYP27B1, EGR3, ATF3, ATF3 and cysteine-rich angiogenic inducer 61 (CYR61) were the top downregulated hub genes in the five GO terms closely associated with pathological mechanisms, and TGFBR2, ECM3, FN1, SKI-like proto-oncogene and FN1 were the top upregulated hub genes in the five GO terms. All of the significantly enriched GO terms are listed in Table SIII.
Figure 4.
GO enrichment results of DEGs. (A) GO terms in the category biological process enriched by the DEGs. (B) GO terms in the category molecular function enriched by the DEGs. An adj.P-value <0.05 was used as the cut-off criterion. The length of each bar represents the gene counts in the GO term. The adj.P-value of each GO term is printed on the right side of the bars. (C) Five GO terms closely associated with the pathological mechanisms of IgAN. (D) Top upregulated and downregulated hub genes in each GO term. Regarding the definition of all gene names, please refer to Table SIV. GO, gene ontology; adj., adjusted; DEG, differentially expressed gene; IgAN, immunoglobulin A nephropathy.
Discussion
In the present study, 148 upregulated and 150 downregulated DEGs were identified from microarray data by applying an integrated bioinformatics analysis to further elucidate the molecular pathology of IgAN. GO analysis revealed that the DEGs were significantly enriched in cell growth, epithelial cell proliferation, ERK1 and ERK2 cascade, apoptotic signaling pathway regulation and ECM organization. CYP27B1 and TGFBR2 were enriched in the GO terms associated with cell growth. miR-195 was reported to inhibit proliferation, invasion and metastasis by targeting CYP27B1 in breast cancer cells (70), and CYP27B1 is involved in the anti-proliferative effects of 25-hydroxyvitamin D (71). miR-9-5p has been demonstrated to promote cell growth and metastasis in non-small cell lung cancer through repression of TGFBR2 (72). Cell proliferation is a vital factor in the pathogenesis of IgAN. For instance, circulating galactose-deficient IgA forms immune complexes that are deposited in the glomerular mesangium and cause local proliferation in IgAN (73). The present analysis revealed that ATF3 was enriched in GO terms including ERK1 and ERK2 cascades and regulation of the apoptotic signaling pathway. Cell migration and invasion may be strengthened by ATF3 through the activation of the p53 signaling pathway (74). Uremic toxins have been indicated to induce ATF3/c-Jun complex-mediated cannabinoid receptor type 1 expression by modulating the ERK1/2 and JNK signaling pathways and reactive oxygen species (75). FN1 was enriched in GO terms including the ERK1 and ERK2 cascades and the ECM organization. Depletion of FN1 was reported to markedly reduce the invasive capacity of prostate cancer cells (76). Furthermore, increased expression of FN1 in tumors may alter the primary tumor architecture, resulting in decreased metastasis formation (36). Treatment of PC-3 cells with 1 µM FN1 was observed to result in a decrease in activated ERK1/2 (77). 
CYR61, which is also named cellular communication network factor 1 (CCN1), is an ECM-associated matricellular protein and one of the six members of the CCN family (78,79). It may impair fibroblast responsiveness to TGF-β signaling and upregulation of matrix metalloproteinase 1 (80). Fibrosis is closely associated with IgAN (6,7). Overall, the above indicates that the results of the functional analysis of the identified DEGs in the present study are reasonable and consistent with mechanisms identified by previous studies.
Several hub genes were identified from the PPI network, and this result is consistent with previously described genes, including ALB (31), FOS (32) and TP53 (33). TP53/p53 is a known regulator of apoptosis and macro-autophagy/autophagy (39). The coupled induction of inducible nitric oxide synthase and upregulation of TP53 in intrinsic renal cells of IgAN may be linked to pro- and anti-apoptotic activities (33). The present analysis revealed relatively few novel hub genes with a close association with the pathological processes of IgAN, offering novel insight. ITGB2 is a receptor of intercellular adhesion molecule (ICAM)1, ICAM2, ICAM3 and ICAM4, and it is also called CD18. ITGB2 is involved in cellular adhesion and ECM remodeling in patients with renal cancer (34). Furthermore, ITGB2 was identified to be closely associated with apoptosis in patients with Alzheimer's disease (81). However, to the best of our knowledge, no previous study has reported on the role of ITGB2 in IgAN. In the present study, ITGB2 was the second-ranked hub gene in the PPI network. The KEGG analysis results confirmed that ITGB2 was directly involved in apoptosis and focal adhesion. Collectively, the novel hub gene ITGB2 was indicated to have an important role in IgAN.
Another novel and noteworthy hub gene identified in the present analysis is EGR1, a zinc finger transcription factor with an essential role in cell growth and proliferation (65). 
EGR1 contributes to diabetic kidney disease by enhancing epithelial-mesenchymal transition (65). Specific inhibition of EGR1 was observed to prevent mesangial cell hypercellularity in experimental nephritis (82). EGR1 overexpression in rhabdomyosarcoma significantly decreases cell proliferation, mobility and anchorage-independent growth (83). However, no previous study has reported on the role of EGR1 in IgAN. In the present study, EGR1 was among the top 10 hub genes in the PPI network. The present co-expression analysis indicated a close association between EGR1 and FOS. The expression levels of these two DEGs were decreased, possibly leading to a reduction in the inhibition of cell proliferation and resulting in the progression of IgAN.
The present study did not identify any direct significant GO term for ‘fibrosis’. Renal biopsy in patients with IgAN is generally performed in the early stages of the disease, when renal fibrosis is not prominent. The GSE37460 dataset did not provide any clinical information. However, the GSE93798 dataset suggested that most of the patients' chronic kidney disease grades were below 3a and the Oxford Classification scores were relatively low, indicating that these patients were in the early stages of the disease (20). Therefore, abnormal expression of fibrosis-associated genes was not common in these samples. However, DUSP1, a gene associated with fibrosis, was among the DEGs. In chronic hypertension, angiotensin-1-7 increased DUSP1 to decrease fibrosis in resistance arterioles and attenuate end-stage organ damage (84). In the present study, the downregulation of DUSP1 may have acted as a fibrotic factor and prompted the onset of fibrosis.
The present study was the first to identify novel molecular targets by integrating all microarray datasets of IgAN in GEO. Thereby, the sample size was expanded and further information was obtained. 
The microarray matrix of the expression values was combined and the batch effects were removed by using the empirical Bayes method to make the data more comparable (85). The novel results may enhance the current understanding of the molecular pathogenesis of IgAN. However, the present study has certain limitations. First, the clinical data from the GEO database were not available for each sample. Furthermore, the array data came from typical IgAN in the early stage, and therefore, the expression levels of certain genes may not be identical to those in the later stage. For instance, the protein levels of IGFBP1 have been reported to be upregulated in this disease (86), while this gene was downregulated in the present study. The cause of the inconsistency may be that it is unrealistic to dynamically obtain kidney tissue from a patient at different time-points. In addition, the novel potential candidate targets should be further validated in experimental studies. The present results were obtained using a bioinformatics screening to identify several novel DEGs between IgAN and healthy control samples, and suggested that some of the top hub genes have vital roles in the pathological process of cell proliferation in IgAN. Of note, the information provided in the present study was not limited to the top 10 hub genes, but included certain other representative DEGs. The present results provide a valuable resource for future research on IgAN.
In conclusion, the present study was the first to apply an integrated bioinformatics analysis to investigate novel candidate genes and mechanisms involved in the pathogenesis of IgAN. ITGB2, FN1, ATF3 and EGR1 genes may have important roles in the development of IgAN and act as potential candidate molecular targets for the diagnosis and treatment of IgAN.