Chengjiang Wu1, Yangjing Zhao1, Yu Lin2, Xinxin Yang1, Meina Yan1, Yujiao Min1, Zihui Pan1, Sheng Xia1, Qixiang Shao1. 1. Department of Immunology, Key Laboratory for Laboratory Medicine of Jiangsu Province, Jiangsu University Medical School, Zhenjiang, Jiangsu 212013, P.R. China. 2. Center for Computational Science, University of Miami, Coral Gables, FL 33146, USA.
Abstract
DNA microarray and high-throughput sequencing have been widely used to identify the differentially expressed genes (DEGs) in systemic lupus erythematosus (SLE). However, the big data from gene microarrays are also challenging to work with in terms of analysis and processing. The presents study combined data from the microarray expression profile (GSE65391) and bioinformatics analysis to identify the key genes and cellular pathways in SLE. Gene ontology (GO) and cellular pathway enrichment analyses of DEGs were performed to investigate significantly enriched pathways. A protein‑protein interaction network was constructed to determine the key genes in the occurrence and development of SLE. A total of 310 DEGs were identified in SLE, including 193 upregulated genes and 117 downregulated genes. GO analysis revealed that the most significant biological process of DEGs was immune system process. Kyoto Encyclopedia of Genes and Genome pathway analysis showed that these DEGs were enriched in signaling pathways associated with the immune system, including the RIG‑I‑like receptor signaling pathway, intestinal immune network for IgA production, antigen processing and presentation and the toll‑like receptor signaling pathway. The current study screened the top 10 genes with higher degrees as hub genes, which included 2'‑5'‑oligoadenylate synthetase 1, MX dynamin like GTPase 2, interferon induced protein with tetratricopeptide repeats 1, interferon regulatory factor 7, interferon induced with helicase C domain 1, signal transducer and activator of transcription 1, ISG15 ubiquitin‑like modifier, DExD/H‑box helicase 58, interferon induced protein with tetratricopeptide repeats 3 and 2'‑5'‑oligoadenylate synthetase 2. Module analysis revealed that these hub genes were also involved in the RIG‑I‑like receptor signaling, cytosolic DNA‑sensing, toll‑like receptor signaling and ribosome biogenesis pathways. In addition, these hub genes, from different probe sets, exhibited significant co‑expressed tendency in multi‑experiment microarray datasets (P<0.01). In conclusion, these key genes and cellular pathways may improve the current understanding of the underlying mechanism of development of SLE. These key genes may be potential biomarkers of diagnosis, therapy and prognosis for SLE.
DNA microarray and high-throughput sequencing have been widely used to identify the differentially expressed genes (DEGs) in systemic lupus erythematosus (SLE). However, the big data from gene microarrays are also challenging to work with in terms of analysis and processing. The presents study combined data from the microarray expression profile (GSE65391) and bioinformatics analysis to identify the key genes and cellular pathways in SLE. Gene ontology (GO) and cellular pathway enrichment analyses of DEGs were performed to investigate significantly enriched pathways. A protein‑protein interaction network was constructed to determine the key genes in the occurrence and development of SLE. A total of 310 DEGs were identified in SLE, including 193 upregulated genes and 117 downregulated genes. GO analysis revealed that the most significant biological process of DEGs was immune system process. Kyoto Encyclopedia of Genes and Genome pathway analysis showed that these DEGs were enriched in signaling pathways associated with the immune system, including the RIG‑I‑like receptor signaling pathway, intestinal immune network for IgA production, antigen processing and presentation and the toll‑like receptor signaling pathway. The current study screened the top 10 genes with higher degrees as hub genes, which included 2'‑5'‑oligoadenylate synthetase 1, MX dynamin like GTPase 2, interferon induced protein with tetratricopeptide repeats 1, interferon regulatory factor 7, interferon induced with helicase C domain 1, signal transducer and activator of transcription 1, ISG15 ubiquitin‑like modifier, DExD/H‑box helicase 58, interferon induced protein with tetratricopeptide repeats 3 and 2'‑5'‑oligoadenylate synthetase 2. Module analysis revealed that these hub genes were also involved in the RIG‑I‑like receptor signaling, cytosolic DNA‑sensing, toll‑like receptor signaling and ribosome biogenesis pathways. In addition, these hub genes, from different probe sets, exhibited significant co‑expressed tendency in multi‑experiment microarray datasets (P<0.01). In conclusion, these key genes and cellular pathways may improve the current understanding of the underlying mechanism of development of SLE. These key genes may be potential biomarkers of diagnosis, therapy and prognosis for SLE.
Systemic lupus erythematosus (SLE) is a fatal multisystem autoimmune disease with diverse manifestations, which may lead to severe morbidity and mortality. It is characterized by a wide profile of autoantibodies, widespread precipitation of immune complexes and aberrant innate and adaptive immune responses. SLE preferentially invades skin, joints, kidneys and nervous system leading to various typical symptoms (1–3). Recent studies have revealed that genetic factors have an important role in the susceptibility to SLE (4–6). However, the precise pathogenic mechanism of SLE remains to be elucidated. Therefore, it is important to identify the molecular mechanisms of occurrence and development of SLE.High-throughput sequencing and microarray technology have been used to screen for differential gene expression in disease. Previous gene expression profiling analyses have been conducted to identify differentially expressed genes (DEGs) and cellular pathways in SLE (7,8). However, at present, no specific gene has been identified to act as a potential diagnosis marker in SLE. The large quantity of data obtained from high-throughput sequencing and microarray technology have not been fully used. Ducreux et al (9) collected whole blood samples from SLEpatients and healthy volunteers in order to identify DEGs. However, interactions among DEGs and key genes involved in signal conduction pathway of SLE remain to be elucidated. In addition, previous studies of genetic factors primarily focused on a single gene; however, interactions among multiple genes may result in the characteristics of multisystem invasion observed in SLE (10,11). It is of note that disease-associated gene expression networks have potential roles in immune response, which highlights their mechanism and therapeutic values for SLE (12,13). Therefore, the combination of microarray technology and bioinformatics analysis may aid in the identification of the gene regulatory networks, key genes and their associated pathways in SLE.The present study analyzed the raw data from the GSE65391 dataset downloaded from Gene Expression Omnibus (GEO; www.ncbi.nlm.nih.gov/geo/), which provides free access to high-throughput gene expression datasets (14). The present study selected GSE65391 due to the large sample quantity of the dataset and the transcriptional data were complemented by demographic, laboratory and clinical data. Additionally, 24 technical replicates of healthy samples were included and used for batch correction purposes. DEGs between SLE and healthy controls were identified in order to identify the potential mechanisms of SLE. Subsequently, functional enrichment, protein-protein interaction (PPI) networks and sub-network analyses were performed to determine the significant pathogenic genes and their critical pathways involved in the mechanism of occurrence and development of SLE.
Materials and methods
Microarray data
The gene expression profile of GSE65391 was downloaded from the GEO database (15). It was established on the platform of GPL10558 Illumina HumanHT-12 V4.0 expression bead chip (Illumina Inc., San Diego, CA, USA). The dataset contained 996 whole blood samples, including 924 SLE samples and 72 healthy controls.
Data preprocessing and identification of DEGs
The original expression datasets following background correction, normalization and probe summarization were converted into expression measures using the R affy package (www.bioconductor.org/packages/release/bioc/html/affy.html). The linear models for microarray data package in Bioconductor (www.bioconductor.org/packages/release/bioc/html/limma.html) was used to identify DEGs according to the cut-off criteria: Adjusted P<0.01 and |log2 fold-change (FC)|>0.8.
Gene ontology (GO) and pathway enrichment analysis of DEGs
GO is a widely used method for unification of biology which collected structured, defined and controlled vocabulary for a large scale of genes annotation (16). Kyoto Encyclopedia of Genes and Genomes (KEGG) database is a collection of online databases regarding gene functions, enzymatic pathways and associates genomic information with higher-order functional information (17). The Database for Annotation, Visualization and Integrated Discovery (DAVID; david.ncifcrf.gov) provides a comprehensive set of functional annotation tools to identify GO terms and KEGG pathways (18,19). To understand the biological functions and cellular pathways of the DEGs, the present study used DAVID to identify GO categories and KEGG pathways.
Analysis of protein-protein interaction (PPI) network and sub-networks
Search Tool for the Retrieval of Interacting Genes (STRING; string-db.org) (20) is a precomputed global resource designed to evaluate PPI information. In the present study, the STRING online tool was used to analyze the PPI of DEGs and experimentally validated interactions with a combined score >0.4 were selected as significant. Therefore, the degree of connectivity was statistically analyzed in networks using CytoHubba (version 3.4.0) (www.cytoscape.org) to obtain the significant nodes or hub genes in the PPI network (21). CytoHubba, a Java plugin for Cytoscape (version 3.4.0), is capable of predicting important nodes and subnetworks in a given network by several topological algorithms. The Molecular Complex Detection (MCODE) algorithm is an established automated method to find highly interconnected modules as molecular complexes or clusters in large PPI networks (22). Additionally, the Multi Experiment Matrix (MEM; biit.cs.ut.ee/mem/index.cgi) (23) is a web-based multi experiment gene expression query and visualization tool. It gathers hundreds of publicly available gene expression datasets from the ArrayExpress database (24). In order to determine whether these hub genes from different probe sets were co-expressing or not, the present study compared ISG15 ubiquitin-like modifier (ISG15) and the other hub genes using MEM.
Results
Identification of DEGs in SLE
The present study analyzed the GSE65391 microarray data, containing 924 SLE samples and 72 healthy controls. Following data normalization, preprocessing and filtering with the criteria of adjusted P<0.01 and |log2FC|>0.8 a total of 310 genes were identified as differentially expressed in patients with SLE compared with healthy controls. Among them, 193 were upregulated and 117 were downregulated. The volcano plot of all DEGs is presented in Fig. 1.
Figure 1.
Volcano plot of 310 DEGs. Red, upregulated DEGs with log2 FC>0.8 and adjusted P<0.01. Green, downregulated DEGs with log2FC<-0.8 and adjusted P<0.01. FC, fold-change; DEGs, differentially expressed genes.
Expression and function of DEGs in SLE
Several GO categories of upregulated and downregulated DEGs were enriched in GO terms associated with immune response and immune cell metabolism. The most significant biological process (BP) of upregulated and downregulated DEGs was the immune system process. In addition to the enrichment in immune response, the upregulated DEGs were significantly enriched in the BPs of the defense response to other organisms (e.g., virus), whereas the downregulated DEGs were significantly enriched in immune cell activities, such as leukocytes and lymphocytes (Fig. 2).
Figure 2.
Biological processes analysis of upregulated and downregulated differentially expressed genes associated with systemic lupus erythematosus. GO, gene ontology.
DEGs-associated pathways in SLE are enriched in multi-system harm diseases
The most significantly enriched pathways of the DEGs are presented in Fig. 3. The DEGs were enriched in immune diseases and infectious diseases, including SLE, inflammatory bowel disease, rheumatoid arthritis, herpes simplex virus (HSV) infection, influenza A, measles, asthma and primary immunodeficiency. Furthermore, the DEGs were also enriched in signaling pathways associated with the immune system, including the RIG-I-like receptor signaling pathway, intestinal immune network for IgA, antigen processing and presentation and toll-like receptor signaling pathway.
Figure 3.
Kyoto Encyclopedia of Genes and Genomes pathway analysis of differentially expressed genes associated with systemic lupus erythematosus. P-values for each pathway presented in brackets. NF-κB, nuclear factor-κB.
Major PPI networks in SLE
The present study identified 1,514 interactions among the DEGs. There were 338 interactions matching the condition of experimentally validated interaction with a combined score >0.4. The top 15 significant interactions with highest combined score are presented in Table I. Based on the information in the STRING database, the PPI network was constructed (Fig. 4). The present study screened the top 10 genes with higher degrees as hub genes in the PPI network using CytoHubba, including 2′-5′-oligoadenylate synthetase 1 (OAS1), MX dynamin like GTPase 2 (MX2), interferon induced protein with tetratricopeptide repeats 1 (IFIT1), interferon regulatory factor 7 (IRF7), interferon induced with helicase C domain 1 (IFIH1), signal transducer and activator of transcription 1 (STAT1), ISG15, DExD/H-box helicase 58 (DDX58), interferon induced protein with tetratricopeptide repeats 3 (IFIT3), and 2′-5′-oligoadenylate synthetase 2 (OAS2). Additionally, the modules of DEGs were obtained using the MCODE plugin. The presents study selected the top 3 significant modules and analyzed the cellular pathways of the genes involved in the modules (Fig. 5). The current study determined that these hub genes were also involved in the RIG-I-like receptor signaling (Fig. 6), cytosolic DNA-sensing, toll-like receptor signaling and ribosome biogenesis pathways. In addition, it was determined that the hub genes from different probe sets exhibited significant co-expression tendency in multi-experiment microarray datasets (P<0.01; Fig. 7).
Table I.
Top 15 significant interactions experimentally validated with highest combined score.
Node1
Node2
Combined score
RPL23A
RPL7
0.999
RPL23A
RPS23
0.999
RPL23A
RPL14
0.999
RPL14
RPL7
0.999
RPL14
RPL13
0.999
RPL7
RPL13
0.999
RPL23A
RPL13
0.999
RPL7
RPS23
0.999
ISG15
USP18
0.999
RPS23
RPS4Y1
0.999
RPL14
RPS23
0.999
RPL13
RPS23
0.999
RPL23A
RPL22
0.999
HIST1H2AC
HIST1H2BD
0.999
RPL22
RPL14
0.998
Figure 4.
Protein-protein interaction network of DEGs. Node size indicates the connectivity degree; larger circles indicate a higher degree. Circles indicate genes. Edges indicate the interaction between genes Red circles, upregulated DEGs. Green circles, downregulated DEGs. Squares, pathways associated with systemic lupus erythematosus.
Figure 5.
Top 3 modules from the protein-protein interaction network and the enriched cellular pathways of module genes. (A) Module 1 and (B) respective enriched pathway. (C) Module 2 and (D) respective enriched pathway. (E) Module 3 and (F) respective enriched pathway. FDR, false discovery rate.
Figure 6.
Kyoto Encyclopedia of Genes and Genomes pathway of hsa04622: RIG-I-like receptor signaling pathway. Red indicates upregulated DEGs, green indicates downregulated DEGs and purple indicates genes not identified as DEGs in systemic lupus erythematosus. Ellipses represent hub genes.
Figure 7.
Co-expression of hub genes from different probe sets in multi-experiment microarray datasets.
Discussion
The cause of SLE has not been fully elucidated. It is believed to involve hormonal, environmental and genetic factors. However, genetic factors are essential to determine the process of SLE occurrence and development (25). However, no single causal gene has been identified. Instead, multiple genes may influence a person's chance of developing SLE when triggered by environmental factors (26). Microarray and high-throughput sequencing may be used to quantify the expression levels of large numbers of genes simultaneously, combined with bioinformatics analysis, this may allow for the identification of the associated pathways and key genes with complex diseases. A total of 310 DEGs that were differently expressed in SLE compared with healthy controls in the present study using the gene expression profile contained in GSE65391.According to BPs identified of the identified SLE-associated DEGs, a significant portion of upregulated DEGs were enriched in response to virus, type I interferon signaling pathway and negative regulation of viral genome replication. Downregulated DEGs were primarily involved in the immune response, alpha-beta T cell differentiation and regulation of immune response. These findings were in accordance with the features of immune abnormalities belonging to autoimmune diseases. Virus infection is also one of the primary reasons for onset of SLE in patients. Additionally, the DEGs were mainly enriched in pathways associated with susceptibility to immune diseases and infectious diseases including HSV, influenza A, SLE, measles, inflammatory bowel disease, tuberculosis and rheumatoid arthritis. The DEGs were also involved in various signaling pathways, such as RIG-I-like receptor signaling, cytosolic DNA-sensing and toll-like receptor signaling pathways. These findings were consistent with the knowledge that SLE is a type of multiple systemic autoimmune disease which may affect many organ systems including the skin, joints and internal organ (27).Based on the PPI network, the present study identified the top 10 hub genes: OAS1, MX2, IFIT1, IRF7, IFIH1, STAT1, ISG15, DDX58, IFIT3 and OAS2. Further analysis revealed that these hub genes, from different probe sets, exhibited significant co-expression tendency in multi-experiment microarray datasets (P<0.01), which indicated that all of these hub genes may work together in the pathogenesis of SLE. OAS1, an IFN-regulated gene, had the highest degree of connectivity, is expressed as part of the innate immune response to viruses and enriched in HSV infection, hepatitis C, measles, influenza A and the NOD-like receptor signaling pathway. Rios et al (28) reported that OAS1 may activate RNase L leading to degradation of cellular and viral RNA, inhibiting thus viral replication. Previous studies revealed that viral infection and antiviral immunity contributed to the development of autoimmunity (29). The second hub gene MX2 was primarily enriched in interferon-g signaling and immune system pathways. GO annotations revealed that MX2 has a BP of GTP binding and GTPase activity. Fernández-Trujillo et al (30) reported that MX2 protein has an antiviral activity against RNA and DNA viruses in vitro. The third hub gene IFIT1, one of IFN-induced antiviral proteins, was the first gene described as a candidate gene for SLE (31). It has been previously reported that IFIT1 may interact with Rho/Rac guanine nucleotide exchange factor, regulate the activation of Rho/Rac proteins and contribute to the pathogenesis of SLE (32). IRF7, a member of the interferon regulatory transcription factor (IRF) family, was involved in multifarious functions and signaling pathways associated with the immune system, including the toll-like receptor signaling, NOD-like receptor signaling, RIG-I-like receptor signaling and cytosolic DNA-sensing pathways. Furthermore, IRF7 may efficiently activate IFN-β and the IFN-α and mediate their induction in the virus-activated, myeloid differentiation primary response 88 (MyD88)-independent pathway and the toll-like receptor-activated, MyD88-dependent pathway. Previous studies had demonstrated that the IFN system had an important role in the etiopathogenesis of SLE (33,34). IFIH1, a member of the RIG-I like receptor (RLR) family, was involved in immune system and cytosolic sensors of pathogen-associated DNA pathway. Lincez et al (35) reported that reduced expression of IFIH1 may prevent autoimmune diabetes. In addition, a previous study demonstrated that the gene polymorphism of IFIH1 was associated with increased sensitivity to IFN-α and serologic autoimmunity in patients with lupus (36). Another hub gene, STAT1, was reported to be involved in perturbing regulatory T cell homeostasis, which may be a possible marker of SLE disease severity by augmenting STAT1 signaling (37). ISG15, a member of the Ub1 family, is a 17 kDa secreted protein and has a significant sequence homology to ubiquitin (38). ISG15 may induce the synthesis and secretion of IFN-γ from B-cell-depleted lymphocytes. It has been previously reported that ISG15 acts as a cytokine modulator in immune response (39,40). It has also been previously reported that IFN-α/IFN-γ may induce the activation of the JAK-STAT1 pathway, regulate proliferation and activation of immunocytes and lead to the abnormal activation of the immune system, which may affect the occurrence and development of SLE (41–43). Therefore, ISG15 may be involved in the abnormal immune response of SLE. DDX58 was also enriched in the RIG-I-like receptor signaling pathway and had a synthetic function as viral double-stranded RNA recognition and the regulation of immune response. However, the specific function of DDX58 in SLE requires further investigation. IFIT3 and OAS2 are interferon-inducible genes whose encoding proteins are involved in the innate immune response to viral infection, which may lead to poor prognosis of SLE.Through module analysis of the PPI network, the present study determined that the development of SLE was closely associated with the RIG-I-like receptor signaling, cytosolic DNA-sensing, toll-like receptor signaling and ribosome pathways. A previous study proposed that patients with SLE had an increased expression of type I IFN and the production of IFN depended on the RIG-I-like receptor signaling pathway (44). The RIG-I-like receptor signaling pathway could also trigger innate immune responses against viral infections that may limit viral replication and stimulate adaptive immunity (45). Further analysis revealed that 4 hub genes were involved in the RIG-I-like receptor signaling pathway and had core positions in the pathway (Fig. 6). The cytosolic DNA-sensing pathway was reported to signal via stimulator of interferon genes that mediate immunity to pathogens and promoted autoimmune pathology in DNase II- and DNase III-deficient mice. Previous studies declared that polymorphisms of genes involved in toll-like receptor signaling pathway had a link with the development of SLE (46,47). Module 2 was presented in Fig. 5 and enriched in ribosome biogenesis pathway. This pathway was involved in pre-rRNA processing, ribosome assembly and the synthetic processing of >200 proteins. Ribosomal P protein was a main component part of ribosome (48). Additionally, molecular biomarkers for the diagnosis of SLE were anti-ribosomal P antibodies (49), which was sufficient to confirm that this pathway had a significant effect on the development of SLE.In conclusion, these hub genes may have various roles in the occurrence and development of the SLE, leading to damage of multiple systems in SLE. Combined with bioinformatics analysis, the current study identified key genes and cellular pathways involved in the occurrence and development of SLE. The present study may provide a basis for an improved understanding of SLE. However, the current findings are limited by the lack of experimental verification in vivo and in vitro. Therefore, future experimental studies should be conducted to confirm the expression and function of the identified genes at the protein level, which may be an area of future research.
Authors: Siobhán Smith; Thilini Fernando; Pei Wen Wu; Jane Seo; Joan Ní Gabhann; Olga Piskareva; Eoghan McCarthy; Donough Howard; Paul O'Connell; Richard Conway; Phil Gallagher; Eamonn Molloy; Raymond L Stallings; Grainne Kearns; Lindsy Forbess; Mariko Ishimori; Swamy Venuturupalli; Daniel Wallace; Michael Weisman; Caroline A Jefferies Journal: J Autoimmun Date: 2017-03-17 Impact factor: 7.094
Authors: Matthew A Care; Sophie J Stephenson; Nicholas A Barnes; Im Fan; Alexandre Zougman; Yasser M El-Sherbiny; Edward M Vital; David R Westhead; Reuben M Tooze; Gina M Doody Journal: J Immunol Date: 2016-06-29 Impact factor: 5.422
Authors: Damian Szklarczyk; John H Morris; Helen Cook; Michael Kuhn; Stefan Wyder; Milan Simonovic; Alberto Santos; Nadezhda T Doncheva; Alexander Roth; Peer Bork; Lars J Jensen; Christian von Mering Journal: Nucleic Acids Res Date: 2016-10-18 Impact factor: 16.971
Authors: James Bentham; David L Morris; Deborah S Cunninghame Graham; Christopher L Pinder; Philip Tombleson; Timothy W Behrens; Javier Martín; Benjamin P Fairfax; Julian C Knight; Lingyan Chen; Joseph Replogle; Ann-Christine Syvänen; Lars Rönnblom; Robert R Graham; Joan E Wither; John D Rioux; Marta E Alarcón-Riquelme; Timothy J Vyse Journal: Nat Genet Date: 2015-10-26 Impact factor: 38.330
Authors: Abby D Benninghoff; Melissa A Bates; Preeti S Chauhan; Kathryn A Wierenga; Kristen N Gilley; Andrij Holian; Jack R Harkema; James J Pestka Journal: Front Immunol Date: 2019-12-13 Impact factor: 7.561