Literature DB >> 32190668

Investigation of Potential Genetic Biomarkers and Molecular Mechanism of Ulcerative Colitis Utilizing Bioinformatics Analysis.

Jiaqi Zhang1, Xue Wang2, Lin Xu1, Zedan Zhang1, Fengyun Wang1, Xudong Tang3.   

Abstract

OBJECTIVES: To reveal the molecular mechanisms of ulcerative colitis (UC) and provide potential biomarkers for UC gene therapy.
METHODS: We downloaded the GSE87473 microarray dataset from the Gene Expression Omnibus (GEO) and identified the differentially expressed genes (DEGs) between UC samples and normal samples. Then, a module partition analysis was performed based on a weighted gene coexpression network analysis (WGCNA), followed by pathway and functional enrichment analyses. Furthermore, we investigated the hub genes. At last, data validation was performed to ensure the reliability of the hub genes.
RESULTS: Between the UC group and normal group, 988 DEGs were investigated. The DEGs were clustered into 5 modules using WGCNA. These DEGs were mainly enriched in functions such as the immune response, the inflammatory response, and chemotaxis, and they were mainly enriched in KEGG pathways such as the cytokine-cytokine receptor interaction, chemokine signaling pathway, and complement and coagulation cascades. The hub genes, including dual oxidase maturation factor 2 (DUOXA2), serum amyloid A (SAA) 1 and SAA2, TNFAIP3-interacting protein 3 (TNIP3), C-X-C motif chemokine (CXCL1), solute carrier family 6 member 14 (SLC6A14), and complement decay-accelerating factor (CD antigen CD55), were revealed as potential tissue biomarkers for UC diagnosis or treatment.
CONCLUSIONS: This study provides supportive evidence that DUOXA2, A-SAA, TNIP3, CXCL1, SLC6A14, and CD55 might be used as potential biomarkers for tissue biopsy of UC, especially SLC6A14 and DUOXA2, which may be new targets for UC gene therapy. Moreover, the DUOX2/DUOXA2 and CXCL1/CXCR2 pathways might play an important role in the progression of UC through the chemokine signaling pathway and inflammatory response.
Copyright © 2020 Jiaqi Zhang et al.

Entities:  

Mesh:

Substances:

Year:  2020        PMID: 32190668      PMCID: PMC7073481          DOI: 10.1155/2020/4921387

Source DB:  PubMed          Journal:  Biomed Res Int            Impact factor:   3.411


1. Introduction

Ulcerative colitis (UC) is a chronic nonspecific inflammation of the rectum and colon whose etiology and pathogenesis are not yet well defined [1]. UC has a high incidence in western countries, with increasing incidence in the developing countries [2]. The etiology of UC is considered to be multifactorial, including genetic and environmental factors such as urban lifestyles, dietary factors, high levels of hygiene, and gut microbiota, all of which are associated with disease progression; however, the pathogenesis of UC remains unclear [3]. Bioinformatics can be effectively used to analyze UC microarray data, providing theoretical reference for further exploration of the mechanisms of inflammatory bowel disease, and help to find potential target genes. As the latest bioinformatics research method, WGCNA is commonly used to reveal differences between genes in different samples [4]. In this study, UC gene expression data uploaded by Li et al. were downloaded. We identified the DEGs between UC samples and normal samples. Then, a module partition analysis was performed based on a WGCNA, followed by pathway and functional enrichment analyses. Then, data validation was performed to ensure the reliability of the hub genes. This study forecasts the molecular mechanism of UC and the potential biomarkers for UC therapy.

2. Materials and Methods

2.1. Microarray Data

The gene expression profile of GSE87473 was obtained from the GEO database [5] (http://www.ncbi.nlm.nih.gov/geo/). A total of 127 mucosal biopsy samples were obtained from 106 UC patients and 21 control subjects for subsequent analysis. The UC samples consisted of adult UC samples (n = 87) and pediatric UC samples (n = 19). Adult UC patients of 44 male and 43 female were enrolled from all geographic regions of the USA and from both metropolitan and rural settings, with an average age of 41 (race of samples not available) [6]. Pediatric UC patients of 8 male and 11 female obtained from a phase 1b clinical trial of golimumab in pediatric patients, with an average age of 15, and only subjects of European ancestry were applied [7].Normal samples (n = 21) were obtained from the Department of Gastroenterology, Perelman School of Medicine at the University of Pennsylvania (Philadelphia, PA) and the Department of Gastroenterology, University Hospital Gasthuisberg (Leuven, Belgium) [6], and information on age, gender, and race was not available.

2.2. Data Preprocessing and DEG Analysis

There were a total of 20741 probes in the present dataset. GEO2R (http://www.ncbi.nlm.nih.gov/geo/geo2r/) is based on R that comes with the GEO databases, which was used to identify DEGs between UC and control samples. |log‐fold change(LFC)| > 1 and P values <0.05 were selected as the thresholds for DEG screening.

2.3. WGCNA Analysis

The coexpression network analysis was performed using WGCNA (version: 1.63) [8]. WGCNA is a systematic biological method for constructing scale-free networks using gene expression data. First, we selected the soft threshold for network construction. The soft threshold was used to transform the similarity matrix of gene expression into adjacency matrix, which enhances strong correlation and weakens correlation at the exponential level. Second, the adjacency matrix was transformed into a topological matrix. Based on TOM, we used the average-linkage hierarchical clustering method to cluster genes. According to the standard of hybrid dynamic cut tree, we set the minimum number of base 30 for each gene network module. After determining the gene module by the dynamic shearing method, we calculated the eigenvectors of each module in turn, then clustered the modules, merged the nearer modules into new modules, and set height = 0.25 [9]. Third, we calculated the module eigengene (ME) of each module, which represents the expression level for each module. We also calculated the correlation between the clinical traits and ME in each module. At last, we calculated the gene significance (GS) of each gene in the module, which represented the correlation between the genes and sample.

2.4. Function and Pathway Enrichment Analysis

We used the DAVID 6.8 (https://david.ncifcrf.gov) software for the GO-biological function (GO-BP) and KEGG pathway analyses of the genes in main modules. We selected the P-false discovery rate (FDR) of <0.05 as the threshold for the identification of significant GO-BP terms and KEGG pathways.

2.5. Hub Genes Investigation

According to the feature vector of each module, the correlation of the gene expression in the module was analyzed by WGCNA. Genes with correlations greater than 0.9 in each module were considered hub genes.

2.6. Data Validation

To verify the robustness of hub genes, the microarray data of GSE75214 [10] ([HuGene-1_0-st] Affymetrix Human Gene 1.0 ST Array [transcript (gene) version]), which included 108 tissue samples (97UC samples and 11 control samples), were downloaded from the GEO database. GraphPad Prism 7.00 software was used to calculate the area under the curve (AUC).

3. Results

3.1. DEGs between UC Samples and Normal Samples

We identified 988 DEGs, including 466 upregulated DEGs and 522 downregulated DEGs with PFDR < 0.05 and |LFC| > 1. The heatmap and volcano plot are shown in Figures 1(a) and 1(b). Obviously, the heatmap showed that these DEGs could be used to distinguish UC from control samples.
Figure 1

The heat map, volcano plot, and weighted gene coexpression network analysis (WGCNA) of differentially expressed genes (DEGs) between the UC group and the control group. (a) The heatmap for DEGs. (b) The volcano plot for DEGs. Gray dots represent the genes that are not differentially expressed, red dots represent the upregulated genes, and the blue dots represent the downregulated genes. (c) Determination of the soft threshold in the WGCNA algorithm. The approximate scale-free fit index can be attained at the soft-thresholding power of 6. (d) Clustering dendrograms showing 4 modules that contain a group of highly connected genes. Each designated color represents a certain gene module.

3.2. WGCNA Analysis

We performed WGCNA analysis using the 988 DEGs. The coexpression network is a scale-free network, which means the logarithm log(k) of a node with a connection degree of k is negatively correlated with the logarithm log(P(k)) of the probability of occurrence of the node, and the correlation coefficient is greater than 0.8. R software package WGCNA was used to build a weighted coexpression network. To ensure that the network was a scale-free network, we chose a soft threshold of β = 6 (Figure 1(c)). The DEGs were clustered into 5 modules, described here as including turquoise (510 DEGs), blue (393 DEGs), brown (48 DEGs), yellow (30 DEGs), and gray (7 DEGs) (Figure 1(d)). The turquoise and blue modules were downregulated, while the brown and yellow modules were upregulated (Figure 2). Moreover, the turquoise module (correlation index: −0.68, P=3.0E − 18) was negatively correlated with the disease presence and the yellow (correlation index: 0.51, P=6.0E − 10), blue (correlation index: 0.62, P=1.0E − 14), and brown modules (correlation index: 0.73, P=3E − 22) were positively correlated with the disease presence; while the turquoise module (correlation index: −0.65, P=1.0E − 16) was negatively correlated with the disease extent, the yellow (correlation index: 0.42, P=1.0E − 16), blue (correlation index: 0.43, P=4E − 07), and brown modules (correlation index: 0.52, P=6E − 10) were positively correlated with the disease extent (Figure 3(a)).The average gene significance (GS) for each module indicated that the brown module was the most related to disease presence, and the turquoise module was most related to disease extent (limited or extensive) (Figure 3(b)).
Figure 2

The module expression pattern. The heatmap represents the expression of genes where each row represents a gene and each column represents a sample. The red color in the heatmap represents upregulated genes, while the green color represents downregulated genes. The bar charts represent the eigengene profiles of four WGCNA modules; the color of the bar chart represents the color of the related module.

Figure 3

Relationships of module eigengenes and the samples. (a) The module-trait relationships. The number in the first row of the square on the right is the correlation coefficient to the UC group shown at the top of each row with the P values printed below the correlations in parentheses, and the number on the left is the correlation coefficient to the disease extent (limited or extensive) of the UC group. The rows are colored based on the correlation of the module to the UC group: red for a positive correlation and blue for a negative correlation. (b) The average gene significance (GS) of all genes (i.e., module significance, MS) of each module. Modules with greater MS values were considered to have more connection with the disease.

3.3. Functional and Pathway Enrichment for DEGs

The top 3 GO-BP and KEGG terms enriched by DEGs are shown in Table 1 and Figure 4. The DEGs in the brown module were mainly involved in functions such as inflammatory response (P=4.88E − 07) and pathways such as the chemokine signaling pathway (P=0.004195). The DEGs in the turquoise module were mainly involved in functions such as the oxidation-reduction process (P=9.70E − 3) and pathways such as metabolic pathways (P=2.8E − 09).
Table 1

The results for GO-BP function and KEGG pathway enrichment analysis (top 3 in the brown and turquoise module are listed).

ModuleGO-BP terms P valueKEGG terms P value
BrownInflammatory response4.88E − 07Chemokine signaling pathway0.004195
Innate immune response1.39E − 06Cytokine-cytokine receptor interaction0.010708
Chemotaxis7.32E − 07Complement and coagulation cascades0.001897

TurquoiseOxidation-reduction process9.7E − 3Metabolic pathways2.8E − 09
Transport1.4E − 5Drug metabolism, cytochrome P4501.3E − 11
Metabolic process1.2E − 10Chemical carcinogenesis2.1E − 10
Figure 4

The results for GO-BP function and KEGG pathway enrichment analysis (the top 20 in the brown- and turquoise-colored modules are listed). (a) The GO-BP function enrichment of DEGs in the brown module. (b) KEGG pathway enrichment of the DEGs in the brown module. (c) The GO-BP function enrichment of the DEGs in the turquoise module. (d) KEGG pathway enrichment of the DEGs in the turquoise module.

3.4. Hub Genes

The brown module was most relevant to the disease; therefore, we analyzed the correlation of gene expression in the brown module in the following study. Figure 5 shows that dual oxidase maturation factor 2 (DUOXA2), serum amyloid A (SAA) 1 and SAA2, TNFAIP3-interacting protein 3 (TNIP3), C-X-C motif chemokine (CXCL1), solute carrier family 6 member 14 (SLC6A14), and complement decay-accelerating factor (CD antigen CD55) were selected as hub genes.
Figure 5

Hub genes in the brown module. A total of six genes were selected as hub genes, and the correlation coefficients (ranging from 0.90–0.93) and P values are shown in the lower right corner of each image.

3.5. Data Validation

To verify the robustness of the hub genes, the validation data GSE75214 were obtained from the GEO database. We performed ROC curve analysis using GraphPad Prism7.00. The results of the analysis showed that the hub genes related to UC, including DUOXA2, SAA1, SAA2, TNIP3, CXCL1, SLC6A14, and CD55, were identified as potential tissue biopsy molecules for UC diagnosis (Table 2 and Figure 6).
Table 2

Results of AUCs for hub genes.

Hub genesUC vs. normal
AUC P value95% CI
Disease-related
DUOXA20.8894<0.00010.826 to 0.9528
SAA1///SAA20.80970.00080.6975 to 0.9220
TNIP30.89690.00020.8366 to 0.9572
CXCL10.8857<0.00010.8151 to 0.9562
SLC6A140.9822<0.00010.9606 to 1.004
CD550.9297<0.00010.8814 to 0.978
Figure 6

Predicted ROC curves of the UC hub genes. The prediction of UC vs. control was robust, and the area under the curve (AUC) of 6 hub genes for UC vs. control ranged from 0.8097 to 0.9822.

4. Discussion

UC is a kind of inflammatory bowel disease that is difficult to treat, easy to recur, and prone to cancerization [11, 12].Recently, many potential biomarkers for early diagnosis or treatment of UC have been identified after the development of biology technology; however, the mechanism of UC is still unknown. In this study, UC gene expression data were analyzed by WGCNA. We screened a total of 988 DEGs between UC samples and control samples, and identified 5 modules. Based on the correlation between the modules and occurrence or development of UC, we identified 7 hub genes after data verification. Combined with previous research, SLC6A14 and DUOXA2 might be critical biomarkers for UC diagnosis. SLC6A14 and DUOXA2 are involved in the development and carcinogenesis of UC. Multiple sequencing or microarray studies have shown that SLC6A14 was upregulated in UC patients [5, 10], 58 [13]. SLC6A14 may be involved in colonic inflammation by regulating glutamine (a substrate for SLC6A14) and nitric oxide synthase 2 (coordinated upregulation with SLC6A14 in inflamed cells) [14, 15]. Furthermore, SLC6A14 is one such cancer-specific amino acid transporter and is essential for tumor growth [16]. DUOXA2, an ROS-generating enzyme expressed in the lower gastrointestinal tract, plays a critical role in host mucosal defense [17], which could be induced by the changes of gut microbiota [18]. DUOXA2 is the maturation partner of DUOX2, which participates in the signaling pathways against inflammation and regulates reactive oxygen species (ROS), mucin, IL-8, and matrix metalloproteinase-9 against invading microbial pathogens [19]. However, overproduction of H2O2 could lead to oxidative stress resulting in oxidative injuries and mucosal barrier impairment [20]. In addition to its role in the persistent and recurrent inflammatory of UC, the DUOXA2/DUOX2 pathway is also involved in the development of UC-associated adenomas and colorectal cancer [21-23]. We supposed that SLC6A14 and DUOXA2 aberrantly expressed might promote the initiation and development of UC. Our research also found that SAA, TNIP3, CD55, and CXCL1 were potential biomarkers for UC. SAA can reflect inflammation of UC at an early stage due to its higher sensitivity and specificity [24-26]. CXCL1 acts by specifically binding to its receptor, C-X-C chemokine receptor type 2 (CXCR2) [27]. Recent studies have shown that the CXCL1/CXCR2 signaling pathway regulates the inflammatory response; moreover, the pathway causes tumor cell proliferation, angiogenesis, and lymph angiogenesis and promotes tumor invasion and vascular metastasis [28]. Previous studies have shown increased CD55 in stools and colonic mucosa of disease activity in patients with UC [29, 30] and CD55 as the decay-accelerating factor can reflect the carcinogenesis of UC [31-33]. TNIP3 is a negative regulator of nuclear factor (NF)-κB signal transduction in response to multiple stimuli [34]. Ishani Majumdar's study has demonstrated that the expression of TNIP3 negatively correlates with diseases severity in UC [35], which was contrary to our results. The contrary results might be related to the difference in disease severity and the genetic testing method. The results of functional and pathway DEGs enrichment in this study show that the biological functions involved in the pathogenesis of UC include the inflammatory response, innate immune response, and chemotaxis, indicating that the pathogenesis of UC was multifactorial, involving epithelial barrier defects, genetic predisposition, environmental factors, and dysregulated immune responses. The 7 hub genes screened in this study are not only related to mucosal inflammation but they also accelerate the progression of colon cancer, so they should be given proper attention in the treatment of UC. Although we found 7 hub genes closely related to UC and confirmed the robustness of their diagnostic value, which may be useful for us to improve our understanding of the molecular mechanism of UC and as a potential prognostic and diagnostic biomarker, however, there were some limitations in this study such as small sample size and lack of verification test; thus, we still need large sample size with a wide verification analysis to confirm our hypothesis.

5. Conclusions

In conclusion, DUOXA2, A-SAA, TNIP3, CXCL1, SLC6A14, and CD55 might be used as potential biomarkers for UC tissue biopsy, especially SLC6A14 and DUOXA2, which may be new targets for UC gene therapy. Furthermore, DUOXA2/DUOX2 and CXCL1/CXCR2 pathways may play important roles in UC progression via the inflammatory response.
  34 in total

Review 1.  Dual oxidase in mucosal immunity and host-microbe homeostasis.

Authors:  Yun Soo Bae; Myoung Kwon Choi; Won-Jae Lee
Journal:  Trends Immunol       Date:  2010-06-25       Impact factor: 16.687

2.  Detection of elafin as a candidate biomarker for ulcerative colitis by whole-genome microarray screening.

Authors:  Carl-Fredrik Flach; Anders Eriksson; Eva Jennische; Stefan Lange; Charina Gunnerek; Ivar Lönnroth
Journal:  Inflamm Bowel Dis       Date:  2006-09       Impact factor: 5.325

3.  Drug therapies and the risk of malignancy in Crohn's disease: results from the TREAT™ Registry.

Authors:  Gary R Lichtenstein; Brian G Feagan; Russell D Cohen; Bruce A Salzberg; Robert H Diamond; Wayne Langholff; Anil Londhe; William J Sandborn
Journal:  Am J Gastroenterol       Date:  2014-01-07       Impact factor: 10.864

Review 4.  Emerging functions of serum amyloid A in inflammation.

Authors:  Richard D Ye; Lei Sun
Journal:  J Leukoc Biol       Date:  2015-06-30       Impact factor: 4.962

Review 5.  Disorders of H2O2 generation.

Authors:  Marina Muzza; Laura Fugazzola
Journal:  Best Pract Res Clin Endocrinol Metab       Date:  2017-05-10       Impact factor: 4.690

6.  Serum amyloid A induces interleukin-6 in dermal fibroblasts via Toll-like receptor 2, interleukin-1 receptor-associated kinase 4 and nuclear factor-κB.

Authors:  Steven O'Reilly; Rachel Cant; Marzena Ciechomska; James Finnigan; Fiona Oakley; Sophie Hambleton; Jacob M van Laar
Journal:  Immunology       Date:  2014-11       Impact factor: 7.397

7.  Expression and prognostic value of CD97 and its ligand CD55 in pancreatic cancer.

Authors:  Zheng He; Hui Wu; Yanli Jiao; Jun Zheng
Journal:  Oncol Lett       Date:  2014-12-01       Impact factor: 2.967

8.  WGCNA: an R package for weighted correlation network analysis.

Authors:  Peter Langfelder; Steve Horvath
Journal:  BMC Bioinformatics       Date:  2008-12-29       Impact factor: 3.169

9.  Ulcerative colitis mucosal transcriptomes reveal mitochondriopathy and personalized mechanisms underlying disease severity and treatment response.

Authors:  Yael Haberman; Rebekah Karns; Phillip J Dexheimer; Melanie Schirmer; Judith Somekh; Ingrid Jurickova; Tzipi Braun; Elizabeth Novak; Laura Bauman; Margaret H Collins; Angela Mo; Michael J Rosen; Erin Bonkowski; Nathan Gotman; Alison Marquis; Mason Nistel; Paul A Rufo; Susan S Baker; Cary G Sauer; James Markowitz; Marian D Pfefferkorn; Joel R Rosh; Brendan M Boyle; David R Mack; Robert N Baldassano; Sapana Shah; Neal S Leleiko; Melvin B Heyman; Anne M Grifiths; Ashish S Patel; Joshua D Noe; Bruce J Aronow; Subra Kugathasan; Thomas D Walters; Greg Gibson; Sonia Davis Thomas; Kevin Mollen; Shai Shen-Orr; Curtis Huttenhower; Ramnik J Xavier; Jeffrey S Hyams; Lee A Denson
Journal:  Nat Commun       Date:  2019-01-03       Impact factor: 14.919

10.  Host immunoglobulin G selectively identifies pathobionts in pediatric inflammatory bowel diseases.

Authors:  Heather Armstrong; Misagh Alipour; Rosica Valcheva; Michael Bording-Jorgensen; Juan Jovel; Deenaz Zaidi; Prachi Shah; Yuefei Lou; Cory Ebeling; Andrew L Mason; Dawson Lafleur; Jeremy Jerasi; Gane K-S Wong; Karen Madsen; Matthew W Carroll; Hien Q Huynh; Levinus A Dieleman; Eytan Wine
Journal:  Microbiome       Date:  2019-01-03       Impact factor: 14.650

View more
  9 in total

Review 1.  The Importance of CXCL1 in the Physiological State and in Noncancer Diseases of the Oral Cavity and Abdominal Organs.

Authors:  Jan Korbecki; Iwona Szatkowska; Patrycja Kupnicka; Wojciech Żwierełło; Katarzyna Barczak; Iwona Poziomkowska-Gęsicka; Jerzy Wójcik; Dariusz Chlubek; Irena Baranowska-Bosiacka
Journal:  Int J Mol Sci       Date:  2022-06-28       Impact factor: 6.208

2.  Artemisinin Alleviates Intestinal Inflammation and Metabolic Disturbance in Ulcerative Colitis Rats Induced by DSS.

Authors:  Xuemei Jia; Yunxiao Gao; Liran Liu; Yuxi Guo; Jie Wang; Hongyu Ma; Runyuan Zhao; Bolin Li; Yao Du; Qian Yang
Journal:  Evid Based Complement Alternat Med       Date:  2022-04-19       Impact factor: 2.650

3.  MicroRNA-532-5p protects against cerebral ischemia-reperfusion injury by directly targeting CXCL1.

Authors:  Yuanyuan Shi; Zhongquan Yi; Panwen Zhao; Yun Xu; Pinglei Pan
Journal:  Aging (Albany NY)       Date:  2021-04-18       Impact factor: 5.682

4.  Based on Network Pharmacology to Explore the Molecular Targets and Mechanisms of Gegen Qinlian Decoction for the Treatment of Ulcerative Colitis.

Authors:  Meiqi Wei; He Li; Qifang Li; Yi Qiao; Qun Ma; Ruining Xie; Rong Wang; Yuan Liu; Chao Wei; Bingbing Li; Canlei Zheng; Bing Sun; Bin Yu
Journal:  Biomed Res Int       Date:  2020-11-24       Impact factor: 3.411

5.  Potential Biomarkers Associated with Differential Manifestations of Ulcerative Colitis (UC) in Uyghur and Han Population in China.

Authors:  Jia Jie Lu; Munila Maimaiti; Huan Liu; Wei Dong Liu; Wen Jia Hui; Xiao Ling Huang; Feng Gao
Journal:  J Inflamm Res       Date:  2021-12-30

6.  Identification of Potential Biomarkers and Immune Infiltration Characteristics in Ulcerative Colitis by Combining Results from Two Machine Learning Algorithms.

Authors:  Minchun Bu; Xiandong Cao; Bo Zhou
Journal:  Comput Math Methods Med       Date:  2022-08-01       Impact factor: 2.809

7.  Increased SERPINA3 Level Is Associated with Ulcerative Colitis.

Authors:  Jingwei Zhang; Wei Wang; Shenglong Zhu; Yongquan Chen
Journal:  Diagnostics (Basel)       Date:  2021-12-16

8.  Identification of the key genes and pathways involved in B cells in primary Sjögren' s syndrome.

Authors:  Shizhen Lei; Yi Zhang
Journal:  Bioengineered       Date:  2021-12       Impact factor: 3.269

9.  Identification of diagnostic genes and vital microRNAs involved in rheumatoid arthritis: based on data mining and experimental verification.

Authors:  Conglin Ren; Mingshuang Li; Yang Zheng; Fengqing Wu; Weibin Du; Renfu Quan
Journal:  PeerJ       Date:  2021-05-14       Impact factor: 2.984

  9 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.