Yufei Yang, Lijun Xu, Yuqi Qiao, Tianrong Wang, Qing Zheng.
Abstract
Objective: Crohn's disease (CD), a chronic recurrent illness, is a type of inflammatory bowel disease whose incidence and prevalence rates are gradually increasing. However, there is no universally accepted criterion for CD diagnosis. The aim of this study was to create a diagnostic prediction model for CD and identify immune cell infiltration features in CD.
Keywords: Crohn’s disease; artificial neural network model; bioinformatics; immune cells; inflammatory bowel disease
Year: 2022 PMID: 36186439 PMCID: PMC9520627 DOI: 10.3389/fgene.2022.976578
Source DB: PubMed Journal: Front Genet ISSN: 1664-8021 Impact factor: 4.772
Baseline characteristics of the training and validation cohorts.
| Dataset ID | Platform | Crohn’s disease | Normal | Total |
|---|---|---|---|---|
| GSE16879 | GPL570 | 37 | 12 | 49 |
| GSE112366 | GPL13158 | 141 | 26 | 167 |
| GSE36807 | GPL570 | 13 | 7 | 20 |
FIGURE 1. Flow diagram of the study design.
FIGURE 2. (A) Volcano plot of differentially expressed genes (DEGs) between the Crohn’s disease and control groups. (B) Heatmap of DEGs between the Crohn’s disease and control groups. (C) Heatmap of DEGs between the colon and ileum groups within the Crohn’s disease group.
FIGURE 3. GO (A) and KEGG (B) enrichment analysis of differentially expressed genes between the CD and healthy control groups.
FIGURE 4. (A) High-scoring physical interactions from the STRING network. (B) Protein-protein interaction network of differentially expressed genes. (C) The top 15 hub genes of the modules and their degree values.
FIGURE 5. Identification of candidate genes by random forest. (A) The influence of the number of decision trees on the error rate. The x-axis represents the number of decision trees, and the y-axis indicates the error rate. (B) The importance of the top 30 genes identified by random forest. The candidate genes were identified based on the algorithm requirements of the random forest.
FIGURE 6. Construction of an artificial neural network (ANN). (A) The result of the ANN. (B) The AUC of the training cohort. (C) The AUC of the validation cohort.
Gene weights of the candidate genes.
| Gene symbol | Gene weight |
|---|---|
| S100A8 | 23.07970106 |
| PDZK1IP1 | 17.22093684 |
| AQP9 | 23.43978408 |
| DUOXA2 | 24.11507018 |
| DUOX2 | 23.74934357 |
| MUC1 | −0.79382263 |
| CXCL3 | 31.40582089 |
| WNT5A | 31.06980029 |
| CXCL2 | 1.26885933 |
| IL1RN | 2.39195034 |
| LCN2 | 7.68460207 |
| S100P | 32.09364659 |
| GPX8 | 13.60179552 |
| CFI | 13.42181831 |
| CXCL6 | −1.68997642 |
| CXCL5 | 4.59815424 |
| S100A9 | 24.38535677 |
| CXCL1 | −1.02826724 |
| FPR1 | −2.32435006 |
| IL1B | 5.20656088 |
| MMP3 | 0.1723395 |
| DHRS9 | 35.72181487 |
| TCN1 | 33.86358796 |
| PI3 | −0.52618612 |
| CEACAM5 | −3.4219667 |
| TFF1 | 34.0373206 |
| ADM | 17.85158596 |
| C2CD4A | −0.44643442 |
| GUCA2A | −35.67123135 |
| CDHR1 | −1.0287512 |
Binary logistic regression analysis.
| Step | Variable | B | S.E. | Wald | df | Sig. | Exp(B) | 95% C.I. for Exp(B), lower | 95% C.I. for Exp(B), upper |
|---|---|---|---|---|---|---|---|---|---|
| Step 1 | Disease location | −.985 | .533 | 3.409 | 1 | .065 | .373 | .131 | 1.062 |
| | Gene score | 4.290 | .901 | 22.654 | 1 | .000 | 72.946 | 12.469 | 426.739 |
| | Constant | −1.169 | .877 | 1.776 | 1 | .183 | .311 | | |
Variable(s) entered on step 1: Disease location, Gene score.
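The columns of the logistic regression table are linked by standard identities: the Wald statistic is (B / S.E.)², Exp(B) is the odds ratio e^B, and the Wald 95% confidence interval is exp(B ± z·S.E.). The following is a minimal sketch that recomputes these quantities from the reported B and S.E. values. The names `COEFFICIENTS`, `Z_95`, and `summarize` are illustrative and not part of the source, and it is an assumption that the statistics package used the usual Wald interval.

```python
import math

# Coefficients (B) and standard errors (S.E.) copied from the table above.
COEFFICIENTS = {
    "Disease location": (-0.985, 0.533),
    "Gene score": (4.290, 0.901),
}

# Two-sided 95% critical value of the standard normal distribution.
Z_95 = 1.959964


def summarize(b, se, z=Z_95):
    """Return (Wald statistic, odds ratio, CI lower, CI upper) for one coefficient."""
    wald = (b / se) ** 2              # Wald chi-square with 1 degree of freedom
    odds_ratio = math.exp(b)          # Exp(B)
    lower = math.exp(b - z * se)      # lower bound of the Wald 95% CI for Exp(B)
    upper = math.exp(b + z * se)      # upper bound of the Wald 95% CI for Exp(B)
    return wald, odds_ratio, lower, upper


for name, (b, se) in COEFFICIENTS.items():
    wald, odds_ratio, lower, upper = summarize(b, se)
    print(f"{name}: Wald={wald:.3f}, Exp(B)={odds_ratio:.3f}, "
          f"95% CI=({lower:.3f}, {upper:.3f})")
```

The recomputed values agree with the table only to within rounding, because B and S.E. are reported to three decimal places.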
The accuracy of this model for predicting Crohn’s disease in the training cohort.
| True class | Predicted normal | Predicted Crohn’s disease | Total |
|---|---|---|---|
| Normal | 37 | 1 | 38 |
| Crohn’s disease | 12 | 166 | 178 |
The accuracy of this model for predicting Crohn’s disease in the validation cohort.
| True class | Predicted normal | Predicted Crohn’s disease | Total |
|---|---|---|---|
| Normal | 3 | 4 | 7 |
| Crohn’s disease | 1 | 12 | 13 |
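The two accuracy tables can be condensed into standard summary metrics. The sketch below assumes that rows are the true class and columns the predicted class. The row totals support this reading: 38 and 178 in the training table, and 7 and 13 in the validation table, match the normal and Crohn's disease sample counts in the baseline table. The class name `ConfusionMatrix` and its fields are illustrative, not taken from the source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionMatrix:
    tn: int  # normal samples predicted as normal
    fp: int  # normal samples predicted as Crohn's disease
    fn: int  # Crohn's disease samples predicted as normal
    tp: int  # Crohn's disease samples predicted as Crohn's disease

    @property
    def total(self):
        return self.tn + self.fp + self.fn + self.tp

    @property
    def accuracy(self):
        # Fraction of all samples classified correctly.
        return (self.tn + self.tp) / self.total

    @property
    def sensitivity(self):
        # Fraction of Crohn's disease samples detected.
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self):
        # Fraction of normal samples correctly classified as normal.
        return self.tn / (self.tn + self.fp)


# Counts copied from the two accuracy tables (assumed layout: rows = true class).
TRAINING = ConfusionMatrix(tn=37, fp=1, fn=12, tp=166)
VALIDATION = ConfusionMatrix(tn=3, fp=4, fn=1, tp=12)

for label, cm in (("training", TRAINING), ("validation", VALIDATION)):
    print(f"{label}: n={cm.total}, accuracy={cm.accuracy:.3f}, "
          f"sensitivity={cm.sensitivity:.3f}, specificity={cm.specificity:.3f}")
```

Under this reading, the validation cohort contains only 7 normal samples, so its specificity estimate rests on very few observations.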
FIGURE 7. Estimation of the immune composition in tissues using CIBERSORT. (A) The proportion of 22 types of immune cells in CD patients and healthy individuals. (B) Correlation heatmap of 22 types of immune cells. (C) Differences in the amount of immune cell infiltration between CD and control samples.