Hanna Bragde, Ulf Jansson, Mats Fredrikson, Ewa Grodzinsky, Jan Söderman.
Abstract
Establishing a celiac disease (CD) diagnosis can be difficult, such as when CD-specific antibody levels are just above cutoff or when small intestinal biopsies show low-grade injuries. To investigate the biological pathways involved in CD and select potential biomarkers to aid in CD diagnosis, RNA sequencing of duodenal biopsies from subjects with either confirmed Active CD (n = 20) or without any signs of CD (n = 20) was performed. Gene enrichment and pathway analysis highlighted contexts, such as immune response, microbial infection, phagocytosis, intestinal barrier function, metabolism, and transportation. Twenty-nine potential CD biomarkers were selected based on differential expression and biological context. The biomarkers were validated by real-time polymerase chain reaction of eight RNA sequencing study subjects, and further investigated using an independent study group (n = 43) consisting of subjects not affected by CD, with a clear diagnosis of CD on either a gluten-containing or a gluten-free diet, or with low-grade intestinal injury. Selected biomarkers were able to classify subjects with clear CD/non-CD status, and a subset of the biomarkers (CXCL10, GBP5, IFI27, IFNG, and UBD) showed differential expression in biopsies from subjects with no or low-grade intestinal injury that received a CD diagnosis based on biopsies taken at a later time point. A large number of pathways are involved in CD pathogenesis, and gene expression is affected in CD mucosa already in low-grade intestinal injuries. RNA sequencing of low-grade intestinal injuries might discover pathways and biomarkers involved in early stages of CD pathogenesis.
Keywords: Gene expression profiling; Gene ontology enrichment analysis; Molecular biomarkers; RNA sequencing; RNA-seq
Year: 2018 PMID: 30097691 PMCID: PMC6208765 DOI: 10.1007/s00018-018-2898-5
Source DB: PubMed Journal: Cell Mol Life Sci ISSN: 1420-682X Impact factor: 9.261
Table 1 Descriptive statistics of the RNA sequencing study groups
| Study group | n | Age at biopsy (years)a | Gender; M/F | Anti-TG2a,b (U/mL) | Anti-DGa,c (U/mL) | HLA-DQ2.5d |
|---|---|---|---|---|---|---|
| M0 | 20 | 8.5 (1.6–17) | 10/10 | 0.20 (0–3.6) | 0.50 (0–3.2) | 0.65, 0.30, 0.050 |
| M3 | 20 | 10 (2.3–18) | 10/10 | 262 (36–2858) | 89 (9–781) | 0.15, 0.75, 0.10 |
Study group M0 contained study subjects with histopathologic assessments corresponding to grade Marsh 0, whereas group M3 contained study subjects with assessments corresponding to grades Marsh 3A, 3B, or 3C. All of the study subjects were on a gluten-containing diet, and subjects in study group M3 received a celiac disease diagnosis, whereas subjects in study group M0 did not
aMean (min–max)
bLevels of IgA autoantibodies against tissue transglutaminase (anti-TG2) in serum. For two subjects in study group M0, no serum results were available, but plasma results were within the range of the serum results. IgG results from two subjects with IgA deficiency were included, which were within the range of the IgA-based results
cLevels of IgG antibodies against deamidated gliadin (anti-DG) in serum. For four subjects in study group M0 and one subject in study group M3, no serum results were available, but plasma results were within the range of the serum results
dFor each group, the fractions of study subjects with 0, 1, or 2 HLA-DQ2.5cis are accounted for
Fig. 1 Flow diagram illustrating the number and type of study subjects included in the different parts of this study. RNA sequencing (upper section) was performed on 20 subjects without CD (study group M0) and 20 subjects with active CD of grade Marsh 3 (study group M3), which are described further in Table 1. Eight study subjects were selected from the RNA sequencing part and used for correlation between results from RNA sequencing and real-time PCR (midsection). Biopsies from these eight study subjects together with biopsies from 43 independent study subjects represent the entire set of 51 biopsies used for the follow-up study of potential CD biomarkers by means of real-time PCR (lower section). Additional data on these 51 study subjects can be found in Tables 2 and 3
Table 2 Descriptive statistics of the two clear groups of study subjects used for the validation of RNA sequencing results by real-time polymerase chain reaction
| Study group | n (M/F) | Marsh grade | Age (years)a | Diagnosis | Diet | Anti-TG2 (U/mL)b | HLA-DQ2.5c |
|---|---|---|---|---|---|---|---|
| Not CD | 10 (2/8) | 0 | 7.9 (1.1–17) | Not CD | GD | 0.63 (0–3.6) | 0.6, 0.4, 0 |
| Active CD | 26 (12/14) | 3A–3C | 7.8 (1.8–18) | CD | GD | 712 (15–6832) | 0.23, 0.62, 0.08 |
The Active CD group included study subjects with histopathologic assessments corresponding to grade Marsh 3 and elevated levels of IgA autoantibodies against tissue transglutaminase (anti-TG2) on a gluten-containing diet (GD). The Not CD group contained study subjects with histopathologic assessments corresponding to grade Marsh 0 and anti-TG2 levels below cutoff on a GD. The principal component analysis (Fig. 3) was constructed based on gene expressions from these two groups
aAge at biopsy, expressed as the mean (min–max)
bLevels of anti-TG2 analyzed in serum, expressed as the mean (min–max)
cFor each group, the fractions of study subjects with 0, 1, or 2 HLA-DQ2.5cis are accounted for. Data was not available for two study subjects in group Active CD
Table 3 Descriptive statistics of study subjects used for the validation of RNA sequencing results by real-time polymerase chain reaction
| Study subject (gender) | Marsh grade | Age (years) | Diagnosis | Context | Diet | Anti-TG2 (U/mL)a | HLA-DQ2.5b |
|---|---|---|---|---|---|---|---|
| 1 (F) | 0–2 | 3.1 | CD | CD laterc | GD | 106 | 1 |
| 2 (M) | 0 | 7.3 | CD | CD laterc | GD | 70 | 1 |
| 3 (F) | 1 | 15 | CD | CD laterc | GD | 93 | 1 |
| 4 (F) | 0 | 15 | CD | CD laterc | GD | 10 | 1 |
| 5 (F) | 0–1 | 9.1 | Not CD | Not CDd | GD | 23 | 1 |
| 6 (F) | 2–3B | 14 | CD | CD | GD | 27 | 0 |
| 7 (F) | 2 | 16 | CD | CD | GD | 50 | 1 |
| 8 (M) | 0 | 7 | CD | Normalized CD | GFD | 0.4 | 1 |
| 9 (F) | 0 | 17 | CD | Normalized CD | GFD | 1.6 | 0 |
| 10 (F) | 0 | 9 | CD | Normalized CD | GFD | 2.2 | 1 |
| 11 (F) | 0 | 17 | CD | Normalized CD | GFD | 1.3 | 1 |
| 12 (F) | 0 | 5 | CD | Normalized CD | GFD | 0.9 | 1 |
| 13 (F) | 3C | 0.7 | CD | M3 TG- | GD | 2.4 | 1 |
| 14 (F) | 3C | 0.8 | CD | M3 TG- | GD | 2.8 | N/A |
| 15 (F) | 3A | 11 | CD | M3 TG- GFD | GFD | 5.4 | 2 |
These study subjects did not fit into the groups in Table 2 and were accounted for as single study subjects, although they were grouped into contexts. One context comprised study subjects who did not receive a celiac disease (CD) diagnosis at the time of the biopsy sampling for this study but received a CD diagnosis at a later biopsy sampling (CD later); another comprised those who received a Not CD diagnosis at a later biopsy sampling (Not CD). Other subjects were included as control biopsies on a gluten-free diet (GFD) after a previous CD diagnosis; some of these subjects returned to a Marsh 0 histology (Normalized CD), but one did not, although levels of IgA autoantibodies against tissue transglutaminase (anti-TG2) normalized (M3 TG- GFD). Other subjects had Marsh 3 histopathologies on a gluten-containing diet (GD) although their anti-TG2 levels were below the cutoff (M3 TG-). All of the study subjects were projected onto the principal component analysis in Fig. 3. Varying histopathologic assessments between pathologists are indicated by ranges in the Marsh grade column
aLevels of anti-TG2 analyzed in serum (study subject 7 analyzed in plasma)
bNumber of HLA-DQ2.5cis. N/A = not available
cStudy subjects 1, 2, 3, and 4 received their CD diagnosis at a biopsy sampling occasion 3 months, 10 months, 4 months, and 1 year and 7 months, respectively, after the biopsy sampling for this study
dStudy subject 5 was judged not to have CD, after repeated sampling over a period of 4 years, based on normal histology and normalized anti-TG2 on GD
Fig. 2 Hierarchical clustering of study subjects with histopathologic assessments corresponding to grade Marsh 3 (M3) or Marsh 0 (M0) based on RNA sequencing data (this study) from eight genes (APOC3, CYP3A4, OCLN, MAD2L1, MKI67, CXCL11, IL17A, and CTLA4) that were included in a previously developed gene expression profile
Table 4 Highly significantly differentially expressed genes (HDEGs), identified by comparing RNA sequencing data from study subjects with active celiac disease (CD) (Marsh grade 3, group M3, Table 1) with data from study subjects without CD (Marsh grade 0, group M0, Table 1). Either of two approaches was used: one-way analysis of variance (ANOVA), or gene specific analysis (GSA), which models the mean–variance relationship of count data using a lognormal distribution with shrinkage and tests differential expression using linear regression
| Gene symbol | Gene name | FC RNA sequencing | FDR-adjusted p (ANOVA)a | FDR-adjusted p (GSA)b | FC real-time PCR (FDR-adjusted p) |
|---|---|---|---|---|---|
| | ATP binding cassette subfamily C member 2 | −5.1 | 1.5E−12 | 9.7E−12 | |
| | ATP binding cassette subfamily G member 5 | −4.8 | 2.1E−11 | 9.0E−14 | |
| | Angiotensin I converting enzyme | −4.6 | 6.3E−09 | | −4.5 (1.1E−06) |
| | Alkylglycerol monooxygenase | −5.1 | 5.2E−10 | 2.2E−11 | |
| | Aldolase, fructose-bisphosphate B | −4.1 | 2.6E−14 | 3.7E−17 | |
| | Apolipoprotein A1 | −41 | 6.3E−09 | 3.0E−15 | |
| | Apolipoprotein A4 | −5.5 | 1.5E−08 | | |
| | Apolipoprotein B | −5.1 | 4.5E−12 | 1.8E−18 | −4.7 (4.7E−08) |
| | Apolipoprotein C2 | −5.2 | 1.8E−08 | 2.3E−14 | |
| | Apolipoprotein C3 | −9.9 | 3.6E−10 | 6.7E−14 | −5.8 (1.9E−06) |
| | Apolipoprotein H | −9.1 | 9.7E−08 | | |
| | Aquaporin 10 | −6.6 | 1.9E−09 | 8.3E−14 | |
| | | −12 | 2.0E−16 | 8.8E−20 | −6.1 (1.1E−07) |
| | Aspartate beta-hydroxylase domain containing 2 | 4.7 | 1.5E−08 | 2.9E−14 | |
| | Basic leucine zipper ATF-like transcription factor 2 | 4.6 | 5.0E−07 | 1.0E−13 | |
| | Calpain 13 | −4.6 | 2.6E−14 | 4.7E−18 | |
| | Calpain 8 | 5.3 | 3.1E−09 | | 5.2 (1.9E−06) |
| | CD36 molecule | −4.9 | 2.6E−14 | 1.2E−15 | −3.2 (3.7E−07) |
| | CD79a molecule | 4.2 | 8.6E−07 | | |
| | Carcinoembryonic antigen-related cell adhesion molecule 20 | −6.7 | 1.1E−09 | | |
| | Calsyntenin 2 | −4.8 | 3.3E−11 | | |
| | Collagen type VI alpha 5 chain | −4.5 | 7.2E−08 | | |
| | C-X-C motif chemokine ligand 9 | 5.5 | 6.8E−07 | 5.5E−10 | 3.6 (3.1E−06) |
| | C-X-C motif chemokine ligand 10 | 7.7 | 3.8E−11 | | 5.7 (1.8E−07) |
| | C-X-C motif chemokine ligand 11 | 32 | 2.9E−15 | | 22 (3.5E−08) |
| | C-X-C motif chemokine receptor 2 pseudogene 1 | 5.1 | 4.9E−08 | | |
| | Cytochrome P450 family 2 subfamily B member 7, pseudogene | −12 | 2.7E−09 | 5.8E−14 | |
| | Cytochrome P450 family 2 subfamily C member 9 | −5.7 | 9.6E−15 | 1.2E−17 | |
| | Cytochrome P450 family 3 subfamily A member 4 | −33 | 8.9E−13 | 1.2E−17 | |
| | DFNA5, deafness-associated tumor suppressor | −4.0 | 4.6E−11 | | |
| | Diacylglycerol | −10 | 1.3E−13 | | |
| | DIRAS family GTPase 2 | −7.3 | 9.0E−14 | 2.0E−12 | |
| | Glutamyl aminopeptidase | −5.1 | 1.9E−10 | | |
| | Ectonucleotide pyrophosphatase/phosphodiesterase 3 | −11 | 2.3E−10 | 2.1E−17 | |
| | Coagulation factor XIII B chain | −5.6 | 2.7E−07 | | |
| | Family with sequence similarity 184 member A | −5.5 | 5.4E−10 | 2.1E−10 | |
| | Fc fragment of IgG receptor IIIa | 5.4 | 9.7E−11 | | N/Ac |
| | Glucose-6-phosphatase catalytic subunit | −15 | 1.7E−09 | 5.6E−14 | |
| | Guanylate binding protein 5 | 4.9 | 6.2E−07 | 9.7E−12 | 4.0 (3.5E−08) |
| | Glutathione S-transferase alpha 2 | −5.6 | 1.7E−11 | 1.5E−09 | |
| | Hexokinase 2 | 7.5 | 6.7E−13 | | |
| | 3-Hydroxy-3-methylglutaryl-CoA synthase 2 | −9.1 | 4.7E−09 | 1.1E−08 | |
| | Interferon alpha inducible protein 27 | 4.6 | 2.4E−09 | | 3.2 (2.5E−06) |
| | Interferon gamma | 29 | 8.9E−08 | | 17 (3.5E−08) |
| | Interleukin 1 receptor antagonist | 4.6 | 3.9E−08 | | |
| | Interleukin 21 receptor | 4.9 | 3.6E−08 | | |
| | Lipocalin 2 | 8.1 | 7.1E−09 | | 12 (5.4E−06) |
| | Lactase | −20 | 1.7E−09 | 4.0E−12 | |
| | Uncharacterized LOC100507537 | −7.7 | 3.4E−08 | 7.9E−11 | |
| | Lipoprotein lipase | 100 | 8.5E−17 | | 107 (3.5E−08) |
| | Lecithin retinol acyltransferase | −9.6 | 4.7E−11 | 4.5E−16 | −6.4 (3.7E−07) |
| | Meprin A subunit beta | −4.3 | 9.0E−14 | 6.5E−15 | |
| | Membrane metalloendopeptidase | −4.6 | 2.6E−14 | 1.5E−15 | |
| | Matrix metallopeptidase 3 | 16 | 3.4E−09 | | 10 (3.1E−06) |
| | Matrix metallopeptidase 12 | 14 | 1.0E−06 | 1.0E−11 | 9.3 (7.7E−08) |
| | Membrane spanning 4-domains A10 | −11 | 7.9E−14 | 4.7E−11 | |
| | Neural EGFL like 2 | −6.1 | 4.8E−12 | 8.0E−18 | |
| | NLR family CARD domain containing 5 | 4.5 | 1.1E−07 | 2.4E−10 | |
| | Phosphoenolpyruvate carboxykinase 1 | −11 | 2.1E−10 | 1.4E−15 | −7.2 (4.7E−08) |
| | Proprotein convertase subtilisin/kexin type 9 | 4.9 | 1.5E−07 | | |
| | PITPNM family member 3 | 4.6 | 1.4E−07 | | |
| | Piwi like RNA-mediated gene silencing 2 | −4.1 | 1.2E−08 | 2.4E−10 | N/Ac |
| | Pyruvate kinase L/R | −4.4 | 1.7E−08 | 8.0E−10 | |
| | Paraoxonase 3 | −6.0 | 2.2E−07 | 3.4E−10 | |
| | Protein kinase, cGMP-dependent, type II | −9.8 | 1.8E−07 | 1.2E−15 | |
| | Regucalcin | −6.6 | 5.0E−11 | 7.0E−14 | |
| | S100 calcium binding protein A9 | 4.8 | 6.1E−07 | | 4.5 (1.1E−07) |
| | S100 calcium binding protein G | −5.1 | 2.4E−08 | | |
| | Sodium voltage-gated channel beta subunit 3 | −10 | 2.6E−11 | | |
| | Sucrase-isomaltase | −4.3 | 1.9E−09 | 5.5E−14 | |
| | Solute carrier family 2 member 2 | −4.0 | 2.1E−09 | 6.9E−12 | |
| | Solute carrier family 5 member 11 | −8.4 | 3.3E−10 | | |
| | Solute carrier family 6 member 4 | −4.8 | 1.9E−10 | 3.7E−10 | |
| | Solute carrier family 6 member 14 | 21 | 1.5E−09 | | 21 (3.5E−08) |
| | Solute carrier family 22 member 4 | −6.5 | 6.6E−10 | | |
| | Solute carrier family 23 member 1 | −8.8 | 3.6E−11 | 2.0E−12 | |
| | Solute carrier family 28 member 2 | −4.2 | 7.2E−07 | 4.4E−07 | |
| | Solute carrier family 46 member 1 | −4.5 | 5.4E−11 | 3.2E−11 | |
| | Sterol | −14 | 6.4E−10 | | −6.4 (3.7E−07) |
| | Serine peptidase inhibitor, Kazal type 4 | 4.5 | 2.4E−10 | | |
| | Sulfotransferase family 2A member 1 | −6.8 | 3.3E−09 | 4.6E−08 | |
| | Trefoil factor 1 | 11 | 7.6E−07 | | 6.1 (1.5E−06) |
| | Transmembrane 4 L six family member 4 | −5.7 | 5.7E−08 | 5.9E−10 | |
| | TNF receptor superfamily member 9 | 6.8 | 7.6E−13 | | 4.1 (3.5E−08) |
| | Trehalase | −5.4 | 2.3E−09 | 1.8E−11 | |
| | Transient receptor potential cation channel subfamily M member 6 | −8.0 | 1.9E−14 | 8.5E−17 | |
| | Tetratricopeptide repeat domain 36 | −5.8 | 5.1E−08 | | |
| | Ubiquitin D | 17 | 3.7E−12 | | 8.3 (5.3E−07) |
| | UDP glucuronosyltransferase family 1 member A3 | −16 | 2.7E−09 | | |
| | UDP glucuronosyltransferase family 1 member A4 | −15 | 3.3E−07 | | −5.3 (6.8E−06) |
| | UDP glucuronosyltransferase family 2 member B7 | −6.3 | 8.7E−10 | 2.2E−13 | |
| | unc-93 homolog A | −12 | 1.7E−12 | 2.9E−17 | |
| | Beta-ureidopropionase 1 | −35 | 9.3E−09 | | −33 (7.7E−08) |
| | Vanin 1 | −4.9 | 1.5E−12 | 3.0E−15 | −3.2 (4.7E−08) |
Fold changes (FC) were based on mean expression (M3 vs. M0), and the p values were adjusted for multiple testing using false discovery rate (FDR). Genes marked with an asterisk were selected as potential CD biomarkers and validated using real-time polymerase chain reaction (PCR). Marsh grade 3 (group Active CD, n = 26, Table 2) vs. Marsh grade 0 (group Not CD, n = 10, Table 2) FCs from real-time PCR follow-up analyses are included, together with FDR-adjusted p values from the Mann–Whitney U test of differential expressions between the two groups
aOne-way ANOVA using Partek Genomics Suite version 6.6 (Partek Incorporated, St. Louis, MO)
bGSA using Partek Flow version 5.0.16.0523 (Partek Incorporated)
cN/A = not available. Expression of PIWIL2 and FCGR3A was not detected in a majority of the study subjects using real-time PCR, thus these genes were excluded from further analyses based on real-time PCR data
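The fold changes and FDR-adjusted p values reported above can be illustrated with a minimal sketch. Two points are assumptions rather than statements from the source: that a negative fold change denotes the negative reciprocal of the M3/M0 ratio of means when expression is lower in M3, and that the FDR adjustment follows the Benjamini–Hochberg procedure. The function names and input values are hypothetical.

```python
from typing import List


def signed_fold_change(mean_case: float, mean_control: float) -> float:
    """Ratio of means; reported as a negative reciprocal when the case mean is lower.

    Assumed convention (not confirmed by the source): -5.0 means 5-fold lower in cases.
    """
    if mean_case <= 0 or mean_control <= 0:
        raise ValueError("mean expression values must be positive")
    ratio = mean_case / mean_control
    return ratio if ratio >= 1 else -1.0 / ratio


def benjamini_hochberg(p_values: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value to the smallest so that adjusted
    # values never increase as the raw p value decreases.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical mean expression values and raw p values.
    print(signed_fold_change(2.0, 10.0))
    print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

Adjusting after ranking keeps the output monotone in the raw p values, which is why the loop runs from the largest rank downward.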
Fig. 3 Coordinates of study subjects in a PCA based on the expression of 27 potential CD biomarkers (Table 4). Gene expressions of study subjects in groups Not CD and Active CD (Table 2) were used to construct the PCA and are represented in the PCA by colored markers. Study subjects 1–15 were projected onto this PCA and are represented by unique study subject numbers (Table 3)
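Constructing a PCA from reference groups and then projecting additional subjects onto it can be sketched as follows. This is an illustration only: it recovers just the first principal component by power iteration, the two-gene data are made up, and the source does not describe the software or scaling used for its PCA. The functions `fit_first_pc` and `project` are hypothetical names.

```python
import math
from typing import List, Tuple


def fit_first_pc(reference: List[List[float]], iterations: int = 200) -> Tuple[List[float], List[float]]:
    """Fit the first principal component on the reference subjects only."""
    n, d = len(reference), len(reference[0])
    means = [sum(row[j] for row in reference) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in reference]
    # Sample covariance matrix of the reference expression values.
    cov = [[sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(d)] for a in range(d)]
    # Power iteration; the start vector must not be orthogonal to the leading axis.
    start_norm = math.sqrt(sum((j + 1) ** 2 for j in range(d)))
    axis = [(j + 1) / start_norm for j in range(d)]
    for _ in range(iterations):
        w = [sum(cov[a][b] * axis[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        axis = [x / norm for x in w]
    return means, axis


def project(sample: List[float], means: List[float], axis: List[float]) -> float:
    """Score of a new subject on the fitted axis, centered with the reference means."""
    return sum((x - m) * a for x, m, a in zip(sample, means, axis))


if __name__ == "__main__":
    # Hypothetical expression values for two genes in three reference subjects.
    reference = [[0.0, 0.0], [2.0, 2.0], [4.0, 4.0]]
    means, axis = fit_first_pc(reference)
    print(project([3.0, 3.0], means, axis))
```

The projected subject is centered with the reference means rather than its own, so it does not influence the axes it is placed on.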
Fig. 4 Box plot visualizing the expression of the five potential CD biomarkers that showed higher expression in subjects with no or low-grade intestinal injury who were later diagnosed with CD (CD later, Table 3) than in the Not CD group (Table 2) on a logarithmic scale. The box and the square within the box represent the 25–75% interquartile range and the median, respectively. The whiskers represent the non-outlier ranges
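The quantities drawn in such a box plot can be computed with a short sketch. It assumes that outliers are values more than 1.5 interquartile ranges beyond the box, a common convention that the caption does not state, and that quartiles use the inclusive interpolation method. The function name and the input values are hypothetical.

```python
import statistics
from typing import Dict, List


def box_plot_summary(values: List[float], whisker: float = 1.5) -> Dict[str, float]:
    """Quartiles, median, and non-outlier range for one group of expression values."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - whisker * iqr
    high_fence = q3 + whisker * iqr
    # Whiskers extend to the most extreme observations inside the fences.
    inside = [v for v in values if low_fence <= v <= high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
    }


if __name__ == "__main__":
    # Hypothetical expression values with one extreme observation.
    print(box_plot_summary([1, 2, 3, 4, 5, 6, 7, 8, 100]))
```

For data shown on a logarithmic scale, the values would typically be log-transformed before calling the function; that step is left out here.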