Amin M Cheikhi1, Zariel I Johnson1, Dana R Julian1,2, Sarah Wheeler3, Carol Feghali-Bostwick4, Yvette P Conley1, James Lyons-Weiler5, Cecelia C Yates1,2,3.
Abstract
Fibrosis is a chronic disease with heterogeneous clinical presentation, rate of progression, and occurrence of comorbidities. Systemic sclerosis (scleroderma, SSc) is a rare rheumatic autoimmune disease that encompasses several aspects of fibrosis, including highly variable fibrotic manifestation and rate of progression. The development of effective treatments is limited by these variabilities. The fibrotic response is characterized by both chronic inflammation and extracellular remodeling. Therefore, there is a need for improved understanding of which inflammation-related genes contribute to the ongoing turnover of extracellular matrix that accompanies disease. We have developed a multi-tiered method using Naïve Bayes modeling that is capable of predicting level of disease and clinical assessment of patients based on expression of a curated 60-gene panel that profiles inflammation and extracellular matrix production in the fibrotic disease state. Our novel modeling design, incorporating global and parametric-based methods, was highly accurate in distinguishing between severity groups, highlighting the importance of these genes in disease. We refined this gene set to a 12-gene index that can accurately identify SSc patient disease state subsets and informs knowledge of the central regulatory pathways in disease progression.
Year: 2020 PMID: 33095822 PMCID: PMC7584227 DOI: 10.1371/journal.pone.0240986
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
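The abstract describes Naïve Bayes modeling that predicts disease level from gene expression. As a rough illustration of the general technique only (not the authors' multi-tiered method), the following is a minimal Gaussian Naïve Bayes sketch. The function names, the variance floor, and the toy expression values are all hypothetical and are not taken from the study.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(X, y, var_floor=1e-6):
    """Estimate a log prior and per-feature mean/variance for each class."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    model = {}
    for label, rows in by_class.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # var_floor avoids division by zero for constant features (assumption)
        variances = [
            sum((v - m) ** 2 for v in col) / n + var_floor
            for col, m in zip(columns, means)
        ]
        model[label] = (math.log(n / len(X)), means, variances)
    return model


def predict(model, row):
    """Return the class with the highest log posterior for one sample."""
    def log_posterior(params):
        log_prior, means, variances = params
        total = log_prior
        for x, m, v in zip(row, means, variances):
            # Features are treated as conditionally independent given the class
            total += -0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
        return total
    return max(model, key=lambda label: log_posterior(model[label]))


# Hypothetical toy data: each row is a sample, each column a made-up gene value
X = [[1.0, 5.0], [1.2, 4.8], [0.9, 5.1], [3.0, 2.0], [3.2, 2.1], [2.9, 1.8]]
y = ["lSSc", "lSSc", "lSSc", "dSSc", "dSSc", "dSSc"]
model = fit_gaussian_nb(X, y)
```

Calling `predict(model, [1.1, 5.0])` assigns a new sample to whichever class gives it the higher posterior under these independence assumptions.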
Fig 1. Qualitative and quantitative analysis of demographic and clinical characteristics of donor biopsies from microarray gene expression of patient skin biopsies.
Shown are bubble charts reflecting the magnitude of the skin score given the race and age of the donors as a function of (A) disease type and (B) biopsy origin, as well as (C) the distribution of donor age by disease type and (D) skin scores as a function of disease type, gender, and race (A = Asian, AA = African American, H = Hispanic, W = White). Disease type, biopsy origin, race, and sex are color-coded, and the size of each bubble indicates the magnitude of the skin score.
Fig 2. Conditional dependency between demographic and clinical characteristics of donor biopsies.
(A) A simple Bayesian network model encoding the conditional dependence of disease type classification (the target variable) on the other characteristics (the predictors), together with the relative predictor importance. The node focuses on Tree Augmented Naïve Bayes (TAN) and Markov Blanket networks that are primarily used for classification. (B) Linear projection using principal component analysis of disease type-labeled data, showing the two-dimensional skin score/age projection in which instances of different classes are best separated.
Fig 3. Qualitative and quantitative analysis contrasting disease types and related-gene expression pattern.
Shown are (A) a bubble chart reflecting expression levels of statistically significant genes according to their J5 score and ranking, differentiating healthy vs dSSc as opposed to lSSc vs dSSc, and (B) the distribution of J5 scores contrasting healthy vs dSSc as opposed to lSSc vs dSSc.
Pathways associated with differentially expressed genes between healthy and dSSc patient biopsy samples.
| Rank | Database Name | Pathway Name | Impact Factor | No. Genes in Pathway | No. Input Genes in Pathway | No. Pathway Genes on Chip | % Pathway Genes in Input | Corrected p-value | Sum (PF) | KEGG Pathway ID |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | KEGG | TGF-β signaling pathway | 9.104 | 87 | 1 | 71 | 1.149 | 0.112591573 | 6.919863899 | 1:04350 |
| 2 | KEGG | Wnt signaling pathway | 6.415 | 152 | 1 | 123 | 0.658 | 0.18712644 | 4.738727096 | 1:04310 |
| 3 | KEGG | ECM-receptor interaction | 4.477 | 84 | 1 | 72 | 1.19 | 0.114085731 | 2.306564066 | 1:04512 |
| 4 | KEGG | Primary immunodeficiency | 4.446 | 35 | 1 | 21 | 2.857 | 0.034674686 | 1.084276803 | 1:05340 |
| 5 | KEGG | Ribosome | 3.255 | 101 | 1 | 71 | 0.99 | 0.112591573 | 1.071276617 | 1:03010 |
| 6 | KEGG | Focal adhesion | 2.563 | 203 | 1 | 166 | 0.493 | 0.244127768 | 1.153324958 | 1:04510 |
No. Genes in Pathway: Number of genes annotated for pathway, No. Input Genes in Pathway: Number of genes in input list that occur in pathway, No. Pathway Genes on Chip: Number of genes annotated for pathway for which there are probes on microarray chip, % Pathway Genes in Input: Percentage of genes that are annotated for pathway and included in input set, Corrected p-value: FDR-corrected p-value, Sum (PF): Sum of absolute values of perturbation factors.
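The footnote definitions can be checked against the table with simple arithmetic. The sketch below assumes that "% Pathway Genes in Input" is the number of input genes in the pathway divided by the number of genes annotated for the pathway (not the number on the chip); that reading is an inference from the reported values, and the helper name `percent_in_input` is hypothetical.

```python
# (pathway name, genes annotated for pathway, input genes in pathway, reported %)
# Values copied from the healthy vs dSSc pathway table above.
rows = [
    ("TGF-β signaling pathway", 87, 1, 1.149),
    ("Wnt signaling pathway", 152, 1, 0.658),
    ("ECM-receptor interaction", 84, 1, 1.19),
    ("Primary immunodeficiency", 35, 1, 2.857),
    ("Ribosome", 101, 1, 0.99),
    ("Focal adhesion", 203, 1, 0.493),
]


def percent_in_input(n_input, n_pathway):
    """Percentage of annotated pathway genes that appear in the input list."""
    return round(100.0 * n_input / n_pathway, 3)


# True for each row where the recomputed percentage matches the reported one
matches = [
    abs(percent_in_input(n_input, n_pathway) - reported) < 1e-9
    for _, n_pathway, n_input, reported in rows
]
```

For example, one input gene out of 87 annotated TGF-β pathway genes gives 100 × 1/87 ≈ 1.149%.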
Pathways associated with differentially expressed genes between lSSc and dSSc patient biopsy samples.
| Rank | Database Name | Pathway Name | Impact Factor | No. Genes in Pathway | No. Input Genes in Pathway | No. Pathway Genes on Chip | % Pathway Genes in Input | Corrected p-value | Sum (PF) | KEGG Pathway ID |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | KEGG | PPAR signaling pathway | 11.982 | 70 | 4 | 52 | 5.714 | 1.67E-05 | 9.85E-01 | 1:03320 |
| 2 | KEGG | Axon guidance | 7.301 | 129 | 1 | 96 | 0.775 | 2.47E-01 | 5.90E+00 | 1:04360 |
| 3 | KEGG | MAPK signaling pathway | 4.294 | 272 | 1 | 217 | 0.368 | 4.75E-01 | 3.55E+00 | 1:04010 |
| 4 | KEGG | Primary immunodeficiency | 3.710 | 35 | 1 | 21 | 2.857 | 6.02E-02 | 8.99E-01 | 1:05340 |
| 5 | KEGG | Homologous recombination | 3.585 | 28 | 1 | 24 | 3.571 | 6.84E-02 | 9.03E-01 | 1:03440 |
| 6 | KEGG | Bladder cancer | 3.265 | 42 | 1 | 36 | 2.381 | 1.01E-01 | 9.72E-01 | 1:05219 |
| 7 | KEGG | Ribosome | 2.905 | 101 | 1 | 71 | 0.99 | 1.89E-01 | 1.24E+00 | 1:03010 |
| 8 | KEGG | TGF-β signaling pathway | 2.700 | 87 | 1 | 71 | 1.149 | 1.89E-01 | 1.04E+00 | 1:04350 |
| 9 | KEGG | Hematopoietic cell lineage | 2.660 | 87 | 1 | 67 | 1.149 | 1.80E-01 | 9.43E-01 | 1:04640 |
| 10 | KEGG | Alzheimer's disease | 2.061 | 178 | 1 | 135 | 0.562 | 3.30E-01 | 9.51E-01 | 1:05010 |
| 11 | KEGG | Cytokine-cytokine receptor interaction | 1.857 | 263 | 1 | 173 | 0.38 | 4.01E-01 | 9.43E-01 | 1:04060 |
| 12 | KEGG | Pathways in cancer | 1.581 | 330 | 1 | 264 | 0.303 | 5.44E-01 | 9.72E-01 | 1:05200 |
No. Genes in Pathway: Number of genes annotated for pathway, No. Input Genes in Pathway: Number of genes in input list that occur in pathway, No. Pathway Genes on Chip: Number of genes annotated for pathway for which there are probes on microarray chip, % Pathway Genes in Input: Percentage of genes that are annotated for pathway and included in input set, Corrected p-value: FDR-corrected p-value, Sum (PF): Sum of absolute values of perturbation factors.
Fig 4. Silhouette analysis of genes differentially expressed between healthy and dSSc patient biopsy samples.
(A) The silhouette analysis scores range from 1.0 to −1.0, and a larger value for the average silhouette (AS) over all samples to be analyzed indicates a higher degree of cluster separation. Silhouette coefficients near +1 indicate that the feature is far away from the neighboring clusters. A value of 0 indicates that the sample is on or very close to the decision boundary between two neighboring clusters, and negative values indicate that those samples might have been assigned to the wrong cluster. (B) Scatterplot contrasting the positive silhouette scores for healthy vs dSSc as opposed to lSSc vs dSSc.
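The interpretation of silhouette scores given in this caption can be made concrete with the standard definition s = (b − a) / max(a, b), where a is the mean distance from a point to the other members of its own cluster and b is the lowest mean distance to the members of any other cluster. The following is a minimal sketch of that textbook formula, not the software used in the study. The function name and the one-dimensional toy data are hypothetical, and the code assumes at least two clusters.

```python
import math


def silhouette_scores(points, labels):
    """Return the silhouette coefficient of each point (needs >= 2 clusters)."""
    def dist(p, q):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(p, q)))

    clusters = sorted(set(labels))
    scores = []
    for i, (p, own) in enumerate(zip(points, labels)):
        # a: mean distance to the other points in the same cluster
        same = [
            dist(p, q)
            for j, (q, lab) in enumerate(zip(points, labels))
            if lab == own and j != i
        ]
        if not same:
            scores.append(0.0)  # common convention for singleton clusters
            continue
        a = sum(same) / len(same)
        # b: smallest mean distance to the points of any other cluster
        b = min(
            sum(dist(p, q) for q, lab in zip(points, labels) if lab == other)
            / labels.count(other)
            for other in clusters
            if other != own
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return scores


# Hypothetical toy data: two well-separated clusters on a line
well_separated = silhouette_scores(
    [(0.0,), (1.0,), (10.0,), (11.0,)],
    ["healthy", "healthy", "dSSc", "dSSc"],
)
# Same idea, but the last point sits inside the other cluster
mislabeled = silhouette_scores(
    [(0.0,), (1.0,), (10.0,), (0.5,)],
    ["healthy", "healthy", "dSSc", "dSSc"],
)
```

In the first toy case every score is close to +1; in the second, the last point receives a negative score, matching the reading that negative values flag a possibly wrong cluster assignment.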
Fig 5. Enrichment analysis using PANTHER of genes differentially expressed between healthy and dSSc patient biopsy samples based on J5 analysis.
(A) Enrichment analysis using PANTHER of the collective set of genes with positive silhouette scores (Protein Analysis Through Evolutionary Relationships, http://pantherdb.org). (B, D) Enrichment analysis of the collective set of genes with positive silhouette scores using PANTHER, based on the skin-specific protein-protein interactions, derived from the DifferentialNet database. (C, E) Enrichment analysis of the collective set of genes with positive silhouette scores using PANTHER, based on the skin-specific gene co-expression interactions, derived from the TCSBN database.
Fig 6. Silhouette analysis of genes differentially expressed between dSSc and lSSc patient biopsy samples.
(A) The silhouette analysis scores range from 1.0 to −1.0, and a larger value for the average silhouette (AS) over all samples to be analyzed indicates a higher degree of cluster separation. Silhouette coefficients near +1 indicate that the feature is far away from the neighboring clusters. A value of 0 indicates that the sample is on or very close to the decision boundary between two neighboring clusters, and negative values indicate that those samples might have been assigned to the wrong cluster. (B) Scatterplot contrasting the positive silhouette scores for healthy vs dSSc as opposed to lSSc vs dSSc.
KEGG pathways used for selection of genes for predictive gene index (PDI).
All pathways are Homo sapiens.
| Pathway ID | Pathway Name |
|---|---|
| hsa04064 | NF-κB signaling pathway |
| hsa05321 | Inflammatory bowel disease (IBD) |
| hsa05323 | Rheumatoid arthritis |
| hsa04062 | Chemokine signaling pathway |
| hsa04668 | TNF signaling pathway |
| hsa04010 | MAPK signaling pathway |
| hsa04610 | Complement and coagulation cascades |
| hsa04066 | HIF-1 signaling pathway |
| hsa04510 | Focal adhesion |
| hsa04350 | TGF-β signaling pathway |
| hsa04512 | ECM-receptor interaction |
| hsa05205 | Proteoglycans in cancer |
60 genes chosen for predictive gene index (PDI).
| Gene Symbol | Gene Name | Associated Pathway |
|---|---|---|
| TNC | TNC Tenascin | [KO:K05692] Focal adhesion, [KO:K06236] ECM-receptor interaction |
| DCN | DCN Decorin | [KO:K05692] Proteoglycans in cancer, [KO:K16622] TGF-β signaling pathway |
| FN1 | FN1 Fibronectin 1 | [KO:K05692] Focal adhesion, [KO:K05692] Proteoglycans in cancer, [KO:K06236] ECM-receptor interaction |
| COL1A2 | COL1A2 Collagen type 1 alpha 2 | [KO:K05692] Focal adhesion, [KO:K06236] ECM-receptor interaction |
| TGFB1 | TGFB1 Transforming Growth Factor, Beta 1 | [KO:K04858] MAPK signaling pathway, [KO:K16622] TGF-β signaling pathway, [KO:K05692] Proteoglycans in cancer, [KO:K06752] Inflammatory bowel disease (IBD), [KO:K14624] Rheumatoid arthritis |
| CXCR3 | CXCR3 C-X-C Chemokine Receptor Type 3 | [KO:K05726] Chemokine signaling pathway |
| CXCR4 | CXCR4 C-X-C Chemokine Receptor Type 4 | [KO:K05726] Chemokine signaling pathway |
| A2M | A2M alpha-2-macroglobulin | [KO:K03910] Complement and coagulation cascades |
| ACTB | ACTB actin, beta | [KO:K05692] Focal adhesion, [KO:K05692] Proteoglycans in cancer |
| ATP6V1B2 | ATP6V1B2 ATPase, H+ transporting, lysosomal 56/58kDa, V1 subunit B2 | [KO:K02147] [EC:3.6.3.14] Rheumatoid arthritis |
| BCAR1 | BCAR1 breast cancer anti-estrogen resistance 1 | [KO:K05726] Chemokine signaling pathway, [KO:K05726] Focal adhesion |
| BCL3 | BCL3 B-cell CLL/lymphoma 3 | [KO:K09258] TNF signaling pathway |
| BMP8A | BMP8A bone morphogenetic protein 8a | [KO:K16622] TGF-β signaling pathway |
| CACNA2D1 | CACNA2D1 calcium channel, voltage-dependent, alpha 2/delta subunit 1 | [KO:K04858] MAPK signaling pathway |
| CACNG6 | CACNG6 calcium channel, voltage-dependent, gamma subunit 6 | [KO:K04871] MAPK signaling pathway |
| CAV2 | CAV2 caveolin 2 | [KO:K12958] Focal adhesion, [KO:K12958] Proteoglycans in cancer |
| CCL2 | CCL2 C-C motif chemokine ligand 2 | [KO:K14624] TNF signaling pathway, [KO:K14624] Rheumatoid arthritis, [KO:K14624] Chemokine signaling pathway |
| CCL4 | CCL4 C-C motif chemokine ligand 4 | [KO:K12964] NF-κB signaling pathway, [KO:K12964] Chemokine signaling pathway |
| CCR5 | CCR5 C-C motif chemokine receptor 5 (gene/pseudogene) | [KO:K04180] Chemokine signaling pathway |
| CD86 | CD86 CD86 molecule | [KO:K05413] Rheumatoid arthritis |
| COL1A2 | COL1A2 collagen, type I, alpha 2 | [KO:K06236] Focal adhesion, [KO:K06236] ECM-receptor interaction |
| COL6A2 | COL6A2 collagen, type VI, alpha 2 | [KO:K06238] Focal adhesion, [KO:K06238] ECM-receptor interaction |
| COL6A3 | COL6A3 collagen, type VI, alpha 3 | [KO:K06238] Focal adhesion, [KO:K06238] ECM-receptor interaction |
| CREB3L3 | CREB3L3 cAMP responsive element binding protein 3-like 3 | [KO:K09048] TNF signaling pathway |
| CXCL5 | CXCL5 chemokine (C-X-C motif) ligand 5 | [KO:K05506] Rheumatoid arthritis, [KO:K05506] Chemokine signaling pathway, [KO:K05506] TNF signaling pathway |
| DDX58 | DDX58 DEAD (Asp-Glu-Ala-Asp) box polypeptide 58 | [KO:K12646] [EC:3.6.3.14] NF-κB signaling pathway |
| EIF4B | EIF4B eukaryotic translation initiation factor 4B | [KO:K03258] Proteoglycans in cancer |
| F13A1 | F13A1 coagulation factor XIII, A1 polypeptide | [KO:K03917] [EC:2.3.2.13] Complement and coagulation cascades |
| F7 | F7 coagulation factor VII (serum prothrombin conversion accelerator) | [KO:K01320] [EC:3.4.21.21] Complement and coagulation cascades |
| FGF19 | FGF19 fibroblast growth factor 19 | [KO:K04358] MAPK signaling pathway, [KO:K04358] Proteoglycans in cancer |
| FGF5 | FGF5 fibroblast growth factor 5 | [KO:K04358] MAPK signaling pathway, [KO:K04358] Proteoglycans in cancer |
| HCLS1 | HCLS1 hematopoietic cell-specific Lyn substrate 1 | [KO:K06106] Proteoglycans in cancer |
| HLA-DMA | HLA-DMA major histocompatibility complex, class II, DM alpha | [KO:K06752] Inflammatory bowel disease (IBD), [KO:K06752] Rheumatoid arthritis |
| HLA-DOA | HLA-DOA major histocompatibility complex, class II, DO alpha | [KO:K06752] Inflammatory bowel disease (IBD), [KO:K06752] Rheumatoid arthritis |
| HLA-DPA1 | HLA-DPA1 major histocompatibility complex, class II, DP alpha 1 | [KO:K06752] Inflammatory bowel disease (IBD), [KO:K06752] Rheumatoid arthritis |
| HLA-DPB1 | HLA-DPB1 major histocompatibility complex, class II, DP beta 1 | [KO:K06752] Inflammatory bowel disease (IBD), [KO:K06752] Rheumatoid arthritis |
| HLA-DQA1 | HLA-DQA1 major histocompatibility complex, class II, DQ alpha 1 | [KO:K06752] Inflammatory bowel disease (IBD), [KO:K06752] Rheumatoid arthritis |
| HLA-DQA2 | HLA-DQA2 major histocompatibility complex, class II, DQ alpha 2 | [KO:K06752] Inflammatory bowel disease (IBD), [KO:K06752] Rheumatoid arthritis |
| HLA-DQB1 | HLA-DQB1 major histocompatibility complex, class II, DQ beta 1 | [KO:K06752] Inflammatory bowel disease (IBD), [KO:K06752] Rheumatoid arthritis |
| HLA-DRB5 | HLA-DRB5 major histocompatibility complex, class II, DR beta 5 | [KO:K06752] Inflammatory bowel disease (IBD), [KO:K06752] Rheumatoid arthritis |
| HRAS | HRAS Harvey rat sarcoma viral oncogene homolog | [KO:K02833] Chemokine signaling pathway, [KO:K02833] MAPK signaling pathway, [KO:K02833] Focal adhesion, [KO:K02833] Proteoglycans in cancer |
| IKBKG | IKBKG inhibitor of kappa light polypeptide gene enhancer in B-cells, kinase gamma | [KO:K07210] MAPK signaling pathway, [KO:K07210] NF-κB signaling pathway, [KO:K07210] Chemokine signaling pathway, [KO:K07210] TNF signaling pathway |
| IL15 | IL15 interleukin 15 | [KO:K05433] TNF signaling pathway, [KO:K05433] Rheumatoid arthritis |
| IL23A | IL23A interleukin 23, alpha subunit p19 | [KO:K05426] Inflammatory bowel disease (IBD), [KO:K05426] Rheumatoid arthritis |
| ITGAL | ITGAL integrin, alpha L (antigen CD11A (p180), lymphocyte function-associated antigen 1) | [KO:K05718] Rheumatoid arthritis |
| ITGB1 | ITGB1 integrin, beta 1 (fibronectin receptor, beta polypeptide, antigen CD29 includes MDF2, MSK12) | [KO:K05719] ECM-receptor interaction, [KO:K05719] Focal adhesion, [KO:K05719] Proteoglycans in cancer |
| ITGB2 | ITGB2 integrin, beta 2 (complement component 3 receptor 3 and 4 subunit) | [KO:K06464] Rheumatoid arthritis |
| LAMB1 | LAMB1 laminin, beta 1 | [KO:K05636] Focal adhesion, [KO:K05636] ECM-receptor interaction |
| LUM | LUM lumican | [KO:K08122] Proteoglycans in cancer |
| MSN | MSN moesin | [KO:K05763] Proteoglycans in cancer |
| PDGFC | PDGFC platelet derived growth factor C | [KO:K05450] Focal adhesion |
| PDGFRA | PDGFRA platelet-derived growth factor receptor, alpha polypeptide | [KO:K04363] [EC:2.7.10.1] MAPK signaling pathway, [KO:K04363] [EC:2.7.10.1] Focal adhesion |
| PLAUR | PLAU plasminogen activator, urokinase | [KO:K01348] [EC:3.4.21.73] Proteoglycans in cancer, [KO:K01348] [EC:3.4.21.73] NF-κB signaling pathway, [KO:K01348] [EC:3.4.21.73] Complement and coagulation cascades |
| RAC2 | RAC2 ras-related C3 botulinum toxin substrate 2 (rho family, small GTP binding protein Rac2) | [KO:K07860] Focal adhesion, [KO:K07860] Chemokine signaling pathway, [KO:K07860] MAPK signaling pathway |
| SMAD1 | SMAD1 SMAD family member 1 | [KO:K04676] TGF-β signaling pathway |
| SP1 | SP1 Sp1 transcription factor | [KO:K04684] TGF-β signaling pathway |
| STAT6 | STAT6 signal transducer and activator of transcription 6, interleukin-4 induced | [KO:K11225] Inflammatory bowel disease (IBD) |
| TGFBR2 | TGFBR2 transforming growth factor, beta receptor II (70/80kDa) | [KO:K04388] [EC:2.7.11.30] TGF-β signaling pathway, [KO:K04388] [EC:2.7.11.30] MAPK signaling pathway |
| TIMP1 | TIMP1 TIMP metallopeptidase inhibitor 1 | [KO:K16451] HIF-1 signaling pathway |
| VAV1 | VAV1 vav 1 guanine nucleotide exchange factor | [KO:K05730] Chemokine signaling pathway, [KO:K05730] Focal adhesion |
Genes from predictive gene index that were differentially expressed between healthy control and dSSc patient biopsy samples.
| J5 Rank | Gene ID | J5 Score |
|---|---|---|
| 1 | DCN | 3.552 |
| 2 | LUM | -2.729 |
| 3 | HLA-DQA1 | 2.198 |
| 4 | ITGAL | 2.067 |
| 5 | HLA-DQA2 | 1.907 |
| 6 | LAMB1 | -1.814 |
| 7 | CCL4 | -1.766 |
| 8 | COL6A2 | 1.738 |
| 9 | BCL3 | -1.725 |
| 10 | IKBKG | -1.723 |
| 11 | F13A1 | 1.635 |
| 12 | TIMP1 | -1.621 |
| 13 | PDGFRA | 1.599 |
| 14 | COL6A3 | -1.496 |
| 15 | VAV1 | -1.487 |
| 16 | DDX58 | -1.467 |
| 17 | HCLS1 | -1.447 |
| 18 | CACNG6 | -1.405 |
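The ranks in the table above appear consistent with ordering by descending absolute J5 score, with the sign retained. The short sketch below checks that reading against the tabulated values. It does not compute J5 scores, and what a positive or negative sign means for each group is not stated in the table, so the split by sign is reported as-is.

```python
# (reported rank, gene, J5 score), copied from the healthy vs dSSc table above
table = [
    (1, "DCN", 3.552), (2, "LUM", -2.729), (3, "HLA-DQA1", 2.198),
    (4, "ITGAL", 2.067), (5, "HLA-DQA2", 1.907), (6, "LAMB1", -1.814),
    (7, "CCL4", -1.766), (8, "COL6A2", 1.738), (9, "BCL3", -1.725),
    (10, "IKBKG", -1.723), (11, "F13A1", 1.635), (12, "TIMP1", -1.621),
    (13, "PDGFRA", 1.599), (14, "COL6A3", -1.496), (15, "VAV1", -1.487),
    (16, "DDX58", -1.467), (17, "HCLS1", -1.447), (18, "CACNG6", -1.405),
]

# Re-rank by descending absolute score and compare with the reported ranks
reranked = sorted(table, key=lambda row: abs(row[2]), reverse=True)
ranks_match = [row[0] for row in reranked] == list(range(1, len(table) + 1))

# Split genes by the sign of their score
positive = [gene for _, gene, score in table if score > 0]
negative = [gene for _, gene, score in table if score < 0]
```

Under this reading, 7 of the 18 listed genes carry positive scores and 11 carry negative scores.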
Genes from predictive gene index that were differentially expressed between lSSc and dSSc patient biopsy samples.
| J5 Rank | Gene ID | J5 Score |
|---|---|---|
| 1 | HLA-DQA1 | 3.11 |
| 2 | F13A1 | 3.014 |
| 3 | HLA-DRB5 | 2.812 |
| 4 | STAT6 | 2.679 |
| 5 | HLA-DQA2 | 2.296 |
| 6 | ITGAL | 2.236 |
| 7 | DCN | 1.94 |
| 8 | COL6A2 | 1.93 |
| 9 | ATP6V1B2 | 1.729 |
| 10 | BMP8A | 1.622 |
| 11 | IL23A | -1.572 |
| 12 | FGF5 | -1.561 |
| 13 | CACNG6 | -1.522 |
| 14 | CREB3L3 | -1.441 |
| 15 | HRAS | 1.426 |
| 16 | IKBKG | -1.397 |
| 17 | LUM | -1.372 |
| 18 | CACNA2D1 | 1.37 |
| 19 | IL15 | 1.345 |
| 20 | HLA-DQB1 | 1.336 |
| 21 | CCL4 | -1.307 |
| 22 | PDGFRA | 1.245 |
| 23 | HLA-DPB1 | 1.109 |
Fig 7. Enrichment analysis using PANTHER of genes differentially expressed between lSSc and dSSc patient biopsy samples based on J5 analysis.
(A) Enrichment analysis using PANTHER of the collective set of genes with positive silhouette scores (Protein Analysis Through Evolutionary Relationships, http://pantherdb.org). (B, D) Enrichment analysis of the collective set of genes with positive silhouette scores using PANTHER, based on the skin-specific protein-protein interactions, derived from the DifferentialNet database. (C, E) Enrichment analysis of the collective set of genes with positive silhouette scores using PANTHER, based on the skin-specific gene co-expression interactions, derived from the TCSBN database.
Fig 8. Gene expression grid showing expression of genes in 12-gene panel capable of predicting disease features.
Color of boxes indicates directionality of expression differences with red indicating high expression and green indicating low expression. Patient samples highlighted in red were all from dSSc patients and were higher severity (mean mRSS 35.6); samples highlighted in blue were all from lSSc patients and were lower severity (mean mRSS 7.73).