
Construction of disease-specific cytokine profiles by associating disease genes with immune responses.

Tianyun Liu1, Shiyin Wang2, Michael Wornow3, Russ B Altman4.   

Abstract

The pathogenesis of many inflammatory diseases is a coordinated process involving metabolic dysfunctions and immune responses, usually modulated by the production of cytokines and associated inflammatory molecules. In this work, we seek to understand how genes involved in pathogenesis, which are often not associated with the immune system in an obvious way, communicate with the immune system. We have embedded a network of human protein-protein interactions (PPI) from the STRING database with 14,707 human genes using feature learning that captures high confidence edges. We have found that our predicted Association Scores, derived from the features extracted from STRING's high confidence edges, are useful for predicting novel connections between genes, thus enabling the construction of a full map of predicted associations for all possible pairs between 14,707 human genes. In particular, we analyzed the pattern of associations for 126 cytokines and found that the six patterns of cytokine interaction with human genes are consistent with their functional classifications. To define the disease-specific roles of cytokines, we have collected gene sets for 11,944 diseases from DisGeNET. We used these gene sets to predict disease-specific gene associations with cytokines by calculating the normalized average Association Scores between disease-associated gene sets and the 126 cytokines; this creates a unique profile of inflammatory genes (both known and predicted) for each disease. We validated our predicted cytokine associations by comparing them to known associations for 171 diseases. The predicted cytokine profiles correlate (p-value < 0.0003) with the known ones in 95 diseases. We further characterized the profiles of each disease by calculating an "Inflammation Score" that summarizes different modes of immune responses. 
Finally, by analyzing subnetworks formed between disease-specific pathogenesis genes, hormones, receptors, and cytokines, we identified the key genes responsible for interactions between pathogenesis and inflammatory responses. These genes and the corresponding cytokines used by different immune disorders suggest unique targets for drug discovery.

Year:  2022        PMID: 35404985      PMCID: PMC9022887          DOI: 10.1371/journal.pcbi.1009497

Source DB:  PubMed          Journal:  PLoS Comput Biol        ISSN: 1553-734X            Impact factor:   4.475


Introduction

The pathogenesis of inflammatory diseases is a coordinated process involving metabolic dysfunctions, signaling, and innate immune response—modulated by the production of cytokines and associated inflammatory molecules [1,2,3]. The continued discovery of novel pathways and inflammatory mediators reveals the complexity and richness of autoimmune diseases [4,5], but the complete molecular decision network behind these processes and the coordination between cytokine signaling and underlying disease biology are not well understood [6,7,8]. Many models of autoimmune diseases posit a common cytokine framework with highly conserved mechanisms of inflammation [4,9,10]. Recent advances in genome-wide association studies (GWAS) provide evidence of considerable genetic overlap between autoimmune diseases, along with unique loci for individual diseases—and sometimes their subtypes [11,12,13]. The success of anti-TNF treatment in multiple inflammatory diseases suggests that there is a shared cytokine framework (at least for a subset of diseases) that defines conserved mechanisms of inflammation [14]. However, clinical trials testing the efficacy of cytokine inhibitors suggest a more complex set of interacting cytokine mechanisms that are associated with different disease phenotypes [15]. For example, the Jak-Stat-Socs signaling module can have either pro- or anti-inflammatory outcomes depending on the activation pattern of cytokine receptors—and different diseases show different patterns. If diseases do not have the same cytokine activity profile, then it is important to define which ones are key for each disease. Therefore, we ask the question: What is the degree to which cytokine responses are shared across diseases or specific to each disease? And how does heterogeneity of cytokine responses mediate different pathogenesis and inflammatory processes? 
Addressing the above questions requires a deeper understanding of the connections between cytokine signaling and the disease-specific genes implicated in pathogenesis of specific diseases and discovered through GWAS or transcriptional studies. Networks of interacting genes provide a useful representation of the functional associations between genes and gene modules [16,17]. Unfortunately, efforts to identify disease-associated genes do not always provide a clear link between pathogenesis and immune response. Publicly available immunology databases often limit their coverage to only characterizing properties of immune processes. ImmuneSigDB maps changes in the expression of sets of genes corresponding to immune response, thereby enabling a system analysis of data to improve the understanding of immune processes [18]. ImmProt [19] uses high-resolution mass spectrometry-based proteomics to characterize primary human immune cell types for ligand and receptor interactions, thereby connecting distinct immune functions. ImmuneXpresso [20] identifies relationships between cells, cytokines, and diseases via Natural Language Processing (NLP) of PubMed articles. Ideally, we would like to map known inflammatory response genes (e.g., cytokines and cytokine receptors) more completely to human gene networks to better identify potential mechanistic links between inflammatory response genes and pathogenesis genes. STRING is a comprehensive gene network that focuses chiefly on molecular pathways, with cancer heavily emphasized. Unfortunately, STRING does not provide fully elaborated links to immune response [21]. We have compared the network sparsity (the ratio of the number of links present in a graph to the total number of possible links that would be present in a complete graph) within and between known immune and disease-related functional modules in STRING (Fig A in S1 Text). 
We have found that the associations between different immune response genes (as identified by ImmProt) are under-represented in the STRING database. We hypothesize that either the connections are more difficult to study and characterize, or that the connections are less dense than those within metabolic or signaling modules because the human body requires more traffic for metabolic or signaling activities. Nevertheless, we aim to bridge this gap between pathogenesis and the immune processes that ultimately cause systemic damage. This will allow us to understand how immune responses are triggered in a disease-specific manner. We hypothesize that specific clinical phenotypes result from the interactions between disease-specific cytokines and disease-related genes (identified through genetics, transcriptomics, and analysis of metabolic dysfunctions), even though they may also share common cytokine elements and conserved mechanisms of inflammation. In this study, we identify these disease-specific cytokines and their associated disease-specific genes to provide insights into the underlying molecular mechanisms. These mechanisms may suggest new approaches to treatment and treatment combinations for specific clinical phenotypes.
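The network sparsity measure defined above (links present divided by links possible in a complete graph) can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function names and the toy gene lists are assumptions, and the between-module variant assumes the two modules are disjoint.

```python
def network_density(nodes, edges):
    """Links present divided by links possible in a complete undirected graph."""
    node_set = set(nodes)
    possible = len(node_set) * (len(node_set) - 1) // 2
    if possible == 0:
        return 0.0
    # Count each undirected edge once; ignore self-loops and unknown nodes.
    present = {
        frozenset((a, b))
        for a, b in edges
        if a != b and a in node_set and b in node_set
    }
    return len(present) / possible


def between_module_density(module_a, module_b, edges):
    """Cross-module links present divided by all possible cross-module links.

    Assumes module_a and module_b are disjoint (an assumption of this sketch).
    """
    a, b = set(module_a), set(module_b)
    possible = len(a) * len(b)
    if possible == 0:
        return 0.0
    present = {
        frozenset((u, v))
        for u, v in edges
        if (u in a and v in b) or (u in b and v in a)
    }
    return len(present) / possible


# Toy example (illustrative genes and edges, not data from the study).
toy_nodes = ["IL6", "TNF", "IL1B", "STAT3"]
toy_edges = [("IL6", "TNF"), ("TNF", "IL6"), ("IL6", "STAT3")]
print(network_density(toy_nodes, toy_edges))
```

A lower value within or between immune modules than within metabolic or signaling modules would correspond to the under-representation described above.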

Results

A complete map of associations between 14K human genes

We developed a novel network method to construct a complete interaction map between human genes by embedding a PPI network of 14,707 human genes using scalable network feature learning [22] that captures 728,090 high confidence edges in STRING [21] (Fig 1A and Table A in S1 Text). We calculated Association Scores for all 108,140,571 possible pairs between the 14,707 human genes. The distribution of the predicted Association Scores for all possible pairs is similar to that of the 9,250,034 known pairs in STRING (Figs B-D in S1 Text). STRING provides a confidence index for these 9,250,034 pairs, of which 8,521,944 have a confidence index below 800 (the cutoff for high confidence); these lower quality pairs were not used for embedding. Figs 2 and 3 show that the predicted Association Scores correlate with the level of confidence in STRING (scatter plot shown in Fig B in S1 Text).
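The pair counts quoted in this paragraph are mutually consistent, which a short arithmetic check makes explicit. The constant names below are introduced only for this illustration; the numbers are taken from the text.

```python
from math import comb

N_GENES = 14_707                 # genes in the embedded network
STRING_SCORED_PAIRS = 9_250_034  # pairs with a STRING confidence index
BELOW_800_PAIRS = 8_521_944      # pairs below the high confidence cutoff

# Every unordered pair of distinct genes.
all_pairs = comb(N_GENES, 2)

# Pairs at or above the cutoff, i.e. the edges used for embedding.
high_confidence_edges = STRING_SCORED_PAIRS - BELOW_800_PAIRS

# Pairs for which STRING has no confidence index at all.
pairs_without_string_index = all_pairs - STRING_SCORED_PAIRS

print(all_pairs)                   # 108140571
print(high_confidence_edges)       # 728090
print(pairs_without_string_index)  # 98890537
```

Roughly 99 million of the scored pairs therefore have no STRING confidence index, which is where any novel predictions would have to come from.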
Fig 1

Data flow for (A) predicting Association Scores and (B) analyzing disease-specific cytokine profiles. A. Network features of the high confidence STRING pairs were used to embed 14,707 human genes. The predicted associations between the 14,707 genes were validated using medium and low confidence STRING pairs. B. For each gene associated with a given disease, we calculated Association Scores with each of the 126 cytokines. The Association Scores were averaged and normalized to NAAS that represent the cytokine profile of the given disease. These profiles were further analyzed by (1) calculating Immune Scores and (2) analyzing subnetworks formed between pathogenesis and inflammation by employing network visualization, spectrum partition, and estimation of connection density.

Fig 2

The predicted Association Score between two genes measures the confidence of their associations.

We calculated Association Scores for all possible pairs (108,140,571 pairs) of 14,707 human genes. A total of 9,250,034 of these pairs have a known STRING confidence index. The confidence indexes of known STRING pairs are shown in the four boxplots below, grouped by their predicted Association Scores. As the predicted Association Score increases (left to right), the average STRING confidence index also increases (low to high).

Fig 3

The predicted Association Score between two genes measures the confidence of their associations.

The percentage of known STRING pairs above a certain confidence index cutoff. (Note: STRING confidence indexes are discrete scores).

The 9,250,034 pairs are grouped into four boxplots based on their predicted Association Scores (Fig 2). As the predicted Association Score increases, the average STRING confidence index also increases. Pairs with Association Scores below 0.6 have a low average STRING confidence of ~200. On the other hand, ~75% of pairs with Association Scores above 0.8 have STRING confidence indexes above 800 (Fig 3). Thus, the predicted Association Scores between genes in our embedded network appear to be good predictors of protein-protein interactions, which enables construction of a complete and reliable network between all 14,707 human genes in STRING.
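The comparison behind Figs 2 and 3 amounts to binning known pairs by predicted Association Score and summarizing their STRING confidence per bin. The sketch below shows that idea on made-up data. The bin edges are hypothetical (the text only mentions scores below 0.6 and above 0.8), and treating a confidence index of 800 or more as high confidence is an assumption of this sketch.

```python
from statistics import mean

# Hypothetical bin edges giving four groups: <0.6, 0.6-0.7, 0.7-0.8, >=0.8.
BIN_EDGES = (0.6, 0.7, 0.8)


def bin_index(score, edges=BIN_EDGES):
    """Return the index of the score bin (0 = lowest scores)."""
    return sum(score >= edge for edge in edges)


def summarize(pairs, cutoff=800):
    """Summarize STRING confidence per Association Score bin.

    pairs: iterable of (association_score, string_confidence_index).
    """
    groups = {}
    for score, confidence in pairs:
        groups.setdefault(bin_index(score), []).append(confidence)
    return {
        b: {
            "n": len(values),
            "mean_confidence": mean(values),
            "fraction_at_or_above_cutoff": sum(v >= cutoff for v in values) / len(values),
        }
        for b, values in sorted(groups.items())
    }


# Made-up pairs for illustration only.
toy_pairs = [
    (0.35, 150), (0.55, 250),
    (0.65, 400), (0.75, 600),
    (0.85, 900), (0.92, 700), (0.88, 950), (0.81, 820),
]
for b, stats in summarize(toy_pairs).items():
    print(b, stats)
```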

Essential cytokines can be classified into six clusters based on their interaction profiles

We identified 126 well-studied cytokines based on ImmuneXpresso [20] in our embedded network. We call these 126 cytokines “essential cytokines” because they are linked to at least one disease in the literature. The term “cytokine” encompasses inflammatory regulators, including interferons, the interleukins, the chemokine family, mesenchymal growth factors, the tumor necrosis factors, and other non-classified ones [23]. Each essential cytokine can be described by its location in the embedded network space, and its Association Scores with each of the 14,581 non-cytokine human genes in our network may suggest known or novel interactions with other human genes (Fig 1B). These Association Scores group the 126 cytokines into six broadly distinct clusters, which we have named after their most enriched cytokine/chemokine types: TGF-CLU (growth factors), Chemokine-CLU (chemokines), TNF-CLU (TNFs), IFN-CLU (interferons), IL-CLU (interleukins), and Unclassified-CLU (Fig 4). Unfortunately, it remains difficult to quantify the enrichment and purity of these clusters because of the pleiotropy and redundancy in the cytokine family (each cytokine has many overlapping functions, and each function is mediated by multiple cytokines). Six sets of close interactors are suggested by the dendrogram of the hierarchical cluster tree (named SIG1-SIG6 in Table 1) and allow us to capture the function of each cluster with a gene signature. For example, Chemokine-CLU is associated with genes involved in the G-protein signaling pathway as well as cellular responses to endogenous and environmental insults, while TGF-CLU is associated with genes that mediate blood coagulation and plasminogen activating cascades, which are often associated with innate immunity in infectious and neuroinflammatory diseases (Tables 2 and 3). 
We also identified two sets of genes (BLU1, BLU2) that are distant from the three major clusters (Chemokine-CLU, IFN-CLU, IL-CLU) (Table 4). Both BLU1 and BLU2 have biological functions in the nucleus. The full gene lists for each of the six clusters are in Table B in S1 Text.
Fig 4

The 126 cytokines form six clusters based on their Association Scores with the 14,581 non-cytokine genes.

The six clusters are named after their most enriched types of cytokines: TGF-CLU (growth factors), Chemokine-CLU (chemokines), TNF-CLU (TNFs), IFN-CLU (interferons), IL-CLU (interleukins), and Unclassified-CLU. Based on the dendrogram of the hierarchical cluster tree, we identified six gene sets (SIG1-SIG6) that associate with the six individual clusters and two gene sets (BLU1, BLU2) that do not interact with the three major clusters (Chemokine-CLU, IFN-CLU, IL-CLU). Details of cytokines and signatures are in Tables 1–4.

Table 1

The associations between the six gene sets (SIG1-SIG6) and specific clusters (CLU).

Clusters: TGF-CLU, Chemokine-CLU, TNF-CLU, IFN-CLU, IL-CLU, Unclassified-CLU
SIG1: Yes
SIG2: Yes, Yes
SIG3: Yes
SIG4: Yes
SIG5: Yes
SIG6: Yes, Yes
Table 2

Specific cytokines in each of the six clusters.

Cluster | Cytokines
TGF-CLU | TGFA RETN EGF FGF1 FGF2 PDGFA PDGFB HGF SPP1 CSF1 TGFB2 TGFB3 TGFB1
Chemokine-CLU | XCL1 CCL18 CCL8 CCL24 CXCL3 CX3CL1 CCL13 CCL27 CCL1 CXCL6 CCL21 CXCL16 CXCL11 CXCL5 CCL25 CCL16 CCL19 CXCL13 CCL20 CXCL9 CXCL2 CXCL1 PPBP PF4 CCL4 CXCL12 CCL11 CCL5 CXCL10 CCL17 CCL3 CCL2 CCL7 CCL26 CXCL8
TNF-CLU | TNFSF13 TNFSF11 TNFSF14 LTA TNFSF13B KITLG
IFN-CLU | CCL15 CCL22 IFNL3 EBI3 IL20 IL19 IL31 IFNL2 IL24 IL21 IL17F IL23A IL12A IL27 IFNL1 IL22 IL11 CLCF1 OSM LIF THPO IFNA1 IFNK IL26 IFNA6 IFNA2 IFNB1
IL-CLU | IL6 CD70 CD40LG TNF TNFSF4 IL33 IL1RN IL18 IL1A IL1B EPO IL15 IL9 IL7 IL3 IL5 IL13 IL17A CSF3 IL10 IL4 CSF2 IL2 IL12B IFNG
Unclassified-CLU | IL32 MIF TNFSF15 TNFSF9 CKLF IL16 TNFSF8 IL34 GDF15 IL1F10 IL36RN IL36G IL17C IL17B IL25 PF4V1 CCL23 CXCL14 LECT1 LECT2
Table 3

The functional annotations of the six gene sets (SIG1-SIG6).

Gene set | GO/PANTHER Pathways
SIG1 | G-protein signaling pathways, inflammation mediated chemokine pathways, GABA-B receptor signaling, 5HT1 receptor signaling, Enkephalin release, Endothelin signaling, opioid pathways, dopamine pathways, Nicotine pharmacodynamics, Metabotropic glutamate pathways
SIG2 | Interleukin signaling, JAK/STAT signaling, Toll receptor signaling, Interferon-gamma signaling, inflammation mediated by cytokine signaling pathways
SIG3 | Ubiquitin proteasome pathways, Parkinson disease, Hedgehog signaling, WNT signaling, FGF signaling, Angiogenesis, hypoxia response via HIF activation, p53 pathways by glucose deprivation
SIG4 | Blood coagulation, Plasminogen activating cascade
SIG5 | G-protein signaling pathways, 5HT2 receptor signaling, Histamine H1 receptor signaling, Thyrotropin releasing hormone receptor signaling, Oxytocin receptor signaling, Muscarinic acetylcholine receptor signaling, Corticotrophin releasing factor receptor signaling, Angiotensin II stimulate signaling, Endogenous cannabinoid signaling
SIG6 | Apoptosis signaling, Toll receptor signaling, FAS signaling, p38 MAPK pathway, Gonadotropin-releasing hormone receptor pathways, B-cell activation, T-cell activation, Insulin/IGF pathway (protein kinase B cascade, MAPKK/MAPK cascade), axon guidance mediated by netrin, Angiogenesis, EGF signaling, PDGF signaling, VEGF signaling
Table 4

The two gene sets (BLU1, BLU2) that do not interact with the three major clusters Chemokine-CLU, IFN-CLU, and IL-CLU.

Gene set | GO/PANTHER Pathways
BLU1 | Ribosome biogenesis, rRNA metabolic process, gene expression, translation
BLU2 | DNA repair, DNA metabolic process, telomere maintenance, cellular macromolecule metabolic process, response to stress

The predicted cytokine profiles of 171 diseases are validated using literature

We collected gene sets for 11,944 diseases from DisGeNET [24]. Many (5,048) of these diseases are linked with fewer than ten genes, while a few are associated with up to 2,000 genes (Figs D-F in S1 Text). For each disease, we estimated the normalized average Association Score (NAAS) between each cytokine and the disease by averaging the Association Scores between the cytokine and the genes in the disease's gene set and then normalizing (Fig 1B). This results in a 126-dimensional, disease-specific cytokine profile for each disease. From the overall set of 11,944 diseases, we identified 171 well-studied diseases whose co-occurrence frequencies with cytokines (for 79 of our 126 essential cytokines) had been evaluated by ImmuneXpresso through literature sampling. The predicted profiles correlate significantly with the literature sampling frequency (the p-value cutoff after multiple testing correction is 0.0003) for 95 of the 171 diseases, suggesting reasonable reliability of the predicted cytokine profiles (Fig 5 and Table C in S1 Text). Note that as the number of disease-associated genes increases, the accuracy of the predicted cytokine profiles decreases (Fig G in S1 Text). Fig 6 shows an example of the predicted cytokine profile for aneurysm, a disease that is not typically considered an immune disorder. The NAAS between aneurysm and each of the 79 cytokines whose literature sampling frequencies in diseases are known in ImmuneXpresso are plotted, with known associations (frequency cutoff of 0.005) marked as solid blue squares. At a high cutoff (NAAS >0.8), the recall rate is 21/24. The novel predictions are: HGF, IL11, IL12B, IL13, IL15, IL17F, IL22, IL33, IL5, IL7, IL9, LIF, OSM, PDGFA, PDGFB, PPBP. A scatter plot showing the correlation between the literature co-occurrence frequency and the predicted NAAS in aneurysm is in Fig H in S1 Text.
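A disease's cytokine profile, as described above, is built by averaging Association Scores over the disease's gene set for each cytokine and then normalizing. The sketch below illustrates that two-step shape on toy data. The normalization used here (min-max scaling across cytokines within one disease) is an assumption, since this section does not state the exact normalization; the function names, gene labels, and scores are likewise hypothetical.

```python
def average_association(disease_genes, cytokines, score):
    """Average Association Score between each cytokine and a disease gene set.

    score maps frozenset({gene, cytokine}) to an Association Score.
    """
    profile = {}
    for cytokine in cytokines:
        # Skip the cytokine itself if it is also listed as a disease gene.
        values = [
            score[frozenset((gene, cytokine))]
            for gene in disease_genes
            if gene != cytokine
        ]
        if values:
            profile[cytokine] = sum(values) / len(values)
    return profile


def min_max_normalize(profile):
    """Rescale averaged scores to [0, 1] (assumed normalization, see lead-in)."""
    low, high = min(profile.values()), max(profile.values())
    if high == low:
        return {name: 0.0 for name in profile}
    return {name: (value - low) / (high - low) for name, value in profile.items()}


# Toy data: two disease genes (G1, G2) and three cytokines.
toy_scores = {
    frozenset(("G1", "IL6")): 0.9, frozenset(("G2", "IL6")): 0.7,
    frozenset(("G1", "TNF")): 0.6, frozenset(("G2", "TNF")): 0.4,
    frozenset(("G1", "CCL2")): 0.3, frozenset(("G2", "CCL2")): 0.1,
}
toy_average = average_association(["G1", "G2"], ["IL6", "TNF", "CCL2"], toy_scores)
toy_naas = min_max_normalize(toy_average)
print(toy_naas)
```

With 126 cytokines instead of three, the resulting dictionary would be the 126-dimensional profile referred to in the text.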
Fig 5

Predicted cytokine profiles for 171 well-studied diseases correlate with cytokine sampling in literature.

The Spearman correlation coefficients between each disease’s NAAS and known literature sampling frequency are plotted against the P-values. Of the 171 diseases, we were able to predict the profiles for 95 diseases with p-value<0.0003 (corrected cutoff by multiple testing), suggesting the accuracy of the predicted profiles.

Fig 6

The NAAS between aneurysm and each of the 79 cytokines for which the literature sampling frequency in disease is known in ImmuneXpresso.

Known associations (frequency cutoff of 0.005 in ImmuneXpresso) are marked in solid blue squares.

Defining the key modes of cytokine response

We have shown that our predicted disease profiles for 79 cytokines align with known cytokine associations, strengthening the validity of our predicted profiles for all 126 cytokines, which include novel predictions for 47 cytokines. There are many ways to classify diseases; we aimed to assess patterns in cytokine response for different diseases, reasoning that a shared cytokine response might indicate the potential for shared treatment strategies. We analyzed the predicted disease profiles of 126 cytokines in the 171 well-studied diseases. The 171 diseases include 23 immune disorders (C20), 48 infections (C01), seventeen cardiovascular diseases (C14), thirteen metabolic disorders (C18), and 55 neoplasms (C04). Note that one disease may belong to more than one disease class (Table D in S1 Text). These diseases separate into three primary clusters based on their interactions with cytokines (Fig 7). As shown in Fig 7, immune disorders are enriched in the cluster labeled “cluster-1”. Infections are split into two clusters (“cluster-1” and “cluster-2”). Note that of the twenty neoplasms in cluster-1, nineteen are hematologic or lymphatic diseases (C15/C04), suggesting that these neoplasms have distinct cytokine distributions from other neoplasms. Most metabolic diseases (11/13) and cardiovascular disorders (11/17) are enriched in cluster-3 (blue dendrogram in Fig 7), suggesting that cytokine responses are shared across different disease classes. The three clusters of cytokine responses that form across these five classes of disease suggest that there are common cytokine frameworks shared across disease classes, even though there is also significant heterogeneity of cytokine response within each disease class.
Fig 7

Cytokine features for the 171 well-studied diseases.

The 171 diseases formed three clusters based on their NAAS with different types of cytokines. Immune disorders are enriched in cluster-1. Infections are split into two clusters (cluster-1 and cluster-2). Note that of the twenty neoplasms in cluster-1, nineteen are hematic and lymphatic diseases (C15/C04). Most metabolic diseases (11/13) and cardiovascular disorders (11/17) are enriched in cluster-3. Note that diseases of other classes are not counted in the labels. Cluster details are in Table D in S1 Text.

The disease cytokine profiles show that inflammatory cytokines, including interleukins, interferons, and TNFs, are grouped together (Fig 7), while chemokines form two groups based on their associations with diseases. These groups of cytokines drive the clustering of diseases across different classes. Therefore, we analyzed the influence on diseases from inflammatory components, chemokines, growth factors, and other cytokines. We defined Immune Scores to capture the contributions from four categories of cytokines to the inflammation process of a given pathogenesis (Fig 8). In general, immune disorders and infections show higher values for all four cytokine-type components, with cardiovascular diseases presenting intermediate values, and metabolic diseases and, especially, neoplasms having the lowest average Immune Scores for all four components. The chemokine scores of immune disorders are spread over a wide range. Infections have the highest scores for growth factors. Cardiovascular diseases have higher scores than metabolic diseases in all four categories, while neoplasms show the lowest scores in all the categories. In summary, a disease can be represented by a 126-dimensional cytokine profile or by a 4-dimensional summary of Immune Scores, both of which suggest that the inflammatory responses in different diseases are mediated by different distributions of cytokines.
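The reduction from a 126-dimensional profile to four Immune Scores can be pictured as a per-category average of NAAS values. The following is a minimal sketch under that assumption; the category names, the toy membership lists, and the NAAS values are illustrative and are not the study's actual assignments. Only the category sizes (47, 37, 13, 29) come from the Fig 8 caption.

```python
from statistics import mean

# Category sizes from the Fig 8 caption; they sum to the 126 essential cytokines.
CATEGORY_SIZES = {"inflammation": 47, "chemokines": 37, "growth_factors": 13, "other": 29}


def immune_scores(naas_profile, categories):
    """Average a disease's NAAS values within each cytokine category."""
    return {
        name: mean(naas_profile[cytokine] for cytokine in members)
        for name, members in categories.items()
    }


def class_immune_scores(profiles, categories):
    """Average the per-disease Immune Scores over all diseases in one class."""
    per_disease = [immune_scores(profile, categories) for profile in profiles]
    return {name: mean(scores[name] for scores in per_disease) for name in categories}


# Toy example with two cytokines per category (hypothetical membership and values).
toy_categories = {
    "inflammation": ["IL6", "TNF"],
    "chemokines": ["CCL2", "CXCL8"],
    "growth_factors": ["EGF", "HGF"],
    "other": ["MIF", "IL16"],
}
toy_profile = {
    "IL6": 1.0, "TNF": 0.5, "CCL2": 0.25, "CXCL8": 0.75,
    "EGF": 0.5, "HGF": 0.0, "MIF": 0.25, "IL16": 0.25,
}
print(immune_scores(toy_profile, toy_categories))
```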
Fig 8

Immune Scores of five disease classes (23 immune disorders, 48 infections, seventeen cardiovascular, thirteen metabolic, 55 neoplasms).

For each class, the average NAAS between its diseases and the cytokines within four categories are plotted: 47 inflammation related cytokines, 37 chemokines, thirteen growth factors, and 29 other cytokines. The chemokine scores for immune disorders are spread in a wide range. Growth factors have the highest scores in infections. Cardiovascular diseases have higher scores than metabolic diseases over the three groups of cytokines. Neoplasms show the lowest scores for all four categories.

Inflammatory response subnetworks provide disease-specific insights

To gain a more comprehensive understanding of inflammatory components, we identified the subnetworks formed by the predicted cytokines and pathogenesis genes of a given disease. We inspected the subnetworks of five diseases representing immune disorders, infections, cardiovascular diseases, metabolic disorders, and neoplasms (systemic lupus erythematosus (SLE), tuberculosis (TB), aneurysm, metabolic syndrome X, and acute leukemia) and found that they visually show different network patterns. SLE displays two heavily connected subnetworks centered on inflammatory cytokines and chemokines, while the network of TB shows that chemokines are distant from pathogenesis genes (Fig I in S1 Text). We quantified these network features by counting the number of high confidence associations between disease-associated genes (taken from DisGeNET) and cytokines in the five example diseases (Table 5). For these five diseases, 8–14% of the known disease-associated genes in DisGeNET are predicted to interact with at least one of the 126 cytokines, suggesting that the connections between pathogenesis and immune response are less dense than the connections between metabolic or signaling modules.
Table 5

Analysis of subnetworks formed by high confidence associations between the known disease associated genes and the predicted cytokines of five diseases.

The known DisGeNET genes (column #2) of a given disease often contain cytokine receptors. The number of cytokine receptors and other disease genes captured by high-confidence associations (column #3) is listed in column #4 and column #5, respectively. The number of predicted essential cytokines that interact with receptors and disease genes from DisGeNET is listed in column #6. The Immune Connection Density (ICD) estimated on the subnetworks formed by receptors, disease genes, and essential cytokines is shown in column #7.

Disease | All DisGeNET genes | High confidence associations | Interacting DisGeNET genes: cytokine receptors | Interacting DisGeNET genes: disease genes | Predicted cytokines | ICD
SLE | 793 | 1053 | 38 (5%) | 96 (12%) | 102 | 0.067
TB | 135 | 213 | 9 (7%) | 12 (9%) | 70 | 0.123
Aneurysm | 136 | 70 | 3 (2%) | 19 (14%) | 55 | 0.049
Metabolic Syndrome X | 461 | 572 | 7 (2%) | 66 (14%) | 82 | 0.052
Acute Leukemia | 425 | 307 | 11 (3%) | 35 (8%) | 90 | 0.062
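The percentages in Table 5 follow from dividing the interacting gene counts by the total number of DisGeNET genes for each disease. The short check below recomputes them. It reads each row as total genes, interacting cytokine receptors, and interacting disease genes, which is the reading under which the printed percentages are reproduced; the variable names are introduced only for this illustration.

```python
# (all DisGeNET genes, interacting cytokine receptors, interacting disease genes)
TABLE5_COUNTS = {
    "SLE": (793, 38, 96),
    "TB": (135, 9, 12),
    "Aneurysm": (136, 3, 19),
    "Metabolic Syndrome X": (461, 7, 66),
    "Acute Leukemia": (425, 11, 35),
}


def percent(part, total):
    """Share of the total, rounded to a whole percent."""
    return round(100 * part / total)


table5_percentages = {
    disease: (percent(receptors, total), percent(disease_genes, total))
    for disease, (total, receptors, disease_genes) in TABLE5_COUNTS.items()
}

for disease, (receptor_pct, gene_pct) in table5_percentages.items():
    print(f"{disease}: receptors {receptor_pct}%, disease genes {gene_pct}%")
```

The disease-gene shares span 8% to 14%, matching the range quoted in the text.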

The genes involved in the pathogenesis of a given disease are often cytokine receptors, and these are critical clues for how the pathogenesis genes may communicate with immune modules. The number of cytokine receptors that interact with cytokines can serve as a measure of the density of inflammatory responses. For example, cytokine receptors with known disease associations in DisGeNET are more heavily involved in the process from pathogenesis to inflammatory responses in SLE and TB, but not in aneurysm, metabolic syndrome X, or acute leukemia. To capture the density of connections between pathogenesis genes and immune response, we compute an “Immune Connection Density (ICD)”. We adapted the original equation of network efficiency [25] by counting the edges between the two components within the predicted subnetworks: genes for pathogenesis and genes for inflammation (see Methods). The interactions between cytokines, or between pathogenesis genes, are not included in this calculation. The ICD thus captures the associations between the two functions, pathogenesis and inflammation processes. Compared to SLE, aneurysm, metabolic syndrome X, and acute leukemia, TB shows the highest ICD (Table 5), suggesting that it triggers the most coherent reactions of the immune system upon infection. 
Analyzing the subnetworks formed by the predicted cytokines and pathogenesis genes also supported the reliability of the predicted disease profiles of 126 cytokines (See Result section “Cytokine features of diseases”). Some of the known genes from DisGeNET are cytokines. When compared with all the cytokines known in DisGeNET, the recall rate (Table 6) of cytokines recognized by the high confidence interactions ranges from 50% to 88%. When using a lower confidence cutoff (0.7), 36 more essential cytokines are predicted to be associated with aneurysm, and 11/14 (78%) of the known essential cytokines are recognized by our network methods.
Table 6

The recall rate (column #4) of these cytokines being recognized by the predicted subnetworks formed by high-confidence associations ranges from 50% to 88%.

| Disease | Known cytokines (DisGeNET) | Cytokines recognized | Recall rate |
|---|---|---|---|
| SLE | 50 | 44 | 88% |
| TB | 23 | 18 | 78% |
| Aneurysm | 14 | 7 | 50% |
| Metabolic Syndrome X | 17 | 13 | 76% |
| Acute Leukemia | 17 | 14 | 82% |
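The recall rates in Table 6 follow directly from the two count columns. The short sketch below recomputes them as recognized / known, rounded to the nearest percent; the dictionary layout and function name are illustrative choices, not part of the original analysis.

```python
# Counts copied from Table 6: (known cytokines in DisGeNET, cytokines recognized).
table6 = {
    "SLE": (50, 44),
    "TB": (23, 18),
    "Aneurysm": (14, 7),
    "Metabolic Syndrome X": (17, 13),
    "Acute Leukemia": (17, 14),
}


def recall_percent(known, recognized):
    """Recall = recognized / known, expressed as a rounded percentage."""
    return round(100 * recognized / known)


recall = {disease: recall_percent(k, r) for disease, (k, r) in table6.items()}
for disease, pct in recall.items():
    print(f"{disease}: {pct}%")
```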

Pathogenesis genes for immune disorders connect to key inflammatory genes

We observe different predicted cytokine associations across the five immune disorders in our dataset (Fig 9). The five diseases—rheumatoid arthritis (RA), psoriasis (PS), ulcerative colitis (UC), Crohn’s disease (CD) and SLE—exhibit similar associations with the core inflammatory cytokines, but differ in their associations with chemokines, TNFs, and growth factors. For example, IL23A is highly involved (i.e., has a high NAAS) in all five diseases, while CCL18 and CCL7 are predicted to only associate with PS.
Fig 9

Disease-specific cytokine profiles of five immune disorders.

The Y-axis shows the Probability of Association between each cytokine and the five immune disorders: rheumatoid arthritis (RA), psoriasis (PS), ulcerative colitis (UC), Crohn’s disease (CD) and systemic lupus erythematosus (SLE). A conserved association pattern is observed in inflammation-related cytokines, while differential patterns are observed in other types of cytokines.

To understand how pathogenesis genes associate with different types of cytokines, we used a force-directed graph to visualize the interactions in the process from pathogenesis to inflammatory responses under different disease contexts. Stronger associations are drawn with shorter “springs” to convey a qualitative understanding of the subnetwork structure [26]. The visualization in Fig 10 shows that the individual pathogenesis genes of SLE are closely linked to their embedded network neighbors; shorter edges correspond to closer network neighbors and thus allow us to assess which pathogenesis genes are most likely associated with inflammation. Different sets (Box-C and Box-I-1) of pathogenesis genes form associations with chemokines and proinflammatory cytokines, respectively. Six genes (ACKR3, HRH4, HTR1, GAL, GRM3, S1PR1) are connected with a set of densely connected chemokines (23 cytokines marked in the red box), with ANXA1 interacting with sixteen chemokines. Seven pathogenesis genes (Box-I-1) interact with a group of 28 proinflammatory cytokines (orange box) directly or through receptors (green nodes). Next to this group of proinflammatory cytokines, a group of nine disease-associated genes (Box-I-3: S100A8, NLRP3, MYD88, IRAK1, IRAK4, TIRAP, TLR2, TLR5, TLR9) forms a small clique with four other proinflammatory cytokines (IL18, IL22, IL1A and IL1B). Two other groups of genes (Box-I-2: PSME3, PSMB9, PSMD4, PSMB6, PSMA5, PSMD7, OAZ1, REL, HIVEP3, and Box-I-4: MAP4K3, SPATA2, TNIP1, ZC3H12A) are distant from these proinflammatory cytokines (orange dots), even though they are apparently linked to the TNFs.
Fig 10

SLE subnetwork formed between pathogenesis genes (green and purple) and inflammatory responses (orange, red, blue). The graph was plotted using a force-directed layout that uses attractive forces between adjacent nodes and repulsive forces between distant nodes. The distance between two vertices is roughly proportional to the length of the shortest path between them. Six genes (ACKR3, HRH4, HTR1, GAL, GRM3, S1PR1) in Box-C make high-degree contacts with the chemokine core (red box), with ANXA1 interacting with 16 chemokines. Interactions with the inflammation core (orange box) appear in multiple directions. Seven pathogenesis genes (Box-I-1) interact with the inflammation core (orange box) directly or through receptors (green nodes). Nine disease genes in Box-I-3 form a small core with four cytokines (IL18, IL22, IL1A and IL1B). Two other groups of genes (Box-I-2 and Box-I-4) appear distant from the cytokine core but are linked to the TNFs, as they cannot overcome the repulsive force to associate with the center of the inflammatory responses.


Highly connected graphs capture connections between immune disorder pathogenesis and inflammation

The ICDs for the five immune disorders are RA, 0.055; UC, 0.069; CD, 0.071; SLE, 0.067; and PS, 0.060, suggesting that the density of association between pathogenesis and inflammation is similar across them. However, different sets of genes are involved in these interactions. For each specific immune disorder, we are interested in the key pathogenesis genes that mediate cytokine connections. From the predicted inflammatory response subnetworks, spectrum partition identified highly connected graphs, that is, modules formed by a group of well-connected cytokines and pathogenesis genes, revealing the key genes that mediate cytokine connections and drive the pathogenesis of inflammatory diseases. Table 7 shows that 23 receptors and 36 disease genes (from 1340 genes taken from DisGeNET) were identified to form well-connected cytokine modules in RA, six receptors and four disease genes (from 542 genes) for PS, seventeen receptors and 35 disease genes (from 793 genes) for SLE, four receptors and eight disease genes (from 654 genes) for UC, and thirteen receptors and eighteen disease genes (from 622 genes) for CD. This process further prioritizes the known disease-associated genes from DisGeNET, providing a more focused set of candidates for experimental follow-up. Of additional note, the cytokine modules identified for SLE and RA overlap, as do those identified for PS and UC, but the corresponding pathogenesis genes do not (Tables C-F in S1 Text), suggesting that different mechanisms mediate the shared cytokine framework.
Table 7

Pathogenesis genes in the highly connected modules that were identified by spectrum partition on the subnetworks formed by pathogenesis genes, receptors, and cytokines, in the context of five immune disorders: rheumatoid arthritis (RA), psoriasis (PS), ulcerative colitis (UC), Crohn’s disease (CD) and systemic lupus erythematosus (SLE).

The table shows that 36 disease genes were identified out of 1,340 disease-associated genes for RA (2.7%), four disease genes from the 542 genes for PS (0.7%), 35 disease genes from the 793 genes for SLE (4.4%), eight disease genes from the 654 genes for UC (1.2%), and 18 disease genes from the 622 genes for CD (3%). Note that many of these disease-associated genes are related to immune responses.

rheumatoid arthritispsoriasissystemic lupus erythematosusulcerative colitisCrohn’s disease
TRAF1FASLGMAP3K5MALT1TLR4TLR5IRAK1S100A8TLR1NLRP3TIRAPMYD88TYK2OSMRGH1PRLIKZF3BCL6SH2B3SELECTLA4CR2PTPN2INPP5DIFI4NFKBIASIGIRRTSLPFOXP3 TBX21CLEC7AMAP4K3PDCD5SPATA2TNIP1 DDAH1 TLR2 TLR9FASLG PSMD7 REL LTBRTLR9IRAK4NFKBIASIGIRRFOXP3TBX21CLEC7AMAP4K3SPATA2TNIP1ZC3H12AOSMRPRLTRAF1FASLGTLR4TLR5IRAK1S100A8NLRP3TIRAPMYD88TLR2TYK2IKZF3SH2B3CR2SELECTLA4IRF9IRF7HLXIFIT1IFI44IRF1IRF2IRF5CFLARBIRC2BIRC3NFKB2USP14RELPSMG1EGLN3MAP3K1TLR4TLR5IRAK1TLR1IRAK3NLRP3TLR2TLR9TYK2GHRGH1PTPN2CTLA4INPP5DFOXP3TSLPNFKBIA


Discussion

A complete, large-scale map of associations between genes enables the identification of genome-wide features of cytokine interactions

The connection density between immune response genes is lower than that within the heavily studied metabolic, transcriptome, and signaling modules. Meanwhile, the associations across functional units are not well defined relative to the connections within known functional units (Table A in S1 Text). Disease-associated genes are often found scattered across different modules (metabolic, signaling, and immune modules). For example, for non-alcoholic steatohepatitis, disease-associated genes are found in the highly disparate functional modules of fatty acid beta-oxidation, proteolysis, signal transduction, leukocyte aggregation, and other cellular processes [17]. In this work, we aimed to identify novel associations between pathogenesis genes and immune responses; for this task, we required a map of pairwise associations between genes at a large scale. The STRING network itself is a highly connected graph that obeys “small world” statistics, and thus path length calculations are not useful for estimating the likelihood of association [21]. We cannot distinguish pairwise importance by shortest path length because too many gene pairs share the same length. Our proposed embedding space provides more information by capturing the topological structures of STRING, thereby enabling a complete map of pairwise associations. The high degree of pleiotropy and redundancy in the cytokine family (each cytokine has multiple functions, and each function is potentially mediated by multiple cytokines) makes the classification of cytokines a challenge [23,27]. Using our map of pairwise associations, we were able to connect a key set of cytokines to 14,707 human genes and identified genome-wide features that interact with unique groups of cytokines. These genome-wide associations enable a more systematic classification of cytokines. The biological annotations of these specific interactions also provide important insights into the functions of cytokine groups.
We identified two sets of genes (191 genes in SIG1 and 175 genes in SIG5) that interact only with chemokines, highlighting signaling pathways specific to chemokines: their biological functions focus on the G-protein signaling pathways, a response to endogenous and environmental insults. We also found that genes responsible for ubiquitin proteasome pathways interact with TNFs but not with other cytokines (SIG3 in Table 1). SIG4 is another interesting gene set; it interacts only with TGF and plays an important role in blood coagulation and plasminogen activating cascades, which are often associated with innate immunity in infectious and neuroinflammatory diseases. Some of the genes in the featured interactions (F5 and SERPINE2 in SIG4) are known to affect the concentrations of circulating cytokines [28]. Those genes that are not recognized by GWAS could be critical links from pathogenesis to inflammation. The biochemical pathways underlying the links from these genes to complex diseases have remained elusive. Our findings provide candidate genes for deeper studies of pathogenesis and inflammation.

Disease-specific cytokine profiles reveal flexible features of differential inflammatory responses

Our analysis suggests that diseases have flexible cytokine distributions even though they may share a cytokine framework that provides conserved mechanisms of inflammation. First, clustering diseases based on their cytokine profiles yields three different cytokine response modes, which correlate with disease classification. Of the 55 neoplasms studied, 32 fall into cluster-1 and twenty into cluster-3, of which nineteen are hematologic and lymphatic diseases (C15/C04) (Fig 7). Second, Immune Scores, which capture the contributions of different types of cytokines to inflammation, show that inflammation is a driver of pathology for many diseases beyond those typically considered autoimmune or infectious. Cardiovascular diseases show higher Immune Scores than metabolic disorders and neoplasms (Fig 8). The increased concentrations of cytokines in cardiovascular diseases are not only markers of chronic low-grade inflammation, but also provide an important pathophysiological link between cardiovascular health and ageing [29]. Third, the number of disease genes and receptors that are associated with essential cytokines varies widely compared with the numbers of cytokines themselves, suggesting that different mechanisms mediate between pathogenesis and inflammatory response (Table 5). Finally, ICDs, which quantify the density of interactions between pathogenesis and inflammation, suggest a mechanism by which different diseases have different levels of inflammation (Table 5). Within the class of immune disorders, we also observed differential cytokine distributions between different diseases (Fig 9). The five immune disorders examined all show close interactions with the cytokines responsible for proinflammatory responses, but not all five have close interactions with chemokines, TNFs or growth factors. One explanation is that multiple cytokines are triggered simultaneously by a few key activated triggers. Therefore, identification of the key genes and cytokines that trigger the immune responses in individual diseases may provide insights into therapeutic strategies.

Subnetworks between pathogenesis and inflammation suggest different mechanisms of immune response

Our predicted associations between cytokines and pathogenesis enable network analysis from different perspectives, providing useful insights into the molecular pathways that mediate inflammation. We investigated two methods for visualizing the subnetworks formed between pathogenesis and inflammation. Through hierarchical layered analysis on the connections between pathogenesis and inflammation, we were able to identify the central nodes in this dynamic process [30] (Fig J in S1 Text). For example, in the subnetwork for metabolic syndrome X, pathogenesis genes AGTR2 and ADRA1A are at the top hierarchical layer for chemokine signaling, while IRF1, MXZB1, MTTP, and CNTC are at the top layer in cytokine signaling. Additionally, our layered graph analysis suggests different interaction patterning: aneurysm showed a clear hierarchical flow starting from disease genes to cytokines, while metabolic syndrome X showed interactive layers between disease genes and cytokines, with an emphasis on chemokine responses, suggesting different mechanisms in signaling between pathogenesis and inflammation. We also utilized a force-directed graph to present the various interactions under different disease contexts [26] (Fig I in S1 Text). The networks for SLE and TB display different patterns, suggesting different mechanisms in triggering inflammatory responses in immune disorders and infections. These mechanisms may be related to the speed or strength of the immune reactions. Attractive forces between pathogenesis and chemokine responses are prominent in metabolic syndrome X, but not in aneurysm or acute leukemia. Interestingly, recent research has found that modification in the genes that closely interact with chemokines may affect functions in glucose and lipid metabolism in patients with metabolic syndrome X [31,32]. Our subnetwork for metabolic syndrome X provides candidates as novel targets for broader and more efficacious treatments and prevention of metabolic disease.

Spectrum partition of subnetworks identifies key mediators of immune disorders

Immune cells can release many pathogenic cytokines. Mechanistic studies will be necessary to identify the key cytokines for a given inflammatory disorder and to pinpoint which cytokines might be the appropriate targets for tackling each disease. Given a disease, our methods identify the well-connected subnetwork formed between pathogenesis and inflammation and can extract key genes closely associated with cytokines within the subnetwork. These key genes can then serve as therapeutic target candidates, as they are predicted to be the main mediators of inflammation. Human trials targeting different cytokines have shown differential efficacy of cytokine inhibition in chronic inflammatory diseases. Most chronic inflammatory diseases share clinical responsiveness to TNF-α inhibition but differ in their responsiveness to inhibition of other cytokines, such as IL6, IL1, IL17 and IL23. This suggests the existence of a hierarchical framework of cytokines that defines groups of chronic inflammatory diseases, in contrast to the previously assumed homogeneous molecular disease patterns [15]. Interestingly, we have identified a common well-connected subnetwork that defines the close interactions between pathogenesis genes and cytokines in SLE and RA, which comprises the pathogenesis genes TNIP1, SPATA2, MAP4K3, and CLEC7A. Annotation of these genes points to possible shared pathways in SLE and RA, and therefore to shared therapeutic targets. Among these genes, TNIP1 is involved in inhibition of nuclear factor-κB (NF-κB) activation by interacting with TNF-α induced protein 3, an established susceptibility gene for SLE and RA [33]. Other evidence suggests that the downregulation of SPATA2 augments transcriptional activation of NF-κB and inhibits TNF-α-induced necroptosis, pointing to an important function of SPATA2 in modulating the outcomes of TNF-α signaling, which plays important roles in inflammatory responses in RA and SLE [34].
These observations support our predicted key mediators for pathogenesis and inflammation. Further study of other key genes identified from disease-specific subnetworks may provide additional insights into therapeutic strategies. Our methods identify networks of cytokines and disease-related genes specific to each inflammatory disease. We cannot determine whether these are the causal factors for disease-specific clinical phenotypes without further analysis, including experimental investigations. However, our predictions provide insights into the potential underlying molecular mechanisms and may be useful to guide experimental programs. In addition, tissue-specific effects are lost when using a unified PPI network. As more comprehensive tissue-specific networks become available, our future work should focus on using tissue-specific PPI networks to refine our predictions.

Methods

Network embedding of 14K human genes

We downloaded the network of 19,344 human genes from the STRING database. This network contains 5,879,727 total edges. We selected 14,707 genes that are involved in 728,090 high-confidence edges (at a cutoff of 800) in STRING (Table A in S1 Text). We applied the methods of scalable feature learning for networks [22] to capture network topology features of the 14,707 genes in a 64-dimensional embedding space. Specifically, we conducted a grid search over hyperparameters to identify the optimal settings for the embedding algorithm (Method Notes in S1 Text). The finalized parameters were as follows: each node was used as the source to sample ten paths, each of length thirty (length of walks = 30, number of walks = 10, min count = 1, batch word = 6, window = 10). We then applied node2vec to these data to obtain a 64-dimensional embedding representation for each node.
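The walk-sampling step described above can be sketched as follows. This is a minimal illustration and not the authors' code: it draws uniform (unbiased) random walks, a simplifying assumption because the walk-bias parameters are not stated here, and it uses a tiny hypothetical toy graph rather than STRING edges. Only the values 10 walks per node and walk length 30 come from the text. In a full pipeline the sampled walks would then be passed to a skip-gram model to learn the 64-dimensional vectors.

```python
import random


def sample_walks(adj, num_walks=10, walk_length=30, seed=0):
    """Sample `num_walks` uniform random walks of up to `walk_length` nodes from every node.

    adj: dict mapping each node to the set of its neighbours (undirected graph).
    Assumption: neighbours are chosen uniformly at random (no return/in-out bias).
    """
    rng = random.Random(seed)
    nodes = sorted(adj)
    walks = []
    for _ in range(num_walks):
        rng.shuffle(nodes)  # visit start nodes in a fresh random order each round
        for start in nodes:
            walk = [start]
            while len(walk) < walk_length:
                neighbours = sorted(adj[walk[-1]])
                if not neighbours:  # isolated node: the walk cannot continue
                    break
                walk.append(rng.choice(neighbours))
            walks.append(walk)
    return walks


# Hypothetical toy graph for illustration only (not taken from STRING).
toy_graph = {
    "IL6": {"IL6R", "STAT3"},
    "IL6R": {"IL6", "STAT3"},
    "STAT3": {"IL6", "IL6R", "TNF"},
    "TNF": {"STAT3"},
}
walks = sample_walks(toy_graph)
print(len(walks), "walks; first walk starts at", walks[0][0])
```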

Prediction of association scores

For every pair of genes in our pool of 14,707 human genes, we calculated the cosine similarity between the two genes’ embedding vectors. We refer to the resulting 108,140,571 pairwise scores as Association Scores. STRING confidence scores were available for 9,250,034 of these pairs, of which 8,521,944 pairs had confidence scores below 800 and thus were not used for embedding. We evaluated the predictive strength of Association Scores by comparing them to the known confidence scores for these 8,521,944 pairs.
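As a small worked illustration of this step, the sketch below computes cosine similarity for every unordered pair in a toy embedding table and checks that 14,707 genes yield 108,140,571 unordered pairs. The gene names and two-dimensional vectors are hypothetical placeholders; the real embeddings are 64-dimensional.

```python
import math
from itertools import combinations


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def association_scores(embeddings):
    """Return {(gene_a, gene_b): cosine similarity} for every unordered gene pair."""
    return {
        (g1, g2): cosine_similarity(embeddings[g1], embeddings[g2])
        for g1, g2 in combinations(sorted(embeddings), 2)
    }


# Hypothetical 2-dimensional embeddings, for illustration only.
toy_embeddings = {"GENE_A": [1.0, 0.0], "GENE_B": [1.0, 1.0], "GENE_C": [0.0, 2.0]}
scores = association_scores(toy_embeddings)

# Number of unordered pairs among 14,707 genes: 14707 * 14706 / 2.
n_pairs = math.comb(14707, 2)
print(n_pairs)
```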

Prediction of disease-specific cytokine profiles

We identified 126 “essential cytokines” by mapping cytokines from ImmuneXpresso to the 14,707 human genes. For each cytokine gene, we calculated its Association Scores with each of the other 14,581 non-cytokine human genes. We collected 11,944 disease concepts from DisGeNET [24] that were associated with at least two genes in our set of 14,707 genes. For each disease concept, we calculated the average Association Score between its associated genes and each of the 126 cytokines, resulting in a 126-dimensional vector of Association Scores for each disease. The 11,944 diseases were grouped into four bins based on the number of genes associated with the disease: 2–9, 10–19, 20–49, or >49 (Fig F in S1 Text). The 126 Association Scores for each disease were normalized by the bin that the disease fell into: p-value = [number of diseases of which average cosine similarities ...]

We collected 171 well-studied diseases for which the literature sampling frequencies of 79 cytokines are available in ImmuneXpresso for validation. For each disease, we compared the predicted NAAS with the known literature sampling frequencies by calculating Spearman correlation coefficients; MATLAB computes p-values for Spearman’s rank correlation coefficient using the exact permutation distributions.
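The averaging and binning steps described above can be sketched as follows. The function names, the toy scores, and the gene labels are hypothetical. The within-bin normalization that turns these averages into the NAAS is deliberately left out, because its exact expression is not recoverable from the text here, and self-pairs (a disease gene that is itself a cytokine) are not treated specially.

```python
def gene_count_bin(n_genes):
    """Assign a disease to one of the four gene-count bins: 2-9, 10-19, 20-49, >49."""
    if n_genes < 2:
        raise ValueError("diseases with fewer than two genes are not included")
    if n_genes <= 9:
        return "2-9"
    if n_genes <= 19:
        return "10-19"
    if n_genes <= 49:
        return "20-49"
    return ">49"


def average_association_profile(disease_genes, cytokines, score):
    """Average Association Score between a disease gene set and each cytokine.

    score: callable (gene, cytokine) -> Association Score (cosine similarity).
    Returns {cytokine: mean score over the disease genes}.
    """
    return {
        c: sum(score(g, c) for g in disease_genes) / len(disease_genes)
        for c in cytokines
    }


# Hypothetical toy scores, for illustration only.
toy_scores = {
    ("G1", "IL6"): 0.9, ("G2", "IL6"): 0.5,
    ("G1", "TNF"): 0.2, ("G2", "TNF"): 0.4,
}
profile = average_association_profile(
    ["G1", "G2"], ["IL6", "TNF"], lambda g, c: toy_scores[(g, c)]
)
print(profile, gene_count_bin(2))
```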

Analysis of disease-specific cytokine profile features and subnetworks between pathogenesis and inflammation

Given a disease, we calculated the NAAS with respect to each of the 126 cytokines, resulting in a 126-dimensional disease-specific cytokine profile. We analyzed the features of these disease-specific cytokine profiles via hierarchical clustering. The profiles were further converted into “Immune Scores” by averaging the NAAS over 47 inflammation-related cytokines, 37 chemokines, 13 growth factors, and 29 other cytokines, to quantify the contributions from four aspects of inflammatory responses. To visualize the subnetwork formed between pathogenesis (disease-associated genes) and inflammatory responses (essential cytokines), we labeled the disease genes as either “disease-specific” or “cytokine receptors” by mapping them to the 110 cytokine receptors (available at https://github.com/TianyunC/cytokine-networks) that we defined by filtering genes acquired from GeneCards [35]. We graphed the subnetwork formed by high-confidence connections (Association Score > 0.8) between disease genes and cytokines in a force-directed visualization that uses attractive forces between adjacent nodes and repulsive forces between distant nodes. The distance between two vertices in the graph is roughly proportional to the length of the shortest path between them within the subnetwork. To quantify the information exchange between pathogenesis and inflammatory responses, we calculated the Immune Connection Density (ICD) with a formula adapted from network efficiency [25]; its terms are the total number of pathogenesis genes in the disease-specific network, the total number of cytokines in the disease-specific network, and the cosine similarity d of the two sets of genes in embedding space.
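The two summary quantities in this paragraph can be illustrated with a short sketch. The Immune Score part simply averages NAAS values per cytokine group, as described. The ICD part is an assumption: the exact expression is not reproduced in this text, so the function below uses one plausible form suggested by the description (sum the cosine similarities of pathogenesis-cytokine pairs above the 0.8 cutoff, then divide by the number of pathogenesis genes times the number of cytokines). All names and toy values are hypothetical.

```python
def immune_scores(profile, groups):
    """Average the NAAS of the cytokines in each group (e.g. inflammation, chemokines)."""
    return {
        name: sum(profile[c] for c in members) / len(members)
        for name, members in groups.items()
    }


def immune_connection_density(score, pathogenesis_genes, cytokines, cutoff=0.8):
    """ASSUMED form of the ICD, not confirmed by the source.

    Sums the similarity of every pathogenesis-cytokine pair whose score exceeds
    `cutoff` and normalizes by (number of pathogenesis genes) x (number of cytokines).
    Cytokine-cytokine and gene-gene pairs are never visited, matching the text.
    """
    n_p, n_c = len(pathogenesis_genes), len(cytokines)
    if n_p == 0 or n_c == 0:
        return 0.0
    total = 0.0
    for gene in pathogenesis_genes:
        for cytokine in cytokines:
            s = score.get((gene, cytokine), 0.0)
            if s > cutoff:
                total += s
    return total / (n_p * n_c)


# Hypothetical toy values, for illustration only.
toy_profile = {"IL6": 0.8, "IL1B": 0.6, "CCL2": 0.4}
toy_groups = {"inflammation": ["IL6", "IL1B"], "chemokines": ["CCL2"]}
toy_pairs = {("A", "IL6"): 0.9, ("B", "IL6"): 0.5, ("A", "TNF"): 0.85, ("B", "TNF"): 0.95}

scores_by_group = immune_scores(toy_profile, toy_groups)
icd = immune_connection_density(toy_pairs, ["A", "B"], ["IL6", "TNF"])
print(scores_by_group, icd)
```

Because only cross-component pairs are counted, a value near zero means few pathogenesis genes reach any cytokine at high confidence under this assumed definition.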

Identification of highly connected graphs between pathogenesis and inflammation by spectrum partition

For a given disease, we constructed a graph G from the predicted subnetworks derived from the high-confidence interactions between pathogenesis genes, receptors, and cytokines, with the interactions between cytokines and those between pathogenesis genes removed. We then calculated the Laplacian matrix L of the graph G, which is a square, symmetric, sparse matrix. The smallest non-null eigenvalue of L is called the Fiedler value and represents the algebraic connectivity of the graph; the further the Fiedler value is from zero, the more connected the graph. The Fiedler vector is the eigenvector corresponding to this smallest non-null eigenvalue. Using the Fiedler vector, one can partition the graph into two or three subgraphs. A node is assigned to one subgraph if it has a positive value in the Fiedler vector (well-connected nodes); otherwise, it is assigned to the other subgraph (poorly connected nodes). Alternatively, nodes with values close to zero in the Fiedler vector can be placed in a class of their own (known as articulation points). This practice is called a “sign cut” or “zero threshold cut”. The sign cut minimizes the weight of the cut, subject to the upper and lower bounds on the weight of any nontrivial cut of the graph [36].
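The sign cut described above can be sketched on a toy graph. This is a dependency-free illustration, not the authors' implementation: it approximates the Fiedler vector by power iteration on a shifted Laplacian while projecting out the constant vector, whereas the source does not say which eigen-solver was used. The graph (two triangles joined by one edge) is hypothetical and assumed to be connected and unweighted.

```python
import math


def fiedler_vector(adj, iters=2000):
    """Approximate the Fiedler vector of a connected, unweighted, undirected graph.

    adj: dict mapping each node to the set of its neighbours.
    Method: power iteration on (shift*I - L), where L = D - A is the Laplacian and
    shift = 2 * max degree bounds its largest eigenvalue; removing the mean at each
    step keeps the iterate orthogonal to the constant eigenvector (eigenvalue 0).
    """
    nodes = sorted(adj)
    index = {node: i for i, node in enumerate(nodes)}
    n = len(nodes)
    degree = [len(adj[node]) for node in nodes]
    shift = 2.0 * max(degree)
    x = [i - (n - 1) / 2.0 for i in range(n)]  # deterministic start, mean zero
    for _ in range(iters):
        y = [
            (shift - degree[i]) * x[i] + sum(x[index[v]] for v in adj[nodes[i]])
            for i in range(n)
        ]
        mean = sum(y) / n
        y = [value - mean for value in y]
        norm = math.sqrt(sum(value * value for value in y))
        x = [value / norm for value in y]
    return dict(zip(nodes, x))


def sign_cut(vector):
    """Split nodes by the sign of their Fiedler-vector entry (zero threshold cut)."""
    positive = {node for node, value in vector.items() if value > 0}
    return positive, set(vector) - positive


# Hypothetical toy graph: triangles {0, 1, 2} and {3, 4, 5} joined by the edge 2-3.
toy_adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
part_a, part_b = sign_cut(fiedler_vector(toy_adj))
print(sorted(part_a), sorted(part_b))
```

The overall sign of an eigenvector is arbitrary, so which side is labelled positive carries no meaning by itself; only the split between the two sides does.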

Supplementary material.

Method notes for embedding and grid searching.
Fig A: Network sparsity within and between known functional modules in protein-protein interaction (PPI) networks.
Fig B: The predicted Association Scores (Y-axis) of 8,521,944 edges correlate with their known STRING confidence scores.
Fig C: The distribution of Association Scores.
Fig D: Predicted Association Scores correlate with known confidence scores in STRING.
Fig E: The histogram of the number of genes associated with each of the 171 diseases.
Fig F: The NAAS distribution within each bin defined by the number of genes associated with a disease.
Fig G: The number of genes associated with diseases is plotted against the P-value estimating the correlation between the predicted NAAS and the literature sampling frequency of cytokines.
Fig H: The NAAS between aneurysm and each of the 79 cytokines is plotted against the literature sampling frequency in aneurysm.
Fig I: Graph plots showing interactions between pathogenesis genes and inflammatory responses.
Fig J: Hierarchical structure showing information flow from pathogenesis genes to inflammatory responses.
Table A: Statistics of selected sets from STRING and three classes of functional modules.
Table B: Gene signatures identified for the six groups of cytokines.
Table C: The predicted cytokine profiles correlate with the known literature sampling frequency in ImmuneXpresso for the 171 well-studied diseases.
Table D: The 171 diseases classified into three clusters based on their cytokine profiles.
Table E: Disease-associated genes in the well-connected modules formed by pathogenesis genes, receptors, and essential cytokines identified by spectrum partition in the context of five immune disorders.
Table F: Frequency (#) in the five diseases.
(DOCX)

12 Nov 2021 Dear Dr.
Altman, Thank you very much for submitting your manuscript "Construction of disease-specific cytokine profiles by associating disease genes with immune responses" for consideration at PLOS Computational Biology. As with all papers reviewed by the journal, your manuscript was reviewed by members of the editorial board and by several independent reviewers. In light of the reviews (below this email), we would like to invite the resubmission of a significantly-revised version that takes into account the reviewers' comments. Please make sure to address all concerns sufficiently including the data and code availability, use of proper statistical methodology, revising the support for each major claim and the consistency of terminology in the text. We cannot make any decision about publication until we have seen the revised manuscript and your response to the reviewers' comments. Your revised manuscript is also likely to be sent to reviewers for further evaluation. When you are ready to resubmit, please upload the following: [1] A letter containing a detailed list of your responses to the review comments and a description of the changes you have made in the manuscript. Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out. [2] Two versions of the revised manuscript: one with either highlights or tracked changes denoting where the text has been changed; the other a clean version (uploaded as the manuscript file). Important additional instructions are given below your reviewer comments. Please prepare and submit your revised manuscript within 60 days. If you anticipate any delay, please let us know the expected resubmission date by replying to this email. 
Please note that revised manuscripts received after the 60-day due date may require evaluation and peer review similar to newly submitted manuscripts. Thank you again for your submission. We hope that our editorial process has been constructive so far, and we welcome your feedback at any time. Please don't hesitate to contact us if you have any questions or comments. Sincerely, Ferhat Ay, Ph.D Associate Editor PLOS Computational Biology Rob De Boer Deputy Editor PLOS Computational Biology *********************** Reviewer's Responses to Questions Comments to the Authors: Please note here if the review is uploaded as an attachment. Reviewer #1: Here the authors ask "how genes involved in pathogenesis which are often not associated with the immune system in an obvious way communicate with the immune system?". They further ask "What is the degree to which cytokine responses are shared across diseases or specific to each disease? And how does heterogeneity of cytokine responses mediate different pathogenesis and inflammatory processes?". These are important mechanistic questions. Their comprehensive analysis reveals subnetworks formed between disease-specific pathogenesis genes, hormones, receptors, and cytokines, leading to genes responsible for interactions between pathogenesis and inflammatory responses. In line with their observations, trials indicated a hierarchical framework of cytokines that defines groups for chronic inflammatory diseases rather differently from the homogenous molecular disease pattern previously assumed. The authors detail and annotate the genes that they identified and their functions. Altogether, this is a good, coherent paper that is well-suited to PLOS CB. Especially, I like it since it aims at elucidating mechanisms. It is always possible to raise technical, methodological and/or statistical questions. In this case I do not think that these are needed. 
The manuscript describes an in-depth, thoughtful, robust and innovative approach and its output, and I hope that the results will indeed be useful toward drug discovery.

Reviewer #2: The manuscript entitled “Construction of disease-specific cytokine profiles by associating disease genes with immune responses” by Liu et al., submitted to PLoS Computational Biology for publication (Manuscript Number: PCOMPBIOL-D-21-01684), presents an interesting attempt to discover pathways of interaction between disease pathogenesis genes and immune-response-driving cytokines and to shed light on disease-specific mechanisms that mediate pathogenic immune responses, using an innovative network-based approach. However, there are a number of issues that need to be addressed before considering the manuscript for publication.

Minor comments:

Introduction:
- Page 3, lines 1-3: please provide references to support the claim made in the sentence.
- Paragraph 5: please move the sentences describing the results presented in Supplementary Material Figure S1 either to the results section or to the methods section, where they can be used to justify the usage of the network embedding procedure rather than the STRING PPI network itself.

Results:
- Please represent the results presented in Figure 2A as a “STRING confidence score vs association score” density map scatterplot, so the reader can have a clearer idea of how those two metrics track each other. Also, please calculate a correlation coefficient and add it to the results.
- Please perform a multiple-testing correction on the p-values obtained in the NAAS vs ImmuneXpresso literature co-occurrence frequency correlation analysis (Figure 4A) to decide which of the 171 diseases show a statistically significant correlation.
- Please represent the results presented in Figure 4B either as a Normalized Average Association Score vs literature co-occurrence frequency (in log scale?) scatterplot or, alternatively, as a boxplot showing the distribution of NAAS for those disease-cytokine pairs with literature co-occurrence frequency above 0.005 and below 0.005 separately.
- When presenting the results of the overlap between the high NAAS and high literature co-occurrence frequency, please add Specificity and overall Accuracy besides the recall.
- Subsection “Defining the key modes of cytokine response”, paragraph 1:
  o The authors state that “They fall into three distinct patterns in cytokine response”. According to what? The results of a hierarchical clustering? I know the readers can infer that from looking at Figure 5A but, by just reading the text, the assertion seems arbitrary. Please add detail.
  o The descriptions of the results of Figure 5 in the main text and in the figure’s legend do not agree. Please revise so both pieces of text describe the results in the figure appropriately.
- Subsection “Defining the key modes of cytokine response”, paragraph 2:
  o It’s unclear what is meant by the first sentence. Please rephrase to make it clearer.
  o Please use ‘drive’ instead of ‘decide’ in the third sentence of this paragraph.
  o In Figure 5A, it is not possible to identify the different groups of inflammatory cytokines (interleukins, interferons and TNFs). Please use color-coding on the cytokine labels to aid the identification of those subgroups of cytokines.
  o Please try to summarize the results presented in Figure 5B more concisely. In general, immune disorders and infections, as expected, show higher values for all 4 cytokine-type components, with cardiovascular diseases presenting intermediate values, and metabolic diseases and, especially, neoplasms having the lowest average immune scores for all 4 components.
- Subsection “Inflammatory response subnetworks provide disease-specific insight”, paragraph 1:
  o The fact that the subnetworks identified for each disease look different on visual inspection is not a meaningful result.
    Please either remove that part of the sentence or elaborate in more detail on the differences between the disease subnetworks.
  o Please explain the rationale for the assertion made in the last sentence of the paragraph. Would you expect to see more connections if, instead of cytokines, the interactors were metabolic or signaling genes? And, if so, what would you base your expectation on?
- Subsection “Inflammatory response subnetworks provide disease-specific insight”, paragraph 2:
  o Please clarify whether, for any given disease, only the cytokine receptors that are associated with the disease as per DisGeNET or, alternatively, all cytokine receptors that appear in your embedded network are included in these subnetworks.
  o “Therefore, the ICD suggests the associations between…”: please use some other term instead of “suggests”; “captures” or “represents” could do the job, for example.
  o It would be informative to assign p-values to the ICD values you calculate for each disease (Table 2A). For that, please perform simulations for each disease: randomly take as many non-cytokine genes from the network as there are DisGeNET genes for the disease in your global network, calculate the ICD for the resulting subnetwork and, after repeating this procedure X times (at least 1000, preferably), calculate the percentile of your real ICD value to get a p-value. Repeating this procedure for each of the five diseases should assign a p-value to each ICD and aid its interpretation.
- Subsection “Immune disorders pathogenesis genes connect to key inflammatory genes”, paragraph 1:
  o In Figure 6, please explain in the legend that only NAAS values > 0.7 are shown. Also, please use 2-3 cytokines as examples to elaborate on and illustrate the fact that all 5 diseases show similar patterns of association with inflammatory cytokines, but widely varying levels with chemokines, TNFs and growth factors (IL18 and CCL17 look like they could be good candidates for that!).
- Subsection “Immune disorders pathogenesis genes connect to key inflammatory genes”, paragraph 2:
  o Page 10, line 1: there is probably a typo and “Box-I-1-4” should probably read “Box-I-1”. Please check and fix if necessary.
  o From this point on, there are several instances where pre-inflammatory cytokines are mentioned. I assume this is a mistake and “pro-inflammatory” cytokines is the appropriate term. Please fix all instances if this is the case.
- Subsection “Highly connected graphs capture connections between immune disorder pathogenesis and inflammation”, paragraph 1:
  o Is there a typo in the first sentence, or is CD’s ICD really 0.71? Please check and amend if necessary.
  o In Supplementary Table S5, it's impossible to infer from the gene lists how many genes are shared across the diseases. Please add either Venn diagrams showing the overlap between the gene lists, or a list of unique gene IDs/symbols indicating, for each gene, in how many diseases it appears and which those diseases are.

Discussion:
- Subsection “A complete map of associations between genes in large scale enables the identification of genome-wide features of cytokine interactions”, paragraph 1:
  o It is difficult to understand the results presented in Supplementary Materials Table S1 without further explanation. For example, what do ME and HM stand for? How were those gene sets obtained? Please add detail to aid understanding.
  o From the first four sentences of the paragraph, it doesn't follow that one needs to identify and explore novel associations between disease pathogenesis genes and the immune response. Please add another sentence in between to link the context provided at the beginning of the paragraph and the stated goal of the study.
  o Please explain why the fact that a network has "small world" properties makes it unsuitable for estimating the likelihood of association between genes via path-length calculations. For non-graph experts, this is not immediately obvious.
- Subsection “Subnetworks between pathogenesis and inflammation suggest different mechanisms of immune response”:
  o The results of Supplementary Materials Figure S9 should be presented in the results section rather than here in the Discussion.
- Subsection “Spectrum partition of subnetworks identifies key mediators of immune disorders”:
  o “These key genes serve as candidates of therapeutic targets, as they are the main mediators to fuel inflammation”: This is a very strong assertion that, I feel, is not justified by the results. It is true that the subnetworks have allowed the identification of genes involved in pathogenesis that might have a notable role in driving the immune responses in those diseases, but more work is needed to confirm that this is, indeed, the case, and that they could be good candidates for therapeutic intervention. Please rephrase to tone down.
  o “The human trials targeting different cytokines suggest the existence of a hierarchical framework of cytokines that defines groups for chronic inflammatory diseases rather differently from the homogenous molecular disease pattern previously assumed”: I find this sentence very difficult to understand. Please rephrase to increase clarity.
  o “These observations validate our predicted key mediators for pathogenesis and inflammation”: I think “validate” is a clear overstatement here. Please tone down to “support” or to another similar term.
- Please add a paragraph discussing the limitations of the study and another one discussing potential future steps regarding how to overcome those limitations and/or strategies for validation of the therapeutic candidate genes identified.

Methods:
- Please provide details on the experiment performed to find the optimal hyper-parameters for the embedding algorithm and how the final parameters were chosen.
- Did you calculate the cosine distance or the cosine similarity? I mean, if the resulting Association Scores are positively associated with the edge confidences in STRING, I suppose that cosine similarities were calculated, rather than distances (1-cos_similarity). Otherwise, the paper doesn't make any sense. Please clarify.
- As far as I can tell, there are 140 cytokines in ImmuneXpresso. Did only 126 of those map onto the 14707 human gene network? Please clarify.
- Please explain why it is necessary to embed the STRING network in 64-dimensional space rather than using the STRING network itself to calculate potential disease-gene – cytokine associations.
- Why was it necessary to normalize the average association scores for each cytokine and disease for gene-set size? Did you observe an association between the number of genes associated with a disease and the average Association Scores between the genes and the cytokines? If so, please provide some result showing this association.
- It is not clear to me how the cytokine profiles are normalized by the associated-gene-count bin each disease falls into. For starters, in each bin, is the calculation of the p-value done using the distribution of the association score across all cytokines or for each cytokine separately? Also, how is the normalized average Association Score calculated, then? Is it the -log10(p-value)? Or the percentile? Please add details to clarify.
- For the validation of the NAAS scores, did you calculate the Spearman correlation between the NAAS profile across 79 cytokines for a disease and the corresponding profile of the literature sampling frequency for the disease? Please add clarification. Also, please explain the rationale behind using ImmuneXpresso's literature sampling frequency profile to validate the Association Score, and explain what the “literature sampling frequencies” represent. Are these frequencies of co-occurrence of a disease and a cytokine across the literature?
- How was the filtering done to reach the list of 110 cytokine receptors from GeneCards? Please provide details.
- “and is the cosine distance of two sets of genes in embedding space”. Two comments here:
  o Again, please clarify whether you calculated cosine distances or cosine similarities, as the analysis wouldn’t make sense as it stands if using the former.
  o I would expect the dpi to represent the cosine distance (or similarity) in embedding space between each pair of pathogenesis genes and cytokines and not the overall cosine distance (or similarity) between the two sets of genes, right? Please clarify.
- If the object of the spectrum partition was to identify the pathogenesis genes showing a high connectivity with cytokines in each disease-specific subnetwork, I would expect one would have to remove all the interactions among pathogenesis genes, too, and not only the interactions among cytokines. Otherwise, the Fiedler vectors will reflect not only pathogenic gene – cytokine connectivity but also pathogenic gene – pathogenic gene connectivity. However, the text only makes reference to interactions between cytokines being removed. Please clarify this point.

Figures:
- In Figure 1A, please remove the arrow from the “14,707 genes” box to the “9,250,034 pairs in STRING” box, as it suggests that the STRING interactions were identified based on the initial selection of those genes when, according to the explanation in the methods section, that is not the case.

Overall comments:
- Throughout the text, inflammation/inflammatory and immune/immunological seem to be used interchangeably. My understanding is that the scope of the study is the exploration of the broader immune response rather than of the particular inflammatory aspect of immune responses. If that’s the case, please change all instances where inflammation/inflammatory is used to refer to the broader immune response. Otherwise, please clarify in the text that the study’s focus is inflammation only.
- There are a few instances in the text where the study is framed as identifying disease-specific cytokine response profiles. I find this misleading. No levels of cytokines or of their downstream effectors were used in the study. Please rephrase those instances to more accurately reflect the fact that the cytokine profiles represent network-interaction-based potential cytokine involvement rather than cytokine response.
- There are quite a few grammatical errors and some of the sentences are not worded in an easily understandable manner. Please get the manuscript proofread by a native English speaker to fix grammatical and punctuation mistakes and make some sentences clearer.

In this work, seeking to understand how genes involved in pathogenesis, which are often not associated with the immune system in an obvious way, communicate with the immune system is an endeavor of interest to the research community. Overall, the writing can be further refined for more concise and specific articulation and flow of concepts. It is a useful endeavor to establish a threshold of high-confidence interactions. Even where an interaction can occur (for example, ligand/receptor), that does not indicate that it does occur in a given disease tissue unless covariance and level of expression signify a high probability or confidence level of this occurring, in which case RNA and protein expression would have to be used as supporting data.
• Another question to ask is how the hierarchy of cytokine response differs or is conserved across diseases, and whether you can characterize the various functional gene sets that are associated with specific cytokine responses. This was not done.
• It remains unclear whether the inflammation score is differentiating/qualifying the type of inflammatory response or whether it is quantifying the magnitude of immune response.
• It is unclear what the rationale is for merging all diseases in the analysis of protein-protein interactions and disease genes and then separating the diseases for the inflammation score. The former may be dependent on the disease, with some genes interacting in some diseases and not others, whereas the magnitude of the inflammation score may converge across diseases since it tracks with severity of inflammation. Gene interactions may also depend on tissue context, which is associated with disease type. It is these context-dependent interactions that may be the drivers of cytokine responses, with some collections of cytokines triggering dependencies on other cytokines, so it may be overly simplistic to start with one universal disease gene PPI framework. In this way, the ingoing assumption is flawed.
• The ‘known’ disease cytokine associations derived from NLP are questionable.
• Build in an in silico negative control.
• It would be interesting if this model could parse the relationships between which disease genes are causal in driving cytokines, how the interrelationships of different cytokines are altered in different disease settings, and then what the disease gene effect of cytokine stimulation is.
• Some of the preliminary conclusions have not been vetted properly to identify biological meaning. For example, “Chemokine-CLU is associated with a set of genes that function in G-protein signaling pathway and the response to endogenous and environmental insults”. This is the case because chemokine receptors are GPCRs.
• “Note that of the twenty neoplasms in cluster-3, nineteen are hematic and lymphatic diseases (C15/C04), suggesting that these neoplasms have distinct cytokine distributions from other neoplasms.” This is expected given that hematopoietic and lymphatic tissues are immune organs, as compared to neoplasms of other tissues.
• “We also found that genes responsible for Ubiquitin proteasome pathways interact with TNFs, not other cytokines”: already known in part and not entirely true.
• Main conclusions are either too vague or already known, or recite existing open questions which have not been sufficiently investigated and analyzed in this model.
• Neoplasms (systemic lupus erythematosus (SLE), TB, aneurysm, metabolic syndrome X, and acute leukemia). These examples do not fit the definition of neoplasm.
• While the collection of methods is interesting, the novel conclusions are limited, i.e., it is already known that TLR activation is linked to cytokine/TNF response.
• Since RA, SLE and IBD are inflammatory diseases, it is expected that they would have a higher ICD score than the other disease genes. The final conclusions about these diseases are vague and not analyzed with sufficient depth.

Reviewer #3: Manuscript PCOMPBIOL-D-21-01684

The article is framed around the potential relationship between inflammation and human disease and how to uncover this relationship. The authors present a method to obtain disease-specific cytokines and their associated disease-specific genes. The method is based on prediction of protein-protein interactions from network embeddings and on previous knowledge of inflammatory genes and disease genes. The manuscript would benefit from more clarity about the goals and the description of the methodology. It has been hard to understand the rationale behind key aspects of the methodology and its implementation. Moreover, data and computational code are not available for evaluating the results and their reproducibility. That said, from what I could understand, the proposed method is based on some assumptions that deserve more careful thought. One is that there are disease genes and disease-specific cytokines, and that by mapping these gene sets to PPI networks it is possible to uncover relationships between inflammation and disease.
This is fine, with the exception that there are a lot of immune-related genes and cytokines already associated with diseases. Thus, there are no such disjoint gene sets; this is related to the complexity of biology and conflicts with our intention to classify processes in clear-cut boxes. The other caveat is the assumption of a complete graph for the protein interaction network. In my understanding, biological networks are not complete graphs. More specific comments are provided below.

1. A non-negligible number of disease genes are actually cytokines. How does your method account for this overlap in gene sets? In this context, the separation of disease-gene and cytokine-gene sets seems rather artificial. Please elaborate on this.
2. Intro, p. 4: “Ideally, we would like to map known immune response genes more completely to human gene networks to better identify potential links to pathogenesis genes.” Do you mean inflammatory disease genes? There are many expressions like this in the article that are not clear enough to understand what the authors mean.
3. Why was STRING selected as the source of PPI networks? There are other resources that are available and include the latest datasets for discovering PPIs, such as IntAct. In addition, STRING includes different types of data and methods to derive the associations; the authors should specify which associations were included.
4. What is the rationale for the network sparsity analysis mentioned in the Intro? Why should we expect a complete graph in a biological network? This analysis should be better explained and justified.
5. Figure S1 refers to “modules”, but no details on how these modules are detected are provided. The same goes for the reference to “functional modules”. How are immune response and disease modules obtained, for instance? Please define the labels in Fig S1. It is not clear why the methodology for this analysis is not included in the methods section.
6. Does the Association Score consider experimental results for any given interaction?
7. Figures S2-S4 refer to “network distance”; do you mean “association score”? It is quite difficult to follow the figures and the text if there is no consistency in the naming of variables.
8. Please explain which diseases were selected from DisGeNET (the 11,944 disease concepts) and the rationale for the selection.
9. Figure S5 is hard to interpret without labels on the axes.
10. The code and data are provided under this link: https://simtk.org/projects/cytokine. But it is not possible to access the data or code without being a member of the project. Therefore, the datasets and the code are not available to ensure reproducibility.

**********

Have the authors made all data and (if applicable) computational code underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data and code underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data and code should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data or code (e.g. participant privacy or use of data from a third party), those must be specified.

Reviewer #1: Yes
Reviewer #2: None
Reviewer #3: No: The code and data are provided under this link: https://simtk.org/projects/cytokine. But it is not possible to access the data or code without being a member of the project. I have registered on the platform but still do not have access to the project. Therefore the datasets and the code are not available to ensure reproducibility.
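The simulation that Reviewer #2 requests above for attaching p-values to the ICD values (draw as many random non-cytokine genes as the disease has mapped genes, recompute the statistic, repeat at least 1000 times, and locate the real value in the resulting distribution) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function name `empirical_p_value`, the toy per-gene scores, and the `mean_score` stand-in for the ICD are all hypothetical, since the ICD formula itself is not given in this correspondence.

```python
import random
from typing import Callable, Sequence


def empirical_p_value(
    observed: float,
    background_genes: Sequence[str],
    set_size: int,
    statistic: Callable[[Sequence[str]], float],
    n_iter: int = 1000,
    seed: int = 0,
) -> float:
    """Estimate how often a random gene set scores at least as high as observed.

    The +1 in numerator and denominator keeps the estimate strictly above zero.
    """
    rng = random.Random(seed)
    at_least_as_large = 0
    for _ in range(n_iter):
        # Draw as many background (non-cytokine) genes as the disease gene set has.
        random_set = rng.sample(background_genes, set_size)
        if statistic(random_set) >= observed:
            at_least_as_large += 1
    return (at_least_as_large + 1) / (n_iter + 1)


if __name__ == "__main__":
    # Toy data: hypothetical per-gene connectivity to cytokines, not real values.
    toy_score = {f"gene{i}": i / 100 for i in range(100)}
    genes = list(toy_score)

    def mean_score(gene_set: Sequence[str]) -> float:
        # Hypothetical stand-in for the ICD of the subnetwork built on gene_set.
        return sum(toy_score[g] for g in gene_set) / len(gene_set)

    disease_genes = genes[-10:]  # the ten highest-scoring toy genes
    p = empirical_p_value(mean_score(disease_genes), genes, 10, mean_score)
    print(f"empirical p-value: {p:.4f}")
```

In a real analysis the `statistic` callback would rebuild the disease-specific subnetwork for each random gene set and return its ICD, and the procedure would be run once per disease, as the reviewer suggests.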
**********

PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No
Reviewer #2: No
Reviewer #3: No

Figure Files:
While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Then, log in and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us at figures@plos.org.

Data Requirements:
Please note that, as a condition of publication, PLOS' data policy requires that you make available all data used to draw the conclusions outlined in your manuscript. Data must be deposited in an appropriate repository, included within the body of the manuscript, or uploaded as supporting information. This includes all numerical values that were used to generate graphs, histograms, etc. For an example in PLOS Biology see here: http://www.plosbiology.org/article/info%3Adoi%2F10.1371%2Fjournal.pbio.1001908#s5.

Reproducibility:
To enhance the reproducibility of your results, we recommend that you deposit your laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. Additionally, PLOS ONE offers an option to publish peer-reviewed clinical study protocols.
Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols

28 Jan 2022

Submitted filename: Response2Reviewer.docx

3 Mar 2022

Dear Dr. Altman,

Thank you very much for submitting your manuscript "Construction of disease-specific cytokine profiles by associating disease genes with immune responses" for consideration at PLOS Computational Biology. As with all papers reviewed by the journal, your manuscript was reviewed by members of the editorial board and by several independent reviewers. The reviewers appreciated the attention to an important topic. Based on the reviews, we are likely to accept this manuscript for publication, providing that you modify the manuscript according to the review recommendations.

In addition to the remaining reviewer comments, you will need to make the source code publicly available before we can accept your work for publication. Please check the Sharing Software section of the journal policy: https://journals.plos.org/ploscompbiol/s/materials-software-and-code-sharing

Please prepare and submit your revised manuscript within 30 days. If you anticipate any delay, please let us know the expected resubmission date by replying to this email.

When you are ready to resubmit, please upload the following:

[1] A letter containing a detailed list of your responses to all review comments, and a description of the changes you have made in the manuscript. Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments.
If eligible, we will contact you to opt in or out.

[2] Two versions of the revised manuscript: one with either highlights or tracked changes denoting where the text has been changed; the other a clean version (uploaded as the manuscript file).

Important additional instructions are given below your reviewer comments.

Thank you again for your submission to our journal. We hope that our editorial process has been constructive so far, and we welcome your feedback at any time. Please don't hesitate to contact us if you have any questions or comments.

Sincerely,
Ferhat Ay, Ph.D
Associate Editor
PLOS Computational Biology

Rob De Boer
Deputy Editor
PLOS Computational Biology

***********************

A link appears below if there are any accompanying review attachments. If you believe any reviews to be missing, please contact ploscompbiol@plos.org immediately.

Reviewer's Responses to Questions

Comments to the Authors:
Please note here if the review is uploaded as an attachment.

Reviewer #2: Please address the remaining issues in your analysis:

The pathogenesis genes which you cite as interacting with cytokines (immune genes) are in fact almost all immune genes because these are inflammatory diseases (Table 3). CVD may not be an infectious or autoimmune disease, but it is widely considered a chronic inflammatory disease. Aneurysm or leukemia may affect blood or vessel-lining cells, but they are not considered immune-mediated per se. Many chemokines/receptors are GPCRs, so your enrichment analysis may be circular.

**********

Have the authors made all data and (if applicable) computational code underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data and code underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file).
The data and code should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data or code (e.g. participant privacy or use of data from a third party), those must be specified.

Reviewer #2: Yes

**********

PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #2: No

Figure Files:
While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Then, log in and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us at figures@plos.org.
References:
Review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the rebuttal letter that accompanies your revised manuscript. If you need to cite a retracted article, indicate the article’s retracted status in the References list and also include a citation and full reference for the retraction notice.

4 Mar 2022

Submitted filename: Revision2.docx

17 Mar 2022

Dear Dr. Altman,

We are pleased to inform you that your manuscript 'Construction of disease-specific cytokine profiles by associating disease genes with immune responses' has been provisionally accepted for publication in PLOS Computational Biology.

Before your manuscript can be formally accepted you will need to complete some formatting changes, which you will receive in a follow-up email. A member of our team will be in touch with a set of requests.

Please note that your manuscript will not be scheduled for publication until you have made the required changes, so a swift response is appreciated.

IMPORTANT: The editorial review process is now complete. PLOS will only permit corrections to spelling, formatting or significant scientific errors from this point onwards.
Requests for major changes, or any which affect the scientific understanding of your work, will cause delays to the publication date of your manuscript.

Should you, your institution's press office or the journal office choose to press release your paper, you will automatically be opted out of early publication. We ask that you notify us now if you or your institution is planning to press release the article. All press must be co-ordinated with PLOS.

Thank you again for supporting Open Access publishing; we are looking forward to publishing your work in PLOS Computational Biology.

Best regards,
Ferhat Ay, Ph.D
Associate Editor
PLOS Computational Biology

Rob De Boer
Deputy Editor
PLOS Computational Biology

***********************************************************

8 Apr 2022

PCOMPBIOL-D-21-01684R2
Construction of disease-specific cytokine profiles by associating disease genes with immune responses

Dear Dr Altman,

I am pleased to inform you that your manuscript has been formally accepted for publication in PLOS Computational Biology. Your manuscript is now with our production department and you will be notified of the publication date in due course.

The corresponding author will soon be receiving a typeset proof for review, to ensure errors have not been introduced during production. Please review the PDF proof of your manuscript carefully, as this is the last chance to correct any errors. Please note that major changes, or those which affect the scientific understanding of the work, will likely cause delays to the publication date of your manuscript.

Soon after your final files are uploaded, unless you have opted out, the early version of your manuscript will be published online. The date of the early version will be your article's publication date. The final article will be published to the same URL, and all versions of the paper will be accessible to readers.

Thank you again for supporting PLOS Computational Biology and open-access publishing.
We are looking forward to publishing your work! With kind regards, Katalin Szabo PLOS Computational Biology | Carlyle House, Carlyle Road, Cambridge CB4 3DN | United Kingdom ploscompbiol@plos.org | Phone +44 (0) 1223-442824 | ploscompbiol.org | @PLOSCompBiol