Nodes with high centrality in protein interaction networks are responsible for driving signaling pathways in diabetic nephropathy.

Maryam Abedi, Yousof Gheisari

Abstract

Despite extensive efforts, chronic diseases remain an unresolved problem in medicine. Systems biology could help develop more efficient therapies by providing quantitative, holistic insights into these complex disorders. In this study, we re-analyzed a microarray dataset to identify critical signaling pathways related to diabetic nephropathy. The GSE1009 dataset was downloaded from the Gene Expression Omnibus database, and the gene expression profiles of glomeruli from diabetic nephropathy patients and from healthy individuals were compared. The protein-protein interaction network for the differentially expressed genes was constructed and enriched. In addition, the topology of the network was analyzed to identify genes with high centrality parameters, and pathway enrichment analysis was then performed. We found 49 genes to be differentially expressed between the two groups. The network of these genes had few interactions, so it was enriched, yielding a network with 137 nodes. Based on different parameters, 34 nodes were considered to have high centrality in this network. Pathway enrichment analysis with these central genes identified 62 inter-connected signaling pathways related to diabetic nephropathy. Interestingly, the central nodes were more informative for pathway enrichment analysis than either all network nodes or the 49 differentially expressed genes. In conclusion, we show that central nodes in protein interaction networks tend to be present in pathways that co-occur in a biological state. This study also suggests a computational method for inferring the underlying mechanisms of complex disorders from raw high-throughput data.

Keywords:  Diabetic nephropathy; Microarray analysis; Protein interaction maps; Systems biology

Year:  2015        PMID: 26557424      PMCID: PMC4636410          DOI: 10.7717/peerj.1284

Source DB:  PubMed          Journal:  PeerJ        ISSN: 2167-8359            Impact factor:   2.984


Introduction

Chronic diseases are the leading cause of death and disability. Despite extensive investigation, the exact mechanisms underlying the occurrence and progression of these disorders are not yet fully understood, and in most cases the therapeutic options are unsatisfactory. Systems biology is a promising approach to address these limitations. Using this strategy, valuable information has been obtained on the molecular basis of various diseases such as macular degeneration, myocardial infarction, metabolic syndrome, and kidney fibrosis (Dumas, Kinross & Nicholson, 2014; Ghasemi et al., 2014; Jin et al., 2012; Morrison et al., 2011). With its global view, systems biology has a distinctive potential to extract meaning from bulk, sometimes ambiguous, data derived from omics technologies. It can also provide a mechanistic view through the generation of mathematical models (Azeloglu et al., 2014; Cedersund et al., 2008; Swameye et al., 2003). Chronic kidney disease (CKD) is a common debilitating disorder that consumes a considerable fraction of health budgets (Trivedi, 2010). CKD secondary to diabetes mellitus, known as diabetic nephropathy (DN), is the most common subtype. Although many previous studies have identified the pathogenic role of individual genes and signaling pathways in DN, a systematic, holistic view has rarely been attempted for this complex disorder. Among systematic studies in DN, Starkey et al. (2010) showed altered retinoic acid metabolism in diabetic mouse kidney by proteomic analysis. Similarly, using a network analysis approach, Sengupta et al. (2009) predicted the interaction of PTPN1 with EGFR and CAV1 in the vascular complications of DN. However, the functional significance of network topology parameters has not been thoroughly assessed in this disorder. Here, we have reanalyzed a microarray dataset originally deposited by Baelde et al. (2004), which compares the expression profiles of glomeruli from DN patients and normal individuals.
Their analyses revealed a number of differentially expressed genes (DE genes), among which the down-regulation of VEGF and nephrin was confirmed by real-time PCR. Their gene ontology (GO) analysis also predicted pathways such as nucleic acid metabolism, neuropeptide signaling pathway, and actin binding to be related to DN. Here, we reanalyzed this dataset with a different method of statistical significance detection, which resulted in a different number of DE genes. In addition, we constructed a protein-protein interaction (PPI) network and employed graph theory concepts to assess the network topology. Critical nodes were then selected for pathway enrichment analysis. This computational approach may also be applied to other large datasets to deepen our understanding of chronic diseases by extracting meaningful concepts from the bulk raw data of high-throughput technologies.

Materials and Methods

Microarray data

The mRNA expression profile GSE1009, deposited by Baelde et al. (2004), was downloaded from the Gene Expression Omnibus (GEO) database (Barrett et al., 2009). In this microarray experiment, gene expression in the glomeruli of DN patients was compared with that of healthy individuals. Before further analysis, we assessed the quality of the samples by hierarchical clustering and principal component analysis (PCA) based on the data of the top DE genes. For hierarchical clustering, the Euclidean distance measure and the average-linkage method were applied using SUMO software (Schwager. http://www.oncoexpress.de/software/sumo/). PCA was performed using the Multibase_2015 Excel Add-In program. The dataset was re-analyzed with the GEO2R tool of GEO, in which three normal samples were compared with three DN samples by Student’s t-test. For p-value correction, the Benjamini–Hochberg false discovery rate method was applied. Genes with an adjusted p-value of less than 0.05 were considered differentially expressed.
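The Benjamini–Hochberg adjustment described above can be illustrated with a minimal sketch. This is not the GEO2R implementation; the function name `benjamini_hochberg` and the raw p-values are hypothetical and chosen only to show how the adjustment changes which genes pass the 0.05 cutoff.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values
    # never increase as the raw p-values decrease.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for five genes (illustration only).
raw = [0.001, 0.008, 0.039, 0.041, 0.60]
adj = benjamini_hochberg(raw)

raw_hits = [i for i, p in enumerate(raw) if p < 0.05]
adj_hits = [i for i, q in enumerate(adj) if q < 0.05]
print("raw p < 0.05:", raw_hits)
print("adjusted p < 0.05:", adj_hits)
```

In this toy input, four genes pass a raw cutoff of 0.05 but only two remain after adjustment, which is the kind of reduction the method is designed to produce.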

Protein-protein interaction network

Using the CluePedia plugin version 1.1.3 (Bindea, Galon & Mlecnik, 2013) of Cytoscape version 3.1.0 (Shannon et al., 2003), a PPI network was constructed for the DE genes in the microarray dataset. The STRING database with a confidence cutoff of 0.80 was used for retrieving interactions. The network topology was analyzed with the Cytoscape NetworkAnalyzer tool, and topology measures such as degree, betweenness centrality, closeness centrality, and clustering coefficient were calculated.
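For readers unfamiliar with these measures, the sketch below computes degree, closeness centrality, and betweenness centrality on a small undirected graph using only the Python standard library. It is an illustration under stated assumptions, not the NetworkAnalyzer code: the node names are hypothetical, closeness is taken as the reciprocal of the average shortest-path length to reachable nodes, and betweenness is normalized by the number of node pairs that exclude the node itself.

```python
from collections import deque


def bfs_distances(graph, source):
    """Shortest-path lengths (in edges) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nb in graph[node]:
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    return dist


def degree(graph):
    return {n: len(nbrs) for n, nbrs in graph.items()}


def closeness(graph):
    """Reciprocal of the mean distance to reachable nodes (0 if isolated)."""
    result = {}
    for n in graph:
        dist = bfs_distances(graph, n)
        total = sum(dist.values())
        result[n] = (len(dist) - 1) / total if total else 0.0
    return result


def betweenness(graph):
    """Brandes' algorithm for an unweighted, undirected graph."""
    score = {n: 0.0 for n in graph}
    for s in graph:
        stack = []
        preds = {n: [] for n in graph}
        sigma = {n: 0 for n in graph}  # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {n: 0.0 for n in graph}
        while stack:  # accumulate dependencies, farthest nodes first
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    n = len(graph)
    pairs = (n - 1) * (n - 2) / 2
    # Each undirected pair was counted from both endpoints, hence the / 2.
    return {v: (b / 2) / pairs if pairs else 0.0 for v, b in score.items()}


# Hypothetical toy network: B is a hub, D bridges B and E.
toy = {
    "A": ["B"],
    "B": ["A", "C", "D"],
    "C": ["B"],
    "D": ["B", "E"],
    "E": ["D"],
}
deg, clo, btw = degree(toy), closeness(toy), betweenness(toy)
for node in toy:
    print(node, deg[node], round(clo[node], 3), round(btw[node], 3))
```

Because closeness here is computed only over the nodes reachable from a given node (an assumption about how disconnected networks are handled), nodes in small isolated components can receive high closeness scores even when they have few connections.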

Pathway enrichment analysis

Pathway enrichment analysis was performed using the Cytoscape ClueGO plugin version 2.1.3 (Bindea et al., 2009). In this analysis, the Bonferroni step-down method was applied for p-value adjustment, and pathways with an adjusted p-value <0.05 were selected.
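The step-down adjustment can be sketched as follows. The Bonferroni step-down procedure is commonly identified with Holm's method, and the code assumes that reading; it is not taken from ClueGO, and the function name and the raw p-values are hypothetical.

```python
def holm_step_down(p_values):
    """Return Holm (Bonferroni step-down) adjusted p-values in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):  # rank 0 is the smallest p-value
        # The multiplier shrinks from m down to 1 as the rank increases.
        candidate = min(1.0, (m - rank) * p_values[i])
        # Enforce that adjusted values never decrease with rank.
        running_max = max(running_max, candidate)
        adjusted[i] = running_max
    return adjusted


def bonferroni(p_values):
    """Plain single-step Bonferroni adjustment, for comparison."""
    m = len(p_values)
    return [min(1.0, m * p) for p in p_values]


# Hypothetical raw enrichment p-values for four pathways.
raw = [0.004, 0.020, 0.012, 0.30]
holm = holm_step_down(raw)
bonf = bonferroni(raw)

holm_selected = [i for i, q in enumerate(holm) if q < 0.05]
bonf_selected = [i for i, q in enumerate(bonf) if q < 0.05]
print("step-down selected:", holm_selected)
print("single-step selected:", bonf_selected)
```

With these toy values the step-down variant retains three pathways while single-step Bonferroni retains two, which illustrates why the step-down form is the less conservative of the two.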

Results

The quality of the microarray dataset was assessed and differentially expressed genes were identified

In this study, we re-analyzed the microarray dataset GSE1009 which compares glomeruli samples from DN patients and healthy individuals. Comparison of the two groups by GEO2R revealed that 49 genes were differentially expressed with adjusted p-value <0.05 (Table S1). As different parameters such as the efficiency of RNA extraction and spot detection can influence the validity of microarray experiments, we assessed the suitability of this dataset for further analysis by unsupervised hierarchical clustering and PCA with the data of the 49 genes. Both these methods could differentiate samples based on disease state (normal or DN), indicating the acceptable quality of this dataset (Fig. 1).
Figure 1

The quality of the GSE1009 microarray dataset is satisfactory.

The heat map shows the result of unsupervised hierarchical clustering of diabetic (GSM15968, GSM15969, and GSM15970) and normal (GSM15965, GSM15966, and GSM15967) samples based on the data of the top differentially expressed genes. GSM15969 is the technical replicate of GSM15968, and GSM15966 is the technical replicate of GSM15965. Each row represents a gene and each column represents a sample (A). Principal component analysis was performed on all samples based on the most up- or down-regulated genes. The first principal component (PC1) separates the samples into DN and normal groups (B).

The PPI network and pathway enrichment analysis of differentially expressed genes were not informative

To investigate the interactions among the 49 genes selected from the microarray dataset, we constructed a PPI network using the Cytoscape CluePedia plugin. Although various kinds of interactions supported by different types of evidence (activation, post-translational modification, binding, database, experiment) were allowed, unexpectedly only a few genes were found to interact (Fig. 2A). Next, to infer pathways related to these 49 genes, pathway enrichment analysis was performed, which yielded only 5 pathways with no genes in common. These pathways have not previously been shown to be related to DN (Fig. 2B).
Figure 2

Network construction and pathway enrichment analysis of differentially expressed genes were not informative.

The PPI network of the 49 differentially expressed genes has few edges as these genes do not directly interact (A). Pathway enrichment analysis of these genes could not detect critical pathways in DN. Pathways with adjusted p-value <0.05 are shown (B).

Pathway enrichment analysis of central genes in the enriched PPI network could detect critical pathways in DN

The scarcity of interactions among the 49 genes, all of which were either up- or down-regulated in DN, was unexpected. It is reasonable to assume that in the actual network of DN-related genes, not all genes are regulated at the mRNA level, and such genes are therefore not detected in an mRNA microarray experiment. The absence of these genes makes the interaction network incomplete. Therefore, the PPI network was enriched by adding a maximum of 2 interacting nodes for each gene. This expanded the network from 49 nodes to 137 nodes. The 88 added genes were predicted to interact with the 49 initial genes based on previous knowledge. The PPI network of these 137 genes was constructed with the same parameters applied to the initial network (Fig. 3A).
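The expansion step can be pictured with the following sketch, which adds at most two interaction partners per seed gene from a scored interaction list. It is a hypothetical illustration rather than the CluePedia procedure: the gene names and scores are invented, and choosing partners by highest score above the 0.80 cutoff is an assumption.

```python
def expand_network(seeds, interactions, max_new_per_seed=2, min_score=0.80):
    """Return the seed genes plus up to max_new_per_seed partners per seed.

    interactions is an iterable of (gene_a, gene_b, score) tuples.
    """
    seed_set = set(seeds)
    nodes = set(seeds)
    for seed in seeds:
        candidates = []
        for a, b, score in interactions:
            if score < min_score:
                continue  # below the confidence cutoff
            if a == seed and b not in seed_set:
                candidates.append((score, b))
            elif b == seed and a not in seed_set:
                candidates.append((score, a))
        # Highest score first; ties broken alphabetically for determinism.
        candidates.sort(key=lambda c: (-c[0], c[1]))
        added = []
        for _, partner in candidates:
            if partner not in added:
                added.append(partner)
            if len(added) == max_new_per_seed:
                break
        nodes.update(added)
    return nodes


# Hypothetical seeds and scored interactions (illustration only).
seeds = ["G1", "G2"]
interactions = [
    ("G1", "X1", 0.95),
    ("G1", "X2", 0.90),
    ("G1", "X3", 0.85),  # third-best partner of G1, not added
    ("G2", "X1", 0.99),  # X1 is shared, so it is counted once
    ("G2", "X4", 0.50),  # below the cutoff, ignored
]
expanded = expand_network(seeds, interactions)
print(sorted(expanded))
```

Because partners can be shared between seeds or be absent altogether, the number of added nodes is at most twice the number of seeds, which is consistent with 88 genes being added to 49 seeds.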
Figure 3

Enrichment of the PPI network and selection of central nodes for pathway enrichment analysis can determine pathways essentially related to DN.

The 49-node network was extended with a maximum of two interacting genes for each node. The initial nodes selected from the microarray experiment are shown in red and the added nodes in black (A). In this expanded network, 34 genes were selected as nodes with high centrality. Pathway enrichment analysis with these “central genes” revealed 62 highly connected pathways related to DN. Pathways with adjusted p-value <0.05 are shown (B).

Graph theory concepts such as degree, closeness centrality, and betweenness centrality were employed to assess the topology of this network. The genes were sorted by each of these parameters, and the top 15% of genes by each measure were selected. After accounting for the nodes shared among the three gene lists, a total of 34 genes were finally chosen (Table 1). Pathway enrichment analysis was then performed starting with either the 34 central genes or all 137 genes. Interestingly, the central gene set yielded 62 pathways strongly related to DN (Fig. 3B). These pathways shared several genes and formed a densely connected network (170 edges, edge/node: 2.7). In contrast, pathway enrichment analysis with all 137 genes identified 51 pathways (Fig. S1) with fewer connections to each other (86 edges, edge/node: 1.7).
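The edge/node figures quoted for the two pathway networks can be checked with a few lines of arithmetic. The sketch assumes that "node" here means an enriched pathway, so the ratio is the number of edges divided by the number of pathways; the function name is hypothetical.

```python
def edges_per_node(edges, nodes):
    """Average number of edges per node in a network."""
    return edges / nodes


# Pathway network from the 34 central genes: 62 pathways, 170 edges.
central_ratio = edges_per_node(170, 62)
# Pathway network from all 137 genes: 51 pathways, 86 edges.
all_genes_ratio = edges_per_node(86, 51)

print(f"central genes: {central_ratio:.1f} edges per pathway")
print(f"all genes:     {all_genes_ratio:.1f} edges per pathway")
```

Under this reading the ratios round to 2.7 and 1.7, matching the values reported in the text.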
Table 1

Central genes in the PPI network.

The top 15% genes with the highest degrees, betweenness centrality, and closeness centrality scores in the enriched PPI network are shown.

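| Gene | Degree | Gene | Betweenness centrality | Gene | Closeness centrality |
|---|---|---|---|---|---|
| VEGFA | 23 | IL6 | 0.253 | CHL1 | 0.667 |
| JUN | 20 | JUN | 0.205 | FAF1 | 0.667 |
| ITGB1 | 17 | HRAS | 0.153 | MGAT4B | 0.667 |
| IL6 | 17 | EZR | 0.122 | MGAT4A | 0.667 |
| MYC | 17 | MYC | 0.12 | JUN | 0.408 |
| IL2 | 13 | VEGFA | 0.111 | MYC | 0.388 |
| BMP4 | 13 | VIM | 0.109 | IL6 | 0.384 |
| CSF2 | 12 | CALM1 | 0.107 | VEGFA | 0.38 |
| ITGA6 | 12 | BMP4 | 0.074 | HRAS | 0.366 |
| HRAS | 11 | ITGA6 | 0.069 | ITGA6 | 0.36 |
| KDR | 11 | CYP2C8 | 0.063 | ITGB1 | 0.359 |
| BMPR2 | 11 | PLA2G2D | 0.061 | IL2 | 0.357 |
| EZR | 11 | PLA2G2A | 0.061 | VIM | 0.354 |
| CALM1 | 11 | FLNA | 0.058 | CSF2 | 0.349 |
| FLNA | 10 | PLCE1 | 0.054 | ITGA2 | 0.341 |
| ITGA2 | 10 | TUBB4A | 0.051 | NGF | 0.337 |
| BMPR1A | 10 | THBS1 | 0.051 | TUBB4A | 0.332 |
| FGF1 | 9 | ITGB1 | 0.049 | EZR | 0.329 |
| BMP2 | 9 | IL2 | 0.049 | FLNA | 0.329 |
| FASLG | 9 | CSF2 | 0.048 | FASLG | 0.329 |
| TNNT2 | 9 | ADORA2B | 0.048 | CALM1 | 0.326 |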
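As a consistency check, the 34 central genes can be recovered as the union of the three columns of Table 1. The sketch below uses the gene names exactly as listed in the table; the variable names are hypothetical, and rounding 15% of 137 to the nearest integer is an assumption about how the cutoff was applied.

```python
# Gene lists transcribed from the three columns of Table 1.
top_degree = [
    "VEGFA", "JUN", "ITGB1", "IL6", "MYC", "IL2", "BMP4", "CSF2", "ITGA6",
    "HRAS", "KDR", "BMPR2", "EZR", "CALM1", "FLNA", "ITGA2", "BMPR1A",
    "FGF1", "BMP2", "FASLG", "TNNT2",
]
top_betweenness = [
    "IL6", "JUN", "HRAS", "EZR", "MYC", "VEGFA", "VIM", "CALM1", "BMP4",
    "ITGA6", "CYP2C8", "PLA2G2D", "PLA2G2A", "FLNA", "PLCE1", "TUBB4A",
    "THBS1", "ITGB1", "IL2", "CSF2", "ADORA2B",
]
top_closeness = [
    "CHL1", "FAF1", "MGAT4B", "MGAT4A", "JUN", "MYC", "IL6", "VEGFA",
    "HRAS", "ITGA6", "ITGB1", "IL2", "VIM", "CSF2", "ITGA2", "NGF",
    "TUBB4A", "EZR", "FLNA", "FASLG", "CALM1",
]

# 15% of the 137 nodes, rounded to the nearest integer (assumed rule).
cutoff = round(0.15 * 137)

# Genes that are central by at least one measure.
central_genes = set(top_degree) | set(top_betweenness) | set(top_closeness)
# Genes that are central by all three measures.
shared_genes = set(top_degree) & set(top_betweenness) & set(top_closeness)

print("genes per list:", cutoff)
print("central genes (union):", len(central_genes))
print("central by all three measures:", sorted(shared_genes))
```

The union contains 34 distinct genes, in agreement with the number of central genes reported in the text.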

Discussion

Despite extensive study, the current therapeutic options for most chronic diseases are still unsatisfactory. This may be partly because the simple tools and concepts of classical biology are not well suited to investigating the complexity of chronic disease. The recent development of high-throughput technologies allows gene expression to be assessed at different levels in various biological states. However, there has been a lag between the emergence of these techniques and the introduction of proper mathematical methods for analyzing bulk raw biological data. For a while, it was therefore common to inspect only the few most up- or down-regulated genes individually. With newer analysis methods, it is feasible to infer complex interactions at various levels from simultaneous alterations in the expression of a group of genes. Re-investigating prior omics data with current analysis tools may therefore help produce valuable biomedical knowledge. In this study, the GSE1009 microarray dataset, which compares the mRNA expression profile of glomeruli from DN patients with that of healthy individuals, was used to construct a PPI network. We found that expanding this network and then selecting nodes with high centrality for pathway enrichment analysis is an efficient strategy for inferring critical signaling pathways in DN. In this study, we found 49 genes to be differentially expressed between DN and normal samples. In contrast, in the original study, Baelde et al. (2004) identified 615 DE genes. This discrepancy may be due to the bulk data analysis methods employed in that study. For instance, they used the raw p-values reported by Student’s t-test. However, it is now widely accepted that using unadjusted p-values across many simultaneous tests is associated with a high rate of false positive results.
To address this problem, multiple-testing correction methods such as Bonferroni and Benjamini–Hochberg have been proposed for p-value adjustment (Dudoit & Callow, 2002). Therefore, we considered genes with an adjusted p-value <0.05 as differentially expressed. Based on the DE genes in the microarray experiment, a PPI network was constructed. Interestingly, very few interactions appeared in this network. This could be because we had selected genes solely on the basis of differences in mRNA expression, so that other critical genes regulated at other levels were missing. To fill these gaps in the map of interactions, the network was expanded based on previous knowledge, and a network with 137 nodes was constructed. We then sought to determine the critical nodes in this network. As there is no simple criterion for “biologically important genes”, we analyzed the topology of the network and employed a combination of different centrality measures. Some nodes, such as VEGFA and JUN, have a high degree; they have many connections and are vital for the integrity of the network. Betweenness centrality measures the number of shortest paths passing through a node, so nodes with high betweenness centrality, such as JUN and IL6 in this network, act as shortcuts of the network. In addition, nodes with the highest closeness centrality, such as CHL1 and FAF1 in our network, are, in terms of path length, the nearest genes to the other nodes. Using these parameters, 34 genes were considered to have high centrality. Starting from a set of genes, pathway enrichment analysis allows the top affected functions in a specific disease to be determined. An interesting finding of this study was that pathway enrichment with the set of 34 central genes was more informative than enrichment with the initial 49 genes or even with all 137 genes in the enriched PPI network.
It is widely believed that the functional significance of a protein is related to its position in the PPI network, as deletion of hub proteins is more often lethal than deletion of non-hubs, a phenomenon known as the centrality-lethality rule (Hahn & Kern, 2005; He & Zhang, 2006; Jeong et al., 2001; Yu et al., 2004). Our study demonstrates that central nodes in a PPI network tend to be present in pathways that co-occur in a given biological state and probably form cross-links between pathways. This observation provides an explanation for the functional essentiality of central nodes. Pathway enrichment analysis with the central nodes had acceptable validity, as most of the enriched pathways, including the TGFB, VEGF, MAPK, and BMP signaling pathways, were previously shown to be associated with DN in experimental studies (Toyoda et al., 2004; Turk et al., 2009; Ziyadeh, 2008). With this analysis, we could also identify novel pathways whose role in DN remains to be confirmed in future studies. For instance, the neurotrophin signaling pathway, which has previously been shown to be related to diabetic neuropathy (Pittenger & Vinik, 2003), was among the enriched pathways. Similarly, we detected the platelet degranulation pathway as a potential player in DN. Previous studies have demonstrated the role of this pathway in some profibrotic disorders such as idiopathic pulmonary fibrosis (Crooks et al., 2014; Wynn, 2007). In conclusion, we have introduced a systems biology approach to DN as a complex biological state. The methods employed in this study may also be used for other chronic diseases to suggest novel therapies by generating a holistic, multi-level insight.

Differentially expressed genes

Forty-nine genes were differentially expressed between normal and DN samples with adjusted p-value <0.05. The genes are sorted by log2 of fold change (LogFC).

Pathway enrichment analysis with all genes in the enriched network

Pathway enrichment analysis with all 137 genes in the enriched PPI network revealed 51 pathways that were less connected to each other compared to pathways inferred from the central 34 genes. Pathways with adjusted p-value <0.05 are shown.
References (10 of 25 listed)

1. I Swameye, T G Muller, J Timmer, O Sandra, U Klingmuller. Identification of nucleocytoplasmic cycling as a remote sensor in cellular signaling by databased modeling. Proc Natl Acad Sci U S A. 2003-01-27.

2. Paul Shannon, Andrew Markiel, Owen Ozier, Nitin S Baliga, Jonathan T Wang, Daniel Ramage, Nada Amin, Benno Schwikowski, Trey Ideker. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res. 2003-11.

3. Haiyuan Yu, Dov Greenbaum, Hao Xin Lu, Xiaowei Zhu, Mark Gerstein. Genomic analysis of essentiality within protein networks (Review). Trends Genet. 2004-06.

4. Thomas A Wynn. Common and unique mechanisms regulate fibrosis in various fibroproliferative diseases (Review). J Clin Invest. 2007-03.

5. Tanya Barrett, Dennis B Troup, Stephen E Wilhite, Pierre Ledoux, Dmitry Rudnev, Carlos Evangelista, Irene F Kim, Alexandra Soboleva, Maxim Tomashevsky, Kimberly A Marshall, Katherine H Phillippy, Patti M Sherman, Rolf N Muertter, Ron Edgar. NCBI GEO: archive for high-throughput functional genomic data. Nucleic Acids Res. 2008-10-21.

6. Jonathan M Starkey, Yingxin Zhao, Rovshan G Sadygov, Sigmund J Haidacher, Wanda S Lejeune, Nilay Dey, Bruce A Luxon, Maureen A Kane, Joseph L Napoli, Larry Denner, Ronald G Tilton. Altered retinoic acid metabolism in diabetic mouse kidney identified by 18O isotopic labeling and 2D mass spectrometry. PLoS One. 2010-06-14.

7. Yuanmeng Jin, Krishna Ratnam, Peter Y Chuang, Ying Fan, Yifei Zhong, Yan Dai, Amin R Mazloom, Edward Y Chen, Vivette D'Agati, Huabao Xiong, Michael J Ross, Nan Chen, Avi Ma'ayan, John Cijiang He. A systems approach identifies HIPK2 as a key regulator of kidney fibrosis. Nat Med. 2012-03-11.

8. Xionglei He, Jianzhi Zhang. Why do hubs tend to be essential in protein networks? PLoS Genet. 2006-04-26.

9. Michael G Crooks, Ahmed Fahim, Khalid M Naseem, Alyn H Morice, Simon P Hart. Increased platelet reactivity in idiopathic pulmonary fibrosis is mediated by a plasma factor. PLoS One. 2014-10-22.

10. Gunnar Cedersund, Jacob Roll, Erik Ulfhielm, Anna Danielsson, Henrik Tidefelt, Peter Strålfors. Model-based hypothesis testing of key mechanisms in initial phase of insulin signaling. PLoS Comput Biol. 2008-06-20.