Hui-Ling Wang1, Jing Liu2, Zhao-Min Qin3. 1. First Gynecological Ward, Binzhou People's Hospital, Binzhou, Shandong 256610, P.R. China. 2. Department of Health Checkup, Binzhou People's Hospital, Binzhou, Shandong 256610, P.R. China. 3. Department of Nursing, Shandong Medical College, Jinan, Shandong 250002, P.R. China.
Abstract
The aim of the present study was to identify differential pathways in uterine leiomyomata (UL) using a novel method based on protein-protein interaction networks and pathway analysis. The Pathway networks were constructed by examining the intersections of the Reactome database and the Search Tool for the Retrieval of Interacting Genes/proteins (STRING) protein-protein interaction (PPI) network. The Objective network was defined as the differentially expressed genes (DEGs) together with the interactions among them identified by STRING. Topological centrality (degree) analysis was performed on the Objective network to identify the hub genes and Hub networks. After isolating the intersections between the Pathway and Objective networks, randomization tests were conducted to identify the differential pathways. There were 559,598 interactions in the Pathway networks. A total of 657 genes with 3,835 interactions were mapped in the Objective network, which included 20 hub genes. A total of 358 pathways shared interactions with the Objective network, including Signal Transduction, Immune System and Signaling by G-protein-coupled receptor (GPCR). In the randomization tests, the P-values of these pathways were close to 0, which indicated that they were significantly different. The present study identified differential pathways (such as signal transduction, immune system and signaling by GPCR) in UL, which may be potential biomarkers for the detection and treatment of UL.
Uterine leiomyomata (UL), a benign neoplasm deriving from the myometrial compartment of the uterus, is the most widespread gynecological problem in females (1). The common symptoms associated with UL are pelvic pain, discomfort, menstrual disorders and infertility (2). Surgery is the primary treatment modality, and tumors are often resistant to chemotherapy and radiation therapy (3). To date, adjuvant therapy has not demonstrated a significant survival advantage (3). Although surgical staging and nomograms may assist in predicting clinical outcome, the 5-year survival rate for uterus-confined disease remains <50% (4). Understanding the molecular biology of UL may provide additional prognostic and therapeutic insights.
Advances in high-throughput experimental technologies have allowed them to be applied to explore the diagnostic gene signatures and biological processes of human diseases (5), which may provide novel insights into the underlying biological mechanisms of UL. Microarray experiments have revealed that fibroid development may be due to abnormal tissue repair and an altered extracellular matrix (6). The levels of the inflammatory cytokine transforming growth factor-β (TGF-β) were increased 3-fold in fibroid tissue relative to myometrium (7,8). However, a number of investigations into UL pathogenesis lack physiological relevance, as they were based solely on a small number of individual genes and cell lines (9). The molecular pathways underlying UL development and growth acceleration are largely unknown, and the majority of previous results have stemmed from studying recurrent cytogenetic abnormalities identified among the 40% of abnormal UL (10); gene expression profiles may therefore be a useful additional resource for research.
A variety of methods have been developed for the analysis of gene expression microarray data, but only a small number of methods exist for using these data to quantify the interrelated behavior of genes within a gene interaction network (11).
Even though tumor incidence is hypothesized to be closely associated with the abnormal expression of numerous genes, studies on differentially expressed genes (DEGs) are inadequate, and considerable work is required to fully realize the potential of these DEGs. Therefore, studies investigating gene interactions are essential, as these interactions serve important roles in the biological processes underlying cancer development (12). Network-based approaches that use information concerning the interactions between gene pairs have emerged as powerful tools for the systematic understanding of the molecular mechanisms underlying these processes, and several algorithms have been developed to study such biological networks. Barter et al (13) performed a comparative analysis and identified that the network-based method was more stable compared with single-gene and gene-set methods. However, only a small number of studies have identified differential pathways using network-based approaches (11,14).
Therefore, in the present study, a novel method to identify differential pathways in UL based on gene interaction networks and pathway analysis was proposed. To achieve this, the first step was to construct the networks (Pathway, Objective and Hub networks) and analyze their topological properties. Subsequently, the intersections between the Pathway and Objective networks, and between the Pathway and Hub networks, were isolated and randomization tests were performed to identify differential pathways in UL. This novel method may be an efficient supplement for identifying differential pathways.
Materials and methods
This novel method consisted of four main components: Pathway network identification, Objective network construction, Hub network extraction and differential pathway evaluation. The workflow used to identify differential pathways is presented in Fig. 1.
Figure 1.
Method of identifying differential pathways between UL and normal controls. PPIs, protein-protein interactions; DEGs, differentially expressed genes; STRING, Search Tool for the Retrieval of Interacting Genes/proteins.
Pathway network identification
Networks may provide new insights for mining unknown connections in incomplete networks. Although large-scale protein interaction data have accumulated with the development of high-throughput testing technology, a certain number of significant interactions have not been tested (14). This difficulty may be resolved to a certain extent by utilizing sub-networks of the complex network (15). Therefore, in the present study, Pathway networks were identified by exploring the interactions of pathway-enriched genes within the global human protein-protein interaction (PPI) network from the Search Tool for the Retrieval of Interacting Genes/proteins (STRING) database (string-db.org; accessed August 24, 2015) (16). The pathway-enriched genes originated from the Reactome pathway database (reactome.org; accessed July 13, 2015), which is a manually curated open-data resource of human pathways and reactions (17).
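The extraction step described above can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that the PPI network is available as a list of gene pairs and that each pathway is a list of gene symbols; the function name `pathway_network` and the toy data are hypothetical and are not taken from the original analysis.

```python
def pathway_network(ppi_edges, pathways):
    """Map each pathway to the PPI edges whose two endpoints both lie in it."""
    # Treat interactions as undirected and drop self-loops and duplicates.
    edges = {frozenset(edge) for edge in ppi_edges if edge[0] != edge[1]}
    network = {}
    for name, genes in pathways.items():
        members = set(genes)
        # An edge belongs to the pathway when both of its genes are members.
        network[name] = {edge for edge in edges if edge <= members}
    return network


# Toy data (hypothetical): four PPI records, one of them a reversed duplicate.
ppi_edges = [("JUN", "FOS"), ("FOS", "JUN"), ("EGFR", "JUN"), ("IL6", "CD44")]
pathways = {
    "Toy pathway A": ["JUN", "FOS", "EGFR"],
    "Toy pathway B": ["IL6", "JUN"],
}

net = pathway_network(ppi_edges, pathways)
# Summing over pathways counts an interaction once per pathway containing it.
total = sum(len(edge_set) for edge_set in net.values())
```

Because the total is summed per pathway, an interaction that falls into two pathways is counted twice, which is one way the duplicate interactions mentioned in the Results could arise.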
Objective network construction
Data collection and pretreatment
A total of two gene expression profiles [E-GEOD-18096 (18) and E-GEOD-64763 (19)] for UL and normal controls were collected from the ArrayExpress database (http://www.ebi.ac.uk/arrayexpress/). E-GEOD-18096, which was generated on the A-AFFY-44-Affymetrix GeneChip Human Genome U133 Plus 2.0 [HG-U133_Plus_2] platform, comprised 18 UL samples and 9 normal controls. E-GEOD-64763 was generated on the A-AFFY-37-Affymetrix GeneChip Human Genome U133A 2.0 [HG-U133A_2] platform and consisted of 25 UL samples and 29 normal controls. In all, there were 43 UL samples and 38 normal controls in the two gene expression profiles.
Pretreatment of the microarray expression data was performed to control quality at the probe level. The preprocessing included four standard procedures: i) Background correction (20); ii) normalization (21); iii) probe correction (22); and iv) summarization (20). The preprocessed probe-level datasets in CEL format were converted into expression measures, and then screened using the feature filter method of the genefilter package (23). Finally, a total of 20,102 and 12,493 genes were obtained from the E-GEOD-18096 and E-GEOD-64763 profiles, respectively. Subsequently, the empirical Bayes method in the inSilicoMerging package version 1.15.0 was utilized to merge the two preprocessed gene expression profiles into a single dataset (24), which included 12,493 genes for additional analysis.
DEGs detection
DEGs between UL and normal controls were identified using the linear models for microarray data (Limma) package version 3.30.0 (University of California, Berkeley, CA, USA) (25). All genes were subjected to t-tests and F-tests, and a linear fit, empirical Bayes statistics and false discovery rate correction were then applied to the data using the lmFit function (26). DEGs were selected for additional study with the thresholds of P<0.05 and |logFoldChange|>2.
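The thresholding rule can be written out as a short Python sketch. This is not the Limma workflow itself; it only illustrates the final filter, assuming that a log fold change and a P-value have already been computed for each gene. The function name `select_degs` and the gene labels are hypothetical.

```python
def select_degs(stats, p_cutoff=0.05, lfc_cutoff=2.0):
    """Return the genes passing both thresholds, in alphabetical order.

    `stats` maps a gene symbol to a (log fold change, P-value) pair.
    """
    return sorted(
        gene
        for gene, (log_fc, p_value) in stats.items()
        if p_value < p_cutoff and abs(log_fc) > lfc_cutoff
    )


# Toy statistics (hypothetical values, not taken from the study).
toy_stats = {
    "GENE_A": (2.6, 0.001),   # passes both thresholds
    "GENE_B": (-3.1, 0.020),  # passes: strongly down-regulated
    "GENE_C": (1.2, 0.0004),  # fails: fold change too small
    "GENE_D": (2.4, 0.30),    # fails: not significant
}

degs = select_degs(toy_stats)
```

The absolute value means that up-regulated and down-regulated genes are retained alike.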
Objective network construction
Certain significant genes may not be identifiable through their own behavior, but their changes are quantifiable when considered in conjunction with other genes, such as in a network (27). In the present study, a human PPI dataset from STRING was utilized to capture the interactions among DEGs. The interactions were visualized using Cytoscape version 3.1.0 (Institute for Systems Biology, San Diego, CA, USA), and the resulting PPI network was defined as the Objective network. Cytoscape is a free software package for visualizing, modeling and analyzing the integration of biomolecular interaction networks with high-throughput expression data and other molecular states (28).
Hub network extraction
One of the fundamental problems in network analysis is to determine the importance of a particular node, or of an interaction between two nodes, in a network; quantifying centrality and connectivity assists in identifying portions of the network that may serve notable roles (29). In the present study, the biological importance of genes in the Objective network was characterized using an index of topological centrality, namely degree centrality. The genes at or above the 97% quantile of the degree distribution were defined as hub genes. In addition, the network composed of the hub genes and their interactions was denoted as a Hub network.
‘Degree’ quantifies the local topology of each gene by summing up the number of its adjacent genes (j), and provides a simple count of the number of interactions of a given node (30). The degree C_D(v) of a node v was calculated as follows:
$C_D(v) = \sum_{j} a_{vj}$
where $a_{vj}=1$ if genes v and j interact and $a_{vj}=0$ otherwise.
In addition, the association between the number of genes and the degree distribution was analyzed, and the fitting coefficient R^2 of the power law for the Objective network was determined, as PPI networks in general are modular and scale-free, meaning that they exhibit power-law degree distributions (31,32). The Network Analyzer 2.7 (Institute for Systems Biology) plugin in Cytoscape 3.1.0 was used to evaluate the topological parameters.
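The degree calculation and the hub-gene cutoff can be sketched in Python as follows. Two points are assumptions made for illustration rather than details stated in the text: the 97% quantile is computed with a nearest-rank rule, and the Hub network is taken to be every interaction that involves at least one hub gene. All function names and the toy network are hypothetical.

```python
import math
from collections import defaultdict


def degree_centrality(edges):
    """Degree of each gene: the number of distinct genes adjacent to it."""
    neighbours = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            neighbours[u].add(v)
            neighbours[v].add(u)
    return {gene: len(adjacent) for gene, adjacent in neighbours.items()}


def hub_genes(degrees, quantile=0.97):
    """Genes whose degree is at or above the given quantile (nearest rank)."""
    ranked = sorted(degrees.values())
    index = max(0, math.ceil(quantile * len(ranked)) - 1)
    cutoff = ranked[index]
    return {gene for gene, degree in degrees.items() if degree >= cutoff}


def hub_network(edges, hubs):
    """Assumed definition: interactions with at least one hub-gene endpoint."""
    return {tuple(sorted((u, v))) for u, v in edges if u in hubs or v in hubs}


# Toy network (hypothetical): gene H touches nine others; L1 and L2 also interact.
toy_edges = [("H", f"L{i}") for i in range(1, 10)] + [("L1", "L2")]

degrees = degree_centrality(toy_edges)
hubs = hub_genes(degrees)
hub_net = hub_network(toy_edges, hubs)
```

With a nearest-rank rule, a 97% cutoff on 657 genes keeps the 20 highest-degree genes plus any ties at the cutoff, which is consistent with the number of hub genes reported in the Results.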
Differential pathways evaluation
The Pathway, Objective and Hub networks were constructed, but selecting differential pathways based on the three types of network was challenging. To overcome this problem, the interactions shared between the Pathway and Objective networks, and between the Pathway and Hub networks, were identified, and the number of shared interactions was denoted as a ‘count’. Subsequently, randomization tests were employed to determine the P-value of each pathway from the intersected interactions.
Randomization tests provide a general means of constructing tests that control size in finite samples whenever the distribution of the observed data exhibits symmetry under the null hypothesis (33). Let T(X) be a real-valued test statistic such that large values provide evidence against the null hypothesis. The M values of the test statistic were ordered and k=M(1-λ) was denoted; using this notation, the randomization tests were performed according to the formulas given in reference (34). For any λ ∈ (0, 1), the test function φ(X) defined by these formulas satisfied P[φ(X)]=λ. In the present study, T(X) represented random networks that comprised intersected interactions, φ(X) stood for each pathway and P represented the significance of the pathway. If P<0.05, the pathway was considered to be a differential pathway compared with normal controls.
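The idea behind the count and its randomization P-value can be sketched in Python. The text does not specify how the random networks were generated, so this sketch assumes one simple scheme: random networks are drawn by sampling, from a background PPI network, as many interactions as the Objective network contains, and the P-value is the proportion of random networks whose count is at least as large as the observed count. The function names, the sampling scheme and the toy data are all hypothetical, and this Monte Carlo P-value stands in for the exact test function φ(X) of reference (34).

```python
import random
from itertools import combinations


def intersection_count(pathway_edges, network_edges):
    """The 'count': number of interactions shared by a pathway and a network."""
    return len(set(pathway_edges) & set(network_edges))


def randomization_p_value(pathway_edges, objective_edges, background_edges,
                          n_random=1000, seed=0):
    """Monte Carlo P-value for the observed count (assumed sampling scheme)."""
    rng = random.Random(seed)
    background = sorted(set(background_edges))
    size = len(set(objective_edges))
    observed = intersection_count(pathway_edges, objective_edges)
    at_least_as_large = 0
    for _ in range(n_random):
        # A random network with the same number of interactions.
        random_network = rng.sample(background, size)
        if intersection_count(pathway_edges, random_network) >= observed:
            at_least_as_large += 1
    # Add-one correction keeps the estimate strictly above zero.
    return observed, (at_least_as_large + 1) / (n_random + 1)


# Toy data (hypothetical): 12 genes, every gene pair is a background interaction.
genes = [f"g{i:02d}" for i in range(12)]
background_edges = list(combinations(genes, 2))
pathway_edges = list(combinations(genes[:4], 2))  # 6 interactions
objective_edges = pathway_edges + [("g04", "g05"), ("g06", "g07")]

observed, p_value = randomization_p_value(
    pathway_edges, objective_edges, background_edges
)
```

With a finite number of random networks the estimate cannot reach exactly 0, so P-values reported as close to or equal to 0 correspond here to the smallest value the add-one correction allows.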
Results
Pathway network
There were 787,896 interactions in the human STRING PPI network, while 1,675 human pathways were identified in the Reactome database. Interactions between pathway-enriched genes were extracted from the STRING database. A total of 559,598 gene-gene interactions were obtained, which formed the Pathway network. The 559,598 interactions may include duplicates, since a single interaction may be enriched in ≥2 pathways and is then counted once for each.
Objective network construction and analysis
A total of 903 DEGs between patients with UL and normal controls were identified using the Limma package with thresholds of P<0.05 and |logFoldChange|>2. When these DEGs were input into the STRING database, 3,835 gene-gene interactions were obtained. Using Cytoscape, 657 genes with 3,835 interactions were mapped into the Objective network (Fig. 2). To additionally investigate the importance of individual genes in the Objective network, degree centrality analysis was conducted, and the degree distribution is presented in Fig. 3. The network analysis demonstrated that the Objective network exhibited a scale-free property, with a degree distribution that followed a power law (y=ax^b, where a=117.0 and b=−0.775) with a fitting coefficient of R^2=0.956.
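A power-law fit of this kind can be sketched in Python. The fitting procedure used by the Network Analyzer plugin is not described in the text, so this sketch assumes ordinary least squares on log-transformed data, where x is a degree value and y is the number of genes with that degree. The function name `fit_power_law` and the toy degrees are hypothetical.

```python
import math
from collections import Counter


def fit_power_law(degrees):
    """Fit y = a * x**b by least squares on (log x, log y).

    x is a degree value and y is the number of genes with that degree.
    Returns (a, b, r_squared), with r_squared computed on the log scale.
    """
    counts = Counter(d for d in degrees if d > 0)
    xs = [math.log(degree) for degree in counts]
    ys = [math.log(counts[degree]) for degree in counts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = s_xy / s_xx
    log_a = mean_y - b * mean_x
    ss_res = sum((y - (log_a + b * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return math.exp(log_a), b, 1.0 - ss_res / ss_tot


# Toy degrees (hypothetical) built to follow y = 64 * x**-2 exactly:
# 64 genes of degree 1, 16 of degree 2, 4 of degree 4 and 1 of degree 8.
toy_degrees = [1] * 64 + [2] * 16 + [4] * 4 + [8] * 1

a, b, r_squared = fit_power_law(toy_degrees)
```

A negative exponent b, as in the reported fit, means that most genes have few interactions while a small number of genes have many.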
Figure 2.
Objective network. Nodes represent genes, and lines between two nodes represent gene-gene interactions. Pink nodes are hub genes, which were defined as genes at or above the 97% quantile of the degree distribution. Purple nodes are the remaining genes.
Figure 3.
Gene degree distribution in the Objective network. The Objective network was a scale-free network whose degree distribution followed a power law (y=ax^b, where a=117.0 and b=−0.775) with a fitting coefficient of R^2=0.956. PHLPP1, PH domain and leucine rich repeat protein phosphatase 1; VEGFA, vascular endothelial growth factor A; EGR1, early growth response 1; CCND1, cyclin D1; IL6, interleukin 6; MMP9, matrix metalloproteinase 9; ICAM1, intercellular adhesion molecule 1; FOS, Fos proto-oncogene, AP-1 transcription factor subunit; EGFR, epidermal growth factor receptor; JUN, Jun proto-oncogene, AP-1 transcription factor subunit.
Hub network extraction
In the present study, the genes at or above the 97% quantile of the ‘degree’ distribution in the Objective network were defined as hub genes. The degree was calculated by summing up the number of adjacent genes. Consequently, a total of 20 hub genes were identified: Jun proto-oncogene, AP-1 transcription factor subunit (degree=125), epidermal growth factor receptor (degree=113), Fos proto-oncogene, AP-1 transcription factor subunit (degree=108), interleukin (IL)-6 (degree=85), cyclin D1 (degree=77), matrix metalloproteinase 9 (degree=72), intercellular adhesion molecule 1 (degree=68), early growth response 1 (degree=67), vascular endothelial growth factor A (degree=64), protein tyrosine phosphatase, receptor type C (degree=60), KIT proto-oncogene receptor tyrosine kinase (degree=59), peroxisome proliferator activated receptor gamma (degree=55), toll like receptor 4 (degree=51), topoisomerase (DNA) II α (degree=51), serpin family E member 1 (degree=50), cluster of differentiation (CD)44 (degree=48), CD40 (degree=47), Acetyl-CoA carboxylase β (degree=47), PH domain and leucine rich repeat protein phosphatase 1 (degree=47) and Ras-related C3 botulinum toxin substrate 2 (Rho family, small GTP binding protein Rac2) (degree=47). The network composed of the hub genes and their interactions was denoted as the Hub network, which was a sub-network of the Objective network. Fig. 4 presents the largest of the Hub networks, which included 10 hub genes.
Figure 4.
Hub network. There were 10 hub genes in the network: EGFR, IL6, CCND1, MMP9, EGR1, KIT, SERPINE1, CD44, CD40 and PHLPP1. Nodes represent genes, and lines between two nodes represent gene-gene interactions. Yellow nodes are hub genes and purple nodes are the remaining genes. EGFR, epidermal growth factor receptor; IL6, interleukin 6; CCND1, cyclin D1; MMP9, matrix metalloproteinase 9; EGR1, early growth response 1; KIT, KIT proto-oncogene receptor tyrosine kinase; SERPINE1, serpin family E member 1; CD, cluster of differentiation; PHLPP1, PH domain and leucine rich repeat protein phosphatase 1.
Differential pathway identification
In the present study, randomization tests were implemented to identify differential pathways of UL based on the interactions shared between the Pathway networks and the Objective network, and between the Pathway networks and the Hub networks. Examination of the intersections between the Pathway networks and the Objective network revealed that 358 pathways shared interactions with the Objective network, but the number of shared interactions differed markedly between pathways; those with counts ≥20 are listed in Table I. ‘Count’ signifies the number of intersecting interactions. The top five pathways were signal transduction (count, 100), extracellular matrix organization (count, 87), immune system (count, 81), signaling by GPCR (count, 79) and GPCR downstream signaling (count, 60). A total of 28.2% of the 358 pathways were signaling pathways.
Table I.
Intersections (≥20) between the pathway and objective networks.
Pathway | Count
Signal transduction | 100
Extracellular matrix organization | 87
Immune system | 81
Signaling by GPCR | 79
GPCR downstream signaling | 60
Innate immune system | 49
G α (i) signaling events | 34
GPCR ligand binding | 34
Class A/1 (rhodopsin-like receptors) | 31
Collagen formation | 31
Assembly of collagen fibrils and other multimeric structures | 30
Cytokine signaling in immune system | 25
Developmental biology | 25
Hemostasis | 24
Metabolism of lipids and lipoproteins | 22
Gastrin-cyclic adenosine 5′-monophosphate response element binding protein signaling pathway via protein kinase C and mitogen-activated protein kinase | 21
Metabolism | 21
Gene expression | 20
GPCR, G-protein-coupled receptor.
Concurrently, a total of 162 pathways shared interactions with the Hub networks, and Table II summarizes the pathways with counts ≥10. Immune system, signal transduction, innate immune system, hemostasis and signaling by GPCR were the top five in descending order, with counts of 49, 42, 29, 19 and 19, respectively. All 162 pathways were also among those identified in the intersections between the Pathway networks and the Objective network. This overlap supported the feasibility and accuracy of the method in identifying the differential pathways in UL.
Table II.
Intersections (≥10) between the pathway and hub networks.
If P<0.05, the pathway was considered to be a differential pathway. Notably, the majority of the P-values were close to or equal to 0, which suggested that these pathways were significantly differential. As the P-values of the differential pathways were similar, the count may serve as an additional measure of the significance of a pathway. The higher the count, the closer the association between the pathway and UL; for example, signal transduction had a count of 100.
Discussion
To identify the differential pathways of UL, a novel method based on Pathway, Objective and Hub networks was proposed. The topological properties of gene interaction networks have been studied widely (30). It has been indicated that gene interaction networks also have scale-free properties (35,36), which are typical of biological networks. Featherstone and Broadie (37) demonstrated that the scale-free distribution of gene degrees in a network helps organisms to resist the deleterious effects of mutation. Similar architecture was also identified in the gene co-expression networks of gastric cancer (38). In the present study, the Objective network was revealed to be a scale-free network whose node degree distribution followed a power law with a fitting coefficient of R^2=0.956, which supported the reliability and feasibility of the network-based method.
A total of 358 differential pathways were identified based on the networks and randomization tests with P<0.05, for example, signal transduction, immune system and signaling by GPCR. In addition, the differential pathways obtained from the Hub networks were all among these 358 pathways, which is attributable to the Hub network being a sub-network of the Objective network, and supported the repeatability of the present study.
In detail, 28.2% of the 358 differential pathways were associated with signaling, for example, signal transduction and signaling by GPCR. Signal transduction occurs when an extracellular signaling molecule activates a specific receptor located on the cell surface or inside the cell, which in turn triggers a biochemical chain of events inside the cell, creating a response (39). Depending on the cell, the response alters the metabolism, shape or gene expression of the cell, or its ability to divide; dysregulation of these processes may lead to cancer (40).
It has been suggested that certain microbial molecules, such as viral nucleotides and protein antigens, may elicit an immune system response against invading pathogens, mediated by signal transduction processes (41). Gene activation and alterations in metabolism are examples of cellular responses to extracellular stimulation that require signal transduction (42). The mitogen-activated protein kinase/extracellular-signal related protein kinase pathway couples intracellular responses to the binding of growth factors to cell surface receptors, and its activation promotes cell division; aberrations in this pathway are associated with numerous forms of cancer and with UL (43). Therefore, signal transduction serves a significant role in UL development.
The immune system comprises a number of biological structures and processes within an organism that protect against disease. To function properly, it detects a wide variety of pathogens (from viruses to parasitic worms) and distinguishes them from the organism's own healthy tissue (44). In a number of species, the immune system may be classified into subsystems, such as the innate immune system versus the adaptive immune system, or humoral immunity versus cell-mediated immunity. ‘Innate immune system’ was an additional important differential pathway in the present study. The present study indicated that UL development may be triggered, at least in part, by a chronically active inflammatory immune system. The concept of inflammation supports a theory of fibroid development based on an altered response to noxious stimuli, possibly tissue injury from menstrual blood extravasated into the myometrium, or hypoxia, leading to altered tissue repair and fibroids (45). It has been demonstrated that insight into leiomyoma formation may be acquired through investigation of the immune system (46).
Complex interactions between the endocrine and immune systems govern the key endometrial events, and inflammatory pathway dysfunction is present in the endometria of women with endometriosis and uterine fibroids (47). Santulli et al (48) revealed that IL-33 was released as a danger signal, alerting the immune system following endogenous stimulation, and that elevated serum IL-33 levels were associated with the existence of UL. The present study identified IL6 as a hub gene in UL, which may also participate in signaling activity and serve a critical role in UL.
In conclusion, the present study identified differential pathways (such as signal transduction, immune system and signaling by GPCR) in UL, which may provide potential insights into the detection and treatment of UL.