Wengui Xie1, Lixin Ji1, Teng Zhao1, Pengfei Gao1. 1. Department of Spinal Surgery, North Medical District of Linyi People's Hospital Group, Linyi, Shandong, China (mainland).
Abstract
BACKGROUND: A number of genes have been identified to be related with primary osteoporosis while less is known about the comprehensive interactions between regulating genes and proteins. We aimed to identify the differentially expressed genes (DEGs) and regulatory effects of transcription factors (TFs) involved in primary osteoporosis. MATERIAL/ METHODS: The gene expression profile GSE35958 was obtained from Gene Expression Omnibus database, including 5 primary osteoporosis and 4 normal bone tissues. The differentially expressed genes between primary osteoporosis and normal bone tissues were identified by the same package in R language. The TFs of these DEGs were predicted with the Essaghir A method. DAVID (The Database for Annotation, Visualization and Integrated Discovery) was applied to perform the GO (Gene Ontology) and KEGG (Kyoto Encyclopedia of Genes and Genomes) pathway enrichment analysis of DEGs. After analyzing regulatory effects, a regulatory network was built between TFs and the related DEGs. RESULTS: A total of 579 DEGs was screened, including 310 up-regulated genes and 269 down-regulated genes in primary osteoporosis samples. In GO terms, more up-regulated genes were enriched in transcription regulator activity, and secondly in transcription factor activity. A total 10 significant pathways were enriched in KEGG analysis, including colorectal cancer, Wnt signaling pathway, Focal adhesion, and MAPK signaling pathway. Moreover, total 7 TFs were enriched, of which CTNNB1, SP1, and TP53 regulated most up-regulated DEGs. CONCLUSIONS: The discovery of the enriched TFs might contribute to the understanding of the mechanism of primary osteoporosis. Further research on genes and TFs related to the WNT signaling pathway and MAPK pathway is urgent for clinical diagnosis and directing treatment of primary osteoporosis.
BACKGROUND: A number of genes have been identified to be related with primary osteoporosis while less is known about the comprehensive interactions between regulating genes and proteins. We aimed to identify the differentially expressed genes (DEGs) and regulatory effects of transcription factors (TFs) involved in primary osteoporosis. MATERIAL/ METHODS: The gene expression profile GSE35958 was obtained from Gene Expression Omnibus database, including 5 primary osteoporosis and 4 normal bone tissues. The differentially expressed genes between primary osteoporosis and normal bone tissues were identified by the same package in R language. The TFs of these DEGs were predicted with the Essaghir A method. DAVID (The Database for Annotation, Visualization and Integrated Discovery) was applied to perform the GO (Gene Ontology) and KEGG (Kyoto Encyclopedia of Genes and Genomes) pathway enrichment analysis of DEGs. After analyzing regulatory effects, a regulatory network was built between TFs and the related DEGs. RESULTS: A total of 579 DEGs was screened, including 310 up-regulated genes and 269 down-regulated genes in primary osteoporosis samples. In GO terms, more up-regulated genes were enriched in transcription regulator activity, and secondly in transcription factor activity. A total 10 significant pathways were enriched in KEGG analysis, including colorectal cancer, Wnt signaling pathway, Focal adhesion, and MAPK signaling pathway. Moreover, total 7 TFs were enriched, of which CTNNB1, SP1, and TP53 regulated most up-regulated DEGs. CONCLUSIONS: The discovery of the enriched TFs might contribute to the understanding of the mechanism of primary osteoporosis. Further research on genes and TFs related to the WNT signaling pathway and MAPK pathway is urgent for clinical diagnosis and directing treatment of primary osteoporosis.
Primary osteoporosis, a skeletal disease characterized by imbalanced bone homeostasis, generally consists of postmenopausal osteoporosis and senile osteoporosis [1]. Primary osteoporosis can occur in different age stages, but mainly in elderly people and postmenopausal women [2,3]. In the US, over 50% of people who are age of 50 and above suffer from osteoporosis and nearly 80% of them are female [4].There are several reasons for the occurrence of primary osteoporosis. Mainly, it is caused by micro-architectural deterioration of bone tissues, which lead to decreased skeletal strength and increased susceptibility to fractures. Thus, there are higher risks of fragile fractures of vertebrae, femoral neck, and other typical localizations of lower incidence [1,5].The imbalance between bone resorption and formation is the main mechanism of osteoporosis. Normally, osteoclast cells resorb bone in bone marrow and deposit new bone by osteoblast [6]. Recently, it has been found that the mutation of osteoclast genes such as CSF1(colony stimulating factor 1) [7], PTH1R (parathyroid hormone 1 receptor), and osteoporosis related genes such as, LRP5 (low-density lipoprotein receptor-related protein 5) [8], WNT3A (Wingless-type MMTV integration site family, member 3A) [9], RANKL (Receptor Activator for Nuclear Factor-κ B Ligand), COL1A1 (Collagen, type I, alpha 1) [10], and RUNX2 (Runt-related transcription factor 2) [11] are relevant to osteoporosis. For example, CSF1 plays an important role in the development of bone, and it is vital for osteoclast proliferation, osteoclast differentiation, and bone resorption [12]. PTH1R signaling is associated with the bone growth of cartilaginous growth plate postpartum [13], and treatment with parathyroid hormone (PTH) is an approach for osteoporosis. However, the balance is hard to control. With the continuous activation of PTH1R, it could activate over-expression of RANKL, which would stimulates osteoclast formation and bone resorption [14]. LRP5 mutations can reduce the activity of Wnt signaling and therefore lead to a decrease in bone formation [8]. WNT3A is critically associated with bone mineral density (BMD) [9] and COL1A1 is related to BMD reduction and increased fracture risk [10].Therefore, in response to this skeletal disease, it is significant to analyze the related genes and transcription factors (TFs) that cause this disease. However, most of the current reports were focusing on related genes individually with unilateral analysis, which leads to a lack of understanding of the comprehensive interactions between regulating genes and proteins. Even by some modern methods such as meta-analysis, the relevant information could not be obtained easily because of many essential issues associated with annotation and comparison of cross-platforms.In this paper, we analyzed the gene expressing profiles to investigate the differentially expressed genes (DEGs) of normal and the osteoporosis samples. TFs would be predicted with network tools. Then protein-protein interaction networks of these TFs were constructed using HPRD (the Human Protein Reference Database) and DIP (the Database of Interacting Proteins) databases. In addition, DAVID (The Database for Annotation, Visualization and Integrated Discovery) was applied to perform the GO (Gene Ontology) and KEGG (Kyoto Encyclopedia of Genes and Genomes) pathway enrichment analysis of DEGs and the corresponding TFs. Understanding the interactions between DEGs of TFs might provide references for the further exploration of the pathogenic mechanism of osteoporosis.
Material and Methods
Under agreement of the ethics committee of the University of Würzburg, bone marrow was used with written informed consent of each patient [14]. Non-osteoporotic donors were people who had total hip arthroplasty because of osteoarthritis and/or hip dysplasia. There were no nicotine abuse, alcohol abuse, usage of medications for these people.
Affymetrix microarray analysis
The gene expression profile dataset of GSE35958 [14] was obtained from GEO () database which is based on the Affymetrix Human Genome U133 Plus 2.0 Array platform. A total of 9 chips were available, including 5 human mesenchymal stem cell (MSC) samples from primary osteoporosispatients (79–94 years old, 5 females) and 4 age-matched human MSC of non-osteoporotic donors’ control samples (donor age 79–89 years, 3 females and 1 male).The bone marrow of 4 control donors was collected from bone marrow of femoral heads using the described protocol [15]. The 5 MSC samples of primary osteoporosis were isolated from femoral heads after low-energy fracture of the femoral neck. Further proof for primary osteoporosis of these 5 patients were their old-age and vertebrae fractures. The MSC cells were cultured and used for microarray as Benisch described [14]. There were significant differences between MSC cells of osteoporosis and the controls according to Benisch [14].
Data processing and DEGs screening
Firstly, the original data in CEL format were processed into expression values by the robust multi-array average (RMA) method through the Affy package [16] in R. Secondly, the probe-level data were transformed by R/Bioconductor platform notes package. Afterwards, background correction and quartile data normalization were performed.The DEGs of mesenchymal stem cell between primary osteoporosis and normal bone tissues were identified by samr package [17] in R software. The threshold value should be set as delta=1, minimum fold change=2. False discovery rate (FDR) less than 0.25 was chosen as cut-off criterion. Cluster analysis was applied to construct a dendrogram and to estimate the genetic distances of DEGs.
Function and pathway enrichment analysis of DEGs
The Database for Annotation, Visualization, and Integrated Discovery (DAVID) [18] was used to enrich GO functions and KEGG pathways of both up-regulated and down-regulated genes. FDR less than 0.05 were chosen as cut-off criteria.
TFs screening
Essaghir A method [19] in Network tools () was used to predict the TFs of these DEGs and the interaction proteins were obtained based on data from the Human Protein Reference Database (HPRD: ) and Database of Interacting Proteins (DIP: ) [19,20]. TFs were screened with values of 4 indexes in TfactS (p-value, q-value, E-value, and FDR) less than 0.05. The up- and down-regulated DEGs were firstly uploaded to TfactS containing approved target genes of TFs. Then the DEGs were matched with positive or negative regulated target genes of TFs to identify the numbers of DEGs whose expressions were in the same (or opposite) direction with their regulations. Afterwards, activation or suppression of the TFs were identified through estimating the enrichment of genes with the same (or opposite) expressing direction to the regulating direction using the Fisher test.
Functional enrichment analysis of TFs
GO functions and KEGG pathways of both up- and down-regulated genes regulated by the predicted TFs were analyzed with DAVID [18]. FDRs less than 0.05 were chosen as cut-off criteria.
Protein-Protein Interaction (PPI) network construction and analysis
The protein interaction relationships corresponding to the DEGs regulated by predicted TFs were obtained using the STRING [20] online database. PPI pairs with selected larger scores were used to construct the PPI network. Afterwards, the regulatory relationship between genes were analyzed and discussed through topological property of computing network such as the degree distribution of network, the shortest pathway distribution, average clustering coefficient, and closeness centrality.Based on the above datasets, the regulatory network was constructed with transcriptome and their 88 regulator genes, including target genes, interacted with target genes and the upstream enzyme genes.
Results
Data processing
A total of 21 049 genes in 9 samples were obtained after the preprocessing of expression profile. The original expression datasets from all conditions were processed into expression estimates using the RMA (Robust Multi-array Average) method. As shown in a box plot (Figure 1), the median of different samples was almost on the same line after normalization, which shows an excellent degree of standardization.
Figure 1
Data distribution box plots. The horizontal axis represents the name of samples, while the vertical axis represents the expression value. The black lines stand for median, which can be used to identify the degree of standardization. The hMSC.old stood for the advanced age controls and the hMSC.OP stood for the 5 elderly patients of primary osteoporosis.
Identification of DEGs
After cluster analysis of the expression value, a mesenchymal stem cell sample from primary osteoporosis in the same cluster with mesenchymal stem cell sample from normal bone tissues was excluded. A total of 579 DEGs of mesenchymal stem cells were screened in 4 normal bone tissues and 4 primary osteoporosis samples. As shown in Figure 2, there were 310 (53.54%) up-regulated genes in primary osteoporosis samples and 269 (46.46%). down-regulated genes The 10 most significant up- and down- regulated genes are listed in Table 1. A total of 20 DEGs of minimum P values were screened using principal components analysis (PCA) for distinguishing control groups and disease groups. Figure 3 shows PCA of the 20 most obvious DEGs.
Figure 2
Hierarchical clustering dendrogram of gene expression. Heat map of microarray results of primary osteoporosis and old controls. Total 579 DEGs were screened in 4 normal bone tissues mesenchymal stem cells and 4 primary osteoporosis mesenchymal stem cells. Through a process of gradual change color from red to green indicated expression values change from high to low.
Table 1
Top10 significantly enriched up- and down-regulated DEGs.
PCA of 20 most obvious DEGs. The horizontal axis represents the scores of first principal components and the vertical axis represents the scores of second principal components of samples. In the first principal components, 77.6% of variances from of 20 samples was explained while in the second principal component, 21.058% of variances from 20 samples was explained. Totally, the resolution degree of variances was 98.658%. In this figure, the hMSCold represented normal tissues (total of 4) while others were OP pateints.
Enrichment analysis of DEGs
There were 156 up-regulated genes obviously enriched in 56 GO terms and other seldom up-regulated genes mapped on hsa04510 (focal adhesion) pathways (Figure 4). As shown in Table 2, it can be concluded that the 56 biological progresses in GO terms were mainly related to cell cycle, physical preparation of mitosis, and DNA replication. There were 35 down-regulated genes enriched in 3 GO terms, including nuclear lumen (GO: 0031981), intracellular organelle lumen (GO: 0070013), and nucleoplasm (GO: 0005654).
Figure 4
Focal adhesion KEGG pathways. Red color represents up-regulated genes and green color represents down-regulated genes.
Table 2
Top 10 GO terms and KEGG pathways enrichment results of up-regulated genes.
Category
Term
Count
P-Value
BP
GO: 0043066~negative regulation of apoptosis
20
0.003334278
BP
GO: 0060548~negative regulation of cell death
20
0.003384174
BP
GO: 0022610~biological adhesion
30
0.003427309
BP
GO: 0043069~negative regulation of programmed cell death
20
0.003592446
BP
GO: 0030198~extracellular matrix organization
12
0.003697203
BP
GO: 0010941~regulation of cell death
33
0.003939575
BP
GO: 0051094~positive regulation of developmental process
17
0.004354961
BP
GO: 0007155~cell adhesion
30
0.004667666
BP
GO: 0040008~regulation of growth
19
0.004702444
BP
GO: 0046907~intracellular transport
28
0.005028033
KEGG
hsa04510: Focal adhesion
19
3.30E-05
A total of 7 active TFs were obtained, including CTNNB1 (catenin (cadherin-associated protein), beta 1), E2F4 (E2F transcription factor 4), EGR1 (early growth response gene 1), JUN (jun proto-oncogene), SP1 (trans-acting transcription factor 1), TCF7L2 (Transcription factor 7-like 2), and TP53 (tumor protein p53). Then 88 interacting genes without the non-differential expressing genes were detected by analyzing the 7 TFs in DPRD and DIP. The numbers of target genes corresponding to these 7 TFs were 36, 9, 13, 31, 33, 6, and 40.
Enrichment analysis of TFs
GO enrichment analysis of the interacting DEGs and TFs indicated that 84 genes were obviously enriched in 310 biological processes, 57 genes were obviously enriched in 14 cellular component pathways, and 57 genes were obviously enriched in molecular function pathways. The top 10 GO terms of DEGs related to TFs are shown in Table 3. It is obvious that more up-regulated genes were enriched in transcription regulator activity (GO: 0030528) and secondly in transcription factor activity (GO: 0003700).
Table 3
Top 10 GO terms enrichment results of TFs.
Category
Term
Count
P-Value
Fold enrichment
GOTERM_MF_FAT
GO: 0003700~transcription factor activity
27
6.75E-10
4.039654
GOTERM_MF_FAT
GO: 0030528~transcription regulator activity
33
1.37E-09
3.18381
GOTERM_MF_FAT
GO: 0008134~transcription factor binding
18
4.87E-08
5.11847
GOTERM_MF_FAT
GO: 0043565~sequence-specific DNA binding
19
1.00E-07
4.566148
GOTERM_MF_FAT
GO: 0046983~protein dimerization activity
17
6.08E-07
4.575459
GOTERM_MF_FAT
GO: 0003690~double-stranded DNA binding
8
4.05E-06
12.03104
GOTERM_MF_FAT
GO: 0043566~structure-specific DNA binding
9
6.03E-06
9.054398
GOTERM_MF_FAT
GO: 0016563~transcription activator activity
13
1.96E-05
4.625349
GOTERM_MF_FAT
GO: 0019899~enzyme binding
14
4.77E-05
3.904913
GOTERM_MF_FAT
GO: 0042803~protein homodimerization activity
11
8.45E-05
4.804313
Through KEGG pathways enrichment analysis, total 10 significant pathways of DEGs related to TFs were enriched (Table 4). It included 5 cancer pathways and 3 signal pathways, such as colorectal cancer (has05210), Wnt signaling pathway (hsa04310), focal adhesion (hsa04510), and MAPK signaling pathway (hsa04010). Many of these genes were related to the cell cycle, cell apoptosis, cell junction, and focal adhesional pathways.
Table 4
Top 10 KEGG pathways enrichment results of TFs.
Category
Term
Count
FDR
PATHWAY
hsa05210: Colorectal cancer
9
4.21E-04
PATHWAY
hsa04510: Focal adhesion
12
7.02E-04
PATHWAY
hsa05213: Endometrial cancer
7
8.02E-04
PATHWAY
hsa05200: Pathways in cancer
14
0.001687
PATHWAY
hsa05212: Pancreatic cancer
7
0.003085
PATHWAY
hsa04310: Wnt signaling pathway
9
0.004867
PATHWAY
hsa04210: Apoptosis
7
0.006217
PATHWAY
hsa05215: Prostate cancer
7
0.006149
PATHWAY
hsa04520: Adherens junction
6
0.019332
PATHWAY
hsa04722: Neurotrophin signaling pathway
7
0.027373
PPI network construction
A total of 170 pairs of PPI with reliability index >0.4 were selected for the construction of a PPI network using the String database (Figure 5). A total of 71 nodes were collected, which took 80.68% of all DEGs related to TFs. Generally, the protein interaction network of DEGs was in a state of high aggregation (Figure 5), which is an essential property of biological networks. In addition, it can be seen from Figures 5 and 6 that most DEGs that have strong interaction relationships were significantly up-regulated. Furthermore, DEGs related to regulation of TFs were mostly up-regulated.
Figure 5
The PPI networks of DEGs. Red color represents up-regulated genes and green color represents down-regulated genes, connecting ling represent the relationships.
Figure 6
Interaction networks of TFs and DEGs. The yellow ones represent TFs, the green ones represent down-regulated genes, the red ones represent up-regulated genes. The imaginarylines represented the interactions between DEGs related to TFs, while the solid lines represented the interactions between TFs and their regulated genes.
Further calculation on the topological parameters of network is shown in Figure 7. It can be seen in part A that the network node is in a power-law distribution form. From part B (shortest path length distribution), part C (average clustering coefficient), and part D (closeness centrality) it can be confirmed that this PPI network was characterized by small-world structure.
Figure 7
Network nodes distribution. (A) Degree distribution of the network node. (B) Shortest path length distribution. (C) Average clustering coefficient. (D) Closeness centrality.
PPI network analysis
The regulated network of TFs and related DEGs (binding protein) are shown in Figure 7. This network consisted of 7 TFs and 88 DEGs. Among the 7 TFs, CTNNB1, SP1, and TP53 regulated most up-regulated DEGs, which may play a crucial role in primary osteoporosis.
Discussion
Although primary osteoporosis is rarely lethal, this disease contributes to millions of fractures every year [21]. Previous research has mentioned many genes involved in pathogenesis of primary osteoporosis based on gene expression profiling. Mutations in CSF1 [7] and LRP5 [8] have been proven to be relevant to osteoporosis because of their ability to reduce the activity of Wnt signaling and therefore lead to a decrease in bone formation [8]. Genes WNT3A [9] and COL1A1 are related to BMD reduction [10]. The interested DEGs are mostly the same in different reports, but many factors such as samples, platforms, and analysis methods would lead to the different results of chips. Most of these reports only concentrated on individual genes and lacked overall data. In response to this, many experts have made great efforts to obtain results from chips by meta-analysis, but many essential issues such as annotation and comparison of cross-platforms could not be solved appropriately. Additionally, there is an urgent need to understand the molecular mechanism of primary osteoporosis to develop better detection, diagnosis, and therapeutic targets that could improve clinical management and therapeutic outcome for patients.Therefore, 2 datasets in a same platform were selected to analyze. Firstly, DEGs were screened by R language; we found that the DEGs and TFs were obviously enriched in certain GO terms and KEGG pathways related to cancer. Through GO term analysis, we found that up-regulated genes were only enriched in biological process and most of these biological processes were related to cell cycle, such as negative regulation of apoptosis, negative regulation of programmed cell death, and regulation of growth. GO terms associated with TFs mostly took part in biological process like cell cycle, cell apoptosis, and focal adhesion.In KEGG enrichment analysis, 5 cancer pathways and 3 signaling pathways were enriched, including WNT signaling pathway and MAPK signaling pathway. The WNT signaling pathway plays an important role in cell proliferation, differentiation, morphogenesis, and oncogenesis [22,23]. In 2001, Gong et al. reported that the WNT-β-catenin signaling pathway plays a vital role in regulating bone density through LRP5. Osteoblasts differentiation could be depressed by inhibiting β-catenin (Aliases for CTNNB1) expression [28]. As the main regulator of bone formation [24], the WNT signaling pathway has been confirmed to depress osteoclastogenesis and promote differentiation, proliferation, and mineralization of osteoblasts [25]. The MAPK pathway was reported to be essential for formation of bones [26]. In addition, over-proliferated cells are mitigated by depressingcyclin A transcriptional regulation and the cell cycle in primary osteoblasts [27].From this study, PPI network of DEGs was found to be in a state of high aggregation, which is an essential property of a biological network. In addition, it can be seen from Figure 5 that most genes that have strong interaction relationships were up-regulated but there are few interactions among down-regulated genes. Moreover, most genes that related to TFs were up-regulated. VEGF (vascular endothelial growth factor) is up-regulated in osteoblastic precursor cells and is involved in bone formation by interaction with other TFs and genes [28]. TGFB1 (transforming growth factor beta-1) has positive correlations with peak bone mass [29]. Among the 7 TFs predicted in this paper, JUN and TP53 took part in both WNT signaling pathway and MAPK pathway. TP53 was crucial for the development of many cancers, and it has an influence on many biology process like cell cycling, DNA repair, and apoptosis [30]. Almost 90% of osteosarcomas are associated with TP53 mutation [31]. JUN was found to be associated with humanmalignancies [32]. CTNNB1 was a key down-regulated component of the canonical Wnt signaling pathway and takes part in the regulation of cell adhesion [33]. E2F4 was found to be involved in cell cycle regulation and the over-expression of E2F4, which is related to many humancancers [34]. SP1 participates in many cellular processes such as apoptosis and response to DNA damage. Through inducing TERT and TERC expression, SP1 can maintain telomerase activity in cancer with the help of TF7IP [35]. Obviously, the samples we chose were based on the many enriched genes and TFs involved in cancer pathways. Most people who suffer from osteoporosis are elderly people, especially females [4], and older people have more potential for cancer disease. Therefore, more research is needed for further analysis of the functions of TFs.According to Benisch, DEGs of osteoporotic bones were significantly different from an age-matched control group. Comparing the osteoporotic bones with the healthy bones (middle-aged), there were also distinct gene products [14]. This may indicate that osteoporosis can change gene expression pattern independent of age.In conclusion, a total of 7 TFs – CTNNB1, E2F4, EGR1, JUN, SP1, TCF7L2, and TP53 – were enriched from gene-expressed database. CTNNB1, JUN, and TP53 are key elements in the WNT signaling pathway. CTNNB1 affected osteoblast differentiation. JUN, E2F4, SP1, and TP53 were found to be related to several cancers. For example, the mutations of TP53 may lead to osteosarcomas in humans [36]. The complex network constructed by TFs and genes is important in primary osteoporosis.
Conclusions
Although many enriched genes that which seem to be highly associated with osteoporosis lack of research verification, this study supplies significant information for realizing the change of primary osteoporosis genes and the functions of TFs.
Authors: Michael Overholtzer; Pulivarthi H Rao; Reyna Favis; Xin-Yan Lu; Michael B Elowitz; Francis Barany; Marc Ladanyi; Richard Gorlick; Arnold J Levine Journal: Proc Natl Acad Sci U S A Date: 2003-09-12 Impact factor: 11.205
Authors: Tao Qiu; Lingling Xian; Janet Crane; Chunyi Wen; Matthew Hilton; William Lu; Peter Newman; Xu Cao Journal: J Bone Miner Res Date: 2015-02 Impact factor: 6.741
Authors: Peter Tzakas; Betty Y L Wong; Alexander G Logan; Laurence A Rubin; David E C Cole Journal: BMC Musculoskelet Disord Date: 2005-06-14 Impact factor: 2.362