Chen Chen, Hong Shen, Li-Guo Zhang, Jian Liu, Xiao-Ge Cao, An-Liang Yao, Shao-San Kang, Wei-Xing Gao, Hui Han, Feng-Hong Cao, Zhi-Guo Li.
Abstract
Proteomics research on human prostate cancer (PCa) tissue samples has generated a large amount of data; however, only a very small portion has been thoroughly investigated. In this study, we manually mined the full text of proteomics literature that compared PCa with normal or benign tissue and identified 41 differentially expressed proteins that were verified or reported at least twice in different research studies. We regarded these proteins as seed proteins to construct a protein-protein interaction (PPI) network. The extended network included one giant network, which consisted of 1,264 nodes connected via 1,744 edges, and 3 small separate components. We then constructed the backbone network, derived from key nodes, and the subnetwork, consisting of the shortest paths between seed proteins. Topological analyses of these networks were conducted to identify proteins essential for the genesis of PCa. Solute carrier family 2 (facilitated glucose transporter), member 4 (SLC2A4) had the highest closeness centrality and was located at the center of each network; it also had the highest betweenness centrality and the largest degree in the backbone network. Tubulin, beta 2C (TUBB2C) had the largest degree in the giant network and the subnetwork. In addition, module analysis of the whole PPI network yielded a densely connected region. Functional annotation indicated that the Ras protein signal transduction biological process and the mitogen-activated protein kinase (MAPK), neurotrophin and gonadotropin-releasing hormone (GnRH) signaling pathways may play important roles in the genesis and development of PCa. Further investigation of the SLC2A4 and TUBB2C proteins, and of these biological processes and pathways, may therefore provide potential targets for the diagnosis and treatment of PCa.
Year: 2016 PMID: 27121963 PMCID: PMC4866967 DOI: 10.3892/ijmm.2016.2577
Source DB: PubMed Journal: Int J Mol Med ISSN: 1107-3756 Impact factor: 4.101
Table I. List of the 41 differentially expressed proteins between prostate cancer and normal or benign tissues, obtained by literature mining and screening by reported frequency.
| Gene ID | Symbol | Description | Reported frequency |
|---|---|---|---|
| 1674 | DES | Desmin | 6 |
| 3329 | HSPD1 | Heat shock 60 kDa protein 1 (chaperonin) | 5 |
| 3309 | HSPA5 | Heat shock 70 kDa protein 5 (glucose-regulated protein, 78 kDa) | 3 |
| 3875 | KRT18 | Keratin 18, type I | 3 |
| 3856 | KRT8 | Keratin 8, type II | 3 |
| 7414 | VCL | Vinculin | 3 |
| 55 | ACPP | Acid phosphatase, prostate | 2 |
| 213 | ALB | Albumin | 2 |
| 308 | ANXA5 | Annexin A5 | 2 |
| 392 | ARHGAP1 | Rho GTPase activating protein 1 | 2 |
| 396 | ARHGDIA | Rho GDP dissociation inhibitor (GDI) α | 2 |
| 563 | AZGP1 | Alpha-2-glycoprotein 1, zinc-binding | 2 |
| 822 | CAPG | Capping protein (actin filament), gelsolin-like | 2 |
| 1152 | CKB | Creatine kinase, brain | 2 |
| 30846 | EHD2 | EH-domain containing 2 | 2 |
| 2023 | ENO1 | Enolase 1, (α) | 2 |
| 2266 | FGG | Fibrinogen gamma chain | 2 |
| 2288 | FKBP4 | FK506 binding protein 4, 59 kDa | 2 |
| 2638 | GC | Group-specific component (vitamin D binding protein) | 2 |
| 2934 | GSN | Gelsolin | 2 |
| 2947 | GSTM3 | Glutathione S-transferase mu 3 (brain) | 2 |
| 2950 | GSTP1 | Glutathione S-transferase pi 1 | 2 |
| 3187 | HNRPH1 | Heterogeneous nuclear ribonucleoprotein H1 (H) | 2 |
| 3313 | HSPA9 | Heat shock 70 kDa protein 9 (mortalin) | 2 |
| 3315 | HSPB1 | Heat shock 27 kDa protein 1 | 2 |
| 3848 | KRT1 | Keratin 1, type II | 2 |
| 3880 | KRT19 | Keratin 19, type I | 2 |
| 5034 | P4HB | Prolyl 4-hydroxylase, beta polypeptide | 2 |
| 5245 | PHB | Prohibitin | 2 |
| 7052 | TGM2 | Transglutaminase 2 | 2 |
| 7163 | TPD52 | Tumor protein D52 | 2 |
| 7168 | TPM1 | Tropomyosin 1 (α) | 2 |
| 10383 | TUBB2C | Tubulin, beta 2C | 2 |
| 7431 | VIM | Vimentin | 2 |
| 3615 | IMPDH2 | IMP (inosine 5′-monophosphate) dehydrogenase 2 | 2 |
| 64087 | MCCC2 | Methylcrotonoyl-CoA carboxylase 2 (β) | 2 |
| 10631 | POSTN | Periostin, osteoblast specific factor | 2 |
| 5500 | PPP1CB | Protein phosphatase 1, catalytic subunit, beta isozyme | 2 |
| 5694 | PSMB6 | Proteasome (prosome, macropain) subunit, beta type, 6 | 2 |
| 10131 | TRAP1 | TNF receptor-associated protein 1 | 2 |
| 7334 | UBE2N | Ubiquitin-conjugating enzyme E2N | 2 |
A gene symbol marked with 'a' indicates that the protein appeared in only one article but was further validated experimentally (western blot analysis and immunohistochemistry) as a protein of interest. Apart from those labeled with 'a', the remaining proteins were found in at least two studies.
Figure 1. Overview of the extended network. The extended network includes one giant network and 3 separate small components, which are derived from the seed proteins AZGP1, CAPG and POSTN, respectively. Nodes with a red triangular shape are the seed proteins shown in Table I; the rest are their neighbors.
Figure 2. Topology of the giant network. The giant network is the largest component of the extended network and consists of 1,264 nodes and 1,744 edges. Key nodes in the giant network are highlighted in different colors. Nodes with a triangular shape are the seed proteins. The size of each node corresponds to its BC value. SLC2A4 is located at the center of the giant network. BC, betweenness centrality.
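The topological measures named in this record (degree, closeness centrality and betweenness centrality) can be illustrated on a toy graph. The sketch below is a minimal pure-Python version for an undirected, unweighted graph; it is not the authors' tooling, which this record does not name. The node labels (`hub`, `a` to `d`) and the edge list are hypothetical and do not represent real protein interactions.

```python
from collections import deque

def build_adjacency(edges):
    """Undirected adjacency map from a list of (u, v) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def closeness(adj, node):
    """(reachable nodes - 1) / sum of hop distances from `node`."""
    dist = {node: 0}
    queue = deque([node])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total else 0.0

def betweenness(adj):
    """Unnormalized betweenness centrality (Brandes' algorithm)."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        stack, dist = [], {s: 0}
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}   # number of shortest paths from s
        sigma[s] = 1
        queue = deque([s])
        while queue:
            u = queue.popleft()
            stack.append(u)
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
                if dist[v] == dist[u] + 1:
                    sigma[v] += sigma[u]
                    preds[v].append(u)
        delta = {v: 0.0 for v in adj}
        while stack:                  # accumulate dependencies backwards
            w = stack.pop()
            for u in preds[w]:
                delta[u] += sigma[u] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # every unordered pair was visited from both endpoints
    return {v: b / 2 for v, b in bc.items()}

# Hypothetical toy graph: a hub with three neighbours and one pendant node.
toy_edges = [("hub", "a"), ("hub", "b"), ("hub", "c"), ("c", "d")]
adj = build_adjacency(toy_edges)
degree = {v: len(nbrs) for v, nbrs in adj.items()}
bc = betweenness(adj)
cc = {v: closeness(adj, v) for v in adj}
```

In this toy graph `hub` has the largest degree, betweenness and closeness, which mirrors how a single node can dominate several centrality rankings at once.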
Figure 3. Topology of the backbone network. The backbone network consisted of 63 nodes with a high BC value and 186 edges. The size of each node corresponds to its BC value. Nodes marked in red are the 17 neighbors of SLC2A4. BC, betweenness centrality.
The general measurements for each network.
| Parameter | Giant network | Backbone network | Subnetwork |
|---|---|---|---|
| No. of nodes | 1,264 | 63 | 302 |
| Average degree | 2.759 | 5.905 | 5.093 |
| Largest degree | 174 | 17 | 72 |
| Diameter | 7 | 5 | 5 |
| Mean shortest path length | 3.859 | 2.675 | 3.145 |
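As a consistency check, the average degree of an undirected graph equals twice the number of edges divided by the number of nodes. The short sketch below applies that identity to the node and edge counts given in the figure captions of this record; it assumes the networks are undirected with no self-loops, which the record does not state explicitly.

```python
# (nodes, edges) as reported in the figure captions of this record.
networks = {
    "giant": (1264, 1744),
    "backbone": (63, 186),
    "subnetwork": (302, 769),
}

def average_degree(n_nodes: int, n_edges: int) -> float:
    """Mean degree of an undirected graph: every edge touches two nodes."""
    return 2 * n_edges / n_nodes

avg = {name: round(average_degree(n, e), 3) for name, (n, e) in networks.items()}
print(avg)
```

Under that assumption the computed values round to 2.759, 5.905 and 5.093, matching the "Average degree" row of the table.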
Figure 4. The subnetwork consisting of all of the shortest paths between the 41 seed proteins. The subnetwork consisted of 302 nodes and 769 edges. The size of each node corresponds to its BC value. SLC2A4 is located at the center of the subnetwork. TUBB2C has the highest BC value and the largest degree. BC, betweenness centrality.
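One way to extract a subnetwork made of all shortest paths between seed nodes is sketched below. It relies on the fact that a node `u` lies on some shortest path between seeds `s` and `t` exactly when `d(s, u) + d(u, t) = d(s, t)`. This is an illustrative pure-Python sketch, not the authors' procedure; the function names, node labels (`s1`, `x`, ...) and toy edges are hypothetical.

```python
from collections import deque
from itertools import combinations

def bfs_distances(adj, source):
    """Hop distance from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def shortest_path_subnetwork(adj, seeds):
    """Nodes and edges lying on any shortest path between two seeds."""
    dist = {s: bfs_distances(adj, s) for s in seeds}
    nodes, edges = set(), set()
    for s, t in combinations(seeds, 2):
        if t not in dist[s]:
            continue  # the two seeds sit in different components
        d_st = dist[s][t]
        for u in adj:
            if u not in dist[s] or u not in dist[t]:
                continue
            if dist[s][u] + dist[t][u] != d_st:
                continue  # u is not on a shortest s-t path
            nodes.add(u)
            for v in adj[u]:
                # edge u-v continues a shortest path from s towards t
                if v in dist[t] and dist[s][u] + 1 + dist[t][v] == d_st:
                    edges.add(frozenset((u, v)))
    return nodes, edges

# Hypothetical toy graph with three seeds; z and w lie off every seed-seed path.
toy_edges = [("s1", "x"), ("x", "s2"), ("s1", "y"), ("y", "s2"),
             ("x", "s3"), ("s2", "z"), ("z", "w")]
adj = {}
for a, b in toy_edges:
    adj.setdefault(a, set()).add(b)
    adj.setdefault(b, set()).add(a)

nodes, edges = shortest_path_subnetwork(adj, ["s1", "s2", "s3"])
```

Here the intermediate nodes `x` and `y` are kept because they bridge seeds, while `z` and `w` are dropped; non-seed nodes retained this way play the role of the connecting proteins in the subnetwork.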
Figure 5. The significant modules in the whole extended network.
Gene Ontology (GO) functional enrichment analysis of the densely connected region with the threshold of P<0.05.
| Category | GO ID | Term | Count | P-value | Size |
|---|---|---|---|---|---|
| BP | GO:0006915 | Apoptosis | 8 | 6.4E-4 | 602 |
| BP | GO:0007265 | Ras protein signal transduction | 4 | 9.8E-3 | 105 |
| BP | GO:0042981 | Regulation of apoptosis | 7 | 1.1E-2 | 804 |
P-values were corrected with the Benjamini method to control the false discovery rate (FDR). Category, GO function; Count, the number of proteins; Size, the total number of genes in the GO BP term. BP, biological process.
KEGG pathway enrichment analysis of the densely connected region with the threshold of P<0.05.
| KEGG pathway | KEGG entry | Count | P-value | Size |
|---|---|---|---|---|
| MAPK signaling pathway | hsa04010 | 5 | 3.5E-2 | 267 |
| Neurotrophin signaling pathway | hsa04722 | 4 | 2.9E-2 | 124 |
| GnRH signaling pathway | hsa04912 | 4 | 2.9E-2 | 98 |
| Colorectal cancer | hsa05210 | 4 | 3.7E-2 | 84 |
| Non-small cell lung cancer | hsa05223 | 3 | 4.6E-2 | 54 |
P-values were corrected with the Benjamini method to control the false discovery rate (FDR). Count, the number of proteins; Size, the total number of genes in the pathway. KEGG, Kyoto Encyclopedia of Genes and Genomes.
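The table notes mention a Benjamini correction for the false discovery rate. Assuming this refers to the standard Benjamini-Hochberg step-up procedure (the record does not name the exact variant or the enrichment tool), the adjustment can be sketched as follows. The raw P-values in the example are hypothetical and are not taken from the study.

```python
def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted P-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices, ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical raw P-values for four enrichment terms.
raw = [0.01, 0.04, 0.03, 0.005]
adj_p = benjamini_hochberg(raw)
significant = [i for i, p in enumerate(adj_p) if p < 0.05]
```

Each adjusted value is the raw P-value scaled by the number of tests divided by its rank, capped so that adjusted values never decrease as raw values increase; terms are then kept if the adjusted value falls below the chosen threshold.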