Lei Chen, Jing Yang, Zhihao Xing, Fei Yuan, Yang Shu, YunHua Zhang, XiangYin Kong, Tao Huang, HaiPeng Li, Yu-Dong Cai.
Abstract
Cancer is a significant public health problem worldwide. Identifying the full set of genes related to a given type of cancer facilitates earlier diagnosis and more effective treatment. In this study, two widely used algorithms, the random walk with restart (RWR) algorithm and the shortest path (SP) algorithm, were adopted to construct two parameterized computational methods, namely an RWR-based method and an SP-based method; an integrated method for identifying novel disease genes was then built on top of them. To validate the utility of the integrated method, data for oral cancer (OC) were used to train the RWR-based and SP-based methods, yielding an optimal version of each. The integrated method combining these two optimal methods was then used to identify novel OC genes. As a result, 85 novel genes were inferred: eleven genes (e.g., MYD88, FGFR2, NFKBIA) were identified by both the RWR-based and SP-based methods, 70 genes (e.g., BMP4, IFNG, KITLG) were discovered only by the RWR-based method, and four genes (IL1R1, MCM6, NOG and CXCR3) were predicted only by the SP-based method. Extensive analyses indicate that several of the novel genes have strong associations with cancers, which supports the effectiveness of the integrated method for identifying disease genes.
Year: 2017 PMID: 28384236 PMCID: PMC5383255 DOI: 10.1371/journal.pone.0175185
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
The pseudo-code of the RWR-based method.
| RWR-based method |
|---|
| 1. Execute the RWR algorithm on |
| 2. Execute the permutation test, producing the p-value for each gene; select candidate genes with a p-value less than 0.05; |
| 3. For each candidate gene, calculate its |
| 4. For each candidate gene, calculate its |
| 5. Output the remaining candidate genes as the putative OC-related genes. |
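Step 1 of the pseudo-code relies on the random walk with restart algorithm. The following is a minimal sketch of that general technique in plain Python, not the authors' implementation: the toy network, the function name `random_walk_with_restart`, the restart probability of 0.8 and the convergence tolerance are all assumptions made for illustration.

```python
def random_walk_with_restart(adj, seeds, restart=0.8, tol=1e-6, max_iter=1000):
    """Return a steady-state visiting probability for every node.

    adj:   dict mapping node -> {neighbor: weight}; assumed undirected,
           so every edge appears in both directions with the same weight.
    seeds: set of nodes the walker restarts from (e.g. known disease genes).
    restart: assumed restart probability (hypothetical default).
    """
    nodes = sorted(adj)
    # Initial vector: equal probability on the seed nodes, zero elsewhere.
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    # Total edge weight per node, used to normalize outgoing transitions.
    strength = {n: sum(adj[n].values()) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {}
        for n in nodes:
            # Probability mass flowing into n from each neighbor m.
            inflow = sum(p[m] * w / strength[m] for m, w in adj[n].items())
            nxt[n] = (1.0 - restart) * inflow + restart * p0[n]
        # L1 distance between successive vectors as the stopping criterion.
        diff = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if diff < tol:
            break
    return p


# Toy chain network A - B - C - D with unit weights (hypothetical data).
toy = {
    "A": {"B": 1.0},
    "B": {"A": 1.0, "C": 1.0},
    "C": {"B": 1.0, "D": 1.0},
    "D": {"C": 1.0},
}
scores = random_walk_with_restart(toy, seeds={"A"})
for gene, prob in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(gene, round(prob, 4))
```

Nodes close to the seed receive higher probabilities, which is the property that lets the probability column act as a ranking of candidate genes before the later filtering steps.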
The pseudo-code of the SP-based method.
| SP-based method |
|---|
| 1. Execute the SP algorithm on |
| 2. Execute the permutation test, producing the p-value for each candidate gene; select candidate genes with p-values less than 0.05; |
| 3. For each candidate gene, calculate its |
| 4. For each candidate gene, calculate its |
| 5. Output the remaining candidate genes as the putative OC-related genes. |
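Step 1 of the SP-based pseudo-code can be illustrated by counting how often a gene lies on shortest paths that connect pairs of known disease genes, which is one plausible reading of the betweenness value reported in the tables. The sketch below makes several assumptions: the network is unweighted (so breadth-first search suffices), only one shortest path is used per pair, and the names `shortest_path`, `seed_betweenness` and the toy graph are hypothetical.

```python
from collections import deque
from itertools import combinations


def shortest_path(adj, source, target):
    """Return one shortest path from source to target via BFS, or None."""
    parent = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            # Walk the parent links back to the source.
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nb in sorted(adj[node]):  # sorted for deterministic tie-breaking
            if nb not in parent:
                parent[nb] = node
                queue.append(nb)
    return None


def seed_betweenness(adj, seeds):
    """Count, for each non-seed node, the seed-to-seed paths passing through it."""
    counts = {}
    for s, t in combinations(sorted(seeds), 2):
        path = shortest_path(adj, s, t)
        if path is None:
            continue
        for node in path[1:-1]:  # interior nodes only
            if node not in seeds:
                counts[node] = counts.get(node, 0) + 1
    return counts


# Toy network: three seed genes S1-S3 connected through X and Y (hypothetical).
toy = {
    "S1": {"X"},
    "S2": {"X"},
    "S3": {"Y"},
    "X": {"S1", "S2", "Y"},
    "Y": {"X", "S3"},
}
betweenness = seed_betweenness(toy, seeds={"S1", "S2", "S3"})
print(betweenness)
```

A weighted variant would replace the breadth-first search with Dijkstra's algorithm; how edge weights are derived from interaction scores is not specified in this excerpt, so it is left out here.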
Fig 1. The performance of the RWR-based method under different combinations of parameters.
(a) Performance of the RWR-based method with p = 400. (b) Performance of the RWR-based method with p = 700. (c) Performance of the RWR-based method with p = 900.
Fig 2. The performance of the SP-based method under different combinations of parameters.
The three lines show the performance of the SP-based method under different thresholds of the maximum interaction score: the solid line corresponds to a threshold of 900, the dotted line to a threshold of 700, and the dashed line to a threshold of 400.
Genes identified by the optimal SP-based method.
| Ensembl ID | Gene symbol | Betweenness | P-value | | | Function |
|---|---|---|---|---|---|---|
| ENSP00000354394 | STAT1 | 1443 | <0.001 | 999 | 0.852 | functions as a key factor in cell viability in response to different cell stimuli and pathogens |
| ENSP00000263341 | IL1B | 543 | <0.001 | 994 | 0.873 | a member of the interleukin 1 cytokine family |
| ENSP00000379625 | MYD88 | 528 | 0.006 | 999 | 0.880 | an essential signal transducer in the IL1 and Toll-like receptor signaling pathways |
| ENSP00000233946 | IL1R1 | 528 | 0.001 | 920 | 0.843 | interleukin 1 receptor type 1 |
| ENSP00000216797 | NFKBIA | 201 | 0.018 | 999 | 0.825 | a member of the NF-kappa-B inhibitor family, which is involved in inflammatory responses |
| ENSP00000222382 | CYP3A43 | 183 | <0.001 | 958 | 0.988 | a member of the cytochrome P450 superfamily of enzymes |
| ENSP00000328181 | NOG | 183 | 0.005 | 999 | 0.862 | binds and inactivates members of the TGF-beta superfamily signaling proteins |
| ENSP00000410294 | FGFR2 | 183 | 0.01 | 999 | 0.846 | a tyrosine protein kinase that functions as a receptor for fibroblast growth factors and plays key roles in cell proliferation, differentiation, migration and apoptosis |
| ENSP00000362795 | CXCR3 | 179 | 0.021 | 999 | 0.808 | a G protein-coupled receptor with selectivity for chemokines |
| ENSP00000260356 | THBS1 | 10 | 0.033 | 984 | 0.807 | an adhesive glycoprotein that mediates cell-cell and cell-matrix interactions |
| ENSP00000264156 | MCM6 | 8 | 0.014 | 999 | 0.822 | involved in the formation of replication forks |
| ENSP00000301141 | CYP2A6 | 3 | 0.016 | 950 | 0.948 | a member of the cytochrome P450 superfamily of enzymes |
| ENSP00000331736 | SELE | 1 | 0.006 | 978 | 0.852 | responsible for the accumulation of blood leukocytes at sites of inflammation |
| ENSP00000168712 | FGF4 | 1 | 0.016 | 999 | 0.847 | fibroblast growth factor 4, involved in various biological processes such as cell growth and morphogenesis |
| ENSP00000286758 | CXCL13 | 1 | 0.005 | 992 | 0.838 | C-X-C motif chemokine ligand 13 |
a: Genes that have shown stimulative or suppressive effects on cancer as validated by experiments.
b: Genes that have been reported to have a certain relationship with cancer but that have not been validated by experiments.
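The P-value column comes from the permutation test named in step 2 of the pseudo-code. The excerpt does not spell out how that test is run, so the following is only a sketch of a common empirical scheme: the seed genes are replaced by random gene sets of the same size, each candidate is re-scored, and the p-value is the fraction of random scores at least as large as the observed one. The scoring function, the number of permutations and all names here are assumptions.

```python
import random


def empirical_p_value(observed, null_scores):
    """Fraction of null scores that are at least as large as the observed score."""
    hits = sum(1 for s in null_scores if s >= observed)
    return hits / len(null_scores)


def permutation_p_values(score_fn, all_genes, seeds, n_perm=1000, rng=None):
    """Empirical p-value per candidate gene, using random seed sets as the null.

    score_fn(seed_set) must return {gene: score} for the non-seed genes.
    """
    rng = rng or random.Random(0)  # fixed seed keeps the sketch reproducible
    observed = score_fn(seeds)
    pool = sorted(all_genes)
    null = {g: [] for g in observed}
    for _ in range(n_perm):
        fake_seeds = set(rng.sample(pool, len(seeds)))
        scores = score_fn(fake_seeds)
        for g in observed:
            # A candidate drawn into the fake seed set gets a score of 0.
            null[g].append(scores.get(g, 0))
    return {g: empirical_p_value(observed[g], null[g]) for g in observed}


# Toy network and a stand-in score: number of seed genes adjacent to a gene.
toy = {
    "g1": {"g2", "g3"},
    "g2": {"g1", "g3"},
    "g3": {"g1", "g2", "g4"},
    "g4": {"g3", "g5"},
    "g5": {"g4"},
}


def neighbor_count(seed_set):
    return {g: len(toy[g] & seed_set) for g in toy if g not in seed_set}


p_values = permutation_p_values(neighbor_count, toy, seeds={"g1", "g2"}, n_perm=200)
print(p_values)
```

In the methods described here, the stand-in score would be replaced by the RWR probability or the SP betweenness, and candidates with a p-value below 0.05 would be kept.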
Fig 3. The distribution of the 85 putative OC-related genes obtained in this study.
The blue part represents the set consisting of 70 putative genes obtained using only the RWR-based method. The red part represents the set consisting of 11 putative genes obtained using both the RWR-based and SP-based methods. The green part represents the set consisting of 4 putative genes obtained using only the SP-based method.
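The partition described in this caption can be checked with simple set arithmetic. The sketch below uses the gene symbols listed in the SP-based table and in the table of genes found by both methods; because the full list of 70 RWR-only genes is not reproduced in this excerpt, that group enters only as a count. Variable names are illustrative.

```python
# Fifteen genes reported for the optimal SP-based method.
sp_genes = {
    "STAT1", "IL1B", "MYD88", "IL1R1", "NFKBIA", "CYP3A43", "NOG", "FGFR2",
    "CXCR3", "THBS1", "MCM6", "CYP2A6", "SELE", "FGF4", "CXCL13",
}

# Eleven genes reported as identified by both methods.
both_methods = {
    "MYD88", "FGFR2", "NFKBIA", "SELE", "THBS1", "STAT1",
    "CYP2A6", "CYP3A43", "CXCL13", "FGF4", "IL1B",
}

# Genes found by the SP-based method but not by the RWR-based method.
sp_only = sp_genes - both_methods

# The 70 RWR-only genes are not all listed here, so only their count is used.
n_rwr_only = 70
n_total = n_rwr_only + len(both_methods) + len(sp_only)

print(sorted(sp_only))
print(n_total)
```

The difference of the two sets recovers the four SP-only genes named in the abstract, and the three groups add up to the 85 putative genes.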
Fig 4. The sub-network containing the putative genes and OC-related genes, extracted from the network used by the RWR-based and SP-based methods.
The blue nodes represent putative genes and red nodes represent OC-related genes.
Eleven putative genes identified using both RWR-based and SP-based methods.
| Ensembl ID | Gene symbol | Probability (RWR-based method) | P-value (RWR-based method) | | | Betweenness (SP-based method) | P-value (SP-based method) | | | Function |
|---|---|---|---|---|---|---|---|---|---|---|
| ENSP00000379625 | MYD88 | 6.67E-05 | 0.032 | 999 | 0.880 | 528 | 0.006 | 999 | 0.880 | an essential signal transducer in the IL1 and Toll-like receptor signaling pathways |
| ENSP00000410294 | FGFR2 | 9.04E-05 | 0.021 | 999 | 0.846 | 183 | 0.01 | 999 | 0.846 | a tyrosine protein kinase that functions as a receptor for fibroblast growth factors and plays key roles in cell proliferation, differentiation, migration and apoptosis |
| ENSP00000216797 | NFKBIA | 7.56E-05 | 0.026 | 999 | 0.825 | 201 | 0.018 | 999 | 0.825 | a member of the NF-kappa-B inhibitor family, which is involved in inflammatory responses |
| ENSP00000331736 | SELE | 9.73E-05 | <0.001 | 978 | 0.852 | 1 | 0.006 | 978 | 0.852 | responsible for the accumulation of blood leukocytes at sites of inflammation |
| ENSP00000260356 | THBS1 | 6.94E-05 | <0.001 | 984 | 0.807 | 10 | 0.033 | 984 | 0.807 | an adhesive glycoprotein that mediates cell-cell and cell-matrix interactions |
| ENSP00000354394 | STAT1 | 4.36E-04 | <0.001 | 999 | 0.852 | 1443 | <0.001 | 999 | 0.852 | functions as a key factor in cell viability in response to different cell stimuli and pathogens |
| ENSP00000301141 | CYP2A6 | 8.85E-05 | <0.001 | 950 | 0.948 | 3 | 0.016 | 950 | 0.948 | a member of the cytochrome P450 superfamily of enzymes |
| ENSP00000222382 | CYP3A43 | 3.35E-04 | <0.001 | 958 | 0.988 | 183 | <0.001 | 958 | 0.988 | a member of the cytochrome P450 superfamily of enzymes |
| ENSP00000286758 | CXCL13 | 6.63E-05 | 0.006 | 992 | 0.837 | 1 | 0.005 | 992 | 0.838 | C-X-C motif chemokine ligand 13 |
| ENSP00000168712 | FGF4 | 8.21E-05 | 0.001 | 999 | 0.847 | 1 | 0.016 | 999 | 0.847 | fibroblast growth factor 4, involved in various biological processes such as cell growth and morphogenesis |
| ENSP00000263341 | IL1B | 1.92E-04 | <0.001 | 994 | 0.873 | 543 | <0.001 | 994 | 0.873 | a member of the interleukin 1 cytokine family |
a: Genes that have shown stimulative or suppressive effects on cancer as validated by experiments.
b: Genes that have been reported to have a certain relationship with cancer but that have not been validated by experiments.
Four putative genes identified using only the SP-based method.
| Ensembl ID | Gene symbol | Betweenness | P-value | | | Function |
|---|---|---|---|---|---|---|
| ENSP00000264156 | MCM6 | 8 | 0.014 | 999 | 0.822 | involved in the formation of replication forks |
| ENSP00000328181 | NOG | 183 | 0.005 | 999 | 0.862 | binds and inactivates members of the TGF-beta superfamily signaling proteins |
| ENSP00000362795 | CXCR3 | 179 | 0.021 | 999 | 0.808 | a G protein-coupled receptor with selectivity for chemokines |
| ENSP00000233946 | IL1R1 | 528 | 0.001 | 920 | 0.843 | interleukin 1 receptor type 1 |
a: Genes that have shown stimulative or suppressive effects on cancer as validated by experiments.
b: Genes that have been reported to have a certain relationship with cancer but that have not been validated by experiments.
Important genes among the seventy putative genes identified using only the RWR-based method.
| Ensembl ID | Gene symbol | Probability | P-value | | | Function |
|---|---|---|---|---|---|---|
| ENSP00000245451 | BMP4 | 9.57E-05 | 0.026 | 981 | 0.905 | binds the TGF-beta receptor, leading to recruitment and activation of transcription factors |
| ENSP00000225831 | CCL2 | 1.21E-04 | 0.012 | 984 | 0.869 | C-C motif chemokine ligand 2 |
| ENSP00000351671 | CCL20 | 6.69E-05 | 0.003 | 965 | 0.804 | C-C motif chemokine ligand 20 |
| ENSP00000293272 | CCL5 | 7.70E-05 | 0.002 | 994 | 0.891 | C-C motif chemokine ligand 5 |
| ENSP00000292303 | CCR5 | 1.01E-04 | 0.003 | 996 | 0.839 | C-C motif chemokine receptor 5 |
| ENSP00000246657 | CCR7 | 9.82E-05 | <0.001 | 998 | 0.823 | C-C motif chemokine receptor 7 |
| ENSP00000229135 | IFNG | 1.70E-04 | 0.03 | 994 | 0.839 | binds to the interferon gamma receptor in the response to infection |
| ENSP00000228280 | KITLG | 1.05E-04 | 0.026 | 948 | 0.816 | the ligand of the tyrosine-kinase receptor |
| ENSP00000162749 | TNFRSF1A | 9.69E-05 | 0.013 | 999 | 0.825 | a member of the TNF receptor superfamily, which plays a role in various biological processes |
| ENSP00000289153 | PIK3CB | 1.23E-04 | <0.001 | 997 | 0.926 | an isoform of the catalytic subunit of PI3K |
| ENSP00000366563 | PIK3CD | 1.15E-04 | <0.001 | 997 | 0.919 | PI3Ks phosphorylate inositol lipids; PIK3CD is involved in the immune response |
| ENSP00000352121 | PIK3CG | 1.21E-04 | <0.001 | 996 | 0.921 | phosphatidylinositol-4,5-bisphosphate 3-kinase catalytic subunit gamma |
| ENSP00000324648 | CYP2B6 | 9.73E-05 | <0.001 | 962 | 0.943 | a member of the cytochrome P450 superfamily |
| ENSP00000360372 | CYP2C19 | 9.32E-05 | <0.001 | 962 | 0.966 | cytochrome P450 family 2 subfamily C member 19 |
| ENSP00000360247 | CYP2J2 | 7.48E-05 | 0.004 | 912 | 0.969 | cytochrome P450 family 2 subfamily J member 2 |
| ENSP00000337915 | CYP3A4 | 3.32E-04 | <0.001 | 963 | 0.941 | cytochrome P450 family 3 subfamily A member 4 |
| ENSP00000360968 | CYP4X1 | 7.43E-05 | 0.018 | 939 | 0.959 | cytochrome P450 family 4 subfamily X member 1 |
| ENSP00000304283 | RAC3 | 1.08E-04 | 0.006 | 990 | 0.981 | a GTPase that regulates cell growth, cytoskeletal reorganization, and the activation of kinases |
a: Genes that have shown stimulative or suppressive effects on cancer as validated by experiments.
b: Genes that have been reported to have a certain relationship with cancer but that have not been validated by experiments.