Lei Chen, Yu-Hang Zhang, Zhenghua Zhang, Tao Huang, Yu-Dong Cai.
Abstract
Extensive studies of tumor suppressor genes (TSGs) help to clarify the pathogenesis of cancer and to design effective treatments. However, identifying TSGs with traditional experiments is difficult and time-consuming, so computational methods for identifying possible TSGs offer an alternative. In this study, we proposed two computational methods, each built on a network diffusion algorithm, either Laplacian heat diffusion (LHD) or random walk with restart (RWR), to search for possible TSGs across the whole network. We refer to these as the LHD-based and RWR-based methods. To increase the reliability of the putative genes, the genes obtained by each algorithm were then filtered by three strict screening tests. Comparing the putative genes obtained by the two methods, we designated 12 genes (e.g., MAP3K10, RND1, and OTX2) as common genes, 29 genes (e.g., RFC2 and GUCY2F) as genes identified only by the LHD-based method, and 128 genes (e.g., SNAI2 and FGF4) as genes inferred only by the RWR-based method. Some of the obtained genes can be confirmed as novel TSGs according to recent publications, suggesting the utility of the two proposed methods. In addition, the genes reported in this study were quite different from those reported in a previous study.
Keywords: Laplacian heat diffusion; permutation test; protein-protein interaction network; random walk with restart; tumor suppressor gene
Year: 2018 PMID: 30069494 PMCID: PMC6068090 DOI: 10.1016/j.omtm.2018.06.007
Source DB: PubMed Journal: Mol Ther Methods Clin Dev ISSN: 2329-0501 Impact factor: 6.698
Figure 1. The Procedures of the LHD-Based and RWR-Based Methods
The LHD-based method first applied the LHD algorithm on a PPI network using validated TSGs as seed nodes, producing a large number of LHD genes. Then, these genes were filtered by three screening tests. The RWR-based method followed similar procedures. The only difference was the application of the RWR algorithm on the PPI network rather than the LHD algorithm.
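The diffusion step described above can be illustrated with a minimal sketch. It assumes a toy five-node path graph in place of the PPI network, a single seed node standing in for the validated TSGs, and textbook forms of the two algorithms; the restart probability (0.8), the diffusion time (0.1), and all function and variable names are illustrative assumptions, not values or code taken from the study.

```python
import numpy as np

def random_walk_with_restart(adj, seeds, restart=0.8, tol=1e-10, max_iter=1000):
    """Iterate p <- (1 - r) * W p + r * p0 until the L1 change drops below tol."""
    adj = np.asarray(adj, dtype=float)
    col_sums = adj.sum(axis=0)
    col_sums[col_sums == 0] = 1.0          # guard against isolated nodes
    w = adj / col_sums                     # column-normalised transition matrix
    p0 = np.zeros(adj.shape[0])
    p0[list(seeds)] = 1.0 / len(seeds)     # walker starts uniformly on the seeds
    p = p0.copy()
    for _ in range(max_iter):
        p_next = (1 - restart) * (w @ p) + restart * p0
        if np.abs(p_next - p).sum() < tol:
            return p_next
        p = p_next
    return p

def laplacian_heat_diffusion(adj, seeds, t=0.1):
    """Return exp(-t L) h0 for L = D - A, assuming a symmetric adjacency matrix."""
    adj = np.asarray(adj, dtype=float)
    lap = np.diag(adj.sum(axis=1)) - adj
    eigvals, eigvecs = np.linalg.eigh(lap)  # L is symmetric, so eigh applies
    h0 = np.zeros(adj.shape[0])
    h0[list(seeds)] = 1.0                   # one unit of heat on each seed
    return eigvecs @ (np.exp(-t * eigvals) * (eigvecs.T @ h0))

# Toy path graph 0-1-2-3-4 (hypothetical), with node 0 as the only seed.
adj = np.zeros((5, 5))
for i in range(4):
    adj[i, i + 1] = adj[i + 1, i] = 1.0

rwr_scores = random_walk_with_restart(adj, seeds=[0])
heat_scores = laplacian_heat_diffusion(adj, seeds=[0])
print("RWR probabilities:", np.round(rwr_scores, 5))
print("LHD heat:         ", np.round(heat_scores, 5))
```

In both cases nodes closer to the seed receive higher scores, which is the property the two methods rely on when ranking candidate genes before the screening tests.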
Number of Candidate TSGs in Different Stages of LHD-Based and RWR-Based Methods
| Method | Network Diffusion Algorithm | Permutation Test | Association Test | Function Test |
|---|---|---|---|---|
| LHD-based method | 2,874 | 443 | 85 | 41 |
| RWR-based method | 5,889 | 1,364 | 980 | 140 |
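To make the filtering effect of each stage easier to read, the short sketch below computes the fraction of candidates retained from one stage to the next, using only the counts in the table above. The variable and function names are illustrative.

```python
# Candidate counts per stage, copied from the table above.
stage_names = ["network diffusion", "permutation test", "association test", "function test"]
counts = {
    "LHD-based": [2874, 443, 85, 41],
    "RWR-based": [5889, 1364, 980, 140],
}

def retention(stage_counts):
    """Fraction of candidates kept at each stage relative to the previous stage."""
    return [after / before for before, after in zip(stage_counts, stage_counts[1:])]

for method, stage_counts in counts.items():
    kept = ", ".join(
        f"{name}: {frac:.1%}"
        for name, frac in zip(stage_names[1:], retention(stage_counts))
    )
    overall = stage_counts[-1] / stage_counts[0]
    print(f"{method} -> {kept}; overall: {overall:.2%}")
```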
Figure 2. Venn Diagrams to Illustrate Putative Gene Sets Yielded by the Different Methods
(A) A Venn diagram illustrating the distribution of the 169 putative genes identified by either the LHD-based method or the RWR-based method. The red and blue circles represent the sets of putative genes yielded by the LHD-based method and the RWR-based method, respectively. Twelve genes were identified by both methods. (B) A Venn diagram illustrating the three putative gene sets yielded by three methods. The purple circle represents the gene set yielded by the LHD-based method, the yellow circle represents the gene set yielded by the RWR-based method, and the green circle represents the gene set reported in a previous study (SP-based method).
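The counts in panel (A) can be cross-checked with simple set arithmetic. The sketch below uses only the numbers reported in the abstract (12 common, 29 LHD-only, 128 RWR-only); the variable names are illustrative.

```python
# Region sizes of the two-set Venn diagram, as reported in the abstract.
common, lhd_only, rwr_only = 12, 29, 128

lhd_total = common + lhd_only               # all genes from the LHD-based method
rwr_total = common + rwr_only               # all genes from the RWR-based method
union_total = common + lhd_only + rwr_only  # genes found by either method

print(f"LHD-based total: {lhd_total}")
print(f"RWR-based total: {rwr_total}")
print(f"Union:           {union_total}")
print(f"Jaccard overlap: {common / union_total:.3f}")
```

The totals agree with the final column of the stage table (41 and 140) and with the 169 genes mentioned in the caption.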
Figure 3. The Subnetwork of PPI Network Containing the Linkages between Putative Genes and Validated TSGs
Pink nodes represent validated TSGs, while green, blue, and red nodes represent putative genes yielded by the LHD-based method, the RWR-based method, and both methods, respectively. The sizes of nodes in green, blue, and red represent their degrees. (A) The subnetwork containing all linkages between putative genes and validated TSGs. Edges in black, blue, green, and red represent PPIs with low, medium, high, and highest confidence, respectively. (B) The subnetwork only containing linkages between putative genes and validated TSGs with highest confidence.
24 Putative Genes Identified by at Least Two Methods
| Ensembl ID | Gene Symbol | LHD-Based Method | RWR-Based Method | SP-Based Method |
|---|---|---|---|---|
| ENSP00000020945 | SNAI2 | × | √ | √ |
| ENSP00000222330 | GSK3A | × | √ | √ |
| ENSP00000228682 | GLI1 | × | √ | √ |
| ENSP00000233948 | WNT6 | × | √ | √ |
| ENSP00000253055 | MAP3K10 | √ | √ | × |
| ENSP00000254480 | SMARCC1 | × | √ | √ |
| ENSP00000262158 | SMAD7 | × | √ | √ |
| ENSP00000287934 | FZD1 | × | √ | √ |
| ENSP00000293549 | WNT1 | × | √ | √ |
| ENSP00000308461 | RND1 | √ | √ | × |
| ENSP00000341032 | WNT7B | × | √ | √ |
| ENSP00000343819 | OTX2 | √ | √ | × |
| ENSP00000347942 | RET | √ | √ | × |
| ENSP00000354586 | GLI2 | √ | √ | × |
| ENSP00000358309 | EPHA7 | √ | √ | √ |
| ENSP00000361892 | STK4 | × | √ | √ |
| ENSP00000362139 | EPHA10 | √ | √ | × |
| ENSP00000363115 | FGR | √ | √ | × |
| ENSP00000364895 | ZBTB17 | √ | × | √ |
| ENSP00000365012 | HCK | √ | √ | × |
| ENSP00000368686 | E2F4 | × | √ | √ |
| ENSP00000370912 | TEC | √ | √ | × |
| ENSP00000381097 | EPHB1 | √ | √ | × |
| ENSP00000390500 | STK3 | √ | √ | × |
×, the putative gene cannot be identified by the method; √, the putative gene can be identified by the method.
SP-based method, the computational method proposed in Chen et al.’s study.
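As a cross-check of the table, the sketch below transcribes its 24 rows as three-character flags (LHD, RWR, SP; `1` for √ and `0` for ×) and counts the pairwise and three-way overlaps. Because the table lists only genes found by at least two methods, the sets built here are partial, not the full output of each method. The names `table`, `methods`, and `gene_sets` are illustrative.

```python
from itertools import combinations

# Rows transcribed from the table above: gene symbol, then LHD/RWR/SP flags.
table = """
SNAI2 011
GSK3A 011
GLI1 011
WNT6 011
MAP3K10 110
SMARCC1 011
SMAD7 011
FZD1 011
WNT1 011
RND1 110
WNT7B 011
OTX2 110
RET 110
GLI2 110
EPHA7 111
STK4 011
EPHA10 110
FGR 110
ZBTB17 101
HCK 110
E2F4 011
TEC 110
EPHB1 110
STK3 110
"""

methods = ("LHD", "RWR", "SP")
gene_sets = {method: set() for method in methods}
for line in table.strip().splitlines():
    symbol, flags = line.split()
    for method, flag in zip(methods, flags):
        if flag == "1":
            gene_sets[method].add(symbol)

# Pairwise overlaps, then the genes shared by all three methods.
for a, b in combinations(methods, 2):
    print(f"{a} & {b}: {len(gene_sets[a] & gene_sets[b])} genes")
shared_by_all = gene_sets["LHD"] & gene_sets["RWR"] & gene_sets["SP"]
print("All three:", sorted(shared_by_all))
```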
Important Putative Genes Yielded by Both LHD-Based and RWR-Based Methods
| Ensembl ID | Gene Symbol | LHD-Based Method: Heat | LHD-Based Method: p Value | RWR-Based Method: Probability | RWR-Based Method: p Value | MIS | MFS |
|---|---|---|---|---|---|---|---|
| ENSP00000253055 | MAP3K10 | 1.8567E−04 | 0.036 | 2.7819E−05 | 0.029 | 925 | 0.9954 |
| ENSP00000308461 | RND1 | 1.2137E−04 | 0.040 | 5.2943E−05 | <0.001 | 982 | 0.9942 |
| ENSP00000343819 | OTX2 | 4.1770E−04 | 0.028 | 3.7524E−05 | 0.024 | 984 | 0.9855 |
| ENSP00000347942 | RET | 1.3647E−04 | 0.048 | 1.1216E−04 | <0.001 | 984 | 0.9867 |
| ENSP00000354586 | GLI2 | 2.1655E−04 | 0.032 | 5.4405E−05 | <0.001 | 999 | 0.9847 |
MIS, maximum interaction score; MFS, maximum function score.
Important Putative Genes Yielded Only by the LHD-Based Method
| Ensembl ID | Gene Symbol | LHD-Based Method: Heat | LHD-Based Method: p Value | RWR-Based Method: Probability | RWR-Based Method: p Value | MIS | MFS |
|---|---|---|---|---|---|---|---|
| ENSP00000055077 | RFC2 | 3.1496E−04 | 0.036 | 1.8028E−05 | 0.316 | 999 | 0.9146 |
| ENSP00000218006 | GUCY2F | 2.1541E−04 | 0.044 | 1.8699E−05 | 0.544 | 904 | 0.9930 |
| ENSP00000238558 | GSC | 9.4478E−04 | 0.030 | 1.1429E−05 | 0.170 | 977 | 0.9595 |
| ENSP00000241261 | TNFSF10 | 4.9214E−04 | 0.010 | 3.9617E−05 | 0.005 | 999 | 0.9460 |
| ENSP00000261731 | LHX5 | 6.8836E−04 | 0.018 | 1.4245E−05 | 0.082 | 914 | 0.9436 |
| ENSP00000261980 | VSX2 | 1.0335E−03 | 0.016 | 1.6524E−05 | 0.055 | 910 | 0.9491 |
| ENSP00000266058 | SLIT1 | 3.3237E−04 | 0.012 | 4.0380E−05 | <0.001 | 959 | 0.9608 |
MIS, maximum interaction score; MFS, maximum function score.
Important Putative Genes Yielded Only by the RWR-Based Method
| Ensembl ID | Gene Symbol | RWR-Based Method: Probability | RWR-Based Method: p Value | LHD-Based Method: Heat | LHD-Based Method: p Value | MIS | MFS |
|---|---|---|---|---|---|---|---|
| ENSP00000020945 | SNAI2 | 5.3473E−05 | 0.002 | 3.9529E−05 | – | 998 | 0.9825 |
| ENSP00000168712 | FGF4 | 3.8685E−05 | <0.001 | 4.1323E−05 | – | 936 | 0.9836 |
| ENSP00000222330 | GSK3A | 6.2092E−05 | 0.010 | 1.0955E−04 | 0.076 | 999 | 0.9921 |
| ENSP00000222462 | WNT16 | 3.6751E−05 | <0.001 | 4.0873E−05 | – | 919 | 0.9854 |
| ENSP00000222598 | DLX5 | 2.9495E−05 | 0.028 | 4.7228E−05 | – | 936 | 0.9853 |
MIS, maximum interaction score; MFS, maximum function score; –, the corresponding gene received a heat lower than the threshold of heat in the LHD-based method, i.e., it was not selected as LHD genes. Thus, the p value was not available for this gene.
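The p values in the tables above come from the permutation test. A generic sketch of such a test is given below, under stated assumptions: the p value of a node is taken as the fraction of random seed sets (same size as the real seed set) on which its diffusion score is at least as high as its observed score. This definition, the number of permutations, the restart probability, the toy graph, and all names are assumptions for illustration and may differ from the procedure used in the study.

```python
import numpy as np

def rwr(adj, seeds, restart=0.8, n_iter=200):
    """Random walk with restart by fixed-point iteration (no isolated nodes assumed)."""
    w = adj / adj.sum(axis=0)              # column-normalised transition matrix
    p0 = np.zeros(len(adj))
    p0[list(seeds)] = 1.0 / len(seeds)
    p = p0.copy()
    for _ in range(n_iter):
        p = (1 - restart) * (w @ p) + restart * p0
    return p

def permutation_p_values(adj, seeds, n_perm=200, rng_seed=0):
    """Empirical p value per node: share of random seed sets scoring >= observed."""
    rng = np.random.default_rng(rng_seed)
    n = len(adj)
    observed = rwr(adj, seeds)
    exceed = np.zeros(n)
    for _ in range(n_perm):
        random_seeds = rng.choice(n, size=len(seeds), replace=False)
        exceed += rwr(adj, random_seeds) >= observed
    return exceed / n_perm

# Toy path graph 0-1-2-3-4-5 (hypothetical) with nodes 0 and 1 as seeds.
adj = np.zeros((6, 6))
for i in range(5):
    adj[i, i + 1] = adj[i + 1, i] = 1.0

p_values = permutation_p_values(adj, seeds=[0, 1])
print(np.round(p_values, 3))
```

A node whose high score depends on the specific seed set receives a small p value, whereas a node that scores highly for most random seed sets (for example, a hub) receives a large one and would be filtered out.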