Lei Chen, Tao Huang, Yu-Hang Zhang, Yang Jiang, Mingyue Zheng, Yu-Dong Cai.
Abstract
Tumors arise from the abnormal proliferation of somatic cells whose growth regulation has been disrupted by tumorigenic factors. The recent theory of "cancer drivers" links tumor initiation to specific mutations in so-called cancer driver genes. Based on four basic levels at which tumor tissue differs from adjacent normal tissue, cancer drivers can be divided into four groups: (1) methylation level, (2) microRNA level, (3) mutation level, and (4) mRNA level. In this study, we propose a computational method to identify novel lung adenocarcinoma drivers from genes that are dysfunctional on the methylation, microRNA, mutation, and mRNA levels. First, a large network was constructed from protein-protein interactions. Next, we searched all of the shortest paths connecting dysfunctional genes on different levels and extracted new candidate genes lying on these paths. Finally, the candidate genes were filtered by a permutation test and an additional strict selection procedure based on a betweenness ratio and an interaction score. The candidate genes that remained are each deemed to be related to two different levels of cancer. The analyses confirmed our assertion that some of them have the potential to contribute to tumorigenesis on multiple levels.
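The shortest-path step described in the abstract can be illustrated with a minimal sketch. It assumes an unweighted, undirected toy network and treats every non-seed node that lies on any shortest path between a gene in one set and a gene in the other as a candidate. The edge list, the node names, and the functions `bfs_distances` and `shortest_path_candidates` are hypothetical; how the study's network was actually built and weighted is not stated here.

```python
from collections import deque
from itertools import product


def bfs_distances(graph, source):
    """Return hop distances from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def shortest_path_candidates(graph, set_a, set_b):
    """Collect non-seed nodes on any shortest path between set_a and set_b."""
    seeds = set(set_a) | set(set_b)
    dist = {seed: bfs_distances(graph, seed) for seed in seeds}
    candidates = set()
    for s, t in product(set_a, set_b):
        if s == t or t not in dist[s]:
            continue  # same gene, or no path connects the pair
        d = dist[s][t]
        for v in graph:
            if v in seeds:
                continue
            # v is on a shortest s-t path iff its distances to s and t sum to d
            if v in dist[s] and v in dist[t] and dist[s][v] + dist[t][v] == d:
                candidates.add(v)
    return candidates


# Hypothetical toy network (assumed topology, not taken from the paper)
edges = [("a", "e"), ("e", "h"), ("h", "i"),
         ("a", "b"), ("b", "c"), ("c", "d"), ("d", "i")]
graph = {}
for u, v in edges:
    graph.setdefault(u, set()).add(v)
    graph.setdefault(v, set()).add(u)

candidates = shortest_path_candidates(graph, {"a"}, {"i"})
print(sorted(candidates))
```

In this toy network the route a-e-h-i (three hops) is shorter than a-b-c-d-i (four hops), so only e and h are returned as candidates.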
Year: 2016 PMID: 27412431 PMCID: PMC4944139 DOI: 10.1038/srep29849
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Flowchart of our method.
(A) Four gene sets consisting of dysfunctional genes on four levels; (B) SP method for searching candidates in a network. Yellow nodes represent dysfunctional genes on different levels; the dashed lines represent the shortest path connecting a and i, from which e and h are selected; (C) Six candidate gene sets obtained by the SP method; (D) Permutation test to filter out false positives. Two randomly produced sets, {d, f} and {c, g}, are shown in the network (highlighted in red and green): red nodes d and c replace yellow node a, while green nodes g and f replace yellow node i. Dotted lines represent the shortest path connecting d and f, dashed-dotted lines represent the shortest path connecting c and g, and e (highlighted in pink) is removed by the permutation test; (E) Six candidate gene sets filtered by the permutation test; (F) Six candidate gene sets filtered by further selection using betweenness and PPI.
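Panel (D) can be sketched in code under stated assumptions. The sketch below counts, for each candidate, the seed pairs whose shortest paths pass through it, then repeats the count for randomly drawn seed sets of the same sizes and reports the fraction of draws in which the random count is at least as large as the observed one. This scoring rule, the toy network, and the function names `pair_betweenness` and `permutation_scores` are illustrative assumptions, not the paper's exact definition of the permutation FDR.

```python
import random
from collections import deque
from itertools import product


def bfs_distances(graph, source):
    """Return hop distances from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def pair_betweenness(graph, set_a, set_b):
    """For each node, count seed pairs (s, t) with the node on a shortest s-t path."""
    dist = {n: bfs_distances(graph, n) for n in set(set_a) | set(set_b)}
    counts = {v: 0 for v in graph}
    for s, t in product(set_a, set_b):
        if s == t or t not in dist[s]:
            continue
        d = dist[s][t]
        for v in graph:
            # Unreachable nodes get a distance that can never sum to d
            if v not in (s, t) and dist[s].get(v, d + 1) + dist[t].get(v, d + 1) == d:
                counts[v] += 1
    return counts


def permutation_scores(graph, set_a, set_b, candidates, n_perm=200, seed=0):
    """Fraction of random seed draws whose count matches or exceeds the observed one."""
    rng = random.Random(seed)
    nodes = sorted(graph)
    observed = pair_betweenness(graph, set_a, set_b)
    exceed = {c: 0 for c in candidates}
    for _ in range(n_perm):
        rand_a = rng.sample(nodes, len(set_a))
        rand_b = rng.sample(nodes, len(set_b))
        random_counts = pair_betweenness(graph, rand_a, rand_b)
        for c in candidates:
            if random_counts[c] >= observed[c]:
                exceed[c] += 1
    return {c: exceed[c] / n_perm for c in candidates}


# Hypothetical toy network (assumed topology, not the one drawn in Figure 1)
edges = [("a", "e"), ("e", "h"), ("h", "i"), ("a", "b"), ("b", "c"),
         ("c", "d"), ("d", "i"), ("c", "g"), ("e", "f"), ("f", "g")]
graph = {}
for u, v in edges:
    graph.setdefault(u, set()).add(v)
    graph.setdefault(v, set()).add(u)

scores = permutation_scores(graph, ["a"], ["i"], ["e", "h"])
print(scores)
```

A candidate with a high score is reached about as often from random seeds as from the real ones, which is the situation in which the figure shows node e being removed. In practice the threshold and the number of permutations would be chosen by the analyst.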
Number of candidate genes obtained by the SP method and filtered by the permutation test and further selection.
| Pair of gene sets | Number of candidate genes obtained by SP method | Number of candidate genes filtered by permutation test | Number of candidate genes filtered by further selection using betweenness and PPI |
|---|---|---|---|
| | 1355 | 310 | 27 |
| | 723 | 242 | 42 |
| | 1606 | 455 | 39 |
| | 1402 | 357 | 45 |
| | 2515 | 485 | 33 |
| | 1705 | 431 | 56 |
G1: A set containing 153 methylated CpG site genes; G2: A set containing 825 microRNA target genes; G3: A set containing 197 somatic mutation genes; G4: A set containing 1,373 mRNA genes.
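To make the effect of each filter easier to read, the counts in the table can be turned into retention fractions. The sketch below reads each count as the number of genes remaining after the stage named in the column header, which is an interpretation of the table rather than something it states explicitly; the rows are kept in table order, and the helper `retention` is hypothetical.

```python
# (SP method, after permutation test, after further selection), in table row order
counts = [
    (1355, 310, 27),
    (723, 242, 42),
    (1606, 455, 39),
    (1402, 357, 45),
    (2515, 485, 33),
    (1705, 431, 56),
]


def retention(row):
    """Return the fraction kept by each filter and by both filters combined."""
    sp, perm, final = row
    return perm / sp, final / perm, final / sp


rates = [retention(row) for row in counts]
for row, (r_perm, r_sel, r_total) in zip(counts, rates):
    print(f"{row[0]:>5} -> {row[1]:>3} -> {row[2]:>2}  "
          f"permutation {r_perm:.1%}, selection {r_sel:.1%}, overall {r_total:.1%}")
```

Under this reading, every pair of gene sets keeps well under a tenth of its shortest-path candidates after both filters.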
Important candidate genes (based on methylated CpG site genes and microRNA target genes) identified by our method.
| Ensembl ID | Gene symbol | Description | Betweenness | Permutation FDR | Betweenness ratio | Min-Max interaction score |
|---|---|---|---|---|---|---|
| ENSP00000403005 | EFNA4 | Ephrin-A4 | 1597 | <0.001 | 0.014897 | 679 |
| ENSP00000337088 | MEN1 | Multiple Endocrine Neoplasia I | 7269 | <0.001 | 0.067808 | 719 |
| ENSP00000352262 | MLL | Myeloid/lymphoid or mixed-lineage leukemia (trithorax homolog, Drosophila) | 7342 | <0.001 | 0.068489 | 988 |
| ENSP00000405890 | PBX1 | Pre-B-Cell Leukemia Homeobox 1 | 2529 | <0.001 | 0.023591 | 822 |
| ENSP00000297261 | SHH | Sonic Hedgehog | 2800 | 0.001 | 0.026119 | 986 |
| ENSP00000262965 | TCF3 | Transcription Factor 3 | 2651 | <0.001 | 0.024729 | 985 |
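The relationship between the Betweenness and Betweenness ratio columns is not defined in this record. A quick arithmetic check, assuming the ratio is the betweenness divided by a constant specific to the pair of gene sets, is to divide one column by the other and see whether the quotient is stable across rows. The values below are copied from the table above; the variable names are illustrative.

```python
# (betweenness, betweenness ratio) copied from the table above
rows = {
    "EFNA4": (1597, 0.014897),
    "MEN1": (7269, 0.067808),
    "MLL": (7342, 0.068489),
    "PBX1": (2529, 0.023591),
    "SHH": (2800, 0.026119),
    "TCF3": (2651, 0.024729),
}

# Implied denominator under the assumption ratio = betweenness / constant
implied = {gene: b / r for gene, (b, r) in rows.items()}

# Relative spread between the largest and smallest implied denominator
spread = max(implied.values()) / min(implied.values()) - 1

for gene, value in implied.items():
    print(f"{gene:6s} {value:,.0f}")
print(f"relative spread: {spread:.2e}")
```

All six rows imply a denominator close to 107,200, with differences attributable to rounding of the ratio, which is consistent with the assumed normalization. What that constant represents is not stated here.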
Important candidate genes (based on methylated CpG site genes and somatic mutation genes) identified by our method.
| Ensembl ID | Gene symbol | Description | Betweenness | Permutation FDR | Betweenness ratio | Min-Max interaction score |
|---|---|---|---|---|---|---|
| ENSP00000344456 | CTNNB1 | Catenin (Cadherin-Associated Protein), Beta 1, 88kDa | 3240 | <0.001 | 0.126592 | 996 |
| ENSP00000287934 | FZD1 | Frizzled Class Receptor 1 | 382 | 0.002 | 0.014925 | 813 |
| ENSP00000337088 | MEN1 | Multiple Endocrine Neoplasia I | 1567 | <0.001 | 0.061225 | 719 |
| ENSP00000352262 | MLL | Myeloid/lymphoid or mixed-lineage leukemia (trithorax homolog, Drosophila) | 1581 | <0.001 | 0.061772 | 571 |
| ENSP00000297261 | SHH | Sonic Hedgehog | 706 | <0.001 | 0.027585 | 985 |
| ENSP00000262965 | TCF3 | Transcription Factor 3 | 571 | 0.002 | 0.02231 | 420 |
Important candidate genes (based on methylated CpG site genes and mRNA genes) identified by our method.
| Ensembl ID | Gene symbol | Description | Betweenness | Permutation FDR | Betweenness ratio | Min-Max interaction score |
|---|---|---|---|---|---|---|
| ENSP00000337088 | MEN1 | Multiple Endocrine Neoplasia I | 9748 | <0.001 | 0.059972 | 719 |
| ENSP00000297261 | SHH | Sonic Hedgehog | 4782 | <0.001 | 0.02942 | 995 |
| ENSP00000262965 | TCF3 | Transcription Factor 3 | 4274 | <0.001 | 0.026295 | 987 |
Important candidate genes (based on microRNA target genes and somatic mutation genes) identified by our method.
| Ensembl ID | Gene symbol | Description | Betweenness | Permutation FDR | Betweenness ratio | Min-Max interaction score |
|---|---|---|---|---|---|---|
| ENSP00000297268 | COL1A2 | Collagen, Type I, Alpha 2 | 2577 | <0.001 | 0.016865 | 985 |
| ENSP00000371138 | FKBP1A | FK506 Binding Protein 1A, 12kDa | 3140 | <0.001 | 0.02055 | 998 |
| ENSP00000358525 | NGF | Nerve Growth Factor (Beta Polypeptide) | 4923 | <0.001 | 0.032219 | 943 |
| ENSP00000401303 | SHC1 | SHC (Src Homology 2 Domain Containing) Transforming Protein 1 | 6291 | <0.001 | 0.041171 | 999 |
| ENSP00000348444 | TTN | Titin | 3248 | <0.001 | 0.021257 | 504 |
Important candidate genes (based on microRNA target genes and mRNA genes) identified by our method.
| Ensembl ID | Gene symbol | Description | Betweenness | Permutation FDR | Betweenness ratio | Min-Max interaction score |
|---|---|---|---|---|---|---|
| ENSP00000263253 | EP300 | E1A Binding Protein P300 | 60273 | <0.001 | 0.062112 | 995 |
| ENSP00000339007 | GRB2 | Growth Factor Receptor-Bound Protein 2 | 48282 | <0.001 | 0.049755 | 939 |
| ENSP00000296585 | ITGA2 | Integrin, Alpha 2 (CD49B, Alpha 2 Subunit Of VLA-2 Receptor) | 12251 | 0.002 | 0.012625 | 987 |
| ENSP00000293379 | ITGA5 | Integrin, Alpha 5 (Fibronectin Receptor, Alpha Polypeptide) | 25513 | 0.004 | 0.026291 | 964 |
| ENSP00000332353 | PTCH1 | Patched 1 | 17300 | <0.001 | 0.017828 | 939 |
| ENSP00000297261 | SHH | Sonic Hedgehog | 10778 | <0.001 | 0.011107 | 986 |
| ENSP00000354720 | SMC3 | Structural Maintenance Of Chromosomes 3 | 10413 | 0.001 | 0.010731 | 986 |
Important candidate genes (based on somatic mutation genes and mRNA genes) identified by our method.
| Ensembl ID | Gene symbol | Description | Betweenness | Permutation FDR | Betweenness ratio | Min-Max interaction score |
|---|---|---|---|---|---|---|
| ENSP00000242577 | DYNLL1 | Dynein, Light Chain, LC8-Type 1 | 6746 | <0.001 | 0.029117 | 803 |
| ENSP00000296585 | ITGA2 | Integrin, Alpha 2 (CD49B, Alpha 2 Subunit Of VLA-2 Receptor) | 9109 | <0.001 | 0.039317 | 959 |
| ENSP00000293379 | ITGA5 | Integrin, Alpha 5 (Fibronectin Receptor, Alpha Polypeptide) | 10741 | <0.001 | 0.046361 | 835 |
| ENSP00000277541 | NOTCH1 | Notch 1 | 11069 | <0.001 | 0.047776 | 948 |
| ENSP00000228307 | PXN | Paxillin | 3913 | <0.001 | 0.016889 | 702 |
Frequencies of some core candidate genes.
| Ensembl ID | Gene symbol | Description | Frequency | Pair of gene sets producing the candidate gene |
|---|---|---|---|---|
| ENSP00000332353 | PTCH1 | Patched 1 | 6 | |
| ENSP00000344456 | CTNNB1 | Catenin (Cadherin-Associated Protein), Beta 1, 88 kDa | 6 | |
| ENSP00000357656 | FYN | FYN Proto-Oncogene, Src Family Tyrosine Kinase | 6 | |
| ENSP00000162330 | BCAR1 | Breast Cancer Anti-Estrogen Resistance 1 | 5 | |
| ENSP00000297261 | SHH | Sonic Hedgehog | 5 | |
| ENSP00000358525 | NGF | Nerve Growth Factor (Beta Polypeptide) | 5 | |
| ENSP00000361125 | VEGFA | Vascular Endothelial Growth Factor A | 5 | |
| ENSP00000387662 | GCG | Glucagon | 5 | |
| ENSP00000261769 | CDH1 | Cadherin 1, Type 1, E-Cadherin (Epithelial) | 4 | |
| ENSP00000264657 | STAT3 | Signal Transducer And Activator Of Transcription 3 (Acute-Phase Response Factor) | 4 | |
| ENSP00000277541 | NOTCH1 | Notch 1 | 4 | |
| ENSP00000296585 | ITGA2 | Integrin, Alpha 2 (CD49B, Alpha 2 Subunit Of VLA-2 Receptor) | 4 | |
| ENSP00000312652 | LEP | Leptin | 4 | |
| ENSP00000350941 | SRC | SRC Proto-Oncogene, Non-Receptor Tyrosine Kinase | 4 |
G1: A set containing 153 methylated CpG site genes; G2: A set containing 825 microRNA target genes; G3: A set containing 197 somatic mutation genes; G4: A set containing 1,373 mRNA genes.
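A frequency of this kind can be obtained by counting how many pair-specific candidate sets contain each gene. The sketch below applies that counting to the gene symbols listed in the six "important candidate genes" tables of this record, keyed by the two levels named in each caption. The key labels and variable names are illustrative, and the assumption is that frequency means the number of pairs whose candidate set includes the gene.

```python
from collections import Counter

# Gene symbols copied from the six "important candidate genes" tables
important = {
    ("methylation", "microRNA"): ["EFNA4", "MEN1", "MLL", "PBX1", "SHH", "TCF3"],
    ("methylation", "mutation"): ["CTNNB1", "FZD1", "MEN1", "MLL", "SHH", "TCF3"],
    ("methylation", "mRNA"): ["MEN1", "SHH", "TCF3"],
    ("microRNA", "mutation"): ["COL1A2", "FKBP1A", "NGF", "SHC1", "TTN"],
    ("microRNA", "mRNA"): ["EP300", "GRB2", "ITGA2", "ITGA5", "PTCH1", "SHH", "SMC3"],
    ("mutation", "mRNA"): ["DYNLL1", "ITGA2", "ITGA5", "NOTCH1", "PXN"],
}

# Count each gene once per pair of levels in which it appears
frequency = Counter(gene for genes in important.values() for gene in set(genes))

# Record which pairs produced each gene, mirroring the table's last column
produced_by = {}
for pair, genes in important.items():
    for gene in genes:
        produced_by.setdefault(gene, []).append(pair)

for gene, count in frequency.most_common(3):
    print(gene, count, produced_by[gene])
```

Counts from these tables come out lower than the Frequency column (SHH appears in four of the six listed tables but has a frequency of 5), presumably because the tables list only highlighted genes rather than the full filtered candidate sets.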