Bi-Qing Li, Jin You, Lei Chen, Jian Zhang, Ning Zhang, Hai-Peng Li, Tao Huang, Xiang-Yin Kong, Yu-Dong Cai.
Abstract
Lung cancer is one of the leading causes of cancer mortality worldwide. The main types of lung cancer are small cell lung cancer (SCLC) and non-small cell lung cancer (NSCLC). In this work, we propose a computational method for identifying lung-cancer-related genes with a shortest path approach in a protein-protein interaction (PPI) network. A weighted PPI network was constructed from the PPI data in STRING, and 54 NSCLC-related and 84 SCLC-related genes were retrieved from the associated KEGG pathways. The shortest paths between each pair of the 54 NSCLC genes, and between each pair of the 84 SCLC genes, were then obtained with Dijkstra's algorithm. Finally, all the genes on the shortest paths were extracted, and the 25 NSCLC and 38 SCLC shortest path genes with a permutation P value less than 0.05 were selected for further analysis. Some of these shortest path genes have been reported to be related to lung cancer. Intriguingly, the candidate genes identified from the PPI network contained a higher proportion of cancer genes than those identified from gene expression profiles. Furthermore, these genes were functionally more similar to the known cancer genes than those identified from the gene expression profiles. This study demonstrated the effectiveness of the proposed method and showed promising results.
Year: 2013 PMID: 23762832 PMCID: PMC3674655 DOI: 10.1155/2013/267375
Source DB: PubMed Journal: Biomed Res Int Impact factor: 3.411
Figure 1: 1171 shortest paths between 54 NSCLC genes. The 1171 shortest paths between the 54 NSCLC genes were identified with Dijkstra's algorithm based on PPI data from STRING. Yellow circles represent the 54 NSCLC genes. Red circles represent the 114 genes lying on the shortest paths. Numbers on edges are the edge weights, which quantify the interaction confidence: the smaller the number, the stronger the interaction between the two nodes.
Figure 2: 3916 shortest paths between 84 SCLC genes. The 3916 shortest paths between the 84 SCLC genes were identified with Dijkstra's algorithm based on PPI data from STRING. Yellow circles represent the 84 SCLC genes. Red circles represent the 161 genes lying on the shortest paths. Numbers on edges are the edge weights, which quantify the interaction confidence: the smaller the number, the stronger the interaction between the two nodes.
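The shortest path step described in the abstract and drawn in the two figures can be sketched as follows. This is a minimal illustration, not the authors' code. The toy edge list, the helper names (`build_graph`, `dijkstra_path`, `path_gene_counts`, `permutation_p_values`), the conversion of a 0 to 1000 confidence score into a weight of `1000 - score`, and the exact form of the permutation test (comparing the number of seed-pair shortest paths through a gene against randomly drawn seed sets of the same size) are all assumptions, chosen only to agree with "the smaller the number, the stronger the interaction".

```python
import heapq
import random
from collections import Counter
from itertools import combinations


def build_graph(scored_edges):
    """Undirected weighted graph; weight = 1000 - score (assumed transform)."""
    graph = {}
    for u, v, score in scored_edges:
        weight = 1000 - score
        graph.setdefault(u, {})[v] = weight
        graph.setdefault(v, {})[u] = weight
    return graph


def dijkstra_path(graph, source, target):
    """Return one minimum-weight path from source to target, or None."""
    dist = {source: 0}
    prev = {}
    heap = [(0, source)]
    done = set()
    while heap:
        d, node = heapq.heappop(heap)
        if node in done:
            continue
        done.add(node)
        if node == target:
            break
        for nbr, weight in graph.get(node, {}).items():
            nd = d + weight
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                prev[nbr] = node
                heapq.heappush(heap, (nd, nbr))
    if target not in dist:
        return None
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]


def path_gene_counts(graph, seeds):
    """For each non-seed gene, count the seed-pair shortest paths through it."""
    seed_set = set(seeds)
    counts = Counter()
    for a, b in combinations(sorted(seed_set), 2):
        path = dijkstra_path(graph, a, b)
        if path:
            counts.update(g for g in path[1:-1] if g not in seed_set)
    return counts


def permutation_p_values(graph, seeds, n_perm=200, rng=None):
    """Fraction of random seed sets giving a gene at least its observed count."""
    rng = rng or random.Random(0)
    observed = path_gene_counts(graph, seeds)
    nodes = sorted(graph)
    exceed = Counter()
    for _ in range(n_perm):
        null = path_gene_counts(graph, rng.sample(nodes, len(set(seeds))))
        for gene, count in observed.items():
            if null[gene] >= count:
                exceed[gene] += 1
    return {gene: exceed[gene] / n_perm for gene in observed}


# Hypothetical toy network: (gene, gene, confidence score in 0..1000).
toy_edges = [("A", "X", 900), ("X", "B", 900), ("A", "B", 100),
             ("B", "Y", 800), ("Y", "C", 800), ("A", "C", 200)]
toy_graph = build_graph(toy_edges)
toy_counts = path_gene_counts(toy_graph, ["A", "B", "C"])
toy_p = permutation_p_values(toy_graph, ["A", "B", "C"])
```

In the toy network the high-confidence route from `A` to `B` through `X` (weight 100 + 100) beats the low-confidence direct edge (weight 900), so `X` is picked up as a shortest path gene even though it is not a seed.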
KEGG enrichment analysis of 38 SCLC shortest path genes.
| Term | Count^a | Percentage^b | P value | Benjamini adjusted |
|---|---|---|---|---|
| Focal adhesion | 8 | 21.1 | 1.40 | 6.70 |
| Regulation of actin cytoskeleton | 7 | 18.4 | 2.20 | 5.40 |
| Arrhythmogenic right ventricular cardiomyopathy (ARVC) | 5 | 13.2 | 2.70 | 4.40 |
| ECM-receptor interaction | 5 | 13.2 | 4.00 | 4.80 |
| Hypertrophic cardiomyopathy (HCM) | 5 | 13.2 | 4.20 | 4.10 |
| Dilated cardiomyopathy | 5 | 13.2 | 5.70 | 4.60 |
| Cell cycle | 5 | 13.2 | 1.80 | 1.20 |
| p53-signaling pathway | 4 | 10.5 | 2.90 | 1.70 |
^a The number of shortest path genes belonging to the pathway.
^b The percentage of all genes submitted to KEGG pathway analysis that belong to the pathway.
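The percentage column follows directly from the counts: each count is divided by the 38 genes submitted to the analysis. The sketch below reproduces that arithmetic and illustrates a Benjamini-Hochberg adjustment. It assumes that "Benjamini adjusted" refers to the Benjamini-Hochberg procedure, and the function names and the P values in `hypothetical_p` are invented for illustration; they are not values from the table.

```python
def percentage(count, total):
    """Share of the analysed genes that fall in one pathway, in percent."""
    return round(100.0 * count / total, 1)


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Counts taken from the table; 38 genes were analysed.
counts = {"Focal adhesion": 8, "Regulation of actin cytoskeleton": 7,
          "Cell cycle": 5, "p53-signaling pathway": 4}
percentages = {term: percentage(c, 38) for term, c in counts.items()}

# Hypothetical raw P values, for illustration only.
hypothetical_p = [0.01, 0.04, 0.03, 0.005]
adjusted_p = benjamini_hochberg(hypothetical_p)
```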
Overlap between candidate genes and cancer-related genes.
| Gene set | Number of candidate genes | Overlap with 742 cancer genes | P value |
|---|---|---|---|
| NSCLC from array | 1825 | 93 | 6.698 |
| SCLC from array | 1063 | 69 | 2.218 |
| NSCLC in our study | 25 | 6 | 2.518 |
| SCLC in our study | 38 | 5 | 2.559 |
P value was calculated with the hypergeometric test assuming the total number of protein-coding genes was 20000.
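A hypergeometric overlap test of this kind can be sketched as below. The sketch assumes the reported P value is the upper tail, that is, the probability of seeing at least the observed overlap when the candidate genes are drawn at random from 20000 protein-coding genes of which 742 are cancer genes. The function name `hypergeom_upper_tail` is hypothetical, and the result is not guaranteed to match the published figures.

```python
from math import comb


def hypergeom_upper_tail(population, successes, draws, observed):
    """P(X >= observed) for X ~ Hypergeometric(population, successes, draws)."""
    upper = min(successes, draws)
    # Sum exact integer counts first, then divide once to limit rounding error.
    tail = sum(comb(successes, i) * comb(population - successes, draws - i)
               for i in range(observed, upper + 1))
    return tail / comb(population, draws)


# NSCLC shortest path genes: 6 of 25 candidates are among the 742 cancer genes.
p_nsclc = hypergeom_upper_tail(20000, 742, 25, 6)
```

Working with exact integers from `math.comb` avoids the underflow that a naive product of floating-point probabilities would hit for the larger array-derived gene sets.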
Comparison of the overlap with cancer-related genes between candidate gene sets.
| Gene set | Number of candidate genes | Overlap with 742 cancer genes | P value |
|---|---|---|---|
| NSCLC from array | 1825 | 93 | |
| NSCLC in our study | 25 | 6 | 3.858 |
| SCLC from array | 1063 | 69 | |
| SCLC in our study | 38 | 5 | 0.186 |
P value was calculated with Fisher's exact test.
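Fisher's exact test on a 2x2 table of overlapping versus non-overlapping genes can be sketched as follows. The layout of the table (one row per gene set, columns for "in the 742 cancer genes" and "not in them"), the treatment of the two gene sets as independent, and the function name `fisher_exact_2x2` are assumptions. The source does not say whether a one-sided or two-sided test was used, so the sketch returns both.

```python
from math import comb


def fisher_exact_2x2(a, b, c, d):
    """Fisher's exact test for [[a, b], [c, d]].

    Returns (one-sided P for a count >= a, two-sided P).
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Numerators of the hypergeometric probabilities for every table that
    # keeps the margins fixed; all share the denominator comb(total, col1).
    numer = {x: comb(row1, x) * comb(row2, col1 - x) for x in range(lo, hi + 1)}
    denom = comb(total, col1)
    greater = sum(v for x, v in numer.items() if x >= a) / denom
    two_sided = sum(v for v in numer.values() if v <= numer[a]) / denom
    return greater, two_sided


# NSCLC: 6 of 25 shortest path genes versus 93 of 1825 array genes.
nsclc_greater, nsclc_two_sided = fisher_exact_2x2(6, 25 - 6, 93, 1825 - 93)
```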
The functional similarity between identified lung cancer genes and 742 cancer genes.
| | 742 cancer genes |
|---|---|
| 1825 NSCLC genes from array | 0.4314* |
| 1063 SCLC genes from array | 0.4845* |
| 25 NSCLC genes from our study | 0.5554* |
| 38 SCLC genes from our study | 0.6919* |
*Pearson correlation coefficient of functional profiles.
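The Pearson correlation coefficient between two functional profiles can be computed as sketched below. How the profiles were built is not stated here, so the sketch simply assumes each gene set is summarised as a numeric vector with one entry per functional term; the function name `pearson` and the example vectors are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two vectors of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / sqrt(var_x * var_y)


# Hypothetical profiles: one score per functional term for each gene set.
candidate_profile = [1.0, 2.0, 3.0, 4.0]
cancer_profile = [1.0, 3.0, 2.0, 4.0]
similarity = pearson(candidate_profile, cancer_profile)
```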