Ravindra Kumar, Sabindra K Samal, Samapika Routray, Rupesh Dash, Anshuman Dixit.
Abstract
In recent years, bioinformatics methods have been reported to identify candidate genes with a high degree of success. In this context, we used an integrated bioinformatics approach that combines information from gene ontologies (GO), protein-protein interactions (PPI) and network analysis to predict candidate genes related to oral squamous cell carcinoma (OSCC). A total of 40973 PPIs among 4704 cancer-related genes were used to construct a human cancer gene network (HCGN). The importance of each node in HCGN was measured by ten different centrality measures. We show that the top ranking genes are related to a significantly higher number of diseases than other genes in HCGN. A total of 39 candidate oral cancer target genes were predicted by combining the top ranked genes with the genes corresponding to significantly enriched oral cancer related GO terms. Initial verification using the literature and available experimental data indicated that 29 of these genes were related to OSCC. A detailed pathway analysis led us to propose a role for the selected candidate genes in invasion and metastasis in OSCC. We further validated our predictions using immunohistochemistry (IHC) and found that the gene FLNA was upregulated while the genes ARRB1 and HTT were downregulated in the OSCC tissue samples.
Year: 2017 PMID: 28559546 PMCID: PMC5449392 DOI: 10.1038/s41598-017-02522-5
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. The flowchart of the methodology. The data sources (HIPPIE, cancer candidate genes and oral cancer genes) are indicated at the top. The HCGN was created from HIPPIE interactions among cancer candidate genes. The top ranking genes in HCGN (Set A, 227 genes) were identified using a combination of centralities. The genes common to the oral cancer genes (465) and the genes in HCGN (4704) were defined as oral cancer genes in HCGN (Set B, 297 genes). GO enrichment was done using oral cancer genes and cancer candidate genes, and the genes corresponding to significantly enriched GO terms were identified (Set C, 530 genes). The candidate oral cancer genes (Set D, 39 genes) were predicted using Set A, Set B and Set C. Sets E and F contain the oral cancer genes in HCGN that are top ranked or correspond to enriched GO terms, respectively. Set G contains the oral cancer genes in HCGN that are both top ranked and correspond to enriched GO terms. Pathway and cluster analyses were done on the predicted genes (Set D), and selected candidates were validated.
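The set relationships in this flowchart can be written as plain set algebra. The sketch below is illustrative only: the function name and the toy gene labels are hypothetical, and Set D is taken as the genes in both Set A and Set C but not in Set B, which is the rule given in the Figure 3 caption.

```python
def derive_gene_sets(top_ranked, oral_in_hcgn, go_enriched):
    """Derive Sets D to G from Sets A, B and C (hypothetical helper)."""
    a, b, c = set(top_ranked), set(oral_in_hcgn), set(go_enriched)
    return {
        "D": (a & c) - b,  # top ranked and GO-enriched, not a known oral cancer gene
        "E": a & b,        # known oral cancer genes that are top ranked
        "F": b & c,        # known oral cancer genes tied to enriched GO terms
        "G": a & b & c,    # known oral cancer genes that satisfy both
    }


# Toy labels, not real gene symbols.
set_a = {"G1", "G2", "G3", "G4"}
set_b = {"G3", "G4", "G5", "G6"}
set_c = {"G2", "G4", "G6", "G7"}
gene_sets = derive_gene_sets(set_a, set_b, set_c)
print({name: sorted(genes) for name, genes in gene_sets.items()})
```

With the real inputs, Set B itself would be the intersection of the 465 oral cancer genes with the 4704 genes in HCGN.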
Figure 2. The degree distribution of HCGN, where K is the degree and P(K) is the fraction of nodes in the network with degree K. The exponent value of 1.4302 indicates that the network is a scale-free network.
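A scale-free degree distribution follows a power law, P(K) ∝ K^(−γ), so the exponent can be estimated from the slope of log P(K) against log K. The sketch below assumes an ordinary least-squares fit on the log-log points; the fitting procedure behind the reported exponent is not stated here, and the function names are hypothetical.

```python
import math
from collections import Counter


def degree_distribution(degrees):
    """Map each positive degree K to the fraction of nodes P(K) with that degree."""
    n = len(degrees)
    counts = Counter(d for d in degrees if d > 0)
    return {k: count / n for k, count in counts.items()}


def fit_power_law_exponent(pk):
    """Least-squares slope of log P(K) vs log K, returned as a positive exponent."""
    xs = [math.log(k) for k in pk]
    ys = [math.log(p) for p in pk.values()]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return -sxy / sxx


# Synthetic check: points lying exactly on K^-1.5 should give an exponent of 1.5.
synthetic = {k: k ** -1.5 for k in range(1, 50)}
print(round(fit_power_law_exponent(synthetic), 4))
```

A simple log-log regression is sensitive to the noisy tail of the distribution, so it should be read as a rough estimate rather than a rigorous test of scale-freeness.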
The distribution of oral cancer genes in HCGN in top ranked lists.
| Centrality | Degree | Closeness | Centroid | SPB | Eigenvector | PageRank | CFC | CFB | Stress | Vulnerability | Unique pooled genes | Consensus ranking |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | | | | | | | | | | | | |
| Oral cancer genes# | 2 | 22 | 18 | 24 | 26 | 13 | 27 | 21 | 23 | 21 | 37 | 28 |
| Total genes | 4 | 67 | 46 | 76 | 90 | 47 | 90 | 71 | 74 | 84 | 152 | 91 |
| Precision* | 50.00 | 32.84 | 39.13 | 31.58 | 28.89 | 27.66 | 30.00 | 29.58 | 31.08 | 25.00 | 24.34 | 30.77 |
| | | | | | | | | | | | | |
| Oral cancer genes# | 5 | 41 | 31 | 43 | 46 | 32 | 51 | 43 | 44 | 45 | 71 | 56 |
| Total genes | 9 | 170 | 109 | 193 | 221 | 121 | 221 | 181 | 180 | 211 | 359 | 227 |
| Precision* | 55.56 | 24.12 | 28.44 | 22.28 | 20.81 | 26.45 | 23.08 | 23.76 | 24.44 | 21.33 | 19.77 | 24.67 |
| | | | | | | | | | | | | |
| Oral cancer genes# | 8 | 55 | 37 | 59 | 68 | 42 | 65 | 56 | 56 | 58 | 94 | 57 |
| Total genes | 13 | 237 | 157 | 269 | 309 | 168 | 308 | 252 | 251 | 295 | 517 | 239 |
| Precision* | 61.54 | 23.21 | 23.57 | 21.93 | 22.01 | 25.00 | 21.10 | 22.22 | 22.31 | 19.66 | 18.18 | 23.85 |
| | | | | | | | | | | | | |
| Oral cancer genes# | 8 | 75 | 52 | 77 | 85 | 53 | 83 | 71 | 73 | 70 | 118 | 89 |
| Total genes | 18 | 354 | 230 | 383 | 442 | 239 | 440 | 359 | 357 | 423 | 762 | 448 |
| Precision* | 44.44 | 21.19 | 22.61 | 20.10 | 19.23 | 22.18 | 18.86 | 19.78 | 20.45 | 16.55 | 15.48 | 19.87 |
*Precision is the percentage of oral cancer genes in total genes. #Common out of 297 oral cancer genes in HCGN.
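The precision values in the table follow directly from the footnote: the number of known oral cancer genes in a top ranked list divided by the size of that list, times 100. As a rough sketch, the snippet below ranks a toy network by degree (one of the ten centralities), keeps a top fraction and computes that precision. The function names, the toy edge list and the 0.4 cut-off are hypothetical; the cut-offs used for the four row groups are not given here.

```python
def degree_centrality(edges):
    """Count the distinct neighbours of every node in an undirected edge list."""
    neighbours = {}
    for u, v in edges:
        neighbours.setdefault(u, set()).add(v)
        neighbours.setdefault(v, set()).add(u)
    return {node: len(nbrs) for node, nbrs in neighbours.items()}


def top_ranked(scores, fraction):
    """Return the highest-scoring fraction of nodes (ties broken by name)."""
    ranked = sorted(scores, key=lambda node: (-scores[node], node))
    keep = max(1, round(len(ranked) * fraction))
    return ranked[:keep]


def precision_percent(selected, known):
    """Percentage of the selected genes that are known oral cancer genes."""
    selected = set(selected)
    if not selected:
        return 0.0
    return 100.0 * len(selected & set(known)) / len(selected)


# Toy network and toy "known" genes, not real data.
toy_edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("D", "E")]
toy_known = {"A", "C"}
toy_top = top_ranked(degree_centrality(toy_edges), fraction=0.4)
print(toy_top, precision_percent(toy_top, toy_known))
```

Applied to the first Closeness entry of the table (22 known genes among 67 selected), the same formula gives 32.84.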
Analysis of different criteria to select specific GO terms.
| S. No. | Criteria | Number of oral cancer genes in HCGN | Total genes | Precision* |
|---|---|---|---|---|
| 1 | 3 ≤ B ≤ 50 | 188 | 935 | 20.10 |
| 2 | Top 20% enriched GO terms | 170 | 684 | 24.85 |
| 3 | Top 20% enriched GO terms and redundancy removal | 152 | 530 | 28.67 |
*Percentage of oral cancer genes in total genes.
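The first two criteria in the table can be read as a range filter followed by a top-fraction cut-off. The sketch below assumes that B is a per-term gene count and that terms are ranked by an enrichment p-value (smaller is more enriched); neither is stated here. The field names `b` and `p_value` and the toy terms are hypothetical, and the redundancy-removal step is not sketched.

```python
def select_go_terms(terms, min_b=3, max_b=50, top_fraction=0.20):
    """Keep terms with min_b <= B <= max_b, then the most enriched top fraction."""
    in_range = [t for t in terms if min_b <= t["b"] <= max_b]
    if not in_range:
        return []
    # Assumption: a smaller p-value means stronger enrichment.
    ranked = sorted(in_range, key=lambda t: t["p_value"])
    keep = max(1, round(len(ranked) * top_fraction))
    return ranked[:keep]


# Toy terms: two fall outside the B range, ten remain, and the top 20% is two terms.
toy_terms = [
    {"id": f"term{i}", "b": b, "p_value": 0.01 * (i + 1)}
    for i, b in enumerate([2, 3, 10, 20, 50, 51, 5, 6, 7, 8, 9, 11])
]
selected_terms = select_go_terms(toy_terms)
print([t["id"] for t in selected_terms])
```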
Figure 3. Prediction of candidate genes using top ranked genes (Set A), oral cancer genes in HCGN (Set B) and genes corresponding to significantly enriched GO terms (Set C). The genes shared by Set A and Set C but not by Set B were predicted as candidate genes.
The candidate oral cancer genes.
| S. No. | Gene ID | Gene Symbol | HCGN Ranking | Degree with known oral cancer genes | Expression Atlas (mRNA) | Oncomine mRNA expression* |
|---|---|---|---|---|---|---|
| 1 | 3320 | HSP90AA1 | 1 | 55 | Up | |
| 2 | 8452 | CUL3 | 2 | 59 | NA | |
| 3 | 2099 | ESR1 | 3 | 41 | Up | Down |
| 4 | 3312 | HSPA8 | 4 | 30 | Up | Up |
| 5 | 1499 | CTNNB1 | 5 | 32 | NA | |
| 6 | 6714 | SRC | 6 | 0 | NA | |
| 7 | 4088 | SMAD3 | 7 | 20 | Up | NA |
| 8 | 5071 | PARK2 | 8 | 21 | Down | Down |
| 9 | 5970 | RELA | 9 | 23 | NA | |
| 10 | 207 | AKT1 | 10 | 28 | NA | |
| 11 | 408 | ARRB1 | 11 | 18 | NA | |
| 12 | 7186 | TRAF2 | 12 | 21 | NA | |
| 13 | 1457 | CSNK2A1 | 13 | 25 | Up | |
| 14 | 7013 | TERF1 | 14 | 15 | Up | |
| 15 | 5591 | PRKDC | 15 | 25 | Up | |
| 16 | 3064 | HTT | 15 | 15 | NA | |
| 17 | 23411 | SIRT1 | 17 | 20 | NA | |
| 18 | 5518 | PPP2R1A | 18 | 14 | NA | |
| 19 | 10014 | HDAC5 | 19 | 21 | NA | |
| 20 | 4176 | MCM7 | 20 | 21 | Up | |
| 21 | 3676 | ITGA4 | 21 | 38 | Up | |
| 22 | 1432 | MAPK14 | 22 | 15 | NA | |
| 23 | 808 | CALM3 | 23 | 12 | NA | |
| 24 | 805 | CALM2 | 23 | 12 | NA | |
| 25 | 801 | CALM1 | 23 | 12 | NA | |
| 26 | 4851 | NOTCH1 | 26 | 10 | NA | |
| 27 | 5515 | PPP2CA | 27 | 16 | Up | |
| 28 | 2316 | FLNA | 28 | 14 | Up | Up |
| 29 | 5894 | RAF1 | 29 | 19 | NA | |
| 30 | 8841 | HDAC3 | 30 | 11 | NA | |
| 31 | 8826 | IQGAP1 | 31 | 12 | Down | NA |
| 32 | 5747 | PTK2 | 32 | 23 | Up | |
| 33 | 6657 | SOX2 | 33 | 7 | Down | NA |
| 34 | 6195 | RPS6KA1 | 34 | 9 | NA | |
| 35 | 5580 | PRKCD | 35 | 15 | NA | |
| 36 | 3717 | JAK2 | 36 | 15 | NA | |
| 37 | 998 | CDC42 | 37 | 5 | Up | |
| 38 | 6709 | SPTAN1 | 38 | 12 | NA | |
| 39 | 3688 | ITGB1 | 36 | 14 | Down | NA |
*The mRNA expression status (oral cancer vs normal) of a gene as reported in the Oncomine database and Expression Atlas. NA = Not reported.
The pathway enrichment analysis.
| Enriched Pathways | Genes* |
|---|---|
| B cell activation | CALM1, CALM2, CALM3, MAPK14, PRKCD, RAF1 |
| Heterotrimeric G-protein signaling pathway-rod outer segment phototransduction | CALM1, CALM2, CALM3 |
| VEGF signaling pathway | AKT1, MAPK14, PRKCD, PTK2, RAF1 |
| p53 pathway feedback loops 2 | AKT1, CTNNB1, MAPK14, PPP2CA |
| CCKR signaling map | AKT1, CALM1, CALM2, CALM3, CDC42, CTNNB1, ITGB1, JAK2, MAPK14, PRKCD, PTK2, RAF1, RPS6KA1, SRC |
| T cell activation | AKT1, CALM1, CALM2, CALM3, CDC42, RAF1 |
| Ras Pathway | AKT1, CDC42, MAPK14, RAF1, RPS6KA1 |
| Angiogenesis | AKT1, CTNNB1, ITGA4, ITGB1, MAPK14, NOTCH1, PRKCD, PTK2, RAF1, SRC |
| FGF signaling pathway | AKT1, ITGA4, ITGB1, MAPK14, PPP2CA, PPP2R1A, PRKCD, RAF1 |
| Apoptosis signaling pathway | AKT1, HSPA8, PRKCD, RELA, SP101, TRAF2 |
| p53 pathway | AKT1, PPP2CA, SIRT1, TRAF2 |
| Gonadotropin releasing hormone receptor pathway | AKT1, CDC42, CTNNB1, HTT, ITGB1, MAPK14, PRKCD, PTK2, RAF1, RELA, SMAD3, SRC |
| Parkinson disease | CSNK2A1, HSPA8, MAPK14, PARK2, SRC |
| Endothelin signaling pathway | AKT1, ARRB1, PRKCD, RAF1 |
| EGF receptor signaling pathway | AKT1, MAPK14, PPP2CA, PRKCD, RAF1 |
| Integrin signalling pathway | CDC42, FLNA, ITGA4, ITGB1, PTK2, RAF1, SRC |
| Inflammation mediated by chemokine and cytokine signaling pathway | AKT1, ARRB1, ITGA4, ITGB1, JAK2, PRKCD, RAF1, RELA |
*The predicted genes related to the corresponding pathway.
Figure 4. The enriched pathways (A) and functions (B), shown with their number of genes and fold change.
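For readers unfamiliar with enrichment statistics, the sketch below shows one common way such numbers are computed: the ratio of observed to expected overlap, and a one-sided hypergeometric p-value. It assumes that the fold change in the caption is this observed-over-expected ratio, which is not confirmed here. The function names and the toy counts are hypothetical, and the tool and test actually used are not stated in this record.

```python
from math import comb


def fold_enrichment(hits, list_size, pathway_size, background_size):
    """Observed overlap divided by the overlap expected by chance."""
    return hits * background_size / (list_size * pathway_size)


def hypergeom_pvalue(hits, list_size, pathway_size, background_size):
    """P(overlap >= hits) when list_size genes are drawn from the background."""
    upper = min(list_size, pathway_size)
    favourable = sum(
        comb(pathway_size, i) * comb(background_size - pathway_size, list_size - i)
        for i in range(hits, upper + 1)
    )
    return favourable / comb(background_size, list_size)


# Toy counts: 3 of 5 listed genes fall in a 10-gene pathway, background of 100 genes.
print(fold_enrichment(3, 5, 10, 100), hypergeom_pvalue(3, 5, 10, 100))
```

In practice a multiple-testing correction would normally be applied across all pathways tested.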
The functional enrichment analysis.
| Enriched Functions | Genes* |
|---|---|
| Acetylation | TRAF2, PPP2R1A, HSP90AA1, RELA, SMAD3, PRKDC, SIRT1, ITGB1, FLNA, IQGAP1, HDAC5, CDC42, PTK2, CSNK2A1, MCM7, PPP2CA, CALM3, CALM2, HSPA8, CALM1, SPTAN1, TERF1 |
| Active site:Proton acceptor | AKT1, PTK2, CSNK2A1, RPS6KA1, MAPK14, RAF1, JAK2, SIRT1, PRKCD, SRC |
| Apoptosis | AKT1, TRAF2, HTT, SIRT1, CTNNB1 |
| ATP-binding | AKT1, PTK2, CSNK2A1, HSP90AA1, MCM7, RPS6KA1, MAPK14, PRKDC, RAF1, JAK2, PRKCD, SRC, HSPA8 |
| Chromatin structure and dynamics/Secondary metabolites biosynthesis, transport, and catabolism | HDAC5, HDAC3 |
| Cytoplasm | TRAF2, HSP90AA1, HTT, RELA, SMAD3, PARK2, PRKCD, FLNA, CTNNB1, AKT1, HDAC5, ARRB1, MAPK14, PPP2CA, CALM3, CALM2, HSPA8, CALM1, TERF1, SPTAN1 |
| Cytoskeleton | PPP2CA, CALM3, ITGA4, CALM2, FLNA, SPTAN1, TERF1, CALM1, CTNNB1 |
| Duplication | CALM3, ITGA4, ITGB1, PRKCD, CALM2, IQGAP1, FLNA, CALM1 |
| Host-virus interaction | RELA, SIRT1, ITGB1, SRC, HSPA8 |
| Kinase | AKT1, PTK2, CSNK2A1, RPS6KA1, MAPK14, CALM3, PRKDC, RAF1, JAK2, PRKCD, CALM2, SRC, CALM1 |
| Nucleotide-binding | AKT1, CDC42, PTK2, CSNK2A1, HSP90AA1, MCM7, RPS6KA1, MAPK14, PRKDC, RAF1, JAK2, PRKCD, SRC, HSPA8 |
| Nucleus | HTT, RELA, SOX2, ESR1, SMAD3, PRKDC, PARK2, SIRT1, CTNNB1, CUL3, AKT1, HDAC5, HDAC3, NOTCH1, MCM7, ARRB1, MAPK14, PPP2CA, TERF1 |
| Phosphotransferase | CSNK2A1, RPS6KA1, MAPK14, PRKDC, PRKCD, SRC |
| Proto-oncogene | AKT1, RAF1, JAK2, SRC |
| Serine/threonine-protein kinase | AKT1, CSNK2A1, RPS6KA1, MAPK14, PRKDC, RAF1, PRKCD |
| Transcription regulation | HDAC5, HDAC3, NOTCH1, MCM7, ARRB1, RELA, SOX2, ESR1, SMAD3, SIRT1, CTNNB1 |
| Transferase | AKT1, PTK2, CSNK2A1, RPS6KA1, MAPK14, PRKDC, RAF1, JAK2, PRKCD, SRC |
| Ubl conjugation | TRAF2, HTT, RELA, SOX2, ESR1, PARK2, CTNNB1, CUL3, AKT1, HDAC5, HDAC3, ARRB1, CALM3, CALM2, CALM1, TERF1 |
*The predicted genes related to the corresponding function.
Figure 5. The cluster analysis of the 39 predicted genes using STRING. The identified clusters are colored red (A), green (B), yellow (C) and blue (D). Solid and dotted lines indicate connections within the same cluster and between different clusters, respectively. Different colors indicate different types of interaction (cyan: from curated databases; pink: experimentally determined; blue: gene co-occurrence; khaki: from text mining; black: co-expression; light blue: protein homology).
Figure 6. The proposed role of ARRB1, CALM3, FLNA and HTT in OSCC.
Figure 7. Box plot showing the stage-wise expression of the genes on immunohistochemically stained TMA sections of OSCC samples. Scoring was done on a scale of 0 to 4 [0: no staining; 1: 25% (mild staining); 2: 25–50% (medium staining); 3: 50–75% (moderate staining); 4: ≥75% (strong staining)]. (a) ARRB1 expression decreased significantly (p < 0.05) from TANT (tumor adjacent normal tissue) to cancer, but did not change significantly across stages (stage 1 to stage 3). (b) FLNA expression was significantly (p < 0.05) upregulated in the oral cancer patients compared with the TANT samples, but decreased across stages (stage 1 to stage 3). (c) The expression of CALM3 did not change significantly from TANT to cancerous samples or among the different stages of cancerous samples. (d) HTT showed decreased expression compared with the TANT samples. In particular, the expression decreased significantly from TANT to stage 2 and from stage 1 to stage 2 (p < 0.05). The number of samples used is indicated in brackets.
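The 0 to 4 scale in this caption amounts to binning the percentage of stained cells. The sketch below is one possible reading: it assumes that score 1 covers any non-zero staining below 25% and that each bin includes its lower bound, since the caption does not say how values on a boundary are assigned. The function name is hypothetical.

```python
def staining_score(percent_positive):
    """Map a percentage of positively stained cells to the 0-4 IHC score."""
    if not 0 <= percent_positive <= 100:
        raise ValueError("percentage must lie between 0 and 100")
    if percent_positive == 0:
        return 0  # no staining
    if percent_positive < 25:
        return 1  # mild staining (assumed to mean below 25%)
    if percent_positive < 50:
        return 2  # medium staining
    if percent_positive < 75:
        return 3  # moderate staining
    return 4      # strong staining (75% or more)


print([staining_score(p) for p in (0, 10, 25, 60, 75, 100)])
```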
Figure 8. Representative images of immunohistochemically stained TMA sections in TANT (tumor adjacent normal tissue), Grade 1 (G1), Grade 2 (G2) and Grade 3 (G3) oral cancer patient samples. Upper panel: (a) Stronger cytoplasmic staining of ARRB1 was observed in the TANT cells than in the cancerous cells. (b) FLNA showed stronger cytoplasmic staining in cancerous samples than in TANT samples. (c) CALM3 showed cytoplasmic staining with no significant change between TANT and grade 1 but a significant (p = 0.04) increase from grade 1 to grade 3 cancer samples. (d) HTT showed cytoplasmic staining that decreased from TANT to grade 3. All the images were captured using a LEICA DM500 HD compound microscope at 100X magnification. The insets (400X) in the IHC images are zoomed-in views of the same tissue section, which give a better view of the staining intensity. Lower panel: table giving the score of each image in the upper panel, calculated from the percentage of positive cells and the intensity of staining.