Meeta P Pradhan, Kshithija Nagulapalli, Mathew J Palakal.
Abstract
BACKGROUND: Colorectal cancer (CRC) is one of the most commonly diagnosed cancers worldwide. Studies have correlated risk of CRC development with dietary habits and environmental conditions. Gene signatures for any disease can identify the key biological processes, which is especially useful in studying cancer development. Such processes can be used to evaluate potential drug targets. Though recognition of CRC gene signatures across populations is crucial to better understanding potential novel treatment options for CRC, it remains a challenging task.
Year: 2012 PMID: 23282040 PMCID: PMC3524317 DOI: 10.1186/1752-0509-6-S3-S17
Source DB: PubMed Journal: BMC Syst Biol ISSN: 1752-0509
Figure 1. Overall methodology to identify the unique and common cliques in the population networks. (i) Identify genes satisfying the t-test in each dataset. (ii) Construct a network for each dataset and annotate each node and edge in the network with its respective topological and biological features. (iii) Identify cliques of all sizes in each network and annotate each clique with its clique strength. Identify the highest-scored clique of the maximum common size as the seed across networks. (iv) Using the seed, identify the clique connectivity profile across networks. (v) Compare the clique connectivity profiles (CCPs) across networks for commonality and uniqueness. (vi) Evaluate the CCPs for their biological processes and pathways across networks and identify gene signatures for CRC across populations.
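Steps (iii) and (iv) of the caption can be illustrated with a minimal sketch. It assumes a plain Bron-Kerbosch enumeration of maximal cliques and a placeholder scoring function; the clique strength formula is not given here, so `mean_weight`, the node labels, and the toy edges are hypothetical and stand in for the annotated population networks.

```python
from itertools import combinations

def build_adj(edges):
    """Build an undirected adjacency map: node -> set of neighbours."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def maximal_cliques(adj):
    """Enumerate maximal cliques with the basic Bron-Kerbosch recursion."""
    found = []
    def expand(r, p, x):
        if not p and not x:
            found.append(frozenset(r))
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}
    expand(set(), set(adj), set())
    return found

def cliques_of_size(adj, k):
    """All cliques with exactly k nodes (every k-subset of a maximal clique)."""
    out = set()
    for clique in maximal_cliques(adj):
        if len(clique) >= k:
            out.update(frozenset(s) for s in combinations(sorted(clique), k))
    return out

def pick_seed(networks, strength):
    """Largest clique size shared by all networks, then the best-scored clique."""
    largest = min(max(len(c) for c in maximal_cliques(adj))
                  for adj in networks.values())
    for k in range(largest, 1, -1):
        common = set.intersection(*(cliques_of_size(adj, k)
                                    for adj in networks.values()))
        if common:
            # Ties on strength are broken alphabetically for determinism.
            return max(common, key=lambda c: (strength(c), sorted(c)))
    return None

# Toy networks with placeholder node labels (not real interaction data).
net_one = build_adj([("A", "B"), ("A", "C"), ("B", "C"),
                     ("A", "D"), ("B", "D"), ("C", "D"), ("D", "E")])
net_two = build_adj([("A", "B"), ("A", "C"), ("B", "C"),
                     ("C", "D"), ("D", "E"), ("C", "E")])

# Hypothetical per-node weights standing in for the real clique strength.
weights = {"A": 1.0, "B": 2.0, "C": 3.0, "D": 1.0, "E": 1.0}

def mean_weight(clique):
    return sum(weights[g] for g in clique) / len(clique)

seed = pick_seed({"one": net_one, "two": net_two}, mean_weight)
print(sorted(seed))
```

Enumerating every k-subset of a maximal clique grows quickly with clique size, so this approach only suits small illustrative graphs.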
Figure 2. (a) Gene expression profile for genes satisfying the t-test across populations. SA showed the highest number of up-regulated genes, followed by GER, CHN, and then USA. (b) Clique distribution across populations. USA had the highest number of cliques of all sizes, while SA had the lowest. CHN and GER had nearly the same number of cliques of all sizes. (c) Total number of unique cliques of each size identified across populations. In all populations, far fewer unique cliques of size 7 were identified than unique cliques of other sizes. (d) The number of common cliques identified across all populations was 49. The cliques identified in SA overlapped with those of all the other populations.
Node similarity across populations
| Country | USA | CHN | SA |
|---|---|---|---|
| GER (7452) | 6797 | 6815 | 6290 |
| SA (7182) | 6564 | 6587 | -- |
| CHN (7830) | 7119 | -- | -- |
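As a rough illustration of how such overlap counts can be read, the sketch below assumes that the numbers in parentheses are the node counts of the row networks and that each cell is the number of nodes shared by the row and column networks. The helper `shared_nodes` is a hypothetical name, and the fractions are computed relative to the row population only, because the USA node count is not listed in the table.

```python
def shared_nodes(nodes_a, nodes_b):
    """Number of nodes (genes) present in both networks."""
    return len(set(nodes_a) & set(nodes_b))

# Values copied from the table; interpretation as node counts is an assumption.
node_counts = {"GER": 7452, "SA": 7182, "CHN": 7830}
shared = {
    ("GER", "USA"): 6797, ("GER", "CHN"): 6815, ("GER", "SA"): 6290,
    ("SA", "USA"): 6564, ("SA", "CHN"): 6587,
    ("CHN", "USA"): 7119,
}

# Fraction of the row network's nodes that also occur in the column network.
fraction_shared = {pair: n / node_counts[pair[0]] for pair, n in shared.items()}

for pair, frac in sorted(fraction_shared.items()):
    print(pair, round(frac, 3))
```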
Comparison of population networks with HPRD network
| Network | No. of interactions | Degree diameter | Avg. path length |
|---|---|---|---|
| HPRD | 35706 | 7.79 | 4.22 |
| CHN | 27877 | 7.71 | 5.42 |
| GER | 25453 | 5.24 | 5.43 |
| USA | 28453 | 5.598 | 5.34 |
| SA | 24754 | 5.3 | 5.43 |
HPRD had the highest number of interactions, which is consistent with all four population networks being sub-networks of HPRD.
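For readers unfamiliar with the average path length column, the following minimal sketch computes the mean shortest-path distance over all connected node pairs using breadth-first search. How the values in the table were actually computed is not stated here, so this is only one common definition, and the four-node chain is a hypothetical toy graph.

```python
from collections import deque

def bfs_distances(adj, source):
    """Shortest hop counts from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def average_path_length(adj):
    """Mean shortest-path length over all ordered pairs of connected nodes."""
    total = 0
    pairs = 0
    for s in adj:
        for t, d in bfs_distances(adj, s).items():
            if t != s:
                total += d
                pairs += 1
    return total / pairs if pairs else 0.0

# Toy chain A - B - C - D (placeholder labels, not real interaction data).
chain = {"A": {"B"}, "B": {"A", "C"}, "C": {"B", "D"}, "D": {"C"}}
print(average_path_length(chain))
```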
Common cliques across the four population datasets
| Clique | Enriched GO terms | Processes (p-value) | Literature of CRC |
|---|---|---|---|
| EGFR, ESR1, GRB2, PTPN6, SHC1, SRC, PIK3R1 | GO:0007173 | EGFR signaling pathway (0.00253) | ESR1 [ |
| | GO:0071363 | Cellular response to growth | GRB2 [ |
| | GO:0007165 | Signal transduction (0.002) | EGFR [ |
| BRCA1, CREBBP, EP300, ESR1, SMAD2, SMAD3, TP53 | GO:0006357 | Regulation of transcription from RNA polymerase II promoter (0.0044) | BRCA1 [ |
| | GO:0033993 | Response to lipid | SMAD2 [ |
| | GO:0031325 | Positive regulation of cellular metabolic process (0.00048) | P53 [ |
| CSN2, CSN3, CSN4, CSN5, CSN6, CSN7, CSN8, TP53 | GO:0000338 | Protein deneddylation (1.7E-18) | |
| | GO:0044267 | Cellular protein metabolic process (0.00029) | |
| DIS3, EXOSC2, EXOSC4, EXOSC5, EXOSC7, EXOSC8, EXOSC9, MPP6 | GO:0045006 | DNA deamination | EXOSC [ |
| | GO:0006304 | DNA modification (0.00192) | |
| | GO:0006402 | mRNA catabolic process (0.00769) | |
GO Term Finder and DAVID level 3 (biological processes) were considered for analysis.
Figure 3. Clique gene distribution in pathways across populations. In the CHN, GER, and USA populations, more clique genes were associated with Pathways in Cancer than with any other pathway.
Top scored cliques in each population network
| Cliques | Country | GO term (DAVID level 3) | Genes identified in GO terms |
|---|---|---|---|
| EP300, ESR1, | USA (5.91) | GO:0045595 | EP300, SMAD2, JUN |
| | CHN (5.85) | GO:0045595 | EP300, SMAD2, JUN |
| | | GO:0048522 | EP300, SMAD2, SMAD3, |
| CTNNB1, CREBBP, EP300, ESR1, SMAD2, SMAD3, SMAD4 | GER (5.38) | GO:0033993 | EP300, SMAD2 |
| | | GO:0010769 | EP300, SMAD2 |
| PTPN11, CBL, SRC, PRKCA, SHC1, PTPN6, EGFR | SA (3.75) | GO:0007173 | EGFR, PRKCA, SHC, CBL, SRC, SHC1, PTPN11 |
| | | GO:0071363 | PRKCA, CBL, SHC |
| MCM10, MCM2, MCM3, MCM4, ORC2L, MCM6, MCM7 | SA (4.19) | GO:0006270 | ORC2L, MCM6 |
| | | GO:0000082 | ORC2L, MCM6 |
The CliqueStrength of each clique is shown in parentheses next to the population in which it was identified.
Analysis of clique connectivity profile (MaxCliques)
| Population | Result of GO Term Finder & DAVID level 3 (p-value) | Pathways (p-value) |
|---|---|---|
| USA | Positive regulation of cellular processes (1.2E-7) | Wnt signaling (4.3E-7) |
| | Regulation of cell proliferation (3.3E-5) | TGF-beta signaling (2.7E-6) |
| GER | Cell morphogenesis (3.1E-6) | ErbB signaling pathway (1.5E-7) |
| | Cellular response to chemical stimulus (4.4E-6) | Focal adhesion (1.1E-6) |
| | Positive regulation of cellular process (1.1E-11) | JAK-STAT (1.1E-3) |
| CHN | Positive regulation of cellular process (7.3E-11) | Wnt signaling (5.1E-5) |
| | Regulation of cell differentiation (1.3E-4) | B cell receptor signaling pathway (2E-2) |
| | Regulation of growth (5.8E-4) | T cell receptor signaling pathway (3.9E-2) |
Figure 4. Clique Connectivity Profile (MaxCliques). (Green-USA, Light blue-GER, Dark blue-CHN). The figure depicts the clique connectivity profile for each of the populations. The seed was the same for all populations; each iteration selected the next clique with the maximum number of overlapping nodes and the highest strength, which results in overlapping nodes between successive cliques. The CCP diverges at three cliques, where the profile changes. The diverging cliques identify genes that are significant in CRC.
Figure 5. Clique Connectivity Profile (MaxCliques). (Green-USA, Yellow-SA). This figure depicts the seed common to USA and SA. Each iteration identified the next clique by evaluating the maximum overlap and the highest strength. The profile diverges at the seed itself. The genes identified in the CCPs of both USA and SA are significant in CRC. This figure depicts the variability in the expression of genes across the populations.
Figure 6. Clique Connectivity Profile (MinCliques). (Green-USA, Light blue-GER, Dark blue-CHN). This figure depicts the CCP for USA, GER, and CHN, identified using a common seed (the same as in Figure 4). The algorithm considered the minimum number of overlapping nodes with the highest CliqueStrength. The CCP diverges at the seed itself for the three populations. In SA, we identified two cliques that have the same number of overlapping nodes with the seed and the same clique strength. We therefore see two gene signature profiles for SA, both of which end at the same clique.
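The iteration described in the CCP figure captions can be sketched as a greedy walk from the seed. This is only an illustration under stated assumptions: the overlap is compared first and the strength second, candidate cliques must share at least one node with the current clique, and ties are broken alphabetically rather than branching into two profiles. The function name `clique_connectivity_profile`, the toy cliques, and the strength values are all hypothetical.

```python
def clique_connectivity_profile(seed, cliques, strength, mode="max", max_steps=10):
    """Greedy walk from the seed through overlapping cliques.

    mode="max": prefer the largest overlap with the current clique.
    mode="min": prefer the smallest non-zero overlap.
    In both modes, the higher strength wins among equal overlaps.
    """
    profile = [frozenset(seed)]
    remaining = {frozenset(c) for c in cliques} - {frozenset(seed)}
    sign = 1 if mode == "max" else -1
    while remaining and len(profile) <= max_steps:
        current = profile[-1]
        # Assumption: only cliques sharing at least one node are candidates.
        candidates = [c for c in remaining if current & c]
        if not candidates:
            break
        nxt = max(candidates,
                  key=lambda c: (sign * len(current & c), strength[c], sorted(c)))
        profile.append(nxt)
        remaining.discard(nxt)
    return profile

# Toy cliques with placeholder labels and made-up strength values.
seed = frozenset("ABC")
strength = {
    frozenset("ABC"): 5.0,
    frozenset("BCD"): 2.0,
    frozenset("CEF"): 3.0,
    frozenset("DGH"): 1.0,
}

max_profile = clique_connectivity_profile(seed, strength, strength, mode="max")
min_profile = clique_connectivity_profile(seed, strength, strength, mode="min")
print([sorted(c) for c in max_profile])
print([sorted(c) for c in min_profile])
```

In this toy case the two modes pick different cliques immediately after the seed, which mirrors the kind of divergence at the seed that the captions describe.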