Manika Sehgal, Rajinder Gupta, Ahmed Moussa, Tiratha Raj Singh.
Abstract
To examine the intricate biological processes underlying colorectal cancer (CRC), a systems biology approach that integrates several biological components and other influencing factors is essential. We performed a comprehensive system-level analysis of CRC, which helped unravel crucial network components and many regulatory elements through a coordinated view. This integrative approach greatly simplifies the perception of the complexity hidden in a biological phenomenon. The microarray analyses identified 631 significantly differentially expressed genes involved in disease progression, including notable up- and down-regulated genes such as jun, fos and mapk1. The transcriptional regulation of these genes was examined in depth through transcription factors such as hnf4, nr2f1, znf219 and dr1, which directly influence their expression. Further, interactions among these genes/proteins were evaluated, and crucial network motifs associated with the pathophysiology of CRC were detected. Standard statistical parameters such as z-score, p-value and significance profile were used to identify key signatures from the CRC pathway, and a few novel parameters representing over-represented structures were also designed in the study. The approach revealed 5 key genes, i.e. kras, araf, pik3r5, ralgds and akt3, which showed high statistical significance under our newly designed parameters. These novel parameters can assist in scrutinizing candidate markers for diseases with known biological pathways. Investigating and targeting these proposed genes for experimental validation, instead of being overwhelmed by the complicated pathway, should provide valuable insight toward a timely, systematic understanding of CRC.
Year: 2015 PMID: 26222778 PMCID: PMC4519280 DOI: 10.1371/journal.pone.0133901
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. The methodology applied for recognizing biomarkers in colorectal cancer.
The study began with the characterization of differentially expressed genes in the colorectal cancer dataset and their transcriptional regulation. Important interactions and network patterns were then identified from the CRC pathway, and finally functional enrichment was performed for key players in disease progression.
Values of the designed parameters for each particular network motif in order to deduce crucial network components.
| Network Motif Image ID | Abbreviations | | | |
|---|---|---|---|---|
| '0000001000011000' | 4a | 76 | 25 | 0.329 |
| '0000000000011100' | 4b | 48 | 16 | 0.333 |
| '0000000000001110' | 4c | 16 | 16 | 1 |
| '0000010000010000000101000' | 5a | 30 | 8 | 0.267 |
| '0000000000000010000111000' | 5b | 15 | 6 | 0.4 |
| '0000000000000100100010100' | 5c | 60 | 14 | 0.233 |
| '000000000000000010000001000001110000' | 6a | 36 | 8 | 0.222 |
| '000000000000000000000010001000110100' | 6b | 36 | 14 | 0.389 |
| '000000000000010000000010100000001100' | 6c | 60 | 12 | 0.2 |
| '000000000000000100000010000001110000' | 6d | 36 | 14 | 0.389 |
| '000000000000000000000010011000100100' | 6e | 36 | 8 | 0.222 |
| '000000100000000010000001010000010000' | 6f | 18 | 8 | 0.444 |
| '000000100000010000000001010000000010' | 6g | 18 | 8 | 0.444 |
| '000000000000000000000001000001111000' | 6h | 6 | 6 | 1 |
| '0000000000000000000000000100000001000100001101000' | 7a | 49 | 18 | 0.367 |
| '0000000000000000000000000000000001000110001100100' | 7b | 21 | 8 | 0.381 |
| '0000000000000000010000000010000000100000011100000' | 7c | 21 | 8 | 0.381 |
| '0000000000000000001000000010000000100000011100000' | 7d | 21 | 8 | 0.381 |
| '0000000100000000000100000001010000001000000000100' | 7e | 21 | 9 | 0.429 |
| '0000000100000001000000000010010000000000010000100' | 7f | 21 | 9 | 0.429 |
| '0000000000000000000000000000000000100011001110000' | 7g | 14 | 8 | 0.571 |
| '0000000000000000000000000001000001000010001110000' | 7h | 14 | 14 | 1 |
| '0000000000000000000000000001000000100010001110000' | 7i | 14 | 8 | 0.571 |
| '0000000000000000001000000010100000001000000011000' | 7j | 49 | 12 | 0.245 |
| '0000000000000000010000000100010000000001001010000' | 7k | 42 | 10 | 0.238 |
| '0000000000000000000000010000000001011000000001100' | 7l | 42 | 10 | 0.238 |
| '0000000000000000001000100000000001010000000011000' | 7m | 56 | 13 | 0.232 |
| '0000000000000000000000010000000001001000001001100' | | | | |
| '0000000000000000010000000100000001001000001010000' | 7o | 50 | 13 | 0.26 |
| '0000000000000000000000000000001000000010000100000110000010000100' | 8a | 48 | 10 | 0.208 |
| '0000000000000000000000000000010000000010001000000100000010011000' | 8b | 56 | 12 | 0.214 |
| '0000000000000000000000000000001010000000000010000110000000010100' | 8c | 48 | 12 | 0.25 |
| '0000000000000000000000000000000000010000000000100110000010001100' | 8d | 48 | 10 | 0.208 |
| '0000000000000000000000000010000000000010000010001100000000010100' | 8e | 48 | 11 | 0.229 |
| '0000000000000000000000000000001000000001000100000110000010000100' | 8f | 48 | 10 | 0.208 |
| '0000000000000000000100000000100001000000000000100000000110100000' | 8g | 48 | 11 | 0.229 |
| '0000000000000000000000000000010000100000000000100100000010011000' | 8h | 64 | 13 | 0.203 |
| '0000000000000000000100000000100000000100000000100100000010100000' | 8i | 48 | 11 | 0.229 |
| '0000000000000000000100000000100001000000000000100000100010100000' | 8j | 48 | 11 | 0.229 |
| '0000000000000000000000000000001000100000010000001000000000011100' | 8k | 40 | 12 | 0.3 |
| '0000000000000000000000000010000000000010100000000100010000011000' | 8l | 32 | 11 | 0.344 |
| '0000000000000000000000000000100000000100000000100010000011010000' | 8m | 29 | 13 | 0.448 |
| '0000000000000000000000000000100000000100001000000000000111010000' | 8n | 24 | 10 | 0.417 |
| '0000000000000000000010000000001000000100000000010000000111000000' | 8o | 24 | 9 | 0.375 |
| '0000000010000000010000000000010001000000000000100000000100001000' | 8p | 24 | 10 | 0.417 |
| '0000000000000000000010000000001000010000000000010000000111000000' | 8q | 24 | 9 | 0.375 |
| '0000000000000000000010000000001000000100000100000000000111000000' | 8r | 24 | 9 | 0.375 |
| '0000000000000000000000000000100000000100001000000000010011010000' | 8s | 24 | 10 | 0.417 |
| '0000000010000000000010000000000101000000010000000000010000000010' | 8t | 24 | 10 | 0.417 |
| '0000000000000000000100000000100000000100010000000100000010100000' | 8u | 16 | 9 | 0.563 |
| '0000000000000000000100000100000000000001010000000000100010100000' | 8v | 16 | 9 | 0.563 |
| '0000000000000000000000000000000000000001000100000000110011100000' | 8w | 16 | 10 | 0.625 |
| '0000000000000000000000000000000000000000000000100011100011000100' | 8x | 8 | 8 | 1 |
| '0000000000000000000000000000010000000010000000010000000111100000' | 8y | 8 | 8 | 1 |
| '0000000000000000000000000000100000000010000000010000000111100000' | 8z | 8 | 8 | 1 |
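The motif IDs in the table are 0/1 strings whose lengths (16, 25, 36, 49, 64) are the squares of the motif sizes (4 to 8 nodes), which suggests flattened adjacency matrices. The sketch below decodes one under the assumption of row-major order; the edge direction convention (row to column) is also an assumption, and the helper names are hypothetical. It additionally checks an observation that can be read off the rows: the fifth column matches the fourth column divided by the third.

```python
from math import isqrt


def decode_motif_id(motif_id: str) -> list[list[int]]:
    """Decode a flattened 0/1 motif ID into an n x n adjacency matrix.

    Assumes row-major order (an assumption, not stated in the source).
    """
    bits = motif_id.strip("'")
    n = isqrt(len(bits))
    if n * n != len(bits) or set(bits) - {"0", "1"}:
        raise ValueError("expected a 0/1 string whose length is a perfect square")
    return [[int(bits[r * n + c]) for c in range(n)] for r in range(n)]


def edges(matrix: list[list[int]]) -> list[tuple[int, int]]:
    """List (row, column) index pairs where the matrix holds a 1."""
    return [(r, c) for r, row in enumerate(matrix) for c, v in enumerate(row) if v]


def ratio_matches(third: int, fourth: int, fifth: float, tol: float = 1e-3) -> bool:
    """Check that the fifth column equals fourth / third up to rounding."""
    return abs(fourth / third - fifth) < tol


motif_4a = decode_motif_id("'0000001000011000'")
for row in motif_4a:
    print(row)
print(edges(motif_4a))

# Rows 4a, 4c and 8u copied from the table: (third, fourth, fifth column).
for third, fourth, fifth in [(76, 25, 0.329), (16, 16, 1.0), (16, 9, 0.563)]:
    print(third, fourth, fifth, ratio_matches(third, fourth, fifth))
```

For motif 4a this yields three directed edges over four nodes, i.e. a chain-like pattern under the assumed orientation.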
Fig 2. Pre-processing and normalization of DNA microarray data.
Panel 2a shows the distribution of the microarray files before normalization, and 2b shows the uniform distribution obtained after normalization, i.e. after removal of noise from the data.
Fig 3. Identification of differential expression.
Significance analysis of microarrays (SAM) and a volcano plot were used to detect the differentially expressed genes in the early colorectal cancer dataset. SAM identified 631 genes as significantly over- or under-expressed in the diseased state, whereas the volcano plot marks differentially expressed genes as red spots with signal log ratio (SLR) > 2 or SLR < -2.
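The SLR cutoff used for the volcano plot can be illustrated with a minimal sketch. It assumes that SLR is the base-2 logarithm of the ratio of mean intensities (disease over control) and that the cutoff is symmetric at +2 and -2; the function names and the gene labels are hypothetical, and the values are made up for illustration. This is not the SAM procedure itself, which also assesses statistical significance.

```python
from math import log2


def signal_log_ratio(disease_mean: float, control_mean: float) -> float:
    """Base-2 log ratio of mean intensities (assumed definition of SLR)."""
    return log2(disease_mean / control_mean)


def flag_differential(slr_by_gene: dict[str, float], threshold: float = 2.0):
    """Split genes into up- and down-regulated sets by a symmetric SLR cutoff."""
    up = sorted(g for g, slr in slr_by_gene.items() if slr > threshold)
    down = sorted(g for g, slr in slr_by_gene.items() if slr < -threshold)
    return up, down


# Hypothetical intensities, not taken from the study.
slr = {
    "gene_a": signal_log_ratio(800.0, 100.0),   # 8-fold increase
    "gene_b": signal_log_ratio(50.0, 400.0),    # 8-fold decrease
    "gene_c": signal_log_ratio(120.0, 100.0),   # small change
}
up, down = flag_differential(slr)
print("up:", up)
print("down:", down)
```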
Identified major transcription factors in early colorectal cancer progression.
| | Frequency | Importance | JASPAR ID | Class | Family | Pubmed IDs/ Experimental Databases |
|---|---|---|---|---|---|---|
| | 31.80% | 0.31802 | MA0114.1 | Zinc-coordinating | Hormone-nuclear Receptor | 19048623, 22731903, 22308320 |
| | 19.43% | 0.50044 | MA0017.1 | Zinc-coordinating | Hormone-nuclear Receptor | The Human Protein Atlas |
| | 17.31% | 0.04112 | - | Zinc-coordinating | Hormone-nuclear Receptor | The Human Protein Atlas, 10690519, 19251712 |
| | 14.49% | 0.03622 | MA0066.1 | Zinc-coordinating | Hormone-nuclear Receptor | 19186181, 16489531 |
| | 14.13% | 0.36064 | MA0046.1, MA0153.1 | Helix-Turn-Helix | Homeo | 12730871, 20096102 |
| | 13.78% | 0.16882 | - | Zinc-coordinating | Hormone-nuclear Receptor | 22383578, 18180275 |
| | 13.43% | 0.13428 | - | Zinc-coordinating | Hormone-nuclear Receptor | 11840453, 25961905 |
| | 12.01% | 0.29848 | MA0114.1 | Zinc-coordinating | Hormone-nuclear Receptor | 25961905, The Human Protein Atlas, 22731903 |
| | 12.01% | 0.18322 | MA0068.1 | Helix-Turn-Helix | Homeo | 12970747, The Human Protein Atlas, 19395656 |
| | 10.60% | 0.08216 | MA0112.2, MA0258.1 | Zinc-coordinating | Hormone-nuclear Receptor | 20663982 |
1 The JASPAR IDs correspond to the transcription factors from the JASPAR database.
2 The Pubmed IDs/ Experimental Databases column lists literature references and databases built on experimentally validated data for association with colorectal cancer.
Fig 4. Functional enrichment and annotation analyses.
The 631 differentially expressed genes were subjected to manual curation and annotation analyses for their involvement in diverse biological pathways, molecular functions and cellular components.
Fig 5. Identified network motifs from colorectal cancer pathway.
Some 4- and 5-node sub-graphs are labelled with gene names and their interactions, if any. If a given interaction was found to be missing from the pathway, it is depicted as unknown (black coloured arrow).
Fig 6. Bottom-up approach for classifying the network motifs.
In the 4- to 8-node sub-graphs, each node was identified and annotated in order to deduce certain vital interactions.
Fig 7. Significance profile for all 4–8 node generated sub-graphs based on normalized z-scores.
The motif significance profile shows that as the complexity of the CRC pathway increases, the interactions among nodes and the difficulty of recognizing genes increase greatly. The smaller the sub-graph (fewer nodes), the easier it is to annotate the nodes (genes) and their associations with stronger statistical significance (greater normalized z-scores).
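As a rough illustration of what a significance profile based on normalized z-scores involves, the sketch below uses the definition commonly found in the network motif literature: each sub-graph's z-score compares its count in the real network with its counts in randomized networks, and the vector of z-scores is then scaled to unit length. Whether the study used exactly this formulation, and the choice of the sample standard deviation, are assumptions; the function names and counts are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def z_score(real_count: float, random_counts: list[float]) -> float:
    """z = (N_real - mean(N_rand)) / std(N_rand), using the sample std (assumed)."""
    sigma = stdev(random_counts)
    if sigma == 0:
        raise ValueError("randomized counts have zero variance")
    return (real_count - mean(random_counts)) / sigma


def significance_profile(z_scores: list[float]) -> list[float]:
    """Normalize the z-score vector to unit Euclidean length."""
    norm = sqrt(sum(z * z for z in z_scores))
    if norm == 0:
        return [0.0 for _ in z_scores]
    return [z / norm for z in z_scores]


# Hypothetical counts for three sub-graph types, not taken from the study.
real = [40, 12, 7]
randomized = [[20, 22, 18, 24], [11, 13, 12, 12], [9, 8, 10, 9]]
zs = [z_score(r, rnd) for r, rnd in zip(real, randomized)]
print([round(v, 3) for v in significance_profile(zs)])
```

Normalizing to unit length makes profiles comparable across sub-graph sizes, since raw z-scores tend to grow with network size.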
Putative over-represented genes from CRC pathway as indicated by the most recurrent network motif.
| | Genes | Gene Details | Gene Size | Gene Frequency | Molecular Functions | Pubmed IDs |
|---|---|---|---|---|---|---|
| 1 | KRAS | Kirsten rat sarcoma viral oncogene homolog | 21656 Da, 189 amino acids | 10 | GTPase activity, LRR domain binding, protein binding | 19515263, 15069679, 10545700, 19832985 |
| 2 | ARAF | V-raf murine sarcoma 3611 viral oncogene homolog | 67585 Da, 606 amino acids | 10 | Protein kinase activity, protein binding, ATP binding, transferase activity, metal ion binding | |
| 3 | PIK3R5 | Phosphoinositide-3-kinase, regulatory subunit 5 | 97348 Da, 880 amino acids | 10 | G-protein beta/gamma-subunit complex binding, 1-phosphatidylinositol-3-kinase regulator activity | - |
| 4 | RALGDS | Ral guanine nucleotide dissociation stimulator | 100607 Da, 914 amino acids | 10 | small GTPase regulator activity, protein binding, guanyl-nucleotide exchange factor activity | |
| 5 | AKT3 | V-akt murine thymoma viral oncogene homolog 3 | 55775 Da, 479 amino acids | 8 | protein kinase activity, ATP binding, protein binding, transferase activity | 18813315 |
| 6 | RHOA | Ras homolog family member A | 21768 Da, 193 amino acids | 6 | GTPase activity, protein binding, myosin binding, protein domain specific binding | 19374769, 11844789, 11953197, 19499974 |
| 7 | MAP2K1 | Mitogen-activated protein kinase kinase 1 | 43439 Da, 393 amino acids | 6 | protein kinase activity, ATP binding, protein binding, transferase activity, RAS GTPase binding | |
| 8 | MAPK1 | Mitogen-activated protein kinase 1 | 41390 Da, 360 amino acids | 2 | phosphotyrosine binding, DNA binding, protein kinase activity, transferase activity, ATP binding, transcription factor binding | 9690379, 11992399 |
| 9 | GSK3B | Glycogen synthase kinase 3 beta | 46744 Da, 420 amino acids | 2 | protein kinase activity, beta-catenin binding, tau protein binding, transferase activity, p53 binding, NF-kappaB binding | |
| 10 | BAD | BCL2-associated agonist of cell death | 18392 Da, 168 amino acids | 2 | protein binding, phospholipid binding, protein heterodimerization activity, protein kinase binding, protein phosphatase binding | 17583570, 17393317 |
| 11 | CASP9 | Caspase 9, apoptosis-related cysteine peptidase | 46281 Da, 416 amino acids | 2 | cysteine-type endopeptidase activity, enzyme activator activity, protein binding, peptidase activity, SH3 domain binding, protein kinase binding | 11912124, 23303631 |
| 12 | MAPK8 | Mitogen-activated protein kinase 8 | 48296 Da, 427 amino acids | 2 | catalytic activity, JUN kinase activity, MAP kinase activity, protein kinase activity, ATP binding, phosphotransferase activity, transferase activity, histone deacetylase binding | |
1 Pubmed IDs correspond to published literature illustrating the role of these genes in colorectal cancer. For some genes, experimental evidence was not found; a few IDs, depicted in bold, report occurrence in colon cancer, so the role of those genes in colorectal cancer remains to be confirmed.