André Fonseca, Marco D Gubitoso, Marcelo S Reis, Sandro J de Souza, Junior Barrera.
Abstract
Cancer cells have anomalous development and proliferation due to disturbances in their control systems. Studying the behavior of the cellular control system requires high-throughput dynamical data. Unfortunately, this type of data is not widely available. This fact motivates the main question of this article: how to use static omics data and available biological knowledge to obtain new information about the elements of the control system in cancer cells. Two important measures for assessing the state of the cellular control system are the gene expression profile and the signaling pathways. This article combines these two kinds of static omics data to gain insight into the states of a cancer cell. To extract information from this kind of data, a statistical computational model was formalized and implemented. To exemplify the application of some aspects of the developed conceptual framework, we tested the hypothesis that different types of cancer cells have different disturbed signaling pathways. To this end, we developed a method that recovers small protein networks, called motifs, which are differentially represented in some subtypes of breast cancer. These differentially represented motifs are enriched with specific gene ontologies as well as with new putative cancer genes.
Keywords: cancer; motifs; omic data; pathway
Year: 2016 PMID: 27158220 PMCID: PMC4854218 DOI: 10.4137/CIN.S30800
Source DB: PubMed Journal: Cancer Inform ISSN: 1176-9351
Figure 1. Connected graphs with four vertices.
Figure 2. Example of a motif 〈G,L〉 with vertices {a,b,c,d} and L = 〈+1,0,+1,−1〉.
Figure 3. Pseudocode of the basic search and count algorithm.
Figure 4. Example of a PPI network graph.
Figure 5. Pseudocode of a recursive function to identify the subgraphs from a PPI network graph.
Simulation of the IDENTIFY-SUBGRAPHS function.
| #CALL | G (ONLY THE VERTEX SETS OF THE GRAPHS ARE SHOWN) | X | V | K |
|---|---|---|---|---|
| 1 | Ø | {7} | 7 | 3 |
| 2 | {{7, 6, 5, 4}} | {7} | 7 | 2 |
| 3 | {{7, 6, 5, 4}} | {7, 6, 5} | 6 | 1 |
| 4 | {{7, 6, 5, 4}} | {7, 6, 5} | 5 | 1 |
| 5 | {{7, 6, 5, 4}} | {7, 6, 4} | 6 | 1 |
| 6 | {{7, 6, 5, 4}} | {7, 6, 4} | 4 | 1 |
| 7 | {{7, 6, 5, 4}, {7, 6, 4, 3}} | {7, 5, 4} | 5 | 1 |
| 8 | {{7, 6, 5, 4}, {7, 6, 4, 3}} | {7, 5, 4} | 4 | 1 |
| 9 | {{7, 6, 5, 4}, {7, 6, 4, 3}, {7, 5, 4, 3}} | {7} | 7 | 1 |
| 10 | {{7, 6, 5, 4}, {7, 6, 4, 3}, {7, 5, 4, 3}} | {7, 6} | 6 | 2 |
| 11 | {{7, 6, 5, 4}, {7, 6, 4, 3}, {7, 5, 4, 3}} | {7, 6} | 6 | 1 |
| 12 | {{7, 6, 5, 4}, {7, 6, 4, 3}, {7, 5, 4, 3}} | {7, 6, 5} | 5 | 1 |
| 13 | {{7, 6, 5, 4}, {7, 6, 4, 3}, {7, 5, 4, 3}} | {7, 6, 4} | 4 | 1 |
| 14 | {{7, 6, 5, 4}, {7, 6, 4, 3}, {7, 5, 4, 3}} | {7, 5} | 5 | 2 |
| 15 | {{7, 6, 5, 4}, {7, 6, 4, 3}, {7, 5, 4, 3}} | {7, 5} | 5 | 1 |
| 16 | {{7, 6, 5, 4}, {7, 6, 4, 3}, {7, 5, 4, 3}} | {7, 5, 4} | 4 | 1 |
| 17 | {{7, 6, 5, 4}, {7, 6, 4, 3}, {7, 5, 4, 3}} | {7, 4} | 4 | 2 |
| 18 | {{7, 6, 5, 4}, {7, 6, 4, 3}, {7, 5, 4, 3}} | {7, 4} | 4 | 1 |
| 19 | {{7, 6, 5, 4}, {7, 6, 4, 3}, {7, 5, 4, 3}} | {7, 4, 3} | 3 | 1 |
|  | {{7, 6, 5, 4}, {7, 6, 4, 3}, {7, 5, 4, 3}, {7, 4, 3, 2}, {7, 4, 3, 1}} | – | – | – |
Notes: Each row contains the input values of a call of the function; the first row is the initial call, while the remaining rows are the recursive calls in the sequential order they are executed. For all calls of the function, the PPI network graph is the one depicted in Figure 4.
The last row shows G after the execution of the 19th and last recursive call.
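The pseudocode of Figure 5 is not reproduced in this record, so the following is only a minimal sketch of the general idea the table traces: growing a vertex set from a starting vertex until it reaches four vertices, and collecting each distinct connected set once. The function name `identify_subgraphs`, the adjacency-dictionary representation, and the small example graph are assumptions made for illustration. They are not the authors' IDENTIFY-SUBGRAPHS function and not the graph of Figure 4.

```python
def identify_subgraphs(adj, root, size=4):
    """Collect vertex sets of connected subgraphs with `size` vertices that contain `root`.

    `adj` maps each vertex to the set of its neighbours (hypothetical representation).
    """
    found = set()

    def grow(current):
        if len(current) == size:
            # A frozenset makes the same vertex set reached by two paths count once.
            found.add(frozenset(current))
            return
        # Candidates are neighbours of the current set that are not yet in it,
        # so every intermediate set stays connected.
        frontier = set().union(*(adj[v] for v in current)) - current
        for v in frontier:
            grow(current | {v})

    grow(frozenset({root}))
    return found


# Hypothetical toy network with edges 1-2, 2-3, 2-4, 3-4, 4-5.
toy_adj = {
    1: {2},
    2: {1, 3, 4},
    3: {2, 4},
    4: {2, 3, 5},
    5: {4},
}
subgraphs = identify_subgraphs(toy_adj, root=1)
```

On this toy network the call yields the two vertex sets {1, 2, 3, 4} and {1, 2, 4, 5}. The set {1, 2, 3, 5} is never produced because vertex 5 is only reachable through vertex 4.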
Figure 6. Pseudocode of a search and count algorithm for connected graphs with four vertices.
Figure 7. A procedure to prevent two motifs that are symmetric to each other from being counted separately. In each panel, vertex names appear outside the parentheses and their associated labels inside them. The motifs in (A) and (B) are symmetric to each other, and the EXTRACT-TOPOLOGY function maps them to the same unique motif (C).
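One common way to make symmetric labeled motifs collapse to a single representative is to compute a canonical form by trying every relabeling of the vertices and keeping the smallest result. The sketch below illustrates that technique for graphs with four vertices (24 permutations). It is an assumption about how such a mapping could work, not the EXTRACT-TOPOLOGY function itself, and the name `canonical_motif` and the integer labels are hypothetical.

```python
from itertools import permutations


def canonical_motif(edges, labels):
    """Return a canonical (edges, labels) pair for a small labeled graph.

    `edges` is an iterable of (u, v) pairs over vertices 0..n-1 and
    `labels` is a sequence with one expression label per vertex.
    """
    n = len(labels)
    best = None
    for perm in permutations(range(n)):
        # perm[v] is the new index assigned to vertex v.
        new_edges = tuple(sorted(tuple(sorted((perm[u], perm[v]))) for u, v in edges))
        new_labels = [None] * n
        for v in range(n):
            new_labels[perm[v]] = labels[v]
        key = (new_edges, tuple(new_labels))
        if best is None or key < best:
            best = key
    return best


# A linear motif read left to right and the same motif read right to left.
path = [(0, 1), (1, 2), (2, 3)]
motif_a = canonical_motif(path, [+1, 0, 0, -1])
motif_b = canonical_motif(path, [-1, 0, 0, +1])
# A linear motif whose end vertices carry different labels from the two above.
motif_c = canonical_motif(path, [0, +1, 0, -1])
```

Here `motif_a` and `motif_b` compare equal, so a counter keyed on the canonical form would count them together, while `motif_c` remains distinct. Brute force over permutations is only reasonable because the motifs are this small.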
Number of samples in each breast cancer subtype.
| SUBTYPE | SAMPLES |
|---|---|
| Triple negative | 100 |
| Luminal A | 393 |
| Luminal B | 100 |
| Her2 enriched | 30 |
Entropy of motifs and distribution among subtypes.
| MOTIF | ENTROPY | TRIPLE NEG | LUMINAL A | LUMINAL B | HER2E |
|---|---|---|---|---|---|
| K4-osss | 0.3667 | 0.0392 | 0.9436 | 0.0166 | 0.0006 |
| Diag-soss | 0.4474 | 0.0840 | 0.9130 | 0.0018 | 0.0011 |
| K4-ooss | 0.5592 | 0.1184 | 0.8780 | 0.0001 | 0.0035 |
| Square-ooss | 0.5614 | 0.8711 | 0.1282 | 0.0003 | 0.0003 |
| Diag-osss | 0.5661 | 0.0912 | 0.8919 | 0.0162 | 0.0007 |
| Star-ooos | 0.6463 | 0.8699 | 0.1121 | 0.0038 | 0.0141 |
| Linear-oooo | 0.6516 | 0.8549 | 0.1369 | 0.0044 | 0.0039 |
| Square-osso | 0.7082 | 0.8116 | 0.1874 | 0.0005 | 0.0005 |
| Square-oooo | 0.7268 | 0.8252 | 0.1642 | 0.0001 | 0.0106 |
| Kite-soss | 0.7296 | 0.1611 | 0.8273 | 0.0107 | 0.0009 |
| Kite-oooo | 0.7428 | 0.8236 | 0.1652 | 0.0044 | 0.0068 |
| Diag-oooo | 0.7969 | 0.8123 | 0.1704 | 0.0102 | 0.0071 |
| Square-ooos | 0.7990 | 0.7718 | 0.2252 | 0.0002 | 0.0028 |
| Square-ooso | 0.8021 | 0.8236 | 0.1487 | 0.0224 | 0.0053 |
| Kite-osoo | 0.8198 | 0.8137 | 0.1601 | 0.0044 | 0.0219 |
| Linear-osoo | 0.8541 | 0.7807 | 0.2040 | 0.0072 | 0.0081 |
| Square-osss | 0.8756 | 0.7134 | 0.2855 | 0.0005 | 0.0005 |
| K4-ssss | 0.8865 | 0.1755 | 0.7880 | 0.0365 | 0.0000 |
| Kite-ssos | 0.8937 | 0.2093 | 0.7675 | 0.0230 | 0.0003 |
| K4-noss | 0.8968 | 0.1034 | 0.8107 | 0.0845 | 0.0011 |
| Star-ssso | 0.9696 | 0.3358 | 0.6584 | 0.0057 | 0.0001 |
| Star-ssss | 0.9991 | 0.5525 | 0.4469 | 0.0003 | 0.0003 |
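The entropy column appears consistent with the base-2 Shannon entropy of each motif's distribution among the four subtypes. This reading is inferred from the tabulated numbers, not stated in the record. A minimal sketch under that assumption, with the hypothetical helper name `shannon_entropy`:

```python
import math


def shannon_entropy(probs, base=2.0):
    """Shannon entropy of a discrete distribution; zero entries contribute nothing."""
    total = sum(probs)
    entropy = 0.0
    for p in probs:
        if p > 0:
            q = p / total  # renormalize in case of rounding in the inputs
            entropy -= q * math.log(q, base)
    return entropy


# Rows taken from the table above: (triple neg, luminal A, luminal B, HER2E).
k4_osss = shannon_entropy([0.0392, 0.9436, 0.0166, 0.0006])
star_ssss = shannon_entropy([0.5525, 0.4469, 0.0003, 0.0003])
```

For these two rows the computed values agree with the tabulated 0.3667 and 0.9991 to within the rounding of the inputs. Low entropy corresponds to a motif concentrated in one subtype, while a value near 1 here corresponds to a motif split almost evenly between two subtypes.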
Figure 8. KEGG enrichment analysis. The dot chart shows the enrichment analysis performed with clusterProfiler. All motifs selected by the Shannon entropy method are shown on the x-axis, and all enriched KEGG pathways with P-value ≤ 0.05 are listed on the y-axis. Dot color encodes the adjusted P-value, from less significant (blue) to more significant (red). Dot size is based on the gene ratio, which is the observed number of genes from the experimental set within the respective KEGG pathway.
Ranked proteins associated with pathways in cancer.
| PROTEIN | SUPPRESSED (S): TNBC | SUPPRESSED (S): LMNA | SUPPRESSED (S): LMNB | SUPPRESSED (S): HER2 | OVER-EXPRESSED (O): TNBC | OVER-EXPRESSED (O): LMNA | OVER-EXPRESSED (O): LMNB | OVER-EXPRESSED (O): HER2 |
|---|---|---|---|---|---|---|---|---|
| HSF1 | 0 | 0 | 0 | 0 | 0.74 | 0.89 | 0.72 | 0.77 |
| RANGRF | 0.51 | 0.46 | 0.87 | 0.78 | 0 | 0 | 0 | 0 |
| ORAOV1 | 0 | 0 | 0 | 0 | 0.44 | 0.89 | 0.73 | 0.5 |
| ERBB2IP | 0.72 | 0.36 | 0.09 | 0.19 | 0 | 0.01 | 0.01 | 0 |
| NDUFB9 | 0 | 0 | 0 | 0 | 0.92 | 0.92 | 0.83 | 0.86 |
| IFNAR1 | 0.25 | 0.1 | 0 | 0 | 0.38 | 0.21 | 0 | 0.89 |
| FBP1 | 0.83 | 0.34 | 0 | 0.45 | 0 | 0 | 0 | 0 |
| SRSF12 | 0 | 0 | 0 | 0 | 0.85 | 0.15 | 0 | 0 |
| UQCRB | 0 | 0 | 0 | 0.37 | 0.86 | 0 | 0.58 | 0.67 |
| TOPORS | 0.26 | 0.58 | 0.36 | 0.51 | 0.08 | 0.02 | 0.04 | 0.21 |
| ZNF706 | 0 | 0.05 | 0 | 0 | 0.69 | 0.85 | 0.74 | 0.69 |
| RPL19 | 0.12 | 0.13 | 0 | 0 | 0 | 0.13 | 0.85 | 0.9 |
| GOLGA1 | 0.64 | 0.32 | 0 | 0.79 | 0 | 0 | 0.42 | 0 |
| FLII | 0 | 0.58 | 0.43 | 0.35 | 0 | 0 | 0.22 | 0.35 |
| MRPL53 | 0.72 | 0 | 0 | 0.66 | 0 | 0.27 | 0 | 0 |
| NADK2 | 0 | 0.13 | 0 | 0.83 | 0 | 0.25 | 0.5 | 0 |
| TCAP | 0 | 0 | 0 | 0 | 0 | 0.34 | 0.97 | 0.96 |
| NUP50 | 0.13 | 0.59 | 0 | 0 | 0.13 | 0 | 0.42 | 0.32 |
| NMT1 | 0 | 0.34 | 0.62 | 0.89 | 0.26 | 0.4 | 0.12 | 0 |
| MGMT | 0.35 | 0.59 | 0.82 | 0.7 | 0 | 0 | 0 | 0 |
| USP32 | 0.17 | 0 | 0 | 0.32 | 0 | 0.73 | 0.89 | 0.32 |
| GATA3 | 0.87 | 0.38 | 0 | 0.67 | 0 | 0 | 0 | 0 |
| PTRH2 | 0 | 0.05 | 0 | 0 | 0.34 | 0.75 | 0.9 | 0.83 |
| ORMDL3 | 0 | 0 | 0 | 0 | 0 | 0.42 | 0.95 | 0.99 |
| PHB | 0 | 0 | 0 | 0 | 0 | 0.63 | 0.86 | 0.89 |
| KAT7 | 0 | 0.01 | 0 | 0 | 0.01 | 0.49 | 0.9 | 0.59 |
| FOXA1 | 0.81 | 0.47 | 0 | 0.42 | 0 | 0.02 | 0.05 | 0 |
| GRB7 | 0 | 0 | 0 | 0 | 0.01 | 0.37 | 0.92 | 0.94 |
| ATMIN | 0.33 | 0.54 | 0.83 | 0 | 0.22 | 0.05 | 0 | 0 |
| ELP3 | 0.88 | 0.55 | 0.86 | 0.86 | 0 | 0.15 | 0 | 0 |
| TBCE | 0 | 0 | 0 | 0 | 0.9 | 0.92 | 0.88 | 0.96 |
| VPS72 | 0 | 0 | 0 | 0 | 0.83 | 0.67 | 0.5 | 0.86 |
| MRPL27 | 0.37 | 0.05 | 0 | 0 | 0 | 0.59 | 0.81 | 0.89 |
| MSL1 | 0 | 0 | 0 | 0 | 0 | 0.12 | 0.84 | 0.96 |
| PABPC1 | 0 | 0 | 0 | 0 | 0.84 | 0.67 | 0.77 | 0.9 |
| ATP5L | 0.39 | 0.51 | 0.75 | 0.84 | 0 | 0 | 0 | 0 |
| NHP2L1 | 0.3 | 0.7 | 0 | 0.29 | 0.1 | 0 | 0 | 0.29 |
| MED7 | 0.78 | 0.44 | 0.05 | 0 | 0 | 0 | 0 | 0 |
| MED4 | 0.36 | 0.47 | 0.82 | 0.6 | 0.12 | 0 | 0 | 0 |
| YEATS2 | 0 | 0 | 0 | 0 | 0.86 | 0.25 | 0 | 0.43 |
| KANSL1 | 0.2 | 0.1 | 0.61 | 0.82 | 0 | 0 | 0 | 0 |
| FAM175A | 0 | 0.14 | 0 | 0.8 | 0 | 0 | 0 | 0 |
| RAD21 | 0 | 0 | 0 | 0 | 0.83 | 0.9 | 0.83 | 0.93 |
| MRPL10 | 0.52 | 0.07 | 0.38 | 0.89 | 0 | 0.57 | 0.49 | 0 |
| MRPL13 | 0 | 0 | 0 | 0.11 | 0.77 | 0.91 | 0.94 | 0.85 |
| MED30 | 0 | 0 | 0 | 0 | 0.85 | 0.61 | 0.06 | 0.32 |
| POLR2K | 0 | 0 | 0 | 0 | 0.79 | 0.85 | 0.82 | 0.75 |
| TUBGCP3 | 0 | 0.09 | 0.76 | 0.31 | 0.8 | 0.26 | 0 | 0.31 |
| DNAJC3 | 0 | 0 | 0.76 | 0.37 | 0.07 | 0.21 | 0.01 | 0.02 |
| UBE2Z | 0.26 | 0.26 | 0 | 0.13 | 0 | 0.46 | 0.92 | 0.67 |
| ANKRA2 | 0.77 | 0.44 | 0.37 | 0.44 | 0 | 0 | 0 | 0 |
| 01/08/02 | 0 | 0 | 0 | 0 | 0.86 | 0.63 | 0.52 | 0.83 |
| MCPH1 | 0.25 | 0.71 | 0.61 | 0.47 | 0.15 | 0.01 | 0.01 | 0 |
| TNFRSF10B | 0.18 | 0.65 | 0.22 | 0 | 0.18 | 0 | 0 | 0 |
| KRT4 | 0 | 0 | 0 | 0 | 0.54 | 0 | 0 | 0.95 |
| CDC6 | 0 | 0 | 0 | 0 | 0 | 0.15 | 0.88 | 0.89 |
| VPS45 | 0 | 0 | 0 | 0 | 0.38 | 0.53 | 0.91 | 0.43 |
| INTS10 | 0.09 | 0.71 | 0.64 | 0.37 | 0.46 | 0.03 | 0 | 0 |
| FBXO25 | 0.49 | 0.68 | 0.74 | 0.82 | 0.18 | 0.04 | 0 | 0 |
| DDX19A | 0 | 0.63 | 0 | 0 | 0.16 | 0 | 0 | 0 |
| KLHL12 | 0.15 | 0 | 0 | 0 | 0 | 0.6 | 0.89 | 0.75 |
| RPS25 | 0.24 | 0.57 | 0.89 | 0.46 | 0 | 0.06 | 0 | 0 |
| RABIF | 0 | 0 | 0 | 0 | 0.8 | 0.85 | 0.93 | 0.89 |
| FAM96B | 0.4 | 1 | 0.75 | 0.33 | 0 | 0 | 0 | 0.33 |
| VPS4A | 0.13 | 0.57 | 0.85 | 0.22 | 0 | 0 | 0 | 0.22 |
| MAEA | 0.35 | 0.23 | 0.77 | 0.34 | 0.21 | 0 | 0 | 0 |
| EZH1 | 0.16 | 0.13 | 0.29 | 0.83 | 0 | 0.13 | 0 | 0 |
| RIOK1 | 0 | 0 | 0 | 0 | 0.84 | 0.26 | 0.22 | 0.36 |
| LRRFIP1 | 0.83 | 0.17 | 0 | 0 | 0 | 0.08 | 0 | 0 |
| SKIV2L2 | 0.73 | 0.36 | 0.29 | 0 | 0.07 | 0.18 | 0 | 0 |
| MED1 | 0.09 | 0.24 | 0.07 | 0.08 | 0.3 | 0.31 | 0.89 | 0.87 |
| PSMB4 | 0 | 0 | 0 | 0 | 0.91 | 0.51 | 0.62 | 0.35 |
| FYCO1 | 0.85 | 0.33 | 0 | 0.82 | 0 | 0 | 0 | 0 |
| DCAF13 | 0 | 0 | 0 | 0 | 0.81 | 0.87 | 0.83 | 0.73 |
| CAMLG | 0.76 | 0.12 | 0 | 0.7 | 0 | 0.16 | 0 | 0 |
| PEX14 | 0.2 | 0 | 0.75 | 0 | 0 | 0 | 0 | 0 |
| YWHAZ | 0 | 0 | 0 | 0 | 0.76 | 0.87 | 0.74 | 0.77 |
| NSMCE4A | 0 | 0 | 0 | 0.78 | 0.29 | 0.39 | 0 | 0 |
| ELAC2 | 0 | 0.59 | 0.54 | 0 | 0 | 0 | 0 | 0 |
| CDK12 | 0 | 0 | 0 | 0 | 0 | 0.17 | 0.88 | 0.96 |
| TLX1 | 0 | 0 | 0 | 0 | 0.78 | 0.25 | 0.34 | 0.88 |
| TGOLN2 | 0.41 | 0 | 0 | 0.79 | 0 | 0 | 0.75 | 0 |
| CNOT7 | 0.42 | 0.74 | 0.74 | 0.62 | 0.21 | 0.04 | 0 | 0 |
| ATP5G2 | 0.82 | 0.2 | 0 | 0.62 | 0 | 0 | 0 | 0 |
| COPA | 0 | 0 | 0 | 0 | 0.69 | 0.78 | 0.91 | 0.51 |
| CLASP2 | 0 | 0 | 0 | 0.79 | 0 | 0.2 | 0.2 | 0 |
| CLASP1 | 0.29 | 0.1 | 0.51 | 0.82 | 0 | 0.1 | 0 | 0 |
| HEATR1 | 0.04 | 0.12 | 0 | 0 | 0.86 | 0.59 | 0.77 | 0.67 |
| INTS9 | 0.55 | 0.62 | 0.71 | 0.62 | 0 | 0 | 0.09 | 0 |
| SNF8 | 0.22 | 0.11 | 0 | 0 | 0 | 0.54 | 0.94 | 0.84 |
| KRIT1 | 0.18 | 0.31 | 0.77 | 0 | 0.18 | 0.31 | 0 | 0 |
| DEDD | 0 | 0 | 0 | 0 | 0.88 | 0.7 | 0.75 | 0.8 |
| DUSP12 | 0 | 0 | 0 | 0 | 0.55 | 0.69 | 1 | 0.38 |
| LACTB2 | 0 | 0 | 0 | 0.33 | 0.7 | 0.85 | 0.44 | 0.33 |
| EXOC4 | 0.23 | 0 | 0.87 | 0 | 0 | 0.24 | 0 | 0 |
| HSPA14 | 0 | 0.04 | 0 | 0 | 0.83 | 0.51 | 0.26 | 0.58 |
| TINF2 | 0.72 | 0.14 | 0 | 0.66 | 0 | 0 | 0 | 0 |
| RPA1 | 0 | 0.4 | 0.77 | 0.51 | 0.14 | 0 | 0 | 0 |
| BBS4 | 0.74 | 0.25 | 0 | 0 | 0 | 0.12 | 0 | 0 |
| SCRIB | 0 | 0 | 0 | 0 | 0.5 | 0.9 | 0.42 | 0.54 |
| DYNC1I1 | 0 | 0 | 0 | 0 | 0.83 | 0.3 | 0 | 0 |
| ACTL6A | 0 | 0 | 0 | 0 | 0.83 | 0.47 | 0.49 | 0.83 |
| ATP6V1C1 | 0 | 0 | 0 | 0 | 0.34 | 0.86 | 0.74 | 0.4 |
Notes: The normalized values were calculated from the absolute frequency of each expression label in each subtype. For a given protein, the sum of the counts N, O, and S was used to normalize S and O.
Abbreviations: TNBC, triple negative; LMNA, luminal A; LMNB, luminal B; HER2, Her2 enriched.
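The normalization described in the notes can be sketched as follows: for one protein in one subtype, the S and O counts are each divided by the sum of the N, O, and S counts. The function name `normalize_labels`, the zero-total convention, and the example counts are hypothetical and not taken from the study's data.

```python
def normalize_labels(n_count, o_count, s_count):
    """Return (normalized S, normalized O) for one protein in one subtype."""
    total = n_count + o_count + s_count
    if total == 0:
        # Assumption: a protein never observed in the subtype is reported as 0 for both.
        return 0.0, 0.0
    return s_count / total, o_count / total


# Hypothetical counts: 2 occurrences labeled N, 6 labeled O, 2 labeled S.
s_norm, o_norm = normalize_labels(n_count=2, o_count=6, s_count=2)
```

With these illustrative counts the normalized values are 0.2 for S and 0.6 for O. Because N is part of the denominator, the S and O values of a protein in a given subtype need not sum to 1.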