Chiara Balestrieri, Lilia Alberghina, Marco Vanoni, Ferdinando Chiaradonna.
Abstract
BACKGROUND: The integration of data from multiple genome-wide assays is essential for understanding dynamic spatio-temporal interactions within cells. Such integration, which leads to a more complete view of cellular processes, offers the opportunity to better rationalize the large amount of "omics" data freely available in several public databases. In particular, integration of microarray-derived transcriptome data with other high-throughput analyses (genomic and mutational analysis, promoter analysis) may allow us to unravel transcriptional regulatory networks under a variety of physio-pathological situations, such as the alteration in the cross-talk between signal transduction pathways in transformed cells.
Year: 2009 PMID: 19828069 PMCID: PMC2762058 DOI: 10.1186/1471-2105-10-S12-S1
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
The cancer cell lines in the NCI60 collection sorted by tissue of origin
| Tissue of origin | Cell lines |
| Breast | HS578T, MDA-MB231, MDA-MB435, MCF7, T47D, MDA-N, BT549 |
| CNS | SF295, SF539, SNB19, U251, SF268, SNB75 |
| Colon | HCC2998, HCT116, HCT15, SW620, COLO205, HT29, KM12 |
| Leukaemia | CCRF-CEM, RPMI-8226, HL60, MOLT4, K562, SR |
| Melanoma | SK-MEL2, LOXIMVI, M14, MALME-3M, SK-MEL28, SK-MEL5, UACC257, UACC62 |
| Lung | A549, HOP62, NCI-H23, NCI-H460, EKVX, NCI-H226, NCI-H322, NCI-H522, HOP92 |
| Ovarian | OVCAR5, OVCAR8, SKOV3, IGROV1, OVCAR3, OVCAR4 |
| Prostate | PC3, DU145 |
| Renal | 786-0, RXF-393, A498, ACHN, CAKI-1, SN12C, TK10, UO31 |
| Unknown | ADR-RES |
Gene expression profiling datasets of NCI60 cell lines and normal tissues analyzed in this study
| Dataset [reference] | Sample type | Number of profiles | GEO accession |
| A [31] | NCI60 cells | 60 | GSE5949 |
| - | Breast | 0 | - |
| B [106] | CNS | 2 | GSE96 |
| C [107] | Colon | 4 | GSE6731 |
| D [108] | Blood | 4 | GSE1402 |
| B [106] | Lung | 2 | GSE96 |
| - | Skin | 0 | - |
| B [106] | Ovary | 3 | GSE96 |
| B [106] | Prostate | 3 | GSE96 |
| B [106] | Kidney | 3 | GSE96 |
Gene expression profiles were retrieved from the GEO database. Dataset A (60 profiles) is made up of the NCI60 cell lines [31]. Dataset B (13 profiles) is a subset of transcriptional profiles of a diverse array of tissues, organs, and cell lines from a normal human physiological state [106]. Dataset C (4 profiles) encompasses the normal human adult samples, derived from colonoscopic biopsy, present in a database comprising samples from patients with Crohn's disease or ulcerative colitis [107]. Dataset D (4 profiles) contains the normal control samples present in a database of transcriptional profiles of peripheral blood mononuclear cells (PBMC) obtained from juvenile arthritis patients and healthy controls [108].
PKA-related genes identified in all the datasets shown in Table 2 and used in this study
| Probe set ID | UniGene ID | Gene symbol | Gene name |
| 33353_at | Hs.192215 | ADCY1 | Adenylate cyclase 1 (brain) |
| 34686_at | Hs.481545 | ADCY2 | Adenylate cyclase 2 (brain) |
| 33134_at | Hs.467898 | ADCY3 | Adenylate cyclase 3 |
| 39383_at | Hs.525401 | ADCY6 | Adenylate cyclase 6 |
| 40585_at | Hs.513578 | ADCY7 | Adenylate cyclase 7 |
| 36246_at | Hs.414631 | ADCY8 | Adenylate cyclase 8 (brain) |
| 33800_at | Hs.391860 | ADCY9 | Adenylate cyclase 9 |
| 37698_at | Hs.463506 | AKAP1 | A kinase (PRKA) anchor protein 1 |
| 36633_at | Hs.462457 | AKAP10 | A kinase (PRKA) anchor protein 10 |
| 34657_at | Hs.105105 | AKAP11 | A kinase (PRKA) anchor protein 11 |
| 37680_at | Hs.371240 | AKAP12 | A kinase (PRKA) anchor protein (gravin) 12 |
| 554_at | Hs.459211 | AKAP13 | A kinase (PRKA) anchor protein 13 |
| 41075_at | Hs.98397 | AKAP3 | A kinase (PRKA) anchor protein 3 |
| 37087_at | Hs.97633 | AKAP4 | A kinase (PRKA) anchor protein 4 |
| 32421_at | Hs.532489 | AKAP5 | A kinase (PRKA) anchor protein 5 |
| 40747_at | Hs.509083 | AKAP6 | A kinase (PRKA) anchor protein 6 |
| 41703_r_at | Hs.486483 | AKAP7 | A kinase (PRKA) anchor protein 7 |
| 35138_at | Hs.199029 | AKAP8 | A kinase (PRKA) anchor protein 8 |
| 37886_at | Hs.399800 | AKAP8L | A kinase (PRKA) anchor protein 8-like |
| 36506_at | Hs.527348 | AKAP9 | A kinase (PRKA) anchor protein (yotiao) 9 |
| 36297_at | Hs.435267 | ATF1 | Activating transcription factor 1 |
| 37535_at | Hs.516646 | CREB1 | cAMP responsive element binding protein 1 |
| 32066_g_at | Hs.200250 | CREM | cAMP responsive element modulator |
| 35522_at | Hs.487129 | PDE10A | Phosphodiesterase 10A |
| 36311_at | Hs.416061 | PDE1A | Phosphodiesterase 1A, calmodulin-dependent |
| 38921_at | Hs.530871 | PDE1B | Phosphodiesterase 1B, calmodulin-dependent |
| 32418_at | Hs.487897 | PDE1C | Phosphodiesterase 1C, calmodulin-dependent |
| 666_at | Hs.89901 | PDE4A | Phosphodiesterase 4A, cAMP-specific |
| 33705_at | Hs.198072 | PDE4B | Phosphodiesterase 4B, cAMP-specific |
| 38860_at | Hs.437211 | PDE4C | Phosphodiesterase 4C, cAMP-specific |
| 38526_at | Hs.117545 | PDE4D | Phosphodiesterase 4D, cAMP-specific |
| 37676_at | Hs.9333 | PDE8A | Phosphodiesterase 8A |
| 37249_at | Hs.78106 | PDE8B | Phosphodiesterase 8B |
| 33709_at | Hs.473927 | PDE9A | Phosphodiesterase 9A |
| 438_at | Hs.194350 | PRKACA | Protein kinase, cAMP-dependent, catalytic, alpha |
| 36215_at | Hs.487325 | PRKACB | Protein kinase, cAMP-dependent, catalytic, beta |
| 36359_at | Hs.158029 | PRKACG | Protein kinase, cAMP-dependent, catalytic, gamma |
| 226_at | Hs.280342 | PRKAR1A | Protein kinase, cAMP-dependent, regulatory, type I, alpha |
| 1091_at | Hs.550753 | PRKAR1B | Protein kinase, cAMP-dependent, regulatory, type I, beta |
| 116_at | Hs.517841 | PRKAR2A | Protein kinase, cAMP-dependent, regulatory, type II, alpha |
| 37221_at | Hs.433068 | PRKAR2B | Protein kinase, cAMP-dependent, regulatory, type II, beta |
Figure 1. Statistical analysis of the expression of the 41 PKA pathway-encoding genes in normal and transformed samples. 81 transcriptional profiles from normal tissues and from the NCI60 cancer cell line collection were recovered from the GEO database. After normalization (see Methods), the expression values of the 41 PKA pathway-encoding genes were used to perform an ANOVA (p-value 0.0001) to evaluate the statistical significance of the differences between normal and transformed samples. IQR: Interquartile Range. Outliers are also shown.
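The comparison in this legend rests on a one-way ANOVA between sample groups. As a minimal sketch of that test statistic, assuming plain Python and made-up expression values (the function name `one_way_anova_f` and the numbers are illustrative, not taken from the study), the F ratio can be computed as follows. The p-value would additionally require the F distribution, which is left out here.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of numbers."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean(x for g in groups for x in g)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of each value around its own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical normalized expression values, not data from the study.
normal = [1.0, 2.0, 3.0]
transformed = [2.0, 3.0, 4.0]
f_stat, df_b, df_w = one_way_anova_f([normal, transformed])
```

A larger F indicates that the difference between group means is large relative to the spread within each group.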
Figure 2. Hierarchical clustering of the 41 PKA pathway-encoding genes analyzed in this paper. Two-way (gene, column; cell line, row) hierarchical clustering (see Methods) of the same profiles analyzed in Figure 1. Normalized expression is colour-coded from green (poor expression) to red (strong expression). The name of each gene is colour-coded according to the family to which it belongs.
The 6 main classes described in the text (red lines at the top of the dendrogram and Roman numerals at the bottom of the dendrogram) are shown. The distance function is based on Pearson correlation and complete linkage clustering. Legends for expression, condition, gene family and tissue of origin are shown on the right of the dendrogram.
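The legend names the two ingredients of the clustering: a distance derived from Pearson correlation and complete linkage. The sketch below shows one way these fit together, assuming the distance is 1 minus the correlation (the legend does not state the exact transformation) and using hypothetical profiles. The function names are illustrative. A two-way clustering would run the same procedure once on the gene profiles and once on the sample profiles.

```python
from math import sqrt

def pearson_distance(x, y):
    """1 - Pearson correlation; assumes neither profile is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return 1.0 - cov / (sx * sy)

def complete_linkage(profiles):
    """Agglomerate profiles; return merges as (members_a, members_b, distance)."""
    clusters = [[name] for name in profiles]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Complete linkage: use the largest pairwise distance.
                d = max(pearson_distance(profiles[a], profiles[b])
                        for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges

# Hypothetical expression profiles, not data from the study.
profiles = {
    "gene_a": [1.0, 2.0, 3.0, 4.0],
    "gene_b": [2.0, 4.0, 6.0, 8.0],
    "gene_c": [4.0, 3.0, 2.0, 1.0],
}
merges = complete_linkage(profiles)
```

Here `gene_a` and `gene_b` are perfectly correlated and merge first at distance close to 0, while the anti-correlated `gene_c` joins last at distance close to 2.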
Correlation between PKA related gene patterns and tissues
| I | Breast 2/8 | |
| II | Colon 6/7 | |
| III | Ovary 3/3 | Ovary 4/6 |
| IV | Breast 2/8 | |
| V | Breast 3/8 | |
| VI | CNS 2/2 | |
Six different classes, corresponding to the main arms of the dendrogram derived from clustering according to "Tissue and cell line", were identified. The number to the right of each tissue is the number of samples belonging to that class out of the total number of samples analyzed for that tissue.
NCI60 cell lines with predicted active pathways by mutational analysis
| Tissue | Ras | PI3K | Other Mutation | Not Tested |
| Breast | HS578T, MDA-MB231, MDA-MB435 | MCF7, T47D | - | MDA-N, BT549 |
| CNS | - | SF295, SF539, SNB19, U251 | SF268, SNB75 | - |
| Colon | HCC2998, HCT116, HCT15, SW620, COLO205, HT29 | KM12 | - | - |
| Leukaemia | CCRF-CEM, RPMI-8226, HL60, MOLT4, K562 | - | - | SR |
| Melanoma | SK-MEL2, LOXIMVI, M14, MALME-3M, SK-MEL28, SK-MEL5, UACC257, UACC62 | - | - | - |
| Lung | A549, HOP62, NCI-H23, NCI-H460 | - | EKVX, NCI-H226, NCI-H322, NCI-H522 | HOP92 |
| Ovarian | OVCAR5, OVCAR8 | SKOV3, IGROV1 | OVCAR3, OVCAR4 | - |
| Prostate | - | PC3, DU145 | - | - |
| Renal | - | 786-0, RXF-393 | A498, ACHN, CAKI-1, SN12C, TK10, UO31 | - |
| Unknown | ADR-RES | - | - | - |
The 60 cell lines were reorganized into 4 categories, as described in the text, on the basis of the most representative mutation of each cell line:
Cell lines carrying mutations able to interfere with the Ras-Raf-MAPK pathway (i.e., mutations in genes encoding Ras, B-Raf, ERBB2, PDGFRA), referred to as Ras;
Cell lines carrying mutations able to interfere with the PI3K-Akt pathway (i.e., mutations in genes encoding PI3KCA, PTEN and Lkb1), referred to as PI3K;
Cell lines carrying somatic mutations that do not interfere with the two pathways above (i.e., mutations in genes encoding CDKN2A, p53), referred to as Other Mutation;
Cell lines for which the presence of somatic mutations interfering with the two pathways above has not been searched, referred to as Not Tested.
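The four categories above amount to a lookup from mutated genes to a pathway label. The sketch below illustrates this with the gene names exactly as written in the list. The function name `mutation_category`, the `tested` flag and, in particular, the rule that Ras takes precedence over PI3K when both are hit are assumptions made for illustration; the source assigns each line by its "most representative mutation" and does not spell out a precedence rule.

```python
# Gene names as written in the category descriptions.
RAS_GENES = {"Ras", "B-Raf", "ERBB2", "PDGFRA"}
PI3K_GENES = {"PI3KCA", "PTEN", "Lkb1"}

def mutation_category(mutated_genes, tested=True):
    """Assign one of the four categories to a cell line.

    Assumption: Ras is checked before PI3K; the source does not
    state how lines with mutations in both pathways are resolved.
    """
    if not tested:
        return "Not Tested"
    genes = set(mutated_genes)
    if genes & RAS_GENES:
        return "Ras"
    if genes & PI3K_GENES:
        return "PI3K"
    return "Other Mutation"

# Hypothetical mutation lists, not taken from the study.
examples = {
    "line_1": mutation_category(["B-Raf", "p53"]),
    "line_2": mutation_category(["PTEN"]),
    "line_3": mutation_category(["CDKN2A", "p53"]),
    "line_4": mutation_category([], tested=False),
}
```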
Figure 3. Identification of differentially regulated genes in normal and transformed samples. (A) Samples were sorted into five groups according to mutational activation: green, normal; yellow, Ras; red, PI3K; blue, Other Mutation; cyan, Not Tested. Principal Component Analysis (PCA) was performed on the 41 PKA pathway-encoding genes for normal samples and the four classes of mutation-dependent samples. Each sphere represents the comparative averaging of the 41 genes for each pathway identified by mutational analysis. (B) For each of the 5 groups described in (A), the 41 PKA-encoding genes were clustered, relative to their level of expression, into three subgroups: Strong (>1, red), Average (=1, black) and Low (<1, green). (C) Gene list, by expression level and mutational group, for the three subgroups indicated in (B), divided for each sample. Color-coding is as follows: blue, common between normal and at least one transformed sample; yellow, specific for normal samples; grey, specific for transformed samples. The percentage of regulated genes for each subgroup is shown at the bottom. (D) ANOVA to evaluate the statistical significance of the differences between the five classes of samples described in (A). The right inset shows the p-values of the pair-wise comparisons. Statistically significant differences are indicated in red. IQR: Interquartile Range. Outliers are also shown.
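Panel (B) assigns each gene to Strong, Average or Low according to whether its normalized expression is above, equal to, or below 1. A minimal sketch of such a rule is given below, assuming the expression level of a gene within a group is summarized by the geometric mean (as mentioned in the Figure 7 legend) and that "equal to 1" is applied with a small tolerance. The tolerance value, the function names and the input values are assumptions, not parameters reported in the source.

```python
from math import exp, log

def geometric_mean(values):
    """Geometric mean of strictly positive expression values."""
    return exp(sum(log(v) for v in values) / len(values))

def expression_subgroup(values, tolerance=0.05):
    """Label a gene Strong, Average or Low from its normalized expression.

    tolerance is an assumed half-width around 1, not a value from the source.
    """
    level = geometric_mean(values)
    if level > 1.0 + tolerance:
        return "Strong"
    if level < 1.0 - tolerance:
        return "Low"
    return "Average"

# Hypothetical normalized values for three genes in one sample group.
group = {
    "gene_a": [2.0, 8.0],    # geometric mean 4.0
    "gene_b": [0.5, 2.0],    # geometric mean 1.0
    "gene_c": [0.25, 0.5],   # geometric mean about 0.35
}
labels = {gene: expression_subgroup(v) for gene, v in group.items()}
```

Applying the rule to each of the five sample groups gives the 15 subgroups (5 groups, 3 expression levels each) referred to in the later legends.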
Figure 4. Hierarchical clustering of the 41 PKA pathway-encoding genes in transformed samples. Two-way (gene, column; cell line, row) hierarchical clustering (see Methods) of the profiles from the NCI60 collection only. Normalized expression is colour-coded from green (poor expression) to red (strong expression). The distance function is based on Pearson correlation and complete linkage clustering. The name of each gene is colour-coded according to the family to which it belongs. Legends for expression, condition, gene family and tissue of origin are shown on the right of the dendrogram. The data have been organized on the basis of the tissue of origin of the cancer (Tissue), the specific oncogenic mutations identified in each cell line (Mutation), the pathway putatively altered by the specific mutations (Pathway) and the gene family.
Figure 5. TFBS identification using enrichment as the parameter. (A) For each TFBS recognized as relevant (present in ≥ 70% of the promoters of the 41 PKA pathway-encoding genes), the panel shows the percentage of promoters in our collection that contain the motif, as compared to the Matrix Family Library on vertebrates. This percentage has been calculated by dividing the total number of promoters containing the motif (S) by the total number of promoters (T). The color-coding scheme is shown on the right of the panel. (B) Schematic representation of the TFBSs (color-coded as shown on the right of the panel) identified in the promoters of the 15 subgroups described in the text and in Figure 3. Each cartoon represents the promoter structure resulting from the average of the TFBSs identified in ≥ 70% of the gene promoters for each subgroup. The asterisks at the bottom of each cartoon indicate the over-represented TFBSs, as scored in panel A, for all the 41 PKA pathway-encoding genes.
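The percentage described in panel (A) is S/T, the number of promoters containing a motif divided by the total number of promoters, with motifs at or above 70% treated as relevant. The sketch below illustrates that calculation, assuming each promoter is represented as a list of the TFBS families found in it. The gene names and motif assignments are hypothetical, and the function names are illustrative.

```python
def promoter_percentages(promoter_motifs):
    """Map each motif to 100 * S / T over a collection of promoters."""
    total = len(promoter_motifs)                 # T
    counts = {}
    for motifs in promoter_motifs.values():
        for motif in set(motifs):                # count a promoter once per motif
            counts[motif] = counts.get(motif, 0) + 1   # S
    return {motif: 100.0 * s / total for motif, s in counts.items()}

def relevant_motifs(promoter_motifs, threshold=70.0):
    """Motifs present in at least `threshold` percent of the promoters."""
    pct = promoter_percentages(promoter_motifs)
    return sorted(m for m, p in pct.items() if p >= threshold)

# Hypothetical promoters and motif hits, not data from the study.
promoter_motifs = {
    "GENE1": ["ETSF", "SP1F", "ETSF"],
    "GENE2": ["ETSF", "CREB"],
    "GENE3": ["ETSF", "SP1F"],
    "GENE4": ["ETSF", "SP1F"],
}
percentages = promoter_percentages(promoter_motifs)
relevant = relevant_motifs(promoter_motifs)
```

In this toy collection ETSF is found in 4 of 4 promoters and SP1F in 3 of 4, so both pass the 70% cut-off, while CREB (1 of 4) does not.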
Figure 6. Hierarchical clustering of TFBSs present in the promoters of the 41 PKA pathway-encoding genes, according to total number and frequency. Two-way (TFBS, column; expression subgroup, see Figure 3, row) hierarchical clustering of the TFBSs present within the promoters of the 41 PKA pathway-encoding genes. Clustering was run according to the total number of TFBSs present in each group (panel A) or to the frequency, i.e. the total number of a given TFBS divided by the number of promoters (panel B). The color-coding scale is shown at the top of each panel. The distance function is based on Pearson correlation and complete linkage clustering. The two classes, corresponding to the main arms of the dendrogram, derived from clustering according to "Condition" are shown on the right of each dendrogram.
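The two quantities clustered in panels A and B differ only by a normalization: the total number of occurrences of a TFBS in a subgroup, and that total divided by the number of promoters in the subgroup. A minimal sketch under the assumption that each promoter is a list of TFBS occurrences (repeats allowed) is shown below; the names and data are hypothetical.

```python
from collections import Counter

def tfbs_totals(promoter_sites):
    """Total occurrences of each TFBS across all promoters of a subgroup."""
    return Counter(site for sites in promoter_sites.values() for site in sites)

def tfbs_frequencies(promoter_sites):
    """Total occurrences of each TFBS divided by the number of promoters."""
    n_promoters = len(promoter_sites)
    totals = tfbs_totals(promoter_sites)
    return {site: count / n_promoters for site, count in totals.items()}

# Hypothetical subgroup with two promoters, not data from the study.
subgroup = {
    "GENE1": ["ETSF", "ETSF", "SP1F"],
    "GENE2": ["ETSF"],
}
totals = tfbs_totals(subgroup)
frequencies = tfbs_frequencies(subgroup)
```

Because repeats within a promoter are counted, a frequency can exceed 1, which distinguishes it from the per-promoter percentage used in Figure 5.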
Comparison between computational data and literature data
| HOXF, ETSF, SORY | CREB | |
| ZF5F, SP1F, ZBPF, MAZF, EKLF, EGRF, | ||
| MYOD, EKLF, MAZF, EGRF, ETSF, HOXF, AP1R, | ||
| SP1F, EKLF, ZBPF, EBOX, ZF5F, HIFF, MAZF, EGRF, ETSF, AHRR, | ||
| ZBPF, EGRF, EKLF, ETSF, HOXF, SP1F, MAZF, GLIF, HOMF, NKXH, CREB | c-Myc/EBOX | |
| ETSF, SP1F, HESF, | ||
| SORY, | Myc/EBOX, | |
| ZBPF, EGRF, SP1F, MAZF, EKLF, ETSF, | ||
| ETSF, SORY, HOXF, ZBPF, HEAT, | ||
| ZBPF, ETSF, | Myc/EBOX, | |
| SORY, HOXF, NKXH, ETSF, | ||
| EKLF, ZBPF, EGRF, MZF1, NKXH, ETSF, MAZF, SORY, HOXF | USF1/EBOX, USF2/EBOX | |
| ETSF, | AP1/AP1F, AP2/AP2F, | |
| EKLF, MAZF, HESF, PLAG, ZBPF, EBOX, | ||
| ZBPF, ZF5F, EGRF, EKLF, MAZF, HESF, |
The table shows the transcription factor families identified by computational analysis and the transcription factors/transcription factor families identified by analysis of literature data. The factors identified by both analyses are shown in bold.
Figure 7. Flowchart of the web-based and statistical strategy used to elucidate the relation between the transcriptional profiles of PKA-encoding genes and oncogenic mutations. (A) Flow chart of our web-based and statistical strategy, indicating some of the databases used (Source), the type of data analyzed (Input), the specific program and statistical test used (Tool) and the result obtained (Output). (B) Graphical representation of the block diagram summarizing functional interconnections within the PKA pathway module, indicating the expression level (geometric mean) of each gene belonging to the network, Strong (red), Average (black) or Low (green), as identified by our analysis in both normal (B, left) and transformed samples (B, right). (C) Boxplots of the expression of PKA pathway-encoding genes in normal (C, left) and transformed (C, right) samples, grouped by functional class (ADCY: adenylyl cyclase; AKAP: A-kinase anchor protein; PDE: phosphodiesterase; PRKACR: PKA regulatory subunit; PRKAC: PKA catalytic subunit). The represented value is the median. (D) Schematic representation of the TFBSs (color-coded) identified in the promoters of PKA pathway-encoding genes of normal and transformed samples. Each cartoon represents the promoter structure resulting from the merge of the TFBSs identified in ≥ 70% of the gene promoters of all normal samples and all transformed samples.