Xin Yi Goh, Richard Newton, Lorenz Wernisch, Rebecca Fitzgerald.
Abstract
Correlation patterns between matched copy number variation and gene expression data in cancer samples enable the inference of causal gene regulatory relationships by exploiting the natural randomization of such systems. The aim of this study was to test and verify experimentally the accuracy of a causal inference approach based on genomic randomization using esophageal cancer samples. Two candidates with strong regulatory effects emerging from our analysis are components of growth factor receptors, and implicated in cancer development, namely ERBB2 and FGFR2. We tested experimentally two ERBB2 and three FGFR2 regulated interactions predicted by the statistical analysis, all of which were confirmed. We also applied the method in a meta-analysis of 10 cancer datasets and tested 15 of the predicted regulatory interactions experimentally. Three additional predicted ERBB2 regulated interactions were confirmed, as well as interactions regulated by ARPC1A and FANCG. Overall, two thirds of experimentally tested predictions were confirmed.
Year: 2013 PMID: 23737949 PMCID: PMC3667814 DOI: 10.1371/journal.pone.0063780
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Figure 1. Schematic illustration of analysis.
A. Starting with genome-wide data from array comparative genomic hybridization and microarray gene expression, potential gene regulations were identified based on three conditions, marked i–iii in the figure and described in detail in the text. B. Flow chart of steps involved to validate the predictions.
Results for all predicted regulatory gene interactions that were tested experimentally, single and multiple datasets combined.
| Regulator | Target gene | direction | dir ok | p-value | out | signif | fdr | Cell line |
| ERBB2 | BST1 | + | 1 | 0.000 | 1 | *** | 0.000 | OE19 |
|  | IFIT1 | + | 1 | 0.010 | 0 | *** | 0.029 | OE19 |
|  | PPP2R3A | + | 0 | 0.000 | 1 |  | 0.000 | BT474 |
|  | KCNS1 | + | 0 | 0.002 | 2 |  | 0.007 | BT474 |
|  | PFDN5 | − | 1 | 0.000 | 1 | *** | 0.000 | BT474 |
|  | GAL3ST4 | − | 1 | 0.013 | 1 | ** | 0.031 | BT474 |
|  | PPP2R3A | + | 1 | 0.160 | 1 |  | 0.213 | OE19 |
|  | KCNS1 | + | 1 | 0.030 | 1 | ** | 0.048 | OE19 |
|  | PFDN5 | − | 1 | 0.000 | 1 | *** | 0.000 | OE19 |
|  | GAL3ST4 | − | 1 | 0.011 | 0 | ** | 0.029 | OE19 |
| FGFR2 | JAK1 | + | 1 | 0.027 | 2 | ** | 0.046 | HSC39 |
|  | NFIA | + | 1 | 0.000 | 1 | *** | 0.000 | HSC39 |
|  | SAMD12 | + | 1 | 0.017 | 2 | ** | 0.034 | HSC39 |
| ARPC1A | NCBP2 | + | 1 | 0.424 | 2 |  | 0.443 | AsPc1 |
|  | VTI1B | + | 1 | 0.044 | 1 | * | 0.066 | AsPc1 |
|  | YEATS2 | + | 0 | 0.128 | 1 |  | 0.181 | AsPc1 |
|  | TNFRSF8 | − | 1 | 0.017 | 1 | ** | 0.034 | AsPc1 |
|  | PTGDS | − | 1 | 0.000 | 1 | *** | 0.000 | AsPc1 |
|  | MFNG | − | 1 | 0.207 | 1 |  | 0.261 | AsPc1 |
| FANCG | KIRREL3 | + | 1 | 0.377 | 1 |  | 0.431 | BT474 |
|  | PBX3 | + | 1 | 0.027 | 1 | ** | 0.046 | BT474 |
|  | CKB | − | 1 | 0.365 | 1 |  | 0.431 | BT474 |
|  | ALDH6A1 | − | 0 | 0.425 | 1 |  | 0.443 | BT474 |
|  | PCDHB6 | − | 0 | 0.490 | 1 |  | 0.490 | BT474 |
A total of 24 different regulating-target gene interactions were tested, of which 13 (54.2%) were validated.
+/−: Positive/negative gene regulation as predicted by genomic randomization (‘+’: the regulating gene acts as an inducer, so reduced expression of the regulating gene leads to reduced expression of its target genes; ‘−’: the regulating gene acts as a suppressor, so reduced expression of the regulating gene leads to increased expression of its target genes).
dir ok: ‘1’ means the direction of regulation observed in validation matched the genomic randomization prediction.
p-value: Statistical significance according to a linear mixed model analysis.
out: Number of outliers removed for analysis.
signif: Statistical significance according to p-value and correct direction of regulating gene effects: *, **, ***.
fdr: p-value following Benjamini-Hochberg multiple-testing adjustment.
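The validation count quoted in the footnotes can be recomputed from the table. The sketch below is illustrative only: it assumes that an interaction counts as validated when the observed direction matches the prediction (dir ok = 1) and the fdr is below 0.05, a criterion these footnotes do not state explicitly. The tuple layout and the name `is_validated` are hypothetical.

```python
# Rows of the validation table: (regulator, target, dir_ok, fdr, cell_line).
ROWS = [
    ("ERBB2", "BST1", 1, 0.000, "OE19"),
    ("ERBB2", "IFIT1", 1, 0.029, "OE19"),
    ("ERBB2", "PPP2R3A", 0, 0.000, "BT474"),
    ("ERBB2", "KCNS1", 0, 0.007, "BT474"),
    ("ERBB2", "PFDN5", 1, 0.000, "BT474"),
    ("ERBB2", "GAL3ST4", 1, 0.031, "BT474"),
    ("ERBB2", "PPP2R3A", 1, 0.213, "OE19"),
    ("ERBB2", "KCNS1", 1, 0.048, "OE19"),
    ("ERBB2", "PFDN5", 1, 0.000, "OE19"),
    ("ERBB2", "GAL3ST4", 1, 0.029, "OE19"),
    ("FGFR2", "JAK1", 1, 0.046, "HSC39"),
    ("FGFR2", "NFIA", 1, 0.000, "HSC39"),
    ("FGFR2", "SAMD12", 1, 0.034, "HSC39"),
    ("ARPC1A", "NCBP2", 1, 0.443, "AsPc1"),
    ("ARPC1A", "VTI1B", 1, 0.066, "AsPc1"),
    ("ARPC1A", "YEATS2", 0, 0.181, "AsPc1"),
    ("ARPC1A", "TNFRSF8", 1, 0.034, "AsPc1"),
    ("ARPC1A", "PTGDS", 1, 0.000, "AsPc1"),
    ("ARPC1A", "MFNG", 1, 0.261, "AsPc1"),
    ("FANCG", "KIRREL3", 1, 0.431, "BT474"),
    ("FANCG", "PBX3", 1, 0.046, "BT474"),
    ("FANCG", "CKB", 1, 0.431, "BT474"),
    ("FANCG", "ALDH6A1", 0, 0.443, "BT474"),
    ("FANCG", "PCDHB6", 0, 0.490, "BT474"),
]


def is_validated(dir_ok, fdr, threshold=0.05):
    """Assumed criterion: direction matches and fdr is below the threshold."""
    return dir_ok == 1 and fdr < threshold


validated = [row for row in ROWS if is_validated(row[2], row[3])]
fraction = 100 * len(validated) / len(ROWS)
print(f"{len(validated)} of {len(ROWS)} validated ({fraction:.1f}%)")
```

Under this assumed criterion the script reports 13 of 24 interactions, in agreement with the 54.2% stated in the footnote.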
Figure 2. Quantile-quantile plots of observed versus expected partial correlations in the EAC dataset.
(for the 2000 probes with the maximum expression variance) A. Partial correlations between each probe’s array comparative genomic hybridization (aCGH) profile and its own expression profile. The two potential regulating genes selected for experimental validation (ERBB2 and FGFR2) are marked in the plot. B. Partial correlations between ERBB2’s aCGH profile and the expression profiles of all other probes. The two genes, IFIT1 and BST1, selected for experimental validation are marked. C. Partial correlations between FGFR2’s aCGH profile and the expression profiles of all other probes. The three genes, JAK1, NFIA and SAMD12, selected for validation are marked. In all plots 5% confidence intervals are marked by dashed lines.
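As background for the partial correlations plotted here, the following is a minimal sketch of a first-order partial correlation between two profiles `x` and `y` given a single conditioning profile `z`. Which variables the authors condition on is not stated in this caption, so `z`, the function names and the toy data are assumptions for illustration and not the study's implementation.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


def partial_corr(x, y, z):
    """First-order partial correlation of x and y, controlling for z.

    r_xy.z = (r_xy - r_xz * r_yz) / sqrt((1 - r_xz^2) * (1 - r_yz^2))
    """
    r_xy = pearson(x, y)
    r_xz = pearson(x, z)
    r_yz = pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Toy profiles (hypothetical values): z is uncorrelated with both x and y,
# so the partial correlation reduces to the plain Pearson correlation.
x = [1, 2, 3, 4]
y = [1, 3, 2, 4]
z = [1, -1, -1, 1]
print(partial_corr(x, y, z))
```

When the conditioning profile is correlated with both `x` and `y`, the partial correlation removes the part of their association that `z` explains, which is the reason it can differ substantially from the Pearson value.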
Single dataset: List of top 10 potential target genes whose expression was highly correlated with the aCGH status of regulating genes, ERBB2 and FGFR2.
| Regulating gene | Chrom | Target gene | Chrom | fdr | sign |
| ERBB2 | 17 | IFIT1 | 10 | 0.024 | + |
|  |  | BST1 | 4 | 0.049 | + |
|  |  | SLCO1B3 | 12 | 0.053 | + |
|  |  | PPARGC1A | 4 | 0.058 | + |
|  |  | ALPPL2 | 2 | 0.074 | + |
|  |  | PRRX2 | 9 | 0.092 | + |
|  |  | MSX2 | 5 | 0.104 | + |
|  |  | RHOU | 1 | 0.161 | + |
|  |  | GSTM3 | 1 | 0.200 | + |
|  |  | ATP10B | 5 | 0.200 | + |
| FGFR2 | 10 | JAK1 | 1 | 0.003 | + |
|  |  | NFIA | 1 | 0.044 | + |
|  |  | SAMD12 | 8 | 0.082 | + |
|  |  | DCUN1D1 | 3 | 0.100 | + |
|  |  | DSG1 | 18 | 0.136 | + |
|  |  | PNLIPRP2 | 10 | 0.180 | + |
|  |  | PTPN2 | 18 | 0.180 | + |
|  |  | DRD5 | 4 | 0.180 | + |
|  |  | CKMT2 | 5 | 0.204 | + |
|  |  | MCART1 | 9 | 0.204 | + |
Based on the false discovery rate (fdr), which is the Benjamini-Hochberg adjusted p-value. The sign of the correlation (positive or negative, +/−) is indicated. Hypothetical proteins have been excluded from the list. The top three gene pairs were selected for subsequent experimental validation, except for ERBB2-SLCO1B3, because primers designed for SLCO1B3 could not be optimized for qRT-PCR assays. Chrom = Chromosome.
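The Benjamini-Hochberg adjustment referred to in this footnote can be sketched as follows. This is a generic textbook version of the step-up procedure written for illustration; the function name `benjamini_hochberg` and the example p-values are hypothetical, and the authors' own software is not specified here.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Each p-value is scaled by n / rank (rank 1 = smallest p-value), then the
    adjusted values are made monotone by taking a running minimum from the
    largest p-value downwards.
    """
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0  # adjusted values are capped at 1
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical p-values, not taken from the study.
example = [0.01, 0.04, 0.03, 0.005]
print(benjamini_hochberg(example))
```

A gene pair is then called significant when its adjusted value falls below the chosen fdr threshold, so the list above is effectively ranked by how strict a threshold each pair would survive.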
Figure 3. qRT-PCR quantifications of mRNA levels in RNA interference assays in selected cancer cell lines with amplifications of regulating genes.
(a) Silencing of ERBB2 in OE19 cells, which harbor ERBB2 amplifications, showed significant reduction of the mRNA levels of: (i) ERBB2 and its predicted target genes, (ii) BST1 and (iii) IFIT1. (b) Silencing of FGFR2 in HSC39 cells, which harbor FGFR2 amplifications, showed significant reduction of the mRNA levels of: (i) FGFR2 and its predicted target genes, (i) JAK1, (ii) NFIA and (iii) SAMD12. Note: (R) regulating genes targeted by targeting siRNAs; (T) potential target genes tested, predicted to be positively regulated (+) by regulating genes; (−ve) non-silencing negative siRNAs. siRNAs used in the panel are named according to their commercial product name (Qiagen). The vertical axes for all plots are fixed from 0.0–2.0, except for the plot for IFIT1, where the vertical axis is customized due to the variability in the gene expression changes.
Multiple datasets: List of potential regulating genes arranged according to the number of datasets in which the gene has significant aCGH-expression correlations, and according to the fdr-adjusted p-value, derived from 11 individual correlation p-values.
| Regulating gene | Chromosome | fdr | Number of datasets |
| ERBB2 | 17 | 0.00004 | 9 |
| GRB7 | 17 | 0.00004 | 8 |
| ARPC1A | 7 | 0.00004 | 7 |
| STIP1 | 11 | 0.0002 | 7 |
| FANCG | 9 | 0.0005 | 7 |
| RBM6 | 3 | 0.0006 | 7 |
| RAD23B | 9 | 0.0006 | 7 |
| SRRM2 | 16 | 0.0006 | 7 |
| PPFIA1 | 11 | 0.00004 | 6 |
| PEX1 | 7 | 0.00004 | 6 |
The number of datasets contributing to the significance of each correlation was important to ensure that no single cohort or cancer type was introducing bias to the analysis.
Regulating genes selected for subsequent validation assays via RNAi experiments. Cell lines with amplifications of these genes were reported (CONAN - Cancer Genome Project, Wellcome Trust Sanger Institute: http://www.sanger.ac.uk/cgi-bin/genetics/CGP/conan/search.cgi). GRB7 was not chosen because of its proximity to ERBB2.
Multiple datasets: List of top 10 potential target genes (5 positive correlations and 5 negative correlations) for the three potential regulating genes selected for experimental validation: ERBB2, ARPC1A and FANCG.
| Regulating gene | Chrom | Target gene | Chrom | Num | fdr |
| ERBB2/positive | 17 | PPP2R3A* | 3 | 6 | 0.0019 |
|  |  | GAS2 | 11 | 5 | 0.0012 |
|  |  | KCNS1* | 20 | 5 | 0.0022 |
|  |  | PTPN11 | 12 | 4 | 0.0008 |
|  |  | SRPK1 | 6 | 4 | 0.0010 |
| ERBB2/negative | 17 | PFDN5* | 12 | 6 | 0.0005 |
|  |  | GAL3ST4* | 7 | 5 | 0.0092 |
|  |  | OLFML3 | 1 | 5 | 0.0107 |
|  |  | ARL3 | 10 | 4 | 0.0014 |
|  |  | COX7A1 | 19 | 4 | 0.0016 |
| ARPC1A/positive | 7 | NCBP2* | 3 | 4 | 0.0401 |
|  |  | VTI1B* | 14 | 4 | 0.0413 |
|  |  | GTF3C3 | 2 | 3 | 0.0225 |
|  |  | YEATS2* | 3 | 3 | 0.0321 |
|  |  | SPTBN2 | 11 | 3 | 0.0347 |
| ARPC1A/negative | 7 | TNFRSF8* | 1 | 5 | 0.0017 |
|  |  | PTGDS* | 9 | 5 | 0.0038 |
|  |  | MFNG* | 22 | 5 | 0.0112 |
|  |  | IL16 | 15 | 4 | 0.0076 |
|  |  | TGFBR2 | 3 | 4 | 0.0080 |
| FANCG/positive | 9 | CTLA4 | 2 | 4 | 0.0800 |
|  |  | KIRREL3* | 11 | 3 | 0.0076 |
|  |  | PBX3* | 9 | 3 | 0.0124 |
|  |  | AGTR2 | X | 3 | 0.0137 |
|  |  | GLRA2 | X | 3 | 0.0172 |
| FANCG/negative | 9 | CKB* | 14 | 4 | 0.0104 |
|  |  | ALDH6A1* | 14 | 4 | 0.0250 |
|  |  | PCDHB6* | 5 | 3 | 0.0158 |
|  |  | CRYGD | 2 | 3 | 0.0294 |
|  |  | HADHA | 2 | 3 | 0.0510 |
Target genes are arranged according to the number of datasets (Num) in which the gene pair has significant aCGH-expression correlations, and the fdr-adjusted combined p-value. Target genes located on the same chromosome as their potential regulating genes are excluded from the lists. Asterisks (*) mark the potential target genes investigated in validation experiments. Chrom = Chromosome.
Figure 4. qRT-PCR quantifications of mRNA levels following RNA interference assays in selected cancer cell lines with amplifications of regulating genes.
(a) Effects of ERBB2-targeting siRNA treatment in BT474 cells, showing: (i) effective silencing of ERBB2, leading to significant up-regulation of (ii) PFDN5 and (iii) GAL3ST4. (b) Effects of ERBB2-targeting siRNA treatment in OE19 cells, showing: (i) effective silencing of ERBB2, leading to (ii) down-regulation of KCNS1; (iii) up-regulation of PFDN5 and (iv) up-regulation of GAL3ST4. (c) Effects of ARPC1A-targeting siRNA treatment in AsPc1 cells, showing: (i) effective silencing of ARPC1A, leading to significant up-regulation of (ii) TNFRSF8 and (iii) PTGDS. (d) Effects of FANCG-targeting siRNA treatment in BT474 cells, showing: (i) effective silencing of FANCG, leading to significant up-regulation of (ii) PBX3. Note: (R) regulating genes targeted by targeting siRNAs; (T) potential target genes, positively (+) or negatively (−) regulated by regulating genes; (−ve) non-silencing negative siRNAs; siRNAs used in the panel are named according to their commercial product name (Qiagen). The vertical axes for plots showing silencing of regulating genes are fixed from 0.0–2.0, whilst the vertical axes for plots of target genes were customized according to the variability in the gene expression levels.
Comparison of the false discovery rate (fdr) adjusted p-values obtained from an analysis of both datasets using Pearson correlation and using partial correlation, and the performance of their predictions in light of the results from the 24 validation experiments.
| Regulator | Target | Experiment fdr | Pearson fdr | Pearson performance | Partial fdr | Partial performance |
| ERBB2 | BST1 | 0.000 | 0.329 | FN |  |  |
|  | IFIT1 | 0.010 | 0.245 | FN |  |  |
|  | PPP2R3A | (0.000) | 0.002 | FP | 0.018 | FP |
|  | KCNS1 | (0.002) | 0.002 | FP | 0.002 | FP |
|  | PFDN5 | 0.000 | 0.001 | TP | 0.005 | TP |
|  | GAL3ST4 | 0.013 | 0.009 | TP | 0.021 | TP |
|  | PPP2R3A | 0.160 | 0.002 | FP | 0.018 | FP |
|  | KCNS1 | 0.030 | 0.002 | TP | 0.002 | TP |
|  | PFDN5 | 0.000 | 0.001 | TP | 0.005 | TP |
|  | GAL3ST4 | 0.011 | 0.009 | TP | 0.021 | TP |
| FGFR2 | JAK1 | 0.027 | 0.937 | FN |  |  |
|  | NFIA | 0.000 | 0.937 | FN |  |  |
|  | SAMD12 | 0.017 | 0.937 | FN |  |  |
| ARPC1A | NCBP2 | 0.424 | 0.040 | FP | 0.587 | TN |
|  | VTI1B | 0.044 | 0.041 | TP | 0.757 | FN |
|  | YEATS2 | (0.128) | 0.032 | FP | 0.641 | TN |
|  | TNFRSF8 | 0.017 | 0.002 | TP | 0.001 | TP |
|  | PTGDS | 0.000 | 0.004 | TP | 0.045 | TP |
|  | MFNG | 0.207 | 0.011 | FP | 0.1 | TN |
| FANCG | KIRREL3 | 0.377 | 0.008 | FP | 0.016 | FP |
|  | PBX3 | 0.027 | 0.012 | TP | 0.005 | TP |
|  | CKB | 0.365 | 0.010 | FP | 0.083 | TN |
|  | ALDH6A1 | (0.425) | 0.025 | FP | 0.102 | TN |
|  | PCDHB6 | (0.490) | 0.016 | FP | 0.014 | FP |
TP = True Positive, TN = True Negative, FP = False Positive, FN = False Negative, based on an fdr threshold of 0.05. () = experimental direction of change does not agree with prediction.
predictions from the single dataset. Partial correlations in italics were calculated using a 2000 probe subset of the single dataset, otherwise partial correlations were calculated from the multiple datasets using up to 15000 probes.
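The performance labels in the comparison table can be reproduced with a simple rule, sketched below for the Pearson column. The rule is an assumption inferred from the footnotes: an interaction is experimentally positive when its experiment value is below 0.05 and is not bracketed (direction agrees), and it is predicted positive when the Pearson fdr is below 0.05. The names `classify` and `PEARSON_ROWS` are hypothetical.

```python
from collections import Counter

# (target, experiment value, direction agrees, Pearson fdr), in table order.
PEARSON_ROWS = [
    ("BST1", 0.000, True, 0.329), ("IFIT1", 0.010, True, 0.245),
    ("PPP2R3A", 0.000, False, 0.002), ("KCNS1", 0.002, False, 0.002),
    ("PFDN5", 0.000, True, 0.001), ("GAL3ST4", 0.013, True, 0.009),
    ("PPP2R3A", 0.160, True, 0.002), ("KCNS1", 0.030, True, 0.002),
    ("PFDN5", 0.000, True, 0.001), ("GAL3ST4", 0.011, True, 0.009),
    ("JAK1", 0.027, True, 0.937), ("NFIA", 0.000, True, 0.937),
    ("SAMD12", 0.017, True, 0.937), ("NCBP2", 0.424, True, 0.040),
    ("VTI1B", 0.044, True, 0.041), ("YEATS2", 0.128, False, 0.032),
    ("TNFRSF8", 0.017, True, 0.002), ("PTGDS", 0.000, True, 0.004),
    ("MFNG", 0.207, True, 0.011), ("KIRREL3", 0.377, True, 0.008),
    ("PBX3", 0.027, True, 0.012), ("CKB", 0.365, True, 0.010),
    ("ALDH6A1", 0.425, False, 0.025), ("PCDHB6", 0.490, False, 0.016),
]


def classify(experiment, direction_agrees, predicted_fdr, threshold=0.05):
    """Label one prediction as TP, FP, TN or FN under the assumed rule."""
    actual = direction_agrees and experiment < threshold
    predicted = predicted_fdr < threshold
    if predicted:
        return "TP" if actual else "FP"
    return "FN" if actual else "TN"


counts = Counter(classify(exp, ok, fdr) for _, exp, ok, fdr in PEARSON_ROWS)
print(dict(counts))
```

Applied to the 24 rows, this rule yields the same label as the Pearson performance column in every row (9 TP, 10 FP, 5 FN and no TN). The same function can be applied to the partial correlation column wherever an fdr value is listed.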