Richard Newton, Lorenz Wernisch.
Abstract
The copy numbers of genes in cancer samples are often highly disrupted and form a natural amplification/deletion experiment encompassing multiple genes. Matched array comparative genomics and transcriptomics datasets from such samples can be used to predict inter-chromosomal gene regulatory relationships. Previously we published the database METAMATCHED, comprising the results from such an analysis of a large number of publicly available cancer datasets. Here we investigate genes in the database which are unusual in that their copy number exhibits consistent heterogeneous disruption in a high proportion of the cancer datasets. We assess the potential relevance of these genes to the pathology of the cancer samples, in light of their predicted regulatory relationships and enriched biological pathways. A network-based method was used to identify enriched pathways from the genes' inferred targets. The analysis predicts both known and new regulator-target interactions and pathway memberships. We examine examples in detail, in particular the gene POGZ, which is disrupted in many of the cancer datasets and has an unusually large number of predicted targets, from which the network analysis predicts membership of cancer-related pathways. The results suggest close involvement in known cancer pathways of genes exhibiting consistent heterogeneous copy number disruption. Further experimental work would clarify their relevance to tumor biology. The results of the analysis presented in the database METAMATCHED, and included here as an R archive file, constitute a large number of predicted regulatory relationships and pathway memberships which we anticipate will be useful in informing such experiments.
Year: 2019 PMID: 31335867 PMCID: PMC6650054 DOI: 10.1371/journal.pone.0213221
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.752
Fig 1. Schematic diagram showing the steps involved in the analysis.
Details of the 45 datasets used in the meta-analysis, the original 31 followed by the 14 new datasets.
| Code | GEO | Publication | N | P | Pathology |
|---|---|---|---|---|---|
| parr | GSE20486 | [ | 97 | 18616 | Breast Cancer (Diploid) |
| crow | GSE15134 | [ | 31 | 16153 | Breast Cancer (ER+) |
| sirc | GSE17907 | [ | 51 | 14689 | Breast Cancer (ERBB2 amplified) |
| myll | | [ | 46 | 17050 | Gastric Cancer |
| junn | | [ | 10 | 16844 | Gastric Cancer |
| ch.w | | [ | 91 | 10285 | Lung adenocarcinoma |
| ch.s | | [ | 94 | 10285 | Lung adenocarcinoma |
| hoac | GSE20154 | [ | 54 | 14388 | Oesophageal adenocarcinoma |
| zho | GSE29023 | [ | 115 | 13697 | Multiple Myeloma |
| shai | GSE26089 | [ | 68 | 14201 | Pancreatic Cancer |
| vain | GSE28403 | [ | 13 | 10107 | Prostate Cancer |
| bott | GSE29211 | [ | 53 | 10321 | Pleural Mesothelioma |
| bekh | GSE23720 | [ | 173 | 13682 | Breast Cancer (Inflammatory) |
| chap | GSE26863 | [ | 245 | 13667 | Multiple Myeloma |
| ooi | GSE22785 | [ | 14 | 10091 | Neuroblastoma |
| brag | GSE12668 | [ | 11 | 10310 | Waldenström’s Macroglobulinemia |
| jons | GSE22133 | [ | 356 | 4183 | Breast Cancer |
| mura | GSE24707 | [ | 47 | 4472 | Breast Cancer |
| lin1 | GSE19915 | [ | 72 | 4965 | Urothelial Carcinoma |
| beck | GSE17555 | [ | 18 | 12174 | Leiomyosarcoma |
| toed | GSE18166 | [ | 74 | 4289 | Astrocytic Gliomas |
| ell | GSE35191 | [ | 124 | 13569 | Breast Cancer |
| gra.1 | GSE35988 | [ | 85 | 12849 | Prostate Cancer |
| gra.2 | GSE35988 | [ | 34 | 12813 | Prostate Cancer |
| lenz | GSE11318 | [ | 203 | 15212 | Lymphoma |
| lin2 | GSE32549 | [ | 131 | 8450 | Urothelial Carcinoma |
| micc | GSE38230 | [ | 12 | 16657 | Vulva Squamous Cell Carcinoma |
| tayl | GSE21032 | [ | 155 | 14572 | Prostate Cancer |
| coco | GSE25711 | [ | 36 | 4394 | Neuroblastoma |
| med | GSE14079 | [ | 8 | 6376 | Lung Cancer |
| przy | GSE54188 | [ | 53 | 17032 | Synovial Sarcoma |
| huang | GSE30311 | [ | 98 | 14927 | Ovarian Cancer |
| lira | GSE34211 | [ | 89 | 14907 | Cancer Cell lines |
| chpy.1 | GSE34171 | [ | 87 | 14907 | Diffuse Large B-cell Lymphoma |
| chpy.2 | GSE34171 | [ | 78 | 10437 | Diffuse Large B-cell Lymphoma |
| ross | GSE70770 | [ | 78 | 15150 | Prostate Cancer |
| ochs | GSE33232 | | 69 | 14489 | Head and Neck Squamous Cell Carcinoma |
| rama | GSE19539 | [ | 67 | 14972 | Ovarian Cancer |
| guar | GSE66399 | [ | 65 | 14833 | Breast Cancer |
| wilk | GSE36471 | [ | 47 | 13681 | Lung Adenocarcinoma |
| dona | GSE32688 | [ | 32 | 14833 | Pancreatic Cancer |
| zhu | GSE12805 | | 31 | 11639 | Osteosarcoma |
| kuij | GSE33383 | [ | 29 | 14339 | Osteosarcoma |
| pau | GSE26576 | [ | 29 | 14883 | Diffuse Intrinsic Pontine Glioma |
| weig | GSE57549 | [ | 25 | 15150 | Breast Cancer |
GEO = Gene Expression Omnibus dataset reference (http://www.ncbi.nlm.nih.gov/geo/), N = Number of samples, P = Number of matched probes.
* http://www.cangem.org/
† http://cbio.mskcc.org/Public/lung_array_data/
‡ Expression data in ArrayExpress (http://www.ebi.ac.uk/arrayexpress/): E-TABM-38, E-MTAB-161.
Experiments ch, gra and chpy each use two expression platforms, so the samples from each platform are treated as a separate dataset, to avoid spurious correlations that may be caused by systematic shifts between the two sets of expression data. Each of these experiments contributes two datasets to the study, resulting in 45 datasets from 42 experiments.
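The platform split described in the note above can be sketched as follows. This is only an illustration under stated assumptions, not the authors' code: the sample records, the field names `experiment`, `platform` and `sample_id`, and the platform labels `A` and `B` are all hypothetical. Only the experiment codes and the counts of 42 experiments and 45 datasets come from the table.

```python
from collections import defaultdict

def split_by_platform(samples):
    """Group sample IDs into datasets keyed by (experiment, platform)."""
    datasets = defaultdict(list)
    for s in samples:
        datasets[(s["experiment"], s["platform"])].append(s["sample_id"])
    return dict(datasets)

# Hypothetical sample records; platform labels are placeholders.
samples = [
    {"experiment": "gra", "platform": "A", "sample_id": "s1"},
    {"experiment": "gra", "platform": "B", "sample_id": "s2"},
    {"experiment": "gra", "platform": "A", "sample_id": "s3"},
    {"experiment": "tayl", "platform": "A", "sample_id": "s4"},
]
datasets = split_by_platform(samples)

# Dataset count from the table note: 42 experiments, three of which
# (ch, gra, chpy) are split into two datasets each.
n_experiments = 42
split_experiments = ["ch", "gra", "chpy"]
n_datasets = n_experiments + len(split_experiments)
```

Keying on the pair `(experiment, platform)` keeps samples from different platforms apart, so a systematic shift between platforms cannot show up as a correlation within one dataset.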
Fig 2. Schematic diagram showing the steps involved in the network-based pathway enrichment analysis.
Fig 3. Frequency of occurrence of the major types of pathway amongst the enriched pathways of the 250 regulators.
Regulatory genes with self aCGH/expression correlation in the greatest number of datasets.
| Regulator | Targs | D’sets | Main pathway |
|---|---|---|---|
| AZIN1 | 2 | 22 | Regulation of ornithine decarboxylase (ODC) |
| AGO2 | 7 | 21 | Transcriptional regulation by small RNAs |
| DERL1 | 3 | 21 | E3 ubiquitin ligases ubiquitinate target proteins |
| PTK2 | 3 | 21 | PMID: 9187108 [ |
| BCL9 | 1 | 21 | Formation of the beta-catenin:TCF transactivating complex |
| POGZ | 33 | 20 | RNA polymerase II transcribes snRNA genes |
| MRPS28 | 4 | 19 | Mitochondrial translation termination |
| YWHAZ | 4 | 19 | ATR signaling pathway |
| CHD1 | 2 | 18 | Estrogen-dependent gene expression |
| ZC3H3 | 1 | 18 | Cleavage of Growing Transcript in the Termination Region;Transport of Mature mRNA derived from an Intron-Containing Transcript;mRNA 3’-end processing |
| HSBP1 | 5 | 17 | CXCR4-mediated signaling events;IL12 signaling mediated by STAT4;IL12-mediated signaling events;TCR signaling in naïve CD4+ T cells |
| ANKRD46 | 4 | 17 | RUNX1 regulates genes involved in megakaryocyte differentiation and platelet function |
| RABGAP1 | 1 | 17 | Regulation of gene expression in beta cells |
| TRAK1 | 1 | 17 | Signaling by BRAF and RAF fusions |
| TERF2IP | 8 | 16 | Acetylcholine regulates insulin secretion;Activation of … |
| RBBP5 | 2 | 16 | Activation of anterior HOX genes in hindbrain development during early embryogenesis;RUNX1 regulates genes involved in megakaryocyte differentiation and platelet function |
| ATG7 | 1 | 16 | Antigen processing: Ubiquitination and Proteasome degradation;Interconversion of nucleotide di- and triphosphates |
| NCOA6 | 1 | 16 | Activation of anterior HOX genes in hindbrain development during early embryogenesis |
| SLC30A5 | 1 | 16 | NEP/NS2 Interacts with the Cellular Export Machinery;NS1 Mediated … |
| VCPIP1 | 1 | 16 | Ovarian tumor domain proteases |
| WWOX | 1 | 16 | Formation of the beta-catenin:TCF transactivating complex |
| PTS | 6 | 15 | tetrahydrobiopterin biosynthesis I;tetrahydrobiopterin biosynthesis II |
| DOCK1 | 1 | 15 | Integrin signalling pathway |
Targs = Number of significant targets for which the regulator is the best regulator (p-value < 0.05); D’sets = Number of datasets in which the regulator has a significant correlation between its aCGH profile and its own expression profile.
* TEP1, encoded by a candidate tumor suppressor locus, is a novel protein tyrosine phosphatase regulated by transforming growth factor beta.
** P-TEN, the tumor suppressor from human chromosome 10q23, is a dual-specificity phosphatase.
*** Acetylcholine regulates insulin secretion;Activation of NF-kappaB in B cells;Activation of RAS in B cells;Antigen activates B Cell Receptor (BCR) leading to generation of second messengers;Arachidonate production from DAG;Ca2+ pathway;EGFR Transactivation by Gastrin;Effects of PIP2 hydrolysis;Elevation of cytosolic Ca2+ levels;Fatty Acids bound to GPR40 (FFAR1) regulate insulin secretion;G alpha (q) signalling events;G beta:gamma signalling through PLC beta;GPVI-mediated activation cascade;Rap1 signalling;Response to elevated platelet cytosolic Ca2+;Syndecan interactions;Synthesis of IP3 and IP4 in the cytosol.
****NEP/NS2 Interacts with the Cellular Export Machinery;NS1 Mediated Effects on Host Pathways;Nuclear Pore Complex (NPC) Disassembly;Nuclear import of Rev protein;Regulation of Glucokinase by Glucokinase Regulatory Protein;Regulation of HSF1-mediated heat shock response;Rev-mediated nuclear export of HIV RNA;SUMOylation of DNA damage response and repair proteins;SUMOylation of DNA replication proteins;SUMOylation of RNA binding proteins;SUMOylation of chromatin organization proteins;Transcriptional regulation by small RNAs;Transport of Mature mRNA Derived from an Intronless Transcript;Transport of Mature mRNA derived from an Intron-Containing Transcript;Transport of Ribonucleoproteins into the Host Nucleus;Transport of the SLBP Dependant Mature mRNA;Transport of the SLBP independent Mature mRNA;Viral Messenger RNA Synthesis;Vpr-mediated nuclear import of PICs;snRNP Assembly;tRNA processing in the nucleus.
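The D’sets column counts the datasets in which a gene's aCGH profile correlates significantly with its own expression profile. The sketch below shows one way such a count could be computed. The test used in the analysis is not stated here, so a one-sided permutation test on the Pearson correlation is assumed in its place. The profiles, the function names and the 0.05 threshold applied to this particular test are likewise illustrative assumptions.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def permutation_pvalue(x, y, n_perm=2000, seed=0):
    """One-sided permutation p-value for a positive correlation (assumed test)."""
    rng = random.Random(seed)
    observed = pearson(x, y)
    y_perm = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(y_perm)
        if pearson(x, y_perm) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)

def count_self_correlated(profiles, alpha=0.05):
    """Count datasets where copy number and expression correlate significantly."""
    return sum(1 for cn, ex in profiles if permutation_pvalue(cn, ex) < alpha)

# Hypothetical (copy number, expression) profiles of one gene in two datasets.
copy_number = [-1.0, -0.5, 0.0, 0.2, 0.5, 0.8, 1.0, 1.5]
profiles = [
    (copy_number, [-0.9, -0.6, 0.1, 0.1, 0.6, 0.7, 1.2, 1.4]),   # tracks copy number
    (copy_number, [0.3, -0.2, 0.5, -0.4, 0.1, -0.3, 0.4, -0.1]),  # does not
]
n_self_correlated = count_self_correlated(profiles)
```

With these toy profiles only the first dataset passes the threshold, so the gene would be credited with one dataset.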
Fig 4. For regulators, number of significant best targets against number of datasets.
For each regulator, the number of predicted significant best targets is plotted against the number of datasets in which it shows significant self aCGH/expression correlation. Only regulators with self-correlation in 8 or more datasets and at least one enriched pathway are shown. The regulator with the maximum number of targets at each number of datasets is annotated.
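The annotation rule in this caption can be sketched with the rows of the regulator table above. The function name and the handling of ties (all tied regulators are kept) are assumptions. The table lists only the regulators with the most datasets, so the sketch covers dataset counts from 15 to 22 rather than the full range shown in the figure.

```python
from collections import defaultdict

# (regulator, significant best targets, datasets with self-correlation),
# copied from the table of regulators above.
regulators = [
    ("AZIN1", 2, 22), ("AGO2", 7, 21), ("DERL1", 3, 21), ("PTK2", 3, 21),
    ("BCL9", 1, 21), ("POGZ", 33, 20), ("MRPS28", 4, 19), ("YWHAZ", 4, 19),
    ("CHD1", 2, 18), ("ZC3H3", 1, 18), ("HSBP1", 5, 17), ("ANKRD46", 4, 17),
    ("RABGAP1", 1, 17), ("TRAK1", 1, 17), ("TERF2IP", 8, 16), ("RBBP5", 2, 16),
    ("ATG7", 1, 16), ("NCOA6", 1, 16), ("SLC30A5", 1, 16), ("VCPIP1", 1, 16),
    ("WWOX", 1, 16), ("PTS", 6, 15), ("DOCK1", 1, 15),
]

def top_regulators(rows, min_datasets=8):
    """For each dataset count, keep the regulator(s) with the most targets."""
    best = defaultdict(list)
    for name, targets, n_datasets in rows:
        if n_datasets < min_datasets:
            continue  # the figure shows only regulators with 8 or more datasets
        current = best[n_datasets]
        if not current or targets > current[0][1]:
            best[n_datasets] = [(name, targets)]
        elif targets == current[0][1]:
            current.append((name, targets))  # assumed tie rule: keep all
    return dict(best)

annotated = top_regulators(regulators)
for n_datasets in sorted(annotated, reverse=True):
    names = ", ".join(f"{name} ({targets})" for name, targets in annotated[n_datasets])
    print(n_datasets, names)
```

On these rows POGZ stands out with 33 targets at 20 datasets, and MRPS28 and YWHAZ tie at 19 datasets with 4 targets each.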
Most disrupted pathways in the 45 datasets used in the analysis.
| Pathway | Regulators | N |
|---|---|---|
| Downstream TCR signaling | ABCC12 ARHGEF4 ATXN7L2 HSBP1 IGSF9 MEF2C MNS1 PBXIP1 PIK3R1 PIP5K1A PLCG2 POGZ PSMB10 PTK2 RASGRF2 TERF2IP VHL | 33 |
| Generation of second messenger molecules | ARHGEF4 ATXN7L2 HSBP1 PIP5K1A PLCG2 POGZ PTK2 TERF2IP VHL | 33 |
| Antigen activates B Cell Receptor (BCR) leading to generation of second messengers | ABCC12 ARHGEF4 ATXN7L2 ETV7 GALE IGSF9 INTS8 MCMDC2 MEF2C MNS1 PBXIP1 PEBP4 PIK3R1 PIP5K1A PLCG2 PTK2 RASGRF2 TERF2IP | 31 |
| Formation of the beta-catenin:TCF transactivating complex | BCL9 CPEB4 HIST2H2BE HOXD9 LYL1 MYC NIPAL4 POGZ PYGO2 RAB3A RBBP5 WWOX ZNF446 | 31 |
| RUNX1 regulates genes involved in megakaryocyte differentiation and platelet function | ANKRD46 DAAM2 HIST2H2BE HOXD9 KMT2E LYL1 NIPAL4 POGZ RAB3A RBBP5 ZNF446 | 31 |
| PD-1 signaling | ARHGEF4 ATXN7L2 HSBP1 MNS1 POGZ VHL | 31 |
| Phosphorylation of CD3 and TCR zeta chains ∣ Translocation of ZAP-70 to Immunological synapse | ARHGEF4 ATXN7L2 HSBP1 POGZ VHL | 31 |
| FCERI mediated Ca+2 mobilization ∣ Role of phospholipids in phagocytosis | ABCC12 ETV7 IGSF9 MCMDC2 MEF2C MNS1 PBXIP1 PEBP4 PIK3R1 PIP5K1A PLCG2 PTK2 RASGRF2 TERF2IP | 30 |
| SUMOylation of DNA damage response and repair proteins | ASNSD1 GUCA2B HOXD9 LINC01587 NR3C2 NSMCE2 PHC3 SLC30A5 SSB TNNT2 TSPAN7 XRCC4 | 30 |
| Ca2+ pathway | ETV7 GALE INTS8 MCMDC2 PEBP4 PIP5K1A PLCG2 PTK2 TERF2IP | 30 |
| DAG and IP3 signaling ∣ PLC beta mediated events ∣ VEGFR2 mediated cell proliferation | ETV7 MCMDC2 PEBP4 PIP5K1A PLCG2 PTK2 TERF2IP | 30 |
| Major pathway of rRNA processing in the nucleolus and cytosol | AGO2 BCAN CDX1 CFHR4 DENND1C DUOX1 FAM212B FCRL5 GKN1 IDH1 KCNK4 KDELR1 KLK6 MYC NIP7 PDE4C PRKACA PROCR RIPK3 RPL13A RPS23 RPS4X SEC24D SF3B4 STARD4 TLE2 ZCCHC9 | 29 |
| Transcriptional regulation by small RNAs | AGO2 ASNSD1 BCAN GIMAP2 GUCA2B HIST2H2BE HOXD9 LINC01587 LYL1 NIPAL4 NR3C2 RAB3A SLC30A5 SSB TNNT2 TSPAN7 ZNF446 | 29 |
| snRNP Assembly | ASNSD1 CFHR4 FIGF GCG GEMIN7 GEMIN8 GUCA2B HOXD9 LINC01587 NR3C2 POGZ SLC30A5 SSB TNNT2 TSPAN7 | 29 |
| GPVI-mediated activation cascade | ABCC12 GALE IGSF9 INTS8 MEF2C MNS1 PBXIP1 PIK3R1 PIP5K1A PLCG2 PTK2 RASGRF2 TERF2IP | 29 |
| Estrogen-dependent gene expression | AGO2 CHD1 HIST2H2BE IQGAP2 LYL1 MYC NIPAL4 PIEZO1 POGZ RAB3A ZNF446 | 29 |
| FGF signaling pathway | ARHGEF4 COTL1 GAB2 PIK3R1 PLCG2 POGZ PTK2 SELPLG TNFSF10 | 29 |
| G alpha (q) signalling events | INTS8 MNS1 NPY4R PIP5K1A PLCG2 PTK2 TERF2IP | 29 |
| Activation of anterior HOX genes in hindbrain development during early embryogenesis | DAAM2 HIST2H2BE LYL1 NCOA6 NIPAL4 PARP10 PAX6 PLCG2 POGZ RAB3A RBBP5 VHL ZNF446 | 28 |
| PKMTs methylate histone lysines | ASH1L DAAM2 DQX1 KMT2E LYL1 MECOM PARP10 POGZ RBBP5 | 28 |
| G beta:gamma signalling through PLC beta | GALE INTS8 PIP5K1A PLCG2 PTK2 TERF2IP | 28 |
| MHC class II antigen presentation | DCTN4 GIMAP2 HSBP1 POGZ RASGRF2 RILP | 28 |
N = Total number of datasets in which pathway potentially disrupted
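One plausible reading of N is the size of the union of the datasets in which any regulator of the pathway is disrupted. The sketch below illustrates that reading only. The union rule is an assumption about how N was derived, and the regulator-to-dataset assignments are hypothetical (the dataset codes are borrowed from the dataset table purely as labels).

```python
def datasets_disrupted(pathway_regulators, regulator_datasets):
    """Union of the datasets in which any regulator of the pathway is disrupted."""
    disrupted = set()
    for regulator in pathway_regulators:
        disrupted |= regulator_datasets.get(regulator, set())
    return disrupted

# Hypothetical assignments of regulators to datasets.
regulator_datasets = {
    "POGZ": {"parr", "crow", "sirc"},
    "PTK2": {"crow", "zho"},
    "VHL": {"shai"},
}
pathway_regulators = ["POGZ", "PTK2", "VHL"]
n_disrupted = len(datasets_disrupted(pathway_regulators, regulator_datasets))
```

Under this reading a pathway can reach a higher N than any single one of its regulators, because different regulators may be disrupted in different datasets.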
Enriched pathways of POGZ with associated significant best targets.
| Pathway | Targets |
|---|---|
| TGF-beta signaling pathway | ACVR2B EP400 |
| EGF receptor signaling pathway;FGF signaling pathway | RASA1 |
| RNA polymerase II transcribes snRNA genes | SP1 YY1 TRRAP EP400 |
| RUNX1 regulates genes involved in megakaryocyte differentiation and platelet function;RUNX1 regulates transcription of genes involved in differentiation of HSCs | YY1 TRRAP KMT2A SMC4 EP400 |
| Activation of anterior HOX genes in hindbrain development during early embryogenesis;Estrogen-dependent gene expression | YY1 TRRAP KMT2A SMC4 EP400 |
| Condensation of Prophase Chromosomes | YY1 TRRAP KMT2A SMC4 EP400 |
| snRNP Assembly | GEMIN5 |
| MHC class II antigen presentation | SP1 HLA-DPA1 |
| Formation of the beta-catenin:TCF transactivating complex;Ub-specific processing proteases | YY1 TRRAP KMT2A SMC4 EP400 |
| PKMTs methylate histone lysines;RUNX1 regulates genes involved in megakaryocyte differentiation and platelet function;RUNX1 regulates transcription of genes involved in differentiation of HSCs | YY1 TRRAP KMT2A SMC4 EP400 |
| Formation of the beta-catenin:TCF transactivating complex;HATs acetylate histones | YY1 TRRAP KMT2A SMC4 EP400 NAT8L |
| HATs acetylate histones | SP1 YY1 TRRAP KMT2A SMC4 EP400 NAT8L |
| DNA Damage Recognition in GG-NER | YY1 |
| HATs acetylate histones;Ub-specific processing proteases | TRRAP EP400 |
| Formation of the beta-catenin:TCF transactivating complex;HATs acetylate histones;Ub-specific processing proteases | YY1 TRRAP KMT2A SMC4 EP400 |
| Downstream TCR signaling;Generation of second messenger molecules;Interferon gamma signaling;MHC class II antigen presentation;PD-1 signaling;Phosphorylation of CD3 and TCR zeta chains;Translocation of ZAP-70 to Immunological synapse | HLA-DPA1 |
| Downstream TCR signaling;Generation of second messenger molecules;PD-1 signaling;Phosphorylation of CD3 and TCR zeta chains;Translocation of ZAP-70 to Immunological synapse | HLA-DPA1 |
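To see which pathways each predicted target supports, the table can be inverted from pathway-to-targets into target-to-pathways. The sketch below does this for a subset of the rows above, splitting grouped pathway names on `;`. The function and variable names are illustrative and not part of METAMATCHED.

```python
from collections import defaultdict

# A subset of the rows above: (pathway cell, targets cell).
rows = [
    ("TGF-beta signaling pathway", "ACVR2B EP400"),
    ("EGF receptor signaling pathway;FGF signaling pathway", "RASA1"),
    ("RNA polymerase II transcribes snRNA genes", "SP1 YY1 TRRAP EP400"),
    ("Condensation of Prophase Chromosomes", "YY1 TRRAP KMT2A SMC4 EP400"),
    ("snRNP Assembly", "GEMIN5"),
    ("MHC class II antigen presentation", "SP1 HLA-DPA1"),
    ("HATs acetylate histones", "SP1 YY1 TRRAP KMT2A SMC4 EP400 NAT8L"),
    ("DNA Damage Recognition in GG-NER", "YY1"),
]

def invert(rows):
    """Map each target to the sorted list of pathways it appears in."""
    target_pathways = defaultdict(set)
    for pathway_cell, targets_cell in rows:
        for pathway in pathway_cell.split(";"):
            for target in targets_cell.split():
                target_pathways[target].add(pathway)
    return {t: sorted(p) for t, p in target_pathways.items()}

target_pathways = invert(rows)
```

In this subset EP400 and YY1 each appear in four pathways, which reflects how the same few targets recur across many of the enriched pathways of POGZ.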
Fig 5. POGZ: Formation of the beta-catenin:TCF transactivating complex.
Simplified diagram of the pathway ‘Formation of the beta-catenin:TCF transactivating complex’ as enriched in the target list of the gene POGZ, applying the less stringent criterion for a regulator to be the best regulator of targets. Nodes: Green = significant target; Red border = regulator is best regulator for target; Dark Blue = enriched centre that is also a significant target; Light Blue = enriched centre that is not a significant target. Edges: grey = ‘in-complex-with’; purple = ‘catalysis-precedes’, ‘used-to-produce’; orange = ‘controls-state-change-of’, ‘controls-phosphorylation-of’; brown = ‘controls-expression-of’, ‘controls-production-of’, ‘consumption-controlled-by’.
Fig 6. DERL1, MAN2A1 and POLD3: HDMs demethylate histones.
Simplified diagram of the pathway ‘HDMs demethylate histones’, fusing the results of the enrichment analysis for regulators DERL1, MAN2A1 and POLD3. Key to nodes and edges as in Fig 5, with the addition of: Nodes: Red = regulator; White = target that is not significant.