Michele Tinti, Kumara Dissanayake, Silvia Synowsky, Luca Albergante, Carol MacKintosh.
Abstract
The complexity of signalling pathways was boosted at the origin of the vertebrates, when two rounds of whole genome duplication (2R-WGD) occurred. Those genes and proteins that have survived from the 2R-WGD (termed 2R-ohnologues) belong to families of two to four members, and are enriched in signalling components relevant to cancer. Here, we find that while only approximately 30% of human transcript-coding genes are 2R-ohnologues, they carry 42-60% of the gene mutations in 30 different cancer types. Across a subset of cancer datasets, including melanoma, breast, lung adenocarcinoma, liver and medulloblastoma, we identified 673 2R-ohnologue families in which one gene carries mutations at multiple positions, while sister genes in the same family are relatively mutation free. Strikingly, in 315 of the 322 2R-ohnologue families displaying such a skew in multiple cancers, the same gene carries the heaviest mutation load in each cancer, and usually the second-ranked gene is also the same in each cancer. Our findings inspire the hypothesis that in certain cancers, heterogeneous combinations of genetic changes impair parts of the 2R-WGD signalling networks and force information flow through a limited set of oncogenic pathways in which specific non-mutated 2R-ohnologues serve as effectors. The non-mutated 2R-ohnologues are therefore potential therapeutic targets. These include proteins linked to growth factor signalling, neurotransmission and ion channels.
Keywords: 2R-ohnologue families; cancer; mutations; signal multiplexing; vertebrates
Year: 2014 PMID: 24806839 PMCID: PMC4042849 DOI: 10.1098/rsob.140029
Source DB: PubMed Journal: Open Biol ISSN: 2046-2441 Impact factor: 6.411
Figure 1. Distributions of cancer mutations between 2R-ohnologue and non-ohnologue transcript-coding genes in 30 different cancers. (a) The bar shows the percentage of 2R-ohnologue and non-ohnologue transcript-coding genes in the Ensembl 72 dataset, based on the provisional 2R-ohnologue list compiled by Makino & McLysaght [10]. Note that assignment of which human genes are 2R-ohnologues is still undergoing revision. (b) For each cancer type analysed [15], the graph on the left side reports the percentage of mutations that map on 2R-ohnologue and non-ohnologue genes. The right side reports the log10 value of the total number of mutations identified in each cancer type in this dataset.
Figure 2. Mutation loads (MLs) of 2R-ohnologue genes in melanoma. (a) For the melanoma dataset [15], the figure plots the ML distribution for 2R-ohnologues within families of 2, 3 and 4 members. Only families in which at least one gene carries 10 or more different mutations are included. The ML is computed as the total number of mutations identified for a gene divided by the total number of mutations in all members of the same 2R-ohnologue family. The y-axes give the number of genes, with the ML scores indicated on the x-axes. Each histogram set indicates the medians (red lines), interquartile ranges (rectangular boxes) and outliers (green diamonds) for the ML distributions. Note that for families of 2 members, the median will always be 0.5 by construction, regardless of the ML distribution profile. Data file S1 in the electronic supplementary material presents the corresponding histograms for 30 cancer types, and its legend contains a discussion note about statistics. (b) The distribution of mutations in melanoma is given for the HECW E3 ubiquitin ligase family (as an example of even ML distribution) and the type II PAK family (an example with a skewed ML where the PAK7 gene accumulates most of the family mutations). Illustrations were created with Domain Graph, v. 1.0.5 [26]. CDS is coding sequence, UTR refers to 3′ and 5′ UTRs, and mutations are indicated by vertical black lines. Data file S2 in the electronic supplementary material gives similar diagrams for the distributions of mutations in other 2R-ohnologue families in melanoma.
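The ML definition in the figure 2 legend can be illustrated with a minimal Python sketch. The function name `ml_scores` and the mutation counts are hypothetical and are not taken from the dataset; the sketch assumes only that a gene's ML is its count of different mutations divided by the total for its family.

```python
def ml_scores(mutation_counts):
    """Return the ML score of each gene in one 2R-ohnologue family.

    mutation_counts maps gene name -> number of different mutations.
    """
    total = sum(mutation_counts.values())
    if total == 0:
        raise ValueError("family has no mutations; ML is undefined")
    return {gene: count / total for gene, count in mutation_counts.items()}


# Hypothetical counts for a three-member family (illustrative only).
example_family = {"geneA": 40, "geneB": 6, "geneC": 4}
example_scores = ml_scores(example_family)
```

Because the scores within a family sum to 1, the two scores of a two-member family are x and 1 - x, which is why the median for such families is 0.5 by construction.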
Figure 3. VisANT map of 2R-ohnologue families that display skewed MLs in different cancer types; graph created using VisANT (visant.bu.edu [27]) from the data in the electronic supplementary material, table S3. Each cancer was assigned a node in blue and lines connect these cancers to the 2R-ohnologue families (in green, red or orange) that display a skewed ML in that cancer. For a line to join a cancer and a 2R-ohnologue family, at least one gene in the family had to carry at least 10 different mutations in that cancer. Also, we plotted only those families with ML skew above the thresholds of at least 0.9 for families containing two genes (that is, one gene carried at least 90% of the mutated positions for its family), at least 0.8 for families with three genes and at least 0.7 for families of four genes. The red node labelled ‘P53-F’ is the p53/p63/p73 family, and the orange node marked ‘FOG-F’ represents FOG1/FOG2.
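The inclusion rule in the figure 3 legend (at least 10 different mutations in one gene, plus a size-dependent ML threshold) can be sketched as follows. The names `SKEW_THRESHOLDS`, `MIN_MUTATIONS` and `is_ml_skewed` are hypothetical, and the sketch assumes the threshold applies to the most mutated gene's share of the family total.

```python
# Thresholds stated in the figure 3 legend, keyed by family size.
SKEW_THRESHOLDS = {2: 0.9, 3: 0.8, 4: 0.7}
# At least one gene must carry this many different mutations.
MIN_MUTATIONS = 10


def is_ml_skewed(mutation_counts):
    """Decide whether one family in one cancer would be drawn as skewed.

    mutation_counts maps gene name -> number of different mutations.
    """
    size = len(mutation_counts)
    if size not in SKEW_THRESHOLDS:
        raise ValueError("expected a family of 2, 3 or 4 genes")
    top = max(mutation_counts.values())
    if top < MIN_MUTATIONS:
        return False
    total = sum(mutation_counts.values())
    return top / total >= SKEW_THRESHOLDS[size]
```

With illustrative counts, `is_ml_skewed({"a": 95, "b": 5})` is true, while `is_ml_skewed({"a": 80, "b": 20})` is false because 0.8 is below the 0.9 threshold for two-member families.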
Rank orders of ML scores within ML-skewed families. Electronic supplementary material, table S3, and figure 3 contain data for all the 2R-ohnologue families that have extreme ML skews. This table shows only those families of three and four members with high ML skews in at least four cancers. For each family, the table shows which 2R-ohnologues carry the highest and second highest number of different mutations. The last column indicates how many cancers have that same rank order of MLs for each family. The ML scores for the 2R-ohnologue families of two members can be viewed in the electronic supplementary material, table S3. The family identifier is an arbitrary number assigned to each 2R-ohnologue family in [7].
| 2R-ohnologue family identifier number | family description | most mutated member of the 2R-ohnologue family | second most mutated family member | least mutated family member(s) | no. cancers where the proteins indicated to the left are ranked first and second most mutated in the family |
|---|---|---|---|---|---|
| 60 | P53 tumour suppressor family | P53 | P63 | P73 | 13 of 14 |
| 177 | cyclic nucleotide-gated ion channels | HCN1 | HCN4 | HCN2, HCN3 | 10 of 10 |
| 309 | receptor tyr phosphatases | PTPRD | PTPRS | PTPRF | nine of nine |
| 904 | low-density lipoprotein receptors | LRP1B | LRP2 | LRP1 | nine of nine |
| 724 | N-acetylglucosaminyltransferases | MGT4C | MGT4A | MGT4B | eight of eight |
| 1621 | bHLH transcription factors | NPAS3 | SIM1 | NPAS1, SIM2 | seven of eight |
| 66 | Ca2+-binding cadherin-like | CSTN2 | CSTN1 | CSTN3 | seven of seven |
| 170 | very long chain acyl-CoA synthetases | S27A6 | S27A2, S27A3 | | seven of seven for first; others equal |
| 219 | phosphatidylserine receptors | BAI3 | BAI1 | BAI2 | seven of seven |
| 289 | RNA-binding proteins | RALYL | RALY | HNRPC, HNRCL | seven of seven |
| 689 | Kv channel-interacting proteins | KCIP4 | KCIP1 | CSEN, KCIP2 | seven of seven |
| 785 | EGF receptor family | ERBB4 | EGFR | ERBB2, ERBB3 | seven of seven |
| 1117 | Discs large homologues | DLG2 | DLG1 | DLG3, DLG4 | seven of seven |
| 1547 | transmembrane proteins | TM14B | TM14C | TM14A | seven of seven for first; five of seven for second |
| 1706 | autism susceptibility | AUTS2 | FBSL | FBRS | six of seven |
| 1727 | engulfment and cell motility proteins | ELMO1 | ELMO2 | ELMO3 | seven of seven for first; six of seven for second |
| 61 | Kv channel subunits | KCAB1 | KCAB2 | KCAB3 | six of six |
| 89 | synaptophysin-like proteins | SYNPR | SYPL1, SYPL2, SYPH | | six of six for first, others equal |
| 157 | leucine-rich repeat proteins | LRRC7 | LAP2 | SCRIB, LRRC1 | six of six for first, five of six for second |
| 246 | choline transporter-like | CTL5 | CTL2 | CTL4 | six of six for first, five of six for second |
| 277 | Kv channel subunits | KCND2 | KCND3 | KCND1 | six of six |
| 482 | liprin family | LIPA2 | LIPA1 | LIPA3, LIPA4 | six of six |
| 725 | guanine exchange factors for ARF GTPases | PSD3 | PSD2 | PSD1, PSD4 | six of six for first, three of six for second |
| 948 | protein kinase D family | KPCD1 | KPCD3 | KPCD2 | six of six |
| 1327 | Dickkopf-related, Wnt antagonists | DKK2 | DKK4, DKK1 | | six of six for first, others equal |
| 2105 | C2-containing calcium sensors | RP3A | DOC2A | DOC2B | six of six |
| 2215 | type II cdc42-interacting protein kinases | PAK7 | PAK4, PAK6 | | six of six for first, others equal |
| 2275 | RNA-binding splicing regulators | RFOX1 | RFOX3 | RFOX2 | six of six for first, five of six for second |
| 28 | zinc transporters | ZNT8 | ZNT4 | ZNT2, ZNT3 | five of five for first, four of five for second |
| 47 | histone lysine N-methyltransferases | SMYD3 | SMYD1 | SMYD2 | five of five for first, three of five for second |
| 139 | PI 3-kinase regulatory subunits | P85A | P85B | P55G | five of five for first, four of five for second |
| 287 | muscarinic acetylcholine receptors | ACM2 | ACM3 | ACM1, ACM5 | four of five for first, four of five for second |
| 479 | a synaptotagmin family | SYT1 | SYT2 | SYT5 | five of five |
| 515 | hypoxia-inducible prolyl hydroxylases | EGLN3 | EGLN1 | EGLN2 | five of five |
| 525 | protein kinase B family | AKT3 | AKT2 | AKT1 | five of five for first, four of five for second |
| 649 | single-stranded DNA/RNA interacting | RBMS3 | RBMS1 | RBMS2 | five of five |
| 806 | type II histone deacetylases | HDAC9 | HDAC4 | HDAC5, HDAC7 | five of five |
| 872 | integrin alpha family | ITA8 | ITAV | ITA2B, ITA5 | five of five for first, four of five for second |
| 882 | accessory to TGFbeta assembly | LTBP1 | LTBP2 | LTBP3, LTBP4 | five of five |
| 941 | serotonin receptors | 5HT2C | 5HT2A | 5HT2B | five of five |
| 1085 | diacylglycerol kinases | DGKB | DGKG | DGKA | five of five |
| 1443 | zinc-finger DNA-binding proteins | ZMAT4 | ZMAT1 | ZN346 | five of five for first, three of five for second |
| 1996 | localization of receptors and ion channels | LIN7A | LIN7C | LIN7B | four of four for first, three of four for second |
| 106 | voltage-gated calcium channel subunits | CAC1C | CAC1D | CAC1S, CAC1F | four of four |
| 140 | Rab3 GTPase in exocytotic vesicle fusion | RAB3C | RAB3B | RAB3D, RAB3A | four of four for first, three of four for second |
| 193 | transcription factors | RUNX1 | RUNX2 | RUNX3 | four of four |
| 196 | regulator in Ras signalling pathway | CNKR2 | CNKR3 | CNKR1 | four of four for first, three of four for second |
| 254 | deubiquitylating enzymes | OTU7A | OTU7B | TNAP3 | four of four |
| 386 | dynamin vesicle trafficking proteins | DYN3 | DYN1, DYN2 | | four of four for first, others equal |
| 541 | RNA-splicing protein | CELF4 | CELF5 | CELF3, CELF6 | four of four for first, three of four for second |
| 632 | poly(rC)-binding proteins | PCBP3 | PCBP2 | PCBP1, PCBP4 | four of four |
| 915 | RNA-binding zinc-finger proteins | Z385D | Z385B | Z385A | four of four |
| 1103 | cell adhesion molecules | CD166 | MUC18 | BCAM | four of four for first, two of four for second |
| 1909 | exocytosis, regulated by diacylglycerol | UN13C | UN13B | UN13A | four of four for first, two of four for second |
| 2301 | vesicle regulators alcohol dehydrogenase family | VAT1L | ZADH2 | VAT1 | four of four |
| 2377 | histone demethylases | KDM6A | UTY | KDM6B | four of four for first, three of four for second |
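The "n of m" entries in the last column can be read as a count of cancers that share the same first and second ranked genes. The sketch below is one hypothetical way to derive such a count; the function names and the example data are invented for illustration and do not reproduce the authors' procedure. Ties are broken by input order, which is an assumption of the sketch.

```python
from collections import Counter


def top_two(mutation_counts):
    """Return the (first, second) most mutated genes of one family in one cancer."""
    ranked = sorted(mutation_counts, key=mutation_counts.get, reverse=True)
    return ranked[0], ranked[1]


def rank_order_consistency(family_by_cancer):
    """Find the most common (first, second) order and how many cancers share it."""
    orders = Counter(top_two(counts) for counts in family_by_cancer.values())
    (first, second), n_cancers = orders.most_common(1)[0]
    return first, second, n_cancers, len(family_by_cancer)


# Hypothetical mutation counts for one family in three cancers.
example = {
    "cancer1": {"geneA": 30, "geneB": 5, "geneC": 1},
    "cancer2": {"geneA": 22, "geneB": 4, "geneC": 2},
    "cancer3": {"geneA": 15, "geneB": 1, "geneC": 3},
}
result = rank_order_consistency(example)
```

Here `result` corresponds to a table entry of "two of three" for the order geneA, geneB. Rows that report the first and second ranks separately would need two counts rather than the joint count used in this sketch.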
Characteristics of cancers in different regions of the VisANT graph in figure 3. In this table, cancers are loosely clustered according to the indicated characteristics. The data for mutations in p53, p63 and p73 for all cancers (from electronic supplementary material, table S2) are summarized in the electronic supplementary material, figure S5. Note that for all cancers with at least 10 mutations in at least one member of the p53/p63/p73 family, p73 always carries the lowest number of mutations in this dataset (electronic supplementary material, figure S3).
| position on graph | cancers in cluster | connected to ML-skewed families | above-threshold ML skew within p53/63/73 family? |
|---|---|---|---|
| top and sides | melanoma, B-cell lymphoma, breast, CLL, liver, lung adenocarcinoma, pancreatic, stomach, uterus | highly connected. Each cancer has both unique and shared ML-skewed families | no: melanoma, liver and B-cell lymphoma have most mutations in p63, while the other cancers have p53 as the most mutated protein, but these trends are below the skew thresholds for these cancers to be linked to the p53/p63/p73 in |
| centre | medulloblastoma | highly connected, with 120 ML-skewed families shared with other cancers and 20 ML-skewed families unique to medulloblastoma | yes: p63 is the most mutated member of this family |
| bottom | ALL, AML, bladder, colorectal, esophageal, glioblastoma, glioma (low grade), head-and-neck, lung squamous, ovary, prostate, kidney chromophobe | relatively few ML-skewed families are linked to each cancer in this cluster | yes: p53 is the most mutated member of this family |
| unconnected | cervical, kidney papillary, kidney clear cell carcinoma, myeloma, thyroid | unconnected to any ML-skewed families | no: cancers have fewer than nine mutations in p53/p63/p73 family in this dataset |
| | neuroblastoma | only one ML-skewed family, namely ALK/LTK (ALK most mutated, including well-known mutations) | no: only one p53 mutation and two p63 mutations in dataset |
| | pilocytic astrocytoma | connected to only three ML-skewed families, two of which are also highly connected to other cancers (LRP1B most mutated, PTPRD most mutated), the third being the Raf family (B-Raf most mutated) | no: only two mutations in the p53/p63/p73 family in dataset; one in p63 and one in p73 |
Figure 4. Associations between specific cancers and specific mutation-load-skewed 2R-ohnologue families. (a) Data extracted from figure 3, showing the cancers for which there are ML skews in the p53/p63/p73 and FOG1/FOG2 protein families. (b) A graph showing those 2R-ohnologue families that display a skewed ML in the cumulative data from those melanoma and colorectal cancer samples that have either a B-RafV600E or N-RasQ61K/R mutation (data in the electronic supplementary material, table S5). ‘Common’ indicates 2R-ohnologue families that have a skewed ML in the data from both the B-RafV600E-mutated and the N-RasQ61K/R-mutated samples.
Figure 5. Relationships among ML of 2R-ohnologues in melanoma, cancer/control mRNA expression in melanoma and proteins identified in 14-3-3-affinity capture experiments using melanoma cell lysates. Each cross represents a gene in the E-GEOD-3189 transcription profiling dataset [34]. The log2 ratio of mRNA expression in malignant melanoma versus benign melanocytic lesions in the E-GEOD-3189 dataset is plotted on the y-axis against the ML score of the gene calculated from the Alexandrov et al. [15] data on the x-axis. The genes whose mRNA levels are most strongly up- or downregulated in melanoma are in red and blue, respectively. Also plotted (circles) are the proteins that were isolated by 14-3-3-affinity capture of cell lysates from both SKMEL13 and SBCL2 melanoma cells, and identified by mass spectrometric analyses. The data used for this figure are in the electronic supplementary material, table S6.
Overexpression of protein kinase 2R-ohnologues with low mutation scores decreased the sensitivity of B-RafV600E-mutant melanoma cells to PLX4720. The viability score assigned by Wood et al. [43] refers to the ability of the protein kinase, when overexpressed, to enhance the viability of B-RafV600E-mutant melanoma (A375) cells exposed to the B-RafV600E inhibitor, PLX4720. Seven 2R-ohnologue families of protein kinases had at least one member among the top hits of the Wood dataset [43] and at least one member with at least 10 mutations in melanoma in the Alexandrov dataset [15]. The cell viability scores [42] and ML scores (this study) are shown for each member of these seven families. The family Id is an arbitrary number assigned to identify each 2R-ohnologue family in [7]. n.a., data not available.
| 2R-ohnologue family Id | protein name | UniProt Id | viability score | ML score in melanoma (electronic supplementary material, table S2) | is ML skew in melanoma above threshold for inclusion? |
|---|---|---|---|---|---|
| 378 | NTRK2 | Q16620 | 1.23 | 0.0777 | no |
| 378 | NTRK1 | P04629 | 1.07 | 0.3932 | no |
| 378 | NTRK3 | Q16288 | n.a. | 0.2621 | no |
| 378 | MUSK | O15146 | 0.76 | 0.2670 | no |
| 1058 | MST1R | Q04912 | 1.13 | 0.2923 | no |
| 1058 | MET | P08581 | 1.03 | 0.7077 | no |
| 1085 | MAPK8 | P45983 | 1.18 | 0.1563 | no |
| 1085 | MAPK9 | P45984 | 0.93 | 0.3438 | no |
| 1085 | MAPK10 | P53779 | 0.79 | 0.5000 | no |
| 1548 | SRPK3 | Q9UPE1 | 1.16 | 0.2195 | no |
| 1548 | SRPK1 | Q96SB4 | 0.95 | 0.3415 | no |
| 1548 | SRPK2 | P78362 | 0.86 | 0.4390 | no |
| 1666 | PIM2 | Q9P1W9 | 1.15 | 0.3182 | no |
| 1666 | PIM1 | P11309 | 1.11 | 0.5000 | no |
| 1666 | PIM3 | Q86V86 | n.a. | 0.1818 | no |
| 1780 | LIMK1 | P53667 | 1.17 | 0.3103 | no |
| 1780 | LIMK2 | P53671 | 0.94 | 0.6897 | no |
| 2215 | PAK6 | Q9NQU5 | 1.21 | 0.0877 | yes |
| 2215 | PAK4 | O96013 | 0.97 | 0.0789 | yes |
| 2215 | PAK7 | Q9P286 | 0.86 | 0.8333 | yes |
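The relationship between viability score and ML in the table above can be checked with a short sketch. The rows are transcribed from the table. The comparison against an even share of 1/n (n being the number of family members listed) is an illustrative criterion chosen here, not one stated by the authors, and the names `ROWS` and `top_viability_members` are hypothetical.

```python
# (family Id, protein, viability score or None for n.a., ML score in melanoma)
ROWS = [
    (378, "NTRK2", 1.23, 0.0777), (378, "NTRK1", 1.07, 0.3932),
    (378, "NTRK3", None, 0.2621), (378, "MUSK", 0.76, 0.2670),
    (1058, "MST1R", 1.13, 0.2923), (1058, "MET", 1.03, 0.7077),
    (1085, "MAPK8", 1.18, 0.1563), (1085, "MAPK9", 0.93, 0.3438),
    (1085, "MAPK10", 0.79, 0.5000),
    (1548, "SRPK3", 1.16, 0.2195), (1548, "SRPK1", 0.95, 0.3415),
    (1548, "SRPK2", 0.86, 0.4390),
    (1666, "PIM2", 1.15, 0.3182), (1666, "PIM1", 1.11, 0.5000),
    (1666, "PIM3", None, 0.1818),
    (1780, "LIMK1", 1.17, 0.3103), (1780, "LIMK2", 0.94, 0.6897),
    (2215, "PAK6", 1.21, 0.0877), (2215, "PAK4", 0.97, 0.0789),
    (2215, "PAK7", 0.86, 0.8333),
]


def top_viability_members(rows):
    """For each family, report the member with the highest viability score,
    its ML, and whether that ML is below an even share (1 / family size)."""
    families = {}
    for family_id, protein, viability, ml in rows:
        families.setdefault(family_id, []).append((protein, viability, ml))
    summary = {}
    for family_id, members in families.items():
        # Members without a viability score (n.a.) cannot be ranked.
        scored = [m for m in members if m[1] is not None]
        protein, _, ml = max(scored, key=lambda m: m[1])
        summary[family_id] = (protein, ml, ml < 1 / len(members))
    return summary


summary = top_viability_members(ROWS)
```

For the rows as transcribed, the member with the highest viability score in each of the seven families has an ML below the even share, which is consistent with the table title. It is not always the member with the lowest ML (for example, PAK4 has a lower ML than PAK6).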
Figure 6. Simplified model that depicts cancer as a disorder of signal multiplexing in the cellular 2R-WGD networks of vertebrate animals. (a) The ancestor of all the vertebrates was an invertebrate chordate whose cells are depicted as being under the control of simple linear regulatory pathways. The image is of amphioxus (Branchiostoma), regarded as the best modern-day proxy for the ancestor. (b) 2R-WGD at the evolutionary origins of the vertebrate animals boosted communication networks inside our cells. Variations in these networks may underpin the variety of vertebrate cell types, species and behaviours. (c) We hypothesize that certain cancers arise when different heterogeneous combinations of mutations (crosses) disconnect certain parts of the 2R-WGD regulatory networks and force too much communication flow via a restricted number of oncogenic pathways. These ‘open’ oncogenic pathways are activated by specific driver mutations (stars) and also require effector proteins that must remain mutation-free. If these effectors acquire too many deleterious mutations, the cancer cell will be lost. Though the model only depicts 2R families with high ML skews, it could be extended to include other patterns. For example, 2R families whose members carry an even ML may be in parts of the network where any member can perform the family function for the cancer, or represent functions whose total elimination gives a selective advantage to the cancer. In its simple form, the model assumes that when genes are hit by a number of different mutations (crosses) these will include loss-of-function mutations. However, it is appreciated that this may not always be so, in which case the rules of the model would change. The model highlights that the contribution of both mutated and non-mutated 2R-ohnologues to the overall function of each family in the cancer should be considered.