Julia Schulz, Nancy Mah, Martin Neuenschwander, Tabea Kischka, Richard Ratei, Peter M Schlag, Esmeralda Castaños-Vélez, Iduna Fichtner, Per-Ulf Tunn, Carsten Denkert, Oliver Klaas, Wolfgang E Berdel, Jens P von Kries, Wojciech Makalowski, Miguel A Andrade-Navarro, Achim Leutz, Klaus Wethmar.
Abstract
Ribosome profiling revealed widespread translational activity at upstream open reading frames (uORFs) and validated uORF-mediated translational control as a commonly repressive mechanism of gene expression. Translational activation of proto-oncogenes through loss-of-uORF mutations has been demonstrated, yet a systematic search for cancer-associated genetic alterations in uORFs is lacking. Here, we applied a PCR-based, multiplex identifier-tagged deep sequencing approach to screen 404 uORF translation initiation sites of 83 human tyrosine kinases and 49 other proto-oncogenes in 308 human malignancies. We identified loss-of-function uORF mutations in EPHB1 in two samples derived from breast and colon cancer, and in MAP2K6 in a sample of colon adenocarcinoma. Both mutations were associated with enhanced translation, suggesting that loss-of-uORF-mediated translational induction of the downstream main protein coding sequence may have contributed to carcinogenesis. Computational analysis of whole exome sequencing datasets of 464 colon adenocarcinomas subsequently revealed another 53 non-recurrent somatic mutations functionally deleting 22 uORF initiation codons and 31 uORF termination codons. These data provide evidence for somatic mutations affecting uORF initiation and termination codons in human cancer. The insufficient coverage of uORF regions in current whole exome sequencing datasets calls for future genome-wide analyses to ultimately define the contribution of uORF-mediated translational deregulation to oncogenesis.
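To make the notion of a mutation "functionally deleting" a uORF initiation or termination codon concrete, the following is a minimal Python sketch, not the authors' pipeline. It assumes a DNA-coded 5' leader sequence, considers only ATG-initiated uORFs with an in-frame stop inside the leader, and uses hypothetical function names (`find_uorfs`, `classify_substitution`) chosen for illustration.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_uorfs(leader):
    """Return (start, stop) index pairs for every ATG in the leader.

    `start` is the 0-based index of the A of the ATG; `stop` is the index of
    the first base of the first in-frame stop codon, or None if the leader
    contains no in-frame stop downstream of that ATG.
    """
    leader = leader.upper()
    uorfs = []
    for start in range(len(leader) - 2):
        if leader[start:start + 3] != "ATG":
            continue
        stop = None
        # Walk codon by codon, in frame with the ATG.
        for pos in range(start + 3, len(leader) - 2, 3):
            if leader[pos:pos + 3] in STOP_CODONS:
                stop = pos
                break
        uorfs.append((start, stop))
    return uorfs


def classify_substitution(leader, pos, alt):
    """Classify a single-base substitution at 0-based `pos` (illustrative only)."""
    leader = leader.upper()
    mutant = leader[:pos] + alt.upper() + leader[pos + 1:]
    effects = []
    for start, stop in find_uorfs(leader):
        # Start loss: the substitution hits the ATG and destroys it.
        if start <= pos < start + 3 and mutant[start:start + 3] != "ATG":
            effects.append("start_loss")
        # Stop loss: the substitution hits the stop codon and no stop remains there.
        if (stop is not None and stop <= pos < stop + 3
                and mutant[stop:stop + 3] not in STOP_CODONS):
            effects.append("stop_loss")
    return effects or ["none"]


if __name__ == "__main__":
    toy_leader = "GGCATGCCCTAAGG"  # made-up sequence with one uORF: ATG CCC TAA
    print(classify_substitution(toy_leader, 3, "G"))   # ATG -> GTG
    print(classify_substitution(toy_leader, 9, "C"))   # TAA -> CAA
    print(classify_substitution(toy_leader, 10, "G"))  # TAA -> TGA, still a stop
```

The sketch ignores non-AUG initiation, insertions and deletions, and uORFs that overlap the main coding sequence, all of which a real analysis would need to handle.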
Year: 2018 PMID: 29402903 PMCID: PMC5799362 DOI: 10.1038/s41598-018-19201-8
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Target sets and workflow of the PCR-based, multiplex identifier-tagged deep sequencing approach. (A) Composition of the cancer sample set with numbers indicating sample sizes of investigated malignant entities: ALL – acute lymphoblastic leukemia, AML – acute myeloid leukemia, NHL – non-Hodgkin lymphoma, OS – osteosarcoma, CA – colon adenocarcinoma, CX – colon xenograft, LA – lung adenocarcinoma, LX – lung xenograft, MC – mammary carcinoma. (B) Composition of the target gene set consisting of indicated numbers of uORF-bearing tyrosine kinases[5], previously validated proto-oncogenes[18] and genes post-transcriptionally induced in cancer cell lines[19] (see also Supplementary Table 1). (C) Flow chart displaying amplification and normalization steps allowing simultaneous deep sequencing of 404 uORF initiation sites of 132 target genes in 308 individual cancer samples. Briefly, genomic regions of uAUG targets were amplified individually from every cancer DNA (see also Supplementary Table 7). uAUG-specific amplicons of each cancer sample were pooled and labeled with cancer-specific MID-tags in a second round of PCR (see also Supplementary Table 8). After normalization and pooling of all MID-tagged amplicons, a deep sequencing library was generated and analyzed using the Illumina® HiSeq2000 sequencing system.
Figure 2. Recovery of genetic information of targeted uAUG regions and identification of uORF-associated alterations in human cancers. (A) Heatmap displaying the number of sequencing reads for individual uAUG target sites (rows) and individual cancer samples of indicated entities (columns). The threshold was set to ≥10 sequencing reads (seq. reads) (see also Supplementary Table 2). (B) Summary of sequencing data processing. The top pie chart shows the proportion of all individual target sites (404 uAUGs of 308 cancer samples) that were covered by ≥10 sequencing reads. The bottom pie charts represent the numbers of potential genetic alterations (mutations, single nucleotide polymorphisms (SNPs) and long deletion/repeat regions) in uAUG and uKozak target sites that showed ≥10% deviation from the reference base (ref. base) (see also Supplementary Tables 3 and 4). Selected candidate mutations were subsequently re-sequenced by Sanger sequencing, confirming the indicated number of uORF-associated alterations.
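The two thresholds named in the caption (≥10 sequencing reads per target site, ≥10% deviation from the reference base) can be illustrated with a short Python sketch. This is an assumption-laden illustration rather than the published processing code: the function name `call_site` and the input format (a plain sequence of base calls observed at one position in one sample) are hypothetical.

```python
from collections import Counter

MIN_READS = 10        # coverage threshold stated in the caption
MIN_DEVIATION = 0.10  # minimum fraction of non-reference base calls


def call_site(ref_base, base_calls):
    """Flag one position in one sample as a candidate alteration.

    `base_calls` is an iterable of single-letter base calls from the reads
    covering the position (hypothetical input format).
    """
    counts = Counter(b.upper() for b in base_calls)
    depth = sum(counts.values())
    if depth < MIN_READS:
        # Too few reads: the site is treated as not covered.
        return {"covered": False, "depth": depth,
                "deviation": None, "candidate": False}
    non_ref = depth - counts[ref_base.upper()]
    deviation = non_ref / depth
    return {"covered": True, "depth": depth,
            "deviation": deviation,
            "candidate": deviation >= MIN_DEVIATION}


if __name__ == "__main__":
    print(call_site("A", "AAAAAAAAAG"))  # 10 reads, 1 non-reference call
    print(call_site("A", "AAAAA"))       # below the coverage threshold
```

Candidates flagged this way would still require confirmation, which in the study was done by Sanger re-sequencing of selected sites.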
Summary of verified uORF-associated SNPs.

| Gene | Position | Ref. base | SNP | freq | Sequence context | Samples | Entities |
|---|---|---|---|---|---|---|---|
| CAMKK2 | chr12:121735975 | A | — | 1.4 | | 2 | CA |
| EPHA3 | chr3:89156859 | A | T | 0.2 | | 1 | ALL |
| KDR | chr4:55991731 | A | G | 55.4 | | 4 | MC, ALL, AML, OS |
| CAMKK2 | chr12:121712374 | G | A | n.a. | tgcC | 1 | AML |
| MDM2 | chr12:69202164 | A | G | 33.7 | agtGga | 237 | all entities |
| MUSK | chr9:113431103 | T | C | 1.2 | caaAg | 2 | AML, NHL |
| NRP2 | chr2:206547747 | T | C | 0.6 | acaTa | 5 | MC, CA, AML, NHL |
| PTK2B | chr8:27179964 | C | T | 0.05 | | 1 | AML |
| STAT6 | chr12:57505073-8 | GTGTGT | — | div. | | 306/307/265/273/138/146 | all entities |
| TTN | chr2:179672033 | G | A | 2.6 | tc | 2 | CA, LX |
| TYK2 | chr19:10490402 | T | C | 16.5 | c | 120 | all entities |
| YEATS4 | chr12:69753557 | G | A | 0.2 | gccT | 1 | OS |
The table shows confirmed annotated SNPs (in bold) in uAUGs (top) and uKozak sequences (bottom, uAUG is underlined and core uKozak bases are in capital letters) with information on affected genes and cancer samples.
Note that the single nucleotide deletion in the uAUG of CAMKK2 and the 6-bp repeat deletion in the uKozak sequence of STAT6 did not alter the uORF start site or uKozak sequence, respectively, as resulting genotypes correspond to the reference base(s). In the case of STAT6, different numbers of affected cancer samples were determined for each base in the 6-bp repeat region.
freq – frequency; n.a. – not annotated; div. – diverse annotations; all entities – ALL, AML, NHL, OS, CA, CX, LA, LX, MC.
Summary of verified uORF-associated novel mutations.

| Gene | Position | Ref. base | Mutant sequence | Samples | Entities |
|---|---|---|---|---|---|
| BLK | chr8:11351560 | G | AT | 1 | CA |
| EPHB1 | chr3:134514263 | A | | 2 | MC, CX |
| JAK2 | chr9:5021969 | G | AT | 1 | NHL |
| MAP2K6 | chr17:67410881 | T | A | 1 | CA |
| CHD1L | chr1:146731516 | G | ttgTg | 1 | CX |
The table shows newly identified mutations (in bold) in uAUGs (top) and uKozak sequences (bottom, uAUG is underlined and core uKozak bases are in capital letters) with information on the affected genes and cancer samples.
Figure 3. Translational impact of identified loss-of-uORF mutations. (A) Schematic representation showing the position and length of uORFs with identified uAUG-associated mutations and polymorphisms (p) in the indicated transcripts. Conservation of affected uORF start sites among nine vertebrate species (human, rhesus, mouse, rat, cow, dog, elephant, chicken, and zebrafish) is indicated and the quality of uKozak contexts is depicted as intermediate (+, one core uKozak base match) or weak (−, no core uKozak base match). Additional columns display the detected sequence of the mutant codon, the affected cancer sample and the expression of indicated transcripts in affected cancer samples determined by semi-quantitative real-time PCR (see also Supplementary Fig. 1). Note that all transcripts contained additional uORFs that were devoid of genetic alterations and are not illustrated here (see also Supplementary Table 6 and Supplementary Fig. 2). n.a. – not analyzed due to the lack of cancer material. (B,C) Luciferase assays and real-time PCR analysis in HeLa cells showing relative luciferase activities and mRNA levels in the presence of indicated TLSs containing wild-type (red or orange) or mutant (gray) uORF initiation sites as shown in (A). (D) Luciferase assays demonstrating relative luciferase activities in the presence of the wild-type (red) or mutant (gray) uORF initiation codon in the TLS of MAP2K6 in two indicated colon cancer-derived cell lines. Error bars represent means ± standard error of the mean (s.e.m.) of Firefly luciferase signals relative to Renilla luciferase internal control signals from duplicate measurements of at least three (B,C) and two (D) independent experiments. Statistical significance was determined by the two-tailed, non-parametric Mann-Whitney test and is indicated by *P < 0.05, **P < 0.01 and ***P < 0.001. Numbers identify the specific cancer sample affected by the uAUG alteration.
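The uKozak quality grading used in panel (A) can be sketched in Python as follows. The caption does not state which positions count as core uKozak bases; this sketch assumes the conventional Kozak core, namely a purine (A or G) at position −3 and a G at position +4 relative to the A of the start codon (+1). The function name `kozak_strength` and the "strong" label for two matches are likewise assumptions for illustration.

```python
def kozak_strength(seq, atg_index):
    """Grade the Kozak context of the ATG starting at 0-based `atg_index`.

    Assumed core positions: purine at -3 and G at +4 (A of ATG = +1).
    Returns "weak" (no match), "intermediate" (one match) or
    "strong" (both match).
    """
    seq = seq.upper()
    if seq[atg_index:atg_index + 3] != "ATG":
        raise ValueError("no ATG at the given index")
    # Position -3 lies three bases upstream of the A; +4 directly follows the G.
    minus3 = seq[atg_index - 3] if atg_index >= 3 else None
    plus4 = seq[atg_index + 3] if atg_index + 3 < len(seq) else None
    matches = int(minus3 in ("A", "G")) + int(plus4 == "G")
    return ("weak", "intermediate", "strong")[matches]


if __name__ == "__main__":
    # Made-up example sequences, each with the ATG at index 6.
    for s in ("GCCACCATGG", "TTTATTATGT", "TTTTTTATGT"):
        print(s, kozak_strength(s, 6))
```

Positions that fall outside the supplied sequence are counted as non-matching, so a truncated context can only lower the grade, never raise it.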