Literature DB >> 30172240

Mutations, Differential Gene Expression, and Chimeric Transcripts in Esophageal Squamous Cell Carcinoma Show High Heterogeneity.

Paulo Thiago de Souza-Santos1, Sheila Coelho Soares Lima2, Pedro Nicolau-Neto3, Mariana Boroni4, Nathalia Meireles Da Costa5, Lilian Brewer6, Albert Nobre Menezes7, Carolina Furtado8, Miguel Angelo Martins Moreira9, Hector N Seuanez10, Tatiana de Almeida Simão11, Luis Felipe Ribeiro Pinto12.   

Abstract

Esophageal squamous cell carcinoma (ESCC) is a frequent and lethal neoplasia. As recent advances in targeted therapy have not improved ESCC prognosis, characterization of molecular alterations associated to this tumor is of foremost relevance. In this study, we analyze, for the first time, the complete genomic profile of ESCC by RNA-seq. TP53 was the most frequently mutated gene in the investigation and validation sets (78.6% and 67.4%, respectively). Differential expression analysis between tumor and nontumor adjacent mucosa showed 6698 differentially expressed genes, most of which were overexpressed (74%). Enrichment analysis identified overrepresentation of Wnt pathway, with overexpressed activators and underexpressed inactivators, suggesting activation of canonical and noncanonical Wnt signaling pathways. Higher WNT7B expression was associated with poor prognosis. Twenty-one gene fusions were identified in 50% of tumors, none of which involving the same genes in different patients; 71% of fusions involved syntenic genes. Comparisons with TCGA data showed co-amplification of seven gene pairs involved in fusions in the present study (~33%), suggesting that these rearrangements might have been driven by chromoanagenesis. In conclusion, genomic alterations in ESCC are highly heterogeneous, impacting negatively in target therapy development.
Copyright © 2018 The Authors. Published by Elsevier Inc. All rights reserved.

Entities:  

Year:  2018        PMID: 30172240      PMCID: PMC6121831          DOI: 10.1016/j.tranon.2018.08.002

Source DB:  PubMed          Journal:  Transl Oncol        ISSN: 1936-5233            Impact factor:   4.243


Introduction

Esophageal carcinoma (EC) is a highly frequent and lethal tumor, representing the eighth most common and the sixth most fatal cancer worldwide [1]. Esophageal squamous cell carcinoma (ESCC) is the main histopathology subtype, accounting to approximately 80% of all EC cases, mainly in developing countries, such as Brazil [2]. ESCC patients show a poor prognosis, with a 5-year survival rate below 20% [3]. Regardless of the noteworthy advances in cancer diagnosis and therapy, current ESCC treatment approaches are frequently unsuccessful, and the outcome of ESCC patients remains unfavorable [4], [5], [6]. This makes the understanding of the molecular mechanisms involved in esophageal carcinogenesis a most relevant precondition for developing more efficient therapies. A small number of genome-wide ESCC studies have been reported, mainly of patients from eastern countries. Recently, The Cancer Genome Atlas (TCGA) consortium published a genome-wide EC study that included ESCC and esophageal adenocarcinoma [7]. This report focused on copy number and mutation analyses revealing differences between patients from several geographic regions, albeit with limited data on gene expression profiles, probably due to lack of paired samples from tumor and normal, adjacent esophageal mucosa. In this study, we report the ESCC transcriptome in a sample of patients from a Western population and analyze, for the first time, differentially expressed genes, mutations, and gene fusions, pointing to dysregulated signaling pathways potentially associated to ESCC carcinogenesis.

Materials and Methods

Patients and Sample Collection

A set of 55 paired biopsies of ESCC and nontumor adjacent mucosa was collected from 2006 to 2015 at the Endoscopy Service of the Instituto Nacional de Câncer (Rio de Janeiro, Brazil). Histopathology profiles were provided by the Pathology Department. Written informed consents were signed for using biological samples as well as clinical and pathology data from patient records. Samples, obtained before treatment, were separated in an investigation (14 paired samples) and a validation set (41 paired samples). This study was approved by the institution Ethics Committee and was conducted according to the Declaration of Helsinki.

RNA Isolation, Library Preparation, and Sequencing

Total RNA was isolated with the RNeasy Kit (Qiagen) following the manufacturer’s instructions. The integrity of RNA was assessed with RNA 6000 Nano chip with a Bioanalyzer platform (Agilent), and only samples with RNA integrity number (RIN) ≥8 were included. RNA samples were used for constructing cDNA libraries with TruSeq RNA (Illumina) following the manufacturer’s protocol. Libraries were sequenced in an Illumina HiSeq 2500 platform to produce 2×100 bp reads.

Data Processing and Read Mapping

Statistics and quality analyses of reads were generated with FastQC 0.10.1 (http://www.bioinformatics.babraham.ac.uk/projects/fastqc/) by Babraham Bioinformatics. RNA-Seq reads were trimmed to remove bad quality bases and reads with Prinseq [8]. Reads were subsequently aligned to a genomic reference sequence of H. sapiens (GRch37 version) downloaded from Ensembl database using TopHat2 mapper 2.0.9 [9] with default parameters. Mapped reads were filtered by quality and unique mapping features with Samtools 0.1.19 [10], with -bhq 20 -F 0×100 parameters.

Analysis of Differential Gene Expression

Raw counts for each gene were estimated for all datasets using the HTSeq Python package 0.5.3p3 [11] and the “–stranded=yes” option, as well as the “–mode=intersection-strict” option for exon-intersection counting. As cutoff, only genes with ≥10 read counts were considered for further analyses. Differential expression analysis was performed with DESeq2 package 1.2.5 [12]. Genes were considered to be differentially expressed with adjusted P value < .001 and |log fold-change| ≥ 1. Principal component analysis (PCA) was estimated, and heatmaps were constructed with regularized-logarithmic transformation (rlog) provided by the DESeq2 package.

Functional Annotation

Enrichment analysis was carried out with R package KEGG.db [13]. Following application of Fisher exact test, only pathways with P < 0.1 adjusted by Benjamin-Hochberg’s procedure were considered to be enriched.

qPCR Validation

Four genes related to the WNT-signaling pathway (Supplementary Table 1) were selected for validation by quantitative PCR (qPCR) as described previously [14].

Variant Calling

RNA-seq reads were initially mapped to the reference genome and to all known splice junctions using STAR [15]. Uniquely mapped reads were subsequently used for calling the initial set of candidate variants with Genome Analyses Toolkit (GATK) following the best practices recommended by the Broad Institute. Subsequently, these candidates were subjected to several filtering procedures, including removal of variations present in the nontumor adjacent mucosa; InDel alterations and A:T>G:C conversions were excluded in view of their high probability of representing RNA editing products [16]. We used ANNOVAR [17] for annotating variants based on gene models from GENCODE, RefSeq, Ensembl, and the UCSC Genome Browser. All RNA-seq variants present in 1000 genomes were defined as SNPs and excluded from the analyses.

Validation of TP53 Mutations

Genomic DNA was isolated using the DNeasy Blood and Tissue kit (Qiagen) as recommended by the manufacturer. TP53 was amplified using three primer pairs (Supplementary Table 1), 50 ng of DNA and Taq Platinum (Invitrogen), and amplified products purified with Purelink genomic DNA purification kit (Invitrogen). Subsequently, DNA libraries were constructed with 1 ng of PCR product and Nextera XT DNA sample preparation (Illumina) following the manufacturer’s instructions and sequenced in a HiSeq 2500 platform. Reads were submitted to the above described quality control for cDNA reads. Good quality reads were mapped to the human genome with BWA aligner [18].

Fusion Analysis

The FusionMap package [19] was used with the following parameters: Sample.SeedCount > 3, SplicePatternClass = “CanonicalPattern[Major]” or “CanonicalPattern[Minor]”, Filter = empity, FrameShiftClass = “InFrame” and OnExonBoundary = “Both”.

Comparisons with TCGA data

TCGA data were used to compare our findings with other populations and datasets. The cBioportal for cancer genomics [20], [21] was used for selecting and analyzing ESCC from the TCGA provisional dataset.

Further Statistical Analyses

Differential expression of selected genes between ESCC and paired nontumor mucosa was evaluated with paired t test or Wilcoxon test. Associations between gene expression and the TP53 mutation status with clinical and pathology data were analyzed with t test, Mann-Whitney test, and chi-square. Survival analyses were carried out using “survival” package for R, and hazard ratios (HRs) were adjusted by Cox regression. All analyses were performed in the R environment.

Results

Clinical and Pathology Data

The 55 patients included in this study showed a median age of 59 years (39-79) (Table 1). Most patients were male (74.5%) and showed a median overall survival of 8 months (1.2-42.7). Tumors were moderately differentiated (70.9%), most frequently located in the middle third of the esophagus (78.2%), and diagnosed at late stages (80%).
Table 1

Sample Data and Clinical and Pathology Profiles of 55 ESCC Patients Included in This Study

ESCC Patients
Gender
 Male41 (74.5%)
 Female14 (25.4%)
Age (median and range, years)59 (39-79)
Follow-up (median and range, months)8 (1.2-41.7)
Tumor central localization
 Upper third5 (9.1%)
 Middle third43 (78.2%)
 Lower Third7 (12.7%)
Differentiation
 Well1 (1.8%)
 Moderately39 (70.9%)
 Poorly15 (27.3%)
Tumor stage
 I2 (3.6%)
 II5 (9%)
 III9 (16.4%)
 IV18 (32.7%)
 NA21 (38.2%)

NA, not available.

Sample Data and Clinical and Pathology Profiles of 55 ESCC Patients Included in This Study NA, not available.

High-Throughput Evaluation of Gene Expression in ESCC

Global gene expression of tumor and nontumor adjacent mucosa was analyzed in the investigation set (IS). RNA-seq provided an average of 37 million sequences for nontumor adjacent samples, 33 million (89%) of which of good quality. Comparatively, an average of 40 million sequences was generated from tumor samples, with 38.5 million (96%) of good quality. Sequences of good quality, aligned to the reference human genome, were correctly mapped, accounting for 80% and 87% of sequences from nontumor and tumor samples, respectively. Reads were mapped to the 63,677 genes of the Ensembl database, and 44,402 (70%) of them were considered for further analyses following mapping of at least 10 reads. Principal component analysis revealed a homogenous distribution among nontumor adjacent esophageal mucosa but a spread distribution among ESCC samples (Supplementary Figure 1). Comparisons between total paired samples revealed 6698 differentially expressed genes (DEG), 4966 of which (74%) were overexpressed and 1732 (26%) underexpressed in tumors. The differential expression of the 10 most commonly amplified or deleted genes in the TCGA study was analyzed. Four amplified genes (FGF3, FGF19, TP63, and TFRC) were found to be overexpressed in ESCC with respect to the nontumor adjacent mucosa, while CDKN2B, found to be deleted in TCGA, was underexpressed in our dataset (Supplementary Table 2). Based on our DEG dataset, pathway enrichment analysis using the KEGG database retrieved a list of 71 overrepresented pathways (adjusted P value < .1), among which the “Wnt signaling pathway” showed a total of 54 DEG of the 150 genes included in this pathway. Their expression patterns in tumors and nontumor adjacent mucosa are shown in Figure 1A. Most genes (87%) were found to be overexpressed in tumors, with a median log fold-change of 1.75 (1.08-7.55). Most underexpressed genes in tumors corresponded to pathway inhibitors, like CTNNBIP1 and DKK1, indicating reinforcement of Wnt pathway activation. Additionally, we found that, in tumors, activation was not restricted to the canonical pathway represented by the WNT7B ligand and its downstream CCND2 target leading to cell cycle progression, but also involved the noncanonical pathway activated by WNT16 leading to MAPK10 activation and apoptosis (Figure 1B). Validation of these findings confirmed overexpression of WNT7B (P < .001), CCND2 (P = .041), WNT16 (P = .037), and MAPK10 (P < .001) in tumors with respect to the nontumor adjacent mucosa (Figure 1C). The Wnt signaling pathway was also found to be deregulated in the TCGA dataset, although this finding was not analyzed in the TCGA report [7]. Reanalysis of TCGA data revealed that, among the Wnt-related DEG herein analyzed, NKD2 and LRP5 were mutated or amplified (23% and 21%, respectively), while SFRP5, WNT7B, CDC25C, RHOC, PAX2, and SOST did not show alterations (Supplementary Figure 2).
Figure 1

Dysregulation of the Wnt signaling pathway in ESCC. (A) Heatmap showing all differentially expressed genes (DEG) of the Wnt signaling pathway detected by RNA-seq analysis. Black: tumor samples; gray: nontumor adjacent mucosa. Each column represents a single sample; each line represents a single DEG. Red and green indicate high and low gene expression, respectively. (B) Schematic representation of the Wnt-signaling pathway; with overexpressed genes in the canonical (WNT7B and CCND2) and in the noncanonical pathway (WNT16 and JNK) selected for validation by RT-qPCR. (C) Validation analysis of the Wnt signaling pathway selected targets: WNT7B, WNT16, CCND2, and MAPK10. (D) Kaplan-Meier plot of overall survival of patients of the validation set showing the prognostic impact of WNT7B expression. Patients with low WNT7B expression (<0.077 GAPDH units) showed a more favorable survival than those with a high expression (≥0.077 GAPDH units); *P < .05.

The expression of genes selected for validation was subsequently analyzed in association to age and clinical and pathology variables (Table 2). Low MAPK10 expression was associated with local/distant metastases (P = .046), while associations with age, tumor stage, or lymph node metastases were not observed for any other gene. Interestingly, patients whose tumors showed higher WNT7B expression showed a decreased overall survival (median = 7.73 months) when compared with patients with lower WNT7B expression in tumors (median = 17.17 months) (Figure 1D). Multivariate analysis further revealed that the impact of WNT7B expression on the prognosis of ESCC patients was independent of age and tumor stage (HR = 5.5 (1.7-18.0), P = .005).
Table 2

Association Between Characteristics of the ESCC Patients Included in This Study and Gene Expression Levels of the Wnt Signaling Pathway Components, Assessed by RT-qPCR in the Validation Set (n=28)

Clinical and Pathology DataWNT7B ExpressionP ValueWNT16 ExpressionP ValueCCND2 ExpressionP ValueMAPK10 ExpressionP Value
Age
 <604.9×10−21.04.2×10−6.7227.2×10−3.6027.5×10−4.763
(5.2×10−3 to 3.2×10−1)(5.4×10−8 to 1.3×10−4)(5.7×10−4 to 1.0×10−1)(3.6×10−5 to 1.5×10−3)
 ≥603.7×1027.4×10−61.4×10−21.8×10−4
(2.2×10−2 to 4.0×10−1)(3.1×10−8 to 1.0×10−4)(1.0×10−3 to 5.7×10−2)(1.4×10−5 to 4.1×10−3)
Tumor stage
 I/II8.0×10−2.729.9×10−5.0992.0×10−2.1839.7×10−4.389
(5.2×10−3 to 3.2×10−1)(2.4×10−6 to 1.3×10−4)(2.4×10−3 to 1.0×10−1)(6.1×10−5 to 1.5×10−3)
 III/IV3.7×1028.4×10−68.5×1031.7×10−4
(1.3×10−2 to 1.4×10−2)(3.1×10−8 to 9.2×10−5)(5.7×10−4 to 5.7×10−2)(1.4×10−5 to 4.1×10−3)
Lymph node metastases
 No5.9×10−21.02.5×10−5.8593.8×10−2.1491.0×10−3.818
(5.2×10−3 to 4.0×10−1)(2.8×10−7 to 1.3×10−4)(1.5×10−2 to 1.0×10−1)(6.1×10−5 to 4.1×10−3)
 Yes6.2×10−23.4×10−27.1×10−31.0×10−3
(2.3×10−2 to 3.2×10−1)(1.1×10−7 to 9.9×10−5)(5.7×10−4 to 5.7×10−2)(1.4×10−4 to 1.5×10−3)
Local/distant metastases
 No3.7×10−2.565.7×10−6.3701.8×10−2.1059.7×10−4.046
(5.2×10−3 to 4.0×10−1)(1.1×10−7 to 1.3×10−4)(5.7×10−4 to 5.7×10−2)(6.1×10−5 to 4.1×10−3)
 Yes7.3×10−26.3×10−63.2×10−31.4×10−4
(2.2×10−2 to 1.4×10−1)(3.1×10−8 to 9.2×10−5)(1.6×10−3 to 3.4×10−2)(1.4×10−5 to 1.2×10−3)
Cox regressionHR = 5.5 (1.7-18.0).0053HR = 1.8 (0.5-6.2).33HR = 0.4 (0.1-1.4).57HR = 0.7 (0.2-2.3).16
Dysregulation of the Wnt signaling pathway in ESCC. (A) Heatmap showing all differentially expressed genes (DEG) of the Wnt signaling pathway detected by RNA-seq analysis. Black: tumor samples; gray: nontumor adjacent mucosa. Each column represents a single sample; each line represents a single DEG. Red and green indicate high and low gene expression, respectively. (B) Schematic representation of the Wnt-signaling pathway; with overexpressed genes in the canonical (WNT7B and CCND2) and in the noncanonical pathway (WNT16 and JNK) selected for validation by RT-qPCR. (C) Validation analysis of the Wnt signaling pathway selected targets: WNT7B, WNT16, CCND2, and MAPK10. (D) Kaplan-Meier plot of overall survival of patients of the validation set showing the prognostic impact of WNT7B expression. Patients with low WNT7B expression (<0.077 GAPDH units) showed a more favorable survival than those with a high expression (≥0.077 GAPDH units); *P < .05. Association Between Characteristics of the ESCC Patients Included in This Study and Gene Expression Levels of the Wnt Signaling Pathway Components, Assessed by RT-qPCR in the Validation Set (n=28)

Mutational ESCC Landscape

RNA-seq data from IS revealed a median of 65 point mutations per tumor (33-199) following removal of AT>GC conversions (Figure 2A). GC>AT was the most common conversion (57.2%), followed by GC>CG (17.7%), GC>TA (12.9%), AT>TA (7.5%), and AT>CG (4.7%) (Figure 2B). Among GC>AT, 58.6% occurred in a CpG context. TP53 was the most commonly mutated gene, in 78.6% of tumors (Figure 2C), while mutations were not detected in the nontumor adjacent mucosa. Other frequently mutated genes were LOC389831 (42.9%), PI4KA (21.4%), MST1L (21.4%), HERC2 (21.4%), and NBPF9 (21.4%). Comparisons between our findings and the TCGA dataset revealed that, except for TP53, mutations occurred in different genes. In our tumor samples, expression analyses of the 10 most frequently TCGA mutated genes (Supplementary Table 2) showed that 4 of them (CSMD3, MUC16, DNAH5 and PKHD1L1) were overexpressed, while NFE2L2 was underexpressed (adjusted P value < .05).
Figure 2

RNA-seq mutational analysis in ESCC. Only missense, nonsense, and synonymous point mutations were considered. SNPs, InDel, and A>T to G>C conversions were removed due to presumptive association with RNA edition. (A) Graphical representation of the number of mutations per tumor sample in the investigation set. (B) Graphical representation of the mutational profiles. (C) Graphical representation of the mutation frequency of the most commonly altered genes in tumors of the investigation set. IS, investigation set.

TP53 was selected for further analysis by DNA sequencing in the validation set (VS) where its mutation frequency was slightly lower (67.4%) than in IS, with single mutations representing the most frequent class in both cases (Figure 3A). TP53 analysis identified four tumors with two mutations, and three and six mutations in single tumors (Figure 3A). The most common conversion was AT>GC (31.7%), not considered in IS (Figure 3B). GC>AT was the second most frequent conversion (29.3%) in VS and the most frequent one in IS (45.5%); AT>TA and GC>TA were found in at least 10% of cases in both sets; InDel and GC>CG and AT>CG conversions showed smaller frequencies (Figure 3B). Figure 3C shows the distribution of TP53 mutations per exon in IS and VS. The vast majority of mutations was found in exons 5 to 8 (altogether comprising 82% in the IS and 90% in the VS), while mutations were also observed in exons 4 (3% in the VS only), 9 (9% the IS only), 10 (9% the IS and 3% in the VS), and 11 (3% in the VS only). In IS and VS, 64% and 86% of missense mutations were observed, respectively, while all others comprised nonsense mutations in both datasets (Supplementary Table 3). Mutations were also identified in TP53 introns in VS, always coexisting with exonic mutations. The presence of TP53 mutations was not associated with clinical and pathology data (data not shown).
Figure 3

TP53 mutational analysis by RNA-seq and DNA sequencing. (A) Graphical representation of the frequency of samples without mutations and with different numbers of TP53 mutations in the investigation set (RNA-seq; n = 14) and validation set (DNA-seq; n = 41). (B) TP53 InDel and conversions in the investigation and validation sets. (C) Distribution of TP53 mutations per exon, in the investigation and validations sets. WT = wild type. # not applicable; InDel and A>T to G>C conversions were removed from analysis in the investigation set due to presumptive association with RNA edition.

Gene Fusions Analysis in ESCC

RNA-seq data were further used for investigating the presence of gene fusions in ESCC. Interestingly, although 21 fusions were detected, none of them was shared by different tumors, and the vast majority of them (71%) involved syntenic genes (Table 3). These fusions had not been previously described in the Fusion Cancer Database and Mitelman Database of Chromosome Aberrations and Gene Fusions in Cancer.
Table 3

Gene Fusions (n = 21) Identified in the 14 Samples of the Investigation Set by RNA-seq

Sample5' Gene3' Gene5' Breakpoint Position (chr:nt)3' Breakpoint Position (chr:nt)
IS01PLS3IKZF3X:11479558717:37949186
SLC25A43IMPG1X:1185406646:76640868
IS02CCDC127AHRR5:2168445:376725
IS04RAB3IPCCT212:7013281112:69985839
HNRNPCCHD814:2173147014:21869672
POLR2ANLGN217:741624217:7317663
NDUFA10MYEOV22:2409579702:241070505
MAP4SPINK83:481302633:48351436
MAP4SPINK83:481302633:48361108
FLNBSLMAP3:581345793:57893611
ZNF3TAF61082:38:007:99711891
IS07SPAG9CA1017:4919771517:49825178
IS12NDUFAF2ZSWIM65:602412095:60768508
IS13SGMS1CACNB210:5222043310:18439812
CTBP2EFCAB510:12684888817:28378136
LOC100133315DEFB13111:716271204:9452086
YWHAZDACH18:10196415713:72256042
TRAF3CDK1214:10324401217:37665958
PHF2C9orf1029:964295229:98718196
IS14CTTNBP2NLST7L1:1129588861:113098640
SFRP1SLC20A28:411609808:42302280

Chr, chromosome; nt, nucleotide; IS, investigation set.

RNA-seq mutational analysis in ESCC. Only missense, nonsense, and synonymous point mutations were considered. SNPs, InDel, and A>T to G>C conversions were removed due to presumptive association with RNA edition. (A) Graphical representation of the number of mutations per tumor sample in the investigation set. (B) Graphical representation of the mutational profiles. (C) Graphical representation of the mutation frequency of the most commonly altered genes in tumors of the investigation set. IS, investigation set. TP53 mutational analysis by RNA-seq and DNA sequencing. (A) Graphical representation of the frequency of samples without mutations and with different numbers of TP53 mutations in the investigation set (RNA-seq; n = 14) and validation set (DNA-seq; n = 41). (B) TP53 InDel and conversions in the investigation and validation sets. (C) Distribution of TP53 mutations per exon, in the investigation and validations sets. WT = wild type. # not applicable; InDel and A>T to G>C conversions were removed from analysis in the investigation set due to presumptive association with RNA edition. Gene Fusions (n = 21) Identified in the 14 Samples of the Investigation Set by RNA-seq Chr, chromosome; nt, nucleotide; IS, investigation set. Association between gene fusions and nonhomologous end joining (NHEJ) DNA repair. (A) Boxplots showing expression patterns of genes involved in NHEJ DNA repair pathway (KEGG database) in tumors with and without gene fusion in the investigation set (n = 14). (B) ATR expression in the same set of samples with and without gene fusions. *P < .05. Five of seven patients of IS showed one or two fusions in tumors, and two patients showed a higher number of fusions. Patient 4 was shown to carry eight gene fusions, including two involving the same genes, MAP4 and SPINK8, but with different breakpoints (Table 3), while patient 14 showed six fusions. Mutations were not detected in genes involved in fusions, and fusions were not associated with clinical and pathology data (not shown). However, TCGA data of all genes herein involved in fusions showed that seven gene pairs were co-amplified, albeit with frequencies ranging from 1% to 22%, while five gene pairs showed co-deletion (in 1%-4% of samples), and one gene pair was co-amplified and co-deleted in different samples (Supplementary Figure 3). Moreover, as fusions might be a consequence of dysregulated DNA repair, mainly by nonhomologous end-joining (NHEJ), analysis of 14 NHEJ-related genes from the KEGG database showed that only ATR was overexpressed in tumor samples with fusions (P = .034) (Figure 4).
Figure 4

Association between gene fusions and nonhomologous end joining (NHEJ) DNA repair. (A) Boxplots showing expression patterns of genes involved in NHEJ DNA repair pathway (KEGG database) in tumors with and without gene fusion in the investigation set (n = 14). (B) ATR expression in the same set of samples with and without gene fusions. *P < .05.

Discussion

This study reports, for the first time, a comprehensive analysis of the ESCC transcriptome, mutational landscape, and fusion events. It shows that the Wnt signaling pathway was frequently activated through dysregulation of gene expression. Furthermore, TP53 was the only gene with a high mutation frequency, while gene fusions were apparently random. RNA-seq is a high-throughput procedure that provides extensive data on genomic alterations with numerous advantages over hybridization-based transcriptome analysis, like detection of rare and alternative transcripts, splice variants, mutations, and quantification of transcript expression [22]. Although mutational analysis of transcripts is restricted following removal of InDel mutations and A:T>G:C conversions largely produced by RNA editing [16], all other mutations in coding regions can be successfully analyzed, as was the case in liver and prostate cancer [23], [24]. Less than 10 studies of ESCC alterations by RNA-seq have been reported in the literature, most of which of Chinese patients and with similar number of patients to those included in our study [25], [26], [27], [28], [29], [30], [31]. These studies also showed different but comparable estimates of DEG (1425 to 6150) to our findings (6698), depending on sample size, fold-change and P value cutoffs, with a predominance of overexpression. The difference in DEG among the different studies may, at least in part, be caused by the heterogeneity among tumor samples, as shown in this study through the comparison of principal component analysis between nontumor adjacent mucosa and tumor samples. Nevertheless, RNA-seq experiments with at least 12 biological replicates were found to be satisfactory for identifying differentially expressed genes for all fold changes [32]. Recently, the most comprehensive RNA-seq study on ESCC carried out by the TCGA consortium [7] included 90 tumors and 3 samples of nontumor esophageal mucosa from Asian, North American, South American, and European populations. Although this study did not compare ESCC tumor with nontumor adjacent tissues, three different molecular subtypes (ESCC1, ESCC2, and ESCC3) were suggested. The ESCC2 subtype, more common among Eastern European and South American patients, was characterized by loss of function of KDM6A, KTM2D, PTEN and PIK3R1, CDK6 amplification and BST2 overexpression. Interestingly, in our dataset, CDK6 and BST2 overexpression was observed in tumors with respect to the nontumor adjacent mucosa (logFC of 1.86 and 2.28, respectively), while PTEN was significantly underexpressed (logFC of −0.47). These findings suggested that these alterations might play a role in ESCC development in South American patients. Although enrichment of alterations of the Wnt pathway has been reported in ESCC studies with RNA-seq [7], [26], [27], this pathway had not been further explored. Our findings showed 54 DEG in the Wnt pathway, most of which showing overexpression and associated with activation of Wnt signaling. Deregulation of Wnt signaling has been found in other tumors, albeit by different mechanisms. Recently, Guinney et al. [33] proposed a new molecular classification for colorectal cancer, showing that APC mutations were present in 70% of cases, being more common (83%) in the consensus molecular subtype 2 (CMS2), followed by CMS3 (78%), CMS4 (66%), and CMS1 (40%). Conversely, breast cancer showed a more similar scenario to ESCC because Wnt activation, preferentially observed in triple-negative invasive carcinomas, was mediated by differential expression of pathway members rather than by mutations [34], [35], [36]. Deregulation of CTNNB1, APC, and DVL1 expression was found in 21% of all invasive breast carcinomas, while this frequency was much higher (56%) in PAM50 Basal subtype [36]. Although different genes were found to be involved in ESCC and breast cancer, Wnt pathway activation was associated with poor outcome in both tumors [present data; 34-36]. In this study, WNT7B overexpression was shown to be an independent prognostic marker in ESCC (HR = 5.5). WNT7B overexpression has been reported in breast cancer at mRNA and protein levels, with immunoreactivity in tumor and myeloid cells [37]. These authors showed that WNT7B silencing in myeloid cells resulted in reduction of breast tumor mass, volume, and lung metastases in a murine model. Additionally, WNT16, a gene involved in Wnt pathway activation by a noncanonical mechanism leading to JNK activation [38], was also found to be overexpressed in our study, as well as its downstream target MAPK10, also known as JNK3. The activation of this noncanonical pathway has been shown to increase keratinocyte proliferation [38], while JNK has been shown to be necessary for UV-mediated apoptosis [39]. Mutation analysis revealed that TP53 was the most frequently mutated gene in ESCC, with approximately 70% of tumors showing at least one mutation, mainly missense (86%), and in exons 5 to 8 (90%). Most mutations resulted in a nonfunctional protein. Our RNA-seq analyses were more sensitive than previous ones based on Sanger sequencing pointing to TP53 mutations in 35% of Brazilian patients with ESCC [40]. It also confirmed TCGA and the International Cancer Genome Consortium (ICGC) studies showing TP53 as the only gene that was very frequently affected by mutations in ESCC (93% and 62%, respectively). LOC389831, a gene coding for uncharacterized isoforms, was the second most frequently mutated gene (43%) in our study, followed by PI4KA, HERC2, NBPF9, and MST1L, each with 21.4%. Phosphatidylinositol 4-Kinase Alpha (PI4KA) catalyzes the first step of the biosynthesis of phosphatidylinositol 4,5-bisphosphate. It has been found to be overexpressed in hepatocarcinoma, showing a positive and significant correlation with expression of PCNA and KI67 proliferation markers and associated to a poor prognosis [41]. HECT Domain And RCC1-Like Domain-Containing Protein 2 (HERC2) facilitates the assembly of the RNF8-UBC13 complex to recruit BRCA1 to DNA damaged sites, and it also plays a role in p53 oligomerization [42]. Its overexpression has been previously associated with a favorable prognosis in non–small cell lung cancer [43]. Neuroblastoma Breakpoint Family Member 9 (NBPF9) has been found to be overexpressed in lung adenocarcinoma [44], while Macrophage Stimulating 1 Like (MSTIL) codes for a serine-type endopeptidase of unknown status in tumors (www.genecards.org, GC01M016754). Our findings were not coincident with TCGA and ICGC studies because none of these four above-mentioned genes were included among the 20 most mutated genes in these databases. A likely explanation for this discrepancy might be the exclusion of A:T>G:C conversions and InDel in our RNA-seq mutational analysis and the ESCC heterogeneity between populations since only 4 of the 10 most frequently mutated genes were shared by TCGA and ICGC data, albeit with different mutation frequencies, probably due to the different ethnic composition of patients included in these studies, viz., Chinese in ICGC and a heterogeneous set of patients in TCGA. These findings pointed to the heterogeneous mutational landscape of ESCC, without a major mutation accounting for the likely activation of a driver oncogene, thus making the development of a target-specific tyrosine kinase inhibitor difficult. Although half of the tumors herein studied showed fusions, none of them was recurrent or involved the same genes in different patients, and most of them involved syntenic loci. Approximately 33% of the gene pairs involved in fusions were co-amplified in TCGA, like the syntenic genes CCDC127 and AHRR. This pointed to a likely occurrence of chromoanagenesis and formation of micronuclei initiated by error in mitotic segregation, a common event in malignancies with a highly dysregulated cell cycle like ESCC. Newly formed micronuclei frequently show a reduced nuclear import consequently to which defective response to DNA damage signaling and delayed DNA replication occur. Cells may thus enter in mitosis with micronuclei still undergoing DNA replication, resulting in chromosome fragmentation and repair of double-strand breaks by nonhomologous end joining (NHEJ), leading to deletions, translocations, and formation of double minute chromosomes. These small, circular DNA fragments might contain one or more genes, usually oncogenes, while lacking centromeres and telomeres [45], often present at many copies per cell. Double minute chromosomes have been reported in ESCC in association to amplification of two candidate oncogenes, FGFR1 and LETM2 [46]. Furthermore, the association between ATR overexpression and presence of fusions in our dataset further suggested the occurrence of chromoanagenesis in ESCC, although further studies are necessary for confirming this phenomenon. This study showed that there was a high heterogeneity among ESCC patients, either for each type of genomic alteration or between them, and also when compared to other studies, such as TCGA. Therefore, we conclude that genomic alterations in ESCC are highly heterogeneous, impacting negatively in target therapy development for ESCC. The following are the supplementary data related to this article.

Supplementary Figure 1

Principal component analysis reveals the heterogeneous expression profile of ESCC samples.

Supplementary Figure 2

Oncoprint, based on TCGA database, showing alteration profiles of genes of the Wnt signaling pathway identified as differentially expressed in this study.

Supplementary Figure 3

Oncoprint, based on TCGA database, showing alteration profiles of genes involved in fusions in the present study. Fusion partners are listed in line pairs.

Supplementary Table 1

Primers Sequences Used in the Present Study

Supplementary Table 2

Comparisons Between Alterations Described in the Present Study and in TCGA Database

Supplementary Table 3

Description of All Mutations Identified in TP53 by RNA-seq or DNA-seq in the Present Study
  44 in total

1.  Predictors of outcome of single-dose brachytherapy for the palliation of dysphagia from esophageal cancer.

Authors:  Marjolein Y V Homs; Ewout W Steyerberg; Wilhelmina M H Eijkenboom; Peter D Siersema
Journal:  Brachytherapy       Date:  2006 Jan-Mar       Impact factor: 2.362

2.  Rab25 is a tumor suppressor gene with antiangiogenic and anti-invasive activities in esophageal squamous cell carcinoma.

Authors:  Man Tong; Kwok Wah Chan; Jessie Y J Bao; Kai Yau Wong; Jin-Na Chen; Pak Shing Kwan; Kwan Ho Tang; Li Fu; Yan-Ru Qin; Si Lok; Xin-Yuan Guan; Stephanie Ma
Journal:  Cancer Res       Date:  2012-09-18       Impact factor: 12.701

3.  TP53 mutation profile of esophageal squamous cell carcinomas of patients from Southeastern Brazil.

Authors:  Ana Rossini; Tatiana de Almeida Simão; Cynthia B Marques; Sheila C Soares-Lima; Suellen Herbster; Davy Carlos M Rapozo; Nelson A Andreollo; Maria A Ferreira; Kenya Balbi El-Jaick; Roberto Teixeira; Denise P Guimarães; Rodolpho Mattos Albano; Luis Felipe Ribeiro Pinto
Journal:  Mutat Res       Date:  2009-11-26       Impact factor: 2.433

Review 4.  RNA-Seq: a revolutionary tool for transcriptomics.

Authors:  Zhong Wang; Mark Gerstein; Michael Snyder
Journal:  Nat Rev Genet       Date:  2009-01       Impact factor: 53.242

5.  Cancer incidence and mortality worldwide: sources, methods and major patterns in GLOBOCAN 2012.

Authors:  Jacques Ferlay; Isabelle Soerjomataram; Rajesh Dikshit; Sultan Eser; Colin Mathers; Marise Rebelo; Donald Maxwell Parkin; David Forman; Freddie Bray
Journal:  Int J Cancer       Date:  2014-10-09       Impact factor: 7.396

6.  β-Catenin pathway activation in breast cancer is associated with triple-negative phenotype but not with CTNNB1 mutation.

Authors:  Felipe C Geyer; Magali Lacroix-Triki; Kay Savage; Monica Arnedos; Maryou B Lambros; Alan MacKay; Rachael Natrajan; Jorge S Reis-Filho
Journal:  Mod Pathol       Date:  2010-11-12       Impact factor: 7.842

7.  HTSeq--a Python framework to work with high-throughput sequencing data.

Authors:  Simon Anders; Paul Theodor Pyl; Wolfgang Huber
Journal:  Bioinformatics       Date:  2014-09-25       Impact factor: 6.937

8.  How many biological replicates are needed in an RNA-seq experiment and which differential expression tool should you use?

Authors:  Nicholas J Schurch; Pietá Schofield; Marek Gierliński; Christian Cole; Alexander Sherstnev; Vijender Singh; Nicola Wrobel; Karim Gharbi; Gordon G Simpson; Tom Owen-Hughes; Mark Blaxter; Geoffrey J Barton
Journal:  RNA       Date:  2016-03-28       Impact factor: 4.942

9.  Integrated genomic characterization of oesophageal carcinoma.

Authors: 
Journal:  Nature       Date:  2017-01-04       Impact factor: 49.962

10.  Integrative analyses of transcriptome sequencing identify novel functional lncRNAs in esophageal squamous cell carcinoma.

Authors:  C-Q Li; G-W Huang; Z-Y Wu; Y-J Xu; X-C Li; Y-J Xue; Y Zhu; J-M Zhao; M Li; J Zhang; J-Y Wu; F Lei; Q-Y Wang; S Li; C-P Zheng; B Ai; Z-D Tang; C-C Feng; L-D Liao; S-H Wang; J-H Shen; Y-J Liu; X-F Bai; J-Z He; H-H Cao; B-L Wu; M-R Wang; D-C Lin; H P Koeffler; L-D Wang; X Li; E-M Li; L-Y Xu
Journal:  Oncogenesis       Date:  2017-02-13       Impact factor: 7.485

View more
  5 in total

Review 1.  Signaling pathways and their potential therapeutic utility in esophageal squamous cell carcinoma.

Authors:  L K Kadian; M Arora; C P Prasad; R Pramanik; S S Chauhan
Journal:  Clin Transl Oncol       Date:  2022-01-06       Impact factor: 3.405

Review 2.  Interplay between HMGA and TP53 in cell cycle control along tumor progression.

Authors:  Nathalia Meireles Da Costa; Antonio Palumbo; Marco De Martino; Alfredo Fusco; Luis Felipe Ribeiro Pinto; Luiz Eurico Nasciutti
Journal:  Cell Mol Life Sci       Date:  2020-09-12       Impact factor: 9.261

3.  Drugging multiple same-allele driver mutations in cancer.

Authors:  Ruth Nussinov; Mingzhen Zhang; Ryan Maloney; Hyunbum Jang
Journal:  Expert Opin Drug Discov       Date:  2021-03-26       Impact factor: 7.050

4.  Systematic profiling identifies PDLIM2 as a novel prognostic predictor for oesophageal squamous cell carcinoma (ESCC).

Authors:  Guiqin Song; Jun Xu; Lang He; Xiao Sun; Rong Xiong; Yuxi Luo; Xin Hu; Ruolan Zhang; Qiuju Yue; Kang Liu; Gang Feng
Journal:  J Cell Mol Med       Date:  2019-06-20       Impact factor: 5.310

5.  A natural WNT signaling variant potently synergizes with Cdkn2ab loss in skin carcinogenesis.

Authors:  Paul Krimpenfort; Margriet Snoek; Jan-Paul Lambooij; Ji-Ying Song; Robin van der Weide; Rajith Bhaskaran; Hans Teunissen; David J Adams; Elzo de Wit; Anton Berns
Journal:  Nat Commun       Date:  2019-03-29       Impact factor: 14.919

  5 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.