Literature DB >> 26374103

Copy number alterations detected by whole-exome and whole-genome sequencing of esophageal adenocarcinoma.

Xiaoyu Wang1, Xiaohong Li2,3, Yichen Cheng4, Xin Sun5, Xibin Sun6, Steve Self7, Charles Kooperberg8, James Y Dai9,10.   

Abstract

BACKGROUND: Esophageal adenocarcinoma (EA) is among the leading causes of cancer mortality, especially in developed countries. A high level of somatic copy number alterations (CNAs) accumulates over the decades in the progression from Barrett's esophagus, the precursor lesion, to EA. Accurate identification of somatic CNAs is essential to understand cancer development. Many studies have been conducted for the detection of CNA in EA using microarrays. Next-generation sequencing (NGS) technologies are believed to have advantages in sensitivity and accuracy to detect CNA, yet no NGS-based CNA detection in EA has been reported.
RESULTS: In this study, we analyzed whole-exome (WES) and whole-genome sequencing (WGS) data for detecting CNA from a published large-scale genomic study of EA. Two specific comparisons were conducted. First, the recurrent CNAs based on WGS and WES data from 145 EA samples were compared to those found in five previous microarray-based studies. We found that the majority of the previously identified regions were also detected in this study. Interestingly, some novel amplifications and deletions were discovered using the NGS data. In particular, SKI and PRKCZ detected in a deletion region are involved in transforming growth factor-β pathway, suggesting the potential utility of novel biomarkers for EA. Second, we compared CNAs detected in WGS and WES data from the same 15 EA samples. No large-scale CNA was identified statistically more frequently by WES or WGS, while more focal-scale CNAs were detected by WGS than by WES.
CONCLUSIONS: Our results suggest that NGS can replace microarrays to detect CNA in EA. WGS is superior to WES in that it can offer finer resolution for the detection, though if the interest is on recurrent CNAs, WES can be preferable to WGS for its cost-effectiveness.

Entities:  

Mesh:

Substances:

Year:  2015        PMID: 26374103      PMCID: PMC4570720          DOI: 10.1186/s40246-015-0044-0

Source DB:  PubMed          Journal:  Hum Genomics        ISSN: 1473-9542            Impact factor:   4.639


Background

Cancer arises from gradual accumulation of somatic genomic instability and alterations, which eventually lead to carcinogenesis and cancer progression [1, 2]. Copy number alterations (CNAs), one form of somatic genome alterations, refer to somatic changes in chromosome structure that result in gains or losses of copies of DNA segments. Detection of CNA is important to understand cancer development and identify key driver events [3, 4]. Microarray technologies have been widely used in CNA detection [5-7], including array comparative genomic hybridization (array CGH) and single nucleotide polymorphisms (SNP) microarrays. In array CGH, reference and test DNAs are fluorescence-labeled and hybridized to arrays, which are composed of bacterial artificial chromosome (BAC) clones, cDNA clones, or oligonucleotides. The signal ratio is used as an estimate of the copy number ratio. SNP microarrays are also based on hybridization, but a single sample is processed on each microarray and intensity ratios are formed by comparing the intensity of the sample under investigation to a collection of reference samples, or all other samples that are studied. Compared to array CGH, SNP arrays can have better resolution and produce B allele frequency so that loss of heterozygosity (LOH) can be detected [7]. Resolution of these arrays is typically greater than 1 kb, depending on the density, distribution, and response characteristics of their probes. More recently, next-generation sequencing (NGS) technologies offer single-nucleotide resolution and absolute counts of read numbers and therefore can provide more sensitive and accurate CNA results. Moreover, direct sequencing enables substantial increases in discoveries of smaller structural variation events [8, 9]. It is believed that, with its ever-decreasing cost, NGS will ultimately replace microarrays in copy number analyses [10]. In this paper, we conduct CNA analyses using published NGS data from [11], which contains 145 esophageal adenocarcinoma (EA) samples, as no CNA analyses were reported in the paper. The incidence of EA has strikingly increased over the past 30–40 years, and it is the seventh leading cause of cancer death among men in the USA [12]. Many studies of CNA detection in EA have been carried out using microarrays. Paulson et al. detected 19 most frequent CNAs in 15 EA patients using BAC array data [13]. Beroukhim et al. created the Tumorscape Copy Number Portal, where they collected more than 3000 copy number profiles from 26 cancer types using Affymetrix 250K StyI (Affymetrix, Santa Clara, CA) arrays [3]. They identified 33 recurrent CNAs (RCNAs), which appear in 44 EA samples more frequently than expected by chance. Dulak et al. detected 46 regions of significant recurrent events of gain and loss in 186 EA samples using 250K StyI arrays and SNP Array 6.0 arrays (Affymetrix) [14]. Zack et al. created the TCGA Copy Number Portal and identified RCNAs across multiple cancer types; they detected 88 RCNAS across 184 EA samples using Affymetrix SNP6 arrays [4, 15]. Frankel et al. detected 52 RCNAs in 54 EA samples using Illumina CytoSNP-12 arrays [16]. However, there has not been any published CNA detection study using NGS technologies. In this study, we plan to fill the gap by analyzing the NGS data from [11] and compare the result to the findings of the aforementioned papers. Indeed, microarray-based CNA analyses are still a common approach to detect CNAs, possibly due to the following reasons: microarray technologies have been developed for a longer time and corresponding CNA detection methods were well established and accurate detection of CNA in NGS can be a challenging task due to the complexities of sequencing data processing [17]. To the best of our knowledge, only a few CNA studies have been conducted to compare the performance of microarrays and NGS side-by-side. Koboldt et al. detected CNAs on coding regions of five ovarian tumors using both a SNP array and two NGS platforms—whole-genome (WGS) and whole-exome sequencing (WES) [18]. They found the majority of CNA events were consistently detected by the three platforms. More CNAs were detected by the WGS platform than those by the array. In another study, the authors detected germline copy number variations (CNVs) in 16 breast cancer cell lines using both array CGH and WES [19]. Four WES-based CNV detection methods were compared, and the regions detected by the array were used to form the gold standard. They detected a greater number of focal-scale CNVs using the array. These studies were conducted on the individual sample level. In this study, we are interested to detect and compare regions frequently appearing among multiple samples between NGS data and previous findings derived from microarrays-based studies. The detected recurrent regions may contain real driver events that contribute to the cancer development. Furthermore, there were 15 samples (patients) subjected to both WGS and WES in [11], providing a great opportunity to compare CNA detection by WES and WGS. Not much work has been conducted to address this question. Koboldt et al. found that a significant portion (79.53 %) of focal-scale CNAs detected by WES were also supported by WGS, and they recommended the use of WES-based approach, by which it is likely to detect more platform-specific focal copy number changes missed by WGS and microarray [18]. WES is an increasingly popular platform for studying tumor genomics because of itscost-effectiveness and the immediate interpretation of mutations in coding regions. It has been shown that WES data can be used to study CNA [19]. However, the uniformity of WES coverage is worse than that of WGS mostly due to exome capturing, and exons are not evenly placed within the genome so that it is difficult to detect CNAs over a long intergenic region using WES. On the other hand, if the interest is long CNA segments spanning over genes, it is not clear whether CNAs inferred by WES will lose a substantial amount of information when compared to WGS. It is quite possible that this comparison may depend on cancer site and the length of CNAs, since longer segment should be reliably detected by exome sequencing. A number of bioinformatics and statistical methods have been developed for CNA detection using NGS data [17, 20–22]. These methods can be classified in several ways. Most methods were developed to detect CNAs on the individual sample level, and they usually detect CNAs based on read count ratios between a tumor sample and its matched normal sample. These methods can be further categorized according to the study design. Some commonly used ones are as follows. (a) CNVnator [23], RDXplorer [24], and ReadDepth [25] detect CNAs on a single tumor sample. (b) CNAseg [26], Segseq [27], ExomeCNV [28], HMMcopy [29], and VarScan2 [18] identify CNAs on matched tumor-normal samples. Control-FREEC [30, 31] can be categorized both into classes (a) and (b), as it can either work with tumor-normal pairs or with tumor-only samples. Depending on the NGS platforms, CNVnator, Segseq, RDXplorer, ReadDepth, and HMMcopy work for WGS data; ExomeCNV and VarScan work for WES data; and Control-FREEC can work for both types of the sequencing data. In addition to the above methods detecting CNA in individual samples, other methods have been developed to detect RCNAs from multiple samples. These methods take segments from all the individual samples as input and identify the (merged) segments which appear more frequently across the population than expected by chance. Only a few RCNA methods have been developed for NGS data, including JointSLM [32] and cn.MOPS [33]. They conduct copy number analyses based on read counts of segments of multiple tumor samples and usually are applied for CNV detection. On the other hand, many RCNA detection methods that were originally developed for microarray platforms [34] can also be adapted to work on NGS data. These methods include STAC [35], CMDS [36], and GISTIC2.0 [37]. In this study, Control-FREEC is selected to detect CNAs on the individual sample level using WGS and WES data from [11], and the results are compared between the two sequencing platforms. Control-FREEC is a flexible and powerful tool in that it performs multiple types of bias corrections considering GC-content, mappability, and matched normal sample, and it is among the most sensitive tools on both WGS and WES platforms [22]. GISTIC2.0, likely the most popular RCNA detection method, is chosen to detect RCNAs using both WGS and WES data. The identified RCNAs are then compared with those reported previously using microarrays. We compare our results with those from five previous studies, and four of which (all except [13]) used GISTIC2.0. By choosing GISTIC2.0, we hope to alleviate the concern that potential differences generated in the NGS data are due to different software and analytical methods being applied.

Results

RCNA analysis

The estimated copy ratios of segments among 145 WES and 15 WGS data are shown in Fig. 1. We used GISTIC2.0 on the copy ratio profiles to perform a permutation-based significance analysis and identify significantly amplified/deleted regions. The recurrent amplification/deletion regions for WES data are shown in Fig. 2. The results of WGS data are shown in Fig. 3 accordingly. The threshold for the residual q value was set as 0.1, resulting in 41/16 amplifications and 67/19 deletions in WES/WGS data, respectively. We further combined the results from WES and WGS, and resulted in 47 amplification and 74 deletion events.
Fig. 1

Segmented copy number ratio profiles in WES and WGS. The x-axis represents the samples. The y-axis represents the chromosomes. a WES data. b WGS data

Fig. 2

Genomic positions of RCNAs detected in 145 WES data. The x-axis represents the normalized amplification signals (top) and significance by q value (bottom). The green line indicates the significance cutoff at q = 0.25. a Amplification regions. b Deletion regions

Fig. 3

Genomic positions of RCNAs detected in 15 WGS data. The x-axis represents the normalized amplification signals (top) and significance by q value (bottom). The green line indicates the significance cutoff at q = 0.25. a Amplification regions. b Deletion regions

Segmented copy number ratio profiles in WES and WGS. The x-axis represents the samples. The y-axis represents the chromosomes. a WES data. b WGS data Genomic positions of RCNAs detected in 145 WES data. The x-axis represents the normalized amplification signals (top) and significance by q value (bottom). The green line indicates the significance cutoff at q = 0.25. a Amplification regions. b Deletion regions Genomic positions of RCNAs detected in 15 WGS data. The x-axis represents the normalized amplification signals (top) and significance by q value (bottom). The green line indicates the significance cutoff at q = 0.25. a Amplification regions. b Deletion regions These newly identified genomic regions were verified with all the five microarray-based studies (Tables 1 and 2). It was found that the majority of the regions (68 % of deletions and 74 % of amplifications) detected in our study were also identified in those previous studies. Known cancer genes within these regions were identified according to the Cancer Gene Census [38], and the results are shown in the supplementary document (Additional file 1: Tables S1 and S2). Among all these detected regions, 13 amplification events were not reported in any of the previous studies; four of them (1p36.33, 12p13.31, 18p11.21, 8q24.3) had a residual q value less than 0.01. Twenty-nine deletion events were not identified previously, and ten of them (Xp22.33, 3p26.3, 6q22.31, 14q32.2, 1p21.1, 3p12.3, 6q12, Yq12, 6p12.3, 14p11.2) had a residual q value less than 0.01. We also examined the regions identified from the five previous studies to see whether they were also identified using the NGS data. We extracted the amplification regions (from Additional file 1: Table S2-C) and deletion regions (from Additional file 1: Table S4-B) in [14], for example. We checked if these regions were detected using the sequencing data and listed the q value for each region in Table 3. The genomic location for each region was converted from hg18 to hg19 using the University of California, Santa Cruz (UCSC) liftOver tool. The majority of those regions overlapped with our results, except for four amplifications and four deletions. The comparisons with other four studies are listed in the supplementary document (Additional file 1: Tables S3–S6), from which it can be seen that 58 % of regions in [16], 95 % of regions in [13], 64 % of regions in [3], and 57 % of regions in [4] were detected in our study. From these comparisons, we observed that the majority of regions in previous microarray studies were detected using NGS data.
Table 1

Amplification RCNAs detected by 145 WES data and 15 WGS data

CytobandPeak boundary (Mb)Width (Mb)PlatformResidual q valueDFPBZ
1p36.33chr1:0.99-3.162.174WES1.16E−04
1q21.3chr1:149.94-156.696.751WES2.14E−05Y
3q26.1chr3:164.71-164.760.047WES and WGS9.75E−02
3q26.2chr3:169.43-170.591.158WES4.18E−02YYY
6p21.32chr6:32.56-32.580.016WGS2.35E−02
6p21.1chr6:42.79-43.971.178WES and WGS2.00E−13YYY
6q23.3chr6:135.29-135.710.421WES and WGS4.98E−04YY
7p22.1chr7:4.29-6.892.609WES6.42E−02Y
7p11.2chr7:55.00-55.460.455WES5.48E−14YYYYY
7q21.2chr7:91.98-92.760.779WES and WGS5.07E−12YYYY
7q22.1chr7:98.46-100.6742.217WES7.81E−07YYY
7q31.2chr7:115.61-117.832.211WES1.71E−02Y
8p23.1chr8:6.97-7.160.182WES5.08E−02YYY
8p23.1chr8:7.37-7.630.263WES1.79E−03YYY
8p23.1chr8:11.28-11.680.402WES4.25E−13YYY
8q24.21chr8:126.45-129.022.572WES and WGS1.79E−10YYYYY
8q24.3chr8:141.9-146.364.464WES and WGS1.19E−03
9p13.3chr9:35.4-35.970.571WES5.60E−06YY
9q33.3chr9:124.98-141.2116.234WES6.22E−03Y
10p11.22chr10:31.61-33.622.015WES9.75E−02
11p11.2chr11:46.1-47.181.076WES1.30E−06Y
11q13.3chr11:68.86-69.630.775WES5.25E−16YYYY
11q14.1chr11:77.73-77.880.157WES8.70E−05Y
12p13.31chr12:9.63-9.720.082WES7.09E−03
12p12.1chr12:25.34-25.670.328WES and WGS2.63E−32YYYY
12q13.3chr12:56.14-57.321.181WES8.56E−02
12q15chr12:67.07-70.193.116WES3.88E−02YYYY
13q13.2chr13:33.28-35.251.972WES8.76E−04Y
13q14.11chr13:39.36-43.163.798WES1.99E−02Y
13q14.13chr13:44.73-46.641.906WGS7.11E−02
13q22.1chr13:72.13-78.676.54WGS5.39E−02YYY
15q26.1chr15:90.03-91.791.765WES4.45E−03YY
16p13.13chr16:11.37-12.010.64WES3.88E−02
17q12chr17:37.83-37.90.072WES1.04E−24YYYYY
17q21.2chr17:38.82-39.020.2WES and WGS4.35E−02Y
17q21.2chr17:39.85-39.990.146WES and WGS1.90E−08Y
17q21.33chr17:48.68-49.120.442WGS9.71E−02
17q25.3chr17:68.13-81.2013.067WES5.04E−02
18p11.21chr18:12.25-13.441.184WES9.23E−03
18q11.2chr18:19.75-20.520.766WES5.94E−11YYYY
19q12chr19:30.19-30.480.282WES6.69E−15YYYYY
20q13.12chr20:42.98-43.560.584WES and WGS8.36E−02Y
20q13.2chr20:47.90-52.774.877WES1.00E−02YY
20q13.33chr20:58.42-58.510.099WES1.75E−03Y
22q11.23chr22:24.39-24.410.016WGS7.11E−02
Xp11.1chrX:58.52-58.530.014WGS8.53E−02Y
Xq28chrX:152.11-153.911.793WES5.53E−04Y

A region may span multiple cytobands, in which case the longest one was listed. The regions were verified by checking if they were identified in any of the five previous microarray-based studies

D Dulak et al. 2012 [14], F Frankel et al. 2014 [16], P Paulson et al. 2009 [13], B Beroukhim et al. 2010 [3], Z Zack et al. 2013 [4], Y indicates a region was identified in a study

Table 2

Deletion RCNAs detected by 145 WES data and 15 WGS data

CytobandBoundary (Mb)Width (Mb)PlatformResidual q valueDFPBZ
1p36.11chr1:19.53-31.7312.21WES9.12E−04YYY
1p31.1chr1:45.53-100.3254.79WES5.89E−02
1p21.1chr1:104.12-107.603.48WES2.86E−04
1p13.2chr1:115.32-115.580.26WES5.65E−04Y
1q31.3chr1:186.41-200.1813.77WES1.16E−02
2q22.1chr2:136.87-149.4012.53WES8.62E−02YY
2q32.1chr2:179.23-190.4311.20WES6.27E−02
3p26.3chr3:0.00-2.612.61WES3.97E−08
3p24.3chr3:12.79-69.0356.24WES and WGS3.19E−02YYYY
3p12.3chr3:75.71-88.1012.39WES2.29E−03
4p16.1chr4:0.00-15.9715.97WES and WGS2.81E−02
4p15.31chr4:17.84-24.536.69WES9.06E−03Y
4p12chr4:42.15-47.455.30WES5.30E−02
4q13.2chr4:69.34-71.251.91WES2.49E−02
4q22.1chr4:90.88-93.232.35WGS9.73E−02YYY
4q28.3chr4:129.78-139.9810.20WES3.52E−02
4q32.1chr4:154.56-159.595.03WES9.27E−02Y
4q32.3chr4:164.44-165.881.44WES1.84E−02
4q34.3chr4:174.30-191.1516.86WES2.50E−03YYY
5q12.1chr5:58.15-59.791.64WGS4.42E−02YYYYY
5q13.1chr5:66.46-68.462.00WES3.52E−06Y
5q14.3chr5:79.47-130.5251.04WES1.41E−03Y
6p25.3chr6:0.00-2.622.62WES3.97E−07YYY
6p12.3chr6:49.82-50.790.97WES8.30E−03
6p22.2chr6:24.98-25.730.75WES1.97E−02
6p21.33chr6:31.17-31.320.15WGS3.97E−03
6q12chr6:64.42-71.146.72WES3.62E−03
6q16.1chr6:90.58-97.256.67WES2.08E−02Y
6q16.3chr6:100.06-105.415.35WES4.82E−02Y
6q22.31chr6:123.37-124.601.23WES1.21E−04
6q27chr6:151.79-171.1219.33WES2.27E−02YYY
7q31.1chr7:105.14-128.4723.33WES and WGS8.30E−03YYY
7q34chr7:141.64-141.950.31WES2.45E−06YY
8p23.2chr8:0.00-6.266.27WES9.89E−05YYYY
8p23.1chr8:7.83-10.392.55WES2.59E−02
8p21.2chr8:23.42-24.771.35WES1.73E−05Y
8p11.22chr8:38.85-39.780.92WES5.30E−02Y
9p23chr9:6.64-15.178.53WES1.10E−06YYYY
9p21.3chr9:21.86-23.691.83WES and WGS1.35E−34YYYYY
9q12chr9:43.13-66.5123.38WES3.52E−02
9q31.1chr9:70.49-123.1552.66WES2.49E−02
10q23.31chr10:89.55-94.214.67WES2.63E−05Y
11p15.4chr11:0.00-8.948.94WES1.05E−02Y
11p11.12chr11:49.00-57.078.07WES8.36E−02
11q14.1chr11:77.96-111.9633.99WES6.27E−02Y
11q25chr11:126.13-135.018.88WES4.23E−03YYY
12p13.31chr12:9.47-9.750.29WES4.82E−02
12q12chr12:33.56-48.1314.57WES5.89E−02
12q21.31chr12:70.76-93.7723.01WES4.23E−03Y
13q31.1chr13:61.10-95.2334.12WES2.38E−02
14p11.2chr14:0.00-20.4820.48WES8.69E−03
14q13.3chr14:36.79-37.640.86WES5.30E−02Y
14q32.13chr14:94.16-96.852.69WES1.39E−02
14q32.2chr14:97.03-107.3510.32WES2.37E−04
15q11.2chr15:20.78-22.691.91WES2.59E−02YYY
15q24.2chr15:74.01-77.713.71WGS6.84E−02
16p13.3chr16:0.00-4.904.90WES2.10E−05Y
16q21chr16:29.48-90.3560.88WES and WGS7.29E−02YYYY
17p12chr17:0.00-18.0218.02WES3.43E−03Y
17p11.2chr17:18.42-18.540.12WES2.20E−13YY
18q12.1chr18:24.60-28.654.04WES and WGS6.80E−03YY
18q12.3chr18:35.15-42.287.14WES and WGS5.53E−07YY
18q21.2chr18:48.59-50.281.69WES and WGS2.87E−13YYYY
18q23chr18:67.87-78.0810.21WES and WGS8.67E−05YYY
19p13.3chr19:0.00-10.6610.65WES6.47E−04Y
20p12.1chr20:13.97-16.042.06WGS2.93E−02YYY
21p11.2chr21:0.00-15.3215.32WGS1.77E−02Y
21q21.1chr21:19.63-27.017.38WES4.49E−05YY
21q22.3chr21:47.86-48.130.27WES and WGS2.64E−04Y
22q11.23chr22:24.33-24.370.05WGS3.83E−02Y
Xp22.33chrX:0.00-2.672.67WES8.13E−33
Xp21.1chrX:30.87-32.661.79WES and WGS2.99E−04YY
Xq28chrX:154.75-155.270.52WES2.55E−17Y
Yq12chrY:20.89-59.2238.33WES and WGS4.08E−03

A region may span multiple cytobands, in which case the longest one was listed. The regions were verified by checking if they were identified in any of the five previous microarray-based studies

D Dulak et al. 2012 [14], F Frankel et al. 2014 [16], P Paulson et al. 2009 [13], B Beroukhim et al. 2010 [3], Z Zack et al. 2013 [4], O our study, Y indicates a region was identified in a study

Table 3

Comparison of results of Dulak et al. [14] to our results

CytobandBoundary (Mb)Width (Mb)Our studyResidual q valueType
12p12.1chr12:25.34-25.450.11Y2.63E−32amp
18q11.2chr18:19.70-19.910.21Y5.94E−11amp
8p23.1chr8:11.37-11.670.30Y4.25E−13amp
19q12chr19:30.25-30.410.16Y6.69E−15amp
7q21.2chr7:92.48-92.660.18Y5.07E−12amp
11q13.3chr11:69.26-69.810.55Y5.25E−16amp
17q12chr17:37.72-38.020.30Y1.04E−24amp
17q21.2chr17:39.77-39.960.19Y1.90E−08amp
7p11.2chr7:54.95-55.430.48Y5.48E−14amp
8q24.21chr8:128.40-128.840.44Y1.79E−10amp
6p21.1chr6:43.21-43.350.14Y2.00E−13amp
9p13.3chr9:35.48-35.940.46Y5.60E−06amp
13q13.1chr13:33.38-34.431.05Y8.76E−04amp
7q22.1chr7:99.29-100.000.71Y7.81E−07amp
7q31.2chr7:116.13-116.630.50Y1.71E−02amp
12q15chr12:67.27-70.212.94Y3.88E−02amp
6q23.3chr6:135.28-135.830.55Y4.98E−04amp
10q22.2chr10:75.33-76.130.80amp
1q21.3chr1:147.76-154.106.34Y2.14E−05amp
10q26.13chr10:122.76-123.921.16amp
3q26.2chr3:168.72-172.283.56Y4.18E−02amp
18q11.2chr18:23.45-24.210.76Y5.94E−11amp
13q14.11chr13:41.37-41.930.56Y1.99E−02amp
11p14.1chr11:27.08-27.610.53amp
7q34a chr7:141.92-142.260.34amp
3p14.2chr3:58.98-61.542.56Y3.19E−02del
16q23.1chr16:78.13-79.651.52Y7.29E−02del
9p21.3chr9:21.86-22.020.16Y1.35E−34del
5q12.1chr5:58.26-59.791.53Y4.42E−02del
6p25.3chr6:1.60-2.631.03Y3.97E−07del
20p12.1chr20:14.26-16.041.78Y2.93E−02del
4q22.1chr4:91.15-93.272.12Y9.73E−02del
18q21.2chr18:48.52-48.720.20Y2.87E−13del
21q22.12chr21:36.11-36.430.32del
9p23chr9:7.79-12.724.93Y1.10E−06del
6q26chr6:161.69-163.211.52Y2.27E−02del
2q33.3chr2:204.82-206.561.74del
1q44chr1:245.85-246.710.86del
8p23.3chr8:1.01-1.460.45Y9.89E−05del
7q33chr7:123.66-142.5318.87Y8.30E−03del
7q36.1chr7:148.11-159.1311.02del
1p36.11chr1:25.77-31.255.48Y9.12E−04del
4q34.3chr4:178.82-185.316.49Y2.50E−03del
11q22.3chr11:105.95-112.836.88Y6.27E−02del
11q25chr11:121.03-134.9413.91Y4.23E−03del
21p11.2a chr21:1.00-16.2615.26Y1.77E−02del

Regions detected in Dulak et al., 2012 [14] were verified in our study

aIn cytoband indicates that the coordinate is based on hg18

Amplification RCNAs detected by 145 WES data and 15 WGS data A region may span multiple cytobands, in which case the longest one was listed. The regions were verified by checking if they were identified in any of the five previous microarray-based studies D Dulak et al. 2012 [14], F Frankel et al. 2014 [16], P Paulson et al. 2009 [13], B Beroukhim et al. 2010 [3], Z Zack et al. 2013 [4], Y indicates a region was identified in a study Deletion RCNAs detected by 145 WES data and 15 WGS data A region may span multiple cytobands, in which case the longest one was listed. The regions were verified by checking if they were identified in any of the five previous microarray-based studies D Dulak et al. 2012 [14], F Frankel et al. 2014 [16], P Paulson et al. 2009 [13], B Beroukhim et al. 2010 [3], Z Zack et al. 2013 [4], O our study, Y indicates a region was identified in a study Comparison of results of Dulak et al. [14] to our results Regions detected in Dulak et al., 2012 [14] were verified in our study aIn cytoband indicates that the coordinate is based on hg18 To generate a consensus list of regions, we investigated all the genomic regions in terms of cytobands across all the results from the six studies including ours and listed the regions appearing in at least three of them. The results are shown in Tables 4 and 5. Only two amplifications and six deletions were not found in our study, and our result is the one that is most consistent with the consensus regions, which suggests that NGS may be a more powerful approach for detecting RCNAs.
Table 4

Consensus amplification RCNAs in 6 studies

CytobandBoundary (Mb)DFPBZO
3q26.2chr3:167.60-170.90YYYY
6p21.1chr6:40.50-46.20YYYY
6q23.3chr6:135.20-139.00YYY
7p11.2chr7:54.00-58.00YYYYYY
7q21.2chr7:91.10-92.80YYYYY
7q21.3chr7:92.80-98.00YYY
7q22.1chr7:98.00-103.80YYYY
8p23.1chr8:6.20-12.70YYYY
8q24.13chr8:122.50-127.30YYY
8q24.21chr8:127.30-131.50YYYYYY
9p13.3chr9:33.20-36.30YYY
11q13.3chr11:68.40-70.40YYYYY
12p12.1chr12:21.30-26.50YYYYY
12q14.3chr12:65.10-67.70YYY
12q15chr12:67.70-71.50YYYYY
13q22.1chr13:73.30-75.40YYYY
15q26.1chr15:89.10-94.30YYY
15q26.2chr15:94.30-98.50YYY
17q12chr17:31.80-38.10YYYYYY
18q 11.2chr18:19.00-25.00YYYYY
19q12chr19:28.60-32.40YYYYYY
20q13.2chr20:49.80-55.00YYY

These regions are those that appear in at least three studies

D Dulak et al. 2012 [14], F Frankel et al. 2014 [16], P Paulson et al. 2009 [13], B Beroukhim et al. 2010 [3], Z Zack et al. 2013 [4], O our study, Y indicates a region was identified in a study

Table 5

Consensus deletion RCNAs in six studies

CytobandBoundary (Mb)DFPBZO
1p36.11chr1:23.90-28.00YYYY
1p35.3chr1:28.00-30.20YYY
1p35.2chr1:30.20-32.40YYY
1q44chr1:243.70-249.25YYY
2q22.1chr2:136.80-142.20YYY
2q22.2chr2:142.20-144.10YYY
2q33.3chr2:204.90-209.00YYY
3p14.2chr3:58.60-63.70YYYYY
4q22.1chr4:88.00-93.70YYYY
4q34.1chr4:171.90-176.30YYY
4q34.2chr4:176.30-177.50YYY
4q34.3chr4:177.50-183.20YYYY
4q35.1chr4:183.20-187.10YYYY
5q11.2chr5:50.70-58.90YYYYYY
5q12.1chr5:58.90-62.90YYYYYY
6p25.3chr6:0.00-2.30YYYY
6p25.2chr6:2.30-4.20YYY
6q26chr6:161.00-164.50YYYY
7q31.1chr7:107.40-114.60YYY
7q31.32chr7:121.10-123.80YYY
7q31.33chr7:123.80-127.10YYY
7q32.1chr7:127.10-129.20YYY
7q34chr7:138.20-143.10YYY
7q36.3chr7:155.10-159.14YYY
8p23.3chr8:0.00-2.20YYYYY
8p23.2chr8:2.20-6.20YYY
8p23.1chr8:6.20-12.70YYY
9p24.1chr9:4.60-9.00YYYYY
9p23chr9:9.00-14.20YYYYY
9p21.3chr9:19.90-25.60YYYYYY
11q24.2chr11:123.90-127.80YYY
11q24.3chr11:127.80-130.80YYYY
11q25chr11:130.80-135.01YYYY
16q23.1chr16:74.10-79.20YYYYY
16q23.2chr16:79.20-81.70YYYY
17p11.2chr17:16.00-22.20YYY
18q12.2chr18:32.70-37.20YYY
18q12.3chr18:37.20-43.50YYY
18q21.2chr18:48.20-53.80YYYYY
18q21.33chr18:59.00-61.60YYY
18q22.1chr18:61.60-66.80YYY
18q22.2chr18:66.80-68.70YYYY
18q22.3chr18:68.70-73.10YYYY
18q23chr18:73.10-78.077YYYY
20p12.1chr20:12.10-17.90YYYY
21q11.2chr21:14.30-16.40YYYY
21q21.1chr21:16.40-24.00YYY
21q21.2chr21:24.00-26.80YYY
21q22.12chr21:35.80-37.80YYYY
Xp21.2chrX:29.30-31.50YYY
Xp21.1chrX:31.50-37.60YYY

These regions are those that appear in at least three studies

Consensus amplification RCNAs in 6 studies These regions are those that appear in at least three studies D Dulak et al. 2012 [14], F Frankel et al. 2014 [16], P Paulson et al. 2009 [13], B Beroukhim et al. 2010 [3], Z Zack et al. 2013 [4], O our study, Y indicates a region was identified in a study Consensus deletion RCNAs in six studies These regions are those that appear in at least three studies

Comparison of CNAs on WGS and WES

We detected CNAs in 15 normal-tumor sample pairs based on both WGS data and WES data using Control-FREEC and compared the results from the two platforms. The comparisons were made on different lengths of segments, including large-scale and focal-scale, where large-scale CNAs refer to those spanning more than 25 % of a chromosome arm and focal-scale CNAs refer to those shorter than 25 % of a chromosome arm. The size span of large-scale CNAs is [18.32 161.22] Mb, with a standard deviation of 37.39 Mb. The size span of small-scale CNAs is [0.001 50.65] Mb, with a standard deviation of 2.50 Mb. More than 83 % of focal-scale CNAs are shorter than 1 Mb. For each detected CNA, we used Kolmogorov-Smirnov (KS) test to assess the possibility that it was generated just by chance; furthermore, we searched the WGS and WES data of each sample to see if it contained an event that overlapped the detected CNA with at least 10 % of bases, i.e., we counted how many times it appeared in WGS data and WES data. We then applied Fisher’s exact test to compare the detection frequency of each CNA by the two platforms. The results of large-scale CNAs are shown in Table 6. Totally, 19 regions were detected from the 15 EA samples. We then counted how many times these CNAs were detected by WGS and WES and found none of them was more frequently detected by one platform than the other. In addition, we used KS test and found the false-positive detection rate of each identified CNA was 0.
Table 6

Large-scale CNAs detected in WGS and WES

Boundary (Mb)Width (Mb)TypeWGS countWES count p value
chrX: 2.70-44.0041Gain011.00
chr12:58.34-110.0052.05Gain130.60
chr8:87.08-126.0039.3Gain011.00
chr14:37.15-65.0027.5Gain011.00
chr20:36.79-55.0018.32Gain011.00
chr4:20.88-69.0048.46Gain011.00
chrX:46.95-139.0091.93Loss140.33
chr8:0.12-86.0086.27Loss121.00
chr13:35.76-115.0079.35Loss250.39
chr4:75.26-178.00103.1Loss130.60
chr5:19.47-181.00161.22Loss240.65
chrY:10.01-29.0018.71Loss050.04
chr17:0.00-21.0021.19Loss211.00
chr18:19.84-78.0058.18Loss541.00
chr15:32.02-60.0028.39Loss101.00
chr19:0.25-25.0024.26Loss300.22
chr19:36.81-55.0018.42Loss101.00
chr7:93.74-152.0058.36Loss211.00
chr21:15.00-45.0029.63Loss111.00

p value is used to assess if the detected CNA is more frequently identified by WGS or WES

Large-scale CNAs detected in WGS and WES p value is used to assess if the detected CNA is more frequently identified by WGS or WES The results of focal-scale CNAs are shown in Table 7. WGS identified 21,197 focal-scale CNAs from the 15 samples; among them, 3675 were statistically more frequently detected by WGS than by WES. WES identified 4371 focal-scale CNAs, and 144 of them were identified more frequently by the platform. We checked the false-positive detection rates of the detected CNAs using the KS test and found 19,694/3655 CNAs on WGS/WES with p values < 0.05; these CNAs are less likely to be spurious discoveries, and we only worked on these CNAs afterwards. Among them, about 18 % of CNAs detected by WGS were statistically more frequently identified by WGS than by WES, while only about 3 % of CNAs detected by WES were more frequently identified on the platform. We further investigated if the false-positive detection rates of small CNAs (<200 k) detected on the two platforms were different using one-tailed t test, which resulted in a p value of 2.2E−16 (with means 0.004 vs. 0.009), and it indicates that the false-positive detection rate of those small CNAs is significantly smaller using WGS. One possible explanation is that WGS does not contain the exome-capturing process as in WES, and the local variation/bias of sequence read coverage is smaller [39]. Compared to WGS, WES does not cover intron regions, and it only covers 2.76 % of the whole genome. So finally, we investigated the effect of non-coverage to CNA detection and dealt with small CNAs that only reside in intron regions. As the result, no CNAs detected by WES spanned only on introns, and more than 7000 of such CNAs were identified by WGS, but only 22 % of these intron CNAs were statistically more frequently detected by WGS.
Table 7

Focal-scale CNAs detected in WGS and WES

WGSWES
All CNAs21,197/36754371/144
Filtered CNAs19,694/34563655/121
Small CNAs (<200 k)11,480/21751201/36
Small and intron CNAs7452/16030/0

A number before/indicates how many CNAs detected in the specific platform; a number after/indicates how many CNAs are more frequently detected in the platform. Filtered CNAs represent detected CNAs with p-value < 0.05. Small and Intron CNAs are CNAs with width < 200 k and only cover introns

Focal-scale CNAs detected in WGS and WES A number before/indicates how many CNAs detected in the specific platform; a number after/indicates how many CNAs are more frequently detected in the platform. Filtered CNAs represent detected CNAs with p-value < 0.05. Small and Intron CNAs are CNAs with width < 200 k and only cover introns

Discussion

In this study, we detected RCNAs using NGS data from 145 EA samples and compared them with those from the five microarray studies. We found that the majority of the regions detected by microarrays overlapped the regions identified by NGS and vise versa. Furthermore, based on all these six studies, we identified 22/51 consensus amplification/deletion regions, and our result was found to be the one that is most concordant with the consensus events. From the above observations, we suggest that NGS can replace microarrays to detect RCNAs in EA. However, discrepancy generally exists when comparing each specific region from all the studies. Even for the largest detected events, they are not consistent across the platforms and across the different microarray studies. The largest recurrent deletions detected by microarrays are not consistent. Two of them [3, 14] identified the largest recurrent deletions on chr7:123.66-142.52 (Mb), which corresponds to chr7:105.14-128.47 (Mb) detected both by WGS and WES in our study. The largest deletion detected by WGS and WES is on Chr16:29.48-90.35 (Mb), while only part of the region—chr16:78.13–79.65 (Mb) (in [4, 14, 16]) and chr16:31.93–33.39 (Mb) (in [16])—were detected in the microarray studies. Part of these discrepancies may just be caused by different technologies used in these platforms, such as different hybridization and scanning methods applied in these microarray studies, target-enrichment strategies applied in WES, and bias due to the effect of GC-content and uneven mappability across genome in NGS. Although our study indicates a significant overlap between RCNAs detected using microarray data and NGS data, it is still a challenge to rigorously compare these RCNA calling methods. To further compare these approaches, a well-controlled study design such as a spike-in experiment should be applied in the future. GISTIC analysis is often used to identify driver genes that contribute to cancer development. In this study, we found several potential driver genes in the detected regions that were reported in previous studies, and the results are listed in Table 8. We detected oncogenes such as EGFR, ERBB2, GATA6, KRAS, MYC, and tumor suppressor genes such as APC, ARID1A, ATM, CDKN2A, CDKN2B, CDK6, MCL1, MET, MYB, PDE4D, PRCKI, and PTPRD. Those were also identified in the various previous microarray studies. In another study [11], the authors identified 26 significantly mutated genes based on the 145 WES data used in our study. Among them, ten genes such as TP53, CDKN2A, EYS, ARID1A, TLR4, ARID2, SYNE1, C6orf118, ACTL7B, and SCN10A were also identified in our study, and three of the rest (SMAD4, TLL1, and SMARCA4) are located within 1 Mb of the detected regions of this study. It is worth to point out that some of the potential driver genes such as ERBB2 and TP53 were reported as implicated in the progression of esophageal Barrett to EA [13]. However, CNA regions are usually large and contain many genes. It is difficult to distinguish driver genes from passengers by just studying copy numbers [40]. Although more common driver genes were detected in this study than those found in [16], the discrepancy still implies the need of an integrated approach to identify driver genes of EA, which can consider CNA, mutation, gene expression, and methylation altogether.
Table 8

Potential driver genes reported in previous studies and corresponding RCNAs detected in this study

GenesCytobandBoundary (Mb)
ARID1A1p36.11chr1:19.53-31.73
SKI.PRKCZ1p36.33chr1:0.99-3.16
MCL11q21.3chr1:149.94-156.69
SCN10A3p24.3chr3:12.79-69.03
PRCKI3q26.2chr3:169.43-170.59
PDE4D5q12.1chr5:58.15-59.79
APC5q14.3chr5:79.47-130.52
EYS6q12chr6:64.42-71.14
MYB6q23.3chr6:135.29-135.71
C6orf118, SYNE16q27chr6:151.79-171.12
EGFR7p11.2chr7:55.00-55.46
CDK67q21.2chr7:91.98-92.76
MET7q31.2chr7:115.61-117.83
MYC8q24.21chr8:126.45-129.02
CDKN2A, CDKN2B9p21.3chr9:21.86-23.69
PTPRD9p23chr9:6.64-15.17
TLR49q31.1chr9:70.49-123.15
ATM11q14.1chr11:77.96-111.96
KRAS12p12.1chr12:25.34-25.67
ARID212q12chr12:33.56-48.13
GATA618q11.2chr18:19.75-20.52
DMDXp21.1chrX:30.87-32.66
Potential driver genes reported in previous studies and corresponding RCNAs detected in this study In addition to the common regions, we found some novel ones, including four amplification regions and ten deletion regions with statistically high frequency of appearance in the population. These regions may provide more clues to understand the cancer genomics of EA. In particular, SKI and PRKCZ in 1p36.33 have been reported to contribute to the loss function of TGFBR2 and SMAD4 in cancer [41]. TGFBR2 and SMAD4 are involved in the transforming growth factor (TGF)-β pathway and were identified as driver genes in gastric cancer [42] and colorectal cancer [43]. The novel deletion event identified on Yq12 in our study, along with previously found deletion events on X chromosome (e.g., Xp21.1 and Xp21.2) may help to understand the greater incidence of EA in males over the past three decades. For example, the DMD gene in Xp21.1 was identified as a driver gene in gastric cancer [42], and our result suggests that it may also contribute to EA development. The recurrently detected regions are likely to harbor “common mutations” that are of great interest in cancer studies. However, each tumor sample can contain private driver mutations for that individual patient’s tumor. To verify it, we compared the CNAs detected at individual sample level (Tables 6 and 7) with the recurrent events (Tables 1 and 2). We found only about 25.2 % of individual deletions overlapped identified deletion RCNAs. More extremely, only 10.2 % of amplifications detected at individual sample level overlapped those amplification RCNAs. Even for large-scale events, we found 88.0 % of individual deletions overlapped the recurrent deletion events, and only 35 % of individual amplifications overlapped the recurrent amplification events. The above observation implies that a considerable amount of driver mutations in a specific tumor sample is not located in the recurrent regions and personalized studies are required to identify these rare events. In our study, the medians of spans of recurrent amplification/deletion events are 1.0/6.6 Mb for WES (and possibly WGS) and 0.2/2.1 Mb for those identified only from WGS (Tables 1 and 2). Also, we detected more individual small CNAs by WGS (Table 7). Compared to WES, WGS appears more powerful to detect small events, especially for those that mostly reside in non-coding regions. The limitation of this comparison is that only 15 WGS/WES samples were available. For future studies, a larger sample size should provide more precision to calibrate the performance of WES relative to WGS.

Conclusions

In this study, we detected RCNAs in EA using the NGS data from [11] and compared the results with those from the previous microarray studies. The majority of the events detected in our study also were detected in those previous studies. Furthermore, novel regions and genes were found using NGS technologies. We also compared carefully WGS and WES in detecting CNA on an individual level. We found large-scale segments can be more consistently detected by both platforms, whereas WGS does detect more focal events. Importantly, the recurrent events on the population level appear to be successfully identified by WES. Given that the cost of WES is much less than that of WGS, and the mutations in WES is much more interpretable, our study suggests that WES may be the viable platform to detect recurrent copy number events in EA research.

Methods

Esophageal adenocarcinoma cancer data

The NGS data, including both WGS and WES data, were generated in [11] and stored in the database of Genotypes and Phenotypes (dbGaP) (study accession: phs000598.v1.p1). The dataset is comprised of 145 matched tumor-normal samples. Among them, 15 samples both have WGS and WES data, and the rest 130 samples have only WES data. The EA samples include those from the gastric-esophageal junction, not treated with chemotherapy or radiation before surgery. The tumor samples were examined by a board-certified pathologist and ensured that their carcinoma content >70 %. The samples were sequenced on multiple Illumina HiSeq flow cells to have the average target exome coverage of ~80× in WES data, and sequenced on the Illumina Genome Analyzer Iix or the Illumina HiSeq sequencer with an average of ~30× coverage depth in WGS data. The details of the sample collection, DNA extraction, and sequencing procedures can be found in [11]. The raw sequence data were extracted from dbGaP using the NCBI SRA Toolkit; the sequences were aligned to the NCBI build 37 (hg19) reference using BWA [44] and processed following GATK best practices. The base score re-calibrated bam files were used for CNA detection.

CNA detection methods

Control-FREEC was applied in this study on both WGS and WES data. It divided the genome into small contiguous regions using sliding windows. The read count profiles in each region for normal and tumor samples were computed and normalized accounting for GC-content and mappability. The read count ratios of tumors to matched normal samples were calculated and used as the proxy of the copy number ratios. A LASSO-based algorithm was used to segment the data. LASSO is a widely used generalized linear regression method that involves penalizing the absolute size of its regression coefficients [45]. Using LASSO, a piecewise constant smoothed step profile was used to model the copy number ratios, and the positions with nonzero coefficients were considered as change points. For WES data, the window size was set to 500, and the step size was set to 250, which were recommended by the authors. For WGS data, those parameters were set as 2000 and 1000, respectively. Control-FREEC estimates the normal cell contamination in tumor samples by comparing the observed and predicted copy numbers. It uses the Kolmogorov-Smirnov test to assess the false-positive rate of each detected CNA. Control-FREEC can predict absolute copy numbers if the ploidy information is provided. We used ABSOLUTE [46] to estimate the ploidy of the 15 EA samples using WES data, and the results are listed in the supplement. In this study we classified the identified CNAs based on their status (amplification or deletion) instead of their absolute copy numbers. Control-FREEC ignored genomic regions with mappability less than 0.85 by default, and hence, we did not consider the effect of unmappable regions in this study. GISTIC2.0 was used to identify regions with a statistically high frequency of copy number aberrations over background aberrations. It evaluated both the frequency and the significance to identify regions of interest. The G score measured both the frequency of aberrations, and the magnitude of the copy number changes (log ratio intensity) in each sample. Each location was scored separately for gains and losses. Then locations in each sample were permuted to simulate random aberrations. This random distribution was compared to the observed statistic to identify scores that are statistically significant. A false discovery rate (FDR) multiple testing correction was applied to calculate a q-bound significance score. Within each statistically significant region, a peak region was identified so that the region with a maximal G score and a minimal q value is most likely to contain affected genes. In addition to the q value, it also computed the residual q value, which measured the q value of a peak region after removing events that overlap with other more significant peak regions in the same chromosome. The 145 WES data were segmented using circular binary segmentation (CBS) algorithm [47] and combined to form the segmentation file, while the 15 WGS data were segmented using Control-FREEC as described above. The parameter settings were as follows: amplification threshold = 0.1, deletion threshold = 0.1, broad length cutoff = 0.98, remove X-chromosome = 0, and confidence level = 0.95. Whenever possible, default parameters and recommended settings were used in the implementation of these tools.
  43 in total

1.  VarScan 2: somatic mutation and copy number alteration discovery in cancer by exome sequencing.

Authors:  Daniel C Koboldt; Qunyuan Zhang; David E Larson; Dong Shen; Michael D McLellan; Ling Lin; Christopher A Miller; Elaine R Mardis; Li Ding; Richard K Wilson
Journal:  Genome Res       Date:  2012-02-02       Impact factor: 9.043

2.  Exome sequencing-based copy-number variation and loss of heterozygosity detection: ExomeCNV.

Authors:  Jarupon Fah Sathirapongsasuti; Hane Lee; Basil A J Horst; Georg Brunner; Alistair J Cochran; Scott Binder; John Quackenbush; Stanley F Nelson
Journal:  Bioinformatics       Date:  2011-08-09       Impact factor: 6.937

Review 3.  Clonal evolution in cancer.

Authors:  Mel Greaves; Carlo C Maley
Journal:  Nature       Date:  2012-01-18       Impact factor: 49.962

4.  A comparison of DNA copy number profiling platforms.

Authors:  Joel Greshock; Bin Feng; Cristina Nogueira; Elena Ivanova; Ilana Perna; Katherine Nathanson; Alexei Protopopov; Barbara L Weber; Lynda Chin
Journal:  Cancer Res       Date:  2007-10-29       Impact factor: 12.701

5.  SMAD2, SMAD3 and SMAD4 mutations in colorectal cancer.

Authors:  Nicholas I Fleming; Robert N Jorissen; Dmitri Mouradov; Michael Christie; Anuratha Sakthianandeswaren; Michelle Palmieri; Fiona Day; Shan Li; Cary Tsui; Lara Lipton; Jayesh Desai; Ian T Jones; Stephen McLaughlin; Robyn L Ward; Nicholas J Hawkins; Andrew R Ruszkiewicz; James Moore; Hong-Jian Zhu; John M Mariadason; Antony W Burgess; Dana Busam; Qi Zhao; Robert L Strausberg; Peter Gibbs; Oliver M Sieber
Journal:  Cancer Res       Date:  2012-11-08       Impact factor: 12.701

6.  Detecting common copy number variants in high-throughput sequencing data by using JointSLM algorithm.

Authors:  Alberto Magi; Matteo Benelli; Seungtai Yoon; Franco Roviello; Francesca Torricelli
Journal:  Nucleic Acids Res       Date:  2011-02-14       Impact factor: 16.971

7.  Control-FREEC: a tool for assessing copy number and allelic content using next-generation sequencing data.

Authors:  Valentina Boeva; Tatiana Popova; Kevin Bleakley; Pierre Chiche; Julie Cappo; Gudrun Schleiermacher; Isabelle Janoueix-Lerosey; Olivier Delattre; Emmanuel Barillot
Journal:  Bioinformatics       Date:  2011-12-06       Impact factor: 6.937

8.  Control-free calling of copy number alterations in deep-sequencing data using GC-content normalization.

Authors:  Valentina Boeva; Andrei Zinovyev; Kevin Bleakley; Jean-Philippe Vert; Isabelle Janoueix-Lerosey; Olivier Delattre; Emmanuel Barillot
Journal:  Bioinformatics       Date:  2010-11-15       Impact factor: 6.937

9.  Comparative analysis of methods for identifying recurrent copy number alterations in cancer.

Authors:  Xiguo Yuan; Junying Zhang; Shengli Zhang; Guoqiang Yu; Yue Wang
Journal:  PLoS One       Date:  2012-12-20       Impact factor: 3.240

10.  Exome and whole-genome sequencing of esophageal adenocarcinoma identifies recurrent driver events and mutational complexity.

Authors:  Austin M Dulak; Petar Stojanov; Shouyong Peng; Michael S Lawrence; Cameron Fox; Chip Stewart; Santhoshi Bandla; Yu Imamura; Steven E Schumacher; Erica Shefler; Aaron McKenna; Scott L Carter; Kristian Cibulskis; Andrey Sivachenko; Gordon Saksena; Douglas Voet; Alex H Ramos; Daniel Auclair; Kristin Thompson; Carrie Sougnez; Robert C Onofrio; Candace Guiducci; Rameen Beroukhim; Zhongren Zhou; Lin Lin; Jules Lin; Rishindra Reddy; Andrew Chang; Rodney Landrenau; Arjun Pennathur; Shuji Ogino; James D Luketich; Todd R Golub; Stacey B Gabriel; Eric S Lander; David G Beer; Tony E Godfrey; Gad Getz; Adam J Bass
Journal:  Nat Genet       Date:  2013-03-24       Impact factor: 38.330

View more
  11 in total

1.  A muscle-specific UBE2O/AMPKα2 axis promotes insulin resistance and metabolic syndrome in obesity.

Authors:  Isabelle K Vila; Mi Kyung Park; Stephanie Rebecca Setijono; Yixin Yao; Hyejin Kim; Pierre-Marie Badin; Sekyu Choi; Vihang Narkar; Sung-Woo Choi; Jongkyeong Chung; Cedric Moro; Su Jung Song; Min Sup Song
Journal:  JCI Insight       Date:  2019-07-11

2.  A UBE2O-AMPKα2 Axis that Promotes Tumor Initiation and Progression Offers Opportunities for Therapy.

Authors:  Isabelle K Vila; Yixin Yao; Goeun Kim; Weiya Xia; Hyejin Kim; Sun-Joong Kim; Mi-Kyung Park; James P Hwang; Enrique González-Billalabeitia; Mien-Chie Hung; Su Jung Song; Min Sup Song
Journal:  Cancer Cell       Date:  2017-02-02       Impact factor: 31.743

Review 3.  Genetic alterations in hepatocellular carcinoma: An update.

Authors:  Zhao-Shan Niu; Xiao-Jun Niu; Wen-Hong Wang
Journal:  World J Gastroenterol       Date:  2016-11-07       Impact factor: 5.742

4.  Integrated molecular analysis reveals complex interactions between genomic and epigenomic alterations in esophageal adenocarcinomas.

Authors:  DunFa Peng; Yan Guo; Heidi Chen; Shilin Zhao; Kay Washington; TianLing Hu; Yu Shyr; Wael El-Rifai
Journal:  Sci Rep       Date:  2017-01-19       Impact factor: 4.379

5.  Integrative analysis of copy number and transcriptional expression profiles in esophageal cancer to identify a novel driver gene for therapy.

Authors:  Gaochao Dong; Qixing Mao; Decai Yu; Yi Zhang; Mantang Qiu; Gaoyue Dong; Qiang Chen; Wenjie Xia; Jie Wang; Lin Xu; Feng Jiang
Journal:  Sci Rep       Date:  2017-02-07       Impact factor: 4.379

6.  Stratification of clear cell renal cell carcinoma (ccRCC) genomes by gene-directed copy number alteration (CNA) analysis.

Authors:  H-J Thiesen; F Steinbeck; M Maruschke; D Koczan; B Ziems; O W Hakenberg
Journal:  PLoS One       Date:  2017-05-09       Impact factor: 3.240

7.  Germline and somatic variations influence the somatic mutational signatures of esophageal squamous cell carcinomas in a Chinese population.

Authors:  Jintao Guo; Jiankun Huang; Ying Zhou; Yulin Zhou; Liying Yu; Huili Li; Lingyun Hou; Liuwei Zhu; Dandan Ge; Yuanyuan Zeng; Bayasi Guleng; Qiyuan Li
Journal:  BMC Genomics       Date:  2018-07-16       Impact factor: 3.969

8.  Systematic profiling identifies PDLIM2 as a novel prognostic predictor for oesophageal squamous cell carcinoma (ESCC).

Authors:  Guiqin Song; Jun Xu; Lang He; Xiao Sun; Rong Xiong; Yuxi Luo; Xin Hu; Ruolan Zhang; Qiuju Yue; Kang Liu; Gang Feng
Journal:  J Cell Mol Med       Date:  2019-06-20       Impact factor: 5.310

9.  Sub-region based radiomics analysis for survival prediction in oesophageal tumours treated by definitive concurrent chemoradiotherapy.

Authors:  Congying Xie; Pengfei Yang; Xuebang Zhang; Lei Xu; Xiaoju Wang; Xiadong Li; Luhan Zhang; Ruifei Xie; Ling Yang; Zhao Jing; Hongfang Zhang; Lingyu Ding; Yu Kuang; Tianye Niu; Shixiu Wu
Journal:  EBioMedicine       Date:  2019-05-23       Impact factor: 8.143

10.  Clinical-grade whole-genome sequencing and 3' transcriptome analysis of colorectal cancer patients.

Authors:  Agata Stodolna; Miao He; Mahesh Vasipalli; Zoya Kingsbury; Jennifer Becq; Joanne D Stockton; Mark P Dilworth; Jonathan James; Toju Sillo; Daniel Blakeway; Stephen T Ward; Tariq Ismail; Mark T Ross; Andrew D Beggs
Journal:  Genome Med       Date:  2021-02-25       Impact factor: 11.117

View more

北京卡尤迪生物科技股份有限公司 © 2022-2023.