Weiwei Shi1, Charlotte K Y Ng2, Raymond S Lim3, Tingting Jiang1, Sushant Kumar4, Xiaotong Li5, Vikram B Wali1, Salvatore Piscuoglio6, Mark B Gerstein7, Anees B Chagpar8, Britta Weigelt3, Lajos Pusztai9, Jorge S Reis-Filho10, Christos Hatzis11. 1. Department of Medicine, Yale School of Medicine, Yale University, New Haven, CT, USA. 2. Department of Pathology, Memorial Sloan Kettering Cancer Center, New York, NY, USA; Institute of Pathology, University Hospital Basel, Basel, Switzerland; Department of Biomedicine, University of Basel, Basel, Switzerland. 3. Department of Pathology, Memorial Sloan Kettering Cancer Center, New York, NY, USA. 4. Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA; Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA. 5. Department of Medicine, Yale School of Medicine, Yale University, New Haven, CT, USA; Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA. 6. Department of Pathology, Memorial Sloan Kettering Cancer Center, New York, NY, USA; Institute of Pathology, University Hospital Basel, Basel, Switzerland. 7. Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA; Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Computer Science, Yale University, New Haven, CT, USA. 8. Department of Surgery, Yale School of Medicine, Yale University, New Haven, CT, USA; Yale Cancer Center, New Haven, CT, USA. 9. Department of Medicine, Yale School of Medicine, Yale University, New Haven, CT, USA; Yale Cancer Center, New Haven, CT, USA. 10. Department of Pathology, Memorial Sloan Kettering Cancer Center, New York, NY, USA; Human Oncology and Pathogenesis Program, Memorial Sloan Kettering Cancer Center, New York, NY, USA. Electronic address: reisfilj@mskcc.org. 11. Department of Medicine, Yale School of Medicine, Yale University, New Haven, CT, USA; Yale Cancer Center, New Haven, CT, USA. Electronic address: christos.hatzis@yale.edu.
Abstract
Multi-region sequencing is used to detect intratumor genetic heterogeneity (ITGH) in tumors. To assess whether genuine ITGH can be distinguished from sequencing artifacts, we performed whole-exome sequencing (WES) on three anatomically distinct regions of the same tumor with technical replicates to estimate technical noise. Somatic variants were detected with three different WES pipelines and subsequently validated by high-depth amplicon sequencing. The cancer-only pipeline was unreliable, with about 69% of the identified somatic variants being false positive. Even with matched normal DNA for which 82% of the somatic variants were detected reliably, only 36%-78% were found consistently in technical replicate pairs. Overall, 34%-80% of the discordant somatic variants, which could be interpreted as ITGH, were found to constitute technical noise. Excluding mutations affecting low-mappability regions or occurring in certain mutational contexts was found to reduce artifacts, yet detection of subclonal mutations by WES in the absence of orthogonal validation remains unreliable.
Multi-region sequencing is used to detect intratumor genetic heterogeneity (ITGH) in tumors. To assess whether genuine ITGH can be distinguished from sequencing artifacts, we performed whole-exome sequencing (WES) on three anatomically distinct regions of the same tumor with technical replicates to estimate technical noise. Somatic variants were detected with three different WES pipelines and subsequently validated by high-depth amplicon sequencing. The cancer-only pipeline was unreliable, with about 69% of the identified somatic variants being false positive. Even with matched normal DNA for which 82% of the somatic variants were detected reliably, only 36%-78% were found consistently in technical replicate pairs. Overall, 34%-80% of the discordant somatic variants, which could be interpreted as ITGH, were found to constitute technical noise. Excluding mutations affecting low-mappability regions or occurring in certain mutational contexts was found to reduce artifacts, yet detection of subclonal mutations by WES in the absence of orthogonal validation remains unreliable.
Intratumor genetic heterogeneity (ITGH), typically defined as the coexistence of genetically distinct but clonally related cancer cells within the same patient (Yap et al., 2012), can manifest itself spatially within the same lesion or as genetic differences between different metastatic sites and the primary tumor from the same patient (Ding et al., 2010; Gerlinger et al., 2012; Marusyk et al., 2012; Newburger et al., 2013; Yates et al., 2015). The broad availability of massively parallel sequencing has accelerated research into ITGH, and numerous studies have applied whole-exome or targeted-exome sequencing to multiple biopsies from the same cancer, to different metastatic lesions from the same patient, and more recently to multiple single cells from the same cancer (Gerlinger et al., 2012; Hou et al., 2012; Martelotto et al., 2017; Navin et al., 2011; Newburger et al., 2013; Nik-Zainal et al., 2012; Wang et al., 2014; Xu et al., 2012). ITGH represents a snapshot of the tumor’s evolutionary path and is a clinically important phenomenon with implications in prognosis and treatment response (Fisher et al., 2013; Hiley et al., 2014; Jiang et al., 2014; Marusyk et al., 2012; Morris et al., 2016; Turner and Reis-Filho, 2012).The assessment of ITGH, by definition, involves the detection of subclonal, low-frequency variants that are not uniformly present in all cancer cells and is made possible by the availability of bioinformatics tools to detect low-frequency somatic mutations with high sensitivity (Cibulskis et al., 2013; Koboldt et al., 2012; Saunders et al., 2012; Wilm et al., 2012). However, the presence of technical noise in sequencing data is well known (Li, 2014; Nakamura et al., 2011), and it is unclear whether genuine ITGH can be reliably distinguished from artifacts generated during library preparation, sequencing, and data processing (Qi et al., 2015; Smith et al., 2014). Understanding the signal-to-noise characteristics in these experiments is critical for the interpretation of ITGH.Given the implications in prognosis and treatment response, ITGH is an important consideration in the clinical setting. Because the cost of WES and complex informed consent requirements, the inclusion of matching normal samples still represents a limitation in the sequencing of tumor samples in some large clinical trials (Shi et al., 2017). Whether subclonal mutations can be robustly identified in the absence of matching normal samples and whether pooled normal samples from unrelated individuals would serve as a reasonable control should be explored as alternative options for the assessment of ITGH in the clinical setting.In this study, we aimed to assess the reliability of somatic variant detection from whole-exome sequencing (WES) in the context of ITGH (Figures 1A and S1A). To address this question, we performed WES on DNA from biopsies obtained from three anatomically distinct regions of six primary breast cancers (6 3 3 biopsies) and from matched peripheral blood leukocytes. Additionally, to determine the background noise levels as a comparator for the assessment of ITGH, aliquots of the same DNA samples from six distinct biopsies were sequenced twice to generate technical replicates. We examined the reliability of somatic variant detection using three different analysis approaches, namely, using cancer WES data only, cancer data and WES from pooled unrelated (i.e., non-matched) normal samples, and WES data from cancer and matching normal tissue (i.e., blood). Each approach used different sets of detection algorithms and filtering steps in an attempt to control for the specific biases associated with each approach. To generate the “gold standard” benchmark dataset, we re-sequenced all somatic variants identified by any of the three somatic variant detection pipelines in at least one sample using high-depth targeted amplicon sequencing. Given the higher depth obtained by amplicon sequencing, we assessed true ITGH and estimated the frequency of false positives by the different detection pipelines. Finally, we evaluated the sequence patterns and context in which the artifactual mutations occurred to improve the specificity of WES for characterizing genuine ITGH.
Figure 1.
Reliability of Somatic SNVs and INDELs Detected by WES and High-Depth Amplicon Sequencing Validation of Variants
(A) Biopsies were obtained from three anatomically distinct regions of each tumor to assess spatial genomic heterogeneity. One of the three DNA samples was split in two to provide a pair of technical replicates. Somatic variants were detected by three different WES analysis pipelines and subsequently validated by high-depth amplicon sequencing.
(B) Number of somatic mutations identified by WES in each technical replicate, subclassified according to their validation status by high-depth amplicon sequencing.
(C and D) Concordance of somatic SNVs (C) and INDELs (D) defined in each pair of technical replicates using the matched-normal WES analysis pipeline (left) or high-depth amplicon sequencing (right). Discordance between the replicates quantified as the Jaccard distance shown next to each bar and the pathogenicity of variants were assessed as described in Supplemental Experimental Procedures. Putative WES variants re-sequenced with high-depth amplicon sequencing were further classified as absent (VAF < 1%), germline (tumor VAF/germline VAF < 5), or low depth (<503).
(E) Validation status of variants (SNVs and INDELs) detected by WES in technical replicate pairs, categorized as concordant (“con”) or discordant (“dis”) by WES, and the distribution of their AmpliSeq validation status is shown within each bar. See also Figure S1.
RESULTS
Limited Reliability of Somatic Mutations Defined by WES
We performed WES on DNA extracted from three distinct regions of the primary tumor and the matching normal blood cells from six breast cancerpatients, including four with estrogen receptor-positive and two with triple-negative cancers (Table 1). The biopsies were obtained from three anatomically distinct regions of each tumor at least 1 cm apart (i.e., intratumor replicates; Figure 1A) in the context of a prospective institutional review board-approved study to assess intratumor molecular heterogeneity (von Wahlde et al., 2017). All tumor samples had at least 50% tumor cellularity on the basis of pathologic assessment. For the technical replicates, a second library was generated from one of the tumor DNA samples randomly selected from each cancer and sequenced by WES using the same protocol at the same facility on a different day. The mean target depth was 1603 (range 703 to 2203; Table S1), consistent with recommendations for WES (Clark et al., 2011; Sims et al., 2014). Following WES, somatic single-nucleotide variants (SNVs) and small insertion-deletions (INDELs) were identified by three different somatic variant calling pipelines that used (1) the tumor DNA alone (i.e., tumor only, a common approach in clinical practice), (2) tumor DNA and pooled unrelated normal DNA (i.e., cohort normal), or (3) tumor DNA and patient-matched normal DNA (i.e., matched normal; Figure S1A; Supplemental Experimental Procedures). Targeted amplicon sequencing using an orthogonal library generation and an independent sequencing method (AmpliSeq) was performed for all putative somatic variants identified by any of the WES variant calling pipelines on all tumor and matching normal DNA to a median depth of 6053 to define the “gold standard” mutation status for each identified somatic variant (Table S1; Supplemental Experimental Procedures).
Table 1.
Tumor Characteristics and Estimated Tumor Cellularity and Ploidy from WES Data Using FACETS
Sample
Subtype
Stage
Grade
EstimatedPurity (%)
EstimatedPloidy
Case 1 biorepA/techrep 1
ER+/PR+/HER2+
IIA
2
53.8
1.92
Case 1 biorep B
46.3
2.00
Case 1 biorep C
26.0
1.92
Case 1 techrep 2
54.0
1.92
Case 2 biorep A
ER−/PR−/HER2−
IIB
3
40.0
2.14
Case 2 biorepB/techrep 1
32.3
1.79
Case 2 biorep C
NE
1.72
Case 2 techrep 2
38.5
2.03
Case 3 biorep A
ER+/PR+/HER2+
IIB
2
56.4
1.89
Case 3 biorep B
72.8
1.93
Case 3 biorepC/techrep 1
64.9
1.90
Case 3 techrep 2
66.7
1.93
Case 4 biorep A
ER+/PR−/HER2−
IIB
2
60.4
1.98
Case 4 biorepB/techrep 1
61.7
1.95
Case 4 biorep C
57.6
1.99
Case 4 techrep 2
57.6
1.94
Case 5 biorep A
ER−/PR−/HER2−
IIA
3
78.7
2.60
Case 5 biorepB/techrep 1
76.7
2.62
Case 5 biorep C
80.8
2.82
Case 5 techrep 2
76.2
2.61
Case 6 biorepA/techrep 1
ER+/PR+/HER2+
IIA
2
47.8
1.99
Case 6 biorep B
49.4
2.05
Case 6 biorep C
52.9
2.18
Case 6 techrep 2
49.0
1.96
biorep, biological replicate; NE, could not be estimated; techrep, technical replicate; WES, whole-exome sequencing.
To quantify the technical reliability of somatic mutation detection by the matched-normal WES pipeline, the approach that is most frequently used in the research setting to assess ITGH, we compared the somatic mutations identified in the six pairs of technical replicates. In this experiment, tumor samples and normal samples were sequenced to a median coverage of 1843 (range 92–211) and 903 (range 80–138), respectively (Table S1). We identified medians of 74 (range 40–125) and 3 (range 0–13) somatic SNVs and INDELs, respectively, in each of the 12 DNA samples. Considering the high-depth amplicon sequencing results as the “gold standard,” we categorized candidate mutations detected in WES as true somatic, absent, germline-like (incorporating genuine germline variants and artifactual variant alleles caused by alignment biases and/or the sequencing technology) (Kim and Speed, 2013), or low-depth (i.e., technical failure with amplicon sequencing; Table S2; Supplemental Experimental Procedures). Excluding the low-depth (technical failure) variants, a median 82% (range 56%–90% per sample) of the somatic SNVs were confirmed as somatic, a median 5% (range 0%–16%) as germline-like, and the remaining were absent by AmpliSeq (Figure 1B; Table S3).Given that both technical replicates used the same input DNA, we anticipated detecting nearly identical somatic mutations in each pair of replicates. However, only a subset of the somatic SNVs and INDELs were consistently identified in technical replicate pairs, with median Jaccard distances (ranging from 0, perfect agreement, to 1, absence of overlapping variants; Supplemental Experimental Procedures) of 0.39 (range 0.24–0.64) for SNVs (Figure 1C) and 0.17 (range 0–0.50) for INDELs (Figure 1D). Interestingly, the technical replicate pairs with the highest Jaccard distances were those with the lowest tumor cell content as inferred by FACETS (Shen and Seshan, 2016) (Table 1). There were also a small number of potentially pathogenic variants among the discordant variants in technical replicate pairs (range 0–3; Figure 1C, red bars) that could have been misinterpreted as ITGH. We obtained fewer somatic mutations but similarly modest reproducibility with the cohort-normal WES pipeline and far fewer somatic mutations but improved reproducibility using the tumor-only pipeline (see explanation below; Figures S1B and S1C). Comparing only the mutations that were confirmed to be somatic by AmpliSeq between the technical replicate pairs, we observed almost perfect agreement for SNVs (Figure 1C) and for INDELs (Figure 1D) (maximum Jaccard distance of 0.02 and 0, respectively).When we examined the somatic mutations found to be concordant or discordant between pairs of technical replicates on the basis of WES, a median of 95% (range 73%–97%) of the concordant variants were confirmed to be genuinely somatic in at least one of the two technical replicates, compared with a median 33% (range 14%–48%) of the discordant variants (Figure 1E). Of the discordant WES variants, a median of 44% (range 36%–71%) were found to be absent by AmpliSeq (i.e., false positive in one of the two technical replicates), and a median of 7% (range 3%–22%) were germline(-like) variants (i.e., missed by WES in the matching normal). The validation status of 3% (range 1%–8%) of the mutations could not be ascertained, because of technical failure of low AmpliSeq coverage in the validation experiments.Taken together, these results suggest that WES performed at typical sequencing depth may be inadequate for detecting ITGH, particularly when the tumor cell content is less than 50%, as only 62% (range 36%–76%) of the somatic mutations were detected consistently in the technical replicate pairs by WES, with the remaining mutations falsely appearing as discordant.
WES Overestimates True ITGH
Next, we quantified ITGH by comparing somatic variant calls between the geographically distinct biopsies from the same cancer. In this analysis, we also examined how the three different WES analytic approaches differed in the quantification of ITGH (Figure S1A; Supplemental Experimental Procedures). Tumor cellularity of all samples was inferred from WES using FACETS (Shen and Seshan, 2016) (Table 1). We used mixed-effects linear modeling to estimate an average cellularity of 54.7% with intratumor and technical SDs of 7.0% and 2.2%, respectively (Supplemental Experimental Procedures). The matched-normal pipeline detected a median of 150 unique somatic mutations (range 68–186) in each tumor. Compared with the matched-normal pipeline, the cohort-normal pipeline detected a median of 101 mutations (range 48–131; p > 0.05, Wilcoxon test), and the tumor-only pipeline identified a median of 62 mutations (range 36–97; p = 0.01, Wilcoxon test; Figures 2A and 2B; Table S4).
Figure 2.
ITGH as Assessed by Different WES Analysis Pipelines
(A) Total number of somatic variants (SNVs and INDELs) identified in intratumor biopsies by the three WES analysis pipelines. One of the two technical replicates was randomly selected for inclusion in this analysis. Validation status by high-depth amplicon sequencing (i.e., somatic, germline, absent [VAF < 1%], low depth [< 503 coverage]) is shown according to the color key.
(B) Venn diagrams showing the overlap of putative somatic variants detected in intratumor biopsies from each tumor by the three WES analysis pipelines. The last row includes only “true” somatic variants validated by high-depth amplicon sequencing. The size of the circles is proportional to the number of somatic variants in a biopsy, with the numbers representing the total variants and those in parentheses indicating the number of pathogenic variants.
(C) Performance characteristics of the three WES analysis pipelines to identify true somatic variants. Putative somatic variants were considered as “true” if confirmed by high-depth amplicon sequencing. Precision was calculated as TP/(FP + TP) and sensitivity as TP/STP, where TP and FP are the number of true-positive and false-positive variants and STP is the total number of true somatic calls made by all three pipelines. Each circle represents one sample as analyzed by each pipeline, and the size of the circles is proportional to the number of putative somatic variants per biopsy identified by each analysis pipeline. See also Figure S2.
To assess the reliability of the different WES pipelines in detecting ITGH, we compared the WES candidate mutations with the “gold-standard” AmpliSeq-validated somatic variants from the same sample. A median of 62% (range 50%–98%) of the candidate somatic variants detected by the tumor-only pipeline were germline variants, and only 28% (range 2%–45%) were validated as true somatic mutations, highlighting the challenges posed by this commonly used approach in clinical practice. By contrast, a median of 79% (range 59%–91%) and 84% (range 61%–92%) of the variants defined by the cohort-normal and the matched-normal pipelines, respectively, were true somatic mutations (Figures 2A, S1D, and S1E). The tumor-only pipeline had the lowest sensitivity (median 18%, range 0%–29%) and precision (median 28%, range 0%–46%) for identifying true somatic variants. The matched-normal pipeline had the highest sensitivity (median 87%, range 47%–94%) and precision (median 86%, range 62%–94%; Figure 2C), whereas the cohort-normal pipeline had similar precision (median 83%, range 61%–91%) but significantly lower sensitivity (median 48%, range 26%–64%; p < 0.001, Wilcoxon test). Using the matched-normal pipeline, tumor cellularity was positively correlated with sensitivity (Pearson r = 0.7, p = 0.001) and numerically, though not statistically significantly, correlated with precision (Figure S2A). We did not observe the same correlation with the other two pipelines.Next, we estimated the apparent ITGH on the basis of mutations detected by the three WES calling pipelines using the Jaccard distance as the metric of ITGH. Mutations identified as somatic in one or two of the three biopsies and as germ-line-like or absent in the remaining biopsies contributed to ITGH. The tumor-only pipeline had a median Jaccard distance of 0.34 (range 0.19–0.96) compared with the cohort-normal pipeline of 0.70 (range 0.53–0.91) and matched-normal pipeline of 0.60 (range 0.44–0.89) (Figure 2B). The apparently lower ITGH defined by the tumor-only pipeline (p = 0.03 for tumor-only versus cohort-normal, p > 0.05 versus matched-normal, paired Wilcoxon tests) was due to the large number of germline variants misidentified as somatic mutations by the tumor-only pipeline (Figure 2A). When ITGH was estimated on the basis of the AmpliSeq-validated somatic mutations only, the median Jaccard distance was 0.40 (range 0.19–0.61; Figure 2B), which in the context of this study was considered a true estimation of ITGH. We also observed that the proportion of private mutations in a given biopsy was positively correlated with its purity relative to the mean purity of all biopsies for the patient (Pearson r = 0.531, p = 0.023; Figure S2B), suggesting that ITGH may be overestimated in cases with large variability in tumor purity between biopsies. Compared with the apparent ITGH defined by the candidate mutations in the cohort-normal and matched-normal WES pipelines, the true ITGH was significantly smaller (p = 0.015 versus cohort-normal and p = 0.015 versus matched-normal, paired Wilcoxon tests). We noted that 5.8% of the heterogeneous variants, also called branch mutations, that were not present in all biopsies of a given case were predicted to be pathogenic, but there was no statistically significant enrichment in pathogenic mutations compared to non-pathogenic variants among the heterogeneous somatic variants (Figure 2B). Heterogeneous variants were detected in a small number of cancer genes (21 of 306 [6.9%]), and five of these variants (1.6%) were predicted pathogenic.Taken together, these results suggest that the tumor-only WES pipeline misidentifies a substantial proportion of germline variants as somatic mutations. Even when using the matched-normal DNA for mutation detection, the extent of ITGH defined solely on the basis of WES performed at typical sequencing depth is overestimated, potentially affecting actionable cancer genes. For example, a deleterious stop-gain branch mutation in CDC27 (p.Cys71*) was identified as heterogeneous (in one of the three biopsies) by the matched-normal WES pipeline but was not validated by AmpliSeq. False-positive heterogeneous variants were mostly not actionable, however. genes. For example, a deleterious stop-gain branch mutation in CDC27 (p.Cys71*) was identified as heterogeneous (in one of the three biopsies) by the matched-normal WES pipeline but was not validated by AmpliSeq. False-positive heterogeneous variants were mostly not actionable, however.
Characteristics of Artifactual WES Somatic Mutations
To identify the characteristics of the putative mutations identified by WES that were subsequently found not to be truly somatic variants, we examined the alternative coverage (i.e., the number of reads supporting the alternate allele), the variant allele frequency (VAF), and the total depth of coverage of the candidate somatic mutations identified by the three WES pipelines from the intratumor biopsies. The tumor-only pipeline reported a median of 2 variants (range 0–5) with VAF < 10% because of reduced sensitivity of the single-sample mutation detection algorithm at low VAF and the strict filtering imposed to remove potential germline variants (Figures 3A and 3B). Despite aggressive filtering, most of the putative somatic variants from the tumor-only analysis were germline variants with VAF ~ 50%, indicating the presence of a large number of private mutations that have not been cataloged in publicly available databases (Figures 2A, 3A, and 3B). The cohort-normal analysis correctly identified somatic variants with low VAF, including many that were heterogeneous between the biopsies, but missed somatic variants with VAF > 45% because of filtering imposed to remove likely germline variants that may not be present in the pooled normal DNA used as the reference (Figures 2C, 3A, and 3B). The validated somatic mutations defined by the matched-normal pipeline covered the widest range of VAFs. The putative somatic mutations identified by the matched-normal pipeline that were found to be absent by AmpliSeq were mostly in the low-VAF range (median 7.3%, range 0.6%–44%; Figures 2A, 3A, and 3B).
Figure 3.
Coverage Characteristics of True Somatic Variants and False-Positive Mutations in the WES Data
(A and B) Alternative allele coverage (i.e., number of read supporting the alternative allele) (A) and total coverage (B) are plotted against variant allele fraction (VAF; all log10 scale) for the three WES analysis pipelines (rows). The different subsets of WES putative somatic variants according to validation status by high-depth amplicon sequencing are shown as columns: validated somatic mutations, validated homogeneous somatic mutations (i.e., present in all three biopsies of the same tumor), validated heterogeneous somatic mutations (i.e., present in one or two biopsies from the same tumor), and putative somatic mutations identified by the WES pipelines but failed validation by high-depth amplicon sequencing (i.e., putative somatic mutations that were validated to be germline, absent [VAF < 1%] or low coverage [< 50x]).
(C and D) For the matched-normal WES pipeline, (C) total coverage in the tumor (bottom) is plotted against the coverage in the matched normal sample, and (D) WES alternative allele coverage is plotted against VAF and of somatic mutations identified in all the specimens. The validation status categories are the same as in (A) and (B). Density kernel plots of the marginal distributions are included above and to the right of the scatterplots for each of the four categories of mutations. See also Figure S3.
Given that the majority of the ITGH studies carried out to date (Ding et al., 2010; Gerlinger et al., 2012; Nik-Zainal et al., 2012) used matched tumor-normal samples and analysis pipelines similar to our matched-normal pipeline and that mainly low-VAF mutations contributed to ITGH (Figures 3A and 3B), we compared the true- and false-positive somatic mutations (i.e., the validated and the un validated putative somatic variants) and the validated homogeneous (i.e., present in all biopsies from a case) and heterogeneous (i.e., absent in at least one biopsy from a case) somatic mutations derived from the distinct biopsies using the matched-normal WES pipeline. We found that the total depth for the true positive (median 124, range 9–1,028; Figure 3C, green) was significantly higher than for the false-positive mutations (median 78, range 12–926; p < 0.001, Wilcoxon test; Figures 3C and S3A, red). Furthermore, the false positives, compared with true mutations, had significantly lower alternative coverage (median 7, range 1–126 versus median 22, range 4–231; p < 0.001, Wilcoxon test) and VAF (median 11%, range 0.6%–64% versus median 23%, range 3%–93%; p < 0.001, Wilcoxon test; Figures 3D and S3B). There were also significant differences in the VAF distribution of the validated homogeneous and heterogeneous mutations (median 25%, range 1%–93% versus median 6%, range 1%–57%; p < 0.001, Wilcoxon test) and between the validated homogeneous mutations and the false positives (median 25%, range 1%–93% versus median 12%, range 2%–64%; p < 0.001, Wilcoxon test; Figures 3D and S3B). Crucially, the validated mutations implicated in true ITGH had significantly lower VAF than the false positives (p < 0.001, Wilcoxon test; Figures 3D and S3B–S3E), which indicates that true ITGH mutations may display similarly low or lower VAFs compared with the false-positive and false-negative mutations. These results suggest that filtering somatic variants with low VAF or low alternative coverage may improve the precision of the WES pipeline but would also eliminate many true somatic variants that contribute to ITGH. Importantly, false-positive mutations had significantly lower total depth in the matched normal DNA (median 35, range 6–492) compared with true-positive mutations (median 68, range 9–514; p < 0.001, Wilcoxon test) and validated heterogeneous mutations (median 72, range 12–343; p < 0.001, Wilcoxon test; Figure 3C). Many putative somatic mutations (47%) with total depth in the normal DNA of 10 or less were confirmed as germline-like by AmpliSeq, emphasizing the importance of having adequate sequencing depth in the normal samples.
Separating True ITGH from WES Artifacts
Because subclonal mutations are expected to be the predominant contributors to ITGH, we inferred the clonality of all mutations identified by WES using ABSOLUTE (Carter et al., 2012). Subclonal mutations were significantly overrepresented among the validated heterogeneous variants compared with the homogeneous somatic variants (83.2% versus 28.9%; p < 0.001, Fisher’s exact test) but were similarly overrepresented among the artifactual somatic variants (91.5% versus 83.2%; p > 0.05, Fisher’s exact test; Figures 4A and S4A). Importantly, subclonal mutations were also significantly enriched among discordant variants in the WES technical replicates compared with the concordant variants (72.0% versus 24.9%; p < 0.001, Fisher’s exact test; Figures 4A and S4A). These results suggest that compared with clonal variants, subclonal variants detected by WES are more likely to be erroneously attributed to ITGH.
Figure 4.
Mappability and Sequence Context of True and Artifactual Somatic Variants
(A and B) Clonality as defined by ABSOLUTE (A) and mappability of somatic variants (B) identified by the matched-normal WES pipeline. In each panel, the first two sets of bars enumerate the putative somatic variants identified as concordant or discordant in the technical replicates, whereas the bottom four sets of bars enumerate the somatic variants identified in intratumor biopsies and subsequently validated by high-depth amplicon sequencing. High-mappability regions are regions with mappability score of 1 (see Supplemental Experimental Procedures).
(C) Comparison of the mutational spectra of validated somatic heterogeneous mutations and artifactual somatic mutations that failed validation in all samples. The reference base listed (C or T) includes the corresponding reverse complement (G or A).
(D) Distribution of signature weights obtained from the decomposition of mutational signatures from each tumor sample.
(E) Detailed mutational spectra of the trinucleotide context of the pool of mutations detected in all tumor samples. Trinucleotide contexts with significant enrichment in the validated somatic heterogeneous mutations or in the artifactual somatic mutations are shown with an asterisk above the corresponding bars. *p < 0.005. Statistical comparisons in (A), (B), (C), and (E) are based on Fisher’s exact tests. Statistical comparisons in (D) are based on Wilcoxon tests. All statistical tests are two-sided. A p value < 0.05 was considered to indicate statistical significance. See also Figures S4–S6.
Examination of the genomic locations of mutations revealed that 41.1% of the artifactual somatic mutations occurred in regions of low mappability (Derrien et al., 2012) compared with only 6.4% for the validated somatic heterogeneous mutations (p < 0.001, Fisher’s exact test; Figures 4B and S4B). Furthermore, 30.8% of the discordant variants in the WES technical replicates occurred in regions of low mappability compared with 8.4% for the concordant variants (p < 0.001, Fisher’s exact test; Figures 4B and S4B). These results suggest that ambiguous mapping of DNA fragments directly contributes to artifactual somatic variants, even if longer reads (100 bp) were used (Figure S4C). Compared with the validated heterogeneous mutations, artifactual somatic mutations appeared to be significantly enriched in T > C transitions (p < 0.001, Fisher’s exact test; Figures 4C and S5A), particularly in the ApTpA and NpTpG trinucleotide contexts and also in T > G transversions in the GpTpG context (Figure 4E). By contrast, artifactual somatic mutations were significantly depleted in C > G transversions (p < 0.001, Fisher’s exact test; Figures 4C and S5A). Interestingly, validated somatic heterogeneous mutations were enriched for C > G substitutions in the TpCpA and TpCpT contexts (Figure 4E), the characteristic substitution patterns induced by the upregulation of APOBEC cytidine deaminases (Nik-Zainal et al., 2016). However, this enrichment appears to be driven primarily by the case with the largest variability in tumor purity (Figures S2B and S5A). Although a substantial proportion of the artifactual mutations detected were C > T transitions, which have been associated with both the aging process and with labinduced cytosine deamination during DNA library preparation (Alexandrov et al., 2013; Chen et al., 2014), their proportion was similar between the artifactual and validated somatic heterogeneous variants (34.6% versus 33.9%; p > 0.05, Fisher’s exact test; Figure 4C), with mutations occurring in the TpCpN context being significantly underrepresented in the artifactual variants (Figure 4E). We also did not observe an enrichment of C > T substitutions at low VAFs among the artifactual somatic mutations, which would have been indicative of likely FFPE fixation artifacts (Graw et al., 2015; Do and Dobrovic, 2015) (Figure S5B). Mutational signature analysis using previously defined mutational signatures (Alexandrov et al., 2013) identified signatures 5 and 29 to be overrepresented in the artifactual somatic mutations (p = 0.028, Wilcoxon test; Figure 4D), with a median of 16.1% (range 0%–44.4%) of the artifactual mutations classified as signature 5, driven mainly by the T > C transitions reported above. Signature 29 is driven by C > A mutations, predominantly in the ApCpA and GpCpA contexts, that were significantly overrepresented among the artifactual variants (Figure 4E).Finally, we considered whether applying filtering strategies to exclude mutations that occur in low-mappability regions or within sequence contexts enriched in artifactual mutations (C > A mutations in ApCpA and GpCpA contexts, T > C mutations in ApTpA and [C/T/G]pTpG contexts, and T > G mutations in GpTpG context; Figure 4E) can improve the reliability of WES. Excluding putative somatic mutations in low-mappability regions improved the precision for mutations detected in the technical (0.862 versus 0.792 without filtering) and biological replicates (0.876 versus 0.817 without filtering), while reducing moderately the sensitivity (0.858 versus 0.913 for technical, 0.773 versus 0.824 for biological). Filtering by mutational context reduced the sensitivity without appreciably improving precision, as did filtering by the combination of filters (Figure 5).
Figure 5.
Effect of Filtering of Variants Called by WES Pipelines in Technical Replicates and Intratumor Biopsies
Excluding somatic mutations in low-mappability regions improves precision (or positive predictive value) by reducing false-positive calls, while slightly reducing the sensitivity by increased false-negative calls.
DISCUSSION
WES is an appealing and increasingly affordable technology to study the extent of ITGH in an unbiased manner (i.e., without a priori selecting genes of interest for sequencing). The reliability of WES to detect low-frequency mutations, which often account for the majority of ITGH within a cancer, and therefore to resolve clonal architecture (Figure S6) depends on the experimental design, the sequencing depth, tumor purity, and the bioinformatics approaches used to define the somatic variants. To examine the influence of these factors on measuring ITGH, here we generated a dataset incorporating both technical and biological replicates sequenced to depths commonly found in clinical or translational research studies, and validated every putative somatic variant detected with orthogonal high-depth sequencing methods. All datasets generated in this study have been made publicly available to provide a resource for the community to refine analytical tools for ITGH detection from WES.We performed six pairs of technical replicates that involved independent library preparation and sequencing of aliquots from the same DNA extractions, and experiments were performed on different days. The technical replicates revealed an unexpectedly high degree of discordance in the putative somatic variants identified, even using the current best practice matched-normal variant calling analysis approach. Subsequent validation with high-depth amplicon sequencing (6053 median coverage) of all variants identified by WES demonstrated that the majority of the false-positive somatic variants (1) displayed low VAF and were often detected as subclonal in one experiment but not in the other, (2) were in fact germline-like variants that appeared as heterogeneous somatic mutations (Kim and Speed, 2013), or (3) map to genomic regions of low mappability. The enrichment of subclonal mutations among the discordant mutations in the WES technical replicates is expected, given the well-known difficulty in identifying somatic mutations at low VAF. Indeed, comparing the putative mutations that did not validate to the “true” somatic mutations validated by high-depth amplicon sequencing demonstrated that mutations with low VAF and/or low alternative coverage were more difficult to be reliably identified by WES. On the other hand, our results revealed a not insignificant proportion of germline-like false-positive mutation calls. Although some of these germline-like variants are genuinely germline alleles not detected in the matched normal samples, a substantial proportion of these are likely attributed to alignment and sequencing biases (Kim and Speed, 2013) that manifested as false-positive variants at low VAF or alternative coverage. Importantly, our results high-lighted the often overlooked importance of adequate coverage for the matched-normal sample in the accurate identification of somatic mutations, given that mutations that failed validation were, on average, associated with lower coverage in the normal sample. In terms of mappability of genomic regions, we found that false-positive mutation calls were enriched in genomic regions of poor mappability. In fact, we demonstrated that this may represent a reasonable filter if specificity is of paramount importance and some trade-off in sensitivity can be tolerated. Although our study provides direct evidence in support of ITGH, the mutations implicated in ITGH showed substantial overlap with the alterations stemming from intrinsic technical noise in terms of VAF, alternate allele depth, total depth in tumor and normal, as well as mappability. Incorporating unique molecular identifiers into deep sequencing experiments will likely reduce false positives and enhance sensitivity in detecting subclonal mutations with greater confidence (Salk et al., 2018). Furthermore, because private mutations are less likely to be identified in biopsies of relatively low purity, ITGH should be best assessed in biopsies with uniformly high cellularity.Our analysis of the mutational signatures between the validated mutations implicated in ITGH versus the false-positive mutations revealed striking differences. The heterogeneous mutations were enriched in a pattern typically associated with increased APOBEC activity, and this pattern has been previously shown to contribute to ITGH in breast and lung cancers (de Bruin et al., 2014; Ng et al., 2017). On the other hand, the false-positive mutations were enriched for, in particular, C > A and T > G mutations at specific sequence contexts. A recent study of rare polymorphisms determined by high-depth whole-genome sequencing in 300 individuals of diverse genetic origins identified four mutational signatures, two of which were consistent within populations and had a clear association with geographic distribution (Mathieson and Reich, 2017). The origin of the remaining two were uninterpretable, with one of these latter signatures dominated by T > G mutations in the GpTpG context and the other signature highly correlated with COSMIC signature 5, which has been found in all cancer types (Alexandrov et al., 2013) and has been suggested to display clock-like properties suggestive of an association with the aging process (Alexandrov et al., 2015). Our analysis of the sequence context of false-positive variants identified both these features as being significantly enriched in artifactual putative mutations (Figures 4C–4E), strongly suggesting that caution should be exercised in the interpretation of the reported mutational signatures.This study has several limitations. The sample size of 6 breast cancers with 18 biopsies may be too limited to allow generalizing our results on ITGH to all breast cancer subtypes or to other cancers. Of note, the unique nested experimental design incorporating within-sample technical replication, processed and sequenced in the same manner as the intratumor biopsies, provided an estimate of background discordance against which the ITGH results could be interpreted. Additionally, the extensive orthogonal validation by high-depth amplicon sequencing on an independent sequencing platform with very different chemistry from the platform used for WES adds rigor to our study. The sequencing depth attained in this study is comparable with that in previous studies using WES (with subsequent high-depth sequencing for validation) for the genomic characterization of ITGH (Yates et al., 2015); it is plausible, however, that WES at higher depth (i.e., >2503) would mitigate in part the false positives and false negatives, in particular in samples with tumor cell content < 50%. Finally, the three WES analyses pipelines used different calling algorithms and filtering steps that reflected the best practice at the time of the analysis, but future improvements could result in reduced bias.In summary, our study showed that WES at 184 mean depth of coverage in the tumor samples overestimates the extent of ITGH, and the technical noise associated with somatic mutation detection using WES alone can confound true ITGH. Our results also suggested that it is not possible to reduce the false-positive rate through more aggressive minimum depth filtering without affecting the sensitivity of detecting true somatic mutations in the 1%–5% VAF range, but excluding mutations that occur in low-mappability regions of the genome, or in certain mutational contexts could reduce artifactual somatic mutations and provide less biased estimates of ITGH. Nevertheless, orthogonal, high-depth validation experiments are highly desirable in the context of quantifying ITGH.
EXPERIMENTAL PROCEDURES
Tumor Sample Collection
Breast cancer samples were collected from patients with newly diagnosed invasive breast cancer with tumor size > 2 cm at the Yale Cancer Center. Tumor tissues were obtained with three punch biopsies at least 1 cm apart from three different regions of the tumor after pathologic gross examination. Six of these tumors with high enough cellularity (>50%) and high DNA quality from all three biopsies and with matched blood DNA were selected for this study.
WES and Analysis
DNA was extracted and library was prepared using standard protocols, and the exome was captured using the NimbleGen SeqCap EZ Human Exome Kit version 2.0. Sequencing was performed on the HiSeq 2000 in paired-end 75-cycle mode at the Yale Center for Genome Analysis. We used three different analytical pipelines for detecting variants. A single-sample “tumor-only” pipeline, a “cohort-normal” pipeline using an in-house normal reference obtained from ten unrelated normal blood DNA samples, and a “matched-normal” pipeline using the matched-normal DNA from each patient as reference. Further details are provided in Supplemental Experimental Procedures.
Validation of Putative Variants with High-Depth Amplicon Sequencing
Variants identified by WES were subjected to validation with high-depth amplicon sequencing using custom AmpliSeq panels on the same tumor and matched normal DNA samples. Amplicon sequencing was performed to a median depth of 6003. Further details are provided in Supplemental Experimental Procedures.
Mutational Signature Analysis and Mappability
Mutational signature analyses comparing the validated variants that contribute to ITHG and the WES false-positive calls were performed for individual tumor samples and for the pooled mutations over all samples using the R package deconstructSigs (Rosenthal et al., 2016). Mappability of SNVs was assessed using the CRG Alignability track (Derrien et al., 2012) in the UCSC Genome Browser. Additional details are provided in Supplemental Experimental Procedures.
Authors: Marco Gerlinger; Andrew J Rowan; Stuart Horswell; James Larkin; David Endesfelder; Eva Gronroos; Pierre Martinez; Nicholas Matthews; Aengus Stewart; Charles Swanton; M Math; Patrick Tarpey; Ignacio Varela; Benjamin Phillimore; Sharmin Begum; Neil Q McDonald; Adam Butler; David Jones; Keiran Raine; Calli Latimer; Claudio R Santos; Mahrokh Nohadani; Aron C Eklund; Bradley Spencer-Dene; Graham Clark; Lisa Pickering; Gordon Stamp; Martin Gore; Zoltan Szallasi; Julian Downward; P Andrew Futreal Journal: N Engl J Med Date: 2012-03-08 Impact factor: 91.245
Authors: Thomas Derrien; Jordi Estellé; Santiago Marco Sola; David G Knowles; Emanuele Raineri; Roderic Guigó; Paolo Ribeca Journal: PLoS One Date: 2012-01-19 Impact factor: 3.240
Authors: Stefan Graw; Richard Meier; Kay Minn; Clark Bloomer; Andrew K Godwin; Brooke Fridley; Anda Vlad; Peter Beyerlein; Jeremy Chien Journal: Sci Rep Date: 2015-07-23 Impact factor: 4.379
Authors: Lucy R Yates; Moritz Gerstung; Stian Knappskog; Christine Desmedt; Gunes Gundem; Peter Van Loo; Turid Aas; Ludmil B Alexandrov; Denis Larsimont; Helen Davies; Yilong Li; Young Seok Ju; Manasa Ramakrishna; Hans Kristian Haugland; Peer Kaare Lilleng; Serena Nik-Zainal; Stuart McLaren; Adam Butler; Sancha Martin; Dominic Glodzik; Andrew Menzies; Keiran Raine; Jonathan Hinton; David Jones; Laura J Mudie; Bing Jiang; Delphine Vincent; April Greene-Colozzi; Pierre-Yves Adnet; Aquila Fatima; Marion Maetens; Michail Ignatiadis; Michael R Stratton; Christos Sotiriou; Andrea L Richardson; Per Eystein Lønning; David C Wedge; Peter J Campbell Journal: Nat Med Date: 2015-06-22 Impact factor: 53.440
Authors: Tiffany M Delhomme; Patrice H Avogbe; Aurélie A G Gabriel; Nicolas Alcala; Noemie Leblay; Catherine Voegele; Maxime Vallée; Priscilia Chopard; Amélie Chabrier; Behnoush Abedi-Ardekani; Valérie Gaborieau; Ivana Holcatova; Vladimir Janout; Lenka Foretová; Sasa Milosavljevic; David Zaridze; Anush Mukeriya; Elisabeth Brambilla; Paul Brennan; Ghislaine Scelo; Lynnette Fernandez-Cuesta; Graham Byrnes; Florence L Calvez-Kelm; James D McKay; Matthieu Foll Journal: NAR Genom Bioinform Date: 2020-04-20
Authors: Ryan L Powles; Vikram B Wali; Xiaotong Li; William E Barlow; Zeina Nahleh; Alastair M Thompson; Andrew K Godwin; Christos Hatzis; Lajos Pusztai Journal: Clin Cancer Res Date: 2020-01-09 Impact factor: 12.531
Authors: Valsamo Anagnostou; Noushin Niknafs; Kristen Marrone; Daniel C Bruhm; James R White; Jarushka Naidoo; Karlijn Hummelink; Kim Monkhorst; Ferry Lalezari; Mara Lanis; Samuel Rosner; Joshua E Reuss; Kellie N Smith; Vilmos Adleff; Kristen Rodgers; Zineb Belcaid; Lamia Rhymee; Benjamin Levy; Josephine Feliciano; Christine L Hann; David S Ettinger; Christos Georgiades; Franco Verde; Peter Illei; Qing Kay Li; Alexander S Baras; Edward Gabrielson; Malcolm V Brock; Rachel Karchin; Drew M Pardoll; Stephen B Baylin; Julie R Brahmer; Robert B Scharpf; Patrick M Forde; Victor E Velculescu Journal: Nat Cancer Date: 2020-01-13
Authors: Johannes G Reiter; Marina Baretti; Jeffrey M Gerold; Alvin P Makohon-Moore; Adil Daud; Christine A Iacobuzio-Donahue; Nilofer S Azad; Kenneth W Kinzler; Martin A Nowak; Bert Vogelstein Journal: Nat Rev Cancer Date: 2019-08-27 Impact factor: 60.716
Authors: Sebastian Lange; Thomas Engleitner; Sebastian Mueller; Roman Maresch; Maximilian Zwiebel; Laura González-Silva; Günter Schneider; Ruby Banerjee; Fengtang Yang; George S Vassiliou; Mathias J Friedrich; Dieter Saur; Ignacio Varela; Roland Rad Journal: Nat Protoc Date: 2020-01-06 Impact factor: 13.491
Authors: Maxime Tarabichi; Adriana Salcedo; Amit G Deshwar; Máire Ni Leathlobhair; Jeff Wintersinger; David C Wedge; Peter Van Loo; Quaid D Morris; Paul C Boutros Journal: Nat Methods Date: 2021-01-04 Impact factor: 28.547