Kazimierz O Wrzeszczynski1, Vanessa Felice2, Avinash Abhyankar2, Lukasz Kozon2, Heather Geiger2, Dina Manaa2, Ferrah London2, Dino Robinson2, Xiaolan Fang2, David Lin2, Michelle F Lamendola-Essel2, Depinder Khaira2, Esra Dikoglu2, Anne-Katrin Emde2, Nicolas Robine2, Minita Shah2, Kanika Arora2, Olca Basturk3, Umesh Bhanot3, Alex Kentsis4, Mahesh M Mansukhani5, Govind Bhagat5, Vaidehi Jobanputra6.
Abstract
We developed and validated a clinical whole-genome and transcriptome sequencing (WGTS) assay that provides a comprehensive genomic profile of a patient's tumor. The ability to fully capture the mappable genome with sufficient sequencing coverage to precisely call DNA somatic single nucleotide variants, insertions/deletions, copy number variants, structural variants, and RNA gene fusions was analyzed. New York State's Department of Health next-generation DNA sequencing guidelines were expanded for establishing performance validation applicable to whole-genome and transcriptome sequencing. Whole-genome sequencing laboratory protocols were validated for the Illumina HiSeq X Ten platform and RNA sequencing for the Illumina HiSeq 2500 platform for fresh or frozen and formalin-fixed, paraffin-embedded tumor samples. Various bioinformatics tools were also tested, and CIs for sensitivity and specificity thresholds in calling clinically significant somatic aberrations were determined. The validation was performed on a set of 125 tumor-normal pairs. RNA sequencing was performed to call fusions and to confirm the DNA variants or exonic alterations. Here, we present our results and WGTS standards for variant allele frequency, reproducibility, and analytical sensitivity, together with a limit of detection analysis for single nucleotide variant calling, copy number identification, and structural variants. We show that The New York Genome Center WGTS clinical assay can provide a comprehensive patient variant discovery approach suitable for directed oncologic therapeutic applications.
Year: 2018 PMID: 30138725 PMCID: PMC6198246 DOI: 10.1016/j.jmoldx.2018.06.007
Source DB: PubMed Journal: J Mol Diagn ISSN: 1525-1578 Impact factor: 5.568
Cancer Types in Validation Sample Set
| Cancer type | Sample count |
|---|---|
| Brain | 40 |
| Sarcoma | 15 |
| Colon | 11 |
| Lymphoma | 11 |
| Lung | 8 |
| Pancreatic | 8 |
| Leukemia | 6 |
| Bone | 5 |
| Ovarian/cervical | 4 |
| Skin | 3 |
| Kidney | 3 |
| Breast | 3 |
| Multiple myeloma | 3 |
| Liver | 2 |
| Appendiceal | 2 |
| Unknown | 1 |
Figure 1. Positive predictive value (PPV) and sensitivity (SENS) for tumor/normal sequencing depth. Virtual tumor experiment PPV (solid line) and SENS (dotted line) percentages for single nucleotide variant callers MuTect (version 1.1.7) (black lines) and Strelka (version 1.0.14) (red lines) over a range of tumor/normal sequencing coverage. PPV increases with increased normal sequencing depth. SENS increases with increased tumor sequencing depth.
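As a minimal sketch of the two metrics plotted in this figure, the snippet below scores a caller's output against a known truth set using the standard definitions PPV = TP / (TP + FP) and SENS = TP / (TP + FN). The function names and the `(chrom, pos, ref, alt)` tuple representation are illustrative assumptions, not taken from the paper's pipeline.

```python
from math import nan


def ppv(tp: int, fp: int) -> float:
    """Positive predictive value: fraction of reported calls that are true."""
    return tp / (tp + fp) if (tp + fp) else nan


def sensitivity(tp: int, fn: int) -> float:
    """Sensitivity: fraction of true variants that were reported."""
    return tp / (tp + fn) if (tp + fn) else nan


def score_calls(called: set, truth: set) -> dict:
    """Compare a set of called variants with a truth set.

    Variants are assumed (hypothetically) to be hashable keys such as
    (chrom, pos, ref, alt) tuples.
    """
    tp = len(called & truth)   # called and present in the truth set
    fp = len(called - truth)   # called but absent from the truth set
    fn = len(truth - called)   # present in the truth set but missed
    return {"TP": tp, "FP": fp, "FN": fn,
            "PPV": ppv(tp, fp), "SENS": sensitivity(tp, fn)}
```

Scoring each caller at each tumor/normal coverage pair with a function like `score_calls` would yield one PPV and one SENS point per pair, which is the kind of curve the figure describes.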
Figure 2. Virtual tumor variant allele frequency (VAF) and alternate allele read count versus total read count (RC). A: VAF of true positive (TP; red dots) and false positive (FP; black dots) calls made by MuTect versus total RC at tumor/normal coverage (60×:30×, 80×:40×). B: Alternate allele read count of TP (red dots) and FP (black dots) calls made by MuTect versus total RC at tumor/normal coverage (60×:30×, 80×:40×).
Virtual Tumor 95% CI Lower Bound for VAF and Alternate Allele Count per Total Read Count Bin at 80×:40× Coverage
| Total read count, bins | MuTect VAF | MuTect ALT_COUNT | Strelka VAF | Strelka ALT_COUNT |
|---|---|---|---|---|
| 10 | 0 | 0 | 0 | 0 |
| 20 | 0 | 0 | 0 | 0 |
| 30 | 0.152 | 4.33 | 0 | 0 |
| 40 | 0.139 | 5.28 | 0.176 | 6.80 |
| 50 | 0.158 | 7.57 | 0.173 | 8.30 |
| 60 | 0.165 | 9.58 | 0.180 | 10.42 |
| 70 | 0.165 | 11.23 | 0.173 | 11.78 |
| 80 | 0.156 | 12.16 | 0.170 | 13.27 |
| 90 | 0.154 | 13.52 | 0.166 | 14.64 |
| 100 | 0.145 | 14.22 | 0.157 | 15.34 |
MuTect version 1.1.7, Strelka version 1.0.14. CIs calculated from the Gaussian distribution of true positive calls. Zero values represent insufficient data to perform CI calculation.
ALT_COUNT, alternate allele count; VAF, variant allele frequency.
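One way to read the table is as the lower edge of a Gaussian 95% interval fitted to the true positive calls in each read count bin. The sketch below illustrates that interpretation under explicit assumptions: the bound is taken as mean minus 1.96 standard deviations, bins are labelled by their upper edge (for example, reads 31 to 40 fall in bin 40), and a bin with too few calls reports 0. None of these details are confirmed by the source, and all function names are hypothetical.

```python
import math
from statistics import mean, stdev

Z_95 = 1.96  # two-sided 95% quantile of the standard normal distribution


def lower_bound_95(values, min_calls=2):
    """Lower edge of a Gaussian 95% interval; 0.0 when data are insufficient."""
    if len(values) < max(min_calls, 2):  # stdev needs at least two points
        return 0.0
    return mean(values) - Z_95 * stdev(values)


def bin_label(total_reads, width=10):
    """Assumed binning: label each bin by its upper edge (31..40 -> 40)."""
    return math.ceil(total_reads / width) * width


def vaf_lower_bounds_by_bin(tp_calls, width=10, min_calls=2):
    """tp_calls: iterable of (total_reads, alt_count) for true positive calls."""
    vafs_by_bin = {}
    for total_reads, alt_count in tp_calls:
        vaf = alt_count / total_reads
        vafs_by_bin.setdefault(bin_label(total_reads, width), []).append(vaf)
    return {b: lower_bound_95(v, min_calls) for b, v in sorted(vafs_by_bin.items())}
```

The same helper applied to the alternate allele counts instead of the VAFs would give the ALT_COUNT columns under the same assumptions.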
Figure 3. Whole-genome sequencing coverage. A: Representation of the coverage (mean and SD) for all variants in cancer census genes based on our validation sample data. Red filled circles indicate mean coverage of genes in our validation sample set. Black lines indicate SD limits from mean. Mean coverage per gene when targeting 80× is shown. B: Percentage of genome sequencing coverage for all 64 samples at 30× read depth (PCT_30X) and 40× read depth (PCT_40X). FF, fresh or frozen; FFPE, formalin-fixed, paraffin-embedded.
Figure 4. Limit of detection. A and B: Selected sample-specific variants of single nucleotide variant (SNV) and insertion/deletion (indel) (A) and copy number variant (CNV) (B) at 30%, 20%, and 10% tumor content in two samples, respectively. A: Variant allele frequencies in two samples [frozen and formalin-fixed, paraffin-embedded (FFPE)] with original tumor content of 91% and 65%, respectively, are shown with decreasing tumor content. B: Copy number log2(tumor/normal; T/N) of one focal amplification (EGFR, FFPE sample) and three deletions [two whole arm (1p, 19q, frozen sample) and one focal deletion (CDKN2A, FFPE sample)]. Inset shows 19q and 1p CNV log2(T/N) values at 30% to 10% tumor content for better resolution.
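To give a feel for why signals shrink as tumor content is diluted, the following sketch uses the textbook mixture model for a clonal variant in a tumor/normal admixture. It is an idealized illustration, not the paper's analysis: it assumes a clonal event, a diploid normal genome, and no ploidy or coverage normalization, and the function and parameter names are hypothetical.

```python
import math


def expected_vaf(purity, mutated_copies=1, tumor_cn=2, normal_cn=2):
    """Expected VAF of a clonal somatic variant at a given tumor purity.

    purity         -- tumor cell fraction (0..1)
    mutated_copies -- copies carrying the variant in each tumor cell
    tumor_cn       -- total copy number at the locus in tumor cells
    normal_cn      -- total copy number at the locus in normal cells
    """
    total_alleles = purity * tumor_cn + (1 - purity) * normal_cn
    return purity * mutated_copies / total_alleles


def expected_log2_ratio(purity, tumor_cn, normal_cn=2):
    """Expected log2(T/N) for a clonal copy number change at a given purity."""
    mixed_cn = purity * tumor_cn + (1 - purity) * normal_cn
    return math.log2(mixed_cn / normal_cn)


if __name__ == "__main__":
    for purity in (0.30, 0.20, 0.10):
        print(f"purity={purity:.2f}  "
              f"het SNV VAF={expected_vaf(purity):.3f}  "
              f"one-copy loss log2(T/N)={expected_log2_ratio(purity, 1):.3f}")
```

Under these assumptions a heterozygous copy-neutral variant has an expected VAF of half the tumor purity, and a single-copy deletion moves log2(T/N) only slightly below zero at low purity, which is consistent with arm-level deletions needing a magnified inset to be visible.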
Reproducibility Experiment: Sequencing Validation Design
| Assay | Day | Intra: Sequencer 1 | Intra: Sequencer 2 | Intra: Sequencer 3 |
|---|---|---|---|---|
| Inter | Day 1 | CA-0101-p1 (frozen); ONC16-13-p1 (FFPE) | | |
| Inter | Day 2 | CA-0101-p1 (frozen); ONC16-13-p1 (FFPE) | CA-0101-p2; ONC16-03-p2 | CA-101-p3; ONC16-13-p3 |
| Inter | Day 3 | CA-0101-p1 (frozen); ONC16-13-p1 (FFPE) | | |
Sequencing was performed on the same library preparation (p1) for four samples (two frozen and two FFPE) on the same sequencing machine on three separate days (inter-run reproducibility). Two additional library preparations were performed for each sample (p2, p3) and sequenced on the same day on different sequencing machines (intra-technician reproducibility).
FFPE, formalin-fixed, paraffin-embedded.
Figure 5. DNA whole-genome sequencing (WGS) copy number correlation to genotyping chip copy number. The correlation of allele-specific copy number analysis of tumors (ASCAT) discrete copy number per gene to copy number per gene identified by DNA WGS is shown for 40 samples with tumor purity of 30%. The red squares represent the mean values in the distributions. The correlation of the mean values was r = 0.89, with total correlation at r = 0.70. The plot is shown with the y axis limited to copy numbers 0 to 30. T/N, tumor/normal.
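The caption reports two correlation values, one over the per-level means and one over all points. The sketch below shows how such a pair could be computed with a plain Pearson coefficient. It assumes that the means are taken per discrete array-derived copy number level, which the caption suggests but does not state outright; the helper names are illustrative.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def mean_per_level(discrete_cn, wgs_cn):
    """Average the WGS copy number within each discrete array copy number."""
    groups = defaultdict(list)
    for level, value in zip(discrete_cn, wgs_cn):
        groups[level].append(value)
    levels = sorted(groups)
    return levels, [mean(groups[level]) for level in levels]


def both_correlations(discrete_cn, wgs_cn):
    """Return (r over all points, r over per-level means)."""
    levels, means = mean_per_level(discrete_cn, wgs_cn)
    return pearson_r(discrete_cn, wgs_cn), pearson_r(levels, means)
```

Averaging within each level removes per-gene scatter, so the correlation of the means is typically higher than the correlation over all points, which matches the direction of the two values in the caption.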
Minimum Number of Total Reads Required to Detect a Variant
| Type | VAF | Total reads | Method |
|---|---|---|---|
| SNVs | 15% | ≥40 | WGS |
| Indels | 20% | ≥40 | WGS |
| CNVs | NA | ≥30 | WGS |
| SVs | NA | ≥3 | WGS |
| Fusions | NA | ≥5 | RNA-seq |
CNV, copy number variant; Indel, insertion/deletion; NA, not applicable; SNV, single-nucleotide variant; SV, structural variant; VAF, variant allele frequency; WGS, whole-genome sequencing.
Based on Xi et al.
Spanning pairs.
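As a rough illustration of how the table above could be applied as a reporting filter, the sketch below encodes the listed values and checks a call against them. Treating the VAF column as a minimum threshold is an assumption made for this example, as are the `Call` class and the `passes_minimum` function; the source only lists the numbers.

```python
from dataclasses import dataclass
from typing import Optional

# Values transcribed from the table: (VAF or None where NA, minimum total reads).
# For SVs and fusions the read count refers to supporting reads as footnoted.
THRESHOLDS = {
    "SNV": (0.15, 40),
    "Indel": (0.20, 40),
    "CNV": (None, 30),
    "SV": (None, 3),
    "Fusion": (None, 5),
}


@dataclass
class Call:
    """Hypothetical minimal representation of a variant call."""
    kind: str                    # one of the THRESHOLDS keys
    total_reads: int             # read support for the call
    vaf: Optional[float] = None  # variant allele frequency, if applicable


def passes_minimum(call: Call) -> bool:
    """True when the call meets the tabulated read (and VAF) minimums."""
    min_vaf, min_reads = THRESHOLDS[call.kind]
    if call.total_reads < min_reads:
        return False
    if min_vaf is not None and (call.vaf is None or call.vaf < min_vaf):
        return False
    return True
```

For example, `passes_minimum(Call("SNV", 45, 0.18))` is true, while an SNV with 35 total reads fails regardless of its VAF.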