Daniel Klevebring1, Mårten Neiman1, Simon Sundling1, Louise Eriksson1, Eva Darai Ramqvist2, Fuat Celebioglu3, Kamila Czene4, Per Hall4, Lars Egevad2, Henrik Grönberg4, Johan Lindberg1.
Abstract
Accurate estimation of systemic tumor load from the blood of cancer patients has enormous potential. One avenue is to measure the presence of cell-free circulating tumor DNA in plasma. Various approaches have been investigated, predominantly covering hotspot mutations or customized, patient-specific assays. Therefore, we investigated the utility of using exome sequencing to monitor circulating tumor DNA levels through the detection of single nucleotide variants in plasma. Two technologies, claiming to offer efficient library preparation from nanogram levels of DNA, were evaluated. This allowed us to estimate the proportion of starting molecules measurable by sequence capture (<5%). As cell-free DNA is highly fragmented, we designed and provide software for efficient identification of PCR duplicates in single-end libraries with a varying size distribution. On average, this improved sequence coverage by 38% in comparison to standard tools. By exploiting the redundant information in PCR duplicates, the background noise was reduced to ∼1/35,000. By applying our optimized analysis pipeline to a simulation analysis, we determined the current sensitivity limit to be ∼1/2400, starting with 30 ng of cell-free DNA. Subsequently, circulating tumor DNA levels were assessed in seven breast cancer patients and one prostate cancer patient. One patient carried detectable levels of circulating tumor DNA, as verified by break-point specific PCR. These results demonstrate exome sequencing on cell-free DNA to be a powerful tool for disease monitoring of metastatic cancers. To enable broad implementation in diagnostic settings, the efficiency limitations of sequence capture and the inherent noise levels of the Illumina sequencing technology must be further improved.
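The idea of exploiting redundant information in PCR duplicates can be illustrated with a minimal sketch: reads presumed to derive from the same starting molecule are collapsed into one consensus sequence, and positions where the duplicates disagree are masked as "N". This is not the software described in the abstract. The function name `consensus_from_duplicates`, the `min_agreement` parameter, and the unanimous-agreement default are assumptions made for illustration only.

```python
from collections import Counter


def consensus_from_duplicates(reads, min_agreement=1.0):
    """Collapse equal-length duplicate reads into one consensus sequence.

    Assumed rule: a position is set to 'N' when the most common base is
    supported by less than `min_agreement` of the duplicates.
    """
    if not reads:
        raise ValueError("need at least one read")
    length = len(reads[0])
    if any(len(read) != length for read in reads):
        raise ValueError("reads must have equal length")

    consensus = []
    for column in zip(*reads):  # one column per read position
        base, count = Counter(column).most_common(1)[0]
        agreed = count / len(reads) >= min_agreement
        consensus.append(base if agreed else "N")
    return "".join(consensus)


# Three hypothetical duplicates; the third disagrees at position 3.
duplicates = ["ACGTAC", "ACGTAC", "ACTTAC"]
print(consensus_from_duplicates(duplicates))
```

Masking a discordant position trades a small amount of data (more "N" bases) for a lower chance that a PCR or sequencing error is counted as a mutant read.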
Year: 2014 PMID: 25133800 PMCID: PMC4136786 DOI: 10.1371/journal.pone.0104417
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Figure 1. Duplication rates using the ThruPLEX kit and the Mondrian system.
The proportion of PCR duplicates in relation to sequencing depth, demonstrated by subsampling of deeply sequenced libraries. A) The lower range and B) the higher range of sequencing depth. At any given number of reads sequenced, libraries with an input amount of 10 ng show a lower fraction of duplicated reads than libraries with 1 ng. Furthermore, ThruPLEX-prepared libraries consistently show a lower fraction of duplicated reads than Mondrian-prepared libraries.
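A subsampling analysis of this kind can be sketched as follows. The sketch assumes each read has already been reduced to a key that identifies its starting molecule (for example, a mapping position); the toy library, the function names, and the choice of depths are hypothetical and do not reproduce the data in the figure.

```python
import random


def duplicate_fraction(read_keys):
    """Fraction of reads that repeat a key already present in the set."""
    if not read_keys:
        return 0.0
    return 1.0 - len(set(read_keys)) / len(read_keys)


def subsample_curve(read_keys, depths, seed=0):
    """Duplicate fraction at each subsampled depth (without replacement)."""
    rng = random.Random(seed)
    curve = []
    for depth in depths:
        sample = rng.sample(read_keys, depth)
        curve.append((depth, duplicate_fraction(sample)))
    return curve


# Toy library: 1000 reads drawn from 200 hypothetical starting molecules.
toy_rng = random.Random(1)
library = [toy_rng.randrange(200) for _ in range(1000)]

for depth, fraction in subsample_curve(library, [50, 200, 1000]):
    print(depth, round(fraction, 3))
```

Because the number of distinct starting molecules is fixed, the duplicate fraction rises as more reads are sampled, which is the trend the figure reports for both input amounts.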
Clinical and profiling data.
| | | Tissue/Blood exome sequencing | | | Plasma DNA exome sequencing | | | | | dPCR breakpoint profiling | |
| SampleID | Clinical data | Tumor* | Blood* | Nbr point mutations | cfDNA source | DNA (ng) | Plasma* | Fraction ctDNA | P-value | DNA (ng) | Fraction ctDNA |
| SWE-54_B | Gleason 5+4, T3A | 87 | 86 | 27 | Before surgery | 3 | 39 | 0.001 | 0.265 | NA | NA |
| SWE-54_A | Gleason 5+4, T3A | 87 | 86 | 27 | 1 month after surgery | 8 | 42 | 0 | 1 | NA | NA |
| BC_A | Elston 3, prolif 75%, 21 mm, ER+, PR+, HER2+ | 122 | 147 | 26 | Before surgery | 5 | 62 | 0 | 1 | 5 | NA |
| BC_B | Elston 3, prolif 70%, 18 mm, ER+, HER2+ | 114 | 137 | 184 | Before surgery | 1 | 18 | 0 | 1 | 1 | 0 |
| BC_C | Elston 2, prolif 13%, 16 mm, ER+, PR+ | 111 | 146 | 17 | Before surgery | 5 | 57 | 0 | 1 | 5 | 0 |
| BC_D | Elston 3, prolif 90%, 18 mm, ER+, HER2+ | 101 | 118 | 245 | Before surgery | 3 | 43 | 0.003 | 1.16E-18 | 3 | 0.026 |
| BC_E | Elston 1, prolif 10%, 38 mm, ER+, PR+ | 134 | 188 | 20 | Before surgery | 5 | 55 | 0 | 1 | 5 | NA |
| BC_F | Elston 2, prolif 2%, 12 mm, ER+, PR+ | 175 | 153 | 82 | Before surgery | 5 | 48 | 0 | 1 | 5 | 0 |
| BC_G | Elston 3, prolif 85%, 24 mm, ER+, PR+ | 163 | 168 | 47 | Before surgery | 5 | 39 | 0 | 1 | 5 | 0 |
*Mean coverage throughout the exome.
**The point mutations originate from the two lymph-node metastases that were sequenced in ref 15.
Figure 2. Base alignment quality filtering to reduce noise.
A) The noise rate in background samples for analysis pipelines 1)–3) as the base alignment quality (BAQ) cutoff is increased. Rate is defined here as (total number of mutant reads)/(total number of mutant reads + total number of reference reads) in the background samples at mutated positions. B) The log2 ratio of (proportion of mutant reads)/(proportion of reference reads) left by increasing BAQ cutoffs in the background samples at mutated positions. Colors scale according to analysis pipelines. Pipeline 1) BAQ limited to 40, as qualities were not altered. Pipeline 2) BAQ limited to 45 through merging of overlapping reads. Pipeline 3) BAQ limited to 50 by merging reads and also accounting for concordance between PCR duplicates originating from the same starting molecule.
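The rate defined in panel A can be computed with a few lines of code. The sketch below assumes each observed base at a mutated position is available as a `(baq, is_mutant)` pair and that bases with BAQ greater than or equal to the cutoff are kept. Both the input layout and the inclusive comparison are assumptions for illustration, and the toy observations are invented.

```python
def noise_rate(mutant_reads, reference_reads):
    """Mutant reads divided by mutant plus reference reads."""
    total = mutant_reads + reference_reads
    if total == 0:
        raise ValueError("no reads to compute a rate from")
    return mutant_reads / total


def rate_by_cutoff(bases, cutoffs):
    """Noise rate in background data after filtering at each BAQ cutoff.

    `bases` is a list of (baq, is_mutant) pairs; bases with
    baq >= cutoff are kept (assumed convention). Cutoffs that remove
    every base map to None.
    """
    results = {}
    for cutoff in cutoffs:
        kept = [is_mutant for baq, is_mutant in bases if baq >= cutoff]
        mutant = sum(kept)
        reference = len(kept) - mutant
        results[cutoff] = noise_rate(mutant, reference) if kept else None
    return results


# Invented background observations: errors concentrated at low BAQ.
background = [(20, True), (25, False), (30, True),
              (35, False), (40, False), (40, False)]
print(rate_by_cutoff(background, [0, 30, 40]))
```

Raising the cutoff lowers the rate only if erroneous bases are enriched at low BAQ, and it always reduces the amount of data left, which is the trade-off the two panels show.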
Comparison of analysis pipelines 1–3.
| Pipeline | Proportion of N-bases | Optimal BAQ cutoff | Proportion of data left | Noise rate | Sensitivity |
| 1 | 0.00018 | No cutoff | 1.00 | 1/2176 | 1/852 |
| 1 | 0.00018 | 38 | 0.40 | 1/11451 | 1/1372 |
| 2 | 0.00067 | 43 | 0.17 | 1/8673 | 1/775 |
| 3 | 0.00330 | 46 | 0.61 | 1/35419 | 1/2433 |
Analysis pipelines as described in Material and Methods.
Proportion of bases set to “N” during processing.
Optimal base alignment quality cutoff (BAQ).
Proportion of data left using the optimal BAQ cutoff.
Noise rate defined as the number of mutant bases in the background divided by the number of reference bases.
Sensitivity of exome sequencing to detect ctDNA based on in silico evaluation.
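The "1/N" entries in the table are easier to compare as fold changes relative to the unfiltered row. The short calculation below uses only the noise-rate and sensitivity values listed in the table; the row labels and the helper `fold_change` are names chosen for this illustration.

```python
from fractions import Fraction

# (noise rate, sensitivity) per table row, copied from the table.
rows = {
    "no cutoff": (Fraction(1, 2176), Fraction(1, 852)),
    "pipeline 1": (Fraction(1, 11451), Fraction(1, 1372)),
    "pipeline 2": (Fraction(1, 8673), Fraction(1, 775)),
    "pipeline 3": (Fraction(1, 35419), Fraction(1, 2433)),
}


def fold_change(baseline, value):
    """How many times smaller `value` is than `baseline`."""
    return float(baseline / value)


base_noise, base_sensitivity = rows["no cutoff"]
for name, (noise, sensitivity) in rows.items():
    print(f"{name}: noise {fold_change(base_noise, noise):.1f}x lower, "
          f"detection limit {fold_change(base_sensitivity, sensitivity):.1f}x lower")
```

A fold change below 1 means the row performs worse than the unfiltered baseline on that measure.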
Figure 3. Sensitivity of exome sequencing to track ctDNA.
Analysis pipelines 1)–3) are displayed here with optimal base alignment quality (BAQ) cutoffs, and pipeline 1 is also shown without a cutoff to display the effect of BAQ filtering. As exome sequencing is limited by the efficiency of the capture procedure, 30× whole genome sequencing was also simulated, assuming 3000 variants in the genome. For each ctDNA fraction, 1000 iterations were performed assuming 50 variants in the exome, starting with 10 ml of plasma (30 ng of cfDNA). The sensitivity is defined as the proportion of tests passing the significance threshold in each set of 1000 iterations (p<0.05, Fisher's exact test, comparing the number of variant and reference reads from sample and background).