Quan Peng, Ravi Vijaya Satya, Marcus Lewis, Pranay Randad, Yexun Wang.
Abstract
BACKGROUND: PCR amplicon sequencing has been widely used as a targeted approach for both DNA and RNA sequence analysis. High multiplex PCR has further enabled the enrichment of hundreds of amplicons in a single reaction. However, the performance of PCR amplicon sequencing can be degraded by high duplicate read rates, polymerase artifacts, and PCR amplification bias. Researchers have recently made progress in addressing these shortcomings by incorporating molecular barcodes into PCR primer design. So far, most of this work has been demonstrated with only one to a few primer pairs, which limits the size of the region that can be analyzed.
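
To make the role of molecular barcodes concrete, the following is a minimal sketch of how reads sharing a barcode might be collapsed into one consensus read, which is how barcodes can suppress duplicate reads and polymerase errors. It is not the authors' pipeline: the `(amplicon_id, barcode, sequence)` tuple layout, the function name `collapse_by_barcode`, and the simple per-position majority vote are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def collapse_by_barcode(reads):
    """Group reads by (amplicon, barcode) and build a majority-vote consensus.

    `reads` is an iterable of (amplicon_id, barcode, sequence) tuples.
    This layout is hypothetical and chosen only for illustration.
    """
    families = defaultdict(list)
    for amplicon, barcode, seq in reads:
        families[(amplicon, barcode)].append(seq)

    consensus = {}
    for key, seqs in families.items():
        # Simplification: truncate to the shortest read and ignore indels.
        length = min(len(s) for s in seqs)
        consensus[key] = "".join(
            Counter(s[i] for s in seqs).most_common(1)[0][0]
            for i in range(length)
        )
    return consensus


# Toy input: three reads from one barcoded molecule, one from another.
reads = [
    ("amp1", "AACGT", "ACGTAC"),
    ("amp1", "AACGT", "ACGTAC"),
    ("amp1", "AACGT", "ACTTAC"),  # one read carries a single-base error
    ("amp1", "GGTCA", "ACGTAC"),
]
consensus = collapse_by_barcode(reads)
molecule_count = len(consensus)  # distinct barcode families for this toy input
```

In this toy input the single-base error is outvoted within its barcode family, and the number of families serves as a duplicate-free molecule count.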
Year: 2015 PMID: 26248467 PMCID: PMC4528782 DOI: 10.1186/s12864-015-1806-8
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Fig. 1 Overview of the high multiplex amplicon barcoding PCR
Summary of the sequencing runs for in vitro DNA mixtures
| Input amount | 10 ng | 20 ng | 40 ng | 80 ng | 10 ng | 80 ng |
|---|---|---|---|---|---|---|
| LA (limited amplification) cycles | 1 | 1 | 1 | 1 | 3 | 3 |
| Total reads | 5,161,694 | 5,029,394 | 4,181,410 | 4,568,978 | 4,612,940 | 8,718,690 |
| On-target reads | 4,449,285 | 4,226,778 | 3,528,081 | 4,051,939 | 3,591,578 | 7,704,936 |
| On-target read pairs | 2,152,647 | 2,066,226 | 1,707,379 | 1,972,168 | 1,715,098 | 3,659,067 |
| Median raw read depth | 9,263 | 8,558 | 6,454 | 6,915 | 7,701 | 16,275 |
| Mean raw read depth | 10,514 | 10,096 | 8,332 | 9,628 | 8,271 | 17,635 |
| % Bases >0.2x mean depth | 95 | 94 | 92 | 90 | 95 | 96 |
| Median consensus read depth | 98 | 195 | 346 | 544 | 209 | 889 |
| Mean consensus read depth | 98 | 187 | 336 | 530 | 208 | 839 |
| Mean raw read/consensus read | 53 | 28 | 13 | 11 | 22 | 16 |
| Median raw read/consensus read | 53 | 26 | 11 | 8 | 20 | 10 |
| Bases in target region | 39,231 | 39,231 | 39,231 | 39,231 | 39,231 | 39,231 |
| GIAB high-confidence region for NA12878 | 29,343 | 29,343 | 29,343 | 29,343 | 29,343 | 29,343 |
| NA12878 unique SNVs | 134 | 134 | 134 | 134 | 134 | 134 |
| Detected true positives | 17 | 40 | 76 | 93 | 39 | 114 |
| Detected false positives | 0 | 2 | 3 | 5 | 4 | 3 |
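
The table's counts can be turned into per-condition rates with a few lines of Python. This sketch assumes that sensitivity is defined as detected true positives divided by the 134 NA12878 unique SNVs, and that the on-target fraction is on-target reads divided by total reads; neither definition is stated explicitly in this section, and the condition labels are shorthand introduced here.

```python
# Values copied from the table above, in column order.
conditions = [
    "10 ng, 1 cycle", "20 ng, 1 cycle", "40 ng, 1 cycle",
    "80 ng, 1 cycle", "10 ng, 3 cycles", "80 ng, 3 cycles",
]
total_reads = [5161694, 5029394, 4181410, 4568978, 4612940, 8718690]
on_target_reads = [4449285, 4226778, 3528081, 4051939, 3591578, 7704936]
true_positives = [17, 40, 76, 93, 39, 114]
UNIQUE_SNVS = 134  # NA12878 unique SNVs in the table

# Assumed definition: sensitivity = true positives / known unique SNVs.
sensitivity = {
    name: tp / UNIQUE_SNVS for name, tp in zip(conditions, true_positives)
}

# Assumed definition: on-target fraction = on-target reads / total reads.
on_target_fraction = {
    name: on / total
    for name, on, total in zip(conditions, on_target_reads, total_reads)
}

for name in conditions:
    print(f"{name}: sensitivity={sensitivity[name]:.3f}, "
          f"on-target={on_target_fraction[name]:.3f}")
```

Under these assumptions, sensitivity rises with input amount and with the additional limited amplification cycles, in line with the trend described for Fig. 2.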
Fig. 2 Comparison of sensitivity and false-positive rates for different input DNA amounts. (a and b) The x-axis represents the input quantity of the DNA admixture. The left y-axis represents detection sensitivity for SNVs at 1–2 % fraction. The right y-axis represents the false-positive rate. (a) Performance using the original protocol. (b) The sensitivity of SNV detection was significantly higher after adding 3 cycles of limited amplification. (c) The ROC curve from the 80 ng, 3-cycle data with or without using the molecular barcode information
Fig. 3 ERCC RNA quantification using amplicon barcoding. (a) Correlation between “measured” and “expected” numbers for each ERCC RNA transcript represented by each amplicon. The x-axis represents log2 values of known copies in the ERCC RNA spike-in mix. The y-axis represents log2 values of average barcode or read counts for each amplicon (n = 3). Both barcode counts and read counts from different sequencing runs were first normalized to a mean value of 10,000 for each run before being averaged. (b) CV computed on the basis of barcode counts vs. raw read counts. Three independent target enrichment experiments were performed. The solid black line represents the diagonal and the two red dashed lines represent 2-fold intervals. (c) CV vs. mean plot for both barcode counts and read counts. The x-axis represents the mean value for each amplicon on the basis of either barcodes or reads. The corresponding CV is plotted on the y-axis. The theoretical Poisson CV is plotted as the black dashed line
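
The normalization and CV calculation described in this caption can be sketched as follows. The counts are made-up placeholders, not data from the paper, and the helper names are hypothetical. The sketch assumes the sample standard deviation (n − 1 denominator), which the caption does not specify. The Poisson reference uses the standard result that a Poisson variable with mean μ has variance μ, so its CV is 1/√μ.

```python
import math
import statistics


def normalize_run(counts, target_mean=10_000):
    """Scale one run so that its mean count per amplicon equals target_mean."""
    scale = target_mean / statistics.fmean(counts)
    return [c * scale for c in counts]


def per_amplicon_mean_cv(runs):
    """Return (mean, CV) per amplicon across normalized replicate runs."""
    normalized = [normalize_run(run) for run in runs]
    results = []
    for values in zip(*normalized):  # one tuple of replicates per amplicon
        mean = statistics.fmean(values)
        # Assumption: sample standard deviation across the replicates.
        results.append((mean, statistics.stdev(values) / mean))
    return results


def poisson_cv(mean):
    """Theoretical CV of a Poisson count with the given mean."""
    return 1 / math.sqrt(mean)


# Placeholder barcode counts: 3 replicate runs x 4 amplicons (illustrative).
runs = [
    [120, 950, 30, 4100],
    [135, 900, 41, 3900],
    [110, 1010, 25, 4300],
]
mean_cv = per_amplicon_mean_cv(runs)
normalized_first_run = normalize_run(runs[0])
```

Note that the Poisson expectation applies to actual molecule counts, so comparing it with values that have been rescaled is itself a simplification of this sketch.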
Summary of the sequencing runs for FFPE samples
| Sample ID | T5 | LN2 | LT2 | T2 |
|---|---|---|---|---|
| Total reads | 1,053,646 | 11,352,414 | 13,518,788 | 9,911,538 |
| On-target reads | 1,015,755 | 11,027,934 | 13,034,688 | 9,642,483 |
| On-target read pairs | 501,517 | 5,417,637 | 6,346,625 | 4,745,809 |
| Median read depth | 820 | 9,653 | 11,460 | 8,420 |
| Mean read depth | 1,079 | 11,534 | 13,526 | 10,120 |
| % Bases >0.2x mean depth | 90 | 93 | 94 | 93 |
| Median consensus read depth | 30 | 390 | 908 | 151 |
| Mean consensus read depth | 33 | 376 | 891 | 146 |
| Mean raw read/consensus read | 15 | 14 | 8 | 33 |
| Median raw read/consensus read | 15 | 13 | 7 | 32 |
| Bases in target region | 86,544 | 86,544 | 86,544 | 86,544 |
| Called SNVs | 77 | 129 | 134 | 141 |