Vera Belova, Anna Shmitko, Anna Pavlova, Robert Afasizhev, Valery Cheranev, Anastasia Tabanakova, Natalya Ponikarovskaya, Denis Rebrikov, Dmitriy Korostin.
Abstract
Exome sequencing is becoming routine in health care because it increases the chance of pinpointing the genetic cause of an individual patient's condition and thus of making an accurate diagnosis. Facilities providing genetic services need to keep track of changes in exome capture technology in order to maximize throughput while reducing the cost per sample. In this study, we compared the newly released exome probe set Agilent SureSelect Human All Exon v8 with the previous probe set, v7. In preparation for higher-throughput exome sequencing on the DNBSEQ-G400, we evaluated target design, coverage statistics, and variant calls across these two exome capture products. Although the target size of the v8 design has changed little compared to the v7 design (35.24 Mb vs 35.8 Mb), the v8 probe design calls more SNVs (+3.06%) and indels (+8.49%) on the common target regions (34.84 Mb) with the same number of raw reads per sample. Our results suggest that the new Agilent v8 probe set for exome sequencing yields better data quality than the current Agilent v7 set.
Keywords: Agilent SureSelect; BGI; Enrichment quality; Exome sequencing; MGISEQ; NGS; Variant calling; WES
Year: 2022 PMID: 35962321 PMCID: PMC9375261 DOI: 10.1186/s12864-022-08825-w
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 4.547
Fig. 1 Venn diagram showing the intersection between the Agilent v7 exome (35.8 Mb), Agilent v8 exome (35.24 Mb), and Gencode V39 coding exons (34.93 Mb): A weighted, B unweighted
Fig. 2 Venn diagram showing the intersection of unique regions longer than 30 bp of the Agilent v7 exome (0.88 Mb) and Agilent v8 exome (0.39 Mb) with: A Gencode V39 coding exons (34.93 Mb); B NCBI RefSeq ALL exons ± 20 bp (95.47 Mb)
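The target comparisons in Figs. 1 and 2 come down to counting how many bases two interval sets share. The following is a minimal sketch of that bookkeeping, assuming BED-style half-open intervals. The function names and toy coordinates are illustrative only; the record does not state which tool the authors used.

```python
def merge(intervals):
    """Merge overlapping or touching half-open (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged


def total_bp(regions):
    """Total bases covered by a design (dict: chrom -> list of intervals)."""
    return sum(end - start
               for ivs in regions.values()
               for start, end in merge(ivs))


def shared_bp(a, b):
    """Bases covered by both designs, found by sweeping two sorted lists."""
    total = 0
    for chrom in a.keys() & b.keys():
        xs, ys = merge(a[chrom]), merge(b[chrom])
        i = j = 0
        while i < len(xs) and j < len(ys):
            lo = max(xs[i][0], ys[j][0])
            hi = min(xs[i][1], ys[j][1])
            if lo < hi:
                total += hi - lo
            # Advance whichever interval ends first
            if xs[i][1] < ys[j][1]:
                i += 1
            else:
                j += 1
    return total


# Hypothetical toy designs, not real probe coordinates
design_a = {"chr1": [(100, 200), (150, 300)], "chr2": [(0, 50)]}
design_b = {"chr1": [(250, 400)], "chr3": [(0, 10)]}

common = shared_bp(design_a, design_b)
only_a = total_bp(design_a) - common
only_b = total_bp(design_b) - common
print(f"shared: {common} bp, unique to A: {only_a} bp, unique to B: {only_b} bp")
```

The three numbers printed correspond to the three areas of a two-set Venn diagram. A weighted diagram scales the areas by these base counts, while an unweighted one only shows which overlaps exist.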
Fig. 3 Bar plots showing the average values for on-target, off-target, duplicate, and unaligned reads in the samples from the v7 and v8 exome pools (downsampled results)
Fig. 4 Performance of the exome protocol in terms of coverage quality in downsampled samples: A dependence of coverage quality of the v7 and v8 target regions on depth (mean ± SD values); B number of positions (bp, y-axis) vs. coverage (x-axis); C box plot of the Fold-80 metric for the samples from the v7 and v8 pools
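For readers unfamiliar with the Fold-80 metric in panel C, here is a small sketch of how it is commonly defined: the mean target depth divided by the depth that at least 80% of target bases reach (the 20th percentile). This is an assumption, since the record does not give the exact formula the authors used. The helper name and the depth values are hypothetical.

```python
import statistics


def fold_80(depths):
    """Mean depth divided by the 20th-percentile depth of target bases.

    A value of 1.0 means perfectly uniform coverage; larger values mean
    more extra sequencing is needed to lift 80% of bases to the mean.
    """
    ordered = sorted(depths)
    # Depth at the 20th percentile: 80% of positions reach at least this depth
    p20 = ordered[len(ordered) * 20 // 100]
    if p20 == 0:
        raise ValueError("20th-percentile depth is zero; Fold-80 is undefined")
    return statistics.mean(ordered) / p20


# Hypothetical per-base depths with the same mean (100x) but different spread
uniform = [100] * 10
uneven = [20, 40, 60, 80, 100, 100, 120, 140, 160, 180]

print(f"uniform: {fold_80(uniform):.2f}")
print(f"uneven:  {fold_80(uneven):.2f}")
```

Both toy samples have the same mean depth, so the difference in the metric reflects coverage uniformity alone, which is what the box plot compares between the two pools.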
Fig. 5 Density plots of mean depth vs. %GC content for: a Agilent v7; b Agilent v8. The data in these plots were collected by merging all samples from the v7 and v8 pools. Density was estimated on a 2D grid. More specifically, we chose data points in a fixed rectangle (%GC content ∈ [0;1], mean depth ∈ [0;1000]), split it into an evenly spaced 200 × 100 grid, and counted the data points in each cell of the grid. Finally, we normalised the grid to the range [0,1] and plotted it using the "jet" colormap from the matplotlib library
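The gridding procedure in the Fig. 5 caption can be sketched as follows. This is an illustrative reconstruction, not the authors' script. It assumes 200 bins along %GC and 100 bins along depth (the caption does not say which axis gets which) and normalises by dividing by the largest cell count. The function names and data points are hypothetical.

```python
def density_grid(gc, depth, gc_bins=200, depth_bins=100,
                 gc_range=(0.0, 1.0), depth_range=(0.0, 1000.0)):
    """Count points per grid cell, then scale counts to the range [0, 1]."""
    (g0, g1), (d0, d1) = gc_range, depth_range
    grid = [[0.0] * gc_bins for _ in range(depth_bins)]
    for g, d in zip(gc, depth):
        if not (g0 <= g <= g1 and d0 <= d <= d1):
            continue  # keep only points inside the fixed rectangle
        # Points on the upper edge fall into the last bin
        col = min(int((g - g0) / (g1 - g0) * gc_bins), gc_bins - 1)
        row = min(int((d - d0) / (d1 - d0) * depth_bins), depth_bins - 1)
        grid[row][col] += 1
    peak = max(max(row) for row in grid)
    if peak > 0:
        grid = [[count / peak for count in row] for row in grid]
    return grid


def plot_grid(grid):
    """Optional: draw the grid with matplotlib's "jet" colormap."""
    import matplotlib.pyplot as plt  # imported lazily; only needed for plotting
    plt.imshow(grid, origin="lower", extent=(0, 1, 0, 1000),
               aspect="auto", cmap="jet")
    plt.xlabel("%GC content")
    plt.ylabel("Mean depth")
    plt.colorbar(label="normalised density")
    plt.show()


# Hypothetical per-region values; the last point lies outside the rectangle
gc_values = [0.5, 0.5, 0.5, 0.25, 1.5]
depth_values = [102, 104, 105, 905, 50]
grid = density_grid(gc_values, depth_values)
```

Calling `plot_grid(grid)` would render the heat map. Dividing by the maximum cell count means the densest cell is always 1.0, so the colour scale shows relative density within each panel rather than absolute counts.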
Average (mean ± SD) results of variant calling of SNVs and indels for the samples from the v7 and v8 pools using their own targets (bed v7, bed v8) and the target intersection (bed v7 vs. v8), filtered by DP > 13 and QUAL > 30
| Design | Variant type | Count on target v7 | Count on target v8 | Count on target v7 ∩ v8 (34.84 Mb) |
|---|---|---|---|---|
| v7 pool | SNV | 25 736 ± 380 | - | 24 374 ± 347 |
| v7 pool | indel | 743 ± 22 | - | 612 ± 18 |
| v8 pool | SNV | - | 25 558 ± 362 | 25 120 ± 355 |
| v8 pool | indel | - | 699 ± 18 | 664 ± 17 |
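The relative gains quoted in the abstract can be checked against the intersection column of the table. The sketch below applies the stated filter thresholds (DP > 13, QUAL > 30) as a simple predicate and computes the percentage change from the rounded means. The helper names are hypothetical and this is not the authors' pipeline.

```python
def passes_filter(dp, qual, min_dp=13, min_qual=30.0):
    """Keep a variant only if both values strictly exceed the thresholds."""
    return dp > min_dp and qual > min_qual


def relative_gain(old, new):
    """Percentage change from old to new."""
    return (new - old) / old * 100.0


# Mean counts on the common 34.84 Mb target, taken from the table above
snv_gain = relative_gain(24374, 25120)
indel_gain = relative_gain(612, 664)

print(f"SNV gain:   {snv_gain:+.2f}%")
print(f"indel gain: {indel_gain:+.2f}%")
```

With the rounded means from the table this gives about +3.06% for SNVs and about +8.50% for indels. The small difference from the reported +8.49% presumably comes from rounding of the tabulated means.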