Nikola Tom, Ondrej Tom, Jitka Malcikova, Sarka Pavlova, Blanka Kubesova, Tobias Rausch, Miroslav Kolarik, Vladimir Benes, Vojtech Bystry, Sarka Pospisilova.
Abstract
BACKGROUND: High-throughput bioinformatics analyses of next generation sequencing (NGS) data often require challenging pipeline optimization. The key problem is choosing appropriate tools and selecting the best parameters for optimal precision and recall.
Keywords: Benchmarking; Next generation sequencing; Parameter optimization; Variant calling
Year: 2018 PMID: 29940847 PMCID: PMC6020218 DOI: 10.1186/s12859-018-2227-x
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Fig. 1 a Once the pipeline is set up for the optimization, all the configurations are run in parallel using raw input data. In this particular example, the emphasis is placed on optimizing the variant calling filters; however, the pipeline design depends on the user’s needs. In the GIAB approach, the benchmarking step is part of the pipeline and is performed by RTG Tools and hap.py. The pipeline results, in the form of stratified performance reports (csv) provided by hap.py, are imported into ToTem’s internal database and filtered using ToTem’s filtering tool. This allows the best performing pipeline to be selected based on the chosen quality metrics, variant type and genomic region. b As in the previous diagram, the optimization is focused on tuning the variant filtering. In contrast to the previous case, Little Profet requires the pipeline results to be represented as tables of normalized variants with mandatory headers (CHROM, POS, REF, ALT). Such data are imported into ToTem’s internal database for pipeline benchmarking by the Little Profet method. Benchmarking is done by comparing the results of each pipeline to the ground truth reference variant calls in the given regions of interest and by estimating TP, FP and FN, together with the quality metrics derived from them: precision, recall and F-measure. To prevent overfitting of the pipelines, Little Profet also calculates the reproducibility of each quality metric over different data subsets. The results are provided in the form of interactive graphs and tables.
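The comparison described in panel b can be illustrated with a minimal sketch. It assumes that variants are already normalized and that a call counts as a true positive only when its (CHROM, POS, REF, ALT) tuple matches a ground truth entry exactly. The names `Variant`, `Metrics` and `benchmark` are hypothetical and are not part of ToTem or Little Profet; restricting the comparison to regions of interest is left out.

```python
from typing import Iterable, NamedTuple, Tuple

# A normalized variant keyed by the mandatory headers (CHROM, POS, REF, ALT).
Variant = Tuple[str, int, str, str]


class Metrics(NamedTuple):
    tp: int
    fp: int
    fn: int
    precision: float
    recall: float
    f_measure: float


def benchmark(called: Iterable[Variant], truth: Iterable[Variant]) -> Metrics:
    """Compare pipeline calls with ground truth by exact tuple matching (an assumption)."""
    called_set, truth_set = set(called), set(truth)
    tp = len(called_set & truth_set)   # called and present in the truth set
    fp = len(called_set - truth_set)   # called but absent from the truth set
    fn = len(truth_set - called_set)   # in the truth set but not called
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return Metrics(tp, fp, fn, precision, recall, f_measure)


# Illustrative data only, not taken from the study.
truth_calls = [("chr17", 100, "C", "T"), ("chr17", 200, "G", "A"), ("chr17", 300, "A", "G")]
pipeline_calls = [("chr17", 100, "C", "T"), ("chr17", 200, "G", "A"), ("chr17", 400, "T", "C")]
result = benchmark(pipeline_calls, truth_calls)
```

Exact tuple matching is the simplest choice and only works if both call sets use the same normalization (left alignment, decomposed multi-allelic sites); otherwise equivalent variants would be counted as both a false positive and a false negative.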
Fig. 2 Each dot represents the arithmetic mean of recall (X-axis) and precision (Y-axis) for one pipeline configuration, calculated by repeated random sub-sampling of 3 input datasets (220 samples). The crosshair lines show the standard deviation of the respective results across the sub-sampled sets. Individual variant callers (Mutect2, VarDict and VarScan2) are colour coded, with the default setting of each distinguished. The default settings and the best performing configurations for each variant caller are also enlarged. In our experiment, the largest variant calling improvement (2.36× higher F-measure compared to default settings, highlighted by an arrow) and also the highest overall recall, precision, precision-recall, and F-measure were registered for VarScan2. For VarDict, a significant improvement in variant detection, mainly in recall (2.42×), was observed. For Mutect2, optimization mainly increased the precision (1.74×). Although its F-measure after optimization did not reach values as high as those of VarScan2 and VarDict, Mutect2’s default setting provided the best results, mainly in terms of recall.
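The mean and standard deviation plotted for each configuration can be sketched as follows. This is only an illustration of repeated random sub-sampling in general: the subset fraction, the number of rounds, the pooling of counts within a subset and the function name `subsample_metrics` are all assumptions, since the caption does not state how ToTem draws or aggregates its subsets.

```python
import random
import statistics
from typing import Dict, List, Tuple

# Per-sample counts for one pipeline configuration: (TP, FP, FN).
Counts = Tuple[int, int, int]


def subsample_metrics(per_sample: List[Counts], fraction: float = 0.5,
                      rounds: int = 100, seed: int = 0) -> Dict[str, float]:
    """Mean and SD of precision and recall over random sample subsets (assumed scheme)."""
    rng = random.Random(seed)          # fixed seed keeps the sketch reproducible
    k = max(1, round(len(per_sample) * fraction))
    precisions, recalls = [], []
    for _ in range(rounds):
        subset = rng.sample(per_sample, k)   # draw k samples without replacement
        tp = sum(c[0] for c in subset)
        fp = sum(c[1] for c in subset)
        fn = sum(c[2] for c in subset)
        precisions.append(tp / (tp + fp) if tp + fp else 0.0)
        recalls.append(tp / (tp + fn) if tp + fn else 0.0)
    return {
        "precision_mean": statistics.mean(precisions),
        "precision_sd": statistics.pstdev(precisions),
        "recall_mean": statistics.mean(recalls),
        "recall_sd": statistics.pstdev(recalls),
    }


# Illustrative counts only: ten samples that behave identically.
uniform = subsample_metrics([(8, 2, 2)] * 10)
```

A configuration whose metrics vary little across subsets (small standard deviation) is less likely to be overfitted to particular samples, which is the reproducibility idea the crosshair lines convey.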