| Literature DB >> 31878881 |
Samuel Valentini1, Tarcisio Fedrizzi2, Francesca Demichelis2, Alessandro Romanel3.
Abstract
BACKGROUND: Interrogation of whole exome and targeted sequencing NGS data is rapidly becoming a preferred approach for the exploration of large cohorts in the research setting and importantly in the context of precision medicine. Single-base and genomic region level data retrieval and processing still constitute major bottlenecks in NGS data analysis. Fast and scalable tools are hence needed.Entities:
Mesh:
Year: 2019 PMID: 31878881 PMCID: PMC6933905 DOI: 10.1186/s12864-019-6386-6
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Fig. 1PaCBAM performances. Time (a) and memory (b) required by PaCBAM to perform a pileup compared to SAMtools, GATK and Sambamba, using increasing number of threads. The figure focuses on the analysis of a BAM file with ~300X mean coverage and ~30Mbp target size using 30 threads. Note that parallel pileup option is not available for SAMtools and red lines in panel a and b refer to the average of single thread executions
Fig. 2Comparison of PaCBAM results with other tools. a Comparison of PaCBAM and GATK depth of coverage (left) with zoom in the coverage range [0,500] (right); number of positions considered in the analysis and correlation results are reported. b Comparison of allelic fraction of ~ 40 K positions annotated as SNPs in dbSNP database v144 and having an allelic fraction > 0.2 in both PaCBAM and GATK pileup output. c Single-base coverage obtained by running either Picard MarkDuplicates + PaCBAM pileup or PaCBAM pileup with duplicates filtering option active (left) with zoom in the coverage range [0,500] (right). d Regional mean depth of coverage obtained by running either Picard MarkDuplicates + PaCBAM pileup or PaCBAM pileup with duplicates filtering option active