Amit Kawalia, Susanne Motameny, Stephan Wonczak, Holger Thiele, Lech Nieroda, Kamel Jabbari, Stefan Borowski, Vishal Sinha, Wilfried Gunia, Ulrich Lang, Viktor Achter, Peter Nürnberg.
Abstract
Next generation sequencing (NGS) has been a great success and is now a standard research method in the life sciences. With this technology, dozens of whole genomes or hundreds of exomes can be sequenced in a rather short time, producing huge amounts of data. Complex bioinformatics analyses are required to turn these data into scientific findings. To run these analyses fast, automated workflows implemented on high performance computers are state of the art. While providing sufficient compute power and storage to meet the NGS data challenge, high performance computing (HPC) systems require special care when utilized for high throughput processing. This is especially true if the HPC system is shared by different users. Here, stability, robustness and maintainability are as important for automated workflows as speed and throughput. To achieve all of these aims, dedicated solutions have to be developed. In this paper, we present the tricks and twists that we utilized in the implementation of our exome data processing workflow. It may serve as a guideline for other high throughput data analysis projects using a similar infrastructure. The code implementing our solutions is provided in the supporting information files.
Year: 2015 PMID: 25942438 PMCID: PMC4420499 DOI: 10.1371/journal.pone.0126321
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. Exome Analysis Workflow.
Checkpoints where a workflow abortion can be triggered are colored yellow; subpipelines are colored green.
Fig 2. Schematic of the HPC Working Environment.
Fig 3. Workflow Context.
Fig 4. Flowchart of the Job Submission Function.
Fig 5. Performance of Multi-Threading.
CPU-time and walltime usage of BWA-Mem and GATK HaplotypeCaller with different numbers of threads.