Pranav Kulkarni, Peter Frommolt.
Abstract
While Next-Generation Sequencing (NGS) can now be considered an established analysis technology for research applications across the life sciences, the analysis workflows still require substantial bioinformatics expertise. Typical challenges include the appropriate selection of analytical software tools, the speedup of the overall procedure using HPC parallelization and acceleration technology, the development of automation strategies, data storage solutions and finally the development of methods for full exploitation of the analysis results across multiple experimental conditions. Recently, NGS has begun to expand into clinical environments, where it facilitates diagnostics enabling personalized therapeutic approaches, but is also accompanied by new technological, legal and ethical challenges. There are probably as many overall concepts for the analysis of the data as there are academic research institutions. Among these concepts are, for instance, complex IT architectures developed in-house, ready-to-use technologies installed on-site as well as comprehensive Everything as a Service (XaaS) solutions. In this mini-review, we summarize the key points to consider in the setup of the analysis architectures, mostly for scientific rather than diagnostic purposes, and provide an overview of the current state of the art and challenges of the field.
Year: 2017 PMID: 29158876 PMCID: PMC5683667 DOI: 10.1016/j.csbj.2017.10.001
Source DB: PubMed Journal: Comput Struct Biotechnol J ISSN: 2001-0370 Impact factor: 7.271
Fig. 1. Overview of the most important challenges in the design and implementation of NGS analysis workflows, with suggestions for how these challenges can be addressed.
Fig. 2. Usage of resources for large-scale analysis of Next-Generation Sequencing data in our local Core Facility: (a) Average filesystem space used for storage of NGS data at different levels of the analysis for the most important NGS applications (light grey: WGS; medium grey: WXS; dark grey: amplicon-based gene panel sequencing; dark red: RNA-Seq; light red: miRNA-Seq; green: ChIP-Seq). (b) Percentage of the analysis runs for several applications. On average, our pipelines process between 1000 and 1500 samples per year. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
List of publicly available bioinformatics workflow systems and comparison of the features they offer.