Edoardo Morandi, Matteo Cereda, Danny Incarnato, Caterina Parlato, Giulia Basile, Francesca Anselmi, Andrea Lauria, Lisa Marie Simon, Isabelle Laurence Polignano, Francesca Arruga, Silvia Deaglio, Elisa Tirtei, Franca Fagioli, Salvatore Oliviero.
Abstract
BACKGROUND: Next generation sequencing (NGS) methods are widely adopted for a broad range of scientific purposes, from pure research to health-related studies. Decreasing costs per analysis have led to large amounts of generated data and to the subsequent improvement of the software used to analyse them. As a consequence, many approaches have been developed to chain different software tools into reliable and reproducible workflows. However, the wide range of NGS applications makes it challenging to manage many different workflows without losing reliability.
Year: 2019 PMID: 31613890 PMCID: PMC6793853 DOI: 10.1371/journal.pone.0222512
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. A schematic representation of the workflow of HaTSPiL.
The top-left part shows the starting point of the whole analysis, a set of barcoded FastQ files. The software supports the most common data-handling operations, performing different filtering and alignment steps. The steps can change depending on the barcoding of the sample, and the workflow is highly customisable. To date, HaTSPiL supports a mutation analysis pipeline, and this feature will be improved and extended. Additional analysis pipelines can also be added easily, and a final report-generation step provides immediate, user-friendly output.
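The idea that the executed steps depend on the sample barcode can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the file-name layout `PROJECT-SAMPLE-KIND.fastq`, the barcode values `standard` and `extra_filter`, and all function names are hypothetical and are not taken from HaTSPiL. Each step only appends a suffix, so that the chosen order of steps is visible in the result.

```python
from typing import Callable, Dict, List

# A step takes the name of its input and returns the name of its output.
# All step names below are hypothetical placeholders, not HaTSPiL functions.
Step = Callable[[str], str]


def filter_reads(name: str) -> str:
    return name + ".filtered"


def extra_filter(name: str) -> str:
    return name + ".extra_filtered"


def align(name: str) -> str:
    return name + ".aligned"


def call_mutations(name: str) -> str:
    return name + ".variants"


# Assumed mapping from a barcode field to the list of steps it enables.
STEPS_BY_KIND: Dict[str, List[Step]] = {
    "standard": [filter_reads, align, call_mutations],
    "extra_filter": [filter_reads, extra_filter, align, call_mutations],
}


def parse_kind(filename: str) -> str:
    """Read the third dash-separated field of an assumed
    'PROJECT-SAMPLE-KIND.fastq' file name."""
    stem = filename.rsplit(".", 1)[0]
    return stem.split("-")[2]


def run_workflow(filename: str) -> str:
    """Apply, in order, the steps selected by the barcode of the file."""
    result = filename.rsplit(".", 1)[0]
    for step in STEPS_BY_KIND[parse_kind(filename)]:
        result = step(result)
    return result
```

In this sketch, adding a new analysis amounts to defining another step function and listing it in the mapping, which is one simple way to keep such a workflow customisable.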
Fig 2. Representation of the output in a report automatically generated by HaTSPiL.
(A) Two of the plots showing the quality of a single sequencing run. The values are obtained using the Picard software (https://broadinstitute.github.io/picard/). (B) Two comparisons between samples and their respective controls. This representation scales with the number of samples, from a single sample to all the samples available for a whole project. A control sample is always added, if available, to provide a minimal comparison against another sequencing run. (C) A portion of the table containing the mutations that the variant calling found to be highly damaging. Mutated genes that are known drug targets are highlighted.
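The highlighting described for panel (C) can be sketched as a simple lookup. This is an illustrative assumption, not the report code of HaTSPiL: the function name `flag_drug_targets`, the `gene` and `druggable` keys, and the placeholder gene names are all hypothetical.

```python
from typing import Dict, List, Set


def flag_drug_targets(
    mutations: List[Dict[str, str]], drug_targets: Set[str]
) -> List[Dict[str, object]]:
    """Return a copy of each mutation row with a 'druggable' flag that is
    True when the mutated gene is in the set of known drug targets."""
    return [
        {**row, "druggable": row["gene"] in drug_targets}
        for row in mutations
    ]


# Placeholder rows and gene names, used only to show the shape of the data.
rows = [
    {"gene": "GENE_A", "change": "variant_1"},
    {"gene": "GENE_B", "change": "variant_2"},
]
flagged = flag_drug_targets(rows, drug_targets={"GENE_B"})
```

A report generator could then style the rows whose flag is set, leaving the input rows unchanged.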