Diogo Pratas (1,2,3), Mari Toppinen (1), Lari Pyöriä (1), Klaus Hedman (1,4), Antti Sajantila (5,6), Maria F Perdomo (1).
1. Department of Virology, University of Helsinki, Haartmaninkatu 3, Helsinki, 00290, Finland.
2. Department of Electronics, Telecommunications and Informatics, University of Aveiro, Campus Universitario de Santiago, 3810-193 Aveiro, Portugal.
3. Institute of Electronics and Informatics Engineering of Aveiro, University of Aveiro, Campus Universitario de Santiago, 3810-193 Aveiro, Portugal.
4. HUSLAB, Helsinki University Hospital, Topeliuksenkatu 32, 00290 Helsinki, Finland.
5. Department of Forensic Medicine, University of Helsinki, Kytösuontie 11, 00300, Helsinki, Finland.
6. Forensic Medicine Unit, Finnish Institute of Health and Welfare, PO Box 30 FI-00271 Helsinki, Finland.
Abstract
BACKGROUND: Advances in sequencing technologies have enabled the characterization of multiple microbial and host genomes, opening new frontiers of knowledge while kindling novel applications and research perspectives. Among these is the investigation of the viral communities residing in the human body and their impact on health and disease. To this end, the study of samples from multiple tissues is critical, yet the complexity of such analyses calls for a dedicated pipeline. We provide TRACESPipe, an automatic and efficient pipeline for the identification, assembly, and analysis of viral genomes that combines DNA sequence data from multiple organs. TRACESPipe relies on cooperation among 3 modalities: compression-based prediction, sequence alignment, and de novo assembly. The pipeline is ultra-fast and additionally provides secure transmission and storage of sensitive data.
FINDINGS: TRACESPipe performed outstandingly when tested on synthetic and ex vivo datasets, identifying and reconstructing all the viral genomes, including those with high levels of single-nucleotide polymorphisms. It also detected minimal levels of genomic variation between different organs.
CONCLUSIONS: TRACESPipe's unique ability to simultaneously process and analyze samples from different sources enables the evaluation of within-host variability. This opens up the possibility of investigating viral tissue tropism, evolution, fitness, and disease associations. Moreover, additional features such as DNA damage estimation, mitochondrial DNA reconstruction and analysis, and exogenous-source controls expand the utility of this pipeline to other fields such as forensics and ancient DNA studies. TRACESPipe is released under GPLv3 and is available for free download at https://github.com/viromelab/tracespipe.