Elisa Pischedda (1), Cristina Crava (1, 2), Martina Carlassara (1), Susanna Zucca (3), Leila Gasmi (1), Mariangela Bonizzoni (4).
1. Department of Biology and Biotechnology, University of Pavia, 27100, Pavia, Italy.
2. ERI BIOTECMED, Universitat de Valencia, 46010, Valencia, Spain.
3. Engenome Srl, 27100, Pavia, Italy.
4. Department of Biology and Biotechnology, University of Pavia, 27100, Pavia, Italy. mariangela.bonizzoni@unipv.it.
Abstract
BACKGROUND: Several bioinformatics pipelines have been developed to detect viral sequences integrated into the human genome, because these integrations are relevant to health: they can contribute to the persistence of viral infection and/or generate genotoxic effects, which often progress to cancer. Recent genomics and metagenomics analyses have shown that viruses also integrate into the genomes of non-model organisms (e.g., arthropods, fish, plants, vertebrates). However, studies of endogenous viral elements (EVEs) in non-model organisms have rarely gone beyond their characterization from reference genome assemblies. In non-model organisms, we lack a thorough understanding of how widespread EVEs are and of their biological relevance, apart from sporadic cases, which nevertheless point to significant roles of EVEs in immunity and regulation of expression. The co-occurrence of repetitive DNA, duplications and/or assembly fragmentation in a genome sequence, together with intrasample variability in whole-genome sequencing (WGS) data, can cause misalignments when reads are mapped to a genome assembly. This phenomenon hinders our ability to properly identify integration sites.
RESULTS: To fill this gap, we developed ViR, a pipeline that resolves the dispersion of reads caused by intrasample variability in sequencing data from both single and pooled DNA samples, thereby improving the detection of integration sites. We tested ViR on both in silico and real sequencing data from a non-model organism, the arboviral vector Aedes albopictus. Potential viral integrations predicted by ViR were molecularly validated, supporting the accuracy of ViR results.
CONCLUSION: ViR will open new avenues to explore the biology of EVEs, especially in non-model organisms. Importantly, although we developed ViR with the identification of EVEs in mind, its application can be extended to detect any lateral transfer event, provided that an ad hoc sequence to interrogate is supplied.