Achim Quaiser, Mart Krupovic, Alexis Dufresne, André-Jean Francez, Simon Roux.
Abstract
A new group of viruses carrying naturally chimeric single-stranded (ss) DNA genomes that encompass genes derived from eukaryotic ssRNA and ssDNA viruses has been recently identified by metagenomic studies. The host range, genomic diversity, and abundance of these chimeric viruses, referred to as cruciviruses, remain largely unknown. In this article, we assembled and analyzed thirty-seven new crucivirus genomes from twelve peat viromes, representing twenty-four distinct genome organizations, and nearly tripling the number of available genomes for this group. All genomes possess the two characteristic genes encoding the conserved capsid protein (CP) and a replication protein. Additional ORFs were conserved only in nearly identical genomes with no detectable similarity to known genes. Two cruciviruses possess putative introns in their replication-associated genes. Sequence and phylogenetic analyses of the replication proteins revealed intra-gene chimerism in at least eight chimeric genomes. This highlights the large extent of horizontal gene transfer and recombination events in the evolution of ssDNA viruses, as previously suggested. Read mapping analysis revealed that members of the 'Cruciviridae' group are particularly prevalent in peat viromes. Sequences matching the CP ranged from 0.6 up to 10.9 percent in the twelve peat viromes. In contrast, from sixty-nine available viromes derived from other environments, only twenty-four contained cruciviruses, which on average accounted for merely 0.2 percent of sequences. Overall, this study provides new genome information and insights into the diversity of chimeric viruses, a necessary first step in progressing toward an accurate quantification and host range identification of these new viruses.
Keywords: ssDNA viruses; viral metagenomics; virus diversity; virus ecology
Year: 2016 PMID: 29492276 PMCID: PMC5822885 DOI: 10.1093/ve/vew025
Source DB: PubMed Journal: Virus Evol ISSN: 2057-1577
Figure 1. Maximum-likelihood phylogenetic analysis of full-length CP sequences and associated genome structure of crucivirus-like genomes from Sphagnum-dominated peat viromes. A total of 333 unambiguously aligned positions from sixty-seven sequences were used in the analysis. Bootstrap values above 50 are indicated at the nodes. The scale bar indicates the number of substitutions per position for a unit branch length. Three groups are highlighted by different background colors. Green: ssRNA viruses (Tombusviridae); blue: peat origin; red: RDHV or chimeric viruses origin. Genome organizations are drawn to scale. Red: replication protein (RC-Rep); yellow: CP; green: intron in RC-Rep; grey: other ORFs. Blue triangle: potential replication origin. Star (*): linear genomes. I–VI: CRESS virus types.
Figure 2. Characterization of CP diversity. (A) Domain organization of the tombusvirus-like CPs of chimeric viruses. The nucleic acid binding (R), shell (S), and projection (P) domains are indicated. The pattern of sequence conservation of fifty-six CP sequences is shown underneath the schematic domain organization cartoon. (B) Structural model of CHIV10 (Roux et al. 2013) showing the unequal distribution of sequence conservation in the context of the protein structure. The color key for the sequence identity is provided underneath the 3D model. (C) Comparison of the sequence conservation within the S-domain with that of the full-length CPs.
Figure 3. Maximum-likelihood phylogenetic analysis of full-length replication protein sequences and associated endonuclease and helicase domain affiliation. A total of 207 unambiguously aligned positions from eighty-nine sequences were used in the full-length phylogenetic analysis. Bootstrap values above 50 are indicated at the nodes. The scale bar indicates the number of substitutions per position for a unit branch length. Three groups are highlighted by different background colors. Green: Geminiviridae; blue: Circoviridae; red: Nanoviridae. Columns indicate the closest relative of the endonuclease and the helicase protein domains as determined by domain-specific phylogenetic analysis. C: Circoviridae-like domain (blue); G: Geminivirus-like domain (green); N: Nanoviridae-like domain (red). Bold: all confirmed crucivirus-like viral sequences. Bold blue: chimera-like viral sequences from peat samples. X: potential chimeric replication protein. NF: non-functional Walker B motif in the SF3 helicase domain.
Figure 4. Relative proportions of crucivirus-like CP-encoding sequences in six fen and six bog viromes. The viromes were analyzed by BLASTx against sixty-eight CP sequences. Best matches were counted and normalized to the number of sequences in each virome. The grouping is based on the phylogenetic affiliation of the CPs.
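The counting and normalization step described for Figure 4 can be illustrated with a minimal Python sketch. It is not the authors' pipeline. It assumes BLASTx results in the standard 12-column tabular format, takes the highest bit score as the "best match" for each read, and uses hypothetical names throughout (`best_hits`, `group_percentages`, `cp_to_group`, and the toy read and CP identifiers).

```python
from collections import Counter


def best_hits(blast_lines):
    """Return {read_id: subject_id}, keeping the highest-bitscore hit per read.

    Assumes tab-separated BLAST tabular output with the 12 standard columns
    (qseqid, sseqid, ..., evalue, bitscore); bitscore is column 12.
    """
    best = {}
    for line in blast_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.split("\t")
        read_id, subject_id, bitscore = fields[0], fields[1], float(fields[11])
        if read_id not in best or bitscore > best[read_id][1]:
            best[read_id] = (subject_id, bitscore)
    return {read: subject for read, (subject, _) in best.items()}


def group_percentages(blast_lines, cp_to_group, total_reads):
    """Percentage of all reads in a virome whose best hit falls in each CP group.

    cp_to_group is a hypothetical mapping from CP reference ID to its
    phylogenetic group; unmapped references are counted as 'unassigned'.
    """
    hits = best_hits(blast_lines)
    counts = Counter(cp_to_group.get(subject, "unassigned") for subject in hits.values())
    return {group: 100.0 * n / total_reads for group, n in counts.items()}


# Toy example (made-up data): two reads with hits in a virome of 200 reads.
example_lines = [
    "read1\tCP_A\t45.0\t100\t55\t0\t1\t300\t1\t100\t1e-20\t80.0",
    "read1\tCP_B\t40.0\t100\t60\t0\t1\t300\t1\t100\t1e-10\t50.0",
    "read2\tCP_B\t50.0\t100\t50\t0\t1\t300\t1\t100\t1e-25\t90.0",
]
example_groups = {"CP_A": "peat", "CP_B": "RDHV-like"}
example_result = group_percentages(example_lines, example_groups, total_reads=200)
print(example_result)
```

Normalizing by the total number of reads in each virome, rather than by the number of reads with hits, is what makes the percentages comparable across viromes of different sizes.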