A New Putative Caulimoviridae Genus Discovered through Air Metagenomics.

Alberto Rastrojo1, Andrés Núñez2, Diego A Moreno2, Antonio Alcamí1.   

Abstract

Members of the Caulimoviridae family are important plant pathogens. These circular double-stranded DNA viruses may integrate into the host genome, although this integration is not required for the viral replication cycle. Here, we describe three complete genomes belonging to a new putative Caulimoviridae genus discovered through air metagenomics.

Year:  2018        PMID: 30533707      PMCID: PMC6256638          DOI: 10.1128/MRA.00955-18

Source DB:  PubMed          Journal:  Microbiol Resour Announc        ISSN: 2576-098X


ANNOUNCEMENT

Caulimoviridae is the only known double-stranded DNA (dsDNA) virus family infecting plants that replicates by reverse transcription (1). Members of this family have circular genomes of 7,000 to 8,200 bp, with discontinuities in both strands, that encode 1 to 8 open reading frames (ORFs) (2). The replication cycle is episomal and does not require integration into the host genome. However, integration can occur during the nonhomologous end-joining repair of dsDNA breaks in host genomes, leaving a fingerprint of past infections (3). The study of these endogenous viral elements suggests that this family emerged approximately 320 million years ago (4).
Here, we describe three complete viral genomes belonging to the family Caulimoviridae. These genomes were obtained from air samples collected in Madrid, Spain (40.439881°N, 3.689409°W), using different devices, namely, a Hirst spore trap, a Surface Air System DUO 360 instrument, and a Burkard multivial cyclone sampler (our unpublished data). A PowerSoil DNA isolation kit was used to extract total DNA. Samples were then sequenced using Illumina technology, obtaining 40 million paired-end reads for each sample (2 × 125 nucleotides). Raw reads were quality filtered using PRINSEQ (mean quality score of 25 and length of >75) (5), assembled with IDBA_UD with default parameters (6), and classified using a BLAST search against the NCBI nonredundant database (e-value, <1e-3; score, >50) (7). Three contigs of ∼7 kb were assigned to the family Caulimoviridae and were circularized using Minimus2 (8). These viral genomes are 98.8 to 99.3% identical to each other and therefore belong to the same species.
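The read and hit thresholds quoted above can be illustrated with a minimal Python sketch. It is not the PRINSEQ or BLAST implementation and not the authors' pipeline; it only restates the stated cutoffs as predicates. The names `Read`, `mean_quality`, `passes_read_filter`, and `passes_hit_filter` are hypothetical, Phred+33 quality encoding is assumed, and the mean quality of 25 is assumed to be an inclusive minimum.

```python
from typing import NamedTuple


class Read(NamedTuple):
    """One sequencing read (hypothetical container for this sketch)."""
    name: str
    seq: str
    qual: str  # quality string, assumed to be Phred+33 encoded


MIN_MEAN_QUALITY = 25       # from the text; assumed inclusive (>= 25)
MIN_LENGTH_EXCLUSIVE = 75   # from the text: length of >75
MAX_EVALUE = 1e-3           # from the text: e-value < 1e-3
MIN_SCORE = 50              # from the text: score > 50


def mean_quality(qual: str, offset: int = 33) -> float:
    """Average Phred score of a quality string (0.0 for an empty string)."""
    if not qual:
        return 0.0
    return sum(ord(ch) - offset for ch in qual) / len(qual)


def passes_read_filter(read: Read) -> bool:
    """Keep reads longer than 75 nt with a mean quality of at least 25."""
    return (len(read.seq) > MIN_LENGTH_EXCLUSIVE
            and mean_quality(read.qual) >= MIN_MEAN_QUALITY)


def passes_hit_filter(evalue: float, score: float) -> bool:
    """Keep BLAST hits with e-value below 1e-3 and score above 50."""
    return evalue < MAX_EVALUE and score > MIN_SCORE
```

For example, a 125-nt read whose quality string is all `I` (Phred 40 under the assumed encoding) passes the read filter, whereas a 75-nt read does not.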
All three genomes have a minus-strand primer binding site, a polypurine tract, and polyadenylation signals, and they have only one large ORF encoding a 2,129-amino-acid polyprotein with several domains: movement protein (amino acids [aa] 57 to 222), coat protein (aa 602 to 838), aspartic protease (aa 991 to 1124), reverse transcriptase (aa 1144 to 1619), and RNase H (aa 1627 to 1740) (9). All of these features are characteristic of members of the genus Petuvirus (2). However, the reverse transcriptase (RT)-RNase H domain shares only 40% identity with that of Petunia vein clearing virus (PVCV), the only member of the genus Petuvirus. By searching against the complete or near-complete endogenous viral genomes described by Diop et al. (4), we were able to find the closest relative, Pinus taeda gymnendovirus 2 (PtGy2), which shares 77% identity at the nucleotide level with the three new genomes. PtGy2 contains five ORFs, but rearranging these ORFs generated a single ORF encoding a 2,106-aa polyprotein that is 79% identical to the proteins of the newly discovered viruses. This identity increased to 84% when only the RT-RNase H domain was examined. Therefore, the new genomes could represent a replication-competent version of the PtGy2 endogenous element, which could have been fragmented by the accumulation of several mutations or indels following its integration into the genome of Pinus taeda. Interestingly, ∼80% of the shotgun metagenomic reads were assigned to Pinus taeda. Additionally, we were able to detect these new viruses by PCR in a Pinus nigra sample collected in the vicinity of the air sampling site, suggesting that these new viruses are likely Pinus pathogens. The phylogenetic analysis (10, 11) of the RT-RNase H protein of representative Caulimoviridae members showed a strong relationship of the new viruses with PtGy2 and with Picea glauca gymnendovirus 2, another endogenous element, all of them forming an ancient branch in clade B, to which PVCV belongs (4).
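As an illustration of the domain layout reported above, the following Python sketch records the stated coordinates and checks that they are ordered, non-overlapping, and within the 2,129-aa polyprotein. The coordinates come from the text; the names `DOMAINS`, `domain_length`, and `layout_is_consistent` are hypothetical and are not part of any annotation tool used by the authors.

```python
POLYPROTEIN_LENGTH = 2129  # amino acids, as reported in the text

# (domain name, first residue, last residue), 1-based inclusive, from the text
DOMAINS = [
    ("movement protein", 57, 222),
    ("coat protein", 602, 838),
    ("aspartic protease", 991, 1124),
    ("reverse transcriptase", 1144, 1619),
    ("RNase H", 1627, 1740),
]


def domain_length(start: int, end: int) -> int:
    """Length of a 1-based inclusive residue range."""
    return end - start + 1


def layout_is_consistent(domains, total_length: int) -> bool:
    """True if domains are in order, do not overlap, and fit the protein."""
    previous_end = 0
    for _name, start, end in domains:
        if not (previous_end < start <= end <= total_length):
            return False
        previous_end = end
    return True


if __name__ == "__main__":
    for name, start, end in DOMAINS:
        print(f"{name}: aa {start}-{end} ({domain_length(start, end)} aa)")
    print("consistent:", layout_is_consistent(DOMAINS, POLYPROTEIN_LENGTH))
```

Under these coordinates, the reverse transcriptase domain is the longest annotated region (476 aa) and RNase H the shortest (114 aa).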
In conclusion, we propose that these new genomes may represent a new genus of Caulimoviridae infecting gymnosperms, in contrast to Petuvirus, which infects angiosperms, and that this genus could represent the replicative counterpart of the recently described endogenous genus Gymnendovirus 2.

Data availability.

The three complete genome sequences reported here have been deposited in GenBank under the accession numbers MH551471, MH551472, and MH551473. Shotgun raw reads have also been deposited in the European Nucleotide Archive (ENA) under the accession numbers ERX2313857, ERX2313858, and ERX2313863.
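For readers who want to retrieve the deposited genomes, the sketch below builds NCBI E-utilities `efetch` URLs for the three GenBank accessions. It only constructs the URLs and performs no network request. The accession numbers come from the text; the helper name `efetch_url` and the choice of FASTA output are assumptions made for illustration.

```python
from urllib.parse import urlencode

# NCBI E-utilities efetch endpoint
EFETCH_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"

# GenBank accessions listed in the data availability statement
GENBANK_ACCESSIONS = ["MH551471", "MH551472", "MH551473"]


def efetch_url(accession: str, rettype: str = "fasta") -> str:
    """Build an efetch URL for one nucleotide record (hypothetical helper)."""
    params = {
        "db": "nuccore",      # nucleotide database
        "id": accession,
        "rettype": rettype,   # "fasta" for sequence, "gb" for the full record
        "retmode": "text",
    }
    return f"{EFETCH_BASE}?{urlencode(params)}"


if __name__ == "__main__":
    for accession in GENBANK_ACCESSIONS:
        print(efetch_url(accession))
```

Passing `rettype="gb"` requests the annotated GenBank flat file instead of the bare sequence.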

1.  IDBA-UD: a de novo assembler for single-cell and metagenomic sequencing data with highly uneven depth.

Authors:  Yu Peng; Henry C M Leung; S M Yiu; Francis Y L Chin
Journal:  Bioinformatics       Date:  2012-04-11       Impact factor: 6.937

2.  New algorithms and methods to estimate maximum-likelihood phylogenies: assessing the performance of PhyML 3.0.

Authors:  Stéphane Guindon; Jean-François Dufayard; Vincent Lefort; Maria Anisimova; Wim Hordijk; Olivier Gascuel
Journal:  Syst Biol       Date:  2010-03-29       Impact factor: 15.683

3.  Next generation sequence assembly with AMOS.

Authors:  Todd J Treangen; Dan D Sommer; Florent E Angly; Sergey Koren; Mihai Pop
Journal:  Curr Protoc Bioinformatics       Date:  2011-03

4.  BLAST+: architecture and applications.

Authors:  Christiam Camacho; George Coulouris; Vahram Avagyan; Ning Ma; Jason Papadopoulos; Kevin Bealer; Thomas L Madden
Journal:  BMC Bioinformatics       Date:  2009-12-15       Impact factor: 3.169

Review 5.  Viral sequences integrated into plant genomes.

Authors:  Glyn Harper; Roger Hull; Ben Lockhart; Neil Olszewski
Journal:  Annu Rev Phytopathol       Date:  2002-02-20       Impact factor: 13.078

6.  Quality control and preprocessing of metagenomic datasets.

Authors:  Robert Schmieder; Robert Edwards
Journal:  Bioinformatics       Date:  2011-01-28       Impact factor: 6.937

7.  Endogenous florendoviruses are major components of plant genomes and hallmarks of virus evolution.

Authors:  Andrew D W Geering; Florian Maumus; Dario Copetti; Nathalie Choisne; Derrick J Zwickl; Matthias Zytnicki; Alistair R McTaggart; Simone Scalabrin; Silvia Vezzulli; Rod A Wing; Hadi Quesneville; Pierre-Yves Teycheney
Journal:  Nat Commun       Date:  2014-11-10       Impact factor: 14.919

8.  The MPI bioinformatics Toolkit as an integrative platform for advanced protein sequence and structure analysis.

Authors:  Vikram Alva; Seung-Zin Nam; Johannes Söding; Andrei N Lupas
Journal:  Nucleic Acids Res       Date:  2016-04-29       Impact factor: 16.971

9.  Virus taxonomy: the database of the International Committee on Taxonomy of Viruses (ICTV).

Authors:  Elliot J Lefkowitz; Donald M Dempsey; Robert Curtis Hendrickson; Richard J Orton; Stuart G Siddell; Donald B Smith
Journal:  Nucleic Acids Res       Date:  2018-01-04       Impact factor: 16.971

10.  Tracheophyte genomes keep track of the deep evolution of the Caulimoviridae.

Authors:  Seydina Issa Diop; Andrew D W Geering; Françoise Alfama-Depauw; Mikaël Loaec; Pierre-Yves Teycheney; Florian Maumus
Journal:  Sci Rep       Date:  2018-01-12       Impact factor: 4.379
