FLDS: A Comprehensive dsRNA Sequencing Method for Intracellular RNA Virus Surveillance.

Syun-Ichi Urayama, Yoshihiro Takaki, Takuro Nunoura.

Abstract

Knowledge of the distribution and diversity of RNA viruses is still limited in spite of their possible environmental and epidemiological impacts because RNA virus-specific metagenomic methods have not yet been developed. We herein constructed an effective metagenomic method for RNA viruses by targeting long double-stranded (ds)RNA in cellular organisms, which is a hallmark of infection, or the replication of dsRNA and single-stranded (ss)RNA viruses, except for retroviruses. This novel dsRNA targeting metagenomic method is characterized by an extremely high recovery rate of viral RNA sequences, the retrieval of terminal sequences, and uniform read coverage, which has not previously been reported in other metagenomic methods targeting RNA viruses. This method revealed a previously unidentified viral RNA diversity of more than 20 complete RNA viral genomes including dsRNA and ssRNA viruses associated with an environmental diatom colony. Our approach will be a powerful tool for cataloging RNA viruses associated with organisms of interest.

Year:  2016        PMID: 26877136      PMCID: PMC4791113          DOI: 10.1264/jsme2.ME15171

Source DB:  PubMed          Journal:  Microbes Environ        ISSN: 1342-6311            Impact factor:   2.912


Viruses are the universal genetic elements associated with all three domains of life (22), and virus-host interactions impact on the status of life and surrounding ecosystems (41). Historically, viruses are most often recognized as pathogens (38), and, thus, have been studied in the field of medical and crop science. Recent advances in high-throughput sequencing technologies have enabled us to identify not only viruses associated with diseases, but also those present in natural environments including oceans (41) and soil (12). Although these sequencing technologies have opened a new era in virus identification (24), a limited number of methods have been established for virus enrichment and library construction. The diversity and distribution of viruses in non-viral nucleic acid-dominant environments, such as the intracellular environments in which viruses actually replicate, still remain unclear due to technical difficulties (16). The development of a new procedure for effective virus enrichment and library construction is required in order to understand the full spectrum of diverse viruses.

RNA sequencing (RNA-seq) is a popular method in RNA virus metagenomics and is widely used for RNA virus identification (35). Purification and library construction methods have been established for RNA viruses at the extracellular stage (7, 10, 40). However, the viral read ratio of intracellular RNA viruses (RNA viruses at the intracellular stage) in the RNA-seq library is typically < 1% because mRNA and rRNA are dominant in the total RNA fraction extracted from biological samples (25). Therefore, the enrichment of viral RNA is essential for maximizing sensitivity in the identification of novel viruses. The physical enrichment of viral particles and nuclease digestion of non-viral nucleotides has been employed to increase the viral read ratio; however, a relatively low abundance of viral reads is still observed in most studies (39). These techniques are only applicable to specific RNA viruses because not all RNA viruses form viral particles (21). In addition, difficulties are associated with capturing terminal RNA sequences in an efficient and effective manner (32) and obtaining uniform coverage using the RNA-seq method. Sample preparation methods for effective viral RNA-seq are still inadequate and the sequence information generated is biased and incomplete.

In an attempt to resolve these issues, an environmental viral metagenomic approach targeting intracellular long double-stranded RNA (dsRNA) has recently been examined (2, 6, 9, 37). Intracellular dsRNA consists of the genomes of dsRNA viruses and replicative intermediates of single-stranded RNA (ssRNA) viruses, and, thus, long dsRNA is known as an RNA virus-specific molecule and molecular marker for RNA virus infection and replication (28). Therefore, a metagenomic analysis targeting intracellular long dsRNA theoretically retrieves dsRNA and ssRNA viruses, except for ssRNA retroviruses, which do not form dsRNA in the replicative stage. In addition, it is possible to eliminate non-viral nucleic acids such as mRNA and rRNA, which dominate RNA-seq reads, by DNase I, S1 nuclease, RNase, or column chromatography (44). However, previous studies have reported technical issues with the purification of dsRNA and library construction. Random priming for the reverse transcription of dsRNA does not enable the terminal sequences of the dsRNA molecule to be determined or eliminate significant contamination by non-viral sequences. The heterogeneous sequencing depth in certain viral genome segments is also an issue associated with this method (2, 6, 9, 37). Although the full-length cDNAs of dsRNA viruses may be obtained using loop primers that are ligated to the dsRNA terminal ends for reverse transcription (8), this method is only useful for short dsRNA viruses. Therefore, to the best of our knowledge, this method has not yet been applied to a viral metagenomic analysis.

We herein established a novel strategy to obtain full-length RNA virus sequences with extremely high efficiency by applying a short dsRNA full-length cloning method (8) for physically fragmented dsRNAs. The improved method, named FLDS (fragmented and loop primer ligated dsRNA sequencing), was applied to a diatom colony in a tide pool and revealed previously unidentified RNA viruses. Our results indicate that the diversity of environmental RNA viruses has been underestimated due to the technical limitations in identifying entire RNA viromes in cellular organisms, and this technique will be a powerful tool for cataloging RNA viruses associated with organisms of interest.

Materials and Methods

Model and environmental samples

Mycelial plugs of Magnaporthe oryzae strain S-0412-II 1a, naturally infected with Magnaporthe oryzae chrysovirus 1 strain A (MoCV1-A) (45) were incubated in 0.5% yeast extract and 2% glucose liquid broth (YG broth) with reciprocal shaking (60 rpm) at 25°C for 2 weeks in the laboratory of Prof. Teraoka (Tokyo University of Agriculture and Technology). Colonies of a diatom on tidal rocks in Tokyo Bay (35.3405° N, 139.6396° E) were sampled in April 2014. After washing with distilled water, the colonies were stored at −80°C.

Purification and fragmentation of dsRNA

DsRNA was purified as described by Okada et al. with a few modifications (31, 46). Briefly, the microbial sample was disrupted in liquid nitrogen in a mortar and total nucleic acids were manually extracted. DsRNA was purified twice through a micro-spin column (empty Bio-spin column; Bio-Rad Laboratories, Inc., Hercules, CA, USA) containing cellulose powder (Cellulose D; ADVANTEC, Tokyo, Japan) to obtain pure dsRNA. The dsRNA eluted from cellulose powder in MQ water was treated with DNase I (amplification grade, Invitrogen, Carlsbad, CA, USA) and S1 nuclease (Invitrogen) in nuclease buffer (57 mM CH3COONa, 9.5 mM MgCl2, 1.9 mM ZnSO4, and 189 mM NaCl) and was then incubated at 37°C for 2 h. The final concentrations of CH3COONa, MgCl2, ZnSO4, and NaCl were adjusted to 90 mM, 15 mM, 3 mM, and 300 mM, respectively. DsRNA was purified using an RNeasy Mini Kit (Qiagen, Valencia, CA). A one-tenth volume each of the 10 × ShortCut buffer and 10 × MnCl2 provided with ShortCut RNase III (NEB Japan, Tokyo, Japan) was added to the dsRNA solution, which was then fragmented by ultrasound at 4°C in Snap-Cap microTUBEs using a Covaris S220 (Woburn, MA, USA). The fragmentation conditions were as follows: run time 35 s, peak power 140.0 W, duty factor 2.0%, and 200 cycles/burst. Fragmented dsRNA was divided into two equal volumes, and maintained at 37°C with or without ShortCut RNase III (NEB). DsRNAs were then purified using a ZymoClean Gel RNA Recovery Kit (ZymoResearch, Orange, CA). Note that dsRNA purification from M. oryzae was carried out in the laboratory of Prof. Teraoka.
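The two sets of nuclease buffer concentrations quoted above can be cross-checked with a few lines of arithmetic. The sketch below only compares the stated values; reading them as one recipe at two strengths is an assumption, since the text does not say how the adjustment was made.

```python
# Concentrations (mM) quoted in the text for the nuclease buffer.
stated_mM = {"CH3COONa": 57.0, "MgCl2": 9.5, "ZnSO4": 1.9, "NaCl": 189.0}
# Concentrations (mM) quoted in the text as the adjusted final values.
adjusted_mM = {"CH3COONa": 90.0, "MgCl2": 15.0, "ZnSO4": 3.0, "NaCl": 300.0}

# Ratio of the two values for each component. If the ratios agree, the two
# sets describe the same recipe at different strengths (an interpretation,
# not something the text states).
ratios = {name: stated_mM[name] / adjusted_mM[name] for name in stated_mM}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.3f}")

spread = max(ratios.values()) - min(ratios.values())
```

All four components give nearly the same ratio, so the proportions between the salts are the same in both sets of values.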

cDNA synthesis and amplification for dsRNA

The PC3-T7 loop primer (5′-p-GGA TCC CGG GAA TTC GGT AAT ACG ACT CAC TAT ATT TTT ATA GTG AGT CGT ATT A-OH-3′) was ligated to fragmented dsRNA as described by Potgieter et al. (34), and dsRNA was then purified using the MinElute Gel Extraction Kit (Qiagen). After the addition of DMSO at a final concentration of 15% (v/v), dsRNA was denatured at 95°C for 3 min and snap-frozen in ice-water slurry. RNA was reverse transcribed into cDNA from the ligated loop primer region using the Superscript III First-Strand Synthesis System (Invitrogen). After excess and hybrid RNAs were removed (34), cDNA was desalted and concentrated using the MinElute PCR cleanup kit (Qiagen). Primary cDNA strands were re-annealed by lowering the temperature from 95 to 50°C, as described previously (30). Second strand DNA polymerization was performed using KOD-plus Neo (Toyobo, Osaka, Japan) with a primer complementary to the partial sequence of the PC3-T7 loop primer, PC2 (5′-CCG AAT TCC CGG GAT CC-3′) (34). After heat activation of KOD-plus Neo in the reaction mixture provided at 96°C for 2 min, template cDNA was added and incubated at 68°C for 5 min. After the reaction, cDNA was amplified under the following conditions: 96°C for 2 min, 25 (for MoCV1-A) or 18 (for diatom colony) cycles of 98°C for 10 s, and 68°C for 2 min. Small cDNA and primer dimers were removed using the 1.25 × SPRIselect reagent kit (Beckman Coulter, Brea, CA, USA) according to the Left Side Size Selection procedure in the manufacturer’s protocol.
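The relationship between the two oligonucleotides given above can be checked directly from their sequences. The following minimal sketch confirms that PC2 is the reverse complement of a stretch of the PC3-T7 loop primer; the function and variable names are illustrative only and are not taken from the cited protocol.

```python
# Sequences as printed in the text (spaces removed).
PC3_T7 = ("GGA TCC CGG GAA TTC GGT AAT ACG ACT CAC TAT ATT TTT "
          "ATA GTG AGT CGT ATT A").replace(" ", "")
PC2 = "CCG AAT TCC CGG GAT CC".replace(" ", "")

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


# The stretch of PC3-T7 to which PC2 is complementary.
target = reverse_complement(PC2)
# 0-based position of that stretch in PC3-T7 (-1 would mean "not found").
position = PC3_T7.find(target)

print(len(PC3_T7), len(PC2), position)
```

The match falls at the 5′ end of the loop primer, which is consistent with the description of PC2 as complementary to a partial sequence of PC3-T7.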

Total RNA extraction, cDNA synthesis, and library construction from an environmental sample

Total RNA was isolated from a diatom colony using the TRIzol Plus RNA Purification Kit (Invitrogen) according to the manufacturer’s protocol. The RNA fraction was treated with DNase I (Takara, Otsu, Japan). Double-stranded cDNA was synthesized from 2 μg of total RNA with random primers (9-mers) using a PrimeScript Double Strand cDNA Synthesis Kit (Takara). The resultant cDNA was quantified using a Qubit dsDNA HS Kit.

Illumina sequencing

Ultrasound was used to fragment cDNA in Snap-Cap microTUBEs at 4°C using a Covaris S220 (Woburn, MA, USA). The fragmentation conditions were as follows: run time 55 s, peak power 175.0 W, duty factor 5.0%, and 200 cycles/burst. The Illumina library was constructed with the KAPA Hyper Prep Kit for Illumina platforms (Kapa Biosystems, Woburn, MA, USA). The quantity of the library was evaluated using the KAPA library quantification kit (Kapa Biosystems). Paired-end sequences of 300 bp each were determined for each fragment with the Illumina MiSeq platform (San Diego, CA, USA).

Data assembly and processing

Raw sequence reads were processed with the CLC Genomics Workbench (CLC Bio, Aarhus, Denmark). Adaptor and primer sequences were trimmed, and low quality sequence regions were removed with default parameters. PhiX sequences derived from control libraries and experimentally contaminated sequences (< 0.05% of total reads) were also removed using a mapping tool. The consensus sequences of viral contigs were obtained by de novo assembly with the CLC Genomics Workbench (CLC Bio), and assemblies were manually examined and extended using the Tablet viewer (27). Using the mapping tool, each contig was confirmed to be constructed with at least 3 × sequence coverage, 10 × average coverage, and 1,000 bp in length. In cases of dominant reads (more than 10 reads) that stopped in the same position around the ends of contigs, the position was recognized as a terminal end. The predicted terminal ends of the viral genome segments were also confirmed by the presence of adjacent PCR primer sequences next to the predicted terminal sequence, except for cases of contigs with a poly(A) tail. Contigs with 70–90% nucleotide identity with other contigs were classified as the genome types of the same species. Contigs with > 90% nucleotide identity were assigned as the same genome type and only major contigs were used in further analyses. Sequences were compared against the NCBI non-redundant nucleotide and amino acid (aa) databases using BLASTN-plus and BLASTX-plus, respectively (5), and then classified by MEGAN 5.7.1 (18). A sequence analysis was performed using Genetyx-MAC software version 17.0.0 (Genetyx Corp., Tokyo, Japan) and Genetyx software version 9.1.0 (Genetyx). Most full-length small subunit rRNA sequences in the diatom colony were reconstructed from RNA-seq reads with EMIRGE (26).
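The contig acceptance thresholds and the identity-based grouping rules described above can be written down as a short sketch. This is not the CLC Genomics Workbench procedure itself; the `Contig` class, the function names, and the example values are hypothetical, and treating pairs below 70% identity as different species is an assumption implied by, but not stated in, the text.

```python
from dataclasses import dataclass


@dataclass
class Contig:
    name: str
    length_bp: int
    min_coverage: float   # lowest per-base coverage along the contig
    mean_coverage: float  # average coverage along the contig


def passes_thresholds(contig: Contig) -> bool:
    """Apply the stated criteria: >= 3x coverage, >= 10x average, >= 1,000 bp."""
    return (contig.length_bp >= 1000
            and contig.min_coverage >= 3
            and contig.mean_coverage >= 10)


def relationship(identity_percent: float) -> str:
    """Classify a contig pair by nucleotide identity (percent)."""
    if identity_percent > 90:
        return "same genome type"
    if identity_percent >= 70:
        return "genome types of the same species"
    # Assumption: below 70% the contigs are treated as different species.
    return "different species"


# Hypothetical example values, for illustration only.
examples = [Contig("contig_1", 4026, 12, 350.0), Contig("contig_2", 800, 12, 350.0)]
kept = [c.name for c in examples if passes_thresholds(c)]
print(kept, relationship(95.0), relationship(80.0))
```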

Phylogenetic analysis

Multiple alignments based on the deduced aa sequences of putative RNA-dependent RNA polymerase (RdRp) genes in dsRNA contigs were obtained using ClustalX 2.0 (23) and MEGA5 software (42). Phylogenetic analyses were conducted using MrBayes 3.2.3 (36) with the model of aa substitution, RtREV+I+G+F, selected by ProtTest2.4 (1), as judged by the Akaike information criterion (33). Bayesian analyses with the covarion parameter were run with one run and four chains for 1,000,000 generations.

Data accession

The data sets supporting the results of this study are available in the GenBank database repository (accession nos. DDBJ: AP014890–AP014920) and Short Read Archive database (accession no. DDBJ: DRA003723 and DRA003724).

Results

Application of FLDS to a segmented dsRNA virus

The novel dsRNA purification and library construction method, named FLDS, consists of cellulose column chromatography, the physical fragmentation of dsRNA, cDNA synthesis using a loop primer, and the PCR amplification of cDNA (Fig. 1). The purification of dsRNA was achieved by the repeated affinity purification of dsRNA using cellulose powder and the enzymatic removal of ssRNA and DNA. Purified dsRNAs were fragmented using ultrasound to retrieve all types of dsRNA viruses in order to apply the previously reported full-length dsRNA cloning method using a loop primer (8). The full-length dsRNA cloning method requires overlapped cDNAs synthesized from both terminal ends for further cDNA amplification, and was only applicable to short dsRNA molecules. Reverse transcription was initiated from the ligated loop primer on both ends of the dsRNA fragment. cDNA was then thermally denatured to allow annealing of single-stranded cDNA with the complementary sequence in the 3′ terminal region. The single-stranded regions of annealed cDNA were filled in with DNA polymerase. The double-stranded cDNA derived from dsRNA was amplified by PCR with a single primer (PC2) in order to obtain sufficient cDNA to construct a sequencing library.
Fig. 1

Schematic work flow of FLDS. 1. Fragmentation of dsRNA by ultrasound. 2. Ligation of a loop primer on 3′-terminal ends and reverse transcription. 3. Selective duplex formation of cDNA from dsRNA, and PCR amplification. Details of the FLDS method are described in the Materials and Methods section.

Mycelial MoCV1-A was used to test the feasibility of this method. Since PCR amplicons were not observed in the sample treated with dsRNA-specific RNase III prior to reverse transcription, most of the amplicons (cDNA) were likely to have been derived from dsRNA (Fig. S1). The results of the sequencing analysis indicated that 99.1% of total reads were derived from the MoCV1-A genome (Table S1). Five contigs obtained by de novo assembly matched the entire region of the MoCV1-A genome segments attained using a conventional cloning and sequencing method (44, 45) with > 99.9% identity (Table S2). Read mapping on MoCV1-A genomes (Fig. S2) showed that the sequence coverage of terminal regions was generally higher than that of the central regions of each segment with few exceptions. No obvious relationship was observed between read coverage and GC content (Fig. S2). These results indicated that FLDS effectively enriched dsRNA reads, thereby allowing the retrieval of complete genome sequences including terminal regions without the requirement for the additional rapid amplification of cDNA ends (RACE).

FLDS analysis in an environmental diatom colony

Gel electrophoresis showed that the total long dsRNA fraction from the diatom colony contained at least ten dsRNA segments, whereas genomic DNA and rRNA were predominant in total nucleic acids (Fig. 2). Total dsRNA extracted from 1 g of the diatom colony was analyzed using the FLDS method. PCR amplicons were not observed in the sample treated with dsRNA-specific RNase III prior to reverse transcription (Fig. S3). As a result of de novo assembly and manual extension, we obtained 42 composite viral contigs (Table 1 and Table S3). As in the model experiment described above, 98.2% of reads were mapped to these 42 contigs (Table 2). Both terminal ends of 31 of the viral contigs were identified and recognized as full-length viral genome segments. The terminal sequences of the full-length segments were used to identify segment compositions for some of the viral species because terminal sequences are highly conserved between segments in some dsRNA viral genomes for viral RNA replication and/or encapsidation (19).
Fig. 2

Agarose gel electrophoresis of purified nucleic acids from a diatom colony. Nucleic acids were stained with ethidium bromide. Lane M, 300 ng of HindIII-digested λ DNA; lane 1, total nucleic acids extracted from 5 mg (wet weight) of the diatom colony; lane 2, purified dsRNA extracted from 1 g (wet weight) of the diatom colony.

Table 1

List of complete composite genomes of RNA viruses and full-length virus-like RNAs obtained from a diatom colony using FLDS.

| RNA virus species | Accession | Description | Size (nt) | Num. of mapped reads | Average coverage | BlastX top hit for each CDS | Virus family | E-value | Protein |
| DCADSRV-1 a) | AP014890 | segment 1 | 1,734 | 1,301,278 | 191,942 | | | | |
| | AP014891 | segment 2 | 1,562 | 1,717,396 | 279,580 | Fox Picobirnavirus | Picobirnaviridae | 1 × 10^−33 | RdRp |
| DCADSRV-2 | AP014892 | | 4,026 | 1,337,570 | 83,876 | Ustilaginoidea virens nonsegmented virus 1 | Not assigned | 5 × 10^−15 | RdRp |
| DCADSRV-3 | AP014893 | | 4,911 | 14,544 | 703 | Ustilaginoidea virens RNA virus 1 | Totiviridae | 2 × 10^−63 | RdRp |
| DCADSRV-4 | AP014894 | Genome type A | 4,982 | 12,325 | 591 | Aspergillus mycovirus 178 | Totiviridae | 4 × 10^−69 | RdRp |
| DCADSRV-4 | AP014895 | Genome type B | 4,979 | 1,074 | 52 | Ustilaginoidea virens RNA virus 1 | Totiviridae | 5 × 10^−69 | RdRp |
| DCADSRV-5 | AP014896 | | 5,252 | 7,863 | 359 | Aspergillus foetidus slow virus 1 | Totiviridae | 3 × 10^−74 | RdRp |
| DCADSRV-6 | AP014897 | | 4,939 | 2,720 | 131 | Aspergillus mycovirus 178 | Totiviridae | 2 × 10^−66 | RdRp |
| DCADSRV-7 | AP014898 | | 5,327 | 1,957 | 87 | Gremmeniella abietina RNA virus L1 | Totiviridae | 3 × 10^−123 | RdRp |
| | | | | | | Ustilaginoidea virens RNA virus 3 | Totiviridae | 2 × 10^−56 | CP |
| DCADSRV-8 | AP014899 | | 4,660 | 1,163 | 60 | Aspergillus foetidus slow virus 1 | Totiviridae | 8 × 10^−57 | RdRp |
| DCADSRV-9 | AP014900 | Genome type A | 4,844 | 1,198 | 60 | Magnaporthe oryzae virus 2 | Totiviridae | 1 × 10^−65 | RdRp |
| DCADSRV-9 | AP014901 | Genome type B | 4,845 | 364 | 18 | Aspergillus foetidus slow virus 1 | Totiviridae | 2 × 10^−66 | RdRp |
| DCADSRV-10 | AP014902 | | 5,082 | 1,244 | 59 | Rosellinia necatrix victorivirus 1 | Totiviridae | 2 × 10^−108 | RdRp |
| | | | | | | Ustilaginoidea virens RNA virus 1 | Totiviridae | 6 × 10^−50 | CP |
| DCADSRV-11 | AP014903 | | 5,160 | 1,173 | 55 | Ustilaginoidea virens RNA virus 1 | Totiviridae | 4 × 10^−128 | RdRp |
| | | | | | | Ustilaginoidea virens RNA virus 1 | Totiviridae | 8 × 10^−64 | CP |
| DCADSRV-12 | AP014904 | | 5,941 | 1,219 | 49 | Beauveria bassiana RNA virus 1 | Totiviridae | 1 × 10^−40 | RdRp |
| DCADSRV-13 | AP014905 | | 4,671 | 820 | 42 | Aspergillus foetidus slow virus 1 | Totiviridae | 4 × 10^−58 | RdRp |
| DCADSRV-14 a) | AP014906 | segment 1 | 1,576 | 438 | 67 | Persimmon cryptic virus | Partitiviridae | 3 × 10^−97 | RdRp |
| | AP014907 | segment 2 | 1,490 | 274 | 43 | | | | |
| DCADSRV-15 | AP014908 | | 12,172 | 1,482 | 29 | Chalara endornavirus CeEV1 | Endornaviridae | 1 × 10^−115 | Polyprotein |
| DCASSRV-1 | AP014912 | | 11,413 | 1,011 | 21 | Border disease virus-BD31 | Flaviviridae | 4 × 10^−15 | Polyprotein |
| DCASSRV-2 | AP014913 | | 4,586 | 4,153 | 224 | Tuber excavatum mitovirus | Narnaviridae | 5 × 10^−20 | RdRp |
| DCADSRV-16 | AP014909 | | 6,635 | 8,735 | 310 | Rhizoctonia fumigata mycovirus | Not assigned | 4 × 10^−10 | RdRp |
| DCADSRV-17 | AP014910 | Genome type A | 5,907 | 5,325 | 218 | dsRNA virus environmental sample | Not assigned | 7 × 10^−14 | RdRp |
| DCADSRV-17 | AP014911 | Genome type B | 5,909 | 1,564 | 63 | Botrytis porri RNA virus 1 | Not assigned | 1 × 10^−13 | RdRp |
| DCAVLRS-1 | AP014914 | Interrupted RdRp | 4,567 | 57,802 | 3,039 | Ustilaginoidea virens nonsegmented virus 1 | Not assigned | 3 × 10^−11 | RdRp |
| DCAVLRS-2 | AP014915 | Interrupted RdRp | 4,786 | 41,181 | 2,100 | Ustilaginoidea virens nonsegmented virus 1 | Not assigned | 2 × 10^−11 | RdRp |
| DCAVLRS-3 | AP014916 | CP only | 3,458 | 13,140 | 876 | Ustilaginoidea virens RNA virus 1 | Totiviridae | 2 × 10^−41 | CP |
| DCAVLRS-4 | AP014917 | RdRp only | 3,190 | 3,995 | 294 | Magnaporthe oryzae virus 2 | Totiviridae | 2 × 10^−123 | RdRp |
| DCAVLRS-5 | AP014918 | CP only | 3,262 | 1,331 | 96 | Phomopsis vexans RNA virus | Totiviridae | 5 × 10^−47 | CP |
| DCAVLRS-6 | AP014919 | RdRp only | 3,325 | 891 | 65 | Ustilaginoidea virens RNA virus 3 | Totiviridae | 6 × 10^−102 | RdRp |
| DCAVLRS-7 | AP014920 | Interrupted RdRp | 1,986 | 164 | 20 | Flammulina velutipes browning virus | Partitiviridae | 4 × 10^−63 | RdRp |

a) The classification was based on the shared 5′-terminal sequences of the paired segments, because the CDSs in the segment without a listed hit did not show significant similarity to genes in databases.

Table 2

Classification of next-generation sequencing reads obtained by FLDS and total RNA-seq.

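| Read category | FLDS: Num. of reads | FLDS: Rate (%) | Total RNA-seq: Num. of reads | Total RNA-seq: Rate (%) |
| Trimmed | 4,631,738 | 100.0 | 6,979,561 | 100.0 |
| Major viral reads | 4,549,629 | 98.2 | 24,036 | 0.3 |
| Unmapped reads (including minor viral reads) | 82,109 | 1.7 | 6,955,525 | 99.6 |
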
Based on aa sequence similarities (E-value ≤ 1 × 10^−5) in the predicted protein-encoding sequences (CDSs), the number of genome segments in related viruses, and terminal conserved sequences in each segment of a single virus, we identified 22 putative composite viral genomes out of 31 full-length viral segments. Sequence similarities between the 22 putative composite viral genomes were used to classify them into 19 putative viral species, three of which were each represented by two genome types (Table 1). Seventeen dsRNA and two ssRNA viral species were identified and named Diatom Colony-Associated dsRNA virus 1–17 (DCADSRV-1–17) and Diatom Colony-Associated ssRNA virus 1–2 (DCASSRV-1–2) (Table 1). Since ssRNA viruses form an RNA duplex as an intermediate in genome replication, these contigs were most likely derived from replicating ssRNA viruses (11) and not from contaminant ssRNA. An additional seven full-length viral segments with predicted CDSs were also identified; however, we were unable to determine the combination of their segments or reconstruct viral genomes based on information from previously reported viruses. Thus, these viral segments were assigned as Diatom Colony-Associated Virus-Like RNA Segments (DCAVLRS-1–7).
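Pairing segments by conserved terminal sequences, as described above, amounts to grouping full-length segments that share the same 5′-terminal nucleotides. The sketch below illustrates the idea only; the function name, the default window of nine nucleotides, and the toy sequences are hypothetical and are not the sequences or the exact procedure used in this study.

```python
from collections import defaultdict


def group_by_5prime_terminus(segments, k=9):
    """Group segment names by their first k nucleotides.

    Only termini shared by at least two segments are returned, since a
    terminus seen once says nothing about segment pairing.
    """
    groups = defaultdict(list)
    for name, sequence in segments.items():
        groups[sequence[:k].upper()].append(name)
    return {terminus: sorted(names)
            for terminus, names in groups.items() if len(names) > 1}


# Toy sequences invented for illustration.
toy_segments = {
    "seg_A": "GCAAUAAAAGCCCGUAGG",
    "seg_B": "GCAAUAAAAGUUUACGAU",
    "seg_C": "AUGGCCUUAACGGAUCCA",
}
paired = group_by_5prime_terminus(toy_segments)
print(paired)
```

In practice, shared termini would be one line of evidence alongside the CDS similarities and the segment numbers of related viruses mentioned in the text.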

Comparison between FLDS and total RNA-seq

Total RNA from the diatom colony was also investigated using shotgun RNA-seq in order to determine the active organisms of the colony and the abundance of viral RNA genomes in total RNA. Sequence reads derived from rRNA were identified using EMIRGE (26). The results of the analysis revealed that 56% of all trimmed reads were rRNA sequences, while 37.2% of all reads showed more than 99% identity to 18S and 23S rRNA from the diatom Achnanthes brevipes. In addition, 4.1 and 6.2% of reads belonged to the other diatom genus Cylindrotheca and chlorophyte genus Cladophora, respectively. The relative abundance of the rRNA reads is shown in Table S4. Only 0.3% of reads from total RNA-seq were mapped on the major viral contigs obtained using FLDS with a read mapping algorithm in the CLC workbench (Table 2). Comparisons of the relative read frequencies of each major viral contig between total RNA-seq and FLDS revealed that FLDS achieved 0.8–4372.3-fold enrichment for each viral contig (mean, 653.2-fold) (Fig. 3). FLDS also had apparent advantages in uniform read coverage and efficiency for retrieving terminal sequences (Fig. 4). Sequence reads for ssRNA viruses were also more abundant in FLDS than in RNA-seq for four out of five ssRNA contigs. In addition, by de novo assembly, only six partial viral contigs were obtained using RNA-seq, and no viral contigs specific for total RNA-seq were found. Accordingly, we concluded that FLDS is more efficient than total RNA-seq for the detection and identification of RNA viruses, with the exception of retroviruses, which theoretically cannot be identified using FLDS.
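The comparison of relative read frequencies can be illustrated with the aggregate counts in Table 2. The sketch below is a worked calculation under the assumption that fold enrichment is the ratio of the mapped-read fractions in the two libraries; the function names are illustrative, and per-contig values would need per-contig read counts for both libraries, which are not all given here.

```python
def read_fraction(mapped_reads: int, total_reads: int) -> float:
    """Fraction of reads in a library that map to the viral contigs."""
    return mapped_reads / total_reads


def fold_enrichment(flds_fraction: float, rnaseq_fraction: float) -> float:
    """Ratio of viral read fractions (FLDS relative to total RNA-seq)."""
    return flds_fraction / rnaseq_fraction


# Read counts taken from Table 2.
flds = read_fraction(4_549_629, 4_631_738)
rnaseq = read_fraction(24_036, 6_979_561)
fold = fold_enrichment(flds, rnaseq)

print(f"FLDS: {flds:.1%}, total RNA-seq: {rnaseq:.1%}, fold: {fold:.0f}")
```

Note that the result depends on rounding: the ratio of the rounded rates in Table 2 (98.2 / 0.3) is larger than the ratio obtained from the raw read counts.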
Fig. 3

Comparison of mapped read frequencies for each viral contig between FLDS and total RNA-seq. Each plot represents one viral contig. The rhombus and triangle plots show dsRNA and ssRNA viral contigs, respectively. Axis values of 10^0–10^−7 represent the frequencies of reads in each library. Dotted lines labeled 1×, 10×, 100×, and 1000× indicate the fold increase in viral read frequency in FLDS relative to the RNA-seq analysis. No total RNA-seq reads were mapped to nine of the contigs found in FLDS.

Fig. 4

Comparison of coverage uniformity between FLDS and RNA-seq. DsRNA segments with an average depth of > 200 in RNA-seq were used for the analysis. (A) Coefficient of variation (the ratio of the standard deviation to the mean coverage), plotted on the Y axis for the viral dsRNA segments DCADSRV-1 segment 1 (square), DCADSRV-1 segment 2 (triangle), and DCADSRV-2 (circle). (B–D) Genomic coverage of each viral segment from the FLDS (upper graph) and RNA-seq (lower graph) analysis.
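The coefficient of variation used in panel A is the standard deviation of per-base coverage divided by the mean coverage. A minimal sketch follows; the use of the population standard deviation and the depth values are assumptions made for illustration, as the text does not specify either.

```python
import statistics


def coefficient_of_variation(depths):
    """Standard deviation of per-base depth divided by the mean depth.

    The population standard deviation is used here; this is an assumption.
    """
    mean_depth = statistics.fmean(depths)
    return statistics.pstdev(depths) / mean_depth


# Invented per-base depths: one perfectly even profile, one uneven profile.
even_profile = [100, 100, 100, 100]
uneven_profile = [10, 50, 200, 140]

cv_even = coefficient_of_variation(even_profile)
cv_uneven = coefficient_of_variation(uneven_profile)
print(cv_even, round(cv_uneven, 3))
```

A lower value indicates more uniform coverage along a segment, so two libraries can be compared on the same segment even when their absolute depths differ.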

Phylogenetic analysis and characterization of viral RNA genomes

A phylogenetic analysis of viral RNA replicases (RNA-dependent RNA polymerase; RdRp) presented the phylogenetic relationship between viral genomes from a diatom colony and known RNA viruses (Fig. S4). Viruses belonging to the family Totiviridae harbor non-segmented dsRNA genomes and form isometric virions that infect either fungi or protozoa (21). Thirteen composite genomes of Totiviridae-related viruses were identified and classified into four clades distinct from the five characterized genera of Totiviridae (clades a–d in Fig. S4A). Clade c was the sister clade of the proposed genus “Trichomonasvirus” and clade d included Ustilaginoidea virens RNA virus 1 (UvRV1). In general, −1 ribosomal frameshift signals [the XXXYYYZ motif (4), in which XXX may be any three identical nucleotides, YYY may be either AAA or UUU, and Z may be A, U, or C] or +1 ribosomal frameshift signals [CCCUUUU (14) or UCCUUUCGU (47)] were located in the upstream region of the 2nd CDS, and were used in the expression of overlapping viral genes such as the pol (RdRp) of Totivirus and Leishmaniavirus. These regions were examined in an attempt to better classify the identified viruses. However, as in the case of UvRV1, −1 or +1 ribosomal frameshift signals were not found in any of the Totiviridae genomes obtained in this study. CDSs in the predicted Totiviridae virus-like segments DCAVLRS-3 and DCAVLRS-4 showed significant similarities with the gag (coat protein; CP) and pol (RdRp) of known totiviruses, respectively. Totiviridae genomes consist of a single genome segment that encodes the two essential CDSs, whereas DCAVLRS-3 and -4 lacked pol and gag, respectively. These two segments harbored nine identical 5′-terminal nucleotide sequences, which were distinguishable from the other identified terminal viral sequences. Genomic features implied that DCAVLRS-3 and -4 may be parts of a bisegmented viral genome. 
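The frameshift signals described above are simple sequence patterns, so a scan for them can be sketched with regular expressions. The patterns below are transcribed from the motif definitions in the text; the function name, the 0-based coordinates, and the toy sequence are illustrative assumptions, and this is not necessarily how the regions were examined in this study.

```python
import re

# -1 signal: XXXYYYZ, where XXX is three identical nucleotides,
# YYY is AAA or UUU, and Z is A, U, or C. Lookahead allows overlapping hits.
MINUS1_SIGNAL = re.compile(r"(?=(([ACGU])\2\2(?:AAA|UUU)[AUC]))")
# +1 signals: CCCUUUU or UCCUUUCGU.
PLUS1_SIGNAL = re.compile(r"(?=(CCCUUUU|UCCUUUCGU))")


def find_frameshift_signals(sequence: str):
    """Return 0-based positions and matched motifs for both signal types."""
    rna = sequence.upper().replace("T", "U")
    return {
        "-1": [(m.start(), m.group(1)) for m in MINUS1_SIGNAL.finditer(rna)],
        "+1": [(m.start(), m.group(1)) for m in PLUS1_SIGNAL.finditer(rna)],
    }


# Toy sequence invented for illustration.
hits = find_frameshift_signals("ACGGGGAAACUAUCCUUUCGUGA")
print(hits)
```

Because CCCUUUU also fits the XXXYYYZ pattern, a hit for that sequence is reported under both signal types and would need to be interpreted from its context.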
RdRp in DCADSRV-1 segment 2 showed significant homology with that in fox Picobirnavirus, a member of the Picobirnaviridae, although DCADSRV-1 was phylogenetically distinct from the known Picobirnaviridae viruses (Fig. S4B). Picobirnaviruses are small, non-enveloped, bisegmented dsRNA viruses that infect animals and humans (21). The genome structure of DCADSRV-1 was similar to that of the known Picobirnaviridae (21).

DCADSRV-14 was classified into the genus Deltapartitivirus of the family Partitiviridae based on the predicted RdRp sequence (Fig. S4C) (29). To date, all of the deltapartitiviruses have been identified from plants including the angiosperm, gymnosperm, and chlorophytes (29). rRNA sequences belonging to the Streptophyta, including land plants, were not detected by the RNA-seq analysis, whereas Cladophora sp. of the Chlorophyta, a sister division of Streptophyta, were detected (Table S4).

The CDSs of DCADSRV-15 and a few viral contigs showed significant homology with viruses belonging to the Endornaviridae (dsRNA), Narnaviridae (ssRNA), or Hypoviridae (ssRNA), whose virion formation has not yet been observed.

In the ssRNA viral population, RdRp in DCASSRV-1 showed the highest similarity to Border disease virus-BD31 (E-value = 4 × 10^−15), a member of the genus Pestivirus of the family Flaviviridae, which includes arthropod-borne pathogens of humans and other animals. The genome size and CDS structure of DCASSRV-1 (11.4 kb) were similar to those of Flaviviridae (9.6–12.3 kb) (21), and the phylogenetic tree of RdRp indicated that DCASSRV-1 was not classified into the three known Flaviviridae genera (Fig. S4D). A phylogenetic analysis of RdRp in DCASSRV-2 suggested that the RNA virus was classified into the genus Mitovirus, which has a non-segmented ssRNA genome, infects the mitochondria of fungi, and lacks viral particles (Fig. S4E). The presence of multiple UGA codons suggested that the putative coding strand of DCASSRV-2 was likely to be translated in mitochondria. The genome size of DCASSRV-2 (4.5 kb) was larger than those of the known mitoviruses (2.3–3.6 kb) (17).

Discussion

This study revealed the presence of novel RNA viruses associated with a diatom colony and inferred the unexpected evolutionary relationship between environmental viruses and pathogenic animal viruses. Among the RNA viral genomes obtained in this study, some dominant populations showed greater similarities to fungal viruses than to known diatom viruses; however, several ssDNA and ssRNA viruses have already been identified from marine diatoms (20, 43). We cannot exclude the possibility that these viral genomes were derived from fungi associated with a diatom colony, but it is more likely that they came from the major components of a diatom colony because of their high abundance in the RNA viral metagenomic library. Since extracellular viral particles have been a major target of virus surveillance and isolation, information on intracellular viruses in microorganisms is very limited (13, 37). Therefore, the accumulation of knowledge on intracellular RNA viruses infecting diverse host organisms is essential for understanding the evolution and distribution of RNA viruses. FLDS revealed 22 full-length and some partial composite viral RNA genomes associated with a diatom colony by de novo assembly. These were classified into five dsRNA (Totiviridae, Endornaviridae, Picobirnaviridae, Cystoviridae, and Partitiviridae) and four ssRNA (Flaviviridae, Narnaviridae, Virgaviridae, and Hypoviridae) virus families. To the best of our knowledge, this is the largest number of full-length genome sequences of novel RNA viruses identified in one metagenomic library. The viral RNA community successfully detected in this study consisted of dsRNA viruses with or without virion formation and ssRNA viruses detected as replicative intermediates. Our results suggest that FLDS has the potential to detect a wide range of RNA viruses, excluding retroviruses. Several studies have been performed using metagenomic analyses targeting dsRNA with Next-Generation Sequencing technology. 
In these studies, viral read abundance reached a maximum of 52.7% (2). In contrast, FLDS provided extremely high viral read abundance. The improvement in viral read rates with FLDS was likely derived from [1] a combination of repeating cellulose powder column chromatography and subsequent enzymatic treatment, [2] the fragmentation and efficient thermal denaturation of dsRNA prior to cDNA synthesis, and [3] the selective duplex formation of dsRNA-derived cDNA prior to PCR amplification. Furthermore, FLDS also presented advantages in reconstructing complete genome sequences including terminal regions, which are difficult to obtain using RNA-seq and random priming methods (3). The complete sequences of viral RNA segments are beneficial for the identification of RNA virus segments, particularly in cases in which coding CDSs did not show significant similarities with viral CDSs in databases. The application of a fulllength dsRNA cloning method using a PC3-T7 loop primer (8) to fragmented dsRNA enabled us to determine the terminal regions of long dsRNA genomes. Since T4 RNA ligase requires a 5′ phosphoryl-terminated nucleic acid donor (PC3-T7 loop primer) and 3′ hydroxyl-terminated nucleic acid acceptor for ligation activity, dsRNA fragments with 3′ terminal phosphate were not used as substrates. The terminal structures of dsRNA fragmented by ultrasound have not been reported. However, in the case of dsDNA fragmented by ultrasound, double-strand breaks occur preferentially in 5′-CpG-3′ dinucleotides, and the phosphate group is at the 5′ side of G in the products (15). In this study, fragmented dsRNAs were successfully converted into cDNA and amplified. Taking this into consideration, dsRNA fragmentation using ultrasound with Covaris S220 also produced 3′ hydroxyl-terminated fragments. Furthermore, the lack of any modifications to the 3′ hydroxyl-terminal of viral RNA genomes (21) also allowed us to retrieve the terminal regions of the RNA viral genome. 
Total RNA-seq is generally considered to be a less-biased method for identifying RNA viruses, despite its very low abundance of viral reads. In the present study, FLDS enriched viral RNA reads by more than 300-fold relative to total RNA-seq (Table 2). Notably, FLDS produced significantly more ssRNA viral reads than total RNA-seq; however, FLDS can only detect ssRNA viruses at the replicative stage. Moreover, FLDS showed more uniform read coverage than RNA-seq. These results indicate that FLDS is more effective than total RNA-seq for revealing all RNA viruses in cellular organisms.
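The two comparisons made above, fold enrichment of viral reads and uniformity of read coverage, can be sketched as simple calculations. The following Python example is only an illustration under stated assumptions: the function names are hypothetical, the read counts and depth values are invented for demonstration and are not the values in Table 2, and the coefficient of variation is used here as one possible measure of coverage uniformity rather than the measure used in this study.

```python
from statistics import mean, pstdev
from typing import Sequence


def viral_read_fraction(viral_reads: int, total_reads: int) -> float:
    """Fraction of reads in a library that map to viral sequences."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return viral_reads / total_reads


def fold_enrichment(target_fraction: float, baseline_fraction: float) -> float:
    """Ratio of the viral read fraction in one library to that in a baseline."""
    if baseline_fraction <= 0:
        raise ValueError("baseline_fraction must be positive")
    return target_fraction / baseline_fraction


def coverage_cv(per_base_depth: Sequence[float]) -> float:
    """Coefficient of variation of read depth; lower values mean more uniform coverage."""
    average = mean(per_base_depth)
    if average == 0:
        raise ValueError("mean depth is zero")
    return pstdev(per_base_depth) / average


# Hypothetical read counts (NOT the values reported in Table 2).
flds_fraction = viral_read_fraction(viral_reads=900_000, total_reads=1_000_000)
rnaseq_fraction = viral_read_fraction(viral_reads=2_000, total_reads=1_000_000)
enrichment = fold_enrichment(flds_fraction, rnaseq_fraction)

# Hypothetical per-base depths along one genome segment for each method.
uniform_depth = [50, 50, 50, 50]
skewed_depth = [5, 10, 100, 85]

print(f"fold enrichment: {enrichment:.0f}")
print(f"CV (uniform): {coverage_cv(uniform_depth):.2f}")
print(f"CV (skewed): {coverage_cv(skewed_depth):.2f}")
```

In practice, the viral read counts would come from mapping reads to the assembled viral contigs, and per-base depth would be taken from the same alignments.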
  41 in total

Review 1.  Genome packaging in influenza A virus.

Authors:  Edward C Hutchinson; Johann C von Kirchbach; Julia R Gog; Paul Digard
Journal:  J Gen Virol       Date:  2009-12-02       Impact factor: 3.891

Review 2.  50-plus years of fungal viruses.

Authors:  Said A Ghabrial; José R Castón; Daohong Jiang; Max L Nibert; Nobuhiro Suzuki
Journal:  Virology       Date:  2015-03-13       Impact factor: 3.616

Review 3.  Taxonomic reorganization of family Partitiviridae and other recent progress in partitivirus research.

Authors:  Max L Nibert; Said A Ghabrial; Edgar Maiss; Till Lesker; Eeva J Vainio; Daohong Jiang; Nobuhiro Suzuki
Journal:  Virus Res       Date:  2014-04-21       Impact factor: 3.303

4.  Discovery of two novel viruses expands the diversity of single-stranded DNA and single-stranded RNA viruses infecting a cosmopolitan marine diatom.

Authors:  Kei Kimura; Yuji Tomaru
Journal:  Appl Environ Microbiol       Date:  2014-12-01       Impact factor: 4.792

5.  BLAST+: architecture and applications.

Authors:  Christiam Camacho; George Coulouris; Vahram Avagyan; Ning Ma; Jason Papadopoulos; Kevin Bealer; Thomas L Madden
Journal:  BMC Bioinformatics       Date:  2009-12-15       Impact factor: 3.169

6.  Enhanced methods for unbiased deep sequencing of Lassa and Ebola RNA viruses from clinical and biological samples.

Authors:  Christian B Matranga; Kristian G Andersen; Sarah Winnicki; Michele Busby; Adrianne D Gladden; Ryan Tewhey; Matthew Stremlau; Aaron Berlin; Stephen K Gire; Eleina England; Lina M Moses; Tarjei S Mikkelsen; Ikponmwonsa Odia; Philomena E Ehiane; Onikepe Folarin; Augustine Goba; S Humarr Kahn; Donald S Grant; Anna Honko; Lisa Hensley; Christian Happi; Robert F Garry; Christine M Malboeuf; Bruce W Birren; Andreas Gnirke; Joshua Z Levin; Pardis C Sabeti
Journal:  Genome Biol       Date:  2014       Impact factor: 13.583

7.  Comparison of tissue sample processing methods for harvesting the viral metagenome and a snapshot of the RNA viral community in a turkey gut.

Authors:  Jigna D Shah; Joshua Baller; Ying Zhang; Kevin Silverstein; Zheng Xing; Carol J Cardona
Journal:  J Virol Methods       Date:  2014-09-01       Impact factor: 2.014

8.  Evolution and phylogenetic analysis of full-length VP3 genes of Eastern Mediterranean bluetongue virus isolates.

Authors:  Kyriaki Nomikou; Chrysostomos I Dovas; Sushila Maan; Simon J Anthony; Alan R Samuel; Maria Papanastassopoulou; Narender S Maan; Olga Mangana; Peter P C Mertens
Journal:  PLoS One       Date:  2009-07-30       Impact factor: 3.240

9.  Metagenomic analysis of RNA viruses in a fresh water lake.

Authors:  Appolinaire Djikeng; Ryan Kuzmickas; Norman G Anderson; David J Spiro
Journal:  PLoS One       Date:  2009-09-29       Impact factor: 3.240

10.  Mutational analysis of the "slippery-sequence" component of a coronavirus ribosomal frameshifting signal.

Authors:  I Brierley; A J Jenner; S C Inglis
Journal:  J Mol Biol       Date:  1992-09-20       Impact factor: 5.469
