Evelien M Adriaenssens, Kata Farkas, Christian Harrison, David L Jones, Heather E Allison, Alan J McCarthy.
Abstract
Detection of viruses in the environment is heavily dependent on PCR-based approaches that require reference sequences for primer design. While this strategy can accurately detect known viruses, it will not find novel genotypes or emerging and invasive viral species. In this study, we investigated the use of viromics, i.e., high-throughput sequencing of the biosphere's viral fraction, to detect human-/animal-pathogenic RNA viruses in the Conwy river catchment area in Wales, United Kingdom. Using a combination of filtering and nuclease treatment, we extracted the viral fraction from wastewater and estuarine river water and sediment, followed by high-throughput RNA sequencing (RNA-Seq) analysis on the Illumina HiSeq platform, for the discovery of RNA virus genomes. We found a higher richness of RNA viruses in wastewater samples than in river water and sediment, and we assembled a complete norovirus genotype GI.2 genome from wastewater effluent, which was not contemporaneously detected by conventional reverse transcription-quantitative PCR (qRT-PCR). The simultaneous presence of diverse rotavirus signatures in wastewater indicated the potential for zoonotic infections in the area and suggested runoff from pig farms as a possible origin of these viruses. Our results show that viromics can be an important tool in the discovery of pathogenic viruses in the environment and can be used to inform and optimize reference-based detection methods provided appropriate and rigorous controls are included.
IMPORTANCE Enteric viruses cause gastrointestinal illness and are commonly transmitted through the fecal-oral route. When wastewater is released into river systems, these viruses can contaminate the environment. Our results show that we can use viromics to find the range of potentially pathogenic viruses that are present in the environment and identify prevalent genotypes. The ultimate goal is to trace the fate of these pathogenic viruses from origin to the point where they are a threat to human health, informing reference-based detection methods and water quality management.
Keywords: RNA viruses; norovirus; pathogen detection; rotavirus; viromics; wastewater
Year: 2018 PMID: 29795788 PMCID: PMC5964442 DOI: 10.1128/mSystems.00025-18
Source DB: PubMed Journal: mSystems ISSN: 2379-5077 Impact factor: 6.496
FIG 1 Map of the sampling locations, indicated with blue arrows. WWTP, wastewater treatment plant. Data in the left panel were taken from Google Maps (Map data ©Google 2017).
Summary of viromic and qRT-PCR detection of specific RNA viruses across sewage, estuarine water, and sediment samples
| Sample | Sample volume or mass | Location | No. of contigs | Target RNA viruses detected in virome | qRT-PCR results (gc/liter) |
|---|---|---|---|---|---|
| LI_13-9 | 1 liter | Llanrwst WWTP | 5,721 | RVA, RVC, PBV, SaV | NoVGII (1,457) |
| LE_13-9 | 1 liter | Llanrwst WWTP | 2,201 | RVA, RVC, PBV | NoVGII (1,251) |
| LI_11-10 | 1 liter | Llanrwst WWTP | 859 | PBV | NoVGII (detected) |
| LE_11-10 | 1 liter | Llanrwst WWTP | 5,433 | NoVGI, RVA, RVC, PBV, AsV | NoVGII (50,180) |
| SW | 50 liters | Morfa beach | 243 | | |
| Sed1 | 60 g | Morfa beach | 550 | | |
| Sed2 | 60 g | Morfa beach | 550 | | |
LI, sewage influent; LE, sewage effluent; SW, estuarine surface water; Sed, estuarine sediment.
RVA, rotavirus A; RVC, rotavirus C; PBV, picobirnavirus; SaV, sapovirus; NoVGI, norovirus genogroup I; NoVGII, norovirus genogroup II; AsV, astrovirus.
Samples were tested with qRT-PCR for the following targets: NoVGI, NoVGII, SaV, HAV (hepatitis A virus), and HEV (hepatitis E virus). Results are reported in genome copies per liter (gc/liter). In sample LI_11-10, NoVGII was detected at a level below the limit of quantification (approximately 200 gc/liter). NoVGII was the only target virus detected by qRT-PCR.
WWTP, wastewater treatment plant.
Samples Sed1 and Sed2 were assembled together into the contig data set Sed.
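The reporting convention in the qRT-PCR footnote (a count in gc/liter, or "detected" when the signal falls below the limit of quantification) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name `report_qrt_pcr`, the use of `None` for no amplification, and the "not detected" label are assumptions made for the example; only the approximate 200 gc/liter threshold comes from the footnote.

```python
from typing import Optional

# Approximate limit of quantification quoted in the table footnote.
LOQ_GC_PER_LITER = 200.0


def report_qrt_pcr(gc_per_liter: Optional[float],
                   loq: float = LOQ_GC_PER_LITER) -> str:
    """Format one qRT-PCR measurement (hypothetical helper).

    None stands for "no amplification" (an assumption of this sketch);
    a value below the LOQ is reported qualitatively; anything else is
    reported as a rounded count in gc/liter.
    """
    if gc_per_liter is None:
        return "not detected"
    if gc_per_liter < loq:
        return "detected"
    return f"{gc_per_liter:,.0f}"


if __name__ == "__main__":
    for value in (1457, 50, None):
        print(value, "->", report_qrt_pcr(value))
```

Keeping the threshold as a parameter makes it easy to apply a different limit of quantification for another assay.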
FIG 2 Taxonomic distribution of curated read data (relative abundance) at the virus family level. Reads were assigned to a family or equivalent group by MEGAN6 using a lowest-common-ancestor algorithm, based on blastx-based homology using the program Diamond with the RefSeq Viral protein database (January 2017 version) and the nonredundant protein database (May 2017 version). Only viral groupings are shown. LI, sewage influent; LE, sewage effluent; SW, estuarine surface water; Sed, estuarine sediment.
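The idea behind a lowest-common-ancestor assignment can be sketched in a few lines: when a read matches several reference proteins, it is placed at the deepest taxonomic node that all matches share. The sketch below is a simplification under stated assumptions. It does not reproduce MEGAN6, which applies additional score and support thresholds, and the function name and the example lineages are illustrative only.

```python
from typing import List, Sequence


def lowest_common_ancestor(lineages: Sequence[Sequence[str]]) -> List[str]:
    """Return the longest shared prefix of root-to-leaf lineages.

    Each lineage lists taxon names from the root downward. The last
    element of the result is the lowest common ancestor.
    """
    if not lineages:
        return []
    shared: List[str] = []
    # zip stops at the shortest lineage, so no index can run out of range.
    for ranks in zip(*lineages):
        if all(rank == ranks[0] for rank in ranks):
            shared.append(ranks[0])
        else:
            break
    return shared


if __name__ == "__main__":
    # Illustrative hits for one read: two genera of the same family.
    hits = [
        ["Viruses", "Caliciviridae", "Norovirus", "Norwalk virus"],
        ["Viruses", "Caliciviridae", "Sapovirus", "Sapporo virus"],
    ]
    print(lowest_common_ancestor(hits))
```

In this example the read would be counted at the family level, because the two hits disagree below that rank.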
FIG 3 Heatmap of viral richness at the family level per sample. Heatmap colors denote relative abundances per sample. Contigs larger than 300 nucleotides (nt) were assigned to a family or grouping by MEGAN6 using a lowest-common-ancestor algorithm, based on blastx-based homology using the program Diamond with the RefSeq Viral protein database (January 2017 version) and the nonredundant protein database (May 2017 version). Only those families/groups comprising large contigs (>1,000 nt) or with contigs mapping to viral signature genes (e.g., capsid and RNA-dependent RNA polymerase genes) were retained. LI, sewage influent; LE, sewage effluent; SW, estuarine surface water; Sed, estuarine sediment.
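The retention rule in the FIG 3 caption can be read as a two-step filter: contigs must exceed 300 nt to be assigned at all, and a family is kept only if it has a contig longer than 1,000 nt or a contig that maps to a signature gene. A minimal sketch of that reading follows. The `Contig` record, its field names, and the example values are hypothetical, and the caption does not say how signature-gene hits were determined.

```python
from dataclasses import dataclass
from typing import Iterable, Set


@dataclass
class Contig:
    family: str          # family or grouping assigned upstream
    length: int          # contig length in nucleotides
    signature_hit: bool  # True if it maps to a capsid or RdRP gene


def retained_families(contigs: Iterable[Contig],
                      min_len: int = 300,
                      large_len: int = 1000) -> Set[str]:
    """Return the families that pass the caption's retention rule."""
    kept: Set[str] = set()
    for contig in contigs:
        if contig.length <= min_len:
            continue  # only contigs larger than 300 nt were assigned
        if contig.length > large_len or contig.signature_hit:
            kept.add(contig.family)
    return kept


if __name__ == "__main__":
    # Made-up contigs, for illustration only.
    example = [
        Contig("Caliciviridae", 7758, True),
        Contig("Picobirnaviridae", 450, True),
        Contig("Reoviridae", 600, False),
        Contig("Astroviridae", 250, True),
    ]
    print(sorted(retained_families(example)))
```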
FIG 4 Pairwise genome comparison between the virome’s norovirus genome (middle) and its closest relatives, norovirus Hu/GI.2/Jingzhou/2013401/CHN and norovirus Hu/GI.2/Leuven/2003/BEL. BLASTN similarity is indicated in shades of gray. ORFs are delineated by dark-blue arrows. Deviations from the average GC content are indicated above the genomes in a green and purple graph. The qRT-PCR primer binding sites for the wastewater (WW)-associated genome are indicated by light-blue rectangles. The figure was created with Easyfig (92).
FIG 5 Maximum-likelihood phylogenetic tree of norovirus genomes belonging to genogroup GI, with the norovirus GII reference genome as an outgroup. The nucleotide sequences were aligned with MUSCLE, and the alignment was trimmed to the length of contig 6 of the LE_11-10 virome sequence, resulting in 7,758 positions analyzed for tree building. The maximum-likelihood method was used, with the Tamura-Nei model for nucleotide substitution. The percentages of trees in which the associated taxa clustered together are shown next to the branches. The scale bar represents the number of substitutions per site.
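Trimming an alignment to the length of one sequence, as described in the FIG 5 caption, can be sketched as cutting every row to the columns spanned by that sequence. The caption does not say how the trimming was performed, so this is only one plausible interpretation; the function name `trim_to_reference` and the toy alignment are assumptions of the example.

```python
from typing import Dict


def trim_to_reference(alignment: Dict[str, str], ref_name: str) -> Dict[str, str]:
    """Cut all aligned rows to the span covered by the reference row.

    The span runs from the first to the last non-gap column of the
    reference; internal gap columns are left untouched.
    """
    ref = alignment[ref_name]
    positions = [i for i, ch in enumerate(ref) if ch != "-"]
    if not positions:
        raise ValueError("reference row contains only gaps")
    start, end = positions[0], positions[-1] + 1
    return {name: seq[start:end] for name, seq in alignment.items()}


if __name__ == "__main__":
    # Toy alignment; rows must have equal length, as in any alignment.
    toy = {
        "contig": "--ACGT-A--",
        "relative": "TTACGTCAGG",
    }
    for name, row in trim_to_reference(toy, "contig").items():
        print(name, row)
```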
Rotavirus A and C genome information and detection in the LI_13-9 sample data set
| Virus, segment | Length (nt) | Protein(s) | Predicted function | No. of contigs | Putative genotype(s) | Potential host(s) |
|---|---|---|---|---|---|---|
| RVA | | | | | | |
| 1 | 3,302 | VP1 | RNA-dependent RNA polymerase | 7 | R2 | Human, cow |
| 2 | 2,693 | VP2 | Core capsid protein | 1 | C2 | Human |
| 3 | 2,591 | VP3 | RNA capping protein | 1 | M2 | Human, sheep |
| 4 | 2,363 | VP4 | Outer capsid spike protein | 3 | P[1], P[41], P[14] | Human, pig, alpaca, monkey |
| 5 | 1,614 | NSP1 | Interferon antagonist protein | 6 | A3, A11 | Human, cow, pig, deer |
| 6 | 1,356 | VP6 | Inner capsid protein | 1 | I2 | Human |
| 7 | 1,105 | NSP3 | Translation effector protein | 4 | T6 | Human, dog, cow |
| 8 | 1,059 | NSP2 | Viroplasm RNA binding protein | 0 | | |
| 9 | 1,062 | VP7 | Outer capsid glycoprotein | 2 | G10, G8 | Cow, human |
| 10 | 751 | NSP4 | Enterotoxin | 1 | E2 | Human, cow |
| 11 | 667 | NSP5 and -6 | Phosphoprotein, nonstructural protein | 0 | | |
| RVC | | | | | | |
| 1 | 3,309 | VP1 | RNA-dependent RNA polymerase | 7 (0) | Rx | Pig, cow |
| 2 | 2,736 | VP2 | Core capsid protein | 4 (2) | Cx | Pig, dog |
| 3 | 2,283 | VP4 | Outer capsid protein | 2 (4) | P[x] | Pig |
| 4 | 2,166 | VP3 | Guanylyl transferase | 6 (0) | Mx | Pig |
| 5 | 1,353 | VP6 | Inner capsid protein | 1 (0) | Ix | Pig |
| 6 | 1,350 | NSP3 | | 0 (1) | Tx | Human |
| 7 | 1,270 | NSP1 | | 0 (2) | Ax | Pig, dog |
| 8 | 1,063 | VP7 | Outer capsid glycoprotein | 0 (2) | Gx | Pig |
| 9 | 1,037 | NSP2 | | 2 (0) | Nx | Pig |
| 10 | 730 | NSP5 | | 0 (0) | | |
| 11 | 613 | NSP4 | Enterotoxin | 0 (4) | Ex | Pig |
Numbers of predicted RVCX contigs, i.e., contigs with only limited amino acid similarity to RVC, are given in parentheses.
Potential hosts are defined as the hosts of the reference rotavirus sequence with the highest similarity to the contigs found in the virome sample LI_13-9.
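The definition of potential hosts given in the footnote amounts to a best-hit lookup: for each contig, take the reference sequence with the highest similarity and record the host of that reference. A minimal sketch follows. The `Hit` record, the use of percent identity as the similarity measure, and all example values are hypothetical; the footnote does not specify which similarity score was used.

```python
from typing import Dict, Iterable, NamedTuple, Set


class Hit(NamedTuple):
    contig: str      # contig identifier
    reference: str   # reference sequence identifier
    identity: float  # similarity score (percent identity assumed here)
    host: str        # host recorded for the reference sequence


def potential_hosts(hits: Iterable[Hit]) -> Set[str]:
    """Collect the host of the best-scoring reference for each contig."""
    best: Dict[str, Hit] = {}
    for hit in hits:
        current = best.get(hit.contig)
        if current is None or hit.identity > current.identity:
            best[hit.contig] = hit
    return {hit.host for hit in best.values()}


if __name__ == "__main__":
    # Made-up hits for two contigs of one genome segment.
    example = [
        Hit("contig_1", "ref_a", 91.0, "Human"),
        Hit("contig_1", "ref_b", 88.5, "Pig"),
        Hit("contig_2", "ref_c", 95.2, "Cow"),
    ]
    print(sorted(potential_hosts(example)))
```

Ties are resolved in favor of the first hit seen, which is a choice of this sketch rather than something stated in the source.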
FIG 6 Picobirnavirus diversity. (A) Maximum-likelihood phylogenetic tree of RdRP amino acid sequences of isolated and virome picobirnaviruses. Sequences from isolates are indicated with white dots and virome-derived sequences with filled colored dots, as follows: sample LI_11-10 in purple, sample LE_11-10 in blue, and sample LI_13-9 in red. Sequences were aligned using MUSCLE, providing 114 amino acid positions for tree generation. The maximum-likelihood method was used, with the JTT matrix-based model. The scale bar represents the number of substitutions per site. The bootstrap values of all branches were low. (B) Predicted ribosome binding site consensus sequence from extracted 5′ UTRs. The logo was produced using the MEME Suite.