Ashleigh F Porter, Joanna Cobbin, Ci-Xiu Li, John-Sebastian Eden, Edward C Holmes.
Abstract
Metagenomic next-generation sequencing has transformed the discovery and diagnosis of infectious disease, with the power to characterise the complete 'infectome' (bacteria, viruses, fungi, parasites) of an individual host organism. However, the identification of novel pathogens has been complicated by widespread microbial contamination in commonly used laboratory reagents. Using total RNA sequencing ("metatranscriptomics") we documented the presence of contaminant viral sequences in multiple 'blank' negative control sequencing libraries that comprise a sterile water and reagent mix. Accordingly, we identified 14 viral sequences in 7 negative control sequencing libraries. As in previous studies, several circular replication-associated protein encoding (CRESS) DNA virus-like sequences were recovered in the blank control libraries, as well as contaminating sequences from the Totiviridae, Tombusviridae and Lentiviridae families of RNA viruses. These data suggest that viral contamination of common laboratory reagents is likely commonplace and can comprise a wide variety of viruses.
Keywords: Circoviridae; Lentiviridae; Tombusviridae; Totiviridae; metatranscriptomics; reagent contamination; virology
Year: 2021 PMID: 34834931 PMCID: PMC8625350 DOI: 10.3390/v13112122
Source DB: PubMed Journal: Viruses ISSN: 1999-4915 Impact factor: 5.048
Experimental conditions of each blank negative control sample utilised here.
| Library Name | Sequencing Platform | RNA Extraction | Library Preparation | Data Generated | SRA Library Accession |
|---|---|---|---|---|---|
| L1 | Illumina NovaSeq 6000 150 cycle kit (2 × 75 nt reads) | RNeasy Plus Universal Kits (Qiagen, Hilden, Germany) | Trio RNA-seq + UDI (NuGEN) | 11,940,824 paired reads (1.8 Gb) | SRR14737471 |
| L2 | Illumina NovaSeq 6000 150 cycle kit (2 × 75 nt reads) | RNeasy Plus Universal Kits (Qiagen, Hilden, Germany) | Trio RNA-seq + UDI (NuGEN) | 57,606,392 paired reads (8.7 Gb) | SRR14737470 |
| L3 | Illumina MiSeq, v3 150 cycle kit (2 × 75 nt reads) | RNeasy Plus Mini Kit (Qiagen, Hilden, Germany) | SMARTer Stranded Total RNA-Seq Kit v2-Pico Input Mammalian (Clontech) | 4,156,504 paired reads (0.63 Gb) | SRR10069984 |
| L4 | Illumina NextSeq 500, mid-output 150 cycle kit (2 × 75 nt reads) | Total RNA Purification Kit (Norgen Biotek, Thorold, ON, Canada) | SMARTer Stranded Total RNA-Seq Kit v2-Pico Input Mammalian (Clontech) | 32,279,914 paired reads (4.91 Gb) | SRR14737469 |
| L5 | Illumina MiSeq 150 cycle kit (2 × 75 nt reads) | Total RNA Purification Kit (Norgen Biotek, Thorold, ON, Canada) | SMARTer Stranded Total RNA-Seq Kit v2-Pico Input Mammalian (Clontech) | 7,342,876 paired reads | SRR15221433 |
| L6 | Illumina MiSeq 150 cycle kit (2 × 75 nt reads) | Total RNA Purification Kit (Norgen Biotek, Thorold, ON, Canada) | SMARTer Stranded Total RNA-Seq Kit v2-Pico Input Mammalian (Clontech) | 10,978,253 | SRR15221432 |
| L7 | Illumina MiSeq 150 cycle kit (2 × 75 nt reads) | Total RNA Purification Kit (Norgen Biotek, Thorold, ON, Canada) | SMARTer Stranded Total RNA-Seq Kit v2-Pico Input Mammalian (Clontech) | 8,564,269 | SRR14737466 |
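The "Data Generated" column pairs a read count with an approximate yield in gigabases. As a rough consistency check, the yield can be estimated from the read count, assuming that each "paired read" is one read pair contributing 2 × 75 nt and that no trimming is applied. The function name and these assumptions are illustrative and are not taken from the study.

```python
def estimated_yield_gb(read_pairs: int, read_length_nt: int = 75) -> float:
    """Approximate sequencing yield in gigabases for paired-end data.

    Assumes every pair contributes two untrimmed reads of read_length_nt.
    """
    return read_pairs * 2 * read_length_nt / 1e9


# Read-pair counts copied from the table above (libraries L1 and L3).
libraries = {"L1": 11_940_824, "L3": 4_156_504}

for name, pairs in libraries.items():
    print(f"{name}: ~{estimated_yield_gb(pairs):.2f} Gb")
```

Under these assumptions the estimates for L1 and L3 land close to the reported 1.8 Gb and 0.63 Gb; small differences are expected from rounding and from any reads that are not full length.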
Reference proteins for each sequence alignment performed in this analysis.
| Reference Protein | Protein Acronym | Virus | Number of Sequences in Analysis | Alignment Length (Amino Acid, AA) |
|---|---|---|---|---|
| Viral replicase protein | Rep | CRESS | 221 | 672 AA |
| Viral replicase protein | Rep | | 69 | 161 AA |
| Polymerase peptide | Pol | | 11 | 478 AA |
| RNA-dependent RNA polymerase | RdRp | | 95 | 125 AA |
| RNA-dependent RNA polymerase | RdRp | | 87 | 256 AA |
Figure 1. Abundance of viral reads in libraries L1, L2, L3, and L7. Visual representation of the virus-associated reads in the respective libraries, with pie charts depicting the total number of reads mapped to long (>800 bp) virus-associated contigs (orange) compared to all the virus-associated reads (blue). (A–D) Each bar chart denotes the proportion of contigs associated with different virus families in the respective libraries.
Novel reagent-associated viral sequences identified in this study.
| Virus name | Accession | Library Abundance (%) of Total Reads (rRNA Removed) | Length (bp) | Library |
|---|---|---|---|---|
| Reagent-associated tombus-like virus 1 | MZ824229 | 1.28 | 1204 | L3 |
| Reagent-associated tombus-like virus 2 | MZ824228 | 0.46 | 828 | L3 |
| Reagent-associated tombus-like virus 3 | MZ824227 | 1.08 | 1574 | L3 |
| Reagent-associated tombus-like virus 4 | MZ824226 | 1.29 | 1410 | L3 |
| Reagent-associated toti-like virus | MZ824225 | 0.001 | 920 | L2 |
| Reagent-associated lenti-like virus | MZ824230 | 0.004 | 962 | L2 |
| Reagent-associated CRESS-like virus 1 | MZ824237 | 0.78 | 3878 | L1 |
| Reagent-associated CRESS-like virus 2 | MZ824236 | 0.24 | 2377 | L1 |
| Reagent-associated CRESS-like virus 3 | MZ824235 | 0.02 | 1592 | L1 |
| Reagent-associated CRESS-like virus 4 | MZ824234 | 2.89 | 2663 | L3 |
| Reagent-associated CRESS-like virus 5 | MZ824233 | 9.66 | 3027 | L3 |
| Reagent-associated CRESS-like virus 6 | MZ824232 | 4.98 | 3517 | L3 |
| Reagent-associated CRESS-like virus 7 | MZ824231 | 0.01 | 1124 | L1 |
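To see how the listed sequences are distributed across the blank libraries, the table rows can be grouped by library. The sketch below copies the abundance and library columns from the table; the `summarise` helper is hypothetical, and adding abundances together assumes that no read is counted towards more than one sequence.

```python
from collections import defaultdict

# (virus name, abundance as % of non-rRNA reads, library), from the table above.
records = [
    ("tombus-like virus 1", 1.28, "L3"),
    ("tombus-like virus 2", 0.46, "L3"),
    ("tombus-like virus 3", 1.08, "L3"),
    ("tombus-like virus 4", 1.29, "L3"),
    ("toti-like virus", 0.001, "L2"),
    ("lenti-like virus", 0.004, "L2"),
    ("CRESS-like virus 1", 0.78, "L1"),
    ("CRESS-like virus 2", 0.24, "L1"),
    ("CRESS-like virus 3", 0.02, "L1"),
    ("CRESS-like virus 4", 2.89, "L3"),
    ("CRESS-like virus 5", 9.66, "L3"),
    ("CRESS-like virus 6", 4.98, "L3"),
    ("CRESS-like virus 7", 0.01, "L1"),
]


def summarise(rows):
    """Count sequences and sum their abundance (%) for each library."""
    totals = defaultdict(lambda: {"sequences": 0, "abundance_pct": 0.0})
    for _name, abundance, library in rows:
        totals[library]["sequences"] += 1
        totals[library]["abundance_pct"] += abundance
    return dict(totals)


summary = summarise(records)
for library in sorted(summary):
    stats = summary[library]
    print(f"{library}: {stats['sequences']} sequences, "
          f"{stats['abundance_pct']:.3f}% of non-rRNA reads")
```

Grouped this way, most of the listed sequences, and by far the largest share of contaminant reads, come from library L3.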
Figure 2. Phylogenetic relationships of CRESS (ssDNA) viruses, including the seven novel CRESS-like viruses identified here and highlighted in red (Reagent-associated CRESS-like viruses 1–7). Reagent-associated sequences determined previously are highlighted in blue. The clades that included the novel CRESS-like viruses identified here (A,B,G) are magnified on the right. The tree and other clades (C–F) are shown in higher resolution in Supplementary Figure S1. The tree was mid-point rooted for clarity purposes only. Bootstrap values greater than 70% are represented by asterisks next to nodes. All horizontal branch lengths are scaled according to the number of amino acid substitutions per site.
Figure 3. Phylogenetic relationships of the ssDNA virus family Circoviridae based on hypothesised “host-associated” circoviruses. The tree has two major clades, comprising the circovirus clade associated with vertebrate hosts (highlighted in blue) and the cyclovirus clade previously associated with invertebrate hosts (highlighted in green). For clarity, the tree is mid-point rooted. Bootstrap values greater than 70% are represented by asterisks next to nodes. All horizontal branch lengths are scaled according to the number of amino acid substitutions per site.
Figure 4. Phylogenetic relationships of the RNA virus family Lentiviridae, including the novel reagent-associated lenti-like virus sequence identified in this study. This virus is highlighted in red and falls within the EIAV clade.
Figure 5. Phylogenetic relationships of the RNA virus family Tombusviridae, including the seven novel viruses identified in this study (highlighted in red). The phylogeny was mid-point rooted for clarity purposes only. Bootstrap values greater than 70% are represented by asterisks next to nodes. All horizontal branch lengths are scaled according to the number of amino acid substitutions per site.
Figure 6. Phylogenetic relationships of the RNA virus family Totiviridae, including the novel virus identified in this study, Reagent-associated toti-like virus (highlighted in red). For clarity, the tree was mid-point rooted. Bootstrap values greater than 70% are represented by asterisks next to nodes. All horizontal branch lengths are scaled according to the number of amino acid substitutions per site.