Marie-Emilie A Gauthier1, Ruvini V Lelwala1,2, Candace E Elliott2, Craig Windell1, Sonia Fiorito3, Adrian Dinsdale3, Mark Whattam3, Julie Pattemore2, Roberto A Barrero1.
Abstract
Rapid and safe access to new plant genetic stocks is crucial for primary plant industries to remain profitable, sustainable, and internationally competitive. Imported plant species may spend several years in Post Entry Quarantine (PEQ) facilities undergoing pathogen testing, which can limit the ability of plant industries to adapt quickly to new global market opportunities by accessing new varieties. Advances in high throughput sequencing (HTS) technologies provide new opportunities for a broad range of fields, including phytosanitary diagnostics. In this study, we compare the performance of two HTS methods (RNA-Seq and sRNA-Seq) with that of existing PEQ molecular assays in detecting and identifying viruses and viroids from various plant commodities. To analyze the data, we tested several bioinformatics tools that rely on different approaches, including direct-read, de novo, and reference-guided assembly. We implemented VirReport, a new portable, scalable, and reproducible Nextflow pipeline that analyzes sRNA datasets to detect and identify viruses and viroids. We raise awareness of the need to evaluate cross-sample contamination when analyzing HTS data routinely and to use methods that mitigate index cross-talk. Overall, our results suggest that sRNA analyzed using VirReport provides opportunities to improve quarantine testing at PEQ by detecting all regulated exotic viruses from imported plants in a single assay.
Keywords: high throughput sequencing; phytosanitary diagnostic assay; plant siRNA; plant virus and viroid detection; post-entry quarantine facility
Year: 2022 PMID: 35205129 PMCID: PMC8868628 DOI: 10.3390/biology11020263
Source DB: PubMed Journal: Biology (Basel) ISSN: 2079-7737
Selected PEQ positive control plants infected with known viruses and viroids. The presence of regulated viruses and viroids was detected using molecular assays (PCR and ELISA) and bioassays (biological and woody indexing).
| Sample ID | Commodity | Species | Positive Detections in PEQ |
|---|---|---|---|
| MT001 | Citrus | Citrus Troyer × Frost-Lisbon | CEVd |
| MT002 | Prunus | | PNRSV |
| MT003 | Citrus | | CTV, CVEV, CDVd, HSVd |
| MT004 | Citrus | | CEVd, CTV, HSVd |
| MT005 | Raspberry | | RBDV |
| MT007 | Citrus | | CDVd, HSVd |
| MT008 | Citrus | | CDVd, HSVd |
| MT010 | Ornamental grass | | Novel potyvirus (MsiMV) |
| MT011 | Citrus | | CTV, CVd-VI, HSVd |
| MT012 | Iris | | ISMV, TRSV |
| MT013 | Strawberry | | SMoV |
| MT014 | Strawberry | | SMoV |
| MT015 | Strawberry | | SMoV |
| MT016 | Sweet potato | | SPFMV |
CEVd = Citrus exocortis viroid, CTV = Citrus tristeza virus, CVEV = Citrus vein enation virus, CDVd = Citrus dwarfing viroid, CVd-VI = Citrus viroid VI, HSVd = Hop stunt viroid, ISMV = Iris severe mosaic virus, PNRSV = Prunus necrotic ringspot virus, RBDV = Raspberry bushy dwarf virus, SMoV = Strawberry mottle virus, SPFMV = Sweet potato feathery mottle virus, TRSV = Tobacco ringspot virus. Species names shown in italics.
Figure 1. Distribution of RNA reads collected across samples. Bar chart showing the number of reads, in millions, that remained (green) after removing low-quality reads (blue) and reads matching ribosomal RNA (yellow) for the RNA-Seq (left) and sRNA-Seq (right) methods. The counts are split between sequencing providers (SP1 and SP2).
Figure 2. Total number of viral contigs recovered by each tested program that uses an assembly-based approach. The counts are split between sequencing providers (SP1 in green and SP2 in purple). The box limits indicate the first and third quartiles, with the central line marking the median. Vertical lines extending from each box capture the remaining data within 1.5 times the interquartile range, while dots beyond the line ends denote outliers.
Figure 3. Side-by-side comparison of the total number of unique viruses and/or viroids detected by RNA-Seq and sRNA-Seq across samples by each software tool. The counts are split between sequencing providers (SP1 in green and SP2 in purple). See Figure 2 legend for boxplot interpretation.
Viral detection sensitivity for the different methods and technologies tested in this study. Sensitivity was calculated as the number of true positives recovered divided by the total number of known targets identified by PEQ (ratio indicated in brackets). Targets included 24 viruses/viroids across 14 plant samples for SP1 and 14 viruses/viroids across nine samples for SP2.
| Sequencing Provider | Sequencing Technology | Software | Sensitivity (%) |
|---|---|---|---|
| SP1 | RNA-Seq | Kodoja | 100 (24/24) |
| SP1 | RNA-Seq | PVDP | 95.8 (23/24) |
| SP1 | RNA-Seq | SPAdes | 95.8 (23/24) |
| SP1 | RNA-Seq | Trinity | 95.8 (23/24) |
| SP1 | sRNA-Seq | VirusDetect | 100 (24/24) |
| SP1 | sRNA-Seq | VirReport-SPAdes | 100 (24/24) |
| SP1 | sRNA-Seq | VirReport-Velvet | 100 (24/24) |
| SP2 | RNA-Seq | Kodoja | 100 (14/14) |
| SP2 | RNA-Seq | PVDP | 100 (14/14) |
| SP2 | RNA-Seq | SPAdes | 100 (14/14) |
| SP2 | RNA-Seq | Trinity | 92.9 (13/14) |
| SP2 | sRNA-Seq | VirusDetect | 100 (14/14) |
| SP2 | sRNA-Seq | VirReport-SPAdes | 78.6 (11/14) |
| SP2 | sRNA-Seq | VirReport-Velvet | 100 (14/14) |
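The sensitivity values in the table follow directly from the ratio described in its caption. As a minimal sketch, the calculation can be reproduced as shown below; the function name `sensitivity_pct` is hypothetical and is not part of any tool evaluated in the study, and rounding to one decimal place is assumed from the reported values.

```python
def sensitivity_pct(true_positives: int, known_targets: int) -> float:
    """Sensitivity (%) = true positives / known PEQ targets, to 1 decimal.

    Hypothetical helper for illustration only; rounding to one decimal
    place is an assumption based on how the table reports its values.
    """
    if known_targets <= 0:
        raise ValueError("known_targets must be a positive integer")
    if not 0 <= true_positives <= known_targets:
        raise ValueError("true_positives must lie between 0 and known_targets")
    return round(100 * true_positives / known_targets, 1)


# Ratios taken from the bracketed values in the table above.
print(sensitivity_pct(23, 24))  # 95.8 (e.g. SP1, RNA-Seq, PVDP)
print(sensitivity_pct(13, 14))  # 92.9 (SP2, RNA-Seq, Trinity)
print(sensitivity_pct(11, 14))  # 78.6 (SP2, sRNA-Seq, VirReport-SPAdes)
```

Keeping the raw counts alongside the percentage, as the table does, makes it clear how many targets each percentage is based on (24 for SP1, 14 for SP2).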
Figure 4. Detection of false positive viruses and viroids across samples. (A) Details of all predicted viral contaminants detected in each sample sequenced by SP1 (top) and SP2 (bottom). Each number in the table corresponds to one of the software tools tested: numbers 1 to 4 refer to methods applied to RNA-Seq data; numbers 5 and 6 refer to methods applied to sRNA-Seq data. 1 = Kodoja, 2 = PVDP, 3 = SPAdes, 4 = Trinity, 5a = VirReport-SPAdes, 5b = VirReport-Velvet, 6 = VirusDetect. (B) Total number of false positive events detected per method and sequencing technology across 14 plant samples for SP1 and 9 samples for SP2. The CVd-VI call made by VirReport-Velvet in sample MT010 was only detected in the de novo assembly derived using 24 nt-long reads.
Detection sensitivity and false discovery rate at specified subsampling depths for sRNA-Seq and RNA-Seq datasets. Sensitivity was calculated as the number of true positives (correctly identified viruses and viroids) divided by the total number of known target species identified by PEQ (ratio indicated in brackets). Targets included 13 viruses and 11 viroids across 14 plant samples for SP1 and 9 viruses and 5 viroids across nine samples for SP2. The false discovery rate was calculated as the number of false positives (indicated in brackets) divided by the total number of true positives and false positives detected.
| Sequencing Technology (Software) | Subsampling | Viruses: Sensitivity (%) | Viruses: False Discovery Rate (%) | Viroids: Sensitivity (%) | Viroids: False Discovery Rate (%) |
|---|---|---|---|---|---|
| RNA-Seq SP1 (Kodoja) | 1 M | 100 (13/13) | 66.7 (26) | 72.7 (8/11) | 0 |
| RNA-Seq SP1 (Kodoja) | 2.5 M | 100 (13/13) | 71.7 (33) | 72.7 (8/11) | 0 |
| RNA-Seq SP1 (Kodoja) | 4 M | 100 (13/13) | 73.5 (36) | 81.8 (9/11) | 0 |
| RNA-Seq SP1 (Kodoja) | 5 M | 100 (13/13) | 75.9 (41) | 81.8 (9/11) | 0 |
| RNA-Seq SP1 (Kodoja) | 10 M | 100 (13/13) | 78.3 (47) | 100 (11/11) | 8.3 (1) |
| RNA-Seq SP1 (Kodoja) | All reads | 100 (13/13) | 80.6 (54) | 100 (11/11) | 21.4 (3) |
| sRNA-Seq SP1 (VirReport-Velvet) | 1 M | 92.3 (12/13) | 7.7 (1) | 100 (11/11) | 8.3 (1) |
| sRNA-Seq SP1 (VirReport-Velvet) | 2.5 M | 100 (13/13) | 7.1 (1) | 100 (11/11) | 15.3 (2) |
| sRNA-Seq SP1 (VirReport-Velvet) | 4 M | 100 (13/13) | 13.3 (2) | 100 (11/11) | 21.4 (3) |
| sRNA-Seq SP1 (VirReport-Velvet) | All reads | 100 (13/13) | 48 (12) | 100 (11/11) | 52.1 (11) |
| RNA-Seq SP2 (Kodoja) | 1 M | 100 (9/9) | 25 (3) | 80 (4/5) | 0 |
| RNA-Seq SP2 (Kodoja) | 2.5 M | 100 (9/9) | 35.7 (5) | 100 (5/5) | 0 |
| RNA-Seq SP2 (Kodoja) | 4 M | 100 (9/9) | 43.8 (7) | 100 (5/5) | 0 |
| RNA-Seq SP2 (Kodoja) | 5 M | 100 (9/9) | 50.0 (9) | 100 (5/5) | 0 |
| RNA-Seq SP2 (Kodoja) | 10 M | 100 (9/9) | 57.1 (12) | 100 (5/5) | 0 |
| RNA-Seq SP2 (Kodoja) | All reads | 100 (9/9) | 69.9 (20) | 100 (5/5) | 0 |
| sRNA-Seq SP2 (VirReport-Velvet) | 1 M | 100 (9/9) | 0 | 80 (4/5) | 0 |
| sRNA-Seq SP2 (VirReport-Velvet) | 2.5 M | 100 (9/9) | 0 | 100 (5/5) | 16.7 (1) |
| sRNA-Seq SP2 (VirReport-Velvet) | 4 M | 100 (9/9) | 0 | 100 (5/5) | 16.7 (1) |
| sRNA-Seq SP2 (VirReport-Velvet) | All reads | 100 (9/9) | 0 | 100 (5/5) | 16.7 (1) |
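The false discovery rate defined in the caption depends on both the bracketed false positive count and the number of true positives in the matching sensitivity column. A minimal sketch of that calculation is given below; the function name `false_discovery_rate_pct` is hypothetical, and both the one-decimal rounding and the convention of returning 0 when nothing is detected are assumptions made for illustration.

```python
def false_discovery_rate_pct(false_positives: int, true_positives: int) -> float:
    """FDR (%) = false positives / (true positives + false positives).

    Hypothetical helper for illustration only. Returning 0.0 when no
    detections were made at all is an assumed convention, not something
    stated in the source.
    """
    if false_positives < 0 or true_positives < 0:
        raise ValueError("counts must be non-negative")
    total_detections = true_positives + false_positives
    if total_detections == 0:
        return 0.0
    return round(100 * false_positives / total_detections, 1)


# Counts taken from the table above (false positives, true positives).
print(false_discovery_rate_pct(26, 13))  # 66.7 (RNA-Seq SP1, viruses, 1 M)
print(false_discovery_rate_pct(1, 12))   # 7.7  (sRNA-Seq SP1, viruses, 1 M)
print(false_discovery_rate_pct(0, 9))    # 0.0  (sRNA-Seq SP2, viruses)
```

Because the denominator includes the true positives, the same number of false positives yields a higher rate when fewer true targets are recovered, which is worth bearing in mind when comparing subsampling depths.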