Solomon Maina, Linda Zheng, Brendan C Rodoni.
Abstract
Globally, high-throughput sequencing (HTS) has been used for virus detection in germplasm certification programs. However, sequencing costs have impeded its implementation as a routine diagnostic certification tool. In this study, a targeted genome sequencing (TG-Seq) approach was developed to simultaneously detect four viral species: Pea early browning virus (PEBV), Cucumber mosaic virus (CMV), Bean yellow mosaic virus (BYMV) and Pea seed-borne mosaic virus (PSbMV). TG-Seq detected all the expected viral amplicons within multiplex PCR (mPCR) reactions. In contrast, gel electrophoresis (GE) did not detect all of the expected PCR amplicons. For example, for CMV, GE detected only RNA1 and RNA2, while TG-Seq detected all three RNA components. In an mPCR to amplify all four viruses, TG-Seq readily detected each virus with more than 732,277 sequence reads mapping to each amplicon. TG-Seq also detected all four amplicons at a 10⁻⁸ serial dilution, where they were not detectable by GE. These findings indicate that TG-Seq is a highly sensitive targeted approach with significant potential for detecting multiple plant viruses within a given biological sample. This is the first study describing direct HTS of plant virus mPCR products. The findings have major implications for grain germplasm health certification programs and for biosecurity management in relation to pathogen entry into Australia and elsewhere.
Keywords: crops; diagnostics; genome; high-throughput sequencing; plant virus
Year: 2021 PMID: 33808381 PMCID: PMC8066983 DOI: 10.3390/v13040583
Source DB: PubMed Journal: Viruses ISSN: 1999-4915 Impact factor: 5.048
Table 1. RNA-Seq paired-end genome sequence data, including sequence depth coverage, GC content and genome size, of the Bean yellow mosaic virus (BYMV), Pea seed-borne mosaic virus (PSbMV), Pea early browning virus (PEBV) and Cucumber mosaic virus (CMV) isolates used in this study for primer design and as target templates for TG-Seq verification.
| Sample | Host | Virus | Coverage (×) c | No. of Reads Mapping to the Virus | GC Content | Genome Size (nt) | GenBank Accession |
|---|---|---|---|---|---|---|---|
| 14BY a | Lentil | BYMV | 2307 | 1,331,893 | 39.4% | 9868 | LC500882 |
| 13C | Field pea | PSbMV | 718 | 507,189 | 41.5% | 9852 | SRR13206509 |
| LY-2 b | Faba bean | PEBV-RNA1 | 3899 | 235,443 | 40.7% | 7037 | LC528622 |
| LY-2 b | Faba bean | PEBV-RNA2 | 5606 | 1,252,888 | 42% | 2604 | LC528623 |
| 14C | Faba bean | CMV-RNA1 | 8774 | 318,290 | 45.3% | 3215 | SRR13197436 |
| 14C | Faba bean | CMV-RNA2 | 37,144 | 1,293,807 | 45.3% | 2892 | SRR13197436 |
| 14C | Faba bean | CMV-RNA3 | 10,615 | 260,700 | 47.1% | 2188 | SRR13197436 |
a = Genome sequence of PEBV as reported [29]; b = BYMV as reported [30]; 13C and 14C = new PSbMV and CMV genome sequences generated in this study; c = average coverage depth across the genome (×). The three genomes lacked a few nucleotides within the 5′ UTR and 3′ UTR regions, but all coding regions were intact.
Table 2. Nucleotide sequences, genome locations, amplicon sizes and optimal annealing temperatures of the 16 primer pairs used in both singleplex and multiplex PCR reactions.
| Primer | Target Virus | Target Genome Region | Amplicon | Primer Sequence (5′-3′) | Amplicon Size (bp) | Optimal Annealing Temperature (°C) | Primer Binding Position |
|---|---|---|---|---|---|---|---|
| HcPro-1F a | BYMV | HcPro | HcProF1 | CCTTGTGGTCGTATCACTTGTAA | 132 | 64.4 | 1182–1204 |
| HcPro-1R a | | | | CTGAATGGTGCCTCTGGTAAC | | 64.9 | 1412–1432 |
| BYHcProF2 | BYMV | HcPro | HcProF2 | CCTTGTGGTCGTATCACTTGTAA | 251 | 64.4 | 1199–1222 |
| BYHcProR2 | | | | CTGAATGGTGCCTCTGGTAAC | | 64.9 | 1429–1449 |
| BYNIb2F | BYMV | NIb | NIb2 | AGAGCAATTCAACCAGAGCATAG | 283 | 64.9 | 8247–8269 |
| BYNIb2R | | | | CACAAGCACCTCATCAGTCTC | | 64.9 | 8505–8525 |
| BYNIb3F | BYMV | NIb | NIb3 | TTACAGCCGCACCGATTG | 288 | 64.9 | 7549–7566 |
| BYNIb3R | | | | CGCATCTCAAGAACAGCATTC | | 65 | 7766–7786 |
| BYCPF3 | BYMV | CP | CPF3 | GAATGGACAATGATGGATGGAGAG | 287 | 65.2 | 8966–8989 |
| BYCPR3 | | | | CTAACTGCTGCCGCCTTC | | 65 | 9235–9252 |
| HCPF2 | PSbMV | HcPro | HcPro | AGTTAGGCATCTGGCAATAG | 359 | 61.3 | 2028–2047 |
| HCPR2 | | | | AGTCCTTAGCATCCTTCTCA | | 61.8 | 2367–2386 |
| CI-1F | PSbMV | CI | CI | TTGCGTGATTCGTCTATGC | 296 | 62.4 | 5227–5245 |
| CI-1R | | | | TGTGCTATCGTTCTTGTAATTGA | | 62.3 | 5500–5522 |
| NIbF3 | PSbMV | NIb | NIb | GTGCGTCCAGATTGTGAA | 328 | 61.8 | 8338–8355 |
| NIbR3 | | | | TACTTCTATATGGCTCCTGTTCTA | | 62 | 8642–8665 |
| PCP-F1 a | PSbMV | CP | CP | GAACATCAGGAACCATCACA | 254 | 61.7 | 9005–9024 |
| PCP-R1 a | | | | TTCAATACACCACACCATCAA | | 60.4 | 9238–9259 |
| 12K2F | PEBV | 12K | 12K | GAAGTGTGCTGTGTCAAC | 294 | 60.4 | 6279–6296 |
| 12K2R | | | | AAACCGAAATCTATGTCATCTC | | 60.1 | 6551–6572 |
| 14KF4 | PEBV | 14K | 14K | AGATGTGGACGACTCAGTGAA | 254 | 65 | 2303–2323 |
| 14KR4 | | | | CGAAGTTGGCGAAGTGGTT | | 65.1 | 2538–2556 |
| 30KF | PEBV | 30K | 30K | TCATCGTAGAAGAGAGACTGTGTT | 348 | 65 | 5626–5649 |
| 30KR | | | | ACCGCAACCGTACCTATCT | | 64.7 | 5955–5973 |
| 201K-F a | PEBV | 201K | 201K | GGTTAGAAGTGCTGGAAGTGAA | 399 | 64.4 | 1621–1642 |
| 201K-R a | | | | TCATTGGCTTGCGACTCTC | | 64.3 | 2001–2019 |
| CMVRNA1F a | CMV | RNA1 | RNA1 | CTCCCACGGCGATAAAGG | 315 | 57.56 | 133–150 |
| CMVRNA1R a | | | | GTGACCCAACTTCCTCCGA | | 58.94 | 429–447 |
| CMVRNA2F | CMV | RNA2 | RNA2 | ATAACMTCCCAGTTCTCACC | 260 | 56.23 | 1488–1507 |
| CMVRNA2R | | | | TGRAARTCRCACCACCAYTT | | 57.25 | 1728–1747 |
| CMVRNA3F | CMV | RNA3 | RNA3 | GAAATTYGATTCRACYGTGTGGG | 202 | 58.02 | 1601–1623 |
| CMVRNA3R | | | | CTTNCKCATRTCRCCDATATCAGC | | 56.98 | 1779–1802 |
The 16 primer pairs were designed from BYMV (helper component proteinase (HcPro), nuclear inclusion protein (NIb) and coat protein (CP)), PSbMV (HcPro, cylindrical inclusion (CI) protein, NIb and CP), PEBV (12K, 14K, 30K and 201K proteins) and CMV (RNA1, RNA2 and RNA3). a = Primers used in the multiplex and serially diluted mPCR reactions. The target genome region is the region of the viral genome targeted by the specific primer pair; the amplicon size is the expected product size on agarose GE. The primer binding position is the primer binding region within the BYMV, PSbMV, PEBV and CMV genomes listed in Table 1.
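The relationship between primer binding positions and amplicon size in Table 2 can be checked with simple arithmetic. The sketch below assumes (this is not stated in the source) that the binding positions are 1-based, inclusive genome coordinates, so that the expected amplicon runs from the first base of the forward primer to the last base of the reverse primer. The function name `amplicon_size` and the choice of three example rows are illustrative only.

```python
# Sketch: expected amplicon size from primer binding positions.
# Assumption (not stated in the source): positions are 1-based and inclusive.


def amplicon_size(forward_start: int, reverse_end: int) -> int:
    """Span from the forward primer's first base to the reverse primer's last base."""
    if reverse_end < forward_start:
        raise ValueError("reverse primer must bind downstream of the forward primer")
    return reverse_end - forward_start + 1


# (forward primer start, reverse primer end, amplicon size reported in Table 2)
rows = {
    "CMV RNA1 (CMVRNA1F/CMVRNA1R)": (133, 447, 315),
    "PEBV 201K (201K-F/201K-R)": (1621, 2019, 399),
    "PSbMV HcPro (HCPF2/HCPR2)": (2028, 2386, 359),
}

for name, (start, end, reported) in rows.items():
    computed = amplicon_size(start, end)
    print(f"{name}: computed {computed} bp, reported {reported} bp")
```

The same function can be applied to any other row of the table to compare the coordinate span against the listed amplicon size.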
Figure 1. Agarose gel electrophoresis of an mPCR of the four viruses (BYMV, PSbMV, CMV and PEBV), run in quadruplicate (×4). The positive control (+VE) was viral RNA pooled from BYMV-, PSbMV-, CMV- and PEBV-infected samples and amplified using the HcPro-1F/HcPro-1R, PCP-F1/PCP-R1, CMVRNA1F/CMVRNA1R and 201K-F/201K-R primers. The negative controls (-VE) were RNase/DNase-free water and a virus-negative sample (healthy oat plant). L = Invitrogen ready-to-use 1 kb Plus DNA ladder, loaded on both the left and right sides of the gel.
Table 3. A comparison between targeted genome sequencing (TG-Seq) and gel electrophoresis (GE) to detect BYMV, PSbMV, PEBV and CMV amplicons generated in a multiplex PCR (mPCR) reaction.
| Library | Amplicons Targeted a | Raw Reads | No. of Reads (%) | Amplicons Detected by TG-Seq b | Amplicons of BYMV, PSbMV, PEBV and CMV Detected by GE |
|---|---|---|---|---|---|
| 1 | BYMV (NIb2, CPF3, HcProF2) | 3,754,078 | 97.74% | NIb (1,208,788), CP (1,503,144), HcPro (949,413) | CP, HcPro |
| 2 | BYMV (HcProF2, NIb3, CPF3) | 3,704,546 | 98.07% | HcPro (1,748,342), NIb (22,057), CP (1,839,520) | CP, HcPro |
| 3 | BYMV (NIb2, CP3, HcProF1) | 3,563,538 | 98.04% | NIb (1,487,879), CP (1,091,068), HcPro (906,692) | HcPro, NIb, CP |
| 4 | BYMV (CP, HcProF1, HcProF2) | 3,523,172 | 97.85% | CP (880,456), HcPro (2,490,101) | CP, HcPro |
| 5 | PSbMV (CP, NIb, HcPro, CI) | 4,568,980 | 98.71% | CP (967,475), NIb (1,005,493), HcPro (1,121,760), | CP, NIb, HcPro, CI |
| 6 | PEBV (12K, 14K, 30K, 201K) | 4,110,734 | 98.46% | 12K (706,420), 14K (979,796), 30K (1,371,956), | 12K, 14K, 30K, 201K |
| 7 | PEBV (12K, 14K, 30K, 201K) | 3,923,838 | 98.50% | 12K (515,527), 14K (900,724), 30K (153,998), 201K (899,718) | 12K, 14K, 30K, 201K |
| 8 | CMV (RNA1, RNA2, RNA3) | 3,457,376 | 98.44% | RNA1 (318,290), RNA2 (1,293,807), RNA3 (260,700) | RNA1, RNA2 |
| 9 | CMV (RNA1), PEBV (201K), | 3,257,938 | 98.21% | RNA1 (1,299,800), 201K (1,145,237), | RNA1, 201K, CP |
| 10 | CMV (RNA3), PEBV (201K2), | 3,318,404 | 98.37% | RNA3 (207), 201K (1,561,718), | 201K, HcPro, CP3 |
| 11 | CMV (RNA1), PEBV (201K), | 2,210,396 | 95.24% | RNA1 (703,928), 201K (8929), | RNA1, HcPro |
| 12 | CMV (RNA1), PEBV (201K), PSbMV (CP), BYMV (HcPro) | 2,514,042 | 98.54% | RNA1 (735,687), 201K (701,502), CP (645,571), HcPro (739,571) | RNA1, 201K, CP, HcPro |
These data were generated using the 16 primer pairs listed in Table 2. The open reading frames (ORFs) targeted by mPCR were as follows: BYMV and PSbMV were each amplified using primer pairs designed from the NIb, CP, HcPro and CI regions (Libraries 1–5); the PEBV multiplex reactions targeted the 12K, 14K, 30K and 201K ORFs of the PEBV genome (Libraries 6 and 7); the CMV multiplex reaction targeted the three genome components RNA1–3 (Library 8); and a series of mPCR reactions was used to detect three to four plant viruses (Libraries 9–12). a = Virus name and specific primer name(s) listed in Table 2 and used to amplify the ORFs shown in the column “Amplicons Detected by TG-Seq b”; b = figures in parentheses are the number of reads mapping to each genome region of interest.
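To see how evenly the reads were distributed across the four amplicons in the four-virus reaction, the per-amplicon counts for Library 12 can be converted into shares of the mapped reads. The following is a minimal sketch using only the counts listed in the table above; the helper name `read_shares` is hypothetical, and the shares are relative to the sum of the four listed counts rather than to the raw read total.

```python
# Sketch: share of mapped reads per amplicon for Library 12.
# Counts are copied from the table above; shares are relative to their sum.

library_12 = {
    "CMV RNA1": 735_687,
    "PEBV 201K": 701_502,
    "PSbMV CP": 645_571,
    "BYMV HcPro": 739_571,
}


def read_shares(counts: dict[str, int]) -> dict[str, float]:
    """Return each amplicon's fraction of the summed mapped reads."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no mapped reads")
    return {amplicon: n / total for amplicon, n in counts.items()}


shares = read_shares(library_12)
for amplicon, share in shares.items():
    print(f"{amplicon}: {share:.1%}")
```

The same calculation applies to any other library in the table whose per-amplicon counts are complete.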
Table 4. A comparison of the sensitivity of targeted genome sequencing (TG-Seq) and gel electrophoresis (GE) to detect BYMV, PSbMV, PEBV and CMV amplicons generated in a multiplex PCR (mPCR) reaction.
| Library | Virus | mPCR Product | Raw Reads | No. of Reads (%) | Virus Amplicons Detected by TG-Seq | Amplicons Detected by GE |
|---|---|---|---|---|---|---|
| 10⁻² | CMV, PEBV, PSbMV, BYMV | 16.9 ng/µL | 2,245,566 | 98.05% | RNA1 (1,465,542), 201K (130,757), CP (2,224), HcPro (519,838) | RNA1, 201K, HcPro |
| 10⁻⁴ | CMV, PEBV, PSbMV, BYMV | 8 ng/µL | 2,332,290 | 97.49% | RNA1 (1,831,035), 201K (91,395), CP (27,503), HcPro (239,682) | RNA1, HcPro |
| 10⁻⁶ | CMV, PEBV, PSbMV, BYMV | 8 ng/µL | 1,924,302 | 96.71% | RNA1 (807,712), 201K (74,276), CP (246,891), HcPro (724,792) | RNA1 *, HcPro * |
| 10⁻⁸ | CMV, PEBV, PSbMV, BYMV | 7 ng/µL | 2,221,416 | 96.45% | RNA1 (1,096,272), 201K (127,154), CP (210,534), HcPro (704,258) | RNA1 *, HcPro * |
A 100-fold serial dilution (10⁻², 10⁻⁴, 10⁻⁶, 10⁻⁸) of viral cDNA from each of the four viruses in nuclease-free water was used as template. Amplicons detected by TG-Seq: CMV (RNA1), PEBV (201K), PSbMV (CP) and BYMV (HcPro). Amplicons detected by gel visualisation: 10⁻² (201K, RNA1, HcPro), 10⁻⁴ (RNA1, HcPro), 10⁻⁶ (RNA1, HcPro *) and 10⁻⁸ (RNA1, HcPro *). * = nearly invisible; figures in parentheses are the number of reads mapping to each genome region of interest.
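The sensitivity comparison above amounts to asking, for each dilution, which amplicons had mapped reads but no visible band. The sketch below tabulates this from the values in the table. It assumes that any non-zero mapped read count counts as a TG-Seq detection; the source does not state a read-count threshold, so the `min_reads` parameter and the function name `missed_by_ge` are hypothetical.

```python
# Sketch: amplicons detected by TG-Seq but not by gel electrophoresis (GE),
# per dilution, using the values from the table above.
# Assumption: a mapped read count >= min_reads is treated as a TG-Seq detection.

tgseq_reads = {
    "10^-2": {"RNA1": 1_465_542, "201K": 130_757, "CP": 2_224, "HcPro": 519_838},
    "10^-4": {"RNA1": 1_831_035, "201K": 91_395, "CP": 27_503, "HcPro": 239_682},
    "10^-6": {"RNA1": 807_712, "201K": 74_276, "CP": 246_891, "HcPro": 724_792},
    "10^-8": {"RNA1": 1_096_272, "201K": 127_154, "CP": 210_534, "HcPro": 704_258},
}

ge_detected = {
    "10^-2": {"RNA1", "201K", "HcPro"},
    "10^-4": {"RNA1", "HcPro"},
    "10^-6": {"RNA1", "HcPro"},  # marked as nearly invisible in the table
    "10^-8": {"RNA1", "HcPro"},  # marked as nearly invisible in the table
}


def missed_by_ge(reads: dict[str, int], gel: set[str], min_reads: int = 1) -> list[str]:
    """Amplicons with at least min_reads mapped reads that GE did not show."""
    tgseq = {amplicon for amplicon, n in reads.items() if n >= min_reads}
    return sorted(tgseq - gel)


for dilution, reads in tgseq_reads.items():
    print(dilution, missed_by_ge(reads, ge_detected[dilution]))
```

Raising `min_reads` gives a stricter detection call, which may be preferable if low read counts could reflect background or cross-contamination.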
Figure 2. Comparison of the proportion of RNA-Seq and TG-Seq reads mapping to the virus genome region of interest. (a) CMV RNA-Seq library compared with the TG-Seq library generated using the CMV-specific primers CMVRNA1F/CMVRNA1R; (b) PEBV RNA-Seq library compared with the TG-Seq library generated using the PEBV-specific primers 201K-F/201K-R; (c) BYMV RNA-Seq library compared with the TG-Seq library generated using the BYMV-specific primers HcPro-1F/HcPro-1R; (d) PSbMV RNA-Seq library compared with the TG-Seq library generated using the PSbMV-specific primers PCP-F1/PCP-R1.
Figure 3. Proportion of reads generated through singleplex TG-Seq that mapped back to the region of interest. (a) BYMV-TG-Seq library, (b) PEBV-TG-Seq library, (c) PSbMV-TG-Seq library, (d) CMV-TG-Seq library.