Lindsey A Moser, Lisbeth Ramirez-Carvajal, Vinita Puri, Steven J Pauszek, Krystal Matthews, Kari A Dilley, Clancy Mullan, Jennifer McGraw, Michael Khayat, Karen Beeri, Anthony Yee, Vivien Dugan, Mark T Heise, Matthew B Frieman, Luis L Rodriguez, Kristen A Bernard, David E Wentworth, Timothy B Stockwell, Reed S Shabman.
Abstract
Several biosafety level 3 and/or 4 (BSL-3/4) pathogens are high-consequence, single-stranded RNA viruses, and their genomes, when introduced into permissive cells, are infectious. Moreover, many of these viruses are select agents (SAs), and their genomes are also considered SAs. For this reason, cDNAs and/or their derivatives must be tested to ensure the absence of infectious virus and/or viral RNA before transfer out of the BSL-3/4 and/or SA laboratory. This tremendously limits the capacity to conduct viral genomic research, particularly the application of next-generation sequencing (NGS). Here, we present a sequence-independent method to rapidly amplify viral genomic RNA while simultaneously abolishing both viral and genomic RNA infectivity across multiple single-stranded positive-sense RNA (ssRNA+) virus families. The process generates barcoded DNA amplicons that range in length from 300 to 1,000 bp, which cannot be used to rescue a virus and are stable to transport at room temperature. Our barcoding approach allows for up to 288 barcoded samples to be pooled into a single library and run across various NGS platforms without potential reconstitution of the viral genome. Our data demonstrate that this approach not only provides full-length genomic sequence information from high-titer virion preparations but can also recover specific viral sequence from samples with limited starting material in the background of cellular RNA, and it can be used to identify pathogens from unknown samples. In summary, we describe a rapid, universal standard operating procedure that generates high-quality NGS libraries free of infectious virus and infectious viral RNA.
IMPORTANCE This report establishes and validates a standard operating procedure (SOP) for select agents (SAs) and other biosafety level 3 and/or 4 (BSL-3/4) RNA viruses to rapidly generate noninfectious, barcoded cDNA amenable for next-generation sequencing (NGS). This eliminates the burden of testing all processed samples derived from high-consequence pathogens prior to transfer from high-containment laboratories to lower-containment facilities for sequencing. Our established protocol can be scaled up for high-throughput sequencing of hundreds of samples simultaneously, which can dramatically reduce the cost and effort required for NGS library construction. NGS data from this SOP can provide complete genome coverage from viral stocks and can also detect virus-specific reads from limited starting material. Our data suggest that the procedure can be implemented and easily validated by institutional biosafety committees across research laboratories.
Keywords: West Nile virus; alphavirus; coronavirus; flavivirus; foot-and-mouth disease virus; genomics; next-generation sequencing; picornavirus; rhinovirus
Year: 2016 PMID: 27822536 PMCID: PMC5069770 DOI: 10.1128/mSystems.00039-15
Source DB: PubMed Journal: mSystems ISSN: 2379-5077 Impact factor: 6.496
FIG 1 Overview of the proposed standard operating procedure (SOP) for rapid next-generation sequencing library preparation and inactivation of ssRNA+ viruses. (A) Stepwise overview of the SOP. A detailed protocol is provided in Text S1 in the supplemental material. Steps in the pink box denote work performed in a biosafety level 3 and/or 4 (BSL-3/4) laboratory. Steps in the blue box denote work that can be performed in a BSL-2 laboratory. The asterisk in step 1 indicates that for nonselect agent pathogens (e.g., West Nile virus), extracted RNA may be moved to BSL-2 for library construction. Step 1, generating cDNA and SISPA, utilizes a primer with a random hexamer coupled to a unique barcode (BC-N6). SISPA stands for sequence-independent single-primer amplification. (B) The BC-N6 primer is used for both generating single-stranded cDNA from input RNA and generating double-stranded DNA by randomly priming the synthesized cDNA. A PCR step using primers only encoding the barcode sequence with either three or four random nucleotides (3N/4N) at the 5′ end simultaneously amplifies and uniquely identifies (barcodes) a sample. (C) Representative gel image that displays products of the SOP obtained from serial dilutions of genomic human rhinovirus 16 (HRV-16) virion RNA. At high-input RNA amounts, a smear between 200 bp and 1,000 bp is visible. This signal intensity diminishes as the starting material is diluted. (D) Summary of diverse types of starting material which can feed into the SOP. Samples enriched for virus-specific sequence (e.g., virion stocks) can directly proceed to the SOP. For samples that contain a majority of host nucleic acid, the use of upstream procedures to enrich for virus-specific signal (e.g., rRNA depletion or mRNA enhancement) is recommended.
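The barcoding scheme in panel B implies a simple demultiplexing step after sequencing: skip the three or four random 5′ nucleotides, match the barcode, and bin the read by sample. The following is a minimal sketch of that idea, not the authors' pipeline. The barcode sequences, the names `BARCODES`, `assign_barcode`, and `demultiplex`, and the exact-match rule are all assumptions made for illustration; the real barcode set is defined in the supplemental protocol.

```python
from collections import defaultdict

# Hypothetical barcodes, for illustration only (not the SOP's barcode set).
BARCODES = {
    "BC01": "ACGTACGTAC",
    "BC02": "TGCATGCATG",
}


def assign_barcode(read, barcodes=BARCODES, random_lengths=(3, 4)):
    """Return (sample, trimmed_read), or (None, read) if no barcode matches.

    The read is expected to start with 3 or 4 random nucleotides (3N/4N)
    followed by the barcode; both are removed from the returned sequence.
    Matching is exact, which is an assumption of this sketch.
    """
    for n in random_lengths:
        for sample, barcode in barcodes.items():
            if read.startswith(barcode, n):
                return sample, read[n + len(barcode):]
    return None, read


def demultiplex(reads, barcodes=BARCODES):
    """Group trimmed reads by sample; unassigned reads go under the key None."""
    bins = defaultdict(list)
    for read in reads:
        sample, trimmed = assign_barcode(read, barcodes)
        bins[sample].append(trimmed)
    return dict(bins)
```

A production demultiplexer would typically also tolerate sequencing errors in the barcode and handle barcode copies at the 3′ end of a read; both are omitted here to keep the sketch short.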
FIG 2 The SOP generates high-quality full-genome sequence data across multiple ssRNA+ virus families. Pooled samples from the SOP were sequenced on the Illumina MiSeq platform. Samples were demultiplexed, the adaptors were trimmed, and low-quality sequencing reads were removed. Sequencing reads were mapped to the reference corresponding to each input virus. These viruses include foot-and-mouth disease virus (FMDV) type O (GenBank accession no. KF112887.1) (A), West Nile virus (WNV) (GenBank accession no. AF404756.1) (B), human rhinovirus 16 (HRV-16) (GenBank accession no. L24917.1) (C), Chikungunya virus (CHIKV) (pJM6-3-CHIKV 181/25-mkate) (D), and Middle East respiratory syndrome coronavirus (MERS) (GenBank accession no. KJ614529.1) (E). Nucleotide coverage depth (NT coverage) is indicated on the y axis, and nucleotide (NT) position is indicated on the x axis. The genome length for each virus is indicated on the x axis, and the percentage of the genome covered at a depth greater than 3 nucleotides is indicated. For FMDV, WNV, HRV-16, and CHIKV, data represent material from a single barcode. For MERS, the data shown are a combination of four barcodes generated from the same sample.
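The coverage statistic reported in each panel can be reproduced from a per-position depth profile. The sketch below is illustrative only: it assumes the statistic means "positions with read depth greater than 3", and the function name `coverage_summary` and its input format (a list of integer depths, one per reference position) are hypothetical.

```python
def coverage_summary(depths, min_depth=3):
    """Summarize a per-position depth profile for one reference genome.

    depths: sequence of read depths, one entry per nucleotide position.
    A position counts as covered when its depth is strictly greater than
    min_depth (an assumed reading of the figure's coverage statistic).
    """
    if not depths:
        raise ValueError("depths must be non-empty")
    covered = sum(1 for d in depths if d > min_depth)
    return {
        "genome_length": len(depths),
        "percent_covered": 100.0 * covered / len(depths),
        "mean_depth": sum(depths) / len(depths),
    }
```

In practice the depth profile would come from a read-mapping tool's per-base depth output; that step is outside the scope of this sketch.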
SISPA products lack both viral and RNA infectivity
| Input virus for SISPA | Location | Viral infectivity tests: cell line | Viral infectivity tests: LOD | Viral infectivity tests: loss of infectivity (no. positive/no. tested) | RNA infectivity tests: cell line | RNA infectivity tests: LOD | RNA infectivity tests: loss of infectivity (no. positive/no. tested) |
|---|---|---|---|---|---|---|---|
| HRV-16 | JCVI | H1 HeLa | 0.01 [0/2] | 0/44 | H1 HeLa | 7.24 × 10⁴ [2/2] | 0/44 |
| 0.1 [2/2] | 0/13 | ||||||
| HRV-14 | JCVI | H1 HeLa | NT | 0/44 | H1 HeLa | NT | 0/44 |
| FMDV | USDA | LFBK αvβ6 | NT | 0/90 | LFBK αvβ6 | 4.23 × 10⁵ | 0/90 |
| 0/93 | 0/93 | ||||||
| 0/1 | |||||||
| 0/1 | |||||||
| CHIKV | UNC-CH | Vero | 0.1 [1/3] | 0/25 | BHK-21 | 10⁷ [2/2] | 0/25 |
| 1 [3/3] | 0/1 | 0/2 | |||||
| WNV | UW | Vero | 0.1 [1/3] | 0/3 | BHK-21 | 10⁷ [3/3] | 0/3 |
| 1 [3/3] | |||||||
| MERS-CoV | UMD | Vero | NT | 0/3 | Vero | NT | 0/10 |
High-titer samples of representative coronaviruses (MERS-CoV), flaviviruses (WNV), alphaviruses (CHIKV), and picornaviruses (HRV-16, HRV-14, and FMDV) were processed following the SOP, and an aliquot of each sample was used to test for infectious virus or infectious viral genomic RNA (gRNA). Currently, blind passaging of potentially infectious material on permissive cells is the standard for removing products from a BSL-3 facility. No viral or gRNA infectivity was detected in any of the samples tested. In all cases, positive-control samples confirmed that each cell line clearly detected both viral infectivity and gRNA infectivity.
The starting material represents viral RNA from at least 1 × 10⁵ PFU. A total of 324 samples were tested for genomic RNA loss of infectivity; 309 samples were tested for virus loss of infectivity.
Abbreviations: JCVI, J. Craig Venter Institute; USDA, U.S. Department of Agriculture; UNC-CH, University of North Carolina at Chapel Hill; UW, University of Wisconsin—Madison; UMD, University of Maryland.
SISPA products were used to infect the indicated permissive cell line. Three serial passages were performed.
SISPA products were either electroporated or transfected into the indicated permissive cell line. Three serial passages were performed.
The limit of detection (LOD) for viral and gRNA infectivity was determined independently from the loss of infectivity testing for each virus. For each loss of infectivity test, a positive control for infectivity (either transfection/electroporation of gRNA or virus infection) was performed in parallel. Abbreviations: NT, not tested; GE, genomic equivalents.
FIG 3 Performing the SOP with both HRV-16 and FMDV to identify where loss of genomic RNA infectivity occurs. (A) Flow chart depicting a test for HRV-16 or FMDV loss of RNA infectivity. Briefly, 60 tubes of HRV-16 gRNA were subjected to six different conditions in replicates of 10. The SOP was performed, and a subset was purified at each step and tested for the presence of infectious RNA over three blind passages on H1 HeLa cells. For FMDV, viral RNA, intermediates, or the final SOP products were electroporated into LFBK αvβ6 cells. (B) Results of infectivity testing with HRV-16. Each symbol represents the value for an individual sample. Samples to the right of the red line highlight steps where all samples tested had no detectable infectious HRV-16 genomic RNA. The “Viral RNA +RNase” group demonstrates that the RNase treatment is sufficient to inactivate all infectious gRNA. (C) Results of infectivity testing with FMDV as outlined in panel A. Each symbol represents the value for an individual sample. Samples to the right of the red line highlight steps where all samples tested had no detectable infectious FMDV genomic RNA.
FIG 4 Defining the sensitivity of the SOP on Illumina MiSeq and HiSeq platforms. RNA from serial 10-fold dilutions of an HRV-16 virion stock was treated according to the SOP. Samples were pooled and sequenced on an Illumina MiSeq or HiSeq platform. The left y axis denotes the number of reads mapped to the HRV-16 reference genome, the right y axis denotes the percentage of the reference genome covered, and the x axis denotes the input PFU for each reaction. The solid black line demonstrates that sequencing reads were detected between 1 and 10 PFU on the MiSeq platform. A similar sensitivity is obtained on the HiSeq platform, as denoted by the solid red line. The corresponding percentage of HRV-16 genomic coverage from each platform is denoted by a dashed black line (MiSeq) and a dashed red line (HiSeq). The slight enhancement of genomic coverage on the MiSeq platform, despite the smaller number of sequence reads, results from the longer read length on the MiSeq platform (300 nucleotides [nt]) compared with the HiSeq platform (100 nt), as sequencing capacity is in excess at all dilutions.
FIG 5 NGS on SOP-generated HRV-16-specific sequence from pure and mixed samples is slightly less sensitive than quantitative real-time RT-PCR (qrRT-PCR). Four independent tests were conducted to determine the sensitivity of the SOP. Test 1 detects HRV-16 sequence from dilutions of purified virus. Test 2 detects HRV-16 sequence from dilutions of genomic RNA. Test 3 detects HRV-16 sequence from dilutions of virus spiked into H1 HeLa cells. Test 4 detects HRV-16 sequence from genomic RNA dilutions spiked into total HeLa cell RNA. A ribosomal removal step was performed for tests 3 and 4 prior to the initiation of the SOP. For each sample, a fraction of the RNA used to initiate the SOP was subjected to qrRT-PCR analysis. (A to D) HRV-16-specific reads obtained by MiSeq (black solid lines) are plotted on the left y axis, and the cycle threshold (Ct) values are plotted on the right y axis (red lines). Sequencing reads not mapping to the HRV-16 reference are also indicated (black dashed lines). Corresponding HRV-16 input PFU values are plotted on the x axis. (A) The limit of detection (LOD) for test 1 in this experiment is between 10¹ and 10² input PFU. qrRT-PCR is approximately 10-fold more sensitive (LOD of 10⁰ to 10¹ input PFU). (B) The LOD for test 2 in this experiment is between 10¹ and 10² input PFU. qrRT-PCR is approximately 10-fold more sensitive (LOD of 10⁰ to 10¹ input PFU). (C) The LOD for test 3 in this experiment is between 10² and 10³ input PFU. qrRT-PCR is approximately 100-fold more sensitive (LOD of 10⁰ to 10¹ input PFU). (D) The LOD for test 4 in this experiment is between 10¹ and 10² input PFU; however, single reads are detected down to an input of 10⁻¹ PFU. qrRT-PCR is approximately 10-fold more sensitive (LOD of 10⁰ to 10¹ input PFU) when individual HRV-16 reads are not considered and approximately 10-fold less sensitive when individual HRV-16 reads are considered.
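The LOD comparisons in panels A to D come down to two operations: finding the lowest input that yields a detectable signal in a dilution series, and taking the ratio of two LODs. The sketch below illustrates this under stated assumptions. The function names, the `min_reads` threshold, and the example read counts are hypothetical and are not values from the figure. The threshold matters, as panel D shows: counting single reads as detection lowers the apparent LOD.

```python
def lowest_detected_input(series, min_reads=2):
    """Return the lowest input (in PFU) whose read count meets min_reads.

    series: iterable of (input_pfu, mapped_reads) pairs from a dilution series.
    Returns None when no dilution reaches the threshold.
    """
    detected = [pfu for pfu, reads in series if reads >= min_reads]
    return min(detected) if detected else None


def fold_difference(lod_a, lod_b):
    """Ratio of two LODs; a value above 1 means assay B detects lower inputs."""
    return lod_a / lod_b


# Hypothetical dilution series: (input PFU, HRV-16-mapped reads).
example_series = [(1000, 5000), (100, 400), (10, 0), (1, 1), (0.1, 1)]
```

With this example series, requiring at least two reads gives a lowest detected input of 100 PFU, while accepting single reads gives 0.1 PFU.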
FIG 6 The SOP can detect WNV infection in vitro and in vivo. (A and B) WNV-infected cells were spiked into uninfected cells (A) or uninfected tissues (B), libraries were prepared from RNA according to the SOP, and the libraries were examined by Illumina MiSeq. (C and D) Footpad (C) and brain tissue (D) from WNV-infected mice were analyzed at 5, 10, and 29 days postinfection for WNV-specific sequence reads by Illumina HiSeq. (A) Data representing the ability of the SOP to identify WNV-specific reads from limiting dilutions of WNV-infected Vero cells spiked into uninfected 293T cells. Mapped and unmapped reads from each sample are displayed. (B) The SOP identifies WNV-specific reads from limiting dilutions of WNV-infected Vero cells spiked into uninfected mouse tissues (spleen and brain). Mapped and unmapped reads from each sample are shown. (C) WNV was detected in the footpad RNA of mice prepared according to the SOP at the indicated times postinfection. (D) WNV-specific reads can be detected from brain tissue RNA of mice at the indicated times postinfection. For panels C and D, three mice per group were analyzed, and WNV-mapped and unmapped reads are shown.
FIG 7 The ability of the SOP to sequence and identify unknown samples. (A) High-titer viral stocks were subjected to the SOP, anonymized, and shipped to JCVI for sequencing and data analysis. Samples were pooled and sequenced by Illumina MiSeq. Data from each corresponding sample were assembled de novo, and large contigs (>500 bp) were used to identify the best full-length viral genome references by nucleotide BLAST search against the NT database. Raw data were then mapped onto the best available reference genome. (B) Mapping coverage of an unknown sample against the selected genome for St. Louis encephalitis virus (SLEV). (C) Mapping coverage of an unknown sample against the selected genome for Western equine encephalitis virus (WEEV). (D) Mapping coverage of an unknown sample against the selected genome for Chikungunya virus (CHIKV). In panels B to D, nucleotide coverage depth is indicated on the y axis, and genomic position, with the length of each genome indicated as well as the best available reference genome, is indicated on the x axis.
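The reference-selection logic in panel A (keep contigs longer than 500 bp, then choose the best-matching reference) can be sketched as follows. This is an illustration only: the caption does not state how the "best" reference was chosen, so ranking hits by bit score is an assumption, and the function names and the `(contig, reference_id, bit_score)` hit format are hypothetical. Running the assembler and the BLAST search themselves is outside the sketch.

```python
def select_large_contigs(contigs, min_length=500):
    """Return names of contigs longer than min_length bp, longest first.

    contigs: dict mapping contig name to its nucleotide sequence.
    """
    kept = {name: seq for name, seq in contigs.items() if len(seq) > min_length}
    return sorted(kept, key=lambda name: len(kept[name]), reverse=True)


def best_reference(hits):
    """Pick the reference ID of the highest-scoring hit.

    hits: list of (contig, reference_id, bit_score) tuples, e.g. parsed from
    tabular search output. Ranking by bit score is an assumption of this
    sketch, not a criterion stated in the source.
    """
    if not hits:
        return None
    return max(hits, key=lambda hit: hit[2])[1]
```

The selected reference ID would then be used as the target for mapping the raw reads, as described in the caption.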