
Transcriptome-wide survey of pseudorabies virus using next- and third-generation sequencing platforms.

Dóra Tombácz1, Donald Sharon2, Attila Szűcs1, Norbert Moldován1, Michael Snyder2, Zsolt Boldogkői1.   

Abstract

Pseudorabies virus (PRV) is an alphaherpesvirus of swine. PRV has a large double-stranded DNA genome and, as recent investigations have revealed, a very complex transcriptome. Here, we present a large RNA-Seq dataset derived from both short- and long-read sequencing. The dataset contains 1.3 million 100 bp paired-end reads obtained from the Illumina random-primed libraries, as well as 10 million 50 bp single-end reads generated by Illumina polyA-seq. The Pacific Biosciences RSII non-amplified method yielded 57,021 reads of inserts (ROIs) aligned to the viral genome, the amplified method resulted in 158,396 PRV-specific ROIs, and the Sequel platform yielded 12,555 ROIs. The Oxford Nanopore MinION device generated 44,006 reads using the regular cDNA-sequencing method, whereas 29,832 and 120,394 reads were produced by the direct RNA-sequencing and the Cap-selection protocols, respectively. The raw reads were aligned to the PRV reference genome (KJ717942.1). The dataset can be used to compare different sequencing approaches and library preparation methods, as well as to validate and test bioinformatic pipelines.

Year:  2018        PMID: 29917014      PMCID: PMC6007087          DOI: 10.1038/sdata.2018.119

Source DB:  PubMed          Journal:  Sci Data        ISSN: 2052-4463            Impact factor:   6.444


Background & Summary

Pseudorabies virus (PRV) is the causative agent of Aujeszky’s disease (AD)[1] in pigs. PRV has a double-stranded DNA genome of approximately 143 kbp. PRV is often employed in laboratories to study the molecular pathomechanism of the herpesviruses[2]. It is also a suitable tool for gene and tumour therapy[3], as well as for mapping neuronal circuits[4-8]. The virus has also been used as a live vaccine against AD[9-11]. Here, we provide a large dataset derived from RNA-Seq experiments using different next-generation sequencing (NGS) and third-generation sequencing (3rdGS) techniques (Fig. 1). Our aim was to provide a dataset that can be used to compare the different sequencing platforms and library preparation methods, using PRV as a model organism. In addition, these data can be used to identify novel coding and non-coding transcripts, transcript isoforms and splice variants of PRV, and to define full-length transcripts by combining sequencing platforms.
Figure 1

Data flow diagram shows the detailed overview of the study design.

The Illumina HiScanSQ, one of the most popular NGS platforms, was used to generate high-quality short reads and extremely high coverage across the entire PRV genome. A random-primed cDNA library was prepared from viral RNAs. Paired-end RNA sequencing was carried out to characterize novel splice isoforms, as well as to obtain general information on the transcriptional activity of PRV[12]. PolyA-sequencing was used to determine the 3′-ends of RNA molecules. With this technique, we were able to detect alternative polyadenylation events in the PRV transcripts. Both libraries were run on a single flow cell, resulting in 1.3 million 100 bp paired-end reads from the random hexamer-primed libraries and 10 million 50 bp single-end reads from the poly(A)-enriched RNA-Seq, aligning to the viral reference[13] (KJ717942.1). Although the error rate of 3rdGS techniques is higher than that of NGS[14], they are able to identify novel full-length transcripts[15-17] and are therefore better suited to global transcriptome profiling and RNA isoform detection than short-read techniques. The Real-Time Sequencer II (RSII) and the Sequel 3rdGS platforms from Pacific Biosciences (PacBio) and the Oxford Nanopore Technologies (ONT) MinION 3rdGS device were used to characterize the static[18,19] and dynamic[20] PRV transcriptome. These sequencing techniques, together with the library preparation methods used in these studies (e.g. the non-amplified SMRT method and the amplified Iso-Seq protocol from PacBio; full-length cDNA-sequencing, direct RNA-sequencing, and cDNA-sequencing on 5′ Cap-selected samples from ONT; Figs. 1 and 2), made it possible to identify several hundred novel transcript isoforms (including 3′- and 5′-UTR variants and splice isoforms), as well as dozens of protein-coding and non-coding RNAs and numerous complex transcripts of PRV.
Figure 2

Data flow diagram shows the detailed overview of the wet lab experiments and bioinformatics pipelines.

Seventy-one SMRT Cells were run on the RSII system. P5-C3 chemistry and a 180-minute data collection mode were used for the non-amplified samples, while P6-C4 chemistry was applied and 240 or 360 min movies were recorded for the amplified samples. cDNAs were sequenced on a single Sequel SMRT Cell with P6-C4 reagents and a 10 h run time. Altogether, seven MinION flow cells were used for the different ONT approaches. The raw sequencing reads were mapped to the above-mentioned reference genome. Sequencing on the RSII platform resulted in 215,417 reads of inserts (ROIs), while the nanopore sequencing methods generated altogether 194,232 PRV-specific reads (Table 1). The average lengths of reads aligning to the PRV genome were 1,326 bp for PacBio RSII, 1,763 bp for the Sequel and 827 bp for ONT. It should be noted that the library preparation and size-selection methods yielded samples with different length distributions (Table 2).
Table 1

Summary of the obtained read counts from long-read sequencing aligned to the PRV genome.

Sample (a) | Number of PRV reads (b) | Number of mapped reads (c)
MinION-Cap-selected | 120394 | 131223
MinION-cDNA | 44006 | 47363
MinION-RNA | 29832 | 30667
Sequel | 12555 | 13481
RSII-PolyA-amplified-1h | 6956 | 6956
RSII-PolyA-amplified-4h | 22958 | 22958
RSII-PolyA-amplified-8h | 46147 | 46147
RSII-PolyA-amplified-1, 2, 4, 6, 8h-Bluepippin 0.8kb-5kb+ | 1077 | 1077
RSII-PolyA-amplified-1, 2, 4, 6, 8h-Bluepippin 0.8-2kb | 7728 | 7728
RSII-PolyA-amplified-1, 2, 4, 6, 8h-Bluepippin 2-3kb | 6131 | 6131
RSII-PolyA-amplified-1, 2, 4, 6, 8h-Bluepippin 3-5kb | 3705 | 3705
RSII-PolyA-amplified-1, 2, 4, 6, 8h-Bluepippin 5kb+ | 4061 | 4061
RSII-PolyA-amplified-1, 4, 8h,-Manual Gel 1-2kb | 3822 | 3822
RSII-PolyA-amplified-1h-Manual Gel 3kb+ | 3110 | 3110
RSII-PolyA-amplified-1h-Manual Gel 2-3kb | 483 | 510
RSII-PolyA-amplified-4h-Manual Gel 3kb+ | 2645 | 2928
RSII-PolyA-amplified-4h-Manual Gel 2-3kb | 14348 | 15708
RSII-PolyA-amplified-8h-Manual Gel 3kb+ | 3443 | 3765
RSII-PolyA-amplified-8h-Manual Gel 2-3kb | 1608 | 1765
RSII-random-amplified-1, 2, 4, 6, 8h | 1804 | 1953
RSII-random-amplified-1h | 2327 | 2485
RSII-random-amplified-4h | 7202 | 7629
RSII-random-amplified-8h | 11452 | 12259
RSII-random-amplified-1, 4, 8h,-Manual Gel 3kb+ | 1778 | 1855
RSII-random-amplified-1, 4, 8h,-Manual Gel 1-2kb | 2218 | 2425
RSII-random-amplified-1, 4, 8h,-Manual Gel 2-3kb | 3393 | 3627
RSII-PolyA-non amplified-1h (1st) | 74 | 81
RSII-PolyA-non amplified-1h (2nd) | 7392 | 7440
RSII-PolyA-non amplified-2h (1st) | 124 | 125
RSII-PolyA-non amplified-2h (2nd) | 12944 | 13121
RSII-PolyA-non amplified-4h (1st) | 1369 | 1574
RSII-PolyA-non amplified-4h (2nd) | 1023 | 1054
RSII-PolyA-non amplified-6h (1st) | 2144 | 2269
RSII-PolyA-non amplified-6h (2nd) | 10151 | 10231
RSII-PolyA-non amplified-8h (1st) | 219 | 237
RSII-PolyA-non amplified-8h (2nd) | 7915 | 8015
RSII-PolyA-non amplified-12h (1st) | 935 | 1002
RSII-PolyA-non amplified-12h (2nd) | 12731 | 12864

(a) Type of the samples.

(b) Total number of PRV-specific reads.

(c) Total number of reads mapped to the PRV genome. (There is an approximately 15 kb-long inverted repeat sequence region in the PRV genome; reads that map to this location are therefore counted twice in column c.)
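As a quick consistency check, the per-sample PRV read counts in Table 1 can be summed and compared with the totals quoted in the text. The following is a minimal sketch; the variable names are illustrative and not part of the published pipeline, and the counts are copied by hand from the "Number of PRV reads" column.

```python
# PRV-specific read counts copied from Table 1 (column b), grouped by method.
# Variable names are illustrative only.
ont_reads = {
    "MinION-Cap-selected": 120394,
    "MinION-cDNA": 44006,
    "MinION-RNA": 29832,
}

# RSII amplified (Iso-Seq) samples: PolyA- and random-primed, with and
# without size selection, in the order they appear in Table 1.
rsii_amplified = [
    6956, 22958, 46147,               # PolyA, no size selection (1, 4, 8 h)
    1077, 7728, 6131, 3705, 4061,     # PolyA, BluePippin fractions
    3822, 3110, 483, 2645, 14348, 3443, 1608,  # PolyA, manual gel fractions
    1804, 2327, 7202, 11452,          # random-primed, no size selection
    1778, 2218, 3393,                 # random-primed, manual gel fractions
]

# RSII non-amplified samples: two runs (1st, 2nd) per time point.
rsii_non_amplified = [
    74, 7392, 124, 12944, 1369, 1023,
    2144, 10151, 219, 7915, 935, 12731,
]

totals = {
    "ont": sum(ont_reads.values()),
    "rsii_amplified": sum(rsii_amplified),
    "rsii_non_amplified": sum(rsii_non_amplified),
}
totals["rsii_all"] = totals["rsii_amplified"] + totals["rsii_non_amplified"]

for name, value in totals.items():
    print(f"{name}: {value:,}")
```

The sums reproduce the figures given in the Abstract and in the Background & Summary: 57,021 non-amplified and 158,396 amplified RSII ROIs (215,417 in total), and 194,232 ONT reads.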

Table 2

Summary of the obtained read lengths from long-read sequencing.

Sample (a) | Average read length, bp (b) | Read length SD, bp (c)
MinION-Cap-selected | 810.00 | 519.07
MinION-cDNA | 786.00 | 936.86
MinION-RNA | 909.00 | 664.56
Sequel | 1763.00 | 745.01
RSII-PolyA-amplified-1 h | 1333.00 | 989.17
RSII-PolyA-amplified-4 h | 1193.00 | 977.50
RSII-PolyA-amplified-8 h | 1365.00 | 874.74
RSII-PolyA-amplified-1, 2, 4, 6, 8 h-Bluepippin 0.8kb-5 kb+ | 1555.00 | 938.80
RSII-PolyA-amplified-1, 2, 4, 6, 8 h-Bluepippin 0.8-2 kb | 1362.00 | 381.91
RSII-PolyA-amplified-1, 2, 4, 6, 8 h-Bluepippin 2-3 kb | 1300.00 | 474.58
RSII-PolyA-amplified-1, 2, 4, 6, 8 h-Bluepippin 3-5 kb | 1048.00 | 489.94
RSII-PolyA-amplified-1, 2, 4, 6, 8 h-Bluepippin 5 kb+ | 1268.00 | 484.31
RSII-PolyA-amplified-1, 4, 8 h,-Manual Gel 1-2 kb | 1251.00 | 532.61
RSII-PolyA-amplified-1 h-Manual Gel 3 kb+ | 1805.00 | 1875.46
RSII-PolyA-amplified-1 h-Manual Gel 2-3 kb | 1661.00 | 852.13
RSII-PolyA-amplified-4 h-Manual Gel 3 kb+ | 2054.00 | 3654.40
RSII-PolyA-amplified-4 h-Manual Gel 2-3 kb | 1644.00 | 1874.46
RSII-PolyA-amplified-8 h-Manual Gel 3 kb+ | 1660.00 | 358.94
RSII-PolyA-amplified-8 h-Manual Gel 2-3 kb | 1701.00 | 1162.41
RSII-random-amplified-1, 2, 4, 6, 8 h | 772.00 | 382.48
RSII-random-amplified-1 h | 1121.00 | 438.93
RSII-random-amplified-4 h | 1109.00 | 557.74
RSII-random-amplified-8 h | 1105.00 | 468.34
RSII-random-amplified-1, 4, 8 h,-Manual Gel 3 kb+ | 1726.00 | 2107.78
RSII-random-amplified-1, 4, 8 h,-Manual Gel 1-2 kb | 999.00 | 522.86
RSII-random-amplified-1, 4, 8 h,-Manual Gel 2-3 kb | 1173.00 | 1725.91
RSII-PolyA-non amplified-1 h (1st) | 1077.00 | 323.67
RSII-PolyA-non amplified-1 h (2nd) | 1403.00 | 625.65
RSII-PolyA-non amplified-2 h (1st) | 1227.00 | 428.39
RSII-PolyA-non amplified-2 h (2nd) | 1331.00 | 608.73
RSII-PolyA-non amplified-4 h (1st) | 1207.00 | 529.63
RSII-PolyA-non amplified-4 h (2nd) | 1307.00 | 585.04
RSII-PolyA-non amplified-6 h (1st) | 1189.00 | 485.97
RSII-PolyA-non amplified-6 h (2nd) | 1211.00 | 473.44
RSII-PolyA-non amplified-8 h (1st) | 1120.00 | 443.99
RSII-PolyA-non amplified-8 h (2nd) | 1390.00 | 554.59
RSII-PolyA-non amplified-12 h (1st) | 1081.00 | 410.88
RSII-PolyA-non amplified-12 h (2nd) | 1336.00 | 555.02

(a) Type of the samples.

(b) Average read lengths of the different library preparation and long-read sequencing approaches.

(c) Standard deviation (SD) values.

This dataset can help explore the advantages and disadvantages of each sequencing method used in this work. It can be used to analyse multiple features of the sequencing platforms, including read length, base-calling error rate, coverage and mappability. The performance of Illumina, PacBio and ONT can be compared through the analysis of the identified transcript isoforms and the quantification of the transcriptome. The dataset is also useful for analysing the transcriptome complexity of PRV. It includes a sub-dataset that can be used for the transcriptome analysis of PRV over an infection period covering six different time points. Here we provide a detailed overview of the library preparation techniques and a description of the data (Table 3, Figs. 1 and 2).
Table 3

Summary table of the various wet lab approaches used in this study.

Sample No | Sample | Sample time points (h pi) | RT priming | Cap-selection | Amplification | Size selection | Library prep | Platform
1 | Mixed | 1, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24 | Oligo(d)T | no | yes | no | Illumina single-end | HiScan SQ
2 | Mixed | 1, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24 | Random | no | yes | no | Illumina paired-end | HiScan SQ
3 | 1 h | 1 | Oligo(d)T | no | yes | no | PacBio Isoform seq | RSII
4 | 4 h | 4 | Oligo(d)T | no | yes | no | PacBio Isoform seq | RSII
5 | 8 h | 8 | Oligo(d)T | no | yes | no | PacBio Isoform seq | RSII
6 | Mixed | 1, 2, 4, 6, 8 | Oligo(d)T | no | yes | Bluepippin 5kb+ | PacBio Isoform seq | RSII
7 | Mixed | 1, 2, 4, 6, 8 | Oligo(d)T | no | yes | Bluepippin 0.8-2kb | PacBio Isoform seq | RSII
8 | Mixed | 1, 2, 4, 6, 8 | Oligo(d)T | no | yes | Bluepippin 2-3kb | PacBio Isoform seq | RSII
9 | Mixed | 1, 2, 4, 6, 8 | Oligo(d)T | no | yes | Bluepippin 3-5kb | PacBio Isoform seq | RSII
10 | Mixed | 1, 2, 4, 6, 8 | Oligo(d)T | no | yes | Bluepippin 5kb+ | PacBio Isoform seq | RSII
11 | Mixed | 1, 4, 8 | Oligo(d)T | no | yes | Manual Gel 1-2kb | PacBio Isoform seq | RSII
12 | 1 h | 1 | Oligo(d)T | no | yes | Manual Gel 3kb+ | PacBio Isoform seq | RSII
13 | 1 h | 1 | Oligo(d)T | no | yes | Manual Gel 2-3kb | PacBio Isoform seq | RSII
14 | 4 h | 4 | Oligo(d)T | no | yes | Manual Gel 3kb+ | PacBio Isoform seq | RSII
15 | 4 h | 4 | Oligo(d)T | no | yes | Manual Gel 2-3kb | PacBio Isoform seq | RSII
16 | 8 h | 8 | Oligo(d)T | no | yes | Manual Gel 3kb+ | PacBio Isoform seq | RSII
17 | 8 h | 8 | Oligo(d)T | no | yes | Manual Gel 2-3kb | PacBio Isoform seq | RSII
18 | Mixed | 1, 2, 4, 6, 8 | Random | no | yes | no | PacBio Isoform seq | RSII
19 | 1 h | 1 | Random | no | yes | no | PacBio Isoform seq | RSII
20 | 4 h | 4 | Random | no | yes | no | PacBio Isoform seq | RSII
21 | 8 h | 8 | Random | no | yes | no | PacBio Isoform seq | RSII
22 | Mixed | 1, 4, 8 | Random | no | yes | Manual Gel 3kb+ | PacBio Isoform seq | RSII
23 | Mixed | 1, 4, 8 | Random | no | yes | Manual Gel 1-2kb | PacBio Isoform seq | RSII
24 | Mixed | 1, 4, 8 | Random | no | yes | Manual Gel 2-3kb | PacBio Isoform seq | RSII
25 | 1 h | 1 | Oligo(d)T | no | no | no | PacBio 2kb | RSII
26 | 1 h | 1 | Oligo(d)T | no | no | no | PacBio Very Low Input | RSII
27 | 2 h | 2 | Oligo(d)T | no | no | no | PacBio 2kb | RSII
28 | 2 h | 2 | Oligo(d)T | no | no | no | PacBio Very Low Input | RSII
29 | 4 h | 4 | Oligo(d)T | no | no | no | PacBio 2kb | RSII
30 | 4 h | 4 | Oligo(d)T | no | no | no | PacBio Very Low Input | RSII
31 | 6 h | 6 | Oligo(d)T | no | no | no | PacBio 2kb | RSII
32 | 6 h | 6 | Oligo(d)T | no | no | no | PacBio Very Low Input | RSII
33 | 8 h | 8 | Oligo(d)T | no | no | no | PacBio 2kb | RSII
34 | 8 h | 8 | Oligo(d)T | no | no | no | PacBio Very Low Input | RSII
35 | 12 h | 12 | Oligo(d)T | no | no | no | PacBio 2kb | RSII
36 | 12 h | 12 | Oligo(d)T | no | no | no | PacBio Very Low Input | RSII
37 | Mixed | 1, 2, 4, 6, 8 | Oligo(d)T | no | yes | no | PacBio Isoform seq | Sequel
38 | Mixed | 1, 2, 3, 4, 5, 6, 7, 8, 9, 12, 18, 24 | Oligo(d)T | no | no | no | ONT Direct RNA | MinION
39 | Mixed | 1, 2, 3, 4, 5, 6, 7, 8, 9, 12, 18, 24 | Oligo(d)T | yes | yes | no | ONT cDNA | MinION
40 | Mixed | 1, 2, 3, 4, 5, 6, 7, 8, 9, 12, 18, 24 | Oligo(d)T | no | yes | no | ONT cDNA | MinION

Methods

Schematic overviews of the methodological workflow are shown in the flowchart of Figs. 1 and 2. The applied reagents and utilized approaches are listed in Table 4.
Table 4

Summary table of the reagents and chemistries used for the sequencing.

Total RNA isolationPolyA selectionRibodepletionReverse transcription & dscDNA productioncDNA synthesis by PCRLibrary preparation kitSequencing chemistryInstrument
Macherey-Nagel RNAQiagen Oligotex mRNA mini KitStarScript AMV Reverse TranscriptaseFailSafe PCR PreMix Selection KitScriptSeq v2 RNA-Seq Library Preparation KitNAHiScan 2500
Epicentre Ribo-Zero™ Magnetic Kit H/M/R
Qiagen Oligotex mRNA mini KitSuperScript III & SuperScript double-stranded cDNA Synthesis KitPacBio Template Preparation KitP5-C3RSII
Clontech SMARTer PCR cDNA Synthesis KitKAPA HiFi PCR KitP6-C4
Epicentre Ribo-Zero™ Magnetic Kit H/M/R
Qiagen Oligotex mRNA mini KitSequel
SuperScript IIIDirect RNA Sequencing KitNAMinION
SuperScript IVKAPA HiFi PCR KitLigation Sequencing Kit 1DNA
Lexogen Teloprime Kit enzymes & reagentsLexogen Teloprime PCR mixLigation Sequencing Kit 1DNA

Cells, viruses and infection conditions

Immortalized porcine kidney-15 (PK-15; ATCC® CCL-33™) cells were used for the propagation of pseudorabies virus strain Kaplan (PRV-Ka) at 37 °C and 5% CO2 in Dulbecco’s modified Eagle medium (DMEM, Gibco Invitrogen) supplemented with 5% foetal bovine serum (FBS; Gibco Invitrogen) and gentamycin (80 μg/ml). The virus stock was originally obtained from the Kaplan Lab (Department of Microbiology, Vanderbilt University School of Medicine, Nashville, Tennessee)[21]; Vanderbilt University had received it from Dr. Richard F. Haff as a suspension of infected mouse brain[22]. The virus stock was prepared as follows: the medium was removed from rapidly growing, semi-confluent PK-15 cells, which were then infected with the Kaplan strain of PRV at a multiplicity of infection (MOI) of 0.1 plaque-forming units (pfu)/cell. Infected cells were incubated until a complete cytopathic effect was observed. Samples were subjected to three freeze-thaw cycles, followed by centrifugation at 10,000 g for 15 min. The titre of the virus stock was determined in PK-15 cells. For all experiments, cells were infected at a high MOI (10 pfu/cell) and incubated for 1 h, followed by removal of the virus suspension and washing of the cells with phosphate-buffered saline (PBS). Each culture flask contained 5 × 10⁶ cells. After the addition of new medium, the cells were incubated for 1, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22 or 24 h post-infection (pi), and these samples were pooled for Illumina sequencing (Table 3). Samples incubated for 1, 2, 4, 6, 8 and 12 h were used for the non-amplified PacBio sequencing, while the 1, 2, 4, 6 and 8 h pi samples were used for the amplified PacBio Iso-Seq protocol. Samples from different time points were sequenced individually on the RSII, but they were also mixed for PacBio sequencing (Table 3). For all types of ONT sequencing, a mixture of samples incubated for 1, 2, 3, 4, 5, 6, 7, 8, 9, 12, 18 and 24 h was used (Table 3).
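The infection conditions imply a simple calculation: the number of plaque-forming units needed per flask is the cell count multiplied by the MOI, and the inoculum volume follows from the titre of the stock. The sketch below illustrates this. The cell number and MOI come from the text, whereas the stock titre is a hypothetical value chosen only for the example, since the titre is not reported here.

```python
def inoculum_volume_ml(cells: float, moi_pfu_per_cell: float,
                       titre_pfu_per_ml: float) -> float:
    """Volume of virus stock that delivers `moi_pfu_per_cell` to `cells` cells."""
    if titre_pfu_per_ml <= 0:
        raise ValueError("titre must be positive")
    return cells * moi_pfu_per_cell / titre_pfu_per_ml


CELLS_PER_FLASK = 5_000_000   # 5 x 10^6 cells per flask (from the text)
HIGH_MOI = 10                 # pfu/cell used for all experiments (from the text)
EXAMPLE_TITRE = 1e9           # pfu/ml; HYPOTHETICAL, not reported in the text

pfu_needed = CELLS_PER_FLASK * HIGH_MOI
volume_ml = inoculum_volume_ml(CELLS_PER_FLASK, HIGH_MOI, EXAMPLE_TITRE)

print(f"pfu needed per flask: {pfu_needed:.1e}")
print(f"stock volume at the example titre: {volume_ml * 1000:.0f} µl")
```

With these inputs, each flask requires 5 × 10⁷ pfu; the corresponding volume scales inversely with whatever titre the stock actually has.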

RNA purification

Isolation of total RNAs

The NucleoSpin® RNA II kit (Macherey-Nagel) was used to isolate RNA from samples for Illumina sequencing, while its newer version, the NucleoSpin® RNA kit (Macherey-Nagel), was used for all other samples, as described earlier[12,18,20]. Briefly, cells were collected by centrifugation and lysed by incubation in a solution containing large amounts of chaotropic ions. This buffer inactivates RNases. Nucleic acid molecules bind to the silica membrane. All samples were treated with DNase I solution (provided with the kit) to remove residual DNA contamination. Total RNAs were eluted from the membrane in RNase-free water. To eliminate any remaining DNA contamination, samples were treated with the Ambion® TURBO DNA-free™ Kit. The final concentrations of the RNA samples were determined with the Qubit® 2.0 Fluorometer using the Qubit RNA BR Assay Kit (Life Technologies). RNA quality was assessed with the Agilent Bioanalyzer 2100, and samples with RIN scores above 9.6 were used for cDNA production. RNA samples were stored at −80 °C until further use.

Samples were prepared as follows. The Illumina oligo(dT)- and random-primed sequencing reactions were carried out from the same RNA mixture. The libraries for the kinetic analysis and for the mixed sequencing using PacBio RSII (non-amplified method) were all prepared from different cell culture flasks (containing 5 × 10⁶ cells/flask), but the same virus stock was used for these infections. For the amplified RSII sequencing, samples were prepared from separate flasks (each containing 5 × 10⁶ cells and infected with the same virus stock). The Sequel and the various MinION libraries were prepared from the same RNA mixture.

Ribosomal RNA depletion

For the Illumina sequencing and for the PacBio random-primed sequencing, the total RNA samples were depleted of rRNA using the Epicentre Ribo-Zero™ Magnetic Kit H/M/R (Illumina).

Selection of PolyA(+) RNA

For the PacBio and MinION polyA sequencing, the polyA(+) fraction of the RNA samples was isolated using the Qiagen Oligotex mRNA Mini Kit, following the “Spin Columns” protocol. PolyA-purified and rRNA-depleted RNA samples were quantified using the Qubit RNA HS Assay Kit (Life Technologies) and then subjected to cDNA synthesis according to the downstream applications.

cDNA synthesis, library preparation and sequencing

Illumina sequencing

Total RNA was purified from PK-15 cells at various stages of PRV infection from 1 to 24 h pi, and the samples were then mixed together to capture a wide variety of viral transcripts. Libraries were prepared from ribo-depleted samples using the ScriptSeq v2 RNA-Seq Library Preparation Kit (Epicentre/Illumina) according to the manufacturer’s recommendations. The kit uses a random-primed (random hexamer with tagging sequence) cDNA synthesis reaction; this “original” protocol was used to construct a paired-end library, whereas for PolyA sequencing (PA-seq) a single-end library was prepared using custom anchored adaptor-primer oligonucleotides with an oligo(VN)T20 primer sequence (Table 5). Briefly, the rRNA-depleted RNA samples were mixed with the primer (random or oligo(dT)) and the RNA Fragmentation Solution (part of the ScriptSeq Kit), and the mixtures were incubated at 85 °C for 5 min. The following kit components were mixed together: cDNA Synthesis Premix, DTT and StarScript AMV Reverse Transcriptase. This reagent mix was added to the pre-heated RNA mixtures, which were incubated at 25 °C for 5 min and then at 42 °C for 20 min. Reactions were cooled down to 37 °C. Finishing Solution was added to the samples, and incubation was continued at 37 °C for an additional 10 min, then at 95 °C for 3 min. Samples were cooled down to 25 °C, and Terminal Tagging Premix and DNA Polymerase were added. The incubation was continued at 25 °C for 15 min and then at 95 °C for 3 min. The di-tagged cDNA samples were purified using AMPure XP beads (Beckman Coulter). The purified samples were amplified by PCR using the FailSafe PCR Premix E, primers and FailSafe PCR enzyme (Lucigen, Epicentre). The PCR conditions were as follows: initial denaturation at 95 °C for 1 min, followed by 15 cycles of 95 °C for 30 s, 55 °C for 30 s and 68 °C for 3 min. The final incubation step was carried out at 68 °C for 7 min.
The PCR amplicons were purified with AMPure beads. The quantity and quality of the samples were checked using the Qubit fluorometer (Life Technologies) and the Agilent 2100 Bioanalyzer, respectively.
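The FailSafe PCR program described above can be written down as a small data structure, which makes it easy to estimate the total hold time. This is only a sketch: the function name is illustrative, and the estimate ignores ramping between temperatures, which depends on the thermocycler.

```python
def pcr_hold_time_s(initial_s: int, cycles: int,
                    step_times_s: tuple, final_s: int) -> int:
    """Sum of programmed hold times (seconds), excluding temperature ramps."""
    return initial_s + cycles * sum(step_times_s) + final_s


# Illumina library amplification as described in the text:
# 95 °C 1 min; 15 x (95 °C 30 s, 55 °C 30 s, 68 °C 3 min); 68 °C 7 min.
illumina_program_s = pcr_hold_time_s(
    initial_s=60,
    cycles=15,
    step_times_s=(30, 30, 180),
    final_s=7 * 60,
)

print(f"{illumina_program_s} s = {illumina_program_s / 60:.0f} min of hold time")
```

The programmed steps add up to 4,080 s, i.e. 68 min before ramp times are taken into account.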
Table 5

The list of primers used in this study for the reverse transcription reactions.

Sequencing method | Name, availability | Catalog # | Sequence (5' -> 3')
Illumina PolyA | Custom-made adaptor-primer VN(T)20 (IDT DNA) | - | GTGTGCTCTTCCGATCT(T)20VN
Illumina random | Random hexamer + tagging sequence, ScriptSeq™ v2 RNA-Seq Library Preparation Kit (Epicentre) | SSV21106 & SSV21124 | GTGTGCTCTTCCGATCTNNNNNN
PacBio non-amplified | Anchored Oligo(dT)20 primers (Life Technologies) | 12577011 | TTTTTTTTTTTTTTTTTTTTVN
PacBio amplified PolyA | 3' SMART CDS primer II A, SMARTer PCR cDNA Synthesis Kit (Clontech) | 634925 & 634926 | AAGCAGTGGTATCAACGCAGAGTAC(T)30VN
PacBio amplified Random | Custom made (IDT DNA) | - | AAGCAGTGGTATCAACGCAGAGTACNNNNNN (G: 37%; C: 37%; A: 13%; T: 13%)
MinION cDNA | Poly(T)-containing anchored primer [(VN)T20, ONT recommended], custom made (Bio Basic) | - | 5phos/ ACTTGCCTGTCGCTCTATCTTC(T)20VN
MinION CAP selected | TeloPrime Full-Length cDNA Amplification Kit (Lexogen) | 013.08 & 013.24 | TCTCAGGCGTTTTTTTTTTTTTTTTTT
MinION RNA | RT adapter, Direct RNA Sequencing Kit (Oxford Nanopore Technologies) | SQK-RNA001 | GAGGCGAGCGGTCAATTTTCCTAAGAGCAAGAAGAAGCCTTTTTTTTTT
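Several primers in Table 5 are written in shorthand, with repeat runs such as (T)20 and IUPAC ambiguity codes (V = A, C or G; N = any base). The sketch below expands that shorthand and counts how many distinct oligonucleotides a degenerate primer represents. The helper names are illustrative; only the codes that occur in Table 5 are included in the lookup table.

```python
import re
from itertools import product

# IUPAC codes that appear in Table 5 (a subset of the full code table).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "V": "ACG", "N": "ACGT"}


def expand_repeats(shorthand: str) -> str:
    """Expand '(T)20'-style runs into explicit bases, e.g. 'AC(T)3' -> 'ACTTT'."""
    return re.sub(r"\(([A-Z]+)\)(\d+)",
                  lambda m: m.group(1) * int(m.group(2)),
                  shorthand)


def degeneracy(seq: str) -> int:
    """Number of distinct sequences encoded by a degenerate primer."""
    count = 1
    for base in seq:
        count *= len(IUPAC[base])
    return count


def variants(seq: str) -> list:
    """Enumerate every concrete sequence (only sensible for low degeneracy)."""
    return ["".join(p) for p in product(*(IUPAC[b] for b in seq))]


polya_primer = expand_repeats("GTGTGCTCTTCCGATCT(T)20VN")   # Illumina PolyA
random_primer = "GTGTGCTCTTCCGATCTNNNNNN"                   # Illumina random

print(len(polya_primer), degeneracy(polya_primer))
print(degeneracy(random_primer))
```

The anchored VN end yields 3 × 4 = 12 primer variants, whereas a random hexamer yields 4⁶ = 4,096. Note that this count treats all bases as equally likely, which does not hold for the GC-biased custom random primer listed for the PacBio amplified protocol.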

PacBio SMRTbell library preparation & sequencing – the non-amplified method & RSII sequencing

Generation of cDNAs

The SuperScript Double-Stranded cDNA Synthesis Kit (Life Technologies) was used to prepare cDNAs from the polyA(+) RNA samples. These samples were used to quantify the PRV transcriptome during the infection period between 1 and 12 h. The reverse transcriptase included in the kit was replaced with SuperScript III Reverse Transcriptase. The first-strand synthesis reactions were primed with Anchored Oligo(dT)20 primers (Life Technologies, Table 5). The obtained cDNAs were quantified using the Qubit HS dsDNA Assay Kit (Life Technologies). The total amount of cDNA synthesized at each time point was used to prepare SMRTbell templates.

Preparation of SMRTbell libraries-“Barcoding method”

The cDNA samples (~500 ng/sample) were used to prepare SMRTbell templates by using the PacBio DNA Template Prep Kit 1.0 following the Pacific Biosciences’ 2 kb Template Preparation and Sequencing protocol.

Repairing the cDNA ends

Template Prep Buffer, ATP High, dNTP and End Repair Mix (PacBio) were added to the samples and then they were incubated at 25 °C for 15 min.

Sample purification

A 0.6× volume of AMPure PB beads was added to the samples, which were mixed on a VWR vortex mixer for 10 min at 2000 rpm at room temperature. Tubes were placed in a magnetic bead rack for 3 min. After the bead pellets had formed, the supernatant was discarded. Beads were washed twice with freshly prepared 70% ethanol. Samples were dried and then eluted in 30 μl Elution Buffer (PacBio).

Adapter ligation

This step was carried out at 25 °C for 15 min with the addition of specific bar-coded adapters (Table 6), Template Prep Buffer, ATP Low and Ligase (PacBio). The enzyme was inactivated at 65 °C (10 min).
Table 6

Barcode sequences utilized for PacBio sequencing.

Name | Sequence
PBBC adapter #1 | [Phos] TATGCTAATCTCTCTCTTTTCCTCCTCCTCCGTTGTTGTTGTTGAGAGAGATTAGCATA
PBBC adapter #2 | [Phos] GACAGTGATCTCTCTCTTTTCCTCCTCCTCCGTTGTTGTTGTTGAGAGAGATCACTGTC
PBBC adapter #3 | [Phos] GATCTCGATCTCTCTCTTTTCCTCCTCCTCCGTTGTTGTTGTTGAGAGAGATCGAGATC
PBBC adapter #4 | [Phos] TACACGTATCTCTCTCTTTTCCTCCTCCTCCGTTGTTGTTGTTGAGAGAGATACGTGTA
PBBC adapter #5 | [Phos] GAGCTCAATCTCTCTCTTTTCCTCCTCCTCCGTTGTTGTTGTTGAGAGAGATTGAGCTC
PBBC adapter #6 | [Phos] TCTGCAGATCTCTCTCTTTTCCTCCTCCTCCGTTGTTGTTGTTGAGAGAGATCTGCAGA

This table contains the modified adapter sequences with unique barcodes, which were used for multiplex sequencing. The barcode sequences are labelled in blue.
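The adapter sequences in Table 6 share a common core and differ only at their ends, where each 5′ flank is mirrored by its reverse complement at the 3′ end, as expected for a hairpin adapter. The sketch below extracts these flanks. It assumes that the barcode is the first 7 nucleotides of each adapter; this length is inferred from the sequences themselves and is not stated in the text.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


# Adapter sequences copied from Table 6 (5' phosphate omitted).
ADAPTERS = {
    1: "TATGCTAATCTCTCTCTTTTCCTCCTCCTCCGTTGTTGTTGTTGAGAGAGATTAGCATA",
    2: "GACAGTGATCTCTCTCTTTTCCTCCTCCTCCGTTGTTGTTGTTGAGAGAGATCACTGTC",
    3: "GATCTCGATCTCTCTCTTTTCCTCCTCCTCCGTTGTTGTTGTTGAGAGAGATCGAGATC",
    4: "TACACGTATCTCTCTCTTTTCCTCCTCCTCCGTTGTTGTTGTTGAGAGAGATACGTGTA",
    5: "GAGCTCAATCTCTCTCTTTTCCTCCTCCTCCGTTGTTGTTGTTGAGAGAGATTGAGCTC",
    6: "TCTGCAGATCTCTCTCTTTTCCTCCTCCTCCGTTGTTGTTGTTGAGAGAGATCTGCAGA",
}

BARCODE_LEN = 7  # ASSUMPTION: inferred from the variable flanks, not stated


def barcode(adapter: str, k: int = BARCODE_LEN) -> str:
    """Return the 5' flank, checking that the 3' flank is its reverse complement."""
    head, tail = adapter[:k], adapter[-k:]
    if reverse_complement(head) != tail:
        raise ValueError("flanks are not reverse complements of each other")
    return head


barcodes = {number: barcode(seq) for number, seq in ADAPTERS.items()}
for number, code in barcodes.items():
    print(f"PBBC adapter #{number}: {code}")
```

Under this assumption, the six adapters carry six distinct 7-nt flanks, each paired with its reverse complement at the opposite end.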

Exonuclease treatment

ExoIII (50U) and ExoVII (5U) enzymes were added to the carrier DNA-cDNA mixture, then they were incubated at 37 °C for 1 h, then the reactions were returned to 4 °C.

Sample purification

SMRTbell templates were purified using 0.6× AMPure PB beads, as described above. Two purification steps were performed, one after the other. The final elution volume was 10 μl. A Qubit fluorometer was used for quantitation. SMRTbell templates were bound to PacBio’s P5 DNA polymerase. These complexes were bound to MagBeads using the Pacific Biosciences MagBead Binding Kit. The concentrations of the SMRTbell libraries were measured by Qubit, and their quality was assessed with the Agilent 2100 Bioanalyzer. The PacBio RSII platform and C3 sequencing chemistry were used for sequencing. A 180 min movie was recorded for each SMRT Cell.

Annealing of the sequencing primers to the template DNA and the DNA polymerase binding

The PacBio Calculator v2.3.0.0 was used to set up the annealing and binding reactions. An insert size of 2000 bp and a concentration of 1 ng/μl were set. The Sequencing Primer (5000 nM) was diluted to 150 nM with PacBio Elution Buffer (EB). One μl of the diluted primer and 10× Primer Buffer were added to the template DNA. Annealing was carried out at 80 °C for 2 min, and then the temperature was ramped down to 25 °C at a rate of 0.1 °C/s. The total volume of annealed template was bound to the Polymerase. For this, 2 μl dNTP, 2 μl DTT, 2 μl Binding Buffer (BB) and 2 μl diluted Polymerase were added to the samples. Mixtures were incubated at 30 °C for 4 h and then heated to 37 °C for 30 min. Two μl of the complexes were used for MagBead binding. Cleaned MagBeads (74 μl) were added to the samples, which were incubated at 4 °C for 2 h on a HulaMixer rotator (Invitrogen). After the incubation, samples were washed with 19 μl BB, then 19 μl Wash Buffer (WB), and finally resuspended in 19 μl BB. The total amount of the MagBead-bound complex was loaded onto the RSII sequencer.
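Two of the numbers in this step follow from simple arithmetic: the primer dilution obeys C1·V1 = C2·V2, and the duration of the cooling ramp is the temperature difference divided by the ramp rate. The sketch below illustrates both. The 100 μl final volume is a hypothetical value chosen for the example, since the text does not state the volume of diluted primer that was prepared.

```python
def dilution_volumes(stock_nM: float, target_nM: float,
                     final_volume_ul: float) -> tuple:
    """Apply C1*V1 = C2*V2; return (stock volume, diluent volume) in µl."""
    if not 0 < target_nM <= stock_nM:
        raise ValueError("target must be positive and not exceed the stock")
    stock_ul = target_nM * final_volume_ul / stock_nM
    return stock_ul, final_volume_ul - stock_ul


def ramp_seconds(start_c: float, end_c: float, rate_c_per_s: float) -> float:
    """Time needed to move between two temperatures at a constant rate."""
    return abs(start_c - end_c) / rate_c_per_s


# Sequencing Primer: 5000 nM stock diluted to 150 nM in Elution Buffer.
# FINAL_VOLUME_UL is HYPOTHETICAL; the prepared volume is not given in the text.
FINAL_VOLUME_UL = 100.0
primer_ul, eb_ul = dilution_volumes(5000, 150, FINAL_VOLUME_UL)

# Annealing ramp: 80 °C down to 25 °C at 0.1 °C/s.
ramp_s = ramp_seconds(80, 25, 0.1)

print(f"dilution factor: {5000 / 150:.1f}x")
print(f"primer: {primer_ul:.1f} µl, EB: {eb_ul:.1f} µl")
print(f"ramp time: about {ramp_s / 60:.1f} min")
```

The dilution is roughly 33-fold, and the ramp takes about 550 s (a little over 9 min) in addition to the 2 min hold at 80 °C.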

Preparation of SMRTbell libraries-“Carrier DNA method”

The total amount of cDNA synthesized at each time point was used to prepare SMRTbell templates using the PacBio DNA Template Prep Kit 1.0, following the Pacific Biosciences template preparation and sequencing protocol for Very Low (10 ng) Input 2 kb libraries with carrier DNA (pBR322, Thermo Scientific).

Preparing the carrier DNA

The concentration of the pBR322 plasmid DNA was measured by Nanodrop. A 100 ng/μl stock solution was prepared from the plasmid using the PacBio Elution Buffer (EB). The DNA was treated with the PacBio ExoIII (200 U) and ExoVII (20 U) exonucleases in Template Prep Buffer (10×). The mixture was incubated at 37 °C for 1 h and then cooled down to 4 °C. The DNA was purified and concentrated using 0.6× AMPure® PB beads and eluted in 50 μl EB. The exonuclease-treated carrier DNA was quantified with the Qubit fluorometer.

Repairing the DNA damage

cDNA samples were mixed with DNA Damage Repair Buffer, NAD+, ATP High, dNTP and DNA Damage Repair Mix (all from the PacBio DNA Template Prep Kit) and incubated at 37 °C for 20 min. Samples were cooled to 4 °C. End Repair Mix (PacBio) was added to the samples, which were then incubated at 25 °C for 5 min. AMPure PB beads were added to the samples, and they were purified as described for the barcoded samples. Adapter ligation was carried out at 25 °C for 60 min with the addition of Blunt Adapter, Template Prep Buffer, ATP Low and Ligase (PacBio). The enzyme was inactivated at 65 °C (10 min). After this step, the ExoIII- and ExoVII-treated carrier DNA (5 μl; 100 ng/μl) was mixed with the adapter-ligated cDNA samples (40 μl). ExoIII (50 U) and ExoVII (5 U) enzymes were added to the carrier DNA-cDNA mixture, which was incubated at 37 °C for 1 h and then returned to 4 °C. Two purification steps were carried out successively, as described earlier. SMRTbell libraries were bound to DNA polymerases using the DNA polymerase binding kit P5 and v2 sequencing primers. The DNA polymerase/template complexes were bound to MagBeads using the MagBead Binding Kit. The concentrations of the SMRTbell libraries were measured by Qubit, and the libraries were further analysed with the Agilent 2100 Bioanalyzer. The cDNA sequencing reactions were carried out on the PacBio RSII platform with C3 sequencing chemistry and 180 min movies. Conditioning and annealing of the Sequencing Primer, binding of the Polymerase to the libraries, and binding of the Polymerase-template complex to the magnetic beads were performed exactly as indicated by the PacBio Very Low Input protocol. The total amounts of prepared libraries (10 μl) were used for the binding. The DNA concentrations were set to 0.1 μl in the Calculator version 2.3.0.0. The “small-scale” preparation protocol and the “non-standard” protocol were chosen. The Sequencing Primer (5000 nM) was diluted to 150 nM in EB.
One μl of the diluted primer and 10× Primer Buffer were added to the template DNA. Annealing was carried out at 80 °C for 2 min, and then the temperature was ramped down to 25 °C at a rate of 0.1 °C/s. The total volume of annealed template was bound to the Polymerase. For this, 2 μl dNTP, 2 μl DTT, 2 μl BB and 2 μl diluted Polymerase were added to the samples. Mixtures were incubated at 30 °C for 4 h and then heated to 37 °C for 30 min. The total volume from the polymerase binding step was used for MagBead binding. The salt molarity was adjusted for optimal binding by adding WB (0.3× volume) to the bound complex instead of BB. Cleaned MagBeads (26 μl) were added to the samples, which were incubated at 4 °C for 30 min on a HulaMixer rotator (Invitrogen). After the incubation, samples were washed with 26 μl BB, then 26 μl WB, and finally resuspended in 19 μl BB. The total amount of the MagBead-bound complex was loaded onto the PacBio machine.

PacBio SMRTbell library preparation-Iso-Seq method/the amplified protocol & sequencing on RSII as well as Sequel platforms

Full-length cDNAs were generated using the Clontech SMARTer PCR cDNA Synthesis Kit based on the PacBio Isoform Sequencing (Iso-Seq) protocol. No Size Selection method was carried out for the analysis of short viral transcripts, while Manual Agarose-gel Size Selection, as well as SageELF™ and BluePippin™ Size-Selection Systems (Sage Science) were used for the isolation of long RNA molecules. The first-strand cDNAs were generated by using the SMARTer PCR cDNA Synthesis Kit (Clontech), the reactions were primed with oligo(dT) (part of the Clontech Kit) or adapter-linked GC-rich random primers (ordered from IDT DNA). The single-stranded cDNAs were PCR-amplified using KAPA HiFi Enzyme (Kapa Biosystems), in accordance with recommendations provided by PacBio, as follows: initial denaturation was carried at −95 °C for 2 min, followed by 16 cycles for PA-seq, 20 or 30 cycles for random-primed samples (the optimal cycle was determined in the optimization step) at −98 °C for 20 s (denaturation), −65 °C for 15 s (annealing) −72 °C for 4 min (extension). The final extension was carried out at −72 °C for 5 min. (n: 16 cycles was ideal for the No size-selection protocol. For the agarose size-selection, 12 cycles and 1:45 min extension was set for the amplification of transcripts between 2–3 kb and 15 cycles and 3 min extension was used for the longer transcripts. Sixteen cycles were set for the SageELF and BluePippin samples. PCR products were pooled then size selected manually by using 0.8% agarose gel or with the SageELF™ System according to the PacBio's protocol. Size-selected samples were amplified with KAPA enzyme using the conditions as above. The fraction of cDNAs with a size over 5 kb was run on BluePippin™ System to eliminate the short SMRTbell libraries. Five-hundred ng of each non-size-selected cDNA sample was applied for the SMRTbell template preparation, using the PacBio DNA Template Prep Kit 1.0. 
The amount of cDNA from the size-selected samples used in the library preparation reaction was based on the following PacBio protocols: Procedure & Checklist – Isoform Sequencing (Iso-Seq™) using the Clontech SMARTer PCR cDNA Synthesis Kit and (a) Manual Agarose-gel Size Selection; (b) SageELF™ Size Selection System; and (c) BluePippin™ Size-Selection System. SMRTbell sequencing libraries were bound to polymerases by using the DNA/Polymerase Binding Kit P6 and v2 primers. The polymerase-template complexes were bound to MagBeads with the PacBio MagBead Binding Kit. The quality of the samples was checked on the Agilent 2100 Bioanalyzer. Sequencing reactions were performed by using the PacBio RS II sequencer with DNA Sequencing Reagent 4.0. Movie lengths were 240 min or 360 min (one movie was recorded for each SMRT Cell). The volumes of the sequencing primer for annealing and of the polymerase (P5 or P6) for binding were determined using the PacBio Calculator version 2.3.1.1, by entering the concentrations and the average insert sizes of the SMRTbell templates. The polymerase-template complexes were bound to MagBeads, loaded onto SMRT Cells and sequenced on the RSII instrument. The PacBio Binding Calculator was used to prepare the library for sequencing using the MagBead one-cell-per-well (OCPW) protocol, and binding kit P6v2 was used with an on-plate concentration of 0.05 nM. The insert sizes were set according to the size selections applied: 1000, 2500 and 6000 bp were chosen. In short, the sequencing primer was diluted in PacBio EB to 150 nM. The annealing step was performed with 1 μl template DNA (concentration: ~20 ng/μl), the diluted sequencing primer and primer buffer (10x). The final concentration of this mixture was 0.8333 nM.
Annealing was carried out at 20 °C for 30 min. The DNA polymerase enzyme was then diluted to a final concentration of 50 nM in PacBio BB v2 and bound to the annealed template, followed by the addition of DTT, dNTP and BB. The complex (0.5 nM final concentration) was incubated at 30 °C for 4 h. The sample complex (0.5 μl) was mixed with 18.5 μl MagBead Binding Buffer (0.0125 nM final concentration). MagBeads were prepared as follows: 73.9 μl MagBeads were washed with 73.9 μl MagBead WB, then 73.9 μl MagBead BB was added. The sample complex was bound to the washed, prepared MagBeads for loading onto the RSII machine: 19 μl sample complex was added to the beads, and the mixture was placed at 4 °C for 30 min in a HulaMixer. After incubation, the MagBead-bound complex was washed with 19 μl BB, then with 19 μl WB, and finally it was resuspended in 19 μl BB. The total amount of the MagBead-bound complex was loaded onto the instrument. The MagBead One Cell Per Well protocol was used. One SMRT Cell was also run on the Sequel instrument.
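The binding calculation above takes the template concentration and the average insert size as inputs. As a rough illustration of why both are needed, the following sketch converts a double-stranded DNA mass concentration into a molar concentration. It is not the PacBio calculator: the function name is hypothetical, the value of 660 g/mol per base pair is the common textbook approximation, and the hairpin adapters of the SMRTbell templates are ignored.

```python
# Minimal sketch (not the PacBio Binding Calculator): convert a dsDNA mass
# concentration to molarity. 660 g/mol per base pair is an approximation.
AVG_BP_MASS_G_PER_MOL = 660.0


def dsdna_ng_per_ul_to_nm(conc_ng_per_ul: float, length_bp: int) -> float:
    """Return the molar concentration (nM) of dsDNA fragments of a given length."""
    if length_bp <= 0:
        raise ValueError("length_bp must be positive")
    # 1 ng/ul = 1e-3 g/l; dividing by g/mol gives mol/l; 1e9 converts to nmol/l.
    return conc_ng_per_ul * 1e6 / (AVG_BP_MASS_G_PER_MOL * length_bp)


if __name__ == "__main__":
    # Insert sizes named in the text; ~20 ng/ul is the template concentration given there.
    for insert_bp in (1000, 2500, 6000):
        nm = dsdna_ng_per_ul_to_nm(20.0, insert_bp)
        print(f"{insert_bp} bp at 20 ng/ul is about {nm:.2f} nM")
```

The same mass concentration corresponds to a six-fold lower molarity for the 6000 bp inserts than for the 1000 bp inserts, which is why the insert size has to be set per size-selected fraction.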

Oxford Nanopore cDNA sequencing

PRV transcripts were sequenced on the MinION device using the 1D Strand switching cDNA by ligation method (Version: SSE_9011_v108_revS_18Oct2016) and the ONT Ligation Sequencing Kit 1D (SQK-LSK108). For this, PolyA(+)-selected RNAs were used. Fifty ng from each sample was subjected to reverse transcription. A poly(T)-containing anchored primer [(VN)T20; ordered from Bio Basic, Canada (Table 5)] and dNTPs (10 mM, Thermo Scientific) were added to the RNA samples, and the mixture was incubated at 65 °C for 5 min. Buffer and DTT from the SuperScript IV Reverse Transcriptase kit (Life Technologies), RNase OUT (Life Technologies) and a strand-switching oligo with three O-methyl-guanine RNA bases (PCR_Sw_mod_3G; ordered from Bio Basic, Canada) were added, and the sample was incubated at 42 °C for 2 min. Then 200 U of SuperScript IV Reverse Transcriptase enzyme was added to the mix. First-strand cDNA synthesis was carried out at 50 °C for 10 min, followed by the strand-switching step at 42 °C for 10 min. Enzymes were inactivated at 80 °C for 10 min. Five μl from the prepared double-stranded cDNA was amplified in a single PCR reaction using KAPA HiFi DNA Polymerase (Kapa Biosystems) and the Ligation Sequencing Kit Primer Mix (provided by the 1D Kit). The Veriti Thermal Cycler (Applied Biosystems) was set as recommended by the 1D Kit's protocol: initial denaturation for 30 sec at 95 °C (1 cycle); denaturation for 15 sec at 95 °C (15 cycles); annealing for 15 sec at 62 °C (15 cycles); elongation for 4 min at 65 °C (15 cycles); final extension for 10 min at 65 °C. The NEBNext End repair / dA-tailing Module (New England Biolabs) was used for end repair, while the NEB Blunt/TA Ligase Master Mix (New England Biolabs) was applied for adapter ligation. The adapter sequences were supplied by the kit. Agencourt AMPure XP magnetic beads (Beckman Coulter) were used for purification following each enzymatic step.
The Qubit Fluorometer (Life Technologies Qubit 2.0) and the Qubit (ds)DNA HS Assay Kit were used to quantify the concentration of the libraries. Samples were loaded on R9.4 SpotON Flow Cells, and base calling was performed using Albacore v1.2.6.
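The thermal cycler settings given above for the ONT cDNA amplification can be written down as a small data structure, which makes the program easy to check and to compare with other protocols. The sketch below is illustrative only: the class and function names are hypothetical, and the computed duration ignores ramp times between temperatures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """One hold step of a thermal cycler program."""
    name: str
    temp_c: float
    seconds: int


def program_seconds(initial: Step, cycle_steps: list[Step], n_cycles: int, final: Step) -> int:
    """Total hold time of the program in seconds (ramp times are not included)."""
    per_cycle = sum(step.seconds for step in cycle_steps)
    return initial.seconds + n_cycles * per_cycle + final.seconds


# Values taken from the ONT cDNA amplification described in the text.
ONT_INITIAL = Step("initial denaturation", 95, 30)
ONT_CYCLE = [
    Step("denaturation", 95, 15),
    Step("annealing", 62, 15),
    Step("elongation", 65, 4 * 60),
]
ONT_FINAL = Step("final extension", 65, 10 * 60)
ONT_TOTAL_SECONDS = program_seconds(ONT_INITIAL, ONT_CYCLE, 15, ONT_FINAL)

if __name__ == "__main__":
    print(f"Hold time: {ONT_TOTAL_SECONDS / 60:.1f} min")
```

With 15 cycles the hold time comes to 78 minutes; most of it is the 4 min elongation step, which is what allows long cDNAs to be amplified.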

Oxford Nanopore sequencing on Cap-selected samples

To obtain full-length transcripts with the exact 5′-ends, Cap selection was carried out. For this, the TeloPrime Full-Length cDNA Amplification Kit (Lexogen) was used, which has an exceptional specificity for the 5′-Cap. The starting material was 2 μg total RNA diluted in 12 μl water, from a mixed PRV sample (containing RNA from 1, 2, 3, 4, 5, 6, 7, 8, 12, 18 and 24 h post-infection). The method is based on cDNA generation. Reverse transcription (RT) was carried out according to the kit's manual. Briefly, the diluted RNA was mixed with RT buffer and primer (both supplied with the kit). The RT primer contains an oligo(dT) sequence (Table 5) to select the polyadenylated transcripts. The mixture was preheated at 70 °C for 30 sec, then cooled down to 37 °C for 1 min. RT enzyme and reagents (part of the kit) were added and the reaction was held at 37 °C for 2 min. The temperature was then increased to 46 °C for 50 min. The RNA-cDNA hybrid was purified using silica columns (a component of the kit). A specific adapter was ligated to the cDNA by base-pairing of the 5′ C to the cap structure of the RNA. This step was carried out by the double-strand-specific ligase of the kit. Ligation was performed at 25 °C overnight. The sample was purified after ligation using the silica columns. The cDNA was converted to dscDNA using the Second-Strand Mix and the Enzyme Mix from the TeloPrime kit. The reaction was carried out in a Veriti Cycler with the following protocol: 98 °C for 90 sec, 62 °C for 60 sec, 72 °C for 5 min, hold at 25 °C. Sample concentration was measured using the Qubit dsDNA HS Assay Kit (Life Technologies). Specificity of the obtained cDNA was checked by qPCR (Rotor-Gene Q) using gene-specific primers (us9, 10 μM each; Table 7), cDNA and ABsolute qPCR SYBR Green Mix (Thermo Fisher Scientific) in a 20 μl final volume. The initial denaturation was at 94 °C for 15 min, followed by 35 cycles of 94 °C for 25 sec, 60 °C for 25 sec and 72 °C for 6 sec.
Table 7

The gene-specific primers used for the amplification of the us9 gene of PRV.

| Primer | Sequence |
| --- | --- |
| Forward | CAGGACGACTCGGACTGCTA |
| Reverse | AGGAACTCGCTGGGCGT |

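Basic properties of the us9 primers in Table 7 can be checked with a few lines of code. The sketch below computes the length, the GC fraction and the reverse complement of each primer; the function names are hypothetical and no melting-temperature model is attempted.

```python
# Primer sequences as listed in Table 7.
US9_FORWARD = "CAGGACGACTCGGACTGCTA"
US9_REVERSE = "AGGAACTCGCTGGGCGT"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    for name, primer in (("forward", US9_FORWARD), ("reverse", US9_REVERSE)):
        print(name, len(primer), f"GC={gc_fraction(primer):.1%}", reverse_complement(primer))
```

The reverse complement of the reverse primer is the sequence to look for on the same strand as the forward primer when locating the amplicon in the reference genome.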
The PolyA(+)-CAP-selected samples were also sequenced on MinION using the 1D Strand switching cDNA by ligation method. These samples were subjected to the end repair and adapter ligation steps, and then they were loaded on the ONT Flow Cells.

Oxford Nanopore direct RNA sequencing

Three flow cells were used for sequencing PRV samples following the Direct RNA sequencing (DRS) protocol from ONT (Version: DRS_9026_v1_revM_15Dec2016). Total RNAs from 12 different time points were mixed together, and then polyA selection was carried out. RNA from the PolyA(+) fraction in 9 μl was used as a template for sequencing. RNA was mixed with the RT (oligodT-containing T10) adapter (supplied by the ONT Direct RNA Sequencing Kit; SQK-RNA001; Oxford Nanopore Technologies) and T4 DNA ligase (2M U/ml; New England BioLabs). The mixture was incubated at room temperature for 10 min. First-strand cDNA synthesis was carried out in a 40 μl final volume with SuperScript III Reverse Transcriptase (Life Technologies), according to the DRS protocol, at 50 °C for 50 min, then 70 °C for 10 min in a Veriti Thermal Cycler. Samples were washed with Agencourt AMPure XP Beads (Beckman Coulter). The XP Beads were treated before use with RNase OUT (40 U/μl; Life Technologies); 2 U of enzyme was added per 1 μl of beads. Purified RNA-cDNA hybrids were eluted in 20 μl Ambion Nuclease-Free Water (Thermo Fisher Scientific). The RMX sequencing adapter was ligated to the eluted samples with T4 DNA ligase and NEBNext Quick Ligation Reaction Buffer (New England BioLabs) at room temperature for 10 min. Samples were purified with RNase OUT-treated XP beads using Wash Buffer (part of the DRS Kit) and then eluted in 21 μl Elution Buffer (provided by the DRS Kit). The concentration of the reverse-transcribed and adapted RNA was measured by using the Qubit 2.0 Fluorometer and the Qubit dsDNA HS Assay Kit (Life Technologies). Samples were loaded onto the R9.4 SpotON Flow Cell. Data on the quality of PacBio RSII, Sequel, and ONT MinION reads, including insertions, deletions, and mismatches, as well as the coverages, are summarized in Table 8 (available online only).
Data show that the ONT MinION sequencing resulted in relatively high error rates for both indels and mismatches. The best read quality (i.e. the fewest indels) was obtained from the Illumina runs, as expected from the literature data; however, we obtained no significant differences between the Illumina and PacBio mismatch data. Moreover, several PacBio data sets yielded better results than those of the Illumina assemblies. The composition of the errors differs between the three platforms (Illumina, PacBio, and ONT) and the four techniques (HiScan SQ, RSII, Sequel, and MinION). Mismatches are the most common errors in both ONT (according to previously published data[31]) and Illumina data[32]. In agreement with others' data[31], insertions are the least frequent errors in ONT in our study. However, in contrast to others' results[31], insertions are the major errors in our PacBio RSII and Sequel data. In sum, the absolute error rate of both the Illumina and PacBio (especially the Sequel) platforms is fairly low. The custom code [Github (https://doi.org/10.5281/zenodo.1034511)] was used to obtain the presented data. In the Illumina sequencing experiments, the non-matching nucleotides generated by the mapping software were removed from the ends of alignments in the SAM files in order to get more precise quality values.

Table 8

Summary table of the read qualities obtained from PacBio and ONT long-read sequencing

| Sample (a) | Del % (b) | Del SD (c) | Ins % (d) | Ins SD (e) | MM % (f) | MM SD (g) | Coverage (h) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Illumina-random | 0.01 | 0.09 | 0.01 | 0.11 | 0.45 | 0.64 | 317.86 |
| Illumina-PolyA | 0.01 | 0.14 | 0.01 | 0.19 | 0.33 | 0.94 | 3095.39 |
| RSII-PolyA-Seq-amplified-1 h | 1.24 | 1.73 | 1.24 | 3.05 | 0.82 | 1.90 | 54.82 |
| RSII-PolyA-Seq-amplified-4 h | 1.96 | 2.40 | 1.03 | 2.72 | 0.83 | 1.84 | 160.97 |
| RSII-PolyA-Seq-amplified-8 h | 1.72 | 1.95 | 1.13 | 2.88 | 0.86 | 1.89 | 378.37 |
| RSII-PolyA-Seq-amplified-1, 2, 4, 6, 8 h-BluePippin 0.8 kb-5 kb+ | 0.76 | 1.43 | 0.45 | 2.61 | 1.14 | 3.89 | 9.07 |
| RSII-PolyA-Seq-amplified-1, 2, 4, 6, 8 h-BluePippin 0.8–2 kb | 0.53 | 0.98 | 0.05 | 0.37 | 0.06 | 0.17 | 67.26 |
| RSII-PolyA-Seq-amplified-1, 2, 4, 6, 8 h-BluePippin 2–3 kb | 0.39 | 1.01 | 0.06 | 0.31 | 0.06 | 0.19 | 50.15 |
| RSII-PolyA-Seq-amplified-1, 2, 4, 6, 8 h-BluePippin 3–5 kb | 0.51 | 1.17 | 0.07 | 0.40 | 0.07 | 0.18 | 23.03 |
| RSII-PolyA-Seq-amplified-1, 2, 4, 6, 8 h-BluePippin 5 kb+ | 0.39 | 0.95 | 0.09 | 0.78 | 0.06 | 0.21 | 32.45 |
| RSII-PolyA-Seq-amplified-1, 4, 8 h-Manual gel size selection 1–2 kb | 0.83 | 1.46 | 0.06 | 0.74 | 0.07 | 0.24 | 29.85 |
| RSII-PolyA-Seq-amplified-1 h-Manual gel size selection 3 kb+ | 1.62 | 2.10 | 0.86 | 2.35 | 0.83 | 2.06 | 32.43 |
| RSII-PolyA-Seq-amplified-1 h-Manual gel size selection 2–3 kb | 0.96 | 1.37 | 0.69 | 2.63 | 0.49 | 1.56 | 4.75 |
| RSII-PolyA-Seq-amplified-4 h-Manual gel size selection 3 kb+ | 1.56 | 1.95 | 0.91 | 2.99 | 0.77 | 2.09 | 28.08 |
| RSII-PolyA-Seq-amplified-4 h-Manual gel size selection 2–3 kb | 1.49 | 1.91 | 0.92 | 2.61 | 0.79 | 2.02 | 134.18 |
| RSII-PolyA-Seq-amplified-8 h-Manual gel size selection 3 kb+ | 0.91 | 1.43 | 0.05 | 0.37 | 0.06 | 0.15 | 37.50 |
| RSII-PolyA-Seq-amplified-8 h-Manual gel size selection 2–3 kb | 1.56 | 1.82 | 0.76 | 2.38 | 0.63 | 1.73 | 16.89 |
| RSII-random-amplified-1, 2, 4, 6, 8 h | 0.57 | 1.32 | 0.17 | 1.02 | 0.13 | 0.63 | 8.26 |
| RSII-random-amplified-1 h | 0.78 | 1.44 | 0.10 | 0.98 | 0.09 | 0.46 | 16.11 |
| RSII-random-amplified-4 h | 0.80 | 1.46 | 0.10 | 0.75 | 0.09 | 0.37 | 46.73 |
| RSII-random-amplified-8 h | 0.68 | 1.33 | 0.11 | 0.97 | 0.09 | 0.38 | 75.37 |
| RSII-random-amplified-1, 4, 8 h-Manual gel size selection 3 kb+ | 1.05 | 1.51 | 1.02 | 2.84 | 0.73 | 1.93 | 16.73 |
| RSII-random-amplified-1, 4, 8 h-Manual gel size selection 1–2 kb | 0.72 | 1.37 | 0.15 | 1.03 | 0.10 | 0.42 | 11.61 |
| RSII-random-amplified-1, 4, 8 h-Manual gel size selection 2–3 kb | 0.90 | 1.45 | 0.64 | 2.24 | 0.45 | 1.42 | 22.93 |
| RSII-PolyA-Seq-non amplified-1 h (Barcoding method) | 0.55 | 0.97 | 0.51 | 0.92 | 0.24 | 0.51 | 0.53 |
| RSII-PolyA-Seq-non amplified-1 h (Carrier DNA method) | 0.63 | 1.05 | 0.92 | 1.56 | 0.31 | 0.70 | 70.25 |
| RSII-PolyA-Seq-non amplified-2 h (Barcoding method) | 0.49 | 0.65 | 0.50 | 0.79 | 0.17 | 0.43 | 1.02 |
| RSII-PolyA-Seq-non amplified-2 h (Carrier DNA method) | 0.47 | 0.85 | 1.07 | 1.77 | 0.30 | 0.60 | 115.89 |
| RSII-PolyA-Seq-non amplified-4 h (Barcoding method) | 0.43 | 0.75 | 0.46 | 0.96 | 0.16 | 0.47 | 10.81 |
| RSII-PolyA-Seq-non amplified-4 h (Carrier DNA method) | 0.75 | 1.17 | 0.94 | 1.56 | 0.41 | 0.72 | 8.96 |
| RSII-PolyA-Seq-non amplified-6 h (Barcoding method) | 0.50 | 0.84 | 0.47 | 1.10 | 0.17 | 0.39 | 16.76 |
| RSII-PolyA-Seq-non amplified-6 h (Carrier DNA method) | 0.74 | 1.07 | 0.98 | 1.49 | 0.41 | 0.72 | 82.06 |
| RSII-PolyA-Seq-non amplified-8 h (Barcoding method) | 0.42 | 1.02 | 0.34 | 0.70 | 0.11 | 0.28 | 1.62 |
| RSII-PolyA-Seq-non amplified-8 h (Carrier DNA method) | 0.75 | 0.98 | 1.03 | 1.65 | 0.44 | 0.70 | 74.77 |
| RSII-PolyA-Seq-non amplified-12 h (Barcoding method) | 0.44 | 0.74 | 0.35 | 0.74 | 0.16 | 0.40 | 6.67 |
| RSII-PolyA-Seq-non amplified-12 h (Carrier DNA method) | 0.65 | 0.86 | 1.58 | 2.10 | 0.52 | 0.77 | 114.10 |
| Sequel | 0.13 | 0.48 | 0.46 | 1.42 | 0.15 | 0.55 | 136.75 |
| MinION-Cap-selected | 5.77 | 1.96 | 4.11 | 4.29 | 7.12 | 2.48 | 328.53 |
| MinION-cDNA | 6.43 | 2.31 | 3.22 | 2.28 | 8.14 | 2.80 | 151.94 |
| MinION-RNA | 8.68 | 2.63 | 2.56 | 2.14 | 7.81 | 2.24 | 162.42 |

(a) Samples.

(b) Percentage of deletions.

(c) Standard deviations of deletions.

(d) Percentage of insertions.

(e) Standard deviations of insertions.

(f) Percentage of mismatches.

(g) Standard deviations of mismatches.

(h) Average coverages across the PRV genome.
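To make the comparison of error composition concrete, the sketch below takes the deletion, insertion and mismatch percentages of a few rows of Table 8 and reports their sum and the most frequent error type. The function name is hypothetical, and summing the three mean percentages is only a rough indicator of the overall error rate, not a statistic reported by the authors.

```python
# (deletion %, insertion %, mismatch %) for selected rows of Table 8.
TABLE8_ROWS = {
    "Illumina-random": (0.01, 0.01, 0.45),
    "Sequel": (0.13, 0.46, 0.15),
    "MinION-Cap-selected": (5.77, 4.11, 7.12),
    "MinION-cDNA": (6.43, 3.22, 8.14),
}


def summarize_errors(del_pct: float, ins_pct: float, mm_pct: float) -> tuple[float, str]:
    """Return (sum of the three error percentages, name of the largest one)."""
    rates = {"deletion": del_pct, "insertion": ins_pct, "mismatch": mm_pct}
    dominant = max(rates, key=rates.get)
    return round(sum(rates.values()), 2), dominant


if __name__ == "__main__":
    for sample, row in TABLE8_ROWS.items():
        total, dominant = summarize_errors(*row)
        print(f"{sample}: about {total}% combined, mostly {dominant} errors")
```

For these rows the output agrees with the text: mismatches dominate in the Illumina and MinION cDNA data, while insertions dominate in the Sequel data.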

Read processing

Raw reads from the random-primed Illumina sequencing were aligned to the PRV genome (KJ717942.1) using Tophat v2.09 (ref. 23); ambiguous reads were discarded. For PA-Seq, mapping was carried out with Bowtie v2 (ref. 24). The PacBio RSII and Sequel consensus reads were generated following the RS_ReadsOfInsert protocol of SMRT Analysis (v2.3.0 and v5.0.0) (Fig. 2), with the following settings: Minimum Full Passes=1, Minimum Predicted Accuracy=90, Minimum Length of Reads of Insert=1, Maximum Length of Reads of Insert=No Limit. These consensus reads were mapped using GMAP[25] with the following settings: gmap -d Genome.fa --nofails -f samse File.fastq > Mapped_file.sam. ONT's Albacore software (v2.0.1) was used for base calling; this basecaller identifies the nucleotide sequences directly from the raw data. The sequencing reads were mapped with GMAP using the same settings as described above. Custom routines were used to acquire the quality information presented in this data descriptor. The code has been archived on Github (https://doi.org/10.5281/zenodo.1034511).
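As an illustration of how per-read insertion, deletion and mismatch counts can be derived from a SAM alignment, the sketch below parses a CIGAR string and combines it with the NM (edit distance) tag. This is not the authors' archived code. It assumes that each record carries an NM tag, it treats soft-clipped ends as outside the alignment, and it uses the number of alignment columns (M + I + D) as the denominator, which is one of several reasonable conventions.

```python
import re

# One CIGAR operation: a length followed by an operation code from the SAM specification.
_CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def count_errors(cigar: str, nm: int) -> dict[str, int]:
    """Count insertions, deletions and mismatches (in bases) for one alignment.

    NM is the edit distance, i.e. mismatches plus inserted plus deleted bases,
    so mismatches are recovered as NM - insertions - deletions.
    """
    totals: dict[str, int] = {}
    for length, op in _CIGAR_OP.findall(cigar):
        totals[op] = totals.get(op, 0) + int(length)
    insertions = totals.get("I", 0)
    deletions = totals.get("D", 0)
    aligned = totals.get("M", 0) + totals.get("=", 0) + totals.get("X", 0)
    mismatches = nm - insertions - deletions
    if mismatches < 0:
        raise ValueError("NM is smaller than the number of indel bases in the CIGAR")
    return {
        "insertions": insertions,
        "deletions": deletions,
        "mismatches": mismatches,
        # Introns (N) and clipped bases (S, H) are not counted as alignment columns.
        "columns": aligned + insertions + deletions,
    }


def error_percentages(cigar: str, nm: int) -> dict[str, float]:
    """Express each error type as a percentage of the alignment columns."""
    counts = count_errors(cigar, nm)
    columns = counts["columns"]
    return {k: 100.0 * counts[k] / columns for k in ("insertions", "deletions", "mismatches")}


if __name__ == "__main__":
    print(error_percentages("5S50M2I30M1D20M", nm=5))
```

Skipped regions (the N operation) are excluded on purpose, so that spliced transcripts are not penalized for their introns.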

Data Records

All sequencing data have been uploaded to the European Nucleotide Archive under the project accessions PRJEB24593 (Data Citation 1), which contains the BAM files, and PRJEB9526 (Data Citation 2), which contains the FASTQ files. All sequencing reads were mapped to the KJ717942.1 genome build. All data can be used without restrictions.
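File locations for the two project accessions can be listed programmatically. The sketch below only builds the query URLs and does not contact the archive. It assumes the ENA Portal API `filereport` endpoint and the field names `run_accession`, `fastq_ftp` and `submitted_ftp`; these should be checked against the current ENA documentation before use.

```python
from urllib.parse import urlencode

# Assumed endpoint of the ENA Portal API; verify against the ENA documentation.
ENA_FILEREPORT = "https://www.ebi.ac.uk/ena/portal/api/filereport"


def filereport_url(
    accession: str,
    fields: tuple[str, ...] = ("run_accession", "fastq_ftp", "submitted_ftp"),
) -> str:
    """Build a URL that requests a tab-separated file report for a project accession."""
    query = urlencode(
        {
            "accession": accession,
            "result": "read_run",
            "fields": ",".join(fields),
            "format": "tsv",
        }
    )
    return f"{ENA_FILEREPORT}?{query}"


if __name__ == "__main__":
    # Project accessions given in the Data Records section.
    for project in ("PRJEB24593", "PRJEB9526"):
        print(filereport_url(project))
```

The resulting report, once downloaded, has one row per run, with the file locations in the requested columns.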

Technical Validation

The quantities of the isolated total RNAs, the polyA-selected RNAs, the rRNA-depleted samples, as well as the synthesized cDNA fractions and sequencing-ready libraries, were measured with a Qubit 2.0 fluorometer (Life Technologies) using the Qubit RNA, HS RNA and HS dsDNA Assay Kits. The conditions for primer annealing and binding of the polymerase to the templates were determined by PacBio's Binding Calculator in RS Remote. The libraries were measured with the Agilent 2100 Bioanalyzer using the Agilent High Sensitivity DNA Kit.

Usage Notes

The provided dataset was primarily produced to discover and determine the complexity and expression dynamics of the PRV transcriptome. The uploaded binary alignment (BAM) files contain reads already mapped to the KJ717942.1 reference. These aligned files can be further analysed using various bioinformatics program packages, such as bedtools[26] or samtools[27], or visualized using e.g. IGV[28], Geneious[29] or Artemis[30]. The uploaded Illumina, PacBio and ONT files have not been trimmed; they contain terminal poly(A) sequences as well as the 5′ and 3′ adapter sequences, which can be used to determine the orientations of the reads.
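One simple way to use the untrimmed poly(A) sequences for orientation is to look for a poly(A) run near the 3′ end of a read or a poly(T) run near its 5′ end. The sketch below illustrates this idea; the function name, the minimum run length of 15 and the 50-nt search window (which leaves room for a terminal adapter) are assumptions chosen for illustration, not parameters used by the authors.

```python
def read_orientation(seq: str, min_run: int = 15, window: int = 50) -> str:
    """Guess the orientation of a cDNA read from its terminal homopolymer.

    Returns "+" if a poly(A) run lies near the 3' end (read is in transcript sense),
    "-" if a poly(T) run lies near the 5' end (read is the reverse complement),
    and "?" if neither or both are found.
    """
    seq = seq.upper()
    has_poly_a = "A" * min_run in seq[-window:]
    has_poly_t = "T" * min_run in seq[:window]
    if has_poly_a and not has_poly_t:
        return "+"
    if has_poly_t and not has_poly_a:
        return "-"
    return "?"


if __name__ == "__main__":
    # Toy reads: body + homopolymer + a short stand-in for an adapter.
    sense = "GATTACA" * 5 + "A" * 20 + "CTGTAGGC"
    antisense = "GCCTACAG" + "T" * 20 + "TGTAATC" * 5
    print(read_orientation(sense), read_orientation(antisense))
```

Reads flagged "?" would need another signal, such as the 5′ adapter sequence, to be oriented.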

Additional information

How to cite this article: Tombácz, D. et al. Transcriptome-wide survey of pseudorabies virus using next- and third-generation sequencing platforms. Sci. Data 5:180119 doi: 10.1038/sdata.2018.119 (2018). Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
  31 in total

1.  Growth, physicochemical properties, and morphogenesis of Chinese wild-type PRV Fa and its gene-deleted mutant strain PRV SA215.

Authors:  Ling Zhu; Yue Yi; Zhiwen Xu; Lu Cheng; Shanhu Tang; Wanzhu Guo
Journal:  Virol J       Date:  2011-06-04       Impact factor: 4.099

2.  Calcium imaging of neuronal circuits in vivo using a circuit-tracing pseudorabies virus.

Authors:  Andrea E Granstedt; Bernd Kuhn; Samuel S-H Wang; Lynn W Enquist
Journal:  Cold Spring Harb Protoc       Date:  2010-04

3.  Genetic basis of the neurovirulence of pseudorabies virus.

Authors:  B Lomniczi; S Watanabe; T Ben-Porat; A S Kaplan
Journal:  J Virol       Date:  1984-10       Impact factor: 5.103

4.  Retrograde, transneuronal spread of pseudorabies virus in defined neuronal circuitry of the rat brain is facilitated by gE mutations that reduce virulence.

Authors:  M Yang; J P Card; R S Tirabassi; R R Miselis; L W Enquist
Journal:  J Virol       Date:  1999-05       Impact factor: 5.103

Review 5.  Gene and cancer therapy--pseudorabies virus: a novel research and therapeutic tool?

Authors:  Zsolt Boldogköi; Antal Nógrádi
Journal:  Curr Gene Ther       Date:  2003-04       Impact factor: 4.391

6.  Integrative genomics viewer.

Authors:  James T Robinson; Helga Thorvaldsdóttir; Wendy Winckler; Mitchell Guttman; Eric S Lander; Gad Getz; Jill P Mesirov
Journal:  Nat Biotechnol       Date:  2011-01       Impact factor: 54.908

7.  Geneious Basic: an integrated and extendable desktop software platform for the organization and analysis of sequence data.

Authors:  Matthew Kearse; Richard Moir; Amy Wilson; Steven Stones-Havas; Matthew Cheung; Shane Sturrock; Simon Buxton; Alex Cooper; Sidney Markowitz; Chris Duran; Tobias Thierer; Bruce Ashton; Peter Meintjes; Alexei Drummond
Journal:  Bioinformatics       Date:  2012-04-27       Impact factor: 6.937

8.  Characterization of pseudorabies virus transcriptome by Illumina sequencing.

Authors:  Péter Oláh; Dóra Tombácz; Nándor Póka; Zsolt Csabai; István Prazsák; Zsolt Boldogkői
Journal:  BMC Microbiol       Date:  2015-07-01       Impact factor: 3.605

9.  A transcriptome atlas of rabbit revealed by PacBio single-molecule long-read sequencing.

Authors:  Shi-Yi Chen; Feilong Deng; Xianbo Jia; Cao Li; Song-Jia Lai
Journal:  Sci Rep       Date:  2017-08-09       Impact factor: 4.379

10.  TopHat: discovering splice junctions with RNA-Seq.

Authors:  Cole Trapnell; Lior Pachter; Steven L Salzberg
Journal:  Bioinformatics       Date:  2009-03-16       Impact factor: 6.937
