Simplified large-scale Sanger genome sequencing for influenza A/H3N2 virus.

Hong Kai Lee, Julian Wei-Tze Tang, Debra Han-Lin Kong, Evelyn Siew-Chuan Koay.

Abstract

BACKGROUND: The advent of next-generation sequencing technologies and the resultant lower costs of sequencing have enabled production of massive amounts of data, including the generation of full genome sequences of pathogens. However, the small genome size of the influenza virus arguably justifies the use of the more conventional Sanger sequencing technology which is still currently more readily available in most diagnostic laboratories.
RESULTS: We present a simplified Sanger-based genome sequencing method for sequencing the influenza A/H3N2 virus in a large-scale format. The entire genome sequencing was completed with 19 reverse transcription-polymerase chain reactions (RT-PCRs) and 39 sequencing reactions. This method was tested on 15 native clinical samples and 15 culture isolates collected between 2009 and 2011. The 15 native clinical samples registered quantification cycle values ranging from 21.0 to 30.56, which were equivalent to 2.4×10^3 to 1.4×10^6 viral copies/µL of RNA extract. All the PCR-amplified products were sequenced directly without PCR product purification. Notably, high quality sequencing data up to 700 bp were generated for all the samples tested. The completed sequence covered 408,810 nucleotides in total, with 13,627 nucleotides per genome, attaining 100% coding completeness. Of all the bases produced, an average of 89.49% were Phred quality value 40 (QV40) bases (representing an accuracy of circa one miscall for every 10,000 bases) or higher, and an average of 93.46% were QV30 bases (one miscall every 1000 bases) or higher.
CONCLUSIONS: This sequencing protocol has been shown to be cost-effective and less labor-intensive in obtaining full influenza genomes. The constant high quality of sequences generated imparts confidence in extending the application of this non-purified amplicon sequencing approach to other gene sequencing assays, with appropriate use of suitably designed primers.

Year:  2013        PMID: 23741393      PMCID: PMC3669369          DOI: 10.1371/journal.pone.0064785

Source DB:  PubMed          Journal:  PLoS One        ISSN: 1932-6203            Impact factor:   3.240


Introduction

In recent years, advances in sequencing techniques have enabled an increasing number of research studies based on the genome-wide sequences of the influenza viruses [1]–[6], rather than relying solely on an individual gene that may preclude more comprehensive gene signatures [7], [8]. Since the deposition of a large number of influenza genome sequences by Ghedin et al. [4] and the initiation of the Influenza Genome Sequencing Project in 2005 [9], the deposition of complete human influenza A virus genomes by other groups has increased exponentially. The genome of the influenza A virus (family Orthomyxoviridae) consists of eight segmented, negative-stranded RNAs, ranging from 890 to 2,341 nucleotides (nt), constituting 13,627 nt per genome. The eight RNA segments encode (in the order of the segment numbers one to eight): viral RNA polymerase basic 2 (PB2, 2341 nt), polymerase basic 1 (PB1, 2341 nt), polymerase acidic (PA, 2233 nt), hemagglutinin (HA, 1762 nt), nucleoprotein (NP, 1567 nt), neuraminidase (NA, 1466 nt), matrix (M1, 1027 nt), and nonstructural (NS1, 890 nt) protein. Apart from these proteins, alternatively spliced mRNAs of the seventh segment (M1) and the eighth segment (NS1) allow translation of two additional proteins, namely, the ion channel matrix protein (M2) and nuclear export/nonstructural protein (NEP/NS2), respectively. Also, PB1-F2 proteins are alternatively translated from PB1 gene segments of some influenza A viruses [10]. The introduction of next-generation sequencing (NGS), which delivers high throughput readings [11] compared to the traditional Sanger dideoxy chain-termination method [12], has provided a remarkable cost reduction for microbial genome sequencing. However, a higher error rate due to homopolymeric miscalling and other systematic base-calling biases have been observed in NGS techniques, compared with the Sanger methods [13]–[16].
The average error rate of the former is considerably higher, with a value of 10^−2 to 10^−4 versus that of the latter at 10^−4 to 10^−5 [13], [14]. A recent report on 12 influenza genomes comparing 2 NGS platforms from 454 Life Sciences and Illumina revealed error rates up to 10^−3 and 10^−5 at the homopolymeric region, respectively [17]. Besides, the initial NGS capital equipment outlay, together with the additional bioinformatics manpower support for the storage and analysis of the huge amount of data generated through the NGS system [18], may not be cost-effective for many smaller research laboratories sequencing influenza viruses, which have a relatively small genome size (∼14 kb). The Sanger technique is regarded as low throughput and more tedious, due to the requirement of multiple purification or plasmid cloning steps [4], [8], [19]–[23]. Here, we describe a whole genome sequencing method for seasonal influenza A/H3N2, with modifications of the normal sequencing protocol that reduce the number of processing steps but still consistently produce high quality sequence reads of up to 700 bp. This protocol, when applied systematically, should hasten the routine genome sequencing work for local influenza surveillance studies. It was also demonstrated that this protocol is highly applicable for both clinical samples and Madin-Darby canine kidney- (MDCK-) cultured samples.
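
The genome size quoted above follows directly from the eight segment lengths, and the total of 408,810 nucleotides reported in the abstract follows from the 30 genomes sequenced. A minimal Python sketch of that arithmetic is given below; the variable and function names are illustrative only and do not come from the study.

```python
# Segment lengths in nucleotides, as listed in the Introduction for influenza A/H3N2.
SEGMENT_LENGTHS = {
    "PB2": 2341,
    "PB1": 2341,
    "PA": 2233,
    "HA": 1762,
    "NP": 1567,
    "NA": 1466,
    "MP": 1027,
    "NS": 890,
}

N_GENOMES = 30  # 15 clinical samples plus 15 culture isolates


def genome_length(segments):
    """Return the total genome length as the sum of all segment lengths."""
    return sum(segments.values())


genome_nt = genome_length(SEGMENT_LENGTHS)
total_nt = genome_nt * N_GENOMES
print(f"{genome_nt} nt per genome, {total_nt} nt across {N_GENOMES} genomes")
```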

Results

Clinical Specimens and Culture Isolates

A total of 30 archived influenza A/H3N2 clinical samples collected from different patients between 2 May 2009 and 1 Aug 2011 were selected randomly for this study. All samples were received for diagnostic testing at the National University Hospital (NUH) in Singapore and were confirmed positive using two clinically validated, in-house, real-time influenza A/B screening [24] and subtyping assays [25], [26]. The samples included nasal/nasopharyngeal or throat swabs collected in universal transport medium, endotracheal tube aspirates, or sputum samples. Fifteen of the 30 were sequenced from cultured isolates of the original clinical sample using an MDCK.2 (ATCC; CRL-2936) cell line; the other 15 sequences were obtained directly from the clinical samples with no preliminary culture step.

Primer Design

To ensure the utility of the assay for the sequencing of older as well as future circulating strains, two reference gene sequences were randomly chosen per month from depositions from different countries and dates of collection (2007 to 2011) available at the NCBI Influenza Virus Resource. Primer target regions for RT-PCRs for the different gene segments were selected from the conserved regions of the respective aligned gene sequences. Large gene segments (1 to 3) were amplified as three fragments. Small segments (4 to 8) were amplified as two fragments. To allow accurate sequence assembly, the PCR products for each segment overlapped with the preceding and following fragments by at least 39 bp. The 5′ and 3′ ends of each segment were amplified using modified published forward (MBTuni-12) and reverse (MBTuni-13) primers [21], [27]. Sequencing primers were designed within the internal regions of the PCR products. All the sequencing and RT-PCR primers are listed in Tables 1 and 2, respectively.
Table 1

Summary of sequencing primers employed in this study and their respective performance.

| Segment/fragment | Primer | Primer sequence (5′-3′) | Nucleotide position (5′-3′) | Reference | Average percentage of bases ≥QV40 (S.D.) | Average percentage of bases ≥QV30 (S.D.) | Mean LOR in bases (S.D.) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 1(PB2)/A | PB2_230F25 | CGGAGAGAAATGAACAAGGACAAAC | 230–254 | GU907121 | 91.62 (5.62) | 94.46 (4.80) | 556 (23) |
| | PB2_629R26 | TCTCTCTAACATGTATGCAACCATCA | 654–629 | | 89.87 (7.45) | 94.45 (5.05) | 593 (24) |
| 1(PB2)/B | PB2_960F21 | CAARGCTGCAATGGGATTGAG | 960–980 | | 89.93 (5.82) | 94.33 (4.69) | 618 (23) |
| | PB2_1432R24 | TCTCATTGACATCTCTGTGCTTGG | 1455–1432 | | 90.00 (6.36) | 94.31 (4.83) | 597 (24) |
| 1(PB2)/C | PB2_1796F25 | GCCAATACAGTGGGTTTGTCAGAAC | 1796–1820 | | 92.69 (3.74) | 94.58 (3.37) | 498 (17) |
| | PB2_2118R25 | TCCRTAYCTTCTGTCTTCCTTACCT | 2142–2118 | | 89.27 (4.83) | 93.79 (4.11) | 580 (21) |
| 2(PB1)/A | PB1_232F25 | GATGGACCACTACCTGAGGATAATG | 232–256 | AB441948 | 91.70 (3.96) | 94.56 (3.93) | 540 (21) |
| | PB1_590R23 | GGTCATGTTGTCYCTTACTCTCC | 612–590 | | 89.39 (5.70) | 93.43 (4.77) | 552 (24) |
| 2(PB1)/B | PB1_1007F26 | ATCAACCTGAGTGGTTCAGAAACATC | 1007–1032 | | 86.18 (5.45) | 92.83 (4.38) | 681 (23) |
| | PB1_1369R26 | TCATGATYTGGTGCATTCACTATGAG | 1394–1369 | | 90.25 (5.11) | 93.72 (4.52) | 582 (26) |
| 2(PB1)/C | PB1_1700F25 | ATAGRTGCCATAGAGGAGACACACA | 1700–1724 | | 91.20 (3.87) | 94.93 (3.49) | 579 (22) |
| | PB1_2126R25 | ATCGGTCTCCTATATGAACTACTAG | 2150–2126 | | 89.21 (6.23) | 94.24 (5.56) | 627 (31) |
| 3(PA)/A | PA_210F24 | GGTAGAACTTGACRATCCAAATGC | 210–233 | GU907117 | 90.40 (7.02) | 93.96 (4.66) | 520 (21) |
| | PA_601R23 | GTTTCTTCGCCTCTTTCGGACTG | 623–601 | | 89.82 (5.04) | 92.96 (4.66) | 559 (25) |
| 3(PA)/B | PA_862F23 | TCCAARTTCCTCCTGATGGATGC | 862–884 | | 90.78 (3.85) | 94.85 (2.96) | 641 (18) |
| | PA_1225R24 | CTGTAYCCAGCTTGAAAGTGACCT | 1248–1225 | | 91.55 (8.56) | 94.21 (7.83) | 493 (38) |
| 3(PA)/C | PA_1608F20 | TGACCCGAGAATTGAGCCAC | 1608–1627 | | 92.89 (3.16) | 95.71 (2.73) | 572 (13) |
| | PA_1975R24 | AAATCCTTCCAATTGTGGTGATGC | 1998–1975 | | 90.37 (6.22) | 93.16 (8.56) | 544 (68) |
| 4(HA)/A | HA_286F24 | TATTGGGAGACCCTCADTGTGATG | 286–309 | GU907114 | 88.85 (5.64) | 94.39 (3.70) | 689 (16) |
| | HA_517R27 | GGGTCAACCAATTCAATCTACTAAAGA | 543–517 | | 89.77 (6.92) | 93.20 (6.23) | 491 (22) |
| 4(HA)/B | HA_1387F26 | TTGATCTAACTGACTCAGAAATGAAC | 1387–1412 | | 88.61 (5.25) | 91.77 (4.79) | 324 (17) |
| | HA_1393R27 | ACAGTTTGTTCATTTCTGARTCAGTTA | 1419–1393 | | 85.39 (14.55) | 91.35 (10.33) | 474 (18) |
| | | ACAGTTTGTTCATTTCTGARTCAATTA | 1419–1393 | | | | |
| | HA_1632R25 | GCAAAAAACATGATATGGCAAAGGA | 1656–1632 | | 75.69 (11.79) | 86.43 (8.36) | 707 (26) |
| 5(NP)/A | NP_166F25 | ATCCAAATGTGCACTGAACTTAAAC | 166–190 | GU907120 | 88.28 (8.19) | 94.00 (5.24) | 653 (25) |
| | NP_664R20 | CGYCCATTYTCACCTCTCCA | 683–664 | | 91.71 (5.63) | 95.21 (4.06) | 624 (23) |
| 5(NP)/B | NP_998F25 | CTAACGAGAATCCAGCACACAAGAG | 998–1022 | | 90.46 (4.60) | 93.66 (4.17) | 507 (20) |
| | NP_1322R23 | CGTATTTCCAGTGAATGCTGCCA | 1344–1322 | | 88.65 (7.16) | 93.33 (5.47) | 520 (27) |
| 6(NA)/A | NA_350F20 | GGYGGRGACATCTGGGTGAC | 350–369 | GU907119 | 90.32 (4.46) | 93.53 (3.39) | 480 (17) |
| | NA_529R23 | ATGCTATGCACACTTGCTTGGTC | 551–529 | | 88.24 (10.79) | 92.38 (8.81) | 494 (24) |
| | NA_699R25 | CCATTGATACAAACGCATTCTGACT | 723–699 | | 87.70 (6.05) | 93.66 (3.37) | 667 (19) |
| 6(NA)/B | NA_1090F24 | AAATGACGTGTGGATGGGRAGAAC | 1090–1113 | | 88.12 (7.04) | 91.74 (6.11) | 322 (17) |
| | NA_1331R24 | CACAACAATACTGTTYGAGGTCCA | 1354–1331 | | 90.43 (5.87) | 94.36 (3.94) | 584 (20) |
| 7(MP)/A | MP_78F18 | GCCCCCTCAAAGCCGAGA | 78–95 | GU907115 | 89.32 (5.56) | 92.81 (1.93) | 457 (17) |
| | MP_551R23 | CTGGCCAARACCATTCTGTTCTC | 573–551 | | 90.42 (4.21) | 94.10 (1.30) | 520 (17) |
| 7(MP)/B | MP_459F22 | GYCTRGTATGTGCAACATGTGA | 459–480 | | 91.65 (2.20) | 94.16 (1.75) | 524 (11) |
| 8(NS)/A | NS_38F23 | CACTGTGTYARGTTTCCAGGTAG | 38–60 | GU907116 | 89.32 (6.65) | 92.78 (4.77) | 388 (16) |
| | NS_373R23 | GATTGCCTGGTCCATTCTGATGC | 395–373 | | 89.32 (9.21) | 92.40 (9.13) | 340 (30) |
| 8(NS)/B | NS462F24 | TTACTAAGGGCTTTCACCGAAGAG | 462–485 | | 90.57 (5.73) | 92.79 (5.48) | 383 (21) |
| | NS795R25 | AAACAGCAGTTGYAATGCTTGCATG | 819–795 | | 90.18 (2.32) | 92.50 (2.31) | 396 (9) |

The performance of each sequencing primer is described in Table 1, as seen by the average percentage of bases generated from the 30 complete genomes with QV more than 30 and 40, respectively. The QV values were generated using the proprietary sequencing analysis software (version 5.2) of the ABI 3130×l genetic analyzer (Applied Biosystems). Length of Read (LOR) is defined as the length of sequence with QV20 and above for at least 20 continuous bases.
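
The LOR definition in the Table 1 footnote can be made concrete with a short Python sketch. It takes one possible reading of that definition: the longest contiguous run of bases with QV20 or above, counted only when the run spans at least 20 bases. The analyzer software's actual algorithm is not described here, so the function name, its parameters, and this interpretation are assumptions for illustration.

```python
def length_of_read(qvs, min_qv=20, min_run=20):
    """Return the longest contiguous run of bases with QV >= min_qv.

    Runs shorter than min_run bases do not count, in which case 0 is returned.
    This is an illustrative reading of the LOR definition, not the vendor algorithm.
    """
    best = 0
    current = 0
    for qv in qvs:
        # Extend the current run on a passing base, otherwise reset it.
        current = current + 1 if qv >= min_qv else 0
        best = max(best, current)
    return best if best >= min_run else 0


# Toy quality profile: 5 poor bases, 25 good bases, 1 poor base, 10 good bases.
example = [10] * 5 + [40] * 25 + [12] + [30] * 10
print(length_of_read(example))
```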

Table 2

PCR primers and second annealing temperatures (TaS) used to amplify the influenza A/H3N2 genome.

| Segment/fragment | Primer | Primer sequence (5′-3′) | Nucleotide position (5′-3′) | Reference gene | Second Ta (°C) |
| --- | --- | --- | --- | --- | --- |
| 1(PB2)/A | MBTuni-12 | ACGCGTGATCAGCRAAAGCAGG | 1–12 | GU907121 | 59 |
| | PB2_841R24 | AGATGCTAGTGGATCTGCTGATAC | 864–841 | | |
| 1(PB2)/B | PB2_778F24 | AGGAATGACGATGTTGACCAAAGC | 778–801 | | 60 |
| | PB2_1631R24 | CAGGACCGTTAATCTCCCACATCA | 1654–1631 | | |
| 1(PB2)/C | PB2_1501F22 | GAGAGGGTGGTGGTTAGCATTG | 1501–1522 | | 59 |
| | MBTuni-13 | ACGCGTGATCAGTAGAAACAAGG | 2341–2329 | | |
| 2(PB1)/A | MBTuni-12 | ACGCGTGATCAGCRAAAGCAGG | 1–12 | AB441948 | 60 |
| | PB1_820R21 | CGGAAGTCCAGACTGTTCAAG | 840–820 | | |
| 2(PB1)/B | PB1_733F23 | AAARGAAGGGCTATTGCAACACC | 733–755 | | 60 |
| | PB1_1765R23 | CCTGYCCTTGATTGGGTTTGATC | 1787–1765 | | |
| 2(PB1)/C | PB1_1447F25 | ATCAACATGAGCAAAAARAAGTCCT | 1447–1471 | | 58 |
| | MBTuni-13 | ACGCGTGATCAGTAGAAACAAGG | 2341–2329 | | |
| 3(PA)/A | MBTuni-12 | ACGCGTGATCAGCRAAAGCAGG | 1–12 | GU907117 | 60 |
| | PA_778R25 | AAGGTTCAATTTGGGCATTCACTTC | 802–778 | | |
| 3(PA)/B | PA_683F21 | CACCGAACTTCTCCTGCCTTG | 683–703 | | 58 |
| | PA_1558R24 | ATTTACCACGTCTGTGTCATTCCT | 1581–1558 | | |
| 3(PA)/C | PA_1416F23 | CATTAACACTGCYCTGCTCAATG | 1416–1438 | | 59 |
| | MBTuni-13 | ACGCGTGATCAGTAGAAACAAGG | 2233–2221 | | |
| 4(HA)/A | MBTuni-12 | ACGCGTGATCAGCRAAAGCAGG | 1–12 | GU907114 | 61 |
| | HA_1013R22 | YCCTGTTGCCAATTTCAGAGTG | 1034–1013 | | |
| 4(HA)/B | HA_873F25 | TCAATAATGAGATCAGATGCACCCA | 873–897 | | 61 |
| | MBTuni-13 | ACGCGTGATCAGTAGAAACAAGG | 1762–1750 | | |
| 5(NP)/A | MBTuni-12 | ACGCGTGATCAGCRAAAGCAGG | 1–12 | GU907120 | 61 |
| | NP_868R18 | CGCACAGGCAGGTAGGCA | 885–868 | | |
| 5(NP)/B | NP_753F23 | AGCAATGGTGGATCAAGTGAGAG | 753–775 | | 60 |
| | MBTuni-13 | ACGCGTGATCAGTAGAAACAAGG | 1567–1555 | | |
| 6(NA)/A | MBTuni-12 | ACGCGTGATCAGCRAAAGCAGG | 1–12 | GU907119 | 59 |
| | NA_862R23 | ATCTGACACCAGGRTATCGAGGA | 884–862 | | |
| 6(NA)/B | NA_699F25 | AGTCRGAATGCGTYTGTATCAATGG | 699–723 | | 58 |
| | MBTuni-13 | ACGCGTGATCAGTAGAAACAAGG | 1466–1454 | | |
| 7(MP)/A | MBTuni-12 | ACGCGTGATCAGCRAAAGCAGG | 1–12 | GU907115 | 61 |
| | MP_582R23 | AGCCATTTGCTCCATAGCCTTAG | 604–582 | | |
| 7(MP)/B | MP_429F21 | TGGGGGCTGTAACCACTGAAG | 429–449 | | 59 |
| | MBTuni-13 | ACGCGTGATCAGTAGAAACAAGG | 1027–1015 | | |
| 8(NS)/A | MBTuni-12 | ACGCGTGATCAGCRAAAGCAGG | 1–12 | GU907116 | 60 |
| | NS_464R22 | CTCTTCGGTGAAAGCCCTTAGT | 485–464 | | |
| 8(NS)/B | NS382F21 | TGGACCAGGCAATCATGGAGA | 382–402 | | 60 |
| | MBTuni-13 | ACGCGTGATCAGTAGAAACAAGG | 890–878 | | |

The TaS for all the PCR primers ranged between 58 and 61°C. MBTuni-12 and MBTuni-13 primers targeting the 5′ and 3′ ends of each segment were adopted from published methods [21], [27], with nucleotides (in bold) representing the modifications made. Nucleotide R (bold) in the primer sequence indicates a degenerate nucleotide that represents A or G.
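
The primer positions in Table 2 can be used to check the stated minimum overlap of 39 bp between adjacent PCR fragments. The Python sketch below assumes that the overlap is measured on the amplicon interiors after the primer-binding sites are excluded, which is an interpretation suggested by the subtraction of primer sequences during contig assembly rather than something stated explicitly. The data layout and function name are illustrative only.

```python
# For each fragment: (label, last nt of the forward primer, first nt of the reverse primer),
# with coordinates taken from Table 2. Layout and names are illustrative.
AMPLICONS = {
    "PB2": [("A", 12, 841), ("B", 801, 1631), ("C", 1522, 2329)],
    "PB1": [("A", 12, 820), ("B", 755, 1765), ("C", 1471, 2329)],
    "PA": [("A", 12, 778), ("B", 703, 1558), ("C", 1438, 2221)],
    "HA": [("A", 12, 1013), ("B", 897, 1750)],
    "NP": [("A", 12, 868), ("B", 775, 1555)],
    "NA": [("A", 12, 862), ("B", 723, 1454)],
    "MP": [("A", 12, 582), ("B", 449, 1015)],
    "NS": [("A", 12, 464), ("B", 402, 878)],
}


def trimmed_overlaps(amplicons):
    """Return the primer-excluded overlap (bp) between each pair of adjacent fragments."""
    overlaps = {}
    for segment, fragments in amplicons.items():
        for (name_a, _, rev_start), (name_b, fwd_end, _) in zip(fragments, fragments[1:]):
            # Interior of the upstream fragment ends at rev_start - 1;
            # interior of the downstream fragment starts at fwd_end + 1.
            overlaps[f"{segment} {name_a}/{name_b}"] = rev_start - fwd_end - 1
    return overlaps


overlaps = trimmed_overlaps(AMPLICONS)
n_rt_pcrs = sum(len(fragments) for fragments in AMPLICONS.values())
print(n_rt_pcrs, min(overlaps.values()), overlaps)
```

Under this reading, the smallest overlap is the 39 bp between fragments A and B of PB2, and the fragment count matches the 19 RT-PCRs used per genome.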

PCR Sensitivity

The 15 RNA samples extracted directly from the clinical samples had quantification cycle values ranging from 21.0 to 30.56 (equivalent to 2.4×10^3–1.4×10^6 viral RNA copies/µL of RNA extract) [24]. All of the gene segments from both the clinical and MDCK-cultured samples collected from 2009–2011 were successfully amplified and appeared as specific and discernible bands on the agarose gel. Some gene amplifications additionally produced minor non-specific bands in clinical samples with low viral titers.

Sequencing

All eight segments from each of the 15 clinical and 15 MDCK-cultured samples were successfully sequenced with high Phred quality values (QV) [28] and sequencing lengths of up to 700 bp (Table 1). Length of read (LOR) for all sequence contigs had base calls of QV20 (representing an accuracy of circa one miscall for every 100 bases) and above for at least 20 continuous bases, which was in accordance with the analyzer's default setting. Sequences with a mixture of nucleotides that were covered at only a single coverage depth were confirmed with reverse sequencing using PCR primers from the purified amplicon method briefly described in Figure 1. In total, the completed sequences obtained from the 15 cultured isolates and directly from the 15 clinical samples covered 408,810 nucleotides, with 13,627 nucleotides per genome, attaining 100% coding completeness. The entire sequencing protocol produced an average of 1.57 sequencing reads covering each nucleotide. Of all the bases in the assembly, an average of 89.49% were QV40 bases (representing an accuracy of circa one miscall for every 10,000 bases) or higher, and an average of 93.46% were QV30 bases (one miscall every 1000 bases) or higher (Table 1). All the sequences were successfully assembled into their respective segments. The use of the non-purified amplicon method resulted in a very high-quality genome assembly, including samples that had Ct values up to 30. The total sequencing raw data obtained per genome was less than 5 megabytes of data storage. The sequence analyses and assembly for each genome were completed within 15–30 minutes. The sequencing chromatograms generated were uploaded into Trace Archive [trace identifier number: 2333373621–2333374798] to allow visual inspection of the traces and quality scores underlying every nucleotide in each of the thirty genomes [29], [30]. All assembled sequences obtained in this study were uploaded onto NCBI GenBank [accession number: JX437693-JX437932].
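
The miscall rates quoted for QV20, QV30, and QV40 follow from the standard Phred relationship, in which a quality value Q corresponds to an error probability of 10^(−Q/10). A minimal Python sketch of the conversion is shown below; the function names are illustrative.

```python
def phred_to_error_probability(qv):
    """Convert a Phred quality value to the probability that the base call is wrong."""
    return 10 ** (-qv / 10)


def bases_per_miscall(qv):
    """Return the expected number of bases per single miscall at a given quality value."""
    return round(10 ** (qv / 10))


for qv in (20, 30, 40):
    print(f"QV{qv}: about one miscall per {bases_per_miscall(qv)} bases")
```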
Figure 1

Processing times and steps required for plasmid cloning, purified amplicon, and non-purified amplicon methods.

Representative sequencing chromatograms generated from each method are shown. The quality of the raw data obtained from the non-purified amplicon method was comparable with that of the plasmid cloning method. In contrast, the purified amplicon method generated lower quality data in the later portions of the sequence. * Please refer to appropriate references (under the References section).

Further Testing of Assay Protocol on other Clinical Samples

The genome sequencing and assembling protocols were further tested on 125 additional H3N2 primary clinical samples with Ct values of 30.56 and below. All the 125 samples were collected at NUH as diagnostic samples from 1 May 2009–15 Dec 2012. Of the 125 additional primary clinical samples, 118 were sequenced and assembled completely. In total, 134 out of 140 (96%) primary clinical samples were sequenced successfully in this study with similar Phred quality. There were seven samples that could not be sequenced completely. More specifically, full PB2, PB1, PA, HA, NP, and NS sequences were not obtainable from 2, 3, 3, 2, 1, and 2 of these seven samples, respectively. Of these 13 failures, nine were from two samples with Ct values of 28.72 and 29.04, respectively. The PB1 and PA genes encountered the highest failure rate relative to the others.

Discussion

Traditionally, Sanger sequencing is performed on purified PCR amplicons to prevent background noise generated during sequencing analyses. Here, it was found possible to employ a non-purified amplicon approach for direct sequencing, which minimized processing time and effort for large-scale viral genome sequencing while producing consistently high quality sequencing data. Figure 1 summarizes the steps and amount of time required to perform sequencing using existing methods (plasmid cloning and purified amplicon approaches) and the non-purified amplicon method employed in this study. Direct sequencing on non-purified amplicons using target-specific sequencing primers not only significantly reduced the workload and cost for the entire genome sequencing, but also produced high quality sequencing peaks that were comparable to those generated by the plasmid cloning method (Figure 1). In addition, it provides a more economical approach to detect viral mixtures or quasispecies because, unlike the plasmid cloning method [22], it does not require a minimum critical mass of clones to be selected for sequencing to obtain representative results. In comparison with the purified amplicon method, this non-purified amplicon method produced much higher quality raw data in this study (Figure 1). One possible explanation for the success of this simplified approach is the minimal loss of PCR products resulting from the omission of the purification step, in combination with the use of target-specific sequencing primers that were designed separately from the PCR primers. Unlike the commonly used M13-flanked PCR primers that allow the use of the M13 primer to sequence the PCR product in a more effective way [4], [31], the independent sequencing primers allowed distinctive sequencing amplification of the specific region of the PCR product, without interference from non-specific products and primer-dimers generated during PCR.
To minimize the undesirable effects of residual PCR primers during the sequencing reaction, the forward and reverse primers for each PCR were prepared in equimolar amounts, and PCR conditions of up to 50 total PCR cycles were used, to avoid background noise during sequencing analysis. The 4% (v/v) dimethyl sulfoxide (DMSO) used in the sequencing reaction suppressed background noise encountered by sequencing primer NS373R23 during sequencing analysis [32]. Culturing of clinical samples prior to sequencing is a common practice to obtain sufficient viral genetic material for PCR amplification, as well as to avoid contaminants that may inhibit the PCR. However, it is well recognized that the passaging of viruses in different hosts may induce excessive host-mediated mutations [33], [34] that can inadvertently lead to biased conclusions. Use of the proposed modified protocol allowed successful complete genome sequencing of human influenza A/H3N2 from clinical and MDCK-cultured samples, including samples with viral loads as low as 2,400 viral RNA copies/µL of RNA extract. The design of assay primers from reference sequences collected from different geographical regions between 2007 and 2011, together with a 96% success rate in sequencing 140 clinical samples collected between 2009 and 2012, showed that this protocol would be applicable to a wide range of viruses. However, further testing on A/H3N2 viruses collected prior to 2009 should be performed to check the sensitivity of this full-genome sequencing assay for these earlier viruses. The failures in the two samples with the most incomplete gene segments could be due to sample degradation or gene reassortment events within these regions. The H3N2 subtyping results had been obtained earlier for the purposes of clinical diagnosis, based on specific real-time RT-PCRs targeting the HA and MP genes only.
The other five samples that had single incomplete gene sequences may possess point mutation(s) that affected the capability of the assay to amplify those respective gene targets at either the PCR amplification or sequencing stage. The entire genomic sequencing for the influenza A/H3N2 virus can be completed with a data storage size of approximately 5 megabytes per genome, permitting convenient data handling by biologists without bioinformatics expertise in large-scale sequencing for local surveillance purposes. The sequencing cost per genome of the entire protocol from RNA extraction to sequence analysis was calculated to be less than SGD 350 (∼USD 290), compared to the conventional purified-amplicon method at around SGD 410 (∼USD 340) and the plasmid cloning approach at roughly SGD 1360 (∼USD 1120). The high quality data obtained from multiple sequencing reactions targeting different genes (Table 1) suggested the applicability of this technique for other viral (i.e. small genome) gene sequencing work. Influenza surveillance will continue on a worldwide basis for the foreseeable future, and molecular surveillance for influenza using partial or full-genome sequencing is now becoming routine in many diagnostic laboratories, especially in those which are not set up to perform the traditional serological surveillance for influenza (hemagglutination inhibition and viral micro-neutralization testing). Among the different seasonal human influenza viruses, influenza A/H3N2 has circulated in the human population since its emergence during the 1968 ‘Hong Kong’ pandemic, and has persisted successfully, despite the emergence of the 2009 A/H1N1pdm virus and its subsequent near-complete replacement of the previously circulating seasonal influenza A/H1N1 [35], [36].
Ongoing antigenic changes in circulating seasonal A/H3N2 viruses continue to trigger new recommendations for seasonal influenza vaccine composition, to optimize vaccine-induced immunity in both the community and healthcare worker populations [37]–[39]. Thus, ever more efficient and economical methods are required to keep down the costs of molecular surveillance, allowing more laboratories to perform such sequencing routinely, thereby enhancing the quality, temporal and geographical resolution of the local influenza surveillance data available, to keep vaccine manufacturers and public health teams informed [40]. Towards this goal, the simplified sequencing protocol described here has been shown to be effective in obtaining full influenza A/H3N2 genomes at a reasonable price with equipment already available in many diagnostic and research laboratories, suggesting potential use of a similar strategy for studying human influenza A/H1N1pdm viruses.

Methods

Ethics Statement

All research studies involving the use of these clinical samples were reviewed and approved by the local institutional ethics review board (National Healthcare Group: B/09/360 and E/09/341).

Viral RNA Extraction

Viral RNAs were extracted from 200 µL of clinical or cultured samples with either the Qiagen EZ1 Virus mini kit v2.0 or the QIAsymphony Virus/Bacteria mini kit, using their respective proprietary Bio Robot EZ1 and QIAsymphony automated platforms (Qiagen, Valencia, CA), according to the manufacturer’s instructions. All extracted RNAs were eluted into a final volume of 60 µL of elution buffer.

Reverse Transcription Polymerase Chain Reaction

RT-PCRs were performed with a Superscript III one-step RT-PCR system with Platinum Taq high-fidelity polymerase (Invitrogen, Carlsbad, CA). Nineteen RT-PCRs were set up for whole genome amplification. All RT-PCRs were prepared manually in 10 µL of reaction volume, consisting of 5 µL of 2× Reaction Mix, equimolar amounts of forward and reverse primers (0.3 µmol/L each), 0.25 µL of enzyme mix, and 2.5 µL of extracted RNA sample. The remaining volume was topped up with RNase-free water. All RT-PCRs were performed using either the ABI 9700 thermal cycler (Applied Biosystems, CA, USA) or the Biometra T3000 thermocycler (Biometra GmbH, Goettingen, Germany). The cycling conditions were 30 min at 42°C (RT); 2.5 min at 95°C (inactivation of RT enzyme and activation of Taq enzyme); 5 cycles of 30 s at 95°C (denaturation), 30 s at 47°C (annealing), and 1.25 min at 68°C (extension); 45 cycles of 30 s at 95°C, 30 s at the respective second annealing temperature (Ta), and 1.25 min at 68°C; followed by a hold for 10 min at 68°C (final extension). The second Ta for each RT-PCR is summarized in Table 2. Sequencing reactions were performed directly on non-purified amplicons, using BigDye Terminator v3.1 chemistry (Applied Biosystems). The 10 µL sequencing reaction is composed of 1.5 µL of 5× Buffer, 0.5 µmol/L of respective sequencing primer (Table 1), 1 µL of BigDye enzyme mix, and 1.25 µL of template amplicons. One microliter of 4% DMSO was added into the sequencing reaction together with primer NS373R23 [29]. Large-scale sequencing reactions were carried out on a 96-well plate and purified directly using the BigDyeXTerminator purification kit (Applied Biosystems). Individual sequencing reactions were performed in PCR tubes and purified using the DyeEx 2.0 spin kit (Qiagen). Purified sequencing products were analyzed on the ABI 3130×l genetic analyzer (Applied Biosystems) using the BDx_stdSeq50_POP7_1 run module. 
Sequencing peak heights were adjusted with the sample injection time ranging from 3–5 seconds.
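
The RT-PCR cycling conditions described above can be written down as a small data structure, which makes it straightforward to confirm the total of 50 amplification cycles and to estimate the programmed run time. The Python sketch below is illustrative only: the stage labels, data layout, and function names are assumptions, and the time estimate ignores ramping between temperatures.

```python
# Each stage: (label, number of cycles, [(temperature in deg C, minutes), ...]).
# None marks the fragment-specific second annealing temperature (58 to 61 deg C, Table 2).
CYCLING_PROGRAM = [
    ("reverse transcription", 1, [(42, 30.0)]),
    ("RT inactivation and Taq activation", 1, [(95, 2.5)]),
    ("initial cycles", 5, [(95, 0.5), (47, 0.5), (68, 1.25)]),
    ("main cycles", 45, [(95, 0.5), (None, 0.5), (68, 1.25)]),
    ("final extension", 1, [(68, 10.0)]),
]


def total_pcr_cycles(program):
    """Count amplification cycles: stages made of denaturation, annealing, and extension."""
    return sum(cycles for _, cycles, steps in program if len(steps) == 3)


def programmed_minutes(program):
    """Sum the programmed hold times; ramp times between temperatures are not included."""
    return sum(
        cycles * sum(minutes for _, minutes in steps)
        for _, cycles, steps in program
    )


print(total_pcr_cycles(CYCLING_PROGRAM), "cycles,", programmed_minutes(CYCLING_PROGRAM), "min")
```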

Contig Assembly

All sequences were assembled and verified using the ATF software, version 1.0.2.41 (Connexio Genomics, Perth, Australia), using the reference sequence influenza A/Nanjing/1/2009(H3N2) for all segments (GenBank accession: GU907114-GU907117 and GU907119-GU907121), except for the PB1 segment which used influenza A/Sendai-H/F193/2007(H3N2) (GenBank accession: AB441948) as the reference sequence. The primer sequences were subtracted from the data during contig assembly. The multiple A’s observed at the 3′end of the NA, NP, and PA genes were checked carefully by visualization of the sequencing chromatograms.
  40 in total

1.  Effects of passage history and sampling bias on phylogenetic reconstruction of human influenza A evolution.

Authors:  R M Bush; C B Smith; N J Cox; W M Fitch
Journal:  Proc Natl Acad Sci U S A       Date:  2000-06-20       Impact factor: 11.205

2.  An improved method for post-PCR purification for mtDNA sequence analysis.

Authors:  Kerri A Dugan; Helen S Lawrence; Douglas R Hares; Constance L Fisher; Bruce Budowle
Journal:  J Forensic Sci       Date:  2002-07       Impact factor: 1.832

3.  Performance comparison of benchtop high-throughput sequencing platforms.

Authors:  Nicholas J Loman; Raju V Misra; Timothy J Dallman; Chrystala Constantinidou; Saheer E Gharbia; John Wain; Mark J Pallen
Journal:  Nat Biotechnol       Date:  2012-05       Impact factor: 54.908

4.  Recommended composition of influenza virus vaccines for use in the 2012–2013 northern hemisphere influenza season.

Authors: 
Journal:  Wkly Epidemiol Rec       Date:  2012-03-09

5.  Emergence and dissemination of a swine H3N2 reassortant influenza virus with 2009 pandemic H1N1 genes in pigs in China.

Authors:  Xiaohui Fan; Huachen Zhu; Boping Zhou; David K Smith; Xinchun Chen; Tommy T-Y Lam; Leo L M Poon; Malik Peiris; Yi Guan
Journal:  J Virol       Date:  2011-12-14       Impact factor: 5.103

6.  Hemagglutinin stalk antibodies elicited by the 2009 pandemic influenza virus as a mechanism for the extinction of seasonal H1N1 viruses.

Authors:  Natalie Pica; Rong Hai; Florian Krammer; Taia T Wang; Jad Maamary; Dirk Eggink; Gene S Tan; Jens C Krause; Thomas Moran; Cheryl R Stein; David Banach; Jens Wrammert; Robert B Belshe; Adolfo García-Sastre; Peter Palese
Journal:  Proc Natl Acad Sci U S A       Date:  2012-01-30       Impact factor: 11.205

7.  Analysis of high-depth sequence data for studying viral diversity: a comparison of next generation sequencing platforms using Segminator II.

Authors:  John Archer; Greg Baillie; Simon J Watson; Paul Kellam; Andrew Rambaut; David L Robertson
Journal:  BMC Bioinformatics       Date:  2012-03-23       Impact factor: 3.169

8.  A universal influenza A and B duplex real-time RT-PCR assay.

Authors:  Hong Kai Lee; Tze Ping Loh; Chun Kiat Lee; Julian Wei-Tze Tang; Lily Chiu; Evelyn Siew-Chuan Koay
Journal:  J Med Virol       Date:  2012-10       Impact factor: 2.327

9.  Influenza A virus PB1-F2 gene in recent Taiwanese isolates.

Authors:  Guang-Wu Chen; Ching-Chun Yang; Kuo-Chien Tsao; Chung-Guei Huang; Li-Ang Lee; Wen-Zhi Yang; Ya-Ling Huang; Tzou-Yien Lin; Shin-Ru Shih
Journal:  Emerg Infect Dis       Date:  2004-04       Impact factor: 6.883

10.  The genome Assembly Archive: a new public resource.

Authors:  Steven L Salzberg; Deanna Church; Michael DiCuccio; Eugene Yaschenko; James Ostell
Journal:  PLoS Biol       Date:  2004-09-14       Impact factor: 8.029
