Francesca Di Giallonardo, Osvaldo Zagordi, Yannick Duport, Christine Leemann, Beda Joos, Marzanna Künzli-Gontarczyk, Rémy Bruggmann, Niko Beerenwinkel, Huldrych F Günthard, Karin J Metzner.
Abstract
Next-generation sequencing (NGS) is a valuable tool for the detection and quantification of HIV-1 variants in vivo. However, these technologies require detailed characterization and control of artificially induced errors to be applicable for accurate haplotype reconstruction. To investigate the occurrence of substitutions, insertions, and deletions at the individual steps of RT-PCR and NGS, 454 pyrosequencing was performed on amplified and non-amplified HIV-1 genomes. Artificial recombination was explored by mixing five different HIV-1 clonal strains (5-virus-mix) and applying different RT-PCR conditions followed by 454 pyrosequencing. Error rates ranged from 0.04-0.66% and were similar in amplified and non-amplified samples. Discrepancies were observed between forward and reverse reads, indicating that most errors were introduced during the pyrosequencing step. Using the 5-virus-mix, non-optimized, standard RT-PCR conditions introduced artificial recombinants in a fraction of at least 30% of the reads that subsequently led to an underestimation of true haplotype frequencies. We minimized the fraction of recombinants down to 0.9-2.6% by optimized, artifact-reducing RT-PCR conditions. This approach enabled correct haplotype reconstruction and frequency estimations consistent with reference data obtained by single genome amplification. RT-PCR conditions are crucial for correct frequency estimation and analysis of haplotypes in heterogeneous virus populations. We developed an RT-PCR procedure to generate NGS data useful for reliable haplotype reconstruction and quantification.
Year: 2013 PMID: 24058534 PMCID: PMC3776835 DOI: 10.1371/journal.pone.0074249
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Figure 1. Substitution and insertion/deletion rates and their sources using 454 pyrosequencing.
A) The molecular full-length HIV-1 clone pYK-JRCSF was used to generate three different samples for the determination of error rates and error sources during the different steps of sample preparation. The blue (left) pathway indicates the procedure NGS, i.e., no amplification step was performed before emulsion PCR and pyrosequencing. The green (right) pathway shows the procedure PCR-NGS, i.e., the target was amplified once prior to 454 emulsion PCR/pyrosequencing. The orange (middle) pathway depicts the commonly used procedure RT-2PCR-NGS to reverse transcribe, amplify and sequence HIV-1 RNA genomes, reflecting the errors that will occur when patients’ plasma samples are used to analyze HIV-1 haplotypes. A detailed description of each step is given in the Materials and Methods section. B) Error rates per position are shown for forward reads (left) and reverse reads (right). For each duplicate, one example is shown (always sample a, as presented in Table 1).
Table 1. Substitution and insertion/deletion rates per base of different procedures for amplicon generation.
| sample | procedure | analysis strategy | total reads | total reads analyzed | total bases analyzed | insertion rate [%] | deletion rate [%] | substitution rate [%] |
| a | NGS | forward reads | 3,898 | 1,463 | 421,588 | 0.19 | 0.31 | 0.06 |
| b | NGS | forward reads | 1,649 | 685 | 197,238 | 0.17 | 0.19 | 0.08 |
| a | PCR-NGS | forward reads | 50,173 | 21,214 | 6,106,238 | 0.16 | 0.22 | 0.14 |
| b | PCR-NGS | forward reads | 100,263 | 39,758 | 11,358,035 | 0.21 | 0.23 | 0.16 |
| a | RT-2PCR-NGS | forward reads | 37,724 | 16,798 | 4,842,383 | 0.19 | 0.20 | 0.11 |
| b | RT-2PCR-NGS | forward reads | 55,337 | 28,511 | 8,217,013 | 0.17 | 0.19 | 0.12 |
| a | NGS | reverse reads | 3,898 | 2,434 | 705,284 | 0.63 | 0.56 | 0.14 |
| b | NGS | reverse reads | 1,649 | 963 | 279,122 | 0.66 | 0.43 | 0.15 |
| a | PCR-NGS | reverse reads | 50,173 | 28,697 | 8,276,817 | 0.28 | 0.04 | 0.11 |
| b | PCR-NGS | reverse reads | 100,263 | 53,266 | 15,260,156 | 0.29 | 0.05 | 0.14 |
| a | RT-2PCR-NGS | reverse reads | 37,724 | 20,915 | 6,034,401 | 0.25 | 0.05 | 0.11 |
| b | RT-2PCR-NGS | reverse reads | 55,337 | 26,804 | 7,744,407 | 0.40 | 0.07 | 0.08 |
Each sample was processed in duplicate; the sample names a and b distinguish the duplicates from each other.
NGS, next-generation sequencing; RT, reverse transcription.
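The per-base rates in the table above can be summarized by procedure and read direction to make the forward/reverse discrepancy mentioned in the abstract easier to see. The following is a minimal sketch, not the authors' analysis code: the values are transcribed from the table, and the names `ROWS`, `mean_rates` and `overall_range` are illustrative assumptions.

```python
# Rates (%) transcribed from the table above:
# (sample, procedure, read direction, insertion, deletion, substitution)
ROWS = [
    ("a", "NGS", "forward", 0.19, 0.31, 0.06),
    ("b", "NGS", "forward", 0.17, 0.19, 0.08),
    ("a", "PCR-NGS", "forward", 0.16, 0.22, 0.14),
    ("b", "PCR-NGS", "forward", 0.21, 0.23, 0.16),
    ("a", "RT-2PCR-NGS", "forward", 0.19, 0.20, 0.11),
    ("b", "RT-2PCR-NGS", "forward", 0.17, 0.19, 0.12),
    ("a", "NGS", "reverse", 0.63, 0.56, 0.14),
    ("b", "NGS", "reverse", 0.66, 0.43, 0.15),
    ("a", "PCR-NGS", "reverse", 0.28, 0.04, 0.11),
    ("b", "PCR-NGS", "reverse", 0.29, 0.05, 0.14),
    ("a", "RT-2PCR-NGS", "reverse", 0.25, 0.05, 0.11),
    ("b", "RT-2PCR-NGS", "reverse", 0.40, 0.07, 0.08),
]


def mean_rates(rows):
    """Average the duplicates (a, b) for each (procedure, direction) pair.

    Returns {(procedure, direction): (insertion, deletion, substitution)}.
    """
    groups = {}
    for _sample, procedure, direction, ins, dele, sub in rows:
        groups.setdefault((procedure, direction), []).append((ins, dele, sub))
    return {
        key: tuple(sum(column) / len(column) for column in zip(*values))
        for key, values in groups.items()
    }


# Smallest and largest single rate anywhere in the table.
all_rates = [rate for row in ROWS for rate in row[3:]]
overall_range = (min(all_rates), max(all_rates))

if __name__ == "__main__":
    for (procedure, direction), rates in sorted(mean_rates(ROWS).items()):
        ins, dele, sub = rates
        print(f"{procedure:12s} {direction:8s} ins={ins:.3f} del={dele:.3f} sub={sub:.3f}")
    print("range of individual rates (%):", overall_range)
```

The extremes of the tabulated values agree with the 0.04-0.66% range quoted in the abstract. Averaging the two duplicates is a simplification chosen here for illustration only.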
Figure 2. Substitution rates of each virus strain used to generate the 5-virus-mix.
Each of the HIV-1 stocks was pyrosequenced separately to control for the purity of each virus strain. The y-axis shows the substitution rate per base according to the reference within the analyzed 271 bp long fragment (amino acids 10–99 of the HIV-1 protease, nt 2279–2549 based on HIV-1HXB2). The x-axis shows the positions on the sequence. The orange bars indicate differences in the nucleotide sequences of the five virus strains.
Table 2. Detailed amplification conditions.
| | NGS | PCR-NGS | RT-2PCR-NGS | PR1 + PR2 | PR3 - PR8 |
| RT-PCR conditions | | | standard | standard | optimized |
| virus sample | HIV-1JR-CSF | HIV-1JR-CSF | HIV-1JR-CSF | 5-virus-mix | 5-virus-mix |
| reverse transcription | | | | | |
| input RNA copies | – | – | ∼40,000 | ∼30,000 | ∼35,000 |
| RT enzyme | – | – | Transcriptor RT | Transcriptor High Fidelity RT | Transcriptor High Fidelity RT (PR3+4); M-MuLV RT, RNase H− (PR5+6); SuperScript III RT (PR7+8) |
| 1st PCR | | | | | |
| input cDNA copies | – | – | n.p. | n.p. | ∼10,000 |
| dNTPs (mM) | – | – | 0.2 | 0.2 | 0.4 |
| oligonucleotides (µM each) | – | – | 0.4 | 0.4 | 1 |
| FastStart High Fidelity DNA polymerase (U) | – | – | 1.25 | 1.25 | 3 |
| denaturation 94°C (sec) | – | – | 15 | 15 | 30 |
| annealing 55°C (sec) | – | – | 30 | 30 | 60 |
| elongation 72°C (sec) | – | – | 30 | 30 | 60 |
| PCR cycles | – | – | 30 | 30 | 30 |
| final extension 72°C (min) | – | – | 8 | 8 | none |
| PCR product purification | – | – | yes | yes | yes |
| 2nd PCR | | | | | |
| input DNA copies | – | – | n.p. | n.p. | 100,000 |
| dNTPs (mM) | – | 0.2 | 0.2 | 0.2 | 0.4 |
| oligonucleotides (µM each) | – | 0.4 | 0.4 | 0.4 | 1 |
| FastStart High Fidelity DNA polymerase (U) | – | 1.25 | 1.25 | 1.25 | 3 |
| denaturation 94°C (sec) | – | 15 | 15 | 15 | 30 |
| annealing 55°C (sec) | – | 30 | 30 | 30 | 60 |
| elongation 72°C (sec) | – | 30 | 30 | 30 | 60 |
| final extension 72°C (min) | – | 8 | 8 | 8 | none |
| PCR cycles | – | 40 | 40 | 40 | 35 |
All concentrations are given per reaction.
n.p. qPCR was not performed.
5-virus-mix consists of the HIV-1 strains JR-CSF, NL4-3, HXB2, YU2 and 89.6 (see also Materials and Methods).
3/23 µl of cDNA was used for the 1st PCR reaction; this corresponds to ∼4,200 cDNA copies.
10/50 µl of purified, undiluted 1st PCR product was transferred to the 2nd PCR.
Table 3. Frequencies of true and false haplotypes.
| | | Template copies/reaction | | | | Estimated frequencies of true haplotypes (%) | | | | | | Estimated frequencies of false haplotypes (%) | | |
| | | | | | | ShoRAH | | | | | | ShoRAH | | Recco |
| Sample | RT enzyme | outer PCR | inner PCR | RT-PCR conditions | Total reads/clones analyzed | HIV-1HXB2 | HIV-1NL4-3 | HIV-1JR-CSF | HIV-1YU2 | HIV-189.6 | Sum |
| Erroneous haplotypes |
| PR1 | Transcriptor HighFidelity RT | n.p. | n.p. | standard | 84,645 | 2.1 | 9.2 | 9.3 | 1.9 | 5.0 | 27.6 | 53.6 | 18.8 | 37.1 |
| PR2 | Transcriptor HighFidelity RT | n.p. | n.p. | standard | 20,133 | 2.8 | 11.2 | 9.1 | 2.3 | 3.1 | 28.5 | 43.9 | 27.6 | 30.6 |
| PR3 | Transcriptor HighFidelity RT | ∼10,000 | 100,000 | optimized | 3,846 | 6.2 | 39.6 | 24.2 | 10.9 | 17.4 | 98.3 | 0.3 | 1.4 | 0.9 |
| PR4 | Transcriptor HighFidelity RT | ∼10,000 | 100,000 | optimized | 3,781 | 6.8 | 32.9 | 25.7 | 13.2 | 19.9 | 98.5 | 0.7 | 0.8 | 1.1 |
| PR5 | M-MuLV RT, RNase H− | ∼10,000 | 100,000 | optimized | 14,482 | 4.5 | 45.4 | 22.9 | 11.4 | 11.7 | 95.9 | 1.2 | 2.9 | 1.2 |
| PR6 | M-MuLV RT, RNase H− | ∼10,000 | 100,000 | optimized | 11,809 | 6.2 | 38.9 | 23.2 | 11.6 | 13.8 | 93.7 | 3 | 3.3 | 2.6 |
| PR7 | SuperScript III RT | ∼10,000 | 100,000 | optimized | 2,046 | 5.3 | 40.4 | 26.4 | 9.6 | 17.4 | 99.1 | 0.5 | 0.4 | 1.7 |
| PR8 | SuperScript III RT | ∼10,000 | 100,000 | optimized | 14,629 | 4.9 | 44.1 | 23.0 | 11.1 | 13.4 | 96.5 | 0.9 | 2.6 | 0.9 |
| SGA1 | Transcriptor RT | 0.2 | n.a. | n.a. | 168 | 9.5 | 36.1 | 27.8 | 11.8 | 14.8 | 100.0 | 0.0 | 0.0 | – |
| SGA2 | Transcriptor HighFidelity RT | 0.2 | n.a. | n.a. | 156 | 5.8 | 37.0 | 32.5 | 13.0 | 11.7 | 100.0 | 0.0 | 0.0 | – |
| SGA3 | M-MuLV RT, RNase H− | 0.2 | n.a. | n.a. | 148 | 12.4 | 42.1 | 23.4 | 11.7 | 10.3 | 100.0 | 0.0 | 0.0 | – |
see also table 2.
n.p. qPCR was not performed.
n.a. not applicable.
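The abstract states that standard RT-PCR conditions led to an underestimation of true haplotype frequencies, whereas the optimized conditions gave estimates consistent with single genome amplification (SGA). The sketch below illustrates one way to quantify this from the table above. It is an assumption-laden illustration, not the authors' method: the reference is taken here as the plain average of SGA1 to SGA3, and the names `FREQ`, `reference` and `mean_abs_dev` are hypothetical.

```python
# True-haplotype frequencies (%) transcribed from the table above,
# in the column order HXB2, NL4-3, JR-CSF, YU2, 89.6.
STRAINS = ("HXB2", "NL4-3", "JR-CSF", "YU2", "89.6")
FREQ = {
    "PR1": (2.1, 9.2, 9.3, 1.9, 5.0),
    "PR2": (2.8, 11.2, 9.1, 2.3, 3.1),
    "PR3": (6.2, 39.6, 24.2, 10.9, 17.4),
    "PR4": (6.8, 32.9, 25.7, 13.2, 19.9),
    "PR5": (4.5, 45.4, 22.9, 11.4, 11.7),
    "PR6": (6.2, 38.9, 23.2, 11.6, 13.8),
    "PR7": (5.3, 40.4, 26.4, 9.6, 17.4),
    "PR8": (4.9, 44.1, 23.0, 11.1, 13.4),
    "SGA1": (9.5, 36.1, 27.8, 11.8, 14.8),
    "SGA2": (5.8, 37.0, 32.5, 13.0, 11.7),
    "SGA3": (12.4, 42.1, 23.4, 11.7, 10.3),
}
SGA = ("SGA1", "SGA2", "SGA3")
STANDARD = ("PR1", "PR2")
OPTIMIZED = ("PR3", "PR4", "PR5", "PR6", "PR7", "PR8")

# Assumption: use the unweighted mean of the three SGA runs as the reference.
reference = tuple(
    sum(FREQ[s][i] for s in SGA) / len(SGA) for i in range(len(STRAINS))
)


def mean_abs_dev(sample):
    """Mean absolute deviation (percentage points) from the SGA reference."""
    return sum(abs(x - r) for x, r in zip(FREQ[sample], reference)) / len(STRAINS)


mad = {sample: mean_abs_dev(sample) for sample in STANDARD + OPTIMIZED}

if __name__ == "__main__":
    for sample, value in mad.items():
        label = "standard" if sample in STANDARD else "optimized"
        print(f"{sample} ({label}): {value:.2f} percentage points")
```

With these inputs, the samples amplified under standard conditions deviate from the SGA average far more than the optimized ones. Other distance measures or a read-weighted reference would be equally defensible choices.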
Figure 3. Major in vitro recombinant haplotypes assigned by ShoRAH.
Haplotypes were aligned to the five reference strains and characterized. The top part shows the five virus strains (true haplotypes) of the 5-virus-mix; the bars indicate the specific mutations of each strain that distinguish it from the other four virus strains. The corresponding nucleotides and positions are indicated. HIV-1HXB2 has one unique mutation at position 84 (reference numbering 2362) that is indicated in grey. The mutations of HIV-1NL4-3 are marked in blue, those of HIV-1JR-CSF in green, those of HIV-189.6 in red, and those of HIV-1YU2 in orange. Dark colours indicate unique mutations; light colours indicate differences from other strains that are not unique to the respective strain. The bottom part shows all recombinant haplotypes found at frequencies of 1% and higher. Triangles indicate positions where a specific nucleotide is expected according to the corresponding strain but is missing. The nucleotide positions in the sequences are indicated.
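The idea of characterizing a reconstructed haplotype against the reference strains can be illustrated with a toy classifier. This is a simplified sketch and not the ShoRAH or Recco algorithm: the reference sequences are invented 8-nt strings (not HIV-1 sequences), only a single breakpoint is considered, and the names `REFS` and `classify` are hypothetical.

```python
from itertools import permutations

# Toy reference sequences (hypothetical; NOT real HIV-1 protease sequences).
REFS = {
    "A": "AAAAAAAA",
    "B": "AACAAGAA",
    "C": "GAAATAAC",
}


def classify(hap, refs=REFS):
    """Classify an aligned haplotype of the same length as the references.

    Returns one of:
      ("true", name)                           exact match to a reference
      ("recombinant", (left, right, k))        hap == left[:k] + right[k:]
      ("other", None)                          neither of the above
    Only the first consistent single-breakpoint explanation is reported;
    the breakpoint k is generally not unique when the parents share bases.
    """
    for name, seq in refs.items():
        if hap == seq:
            return ("true", name)
    for left, right in permutations(refs, 2):
        for k in range(1, len(hap)):
            if hap[:k] == refs[left][:k] and hap[k:] == refs[right][k:]:
                return ("recombinant", (left, right, k))
    return ("other", None)


if __name__ == "__main__":
    chimera = REFS["B"][:4] + REFS["C"][4:]  # artificial B/C recombinant
    for hap in (REFS["A"], chimera, "TTTTTTTT"):
        print(hap, classify(hap))
```

A haplotype that matches no reference and no single-breakpoint chimera falls into the "other" class, which in real data would include sequencing errors as well as multi-breakpoint recombinants.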