Neil T Parkin, Santiago Avila-Rios, David F Bibby, Chanson J Brumme, Susan H Eshleman, P Richard Harrigan, Mark Howison, Gillian Hunt, Hezhao Ji, Rami Kantor, Johanna Ledwaba, Emma R Lee, Margarita Matías-Florentino, Jean L Mbisa, Marc Noguera-Julian, Roger Paredes, Vanessa Rivera-Amill, Ronald Swanstrom, Daniel J Zaccaro, Yinfeng Zhang, Shuntai Zhou, Cheryl Jennings.
Abstract
Next-generation sequencing (NGS) is increasingly used for HIV-1 drug resistance genotyping. NGS methods have the potential for more sensitive detection of low-abundance variants (LAV) than standard Sanger sequencing (SS) methods. A standardized threshold for reporting LAV that generates data comparable to those derived from SS is needed so that data from laboratories using NGS and SS can be compared. Ten HIV-1 specimens were tested in ten laboratories using Illumina MiSeq-based methods. The consensus sequences for each specimen using LAV thresholds of 5%, 10%, 15%, and 20% were compared to each other and to the consensus of the SS sequences (protease 4-99; reverse transcriptase 38-247). The concordance among laboratories' sequences at different thresholds was evaluated by pairwise sequence comparisons. NGS sequences generated using the 20% threshold were the most similar to the SS consensus (average 99.6% identity, range 96.1-100%), compared to 15% (99.4%, 88.5-100%), 10% (99.2%, 87.4-100%), or 5% (98.5%, 86.4-100%). The average sequence identity between laboratories using thresholds of 20%, 15%, 10%, and 5% was 99.1%, 98.7%, 98.3%, and 97.3%, respectively. Using the 20% threshold, we observed excellent agreement between NGS and SS, but significant differences at lower thresholds. Understanding how variation in NGS methods influences sequence quality is essential for NGS-based HIV-1 drug resistance genotyping.
Keywords: HIV-1; NGS; drug resistance; genotyping
Year: 2020 PMID: 32605062 PMCID: PMC7411816 DOI: 10.3390/v12070694
Source DB: PubMed Journal: Viruses ISSN: 1999-4915 Impact factor: 5.048
Characteristics of Virology Quality Assurance (VQA) specimens used in this study.
| Specimen | Viral Load a | Subtype b | PR DRMs c | RT DRMs c | % Mixed Bases in SS Consensus d | Number of Amplification Failures |
|---|---|---|---|---|---|---|
| 24.1 | 7815 | B | None | T215C | 2.3% | 0 |
| 24.2 | 18,023 | F | K20R, M36I | None | 0.0% | 0 |
| 24.3 | 26,372 | C | M36I | M41L, V75T, V90I, V106M, V179D | 0.0% | 0 |
| 24.4 | 29,139 | C | M36I | M41L, K103N, M184V, T215Y | 0.1% | 1 |
| 24.5 | 6424 | B | L10I, L33F, M46L, I54V, A71I/T, V82A, L90M | M41L, E44D, A62V, D67N, L74V, L100I, K103N, H208Y, L210W, T215Y, H221Y | 0.8% | 1 |
| 26.1 | 16,685 | C | M36I, T74S | D67N, K70R, V90I, M184V | 0.9% | 0 |
| 26.2 e | 4513 | B | L10I, L33F, M46L, I54V, A71I/T, V82A, L90M | M41L, E44D, A62V, D67N, L74V, L100I, K103N, H208Y, L210W, T215Y, H221Y | 1.1% | 1 |
| 26.3 | 18,213 | C | K20R, M36I | A62V, K65R, D67N, V75A/I/T, K101Q, K103N, V106M, E138A, M184V | 2.1% | 1 |
| 26.4 | 6506 | D | M36I | None | 1.1% | 2 |
| 26.5 | 3656 | B | None | V90I, K103N | 3.8% | 0 |
a RNA copies/mL. b determined based on protease (PR)-reverse transcriptase (RT) sequence and Stanford HIVdb. c Drug resistance-associated mutation (DRM) sites were defined as any position with a potential impact on the penalty score in the Stanford HIVdb algorithm (version 8.5). d percentage of nucleotides in the VQA Sanger consensus sequence that are mixed, including positions where consensus was not reached. e same donor virus as 24.5.
Assay details.
| Laboratory ID | RNA Extraction Method (Specimen Volume) | RT-PCR Amplification Strategy | Negative Control | % of Extracted RNA Used | Coverage | Minimum Read Depth a | Minimum Variant Count b | Analysis Pipeline |
|---|---|---|---|---|---|---|---|---|
| 1 | QIAamp Viral RNA Mini kit (0.14 mL) | RT with primerID, then nested PCR | Water | 50% | PR 1–99, RT 34–122 and 152–236 | Varies | NA c | TCS pipeline in house |
| 2 | ViroSeq RNA extraction kit (0.5 mL) | RT then nested PCR | Water | 10% | PR 1–99, RT 1–440 | 1000 | 1000 | CLC Genomics Workbench and In-house |
| 3 | QIAamp Viral RNA Mini kit (1 mL) | One-step RT-PCR then nested PCR | Water | 10% | PR 6–99, RT 1–251 | 1000 | 50 | HyDRA |
| 4 | MagnaPure LC | One-step RT-PCR, then nested PCR | Water | 30% | PR 1–99, RT 1–250 | 330 | NA c | Geneious |
| 5 | QIAamp Viral RNA Mini kit (0.14 mL) | One-step RT-PCR then nested PCR | Water | 10% | PR 1–99 | 100 | 5 | Trim Galore!, HyDRA |
| 6 | NucliSENS easyMAG | One-step RT-PCR, then nested PCR | Water | ~9% | PR 1–99, | 100 | 5 | HyDRA |
| 7 | QIAamp UltraSens Virus kit (0.5 mL) | Primary RT-PCR, then nested PCR | Fetal bovine serum | 16.7% | PR 5–99, RT 1–320 | 1000 | NA c | In-house |
| 8 | QIAamp Viral RNA Mini kit (0.14 mL) | One-step RT-PCR then nested PCR | Water | 25% | PR 1–99, RT 1–440 | 1000 | 10 | PASeq.org |
| 9 | EZ1 Advance XL (variable) d | RT then nested PCR | Water | 16.7% | PR 1–99, RT 1–240 | 1000 | 10 e | Hivmmer |
| 10 | NucliSENS easyMAG (0.5 mL) | RT then nested PCR | Water | 6.7% | PR 1–99, RT 1–400 or 1–240 | 100 | NA c | MiCall |
a minimum number of reads required for data quality assurance. b minimum number of individual variants required for reporting. c in some analysis pipelines, the minimum variant count is not specified, although in practice it is defined by the minimum coverage and variant proportion. d volume adjusted based on viral load to contain at least 5000 copies. e and ≥1% of total coverage at site.
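The interaction between minimum read depth, minimum variant count, and the LAV reporting threshold can be illustrated with a minimal sketch. It is not taken from any of the listed pipelines: the function name `call_base`, its default parameter values, and the choice to return `N` for under-covered positions are all assumptions made for illustration.

```python
from typing import Dict

# IUPAC nucleotide ambiguity codes, keyed by the sorted set of bases present.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "AG": "R", "CT": "Y", "CG": "S", "AT": "W", "GT": "K", "AC": "M",
    "CGT": "B", "AGT": "D", "ACT": "H", "ACG": "V", "ACGT": "N",
}


def call_base(counts: Dict[str, int], threshold: float,
              min_depth: int = 100, min_variant_count: int = 1) -> str:
    """Hypothetical consensus call at one position.

    counts maps each of A/C/G/T to its read count at the position.
    A base is reported only if it reaches both the minimum variant count
    and the frequency threshold; positions below min_depth become 'N'
    (an assumption, real pipelines may handle low coverage differently).
    """
    depth = sum(counts.values())
    if depth < min_depth:
        return "N"
    kept = sorted(
        base for base, n in counts.items()
        if n >= min_variant_count and n / depth >= threshold
    )
    # An empty or unrecognized set of bases falls back to 'N'.
    return IUPAC.get("".join(kept), "N")


# A position with a 12% minority G variant among 1000 reads.
position = {"A": 880, "G": 120}
for t in (0.05, 0.10, 0.15, 0.20):
    print(f"threshold {t:.0%}: {call_base(position, t)}")
```

In this illustrative case the minority variant is reported as a mixture (R) at the 5% and 10% thresholds and disappears from the consensus at 15% and 20%, which is the kind of threshold-dependent difference the study evaluates.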
Figure 1. Plots of next-generation sequencing (NGS)-derived PR-RT nucleotide sequence identity vs. VQA Sanger consensus at various thresholds. Each line represents one specimen from panel 24 (24.1 through 24.5) or 26 (26.1 through 26.5).
Nucleotide sequence identity vs. VQA Sanger consensus at different variant thresholds.
| Statistic | 5% | 10% | 15% | 20% |
|---|---|---|---|---|
| Number | 94 | 94 | 94 | 85 |
| Minimum | 95.0 | 95.7 | 98.2 | 98.3 |
| Median | 98.9 | 99.6 | 99.7 | 99.9 |
| Mean | 98.7 | 99.4 | 99.6 | 99.7 |
| Std. Deviation | 0.95 | 0.63 | 0.41 | 0.40 |
| Lower 95% CI of mean | 98.5 | 99.2 | 99.5 | 99.6 |
| Upper 95% CI of mean | 98.9 | 99.5 | 99.7 | 99.8 |
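As a rough illustration of how per-sequence identity values and summary statistics of this kind could be computed, the sketch below compares two pre-aligned sequences position by position. The strict matching rule (a mixed base such as R counts as a mismatch against an unmixed base) is an assumption, since the record does not state how mixtures were scored, and the function names and example values are hypothetical.

```python
from statistics import mean, median, stdev
from typing import Dict, Sequence


def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of aligned positions with identical characters.

    Assumes both sequences are already aligned to the same length.
    Mixed bases are compared literally, so 'R' vs 'A' is a mismatch.
    """
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(1 for a, b in zip(seq_a.upper(), seq_b.upper()) if a == b)
    return 100.0 * matches / len(seq_a)


def summarize(values: Sequence[float]) -> Dict[str, float]:
    """Descriptive statistics for a list of identity values."""
    return {
        "n": len(values),
        "min": min(values),
        "median": median(values),
        "mean": mean(values),
        "sd": stdev(values),  # sample standard deviation
    }


# Toy example: one mixed base (R) in six positions.
print(percent_identity("ATGRCC", "ATGACC"))

# Hypothetical identity values, not data from the study.
print(summarize([98.9, 99.6, 99.7, 99.9]))
```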
Random-effects model testing differences between thresholds.
| Comparison | N | Random-Effects Model Mean (SEM) | Random-Effects Model p-value (Test = 0) | Mean Diff | Median Diff | SD Diff | Min Diff | Max Diff | Paired t | Sign Test | Sign Rank |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 10% vs 5% | 94 | 0.0062 (0.0016) | 0.004 | 0.0065 | 0.0044 | 0.0071 | −0.0056 | 0.0359 | 0 | 0 | 0 |
| 15% vs 5% | 94 | 0.0087 (0.0020) | 0.002 | 0.009 | 0.0056 | 0.0094 | −0.0052 | 0.0466 | 0 | 0 | 0 |
| 15% vs 10% | 94 | 0.0025 (0.00076) | 0.01 | 0.0025 | 0.0011 | 0.0048 | −0.0028 | 0.04 | 0.0000022 | 0 | 0 |
| 20% vs 5% | 85 | 0.0094 (0.0022) | 0.003 | 0.0097 | 0.0055 | 0.01 | −0.0034 | 0.0477 | 0 | 0 | 0 |
| 20% vs 10% | 85 | 0.0033 (0.00091) | 0.007 | 0.0034 | 0.0022 | 0.0055 | −0.0063 | 0.0411 | 0.0000002 | 0 | 0 |
| 20% vs 15% | 85 | 0.00084 (0.0002) | 0.003 | 0.0008 | 0 | 0.0017 | −0.0075 | 0.0056 | 0.0000265 | 0 | 0.0000012 |
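The paired columns of this table can be related to each other with a short worked sketch. It assumes that the "Paired t" column holds p-values from a paired t-test on the per-sequence differences, which the record does not state explicitly, and it uses the rounded summary values from the table, so the result is only approximate. It does not reproduce the random-effects model.

```python
import math


def paired_t_statistic(mean_diff: float, sd_diff: float, n: int) -> float:
    """t statistic for a paired t-test computed from summary values.

    t = mean_diff / (sd_diff / sqrt(n)), with n - 1 degrees of freedom.
    """
    if n < 2 or sd_diff <= 0:
        raise ValueError("need n >= 2 and a positive standard deviation")
    return mean_diff / (sd_diff / math.sqrt(n))


# Rounded values from the "20% vs 15%" row: Mean Diff, SD Diff, N.
t_20_vs_15 = paired_t_statistic(0.0008, 0.0017, 85)
print(f"t = {t_20_vs_15:.2f} with {85 - 1} degrees of freedom")
```

A t statistic of roughly 4.3 on 84 degrees of freedom corresponds to a very small two-sided p-value, in line with the small value listed for that row.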
Figure 2. Nucleotide sequence alignment for six laboratories. The VQA Sanger sequencing (SS) consensus is shown at the top. Mixtures of A and G (R) or C and T (Y) that were reported by some but not all laboratories are highlighted in blue. The sequences from laboratories 5, 7, and 8 did not contain any mixtures in this region, and those from laboratory 4 contained the Y in codon 221 at all thresholds reported (5%, 10%, and 15%).
Figure 3. Protease/reverse transcriptase nucleotide sequence concordance between laboratories. The mean percent identity with standard deviation is shown for each specimen and threshold.
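A between-laboratory concordance value of this kind (mean and standard deviation of pairwise identity for one specimen at one threshold) could be computed along the following lines. This is a minimal sketch under stated assumptions: sequences are pre-aligned and of equal length, mixed bases are compared literally, and the laboratory labels and sequences are made up for illustration.

```python
from itertools import combinations
from statistics import mean, stdev
from typing import Dict, Tuple


def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of aligned positions with identical characters."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(1 for a, b in zip(seq_a.upper(), seq_b.upper()) if a == b)
    return 100.0 * matches / len(seq_a)


def pairwise_concordance(seqs_by_lab: Dict[str, str]) -> Tuple[float, float]:
    """Mean and sample SD of identity over all laboratory pairs.

    Returns an SD of 0.0 when only one pair exists, because a sample
    standard deviation is undefined for a single value.
    """
    labs = sorted(seqs_by_lab)
    if len(labs) < 2:
        raise ValueError("need sequences from at least two laboratories")
    identities = [
        percent_identity(seqs_by_lab[a], seqs_by_lab[b])
        for a, b in combinations(labs, 2)
    ]
    sd = stdev(identities) if len(identities) > 1 else 0.0
    return mean(identities), sd


# Hypothetical sequences for one specimen at one threshold.
example = {"lab1": "ATGACC", "lab2": "ATGRCC", "lab3": "ATGACC"}
avg, sd = pairwise_concordance(example)
print(f"mean identity {avg:.1f}%, SD {sd:.1f}")
```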
Figure 4. Sequence quality assurance anomalies (total for all laboratories) at different low-abundance variant (LAV) thresholds. Sequence quality evaluation was performed with Stanford HIVdb (https://hivdb.stanford.edu/). HIVdb sequence analysis was performed using NGS sequences generated using the 5%, 10%, 15%, or 20% threshold levels.