HIV-1 Protease, Reverse Transcriptase, and Integrase Variation.

Soo-Yon Rhee¹, Kris Sankaran², Vici Varghese³, Mark A Winters^3,4, Christopher B Hurt⁵, Joseph J Eron⁵, Neil Parkin⁶, Susan P Holmes², Mark Holodniy^3,4, Robert W Shafer³.

Abstract

HIV-1 protease (PR), reverse transcriptase (RT), and integrase (IN) variability presents a challenge to laboratories performing genotypic resistance testing. This challenge will grow with increased sequencing of samples enriched for proviral DNA such as dried blood spots and increased use of next-generation sequencing (NGS) to detect low-abundance HIV-1 variants. We analyzed PR and RT sequences from >100,000 individuals and IN sequences from >10,000 individuals to characterize variation at each amino acid position, identify mutations indicating APOBEC-mediated G-to-A editing, and identify mutations resulting from selective drug pressure. Forty-seven percent of PR, 37% of RT, and 34% of IN positions had one or more amino acid variants with a prevalence of ≥1%. Seventy percent of PR, 60% of RT, and 60% of IN positions had one or more variants with a prevalence of ≥0.1%. Overall, 201 PR, 636 RT, and 346 IN variants had a prevalence of ≥0.1%. The median intersubtype prevalence ratios were 2.9-, 2.1-, and 1.9-fold for these PR, RT, and IN variants, respectively. Only 5.0% of PR, 3.7% of RT, and 2.0% of IN variants had a median intersubtype prevalence ratio of ≥10-fold. Variants at lower prevalences were more likely to differ biochemically and to be part of an electrophoretic mixture compared to high-prevalence variants. There were 209 mutations indicative of APOBEC-mediated G-to-A editing and 326 nonpolymorphic treatment-selected mutations. Identification of viruses with a high number of APOBEC-associated mutations will facilitate the quality control of dried blood spot sequencing. Identifying sequences with a high proportion of rare mutations will facilitate the quality control of NGS.

IMPORTANCE: Most antiretroviral drugs target three HIV-1 proteins: PR, RT, and IN. These proteins are highly variable: many different amino acids can be present at the same position in viruses from different individuals. Some of the amino acid variants cause drug resistance and occur mainly in individuals receiving antiretroviral drugs. Some variants result from a human cellular defense mechanism called APOBEC-mediated hypermutation. Many variants result from naturally occurring mutation. Some variants may represent technical artifacts. We studied PR and RT sequences from >100,000 individuals and IN sequences from >10,000 individuals to quantify variation at each amino acid position in these three HIV-1 proteins. We performed analyses to determine which amino acid variants resulted from antiretroviral drug selection pressure, APOBEC-mediated editing, and naturally occurring variation. Our results provide information essential to clinical, research, and public health laboratories performing genotypic resistance testing by sequencing HIV-1 PR, RT, and IN.

Year: 2016 PMID: 27099321 PMCID: PMC4907232 DOI: 10.1128/JVI.00495-16

Source DB: PubMed Journal: J Virol ISSN: 0022-538X Impact factor: 5.103

INTRODUCTION

As HIV-1 has spread among humans, it has developed an extraordinary amount of genetic diversity (1). This diversity arises from HIV-1's high mutation rate and predilection for recombination (2, 3). Amino acid variants accumulate within an individual as a result of various selective pressures and HIV-1's genetic robustness or tolerance for a large number of different amino acid variants (4, 5). The large number of protease (PR), reverse transcriptase (RT), and integrase (IN) amino acid variants has implications for antiretroviral (ARV) therapy and presents a challenge to laboratories performing genotypic resistance testing. The challenge of HIV-1 genotypic resistance test interpretation is increasing with the adoption of dried blood spot sequencing in low- and middle-income countries and the expansion of next-generation sequencing (NGS) in upper-income countries. Dried blood spot samples contain proviral DNA, which is more likely to contain APOBEC-mediated G-to-A hypermutation, an ancient host defense mechanism responsible for lethal mutagenesis (6). NGS technologies are intrinsically more error prone than dideoxynucleotide terminator Sanger sequencing and are at risk of yielding reports of low-abundance variants that result from PCR error (7, 8). We analyzed PR and RT direct PCR Sanger sequences from more than 100,000 individuals and IN direct PCR Sanger sequences from more than 10,000 individuals to characterize the amino acid variation at each amino acid position in these genes. We also analyzed sequences from individuals with known ARV treatment histories to identify those mutations resulting from selective drug pressure. Knowledge of the observed variation and selection pressure on the molecular targets of HIV therapy can be useful to clinical, research, and public health laboratories performing genotypic resistance testing.

MATERIALS AND METHODS

Sequences.

HIV-1 group M protease (PR), reverse transcriptase (RT), and integrase (IN) sequences determined by direct PCR dideoxynucleotide sequencing were retrieved from the Stanford HIV Drug Resistance Database (HIVDB) on 1 April 2015 (9). These sequences included 119,000 PR, 128,000 RT, and 13,000 IN sequences from 132,000 individuals in 143 countries. Eighty-five percent of the sequences are in GenBank; 15% were submitted directly to HIVDB. The subtype of each sequence was determined using the REGA HIV-1 Subtyping Tool version 3 (10). The five most common subtypes were B (61%), C (12%), CRF01_AE (8%), CRF02_AG (5%), and A (5%). Clonal sequences were excluded to minimize the likelihood of detecting random virus polymerization errors or—in the case of molecular cloning—PCR errors (11). Ninety-four percent of sequences were obtained from plasma. Plasma sequences were used to analyze overall amino acid variation and ARV selection pressure. Six percent of sequences were obtained from peripheral blood mononuclear cell (PBMC) proviral DNA. PBMC sequences were pooled with the plasma virus sequences in our analysis of APOBEC-associated mutations because proviral DNA is enriched for APOBEC-edited virus genomes (12, 13).

APOBEC-associated mutations.

To identify amino acid changes consistent with APOBEC editing, we first identified all highly conserved GG or GA dinucleotide positions in PR, RT, and IN sequences from plasma samples. Conserved dinucleotides were defined as those present in 98% of pooled samples and in each of the five most common subtypes. We then identified sequences containing mutations that resulted from canonical APOBEC3G (GG→AG) and 3F (GA→AA) G-to-A changes at these highly conserved dinucleotide positions. Sequences with these candidate APOBEC-associated mutations were then examined for stop codons (a specific indicator of APOBEC-mediated editing of tryptophan codons, TGG) and for the number of additional candidate APOBEC-associated mutations. To identify the number of APOBEC-associated mutations to use as a cutoff for classifying a sequence as likely to have undergone G-to-A hypermutation, we assumed a mixture of two Poisson distributions with different λ's, defined as the average number of APOBEC-associated mutations in a sequence: (i) a distribution with a lower λ reflecting sequences lacking APOBEC-associated mutations or containing sparse APOBEC-associated mutations resulting from random HIV mutations and (ii) another distribution with a higher λ reflecting sequences with abundant APOBEC-associated mutations resulting from host APOBEC-3F and APOBEC-3G enzymatic activity. We then developed an R package, LocFDRPois, to estimate, for each number of APOBEC-associated mutations, the local false discovery rate, i.e., the probability that a sequence with that number of APOBEC-associated mutations did not arise from APOBEC editing (http://cran.r-project.org/web/packages/LocFDRPois/). Theoretically, APOBEC-edited genomes should not be found in plasma at a detectable level by Sanger sequencing because these viruses usually cannot complete a virus replication cycle (14). However, plasma can occasionally be contaminated by proviral DNA, which would be extracted and amplified by most HIV sequencing protocols. Therefore, in our subsequent analyses, we excluded all sequences likely to be hypermutated.
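The two-component Poisson mixture and the local false discovery rate described above can be sketched as follows. This is a minimal illustration in Python, not the LocFDRPois implementation: the EM fitting routine, its starting values, and the function names are assumptions made for the example.

```python
import math

def poisson_pmf(k, lam):
    """Probability of observing k events under a Poisson distribution with mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def local_fdr(k, pi0, lam0, lam1):
    """Posterior probability that a sequence with k signature mutations
    belongs to the low-lambda (non-hypermutated) component."""
    background = pi0 * poisson_pmf(k, lam0)
    edited = (1 - pi0) * poisson_pmf(k, lam1)
    return background / (background + edited)

def fit_two_poisson_mixture(counts, n_iter=200):
    """Fit pi0*Pois(lam0) + (1 - pi0)*Pois(lam1) by EM (assumed fitting method).
    counts: list of integer mutation counts, one per sequence."""
    n = len(counts)
    mean = sum(counts) / n
    # Crude starting values (an assumption): most sequences are background.
    pi0, lam0, lam1 = 0.9, max(mean / 2, 0.01), float(max(counts) or 1)
    for _ in range(n_iter):
        # E-step: responsibility of the background component for each sequence.
        resp = [local_fdr(k, pi0, lam0, lam1) for k in counts]
        # M-step: update mixing weight and the two Poisson means.
        w0 = sum(resp)
        w1 = n - w0
        pi0 = w0 / n
        lam0 = sum(r * k for r, k in zip(resp, counts)) / w0
        lam1 = sum((1 - r) * k for r, k in zip(resp, counts)) / w1
    return pi0, lam0, lam1
```

Given fitted parameters, `1 - local_fdr(k, pi0, lam0, lam1)` plays the role of the predicted risk of hypermutation for a sequence carrying `k` signature mutations, and a cutoff on `k` can be chosen where that risk becomes high.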

Amino acid variants.

To characterize variability at each position in PR, RT, and IN, we determined the proportion of each amino acid at each position in all viruses and in each of the five most common HIV-1 subtypes. Each amino acid variant was also characterized by its biochemical relatedness to the consensus amino acid at that position using the BLOSUM62 and BLOSUM80 amino acid similarity matrices. The BLOSUM62 and BLOSUM80 matrices are based on the likelihood that two amino acids can replace one another in genomes that share up to 62% and 80% amino acid similarity, respectively, regardless of the organisms from which they were obtained. Thus, they represent the extent of biochemical similarity between amino acids, which is independent of historical evolution and local sequence context. For notational purposes, amino acid variants were defined as differences from the consensus subtype B amino acid sequence because this is a commonly used reference and because it was nearly always the same as the consensus of all pooled sequences. We also determined the proportion of times that each amino acid variant occurred as part of an electrophoretic mixture in which two peaks were present on the sequence electropherogram resulting in one of the following ambiguous nucleotide calls: R (combination of A and G), Y (combination of C and T), M (combination of A and C), W (combination of A and T), K (combination of G and T), and S (combination of C and G) (15). Amino acids that always occurred as part of an electrophoretic mixture were excluded.
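The handling of electrophoretic mixtures can be illustrated with a short sketch. It expands the six two-base ambiguity codes listed above and translates every codon they imply, so that a mixed base call can be mapped to the amino acids it represents. The function names are hypothetical, and codes other than the six named in the text (for example N) are deliberately not handled.

```python
from itertools import product

# Standard genetic code, indexed in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}

# Two-base ambiguity codes named in the text.
AMBIGUITY = {"R": "AG", "Y": "CT", "M": "AC", "W": "AT", "K": "GT", "S": "CG"}

def expand_codon(codon):
    """Return every unambiguous codon represented by a codon with mixed base calls."""
    options = [AMBIGUITY.get(nt, nt) for nt in codon.upper()]
    return ["".join(p) for p in product(*options)]

def translate_mixture(codon):
    """Return the sorted amino acids encoded by a possibly ambiguous codon."""
    return sorted({CODON_TABLE[c] for c in expand_codon(codon)})

def is_amino_acid_mixture(codon):
    """True when the mixed base calls imply more than one amino acid."""
    return len(translate_mixture(codon)) > 1
```

For example, `translate_mixture("RTG")` covers ATG and GTG and therefore reports both M and V, whereas `translate_mixture("AAR")` covers AAA and AAG and reports only K, a synonymous mixture.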

Nonpolymorphic TSMs.

To identify nonpolymorphic treatment-selected mutations (TSMs), we examined the treatment history of the individuals from whom each sequenced virus was obtained. For each drug class, namely, PR inhibitor (PI), nucleoside RT inhibitor (NRTI), nonnucleoside RT inhibitor (NNRTI), and IN strand transfer inhibitor (INSTI), sequences were characterized as being either from an ARV class-naive individual who received no drugs belonging to the class or an ARV class-experienced individual who received at least one drug from that class. Sequences from individuals of unknown or uncertain treatment history were excluded from this analysis. In sequences from patients with multiple virus isolates, mutations occurring in more than one isolate were counted just once. We then examined each amino acid variant for its association with ARV selection pressure. The proportion of each variant in ARV-experienced individuals was compared to its proportion in ARV-naive individuals using a chi-square test with Yates' correction. Holm's method was then used to control the family-wise error rate for multiple-hypothesis testing at an adjusted P value of <0.01 (16). To exclude TSMs under minimal drug selection pressure, we included only those TSMs that were five times more frequent in ARV-experienced than in ARV-naive individuals. To identify the TSMs that are most specific for ARV selection across subtypes, we identified those TSMs that were nonpolymorphic in the absence of selective drug pressure, defined as occurring at a frequency below 1.0% in ARV-naive individuals infected with viruses belonging to each of the five most common subtypes. Transmitted drug resistance (TDR) will cause many nonpolymorphic TSMs to appear in virus sequences from untreated individuals. This will cause the proportion of these mutations in ARV-naive individuals to be higher than what would be expected in ARV-naive individuals whose viruses had not experienced selective drug pressure. This in turn will reduce the ratio of the prevalence of these mutations in ARV-experienced individuals divided by their inflated prevalence in ARV-naive individuals. Therefore, we restricted our analysis of ARV-naive sequences to those lacking any of the 93 surveillance drug resistance mutations (SDRMs) that have become established markers of TDR (17). For IN, for which an SDRM list is not available, we used major INSTI resistance mutations defined in Stanford HIVDB: T66I/A/K, E92Q, F121Y, G140S/A/C, Y143C/R/H, S147G, Q148H/K/R, and N155H/S. Among RT inhibitor (RTI)-experienced individuals, 75% received NRTIs in combination with an NNRTI, 22% received NRTIs without an NNRTI, and 3% received an NNRTI without an NRTI. The frequent use of NRTIs in combination with an NNRTI makes it difficult to determine for some mutations whether they are selected by NRTIs or NNRTIs. Therefore, we first determined whether RT mutations were treatment selected by comparing the proportions of mutations in sequences from RTI-naive and RTI-experienced individuals. We then determined whether the selection appeared to be primarily associated with NRTIs versus NNRTIs using a previously described approach (18). Those mutations that did not demonstrate a strong significant association with just one class were classified as (i) NRTI associated if their positions are known to be associated with NRTI resistance, (ii) NNRTI associated if their positions are known to be associated with NNRTI resistance, or (iii) undifferentiated RTI associated if their positions were not previously associated with NRTI or NNRTI resistance.
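The statistical filter described above (chi-square test with Yates' correction, Holm adjustment, a 5-fold prevalence ratio, and a prevalence below 1.0% in ARV-naive individuals of each subtype) can be sketched in Python as follows. This is an illustrative reconstruction under stated assumptions rather than the authors' code: the function names and the shape of the inputs are hypothetical.

```python
import math

def yates_chi2(a, b, c, d):
    """Chi-square test with Yates' continuity correction for the 2x2 table
    [[a, b], [c, d]] (1 degree of freedom). Returns (statistic, p_value)."""
    n = a + b + c + d
    numerator = n * max(abs(a * d - b * c) - n / 2, 0) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    chi2 = numerator / denominator
    # Survival function of a chi-square variable with 1 degree of freedom.
    return chi2, math.erfc(math.sqrt(chi2 / 2))

def holm_adjust(pvalues):
    """Holm step-down adjusted P values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        running_max = max(running_max, min(1.0, (m - rank) * pvalues[i]))
        adjusted[i] = running_max
    return adjusted

def is_nonpolymorphic_tsm(x_exp, n_exp, x_naive, n_naive, adj_p, naive_prev_by_subtype):
    """Apply the three criteria from the text to one mutation.
    naive_prev_by_subtype: hypothetical dict of subtype -> prevalence (fraction)
    in ARV-naive individuals."""
    more_than_fivefold = x_exp / n_exp > 5 * (x_naive / n_naive)
    nonpolymorphic = all(p < 0.01 for p in naive_prev_by_subtype.values())
    return adj_p < 0.01 and more_than_fivefold and nonpolymorphic
```

Each mutation contributes one 2x2 table (mutation present or absent, by class-experienced or class-naive); the unadjusted P values for all mutations in a gene would then be passed together to `holm_adjust` before the filter is applied.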

Synonymous and nonsynonymous mutation rates.

To determine whether the overall nucleotide mutation rate at a codon influenced the likelihood of developing amino acid variants, we estimated the synonymous and nonsynonymous rates at each codon in PR, RT, and IN for the five most common subtypes. For each subtype, we used FastML (19) to determine the most probable ancestral codon and then compared the codon of each sequence to this codon to estimate the number of synonymous changes/number of potential synonymous changes (dS) and the number of nonsynonymous changes/number of potential nonsynonymous changes (dN). Additionally, we examined each consensus amino acid and TSM to determine the minimum number of nucleotide differences between their respective codons.
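The last step, the minimum number of nucleotide differences between the codons of a consensus amino acid and those of a TSM, lends itself to a short worked example. The sketch below enumerates all codon pairs under the standard genetic code; it does not reproduce the FastML-based dS and dN estimation, and the function name is hypothetical.

```python
from itertools import product

# Standard genetic code, indexed in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each amino acid (or '*' for stop) to the list of codons encoding it.
CODONS_BY_AA = {}
for (i, a), (j, b), (k, c) in product(enumerate(BASES), repeat=3):
    CODONS_BY_AA.setdefault(AMINO[16 * i + 4 * j + k], []).append(a + b + c)

def min_nt_differences(aa_from, aa_to):
    """Smallest number of nucleotide changes needed to turn any codon for
    aa_from into any codon for aa_to."""
    return min(
        sum(x != y for x, y in zip(c1, c2))
        for c1 in CODONS_BY_AA[aa_from]
        for c2 in CODONS_BY_AA[aa_to]
    )
```

For instance, the M-to-V change at RT position 184 requires a single nucleotide change (ATG to GTG), whereas the T-to-Y change at RT position 215 requires at least two (ACN to TAT or TAC). The sketch considers all codons for the consensus amino acid, whereas an analysis of real sequences would start from the codon actually observed.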

RESULTS

Signature mutations indicating APOBEC-mediated editing.

Of 297 PR nucleotides, 24 GG and GA dinucleotides at 22 amino acid positions were conserved in more than 98% of sequences in each of the five most common subtypes. Canonical APOBEC-mediated changes at these positions (GG→AG, GA→AA, and GG→AA if GG is followed by G) would result in 58 different amino acid mutations and two stop codons. Fifty of the 58 mutations occurred in sequences from one or more plasma samples. Of the 50 observed mutations, 32 were strongly associated with one or more stop codons or with a canonical APOBEC-mediated mutation at one or more of the active-site residues D25, G27, G49, G51, and G52. Table S1 in the supplemental material lists the two stop codons and the 32 PR mutations that our analysis suggests indicate APOBEC-mediated editing.

Of 1,680 RT nucleotides, 128 GG and GA dinucleotides at 115 amino acid positions were conserved in >98% of sequences in each of the five most common subtypes. Canonical APOBEC-mediated changes at these positions would result in 241 different amino acid mutations and 19 stop codons. One hundred eighty of the 245 mutations occurred in sequences from one or more plasma samples. Of the 180 observed mutations, 89 were significantly associated with one or more stop codons or with a canonical APOBEC-mediated mutation at one of the active-site residues D110, D185, and D186. One of the 89 mutations, M230I, has recently been reported to cause resistance to the NNRTI rilpivirine (20). Table S1 in the supplemental material lists the 19 stop codons and the 88 RT mutations that our analysis suggests indicate APOBEC-mediated editing.

Of the 864 IN nucleotides, 76 GG and GA dinucleotides at 65 amino acid positions were conserved in >98% of sequences in each of the five most common subtypes. Canonical APOBEC-mediated changes at these positions would result in 136 different amino acid mutations and 7 stop codons. Eighty of the 136 mutations occurred in sequences from one or more plasma samples. Of these 80 mutations, 62 were significantly associated with one or more stop codons or with a canonical APOBEC-mediated mutation at one of the active-site residues D64, D116, and E152. One of the 62 mutations, G118R, has recently been reported to reduce susceptibility to multiple INSTIs (21, 22). Table S1 in the supplemental material lists the seven stop codons and the 61 IN mutations that our analysis suggests indicate APOBEC-mediated editing.

The local false discovery rate derived from the mixture model described in Materials and Methods was used to classify sequences as hypermutated or nonhypermutated based on the number of signature APOBEC mutations within PR, RT, and IN (see Table S2 in the supplemental material). The presence of one signature mutation predicted risks of hypermutation of 18%, 19%, and 16% for PR, RT, and IN sequences, respectively. The presence of two signature mutations predicted risks of hypermutation of 86%, 79%, and 76%, respectively. The presence of three signature mutations predicted risks of hypermutation of 99.8%, 98.5%, and 97.8%, respectively. Therefore, in our subsequent analyses, we excluded 112 PR, 225 RT, and 81 IN plasma sequences containing two or more signature APOBEC mutations.
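To make the notion of a canonical APOBEC-mediated change concrete, the following sketch enumerates the single G-to-A edits in a GG or GA context along an in-frame nucleotide sequence and reports the resulting amino acid changes. It is a simplified illustration: the function name is hypothetical, only single edits are enumerated (the double GG→AA change is not), and no conservation filter is applied.

```python
# Standard genetic code, indexed in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}

def apobec_single_edits(seq):
    """List nonsynonymous single G-to-A edits in GG or GA context.
    seq: in-frame nucleotide sequence starting at codon 1.
    Returns tuples (nt_position, context, aa_position, ref_aa, mut_aa), 1-based."""
    seq = seq.upper()
    edits = []
    for i in range(len(seq) - 1):
        context = seq[i:i + 2]
        if context not in ("GG", "GA"):
            continue
        start = 3 * (i // 3)
        codon = seq[start:start + 3]
        if len(codon) < 3:
            continue  # ignore a trailing partial codon
        offset = i - start
        edited = codon[:offset] + "A" + codon[offset + 1:]
        ref_aa, mut_aa = CODON_TABLE[codon], CODON_TABLE[edited]
        if ref_aa != mut_aa:
            edits.append((i + 1, context, i // 3 + 1, ref_aa, mut_aa))
    return edits
```

Applied to the toy sequence `TGGGAT` (W, D), the function reports two edits that turn the tryptophan codon into a stop codon (TAG and TGA) and one GA→AA edit that changes D to N, which mirrors why stop codons at tryptophan positions are a specific indicator of APOBEC editing.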

Amino acid variation.

Overall, we analyzed 110,357 PR sequences obtained from 101,154 individuals, 118,246 RT sequences from 108,681 individuals, and 11,838 IN sequences from 11,156 individuals. Most RT sequences did not encompass the 3′ RNase H coding region of RT. Therefore, for our analysis of RT amino acid variability, we included just positions 1 to 400. Of the 99 PR positions, 47 (47%) had one or more variants occurring at a prevalence of ≥1%, and 69 (70%) had one or more variants occurring at a prevalence of ≥0.1% (Fig. 1). Overall, there were 201 variants occurring at a prevalence of ≥0.1% at these 69 positions (Table 1).

FIG 1

Distribution of the number of HIV-1 protease (PR) amino acid variants by position stratified by prevalence: ≥1% (A), 0.1% to 1% (B), 0.01% to 0.1% (C), and <0.01% (D). The total number of sequences analyzed at each position is shown on a log10 scale (E).

TABLE 1

Amino acid variants according to frequency^a

Frequency (%)	Protease				Reverse transcriptase				Integrase
Frequency (%)	No. of amino acid variants	% of positions with variant	Median similarity score^b	% found in electrophoretic mixtures	No. of amino acid variants	% of positions with variant	Median similarity score^b	% found in electrophoretic mixtures	No. of amino acid variants	% of positions with variant	Median similarity score^b	% found in electrophoretic mixtures
<0.01	655	100	−2	60	2,487	99	−2	60	504	85	−1	54
0.01–0.1	260	89	−1	45	1,091	91	−1	49	460	81	0	45
0.1–1	119	56	0	26	379	47	0	30	214	47	0	26
1–10	65	38	0	17	202	31	1	18	107	28	1	14
>10	17	17	2	9	55	12	1	9	25	8	1	7

^a Protease positions 1 to 99 were analyzed using 109,497 protease sequences, RT positions 1 to 400 were analyzed using 108,848 RT sequences, and integrase positions 1 to 288 were analyzed using 11,778 integrase sequences.

^b BLOSUM62 similarity score to the consensus amino acid.

Of the 400 RT positions, 147 (37%) had one or more variants occurring at a prevalence of ≥1%, and 240 (60%) had one or more variants in ≥0.1% of sequences (Fig. 2). Overall, there were 636 variants occurring at a prevalence of ≥0.1% at these 240 positions (Table 1).

FIG 2

Distribution of the number of HIV-1 reverse transcriptase (RT) amino acid variants by position stratified by prevalence: ≥1% (A), 0.1% to 1% (B), 0.01% to 0.1% (C), and <0.01% (D). The total number of sequences analyzed at each position is shown on a log10 scale (E).

Of the 288 IN positions, 97 (34%) had one or more variants occurring at a prevalence of ≥1%, and 172 (60%) had one or more variants in ≥0.1% of sequences (Fig. 3). Overall, there were 346 variants occurring at a prevalence of ≥0.1% at these 172 positions (Table 1).

FIG 3

Distribution of the number of HIV-1 integrase (IN) amino acid variants by position stratified by prevalence: ≥1% (A), 0.1% to 1% (B), 0.01% to 0.1% (C), and <0.01% (D). The total number of sequences analyzed at each position is shown on a log10 scale (E).

Variability between subtypes.

At each position, the number of amino acid variants with a prevalence of ≥0.1% was highly correlated between subtypes: The median intersubtype correlation coefficients for the number of variants with a prevalence above 0.1% were 0.85 (P < 2E−16), 0.84 (P < 2E−16), and 0.68 (P < 2E−16) for PR, RT, and IN, respectively (Fig. 4, 5, and 6).

FIG 4

Distribution of the number of HIV-1 protease (PR) amino acid variants present at prevalences of ≥1% (blue) and ≥0.1% (green) by subtype.

FIG 5

Distribution of the number of HIV-1 reverse transcriptase (RT) amino acid variants present at prevalences of ≥1% (blue) and ≥0.1% (green) by subtype.

FIG 6

Distribution of the number of HIV-1 integrase (IN) amino acid variants present at prevalences of ≥1% (blue) and ≥0.1% (green) by subtype.

For amino acid variants with a prevalence of ≥0.1%, the median intersubtype ratio of the prevalence for PR variants was 2.9-fold (interquartile range [IQR], 1.2- to 4.7-fold); only 5.0% of PR variants had a prevalence in one subtype that differed by ≥10-fold in another subtype (range, 10- to 28-fold). The median intersubtype ratio of the prevalence for RT variants was 2.1-fold (IQR, 1.0- to 3.5-fold); only 3.7% of RT variants had a prevalence in one subtype that differed by ≥10-fold in another subtype (range, 10- to 39-fold). The median intersubtype ratio of the prevalence for IN variants was 1.9-fold (IQR, 1.2- to 3.0-fold); only 2.0% of IN variants had a prevalence in one subtype that differed by ≥10-fold in another subtype (range, 10- to 51-fold).

Chemical relatedness.

There was a strong relationship between the prevalence of an amino acid variant and its biochemical similarity to the consensus amino acid (Table 1). Each 10-fold increase in a variant's prevalence was significantly correlated with the change in BLOSUM62 similarity score: the slopes of a fitted line for each gene were 0.71 (r = 0.47; P < 2E−16), 0.67 (r = 0.41; P < 2E−16), and 0.68 (r = 0.36; P < 2E−16) for PR, RT, and IN, respectively. Similar results were obtained using the BLOSUM80 scoring matrix: the slopes of a fitted line for each gene were 0.81 (r = 0.47; P < 2E−16), 0.77 (r = 0.41; P < 2E−16), and 0.74 (r = 0.35; P < 2E−16) for PR, RT, and IN, respectively.
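The slope per 10-fold increase in prevalence can be read as the coefficient of a least-squares line fitted to the similarity score against log10 of the prevalence. The sketch below shows that calculation in Python; the exact regression setup used in the analysis is not specified in the text, so the choice of ordinary least squares, the function name, and the inputs are assumptions.

```python
import math

def slope_per_tenfold(prevalences_pct, scores):
    """Fit score = intercept + slope * log10(prevalence) by ordinary least squares.
    Returns (slope, pearson_r); the slope is the change in score per 10-fold
    increase in prevalence."""
    x = [math.log10(p) for p in prevalences_pct]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(scores) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in scores)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, scores))
    return sxy / sxx, sxy / math.sqrt(sxx * syy)
```

Each input pair would be one amino acid variant: its prevalence in percent and its BLOSUM62 (or BLOSUM80) score relative to the consensus amino acid at that position.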

Mixture analysis.

There was a strong inverse relationship between a variant's prevalence and the proportion of times that it occurred as part of an electrophoretic mixture. Each 10-fold increase in a variant's prevalence was inversely correlated with the change in the proportion of times that it occurred as part of an electrophoretic mixture: the slopes of a fitted line for each gene were −3.6 (r = 0.14; P < 2E−06), −5.9 (r = 0.32; P < 2E−16), and −7.6 (r = 0.43; P < 2E−16) for PR, RT, and IN, respectively. For example, the very rare variants with a prevalence of <0.01% were present as a part of mixture in 54% to 60% of their occurrences, depending on the gene. In contrast, the most common variants were present as a part of mixture in 7% to 9% of their occurrences, depending on the gene (Table 1).

Very rare amino acid variants.

The very rare variants occurring at a prevalence of <0.01% were evenly distributed throughout PR, RT, and IN (coefficients of variation [CV], 29% for PR, 43% for RT, and 66% for IN) across positions whether they were highly conserved or were variable at higher-mutation-prevalence strata. In contrast, amino acid variants with higher prevalence had a higher coefficient of variation than variants with lower prevalence: ≥1% (CV, 155% for PR, 179% for RT, and 170% for IN), 0.1% to 1% (CV, 130% for PR, 147% for RT, and 139% for IN), and 0.01% to 0.1% (CV, 73% for PR, 68% for RT, and 76% for IN) (Fig. 1 to 3). Table S3 in the supplemental material shows that 3.5% of PR, 10.3% of RT, and 6.5% of IN sequences had ≥1 very rare amino acid variant and 0.5% of PR, 2.2% of RT, and 0.9% of IN sequences had ≥2 very rare amino acid variants. The steep reduction in the proportion of sequences with increasing numbers of very rare amino acid variants followed a Poisson distribution.

Nonpolymorphic TSMs. (i) PR.

To identify nonpolymorphic PI-selected mutations, we analyzed the proportions of all PR mutations in sequences from 61,593 PI-naive individuals and 15,420 PI-experienced individuals. Within PR, 144 mutations at 57 positions were significantly more common in PI-experienced than PI-naive patients after adjustment for multiple-hypothesis testing by controlling the family-wise error rate (i.e., adjusted P) at <0.01 (chi-square test; unadjusted P < 8.8 × 10^−6). Of these 144 mutations, 111 at 41 positions were nonpolymorphic and occurred more than five times more frequently in PI-experienced than PI-naive individuals. Table 2 lists each of the 111 nonpolymorphic TSMs by their position and frequency in ARV-experienced individuals.

TABLE 2

PI nonpolymorphic treatment-selected mutations

Position	Cons^a	TSM(s)^b	No. of PI-treated individuals	No. of PI-naive individuals
10	L	F_9.5 R_0.4 Y_0.3	15,231	60,294
11	V	L_0.8	15,244	60,351
20	K	T_5.1 A_0.1	15,278	61,114
22	A	V_0.9	15,292	61,145
23	L	I_1.2	15,295	61,252
24	L	I_5.9 F_0.6 M_0.2	15,282	61,263
30	D	N_6.3	15,302	61,316
32	V	I_5.1	15,302	61,323
33	L	M_0.1	15,302	61,317
34	E	Q_2.7 D_0.3 V_0.2 N_0.1 R_0.1	15,302	61,315
36	M	A_0.1	15,296	61,306
38	L	W_0.2	15,304	61,319
43	K	T_5.7 N_0.4 I_0.3 Q_0.2 S_0.1 P_0.04	15,420	61,587
45	K	Q_0.3 I_0.2 V_0.1	15,421	61,587
46	M	I_22.7 L_10.1 V_0.5	15,412	61,594
47	I	V_4.9 A_0.4	15,423	61,595
48	G	V_4.1 M_0.5 A_0.4 E_0.2 Q_0.1 S_0.1 L_0.1 T_0.05	15,423	61,597
50	I	V_2.0 L_0.5	15,423	61,597
51	G	A_0.3	15,422	61,592
53	F	L_6.0 Y_0.4 I_0.1 W_0.1	15,423	61,598
54	I	V_25.5 L_3.2 M_2.8 A_1.4 T_0.9 S_0.7 C_0.04	15,422	61,594
55	K	R_7.6 N_0.3	15,421	61,596
66	I	F_1.7 V_1.2 L_0.4	15,423	61,593
67	C	F_1.1 L_0.1	15,418	61,577
71	A	I_3.2 L_0.5	15,415	61,592
72	I	L_2.5 K_0.7	15,417	61,574
73	G	S_8.7 T_2.6 C_1.2 A_0.7 V_0.2 D_0.1 I_0.1 N_0.05	15,423	61,592
74	T	P_1.9 E_0.1	15,421	61,591
76	L	V_3.8	15,419	61,585
79	P	A_0.9 N_0.1	15,421	61,591
82	V	A_23.3 T_3.2 F_1.8 S_1.4 C_0.8 L_0.3 M_0.3 G_0.2	15,414	61,582
83	N	D_0.8 S_0.3	15,421	61,584
84	I	V_14.2 A_0.2 C_0.1	15,421	61,584
85	I	V_4.9	15,420	61,582
88	N	D_5.1 S_1.5 G_0.2 T_0.1	15,418	61,543
89	L	V_4.2 T_0.2 P_0.1	15,412	61,533
90	L	M_32.0 I_0.1	15,416	61,537
91	T	S_1.7 C_0.1	15,417	61,536
92	Q	R_0.9	15,416	61,527
95	C	F_1.7 L_0.2 V_0.1	15,404	61,251
96	T	S_0.3	15,391	61,129

^a Cons, consensus.

^b Nonpolymorphic treatment-selected mutations (TSMs) in boldface were previously reported as being associated with drug resistance (18).

Of the 88 PI nonpolymorphic TSMs that were previously reported by us (18), two mutations, I13M and T74K, were no longer found 5-fold more often in treated compared with untreated individuals. One mutation, Q58E, had a prevalence of 1.1% in subtype D viruses from untreated individuals. The 85 mutations in boldface were previously reported by us as nonpolymorphic TSMs, whereas the remaining 26 mutations are newly identified. Ninety-two percent of the sequences containing a novel nonpolymorphic TSM had one or more PI-associated SDRMs.

(ii) RT.

To identify nonpolymorphic RTI-selected mutations, we analyzed the proportions of all RT mutations in sequences from 52,040 RTI-naive and 28,806 RTI-experienced individuals. Among the sequences from RTI-naive individuals, 22,810 encompassed RT positions 1 to 300, 4,790 encompassed RT positions 1 to 400, and 2,440 encompassed positions 1 to 560. Among the sequences from RTI-experienced individuals, 14,163 encompassed positions 1 to 300, 5,727 encompassed positions 1 to 400, and 437 encompassed positions 1 to 560. Within RT, 245 mutations at 116 positions were significantly more common in RTI-experienced than RTI-naive individuals after adjustment for multiple-hypothesis testing by controlling the family-wise error rate (i.e., adjusted P) at <0.01 (chi-square test; unadjusted P < 3.6 × 10^−6). Of these 245 mutations, 185 mutations at 82 positions were nonpolymorphic and occurred more than five times more frequently in RTI-experienced than RTI-naive individuals. Table 3 lists each of the 95 nonpolymorphic NRTI-selected mutations. Table 4 lists each of the 64 nonpolymorphic NNRTI-selected mutations. Table 5 lists 26 nonpolymorphic RTI-selected mutations that could not be attributed to either NRTI or NNRTI selection pressure alone and that occurred at positions not previously associated with NRTI or NNRTI selection pressure.

TABLE 3

NRTI nonpolymorphic treatment-selected mutations

Position	Cons^a	TSM(s)^b	No. of RTI-treated individuals	No. of RTI-naive individuals
40	E	F_0.6	28,619	51,040
41	M	L_28.5	28,761	51,192
43	K	N_1.7 D_0.1 H_0.1	28,768	51,944
44	E	A_1.5	28,769	51,957
64	K	H_0.6 N_0.5 Y_0.2 Q_0.1	28,796	51,997
65	K	R_4.7 N_0.1 E_0.1	28,803	52,000
67	D	N_26.8 G_2.5 E_0.5 S_0.3 H_0.2 T_0.2 A_0.1 d_0.1	28,792	51,999
68	S	K_0.1	28,804	52,003
69	T	D_6.1 i_0.9 G_0.2 d_0.2 E_0.2 Y_0.1	28,789	52,005
70	K	R_18.1 E_0.8 G_0.4 T_0.3 N_0.3 Q_0.3 S_0.1	28,797	52,013
73	K	M_0.1	28,804	52,017
74	L	V_8.7 I_4.2	28,799	52,021
75	V	M_3.3 I_3.1 T_1.4 A_0.7 S_0.3	28,798	52,034
77	F	L_1.7	28,805	52,035
115	Y	F_2.3	28,806	52,037
116	F	Y_2.0	28,807	52,044
117	S	A_0.2	28,802	52,037
151	Q	M_2.7 L_0.2 K_0.1	28,792	52,026
157	P	A_0.2	28,791	52,029
159	I	L_0.1	28,792	52,027
162	S	D_1.9	28,763	51,998
164	M	L_0.1	28,786	52,028
165	T	L_0.7 M_0.1	28,787	52,021
167	I	V_0.6	28,788	52,020
184	M	V_52.5 I_2.5	28,777	52,016
203	E	K_5.4 V_0.4 A_0.3 N_0.1	28,736	51,864
205	L	F_0.1	28,738	51,841
208	H	Y_7.2 F_0.3	28,725	51,820
210	L	W_17.7 Y_0.1 R_0.1	28,688	51,798
211	R	D_0.3	28,700	51,755
212	W	M_0.2 C_0.1 L_0.1	28,705	51,789
215	T	Y_26.3 F_10.3 S_2.1 I_1.9 N_1.0 C_0.9 D_0.8 V_0.7 E_0.2 G_0.1 H_0.1	28,657	51,505
218	D	E_5.6	28,653	51,454
219	K	Q_10.9 E_6.1 N_3.1 R_2.7 D_0.3 H_0.3 W_0.3 G_0.1 S_0.1	28,639	51,435
304	A	G_0.7	11,563	19,788

^a Cons, consensus.

^b Nonpolymorphic treatment-selected mutations (TSMs) in boldface were previously reported as being associated with drug resistance (18). Lowercase “i” indicates an insertion; lowercase “d” indicates a deletion.

TABLE 4

NNRTI nonpolymorphic treatment-selected mutations

Position	Cons^a	TSM(s)^b	No. of RTI-treated individuals	No. of RTI-naive individuals
94	I	L_0.6	28,810	52,041
98	A	G_5.7	28,802	52,042
100	L	I_3.6	28,796	51,999
101	K	E_6.6 P_1.3 H_1.1 N_0.4 T_0.3 A_0.2 D_0.1	28,794	52,039
102	K	N_0.4 G_0.1	28,804	52,028
103	K	N_30.7 S_1.6 T_0.2 H_0.1	28,805	52,032
105	S	T_0.2	28,808	52,045
106	V	M_4.0 A_1.4	28,805	52,045
108	V	I_7.4	28,808	52,043
132	I	L_0.7	28,800	52,037
138	E	Q_1.0 K_0.5 T_0.1	28,798	52,024
139	T	R_0.8	28,798	52,037
178	I	F_0.2	28,781	52,001
179	V	F_0.2 L_0.1 M_0.1	28,774	52,010
181	Y	C_16.6 I_0.7 V_0.5 F_0.2 G_0.1 N_0.1	28,780	52,016
188	Y	L_3.7 C_0.8 H_0.7 F_0.4	28,758	52,014
190	G	A_12.7 S_2.3 E_0.4 Q_0.3 C_0.1	28,771	52,015
221	H	Y_6.1 C_0.1	28,565	50,963
225	P	H_3.7	28,386	50,583
227	F	L_2.3 Y_0.2	28,165	50,128
230	M	L_1.4	28,081	49,720
232	Y	H_0.3	27,827	49,437
234	L	I_0.2	27,760	49,216
238	K	T_1.9 N_0.4	27,404	47,232
240	T	K_0.1	23,831	46,204
241	V	M_0.2	23,586	44,549
242	Q	H_0.9 L_0.2 K_0.1	23,529	43,984
318	Y	F_1.3	10,809	15,668
348	N	I_13.0 T_0.8	6,367	5,528
404	E	N_1.3	1,207	3,663

^a Cons, consensus.

^b Nonpolymorphic treatment-selected mutations (TSMs) in boldface were previously reported as being associated with drug resistance (18).

TABLE 5

Undifferentiated RTI nonpolymorphic treatment-selected mutations

Position	Cons^a	TSM(s)^b	No. of RTI-treated individuals	No. of RTI-naive individuals
3	S	C_0.3	19,241	42,633
16	M	V_0.4	19,884	43,640
31	I	L_1.6	21,490	45,863
33	A	V_0.2	21,573	46,050
34	L	I_0.7	21,582	46,129
54	N	I_0.1	28,794	51,991
58	T	N_0.2 S_0.2	28,795	51,994
109	L	I_0.8 M_0.1 V_0.1	28,808	52,043
202	I	T_0.1	28,742	51,873
223	K	Q_2.1 E_1.7 T_0.5 P_0.1	28,537	50,880
228	L	R_5.4 N_0.1 I_0.1 K_0.1	28,148	50,071
302	E	D_0.3	12,507	20,464
312	E	G_0.4	10,935	17,751
341	I	F_1.4	6,671	5,802
394	Q	S_0.8	6,108	4,874
399	E	G_1.2	5,882	4,830
547	Q	R_3.6	473	2,559

^a Cons, consensus.

^b Nonpolymorphic treatment-selected mutations (TSMs) in boldface were previously reported as being associated with drug resistance (18).

Of the 122 RTI nonpolymorphic TSMs that we previously reported (18), two mutations, P236L and D237E, were no longer found to be 5-fold more common in treated than in untreated individuals. Two others no longer met the criterion for being nonpolymorphic: K43Q had a prevalence of 2.0% in CRF01_AE viruses from ARV-naive individuals, and L228H had a prevalence of 1.2% in subtype F viruses from ARV-naive individuals. In Tables 3, 4, and 5, the 118 mutations shown in boldface were previously reported by us to be nonpolymorphic TSMs, whereas the remaining 67 are newly identified. Ninety-eight percent of the sequences containing a novel nonpolymorphic TSM in RTI-experienced individuals had one or more RTI-associated SDRMs.

(iii) IN.

To identify nonpolymorphic INSTI-selected mutations, we analyzed the proportions of all IN mutations in sequences from 6,630 INSTI-naive and 1,020 INSTI-experienced individuals. Within IN, 45 mutations at 28 positions were significantly more common in INSTI-experienced than INSTI-naive individuals after adjustment for multiple-hypothesis testing by controlling the family-wise error rate (i.e., adjusted P) at <0.01 (chi-square test; unadjusted P < 1.3 × 10^−5). Of these 45 mutations, 44 occurred more than five times more frequently in INSTI-experienced than INSTI-naive individuals. Of these 44 TSMs, 30 at 15 positions were nonpolymorphic in INSTI-naive patients. Table 6 shows those 30 nonpolymorphic TSMs. Of these, 23 (in boldface) are previously reported DRMs (23), and the remaining 7 are new: V79I, E92A, E138T, P142T, Q148N, N155D, and D253Y. Eighty-one percent of the sequences containing a novel nonpolymorphic TSM had one or more established INSTI-associated DRMs.

TABLE 6

INSTI nonpolymorphic treatment-selected mutations

Position	Cons^a	TSM(s)^b	No. of INSTI-treated individuals	No. of INSTI-naive individuals
51	H	Y_0.5	1,019	6,609
66	T	I_1.3 A_0.7 K_0.4	1,019	6,619
79	V	I_2.5	1,020	6,625
92	E	Q_6.4 A_0.4	1,020	6,628
95	Q	K_1.6	1,020	6,627
121	F	Y_0.4	1,020	6,631
138	E	K_5.9 A_3.0 T_0.7	1,020	6,631
140	G	S_25.2 A_2.1 C_0.7	1,020	6,631
142	P	T_0.6	1,020	6,631
143	Y	R_7.7 C_5.4 H_2.8 S_0.6 G_0.4	1,020	6,631
147	S	G_1.6	1,020	6,631
148	Q	H_22.6 R_7.9 K_1.0 N_0.4	1,020	6,629
155	N	H_30.8 D_0.5	1,020	6,629
230	S	R_3.6	1,018	6,608
253	D	Y_1.0	1,018	6,588

^a Cons, consensus.

^b Nonpolymorphic treatment-selected mutations (TSMs) in boldface were previously reported as being associated with drug resistance (9).

Among the 99 PR positions, dN was higher than dS at a median of 18 positions in the five most common subtypes. dN was higher than dS in all five subtypes at positions 12, 13, 15, and 37. Among the 400 RT positions studied for amino acid variation, dN was higher than dS at a median of 37 positions in the five most common subtypes. dN was higher than dS in all five subtypes at positions 35, 135, 178, 200, 202, 272, and 369. Among the 288 IN positions, dN was higher than dS at a median of 28 positions in the five most common subtypes. dN was higher than dS in all five subtypes at positions 124 and 218.

Among the PR TSMs, the minimum numbers of nucleotide differences between the TSM and the consensus amino acid variant were 1 for 67.6% and 2 for 32.4% (i.e., these were 2-bp mutations). Among the RT TSMs, the minimum numbers of nucleotide differences were 1 for 68.4%, 2 for 31.1%, and 3 for 0.6%. Among the IN TSMs, the minimum numbers of nucleotide differences were 1 for 86.7% and 2 for 13.3%.
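The minimum number of nucleotide differences between two amino acids can be computed directly from the standard genetic code. The sketch below is illustrative only and makes one simplifying assumption: it compares every codon for the consensus amino acid with every codon for the mutant, whereas the study may have started from the actual consensus codon at each position. The function names are hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def codons_for(amino_acid: str) -> list:
    """All codons that encode the given one-letter amino acid."""
    return [codon for codon, aa in CODON_TABLE.items() if aa == amino_acid]


def min_nucleotide_differences(consensus: str, mutant: str) -> int:
    """Smallest number of base changes separating any codon pair.

    ASSUMPTION: any synonymous consensus codon is allowed as the start point.
    """
    return min(
        sum(x != y for x, y in zip(c1, c2))
        for c1 in codons_for(consensus)
        for c2 in codons_for(mutant)
    )


print(min_nucleotide_differences("M", "V"))  # M184V: 1
print(min_nucleotide_differences("T", "Y"))  # T215Y: 2
print(min_nucleotide_differences("E", "F"))  # E40F: 3
```

Under this definition, M184V needs a single base change, T215Y needs two, and E40F (Table 3) needs three.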

DISCUSSION

Within an individual, HIV-1 variation arises from repeated cycles of virus polymerization errors, recombination, APOBEC-mediated G-to-A editing, and selective drug and immune pressure (24, 25). Although HIV-1 has a high mutation rate, only those variants without significantly impaired fitness will rise to levels detectable by standard direct PCR Sanger sequencing. In contrast, it is expected that many virus polymerization errors will result in nonviable variants or variants that may not compete successfully with more-fit virus variants (26). The consistent presence of certain mutations by Sanger sequencing attests to their fitness at least under some conditions and genetic contexts. Extensive data are available for characterizing HIV-1 PR, RT, and IN variability because these genes are frequently sequenced for clinical, research, and epidemiological purposes. We analyzed PR and RT sequences from more than 100,000 individuals and IN sequences from more than 10,000 individuals and identified 1,183 amino acid variants in PR, RT, and IN that were present in ≥0.1% of sequences. We also analyzed several subsets of these sequences from individuals with known ARV treatment histories and identified 326 nonpolymorphic PR, RT, and IN TSMs.

Overall PR, RT, and IN variability.

Forty-seven percent of PR, 37% of RT, and 34% of IN positions had one or more amino acid variants with a prevalence of ≥1%. Seventy percent of PR, 60% of RT, and 60% of IN positions had one or more amino acid variants with a prevalence of ≥0.1%. Although amino acid variants occurred in different proportions in different subtypes, the prevalence of a variant in one subtype rarely differed by more than 10-fold compared with the prevalence of that variant in a different subtype (2.0% of IN variants, 3.7% of RT variants, and 5.0% of PR variants). In each gene, the rarer the amino acid variant, the more likely it was to be present as part of an electrophoretic mixture or to differ biochemically from the consensus amino acid. Variants that occur frequently as part of electrophoretic mixtures are likely to have reduced replication fitness, explaining their inability to replicate sufficiently to become dominant within an infected individual's circulating virus population (27, 28). Although the presence of two electrophoretic peaks at a position is usually a reliable indicator that two nucleotides are present in that virus population, a small secondary peak can also result from PCR error and sequencing artifact (29, 30). Very rare variants had the lowest biochemical similarity to the consensus amino acid at each position and often occurred as part of an electrophoretic mixture. Additionally, these variants were evenly distributed across all positions in PR, RT, and IN, occurring in similar numbers at positions that were highly conserved and at positions that displayed variability at higher mutation thresholds. We propose that it is useful to identify sequences that contain large numbers of such rare variants because a high number of very rare amino acids in a direct PCR dideoxynucleotide terminator Sanger sequence could result from sequencing error or, if the rare amino acids are clustered, from unrecognized frameshifts. Additionally, the presence of a high number of very rare variants in a next-generation deep-sequencing assay would be more consistent with PCR error than quasispecies variation and would suggest that the threshold for identification of low-abundance variants was set too low.
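A quality-control check of the kind proposed above might count the very rare variants in a sequence and look for clustering. The sketch below is a hypothetical illustration: the prevalence lookup, the rarity cutoff (`rare_below`), the flagging cutoff (`max_rare`), and the window size are all assumed values chosen for the example, not parameters reported in this study.

```python
def rare_variant_qc(variants, prevalence, rare_below=1e-4, max_rare=3, window=10):
    """Summarize very rare variants in one sequence.

    variants   -- iterable of (position, amino_acid) tuples for one sequence
    prevalence -- dict mapping (position, amino_acid) to population prevalence;
                  variants missing from the table are treated as very rare
    All thresholds are ASSUMED values for illustration only.
    """
    rare_positions = sorted(
        pos for pos, aa in variants
        if prevalence.get((pos, aa), 0.0) < rare_below
    )
    # Largest number of rare variants falling within any run of `window`
    # positions; a dense run is what an unrecognized frameshift would produce.
    densest = 0
    for i, start in enumerate(rare_positions):
        in_window = sum(1 for p in rare_positions[i:] if p - start < window)
        densest = max(densest, in_window)
    return {
        "n_rare": len(rare_positions),
        "densest_window": densest,
        "flag": len(rare_positions) > max_rare,
    }


# Toy prevalence table and sequence (values are not from the study).
toy_prevalence = {(12, "A"): 0.05, (90, "M"): 0.002}
toy_sequence = [(12, "A"), (40, "K"), (41, "R"), (43, "W"), (44, "C"), (90, "M")]
print(rare_variant_qc(toy_sequence, toy_prevalence))
```

Reporting the densest window separately from the total count keeps the two failure modes named in the text apart: scattered rare variants point toward sequencing or PCR error, whereas a tight cluster points toward a frameshift.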

Treatment-selected mutations.

We previously published an analysis of nonpolymorphic TSMs in PR and the first 350 positions of RT using an earlier data set containing sequences from approximately 25,000 individuals with known ARV treatment histories (18). In this article, we extended our analysis of nonpolymorphic TSMs to IN and to the entire RT. In addition, the numbers of PR and 5′ RT sequences from individuals with known treatment histories were nearly three times higher than in our previous analysis. We identified 111 nonpolymorphic PR TSMs: 26 new TSMs and 85 of the 88 previously identified TSMs. The novel PR TSMs are likely to be accessory drug resistance mutations because they nearly always occurred in combination with established PI resistance mutations. We identified 185 nonpolymorphic RT TSMs: 67 new TSMs and 118 of the 122 previously identified TSMs. The novel RT TSMs are likely to be accessory drug resistance mutations because they nearly always occurred in combination with established NRTI or NNRTI resistance mutations. Of the 185 RT TSMs, 95 were selected by NRTIs and 64 were selected by NNRTIs. For 26 RT TSMs, however, it was not possible to determine whether the mutations were primarily selected by NRTIs or NNRTIs because most of the individuals with these 26 TSMs received both NRTIs and NNRTIs. Several mutations in the connection and RNase H domains of RT have been shown to play an accessory role in reducing HIV-1 susceptibility in combination with thymidine analog mutations (TAMs), most likely by slowing the activity of RNase H and thereby allowing more time for TAM-mediated primer unblocking (31). However, only 11 TSMs were identified beyond position 300, including the NRTI-selected mutation A304G, the NNRTI-selected mutations Y318F, N348IT, and E404N, and the RTI-selected mutations E302D, E312G, I341F, Q394S, E399G, and Q547R. This is consistent with the much lower number of sequenced viruses extending beyond position 300 obtained from NRTI- and/or NNRTI-experienced individuals.

We identified 30 nonpolymorphic IN TSMs, including 23 established INSTI resistance mutations (H51Y, T66IAK, E92Q, Q95K, F121Y, E138KA, G140SAC, Y143RCHSG, S147G, Q148HRK, N155H, and S230R) and seven novel mutations not previously associated with INSTI resistance. Four of the novel mutations (E92A, E138T, Q148N, and N155D) were at positions also containing established INSTI resistance mutations. Three other mutations (V79I, P142T, and D253Y) were at novel positions. Eighty-two percent of the sequences containing one of these three novel nonpolymorphic TSMs had one or more established INSTI-associated DRMs. Four well-characterized accessory INSTI-associated DRMs (L74M, T97A, and G163R/K) were not identified because they were polymorphic in one or more subtypes (32). G118R and R263K, two other highly studied mutations (21, 33), were also not identified. G118R is extremely rare and was not present in a single plasma virus sequence. R263K was more common in INSTI-treated than INSTI-naive sequences (6/1,016 [0.59%] versus 8/6,558 [0.12%]), but this difference was not statistically significant after controlling for multiple comparisons.

Although practically all major drug resistance mutations are TSMs, the converse may not always be true. For example, many TSMs are accessory mutations that only arise in the presence of other drug resistance mutations. Other TSMs such as the T215 revertant mutations T215S/C/E/D/I/V have been shown to arise from drug resistance mutations (e.g., T215Y/F) when selective drug pressure is removed (34).
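The R263K result reported above can be reproduced approximately from the published counts. The following sketch assumes a Pearson chi-square test without continuity correction (the exact variant used in the study is not stated) and compares the resulting P value with the unadjusted threshold of 1.3 × 10^−5 given for IN in the Results. One expected cell count is below 5, so an exact test might be preferred in practice; the chi-square form is used here only to mirror the test named in the text.

```python
import math


def chi_square_p_value(with_a, total_a, with_b, total_b):
    """Pearson chi-square P value (1 df) comparing two proportions."""
    a, b = with_a, total_a - with_a
    c, d = with_b, total_b - with_b
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of a chi-square variable with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2.0))


# Counts from the text: R263K in 6 of 1,016 INSTI-treated and
# 8 of 6,558 INSTI-naive sequences.
p_r263k = chi_square_p_value(6, 1016, 8, 6558)
unadjusted_threshold = 1.3e-5  # IN threshold quoted in the Results

print(f"unadjusted P = {p_r263k:.2g}")
print("nominally significant (P < 0.05):", p_r263k < 0.05)
print("passes multiple-comparison threshold:", p_r263k < unadjusted_threshold)
```

Under these assumptions the unadjusted P value is on the order of 10^−3: below 0.05, but far above the threshold required after correction, which matches the description in the text.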

APOBEC.

We previously published an analysis of mutations indicative of APOBEC-mediated G-to-A editing that encompassed PR and the first 240 positions of RT (13). Our current analysis identified two new mutations in PR and one new mutation in the previously analyzed region of RT. Additionally, we identified 55 mutations between RT positions 241 and 560 and 71 mutations in IN that are also likely to result from APOBEC-mediated G-to-A editing. We then predicted that most sequences with two or more of these mutations were likely to have undergone G-to-A hypermutation. Identification of sequences with G-to-A hypermutation is important because the extent of hypermutation is usually incomplete and may not be uniformly distributed (13, 35, 36) and because several mutations known to emerge from selective drug pressure can also arise from G-to-A hypermutation, including D30N, M46I, and G73S in PR, D67N, E138K, M184I, G190SE, and M230I in RT, and E138K, G118R, and G163R in IN. As drug resistance testing in low- and middle-income countries will increasingly be performed using dried blood spots, which often contain proviral HIV-1 DNA (36–39), it will become necessary to determine if a sequence has evidence of G-to-A hypermutation to assess the clinical significance of the above drug resistance mutations. For example, the isolated presence of DRMs associated with G-to-A hypermutation would need to be judged differently if they occurred in a sequence containing an excess of the APOBEC-indicating mutations that we describe in this study.
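The rule described above (two or more APOBEC-indicating mutations suggest hypermutation, and certain DRMs then need cautious interpretation) can be sketched as follows. The list of DRMs that can also arise from G-to-A hypermutation is taken from the text. The signature set, however, is a small placeholder of tryptophan-to-stop changes chosen for illustration; it is not the study's list of 209 APOBEC-indicating mutations, and the `gene:mutation` string format and function name are assumptions.

```python
# PLACEHOLDER signature mutations (illustrative only, not the study's list).
APOBEC_SIGNATURE = {"PR:W42*", "RT:W88*", "RT:W212*", "IN:W132*"}

# DRMs named in the text as also arising from G-to-A hypermutation.
DRMS_ALSO_FROM_HYPERMUTATION = {
    "PR:D30N", "PR:M46I", "PR:G73S",
    "RT:D67N", "RT:E138K", "RT:M184I", "RT:G190S", "RT:G190E", "RT:M230I",
    "IN:E138K", "IN:G118R", "IN:G163R",
}


def assess_hypermutation(mutations, min_signature=2):
    """Return (likely_hypermutated, drms_to_interpret_cautiously).

    A sequence is flagged when it carries at least `min_signature`
    APOBEC-indicating mutations, following the two-or-more rule in the text.
    """
    observed = set(mutations)
    n_signature = len(observed & APOBEC_SIGNATURE)
    likely_hypermutated = n_signature >= min_signature
    suspect_drms = (
        sorted(observed & DRMS_ALSO_FROM_HYPERMUTATION)
        if likely_hypermutated
        else []
    )
    return likely_hypermutated, suspect_drms


print(assess_hypermutation(["RT:W88*", "RT:W212*", "RT:M184I", "RT:K103N"]))
print(assess_hypermutation(["RT:M184I", "RT:K103N"]))
```

In the first call M184I is returned as suspect because the sequence also carries two signature mutations; in the second call the same DRM is left alone. K103N is never flagged because it is not among the DRMs the text lists as arising from G-to-A changes.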

Conclusions.

This study of HIV-1 PR, RT, and IN variability makes it possible to apportion amino acid variants into the following categories: (i) established variants that may or may not be a nonpolymorphic TSM, (ii) APOBEC-associated mutations, and (iii) very rare variants of questionable validity or replication potential. Determination of whether a particular sequence contains an excess of APOBEC-associated mutations or of very rare amino acid variants can be used to optimally determine the significance of other mutations present in that sequence, particularly when that sequence is generated using technologies associated with greater sequencing artifacts, as occurs with the use of samples likely to be enriched for proviral DNA or with NGS deep sequencing. As the number of sequences for IN and the 3′ part of RT was approximately 10-fold lower than those for PR and the 5′ part of RT and as subtype B was overrepresented in our data set, we will update our estimates of the prevalence of each mutation at each position as additional sequence data become available.
