
Pattern of the evolution of HIV-1 env gene in Côte d'Ivoire.

Sery Gonedelé Bi¹, Didier P Sokouri², Kouakou Tiékoura², Oulo Alla NNan², Marcel Lolo², Félix Gnangbé², Assanvo Sp NGuetta².

Abstract

Côte d'Ivoire continues to have the highest HIV-1 prevalence rate in West Africa, although the number of infections is in constant decline. The external envelope protein of the virus is a likely site of selection and is responsible for receptor binding and entry into host cells; it therefore constitutes an ideal region in which to investigate the evolutionary processes acting on HIV-1. In this study, we analyse 189 envelope glycoprotein V3 loop region sequences of virus isolates collected from 1995 to 2009 from untreated HIV-1 patients living in Côte d'Ivoire, to decipher the temporal relationship between disease diversity, divergence and selection. Our analyses show that the ratio of nonsynonymous to synonymous substitutions (dN/dS) was lower than 1 for the viral populations analysed within each of the 15 years, which suggests that the sequences did not undergo adequate immune pressure. The phylogenetic tree of the sequences analysed showed distinctly long internal branches and short external branches, suggesting that only a small number of viruses infected the new host cell at each transmission. In addition to identifying sites under purifying selection, we also identified neutral sites that can cause false positive inference of selection. The sites presented here form a resource for future studies of selection pressures acting on the HIV-1 env gene in Côte d'Ivoire and other West African countries.


Keywords: Côte d'Ivoire; HIV-1; diversity; env gene; selection

Year: 2014 PMID: 25512682 PMCID: PMC4261110 DOI: 10.6026/97320630010671

Source DB: PubMed Journal: Bioinformation ISSN: 0973-2063

Background

Since the first AIDS case was detected in Côte d'Ivoire in 1985, the number of infections has been in constant decline, with a current estimated prevalence of 3.7%. Despite this constant decrease, Côte d'Ivoire continues to have the highest HIV-1 prevalence rate in West Africa, and 60% of HIV-infected patients are women, most of them of childbearing age [1]. Based on partial polymerase (pol) and/or envelope (env) sequences, a high prevalence of the circulating recombinant form CRF02_AG (82%) and the cocirculation of subtype A (5%), CRF01_AE (1%), CRF06_cpx (4%), and complex intersubtype recombinants (11%) have been documented in Côte d'Ivoire [2]. One important feature of HIV-1 infection is the diversification and evolution of the viral genome over the course of infection. Of all the protein-encoding genes, the most variable is the env gene. It encodes the envelope proteins associated with the host cell-HIV interaction [3]. Nevertheless, due to their functional relevance, several amino acid residues are extremely conserved among HIV-1 variants. Changes in these highly conserved residues provide an interesting case study to test whether selective pressure was altered with the substitution. The external envelope protein is a likely site of selection, being targeted by the patient's antibody response [4] and responsible for receptor binding and entry into host cells [5], and therefore constitutes an ideal region in which to investigate the evolutionary processes acting on HIV-1. The long-term fate of these abundant genetic changes depends on the interplay of effective population size and natural selection, resulting in an extremely high rate of HIV genomic evolution [6]. Population-level processes such as selection, migration, population dynamics and recombination shape HIV genetic diversity both among and within hosts [6].
The ratio of nonsynonymous to synonymous substitution rates has proved useful in investigating molecular adaptation; however, changes in the absolute rates of nonsynonymous and synonymous substitution should provide greater insight [7]. Changes in synonymous substitution rates can reflect changes in generation time or mutation rate, while nonsynonymous rates can also be affected by changes in selective pressure and effective population size. Previous studies of HIV evolution have typically assumed that the rate of neutral or synonymous change (per month or year) is approximately constant among patients [8]. Differences in the mutational profile among HIV subtypes have been reported [8]. Such high viral genetic diversity among subtypes is involved in differences in the rate of disease progression and in the response to antiretroviral therapy, including the development of resistance [9]. It is therefore crucial to acquire further knowledge about the real significance of these differences, as it may be important for determining strategies of initial treatment for infected individuals. Studying the evolutionary relationships of HIV-1 and characterizing the distinct adaptation patterns in the parts of the HIV-1 genome that interact with the immune system will be key to elucidating how HIV-1 overwhelms the immune system and leads to AIDS [10]. In this study, we present sequence analyses of the envelope glycoprotein V3 loop region of virus isolates from untreated HIV-1 patients living in Côte d'Ivoire, to decipher the temporal relationship between diversity, divergence and selection in the HIV-1 envelope gene. Understanding the processes that determine viral genetic diversity will undoubtedly assist in the struggle against viral infections and will contribute to our knowledge of past epidemiologic events in Côte d'Ivoire.

Methodology

Data sets compilation:

All HIV-1 sequences classified as subtype A derived from Côte d'Ivoire were downloaded from the Los Alamos National Laboratory and GenBank databases. Pseudogenes (as noted in GenBank), clones and sequences shorter than 250 bp were excluded from the following analyses. The sequences included in this work were from individuals in the asymptomatic phase of infection who were naïve to drug therapy. The description of the data sets and the GenBank accession number of each sequence are summarized as supplementary material. The final set included the 189 subtype A sequences described and four non-A sequences (subtype B), which were used as outgroups. Eighteen (18) sequences originating from other African countries were also included for phylogenetic comparison: 4 from Mali, 8 from Senegal, and 6 from the Democratic Republic of the Congo. The sequences were first aligned using the ClustalX program [11]. All sites with deletions and insertions were then excluded in order to preserve the reading frames of the genes. The final alignment was 406 bp long and is presented as supplementary information.

Phylogenetic inference:

For the maximum likelihood analysis of selection pressures, phylogenetic trees were constructed. We first determined the most appropriate model of nucleotide substitution for each data set using the program jModeltest 2.1 [12]. The GTR+G and GTR+I+G models were suggested to have the better likelihood scores. We then reconstructed the phylogenetic tree using the ML method under the GTR+G and GTR+I+G models. We used Markov chain Monte Carlo (MCMC) methods as implemented in BEAST 1.7 to obtain a posterior distribution of trees under an uncorrelated relaxed clock [13]. In order to assess confidence in each of the internal nodes of the constructed phylogeny, a bootstrap resampling (1,000 replicates) of the data using the neighbor-joining method based on maximum likelihood distances was performed with FigTree [14]. To investigate the change in diversity, we inferred between-host mean diversities from 1995 to 2009 using the nucleotide diversity, π, implemented in MEGA 6 [15], again under the GTR+G model. For the data setup, the viral sequences obtained from the same year were grouped as one subpopulation. The within-year diversity was then calculated as the Mean Diversity within Subpopulations, whereas the between-year diversity was calculated as the Mean Interpopulational Diversity.
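As a hedged illustration of the within-year and between-year diversity calculation described above, the following Python sketch groups sequences by sampling year and averages pairwise distances. It is a simplification: it uses the uncorrected p-distance rather than the GTR+G distances computed in MEGA 6, and the function names and toy sequences are hypothetical, not taken from the study.

```python
from itertools import combinations, product
from statistics import mean


def p_distance(seq_a, seq_b):
    """Proportion of differing sites, ignoring gaps and ambiguous bases."""
    pairs = [(x, y) for x, y in zip(seq_a.upper(), seq_b.upper())
             if x in "ACGT" and y in "ACGT"]
    return sum(x != y for x, y in pairs) / len(pairs)


def within_group_diversity(seqs):
    """Mean pairwise distance among sequences sampled in the same year."""
    return mean(p_distance(a, b) for a, b in combinations(seqs, 2))


def between_group_diversity(seqs_a, seqs_b):
    """Mean pairwise distance between sequences from two different years."""
    return mean(p_distance(a, b) for a, b in product(seqs_a, seqs_b))


# Toy aligned sequences keyed by year (illustrative only).
by_year = {
    1995: ["AAAA", "AAAT"],
    2009: ["AATT", "AATT"],
}
within_1995 = within_group_diversity(by_year[1995])
between = between_group_diversity(by_year[1995], by_year[2009])
print(within_1995, between)
```

Each year needs at least two sequences for the within-year value to be defined; a model-corrected distance could be substituted for `p_distance` without changing the grouping logic.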

Analysis of selective pressures:

Codon models of coding sequence evolution were used to detect positive selection operating on the HIV-1 env gene. In particular, we were interested in differences in positive selection pressure on the virus from 1995 to 2009. Selective pressures were analyzed using two distinct approaches that estimate the number of nonsynonymous (dN) and synonymous (dS) substitutions at all sites in the sequence alignments. These approaches compare the fit to the data of various models of codon evolution, which differ in the distribution of the nonsynonymous to synonymous ratio (dN/dS) among sites, and take into account the phylogenetic relationships of the sequences. HyPhy software [16] was used to generate simulated data under a neutral model with trees generated from the original alignments. The same sequence alignments used as input in the initial analysis were used, and one hundred simulated datasets were generated for each alignment. Each simulated dataset was then analyzed using the Dual Model as described above. The minimum value of mean dS across all sliding windows of three adjacent codons, in all of the one hundred simulated datasets, was used as a conservative threshold to identify windows of reduced dS in the observed data. This stringent threshold and a less stringent one that included 95% of the values inferred from the simulated data are shown in the sliding window plots.
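The thresholding step above can be sketched in Python as follows. This is a minimal sketch, assuming that per-codon dS estimates for the observed and simulated alignments are already available as lists of numbers (for example, exported from HyPhy); the function names are hypothetical, and the 95% cutoff is taken here as a simple lower-tail rank, which may differ from the exact rule used in the study.

```python
def sliding_means(values, window=3):
    """Mean of each run of `window` adjacent per-codon values."""
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]


def ds_thresholds(simulated, window=3, keep=0.95):
    """Return (stringent, lenient) cutoffs from simulated per-codon dS lists.

    stringent: minimum window mean over all simulated datasets.
    lenient:   value below which roughly (1 - keep) of window means fall.
    """
    pooled = sorted(m for ds in simulated for m in sliding_means(ds, window))
    rank = int((1 - keep) * len(pooled))
    return pooled[0], pooled[rank]


def reduced_ds_windows(observed, cutoff, window=3):
    """Midpoint codon index of every observed window whose mean dS < cutoff."""
    return [i + window // 2
            for i, m in enumerate(sliding_means(observed, window))
            if m < cutoff]


# Toy numbers (illustrative only, not values from the study).
simulated = [[1, 1, 1, 1], [2, 2, 2, 2]]
observed = [0.5, 0.5, 0.5, 3, 3]
stringent, lenient = ds_thresholds(simulated)
print(reduced_ds_windows(observed, stringent))
```

Reporting window midpoints keeps the output aligned with a sliding-window plot, where each window is drawn at its central codon.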

Codon usage analysis:

The Relative Synonymous Codon Usage (RSCU) values were calculated for the dataset. The RSCU statistic is calculated by dividing the observed usage of a codon by the usage expected if all synonymous codons for that amino acid were used equally frequently. Thus an RSCU of 1 indicates that a codon is used as expected under random usage, RSCU > 1 indicates a codon used more frequently than expected randomly, and RSCU < 1 indicates a codon used less frequently than expected randomly. RSCU analysis was conducted using MEGA 6 software [15]. Rare codons were computed with the improved implementation in DAMBE [17].
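To make the RSCU definition above concrete, the following Python sketch computes RSCU values from in-frame coding sequences using the standard genetic code. It is an illustration only: the study used MEGA 6, the function name `rscu` is hypothetical, and the input sequence is a toy example.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rscu(sequences):
    """RSCU per codon: observed count / count expected under equal synonymous use."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper().replace("U", "T")
        for i in range(0, len(seq) - len(seq) % 3, 3):
            codon = seq[i:i + 3]
            if codon in CODON_TABLE:  # skips codons with gaps or ambiguities
                counts[codon] += 1

    # Group sense codons into synonymous families by amino acid.
    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            families[aa].append(codon)

    values = {}
    for codons in families.values():
        total = sum(counts[c] for c in codons)
        if total == 0:
            continue  # amino acid absent from the data
        expected = total / len(codons)
        for c in codons:
            values[c] = counts[c] / expected
    return values


# Toy input: four alanine codons, three GCT and one GCC.
values = rscu(["GCTGCTGCTGCC"])
print(values["GCT"], values["GCC"], values["GCA"])
```

Within each amino acid family the RSCU values sum to the number of synonymous codons, so a value of 1 corresponds to unbiased usage.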

Results

Evolutionary analysis:

The nucleotide frequencies are 45.59% (A), 26.29% (T/U), 10.41% (C), and 17.71% (G). The transition/transversion rate ratios are k1 = 1.729 (purines) and k2 = 8.445 (pyrimidines). The overall transition/transversion bias is R = 1.596. The analysis involved 189 nucleotide sequences. Codon positions included were 1st+2nd+3rd+Noncoding. All ambiguous positions were removed for each sequence pair. There were a total of 82 positions in the final dataset.
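The reported overall bias can be cross-checked against the base frequencies and the two rate ratios. The sketch below assumes the Tamura-Nei style expression R = (A·G·k1 + T·C·k2) / ((A + G)·(T + C)); this is an assumption about how R was derived, and the function name is hypothetical.

```python
def overall_ts_tv_bias(freq, k1, k2):
    """Overall transition/transversion bias R from base frequencies and rate ratios.

    k1 is the purine (A<->G) and k2 the pyrimidine (C<->T) transition/transversion
    rate ratio; freq maps each base to its frequency as a fraction.
    """
    a, t, c, g = freq["A"], freq["T"], freq["C"], freq["G"]
    return (a * g * k1 + t * c * k2) / ((a + g) * (t + c))


# Values reported in the paragraph above, converted from percentages.
reported = {"A": 0.4559, "T": 0.2629, "C": 0.1041, "G": 0.1771}
R = overall_ts_tv_bias(reported, k1=1.729, k2=8.445)
print(round(R, 3))
```

Under this assumption the result agrees with the reported R = 1.596 to three decimal places, which suggests that the frequencies and rate ratios given above are mutually consistent.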

Selection analyses:

A faster increase of dS relative to dN was detected through time, and the dN/dS ratio in the env gene declined because dN increased more slowly through time than dS (Table 1; see supplementary material). The dN/dS ratio of the env region fluctuated but remained below 1. For the fragment of the env region analysed, the dN/dS ratio was lower than 1 for the viral populations analysed within each of the 15 years, which showed the lowest levels of divergence (Figure 1). A low dN/dS ratio indicates that the sequences did not undergo adequate immune pressure to lead to changes in amino acids. The dN/dS ratios of the years 1997 and 2008 showed a significant difference (P < 0.022) compared to the other years, except for the years 2001 and 2006. The years 1997 and 2008 did not differ significantly from each other (P = 0.215). The synonymous substitution rate was always significantly higher (P = 0.001, Student's t-test) than the nonsynonymous substitution rate.

Figure 1

Nonsynonymous to synonymous ratio (dN/dS) plot along the HIV-1 env gene. Interhost overall dN/dS ratios at all sites within the env gene fragments from HIV-1-infected individuals. Box plot showing the mean, standard error and 95% confidence interval for dN/dS ratios obtained for each codon of the HIV-1 alignments. Statistical significance was determined using the Kruskal–Wallis test.

Phylogenetic tree:

In the Maximum Likelihood (ML) tree, viral sequences from the same year or from other West African countries do not form a distinct cluster (Figure 2). No significant change in sequence diversity was found after nearly two decades of evolution. The reconstructed phylogenetic tree of 211 sequences showed distinctly long internal branches and short external branches, suggesting that only a small number of viruses infected the new host cell at each transmission, so that these founder viruses are usually quite different among hosts. These results are compatible with a severe bottleneck at each new infection. The topology of the tree is notable in that the sequences sampled at different times are evenly distributed among the terminal branches (Figure 2). This suggests that most of these mutations have occurred independently and have not been transmitted for sustained periods of time. Sequences did not cluster according to year or compartment. Using this model of evolution, the neighbor-joining tree for the entire data set shows that sequences cluster predominantly by host individual. Furthermore, no sequences clustered strongly (bootstrap values 50) with known laboratory strains of HIV subtype B, indicating no evidence of recombination.

Figure 2

Radial phylogenetic tree reconstructed by the maximum likelihood method based on a fragment of the HIV-1 subtype A env V3 loop sequence isolated from untreated HIV-1 patients in Côte d'Ivoire from 1995 to 2009. The tree was generated using the GTR + I + G model of nucleotide substitution. Sequences were named according to their accession number and year of isolation.

Phylogenetic Signal and Informativeness:

The level of substitution saturation in the env gene was measured by comparing the numbers of transitions and transversions with the genetic distance for each pair of sequences (Figure 3). The analysis of these sequences showed that the number of substitutions increased with increasing genetic distance. Only the number of transversions within the env gene showed a weaker tendency to increase. However, none of the plots took the form of a plateau, which is typical of saturation with substitutions.
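The pairwise counts behind such a saturation plot can be sketched in Python as below. This is an illustrative simplification: it pairs raw transition and transversion counts with the uncorrected p-distance, whereas the study may have used a model-corrected distance, and the function names and toy sequences are hypothetical.

```python
from itertools import combinations

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def ts_tv_counts(seq_a, seq_b):
    """Return (transitions, transversions, compared sites) for two aligned sequences."""
    ts = tv = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous bases
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            ts += 1  # purine<->purine or pyrimidine<->pyrimidine
        else:
            tv += 1  # purine<->pyrimidine
    return ts, tv, compared


def saturation_points(sequences):
    """One (p-distance, transitions, transversions) tuple per sequence pair."""
    points = []
    for seq_a, seq_b in combinations(sequences, 2):
        ts, tv, n = ts_tv_counts(seq_a, seq_b)
        points.append(((ts + tv) / n, ts, tv))
    return points


# Toy aligned sequences (illustrative only).
example = ts_tv_counts("AACCGGTT", "GACTGCTA")
points = saturation_points(["AACCGGTT", "GACTGCTA", "AACCGGTT"])
print(example, points)
```

Plotting transitions and transversions against distance for all pairs and checking that neither curve flattens corresponds to the absence of a plateau described above.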

Figure 3

Transitions and transversions versus genetic divergence plots of the env gene fragment isolated from HIV-1-infected individuals in Côte d'Ivoire (s: transitions, v: transversions). The estimated number of transitions and transversions for each pairwise comparison was plotted against the genetic distance.

Evolution of :

Mean nucleotide distances per year for the env-V3 region of the viral genome of strains A1 are shown in Table 1. Overall nucleotide distances rise slowly over the years. This rise is mostly accounted for by an increase in synonymous substitutions, while nonsynonymous nucleotide distances remain more constant throughout the period investigated. Relative synonymous codon usage (RSCU) patterns support the attenuation hypothesis as well (Table 2; see supplementary material). The codons with the greatest rate of positive change over time were UUU, UUA, CUG, AUA, GUA, GUG, UCU, UCA, CCA, ACU, ACA, GCU, GCC, GCA, UAC, CAU, AAU, GAC, GAA, AGU, AGC, AGA, AGG, GGC and GGG. These changes are due to simple transition mutations, possibly associated with some mutational bias. The variation in RSCU values not only indicated the different frequency of occurrence of each codon for a given amino acid in different proteins but also revealed the preference for either A + U or G + C codon usage, as listed in Table 2. Preferential codon usage in the portion of the env gene analysed indicates that codons with A or U at the third position are preferred over G- or C-ending codons.

Codon Based Analysis:

To examine how the variation of codon usage patterns over time is reflected in the usage of individual codons in HIV-1 env genes in Côte d'Ivoire, the normalized frequency of each codon in each sequence was compared between the years 1995 and 2009. A graph of the codon frequency distribution was plotted to identify the number of rare codons present in each sequence (Figure 4). A codon usage frequency of 100 indicates that the codon is highly used for a given amino acid. Conversely, a codon with a usage frequency of less than 5 is classed as a low-frequency codon (blue bars), which is likely to affect expression efficiency. The low-frequency codons are ACG, ACC, AUC, AGA, CCG, CGA, CCC, CGG, CGC, CGU, CUC, GAU, GCG, GGU, GUC, GUU, UAU, UCC, UCG, UGC, UGG, UGU, UUC and UUG. This result suggests that the env gene fragment analysed contains a large number of rare codons that may reduce the translational efficiency of the gene. We detected fewer nonsynonymous substitutions than expected by chance, with dN/dS < 1. Taken together, these results indicate that these regions are subject to very strong purifying selection. The location of the midpoints of the windows showing negative selection is given in Figure 5. On average, we detected 68% of codon sites under negative selection and 28% neutral sites.

Figure 4

Graph showing the relative codon frequency of a portion of the HIV-1 subtype A env gene fragment isolated from HIV-1-infected individuals in Côte d'Ivoire. The blue and black bars indicate codons that are used less than 5% and more than 5%, respectively.

Figure 5

Sliding-window analysis of the cumulative dN/dS across Bayes factor for the event of negative selection at a site along the env gene fragment isolated from HIV-1-infected individuals in Côte d'Ivoire.

Discussion

In this study, we contrast the changes in genetic diversity and adaptive evolution of the HIV-1 env gene between samples collected over fifteen years. Since HIV-1 is an obligate pathogen that depends on its human host for replication and assembly, codon usage bias, which affects translational efficiency, is likely to be subject to host selection pressure [18]. Thus, codon usage bias can play a significant role in the host adaptation of HIV-1. For Côte d'Ivoire, no study has examined the env genes at the global scale over a long time period to address this issue. The env sequences analysed indicated that substitution saturation has not been reached, so the data can be expected to provide a reliable phylogenetic signal. No significant change in between-year sequence diversity was found after more than a decade of evolution. The reconstructed phylogenetic tree of 211 sequences showed distinctly short internal branches and long external branches, suggesting that a large number of viruses infected the new host cell at each transmission. Moreover, the viruses that successfully infected new host cells are not under strong selective pressure from the host immune system, which does not limit between-host diversification, as indicated by the large clusters on the tree. The selective pressure does not vary significantly between the early and the recent samples. These samples most likely carry viruses representing transmissions between populations with different genetic backgrounds. This is supported by the intermixing of Ivorian strains with those isolated from other African countries (Senegal, Mali, DRC). Since Côte d'Ivoire has for decades been the most important destination for migrants in West Africa, the exchange of HIV-1 gene pools between the populations of Côte d'Ivoire and those of the neighboring countries may increasingly affect the diversity of HIV-1 gene pools in Côte d'Ivoire.
The data indicate a maximum intragenotypic subtype A distance of 7.3%, lower than that reported by Janssens et al. [19], who observed a maximum intragenotypic subtype A distance of 14.1% in their limited number of samples collected during 1990–1991 in Abidjan. It is likely that the intragenotypic distance obtained by these authors is skewed because so few years were analysed. The low intragenotypic distance obtained from our data is supported by a large number of silent (synonymous) mutations that cause no change in the amino acid sequence. The virus sequences were remarkably well conserved at the amino acid level, both within and among different individuals.

Selective pressure:

A low dN/dS ratio indicates that the sequences did not undergo adequate immune pressure to lead to changes in amino acids, and hence reflects the lower variability in the env gene when compared. The analyses indicate that the env sequences analysed are subject to purifying selection overall and that the derived proteins are not subject to positive selection favoring diversity at the amino acid level but actually tend to be evolutionarily conserved. Since the continuity of the various patients analysed over time is not known, the changes described may not reflect immune pressure. Indeed, purifying immune selection dominates the evolution of HIV within hosts, but evolution between hosts is largely decoupled from within-host evolution [20]. The ratio of 0.701 found in our study is lower than the ratio of 0.90 found by Yamaguchi-Kabata and Gojobori [21], and higher than that of 0.68 reported by Brown & Monaghan [22]. Although we did not analyze the four variable regions, where insertions, deletions and partial duplications might be very frequent, we think that the ratio in this study is realistic. We are aware that changes in the strength of the immune response may not result in predictable changes in the dN/dS ratio if the selection coefficient is on the same order of magnitude as the effective population size, in which case the ratio provides only a little information about the status of the immune system [23].

Codon usage bias:

Although we are studying changes in codon usage patterns over more than a decade, the data were not collected throughout that time for each host analysed. Hence, our results may represent the outcome of additional, and maybe even contradictory, selective forces (e.g., the effect of antiretroviral therapies). Such a scenario can also give rise to results similar to those of this study. In their study of codons, Meintjes and Rodrigo [24] found that the early env sequences displayed a very biased codon usage pattern, in which many codons occurred at very low frequency and the preferred codons were used at a very high frequency.

Conclusion

This study is the first to examine the selective pressures that have governed the evolution of the subtypes of HIV-1 in Côte d'Ivoire, the most affected country in West Africa. No significant change in HIV-1 env gene sequence diversity was found over more than a decade of evolution. We detected fewer nonsynonymous substitutions than expected by chance, indicating that the sequences analyzed are subject to very strong purifying selection. In addition to identifying sites under purifying selection, we also identified neutral sites that can cause false positive inference of selection. The sites presented here form a resource for future studies of selection pressures acting on the HIV-1 env gene in Côte d'Ivoire and other West African countries.

23 in total

Review 1. HIV evolutionary dynamics within and among hosts.

Authors: Philippe Lemey; Andrew Rambaut; Oliver G Pybus
Journal: AIDS Rev Date: 2006 Jul-Sep Impact factor: 2.500

2. MEGA6: Molecular Evolutionary Genetics Analysis version 6.0.

Authors: Koichiro Tamura; Glen Stecher; Daniel Peterson; Alan Filipski; Sudhir Kumar
Journal: Mol Biol Evol Date: 2013-10-16 Impact factor: 16.240

3. The CLUSTAL_X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools.

Authors: J D Thompson; T J Gibson; F Plewniak; F Jeanmougin; D G Higgins
Journal: Nucleic Acids Res Date: 1997-12-15 Impact factor: 16.971

4. Likelihood models for detecting positively selected amino acid sites and applications to the HIV-1 envelope gene.

Authors: R Nielsen; Z Yang
Journal: Genetics Date: 1998-03 Impact factor: 4.562

5. Evolution of the structural proteins of human immunodeficiency virus: selective constraints on nucleotide substitution.

Authors: A L Brown; P Monaghan
Journal: AIDS Res Hum Retroviruses Date: 1988-12 Impact factor: 2.205

Review 6. Identifying and characterizing recently transmitted viruses.

Authors: Brandon F Keele
Journal: Curr Opin HIV AIDS Date: 2010-07 Impact factor: 4.283

7. Towards a structure of the HIV-1 envelope glycoprotein gp120: an immunochemical approach.

Authors: J P Moore; B A Jameson; Q J Sattentau; R Willey; J Sodroski
Journal: Philos Trans R Soc Lond B Biol Sci Date: 1993-10-29 Impact factor: 6.237

8. Rapid scaling-up of antiretroviral therapy in 10,000 adults in Côte d'Ivoire: 2-year outcomes and determinants.

Authors: Siaka Toure; Bertin Kouadio; Catherine Seyler; Moussa Traore; Nicole Dakoury-Dogbo; Julien Duvignac; Nafissatou Diakite; Sophie Karcher; Christophe Grundmann; Richard Marlink; François Dabis; Xavier Anglaret
Journal: AIDS Date: 2008-04-23 Impact factor: 4.177

9. Bayesian phylogenetics with BEAUti and the BEAST 1.7.

Authors: Alexei J Drummond; Marc A Suchard; Dong Xie; Andrew Rambaut
Journal: Mol Biol Evol Date: 2012-02-25 Impact factor: 16.240

10. Differential trends in the codon usage patterns in HIV-1 genes.

Authors: Aridaman Pandit; Somdatta Sinha
Journal: PLoS One Date: 2011-12-22 Impact factor: 3.240