Abstract
A likely key factor in the failure of an HIV-1 vaccine based on cytotoxic T lymphocytes (CTL) is the natural immunodominance of epitopes that fall in variable regions of the proteome, which both increases the chance of epitope sequence mismatch with the incoming challenge strain and replicates the pathogenesis of early CTL failure due to epitope escape mutation during natural infection. To identify potential vaccine sequences to focus the CTL response on highly conserved epitopes, the whole proteomes of HIV-1 clades A1, B, C, and D were assessed for Shannon entropy at each amino acid position. Highly conserved regions in Gag (cGag-1, Gag 148-214, and cGag-2, Gag 253-331), Env (cEnv, Env 521-606), and Nef (cNef, Nef 106-148) were identified across clades. Inter- and intra-clade variability of amino acids within the regions tended to overlap, suggesting that polyvalent representation of consensus sequences for the four clades would allow broad HIV-1 strain representation. These four conserved regions were rich in both known and predicted CTL epitopes presented by a breadth of HLA types, and screening of 54 persons with chronic HIV-1 infection revealed that these regions are commonly immunogenic in the context of natural infection. These data suggest that vaccine delivery of a 16-valent mixture of these regions could focus the CTL response against conserved epitopes that are broadly representative of circulating HIV-1 strains.
Year: 2009 PMID: 19812689 PMCID: PMC2753653 DOI: 10.1371/journal.pone.0007388
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Figure 1. Shannon entropy across the HIV-1 clade B proteome.
All full length clade B protein sequences in the LANL HIV Sequence Database were aligned and assessed for Shannon entropy of each amino acid position. The red bars indicate entropy at each codon, and the heavy black lines plot mean entropy for the nine amino acids starting at each position. The shaded regions indicate relatively conserved regions that were examined further. Note that the numbering of the conserved regions (based on HXB2 position) does not necessarily match amino acid positions in the graphs (due to insertions in some sequences relative to HXB2).
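The per-position entropy and nine-residue running mean described in this caption can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names are hypothetical, the logarithm base (2, giving bits) is an assumption because the caption does not state it, and alignment gaps are simply counted as one more symbol.

```python
import math
from collections import Counter

def column_entropy(column):
    """Shannon entropy of one alignment column (log base 2 is an assumption)."""
    counts = Counter(column)
    total = len(column)
    # p * log2(1/p) summed over the residues observed in this column
    return sum((n / total) * math.log2(total / n) for n in counts.values())

def positional_entropies(aligned_seqs):
    """Entropy at each position of equal-length, pre-aligned sequences."""
    return [column_entropy(col) for col in zip(*aligned_seqs)]

def windowed_mean(values, window=9):
    """Mean of `window` consecutive values starting at each position."""
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]

# Toy alignment (hypothetical data): two invariant columns, one fully variable.
toy = ["AAA", "AAC", "AAG", "AAT"]
per_position = positional_entropies(toy)
smoothed = windowed_mean(per_position, window=2)
```

A position with a single residue across all sequences scores 0, and the score rises as residues become more evenly mixed, which is why low running means mark the conserved stretches.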
Figure 2. Entropy of 9 amino acid stretches for clades A1-D of HIV-1.
Shannon entropy was assessed for full length protein sequences of clades A1, C, and D available from the LANL HIV Sequence Database, as in Figure 1. For each clade, the mean entropies of nine amino acid stretches are plotted for the proteins containing the relatively conserved regions (shaded) identified in Figure 1. Note that the numbering of the conserved regions (based on HXB2 position) does not necessarily match amino acid positions in the graphs (due to insertions in some sequences relative to HXB2).
Figure 3. Sequences of relatively conserved regions of the HIV-1 proteome.
The relatively conserved regions shaded in Figures 1 and 2 are given for clades A1, B, C, and D, aligned against the overall consensus across all group M clades. “-” indicates amino acid identity with overall consensus.
Figure 4. Shannon entropy in the conserved region cGag-1.
For HIV-1 Gag amino acids 148–214, Shannon entropies of individual codons (red bars) and average entropies of stretches of 9 codons (black lines) are plotted for clades A1–D. The shaded columns indicate positions where the consensus sequences differ between clades. For each of those positions, the consensus amino acid and most common variant (in parentheses) are indicated.
Figure 5. Shannon entropy in the conserved region cGag-2.
For HIV-1 Gag amino acids 250–335, Shannon entropies of individual codons (red bars) and average entropies of stretches of 9 codons (black lines) are plotted for clades A1–D. The shaded columns indicate positions where the consensus sequences differ between clades. For each of those positions, the consensus amino acid and most common variant (in parentheses) are indicated.
Figure 6. Shannon entropy in the conserved region cEnv.
For HIV-1 Env amino acids 521–606, Shannon entropies of individual codons (red bars) and average entropies of stretches of 9 codons (black lines) are plotted for clades A1–D. The shaded columns indicate positions where the consensus sequences differ between clades. For each of those positions, the consensus amino acid and most common variant (in parentheses) are indicated.
Figure 7. Shannon entropy in the conserved region cNef.
For HIV-1 Nef amino acids 106–148, Shannon entropies of individual codons (red bars) and average entropies of stretches of 9 codons (black lines) are plotted for clades A1–D. The shaded columns indicate positions where the consensus sequences differ between clades. For each of those positions, the consensus amino acid and most common variant (in parentheses) are indicated.
Varying consensus sequences and common polymorphisms in cGag-1.
| | 159 | 173 | 186 | 190 | 203 |
| | | | | | |
| Consensus | | | | | |
| Polymorphism #1 | V (9.6%) | T (7.2%) | T (8.9%) | V (8.8%) | E (4.8%) |
| Other | T (2.4%) | | | | |
| | | | | | |
| Consensus | | | | | |
| Polymorphism #1 | I (20.5%) | T (14.8%) | L (0.4%) | A (0.3%) | D (3.5%) |
| | | | | | |
| Consensus | | | | | |
| Polymorphism #1 | V (9.1%) | S (3.4%) | S (3.2%) | A (1.4%) | E (1.0%) |
| | | | | | |
| Consensus | | | | | |
| Polymorphism #1 | V (3.8%) | T (30.4%) | S (6.4%) | A (2.5%) | D (3.8%) |
At each position in cGag-1 (Gag 148–214) where consensus amino acids differ between clades, the frequency of the consensus amino acid and the most common polymorphism(s) within each clade are listed (based on all available whole protein sequences in the LANL HIV Sequence Database).
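The consensus residue and most common polymorphism frequencies reported in these tables can be tallied along the following lines. This is a sketch under stated assumptions: the function name and the toy column are hypothetical, `position` is a 0-based index into an already aligned set of sequences (not HXB2 numbering), and ties between equally frequent residues are broken by first occurrence.

```python
from collections import Counter

def consensus_and_variants(aligned_seqs, position, top_variants=1):
    """Return (consensus residue, its frequency, [(variant, frequency), ...])."""
    column = [seq[position] for seq in aligned_seqs]
    total = len(column)
    ranked = Counter(column).most_common()  # most frequent residue first
    consensus, consensus_count = ranked[0]
    variants = [(residue, count / total)
                for residue, count in ranked[1:1 + top_variants]]
    return consensus, consensus_count / total, variants

# Toy single-column alignment (hypothetical data): 6 x V, 3 x I, 1 x T.
toy_column = ["V"] * 6 + ["I"] * 3 + ["T"]
consensus, freq, variants = consensus_and_variants(toy_column, position=0)
```

Raising `top_variants` returns additional residues, corresponding to the "Other" rows in the tables.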
Varying consensus sequences and common polymorphisms in cGag-2.
| | 260 | 280 | 286 | 301 | 310 | 312 | 319 |
| | | | | | | | |
| Consensus | | | | | | | |
| Polymorphism #1 | E (10.5%) | T (1.6%) | R (21.0%) | Y (0.8%) | S (12.9%) | D (32.3%) | D (19.4%) |
| | | | | | | | |
| Consensus | | | | | | | |
| Polymorphism #1 | D (7.4%) | V (18.0%) | K (24.7%) | - | T (8.9%) | D (30.5%) | K (0.2%) |
| Other | S (3.6%) I (3.4%) | | | | | | |
| | | | | | | | |
| Consensus | | | | | | | |
| Polymorphism #1 | E (31.0%) | T (1.7%) | R (39.6%) | Y (0.2%) | S (3.1%) | E (29.6%) | E (18.8%) |
| | | | | | | | |
| Consensus | | | | | | | |
| Polymorphism #1 | D (7.6%) | I (1.3%) | K (16.5%) | L (1.3%) | T (8.9%) | E (26.6%) | D (2.5%) |
At each position in cGag-2 (Gag 250–335) where consensus amino acids differ between clades, the frequency of the consensus amino acid and the most common polymorphism(s) within each clade are listed (based on all available whole protein sequences in the LANL HIV Sequence Database).
Varying consensus sequences and common polymorphisms in cEnv.
| | 500 | 518 | 530 | 532 | 543 | 545 | 548 | 567 |
| | | | | | | | | |
| Consensus | | | | | | | | |
| Polymorphism #1 | V (14.1%) | N (15.5%) | M (9.9%) | R (15.5%) | - | I (2.8%) | L (7.0%) | I (2.8%) |
| Other | M (7.0%) | | | | | | | |
| | | | | | | | | |
| Consensus | | | | | | | | |
| Polymorphism #1 | I (26.3%) | S (10.5%) | M (24.7%) | R (1.4%) | - | I (7.4%) | L (3.8%) | I (1.5%) |
| Other | V (23.7%) | R (4.8%) | I (1.2%) | K (0.9%) | L (2.8%) | I (3.6%) | | |
| | | | | | | | | |
| Consensus | | | | | | | | |
| Polymorphism #1 | M (7.3%) | N (13.3%) | L (19.8%) | K (0.4%) | A (53.9%) | I (2.2%) | L (8.4%) | I (3.3%) |
| Other | V (4.3%) | V (2.2%) | | | | | | |
| | | | | | | | | |
| Consensus | | | | | | | | |
| Polymorphism #1 | L (57.6%) | S (4.7%) | M (5.9%) | - | - | V (57.1%) | L (3.5%) | L (7.1%) |
| Other | V (17.6%) I (8.2%) | | | | | | | |
At each position in cEnv (Env 521–606) where consensus amino acids differ between clades, the frequency of the consensus amino acid and the most common polymorphism(s) within each clade are listed (based on all available whole protein sequences in the LANL HIV Sequence Database).
Varying consensus sequences and common polymorphisms in cNef.
| | 108 | 116 | 120 | 133 | 135 | 144 |
| | | | | | | |
| Consensus | | | | | | |
| Polymorphism #1 | D (24.3%) | H (49.3%) | F (34.1%) | I (39.0%) | Y (59.7%) | E (1.5%) |
| Other | V (18.4%) | | | | | |
| | | | | | | |
| Consensus | | | | | | |
| Polymorphism #1 | E (16.5%) | N (15.1%) | F (15.0%) | T (22.8%) | F (27.4%) | R (0.3%) |
| Other | V (22.3%) | W (1.5%) | | | | |
| | | | | | | |
| Consensus | | | | | | |
| Polymorphism #1 | D (22.7%) | N (37.4%) | F (54.7%) | I (8.1%) | F (12.9%) | - |
| | | | | | | |
| Consensus | | | | | | |
| Polymorphism #1 | D (46.5%) | H (64.6%) | Y (64.6%) | V (13.3%) | F (3.0%) | K (9.2%) |
| Other | I (5.1%) | T (12.2%) | Q (7.1%) | | | |
At each position in cNef (Nef 106–148) where consensus amino acids differ between clades, the frequency of the consensus amino acid and the most common polymorphism(s) within each clade are listed (based on all available whole protein sequences in the LANL HIV Sequence Database).
Known and potential epitopes in the conserved regions.
| Region | Reported Epitopes (LANL Database) | Potential Epitopes For Other HLA Types |
|
| A*02, A*11, | A*03, A*20, A*30, A*31, A*32, A*33, A*68, A*69 |
| B*04, | B*08, B*18, B*37, B*46, B*51, B*54, B*55, B*56, B*67, B*78 | |
|
| C*03, C*04, C*06, C*07 | |
|
|
| A*03, A*25, A*29, A*30, A*31, A*66, A*68, A*69 |
| B*07, | B*37, B*38, B*39, B*40, B*46, B*48, B*51, B*54, B*55, B*56, B*67, B*78 | |
|
| C*06, C*07, C*12, C*14, C*16 | |
|
| A*02, A*11, | A*03, A*25, A*26, A*29, A*30, A*31, A*33, A*66, A*68, A*69 |
|
| B*07, B*15, B*18, B*35, B*37, B*38, B*39, B*46, B*48, B*52, B*78 | |
|
| C*01, C*02, C*05, C*06, C*16, C*17, C*18 | |
|
| A*01, | A*11, A*25, A*26, A*30, A*31, A*66, A*68 |
|
| B*38, B*39, B*40, B*44, B*46, B*51, B*54, B*55, B*56, B*58, B*67, B*78 | |
| C*04, | C*03, C*14 |
For each region, HLA types for which there are reported and predicted epitopes in the Los Alamos HIV Immunology Database (http://www.hiv.lanl.gov/content/sequence/ELF/epitope_analyzer.html) are listed. Types in bold font are those in the database that have been reviewed as best defined optimal epitopes [42].
Recognition of conserved regions by CTL responses in persons with chronic HIV-1 infection.
| ID | VL | HLA A | HLA B | HLA C | cGag1 | cGag2 | cEnv | cNef | Min# Epitopes |
| 1 | Y | *02/*03 | *07/*35 | *04/*07 | 0 | ||||
| 5 | Y | *02/*24 | *44/*55 | *03/*05 | 6820/6821 | 1 | |||
| 6 | Y | *01/*02 | *08/*35 | *04/*07 | 6811 | 1 | |||
| 7 | N | *29/*68 | *44/*53 | *04/*16 | 6791/6797 | 6823 | 5171/5172 | 4 | |
| 9 | Y | *03/*26 | *15/*38 | *03/*12 | 6784/6788/7912/7913 7920 | 6813/6814/6820/6821 7938/7939/7944/7945 | 8900/8901 | 4 | |
| 10 | N | *01/*25 | *18/ | *06/*12 | 6787/6788/6789/6797 | 5167 | 4 | ||
| 11 | Y | *01/*11 | *35/ | *04/*06 | 0 | ||||
| 12 | N | *30/*34 | *53/ | *03/*08 | 6787/6797 | 5167/5168 | 3 | ||
| 14 | Y | *03/*25 | *18/ | *07/*12 | 6787 | 1 | |||
| 15 | Y | *02/*03 | *15/*56 | *01/*03 | 0 | ||||
| 16 | Y | *03/*32 | *18/*40 | *02/*07 | 0 | ||||
| 17 | Y | *02/*03 | *44/*51 | *04/*14 | 6787/7911/7912 | 5168 | 2 | ||
| 18 | Y | *03/*68 | *55/ | *03/*06 | 6797 | 5167 | 2 | ||
| 19 | Y | *02 | *44/*50 | *06/*16 | 0 | ||||
| 21 | Y | *02/*36 | *53 | *04 | 5171/5172 | 1 | |||
| 22 | Y | *02/*32 | *07/*15 | *03/*07 | 5172 | 1 | |||
| 23 | Y | *03/*33 | *44/ | *03/*08 | 0 | ||||
| 24 | N | *02 | *13/*15 | *06/*07 | 6349 | 1 | |||
| 25 | Y | *02/*03 | *15/*40 | *02/*03 | 6813/6814 | 5168 | 2 | ||
| 26 | Y | *01/*02 | *08/*44 | *05/*07 | 6811/6814/6815/7940 | 2 | |||
| 27 | Y | *03/*66 | *35/*49 | *04/*07 | 0 | ||||
| 28 | Y | *03/*33 | *07/*65 | *07/*08 | 7946 | 8900 | 2 | ||
| 29 | N | *29/*31 | *48/*56 | *01/*08 | 7919/7920 | 1 | |||
| 30 | Y | *30 | *08/*81 | *07/*18 | 6812 | 5167/5168/5170/5171 | 3 | ||
| 31 | Y | *02 | *13/*15 | *03/*06 | 0 | ||||
| 32 | Y | *03/*68 | *08/ | *06/*07 | 7911/7912 | 1 | |||
| 33 | Y | *01/*74 | *49/*53 | *04/*07 | 5171/5172 | 1 | |||
| 34 | Y | *11/*30 | *52/ | *07/*12 | 6787 | 6823/6824/7949/7950 | 2 | ||
| 35 | Y | *01/*23 | *07/*44 | *04/*07 | 5169/5171 | 2 | |||
| 36 | Y | *02/*11 | *39/ | *06/*07 | 0 | ||||
| 37 | Y | *25/*33 | *14/*38 | *08/*12 | 0 | ||||
| 39 | Y | *31/*32 | *12/*35 | *04/*06 | 0 | ||||
| 40 | Y | *02/*31 | *35/*44 | *04/*05 | 5169 | 1 | |||
| 41 | Y | *11/*31 | *13/*52 | *04/*12 | 0 | ||||
| 44 | Y | *02/*11 | *07/*15 | *03/*07 | 7938/7939 | 1 | |||
| 45 | Y | *30/*68 | *15 | *02/*03 | 6820 | 1 | |||
| 46 | N | *25/*68 | *35/*14 | *04/*08 | 6792/7915/7916 | 5168/5169/5171/5172 | 3 | ||
| 47 | Y | *11/*33 | *55/*78 | *03/*16 | 0 | ||||
| 48 | N | *24/*30 | *15/*27 | *01/*02 | 6812/6813 | 1 | |||
| 49 | Y | *02/*26 | *38/*44 | *07/*12 | 7919/7920 | 1 | |||
| 50 | Y | *24/*30 | *38/*44 | *05/*12 | 8907/8908 | 5171/5172 | 2 | ||
| 51 | Y | *02/*24 | *14/*49 | *07/*08 | 7944/7945/7946 | 5167/5169 | 4 | ||
| 52 | Y | *01/*25 | *18/ | *06/*12 | 6797/7911/7912 | 5171 | 3 | ||
| 54 | Y | *29/*74 | *50/*81 | *06/*18 | 6793/7915/7916 | 6814/6815 | 2 | ||
| 58 | N | *02/*29 | *44/*15 | *03 | 0 | ||||
| 59 | Y | *02/*26 | *44/*52 | *03/*04 | 7912/7913 | 1 | |||
| 60 | N | *30 | *07/ | *15/*18 | 0 | ||||
| 65 | N | *34/*74 | *53/ | *04/*07 | 6787/6788/7911/7912 | 5171/5172 | 2 | ||
| 66 | Y | *02/*23 | *07/*45 | *07/*16 | 5166/5168/5170/5172 | 4 | |||
| 67 | Y | *11/*25 | *18/*27 | *01/*12 | 6813/7936/7937 | 3 | |||
| 68 | Y | *01/*29 | *44/ | *05/*06 | 6787/7912 | 5167/5168 | 2 | ||
| 69 | Y | *02 | *44 | *05/*07 | 7948 | 1 | |||
| 70 | Y | *02/*74 | *44/*53 | *04/*05 | 6823/6824/7947 | 1 | |||
| 71 | N | *03/*33 | *53/ | *04/*18 | 0 | ||||
|
| 17 (31.5%) | 17 (31.5%) | 4 (7.4%) | 19 (35.2%) | |||||
A panel of 54 persons (from the Los Angeles area) with chronic HIV-1 infection and not receiving antiretroviral treatment was screened for HIV-1-specific CTL responses by IFN-γ ELISpot assays using overlapping peptides spanning each protein. Gag-specific responses were screened using peptides based on clade B consensus and strain DU422 sequences. Env-specific responses were screened using peptides based on clade B consensus or strain MN sequences. Nef-specific responses were screened using peptides based on clade B consensus sequence. The presence or absence of detectable viremia is indicated in the second column; these individuals were biased towards slow progressors due to recruitment selection for being untreated. Recognized peptides that fall entirely within each conserved region are indicated by NIH AIDS Reagent Repository catalog number. The minimal number of epitopes recognized by each person is listed in the last column (assuming that consecutive overlapping peptides contain a single epitope).
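The counting rule in the final sentence (consecutive overlapping peptides are assumed to share one epitope) can be illustrated with a short sketch. The function name is hypothetical, and the inputs are assumed to be ordinal positions of recognized peptides within a single overlapping peptide set. The catalog numbers in the table are not assumed to be such positions, and merging responses across two peptide sets that cover the same protein is not handled here.

```python
def minimal_epitope_count(peptide_indices):
    """Count runs of consecutive indices; each run is treated as one epitope."""
    runs = 0
    previous = None
    for idx in sorted(set(peptide_indices)):
        # A new run starts whenever the index does not directly follow the last one.
        if previous is None or idx != previous + 1:
            runs += 1
        previous = idx
    return runs

# Hypothetical example: peptides 3 and 4 overlap (one epitope), 9 stands alone.
count = minimal_epitope_count([3, 4, 9])
```

Because each run of neighbouring peptides is collapsed to one epitope, the result is a lower bound: a run could in principle contain more than one distinct epitope.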