Sinu Paul, Helen Piontkivska.
Abstract
BACKGROUND: Epitope vaccines have been suggested as a strategy to counteract viral escape and development of drug resistance. Multiple studies have shown that Cytotoxic T-Lymphocyte (CTL) and T-Helper (Th) epitopes can generate strong immune responses in Human Immunodeficiency Virus (HIV-1). However, not much is known about the relationship among different types of HIV epitopes, particularly those epitopes that can be considered potential candidates for inclusion in the multi-epitope vaccines.
Year: 2010 PMID: 20696039 PMCID: PMC2924856 DOI: 10.1186/1471-2180-10-212
Source DB: PubMed Journal: BMC Microbiol ISSN: 1471-2180 Impact factor: 3.605
Overview of HIV-1 sequences used in the analyses.
| Type of genome | Group | Subtype | Reference sequences# | Non-reference sequences* | Total^ |
|---|---|---|---|---|---|
| Non-recombinant | M group | A | - | 6 | 6 |
| | | A1 | 4 | 46 | 50 |
| | | A2 | 3 | - | 3 |
| | | B | 5 | 158 | 163 |
| | | C | 4 | 350 | 354 |
| | | D | 4 | 32 | 36 |
| | | F1 | 4 | 6 | 10 |
| | | F2 | 4 | - | 4 |
| | | G | 4 | 12 | 16 |
| | | H | 3 | - | 3 |
| | | J | 3 | - | 3 |
| | | K | 2 | - | 2 |
| | | M - Total | 40 | 610 | 650 |
| | N group | | 3 | 2 | 5 |
| | O group | | 4 | 13 | 17 |
| | N & O Total | | 7 | 15 | 22 |
| Non-recombinants - Total | | | 47 | 625 | 672 |
| Circulating Recombinant Forms (CRF) | | | 43 | 263 | 306 |
| Total | | | 90 | 888 | 978 |
The table shows numbers of HIV-1 sequences of different subtypes among reference sequences and global population used in the analyses.
# Reference sequences used in the primary analyses to identify association rules
* Non-reference sequences were collected from 2008 Web alignment of HIV Sequence database
^ Total number of sequences in the global HIV-1 population used in the analysis
Overview of epitopes used in the analyses.
| Gene | Protein | Total no. of epitopes: CTL# | Total no. of epitopes: Th | Total no. of epitopes: Ab | Total no. of epitopes: Total | Highly conserved epitopes*: CTL | Highly conserved epitopes*: Th | Highly conserved epitopes*: Ab | Highly conserved epitopes*: Total | No. of associated epitopes^: CTL | No. of associated epitopes^: Th | No. of associated epitopes^: Ab | No. of associated epitopes^: Total |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Gag | p17 | 18 | 32 | - | 50 | 1 | - | - | 1 | - | - | - | - |
| Gag | p24 | 42 | 88 | 1 | 131 | 8 | 6 | - | 14 | 8 | 6 | - | 14 |
| Gag | p2p7p1p6 | 6 | 18 | - | 24 | 2 | - | - | 2 | 2 | - | - | 2 |
| Gag | Total | 66 | 138 | 1 | 205 | 11 | 6 | - | 17 | 10 | 6 | 0 | 16 |
| Pol | Gag-Pol | 1 | - | - | 1 | - | - | - | - | - | - | - | - |
| Pol | Protease | 8 | - | - | 8 | 1 | - | - | 1 | 1 | - | - | 1 |
| Pol | RT | 39 | 20 | 3 | 62 | 12 | 1 | - | 13 | 12 | 1 | - | 13 |
| Pol | RT-Integrase | 1 | 1 | - | 2 | 1 | - | - | 1 | 1 | - | - | 1 |
| Pol | Integrase | 12 | 11 | - | 23 | 5 | 2 | - | 7 | 4 | 2 | - | 6 |
| Pol | Total | 61 | 32 | 3 | 96 | 19 | 3 | - | 22 | 18 | 3 | 0 | 21 |
| Vif | | 9 | 2 | - | 11 | - | - | - | - | - | - | - | - |
| | | 7 | 6 | - | 13 | - | - | - | - | - | - | - | - |
| | | 4 | 6 | 1 | 11 | - | - | - | - | - | - | - | - |
| | | 4 | 5 | - | 9 | - | - | - | - | - | - | - | - |
| | | 1 | 1 | - | 2 | - | - | - | - | - | - | - | - |
| Env | | 40 | 82 | 75 | 197 | - | - | 2 | 2 | - | - | 1 | 1 |
| Nef | | 37 | 24 | 1 | 62 | 2 | 1 | - | 3 | 2 | 1 | - | 3 |
| Total | | 229 | 296 | 81 | 606 | 32 | 10 | 2 | 44 | 30 | 10 | 1 | 41 |
# CTL epitopes included only the best-defined epitopes as described by Frahm et al. (2007) [56]
* Only those epitopes present in more than 75% of the reference sequences were considered highly conserved and thus included in the association rule mining. Three epitopes that completely overlapped other epitopes of the same type, without amino acid differences, were not included.
^ Associated epitopes are epitopes involved in association rules identified with a support value of 0.75 and confidence value of 0.95
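The two thresholds in the footnotes above (presence in more than 75% of sequences, then support 0.75 and confidence 0.95) can be illustrated with a minimal Python sketch. It assumes each sequence is reduced to the set of epitopes it carries; the function names and the toy presence data are hypothetical and are not taken from the study, and treating the rule thresholds as inclusive (>=) is also an assumption.

```python
def support(itemset, sequences):
    """Fraction of sequences that contain every epitope in `itemset`."""
    items = set(itemset)
    return sum(items <= present for present in sequences) / len(sequences)


def confidence(antecedent, consequent, sequences):
    """Support of antecedent and consequent together, divided by support of the antecedent."""
    base = support(antecedent, sequences)
    if base == 0:
        return 0.0
    return support(set(antecedent) | set(consequent), sequences) / base


def highly_conserved(epitopes, sequences, cutoff=0.75):
    """Keep epitopes present in more than `cutoff` of the sequences."""
    return [e for e in epitopes if support([e], sequences) > cutoff]


def is_rule(antecedent, consequent, sequences, min_support=0.75, min_confidence=0.95):
    """True if antecedent -> consequent meets both thresholds (assumed inclusive)."""
    both = set(antecedent) | set(consequent)
    return (support(both, sequences) >= min_support
            and confidence(antecedent, consequent, sequences) >= min_confidence)


# Made-up presence data: one set of epitopes per sequence (illustration only).
toy_sequences = [
    {"SPRTLNAWV", "KLVDFRELNK", "FLKEKGGL"},
    {"SPRTLNAWV", "KLVDFRELNK", "FLKEKGGL"},
    {"SPRTLNAWV", "KLVDFRELNK", "FLKEKGGL"},
    {"SPRTLNAWV", "KLVDFRELNK"},
    {"SPRTLNAWV", "IKQI"},
]

candidates = ["SPRTLNAWV", "KLVDFRELNK", "FLKEKGGL", "IKQI"]
kept = highly_conserved(candidates, toy_sequences)
forward = is_rule(["KLVDFRELNK"], ["SPRTLNAWV"], toy_sequences)
reverse = is_rule(["SPRTLNAWV"], ["KLVDFRELNK"], toy_sequences)
```

In this toy data the rule passes in one direction only, because confidence depends on which epitope is the antecedent while support does not.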
Description of the 44 epitopes used in association rule mining.
| Gene | Protein | Non-overlapping genomic regions | Epitope sequence | Amino acid start@ | Amino acid end@ | Type of epitope | Epitopes involved in 2T-3G^ | Non-overlapping genomic regions of 2T-3G epitopes | Number of "unique" association rules each epitope is involved in | HLA allele/MAb$ | Class-I HLA allele supertype association | Alternate HLA allele in case of promiscuous HLA alleles (if known)* | HLA supertype alleles over 10% cumulative frequency (+): European | HLA supertype alleles over 10% cumulative frequency (+): North American | HLA supertype alleles over 10% cumulative frequency (+): Sub-Saharan African |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Gag | p17 | 1 | WASRELERF# | 36 | 44 | CTL | | | 0 | B*3501, B*5801, B53 | B07, B58 | | + | + | + |
| Gag | p24 | 2 | SPRTLNAWV | 16 | 24 | CTL | ✓ | 1 | 712 | B*0702, B42 | B07 | B42, B39, B81 | + | + | + |
| Gag | p24 | 3 | FSPEVIPMF | 32 | 40 | CTL | | | 8 | B*57 | B58 | | - | - | - |
| Gag | p24 | 3 | EVIPMFSAL | 35 | 43 | CTL | | | 1 | A*2601, A*6901, B*1501, | A01, A02, | | + | + | + |
| Gag | p24 | 3 | SEGATPQDL | 44 | 52 | CTL | | | 257 | A*2601, B*4001 | B44, A01 | B44 | + | + | + |
| Gag | p24 | 4 | GHQAAMQML | 61 | 69 | CTL | ✓ | 2 | 2752 | A*0201, A3, B*1510, B38, B*3901 | B27, A03, B07, A02 | A03, B38 | + | + | + |
| Gag | p24 | 5 | EPRGSDIAGT | 98 | 107 | TH | | | 17 | DQ7 | | | + | - | + |
| Gag | p24 | 6 | IYKRWIILGLNKIVR | 129 | 143 | TH | | | 1167 | | | | - | - | - |
| Gag | p24 | 6 | KRWIILGLNK | 131 | 140 | CTL | | | 1541 | B*2703, B*2705, B35, DRB1*0101 | B27, B07 | | + | + | + |
| Gag | p24 | 6 | KRWIILGLNKIVRMY | 131 | 145 | TH | | | 1541 | DR1, DRB1*0101, DRB1*0301, DRB1*0405, DRB1*0701, DRB1*0802, DRB1*0901, DRB1*1101, DRB1*1201, DRB1*1302, DRB1*1501, DRB4*0101, DRB5*0101 | | | + | + | + |
| Gag | p24 | 6 | WIILGLNKIVRMYSP | 133 | 147 | TH | ✓ | 3 | 1885 | | | | - | - | - |
| Gag | p24 | 6 | GLNKIVRMY | 137 | 145 | CTL | ✓ | 3 | 2868 | B*1501 | B62 | | - | - | + |
| Gag | p24 | 6 | LNKIVRMYSPVSILD | 138 | 152 | TH | | | 15 | | | | - | - | - |
| Gag | p24 | 6 | VRMYSPVSI | 142 | 150 | CTL | | | 46 | Cw*18 | | | - | - | - |
| Gag | p24 | 7 | PKEPFRDYV | 157 | 165 | TH | ✓ | 4 | 1866 | DQ5 | | | + | - | + |
| Gag | p2p7p1p6 | 8 | CRAPRKKGC | 42 | 50 | CTL | | | 9 | B*14 | B27 | | - | + | - |
| Gag | p2p7p1p6 | 9 | TERQANFL | 64 | 71 | CTL | | | 29 | B*1801, B*4002, B*4001, B*4402, B*4403 | B44 | | + | + | + |
| Pol | PR | 10 | LVGPTPVNI | 76 | 84 | CTL | | | 1 | A*0201, A*0202, A*0203, A*6802 | A02 | | + | + | + |
| Pol | RT | 11 | IETVPVKL | 5 | 12 | CTL | | | 17 | B*4001 | B44 | | + | + | + |
| Pol | RT | 12 | GPKVKQWPL | 18 | 26 | CTL | | | 6 | B*0801, B8 | B08 | | + | - | - |
| Pol | RT | 13 | KLVDFRELNK | 73 | 82 | CTL | ✓ | 5 | 1554 | A*0301 | A03 | | + | + | + |
| Pol | RT | 14 | GIPHPAGLK | 93 | 101 | CTL | ✓ | 6 | 971 | A*0301, A11 | A03 | | + | + | + |
| Pol | RT | 15 | TVLDVGDAY | 107 | 115 | CTL | ✓ | 7 | 783 | A*1101, B*1501, B*3501 | B07, A03, B62 | B07 | + | + | + |
| Pol | RT | 16 | NETPGIRYQY | 137 | 146 | CTL | | | 30 | B*1801, B*4001, B*4002, B*4402, B*4403 | B44 | | + | + | + |
| Pol | RT | 16 | IRYQYNVL | 142 | 149 | CTL | | | 31 | B*1401 | B27 | | - | + | - |
| Pol | RT | 17 | LVGKLNWASQIY | 260 | 271 | CTL | ✓ | 8 | 1117 | B*1501 | B62 | | - | - | + |
| Pol | RT | 17 | KLNWASQIY | 263 | 271 | CTL | ✓ | 8 | 1376 | A*3002 | A01 | | - | - | - |
| Pol | RT | 18 | WEFVNTPPLVKLWYQ | 414 | 428 | TH | | | 65 | DRB1*0101, DRB1*0401, DRB1*0405, DRB1*0701, DRB1*0802, DRB1*0901, DRB1*1101, DRB1*1302, DRB1*1501, DRB5*0101 | | | + | + | + |
| Pol | RT | 19 | GAETFYVDGA | 436 | 445 | CTL | | | 11 | A*6802 | A03 | | + | + | + |
| Pol | RT | 20 | IVTDSQYAL | 495 | 503 | CTL | ✓ | 9 | 471 | Cw*0802 | | | - | - | - |
| Pol | RT | 20 | VTDSQYALGI | 496 | 505 | CTL | ✓ | 9 | 857 | B*1503 | B27 | | | + | |
| Pol | RT-Integrase | 21 | LFLDGIDKA | 560 | 8 | CTL | ✓ | 10 | 557 | B*81 | B07 | | + | + | + |
| Pol | Integrase | 22 | LKTAVQMAVFIHNFK | 172 | 186 | TH | ✓ | 11 | 1172 | | | | - | - | - |
| Pol | Integrase | 22 | KTAVQMAVF | 173 | 181 | CTL | ✓ | 11 | 1279 | B*5701 | B07 | | + | + | + |
| Pol | Integrase | 22 | KTAVQMAVFIHNFKR | 173 | 187 | TH | ✓ | 11 | 1041 | DRB1*0101, DRB1*0405, DRB1*1101, DRB1*1302 | | | + | + | + |
| Pol | Integrase | 22 | AVFIHNFKRK | 179 | 188 | CTL | ✓ | 11 | 631 | A*0301, A*1101 | A03 | | + | + | + |
| Pol | Integrase | 22 | FKRKGGIGGY | 185 | 194 | CTL | ✓ | 11 | 195 | B*1503 | B27 | | - | + | - |
| Pol | Integrase | 23 | VPRRKAKII | 260 | 268 | CTL | ✓ | 12 | 15 | B*42 | B07 | | + | + | + |
| Pol | Integrase | 23 | RKAKIIRDY# | 263 | 271 | CTL | | | 0 | B*1503 | B27 | | - | + | - |
| Env | | 24 | PIPIHYCAPA# | 212 | 221 | Ab | | | 0 | 110.1 | | | - | - | - |
| Env | | 25 | IKQI | 420 | 423 | Ab | | | 5 | E51 | | | - | - | - |
| Nef | | 26 | VGFPVRPQ | 66 | 73 | TH | ✓ | 13 | 72 | DR1, DRw15(2) | | | - | - | - |
| Nef | | 26 | RPQVPLRPM | 71 | 79 | CTL | | | 7 | B*4201 | B07 | | + | + | + |
| Nef | | 27 | FLKEKGGL | 90 | 97 | CTL | ✓ | 14 | 258 | B*0801 | B08 | B50 | + | - | - |
Out of the 44 epitopes included in the association rule mining, 41 were found to be part of association rules. Non-overlapping genomic regions and HLA alleles corresponding to each epitope are also shown.
# Epitopes not involved in any association rule
@ Amino acid coordinates are given with respect to the corresponding gene/protein in the HIV-1 HXB2 reference sequence (GenBank Accession no: K03455)
^ Epitopes involved in association rules with 2 types and 3 genes
$ HLA allele/MAb data given where available (from HIV database & IEDB)
* As per Frahm et al., 2007 [56]
Distribution of unique association rules according to genes involved in each association rule.
| | | | | | | | | | | Total* |
|---|---|---|---|---|---|---|---|---|---|---|
| Association rules with 2 epitopes | 46 | 24 | 1 | 55 | 3 | 5 | 1 | 3 | 0 | 138 |
| Association rules with 3 epitopes | 104 | 160 | 0 | 768 | 1 | 33 | 0 | 23 | 56 | 1145 |
| Association rules with 4 epitopes | 108 | 135 | 0 | 1699 | 0 | 29 | 0 | 23 | 104 | 2098 |
| Association rules with 5 epitopes | 73 | 47 | 0 | 1551 | 0 | 11 | 0 | 4 | 33 | 1719 |
| Association rules with 6 epitopes | 29 | 6 | 0 | 753 | 0 | 2 | 0 | 0 | 3 | 793 |
| Association rules with 7 epitopes | 5 | 0 | 0 | 211 | 0 | 0 | 0 | 0 | 0 | 216 |
| Association rules with 8 epitopes | 0 | 0 | 0 | 31 | 0 | 0 | 0 | 0 | 0 | 31 |
| Association rules with 9 epitopes | 0 | 0 | 0 | 2 | 0 | 0 | 0 | 0 | 0 | 2 |
| Total | 365 | 372 | 1 | 5070 | 4 | 80 | 1 | 53 | 196 | 6142 |
* There were no epitope associations in the following categories: Env only, Nef-Env, Gag-Pol-Env, Gag-Nef-Env, Pol-Nef-Env, Gag-Pol-Env-Nef
$ Detailed break-up of number of associations based on epitope type and genes involved is given in additional file 4
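As a quick arithmetic check on the table above, the following Python sketch recomputes the row, column, and grand totals from the nine per-category counts in each row. The counts are copied from the table; the variable names are illustrative, and the category columns are left unlabeled because their headers are not present here.

```python
# Counts per gene-combination category (nine unlabeled columns),
# keyed by the number of epitopes in the association rule.
rules_by_size = {
    2: [46, 24, 1, 55, 3, 5, 1, 3, 0],
    3: [104, 160, 0, 768, 1, 33, 0, 23, 56],
    4: [108, 135, 0, 1699, 0, 29, 0, 23, 104],
    5: [73, 47, 0, 1551, 0, 11, 0, 4, 33],
    6: [29, 6, 0, 753, 0, 2, 0, 0, 3],
    7: [5, 0, 0, 211, 0, 0, 0, 0, 0],
    8: [0, 0, 0, 31, 0, 0, 0, 0, 0],
    9: [0, 0, 0, 2, 0, 0, 0, 0, 0],
}

# Row totals: rules of each size, summed over categories.
row_totals = {size: sum(counts) for size, counts in rules_by_size.items()}

# Column totals: rules in each category, summed over sizes.
column_totals = [sum(column) for column in zip(*rules_by_size.values())]

grand_total = sum(row_totals.values())

# Share of all unique rules contributed by each rule size.
share_by_size = {size: total / grand_total for size, total in row_totals.items()}
most_common_size = max(share_by_size, key=share_by_size.get)
```

The recomputed totals agree with the "Total" row and column of the table, and rules with four epitopes form the largest group.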
Figure 1. A "multi-type" association rule involving three CTL and one Th epitope from three different genes. The corresponding amino acid coordinates (as per HIV-1 HXB2 reference sequence) and HLA allele supertypes recognizing these epitopes are also shown.
Figure 2. Relative composition of unique association rules involving multiple genes. The 6142 unique association rules are classified according to the genes that harbor these epitopes. The pie chart inside each segment represents the division according to the epitope region types involved. The single association rule in the Nef-only category involved CTL and Th epitopes, while that in the Pol-Env category involved CTL and Ab epitopes. Out of four association rules involving epitopes from Gag and Env, three belonged to CTL-Ab and one belonged to Th-Ab epitope region types.
Nucleotide substitution rates among different epitope and non-epitope regions.
| | dN | SE# | dS | SE | P-value* |
|---|---|---|---|---|---|
| Associated epitopes | 0.01062 | 0.00952 | 0.20969 | 0.07091 | < 0.001 |
| Non-associated epitopes | 0.02387 | 0.02537 | 0.24220 | 0.12666 | < 0.001 |
| Not included epitopes | 0.10532 | 0.01277 | 0.29085 | 0.04305 | < 0.001 |
| Non-epitopes | 0.09793 | 0.01653 | 0.27329 | 0.04665 | < 0.001 |
Average pairwise numbers of nonsynonymous (dN) and synonymous (dS) substitutions per nonsynonymous and synonymous site, respectively, estimated for different categories of epitope and non-epitope regions among reference sequences of the M group, are given.
# Standard errors were estimated with 100 bootstrap replications in MEGA4.
* In pairwise t-tests, the null hypothesis of dS = dN was rejected in all four comparisons.
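To make the comparison across categories easier to read, the sketch below computes the dN/dS ratio for each row of the table. This ratio is a common summary and is an addition here, not a statistic reported in the table; it does not reproduce the pairwise t-tests or the bootstrap standard errors. The values are copied from the table and the variable names are illustrative.

```python
# (dN, dS) per category, copied from the table above.
rates = {
    "Associated epitopes": (0.01062, 0.20969),
    "Non-associated epitopes": (0.02387, 0.24220),
    "Not included epitopes": (0.10532, 0.29085),
    "Non-epitopes": (0.09793, 0.27329),
}

# dN/dS below 1 means nonsynonymous changes accumulate more slowly
# than synonymous ones in that category.
ratios = {name: dn / ds for name, (dn, ds) in rates.items()}

for name, ratio in sorted(ratios.items(), key=lambda item: item[1]):
    print(f"{name}: dN/dS = {ratio:.3f}")
```

All four ratios are below 1, and the associated epitopes have the lowest ratio of the four categories.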