Wonderful Tatenda Choga1,2, Motswedi Anderson1, Edward Zumbika3, Bonolo B Phinius1, Tshepiso Mbangiwa1,2, Lynnette N Bhebhe1, Kabo Baruti1,4, Peter Opiyo Kimathi5, Kaelo K Seatla1,6, Rosemary M Musonda1,7, Trevor Graham Bell8, Sikhulile Moyo1,7, Jason T Blackard9, Simani Gaseitsiwe1,7.
Abstract
Hepatitis B virus (HBV) is the primary cause of liver-related malignancies worldwide, and there is currently no effective cure for chronic HBV infection (CHB). Strong T cell-mediated immune responses are associated with HBV clearance during acute infection; however, the repertoire of epitopes (epi) presented by major histocompatibility complexes (MHCs) to elicit these responses in various African populations is not well understood. In silico approaches were used to map and investigate 15-mer HBV peptides restricted to 9 HLA class II alleles with high population coverage in Botswana. Sequences from 44 HBV genotype A and 48 genotype D surface genes (PreS/S) from Botswana were used. Of the 1819 epi bindings predicted, 20.2% were strong binders (SB), and none of the putative epi bound to all 9 alleles, suggesting that multi-epitope, genotype-based, population-based vaccines will be more effective against HBV infections than previously proposed broad-potency epitope vaccines, which were assumed to work for all alleles. In total, 297 unique epi were predicted from the 3 proteins; among these, the S region had the highest number of epi (n = 186). Epitope densities (Depi) were similar between genotypes A and D. A number of mutations that hindered HLA-peptide binding were observed. We also identified antigenic and genotype-specific peptides with characteristics that are well suited for the development of sensitive diagnostic kits. This study identified candidate peptides that can be used for developing multi-epitope vaccines and highly sensitive diagnostic kits against HBV infection in an African population. Our results suggest that viral variability may hinder HBV peptide-MHC binding, which is required to initiate a cascade of immunological responses against infection.
Keywords: Africa; Botswana; HLA class II alleles; T-cell epitopes; candidate multi-epitope vaccines (MEV); escape mutation; hepatitis B virus (HBV); immunoinformatics; in silico
Year: 2020 PMID: 32640609 PMCID: PMC7412261 DOI: 10.3390/v12070731
Source DB: PubMed Journal: Viruses ISSN: 1999-4915 Impact factor: 5.818
Figure 1. Schema illustrating the flow of data analysis used in this study. N = sample size; SB = strong binding peptides; WB = weak binding peptides; Tepi = total predicted epitopes; PreS/S = HBV surface gene; Depi = epitope densities. Sequences were derived from patients with different clinical outcomes (HBV/HIV; CHB; OBI): HIV = human immunodeficiency virus; OBI = occult hepatitis B infection; CHB = chronic hepatitis B infection; HBV/HIV = coinfection. The blue segment shows the pipeline used to evaluate the diversity of epi. The grey segment is the pipeline used to determine epi and to measure their promiscuity and conservation. The pink segment is the pipeline used to determine the best candidate vaccine.
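The abstract describes mapping 15-mer peptides along the PreS/S proteins. A minimal sketch of how overlapping 15-mers and their 1-based positions could be enumerated is shown here; the function name `sliding_peptides` and the toy sequence are hypothetical, and this is not the prediction tool used in the study.

```python
from typing import Iterator, Tuple


def sliding_peptides(protein: str, k: int = 15) -> Iterator[Tuple[int, int, str]]:
    """Yield (start, end, peptide) for every k-mer, with 1-based inclusive positions."""
    for i in range(len(protein) - k + 1):
        yield i + 1, i + k, protein[i:i + k]


# Hypothetical 20-residue toy sequence, not a real HBV protein.
toy = "ACDEFGHIKLMNPQRSTVWY"
windows = list(sliding_peptides(toy))
for start, end, peptide in windows:
    print(f"{start}-{end}\t{peptide}")
```

With this convention, a window labelled 17–31 starts at residue 17 and ends at residue 31, which matches the position ranges used in the tables.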
Distribution of T cell epitopes restricted to 9 HLA class II alleles with high population coverage in Botswana.
| | PreS1 A (120 aa) | PreS1 D (108 aa) | PreS2 A (55 aa) | PreS2 D (55 aa) | S A (226 aa) | S D (226 aa) |
|---|---|---|---|---|---|---|
| Total bindings ( | 6; 79 (4.7) | 7; 107 (6.3) | 4; 81 (4.7) | 3; 78 (4.5) | 122; 619 (40.7) | 136; 577 (39.1) |
| Unique | 25 ¥ | 38 ¥ | 29 ¥ | 24 ¥ | 125 ¥ | 126 ¥ |
| Ratio = Tepi: | 3.4 | 3.0 | 3.4 | 2.9 | 5.9 | 5.7 |
| Most active HLA: DRB * (SB; WB) | *0802; *0101 | *0401; (*0401, *0101) | *0301; (*0401, *0101) | *0301; 5*0101 | *0702; *0401 | *0401; *1501 |
| Genotype variation: A\|D | A\|D > 0.05 | | A\|D > 0.05 | | A\|D > 0.05 | |
¥ indicates the existence of epi that are common to both genotypes A and D. SB: strong binding peptides. WB: weak binding peptides. aa: amino acids. Web-logo diagrams represent signature aa between consensus sequences of genotypes A and D, set at a threshold of 100%. HBV: hepatitis B virus. *0101 means HLA class II allele DRB1*0101, etc. 5*0101 means HLA class II allele DRB5*0101.
Figure 2. Epitope densities (Depi) of different PreS/S proteins stratified by genotype (A or D) and protein (PreS1, PreS2, S); the density can be computed for any protein (PreS/SA versus PreS/SD). PreS1A represents genotype A large hepatitis B surface antigen (HBsAg); PreS1D represents genotype D large HBsAg; PreS2A represents genotype A middle HBsAg; PreS2D represents genotype D middle HBsAg; SA represents genotype A small HBsAg; SD represents genotype D small HBsAg. Tepi = total binding peptides (WB + SB). Nepi unique = count of unique binding peptides per protein.
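The caption defines Tepi as WB + SB and Nepi unique as the count of unique binding peptides. The distribution table also reports a ratio starting with "Tepi:", whose second term is cut off in this copy. The sketch below assumes that second term is Nepi unique; that assumption, and the function names, are illustrative rather than taken from the source.

```python
def total_epitopes(sb: int, wb: int) -> int:
    """Tepi = strong binders + weak binders."""
    return sb + wb


def tepi_per_unique(sb: int, wb: int, n_unique: int) -> float:
    """Assumed ratio Tepi : Nepi unique, rounded to one decimal place."""
    return round(total_epitopes(sb, wb) / n_unique, 1)


# Values from the distribution table: PreS1 genotype A and S genotype A.
print(tepi_per_unique(6, 79, 25))
print(tepi_per_unique(122, 619, 125))
```

Under this assumption the PreS1 A and S A columns reproduce the tabulated ratios of 3.4 and 5.9.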
Showing the prevalence of S protein epi in other HBV genotypes (B–I).
| Epitope Sites in S Protein | AA Sequence | B ( | C ( | E (n = 1118) | F ( | G ( | H ( | I ( | Prevalence (%) = 1 − | Degree of Conservation ↓ (+: Variable) | |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 17–31 | AFGKFLWEWASARFS | | C = 32 | E = 998 | F = 1 | | | | 10.0 | + | [ |
| 180–194 | AGFFLLTRILTIPQS | B = 62 | C = 4627 | E = 5 | F = 2 | G = 75 | | | 46.3 | ++ | [ |
| 90–104 | CLIFLLVLLDYQGML | B = 2605 | C = 4661 | E = 1034 | F = 442 | G = 84 | H = 65 | I = 70 | 86.9 | ++++ | [ |
| 69–83 | CPGYRWMCLRRFIIF | B = 2384 | C = 5064 | E = 1027 | F = 465 | G = 84 | H = 67 | I = 65 | 88.8 | ++++ | [ |
| 19–33 | FFLLTRILTIPQSLD | B = 63 | C = 4717 | E = 5 | F = 2 | G = 76 | | | 47.2 | ++ | [ |
| 158–172 | FGKFLWEWASARFSW | | C = 31 | E = 996 | F = 1 | | | | 10.0 | + | [ |
| 20–34 | FLLTRILTIPQSLDS | B = 63 | C = 4709 | E = 5 | F = 5 | G = 78 | | | 47.1 | ++ | [ |
| 93–107 | FLLVLLDYQGMLPVC | B = 2616 | C = 4720 | E = 1031 | F = 446 | G = 84 | H = 66 | I = 72 | 87.7 | ++++ | [ |
| 161–175 | FLWEWASARFSWLSL | B = 1 | C = 81 | E = 997 | F = 3 | | | I = 16 | 10.7 | + | [ |
| 179–193 | FVQWFVGLSPTVWLS | B = 2630 | C = 4698 | E = 76 | | G = 85 | | I = 75 | 73.4 | +++ | [ |
| 18–32 | GFFLLTRILTIPQSL | B = 63 | C = 4618 | E = 5 | F = 2 | G = 76 | | | 46.2 | ++ | - |
| 159–173 | GKFLWEWASARFSWL | | C = 31 | E = 1002 | F = 3 | | | | 10.1 | + | - |
| 202–216 | GPSLYSILSPFLPLL | | C = 19 | E = 2 | | | | | 0.2 | + | [ |
| 71–85 | GYRWMCLRRFIIFLF | B = 82 | C = 5065 | E = 1032 | F = 464 | G = 84 | H = 67 | I = 65 | 66.5 | +++ | - |
| 92–106 | IFLLVLLDYQGMLPV | B = 2606 | C = 4659 | E = 1030 | F = 443 | G = 84 | H = 66 | I = 70 | 86.9 | ++++ | - |
| 195–209 | IWMMWYWGPSLYSIL | B = 3 | C = 24 | E = 2 | | | | | 0.3 | + | - |
| 160–174 | KFLWEWASARFSWLS | B = 1 | C = 34 | E = 1012 | F = 3 | | | I = 16 | 10.3 | + | - |
| 91–105 | LIFLLVLLDYQGMLP | B = 2611 | C = 4656 | E = 1031 | F = 443 | G = 84 | H = 66 | I = 70 | 86.9 | ++++ | [ |
| 21–35 | LLTRILTIPQSLDSW | B = 63 | C = 4754 | E = 5 | F = 5 | G = 77 | | | 47.6 | ++ | [ |
| 94–108 | LLVLLDYQGMLPVCP | B = 2653 | C = 4729 | E = 1032 | F = 445 | G = 84 | H = 66 | I = 72 | 88.1 | ++++ | [ |
| 15–29 | LQAGFFLLTRILTIP | B = 62 | C = 4735 | E = 5 | F = 2 | G = 76 | | | 47.3 | ++ | [ |
| 192–206 | LSVIWMMWYWGPSLY | B = 772 | C = 4249 | E = 889 | | G = 1 | | I = 50 | 57.8 | ++ | [ |
| 95–109 | LVLLDYQGMLPVCPL | B = 2619 | C = 4708 | E = 1022 | F = 444 | G = 84 | H = 66 | I = 73 | 87.5 | ++++ | [ |
| 162–176 | LWEWASARFSWLSLL | B = 7 | C = 88 | E = 1001 | F = 445 | G = 2 | H = 65 | I = 75 | 16.3 | + | [ |
| 205–219 | LYSILSPFLPLLPIF | | C = 21 | E = 2 | | | | | 0.2 | + | - |
| 178–192 | PFVQWFVGLSPTVWL | B = 2649 | C = 4723 | E = 76 | | G = 85 | | I = 74 | 73.8 | +++ | - |
| 70–84 | PGYRWMCLRRFIIFL | B = 2375 | C = 5117 | E = 1033 | F = 466 | G = 85 | H = 67 | I = 65 | 89.3 | ++++ | [ |
| 66–80 | PPTCPGYRWMCLRRF | B = 60 | C = 751 | E = 14 | F = 464 | G = 77 | H = 68 | | 13.9 | + | [ |
| 203–217 | PSLYSILSPFLPLLP | | C = 19 | E = 2 | | | | | 0.2 | + | - |
| 67–81 | PTCPGYRWMCLRRFI | B = 60 | C = 750 | E = 14 | F = 464 | G = 77 | H = 68 | | 13.9 | + | [ |
| 16–30 | QAGFFLLTRILTIPQ | B = 62 | C = 4645 | E = 5 | F = 2 | G = 76 | | | 46.5 | ++ | [ |
| 181–195 | QWFVGLSPTVWLSVI | B = 2588 | C = 4310 | E = 73 | | G = 6 | | I = 71 | 68.4 | +++ | [ |
| 38–52 | SLNFLGGTTVCLGQN | B = 5 | C = 2 | E = 2 | | | | | 0.1 | + | [ |
| 204–218 | SLYSILSPFLPLLPI | | C = 19 | E = 2 | | | | | 0.2 | + | [ |
| 193–207 | SVIWMMWYWGPSLYS | B = 3 | C = 24 | E = 2 | | | | | 0.3 | + | - |
| 155–169 | SWAFGKFLWEWASAR | | C = 31 | E = 998 | F = 1 | | | | 10.0 | + | - |
| 68–82 | TCPGYRWMCLRRFII | B = 63 | C = 757 | E = 17 | F = 466 | G = 77 | H = 68 | | 14.0 | + | - |
| 37–51 | TSLNFLGGTTVCLGQ | B = 5 | C = 2 | E = 2 | | | | | 0.1 | + | [ |
| 194–208 | VIWMMWYWGPSLYSI | B = 3 | C = 24 | E = 2 | | | | | 0.3 | + | - |
| 14–28 | VLQAGFFLLTRILTI | B = 61 | C = 4676 | E = 5 | F = 2 | G = 76 | | | 46.8 | ++ | [ |
| 180–194 | VQWFVGLSPTVWLSV | B = 2636 | C = 4584 | E = 77 | | G = 6 | | I = 75 | 71.6 | +++ | [ |
| 156–170 | WAFGKFLWEWASARF | | C = 32 | E = 1000 | F = 1 | | | | 10.0 | + | [ |
| 163–177 | WEWASARFSWLSLLV | B = 7 | C = 87 | E = 1002 | F = 445 | G = 2 | H = 64 | I = 75 | 16.3 | + | - |
| 182–196 | WFVGLSPTVWLSVIW | B = 2506 | C = 4125 | E = 73 | | G = 6 | | I = 71 | 65.8 | +++ | [ |
| 201–215 | WGPSLYSILSPFLPL | | C = 19 | E = 2 | | | | | 0.2 | + | [ |
| 196–210 | WMMWYWGPSLYSILS | B = 3 | C = 25 | E = 2 | | | | | 0.3 | + | - |
| 36–50 | WTSLNFLGGTTVCLG | B = 5 | C = 2 | E = 2 | | | | | 0.1 | + | - |
| 35–49 | WWTSLNFLGGTTVCL | B = 5 | C = 2 | E = 2 | | | | | 0.1 | + | - |
| 199–213 | WYWGPSLYSILSPFL | | C = 18 | E = 2 | | | | | 0.2 | + | - |
| 206–220 | YSILSPFLPLLPIFF | | C = 19 | E = 2 | | | | | 0.2 | + | - |
| 200–214 | YWGPSLYSILSPFLP | | C = 18 | E = 2 | | | | | 0.2 | + | - |
+ The degree of conservation, scored on the following scale: score ≥ 100, highly conserved ('+++++'); score ≥ 85, semi-conserved ('++++'); score ≥ 60, region of mutation ('+++'); score ≥ 20, highly variable region ('++'); otherwise, high escape mutation ('+'). n represents the number of sequences used in the analysis. B represents full-length genotype B sequences, C represents full-length genotype C sequences, etc. A range such as 17–31 gives the positions occupied by the predicted 15-mer epitope (e.g., 17 is the first amino acid of the epitope and 31 is the last).
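The conservation scale in the footnote can be written as a small function. This is a sketch of the thresholds as stated there; the function name `conservation_label` is hypothetical.

```python
def conservation_label(score: float) -> str:
    """Map a prevalence score (%) to the '+' scale given in the table footnote."""
    if score >= 100:
        return "+++++"  # highly conserved
    elif score >= 85:
        return "++++"   # semi-conserved
    elif score >= 60:
        return "+++"    # region of mutation
    elif score >= 20:
        return "++"     # highly variable region
    else:
        return "+"      # high escape mutation


# Prevalence values taken from rows of the table above.
for prevalence in (88.8, 73.4, 47.2, 10.0):
    print(prevalence, conservation_label(prevalence))
```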
List of most promiscuous SB epi mapped from S protein.
| S: Residues | AA Sequence | Geno | Core aa | HLA Class II Alleles |
| AA Sequence | Geno | Core aa | HLA Class II Alleles | |||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 69–83 | CPGYRWMCLRRFIIF | A | YRWMCLRRF | *0101 | *0301 | *0401 | *0701 | *0802 | *1302 | 68–82 | TCPGYRWMCLRRFII | D | YRWMCLRRF | *0101 | *0301 | *0401 | *0701 | *0802 | *1302 | *1501 | ||||
| CPGYRWMCLRRFIIF | D | 168–182 | VRFSWLSLLVPFVQW | A | WLSLLVPFV | *0101 | *0401 | *0701 | *0802 | *1101 | *1302 | *1501 | 5*0101 | |||||||||||
|
| ICPGYRWMCLRRFII | A | 14–28 | VLQAGFFLLTRILTI | A | FLLTRILTI | *0301 | *0401 | *0802 | *1302 | *1501 | 5*0101 | ||||||||||||
|
| PGYRWMCLRRFIIFL | D | VLQAGFFLLTRILTI | D | ||||||||||||||||||||
| PGYRWMCLRRFIIFL | A | *0101 | *0301 | *1501 | 165–179 | WASARFSWLSLLVPF | D | FSWLSLLVP | *0101 | *0401 | *0802 | *1101 | 5*0101 | |||||||||||
|
| GYRWMCLRRFIIFLF | A | 174–188 | SLLVPFVQWFVGLSP | A | FVQWFVGLS | *0101 | *0802 | *1101 | *1501 | ||||||||||||||
| GYRWMCLRRFIIFLF | D | SLLVPFVQWFVGLSP | D | |||||||||||||||||||||
|
| PICPGYRWMCLRRFI | A | 6–20 | SGFLGPLLVLQAGFF | A | LGPLLVLQA | *0101 | *0701 | *0802 | *1101 | *1501 | 5*0101 | ||||||||||||
|
| PPICPGYRWMCLRRF | A | *0101 | *1501 | 5*0101 | SGFLGPLLVLQAGFF | D | *0701 | ||||||||||||||||
|
| PTCPGYRWMCLRRFI | D | *1501 | 7–21 | GFLGPLLVLQAGFFL | D | ||||||||||||||||||
|
| CTCIPIPSSWAFAKY | A | CIPIPSSWA | *0101 | *0401 | *0701 | *0802 | *1501 | GFLGPLLVLQAGFFL | A | ||||||||||||||
| CTCIPIPSSWAFGKF | D | 5–19 | TSGFLGPLLVLQAGF | D | ||||||||||||||||||||
|
| GNCTCIPIPSSWAFA | A | TSGFLGPLLVLQAGF | A | ||||||||||||||||||||
|
| NCTCIPIPSSWAFAK | A | 93–107 | FLLVLLDYQGMLPVC | D | LVLLDYQGM | *0101 | *0401 | *0802 | *1101 | *1501 | |||||||||||||
| NCTCIPIPSSWAFGK | D | 5*0101 | FLLVLLDYQGMLPVC | A | *0802 | *1101 | ||||||||||||||||||
|
| VPFVQWFVGLSPTVW | D | WFVGLSPTV | *0101 | *0401 | *0701 | *0802 | *1101 | 5*0101 | 92–106 | IFLLVLLDYQGMLPV | A | *0802 | *1101 | ||||||||||
| VPFVQWFVGLSPTVW | A | IFLLVLLDYQGMLPV | D | *0802 | *1101 | |||||||||||||||||||
|
| ARFSWLSLLVPFVQW | D | WLSLLVPFV | *0101 | *0401 | *0701 | *0802 | *1101 | *1501 | 5*0101 | 182–196 | WFVGLSPTVWLSVIW | D | VGLSPTVWL | *0301 | *0401 | *0802 | *1302 | *1501 | |||||
|
| RFSWLSLLVPFVQWF | D | WFVGLSPTVWLSVIW | A | *0301 | *0401 | *0802 | *1302 | ||||||||||||||||
| RFSWLSLLVPFVQWF | A | 97–111 | LLDYQGMLPVCPLIP | A | YQGMLPVCP | *0101 | *0401 | *0701 | *1101 | *1501 | 5*0101 | |||||||||||||
|
| SARFSWLSLLVPFVQ | D | LLDYQGMLPVCPLIP | D | *0401 | *0701 | *1101 | |||||||||||||||||
|
| LVPFVQWFVGLSPTV | A | WFVGLSPTV | *0101 | *0401 | *0701 | *0802 | 5*0101 | 96–110 | VLLDYQGMLPVCPLI | A | *0401 | *0701 | *1101 | ||||||||||
| LVPFVQWFVGLSPTV | D | VLLDYQGMLPVCPLI | D | *0401 | *0701 | *1101 | ||||||||||||||||||
|
| ASARFSWLSLLVPFV | D | *0101 | *0401 | *0701 | *0802 | 5*0101 | 72–86 | YRWMCLRRFIIFLFI | A | WMCLRRFII | *0701 | *0802 | *1101 | *1501 | 5*0101 | ||||||||
|
| FSWLSLLVPFVQWFV | A | *1101 | YRWMCLRRFIIFLFI | D | WMCLRRFII | ||||||||||||||||||
| FSWLSLLVPFVQWFV | D | 194–208 | VIWMMWYWGPSLYNI | A | WYWGPSLYN | *0101 | *0401 | *0701 | *0802 | *1101 | 5*0101 | |||||||||||||
|
| SVRFSWLSLLVPFVQ | A | 5*0101 | |||||||||||||||||||||
PreS1A represents genotype A epi derived from sequences of large hepatitis B surface antigen (HBsAg); PreS1D represents genotype D epi derived from sequences of large HBsAg; PreS2A represents genotype A epi derived from sequences of middle HBsAg; PreS2D represents genotype D epi derived from sequences of middle HBsAg; SA represents genotype A epi derived from sequences of small HBsAg; SD represents genotype D epi derived from sequences of small HBsAg. *0101 means HLA class II allele DRB1*0101, etc. 5*0101 means HLA class II allele DRB5*0101.
Highlighting most promiscuous T cell epitopes restricted to 9 HLA class II alleles.
| Epitope Site | AA Sequence | HBV Protein | Core AA | HLA Class II Alleles | Previously Discussed |
|---|---|---|---|---|---|
| 34–48 | PVPNIASHISSISSR | PreS2A | IASHISSIS | 1*0101, 1*0401, 1*0701, 1*0802, 1*1101, 1*1302, 1*1501, 5*0101 | - |
| 39–53 | ASHISSISSRTGDPA | PreS2A | ISSISSRTG | 1*0101, 1*0401, 1*0701, 1*0802, 1*1101, 1*1501, 5*0101 | - |
| 38–52 | IASHISSISSRTGDP | PreS2A | ISSISSRTG | 1*0101, 1*0401, 1*0701, 1*0802, 1*1101, 1*1501, 5*0101 | - |
| 37–51 | TTASPLSSIFSRIGD | PreS2D | LSSIFSRIG | 1*0101, 1*0401, 1*0701, 1*0802, 1*1101, 1*1501, 5*0101 | - |
| 38–52 | TASPLSSIFSRIGDP | PreS2D | LSSIFSRIG | 1*0101, 1*0401, 1*0701, 1*0802, 1*1101, 1*1501, 5*0101 | - |
| 39–53 | ASPLSSIFSRIGDPA | PreS2D | LSSIFSRIG | 1*0101, 1*0401, 1*0701, 1*0802, 1*1101, 1*1501, 5*0101 | - |
| 208–222 | ILSPFIPLLPIFFCL | SA | FIPLLPIFF | 1*0101, 1*0401, 1*0701, 1*0802, 1*1101, 1*1501, 5*0101 | - |
| 209–223 | LSPFIPLLPIFFCLW | SA | FIPLLPIFF | 1*0101, 1*0401, 1*0701, 1*0802, 1*1101, 1*1501, 5*0101 | - |
| 207–221 | NILSPFIPLLPIFFC | SA | FIPLLPIFF | 1*0101, 1*0401, 1*0701, 1*0802, 1*1101, 1*1302, 5*0101 | - |
| 149–163 | CIPIPSSWAFAKYLW | SA | IPSSWAFAK | 1*0101, 1*0401, 1*0701, 1*0802, 1*1101, 1*1501, 5*0101 | - |
| 6–20 | SGFLGPLLVLQAGFF | SA | LGPLLVLQA | 1*0101, 1*0401, 1*0701, 1*0802, 1*1101, 1*1501, 5*0101 | - |
| 200–214 | YWGPSLYNILSPFIP | SA | LYNILSPFI | 1*0101, 1*0401, 1*0701, 1*0802, 1*1101, 1*1302, 1*1501 | - |
| 199–213 | WYWGPSLYNILSPFI | SA | LYNILSPFI | 1*0101, 1*0401, 1*0701, 1*0802, 1*1101, 1*1501, 5*0101 | - |
| 164–178 | EWASVRFSWLSLLVP | SA | VRFSWLSLL | 1*0101, 1*0401, 1*0802, 1*1101, 1*1302, 1*1501, 5*0101 | [ |
| 168–182 | VRFSWLSLLVPFVQW | SA | WLSLLVPFV | 1*0101, 1*0401, 1*0701, 1*0802, 1*1101, 1*1302, 1*1501, 5*0101 | [ |
| 169–183 | RFSWLSLLVPFVQWF | SA | WLSLLVPFV | 1*0101, 1*0401, 1*0701, 1*0802, 1*1101, 1*1501, 5*0101 | [ |
| 156–170 | WAFAKYLWEWASVRF | SA | YLWEWASVR | 1*0101, 1*0301, 1*0401, 1*0701, 1*0802, 1*1101, 1*1501, 5*0101 | [ |
| 208–222 | ILSPFLPLLPIFFCL | SD | FLPLLPIFF | 1*0101, 1*0401, 1*0701, 1*0802, 1*1101, 1*1501, 5*0101 | - |
| 209–223 | LSPFLPLLPIFFCLW | SD | FLPLLPIFF | 1*0101, 1*0401, 1*0701, 1*0802, 1*1101, 1*1501, 5*0101 | - |
| 155–169 | SWAFGKFLWEWASAR | SD | FLWEWASAR | 1*0101, 1*0301, 1*0401, 1*0701, 1*0802, 1*1101, 1*1501 | - |
| 6–20 | SGFLGPLLVLQAGFF | SD | LGPLLVLQA | 1*0101, 1*0401, 1*0701, 1*0802, 1*1101, 1*1501, 5*0101 | [ |
| 199–213 | WYWGPSLYSILSPFL | SD | LYSILSPFL | 1*0101, 1*0401, 1*0802, 1*1101, 1*1302, 1*1501, 5*0101 | - |
| 167–181 | SARFSWLSLLVPFVQ | SD | WLSLLVPFV | 1*0101, 1*0401, 1*0701, 1*0802, 1*1101, 1*1501, 5*0101 | - |
| 168–182 | ARFSWLSLLVPFVQW | SD | WLSLLVPFV | 1*0101, 1*0401, 1*0701, 1*0802, 1*1101, 1*1501, 5*0101 | [ |
| 169–183 | RFSWLSLLVPFVQWF | SD | WLSLLVPFV | 1*0101, 1*0401, 1*0701, 1*0802, 1*1101, 1*1501, 5*0101 | [ |
| 197–211 | MMWYWGPSLYSILSP | SD | WYWGPSLYS | 1*0101, 1*0401, 1*0701, 1*0802, 1*1101, 1*1501, 5*0101 | - |
| 68–82 | TCPGYRWMCLRRFII | SD | YRWMCLRRF | 1*0101, 1*0301, 1*0401, 1*0701, 1*0802, 1*1101, 1*1501, 5*0101 | - |
*0101 means HLA class II allele DRB1*0101, etc. 5*0101 means HLA class II allele DRB5*0101.
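The shorthand in the footnote, together with the `1*0101` form used in the table, can be expanded to full allele names, and promiscuity can then be read off as the number of distinct alleles listed for an epitope. The sketch below assumes exactly these three shorthand forms; the function names are hypothetical.

```python
def expand_allele(short: str) -> str:
    """Expand '*0101' or '1*0101' to 'DRB1*0101', and '5*0101' to 'DRB5*0101'."""
    locus, sep, code = short.strip().partition("*")
    if not sep or not code:
        raise ValueError(f"not an allele shorthand: {short!r}")
    if locus in ("", "1"):
        return f"DRB1*{code}"
    if locus == "5":
        return f"DRB5*{code}"
    raise ValueError(f"unrecognised locus prefix in {short!r}")


def promiscuity(allele_cell: str) -> int:
    """Count distinct alleles in a comma-separated table cell."""
    return len({expand_allele(a) for a in allele_cell.split(",") if a.strip()})


# Allele cell for epitope 168–182 (VRFSWLSLLVPFVQW, SA) in the table above.
cell = "1*0101, 1*0401, 1*0701, 1*0802, 1*1101, 1*1302, 1*1501, 5*0101"
print(promiscuity(cell))
```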
Figure 3. Tertiary structures of the candidate vaccines, modelled using the 3Dpro web tool. (a) The SA protein has the aa composition 5′-SGFLGPLLVLQAGFFWYWGPSLYNILSPFIPLLPIFFCLWCIPIPSSWAFAKYLWEWASVRFSWLSLLVPFVQWF-3′ and the following theoretical properties: antigenicity (0.04), instability index (II = 47.07), estimated half-life in vitro = 1.9 h, molecular weight (mw) = 8878.62, aliphatic index (AI = 114.40), grand average of hydropathicity (GRAVY = 0.995), and theoretical alkalinity (pI = 7.76). Using the VaxiJen ver2.0 server set at a threshold of 0.4, the overall prediction for the protective antigen was 0.53, indicating that it is a plausible antigen [66]. (b) The protein (5′-MMWYWGPSLYSILSPFLPLLPIFFCLWSGFLGPLLVLQAGFFSWAFGKFLWEWASARFSWLSLLVPFVQWFTCPGYRWMCLRRFIIFLF-3′) modelled from SD epi had the following theoretical properties: antigenicity (0.11), instability index (II = 53), estimated half-life in vitro = 30 h, molecular weight (mw) = 10749.99, aliphatic index (AI = 101.91), grand average of hydropathicity (GRAVY = 0.965), and theoretical alkalinity (pI = 9.42). The antigenicity scores predicted for both candidate vaccines suggest that they are plausible antigens [66].
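Two of the properties quoted in the caption, GRAVY and the aliphatic index, have simple closed-form definitions. The sketch below uses the standard Kyte-Doolittle hydropathy scale and the usual aliphatic index formula (mole percent Ala + 2.9 × Val + 3.9 × (Ile + Leu)). The caption does not say which tool produced its values, so this is an illustration of the definitions and not a reproduction of the authors' calculation.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence: str) -> float:
    """Grand average of hydropathicity: mean Kyte-Doolittle value per residue."""
    return sum(KYTE_DOOLITTLE[aa] for aa in sequence) / len(sequence)


def aliphatic_index(sequence: str) -> float:
    """Aliphatic index from mole percentages of Ala, Val, Ile and Leu."""
    n = len(sequence)
    ala = sequence.count("A")
    val = sequence.count("V")
    ile_leu = sequence.count("I") + sequence.count("L")
    return 100.0 * (ala + 2.9 * val + 3.9 * ile_leu) / n


# SA candidate sequence as given in the Figure 3 caption.
sa = "SGFLGPLLVLQAGFFWYWGPSLYNILSPFIPLLPIFFCLWCIPIPSSWAFAKYLWEWASVRFSWLSLLVPFVQWF"
print(round(gravy(sa), 3), round(aliphatic_index(sa), 2))
```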
The prevalence of S gene escape mutations.
| Protein | AA Sequence | S Protein AA Residues ( | Escape Mutations | Count in A ( | Count in D ( | Count in Other Genotypes |
|---|---|---|---|---|---|---|
| SA | MENITSGFLGPQLV | 1–14 | L12Q | 13 | 1 | 3 |
| | | 1–14 | I4T | 13 | 25 | 91 |
| | | 1–14 | I4Stop | 10 | - | - |
| SA | GPLLVLQAGFFLLTR | 10–24 | L15Stop | 2 | - | 17 |
| SA | LNFPGGSPVCLGQNS | 39–53 | L42P | 13 | 6 | 53 |
| SA | TRILTIPQ*LDSWWT | 23–37 | S31Stop | 6 | 12 | - |
| SA | DLWWTSLNFLGDPPV | 33–47 | G44D | 4 | - | 16 |
| | DSWWTSLNFLGESPV | | G44E | 105 | 87 | 826 |
| | IPIPSSWGFAKYLWE | 150–164 | A157G | 10 | 5 | 3 |
| SA | IPIPSSWAFVKYLWE | | A159V | 20 | 3 | 234 |
| SA | PPICPGYRWMCQRRF | 66–80 | L77R | 7 | 2 | 43 |
| | | | L77Q | 6 | - | 18 |
| SA | PPICPGYRWMCLR*F | | R79Stop | 9 | 10 | 20 |
| | | | R79H | 13 | 58 | 49 |
| SA | LIFLLVLLDYQDMLP | 91–105 | G102D | 1 | 2 | 3 |
| SA | CLIFLLVLLDYQGML | 90–104 | D99Stop | 15 | 5 | 8 |
| | | | Y100C | 196 | 12 | 17 |
| | | | M103I | 31 | 31 | 29 |
| SA | SLLVPFVQWFVGLTP LLVPFVQWFEGLSPT | 174–188 | S187T | 1 | - | - |
| | | | G185E | 6 | 15 | 19 |
| | | | V184E | 1 | 2 | 2 |
| | | | L186P | 10 | - | 5 |
| SA | SPTVWLLAIWMMWYW | 187–193 | S193L | 104 | 257 | 612 |
| | | | A194V | 494 | | |
| SA | GPSLYNISSPFIPLL | 202–216 | L209S | 4 | 9 | 12 |
| SD | ENITSGCLGPLLVLQAGF | 2–16 | F8C | - | 1 | - |
| SA | YLWEWASVRFSWPSL | 161–175 | L173P | 1 | 3 | 11 |
| SA | SPFIPLLLIFFCLWV | 210–224 | P214L | 11 | 41 | 23 |
| SD | SSWAFGKFLWEWASA | 154–168 | K160N | 1 | 7 | 9 |
| SD | ILSPYILLLPIFFCI | 208–222 | F212Y + L213I + P214L | -; -; 11 | 40; 202; 41 | 3; ∆*; 23 |
| | NITSGFLGLLLVLQA | 4–18 | P11L | 2 | 4 | 3 |
| SD | MENITSGFLGPLLVL | 1–15 | T5P; N3S + I4T + T5A | 2; - | 4; 2 | 4; -29 |
| SD | SWWTSLNFLGETTVC | 34–48 | G44E | 105 | 87 | 826 |
| SD | SWWTSLNFRGGTTVC | | L42R | 14 | 8 | 14 |
| SD | LSVIWMMWYWGPNLY | 192–206 | S204N | 136 | 194 | 848 |
|
| M1E, M1L, M1K, M1I, M1V, M1T, M1R, E2H, E2Q, E2*, E2V, E2K, E2A, E2G, E2D, N3H, N3E, N3P, N3A, N3C, N3D, N3I, N3R, N3Y, N3K, N3T, N3G, N3S, I4L, I4R, I4Y, I4Q, I4F, I4M, I4H, I4K, I4A, I4P, I4S, I4N, I4V, I4T, T5Y, T5Q, T5L, T5R, T5E, T5K, T5V, T5P, T5I, T5S, T5A, S6G, S6F, S6A, S6P, S6T, S6L, G7C, G7D, G7*, G7A, G7V, G7K, G7E, G7R, F8C, F8A, F8G, F8V, F8Y, F8I, F8H, F8P, F8S, F8L, L9R, L9K, L9V, L9I, L9H, L9Q, L9P, G10D, G10*, G10V, G10Q, G10T, G10A, G10K, G10E, G10R, P11A, P11T, P11H, P11L, L12V, L12E, L12M, L12R, L12P, L12Q, L13Q, L13V, L13I, L13F, L13R, L13H, L13P, V14R, V14L, V14E, V14I, V14M, V14G, V14A, L15E, L15K, L15T, L15*, L15I, L15F, L15V, L15S, Q16K, Q16E, Q16H, Q16L, Q16R, Q16P, A17R, A17T, A17S, A17P, A17V, A17G, A17E, G18W, G18K, G18E, G18A, G18R, G18V, F19I, F19V, F19L, F19S, F19Y, F19C, F20I, F20Y, F20L, F20S, L21V, L21M, L21F, L21*, L21W, L21S, L22Q, L22M, L22V, L22F, L22S, L22*, L22W, T23F, T23R, T23Q, T23S, T23P, T23A, T23I, R24N, R24T, R24Q, R24E, R24I, R24G, R24S, R24K, I25R, I25S, I25F, I25N, I25A, I25T, I25V, L26Y, L26I, L26F, L26Q, L26P, L26H, L26R, T27R, T27P, T27S, T27A, T27K, T27I, I28K, I28L, I28V, I28T, I28M, P29E, P29A, P29F, P29S, P29Q, P29T, P29L, Q30M, Q30S, Q30P, Q30A, Q30L, Q30H, Q30R, Q30K, S31K, S31I, S31D, S31T, S31C, S31G, S31R, S31N, L32V, L32G, L32Q, L32R, L32I, L32P, D33V, D33H, D33Y, D33E, D33N, D33G, S34*, S34A, S34T, S34W, S34P, S34L, W35P, W35C, W35*, W35G, W35R, W35L, W36E, W36G, W36R, W36*, W36L, T37D, T37P, T37L, T37I, T37S, T37N, T37A, S38E, S38A, S38Y, S38F, S38P, L39I, L39V, L39R, L39H, L39F, L39P, N40Q, N40H, N40K, N40I, N40D, N40S, F41Q, F41I, F41Y, F41C, F41L, F41S, L42*, L42S, L42I, L42V, L42Q, L42R, L42P, G43V, G43W, G43K, G43R, G43E, G44Q, G44K, G44R, G44A, G44D, G44V, G44E, S45M, S45D, S45G, S45E, S45H, S45Q, S45R, S45I, S45K, S45N, S45V, S45L, S45P, S45A, S45T, P46N, P46R, P46S, P46I, P46A, P46H, P46L, P46T, V47Q, V47N, V47L, V47P, V47M, V47E, V47R, V47K, V47A, V47G, V47T, C48W, C48L, C48F, C48R, C48G, 
C48S, C48Y, L49A, L49T, L49C, L49S, L49I, L49V, L49F, L49H, L49R, L49P, G50C, G50W, G50P, G50R, G50D, G50V, G50S, G50A, Q51E, Q51K, Q51H, Q51P, Q51R, Q51L, N52T, N52G, N52H, N52K, N52I, N52D, N52S, S53G, S53M, S53*, S53T, S53P, S53W, S53L, Q54K, Q54*, Q54H, Q54L, Q54P, Q54R, S55T, S55Y, S55A, S55P, S55F, S55C, P56A, P56*, P56S, P56R, P56H, P56L, P56Q, T57S, T57A, T57I, S58Y, S58A, S58L, S58T, S58P, S58F, S58C, N59L, N59I, N59R, N59H, N59D, N59K, N59S, H60D, H60S, H60N, H60Y, H60Q, H60R, H60P, S61T, S61P, S61*, S61L, P62S, P62Q, P62L, T63V, T63F, T63N, T63S, T63A, T63I, S64W, S64P, S64F, S64Y, S64C, C65G, C65F, C65S, C65Y, C65R, P66S, P66L, P66T, P66A, P66Q, P66H, P67T, P67A, P67R, P67S, P67L, P67Q, I68D, I68P, I68V, I68F, I68S, I68N, I68A, I68T, C69F, C69G, C69S, C69R, C69Y, C69W, C69*, P70S, P70H, P70R, P70A, P70T, P70L, G71S, G71D, G71W, G71V, Y72S, Y72D, Y72N, Y72H, Y72F, Y72C, R73L, R73C, R73P, R73H, W74C, W74G, W74R, W74*, W74S, W74L, M75K, M75L, M75R, M75S, M75V, M75T, M75I, C76G, C76*, C76R, C76W, C76S, C76F, C76Y, L77P, L77G, L77M, L77V, L77Q, L77R, R78I, R78G, R78W, R78P, R78L, R78Q, R79P, R79G, R79L, R79C, R79S, R79H, F80G, F80N, F80Y, F80I, F80L, F80S, I81S, I81Y, I81N, I81F, I81M, I81V, I81T, I82F, I82N, I82V, I82M, I82T, I82L, F83Y, F83I, F83L, F83C, F83S, L84F, L84H, L84P, F85Q, F85R, F85P, F85L, F85Y, F85S, F85C, I86M, I86K, I86L, I86F, I86V, I86T, L87M, L87V, L87R, L87Q, L87P, L88R, L88V, L88M, L88Q, L88P, L89V, L89Y, L89T, L89R, L89I, L89Q, L89P, C90G, C90W, C90N, C90I, C90*, C90Y, C90F, C90S, L91G, L91F, L91P, L91R, L91H, I92H, I92N, I92V, I92M, I92L, I92S, I92T, F93R, F93W, F93A, F93Y, F93I, F93L, F93C, F93S, L94G, L94*, L94V, L94F, L94M, L94W, L94S, L95P, L95F, L95M, L95V, L95S, L95*, L95W, V96I, V96E, V96N, V96F, V96L, V96C, V96D, V96A, V96G, L97C, L97R, L97V, L97H, L97F, L97P, V98M, V98W, V98A, V98Q, V98P, V98R, V98L, D99H, D99K, D99V, D99A, D99E, D99N, D99G, Y100P, Y100*, Y100K, Y100D, Y100H, Y100N, Y100L, Y100W, Y100S, Y100F, Y100C, Q101P, 
Q101L, Q101*, Q101K, Q101H, Q101R, G102N, G102R, G102A, G102D, G102C, G102S, G102V, M103K, M103R, M103L, M103T, M103V, M103I, L104Q, L104M, L104G, L104V, L104W, L104S, L104F, P105T, P105S, P105H, P105A, P105L, P105R, V106L, V106R, V106Q, V106D, V106E, V106G, V106I, V106A, C107A, C107W, C107*, C107G, C107S, C107Y, C107R, P108A, P108T, P108S, P108L, P108H, L109R, L109V, L109I, L109Q, L109M, L109P, I110R, I110H, I110V, I110N, I110F, I110S, I110P, I110T, I110M, I110L, A111N, A111R, A111T, A111Q, A111S, A111L, A111P, G112V, G112K, G112N, G112E, G112A, G112R, S113C, S113R, S113F, S113L, S113Y, S113K, S113P, S113N, S113A, S113T, T114V, T114*, T114L, T114R, T114N, T114M, T114K, T114A, T114P, T114S, T115P, T115S, T115A, T115I, T115N, T116S, T116P, T116A, T116I, T116N, S117Y, S117A, S117V, S117R, S117I, S117G, S117N, S117T, T118W, T118N, T118G, T118I, T118E, T118P, T118S, T118K, T118R, T118M, T118A, T118V, G119S, G119W, G119T, G119*, G119C, G119V, G119E, G119R, P120H, P120N, P120R, P120L, P120A, P120Q, P120S, P120T, C121T, C121G, C121F, C121L, C121Y, C121S, C121W, C121R, K122G, K122V, K122E, K122N, K122T, K122I, K122S, K122Q, K122R, T123G, T123V, T123P, T123S, T123N, T123I, T123A, C124G, C124R, C124W, C124Y, C124S, C124F, T125P, T125I, T125A, T125M, T126P, T126Q, T126K, T126G, T126M, T126L, T126V, T126N, T126A, T126S, T126I, P127F, P127R, P127V, P127H, P127S, P127A, P127I, P127T, P127L, A128T, A128R, A128G, A128V, Q129E, Q129K, Q129L, Q129P, Q129N, Q129H, Q129R, G130V, G130*, G130D, G130K, G130A, G130E, G130S, G130R, G130N, N131Q, N131G, N131D, N131H, N131K, N131S, N131I, N131A, N131P, N131T, S132H, S132Y, S132P, S132C, S132F, M133R, M133G, M133K, M133V, M133Q, M133S, M133I, M133L, M133T, F134K, F134D, F134W, F134T, F134Q, F134C, F134R, F134V, F134H, F134S, F134L, F134N, F134I, F134Y, P135S, P135R, P135T, P135H, P135L, S136T, S136C, S136A, S136L, S136P, S136F, S136Y, C137S, C137R, C137*, C137Y, C137W, C138S, C138*, C138W, C138G, C138R, C138Y, C139S, C139G, C139*, C139Y, 
C139W, C139R, T140P, T140K, T140M, T140A, T140L, T140I, T140S, K141N, K141Q, K141T, K141I, K141R, K141E, P142T, P142A, P142H, P142R, P142I, P142S, P142L, T143W, T143F, T143A, T143P, T143M, T143L, T143S, D144Y, D144N, D144V, D144A, D144G, D144E, G145D, G145V, G145I, G145S, G145K, G145E, G145A, G145R, N146I, N146T, N146K, N146S, N146D, C147W, C147S, C147R, C147G, C147Y, T148S, T148I, T148A, C149S, C149G, C149W, C149Y, C149R, I150S, I150N, I150M, I150V, I150T, P151S, P151R, P151T, P151L, P151H, I152H, I152F, I152L, I152T, I152V, P153A, P153T, P153L, P153Q, P153S, S154Q, S154A, S154H, S154V, S154T, S154P, S154L, S155T, S155A, S155Y, S155F, S155P, W156S, W156C, W156G, W156R, W156*, W156L, A157S, A157N, A157D, A157P, A157V, A157T, A157G, F158I, F158Y, F158S, F158L, A159P, A159Q, A159L, A159S, A159T, A159R, A159E, A159V, A159G, K160Q, K160T, K160E, K160G, K160S, K160N, K160R, Y161K, Y161C, Y161V, Y161H, Y161I, Y161L, Y161S, Y161F, L162S, L162*, L162R, L162V, L162I, L162P, L162Q, W163G, W163C, W163R, E164*, E164K, E164A, E164V, E164D, E164G, W165Q, W165*, W165G, W165C, W165S, W165R, W165L, A166S, A166D, A166P, A166T, A166V, A166G, S167Y, S167E, S167T, S167G, S167P, S167L, V168L, V168I, V168F, V168G, V168D, V168S, V16887T, V168P, V168A, R169G, R169L, R169S, R169C, R169P, R169H, F170V, F170L, F170S, S171N, S171Y, S171C, S171L, S171P, S171F, W172M, W172F, W172G, W172R, W172S, W172L, W172C, W172*, P173I, P173Y, P173S, P173V, P173F, P173L, S174G, S174K, S174C, S174R, S174T, S174I, S174N, L175I, L175*, L175F, L175S, L176I, L176V, L176Q, L176P, V177P, V177I, V177S, V177G, V177E, V177M, V177L, V177A, P178G, P178R, P178H, P178A, P178L, P178S, P178Q, F179I, F179C, F179L, F179Y, F179S, V180P, V180L, V180E, V180F, V180I, V180A, Q181K, Q181E, Q181L, Q181H, Q181R, W182F, W182G, W182S, W182R, W182C, W182L, W182*, F183V, F183Y, F183I, F183L, F183S, F183C, V184L, V184P, V184D, V184F, V184I, V184E, V184G, V184A, G185K, G185L, G185W, G185R, G185A, G185E, L186M, L186A, L186R, L186C, L186S, 
L186I, L186V, L186F, L186P, L186H, S187A, S187T, S187C, S187P, S187F, S187L, P188A, P188S, P188R, P188T, P188H, P188L, T189L, T189N, T189A, T189S, T189P, T189I, V190S, V190E, V190P, V190D, V190G, V190I, V190F, V190A, W191K, W191S, W191*, W191G, W191L, W191R, W191C, L192R, L192H, L192V, L192F, L192P, S193*, S193T, S193P, S193F, S193L, A194G, A194D, A194S, A194T, A194F, A194L, A194I, A194V, I195K, I195V, I195L, I195T, I195M, W196G, W196V, W196K, W196R, W196*, W196S, W196L, M197L, M197K, M197R, M197V, M197I, M197T, M198L, M198V, M198R, M198K, M198T, M198I, W199P, W199C, W199R, W199G, W199S, W199*, W199L, Y200H, Y200*, Y200L, Y200N, Y200S, Y200C, Y200F, W201C, W201L, W201G, W201S, W201R, W201*, G202V, G202E, G202R, G202A, P203H, P203T, P203G, P203S, P203A, P203L, P203Q, P203R, S204Q, S204H, S204C, S204D, S204I, S204T, S204G, S204K, S204R, S204N, L205Q, L205R, L205V, L205P, L205M, Y206G, Y206P, Y206Q, Y206*, Y206I, Y206W, Y206V, Y206D, Y206R, Y206L, Y206N, Y206S, Y206F, Y206H, Y206C, N207G, N207K, N207C, N207D, N207P, N207H, N207I, N207T, N207R, N207S, I208L, I208C, I208F, I208S, I208V, I208N, I208T, L209K, L209A, L209F, L209*, L209G, L209M, L209S, L209W, L209V, S210M, S210G, S210I, S210T, S210K, S210R, S210N, P211A, P211T, P211S, P211R, P211L, P211H, F212C, F212S, F212L, F212Y, I213A, I213V, I213S, I213F, I213T, I213M, I213L, P214G, P214R, P214T, P214H, P214S, P214Q, P214L, L215D, L215I, L215V, L215T, L215M, L215Q, L215R, L215P, L216T, L216G, L216C, L216R, L216Y, L216V, L216I, L216S, L216F, L216*, P217A, P217T, P217Q, P217S, P217L, I218P, I218K, I218S, I218F, I218N, I218T, I218L, F219C, F219I, F219L, F219S, F220V, F220I, F220S, F220W, F220Y, F220L, F220C, C221V, C221R, C221W, C221L, C221S, C221G, C221F, C221Y, H222Q, H222V, H222S, H222R, H222F, H222I, H222P, H222L, W223F, W223G, W223C, W223R, W223*, W223L, V224L, V224I, V224E, V224G, V224A, Y225A, Y225*, Y225N, Y225D, Y225L, Y225I, Y225H, Y225C, Y225F, Y225S, I226R, I226H, I226L, I226F, I226N, I226V, I226T, I226M, 
I226S, | |||||
In the list of candidate escape mutations, a label such as 1R gives the position of the mutation ('1') followed by the substituted amino acid (R). ∆ Genotype C sequences were excluded from the analysis. ∆* Genotype B, C, D, E, F, G, H and I sequences were excluded from this analysis. * = stop codon.
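Mutation labels in the table follow the pattern wild-type residue, position, new residue (for example L12Q), with `*` or `Stop` marking a stop codon. A minimal sketch of parsing such labels and applying one to a peptide is given below. The names `Mutation`, `parse_mutation` and `apply_mutation` are hypothetical, and the unmutated peptide in the example is assumed from the 1–15 row of the table.

```python
import re
from typing import NamedTuple


class Mutation(NamedTuple):
    wild_type: str
    position: int
    mutant: str  # '*' denotes a stop codon


_LABEL = re.compile(r"^([A-Z])(\d+)([A-Z]|\*|Stop)$", re.IGNORECASE)


def parse_mutation(label: str) -> Mutation:
    """Parse labels such as 'L12Q', 'I4Stop' or 'W156*'."""
    match = _LABEL.match(label.strip())
    if match is None:
        raise ValueError(f"unrecognised mutation label: {label!r}")
    wild_type, position, mutant = match.groups()
    mutant = "*" if mutant.lower() == "stop" else mutant.upper()
    return Mutation(wild_type.upper(), int(position), mutant)


def apply_mutation(peptide: str, start: int, mutation: Mutation) -> str:
    """Substitute one residue in a peptide whose first residue is at 1-based position `start`."""
    index = mutation.position - start
    if not 0 <= index < len(peptide):
        raise ValueError("mutation lies outside the peptide")
    if peptide[index] != mutation.wild_type:
        raise ValueError("wild-type residue does not match the peptide")
    return peptide[:index] + mutation.mutant + peptide[index + 1:]


# Assumed unmutated residues 1–14, taken from the 1–15 row of the table.
mutated = apply_mutation("MENITSGFLGPLLV", 1, parse_mutation("L12Q"))
print(mutated)
```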
Summary of PreS1 epitopes binding to respective alleles, and mutations that exist in the Botswana HBV sequences.
| Epitope Position | Core AA Sequence | HLA_ DRB*1/5_ Genotype A | Epitopes Position | Core AA Sequence | HLA_ DRB*1/5_ Genotype D | Mutations Relative to Protein: (WT, aa#, Mut) | |
|---|---|---|---|---|---|---|---|
| A | D | ||||||
| 84–92 | ILATVPAVP ¥ | *0802 | 73–81 | ILQTLPANP | I84V, I84T, A86V, A86T, V88M, V88L, A90P, A90V, P92L | I73M, L75H, L77M, L77V, A79E | |
| 85–93 | LATVPAVPP ¥ | *0802 | 74–82 | LQTLPANPP | I84V, I84T, A86V, A86T, V88M, V88L, A90P, A90V, P92L | I73M, L75H, L77M, L77V, A79E | |
| 34–42 | FGANSNNPD | 23–30 | FRANTANP ¥ | *0401 | - | R24K | |
| 16–24 | LSVPNPLGF | 5–13 | LSTSNPLGF ¥ | *0401, *1302 | - | - | |
| 24–32 | FFPDHQLDP | 13–21 | FFPDHQLDP | *0401, | F25L | - | |
| 63–71 | FGPGFTPPH | 52–60 | FGLGFTPPH | *0101, *0401, *0701, 5*0101 | F67L | L54M, L54P, F56L | |
| 34–42 | FGANSNNPD | 23–30 | FRANTANPD | *0101, *0401, *0802, *1302 | - | R24K | |
| 67–75 | FTPPHGGVL | 56–64 | FTPPHGGLL | *0101, *0701, | F67L, G73R | - | |
| 28–36 | HQLDPAFGA | 17–25 | HQLDPAFRA | *0301, *0401, | - | R24K | |
| 84–92 | ILATVPAVP | *0101, *0301, *0401, *0701, *0802, *1101, *1302, *1501, 5*0101 | 73–81 | ILQTLPANP | *0101, *0301, *0401, *0701, *0802,*1302, | I84V, I84T, A86V, A86T, V88M, V88L, A90P, A90V, P92L | I73M, L75H, L77M, L77V, A79E |
| 85–93 | LATVPAVPP | *0401, *0802, 5*0101 | 74–82 | LQTLPANPP | *0802, 5*0101 | I84V, I84T, A86V, A86T, V88M, V88L, A90P, A90V, P92L | I73M, L75H, L77M, L77V, A79E |
| 74–82 | LLGWSPQAQ | *1501 | 63–71 | *0101, *0401, *0802, *1501 | W77stop, S78R, P79S, A81S, A81P | S67N, P68S | |
| 16–24 | LSTSNPLGF | 5–13 | *0301, *0701, *1302 | - | - | ||
| 12–21 | MGTNLSVPN | *0401, *0802, *1302 | 1–10 | MGQNLSTSNP | - | G2E | |
| 15–22 | NLSVPNPLG | *0401, | 4–12 | NLSTSNPLGF | - | - | |
| 83–90 | QGILATVPA | *0101, *0802 | 71–79 | QGILQTLPA | *0101, | G83D, I84V, I84K, A86T, A86V, V88L, V88M, A90T, A90V | I73M, L75H, L77M, L77V, A79E |
| 14–22 | TNLSVPNPL | *0101, *0701, *1302 | 3–11 | QNLSTSNPL | *0101, *0401, | - | - |
| 4–12 | WSAKPRKGM | *1101, 5*0101 | - | S5P, A6S, A6T, K10N, K10I | - | ||
| 77–85 | WSPQAQGIL | *0101, *0701, | 66–75 | WSPQAQGIL | *0101, *0701, | W77stop, S78R, P79S, A81P, A81S, G83D, I84V, I84K | P68S, |
*0101 means HLA class II allele DRB1*0101, etc. 5*0101 means HLA class II allele DRB5*0101. ¥ indicates SB.
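The table pairs each core epitope position range with mutations observed in the same region. A small sketch of checking which listed mutations fall inside a given core window follows; the function name `mutations_in_window` is hypothetical, and only the position number in each label is used.

```python
import re
from typing import List


def mutations_in_window(start: int, end: int, mutations: List[str]) -> List[str]:
    """Return the mutation labels whose position lies within [start, end] (1-based, inclusive)."""
    hits = []
    for label in mutations:
        match = re.search(r"\d+", label)
        if match and start <= int(match.group()) <= end:
            hits.append(label)
    return hits


# Genotype A core 63–71 (FGPGFTPPH) and two mutations listed in the table.
print(mutations_in_window(63, 71, ["F67L", "G73R"]))
```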
Summary of PreS2 epitopes binding to respective alleles, and mutations that exist within the Botswana HBV sequences.
| Genotype | AA Sequence Core | AA Sequence of | Pres2 | Count of | HLA Class II Alleles | Mutations in the Core Sequence |
|---|---|---|---|---|---|---|
| A | ALQDPRVRG | AFHQALQDPRVRGLYFPA | 7–24 | 7 | *0301∆ | R16K, V17I |
| A | ASHISSISS | VPNTASHISSISSRT | 35–49 | *0401 | A39T, I45S | |
| D | ASPLSSIFS | VPTTASPLSSIFSRIG | 35–50 | 2 | *0101, *0401 | L42I, L42S |
| A | FHQALQDPR | NSTAFHQALQDPRVRG | 4–19 | 2 | *0301, 5*0101 | A11T, R16K |
| D | FHQTLQDPR | NSTTFHQTLQDPRVR | 4–18 | 1 | *0301 | - |
| D | FSRIGDPAL | SPLSSIFSRIGDPALN | 40–55 | 1 | *0101, *0401, *0701, 5*0101 | R48T, P52H, A53V, L54P |
| A | HISSISSRT | VPNTASHISSISSRTG | 35–50 | 4 | *0101, *0401, *0701 | I45S, R48T |
| D | IFSRIGDPA | SPLSSIFSRIGDPALN | 40–55 | 2 | *0802, *1302, *1501 | L42I, L42S |
| A | ISSISSRTG | PNTASHISSISSRTGDPALN | 36–55 | 6 | *0101, *0401, *0701, *0802, *1101, *1501, 5*0101 | I45S, R48T |
| A | LNPVPNTAS | SSSGTLNPVPNTASHISSI | 27–45 | 5 | *0401, *0802 | L32H, L32R, P34L, N37H, N37T, I38T, A38T |
| D | LSSIFSRIG | PTTASPLSSIFSRIGDPALN | 36–55 | 6 | *0101, *0401, *0701, *0802, *1101, *1501, 5*0101 | P52L, A53V |
| A | MQWNSTAFH | MQWNSTAFHQALQDP | 1–15 | 1 | *1302 | Q2I, A7T |
| MQWNSTTFHQTLQDP | - | - | S5F, S5Y, T6A, T7I, T7N | |||
| D | PLSSIFSRI | VPTTASPLSSIFSRI | 35–49 | 1 | *0701 | L42I, L42S |
| A | PVPNTASHI | SSGTLNPVPNTASHISSI | 27–45 | 6 | *1302 | P34L, N37H, N37T |
| D | PVPTTASPL | VNPVPTTASPLSSIF | 32–46 | 2 | *0701 | P34H, P36L, L42I, L42S |
| A | QALQDPRVR | TAFHQALQDPRVRGLYF | 6–22 | 3 | 5*0101 | - |
| D | QTLQDPRVR | STTFHQTLQDPRVRGLYF | 5–22 | 4 | 5*0101 | R22K |
| D | TLQDPRVRG | STTFHQTLQDPRVRGLYFPA | 6 | *0301 | R18K, G19D, G19A | |
| A | VRGLYFPAG | DPRVRGLYFPAGGSSSG | 14–30 | 3 | *0802, *1101, *1501 | V17I, Y21N, Y21S, F22L, F22P, F22T, P23A |
| D | 3 | *0802, *1101, *1501 | R18K, G19D, G19A, F22Q, F22H, F22P, F22L | |||
| A | WNSTAFHQA | MQWNSTAFHQALQDP | 1–15 | 1 | *0401, *0701 | A7T, A11T |
| D | WNSTTFHQT | MQWNSTTFHQTLQDP | 1 | *0701 | S5F, S5Y, T6A, T7I, T7N | |
| A | YFPAGGSSS | PRVRGLYFPAGGSSSGTLNP | 15–34 | 6 | *0101, *0401, *1101, 5*0101 | Y21N, Y21S, F22L, F22P, F22T, P23A, S29L |
| D | YFPAGGSSS | PRVRGLYFPAGGSSSGTVNP | 6 | *0101, *0401, 5*0101, *1101 | F22Q, F22H, F22P, F22L |
*0101 means HLA class II allele DRB1*0101, etc. 5*0101 means HLA class II allele DRB5*0101.
Figure 4. Pareto analysis applied to rank the Tepi of alleles against their percentage frequency.
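A Pareto ranking of this kind sorts categories by count in descending order and tracks the cumulative percentage. The sketch below shows the general technique; the per-allele counts are hypothetical placeholders and are not values from the study.

```python
from typing import Dict, List, Tuple


def pareto_rank(counts: Dict[str, int]) -> List[Tuple[str, int, float]]:
    """Sort categories by count (descending) and attach the cumulative percentage."""
    total = sum(counts.values())
    if total == 0:
        return []
    ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    rows, running = [], 0
    for name, count in ranked:
        running += count
        rows.append((name, count, round(100.0 * running / total, 1)))
    return rows


# Hypothetical Tepi counts per allele, for illustration only.
example = {"DRB1*0101": 50, "DRB1*0401": 30, "DRB5*0101": 20}
for name, count, cumulative in pareto_rank(example):
    print(f"{name}\t{count}\t{cumulative}")
```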