Li Chuin Chong, Asif M Khan.
Abstract
BACKGROUND: The sequence diversity of dengue virus (DENV) is one of the challenges in developing an effective vaccine against the virus. Highly conserved, serotype-specific (HCSS), immune-relevant DENV sequences are attractive candidates for vaccine design, and represent an alternative to the approach of selecting pan-DENV conserved sequences. High conservation aims to limit the number of possible cross-reactive epitope variants in the population, while serotype specificity aims to limit the cross-reactivity between the serotypes to favour a serotype-specific response. Herein, we performed a large-scale systematic study to map and characterise HCSS sequences in the DENV proteome.
Keywords: Cross-reactivity; Dengue virus; Entropy; Immune targets; Mutual information; Sequence conservation; Serotype-specific; Vaccine design
Year: 2019 PMID: 31874646 PMCID: PMC6929274 DOI: 10.1186/s12864-019-6311-z
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Fig. 1 Overview of the methodology employed for the identification and analyses of highly conserved, serotype-specific (HCSS) DENV sequences
Number and distribution of redundant (R) and non-redundant (NR) reported DENV protein sequences in 2007 and 2018
| Protein | DENV1 2018R | DENV1 2018NR | DENV2 2018R | DENV2 2018NR | DENV3 2018R | DENV3 2018NR | DENV4 2018R | DENV4 2018NR | 2007R | 2018R | 2018NR | Increase, # (a) | Increase, % (a) | Reduction, # (b) | Reduction, % (b) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| C | 3566 | 322 | 2061 | 312 | 1736 | 293 | 454 | 114 | | | | 6539 | 511% | 6776 | 86.68% |
| prM | 2651 | 364 | 2376 | 329 | 1787 | 168 | 659 | 89 | | | | 5943 | 388% | 6523 | 87.29% |
| E | 2329 | 1074 | 5269 | 1533 | 2950 | 933 | 1724 | 543 | | | | 8427 | 219% | 8189 | 66.73% |
| NS1 | 2470 | 491 | 2190 | 488 | 1314 | 306 | 397 | 114 | | | | 4587 | 257% | 4972 | 78.04% |
| NS2a | 1982 | 411 | 1535 | 349 | 1012 | 207 | 334 | 97 | | | | 4158 | 589% | 3799 | 78.12% |
| NS2b | 1978 | 155 | 1537 | 126 | 1019 | 87 | 259 | 38 | | | | 4179 | 680% | 4387 | 91.53% |
| NS3 | 1976 | 404 | 1578 | 384 | 1204 | 309 | 276 | 92 | | | | 4339 | 624% | 3845 | 76.38% |
| NS4a | 1949 | 141 | 1519 | 114 | 993 | 84 | 241 | 40 | | | | 4179 | 799% | 4323 | 91.94% |
| NS4b | 1952 | 193 | 1524 | 261 | 999 | 97 | 319 | 77 | | | | 4192 | 696% | 4166 | 86.90% |
| NS5 | 2021 | 742 | 1995 | 749 | 1334 | 494 | 421 | 149 | | | | 4943 | 596% | 3637 | 63.02% |
R: Number of redundant sequences collected from the National Center for Biotechnology Information (NCBI) Taxonomy database in December 2007 [35] and April 2018
NR: Number of non-redundant sequences after removal of duplicate sequences (full length and partial)
a: Number and percentage increase in redundant sequences from 2007 [35] to 2018
b: Number and percentage reduction in sequences for the 2018 dataset as a result of the removal of duplicate sequences; percentages rounded to two decimal places
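The reduction columns can be checked directly from the per-serotype counts. The sketch below is only an arithmetic illustration for protein C, assuming that the reduction is the summed redundant count minus the summed non-redundant count across the four serotypes; the function and variable names are hypothetical.

```python
# 2018 redundant (R) and non-redundant (NR) counts for protein C, copied from the table.
redundant_c = {"DENV1": 3566, "DENV2": 2061, "DENV3": 1736, "DENV4": 454}
non_redundant_c = {"DENV1": 322, "DENV2": 312, "DENV3": 293, "DENV4": 114}


def duplicate_reduction(redundant, non_redundant):
    """Return (sequences removed, percentage removed) across all serotypes.

    Assumes the reduction is total R minus total NR, which is an
    interpretation of the table rather than a documented formula.
    """
    total_r = sum(redundant.values())
    total_nr = sum(non_redundant.values())
    removed = total_r - total_nr
    return removed, round(100 * removed / total_r, 2)


removed_c, percent_c = duplicate_reduction(redundant_c, non_redundant_c)
print(removed_c, percent_c)  # 6776 86.68, matching the reduction reported for C
```

Under this reading, the same function reproduces the reduction reported for the other proteins when given their rows.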
Fig. 2 Sequence diversity of DENV proteomes, within (top four) and across (bottom) the four serotypes. Shannon's entropy values were computed from the alignments of DENV sequences using the tool AVANA, as described in the Methods. Centre positions of the nonamers, rather than starting positions, were used for this plot (starting positions are used everywhere else); thus, the first and last four positions in the alignment of each protein were not assigned any peptide entropy value, as they cannot be the centre of a nonamer
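As an illustration of the quantity plotted, a minimal sketch of nonamer Shannon entropy over a protein alignment follows. It is not AVANA's implementation: the function names are hypothetical, and skipping windows that contain gaps or unknown residues is an assumption. The profile is keyed by centre position, which is why the first and last four alignment positions receive no value.

```python
from collections import Counter
from math import log2


def nonamer_entropy(aligned_seqs, start, k=9):
    """Shannon entropy (bits) of the k-mers starting at 0-based column `start`."""
    windows = [seq[start:start + k] for seq in aligned_seqs]
    # Assumption: windows with gaps ('-') or unknown residues ('X') are ignored.
    windows = [w for w in windows if len(w) == k and "-" not in w and "X" not in w]
    counts = Counter(windows)
    total = sum(counts.values())
    return sum(-(c / total) * log2(c / total) for c in counts.values())


def entropy_by_centre(aligned_seqs, k=9):
    """Map each 1-based centre position to the entropy of its nonamer window."""
    length = len(aligned_seqs[0])
    half = k // 2
    return {
        start + half + 1: nonamer_entropy(aligned_seqs, start, k)
        for start in range(length - k + 1)
    }


# Toy alignment of four 10-residue sequences (not real DENV data).
toy_alignment = ["ABCDEFGHIK", "ABCDEFGHIK", "ABCDEFGHIL", "ABCDEFGHIL"]
profile = entropy_by_centre(toy_alignment)
print(profile)  # only centre positions 5 and 6 receive a value
```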
Number of highly conserved, serotype-specific (HCSS) nonamers
| Protein / Serotype | DENV1 | DENV2 | DENV3 | DENV4 |
|---|---|---|---|---|
| C | 0 | 0 | 0 | 1 |
| prM | 5 | 1 | 9 | 11 |
| E | 11 | 65 | 81 | 149 |
| NS1 | 32 | 29 | 77 | 88 |
| NS2a | 15 | 10 | 37 | 35 |
| NS2b | 1 | 15 | 3 | 44 |
| NS3 | 110 | 158 | 87 | 223 |
| NS4a | 3 | 3 | 12 | 17 |
| NS4b | 55 | 41 | 39 | 60 |
| NS5 | 227 | 143 | 220 | 204 |
Fig. 3 Scatter plot of entropy and mutual information (MI) values for all nonamer positions of the proteins of each DENV serotype. The boxed region (MI > 0.8 and entropy < 0.25) marks the cut-off thresholds selected for identification of HCSS nonamers
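The cut-off in the boxed region can be sketched as a simple filter. The example below makes a loudly stated assumption: MI is computed between nonamer identity and a binary label (one serotype versus the rest), so that it is bounded by 1 bit. The tool's actual definition may differ, and all function names here are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(items):
    """Shannon entropy (bits) of the empirical distribution of `items`."""
    counts = Counter(items)
    total = len(items)
    return sum(-(c / total) * log2(c / total) for c in counts.values())


def mutual_information(nonamers, labels):
    """MI (bits) between nonamer identity and label: H(X) + H(Y) - H(X, Y)."""
    return entropy(nonamers) + entropy(labels) - entropy(list(zip(nonamers, labels)))


def is_hcss(within_serotype_entropy, mi, max_entropy=0.25, min_mi=0.8):
    """Apply the cut-off from the figure: MI > 0.8 and entropy < 0.25."""
    return mi > min_mi and within_serotype_entropy < max_entropy


# Toy data (not real DENV nonamers): the target serotype carries one nonamer
# that never occurs in the other serotypes.
target = ["AAAAAAAAA"] * 4
others = ["BBBBBBBBB", "BBBBBBBBB", "CCCCCCCCC", "DDDDDDDDD"]
labels = ["target"] * 4 + ["other"] * 4  # assumed one-versus-rest labelling

mi = mutual_information(target + others, labels)
print(is_hcss(entropy(target), mi))  # True: conserved within, distinct between
```

A nonamer shared by all serotypes gives an MI of 0 under this definition and is rejected by the filter.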
Fig. 4 DENV proteome map of highly conserved, serotype-specific (HCSS) sequences. The width of the boxes corresponds to the length of the proteins. Coloured boxes represent the location of the HCSS sequences within each serotype: red, DENV1; yellow, DENV2; blue, DENV3; and green, DENV4. The dotted rectangular boxes represent regions of the proteome where distinct HCSS sequences corresponded across the four serotypes
Nonamer positions depicting amino acid differences between an HCSS nonamer and the corresponding variants, within and between the serotypes. Only positions with a mutual information value of 1 and low entropy values are shown. HCSS nonamers are shown in yellow, and one is arbitrarily chosen as the reference when more than one corresponding HCSS nonamer is present. Data for two additional positions are shown in Additional file 4: Table S4
| Serotype | NS1 start | NS1 nonamer (entropy 0.12) | NS1 end | NS2a start | NS2a nonamer (entropy 0.11) | NS2a end |
|---|---|---|---|---|---|---|
| HCSS Reference | | | | | | |
| DENV1 | 146 | Q....IW.. | 154 | 202 | CK.LTM..I | 210 |
| | 146 | Q....VW.. | 154 | 202 | CK.LPM..I | 210 |
| | 146 | L....IW.. | 154 | 202 | CK.LTML.I | 210 |
| | 146 | H....IW.. | 154 | 202 | | 210 |
| | 146 | Q....IWK. | 154 | 202 | CK.LTMY.I | 210 |
| | 146 | Q..S.IW.. | 154 | 202 | CK.L.M..I | 210 |
| | 146 | Q....IW.G | 154 | 202 | | 210 |
| | 146 | Q....I... | 154 | 202 | CK.LTM..V | 210 |
| | 146 | Q...TIW.. | 154 | 202 | CK.STM..I | 210 |
| | | | | 202 | C..LTM..I | 210 |
| | | | | 202 | SK.LTM..I | 210 |
| | | | | 202 | CK.LTMYFI | 210 |
| | | | | 202 | CKTLTM..I | 210 |
| DENV2 | 146 | ......... | 154 | 202 | | 210 |
| | 146 | S........ | 154 | 202 | I | 210 |
| | 146 | .......K. | 154 | 202 | S | 210 |
| | 146 | .......KG | 154 | 202 | L | 210 |
| | 146 | ..T.D.... | 154 | 202 | YF | 210 |
| | 146 | .......KL | 154 | | | |
| DENV3 | 146 | S....VW.. | 154 | 202 | VP.LPL.IF | 210 |
| | 146 | A....VW.. | 154 | 202 | VP.LPLLIF | 210 |
| | 146 | L....VW.. | 154 | 202 | IP.LPL.IF | 210 |
| | 146 | S..L.VW.. | 154 | 202 | .P.LPL.IF | 210 |
| | | | | 202 | VQ.LPL.IF | 210 |
| | | | | 202 | VS.LPL.IF | 210 |
| | | | | 202 | VP.SPL.IF | 210 |
| | | | | 202 | VPSLPL.IF | 210 |
| | | | | 202 | AQ.LPL.IF | 210 |
| DENV4 | 146 | R........ | 154 | 202 | AQALPVY.M | 210 |
| | 146 | R....F... | 154 | | | |
| | 146 | R.....F.. | 154 | | | |
| | 146 | R....FF.. | 154 | | | |
HLA-A, -B and -DR supertype-restricted T-cell epitopes, predicted for HCSS nonamers, summarised according to DENV protein and serotypes
HLA-A (A1, A2, A3) and HLA-B (B7, B27, B44, B58, B62) supertypes are MHC class I; HLA-DR supertypes (Main DR, DR4, DRB3) are MHC class II.

| Protein | Serotype | A1 | A2 | A3 | B7 | B27 | B44 | B58 | B62 | Main DR | DR4 | DRB3 | Total (a) | Non-redundant Total (a) | Total (b) | Non-redundant Total (a) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| prM | DENV3 | – | – | – | – | – | – | – | – | – | 1 | – | | | | |
| | DENV4 | 1 | – | – | – | – | – | – | 1 | 1 | 1 | – | | | | |
| E | DENV1 | – | – | – | – | – | – | 1 | – | – | – | – | | | | |
| | DENV2 | – | 1 | 2 | – | 1 | 1 | – | – | – | 1 | – | | | | |
| | DENV3 | – | – | 1 | – | – | 1 | 1 | – | – | 3 | – | | | | |
| | DENV4 | – | 2 | 2 | 1 | – | 3 | 3 | – | 1 | – | – | | | | |
| NS1 | DENV1 | – | – | – | – | – | – | – | – | 1 | 1 | – | | | | |
| | DENV2 | – | – | – | – | – | – | 2 | – | – | – | – | | | | |
| | DENV3 | – | – | 1 | – | – | 1 | 1 | – | – | – | – | | | | |
| | DENV4 | – | 1 | – | 1 | 1 | 1 | 2 | 1 | – | – | – | | | | |
| NS2a | DENV1 | – | – | – | – | – | 1 | 2 | – | – | – | – | | | | |
| | DENV3 | – | – | 1 | – | 1 | – | 1 | – | 3 | 3 | – | | | | |
| | DENV4 | – | – | – | 2 | 1 | – | – | – | 3 | 3 | 2 | | | | |
| NS2b | DENV2 | – | – | 1 | – | – | – | – | – | 2 | – | – | | | | |
| | DENV4 | 1 | 2 | – | – | – | – | 1 | 1 | 2 | 1 | – | | | | |
| NS3 | DENV1 | 2 | – | – | – | – | – | 2 | 2 | 1 | 1 | 2 | | | | |
| | DENV2 | 1 | 1 | 2 | 1 | – | 5 | – | 1 | 1 | – | – | | | | |
| | DENV3 | 1 | – | – | 1 | – | – | – | – | – | – | – | | | | |
| | DENV4 | 1 | 2 | – | 3 | – | 1 | 2 | 1 | – | 1 | 1 | | | | |
| NS4a | DENV1 | – | – | – | 1 | – | – | – | – | – | – | – | | | | |
| | DENV2 | – | – | – | 1 | – | – | – | – | – | – | – | | | | |
| | DENV3 | – | – | – | – | – | 1 | – | – | – | – | – | | | | |
| | DENV4 | – | 1 | – | – | – | – | – | – | – | – | 1 | | | | |
| NS4b | DENV1 | – | 1 | – | – | – | – | 1 | – | 1 | 1 | 1 | | | | |
| | DENV2 | – | 1 | – | 1 | 1 | – | 2 | 2 | 2 | 1 | – | | | | |
| | DENV3 | – | 1 | – | – | – | – | – | – | – | – | – | | | | |
| | DENV4 | – | – | 2 | – | – | – | 1 | 1 | 1 | 1 | 1 | | | | |
| NS5 | DENV1 | 1 | – | 3 | 2 | 1 | 2 | 5 | 2 | 2 | 3 | 2 | | | | |
| | DENV2 | – | 1 | 2 | – | – | 1 | – | – | 1 | 1 | 2 | | | | |
| | DENV3 | 1 | 1 | 2 | – | 1 | 1 | – | – | 1 | 2 | 2 | | | | |
| | DENV4 | 1 | 1 | 2 | 1 | 1 | 2 | 5 | 3 | 1 | 1 | 1 | | | | |
| Total (c) | | | | | | | | | | | | | | | | |
| Gene level total (c) | | | | | | | | | | | | | | | | |
a: Total number of predicted epitopes for each serotype with respect to each protein
b: Total number of predicted epitopes for each protein
c: Total number of predicted epitopes for each supertype
Fig. 5 Visualization of epitope-receptor binding using ClusPro molecular docking. Panel A: a docked complex of a representative putative epitope (DENV4 NS3 355YQGKTVWFV363) and the HLA-A*02:01 receptor (PDB ID: 2GIT). Panel B: docked control, a known peptide-HLA complex (PDB ID: 3SPV). The peptide in either complex is represented by a cyan ‘New Cartoon’ structure, while the HLA receptor is represented by a silver transparent ‘QuickSurf’ and ‘New Cartoon’ (chain α: purple; chain β: yellow). The inset in panel A shows two interactions between the epitope and the HLA receptor (chain α1: blue ‘QuickSurf’ background; chain α2: red ‘QuickSurf’ background) within the cut-off distance of 5.0 Å, measuring 4.30 Å and 4.72 Å
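The distance cut-off described in the caption can be illustrated with a short sketch. It is not the docking or visualisation workflow itself: the atom labels and coordinates are hypothetical placeholders, and treating the 5.0 Å cut-off as inclusive is an assumption.

```python
from math import dist


def contacts_within(peptide_atoms, receptor_atoms, cutoff=5.0):
    """List (peptide atom, receptor atom, distance in Å) for pairs within `cutoff`."""
    hits = []
    for p_label, p_xyz in peptide_atoms.items():
        for r_label, r_xyz in receptor_atoms.items():
            d = dist(p_xyz, r_xyz)  # Euclidean distance between two 3D points
            if d <= cutoff:
                hits.append((p_label, r_label, round(d, 2)))
    return hits


# Hypothetical coordinates in Å, for illustration only (not taken from 2GIT or 3SPV).
peptide = {"pep:CA": (0.0, 0.0, 0.0)}
receptor = {"rec:N": (3.0, 0.0, 0.0), "rec:O": (0.0, 0.0, 6.0)}

contacts = contacts_within(peptide, receptor)
print(contacts)  # only the pair 3.0 Å apart is kept
```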
Fig. 6 IEDB-reported DENV T-cell epitopes/ligands in humans that completely matched HCSS sequences
Reported epitopes that matched the predicted epitopes of HCSS sequences for structural protein E. Full data for other DENV proteins are provided in Additional file 8: Table S8
| Protein | Serotype | Matched Epitopes (Starting-Ending Position) | HCSS Sequence | Supertype Predicted | Supertype Reported (IEDB) |
|---|---|---|---|---|---|
| E | DENV1 | 204-212 | 204 KSWLVHKQWFKTAHAKKQE 249 | B58 | B58: HLA-B*57:01, HLA-B*58:01 |
| | DENV2 | 210-218, 213-221 | 210 RQWFLDLPLPWLPG 223# | A2 | A2: HLA-A*02:06, HLA-A*02:01, HLA-A*02:17 |
| | | | | B27 | B27: HLA-B*27:05, HLA-B*48:01 |
| | | | | B44 | B44: HLA-B*40:01 |
| | | 238-246 | 237 LVTFKNPHAKKQDVVVLGSQE 257 | A3 | A3: HLA-A*03:01, HLA-A*11:01, HLA-A*68:01 |
| | | 296-305, 297-305 | 281 GHLKCRLRMDKLQLKGMSYSMCTGKFK 307$ | A3 | A3: HLA-A*03:01, HLA-A*11:01, HLA-A*68:01 |
| | DENV3 | 204-212, 211-220, 212-220 | 204 KAWMVHRQWFFDLPLPW 220# | A24 | A24: HLA-A*23:01, HLA-A*24:02 |
| | | | | B44 | B44: HLA-B*44:03 |
| | | | | B58 | B58: HLA-B*57:01, HLA-B*58:01 |
| | | 238-246 | 234 KELLVTFKNAHAKKQ 248 | A3 | A3: HLA-A*03:01, HLA-A*11:01, HLA-A*68:01 |
| | | 313-321 | 306 FVLKKEVSETQHGTILI 322 | B44 | B44: HLA-B*40:01, HLA-B*44:03 |
| | DENV4 | 51-59, 51-60 | 47 KTTAKEVALLRTYCIEA 63$ | B44 | B44: HLA-B*44:02, HLA-B*44:03 |
| | | 65-73, 82-90 | 65 ISNITTATRCPTQGEPYLKEEQDQQYICRR 94# | A3 | A3: HLA-A*31:01 |
| | | | | B44 | B44: HLA-B*40:01, HLA-B*44:03 |
| | | 164-173, 165-173 | 164 ITPRSPSVEV 173# | A2 | A2: HLA-A*68:02 |
| | | | | B7 | B7: HLA-B*07:02, HLA-B*51:01 |
| | | 204-212 | 204 KTWLVHKQWF 213 | B58 | B58: HLA-B*57:01, HLA-B*58:01 |
| | | 237-246, 238-246, 238-247 | 235 ERMVTFKVPHAKRQDVTVLGSQEGAMHSAL 264$ | A3 | A3: HLA-A*03:01, HLA-A*11:01, HLA-A*68:01 |
| | | 313-321 | 290 EKLRIKGMSYTMCSGKFSIDKEMAETQHGTTVV 322 | B44 | B44: HLA-B*40:01, HLA-B*44:03 |
| | | 412-420 | 391 WFRKGSSIGKMFESTYRGAKRMAILGETAWDFGSVGGL 428 | B58 | B58: HLA-B*57:01, HLA-B*58:01 |
| | | 445-453 | 430 TSLGKAVHQVFGSVYTTMFGGVSWM 454 | B58 | B58: HLA-B*57:01, HLA-B*58:01 |
Highly conserved, serotype-specific (HCSS) sequences with at least two matched (reported and predicted) epitopes (hotspots) are marked according to the restriction they show: $, intra-supertype restriction (epitopes restricted by only one supertype of an HLA gene, e.g. A1 supertype-restricted epitopes); #, inter-HLA gene supertype restriction (epitopes restricted by at least two supertypes of different HLA genes, e.g. A2 and B7 supertype-restricted epitopes)
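The position columns in the table relate each matched epitope to its HCSS sequence through 1-based, inclusive protein coordinates. A minimal sketch of that mapping follows, using two DENV4 E entries from the table; the helper name is hypothetical.

```python
def extract_epitope(hcss_seq, hcss_start, epi_start, epi_end):
    """Return the peptide spanning epi_start..epi_end (1-based, inclusive).

    `hcss_start` is the protein position of the first residue of `hcss_seq`.
    """
    hcss_end = hcss_start + len(hcss_seq) - 1
    if not (hcss_start <= epi_start <= epi_end <= hcss_end):
        raise ValueError("epitope lies outside the HCSS sequence")
    offset = epi_start - hcss_start
    return hcss_seq[offset:offset + (epi_end - epi_start + 1)]


# DENV4 E, HCSS sequence 164-173 with matched epitopes 164-173 and 165-173.
decamer = extract_epitope("ITPRSPSVEV", 164, 164, 173)
nonamer = extract_epitope("ITPRSPSVEV", 164, 165, 173)

# DENV4 E, HCSS sequence 204-213 with matched epitope 204-212.
b58_nonamer = extract_epitope("KTWLVHKQWF", 204, 204, 212)

print(decamer, nonamer, b58_nonamer)  # ITPRSPSVEV TPRSPSVEV KTWLVHKQW
```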
Top 20 candidate HCSS sequences, sorted according to prioritisation criteria
| Protein | Serotype | HCSS Sequence | MI Value | Entropy Value | Number of Epitopes | Number of Supertype Restrictions | Extended HLA Promiscuity | Length of HCSS Sequence |
|---|---|---|---|---|---|---|---|---|
| NS2a | DENV4 | 196 TALILGAQALPVYLMTLMKGAS 217 | 1 | 0.072 | 3 | 2 | Yes | 22 |
| NS2b | DENV4 | 96 TLLVKLALITVSGLYPLAIP 115 | 1 | 0.092 | 5 | 3 | No | 20 |
| NS4b | DENV4 | 241 WNTTIAVSTANIFRGSY 257 | 1 | 0.168 | 2 | 2 | Yes | 17 |
| NS1 | DENV4 | 34 FQPESPARLASAILNAH 50 | 1 | 0.186 | 2 | 1 | No | 17 |
| NS2b | DENV2 | 31 PLVAGGLLTVCYVLTGRSADLELE 54 | 1 | 0.199 | 2 | 1 | No | 24 |
| NS2a | DENV1 | 119 SLEELGDGLAMGIM 132 | 1 | 0.219 | 2 | 1 | No | 14 |
| NS2b | DENV4 | 5 NEGIMAVGLVSLLGSALLKNDVPLAGPMVAGGLLLAAYVMSGS 47 | 0.999 | 0.052 | 8 | 2 | Yes | 43 |
| NS5 | DENV1 | 484 LEFEALGFMNEDHWFSR 500 | 0.999 | 0.052 | 2 | 1 | No | 17 |
| NS5 | DENV1 | 756 GKSYAQMWQLMYFHRRD 772 | 0.999 | 0.064 | 5 | 3 | No | 17 |
| NS5 | DENV1 | 848 CGSLIGLTARATWA 861 | 0.999 | 0.066 | 2 | 1 | No | 14 |
| NS3 | DENV2 | 32 GYSQIGAGVYKEGTFHTMWHVT 53 | 0.999 | 0.133 | 4 | 4 | No | 22 |
| NS3 | DENV4 | 324 DPFPQSNSPIEDIEREIPERSWNTGFDWITDYQGKTVWFVP 364 | 0.999 | 0.136 | 3 | 2 | No | 41 |
| NS3 | DENV2 | 78 SYGGGWKLEGEWKEGEEVQVLALEPGKNPRAVQT 111 | 0.999 | 0.189 | 2 | 1 | No | 34 |
| E | DENV4 | 164 ITPRSPSVEV 173 | 0.998 | 0.155 | 2 | 2 | No | 10 |
| E | DENV4 | 47 KTTAKEVALLRTYCIEA 63 | 0.997 | 0.108 | 2 | 2 | No | 17 |
| E | DENV4 | 65 ISNITTATRCPTQGEPYLKEEQDQQYICRR 94 | 0.997 | 0.168 | 2 | 2 | No | 30 |
| E | DENV4 | 235 ERMVTFKVPHAKRQDVTVLGSQEGAMHSAL 264 | 0.997 | 0.185 | 3 | 1 | No | 30 |
| NS5 | DENV1 | 215 THEMYWVSCGTGNIVSAVNMTSRMLLNRFTM 245 | 0.996 | 0.090 | 2 | 1 | No | 31 |
| NS4b | DENV2 | 207 TQVLMMRTTWALCEALT 223 | 0.996 | 0.168 | 3 | 2 | No | 17 |
| NS3 | DENV1 | 546 WLSYKVASEGFQYSDRRWCFDGERNNQVLEENMDVE 581 | 0.996 | 0.178 | 2 | 1 | No | 36 |
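The row order in the table is consistent with sorting by MI value (descending), then entropy (ascending), then number of epitopes (descending). The sketch below reproduces that order for a shuffled subset of rows. The sort key is inferred from the table and is an assumption; the full prioritisation criteria may also weigh supertype restrictions, promiscuity and length.

```python
# (protein, serotype, start position, MI, entropy, number of epitopes),
# copied from the table and deliberately shuffled.
candidates = [
    ("NS5", "DENV1", 484, 0.999, 0.052, 2),
    ("NS2b", "DENV4", 5, 0.999, 0.052, 8),
    ("NS2a", "DENV4", 196, 1.0, 0.072, 3),
    ("E", "DENV4", 164, 0.998, 0.155, 2),
    ("NS2b", "DENV4", 96, 1.0, 0.092, 5),
    ("NS5", "DENV1", 756, 0.999, 0.064, 5),
]


def priority_key(row):
    """Assumed ordering: higher MI first, then lower entropy, then more epitopes."""
    _, _, _, mi, entropy, n_epitopes = row
    return (-mi, entropy, -n_epitopes)


ranked = sorted(candidates, key=priority_key)
for protein, serotype, start, mi, entropy, n_epitopes in ranked:
    print(protein, serotype, start, mi, entropy, n_epitopes)
```

Using a tuple key keeps the tie-breaking explicit: the two NS2b/NS5 rows with MI 0.999 and entropy 0.052 are separated only by their epitope counts.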