Asif M Khan, A T Heiny, Kenneth X Lee, K N Srinivasan, Tin Wee Tan, J Thomas August, Vladimir Brusic.
Abstract
BACKGROUND: Antigenic diversity in dengue virus strains has been studied, but large-scale and detailed systematic analyses have not been reported. In this study, we report a bioinformatics method for analyzing viral antigenic diversity in the context of T-cell mediated immune responses. We applied this method to study the relationship between short-peptide antigenic diversity and protein sequence diversity of dengue virus. We also studied the effects of sequence determinants on viral antigenic diversity. Short peptides, principally 9-mers, were studied because they represent the predominant length of the binding cores of T-cell epitopes, which are important for vaccine formulation.
Year: 2006 PMID: 17254309 PMCID: PMC1764481 DOI: 10.1186/1471-2105-7-S5-S4
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Collected and unique protein sequences for each dengue serotype in 2004 and 2005 and the corresponding increase in data between the two time points.
| Dengue serotype | Collected sequences, 2004 (#) | Unique sequences, 2004 (#) | Collected sequences, 2005 (#) | Unique sequences, 2005 (#) | Increase in collected sequences (#) | Increase in unique sequences (#) |
| DV1 | 744 | 359 | 2318 | 724 | 1574 | 365 |
| DV2 | 1426 | 507 | 3351 | 697 | 1925 | 190 |
| DV3 | 597 | 230 | 2520 | 678 | 1923 | 448 |
| DV4 | 932 | 222 | 1323 | 320 | 391 | 98 |
Proteins of a representative dengue virus serotype 2 polyprotein entry (P14340 of 3391 amino acids) in the NCBI Entrez protein database.
| Protein | Length (amino acids) |
| Capsid (C) | 114 |
| Precursor membrane (pM) | 166 |
| Envelope (E) | 495 |
| Nonstructural protein 1 (NS1) | 352 |
| Nonstructural protein 2a (NS2a) | 218 |
| Nonstructural protein 2b (NS2b) | 130 |
| Nonstructural protein 3 (NS3) | 618 |
| Nonstructural protein 4a (NS4a) | 150 |
| Nonstructural protein 4b (NS4b) | 248 |
| Nonstructural protein 5 (NS5) | 900 |
Unique sequences for the proteins of the four serotypes in 2004 and 2005.
| Protein | No. of unique sequences, 2004 (all four serotypes) | No. of unique sequences, 2005 (all four serotypes) |
| C | 107 | 196 |
| pM | 126 | 220 |
| E | 495 | 998 |
| NS1 | 150 | 224 |
| NS2a | 95 | 142 |
| NS2b | 59 | 78 |
| NS3 | 80 | 164 |
| NS4a | 37 | 69 |
| NS4b | 57 | 88 |
| NS5 | 112 | 240 |
Minimum and maximum percentage sequence identity range for each dengue protein, intra- and inter-serotype.
| Protein | Serotype | DV1 | DV2 | DV3 | DV4 | Average PSI |
| C | DV1 | 88–99 |  |  |  | 65 |
|  | DV2 | 56–75 | 81–99 |  |  |  |
|  | DV3 | 75–84 | 53–66 | 91–99 |  |  |
|  | DV4 | 61–68 | 57–69 | 54–60 | 94–99 |  |
| pM | DV1 | 92–99 |  |  |  | 68 |
|  | DV2 | 62–75 | 79–99 |  |  |  |
|  | DV3 | 75–82 | 60–72 | 93–99 |  |  |
|  | DV4 | 62–67 | 60–71 | 64–70 | 96–99 |  |
| E | DV1 | 89–99 |  |  |  | 65 |
|  | DV2 | 58–70 | 80–99 |  |  |  |
|  | DV3 | 72–79 | 60–69 | 92–99 |  |  |
|  | DV4 | 58–66 | 55–65 | 61–64 | 94–99 |  |
| NS1 | DV1 | 93–99 |  |  |  | 72 |
|  | DV2 | 68–75 | 85–99 |  |  |  |
|  | DV3 | 77–80 | 69–75 | 94–99 |  |  |
|  | DV4 | 67–70 | 68–73 | 70–74 | 93–99 |  |
| NS2a | DV1 | 90–99 |  |  |  | 39 |
|  | DV2 | 36–40 | 93–99 |  |  |  |
|  | DV3 | 43–48 | 35–40 | 93–99 |  |  |
|  | DV4 | 35–39 | 33–36 | 36–41 | 89–99 |  |
| NS2b | DV1 | 93–99 |  |  |  | 60 |
|  | DV2 | 56–62 | 95–99 |  |  |  |
|  | DV3 | 66–70 | 58–63 | 96–99 |  |  |
|  | DV4 | 56–62 | 54–59 | 56–59 | 94–99 |  |
| NS3 | DV1 | 97–99 |  |  |  | 79 |
|  | DV2 | 78–80 | 96–99 |  |  |  |
|  | DV3 | 84–86 | 79–81 | 97–99 |  |  |
|  | DV4 | 75–77 | 75–77 | 77–79 | 97–99 |  |
| NS4a | DV1 | 92–99 |  |  |  | 60 |
|  | DV2 | 56–61 | 96–99 |  |  |  |
|  | DV3 | 63–68 | 56–63 | 92–99 |  |  |
|  | DV4 | 56–60 | 59–64 | 56–62 | 94–99 |  |
| NS4b | DV1 | 95–99 |  |  |  | 78 |
|  | DV2 | 75–79 | 95–99 |  |  |  |
|  | DV3 | 81–85 | 75–79 | 97–99 |  |  |
|  | DV4 | 75–78 | 77–81 | 76–79 | 97–99 |  |
| NS5 | DV1 | 96–99 |  |  |  | 77 |
|  | DV2 | 77–79 | 95–99 |  |  |  |
|  | DV3 | 80–82 | 77–79 | 96–99 |  |  |
|  | DV4 | 73–76 | 72–75 | 74–77 | 95–99 |  |
The average percentage sequence identities (PSI) are shown for inter-serotype comparisons.
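The record does not state how the identity values were computed. As a minimal sketch, assuming the sequences are already aligned to equal length and that columns containing a gap are excluded from the denominator (one of several common conventions), the minimum and maximum in each cell could be obtained as follows. The function names `percent_identity` and `identity_range` are hypothetical.

```python
from itertools import product


def percent_identity(a, b, gap="-"):
    """Percentage of identical residues over gap-free aligned columns."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    # Assumption: columns with a gap in either sequence are ignored.
    columns = [(x, y) for x, y in zip(a, b) if x != gap and y != gap]
    if not columns:
        return 0.0
    matches = sum(x == y for x, y in columns)
    return 100.0 * matches / len(columns)


def identity_range(group_a, group_b):
    """Minimum and maximum identity over all pairs of distinct sequences.

    Pass the same group twice for an intra-serotype range. Raises
    ValueError if no pair of distinct sequences exists.
    """
    values = [
        percent_identity(a, b)
        for a, b in product(group_a, group_b)
        if a != b
    ]
    return min(values), max(values)
```

For example, `identity_range(["ACDEFGHIKL"], ["ACDEFGHIKV", "ACDEFGHIVV"])` compares one 10-residue toy sequence against two variants that differ at one and two positions.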
Figure 1. Definition of antigenically redundant sequence. A) The three sequences (NCBI GI no.: 1854039, 17129648 and 37963458) are each unique, and residues that vary among them are shown. B) Overlapping 9-mers generated from the three unique sequences represent all the inherent antigenic variations, with respect to potential 9-mer T-cell epitopes. Although the three sequences are each unique, they share identical 9-mers. 9-mers shown in uppercase are those with an identical match in two of the unique sequences analyzed, while those in bold uppercase have an identical match in all three sequences; unique 9-mers are shown in lowercase. All the 9-mers in sequence 1854039 have a match in at least one of the other two sequences; thus, the antigenic diversity of this sequence can be covered by the other two sequences combined, rendering the sequence 1854039 antigenically redundant. Hence, the minimal number of sequences required to represent antigenic diversity for this dataset is two.
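The definition in Figure 1 can be sketched in code. The example below is an illustration under stated assumptions, not the authors' implementation: it breaks each sequence into overlapping k-mers and drops a sequence when every one of its k-mers also occurs in a sequence that is still retained. The function names `kmers` and `remove_redundant` are hypothetical, and the single greedy pass yields a set with no redundant member, which is not guaranteed to be the smallest possible set.

```python
from collections import Counter


def kmers(seq, k=9):
    """Return the set of overlapping k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def remove_redundant(seqs, k=9):
    """Drop sequences whose k-mers are all covered by the retained ones."""
    unique = list(dict.fromkeys(seqs))  # de-duplicate, keep input order
    sets = {s: kmers(s, k) for s in unique}
    # How many retained sequences currently contain each k-mer.
    counts = Counter(km for ks in sets.values() for km in ks)
    dropped = set()
    # Assumption: examine sequences with the fewest k-mers first.
    for s in sorted(unique, key=lambda s: len(sets[s])):
        # Sequences shorter than k have no k-mers and are simply kept.
        if sets[s] and all(counts[km] > 1 for km in sets[s]):
            dropped.add(s)
            for km in sets[s]:
                counts[km] -= 1
    return [s for s in unique if s not in dropped]
```

With toy 3-mers, `remove_redundant(["ABCDE", "ABCDX", "YBCDE"], k=3)` drops the first sequence, because `ABC`, `BCD` and `CDE` each occur in one of the other two, mirroring the three-to-two reduction described in the caption.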
Reduction of the number of unique dengue sequences by removal of antigenically redundant sequences.
| Dengue serotype | Unique sequences, 2004 (#) | Minimal antigenic set, 2004 (#)* | Minimal antigenic set, 2004 (% of unique sequences)** | Unique sequences, 2005 (#) | Minimal antigenic set, 2005 (#)* | Minimal antigenic set, 2005 (% of unique sequences)** |
| DV1 | 359 | 244 | 68% | 724 | 493 | 68% |
| DV2 | 507 | 368 | 73% | 697 | 466 | 67% |
| DV3 | 230 | 180 | 78% | 678 | 482 | 71% |
| DV4 | 222 | 177 | 80% | 320 | 243 | 76% |
*Minimal number of unique sequences that represent the complete short-peptide (9-mer) antigenic diversity of the unique dengue sequences reported in the NCBI Entrez protein database.
**Percentage of unique sequences that represent the complete short-peptide (9-mer) antigenic diversity of the unique dengue sequences reported in the NCBI Entrez protein database.
Effects of number of unique dengue virus serotype 2 (DV2) envelope sequences (N) on short-peptide (9-mer) antigenic diversity.
| Number of unique sequences (N) | 20 | 40 | 60 | 80 | 100 | 120 | 140 |
| Length of sequences | 460 aa | 460 aa | 460 aa | 460 aa | 460 aa | 460 aa | 460 aa |
| Minimal number of unique sequences that represent complete short-peptide antigenic diversity (Mean ± SE) | 18 ± 0.30 | 32 ± 0.54 | 46 ± 0.70 | 58 ± 0.87 | 70 ± 0.87 | 80 ± 0.87 | 90 ± 0.71 |
| Percentage of unique sequences that represent complete short-peptide antigenic diversity (%) (Mean ± SE) | 90 ± 1.5 | 80 ± 1.35 | 77 ± 1.17 | 73 ± 1.09 | 70 ± 0.87 | 67 ± 0.73 | 64 ± 0.51 |
The mean and standard error (SE) values are shown for 20 rounds of repeated random sampling.
Figure 2. Short-peptide (9-mer) antigenic diversity as a function of number of sequences. Short-peptide antigenic diversity has an asymptotic relationship to the number of unique dengue virus serotype 2 (DV2) envelope sequences (N). Each curve shows the cumulative percentage coverage of short-peptide antigenic diversity. Vertical bars represent the standard error for 20 rounds of repeated random sampling.
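The repeated-sampling summary reported above can be sketched as follows. This is an illustrative outline only: it assumes the SE is the sample standard deviation divided by the square root of the number of repeats, and that each repeat draws N sequences without replacement. The names `mean_se` and `resample`, and the `statistic` callback (for instance, a function returning the size of the minimal antigenic set), are hypothetical.

```python
import random
import statistics


def mean_se(values):
    """Mean and standard error (sample SD / sqrt(n)) of a list of numbers."""
    mean = statistics.mean(values)
    se = statistics.stdev(values) / len(values) ** 0.5
    return mean, se


def resample(pool, n, statistic, repeats=20, seed=0):
    """Apply `statistic` to `repeats` random subsets of size n from pool."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    results = [statistic(rng.sample(pool, n)) for _ in range(repeats)]
    return mean_se(results)
```

Calling `resample(sequences, 40, statistic)` with a list of unique sequences would give one mean ± SE pair of the kind tabulated for N = 40.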
Effect of length of dengue virus serotype 2 (DV2) envelope protein sequences on short-peptide (9-mer) antigenic diversity.
| Length of fragments | 100% (460 aa) | 60% (276 aa) | 30% (138 aa) | 20% (92 aa) | 10% (46 aa) | 5% (23 aa) |
| Number of fragments | 187 | 187 | 187 | 187 | 187 | 187 |
| Number of unique fragments | 187 | 131 | 82 | 58 | 27 | 17 |
| Minimal number of fragments that represent complete short-peptide antigenic diversity (Mean ± SE) | 111 ± 0.11 | 74 ± 0.11 | 48 ± 0.17 | 38 ± 0.10 | 24 ± 0.10 | 14 ± 0.10 |
| Percentage of fragments that represent complete short-peptide antigenic diversity (%) (Mean ± SE) | 59 ± 0.06 | 40 ± 0.06 | 26 ± 0.09 | 20 ± 0.05 | 13 ± 0.05 | 7 ± 0.05 |
The mean and standard error (SE) values are shown for 20 rounds of repeated random sampling.
Figure 3. Short-peptide (9-mer) antigenic diversity as a function of length of sequences. Short-peptide antigenic diversity shows a linear relationship to the sequence length of dengue virus serotype 2 (DV2) envelope protein. Vertical bars represent the standard error for 20 rounds of repeated random sampling.
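The record does not say how the shorter fragments were cut from the 460-residue sequences. As a rough sketch, assuming each fragment is simply the leading portion of its sequence, the drop in the number of unique fragments with decreasing length can be reproduced like this. The names `truncate` and `count_unique` are hypothetical.

```python
def truncate(seqs, fraction):
    """Keep the leading `fraction` of each sequence (at least one residue)."""
    # Assumption: fragments are prefixes; the source does not specify this.
    return [s[:max(1, round(len(s) * fraction))] for s in seqs]


def count_unique(fragments):
    """Number of distinct fragments."""
    return len(set(fragments))
```

Two sequences that differ only beyond residue 23 remain distinct at full length but collapse to a single fragment at 5% (23 aa), which is why the count of unique fragments shrinks as the fragments shorten.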
Figure 4. Flowchart summarizing the steps undertaken to identify the antigenically relevant unique sequence for dengue virus.