Deena M A Gendoo, Mahmoud M El-Hefnawi, Mark Werner, Rania Siam.
Abstract
BACKGROUND: Variations in the influenza Hemagglutinin protein contribute to antigenic drift, resulting in decreased efficiency of seasonal influenza vaccines and escape from the host immune response. We performed an in silico study to determine characteristics of novel variable and conserved motifs in the Hemagglutinin protein from previously reported H3N2 strains isolated in Hong Kong from 1968 to 1999, in order to predict viral motifs involved in significant biological functions.
Year: 2008 PMID: 18681973 PMCID: PMC2553082 DOI: 10.1186/1743-422X-5-91
Source DB: PubMed Journal: Virol J ISSN: 1743-422X Impact factor: 4.099
Figure 1. Selected 14 MEME blocks in the HA1 consensus sequence from 1968–1999. The combined block diagram of non-overlapping sites with p-value < 0.0001 was generated by the MEME server. The blocks are common to the entire data set, with the exception of block 14, which occurs in only 16 of the 17 sequences.
MEME blocks positions, size and genetic distance
| MEME block | Start | End | Width |
| MEME 1 | 89 | 129 | 41 |
| | 348 | 388 | 41 |
| | 426 | 466 | 41 |
| MEME 2 | 14 | 42 | 29 |
| | 179 | 207 | 29 |
| | 478 | 506 | 29 |
| MEME 3 | 507 | 541 | 35 |
| | 215 | 249 | 35 |
| MEME 4 | 296 | 345 | 50 |
| MEME 5 | 404 | 424 | 21 |
| | 49 | 69 | 21 |
| MEME 6 | 253 | 293 | 41 |
| MEME 7 | 130 | 170 | 41 |
| MEME 8 | 542 | 562 | 21 |
| MEME 9 | 71 | 85 | 15 |
| | 389 | 403 | 15 |
| MEME 10 | 3 | 13 | 11 |
| | 467 | 477 | 11 |
| MEME 11 | 171 | 178 | 8 |
| MEME 12 | 43 | 48 | 6 |
| MEME 13 | 209 | 214 | 6 |
| MEME 14 | 563 | 566 | 4 |
HA consensus sequences were submitted to the Multiple EM for Motif Elicitation (MEME) server. The fourteen MEME blocks spanning the consensus sequence alignment are presented, with the start and end positions and the width of each block.
Amino acid substitutions in the different isolates from 1969–1999 used to extrapolate the genetic distance in the different MEME blocks
| Years | Substitutions | Substitutions (%) | MEME block | Genetic distance |
| 1968–1969 | 5 | 0.883392 | | |
| 1969–1971 | 12 | 1.943463 | 2 | 0.3678 |
| 1971–1972 | 11 | 1.943463 | 3 | 0.228 |
| 1972–1973 | 22 | 3.886926 | 4 | 0.16 |
| 1973–1974 | 5 | 0.883392 | 5 | 0.142 |
| 1974–1975 | 15 | 2.650177 | 6 | 0.293 |
| 1975–1980 | 29 | 4.946997 | | |
| 1980–1982 | 6 | 1.060071 | | |
| 1982–1983 | 13 | 2.296820 | 9 | 0.288 |
| 1983–1984 | 2 | 0.353357 | 10 | 0.212 |
| 1984–1985 | 1 | 0.176678 | | |
| 1985–1987 | 7 | 1.236749 | 12 | 0.167 |
| 1987–1988 | 3 | 0.530035 | | |
| 1988–1989 | 8 | 1.423488 | | |
| 1989–1992 | 10 | 2.473498 | | |
| 1992–1999 | 23 | 4.240283 | | |
Using the ClustalW alignment, the number of observed substitutions for each consensus sequence and its corresponding year interval was tabulated using Infoalign. The highest number of amino acid substitutions (29 over the entire sequence) occurred in the 1975–1980 interval. The genetic distance calculated for each MEME block shows that MEME blocks 1 and 8 are conserved (bold), MEME blocks 7, 11 and 13 are highly variable, and the other MEME blocks show intermediate variability.
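As an illustration of how substitution counts like those tabulated above can be obtained, the following minimal sketch counts differing positions between two aligned sequences and converts a count to a percentage of sequence length. It is not Infoalign's implementation. The function names and the toy sequences are hypothetical, and the length of 566 is an assumption taken from the end position of the last MEME block.

```python
def count_substitutions(seq_a: str, seq_b: str, gap: str = "-") -> int:
    """Count aligned positions where two equal-length sequences differ.

    Positions where either sequence has a gap are skipped (an assumption;
    other tools may treat gaps differently).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    return sum(
        1
        for a, b in zip(seq_a, seq_b)
        if a != b and a != gap and b != gap
    )


def percent_substituted(n_substitutions: int, length: int) -> float:
    """Express a substitution count as a percentage of the sequence length."""
    return 100.0 * n_substitutions / length


# Toy aligned fragments (hypothetical, not taken from the data set).
older = "MKTIIALSYIF"
newer = "MKTIVALSHIF"
n_toy = count_substitutions(older, newer)  # 2 (I->V and Y->H)

# Assumed full length of 566 positions: 5 substitutions is about 0.883392 %.
pct = round(percent_substituted(5, 566), 6)
```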
Figure 2. Number of amino acid substitutions in each MEME block over the period 1968–1999. (A) Bar graph of amino acid substitutions within MEME blocks for each of the years. (B) Behavior of the substitutions in MEME block 7; the frequency of amino acid substitutions within MEME block 7 largely follows the pattern of substitutions across the entire protein, as illustrated in Table 2, reaching a peak in 1980, the year with the greatest number of mutations in the alignment. (C) Behavior of the substitutions in MEME block 2.
Figure 3. Entropy plot of the protein consensus ClustalW alignment. Amino acid positions that do not change over the years have an entropy of 0, whereas positions of high variability appear as peaks in the plot. Two hot spots of variability were observed, clustered around amino acid positions 140–190 and 200–240. The entropy analysis was performed for the entire hemagglutinin sequence (560 amino acids), but the analysis shows little entropy at amino acid position 340 (HA2).
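The per-position entropy described in this caption can be sketched as the Shannon entropy of each alignment column. The example below is a minimal sketch only: the natural logarithm is an assumption (the source does not state the base), and the function names and the toy alignment are hypothetical.

```python
import math
from collections import Counter


def column_entropy(column: str) -> float:
    """Shannon entropy of one alignment column (natural log, an assumption)."""
    counts = Counter(column)
    if len(counts) == 1:
        return 0.0  # fully conserved position
    total = len(column)
    return -sum((c / total) * math.log(c / total) for c in counts.values())


def alignment_entropy(sequences: list[str]) -> list[float]:
    """Entropy of every column of an alignment of equal-length sequences."""
    if len({len(s) for s in sequences}) != 1:
        raise ValueError("aligned sequences must have equal length")
    return [column_entropy("".join(col)) for col in zip(*sequences)]


# Toy alignment (hypothetical): column 1 is conserved, columns 2 and 3 vary.
toy_alignment = ["ACD", "ACE", "AGE", "AGF"]
profile = alignment_entropy(toy_alignment)
```

A conserved column yields 0, matching the caption; a column split evenly between two residues yields ln 2 under the natural-log assumption.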
Positions of potential post-translational modification sites
| Motif ID | Expression | Start | End | Years observed |
| CK2_PHOSPHO_SITE Casein kinase II phosphorylation site. | [ST]-x(2)-[DE]. | 44 | 47 | 1968, 1969, 1971, 1972 |
| | | 81 | 84 | |
| | | 142 | 145 | 1972 |
| | | 203 | 206 | |
| | | 416 | 419 | |
| | | 432 | 435 | |
| | | 456 | 459 | |
| PKC_PHOSPHO_SITE Protein kinase C phosphorylation site | [ST]-x-[RK] | 64 | 66 | All years except 1982 |
| | | 123 | 125 | |
| | | 152 | 154 | |
| | | 154 | 156 | |
| | | 159 | 161 | |
| | | 173 | 175 | 1972 |
| | | 190 | 192 | 1975 |
| | | 203 | 205 | 1975, 1980, 1982, 1983, 1984, 1985, 1987, 1988, 1989, 1992 |
| | | 215 | 217 | |
| | | 221 | 223 | |
| | | 222 | 224 | |
| | | 243 | 245 | |
| | | 278 | 280 | |
| | | 329 | 331 | |
| | | 467 | 469 | |
| | | 496 | 498 | |
| cAMP_PHOSPHO_SITE cAMP- and cGMP-dependent protein kinase phosphorylation site. | [RK](2)-x-[ST] | 156 | 159 | 1975, 1980, 1982, 1983, 1984, 1985, 1987, 1988, 1989, 1992, 1999 |
| ASN_GLYCOSYLATION N-glycosylation site | N-{P}-[ST]-{P} | 24 | 27 | All years except 1971, 1972 |
| | | 38 | 41 | |
| | | 54 | 57 | |
| | | 79 | 82 | 1975, 1980, 1982, 1983, 1984, 1985, 1987, 1988, 1989, 1992, 1999 |
| | | 97 | 100 | 1968, 1969, 1971, 1972, 1973 |
| | | 138 | 141 | 1999 |
| | | 142 | 145 | 1974, 1980, 1982, 1983, 1984, 1985, 1987, 1988, 1989, 1992, 1999 |
| | | 149 | 152 | 1999 |
| | | 181 | 184 | |
| | | 262 | 265 | 1980, 1982, 1983, |
| | | 301 | 304 | |
| | | 499 | 502 | |
| MYRISTYL | G-{EDRKHPFYW}- | 21 | 26 | |
| | | 77 | 82 | 1968, 1969, 1971, 1972, |
| | | 145 | 150 | All years except 1972 |
| | | 150 | 155 | All years except 1989, |
| | | 151 | 156 | All years except 1989, |
| | | 158 | 163 | 1975, 1980, 1982, |
| | | 291 | 296 | 1974, 1975, 1980, |
| | | 302 | 307 | |
| | | 346 | 351 | |
| | | 349 | 354 | |
| | | 361 | 366 | |
| | | 376 | 381 | |
| | | 495 | 500 | 1973, 1974, 1975, |
| | | 558 | 563 | All years except 1989, |
PROSITE motifs detected in the H3N2 sequences using PPSearch, comprising 24 phosphorylation, 12 glycosylation and 14 myristylation sites. Potential phosphorylation sites include casein kinase II, protein kinase C, and cAMP- and cGMP-dependent protein kinase phosphorylation sites; N-glycosylation (ASN) motifs and N-myristylation sites are also listed. The start and end positions of each motif are shown, along with the motif's regular expression. Unless otherwise indicated, sites were observed in all 17 consensus sequences.
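To illustrate how motif expressions of the kind listed in this table can be matched against a sequence, the sketch below translates a simple PROSITE-style pattern into a Python regular expression and reports 1-based start and end positions, including overlapping hits. It is not PPSearch's implementation and supports only the syntax elements that appear in the table (`x`, `[..]`, `{..}` and `(n)` repeats). The function names and the toy sequence are hypothetical.

```python
import re

# One pattern element: [allowed], {forbidden}, a single residue, or x,
# optionally followed by a repeat such as (2) or (2,4).
_ELEMENT = re.compile(
    r"(\[[A-Z]+\]|\{[A-Z]+\}|[A-Zx])(?:\((\d+)(?:,(\d+))?\))?"
)


def prosite_to_regex(pattern: str) -> str:
    """Translate a simple PROSITE-style pattern into a regex string."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        match = _ELEMENT.fullmatch(element)
        if match is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        core, low, high = match.groups()
        if core == "x":
            core = "."  # any residue
        elif core.startswith("{"):
            core = "[^" + core[1:-1] + "]"  # forbidden residues
        if low is not None:
            core += "{" + low + ("," + high if high else "") + "}"
        parts.append(core)
    return "".join(parts)


def find_sites(pattern: str, sequence: str) -> list[tuple[int, int]]:
    """Return 1-based inclusive (start, end) for every hit, overlaps included."""
    # A lookahead lets matches that overlap each other all be reported.
    regex = re.compile("(?=(" + prosite_to_regex(pattern) + "))")
    return [
        (m.start() + 1, m.start() + len(m.group(1)))
        for m in regex.finditer(sequence)
    ]


toy_sequence = "MNGTASKRRASDEE"  # hypothetical, not an HA sequence
glyco = find_sites("N-{P}-[ST]-{P}", toy_sequence)   # [(2, 5)]
pkc = find_sites("[ST]-x-[RK]", toy_sequence)        # [(6, 8)]
camp = find_sites("[RK](2)-x-[ST]", toy_sequence)    # [(8, 11)]
ck2 = find_sites("[ST]-x(2)-[DE].", toy_sequence)    # [(11, 14)]
```

Reporting overlapping hits matters here because the table itself lists overlapping sites, for example PKC sites at 221–223 and 222–224.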
Figure 4. Frequency of specific potential post-translational modification (PROSITE) motifs in each of the MEME blocks. MEME block 7 has the highest number of post-translational modification sites, followed by MEME blocks 2, 1 and 3, respectively. A frequency of 2 or above was considered high. MEME blocks 3, 2 and 7 have a high frequency of potential protein kinase C (PKC) phosphorylation sites. MEME blocks 1, 2 and 7 have a high frequency of potential N-myristylation sites. MEME blocks 2 and 7 have a high frequency of potential N-glycosylation sites. MEME blocks 1 and 2 have a high frequency of potential CKII phosphorylation sites.
Figure 5. Boxplots of the average entropy of specific post-translational modification sites in each of the MEME blocks. (A) Average entropy of potential CKII phosphorylation sites. Blocks 1, 5 and 9 have zero entropy at all CKII sites. The majority of CKII sites in MEME blocks 2 and 7 have nonzero entropy. One of the MEME block 2 CKII sites (amino acid 205) has the largest entropy (1.24) among all CKII sites. The average entropy over the CKII sites of MEME blocks 7 and 2 is therefore higher than for any other block. MEME block 1 has a wider boxplot than the others, indicating more CKII sites in this block. (B) Average entropy of potential PKC phosphorylation sites. MEME blocks 1 and 4 have zero entropy at all their PKC sites. The highest PKC entropy values (1.2) were observed in MEME block 2 (amino acid 205) and MEME block 7 (amino acid 160). MEME blocks 5, 7 and 11 are unusual in that very few of their PKC sites have zero entropy. The PKC sites of MEME block 11, followed by those of MEME block 7, have the highest average entropy. The width of the boxplots indicates that more PKC sites are observed in MEME blocks 2, 3 and 7, respectively. (C) Average entropy of potential N-glycosylation sites. MEME blocks 4 and 5 have zero entropy at all of their ASN sites. MEME blocks 2, 6 and 9 have nonzero entropy at the majority of their ASN sites. One of the ASN sites (amino acid 99) from MEME block 1 has the highest entropy (1.003) among all ASN sites. The width of the boxplots indicates that more N-glycosylation sites are observed in MEME blocks 2 and 7, respectively. (D) Average entropy of potential N-myristylation sites. In MEME blocks 1, 2, 4 and 9, the majority of myristylation sites have zero entropy. The highest myristylation site entropy is in MEME blocks 9 and 7 (amino acids 78 and 160, respectively), with an approximate entropy value of 1.2. MEME blocks 1 and 7 have more N-myristylation sites than any other block, although MEME block 2 also has a fairly large number of myristylation sites.
List of antigenic sites observed in the hemagglutinin structure.
| Position | Overlap (region, MEME block, modification sites) | Entropy |
| 143–146 | HA1, MEME7, CKII, ASN | |
| 187–196 | HA1, MEME2, PKC | |
| 3 | MEME10 | |
| 31 | MEME2 | |
| 53 | MEME5 | |
| 54 | MEME5, ASN | |
| 63 | MEME5 | |
| 78 | MEME9, Myristyl | |
| 83 | MEME9, CKII | |
| 110 | MEME1 | |
| 122 | MEME1 | |
| 133 | MEME7 | |
| 137 | MEME7 | |
| 155 | MEME7, Myristyl | |
| 164 | MEME7 | |
| 174 | MEME11, PKC | |
| 182 | MEME2, ASN | |
| 186 | MEME2 | |
| 201 | MEME2 | |
| 205 | MEME2, CKII, PKC | |
| 207 | MEME2 | |
| 208 | | |
| 217 | MEME3, PKC | |
| 220 | MEME3 | |
| 226 | MEME3 | |
| 228 | MEME3 | |
| 242 | MEME3 | |
| 260 | MEME6 | |
| 275 | MEME6 | |
| 278 | MEME6, PKC | |
| 327 | MEME4 |
Antigenic sites A–D [11] were mapped to our consensus sequences and tabulated with the overlapping MEME motif, entropy values and post-translational modification sites. The average entropy of site A is based on amino acid positions 144 and 145, while that of site B is based on amino acid positions 188 and 189.
Position of receptor binding sites and their overlap with MEME blocks
| Position | MEME block |
| 98 | MEME 1 |
| 135 | MEME 7 |
| 136 | MEME 7 |
| 137 | MEME 7 |
| 153 | MEME 7 |
| 183 | MEME 2 |
| 190 | MEME 2 |
| 194 | MEME 2 |
Receptor binding sites described by Skehel and Wiley (2000) were mapped to the MEME blocks. These receptor binding sites mainly overlap MEME blocks 2 and 7.
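The overlap between positions and MEME blocks can be sketched as an interval lookup over the block coordinates given in the MEME block table. This is only an illustrative sketch: it assumes that the receptor binding positions and the block coordinates use the same numbering, and the names `MEME_BLOCKS` and `block_for` are hypothetical.

```python
from typing import Optional

# Start and end positions (inclusive) of each MEME block occurrence,
# copied from the MEME block table.
MEME_BLOCKS: dict[int, list[tuple[int, int]]] = {
    1: [(89, 129), (348, 388), (426, 466)],
    2: [(14, 42), (179, 207), (478, 506)],
    3: [(507, 541), (215, 249)],
    4: [(296, 345)],
    5: [(404, 424), (49, 69)],
    6: [(253, 293)],
    7: [(130, 170)],
    8: [(542, 562)],
    9: [(71, 85), (389, 403)],
    10: [(3, 13), (467, 477)],
    11: [(171, 178)],
    12: [(43, 48)],
    13: [(209, 214)],
    14: [(563, 566)],
}


def block_for(position: int) -> Optional[int]:
    """Return the MEME block containing a position, or None if uncovered."""
    for block, spans in MEME_BLOCKS.items():
        if any(start <= position <= end for start, end in spans):
            return block
    return None


# Receptor binding positions from the table above.
RECEPTOR_BINDING_POSITIONS = [98, 135, 136, 137, 153, 183, 190, 194]
overlap = {p: block_for(p) for p in RECEPTOR_BINDING_POSITIONS}
```

Under that numbering assumption, the lookup reproduces the table: position 98 falls in MEME block 1, positions 135–153 in block 7, and positions 183–194 in block 2.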
Figure 6. Graphical representation of MEME blocks and antigenic sites on the 3-D hemagglutinin structure. HA1 and HA2 are represented in yellow and blue, respectively. (A) MEME blocks on HA: MEME2 (Magenta), MEME7 (Red), MEME3 (Bright Green), MEME1 (Orange; amino acids 89–129). (B) Antigenic sites on HA: Antigenic Binding Site A (Green), Antigenic Binding Site B (Magenta), Antigenic Binding Site C (Red), Antigenic Binding Site D (Red).
Co-mutating pairs and their position with respect to MEME motifs.
| I-78-D-18 | 9 | Q-205-I-78 | 9 | ||
| T-99-I-78 | 9 | Q-205-T-99 | |||
| G-140-D-18 | Q-205-P-159 | ||||
| G-140-I-78 | 9 | Q-205-T-171 | 11 | ||
| G-151-G-140 | Q-205-T-176 | 11 | |||
| N-153-D-18 | S-209-T-99 | 13 | |||
| N-153-I-78 | 9 | S-209-Q-205 | 13 | ||
| P-159-D-18 | V-212-N-153 | 13 | |||
| P-159-I-78 | 9 | V-212-Q-205 | 13 | ||
| P-159-T-99 | V-260-D-18 | 6 | |||
| P-159-G-140 | V-260-T-99 | 6 | |||
| P-159-N-153 | V-260-G-140 | 6 | |||
| G-160-D-18 | V-260-N-153 | 6 | |||
| G-160-I-78 | 9 | V-260-P-159 | 6 | ||
| N-161-D-18 | V-260-G-160 | 6 | |||
| N-161-I-78 | 9 | V-260-N-161 | 6 | ||
| N-161-G-140 | V-260-Q-205 | 6 | |||
| N-161-N-153 | N-264-D-18 | 6 | |||
| N-161-P-159 | N-264-T-99 | 6 | |||
| T-171-A-14 | 11 | N-264-N-153 | 6 | ||
| T-171-T-99 | 11 | N-264-P-159 | 6 | ||
| T-171-P-159 | 11 | N-264-N-161 | 6 | ||
| K-172-N-153 | 11 | N-264-Q-205 | 6 | ||
| K-172-P-159 | 11 | I-294-D-18 | - | ||
| K-172-N-161 | 11 | I-294-T-99 | - | ||
| G-174-D-18 | 11 | I-294-N-153 | - | ||
| G-174-G-140 | 11 | I-294-P-159 | - | ||
| G-174-N-153 | 11 | I-294-N-161 | - | ||
| G-174-P-159 | 11 | I-363-N-153 | |||
| G-174-G-160 | 11 | I-363-N-161 | |||
| G-174-N-161 | 11 | I-363-G-174 | 11 | ||
| T-176-D-18 | 11 | I-363-V-212 | 13 | ||
| T-176-T-99 | 11 | I-363-I-294 | - | ||
| T-176-G-140 | 11 | V-400-D-18 | 9 | ||
| T-176-N-153 | 11 | V-400-G-140 | 9 | ||
| T-176-P-159 | 11 | V-400-N-153 | 9 | ||
| T-176-G-160 | 11 | V-400-P-159 | 9 | ||
| T-176-N-161 | 11 | V-400-G-160 | 9 | ||
| V-400-N-161 | 9 |
77 co-mutating pairs implicated in MEME blocks 1, 2, 3, and 7 were determined using CRASP (see materials and methods) and aligned with MEME motifs.
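For working with the pair labels in this table, the sketch below parses a label such as `Q-205-I-78` into its two residues and positions and counts how often each position takes part in a pair. Reading the label as residue, position, residue, position is an assumption about the notation. The names are hypothetical, and only a handful of labels from the table are used as sample input.

```python
import re
from collections import Counter
from typing import NamedTuple


class CoMutatingPair(NamedTuple):
    residue_a: str
    position_a: int
    residue_b: str
    position_b: int


# Assumed label layout: residue-position-residue-position, e.g. "Q-205-I-78".
_PAIR = re.compile(r"([A-Z])-(\d+)-([A-Z])-(\d+)")


def parse_pair(label: str) -> CoMutatingPair:
    """Parse one co-mutating pair label into residues and positions."""
    match = _PAIR.fullmatch(label.strip())
    if match is None:
        raise ValueError(f"unrecognised pair label: {label!r}")
    res_a, pos_a, res_b, pos_b = match.groups()
    return CoMutatingPair(res_a, int(pos_a), res_b, int(pos_b))


def position_counts(labels: list[str]) -> Counter:
    """Count how many pairs each position participates in."""
    counts: Counter = Counter()
    for label in labels:
        pair = parse_pair(label)
        counts[pair.position_a] += 1
        counts[pair.position_b] += 1
    return counts


# A few labels taken from the table, used here only as sample input.
sample_labels = ["I-78-D-18", "T-99-I-78", "G-140-D-18", "Q-205-I-78", "V-260-Q-205"]
counts = position_counts(sample_labels)
```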