Jhang-Wei Huang, Chwan-Chuen King, Jinn-Moon Yang.
Abstract
BACKGROUND: In pandemic and epidemic forms, avian and human influenza viruses often cause significant damage to human society and economies. Gradually accumulated mutations on hemagglutinin (HA) produce immunologically distinct circulating strains (termed "antigenic variants"), a process known as antigenic drift. Antigenic variants often require a new vaccine to be formulated before each annual epidemic. Mapping genetic evolution to the antigenic drift of influenza viruses is an emergent issue for public health and vaccine development.
Year: 2009 PMID: 19208143 PMCID: PMC2648776 DOI: 10.1186/1471-2105-10-S1-S41
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. Overview of our method for predicting the antigenic variants of human influenza A/H3N2 viruses.
Figure 2. The relationship between the entropies and information gains (IG) of 329 amino acid positions on the HA protein. Positions in area I (e.g. 145-A, 189-B and 278-C), which have both high entropy and high IG values, are highly correlated with antigenic variants. 145-A denotes amino acid position 145, located in epitope A.
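As a rough illustration of the two quantities plotted in Figure 2, the following sketch computes Shannon entropy and information gain with the standard textbook definitions. It is not taken from the paper: the function names, the toy residues, and the binary "mutated" and "variant" labels are all hypothetical, and any normalization applied to the published values is not reproduced here.

```python
import math
from collections import Counter


def entropy(symbols):
    """Shannon entropy, in bits, of a sequence of hashable symbols."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(features, labels):
    """IG of a feature with respect to class labels: H(labels) - H(labels | feature)."""
    total = len(labels)
    groups = {}
    for feature, label in zip(features, labels):
        groups.setdefault(feature, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


# Hypothetical toy data (assumption, not from the paper):
# residues observed at one HA position across six strains.
residues = ["K", "K", "N", "N", "S", "K"]

# For six hypothetical strain pairs: does the position differ (1/0),
# and is the pair an antigenic variant (1/0)?
mutated = [1, 1, 0, 0, 1, 0]
variant = [1, 1, 0, 0, 1, 0]

position_entropy = entropy(residues)
position_ig = information_gain(mutated, variant)
```

In this toy case the mutation flag separates the two classes perfectly, so the information gain equals the full entropy of the class labels. A position whose mutations are unrelated to the antigenic type would score close to zero regardless of how variable it is.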
The entropy, information gain, and co-mutated positions of 15 amino acid positions on HA sequences
| Position-epitope | Entropy | IG | Number of co-mutated positions | Co-mutated positions | Positive selection | Cluster transition |
| 145-A¹ | 0.87 | 1.00 | 12 | 9,31,63,78,83,126,137,160,193,197,242,278 | +² | +³ |
| 137-A | 0.68 | 0.41 | 23 | 9,31,53,54,62,63,83,126,143,145,146,158,160,164,174,189,193,201,213,217,244,260,278 | + | |
| 193-B | 0.86 | 0.23 | 17 | 9,31,63,78,83,126,137,145,158,160,164,174,201,217,242,260,278 | + | + |
| 160-B | 0.58 | 0.28 | 16 | 2,31,54,62,126,137,143,146,156,158,164,197,217,244,260,278 | + | |
| 156-B | 0.80 | 0.43 | 8 | 54,62,143,146,160,197,244,260 | + | + |
| 226-D | 1.00 | 0.15 | 2 | 145,189 | + | |
| 135-A | 0.83 | 0.07 | 1 | 165 | + | |
| 121-D | 0.72 | 0.00 | 0 | | + | |
| 142-A | 0.47 | 0.00 | 0 | | + | |
| 186-B | 0.41 | 0.00 | 0 | | + | |
| 164-B | 0.24 | 0.46 | 6 | 126,137,158,174,201,217 | + | |
| 201-D | 0.27 | 0.36 | 4 | 137,164,174,217 | + | |
| 78-E | 0.14 | 0.29 | 4 | 31,63,126,242 | | |
| 174-D | 0.32 | 0.47 | 4 | 137,164,201,217 | + | |
| 63-E | 0.19 | 0.39 | 6 | 78,83,126,137,242,278 | | |
¹ The epitope of the position on the HA sequence.
² The position is under positive selection as defined by Bush et al. [3].
³ The position is a cluster-difference substitution as defined by Smith et al. [6].
Figure 3. The distribution of IG values and co-mutation scores on the HA structure. (A) The distribution of IG values of 329 amino acids on the HA structure (PDB code 1HGF); R indicates the receptor binding site. Blue and gray indicate the highest and the lowest IG values, respectively. (B) The structural locations and scores of the 12 co-mutation positions of position 145. The structures were rendered with PyMOL.
Figure 4. The decision tree and rules for predicting antigenic variants. Each internal node (circle) represents an amino acid position. Each leaf node (square) gives the predicted antigenic type ("antigenic variants" or "similar viruses"), the total number of pairs (first value), and the number of incorrectly predicted pairs (second value) obtained by applying the rule that ends at that node.
Figure 5. Comparison of our method with two other methods for predicting antigenic variants on two data sets.
Comparison of our method with other methods for predicting antigenic variants on 31,878 pairs
| Antigenic variants | Wilson & Cox, 1990 [ | Lee & Chen, 2004 [ | Our method | Similar viruses | Wilson & Cox, 1990 [ | Lee & Chen, 2004 [ | Our method |
| HK68-EN72 (210¹) | 210 | 206 | 210 | HK68 (91¹) | 24 | 52 | 37 |
| EN72-VI75 (135) | 135 | 135 | 135 | EN72 (105) | 36 | 79 | 48 |
| VI75-TX77 (27) | 27 | 27 | 27 | VI75 (36) | 30 | 36 | 21 |
| TX77-BA79 (48) | 48 | 48 | 45 | TX77 (3) | 1 | 2 | 1 |
| BA79-SI87 (400) | 400 | 381 | 400 | BA79 (120) | 13 | 46 | 58 |
| SI87-BE89 (1600) | 1577 | 863 | 1600 | SI87 (300) | 125 | 233 | 276 |
| BE89-BE92 (3648) | 3648 | 3648 | 3648 | BE89 (2016) | 872 | 1725 | 2016 |
| BE92-WU95 (1596) | 1542 | 1391 | 1562 | BE92 (1596) | 372 | 928 | 732 |
| WU95-SY97 (448) | 448 | 448 | 448 | WU95 (378) | 53 | 156 | 325 |
| SY97-FU02 (96) | 96 | 96 | 96 | SY97 (120) | 24 | 65 | 120 |
| Other inter clusters (18890) | 18889 | 18870 | 18855 | FU02 (15) | 15 | 15 | 15 |
| Number of correctly predicted pairs | 27020 | 26113 | 27026 | Number of correctly predicted pairs | 1565 | 3337 | 3649 |
| Accuracy | 99.71% | 96.37% | 99.73% | Accuracy | 32.74% | 69.81% | 76.34% |
¹ The number of pairs in the cluster.
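The accuracy row can be rechecked from the counts in the table. The sketch below assumes that each accuracy is the method's pair count in the summary row divided by the sum of the cluster sizes given in parentheses; that reading is an inference from the table, not a statement from the paper, and the variable names are illustrative.

```python
# Cluster sizes (values in parentheses in the table), copied as-is.
variant_sizes = [210, 135, 27, 48, 400, 1600, 3648, 1596, 448, 96, 18890]
similar_sizes = [91, 105, 36, 3, 120, 300, 2016, 1596, 378, 120, 15]

# Summary-row counts per method: (antigenic variants, similar viruses).
predicted = {
    "Wilson & Cox, 1990": (27020, 1565),
    "Lee & Chen, 2004": (26113, 3337),
    "Our method": (27026, 3649),
}


def accuracy(correct, total):
    """Percentage of pairs counted in the summary row (assumed to be correct predictions)."""
    return 100.0 * correct / total


total_variants = sum(variant_sizes)
total_similar = sum(similar_sizes)

# Accuracy per method, rounded to two decimals as in the table.
results = {
    method: (
        round(accuracy(n_variant, total_variants), 2),
        round(accuracy(n_similar, total_similar), 2),
    )
    for method, (n_variant, n_similar) in predicted.items()
}
```

Under this assumption the two column totals add up to the 31,878 pairs named in the table title, and the recomputed percentages match the accuracy row for all three methods.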
The numbers of co-mutation positions of five epitopes and the other area on HA protein
| | Epitope A | Epitope B | Epitope C | Epitope D | Epitope E | Other area | Sum |
| Epitope A | 15 | 8 | 11 | 16 | 8 | ||
| Epitope B | 15 | 6 | 13 | 13 | 5 | ||
| Epitope C | 11 | 3 | 5 | 9 | 4 | 47 | |
| Epitope D | 12 | 3 | 8 | 6 | 4 | 46 | |
| Epitope E | 11 | 4 | 6 | 7 | 3 | 44 | |
| Other area | 4 | 2 | 1 | 3 | 4 | 4 | 18 |
Figure 6. The co-mutation z-score distributions of six positions on the HA sequence. A position is considered a co-evolving residue if its z-score exceeds 2.3 (the blue line).
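The threshold rule in the Figure 6 caption can be sketched as follows. Only the cutoff of 2.3 comes from the caption; how the co-mutation scores are computed is not described here, so the scores below are hypothetical, as are the function names and the choice of the population standard deviation.

```python
import statistics

Z_THRESHOLD = 2.3  # cutoff stated in the Figure 6 caption


def z_scores(scores):
    """Standardize scores using the population standard deviation (an assumption)."""
    mean = statistics.fmean(scores)
    sd = statistics.pstdev(scores)
    if sd == 0:
        # All scores identical: no position stands out.
        return [0.0 for _ in scores]
    return [(s - mean) / sd for s in scores]


def co_evolving_positions(scores_by_position, threshold=Z_THRESHOLD):
    """Return the positions whose co-mutation z-score exceeds the threshold."""
    positions = list(scores_by_position)
    zs = z_scores([scores_by_position[p] for p in positions])
    return [p for p, z in zip(positions, zs) if z > threshold]


# Hypothetical co-mutation scores of ten positions against one reference position.
toy_scores = {p: 0.0 for p in range(1, 10)}
toy_scores[145] = 1.0

flagged = co_evolving_positions(toy_scores)
```

With these toy scores only position 145 stands far enough above the mean to pass the cutoff. Using the sample standard deviation instead would give slightly smaller z-scores.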