Xiaowei Ren, Yuefeng Li, Xiaoning Liu, Xiping Shen, Wenlong Gao, Juansheng Li.
Abstract
The antigenic variability of influenza viruses has always made influenza vaccine development challenging. The punctuated nature of antigenic drift of influenza virus suggests that a relatively small number of genetic changes or combinations of genetic changes may drive changes in antigenic phenotype. The present study aimed to identify antigenicity-associated sites in the hemagglutinin protein of A/H1N1 seasonal influenza virus using computational approaches. Random Forest Regression (RFR) and Support Vector Regression based on Recursive Feature Elimination (SVR-RFE) were applied to H1N1 seasonal influenza viruses and used to analyze the associations between amino acid changes in the HA1 polypeptide and antigenic variation based on hemagglutination-inhibition (HI) assay data. Twenty-three and twenty antigenicity-associated sites were identified by RFR and SVR-RFE, respectively, by considering the joint effects of amino acid residues on antigenic drift. Our proposed approaches were further validated with the H3N2 dataset. The prediction models developed in this study can quantitatively predict antigenic differences with high prediction accuracy based only on HA1 sequences. Application of the study results can increase understanding of H1N1 seasonal influenza virus antigenic evolution and accelerate the selection of vaccine strains.
Year: 2015 PMID: 25978416 PMCID: PMC4433265 DOI: 10.1371/journal.pone.0126742
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
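The abstract relates amino acid changes in HA1 to antigenic variation between strains. A minimal sketch of one way such changes could be turned into model inputs is shown here, assuming a simple 0/1 encoding per aligned position (1 if the two strains differ, 0 otherwise). The encoding, the function names, and the toy sequences are illustrative assumptions and are not taken from the study.

```python
def substitution_features(seq_a, seq_b):
    """Return a 0/1 vector with 1 wherever two aligned sequences differ.

    Assumption: both sequences are already aligned and of equal length.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return [int(a != b) for a, b in zip(seq_a, seq_b)]


def changed_positions(seq_a, seq_b, offset=1):
    """List the positions (1-based by default) at which the sequences differ."""
    flags = substitution_features(seq_a, seq_b)
    return [i + offset for i, flag in enumerate(flags) if flag]


# Toy aligned fragments (hypothetical, not real HA1 sequences).
virus = "DTVLEKNVTV"
serum_strain = "DTILEKNATV"

features = substitution_features(virus, serum_strain)
positions = changed_positions(virus, serum_strain)
print(features)
print(positions)
```

Each virus and antiserum-strain pair would contribute one such vector, with the measured antigenic difference for that pair as the regression target.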
Full names and accession numbers of H1N1 seasonal influenza viruses from 1977 to 2008.
| Full name | Accession number |
|---|---|
| A/WUHAN/371/95 | CAC86625 |
| A/BAYERN/7/95 | CAD29944 |
| A/BEIJING/262/95 | ACF41867 |
| A/BRAZIL/11/78 | ABO38065 |
| A/BRISBANE/59/07 | ACA28844 |
| A/BRISBANE/193/2004 | ACD37424 |
| A/CAMBODIA/0371/2007 | ACI45444 |
| A/CHILE/1/83 | ABO38340 |
| A/FLORIDA/13/07 | ACF40117 |
| A/FUKUSHIMA/141/2006 | ACM17297 |
| A/HONG_KONG/2652/2006 | ACD37439 |
| A/INDIA/6263/80 | ABO38362 |
| A/JIANGXI/160/2005 | ACF76722 |
| A/JOHANNESBURG/82/96 | CAD29943 |
| A/KENTUCKY/1/2005 | ABI96135 |
| A/KENTUCKY/02/2006 | ABU86800 |
| A/NEW_CALEDONIA/9/2004 | ABQ09837 |
| A/NEW_CALEDONIA/20/99 | AFO65027 |
| A/PHILIPPINES/673/2006 | ACD37433 |
| A/SHENZHEN/227/95 | AAP34325 |
| A/SICHUAN/4/88 | AAA43231 |
| A/SINGAPORE/6/86 | ABO38395 |
| A/SINGAPORE/14/2004 | ABQ09838 |
| A/SOLOMON_ISLANDS/03/2006 | ABU50586 |
| A/SOUTH_DAKOTA/06/2007 | AFM72510 |
| A/TAIWAN/1/86 | CAA35097 |
| A/TEXAS/36/91 | ABD60955 |
| A/USSR/90/77 | AFM73477 |
| A/VICTORIA/500/2006 | ABQ09960 |
| A/VIRGINIA/01/2006 | ABI96152 |
| A/ENGLAND/333/80 | X00031 |
| A/CHILE/4795/00 | AFO66147 |
| A/FUJIAN/156/00 | AFQ90525 |
| A/HONG_KONG/1870/2008 | AFM72587 |
| A/MALAYSIA/100/2006 | ABQ09959 |
| A/MOSCOW/13/98 | AFQ90529 |
| A/NEIMENGGU/52/2002 | AFQ90530 |
Fig 1. The cross-validation MSE of RFR against the number of variables selected using different mtry functions.
Antigenicity-associated sites of H1N1 identified using the random forest regression algorithm.
| Order of variable importance | Amino acid position |
|---|---|
| 1 | 141 |
| 2 | 130 |
| 3 | 43 |
| 4 | 54, 127, 193 |
| 5 | 186 |
| 6 | 80, 271 |
| 7 | 71 |
| 8 | 36 |
| 9 | 190 |
| 10 | 194 |
| 11 | 163 |
| 12 | 128 |
| 13 | 187 |
| 14 | 189 |
| 15 | 125 |
| 16 | 121, 205 |
| 17 | 321 |
| 18 | 133, 191 |
Fig 2. The cross-validation MSE of SVR against the number of variables used.
Antigenicity-associated sites of H1N1 identified using the support vector regression based on recursive feature elimination.
| Order of variable importance | Amino acid position |
|---|---|
| 1 | 130 |
| 2 | 54, 127, 193 |
| 3 | 43 |
| 4 | 141 |
| 5 | 190 |
| 6 | 160 |
| 7 | 121, 205 |
| 8 | 71 |
| 9 | 273 |
| 10 | 321 |
| 11 | 125 |
| 12 | 96 |
| 13 | 277 |
| 14 | 69 |
| 15 | 187 |
| 16 | 269 |
| 17 | 57 |
Fig 3. Antigenic map of human H1N1 seasonal influenza viruses from 1977 to 2008.
The relative positions of strains (green circles) and antisera (uncolored squares) are adjusted such that the distances between strains and antisera on the map represent the corresponding HI measurements with the least error. One unit (grid) corresponds to a two-fold dilution of antiserum in the HI assay. The cluster-transition amino acid substitutions are shown in red.
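Because one map unit corresponds to a two-fold dilution, distances on the map are on a log2 scale of HI titers. The sketch below illustrates that arithmetic, assuming the common cartography convention that the target distance is the log2 ratio between a reference titer (for example, the highest titer observed for an antiserum) and the observed titer. The function name and the titer values are hypothetical.

```python
import math


def antigenic_distance(reference_titer, observed_titer):
    """Number of two-fold dilutions separating two HI titers (map units).

    Assumption: the reference titer is the highest titer for the antiserum,
    so the result is non-negative for real data.
    """
    if reference_titer <= 0 or observed_titer <= 0:
        raise ValueError("titers must be positive")
    return math.log2(reference_titer / observed_titer)


# Hypothetical titers: 1280 against the reference strain, 160 against a test strain.
d = antigenic_distance(1280, 160)  # 1280 / 160 = 8 = 2**3
print(d)
```

An eight-fold drop in titer therefore corresponds to three grid units on the map.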
Antigenicity-associated sites of H3N2 identified using the random forest regression algorithm.
| Order of variable importance | Amino acid position |
|---|---|
| 1 | |
| 2 | |
| 3 | |
| 4 | |
| 5 | |
| 6 | 144 |
| 7 | |
| 8 | 216 |
| 9 | |
| 10 | |
| 11 | 163 |
| 12 | |
Experimentally validated sites were marked as bold. Accessory substitutions were marked as italic.
Antigenicity-associated sites of H3N2 identified using the support vector regression based on recursive feature elimination.
| Order of variable importance | Amino acid position |
|---|---|
| 1 | |
| 2 | |
| 3 | |
| 4 | |
| 5 | |
| 6 | 216 |
| 7 | 144 |
| 8 | 285 |
| 9 | |
| 10 | |
| 11 | 137 |
| 12 | 196 |
| 13 | 129, 132 |
| 14 | 271 |
| 15 | 131 |
| 16 | 175 |
| 17 | |
Experimentally validated sites were marked as bold. Accessory substitutions were marked as italic.
Comparison of amino acid positions related to antigenic variation of H1N1 seasonal influenza viruses identified by current and previous studies.
| RFR (a) | SVR-RFE (b) | Natural epitope residues (c) | Antigenic cartography (d) | Antigenic epitope regions (e) |
|---|---|---|---|---|
| | | 35 | | |
| 36 | | 36 | | |
| 43 | 43 | 43 | 43 | |
| | | 47 | | |
| 54 | 54 | 54 | 54 | |
| | 57 | | | |
| | 69 | 69 | | |
| 71 | 71 | 71 | 71 | Cb |
| | | 73 | | Cb |
| 80 | | 80 | 80 | |
| | | 82 | | |
| | | 94 | | |
| | 96 | | | |
| 121 | 121 | 121 | 121 | |
| 125 | 125 | 125 | 125 | Sa |
| 127 | 127 | 127 | 127 | |
| 128 | | 128 | | |
| 130 | 130 | 130 | 130 | |
| 133 | | 133 | | |
| 141 | 141 | 141 | | Ca2 |
| | | 146 | | |
| | | 153 | | Sa |
| | 160 | 160 | | Sa |
| 163 | | 163 | | Sa |
| | | 183 | | |
| 186 | | 186 | 186 | Sb |
| 187 | 187 | | | Sb |
| 189 | | 189 | 189 | Sb |
| 190 | 190 | 190 | 190 | Sb |
| 191 | | 191 | | Sb |
| 193 | 193 | 193 | 193 | Sb |
| 194 | | 194 | | Sb |
| 205 | 205 | 205 | 205 | Ca1 |
| | | 209 | | |
| | | 216 | | |
| | | 222 | | Ca2 |
| | | 224 | | |
| | | 267 | | |
| | 269 | | | |
| 271 | | 271 | 271 | |
| | 273 | 273 | | |
| | | 274 | | |
| | 277 | 277 | 277 | |
| | | 295 | | |
| | | 310 | | |
| 321 | 321 | | | |

(a) Twenty-three antigenicity-associated sites identified by random forest regression.
(b) Twenty antigenicity-associated sites identified by support vector regression based on recursive feature elimination.
(c) Forty-one natural epitope residues identified by Huang et al. [13].
(d) Fifteen cluster-difference substitutions revealed by antigenic cartography.
(e) Five antigenic epitope regions described by Brownlee and Fodor [37].
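The agreement between the two methods can be checked directly from the H1N1 site lists in the RFR and SVR-RFE tables. The sketch below copies the positions from those tables and uses plain set operations to separate shared sites from method-specific ones. The variable names are illustrative.

```python
# Positions copied from the H1N1 RFR table (23 sites).
rfr_sites = {
    141, 130, 43, 54, 127, 193, 186, 80, 271, 71, 36, 190,
    194, 163, 128, 187, 189, 125, 121, 205, 321, 133, 191,
}

# Positions copied from the H1N1 SVR-RFE table (20 sites).
svr_rfe_sites = {
    130, 54, 127, 193, 43, 141, 190, 160, 121, 205,
    71, 273, 321, 125, 96, 277, 69, 187, 269, 57,
}

shared = sorted(rfr_sites & svr_rfe_sites)    # found by both methods
rfr_only = sorted(rfr_sites - svr_rfe_sites)  # found by RFR only
svr_only = sorted(svr_rfe_sites - rfr_sites)  # found by SVR-RFE only

print("shared:", shared)
print("RFR only:", rfr_only)
print("SVR-RFE only:", svr_only)
```

Thirteen positions appear in both lists, ten are specific to RFR, and seven are specific to SVR-RFE.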