Hongyang Xu1,2, Yiyan Yang1, Shuning Wang1,3, Ruixin Zhu1, Tianyi Qiu1, Jingxuan Qiu1, Qingchen Zhang1, Li Jin4, Yungang He2, Kailin Tang1, Zhiwei Cao1,5.
Abstract
Mutations of the influenza virus lead to antigenic changes that cause recurrent epidemics and vaccine resistance. Preventive measures would benefit greatly from the ability to predict the potential distribution of new antigenic sites in future strains. By leveraging 90 years of historical HA sequence records, we designed a computational model to simulate the dynamic evolution of antigenic sites in A/H1N1. Given template antigenic sequences, the model can effectively predict the potential distribution of future antigenic mutants. Validation on 10932 HA sequences from the last 16 years showed that the mutated antigenic sites of over 94% of reported strains fell within our predicted profile. The model also captured 96% of the antigenic sites in the dominant epitopes. Similar results were observed on the complete set of H3N2 historical data, supporting the general applicability of our model to multiple subtypes of influenza. Our results suggest that the mutational profile of future antigenic sites can be predicted from historical evolutionary traces despite the widespread, random mutations in influenza. Coupled with closely monitored sequence data from influenza surveillance networks, our method can help to forecast changes in viral antigenicity for seasonal flu and inform public health interventions.
Year: 2016 PMID: 26837263 PMCID: PMC4738307 DOI: 10.1038/srep20239
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Overview of our model to calculate the mutating distribution for the HA antigenic sites.
Steps (A–C) illustrate the construction of the nucleotide transition matrix for HA antigens. Steps (D–F) present the mutant simulation and selection for the template sequences. In step (G), the top 100 mutants are re-ranked according to potential dominance likeliness score.
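The pipeline in this caption can be sketched in Python as a rough illustration. This is not the authors' implementation: the function names, the use of aligned (older, newer) sequence pairs as training input, and ranking by simulated frequency are all assumptions made here, and the paper's selection step and dominance likeliness score are not reproduced.

```python
import random
from collections import Counter

NUCLEOTIDES = "ACGT"

def transition_matrix(pairs):
    """Estimate P(new | old) from aligned (older, newer) sequence pairs.

    Hypothetical input format: equal-length nucleotide strings.
    """
    counts = {a: Counter() for a in NUCLEOTIDES}
    for old, new in pairs:
        for a, b in zip(old, new):
            if a in counts and b in counts:  # skip gaps and ambiguity codes
                counts[a][b] += 1
    matrix = {}
    for a in NUCLEOTIDES:
        total = sum(counts[a].values())
        # If a nucleotide was never observed, assume it stays unchanged.
        matrix[a] = {b: (counts[a][b] / total if total else float(a == b))
                     for b in NUCLEOTIDES}
    return matrix

def simulate_mutant(template, matrix, rng):
    """Draw one mutant by sampling every position independently."""
    return "".join(
        rng.choices(NUCLEOTIDES, weights=[matrix[a][b] for b in NUCLEOTIDES])[0]
        for a in template  # template is assumed to contain only A, C, G, T
    )

def top_mutants(template, matrix, n_samples=10000, top_n=100, seed=0):
    """Return the top_n most frequently simulated mutants (a stand-in ranking)."""
    rng = random.Random(seed)
    freq = Counter(simulate_mutant(template, matrix, rng) for _ in range(n_samples))
    return [seq for seq, _ in freq.most_common(top_n)]
```

Sampling each position independently is the simplest possible choice; a more faithful model would also need the selection and re-ranking criteria described in the paper itself.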
Figure 2. The type coverage and strain coverage during the entire validation period from 1999 to 2014.
The results were grouped every two years using three different template strains announced in 1999, 2006 and 2009. (A) The average results of the five antigenic sites representing the overall epitope areas on the HA protein. The Y-axis on the left indicates the type coverage (solid line) and strain coverage (dashed line) in proportion. The X-axis indicates the evaluation period. (B–F) Prediction results for each individual site. (G) The strain coverage of the prediction profile with three template strains and five sites indicated in one graph. The Y-axis on the right indicates the number of antigenic types in the bar plot.
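The two coverage measures plotted here can be illustrated with a short Python sketch. The definitions used are an assumption inferred from the abstract and this caption, not taken from the paper's methods: type coverage is treated as the fraction of distinct observed antigenic-site sequences that appear in the predicted profile, and strain coverage as the fraction of reported strains whose site sequence appears in it. The function name `coverage` is hypothetical.

```python
def coverage(observed_strains, predicted_profile):
    """Return (type_coverage, strain_coverage) for one site and one period.

    observed_strains: one antigenic-site sequence per reported strain
                      (repeats allowed).
    predicted_profile: iterable of predicted mutant sequences.
    """
    if not observed_strains:
        return 0.0, 0.0
    predicted = set(predicted_profile)
    types = set(observed_strains)  # distinct antigenic types
    type_cov = len(types & predicted) / len(types)
    strain_cov = sum(s in predicted for s in observed_strains) / len(observed_strains)
    return type_cov, strain_cov
```

Under these assumed definitions, strain coverage can exceed type coverage when the predicted types are also the most frequently reported ones.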
Rank combination of dominant epitopes observed every year from 1999 to 2014.
| Year | Relative abundance (a) | Rank combination (b) | Year | Relative abundance (a) | Rank combination (b) |
| 1999 | 54.55% | 1,1,1,1,19 | 2007 | 46.03% | 1,2,2,1,6 |
| 2000 | 30.68% | 11,1,1,1,1 | 2008 | 28.65% | 1,1,2,1,-- |
| | 17.05% | 11,1,1,1,11 | 2009 | 57.09% | 2,1,1,1,1 |
| 2001 | 51.15% | 1,1,1,18,1 | | 18.48% | 1,1,1,1,1 |
| 2002 | 31.25% | 1,1,1,18,19 | 2010 | 44.49% | 2,1,1,1,1 |
| | 18.75% | 1,1,1,18,1 | | 15.66% | 2,1,1,1,24 |
| 2003 | 46.94% | 11,1,1,78,27 | 2011 | 53.07% | 2,1,1,1,24 |
| | 34.69% | 11,1,1,1,1 | 2012 | 65.80% | 2,1,1,1,24 |
| 2004 | 50.00% | 11,1,1,1,1 | 2013 | 43.55% | 2,1,1,39,24 |
| | 25.00% | 11,1,1,1,19 | | 19.41% | 2,1,1,1,24 |
| 2005 | 77.78% | 11,1,1,1,1 | 2014 | 84.91% | 2,1,1,39,24 |
| 2006 | 28.32% | 11,1,1,1,1 | | | |
(a) The relative abundance of each epitope (the combination of five antigenic sites) is calculated from epitope sequence variation in the strains reported each year. Dominant epitopes, listed here, are those with relative abundance above 15% in the yearly reported data during 1999–2014.
(b) The predicted ranks in each combination are for sites Ca1, Ca2, Cb, Sa and Sb, respectively. Only mutants falling within the top 100 are recorded; otherwise the rank is shown as ‘--’.
(c) The combination ranks for 1999–2006 are predicted with template A/New Caledonia/20/1999.
(d) The combination ranks for 2007–2008 are predicted with template A/Solomon Islands/3/2006.
(e) The combination ranks for 2009–2014 are predicted with template A/California/7/2009.
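The bookkeeping described in the footnotes (relative abundance, the 15% dominance cut-off, and the per-site rank string with ‘--’ for anything outside the top 100) can be sketched as follows. The function names and input data structures are hypothetical, chosen only for illustration; the site order follows the footnote.

```python
from collections import Counter

SITES = ("Ca1", "Ca2", "Cb", "Sa", "Sb")  # site order given in the footnote

def dominant_epitopes(epitopes, threshold=0.15):
    """Map each epitope to its relative abundance, keeping those above threshold.

    epitopes: one epitope identifier per strain reported in a given year.
    """
    counts = Counter(epitopes)
    total = sum(counts.values())
    return {e: c / total for e, c in counts.most_common() if c / total > threshold}

def rank_combination(observed_sites, ranked_predictions, top_n=100):
    """Build a rank string such as '1,1,2,1,--' for one observed epitope.

    observed_sites: dict mapping site name to the observed sequence.
    ranked_predictions: dict mapping site name to predicted mutants, best first.
    """
    ranks = []
    for site in SITES:
        ranked = ranked_predictions[site][:top_n]
        seq = observed_sites[site]
        ranks.append(str(ranked.index(seq) + 1) if seq in ranked else "--")
    return ",".join(ranks)
```

For example, a year in which three epitopes account for 60%, 30% and 10% of reported strains would yield two dominant epitopes under the 15% threshold.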