Yunsong Liu, Hui Chen, Wenyuan Duan, Xinyi Zhang, Xionglei He, Rasmus Nielsen, Liang Ma, Weiwei Zhai.
Abstract
Seasonal H3N2 influenza evolves rapidly, leading to extremely poor vaccine efficacy (VE). Substitutions acquired during vaccine production in embryonated eggs (i.e., egg passage adaptation) contribute to the poor VE, but the evolutionary mechanism remains elusive. Using an unprecedented number of hemagglutinin sequences (n = 89,853), we found that the fitness landscape of passage adaptation is dominated by pervasive epistasis between two leading residues (186 and 194) and multiple other positions. Convergent evolutionary paths driven by strong epistasis explain most of the variation in VE, which has resulted in extremely poor vaccines for the past decade. Leveraging the unique fitness landscape, we developed a novel machine learning model that can predict egg passage substitutions for any candidate vaccine strain before the passage experiment, providing a unique opportunity for the selection of optimal vaccine viruses. Our study presents one of the most comprehensive characterizations of the fitness landscape of a virus and demonstrates that evolutionary trajectories can be harnessed for improved influenza vaccines.
Keywords: H3N2 influenza; convergent evolution; epistasis; fitness landscape; passage adaptation; vaccine efficacy
Year: 2022 PMID: 36146872 PMCID: PMC9501976 DOI: 10.3390/v14092065
Source DB: PubMed Journal: Viruses ISSN: 1999-4915 Impact factor: 5.818
Figure 1. The temporal dynamics of egg passage adaptation. (A) The passage histories of the strains in the dataset (D1). (B) The phylogenetic relationships of all the sequences (Dataset D2). The inset illustrates the branches supporting egg passage adaptation (i.e., “egg branches”), which include both monophyletic groups of egg-passaged sequences (i.e., “egg clades”) and single egg-passaged sequences (i.e., “egg terminals”). (C) The eighteen residues driving egg passage adaptation. The top bars indicate the number of nonsynonymous and synonymous changes (for a given sample history). The middle rows indicate the significance of the three statistical tests (Methods). The bottom plot shows whether each of the 18 residues lies in the RBS (receptor binding site) or in antigenic domains A–E. (D) The temporal changes in the frequencies of substitutions at the 18 residues (only a few key residues, 156, 160, 219, 225, 194 and 186, are labelled). (E) The temporal changes in the frequency of different substitutions at residue 194. (F) The temporal changes in the frequency of different substitutions at residue 225.
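The per-year frequencies plotted in panels D–F can be tabulated with a simple count. The following is a minimal sketch, not the authors' pipeline: the record layout (a year paired with a mapping from residue position to a substitution label), the function name, and the placeholder labels `sub_X`, `sub_Y` and `sub_Z` are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def substitution_frequencies(records, position):
    """Return {year: {substitution: fraction of that year's strains}}.

    `records` is an iterable of (year, substitutions) pairs, where
    `substitutions` maps a residue position to a substitution label.
    This layout is hypothetical; it is not taken from the paper.
    """
    totals = Counter()             # strains observed per year
    counts = defaultdict(Counter)  # per-year counts of each substitution
    for year, substitutions in records:
        totals[year] += 1
        if position in substitutions:
            counts[year][substitutions[position]] += 1
    return {
        year: {sub: n / totals[year] for sub, n in counts[year].items()}
        for year in sorted(totals)
    }


# Invented toy records with placeholder substitution labels.
toy_records = [
    (2010, {194: "sub_X"}),
    (2010, {}),
    (2011, {194: "sub_X"}),
    (2011, {194: "sub_Y"}),
    (2011, {225: "sub_Z"}),
    (2011, {194: "sub_X"}),
]
freq_194 = substitution_frequencies(toy_records, 194)
```

Strains without a change at the chosen position still count toward the yearly total, so the fractions for one year need not sum to one.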
Figure 2. The epistatic landscape of egg passage adaptation. (A) Clustering of the temporal trajectories of the most frequent substitutions across the 18 residues (Methods). (B) Cartoon illustration of the tests of the co-occurrence relationship between two amino acid sites along the egg branches. (C) Examples of co-occurrence relationships between substitutions at different amino acid sites. The first column shows the substitution type at the first amino acid site, the second column indicates the second amino acid site, and the third column plots the types of substitutions occurring at the second amino acid site. Each line represents the history along one egg branch. To better illustrate the different combinations of mutations, we colored each co-occurrence pattern by the co-occurring amino acid position (second column). Red: position 203; yellow: position 225; dark purple: position 160. (D) Positive and negative epistatic interactions between the 18 residues. Dot size indicates the level of significance. (E) The epistatic network between the 18 residues. Red and blue lines indicate positive and negative epistatic relationships, respectively. We “clustered” the residues into discrete groups based on their epistatic properties. Residue 186 has a core group of positively epistatic residues, which tend to have negative epistatic relationships with the cluster centered around residue 194. The same pattern applies to the cluster around residue 194. Residues 160 and 225 tend to have positive epistatic relationships with the two clusters surrounding 186 and 194.
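The idea behind the co-occurrence test in panel B can be sketched with a 2×2 table of egg branches. The authors' actual tests are described in their Methods, which are not reproduced here, so the following swaps in a one-sided Fisher's exact test as a stand-in. The branch encoding (one set of changed positions per egg branch), the function names, and the toy data are assumptions for illustration only.

```python
from math import comb


def cooccurrence_table(branches, site_a, site_b):
    """Count branches as (both, a_only, b_only, neither).

    `branches` is a list of sets; each set holds the residue positions
    that changed along one egg branch (a hypothetical encoding).
    """
    both = a_only = b_only = neither = 0
    for changed in branches:
        has_a, has_b = site_a in changed, site_b in changed
        if has_a and has_b:
            both += 1
        elif has_a:
            a_only += 1
        elif has_b:
            b_only += 1
        else:
            neither += 1
    return both, a_only, b_only, neither


def fisher_one_sided(both, a_only, b_only, neither, alternative="greater"):
    """One-sided Fisher's exact test on a 2x2 co-occurrence table.

    "greater" asks whether the sites co-occur more often than expected
    by chance; "less" asks whether they co-occur less often.
    """
    row_a = both + a_only            # branches with a change at site A
    col_b = both + b_only            # branches with a change at site B
    n = row_a + b_only + neither     # all branches
    lowest = max(0, row_a + col_b - n)
    highest = min(row_a, col_b)
    denom = comb(n, col_b)

    def pmf(k):
        # Hypergeometric probability of exactly k branches carrying both.
        return comb(row_a, k) * comb(n - row_a, col_b - k) / denom

    if alternative == "greater":
        ks = range(both, highest + 1)
    else:
        ks = range(lowest, both + 1)
    return sum(pmf(k) for k in ks)


# Invented toy branches; these are not data from the study.
toy_branches = [
    {186, 203}, {186, 203}, {186, 203}, {186},
    {194, 225}, {194, 225}, {194}, {160},
]
table = cooccurrence_table(toy_branches, 186, 194)
p_depleted = fisher_one_sided(*table, alternative="less")
```

A small "less" p-value would point toward a negative relationship between the two sites and a small "greater" p-value toward a positive one. With only eight toy branches the test has little power, so the example shows the mechanics rather than a result.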
Figure 3. The fitness landscape and machine learning models. (A) A cartoon illustration of the fitness landscape of egg passage adaptation. As residues 186 and 194 are the two dominant codons correlated with vaccine efficacy, we summarized the evolutionary trajectories of egg passage adaptation into three possible evolutionary paths: (1) trajectories with substitutions at residue 186 (no change at residue 194), (2) trajectories with substitutions at residue 194 (no change at residue 186), or (3) trajectories with no change at either of the two residues. (B) A schematic of the PEPA (Predicting Egg Passage Adaptation) workflow (Methods). In the cross-validation, we trained the model on 80% of all the egg-passaged sequences (n = 688) and tested the model on the other 20% of the data. (C) The performance of the Random Forest and XGBoost models.
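The three evolutionary paths in panel A and the 80/20 hold-out in panel B can be sketched as follows. This is an illustration under stated assumptions rather than the PEPA implementation: the function names and path labels are hypothetical, a trajectory that changes both residues is given a separate label because the caption does not assign it to any of the three paths, and the example assumes that n = 688 refers to the full set of egg-passaged sequences. Model training with Random Forest or XGBoost is not shown.

```python
import random


def classify_path(changed_positions):
    """Assign a trajectory to one of the three paths in the caption."""
    has_186 = 186 in changed_positions
    has_194 = 194 in changed_positions
    if has_186 and not has_194:
        return "path_186"
    if has_194 and not has_186:
        return "path_194"
    if not has_186 and not has_194:
        return "path_neither"
    # Both residues changed: outside the three paths (assumed label).
    return "both_changed"


def split_80_20(items, seed=0):
    """Shuffle with a fixed seed, then split into 80% train and 20% test."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = round(0.8 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


# Placeholder sequence identifiers standing in for the real sequences.
train_ids, test_ids = split_80_20(range(688))
paths = [classify_path(s) for s in ({186, 203}, {194}, {160, 225}, {186, 194})]
```

A fixed seed keeps the split reproducible between runs. A full cross-validation would repeat the split over several folds rather than use a single partition.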