Jingxuan Qiu1, Xinxin Tian1, Yaxing Liu1, Tianyu Lu1, Hailong Wang1, Zhuochen Shi1, Sihao Lu1, Dongpo Xu1, Tianyi Qiu2.
Abstract
Rapid mutations in the hemagglutinin (HA) of influenza A virus (IAV) can lead to significant antigenic variation and a consequent immune mismatch with vaccine strains, so rapid antigenicity evaluation is highly desirable. Subtype-specific antigenicity models have been widely used for common subtypes such as H1 and H3. However, the continuous emergence of new IAV subtypes calls for a universal antigenic prediction model that can be applied to multiple IAV subtypes, including emerging and re-emerging ones. In this study, we present Univ-Flu, a series of structure-based universal models for HA antigenicity prediction. First, universal antigenic regions were derived from multiple subtypes. Then, a radial shell structure combined with amino acid indexes was introduced to generate new three-dimensional structure-based descriptors, which characterize the comprehensive physicochemical property changes between two HA variants within or across subtypes. Further, by combining a Random Forest classifier with different training datasets, Univ-Flu achieved high prediction performance in independent tests for intra-subtype (average AUC of 0.939), inter-subtype (average AUC of 0.771), and universal-subtype (AUC of 0.978) prediction. These results indicate that the designed descriptor provides an accurate universal antigenic description. Finally, an application to high-throughput antigenic coverage prediction for circulating strains showed that Univ-Flu can screen for virus strains with a broad cross-protective spectrum, providing an in-silico reference for vaccine recommendation.
Keywords: Antigenic prediction; Hemagglutinin; In-silico model; Influenza virus
Year: 2022 PMID: 36090813 PMCID: PMC9436755 DOI: 10.1016/j.csbj.2022.08.052
Source DB: PubMed Journal: Comput Struct Biotechnol J ISSN: 2001-0370 Impact factor: 6.155
Fig. 1. Flowchart of model construction for antigenicity prediction. (A) HA sequences were collected from public resources. The antigenic center and shell structure were determined to describe the residue layout. An antigenic descriptor was designed to describe the property change of each HA pair. (B) Antigenic relationships were determined by HI assay. (C) The antigenic prediction model was constructed.
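The descriptor idea in panel (A) can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `shell_descriptor`, the shell boundaries, the coordinates, and the two four-residue sequences are all hypothetical, and a single amino acid index (Kyte-Doolittle hydropathy values for six residues) stands in for whatever set of indexes Univ-Flu actually uses. The sketch assumes two aligned HA variants that share one set of residue coordinates, bins residues into radial shells around an assumed antigenic center, and sums the absolute property change per shell.

```python
import math

# Stand-in amino acid index: Kyte-Doolittle hydropathy values for a few residues.
# The indexes used by Univ-Flu are not listed in this record; this is an assumption.
HYDROPATHY = {"A": 1.8, "V": 4.2, "K": -3.9, "D": -3.5, "L": 3.8, "I": 4.5}


def shell_descriptor(coords, seq_a, seq_b, center, shell_edges, index):
    """Sum |index[a] - index[b]| over aligned residues, grouped by radial shell.

    coords      -- one (x, y, z) per aligned position (hypothetical shared structure)
    shell_edges -- increasing radii; shell k covers [shell_edges[k], shell_edges[k+1])
    """
    descriptor = [0.0] * (len(shell_edges) - 1)
    for xyz, res_a, res_b in zip(coords, seq_a, seq_b):
        dist = math.dist(xyz, center)
        for k in range(len(shell_edges) - 1):
            if shell_edges[k] <= dist < shell_edges[k + 1]:
                descriptor[k] += abs(index[res_a] - index[res_b])
                break  # residues beyond the outermost shell are ignored
    return descriptor


# Hypothetical toy input: four aligned positions, three shells of 5 units each.
coords = [(1.0, 0.0, 0.0), (0.0, 6.0, 0.0), (0.0, 0.0, 12.0), (30.0, 0.0, 0.0)]
seq_a = "AKDL"
seq_b = "VKKI"
center = (0.0, 0.0, 0.0)
shell_edges = [0.0, 5.0, 10.0, 15.0]

descriptor = shell_descriptor(coords, seq_a, seq_b, center, shell_edges, HYDROPATHY)
print(descriptor)
```

In this toy input the A-to-V change falls in the innermost shell, the D-to-K change in the third shell, and the L-to-I change lies outside all shells and does not contribute. A vector of this kind, computed per HA pair, is the sort of feature a classifier such as Random Forest could be trained on.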
Fig. 2. Antigenic regions of different virus subtypes. (A) Illustration of the receptor binding sites on the HA protein of different virus subtypes. Residues on the 130-loop, 190-helix and 220-loop are labeled in yellow, blue and red for H1 (3LZG), H3 (6AOU) and H5 (2IBX). (B) Illustration of the antigenic region determined by the shell structure model. (C–E) Frequently mutated sites of the H1, H3 and H5 subtypes. Each bar represents one mutation site, and its height gives the maximum residue frequency at that site. Blue bars mark sites located in the antigenic region; black bars mark mutation sites outside it. (F–H) Residue distribution at mutation sites located within the antigenic region. The horizontal axis lists the mutation sites in the antigenic region, and the vertical axis shows the sequence conservation of each amino acid at each site. Residues K, T and H are labeled in orange; D and Q in blue; A, V, L, I, P, W, F, M and N in red; and S, Y, R and E in purple. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
Fig. 3. Model performance on antigenic prediction. (A–C) Tenfold cross-validation of intra-subtype classification based on different machine learning approaches. (D) Independent test performance of the intra-subtype model. (E) Independent test performance of the inter-subtype model. (F) Independent test performance of inter-group prediction. (G) Tenfold cross-validation of the universal antigenic prediction model based on the mixed training dataset of H1, H3 and H5. (H) Independent test of the universal model. The integrated test set includes the test sets of H1, H3 and H5.
Comparison of model performance.
| Influenza subtype of test set | Prediction model | Accuracy | Sensitivity | Specificity | Balanced accuracy |
|---|---|---|---|---|---|
| H1 | Univ-Flu | 0.747 | 0.746 | 0.744 | 0.745 |
| H1 | PREDAV-FluA | 0.662 | 0.632 | 0.697 | 0.665 |
| H3 | Univ-Flu | 0.937 | 0.937 | 0.933 | 0.935 |
| H3 | PREDAV-FluA | 0.780 | 0.800 | 0.750 | 0.775 |
| H5 | Univ-Flu | 0.844 | 0.844 | 0.857 | 0.851 |
| H5 | PREDAV-FluA | 0.813 | 0.923 | 0.640 | 0.782 |
| H9 | Univ-Flu | 0.730 | 0.730 | 0.632 | 0.681 |
| H9 | PREDAV-FluA | 0.753 | 0.984 | 0.222 | 0.603 |
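The balanced accuracy column is consistent with the standard definition, the mean of sensitivity and specificity. The short check below recomputes it from the table values; the variable and function names are illustrative, and the tolerance of 0.001 allows for the three-decimal rounding of the reported figures.

```python
# (subtype, model, sensitivity, specificity, reported balanced accuracy) from the table
rows = [
    ("H1", "Univ-Flu", 0.746, 0.744, 0.745),
    ("H1", "PREDAV-FluA", 0.632, 0.697, 0.665),
    ("H3", "Univ-Flu", 0.937, 0.933, 0.935),
    ("H3", "PREDAV-FluA", 0.800, 0.750, 0.775),
    ("H5", "Univ-Flu", 0.844, 0.857, 0.851),
    ("H5", "PREDAV-FluA", 0.923, 0.640, 0.782),
    ("H9", "Univ-Flu", 0.730, 0.632, 0.681),
    ("H9", "PREDAV-FluA", 0.984, 0.222, 0.603),
]


def balanced_accuracy(sensitivity, specificity):
    """Mean of the true-positive rate and the true-negative rate."""
    return (sensitivity + specificity) / 2


# Largest gap between the recomputed and the reported balanced accuracy.
max_gap = max(abs(balanced_accuracy(se, sp) - reported) for _, _, se, sp, reported in rows)
for subtype, model, se, sp, reported in rows:
    print(f"{subtype} {model}: recomputed {balanced_accuracy(se, sp):.4f}, reported {reported}")
```

The H9 rows show why the balanced figure matters: PREDAV-FluA has the higher raw accuracy (0.753 versus 0.730), but its specificity of 0.222 pulls its balanced accuracy down to 0.603, below the 0.681 of Univ-Flu.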
Fig. 4. Antigenic network illustration and antigenic degree distribution. (A) Antigenic network of the H9 subtype; strain nodes are arranged in concentric circles from the outside inward in order of decreasing degree. (B–F) Degree distribution of virus strains for the H1, H3, H5, H7 and H9 subtypes. The x-axis shows the emergence time of each strain. The y-axis shows the degree values ranked in descending order.
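The degree values in this figure can be read as node degrees in a strain network. The sketch below is a minimal illustration, assuming that an edge links two strains predicted to be antigenically similar (this record does not define the edges) and using hypothetical strain labels. It counts each strain's degree and ranks strains in descending order, as on the y-axis of panels (B–F).

```python
from collections import Counter


def degree_ranking(edges):
    """Return (strain, degree) pairs sorted by descending degree, then by name."""
    degree = Counter()
    for strain_a, strain_b in edges:
        degree[strain_a] += 1
        degree[strain_b] += 1
    return sorted(degree.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical edges: each pair is assumed to be predicted as antigenically similar.
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C")]

ranking = degree_ranking(edges)
print(ranking)  # [('A', 3), ('B', 2), ('C', 2), ('D', 1)]
```

Under this reading, a strain with a high degree is predicted to be antigenically similar to many other strains, which corresponds to the broad cross-protective spectrum mentioned in the abstract.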