Nicholas C Wu1, Arthur P Young2, Laith Q Al-Mawsawi3, C Anders Olson3, Jun Feng3, Hangfei Qi3, Shu-Hwa Chen4, I-Hsuan Lu4, Chung-Yen Lin4, Robert G Chin5, Harding H Luan3, Nguyen Nguyen3, Stanley F Nelson6, Xinmin Li7, Ting-Ting Wu3, Ren Sun8.
Abstract
Genetic research on influenza virus biology has been informed in large part by nucleotide variants present in seasonal or pandemic samples, or by individual mutants generated in the laboratory, leaving a substantial part of the genome uncharacterized. Here, we have developed a single-nucleotide resolution genetic approach to interrogate the fitness effect of point mutations in 98% of the amino acid positions in the influenza A virus hemagglutinin (HA) gene. Our HA fitness map provides a reference to identify indispensable regions to aid in drug and vaccine design, as targeting these regions will increase the genetic barrier for the emergence of escape mutations. This study offers a new platform for studying genome dynamics, structure-function relationships, and virus-host interactions, and can further rational drug and vaccine design. Our approach can also be applied to any virus that can be genetically manipulated.
Year: 2014 PMID: 24820965 PMCID: PMC4018626 DOI: 10.1038/srep04942
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Mutant library passaging and sequencing library preparation.
(A) The HA segment was randomized by error-prone PCR. The randomized segment, together with the remaining seven wild type segments, was transfected into C227 cells to generate the viral mutant library. Two rounds of 24-hour infection were performed in A549 cells at an MOI of 0.05. Both the plasmid library and the passaged viral library were sequenced on an Illumina HiSeq 2000. (B) The HA gene was divided into 12 amplicons for the first PCR. Unique tags were assigned to both ends of the individual molecules during the amplification process. The second PCR generated identical copies of individual molecules linked with unique tags. Red circles represent true mutations; yellow circles represent sequencing errors.
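The tagging scheme in panel B lends itself to a simple consensus step: reads that share a tag descend from the same molecule, so a true mutation should appear in nearly all of them while a sequencing error appears in only a few. The sketch below illustrates that idea only. The function name, the `(tag, sequence)` input format, and both thresholds (`min_copies`, `min_agreement`) are assumptions for illustration and are not taken from the paper.

```python
from collections import Counter, defaultdict


def consensus_by_tag(reads, min_copies=3, min_agreement=0.9):
    """Collapse reads that share a tag into one consensus sequence.

    reads: iterable of (tag, sequence) pairs. Sequences under one tag are
    assumed to be aligned and of equal length (hypothetical input format).
    min_copies and min_agreement are illustrative thresholds.
    """
    groups = defaultdict(list)
    for tag, seq in reads:
        groups[tag].append(seq)

    consensus = {}
    for tag, seqs in groups.items():
        if len(seqs) < min_copies:
            # Too few copies to tell a true mutation from a sequencing error.
            continue
        bases = []
        for column in zip(*seqs):
            base, count = Counter(column).most_common(1)[0]
            # Keep the majority base only if enough copies agree on it;
            # otherwise mark the position as ambiguous.
            bases.append(base if count / len(seqs) >= min_agreement else "N")
        consensus[tag] = "".join(bases)
    return consensus
```

Each surviving tag then contributes one error-corrected read, which is presumably what the Figure 2 caption calls a tag-conflated read.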
Figure 2. Single-nucleotide resolution fitness profiling.
(A) The RF index for individual point mutations across the HA gene was computed. Log10 of the RF index is plotted on the y-axis. Each nucleotide position is represented by four consecutive lines for the RF indices that correspond to mutating to A (blue), T (green), C (orange), or G (red). The log10 RF index of wild type (WT) nucleotides is set as zero. Only point mutations with a coverage of ≥ 30 tag-conflated reads in the plasmid library are shown. Otherwise, point mutations are plotted as a gray circle on the zero baseline. A short region is shown as an inset to demonstrate the resolution of our dataset. (B) The distributions of the log10 RF indices for silent substitutions, nonsense substitutions and missense substitutions are displayed as histograms. Mutations located at the 5′ terminal 200 bp and 3′ terminal 200 bp regions are not included in this analysis to avoid confounding by the vRNA packaging signal (ref. 50).
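As a rough illustration of how a per-mutation value like the one plotted in panel A could be computed, the sketch below assumes the RF index is the ratio of a mutation's frequency in the passaged library to its frequency in the plasmid library. That definition is not stated in this record, so treat it as an assumption. The coverage filter of 30 comes from the caption and is interpreted here as the number of tag-conflated reads carrying the mutation. The function and parameter names are hypothetical.

```python
import math

MIN_PLASMID_COVERAGE = 30  # threshold stated in the Figure 2 caption


def log10_rf_index(mut_plasmid, depth_plasmid, mut_passaged, depth_passaged):
    """Return log10 of the (assumed) RF index, or None if not reportable.

    mut_*:   tag-conflated reads carrying the mutation in each library
    depth_*: total tag-conflated reads covering that position
    Assumption: RF index = passaged frequency / plasmid frequency.
    """
    if mut_plasmid < MIN_PLASMID_COVERAGE:
        return None  # below the coverage filter described in the caption
    if mut_passaged == 0:
        return None  # log10 is undefined for a ratio of zero
    freq_plasmid = mut_plasmid / depth_plasmid
    freq_passaged = mut_passaged / depth_passaged
    return math.log10(freq_passaged / freq_plasmid)
```

Under this assumed definition, a value of zero means the mutation kept its share of the population through passaging, which matches the caption's convention of setting wild type nucleotides to zero.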
Comparison with phenotype reported in the literature
| Substitution (a) | RF index | Expected Phenotype (b) |
|---|---|---|
| Y174H (Y159H) | 0.04 | Deleterious |
| D238G (D225G) | 0.23 | Deleterious |
| Y342H (Y328H) | 0.16 | Deleterious |
| Y342C (Y328C) | 0.11 | Deleterious |
| Y342N (Y328N) | 0.04 | Deleterious |
| Y342F (Y328F) | 0.37 | Neutral |
| D111E (D110E) | 1.03 | Neutral |
| Q299R (Q298R) | 1.00 | Neutral |
(a) Positions of the substitutions are named based on our wild type protein sequence. Positions in parentheses follow the naming used in the corresponding reference.
(b) Expected phenotype is classified as deleterious or neutral based on the reported phenotype.
(c) Temperature-sensitive mutation, for which 37°C is a non-permissive temperature.
(d) Prefers the α2,3-linked sialic acid receptor (avian) and does not efficiently bind to the α2,6-linked sialic acid receptor (human).
(e) Only Y and F at this residue support efficient viral replication under our growth conditions, which lack trypsin.
(f) Mutations that were confirmed to thermodynamically stabilize the HA protein.
Figure 3. Experimental validation.
(A) The top panel displays the log10 TCID50 value of mutant virus rescued from transfection. The bottom panel represents their log10 RF indices from the biological duplicate. (B) A Pearson correlation of 0.9 is obtained between log10 TCID50 from transfection (x-axis) and log10 RF index (y-axis).
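For readers who want to reproduce the kind of comparison in panel B on their own measurements, a Pearson correlation coefficient can be computed directly from paired values. This is a generic, minimal sketch and not the authors' analysis code. The function name and argument names are illustrative.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and of squared deviations about the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)
```

Here `xs` would hold the log10 TCID50 values and `ys` the log10 RF indices of the same mutants, in the same order.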
Figure 4. Structural analysis on hemagglutinin.
(A) All α-helices (orange, red, pink, cyan, green, yellow) and a non-structural loop (blue) in HA are highlighted. Mean log10 RF indices for individual highlighted structural elements are shown. (B) The log10 RF indices for all observed X → P mutations (where X can be any amino acid except P) in individual highlighted structural elements are plotted as stripcharts. The colors of the stripcharts match the highlight colors of the corresponding structural elements in panel A. The bottom stripchart represents the non-structural loop that undergoes α-helix formation during membrane fusion. (C) The helical wheel was constructed with DrawCoil 1.0 (http://www.grigoryanlab.org/drawcoil/). The amino acid property of each residue is color coded. Polar: orange; hydrophobic: grey; positively charged: red; negatively charged: blue. (D) The bar chart represents the RF indices of all profiled amino acid substitutions at heptad position d. RF indices of silent mutations are also included for comparison.
Figure 5. Essential regions on hemagglutinin.
(A–B) The RF indices of the most destructive missense substitutions in the profiling data for individual amino acids are projected on the HA protein structure to identify essential regions intolerable to mutations. (C) The RF indices of the least destructive missense substitutions in the profiling data for individual amino acids are projected on the HA protein structure to identify essential regions intolerable to mutations. The inset represents the side chain interaction between HA (grey) and the proposed influenza universal antibody CR6261 (green) (PDB: 3GBN) (ref. 28). Parentheses represent the residue naming according to HA2 (ref. 28). The mean log10 RF indices of nonconservative mutations for each residue are shown. Note that residue 389 is an aspartic acid in the structure but an asparagine in our wild type HA sequence. A compatible rotamer for T392 was generated using PyMOL to display the hydrogen bond. All hydrogen bonds (black dotted lines) are displayed as described (ref. 28). (A–C) Red: RF index < 0.05; orange: RF index < 0.1; green: other. The structure is based on PDB: 1RUZ (ref. 49). (D) The RF indices for missense mutations within the universal antibody recognition sites are shown. Types of amino acid substitution are color coded with red: nonsense substitution; orange: nonconservative substitution; blue: conservative substitution; green: silent mutation. A conservative substitution is defined as having a positive score in the BLOSUM80 matrix.
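The color scheme for panels A–C and the conservative-substitution rule for panel D can be written down compactly. The sketch below uses the thresholds given in the caption (red below 0.05, orange below 0.1, green otherwise) and the "positive BLOSUM80 score" rule. The function names, the input dictionary layout, and the convention of passing the matrix as a dictionary keyed by residue pairs are assumptions made for illustration.

```python
def color_for_rf(rf_index):
    """Map an RF index to the color bins given in the Figure 5 caption."""
    if rf_index < 0.05:
        return "red"
    if rf_index < 0.1:
        return "orange"
    return "green"


def residue_colors(missense_rf, pick=min):
    """Color each residue by one representative missense RF index.

    missense_rf: {residue_position: [RF indices of missense substitutions]}
                 (hypothetical layout).
    pick: min selects the most destructive substitution (panels A-B);
          max selects the least destructive one (panel C).
    """
    return {
        pos: color_for_rf(pick(values))
        for pos, values in missense_rf.items()
        if values  # skip residues with no profiled missense substitution
    }


def is_conservative(matrix, wild_type, mutant):
    """True if the substitution has a positive score in the given matrix.

    matrix: dict keyed by (wild_type, mutant) residue pairs, supplied by
            the caller (for example, loaded from a BLOSUM80 table).
    """
    return matrix[(wild_type, mutant)] > 0
```

Passing `pick=max` colors a residue red or orange only when even its best-tolerated missense substitution is strongly deleterious, which is the stricter of the two projections.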