Literature DB >> 31458257

Insights into the ZIKV NS1 Virology from Different Strains through a Fine Analysis of Physicochemical Properties.

Sergio A Poveda-Cuevas^1,2,3, Catherine Etchebest^4,5,6,3, Fernando L Barroso da Silva^1,2,3,7.

Abstract

The flavivirus genus has several organisms responsible for generating various diseases in humans. Recently, especially in tropical regions, Zika virus (ZIKV) has raised great health concerns due to the high number of cases affecting the area during the last years that has been accompanied by a rise in the cases of the Guillain-Barré syndrome and fetal and neonatal microcephaly. Diagnosis is still difficult since the clinical symptoms between ZIKV and other flaviviruses (e.g., dengue and yellow fever) are highly similar. The understanding of their common physicochemical properties that are pH-dependent and biomolecular interaction features and their differences sheds light on the relation strain-virulence and might suggest alternative strategies toward differential serological diagnostics and therapeutic intervention. Due to their immunogenicity, the primary focus of this study was on the ZIKV nonstructural proteins 1 (NS1). By means of computational studies and semiquantitative theoretical analyses, we calculated the main physicochemical properties of this protein from different strains that are directly responsible for the biomolecular interactions and, therefore, can be related to the differential infectivity of the strains. We also mapped the electrostatic differences at both the sequence and structural levels for the strains from Uganda to Brazil, which could suggest possible molecular mechanisms for the increase of the virulence of ZIKV in Brazil. Exploring the interfaces used by NS1 to self-associate in some different oligomeric states and interact with membranes and the antibody, we could map the strategy used by the ZIKV during its evolutionary process. This indicates possible molecular mechanisms that can be correlated with the different immunological responses. By comparing with the known antibody structure available for the West Nile virus, we demonstrated that this antibody would have difficulties to neutralize the NS1 from the Brazilian strain. The present study also opens up perspectives to computationally design high-specificity antibodies.

Entities: Chemical Disease Gene Mutation Species

Year: 2018 PMID： 31458257 PMCID： PMC6643396 DOI： 10.1021/acsomega.8b02081

Source DB: PubMed Journal: ACS Omega ISSN： 2470-1343

Introduction

The flavivirus genus has several organisms responsible for generating diseases in humans, which often cause global health concerns [e.g., dengue virus (DENV), West Nile virus (WNV), and yellow fever virus (YFV)].[1−3] During the last years, the worldwide attention has moved to Zika virus (ZIKV) due to the large increase of infected people and affected areas, which has been accompanied by a rise in the cases of the Guillain–Barré syndrome and microcephaly.[4,5] A practical critical issue is that, in many cases, the ZIKV symptoms can be confused with other viruses, resulting in misdiagnoses and, posteriorly, inadequate treatments.[6,7] Consequently, the outbreak of ZIKV has produced a race to better diagnose it, as well as to treat and stop its epidemic growth. Yet, to date, there is no totally effective medication, easy and cheap diagnosis, or even treatment. The complex life cycle of ZIKV and its capacity to be transported by several mosquito vectors aggravate the health issues.[8] Its spread is followed by an evolution of the strains together with distinct aggressive effects. The first isolation of the ZIKV occurred in Uganda (UG) in 1947, where the reported cases did not indicate the severe neurological disorders that the ZIKV can produce. Later, cases were found in Africa, Southeastern Asia, Oceania, and Americas.[9,10] From this, it has been established that there are two lineages of ZIKV, one Asian and another African,[11,12] as well as that the latter might be divided into two.[13] However, how each mutation directly contributes to the change in virulence remains unknown. As indicated in a recent review by Simonin and collaborators,[14] the high number of microcephaly cases observed for the Brazilian strain (Asian lineage) might be a result of a mild infection in contradiction to the common sense. The African lineage causes a much more severe infection inducing an early termination of pregnancy (higher mortality) without giving a chance to the development of congenital malformations.[15] Unfortunately, no molecular mechanisms have been proved to be responsible for these different pathogenicities. Therefore, the understanding of their common physicochemical properties and biomolecular interaction features sheds light on the relation strain-infectivity and might suggest alternative strategies toward an efficient serological diagnostics and possible therapeutic interventions. ZIKV is a single-stranded RNA virus that encodes one polyprotein, which is later processed to generate three structural proteins (C, prM, and Env) and seven nonstructural (NS) proteins (NS1, NS2a, NS2b, NS3, NS4a, NS4b, and NS5). Several of these proteins are directly involved in various molecular processes that define the structure or replication of the virus.[1−3] Here, we will focus on the NS1 protein due to its high biological relevance. Among other features, this protein is able to self-associate into a symmetric head-to-head homodimer, which increases its hydrophobic character in the early viral stages. The formed NS1 dimer interacts with the membrane of the endoplasmic reticulum (ER).[2,16−19] Later, the NS1 can have three possible destinations:[2,20] (1) it can function as a cofactor in the replication complex, (2) it can go toward the membrane surface to interact with different membrane receptors, or (3) it can go toward the extracellular space to interact with proteins of the immune system. Such versatile behavior of NS1 is exploited by the virus to carry out its development since this protein has the ability to evade the immune response of the host,[21,22] acting on several pathways of the complement system (e.g., see ref (23)). It is known that all this diversity of interactions leads to a cascade of signaling, preventing the good metabolic functioning of the host.[2] Due to its involvement in the viral RNA replication, NS1 may also be an attractive target for drug developments. pH is a trigger physicochemical parameter in several viral processes.[3,24−26] It is directly related to the macromolecular electrostatic properties and the molecular interactions.[27] In a previous study,[28] Song and coauthors compared electrostatic maps on the molecular surfaces of NS1 proteins from ZIKV, DENV, and WNV. They showed diverse electrostatic features among them at host interaction interfaces and suggested the possibility to relate the pathogenesis with the electrostatic behavior. Another study has shown that the dissociation of NS1 dimers occurs in acidic pH regimes, and more stable dimers prevail under physiological pH conditions.[29] All of these studies imply that the interactions of NS1 with any other charged molecule (including its self-association) and membranes of the host might be different for each flavivirus and even for each strain, producing a diverse of clinical manifestations. This also indicates the possibility to put NS1 as a potential candidate for the design of vaccines and as a biomarker for the detection of the virus in seropositive patients.[30−32] All of these facts have motivated the present work where we extended the preliminary study of Song and collaborators[28] to explore the electrostatic properties of several ZIKV strains at several solution pH regimes. We propose that analyzing these properties at three key biological functional interfaces [(i) the self-association interface (homodimer interface), (ii) the predicted NS1 molecular region that interacts with the membranes, and (iii) a putative epitope region] could provide a way in the understanding of the differential infectious behavior of the ZIKV. This was achieved by means of numerical simulations and structural bioinformatics tools. Moreover, the quantification of biomolecular interactions is essential for understanding the molecular biology. In this direction, we performed semiquantitative estimations of the free energies of interactions of the ZIKV NS1 in some different oligomeric states at these biointerfaces. These thermodynamic measurements provide practical insights into some possible molecular mechanisms that can be related with the virology. Although the NS1 undergoes a complex maturation process and can be found in several oligomeric states from monomers that exist briefly (in the endoplasmic reticulum lumen) to hexamers (in late infection),[19,33−37] only the monomeric and dimeric forms are discussed here for several reasons. First, the monomer is the beginning of all processes and the only state where all biological interfaces can be compared on a common basis. Second, the idea that the high-molecular-weight form of NS1 is a homodimer as introduced by Winkler et al.[37] seems to be a consensus of a general feature of flavivirus infection. The subject has also been recently reviewed in different papers (e.g., ref (36)), which puts the dimer in a central position. The homodimerization is a fundamental step to provide to NS1 the necessary characteristics to associate with membranes, which is central for the viral replication.[34] Third, the homodimer is the predominant form of intracellular NS1.[35] Fourth, the glycosylation of NS1 is not required for both the dimerization and the NS1–membrane association, which allows the use of a simpler theoretical approach.[38] Finally, the hexamer state is more obscure in the experiments where it has been discussed that they might even require the presence of stabilizing elements to maintain the three NS1 dimers complexed to form the hexameric state of NS1.[34] Tetrameric intermediates have also been reported.[33] The present analyses are carried out exploring global titration features as well as more local effects at the biofunctional interfaces at both the sequence and structural levels. It is also interesting to note that despite the small changes in the sequence due to the high sequence identity among the studied strains, the electrostatic properties are strongly affected by the pH, which also affects their biomolecular interactions with partners and, consequently, the molecular viral biology. This paper is organized as follows. In the next section, we present the theoretical methods used in this work starting with the comparative modeling of the NS1 structures for each ZIKV strain, the molecular model, and numerical simulation details. This section describes the main physicochemical properties calculated, how the studied biointerfaces were defined based on either bioinformatical tools or crystallographic data, and how some relevant biomolecular interactions invoking simple colloidal-like ideas were estimated. A key concept is the pKa that is computed for all titratable residues and represented in different color-coded maps for a comparison among the strains especially at the biointerfaces. In the subsequent Results and Discussion section, two groups of data are given. First, we focus on the global protein properties related to charges and dipoles and on the corresponded estimated free energies of interactions. Then, we present and discuss pKa’s and their maps comparing the strains. This allows us to cover from a more basic physicochemical description in the first part to a more biochemical meaningful discussion in the second part. Finally, the main results of this paper are discussed and summarized.

Theoretical Methods

Computational approaches play an important role in modern virology contributing from studies of the virus assembly to insights into the dynamical and interactive properties of the structures of viruses and their components.[32,39] In this scenario, numerical simulations based on statistical mechanics methods allow to explore many thermodynamical properties and also the main features of the interaction of macromolecules providing free-energy derivatives as a function of the macromolecules separation at different environmental conditions. Here, we combine a fast constant-pH Monte Carlo (MC) scheme with canonical bioinformatic tools to investigate the ZIKV NS1 from different strains.

Three-Dimensional (3D) Modeling

Initially, we defined the most interesting ZIKV strains for this investigation based on previous phylogenetic studies, the history of ZIKV outbreaks, and the rise in the cases of microcephaly and the Guillain–Barré syndrome. The Uganda (African lineage) and Brazilian (Asian lineage) strains were assumed to be the two extreme and most important cases supported by the literature.[9,11,12] The sequences of Uganda (UG), Senegal (SE), Central African Republic (CAR), Malaysia (MA), Thailand (TH), Yap Island (YAP), and Brazil (BR) were selected from the study of Kochakarn and collaborators,[12] where phylogenetic reconstructions were carried out from ZIKV genomes from different regions of the world. These sequences are identified, respectively, with accession numbers AY632535, KF383117, KF268949, HQ234499, KU681081, EU545988, and KU365777, provided by the National Center for Biotechnology Information. Conceptual translations were made with the Basic Local Alignment Search Tool, using “blastx” with default parameters.[40] Crystal structure coordinates are available for the strains from Brazil and Uganda only. They were provided by the RCSB Protein Data Bank (PDB)[41] with PDB ids 5GS6 (resolution 2.85 Å) and 5K6K (resolution 1.89 Å).[17,18] These two structures are quite similar except for a few structural differences that are observed in some loop regions (1.9 Å root-mean-square deviation (RMSD) across all 352 pairs and 0.706 Å RMSD between 310 pruned atom pairs). The PDB files were edited before the calculations. All water molecules and heteroatoms were removed. Missing residues were added in the structures with the “UCSF Chimera 1.11.2” interface of the program “Modeller” for building all missing segments.[42] We will refer to these completed three-dimensional structures with the label “X-ray” in this text. The original PDB structures with chains A and B were used as templates in all comparative modeling reported here. There are no crystal structures available for the ZIKV NS1 proteins from the other strains (SE, CAR, MA, TH, and YAP). Their three-dimensional models in both oligomeric states (monomer and dimer) were built from the known structures using Modeller version 9.18.[43] For each variant and oligomeric state, we generated two different models using the completed structures of the PDB ids 5GS6 and 5K6K as templates. Accordingly, the use of two comparative models makes it possible to evaluate the impact of the structural fluctuations in all outcomes. For the same reason, comparative models for the UG and BR strains were also produced with the crossing structures (e.g., the BR sequence and the UG template were used together to generate the BR three-dimensional model). We shall refer to the resulting models obtained with the completed structures (the templates) from PDB ids 5GS6 and 5K6K, respectively, as 1 (for the BR strain) and 2 (for the UG strain). The subscripts “mod1” and “mod2” are used to label them in the next sections. Monomers were produced by the simple deletion of one given chain. Finally, as suggested by Wang and coauthors,[19] we used the structure of the complex between the antibody Fab fragment 22NS1 and the incomplete structure of WNV (PDB id 4OII with a resolution of 3.0 Å)[37] to estimate the location of possible residues of ZIKV NS1 that have a chance to be part of epitopes and be responsible for the protein–antibody interaction. The residues 1–175 that correspond to the N-terminal domain of the NS1 protein were missing in this X-ray structure. By comparison at the sequence level of the NS1 proteins given by the PDB ids 4OII (WNV NS1), 5GS6 (ZIKV NS1BR), and 5K6K (ZIKV NS1UG), we removed from the PDB files 5GS6 and 5K6K the 175 residues from the N-terminal domain that were missing in the PDB file 4OII. The generated models were used only for the calculations related to the antibody analysis. We assumed that the absence of the N-terminal domain was not necessary for the interaction of NS1 with the antibody. This is reasonable since the structure without the N-terminal domain was found bound with the antibody at the crystal, indicating that the missing part is not essential for the complexation. The used antibody (available in the PDB id 4OII) is the Fab fragment 22NS1 as reported by Edeling et al.[44] The structures of the NS1 proteins without the antibody are superimposed and shown in Figure . It is interesting to point out that 13 residues on the WNV epitope out of the 21 residues are identical to the corresponding residues in ZIKV.[19]

Figure 1

Comparison of different crystallographic NS1176–352 structures from two flaviviruses. Superimposed structures in ribbon representation are the NS1 from WNV (blue), ZIKVUG (red), and ZIKVBR (green). The PDB ids are 4OII, 5K6K, and 5GS6, respectively.

Coarse-Grained (CG) Modeling

Coarse-grained models are very convenient and appropriate to be applied in the present study because they offer the possibility to explore the main physical features of a system with a reduced number of parameters and lower computational costs, which allows us to repeat the calculations on several different experimental conditions. Therefore, the proteins were modeled here as rigid bodies (i.e., bond lengths, angles, and dihedral angles are kept fixed) in an amino acid coarse-grained (CG) level of details according to the three-dimensional structures provided by the comparative modeling as described above. Following previously successful studies,[45,57] each protein atomistic coordinate is transformed into a collection of charged Lennard-Jones (LJ) spheres of radii (R) and valence z mimicking amino acids. Amino acids sizes were taken from ref (45). For the sake of convenience, these values are also reported in Table S1. The valences of all ionizable residues were assigned by the fast proton titration scheme (FPTS)[46,47] according to the solution pH. Details of this titration scheme, its pKa benchmarks, and the discussions related to its approximations can be found elsewhere.[47] We note that this CG model focuses on the electrostatic features of the biomolecular system and has been predicting pKa’s with good accuracy at low computational costs.[27] This is highly important in the present study due to the elevated number of ionizable residues in the NS1 proteins, which increases the electrostatic coupling between them. As in the original FPTS paper,[46] the intrinsic acid dissociation constants (pK0) were taken from the experimental work of Nozaki and Tanford.[48] These values together with the number of each amino acid in the ZIKV NS1 structures and their charge numbers in the deprotonated and protonated states are listed in Table S1. Figure gives a scheme of the CG model. Each protein is placed inside an electroneutral open cylindrical cell of radius rcyl and height lcyl with a static dielectric constant of 78.7 (assuming a temperature of 298 K). The shape of the simulation box is chosen only for convenience and has no effect on the outcomes as soon as the box is bigger than the protein size. Counterions and added salt particles were represented implicitly using a screening term, i.e., for two titratable residues i and j, the screening is given by [exp(−κr)], where κ is the modified inverse Debye length length (as suggested in ref (81)) and r is the interparticle separation distance. Therefore, the electrostatic interactions [uel(r)] between any two charged amino acids of valences z and z are given bywhere ϵ0 is the dielectric constant of the vacuum (ϵ0 = 8.854 × 10–12 C2 N–1 m–2), ϵs is the dielectric constant of the medium (78.7 here), and e = 1.602 × 10–19 C is the elementary charge. See refs (27 and 47) for more details.

Figure 2

Schematic representation of model cell used in this study. For each titration, one protein was built up by a collection of charged LJ spheres of radii (Rai), and valence zai is placed in an electroneutral open cylindrical cell of radius rcyl and height lcyl. The solvent is represented by its static dielectric constant ϵ. Counterions and added salt particles are represented by Debye’s term κ. Positively and negatively charged, neutral, and hydrophobic amino acids are represented in blue, red, white, and green, respectively.

Constant-pH Monte Carlo Simulations

Monte Carlo (MC) methods are numerical integration tools that have become a useful procedure to compute thermodynamic averages in theoretical biochemistry and contribute to rationalize complex molecular mechanisms.[45,49,50] The CG model was solved via MC simulations using the Metropolis criteria in an NVT ensemble. Calculations were performed with a personalized version of the Faunus biomolecular simulation package,[49] where the FPTS is implemented to study pH-dependent processes.[46,47] In the first part of the work, to illustrate that the differences observed in the simulated pKa’s do influence the biomolecular interactions, three physicochemical properties [the protein net charge number (Zp), the charge regulation parameter (Cp), and the dipole moment number (μp)] were evaluated during the Monte Carlo runs for each ZIKV strain (UG, SE, CAR, MA, TH, YAP, and BR) using each structure model (X-ray, mod1, and mod2) and oligomeric state (monomer and dimer). They are averaged quantities calculated from all molecular charges spatially distributed in the system sampled during the MC runs. Zp, Cp, and μp were also computed for the antibody, the incomplete structure of WNV NS1 protein, and the corresponding truncated structures for NS1ZIKV BR and NS1ZIKV UG. These quantities are equivalent to the ones obtained by the Henderson–Hasselbalch equation with the local electrostatic potential used by Podgornik and coauthors[51] and represent the lower multipoles moments and the charge fluctuations based on the details of charge distributions in these proteins and their acid–base equilibria. Their magnitudes will be used here to estimate different free energies of interactions within a simple colloidal-like framework. Cp is the variance of the mean net protein charge, , which is an example of a linear response. Note that z (a given charge in a titratable group i) is based on the total electrical potential at the coordinates of the titratable site i, and ⟨Z⟩ = ∑1⟨z⟩ is measured after all z values for the N titratable sites were obtained during the MC runs. Salt concentration was kept constant at physiological conditions (150 mM) in all calculations. It was assumed that this is the ionic strength where the most important metabolic processes of the virus occur. A vast number of simulations were performed as a consequence of the numbers of strains, structural models, and oligomeric states. For each system (i.e., a given strain, structural model, and oligomeric state), a series of MC simulations with pH varying from 0 to 14 using a 0.1 interval were run to produce the theoretical titration plots and similar ones for the other pH-related properties. It was from this series of 140 independent simulations that the pKa’s were computed as the pH where the residue is half-protonated as it is done in wet laboratory experiments.[52,53] For each pH condition, at least two production runs with 108 MC steps were carried out after an equilibration phase of 106 MC runs.

Maps of the Electrostatic Properties

Computed pKa’s were converted into color-coded maps using the completed structure of the ZIKV NS1UG (seemingly older strain) as reference. For the comparisons at the monomeric state, chain A of the NS1UG was adopted as the reference. The pKa shifts were mapped at both the proteins sequences and their molecular surfaces. For the mapping at the sequence level, the alignments were done using the high-speed multiple sequence alignment web tool “MUSCLE” (Multiple Sequences Alignment by Log-Expectation).[54] Default parameters and the sequence conservation criteria were used. Positive shifts are represented in dark blue, whereas red is used for the negative ones. The pKa shifts can be due to the sequence (strain), structure, and mixed effects. To differentiate them, three pKa shifts were calculated for each case depending on the molecular model (1 or 2) used as input in the MC runs: (a) ΔpKa = pKa_strain_mod1 – pKa_UG_xray, (b) ΔpKa = pKa_strain_mod2 – pKa_UG_xray, and (c) ΔΔpKa = ΔpKa – ΔpKa. ΔpKa values other than zero are candidates for interesting cases. However, to discriminate pKa shifts related to the sequence from those shifts due to the comparative modeling process, we analyzed the ΔΔpKa’s too. Only the cases where the signs of the pKa for the two models are the same (i.e., the variation of the pKa occurs in the same direction, which happens when ΔpKa × ΔpKa > 0) are the relevant ones for the present analysis. These are the cases that can be used to quantify the effects from the evolutionary mutations. By simultaneously observing these two sets of pKa values (for models 1 and 2), we can find the five possible situations listed in Table . There are clear situations where the pKa shifts are due to the different sequences. For instance, this can be seen for condition 1 in this table. When ΔΔpKa is equal to 0 and the product of the pKa shifts between the two models is greater than 0 (condition 1 in Table ), there is a pKa shift and both models give the same shifts. So, there are no artifacts that were produced by the comparative modeling process. This is one of the most important cases. Conversely, we can find situations where there can be a mixture of the effects produced by the structure and by the primary sequence. These two features are contributing together and are indistinguishable. This is what happens for condition 4. There are shifts (ΔΔpKa ≠ 0), but each model predicts the shift in the opposite direction (ΔpKa × ΔpKa ≤ 0). In this situation, the shifts are most likely due to the structural differences. Finally, since NS1UG is used as a reference, we have distinguished those amino acids where some mutation occurs in NS1BR (condition 5 in Table ). For our purposes, the most interesting situations are given by conditions 1, 3, and 5. Additional colored geometric figures were included at the top of the alignments to guide the reader to the most interesting pKa shifts from the biological perspective. They were based on the shifts between the BR and UG strains. Table summarizes all of these issues and lists the adopted criterion, the relevant cases, and the corresponding colored geometric figures. Alignments were also made from the sequences of the incomplete structures of the NS1 of WNV and the corresponding one from the ZIKV (UG and BR strains). For these comparisons, we have used the NS1 of WNV as reference. This analysis was simpler because all of the computed pKa’s were done using the crystallographic structures in this case.

Table 1

Criteria To Find the Biologically Relevant Cases with pKa Shiftsa

condition		biological relevance	graphical representation
1	ΔΔpK_a = 0 and ΔpK_a₁ × ΔpK_a₂ > 0	relevant	circle
2	ΔΔpK_a = 0 and ΔpK_a₁ × ΔpK_a₂ = 0	not relevant	triangle
3	ΔΔpK_a ≠ 0 and ΔpK_a₁ × ΔpK_a₂ > 0	relevant	square
4	ΔΔpK_a ≠ 0 and ΔpK_a₁ × ΔpK_a₂ ≤ 0	not relevant	inverted triangle
5	mutation	relevant	diamond

ΔΔpKa is calculated between the pKa shifts values from models 1 (ΔpKa = pKa_strain_mod1 – pKa_UG_xray) and 2 (ΔpKa = pKa_strain_mod2 – pKa_UG_xray) of NS1BR, where pKa_strain_mod1, pKa_strain_mod2, and pKa_UG_xray refer to the computed pKa’s obtained using the modeled protein structures, respectively, from the template from the BR strain, from the template from the UG strain, and from the completed UG crystallographic structure. Complementarily, the pKa shifts were also projected at the three-dimensional structures. For the sake of consistency, the color-coded map was obtained by the ΔpKa’s between the UG and BR strains. These two strains represent the oldest and the newest mutations acquired during the ZIKV evolution. They also belong to different lineages, which makes them appealing for this comparison. We used the available crystallographic coordinates for this analysis, from where we also extracted the B-factors. For the estimation of the conserved regions, the “CONSURF” server[55] was employed with default parameters and 90 sequences as suggested by the server.

Analytical Estimates

Semiquantitative theoretical estimates of the free energy of interactions provided a practical way to rationalize the molecular processes and are often obtained from analytical equations based on simple Derjaguin–Landau–Verwey–Overbeek (DLVO) theory arguments.[45,56,57] Following a previous work,[45] the electrostatic free energy for the self-association of the ZIKV NS1 proteins [ADVLO(r)] can be approximately described in terms of electrostatically screened contributions aswhere, for the two macromolecules immersed in an electrolyte solution, these terms can approximately be written asThe screened factors for the dipolar interactions areandThe size of the proteins in different oligomeric states was estimated from their full atomistic coordinates using the completed structures. The “ProteinVolume” server[58] was employed for this purpose. For the sake of convenience, on the basis of the calculated sizes, we assumed a separation distance r equal to 50 Å for the monomeric state and 100 Å for the dimeric state. The theoretical framework behind this set of equations is the Phillies model,[59−61] where anomalous properties of the screening environment are not considered. The limitations of this approximation are discussed in the literature.[51,62,63] A related point to consider here is that although more elaborated theoretical approaches might be important for protein–protein complexation in some cases, there is still an interest in general mechanisms such as the used free-energy equations using these simpler colloidal-like frameworks usually in good agreement with numerical simulations and that can contribute to rationalize more complex systems.[45,57,64,82] The variable d was seen as an adjustable parameter here[65] and assumed to be equal to r.[45] For the NS1–antibody association, only the charge–charge contribution was taken into account due to the severe screening from the other terms.[66] All results were expressed in kT units, where k is the Boltzmann constant (1.3807 × 10–23 J mol–1 K–1) and T is the temperature (in kelvin). A similar approach was used by Lund, Åkesson, and Jönsson to estimate the free energy of interaction [ΔA(R)] between a protein (hisactophilin) and a charged surface mimicking a lipid membrane[67]where ϕ is the electrostatic surface potential generated by the charged membrane groups and β is equal to 1/kT. We have adopted ϕ = −95 mV, which roughly corresponds to the potential calculated for ER membranes in neurons.[68]

Biological Interfaces

As already mentioned, we have focused on three selected biological interfaces of interactions: (1) dimeric (or self-association), (2) membrane, (3) and a putative epitope. They were determined by canonical structural bioinformatics tools. For the dimeric interface, the online server “PDBePISa”[69] was used, providing the residues directly involved in the NS1–NS1 interaction. In a similar way, the interface NS1–antibody as given by the PDB id 4OII was dissected by “PDBePISa”. The choice of an experimental crystallographic structure to determine the epitopes as done here avoids all of the uncertainties commonly seen in epitope predictions by different theoretical methods.[32] The embedded residues in the membrane were estimated by the “PPM” server.[70] As the outputs from the “PPM” can be slightly dependent on the rotation, we have submitted rotated coordinates of the same structural model to have an inclusive protocol taking into account all amino acids that most likely interact with the membrane as given by each coordinate submitted to the “PPM” server. A few differences were observed between the rotated models. Depending on the coordinates, about one to three neighbor amino acids can be included or removed from the obtained prediction. Therefore, we have included all returned list from the server for the different runs with a set of rotated coordinates. The percentage of the exposure of the titratable amino acids was determined by the “PROPKA” program (version 3.1).[71] The outcomes were used to create a grayish color scale that is included in the alignment figures. The most exposed ionizable amino acids are represented in black in this scale, whereas white is used for buried residues. All programs were run with default parameters.

Results and Discussion

Physicochemical Properties of the Proteins

Electrostatic interactions are described as very important for virus in general[25,26,72,73] and also specifically for the NS1 protein.[17,28] The high number of titratable groups in NS1 (e.g., there are 34 GLUs in the monomeric structure) does indicate that these interactions should contribute to the molecular mechanisms. For this reason, we started the present study, quantifying the main physicochemical properties of the proteins that are directly responsible for the molecular interactions and, therefore, can be related to the virus molecular basis and, as such, contribute to the understanding of the differential infectivity of the strains. In this context, within a simple colloidal-like framework, the total charge (Zp) of the system, the charge regulation parameter [or capacitance, (Cp)], and the dipole moment number (μp) are key physical quantities that provide an important initial insight into the biomolecular interactions between the NS1 proteins and their partners. They directly affect the different physical forces responsible for protein associations (e.g., coulombic attraction for oppositely charged proteins,[57] dipolar attraction for macromolecules with high dipole moments,[45] and macromolecular attraction due to a charge regulation mechanism[50,64,66,74,82]). The global similarities and especially the differences among these physicochemical properties of the NS1 proteins from all of the studied strains might have a direct biological implication as discussed above. Charge–charge interactions are expected to be the most important one.[66] Hence, the values of Zp, Cp, and μp at the physiological concentration of salt (150 mM) and different solution pH values were obtained from the numerical simulations with a single protein in the electrolyte solution. Calculations were carried out for each NS1 genetic variant of ZIKV, i.e., from Uganda (UG), Senegal (SE), Central African Republic (CAR), Malaysia (MA), Thailand (TH), Yap Island (YAP), and Brazil (BR), for the two oligomeric states (monomer and dimer). Three-dimensional (3D) structures were obtained directly from X-ray atomistic data when available or were built using homology modeling methods (see Theoretical Methods for details of the procedure). The high sequence identity (the smallest sequence identity between these strains is 96.9%; see Table ) makes the procedure robust and the models reliable. Nevertheless, to evaluate the impact of the structural fluctuations brought by the building procedure, as already mentioned, we used two different models (noted models 1 and 2), based on two different templates (5GS6 and 5K6K), for each variant and oligomeric state.

Table 2

Percentages (%) of Identity of Sequences for Different NS1 ZIKV Strains

identity
	UG	SE	CAR	MA	TH	YAP	BR
UG	100.0
SE	98.3	100.0
CAR	98.0	98.0	100.0
MA	97.4	97.4	97.7	100.0
TH	96.9	96.9	97.4	99.4	100.0
YAP	96.9	97.4	97.4	98.9	98.9	100.0
BR	97.4	97.4	98.0	99.4	99.4	99.4	100.0

Figure shows the computed average net charges and their charge regulation parameters as a function of pH at physiological ionic strength from pH 5 to 7.5. This is the pH range of important cell compartments.[75] The observed titration pattern of each strain is well characterized. Despite the small changes in the sequence, the total protein charge of each strain is differently affected by the pH. The NS1UG and NS1CAR strains are predicted to be two extreme cases, the NS1UG being the more negatively charged protein and the NS1CAR the more positively charged one. Three different groups of strains can be identified for the ZIKV NS1 proteins in terms of their titration behavior and isoelectric points (pI). For the monomeric state, (a) the UG strain with pI = 6.7 is apart from the other groups, (b) the SE, MA, TH, YAP, and BR strains with an intermediated value of pI (7.1) form a group with similar titration properties (the curves for MA, TH, YAP, and BR are seen on top of each other), and (c) the CAR with a pI shifted to the basic regime (pI = 7.4) defines another extreme case (see also Table S2). For the dimeric state, a similar behavior is observed (pIUG < pISE,MA,TH,YAP,BR < pICAR). In this case, the corresponding pIs are 6.6, 7.1, and 7.3 (see also Table S3). Note that the pI values are all around physiological pH (∼7.0) for both the monomeric and dimeric states. This facilitates, for instance, the NS1 self-association around the pI due to the relatively small protein net charges that result in weaker or almost null repulsive electrostatic interactions between the protein pairs. Since this pH condition is found in several cellular environments, it indicates that all ZIKV strains are able to take advantage of it. It is interesting to note that the intermediated group contains strains mostly from the Asian lineage (MA, TH, YAP, and BR strains). This is particularly more evident for the pH window between 5 and 6.5, where the SE strain (African lineage) is a bit outside the intermediated group. From the monomeric to the dimeric state, the difference between the UG strain and the others tends to increase. This means that the UG strain dimer tends to be less positively charged at the acid regime (pH < pI) and more negatively charged at the basic regime (pH > pI). Henceforth, it is expected that the NS1UG will interact differently from other charged molecules in the cell (e.g., for pH < 6.7, the monomeric form of NS1UG will be less attracted by a negatively charged membrane, whereas NS1CAR will be strongly attracted). This leads us to think that the electrostatic properties have an important evolutionary role defining different functionalities in highly conserved proteins such as NS1. Full titration plots obtained in our simulations are provided in the Supporting Information (Figure S1). From the very acidic to the basic pH conditions, the charge Zp approximately varies from +57 to −55 (for the monomeric state) and from +116 to −109 (for the dimeric state). In both high and low pH regimes, the dimers present a higher total protein charge than the monomers (see Figure S1). Structural differences due to the manner that the protein models were built are shown to be small.

Figure 3

Physicochemical properties of different ZIKV NS1 strains (using model 1). (a, b) Simulated protein net charge number as a function of pH. (c, d) Simulated charge regulation parameter as a function of pH. Plots on the left- and right-hand sides are, respectively, for the monomers and the dimers. The estimated deviations calculated based on the averaged values for models 1 and 2 are also shown. Data from MC simulations at 150 mM of salt concentration. We note that some curves are on top of each other. Considering a pair of macromolecules, the second most relevant contribution in the interaction energy between them is often the charge regulation mechanism.[66] This induction interaction is especially important for the interaction of an approximately neutral protein with another (highly) charged macromolecule. Plots of the charge regulation parameter as a function of pH at physiological salt concentration are also shown in Figure . In the pH window of 5–7.5, it decreases from ca. 5 to 1.7, for the monomeric state, and from ca. 10 to 3.5, for the dimeric state. Since the attractive charge regulation mechanism depends on the product (CpZp2), this indicates that the charge regulation mechanism also decreases its importance for the homoassociation of the NS1 proteins as pH becomes closer to pI. Full plots of Cp as a function of pH at 150 mM of NaCl are shown in Figure S1c,d. For the monomers, we observe that the highest peaks occur around pH values 3.6 (Cp ≈ 8.7) and 10.8 (Cp ≈ 6.0) (see Figure S1c). For the dimers, high peaks around pH values 3.6 (Cp ≈ 16.6) and 12.5 (Cp ≈ 11.8) are obtained as shown in Figure S1d. Between pH 5.5 and 7.0, we can observe that Cp values are somewhat different for NS1UG and NS1SE compared to the other strains for both monomers and dimers. They tend to be smaller in this pH regime. Unlike what was seen for Zp, we cannot clusterize the strains based on their values. There seem to be no strong differences between those of Africa and Asiatic origins in terms of their capacitance. Therefore, we can hypothesize that during the evolutionary process, this property has been conserved probably because it might be so significant for the biomolecular associations of NS1 and even would play no role in the differential virulence of ZIKV strains. The other physicochemical property that can play a role in the molecular association is the dipole moment. However, the contribution of the charge–dipole term often vanishes for biomolecules at physiological salt condition. The dipole–dipole interactions also tend to be smaller than those of the other contributions.[66] Higher other moments would have a negligible contribution. For the sake of completeness, we evaluated μp for all of the strains. Results are presented in Figure S2 for both the monomeric and dimeric states at 150 mM of salt. There are significant differences depending on structural models because dipole moments are quite sensitive to the structural coordinates. Despite these, we can observe that each strain exhibits a specific curve pattern. This might indicate that although the total net charge is relatively maintained, there are local differences probably at the biointeractive interfaces. NS1UG has higher μp at all pH regimes. This can be explored in more detail in the future when theoretical methods that couple titration and conformational changes became feasible for larger proteins.

NS1 Associations

We have limited the present study on the investigation of the impact of each ZIKV strain in the interactions of NS1 with its three main partners: (a) itself in a dimerization process, (b) a membrane represented by a negative surface potential, and (c) an antibody. Following ref (66), we estimated the corresponding electrostatic free energy of interactions [βw(r)], applying the simulated Zp, Cp, and μp values obtained at pH values in eqs –6. In this semiquantitative analytical approach,[56] we have used the values of Zp, Cp, and μp obtained from the MC simulations with model 1 at 150 mM of salt. On the basis of a simple thermodynamical criterion,[76] negative values of βw(r) indicate an attractive regime, whereas w(r) > 1 corresponds to repulsion. Despite the intrinsic approximations, the central aspect here is to verify possible differences in the interactions between the NS1 protein for each strain and its different targets.

NS1–NS1 Homooligomers

We first examined the pH effects on the homoassociation process of the NS1 proteins for all of the strains to form the homodimers from the monomeric units (NS1 + NS1 → NS1 dimer) as this is a very fundamental step to give NS1 the capacity to associate with membranes.[34] Another important association is the formation of hexamers from three homodimers (NS1 dimer + NS1 dimer + NS1 dimer → NS1 hexamer). For the sake of convenience, this last process was only partially investigated here in a simplified manner by means of a dimer–dimer interaction (NS1 dimer + NS1 dimer → NS1 tetramer) that might be relevant to the hexamer formation. The transition between the membrane-bound dimer (NS1 dimerbound) to form a soluble trimer of dimers and then a hexamer is still unclear in the literature. Moreover, since tetrameric intermediates have also been reported,[33] it might be useful to have more information about this eventually intermediate step in the hexamer formation observed in late infection. Figure shows the calculated electrostatic free energy of the interactions as given by eq at physiological ionic strength, assuming r = 50 Å. This corresponds to roughly the NS1 diameter (i.e., the center–center separation distance of two NS1 proteins at the contact). For the sake of convenience, all plots were shifted down by an arbitrary value of −9 kT to mimic the addition of an estimated attractive van der Waals (vdW) contribution. In a typical Hamaker description often used in colloidal science, attractive numbers are usually in the range of 3–10 kT.[77] With this assumption, negative values of βw(r) are observed at pH values between ca. 5 and 10, indicating that a homoassociation can happen at this relatively large pH window. Despite the simplicity of the model, this result agrees with the experimental data.[29] A smaller attractive vdW contribution will narrow this pH window without affecting the qualitative comparison between the strains, which is the central aspect of the present study. The picture is largely dominated by the charge–charge (coulombic) contributions. Both the charge regulation and dipolar interactions contribute with negligible numbers (less than 0.1kT at maximum). As expected, the behavior of βw(r) for each strain follows what was shown before for Zp. The NS1UG is shifted to the acid pH regime, which permits the complexation to both start and be over at smaller pH values. All of the other strains have a similar profile with the NS1CAR being the farther case. The differences between the strains become clearer for the tetramerization process (Figure b). Thus, the observed differences between the NS1UG and the other strains in the self-association process might be related to their distinct biological responses as it is known that the formation of NS1 oligomers is very important for the ZIKV functionality.[2,28]

Figure 4

Homodimerization in two different types of oligomers for different NS1 of ZIKV. The electrostatic free energies of interactions are calculated as a function of pH. (a) Monomer state. (b) Dimer state. Data from the analytical equations using the parameters obtained from the MC simulations at 150 mM of salt concentration.

NS1–Membrane Interaction

We next investigated the pH effects on the interactions of the ZIKV NS1 protein from each strain with a general membrane model. Assuming that a surface membrane electrical potential equals −95 mV[68] and using in eq the values of Zp obtained from the MC simulations, we estimated the electrostatic free energy of interactions for the protein–membrane association [βΔA(R)]. The results at 150 mM salt are shown in Figure for both oligomeric states. The monomer–membrane case is included here for comparisons and the sake of completeness since the only biologically relevant system is the dimer–membrane. The same tendency for the NS1UG and NS1CAR to be extreme cases as observed for the NS1 dimer–dimer association is seen here too for both cases. The transition from monomer to dimer is reported to increase the hydrophobicity of NS1. Electrostatic attraction might contribute to this process as well. For instance, comparing the monomeric and dimeric cases, we can see that the discrepancy in βw(r) between the different strains is increased for the dimers, i.e., the dimerization amplifies the biomolecular interactions in comparison with the monomers. The dimerization also slightly decreased the pH where a transition between an electrostatic attraction and a repulsive regime is observed [pH where βw(r) = 0]. For the monomeric case, the NS1UG is attracted by the membrane until pH 7.4, whereas the NS1CAR is attracted by pH values up to 9.5. Conversely, for the dimer case, these pH values suffer a small decrease to 7.3 and 9.4 for, respectively, UG and CAR. Most of the NS1 proteins from the other strains (NS1SE, NS1MA, NS1TH, NS1YAP, and NS1BR) can be grouped together in terms of their estimated interactive behavior. This group of proteins can be attracted by the membrane until an intermediate pH value (∼8.0). At the pH of the ER lumen (pH 7.2), all proteins have an attractive behavior. CAR is the system where the strongest electrostatic attraction is analytically estimated at this pH value, followed by the group of strains mentioned above (SE, MA, TH, YAP, and BR), and finally by the UG strain.

Figure 5

Protein–membrane interactions in two different types of oligomers for different NS1 of ZIKV. (a) Monomer state. (b) Dimer state. The electrostatic free energies of interactions between the complexes are calculated as a function of pH. Data from the analytical equations using the parameters obtained from the MC simulations at 150 mM of salt concentration. Overall, we can see that the NS1 proteins from different strains do not have the same βw(r) profile. Each strain produces an NS1 protein with specific molecular interactive properties, i.e., the magnitude of the attraction NS1 dimer–membrane is specific for each strain. It is legitimate to suppose that such differences on how NS1 interacts with the membrane should have an impact on the differential virulence of each strain among other factors. There should be a direct relation between the biomolecular interactions and the biological effects. The NS1–membrane association is a critical step to the assembly and release of the viral lipoprotein particle.[34] Akey and coauthors previously suggested the relation between the observed phenotype and the loss of effective interactions with transmembrane proteins of the replication complex.[35] On the basis of such arguments, a stronger affinity between the NS1 dimer and the membrane would promote a better association of this protein with intracellular membranes and consequently would favor the viral replication. Indeed, it has already been reported that African strains have a more aggressive effect on human neural stem cells compared to strains of Asian origin.[15] Moreover, it has been hypothesized that the strains from an African lineage ZIKV have a deadly effect on the early stages of the fetus, whereas the Asian lineage is less destructive, allowing the fetal development to continue, but then congenital malformations can appear at a later stage.[15,67] If these processes are related to the viral genome replication as it seems to be the case, it could be assumed that there is some kind of relationship between that and the results found here. As seen in Figure , for both the monomer and dimeric cases, it might be that African NS1 proteins (as NS1CAR) have a higher affinity for negatively charged membranes since the residues involved in the interaction are positively charged. On the contrary, a less attractive effect at pH 7.2 should occur in the NS1 strains of Asiatic origin, due to the presence of more negative residues that prevent a strong affinity with the intracellular membranes. The hydrophobic characteristic of the NS1 homodimer might contribute to further enhance the NS1–membrane attraction.

West Nile Virus NS1

To go further and better understand the impact of sequence differences on the main physicochemical properties of NS1, we decided to examine a primary sequence, which has a sequence identity much lower than that found between the ZIKV strains. In this regard, the West Nile virus sequence (denoted WNV), for which a crystallographic structure of the NS1 complexed with an antibody is available, is a convenient case. Considering that the sequence identity between the available regions of the NS1 of WNV and ZIKV in the crystallographic structure is around 59%, we can examine how the amino acid composition affects these physicochemical properties in these proteins along the pH scale. Furthermore, the NS1–antibody complex can also be used as a “template” to estimate the interactions between the antibody and the NS1 from ZIKV. This does not necessarily mean that the WNV NS1–antibody can have a promiscuous behavior and also be able to bind the ZIKV NS1 in vivo. To our knowledge, there is no information about the possibility of polyspecificity or cross-reactions for the WNV–antibody. The theoretical calculations done here are simply to illustrate that each ZIKV strain would have a different electrostatic response for a given antibody molecule. This will be true even if the real ZIKV–antibody will be different from the WNV one as common sense suggests. In addition, it is interesting to point out that Wang and coauthors suggested that the WNV NS1 could be used in the development of antibodies against ZIKV.[19] Moreover, Rocha and coauthors[78] identified NS1–dengue epitopes that are recognized by two monoclonal antibodies (2H5 and 4H1BC) that also cross-reacted with NS1 ZIKV protein. In some cases, the sequence identity between the dengue epitope and the corresponding epitope in Zika can be less than 77% (see the Supporting Information of ref (78)). Since our main goal was to evaluate the interaction properties of an epitope region with an antibody, in the absence of 3D structure of ZIKV with Rocha’s antibodies, we turned out to the sole 3D complex that was indubitably characterized, namely, the evolutionary closely related WNV NS1 protein interacting with 22NS1. The crystallographic NS1WNV structure was missing the 1–175 N-terminal region. We have assumed that the missing amino acids would not affect the complexation because, even when absent in the known experimental structure, the oligomer could be formed in the crystal conditions. For comparison purpose, we reevaluated the electrostatic properties of NS1ZIKV restricting to the same region. The first procedure is to compute these properties for both the antibody and the NS1WNV and to compare them with the results for ZIKV. This can be seen in Figure . The pI for NS1WNV (5.7) is smaller than what was previously computed for all NS1ZIKV strains. From pH 7.5 to the acid regime, the NS1WNV is always less charged than the ZIKV strains. The strain from the UG is the intermediate case, whereas the BR strain is the most positively charged molecule within this pH range. The same order is observed for the charge regulation parameter. The pI for the antibody is 8.7. From these titration data, it can be anticipated that the NS1WNV will form the strongest complex with the antibody, and the NS1BR the weaker complex. Due to its negligible contribution, μp was not reported here.

Figure 6

Physicochemical properties for different NS1176–352 from two flaviviruses [WNV PDB id 4OII and ZIKV PDB id 5GS6 (BR) and 5K6K (UG)] and the Fab fragment 22NS1 (derived from 4OII). (a) Net protein charge number as a function of pH. (b) Charge regulation parameter as a function of pH. Plots have specifically focused on the typical physiological pH conditions for the virus stability and infectivity. Data from MC simulations at 150 mM of salt concentration.

NS1–Antibody

Finally, using the C-terminus NS1 of WNV and ZIKV, we calculate the βw(r) as a function of pH for these proteins with the antibody found in the crystallographic structure of WNV (see Figure ). For NS1ZIKV, we assumed that the epitope region is the same as that for NS1WNV. We built the NS1 ZIKV–antibody complex by superimposing the common regions between NS1WNV and NS1ZIKV. As expected, βw(r) starts to be negative at a smaller pH for the WNV–antibody complex, confirming that this is the system with the stronger affinity followed by the NS1ZIKV UG and then the NS1ZIKV BR. This result confirms that the ZIKV BR strain is less attracted by this specific antibody for WNV, which should imply a different immunological response. As expected based on the common sense for specificity for antibody–antigen interactions, an immune system that produces this antibody for WNV would have difficulties to neutralize the ZIKV NS1BR. Our result clearly demonstrated that the electrostatics properties of the 22NS1 antibody have to be modified first before any other fine subtle residue changes. This result is of utmost importance as it significantly limits the number of residues combination to design a synthetic antibody.

Figure 7

Protein–antibody interactions for different NS1176–352 of two flaviviruses. The electrostatic free energies of interactions between the complexes are calculated as a function of pH. Data from the analytical equations using the parameters obtained from the MC simulations at 150 mM of salt concentration. The antibody Fab fragment 22NS1 used in these calculations is from the WNV as given by the PDB id 4OII.

Mapping the Electrostatic Properties at the Sequence Level

We described above the detected differences at the main global protein physicochemical properties. To refine this analysis, we also evaluated the differences in the electrostatic properties at the amino acid level due to the small variation on the sequence identity. This allows us to map the protein regions largely affected by the evolutionary mutations that could be related to the self-association of NS1 and the interactions with the membrane and the antibody. Consequently, these regions have different interaction affinities with their molecular partners and could be directly related to the patterns of differential infectivity found in different strains of ZIKV. In addition, regions where variations do not occur must be fundamental for the virus, and, as a consequence, their electrostatic conservation must be maintained. Computed pKa values of all titratable groups were employed in this mapping process. The pKa calculations were carried out for all available structures, i.e., the experimental ones as well as the 3D models. Thus, we can differentiate the shifts from the sequence itself from those that are due to the tertiary structure. For both the UG and BR strains, we performed the calculations with the two chains (A and B) that were accessible at the PDB. They are referred to as UG_chA_X-ray, UG_chB_X-ray, BR_chA_X-ray, and BR_chB_X-ray. For the other strains, the same notation was followed, replacing the “_X-ray” by the correspondent comparative structural models labels (“_mod1” and “_mod2” for the structures obtained from the templates from the BR and UG strains, respectively). In fact, the interesting properties are the pKa shifts from a chosen reference system. The pKa shifts can be due to the sequence (strain), structure, and mixed effects. To distinguish impact from sequences to structures, three pKa shifts were calculated for each case depending on the molecular model, i.e., 1 or 2. In this study, we used the NS1UG as such reference. Accordingly, ΔpKa (or 2) stands for the difference between pKa of a given amino acid in model 1 (or 2) of a given strain (pKa_strain) compared to the same amino acid in X-ray structure of UG (pKa_UG). We also evaluated ΔΔpKa = ΔpKa – ΔpKa, which reflects the fine influence of the structural building procedure. The ionic strength was kept constant at 150 mM during all simulation runs. Figure presents the sequence alignments for all studied NS1 strains. At the bottom line, both the degree of conservation and the exposition to the solvent of the titratable amino acids are represented by the usual stars and the grayish little squares (where black means a very exposed amino acid, and white, a buried one). The results for the ΔpKa’s are displayed as a color-coded ΔpKa scale, with blue being the positive shifts (ΔpKa > 0), and red the negative ones (ΔpKa < 0). The numerical data used to build up this color-coded ΔpKa scale are reported in Table S4. Amino acids’ one-letter codes highlighted in bold indicate those that are found at the NS1 self-association interfaces. The other important interfaces are indicated at the top line (above the amino acid numbering) by the letters “m” (for the interaction with the membrane) and “a” (for the NS1–antibody interface defined here based on the NS1 antigenic molecule known from the WNV NS1–Fab fragment 22NS1 from the PDB id 4OII). For the sake of convenience, we also introduced another notation at the third top line with a colored geometric figure. This is done to guide the reader to the most interesting pKa shifts from the biological perspective. Colored circles, triangles, squares, inverted triangles, and diamonds correspond to conditions 1, 2, 3, 4, and 5, respectively, as given in Table . This symbol comparison is done between the UG and BR strains. Sequence positions labeled with triangles indicate the pKa shifts that come from the differences in the three-dimensional structure. The most relevant cases to determine the shifts associated with the strains are both the ones where there was a mutation on a specific sequence position (see the diamonds in Figure ) and the cases where the pKa shifts for models 1 and 2 are in the same direction (ΔpKa × ΔpKa > 0; see the circles and squares in Figure ). For instance, the diamond at position 52 means that there is a mutation from the UG sequence (our reference in this study) to the BR strain. In this example, we can observe the effect of mutation E52D for the BR strain (and also for MA, TH, and YAP strains). At position 51, we can see the circle. In this case, E51 is conserved in the BR sequence (and all of the others), but in some strains, this residue experiences a different electrostatic environment, and positive pKa shifts were measured. From Table S4, we can see that both models (1 and 2) give exactly the same pKa shift (pKa mod1_br = 0.1 and pKa mod2_br = 0.1; see Table S4), indicating that the observed shifts are not an artifact from the structure. Similar cases where the pKa shifts are not exactly the same but keep the same signal can also be found. See the green square at position 14 in Figure , where pKa mod1_br = −0.4 and pKa mod2_br = −0.2 (values from Table S4). Actually, there are other cases like this one closer to the NS1 self-association interface (e.g., R29, E156, and E192). There are pKa shifts at the biological functional interfaces and also at other sequence positions with apparently no specific relation with them. A detailed description of each amino acid behavior with a description of the involved biofunctional interface is provided in Table S5. We can summarize the main observations from these analyses with the numerical quantification of all possible cases. These statistical data are given in Table . Interesting comments can be made using the data presented in this table. For instance, it becomes clear that TYR maintains a high degree of conservation in terms of the pKa shifts. There are seven TYRs that are not affected by the changes in the sequences from each strain, including the TYRs present at the functional interfaces (there is one TYR at each interface). The only TYR affected is not at these interfaces. The other titratable amino acids have situations where pKa shifts were observed in two functional interfaces. The total number of GLUs is 34, being the highest number of ionizable amino acids in the NS1 protein followed by 25 ARGs, 22 LYSs, 16 ASPs, and 12 HISs. From the 34 GLUs, 13 were affected by the mutations from the UG to the BR strains. Considering the total number of GLUs, this corresponds to a ratio equal to 0.38. It is also the highest relation between the number of affected cases of a given amino acid divided by its total number. Nevertheless, only parts of the affected GLUs are at the functional interfaces. There is one at the dimeric interface and three at the epitope. This amino acid is not presented at the membrane interface. HISs that are located at the functional interfaces have a high tendency to be sensitive to the mutations. We found three HISs at the dimeric interface, the membrane interface, and the epitope. HISs experienced pKa shifts at both the self-association and the epitope interfaces due to the evolutionary mutations: one out of three at the self-association interface and one out of one at the epitope. In short, we can say that (a) ARG (1), GLU (1), HIS (1), and LYS (2) are affected at the dimeric interface; (b) ARG (1), ASP (1), and LYS (1) at the membrane interface; and (c) GLU (3) and HIS (1) at the epitope. The number of occurrences of each amino acid at these interfaces was given between parentheses. It seems that these residues are essential for the functionality and survival of the virus since they are engaged in regions of crucial interactions.

Figure 8

Table 3

Number of Ionizable Residues with pKa Shifts in Some Important Regions of NS1BR of ZIKV and Their Relationsa

	total				dimer interface				membrane interface				epitope
residue	N^T	N′^T	N′^T/N^T	N′^V/N′^T	N^D	N′^D	N′^D/N^D	N′^D/N′^T	N^M	N′^M	N′^M/N^M	N′^M/N′^T	N^E	N′^E	N′^E/N^E	N′^E/N′^T
ARG	25	4	0.16	0.50	1	1	1.00	0.25	2	1	0.50	0.25	3	0	0.00	0.00
ASP	16	3	0.19	0.33	5	0	0.00	0.00	1	1	1.00	0.33	3	0	0.00	0.00
GLU	34	13	0.38	0.31	2	1	0.50	0.08	0	0	0.00	0.00	5	3	0.60	0.60
HIS	12	2	0.17	1.00	3	1	0.33	0.50	1	0	0.00	0.00	1	1	1.00	1.00
LYS	22	6	0.27	0.50	5	2	0.40	0.33	2	1	0.50	0.17	0	0	0.00	0.00
TYR	8	1	0.13	0.00	1	0	0.00	0.00	1	0	0.00	0.00	1	0	0.00	0.00

NT = total number of residues, N′T = number of residues with pKa shifts, N′V = N′D + N′M + N′E, ND = total number of residues in the dimeric interface, N′D = number of residues with pKa shifts in the dimeric interface, NM = total number of residues embedded in the membrane, N′M = number of residues with pKa shifts embedded in the membrane, NE = total number of residues in the antibody interface of NS1 WNV, N′E = number of residues with pKa shifts in the antibody interface of NS1 WNV.

Alignment from different NS1 (monomers) of several geographic regions. When the amino acid suffers pKa variation, it is shaded with blue or red indicating, respectively, that it is more positive or more negative with regard to the reference (NS1UG). Residues are shaded with yellow or green when a substitution per acidic or basic residue occurs with respect to the reference. Explanations about the symbols above the alignment are given in the text (see also Table , where the numerical criterion is provided to guide the use of these symbols). The letters “m” and “a” above the alignment are residues embedded in the membrane and epitope regions (this last region was defined based on the crystallographic structure of the complex WNV-Fab fragment 22NS1). Residues that belong to the dimeric interface are shown in bold. The bars below the alignment indicate exposure of charged residues on the protein surface, with darker areas indicating more exposure. The last line indicates the grade of conservation between proteins. NT = total number of residues, N′T = number of residues with pKa shifts, N′V = N′D + N′M + N′E, ND = total number of residues in the dimeric interface, N′D = number of residues with pKa shifts in the dimeric interface, NM = total number of residues embedded in the membrane, N′M = number of residues with pKa shifts embedded in the membrane, NE = total number of residues in the antibody interface of NS1 WNV, N′E = number of residues with pKa shifts in the antibody interface of NS1 WNV.

Mapping the Electrostatic Properties at the Structural Level

The analysis at the sequence level can be seen as a rough estimation since the three-dimensional distribution of the amino acids charges has a clear effect on the strength of the molecular interactions that are distance-dependent properties. Even homologous proteins might have regions with different folds,[79] which have a direct effect on the protonation states of the titratable groups. Therefore, the pKa shifts reported above were also mapped at the protein surface following a similar color code. Exploring possible relations with other relevant properties, Figure shows the pKa shifts regarding the same reference (NS1UG) as a molecular surface color mapping. The threshold of ΔpKa = 0.2 defined this color mapping to better highlight all affected molecular regions. In the same figure, we included a map for the degree of conservation and the B-factor of each residue. Because the B-factor is a thermodynamic measure of the structural flexibility and provides important information about the averaged protein dynamics properties, a comparison between this variable and pKa shifts is interesting to be investigated. Comparing the maps for the ΔpKa’s and B-factors, we can conclude that there is no clear relation between the protein flexibility and the ΔpKa’s for the NS1BR. Conversely, there seem to be more pKa shifts at conserved regions of the protein. This makes sense in evolutionary terms since viruses can have some regions where the mutation rate is higher “to counter” the attacks from the hosts they infect.[73] However, we must emphasize that it is also interesting that independent of that degree of conservation, the variation of pKa can be seen between strains with a high sequence identity. In this way, each particular NS1 strain of ZIKV would have the ability to interact differently from their possible different partners to which they can be associated.

Figure 9

Tridimensional structures of the NS1BR monomer. Models from the left to the right show the degree of conservation, B-factor of each residue, and pKa variation regarding the reference (NS1UG). See the text for details. In Figure , the focus is on the comparison between the pKa shifts and the biological functional interfaces. For the sake of convenience, the same molecular map of the ΔpKa’s previously seen in Figure is reproduced again in this figure. From the left to right in this figure, we start with the pKa shifts color mapping, followed by the molecular surface representation of the dimeric interface and the residues that are embedded in the membrane. The last image on the right-hand side described the three distinct domains (β-roll, wing domain, and β-ladder domain) that are structurally important for NS1 together with the molecular regions that interact with the antibody. There are pKa shifts in all of these domains. In general, most of the pKa shifts occur at the dimeric interfaces and epitope. There are only a few amino acids affected at the region that directly interacts with the membrane: two with negative shifts, D30 (ΔpKa ≅ −0.3) and K116 (ΔpKa ≅ −0.3), and one with positive shift, R29 (ΔpKa ≅ 0.3). It might be not surprising since it has been put forward that the interaction with the membrane would be driven by a hydrophobic mechanism at least for the WNV.[80] Nevertheless, such shifts explain the reduction in the free energy of interaction between NS1 and the membrane, as discussed in Figure . We have to bear in mind that electrostatic interactions are long-range interactions, and even charges at larger separation distances can contribute with the attraction between NS1 and the membrane. Thus, the presence of some titratable groups at the interface reinforces the evidence that electrostatic interactions are important to favor the approach of NS1 and the membrane, while the hydrophobic ones guide the final step of the anchoring of the NS1 into the membrane. The two mechanisms might work together for this biointerface. From the UG to the BR strain, most titratable residues experienced negative shifts both at the functionally important regions of the protein and in others without an apparent direct functional relation. We can hypothesize that these residues either contribute to the structural stability of NS1 or indirectly help the titration mechanisms of the other ionizable groups due to their high electrostatic coupling. Residues D52, K146, K191, and H286 seem to be affected by the neighborhood (see also Figure ). It is clear that even the small different compositions of amino acids affect the electrostatics properties of NS1, therefore allowing it to interact in multiple ways depending on the strains.

Figure 10

Tridimensional structures of the NS1BR monomer. Models from the left to right show pKa variation regarding the reference (NS1UG), residues of the dimeric interface, residues embedded in the membrane, and domains of the NS1. Epitope region from NS1 of WNV (PDB id 4OII) is colored violet in the last image. Some earlier studies reported more negative electrostatic potential profiles at the loop surface on NS1UG of ZIKV compared to those on other flaviviruses.[17,28] This result is in agreement with our present outcomes. Furthermore, it seems as if the natural tendency of the future strains is to maintain more negatively charged interfaces while new geographical areas are conquered by the virus. It might be a strategy used by the ZIKV to trick the immune system. Focusing on the NS1BR, we have listed the number of ionizable residues and the distribution of these residues in the three main functional interfaces in Table , with or without showing pKa shifts, as commented above. Among the titratable residues (117 in total), roughly a third is found in biological functional interface regions. Interestingly, more than 40% of residues with pKa shifts are found in these regions, which underlines the fine role of electrostatic features in key interactions. By the abundant number of GLUs present in the structure, it seems reasonable that the ZIKV has explored this feature. Although most GLU residues fall outside of the biological functional interfaces regions, we can observe how the presence of this amino acid is directly related to an electrostatically negative charges pattern in ZIKV reported by other studies.[17,28] It is important to point out that we have limited our analysis to known functional regions. This does not preclude the impact of these changes on interactions with putative other partners that still need to be identified. It should also be noted that some ΔpKa’s were positive as seen in residues E26 (ΔpKa = 0.1), R29 (ΔpKa ≅ 0.3), E51 (ΔpKa = 0.1), and K189 (ΔpKa = 0.2), and mutations R69K (ΔpKa = 1.7) and E146K (ΔpKa = 6.9), as given in Table S4. In the antigenic region, the mutation Y286H exhibits a relatively high negative pKa shift (ΔpKa = −3.2). The highest positive shifts are observed for the mutations to LYS. The results obtained to date suggest that the difference in the virulence is related to the pKa shifts in the three biological functional interfaces. For possible therapeutic interventions, the molecular interface between NS1 and the antibody is highly interesting to be explored. Using the epitope region crystallographically known for WNV (only 185 residues from amino acids 176 to 352), we compared the ΔpKa for WNV, ZIKVUG, and ZIKVBR in Figure (note that this alignment has a pKa shift scale from −1.2 to 1.2). The missing portion of the structure in the WNV–antibody complex was removed from the ZIKV strains structures too. All of these calculations were done in the apo form. Several pKa shifts are observed specific at the antigenic region (see cases with the “a”s at the top line of Figure ). We can see that even conservative amino acids between the ZIKV and the WNV experienced pKa shifts (e.g., E178 and D180). There are cases where the shifts are larger for the ZIKVBR than for ZIKVUG and vice versa (e.g., E178). Nevertheless, the effective result of these mutations is to make the ZIKVBR more positively charged, as seen in Figure and, consequently, less recognized by the antibody (see Figure ). These regions affected by the protein sequence indicate a potential candidate for the development of vaccines.

Figure 11

Alignment from different NS1 of two different flaviviruses. Calculations were done using the incomplete dimeric structure of WNV (PDB id 4OII) that interacts with the antibody (but removing this part). Incomplete structures for NS1 of ZIKV were used as well. Conventions are the same as in Figure .

Conclusions

We have explored here the electrostatic properties of different strains of the ZIKV NS1 protein that are structurally conserved (in terms of the scaffold) but could have differential infectivities despite the small variations in the sequence level. This small variation in the sequences can have a considerable impact on the main physicochemical properties that are pH-dependent. This has been observed for global properties, such as the net protein charge numbers, and for more refined analyses based on pKa’s and electrostatic maps. The mutations can affect neighboring amino acids and others that are spatially far but still electrostatically connected due to the long range of electrostatic interactions and the complex coupling between the titratable groups. Three different groups of strains could be identified in terms of their titration behavior and isoelectric points, the Uganda and the Central Africa Republic being the two extreme opposite cases. Most of the Asian strains tend to have a similar behavior. The biomolecular interactions of the NS1 protein related to the three most important biological interfaces were investigated and demonstrated the differences between the strains that can be related to their distinct biological responses. For the self-association interaction, the UG strain is able to start its complexation at more acid pH regimes, whereas most of the studied strains need higher pH values. The differences between the strains increased from the monomeric to the dimeric states. Charge–charge interactions were confirmed as the main electrostatic-driven force for the complexation. Using the available structural data of the West Nile virus, we could demonstrate the difficulties that the same antibody would have to neutralize the NS1 from the Brazilian strain of the ZIKV. Another interesting observation is that it would have less difficulty to neutralize the strain from Uganda. In terms of the interaction with membranes, excluding the Uganda strain, most of the others have a similar behavior and can be attracted by the charged membrane until an intermediate pH value of ca. 8. All of these differences can be related to the differential clinical observed virulence. In fact, several pKa shifts were observed specifically at the three main studied biological interfaces reinforcing the idea that the NS1 electrostatic properties have a relation with the differential infectivity seen for this virus. It remains to be investigated how these observations are related to the hexamer transition, which was left for a future work. We have detected an important role for GLU residues and highlighted that several of its pKa shifts are found at the epitope. Furthermore, the antigenic region exhibits a negative pKa variation compared to the most ancestral strains. Since the NS1 protein has been proposed as a biomarker for flaviviruses and also as a potential vaccine target, the observed differences in these electrostatic properties maps indicate key issues for the development of antibody-based technologies and/or for the design of new medicines. The present study also opens up perspectives to computationally design high-specificity antibodies.

70 in total

1. Quantitative nanoscale electrostatics of viruses.

Authors: M Hernando-Pérez; A X Cartagena-Rivera; A Lošdorfer Božič; P J P Carrillo; C San Martín; M G Mateu; A Raman; R Podgornik; P J de Pablo
Journal: Nanoscale Date: 2015-11-07 Impact factor: 7.790

2. Newly synthesized dengue-2 virus nonstructural protein NS1 is a soluble protein but becomes partially hydrophobic and membrane-associated after dimerization.

Authors: G Winkler; S E Maxwell; C Ruemmler; V Stollar
Journal: Virology Date: 1989-07 Impact factor: 3.616

3. Inference of macromolecular assemblies from crystalline state.

Authors: Evgeny Krissinel; Kim Henrick
Journal: J Mol Biol Date: 2007-05-13 Impact factor: 5.469

4. Structure of the immature dengue virus at low pH primes proteolytic maturation.

Authors: I-Mei Yu; Wei Zhang; Heather A Holdaway; Long Li; Victor A Kostyuchenko; Paul R Chipman; Richard J Kuhn; Michael G Rossmann; Jue Chen
Journal: Science Date: 2008-03-28 Impact factor: 47.728

5. Fast coarse-grained model for RNA titration.

Authors: Fernando Luís Barroso da Silva; Philippe Derreumaux; Samuela Pasquali
Journal: J Chem Phys Date: 2017-01-21 Impact factor: 3.488

Review 6. Development of constant-pH simulation methods in implicit solvent and applications in biomolecular systems.

Authors: Fernando Luís Barroso daSilva; Luis Gustavo Dias
Journal: Biophys Rev Date: 2017-09-18

7. Phylogenetic analysis revealed the central roles of two African countries in the evolution and worldwide spread of Zika virus.

Authors: Shu Shen; Junming Shi; Jun Wang; Shuang Tang; Hualin Wang; Zhihong Hu; Fei Deng
Journal: Virol Sin Date: 2016-04-26 Impact factor: 4.327

8. The relation between the divergence of sequence and structure in proteins.

Authors: C Chothia; A M Lesk
Journal: EMBO J Date: 1986-04 Impact factor: 11.598

9. ConSurf 2016: an improved methodology to estimate and visualize evolutionary conservation in macromolecules.

Authors: Haim Ashkenazy; Shiran Abadi; Eric Martz; Ofer Chay; Itay Mayrose; Tal Pupko; Nir Ben-Tal
Journal: Nucleic Acids Res Date: 2016-05-10 Impact factor: 16.971

10. Virus taxonomy: the database of the International Committee on Taxonomy of Viruses (ICTV).

Authors: Elliot J Lefkowitz; Donald M Dempsey; Robert Curtis Hendrickson; Richard J Orton; Stuart G Siddell; Donald B Smith
Journal: Nucleic Acids Res Date: 2018-01-04 Impact factor: 16.971

2 in total

1. On the interactions of the receptor-binding domain of SARS-CoV-1 and SARS-CoV-2 spike proteins with monoclonal antibodies and the receptor ACE2.

Authors: Carolina Corrêa Giron; Aatto Laaksonen; Fernando L Barroso da Silva
Journal: Virus Res Date: 2020-05-15 Impact factor: 3.303

2. Up State of the SARS-COV-2 Spike Homotrimer Favors an Increased Virulence for New Variants.

Authors: Carolina Corrêa Giron; Aatto Laaksonen; Fernando Luís Barroso da Silva
Journal: Front Med Technol Date: 2021-06-30

2 in total