Esther S Brielle1, Isaiah T Arkin2. 1. The Alexander Grass Center for Bioengineering, Benin School of Computer Science and Engineering, The Hebrew University of Jerusalem, Edmond J. Safra Campus, Jerusalem 9190400, Israel. 2. The Alexander Silberman Institute of Life Sciences. Department of Biological Chemistry, The Hebrew University of Jerusalem, Edmond J. Safra Campus, Jerusalem 9190400, Israel.
Abstract
H-bonding is the predominant geometrical determinant of biomolecular structure and interactions. As such, considerable analyses have been undertaken to study its detailed energetics. The focus, however, has been mostly reserved for H-bonds comprising a single donor and a single acceptor. Herein, we measure the prevalence and energetics of multiplex H-bonds that are formed between three or more groups. We show that 92% of all transmembrane helices have at least one non-canonical H-bond formed by a serine or threonine residue whose hydroxyl side chain H-bonds to an over-coordinated carbonyl oxygen at position i-4, i-3, or i in the sequence. Isotope-edited FTIR spectroscopy, coupled with DFT calculations, enables us to determine the bond enthalpies, pointing to values that are up to 127% higher than that of a single canonical H-bond. We propose that these strong H-bonds serve to stabilize serine and threonine residues in hydrophobic environments while concomitantly providing them flexibility between different configurations, which may be necessary for function.
H-bonds
are relatively weak interactions that are driven by the
electrostatic attraction between a positively charged hydrogen and
a negatively charged acceptor. Their prevalence is the driving force
behind many natural phenomena, perhaps the most notable of which is
the flotation of ice on water. Despite their small magnitude, they
often amass a considerable impact due to their directional character
and their abundance.
Their ability to compound, like Lego bricks,
allows H-bonds to
achieve a wide range of biological purposes in macromolecules. Complementary
H-bonds between the two strands of DNA are responsible for high replication
fidelity of genetic information.[1] H-bonds
between glucose monomers in cellulose provide tremendous physical
strength. Finally, as predicted by Pauling and co-workers, specific
H-bond patterns in proteins define the secondary structure of helices[2] and pleated sheets.[3]
These secondary structures form during the folding process
due
to the scarcity of internal water molecules in the hydrophobic core,
which requires the protein to self-satisfy its H-bonding potential.[4] The lack of water molecules is even more pronounced
in the hydrophobic milieu of membrane proteins. This may lead to stronger
hydrogen bonding and greater helical uniformity in membrane proteins
compared to water-soluble proteins.[5,6] Therefore,
as one might expect, transmembrane α-helices are frequently
more stable than their counterparts in water-soluble proteins and,
at times, only unravel when the membrane integrity collapses.[7−12]
Conventional H-bonds, such as those found in α-helices,[2] where the amide H at position i interacts with the i–4 amide carbonyl, have
been characterized extensively in terms of geometry and energetics.[13] However, these single donor–single acceptor
interactions represent only one type of H-bond. More complex H-bonds
exist, which are formed with multiple acceptors (multifurcation),
multiple donors (over-coordination), or both.
The most common
multiplex H-bonds, identified ever since protein
structures were first solved,[14] involve
the over-coordination of a backbone carbonyl with two donors: the
backbone amide hydrogen and the hydroxyl side chain of serine or threonine.[15] In membrane proteins, such over-coordinated
H-bonds have been proposed to accommodate the polarity of serine and
threonine in the apolar lipid bilayer.[8,16,17]
Motivated by the abundance of multiplex H-bonds
and their importance
to membrane proteins, we have previously measured the strength of
one such bond: the over-coordination of the carbonyl of residue i–4 by the hydroxyl and amide hydrogens of serine
or threonine residues at position i.[18] Our combined experimental and computational study indicated
that this bond configuration is about 60% stronger than the single
canonical bond.
As shown in Figure 1, however, this is only one of several multiplex
H-bonds that serines
and threonines may form. In the current study, we provide a comprehensive
quantitative analysis of serine and threonine side chains H-bonding
to backbone carbonyls in over-coordinated and bifurcated H-bonds.
Our results provide a detailed energetic landscape of non-canonical
H-bonds in transmembrane helices.
Figure 1
Different H-bond configurations in solved
membrane protein structures,
as indicated by the PDB ID and chain (if relevant). Structures a,
b, and c are from the M2 H+ channel[19] and structure d is from the mitochondrial uncoupling protein
2.[20] Note that structures a–c were
determined by solid state NMR,[19] in which
the side-chain conformations were obtained by the refinement procedure.
The backbone helical H-bonds are colored in gray, while the bonds
formed by the hydroxyl side chain are depicted in green. The χ1 rotamer of the hydroxyl group and the particular H-bond acceptor(s)
are noted.
Results and Discussion
Prevalence of Polar Residues
in TM Helices
Analysis
of a non-redundant dataset of transmembrane helices[21−23] indicates that
residues containing polar side chains that are capable of H-bonding
comprise 23% of all amino-acids (16% in bitopic or single-pass proteins
and 24% in polytopic or multi-pass proteins, Table S1). Such side chains include serine, threonine, tyrosine,
cysteine, histidine, glutamine, asparagine, glutamate, aspartate, lysine,
and arginine. Of these polar residues, serines and threonines are
the most common, together representing 11% of all transmembrane helical
amino acids. As a result, 92% of all transmembrane helices
contain one or more serine or threonine residues. Finally, the prevalence
of serines and threonines in membrane proteins is similar to what
is found in water-soluble proteins. However, most other polar/charged
residues are more abundant in soluble proteins, as shown in Table S1.
Statistical Analysis of
Serine and Threonine H-Bonding
We analyzed each of the serine
and threonine residues found in transmembrane
helices of solved membrane protein structures for their participation
in multiplex H-bonding. H-bonding was determined by a distance of
less than 3.5 Å between the hydroxyl O and the carbonyl O. The
results indicate that the majority of serines and threonines form
such multiplex H-bonds (Table S2). The
vast majority of these bonds form with over-coordinated backbone carbonyl
groups located at the same residue (i), at three
residues prior (i–3), or at four residues
prior (i–4) in the sequence.
In order
to understand the factors that determine which of these H-bonds is
formed, we measured the χ1 rotamer. Residues with
χ1 = −60° (Figure 2, red points) are nearly all back-bonded to
the i–4 carbonyl group (Figure 2, pink shading). In contrast, at χ1 = +60° (Figure 2, blue points), they back-bond to the i–3
carbonyl group (Figure 2, cyan shading), or simultaneously to both the i–3
and the i–4 carbonyl groups (Figure 2, checkered shading). When the
χ1 rotamer is at ±180° (Figure 2, gray points), the side chains
H-bond to their own carbonyl group (shown in Figure S1). Similar results are obtained when analyzing threonine
residues, as shown in Figure S1.
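The distance and rotamer criteria used in this analysis can be sketched in a few lines of Python. This is a minimal illustration, not the in-house VMD scripts used for the actual analysis: the function names, the dictionary keyed by sequence offset, and the nearest-angle rotamer binning are assumptions made here for clarity. χ1 is taken as the N–Cα–Cβ–Oγ dihedral of serine, and an H-bond is assigned when the Oγ to carbonyl O distance is below the 3.5 Å cutoff.

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) defined by four points.

    For the serine chi1 angle pass the N, CA, CB and OG coordinates.
    """
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)
    y = _dot(b2, _cross(n1, n2))
    x = math.sqrt(_dot(b2, b2)) * _dot(n1, n2)
    return math.degrees(math.atan2(y, x))

def nearest_rotamer(chi1):
    """Bin chi1 to the closest of -60, +60 or 180 degrees (hypothetical binning)."""
    def angular_gap(a, b):
        d = abs(a - b) % 360.0
        return min(d, 360.0 - d)
    return min((-60.0, 60.0, 180.0), key=lambda c: angular_gap(chi1, c))

def hbond_partners(o_gamma, carbonyl_o_by_offset, cutoff=3.5):
    """Return the sequence offsets (e.g. -4, -3, 0) whose carbonyl O lies
    within `cutoff` angstroms of the hydroxyl oxygen."""
    return sorted(offset for offset, o in carbonyl_o_by_offset.items()
                  if math.dist(o_gamma, o) < cutoff)
```

A residue for which `hbond_partners` returns both -4 and -3 would fall in the checkered region of the distance plot, i.e. a hydroxyl that is close enough to both carbonyls at once.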
Figure 2
Analysis of
the distances between the serine Oγ and the oxygen
of carbonyl groups located at the i–3 and i–4 positions, as a function of side-chain rotamer
(according to the color scale). The cyan shaded region indicates residues
whose Oγ is close enough (within 3.5 Å) to H-bond to the i–3 carbonyl group. The pink shaded region indicates
residues whose Oγ is close enough (within 3.5 Å) to H-bond
to the i–4 carbonyl group. The cyan and pink
checkered region indicates residues whose Oγ is close enough
to H-bond to both the i–3 and i–4 carbonyl groups simultaneously. The residues are from a
dataset of non-redundant transmembrane helices.[21−23]
The statistical analysis shows that serine and threonine
residues
are commonly involved in a number of different multiplex H-bonds:
to the i–3 carbonyl, to the i–4 carbonyl, to both i–3 and i–4 carbonyls, or to the carbonyl of the same residue
(i). Hence, the choice of H-bonding partner depends
on the serine or threonine χ1 angles. Previously
we have measured the strength of the over-coordinated bond to the i–4 carbonyl.[18] In order
to obtain a complete understanding of all of these configurations,
we now expand our analysis and measure the strength of all of these
multiplex H-bonds in a membrane protein solvated in its natural lipid
bilayer environment.
FTIR Spectroscopy
As a system to
investigate multiplex
H-bonds, we chose the 97 amino acid, tetrameric M2 H+ channel
from influenza A. Its structure has been extensively characterized
by X-ray crystallography,[24,25] solution NMR spectroscopy,[26] and solid-state NMR spectroscopy.[19,27] Moreover, a 25 amino acid peptide that encompasses the protein’s
single transmembrane domain exhibits many of the characteristics of
the full length protein, such as tetramerization, drug binding, and
conductivity.[28,29]
The M2 channel contains
a single serine residue in its transmembrane domain at position 31.
Three of the multiplex H-bonding configurations (i–3, i–4, and simultaneous i–3 and i–4) can be observed
at this serine location when inspecting the different protein chains
and frames of PDB ID 2L0J,[19] as depicted in Figure 1. Note that the structure was determined
by solid state NMR,[19] in which the side-chain
conformations were obtained by the refinement procedure. Finally,
in all of these configurations, the backbone carbonyl retains its
canonical H-bond with the amide H four residues later, and so it does
not require the hydroxyl side chain for its own stabilization.
In order to measure the strength of the different multiplex H-bonds,
we utilized FTIR spectroscopy, focusing on the vibrational frequency
of the carbonyl group. The C=O stretch is the major component
of the amide I vibrational mode.[30] Consequently,
the amide I band is expected to shift to lower frequencies when the carbonyl accepts
an H-bond from a single donor, and even more so when it is over-coordinated
by two donors,[31] as shown schematically
in Figure 3. Hence,
FTIR spectroscopy is particularly useful, since the extent of the
shift is directly related to the strength of the H-bond in question.
Spectroscopic observation of an individual carbonyl group is achieved
by 13C=18O labeling, which shifts the
labeled carbonyl vibration far from the natural abundance amide I
mode.[32,33]
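The size of the isotope-induced shift can be estimated with a short calculation. The sketch below treats the carbonyl as an isolated harmonic diatomic oscillator, which is an assumption made here for illustration: the amide I mode is not a pure C=O stretch, so the result is only a rough guide. In this approximation the frequency scales with the inverse square root of the reduced mass. The isotope masses are rounded standard values, and the function names are hypothetical.

```python
import math

# Rounded isotope masses in atomic mass units.
MASS = {"12C": 12.000, "13C": 13.003, "16O": 15.995, "18O": 17.999}

def reduced_mass(m1, m2):
    """Reduced mass of a two-body oscillator."""
    return m1 * m2 / (m1 + m2)

def isotope_frequency_ratio(carbon="13C", oxygen="18O"):
    """Ratio nu(labeled) / nu(12C=16O) for a harmonic diatomic C=O."""
    mu_natural = reduced_mass(MASS["12C"], MASS["16O"])
    mu_labeled = reduced_mass(MASS[carbon], MASS[oxygen])
    return math.sqrt(mu_natural / mu_labeled)

def labeled_frequency(nu_natural, carbon="13C", oxygen="18O"):
    """Estimated labeled-carbonyl frequency (same units as nu_natural)."""
    return nu_natural * isotope_frequency_ratio(carbon, oxygen)

# Example: 1655 cm^-1 is an assumed, typical helical amide I position,
# not a value reported in this work.
estimate = labeled_frequency(1655.0)
```

With these masses the ratio comes out near 0.953, so the double label moves the carbonyl band several tens of wavenumbers below the natural-abundance amide I envelope, which is what allows a single residue to be observed in isolation.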
Figure 3
Impact of H-bonding (in red) on the vibrational
frequencies of
the C=O group (in blue). Left: Non-bonded configuration. Center:
Canonical H-bonded configuration composed of a single donor and a
single acceptor. Right: Over-coordinated H-bonding configuration with
two donors and a single acceptor. Consequently, due to H-bonding the
vibrational frequencies are related to one another as follows: ν1 > ν2 > ν3.
In order to analyze the H-bond between Ser31’s
hydroxyl
and the i–3 carbonyl, we labeled residue Val28
with 13C=18O. Similarly, analysis of
the H-bond to the i–4 carbonyl was achieved
by labeling Val27. As a control without side-chain over-coordination,
we used two additional peptides, once more labeled with 13C=18O at Val27 or Val28, but in these instances
Ser31 was replaced with an alanine. Site 31 in the M2 protein has
appreciable variability (including S, N, C, G, I, D, K, and R) among
currently sequenced naturally circulating viral strains. The overall
M2 structure is not altered in any detectable way upon mutation, as
can be seen by the FTIR spectra of the amide I and amide II bands
that are found at the same frequencies (Figure S2). Finally, an H-bond to the carbonyl of the same residue
(i) was not observed in the M2 channel, and hence
could not be analyzed experimentally.
The FTIR spectra of the
labeled amide I peaks of these four M2
transmembrane peptides in hydrated lipid bilayers are shown in Figure 4. Interestingly,
the isotope-edited peaks of Val27 or Val28 change dramatically depending
on which residue is located at position 31. In particular, when residue
31 is an alanine, a peak is observed at higher frequencies: 1596 or
1602 cm⁻¹ for the carbonyl stretching mode of residue
27 or 28, respectively. However, an appreciable shift to lower frequencies
is obtained when residue 31 is a serine, whose side chain is capable
of H-bonding. The carbonyl stretching mode of Val28 (i–3) shifts by 8.4 cm⁻¹, while that of Val27
(i–4) shifts by 14.7 cm⁻¹. These values align with the 7–13 cm⁻¹ downshift
reported previously for interactions between cations and an amide
carbonyl.[34]
Figure 4
FTIR spectra in the isotope-edited
amide I mode region of M2 peptides
(Ser22–Leu46, as noted) in hydrated lipid bilayers obtained
at room temperature. Amino acids shaded in green are labeled with 13C=18O at position 27 (right panel) or 28
(left panel). The arrows in the sequence depict possible over-coordinated
H-bonds by Ser31 (shaded in cyan). Spectra of peptides with an alanine
at position 31 (shaded in orange) are depicted in black. The spectra
of these peptides with serine at position 31 are depicted in red or
blue for peptides labeled at Val28 or Val27, respectively. The spectra
were normalized according to each isotope-edited amide I peak.
DFT Calculations
In order to correlate
the experimentally
measured frequency shifts to bond enthalpies, we undertook DFT calculations.
Such calculations yield the frequency of any particular vibrational
mode in the system, which can then be compared with the experimental
results from FTIR in order to validate the computation.
While
a peptide is an exceedingly large system for quantum calculations,
it is possible to capture the chemistry and geometry of the relevant
H-bonding groups using smaller compounds. For example, two consecutive
peptide carbonyls may be effectively mimicked by a 2-acetamido-N-methylacetamide molecule (Figure 5 and Figure S3). Specifically, from the atom coordinates of chains A, B, and D
of PDB ID 2L0J,[19] we built mimics for the i–3, i–4, and i–3
and i–4 multiplex H-bonding systems (panels
d–f, respectively, in Figure 5 and Figure S3). A mimic
for the i H-bond system was based on the structure
of PDB ID 2LCK[20] (panel g of Figure 5 and Figure S3). Finally, the structures were optimized after assembly, and the
resulting minimal deviations can be seen in Figure S4.
Figure 5
Model compounds used in
the DFT calculations in order to calculate the H-bond enthalpies and
vibrational frequencies. Middle row models (d–g) are serine
side-chain mimics that contain multiplex H-bonds, while the bottom
row models (h–k) do not, due to the rotation of the hydroxyl group.
Similarly, the top row models (a–c) are alanine side-chain mimics
that represent equivalent systems in which the hydroxyl group is not
present, and hence once again, multiplex H-bonds do not exist. Each
column differs in the identity of the carbonyl acceptor(s)
of the multiplex H-bond, as indicated on top (i–3, i–4, i–3 and i–4, and i). The i, i–3, and i–4 carbonyl groups,
and the multiplex H-bonds that they form are colored in purple, red,
and blue, respectively. The side-chain hydroxyl group is depicted
in green, while canonical H-bonds are depicted in black. Note that
these two-dimensional schematic diagrams of the model compounds are
intended to clearly show the H-bonds being calculated. For an accurate
three-dimensional model of these compounds with the correct bond lengths
and angles, see Figure S3.
We then proceeded to calculate the vibrational frequencies
of the
carbonyls in question (see colored carbonyls in panels d–g
of Figure 5 and Figure S3). We followed by calculating the same
frequencies for structures in which the hydroxyl group, which participated
in the multiplex H-bonding, is absent (panels a–c of Figure 5 and Figure S3). These two calculation series represent
systems with a serine capable of multiplex H-bonding, or conversely,
an alanine that is not. Consequently, the vibrational shifts due to
multiplex H-bonding could be obtained readily by comparing the two
frequencies (top versus middle rows in Figure 5 and Figure S3). The results are very encouraging: the calculated i–4 and i–3 H-bond spectral shifts
are 15.7 and 9.01 cm⁻¹, respectively, which are
close to the 14.7 and 8.4 cm⁻¹ shifts
measured experimentally by FTIR (Figure 4).
Following confirmation of the accuracy
of the DFT calculations,
we proceeded to evaluate the enthalpy of the different multiplex H-bonds.
We calculated the overall advantage in stability that serine contributes
to the structure. We did so by first calculating the energy of the
system with serine back-bonding to a backbone carbonyl (panels d–f
in Figure 5 and Figure S3). We then removed any intramolecular
influences by calculating the energy of the system again when the
two molecules are separated by 100 Å. We did the same for the
alanine systems (panels a–c in Figure 5 and Figure S3). Subsequently, we subtracted the alanine energy values from the
serine ones in order to determine the energetic favorability of a
serine in this location.
The energetic difference of a serine versus
an alanine in the i–3, i–4,
and multiplex i–3 and i–4
orientations is −4.9 kcal/mol, −5.2 kcal/mol, and −3.0
kcal/mol, respectively.
To determine the particular contribution
of the hydroxyl side-chain
interaction with the backbone carbonyl, we manipulated each of the
above systems to abolish any non-canonical H-bond. This was achieved
by rotating the χ1 dihedral such that the hydroxylic
side chain is rotated about the Cα–Cβ bond by 180°, thereby breaking the multiplex H-bond
(bottom row in Figure 5 and Figure S3). The impact of all other
energies, such as any new intramolecular interactions caused by the
rotation, can then be accounted for by separating the two molecules
in both the rotated and non-rotated systems. Hence the enthalpy
of the side-chain contribution to each of the multiplex H-bonds (ΔE) is given by the energy difference between the non-rotated and rotated systems, each taken relative to its separated counterpart.
The results listed in Table 1 indicate that the addition of another H-bond
donor strengthens the helical H-bond appreciably. In particular, the
contribution of a hydroxyl group to the H-bond system involving the i–4 carbonyl increases its stability by 5.8 kcal/mol
relative to a canonical (i.e., single donor) H-bond. Similarly, when
the hydroxyl group participates in an over-coordinated H-bond with
the i–3 or i carbonyl, it
results in an H-bonding system that is stronger than a canonical bond
by 4.1 and 4.2 kcal/mol, respectively. Finally, when the hydroxyl
is simultaneously bound to the i–3 and i–4 carbonyls, the H-bond is strengthened by 1.9
kcal/mol.
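The two energy differences described in this section can be written compactly as follows. This is a reconstruction from the verbal description only, and the symbols are assumed notation rather than the authors' own: "bound" denotes the assembled two-molecule mimic, "sep" the same two molecules separated by 100 Å, "ref" the hydroxyl-free reference mimic (panels a–c), and "rot" the serine mimic after the 180° rotation about the Cα–Cβ bond (panels h–k).

```latex
% Assumed notation; each bracket removes contributions that survive
% when the two molecules are moved 100 angstroms apart.
\begin{align}
\Delta\Delta E_{\mathrm{Ser-ref}}
  &= \left[ E^{\mathrm{Ser}}_{\mathrm{bound}} - E^{\mathrm{Ser}}_{\mathrm{sep}} \right]
   - \left[ E^{\mathrm{ref}}_{\mathrm{bound}} - E^{\mathrm{ref}}_{\mathrm{sep}} \right] \\
\Delta E
  &= \left[ E^{\mathrm{Ser}}_{\mathrm{bound}} - E^{\mathrm{Ser}}_{\mathrm{sep}} \right]
   - \left[ E^{\mathrm{Ser,rot}}_{\mathrm{bound}} - E^{\mathrm{Ser,rot}}_{\mathrm{sep}} \right]
\end{align}
```

With this sign convention a negative value means that the hydroxyl-bonded configuration is the more stable one, which matches the negative enthalpies reported for the multiplex H-bonds.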
Table 1
DFT Calculated Bond Enthalpies of the Different Multiplex H-Bonds Involving Serine Side Chains(a)
H-bond acceptor(s)      bond enthalpy (kcal/mol)    prevalence (%)
C=O at i–4              −5.8                        63
C=O at i–3              −4.1                        56
C=O at i                −4.2                        32
C=O at i–4 and i–3      −1.9                        30
(a) The last column is the calculated
statistical prevalence of the serine multiplex H-bond in our non-redundant
dataset of transmembrane helices (Figure 2). A strict cutoff distance of 3.0 Å
between the hydroxyl O and the carbonyl O acceptor was used for classification.
Note that due to multifurcation of the hydroxyl group, percentages
exceed 100%.
The amide I
shift (14.7 cm⁻¹) to hydrogen bond
length (1.95 Å) ratio for the i–4 system
is 28.4 cm⁻¹/Å, which is very similar to that
predicted previously.[35] For the i–3 system, however, we obtain an amide I shift
(8.4 cm⁻¹) to hydrogen bond length (1.93 Å)
ratio of 16.4 cm⁻¹/Å, which is nearly half
of the value obtained for the i–4 system.
Additional factors, such as environment polarity and geometry, may
be responsible for such differences.
Conclusions
We
observe that multiplex H-bonds are significantly more stable
than canonical H-bonds. Our findings are consistent with their prevalence
among serine and threonine residues: over 75% of serines and threonines
in TM α-helices form multiplex H-bonds. Moreover, the relative
strengths of the different configurations are generally consistent
with their prevalence, albeit entropy is not taken into account in
the DFT calculations. For example, the strongest bond, in which the
hydroxyl is bound to the i–4 carbonyl, is
also the most prevalent. Conversely, the weakest bond, in which the
hydroxyl is simultaneously bound to both i–3
and i–4 carbonyls, is also the least common.
In analyzing the specific side-chain structure of the M2 peptide,
we recall that two isotopomeric peptides were studied, with an identical
sequence containing a serine at position 31. The only difference between
the two peptides is the location of the 13C=18O label: one at valine 27 (i–4) and
the other at valine 28 (i–3). Since we observe
a shift in the amide I mode of both carbonyl groups (Figure 4), we can deduce that they
both serve as acceptors to an H-bond from the hydroxyl side chain
of residue 31. One may speculate that these H-bonding configurations
may not behave classically as independent forms but may demonstrate
a quantum nature where the proton can tunnel between different proton
acceptors and donors. With this view, the position of the proton would
not be on any of the donors or acceptors at any given moment, but
rather within a potential well somewhere between the acceptors and
donors. This has previously been suggested to exist between a proton
donor and proton acceptor.[36]
Polar
residues are often necessary for membrane protein function.
While most polar or charged residues exist at significantly lower
proportions in membrane proteins compared to in water-soluble proteins,
approximately equal proportions of serines and threonines exist in
both membrane and water-soluble proteins (Table S1). So, while membrane proteins demonstrate a preference for
apolar residues over most polar/charged residues, this preference
is negated for serine and threonine. The equal ratio of these hydroxylic
residues in membrane and water-soluble proteins is due to their ability
to form multiplex H-bonds, which provides stability in the hydrophobic
membrane environment. The environment-dependent nature of the serine
and threonine dihedral preferences that allow the multiplex hydrogen
bonding described herein can be applied to statistical and energy-based
force fields and scoring functions of bio-computational tools.
The different H-bond configurations may allow serine and threonine
residues to form H-bonds at any orientation necessary for protein
function. Moreover, the hydroxyl side chain may break and re-form
its H-bond to the over-coordinated backbone carbonyl with relative
ease since it does not destabilize the carbonyl, which remains H-bonded
to the backbone amide hydrogen. This flexible nature of the over-coordinated
H-bond of the backbone carbonyl with the hydroxyl side chain makes
it uniquely suited for simultaneously stabilizing serine and threonine
side chains, while still affording them the versatility needed to
function. Moreover, Bowie and co-workers have pointed to the role
that over-coordination by hydroxyl groups may have in the pliability
of transmembrane helical H-bond patterns.[37] Finally, Thiel and co-workers have recently suggested that gating
of an ion channel may be controlled by a temporal over-coordinated
H-bond.[38]
We have focused on serine
and threonine multiplex hydrogen bonding
because these residues are by far the most common polar residues in
transmembrane helices. Their behavior highlights the importance of
intramolecular hydrogen bonding in the hydrophobic membrane environment.
It is quite likely that other residues exhibit equally interesting
hydrogen bonding behavior. Similar over-coordination has recently
been shown to occur for glutamines in polyglutamine tracts.[39] Tyrosines exist with equal prevalence in both
membrane and water-soluble proteins. They do not form hydrogen bonds
with backbone carbonyls nearly as often as serine and threonine, and
they have (together with tryptophan) a preference for the aqueous–lipid
interface, where they can interact with the aqueous phase.[40] Their specific hydrogen-bonding stabilization
mechanism would merit future investigation.
Experimental
Section
Statistical Analyses
A list of 27,052 transmembrane
α-helices was obtained from PDBTM.[21−23] Structures
with an X-ray resolution greater than a potential H-bond length (3.5
Å) were pared from the list, resulting in 20,542 transmembrane
α-helical segments. Redundancies were removed using CD-HIT[41,42] at 80% identity. The representative sequences for each cluster were
made into a non-redundant dataset of 2294 transmembrane α-helices.
Finally, each of the protein structures was analyzed for the presence
of non-canonical H-bonding using in-house written VMD[43] TCL scripts.
1-13C=18O isotopic labeling was prepared as described
previously.[44] Briefly, 4.52 mmol of 3,5-dimethylpyridine
hydrobromide[45] in 2 mL of anhydrous N,N-dimethylformamide
(DMF) (Sigma-Aldrich, MO, USA) was combined with 2.24 mmol of N-(3-(dimethylamino)propyl)-N′-ethylcarbodiimide
hydrochloride (EDC·HCl) (Sigma-Aldrich, MO, USA) and 11.3 mmol
of H₂¹⁸O (Sigma-Aldrich, MO, USA) under N₂. In order to start the reaction, 225 μmol of L-valine-1-13C-N-FMOC (Cambridge Isotope Laboratories,
Inc., MA, USA), dissolved in 3 mL of anhydrous DMF, was added. The
reaction mixture was held at room temperature and stirred overnight.
After 18 h, another 2.24 mmol of EDC·HCl was added, followed
by a third addition of 2.24 mmol of EDC·HCl 8 h later. Sixteen
hours after that, the reaction was removed from mixing and N₂. Thirty milliliters of ethyl acetate (Gadot-group, Netanya, Israel)
was added, and the mixture was transferred to a separatory funnel,
where it was washed three times with 0.1 M citric acid and then once
with brine. Thirty milliliters of ethyl acetate was then added to
the combined citric acid and brine portions and separated. The 60
mL of ethyl acetate containing the labeled amino acid was dried over
anhydrous sodium sulfate (Dasit Group, Milan, Italy) and filtered,
and finally the ethyl acetate was removed by rotary evaporation, creating
an azeotrope with dichloromethane (Gadot-group, Netanya, Israel).
The labeled valine (see above), represented as V̅, was incorporated
into four different peptides corresponding to the transmembrane domain
of the influenza A M2 channel. The four peptides created include the
native sequence with valines 27 or 28 labeled as well as an S31A mutant
with valines 27 or 28 labeled (peptide numbering begins at 22):
SSDPLV̅VAASIIGILHLILWILDRL
SSDPLVV̅AASIIGILHLILWILDRL
SSDPLV̅VAAAIIGILHLILWILDRL
SSDPLVV̅AAAIIGILHLILWILDRL
The four peptides were synthesized separately
with N-(9-fluorenylmethoxycarbonyl) solid-phase
chemistry. Each peptide
sample was purified with high-performance liquid chromatography on
a 20 mL Jupiter 300 Å C4 5 μm
column (Phenomenex, CA, USA). The column was pre-equilibrated with
80:8:12 (by volume) water:acetonitrile:isopropanol, where all solvents
contained 0.1% trifluoroacetic acid (TFA) (Merck, Darmstadt, Germany).
Two milligrams of protein sample was dissolved in 2 mL of TFA and
injected into the column. A linear solvent gradient, delivered by
a VWR Hitachi Chromaster 5160 pump, reduced the water content to zero
while holding the acetonitrile:isopropanol ratio at 40:60, with
0.1% TFA throughout. Peptide elution was monitored at 280 nm using the VWR Hitachi
Chromaster 5410 UV detector.
All of our experimental measurements
were performed on peptides
in lipid vesicles. We used organic solvent cosolubilization in order
to reconstitute each peptide in a membrane bilayer. Approximately
1 mg of protein and 10 mg of 1,2-dimyristoyl-sn-glycero-3-phosphocholine
(Avanti Polar Lipids, AL, USA) were dissolved in 1 mL of 1,1,1,3,3,3-hexafluoro-2-propanol
(HFIP) (Merck, Darmstadt, Germany). The mixture was rotary evaporated
at 37 °C until all HFIP evaporated. One milliliter of water was
added, and the mixture was rotated at 37 °C to spontaneously
form vesicles. The sample was then sonicated to ensure uniformly sized
vesicles and no aggregation. The pH of all samples was below 6, and
so the M2 protein is in its open conformation.[46]
For each of the four samples of peptides in a membrane vesicle,
separate FTIR spectra were collected. First, 200 μL of sample
was deposited on a germanium trapezoid ATR plate (50 mm × 2 mm
× 20 mm) with a 45° face angle (Wilmad, NJ, USA). Following
removal of bulk solvent, the crystal was incorporated into a 25 reflection
variable angle ATR unit (Specac, Orpington, UK), which reflects the
incoming FTIR beam 25 times before its exit from the crystal. The
ATR unit was incorporated within a Nicolet iS10 FTIR spectrometer,
with a mercury cadmium telluride detector (Thermo Scientific, MA,
USA), cooled with liquid nitrogen. The FTIR spectrometer was purged
with water- and CO₂-depleted air, and spectra were collected
at room temperature. For each sample, 1000 scans were acquired and
averaged at a data spacing of 0.241 cm⁻¹ with two
levels of zero filling, N-B strong apodization, and Mertz phase correction.
The FTIR spectra
that we collected indicate that the DMPC membrane is in the gel phase
since the lipid C=O stretch is at 1738 cm⁻¹, the CO–O stretch is at 1177 cm⁻¹, and
there are distinct CH₂ wag peaks.[47]
The i–3, i–4, and multiplex i–3 and i–4 H-bonding models were created from chains A,
B, and D, respectively, of the solved structure of the influenza A
M2 protein with PDB ID 2L0J.[19] Each model contains
the serine residue, the i–3 and i–4 amide groups, as well as the i + 1 and i amide groups that form canonical H-bonds with the i–3 and i–4 amide groups.
Cα atoms connecting adjacent amide groups and at the molecule
ends were also included, and H atoms were then added with VMD Molefacture.[43] The models underwent geometric optimization
of H atoms and the i–3 and i–4 backbone carbonyls.
The i H-bond
model was created from the solved structure with PDB ID 2LCK.[20] The model includes the serine residue and the NH group
at residue i + 4, involved in a canonical H-bond
with the i carbonyl. Cα atoms cap the molecules,
and H atoms were added via VMD Molefacture.[43] The model underwent geometric optimization of H atoms and the i backbone carbonyl.
All optimization steps, as well
as frequency and energy calculations,
were conducted with the Q-Chem software package[48] using the B3LYP method[49,50] and the aug-cc-pVDZ
basis set.[51,52] The dielectric constant was set
to 4 to mimic the hydrophobic membrane environment.
The i–3 and i–4 amide carbonyls of the models were isotopically labeled as 1-¹³C=¹⁸O to imitate the peptides experimentally
analyzed by FTIR. The amide I peak shift between the structures in Figure d–f and three
other structures, where the serine is mutated to an alanine (Figure a–c), was
calculated. We tested different methods, basis sets, optimization
schemes, and even structures until arriving at close correlation between
the measured FTIR peak shifts and the calculated DFT peak shifts.
The chosen parameters and structures are as described above.
Self-consistent field (SCF) energy calculations were performed
on each system in order to derive the energy of the side chain-to-carbonyl
H-bond contributions in the different multiplex systems.
The energies of the structures in Figure d–f were calculated. The two molecules
in each of these systems were then separated by 100 Å to remove any
influence of H-bonding, and the energies were calculated again. By subtracting the energy of the far system from that of the close
system, we remove the energy of covalent bonds and atoms from consideration,
but we are still left with the energy of all of the H-bonds: the two
canonical ones in black and the colored ones (Figure ).
In order to remove the contribution
of the canonical H-bonds, the
structures in Figure h–j were created, in which we rotated the serine side-chain χ₁ dihedral by 180° to break the side chain-to-carbonyl H-bond. We calculated
the energy of these structures, both when the molecules are close
together and far apart. We again subtract the far system’s
energy from the close system’s energy, giving us the energy
of the canonical H-bonds. We deduct this canonical H-bond energy from
the energy we calculated previously for all H-bonds, leaving us with
the energy of just the colored H-bonds, namely the side
chain-to-carbonyl contribution of the different multiplex H-bond schemes.
For the i over-coordinated H-bond, instead of
separating the molecules far apart (since that would not break all
of the H-bonds, and the side chain to carbonyl H-bond would remain
intact), we converted the carbonyl to a methylene group, thereby breaking
all H-bonds.
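
The subtraction scheme described above for the i–3 and i–4 models can be summarized in a short sketch. The Python below is illustrative only: the class and function names are hypothetical, the numerical values are arbitrary placeholders rather than results from this work, and the i model, which uses a methylene substitution as its reference, is not covered.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PairEnergies:
    """SCF energies of one two-molecule model, in any single consistent unit."""
    close: float  # molecules at their H-bonded geometry
    far: float    # same molecules moved apart (100 Å in the text)

    def interaction(self) -> float:
        # Covalent and atomic terms cancel; only intermolecular H-bonding remains.
        return self.close - self.far


def side_chain_hbond_energy(bonded: PairEnergies, rotated: PairEnergies) -> float:
    """Energy of the side chain-to-carbonyl H-bond alone.

    bonded:  serine hydroxyl H-bonded to the carbonyl (all H-bonds present)
    rotated: serine chi1 rotated by 180 degrees (canonical H-bonds only)
    """
    all_hbonds = bonded.interaction()
    canonical_hbonds = rotated.interaction()
    return all_hbonds - canonical_hbonds


# Placeholder numbers chosen only to show the arithmetic; NOT values from the paper.
bonded = PairEnergies(close=-112.5, far=-100.0)
rotated = PairEnergies(close=-108.0, far=-100.0)
print(side_chain_hbond_energy(bonded, rotated))  # -4.5
```

In this sketch each model contributes two energies (close and far), so one side chain-to-carbonyl H-bond estimate requires four SCF calculations in total, matching the two close/far pairs described in the text.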