Human tRNA(Lys3)UUU is the primer for HIV replication. The HIV-1 nucleocapsid protein, NCp7, facilitates htRNA(Lys3)UUU recruitment from the host cell by binding to and remodeling the tRNA structure. Human tRNA(Lys3)UUU is post-transcriptionally modified, but until recently, the importance of those modifications in tRNA recognition by NCp7 was unknown. Modifications such as the 5-methoxycarbonylmethyl-2-thiouridine at anticodon wobble position-34 and 2-methylthio-N(6)-threonylcarbamoyladenosine, adjacent to the anticodon at position-37, are important to the recognition of htRNA(Lys3)UUU by NCp7. Several short peptides selected from phage display libraries were found to also preferentially recognize these modifications. Evolutionary algorithms (Monte Carlo and self-consistent mean field) and assisted model building with energy refinement were used to optimize the peptide sequence in silico, while fluorescence assays were developed and conducted to verify the in silico results and elucidate a 15-amino acid signature sequence (R-W-Q/N-H-X2-F-Pho-X-G/A-W-R-X2-G, where X can be most amino acids, and Pho is hydrophobic) that recognized the tRNA's fully modified anticodon stem and loop domain, hASL(Lys3)UUU. Peptides of this sequence specifically recognized and bound modified htRNA(Lys3)UUU with an affinity 10-fold higher than that of the starting sequence. Thus, this approach provides an effective means of predicting sequences of RNA binding peptides that have better binding properties. Such peptides can be used in cell and molecular biology as well as biochemistry to explore RNA binding proteins and to inhibit those protein functions.
There are
three human isoaccepting
tRNAs for the amino acid lysine, htRNALys1,2,3. The three
human tRNALys decode the two lysine codons, AAA and AAG.
Two of the isoacceptors, htRNALys1,2CUU with
the anticodon CUU, decode AAG, but only one, htRNALys3UUU with the anticodon UUU, responds to the cognate codon AAA
and wobbles to AAG. Besides its important role in protein synthesis,
htRNALys3UUU serves as the primer of reverse
transcription in the replication of the lentiviruses, including human immunodeficiency virus type 1 (HIV-1).[1,2] During the
replication of HIV-1, the host cell htRNALys3UUU is recognized and bound and its structure destabilized by nucleocapsid
protein 7 (NCp7).[3−5] This destabilization allows the relaxed U-rich anticodon
stem loop (hASLLys3UUU), as well as the acceptor
stem (Figure 1), to be annealed to the HIV
viral RNA. During the subsequent infection, htRNALys3UUU is the primer for HIV reverse transcriptase.
Figure 1
Human modified
and unmodified ASLLys3UUU.
(A) Human ASLLys3UUU with all naturally occurring
modifications (mcm5s2U34, ms2t6A37, and Ψ39). The
construct for this study was not modified at position 39. (B) The
unmodified hASLLys3UUU used in this study.
htRNALys3UUU is one of the most uniquely
processed tRNAs having chemically rich post-transcriptional modifications
that are important to conformation and function of the tRNA during
protein synthesis.[6] Until recently, the
role(s) these modifications play in the tRNA’s interaction
with NCp7 and in viral replication were not known. The naturally occurring
modifications, 5-methoxycarbonylmethyl-2-thiouridine (mcm5s2U34), at tRNA’s wobble position-34,
and 2-methylthio-N6-threonylcarbamoyladenosine
(ms2t6A37), at position-37, 3′-adjacent
to the anticodon in the loop of the hASLLys3UUU, are both chemically rich and constitute a unique combination in
human tRNAs (Figure 1).[7,8] These
modifications enhance NCp7’s ability to recognize and bind
to the RNA, suggesting that these modifications are an important discrimination
factor for recognition by NCp7.[9] The presence
of these modifications increases NCp7 affinity for hASLLys3 almost 10-fold (Kd = 0.28 ± 0.03
μM for modified and Kd = 2.30 ±
0.62 μM for unmodified ASL).[9] NCp7
is critical to HIV replication because it binds and relaxes the htRNALys3 structure, facilitating annealing of the tRNA to the viral
genomic RNA and packaging of the genomic RNA into the viral capsid.[4]

Fifteen- and 16-amino acid peptides have
been selected to mimic
NCp7’s preferential recognition of the fully modified hASLLys3UUU.[9,10] These peptides can
be used to study modification-dependent protein recognition of RNAs,
in general, and recognition and annealing of htRNALys3UUU to the HIV viral RNA, specifically. One peptide, P6 (sequence
RVTHHAFLGAHRTVG), was also shown to mimic NCp7. P6 not
only binds hASLLys3UUU but also destabilizes
the ASL structure as does NCp7.[9] The ability
of peptides to mimic NCp7 makes it possible to engineer a peptide
with a signature amino acid sequence that can be used as a tool in
future studies of protein recognition of RNAs, particularly those
with unique modification chemistries. Herein, we report the development
of a signature amino acid sequence for the recognition of htRNALys3UUU. An algorithm was developed that optimizes
the amino acid sequence by combining self-consistent mean field (SCMF)
and Monte Carlo (MC) approaches. The resulting peptides were then
validated as binders through empirical experimentation in order to
corroborate the computer-aided approach. Amino acid substitutions in silico indeed enhanced the modification-dependent binding
of the peptide ligand with high affinity and selectivity in
vitro. The peptide sequences predicted by the algorithms
preferentially bound the modified hASLLys3UUU with affinities equal to or higher than that of P6, and with greater specificity.
The signature sequence provides insight into peptide and protein recognition
of the modified tRNALys3UUU.
Experimental Procedures
Materials and Reagents
All materials, buffers, and
reagents were of RNA grade quality and RNase free. The modified and
unmodified hASLLys3UUU were chemically synthesized
by Thermo Scientific. The modified hASLLys3UUU was synthesized with the nucleoside phosphoramidites that were 2′-protected
with tert-butyldimethylsilyl-ether.[9] The unmodified hASLLys3UUU was synthesized
with “ACE” chemistry.[11] All
fluorescein labeled peptides were obtained from Sigma-Aldrich (PEPscreen)
with sequence verification by mass spectrometry.
In Silico Evolution of Peptide Sequences
A random initial
sequence that satisfies the constraints on hydration
properties is generated (Figure 2). For the
search described here, we started with the 15-amino-acid sequence
of peptide P6, RVTHHAFLGAHRTVG, found experimentally to
bind selectively to the modified hASLLys3UUU.[9] The peptide backbone conformation is
determined via atomistic simulation of the peptide-hASLLys3UUU complex and then held fixed with respect to the hASLLys3UUU conformation throughout the search. The
binding free energy for the complex is then evaluated. Subsequently,
a random number is generated to determine whether to mutate one amino
acid or to exchange two amino acids. If one amino acid is to be mutated,
one site along the peptide sequence is chosen randomly. The amino
acid at that site is then mutated to another amino acid of the same
residue type. The best rotamer for the new amino acid is chosen to
substitute for the old amino acid in this mutation step. If an exchange
step is chosen, two random sites along the chain and their corresponding
amino acids are chosen for a mutual exchange attempt. In this exchange
step, we calculate the effective potential of all the possible rotamers
and perform the self-consistent mean field (SCMF) procedure[12] described below to obtain the best rotamer combination
for the exchanged amino acids. Regardless of whether one amino acid
was mutated or two amino acids were exchanged, the newly generated peptide
sequence is evaluated further by calculating the new binding free
energy and accepted or rejected according to the Metropolis criterion.
After a total of 10,000 evolution steps, the best peptide sequences
with the lowest binding free energy are identified.
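The search loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the rotamer repacking (SCMF) step is omitted, `toy_energy` is a hypothetical placeholder for the AMBER-based binding free energy, and the move probability `p_mutate`, the temperature factor `kT`, and the choice to skip single-member categories (Gly) during mutation are all assumed values, since the text does not specify them.

```python
import math
import random

# Residue categories from Table 1A (one-letter codes).
CATEGORIES = {
    "pho": "LVIMFYW",
    "neg": "ED",
    "pos": "RK",
    "pol": "STNQH",
    "oth": "AC",
    "gly": "G",
}
CATEGORY_OF = {aa: cat for cat, members in CATEGORIES.items() for aa in members}


def mutate(seq, rng):
    """Mutate one random site to a different residue of the same category."""
    # Assumption: sites whose category has only one member (Gly) are skipped.
    sites = [i for i, aa in enumerate(seq) if len(CATEGORIES[CATEGORY_OF[aa]]) > 1]
    i = rng.choice(sites)
    options = [aa for aa in CATEGORIES[CATEGORY_OF[seq[i]]] if aa != seq[i]]
    return seq[:i] + rng.choice(options) + seq[i + 1:]


def exchange(seq, rng):
    """Swap the residues at two distinct random sites, regardless of category."""
    i, j = rng.sample(range(len(seq)), 2)
    s = list(seq)
    s[i], s[j] = s[j], s[i]
    return "".join(s)


def metropolis_accept(delta_e, kT, rng):
    """Metropolis criterion: always accept downhill, sometimes accept uphill."""
    return delta_e <= 0 or rng.random() < math.exp(-delta_e / kT)


def toy_energy(seq):
    """Hypothetical placeholder score, NOT a physical binding free energy."""
    return -sum(1.0 / (i + 1) for i, aa in enumerate(seq) if aa == "W")


def evolve(seq, energy_fn, steps=10000, p_mutate=0.5, kT=0.6, seed=0):
    """Run the mutate/exchange search and return the best sequence and energy."""
    rng = random.Random(seed)
    current, e_cur = seq, energy_fn(seq)
    best, e_best = current, e_cur
    for _ in range(steps):
        if rng.random() < p_mutate:
            trial = mutate(current, rng)
        else:
            trial = exchange(current, rng)
        e_trial = energy_fn(trial)
        if metropolis_accept(e_trial - e_cur, kT, rng):
            current, e_cur = trial, e_trial
            if e_cur < e_best:
                best, e_best = current, e_cur
    return best, e_best


if __name__ == "__main__":
    print(evolve("RVTHHAFLGAHRTVG", toy_energy, steps=1000))
```

Because mutations stay within a category and exchanges only permute residues, every sequence visited keeps the category counts of the starting sequence, which is how the hydration constraints are preserved throughout the search.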
Figure 2
Search algorithm flow
strategy. An initial peptide sequence is
chosen (in this instance peptide P6). Random numbers were generated
to determine whether to mutate one amino acid or not (“No”
or “Yes”). If yes, then one amino acid from the sequence
was randomly changed to an amino acid from the same residue category
(Table 1). If no, then two amino acids from
the sequence were randomly exchanged regardless of the residue category.
The SCMF algorithm was then used to determine the lowest-energy rotamer
combination. The MC algorithm was used to accept or reject the newly
generated peptide sequence based on the calculation of binding free
energy (ΔG1binding).
Table 1
(A) Residue Categories of 20 Canonical Amino Acids and (B) Hydration Properties of Three Optimization Cases

| (A) amino acids | (A) hydration | (A) notation | (B) case 1 | (B) case 2 | (B) case 3 |
|---|---|---|---|---|---|
| Leu, Val, Ile, Met, Phe, Tyr, Trp | hydrophobic | Npho | 4 | 5 | 3 |
| Glu, Asp | negatively charged | Nneg | 0 | 0 | 0 |
| Arg, Lys | positively charged | Npos | 2 | 2 | 1 |
| Ser, Thr, Asn, Gln, His | hydrophilic | Npol | 5 | 6 | 6 |
| Ala, Cys | other | Noth | 2 | 1 | 3 |
| Gly | neutral | Ngly | 2 | 1 | 2 |
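As a quick check on how Table 1 constrains a sequence, the category counts of any 15-mer can be tallied and compared with the three cases. The sketch below is illustrative only; the dictionary keys (`pho`, `neg`, and so on) are shorthand chosen here for the Npho to Ngly notation.

```python
from collections import Counter

# Residue categories (Table 1A), keyed by shorthand names chosen for this sketch.
CATEGORIES = {
    "pho": "LVIMFYW",
    "neg": "ED",
    "pos": "RK",
    "pol": "STNQH",
    "oth": "AC",
    "gly": "G",
}
CATEGORY_OF = {aa: cat for cat, members in CATEGORIES.items() for aa in members}

# Target counts per category for the three optimization cases (Table 1B).
CASES = {
    "case 1": {"pho": 4, "neg": 0, "pos": 2, "pol": 5, "oth": 2, "gly": 2},
    "case 2": {"pho": 5, "neg": 0, "pos": 2, "pol": 6, "oth": 1, "gly": 1},
    "case 3": {"pho": 3, "neg": 0, "pos": 1, "pol": 6, "oth": 3, "gly": 2},
}


def composition(seq):
    """Count residues of a one-letter sequence in each Table 1A category."""
    counts = Counter(CATEGORY_OF[aa] for aa in seq)
    return {cat: counts.get(cat, 0) for cat in CATEGORIES}


def matching_cases(seq):
    """Return the names of the cases whose counts equal the sequence composition."""
    comp = composition(seq)
    return [name for name, target in CASES.items() if comp == target]


if __name__ == "__main__":
    print(matching_cases("RVTHHAFLGAHRTVG"))  # P6 -> ['case 1']
```

Each case sums to 15 residues, and the P6 starting sequence has exactly the case 1 composition, consistent with the stated aim of keeping hydration properties close to those of P6.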
In our use of the SCMF, a trial
exchange between two amino acids
at randomly chosen sites is implemented (Figure 3). The conformational probability matrix is set initially for the two
amino acids so that all possible rotamers have equal conformational
probability. The initial conformational matrix P_old is then used to calculate the effective potential
of each amino acid in each rotamer state. Once the effective potentials
for all the rotamer states are known, new conformational probabilities
of the rotamers are obtained according to the Boltzmann law so as
to constitute a new conformational matrix P_new. Next, the absolute error between P_new and P_old is
calculated. If the absolute error is less than 10⁻³, the best rotamers with the highest conformational probability for
the two amino acids are selected from P_new to repack the side chains. Otherwise, the conformational matrix P_new is updated by employing a self-consistent iteration. The
updated conformational matrix P_new is stored as the old conformational
matrix P_old for the next round of
evaluation. The conformational matrix is iterated until the absolute
error between P_new and P_old is less than 10⁻³. Eventually,
the best combination of rotamers is found, thereby repacking the side chains on the fixed backbone.
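The iteration just described can be illustrated with a small two-site example. This is a sketch under assumptions rather than the published code: the self and pair energies are hypothetical inputs, the "absolute error" is taken as the summed absolute difference between the two matrices, and the self-consistent update is implemented as linear mixing with an assumed factor `mix`.

```python
import math


def boltzmann(energies, kT):
    """Convert effective potentials into normalized Boltzmann probabilities."""
    e_min = min(energies)  # shift by the minimum for numerical stability
    weights = [math.exp(-(e - e_min) / kT) for e in energies]
    total = sum(weights)
    return [w / total for w in weights]


def scmf_two_sites(self_e, pair_e, kT=0.6, mix=0.5, tol=1e-3, max_iter=1000):
    """Mean-field rotamer selection for two exchanged residues.

    self_e[s][a]  : energy of rotamer a at site s with the fixed environment
    pair_e[a][b]  : interaction of rotamer a (site 0) with rotamer b (site 1)
    """
    n0, n1 = len(self_e[0]), len(self_e[1])
    # Start with equal probability for every rotamer at each site.
    p_old = [[1.0 / n0] * n0, [1.0 / n1] * n1]
    p_new = p_old
    for _ in range(max_iter):
        # Effective potential: self term plus probability-weighted pair term.
        eff0 = [self_e[0][a] + sum(p_old[1][b] * pair_e[a][b] for b in range(n1))
                for a in range(n0)]
        eff1 = [self_e[1][b] + sum(p_old[0][a] * pair_e[a][b] for a in range(n0))
                for b in range(n1)]
        p_new = [boltzmann(eff0, kT), boltzmann(eff1, kT)]
        error = sum(abs(x - y)
                    for new, old in zip(p_new, p_old)
                    for x, y in zip(new, old))
        if error < tol:
            break
        # Assumed update rule: mix the new matrix with the old one.
        p_old = [[mix * x + (1.0 - mix) * y for x, y in zip(new, old)]
                 for new, old in zip(p_new, p_old)]
    best = tuple(max(range(len(row)), key=row.__getitem__) for row in p_new)
    return best, p_new


if __name__ == "__main__":
    self_e = [[0.0, 2.0], [1.0, 0.0]]    # hypothetical energies (kcal/mol)
    pair_e = [[3.0, -1.0], [0.0, 0.0]]
    print(scmf_two_sites(self_e, pair_e)[0])
```

The mean-field treatment avoids enumerating every rotamer pair explicitly, which matters little for two sites but is the reason the approach scales to larger repacking problems.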
Figure 3
Self-consistent
mean field (SCMF) procedure. A trial exchange between
two amino acids is implemented. The conformational probability matrix P_old is set initially so that all possible
rotamers at any one site have equal probabilities. The effective potential
experienced by each rotamer at each site is calculated, and the Boltzmann
law is used to determine new conformational probabilities of the rotamers
for each amino acid and hence a new conformational probability matrix P_new. If the absolute error between P_new and P_old is less than 10⁻³, the rotamer combination
with the highest conformational probability is selected from P_new to repack the side chains. Otherwise,
the conformational matrix P_new is updated by employing a
self-consistent iteration until the absolute error falls below a certain
tolerance.
The absolute binding energy for
a ligand and a receptor is defined
to be the difference between the free energy of the complex and the
free energies of the ligand and the receptor prior to binding.[13] It was calculated according to the following
equation:

$$\Delta G_{\mathrm{binding}} = G_{\mathrm{TOT}}^{\mathrm{complex}} - G_{\mathrm{TOT}}^{\mathrm{ligand}} - G_{\mathrm{TOT}}^{\mathrm{receptor}} \tag{1}$$

where $G_{\mathrm{TOT}}^{\mathrm{complex}}$, $G_{\mathrm{TOT}}^{\mathrm{ligand}}$, and $G_{\mathrm{TOT}}^{\mathrm{receptor}}$ represent the total free energies of the complex, the ligand,
and the receptor in solution, respectively. The total free energy $G_{\mathrm{TOT}}$ of each molecular species in solution
was calculated as follows:

$$G_{\mathrm{TOT}} = U_{\mathrm{INT}} + U_{\mathrm{VDW}} + U_{\mathrm{ELE}} + G_{\mathrm{EGB}} + G_{\mathrm{GBSUR}} \tag{2}$$

where $U_{\mathrm{INT}}$, $U_{\mathrm{VDW}}$, $U_{\mathrm{ELE}}$, $G_{\mathrm{EGB}}$, and $G_{\mathrm{GBSUR}}$ indicate
the internal energy (INT), van der Waals energy (VDW), electrostatic
energy (ELE), the polar solvation energy (EGB), and the nonpolar solvation
energy (GBSUR). The internal energy $U_{\mathrm{INT}}$ is defined as the potential energy associated with the internal
motion of the molecule itself, including the vibration of bonds and bond
angles and the torsion of dihedral angles. The van der Waals energy $U_{\mathrm{VDW}}$ between two nonbonded atoms adopts a typical
12–6 Lennard–Jones equation. The electrostatic energy $U_{\mathrm{ELE}}$ between two nonbonded atoms follows
Coulomb's law. The polar solvation energy $G_{\mathrm{EGB}}$ is calculated based on the generalized Born
model, which is an analytical way to evaluate the electrostatic contribution
to the solvation free energy. The nonpolar solvation energy $G_{\mathrm{GBSUR}}$ accounts for the interface
between solute and solvent; in solution, it is taken to be proportional to the solvent-accessible
surface area of the solute molecules. The expressions for the energies
in eq 2 are as follows:

(a) internal energy $U_{\mathrm{INT}}$

$$U_{\mathrm{INT}} = \sum_{\mathrm{bonds}} K_b (b - b_0)^2 + \sum_{\mathrm{angles}} K_\theta (\theta - \theta_0)^2 + \sum_{\mathrm{dihedrals}} \frac{V_n}{2}\left[1 + \cos(n\phi - \delta)\right]$$

where $K_b$ and $K_\theta$ are the harmonic force constants, $b_0$ and $\theta_0$ are the equilibrium bond length
and bond angle, respectively, $b$ and $\theta$ are the
actual bond length and bond angle, respectively, $V_n$ is an energy constant, $n$ is the dihedral
multiplicity, $\phi$ is the actual torsion angle,
and $\delta$ is the phase shift. All of the values of $K_b$, $b_0$, $K_\theta$, $\theta_0$, $V_n$, and $\delta$ come from the Amber force field.

(b) van der Waals energy $U_{\mathrm{VDW}}$

$$U_{\mathrm{VDW}} = \sum_{i<j} \varepsilon_{ij}\left[\left(\frac{r_{0,ij}}{r_{ij}}\right)^{12} - 2\left(\frac{r_{0,ij}}{r_{ij}}\right)^{6}\right]$$

where $\varepsilon_{ij}$ and $r_{0,ij}$ are the
well depth and the characteristic
distance for a pair of atoms $i$ and $j$, respectively, and $r_{ij}$ is the actual distance between two nonbonded atoms $i$ and $j$. Each atom has a depth of the potential well
$\varepsilon$ and a characteristic distance $r_0$ at which the potential reaches its minimum. For any pair of atoms $i$ and $j$, $\varepsilon_{ij} = (\varepsilon_i \cdot \varepsilon_j)^{1/2}$ and $r_{0,ij} = r_{0,i} + r_{0,j}$. All
of the values of $\varepsilon$ and $r_0$ come from
the Amber force field.

(c) electrostatic energy $U_{\mathrm{ELE}}$

$$U_{\mathrm{ELE}} = \sum_{i<j} \frac{q_i q_j}{r_{ij}}$$

where $q_x$ ($x = i$ or $j$) represents
the charge quantity of atom $x$, and $r_{ij}$ is the actual distance between two
nonbonded atoms $i$ and $j$. All of
the values of $q_x$ come
from the Amber force field.

(d) polar solvation energy $G_{\mathrm{EGB}}$ [14−17]

$$G_{\mathrm{EGB}} = -\frac{1}{2}\left(1 - \frac{1}{\varepsilon_r}\right)\sum_{i,j} \frac{q_i q_j}{f_{\mathrm{GB}}(r_{ij}, \alpha_i, \alpha_j)}$$

where $q_x$ ($x = i$ or $j$) represents
the charge quantity of atom $x$, $\varepsilon_r$ is the relative dielectric constant (in this work, $\varepsilon_r = 80.0$ for water), $r_{ij}$ is the actual distance between atoms $i$ and $j$ (thus, $r_{ii} = 0.00$), $\alpha_x$ ($x = i$ or $j$) is the effective Born
radius of atom $x$, and $f_{\mathrm{GB}}$ is a smoothing function, of which a common form is

$$f_{\mathrm{GB}} = \left[r_{ij}^2 + \alpha_i \alpha_j \exp\left(-\frac{r_{ij}^2}{4\alpha_i \alpha_j}\right)\right]^{1/2}$$

(e) nonpolar solvation energy $G_{\mathrm{GBSUR}}$

$$G_{\mathrm{GBSUR}} = \gamma \sum_i \sigma_i + b$$

where $\gamma$ is the surface tension (in this work, $\gamma$ = 0.0072
kcal/mol/Å² for the interface between water and biomolecules),
$\sigma_i$ is the solvent-accessible surface
area of atom $i$, and $b$ is an offset
parameter (in this work, $b$ is set to 0.00).
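To make the nonbonded and solvation terms concrete, the following sketch evaluates them for individual atom pairs and combines totals into a binding energy. It is an illustration under assumptions, not the AMBER implementation: the functional forms are the standard Amber-style Lennard–Jones and generalized Born expressions, charges are taken in units of the elementary charge with an assumed conversion constant `COULOMB_K` (about 332 kcal·Å/(mol·e²)), and all numerical inputs in the usage lines are hypothetical.

```python
import math

# Assumed conversion so that charges in e and distances in Angstrom give kcal/mol.
COULOMB_K = 332.06


def lj_energy(eps_i, eps_j, r0_i, r0_j, r):
    """12-6 Lennard-Jones pair energy with minimum -eps_ij at r = r0_i + r0_j."""
    eps_ij = math.sqrt(eps_i * eps_j)
    r0_ij = r0_i + r0_j
    x6 = (r0_ij / r) ** 6
    return eps_ij * (x6 * x6 - 2.0 * x6)


def coulomb_energy(q_i, q_j, r):
    """Coulomb pair energy between two nonbonded point charges."""
    return COULOMB_K * q_i * q_j / r


def f_gb(r, alpha_i, alpha_j):
    """Common generalized Born smoothing function."""
    return math.sqrt(r * r + alpha_i * alpha_j
                     * math.exp(-r * r / (4.0 * alpha_i * alpha_j)))


def gb_energy(charges, born_radii, coords, eps_r=80.0):
    """Polar solvation energy; the double sum includes the i == j self terms."""
    total = 0.0
    n = len(charges)
    for i in range(n):
        for j in range(n):
            r = math.dist(coords[i], coords[j])
            total += charges[i] * charges[j] / f_gb(r, born_radii[i], born_radii[j])
    return -0.5 * COULOMB_K * (1.0 - 1.0 / eps_r) * total


def nonpolar_energy(sasa, gamma=0.0072, b=0.0):
    """Nonpolar solvation energy from per-atom solvent-accessible areas (A^2)."""
    return gamma * sum(sasa) + b


def binding_energy(g_complex, g_ligand, g_receptor):
    """Binding energy as complex minus separated ligand and receptor."""
    return g_complex - g_ligand - g_receptor


if __name__ == "__main__":
    print(lj_energy(0.1, 0.1, 1.0, 1.0, 2.0))               # at the minimum
    print(gb_energy([1.0], [2.0], [(0.0, 0.0, 0.0)]))       # single ion
    print(binding_energy(-120.0, -40.0, -58.0))             # hypothetical totals
```

For a single ion the generalized Born sum reduces to the Born equation, which is a convenient sanity check on the sign and the factor of one-half.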
Fluorescein-Labeled Peptides
Fifteen-amino acid peptides
were selected from the sequences predicted in silico and were chemically synthesized, each with fluorescein (Flc) at
the N-terminus. This set of peptides included the original P1 and
P6 sequences to be used as an internal control.[9] The lyophilized peptide set was reconstituted via standard
suggestions from the manufacturer (80% DMSO/20% H2O, v/v).
Concentrated peptide stocks were stored in 25 μL aliquots at
−80 °C for later use. Working concentrations were diluted
for each experiment and kept on ice or stored at −20 °C.
Fluorescent Assays
Fluorescent assays were conducted
in phosphate buffer (10 mM Na2HPO4 and 10 mM
KH2PO4, pH 6.8) in low volume 384 well plates.
All buffers, peptides, and RNA were pipetted into wells via a liquid
handling robot (Janus, PerkinElmer). All plates were read using a
plate reader fitted with fluorescein-specific filters (PerkinElmer
EnVision) which were optimized for each plate and peptide before each
experiment. Initial validation screens were conducted by obtaining
fluorescent signals for each peptide (0.50 μM) alone (FS0) and in the presence of a 2-fold excess of modified or unmodified
hASLLys3UUU (FS1). Percent change
in the fluorescent signal was calculated (% change = 100·(FS1/FS0)). A decrease (quench) in fluorescent signal
in the presence of RNA indicated a binding event between the peptide
and ASL. In control wells, H2O, phosphate buffer, and ASLs
were individually tested for any inherent fluorescent signal. Water
and buffer signals served as blank background signals. The hASLLys3UUU had a negligible inherent signal, which
was taken into account when calculating the overall signal quench.
All controls and experimental sets were repeated in triplicate within
a single plate, and signals are an average of each triplicate. For
the subset of peptides that were further studied to obtain binding
constants, the fluorescent binding assay was completed as described
above. The peptide fluorescent signal was monitored throughout an
increasing ASL concentration (0–3 μM). The percent quench
in signal was plotted against the hASL concentration. Binding (dissociation)
constants (Kd) were calculated using the
single-site, nonlinear regression function within SigmaPlot. Because
the peptide concentration was 0.50 μM, Kd values between 0.05 and 0.13 μM are considered approximate
(≈). Experiments were performed in triplicate within a single
plate, and fluorescent signals are an average of each triplicate.
The binding of peptides P6, P27, P31, and P35* to the unmodified hASLLys3UUU resulted in little change in fluorescence
and could not be satisfactorily fitted with a single-site, nonlinear
regression function. Thus, these Kd values could not be calculated from the
data and were considered indeterminable (Table 4).
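The titration analysis can be illustrated with a small, dependency-free fit. This sketch makes several assumptions that are not taken from the text: percent quench is computed as 100·(1 − FS1/FS0), the single-site model is the simple hyperbola Q = Qmax·[ASL]/(Kd + [ASL]), and a log-spaced grid search over Kd stands in for the SigmaPlot nonlinear regression. The data in the usage lines are synthetic, noise-free values generated from the model itself, not experimental measurements.

```python
def percent_quench(fs0, fs1):
    """Percent decrease in fluorescence relative to peptide alone (assumed form)."""
    return 100.0 * (1.0 - fs1 / fs0)


def single_site(conc, kd, qmax):
    """Single-site saturation model: quench as a function of ASL concentration."""
    return qmax * conc / (kd + conc)


def fit_single_site(conc, quench, kd_min=0.01, kd_max=10.0, n_grid=2000):
    """Least-squares fit of Kd and Qmax.

    Kd is scanned on a log-spaced grid; for each trial Kd the best Qmax has a
    closed-form linear least-squares solution.
    """
    best = None
    for k in range(n_grid + 1):
        kd = kd_min * (kd_max / kd_min) ** (k / n_grid)
        frac = [c / (kd + c) for c in conc]
        denom = sum(f * f for f in frac)
        if denom == 0.0:
            continue
        qmax = sum(y * f for y, f in zip(quench, frac)) / denom
        sse = sum((y - qmax * f) ** 2 for y, f in zip(quench, frac))
        if best is None or sse < best[0]:
            best = (sse, kd, qmax)
    return best[1], best[2]


if __name__ == "__main__":
    asl = [0.1, 0.25, 0.5, 1.0, 2.0, 3.0]                  # micromolar
    synthetic = [single_site(c, 0.5, 40.0) for c in asl]   # model-generated
    print(fit_single_site(asl, synthetic))
```

When binding produces almost no change in signal, the fitted Qmax is close to zero and Kd is poorly constrained, which mirrors why the unmodified-ASL titrations are reported as indeterminable. The simple hyperbola also ignores ligand depletion, so it becomes approximate when Kd is well below the 0.50 μM peptide concentration.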
Table 4
Affinity of Peptides for hASLLys3UUU(a)

| peptide | modified or unmodified hASLLys3UUU | Kd (μM) |
|---|---|---|
| P6 | modified | 0.50 ± 0.10(b) |
| P6 | unmodified | ID |
| P27 | modified | ∼0.05 ± 0.02(c) |
| P27 | unmodified | ID |
| P31 | modified | 0.58 ± 0.24 |
| P31 | unmodified | ID |
| P35* | modified | 1.87 ± 1.00 |
| P35* | unmodified | ID |

(a) ID = indeterminable; could not be curve fitted and calculated from the data because of little change in fluorescence with binding of ASL.
(b) Kd value from previous studies.[9]
(c) Kd value is approximate for it is one-tenth the concentration of the fluorescent peptide reporter.
Results
Development of an Algorithm That Optimizes a Peptide Sequence for Binding RNA
Previous studies demonstrated the feasibility
of selecting peptides with modification-dependent recognition of tRNAs’
anticodon stem and loop domains, ASLs.[18,19] The peptides
were selected from completely and partially randomized phage display
libraries.[9,10] However, optimizing 15- and 16-amino acid
peptide sequences using this approach is not feasible since there
are more than 3 × 10¹⁹ possible sequences (20¹⁵ ≈ 3.3 × 10¹⁹ for a 15-mer alone). Because of
the exorbitantly high costs of creating and screening millions of
peptides even with the benefit of phage display, we turned to computer
algorithms[20] and assisted model building
with energy refinement (AMBER) simulations to pare down the number
of possibilities before performing in vitro assays.
We developed a novel optimization strategy[21] that combines MC with SCMF to evolve amino acid sequences. The peptide
P6 sequence RVTHHAFLGAHRTVG[9] was the starting point from which an optimized peptide was sought
to bind the modified hASLLys3UUU with the highest
specificity and affinity.

The canonical 20 amino acids were
categorized into six distinct groups according to hydrophobicity,
polarity, size, and charge (Table 1A). These hydration properties were necessary to
ensure the peptide did not become too hydrophobic (and thus insoluble)
or so hydrophilic that binding to hASLLys3 was inhibited.
The overall charge of the peptide was chosen to be slightly positive
to ensure interaction with the negatively charged ASL. By adjusting
the number of amino acids in each category, via Npho, Nneg, Npos, Npol, Noth, Ngly, we maintained hydration
properties similar to those of the original P6 sequence while evolving
the sequences.[9]

The peptide sequence was optimized using the
following computational
procedure. The stable structure for the complex between the original
P6 sequence and ASL was determined using AMBER. The structure of the
fully modified hASLLys3UUU was taken from the
high resolution solution structure,[6] providing
a restrained structure to which the peptide would bind in
silico. Once the stable structure of the peptide P6 with
the ASLLys3 was determined, the peptide’s amino
acid sequence was evolved and optimized while keeping the backbone
fixed. Each peptide sequence evolved through two types of “moves”:
(1) a single randomly chosen amino acid in the peptide sequence was
mutated to a different amino acid from the same residue category (Table 1); or (2) two randomly chosen amino acids in the
peptide sequence were exchanged regardless of their residue category
(Figure 2) using SCMF.[22] SCMF finds the optimal rotamer combination with the lowest binding
energy for the two exchanged amino acid residues (Figure 3) based on the preferred, distinct side chain conformations
in Lovell’s rotamer Library.[23] The
sequences were subjected to continued rounds of optimization (Figure 2). By comparing the changes in binding energy before
and after each of the two types of moves, the peptide sequence was
evolved to those with the lowest binding energies and thus increased
binding affinity to the modified hASLLys3UUU.
Evolved Peptides Have Increased Specificity for ASL Modifications
The initial P6 sequence was evolved over several
hundred thousand rounds to obtain 15-amino acid peptide sequences that,
based on binding energies, should recognize and bind modified hASLLys3UUU with an affinity similar to or higher than that
of P6. Initial results from the in silico selection
suggested two optimized peptide sequences, P26, R-T-L-H-H-A-L-F-G-A-H-Q-T-V-G
and P27, R-W-Q-M-T-A-F-A-H-G-W-R-H-S-G. These sequences exhibited
binding energies to the hASLLys3UUU lower than
that of P6 (P26, −22.55 kcal/mol, and P27, −22.07 kcal/mol,
vs P6, −21.26 kcal/mol). On the basis of these
initial results, we developed three distinct peptide sequence cases.
The three cases varied within the six residue categories (Table 1B). Although different, each of the three cases
is still within the overall desired levels of moderate hydration and
charge properties (slightly positively charged). P6, the initially
evolved sequences P26 and P27, and three of the top 10 sequences from
each of the first two cases 1 and 2 have lower binding energies than
those of case 3 (Table 2). This is likely due
to the increased allowance in cases 1 and 2 for positively charged
and hydrophobic residues (Table 2). The binding
energies calculated for P26 and P27 and cases 1 and 2, but not 3,
are on par with or lower than the binding energy for P6. This suggests
a potential increase in their binding affinity for modified hASLLys3UUU versus that of P6.
Table 2
Binding
Energies for Selected, Optimized
Peptide Sequences
Using the search algorithm, peptide
sequence candidates are evolved on the basis of a backbone scaffold
of structure. For each binding structure, an explicit binding energy
is generated. This search and selection process does not quantify
the error in the binding energy.
Peptides are numbered in accordance
with the previous report of phage display selected sequences.[9] P6 is from the original selection, and P26–P38
were chosen based on predictions in silico. P35*
with the sequence RWNHCQFWSGWRANG has a single amino
acid serine change from P35. Peptide sequences A and B from cases
2 and 3, respectively, were not selected for chemical synthesis and
analysis.
In Vitro Fluorescence Screens Validate in Silico Methods:
Qualitative Analysis
Sequences
predicted during the in silico optimizations to have
the lowest binding energies and thus potentially higher affinity for
modified hASLLys3 were selected for validation with a fluorescence
assay.[9] Fifteen peptides (Table 3) were chemically synthesized with fluorescein at
the N-terminus to allow for very sensitive, low volume detection of
peptide–RNA binding interactions. P1 and P6 from phage display
selections,[9] the initially evolved sequences,
P26 and P27, and the best binders from each of the cases 1, 2, and
3 were synthesized. Variants of these sequences that had one or two
amino acid changes were also synthesized (Table 3). During the initial validation assay, changes in the amount of
fluorescence were monitored to determine whether the peptide was binding
to the modified and/or unmodified hASLLys3UUU and to what relative degree (Figure 4). In
this screening assay, P6 behaved as expected. When bound by the modified
hASLLys3UUU, P6 fluorescence was altered more
than that when bound by the unmodified hASLLys3UUU. Other peptides (P27, P31, and P35*) behaved similarly. Still others
demonstrated nonspecific binding in that the degree of fluorescence
did not significantly differ between the binding of modified and unmodified
hASLLys3UUU (Figure 4: P30 and P32). Three peptides exhibited a greater degree of change
in fluorescence quenching when binding the unmodified RNA versus the
modified RNA; their binding could have been negatively affected by
modifications (P26, P28, and P37). The stacking interactions of fluorescein
with nucleobases can be nonspecific and result in fluorescence quenching
that is mistakenly attributed to a selective binding of a ligand to
a nucleic acid. We have employed a direct comparison of peptide binding
of the modified hASLLys3UUU to binding of the
control that is identical in sequence except for the modified nucleosides
in the anticodon loop. If nonspecific stacking interactions were to
occur, they were most likely limited to the stacked and identical
stem regions of the control and experimental RNAs. Therefore, the
observable differences in quenching could be attributed to the peptide
ligand’s affinity for the modifications.
Table 3
Synthesized Peptides Based on Predictions in Silicoa
Peptides named following the peptides
from original phage display library screens.[9] P1 and P6 (bold font) are from the original screen. P6 has been
characterized.[9] In addition, the binding
of the modified and unmodified hASLLys3UUU by
P1, P27, P31, and P35* (bold font and shaded row) are characterized
in this article.
Figure 4
Fluorescence of chemically
synthesized peptides affected by modified
and unmodified hASLLys3UUU. An initial fluorescent
signal (FS0) of peptide alone (1.5 μM) was obtained.
Then, a 2-fold excess of ASL was added to each peptide, and the fluorescent
signal (FS1) was monitored. The percent change (100·(FS1/FS0)) is graphed for each of the assayed peptides.
Dark gray bars represent the percent change in fluorescence in the
presence of the modified hASLLys3UUU, and light
gray bars represent the percent change in the presence of the unmodified
hASLLys3UUU. Sequences for P1–P38 are
presented in Table 3.
Three peptides of the 15 exhibited fluorescence enhancement
when
binding the modified RNA but exhibited quenching when binding the
unmodified RNA (Figure 4: P34, P36, and P38),
whereas a fourth exhibited the opposite (P1). This result is counter
to the better understood binding of P6.[9] The result is also counterintuitive. Fluorescein stacks with the
nucleobases of nucleic acids resulting in a reduced fluorescence.
The quenching of fluorescein fluorescence is one of its advantageous
properties in studying the interactions of fluorescein-coupled ligands
with nucleic acids. Without a detailed investigation of the mode of
binding by these four peptides, interpretation of these results is
difficult at this time.

Three peptides, P27 (one of the two initially
evolved), P31 from
case 1, and P35*, a variant of P35 from case 2, preferentially bound
the modified hASLLys3UUU as compared to their
binding of the unmodified hASLLys3UUU, as did
the phage selected P6. P27, P31, and P35* showed a very high selectivity
for the modified hASLLys3UUU. When these peptides
were tested for their binding to unmodified hASLLys3UUU, little or no change in fluorescence was observed (Figure 4; Table 4). However, with the addition of the modified hASLLys3UUU, the fluorescent signals of P27, P31, P35*, and P6
were quenched. Peptides P32 and P33 had the best binding properties
among the case 3 family of peptides. Even so, they had significantly
weaker affinities and a lack of specificity for the modified hASLLys3UUU as compared to those of P27, P31, and P35*
(Figure 4). Thus, peptides P27, P31, and P35*,
which were evolved in silico from P6 and predicted to have high affinities
for the modified hASLLys3UUU, also appear from
the screening assay in vitro to have higher affinities
and higher specificities for the modified RNA.

ID = indeterminable; could not be curve fitted and calculated from the data because of little change in fluorescence with binding of ASL.
Kd value from previous studies.[9]
Kd value is approximate, for it is one-tenth the concentration of the fluorescent peptide reporter.
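The screening readout of Figure 4 reduces to simple arithmetic on two fluorescence readings. The Python sketch below is illustrative only: it assumes the caption's definition of percent change, 100·(FS1/FS0), so that values below 100 indicate quenching and values above 100 indicate enhancement. The function names, the tolerance, and the readings are hypothetical and are not taken from the study.

```python
def percent_change(fs0: float, fs1: float) -> float:
    """Percent change as defined in the Figure 4 caption: 100 * (FS1 / FS0)."""
    if fs0 <= 0:
        raise ValueError("FS0 must be positive")
    return 100.0 * fs1 / fs0


def classify(fs0: float, fs1: float, tolerance: float = 5.0) -> str:
    """Label a reading as quenched, enhanced, or unchanged.

    The tolerance (percentage points around 100) is a hypothetical
    threshold chosen for illustration, not a value from the study.
    """
    pct = percent_change(fs0, fs1)
    if pct < 100.0 - tolerance:
        return "quenched"
    if pct > 100.0 + tolerance:
        return "enhanced"
    return "unchanged"


# Hypothetical readings in arbitrary fluorescence units, not measured data.
print(classify(1000.0, 700.0))   # signal falls well below FS0
print(classify(1000.0, 1010.0))  # signal stays within the tolerance band
```

A peptide that is quenched by the modified ASL but unchanged by the unmodified ASL would, under this reading, show the modification-selective behavior described for P6, P27, P31, and P35*.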
Evolved Peptides with Lower Free Energies of Binding Also Have
Higher Affinity for Modified hASLLys3UUU: Quantitative
Analysis
Five of the 15 peptides were chosen for detailed
analysis of their binding because the data collected and presented
in the initial screen (Figure 4) were indicative
of these peptides distinguishing modified from unmodified RNA. Also,
these peptides were chosen for detailed analyses because in their
binding of the modified hASLLys3UUU, they exhibited
sizable changes in fluorescence that could be accurately quantified
(Figure 4 and Table 4). The peptide fluorescence signal was monitored with an increasing
hASL concentration, and the percent quench in signal was plotted against
the hASL concentration. Binding constants (Kd) were calculated using the single binding site, nonlinear
regression function. Peptides P6, P27, P31, and P35* having exhibited
qualitatively the highest affinity and specificity for the modified
substrate (Figure 4) were subjected to a quantitative
analysis of their binding to both the modified and the unmodified
hASLLys3UUU. P1 was also subjected to this analysis
though the peptide exhibited a decreased fluorescence when binding
the modified hASLLys3UUU and an increased fluorescence
in binding the unmodified hASLLys3UUU. Equilibrium
binding constants (as the dissociation constant Kd; Table 4) were determined from
the concentration-dependent fluorescence quenching with the binding
of the modified and unmodified hASLLys3UUU.
These data were compared to those for P6 from which these peptides were
evolved in silico (Table 4). Little to no fluorescence quenching was observed when P27, P31,
P35*, and P6 were titrated with the unmodified hASLLys3UUU; thus, binding constants could not be extracted from
the data (Table 4). In contrast, the fluorescence
of P27 was quenched considerably with increasing amounts of the modified
hASLLys3UUU. The peptide bound the RNA with
high affinity (Kd ≈ 0.05 ±
0.10 μM) (Table 4). P27 exhibited the
highest affinity for the modified hASLLys3UUU. P31 bound the modified hASLLys3UUU with
10-fold lower affinity (Kd = 0.50 ±
0.10 μM), and its affinity for the hASL was equivalent to that
of P6 (Table 4). P35* bound the modified hASLLys3UUU with the lowest affinity of these peptides
(Kd = 1.87 ± 1.00 μM). The
fluorescence of P1 was altered by both the modified and unmodified
hASLLys3UUU. In binding the modified hASLLys3UUU, P1 exhibited a quench in fluorescence from
which we extracted a modest binding constant (Kd ≈ 0.13 ± 0.02 μM). However, the fluorescence
change in binding the unmodified hASLLys3UUU resulted in an equivalent affinity (Kd ≈ 0.15 ± 0.04 μM). Thus, in contrast to P27, P31,
P35*, and P6, P1 lacked specificity for the modified hASLLys3UUU.
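The single-site fit used to extract Kd from the titrations can be sketched as follows. This is a minimal illustration, not the software or procedure used in the study: it assumes the simple hyperbolic isotherm, quench = Qmax·[ASL]/(Kd + [ASL]), which treats free ASL as equal to total ASL, and it replaces a general nonlinear regression routine with a one-dimensional golden-section search over log10(Kd). All function names and the titration points are hypothetical; the data are synthetic and generated from the model itself.

```python
import math


def one_site(conc, kd, qmax):
    """Single-site binding isotherm: signal = qmax * conc / (kd + conc)."""
    return qmax * conc / (kd + conc)


def _best_qmax(concs, signals, kd):
    """For a fixed kd the model is linear in qmax, so solve it in closed form."""
    f = [c / (kd + c) for c in concs]
    return sum(s * fi for s, fi in zip(signals, f)) / sum(fi * fi for fi in f)


def _sse(concs, signals, kd):
    """Sum of squared residuals at a given kd, using the best qmax for that kd."""
    qmax = _best_qmax(concs, signals, kd)
    return sum((s - one_site(c, kd, qmax)) ** 2 for c, s in zip(concs, signals))


def fit_one_site(concs, signals, kd_lo=1e-3, kd_hi=1e2, iters=200):
    """Least-squares fit of kd and qmax by golden-section search over log10(kd)."""
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = math.log10(kd_lo), math.log10(kd_hi)
    for _ in range(iters):
        c = b - ratio * (b - a)
        d = a + ratio * (b - a)
        if _sse(concs, signals, 10 ** c) < _sse(concs, signals, 10 ** d):
            b = d
        else:
            a = c
    kd = 10 ** ((a + b) / 2.0)
    return kd, _best_qmax(concs, signals, kd)


# Synthetic titration (concentrations in micromolar), generated from the model
# with hypothetical parameters; these are not data from the study.
concs = [0.05, 0.1, 0.25, 0.5, 1.0, 2.0, 4.0]
signals = [one_site(c, 0.5, 40.0) for c in concs]
kd_fit, qmax_fit = fit_one_site(concs, signals)
print(f"Kd = {kd_fit:.3f} uM, Qmax = {qmax_fit:.1f} %")
```

When Kd is well below the reporter concentration, as reported for P27, the free-ligand approximation no longer holds and a depletion-corrected (quadratic) binding equation would be the more appropriate model; that case is consistent with the Kd for P27 being reported only as approximate.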
Computational Analysis of Peptide P27 Binding
of the Modified
hASLLys3UUU
Peptide P27 has the highest
affinity coupled with highest specificity for binding of the modified
hASLLys3UUU. The calculated structure bound
to the modified hASLLys3UUU at equilibrium reveals
how this affinity and specificity could be achieved and thus suggests
future experimentation (Figure 5A). Amino acids
throughout P27 are engaged with the extensive chemistries of the two
modifications unique to the tRNALys3UUU. The
two arginines, R1 and R12, bracket the threonylcarbamoyl-group
of ms2t6A37 (Figure 5B). At the middle of the peptide, F7 is closely
associated with the hydrophobic methyl of the threonyl-side chain.
The indole ring of W11 lies above the methyl-ester of
the 5-methoxycarbonylmethyl-moiety of mcm5s2U34. As evidenced by calculations for each of the 15 amino
acids, R1, R12, F7, and W11 contribute to ΔGBinding (Figure 5C). The binding energy contributed by each of the
nucleosides of the modified hASLLys3UUU has
also been calculated. The binding energy is concentrated in the anticodon
loop, as opposed to the stem. However, the two modifications, particularly
ms2t6A37, provide the most significant
binding energies.
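The per-residue bookkeeping behind this kind of analysis can be sketched in a few lines of Python. The sketch assumes, for illustration, that a residue's total contribution is the plain sum of the four terms named in the Figure 5 caption (VDW, ELE, EGB, and GBSUR) and that no entropic term is included; the class, the function names, and every numerical value are hypothetical and are not results from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EnergyTerms:
    """Energy components in kcal/mol, named as in the Figure 5 caption."""
    vdw: float    # van der Waals energy
    ele: float    # electrostatic energy
    egb: float    # polar solvation energy (generalized Born model)
    gbsur: float  # nonpolar solvation energy

    @property
    def binding_energy_without_gbsur(self) -> float:
        """Sum of VDW, ELE, and EGB, leaving out the nonpolar term."""
        return self.vdw + self.ele + self.egb

    @property
    def total(self) -> float:
        """Assumed total contribution: all four terms added together."""
        return self.binding_energy_without_gbsur + self.gbsur


def nonpolar_solvation(sasa: float, surface_tension: float) -> float:
    """GBSUR as the product of solvent-accessible surface area and interfacial tension."""
    return sasa * surface_tension


def rank_contributors(per_residue: dict) -> list:
    """Return labels sorted from most favorable (most negative total) to least."""
    return sorted(per_residue, key=lambda name: per_residue[name].total)


# Hypothetical per-residue values (kcal/mol), invented for illustration only.
example = {
    "R1": EnergyTerms(vdw=-4.0, ele=-20.0, egb=18.0, gbsur=-0.5),
    "F7": EnergyTerms(vdw=-5.0, ele=-1.0, egb=1.5, gbsur=-0.7),
    "G15": EnergyTerms(vdw=-0.5, ele=-0.2, egb=0.3, gbsur=-0.1),
}
print(rank_contributors(example))
```

Keeping ELE and EGB side by side reflects the way they are reported together in the figure: a large favorable electrostatic term for a charged residue is typically offset in large part by the polar solvation penalty.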
Figure 5
Peptide P27 binds the modified hASLLys3UUU with high affinity and specificity. (A) The computed equilibrium
binding structure of the modified hASLLys3UUU bound by P27. The peptide backbone is in gold, and the ribose-phosphodiester
backbone of the hASLLys3UUU is colored in green.
(B) Enlargement of the interaction demonstrating the specificity achieved
in the binding of the two modifications by the amino acids R1 (red), F7 (light green), W11 (light purple),
and R12 (dark green). The peptide backbone is in gold and
the side chains in color. The modifications ms2t6A37 (purple) and mcm5s2U34 (blue) are bound by amino acids at the beginning, middle, and end
of the peptide. The ribose-phosphodiester backbone of the hASLLys3UUU is not shown. The table characterizes the
contributions of different binding modes: ΔGBinding, Gibbs free energy of binding; BEw/o GBSUR, binding energy without GBSUR; VDW, van der Waals energy; ELE, electrostatic
energy; EGB, polar solvation energy based on the generalized Born
(implicit solvent) model; and GBSUR, nonpolar solvation energy, which
is the product of the solvent-accessible surface area of the solute
molecules and the interfacial tension between the solute and solvent.
(C) Individual contributions of each amino acid to the VDW, ELE +
EGB, and GBSUR. The amino acids are colored as in B. (D) Individual
contributions of each nucleoside to the VDW, ELE + EGB, and GBSUR.
The nucleosides engaged in the interaction with P27 are those of the
anticodon loop, particularly the modified nucleosides at U34 and A37. The modified nucleosides are colored as in B.
Discussion
An Amino Acid
Signature Sequence Important for the Recognition
of Uniquely Modified RNA
Protein recognition of RNA has been
well-defined for two families of RNA binding proteins. The Pumilio
protein was first identified as a Drosophila protein critical for segmentation of the posterior,[24] and the FBF protein was found to mediate the sperm/oocyte switch in Caenorhabditis.[25] Other RNA binding domains or RNA recognition
motifs have been studied extensively, revealing conserved sequence/structure
relationships, yet these structures differ by having subtle distinctions,
some of which are dynamic.[26] However, insight
into protein recognition of RNA’s modified nucleoside chemistries
has not been pursued and thus has yet to be characterized.

The
primary goal of this study was to demonstrate that a signature amino
acid sequence can be identified as binding a uniquely modified RNA
with high affinity and specificity. We reached this signature sequence
using a combination of computational simulations to obtain optimized
amino acid sequences that were then confirmed by binding studies in vitro. By comparing peptide sequences which specifically
bound the modified hASLLys3UUU to those which
did not, we were able to derive an amino acid signature that should
be useful for protein/peptide recognition of RNA with modifications.
Focusing primarily on those peptides which showed the highest affinity
and specificity for the modified hASLLys3UUU, the amino acid signature emerged: R-W-Q/N-H-X2-F-Pho-X-G/A-W-R-X2-G (where X can be most amino acids, and Pho is hydrophobic)
(Table 5).
Table 5
Amino Acid Signature Derived from Optimized Peptide Sequences
peptide       1  2  3    4  5  6  7  8    9  10   11  12  13  14  15
P1            F  S  V    S  F  P  S  L    P  A    P   P   D   R   S
P6            R  V  T    H  H  A  F  L    G  A    H   R   T   V   G
P27           R  W  Q    M  T  A  F  A    H  G    W   R   H   S   G
P29           R  W  N    H  Q  S  F  W    H  G    W   R   A   C   G
P31           R  W  Q    H  H  S  F  H    P  L    W   R   M   S   G
P35           R  W  N    H  C  Q  F  W    S  G    W   R   A   N   G
signature(a)  R  W  Q/N  H  X  X  F  Pho  X  G/A  W   R   X   X   G
(a) X = any amino acid; Pho = hydrophobic amino acid. Position 3 is either Q or N; position 10 is either G or A.
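The signature in Table 5 can be checked position by position against the peptide sequences listed there. The Python sketch below is a hypothetical helper, not part of the published method. The sequences are transcribed from Table 5; the set of residues counted as hydrophobic (Pho) is an assumption, since the article does not enumerate it, and X is taken to allow any of the 20 standard amino acids.

```python
# Assumption: residues counted as hydrophobic ("Pho"); the article gives no list.
HYDROPHOBIC = set("AVLIMFWY")
ANY = set("ACDEFGHIKLMNPQRSTVWY")

# Signature from Table 5: R-W-Q/N-H-X2-F-Pho-X-G/A-W-R-X2-G
SIGNATURE = [
    set("R"), set("W"), set("QN"), set("H"), ANY, ANY, set("F"), HYDROPHOBIC,
    ANY, set("GA"), set("W"), set("R"), ANY, ANY, set("G"),
]

# Sequences transcribed from Table 5.
PEPTIDES = {
    "P1": "FSVSFPSLPAPPDRS",
    "P6": "RVTHHAFLGAHRTVG",
    "P27": "RWQMTAFAHGWRHSG",
    "P29": "RWNHQSFWHGWRACG",
    "P31": "RWQHHSFHPLWRMSG",
    "P35": "RWNHCQFWSGWRANG",
}


def mismatches(sequence: str) -> list:
    """Return the 1-based positions where a sequence departs from the signature."""
    if len(sequence) != len(SIGNATURE):
        raise ValueError("sequence must have 15 residues")
    return [
        i + 1
        for i, (residue, allowed) in enumerate(zip(sequence, SIGNATURE))
        if residue not in allowed
    ]


for name, seq in PEPTIDES.items():
    print(name, mismatches(seq))
```

With the hydrophobic set assumed here, P29 and P35 satisfy every position, P27 departs only at position 4, and P31 departs at positions 8 and 10, which suggests reading the signature as a consensus across the optimized peptides and not as a strict rule that each one obeys.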
The evolution of peptide
sequences in silico is
rapid relative to screening at the bench. We have developed an algorithm
with the potential to simulate a very high number of binding events
for the 15 amino acid peptide theoretical combinations (>3.3 ×
10^23) to each substrate. In our algorithm, all 20 amino
acids are considered. However, we group them for the purpose of describing
their hydration properties. There are concessions such as grouping
the amino acids by side chain properties to more quickly move through
peptide evolution. Our developed algorithm proved to be a powerful
tool in accurately predicting peptides which would bind specifically
to hASLLys3UUU modifications. We believe that
we can improve the accuracy of in silico predictions
by developing simulations in tandem to look more closely at nonspecific
binding of the peptide to other small RNAs and/or unmodified tRNAs
or ASLs. A cross-check performed by a parallel screen assessing binding
energies of peptides binding to different ASLs could potentially eliminate
nearly all false positives before moving to in vitro and/or in vivo experiments. The validation screens in vitro revealed that, although the computer algorithms were
not 100% correct in predicting peptide sequences with both high affinity
and specificity, the selection in silico was
a useful tool for predicting binding trends and quickly screening
through many peptide sequence combinations.
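The general idea of evolving a sequence in silico can be illustrated with a Metropolis Monte Carlo loop over single-residue substitutions. This is a generic sketch and not the algorithm developed in the study: the scoring function is a deliberately trivial, hypothetical stand-in for a computed binding energy, the temperature and step count are arbitrary, and the grouping of amino acids by hydration properties described above is not modeled. The starting sequence is P6 from Table 5.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def toy_score(sequence: str) -> float:
    """Hypothetical stand-in for a computed binding energy (lower is better).

    It simply rewards arginine, tryptophan, and phenylalanine; a real search
    would call a physics-based energy evaluation here instead.
    """
    return -sum(1.0 for residue in sequence if residue in "RWF")


def evolve(start: str, score=toy_score, steps: int = 2000,
           temperature: float = 0.5, seed: int = 0):
    """Metropolis Monte Carlo over sequences using single-residue substitutions."""
    rng = random.Random(seed)
    current, current_score = start, score(start)
    best, best_score = current, current_score
    for _ in range(steps):
        pos = rng.randrange(len(current))
        candidate = current[:pos] + rng.choice(AMINO_ACIDS) + current[pos + 1:]
        candidate_score = score(candidate)
        delta = candidate_score - current_score
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_score = candidate, candidate_score
            if current_score < best_score:
                best, best_score = current, current_score
    return best, best_score


best_sequence, best_value = evolve("RVTHHAFLGAHRTVG")
print(best_sequence, best_value)
```

The cross-check proposed in the text could be added to a loop of this kind by scoring each candidate against an unmodified ASL as well and penalizing candidates whose two scores are similar; that extension is omitted here.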
The
signature amino acid sequence offers an opportunity
to investigate the mode or mechanism of binding in future research.
The sequence offers clues and surprises to be studied as to why the
optimized peptides from cases 1 and 2 bind the modified hASLLys3UUU with high affinity. Interestingly, the amino-terminal
sequence is more hydrophilic (R, Q, and H) than the center (F and
Pho) or the carboxyl terminus (G). Conventional thought
would have the two positively charged arginine residues (positions
1 and 12) preferentially engaged with the negatively charged phosphate
linkages via charge–charge interactions and/or the hydrophilic
sugars. Here, the two arginine residues are also involved in interactions
with mcm5s2U34 and ms2t6A37 through VDW energy (Figure 5B). The increased number of hydrophobic residues, specifically
tryptophan (position 11) and phenylalanine (position 7), contributes
to the overall binding specificity through VDW interactions.

The fluorescence changes observed during the binding of the peptides
to the hASLLys3UUU are smaller than
expected for fluorescein stacking or intercalating with nucleobases
of a duplex stem. Our results are not consistent with fluorescein
labeled peptides recognizing the identical stem regions of the modified
and unmodified hASLLys3UUU. However, the results
are consistent with the modest effect on fluorescence that could be
expected of a differential interaction of peptides P6, P27, P31, and
P35* with the single stranded nucleosides of the modified hASLLys3UUU loop. For instance, one could expect that
the phenyl-ring of phenylalanine would intercalate within the 3′-base
stack of the anticodon domain. The N-threonylcarbamoyl-group of ms2t6A37 is known to enhance base stacking.[6] Phenylalanine has been observed to intercalate between
anticodon nucleosides of tRNALys in the cocrystal structure
of lysyl-tRNA synthetase and tRNALys.[27] Instead of the expected intercalation, in the computed
structure F7 interacts with the threonyl-side chain contributing
to the affinity and specificity of the peptide (Figure 5B). Though the signature sequence and the selected peptide
sequences P27 and P31 have the highest affinity and specificity for
the modified hASLLys3UUU and have two arginines
each, there is little sequence homology with RNA binding proteins
that are rich in arginine[28−32] or with single-stranded RNA binding proteins.[33−37]
Potential for Modification-Dependent, RNA
Binding Peptides
The optimization of RNA binding peptides
to recognize the unique
chemistries of modified nucleosides and the contributions they make
to local structure affords the opportunity of inhibiting RNA binding
proteins studied in vitro and possibly in
vivo. The benefits of modification-dependent signature peptides
are many-fold. First, an amino acid signature peptide that uniquely
recognizes a specific RNA modification or combination of modifications
becomes a tool in the study of RNA binding proteins that interact
with RNA in a modification-dependent manner. Modifications are most
often found in the terminal and internal loops of RNA structures.[38] There the modifications negate intraloop hydrogen
bonding and can enhance or even decrease the possibility of base stacking.[39] Peptides that recognize the ubiquitous anticodon
domain modification N6-threonylcarbamoyladenosine
can be used as a tool to study other modified tRNA–protein
interactions, for instance, those between tRNAs and their modification
enzymes and/or aminoacyl-tRNA synthetases.

Second, the amino
acid signature sequence has the potential to shed light on the residues
necessary in protein recognition of modified RNA and hence aid in
future studies of modified RNA–protein interactions. Indeed,
pBLAST searches for this signature find hits in other RNA-binding
proteins such as RNA polymerase sigma factor RpoE and threonyl- and
prolyl-tRNA synthetases. The synthetases recognize tRNA substrates
having t6A37. Whether these amino acids confer
specific recognition to substrate modifications has yet to be determined.
There are many instances whereby a particular tRNA modification is
known to exist, its presence determined by mass spectrometry, 2-dimensional
thin layer chromatography, or other appropriate method (RNA MDB),
with very few clues about which enzyme(s) are responsible for the
modification.

Third, peptides directed against specific modifications
could also
help in identifying incompletely modified RNAs. Complex modifications
typically occur in an organized stepwise fashion.[40] Even the earliest analyses demonstrated that many tRNA
modification enzymes often recognize a partially modified RNA.[41,42]

Fourth, knowledge of signature peptide sequences for recognition
of specific modified nucleosides could provide clues as to how best
to narrow bioinformatic searches for modification enzyme functions.
Fifth, optimized peptides can be used as competitive inhibitors of
protein–RNA interactions that are related to human disease,
symptoms, or causative factors. The enhanced binding activity of the
peptides reported here could inhibit the recruitment of htRNALys3UUU and its annealing to the HIV genome, thus
inhibiting HIV replication. Sixth, the amino acid signature sequence
or the methodology reported here also could be used to derive a signature
in the engineering of new proteins and the design of peptide based
therapeutics.[43]