Amino acid signature enables proteins to recognize modified tRNA.

Jessica L Spears, Xingqing Xiao, Carol K Hall, Paul F Agris.

Abstract

Human tRNA(Lys3)UUU is the primer for HIV replication. The HIV-1 nucleocapsid protein, NCp7, facilitates htRNA(Lys3)UUU recruitment from the host cell by binding to and remodeling the tRNA structure. Human tRNA(Lys3)UUU is post-transcriptionally modified, but until recently, the importance of those modifications in tRNA recognition by NCp7 was unknown. Modifications such as the 5-methoxycarbonylmethyl-2-thiouridine at anticodon wobble position-34 and 2-methylthio-N(6)-threonylcarbamoyladenosine, adjacent to the anticodon at position-37, are important to the recognition of htRNA(Lys3)UUU by NCp7. Several short peptides selected from phage display libraries were found to also preferentially recognize these modifications. Evolutionary algorithms (Monte Carlo and self-consistent mean field) and assisted model building with energy refinement were used to optimize the peptide sequence in silico, while fluorescence assays were developed and conducted to verify the in silico results and elucidate a 15-amino acid signature sequence (R-W-Q/N-H-X2-F-Pho-X-G/A-W-R-X2-G, where X can be most amino acids, and Pho is hydrophobic) that recognized the tRNA's fully modified anticodon stem and loop domain, hASL(Lys3)UUU. Peptides of this sequence specifically recognized and bound modified htRNA(Lys3)UUU with an affinity 10-fold higher than that of the starting sequence. Thus, this approach provides an effective means of predicting sequences of RNA binding peptides that have better binding properties. Such peptides can be used in cell and molecular biology as well as biochemistry to explore RNA binding proteins and to inhibit those protein functions.

Year: 2014 PMID： 24483944 PMCID： PMC3985708 DOI： 10.1021/bi401174h

Source DB: PubMed Journal: Biochemistry ISSN： 0006-2960 Impact factor: 3.162

There are three human isoaccepting tRNAs for the amino acid lysine, htRNALys1,2,3. The three human tRNALys decode the two lysine codons, AAA and AAG. Two of the isoacceptors, htRNALys1,2CUU with the anticodon CUU, decode AAG, but only one, htRNALys3UUU with the anticodon UUU, responds to the cognate codon AAA and wobbles to AAG. Besides its important role in protein synthesis, htRNALys3UUU serves as the primer of reverse transcription in the replication of the lentiviruses, including human immunodeficiency virus type 1 (HIV-1).[1,2] During the replication of HIV-1, the host cell htRNALys3UUU is recognized and bound and its structure destabilized by nucleocapsid protein 7 (NCp7).[3−5] This destabilization allows the relaxed U-rich anticodon stem loop (hASLLys3UUU), as well as the acceptor stem (Figure 1), to be annealed to the HIV viral RNA. During the subsequent infection, htRNALys3UUU is the primer for HIV reverse transcriptase.

Figure 1

Human modified and unmodified ASLLys3UUU. (A) Human ASLLys3UUU with all naturally occurring modifications (mcm5s2U34, ms2t6A37, and Ψ39). The construct for this study was not modified at position 39. (B) The unmodified hASLLys3UUU used in this study.

htRNALys3UUU is one of the most distinctively processed tRNAs, having chemically rich post-transcriptional modifications that are important to the conformation and function of the tRNA during protein synthesis.[6] Until recently, the role(s) these modifications play in the tRNA’s interaction with NCp7 and in viral replication were not known. The naturally occurring modifications, 5-methoxycarbonylmethyl-2-thiouridine (mcm5s2U34), at the tRNA’s wobble position-34, and 2-methylthio-N6-threonylcarbamoyladenosine (ms2t6A37), at position-37, 3′-adjacent to the anticodon in the loop of the hASLLys3UUU, are both chemically rich and constitute a unique combination in human tRNAs (Figure 1).[7,8] These modifications enhance NCp7’s ability to recognize and bind to the RNA, suggesting that they are an important discrimination factor for recognition by NCp7.[9] The presence of these modifications increases NCp7 affinity for hASLLys3 almost 10-fold (Kd = 0.28 ± 0.03 μM for modified and Kd = 2.30 ± 0.62 μM for unmodified ASL).[9] NCp7 is critical to HIV replication because it binds and relaxes the htRNALys3 structure, facilitating annealing of the tRNA to the viral genomic RNA and packaging of the genomic RNA into the viral capsid.[4]

Fifteen- and 16-amino acid peptides have been selected to mimic NCp7’s preferential recognition of the fully modified hASLLys3UUU.[9,10] These peptides can be used to study modification-dependent protein recognition of RNAs, in general, and recognition and annealing of htRNALys3UUU to the HIV viral RNA, specifically. One peptide, P6 (sequence RVTHHAFLGAHRTVG), was also shown to mimic NCp7.
P6 not only binds hASLLys3UUU but also destabilizes the ASL structure, as does NCp7.[9] The ability of peptides to mimic NCp7 makes it possible to engineer a peptide with a signature amino acid sequence that can be used as a tool in future studies of protein recognition of RNAs, particularly those with unique modification chemistries.

Herein, we report the development of a signature amino acid sequence for the recognition of htRNALys3UUU. An algorithm was developed that optimizes the amino acid sequence by combining self-consistent mean field (SCMF) and Monte Carlo (MC) approaches. The resulting peptides were then tested experimentally as binders in order to corroborate the computer-aided approach. Amino acid substitutions made in silico did enhance the modification-dependent binding of the peptide ligand, giving high affinity and selectivity in vitro. The peptide sequences predicted by the algorithms preferentially bound the modified hASLLys3UUU with affinities equal to or higher than that of P6, and with greater specificity. The signature sequence provides insight into peptide and protein recognition of the modified tRNALys3UUU.

Experimental Procedures

Materials and Reagents

All materials, buffers, and reagents were of RNA grade quality and RNase free. The modified and unmodified hASLLys3UUU were chemically synthesized by Thermo Scientific. The modified hASLLys3UUU was synthesized with the nucleoside phosphoramidites that were 2′-protected with tert-butyldimethylsilyl-ether.[9] The unmodified hASLLys3UUU was synthesized with “ACE” chemistry.[11] All fluorescein labeled peptides were obtained from Sigma-Aldrich (PEPscreen) with sequence verification by mass spectrometry.

In Silico Evolution of Peptide Sequences

A random initial sequence that satisfies the constraints on hydration properties is generated (Figure 2). For the search described here, we started with the 15-amino-acid sequence of peptide P6, RVTHHAFLGAHRTVG, found experimentally to bind selectively to the modified hASLLys3UUU.[9] The peptide backbone conformation is determined via atomistic simulation of the peptide-hASLLys3UUU complex and then held fixed with respect to the hASLLys3UUU conformation throughout the search. The binding free energy for the complex is then evaluated. Subsequently, a random number is generated to determine whether to mutate one amino acid or to exchange two amino acids. If one amino acid is to be mutated, one site along the peptide sequence is chosen randomly. The amino acid at that site is then mutated to another amino acid of the same residue category (Table 1). The best rotamer for the new amino acid is chosen to substitute for the old amino acid in this mutation step. If an exchange step is chosen, two random sites along the chain and their corresponding amino acids are chosen for a mutual exchange attempt. In this exchange step, we calculate the effective potential of all the possible rotamers and perform the self-consistent mean field (SCMF) procedure[12] described below to obtain the best rotamer combination for the exchanged amino acids. Regardless of whether one amino acid was mutated or two amino acids were exchanged, the newly generated peptide sequence is evaluated by calculating its binding free energy and is accepted or rejected according to the Metropolis criterion. After a total of 10,000 evolution steps, the peptide sequences with the lowest binding free energies are identified.
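The search loop described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the residue categories are copied from Table 1A, but the 50/50 choice between mutation and exchange (`p_mutate`), the temperature factor `kT`, and the `toy_score` function are hypothetical stand-ins. In the actual method the score is the binding free energy of the peptide-hASLLys3UUU complex, and rotamer selection (SCMF) happens inside each move.

```python
import math
import random

# Residue categories from Table 1A, written with one-letter codes.
CATEGORIES = {
    "pho": "LVIMFYW",
    "neg": "ED",
    "pos": "RK",
    "pol": "STNQH",
    "oth": "AC",
    "gly": "G",
}
CATEGORY_OF = {aa: name for name, members in CATEGORIES.items() for aa in members}


def composition(seq):
    """Count residues per category (Npho, Nneg, Npos, Npol, Noth, Ngly)."""
    return {name: sum(1 for aa in seq if aa in members)
            for name, members in CATEGORIES.items()}


def propose(seq, rng, p_mutate=0.5):
    """Trial move: same-category point mutation, or exchange of two sites.

    p_mutate is an assumed value; the source does not state the probability.
    """
    s = list(seq)
    if rng.random() < p_mutate:
        i = rng.randrange(len(s))
        choices = [aa for aa in CATEGORIES[CATEGORY_OF[s[i]]] if aa != s[i]]
        if choices:  # Gly has no same-category alternative, so it stays
            s[i] = rng.choice(choices)
    else:
        i, j = rng.sample(range(len(s)), 2)
        s[i], s[j] = s[j], s[i]
    return "".join(s)


def metropolis_accept(delta_e, kT, rng):
    """Accept downhill moves always, uphill moves with Boltzmann probability."""
    return delta_e <= 0 or rng.random() < math.exp(-delta_e / kT)


def evolve(seq, score, steps=10_000, kT=0.6, seed=0):
    """Run the Monte Carlo search and return the lowest-scoring sequence seen."""
    rng = random.Random(seed)
    current, e_current = seq, score(seq)
    best, e_best = current, e_current
    for _ in range(steps):
        trial = propose(current, rng)
        e_trial = score(trial)
        if metropolis_accept(e_trial - e_current, kT, rng):
            current, e_current = trial, e_trial
            if e_current < e_best:
                best, e_best = current, e_current
    return best, e_best


def toy_score(seq):
    """Hypothetical placeholder, NOT a binding free energy: rewards Trp at sites 2 and 11."""
    return -sum(1.0 for i in (1, 10) if seq[i] == "W")
```

Because both move types conserve the number of residues in each category, every sequence visited keeps the composition of the starting peptide. For P6 (RVTHHAFLGAHRTVG), `composition` gives 4 hydrophobic, 0 negative, 2 positive, 5 hydrophilic, 2 other, and 2 glycine residues, which corresponds to case 1 in Table 1B.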

Figure 2

Search algorithm flow strategy. An initial peptide sequence is chosen (in this instance peptide P6). Random numbers were generated to determine whether to mutate one amino acid or not (“No” or “Yes”). If yes, then one amino acid from the sequence was randomly changed to an amino acid from the same residue category (Table 1). If no, then two amino acids from the sequence were randomly exchanged regardless of the residue category. The SCMF algorithm was then used to determine the lowest-energy rotamer combination. The MC algorithm was used to accept or reject the newly generated peptide sequence based on the calculation of binding free energy (ΔG1binding).

Table 1

(A) residue categories of 20 canonical amino acids		(B) hydration properties of three optimization cases
amino acids	hydration	notation	case 1	case 2	case 3
Leu, Val, Ile, Met, Phe, Tyr, Trp	hydrophobic	Npho	4	5	3
Glu, Asp	negatively	Nneg	0	0	0
Arg, Lys	positively	Npos	2	2	1
Ser, Thr, Asn, Gln, His	hydrophilic	Npol	5	6	6
Ala, Cys	other	Noth	2	1	3
Gly	neutral	Ngly	2	1	2

In our use of the SCMF, a trial exchange between two amino acids at randomly chosen sites is implemented (Figure 3). The conformational probability matrix is set initially for the two amino acids so that all possible rotamers have equal conformational probability. This initial conformational matrix, P_old, is then used to calculate the effective potential of each amino acid in each rotamer state. Once the effective potentials for all the rotamer states are known, new conformational probabilities of the rotamers are obtained according to the Boltzmann law so as to constitute a new conformational matrix, P_new. Next, the absolute error between P_new and P_old is calculated. If the absolute error is less than 10⁻³, the rotamers with the highest conformational probability for the two amino acids are selected from P_new to repack the side chains. Otherwise, the conformational matrix is updated by employing a self-consistent iteration, and the updated matrix is stored as P_old for the next round of evaluation. The conformational matrix is iterated until the absolute error between P_new and P_old is less than 10⁻³. Eventually, the best combination of rotamers is found, thereby repacking the side chains.
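A minimal sketch of the self-consistent iteration for two exchanged sites is given below. Several details are assumptions made for illustration only: the energies are passed in as plain lists (`self_energy`, `pair_energy`), the "absolute error" is taken as the sum of absolute differences between the old and new probabilities, and the update uses simple linear mixing (`mix`). The source does not specify these choices.

```python
import math


def scmf(self_energy, pair_energy, kT=0.6, tol=1e-3, mix=0.5, max_iter=1000):
    """Mean-field rotamer probabilities for two sites (illustrative sketch).

    self_energy[s][r]   : energy of rotamer r at site s with its fixed surroundings
    pair_energy[r0][r1] : interaction of rotamer r0 at site 0 with rotamer r1 at site 1
    Returns the most probable rotamer index per site and the probability matrix.
    """
    n = [len(self_energy[0]), len(self_energy[1])]
    # Start with equal probability for every rotamer at each site.
    p_old = [[1.0 / n[s]] * n[s] for s in range(2)]
    for _ in range(max_iter):
        p_new = []
        for s in range(2):
            other = 1 - s
            e_eff = []
            for r in range(n[s]):
                # Effective potential: self term plus probability-weighted pair term.
                mean = sum(
                    p_old[other][q]
                    * (pair_energy[r][q] if s == 0 else pair_energy[q][r])
                    for q in range(n[other])
                )
                e_eff.append(self_energy[s][r] + mean)
            e_min = min(e_eff)  # shift energies for numerical stability
            weights = [math.exp(-(e - e_min) / kT) for e in e_eff]
            z = sum(weights)
            p_new.append([w / z for w in weights])  # Boltzmann law
        err = sum(abs(a - b) for s in range(2) for a, b in zip(p_new[s], p_old[s]))
        if err < tol:
            p_old = p_new
            break
        # Self-consistent update: mix new and old probabilities (assumed scheme).
        p_old = [[mix * a + (1.0 - mix) * b for a, b in zip(p_new[s], p_old[s])]
                 for s in range(2)]
    best = [max(range(n[s]), key=lambda r: p_old[s][r]) for s in range(2)]
    return best, p_old
```

With no coupling between the sites the procedure simply picks the lowest self-energy rotamer at each site; a strongly favorable pair term shifts the choice toward the coupled rotamer pair.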

Figure 3

Self-consistent mean field (SCMF) procedure. A trial exchange between two amino acids is implemented. The conformational probability matrix P_old is set initially so that all possible rotamers at any one site have equal probabilities. The effective potential experienced by each rotamer at each site is calculated, and the Boltzmann law is used to determine new conformational probabilities of the rotamers for each amino acid and hence a new conformational probability matrix P_new. If the absolute error between P_new and P_old is less than 10⁻³, the rotamer combination with the highest conformational probability is selected from P_new to repack the side chains. Otherwise, the conformational matrix is updated by employing a self-consistent iteration until the absolute error falls below a certain tolerance.

The absolute binding energy for a ligand and a receptor is defined to be the difference between the free energy of the complex and the free energies of the ligand and the receptor prior to binding.[13] It was calculated according to the following equation:

$$\Delta G_{binding} = G_{TOT}^{complex} - G_{TOT}^{ligand} - G_{TOT}^{receptor} \qquad (1)$$

where GTOTcomplex, GTOTligand, and GTOTreceptor represent the total free energies of the complex, the ligand, and the receptor in the solution, respectively. The total free energy GTOT of the molecular complex in the solution was calculated as follows:

$$G_{TOT} = U_{INT} + U_{VDW} + U_{ELE} + G_{EGB} + G_{GBSUR} \qquad (2)$$

where UINT, UVDW, UELE, GEGB, and GGBSUR indicate the internal energy (INT), van der Waals energy (VDW), electrostatic energy (ELE), the polar solvation energy (EGB), and the nonpolar solvation energy (GBSUR). The internal energy UINT is defined as the potential energy associated with the random, disordered motion of the molecule itself, including the vibration of bonds and bond angles and the torsion of dihedral angles. The van der Waals energy UVDW between two nonbonded atoms adopts a typical 12–6 Lennard–Jones equation. The electrostatic energy UELE between two nonbonded atoms follows Coulomb’s law. The polar solvation energy GEGB is calculated based on the generalized Born model, which is an analytical way to evaluate the electrostatic contribution to the solvation free energy. The nonpolar solvation energy GGBSUR accounts for the interface between solute and solvent and is taken to be proportional to the solvent-accessible surface area of the solute molecules. The expressions for the energies in eq 2 are as follows:

(a) internal energy UINT

$$U_{INT} = \sum_{bonds} K_b (b - b_0)^2 + \sum_{angles} K_\theta (\theta - \theta_0)^2 + \sum_{dihedrals} \frac{V_n}{2} \left[ 1 + \cos(n\phi - \delta) \right]$$

where K_b and K_θ are the harmonic force constants, b_0 and θ_0 are the equilibrium bond length and bond angle, respectively, b and θ are the actual bond length and bond angle, respectively, V_n is an energy constant, n is the dihedral multiplicity, φ is the actual torsion angle, and δ is the phase shift. All of the values of K_b, b_0, K_θ, θ_0, V_n, and δ come from the Amber force field.

(b) van der Waals energy UVDW

$$U_{VDW} = \sum_{i<j} \varepsilon_{ij} \left[ \left( \frac{r^0_{ij}}{r_{ij}} \right)^{12} - 2 \left( \frac{r^0_{ij}}{r_{ij}} \right)^{6} \right]$$

where ε_ij and r0_ij are the well depth and the characteristic distance for a pair of atoms i and j, respectively, and r_ij is the actual distance between two nonbonded atoms i and j. Each atom i has a potential well depth ε_i and a characteristic distance r0_i at which the potential reaches its minimum. For any pair of atoms i and j, ε_ij = (ε_i·ε_j)^(1/2) and r0_ij = r0_i + r0_j. All of the values of ε and r0 come from the Amber force field.

(c) electrostatic energy UELE

$$U_{ELE} = \sum_{i<j} \frac{q_i q_j}{r_{ij}}$$

where q_x (x = i or j) represents the charge of atom x, with the Coulomb constant absorbed into the charge units, and r_ij is the actual distance between two nonbonded atoms i and j. All of the values of q come from the Amber force field.

(d) polar solvation energy GEGB[14−17]

$$G_{EGB} = -\frac{1}{2} \left( 1 - \frac{1}{\varepsilon_r} \right) \sum_{i,j} \frac{q_i q_j}{f_{GB}(r_{ij}, \alpha_i, \alpha_j)}$$

where q_x (x = i or j) represents the charge of atom x, ε_r is the relative dielectric constant (in this work, ε_r = 80.0 for water), r_ij is the actual distance between atoms i and j (thus, r_ii = 0.00), α_x (x = i or j) is the effective Born radius of atom x, and f_GB is a smoothing function, of which a common form is

$$f_{GB} = \left[ r_{ij}^2 + \alpha_i \alpha_j \exp\left( -\frac{r_{ij}^2}{4 \alpha_i \alpha_j} \right) \right]^{1/2}$$

(e) nonpolar solvation energy GGBSUR

$$G_{GBSUR} = \gamma \sum_i \sigma_i + b$$

where γ is the surface tension (in this work, γ = 0.0072 kcal/mol/Å² for the interface between water and biomolecules), σ_i is the solvent-accessible surface area of atom i, and b is an offset parameter (in this work, b is set to 0.00).
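The nonbonded and solvation terms defined above can be evaluated term by term, as in the short sketch below. It is not a replacement for the Amber implementation: the function names are hypothetical, charges are assumed to be in units that absorb the Coulomb constant, and the smoothing function is the common form quoted in the text. Only ε_r = 80.0, γ = 0.0072 kcal/mol/Å², and b = 0.00 are taken from the source.

```python
import math


def lj_12_6(r, eps_i, eps_j, r0_i, r0_j):
    """12-6 Lennard-Jones pair energy with the combination rules given in the text."""
    eps_ij = math.sqrt(eps_i * eps_j)   # geometric mean of the well depths
    r0_ij = r0_i + r0_j                 # sum of the characteristic distances
    x = (r0_ij / r) ** 6
    return eps_ij * (x * x - 2.0 * x)   # minimum of -eps_ij at r = r0_ij


def coulomb(r, q_i, q_j):
    """Coulomb pair energy, Coulomb constant absorbed into the charges (assumption)."""
    return q_i * q_j / r


def f_gb(r, alpha_i, alpha_j):
    """Common generalized Born smoothing function; equals the Born radius when r = 0 and i = j."""
    aa = alpha_i * alpha_j
    return math.sqrt(r * r + aa * math.exp(-r * r / (4.0 * aa)))


def gb_term(r, q_i, q_j, alpha_i, alpha_j, eps_r=80.0):
    """One (i, j) term of the generalized Born double sum, including the -1/2 prefactor."""
    return -0.5 * (1.0 - 1.0 / eps_r) * q_i * q_j / f_gb(r, alpha_i, alpha_j)


def nonpolar(sasa_per_atom, gamma=0.0072, b=0.0):
    """Nonpolar solvation energy from per-atom solvent-accessible surface areas (in Å^2)."""
    return gamma * sum(sasa_per_atom) + b


def binding_free_energy(g_complex, g_ligand, g_receptor):
    """Eq 1: total free energy of the complex minus those of ligand and receptor."""
    return g_complex - g_ligand - g_receptor
```

Writing the Lennard-Jones term with the factor of 2 on the attractive part places the minimum exactly at the characteristic distance, which matches the definition of r0 in the text.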

Fluorescein-Labeled Peptides

Fifteen-amino acid peptides were selected from the sequences predicted in silico and were chemically synthesized, each with fluorescein (Flc) at the N-terminus. This set of peptides included the original P1 and P6 sequences to be used as internal controls.[9] The lyophilized peptide set was reconstituted according to the manufacturer's recommendations (80% DMSO/20% H2O, v/v). Concentrated peptide stocks were stored in 25 μL aliquots at −80 °C for later use. Working concentrations were diluted for each experiment and kept on ice or stored at −20 °C.

Fluorescent Assays

Fluorescent assays were conducted in phosphate buffer (10 mM Na2HPO4 and 10 mM KH2PO4, pH 6.8) in low volume 384 well plates. All buffers, peptides, and RNA were pipetted into wells via a liquid handling robot (Janus, PerkinElmer). All plates were read using a plate reader fitted with fluorescein-specific filters (PerkinElmer EnVision), which were optimized for each plate and peptide before each experiment. Initial validation screens were conducted by obtaining fluorescent signals for each peptide (0.50 μM) alone (FS0) and in the presence of a 2-fold excess of modified or unmodified hASLLys3UUU (FS1). Percent change in the fluorescent signal was calculated (% change = 100·(FS1/FS0)). A decrease (quench) in fluorescent signal in the presence of RNA indicated a binding event between the peptide and ASL. In control wells, H2O, phosphate buffer, and ASLs were individually tested for any inherent fluorescent signal. Water and buffer signals served as blank background signals. The hASLLys3UUU had a negligible inherent signal, which was taken into account when calculating the overall signal quench. All controls and experimental sets were repeated in triplicate within a single plate, and signals are an average of each triplicate.

For the subset of peptides that were studied further to obtain binding constants, the fluorescent binding assay was completed as described above. The peptide fluorescent signal was monitored over an increasing ASL concentration (0–3 μM). The percent quench in signal was plotted against the hASL concentration. Binding (dissociation) constants (Kd) were calculated using the single-site, nonlinear regression function within SigmaPlot. Because the peptide concentration was 0.50 μM, Kd values between 0.05 and 0.13 μM are considered approximate (≈). Experiments were performed in triplicate within a single plate, and fluorescent signals are an average of each triplicate.
The binding of peptides P6, P27, P31, and P35* to the unmodified hASLLys3UUU resulted in little change in fluorescence and could not be satisfactorily fitted with a single-site, nonlinear regression function. Thus, these Kd values were considered indeterminable (Table 4).
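The quantities used in these assays can be illustrated with a small, dependency-free sketch. It assumes the common single-site hyperbola, quench = Bmax·[ASL]/(Kd + [ASL]), as the model behind a single-site nonlinear regression; the exact SigmaPlot settings are not given in the text, and the grid-search fit below is a simplified stand-in for that regression. The function names and the `kd_grid` argument are hypothetical.

```python
def percent_of_initial(fs1, fs0):
    """The quantity the text reports as percent change: 100 * (FS1 / FS0)."""
    return 100.0 * fs1 / fs0


def one_site(conc, bmax, kd):
    """Single-site binding hyperbola (assumed model): signal change vs ASL concentration."""
    return bmax * conc / (kd + conc)


def fit_one_site(conc, quench, kd_grid):
    """Least-squares fit of Bmax and Kd by scanning Kd over a grid.

    For each trial Kd the best Bmax has a closed form, so only Kd is scanned.
    Returns (bmax, kd) for the grid point with the smallest squared error.
    """
    best = None
    for kd in kd_grid:
        f = [x / (kd + x) for x in conc]
        denom = sum(v * v for v in f)
        if denom == 0.0:
            continue  # no informative points (all concentrations are zero)
        bmax = sum(y * v for y, v in zip(quench, f)) / denom
        sse = sum((y - bmax * v) ** 2 for y, v in zip(quench, f))
        if best is None or sse < best[0]:
            best = (sse, bmax, kd)
    if best is None:
        raise ValueError("no usable data points")
    return best[1], best[2]
```

A usage example with synthetic, noise-free data (not measurements from this study): titrating from 0 to 3 μM with a true Kd of 0.5 μM and scanning `kd_grid` in 0.01 μM steps recovers the input Kd to within the grid spacing. The simple hyperbola treats the added ASL concentration as the free concentration, which becomes a poor approximation when Kd is well below the 0.50 μM peptide concentration; this is in line with the text treating such Kd values as approximate.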

Table 4

Affinity of Peptides for hASLLys3UUU (a)

peptide	modified or unmodified hASLLys3UUU	Kd (μM)
P6	modified	0.50 ± 0.10 (b)
P6	unmodified	ID
P27	modified	∼0.05 ± 0.02 (c)
P27	unmodified	ID
P31	modified	0.58 ± 0.24
P31	unmodified	ID
P35*	modified	1.87 ± 1.00
P35*	unmodified	ID

(a) ID = indeterminable; could not be curve fitted and calculated from the data because of little change in fluorescence with binding of ASL.

(b) Kd value from previous studies.[9]

(c) Kd value is approximate because it is one-tenth the concentration of the fluorescent peptide reporter.

Results

Development of an Algorithm That Optimizes a Peptide Sequence for Binding RNA

Previous studies demonstrated the feasibility of selecting peptides with modification-dependent recognition of tRNAs’ anticodon stem and loop domains, ASLs.[18,19] The peptides were selected from completely and partially randomized phage display libraries.[9,10] However, optimizing 15- and 16-amino acid peptide sequences using this approach is not feasible since there are over 3.3 × 10¹⁹ possible sequences (20¹⁵ for a 15-amino acid peptide). Because of the exorbitantly high costs of creating and screening millions of peptides even with the benefit of phage display, we turned to computer algorithms[20] and assisted model building with energy refinement (AMBER) simulations to pare down the number of possibilities before performing in vitro assays. We developed a novel optimization strategy[21] that combines MC with SCMF to evolve amino acid sequences. The peptide P6 sequence RVTHHAFLGAHRTVG[9] was the starting point from which an optimized peptide was sought to bind the modified hASLLys3UUU with the highest specificity and affinity. The canonical 20 amino acids were categorized into six distinct groups according to hydrophobicity, polarity, size, and charge (Table 1A). These hydration properties were necessary to ensure the peptide did not become too hydrophobic (and thus insoluble) or so hydrophilic that binding to hASLLys3 was inhibited. The overall charge of the peptide was chosen to be slightly positive to ensure interaction with the negatively charged ASL. By adjusting the number of amino acids in each category, via Npho, Nneg, Npos, Npol, Noth, and Ngly, we maintained hydration properties similar to those of the original P6 sequence while evolving the sequences.[9] The peptide sequence was optimized using the following computational procedure. The stable structure for the complex between the original P6 sequence and ASL was determined using AMBER.
The structure of the fully modified hASLLys3UUU was taken from the high resolution solution structure,[6] providing a restrained structure to which the peptide would bind in silico. Once the stable structure of the peptide P6 with the ASLLys3 was determined, the peptide’s amino acid sequence was evolved and optimized while keeping the backbone fixed. Each peptide sequence evolved in two types of “moves”: (1) a single randomly chosen amino acid in the peptide sequence was mutated to a different amino acid from the same residue category (Table 1); or (2) two randomly chosen amino acids in the peptide sequence were exchanged regardless of their residue category (Figure 2) using SCMF.[22] SCMF finds the optimal rotamer combination with the lowest binding energy for the two exchanged amino acid residues (Figure 3) based on the preferred, distinct side chain conformations in Lovell’s rotamer library.[23] The sequences were subjected to continued rounds of optimization (Figure 2). By comparing the changes in binding energy before and after each of the two types of moves, the peptide sequence was evolved toward sequences with the lowest binding energies and thus increased binding affinity for the modified hASLLys3UUU.

Evolved Peptides Have Increased Specificity for ASL Modifications

The initial P6 sequence was subjected to several hundred thousand rounds of evolution, yielding 15-amino acid peptide sequences that, based on binding energies, should recognize and bind modified hASLLys3UUU with an affinity similar to or higher than that of P6. Initial results from the in silico selection suggested two optimized peptide sequences, P26, R-T-L-H-H-A-L-F-G-A-H-Q-T-V-G, and P27, R-W-Q-M-T-A-F-A-H-G-W-R-H-S-G. These sequences exhibited binding energies to the hASLLys3UUU lower than that of P6 (P26, −22.55 kcal/mol, and P27, −22.07 kcal/mol, vs P6, −21.26 kcal/mol). On the basis of these initial results, we developed three distinct peptide sequence cases. The three cases varied within the six residue categories (Table 1B). Although different, each of the three cases is still within the overall desired levels of moderate hydration and charge properties (slightly positively charged). P6, the initially evolved sequences P26 and P27, and three of the top 10 sequences from each of cases 1 and 2 have lower binding energies than those of case 3 (Table 2). This is likely due to the increased allowance in cases 1 and 2 for positively charged and hydrophobic residues (Table 2). The binding energies calculated for P26 and P27 and cases 1 and 2, but not 3, are on par with or lower than the binding energy for P6. This suggests a potential increase in their binding affinity for modified hASLLys3UUU versus that of P6.

Table 2

Binding Energies for Selected, Optimized Peptide Sequences

Using the search algorithm, peptide sequence candidates are evolved on the basis of a backbone scaffold of structure. For each binding structure, an explicit binding energy is generated. This search and selection process does not quantify the error in the binding energy. Peptides are numbered in accordance with the previous report of phage display selected sequences.[9] P6 is from the original selection, and P26–P38 were chosen based on predictions in silico. P35* with the sequence RWNHCQFWSGWRANG has a single amino acid serine change from P35. Peptide sequences A and B from cases 2 and 3, respectively, were not selected for chemical synthesis and analysis.

In Vitro Fluorescence Screens Validate in Silico Methods: Qualitative Analysis

Sequences predicted during the in silico optimizations to have the lowest binding energies and thus potentially higher affinity for modified hASLLys3 were selected for validation with a fluorescence assay.[9] Fifteen peptides (Table 3) were chemically synthesized with fluorescein at the N-terminus to allow for very sensitive, low volume detection of peptide–RNA binding interactions. P1 and P6 from phage display selections,[9] the initially evolved sequences, P26 and P27, and the best binders from each of the cases 1, 2, and 3 were synthesized. Variants of these sequences that had one or two amino acid changes were also synthesized (Table 3). During the initial validation assay, changes in the amount of fluorescence were monitored to determine whether the peptide was binding to the modified and/or unmodified hASLLys3UUU and to what relative degree (Figure 4). In this screening assay, P6 behaved as expected. When bound by the modified hASLLys3UUU, P6 fluorescence was altered more than that when bound by the unmodified hASLLys3UUU. Other peptides (P27, P31, and P35*) behaved similarly. Still others demonstrated nonspecific binding in that the degree of fluorescence did not significantly differ between the binding of modified and unmodified hASLLys3UUU (Figure 4: P30 and P32). Three peptides exhibited a greater degree of change in fluorescence quenching when binding the unmodified RNA versus the modified RNA; their binding could have been negatively affected by modifications (P26, P28, and P37). The stacking interactions of fluorescein with nucleobases can be nonspecific and result in fluorescence quenching that is mistakenly attributed to a selective binding of a ligand to a nucleic acid. We have employed a direct comparison of peptide binding of the modified hASLLys3UUU to binding of the control that is identical in sequence except for the modified nucleosides in the anticodon loop. 
If nonspecific stacking interactions were to occur, they were most likely limited to the stacked and identical stem regions of the control and experimental RNAs. Therefore, the observable differences in quenching could be attributed to the peptide ligand’s affinity for the modifications.

Table 3

Synthesized Peptides Based on Predictions in Silico (a)

Figure 4

Fluorescence of chemically synthesized peptides affected by modified and unmodified hASLLys3UUU. An initial fluorescent signal (FS0) of peptide alone (1.5 μM) was obtained. Then, a 2-fold excess of ASL was added to each peptide, and the fluorescent signal (FS1) was monitored. The percent change (100·(FS1/FS0)) is graphed for each of the assayed peptides. Dark gray bars represent the percent change in fluorescence in the presence of the modified hASLLys3UUU, and light gray bars represent the percent change in the presence of the unmodified hASLLys3UUU. Sequences for P1–P38 are presented in Table 3.

(a) Peptides are named following the peptides from the original phage display library screens.[9] P1 and P6 (bold font) are from the original screen. P6 has been characterized.[9] In addition, the binding of the modified and unmodified hASLLys3UUU by P1, P27, P31, and P35* (bold font and shaded row) is characterized in this article.

Three peptides of the 15 exhibited fluorescence enhancement when binding the modified RNA but exhibited quenching when binding the unmodified RNA (Figure 4: P34, P36, and P38), whereas a fourth exhibited the opposite (P1). This result is counter to the better understood binding of P6.[9] The result is also counterintuitive. Fluorescein stacks with the nucleobases of nucleic acids, resulting in reduced fluorescence. The quenching of fluorescein fluorescence is one of its advantageous properties in studying the interactions of fluorescein-coupled ligands with nucleic acids. Without a detailed investigation of the mode of binding by these four peptides, interpretation of these results is difficult at this time. Three peptides, P27, one of the two initially evolved, P31 from case 1, and P35*, a variant of P35 from case 2, preferentially bound the modified hASLLys3UUU as compared to their binding of the unmodified hASLLys3UUU, as did the phage selected P6. P27, P31, and P35* showed a very high selectivity for the modified hASLLys3UUU.
When these peptides were tested for their binding to unmodified hASLLys3UUU, little or no change in fluorescence was observed (Figure 4; Table 4). However, with the addition of the modified hASLLys3UUU, the fluorescent signals of P27, P31, P35*, and P6 were quenched. Peptides P32 and P33 had the best binding properties among the case 3 family of peptides. Even so, they had significantly weaker affinities and a lack of specificity for the modified hASLLys3UUU as compared to those of P27, P31, and P35* (Figure 4). Thus, peptides P27, P31, and P35*, evolved in silico from P6 to have high affinities for the modified hASLLys3UUU, also appear from the screening assay in vitro to have higher affinities and higher specificities for the modified RNA.

Evolved Peptides with Lower Free Energies of Binding Also Have Higher Affinity for Modified hASLLys3UUU: Quantitative Analysis

Five of the 15 peptides were chosen for detailed analysis of their binding because the data collected and presented in the initial screen (Figure 4) indicated that these peptides distinguish modified from unmodified RNA. These peptides were also chosen because, in their binding of the modified hASLLys3UUU, they exhibited sizable changes in fluorescence that could be accurately quantified (Figure 4 and Table 4). The peptide fluorescence signal was monitored with an increasing hASL concentration, and the percent quench in signal was plotted against the hASL concentration. Binding constants (Kd) were calculated using the single binding site, nonlinear regression function. Peptides P6, P27, P31, and P35*, having exhibited qualitatively the highest affinity and specificity for the modified substrate (Figure 4), were subjected to a quantitative analysis of their binding to both the modified and the unmodified hASLLys3UUU. P1 was also subjected to this analysis, though the peptide exhibited a decreased fluorescence when binding the modified hASLLys3UUU and an increased fluorescence in binding the unmodified hASLLys3UUU. Equilibrium binding constants (as the dissociation constant Kd; Table 4) were determined from the concentration-dependent fluorescence quenching with the binding of the modified and unmodified hASLLys3UUU. These data were compared to those for P6, from which these peptides were evolved in silico (Table 4). Little to no fluorescence quenching was observed when P27, P31, P35*, and P6 were titrated with the unmodified hASLLys3UUU; thus, binding constants could not be extracted from the data (Table 4). In contrast, the fluorescence of P27 was quenched considerably with increasing amounts of the modified hASLLys3UUU. The peptide bound the RNA with high affinity (Kd ≈ 0.05 ± 0.02 μM) (Table 4). P27 exhibited the highest affinity for the modified hASLLys3UUU.
P31 bound the modified hASLLys3UUU with 10-fold lower affinity than P27 (Kd = 0.50 ± 0.10 μM), an affinity equivalent to that of P6 (Table 4). P35* bound the modified hASLLys3UUU with the lowest affinity of these peptides (Kd = 1.87 ± 1.00 μM). The fluorescence of P1 was altered by both the modified and unmodified hASLLys3UUU. In binding the modified hASLLys3UUU, P1 exhibited a quench in fluorescence from which we extracted a modest binding constant (Kd ≈ 0.13 ± 0.02 μM). However, the fluorescence change in binding the unmodified hASLLys3UUU yielded an equivalent affinity (Kd ≈ 0.15 ± 0.04 μM). Thus, in contrast to P27, P31, P35*, and P6, P1 lacked specificity for the modified hASLLys3UUU.
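
The single-site fit of percent quench against hASL concentration can be sketched as follows. This is a minimal illustration and not the regression function used in the study: it assumes a simple hyperbolic isotherm, quench = Qmax × [hASL] / (Kd + [hASL]), with the free hASL concentration approximated by the total concentration. The function names, the grid-scan fitting strategy, and the titration points are all hypothetical; the data are generated from the model itself and are not taken from Table 4.

```python
def single_site_quench(conc, kd, qmax):
    """Percent quench predicted by a one-site hyperbolic binding isotherm."""
    return qmax * conc / (kd + conc)


def fit_single_site(concs, quench, kd_lo=1e-3, kd_hi=1e2, steps=2000):
    """Least-squares fit of (Kd, Qmax) by scanning Kd on a log-spaced grid.

    For a fixed Kd the model is linear in Qmax, so the best Qmax has a
    closed form; the Kd with the smallest squared error is returned.
    """
    best = None
    for i in range(steps + 1):
        kd = kd_lo * (kd_hi / kd_lo) ** (i / steps)
        frac = [c / (kd + c) for c in concs]
        qmax = sum(f * q for f, q in zip(frac, quench)) / sum(f * f for f in frac)
        sse = sum((q - qmax * f) ** 2 for f, q in zip(frac, quench))
        if best is None or sse < best[0]:
            best = (sse, kd, qmax)
    return best[1], best[2]


# Synthetic, noise-free titration (hypothetical values, in micromolar).
concs = [0.05, 0.1, 0.25, 0.5, 1.0, 2.0, 5.0]
quench = [single_site_quench(c, kd=0.5, qmax=40.0) for c in concs]

kd_fit, qmax_fit = fit_single_site(concs, quench)
print(f"Kd ~ {kd_fit:.2f} uM, Qmax ~ {qmax_fit:.1f}%")
```

When the peptide concentration is comparable to Kd, the free and total hASL concentrations differ appreciably, and a model that accounts for ligand depletion would be the more appropriate choice.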

Computational Analysis of Peptide P27 Binding of the Modified hASLLys3UUU

Peptide P27 combines the highest affinity with the highest specificity for binding of the modified hASLLys3UUU. The calculated structure of P27 bound to the modified hASLLys3UUU at equilibrium reveals how this affinity and specificity could be achieved and thus suggests future experimentation (Figure 5A). Amino acids throughout P27 are engaged with the extensive chemistries of the two modifications unique to the tRNALys3UUU. The two arginines, R1 and R12, bracket the threonylcarbamoyl group of ms2t6A37 (Figure 5B). At the middle of the peptide, F7 is closely associated with the hydrophobic methyl of the threonyl side chain. The indole ring of W11 lies above the methyl ester of the 5-methoxycarbonylmethyl moiety of mcm5s2U34. As evidenced by calculations for each of the 15 amino acids, R1, R12, F7, and W11 contribute to ΔGBinding (Figure 5C). The binding energies contributed by each of the nucleosides of the modified hASLLys3UUU have also been calculated (Figure 5D). The binding energy is concentrated in the anticodon loop, as opposed to the stem. Within the loop, the two modifications, particularly ms2t6A37, provide the most significant binding energies.

Figure 5

Peptide P27 binds the modified hASLLys3UUU with high affinity and specificity. (A) The computed equilibrium binding structure of the modified hASLLys3UUU bound by P27. The peptide backbone is in gold, and the ribose-phosphodiester backbone of the hASLLys3UUU is colored in green. (B) Enlargement of the interaction demonstrating the specificity achieved in the binding of the two modifications by the amino acids R1 (red), F7 (light green), W11 (light purple), and R12 (dark green). The peptide backbone is in gold and the side chains in color. The modifications ms2t6A37 (purple) and mcm5s2U34 (blue) are bound by amino acids at the beginning, middle, and end of the peptide. The ribose-phosphodiester backbone of the hASLLys3UUU is not shown. The table characterizes the contributions of different binding modes: ΔGBinding, Gibbs free energy of binding; BEw/o GBSUR, binding energy without GBSUR; VDW, van der Waals energy; ELE, electrostatic energy; EGB, polar solvation energy based on the generalized Born (implicit solvent) model; and GBSUR, nonpolar solvation energy, which is the product of the solvent-accessible surface area of the solute molecules and the interfacial tension between the solute and solvent. (C) Individual contributions of each amino acid to the VDW, ELE + EGB, and GBSUR. The amino acids are colored as in B. (D) Individual contributions of each nucleoside to the VDW, ELE + EGB, and GBSUR. The nucleosides engaged in the interaction with P27 are those of the anticodon loop, particularly the modified nucleosides at U34 and A37. The modified nucleosides are colored as in B.
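
The energy terms named in the caption can be combined as in the sketch below. The caption lists the terms but not the formula, so the additive form used here (binding energy without GBSUR as VDW + ELE + EGB, and ΔGBinding as that sum plus GBSUR) is an assumption based on how implicit-solvent decompositions are commonly written. The function names and the numbers passed in are hypothetical and are not values from Figure 5.

```python
def nonpolar_solvation(sasa, gamma):
    """GBSUR as described in the caption: surface area times interfacial tension."""
    return gamma * sasa


def binding_terms(vdw, ele, egb, gbsur):
    """Return (binding energy without GBSUR, total binding free energy).

    Assumes the terms are simply additive; units must be consistent
    (for example, all in kcal/mol).
    """
    be_without_gbsur = vdw + ele + egb
    delta_g_binding = be_without_gbsur + gbsur
    return be_without_gbsur, delta_g_binding


# Made-up illustrative numbers, not data from the study.
be, dg = binding_terms(vdw=-10.0, ele=-5.0, egb=4.0, gbsur=-1.0)
print(be, dg)
```

Grouping ELE with EGB, as panels C and D do, reflects that the polar solvation term typically offsets much of the direct electrostatic interaction.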

Discussion

An Amino Acid Signature Sequence Important for the Recognition of Uniquely Modified RNA

Protein recognition of RNA has been well-defined for two families of RNA binding proteins. The Pumilio protein was first identified as a Drosophila protein critical for segmentation of the posterior,[24] and the FBF protein was found to mediate the sperm/oocyte switch in Caenorhabditis.[25] Other RNA binding domains or RNA recognition motifs have been studied extensively, revealing conserved sequence/structure relationships, yet these structures differ in subtle ways, some of which are dynamic.[26] However, insight into protein recognition of RNA’s modified nucleoside chemistries has not been pursued and thus has yet to be characterized. The primary goal of this study was to demonstrate that a signature amino acid sequence can be identified that binds a uniquely modified RNA with high affinity and specificity. We reached this signature sequence by using computational simulations to obtain optimized amino acid sequences that were then confirmed by binding studies in vitro. By comparing peptide sequences that specifically bound the modified hASLLys3UUU to those that did not, we were able to derive an amino acid signature that should be useful for protein/peptide recognition of RNA with modifications. Focusing primarily on those peptides that showed the highest affinity and specificity for the modified hASLLys3UUU, we arrived at the amino acid signature R-W-Q/N-H-X2-F-Pho-X-G/A-W-R-X2-G (where X can be most amino acids, and Pho is hydrophobic) (Table 5).

Table 5

Amino Acid Signature Derived from Optimized Peptide Sequences

peptide	1	2	3	4	5	6	7	8	9	10	11	12	13	14	15
P1	F	S	V	S	F	P	S	L	P	A	P	P	D	R	S
P6	R	V	T	H	H	A	F	L	G	A	H	R	T	V	G
P27	R	W	Q	M	T	A	F	A	H	G	W	R	H	S	G
P29	R	W	N	H	Q	S	F	W	H	G	W	R	A	C	G
P31	R	W	Q	H	H	S	F	H	P	L	W	R	M	S	G
P35	R	W	N	H	C	Q	F	W	S	G	W	R	A	N	G
signature^a	R	W	Q/N	H	X	X	F	Pho	X	G/A	W	R	X	X	G

X = any amino acid; Pho = hydrophobic amino acid. Position 3 is either Q or N; position 10 is either G or A.
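
One way to read Table 5 is to count, for each peptide, how many of the ten constrained signature positions (those not marked X) it satisfies. The sketch below does this with the sequences transcribed from the table. The membership of the hydrophobic class (Pho) is not spelled out in the table, so the set used here is an assumption, as are the function and variable names.

```python
# Assumed membership of the hydrophobic class; the table does not define it.
HYDROPHOBIC = set("AVILMFWY")
ANY = None  # marks an unconstrained (X) position

# Signature from Table 5, positions 1 to 15.
SIGNATURE = [
    set("R"), set("W"), set("QN"), set("H"), ANY, ANY, set("F"),
    HYDROPHOBIC, ANY, set("GA"), set("W"), set("R"), ANY, ANY, set("G"),
]

# Peptide sequences transcribed from Table 5.
PEPTIDES = {
    "P1": "FSVSFPSLPAPPDRS",
    "P6": "RVTHHAFLGAHRTVG",
    "P27": "RWQMTAFAHGWRHSG",
    "P29": "RWNHQSFWHGWRACG",
    "P31": "RWQHHSFHPLWRMSG",
    "P35": "RWNHCQFWSGWRANG",
}


def constrained_matches(peptide):
    """Count constrained signature positions that the peptide satisfies."""
    return sum(
        1
        for residue, allowed in zip(peptide, SIGNATURE)
        if allowed is not ANY and residue in allowed
    )


scores = {name: constrained_matches(seq) for name, seq in PEPTIDES.items()}
for name, score in scores.items():
    print(f"{name}: {score}/10 constrained positions")
```

Under that assumption, P29 and P35 satisfy all ten constrained positions, P27 nine, P31 eight, P6 seven, and P1 two, which is consistent with reading the signature as a consensus over the optimized peptides rather than a pattern that every one of them matches exactly.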

The evolution of peptide sequences in silico is rapid relative to screening at the bench. We have developed an algorithm with the potential to simulate a very high number of binding events for the theoretical combinations of a 15-amino acid peptide (>3.3 × 10^23) to each substrate. In our algorithm, all 20 amino acids are considered; however, as a concession to moving more quickly through peptide evolution, we group them by side chain properties for the purpose of describing their hydration properties. The algorithm proved to be a powerful tool for accurately predicting peptides that bind specifically to the hASLLys3UUU modifications. We believe that the accuracy of in silico predictions can be improved by developing simulations in tandem that look more closely at nonspecific binding of the peptide to other small RNAs and/or unmodified tRNAs or ASLs. A cross-check performed by a parallel screen assessing the binding energies of peptides to different ASLs could potentially eliminate nearly all false positives before moving to in vitro and/or in vivo experiments. The validation screens in vitro revealed that, although the computer algorithms were not 100% correct in predicting peptide sequences with both high affinity and specificity, the selection in silico was an effective tool for predicting binding trends and for quickly screening many peptide sequence combinations.
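
The scale of the search can be illustrated with a simple count. Assuming every one of the 15 positions can independently take any of the 20 standard amino acids, the number of distinct sequences is 20^15, about 3.3 × 10^19. This counts sequences only and does not attempt to reproduce the combination figure quoted above. The second count, the number of sequences compatible with the signature, additionally assumes that X admits all 20 amino acids and that the hydrophobic class has the eight members listed; both are assumptions made for illustration.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids
PEPTIDE_LENGTH = 15

# Unrestricted sequence space: 20 choices at each of 15 positions.
total_sequences = len(AMINO_ACIDS) ** PEPTIDE_LENGTH

# Assumed hydrophobic class; the source does not list its members.
HYDROPHOBIC = "AVILMFWY"

# Choices per position for R-W-Q/N-H-X2-F-Pho-X-G/A-W-R-X2-G,
# treating X as any of the 20 amino acids (an assumption).
options_per_position = [
    1, 1, 2, 1, 20, 20, 1, len(HYDROPHOBIC), 20, 2, 1, 1, 20, 20, 1,
]
signature_sequences = math.prod(options_per_position)

print(f"all 15-mers:          {total_sequences:.3e}")
print(f"signature-compatible: {signature_sequences:,}")
```

Even under these generous assumptions, the signature narrows the space by roughly eleven orders of magnitude, which is the practical value of deriving it before any further screening.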

Amino Acid Signature Sequence R-W-Q/N-H-X2-F-Pho-X-G/A-W-R-X2-G

The signature amino acid sequence offers an opportunity to investigate the mode or mechanism of binding in future research. The sequence offers clues and surprises to be studied as to why the optimized peptides from cases 1 and 2 bind the modified hASLLys3UUU with high affinity. Interestingly, the amino-terminal sequence is more hydrophilic (R, Q, and H) than the center (F and Pho) or the carboxyl terminus (G). Conventional thought would have the two positively charged arginine residues (positions 1 and 12) preferentially engaged with the negatively charged phosphate linkages via charge–charge interactions and/or with the hydrophilic sugars. Here, the two arginine residues are also involved in interactions with mcm5s2U34 and ms2t6A37 through VDW energy (Figure 5B). The increased number of hydrophobic residues, specifically tryptophan (position 11) and phenylalanine (position 7), contributes to the overall binding specificity through VDW interactions. The fluorescence changes observed during the binding of the peptides to the hASLLys3UUU are smaller than expected for fluorescein stacking or intercalating with nucleobases of a duplex stem. Our results are not consistent with fluorescein-labeled peptides recognizing the identical stem regions of the modified and unmodified hASLLys3UUU. However, the results are consistent with the modest effect on fluorescence that could be expected of a differential interaction of peptides P6, P27, P31, and P35* with the single-stranded nucleosides of the modified hASLLys3UUU loop. For instance, one could expect that the phenyl ring of phenylalanine would intercalate within the 3′-base stack of the anticodon domain.
The N-threonylcarbamoyl-group of ms2t6A37 is known to enhance base stacking.[6] Phenylalanine has been observed to intercalate between anticodon nucleosides of tRNALys in the cocrystal structure of lysyl-tRNA synthetase and tRNALys.[27] Instead of the expected intercalation, in the computed structure F7 interacts with the threonyl-side chain contributing to the affinity and specificity of the peptide (Figure 5B). Though the signature sequence and the selected peptide sequences P27 and P31 have the highest affinity and specificity for the modified hASLLys3UUU and have two arginines each, there is little sequence homology with RNA binding proteins that are rich in arginine[28−32] or with single-stranded RNA binding proteins.[33−37]

Potential for Modification-Dependent, RNA Binding Peptides

The optimization of RNA binding peptides to recognize the unique chemistries of modified nucleosides and the contributions they make to local structure affords the opportunity of inhibiting RNA binding proteins studied in vitro and possibly in vivo. The benefits of modification-dependent signature peptides are manifold. First, an amino acid signature peptide that uniquely recognizes a specific RNA modification or combination of modifications becomes a tool in the study of RNA binding proteins that interact with RNA in a modification-dependent manner. Modifications are most often found in the terminal and internal loops of RNA structures.[38] There, the modifications negate intraloop hydrogen bonding and can enhance or even decrease the possibility of base stacking.[39] Peptides that recognize the ubiquitous anticodon domain modification N6-threonylcarbamoyladenosine can be used as a tool to study other modified tRNA–protein interactions, for instance, those between tRNAs and their modification enzymes and/or aminoacyl-tRNA synthetases. Second, the amino acid signature sequence has the potential to shed light on the residues necessary in protein recognition of modified RNA and hence aid in future studies of modified RNA–protein interactions. Indeed, pBLAST searches for this signature find hits in other RNA-binding proteins such as RNA polymerase sigma factor RpoE and threonyl- and prolyl-tRNA synthetases. The synthetases recognize tRNA substrates having t6A37. Whether these amino acids confer specific recognition to substrate modifications has yet to be determined. There are many instances in which a particular tRNA modification is known to exist, its presence determined by mass spectrometry, 2-dimensional thin layer chromatography, or another appropriate method (RNA MDB), with very few clues about which enzyme(s) are responsible for the modification. Third, peptides directed against specific modifications could also help in identifying incompletely modified RNAs.
Complex modifications typically occur in an organized stepwise fashion.[40] Even the earliest analyses demonstrated that many tRNA modification enzymes recognize a partially modified RNA.[41,42] Fourth, knowledge of signature peptide sequences for recognition of specific modified nucleosides could provide clues as to how best to narrow bioinformatic searches for modification enzyme functions. Fifth, optimized peptides can be used as competitive inhibitors of protein–RNA interactions that are related to human disease, symptoms, or causative factors. The enhanced binding activity of the peptides reported here could inhibit the recruitment of htRNALys3UUU and its annealing to the HIV genome, thus inhibiting HIV replication. Sixth, the amino acid signature sequence or the methodology reported here could also be used to derive a signature in the engineering of new proteins and the design of peptide-based therapeutics.[43]

39 in total

Review 1. Adaptive recognition in RNA complexes with peptides and protein modules.

Authors: D J Patel
Journal: Curr Opin Struct Biol Date: 1999-02 Impact factor: 6.809

2. Intra-tRNA distance measurements for nucleocapsid protein-dependent tRNA unwinding during priming of HIV reverse transcription.

Authors: B Chan; K Weidemaier; W T Yip; P F Barbara; K Musier-Forsyth
Journal: Proc Natl Acad Sci U S A Date: 1999-01-19 Impact factor: 11.205

Review 3. Mean-field minimization methods for biological macromolecules.

Authors: P Koehl; M Delarue
Journal: Curr Opin Struct Biol Date: 1996-04 Impact factor: 6.809

Review 4. The importance of being modified: roles of modified nucleosides and Mg2+ in RNA structure and function.

Authors: P F Agris
Journal: Prog Nucleic Acid Res Mol Biol Date: 1996

5. Complete nucleotide sequence of the AIDS virus, HTLV-III.

Authors: L Ratner; W Haseltine; R Patarca; K J Livak; B Starcich; S F Josephs; E R Doran; J A Rafalski; E A Whitehorn; K Baumeister
Journal: Nature Date: 1985 Jan 24-30 Impact factor: 49.962

6. Human ribosomal protein L7 binds RNA with an alpha-helical arginine-rich and lysine-rich domain.

Authors: P Hemmerich; S Bosbach; A von Mikecz; U Krawinkel
Journal: Eur J Biochem Date: 1997-05-01

7. A conserved RNA-binding protein that regulates sexual fates in the C. elegans hermaphrodite germ line.

Authors: B Zhang; M Gallegos; A Puoti; E Durkin; S Fields; J Kimble; M P Wickens
Journal: Nature Date: 1997-12-04 Impact factor: 49.962

8. The crystal structures of T. thermophilus lysyl-tRNA synthetase complexed with E. coli tRNA(Lys) and a T. thermophilus tRNA(Lys) transcript: anticodon recognition and conformational changes upon binding of a lysyl-adenylate analogue.

Authors: S Cusack; A Yaremchuk; M Tukalo
Journal: EMBO J Date: 1996-11-15 Impact factor: 11.598

9. Interaction of human immunodeficiency virus type 1 Tat-derived peptides with TAR RNA.

Authors: K S Long; D M Crothers
Journal: Biochemistry Date: 1995-07-11 Impact factor: 3.162

10. HIV-1 reverse transcriptase specifically interacts with the anticodon domain of its cognate primer tRNA.

Authors: C Barat; V Lullien; O Schatz; G Keith; M T Nugeyre; F Grüninger-Leitch; F Barré-Sinoussi; S F LeGrice; J L Darlix
Journal: EMBO J Date: 1989-11 Impact factor: 11.598
