Jessica Torres-Kolbus1, Chungjung Chou1, Jihe Liu2, Alexander Deiters3. 1. Department of Chemistry, North Carolina State University, Raleigh, North Carolina, United States of America. 2. Department of Chemistry, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America. 3. Department of Chemistry, North Carolina State University, Raleigh, North Carolina, United States of America; Department of Chemistry, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America.
Abstract
Site-specific incorporation of bioorthogonal unnatural amino acids into proteins provides a useful tool for the installation of specific functionalities that will allow for the labeling of proteins with virtually any probe. We demonstrate the genetic encoding of a set of alkene lysines using the orthogonal PylRS/PylTCUA pair in Escherichia coli. The installed double bond functionality was then applied in a photoinitiated thiol-ene reaction of the protein with a fluorescent thiol-bearing probe, as well as a cysteine residue of a second protein, showing the applicability of this approach in the formation of heterogeneous non-linear fused proteins.
Introduction
Covalent attachment of proteins to ligands, polymers, and surfaces creates macromolecules combining specific biological function with favorable physical and chemical properties. For example, studying biological processes in their native environment often requires the addition of reporter tags to proteins [1]. To date, the mainstay tagging strategy for imaging of proteins involves genetic fusions of fluorescent proteins [2], [3]. However, the large size of fluorescent proteins can interfere with the folding and activity of the targeted protein [4]. Alternatively, tag-mediated labeling methods have been exploited, including self-labeling proteins, such as HaloTag, SNAP-tag, CLIP-tag, and enzyme-mediated labeling [5], [6]. Although these methods allow for smaller reporter tags, limitations with regard to the position and the structure of the label remain and the presence of an enzyme is required.
An alternative strategy to label proteins is via the introduction of a single-residue modification, which is nearly non-perturbing. Site-specific protein labeling can be achieved by the installation of tags through bioconjugation reactions with reactive handles previously installed in a protein by using an orthogonal aminoacyl-tRNA synthetase/aminoacyl-tRNA pair for unnatural amino acid (UAA) mutagenesis [7]–[10]. The bioorthogonal groups can be installed at virtually any position in a protein expressed in pro- and eukaryotic cells and the choice of probes is nearly limitless. Bioorthogonal chemical handles that have been genetically encoded for conjugation reactions include ketones [11]–[14], azides [15]–[20], alkenes [21]–[28], alkynes [24], [29]–[32], tetrazines [33], aryl halides [34], [35], and aryl boronates [36]. The alkene functionality is currently receiving considerable attention because of its versatility in organic transformations and because it is rarely found in natural proteins [37], [38], allowing for selective modification. Carbon-carbon double bonds have been exploited for protein modification in reactions including olefin-metathesis [26], [39], photoaddition of tetrazoles [25], [27], [28], inverse electron demand Diels-Alder cycloadditions [23], [24], [30], and thiol-ene reactions [22], [40]–[44].
Bioorthogonal reactions have been applied in a variety of site-specific modifications of proteins such as fluorescent labeling, PEGylation, biotinylation, post-translational modification mimics, and surface immobilization [7], [9], [45]–[47]. Another area of interest for which bioconjugation reactions have been explored is the generation of non-linear protein fusions. In biological systems, proteins often bind to other proteins to gain stability, affinity and higher specificity to perform specific cellular functions such as signal transduction, transcriptional regulation, and DNA repair [48]–[50]. Elucidation of many of these processes has led to the generation of chemical and biosynthetic methods to create non-linear protein linkages post-translationally for the control and performance of a number of functions, as well as protein trafficking and isolation. Methods that have been explored include native chemical ligation [51]–[54], enzyme-based strategies [55], [56], and conjugation employing reactions with UAA residues [57]–[63]. The introduction of UAAs at a specific position allows for greater topological diversity with minimal protein modification [7]–[9], [46]. Here, we apply the site-specific genetic incorporation of alkenes into proteins to the direct, spacer-free generation of non-linear protein fusions.
The thiol-ene reaction involves a radical-mediated addition of a thiol to an alkene that occurs upon UV irradiation (365–405 nm) [64], [65]. The reaction offers the possibility of using light to control, in both space and time, the formation of a stable thioether bond. As a result of its specificity for alkenes and compatibility with aqueous environments, the thiol-ene reaction is a bioorthogonal reaction that has been applied in polymer and material synthesis [66]–[72], carbohydrate modification [73], [74], and peptide and protein modification [21], [22], [29], [40]–[44]. Recently, orthogonal thiol-ene bioconjugations applying alkenyl UAAs and synthetic organic reaction partners have been reported [21], [22]. In order to expand the chemical diversity of these orthogonal handles, we demonstrate the synthesis and incorporation of a set of alkene lysines and the formation of a protein heterodimer using alternative thiol-ene reaction conditions.
Results and Discussion
Incorporation of alkene lysines into proteins
The pyrrolysyl-tRNA synthetase (PylRS), found in certain methanogenic archaea and bacteria, directly charges pyrrolysine (Pyl) onto its cognate tRNA that subsequently delivers it in response to an in-frame amber stop codon, TAG [75]–[77]. Furthermore, it has been demonstrated that the pyrrolysyl-tRNA synthetase/pyrrolysyl-tRNACUA pairs from Methanosarcina barkeri (MbPylRS/PylTCUA) and M. mazei (MmPylRS/PylTCUA) are functional in Escherichia coli [78], Saccharomyces cerevisiae [79], mammalian cells [80], Caenorhabditis elegans [81], [82], Drosophila melanogaster [83], and Xenopus laevis oocytes [84]. In addition, the wild-type PylRS is capable of accommodating a broad range of unnatural lysine derivatives based on a carbamate linkage at the ε-amino group, to which a variety of functional groups, including tert-butyl [85], azido, alkynyl [17], norbornene [23], and diazirine [86], have been attached. We first synthesized a small collection of aliphatic alkene-lysines to diversify the structure of bioconjugation handles and to explore the ability of the MbPylRS to accommodate long-chain alkenes and lysine linkages other than carbamates (Figure 1A and Schemes S1–S4).
Figure 1
Genetic incorporation of alkene-lysine analogs into myoglobin by the wild-type MbPylRS/PylTCUA pair.
(A) Structures of alkenyl lysine derivatives bearing an ε-carbamate linkage (1–6), an inverted carbamate 7, an amide 8, and a urea 9. (B) Comparative incorporation efficiencies (%) into myoglobin and ESI-MS results.
To investigate whether the synthesized alkene-lysines are substrates for the wild-type MbPylRS, the incorporation efficiencies of 1–9 into myoglobin were evaluated by protein expression in E. coli. Cells were grown in the absence of a UAA and in the presence of 1–9. The amino acids 1, 2 and 3 have been previously described and incorporated into proteins using wild-type PylRS and/or PylRS mutants [22], [85]. Here we found that additional analogs can be efficiently incorporated into myoglobin by the MbPylRS. The obtained incorporation efficiencies and ESI-MS results are listed in Figure 1B and the corresponding SDS-PAGE analysis is shown in Figure S1.
Previous crystallographic studies of PylRS have indicated that the synthetase holds a large hydrophobic pocket, capable of accommodating bulky and hydrophobic moieties [87], [88]. In addition, it has been found that the carbamate moiety at the lysine side-chain is an essential discriminator for substrate recognition. For instance, the oxygen atom adjacent to the side-chain carbonyl group in 1 interacts via a water-mediated hydrogen bond with the side-chain carbonyl group of Asn346, a key residue in establishing substrate recognition in PylRS [85], [87]. We found that the amino acid binding pocket of MbPylRS exhibited flexibility to accommodate substrates 1, 2, 3, 6 and 7, with amino acids 1 and 7 showing the highest incorporation efficiency, which could be explained by their smaller size, while the amino acids 4 and 5 were not efficiently incorporated into protein due to their longer carbon chains.
The successful incorporation of 1 and 7 into protein, together with the inefficient substrate recognition of 8 and 9 by MbPylRS, suggests that the presence of an oxygen atom adjacent to the side-chain carbonyl group allows the hydrogen-bond network to be established more efficiently. We hypothesize that the recognition of 7 by MbPylRS may be possible by re-directing the necessary interactions of the synthetase’s binding pocket to the Oε-position. Moreover, we have previously observed a preference for the carbamate moiety over an amide group to drive the efficient genetic encoding of ε-N-propargyloxycarbonyl-lysine by the wild-type MbPylRS/PylTCUA pair, while its amide analog ε-N-pentynoyl-lysine was not accepted as a substrate [17]. Although analogs that bear a side-chain amide moiety have been incorporated into proteins by wild-type PylRS, so far only structures with up to four atom bonds in length from the amide ε-amino group have been tolerated by the enzyme’s binding pocket [89], [90]. Since our amino acid 8 is one bond longer, we can speculate that the carbamate functionality in 1, compared to the amide in 8, increases the efficiency of substrate recognition by MbPylRS, as the enzyme was also shown to tolerate the lengthier amino acids 2 and 3. The amino acid 9, which bears a urea linkage, seemed to be slightly favored by MbPylRS compared to the amide 8. However, the amino acid 9 still proves to be a poor substrate compared to 1. Our findings suggest that the replacement of the oxygen atom on the carbamate by a carbon or nitrogen atom may be enough to discriminate between the very similar substrates 1, 8, and 9, possibly due to weaker interactions with the amide or urea functionalities, thus not favoring an efficient binding of 8 or 9 in the MbPylRS amino acid pocket.
With amino acids 1–3 showing good incorporation efficiency, we site-specifically incorporated these amino acids into superfolder Green Fluorescent Protein (sfGFP) as a second model protein in E. coli. We found that alkene lysines 1–3 were successfully introduced at position Y151 in sfGFP (Figure 2A) and that sfGFP yields were obtained at 32–70 mg/L, an approximately 10-fold increase of incorporation efficiency compared to myoglobin bearing the same amino acids 1–3. ESI-MS analysis of purified sfGFP shows molecular weights corresponding to the site-specific incorporation of 1, 2, and 3 (Figure 2B).
Figure 2
Genetic incorporation of alkene-lysine analogs 1, 2 and 3 into sfGFP.
(A) SDS-PAGE analysis of purified sfGFP. –AA: no UAA was supplemented; WT: wild-type sfGFP; 1, 2 and 3: expression in the presence of the corresponding UAA (1 mM). (B) Protein yields (*wild-type sfGFP yield is 70 mg/L, 100%) and ESI-MS results.
sfGFP labeling via the thiol-ene reaction
To verify that the thiol-ene reaction is suitable for labeling the alkene-bearing sfGFP, dansyl-thiol (10) was used as a fluorescent probe (Figure 3A and Scheme S5). Wild-type sfGFP and modified sfGFPs carrying 1 or 2, which showed the highest incorporation efficiency, were subjected to a thiol-ene reaction with 10 by irradiating the reaction mixture with 365 nm UV light in the presence of the photoinitiator I2959 for 5 min. All samples were then analyzed by SDS-PAGE and in-gel fluorescence imaging. Figure 3B shows that the alkene-containing sfGFPs modified with 1 and 2 were both selectively labeled with 10 after UV irradiation, while the wild-type sfGFP was not fluorescently labeled. These results demonstrate that a thiol-containing fluorescent probe can be site-specifically conjugated to sfGFP bearing an alkene functional group.
Figure 3
Alkenyl-sfGFP is fluorescently labeled with dansyl-thiol, and bioconjugated to lysozyme to assemble a non-linear protein dimer via the thiol-ene reaction.
(A) sfGFP bearing an alkene functionality reacts photochemically with dansyl-thiol (10) or lysozyme (LYZ). (B) SDS-PAGE analysis demonstrates the labeling of alkenyl-sfGFP with 10 after 5 min of UV irradiation via thiol-ene ligation (lanes 5 and 6). Fluorescence (top) and Coomassie stain (bottom). (C) SDS-PAGE analysis shows mobility band shifts from 28 kD to 44 kD after samples were UV irradiated for 10 min (lanes 8 and 9), corresponding to the molecular weight of sfGFP-lysozyme conjugate. WT: wild-type sfGFP; 1 and 2: sfGFP carrying the corresponding UAA; LYZ: lysozyme. –UV: samples were not exposed to UV irradiation. +UV: samples were irradiated at 365 nm for 5 or 10 min.
In order to show the potential of the thiol-ene reaction in protein chemistry, we hypothesized that cysteine residues in another protein could also be used as a possible reaction partner, leading to the formation of a non-linear protein heterodimer (Figure 3A). Lysozyme is a small protein containing 8 cysteine residues within 129 amino acids [91]. The cysteines form 4 disulfide bonds and can be reduced to release free thiol groups. Analysis of bioconjugated proteins by SDS-PAGE revealed bands of the expected molecular weight, as the bands corresponding to sfGFP shifted from 28 kD to 44 kD via conjugation to lysozyme after UV exposure in the presence of the photoinitiator I2959 for 10 min (Figure 3C). This result indicates that the majority of the observed products are sfGFP-lysozyme heterodimers, since lysozyme was supplied in 4-fold excess compared to alkenyl sfGFP. Without UV irradiation, no significant mobility shift was observed. As expected, wild-type sfGFP did not undergo a thiol-ene reaction with lysozyme. Overall, a successful protein-protein heterodimer formation via thiol-ene conjugation of an alkene-containing protein was achieved.
In both bioconjugation strategies we found that the addition of sodium dodecyl sulfate (SDS) was necessary for an efficient and specific conjugation reaction to alkene-labeled proteins within 5–10 min, in contrast to previously reported 1–2 h reaction times [22], [29], [43], thus significantly reducing UV exposure. We found that under our experimental conditions, lysozyme is (at least partially) denatured [92], [93], as confirmed by circular dichroism (CD) spectroscopy (Figure S2). This may result in more accessible cysteine residues and facilitate the thiol-ene bioconjugation reaction. Moreover, as a well-known surfactant, SDS has been proposed to form micelles in thiol-ene reactions for water-based polymerization reactions [94], [95]. It is possible that the association of the proteins with micelles may increase their local concentration, thus further facilitating the reaction.
Conclusions
In conclusion, we have synthesized a collection of alkene lysines of varying length and ε-linkages and demonstrated their site-specific, genetically encoded incorporation into proteins in E. coli by the wild-type MbPylRS/PylTCUA pair. The alkene-containing amino acids 1–3 showed the highest incorporation efficiencies into myoglobin, and protein yields decreased with increasing side-chain length, hinting at the limitations of the wild-type synthetase’s binding pocket to accommodate sterically demanding amino acids. Among these amino acids, we also successfully incorporated the amino acid 7 with an inverted carbamate functionality at the ε-position of lysine. Replacement of the carbamate motif with an amide or urea failed to provide efficient incorporation into protein, once again suggesting that the carbamate moiety at the lysine side-chain can be an essential discriminator for substrate recognition by wild-type PylRS.
Next, the alkene amino acids 1, 2, and 3 were successfully incorporated into sfGFP, with 1 and 2 exhibiting the highest incorporation efficiency. Utilizing the thiol-ene reaction, alkene-bearing sfGFP was site-specifically bioconjugated to a dansyl-thiol fluorophore (10) upon irradiation with 365 nm UV light in the presence of photoinitiator I2959 after only 5 min. In addition, we applied the site-specific genetic incorporation of alkene-bearing amino acids into proteins in the direct, spacer-free synthesis of a non-linear protein fusion of sfGFP and lysozyme. All components are recombinantly expressed and no post-translational introduction of functional groups was required. The work described herein demonstrates for the first time the assembly of a protein heterodimer by means of a light-induced thiol-ene ligation using genetically encoded alkene-bearing UAAs. This approach may become a promising tool to create non-linear proteins directly, with minimal synthetic effort, by creating direct protein-to-protein conjugations.
Materials and Methods
Synthesis of alkene lysines: general considerations
Unless otherwise stated, all reagents used were commercial reagents used without purification and reactions were performed under nitrogen using flame-dried glassware. The 1H NMR and 13C NMR spectra were recorded on a 300 MHz or 400 MHz Varian NMR spectrometer. The amino acid 1 was purchased from Chem-Impex International, Inc. For synthesis schemes of 2–10, please refer to Schemes S1–S5.
(S)-2-Amino-6-(((but-3-en-1-yloxy)carbonyl)amino)hexanoic acid HCl salt (2)
To a solution of 2b (110 mg, 0.32 mmol) and Et3SiH (0.1 mL, 0.64 mmol) in dry DCM (4.5 mL), trifluoroacetic acid (0.24 mL, 3.2 mmol) was added dropwise, and the reaction mixture was allowed to stir at room temperature overnight. The volatiles were removed under reduced pressure and the residue was dissolved in a solution of 4 N HCl in 1,4-dioxane (0.25 mL) and DCM (0.75 mL), allowed to stir for 10 min at room temperature and then concentrated. The latter process was repeated two more times to ensure complete exchange of the TFA salt for the HCl salt. The concentrated residue was dissolved in a minimal amount of MeOH and was precipitated into ice-cold Et2O. The precipitate was pelleted by centrifugation, the supernatant decanted, and the solid was washed with Et2O before drying under vacuum, affording the amino acid 2 (82.2 mg, 92%) as a white solid. 1H-NMR (400 MHz, DMSO-d6): δ 8.45 (s, br, 3 H), 7.09 (s, br, 1 H), 5.75 (m, 1 H), 5.10–5.01 (m, 2 H), 3.94 (t, J = 6.8 Hz, 2 H), 3.77 (t, J = 6.4 Hz, 1 H), 2.92 (m, 2 H), 2.23 (m, 2 H), 1.75 (m, 2 H), 1.36–1.26 (m, br, 4 H) ppm; 13C-NMR (100 MHz, DMSO-d6): δ 170.9, 156.2, 134.8, 117.0, 62.7, 51.8, 33.2, 29.6, 28.9, 21.6 ppm; HRMS-ESI (m/z): [M+H]+ calcd for C11H20N2O4 245.1496, found 245.1490.
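The calculated [M+H]+ values quoted in the HRMS data can be checked with a short monoisotopic mass calculation. The sketch below is illustrative only and is not part of the reported procedure: the function names are hypothetical, and the isotope masses are standard reference values typed in by hand rather than taken from the paper.

```python
import re

# Monoisotopic masses (u) of the most abundant isotopes (standard reference
# values, entered here as an assumption; not taken from the paper).
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.00307401, "O": 15.99491462}
PROTON = 1.00727647  # mass of H+ (hydrogen atom minus one electron)


def parse_formula(formula):
    """Turn a simple formula such as 'C11H20N2O4' into {'C': 11, 'H': 20, ...}."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
    return counts


def mz_protonated(formula):
    """Monoisotopic m/z of the singly protonated ion [M+H]+."""
    counts = parse_formula(formula)
    neutral = sum(MONOISOTOPIC[el] * n for el, n in counts.items())
    return neutral + PROTON


if __name__ == "__main__":
    # Free amino acid formulas as quoted in the HRMS data for 2 and 3.
    for name, formula in [("2", "C11H20N2O4"), ("3", "C12H22N2O4")]:
        print(f"compound {name}: [M+H]+ = {mz_protonated(formula):.4f}")
```

Under these assumptions the computed values should agree with the calculated masses quoted for 2 and 3 to four decimal places; the measured ("found") values come from the instrument and cannot be reproduced this way.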
(S)-2-Amino-6-(((pent-4-en-1-yloxy)carbonyl)amino)hexanoic acid HCl salt (3)
Deprotection of 3b (0.5 g, 1.39 mmol) was performed as described above to obtain 3 (0.40 g, 97%) as a white solid. 1H-NMR (400 MHz, D2O): δ 5.86 (m, 1 H), 5.07–4.98 (m, 2 H), 4.05–4.00 (m, 3 H), 3.11 (t, J = 5.2 Hz, 2 H), 2.10 (m, 2 H), 1.97–1.87 (m, 2 H), 1.69 (t, J = 6.4 Hz, 2 H), 1.56–1.39 (m, 4 H) ppm; 13C-NMR (75 MHz, D2O): δ 172.0, 158.9, 138.6, 114.9, 65.0, 52.7, 39.9, 29.6, 29.4, 28.5, 27.5, 21.6 ppm; HRMS-ESI (m/z): [M+H]+ calcd for C12H22N2O4 259.1652, found 259.1653.
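The isolated yields reported for 2 and 3 follow from the amount of protected precursor and the molar mass of the product salt. The following sketch reproduces that arithmetic; it assumes the products are mono-hydrochloride salts (free amino acid plus one HCl) and uses rounded average atomic masses. The helper names are hypothetical.

```python
# Average atomic masses (g/mol), rounded standard values (an assumption,
# not taken from the paper).
AVERAGE_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999, "Cl": 35.45}


def molar_mass(counts):
    """Molar mass in g/mol from an element -> count mapping."""
    return sum(AVERAGE_MASS[el] * n for el, n in counts.items())


def percent_yield(product_mg, substrate_mmol, product_molar_mass):
    """Isolated yield in percent, assuming 1:1 stoichiometry."""
    theoretical_mg = substrate_mmol * product_molar_mass  # mmol * g/mol = mg
    return 100.0 * product_mg / theoretical_mg


# Assumed composition: free amino acid (C11H20N2O4 or C12H22N2O4) plus one HCl.
SALT_2 = {"C": 11, "H": 21, "N": 2, "O": 4, "Cl": 1}
SALT_3 = {"C": 12, "H": 23, "N": 2, "O": 4, "Cl": 1}

if __name__ == "__main__":
    # 2: 82.2 mg isolated from 0.32 mmol of 2b; 3: 0.40 g from 1.39 mmol of 3b.
    print(f"2: {percent_yield(82.2, 0.32, molar_mass(SALT_2)):.1f}%")
    print(f"3: {percent_yield(400.0, 1.39, molar_mass(SALT_3)):.1f}%")
```

With these assumptions both results land within about one percentage point of the reported 92% and 97%, which is consistent with rounding of the weighed amounts.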
(S)-2-Amino-6-(((hex-5-en-1-yloxy)carbonyl)amino)hexanoic acid HCl salt (4)
Diphosgene (0.26 mL, 2.16 mmol) was added dropwise to an ice-cold mixture of 2-buten-1-ol (cis:trans isomers, ∼1∶19) (0.12 mL, 1.66 mmol) and potassium carbonate (0.69 g, 4.98 mmol) in dry Et2O (5 mL). The resulting mixture was allowed to stir overnight at room temperature, filtered and carefully concentrated under reduced pressure to avoid loss of the volatile product. The chloroformate 6a was obtained as a clear liquid and without further purification it was dissolved in THF (1 mL). Then, it was added dropwise to an ice-cold solution of Boc-L-Lys-OH (495 mg, 2.0 mmol) in 1 M NaOH aqueous (1 mL) and THF (4 mL). The reaction was allowed to run overnight at room temperature. The volatiles were removed under reduced pressure and the residue was diluted in water and then washed with EtOAc (10 mL). The water layer was acidified with 5% citric acid to pH 3–4 and extracted with EtOAc (3×10 mL). The combined organic layers were washed with water (20 mL) and brine (10 mL). The resulting organic layer was dried over Na2SO4, filtered and concentrated in vacuo to dryness to furnish 6b (343 mg, 60%) as an off-white foam. 1H-NMR (400 MHz, CDCl3): δ 8.40 (s, br, 1 H), 6.29 (s, br, 0.5 H), 5.78 (m, 1 H), 5.31 (m, br, 1 H), 5.02–4.90 (m, 2.5 H), 4.29 (s, br, 1 H), 4.05 (m, br, 2 H), 3.15 (m, 2 H), 2.07 (m, 2 H), 1.81–1.40 (m, 15 H) ppm; 13C-NMR (75 MHz, CDCl3): δ 176.4, 156.9, 156.0, 131.0, 125.9, 80.1, 65.7, 53.3, 40.6, 32.2, 29.5, 28.5, 22.5, 17.9 ppm; HRMS-ESI (m/z): [M-H]− calcd for C16H28N2O6 343.1864, found 343.1869.
(S)-2-Amino-6-(((but-2-en-1-yloxy)carbonyl)amino)hexanoic acid TFA salt (6)
To a solution of 6b (317 mg, 0.92 mmol) and Et3SiH (0.29 mL, 1.84 mmol) in dry DCM (13 mL), trifluoroacetic acid (0.68 mL, 9.20 mmol) was added dropwise, and the reaction mixture was allowed to stir at room temperature overnight. The volatiles were removed under reduced pressure and the residue was dissolved in a minimal amount of MeOH and precipitated into ice-cold Et2O. The precipitate was pelleted, the supernatant decanted, and the solid was washed with Et2O before drying under vacuum, affording the amino acid 6 (288 mg, 87%) as a white solid. 1H-NMR (400 MHz, D2O): δ 5.80 (m, 1 H), 5.58 (m, 1 H), 4.43 (d, J = 5.6 Hz, 2 H), 3.85 (m, 1 H), 3.08 (t, J = 6.0 Hz, 2 H), 1.88 (t, J = 6.0 Hz, 2 H), 1.66 (d, J = 6.4 Hz, 3 H), 1.53–1.33 (m, 4 H) ppm; 13C-NMR (100 MHz, DMSO-d6): δ 171.3, 156.1, 129.6, 126.6, 64.1, 52.7, 38.4, 30.1, 29.1, 26.5, 21.9, 21.6, 17.5 ppm; HRMS-ESI (m/z): [M+H]+ calcd for C11H20N2O4 245.14958, found 245.14970.
6-Hydroxy-Boc-L-norleucine-OH (25 mg, 0.10 mmol) was dissolved in a solution of dry DCM (1 mL) and DIPEA (53 µL, 0.30 mmol). The solution was chilled to 0°C before the addition of allyl isocyanate (18 µL, 0.20 mmol) and the reaction was allowed to proceed at 40°C overnight. After cooling to room temperature, the mixture was diluted with DCM (3 mL) and 5% citric acid (4 mL) was added. The aqueous layer was extracted with DCM (3×4 mL) and the combined organic layers were washed with water (10 mL) and brine (5 mL). The resulting organic layer was dried over Na2SO4, filtered and concentrated in vacuo to dryness to furnish 7a (29 mg, 89% yield) as an off-white foam. 1H-NMR (400 MHz, CDCl3): δ 5.85 (m, 1 H), 5.24–5.07 (m, 2 H), 4.74 (m, br, 1 H), 4.29 (s, br, 1 H), 4.06 (t, J = 5.6 Hz, 2 H), 3.78 (m, 2 H), 1.93–1.25 (m, 15 H) ppm; 13C-NMR (100 MHz, CDCl3): δ 176.0, 155.8, 135.0, 116.4, 80.3, 64.8, 53.3, 43.3, 32.2, 28.7, 28.5, 22.0 ppm; HRMS-ESI (m/z): [M+Na]+ calcd for C15H26N2O6 353.1689, found 353.1654.
(S)-6-((Allylcarbamoyl)oxy)-2-aminohexanoic acid TFA salt (7)
Deprotection of 7a (28 mg, 0.085 mmol) was performed by following the procedure described for compound 6 to afford compound 7 (28.8 mg, 96%) as a white solid. 1H-NMR (400 MHz, D2O): δ 5.86 (m, 1 H), 5.19–4.80 (m, 2 H), 4.08 (t, J = 6.0 Hz, 1 H), 3.94 (t, J = 5.6 Hz, 2 H), 3.72 (m, 2 H), 1.95 (m, 2 H), 1.69 (m, 2 H), 1.50 (m, 2 H) ppm; 13C-NMR (100 MHz, D2O): δ 173.2, 159.0, 135.4, 115.1, 63.1, 53.7, 42.1, 29.7, 28.0, 21.0 ppm; HRMS-ESI (m/z): [M+Na]+ calcd for C15H26N2O6 353.1689, found 353.1654.
A solution of dansyl chloride (150 mg, 0.56 mmol) and TEA (194 µL, 1.39 mmol) in dry DCM (0.8 mL) was cooled to 0°C and added dropwise to an ice-cold solution of cysteamine (86 mg, 1.11 mmol) in dry DCM (1 mL). The reaction was allowed to stir at room temperature for 3 h and was then concentrated, and the product was purified on silica gel, eluting with 97∶2∶1 DCM/hexanes/TEA, to furnish 10 (51.4 mg, 30%) as a yellow film. Characterization data matched the literature [97].
Myoglobin Expression in E. coli
The plasmids pMyo4TAGpylT and pBKpylS were co-transformed into E. coli Top10 cells as previously described [98] and selected with 25 µg/mL tetracycline and 50 µg/mL kanamycin. A single colony was used to inoculate 2 mL of LB medium containing the same antibiotics and grown overnight. Next, 500 µL of culture was used to seed 50 mL of LB culture containing 1 mM of the corresponding UAA and antibiotics. The pH was adjusted to 7 with 10 M NaOH immediately before inoculation. Cells were then cultivated to OD600 = 0.6 and 100 µL of 20% arabinose solution was added to induce arabinose promoter driven expression. The cells were cultivated in a shaker at 37°C overnight and harvested by centrifugation at 3000 g in standard 50 mL conical tubes. Cells were lysed by re-suspending the cell pellets in standard Ni-NTA phosphate lysis buffer with lysozyme and 0.1% Triton X-100. After 1 hour of incubation at 4°C, cells were sonicated on ice to release the soluble fraction and debris was removed by centrifugation. The cleared lysates were incubated with 100 µL of Qiagen Ni-NTA agarose slurry at 4°C to bind His-tagged myoglobin. The mixture was then centrifuged at 1000 g for 5 min and the agarose beads were collected and transferred to microcentrifuge filter columns. Beads were washed three times with 400 µL of Ni-NTA lysis buffer and once with 400 µL of Ni-NTA wash buffer. The protein was eluted with 400 µL of elution buffer. The eluted sample was mixed with SDS loading buffer, heated at 95°C for 5 min, loaded onto a 10% SDS-PAGE gel of 1.5 mm thickness, and run at 150 V for 50 min. The gel was stained overnight with Coomassie blue solution (0.1% Coomassie blue, 10% acetic acid, 40% ethanol), then de-stained (10% acetic acid, 40% ethanol) and analyzed (Figure S1). The protein was dialyzed against 1 L of 20 mM ammonium acetate buffer for mass spectrometry analysis.
sfGFP Expression in E. coli
The plasmid pMyo4TAGpylT [98] was modified by replacing the myoglobin coding sequence with the sfGFP gene carrying an amber stop codon mutation at position Y151, located on the outer beta sheet domain. The co-transformation was the same as described above, but a condensed culture protocol was used to maximize UAA yields. The 2 mL overnight culture was scaled up to a 400 mL culture in 2×1 L Erlenmeyer flasks and grown to OD600 = 0.6. Cells were harvested in 4×50 mL conical tubes and re-suspended in 50 mL of LB medium containing 1 mM of the corresponding UAA, antibiotics, and 0.1% arabinose. Cells were re-suspended by incubating in a rotary shaker at 37°C for 10 min and collected in a 250 mL Erlenmeyer flask. The cells were induced for 4 h and harvested by centrifugation. Cell pellets were first suspended in 3.6 mL of 50 mM Tris-HCl pH 8.0, supplemented with 2.4 mL of 4 M ammonium sulfate, and extracted by the three-phase partitioning method [99] with 6 mL of t-butanol and vigorous shaking. The aqueous bottom layer containing sfGFP was removed and dialyzed against 1 L of Ni-NTA lysis buffer for 1–2 h to remove most of the ammonium sulfate. The dialyzed samples were filtered through a 0.45 µm disc filter before loading onto a Ni-NTA gravity column containing 0.5 mL bed volume. The proteins were bound and washed with 12 mL bed volume of lysis buffer and 6 mL bed volume of wash buffer containing 50 mM imidazole, and eluted with Ni-NTA elution buffer. Samples were analyzed by SDS-PAGE, dialyzed against PBS pH 7.4 for the subsequent labeling reaction, and then dialyzed against 20 mM ammonium acetate for mass spectrometry analysis.
Protein MS Analysis
Protein MS was measured at the Genomics and Proteomics Core Laboratories, University of Pittsburgh. The protein solution was adjusted to 5 pmol/µL in 80% acetonitrile and 0.1% aqueous formic acid. The sample was injected into a Bruker micrOTOF with an Ultimate 3000 HPLC. The results were deconvoluted to calculate the molecular weight using HyStar.
Thiol-ene Reactions with sfGFP
A reaction buffer containing 30 µL of 1 M Tris-HCl pH 6.8 (120 mM), 50 µL of 10% SDS (2%), 50 µL of 10 mM TCEP (2 mM), and 120 µL of water was made. In another Eppendorf tube, a solution of 10 mM photoinitiator I2959 containing 50% DMSO in water was prepared. Then 62.5 µL of I2959 were added to 250 µL of reaction buffer just before labeling. Next, 2.5 µL of the reaction buffer/photoinitiator mix were added to 20 µL of sfGFP (2400 ng/µL) and incubated at room temperature for 10 min. A dansyl-thiol (10) 50X substrate solution was prepared with 10 µL of 100 mM TCEP (10 mM), 20 µL of 25 mM dansyl-thiol in DMSO and 70 µL of deionized water. Next, 16.8 µL of this solution was added to the reaction mixture and incubated at room temperature for another 10 min. Subsequently, 250 µL of 1X SDS loading buffer containing 2-mercaptoethanol and an additional 50 µL of 100 mM DTT were prepared for stopping the reaction. Then 12 µL of the reaction samples were aliquoted into 200 µL PCR tubes. Samples were placed on a standard UV transilluminator at 365 nm for 5 min, and the reaction was stopped by adding 12 µL of 1X SDS loading buffer to the mixture and heating at 95°C for 5 min. Next, 5 µL of the samples were loaded onto a 10% SDS-PAGE gel of 1.5 mm thickness and run at 150 V for 50 min. After electrophoresis, gels were rinsed briefly with deionized water and imaged. Gels were then stained with Coomassie blue and scanned to visualize the protein bands.
For protein heterodimer formation, the thiol-ene conjugation was carried out with a denatured and reduced lysozyme solution. A reaction buffer containing 120 µL of 1 M Tris-HCl pH 6.8, 200 µL of 10% SDS, and 1 mL of 10 mM TCEP was prepared. In 1.32 mL of the reaction buffer, 19 mg of lysozyme were dissolved (1 mM). The solution was sealed in a 2 mL microcentrifuge tube with a rubber septum and purged with nitrogen for 30 min. The tube containing the protein solution was then heated at 75°C for 30 min. Photoinitiator I2959 was diluted to 10 mM in a solution of 50% DMSO in water. sfGFP solutions were adjusted to a concentration of 23 µM (650 ng/µL) and 20 µL of this solution was mixed with 2 µL of reduced lysozyme and 0.5 µL of I2959 in the dark. The PCR tube containing this mixture was then placed on a standard UV transilluminator at 365 nm for 10 min. A solution containing 120 µL of 1 M Tris-HCl pH 6.8, 20 µL of 10% SDS, 10 µL of 1 M TCEP and 200 µL of glycerol was prepared and 20 µL of it were immediately added to the reaction solution after irradiation. Then 15 µL of the resulting sample was loaded onto a native or 10% SDS-PAGE gel following standard procedures.
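The concentrations quoted in these protocols can be cross-checked with simple dilution and molar-amount arithmetic. The sketch below is a convenience calculation, not part of the published method: the function and variable names are hypothetical, and the lysozyme molar mass of about 14.3 kDa is an assumed literature value that the text does not state.

```python
def final_conc(stock_conc, stock_ul, total_ul):
    """C1*V1 = C2*V2: concentration after dilution, in the units of stock_conc."""
    return stock_conc * stock_ul / total_ul


# Reaction buffer for dansyl-thiol labeling (volumes in µL, from the text).
buffer_ul = {"Tris-HCl": 30, "SDS": 50, "TCEP": 50, "water": 120}
total_ul = sum(buffer_ul.values())                         # 250 µL
tris_mM = final_conc(1000, buffer_ul["Tris-HCl"], total_ul)  # 1 M stock
sds_percent = final_conc(10, buffer_ul["SDS"], total_ul)     # 10% stock
tcep_mM = final_conc(10, buffer_ul["TCEP"], total_ul)        # 10 mM stock

# Lysozyme stock: 19 mg in 1.32 mL. Molar mass is an ASSUMPTION (~14.3 kDa).
LYSOZYME_G_PER_MOL = 14300
lysozyme_mM = 19 / LYSOZYME_G_PER_MOL / 1.32 * 1000  # mg/(g/mol)/mL = M; x1000 = mM

# sfGFP: 650 ng/µL (= mg/L) at 23 µM implies a molar mass in g/mol.
sfgfp_g_per_mol = 650 / 23 * 1000

# Molar excess in the heterodimer reaction: 2 µL lysozyme vs 20 µL of 23 µM sfGFP.
lysozyme_pmol = lysozyme_mM * 1000 * 2   # µM * µL = pmol
sfgfp_pmol = 23 * 20
molar_excess = lysozyme_pmol / sfgfp_pmol

if __name__ == "__main__":
    print(f"Tris {tris_mM:.0f} mM, SDS {sds_percent:.0f}%, TCEP {tcep_mM:.0f} mM")
    print(f"lysozyme stock ~{lysozyme_mM:.2f} mM; sfGFP ~{sfgfp_g_per_mol / 1000:.1f} kDa")
    print(f"lysozyme : sfGFP molar ratio ~{molar_excess:.1f}")
```

Under the stated assumption for the lysozyme molar mass, the buffer concentrations, the 1 mM lysozyme stock, the roughly 28 kD size of sfGFP, and the approximately 4-fold excess of lysozyme mentioned in the Results are all mutually consistent.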
Circular dichroism analysis of lysozyme
CD experiments were performed on an Olis Circular Dichroism Spectrophotometer using 0.1 cm quartz cuvettes. A solution containing 19 mg of lysozyme in 1.32 mL of 10 mM phosphate buffer pH 7.4 was prepared. SDS was added to a final concentration of 2%. The lysozyme concentration was diluted to 20 µM for the CD experiment, and CD spectra were collected from 195 to 260 nm in 1 nm increments with an integration time of 5 s and a bandwidth of 2 nm. Increased intensity in the far-UV spectrum with the addition of SDS (Figure S2) is in agreement with previous observations [100].
Supporting Information
Figure S1. SDS-PAGE analysis for the incorporation of alkene-bearing lysines 1–9 into myoglobin. –AA: no UAA was supplemented; +AA: positive control UAA (1 mM); 1–9: myoglobin expression in the presence of the corresponding UAA (1 mM). (TIF)
Figure S2. Circular dichroism (CD) spectrum of lysozyme with and without SDS treatment. Blue: lysozyme with no SDS; Red: lysozyme with 2% SDS. (TIF)
Scheme S1. Synthesis of alkene-bearing lysines 2–6. (TIF)
Scheme S2. Synthesis of alkene-bearing lysine 7. (TIF)
Scheme S3. Synthesis of alkene-bearing lysine 8. (TIF)
Scheme S4. Synthesis of alkene-bearing lysine 9. (TIF)
Scheme S5. Synthesis of dansyl-thiol, 10. (TIF)