There are few biocatalytic transformations that produce fluorine-containing molecules prevalent in modern pharmaceuticals. To expand the scope of biocatalysis for organofluorine synthesis, we have developed an enzymatic platform for highly enantioselective carbene B-H bond insertion to yield versatile α-trifluoromethylated (α-CF3) organoborons, an important class of organofluorine molecules that contain stereogenic centers bearing both CF3 and boron groups. In contrast to current "carbene transferase" enzymes that use a limited set of simple diazo compounds as carbene precursors, this system based on Rhodothermus marinus cytochrome c (Rma cyt c) can accept a broad range of trifluorodiazo alkanes and deliver versatile chiral α-CF3 organoborons with total turnovers up to 2870 and enantiomeric ratios up to 98.5:1.5. Computational modeling reveals that this broad diazo scope is enabled by an active-site environment that directs the alkyl substituent on the heme CF3-carbene intermediate toward the solvent-exposed face, thereby allowing the protein to accommodate diazo compounds with diverse structural features.
Fluorine-containing molecules
have assumed a privileged position in modern medicine and comprise
more than 20% of all pharmaceuticals.[1,2] Although many
chemical methods for synthesis of organofluorines have been developed,[3,4] there is a noticeable lack of enzymatic approaches for their synthesis.[5] To expand the scope of biocatalytic fluorine
chemistry, two main strategies are currently considered.[5,6] The first is to expand the substrate scope of natural fluorinase
enzymes via protein engineering. The second is to modify biosynthetic
pathways to accept fluorinated building blocks for precursor-directed
biosynthesis. Although significant progress has been made,[5−13] both approaches suffer from restricted substrate scope and low efficiency,
which hampers synthetic applications.
A promising strategy to
expand the synthetic capability of nature
is to introduce non-natural chemistries into existing proteins.[14−16] Enzymes that share similar mechanistic elements with non-natural
reactions can exhibit promiscuous activities for abiological reactions,
from which new activities can be evolved.[17,18] Here, we used this mechanism-driven approach to develop fully genetically
encoded enzymes that can perform non-natural reactions for organofluorine
synthesis. Our target transformation is the synthesis of chiral α-trifluoromethylated (α-CF3) organoborons
from trifluorodiazo alkanes and Lewis-base borane complexes. This
class of organofluorine compounds serves as valuable synthetic building
blocks that can be converted to a broad range of CF3-containing
molecules via versatile boron-mediated transformations.[19−24] Despite the synthetic versatility of α-CF3 organoborons,
their asymmetric preparation remains a challenge. The few known enantioselective
examples in synthetic chemistry include a copper-catalyzed asymmetric
hydroboration of CF3-substituted alkenes and enantioselective
insertions of CF3-carbene intermediates into B–H
bonds of Lewis-base borane complexes.[21,25] The substrate
scopes in both examples are very limited: the hydroboration approach
has only been applied to aromatic β-CF3-α,β-unsaturated ketones,[21] whereas the B–H insertion strategy has only been
demonstrated with a few aryl-substituted trifluorodiazo compounds.[25]
The deficit of methods for making chiral
α-CF3 organoborons motivated us to develop an enzymatic
platform for their
synthesis by reprogramming heme proteins to utilize trifluorodiazo
alkanes for enantioselective carbene B–H bond insertion reactions
(Figure 1a). Harnessing
the facile synthetic accessibility of trifluorodiazo alkanes[26] and the evolvability of heme proteins,[16,27] this platform can be used to prepare a broad range of chiral α-CF3 organoborons, many of which are currently unobtainable. The
major challenge to developing the enzymatic system is engineering
the heme proteins to accommodate structurally diverse heme CF3-carbene intermediates for enantioselective B–H bond
insertion. Although current carbene transferases can use a variety
of substrates for versatile chemical transformations,[28−43] the scope of diazo compounds in most of these reactions has been
very limited (mainly to ethyl diazoacetate (EDA)). To expand the diazo
substrate scope, it has usually been necessary to reoptimize the enzyme
active site for each new diazo substrate. Furthermore, for all synthetic
carbene B–H bond insertion reactions developed to date,[44−50] high enantioselectivity has only been achieved for diazo compounds
containing (hetero)aromatic groups adjacent to the diazo carbon.
Figure 1
(a) Design
of a general enzymatic platform for synthesis of chiral
α-CF3 organoborons. (b) On the left, overlay of Rma cyt c (PDB 3CP5, front loop region in blue), Rma TDE (PDB 6CUK, front loop region in orange), and Rma TDE with iron porphyrin carbene structure obtained from computational
modeling (ref (52),
front loop region in green). The large change in front loop structure
highlights the impact loop mutations have on access to the active
site of Rma cyt c. On the right,
we propose that the active-site environment can be tuned to orient
the heme-carbene intermediate such that the alkyl substituent R is
solvent-exposed.
We envisioned that these
challenges could be met by leveraging
the unique structural features of Rhodothermus marinus cytochrome c (Rma cyt c).[51] This protein has proven
to be a highly versatile platform for developing new enzymatic carbene-transfer
reactions.[33,34] The heme-binding pocket of wild-type Rma cyt c is surrounded by several α-helices,
two of which are connected by a surface loop.[51] Recent structural characterization of the laboratory-evolved Rma cyt c TDE carbene transferase with
a bound iron porphyrin carbene (IPC) intermediate revealed that the
three mutations introduced during directed evolution rendered this
surface loop (residues 98–103) highly flexible (Figure 1b, left).[52] This enhanced flexibility allows the loop to explore more
open conformations, not observed in the wild-type structure, that
promote interactions between the substrate and the carbene intermediate.
We hypothesized that we could engineer an active-site environment
that favors a constrained conformation in which the CF3 group of heme CF3-carbene intermediates points into the
heme pocket, and the bulkier alkyl substituent R faces the solvent-exposed
side (Figure 1b, right).
In this conformation, the restricted orientation of the CF3 group could ensure highly enantioselective formation of C–B
bonds with little or no interference from the alkyl substituent R.
To identify mutations that can stabilize the conformation proposed
in Figure 1b, we began
directed evolution of Rma cyt c with N-heterocyclic carbene (NHC) borane 1 as the
model borane substrate and (3-diazo-4,4,4-trifluorobutyl)benzene (2 in Figure 2) as the model diazo carbene precursor. We expected that the large
phenylethyl group in 2 would facilitate its positioning
toward the solvent-exposed face. While wild-type Rma cyt c barely catalyzed carbene B–H insertion
with 2, site-saturation mutagenesis (SSM) at residue
V75 led to the discovery of the V75S variant, which afforded the desired
product 2a with 220 total turnovers (TTN) and 80:20 enantiomeric
ratio (e.r.). Residue V75 is located on an α-helix that is close
to the heme cofactor and was previously shown to be a key site for
expanding the scope of diazo compounds.[33] With Rma cyt c V75S as the parent,
we performed additional rounds of SSM on residues M100, M103, T101,
and M99 to further improve the catalytic performance of this borylation
catalyst. These residues reside on the front loop and are important
for controlling the structure of the heme pocket.[52] Through this engineering, we obtained a quadruple mutant, Rma cyt c V75S M100L M103D M99A (SLDA),
which produced product 2a with 1960 TTN and 96.5:3.5
e.r.
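The enantiomeric ratios quoted above can be restated as enantiomeric excess, and, under an additional assumption, as an apparent free-energy gap between the two competing pathways. The following is a minimal sketch only: the function names, the 298.15 K temperature (standing in for "room temperature"), and the premise that the ratio is set purely by relative rates are assumptions of this example, not statements from the text.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def ee_percent(major, minor):
    """Enantiomeric excess (%) from an enantiomeric ratio major:minor."""
    return 100.0 * (major - minor) / (major + minor)


def ddg_kcal(major, minor, temp_k=298.15):
    """Apparent free-energy gap (kcal/mol) implied by an enantiomeric ratio.

    Assumes the ratio reflects only the relative rates of two competing
    pathways at temp_k (an assumption of this sketch).
    """
    return R_KCAL * temp_k * math.log(major / minor)


# TTN and e.r. values quoted in the text for product 2a.
variants = {
    "V75S": {"ttn": 220, "er": (80.0, 20.0)},
    "SLDA": {"ttn": 1960, "er": (96.5, 3.5)},
}

for name, data in variants.items():
    major, minor = data["er"]
    print(
        f"{name}: TTN={data['ttn']}, "
        f"ee={ee_percent(major, minor):.1f}%, "
        f"ddG={ddg_kcal(major, minor):.2f} kcal/mol"
    )
```

On this reading, 80:20 e.r. corresponds to 60% ee and 96.5:3.5 e.r. to 93% ee.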
Figure 2
(a) X-ray crystal structure of wild-type Rma cyt c (PDB 3CP5). Residues targeted for site-saturation mutagenesis (V75, M99, M100,
M103, and Y44) are shown in sticks. (b) Directed evolution of Rma cyt c for enantioselective synthesis
of α-CF3 organoborons with NHC borane 1 and diazo compound 2 as the model substrates. Reactions
were performed in M9-N (pH 7.4) suspensions of Escherichia
coli cells expressing Rma cyt c variants (OD600 = 20). Standard reaction conditions were
10 mM borane substrate 1, 7.5 mM diazo substrate 2, room temperature under anaerobic conditions. Total turnovers
(TTN) were defined as the amount of α-CF3 organoboron
product divided by the total amount of expressed Rma cyt c protein as determined by the hemochrome assay.
The absolute configuration of product 2a was determined
to be R based on the optical rotation of the derivatized
alcohol. See the Supporting Information, section VII, for detailed experimental procedures. wt refers to
wild-type Rma cyt c. Single-letter
abbreviations for the amino acid residues: V, Val; S, Ser; M, Met;
L, Leu; D, Asp; A, Ala; Y, Tyr; I, Ile.
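The TTN definition in the caption can be written out explicitly. The relations below are a sketch under stated assumptions: 1:1 stoichiometry between diazo compound 2 and product, diazo 2 (7.5 mM) as the limiting reagent relative to borane 1 (10 mM), and constant reaction volume. The symbol [2]_0 for the initial diazo concentration is introduced here for illustration.

```latex
\mathrm{TTN} = \frac{n_{\text{product}}}{n_{\text{protein}}}
             = \frac{[\text{product}]}{[\text{protein}]},
\qquad
\text{yield} = \frac{[\text{product}]}{[\mathbf{2}]_0}
             = \mathrm{TTN}\,\frac{[\text{protein}]}{[\mathbf{2}]_0},
\qquad
\mathrm{TTN} \le \frac{[\mathbf{2}]_0}{[\text{protein}]},
\quad [\mathbf{2}]_0 = 7.5\ \text{mM}
```

Under these assumptions, TTN values measured at different expression levels are comparable only when the protein concentration from the hemochrome assay is taken into account.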
Y44 on the A helix of Rma cyt c is another residue that might affect catalysis of the target borylation
reaction. In an Rma cyt c-catalyzed
carbene Si–H insertion reaction, this residue was expected
to interact with a silane substrate approaching the enzyme active
site, as indicated by the crystal structure of an Rma cyt c-bound heme-carbene intermediate.[52] SSM of the quadruple mutant SLDA at Y44 and
screening yielded the Y44I mutation that further improved borylation
activity to 2460 TTN and enantioselectivity to 97.5:2.5 e.r.
With optimized variant Rma cyt c Y44I V75S M99A M100L M103D (denoted BOR-CF3) in hand,
we next probed its activity toward a panel of structurally diverse
trifluorodiazo alkanes. If BOR-CF3 stabilizes the “solvent-exposed”
conformation of the R group as we expect, then it should accept a
wide range of trifluorodiazo alkanes for enantioselective carbene
B–H insertion. Indeed, as shown in Figure 3a, BOR-CF3 could be used to synthesize
a spectrum of α-CF3 organoborons with diverse structural
features. High enantioselectivity and activity were obtained for diazo
compounds bearing different phenyl substitution patterns (3a–6a). Extending the chain length of the alkyl
substituent R has a more profound impact on the performance of BOR-CF3 (7a, 8a). Strikingly, reducing
the chain length to one carbon in 9a abolishes borylation
activity. As indicated by computational modeling, the iron-carbene
intermediate generated from diazo 9 with a shorter alkyl
chain can adopt conformations in which the phenyl moiety is close
to the carbene reaction center. Unfavorable steric interactions with
the phenyl group in these conformations would inhibit the borane substrate’s
approach to the carbene (see Figure S6 in
the Supporting Information).
Figure 3
(a) Scope of trifluorodiazo alkanes for carbene
B–H insertion
catalyzed by BOR-CF3. Reactions were performed in M9-N
(pH 7.4) suspensions of E. coli cells expressing
BOR-CF3 (OD600 = 20). Standard reaction conditions
were 10 mM borane substrate 1, 7.5 mM diazo substrate n, room temperature under anaerobic conditions. The absolute
configuration of 2a was determined to be R based on the optical rotation of the derivatized alcohol (see the Supporting Information, section VII). (b) Preparative
scale synthesis (0.2 mmol scale) and the derivatization of 2a to boronic acid 2c.
We further challenged BOR-CF3 with trifluorodiazo
alkanes
containing alkyl chains without aromatic groups (10–13). Although these diazo compounds are structurally distinct
from the model compound 2 used for directed evolution,
BOR-CF3 can still effectively convert them into α-CF3 organoborons with 730–1630 TTN and 96:4–98:2
e.r. (10a–13a). This result shows
that the high enantioselectivity of BOR-CF3 does not require
specific recognition of the aromatic substituent. One synthetic advantage
of this method is that the trifluorodiazo alkane starting compounds
can be synthesized easily from a variety of starting materials such
as alkyl halides, aldehydes, and carboxylic acids.[26,53,54] As these compounds are widely present in
nature, this method could provide a facile way to obtain chiral organoborons
that contain motifs of complex natural products. To demonstrate this,
we synthesized trifluorodiazo compound 13 bearing a geranyl
structural motif. Subjecting 13 to the standard conditions
described in Figure 3 with BOR-CF3 as the catalyst afforded organoboron product 13a with 1630 TTN and 98:2 e.r. Given the prevalence of the
geranyl structural motif in bioactive molecules,[55] this organoboron compound may find applications in syntheses
of fluorinated analogues of natural products such as pheromones. To
further demonstrate the synthetic utility of our method, we performed
this biocatalytic transformation on preparative scale and obtained
borane product 2a in 52% isolated yield and 97:3 e.r.
The NHC borane 2a can be readily converted to the boronic
acid 2c while retaining the stereochemistry of the boron-substituted
chiral center, which would facilitate its further derivatization to
other chiral trifluoromethyl-containing molecules.
We next used
molecular dynamics (MD) simulations to obtain further
insights into the stereocontrol imposed by the enzyme and the roles
of the introduced mutations. Analysis of the active-site shape of
BOR-CF3 in the absence of both substrates revealed that
the hydroxyl group in the 75S side chain forms a hydrogen bond with
the Y71 amide backbone, which directs the Chydrogen of serine to face toward the heme (Figure 4a). Modeling of the active-site
structure of BOR-CF3 with a bound diazo compound 2 showed that this serine arrangement generates more space
at the heme distal side to better accommodate the CF3 group.
As a result, diazo compound 2 can mainly be docked in
one specific conformation in the BOR-CF3 active site (Figure 4b). Intriguingly,
this conformation is structurally analogous to the transition state
for iron-carbene generation via N2 loss (Figure 4c). This result suggests that
the active-site environment of BOR-CF3 not only constrains
the conformation of the trifluorodiazo alkane, but also promotes the
formation of the iron-carbene intermediate by facilitating the interaction
between the diazo compound and the heme iron center.
Figure 4
(a) The hydrogen bonding
interactions between 75S and 71Y amide
backbone and the empty surface area in a representative snapshot obtained
from MD trajectories of BOR-CF3 in the absence of diazo
compound (see also Figures S1 and S2 in
the Supporting Information). (b) Binding pose of diazo substrate 2 bound into the active site of BOR-CF3, obtained
from docking and constrained MD simulations (see Figure S3 and computational details in the SI). Direct comparison
between the shape of the BOR-CF3 active site and the docked
diazo 2 shows the high complementarity achieved by evolution
and introduction of V75S mutation. (c) Comparison of diazo 2 bound in the BOR-CF3 structure and the DFT-optimized
model transition state (TS) geometry for formation of the iron porphyrin
carbene (IPC) intermediate (see the SI, Figure S8). The diazo 2 in this binding pose represents
a near attack conformation that leads to the transition state for
the generation of the iron porphyrin carbene intermediate.
We further modeled the BOR-CF3-bound
trifluoroalkyl-carbene
intermediate 2b formed from diazo compound 2. MD simulations revealed that 2b can adopt two major
conformations as shown in Figure 5a,b. The main difference between the two conformers
is the slight rotation of the Fe–C bond, which is possible
because no specific contacts between the CF3 group and
the protein stabilize one conformation over the other. Nonetheless,
in both conformations, the CF3 group is pointing into the
active site, and the bulkier phenyl group lies between the M103D and
Y44I side chains and is exposed to the solvent, in line with our original
hypothesis. More interestingly, both conformers of 2b expose the same pro-R face of the carbene intermediate
to the empty volume generated in the active site, consistent with
the observed R absolute configuration of product 2a. To better understand the interaction between the borane
substrate 1 and the carbene intermediate 2b, we used DFT calculations to describe the transition state (TS)
for B–H carbene insertion using a model iron porphyrin system
(see the SI, section VIII, for computational
details). A structural comparison between this TS geometry and the
structure of protein-bound carbene intermediate 2b revealed
that borane 1 can be effectively accommodated in the
empty volume of 2b where it can adopt a catalytically
competent pose to reach the TS for C–B bond formation (TS-CB). In this reaction scheme, the newly introduced M99A
and M100L mutations alter the conformational dynamics of the front
loop and generate an appropriate binding pocket for borane substrate 1 in the carbene-bound enzyme, while the major role of M103D
and Y44I is to facilitate binding of the diazo substrate and stabilize
both the diazo and the carbene intermediate in this binding pose.
Notably, in this interaction mode, the alkyl substituent R on the
CF3-carbene intermediate should have little influence on
how the borane substrate approaches the carbene intermediate, which
explains the general high enantioselectivity exhibited by BOR-CF3 for most trifluorodiazo alkanes tested. The generality of
this rationale for other trifluorodiazo alkanes is suggested by the
catalytic performance of BOR-CF3 and its earlier variants
on geranyl-containing diazo compound 13, where the trend
in activity and enantioselectivity was similar to that with model
diazo compound 2 (Scheme S1 in the Supporting Information).
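One common way to distinguish conformers that differ by rotation about a bond, such as the two conformations of 2b described above, is to track a dihedral angle across MD snapshots. The sketch below illustrates that general idea only; it is not the analysis used in this work. The choice of four atoms (for instance a porphyrin N, Fe, the carbene C, and the CF3 carbon), the 0° boundary, and the toy coordinates are all hypothetical.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dihedral_deg(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) about the p1-p2 bond."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / norm for x in b1)
    # Project b0 and b2 onto the plane perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    return math.degrees(math.atan2(_dot(_cross(b1, v), w), _dot(v, w)))


def assign_conformer(angle_deg, boundary_deg=0.0):
    """Label a snapshot 'A' or 'B' by a hypothetical dihedral boundary."""
    return "A" if angle_deg < boundary_deg else "B"


# Toy coordinates (not from the simulations): two made-up snapshots.
snapshots = [
    ((1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.5, 0.5, 1.0)),
    ((1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.5, -0.5, 1.0)),
]
for atoms in snapshots:
    angle = dihedral_deg(*atoms)
    print(f"dihedral={angle:.1f} deg, conformer={assign_conformer(angle)}")
```

In practice the atom selection and the boundary would be chosen by inspecting the dihedral distribution over the full trajectory.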
Figure 5
(a) Overlay of snapshots extracted from
a 500 ns MD trajectory
of BOR-CF3-bound trifluoroalkyl-carbene intermediate 2b (see also Figures S4 and S5 in
the Supporting Information). (b) Representative snapshots of the two
main conformations explored by trifluoroalkyl-carbene intermediate 2b in BOR-CF3 during MD trajectories. Blue surface
represents the inner void cavity generated in the protein active site
next to the iron porphyrin carbene. (c) DFT-optimized transition state
(TS) for B–H carbene insertion in a model system (see also Figure S7). Key distances and angles are given
in Å and degrees.
In conclusion, we have developed a biocatalytic platform
for synthesis
of chiral α-CF3 organoborons. Using directed evolution,
we created an active site in Rma cyt c that can host structurally diverse trifluorodiazo alkanes for highly
enantioselective carbene B–H insertion reactions. The effects
of beneficial mutations on the catalytic activities and conformational
dynamics have been rationalized by computational modeling. These efforts
have expanded the scope of carbene intermediates accessible to heme
proteins and provided new mechanistic insights into enzymatic carbene-transfer
reactions.
Authors: Marc Garcia-Borràs; S B Jennifer Kan; Russell D Lewis; Allison Tang; Gonzalo Jimenez-Osés; Frances H Arnold; K N Houk Journal: J Am Chem Soc Date: 2021-04-28 Impact factor: 16.383