Joshua B Pyser, Suman Chakrabarty, Evan O Romero, Alison R H Narayan. Department of Chemistry, Life Sciences Institute, and Program in Chemical Biology, University of Michigan, 210 Washtenaw Avenue, Ann Arbor, Michigan 48109, United States.
Abstract
The use of enzyme-mediated reactions has transcended ancient food production to the laboratory synthesis of complex molecules. This evolution has been accelerated by developments in sequencing and DNA synthesis technology, bioinformatic and protein engineering tools, and the increasingly interdisciplinary nature of scientific research. Biocatalysis has become an indispensable tool applied in academic and industrial spheres, enabling synthetic strategies that leverage the exquisite selectivity of enzymes to access target molecules. In this Outlook, we outline the technological advances that have led to the field's current state. Integration of biocatalysis into mainstream synthetic chemistry hinges on increased access to well-characterized enzymes and the permeation of biocatalysis into retrosynthetic logic. Ultimately, we anticipate that biocatalysis is poised to enable the synthesis of increasingly complex molecules at new levels of efficiency and throughput.
The utility of naturally occurring enzymes
has been harnessed for
thousands of years through fermentation and food preservation processes.[1] Fascination with the chemistry of microbes originated
at the dawn of the Neolithic era, nearly 12 000 years ago,
when humans began domesticating grains and consuming alcohol, the
evidence of which can be found in archeological records.[2] In fact, arguments have been made that our use
of alcohol produced enzymatically even predates archeological records,
with evidence for ethanol-degrading enzymes present in primate species
that lived before Homo sapiens.[2] Driven by curiosity, people sought to understand
how and why leaving cereals or grapes[3] alone
for some time caused them to adopt new properties, seeding fields
like enzymology, molecular biology, and biocatalysis as an extension
of this fascination.[4]

Within the
past few decades, biocatalysis in fine chemical and
pharmaceutical production has surged.[5−7] This trend is driven
in part by advances in DNA sequencing, bioinformatics, and protein
engineering that allow for the identification of enzymes that meet
the reactivity and selectivity needs of a given synthetic route.[8] Biocatalytic reactions are now routinely used
in scalable processes ranging from simple chemical manipulations such
as chiral resolutions,[9−11] reductive aminations,[9,12] and alcohol
oxidations,[13] to complex, multistep chemoenzymatic
cascades that enable access to high-value drug molecules on an industrial
scale.[14] The rapid developments in biocatalysis
are also enabling a re-emergence of natural products in the current
era of genomics.[15] Natural products have
a long history in drug discovery and development given their often
potent biological activity. Still, the structural complexity of these
compounds challenges chemists and demands a substantial time and resource
investment to synthesize these compounds and analogs thereof.[16] It seems undeniable that the next logical step
in synthetic chemistry is to leverage the machinery that Nature has
developed to access a similar breadth of complexity exhibited by these
compounds.

Biocatalytic methods also offer several key advantages
over traditional
chemical processes. These advantages include increased safety and
sustainability, procedural simplicity, and the tunability of an enzyme’s
reactivity or selectivity through protein engineering.[7] In particular, the sustainability profile of enzymatic
transformations has motivated their adoption.[17] In contrast to the organometallic catalysts developed using precious
metals, which are being rapidly depleted from the Earth’s crust,[18] enzymatic catalysts can be produced without
the worry of exhausting limited resources. These advantages poise
biocatalysis for adoption into the mainstream synthetic repertoire.[19] It is easier now than ever before for anyone,
from enzyme novices to global biotech companies, to tap into the powerful
transformations biocatalysis can offer.

In this Outlook, we explore the diverse
field of biocatalysis and
highlight recent advances in technology that have dramatically improved
the accessibility of enzymes and driven the transition from simple
biocatalytic systems to sophisticated complexity-generating biocatalytic
platforms. Additionally, we highlight the advantages that biocatalysts
offer in organic synthesis, the current state-of-the-art in this field,
and how advancing technologies will provide new opportunities for
incorporating biocatalytic strategies in the synthesis of target molecules.
Early Applications of Biocatalysis in Synthesis
Despite their use
in the fermentation process for millennia, enzymatic
methods were first appreciated on the molecular level beginning in
the early 1800s[20] when researchers began
investigating yeasts for their fermentation abilities.[21] Though it was first believed that the entire
microorganism itself was functioning as the catalyst, the discovery
of the first enzyme mixture, called “diastase”, fundamentally
changed the field by demonstrating that observed reactions were mediated
by only specific parts of the organism.[22] This sparked increased interest in the then-new field of enzymology,
and crucial milestones in understanding enzymes followed. These include
the development of the lock and key model,[23] cell-free fermentation,[24] the realization
that enzymes were in fact proteins,[25] and
the elucidation of the DNA structure,[26] all of which have paved the way for modern biocatalysis (Figure 1).[21] With the invention and adoption of X-ray crystallography,
researchers were finally able to view the three-dimensional structure
of these miraculous macromolecules in detail and gain insight into
their functions and mechanisms.[27] A more
detailed account of the rich history of enzymology has been reviewed
previously.[21]
Figure 1
(A) Early uses of enzyme-mediated
transformations, such as fermentation,
chiral resolutions, and functional group interconversions. (B) Recent
advances in genome sequencing, gene synthesis, and bioinformatics
increase the accessibility of obtaining enzymes. (C) Select strategies
in modern biocatalysis include cascades, chemoenzymatic synthesis,
and enzyme evolution.

Over much of the 20th
century and into the early 2000s, the use
of enzymes to perform useful chemistry truly gained popularity.[21] Enzyme-mediated kinetic resolutions were one
of the most common initial uses of biocatalysis in synthesis. Though
several different classes of enzymes have been applied to conduct
these enantiomeric enrichments,[9] lipases
are commonly employed to effect this transformation based on their
commercial availability, large substrate scope, high levels of selectivity,
and cofactor-free catalysis.[11] Also commonly
used in the production of cheese products and laundry detergents,[28] the first member of this enzyme class was discovered
in 1848 by Claude Bernard in his investigation of pancreatic secretions.[29] Initial experimentation with lipases in the
1930s[30] and 1940s[31] laid the groundwork for their use in kinetic resolutions and other
biocatalytic transformations for the rest of the 1900s.[11,28] A relatively recent example published by Kaga and co-workers in
2003 demonstrates the simplicity of using lipases in a more modern
biocatalytic setting: to construct a small library of chiral hemiaminals
through the dynamic kinetic resolution of racemic starting materials.[10] The group first screened a set of commercially
available lipase enzymes for acylation activity against their library
of racemic N-acylhemiaminals and determined
that lipase QL gave short reaction times and operated with high levels
of enantioselectivity. With this enzyme, they constructed several O-acylated hemiaminals in quantitative yields and with high to exquisite
enantioselectivities (Figure 2A). Lipase QL is just a select example among several others
that demonstrates the early and widespread use of lipases and other
kinetic resolution enzymes in academia and industry.[28]
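
Because enantioselectivity is the figure of merit for resolutions such as the one described above, it may help to recall how enantiomeric excess (ee) is computed from the measured amounts of the two enantiomers. The short Python sketch below uses the standard definition, ee = |major − minor| / (major + minor) × 100. The function name and the peak areas are illustrative assumptions and are not data from the study by Kaga and co-workers.

```python
def enantiomeric_excess(major: float, minor: float) -> float:
    """Return the enantiomeric excess in percent from two enantiomer amounts."""
    total = major + minor
    if total <= 0:
        raise ValueError("total amount must be positive")
    return abs(major - minor) / total * 100.0


# Hypothetical chiral HPLC peak areas (illustrative numbers only).
racemate = enantiomeric_excess(50.0, 50.0)
resolved = enantiomeric_excess(99.5, 0.5)
print(f"racemic starting material: {racemate:.1f}% ee")
print(f"after resolution: {resolved:.1f}% ee")
```

Any quantity proportional to the amount of each enantiomer (peak area, moles, mass) can be supplied, since only the ratio matters.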
Figure 2
(A) Dynamic kinetic resolution of racemic N-acylhemiaminals
by a lipase. (B) NAD(P)H recycling system developed by Wong and Whitesides.
(C) Cascade system for construction of chiral amines using an ω-transaminase.
Abbreviations: G6PDH glucose-6-phosphate dehydrogenase, DH dehydrogenase,
TA transaminase, L-AADH l-α-amino acid dehydrogenase.

The design of systems for in situ cofactor regeneration
is a significant milestone in biocatalytic method development.[32] Early studies of cofactor-dependent enzymes
in synthesis relied on the addition of stoichiometric quantities of
these cofactors, which limited the utility of the enzymatic reaction.
Thus, the ability to continuously recycle these essential components
in the reaction mixture was critical to certain biocatalysts’
practical use.[33] The early work of Wong
and Whitesides on the regeneration of NAD(P)H in situ to enable reductions by dehydrogenase enzymes demonstrates this
method’s capability and has since made an enormous impact on
the field.[34−36] To apply these dehydrogenases toward the construction
of chiral alcohols, they developed the use of glucose-6-phosphate
dehydrogenase (G6PDH) from L. mesenteroides to reduce the NAD(P)+ cofactor in situ following
its oxidation by the dehydrogenase. G6PDH relies on glucose-6-phosphate,
which is inexpensive and easy to synthesize, to provide the equivalent
of hydride needed to reduce NAD(P)+ to NAD(P)H (Figure 2B). With this recycling
system in place, Wong and Whitesides completed the biocatalytic generation
of optically pure D-lactic acid, threo-Ds(+)-isocitric acid, and (S)-benzyl-α-d1 alcohol.[34] This
early example of a biocatalytic cascade has since enabled the use
of many enzyme-catalyzed reductions and has paved the way for application
on an industrial scale.[37]

Since this
preliminary work, methods relying on electrochemistry,
photochemistry, and other hydride donor/acceptor systems have been
developed. For example, readily available reagents like isopropanol
have been used in cofactor regeneration systems, providing an alternative
to the more expensive sugars used previously.[32] A variety of more economical and industrially feasible sacrificial
functional group donors have also been applied to improve efficiency,
scalability, and ease of use of cofactor regeneration.[38,39] Several reports describe the use of isopropyl amine as an amino
donor for transamination reactions, providing a substitute for the
cost prohibitive amino acids used conventionally.[40,41] There is also a focus on the construction and regeneration of synthetic,
biomimetic cofactors, which holds promise for increasing the effectiveness
of these systems further.[42]

The rapid
adoption of recycling system methods allowed for the
broad application of biocatalytic reduction reactions.[37] The use of transaminases for constructing chiral
amines is a prime example, and their utility in synthesis has been
showcased in the synthesis of drug molecules such as sitagliptin.[43,44] Transaminases offer many advantages over their chemical counterparts,
including improved stereoselectivity, mild reaction conditions, and
reduced reliance on harmful solvents and transition metals.[43,45] A notable example of the application of ω-transaminases, a
subgroup of transaminases that has drawn particular attention in the
pharmaceutical industry,[46] was developed
by Koszelewski and co-workers. Ultimately relying on a system similar
to that produced by Wong and Whitesides to power catalysis, the Koszelewski
group constructed nine chiral amines with excellent enantiomeric excess
through a reductive amination with the commercial ω-transaminase
ATA-113.[12] They also employed a second
enzyme, l-α-amino acid dehydrogenase (L-AADH), to further
streamline this reaction by regenerating the amino acid alanine in situ, which is the amine source for the transamination
reaction. L-AADH, enabled by the NAD(P)H recycling system, utilizes
an equivalent of ammonium as the ultimate nitrogen source to reduce
pyruvate to the desired alanine (Figure 2C). Albeit a relatively simple transformation
by today’s standards, this early work serves as a quintessential
example of the synthetic utility of transaminases.

The seminal work
highlighted here, alongside other early examples
of simple biocatalytic reactions such as isomerizations, redox manipulations,
and ligations,[47] brought to light the power
of enzymes as catalysts in synthesis. In the more modern history of
biocatalysis, there has been a paradigm shift from using enzymes to
construct relatively simple building blocks or provide chiral intermediates
for traditional syntheses,[48] to relying
on them for late-stage synthetic modifications,[49] combining molecule fragments toward value-added compounds,
and conducting multistep, biocatalytically mediated total syntheses.[14,37] Additionally, the tools for investigating and leveraging biocatalysts
for synthetic uses have reached a stage where they are widely accessible
to the chemistry community: obtaining the knowledge and equipment
needed for biocatalysis can be accomplished with just a few clicks.
Accessibility of Biocatalysis to Synthetic Chemists
Once relegated to the
fields of biochemistry and molecular biology,
enzymatic catalysts can now be used by virtually anyone and tailored
to their own needs, thanks to recent advances in bioinformatics,[50] DNA
sequencing,[51] protein engineering,[52] and DNA synthesis. The process of identifying, producing, isolating,
and tuning the reactivity of biocatalysts for desired transformations
is as accessible to synthetic chemists as obtaining and using small
molecule catalysts. In particular, the recent exponential growth in
annotated protein sequences available in online databases has created
an enormous catalog of potential enzymes to serve many synthetic needs.
Two of the most popular databases, UniProt[53] and GenBank,[54] now house information
on more than 420 000 individual species, representing over
one billion total sequence records. Instead of taking to the field
and collecting specimens by hand to examine their genes, researchers can turn to these databases, which
store a wealth of information on protein sequence and origin and are
a valuable starting point for anyone looking to identify enzymes for
a given synthetic purpose.[55]

Combining
the vast amount of data stored in these online libraries
with bioinformatic tools allows one to begin making predictions about
the function of uncharacterized or “hypothetical” proteins,[56] and to search for previously identified proteins
that may also demonstrate activity in a noncanonical transformation.[57] For example, the Basic Local Alignment Search
Tool (BLAST) is one of the most popular and easy-to-use tools for this type
of analysis.[58,59] Gaining popularity in the early
1990s and now available to use for free on the National Center for
Biotechnology Information (NCBI) Web site,[59] this tool relies on algorithms to search available online databases
for protein sequences that resemble a given input sequence. By feeding
the BLAST search engine a known nucleotide or amino acid sequence,
or a protein identifier such as an accession number, the tool can
align all known protein sequences that share similarity with the input
sequence and rank them in a list. As minute changes in the order or
position of amino acid residues can drastically alter function between
homologous proteins with highly similar sequences, this type of search
can be advantageous when trying to identify enzymes with improved
stability and activity, complementary substrate scopes, or proteins
that can perform desired transformations with the alternative site-
and/or stereoselectivity to the one used to build the query.[36,60,61] This tool also provides known
information about each sequence, such as the originating organism
and any characterized metabolic function of the protein within said
organism. By displaying data on the degree of similarity between proteins
based on how well their sequences align, a user can quickly identify
any known proteins that may share functional characteristics with
the input protein sequence.[62]

Albeit
a useful starting point, this list format provided by BLAST
can become cumbersome when the search yields thousands of potentially
related protein sequences. To obtain a more comprehensive view of
entire protein families, some of which can contain hundreds of thousands
of proteins,[63] tools have been developed
that provide greater context for viewing connections within these
groups. Phylogenetic trees are commonly used to examine relationships
between homologous proteins and study changes in protein families
over their evolution.[64] This bioinformatic
analysis technique relies on the alignment of homologous protein sequences
to construct a visual representation of the evolutionary history of
the related sequences in a phylogenetic tree (Figure 3A).[64] Building
and visualizing these trees has also been simplified by programs like
Molecular Evolutionary Genetics Analysis (MEGA)[65,66] and Ensembl[67] that provide straightforward
user interfaces. Once various algorithms and search tools are applied
to analyze all available data and establish the most likely configuration,
the trees can be examined to draw conclusions about relatedness among
protein families and test hypotheses about their evolutionary origins.[68,69]
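
To make the idea of ranking database sequences by their similarity to a query more concrete, the following minimal Python sketch aligns a query against a handful of candidates and sorts them by percent identity. It is a toy illustration only: it uses a simple global (Needleman-Wunsch) alignment with arbitrary scoring values, whereas BLAST relies on heuristic local alignment with substitution matrices and statistical scoring. The sequences, entry names, and function names are hypothetical.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment; returns the two gapped strings."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    def sub(i, j):
        # Score for pairing residue i of a with residue j of b (1-based).
        return match if a[i - 1] == b[j - 1] else mismatch

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(score[i - 1][j - 1] + sub(i, j),
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right cell to recover one optimal alignment.
    top, bottom = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            top.append(a[i - 1])
            bottom.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1])
            bottom.append("-")
            i -= 1
        else:
            top.append("-")
            bottom.append(b[j - 1])
            j -= 1
    return "".join(reversed(top)), "".join(reversed(bottom))


def percent_identity(a, b):
    """Identical aligned positions divided by alignment length, in percent."""
    x, y = global_align(a, b)
    if not x:
        return 0.0
    same = sum(1 for p, q in zip(x, y) if p == q and p != "-")
    return 100.0 * same / len(x)


def rank_by_identity(query, candidates):
    """Return (name, percent identity) pairs, most similar first."""
    scored = [(name, percent_identity(query, seq))
              for name, seq in candidates.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical query and database entries (not real protein sequences).
query = "MKTAYIAKQR"
database = {
    "entry_1": "MKTAYLAKQR",
    "entry_2": "GGSWPLNDCE",
    "entry_3": "MKTAYIAKQR",
}
for name, identity in rank_by_identity(query, database):
    print(f"{name}: {identity:.1f}% identity")
```

A pairwise identity table of this kind is also the usual starting point for the distance-based tree-building methods mentioned above, although dedicated programs handle multiple sequence alignment and evolutionary models far more rigorously.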
Figure 3
(A)
Conceptual phylogenetic tree depicting locations of calculated
ancestral sequences. (B) Conceptual SSN demonstrating nodes, edges,
and clusters. (C) Workflow for a traditional cloning procedure.

For example, one intriguing use of phylogenetic
analyses in the
context of biocatalysis is the identification and reconstruction of
ancestral protein sequences (Figure 3A) that can offer benefits in stability and biocatalytic
activity over their modern “offspring”.[70] This technique relies on software to compare related protein
sequences that most likely evolved from a common ancestor to calculate
or “infer” the exact sequence of that ancestral protein.[71] The ability to now obtain any DNA sequence quickly
and easily makes reconstructing ancestral proteins a potentially powerful
tool in identifying novel enzymes with desirable functions. To this
effect, Furukawa et al. have identified an ancestor of 3-isopropylmalate
dehydrogenase (IPMDH), a key enzyme in the biosynthesis of leucine,
which offers improvements in its stability and activity over extant
IPMDH enzymes from present-day organisms through construction and
analysis of a phylogenetic tree.[72] Following
inference and identification of two ancestral protein sequences, dubbed
ancIPMDH-IQ and ancIPMDH-ML, the group successfully expressed each
protein in E. coli and, after isolating
the enzymes for further investigation, discovered they provided increased
thermal stability and improved catalytic activity at low temperatures
compared to their modern homologues.[72] This
work demonstrates just one of many potential uses for ancestral protein
reconstruction, as other reports describe how ancestral proteins might
possess higher degrees of substrate promiscuity compared to their
modern offspring, thus offering potentially valuable characteristics
to organic chemists seeking diverse and novel bond-forming activity.[73]

Despite their utility and newfound ease-of-use,
phylogenetic trees
can still prove overwhelming when examining extensive protein families
or groups of sequences.[74,75] Tools like sequence
similarity networks (SSNs) have emerged to help overcome these challenges.
SSNs have gained much attention since their introduction to the bioinformatics
community in 2003.[76] They provide a way
to visualize family-wide relationships and patterns in large groups
of protein sequences by grouping sequences into “clusters”
based on their alignment scores.[74−77] These networks comprise groups
of “nodes”, each representing a protein sequence or group
of sequences. These nodes are then connected by lines called “edges”,
each indicating that two nodes share sequence similarity above a threshold set by
the user (Figure 3B).
Changing this threshold controls which nodes group together, allowing
for inferences to be made about protein structure and functions by
examining and comparing the location of nodes within the clusters.[77] These networks can be constructed and analyzed
quickly and easily through a web-based tool called EFI-EST[75] and the free-to-download software Cytoscape.[76] Helpful tutorials and videos on how to construct,
use, and manipulate SSNs with these programs are also available for
free online.[75,76]

These networks can be beneficial
for chemists looking to identify
new enzymes for catalysis from families with a limited number of previously
characterized proteins. Lewis and co-workers have recently applied
SSNs to identify and profile novel flavin-dependent halogenase (FDH)
enzymes.[78] Using these networks to guide
their search, the group selected 128 initial halogenase sequences to
sample for useful halogenation activity. Following expression of the
genes, they obtained 87 soluble proteins for preliminary activity
screens with 12 initial substrates containing a mixture of phenols,
indoles, and anilines. Overall, the group identified 39 previously
uncharacterized halogenases that demonstrated unique bromination and/or
chlorination activity against the substrate panel. After examining
an additional 50 complex and bulky substrates, they discovered at
least one member of their halogenase library that demonstrated activity
with around 48% of the substrates tested. Ultimately, Lewis and co-workers
examined and characterized the preference for these FDHs toward bromination
and chlorination, their site-selectivity, and thermostability and
could draw further conclusions about trends in their SSNs through
this family-wide profiling.[78] This cutting-edge
application of SSNs demonstrates how free and straightforward Internet-based
software can be used to identify synthetically tractable biocatalysts
without the need to perform more complex mutagenesis and directed
evolution experiments.

Our group has also demonstrated the applicability
of SSNs to examine
previously uncharacterized enzymes with useful chemical functions.[36,74] We sought to identify homologous flavin-dependent monooxygenase
(FDMO) proteins to investigate the factors that control their site
and facial selectivity in an oxidative dearomatization reaction and
to identify enzymes suitable to enable a stereodivergent chemoenzymatic
natural product synthesis campaign.[36] Analysis
of an SSN comprised of over 45 000 sequences from the flavin
adenine dinucleotide (FAD) binding domain protein family (pfam01494)
identified several FDMOs that are highly similar to those our group
had investigated previously.[35] Combining
the experimental data gained from reactions of these enzymes in a
model system with comparisons of their sequence information and location
in the SSN allowed us to identify trends in the SSN that predict the
site-selectivity of a putative FDMO based on which cluster it is located
in. We envisioned this technique may also help predict the stereoselectivity
of the dearomatization mediated by a given FDMO, but further studies
suggest that this is much more finely controlled than what can be
predicted by a cursory SSN. Additional studies suggested two key
active site residues are crucial in controlling the stereochemical
outcome of the dearomatization reaction known to these proteins.[36,60] Though this does highlight a potential drawback of using SSNs in
this way, the tool did ultimately demonstrate its utility in identifying
other catalytically active proteins with desired activity. Work is
currently underway to further characterize these enzymes in hopes
of expanding our library of biocatalysts.

Before the development of
these tools for identifying and characterizing
enzymes in silico, obtaining biocatalysts for chemical
experimentation was a significant challenge. To investigate a wild-type
or naturally occurring catalytic protein, a molecular biologist would
first need to get the source DNA or RNA encoding the gene of interest
from the native organism. Following isolation, it is necessary to
amplify the DNA fragment through a polymerase chain reaction (PCR).[79] These amplified fragments must then be digested
with restriction enzymes and ligated into a circular piece of DNA
called a plasmid that has also been prepared with the same enzymes
to ensure the ends of these sequences are compatible.[80] Inserting DNA into a vector such as this not only allows
for the host organism to take up the gene of interest but can also
be used to impart properties like antibiotic resistance to the transformed
cells to allow for the selection of individual cells that have successfully
incorporated the plasmid. Following digestion, the prepared DNA fragment
and cut vector are then combined in the presence of a DNA ligase enzyme,
which efficiently joins the compatible ends of the fragment and vector,
resulting in the production of a so-called “recombinant plasmid”.[81] In the case of transforming E.
coli, one of the most popular and easy-to-use host
organisms for recombinant protein production, the recombinant plasmid
is then added to competent bacterial cells (cells that are primed
to take up foreign DNA from their surroundings). The cells can then
be grown on agar media possessing an antibiotic to prevent cells that
do not contain the plasmid from growing. After allowing the cells
to grow on the agar, a colony can be harvested and analyzed to ensure
that it possesses the desired gene. Finally, after ensuring the gene
is present and contains the correct sequence, the colony can be used
to seed a larger culture to harvest usable amounts of the desired
protein, as well as to produce more of the plasmid for additional
studies or to transform new cells with the desired gene without having
to undergo the entire process from scratch (Figure 3C).[81]

In
contrast to these traditional cloning techniques, technological
breakthroughs in modern gene synthesis provide a highly streamlined
process for chemists seeking DNA sequences and plasmids. Instead of
using isolated DNA from native organisms as a template to amplify,
solid-phase oligonucleotide synthesis allows for the de novo construction of any nucleotide sequence found online from individual
nucleotide bases.[82] Companies now offer
customized DNA constructs for purchase on-demand: input your insert
sequence of interest and choose the desired vector in their online
interface, and the company will ship you a ready-to-use recombinant
plasmid possessing your exact gene, or even a sample of host organisms
containing the plasmid, in a matter of weeks. The cost is dependent
on the number of base pairs in the DNA sequence and the particular
plasmid desired, but these DNA constructs can typically be purchased
for under 200 USD. Not only does this save time and effort in obtaining
the recombinant vector, it also allows for nearly anyone to take advantage
of this technology without the need for specialized equipment, reagents,
and knowledge required for traditional cloning. Inexpensive and straightforward
methods, reagents, and equipment for transforming, growing, and isolating
recombinant protein from cells containing a mail order plasmid also
lower the barrier for individuals and laboratories looking to enter
the field of biocatalysis.[83,84]

The tools and
techniques described above barely scratch the surface
of what is available for anyone interested in using and tuning biocatalysts
for a particular synthetic application.[75,85,86] Advances in the fields of directed evolution[52] and computer-guided enzyme engineering[87] promise to construct enzymes with ever-greater
efficiency, selectivity, stability, and reusability than those known
today. Leveraging combinations of these strategies has already begun
to provide highly applicable and useful biocatalysts to the synthetic
community at large and will continue to improve biocatalytic methods
as they are developed further.
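
As one concrete illustration of the sequence similarity networks described earlier in this section, the minimal Python sketch below shows how a user-set threshold determines which nodes are joined by edges and therefore how they fall into clusters. It assumes that pairwise alignment scores have already been computed; the node names and score values are hypothetical, and the function is not the EFI-EST or Cytoscape implementation, only a sketch of the underlying idea (clusters as connected components of the thresholded network).

```python
from collections import defaultdict


def ssn_clusters(nodes, pair_scores, threshold):
    """Return clusters of nodes; an edge joins two nodes scoring >= threshold."""
    neighbors = defaultdict(set)
    for (u, v), score in pair_scores.items():
        if score >= threshold:
            neighbors[u].add(v)
            neighbors[v].add(u)

    seen, clusters = set(), []
    for start in nodes:
        if start in seen:
            continue
        # Depth-first walk collects every node reachable through edges.
        stack, component = [start], set()
        seen.add(start)
        while stack:
            node = stack.pop()
            component.add(node)
            for nxt in neighbors[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        clusters.append(sorted(component))
    return clusters


# Hypothetical nodes and pairwise alignment scores (illustrative only).
nodes = ["enzA", "enzB", "enzC", "enzD", "enzE"]
pair_scores = {
    ("enzA", "enzB"): 120,
    ("enzA", "enzC"): 75,
    ("enzB", "enzC"): 80,
    ("enzC", "enzD"): 40,
    ("enzD", "enzE"): 110,
}

for threshold in (100, 70, 30):
    print(threshold, ssn_clusters(nodes, pair_scores, threshold))
```

Raising the threshold splits the network into smaller, more closely related clusters, while lowering it merges them, which mirrors the way the alignment score cutoff is tuned when exploring a protein family.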
State-of-the-Art Biocatalysis
Following
this explosion of interest in enzyme-mediated catalysis,
biocatalytic reactions are now increasingly employed in complex molecule
synthesis. Biocatalytic methods that effect late-stage site- and stereoselective
C–H functionalization constitute one of the best state-of-the-art
transformations available today that maximize step efficiency and
enable diversification of complex scaffolds. Select examples of biocatalytic
C–H functionalization in complex molecule synthesis are shown
in Figure 4A. Sherman
and co-workers have carried out a late-stage hydroxylation of the
macrolide natural product M-4365 G1 (9) to form antibiotic
juvenimicin B1 (10) with P450 monooxygenase TylI.[88] Late-stage biocatalytic C–H hydroxylation
has also been explored in the pursuit of steroid-based drugs.[89,90] Zhou and co-workers developed a biocatalytic C19 hydroxylation of
cortexolone (11) to form 19-hydroxycortexolone
(12) using TcP450-1, a cytochrome P450 enzyme.[90] This strategy enables direct access to bioactive
C19-hydroxylated steroids.[90] It is worth
mentioning that direct hydroxylation at the C19 position of steroids
is extremely challenging using traditional chemical methods.[91−93] Our research group’s long-standing interest in using enzymes
to carry out C–H hydroxylation reactions has been channeled
for the late-stage diversification of paralytic shellfish toxins.[94−97] We have employed the Rieske oxygenase SxtT to carry out the site-
and stereoselective hydroxylation of β-saxitoxinol (13), directly generating saxitoxin (14).[95] The Renata group recently disclosed a nonheme iron (NHI)
dependent enzymatic platform to enable late-stage biocatalytic hydroxylation
of complex terpene scaffolds.[98] The enzyme
P450BM3 MERO1M177A was employed in carrying out selective
C–H hydroxylation to form the oxidized terpene product 15.[98] Direct C–H hydroxylation
has also been developed for amino acid scaffolds. For example, Zaparucha
and co-workers discovered the NHI enzymes KDO1–3, which carry out selective hydroxylation
of l-lysine.[99,100] The enzymes KDO1 and KDO2/3
selectively hydroxylated the C3 and C4 positions of l-lysine, respectively,
and the enzyme KDO3 carried out C4 hydroxylation of a pre-C3-hydroxylated l-lysine.[99] Renata and co-workers
employed the KDO1 mediated C3-selective hydroxylation of l-lysine in their total synthesis of tambromycin.[101]
Figure 4
Biocatalysis in complex molecule synthesis: (A) selected C–H
functionalization reactions. (B) Selected C–C bond forming
reactions.
Rapid advances in biocatalysis have resulted in the identification
of enzymes that can carry out carbon–carbon (C–C) bond-forming
reactions (select examples in Figure 4B).[102] Balskus and co-workers
reported the enzyme CylK that carries out biocatalytic intermolecular
Friedel–Crafts alkylation of two halogenated resorcinol derivatives
to construct the cylindrocyclophane 19.[103] The enzyme CylK has also been shown to be highly promiscuous,
carrying out alkylation of a variety of resorcinol derivatives with
secondary alkyl halides.[104] Biocatalytic
Friedel–Crafts alkylation has also been carried out to synthesize
podophyllotoxin lignans.[105,106] For example, the NHI
enzyme 2-ODD-PH has been utilized to carry out the biocatalytic synthesis
of deoxypodophyllotoxin (20) and related analogs.[106−108] Biocatalytic oxidative phenolic coupling reactions are emerging
as powerful tools to construct complex molecules.[109−111] The Müller group recently reported fungal P450 enzymes capable
of carrying out oxidative coupling of coumarin derivatives in a site-
and stereoselective manner.[109] For example,
the enzyme KtnC catalyzes the synthesis of the bicoumarin P-orlandin (21).[109] Biocatalytic C–C bond formation has been explored in carbene
transfers to generate chiral cyclopropanes.[112−114] Arnold and co-workers first reported an engineered P450BM3 that carried out carbene transfer reactions. Diazoacetate reagents
were used as the carbene sources to carry out alkene cyclopropanation.[112] Several other groups have contributed to the
development of biocatalytic carbene transfer reactions, and these
have been applied toward the synthesis of pharmacologically relevant
compounds such as the TRPV1 inhibitor 25.[115,116] Biocatalytic carbene transfer reactions can be extended to alkynes
as well, where the first carbene transfer generates a cyclopropene
product which is primed for a second carbene transfer reaction to
generate stereopure bicyclobutane products.[117] This transformation rivals the best of what synthetic chemistry
has to offer in terms of building complexity through C–C bond
formation.
In the case of selective C–H functionalization and C–C
bond-forming reactions, biocatalysis is often employed at an advanced
stage or in the final step of a synthetic campaign. Alternatively,
biocatalysis can be engaged at an early stage in chemoenzymatic synthesis
planning (Figure 5).
In such cases, the product of a biocatalytic reaction is transformed
into a target molecule of interest using modern synthetic organic
chemistry tools. This strategic merger of biocatalysis and small-molecule-based
synthetic methods enables access to chemical scaffolds previously
unattainable using traditional chemical methods alone. For example,
Renata and co-workers developed a chemoenzymatic total synthesis of
the natural product manzacidin C (32).[118,119] The NHI-dependent enzyme GriE was employed to carry out selective
hydroxylation of an l-leucine derivative 30 to
form 31.[118] The product 31 was taken through established synthetic steps to formally
assemble manzacidin C (32).[118] Our group has been interested in the hydroxylative dearomatization
of resorcinol compounds using flavin-dependent monooxygenases (FDMOs).[35,36] We have employed the site- and stereoselectivity of FDMOs in conjunction
with small-molecule-based methods to enable the total synthesis of
azaphilone natural products.[36] For example,
the enzyme AzaH was used to carry out the dearomatization of resorcinol 33 to form 34. The quinol product 34 was subsequently transformed to (S)-trichoflectin
(35) using chemical methods.[36] Our group has also focused on developing benzylic hydroxylation
of o-cresol compounds using NHI-dependent monooxygenases.[120] For example, we have employed the enzyme ClaD
to carry out benzylic hydroxylation of resorcinol derivative 36, the product of which (37) undergoes spontaneous
loss of water resulting in a biocatalytically generated o-quinone methide, which was trapped using a chiral dienophile to
construct the bioactive natural product xyloketal D (38).[120] α-Deuterated amino acids are
important building blocks toward the synthesis of labeled pharmaceuticals
and biological probes; however, traditional methods to access these
compounds often require protecting group manipulations[121] and can be difficult to perform in a stereoselective
manner.[122] We discovered that SxtA AONS, an
α-oxoamine synthase evolved for saxitoxin biosynthesis, is capable
of deuterating a range of unprotected amino acids and their methyl
esters using D2O as the deuteron source. For example, deuteration
of alanine methyl ester (39) resulted in 40, which was subsequently transformed using chemical methods to access
the deuterium-labeled Parkinson’s pharmaceutical safinamide
(41).
Figure 5
Chemoenzymatic sequences to complex molecules. (A) Amino-acid
C–H
hydroxylation in the synthesis of manzacidin C. (B) Hydroxylative
dearomatization in the synthesis of azaphilone natural products. (C)
Benzylic hydroxylation en route to xyloketal D synthesis. (D) Alpha
deuteration of amino acids in the formation of deutero safinamide.
Multienzyme cascade reactions have been developed in industrial
and academic laboratories to enable complex molecule synthesis (select
examples in Figure 6). The process toward the HIV treatment drug islatravir (48) developed by Merck and Codexis is a representative example of a
multienzyme cascade employed on an industrial scale.[14] The artificial nucleoside islatravir (48)
was constructed using a combination of five enzymes from the nucleoside
salvage pathway in bacteria, which were each engineered for a distinct
purpose.[14] This protecting group-free cascade
delivered islatravir in markedly higher yields than previous
chemical syntheses.[14,123] Moore and co-workers developed
a multienzyme synthesis of the complex halogenated bacterial meroterpenoids napyradiomycins A1 and B1 (54 and 55) in
a single pot.[124] Starting with three organic
substrates (tetrahydroxynaphthalene 49, dimethylallylpyrophosphate,
and geranyl pyrophosphate), the team developed a catalytic sequence
involving five enzymes: two aromatic prenyltransferases (NapT8
and T9) and three vanadium dependent haloperoxidase (VHPO) homologues
(NapH1, H3, and H4) to assemble the complex halogenated metabolites
in milligram quantities.[124] Our group has
leveraged the exquisite reactivity of FDMOs and NHI-dependent monooxygenases
to construct tropolone natural products.[35,125] Tropolones are a structurally diverse class of bioactive molecules
that are characterized by a cycloheptatriene core bearing an
α-hydroxyketone functional group. We developed a two-step, biocatalytic
cascade to the tropolone natural product stipitatic aldehyde starting
with the resorcinol 56. Hydroxylative dearomatization
of 56 using TropB affords the quinol intermediate 57. The quinol intermediate undergoes oxidation by an α-KG
dependent NHI enzyme TropC to form a radical intermediate which undergoes
a net ring rearrangement to form stipitatic aldehyde 59.
Figure 6
Multienzyme biocatalytic sequences: (A) Merck’s biocatalytic
synthesis of islatravir. (B) Multienzyme synthesis of napyradiomycin
A1 and B1. (C) Multienzyme sequence toward the synthesis of tropolone
stipitatic aldehyde.
Biocatalytic methods
are poised to significantly expand the repertoire
of transformations possible in an organic chemist’s toolbox,
allowing greater access to chemical space than previously possible.
This creates an incentive for academic and industrial laboratories
to embrace biocatalytic methods. As interest in this field continues
to grow, it will most certainly inform the retrosynthetic logic of
modern organic synthesis and shape the next generation of methods.
Outlook and Conclusion
New technology and approaches in biocatalysis
continue to pave
the way for innovation and paint a bright future for this field. Enzymatic
catalysis has demonstrated utility in the construction of simple molecules
and holds promise for expanding synthetic access to new corners of
chemical space. The rapid technological advances surrounding biocatalyst
discovery, characterization, and application naturally raise the
question of what comes next in the field. We anticipate that the
amenability of biocatalysis to high-throughput experimentation will
shape the application of enzymatic catalysis in synthesis. For example,
we envision that compound libraries can be generated in plates
through biocatalysis. Considering the benign nature of biocatalytic
reactions, we anticipate biocatalytically generated compound libraries
can be directly coupled with biological assays as well, matching the
pace of compound generation with established high-throughput biological
assays to ultimately accelerate drug discovery.[126,127]
Continued progress in biocatalysis would benefit combinatorial
platforms for the synthesis of small-molecule-based compound libraries.
The idea of combinatorial biocatalysis platforms for library synthesis
has been around since the early 2000s; however, its widespread adoption
has been hindered by the lack of resources to identify and develop
promiscuous catalytic enzymes.[128,129] Combinatorial biocatalytic
syntheses are now taking shape with recent advances in contemporary
organic chemistry, synthetic biology, and bioinformatics. In addition,
studies of enzyme cocktails have shown that biocatalysts can operate
synergistically to complement each other’s substrate scopes,
creating useful catalyst mixtures to perform sequential chemical transformations.[130,131] With this precedent, as well as equipment for high-throughput experimentation
becoming more advanced and commonplace,[126] it seems only a matter of time before the high-throughput synthesis
of vast and diverse small molecule libraries mediated by combinatorial
biocatalysis is realized.
Without question, biocatalysis has
become a valued approach in
modern organic synthesis[126] and is a methodology
we will rely heavily on as the need to develop green alternatives
in chemistry grows.[17,132] With the rapid advances in the
field over the past few decades and the wealth of sequence data now
widely available, biocatalytic methods are more accessible than ever
before. As the global community adapts these techniques to their individual
needs, new ideas and strategies will take hold and continue to push
biocatalysis to the forefront of synthetic chemistry.