State-of-the-Art Biocatalysis.

Joshua B Pyser¹, Suman Chakrabarty¹, Evan O Romero¹, Alison R H Narayan¹.

Abstract

The use of enzyme-mediated reactions has transcended ancient food production to the laboratory synthesis of complex molecules. This evolution has been accelerated by developments in sequencing and DNA synthesis technology, bioinformatic and protein engineering tools, and the increasingly interdisciplinary nature of scientific research. Biocatalysis has become an indispensable tool applied in academic and industrial spheres, enabling synthetic strategies that leverage the exquisite selectivity of enzymes to access target molecules. In this Outlook, we outline the technological advances that have led to the field's current state. Integration of biocatalysis into mainstream synthetic chemistry hinges on increased access to well-characterized enzymes and the permeation of biocatalysis into retrosynthetic logic. Ultimately, we anticipate that biocatalysis is poised to enable the synthesis of increasingly complex molecules at new levels of efficiency and throughput.

Year: 2021 PMID: 34345663 PMCID: PMC8323117 DOI: 10.1021/acscentsci.1c00273

Source DB: PubMed Journal: ACS Cent Sci ISSN: 2374-7943 Impact factor: 14.553

Introduction

The utility of naturally occurring enzymes has been harnessed for thousands of years through fermentation and food preservation processes.[1] Fascination with the chemistry of microbes originated at the dawn of the Neolithic era, nearly 12 000 years ago, when humans began domesticating grains and consuming alcohol, the evidence of which can be found in archeological records.[2] In fact, arguments have been made that our use of alcohol produced enzymatically even predates archeological records, with evidence for ethanol-degrading enzymes present in primate species that lived before Homo sapiens.[2] Driven by curiosity, people sought to understand how and why leaving cereals or grapes[3] alone for some time caused them to adopt new properties, seeding fields like enzymology, molecular biology, and biocatalysis as an extension of this fascination.[4] Within the past few decades, biocatalysis in fine chemical and pharmaceutical production has surged.[5−7] This trend is driven in part by advances in DNA sequencing, bioinformatics, and protein engineering that allow for the identification of enzymes that meet the reactivity and selectivity needs of a given synthetic route.[8] Biocatalytic reactions are now routinely used in scalable processes ranging from simple chemical manipulations such as chiral resolutions,[9−11] reductive aminations,[9,12] and alcohol oxidations,[13] to complex, multistep chemoenzymatic cascades that enable access to high-value drug molecules on an industrial scale.[14] The rapid developments in biocatalysis are also enabling a re-emergence of natural products in the current era of genomics.[15] Natural products have a long history in drug discovery and development given their often potent biological activity. 
Still, the structural complexity of these compounds challenges chemists and demands a substantial investment of time and resources to synthesize them and their analogs.[16] It seems undeniable that the next logical step in synthetic chemistry is to leverage the machinery that Nature has developed to access a breadth of complexity similar to that exhibited by these compounds.

Biocatalytic methods also offer several key advantages over traditional chemical processes. These advantages include increased safety and sustainability, procedural simplicity, and the tunability of an enzyme’s reactivity or selectivity through protein engineering.[7] In particular, the sustainability profile of enzymatic transformations has motivated their adoption.[17] In contrast to the organometallic catalysts developed using precious metals, which are being rapidly depleted from the Earth’s crust,[18] enzymatic catalysts can be produced without the worry of exhausting limited resources. These advantages poise biocatalysis for adoption into the mainstream synthetic repertoire.[19] It is easier now than ever before for anyone, from enzyme novices to global biotech companies, to tap into the powerful transformations biocatalysis can offer.

In this Outlook, we explore the diverse field of biocatalysis and highlight recent advances in technology that have dramatically improved the accessibility of enzymes and driven the transition from simple biocatalytic systems to sophisticated complexity-generating biocatalytic platforms. Additionally, we highlight the advantages that biocatalysts offer in organic synthesis, the current state of the art in this field, and how advancing technologies will provide new opportunities for incorporating biocatalytic strategies in the synthesis of target molecules.

Early Applications of Biocatalysis in Synthesis

Despite their use in the fermentation process for millennia, enzymatic methods were first appreciated on the molecular level beginning in the early 1800s[20] when researchers began investigating yeasts for their fermentation abilities.[21] Though it was first believed that the entire microorganism itself was functioning as the catalyst, the discovery of the first enzyme mixture, called “diastase”, fundamentally changed the field by demonstrating that observed reactions were mediated by only specific parts of the organism.[22] This sparked increased interest in the then-new field of enzymology, and crucial milestones in understanding enzymes followed. These include the development of the lock and key model,[23] cell-free fermentation,[24] the realization that enzymes were in fact proteins,[25] and the elucidation of the DNA structure,[26] all of which have paved the way for modern biocatalysis (Figure ).[21] With the invention and adoption of X-ray crystallography, researchers were finally able to view the three-dimensional structure of these miraculous macromolecules in detail and gain insight into their functions and mechanisms.[27] The rich history of enzymology has been reviewed in more detail previously.[21]

Figure 1

(A) Early uses of enzyme-mediated transformations, such as fermentation, chiral resolutions, and functional group interconversions. (B) Recent advances in genome sequencing, gene synthesis, and bioinformatics increase the accessibility of obtaining enzymes. (C) Select strategies in modern biocatalysis include cascades, chemoenzymatic synthesis, and enzyme evolution.

Over much of the 20th century and into the early 2000s, the use of enzymes to perform useful chemistry truly gained popularity.[21] Enzyme-mediated kinetic resolutions were one of the most common initial uses of biocatalysis in synthesis. Though several different classes of enzymes have been applied to conduct these enantiomeric enrichments,[9] lipases are commonly employed to effect this transformation based on their commercial availability, large substrate scope, high levels of selectivity, and cofactor-free catalysis.[11] Lipases are also commonly used in the production of cheese products and laundry detergents;[28] the first member of this enzyme class was discovered in 1848 by Claude Bernard in his investigation of pancreatic secretions.[29] Initial experimentation with lipases in the 1930s[30] and 1940s[31] laid the groundwork for their use in kinetic resolutions and other biocatalytic transformations for the rest of the 1900s.[11,28] A relatively recent example published by Kaga and co-workers in 2003 demonstrates the simplicity of using lipases in a more modern biocatalytic setting: to construct a small library of chiral hemiaminals through the dynamic kinetic resolution of racemic starting materials.[10] The group first screened a set of commercially available lipase enzymes for acylation activity against their library of racemic N-acylhemiaminals and determined that lipase QL gave short reaction times and operated with high levels of enantioselectivity. With this enzyme, they constructed several O-acylated hemiaminals in quantitative yields and with high to exquisite enantioselectivities (Figure A). Lipase QL is just one of many examples that demonstrate the early and widespread use of lipases and other kinetic resolution enzymes in academia and industry.[28]

Figure 2

(A) Dynamic kinetic resolution of racemic N-acylhemiaminals by a lipase. (B) NAD(P)H recycling system developed by Wong and Whitesides. (C) Cascade system for construction of chiral amines using an ω-transaminase. Abbreviations: G6PDH glucose-6-phosphate dehydrogenase, DH dehydrogenase, TA transaminase, L-AADH l-α-amino acid dehydrogenase.

The design of systems for in situ cofactor regeneration is a significant milestone in biocatalytic method development.[32] Early studies of cofactor-dependent enzymes in synthesis relied on the addition of stoichiometric quantities of these cofactors, which limited the utility of the enzymatic reaction. Thus, the ability to continuously recycle these essential components in the reaction mixture was critical to the practical use of certain biocatalysts.[33] The early work of Wong and Whitesides on the regeneration of NAD(P)H in situ to enable reductions by dehydrogenase enzymes demonstrates this method’s capability and has since made an enormous impact on the field.[34−36] To apply these dehydrogenases toward the construction of chiral alcohols, they developed the use of glucose-6-phosphate dehydrogenase (G6PDH) from L. mesenteroides to reduce the NAD(P)+ cofactor in situ following its oxidation by the dehydrogenase. G6PDH relies on glucose-6-phosphate, which is inexpensive and easy to synthesize, to provide the equivalent of hydride needed to reduce NAD(P)+ to NAD(P)H (Figure B). With this recycling system in place, Wong and Whitesides completed the biocatalytic generation of optically pure D-lactic acid, threo-Ds(+)-isocitric acid, and (S)-benzyl-α-d1 alcohol.[34] This early example of a biocatalytic cascade has since enabled the use of many enzyme-catalyzed reductions and has paved the way for application on an industrial scale.[37] Since this preliminary work, methods relying on electrochemistry, photochemistry, and other hydride donor/acceptor systems have been developed. 
For example, readily available reagents like isopropanol have been used in cofactor regeneration systems, providing an alternative to the more expensive sugars used previously.[32] A variety of more economical and industrially feasible sacrificial functional group donors have also been applied to improve efficiency, scalability, and ease of use of cofactor regeneration.[38,39] Several reports describe the use of isopropyl amine as an amino donor for transamination reactions, providing a substitute for the cost prohibitive amino acids used conventionally.[40,41] There is also a focus on the construction and regeneration of synthetic, biomimetic cofactors, which holds promise for increasing the effectiveness of these systems further.[42] The rapid adoption of recycling system methods allowed for the broad application of biocatalytic reduction reactions.[37] The use of transaminases for constructing chiral amines is a prime example, and their utility in synthesis has been showcased in the synthesis of drug molecules such as sitagliptin.[43,44] Transaminases offer many advantages over their chemical counterparts, including improved stereoselectivity, mild reaction conditions, and reducing the reliance on harmful solvents and transition metals.[43,45] A notable example of the application of ω-transaminases, a subgroup of transaminases that has drawn particular attention in the pharmaceutical industry,[46] was developed by Koszelewski and co-workers. Ultimately relying on a system similar to that produced by Wong and Whitesides to power catalysis, the Koszelewski group constructed nine chiral amines with excellent enantiomeric excess through a reductive amination with the commercial ω-transaminase ATA-113.[12] They also employed a second enzyme, l-α-amino acid dehydrogenase (L-AADH), to further streamline this reaction by regenerating the amino acid alanine in situ, which is the amine source for the transamination reaction. 
L-AADH, enabled by the NAD(P)H recycling system, utilizes an equivalent of ammonium as the ultimate nitrogen source to reduce pyruvate to the desired alanine (Figure C). Albeit a relatively simple transformation by today’s standards, this early work serves as a quintessential example of synthetic utility of transaminases. The seminal work highlighted here, alongside other early examples of simple biocatalytic reactions such as isomerizations, redox manipulations, and ligations,[47] brought to light the power of enzymes as catalysts in synthesis. In the more modern history of biocatalysis, there has been a paradigm shift from using enzymes to construct relatively simple building blocks or provide chiral intermediates for traditional syntheses,[48] to relying on them for late-stage synthetic modifications,[49] combining molecule fragments toward value-added compounds, and conducting multistep, biocatalytically mediated total syntheses.[14,37] Additionally, the tools for investigating and leveraging biocatalysts for synthetic uses have reached a stage where they are widely accessible to the chemistry community: obtaining the knowledge and equipment needed for biocatalysis can be accomplished with just a few clicks.

Accessibility of Biocatalysis to Synthetic Chemists

Enzymatic catalysis was once relegated to the fields of biochemistry and molecular biology, but recent advances in bioinformatics,[50] DNA sequencing,[51] protein engineering,[52] and DNA synthesis have made it possible for virtually anyone to take advantage of enzymatic catalysts and tailor them to their own needs. The process of identifying, producing, isolating, and tuning the reactivity of biocatalysts for desired transformations is now as accessible to synthetic chemists as obtaining and using small molecule catalysts. In particular, the recent exponential growth in annotated protein sequences available in online databases has created an enormous catalog of potential enzymes to serve many synthetic needs. Two of the most popular databases, UniProt[53] and GenBank,[54] now house information on more than 420 000 individual species, representing over one billion total sequence records. Rather than requiring researchers to take to the field and collect specimens by hand to examine their genes, these databases store a wealth of information on protein sequence and origin and are a valuable starting point for anyone looking to identify enzymes for a given synthetic purpose.[55] Combining the vast amount of data stored in these online libraries with bioinformatic tools allows one to begin making predictions about the function of uncharacterized or “hypothetical” proteins,[56] and to search for previously identified proteins that may also demonstrate activity in a noncanonical transformation.[57] For example, the basic local alignment search tool (BLAST) is one of the most popular and easiest-to-use tools for this type of analysis.[58,59] Gaining popularity in the early 1990s and now available to use for free on the National Center for Biotechnology Information (NCBI) Web site,[59] this tool relies on algorithms to search available online databases for protein sequences that resemble a given input sequence. 
By feeding the BLAST search engine a known nucleotide or amino acid sequence, or a protein identifier such as an accession number, the tool can align all known protein sequences that share similarity with the input sequence and rank them in a list. As minute changes in the order or position of amino acid residues can drastically alter function between homologous proteins with highly similar sequences, this type of search can be advantageous when trying to identify enzymes with improved stability and activity, complementary substrate scopes, or proteins that can perform desired transformations with the alternative site- and/or stereoselectivity to the one used to build the query.[36,60,61] This tool also provides known information about each sequence, such as the originating organism and any characterized metabolic function of the protein within said organism. By displaying data on the degree of similarity between proteins based on how well their sequences align, a user can quickly identify any known proteins that may share functional characteristics with the input protein sequence.[62] Albeit a useful starting point, this list format provided by BLAST can become cumbersome when the search yields thousands of potentially related protein sequences. To obtain a more comprehensive view of entire protein families, some of which can contain hundreds of thousands of proteins,[63] tools have been developed that provide greater context for viewing connections within these groups. 
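The idea of scoring database sequences against a query and returning a ranked list can be illustrated with a small sketch. This is not how BLAST is implemented: BLAST uses fast heuristics and substitution matrices, whereas the code below computes an exhaustive Smith-Waterman local alignment score with simple match/mismatch/gap values. The scoring values, function names, and toy sequences are all assumptions made for illustration.

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best Smith-Waterman local alignment score between two sequences.

    Uses a toy scoring scheme (assumed values) with linear gap penalties.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, hence the floor at 0.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def rank_hits(query, database):
    """Return (name, score) pairs sorted from most to least similar."""
    hits = [(name, local_alignment_score(query, seq))
            for name, seq in database.items()]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Hypothetical query and database entries, invented for this sketch.
TOY_QUERY = "MKTAYIAKQR"
TOY_DATABASE = {
    "unrelated": "GGGGGGGGGG",
    "one_substitution": "MKTAYLAKQR",
    "identical": "MKTAYIAKQR",
}

if __name__ == "__main__":
    for name, score in rank_hits(TOY_QUERY, TOY_DATABASE):
        print(name, score)
```

Even in this toy setting, a single substitution lowers the score only slightly, which is why homologues with subtly different residues tend to appear near the top of such a list.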
Phylogenetic trees are commonly used to examine relationships between homologous proteins and study changes in protein families over their evolution.[64] This bioinformatic analysis technique relies on the alignment of homologous protein sequences to construct a visual representation of the evolutionary history of the related sequences in a phylogenetic tree (Figure A).[64] Building and visualizing these trees has also been simplified by programs like Molecular Evolutionary Genetics Analysis (MEGA)[65,66] and Ensembl[67] that provide straightforward user interfaces. Once various algorithms and search tools are applied to analyze all available data and establish the most likely configuration, the trees can be examined to draw conclusions about relatedness among protein families and test hypotheses about their evolutionary origins.[68,69]
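As a rough illustration of how an alignment can be turned into a tree, the sketch below applies UPGMA, one of the simplest distance-based clustering methods, to a handful of pre-aligned toy sequences. It is only a sketch under stated assumptions: the sequences and labels are invented, distances are plain fractions of differing positions, and dedicated phylogenetics software typically offers more rigorous methods and statistical support values.

```python
from itertools import combinations


def p_distance(a, b):
    """Fraction of aligned positions that differ between two sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def upgma(aligned):
    """Cluster sequences by UPGMA and return a Newick-style topology string."""
    sizes = {name: 1 for name in aligned}  # cluster label -> number of leaves
    dist = {frozenset(pair): p_distance(aligned[pair[0]], aligned[pair[1]])
            for pair in combinations(aligned, 2)}
    while len(sizes) > 1:
        # Closest pair of clusters; sorting labels makes ties deterministic.
        pair = min(dist, key=lambda p: (dist[p], sorted(p)))
        a, b = sorted(pair)
        merged = f"({a},{b})"
        for other in sizes:
            if other in pair:
                continue
            # Size-weighted average distance from the merged cluster.
            d = (dist[frozenset((a, other))] * sizes[a]
                 + dist[frozenset((b, other))] * sizes[b]) / (sizes[a] + sizes[b])
            dist[frozenset((merged, other))] = d
        # Drop every distance that involves the two clusters just merged.
        dist = {p: d for p, d in dist.items() if a not in p and b not in p}
        sizes[merged] = sizes.pop(a) + sizes.pop(b)
    return next(iter(sizes)) + ";"


# Hypothetical aligned sequences, invented for this sketch.
TOY_ALIGNMENT = {
    "A": "AAAAAAAA",
    "B": "AAAAAAAT",
    "C": "CCCCAAAA",
    "D": "CCCCAAAG",
}

if __name__ == "__main__":
    print(upgma(TOY_ALIGNMENT))
```

With these toy inputs, A pairs with B and C pairs with D before the two pairs are joined, mirroring how closely related homologues end up on neighboring branches.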

Figure 3

(A) Conceptual phylogenetic tree depicting locations of calculated ancestral sequences. (B) Conceptual SSN demonstrating nodes, edges, and clusters. (C) Workflow for a traditional cloning procedure.

For example, one intriguing use of phylogenetic analyses in the context of biocatalysis is the identification and reconstruction of ancestral protein sequences (Figure A) that can offer benefits in stability and biocatalytic activity over their modern “offspring”.[70] This technique relies on software to compare related protein sequences that most likely evolved from a common ancestor to calculate or “infer” the exact sequence of that ancestral protein.[71] The ability to now obtain any DNA sequence quickly and easily makes reconstructing ancestral proteins a potentially powerful tool in identifying novel enzymes with desirable functions. To this effect, Furukawa et al. have identified an ancestor of 3-isopropylmalate dehydrogenase (IPMDH), a key enzyme in the biosynthesis of leucine, which offers improvements in its stability and activity over extant IPMDH enzymes from present-day organisms through construction and analysis of a phylogenetic tree.[72] Following inference and identification of two ancestral protein sequences, dubbed ancIPMDH-IQ and ancIPMDH-ML, the group successfully expressed each protein in E. 
coli and, after isolating the enzymes for further investigation, discovered they provided increased thermal stability and improved catalytic activity at low temperatures compared to their modern homologues.[72] This work demonstrates just one of many potential uses for ancestral protein reconstruction, as other reports describe how ancestral proteins might possess higher degrees of substrate promiscuity compared to their modern offspring, thus offering potentially valuable characteristics to organic chemists seeking diverse and novel bond-forming activity.[73] Despite their utility and newfound ease-of-use, phylogenetic trees can still prove overwhelming when examining extensive protein families or groups of sequences.[74,75] Tools like sequence similarity networks (SSNs) have emerged to help overcome these challenges. SSNs have gained much attention since their introduction to the bioinformatics community in 2003.[76] They provide a way to visualize family-wide relationships and patterns in large groups of protein sequences by grouping sequences into “clusters” based on their alignment scores.[74−77] These networks comprise “nodes,” each representing a protein sequence or group of sequences. Nodes are connected by lines called “edges” whenever their sequence similarity meets a threshold that can be set by the user (Figure B). 
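The node, edge, and cluster vocabulary can be made concrete with a minimal sketch. It assumes pairwise alignment scores have already been computed, keeps an edge only when the score meets the chosen threshold, and reports clusters as connected components. The enzyme labels and scores are hypothetical, and the code is not a reimplementation of any SSN tool.

```python
def ssn_clusters(nodes, pair_scores, threshold):
    """Group nodes into clusters using edges whose score >= threshold.

    Clusters are the connected components of the thresholded network,
    found here with a simple union-find structure.
    """
    parent = {node: node for node in nodes}

    def find(node):
        # Follow parent links to the cluster representative.
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    for (a, b), score in pair_scores.items():
        if score >= threshold:  # this pair is drawn as an edge
            parent[find(a)] = find(b)

    clusters = {}
    for node in nodes:
        clusters.setdefault(find(node), []).append(node)
    # Largest clusters first, then alphabetical, for stable output.
    return sorted((sorted(c) for c in clusters.values()),
                  key=lambda c: (-len(c), c))


# Hypothetical sequences and pairwise alignment scores.
TOY_NODES = ["enzA", "enzB", "enzC", "enzD", "enzE"]
TOY_SCORES = {
    ("enzA", "enzB"): 90,
    ("enzB", "enzC"): 60,
    ("enzC", "enzD"): 30,
    ("enzD", "enzE"): 85,
}

if __name__ == "__main__":
    for threshold in (20, 50, 80):
        print(threshold, ssn_clusters(TOY_NODES, TOY_SCORES, threshold))
```

Raising the threshold from 20 to 80 in this toy network splits one large cluster into progressively smaller ones, which is the behavior a user tunes when looking for clusters that track a shared function.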
Changing this score controls which nodes group together, allowing for inferences to be made about protein structure and functions by examining and comparing the location of nodes within the clusters.[77] These networks can be constructed and analyzed quickly and easily through a web-based tool called EFI-EST[75] and the free-to-download software Cytoscape.[76] Helpful tutorials and videos on how to construct, use, and manipulate SSNs with these programs are also available for free online.[75,76] These networks can be beneficial for chemists looking to identify new enzymes for catalysis from families with a limited number of previously characterized proteins. Lewis and co-workers have recently applied SSNs to identify and profile novel flavin-dependent halogenase (FDH) enzymes.[78] Using these networks to guide their search, the group elected 128 initial halogenase sequences to sample for useful halogenation activity. Following expression of the genes, they obtained 87 soluble proteins for preliminary activity screens with 12 initial substrates containing a mixture of phenols, indoles, and anilines. Overall, the group identified 39 previously uncharacterized halogenases that demonstrated unique bromination and/or chlorination activity against the substrate panel. After examining an additional 50 complex and bulky substrates, they discovered at least one member of their halogenase library that demonstrated activity with around 48% of the substrates tested. 
Ultimately, Lewis and co-workers examined and characterized the preference for these FDHs toward bromination and chlorination, their site-selectivity, and thermostability and could draw further conclusions about trends in their SSNs through this family wide profiling.[78] This cutting-edge application of SSNs demonstrates how free and straightforward Internet-based software can be used to identify synthetically tractable biocatalysts without the need to perform more complex mutagenesis and directed evolution experiments. Our group has also demonstrated the applicability of SSNs to examine previously uncharacterized enzymes with useful chemical functions.[36,74] We sought to identify homologous flavin-dependent monooxygenase (FDMO) proteins to investigate the factors that control their site and facial selectivity in an oxidative dearomatization reaction and to identify enzymes suitable to enable a stereodivergent chemoenzymatic natural product synthesis campaign.[36] Analysis of an SSN comprised of over 45 000 sequences from the flavin adenine dinucleotide (FAD) binding domain protein family (pfam01494) identified several FDMOs that are highly similar to those our group had investigated previously.[35] Combining the experimental data gained from reactions of these enzymes in a model system with comparisons of their sequence information and location in the SSN allowed us to identify trends in the SSN that predict the site-selectivity of a putative FDMO based on which cluster it is located in. We envisioned this technique may also help predict the stereoselectivity of the dearomatization mediated by a given FDMO, but further studies suggest that this is much more finely controlled than what can be predicted by a precursory SSN. 
Additional studies suggested two key active site residues are crucial in controlling the stereochemical outcome of the dearomatization reaction known to these proteins.[36,60] Though this does highlight a potential drawback of using SSNs in this way, the tool did ultimately demonstrate its utility in identifying other catalytically active proteins with desired activity. Work is currently underway to further characterize these enzymes in hopes of expanding our library of biocatalysts. Before developing these tools for identifying and characterizing enzymes in silico, obtaining biocatalysts for chemical experimentation was a significant challenge. To investigate a wild-type or naturally occurring catalytic protein, a molecular biologist would first need to get the source DNA or RNA encoding the gene of interest from the native organism. Following isolation, it is necessary to amplify the DNA fragment through a polymerase chain reaction (PCR).[79] These amplified fragments must then be digested with restriction enzymes and ligated into a circular piece of DNA called a plasmid that has also been prepared with the same enzymes to ensure the ends of these sequences are compatible.[80] Inserting DNA into a vector such as this not only allows for the host organism to uptake the gene of interest but can also be used to impart properties like antibiotic resistance to the transfected cells to allow for the selection of individual cells that have successfully incorporated the plasmid. Following digestion, the prepared DNA fragment and cut vector are then combined in the presence of a DNA ligase enzyme, which efficiently joins the compatible ends of the fragment and vector, resulting in the production of a so-called “recombinant plasmid”.[81] In the case of transforming E. 
coli, one of the most popular and easy-to-use host organisms for recombinant protein production, the recombinant plasmid is then added to competent bacterial cells (cells that are primed to uptake foreign DNA from their surroundings). The cells can then be grown on agar media possessing an antibiotic to prevent cells that do not contain the plasmid from growing. After allowing the cells to grow on the agar, a colony can be harvested and analyzed to ensure that it possesses the desired gene. Finally, after ensuring the gene is present and contains the correct sequence, the colony can be used to seed a larger culture to harvest usable amounts of the desired protein, as well as to produce more of the plasmid for additional studies or to transfect new cells with the desired gene without having to undergo the entire process from scratch (Figure C).[81] In contrast to these traditional cloning techniques, technological breakthroughs in modern gene synthesis provide a highly streamlined process for chemists seeking DNA sequences and plasmids. Instead of using isolated DNA from native organisms as a template to amplify, solid-phase oligonucleotide synthesis allows for the de novo construction of any nucleotide sequence found online from individual nucleotide bases.[82] Companies now offer customized DNA constructs for purchase on-demand: input your insert sequence of interest and choose the desired vector in their online interface, and the company will ship you a ready-to-use recombinant plasmid possessing your exact gene, or even a sample of host organisms containing the plasmid, in a matter of weeks. The cost is dependent on the number of base pairs in the DNA sequence and the particular plasmid desired, but these DNA constructs can typically be purchased for under 200 USD. 
Not only does this save time and effort in obtaining the recombinant vector, it also allows nearly anyone to take advantage of this technology without the need for the specialized equipment, reagents, and knowledge required for traditional cloning. Inexpensive and straightforward methods, reagents, and equipment for transforming, growing, and isolating recombinant protein from cells containing a mail-order plasmid also lower the barrier for individuals and laboratories looking to enter the field of biocatalysis.[83,84] The tools and techniques described above barely scratch the surface of what is available for anyone interested in using and tuning biocatalysts for a particular synthetic application.[75,85,86] Advances in the fields of directed evolution[52] and computer-guided enzyme engineering[87] promise to construct enzymes with ever-greater efficiency, selectivity, stability, and reusability than those known today. Leveraging combinations of these strategies has already begun to provide highly applicable and useful biocatalysts to the synthetic community at large and will continue to improve biocatalytic methods as they are developed further.
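For readers less familiar with the traditional cloning workflow described in this section, the restriction digestion step can be sketched in a few lines. The example assumes the enzyme EcoRI, which recognizes GAATTC and cuts after the first G; this enzyme is an illustrative choice that the text does not specify, and the DNA sequence is invented. Only the top strand of a linear fragment is modeled.

```python
def digest(seq, site="GAATTC", cut_offset=1):
    """Cut a linear DNA sequence at every occurrence of a recognition site.

    `cut_offset` is the number of bases of the site kept on the upstream
    fragment (1 for EcoRI, which cuts G^AATTC). Top strand only.
    """
    seq = seq.upper()
    fragments = []
    start = 0
    pos = seq.find(site)
    while pos != -1:
        fragments.append(seq[start:pos + cut_offset])
        start = pos + cut_offset
        pos = seq.find(site, pos + 1)
    fragments.append(seq[start:])
    return fragments


# Hypothetical insert flanked by two EcoRI sites.
TOY_DNA = "AAGAATTCGGGGAATTCTT"

if __name__ == "__main__":
    print(digest(TOY_DNA))
```

Every fragment downstream of a cut begins with the same AATTC bases, which reflects why an insert and a vector prepared with the same enzyme have compatible ends for ligation.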

State-of-the-Art Biocatalysis

Following this explosion of interest in enzyme-mediated catalysis, biocatalytic reactions are now increasingly employed in complex molecule synthesis. Biocatalytic methods that effect late-stage site- and stereoselective C–H functionalization are among the most powerful transformations available today, maximizing step efficiency and enabling diversification of complex scaffolds. Select examples of biocatalytic C–H functionalization in complex molecule synthesis are shown in Figure A. Sherman and co-workers have carried out a late-stage hydroxylation of the macrolide natural product M-4365 G1 (9) to form antibiotic juvenimicin B1 (10) with P450 monooxygenase TylI.[88] Late-stage biocatalytic C–H hydroxylation has also been explored in the pursuit of steroid-based drugs.[89,90] Zhou and co-workers developed a biocatalytic C19 hydroxylation of cortexolone (11) to form 19-hydroxycortexolone (12) using TcP450-1, a cytochrome P450 enzyme.[90] This strategy enables direct access to bioactive C19-hydroxylated steroids.[90] It is worth mentioning that direct hydroxylation at the C19 position of steroids is extremely challenging using traditional chemical methods.[91−93] Our research group’s long-standing interest in using enzymes to carry out C–H hydroxylation reactions has been channeled toward the late-stage diversification of paralytic shellfish toxins.[94−97] We have employed the Rieske oxygenase SxtT to carry out the site- and stereoselective hydroxylation of β-saxitoxinol (13), directly generating saxitoxin (14).[95] The Renata group recently disclosed a nonheme iron (NHI) dependent enzymatic platform to enable late-stage biocatalytic hydroxylation of complex terpene scaffolds.[98] The enzyme P450BM3 MERO1M177A was employed in carrying out selective C–H hydroxylation to form the oxidized terpene product 15.[98] Direct C–H hydroxylation has also been developed for amino acid scaffolds. 
For example, Zaparucha discovered the NHI enzyme KDO1-3 that carried out selective hydroxylation of l-lysine.[99,100] The enzymes KDO1 and KDO2/3 selectively hydroxylated the C3 and C4 positions of l-lysine, and the enzyme KDO3 carried out C4 hydroxylation of a pre-C3-hydroxylated l-lysine.[99] Renata and co-workers employed the KDO1 mediated C3-selective hydroxylation of l-lysine in their total synthesis of tambromycin.[101]

Figure 4

Biocatalysis in complex molecule synthesis: (A) selected C–H functionalization reactions. (B) Selected C–C bond forming reactions.

Rapid advances in biocatalysis have resulted in the identification of enzymes that can carry out carbon–carbon (C–C) bond-forming reactions (select examples in Figure B).[102] Balskus and co-workers reported the enzyme CylK that carries out biocatalytic intermolecular Friedel–Crafts alkylation of two halogenated resorcinol derivatives to construct the cylindrocyclophane 19.[103] The enzyme CylK has also been shown to be highly promiscuous, carrying out alkylation of a variety of resorcinol derivatives with secondary alkyl halides.[104] Biocatalytic Friedel–Crafts alkylation has also been carried out to synthesize podophyllotoxin lignans.[105,106] For example, the NHI enzyme 2-ODD-PH has been utilized to carry out the biocatalytic synthesis of deoxypodophyllotoxin (20) and related analogs.[106−108] Biocatalytic oxidative phenolic coupling reactions are emerging as powerful tools to construct complex molecules.[109−111] The Müller group recently reported fungal P450 enzymes capable of carrying out oxidative coupling of coumarin derivatives in a site- and stereoselective manner.[109] For example, the enzyme KtnC catalyzes the synthesis of the bicoumarin P-orlandin (21).[109] Biocatalytic C–C bond formation has been explored in carbene transfers to generate chiral cyclopropanes.[112−114] Arnold and co-workers first reported an engineered P450BM3 that carried out carbene transfer reactions. 
Diazoacetate reagents were used as the carbene sources to carry out alkene cyclopropanation.[112] Several other groups have contributed to the development of biocatalytic carbene transfer reactions, and these have been applied toward the synthesis of pharmacologically relevant compounds such as the TRPV1 inhibitor 25.[115,116] Biocatalytic carbene transfer reactions can be extended to alkynes as well, where the first carbene transfer generates a cyclopropene product which is primed for a second carbene transfer reaction to generate stereopure bicyclobutane products.[117] This transformation rivals the best of what synthetic chemistry has to offer in terms of building complexity through C–C bond formation. In the case of selective C–H functionalization and C–C bond-forming reactions, biocatalysis is often employed at an advanced stage or in the final step of a synthetic campaign. Alternatively, biocatalysis can be engaged at an early stage in chemoenzymatic synthesis planning (Figure 5). In such cases, the product of a biocatalytic reaction is transformed into a target molecule of interest using modern synthetic organic chemistry tools. This strategic merger of biocatalysis and small-molecule-based synthetic methods enables access to chemical scaffolds previously unattainable using traditional chemical methods alone. 
For example, Renata and co-workers developed a chemoenzymatic total synthesis of the natural product manzacidin C (32).[118,119] The NHI-dependent enzyme GriE was employed to carry out selective hydroxylation of an l-leucine derivative 30 to form 31.[118] The product 31 was taken through established synthetic steps to formally assemble manzacidin C (32).[118] Our group has been interested in the hydroxylative dearomatization of resorcinol compounds using flavin-dependent monooxygenases (FDMOs).[35,36] We have employed the site- and stereoselectivity of FDMOs in conjunction with small-molecule-based methods to enable the total synthesis of azaphilone natural products.[36] For example, the enzyme AzaH was used to carry out the dearomatization of resorcinol 33 to form 34. The quinol product 34 was subsequently transformed to (S)-trichoflectin (35) using chemical methods.[36] Our group has also focused on developing benzylic hydroxylation of o-cresol compounds using NHI-dependent monooxygenases.[120] For example, we have employed the enzyme ClaD to carry out benzylic hydroxylation of resorcinol derivative 36, the product of which (37) undergoes spontaneous loss of water, resulting in a biocatalytically generated o-quinone methide that was trapped using a chiral dienophile to construct the bioactive natural product xyloketal D (38).[120] α-Deuterated amino acids are important building blocks toward the synthesis of labeled pharmaceuticals and biological probes; however, traditional methods to access these compounds often require protecting group manipulations[121] and can be difficult to perform in a stereoselective manner.[122] We discovered that SxtA AONS, an α-oxoamine synthase evolved for saxitoxin biosynthesis, is capable of deuterating a range of unprotected amino acids and their methyl esters using D2O as the deuteron source. 
For example, deuteration of alanine methyl ester (39) resulted in 40, which was subsequently transformed using chemical methods to access the deuterium-labeled Parkinson’s pharmaceutical safinamide (41).

Figure 5

Chemoenzymatic sequences to complex molecules. (A) Amino-acid C–H hydroxylation in the synthesis of manzacidin C. (B) Hydroxylative dearomatization in the synthesis of azaphilone natural products. (C) Benzylic hydroxylation en route to xyloketal D synthesis. (D) Alpha deuteration of amino acids in the formation of deutero safinamide.

Multienzyme cascade reactions have been developed in industrial and academic laboratories to enable complex molecule synthesis (select examples in Figure 6). The process toward the HIV treatment drug islatravir (48) developed by Merck and Codexis is a representative example of a multienzyme cascade employed on an industrial scale.[14] The artificial nucleoside islatravir (48) was constructed using a combination of five enzymes from the nucleoside salvage pathway in bacteria, which were each engineered for a distinct purpose.[14] This protecting group-free cascade delivered islatravir in markedly higher yields than previous chemical syntheses.[14,123] Moore and co-workers developed a multienzyme synthesis of the complex halogenated bacterial meroterpenoids napyradiomycins A1 and B1 (54 and 55) in a single pot.[124] Starting with three organic substrates (tetrahydroxynaphthalene 49, dimethylallylpyrophosphate, and geranyl pyrophosphate), the team developed a catalytic sequence involving five enzymes: two aromatic prenyltransferases (NapT8 and T9) and three vanadium-dependent haloperoxidase (VHPO) homologues (NapH1, H3, and H4) to assemble the complex halogenated metabolites in milligram quantities.[124] Our group has leveraged the exquisite reactivity of FDMOs and NHI-dependent monooxygenases to construct tropolone natural products.[35,125] Tropolones are a structurally diverse class of bioactive molecules that are characterized by a cycloheptatriene core bearing an α-hydroxyketone functional group. We developed a two-step, biocatalytic cascade to the tropolone natural product stipitatic aldehyde starting with the resorcinol 56. 
Hydroxylative dearomatization of 56 using TropB affords the quinol intermediate 57. The quinol intermediate undergoes oxidation by the α-KG-dependent NHI enzyme TropC to form a radical intermediate, which undergoes a net ring rearrangement to form stipitatic aldehyde 59.

Figure 6

Multienzyme biocatalytic sequences: (A) Merck’s biocatalytic synthesis of islatravir. (B) Multienzyme synthesis of napyradiomycin A1 and B1. (C) Multienzyme sequence toward the synthesis of tropolone stipitatic aldehyde.

Biocatalytic methods are poised to significantly expand the repertoire of transformations possible in an organic chemist’s toolbox, allowing greater access to chemical space than previously possible. This creates an incentive for academic and industrial laboratories to embrace biocatalytic methods. As interest in this field continues to grow, it will most certainly inform the retrosynthetic logic of modern organic synthesis and shape the next generation of methods.

Outlook and Conclusion

New technology and approaches in biocatalysis continue to pave the way for innovation and paint a bright future for this field. Enzymatic catalysis has demonstrated utility in the construction of simple molecules and holds promise for expanding synthetic access to new corners of chemical space. The rapid technological advances surrounding biocatalyst discovery, characterization, and application naturally raise the question of what comes next in the field. We anticipate that the amenability of biocatalysis to high-throughput experimentation will shape the application of enzymatic catalysis in synthesis. For example, we envision that the generation of compound libraries in plates will be possible through biocatalysis. Considering the benign nature of biocatalytic reactions, we anticipate that biocatalytically generated compound libraries can be directly coupled with biological assays as well, matching the pace of compound generation with established high-throughput biological assays to ultimately accelerate drug discovery.[126,127] Continued progress in biocatalysis would benefit combinatorial platforms for the synthesis of small-molecule-based compound libraries. The idea of combinatorial biocatalysis platforms for library synthesis has been around since the early 2000s; however, its widespread adoption has been hindered by the lack of resources to identify and develop promiscuous catalytic enzymes.[128,129] Combinatorial biocatalytic syntheses are now taking shape with recent advances in contemporary organic chemistry, synthetic biology, and bioinformatics. 
In addition, studies of enzyme cocktails have shown that biocatalysts can operate synergistically to complement each other’s substrate scopes, creating useful catalyst mixtures to perform sequential chemical transformations.[130,131] With this precedent, and with equipment for high-throughput experimentation becoming more advanced and commonplace,[126] it seems only a matter of time before the high-throughput synthesis of vast and diverse small-molecule libraries mediated by combinatorial biocatalysis is realized. Without question, biocatalysis has become a valued approach in modern organic synthesis[126] and is a methodology we will rely on heavily as the need to develop green alternatives in chemistry grows.[17,132] With the rapid advances in the field over the past few decades and the wealth of sequence data now widely available, biocatalytic methods are more accessible than ever before. As the global community adapts these techniques to their individual needs, new ideas and strategies will take hold and continue to push biocatalysis to the forefront of synthetic chemistry.
