Designing Microbial Cell Factories for the Production of Chemicals.

Jae Sung Cho^1,2,3, Gi Bae Kim^1,2, Hyunmin Eun^1,2, Cheon Woo Moon^1,2, Sang Yup Lee^1,2,3.

Abstract

The sustainable production of chemicals from renewable, nonedible biomass has emerged as an essential alternative to address pressing environmental issues arising from our heavy dependence on fossil resources. Microbial cell factories are engineered microorganisms harboring biosynthetic pathways streamlined to produce chemicals of interest from renewable carbon sources. The biosynthetic pathways for the production of chemicals can be classified into three categories with reference to the microbial host selected for engineering: native-existing pathways, nonnative-existing pathways, and nonnative-created pathways. Recent trends in leveraging native-existing pathways, discovering nonnative-existing pathways, and designing de novo pathways (as nonnative-created pathways) are discussed in this Perspective. We highlight key approaches and successful case studies that exemplify these concepts. Once these pathways are designed and constructed in the microbial cell factory, systems metabolic engineering strategies can be used to improve the performance of the strain to meet industrial production standards. In the second part of the Perspective, current trends in design tools and strategies for systems metabolic engineering are discussed with an eye toward the future. Finally, we survey current and future challenges that need to be addressed to advance microbial cell factories for the sustainable production of chemicals.

Year: 2022 PMID: 36032533 PMCID: PMC9400054 DOI: 10.1021/jacsau.2c00344

Source DB: PubMed Journal: JACS Au ISSN: 2691-3704

Introduction

Commercial chemicals produced from fossil resources through petrochemical refinery processes have played an integral part in human society over the past century. However, our overdependence on fossil resources and derived products has led to serious problems such as environmental pollution, extreme weather, and the depletion of fossil resources, which threaten not only humanity but also the entire planet. It is crucial to address these pressing issues by transforming the current petrochemical processes into sustainable, environmentally friendly processes for the production of chemicals. Microbial cell factories are engineered microorganisms designed and optimized to produce chemicals of interest from renewable resources such as nonedible biomass or even carbon dioxide. The fermentative production of chemicals using engineered microbial cell factories has been demonstrated to be a viable alternative to conventional chemical production. In addition to the advantage of using renewable resources as carbon sources, microbial cell factories operate at relatively low temperature and pressure and, unlike conventional chemical processes, do not use toxic solvents and catalysts. With the increasing international effort of research groups and companies around the world, a large portfolio of chemicals can now be produced using microbial cell factories.[1] When microorganisms are first isolated from nature, they are not optimized to readily take up carbon sources derived from renewable biomass and produce a target chemical of interest with sufficiently high efficiency. Thus, metabolic engineering, the purposeful modification of cellular and metabolic networks to achieve defined objectives,[2] is performed on the microorganism to convert it into an efficient microbial cell factory.
Through metabolic engineering, a microorganism can be engineered to utilize an inexpensive renewable carbon source as a substrate to produce a chemical of interest, even those nonnative to its metabolism. In 1991, the term “metabolic engineering” was first officially suggested,[3] and the first generation of metabolic engineering began with the development of molecular tools that enabled the deletion, insertion, or replacement of genetic components in the microbial chromosome or through the use of plasmids. Typically, metabolic pathways were streamlined by the manipulation of one to several genes to direct the metabolic flux toward a desired chemical product. At the turn of the millennium and with the advent of omics (genomics, transcriptomics, and proteomics) data, metabolic engineers were able to view microorganisms as systems made up of complex networks. The “1.5th generation” of metabolic engineering, therefore, leveraged such new omics data to perform local metabolic engineering on the basis of local information selected from global omics information—genomics, transcriptomics, and proteomics data were used to identify gene targets to enhance the production of target chemicals or proteins.[4] The second generation of metabolic engineering is characterized by combining metabolic engineering with synthetic biology, systems biology, and evolutionary engineering. This generation is also widely known as “systems metabolic engineering.” Tools and strategies of systems metabolic engineering to develop microbial cell factories have previously been reviewed.[5−7] High-throughput technologies are widely employed to generate bio-big data, which comprises vast omics data. As bio-big data became available, data-driven approaches, such as the rapidly expanding field of artificial intelligence, are now being used to solve various biotechnology problems. 
With breakthroughs in artificial intelligence and the use of bio-big data in designing microbial cell factories, the third generation of metabolic engineering has begun. Coupled with automated systems to construct hundreds of thousands of molecular systems and bacterial strains, these advances allow microbial cell factories to be developed at a speed unparalleled in the history of metabolic engineering, thereby opening up many new possibilities to produce chemicals. Metabolic pathway designs can be broadly classified into three categories on the basis of whether the pathways are native to the microbial host (native or nonnative) and whether they are reported or found in nature (existing) or are completely synthetic (created; Figure ). Native-existing pathways are biosynthetic pathways existing in an isolated microbial host capable of producing the target chemical endogenously without the need to introduce any foreign biosynthetic pathways. Nonnative-existing pathways are reconstructed biosynthetic pathways that utilize existing or reported pathways in nature but are nonnative to the microbial host. Nonnative-created pathways are reconstructed pathways that do not exist in nature but have been purposefully designed and created using synthetic enzymes and pathways with new functions. With an increasing number of tools and strategies available to discover and design biosynthetic pathways, the production of a target chemical is not necessarily restricted to a single category of pathway. For example, some chemicals, such as glutaric acid (discussed in more detail below), can be produced by both nonnative-existing pathways and nonnative-created pathways.
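The two questions behind this classification (is the pathway native to the host, and does it exist in nature?) can be written as a small decision function. The sketch below is only an illustration of that logic; the function name and the example entries are assumptions made for this sketch, with the examples drawn from cases discussed in this Perspective.

```python
def classify_pathway(native_to_host: bool, exists_in_nature: bool) -> str:
    """Return the pathway category for a given host (illustrative helper)."""
    if native_to_host and exists_in_nature:
        return "native-existing"
    if not native_to_host and exists_in_nature:
        return "nonnative-existing"
    if not native_to_host and not exists_in_nature:
        return "nonnative-created"
    # A pathway that is native to the host is, by definition, found in nature.
    raise ValueError("a host-native pathway cannot be a created pathway")


# (target chemical, host): (native to host?, exists in nature?)
EXAMPLES = {
    ("l-glutamate", "C. glutamicum"): (True, True),
    ("adipic acid", "E. coli"): (False, True),
    ("1,5-pentanediol", "E. coli"): (False, False),
}

for (chemical, host), flags in EXAMPLES.items():
    print(f"{chemical} in {host}: {classify_pathway(*flags)}")
```

The same chemical can land in more than one category depending on the route chosen, which is why the function takes the pathway's properties rather than the chemical's name.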

Figure 1

Overall design process to construct a microbial cell factory for the production of a target chemical. First, an appropriate microorganism is selected as the microbial host. Next, biosynthetic pathways toward target chemical production are examined, and the optimal pathway is introduced into the microbial host accordingly. The microbial host harboring the biosynthetic pathway is then subjected to systems metabolic engineering to improve strain performance.

The model organisms Escherichia coli and Saccharomyces cerevisiae are two of the most widely used hosts for microbial cell factories because their metabolisms are best understood and molecular tools to engineer these hosts are well established.[1] However, the production of a target chemical may not necessarily be optimal with E. coli and S. cerevisiae as hosts due to their metabolic and physiological properties. With the expanding number of genetic and computational tools available, other microorganisms with superior metabolic and cellular properties such as Corynebacterium glutamicum, Pichia pastoris, Pseudomonas putida, and Yarrowia lipolytica are increasingly being explored for the production of chemicals, with a notable increase in such efforts in the past five years (Table 1).

Table 1

Properties of Platform Hosts for Microbial Cell Factories

host	type	characteristics	available GEM	references
Escherichia coli	Gram-negative bacteria	-model organism	iML1515	refs (190, 191)
		-well-established genome engineering tools
		-well-studied metabolism
		-weak cell wall
		-endotoxins
Corynebacterium glutamicum	facultative anaerobe, Gram-positive bacteria	-robust, powerful metabolism	iCW773	refs (11, 192)
		-well-studied fed-batch fermentation
		-chemical-produced labeled GRAS
Pseudomonas putida	Gram-negative bacteria	-robust, suitable for production of natural chemicals	iJN1462	refs (13, 193)
Saccharomyces cerevisiae	eukaryote	-GRAS	Yeast8	refs (12, 194)
		-well-established genome engineering tools
		-advantageous for expressing eukaryotic genes (e.g., P450s)
Yarrowia lipolytica	eukaryote	-GRAS	iYLI647	refs (10, 195)
		-oleaginous microorganism
		-TAG storage
Bacillus subtilis	Gram-positive, catalase-positive bacteria	-model Gram-positive strain	iYO844	refs (8, 196)
		-spore-forming
Pichia pastoris	eukaryote	-methylotrophic	iMT1026v3.0	refs (14, 197)

Previous reviews on designing microbial cell factories for chemical production have focused on individual microbial cell hosts,[8−14] systematic methods to develop strains on the basis of specific examples,[5] or summaries of the tools and strategies used in systems metabolic engineering.[6,7] This Perspective focuses on designing biosynthetic pathways for microbial cell factories, on various methods to construct completely novel pathways, and on design strategies to optimize microbial cell factories for the production of chemicals. Readers are encouraged to refer to the flowchart (Figure 2), which contains the guiding principles on how to design the pathways. We also use recent examples of successful engineering of microbial cell factories to showcase various methods for engineering a microbial host for chemical production. Looking into the future, we discuss various obstacles that need to be overcome in this field and offer our vision for the sustainable production of chemicals using microbial cell factories.

Figure 2

Flowchart illustrating the guiding principles in designing three categories of biosynthetic pathways. Blue and red colored boxes indicate steps for constructing nonnative-existing pathways and nonnative-created pathways, respectively.

Native-Existing Pathways

Some microorganisms are naturally capable of producing chemicals of interest in large amounts before any metabolic engineering is performed. For example, strains of C. glutamicum isolated in the 1950s were found to be capable of producing large amounts of l-glutamate and l-lysine, which are both important food and feed additives.[15] These C. glutamicum isolates and their derivatives developed through classical strain improvement were directly used for the fermentative production of these amino acids.[16] Other examples include Bacillus and Lactobacillus species, which are known for the production of l-lactate;[17] Y. lipolytica and Rhodococcus opacus for lipids;[10,18,19] Klebsiella pneumoniae for the production of 1,3-propanediol from glycerol;[20] Mannheimia succiniciproducens for succinic acid;[21] and actinomycetes for the production of antibiotics and polyketides.[22] With the recent progress in developing genetic engineering tools, these microorganisms with inherently superior metabolic and physiological properties toward specific chemical production are now being explored and used for the production of various chemicals (Table ). While these native producers are advantageous because the chemicals they inherently produce can be obtained with less metabolic engineering effort, careful consideration should be given to choosing the host for the production of a particular chemical. Criteria include whether the host harbors a strong native metabolic flux toward the synthesis of the target chemical and whether it allows chemical production with a yield close to the theoretical maximum.[23] Once the optimal host is chosen, metabolic pathway engineering strategies can be applied as the next step for further strain development.

Nonnative-Existing Pathways

Reconstruction of Metabolic Pathway from Known Sources

Some target chemicals cannot be synthesized by a selected microbial host because the biosynthetic genes necessary to complete the metabolic pathway toward the target chemical are not present in the host. This issue can often be resolved by harnessing appropriate enzymes and pathways that exist in other organisms or metagenomes. Databases such as KEGG,[24] MetaCyc,[25] and BRENDA[26] provide comprehensive information on metabolic pathways, enzymes, and genes.[6] On the basis of this information, metabolic pathways can be reconstructed by recruiting and combining genes from other organisms or metagenomes. For example, the adipic acid biosynthesis pathway, which had never been described in microorganisms, was recently reported in Thermobifida fusca, and the pathway was subsequently reconstructed in E. coli for the production of adipic acid.[27] As the reconstruction of many metabolic pathways can be performed similarly and quite intuitively, here we highlight some recent noteworthy examples that require more complex decision-making. In some cases, there can be more than one known pathway for the production of a target chemical. It is important to survey various known pathways and choose the optimal pathway toward the production of the target chemical in the host of choice. For example, two biosynthetic pathways have been reported for the production of methyl anthranilate, a grape flavoring compound, from anthranilate, an important intermediate metabolite that exists in most microbial hosts (Figure 3).[28] Both reported pathways are usually present in plants: the first pathway is a two-step enzymatic conversion that requires methanol, and the second pathway is a single enzymatic step that uses S-adenosyl methionine (SAM), a ubiquitous metabolite in microbial hosts, as a cofactor.
While the first pathway resembles the conventional chemical method of converting anthranilate to methyl anthranilate by methylation using methanol, it is more feasible to use the second pathway given that it is shorter and does not use methanol but rather the ubiquitous SAM as a cofactor. Microbial cell factories have recently been developed for the production of methyl anthranilate using the methanol-dependent pathway[29] and SAM-dependent pathway,[28,30] where the SAM-dependent pathway demonstrated much higher production performance. In another example, large combinatorial libraries have been used to identify and optimize the biosynthetic pathway toward itaconic acid, for which several biosynthetic pathways are known to exist. Algorithmic design strategies were adapted to guide the automated construction of the microbial cell factory for itaconic acid overproduction.[31]

Figure 3

Two reported routes toward methyl anthranilate production from anthranilate. The first route is a two-step catalytic pathway that involves the use of methanol as the methyl donor, and the second route uses SAM as the methyl donor toward methyl anthranilate production. The abbreviations are as follows: CoA, coenzyme A; MeOH, methanol; SAH, S-adenosyl-l-homocysteine; SAM, S-adenosyl methionine.

Combinations of partial biosynthetic pathways from various hosts to construct a large biosynthetic pathway have also been demonstrated recently. A vitamin B12 (adenosylcobalamin) biosynthesis pathway was constructed in E. coli by overexpressing endogenous E. coli genes and 28 heterologous genes from Rhodobacter capsulatus, Brucella melitensis, Sinorhizobium meliloti, Salmonella typhimurium, and Rhodopseudomonas palustris.[32] The genes comprising the whole complex pathway were divided into six different modules, which enabled de novo production of vitamin B12. Similarly, carminic acid was produced in E. coli by constructing a biosynthetic pathway using a combination of genes from plants and bacteria. Two important enzymes, a monooxygenase and a glucosyltransferase, involved in the last two reaction steps were chosen from literature and chemical databases to complete the biosynthetic pathway toward carminic acid production, followed by computer simulation-based improvement of the enzymes.[33] Other notable breakthroughs using a large number of genes from different sources include the production of noscapine and halogenated alkaloids in yeast with 25 heterologous genes from plants (Papaver somniferum, Coptis japonica), a bacterium (Pseudomonas putida), and a mammal (Rattus norvegicus)[34] and the production of tropane alkaloids in yeast through the introduction of 15 genes, of which 11 are derived from various plants and bacteria.[35]

Pathway Discovery for Unknown Metabolic Pathways

There are cases where genetic or enzymatic information on certain steps in the biosynthesis pathway for target chemical production is missing (Figure ). Moreover, enzyme candidates for the missing biosynthetic reactions that are selected intuitively on the basis of in-depth literature searches might not function properly when introduced into the host. One way to address this issue is to discover novel enzyme candidates by profiling the genomes and transcriptomes of biosynthetic gene clusters found in nature (Figure ). One recent example is the biosynthesis of cannabinoids in S. cerevisiae.[36] Several prenyltransferase candidates that can function as geranylpyrophosphate:olivetolate geranyltransferase (GOT), responsible for cannabigerolic acid (CBGA) synthesis, were shortlisted from the literature and from Cannabis transcriptome profiles.[36] Through the introduction and screening of these candidates, the biosynthetic pathway for cannabinoids production was fully reconstructed in the host S. cerevisiae. In another example, unknown UDP-glycosyltransferase and putative valerenadiene oxidase were newly discovered from plant transcriptomes for the biosynthesis of triterpene glycosylation products[37] and valerenic acid,[38] respectively, in S. cerevisiae.

Figure 4

Transcriptome and genome mining approaches for the discovery of novel gene and enzyme candidates. Previously unknown metabolic reactions can be elucidated by introducing into microbial hosts various gene and enzyme candidates newly discovered from the transcriptomes of plants or from bacterial gene clusters using various computational tools.

In addition, genome mining software tools such as antiSMASH,[39] ClusterFinder,[40] NP.searcher,[41] and NaPDoS[42] facilitate the discovery and prediction of cryptic biosynthetic gene clusters (BGCs) from bacterial, fungal, and plant genome sequences (Figure 4). Reference pathway data are available from public, standardized databases such as the MIBiG (Minimum Information about a Biosynthetic Gene Cluster) repository.[43] Comparative analyses of the BGCs using databases such as antiSMASH can help identify potential gene candidates. Following the screening of gene candidates, the activities of cryptic enzymes are experimentally validated by performing enzyme reactions and analyzing the substrates and products using nuclear magnetic resonance and mass spectrometry. This allows the cryptic enzymes to be associated with their corresponding gene sequences.[44] Thus, comparative analyses using databases, combined with experimental analyses, can expedite the discovery of enzyme candidates required for constructing novel biosynthesis pathways.

Nonnative-Created Pathways

As discussed above, metabolic engineering has enabled the production of various chemicals by employing naturally existing pathways toward target chemicals. However, to fully replace conventional petrochemical refineries, microbial cell factories should also be capable of producing non-natural chemicals, or even natural products whose biosynthetic pathways have not been characterized. In organic chemistry, synthesis routes for chemicals without precedent can be designed using retrosynthesis, a discipline that works backward from a target chemical to readily available precursors by finding appropriate chemical transformation steps. Computer-aided approaches, which facilitate retrosynthesis by efficiently exploring the vast chemical space, can now also be used to design biosynthetic pathways. Such biosynthetic pathways can be constructed de novo by identifying suitable enzymes from heterologous sources (see above) and/or harnessing the promiscuity of enzymes that are expected to act on the metabolite of interest. In this section, we discuss how enzyme promiscuity and retrosynthetic approaches can be applied to de novo pathway design.

Harnessing Enzyme Promiscuity for Designing De Novo Pathways

Promiscuous enzymes are enzymes that can carry out multiple catalytic reactions using a broad range of substrates, and they can be harnessed to construct novel metabolic pathways (Figure 5A). The use of these enzymes shows great potential for the construction of de novo pathways, where missing steps of a biosynthesis pathway can be filled by introducing repurposed promiscuous enzymes.[45] It has been estimated that 37% of all the enzymes in E. coli can act on multiple substrates.[46] By leveraging promiscuous enzymes derived from E. coli, chemicals that could previously be obtained only from petrochemical processes can now be produced using de novo biosynthetic pathways. For example, 1,5-pentanediol, a building block for polyester and polyurethane synthesis, was first produced in engineered E. coli using such a method.[47] Broad-substrate-range enzymes such as aldehyde dehydrogenases were selected from various heterologous organisms for the complete de novo production of 1,5-pentanediol from glucose (Figure 5B). Similarly, 4-amino-1-butanol (4AB), an important precursor of biodegradable polymers, was first produced in C. glutamicum by introducing a de novo biosynthesis pathway comprising the broad-substrate-range aldehyde dehydrogenase and putrescine aminotransferase from E. coli.[48] Promiscuous enzymes from other species have also been widely employed for constructing de novo pathways. For instance, mevalonate diphosphate decarboxylase from S. cerevisiae, which was already known to exhibit promiscuous activity, was employed and further engineered to create an isopentenyl diphosphate (IPP)-bypass mevalonate pathway for the production of isopentanol.[49] Similarly, various prenylated aromatic compounds, including orsellinic, divarinolic, and olivetolic acids, have been produced by harnessing the promiscuity of engineered aromatic prenyltransferase NphB from a Streptomyces strain.[50] 1,6-Hexamethylenediamine, a monomer required for nylon-6,6 production, was produced from the in vitro conversion of adipic acid by employing transaminases and carboxylic acid reductases that had already been reported to accept substrates with carboxyl, carbonyl, and amine terminal groups.[51] These enzymes were further engineered to create mutant variants with enhanced activities, which resulted in a complete one-pot transformation of adipic acid to 1,6-hexamethylenediamine.

Figure 5

Use of promiscuous enzymes for constructing nonnative, created pathways. (A) A schematic illustration of enzyme promiscuity. (B) Use of promiscuous enzymes that are functional in the 1,4-butanediol biosynthetic pathway and employed for use in the 1,5-pentanediol biosynthetic pathway.

Computer-Aided Design of De Novo Pathways for Non-Natural Chemicals

In chemical synthesis, retrosynthetic approaches are used to find chemical transformations that can be applied to synthesize the target chemicals. As screening all reported chemical transformations is not practical, the transformations are often described in generalized forms, called reaction rules, that account for the changes in atoms and bonds in the reaction center and its neighborhood (Figure 6A). Reaction rules are applied iteratively to the target chemical and then to the resulting precursors until building blocks that can be easily synthesized or are commercially available are reached. Retrobiosynthesis works in the same way as retrosynthesis, except that its reaction rules are limited to enzymatic reactions.[52] Over 450 000 reaction rules for enzymatic reactions were recently compiled with varying diameters, a parameter that represents the number of bonds around a reaction center.[53] The reaction rules were used to develop the retrosynthetic metabolic pathway designer RetroPath 2.0. RetroPath 2.0 demonstrated successful de novo pathway design by predicting 81.5% (119 out of 146 pathways) of biosynthetic pathways listed in the LASER database, which compiles previously reported metabolic engineering designs.[54]
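The iterative application of reaction rules can be sketched as a breadth-first search that works backward from the target until a building block is reached. This is a deliberately simplified illustration, not how RetroPath 2.0 is implemented: real reaction rules are atom- and bond-level patterns, whereas the `RULES` dictionary below is a hand-written, hypothetical lookup from a product name to possible precursor names, and `BUILDING_BLOCKS` is an assumed set of available starting metabolites.

```python
from collections import deque

# Hypothetical toy "reaction rules": product -> possible direct precursors.
RULES = {
    "dopamine": ["L-DOPA", "tyramine"],
    "L-DOPA": ["L-tyrosine"],
    "tyramine": ["L-tyrosine"],
}
# Assumed set of metabolites the host already supplies.
BUILDING_BLOCKS = {"L-tyrosine"}


def retro_search(target, rules, building_blocks, max_steps=5):
    """Return the first shortest route from target back to a building block."""
    queue = deque([[target]])
    while queue:
        route = queue.popleft()
        current = route[-1]
        if current in building_blocks:
            return route
        if len(route) > max_steps:
            continue  # give up on routes that are too long
        for precursor in rules.get(current, []):
            if precursor not in route:  # avoid cycles
                queue.append(route + [precursor])
    return None  # no route found within max_steps


route = retro_search("dopamine", RULES, BUILDING_BLOCKS)
print(" <- ".join(route))
```

Because every applicable rule is tried at each step, the number of candidate routes grows quickly with depth, which is the practical motivation for the filtering and sampling strategies discussed in the following sections.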

Figure 6

Template-based approaches for retrobiosynthesis. (A) An example of applying a known reaction rule to a target chemical, dopamine, to find a reactant, in this case L-DOPA. (B) Filtering the predicted pathways using several criteria including enzyme availability, thermodynamics, toxicity of intermediates, and yields of a pathway, among others.

Since the transformation routes are determined by reaction rules, it is important to establish appropriate reaction rules that capture the specificities of reactions while maintaining a reasonable extent of enzyme promiscuity. A manually curated minimal rule set comprising 1224 reaction rules for enzymatic reactions has recently been described to cover all possible biotransformations while preserving a minimal number of reaction rules for computational efficiency.[55] The reaction rules can reproduce 85% of reactions in the KEGG[24] database, showing that well-established reaction rules can benefit retrobiosynthesis without sacrificing reaction coverage. In addition to engineering reaction rules, postanalysis of retrosynthesis predictions should be considered to filter the numerous predicted biosynthetic pathways (Figure 6B). For example, genome-scale metabolic models (GEMs) can be used to compare the metabolic capacities of the predicted metabolic pathways by calculating the maximum theoretical yields.[56,57] Achieving the maximum yield of product is often the most important factor determining the overall economics of bioprocesses for the production of bulk products. The maximum theoretical yield can be calculated by maximizing the target chemical production flux using the GEM of interest. As the yield of the biosynthetic pathway affects the efficiency of the whole bioprocess, the pathway with the highest maximum theoretical yield under the condition of interest (e.g., carbon source, strain type) should be identified at an early stage.
However, the trade-off between cell growth and target chemical production flux should also be carefully considered in order not to overly sacrifice other important performance metrics such as titer and productivity. Enzyme availability is another key factor that should be considered in retrobiosynthesis. Because the predicted biochemical transformations must often be performed by enzymes that exhibit promiscuity, it is important to identify the corresponding enzymes for the predicted pathways.[58−61] A recent study that employed a computational workflow to systematically screen biologically synthesizable compounds is a noteworthy example: it describes the entire workflow of computer-aided pathway design and covers the selection of target chemicals, prediction of biosynthetic pathways, identification of the corresponding enzymes, and experimental validation.[62] The authors applied the workflow to produce benzylisoquinoline alkaloids, thereby exploring the derivatives of intermediates in the noscapine biosynthetic pathway. First, the hypothetical chemical space that can be reached from the noscapine biosynthetic pathway was explored using a retrosynthetic algorithm, BNICE.ch.[63] Since the chemical space was too large, it was pruned by evaluating the presence of the benzylisoquinoline alkaloid scaffold, the thermodynamics of the reactions, the availability of enzymes for the reactions, and other factors. The enzyme availability was assessed using BridgIT, a computational method that predicts promiscuous enzymes for a reaction using reaction similarities.[60] The framework was experimentally demonstrated by successfully producing the derivatives of the intermediates of the noscapine biosynthetic pathway in yeast with the predicted enzymes that constitute the de novo pathways.[62]
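The postanalysis described above, in which predicted pathways are pruned by enzyme availability, thermodynamics, and intermediate toxicity and then compared by maximum theoretical yield, can be sketched as a filter followed by a sort. Everything in this sketch is an assumption made for illustration: the class, the field names, the cutoff, and all numerical values are hypothetical, and in practice the yield would come from a GEM simulation rather than being typed in by hand.

```python
from dataclasses import dataclass


@dataclass
class CandidatePathway:
    name: str
    enzymes_available: bool       # an enzyme is known or predicted for every step
    max_step_dg: float            # least favorable step, kJ/mol (hypothetical)
    toxic_intermediate: bool      # any intermediate flagged as toxic to the host
    max_theoretical_yield: float  # mol product per mol carbon source (hypothetical)


def filter_and_rank(candidates, dg_cutoff=0.0):
    """Drop infeasible pathways, then rank the rest by maximum theoretical yield."""
    kept = [
        c for c in candidates
        if c.enzymes_available
        and not c.toxic_intermediate
        and c.max_step_dg <= dg_cutoff
    ]
    return sorted(kept, key=lambda c: c.max_theoretical_yield, reverse=True)


# Hypothetical candidates, for illustration only.
CANDIDATES = [
    CandidatePathway("route_1", True, -5.0, False, 0.60),
    CandidatePathway("route_2", True, -2.0, False, 0.75),
    CandidatePathway("route_3", False, -8.0, False, 0.90),  # no enzyme for a step
    CandidatePathway("route_4", True, 12.0, False, 0.85),   # unfavorable step
    CandidatePathway("route_5", True, -4.0, True, 0.80),    # toxic intermediate
]

for pathway in filter_and_rank(CANDIDATES):
    print(pathway.name, pathway.max_theoretical_yield)
```

Applying the hard filters before ranking means that a pathway with the highest nominal yield, such as the hypothetical `route_3`, is still discarded if no enzyme is available for one of its steps.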

Artificial Intelligence Empowers Pathway Design

With the advent of the bio-big data era, artificial intelligence-based data-driven approaches have been actively applied for retrosynthetic algorithms.[64] Artificial intelligence in retrosynthesis samples reaction rules to be applied rather than considering all reaction rules in every iteration, which results in a much more efficient exploration of the possible reactions. One recent approach, RetroPath RL, uses reinforcement learning as the main engine to explore the chemical space with reaction rules compiled in RetroRules.[53,65] RetroPath RL uses the Monte Carlo tree search algorithm, which iteratively selects an intermediate chemical that can be reached from a starting chemical, expands the pathway from the intermediate by choosing possible transformations, samples transformations until a terminal state is reached, and finally updates scores that occurred during the iteration. With every iteration, a scoring system that evaluates reaction similarity and enzyme sequence availability is used as a selection policy. RetroPath RL outperformed a previous retrosynthetic algorithm, RetroPath 2.0, by predicting 83.6% (127 out of 152) of experimentally validated metabolite biosynthetic pathways. While artificial intelligence can guide the retrosynthetic algorithms to select the proper reaction rules, recent applications of artificial intelligence also enable synthetic route planning without reaction rules. The reaction-rule-based retrosynthetic approaches, called template-based approaches, are constrained to the content and quality of the predefined reaction rules. Therefore, novel reactions cannot be predicted unless the reaction patterns have been reflected in the reaction rules. Such constraints were overcome in recent applications of artificial intelligence in retrosynthesis by performing template-free approaches to directly predict the precursors of a target chemical. 
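The four Monte Carlo tree search steps named above (selection, expansion, sampling until a terminal state, and score update) can be illustrated on a toy network. This is a minimal sketch and not the RetroPath RL implementation: the `RULES` graph and compound names are hypothetical, selection uses the standard UCB1 formula, and the reward is simply 1 if a random rollout reaches a building block, whereas RetroPath RL scores reaction similarity and enzyme sequence availability.

```python
import math
import random

# Hypothetical toy rules: chemical -> chemicals reachable by one transformation.
RULES = {"target": ["A", "B"], "A": ["dead_end"], "B": ["C"], "C": ["building_block"]}
BUILDING_BLOCKS = {"building_block"}
MAX_ROLLOUT_STEPS = 4


class Node:
    def __init__(self, chemical, parent=None):
        self.chemical, self.parent = chemical, parent
        self.children, self.visits, self.value = [], 0, 0.0

    def untried(self):
        tried = {child.chemical for child in self.children}
        return [c for c in RULES.get(self.chemical, []) if c not in tried]


def ucb1(child, parent_visits, c=1.4):
    """Average reward plus an exploration bonus for rarely visited children."""
    return child.value / child.visits + c * math.sqrt(math.log(parent_visits) / child.visits)


def rollout(chemical, rng):
    """Sample random transformations until a terminal state is reached."""
    for _ in range(MAX_ROLLOUT_STEPS):
        if chemical in BUILDING_BLOCKS:
            return 1.0
        options = RULES.get(chemical, [])
        if not options:
            return 0.0
        chemical = rng.choice(options)
    return 1.0 if chemical in BUILDING_BLOCKS else 0.0


def mcts(root_chemical, iterations=200, seed=0):
    rng = random.Random(seed)
    root = Node(root_chemical)
    for _ in range(iterations):
        node = root
        # 1. Selection: descend through fully expanded nodes using UCB1.
        while not node.untried() and node.children:
            node = max(node.children, key=lambda ch: ucb1(ch, node.visits))
        # 2. Expansion: add one untried transformation, if any.
        options = node.untried()
        if options:
            child = Node(rng.choice(options), parent=node)
            node.children.append(child)
            node = child
        # 3. Rollout from the new node, then 4. update scores up to the root.
        reward = rollout(node.chemical, rng)
        while node is not None:
            node.visits += 1
            node.value += reward
            node = node.parent
    # Read out the most visited route.
    route, node = [root.chemical], root
    while node.children:
        node = max(node.children, key=lambda ch: ch.visits)
        route.append(node.chemical)
    return route


print(mcts("target"))
```

In this toy network the branch through `A` never earns a reward, so the search concentrates its visits on the branch through `B` without having to enumerate every rule at every step.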
Template-free approaches deal with how to translate a target chemical to the possible precursors, inspired by state-of-the-art artificial intelligence for machine translation. A sequence-to-sequence model was developed that takes a simplified molecular-input line-entry system (SMILES) string of a target chemical to predict the corresponding reactants.[66] The sequence-to-sequence model uses a long short-term memory (LSTM) model that takes each character of the input SMILES string to encode latent features. The latent features are then decoded to a series of SMILES characters for the reactants. The sequence-to-sequence model showed comparable performance with a baseline model that used a template-based approach with automatically extracted reaction rules from the training data set, suggesting the good potential of the template-free approach. Since the late 2010s, transformer-based models, which use self-attention units in their neural network architecture, have resulted in state-of-the-art algorithms in various natural language processing areas, including machine translation.[67,68] On the basis of this success, the latest template-free approaches also attempted to employ the transformer-based approach (Figure ).[69−73] However, using SMILES strings for the encoder–decoder networks tends to generate grammatically wrong molecules when the networks do not perfectly learn the grammar of the SMILES. Some innovative strategies have been devised to deal with such problems.[72,73] A recent study employed two-step retrosynthetic prediction: translation of a product chemical to a reactant chemical and correction of the syntax errors in the generated SMILES string for the reactant chemical.[72] Both steps used transformer-based models, where the former model was trained on reactant–product pairs, and the latter model was trained on invalid reactant–ground-truth reactant pairs. 
Instead of explicitly fixing the syntax error, another study tried to generate valid SMILES strings using cycle consistency.[73] Here, two transformers were trained to predict forward and retrosynthetic reactions, respectively. The forward reaction prediction uses the reactant generated by the retrosynthetic prediction to produce the original input chemical, which ensures that the whole system can consistently generate valid chemicals in the cyclic predictions. Not only SMILES but also graph representation of the molecules can be used for the retrosynthesis algorithms. While decoder networks that use SMILES iteratively generate a character of the SMILES, the generation of molecules using graph representations can be performed by expanding nodes (atoms) and edges (bonds) from a starting point, of which the process is free of syntax errors.[74]
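The two failure-handling ideas above, syntax screening of generated SMILES strings and cycle consistency between retrosynthetic and forward predictions, can be sketched with toy stand-ins. This is only an illustration under stated assumptions: the tokenizer covers a small subset of SMILES, `looks_valid` checks just branch and ring-closure pairing, and the `RETRO_CANDIDATES` and `FORWARD` dictionaries are hypothetical lookup tables replacing trained transformer models. In the cited work, cycle consistency is built into training; here it is shown as a simple post hoc filter.

```python
import re

# Simplified tokenizer (assumption): bracket atoms, two-letter halogens,
# %NN ring closures, then any single character.
TOKEN = re.compile(r"\[[^\]]+\]|Br|Cl|%\d{2}|.")


def tokenize(smiles):
    return TOKEN.findall(smiles)


def looks_valid(smiles):
    """Cheap syntax screen: balanced branches and paired ring-closure labels."""
    depth = 0
    open_rings = set()
    for tok in tokenize(smiles):
        if tok == "(":
            depth += 1
        elif tok == ")":
            depth -= 1
            if depth < 0:
                return False
        elif tok in ("[", "]"):
            return False  # stray bracket that did not form a bracket atom
        elif tok.isdigit() or tok.startswith("%"):
            open_rings ^= {tok}  # first occurrence opens, second closes
    return depth == 0 and not open_rings


# Hypothetical model outputs: candidate reactants for ethyl acetate, one of
# which has a syntax error and one of which yields a different product.
RETRO_CANDIDATES = {
    "CC(=O)OCC": ["CC(=O)O.CCO", "CC(=O)O.CCO)", "CC(=O)O.CO"],
}
FORWARD = {
    "CC(=O)O.CCO": "CC(=O)OCC",
    "CC(=O)O.CO": "CC(=O)OC",
}


def cycle_consistent(product):
    """Keep reactants that are well formed and map back to the product."""
    keep = []
    for reactants in RETRO_CANDIDATES.get(product, []):
        if looks_valid(reactants) and FORWARD.get(reactants) == product:
            keep.append(reactants)
    return keep
```

A graph-based decoder would not need the `looks_valid` step, since adding atoms and bonds one at a time cannot produce an unbalanced string.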

Figure 7

Template-free approaches for retrobiosynthesis using transformer-based models. A SMILES string of a target molecule is translated to a SMILES string of the substrate by the encoders and decoders of a transformer-based model. The translation iteratively generates the next token (a SMILES character) of the substrate from the tokens of the target molecule and the previously generated tokens of the substrate.

To establish the sustainable production of chemicals using microbial cell factories, unraveling the metabolic pathways of non-natural chemicals is an important task, which can be enabled by computer-aided pathway design (Table 2). Even the strategies for the production of native and nonnative chemicals should be reexamined using computer-aided pathway design to assess whether more efficient biosynthetic pathways are available. Nevertheless, employing synthetic pathways in developing microbial cell factories still requires additional effort. Current retrosynthetic algorithms are mainly developed for organic reactions rather than enzymatic reactions. Although specific reaction rules for enzymatic reactions have enabled template-based approaches, applying these reaction rules to large molecules such as those containing coenzyme A is still prone to yield wrong predictions. Template-free approaches, in turn, may suffer because far fewer biochemical reactions are known than organic reactions. Transfer learning, a scheme that further trains a pretrained machine learning model on another data set, can alleviate this lack of data. For example, a transformer-based model was developed to predict forward enzymatic reactions using a model that had been pretrained on an organic reaction data set.[70] Pretraining on the organic reaction data set lets the model learn the general chemical language, while transfer learning lets it learn specific biotransformations.

Table 2

Algorithms for Retrosynthesis

types	name	descriptor	reaction type	characteristics	reference
template-based	BNICE.ch	bond-electron matrix	metabolism	use manually curated reaction rules; evaluate thermodynamic feasibility	refs (63, 198)
	RetroPath RL	fingerprint	metabolism	use RetroRules; explore reaction space using Monte Carlo tree search algorithm	refs (53, 65)
	novoStoic	molecular signature	metabolism	use mixed integer linear programming to design pathways; integrate existing reactions in organisms and novel reactions	ref (199)
	RetroBioCat	fingerprint/SMILES	metabolism	provide human-led exploration mode and automated pathway generation mode; identify enzyme sequences for the predicted reactions	ref (61)
	GEM-Path	fingerprint/SMARTS	metabolism	integrate a GEM of E. coli; integrate growth-coupled strain design algorithms	ref (56)
	Cho et al.	SMILES	metabolism	use manually curated reaction rules; estimate binding site covalence, organism specificity, thermodynamics for prioritizing pathways	ref (200)
	PathPred	RDM pattern	metabolism	predict xenobiotics biodegradation pathways and biosynthesis of secondary metabolites; link the prediction results to possible genes	ref (201)
	ICHO	fingerprint	organic chemistry	use Chematica’s expert-coded reaction rules; can predict sparsely reported reaction types	ref (202)
	Baylon et al.	fingerprint	organic chemistry	use automatically extracted reaction rules; perform multiscale reaction classification	ref (203)
template-free	Liu et al.	SMILES	organic chemistry	use a sequence-to-sequence model; require a predetermined reaction type as an input	ref (66)
	molecular transformer	SMILES	organic chemistry	predict both reactants and reagents; demonstrated generalizability on proprietary electronic lab notebook data	ref (69)
	SCROP	SMILES	organic chemistry	use syntax corrector to automatically correct predicted invalid SMILES strings; analyze inference of the neural network using attention	ref (72)
	tied two-way transformers	SMILES	organic chemistry	check the cycle consistency of forward and retrosynthesis prediction models; generate diverse reactants using multinomial latent variables	ref (73)
	Fuji et al.	SMILES/JT-VAE	metabolism	embed chemicals into latent spaces using JT-VAE; predict a reaction feasibility using ensemble neural networks	refs (204, 205)
	RetroXpert	graph/SMILES	organic chemistry	identify a set of disconnection sites; predict reactants from synthons robustly	ref (206)
	G2G	graph	organic chemistry	handle uncertainty of reactant generation using latent variables; perform a variational graph translation	ref (74)
	Hasic et al.	fingerprint	organic chemistry	identify disconnection sites of target molecules; use hot-spot fingerprint, a variant of fingerprints	ref (207)

Design Tools and Strategies for Systems Metabolic Engineering of Microorganisms

So far, we have discussed the design of biosynthetic pathways in microbial hosts for the production of target chemicals. The identification of an appropriate host, the introduction of an appropriate pathway, the discovery of a previously unknown pathway, and the creation of a synthetic pathway toward a target chemical are the most fundamental processes in designing microbial cell factories, but they are only the first steps toward the production of these chemicals at industrially competitive levels. Further extensive engineering is needed to improve the performance of microbial cell factories. In this section, we revisit and update the tools and strategies for systems metabolic engineering reviewed recently[6,7] and also highlight emerging tools and strategies for the development of microbial cell factories for chemical production (Figure 8).

Figure 8

Design tools and strategies for the construction of microbial cell factories. (A) Molecular tools for the introduction of biosynthetic pathways to host cells. (B) Engineering enzymes using rational design, directed evolution and computational de novo design approaches. (C) Substrate channeling strategies toward product formation. (D) Genome-scale metabolic models driven by omics data and artificial intelligence. (E) Chassis random mutagenesis using ARTP and genome shuffling methods. (F) Increasing tolerance to target chemicals using ALE and process engineering. (G) Transporter engineering for the export and import of metabolites. (H) Engineering storage capacities for metabolites and energy. (I) Increasing membrane area by morphology engineering. (J) Antibiotics-free systems. The abbreviations are as follows: AI, artificial intelligence; ALE, adaptive laboratory evolution; ARTP, atmospheric and room-temperature plasma; FAs, fatty acids; IMV, inner membrane vesicles; OMV, outer membrane vesicles; PHB, polyhydroxybutyrate; RBS, ribosome binding site; TAGs, triacylglycerols; β-OX, β-oxidation pathway.

Molecular Tools for the Introduction of Biosynthetic Pathways

Recent developments in genetic engineering tools have remarkably advanced the introduction and optimization of pathways in host strains (Figure 8A). Plasmids still play an important role to this end because they are easy to manipulate and to introduce into microbial host cells. The nonnative pathways designed can be rapidly constructed and tested in microbial cell factories using plasmids. Tools and strategies for fine-tuning the gene expression levels with plasmids are still being actively studied.[75−79] An increasing number of synthetic promoters and ribosome binding site (RBS) sequences are being developed to fine-tune the expression of biosynthetic genes at the plasmid level. While several promoter libraries were previously designed on the basis of random sequences, novel synthetic promoter libraries have recently been constructed for yeast using model-guided design strategies[34] and for E. coli using a deep generative network.[80] Additionally, synthetic RBS sequences with diverse translation initiation levels can be designed for the control of gene expression using computational tools such as RBS calculator, UTR designer, and RBS designer.[81] For the strains used in industrial applications, the chromosomal expression of biosynthetic genes is often favored over plasmid-based expression because of plasmid maintenance and instability problems. Recombinase systems such as the Lambda Red or RecET from E. coli have widely been employed to delete or introduce target genes in various microbial hosts. However, as such conventional methods rely on laborious and time-consuming processes, much effort has been exerted to improve the recombination-based methods[82] or develop efficient alternative chromosomal engineering methods to further expedite strain development. 
CRISPR-Cas systems, an adaptive immune system of microorganisms, have recently attracted much interest as genome engineering tools because they enable simple and rapid engineering compared with the conventional tools. The Type II and class 2 CRISPR/Cas9 system derived from Streptococcus pyogenes has most widely been employed for genome engineering applications. Many variations of the CRISPR/Cas9 system were developed in combination with existing molecular tools for the genome engineering of microorganisms by leveraging the capability of the CRISPR/Cas9 system to introduce double-stranded breaks at a precise DNA sequence in the genome. More recently, Tn7-like transposons associated with CRISPR-Cas systems have been developed as powerful tools for inserting DNA into genomes.[83] The CRISPR-associated transposase (CAST) system[84] derived from cyanobacteria Scytonema hofmanni and Anabaena cylindrica and the INTEGRATE (insert transposable elements by guide RNA-assisted targeting) system[85,86] derived from Vibrio cholerae have enabled the insertion of large gene clusters into targeted sites in the chromosome. 
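As a concrete illustration of how a CRISPR/Cas9 target site is specified, the sketch below scans both strands of a DNA sequence for candidate protospacers. It relies on general knowledge that is not stated in the text above: the S. pyogenes Cas9 requires a 5′-NGG-3′ protospacer adjacent motif (PAM) immediately 3′ of a roughly 20-nt target. The function names are hypothetical, and no off-target or efficiency scoring is attempted.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def find_cas9_sites(seq, spacer_len=20):
    """Return (strand, start, protospacer, pam) for every NGG PAM.

    `start` is the 0-based index of the protospacer on the scanned strand
    (for "-" it refers to the reverse-complemented sequence).
    """
    seq = seq.upper()
    # A lookahead keeps overlapping candidates that a plain match would skip.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % spacer_len)
    sites = []
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for match in pattern.finditer(strand_seq):
            sites.append((strand, match.start(), match.group(1), match.group(2)))
    return sites
```

Each returned protospacer could serve as the starting point for a guide RNA design; in practice candidates are then filtered for uniqueness in the genome.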
For engineering nonmodel organisms, much effort has been made to develop broad-host-range genetic manipulation tools such as mobile-CRISPRi,[87] chassis-independent recombinase-assisted genome engineering (CRAGE),[88] and XPORT.[89] Mobile-CRISPRi was developed to characterize diverse bacterial species, with large guide-RNA libraries to rapidly screen essential pathways and desired phenotypes.[87] CRAGE uses transposon and conjugation systems for introducing biosynthetic gene clusters into diverse bacteria,[88] while XPORT is an engineered donor strain constructed to transfer miniaturized integrative and conjugative elements to undomesticated organisms.[89] With the increasing availability of molecular tools to engineer nonmodel organisms, those nonmodel organisms potentially possessing higher capability to overproduce certain chemicals can be metabolically engineered to become the industrial production strains in the future.

Enzyme Engineering

Recent advances in enzyme engineering have opened up a new avenue toward tailoring enzyme traits to harbor desired activity, specificity, and stability (Figure 8B). Enzyme selection and design are pivotal steps in constructing metabolic pathways because they determine the efficiency of each metabolic reaction, which in turn affects the overall metabolic flux of the pathway. In general, when heterologous enzymes are introduced into the microbial host, various problems might arise, such as low enzyme activity, stability, and solubility.[90] Enzymes can be rationally designed to overcome these problems. For example, various fusion tags that are readily available from empirical research can be fused to target enzymes to solve low enzyme stability or solubility problems.[91] Eukaryotic enzymes, especially those derived from plants, contain transit peptide sequences that are responsible for the translocation of proteins to specific membrane-bound organelles such as chloroplasts. These enzymes are usually nonfunctional when expressed in microbial hosts, and the removal of such transit peptide sequences from the enzymes is a widely used method to successfully express the enzymes in functionally active forms in the microbes.[92,93] Another example of rational enzyme engineering involves TesA, a thioesterase from E. coli,[94] which was modified through structure-guided mutagenesis to expand its substrate selectivity. The rationally engineered TesA was shown to improve the production of medium-chain-length fatty acids while maintaining a high enzymatic activity when introduced into E. coli.[94] Similarly, homology modeling and protein docking simulation of uncharacterized aklavinone 12-hydroxylase (DnrF) and C-glucosyltransferase (GtCGT) were performed to create mutant enzymes with improved activities.[33] The introduction of these mutant versions of DnrF and GtCGT resulted in the increased production of carminic acid in E. coli.[33]
Nevertheless, the aforementioned strategies cannot be a “one-size-fits-all” approach to engineer all enzymes. As an alternative, directed evolution can be employed to improve the enzyme performance. Directed evolution involves iterative rounds of generating a mutated enzyme library by random mutagenesis and screening the resulting variants. Among the screened enzymes, mutants with desired traits, such as those with improved activity, can be selected under various selection pressures.[7] An interesting recent example of directed evolution is the engineering of a carboligase for synthetic one-carbon metabolism.[95] Oxalyl-CoA decarboxylase was converted into glycolyl-CoA synthase by a high-throughput screening of saturation mutagenesis libraries.[95] Both rational design and directed evolution techniques are used to engineer naturally occurring proteins. In contrast, novel proteins can be created through the computational design of de novo proteins, which are able to carry out reactions that do not even exist (or have not yet been found) in nature. With our increasing knowledge on proteins and the availability of X-ray crystallography and NMR databases, proteins can now be customized on demand.[96] One of the most representative examples is the computationally designed formolase, which carries out a carboligation reaction that directly fixes one-carbon units into a three-carbon unit.[97] This has led to the creation of a new carbon fixation pathway that is more efficient than naturally occurring one-carbon assimilation pathways. Although there have not yet been many cases reported on the application of de novo protein design in the field of metabolic engineering, it has a high potential to open up a new avenue for the construction of novel biosynthetic pathways.[98]
When the bottleneck of the metabolic reaction is not due to the low enzyme activity but rather due to the substrate toxicity or low substrate availability, multiple enzymes can be simultaneously engineered to promote substrate channeling, thereby allowing improved metabolic conversion. Substrate channeling is a strategy to physically recruit proteins in close proximity by covalently fusing two or more proteins using linkers,[99] or through post-translational scaffold formation (Figure 8C).[100] Minimizing the distance between two or more catalytic domains creates an artificial metabolic flux node, where the reaction efficiency can be improved by minimizing the escape of metabolites through diffusion, leading to the production of target chemicals at higher levels.[23] Because of the rapid conversion of a toxic product (a substrate for the next enzyme reaction), cells can be rescued from the toxic effect as well. Similarly, proteins can also be compartmentalized to specific membranes or organelles (especially for eukaryotes) within the cell to achieve substrate channeling effects.[101] Eukaryotic strains have the advantage of streamlining reaction cascades by localizing enzymes and substrates into compartmentalized organelles, which also protects intermediates from being leaked out to other competitive pathways.[102] In another example, the squalene synthesis pathway was compartmentalized into the peroxisome of S. cerevisiae, which has led to the dramatic enhancement of squalene production.[103] The compartmentalization strategy can also be applied to entrap toxic enzymes, such as norcoclaurine synthase, in peroxisomes to reduce the toxicity of the enzyme and improve alkaloid production.[104] With our better understanding of enzyme assembly mechanisms, substrate channeling will continue to be a promising strategy for streamlining the metabolic flux toward the production of desired chemicals.
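The iterative mutagenesis-and-screening loop of directed evolution described in this section can be sketched as a small simulation. It is purely illustrative: the `HIDDEN_OPTIMUM` sequence and the `activity` function are hypothetical stand-ins for a real enzyme assay, and the mutation rate and library size are arbitrary choices.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# Hypothetical optimum, used only to give the toy screen something to measure.
HIDDEN_OPTIMUM = "MKVLAAGIW"


def activity(seq):
    """Toy screen: number of positions matching the hidden optimum."""
    return sum(a == b for a, b in zip(seq, HIDDEN_OPTIMUM))


def mutate(seq, rng, rate=0.1):
    """Random mutagenesis: each residue is redrawn with probability `rate`."""
    return "".join(
        rng.choice(AMINO_ACIDS) if rng.random() < rate else aa for aa in seq
    )


def directed_evolution(parent, rounds=30, library_size=200, seed=0):
    rng = random.Random(seed)
    history = [activity(parent)]
    for _ in range(rounds):
        # Build a mutant library from the current parent.
        library = [mutate(parent, rng) for _ in range(library_size)]
        # Screen: keep the best variant, never accepting a worse parent.
        parent = max(library + [parent], key=activity)
        history.append(activity(parent))
    return parent, history
```

Because the current parent is always kept in the screened pool, the recorded activity can only stay level or rise from round to round, which mirrors the selection step of a real campaign.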

Genome-Scale Metabolic Models (GEMs)

The successful design of microbial cell factories requires optimizing the whole biological network instead of focusing only on the central metabolism and the target biosynthetic pathways. GEMs, which are mathematical representations of metabolic networks derived from gene–protein–reaction associations of organisms, have been utilized for designing metabolic engineering strategies at the systems level (Figure 8D, Table ).[105,106] The development of omics integration algorithms has also enabled GEMs to serve as platforms for integrating various omics layers to understand the metabolism of host cells, rather than using each layer separately.[107,108] A recent study integrated proteomic data into iYO844, a GEM of Bacillus subtilis, to account for the proteome allocation in the metabolic network.[109] The proteome-integrated GEM predicted gene knockout targets, odhAB and sucCD, for the enhanced production of poly-γ-glutamic acid; experimental validation of these knockouts resulted in 2.10-fold (6.5 g/L) and 2.3-fold (7.2 g/L) increases in titer, respectively. In addition to the integration of omics data into GEMs, available bio-big data can be used for combining machine learning with the applications of GEMs.[110] For example, tryptophan production in yeast was improved by using machine learning algorithms that predict optimal combinations of the promoters of target genes (i.e., PCK1, TAL1, TKL1, CDC19, and PFK1), where the target genes were initially identified using a yeast GEM.[111] Advances in high-throughput technologies allow numerous types of data to be applied to GEMs, which further increases the use of GEMs in systems metabolic engineering. In the near future, the use of multiomics and cultivation data in machine learning approaches will allow the automated design of superior microbial cell factories.
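Two building blocks of a GEM, gene–protein–reaction (GPR) associations and the steady-state mass balance, can be illustrated on a toy network. The four-reaction network, gene names, and helper functions below are hypothetical, and the GPR parser assumes rules without parentheses written as "or" over "and" groups. A real GEM would be analyzed with a dedicated constraint-based modeling package and a linear programming solver rather than this sketch.

```python
# Hypothetical toy network: reaction -> (GPR rule, stoichiometry).
REACTIONS = {
    "uptake": ("", {"A": 1}),                 # no gene required
    "R1": ("g1 and g2", {"A": -1, "B": 1}),   # two-subunit enzyme complex
    "R2": ("g3 or g4", {"B": -1, "C": 1}),    # isozymes
    "export": ("", {"C": -1}),
}


def reaction_active(rule, knocked_out):
    """Evaluate a parenthesis-free GPR rule after gene knockouts."""
    if not rule:
        return True
    # "or" separates isozymes; "and" joins the subunits of one complex.
    return any(
        all(gene.strip() not in knocked_out for gene in group.split(" and "))
        for group in rule.split(" or ")
    )


def active_reactions(knocked_out):
    return {
        name
        for name, (rule, _) in REACTIONS.items()
        if reaction_active(rule, knocked_out)
    }


def net_production(fluxes):
    """Return S·v per metabolite; all zeros means the fluxes are at steady state."""
    balance = {}
    for name, flux in fluxes.items():
        for metabolite, coeff in REACTIONS[name][1].items():
            balance[metabolite] = balance.get(metabolite, 0) + coeff * flux
    return balance
```

Knocking out `g1` disables `R1` because the complex needs both subunits, whereas knocking out `g3` alone leaves `R2` active through the isozyme `g4`. Knockout targets such as those mentioned above are predicted by combining this kind of rule evaluation with flux optimization.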

Random Mutagenesis of Chassis Strain

As mentioned earlier, classical strain engineering has relied on random mutagenesis followed by the selection of strains that overproduce the target chemical (Figure 8E). This method is still routinely used for the screening of strains with superior production performance. Random mutagenesis is advantageous in that strains constructed using this method are classified as genetically unmodified, which means they are subject to less legal regulation when the strain is used to produce chemicals for human consumption.[112] In addition, products labeled as “genetically modified organism (GMO)-free” are much better accepted by the public.[113] Random mutagenesis can be done by using chemical mutagens, including ethyl methanesulfonate (EMS), N-methyl-N′-nitro-N-nitrosoguanidine (NTG), and N-ethyl-N-nitrosourea (ENU), or by using physical mutagens such as UV light or plasma. Atmospheric and room temperature plasma (ARTP) is a more recent physical mutagenesis technology, in which diverse breakages of plasmid DNA and oligonucleotides can be generated by varying the plasma dosage. With its low plasma temperature and low cost, ARTP is an increasingly popular mutation tool for microbial breeding, especially for developing mutant strains whose phenotype cannot feasibly be improved by other methods because of complex underlying genetic regulation.[112,114] Genome shuffling, which recombines the genomes of multiple parental strains using recursive protoplast fusion, is another technology that has received much attention for the phenotypic improvement of industrially used microbial strains.[115]

Tolerance and Transporter Engineering

Increasing the tolerance of host strains to toxic target chemicals is an important strategy to achieve higher production (Figure 8F). Strategies such as enhancing the transport of toxic chemicals to extracellular space,[116,117] blocking biocatalytic conversions toward toxic metabolites,[118] and membrane engineering[119,120] have been successfully demonstrated to improve tolerance toward target chemicals and thereby enhance production performance. Moreover, process engineering approaches using two-phase (aqueous/organic) fermentation[28,121] and in situ product recovery techniques[122,123] can be utilized to overcome toxicity toward target chemicals and enhance overall production performance. Adaptive laboratory evolution (ALE) is another efficient approach that enables rapid evolution of strains to improve tolerance to target chemical toxicity. In addition, the strains obtained from ALE can be reverse-engineered to uncover the cellular mechanism of tolerance for subsequent further engineering. For example, an engineered E. coli deficient in the l-serine degradation pathway was subjected to ALE with gradually increasing l-serine concentration, and the strains that evolved to grow without inhibition at 50 g/L of l-serine were isolated. These isolated strains were analyzed by whole-genome sequencing, which led to the proposal of the mechanism of l-serine inhibition.[124] ALE can also be combined with transcriptome analysis for a more in-depth understanding of tolerance mechanisms at the genetic level.[125,126] More recently, ALE has been coupled with strategies such as GREACE (genome replication engineering-assisted continuous evolution),[127,128] OrthoRep,[129] and eMutaT7[130] for increasing genotypic diversity, and with eVOLVER[131] and MMC (microliter-scale microbial microdroplet culture system)[132] for multiplexed automated culture and subsequent selection. Through combination with these tools, ALE can be a more powerful tool to develop and characterize strains tolerant to toxic chemicals.
Transporter engineering can be performed for improving the import of substrates and the export of the desired product. Exporter engineering is an important strategy not only to overcome the toxicity of the target chemicals, as described earlier, but also to improve the production of target chemicals by lowering the intracellular product concentration so that the product synthesis continues (Figure 8G). This strategy is especially advantageous in reducing purification costs as the target chemicals are produced extracellularly, thereby simplifying the purification process.[133−137] For example, the overexpression of the ccmABC genes encoding the heme exporter in E. coli has facilitated the production of a high fraction of secreted free heme to the medium.[138] Moreover, deleting intermediate metabolite exporters to prevent the loss of precursor metabolites to the extracellular space[139,140] or organelles[141,142] has been demonstrated to reinforce metabolic flux toward the target chemical production. Engineering importers is as crucial as the manipulation of exporters. The disruption of native importers that take up secreted products can increase the production titer.[143] Conversely, the reimport of leaked intermediate metabolites can be beneficial for further increasing the target chemical production.[144] The active import of substrates has been extensively studied to enable the utilization of various carbon sources and mixtures as substrates.[145−148] For example, the introduction of various sugar importers to produce biofuels enabled the uptake of various sugars.[149,150] Readers are encouraged to refer to comprehensive reviews on enhancing sugar uptake.[151,152]
There has been much effort to discover and characterize new transporters that are specific for target chemicals. Potential transporters may be discovered by searching the literature and databases for sequences orthologous or homologous to known transporters for similar substrates.[140,153] Also, transcriptomic analysis can be performed on cells exposed to a high concentration of the target chemical to find overexpressed genes with membrane protein characteristics, which are potential native transporters of the host strain.[154,155] After shortlisting, the candidate transporters can be characterized with knockout assays,[140,156] stress-based selection,[154,157] and biosensor-guided screening.[158,159] Furthermore, directed evolution[160] and ALE[161] have been recently employed for increasing the efficiency of transporters. Transporter engineering had been a relatively underused strategy, but is becoming an important strategy for the enhanced production of various chemicals.

Lipid and Membrane Morphology Engineering

The microbial production of hydrophobic chemicals, particularly large ones, is sometimes limited by the lack of accumulation space within the cell. As hydrophobic chemicals have a strong tendency to accumulate within the membranes because of the lipophilic nature of the membrane components, the products binding to or intercalating into the cell membrane can interfere with the normal functioning of membrane proteins, thereby further deterring the production of target hydrophobic compounds. One approach to address this problem is to increase the intracellular accumulation of triacylglycerols (TAGs),[162] which can serve to solubilize hydrophobic compounds (Figure 8H). Using this strategy, lycopene could be overproduced in S. cerevisiae by allowing lycopene accumulation within the lipid bodies.[163] In addition, TAG accumulation can also be advantageous in terms of storing energy because TAGs can be degraded to metabolic intermediates such as acetyl-CoA to provide high carbon flux toward polyketide production, as exemplified in Y. lipolytica strains.[164,165] Such oleaginous microorganisms are useful for accumulating hydrophobic compounds and deriving precursors and energy from accumulated lipids, as exemplified by the overproduction of lipophilic β-carotene.[166−168] Reshaping the cellular morphology is another approach that can influence the production of target chemicals (Figure 8I).[169] Cell shapes can be engineered to increase the cell volume or membrane area. One of the most representative examples is the morphology engineering of cells that intracellularly accumulate polyhydroxyalkanoates (PHAs), a family of biodegradable polymers. When large amounts of PHA granules accumulated inside the cells, the cells became filamentous and their growth was retarded. 
Overexpression of the ftsZ gene involved in cell division prevented cell filamentation, which resulted in an increased overall poly-3-hydroxybutyrate (PHB) productivity.[170] Similarly, the expression of mreB (involved in controlling cell elongation) and ftsZ increased PHB titer in Halomonas campaniensis.[171] As mentioned earlier, the accumulation of hydrophobic compounds within the cell membrane can inhibit cell growth and product formation.[172] Recently, natural rainbow colorants including astaxanthin, β-carotene, zeaxanthin, proviolacein, prodeoxyviolacein, violacein, and deoxyviolacein, corresponding to the red, orange, yellow, green, blue, navy, and purple colors, respectively, were produced in E. coli.[172] By engineering cells to produce intracellular membrane vesicles or outer membrane vesicles, production of these hydrophobic rainbow colorants could be dramatically increased.

Antibiotics-Free Systems

In strain development, plasmids are routinely used to introduce and test heterologous biosynthetic pathways in microbial hosts, but antibiotics are required to serve as a selection pressure to retain plasmids within the cells. The use of antibiotics is not desirable in industrial-scale fermentation because of safety issues, particularly when producing food and health-related products, in addition to the high cost. One method to avoid the use of antibiotics for the retention of biosynthetic pathways is to integrate these nonnative pathways into the genome. However, the integration of large biosynthetic gene clusters into genomes may pose challenges because of their size. Also, the gene expression levels are often lower than those expressed from plasmids. Several routinely used plasmid-retention systems include the toxin/antitoxin systems,[173] auxotrophic systems,[174] and operator/repressor titration systems (Figure 8J).[175] With such systems, only host cells carrying the plasmid that harbors the plasmid-retention system survive, which allows for the retention of the plasmids without the use of antibiotics. For example, the toxin/antitoxin system which uses the postsegregational killing hok/sok system was able to stably maintain the biosynthetic pathway toward astaxanthin production.[176] The engineered strain with the toxin/antitoxin system was capable of producing astaxanthin at a similar production titer and productivity without the addition of antibiotics. Another more recent strategy uses promoters with an engineered incoherent feedforward loop (iFFL) that allow gene expression to be constant at any copy number.[177] It was demonstrated that these stabilized promoters could retain the function of the deoxychromoviridans biosynthetic genes when moved from the plasmid into the genome without the need to retune gene expression.
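Why an incoherent feedforward loop can make expression insensitive to copy number can be seen in a deliberately simple model. The model is an assumption for illustration and is not taken from the cited study: each gene copy also encodes a repressor, the repressor level scales with copy number, and repression is noncooperative, so the output is a·c / (1 + r·c/K) for copy number c. The parameter names `a`, `r`, and `K` are hypothetical.

```python
def unregulated(copy_number, a=1.0):
    """Expression from a constitutive promoter scales with copy number."""
    return a * copy_number


def iffl_stabilized(copy_number, a=1.0, r=1.0, K=0.01):
    """Toy iFFL: every copy adds both template and repressor.

    With strong repression (r * copy_number / K much larger than 1), the
    output approaches a * K / r, which does not depend on copy number.
    """
    repressor = r * copy_number
    return a * copy_number / (1.0 + repressor / K)
```

Under these assumed parameters, moving from 1 copy (genome-like) to 100 copies (plasmid-like) changes the unregulated output 100-fold but the stabilized output by about 1%.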

Outlook for the Future

The strategies for developing microbial cell factories for the production of chemicals have evolved from random mutagenesis approaches and simple metabolic engineering to systems metabolic engineering that integrates metabolic engineering with various tools of systems biology, synthetic biology, and evolutionary engineering. With the development of advanced techniques to design and engineer microorganisms, a wide portfolio of chemicals can now be produced from renewable resources by employing microbial cell factories.[1] However, there remain several challenges in discovering new pathways and designing high-performance strains toward realizing a future of the sustainable production of chemicals. First, the construction of the predicted de novo pathways in microbial cell factories is still limited. Even though several retrobiosynthesis methods have been developed for the single-step reaction prediction, designing entire synthetic pathways is not yet straightforward. As the number of reaction steps required for the desired chemical synthesis increases, the number of possible combinations of reactions increases exponentially. Thus, it is important to screen and filter implausible reactions considering various factors including thermodynamic feasibility, toxicity, and economic feasibility of the reactions.[178] Optimizing the enzymes is another hurdle to be overcome for the successful employment of the predicted pathways. Simply finding enzymes that are capable of catalyzing a chemical reaction is not enough to direct sufficient metabolic flux toward the production of a desired chemical. Enzymes should function in the chosen host strain while ensuring several factors including good activity, solubility, and cofactor availability. Moreover, the substrate specificity and enzyme promiscuity need to be considered depending on the products of interest. 
De novo design of enzymes is a key strategy for finding the optimal balance between substrate specificity and enzyme promiscuity.[98] Recent developments in protein design using artificial intelligence enable the generation of non-natural enzyme sequences[179] and the prediction of the 3D structure of a protein from its sequence.[180,181] Such advances are expected to open new avenues for developing enzymes for de novo biosynthetic pathways.
Second, more nonmodel microorganisms should be explored for use as microbial cell factories for the production of chemicals. As discussed above, most microbial cell factories have relied on model organisms such as E. coli and S. cerevisiae, which are not always the optimal hosts for the production of target chemicals. Nonmodel microorganisms are increasingly being studied and used as hosts for chemical production. However, the lack of genetic engineering tools and parts for these organisms remains a hurdle to overcome.
Third, the pathways for the utilization of various carbon sources need to be explored and designed with the same rigor as biosynthetic pathways. Although we focused on constructing biosynthetic pathways toward the production of chemicals in this paper, the design and construction of carbon utilization pathways in microbial cell factories are equally important. There has recently been much interest in using carbon sources such as food wastes, carbon dioxide, syngas, and methane. Substantial progress has recently been made in constructing carbon-fixing synthetic pathways in E. coli.[182−184] More effort is needed to utilize various inexpensive and/or waste carbon sources more efficiently and thereby help achieve the net-zero (carbon neutral) goal.
Fourth, the design, build, test, and learn (DBTL) cycle for each iteration in constructing a microbial cell factory is still laborious and time-consuming for the average metabolic engineering laboratory because of the lack of standardized molecular parts and of the infrastructure needed for rapid building and testing of synthetic pathways. In response to the growing need for integrated infrastructure for the DBTL of engineered organisms, biofoundries are being established by research institutions and companies around the world to accelerate and enhance iterative DBTL cycles. With the automation and high-throughput equipment provided by these biofoundries, the DBTL cycle for developing microbial cell factories can be expedited.[185,186] The Global Biofoundry Alliance was established in 2019 to coordinate these efforts worldwide.[187]
Fifth, despite the many reports demonstrating the first production of chemicals using microbial cell factories, only a few biotechnology innovations traverse the “Valley of Death” to successful commercialization.[188] More collaborative efforts are needed between academia and industry to develop microbial cell factories and the corresponding scale-up bioprocesses. For successful commercialization, the entire project of developing microbial cell factories and bioprocesses should be based on techno-economic analysis (TEA) using tools such as the bioprocess TEA calculator.[189] Such TEA should be performed at an early stage of the project to ensure the successful design of microbial cell factories and their associated bioprocesses. When these hurdles are overcome, microbial cell factories will be the future plants for the sustainable and environmentally friendly production of chemicals, fuels, and materials.
In addition to realizing biobased, sustainable chemical industries, microbial cell factories will also serve as platforms for efficiently producing natural products that benefit human health as food, nutrition, and medicine. With the rapid advances we are observing in designing and developing efficient microorganisms, microbial cell factories are expected to become a key platform for manufacturing numerous chemicals and materials that are currently produced from fossil resources or extracted from plants and animals.
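To give a sense of the early-stage techno-economic analysis recommended above, the following back-of-the-envelope sketch relates yield and productivity to a unit production cost. It is not the bioprocess TEA calculator cited in ref 189. The cost structure (substrate cost plus a lumped annual fixed cost) and every number are assumptions chosen only for illustration.

```python
def substrate_cost_per_kg(substrate_price_per_kg, yield_kg_per_kg):
    """Substrate cost per kg of product, given yield in kg product per kg substrate."""
    if yield_kg_per_kg <= 0:
        raise ValueError("yield must be positive")
    return substrate_price_per_kg / yield_kg_per_kg


def annual_output_kg(productivity_g_per_l_h, working_volume_l, hours_per_year):
    """Annual product output in kg from volumetric productivity (g/L/h)."""
    return productivity_g_per_l_h * working_volume_l * hours_per_year / 1000.0


def unit_cost_per_kg(substrate_price_per_kg, yield_kg_per_kg,
                     productivity_g_per_l_h, working_volume_l,
                     hours_per_year, fixed_cost_per_year):
    """Very rough unit cost: substrate cost plus fixed costs spread over output."""
    output = annual_output_kg(productivity_g_per_l_h, working_volume_l,
                              hours_per_year)
    return (substrate_cost_per_kg(substrate_price_per_kg, yield_kg_per_kg)
            + fixed_cost_per_year / output)


# Hypothetical scenario: all values are placeholders, not data from the paper.
example_cost = unit_cost_per_kg(
    substrate_price_per_kg=0.4,     # assumed sugar price, $/kg
    yield_kg_per_kg=0.5,            # assumed product yield
    productivity_g_per_l_h=2.0,     # assumed volumetric productivity
    working_volume_l=100_000,       # assumed fermenter working volume
    hours_per_year=8000,            # assumed operating hours
    fixed_cost_per_year=1_600_000,  # assumed lumped capital + operating cost, $
)
print(f"estimated unit cost: {example_cost:.2f} $/kg")
```

Even a crude model like this shows which strain metric dominates the cost: the substrate term responds only to yield, while the fixed-cost term responds to productivity and scale. A real TEA would also account for downstream processing, utilities, and capital depreciation.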

185 in total

Review 1. Recent advances in engineering Corynebacterium glutamicum for utilization of hemicellulosic biomass.

Authors: Jae Woong Choi; Eun Jung Jeon; Ki Jun Jeong
Journal: Curr Opin Biotechnol Date: 2018-12-08 Impact factor: 9.740

Review 2. The coming of age of de novo protein design.

Authors: Po-Ssu Huang; Scott E Boyken; David Baker
Journal: Nature Date: 2016-09-15 Impact factor: 49.962

Review 3. Fusion tags to enhance heterologous protein expression.

Authors: Mi-Ran Ki; Seung Pil Pack
Journal: Appl Microbiol Biotechnol Date: 2020-01-28 Impact factor: 4.813

4. Prospecting Biochemical Pathways to Implement Microbe-Based Production of the New-to-Nature Platform Chemical Levulinic Acid.

Authors: Ana Vila-Santa; M Ahsanul Islam; Frederico C Ferreira; Kristala L J Prather; Nuno P Mira
Journal: ACS Synth Biol Date: 2021-03-25 Impact factor: 5.110