Metabolic Engineering for Production of Small Molecule Drugs: Challenges and Solutions

Production of small molecule drugs in a recombinant host is becoming an increasingly popular alternative to chemical synthesis or production in natural hosts such as plants, because microorganisms are easy to grow and can deliver higher titers at lower cost. While there is a wide variety of well-developed cloning techniques for producing small molecule drugs in a heterologous host, many challenges to efficient production remain. Therefore, this paper reviews recently developed tools for metabolic engineering and categorizes them according to a chronological series of steps for a generalized method of drug production in a heterologous host: 1) pathway discovery from a natural host, 2) pathway assembly in the recombinant host, and 3) pathway optimization to increase titers and yield.


Introduction
Small molecules derived from natural organisms offer a wide range of useful applications. One of the earliest and most important applications is using small organic molecules for medical purposes [1]. For much of human history, natural products were the most common source of all medicines. From the 1980s to the early 2000s, as the field of chemistry advanced, inorganic and chemically synthesized small molecules saw increased use, especially by means of combinatorial synthesis approaches that produce vast libraries of potentially bioactive moieties [2]. Nevertheless, despite the many new drug motifs created using these synthetic methods, very few of them had the desired bioactivity levels or made it through the early stages of testing [3]. This is largely because biologically produced drugs tend to have features of complexity that are difficult to create with synthetic methods (e.g., many chiral centers and polycyclic structures), such that the majority of drugs approved for use around the end of the 20th century and onward were, in fact, still natural or naturally derived compounds [4].
Many small molecule drugs can be produced by native hosts or by chemical routes. However, when native hosts are used, the rarity of the host organism, its slow growth rate, and the low concentration of these secondary metabolites often make production economically infeasible [5]. In addition, natural hosts are often far more difficult to genetically modify than a model microorganism such as Escherichia coli or Saccharomyces cerevisiae, which hinders the improvement of product yields via genetic engineering. Drug synthesis by the chemical route, as mentioned above, is not always an efficient alternative because of the complexity of some of these organic molecules, which have many chiral centers and different functional groups. Because of these limitations, an alternative approach, which synthesizes small molecule drugs in a heterologous host (e.g., E. coli or S. cerevisiae) by expressing the biosynthetic pathways, is attracting increasing attention for the production of pharmaceuticals [6,7]. Compared to native hosts, a well-studied heterologous host uses cheaper feedstock, is more robust, is easier to grow, and has a panel of well-designed genetic tools that allow efficient genetic modification to achieve high production. Compared to chemical synthesis, production in a heterologous host is often less costly, avoids dangerous operating conditions, and could potentially achieve higher enantioselectivity. Recently, two small molecule drugs, artemisinic acid [8] and opioids [9], have been successfully produced in S. cerevisiae by using exogenous genes and a combination of metabolic engineering techniques such as enhancing the precursor supply and down-regulating side pathways. The yield of artemisinic acid reached ~0.16 g/g (~50% of the theoretical yield using glucose). Similarly, E. coli has recently been engineered to produce taxadiene, an important precursor for taxol synthesis, at an impressive titer of 1 g/L and a yield of 0.07 g/g (~25% of the theoretical yield using glycerol) using a metabolic engineering approach to balance the taxadiene biosynthesis pathway [10].
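The yield figures quoted above can be related to each other with simple arithmetic. The following is a minimal sketch, assuming the rounded values in the text (~0.16 g/g at ~50% and 0.07 g/g at ~25%) are treated as exact and that yield is defined as grams of product per gram of substrate consumed; the function names are illustrative and do not come from the cited studies.

```python
def implied_theoretical_yield(reported_yield: float, fraction_of_theoretical: float) -> float:
    """Back-calculate the theoretical maximum yield (g product / g substrate)
    implied by a reported yield and the fraction of theoretical it represents."""
    if not 0.0 < fraction_of_theoretical <= 1.0:
        raise ValueError("fraction_of_theoretical must be in (0, 1]")
    return reported_yield / fraction_of_theoretical


def substrate_per_litre(titer_g_per_l: float, yield_g_per_g: float) -> float:
    """Substrate consumed per litre of broth to reach a titer at a given overall yield."""
    if yield_g_per_g <= 0.0:
        raise ValueError("yield_g_per_g must be positive")
    return titer_g_per_l / yield_g_per_g


# Values taken from the text; the '~' in the source means these are approximate.
artemisinic_max = implied_theoretical_yield(0.16, 0.50)   # glucose as substrate
taxadiene_max = implied_theoretical_yield(0.07, 0.25)     # glycerol as substrate
glycerol_needed = substrate_per_litre(1.0, 0.07)          # for a 1 g/L taxadiene titer

print(f"implied max yield, artemisinic acid: {artemisinic_max:.2f} g/g")
print(f"implied max yield, taxadiene:        {taxadiene_max:.2f} g/g")
print(f"glycerol consumed per litre:         {glycerol_needed:.1f} g")
```

Under these assumptions, the implied theoretical maxima are roughly 0.32 g/g and 0.28 g/g, and a 1 g/L taxadiene titer corresponds to about 14 g of glycerol consumed per litre. These are back-of-the-envelope figures, not values reported in [8] or [10].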
Figure 1. The three major steps in metabolic engineering for production of small molecule drugs: pathway discovery, pathway assembly and pathway optimization. In general, the first step in metabolic engineering is to identify the genes involved in the synthesis of the target molecule. If a particular step in the pathway is unknown, several methods can be used to identify likely candidate genes, such as genome comparisons between drug-producing organisms and closely related non-producers. Other challenges related to pathway identification include finding silent pathways and creating de novo pathways for drugs with unnatural modifications. Once all relevant genes have been identified, the next challenge is to functionally express these genes in a suitable host. The assembly of long pathways can be assisted by assembly techniques such as Golden Gate cloning and DNA assembler. Poor enzyme expression can be overcome by techniques such as codon optimization. Finally, pathway optimization is conducted to identify and resolve bottlenecks with various forms of static or dynamic regulation. Transport limitation of intermediates can be overcome by metabolic engineering strategies such as compartmentalization.
Although drug synthesis in a heterologous host holds great promise, efficient production is not always easy to achieve because of challenges such as unknown production pathways, poor gene expression in the heterologous host, or bottlenecks in an unbalanced biosynthesis pathway. Considerable effort has been devoted to overcoming these issues, for example by creating easier expression vector assembly techniques [11] and employing pathway balancing algorithms [10]. Overall, producing small molecule drugs in heterologous hosts offers a potentially much more efficient alternative when neither production in the natural host nor chemical synthesis is satisfactory. In this review, we focus on the synthetic biology techniques related to pathway discovery, pathway assembly and pathway optimization for engineering heterologous hosts to produce small molecule drugs, as outlined in Figure 1. Post-production factors, such as drug extraction and purification, also have a critical impact on a drug's cost but are not reviewed here, for two reasons: they fall more into the realm of chemical or bioprocess engineering, and increasing efficiency in the production stage benefits the biosynthesis of small molecule drugs regardless of the downstream processes. We summarize common challenges for heterologous production of small molecule drugs and then review both the conventional approaches and the recent techniques that address each of these challenges, with specific examples listed in Table 1. We also offer our perspectives on some cutting-edge breakthroughs in synthetic biology and their potential applications in assisting metabolic engineering for the production of small molecule drugs.

Pathway Discovery
The first step in engineering a heterologous host for the production of small molecule drugs is to transfer the drug synthesis genes to an industrial microorganism. For some drugs, the biosynthetic pathways are well studied, while in many other cases the exact metabolic pathway [12,13] or the functional enzymes associated with the biosynthetic pathway are unknown [14,15]. Therefore, identifying the correct pathway to express in a heterologous host is one of the most important steps in metabolic engineering for the production of a small molecule drug.

Unknown Route
One of the conventional methods for discovering gene function is the use of gene knockout mutations. This method is generally better suited to discovering the unknown function of a given gene than to finding an unknown gene for a given function, but for pathway discovery it is still an important technique for confirming a gene's function once a likely candidate is found. For instance, the gene cluster responsible for the production of shanorellin in C. globosum was recently discovered by noting increased production upon activation of the transcription factor CgsA. The importance of that cluster was confirmed by knockout experiments for each element of the cluster, and the accumulation of various intermediates revealed the pathway [16]. 13C-assisted pathway analysis can also help determine which intermediates are involved and, potentially, in what order the reactions happen [17]. This method was used to determine the series of chemical reactions responsible for the production of phomoidride B from soil fungi [18], but one limitation is that it does not identify the exact enzymes and genes that would need to be transferred to a new host for heterologous expression. Knowledge of the intermediates, however, can help direct the search for uncharacterized proteins with similarity to enzymes capable of catalyzing similar reactions.
Currently, many sequence analysis and data-mining techniques are being developed for pathway discovery of small molecule drugs. With the rapid advances in high-throughput sequencing technologies and the accessibility of large sequence databases, there is a shift towards using computational methods to predict biosynthesis pathways. These methods generally follow two strategies: 1) similarity searches that screen for certain motifs expected to exist for the given reaction, and 2) comparative genomics techniques in which the producing organism is compared to closely related species that cannot produce the drug in question or to organisms that produce similar natural products. As one example of similarity searches, novel enediynes were discovered among several actinomycetes by searching their genomes for the "warhead chromophore" motif characteristic of an enediyne-producing polyketide synthase (PKSE) [19,20]. In another case study of similarity searches, an orphan gene cluster in Pseudomonas fluorescens was identified as a potential producer of lipopeptides, which led to the discovery of an orfamide A-producing pathway [21]. In addition to similarity searches, the comparative genomics approach has been widely applied to the discovery of novel biosynthesis pathways of small molecule drugs. For example, novel paclitaxel-producing genes in Penicillium aurantiogriseum were discovered and found to be quite different from the analogous paclitaxel genes in Taxus species [22]. Also, the genes responsible for rifamycin production in Amycolatopsis mediterranei were uncovered by comparing the genome sequences of A. mediterranei to those of closely related non-producers [23]. In another case study of the comparative genomics approach, a library of non-producers was created from the producing organism, Papaver somniferum, by random mutagenesis and used to identify the enzyme that transforms thebaine into morphine [24]. Comparative transcriptomics can also be used when production of the drug is known to be preferentially localized to a certain tissue or expressed only under certain conditions. One study used transcriptomics and knowledge of some of the genes related to taxane production to identify the root of Taxus plants as the primary site of taxane production. This knowledge was then used to guide the search for the remaining taxane production genes among transcripts preferentially expressed in the root [25].
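The two computational strategies described above can be illustrated with a toy sketch. It assumes protein sequences are available as plain strings and that gene families have already been assigned (for example, by an orthology tool); the motif pattern, sequence names, and family labels below are invented for illustration and are not real PKSE signatures or real genomes.

```python
import re


def find_motif_hits(proteins: dict[str, str], motif_regex: str) -> list[str]:
    """Strategy 1 (similarity search): return names of proteins whose
    sequence contains a match to the given motif pattern."""
    pattern = re.compile(motif_regex)
    return sorted(name for name, seq in proteins.items() if pattern.search(seq))


def candidate_gene_families(producers: list[set[str]],
                            non_producers: list[set[str]]) -> set[str]:
    """Strategy 2 (comparative genomics): gene families present in every
    producer genome and absent from every non-producer genome."""
    if not producers:
        return set()
    shared = set.intersection(*producers)
    seen_in_non_producers = set().union(*non_producers)
    return shared - seen_in_non_producers


# Hypothetical data: three short "proteins" and a made-up motif C-x-x-C-x-H.
proteins = {"orfA": "MKTCAACLHGG", "orfB": "MSTNPLLK", "orfC": "GGCWQCPHAA"}
hits = find_motif_hits(proteins, r"C..C.H")

# Hypothetical gene-family content of two producers and one non-producer.
producers = [{"famA", "famB", "famC"}, {"famA", "famC", "famD"}]
non_producers = [{"famA", "famE"}]
candidates = candidate_gene_families(producers, non_producers)

print(hits)
print(candidates)
```

Real pipelines rely on profile-based searches and curated orthology rather than a regular expression and exact set arithmetic, so this sketch only shows the logic of each strategy. Candidates found either way still need experimental confirmation.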
While bioinformatics techniques have been used to predict the genes involved in the pathway for a particular drug, those predictions must still be confirmed by traditional knockdown phenotype analysis. Such a step can be problematic if the natural producing organism is not easy to cultivate in a laboratory setting or is generally difficult to manipulate genetically. Thus, it is desirable to transfer the suspected genes or gene clusters to a heterologous host as part of pathway discovery, a method that has been utilized for novel drug discovery [26]. It should also be noted that full pathway elucidation is not always required for subsequent metabolic engineering. It may be enough to transfer the entire gene cluster of interest without identifying the role of each constituent [27]. It is also possible to perform pathway optimization (the subject of Section 4) without explicit pathway information via engineering approaches such as directed evolution [28].

Silent Pathway
Another major challenge to pathway discovery occurs when drug production is only activated under certain conditions which are either poorly understood or difficult to replicate in a laboratory setting. In prokaryotes, these silent pathways for the production of secondary metabolites are often grouped into a gene cluster under the control of one regulatory element. The endogenous producer of the drug may have multiple cryptic gene clusters that could be responsible for the drug in question, and these clusters are often the focus of sequence analysis methods such as those from the preceding section. Many of the same bioinformatics techniques are still relevant, especially since phenotype-based methods are less useful for silent genes.
Once bioinformatics techniques reveal multiple gene candidates that could be responsible for drug production, it becomes necessary to activate those silent genes or gene clusters for phenotype or transcriptome comparisons [29,30]. Activation methods include fusing new promoters to silent gene clusters, as in the discovery of the novel polyketide asperfuranone [31]; preventing heterochromatin formation, which activated the production of monodictyphenone in Aspergillus nidulans [32]; and co-incubation with microbial consortia to mimic conditions in nature [33], a process which was instrumental in discovering dihydrofarnesol production in Candida albicans [34] as well as in finding a variety of products only produced by co-cultures of marine actinomycetes with their natural competitors [35]. Recently, a plug-and-play synthetic biology approach was developed for the activation and characterization of silent pathways and applied to activate a cryptic polycyclic tetramate macrolactam (PTM) biosynthetic gene cluster, sgr810-815, from Streptomyces griseus [36]. Despite the emerging significance of PTMs in basic and applied biology, the mechanisms for the key steps in PTM biosynthesis, such as installation of two polyketide chains and subsequent formation of the polycyclic system, have remained elusive because the pathway is silent under typical fermentation conditions in either the native or a heterologous host. To elucidate the biosynthetic steps of this gene cluster, the authors first determined its exact boundary by bioinformatics analysis of the upstream and downstream regions, and then selected a set of well-characterized promoters for gene cluster reconstruction in the heterologous host. In this way, the silent PTM gene cluster was successfully activated and three novel PTM compounds were discovered. This strategy has been demonstrated to be simple, generally applicable and potentially scalable for studying other silent natural product gene clusters.

Semi-Synthesis
While many studies focus primarily on natural product drugs, structural modifications of natural drugs are also regularly used to increase drug performance or to counter resistance factors [37,38]. For instance, paclitaxel derivatives have been developed to combat taxane-resistant tumors [39]. Derivatives of the anti-cancer drug oridonin with improved solubility were created by adding hydrophilic amino acids and carboxyl groups [40]. Because these modified drugs have no natural producer, they are often made either by complete chemical synthesis, if the molecule is relatively simple, or by semi-synthesis, in which the natural compound is made biologically, extracted and subsequently modified in vitro. Therefore, using an incomplete pathway rather than a complete pathway could offer unique solutions for drug modification and production. For example, the anti-malarial drug coartem is produced by the reduction of the natural product artemisinin [41], whose microbial production was achieved in Jay Keasling's milestone work on engineering microorganisms to produce artemisinin [42]. In another case study, neomangiferin, an anti-diabetic drug produced from Gentiana asclepiadea, was produced by the O-glycosylation of mangiferin, an intermediate that can be more easily extracted from Mangifera indica [43]. Overall, semi-synthesis is preferable to complete chemical synthesis and is viable when only one or two modifications are required with little danger of side products [44]. As the complexity of the product increases, however, especially with respect to chiral centers among the post-biological modifications, it becomes increasingly important to consider ways to achieve complete biological synthesis.
Creating a new pathway for a novel drug derivative may be achievable with existing proteins from various sources. To this end, computational tools are being developed to generate potential pathways from database information on possible enzymatic reactions [45,46]. This method only works in cases where the modification can be catalyzed by natural enzymes but, even in cases where the modification has no natural analogue, synthetic biology offers avenues for new catalytic capabilities via protein engineering. Rational design of novel enzymes has been sought after for many years [47,48]. Recently, a novel enzyme was created for producing the synthetic anti-diabetic drug sitagliptin by a combination of rational design and directed evolution applied to a transaminase from an Arthrobacter species [49]. Directed evolution was also used to create andrimid derivatives in Pantoea agglomerans [50]. Another method used to generate novel enzymes is combinatorial domain swapping [51], which has been used to generate polyketide derivatives in Aspergillus nidulans [52]. Indeed, combinatorial methods offer a way to produce hundreds of natural product derivatives at the same time [51].
By combining the technologies of enzyme discovery, enzyme engineering, and pathway and strain optimization, a significant breakthrough was reported for the complete biosynthesis of opioids in yeast [53]. Opioids are the primary drugs used in pain management and palliative care, and farming of opium remains the sole source of these essential medicines. The S. cerevisiae strains engineered to produce the selected opioid compounds thebaine and hydrocodone from sugar required the expression of 21 (thebaine) and 23 (hydrocodone) enzyme activities from plants, mammals, bacteria, and yeast itself. Functionally expressing the >20 heterologous genes required for complete biosynthesis of these complex molecules has been challenging because of the decreases in titer observed with each additional enzymatic step. To achieve this, the researchers first built an S. cerevisiae strain to produce the key biosynthetic intermediate (S)-reticuline, which can be converted to many downstream products including the morphinans. This long pathway was designed in several genetic modules as shown in Figure 2: (I) the precursor overproduction module; (II) the tetrahydrobiopterin (BH4) module; (III) the (S)-norcoclaurine module; and (IV) the (S)-reticuline module, as well as a bottleneck module (V) with extra copies of the enzymes for rate-limiting steps. Then a panel of the key enzymes 1,2-dehydroreticuline synthase (DRS) and 1,2-dehydroreticuline reductase (DRR) from different species was screened, and module VI was constructed to convert (S)-reticuline to the morphinan alkaloid thebaine. Despite the still-low concentrations of the final product, this study highlights the potential of yeast as a chassis for the modular design of a complete biosynthesis pathway to produce bio-based, complex small molecule drugs.

Pathway Expression
Once the appropriate pathway and all relevant genes have been identified, the next challenge for production of small molecule drugs is to functionally express these genes in a suitable host with relatively balanced stoichiometry of expression. Several challenges have been found in heterologous gene expression, including construction of long biosynthesis pathways and poor enzyme expression.
Figure 2. (A) Modular design of the opioid biosynthesis pathway in S. cerevisiae [53]. (B) E. coli-S. cerevisiae co-culture for production of oxygenated taxanes. Taxadiene production was first achieved in E. coli using two modules: the upstream non-mevalonate pathway (MEP) ending in the production of dimethylallyl pyrophosphate (DMAPP) and a downstream pathway producing the chemotherapy drug intermediate taxadiene [10]. By performing a multivariate regulation of these two modules, a local maximum was discovered for the production of taxadiene. Oxygenated taxanes were next produced in an E. coli-S. cerevisiae co-culture [54], in which the taxadiene produced by E. coli is taken up by S. cerevisiae and converted into more complex (and valuable) taxanes. The co-culture is kept stable by a mutualistic feeding strategy wherein the substrate xylose can be used by E. coli, but not by S. cerevisiae. The E. coli then produces acetate, which is consumed by S. cerevisiae so that it does not accumulate to levels toxic to E. coli.

Long Biosynthesis Pathway
While some drug-producing pathways involve the expression of only one or two genes in a heterologous host, most drugs require the addition of entire gene clusters or many (>10) genes from different parts of the source organism's genome. Plasmids large enough to carry such pathways can be unstable over generations and impose a metabolic burden upon the host. To solve this, integrative vectors and synthetic chromosomes have been developed to incorporate foreign DNA into the host genome [4,55], such as those employed to transfer the epothilone cluster into Streptomyces coelicolor [56].
The construction of such complex vectors presents another challenge. Whether the genes are collected from multiple loci in the genome or the gene cluster needs to be broken down and rebuilt with alternate regulatory elements, it can be quite laborious to construct one vector with the entire pathway using conventional digestion and ligation methods. Recently, several synthetic biology approaches have been developed to facilitate the construction of a vector with many constituents. For example, Golden Gate cloning [57] uses the type IIS restriction endonuclease BsaI, which has a sequence-specific recognition site but a sequence-independent cut site, to achieve "one-step" digestion and ligation of vectors. Because BsaI cuts outside its recognition site, several distinct overhang sequences can be rationally designed for the same reaction vessel such that each PCR-generated insert is incorporated at a programmed position. This method was used to assemble vectors to express various diterpenes, including cembratrien-ol in Nicotiana benthamiana [58]. In addition to Golden Gate cloning, another assembly technique, namely DNA assembler [11,59], exists for eukaryotic hosts and uses homologous recombination. In general, PCR-generated genetic inserts are designed to include overlapping ends, which are then fused in a host with high recombination rates such as S. cerevisiae. This method is gaining popularity as an alternative to ligation and has recently been employed to construct vectors for the expression of morphinan in S. cerevisiae [60]. Even when yeast is not the host for producing drugs, it can still be used as an intermediate host for constructing the desired vectors. Such recombination can also be carried out in E. coli with bacteriophage-derived Red/ET recombination tools [61] or lambda bacteriophage recombination. Such assembly techniques can also be used to generate combinatorial libraries for natural product derivatives [62].
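The design constraints behind a one-pot Golden Gate reaction can be sketched as a small sanity check. This is a minimal sketch, not part of any cited protocol: it assumes the BsaI recognition sequence GGTCTC with 4-nt overhangs, and it encodes three commonly applied design rules (inserts should carry no internal BsaI site, overhangs should not be palindromic, and no overhang should equal another overhang or its reverse complement). The function names and the example overhangs are illustrative.

```python
BSAI_SITE = "GGTCTC"  # BsaI recognition sequence; the enzyme cuts outside this site
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def has_internal_bsai_site(insert: str) -> bool:
    """True if the insert contains a BsaI site on either strand."""
    s = insert.upper()
    return BSAI_SITE in s or reverse_complement(BSAI_SITE) in s


def check_overhangs(overhangs: list[str]) -> list[str]:
    """Return a list of human-readable problems with a set of 4-nt overhangs."""
    problems = []
    seen: set[str] = set()
    for raw in overhangs:
        oh = raw.upper()
        rc = reverse_complement(oh)
        if len(oh) != 4:
            problems.append(f"{oh}: expected a 4-nt overhang")
        if oh == rc:
            problems.append(f"{oh}: palindromic, can ligate to itself")
        if oh in seen or rc in seen:
            problems.append(f"{oh}: clashes with an earlier overhang")
        seen.add(oh)
    return problems


# Illustrative junction overhangs for a three-part assembly (hypothetical design).
print(check_overhangs(["AATG", "GCTT", "TACA"]))
print(check_overhangs(["AATG", "CATT", "GATC"]))
print(has_internal_bsai_site("ATGGGTCTCAAA"))
```

Dedicated design tools also account for ligation fidelity between near-matching overhangs, which this sketch ignores.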

Poor Enzyme Expression
When expressing heterologous genes in a new host, it is important to consider the differences between the host and source organisms [4]. The heterologous host may not have the same post-translational modifications [63] or codon usage [64]. Many of these challenges can be avoided by carefully choosing a host that is closely related to the drug producer or produces a similar secondary metabolite yet is more amenable to genetic manipulation [65]. For example, many pharmaceutical polyketides are produced by Streptomyces. Researchers have used a model organism for this genus, Streptomyces coelicolor, to produce a strain suited for heterologous expression of polyketide gene clusters from closely related natural producers [66]. However, when no ideal host can be selected to match the heterologous pathway, the enzymes often need to be altered so that they are efficiently expressed in the new host.
For expression in a distantly related heterologous host, it is often necessary to alter the codon usage, either because the original codon pairs with a tRNA that is very rare in the heterologous host, which slows protein translation, or because the codon codes for an entirely different amino acid in the new host [67]. Codon usage preferences for the most regularly used expression systems are well characterized and many automated optimization algorithms are available, such as OptimumGene™, which was used in one study to increase the expression of pigment epithelium-derived factor (PEDF), an anti-tumorigenic protein, in E. coli by three-fold [68]. In another study, the MelC1 protein from Streptomyces avermitilis, which must be co-expressed with MelC2 in a heterologous host, needed to be codon-optimized before it could be expressed in E. coli and subsequently re-engineered at its binding site [69].
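The simplest form of codon optimization, swapping codons that are rare in the host for synonymous ones, can be sketched as follows. This is a toy illustration and not a description of how OptimumGene or any other commercial algorithm works. The substitution table is an assumption: it lists a few codons widely regarded as rare in E. coli (AGG, AGA, CTA, ATA) together with synonymous replacements chosen for illustration, and a real workflow would use a full, measured codon usage table for the chosen host.

```python
# Illustrative subset only: rare codon -> synonymous replacement.
# AGG/AGA/CGT encode Arg, CTA/CTG encode Leu, ATA/ATT encode Ile.
RARE_TO_PREFERRED = {
    "AGG": "CGT",
    "AGA": "CGT",
    "CTA": "CTG",
    "ATA": "ATT",
}


def split_codons(cds: str) -> list[str]:
    """Split a coding sequence into codons, requiring a length divisible by 3."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return [cds[i:i + 3] for i in range(0, len(cds), 3)]


def replace_rare_codons(cds: str) -> str:
    """Replace each listed rare codon with its synonymous replacement."""
    return "".join(RARE_TO_PREFERRED.get(codon, codon) for codon in split_codons(cds))


def count_rare_codons(cds: str) -> int:
    """Number of codons in the sequence that appear in the rare-codon table."""
    return sum(codon in RARE_TO_PREFERRED for codon in split_codons(cds))


original = "ATGAGGCTAATAAAA"  # hypothetical 5-codon fragment: Met-Arg-Leu-Ile-Lys
optimized = replace_rare_codons(original)
print(original, count_rare_codons(original))
print(optimized, count_rare_codons(optimized))
```

Because every substitution is synonymous, the encoded protein is unchanged. Production-grade algorithms typically weigh additional factors (for example mRNA structure and GC content) that this sketch leaves out.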
Another challenge in enzyme expression is the difference in post-translational modification capabilities between the source organism and the heterologous host. When the enzymes needed for drug production require post-translational modifications that the heterologous host cannot perform, it becomes necessary to introduce additional genes for that capability. For example, cinnamycin (a lantibiotic) was produced in E. coli by the expression of the CinA gene from Streptomyces cinnamoneus [70]. CinA produces a peptide precursor to cinnamycin that needs several post-translational modifications, which necessitated the co-expression of CinX, CinM and Cinorf7 to carry out the hydroxylation of aspartate, the formation of three (Me)Lan bridges, and the cross-linking of lysine to dehydroalanine, respectively. In another study, heterologous genes for the production of erythromycin were introduced into E. coli, but one of the proteins, deoxyerythronolide B synthase, required pantetheinylation, which E. coli cannot perform naturally [71]. Recently, microbial synthesis of benzylisoquinoline alkaloids (BIAs), a diverse family of plant-specialized metabolites that includes the pharmaceuticals codeine and morphine and their derivatives, has shown promise as an alternative to traditional crop-based manufacturing [72]. The authors reported a major breakthrough in microbial BIA production by developing a yeast strain that synthesizes the key intermediate (S)-reticuline from glucose. They first identified a cytochrome P450 from the sugar beet Beta vulgaris, described as the first example of such an enzyme capable of L-tyrosine hydroxylation. Then, by applying PCR mutagenesis, the researchers improved the activity of the wild-type enzyme and increased the titers of the BIA precursors L-DOPA and dopamine by 2.8-fold and 7.4-fold, respectively.

Pathway Optimizations
The final step in metabolic engineering, once the pathway is assembled in a heterologous host and production of the target drug molecule is confirmed, is to maximize drug production. To raise the titer and yield of the drug to a level satisfactory for large-scale production, two main challenges need to be overcome: pathway bottlenecks and transport limitations of intermediates.

Pathway Bottlenecks
The rate-limiting step for drug production may exist somewhere in a dedicated pathway [73] or among primary metabolite production and regeneration [74]. Either of these forms of metabolic bottleneck must be identified and addressed to ensure a steady flow of metabolites towards the desired product, since over-expression of all genes in a pathway often does not lead to optimum results. For instance, in an effort to increase the production of FK506 (an immunosuppressive polyketide) in Streptomyces tsukubaensis, five genes for precursor generation were over-expressed, which only increased the titer by 40% [75]. By using a combinatorial strategy to fine-tune the gene expression levels, however, the titer of FK506 was increased by 140% over the wild type. Since pathway bottlenecks often lead to intermediate accumulation or precursor depletion, methods such as GC/MS, HPLC, and isotope labeling techniques [76,77] can identify perturbations caused by the exogenous pathway. 13C-metabolic flux analysis (13C-MFA) [78,79] is particularly useful for uncovering pathway bottlenecks in the host microorganism. For example, 13C-MFA was used to identify bottlenecks at five metabolite nodes for the production of daptomycin in Streptomyces roseosporus [80], which allowed for rational feed enhancement and an overall increase in daptomycin production.
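The idea that bottlenecks show up as intermediate accumulation or precursor depletion can be turned into a simple screening step on measured metabolite levels. The sketch below assumes that relative concentrations (for example, from GC/MS or HPLC peak areas) are available for a control strain and an engineered strain. The two-fold threshold, the metabolite names, and the numbers are hypothetical, and this is not a substitute for 13C-MFA, which estimates fluxes rather than pool sizes.

```python
import math


def flag_perturbed_metabolites(control: dict[str, float],
                               engineered: dict[str, float],
                               log2_threshold: float = 1.0) -> dict[str, str]:
    """Label metabolites whose level changed by at least 2**log2_threshold-fold
    between the control and the engineered strain."""
    flags = {}
    for name in sorted(control.keys() & engineered.keys()):
        c, e = control[name], engineered[name]
        if c <= 0 or e <= 0:
            continue  # log ratio undefined; such values need separate handling
        log2_fold_change = math.log2(e / c)
        if log2_fold_change >= log2_threshold:
            flags[name] = "accumulated"
        elif log2_fold_change <= -log2_threshold:
            flags[name] = "depleted"
    return flags


# Hypothetical relative levels (arbitrary units).
control = {"precursor": 10.0, "intermediate_1": 2.0, "intermediate_2": 1.0}
engineered = {"precursor": 2.0, "intermediate_1": 9.0, "intermediate_2": 1.2}

print(flag_perturbed_metabolites(control, engineered))
```

In this toy data set, the accumulation of `intermediate_1` would point to the step consuming it as a candidate bottleneck, while the depleted `precursor` suggests a supply limitation upstream.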
The traditional solutions for tackling pathway bottlenecks include 1) increasing the expression of the enzyme at the rate-limiting step by using stronger promoters; 2) introducing more gene copies, as in studies to increase the production of isoflavone [81] or coenzyme Q [82]; and 3) down-regulating or knocking out side pathways and feedback regulators, as in a study on deoxyviolacein production in E. coli, in which tryptophan precursor levels were enhanced by knocking out tryptophan repression genes and tryptophan catabolism [83]. Expression levels may also be enhanced indirectly by over-expressing regulatory elements, a method used to increase avermectin production in its native producer Streptomyces avermitilis [84]. However, the benefits of these static regulations can be offset by increased metabolic burdens [85]. Recently, two methods, namely combinatorial promoter engineering [86] and dynamic control [87], have been developed to avoid the metabolic burden issues. For example, the production of para-hydroxybenzoic acid (PHBA) in S. cerevisiae was controlled by an exogenous quorum-sensing regulation system such that production of PHBA did not cause unnecessary metabolic burden during the growth phase [88]. Dynamic regulation methods can also improve drug production by limiting the host's exposure to toxic intermediates. For example, heterologous production of amorphadiene in E. coli was increased by a factor of two by using a general stress response-controlled promoter, PgadE, instead of the native promoter for expressing the gene for farnesyl diphosphate production [89].
Directed evolution can also be applied to solve bottleneck issues [28]. In one study, theophylline production in E. coli was increased by linking molecule production to fitness in a selective medium [90]. This was achieved by placing the tetracycline resistance gene under the control of a theophylline riboswitch. Indeed, directed evolution for pathway optimization is gaining a second wave of popularity via inverse metabolic engineering [91]. Another method for optimizing the drug pathway is to use a cell-free system rather than a heterologous host organism [92]. Although beyond the scope of this review, the cell-free system is worth mentioning as an alternative that can bypass many of the concerns with expression and optimization in a heterologous host. On the other hand, the cell-free system also loses many of the benefits of cell-based production, e.g., ease of scaling up.
While these optimization techniques are impressive when used singly, they are more effective when used in combination. A study on isoprenoid pathway optimization for producing the taxol precursor taxadiene in E. coli demonstrated the great potential of modular pathway engineering for the production of terpenoid natural products [10]. The metabolic pathway for taxadiene consists of an upstream native isoprenoid pathway and a heterologous downstream terpenoid pathway. In order to systematically optimize the taxadiene metabolic pathway, the authors partitioned it into two modules: a native upstream methylerythritol-phosphate (MEP) pathway forming isopentenyl pyrophosphate and a heterologous downstream pathway forming terpenoid, as seen in Figure 2B. A systematic multivariate search for conditions that best balance the two pathway modules identified an optimized combination that maximized taxadiene production with minimal accumulation of the inhibitory compound indole. This multivariate search optimization boosted the titer to 1 g/L in fed-batch bioreactor fermentation, a 100-fold increase over the original strain. Traditional metabolic engineering approaches often ignore nonspecific effects such as the toxicity of intermediate metabolites, adverse cellular effects of the vectors used for expression, and metabolites that may compete with the main pathway. The combinatorial approach can overcome such problems because it offers the opportunity to broadly sample the parameter space and bypass these complex nonlinear interactions.
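The multivariate search described above can be sketched as an exhaustive scan over expression levels for the two modules. The following is a minimal illustration only: the expression levels, the scoring function `toy_titer`, and all of its coefficients are hypothetical and are not taken from the cited study. In practice each score would come from a measured titer of a constructed strain.

```python
from itertools import product

# Hypothetical relative expression strengths for each module (arbitrary units).
UPSTREAM_LEVELS = [1, 2, 5, 10]
DOWNSTREAM_LEVELS = [1, 2, 5, 10]


def toy_titer(upstream: float, downstream: float) -> float:
    """Toy scoring model, not fitted to any data.

    Flux is limited by the weaker module. Surplus upstream flux is
    penalized as a stand-in for accumulation of an inhibitory by-product,
    and total expression carries a quadratic metabolic-burden cost.
    """
    flux = min(upstream, downstream)
    overflow = max(upstream - downstream, 0)
    burden = 0.02 * (upstream + downstream) ** 2
    return flux - 0.5 * overflow - burden


def search_modules(upstream_levels, downstream_levels, score=toy_titer):
    """Score every upstream/downstream combination and return the best one."""
    results = {
        (u, d): score(u, d)
        for u, d in product(upstream_levels, downstream_levels)
    }
    best = max(results, key=results.get)
    return best, results


best, results = search_modules(UPSTREAM_LEVELS, DOWNSTREAM_LEVELS)
print("best (upstream, downstream):", best)
```

In this toy model the highest score falls at an intermediate, balanced setting rather than at maximal expression of both modules, which is the kind of nonlinear behavior a combinatorial scan is intended to expose.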

Transport Limitation of Intermediates
Enzyme expression is not the only factor affecting the overall pathway reaction rate. The diffusion, transport and localization of drug intermediates is another pivotal factor that determines productivity but has not yet been adequately addressed. In fact, the complete pathway for producing a complex drug may often include steps that are not compatible with the optimal environments for other steps [93]. Therefore, by localizing different parts of the pathway in different organisms, a particular drug could be better produced in a symbiotic relationship, a scenario that inspired the production of pharmaceuticals using a combination of heterologous hosts. For instance, the cytochrome P450s responsible for the oxygenation of taxadiene are poorly expressed in E. coli, in spite of a sufficient supply of taxadiene. To solve this, a co-culture of a recombinant E. coli engineered to produce taxadiene and a recombinant S. cerevisiae that could effectively express the cytochrome P450s was constructed (Figure 2B), and taxanes were efficiently produced using this localized pathway [54].
For pathways with toxic intermediates, compartmentalization provides a way to mitigate damage to the host organism [94]. Even when there is no need to localize pathway elements into a specific compartment for compatibility reasons, there is still the innate advantage of an increased local concentration if the enzymes for the pathway are relocated into a smaller cellular compartment. For these reasons, methods of drug production pathway re-localization have been developed for chloroplasts [95], peroxisomes [96], and mitochondria [97]. In the latter study, heterologous enzymes for the production of valencene and amorphadiene were re-localized via targeting signal peptides to the mitochondria of S. cerevisiae, in which the important precursor, farnesyl diphosphate (FDP), is produced. By moving these genes to the mitochondria, the authors were able to increase the local concentration of heterologous enzymes and their access to the precursor pool, and to bypass the need for transport of FDP into the cytosol. Using this method, the production of valencene was increased three-fold relative to cytosolic heterologous expression. In addition to eukaryotic organelles, proteins could be targeted for localization to prokaryotic carboxysomes, although this method has not yet been applied to producing small molecule drugs [98].
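The local concentration argument above is simple arithmetic: for a fixed number of enzyme molecules, concentration scales inversely with the volume they occupy. A minimal sketch follows. The function name and the example volume fraction are assumptions chosen for illustration, not measured organelle volumes, and the calculation ignores import efficiency and any change in enzyme activity.

```python
def local_enrichment(compartment_volume_fraction: float) -> float:
    """Fold increase in concentration when a fixed number of enzyme
    molecules is confined to a compartment occupying the given fraction
    of the cell volume (concentration = amount / volume)."""
    if not 0 < compartment_volume_fraction <= 1:
        raise ValueError("volume fraction must be in (0, 1]")
    return 1.0 / compartment_volume_fraction


# Assumed, illustrative value: a compartment taking up 10% of the cell volume.
print(local_enrichment(0.1))
```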
The concept of increasing local concentration can be taken to its extreme by creating synthetic enzyme complexes on an engineered scaffold, such as the one used to co-localize elements of an exogenous mevalonate pathway in E. coli [99]. In this study, the exogenous enzymes AtoB, HMGS and HMGR were engineered to include peptide ligands recognized by a metazoan-based scaffold protein. Binding of all three enzymes to the same scaffold allowed for efficient substrate shuttling and, ultimately, increased the production of mevalonate 77-fold compared to a control without scaffolding. Subsequent enzymes in a pathway may also be directly linked to each other by creating fusion proteins. In one study, the production of miltiradiene, a precursor to tanshinone, in S. cerevisiae was increased by fusing the heterologous proteins SmCPS and SmKSL as well as the endogenous BTS1 and Erg20 to enhance substrate channeling [100]. In addition, for pathways with toxic end products, product efflux is often engineered to transport the toxic product out of the cells, which allows better cell growth and higher productivity. For instance, one study utilized the modularity of tripartite antibiotic efflux pumps to combinatorially engineer an efflux pump for the diterpene kaurene in E. coli, increasing kaurene production more than two-fold [101].

Summary and Perspectives
This review has summarized many of the challenges associated with drug production in a heterologous host and has presented both traditional and recently developed solutions to those challenges. These solutions have a broad range of applications, and it is this feature that makes them likely candidates for adoption into the standard toolbox of next-generation metabolic engineering. For each of the three general challenges discussed above (pathway discovery, pathway assembly and pathway optimization), there are currently many exciting and broadly applicable techniques being developed for efficient production of small molecule drugs. For example, in the past five years, vast public databases have been generated from high-throughput genomic, proteomic, transcriptomic and metabolomic techniques. By merging these data into a multi-omics database [102], the accuracy of biological predictions, such as identifying candidate genes for the production of a given natural product, could be significantly increased. This stands as a promising novel strategy for discovering pathways for small molecule drugs as well as for guiding pathway design and optimizing production in a heterologous host. In addition, the recently developed CRISPR-based genetic editing tools [103] not only offer an efficient way to manipulate the expression levels of multiple genes, but also provide a route to "multivariate modular metabolic engineering" [104,105], in which drug synthesis pathways are optimized with modular, multiplex regulation using only a few core proteins (e.g., dCas9) that are guided to specific sequences by guide RNAs. Synthetic regulatory systems such as the dynamic sensor-regulator system (DSRS) [106] or aptazyme-based sensors for feedback control [107] could also be applied in metabolic engineering for producing small molecule drugs by automatically fine-tuning combinatorial gene expression levels. It is also worth noting that the native host has often had evolutionary time to optimize the production of complex molecules involving toxic intermediates, and this is a source of information that metabolic engineers have not yet taken full advantage of. Therefore, revisiting functional genomics in native hosts could lead to unexpected findings in pathway discovery and pathway optimization.
In sum, heterologous expression and optimization of small molecule drug-producing pathways offer an efficient alternative to drug production in the natural host or by chemical synthesis. While there are still inherent challenges, they can be addressed by the ever-growing toolbox of metabolic engineering to create production platforms that keep up with humanity's growing need for diverse pharmaceuticals.

* Estimated titer based on a given mass product/mass cells and an assumed cell mass density of 3 g/L.
** Estimated percent yield is based on the reported media composition, the stoichiometry of drug production and the reported titer.
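The titer estimate described in the first footnote (marked *) amounts to a single multiplication: a product mass per cell mass times the assumed cell mass density of 3 g/L. A minimal sketch is given below. The function name and the example input are hypothetical, and the percent-yield estimate of the second footnote is not reproduced because it depends on the media composition and stoichiometry reported for each study.

```python
ASSUMED_CELL_DENSITY_G_PER_L = 3.0  # cell mass density assumed in the footnote


def estimate_titer(product_per_cell_mass: float,
                   cell_density_g_per_l: float = ASSUMED_CELL_DENSITY_G_PER_L) -> float:
    """Estimate titer (g product per L) from g product per g cells."""
    if product_per_cell_mass < 0 or cell_density_g_per_l < 0:
        raise ValueError("inputs must be non-negative")
    return product_per_cell_mass * cell_density_g_per_l


# Hypothetical input: 10 mg of product per g of cells.
print(estimate_titer(0.010), "g/L")
```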

Figure 2. Highlights of metabolic engineering for synthesis of small molecule drugs. (A) Modular construction of a pathway to produce hydrocodone in S. cerevisiae. The complete synthesis of hydrocodone was achieved in S. cerevisiae by organizing the lengthy, 23-reaction pathway into six different modules responsible for key intermediates [53]. (B) E. coli-S. cerevisiae co-culture for production of oxygenated taxanes. Taxadiene production was first achieved in E. coli using two modules: the upstream non-mevalonate pathway (MEP) ending in the production of dimethylallyl pyrophosphate (DMAPP) and a downstream pathway producing the chemotherapy drug intermediate taxadiene [10]. By performing a multivariate regulation of these two modules, a local maximum was discovered for the production of taxadiene. Oxygenated taxanes were next produced in an E. coli-S. cerevisiae co-culture [54], in which the taxadiene produced by E. coli was taken up by S. cerevisiae and converted into more complex (and valuable) taxanes. The co-culture was kept stable by a mutualistic feeding strategy wherein the substrate xylose could be used by E. coli, but not by S. cerevisiae. The E. coli then produced acetate, which was consumed by S. cerevisiae so that it did not accumulate to levels toxic to E. coli.

Table 1. Case studies in metabolic engineering for production of versatile small molecule drugs.