Application of Cell-Free Protein Synthesis for Faster Biocatalyst Development

Cell-free protein synthesis (CFPS) has become an established tool for rapid protein synthesis in order to accelerate the discovery of new enzymes and the development of proteins with improved characteristics. Over the past years, progress in CFPS system preparation has been made towards simplification, and many applications have been developed with regard to tailor-made solutions for specific purposes. In this review, various preparation methods of CFPS systems are compared and the significance of individual supplements is assessed. The recent applications of CFPS are summarized and the potential for biocatalyst development discussed. One of the central features is the high-throughput synthesis of protein variants, which enables sophisticated approaches for rapid prototyping of enzymes. These applications demonstrate the contribution of CFPS to enhance enzyme functionalities and the complementation to in vivo protein synthesis. However, there are different issues to be addressed, such as the low predictability of CFPS performance and transferability to in vivo protein synthesis. Nevertheless, the usage of CFPS for high-throughput enzyme screening has been proven to be an efficient method to discover novel biocatalysts and improved enzyme variants.


Introduction
Biocatalysis and biocatalytic processes affect many industries ranging from consumer products and food to the chemical and pharmaceutical industry. In many cases, biocatalysis can offer economically attractive and environmentally benign solutions. The potential of biocatalysis is, however, not yet fully exploited [1][2][3][4][5]. Some of the hurdles in implementing more processes are linked to the timelines and cost of developing a biocatalytic process, in particular when protein engineering is involved. Cell-free protein synthesis (CFPS) systems might contribute to accelerating the development process of new enzymes and enable rapid screening for suitable biocatalysts. The great advantage of CFPS is the ease of parallelization for efficient and quick synthesis of numerous protein variants. Furthermore, CFPS is easily adjustable to the translational requirements and simultaneously adaptable to the desired subsequent analytical setup.
In the past decade, improvements have mainly concerned technologies for reaction and process engineering [6][7][8], modelling tools [9,10], and automation, parallelization, and miniaturization [11,12], all aimed at increasing the efficiency of bioprocess development. However, the effort to genetically optimize biotransformation systems in industrial processes often lags behind and is not always integrated in the process development considerations. As recommended by Lima-Ramos et al., economic analysis should be carried out at an early development phase of the production process [13]. This means that the biocatalyst must be chosen, and its properties tuned to the process requirements, at an early stage [1]. The ability to modify proteins facilitates the development and implementation of biocatalyzed syntheses in industry [14]. However, screening for suitable biocatalysts is a complex task if the process conditions are to be mimicked [15,16]. The availability of genes and genomes, whether derived from metagenomes of natural pools or created by mutagenesis, is steadily increasing [17]. Furthermore, the generation of genetic libraries by gene shuffling, error-prone PCR, or other methods is quite advanced and powerful [18]. Consequently, the gap between the number of known gene sequences and known genetic functions is continuously growing [19][20][21][22]. The comparison of reviewed and unreviewed gene sequences clearly illustrates this gap (Figure 1). The percentage of genes with any experimental proof for their annotation is close to zero. In addition, biocatalytic processes often require knowledge about a function differing from the natural or physiological enzyme function, e.g., converting non-natural substrates in the presence of organic media. Data about these process-relevant enzyme functions are even scarcer.
Catalysts 2019, 9, x FOR PEER REVIEW
Closing the gap will advance medicine, chemistry, and industry. A strategy to predict and assign functions to unknown enzymes discovered in genome projects was suggested by the Enzyme Function Initiative [23,24]. This approach is mainly based on computational methods. However, these predictions are generally limited to new and unknown enzymes and are less suitable for genetic libraries of enzyme variants. Furthermore, assigning protein functions on the basis of sequences is challenging and not sufficiently consistent. While in silico predictions are useful in functional genomics and systems biology for describing dynamic processes, such as gene expression regulation, transcription, translation, and protein interactions, the final evidence of an enzyme's function has to be obtained with the enzyme itself. Screening of biocatalyst libraries to identify optimal variants can be performed with high-throughput methods, such as instrumental assays or assays based on fluorogenic or chromogenic substrates and reporters [25,26]. The rate-limiting step in the generation of biocatalyst libraries is typically the heterologous expression and protein purification. A method is hence required for the large-scale or high-throughput production of enzyme libraries that can be tested for their functions. CFPS systems might be a solution to accelerate prototyping of new enzymes and to speed up screening for suitable or optimized biocatalysts [27]. A vast expansion in research activities on in vitro synthesized proteins in industry is justified by the need to achieve a more efficient development of bioprocesses. The application of CFPS combined with high-throughput analytics could close the gap and efficiently decrypt the enzyme-function relation (Figure 2).
In the context of CFPS as a tool for high-throughput protein synthesis, this review addresses the following topics. The compositions of CFPS systems are compared and the necessity of individual ingredients is assessed. The applications and development status of CFPS are described, focusing on screening in industrial applications of the past few years. Finally, remaining challenges and limitations of CFPS are discussed.

Figure 2. Overview of the currently available nucleotide sequences (NCBI), described enzymes (BRENDA), and applied enzymes [28] in the context of the high-throughput discovery of the gene-enzyme-function relation. The available potential of genetic diversity (natural resources and synthesized sequences) can be accessed via rapid prototyping of proteins using CFPS and high-throughput analyses, in order to increase knowledge about enzyme functionality and, finally, to make the discovery and development of enzyme-based industrial applications more efficient.

Cell-Free Protein Synthesis
Cell-free systems have been used for protein synthesis for decades, with a wide range of applications. The first cell-free protein synthesis system was established in 1961 by Nirenberg and Matthaei [29]. Since then, the use of biological machinery outside living cells has undergone many improvements and has become the method of choice for various applications [30]. There are several advantages of CFPS compared to heterologous expression. Proteins can be produced from DNA templates within a few hours at concentrations on a milligram per milliliter scale [31]. Moreover, difficult-to-synthesize proteins, such as membrane-anchored proteins [32] or proteins with a toxic effect on the metabolism of the host cells [33], can be easily produced with in vitro systems. While the hydrophobic properties of membrane proteins are often challenging for in vivo systems, the open environment of CFPS systems offers the possibility to support the synthesis of membrane proteins by supplementing membrane mimics, such as lipid-detergent-based systems, nanodiscs, or liposomes [34]. Furthermore, there are numerous other options for flexible adjustment of the reaction conditions. These include lowering the expression temperature [35], the addition of chaperones [36] or protein disulfide isomerase to facilitate protein folding [37], the incorporation of nonstandard amino acids [38], and the adjustment of codon usage to enhance translation [39].
In principle, two methods can be distinguished according to the general composition of the CFPS solution. The first method is based on a purified cell extract as a complex solution with many undefined ingredients, whereas the second uses a rational combination of individually purified components [40]. Both approaches contain all necessary constituents for the coupled transcription and translation machinery, such as ribosomes, aminoacyl-tRNA synthetases, and translation factors for initiation, elongation, and product release (Figure 3). Various crude cell extracts obtained from prokaryotic, fungal, plant, and mammalian cells are described in the literature [41,42]. In general, any organism can serve as the source for cell extract preparation. The choice of extract depends on the biochemical requirements and applications of the target protein. Commonly used extracts are derived from Escherichia coli, Saccharomyces cerevisiae [43], rabbit reticulocytes [44], wheat germ [45], insect cells [46], and Chinese hamster ovary (CHO) cells [47]. The most widely used source for crude cell lysates is E. coli, which has some advantages over other CFPS systems. These advantages are the high rate of protein synthesis with high protein yields, simple and cost-effective cultivation and extract preparation, the possibility of genetic engineering with well-established tools, the use of low-cost energy sources, and the ability to fold complex proteins [48]. However, E. coli shares the common drawback of prokaryotic extracts and does not provide sufficient post-translational modifications, such as glycosylation. In applications where these modifications are required, other sources for the extract, such as CHO or yeast cells, should be considered [41].
Crude extracts are prepared by lysing the cells and then removing cellular debris and large molecules, such as genomic DNA, via multiple rounds of washing and high-speed centrifugation. Depending on the type of CFPS system used, additional cofactors and supplements are needed. The E. coli extract CFPS system has undergone many improvements over the years, and various compositions have been described (Table 1). In the following, the purpose and necessity of the main components are described and discussed.
Ions are added in almost every published CFPS system as they are essential for the activity of many enzymes, as well as for the interaction between proteins and nucleic acids. Typical cations are magnesium (Mg2+) and potassium (K+), whose concentrations must be carefully determined for optimal protein synthesis. Acetate and glutamate, both major anions in the E. coli cytoplasm, can be used interchangeably [32].
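Because the optimal Mg2+ and K+ concentrations have to be determined empirically, a small full-factorial screen is a common starting point. The following Python sketch only enumerates the reaction conditions for such a screen; the function name and the concentration values are hypothetical and chosen for illustration, not taken from any of the cited systems.

```python
from itertools import product


def ion_screen(mg_mM, k_mM):
    """Return one condition per (Mg2+, K+) pair, e.g. one per microtiter well."""
    return [{"Mg_mM": mg, "K_mM": k} for mg, k in product(mg_mM, k_mM)]


# Hypothetical concentration ranges (in mM), for illustration only.
conditions = ion_screen(mg_mM=[4, 8, 12, 16], k_mM=[50, 100, 150])

for well, condition in enumerate(conditions, start=1):
    print(well, condition)
```

Each of the twelve resulting conditions would correspond to one CFPS reaction whose protein yield is then compared.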
Amino acids, adenosine triphosphate (ATP), guanosine triphosphate (GTP), cytidine triphosphate (CTP), and uridine triphosphate (UTP) are essential for protein synthesis and can hence be found in every system. The proteinogenic amino acids are the building blocks for proteins and can be supplemented with non-natural amino acids for special applications. Whereas ATP is a universal energy source, GTP serves as an energy source for protein synthesis. Together with UTP and CTP, they serve as the substrates for the transcription reaction mediated by RNA polymerase. Yang et al. alternatively used nucleoside monophosphates in order to reduce the reaction costs [55]. Although an additional conversion of the nucleoside monophosphates into the corresponding nucleoside triphosphates was required, the obtained protein yields were similar to those of systems with nucleoside triphosphates.
For successful and efficient protein synthesis, a competent energy regeneration system is crucial. ATP and GTP have to be regenerated for prolonged reactions and satisfactory protein yields. This supply is accomplished by using secondary energy sources containing a high-energy phosphate bond. Hence, different systems are described using creatine phosphate, phosphoenolpyruvate, or acetyl phosphate in combination with the enzymes creatine kinase, pyruvate kinase, and acetate kinase, respectively. To avoid the necessity of additional exogenous enzymes, systems were developed using endogenous enzymes present in the cell extract. Effective ATP regeneration from any glycolytic intermediate is enabled by adding β-nicotinamide adenine dinucleotide (NAD) and coenzyme A [57]. The conversion of pyruvate into phosphoenolpyruvate via endogenous phosphoenolpyruvate synthetase is coupled with the conversion of ATP to adenosine monophosphate (AMP). Hence, this reaction could significantly reduce the energy supply during protein synthesis by consuming both ATP and pyruvate. For further improvement of the CFPS performance, oxalate, a potent inhibitor of phosphoenolpyruvate synthetase, can be added to the reaction [50]. In general, the degradation of the secondary energy sources is a common limitation for the performance of CFPS systems. In addition, the resulting phosphate accumulation inhibits long-term protein synthesis by forming complexes with magnesium ions [58]. The recycling of inorganic phosphate can be achieved by phosphorylation of maltodextrin [59] or maltose [31]. The addition of these supplements makes it possible to overcome the limitation caused by phosphate inhibition.
Moreover, tRNA is a common supplement in CFPS systems. Although tRNA is present in the cell extract, increased concentrations of total tRNA improve the availability of amino acids for translation, and therefore the obtained overall protein synthesis yield. Furthermore, the specific addition of tRNA offers the opportunity to adjust the codon usage by adding rare codon tRNA [39]. For initiation of the translation reaction, formyl-methionine is obligatory. To ensure adequate supply, folinic acid is added to the CFPS reaction as a formyl donor substrate.
Most of the described systems contain the organic buffering agent 4-(2-hydroxyethyl)-1-piperazineethanesulfonic acid (HEPES) for maintaining physiological pH. The complexity and difficulty of creating an efficient CFPS mixture is also illustrated by the addition of crowding agents. Polyethylene glycol (PEG) can be used to mimic the viscosity of the E. coli cytoplasm. PEG is also supposed to support the stability of mRNA and to induce macromolecular crowding effects. However, crowding agents inhibit translation, which might be caused by protein precipitation [60]. The usage of crowding agents and their impact on the transcription and translation machinery therefore have to be critically considered. Another approach substitutes unnatural components, such as HEPES and PEG, with supplements naturally occurring in E. coli. The polycations spermidine and putrescine were incorporated to replace PEG and offer some advantages, such as stimulation of T7 RNA polymerase activity and stabilization of DNA, RNA, and tRNA [51]. Because of its stabilizing effect on T7 RNA polymerase, the reducing agent dithiothreitol (DTT) is often used in systems that are based on the T7 regulation system [32]. Nevertheless, a major drawback is the reducing environment, which might prevent the formation of disulfide bonds. For target proteins containing disulfide bonds, a compromise between T7 RNA polymerase stability and an optimized synthesis environment should be considered.
The most widely used transcription systems in cell-free expression are based on bacteriophage RNA polymerases, such as T7 or T3. These polymerases are known for strict promoter specificity and accept supercoiled as well as linear DNA templates. Furthermore, these polymerases do not recognize chromosomal DNA, resulting in a strong decrease of background expression of unintended genes [61]. In general, two methods are described for the application of these RNA polymerases. The first is the addition of recombinantly produced and purified polymerase to the CFPS reaction [62]. The second is the use of a T7 RNA polymerase producing strain for the preparation of the cell extracts [63]. This organism can be a strain carrying a chromosomal copy of the T7 RNA polymerase gene, such as E. coli BL21 (DE3), or a strain harboring a plasmid for the overexpression of T7 RNA polymerase. Alternatively, endogenous E. coli RNA polymerases in combination with sigma factors are described. Although the endogenous E. coli transcription machinery is much less efficient than bacteriophage systems, comparable protein yields can be obtained through DNA template optimization [54].
This shows the great potential of optimizing the DNA template. In general, a template can be used either as a circular plasmid or as a linear PCR fragment. Linear templates can be produced via PCR within hours and avoid the necessity of molecular cloning steps, while plasmid template preparation can take days [48]. For many high-throughput applications of CFPS, the usage of PCR products can circumvent the time-intensive template construction and cloning and yields a large number of different proteins within a few hours. Linear DNA templates can easily be prepared with gene-specific primers carrying additional overhanging sequences that consist of regulatory features, such as a T7 promoter and a T7 terminator [64]. A noted disadvantage is the increased vulnerability of linear DNA to endogenous nucleases; hence, faster degradation can be observed [65]. To avoid this drawback in applications with E. coli extracts, three approaches are mentioned: the usage of strains lacking the dominant nuclease RecB [66], the addition of a potent inhibitor of exonucleases, such as purified Gam protein of bacteriophage λ [52,67], or the modification of the gene template with structures that prevent the degradation of linear DNA [68]. These adjustments lead to comparable protein yields between linear and circular DNA templates [52,69].
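The primer-overhang strategy for linear templates can be illustrated with a minimal Python sketch. It assumes that the regulatory sequences are simply concatenated to the gene-specific primer regions; the function names are hypothetical, the T7 promoter string is the commonly cited consensus and should be verified against the system in use, and the downstream sequence in the usage example is a placeholder, not a real terminator. A functional template would additionally need elements such as a ribosome binding site, which are omitted here.

```python
# Commonly cited T7 promoter consensus (assumption: verify for your own system).
T7_PROMOTER = "TAATACGACTCACTATAGGG"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def extend_primers(gene_start, gene_end, upstream, downstream):
    """Build forward and reverse primers with regulatory overhangs.

    gene_start: first bases of the coding strand (forward annealing region)
    gene_end:   last bases of the coding strand (reverse annealing region)
    upstream:   sequence to place before the gene, e.g. a promoter
    downstream: sequence to place after the gene, e.g. a terminator,
                given in coding-strand orientation
    """
    forward = upstream + gene_start
    reverse = reverse_complement(gene_end + downstream)
    return forward, reverse


# Usage with placeholder sequences (the downstream string is hypothetical).
fwd, rev = extend_primers("ATGAAA", "GGGTAA", T7_PROMOTER, "CCC")
print(fwd)
print(rev)
```

Because the overhangs are added during PCR, one pair of regulatory sequences can be reused for every gene in a library, and only the short gene-specific regions change.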
In summary, several CFPS systems have been described that enable protein yields from the lower microgram scale to the milligram scale in a single batch reaction. Slightly different compositions can significantly affect the efficiency and productivity of the CFPS reaction, as shown by the addition of maltose or maltodextrin. Furthermore, it has been shown that a CFPS system omitting several ingredients that were considered essential is able to produce proteins in small quantities. Pedersen et al. have rationally improved their CFPS systems and tested five different proteins [70]. The final yields varied between 250 and 700 µg protein per mL of CFPS reaction mix. Consequently, it is not possible to compare the yields of distinct proteins. Moreover, the applied protocol for the cell-free extract preparation has a major effect on productivity.

Applications of Cell-Free Protein Synthesis
Cell-free protein synthesis serves as a platform for many biotechnology and synthetic biology projects. The growing interest in CFPS technology is due to its applicability to the synthesis of "difficult" proteins, for example toxic or membrane proteins, proteins that incorporate non-natural amino acids, or proteins that require tight control during synthesis in terms of reactant concentrations or timing of reactant addition, which is not guaranteed by heterologous expression [71]. These possibilities open the way to a novel understanding of enzymes and new applications, such as personalized medicines or new therapeutics and chemicals. In addition to the aforementioned applications, CFPS is a promising approach for high-throughput synthesis to provide enzymes for functional analyses. In the following, we focus on CFPS in the context of compartmentalization, miniaturization in microfluidics and microarrays, rapid prototyping, and scale-up with regard to industrial application (Figure 4).
Compartmentalized microenvironments are used for CFPS to provide cell-like structures that enable functional protein analyses in environments as similar as possible to the cell. Artificial microcompartments offer the great advantage that their structure and biochemical composition can be designed and controlled [75]. Liposomes or microgels are the simplest systems for modelling cells. These compartments provide micro reaction chambers to study membrane-associated protein processes or enzymatic reactions in the compartment [76,77]. Giant unilamellar vesicles (GUVs) have a cell-like size and contain the same phospholipids that compose cell membranes [78,79]. Using GUVs, CFPS was efficiently applied for directed evolution of a membrane protein. Genes encoding the α-hemolysin pore protein of Staphylococcus aureus were subjected to mutagenesis to generate a randomly mutagenized gene library. The mutagenized genes were individually encapsulated in GUVs with a CFPS system and the HaloTag protein. HaloTag protein is a modified haloalkane dehalogenase that covalently binds to synthetic ligands [80]. The α-hemolysins were synthesized from the mutated genes and incorporated into the GUV membranes [72]. The GUVs were then exposed to a solution containing a fluorescent AF488 ligand. Depending on the pore-forming activity of α-hemolysin, more or less AF488 diffused into the liposome and bound to HaloTag protein. The GUVs were sorted using fluorescence activated cell sorting (FACS), and active genes were recovered and amplified for transformation in S. aureus [72]. This approach allowed efficient evolution under functional screening conditions. In addition to the aforementioned examples, CFPS in compartmentalized microenvironments enabled the development of artificial minimal cells from the bottom up [81,82]. Such cell-like micro containers were optimized with regard to nutrient supply by co-expressing genes encoding α-hemolysin pore proteins, which enabled a continuous exchange of energy and material during expression [83]. These advancements resulted in prolonged expression for up to 4 days compared to 2 h in bulk solution. Although we are far away from synthesizing viable and self-replicating artificial cells, mimicking cells is currently an active field of research for studying enzyme functionality in well-defined and controllable environments and also for screening of enzyme variants.
Miniaturization and automation using microchips have tremendous potential for CFPS parallelization and high-throughput analysis of the synthesized proteins [48,84,85]. CFPS in microfluidic chips offers new opportunities, such as continuous protein production or compartmentalization with simultaneous protein detection and analysis [73]. The separation of reactants and the separation of transcription and translation machinery by, for example, inclusion in droplets or separation with dialysis membranes, result in significantly higher product yields and provide novel reaction modes [86][87][88]. Mazutis et al. succeeded in fulfilling the individual biochemical requirements of protein synthesis and functional enzyme assay by separating multiple reaction steps in droplets and adding new reagents at defined times by droplet fusion [89]. This microfluidic device was used to combine laccase production via CFPS with a laccase activity assay. The reaction conditions of CFPS and the laccase assay reagents are generally incompatible. The droplet-fusion technology enabled production of the enzyme in one droplet, which was subsequently fused with a droplet containing the assay reagent. The combined synthesis and analysis were hence only possible by spatial and temporal separation. Another droplet-based microfluidic device has been used as an in vitro ultra-high-throughput screening platform [90]. Single genes were compartmentalized in droplets, amplified using PCR, fused to a droplet containing the CFPS system, and finally to a droplet containing the reagent for a fluorogenic assay. Afterwards, the droplets were analyzed and genes were recovered from droplets that showed the desired enzyme activity. In general, compared to screening systems using microtiter plates, costs, reagent consumption, and production and analysis time can be dramatically reduced.
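The final selection step of such a fluorogenic droplet screen can be sketched as a simple thresholding routine. The rule used here (signal above the mean plus three standard deviations of negative-control droplets) is an assumption made for illustration and is not described in the cited work; the function name, droplet identifiers, and fluorescence values are likewise hypothetical.

```python
from statistics import mean, stdev


def select_hits(signals, negative_controls, n_sd=3.0):
    """Return the ids of droplets whose signal exceeds the control threshold.

    signals:           dict mapping droplet id to fluorescence reading
    negative_controls: readings from droplets without active enzyme
    n_sd:              number of standard deviations above the control mean
    """
    threshold = mean(negative_controls) + n_sd * stdev(negative_controls)
    return sorted(d for d, s in signals.items() if s > threshold)


# Hypothetical readings in arbitrary fluorescence units.
controls = [10, 12, 11, 9, 13]
readings = {"a": 15, "b": 16, "c": 40}
hits = select_hits(readings, controls)
print(hits)
```

In an actual sorter the decision is made per droplet in real time, but the same principle applies: only genes from droplets above the cut-off are recovered for the next round.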
Moreover, CFPS provides a powerful tool for the manufacturing of protein microarrays. These protein chips are solid-phase ligand binding assay systems for high-throughput testing of interactions and activities of proteins. Typical challenges in protein array technology are efficient protein synthesis and availability, functional protein immobilization and purification, and the long-term stability of immobilized proteins [74]. The parallel synthesis of several proteins directly onto an immobilizing surface can circumvent these drawbacks. Different methods have been described, such as PISA (Protein In Situ Array), NAPPA (Nucleic Acid Programmable Protein Array), and DAPA (DNA to Protein Array) [91]. All approaches have in common that transcription, translation, and immobilization take place simultaneously in situ (on-chip). They differ in the type of DNA template and in the protein capture entities. Protein arraying via PISA uses PCR DNA constructs containing all necessary sequences for transcription and translation and, additionally, a tag-coding sequence for immobilization by means of a tag capture agent on the surface of the chip [92]. In contrast, the NAPPA method generates protein microarrays by printing DNA templates onto glass slides with subsequent transcription and translation of the target proteins. Tags fused to the proteins enable immobilization via antibodies and easy purification [93]. A screening process for the identification of new antibody responses to the Mycobacterium tuberculosis proteome with about 4,000 tested genes was performed with NAPPA-generated microarrays and yielded 8 proteins with tuberculosis biomarker value [94]. The DAPA technology allows the production of replicate protein arrays [95]. A single DNA array template with covalently immobilized PCR DNA constructs can generate at least 20 copies of a protein array. The CFPS takes place in a permeable membrane carrying all components for the reaction, which is arranged between the DNA array slide and a second slide with a tag-capturing agent. Synthesized proteins can diffuse through the membrane and are immobilized on the second slide. Further optimizations of DAPA led to an optimal combination of array-supporting coatings and a 3D surface structure, which allows the synthesis of proteins on a scale comparable to classically spotted protein arrays [96].
The integration of advanced high-throughput technologies enables the application of CFPS for rapid prototyping in order to accelerate the screening for enzymes with improved characteristics [97,98]. For example, 63 proteins of Pseudomonas aeruginosa in the size range of 18-59 kDa were produced and analyzed within 4 h by one person [99]. A mutant library consisting of 10,000 sialyltransferase genes was screened to identify enzymes with improved activities within a few days [100]. Furthermore, a protocol named RAPPER (Rapid Parallel Protein EvaluatoR) was established for the fast preparation of functional enzyme variants from linear, mutagenic DNA templates [27]. The application of RAPPER facilitated an efficient evaluation of old yellow enzyme variants, which contained amino acid substitutions and deletions, with regard to conversion rates of various substrates [101]. CFPS is not only used to synthesize protein libraries but also for metabolic engineering purposes, e.g., to debug and optimize biosynthetic pathways. Cell-free prototyping of biosynthetic operons for the production of polyhydroxyalkanoates (PHAs), in combination with screening of relevant metabolite recycling enzymes, revealed an operon that produced higher levels of PHAs than the native operon [102]. The results of the prototyping were subsequently validated with in vivo assays. Cell-free metabolic engineering is thus a promising tool to identify the best combination of enzymes that work together [103,104]. Combining CFPS with cell-free metabolic engineering simplifies the manipulation of metabolic pathways and avoids the need for time-intensive engineering of organisms [105,106]. This was also recently demonstrated during an evaluation of bioengineering methods [107]. A group of scientists attempted to synthesize, within 90 days, 10 molecules that were unknown to them in advance. To achieve this goal, various enzymes were tested and optimized in order to construct functional biosynthesis routes. In the end, two of the molecules were produced with enzyme cascades synthesized via CFPS. CFPS is hence competitive with recombinant expression and complements in vivo protein synthesis approaches.
CFPS systems are widely used at a small scale for research and development purposes. However, once an enzyme with the desired functions has been found, it is usually produced in large amounts using traditional protein synthesis approaches, such as overexpression or heterologous expression. Only a few researchers have published the scale-up of CFPS systems for industrial application. Fujiwara et al. reported a scale-up to a volume of 9 L for the synthesis of GFP [108]. The achieved concentration of 0.5 mg mL⁻¹ was comparable to other preparation methods and smaller scales. Although only a volume of about 9 L was tested, the authors expect scalability to hundreds of liters. Zawada et al. achieved linear scalability using a process optimized with regard to extract preparation, gene sequence, and redox parameters [109]. The scalability was realized over a range of 250 µL to 100 L for the production of a multi-disulfide-bonded protein at a concentration of about 0.7 mg mL⁻¹. This milestone production at a large scale might in the future enable the commercial production of proteins that cannot be produced in cells. Nevertheless, for most of the applications described in this review, namely high-throughput functional analyses, scale-up in terms of volume is not required.
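To put the reported titers into absolute terms, the total protein mass per batch is simply concentration times reaction volume. The following Python sketch applies this to the figures quoted above. The function name `total_protein_mg` is hypothetical, the 0.7 mg mL⁻¹ value is only approximate in the source, and the calculation assumes that the reported concentration holds uniformly across the whole reaction volume.

```python
def total_protein_mg(conc_mg_per_ml: float, volume_l: float) -> float:
    """Return total protein mass in mg for a batch.

    mass (mg) = concentration (mg/mL) * volume (mL); 1 L = 1000 mL.
    Assumes a uniform concentration throughout the reaction volume.
    """
    if conc_mg_per_ml < 0 or volume_l < 0:
        raise ValueError("concentration and volume must be non-negative")
    return conc_mg_per_ml * volume_l * 1000.0


# Figures quoted in the text (the concentrations are approximate).
batches = {
    "GFP, 9 L [108]": total_protein_mg(0.5, 9.0),
    "disulfide-bonded protein, 250 uL [109]": total_protein_mg(0.7, 250e-6),
    "disulfide-bonded protein, 100 L [109]": total_protein_mg(0.7, 100.0),
}

for label, mass_mg in batches.items():
    print(f"{label}: {mass_mg:.3f} mg")
```

Under these assumptions, the 100 L batch corresponds to gram quantities of product, whereas the 250 µL reaction yields well below one milligram, which is still ample for most screening assays.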

Limitations and Challenges of CFPS
The applications discussed above demonstrate the importance and contribution of CFPS for efficiently synthesizing new enzymes for functional analyses. However, CFPS has technology-based limitations and effects that have to be considered. Very recent research focuses on the effects imposed by the highly artificial environment in CFPS systems, which differs significantly from natural protein synthesis in a viable cell.
One important aspect is certainly the influence of molecular interactions, for example the crowding phenomenon, viscosity, and related effects. The cytoplasmic space of cells exhibits a total protein concentration roughly 20-fold higher than that of CFPS extracts [110]. It is known that macromolecular crowding can influence enzyme kinetics, increase transcription rates, and might enhance the robustness of gene expression [111][112][113]. Nevertheless, up to now, the impact of crowding on protein synthesis efficiency in CFPS systems has been neither reliably quantified nor completely understood. Interestingly, the compartment volume used for CFPS seems to have an influence on protein synthesis rate and yield. Okano et al. showed that protein synthesis begins sooner in smaller volumes (56 fL), but the maximum protein concentrations achieved are higher in larger volumes (126 fL) [114]. The reasons for the dependency of protein synthesis on compartment size are manifold and can be attributed to dilution effects [114], but also to the surface area to volume ratio [115]. In general, the artificiality of the protein synthesis environment, the overlapping influences of physical aspects, as well as the individual biochemical requirements of the transcription and translation machinery make the predictability and transferability of CFPS difficult.
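The surface area to volume argument can be made concrete with a short calculation. The Python sketch below compares the two compartment volumes mentioned above, assuming, purely for illustration, idealized spherical compartments; the actual compartment geometry used in [114] is not stated here and may well differ. The function names are hypothetical.

```python
import math


def sphere_radius_um(volume_fl: float) -> float:
    """Radius (um) of a sphere with the given volume; 1 fL equals 1 um^3."""
    if volume_fl <= 0:
        raise ValueError("volume must be positive")
    return (3.0 * volume_fl / (4.0 * math.pi)) ** (1.0 / 3.0)


def surface_to_volume_per_um(volume_fl: float) -> float:
    """Surface area to volume ratio (1/um) of a sphere, which equals 3 / r."""
    return 3.0 / sphere_radius_um(volume_fl)


# Compartment volumes quoted in the text; spherical shape is an assumption.
for volume_fl in (56.0, 126.0):
    radius = sphere_radius_um(volume_fl)
    ratio = surface_to_volume_per_um(volume_fl)
    print(f"{volume_fl:.0f} fL: r = {radius:.2f} um, SA/V = {ratio:.2f} per um")
```

For any two geometrically similar compartments, the ratio scales with the inverse cube root of the volume, so the 56 fL compartment has a roughly 1.3-fold higher surface area to volume ratio than the 126 fL compartment, regardless of the exact shape.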
Another significant difference is the protein synthesis rate, which is usually lower for in vitro protein synthesis than for protein synthesis in vivo. The decreased efficiency is probably caused by a mismatch between transcription and translation rates in CFPS systems, whereas these rates are tightly synchronized in vivo. Translation in CFPS is hampered by the limited availability of the elongation factor Tu and tRNA [116]. The accumulating mRNA is inactivated by degradation or by the formation of mRNA secondary structures and mRNA:DNA hybrids [117]. A solution might be the separation of the transcription and translation machinery with an intermediate mRNA purification step [88]. However, this approach contradicts the demands of a time-efficient, easy, and universally applicable high-throughput protein synthesis strategy. The knowledge about CFPS extracts and their performance limits has to be improved to make this system more broadly applicable in industry.
CFPS extracts are complex reaction systems with predominantly unknown contents. Proteomics-based tools are able to decipher the composition of CFPS systems. By analyzing the proteome of CFPS extracts, more than 1000 proteins were identified [118]. This information helps to improve extract preparation in order to increase the folding capacity, solubility, and activity of the target protein [119]. Furthermore, such proteomic analyses might contribute to describing and enhancing the reproducibility of CFPS extract preparation. Besides proteomics-based analyses, modeling approaches that simulate the synthesis of target proteins can reveal system limitations and furthermore offer the possibility to estimate and optimize CFPS performance in silico [120].
The complexity and undefined composition of CFPS extracts are also challenging for activity screenings. For example, side reactions or inhibition by extract components might influence the activity of the synthesized target enzyme. Options for isolating the target enzyme include purification by removing all other components (as realized in [40]) or immobilization of the target enzyme itself with subsequent washing steps. However, these purification steps require extra effort, which reduces the effectiveness of CFPS for rapid prototyping. Furthermore, the quantification of product molecules is difficult in such complex solutions. An example of integrated CFPS and activity screening was published by Gagoski et al. [121]. In this study, thermostable endo-1,4-β-glucanases and xylanases were characterized without purification using substrates labeled with a colorimetric dye. The release of the dye resulted in an increase in absorption that correlated well with the enzyme activities. However, this method is limited to substrates and products that are optically quantifiable. Nevertheless, advanced chromatography or mass spectrometry systems combined with miniaturization and automation technologies for sample preparation enable rapid and parallel activity analyses of CFPS products.
In our opinion, several issues have to be addressed for CFPS to be widely accepted as a protein synthesis platform in industrial applications: quality control methods are required for reproducible extract preparation; the CFPS preparation itself has to be generalizable and accessible for high-throughput use; and the transferability of in vitro expression is mandatory when a subsequent in vivo expression at large scale is envisaged. Despite the discussed challenges of CFPS application, these systems provide new opportunities for gaining deeper insights into cellular expression mechanisms and contribute to understanding the metabolic costs of protein production [122]. Furthermore, by using such an open system, bottlenecks of biomolecular editing tools can be identified, which enhances our mechanistic understanding of genome editing [123]. This knowledge enables improving the efficiency of protein synthesis, in vitro as well as in vivo.

Conclusions
In this review, we have shown that CFPS is of high value for biocatalytic processes and has become an accepted alternative to in vivo expression systems. Different compositions of E. coli extract-based CFPS systems were presented, and the necessity and purpose of individual supplements, which significantly affect the efficiency and productivity of the CFPS reaction, were discussed. The rational improvement of supplemental components results in systems with protein yields on the milligram per milliliter scale, which permits novel CFPS applications beyond pure research interests. The scale-up of CFPS is currently an almost negligible aspect due to the high costs of the reaction components and the limited added value of shorter production times compared to cell-based expression. However, the integration of CFPS in platforms for high-throughput enzyme screening and analysis has been proven to be a powerful and versatile tool for the fast discovery of new candidates or improved biocatalyst variants. In addition, the in vitro construction and study of synthetic pathways in combination with traditional strain development can become a valuable instrument for the engineering of biocatalytic processes. While CFPS permits a rapid and inexpensive identification of the most suitable enzyme candidates for application in a biocatalytic process, the traditional in vivo technologies enable scale-up and subsequent high-level production. CFPS has become a technology that makes protein engineering and the prototyping of new enzymes accessible to many laboratories, as specialized equipment and considerable experience are not required. It is therefore ideally suited to help accelerate the accumulation of functional data on enzymes.
Funding: This research received no external funding.

Figure 1. Number of sequences in UniProt databases during the last decade. The dark grey bars show the number of reviewed entries, which represent manually annotated sequences with information extracted from literature. The light grey bars show the number of unreviewed entries, which represent the computationally analyzed sequences that are not manually annotated.

Figure 2. Overview of the currently existing amount of available nucleotide sequences (NCBI), described enzymes (BRENDA), and applied enzymes [28] in context with the high-throughput discovery of gene-enzyme-function relations. The available potential of genetic diversity (natural resources and synthesized sequences) can be accessed via rapid prototyping of proteins using CFPS and high-throughput analyses, in order to increase knowledge about enzyme functionality and, finally, to make the discovery and development of enzyme-based industrial applications more efficient.

Figure 3. The principle of cell-free protein synthesis. The cell-free extract contains the machinery for the coupled transcription and translation reaction; the DNA template consists of regulatory sequences and encodes the target protein; the energy mix contains building blocks for mRNA and protein synthesis, as well as components for energy regeneration and supplemental substances.

Table 1. Selection of E. coli-based cell-free protein synthesis systems. (X) means the component is included; (-) means the component is not included in the CFPS system.