Crafting Genetic Diversity: Unlocking the Potential of Protein Evolution

Genetic diversity is the foundation of evolutionary resilience, adaptive potential


Introduction
The year 2018 marked a significant milestone in the field of protein engineering, as it witnessed the well-deserved recognition of Prof. Frances H. Arnold with the Nobel Prize in Chemistry [1,2]. This prestigious accolade honoured her ground-breaking contributions to directed protein evolution, particularly through the utilization of a simple yet powerful algorithm. Directed evolution involves iterative cycles of genetic diversity creation, followed by screening and selection (Figure 1). The far-reaching applications and profound impacts of directed evolution are evident in various domains, including, but not limited to, the development of highly active or stable enzymes, the creation of novel enzymatic functions and chemistries, and the engineering of biopharmaceutical proteins.
At the core of protein evolution lies the essential prerequisite of genetic diversity. Indeed, without this diversity, the evolutionary process becomes untenable. In recognition of this pivotal aspect, this review article is dedicated to examining methodologies deliberately designed for the creation of genetic diversity, with a focus on publications from 2014 onward. This article serves as an extension to our prior reviews on the same subject in 2006 [3] and 2013 [4].

Figure 1. (Top left) The directed evolution cycle. The parental gene of interest (GOI) undergoes mutagenesis to generate a diverse pool of genetic variants. This pool is then subjected to a selection process targeting the desired phenotype, enabling the identification of improved variant(s). This iterative cycle is repeated until the desired trait is successfully achieved. (Bottom right) Classification of genetic diversity creation methods. The diverse methods for generating a genetically varied gene pool can be systematically categorized into three main classes: random mutagenesis, focused mutagenesis, and DNA recombination. Random mutagenesis involves the introduction of random mutations throughout the starting parental gene sequence. Focused mutagenesis targets mutations to specific pre-selected region(s) or amino acid residue(s) within the starting parental gene sequence. DNA recombination generates chimeric sequences by combining segments from a set of either homologous or non-homologous parental sequences.
The methodologies for generating genetic diversity can be categorized into three main classes: random mutagenesis, focused mutagenesis, and DNA recombination (Figure 1). There exist hybrid methods that integrate features derived from two or more of these classes, showcasing the versatility and adaptability of current approaches. In random mutagenesis, mutations are randomly introduced into the target gene of interest (GOI). Conversely, focused mutagenesis involves the randomization or semi-randomization of selected amino acid residues within the target gene. In DNA recombination, a collection of homologous or non-homologous genes undergo random fragmentation and subsequent reassembly to form chimeric constructs. The fundamentals of directed evolution and genetic diversity creation have been comprehensively documented in our recently published protein engineering textbook [5]. Therefore, this review primarily explores the significant advancements achieved in the past decade.
This review commences by discussing the latest advancements in molecular cloning, a crucial initial step in the process of generating genetic diversity. Subsequent to this, we delve into the assessment of gene library quality. Building upon this foundation, we offer succinct summaries of innovative methodologies within each category of genetic diversity creation, emphasizing their individual merits and limitations.
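To make the three classes concrete, the following minimal Python sketch applies each of them to a toy sequence. It is an illustration only: the sequences, function names, and parameters are hypothetical, every base is treated as equally mutable, and recombination is reduced to a single crossover between two equal-length parents, which real methods do not guarantee.

```python
import random

BASES = "ACGT"

def random_mutagenesis(gene, n_mutations, rng):
    """Substitute n_mutations randomly chosen positions anywhere in the gene."""
    seq = list(gene)
    for pos in rng.sample(range(len(seq)), n_mutations):
        # Always pick a base different from the current one.
        seq[pos] = rng.choice([b for b in BASES if b != seq[pos]])
    return "".join(seq)

def focused_mutagenesis(gene, codon_index, rng):
    """Randomize one pre-selected codon (0-based index); the rest is untouched."""
    start = 3 * codon_index
    new_codon = "".join(rng.choice(BASES) for _ in range(3))
    return gene[:start] + new_codon + gene[start + 3:]

def recombine(parent_a, parent_b, crossover):
    """Form a chimera: parent A up to the crossover point, parent B after it."""
    return parent_a[:crossover] + parent_b[crossover:]

# Hypothetical toy sequences (7 codons each), not taken from any real gene.
rng = random.Random(0)
parent = "ATGGCTAGCAAAGGAGAAGAA"
homolog = "ATGGCAAGTAAGGGTGAGGAG"

variant = random_mutagenesis(parent, 2, rng)   # mutations anywhere
focused = focused_mutagenesis(parent, 3, rng)  # only codon 4 is randomized
chimera = recombine(parent, homolog, 9)        # segments from both parents
```

Hybrid methods would correspond to composing these functions, for example recombining parents and then randomizing a chosen codon in the chimera.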

Molecular Cloning
DNA cloning serves as the cornerstone for mutagenesis experiments, involving the insertion of a target GOI or a gene library into a cloning/expression vector. Molecular cloning also plays an integral role in all synthetic biology projects at various stages. Conventionally, this process relies on the use of restriction endonucleases, which, while fundamental in molecular biology, present certain limitations. Drawbacks such as sequence specificity, time-consuming restriction digestion and ligation reactions, and the creation of undesirable 'scars' underscore the necessity for more versatile DNA cloning techniques. To address contemporary demands and challenges and facilitate rapid advancements, DNA assembly methods must be robust, cost-effective, swift, user-friendly, and highly scalable.
In this section, we present a comprehensive overview of the latest advancements in DNA cloning technology over the past decade. These innovations are categorized into in vivo, in vitro, modular DNA assembly, and automated DNA assembly methods. The cost per reaction for these methods varies significantly, ranging from a mere USD 0.0025 to a substantial USD 36.3 per reaction, as per 2019 pricing [6].

In Vivo Cloning
In vivo cloning techniques (Table 1) leverage the cellular machinery of a host organism, such as bacteria or yeast, to incorporate a DNA fragment into a vector. Recent descriptions of these techniques often rely on DNA recombination or repair systems, such as homologous recombination (HR) and non-homologous end joining (NHEJ). The appeal of in vivo cloning techniques lies in their distinct advantages: they obviate the need for in vitro ligation reactions and are not constrained by restriction endonuclease sites. Introducing a short homologous sequence (30-40 bp) to any vector or DNA fragment through PCR facilitates recombinational cloning, underscoring the simplicity and versatility of this approach.
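As an illustration of how a short homologous sequence can be added to an insert through PCR, the sketch below builds a primer pair whose 5′ tails carry vector homology. It is a simplified example under stated assumptions: the function names, the 20-nt annealing length, and the toy sequences are hypothetical, and no melting temperature or secondary-structure checks are performed.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    comp = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(comp[b] for b in reversed(seq.upper()))

def homology_primers(vector_left, vector_right, insert, arm=30, anneal=20):
    """Design primers that add vector homology arms to an insert.

    vector_left / vector_right: top-strand vector sequence immediately
    upstream / downstream of the insertion point (5'->3').
    arm: homology length in nt (30-40 per the text); anneal: assumed
    length of the insert-specific 3' part of each primer.
    """
    if len(vector_left) < arm or len(vector_right) < arm:
        raise ValueError("vector flanks are shorter than the homology arm")
    if len(insert) < anneal:
        raise ValueError("insert is shorter than the annealing region")
    forward = vector_left[-arm:] + insert[:anneal]
    # Reverse primer: 5' tail = vector homology, 3' end anneals to the insert.
    reverse = reverse_complement(insert[-anneal:] + vector_right[:arm])
    return forward, reverse

# Hypothetical toy sequences for demonstration only.
vector_left = "ACGT" * 10
vector_right = "TTGCA" * 8
insert = "ATGGCTAGCAAAGGAGAAGAACTTTTCACTGGAGTTGTCTAA"

forward, reverse = homology_primers(vector_left, vector_right, insert)
# The PCR product these primers would yield carries a 30-nt arm at each end.
pcr_product = vector_left[-30:] + insert + vector_right[:30]
```

The resulting product shares 30 nt with the vector on each side, which is the substrate the recombination machinery of the host acts on.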
An underutilized yet increasingly popular in vivo technique involves harnessing the RecA-independent recombination (RAIR) machinery [7] found in commonly employed laboratory bacterial strains, such as Escherichia coli DH5α. RAIR is thought to be dependent on XthA, a 3′-to-5′ exonuclease that resects the 3′-ends of linear DNA fragments introduced into E. coli cells, exposing the single-stranded 5′-overhangs [8]. Subsequently, the complementary single-stranded DNA ends hybridize with each other, and the gaps are filled and sealed by DNA polymerase I and the ligase LigA. Although the technique was originally described over three decades ago [9], the availability of high-fidelity polymerases and low-cost oligonucleotides, coupled with insights into its molecular mechanism [7,8,10-14], holds promise for its wider adoption.
Traditionally, the budding yeast Saccharomyces cerevisiae has served as a platform for HR-based cloning [15-17], employing a shuttle vector to propagate plasmids in both yeast and E. coli. However, this approach limited the utility of yeast-based cloning exclusively to these two organisms. Recent breakthroughs in yeast recombination cloning [18-22] have transcended these limitations, allowing the technique to be applied universally for cloning multiple fragments of interest into any vector. The key innovation involves incorporating an origin of replication and a selection marker for yeast as a cassette or DNA fragment, which is co-inserted with other fragments for cloning. Assembled plasmids can then be selected in and purified from yeast, ready for transformation or transfection into desired host organisms. HR-based cloning in yeast can also be adapted for high-throughput DNA assembly and is amenable to automation [23].
While S. cerevisiae exhibits efficient HR machinery, its NHEJ activity is limited. By leveraging previous findings that the thermotolerant yeast Kluyveromyces marxianus possesses a highly efficient NHEJ pathway [24], an NHEJ-mediated cloning method was developed [25]. This method employs a functional marker selection system (e.g., ura3) for the cloning of DNA fragments in K. marxianus. NHEJ-mediated cloning exploits the sequence-independent nature of NHEJ-based joining, allowing the cloning of DNA fragments without the need for homologous ends, a departure from the HR-based cloning approach.
CReasPy-cloning (Figure 2) harnesses the precision of Cas9 to cleave DNA at a user-specified locus, synergizing with the yeast's remarkably efficient homologous recombination [26]. This innovative approach enables the simultaneous cloning and engineering of a bacterial genome in yeast. The successful application of CReasPy-cloning was demonstrated through the cloning and engineering of the 0.816 Mbp genome of Mycoplasma pneumoniae, showcasing its potential for bacterial genome manipulation.
SynBio 2024, 2, FOR PEER REVIEW
Figure 2. CReasPy-cloning. [...] Following this, the yeast undergoes simultaneous transformation with the target genome to be cloned and a linear DNA fragment containing yeast elements (CEN-HIS3, with or without ARS). The linear DNA fragment has recombination arms homologous to each side of the target locus. Upon entry into the cell, the Cas9/gRNA complex cleaves the target genome, and the yeast homologous recombination system repairs it using the provided linear DNA fragment as a template. Consequently, the bacterial genome incorporates the yeast elements precisely at the designated locus and is now carried by the yeast as an artificial chromosome.
Most in vivo cloning methods utilize bacteria or yeasts as hosts, necessitating transformation and plasmid extraction procedures for the introduction of extracted plasmids into other organisms. A recent advancement in this domain is the Phage Enzyme-Assisted In Vivo DNA Assembly (PEDA) method [27]. The simultaneous expression of phage-derived T5 DNA exonuclease and T4 DNA ligase facilitates in vivo DNA assembly in a diverse range of microorganisms, such as Cupriavidus necator, Pseudomonas putida, Lactobacillus plantarum, and Yarrowia lipolytica. Another cutting-edge technique, the Yeast Life Cycle (YLC) assembly method [28], capitalizes on CRISPR-Cas9 and yeast meiosis to iteratively assemble large DNA fragments. The advantage of the YLC method lies in bypassing challenging in vitro steps associated with handling and importing large DNA fragments into a host system.
* For clarity purposes, a gene cloned into a linearized vector is considered a 2-fragment assembly.

In Vitro Cloning
Gibson assembly, Seamless Ligation Cloning Extract (SLiCE), Sequence- and Ligation-Independent Cloning (SLIC), and In-Fusion cloning are all in vitro techniques. Their underlying principle relies on the presence of complementary overhangs at the ends of vector and insert DNA fragments. These methods have gained widespread popularity due to their ability to overcome limitations associated with restriction endonuclease-based cloning. Recent advancements have integrated the flexibility of Gibson assembly with the precision of CRISPR-Cas9, allowing for the targeting of double-strand breaks at any desired location. This is particularly advantageous in situations where the PCR amplification of vector fragments is challenging and restriction endonuclease sites are unavailable. Guide RNAs direct the Cas9 endonuclease to any target double-stranded DNA sequence, effectively linearizing the vector. The linearized vector can then be employed in Gibson assembly or other cloning techniques [29,30]. CRISPR-Cas9 has also been utilized to excise large DNA fragments (~100 kb) from bacterial chromosomes, as creating such lengthy fragments through PCR amplification can be challenging. The cut fragments are subsequently cloned using Gibson assembly [31]. Variations of the SLiCE technique, such as Zero-Background Redα (ZeBRα), leverage the ccdB gene to eliminate any background from uncut or re-ligated vectors during cloning [32].
Other recent advances, such as T5 Exonuclease-Dependent Assembly (TEDA), Single 3′-exonuclease-based multifragment DNA assembly (SENAX), and T5 exonuclease-mediated low-temperature DNA cloning (TLTC) [6,33,34], operate on the principle of exonuclease-generated overhangs within the overlapping regions of vectors and DNA fragments. These methods offer a cost-effective alternative to Gibson assembly, as they only require the exonuclease enzyme for the in vitro reaction to create the necessary complementary overhangs. The subsequent steps of gap repair and DNA fragment ligation take place in vivo post-transformation into bacterial cells.
Alternatives to generating compatible overhangs include techniques such as Uracil-Specific Excision Reagent (USER) Cloning. In USER Cloning, vector and insert fragments with short overlapping regions are produced through PCR. The primers used in this process introduce a single deoxyuracil (dU) residue, subsequently cleaved by the USER enzyme. This cleavage results in 3′ overhangs, facilitating the annealing of multiple DNA fragments.
Although USER Cloning has a longstanding history [35], advancements in the past decade, such as optimizing the melting temperature of annealing DNA fragments [36] and employing in silico design tools like AMUSER [37], have enhanced its efficacy. A similar technique that introduces compatible overhangs through PCR is QuickStep-Cloning [38], where two parallel asymmetrical PCR reactions generate overhangs. An improvement on this method is PTO-QuickStep cloning [39,40], where phosphorothioate (PTO) bonds, introduced via primers, are processed by iodine cleavage to generate overhangs (Figure 3). Nicks in the plasmids resulting from both USER Cloning and QuickStep-Cloning are sealed following transformation into bacterial cells. For scarless and sequence-independent DNA assembly, the method using thermostable exonuclease and ligase (DATEL) [41,42] presents another alternative. This technique employs a combination of Taq and Pfu DNA polymerases to cleave single-stranded DNA flaps generated during the annealing of DNA fragments with overlaps. The resulting nicks are then ligated using a heat-stable DNA ligase.
Figure 3. An overview of the PTO-QuickStep method. Initially, megaprimers (coloured in blue) are generated in a PCR using a set of PTO oligonucleotides containing phosphorothioate linkages (indicated with letters 'P'). Subsequently, iodoethanol treatment is applied to the megaprimers, breaking the phosphorothioate linkages and exposing 3′-overhangs. In the second step, these treated megaprimers anneal to the destination or recipient vector at the target locus, initiating the amplification of the entire plasmid. Moving to the third step, DpnI is employed to remove the methylated or hemimethylated destination or recipient vector without the gene insert (shown as dotted circle). After DpnI digestion, the newly synthesized plasmids undergo transformation into E. coli, where any nicks are repaired in vivo.
In summary, the fundamental principle underlying the latest advancements in in vitro DNA assembly (Table 2) is to streamline the process, ensuring simplicity, cost-effectiveness, and versatility. Leveraging overlapping DNA sequences among fragments to be assembled allows for the insertion of any DNA fragment into any chosen vector without being confined by the limitations of restriction endonuclease sites. While each discussed technique presents its unique advantages and limitations, the optimal choice for a particular DNA assembly project will depend on specific requirements and contextual factors.
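The overlap-based principle summarized above can be checked in silico before ordering primers. The sketch below is a hedged illustration rather than a model of any specific method: it joins fragments whose ends share an identical sequence and then closes the circle. The function names, the 15-nt minimum overlap, and the randomly generated toy plasmid are all assumptions made for this example.

```python
import random

def join_by_overlap(left, right, min_overlap=15):
    """Join two fragments whose adjoining ends share an identical overlap."""
    max_len = min(len(left), len(right))
    for k in range(max_len, min_overlap - 1, -1):
        if left[-k:] == right[:k]:
            return left + right[k:]
    raise ValueError(f"no terminal overlap of at least {min_overlap} nt")

def assemble_linear(fragments, min_overlap=15):
    """Join an ordered list of fragments into one linear product."""
    product = fragments[0]
    for fragment in fragments[1:]:
        product = join_by_overlap(product, fragment, min_overlap)
    return product

def circularize(product, min_overlap=15):
    """Close a linear product whose two ends overlap; return the circle once."""
    for k in range(len(product) // 2, min_overlap - 1, -1):
        if product[-k:] == product[:k]:
            return product[:-k]
    raise ValueError("ends do not overlap; product cannot be circularized")

# Toy example: a made-up 200-nt "plasmid" split into a vector and an insert
# that share 20-nt overlaps at both junctions.
rng = random.Random(1)
plasmid = "".join(rng.choice("ACGT") for _ in range(200))
vector = plasmid[:120]
insert = plasmid[100:] + plasmid[:20]

rebuilt = circularize(assemble_linear([vector, insert]))
```

A failed join raises an error, which is a quick way to catch a primer design in which two fragments do not actually share the intended overlap.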

Modular DNA Assembly
The in vivo and in vitro cloning techniques outlined above present numerous advantages compared to traditional methods relying on restriction endonucleases. However, they often involve custom-designed primers and constructs tailored for specific experiments or purposes. A paradigm shift occurs with modular DNA assembly in synthetic biology. This approach revolves around using standardized DNA 'parts' to construct complex assemblies, which, in turn, can serve as 'parts' for even more intricate constructions. Once these components are generated, sequence-verified, and validated, they become reusable and shareable among researchers. This fosters standardization, ultimately leading to significant time and cost savings in the field.
Initially, modular DNA assembly relied on type IIP restriction enzymes, leading to the development of the BioBrick system for standardization. More recent techniques have shifted to the Golden Gate cloning technology, leveraging type IIS restriction enzymes. Unlike type IIP enzymes, which recognize and cut within palindromic DNA sequences, type IIS enzymes identify non-palindromic sequences and cleave DNA outside of the recognition site. This innovation has emerged as a potent tool for seamless cloning, allowing user-defined DNA sequences to serve as cutting sites and facilitating the precise assembly of DNA fragments in a predetermined manner.
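Because type IIS enzymes cut outside their recognition site, the overhang is set by whatever sequence the user places next to the site. The sketch below illustrates this for BsaI, chosen here as an assumed example enzyme (the text does not name one); BsaI recognizes GGTCTC and cuts 1 nt downstream on the top strand and 5 nt downstream on the bottom strand, leaving a 4-nt 5′ overhang. The function name and the toy part sequence are hypothetical.

```python
SITE = "GGTCTC"     # BsaI recognition sequence (assumed example enzyme)
SITE_RC = "GAGACC"  # the same site when it lies on the bottom strand

def type_iis_overhangs(seq):
    """List (site_index, strand, overhang) for each BsaI site in seq.

    The overhang is reported in top-strand coordinates. For a '+' site it
    lies 1 nt downstream of the site; for a '-' site it lies 1 nt upstream.
    Sites too close to the sequence ends to yield 4 nt are skipped.
    """
    seq = seq.upper()
    found = []
    for i in range(len(seq) - 5):
        window = seq[i:i + 6]
        if window == SITE and i + 11 <= len(seq):
            found.append((i, "+", seq[i + 7:i + 11]))
        elif window == SITE_RC and i - 5 >= 0:
            found.append((i, "-", seq[i - 5:i - 1]))
    return found

# Hypothetical part flanked by two inward-facing BsaI sites.
part = "AA" + "GGTCTC" + "A" + "AATG" + "CCCCCC" + "GCTT" + "A" + "GAGACC" + "AA"
overhangs = type_iis_overhangs(part)
```

In this toy part the released fragment would carry AATG at one end and GCTT at the other, so the order of parts in an assembly is determined by which overhangs are designed to match.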
Golden Gate assembly has been embraced in the creation of two extensively utilized hierarchical modular DNA cloning methods: MoClo and Golden Braid. These methodologies employ DNA 'parts' organized in different hierarchical levels, with each level being more intricate than the one preceding it. Comprehensive reviews [43,44] provide invaluable resources for users seeking to initiate projects with these techniques. While these methods excel in seamlessly cloning multiple DNA fragments, their application has been constrained by limited vector choices, specifically the requirement for destination vectors designed to contain one (or more) appropriately oriented pairs of recognition sites for a type IIS restriction enzyme. Recent advancements [45] have addressed this limitation by introducing type IIS restriction sites that are compatible with type IIP restriction sites, significantly broadening the range of vectors compatible with modular DNA assembly. A summary of other recently developed modular assembly techniques is provided in Table 3.

Automated DNA Assembly
Amidst numerous breakthroughs, advancements, and refinements, a diverse array of in vivo and in vitro DNA cloning methods now cater to a spectrum of scenarios, from the straightforward insertion of a single DNA fragment into a vector to intricate assemblies involving large and/or multiple DNA fragments. As experimental designs grow increasingly complex, the norm is now the generation of hundreds, if not thousands, of library sequences. Consequently, there arises a demand for high-throughput methods capable of assembling a large number of DNA sequences to generate libraries.
The integration of automation into these processes (Table 4) offers several advantages, including the elimination of manual handling errors, enhanced reliability and reproducibility, and significant time and cost savings. This shift toward high-throughput methods is an important step in accommodating the evolving demands of sophisticated experimental designs with large-scale library creation. Liquid-handling robots like Opentrons play a central role in enabling this shift, offering benefits such as cost-effectiveness in hardware, open-source software accessibility, and user-friendly operation. These robotic systems prove invaluable for automating a diverse array of workflows, ranging from nucleic acid extraction to PCR and various DNA assembly techniques.
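One small piece of such an automated workflow is the worklist that maps every construct to a well. The sketch below is a generic illustration, assuming a single 96-well plate and a fully combinatorial design; the part names and function names are hypothetical, and it does not use or represent the software interface of any particular robot.

```python
from itertools import product
import string

def plate_wells(rows=8, cols=12):
    """Well names of a plate in row-major order (A1, A2, ..., H12)."""
    return [f"{r}{c}" for r in string.ascii_uppercase[:rows]
            for c in range(1, cols + 1)]

def assembly_worklist(part_sets):
    """Assign every combination of parts to one well of a 96-well plate.

    part_sets: ordered list of (position_name, [part names]) tuples.
    """
    combos = list(product(*(parts for _, parts in part_sets)))
    wells = plate_wells()
    if len(combos) > len(wells):
        raise ValueError("more constructs than wells on one plate")
    return [{"well": well, "parts": combo} for well, combo in zip(wells, combos)]

# Hypothetical design: 3 promoters x 2 RBSs x 4 coding sequences = 24 constructs.
design = [
    ("promoter", ["P1", "P2", "P3"]),
    ("rbs", ["R1", "R2"]),
    ("cds", ["G1", "G2", "G3", "G4"]),
]
worklist = assembly_worklist(design)
```

A list of this kind can then be translated into transfer steps for whichever liquid handler is available.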


Genetic Diversity Creation
The primary objective of creating genetic diversity is to generate various protein variants, with the hope that a subset of these variants, albeit often a small fraction, will exhibit favourable phenotypes compared to their parent or wildtype protein. The introduction of mutations into the GOI is a common approach in mutagenesis to achieve this diversity. The spectrum of protein variants obtained is influenced by several factors. Firstly, considering that there are 61 codons encoding 20 amino acids, altering the codon usage or the GC content of the DNA sequence can impact the mutagenesis outcome. This effect becomes particularly prominent when the mutational spectrum of the employed mutagenesis method is highly biased. Secondly, the organization of the genetic code imposes constraints on the mutagenesis outcome.
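The constraint imposed by the genetic code can be made tangible with a short calculation. The sketch below, which assumes the standard genetic code and uses hypothetical function names, lists the amino acids reachable from a codon when exactly one nucleotide is substituted, the most common outcome of low-frequency random mutagenesis.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def single_substitution_neighbours(codon):
    """Amino acids reachable from `codon` by changing exactly one nucleotide.

    Stop codons and synonymous changes are excluded.
    """
    original = CODON_TABLE[codon]
    reachable = set()
    for pos in range(3):
        for base in BASES:
            if base == codon[pos]:
                continue
            mutant = codon[:pos] + base + codon[pos + 1:]
            aa = CODON_TABLE[mutant]
            if aa != "*" and aa != original:
                reachable.add(aa)
    return reachable

sense_codons = [c for c, aa in CODON_TABLE.items() if aa != "*"]
met_neighbours = single_substitution_neighbours("ATG")
```

For the methionine codon ATG, for instance, only six of the other nineteen amino acids are accessible through a single substitution, and which six depends on the codon used, which is why codon usage affects the mutagenesis outcome.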
While protein engineers predominantly work with DNA, it is worth noting that RNA mutagenesis methods also exist. Fukuda et al. developed an RNA mutagenesis method that leverages the intracellular RNA-editing mechanism [55]. In this approach, guide RNAs direct the editing enzyme, human adenosine deaminase acting on RNA (ADAR), to induce adenosine-to-inosine (A-to-I) mutations on RNA molecules. Other RNA-based mutagenesis strategies involve using Qβ replicase to generate complex mRNA libraries [56]. However, compared to DNA, RNA is a transient molecule, making it more challenging to track genotype-phenotype linkage. The conversion of RNA to cDNA is typically required to identify mutations introduced in RNA. In the subsequent sections, our focus will primarily be on DNA mutagenesis methods.

Quality and Size of a Gene Library
It is strongly advisable to assess the quality of a gene library, especially when employing a mutagenesis method for the first time on the target gene sequence. This evaluation is commonly achieved by sequencing a small subset, typically ranging from 10 to 20 randomly selected 'variants'. If the gene library's quality is found to be suboptimal, proceeding to the expression of the protein library and subsequent selection/screening may not be prudent. The probability of identifying enhanced protein variants would be too minimal to justify the resources invested in such an undertaking.
Several key indicators are commonly employed to evaluate the quality of a gene library. Taking random DNA mutagenesis as an example, a high-quality gene library should exhibit the following characteristics:

• Mutations are precisely targeted to the GOI (i.e., no off-target mutations).
• Mutations are uniformly distributed along the entire GOI.
• All bases (A/T/G/C) experience mutations at the same frequency and are substituted with their three counterparts equally.
• The mutation frequency (number of errors per 1 kb of DNA) is not excessively high, preventing the predominance of non-functional protein variants.
• Duplicated sequences are avoided/eliminated.
• Wildtype sequences are absent in the gene library (i.e., no template carry-over).
Achieving an absolutely ideal gene library is practically impossible unless it exclusively comprises synthetic genes. The extent of deviation from this ideal serves as a practical measure of the gene library's quality. The Schwaneberg group conducted a comprehensive large-scale comparison among random mutagenesis libraries created using three approaches: error-prone polymerase chain reaction (epPCR) with low mutagenic conditions, epPCR with high mutagenic conditions, and Sequence Saturation Mutagenesis (SeSaM) [57-59]. After sequencing 1000 mutations for each library, the library quality was evaluated on both the DNA and protein levels [60,61]. The protein level assessment primarily employed the Mutagenesis Assistant Program (MAP) [62-64]. The SeSaM library exhibits a preference for transversion mutations, contrasting with epPCR libraries that display a transition bias. Additionally, the SeSaM library demonstrates a significantly higher number of consecutive nucleotide substitutions. These characteristics result in a greater number and more diverse amino acid substitutions in the SeSaM library.
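Several of the quality indicators listed above can be computed directly from a handful of sequenced clones. The following sketch is a simplified illustration and not the MAP tool or the analysis used in the cited studies: it assumes that variants have the same length as the parent (substitutions only, no InDels), and the function name and toy sequences are hypothetical.

```python
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}

def library_qc(parent, variants):
    """Summarize substitution statistics for sequenced library members."""
    substitutions = []
    for variant in variants:
        if len(variant) != len(parent):
            raise ValueError("this sketch handles substitutions only")
        substitutions.extend(
            (i, p, m) for i, (p, m) in enumerate(zip(parent, variant)) if p != m
        )
    n_transitions = sum((p, m) in TRANSITIONS for _, p, m in substitutions)
    sequenced_kb = len(parent) * len(variants) / 1000
    return {
        "mutations_per_kb": len(substitutions) / sequenced_kb,
        "transitions": n_transitions,
        "transversions": len(substitutions) - n_transitions,
        "wildtype_fraction": sum(v == parent for v in variants) / len(variants),
        "duplicate_fraction": 1 - len(set(variants)) / len(variants),
    }

# Hypothetical toy data: one wildtype clone and one duplicated variant.
parent = "ATGGCTAGCAAAGGAGAAGAA"
variants = [
    parent,                    # template carry-over
    "GTGGCTAGCAAAGGAGAAGAA",   # A->G transition at position 0
    "ATGGCTAGCAAAGGAGAAGAT",   # A->T transversion at position 20
    "GTGGCTAGCAAAGGAGAAGAA",   # duplicate of the second clone
]
report = library_qc(parent, variants)
```

With only 10 to 20 clones these numbers are rough estimates, but they are usually enough to flag a library dominated by wildtype sequences or by a strongly biased mutational spectrum.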
In directed evolution, preserving genotype-phenotype linkage is crucial for tracing an improved phenotype back to its genetic origins. During the library transformation process, transformants incorporating more than one plasmid could account for over 20% of the constructed library, thereby compromising the library quality. The frequency of such multiple-plasmid transformants can be reduced by optimizing the amount of plasmid DNA used for transformation [65].
The maximum library size of a mutagenesis method is determined by the transformation efficiency of the microbial host responsible for expressing the protein library. Table 5 provides a summary of the typical transformation efficiencies for commonly used microorganisms in biomanufacturing and protein evolution. Taking E. coli as an illustration, the achievable library size typically falls within the range of 10^9 to 10^10. For organisms with low transformation efficiencies, opting for an in vivo method for library creation is recommended. The ability to thoroughly explore this library of variants hinges on the capacity or throughput of the screening/selection method employed. The combination of a larger and higher-quality library enables more efficient exploration of the protein space, particularly when the exploration is not constrained by screening/selection capacity. This review deliberately omits the discussion of screening or selection as it has been extensively covered in our textbook [5] and several recent, excellent reviews [66-71]. Readers are encouraged to consult these references for a comprehensive understanding of this aspect.
1 Applied antimicrobial peptide LFcin-B to increase the permeability of the cell membrane and high concentration of Ca2+ and Mn2+ to suppress lethal antimicrobial properties.
2 Applied nutrient supplement to boost transformation efficiency.
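The relationship between library diversity and the number of clones that must be obtained or screened can be estimated with a simple sampling model. The sketch below is an approximation that is not taken from the cited references: it assumes all variants are equally likely and that clones are sampled independently (a Poisson approximation), which real libraries with biased mutational spectra violate. The function names and the example numbers are illustrative.

```python
import math

def expected_coverage(n_variants, n_clones):
    """Expected fraction of equally likely variants sampled at least once."""
    return 1 - math.exp(-n_clones / n_variants)

def clones_for_coverage(n_variants, coverage=0.95):
    """Number of clones needed to reach the requested expected coverage."""
    if not 0 < coverage < 1:
        raise ValueError("coverage must be between 0 and 1 (exclusive)")
    return math.ceil(-n_variants * math.log(1 - coverage))

# Hypothetical example: 20 amino acids at each of 3 positions = 8000 variants.
n_variants = 20 ** 3
needed = clones_for_coverage(n_variants, 0.95)
same_size_coverage = expected_coverage(n_variants, n_variants)
```

Under these assumptions, sampling as many clones as there are variants covers only about 63% of the library, and roughly three times as many clones are needed for 95% coverage, which illustrates why transformation efficiency and screening throughput quickly become limiting.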

Random Mutagenesis
Random mutagenesis continues to be a frequently employed method, particularly in cases where a structure-function relationship is lacking. In the following section, we will explore recent advancements in five key areas: (1) epPCR, (2) in vivo mutagenesis in E. coli, (3) random base editing, (4) virus-assisted mutagenesis, and (5) random insertion and deletion (InDel).

epPCR
Motivated by the technical simplicity of epPCR, several modified protocols have been developed. One adaptation is a modified epPCR protocol specifically optimized for small amplicons [82]. Another variation, known as Casting epPCR (cepPCR), directs random mutations to a specific region within the target GOI [82]. Furthermore, efforts are underway to extend epPCR methods beyond the conventional microbial hosts E. coli and S. cerevisiae, with attempts to optimize them for other microbial hosts. For example, in the transformation of Bacillus subtilis into a host for directed evolution, the epPCR product was fused with flanking regions and an antibiotic-resistant marker. This composite PCR product was then integrated into the chromosome through homologous recombination after transformation into the supercompetent cells of the B. subtilis strain SCK6 [83]. Additionally, a random mutagenesis protocol has been devised for the methylotrophic yeast Pichia pastoris (Figure 4) [84]. This protocol involves the sequential amplification of plasmids using Phi29 DNA polymerase, encompassing error-prone rolling circle amplification (RCA) followed by multiple displacement amplification (MDA). Through these steps, it becomes feasible to obtain microgram amounts of plasmids for subsequent electroporation into Pichia cells, addressing a key challenge in employing Pichia for directed evolution.

Figure 4. Initially, the circular protein expression vector undergoes repeated amplification through strand displacement reactions facilitated by Phi29 DNA polymerase. Mutations are intentionally introduced by adding Mn2+ to lower the fidelity of the polymerase, a process known as error-prone rolling circle amplification (epRCA). Following this, subsequent amplification, achieved through Phi29 DNA polymerase (or multiple displacement amplification, MDA), yields microgram quantities of mutated DNA. This mutated DNA is then utilized for transformation into P. pastoris to enable enzyme production.

In Vivo Mutagenesis in E. coli
In vivo mutagenesis provides several distinct advantages over in vitro mutagenesis. This approach allows the integration of genetic diversity creation and selection, eliminating the need for a separate transformation step that often limits the size of gene libraries in in vitro mutagenesis. Additionally, it avoids the labour-intensive process associated with in vitro gene library creation.
Liu and colleagues developed highly effective, inducible, broad-spectrum mutagenesis systems in E. coli, elevating mutation rates more than 320,000-fold over basal levels [85]. These plasmid systems rely on the induced and combinatorial expression of proteins associated with proofreading (dnaQ926), translesion synthesis (umuD', umuC, recA730), mismatch repair (dam, seqA), base excision (ugi, cda1), and base selection (emrR). Although efficient, this global mutagenesis strategy could introduce extensive mutations throughout the genome of the host, leading to undesirable issues such as toxicity, a reduced library size, the silencing of mutagenic plasmids, or the introduction of parasite variants into DNA libraries (mutations outside the target GOI that allow the host to circumvent the selection scheme).
MutaT7 (Figure 5) is a targeted in vivo mutagenesis strategy that overcomes the challenges associated with global mutagenesis strategies [86]. It employs a DNA-damaging cytidine deaminase fused to a processive T7 RNA polymerase (T7RNAP), allowing continuous directed mutations to any DNA region downstream of a T7 promoter. To enhance mutation rates, eMutaT7 replaces the cytidine deaminase from rat APOBEC1 (rApo1) with Petromyzon marinus cytidine deaminase (PmCDA1), resulting in an increased mutation rate from 0.34 mutations/kb/day (MutaT7) to 4 mutations/kb/day [87]. The MutaT7 toolbox has recently been expanded further with the use of adenosine deaminase-T7 RNA polymerase fusion proteins [88], such as TadA8e fused to T7RNAP and TadA7.10 fused to T7RNAP. TadA8e and TadA7.10 are variants of E. coli tRNA adenosine deaminase, evolved to operate on DNA [89]. Despite its utility, the MutaT7 toolkit has limitations, including a restricted mutational spectrum, strand bias, and the necessity to prevent the repair of deoxyuridine (e.g., deleting uracil DNA glycosylase in the host or using a uracil DNA glycosylase inhibitor) for significant mutagenesis when using cytidine deaminase.
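To put the reported rates into perspective, the expected mutational load on a target gene can be estimated by simple arithmetic. The sketch below uses the two rates quoted above (0.34 and 4 mutations/kb/day); the gene length, the duration, and the Poisson model of independent mutation events are illustrative assumptions rather than values or models taken from the cited studies.

```python
import math


def expected_mutations(rate_per_kb_per_day: float, gene_length_bp: int, days: float) -> float:
    """Mean number of mutations accumulated in a gene of the given length."""
    return rate_per_kb_per_day * gene_length_bp / 1000.0 * days


def prob_at_least_one(mean_mutations: float) -> float:
    """P(>= 1 mutation), assuming Poisson-distributed, independent events."""
    return 1.0 - math.exp(-mean_mutations)


# Rates quoted in the text; the 1 kb gene and 1 day window are hypothetical.
for name, rate in (("MutaT7", 0.34), ("eMutaT7", 4.0)):
    mean = expected_mutations(rate, gene_length_bp=1000, days=1)
    print(f"{name}: mean = {mean:.2f}, P(>=1 mutation) = {prob_at_least_one(mean):.3f}")
```

Under these assumptions, the calculation indicates how many days of continuous mutagenesis are needed before most library members carry at least one mutation.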
SynBio 2024, 2, FOR PEER REVIEW 13
When employing a highly processive T7RNAP, mutations may extend beyond the intended target GOI. To confine the mutagenesis to a specific region, multiple copies of the T7 terminator are required, serving as a boundary to restrict mutagenesis [86]. Alternatively, in the T7-DIVA strategy [90], the catalytically dead Cas9 (dCas9), tethered to a custom-designed crRNA, acts as a 'roadblock' for the base deaminase-T7RNAP fusion proteins, effectively constraining the window of mutagenesis (Figure 5).
Figure 5. Graphic summary of the MutaT7 mutagenesis system and its derivatives. The T7 RNA polymerase fusion (T7RNAP) selectively binds to the T7 promoter, initiating transcription and traversing the gene of interest. As the fusion carries a base editor (BE), mutations are randomly introduced into the gene, represented by blue vertical stripes. The fusion halts and disengages from the DNA upon encountering a dCas9 molecule bound to a specific sequence dictated by the CRISPR RNA (crRNA). The termination process is also facilitated by the trans-activating CRISPR RNA (tracrRNA). In the absence of dCas9, the movement of the fusion protein can be halted by incorporating one or multiple T7 terminators.
Instead of relying on base deaminase as a mutagenic agent and T7RNAP as a 'guide protein' (GP) to target mutations to specific loci, the EvolvR system offers continuous nucleotide diversification within a tunable window length at user-defined loci [91,92]. This is accomplished by directly generating mutations using engineered DNA polymerases (variant of E. coli PolI with reduced fidelity) targeted to specific loci through CRISPR-guided nickases (nCas9).

Random Base Editing
We introduced the use of base editors (BE, e.g., cytidine deaminase and adenosine deaminase) for random mutagenesis in E. coli in Section 3.2.2. At first glance, this section on random base editing may seem redundant or repetitive. However, dedicating an additional section to random base editing is warranted for three reasons: (1) to discuss method variation, (2) to highlight the versatility of this approach, extending it to a wide range of organisms/cells beyond E. coli, and (3) to showcase its diverse applications. Indeed, the summary of methods in Table 6 emphasizes the popularity of this approach, further justifying a dedicated section on this topic.
Methods utilizing BEs for targeted random mutagenesis share a common chimeric protein design (Figure 6): a BE is tethered to a GP, with or without any accessory protein(s). The BE functions as the mutagenic agent by deaminating cytidine to uridine or adenosine to inosine. The GP directs the BE to specific gene loci for mutagenesis. The key variations among methods primarily reside in four areas: (1) the choice of BE, (2) the selection of GP, (3) the mechanisms of linkage between the GP and BE, and (4) the utilization or absence of accessory protein(s).
Figure 6. Targeted random mutagenesis using chimeric proteins comprising a base editor (BE) and a guide protein (GP), following the general BE-GP protein architecture. BE is the mutagenic agent, introducing random mutations through its base-editing activity (e.g., cytidine and adenosine deamination). GP, with DNA-binding capability, guides or leads the BE to its target locus within the gene of interest (GOI) to effect mutagenesis. Typical BE choices include cytidine deaminase (CDA) or error-prone DNA polymerase (Pol). Frequently used GP candidates are T7 RNA polymerase (RNAP) or catalytically dead Cas9 (dCas9)/Cas9 nickase (nCas9). BE is tethered to GP via gene fusion or protein/protein or protein/RNA interactions through the utilization of SRC homology domain 3 (SH3) and the MS2 bacteriophage coat protein. In some methods, an accessory protein such as uracil DNA glycosylase inhibitor (UGI) is required.
The most commonly employed BEs include apolipoprotein B mRNA editing enzyme catalytic subunit 1 (rAPOBEC1) from rats (Rattus norvegicus), cytidine deaminase (PmCDA1) from sea lamprey (Petromyzon marinus), human activation-induced cytidine deaminase (hAID), E. coli tRNA adenosine deaminase (TadA), and their respective variants. CRISPR-Cas and T7RNAP represent the most popular choices for GPs. While gene fusion stands out as the most straightforward method for connecting a BE to its GP, alternative strategies have proven effective, including the utilization of the SRC homology domain 3 (SH3) and the MS2 bacteriophage coat protein. To augment base-editing efficiency, the inclusion of accessory proteins is frequently essential. Uracil DNA glycosylase inhibitor (UGI), an 83-residue protein derived from Bacillus subtilis bacteriophage PBS1, is extensively utilized for this purpose [93]. Additionally, the incorporation of mismatch and base excision proteins, Apn2p and Msh6p, has been shown to enhance editing efficiency.
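Because deaminases convert only specific bases, the choice of BE constrains which amino acid substitutions are accessible from a given codon. The following sketch enumerates the residues reachable by a single deamination event, assuming that cytidine deamination reads out as C>T (or G>A when the opposite strand is edited) and adenosine deamination as A>G (or T>C). The function name and the example codons are hypothetical, and sequence-context preferences of individual deaminases are ignored.

```python
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, codons enumerated in TCAG order; '*' marks a stop codon.
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Coding-strand substitutions expected when either DNA strand is deaminated.
EDITS = {
    "cytidine deaminase": {"C": "T", "G": "A"},
    "adenosine deaminase": {"A": "G", "T": "C"},
}


def reachable_amino_acids(codon: str, editor: str) -> set:
    """Residues reachable from `codon` by one single-base edit of `editor`."""
    codon = codon.upper()
    substitutions = EDITS[editor]
    reachable = set()
    for i, base in enumerate(codon):
        if base in substitutions:
            mutant = codon[:i] + substitutions[base] + codon[i + 1:]
            reachable.add(CODON_TABLE[mutant])
    reachable.discard(CODON_TABLE[codon])  # ignore silent changes
    return reachable


# Hypothetical example: alanine codon GCC under each editor class.
print(reachable_amino_acids("GCC", "cytidine deaminase"))
print(reachable_amino_acids("GCC", "adenosine deaminase"))
```

Such an enumeration makes the restricted mutational spectrum of a single editor explicit and suggests why combining cytidine and adenosine deaminases broadens the accessible sequence space.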
With the BE-GP approach demonstrating growing maturity and proven success across diverse microbes and mammalian cells, we foresee ongoing developments and expect to witness adaptations of this method to additional microbial systems.

Virus-Assisted Mutagenesis
Viruses provide a distinctive avenue for designing rapid laboratory evolution experiments, capitalizing on their inherent capacity to evolve at a much faster pace than many living organisms [105]. This accelerated evolution is facilitated by their smaller genome size, which tolerates a high frequency of mutations and a rapid rate of replication. These attributes present an excellent opportunity for the directed evolution of various biomolecules.
In the Viral Evolution of Genetically Actuating Sequences (VEGAS) method (Figure 7), the highly mutagenic RNA alphavirus Sindbis, belonging to the Togaviridae family and lacking known proof-reading capability, was employed to establish a mammalian directed evolution system [106]. Estimates of RNA virus mutation frequencies range from 10^-5 to 10^-3 mutations per base replicated. The mutation rate of the Sindbis virus was quantified to be 1.0 × 10^-4 ± 3.7 × 10^-5 mutations/base/hr. To create a robust directed evolution platform that capitalizes on the replicative and mutagenic potential of the Sindbis virus, artificial selective pressure must be applied. Each Sindbis viral particle requires 240 copies of the structural proteins E1, E2, and capsid to form a functional viral particle capable of maturing and propagating; without this envelope, the virus is unable to mature and propagate. By engineering restrictions on structural genome transcription, it is possible to apply selective pressure to transgenic Sindbis virus carrying a GOI. The VEGAS system therefore allows for the simultaneous operation of viral mutagenesis, selection, and heredity. It is noteworthy that other reports on virus-assisted directed evolution which did not utilize viruses for creating genetic diversity are not included in this review.
Figure 7. This approach relies on the use of the Sindbis virus for efficient and mutagenic viral propagation in mammalian cell culture. To establish a robust directed evolution platform that harnesses the replicative and mutagenic potential of the Sindbis virus, artificial selective pressure must be applied. A crucial aspect involves the requirement for 240 copies of each structural protein (E1, E2, and capsid) in each Sindbis viral particle to form a functional unit capable of maturation and propagation. Without this envelope, the virus cannot mature or propagate. By strategically introducing limitations on the transcription of the structural genome, selective pressure can be applied to the transgenic Sindbis virus. In the VEGAS platform, the structural genome of the Sindbis virus is cloned into the mammalian expression vector pSSG, under the regulation of the tetracycline operator sequence. The structural genome elements of the Sindbis genome are then replaced with a transgene encoding a tetracycline transactivator. Propagation and selection can then be performed in mammalian cell culture, by infecting cells transfected with pSSG with the pTSin packaged virus.

Random Insertion and Deletion
Insertions and deletions in genomes occur naturally due to replication slippage or the error-prone non-homologous end joining (NHEJ) of double-stranded breaks. While their utilization in protein engineering is infrequent due to their tendency to be highly deleterious, often resulting in frame-shift mutations that significantly alter the protein sequence or prematurely terminate translation, there is evidence suggesting potential benefits. Instances of altered protein functionality through insertions and deletions have been reported, such as the broadening of substrate specificity in β-lactamase [107] and the modification of coenzyme specificity in Rossmann fold enzymes [108]. These findings underscore the need for further exploration and investigation into the potential applications of this approach in protein engineering.
Methods for insertion and deletion in protein engineering have been developed over the last two decades, as comprehensively outlined in a recent review [109]. Some approaches leverage the higher slippage rates of certain polymerases to insert or delete one to two bases, inducing frame-shift mutations. Alternatively, other methods involve the fragmentation of GOI using DNase I or cleavage through exonucleases, endonucleases, or chemicals. Subsequently, nucleotides are added through processes like terminal deoxynucleotide transferase [110] or rolling circle amplification [111], incorporating random numbers or blocks of nucleotides. This results in libraries containing both in-frame and frame-shift variants.
Another set of methods relies on transposons to generate insertion and deletion libraries, and these approaches can prevent frame-shift occurrences through careful transposon sequence design [112–114]. Typically, transposons are designed with recognition sites for selected restriction enzymes, enabling cleavage and subsequent religation to generate the insertion or deletion of nucleotide triplets. A notable example is the recently developed TRIAD method (Figure 8) that randomly inserts or deletes one to three nucleotide triplets [115]. This method randomly inserts engineered mini-Mu transposons, defining the location of the insertion/deletion event. For deletion, the recognition site for the type II restriction enzyme MlyI is designed at both ends of the transposon. After MlyI digestion and religation, 3 bp are deleted. Repeating this process using MlyI custom cassettes enables longer deletions of up to 9 bp. For insertion, an asymmetric transposon with NotI and MlyI restriction sites is used. Following restriction digestion, a cassette carrying up to three NNN triplets can be inserted. The TRIAD method has been applied to evolve arylesterase activity in a phosphotriesterase [115] and enhance antibody affinity [116].
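The outcome of a triplet-deletion library can be pictured in silico by removing 3, 6, or 9 contiguous bases at every possible position of a sequence. The sketch below is only a conceptual model of the resulting sequence space; it does not simulate transposon insertion or restriction digestion, and the function name and example sequence are hypothetical.

```python
def triplet_deletion_variants(seq: str, n_triplets: int = 1) -> list:
    """All distinct sequences obtained by deleting 3 * n_triplets contiguous bases."""
    if not 1 <= n_triplets <= 3:
        raise ValueError("this sketch models deletions of one to three triplets")
    size = 3 * n_triplets
    if len(seq) < size:
        raise ValueError("sequence is shorter than the requested deletion")
    # The deletion may start at any base, so it need not be codon-aligned;
    # removing a multiple of 3 bp still preserves the downstream reading frame.
    variants = {seq[:i] + seq[i + size:] for i in range(len(seq) - size + 1)}
    return sorted(variants)


# Hypothetical 9 bp example; identical products from different start
# positions are collapsed, so fewer variants than positions may result.
variants = triplet_deletion_variants("ATGAAACCC", n_triplets=1)
print(len(variants), variants)
```

Deletions that are not codon-aligned can also substitute the residue at the junction, which is one reason such libraries sample more than simple residue removals.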

Focused Mutagenesis
Advancement in focused mutagenesis is marked by five noteworthy trends. These trends encompass (1) a transition from site-directed mutagenesis to multi-site-directed mutagenesis and even massive mutagenesis, (2) the incorporation of CRISPR/Cas9 in focused mutagenesis, mirroring trends seen in cloning and random mutagenesis, (3) the formulation of strategies to minimize library size, (4) the development of computational tools for automated oligo design, and (5) a shift from column-synthesized oligos to microarray-synthesized oligos for protein engineering. The subsequent sections will discuss each of these trends individually.
Figure 8. [...] Subsequent self-ligation reestablishes the target sequence, now with a deletion of 2 or 3 triplets. This versatile approach also allows for the creation of insertion libraries.

Multi-Site-Directed Mutagenesis
Significant progress in multi-site-directed mutagenesis has predominantly emerged in methodologies utilizing single-stranded DNA (ssDNA) templates, as summarized in Table 7. A key example is nicking mutagenesis (Figure 9A) [117], leveraging a nicking enzyme (Nt.BbvCI) and exonucleases (ExoIII, ExoI) for the preparation of ssDNA templates. This approach involves a pool of mutagenic primers defining targeted mutagenesis sites, which anneal to the ssDNA and undergo isothermal assembly. Following template strand removal, the complementary mutant strand is synthesized via PCR. Continuous protocol enhancements have been reported to bolster efficiency (Table 7). In contrast, the SLUPT method generates an ssDNA template by eliminating the phosphorylated strand in the linearly amplified GOI [118].
Another group of methods opts for the creation of gene fragments bearing the desired mutations, which are subsequently combined in a sequential assembly [e.g., Combinatorial Codon Mutagenesis (Figure 9B)] using techniques like megaprimer PCR, overlap extension PCR, or Golden Gate assembly. Drawing inspiration from genome recombineering methods such as MAGE, Higgins et al. devised a plasmid recombineering method for in vivo multi-site-directed mutagenesis (Figure 9C) [119].
Another development involves leveraging solid-phase gene synthesis technologies to design libraries [120,121]. This approach minimizes the risk of stop codons and unintended mutations during PCR, thereby reducing screening efforts [120]. Notably, Öling et al. achieved an impressive 161 multi-site mutations using this methodology [122].
Figure 9. (A) Nicking mutagenesis: [...] selectively nicked by Nt.BbvCI. The resulting nicked strand undergoes degradation by Exonuclease III (ExoIII), creating a single-stranded DNA (ssDNA) template. To eliminate insufficiently digested DNA, Exonuclease I (ExoI) is employed. Next, phosphorylated mutagenic primers are annealed to the ssDNA parental template. The mutagenic strand is then synthesized through the collaborative action of a polymerase and ligase. Following this synthesis, the WT template strand is nicked by Nb.BbvCI and subsequently digested by ExoIII. The introduction of a second primer initiates the synthesis of the complementary mutant strand, resulting in the generation of mutagenized dsDNA. (B) Combinatorial codon mutagenesis: This process initiates with two parallel PCR reactions. In one, mutagenic reverse primers are employed, while in the other, mutagenic forward primers are utilized. In the subsequent step, a third PCR is employed to combine the fragments generated from the preceding PCR reactions. (C) Plasmid recombineering: This method involves the direct in vivo incorporation of synthetic oligonucleotides carrying desired mutations into a gene of interest. These oligonucleotides are introduced into E. coli cells via electroporation and can recombine with resident plasmids, facilitated by the lambda phage protein Beta.
Table 7 (excerpt of method descriptions):
• Megaprimers generated by one-pot or multi-pot amplification with mutagenic primers and an end primer.
• Random combinations of mutations generated by PCR reassembly of GOI with mutant megaprimers and two end primers.
• Library cloned by Gibson assembly.
• Fragments generated using mutagenic primers with Type IIS recognition sites.
• In vivo method.
• Plasmid and mutagenic primers electroporated into a recombineering strain.
• Repeat recombineering using variant plasmids as template to increase sites of mutations.

CRISPR/Cas9-Mediated Mutagenesis
She et al. introduced a PCR-free, two-step In vitro CRISPR/Cas9-mediated Mutagenic (ICM) system designed for both single-site- and multi-site-directed mutagenesis (Figure 10) [131]. In the first step, site-specific plasmid digestion is achieved by employing a complex of Cas9 with specific single guide RNA (sgRNA), followed by degradation with T5 exonuclease to create a 15-nucleotide complementary overhang. Subsequently, in step 2, primers containing the desired mutations are annealed to generate double-stranded DNA fragments, which are then ligated into the linearized plasmid. A distinct advantage of employing a PCR-free approach is the attainment of greater genetic diversity, attributed to the elimination of biases introduced by PCR, such as preferential primer binding to the template. Anticipating ongoing advancements, we foresee more methodologies integrating the use of CRISPR/Cas9 in the near future.

The Numbers Game
In focused mutagenesis, the library size (number of permutations) expands exponentially as the number of target sites increases. When combined with oversampling to ensure a certain degree of diversity coverage, this exerts significant strain on library screening, making the navigation of the entire diversity space practically challenging, a scenario often denoted as permutation explosion. Commonly, three strategies are employed to tackle this issue: the divide-and-conquer approach, the codon randomization strategy, and the use of machine learning (ML).
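The scale of the permutation explosion can be illustrated with a short calculation. The sketch below assumes NNK randomization (32 codons per site) and that all codon combinations are equally likely, which real libraries only approximate; under that assumption, the number of clones to screen so that any given variant is sampled at least once with probability F is ln(1 − F)/ln(1 − 1/V) for a library of V variants. The function names are hypothetical.

```python
import math


def library_size(n_sites: int, codons_per_site: int = 32) -> int:
    """Number of codon combinations when n_sites are randomized simultaneously."""
    return codons_per_site ** n_sites


def clones_for_coverage(n_variants: int, coverage: float = 0.95) -> int:
    """Clones to screen so a given variant is seen with probability `coverage`.

    Assumes every variant is equally represented in the library.
    """
    if n_variants < 2 or not 0.0 < coverage < 1.0:
        raise ValueError("need n_variants >= 2 and 0 < coverage < 1")
    return math.ceil(math.log(1.0 - coverage) / math.log1p(-1.0 / n_variants))


for sites in range(1, 5):
    size = library_size(sites)
    print(f"{sites} site(s): {size} codon combinations, "
          f"{clones_for_coverage(size)} clones for 95% coverage")
```

For 95% coverage the oversampling factor approaches three, so each additional NNK site multiplies the screening effort roughly 32-fold.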
GeneORator [132] employs a divide-and-conquer approach. In this method, target sites are partitioned into subsets, such as subsets A and B. A permutation library is established for each subset, generating libraries A and B. Both libraries A and B are then independently screened for the desired phenotype. Another notable divide-and-conquer strategy is iterative saturation mutagenesis (ISM) pioneered by Reetz [133], taking the approach one step further. Beneficial variants from each subset act as templates to randomize the other subset, leading to the creation of new libraries, AB or BA, for the subsequent round of screening. A potential limitation of the divide-and-conquer approach is that it may not capture some pairwise and higher-order effects, known as epistasis, between residues in the gene libraries.
M-ISM, a variant of the ISM method [134], employs a computational tool to design a 'small-intelligent' library for each subset. The term 'small-intelligent' denotes a minimal gene library size devoid of inherent amino acid biases, stop codons, or rare codons specific to the protein expression host [135]. Rather than sampling each target site with 19 amino acid alphabets, it is feasible to obtain improved protein variants using reduced amino acid alphabets, potentially even with just one or two alphabets [136].
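The effect of codon choice on library size and composition can be checked by expanding a degenerate codon into its constituent codons. The sketch below uses the standard genetic code and IUPAC nucleotide codes; the function names are hypothetical, and it does not reproduce the design algorithm of any published tool.

```python
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, codons enumerated in TCAG order; '*' marks a stop codon.
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# IUPAC degenerate nucleotide codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand(degenerate_codon: str) -> list:
    """List every concrete codon encoded by a three-letter degenerate codon."""
    if len(degenerate_codon) != 3:
        raise ValueError("a codon has exactly three positions")
    choices = (IUPAC[base] for base in degenerate_codon.upper())
    return ["".join(c) for c in product(*choices)]


def summarize(degenerate_codon: str) -> dict:
    """Count codons, distinct amino acids, and stop codons for a degenerate codon."""
    residues = [CODON_TABLE[c] for c in expand(degenerate_codon)]
    return {
        "codons": len(residues),
        "amino_acids": len(set(residues) - {"*"}),
        "stops": residues.count("*"),
    }


for codon in ("NNN", "NNK", "NDT"):
    print(codon, summarize(codon))
# NNN: 64 codons, 20 amino acids, 3 stops
# NNK: 32 codons, 20 amino acids, 1 stop
# NDT: 12 codons, 12 amino acids, 0 stops
```

A reduced alphabet such as NDT removes stop codons and codon redundancy at the cost of sampling only 12 residues per site.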
Degenerate Codon Optimization for Informed Libraries (DeCOIL) is a recently introduced ML method that optimizes degenerate codon libraries to sample protein variants likely to exhibit both high fitness and high diversity within the sequence search space [137]. Like all ML approaches, DeCOIL necessitates a training library to develop its ML model.

Automated Oligo Design
Mutation Maker [138] and GeneGenie [139] serve as oligo design platforms for protein engineering. Taking Mutation Maker as an example, the platform designs oligos for various applications, including site-scanning saturation mutagenesis, multi-site-directed mutagenesis, and PCR-based accurate gene synthesis. The designed oligos can be assembled through overlap extension PCR, with the program optimizing melting temperature (Tm) and codon choice. The use of degenerate oligos remains popular in focused mutagenesis. Web-based applications like CodonGenie [140] facilitate the selection of degenerate codons. For high-throughput or massively parallel protein engineering, automated oligo design proves useful in minimizing errors and expediting experimental design.
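As a toy illustration of what Tm-driven oligo design involves, the sketch below extends an oligo along a template until a target Tm is reached. It uses the Wallace rule (Tm = 2(A+T) + 4(G+C), a rough estimate intended for short oligos), which is an assumption made here for simplicity and not the thermodynamic model used by Mutation Maker or GeneGenie. The template sequence, function names, and default thresholds are all hypothetical.

```python
def wallace_tm(oligo: str) -> int:
    """Rough melting temperature in degrees Celsius by the Wallace rule."""
    s = oligo.upper()
    at = s.count("A") + s.count("T")
    gc = s.count("G") + s.count("C")
    return 2 * at + 4 * gc


def extend_to_tm(template: str, start: int, target_tm: int = 60,
                 min_len: int = 15, max_len: int = 40):
    """Return the shortest oligo from `start` whose Wallace Tm reaches `target_tm`.

    Returns None if no oligo within the length limits qualifies.
    """
    for length in range(min_len, max_len + 1):
        candidate = template[start:start + length]
        if len(candidate) < length:
            break  # ran off the end of the template
        if wallace_tm(candidate) >= target_tm:
            return candidate
    return None


# Hypothetical template used only to exercise the functions.
template = "ATGGCTAGCAAAGGAGAAGAACTTTTCACTGGAGTTGTCCCA"
oligo = extend_to_tm(template, start=0)
print(oligo, len(oligo), wallace_tm(oligo))
```

A dedicated design platform would additionally balance Tm across all overlaps and screen for secondary structure, which this sketch does not attempt.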

Oligo Pool (oPool) for Cost-Effective Library Construction
Rather than relying on oligonucleotides with degenerate codons synthesized through solid-phase phosphoramidite chemistry in column-based synthesis, there is a growing preference for using microarray-synthesized oligonucleotides or oligo pools (oPools) [141]. oPools are mixes of thousands of individually designed polynucleotides, each up to 350 bases in length.
Individually designed oligos offer several distinct advantages, including the removal of redundancy from the gene library, the extensive coverage of mutational space, the precise selection of codons optimized for protein expression in a chosen host, and the facilitation of studies such as deep mutational scanning (DMS) [142] that necessitate large oligonucleotide pools. Programmed Allelic Series (PALS) is a technique devised for massively parallel single amino acid mutagenesis [143]. It achieves this by integrating a low-cost oPool with overlap extension mutagenesis. oPools have been demonstrated to yield higher-quality gene libraries than degenerate oligos [144].
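The idea of individually designed sequences, with one chosen codon per substitution and no redundancy, can be sketched as follows. The example enumerates every single amino acid substitution of a short coding sequence, giving 19 variants per position. The preferred-codon table is a hypothetical choice of one codon per amino acid and should be replaced by host-specific data; for brevity the sketch emits full-length variant sequences rather than the short mutagenic oligos with flanking homology used in practice.

```python
# Standard genetic code, written in TCAG order for each codon position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Hypothetical "one codon per amino acid" table (an assumption for illustration).
PREFERRED_CODON = {
    "A": "GCG", "C": "TGC", "D": "GAT", "E": "GAA", "F": "TTT",
    "G": "GGC", "H": "CAT", "I": "ATT", "K": "AAA", "L": "CTG",
    "M": "ATG", "N": "AAC", "P": "CCG", "Q": "CAG", "R": "CGT",
    "S": "AGC", "T": "ACC", "V": "GTG", "W": "TGG", "Y": "TAT",
}


def single_substitution_variants(coding_seq: str) -> dict:
    """Map labels such as 'K2A' to sequences carrying that single substitution."""
    seq = coding_seq.upper()
    if len(seq) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    variants = {}
    for pos, codon in enumerate(codons, start=1):
        wild_type = CODON_TABLE[codon]
        if wild_type == "*":
            continue  # leave stop codons untouched
        for aa, new_codon in PREFERRED_CODON.items():
            if aa == wild_type:
                continue  # skip synonymous replacements
            mutated = codons[:pos - 1] + [new_codon] + codons[pos:]
            variants[f"{wild_type}{pos}{aa}"] = "".join(mutated)
    return variants


variants = single_substitution_variants("ATGAAAGGC")  # toy sequence encoding M-K-G
print(len(variants), "designed variants; example K2A:", variants["K2A"])
```

Because each substitution is encoded exactly once, the pool size scales linearly with protein length instead of being inflated by codon redundancy.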
While oPools show significant promise for protein engineering, it is essential to acknowledge certain challenges [141]. First, although the number of individual user-defined oligo sequences in a pool is large, their individual concentrations are low. Second, in focused mutagenesis applications, some oligos might preferentially bind to the template. Exacerbated by low concentrations, this means that certain mutations may be absent from the final constructed gene library. Third, as the length of the oligos in the pool increases, the percentage of truncated molecules also rises, further diminishing the expected concentration of full-length molecules. Fourth, the error rates for oPools are typically higher than those for column-synthesized oligos.
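The relationship between oligo length and truncation can be approximated with a simple stepwise-synthesis model: if each coupling step succeeds with probability p, an oligo of n bases requires n - 1 couplings and the full-length fraction is p^(n-1). The sketch below applies this model; the coupling efficiency of 0.99 is an arbitrary illustrative value, not a figure reported for any oPool product, and the function name is hypothetical.

```python
def full_length_fraction(length_nt: int, coupling_efficiency: float = 0.99) -> float:
    """Expected fraction of full-length molecules after stepwise synthesis."""
    if length_nt < 1:
        raise ValueError("length must be at least 1 nucleotide")
    if not 0.0 < coupling_efficiency <= 1.0:
        raise ValueError("coupling efficiency must be in (0, 1]")
    return coupling_efficiency ** (length_nt - 1)


# Compare a few oligo lengths under the assumed coupling efficiency.
for length in (60, 150, 350):
    fraction = full_length_fraction(length)
    print(f"{length} nt: {fraction:.1%} full-length (assuming 99% coupling)")
```

Even a small per-step loss compounds over hundreds of couplings, which is consistent with the third challenge listed above.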
Irrespective of the method chosen for creating a gene library, effective target site identification is crucial for focused mutagenesis. A recent comprehensive review by the Dalby group offers an excellent overview of methodologies for identifying target sites based on sequence or structural information [145]. Additionally, the review provides a valuable list of computational tools designed for target site identification.

DNA Recombination
While considerable strides have been made in the random mutagenesis and focused mutagenesis categories, advancements in the DNA recombination category have been relatively modest. One study sought to tackle the challenge of recombining protein modules from distant parents with minimal disruption at crossover sites. To overcome this hurdle, an approach called key motif-directed recombination was introduced [146]. Members of the same protein superfamily often share common structural or sequence motifs, and the method strategically places crossovers at these key sequence motif regions. It was validated through the creation of α/β-hydrolase chimeras, which retained their biological functions and exhibited desirable properties.
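The principle of placing crossovers at shared motifs can be sketched at the protein-sequence level. The example below cuts each parent immediately after a motif match and reassembles modules from different parents. The two parent sequences are invented toy strings, and the G-x-S-x-G pattern is a stand-in motif chosen for illustration; neither is taken from the cited study, whose recombination is performed on DNA and uses its own motif definitions.

```python
import re
from itertools import product


def split_at_motifs(seq: str, motif_patterns) -> list:
    """Cut `seq` right after each motif match, in order, and return the modules."""
    modules, start = [], 0
    for pattern in motif_patterns:
        match = re.compile(pattern).search(seq, start)
        if match is None:
            raise ValueError(f"motif {pattern!r} not found in sequence")
        modules.append(seq[start:match.end()])
        start = match.end()
    modules.append(seq[start:])
    return modules


def chimeras(parents, motif_patterns) -> dict:
    """Build every non-parental combination of modules, keyed by module origin."""
    per_parent = [split_at_motifs(p, motif_patterns) for p in parents]
    n_modules = len(motif_patterns) + 1
    result = {}
    for origin in product(range(len(parents)), repeat=n_modules):
        if len(set(origin)) == 1:
            continue  # all modules from one parent: not a chimera
        result[origin] = "".join(per_parent[src][i] for i, src in enumerate(origin))
    return result


# Toy parents (hypothetical sequences) sharing one G-x-S-x-G motif.
parent_a = "MKPLVLVHGHSMGGWYAT"
parent_b = "MSRTIVFIHGESAGAFLVQ"
for origin, sequence in chimeras([parent_a, parent_b], [r"G.S.G"]).items():
    print(origin, sequence)
```

Because every crossover falls inside a region common to both parents, the junctions are less likely to disrupt local structure than crossovers placed at arbitrary positions.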

Applications of Genetic Diversity Creation
Directed evolution has firmly established itself as a powerful tool in biotechnology, diversifying its applications from evolving single enzymes or proteins to fascinating areas such as metabolic pathway engineering, organismal engineering, viral engineering, and the engineering of molecular biology tools. In the section below, we provide recent examples of applications beyond engineering a single protein.

Biofuels and Biochemicals
Biofuels have emerged as a promising alternative to fossil fuels in sustainable energy solutions. Agricultural waste, a rich source of cellulose, provides a valuable feedstock for biofuel production [147]. Cellobiose, an intermediate in the conversion of cellulose into glucose monomers for fermentative processes, is crucial for efficient biofuel production. Directed evolution has played a key role in engineering two essential proteins in the cellobiose utilization pathway in S. cerevisiae: β-glucosidase and cellodextrin transporter [148,149]. These advancements have significantly increased ethanol production, contributing to the development of a more sustainable biofuel industry.
In Section 3.2.3 above, we extensively covered the use of the 'BE-GP' protein architecture for targeted random mutagenesis in a wide range of biological systems. By excluding the GP, this method transforms into a potent global mutagenesis approach suitable for adaptive laboratory evolution (ALE). Pan et al. showcased the application of BE for the ALE of S. cerevisiae, enhancing resistance to isobutanol and acetate, and boosting the production of β-carotene [104].

Bioremediation
Given the rising global population and the associated increase in demand for food, materials, chemicals, and energy to support contemporary lifestyles, there is an urgent need for bioremediation tools to address and alleviate the adverse environmental impacts of their production and manufacturing processes.
Polyethylene terephthalate (PET) is one of the most commonly used polyester plastics, finding extensive applications in fabrics and packaging. However, the thermomechanical recycling of PET faces challenges, including the gradual decline in mechanical properties over time, contamination, and the energy-intensive nature of these recycling methods [150]. Although various PET hydrolases capable of converting PET into monomers have been identified, their large-scale implementation is hindered by limited efficiency.
A recent study aimed at enhancing the activity and thermal stability of leaf-branch compost cutinase (LCC) through enzyme engineering resulted in the creation of a mutant with high catalytic activity and increased thermostability, LCC ICCG [151]. Achieved through the saturation mutagenesis of amino acid residues within the binding pocket and the introduction of a disulfide bridge, LCC ICCG demonstrated the ability to convert 90% of PET into monomers on an industrial scale. In a separate study, Chen et al. explored the fusion of carbohydrate-binding modules (CBMs) to LCC ICCG. This modification significantly enhanced the enzyme's binding affinity to PET [152]. These results exemplify the potential of protein engineering in addressing challenges linked to plastic utilization and contribute to the ongoing development of efficient and sustainable bioremediation solutions.

Agriculture and Food Production
Directed evolution is making contributions to the field of plant biology. The evolution of genes in crops has the potential to bolster their resistance against pests, diseases, or adverse environmental conditions, ultimately resulting in higher yields and enhanced food security. A recent study, leveraging the synergy of CRISPR/Cas and directed evolution, has successfully developed a herbicide-resistant spliceosomal protein, SF3B1 [153]. This plant-based directed evolution platform proves instrumental in investigating and evolving the molecular functions of essential biomolecules. Moreover, it facilitates the engineering of crop traits to enhance performance and adaptability under the changing conditions associated with climate change.

Diagnostics and Healthcare
In a recent study, directed evolution was leveraged to enhance the transduction capabilities of adeno-associated viruses (AAVs) [154]. Directed evolution of the AAV capsid protein resulted in variants exhibiting significantly improved transduction efficiencies. This breakthrough facilitated the genetic manipulation of microglia, specialized immune cells in the central nervous system, with unprecedented ease. Given the implication of microglial dysfunction in neurodegenerative disorders and brain cancers, this advancement opens avenues for a more profound understanding of their functions and potential therapeutic interventions.
Cholesterol oxidase finds extensive applications in diagnostics, as well as in the food and agriculture industries. It serves as a valuable diagnostic tool for detecting serum cholesterol levels and plays a pivotal role as a biocatalyst in steroid production. The thermal stability of this enzyme is crucial for its efficacy. In a recent study, epPCR was employed as a tool to engineer a mutant with three amino acid substitutions, resulting in enhanced thermal stability [155].

Advanced Molecular Biology and Protein Engineering Tools
In genome editing and targeting, the landscape has been revolutionized by the emergence of CRISPR-Cas9. The widely utilized Cas9 nuclease, originating from Streptococcus pyogenes, is limited by its substantial size, which constrains its delivery into cells. A smaller alternative, the Cas9 orthologue from Campylobacter jejuni (CjCas9), presents an appealing solution [156]. Nevertheless, the intricate PAM sequence recognized by CjCas9 imposes limitations on its versatility. Through successive rounds of directed evolution, a variant of CjCas9 named evoCjCas9 was obtained, boasting an altered PAM recognition sequence that is ten times more prevalent in the genome than the canonical PAM sequence [157]. Beyond this, evoCjCas9 demonstrates increased nuclease activity compared to its wildtype counterpart, thereby broadening the horizons of CRISPR-Cas9 technology. Throughout this review, CRISPR-Cas9 has been consistently emphasized for its role in cloning and genetic diversity creation. It is truly inspiring to witness the 'ripple effect' where proteins crafted through protein engineering contribute to the continual expansion of the capabilities of protein engineering itself.

Emerging Trends and Prospective Trajectories
The convergence of automation and the emergence of artificial intelligence (AI) are poised to bring about a transformative shift in the domain of directed evolution, promising substantial improvements in the scale, efficiency, and precision of experiments. Automation stands to simplify labour-intensive processes like library construction, screening, and the identification of desired variants. AI tools, coupled with the capability of analysing extensive datasets, have the potential to steer experimental and variant design, expediting the discovery process.
Furthermore, we foresee that the effectiveness of directed evolution will be significantly amplified by synergistic integration with other well-established methodologies or tools. Specifically, NMR spectroscopy has been shown to enhance the capabilities of directed evolution [158]. Using inhibitors and identifying chemical shift perturbations through NMR spectroscopy allows for the identification of regions in a protein that are likely to yield beneficial mutations. This approach has proven successful in converting myoglobin into a highly efficient Kemp eliminase with just three amino acid substitutions.
Embarking on the era of transformation brought about by directed evolution necessitates the unlocking of untapped biological potential. At the heart of this unlocking process lies the art and science of creating genetic diversity. It is in the diverse tapestry of our genetic landscape that the blueprint for the future of the bioeconomy is etched. Through the strategic exploration and manipulation of genetic diversity, we forge a path into an era where the frontiers of bioengineering unfold boundlessly, promising novel discoveries and innovations that will shape the landscape of tomorrow.


Figure 2. Illustration of the experimental procedure of CReasPy-cloning. Initially, a yeast is transformed with two plasmids, pCas9 and pgRNA, enabling the expression of the Cas9 nuclease and a guide RNA (gRNA). These plasmids carry the TRP1 and URA3 selection markers, respectively. Following this, the yeast undergoes simultaneous transformation with the target genome to be cloned and a linear DNA fragment containing yeast elements (CEN-HIS3, with or without ARS). The linear DNA fragment has recombination arms homologous to each side of the target locus. Upon entry into the cell, the Cas9/gRNA complex cleaves the target genome, and the yeast homologous recombination system repairs it using the provided linear DNA fragment as a template. Consequently, the bacterial genome incorporates the yeast elements precisely at the designated locus and is now carried by the yeast as an artificial chromosome.


Figure 3. An overview of the PTO-QuickStep method. Initially, megaprimers (coloured in blue) are generated in a PCR using a set of PTO oligonucleotides containing phosphorothioate linkages (indicated with letters 'P'). Subsequently, iodoethanol treatment is applied to the megaprimers, breaking the phosphorothioate linkages and exposing 3′-overhangs. In the second step, these treated megaprimers anneal to the destination or recipient vector at the target locus, initiating the amplification of the entire plasmid. Moving to the third step, DpnI is employed to remove the methylated or

SynBio 2024, 2, 12

Figure 4. A schematic representation of the random mutagenesis method tailored for Pichia pastoris. Initially, the circular protein expression vector undergoes repeated amplification through strand displacement reactions facilitated by Phi29 DNA polymerase. Mutations are intentionally introduced by adding Mn2+ to lower the fidelity of the polymerase, a process known as error-prone rolling circle amplification (epRCA). Following this, subsequent amplification, achieved through Phi29 DNA polymerase (or multiple displacement amplification, MDA), yields microgram quantities of mutated DNA. This mutated DNA is then utilized for transformation into P. pastoris to enable enzyme production.


Figure 5. Graphic summary of the MutaT7 mutagenesis system and its derivatives. The T7 RNA polymerase fusion (T7RNAP) selectively binds to the T7 promoter, initiating transcription and traversing the gene of interest. As the fusion carries a base editor (BE), mutations are randomly introduced into the gene, represented by blue vertical stripes. The fusion halts and disengages from the DNA upon encountering a dCas9 molecule bound to a specific sequence dictated by the CRISPR RNA (crRNA). The termination process is also facilitated by the transactivating CRISPR RNA (tracrRNA). In the absence of dCas9, the movement of the fusion protein can be halted by incorporating one or multiple T7 terminators.


Figure 7. Schematic representation of the VEGAS platform for directed evolution, a technique for engineering DNA sequences in mammalian cells. This approach relies on the use of the Sindbis virus for efficient and mutagenic viral propagation in mammalian cell culture. To establish a robust directed evolution platform that harnesses the replicative and mutagenic potential of the Sindbis virus, artificial selective pressure must be applied. A crucial aspect involves the requirement for 240 copies of each structural protein (E1, E2, and capsid) in each Sindbis viral particle to form a functional unit capable of maturation and propagation. Without this envelope, the virus cannot mature or propagate. By strategically introducing limitations on the transcription of the structural genome, selective pressure can be applied to the transgenic Sindbis virus. In the VEGAS platform, the structural genome of the Sindbis virus is cloned into the mammalian expression vector pSSG, under the regulation of the tetracycline operator sequence. The structural genome elements of the Sindbis genome are then replaced with a transgene encoding a tetracycline transactivator. Propagation and selection can then be performed in mammalian cell culture, by infecting cells transfected with pSSG with the pTSin packaged virus.



Figure 8. Schematic representation of the TRIAD process for generating deletion libraries. In the first step, the TransDel insertion library is formed through in vitro transposition of the engineered transposon TransDel into the target sequence on circular plasmid DNA. In the second step, MlyI digestion is applied to eliminate TransDel along with 3 base pairs of the original target sequence, creating a single break per variant. The third step involves self-ligation, leading to the reconstitution of the target sequence minus 3 base pairs. This results in a library of single variants, each featuring a deletion of one triplet. Alternatively, DNA cassette Dels can be inserted between the breaks in the target sequence to produce insertion libraries. MlyI digestion removes the DNA cassette Del along with 3 or 6 additional base pairs of the original target sequence, depending on the used DNA cassette. Subsequent self-ligation reestablishes the target sequence, now with a deletion of 2 or 3 triplets. This versatile approach also allows for the creation of insertion libraries.

Figure 9. Multi-site-directed mutagenesis methods. (A) Nicking mutagenesis: This process begins with the wildtype (WT) plasmid dsDNA containing a 7-base-pair BbvCI recognition site, which is selectively nicked by Nt.BbvCI. The resulting nicked strand undergoes degradation by Exonuclease III (ExoIII), creating a single-stranded DNA (ssDNA) template. To eliminate insufficiently digested

Figure 10. Schematic representation of the In vitro CRISPR/Cas9-mediated Mutagenic (ICM) system for site-directed mutagenesis: The target plasmid undergoes initial cleavage by the Cas9 protein and specific sgRNA complex at both sides of the mutational position, removing the wildtype sequence. Subsequent digestion with T5 exonuclease generates 15 nt sticky ends. Primers carrying the desired mutation are then annealed to create DNA fragments with 15 nt sticky ends that complement the digested plasmid. These fragments are inserted into the linearized vector through transformation into the host cell.


Table 1. In vivo cloning methods developed in the past decade (HR, homologous recombination; NHEJ, non-homologous end joining).

Table 2. In vitro cloning methods developed in the past decade.

Table 3. Modular DNA assembly methods developed in the past 10 years.
* For clarity purposes, a gene cloned into a linearized vector is considered a 2-part assembly.

Table 4. Automated DNA assembly methods reported in the past decade.

Table 5. Transformation efficiency of commonly used microorganisms in biomanufacturing and protein evolution.

Table 6. Targeted and non-targeted (global) random base-editing methods reported in the past 10 years.

Table 7. Multi-site-directed mutagenesis methods reported in the last decade.