Challenges in Expression and Purification of Functional Fab Fragments in E. coli: Current Strategies and Perspectives

Abstract: Microbial host systems remain the most efficient and cost-effective chassis for biotherapeutics production. Escherichia coli is often the preferred host due to ease of cloning, scale-up, high product yields, and, most importantly, cost-effective cultivation. However, E. coli often experiences difficulties in producing biologically active therapeutics such as Fab fragments, which require protein folding and subsequent three-dimensional structure development. This paper outlines the recent improvements in upstream and downstream unit operations for producing Fab fragments in E. coli. Monoclonal antibody fragments (Fab) are a rising class of biotherapeutics, and their production has been optimised using coexpression of molecular chaperones such as DsbC or DnaK–DnaJ–GrpE, as well as strain engineering for post-translational modifications such as disulphide bridging. Different media systems such as EnBase, and the combination of nitrogen source supplementation with low-temperature cultivation, have improved cell integrity, protein expression, and protein refolding. The recovery of native proteins from insoluble inclusion bodies can be improved by adjusting refolding conditions, as well as by incorporating multimodal and affinity chromatography to achieve high product yields in purification. Recent developments summarised in this review may tune the E. coli expression system to produce more complex and glycosylated proteins for therapeutic use in the near future.


Introduction
Antibody fragments such as Fab and scFv are small fragments that retain the parent antibody's antigen-binding ability [1]. The Fab molecule has several advantages over a full antibody, the most important of which are its ease of penetration and clearance. As these molecules are not typically glycosylated, the bacterial expression system provides a compelling, cost-effective production system [2].
The widespread use of the bacterial host system can be attributed to its several advantages over other options (such as mammalian cell culture). Fast growth, a well-known genome, and simple gene manipulation are some of these advantages [3]. In contrast, the mammalian expression system has a higher risk of contamination, higher costs for the media and subsequent processes, and lower cell line stability [4]. Disulphide bond formation is, however, limited in prokaryotic expression systems. As a result, protein misfolding and aggregation lead to the formation of insoluble inclusion bodies, which, together with a high chance of mutations, codon bias, inadequate translation, and protein degradation, are some of the major limitations of prokaryotic expression systems [5]. Significant advancements in the last few decades have made it possible to overcome these challenges and achieve sufficient expression and proper folding.
Many recent advances in strain engineering and cloning have increased the suitability of microbial hosts for biotherapeutic production. Bacterial cytoplasm has a reducing

Major Developments in Upstream Processing of Fab Fragments in E. coli
Production of a Fab fragment in a bacterial host is an attractive option due to the above-mentioned advantages. Recent advances in cloning technologies and host cell engineering along with process development have resulted in a significant improvement in the expression of Fab molecules in E. coli.

Expression and Localisation of Fab Fragment
Production of Fab fragments in E. coli typically presents multiple challenges, including low expression, poor yield of properly folded Fab fragments, and unequal expression of heavy and light chains in the case of bicistronic expression. Oxidising conditions that allow disulphide bond formation are critical for the soluble expression of Fab fragments. This is achieved by directing the expressed molecule to the periplasmic space, which provides the oxidising environment required for disulphide bond formation. Inefficient translocation is one of the major hurdles in periplasmic expression, and different Sec pathways and codon modulation in signal sequences have been tried to increase the expression of properly folded soluble fractions. The pertinent cytoplasmic and periplasmic pathways are illustrated in Figure 1. A PelB signal sequence library was created by codon modulation, and its effect on Fab expression has been reported [13]. Further, the fifth leucine position of the light chain PelB has been proposed to be significant for Fab fragment expression. Fab leakage has been observed in the case of OmpA-signalled Fab, and researchers have used different signal sequences to replace the heavy and light chain OmpA [14]. Further, DsbA has been used to prevent leakage and to re-route the secretory pathway to rescue the poor performance, and this strategy demonstrated a significant improvement in Fab expression. In a recent development, it was observed that when the OmpA of both chains is replaced with DsbA, uncleaved soluble light chains occur in both plasmid-based and genome-integrated systems, with a greater amount in the plasmid-based system. Based on these observations, the use of different signal sequences for heavy and light chains can increase periplasmic translocation and cleavage efficiency and can prevent overburdening of the signal translocation pathway [15].
Figure 1. (A) Cytoplasmic targeting factors such as SecB help control protein targeting and folding. The SecYEG channel and the translocation ATPase SecA use ATP as a driving force to push the protein through the membrane. The transmembrane proton gradient may also power translocation. Periplasmic protein folding is mediated by the oxidoreductases DsbA and DsbC and by chaperones, which give the translocated protein its final active and protease-resistant conformation. Unlike the Sec pathway, the Tat pathway transports fully folded, cofactor-containing proteins. TatA, TatB, and TatC may make up the Tat translocase. The transmembrane proton-motive force drives protein transport via Tat; (B) cytoplasmic disulphide bond formation: the thioredoxin (TrxB-dependent) and glutaredoxin (Gor-dependent) pathways are required for reducing cytoplasmic protein disulphide bonds. While one relies on glutathione reductase and the tripeptide glutathione, the other relies on thioredoxin reductase and one of the two thioredoxin family members. They both use NADPH. Glutaredoxins (Grx1, Grx2, Grx3) and thioredoxins catalyse the reversible oxidation-reduction of protein disulphide groups. Cytoplasmic DsbC isomerises mis-oxidised proteins to their native, correctly folded state. (Image created in BioRender.com).
While the cytoplasm is more reducing and thus less favourable for protein folding, it has many advantages over the periplasmic space. To capitalise on these, modified host cell mutants that lack the thioredoxin reductase and glutathione reductase genes have been developed to provide an oxidative environment. Small ubiquitin-related modifier (SUMO) tag technology has been used, in which SUMO-fused heavy and light chains are expressed to enhance protein expression, stability, and solubility in the cytoplasmic space [1]. Higher expression of heavy and light chains was observed in the soluble fraction for SUMO-tagged molecules, compared with untagged molecules. However, the increased expression was limited to the SHuffle strain; expression remained unaffected in the BL21 strain.
Extracellular expression offers multiple advantages, including reduced proteolysis and easier purification, and thereby reduced time and cost. However, the lack of secretory pathways in E. coli makes extracellular expression less efficient. Very few pathways, such as the haemolysin and pullulanase pathways, are used for extracellular protein production. Researchers have developed a novel system that uses the phoA promoter and the STII signal sequence for extracellular Fab production [16]. A study carried out on five Fab molecules indicated that secretion efficiency was higher with the phoA promoter than with the T7 promoter, and that the STII signal sequence showed higher secretion ability than PelB. The proteins produced were of high purity, integrity, and bioactivity.

Co-Expression of Soluble Expression Partners
Coexpression of molecular chaperones with Fab fragments, which enhances the solubility and proper folding of antibody fragments and thereby limits insoluble aggregate formation, is being widely studied as an effective approach to overcome protein loss and reduce process time at the refolding step in the case of inclusion bodies.
The expression of periplasmic chaperones such as DsbC exhibits a synergistic effect on protein folding and disulphide bond formation. In the cytoplasm, coexpression of DnaK-DnaJ-GrpE chaperones facilitates protein folding and transport across the membrane (Figure 1). DnaK-DnaJ-GrpE chaperones coexpressed with anti-TNF-α Fab in the BL21 host system resulted in a marked increase in periplasmic translocation and active soluble expression [17]. DsbC has a positive effect on the solubility of proteins expressed in the cytoplasm but not on proteins targeted to the periplasmic space. In another study, coexpression of the DsbC chaperone in periplasmic protease-deficient strains (for example, MXE001), evaluated alongside wild-type W3110, resulted in a high periplasmic Fab yield and increased cell viability for four different types of Fab. A twofold yield increase for three Fab molecules (average 1.1 g/L to 2.25 g/L) and a fivefold increase for one Fab molecule (average 0.48 g/L to 2.6 g/L) were observed, as compared with wild type. Additionally, an increase in cell viability up to 40 h post-induction was observed in these modified strains (from around 80 OD to 105 OD) [18]. Since only a few pathways transfer folded or partially folded proteins to the periplasmic space, researchers have examined the effect of coexpressing different sets of cytoplasmic chaperones on the yield of Fab fragments and observed that DnaKJE (DnaK-DnaJ-GrpE) had a positive effect on solubilising recombinant proteins expressed in the cytoplasm but had no effect on the functionality of the protein [6,17]. In the same study, it was reported that GroES-GroEL did not improve solubility or functionality, and it was concluded that GroES-GroEL chaperones are likely more effective for cytoplasmic expression than for periplasmic expression.
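The fold increases quoted above can be checked with a short calculation. The following is a minimal sketch in Python; the helper name fold_change is introduced here for illustration only, and the inputs are the average titres quoted in the text.

```python
def fold_change(before: float, after: float) -> float:
    """Return after / before, i.e. how many times larger the new titre is."""
    if before <= 0:
        raise ValueError("baseline titre must be positive")
    return after / before


# Average periplasmic Fab titres quoted in the text (g/L).
three_fabs = fold_change(1.1, 2.25)   # group described as a "twofold" increase
fourth_fab = fold_change(0.48, 2.6)   # molecule described as a "fivefold" increase

print(f"three Fabs: {three_fabs:.2f}-fold, fourth Fab: {fourth_fab:.2f}-fold")
```

Both ratios are consistent with the rounded "twofold" and "fivefold" descriptions, with the second being slightly above five.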
In summary, the coexpression of chaperones and their effect on protein folding, solubility, and translocation may vary depending on Fab molecule, strain type, and protein localisation.

Strain Development and Protein Expression
Proteolytic degradation is the primary concern when expressing heterologous proteins in prokaryotic host systems. Recent advances in strain engineering have enabled us to improve the tolerance and applicability of host cells for the expression of heterologous proteins.
In the case of periplasmic expression of Fab fragments, researchers have developed strains deficient in the periplasmic proteases Tsp, protease III, and DegP, and an increase in protein expression was observed in the Tsp-deficient strain. Coexpression of DsbC in protease-deficient strains increased yield and also restored cell viability; a final yield of 2.4 g/L was achieved, compared with 1 g/L in the wild-type strain [18]. This indicated the likely involvement of Tsp in the degradation of Fab in the periplasmic space; the effect was tested in wild-type W3110 and MXE001 strains and showed consistent results in both cases.
The CyDisCo system, which is based on the coexpression of Erv1p, DsbB, or VKOR, and either DsbC or PDI, has been developed for the production of disulphide bond-containing proteins in E. coli. Researchers have studied the efficiency of the CyDisCo system for expressing Fab and scFv in E. coli, and it has been reported that the system is able to generate high yields of folded, biologically active antibody fragments in the cytoplasm of E. coli with a success rate of more than 90%. In one study, four Fabs were used, and a nearly 20-fold increase in yield was observed. In addition, 42 mg/L of folded, purified Fab (Maa48) was obtained in a shake flask study, and the overall average Fab level was 23 mg/L [19].
Genomic integration has been attempted by site-directed integration of the gene of interest into the host cell genome [15]. With the yield of the plasmid-based counterparts taken as 100%, the genome-integrated systems gave yields ranging from 80% to 300%. Soluble specific Fab yields ranged from 0.01 mg to 7.4 mg Fab per gram of cell dry mass. This study also indicated that the genome-based system is a better option for expressing proteins translocated to the periplasm. Genomic integration lowers the concentration of mRNA, which prevents the overburdening of the translation, translocation, and folding machinery. It can also serve as a plasmid-free alternative that does not require antibiotic selection pressure.
As mentioned above, plasmid-based expression of recombinant protein imposes an excessive burden on the host cell machinery, and to overcome this, researchers have studied promoter engineering. In one study, a strong Ptac promoter was converted to a weak Ptic promoter by inserting two base pairs between the −10 and −35 regions to increase the gap between the two sites [20]. About a 35% decrease in expression was observed with the Ptic promoter, together with reduced intracellular accumulation, reduced leakage of Fab into the external environment, and increased cell viability. In this case, Fab expression decreased, compared with wild type, from 0.32 to 0.23 mg Fab/g cell weight/h; additionally, the death rate was reduced from 18.1% to 11.6% at the end of fermentation. This indicated that the engineered plasmid is effective for periplasmic Fab production.
A leucine zipper-fused Fab, Zipbody, has been constructed to enhance the rate of heterodimer formation [21]. In this study, an artificially designed pair of leucine zipper peptides, LZA and LZB, and the most widely studied leucine zipper pair, c-Jun and c-Fos, were assembled onto the C-termini of Fab, and it was observed that the leucine zipper increased the rate of disulphide bond formation and of association of the heavy and light chains to form the heterodimer. Thus, different strain engineering and promoter engineering strategies have been developed to increase the soluble expression and yield of Fab fragments.

Fermentation Process and Media Development
After gene level modification and strain engineering, process development primarily focuses on media components and cultivation parameters. The former entails examining suitable combinations of carbon source, nitrogen source, and complex rich media with different combinations of yeast extract, peptone, and tryptone. Researchers have also optimised different process parameters such as pH, agitation, temperature, cell density, the concentration of inducers, induction time, and OD.
The choice of a suitable carbon source and its concentration during the different stages of expression is the main factor affecting cell growth and protein expression. Excessive glucose supplementation leads to higher acetate production, which affects cell growth and product yield. The EnBase system, in which glucose is gradually released into the medium by enzymatic degradation of a glucose polymer, was applied for the expression of Fab molecules, and a better protein yield and ratio of soluble to total protein were achieved in E. coli BL21 [22]. Higher biomass production and lower acetate accumulation resulted in improved recombinant protein expression. Similar observations have been reported by other researchers, in studies in which the use of the EnBase system resulted in a threefold increase in biomass, compared with LB, in both BL21 and TSHuffle host systems [1]. While a 19% increase in protein expression was reported in the TSHuffle strain, the difference was minor in the case of the BL21 strain. The highest yield achieved in this cultivation mode was 12 mg/L.
While low temperature is favoured for the expression of heterologous proteins, it is associated with a reduced growth rate and lower cell density, which often results in an increased burden of time and cost. Researchers have studied the effect of temperature and examined the synergistic effect of low temperature combined with nitrogen supplementation to overcome the lower growth rate [23]. In one study, it was confirmed that a lower temperature is important for soluble Fab production, whereas a higher aggregation rate was observed at a higher temperature. Using nitrogen supplementation, productivity increased by 60% (from 332 mg/L to 529 mg/L active form expression) at 25 °C, compared with 37 °C.
The stress minimisation approach has been explored by researchers, through which metabolic stress on host cells was reduced by decreasing the culture temperature or the inducer concentration [10]. This resulted in improved cell integrity, protein expression, and protein refolding.

Refolding
Recombinant antibody fragments such as Fab molecules, when produced in E. coli in the form of misfolded aggregates called inclusion bodies (IBs), need to be refolded into a functionally active protein (Figure 2). Refolding is often a rate-limiting step and a major challenge in the production of such Fab molecules owing to the shuffling of disulphide bonds. These multidomain proteins carry intra- and interchain disulphide bonds, which makes them highly complex molecules, such that refolding becomes the determining step for cost-effective and time-effective production of these biotherapeutics. Various events occur during the refolding of these multidomain proteins, including domain association, structural changes based on hydrophobicity, bond formation involving hydrogen or electrostatic interaction, and disulphide bond formation [12]. The association of light and heavy chains via disulphide bridges is highly inefficient in E. coli due to the large number of combinations that can occur in making and breaking the disulphide bonds [24,25]. Refolding efficiency is further constrained by the high propensity for aggregation: aggregation follows higher-order reaction kinetics, whereas refolding is a first-order reaction [26]. Thus, a deeper understanding of molecular behaviour is required to enhance the refolding of these multidomain proteins.
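The competition between first-order refolding and higher-order aggregation can be illustrated with a simple kinetic sketch. The model below is an assumption made for illustration and is not taken from the cited studies: unfolded protein U either refolds (rate k_fold·U) or aggregates (rate k_agg·U², taken here as second order). The function name, rate constants, and concentrations are hypothetical.

```python
import math


def refolding_yield(u0: float, k_fold: float, k_agg: float) -> float:
    """Fraction of initially unfolded protein that ends up native.

    Assumed model (illustrative only):
        dU/dt = -k_fold * U - k_agg * U**2   (refolding + aggregation)
        dN/dt =  k_fold * U                  (native protein formed)
    Integrating dN/dU from U = u0 down to U = 0 gives
        yield = ln(1 + x) / x,  with  x = k_agg * u0 / k_fold.
    """
    if u0 <= 0 or k_fold <= 0 or k_agg < 0:
        raise ValueError("u0 and k_fold must be positive, k_agg non-negative")
    if k_agg == 0:
        return 1.0  # no aggregation pathway: everything refolds
    x = k_agg * u0 / k_fold
    return math.log1p(x) / x


# Hypothetical rate constants; only the trend with concentration matters here.
for u0 in (0.01, 0.1, 1.0):
    y = refolding_yield(u0, k_fold=1e-3, k_agg=1e-2)
    print(f"initial unfolded concentration {u0}: yield {y:.3f}")
```

Under these assumptions the yield falls as the initial protein concentration rises, which is one way to rationalise why refolding is commonly carried out at high dilution.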
Researchers have studied the in vitro unfolding of Fab molecules by applying two- and three-state thermodynamic models, followed by DoE-based optimisation [12]. The refolding kinetics were examined, and it was surmised that refolding follows a three-state mechanism, wherein the formation of intermediates by light and heavy chains is the rate-limiting step. For renaturing of proteins, redox agents are often employed, wherein the reduced and oxidised forms are added in an optimum ratio to favour the refolding process. Such redox systems include the glutathione system (GSH/GSSG), the cysteine/cystine system, and the DTT/GSSG system, used in a molar ratio range of 1:1 to 10:1 [27]. To enhance refolding yield, various additives are added to the refolding buffer. These additives can act on the different conformations that the protein acquires during refolding, for example by stabilising the native state, increasing the reaction kinetics towards correct folding, or suppressing aggregation during the formation of intermediates. Some of the commonly used additives are sugars, polyethylene glycol (PEG), urea, acetone, dimethyl sulphoxide (DMSO), and amino acids such as arginine [28,29].
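As a small worked example of the molar ratios mentioned above, the following sketch splits a chosen total redox concentration into its reduced and oxidised components. The function name and the total of 11 mM are hypothetical and chosen only to make the arithmetic easy to follow; they are not values from the cited work.

```python
from typing import Tuple


def split_redox_pair(total_mM: float, reduced_to_oxidised: float) -> Tuple[float, float]:
    """Split a total redox concentration into (reduced, oxidised) parts.

    reduced_to_oxidised is the molar ratio, e.g. 10.0 for a 10:1 pair.
    """
    if total_mM <= 0 or reduced_to_oxidised <= 0:
        raise ValueError("total and ratio must be positive")
    oxidised = total_mM / (reduced_to_oxidised + 1.0)
    reduced = total_mM - oxidised
    return reduced, oxidised


# Hypothetical 11 mM total at the two ends of the 1:1 to 10:1 range.
for ratio in (1.0, 10.0):
    reduced, oxidised = split_redox_pair(11.0, ratio)
    print(f"ratio {ratio:g}:1 -> reduced {reduced:.2f} mM, oxidised {oxidised:.2f} mM")
```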
Beyond E. coli, refolding has also been a limitation in yeast (Pichia pastoris) [30] and insect cell-based expression systems [31,32]. Even though mammalian-based expression systems can produce a correctly folded biotherapeutic, the cost of production increases considerably. Therefore, there is a need to address the challenges posed by the refolding unit operation for the efficient production of these recombinant biotherapeutic proteins.
A variety of approaches have been suggested by researchers for the refolding of Fab biotherapeutics. These approaches involve techniques such as drip dilution, dialysis, and on-column refolding (Figure 3). To enhance the refolding efficiency, the solubilisation and unfolding behaviour of Fab molecules need to be studied. A group of scientists determined the unfolding events using nano-differential scanning fluorimetry (nano-DSF) [12]. The unfolding of proteins usually follows a two-state model, which has native and unfolded forms. In three-state models, there is also an intermediate form along with the native and unfolded forms. Understanding these forms can help improve refolding efficiency.
In addition, other parameters also determine refolding, such as the concentration of chaotropes, the pH of the refolding buffer, the temperature during refolding, and the ratio of redox reagents, while the presence of refolding enhancers such as arginine hydrochloride (ArgHCl) also influences the overall refolding efficiency. Dilution-based in vitro refolding of a Fab molecule has been optimised using the DoE approach, and a refolding yield of 56% was achieved in 120 h by appropriate dilution in the presence of redox reagents. A recently filed patent covers a process for the production of the Fab molecule ranibizumab, in which the protein was refolded by diluting solubilised IBs into refolding buffer, with pH and temperature shifts performed at suitable time intervals to attain the refolded product, giving a refolding yield of 13.6% in 22 h [33].
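To make the DoE idea concrete, the sketch below enumerates a full factorial screening design over the refolding parameters listed above. The factor names and levels are hypothetical placeholders, and the design used in the cited study is not described here; in practice a fractional factorial or response-surface design may be preferred to limit the number of runs.

```python
from itertools import product

# Hypothetical factors and levels for a refolding screen (illustrative only).
factors = {
    "urea_M": (0.5, 1.0, 2.0),     # residual chaotrope concentration
    "pH": (8.0, 9.0),              # refolding buffer pH
    "temperature_C": (4, 25),      # refolding temperature
    "gsh_to_gssg": (1, 10),        # molar ratio of the redox pair
    "arg_hcl_M": (0.0, 0.5),       # refolding enhancer concentration
}


def full_factorial(factor_levels):
    """Return one dict per run, covering every combination of levels."""
    names = list(factor_levels)
    return [dict(zip(names, combo)) for combo in product(*factor_levels.values())]


design = full_factorial(factors)
print(f"{len(design)} runs; first run: {design[0]}")
```

Each dictionary in design corresponds to one refolding condition to be tested, with the measured refolding yield recorded as the response.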
A matrix-associated refolding method has also been explored, in which the solubilised IBs are loaded onto an affinity column and refolded while bound. Here, the chaotropic agents are eliminated by applying a linear gradient of refolding buffer. The refolded antibody fragments are then eluted and analysed by SDS-PAGE. The researchers compared the matrix-associated refolding strategy with the drip dilution method and found that, in the absence of a redox pair, the former generated a greater amount of structurally folded and active fragments [34]. Further, matrix-associated refolding has been explored with different molecules. For example, a group of researchers performed refolding of a recombinant human growth hormone fused with glutathione S-transferase. This solid-phase refolding utilises expanded-bed adsorption chromatography, wherein unwanted particulates and denaturants are removed while the target unfolded protein binds to the column. The results demonstrated an enhanced refolding yield of around 80%, which is significantly higher than that obtained with the conventional dilution method, along with reduced aggregation, to which solution-phase refolding is usually highly prone [35].
In another study, refolding of Fab molecules by the dialysis method was performed along with the screening of various additives to enhance the refolding yield, and taurine was found to be the most effective additive to attain 60 mg of refolded human Fab from a 1 L culture harvest [36]. Other researchers isolated the IBs of light and heavy chains, and the solubilised IBs were refolded by following a dialysis approach, reducing the concentration of chaotropic agent sequentially in three stages [37].
Many research groups have explored the refolding of Fab molecules to enhance the refolding yield, but understanding the molecular behaviour during refolding needs to be explored further to enhance the refolding efficiency. The chemistry behind the intermediates that form during refolding needs to be studied to move the reaction in the forward direction towards the folded protein.

Purification of Fab Molecules
The refolded Fab solution contains a large amount of product-related impurities (residual light and heavy chains, intermediates, and aggregates) and process-related impurities (host cell proteins, HCPs, and host cell DNA, HcDNA) that must be removed in order to make a purified drug substance. An extensive study is required for the purification of Fab fragments, as the conventional protein A chromatography performed for mAbs is not applicable due to the lack of an Fc region in Fab molecules (Figure 4).
Affinity resins such as protein L and protein G are available for the purification of Fab molecules [38–42]. Traditional purification schemes involve the use of ion exchange and multimodal chromatography after affinity chromatography [40,41,43,44]. Various chromatography techniques have been employed by researchers to purify Fab fragments. The purification of Fab molecules has been demonstrated using two multimodal chromatography resins, and the resulting product had a purity of 99.50% with an overall process yield of 32.55% [45]. In one study, researchers performed IgG digestion and isolated Fab fragments using protein A and protein L affinity resins along with ultrafiltration/diafiltration to remove small molecular weight impurities. The overall yield for Fab isolation was approximately 50%. This was followed by resin screening of cation exchange and multimodal resins in a microfluidic device. Their findings suggest that multimodal resins are most suitable for the purification of Fab molecules due to their high salt tolerance and efficient binding of Fab molecules. Furthermore, the microfluidic strategy allows for rapid initial screening of various parameters for designing a purification process. It helps in determining the salt concentration, pH, and appropriate resin rapidly and at lower cost [40]. Another approach that has been attempted involves the use of cation exchange chromatography (SP Toyopearl resin) to purify the refolded Fab molecule [36]. An apo-B-100-coupled antigen affinity column, which offered high selectivity for the Fab molecule, was successfully used to purify a refolded recombinant Fab specific for apolipoprotein B-100, yielding ~3 mg of purified protein from 1 L of E. coli harvest [37].
An affinity column carrying coupled apolipoprotein B-100 (apo-B-100) antigen was successfully used to purify a refolded recombinant Fab molecule specific for apolipoprotein B-100. The antigen column offered high selectivity for the Fab molecule and yielded ~3 mg of purified protein from 1 L of E. coli harvest [37]. Recently, a study was performed to extract and characterise the Fab molecule and its associated impurities. A three-dimensional chromatography method was designed, involving two steps of affinity chromatography (protein G and protein L), followed by cation exchange chromatography and subsequent analysis by mass spectrometry [11]. This 3D approach offers rapid optimisation of the preparative purification process. In this study, Fab molecules were purified from crude E. coli samples that contained Fab along with other fragments. The system was automated such that the eluate from affinity chromatography was loaded directly onto high-resolution cation exchange resins via loops, forming a three-dimensional chromatography system that generated a purified sample. Thus, an automated process was designed that both separated the target Fab molecule and characterised its associated impurities, which were found to be mainly a fragment of the light chain.
Another group addressed a challenge associated with the affinity purification of Fab molecules. Protein L resins bind only some subclasses of the variable region of the kappa light chain (VLκ1, VLκ3, and VLκ4) and do not bind VLκ2. The authors presented a camelid-based affinity resin that binds the constant region of the lambda light chain. This ensured 100% selectivity for the lambda light chain, thus bridging the gap between the purification of various Fab molecules. This affinity chromatography medium, lambda FabSelect, was developed to be as effective for Fab molecules as protein A is for mAbs [46].
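The binding rules described above can be summarised as a simple lookup. The sketch below is only an illustrative simplification of the statements in this section (protein L binding to kappa subgroups 1, 3, and 4 but not 2, and lambda FabSelect binding to the lambda constant region); the function and variable names are hypothetical, and any real resin choice would still need experimental screening.

```python
# Kappa variable-region subgroups that protein L is described as binding.
PROTEIN_L_BINDING_KAPPA_SUBGROUPS = {1, 3, 4}


def candidate_affinity_resins(light_chain, kappa_subgroup=None):
    """Suggest affinity resins from the light-chain type of a Fab.

    light_chain: "kappa" or "lambda"
    kappa_subgroup: integer subgroup of the kappa variable region, if known
    Returns a list of resin names; an empty list means that neither resin
    discussed here is expected to bind.
    """
    chain = light_chain.strip().lower()
    if chain == "lambda":
        # The resin binds the lambda constant region, so the variable
        # subgroup does not matter for this rule.
        return ["lambda FabSelect"]
    if chain == "kappa":
        if kappa_subgroup in PROTEIN_L_BINDING_KAPPA_SUBGROUPS:
            return ["protein L"]
        # For example subgroup 2, or an unknown subgroup.
        return []
    raise ValueError(f"unknown light chain type: {light_chain!r}")


for chain, subgroup in [("kappa", 1), ("kappa", 2), ("lambda", None)]:
    print(chain, subgroup, candidate_affinity_resins(chain, subgroup))
```

An empty result for a kappa subgroup 2 Fab reflects the gap in protein L coverage noted above, where a non-affinity capture step or a different ligand would have to be evaluated.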
Thus, various groups have explored different strategies for refolding and purifying these multidomain antibody fragments. The refolding pathways of antibody fragments produced in E. coli still need to be explored; a better understanding of them could help generate larger amounts of refolded protein and thus increase yield and productivity.

Remaining Challenges in Production of Fab Fragments in E. coli
The E. coli expression system is preferred for the production of Fab fragments owing to its many advantages, as discussed in previous sections. However, industrial production of biopharmaceuticals poses several challenges that remain to be solved. Production of Fab fragments or other high molecular weight proteins imposes a physiological burden and toxicity on the host cell, which can adversely affect cellular physiology. Reduced metabolic burden, a stable host, and process predictability are critical for achieving higher yield and stable expression. Achieving these improvements requires changes at both the genetic and process levels.
The generation of genetically stable clones is critical for consistent product formation. For the production of larger protein molecules such as Fab, which impose a metabolic burden on the cell, the selection of an appropriate vector and a promoter of moderate strength that remains stable is key to success. Promoter engineering can also be used to tailor the promoter to the required strength. Continuous culturing of E. coli requires a stable and consistent clone, either plasmid-based or genome-integrated. Long-term stability, no plasmid loss, and the ability to create an antibiotic-free system are all advantages of genome-integrated systems. However, they also have significant drawbacks, such as lower product titre, which must be compensated for by site-specific integration, by modifying or optimising fermentation conditions, or by continuous production. The design-of-experiments approach is increasingly being used to explore multiparameter experimental space and to understand the interactions between different parameters and their effect on yield and other responses.
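As a minimal sketch of how a design-of-experiments study enumerates a multiparameter space, the following builds a full factorial design, in which every level of every factor is combined with every level of the others. The factor names and levels are hypothetical placeholders chosen for illustration and are not taken from any cited study; in practice, fractional or response-surface designs are often used to reduce the number of runs.

```python
from itertools import product

# Hypothetical factors and levels, for illustration only.
factors = {
    "temperature_C": [25, 30, 37],
    "inducer_mM": [0.1, 1.0],
    "nitrogen_supplement": [False, True],
}


def full_factorial(factor_levels):
    """Return one dict per run covering every combination of levels."""
    names = list(factor_levels)
    level_lists = [factor_levels[name] for name in names]
    return [dict(zip(names, combo)) for combo in product(*level_lists)]


runs = full_factorial(factors)
print(f"{len(runs)} runs in the full factorial design")
for run_number, run in enumerate(runs, start=1):
    print(run_number, run)
```

Because each run records the full combination of settings, the measured responses (for example titre or soluble fraction) can later be fitted with a model that includes interaction terms, which is what distinguishes this approach from varying one factor at a time.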
A comprehensive approach that takes into account both the upstream and downstream processes in an integrated manner is required. The physical and biological state of cells, process parameters, feeding strategy, process robustness, protein localisation, inefficient translocation and incomplete signal sequence processing, and many other parameters in the upstream process can all affect the downstream steps. Small changes in the upstream process can have a significant impact on the downstream operations and performance. As a result, in order to develop a productive and economic process, both upstream and downstream processes need to be investigated as an integrated system.

Summary
In this review, we presented a comprehensive list of possible solutions for the production of Fab molecules in microbial systems. Different growth media and feeding strategies, as well as the synergistic effect of different parameter combinations and screening strategies, can significantly improve Fab molecule expression. Coexpression of multiple chaperone-encoding genes and careful selection and modification of secretion pathways can result in increased yields of correctly folded proteins. A deeper understanding of the stress-related degradation pathway may help prevent protein misfolding and mislocalisation. Various methods are being developed to identify high-yielding, scalable refolding conditions for multidomain proteins with interdomain disulphide bonds. Varied purification approaches have also been developed, including an automated purification method that enables a rapid, knowledge-based purification process and helps identify and separate product-related impurities. While some strategies for troubleshooting recombinant protein expression are protein-specific, there remains room for interpretation. The development of a suitable host with a stable expression platform, as well as of a general, systematic media and fermentation process, is critical for robust process development and stable production. Recent advances in strain engineering, proteomics, and metabolomics can provide a better understanding of cellular pathways. Knowledge of gene expression, secretion pathways, and protein folding can be applied to improve the expression of functional Fab molecules. A newly modified strain that can perform oxidative protein folding in the cytoplasm and interesting findings from cell-free experiments offer new avenues for future developments. A holistic approach that takes into account both the upstream and downstream processes in an integrated manner is required.