Digital Twins for scFv Production in Escherichia coli

Quality-by-Design (QbD) is demanded by regulatory authorities in biopharmaceutical production. Within the QbD framework, advanced process control (APC), facilitated through process analytical technology (PAT) and digital twins (DT), plays an increasingly important role because it helps to ensure that the process stays within the predefined proven acceptable range (PAR). This ensures high product quality, minimizes failure, and is an important step towards real-time release testing (RTRT), which could help to accelerate time-to-market of drug substances and is becoming even more important in light of dynamic pandemic situations. The approach is exemplified on scFv manufacturing in Escherichia coli. Simulation results from digital twins are compared to experimental data and found to be accurate and precise. Harvest is achieved by tangential flow filtration, followed by product release through high pressure homogenization and subsequent clarification by tangential flow filtration. Digital twins of the membrane processes show that shear rate and transmembrane pressure are significant process parameters, which is in line with experimental data. The optimized settings were a transmembrane pressure of 0.3 bar and a shear rate of 11,000 s⁻¹. Productivities of the chromatography steps were 5.3 g/L/d (Protein L) and 2167 g/L/d (CEX), and the final product concentration was 8 g/L. Based on digital twin results, an optimized process schedule was developed that decreased purification time to one working day, which is a factor-two reduction compared to the conventional process schedule. This work presents the basis for future studies on advanced process control and automation for biologics production in microbial systems in regulated industries.


Introduction
Among prokaryotic organisms for the production of pharmaceutically active and commercially relevant proteins, Escherichia coli (E. coli) is the most widely used [1]. Because the bacterium has been extensively studied, has high growth rates, can grow to high cell densities, and grows on inexpensive, chemically defined media, it is a well-suited production organism [2,3]. A major drawback is that E. coli is not capable of post-translational modifications, such as glycosylation. However, non-glycosylated therapeutic proteins, such as insulin, various growth factors, interferons, and antibody fragments can be efficiently produced with E. coli [4][5][6].
In addition, an important factor for protein production is the formation of disulfide bridges. The reducing properties of E. coli cytoplasm prevent disulfide bridges from being formed. Lack of, or defective, disulfide bridges in combination with high expression rates often lead to agglomeration of the target protein resulting in insoluble inclusion bodies (IB), which are biologically inactive and must be resolved in a separate process step [7,8].
To circumvent this problem, the target protein can be expressed as a fusion protein with a signal sequence that controls transport into the periplasmic space. The oxidizing conditions prevailing there allow more efficient formation of disulfide bridges.
The workflow of a QbD-based process development is shown in Figure 2 [16,33]. First, the quality target product profile (QTPP) is defined, which influences the bioavailability, potency, and stability of the drug. Thus, the relationship between the QTPP and the quality, safety, and efficacy of the drug can be established [34]. Subsequently, critical quality attributes (CQAs) are defined so that the target quality can be achieved [35][36][37]. Consequently, they must be dynamically adjusted as new knowledge is gained about the process or product. Experiments or risk management are used to define the CQAs. The latter includes risk assessment, which should be performed at the beginning of QbD-based process development [35,38]. The definition of the design space in which consistent product quality can be ensured follows the risk analysis and is traditionally done by experimentation. The experimental effort can be reduced by using statistical design-of-experiments (DoE) and rigorous process models [39]. The use of process models becomes feasible when the model used is at least as accurate and precise as the experiments it is designed to replace. This requirement can be verified using Monte Carlo simulation studies by comparing the simulated results with the experimentally obtained data. By using predictive process models, in addition to the resource-efficient spanning of the design space, a quantitatively defined and knowledge-based process optimum can be determined. Consequently, process development is no longer purely empirical, but is extended by a model- and data-based process evaluation. In addition, physicochemical models do not lose their validity even if the limits of the design space are exceeded [35].

Figure 2. Workflow of model validation based on a QbD-oriented approach. In the first step, the QTPPs are defined. Subsequently, the CQAs are defined and a risk assessment of the influence of various process parameters on the CQAs is carried out. The risk assessment results in a design space for the process parameters to be investigated, which can be examined either via experiments or by means of a rigorous process model. Based on the results, a control strategy is defined, which can be continuously compared online via PAT with the actual state of the system. Strict implementation of this strategy allows continuous process optimization.
Spanning the design space is followed by the development of a control strategy, which is supported by process analytical technologies (PAT). This PAT-supported control strategy, in combination with digital twins, provides the basis for an automated, continuous process and also leads to a shortened time-to-market. It also reduces the burden on personnel by reducing both cleaning and sterilization efforts. PAT also enables real-time release testing (RTRT) [35,40]. The development of PAT strategies based on spectroscopic methods has already been demonstrated using various substance systems [40][41][42]. The final step of the holistic QbD approach involves continuous improvement [35,40]. Thus, a process developed in this way allows better utilization of costly raw materials such as fermentation and feed media. Furthermore, a process developed according to QbD guidelines can reduce logistical expenses [23,42-45].
This article presents the application of digital twins in the context of a QbD concept. Figure 3 gives an overview of the process.

Fed-Batch Fermentation
Fed-batch cultivations were described using a Monod-based model in which glucose was included as the carbon source. The concentration of viable cells, X_V, was described by a biomass balance in which µ is the growth rate, which in turn was described by a Monod equation. The glucose concentration was described by a balance in which m_glc is the maintenance coefficient of glucose. The yield coefficient of biomass from glucose, Y_XV/glc, is composed of a growth-independent term, Y_XV/Glc,growth, and a growth-dependent term, q_Glc/Xm. The change in the oxygen concentration was calculated based on the oxygen transfer and the oxygen consumption by the viable biomass, where k_L a is the volumetric oxygen transfer coefficient, c*_O2 is the oxygen saturation concentration in the fermentation broth, c_O2 is the current oxygen concentration in the fermentation broth, and q_O2 is the specific oxygen uptake rate.
The specific product formation was proportional to the viable cell concentration.
V̇ is the volumetric flow rate either into or out of the reactor, and V is the cultivation volume. The change in volume over time was calculated using a volume balance. Since the process considered in this study is a fed-batch cultivation, the outflowing volume flow is zero.
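The balances described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a constant yield coefficient plus a maintenance term, omits the oxygen and product balances, and integrates with a simple explicit Euler scheme. All function names and the kinetic parameter values (`mu_max`, `k_s`, `y_xs`, `m_glc`) are hypothetical; only the initial dry cell weight and glucose concentration are taken from the fermentation described in this article.

```python
def monod(c_glc, mu_max, k_s):
    """Monod growth rate in 1/h for a glucose concentration in g/L."""
    return mu_max * c_glc / (k_s + c_glc)


def simulate_fed_batch(x0, s0, v0, feed_rate, s_feed,
                       mu_max, k_s, y_xs, m_glc, t_end, dt=0.001):
    """Integrate biomass, glucose, and volume balances with explicit Euler.

    feed_rate is a function of time (h) returning the feed flow in L/h.
    Returns biomass (g/L), glucose (g/L), and volume (L) at t_end.
    """
    x, s, v, t = x0, s0, v0, 0.0
    for _ in range(int(round(t_end / dt))):
        f = feed_rate(t)
        mu = monod(s, mu_max, k_s)
        dilution = f / v  # feed dilutes the broth; outflow is zero in fed-batch
        dx = mu * x - dilution * x
        ds = -(mu / y_xs + m_glc) * x + dilution * (s_feed - s)
        x += dx * dt
        s = max(s + ds * dt, 0.0)  # glucose cannot become negative
        v += f * dt
        t += dt
    return x, s, v


# Batch phase of 5 h without feed; kinetic parameters are illustrative only.
x, s, v = simulate_fed_batch(x0=0.5, s0=8.8, v0=1.0,
                             feed_rate=lambda t: 0.0, s_feed=500.0,
                             mu_max=0.3, k_s=0.05, y_xs=0.5, m_glc=0.02,
                             t_end=5.0)
print(f"biomass {x:.2f} g/L, glucose {s:.2f} g/L, volume {v:.2f} L")
```

In a Monte Carlo study, a function like `simulate_fed_batch` would be called repeatedly with parameters drawn from their experimentally observed ranges.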

Ultrafiltration/Diafiltration
The investigated hollow fiber module consists of 60 fibers (0.5 mm inner diameter, 0.2 m length, 115 cm² surface area, 500 kDa MWCO). The process model applied is based on work by Grote et al. [46]. Filtration is described by the Darcy-Weisbach equation [47][48][49]. The three major approaches to model flux decline in tangential-flow ultrafiltration are resistance, gel-concentration, and osmotic-pressure models [50]. Since the retentate stream is a suspension of E. coli cells and bioparticles, flux decline is best described by the resistance model, in which the total resistance R is the sum of the initial membrane resistance R_m and the boundary-layer resistance R_bl, which is determined from experiments [51,52]. The transmembrane pressure TMP is defined by Equation (9). For process simulations, all model parameters are randomly varied inside the experimentally observed range to show model prediction precision and accuracy. Based on 31 simulations, the experimental results are then compared to the simulated prediction range.
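The resistance model and the random parameter variation can be sketched as follows. This is a hedged illustration only: the flux expression J = TMP / (η (R_m + R_bl)) and the TMP definition as mean retentate-side pressure minus permeate pressure are the commonly used forms and are assumed here to correspond to the equations referenced above. The function names, viscosity, and resistance ranges are hypothetical; the TMP of 0.3 bar and the number of 31 runs are taken from this article.

```python
import random


def transmembrane_pressure(p_feed, p_retentate, p_permeate):
    """Mean pressure on the retentate side minus the permeate pressure."""
    return 0.5 * (p_feed + p_retentate) - p_permeate


def permeate_flux(tmp_pa, viscosity_pa_s, r_membrane, r_boundary_layer):
    """Flux in m/s from TMP in Pa, viscosity in Pa s, and resistances in 1/m."""
    return tmp_pa / (viscosity_pa_s * (r_membrane + r_boundary_layer))


def monte_carlo_fluxes(n_runs, tmp_pa, r_m_range, r_bl_range,
                       viscosity_pa_s=1.0e-3, seed=0):
    """Draw both resistances uniformly from their ranges and return all fluxes."""
    rng = random.Random(seed)
    fluxes = []
    for _ in range(n_runs):
        r_m = rng.uniform(*r_m_range)
        r_bl = rng.uniform(*r_bl_range)
        fluxes.append(permeate_flux(tmp_pa, viscosity_pa_s, r_m, r_bl))
    return fluxes


# Hypothetical resistance ranges of the order of 1e12 1/m at 0.3 bar TMP.
fluxes = monte_carlo_fluxes(31, 0.3e5, (0.8e12, 1.2e12), (1.5e12, 2.5e12))
print(f"flux range: {min(fluxes):.2e} to {max(fluxes):.2e} m/s")
```

An experimental flux value lying inside the printed range would indicate that the model is at least as precise as the spread of the varied parameters allows.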

Chromatography
There are different approaches for the purification of single-chain variable fragments (scFv). Most commonly, scFv are purified by metal-ion affinity chromatography, which captures scFv from the stock by means of a His-tag modification [53]. As an alternative to metal-ion affinity chromatography, Protein L chromatography has gained traction in recent years [54]. In addition, cation-exchange chromatography or different mixed-mode chromatography techniques can be used in the purification of these fragments [55]. In contrast to metal-ion affinity chromatography, these techniques do not need a histidine-tagged protein, whose tag would most likely have to be cleaved before use in a pharmaceutical application. We therefore decided to use Protein L chromatography as the purification step and cation-exchange chromatography as the polishing step.
For Protein L chromatography, we used the Toyopearl® AF-rProtein L-650 F skillpak 1 mL column (Tosoh Bioscience GmbH, Griesheim, Germany). For cation-exchange chromatography, we used the BioPro IEX SmartSep S30 1 mL column (YMC Europe GmbH, Dinslaken, Germany). We based our method design for Protein L on literature data [54] and employed a five-column-volume (CV) gradient from buffer A to buffer B. Buffer A consisted of 50 mM sodium citrate (pH 6.5) and buffer B consisted of 50 mM sodium citrate (pH 2.3). Subsequent to the load, we washed the column for 10 CV, and we used a 10 CV regeneration with 0.1 M NaOH. For cation-exchange chromatography, we used a 5 CV gradient in conjunction with a 3 CV wash step and a 3 CV regeneration, based on a prior method screening. In cation-exchange chromatography, buffer A consisted of 50 mM sodium phosphate (pH 3.5), and buffer B consisted of 50 mM sodium phosphate and 1 M sodium chloride (pH 5.5).
The digital twin for chromatography can use either a general rate model or a lumped pore diffusion model [56][57][58]. In this case study, a lumped pore diffusion model of chromatography was used because, for scFv, pore diffusion is not expected to have a major impact on the chromatogram. This is supported by the relatively low impact observed for monoclonal antibodies, which are considerably larger [58,59]. In the mass balance of the stationary phase for the lumped pore diffusion model [56], ε_P,i is the porosity of the component, c_P,i the concentration of the component in the pores, t the time, q_i the loading, d_P the mean diameter of the resin particle, ε_S the voidage, k_eff,i the effective mass transport coefficient, and c_i the concentration in the continuous phase. Different approaches for modelling of adsorption have been described by different working groups [56,59-62]. In this study, adsorption is modelled using a Langmuir isotherm [61,63], in which q_max,i is the maximum loading capacity of the component and K_eq,i is the Langmuir coefficient of the component. K_eq,i and q_max,i are related by the Henry coefficient H_i, see Equation (12) [56]. Salt influence can be described by Equations (13) and (14), which define a_1, a_2, b_1, and b_2 as material constants [59,64].
The mass transfer coefficient k_eff,i is given by Equation (15), in which k_f,i is the film mass transfer coefficient, r_p the particle radius, and D_p,i the pore diffusion coefficient.
D_p,i is calculated according to the correlation of Carta [65] and k_f,i according to Wilson and Geankoplis [66].
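The relations named above can be made concrete with a short sketch. The Langmuir isotherm and the Henry coefficient H = q_max K_eq are standard; the series-resistance form of the effective mass transfer coefficient, 1/k_eff = 1/k_f + r_p/D_p, is an assumption about the form of the referenced equation and may differ from it by a constant factor. Function names and all numerical values are hypothetical.

```python
def langmuir_loading(c, q_max, k_eq):
    """Equilibrium loading for a liquid-phase concentration c (Langmuir)."""
    return q_max * k_eq * c / (1.0 + k_eq * c)


def henry_coefficient(q_max, k_eq):
    """Initial slope of the Langmuir isotherm, H = q_max * K_eq."""
    return q_max * k_eq


def effective_mass_transfer(k_film, r_particle, d_pore):
    """Film and pore resistances in series (assumed form, see lead-in)."""
    return 1.0 / (1.0 / k_film + r_particle / d_pore)


# Illustrative values only: q_max in g/L, K_eq in L/g, k_film in m/s,
# particle radius in m, pore diffusion coefficient in m^2/s.
q = langmuir_loading(c=1.0, q_max=50.0, k_eq=2.0)
h = henry_coefficient(q_max=50.0, k_eq=2.0)
k_eff = effective_mass_transfer(k_film=1.0e-5, r_particle=2.25e-5, d_pore=5.0e-12)
print(f"loading {q:.1f} g/L, Henry coefficient {h:.0f}, k_eff {k_eff:.2e} m/s")
```

At low concentration the loading approaches H·c, and at high concentration it saturates at q_max, which is why both parameters are needed to describe the capture step over the whole load phase.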

Lyophilization
As the last step in the production of the scFv solution, lyophilization is used to formulate the product into a stable solid form. The lyophilization process is described by a one-dimensional sorption-sublimation model introduced by Klepzig et al. [67], where the exact derivation of the model is given. It calculates the time-dependent product temperature and the residual moisture during the lyophilization process with coupled mass and heat transfer. The process is separated into primary and secondary drying. In both drying steps, conduction is the main heat transport mechanism.
In the energy balance, ρ_Product describes the density of the product, c_p,apparent is the apparent heat capacity, T is the product temperature inside the vial, and λ is the heat conductivity.
In primary drying, the ice is removed by sublimation and convection through the dried zone. The phase change at the sublimation interface is implemented via the apparent heat capacity.
The overall mass balance of water considers ice and the dried product. Convection controls the transport rate.
In this balance, m_w is the overall water mass, ρ_W,g the density of water vapor, Δp the pressure difference, η_W the dynamic viscosity of water vapor, K the hydraulic flow resistance, and A_Vial the cross-sectional area of the vial. Heat and mass transfer are coupled by the sublimation enthalpy [68].
In secondary drying, bound water is removed from the dried matrix by desorption, which is modeled by an Arrhenius approach. In the mass balance of the bound water, w_bw is the mass fraction of the bound water in the dried product, Δh_subl the sublimation enthalpy, R the gas constant, α_w the water activity, and w_bw,eq the mass fraction of bound water at equilibrium.
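The idea of an Arrhenius-type desorption step can be illustrated with a simplified stand-in. The sketch below assumes first-order kinetics towards the equilibrium moisture with a rate constant k = k0 exp(−E_a/(R T)) at constant product temperature; it is not the balance of the cited model, which additionally involves the water activity. The pre-exponential factor `k0`, the activation energy, and the moisture values are hypothetical.

```python
import math

R_GAS = 8.314  # universal gas constant in J/(mol K)


def desorption_rate_constant(temperature_k, k0, activation_energy):
    """Arrhenius rate constant in 1/h for a temperature in K."""
    return k0 * math.exp(-activation_energy / (R_GAS * temperature_k))


def bound_water_fraction(t_h, w0, w_eq, temperature_k, k0, activation_energy):
    """Analytical solution of dw/dt = -k (w - w_eq) at constant temperature."""
    k = desorption_rate_constant(temperature_k, k0, activation_energy)
    return w_eq + (w0 - w_eq) * math.exp(-k * t_h)


# Illustrative numbers: 15 degC shelf temperature, hypothetical kinetics.
for hours in (0.0, 2.0, 6.0):
    w = bound_water_fraction(hours, w0=0.05, w_eq=0.005,
                             temperature_k=288.15, k0=2.0e6,
                             activation_energy=3.5e4)
    print(f"t = {hours:4.1f} h, bound water fraction = {w:.4f}")
```

Because the rate constant grows exponentially with temperature, raising the shelf temperature in secondary drying shortens the time needed to approach the equilibrium moisture.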

Fed-Batch Fermentation
Fed-batch fermentation of E. coli BL21(DE3) transformed with pET22b(+) carrying the scFv gene under the control of the T7 promoter was performed for 19 h. Riesenberg's medium with an initial glucose concentration of 8.8 g/L as well as 10 g/L yeast extract and 5 g/L soy peptone was inoculated to an initial dry cell weight of 0.5 g/L. The fermentation was carried out at 37 °C, pH 6.9, and a stirrer speed of 2000 rpm, using a Rushton turbine with a diameter of 54 mm. Gassing was constant at 2 vvm using air and pure oxygen so as to keep the pO₂ above 20%. The batch phase lasted 5 h, after which exponential glucose feeding was started to reach a constant growth rate of 0.15 h⁻¹. After four more hours, the feeding profile was adjusted to reach a constant growth rate of 0.05 h⁻¹. Expression of the scFv-encoding gene was induced by addition of 1 mM IPTG after 12 h. Figure 4 shows the experimental results together with the results of 30 Monte Carlo simulations of the fed-batch fermentation. The model predicts the growth of the cells and the consumption of glucose with sufficient accuracy. Only at the end of the fermentation is the biomass slightly underestimated. The final scFv concentration after high pressure homogenization (HPH) was 0.74 ± 0.02 g/L, which is equivalent to 13.7 ± 0.4 mg/g.
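The two-stage exponential feeding described above can be sketched with the textbook feed law F(t) = (µ_set/Y_X/S + m) · X·V / c_feed, in which the biomass X·V grows exponentially at the set growth rate. This is a hedged illustration: the article does not state the feed equation used, and the yield, maintenance coefficient, feed concentration, and biomass at feed start are hypothetical. Only the phase times (5 h and 9 h) and the set growth rates (0.15 h⁻¹ and 0.05 h⁻¹) are taken from the text.

```python
import math


def two_phase_feed(t_h, biomass_at_feed_start_g, y_xs, m_glc, c_feed):
    """Feed rate in L/h for a set growth rate of 0.15 1/h from 5 h to 9 h
    and 0.05 1/h afterwards; zero during the batch phase."""
    t1, t2 = 5.0, 9.0
    mu1, mu2 = 0.15, 0.05
    if t_h < t1:
        return 0.0
    if t_h < t2:
        mu = mu1
        biomass = biomass_at_feed_start_g * math.exp(mu1 * (t_h - t1))
    else:
        mu = mu2
        biomass_at_switch = biomass_at_feed_start_g * math.exp(mu1 * (t2 - t1))
        biomass = biomass_at_switch * math.exp(mu2 * (t_h - t2))
    # Glucose demand for growth plus maintenance, divided by feed concentration.
    return (mu / y_xs + m_glc) * biomass / c_feed


# Illustrative values: 5 g biomass at feed start, 500 g/L glucose feed.
for t in (4.0, 5.0, 8.0, 9.0, 12.0):
    f = two_phase_feed(t, 5.0, y_xs=0.5, m_glc=0.02, c_feed=500.0)
    print(f"t = {t:4.1f} h, feed = {1000.0 * f:.2f} mL/h")
```

The feed rate drops at the switch to the lower set growth rate because the specific glucose demand falls, and it then continues to rise more slowly.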

Harvest by UFDF
After fermentation, cell harvest was performed by ultra-/diafiltration. This process step can be divided into two steps: volume reduction and concentration by a factor of three, and subsequent washing with five diafiltration volumes (DV). Concentration reduces the amount of buffer required in the following diafiltration step. The process model predictions for three experiments differing in applied transmembrane pressure (TMP), shear rate, and cell concentration are shown in Figure 5. The prediction of the digital twin is in line with the experimental data. Both the strong decrease in the flux (Jv) at the beginning and the slower decrease in the further course of the experiment, caused by the increase in the boundary-layer resistance, are reproduced.
In order to demonstrate the predictive power of the digital twin, we investigated the point in time from which the course of the filtration can be predicted and the endpoint determined. Figure 6 shows that the blocking mechanism, including the blocking constant, can already be identified after one third of the total process time. Thus, the digital twin, which is continuously fed with process data in real time, can predict the further course of the process at an early stage, compare it with predefined limits and, if necessary, carry out an optimization.
Figure 6. Demonstration of early digital twin-assisted prediction of the filtration progress as well as endpoint determination.

Clarification by UFDF
After cell lysis, the homogenate is clarified by ultra-/diafiltration. The target component is leached out over five diafiltration volumes. The process model predictions for three experiments differing in applied transmembrane pressure (TMP), shear rate, and applied concentration are shown in Figure 7. The concentration used corresponds to the concentration factor of the ultra-/diafiltration used for cell harvest. The prediction of the digital twin agrees with the experimental data. The strong initial decrease in flux (Jv) is predicted, as well as the subsequent flux, which is approximately constant.

Concentration and Buffer Exchange by UFDF
After clarification of the homogenized suspension, volume reduction and buffer exchange are performed by ultrafiltration/diafiltration. This process step can be divided into two phases. First, a concentration step is performed to reduce the subsequently needed exchange buffer volume as well as the process time. Second, buffer exchange and partial purification are achieved by diafiltration with five diafiltration volumes (DV). Process model predictions are shown in Figure 8. The digital twin prediction aligns with the experimental results regarding the ultrafiltration endpoint.

Purification by Protein L Chromatography
For the Protein L Digital Twin, we based our calculations on the data given by the resin supplier, especially for the resin particle size, pore size, and operating velocity ranges. Isotherm parameters were determined using a least-squares routine in combination with the experimentally obtained data. The results are shown in Figure 9. For the scale-up of the purification step, we used the determined parameters and increased velocity according to the resin supplier's information sheet to 500 cm/h; additionally, we decreased the wash volume in the chromatographic method to 3 CV. This results in a productivity of 5.31 g/L·d, with a high process yield in the obtained fraction of 96%.
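One simple way to determine Langmuir parameters by least squares is sketched below. It is a linearized variant and not necessarily the routine used in this study: the isotherm is rearranged to c/q = c/q_max + 1/(q_max K_eq) (Hanes-Woolf form) and a straight line is fitted by ordinary least squares. The function name and the data points are hypothetical.

```python
def fit_langmuir_linearized(concentrations, loadings):
    """Estimate q_max and K_eq from equilibrium data via the Hanes-Woolf line.

    Fits y = slope * x + intercept with x = c and y = c / q, so that
    slope = 1 / q_max and intercept = 1 / (q_max * K_eq).
    """
    xs = list(concentrations)
    ys = [c / q for c, q in zip(concentrations, loadings)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    q_max = 1.0 / slope
    k_eq = slope / intercept
    return q_max, k_eq


# Hypothetical equilibrium data: concentration in g/L, loading in g/L resin.
conc = [0.1, 0.5, 1.0, 2.0, 5.0]
load = [8.1, 25.4, 33.0, 40.3, 45.2]
q_max_fit, k_eq_fit = fit_langmuir_linearized(conc, load)
print(f"q_max = {q_max_fit:.1f} g/L, K_eq = {k_eq_fit:.2f} L/g")
```

The linearization weights the data differently from a direct nonlinear fit of the chromatogram, so the result is best treated as a starting estimate for a more rigorous parameter determination.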

Figure 9. Chromatograms for the laboratory process (a) and the scaled-up process (b). The dashed red line gives the experimental data, black shows a theoretical fraction cut, the solid green lines give the data derived from the digital twin, and the solid blue lines give the gradients.

Polishing by Cation Exchange Chromatography (CEX)
In CEX chromatography, the side components are mostly non-binding components that elute during the wash from one to three minutes (see Figure 10). As such, a very pure product is obtained, with no identifiable side components in the analytics. The product is obtained in one CV, which would allow for a good concentration factor in this step. For the evaluation of a potential scale-up, however, we have to rely on literature data for the maximum loading capacity of the resin, as a production of scFv on a preparative scale to evaluate the CEX chromatography thoroughly was infeasible in the context of this study. We decided to assume the capacity given by the supplier.
As before, we used the product of the Protein L chromatography as the feed for the following unit operation. The feed volume was 42 mL, and the process was scaled up to a velocity of 500 cm/h. This scale-up results in a productivity of 2167 g/L·d; the yield was 96% and the concentration factor was 3.9, giving a product concentration of 8 g/L after CEX chromatography.



Discussion
In pharmaceutical process development, the QbD concept is increasingly important. During the process run, it must be constantly ensured that the process is operated within PAR. Spectroscopic methods such as Raman and FTIR spectroscopy allow real-time process control with respect to critical process parameters. This information can be passed on to the digital twins of the process, allowing early prediction of process performance and, thus, ensuring robust process control. In this study, using the production of scFv in E. coli as an example, it was shown that the entire process, starting with fermentation, and subsequently through several purification steps to formulation, can be well predicted across all basic operations using the process models employed.
In fermentation, the progression of biomass, glucose, and product concentration can be predicted accurately and precisely. The model can be used to control glucose feeding, pH, and pO2. The biomass and product concentration predictions can also be passed to the subsequent process models to adjust their operating parameters accordingly.
After fermentation, the fermentation broth is first concentrated via tangential flow filtration and then washed with PBS (UF/DF). The digital twin can accurately predict the flux across the membrane as a function of TMP and shear rate. Thus, in the process, the model can predict biomass and product concentration on the one hand, and depending on the operating parameters, predict filtration time based on increasing blockage due to increasing top layer resistance. Depending on the blockage, the TMP, as well as the shear rate, can be adjusted based on the model prediction and thus be operated as gently as possible. Fluctuating biomass concentrations from the fermentation can also be taken into account in real time and the operating parameters adjusted accordingly to achieve consistent concentration and defined washing.
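The link between TMP, top layer resistance, and flux is commonly written as a resistance-in-series (Darcy-type) relation, J = TMP / (μ (R_m + R_c)). The following Python sketch illustrates that general relation only. It is not the process model used in this study, it omits the shear-rate dependence of the top layer, and the viscosity and resistance values are hypothetical.

```python
def permeate_flux_lmh(tmp_bar, viscosity_pa_s, r_membrane, r_top_layer):
    """Permeate flux in L/m^2/h from a resistance-in-series relation.

    tmp_bar:         transmembrane pressure in bar
    viscosity_pa_s:  permeate viscosity in Pa*s
    r_membrane:      membrane resistance in 1/m
    r_top_layer:     top layer (cake) resistance in 1/m
    """
    tmp_pa = tmp_bar * 1e5  # bar -> Pa
    flux_m_per_s = tmp_pa / (viscosity_pa_s * (r_membrane + r_top_layer))
    return flux_m_per_s * 1000.0 * 3600.0  # m/s -> L/m^2/h


# Hypothetical values for illustration (not fitted to the experiments).
clean_flux = permeate_flux_lmh(0.3, 1e-3, 1e12, 0.0)
fouled_flux = permeate_flux_lmh(0.3, 1e-3, 1e12, 2e12)
```

In this form, a growing top layer resistance lowers the flux at constant TMP, which is why the remaining filtration time depends on how the blockage develops.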
The washing step is followed by mechanical cell disruption using a high-pressure homogenizer. After disruption, the scFv is clarified by tangential flow filtration. Analogous to the harvest, the digital twin can be used to predict, monitor, and control the washing. The model is able to predict the experimentally determined permeate fluxes. In addition, in combination with PAT, for example FTIR, the digital twin allows dynamic adjustment of the exchange volumes based on the current desalination level and purity. This flexibility cannot be achieved with conventional process development and offline analytics, where exchange volumes are fixed. Consequently, only a digital twin and QbD-based process development can achieve optimal product purity or save buffers.
The clarification step is followed by Protein L chromatography, in which the scFv is bound from the mobile phase by affinity. The digital twin of the chromatography step predicted the experimentally determined process course well, based on the manufacturer's data for the adsorbent and the experimentally determined isotherm parameters. Using the model, the flow rate could be increased and the wash step reduced to 3 DV, allowing the process to be run more efficiently. As a result, a productivity of 5.3 g/L/d at a purity of 96% can be achieved. Furthermore, the process model accurately predicts elution time points, which can be used to control fractionation of the product.
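The dynamic adjustment of exchange volumes discussed for the diafiltration steps can be illustrated with the textbook washout relation for constant-volume diafiltration, c/c0 = exp(−S·N), where N is the number of diafiltration volumes and S the sieving coefficient of the solute. The sketch below assumes ideal mixing and a constant S. The function names are hypothetical, and the current concentration would in practice come from a PAT signal such as FTIR.

```python
import math


def residual_fraction(diavolumes, sieving=1.0):
    """Fraction of a solute remaining after constant-volume diafiltration."""
    return math.exp(-sieving * diavolumes)


def diavolumes_to_target(current_conc, target_conc, sieving=1.0):
    """Diafiltration volumes still needed to wash a solute down to target_conc."""
    if current_conc <= target_conc:
        return 0.0  # target already reached, no further buffer needed
    return math.log(current_conc / target_conc) / sieving


# Five diafiltration volumes of a freely permeating solute (S = 1).
remaining_after_five = residual_fraction(5)
```

Under these assumptions, five diafiltration volumes leave about 0.7% of a freely permeating solute, and recomputing `diavolumes_to_target` from each new measurement is one simple way to stop washing as soon as the target is met.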
Protein L chromatography is followed by polishing using CEX chromatography. Analogous to protein L chromatography, the process model can be used to increase efficiency in terms of throughput. As a result, a productivity of 2167 g/L/d at a yield of 96% is achieved. The product concentration after CEX is 8 g/L.
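Productivity figures of this kind are usually expressed as purified product mass per resin volume and time. The sketch below assumes that definition, since the exact formula is not stated here, and uses hypothetical inputs that are not taken from the experiments.

```python
def productivity_g_per_l_per_day(product_mass_g, column_volume_l, cycle_time_h):
    """Purified product mass per litre of resin per day of cycle time."""
    cycles_per_day = 24.0 / cycle_time_h
    return product_mass_g / column_volume_l * cycles_per_day


# Hypothetical example: 0.096 g of product from a 1 mL column in a 1 h cycle.
base_case = productivity_g_per_l_per_day(0.096, 0.001, 1.0)
faster_case = productivity_g_per_l_per_day(0.096, 0.001, 0.5)
```

With this definition, halving the cycle time at constant yield doubles the productivity, which is the lever behind increasing the flow rate and shortening wash steps.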
After polishing, freeze drying is performed to ensure long-term stability. The digital twin in freeze drying can be used to enable efficient process control in terms of drying time while avoiding collapse of the cake. The primary variables influencing drying are shelf temperature, temperature gradient, chamber pressure, and residual moisture in the dried product. These can be controlled or predicted by the digital twin. Furthermore, the model allows a prediction of the heterogeneity of the vials depending on the position in the freeze dryer (corner vs. center).
The use of digital twins enables more efficient process management in several respects, as shown in Figure 12. First, the exact end points of the process steps can be predicted at an early stage, so that the subsequent unit operations can be optimally prepared and scheduled (Figure 13). Second, PAT in combination with a predefined normal operating range eliminates the need for time-consuming quality controls. Overall, the maximum savings potential is a reduction in total downstream processing time by a factor of two, since the downstream process can be performed within one working day instead of two.

Conclusions
This work demonstrates the application of digital twins for advanced process control using the example of the production and purification of scFv in E. coli for a whole process, starting with fermentation and ending with the formulation by freeze drying. The digital twin can predict the concentrations during fermentation, such as glucose, but can also be used for optimization. Knowledge of the ideal harvest time and the resulting product concentration present can be used for optimal preparation of the harvest and concentration via tangential flow filtration, where concentration is increased by a factor of three, and five diafiltration volumes are used for washing. Tangential flow filtration is also applied for clarification after high-pressure homogenization, where the digital twin accurately predicts flux decrease at different operating points within a DoE plan, which enabled shear rate and biomass concentrations to be identified as critical process parameters. In the following Protein L purification and cation exchange chromatography polishing step, the digital twins allow for improvement in terms of duration and productivity. In the final lyophilization step, the digital twin is able to determine the critical residual moisture and endpoint of the drying, depending on the position of the vial in the freeze dryer.

Data Availability Statement:
Data generated in this study are available from the authors upon reasonable request.

Acknowledgments:
The authors would like to acknowledge the fruitful discussions with their Clausthal University of Technology colleagues. The authors would like to thank Annika Leibold and Armin Busch for excellent laboratory work and discussion. The authors acknowledge financial support by the Open Access Publishing Fund of Clausthal University of Technology.

Conflicts of Interest:
The authors declare no conflict of interest.
