Accelerating Biologics Manufacturing by Upstream Process Modelling

Intensified and accelerated development processes are being demanded by the market, as innovative biopharmaceuticals such as virus-like particles, exosomes, cell and gene therapies, as well as recombinant proteins and peptides, will have no available platform approach. Therefore, methods that are able to accelerate this development are preferred. In particular, physicochemically rigorous process models, based on all relevant effects of fluid dynamics, phase equilibrium, and mass transfer, can be predictive if the model is verified and distinctly quantitatively validated. In this approach, a macroscopic kinetic model based on Monod kinetics for mammalian cell cultivation is developed and verified according to a general valid model validation workflow. The macroscopic model is verified and validated on the basis of four decision criteria (plausibility, sensitivity, accuracy and precision, as well as equality). The process model workflow is subjected to a case study comprising a Chinese hamster ovary fed-batch cultivation for the production of a monoclonal antibody. By performing the workflow, it was found that, based on design of experiments and Monte Carlo simulation, the maximum growth rate µmax exhibited the greatest influence on model variables such as viable cell concentration XV and product concentration. In addition, partial least squares regressions statistically evaluate the correlations between a higher µmax and a higher cell and product concentration, as well as a higher substrate consumption.


Introduction
The process analytical technology (PAT), which was initiated by the Food and Drug Administration (FDA) in 2004, is a key-enabling technology for quality-by-design (QbD) process development approaches [1–3]. The steadily increasing demand for process robustness, as well as for reducing cost of goods (COGs) and batch variability, requires intensified processes, detailed process understanding, and control [4]. Therefore, the PAT initiative aims at measuring, analyzing, monitoring, and ultimately controlling all important attributes of the process [1,5–7]. Bioprocess (i.e., cultivation in batch, fed-batch, and continuous operational mode) monitoring includes process variables such as (viable) cell, substrate, metabolite, and product concentration, as well as product quality and impurities. These target variables depend on state variables such as pH, dissolved oxygen (pO2), and temperature, providing suitable cultivation conditions [8]. Furthermore, state variables are controlled via process variables such as the addition of base, acid, substrates, and salts, as well as the adjustment of stirrer rates, air flow, and heating/cooling temperatures [9]. The ability for measurement, analysis, monitoring, and controlling of variables, preferably online, is strongly dependent on the state (i.e., physical, chemical, biological), the variable itself, and available sensor techniques [2,10]. An online estimation of difficult to measure variables can be achieved by implementing macroscopic kinetic models into bioprocesses [11]. Furthermore, a reduction of experiments for defining process design spaces can be accomplished by an integration of physicochemical process models [12]. For this, the physicochemical model must be verified and distinctively and quantitatively validated, as depicted in the general model validation workflow in Figure 1 [13].
Initially, a model task and its application must be defined, from which a conceptual model is derived. The computerized model is verified by checking equations for syntax, dimensional analysis, as well as mass and energy balances. Subsequently, the first decision criterion is the comparison of characteristic numbers with literature data. Afterwards, the second decision criterion (sensitivity) is based on sensitivity studies. One-parameter-at-a-time as well as DoE (design of experiments) simulation studies are conducted to detect gross errors. A statistical data-driven evaluation of these studies generates a Pareto diagram of standardized effects identifying significant parameters that have a major influence on the respective model. The third decision criterion (accuracy and precision) is based on the comparison of model and experimental errors. For this, a model parameter determination concept needs to be established. The determination concept consists of a separation of parameter effects, as well as an experimental determination and assessment of the impact of parameter errors on the process model, based on error propagation of experimental errors. The objective is to gain a process model with a higher precision than experimental data, which are to be substituted to reduce the experimental effort. In the end, the fourth decision criterion, which is based on simulated experimental data to prove model accuracy and precision, supported by statistical methods (i.e., partial least squares regression (PLS)), will result in a verified and distinctly quantitatively validated process model.
In this approach, the general valid model validation workflow in Figure 1 is applied to a macroscopic kinetic process model, based on a case study comprising a Chinese hamster ovary fed-batch cultivation for the production of a monoclonal antibody [13]. Macroscopic kinetic models simulating the dynamic state of a cell culture can gather essential information about cellular conditions (e.g., lag, exponential, stationary, decline phase), substrate uptake, metabolite, and productivity, as well as possible feeding adjustments [14–19]. A combination of online turbidity data resembling the cell concentration and a macroscopic kinetic model was already applied to estimate substrate and metabolite concentrations [11]. A general overview of several working groups focusing on upstream process modelling is presented in Table 1.

Materials and Methods
Chinese hamster ovary cells (CHO DG44) were used to produce an immunoglobulin (IgG1). The culture conditions were 36.8 °C, pH 7.1, 60% pO2, and 433 rpm (three-blade segment impeller with a diameter of 54 mm and blades at an angle of 30°, bbi-biotech GmbH, Berlin, Germany). The cultivations were carried out in serum-free, commercial medium (CellcaCHO Expression Platform, Sartorius Stedim Biotech GmbH, Göttingen, Germany) in 2 L glass bioreactors (Biostat® B, Sartorius Stedim Biotech GmbH, Göttingen, Germany) controlled via a digital control unit (DCU, Biostat® B, Sartorius Stedim Biotech GmbH, Göttingen, Germany). Pre-cultures were grown in shake flasks in serum-free medium. In terms of fed-batch bioreactor cultivations, feed medium (based on CellcaCHO Expression Platform) was provided every 24 h starting at 72 h. Cell concentration was repeatedly quantified using a hemocytometer (Neubauer improved, BRAND GmbH + CO KG, Wertheim, Germany) and trypan blue solution (0.4%, Sigma-Aldrich, St. Louis, MO, USA) as dye for the detection of dead cells. An in situ turbidity probe (transmission, 880 nm, HiTec Zang GmbH, Herzogenrath, Germany) was used for quantifying the cell concentration during bioreactor cultivations.
The product was quantified by Protein A chromatography (PA ID Sensor Cartridge, Applied Biosystems, Bedford, MA, USA). Dulbecco's PBS buffer was used as a loading buffer at pH 7.4 and as an elution buffer at pH 2.6. The absorbance was monitored at 280 nm. Glucose and lactate concentrations were quantified using a LaboTrace compact (TRACE Analytics GmbH, Braunschweig, Germany). Glutamine and ammonium concentrations were determined by a Bioprofile 100 plus (nova® biomedical, Waltham, MA, USA).
The macroscopic kinetic model was developed in Aspen Custom Modeler V8.4 (Aspen Technology, Inc., Bedford, MA, USA). As bioreactor cell cultures were performed in fed-batch mode with daily bolus feed additions, the model equations were extended by feeding terms. Consequently, volumetric changes were considered as well.
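The effect of a bolus feed on concentrations and volume can be illustrated with a simple mass balance. The following Python sketch is not the Aspen Custom Modeler implementation; the function name `apply_bolus_feed`, the species keys, and all numerical values are hypothetical and serve only to show how a feed addition dilutes species that are not fed while raising the concentration of fed substrates.

```python
def apply_bolus_feed(volume_l, conc, feed_volume_l, feed_conc):
    """Return (new_volume, new_conc) after an instantaneous bolus feed.

    conc and feed_conc map species names to concentrations.
    Species missing from feed_conc (e.g. cells, lactate) are only diluted.
    """
    new_volume = volume_l + feed_volume_l
    new_conc = {}
    for species, c in conc.items():
        c_feed = feed_conc.get(species, 0.0)
        # Mass balance: amount in reactor plus amount in feed, over new volume.
        new_conc[species] = (c * volume_l + c_feed * feed_volume_l) / new_volume
    return new_volume, new_conc


# Hypothetical numbers, for illustration only (not taken from the study).
volume, state = apply_bolus_feed(
    volume_l=1.0,
    conc={"Xv": 10.0, "GLC": 5.0},   # E6 cells/mL, mM
    feed_volume_l=0.25,
    feed_conc={"GLC": 105.0},        # mM, assumed feed concentration
)
print(volume, state)
```

In a dynamic simulation, such a step would be applied at each feed time (every 24 h starting at 72 h in this study) between integration intervals.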

Results and Discussion
In this approach, a Monod-type process model was used for the simulation of dynamic cellular states (i.e., lag, exponential, stationary, decline phase), as well as the uptake of substrates (i.e., glucose (GLC), glutamine (GLN)), production of metabolites (i.e., lactate (LAC), ammonium (AMM)), and product (i.e., monoclonal antibody (mAb)). Model equations are given in the following section and are mostly adopted from Xing et al. [34]. Model parameters employed in the macroscopic model are given in the example of the lag phase in Table 2.
The Monod-type equations are based on limiting substrate and inhibiting metabolite concentrations, as seen in Equations (2) and (3). The higher the substrate concentration, the higher the growth rate µ, whereas higher metabolite concentrations limit cell growth by decreasing the growth rate. The apparent growth rate (µ minus µd) is thus dependent on substrate consumption and metabolite accumulation. These concentrations are simulated by Equations (4)-(8), where yield coefficients (e.g., YXv/glc) empirically describe the correlation between input and output variables. For example, YXv/glc correlates the variation in viable cell concentration (XV, E6 cells/mL) to the variation in glucose concentration (mM) during cultivation. Because of the dynamic behavior of a culture, yield coefficients can be quantitatively described by systematically defining cellular phases (i.e., lag, exponential, stationary, decline).
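Equations (2) and (3) are not reproduced in this text, so the following Python sketch assumes a commonly used Monod-type form with substrate limitation and metabolite inhibition terms. The functional form, the function names, and every constant except µmax = 0.029 h⁻¹ are assumptions chosen for illustration, not the values of Table 2.

```python
def growth_rate(mu_max, glc, gln, lac, amm, K_glc, K_gln, KI_lac, KI_amm):
    """Assumed Monod-type growth rate: substrates limit, metabolites inhibit."""
    limitation = (glc / (K_glc + glc)) * (gln / (K_gln + gln))
    inhibition = (KI_lac / (KI_lac + lac)) * (KI_amm / (KI_amm + amm))
    return mu_max * limitation * inhibition


def death_rate(k_d, lac, amm, KD_lac, KD_amm):
    """Assumed death rate that rises with lactate and ammonium."""
    return k_d * (lac / (KD_lac + lac)) * (amm / (KD_amm + amm))


# Placeholder constants (hypothetical), concentrations in mM, rates in 1/h.
constants = dict(K_glc=1.0, K_gln=0.5, KI_lac=50.0, KI_amm=10.0)
mu = growth_rate(0.029, glc=20.0, gln=4.0, lac=5.0, amm=1.0, **constants)
mu_d = death_rate(0.01, lac=5.0, amm=1.0, KD_lac=40.0, KD_amm=5.0)
apparent_growth_rate = mu - mu_d
print(mu, mu_d, apparent_growth_rate)
```

With this form, µ approaches µmax only when both substrates are in excess and both metabolites are near zero, which matches the qualitative behavior described above.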
Furthermore, the growth rate depends on cell-dependent half-maximum rates, such as Kglc. These model parameters are, by definition, dependent on the maximum growth rate at respective substrate concentrations and are not subject to changes during cultivation. Additionally, maintenance coefficients (e.g., mglc) intend to simulate the maintenance metabolism, where no cell division occurs. Product concentration, Equation (9), is described by a cell specific production rate (QmAb, E-12 g product/cells/h). Figure 2 (left) depicts the verification of the computerized process model (first decision criterion), by comparing experimental data with model-derived results. The viable cell, product, and glucose concentration can be predicted sufficiently well. The coefficient of determination is at least 0.97 for each prediction, with the lowest value obtained for the glucose concentration (Figure 2). Differences between the experimental and model-derived data are possible as a result of fluctuations under feed addition during cultivation that are not described in the kinetic model. These fluctuations mainly affect the glucose concentration, as other model variables such as lactate, antibody, or viable cells are not being fed. However, by adding feed, volumetric changes must be considered during the description of all variables.
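The comparison of experimental and model-derived data can be quantified with a coefficient of determination. The sketch below assumes the usual definition R² = 1 − SS_res/SS_tot; the text does not state which definition was used for Figure 2, and the data points are invented for illustration.

```python
def r_squared(observed, predicted):
    """Coefficient of determination, assuming R^2 = 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Invented example values (not measurements from the study).
experimental = [1.0, 2.0, 3.0, 4.0]
simulated = [1.1, 1.9, 3.2, 3.8]
r2 = r_squared(experimental, simulated)
print(round(r2, 3))
```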
Sensitivity studies such as the variation of one-parameter-at-a-time or design of experiments (DoE) are essential for detecting gross model errors. One-parameter-at-a-time variation of the maximum growth rate (±10% µmax) can be seen in Figure 2 (right). The higher µmax, the steeper the exponential phase between 72 and 180 h of the viable cell concentration. This is mainly the result of a higher growth rate, which can be achieved during this phase, as can be seen in Equations (1) and (2). Therefore, the maximum growth rate has an impact on growth and maximum viable cell concentration in this macroscopic model, especially during the exponential phase. Corresponding to this increase in viable cell concentration, a higher product concentration can be observed, as well as a higher glucose consumption at the constant feed strategy, which is dependent on the current viable cell concentration at respective phases, as seen in Equation (4).
During one-parameter-at-a-time studies, the maximum growth rate exhibited the most significant influence on the course of the considered model variables. Therefore, a precise determination of this process model parameter is crucial for the precise modelling of dynamic cultivation processes, as the viable cell concentration XV, and thus the growth rate µ, are present in the equations regarding substrates (4) and (6), metabolites (5) and (8), as well as product (9). This study indicates how important it is to accurately determine the model parameter to construct an accurate process model.
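A one-parameter-at-a-time study of this kind can be sketched with a strongly simplified stand-in model. The code below is not the paper's model: it uses a single-substrate batch Monod balance integrated with an explicit Euler scheme, and all values except µmax = 0.029 h⁻¹ are hypothetical. It only illustrates how a ±10% variation of µmax changes the viable cell concentration reached at a fixed time.

```python
def simulate_viable_cells(mu_max, t_end_h=120.0, dt_h=0.1):
    """Toy batch model: Monod growth on glucose, explicit Euler integration."""
    xv, glc = 0.3, 30.0            # E6 cells/mL, mM (placeholder start values)
    K_glc, Y_xv_glc = 1.0, 0.5     # mM, E6 cells/mL per mM (placeholders)
    for _ in range(int(round(t_end_h / dt_h))):
        mu = mu_max * glc / (K_glc + glc)
        # Growth in this step cannot exceed what the remaining glucose allows.
        grown = min(mu * xv * dt_h, glc * Y_xv_glc)
        xv += grown
        glc = max(glc - grown / Y_xv_glc, 0.0)
    return xv


mu_max_nominal = 0.029  # 1/h
low = simulate_viable_cells(0.9 * mu_max_nominal)
base = simulate_viable_cells(mu_max_nominal)
high = simulate_viable_cells(1.1 * mu_max_nominal)
for label, xv in (("-10%", low), ("nominal", base), ("+10%", high)):
    print(label, round(xv, 2), "relative change:", round((xv - base) / base, 3))
```

Repeating the loop for every model parameter in turn gives the one-parameter-at-a-time screening; interactions between parameters are not captured this way, which is why the DoE study follows.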
Simulations conducted by DoE approaches enable the quantitative determination of significant parameters. In addition, the variation of multiple parameters at the same time facilitates the coverage of parameter interactions during process modelling. For this, a fractional factorial design (resolution IV) consisting of all model parameters was conducted, resulting in 128 additional simulations.
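The construction of a two-level fractional factorial design can be illustrated on a small case. The sketch below builds a 2^(4-1) design with the generator D = ABC, which is a standard resolution IV design with 8 runs; the factor names are generic, and the design actually used in the study (128 runs over all model parameters) is not reproduced here.

```python
from itertools import product


def fractional_factorial(base_factors, generators):
    """Two-level design in coded units (-1/+1).

    base_factors: names varied in a full factorial.
    generators: maps each additional factor to the base factors whose
    product defines its level (e.g. {"D": ("A", "B", "C")} for D = ABC).
    """
    runs = []
    for levels in product((-1, 1), repeat=len(base_factors)):
        run = dict(zip(base_factors, levels))
        for name, parents in generators.items():
            value = 1
            for parent in parents:
                value *= run[parent]
            run[name] = value
        runs.append(run)
    return runs


design = fractional_factorial(["A", "B", "C"], {"D": ("A", "B", "C")})
for run in design:
    print(run)
```

Each coded level would then be mapped to a low or high parameter value (for example nominal −10% and +10%) before running the simulation for that row.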
The Pareto diagrams (Figure 3) indicate significant model parameters, based on a level of significance (α) set to 0.05 (second decision criterion). Besides the maximum growth rate µmax, the yield coefficient YX/glc has a significant impact on the considered process variable (i.e., maximum viable cell concentration). YX/glc describes the ratio between generated viable cells (E6 cells/mL) and consumed glucose (mM). The higher YX/glc, the more cells can be generated during substrate uptake, as depicted in Equations (4) and (6). This results in a higher maximum viable cell concentration. The same correlation can be found considering YX/gln.
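The ranking behind a Pareto diagram can be sketched by estimating main effects from a two-level design. The example below uses a full 2³ factorial and an invented linear response; the factor names are borrowed from the text, but the coefficients are hypothetical. The standardization of effects against an error estimate, which yields the reference line in Figure 3, is omitted here.

```python
from itertools import product


def main_effects(design, responses):
    """Main effect = mean response at +1 minus mean response at -1."""
    effects = {}
    for name in design[0]:
        high = [y for run, y in zip(design, responses) if run[name] == 1]
        low = [y for run, y in zip(design, responses) if run[name] == -1]
        effects[name] = sum(high) / len(high) - sum(low) / len(low)
    return effects


def toy_response(run):
    # Invented linear response in coded units, for illustration only.
    return 10.0 + 3.0 * run["mu_max"] - 1.0 * run["Y_lac_glc"] + 0.5 * run["k_d"]


factors = ["mu_max", "Y_lac_glc", "k_d"]
design = [dict(zip(factors, levels)) for levels in product((-1, 1), repeat=len(factors))]
responses = [toy_response(run) for run in design]
effects = main_effects(design, responses)
ranking = sorted(effects, key=lambda name: abs(effects[name]), reverse=True)
print(effects)
print(ranking)
```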
The yield coefficient Ylac/glc, resembling the ratio between produced lactate (mM) and consumed glucose (mM), is shown to be a significant parameter by the Pareto diagram in Figure 3 (left). As seen in Equations (2) and (5), a higher Ylac/glc results in higher lactate concentrations and, therefore, in stronger inhibition of the growth rate and ultimately in a reduced maximum viable cell concentration. The same correlation can be found regarding Yamm/gln.
Although the initial viable cell concentration possesses a significant influence on the maximum viable cell concentration, higher initial values result in an increase in substrate consumption and metabolite production, reducing the growth rate more rapidly, and may decrease the overall maximum viable cell concentration. The maximum death rate kd possesses no significant effect on the maximum viable cell concentrations, as this parameter is strongly dependent on the current lactate and ammonium concentration. The higher the metabolite concentration, the higher the death rate µd, thus commonly resulting in a lower maximum viable cell concentration, as seen in Figure 2. However, these metabolites accumulate during cultivation and possess an impact only at the end of the process, after the maximum viable cell concentration has already been reached. Nevertheless, kd can still influence the considered process variable, because the apparent growth rate (µ minus µd) is affected by the maximum death rate as well.
Similarities can be found regarding the Pareto diagram in Figure 3 (right) for the maximum antibody concentration (mAbmax). In addition to the aforementioned correlations, the cell specific antibody production rate QmAb significantly influences the process variable mAbmax. However, the process parameter µmax seems to be the most significant parameter for mAbmax, as well as for XV,max, as can be seen in Figure 3 and in previous one-parameter-at-a-time studies in Figure 2.
Model parameters are quantitatively determined via experimental studies. The determination based on experimental data is afflicted with errors, for example, resulting from handling, repetitions, and measurement inaccuracies. Therefore, a model parameter determination concept must be implemented to evaluate the precision of the model (third decision criterion). The modelling approach for fed-batch and perfusion cultivation processes is summarized in Figure 4.
The main equations are represented by a Monod-type kinetic, considering the time-dependent variation of substrate (e.g., glucose, glutamine), metabolite (e.g., lactate, ammonium), cell, and product concentration. The correlation between input (e.g., substrate concentration) and output (e.g., cell concentration) variables can be macroscopically determined using empirical observations, such as yield coefficients, which are strongly dependent on the cell line and growth phase [14].
In terms of fluid dynamics (red-marked parameters), determination of oxygen transfer rates according to the unsteady-state (dynamic) technique, mixing times with conductivity measurements, and residence times with tracer experiments result in a characterization of the equipment.
Cultivations as well as the analysis of substrate, metabolite, cell, and product concentration need to be performed to determine kinetic (green-marked, for example, maximum growth rate) and equilibrium (blue-marked, for example, yield coefficients) parameters. The substrate saturation constants (or substrate affinity constants) equal concentrations that support a half-maximum growth rate.
Analogous experiments can be used for validation. Furthermore, cultivations assisted by an online process model increase the gain in process information by integrating process data (e.g., turbidity) into the macroscopic kinetic model to extract information on process variables (e.g., glucose and lactate concentration) [11]. Efforts for the generation of this data are commonly 2 to 3 weeks for two to three simultaneously run cultivations in 1 to 2 L and their respective analysis.
The impact of experimental model parameter determination error on the process model, based on error propagation of experimental errors, must be assessed to evaluate model precision. These errors define maximal and minimal values for each model parameter. This variance can be included in a Monte Carlo simulation to determine the effect of model parameter errors on the process model. In this case, 100 simulations were conducted, in which each parameter was varied with an equal (uniform) distribution within its experimental error. An example of the distribution of model parameters is given in Figure 5. As can be seen, the equally distributed process parameters result in maximum and minimum values for each considered parameter. The arithmetic mean of each parameter is depicted as a red horizontal line and corresponds to the model parameter value in Table 2 (e.g., µmax = 0.029 h⁻¹). This analysis shows the influence of model parameter errors and their interactions among each other on the macroscopic kinetic model. The Monte Carlo simulations resulted in a range of minimal and maximal values of each process variable, yielding enveloped curves, represented by grey lines (standard deviation of simulation results) in Figure 6. For better clarity and visualization, the enveloping curves for glucose after 72 h show the starting concentrations before the respective feed addition; the detailed course of glucose between feed additions can be seen in Figure 2. In addition to the enveloped curves, the model is able to describe variations of process parameters that occur with a lower probability (simulation result larger or smaller than the standard deviation of the mean simulation results), as depicted by the dotted lines. As can be seen, the variation of model parameters based on respective parameter errors leads to a variation in each process value. Higher maximum viable cell concentrations result in a higher substrate uptake and consumption, leading to a faster glucose and glutamine depletion. In addition, lactate, ammonium, and monoclonal antibody concentrations increase with higher viable cell concentrations. This analysis clearly shows the importance of the aforementioned precise model parameter determination. In particular, µmax seems to be the key model parameter in this macroscopic kinetic model, as variations (±10% in Figure 2, as well as Monte Carlo based variations in Figure 6) result in a significant influence on the process model, as already shown by the DoE-based Pareto diagrams in Figure 3. Sensor techniques such as turbidity (limited to cellular viability) and Raman spectroscopy (limited to a precise chemometric model) measuring the viable cell concentration online during mammalian cell cultivation support a precise determination of µmax and, therefore, increase the precision of the process model (third decision criterion).
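The Monte Carlo procedure described above can be sketched as follows. The example draws 100 parameter sets from uniform distributions around nominal values and computes a mean ± standard deviation envelope over time. The logistic growth curve is only a placeholder for the macroscopic kinetic model, and the parameter `xv_cap` as well as the relative errors are hypothetical; only µmax = 0.029 h⁻¹ and the number of runs are taken from the text.

```python
import math
import random
import statistics


def sample_parameters(nominal, rel_error, rng):
    """Draw each parameter uniformly within nominal * (1 +/- rel_error)."""
    return {
        name: rng.uniform(value * (1 - rel_error[name]), value * (1 + rel_error[name]))
        for name, value in nominal.items()
    }


def toy_viable_cells(params, t_h, xv0=0.3):
    """Placeholder logistic growth curve (NOT the paper's model)."""
    cap = params["xv_cap"]
    return cap / (1.0 + (cap / xv0 - 1.0) * math.exp(-params["mu_max"] * t_h))


def monte_carlo_envelope(nominal, rel_error, times_h, n_runs=100, seed=0):
    rng = random.Random(seed)  # fixed seed for reproducibility
    samples = [sample_parameters(nominal, rel_error, rng) for _ in range(n_runs)]
    envelope = []
    for t in times_h:
        values = [toy_viable_cells(p, t) for p in samples]
        mean = statistics.mean(values)
        sd = statistics.stdev(values)
        envelope.append((t, mean - sd, mean, mean + sd))
    return samples, envelope


nominal = {"mu_max": 0.029, "xv_cap": 20.0}    # xv_cap is hypothetical
rel_error = {"mu_max": 0.10, "xv_cap": 0.05}   # assumed relative errors
samples, envelope = monte_carlo_envelope(nominal, rel_error, [0, 72, 144, 216, 288])
for t, lower, mean, upper in envelope:
    print(t, round(lower, 2), round(mean, 2), round(upper, 2))
```

The stored `samples` and the corresponding simulation outputs form the kind of data set that can subsequently be evaluated by regression methods.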
Partial least squares (PLS) regressions can be conducted to statistically evaluate the correlation between predictors (i.e., process parameters such as µmax) and responses (i.e., process variables such as XV,max). For this, the NIPALS (nonlinear iterative partial least squares) algorithm was used for the computation of factors. The data basis for this analysis is the results from the 100 Monte Carlo simulations. Each varied parameter set resulted in a specific outcome of process variables (e.g., maximum viable cell concentration, XV), as seen in Table 3. In other words, a variation of the predictor possibly leads to a positive variation of the response. Opposite responses and predictors are negatively correlated, and 90° relationships do not induce a correlation (independent variation permitted), as they are correlated with different principal components. Higher orders of principal components would possibly be able to describe more variance, but are more affected by noise (i.e., PLS model is too detailed). Together, Table 3 and Figure 7 show the general principle of the implementation of PLS models to statistically evaluate Monte Carlo simulation results, as seen in Figure 8.
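The NIPALS computation of PLS factors can be illustrated for the simplest case of a single response (PLS1), where no inner iteration is required. The following pure-Python sketch is a minimal illustration under that assumption, not the software used in the study; the data are invented, mean-centred values, and the two predictor columns merely stand in for parameters such as µmax and a second, hypothetical parameter.

```python
def pls1_nipals(X, y, n_components):
    """PLS1 via NIPALS on mean-centred data.

    X: list of rows (predictors), y: list (single response).
    Returns weight vectors, y-loadings, and the residual of y.
    """
    X = [row[:] for row in X]
    y = y[:]
    n, m = len(X), len(X[0])
    weights, y_loadings = [], []
    for _ in range(n_components):
        # Weight vector: direction in X with maximal covariance to y.
        w = [sum(X[i][j] * y[i] for i in range(n)) for j in range(m)]
        norm = sum(v * v for v in w) ** 0.5
        if norm < 1e-12:
            break  # nothing left to explain
        w = [v / norm for v in w]
        t = [sum(X[i][j] * w[j] for j in range(m)) for i in range(n)]  # scores
        tt = sum(v * v for v in t)
        p = [sum(X[i][j] * t[i] for i in range(n)) / tt for j in range(m)]  # X loadings
        q = sum(y[i] * t[i] for i in range(n)) / tt                          # y loading
        for i in range(n):  # deflate X and y by the extracted component
            for j in range(m):
                X[i][j] -= t[i] * p[j]
            y[i] -= q * t[i]
        weights.append(w)
        y_loadings.append(q)
    return weights, y_loadings, y


# Invented, mean-centred data: the response rises with column 0 and falls with column 1.
X_data = [[-1.0, -1.0], [-1.0, 1.0], [1.0, -1.0], [1.0, 1.0]]
y_data = [-1.0, -3.0, 3.0, 1.0]
weights, y_loadings, residual = pls1_nipals(X_data, y_data, n_components=1)
print(weights[0], y_loadings[0])
```

The sign of each weight indicates whether a predictor is positively or negatively correlated with the response along that component, which is the information read from the loading plot.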
Processes 2019, 7, x FOR PEER REVIEW 14 of 19
Figure 8 depicts the PLS regression of the influence of model parameter variation on the maximum viable cell, glucose, and antibody concentration. As can be seen, the maximum growth rate µmax is strongly positively correlated with the maximum viable cell concentration XV,max and negatively correlated with GLCmax (maximum glucose concentration). As already shown by one-parameter-at-a-time and DoE studies (Figures 2 and 3), the higher µmax, the higher XV,max.
Additionally, Figure 8 depicts the influence of model parameter variation on the process variable GLCmax (maximum glucose concentration). The maximum growth rate µmax is negatively correlated with GLCmax, as a higher growth rate results in an increased substrate uptake and consumption, as already seen in Figure 2.
The positive correlation between µmax and the maximum viable cell concentration (XV,max) based on PLS analysis is shown in the score plot (Figure 8 left). The score plot identifies patterns in the samples, for example, by grouping responses and labeling predictors. The higher µmax, the higher the maximum viable cell concentration, yielding higher antibody concentrations, as seen in Figure 2 and Equations (1) and (9). Interestingly, the cell specific antibody production rate QmAb can be independently varied. This may be the result of low variations of this process model parameter.
In conclusion, as can be seen during one-parameter-at-a-time, DoE, and PLS studies, the maximum growth rate µmax may be the key parameter of this macroscopic kinetic model based on Monod equations. A variation of µmax exhibits a significant impact on growth, substrate consumption, and metabolite and product formation. This is mainly because of the dependency of each process variable on the viable cell concentration XV, as seen in Equations (4) to (9). This dependency can also be seen during experimental cultivations. The strong dependency on cellular growth rate is not surprising, but it clearly shows the importance of a valid model parameter determination concept. The experimental determination of the maximum growth rate µmax can be conducted by offline samples or, more preferably, by in situ probes such as turbidity or Raman spectroscopy [11], where data density, gain in information, and thus model accuracy and precision are increased.

Conclusions
The presented general valid model validation workflow is based on four criteria (verification, accuracy, precision, equality). On the basis of a literature review, a macroscopic model for the simulation of mammalian cell culture could be developed. The macroscopic model consists of Monod kinetic approaches for the simulation of time-dependent variations of substrate, metabolite, cellular, and product variables. A coefficient of determination of at least 0.97 (glucose concentration) could be achieved (verification). The accuracy was addressed by one-parameter-at-a-time and design of experiments studies in order to develop a Pareto plot to statistically determine significant parameters (accuracy). Therefore, a valid model parameter determination concept was generated to determine the effect of parameter determination error on the process model based on Monte Carlo simulations by equally distributing parameter determination errors (precision). PLS regressions were conducted to statistically evaluate the correlations between model parameters and variables, as well as correlations among model parameters themselves. PAT approaches to demonstrate the usability of this kinetic model were already applied (equality) [11]. In conclusion, following the model validation workflow, process model understanding was increased by identifying key parameters (Figure 3), the impact of experimental parameter determination (Figure 6), as well as the correlation of model parameters between themselves and model variables (Figure 8).

Figure 1. General valid model validation workflow [13]. DoE, design of experiments; MC, Monte Carlo.

Figure 2. Correlation (left) between experimental and model-derived viable cell (R² ≥ 0.98), glucose (R² ≥ 0.97), and antibody concentration (R² ≥ 0.99) with a maximum growth rate of 0.029 h⁻¹. One-parameter-at-a-time studies (right) of the macroscopic model, exemplified by varying the maximum growth rate by ±10% of the experimentally determined value.

Figure 3. Pareto diagram of standardized effects for the determination of significant model parameters on the response of maximum viable cell concentration XV,max (left) and maximum antibody concentration mAbmax (right). The vertical red line depicts the reference line for statistical significance, which is 1.98 for both model variables. The level of significance α was 0.05 for both model variables. Numbers indicate the model parameter used in the respective phase: 1, lag; 2, exponential; 3, stationary; 4, decline.

Figure 4. Schematic workflow overview for the model parameter determination concept in upstream process modelling. A detailed description can be found in the literature [2,11,13,14,45,46]. Red-marked parameters (volumetric mass transfer coefficient kLa, mixing time θ95, and residence time) characterize the equipment. Kinetic (green-marked) and equilibrium (blue-marked) parameters can be obtained from cultivations. Similar experiments can be used for validation. Generating these data commonly takes 2 to 3 weeks for two to three simultaneously run cultivations at the 1 to 2 L scale, including their respective analysis. GLC, glucose; GLN, glutamine; LAC, lactate; AMM, ammonium; PAT, process analytical technology.

Figure 5. Example of the distribution of model parameters resulting from 100 Monte Carlo simulations, in which each parameter was varied with an equal (uniform) distribution based on its experimental error. The horizontal line depicts the arithmetic mean of each parameter.

Processes 2019, 7, 19

Figure 6. Envelope curves of each process variable resulting from 100 model parameter error-based Monte Carlo simulations. XV, viable cell concentration; mAb, monoclonal antibody. Black square dots represent experimental data. Error bars comprise the standard deviation of three cultivations. For better clarity and visualization, the envelope curves for glucose after 72 h show the starting concentrations before the respective feed addition. The detailed course of glucose between feed additions can be seen in Figure 2.

Figure 7 schematically describes the analysis of the correlation loadings based on the PLS regression. If responses and predictors are positively correlated with a specific factor (e.g., factor 1 or principal component 1, PC1), they are related to each other. In other words, a variation of the predictor possibly leads to a positive variation of the response. Responses and predictors located opposite each other are negatively correlated, and 90° relationships do not indicate a correlation (independent variation permitted), as they are correlated with different principal components. Higher-order principal components could possibly describe more variance, but are more affected by noise (i.e., the PLS model is too detailed). Together, Table 3 and Figure 7 show the general principle of implementing PLS models to statistically evaluate Monte Carlo simulation results, as seen in Figure 8.
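
How correlation loadings on the first factor are read can be illustrated with a minimal sketch. It assumes a single response and computes only the first PLS component (PLS1, with the weight vector proportional to the covariance between the standardized predictors and the response); it does not reproduce the software or settings used in the study. The data are synthetic, and the predictor names `k_d` and `k_s`, all value ranges, and all effect sizes are hypothetical.

```python
import random


def standardize(values):
    """Centre a column and scale it to unit sample standard deviation."""
    n = len(values)
    mean = sum(values) / n
    sd = (sum((v - mean) ** 2 for v in values) / (n - 1)) ** 0.5
    return [(v - mean) / sd for v in values]


def pearson(a, b):
    """Pearson correlation coefficient of two equally long sequences."""
    za, zb = standardize(a), standardize(b)
    return sum(u * v for u, v in zip(za, zb)) / (len(a) - 1)


def first_factor_loadings(predictors, response):
    """Correlation loadings of predictors and response on PLS factor 1."""
    xs = {name: standardize(col) for name, col in predictors.items()}
    y = standardize(response)
    # PLS1 weights: proportional to the covariance of each predictor with y.
    w = {name: sum(u * v for u, v in zip(col, y)) for name, col in xs.items()}
    norm = sum(v * v for v in w.values()) ** 0.5
    w = {name: v / norm for name, v in w.items()}
    # Scores of factor 1: weighted sum of the standardized predictors.
    scores = [sum(w[name] * xs[name][i] for name in xs) for i in range(len(y))]
    loadings = {name: pearson(col, scores) for name, col in predictors.items()}
    loadings["response"] = pearson(response, scores)
    return loadings


# Synthetic "Monte Carlo" table: 100 parameter sets and one response.
rng = random.Random(0)
mu_max = [rng.uniform(0.026, 0.032) for _ in range(100)]   # raises response
k_d = [rng.uniform(0.009, 0.011) for _ in range(100)]      # lowers response
k_s = [rng.uniform(0.45, 0.55) for _ in range(100)]        # no effect
x_max = [10.0 + 400.0 * (m - 0.029) - 1000.0 * (d - 0.010) + rng.gauss(0.0, 0.1)
         for m, d in zip(mu_max, k_d)]

loadings = first_factor_loadings(
    {"mu_max": mu_max, "k_d": k_d, "k_s": k_s}, x_max)
for name, value in loadings.items():
    print(f"{name:>8s}: {value:+.2f}")
```

In this synthetic case the predictor that raises the response lies on the same side of factor 1 as the response, the predictor that lowers it lies on the opposite side, and the predictor without influence is expected to sit close to zero.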

Figure 7. Schematic presentation of a loadings plot resulting from a partial least squares (PLS) regression.

Figure 8. PLS regression loadings (right) for the maximum viable cell concentration XV,max, maximum glucose concentration (GLCmax), and maximum antibody concentration (mAbmax) as responses, depending on model parameters (predictors), as well as PLS regression scores (left) grouped according to the correlation between maximum growth rate µmax and XV.

Table 1. General overview of working groups focusing on upstream process modelling. CHO, Chinese hamster ovary; MFA, metabolic flux analysis; TCA, tricarboxylic acid cycle; EFM, elementary flux mode; PFA, principal factor analysis; FBA, flux balance analysis; ANN, artificial neural networks.

Table 3. Example of the parameter set (XV,min) and target variable (XV,max) resulting from 100 Monte Carlo simulations.