Distinct and Quantitative Validation Method for Predictive Process Modelling in Preparative Chromatography of Synthetic and Bio-Based Feed Mixtures Following a Quality-by-Design (QbD) Approach

Process development, especially in regulated industries where quality-by-design approaches have become a prerequisite, is cost intensive and time consuming. A main factor is the large number of experiments needed. Process modelling can reduce this number significantly by replacing experiments with simulations. However, this requires a validated model. In this paper, a process and model development workflow is presented that focuses on implementing, parameterizing, and validating the model in four steps. The presented methods are designed to generate the maximum information and process knowledge needed for successful process development. This includes design of experiments and statistical evaluations showing process robustness, the sensitivity of target values to process parameters, and correlations between process and target values. Two case studies are presented: an ion exchange capture step for monoclonal antibodies, focusing on high accuracy and low feed consumption, and a small-molecule separation, focusing on rapid process development and speed of parameter determination.

Record Type: Published Article
Submitted To: LAPSE (Living Archive for Process Systems Engineering)
Citation (overall record, always the latest version): LAPSE:2019.1169
Citation (this specific file, latest version): LAPSE:2019.1169-1
Citation (this specific file, this version): LAPSE:2019.1169-1v1
DOI of Published Version: https://doi.org/10.3390/pr7090580
License: Creative Commons Attribution 4.0 International (CC BY 4.0)


Introduction
Chromatography is a widely used unit operation in chemical and pharmaceutical engineering, with applications ranging from low-cost bulk chemicals to high potential pharmaceuticals. Regardless of the area of application, process development always targets specific product quality and process performance attributes. These are normally not achieved by a single chromatography step. Especially in the biopharmaceutical industry, where very high purity is mandatory, several chromatographic steps in combination with filtration, extraction, or other purification steps are often needed. Finding the ideal purification sequence and subsequently designing each step in detail is therefore often very time consuming and labor and cost intensive. Here, process simulation can support both rapid process development and detail engineering. Especially in the development of multicolumn chromatography steps, the implementation of process modelling has proved to be beneficial [1][2][3][4][5][6].
Implemented in the early stage of process development, process models are used to quickly evaluate different combinations of unit operations. When the best combination is found, the same models are enhanced by including more effects (for example, pore diffusion), by refining the model parameters, or both. Thus, previous work is easily carried on through the whole project.
Besides fast progress in process development, product quality and process robustness are of growing importance. Thus, approaches like quality-by-design (QbD) are demanded by authorities [7][8][9][10][11][12][13][14][15][16]. A possible approach for fast process development under quality-by-design is shown in the process development workflow in Figure 1. This paper targets the key steps marked with the red square.
One of the main goals in process development in general, and in QbD in particular, is to identify the sensitivity of the process to its process parameters. This helps to evaluate the stability of each operation point within the design space and to create control strategies, both leading to a more robust process. Although this evaluation could be done experimentally, the number of experiments required far exceeds the effort of modelling and model parameter determination.
The presented approach combines model validation with process development by proving model accuracy and precision through comparing multiple well-planned simulations with few experiments and deriving correlation plots from these data sets. The correlation plots can afterwards be used to establish control strategies with respect to the influence of each parameter on the stability of the operation point.

Figure 1. Quality-by-design (QbD) process development workflow with process modelling as its central part [17].
Before starting, it is important to define the quality target product profile and critical quality attributes. After that, a risk assessment is required. This might be done, for example, with the aid of Ishikawa diagrams and failure mode and effect analysis, as presented in Figure 2 [7,17]. Although this step is part of quality-by-design rather than a prerequisite for process simulation, the conclusions drawn here directly define the modelling task, the parameters that need to be covered by the model, and the level of detail needed. If, for example, a specific flow regime within a unit (e.g., a stirred tank) is crucial, computational fluid dynamics is needed rather than an equilibrium stage model.
For modelling chromatography, all types of model approaches have been presented [18,19], from stage models [18,20,21] to mechanistic models [22][23][24][25][26][27]. Among the most common ones is the general rate model and its derivatives and simplifications [28][29][30][31], which is explained in the chromatography modelling section. Presented here are two case studies of industrial relevance. The first case study is the separation of the monoclonal antibody IgG from cell harvest on an ion exchange column as a capture step. The focus is set on accuracy; thus, the general rate model without simplifications is used. Furthermore, the feed amount needed for parameter determination and model validation is to be kept to a minimum. Thus, isotherms are measured in relatively small-scale shaking flask experiments, and well-known correlations for model parameters are used wherever possible, as described in the model parameter determination section. The derived model and parameter set are afterwards suitable for detailed engineering of both batch and continuous chromatography.
The second case study shows a slightly faster approach for rapid process evaluation and design, based on a relatively simple separation of cyclopentanone and cycloheptanone on normal phase media [32]. In this case, feed is less expensive and available in larger amounts, so the focus is set more on speed of parameter determination. The goal is to implement a model capable of process screening for multicolumn chromatography like simulated moving bed. Since the molecules are much smaller than in protein chromatography, pore diffusion is less important. Thus, the lumped pore diffusion model is used.
In both cases, process and model development follow the workflow presented in Figure 3.

Step 1
As depicted in Figure 3, the first step of model-based process development is model derivation and implementation. Correct model implementation can be checked with different approaches. Since the general rate model has been published widely in the literature, one way is to reproduce these published examples.

Step 2
The second decision criterion is based on sensitivity. For process development, and equally for establishing a parameter determination concept, it is of great importance to know which parameters affect the outcome significantly and which do not.
For this, one has to define the possible range in which the unit might be operated (e.g., the QbD design space). Afterwards, single parameter studies are conducted. All parameters are kept at their average value except for one parameter, which is changed from its minimum to its maximum value in sensible or relevant steps. These simulations show the variance in results caused by this parameter on the one hand and serve as a plausibility check on the other hand. Since for most parameters the outcome of a changed input value should be clear, at least as a general tendency, a very different outcome points to a programming mistake.
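A single parameter study of this kind can be sketched as follows. This is only a minimal illustration: `toy_model`, the parameter names, and the ranges are hypothetical placeholders and stand in for the actual chromatography model and design space.

```python
def toy_model(params):
    """Hypothetical stand-in for a process simulation; returns one target value."""
    # Illustrative only: the target falls with flow rate and rises with column length.
    return 100.0 - 8.0 * params["flow_rate"] + 0.5 * params["column_length"]


def single_parameter_study(model, ranges, steps=5):
    """Vary one parameter at a time from min to max; keep the others at their average."""
    center = {name: 0.5 * (lo + hi) for name, (lo, hi) in ranges.items()}
    spans = {}
    for name, (lo, hi) in ranges.items():
        outputs = []
        for k in range(steps):
            point = dict(center)
            point[name] = lo + (hi - lo) * k / (steps - 1)
            outputs.append(model(point))
        # Spread of the target value caused by this parameter alone.
        spans[name] = max(outputs) - min(outputs)
    return spans


# Hypothetical (min, max) ranges for two parameters.
ranges = {"flow_rate": (0.5, 1.5), "column_length": (10.0, 30.0)}
spans = single_parameter_study(toy_model, ranges)
for name, span in sorted(spans.items(), key=lambda item: -item[1]):
    print(f"{name}: target value spans {span:.2f}")
```

A span whose size or direction contradicts the expected tendency would be the kind of result that hints at an implementation error.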
In many cases, one parameter is dominant in a specific region of the design space but might be of less importance compared to other parameters in a different region of the design space. Thus, single parameter studies with all other parameters held constant might be misleading. Hence, multi-parameter studies should be conducted. Several approaches are possible, like Monte Carlo approaches or global sensitivity analysis [34][35][36]. For complex problems, a combination of several methods might be beneficial. To reduce the number of simulations and to ease the interpretation, a design-of-experiments (DoEs) plan is used and the results are fed back into a statistical analysis tool.
A full factorial DoEs design would require as many different simulations as two to the power of the number of parameters; in case study one, that would be $2^{26}$ simulations (roughly 67 million). Although computational power is no longer the biggest problem, this is obviously too many simulation runs. The design of the DoEs plan again depends on the application. DoEs usually starts with screening plans like Plackett-Burman and can be enhanced until the predicted p-value of the response plot is sufficiently low. Overall sensitivity analysis could help to reduce the dimensions of the experimental space to be explored with DoE.
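The growth of a two-level full factorial plan, and the saving from a fractional plan, can be illustrated with a short sketch. The half fraction below uses a simple textbook generator (the last factor is the product of all others) and is not a Plackett-Burman plan or the design used in this work.

```python
from itertools import product


def full_factorial(n_factors):
    """All two-level combinations, coded as -1 (minimum) and +1 (maximum)."""
    return list(product((-1, 1), repeat=n_factors))


def half_fraction(n_factors):
    """Two-level half fraction: the last factor is the product of all others."""
    runs = []
    for base in product((-1, 1), repeat=n_factors - 1):
        last = 1
        for level in base:
            last *= level
        runs.append(base + (last,))
    return runs


full = full_factorial(3)
half = half_fraction(3)
print(len(full), len(half))  # 8 runs versus 4 runs for three factors
print(2 ** 26)               # full factorial size for 26 two-level factors
```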
The input for DoEs are the model and process parameters with a reasonable range for the given problem. These arise from prior knowledge (e.g., early process development), literature or, at best, previous model-based process development. To compare the sensitivity results with the expectation it is best to consider the expected influence of each parameter on the target values. The target values need to be defined as well. In this case study, only purity, as a classical quality attribute, and yield, as a common performance parameter, are used. In most cases, there will be more than two target values. Within a quality-by-design approach, other quality attributes should be added, like monomer-to-dimer ratio, host cell protein concentration, and so on. For process development, extension to more performance attributes like productivity or buffer consumption might be interesting.
When the DoEs plan is set up, the simulations are carried out and the results for the target values are fed back into the statistical analysis tool. Pareto charts of standardized effects are then generated, showing which parameters significantly influence the target value. These plots are only valid for the given design space and process development task. Pareto plots are a good tool to visualize the outcome quickly, leading to better process understanding on the one hand and guiding the subsequent steps on the other hand.
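The ranking behind such a Pareto chart can be sketched for a two-level plan as follows. This simplified example ranks plain main effects (mean response at the high level minus mean response at the low level); a statistics package would additionally divide by a standard error to obtain standardized effects. Factor names and the response function are made up for illustration.

```python
from itertools import product


def main_effects(design, responses):
    """Main effect of each factor: mean response at +1 minus mean response at -1."""
    n_factors = len(design[0])
    effects = []
    for j in range(n_factors):
        high = [y for run, y in zip(design, responses) if run[j] == 1]
        low = [y for run, y in zip(design, responses) if run[j] == -1]
        effects.append(sum(high) / len(high) - sum(low) / len(low))
    return effects


# Hypothetical factor names and a made-up response, for illustration only.
names = ["column_length", "particle_diameter", "flow_rate"]
design = list(product((-1, 1), repeat=3))
responses = [90.0 + 3.0 * a - 2.0 * b + 0.1 * c for a, b, c in design]

effects = main_effects(design, responses)
ranked = sorted(zip(names, effects), key=lambda item: -abs(item[1]))
for name, effect in ranked:
    print(f"{name}: {effect:+.2f}")
```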

Step 3
The next step in the process and model development workflow is accurate model parameter determination and comparison of modeling results with experimental results for model validation (see Figure 3). Parameters shown to be less significant in Step 2 do not have to be determined with the highest accuracy; the focus should be on the significant ones. If possible, effects should be separated [37]. Hydrodynamic parameters like axial dispersion should be measured without the influence of mass transfer or adsorption equilibrium. The latter should be determined without the influence of fluid dynamics. When each parameter is measured individually, it keeps its validity when other input parameters are changed. The axial dispersion coefficient, for example, is the same for a once-packed column, no matter which substance is separated. Model parameter determination methods are described in the Model Parameter Determination section.
For model validation, simulation and experimental results are compared. Again, this can be done separately for each effect. It is often appropriate to validate the hydrodynamic behavior of the model by neglecting mass transfer ($k_f = 0$) and comparing with the tracer experiments. In these case studies, chromatographic runs are compared with the corresponding simulations.
Both results are subject to a certain error. In experimental chromatograms, small variations in peak retention time and peak width occur due to pump inaccuracy, minor changes in feed or eluent composition, and so on. Thus, experimental chromatographic validation runs are performed several times at the same operation point, resulting in a mean value chromatogram with an envelope curve.
Simulated chromatograms, originating from the same set of parameters, should not vary between several runs as long as all parameters are kept the same. However, the model parameter determination experiments are subject to a certain error. These errors are calculated and used to define minimum and maximum values for model parameters. Simulations are carried out with all parameters at average value, a single parameter set to minimum or maximum, all parameters at minimum or maximum and randomized variations of parameters within the min/max range. This randomization is done with Monte Carlo simulations. Again, a mean value chromatogram with an envelope curve is generated, which should lie completely within the experimental envelope curve. This approach is done at least twice to prove both accuracy and precision.
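The envelope comparison can be sketched with a toy example. The Gaussian peak below is only a hypothetical stand-in for a simulated chromatogram, and the parameter bounds as well as the experimental envelope (a nominal curve plus and minus a fixed tolerance) are invented for illustration.

```python
import math
import random


def gaussian_peak(t, retention_time, sigma):
    """Toy chromatogram: a single Gaussian peak (not the general rate model)."""
    return math.exp(-0.5 * ((t - retention_time) / sigma) ** 2)


def monte_carlo_envelope(times, bounds, n_runs=200, seed=1):
    """Simulate with all parameters at minimum, all at maximum, and with random
    values inside the bounds; return pointwise mean, minimum, and maximum."""
    rng = random.Random(seed)
    samples = [
        {k: lo for k, (lo, hi) in bounds.items()},  # all parameters at minimum
        {k: hi for k, (lo, hi) in bounds.items()},  # all parameters at maximum
    ]
    samples += [{k: rng.uniform(lo, hi) for k, (lo, hi) in bounds.items()}
                for _ in range(n_runs)]
    runs = [[gaussian_peak(t, **p) for t in times] for p in samples]
    columns = list(zip(*runs))
    mean = [sum(col) / len(col) for col in columns]
    return mean, [min(col) for col in columns], [max(col) for col in columns]


def lies_within(inner_lo, inner_hi, outer_lo, outer_hi):
    """True if the inner envelope is enclosed by the outer one at every time point."""
    return all(ol <= il and ih <= oh
               for il, ih, ol, oh in zip(inner_lo, inner_hi, outer_lo, outer_hi))


times = [0.1 * k for k in range(101)]  # 0 to 10 min
# Hypothetical min/max bounds derived from parameter determination errors.
bounds = {"retention_time": (4.9, 5.1), "sigma": (0.48, 0.52)}
sim_mean, sim_lo, sim_hi = monte_carlo_envelope(times, bounds)

# Hypothetical experimental envelope: nominal curve +/- a fixed tolerance.
nominal = [gaussian_peak(t, 5.0, 0.5) for t in times]
exp_lo = [s - 0.25 for s in nominal]
exp_hi = [s + 0.25 for s in nominal]
print(lies_within(sim_lo, sim_hi, exp_lo, exp_hi))
```

In practice, the outer envelope would come from the replicate validation runs rather than from a fixed tolerance.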

Step 4
Step 4 is again more focused on process development. Here, simulation studies similar to the ones in step 3 are run. These are set to the optimized operation point. If one of the simulation studies from step 3 already used this point, no further simulations are needed.
The simulation studies are analyzed for the quality and performance attributes, in this case purity and yield. This information, as well as the input parameters and boundaries, is fed into a statistical analysis tool. Partial least squares (PLS) regression then helps to identify the correlations between input parameters and target values. This can be visualized, for example, with correlation loading plots. Although PLS regression does not find causal relationships, it shows correlating effects. This helps to find critical process parameters and to assess the stability of the operation point.
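The idea behind correlation loadings can be sketched with a minimal first-component PLS computation for a single response. This is a simplified illustration under stated assumptions (one component, one target value, hypothetical input names and data); it is not the algorithm or the output of the statistical software used here.

```python
import math
from itertools import product


def center(column):
    mean = sum(column) / len(column)
    return [value - mean for value in column]


def correlation(a, b):
    """Pearson correlation of two equally long sequences."""
    a_c, b_c = center(a), center(b)
    covariance = sum(x * y for x, y in zip(a_c, b_c))
    return covariance / math.sqrt(sum(x * x for x in a_c) * sum(y * y for y in b_c))


def first_pls_component(columns, response):
    """First PLS component for one response: weights proportional to X^T y
    on centered data, scores t = X w."""
    x_c = [center(col) for col in columns]
    y_c = center(response)
    raw = [sum(x * y for x, y in zip(col, y_c)) for col in x_c]
    norm = math.sqrt(sum(r * r for r in raw))
    weights = [r / norm for r in raw]
    scores = [sum(w * col[i] for w, col in zip(weights, x_c))
              for i in range(len(response))]
    return weights, scores


# Hypothetical inputs (coded -1/+1) and a made-up target value.
names = ["input_a", "input_b", "input_c"]
runs = list(product((-1, 1), repeat=3))
columns = [list(col) for col in zip(*runs)]
target = [2.0 * a - 1.0 * b for a, b, c in runs]

weights, scores = first_pls_component(columns, target)
# Correlation loadings: correlation of each variable with the score vector.
loadings = {name: correlation(col, scores) for name, col in zip(names, columns)}
loadings["target"] = correlation(target, scores)
for name, value in loadings.items():
    print(f"{name}: {value:+.3f}")
```

Inputs whose loading lies close to that of the target correlate with it, while an input near zero (here `input_c`) shows no correlation in the studied range.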
After passing through all these steps, the process model should be verified and validated. Process development can now proceed with optimizing the working point with the help of the model and with focus on the relevant parameters, as learned from steps 2 and 4.

Modeling Chromatography: General Rate Model
In these case studies, the general rate model and the lumped pore diffusion model were used. The general rate model and its derivatives are widely used for modeling and simulation of chromatographic processes. It can be separated into three parts: the mass balance for the mobile phase, the mass balance for the stationary phase, and the description of the equilibrium. For derivation, assumptions, and further information see [1,18,30,31,39,40].

Mass balance of mobile phase:
The mass balance of the mobile phase consists of four terms, reading from left to right: storage, convective flow, axial dispersion, and mass transport [18]: with $u_{int}$ as interstitial velocity, $D_{ax}$ as axial dispersion coefficient, $\varepsilon_s$ as voidage, $d_p$ as particle diameter, and $k_{f,i}$ as film mass transport coefficient. The use of the film mass transport coefficient demands the consideration of pore diffusion in the mass balance of the stationary phase. However, film mass transport and pore diffusion can be combined, resulting in the lumped pore diffusion model [30]. Here, the film mass transport coefficient $k_{f,i}$ is replaced with an effective mass transport coefficient $k_{eff}$. This simplification is often applied in early process development to reduce model parameter determination efforts at the expense of model accuracy and process understanding. An even further simplification is the lumped kinetic model, which neglects intraparticle pores [30].
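For orientation, a commonly used textbook form of the mobile phase balance with the four terms named above (storage, convective flow, axial dispersion, mass transport) is given below. The axial coordinate $x$, the mobile phase concentration $c_i$, and the evaluation of the pore concentration $c_{p,i}$ at the particle radius $R_p$ are notation assumptions; the published equation may differ in detail.

```latex
\frac{\partial c_i}{\partial t}
  = -\,u_{int}\,\frac{\partial c_i}{\partial x}
    + D_{ax}\,\frac{\partial^2 c_i}{\partial x^2}
    - \frac{1-\varepsilon_s}{\varepsilon_s}\,\frac{6}{d_p}\,k_{f,i}
      \left( c_i - c_{p,i}\big|_{r=R_p} \right)
```

In the lumped pore diffusion variant, $k_{f,i}$ in the last term would be replaced by $k_{eff}$ and $c_{p,i}$ would be a particle-averaged pore concentration.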

Mass balance of stationary phase:
The mass balance of the stationary phase is mostly dominated by pore diffusion $D_{p,i}$ and surface diffusion $D_{S,i}$ [39,41]: with $c_{p,i}$ as the concentration of component $i$ within the pores, and $q_i$ as the surface loading of component $i$. For larger molecules, surface diffusion is often neglected or combined with pore diffusion into one effective diffusion coefficient $D_{eff}$ [41,42].
Combining Equations (2) and (3) results in: For the lumped pore diffusion model, the mass balance for the stationary phase reads [30]:

Adsorption equilibrium:
There is a large number of approaches to describe the adsorption equilibrium, mostly depending on the adsorption mechanism and mode of operation [43][44][45][46][47][48][49][50][51][52][53][54][55]. For this simulation study, competitive Langmuir isotherms were used [42,56]: Here, $K_i$ is the Langmuir coefficient and $q_{max,i}$ the maximum loading capacity of component $i$. Different notations are found in the literature (e.g., with the use of the Henry coefficient $H_i$). All notations can be converted into one another with:
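As a minimal sketch, the standard competitive Langmuir isotherm, $q_i = q_{max,i} K_i c_i / (1 + \sum_j K_j c_j)$, and the usual conversion to the Henry coefficient, $H_i = q_{max,i} K_i$, can be evaluated as follows. The numerical values are hypothetical, and whether the bulk or the pore concentration enters the isotherm depends on the model and is left open here.

```python
def competitive_langmuir(concentrations, q_max, K):
    """Loading q_i of each component:
    q_i = q_max_i * K_i * c_i / (1 + sum_j K_j * c_j)."""
    denominator = 1.0 + sum(k * c for k, c in zip(K, concentrations))
    return [qm * k * c / denominator
            for qm, k, c in zip(q_max, K, concentrations)]


def henry_coefficient(q_max_i, K_i):
    """Initial slope of the isotherm: H_i = q_max_i * K_i."""
    return q_max_i * K_i


# Hypothetical two-component values, for illustration only.
q_max = [100.0, 80.0]  # g/L
K = [0.05, 0.02]       # L/g
c = [10.0, 5.0]        # g/L

q = competitive_langmuir(c, q_max, K)
H = [henry_coefficient(qm, k) for qm, k in zip(q_max, K)]
print(q, H)
```

At low concentrations the denominator approaches one and the loading reduces to $q_i \approx H_i c_i$, which is the linear range of the isotherm.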

Feed Mixtures, Buffers, and Stationary Phases
Ion exchange chromatography was done with Fractogel® EMD SO3- (S) (Merck KGaA, Darmstadt, Germany) in prepacked 1 mL Atoll columns (5-50, Atoll GmbH, Weingarten, Germany). Bulk material for shaking flask experiments was obtained from Merck (Merck KGaA, Darmstadt, Germany). Buffers consisted of 20 mM sodium phosphate at pH 6.0 with ammonium sulfate as the modifier salt (1 M for elution buffer). All salts were obtained in analytical quality from Merck (Merck KGaA, Darmstadt, Germany). IgG feed broth was produced in-house with industrial Chinese Hamster Ovary (CHO) cell culture [57]. After harvest, the feed broth was diafiltrated either with loading buffer or with a mixture of loading and elution buffer to set the desired salt concentration for shaking flask experiments. For this purpose, a SARTOFLOW® Slice 200 Benchtop System with Sartocon® Slice 200 Hydrosart® 10 kDa membranes was used (Sartorius Stedim Biotech GmbH, Göttingen, Germany).

Instruments and Devices
Analytical chromatographic separations were carried out with a VWR-Hitachi LaChrom Elite® system (VWR International, Radnor, PA, USA) with a quaternary pump L-2130, L-2200 auto sampler, and L-2455 diode array detector (DAD). In addition, a pH/conductivity detector pH/C-900 (General Electric, Boston, MA, USA) was used.
Preparative runs were done with a LaPrep® system (VWR International, Radnor, PA, USA) consisting of two P110 pumps, one P314 UV detector, and a Smartline 3900 auto sampler from Knauer (Knauer Wissenschaftliche Geräte GmbH, Berlin, Germany). Peak fractionation was done with a Foxy Jr. sample collector (Teledyne Isco, Lincoln, NE, USA).

Analytics
IgG concentration was measured with Protein A chromatography and size exclusion chromatography. Protein A gives the total amount of IgG monomer and dimer as well as the total amount of side components. Size exclusion chromatography reveals the concentration of IgG monomer and dimer as well as different side components.
Protein A chromatography was done with PA ID Poros® Sensor Cartridges (2.1 × 30, Applied Biosystems, Waltham, MA, USA). Binding buffer was Dulbecco's PBS Buffer (Sigma-Aldrich, St. Louis, MO, USA) at pH 7.4. A pH shift to pH 2.6 was used for elution. Flow rate was 1.6 mL/min. Peak area at 280 nm UV absorption was used for calibration and concentration measurements.

Experimental Validation Runs
Experimental validation runs for case study one consisted of IgG broth separated on ion exchange columns. The feed broth came from the same harvest step but was split into four batches. Each batch was diafiltrated separately. A total of 99 µL was injected into the column. A gradient of 3 to 5 CV, from 100% loading buffer to 100% elution buffer, started directly after injection, followed by 3 CV regeneration and 3 CV re-equilibration. The flow was 0.5 mL/min. Five injections were done for each feed batch, resulting in 20 experiments in total. The column outlet was fractionated from 1 min before to 1 min after the main peak in 0.5 min intervals and analyzed with protein A and size exclusion chromatography.
For normal phase chromatography (case study 2), 100 µL feed containing between 31.25 and 62.5 g/L cyclopentanone and cycloheptanone, respectively, was injected. Flow rate was set to 12.5 mL/min as the mean value for most experiments. A flow rate of 15 mL/min was used for the validation run with higher flow rate. Eluent mixture (85%-vol. hexane, 15%-vol. ethyl acetate) and feed were prepared three times, forming one set or series of experiments, for each point of operation. Each series had 15 individual injections or runs. Fractionation was done in 0.25 mL intervals, from 0.5 min before the first peak to 0.5 min after the last peak, and the fractions were analyzed with analytical normal phase chromatography. In addition, the concentration of both components was measured inline [58].

Software Tools
DoEs plans and statistical evaluation resulting in Pareto plots were done with JMP (JMP Inc., SAS Institute, Cary, NC, USA).
PLS regression and correlation loading plots were done with Unscrambler (Camo Analytics, Oslo, Norway).

Model Parameter Determination
Although this is partially a part of the materials and methods section, parameter determination is of such great importance for modelling that it is given its own section. The following description is not exhaustive; further reading is, to some extent, provided in the subsections.

Voidage, Porosity, and Axial Dispersion Coefficient
The three parameters voidage $\varepsilon_s$, porosity $\varepsilon_{p,i}$, and axial dispersion coefficient $D_{ax}$ can be measured with the same set of experiments by measuring the retention times of non-binding components. For voidage and porosity, the retention time is fed into the following equation: with $\dot{V}$ as volumetric flow, $t_i$ as mean retention time, and $V_{column}$ as column volume. As known from size exclusion chromatography, molecules with different hydrodynamic radii have different residence times because they can penetrate the pore system to different degrees. Hence, to evaluate the porosity as a function of molecular weight, $\varepsilon_{p,i}$, different tracers are used. Very large molecules that cannot enter the pores give the voidage $\varepsilon_s$. This approach is known as inverse size exclusion [59,60].
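A minimal sketch of this evaluation is given below. It assumes the usual relation between accessible volume fraction and retention time, $\varepsilon = \dot{V} \cdot t_i / V_{column}$, neglects system dead volume, and uses the common convention $\varepsilon_{total} = \varepsilon_s + (1 - \varepsilon_s)\,\varepsilon_p$ to convert a total porosity into a particle porosity. The convention and all numbers are assumptions for illustration.

```python
def accessible_fraction(flow_rate, retention_time, column_volume):
    """Column volume fraction accessible to a non-binding tracer:
    epsilon = flow_rate * retention_time / column_volume."""
    return flow_rate * retention_time / column_volume


def particle_porosity(total_porosity, voidage):
    """Accessible pore fraction of the particles, assuming
    eps_total = eps_s + (1 - eps_s) * eps_p."""
    return (total_porosity - voidage) / (1.0 - voidage)


# Hypothetical tracer data for a 1 mL column at 0.5 mL/min.
flow_rate = 0.5       # mL/min
column_volume = 1.0   # mL
t_large_tracer = 0.76  # min, tracer excluded from the pores
t_small_tracer = 1.60  # min, tracer that enters the pores

voidage = accessible_fraction(flow_rate, t_large_tracer, column_volume)
total = accessible_fraction(flow_rate, t_small_tracer, column_volume)
eps_p = particle_porosity(total, voidage)
print(f"voidage {voidage:.2f}, total {total:.2f}, particle porosity {eps_p:.3f}")
```

Repeating the second step for tracers of different molecular weight gives the porosity as a function of molecular size.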
For these case studies, dextran standards from 1 to 670 kDa molecular weight (Fluka/Sigma-Aldrich, St. Louis, MO, USA) and pullulan standards from 342 to 708 kDa molecular weight (PSS-Polymer Standards Service GmbH, Mainz, Germany) were used for characterization of ion exchange media.
Normal phase media was only characterized with toluene since the test mixture contains small molecules only. To determine the axial dispersion coefficient, chromatograms resulting from the above experiments can be loaded into the simulations and the axial dispersion coefficient adapted until the simulations match the experiments. Again, for this parameter the molecules should stay out of the pores. Then, the mass transfer term in Equation (1) is negligible. The same chromatogram can be used to calculate the axial dispersion coefficient according to [61]: with $t$ as mean residence time, $\sigma^2$ as variance, $v$ as linear velocity, and $l$ as column length. The first two approaches were used in case study two.
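The moment-based evaluation can be sketched as follows. The exact expression from [61] is not reproduced in this text; the sketch assumes the widely used small-dispersion relation $\sigma^2 / t^2 = 2 D_{ax} / (v \cdot l)$, and the tracer peak and column data are synthetic.

```python
import math


def peak_moments(times, signal):
    """Mean residence time and variance of a tracer peak on a uniform time grid."""
    area = sum(signal)
    mean_time = sum(t * s for t, s in zip(times, signal)) / area
    variance = sum((t - mean_time) ** 2 * s for t, s in zip(times, signal)) / area
    return mean_time, variance


def axial_dispersion_from_moments(mean_time, variance, velocity, length):
    """D_ax from the assumed relation sigma^2 / t^2 = 2 * D_ax / (v * l)."""
    return variance * velocity * length / (2.0 * mean_time ** 2)


# Synthetic tracer peak: Gaussian with mean 2.0 min and sigma 0.1 min.
times = [0.01 * k for k in range(401)]
signal = [math.exp(-0.5 * ((t - 2.0) / 0.1) ** 2) for t in times]

mean_time, variance = peak_moments(times, signal)
# Hypothetical column: 5 cm long, linear velocity 2.5 cm/min.
d_ax = axial_dispersion_from_moments(mean_time, variance, velocity=2.5, length=5.0)
print(f"t = {mean_time:.3f} min, sigma^2 = {variance:.4f} min^2, D_ax = {d_ax:.5f} cm^2/min")
```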
Besides experimental parameter determination, a number of correlations exist. For protein chromatography, the correlation from Chung and Wen is widely used because of its large range of applicability, $10^{-3} < Re < 10^{3}$ [62]: with $Re$ as Reynolds number according to: Here, $\rho$ is the density and $\eta$ is the dynamic viscosity. This approach was used for case study one.
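A hedged sketch of such a correlation-based estimate is shown below. It uses the commonly quoted form of the Chung and Wen correlation, $\varepsilon \cdot Pe = 0.2 + 0.011 \cdot Re^{0.48}$ with $Pe = u \cdot d_p / D_{ax}$ and $Re = u \cdot d_p \cdot \rho / \eta$. Which velocity (interstitial or superficial) enters $Re$ and $Pe$ differs between references and is an assumption here, as are all numerical values.

```python
def reynolds_number(velocity, particle_diameter, density, viscosity):
    """Particle Reynolds number Re = u * d_p * rho / eta (SI units)."""
    return velocity * particle_diameter * density / viscosity


def axial_dispersion_chung_wen(velocity, particle_diameter, voidage, reynolds):
    """D_ax from eps * Pe = 0.2 + 0.011 * Re**0.48 with Pe = u * d_p / D_ax."""
    if not 1e-3 < reynolds < 1e3:
        raise ValueError("Re outside the stated range of the correlation")
    voidage_times_peclet = 0.2 + 0.011 * reynolds ** 0.48
    return velocity * particle_diameter * voidage / voidage_times_peclet


# Hypothetical operating point with water-like fluid properties.
u = 1.0e-3     # m/s
d_p = 65e-6    # m
eps_s = 0.38   # voidage
rho = 1000.0   # kg/m^3
eta = 1.0e-3   # Pa s

re = reynolds_number(u, d_p, rho, eta)
d_ax = axial_dispersion_chung_wen(u, d_p, eps_s, re)
print(f"Re = {re:.3f}, D_ax = {d_ax:.3e} m^2/s")
```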

Mass Transfer Coefficient
As for the axial dispersion coefficient, there is a set of correlations for the mass transfer coefficient as well. These correlations are based on the Sherwood number, $Sh$, the Peclet number, $Pe$, and, in some cases, the Reynolds number, $Re$, and the Schmidt number, $Sc$, and again vary in their range of applicability [63][64][65]. For case study one, the correlation from Wilson and Geankoplis is used [65], which is a common one for protein chromatography [42,66]. The range of applicability is $0.0016 < Re < 55$. The basis is the correlation for the Sherwood number: with which can be rearranged to: with $D_{m,i}$ as molecular diffusion coefficient which, in this case, can be calculated with [67]:
In case study two, the effective mass transfer coefficient, $k_{eff}$, was determined from pulse injections. Samples of cyclopentanone and cycloheptanone were injected at concentrations varying from 0.5 to 125 g/L. The resulting chromatograms were fed into the simulations and the coefficient was adjusted until simulations and experiments were in good agreement. To do so, the isotherms for the system cyclopentanone/cycloheptanone on Si60 have to be known. Then, the effective mass transfer coefficient is the only unknown parameter, so any remaining deviation between simulations and experiments must be attributed to it.
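For the correlation-based route of case study one, a minimal sketch is given below. It assumes the commonly quoted form of the Wilson and Geankoplis correlation, $Sh = (1.09 / \varepsilon_s) \cdot Pe^{1/3}$, with $Sh = k_f \cdot d_p / D_m$ and $Pe = u \cdot d_p / D_m$. The choice of velocity in $Pe$ and all numerical values, including the diffusion coefficient, are assumptions for illustration.

```python
def film_mass_transfer_coefficient(velocity, particle_diameter, voidage, diffusivity):
    """k_f from Sh = (1.09 / eps) * Pe**(1/3), Sh = k_f * d_p / D_m."""
    peclet = velocity * particle_diameter / diffusivity
    sherwood = 1.09 / voidage * peclet ** (1.0 / 3.0)
    return sherwood * diffusivity / particle_diameter


# Hypothetical values in SI units; D_m is of the order expected for a large protein.
u = 1.0e-3     # m/s
d_p = 65e-6    # m
eps_s = 0.38   # voidage
d_m = 4.0e-11  # m^2/s

k_f = film_mass_transfer_coefficient(u, d_p, eps_s, d_m)
print(f"k_f = {k_f:.3e} m/s")
```

The stated range of applicability in terms of $Re$ should be checked separately before the result is used.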

Pore Diffusion Coefficient
For case study one, the pore diffusion coefficient $D_{p,i}$ is needed. This is determined using the following equation [42]: Here, $\tau_i$ is the tortuosity factor and $\psi_{p,i}$ is the hindrance parameter. There are correlations to calculate the latter. Nevertheless, the tortuosity has to be measured. For IgG, common values are between 1.5 and 2.5 [42,68]. Hence, the pore diffusion coefficient itself can be measured by comparing simulations with pulsed injections, after all the other parameters have been measured. The resulting diffusion coefficient can then be compared with the correlation for verification.

Adsorption Isotherm
Since this might be the most important part, there is a vast number of isotherm parameter determination methods; notable approaches are described in [18,43,46,69,70]. To accommodate the effect of salt on the Henry coefficient and the maximum binding capacity, the following correlations were used: $q_{max,i} = a_{1,i} \cdot c_{Salt} + a_{2,i}$.
For more details on this exact approach, please see also [31,85]. For a similar approach for membrane chromatography see [86].
The isotherms for case study two were measured with perturbation and frontal analysis, since both approaches are very easy to combine. The eluent composition was 85% vol. hexane and 15% vol. ethyl acetate, as described earlier. Eight different feed concentrations between 0.5 and 125 g/L were measured. The best practice is well described in the literature given above, hence left out here.

First Case Study: General Rate Model for Monoclonal Antibody Purification
Step 1
The first case study serves as an example of modelling chromatography for detail engineering of a monoclonal antibody separation on cation exchange media. Thus, the general rate model without further simplifications is used. Since it was used in previous work, no further actions are needed in this step [30,31,39,87,88,89].

Step 2
To start with the DoEs design, all parameters and variables influencing the process and process modelling need to be found and a reasonable range has to be set. For this case study, all parameters and their ranges are summarized in Table 1. Isotherm parameters are subject to diverse influences (e.g., adsorber material or eluent composition). Table 1 also shows the impact of the process or model parameters on quality and performance attributes as expected by the study designer. These expectations arise from prior knowledge due to long-term experience (or literature studies).

In this case study, only two target values are included in the sensitivity study: IgG purity and yield. In chromatography, these parameters are somewhat contradictory. Assuming overlapping peaks on each side of IgG, narrow product fractionation is needed to achieve high purity, resulting in high product loss and thus low yield, and vice versa. IgG purity is a classical quality attribute. To determine this variable, the cut points for IgG fractionation are set to achieve 95% yield with maximum IgG purity. The second target value is IgG yield as a performance attribute. Here, the fractionation is chosen to achieve 95% IgG purity at maximum yield.

Table 1 shows 13 variables. However, the last entry, "isotherm parameters", summarizes 16 parameters; these are the parameters $a_1$, $a_2$, $b_1$, and $b_2$ (see Equations (17) and (18)) for the four components IgG monomer, IgG dimer, low binding HCP (host cell proteins), and strong binding HCP. All parameters that are expected to have no influence (green) are kept out of the sensitivity study, reducing the number of parameters to 26. For this case study, a fractional factorial design of resolution IV with 129 simulations, including one center point, proved to be sufficient for a p-value of less than 0.0001 for simulation vs. prediction response.
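The choice of cut points for the purity target can be sketched as a search over contiguous fraction windows: among all windows that reach the required yield, the one with the highest purity is selected. The fraction data below are invented, and the exhaustive search is only one simple way to do this, not necessarily the procedure used in this study.

```python
def best_pooling_window(product, impurity, min_yield=0.95):
    """Contiguous fraction window with yield >= min_yield and maximum purity.
    Returns (first_index, last_index, yield, purity), or None if no window qualifies."""
    total_product = sum(product)
    best = None
    for start in range(len(product)):
        pooled_product = 0.0
        pooled_impurity = 0.0
        for end in range(start, len(product)):
            pooled_product += product[end]
            pooled_impurity += impurity[end]
            recovery = pooled_product / total_product
            if recovery < min_yield:
                continue
            purity = pooled_product / (pooled_product + pooled_impurity)
            if best is None or purity > best[3]:
                best = (start, end, recovery, purity)
    return best


# Hypothetical amounts per fraction (arbitrary mass units).
product = [0.4, 4.0, 10.0, 4.2, 1.0, 0.4]
impurity = [3.0, 1.0, 0.2, 0.5, 2.0, 3.0]

first, last, recovery, purity = best_pooling_window(product, impurity)
print(f"pool fractions {first} to {last}: yield {recovery:.2f}, purity {purity:.3f}")
```

The yield target at fixed purity follows the same pattern with the roles of the two quantities exchanged.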
The results for IgG purity and yield of the 129 simulation runs were fed into the statistical evaluation and analyzed. The significance of each parameter for one of the two target variables can be visualized with Pareto plots, as shown in Figure 4.

Figure 4 indicates the significance of each parameter for IgG purity (left) or IgG yield (right). Although the resolution of the DoEs was high enough to identify some two-factor interactions, only main effects are shown here. Two-factor interactions, however, would be interesting for process development. Besides, it is important to note that these Pareto plots are only valid for the given separation problem in the given design space. They have no explanatory power for chromatography in general or for similar separations with different ranges of operation. Figure 4, for example, indicates that gradient steepness has no significant effect on IgG purity. This is remarkable compared to general knowledge of bind and elute chromatography. Considering the narrow operational range from 3 to 7 CV, and keeping in mind that the side components of IgG on an ion exchange column often elute very closely [90], this is plausible.
Unexpectedly, most isotherm parameters are irrelevant. The relevant isotherm parameters represent the Henry coefficients, indicating that the column is operated in the linear range of the Langmuir isotherm and is thus not utilized to its full extent. Feed loading seems to be too low.
As expected, column length, particle diameter, and axial dispersion are significant for both purity and yield, whereas flow rate and porosity are not (in the given range).
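The ranking behind such a Pareto plot can be illustrated with a minimal sketch. It assumes a small two-level full factorial design in coded units (-1/+1) and a made-up linear response; the factor subset, the function `toy_purity`, and its coefficients are hypothetical and do not come from the study. The sketch also reports raw main effects, whereas the Pareto plots discussed here show standardized effects.

```python
from itertools import product

def main_effects(design, responses):
    """Main effect per factor: mean response at +1 minus mean response at -1."""
    n_factors = len(design[0])
    effects = []
    for j in range(n_factors):
        high = [y for row, y in zip(design, responses) if row[j] == +1]
        low = [y for row, y in zip(design, responses) if row[j] == -1]
        effects.append(sum(high) / len(high) - sum(low) / len(low))
    return effects

# Hypothetical subset of factors in coded units (-1 = low, +1 = high).
factors = ["column length", "particle diameter", "flow rate"]
design = list(product([-1, +1], repeat=len(factors)))

def toy_purity(length, particle_diameter, flow_rate):
    # Illustrative stand-in for one simulation run, not the chromatography model.
    return 90.0 + 3.0 * length - 2.0 * particle_diameter + 0.0 * flow_rate

responses = [toy_purity(*row) for row in design]
effects = main_effects(design, responses)

# Sort by absolute effect size, as a Pareto plot would.
ranked = sorted(zip(factors, effects), key=lambda item: abs(item[1]), reverse=True)
for name, effect in ranked:
    print(f"{name:18s} {effect:+.2f}")
```

In a real evaluation, the effects would additionally be standardized and compared against a significance threshold, and two-factor interactions could be estimated in the same way from products of design columns.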
Step 3 In this case study, gradient separations are used for model validation. To prove model precision and accuracy, two different experiments are compared with the corresponding simulations. The experimental setup is detailed in Chapter 3, all model parameters were measured as described in Chapter 4.
The first operating point is mostly the center point of the step 2 DoEs. Since the column loading was found to be too low, a few changes were made. In contrast to the DoEs, the column diameter was lowered from 1 to 0.5 cm and the column length was reduced from 10 to 5 cm, reducing the column volume by a factor of 8. In addition, the injection volume was increased to the autosampler maximum of 99 µL. Nevertheless, the flow regime should not be changed. Hence, the flow rate was changed from 2 to 0.5 mL/min, so that the linear flow rate is the same in both cases. All model parameter mean values and the corresponding uncertainties are given in Table 2. The second operating point is mostly the same except for a 3 CV gradient.
The outcomes of simulations and experiments are summarized in Figure 5. For the experiments, the mean concentration at each point in time was calculated from the 20 runs for each gradient, resulting in the solid blue lines. In addition, the minimum and maximum values were determined, resulting in the dotted and dashed lines, respectively. All chromatograms lie between these lines. The red solid lines represent the result of the simulation run with all input values at average (mean values of Table 2).
The dashed and dotted lines are again minimum and maximum values, meaning that all 30 Monte Carlo runs were within these boundaries. In general, simulations and experiments are in good agreement. Simulations are mostly within the experimental envelope curve. Deviations can be found primarily at the end of the elution, where simulations reach the baseline faster than experiments. This is reflected in the overall residence times as well. The deviation is 7.8% for the 5 CV gradient and 7.3% for the 3 CV gradient. The tendency is the same. This can be due to an underestimate of the axial dispersion because of column aging, or, more likely, due to strongly binding side components that were not considered in the simulations.
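The scale-down of the column for the first operating point can be checked with a few lines of arithmetic. The following sketch uses only the dimensions and flow rates quoted in the text and treats the column as an empty cylinder, so the velocity is a superficial velocity that ignores porosity; the function names are illustrative.

```python
import math

def column_volume_ml(diameter_cm, length_cm):
    """Geometric volume of a cylindrical column (1 cm^3 = 1 mL)."""
    return math.pi * (diameter_cm / 2) ** 2 * length_cm

def superficial_velocity_cm_per_min(flow_ml_per_min, diameter_cm):
    """Volumetric flow divided by the empty cross-sectional area."""
    return flow_ml_per_min / (math.pi * (diameter_cm / 2) ** 2)

# DoE center point column versus validation column, values as quoted in the text.
volume_ratio = column_volume_ml(1.0, 10.0) / column_volume_ml(0.5, 5.0)
u_doe = superficial_velocity_cm_per_min(2.0, 1.0)
u_validation = superficial_velocity_cm_per_min(0.5, 0.5)

print(f"volume ratio: {volume_ratio:.1f}")
print(f"velocity DoE: {u_doe:.3f} cm/min, validation: {u_validation:.3f} cm/min")
```

Halving the diameter reduces the cross-section by a factor of 4, so reducing the flow rate by the same factor keeps the velocity unchanged, while halving the length as well gives the overall volume factor of 8.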
Although the peak variability seems rather large, it is important to note again that the envelope curve represents the overall minimum and maximum. These arise from multiple experiments, each of which is shifted slightly in retention time and peak height compared to the average chromatogram, as shown in Figure 6a. Since all components shift in the same direction, this does not affect the purity or yield very much. The purity-over-yield diagram for the 5 CV case is given in Figure 6b.
Producing a purity-over-yield plot is simple for simulations, since the concentrations of all components at each time are known. This is not the case for the experiments. Due to the overlapping peaks, offline analytics are needed, in this case protein A chromatography for total IgG concentration and size exclusion chromatography for IgG monomer and dimer as well as HCP concentrations. Minimum sample volume for both analytics combined is 0.2 mL, resulting in 0.5 min fractionation at 0.5 mL/min flow rate. The fraction with the highest overall purity gives the first point to the left in Figure 6b. Adding the results from the fractions next to it creates the other points.
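The construction of the experimental purity-over-yield points can be sketched as follows. The sketch assumes that each fraction is described by a product mass and an impurity mass, that the pool always grows by the neighbouring fraction with the higher purity (the text only says that neighbouring fractions are added), and that yield is related to the product found in all analysed fractions. The masses are made-up numbers.

```python
def purity_yield_curve(product, impurity):
    """Return (yield, purity) points for a pool grown from the purest fraction.

    product[i] and impurity[i] are masses in fraction i (same units).
    """
    total_product = sum(product)
    purity = [p / (p + q) if p + q > 0 else 0.0 for p, q in zip(product, impurity)]
    start = max(range(len(product)), key=lambda i: purity[i])
    lo = hi = start
    pool_p, pool_q = product[start], impurity[start]
    curve = [(pool_p / total_product, pool_p / (pool_p + pool_q))]
    while lo > 0 or hi < len(product) - 1:
        left = purity[lo - 1] if lo > 0 else -1.0
        right = purity[hi + 1] if hi < len(product) - 1 else -1.0
        if left >= right:  # assumed rule: take the purer neighbour first
            lo -= 1
            idx = lo
        else:
            hi += 1
            idx = hi
        pool_p += product[idx]
        pool_q += impurity[idx]
        curve.append((pool_p / total_product, pool_p / (pool_p + pool_q)))
    return curve

# Hypothetical fraction data, for illustration only.
product_mass = [0.0, 1.0, 4.0, 3.0, 1.0]
impurity_mass = [1.0, 1.0, 0.5, 1.0, 2.0]
curve = purity_yield_curve(product_mass, impurity_mass)
for y, p in curve:
    print(f"yield {y:.3f}  purity {p:.3f}")
```

The first point is the purest single fraction; every further point pools one more fraction, so yield can only increase along the curve while purity typically decreases.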
The feed for this ion exchange separation was cell-free culture after harvest, only diafiltrated without any further purification. Thus, the overall purity is relatively low compared to the classical ion exchange chromatography that follows protein A chromatography in the platform process [91]. Nevertheless, it can be seen that the results from simulations are in good agreement with experiments and lie within the experimental uncertainty. This applies to chromatograms and purity-yield plots. Thus, the model has been proven valid.

Step 4
To identify correlations between input parameters and target values, the Monte Carlo simulations are evaluated statistically with partial least squares (PLS) regression, leading to correlation loading plots, as shown in Figure 7. Here, blue dots are the predictors. These are all input variables from Table 2. Red dots are the responses, in this case the maximum purity possible, the purity at 95% yield, and the yield at 95% purity. The PLS regression extracts so-called "principal components", "latent vectors", or "latent factors" from the predictor and response sets, so that the maximum covariance between the predictor and response sets is explained. Although more than two factors might be needed to explain 100% of the variance, these plots only show two factors. These are mostly factors 1 and 2, since they have the biggest influence on explained variance. More factors might only add irrelevant noise or might not add any benefit to an adequate regression model. Nevertheless, it is often helpful to plot different factors against each other. A large intercept on one axis represents a large influence of this factor for this parameter. For better clarity, red and green circles are added, indicating the limits for 50% explained variance (red) and 100% explained variance (green). In Figure 7, for example, maximum purity correlates highly with factor 1. Yield mainly correlates with factor 1 as well, but also depends on factor 2. Length correlates positively with purity and yield on factors 1 and 2, indicating that a longer column would increase purity and yield. The parameter A2 for HCP2 is negatively correlated on factor 1 and positively correlated on factor 2. This means that a lower A2 affects purity positively, but for some parameter variations would be negative for yield. Since factor 1 explains 73% of the variance and factor 2 only an additional 3%, it is better to keep A2 low. All parameters around the center of this graphic are less important.
Again, there is more information hidden in the other factors that are not shown here.
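How a correlation loading arises can be sketched for the simplest case: one response and only the first PLS factor. This is a reduced illustration, not the multi-response evaluation used for Figure 7. The variable names, the linear toy response, and the regular grid of inputs (used instead of random Monte Carlo draws to keep the example deterministic) are assumptions.

```python
import math
from itertools import product

def standardize(column):
    """Center a column and scale it to unit sample standard deviation."""
    n = len(column)
    mean = sum(column) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in column) / (n - 1))
    return [(v - mean) / sd for v in column]

def correlation(a, b):
    """Pearson correlation of two equally long sequences."""
    a, b = standardize(a), standardize(b)
    return sum(x * y for x, y in zip(a, b)) / (len(a) - 1)

def first_pls_factor(columns, response):
    """Weights and scores of the first PLS factor for a single response."""
    X = [standardize(c) for c in columns]
    y = standardize(response)
    w = [sum(x * yi for x, yi in zip(col, y)) for col in X]  # proportional to X^T y
    norm = math.sqrt(sum(v * v for v in w))
    w = [v / norm for v in w]
    scores = [sum(w[j] * X[j][i] for j in range(len(X))) for i in range(len(y))]
    return w, scores

# Hypothetical inputs on a regular grid and a made-up linear response.
names = ["column length", "isotherm a2", "porosity"]
runs = list(product([-1.0, 0.0, 1.0], repeat=3))
columns = [[run[j] for run in runs] for j in range(3)]
purity = [2.0 * length - 1.0 * a2 + 0.0 * porosity for length, a2, porosity in runs]

weights, scores = first_pls_factor(columns, purity)
loadings = {name: correlation(col, scores) for name, col in zip(names, columns)}
response_loading = correlation(purity, scores)

for name, value in loadings.items():
    print(f"{name:14s} {value:+.3f}")
print(f"{'purity':14s} {response_loading:+.3f}")
```

A predictor plotted far from the origin and on the same side as the response increases that response, one on the opposite side decreases it, and one near the origin carries little information on this factor. The squared correlation loading is the share of that variable's variance explained by the factor.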
Altogether, the process model for monoclonal antibody purification on ion exchange media is validated and ready to be used in optimization studies or for process development of continuous chromatography steps. Significant design parameters were identified in step 2 and the influence on target values quantified in step 4. Model validation and the ability to replace further experiments was demonstrated in step 3.
Especially for the evaluation of continuous processes, it is important to check again that the model parameters are still valid. Parameters determined with correlations must be within the range of applicability. Different flow rates, for example, may result in a change of flow regime. A solid design of experiments usually yields a broad range of application. Isotherms are usually measured from very low to very high concentrations and with different component ratios.

Second Case Study: Lumped Pore Diffusion Model for Continuous Chromatography of Cycloheptanone and Cyclopentanone
Step 1 The second case study exemplifies rapid process development for the evaluation of various continuous chromatography steps, often with more than four columns, using a simulated moving-bed test system of cyclopentanone (C5) and cycloheptanone (C7) on normal phase columns. Since speed is more important than level of detail in this case, the lumped pore diffusion model is used. Again, since this model was used in previous work, no further action is needed in this step [30,31,86].

Step 2
Again, to start with the DoEs design, all parameters and variables influencing the process and process modelling with their design range are summarized in Table 3.
Table 3. Process and model parameters, their design range and the expected impact on cyclopentanone and cycloheptanone purity and yield. Green: no impact; yellow: low impact; orange: medium impact; red: high impact.
Table 3 identifies 12 parameters with an expected influence on process outcome. Thus, a full factorial DoEs plan would hold 2^12 + 1 = 4097 simulations, including one center point. Depending on the computational power at hand, this is workable, although the results of a reduced plan might have the same quality.
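The size of such a plan follows directly from the number of factors. The sketch below counts and generates a two-level full factorial plan with one center point in coded units (-1, 0, +1); the function names are illustrative, and mapping the coded levels to the actual ranges of Table 3 is left out.

```python
from itertools import product

def full_factorial_runs(n_factors, center_points=1):
    """Number of runs of a two-level full factorial plan plus center points."""
    return 2 ** n_factors + center_points

def full_factorial_design(n_factors):
    """Yield all corner points in coded units, followed by one center point."""
    yield from product((-1, +1), repeat=n_factors)
    yield (0,) * n_factors

n_runs = full_factorial_runs(12)
design = list(full_factorial_design(12))
print(n_runs, len(design), design[0], design[-1])
```

Each additional factor doubles the number of corner points, which is why reduced (fractional) plans become attractive as soon as single simulations are expensive.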
In the present case, only two components are present. Depending on the input parameters, a complete baseline separation of both components is possible as well as a complete overlap. Thus, the purity is expected to be between 50% and 100%, the yield can vary from 0% to 100%. At center point, the peaks are slightly overlapping resulting in 98% yield of both components when 100% purity is desired. Purity and yield for both components are again the only target values of the sensitivity study. Purity was taken for a fraction with at least 95% yield and yield was calculated for a fraction with 95% purity. Other performance attributes like productivity or eluent consumption would also be interesting for process development, but are again left out to simplify the approach. Besides, the purpose of this case study is to implement and validate a process model for further process studies of different continuous chromatography steps like simulated moving bed, where the potential batch column performance is not relevant since it is not operated in batch mode anyway.
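The stated purity range can be made plausible with a small numerical sketch. It assumes two Gaussian peaks of equal area and equal width, collects the first-eluting component from the start of the run until 95% of it is recovered, and evaluates the purity of that pool. Peak positions, width, and time step are arbitrary illustration values, not data from the study.

```python
import math

def gaussian(t, mu, sigma):
    """Unit-area Gaussian peak centered at mu."""
    return math.exp(-0.5 * ((t - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def purity_at_yield(mu_first, mu_second, sigma=0.5, target_yield=0.95,
                    dt=0.01, t_end=30.0):
    """Purity of the first component when collected from t = 0 up to the cut
    point at which the target yield of that component is reached."""
    n = int(t_end / dt)
    total_first = sum(gaussian(i * dt, mu_first, sigma) for i in range(n)) * dt
    collected_first = collected_second = 0.0
    for i in range(n):
        t = i * dt
        collected_first += gaussian(t, mu_first, sigma) * dt
        collected_second += gaussian(t, mu_second, sigma) * dt
        if collected_first >= target_yield * total_first:
            break
    return collected_first / (collected_first + collected_second)

overlap = purity_at_yield(10.0, 10.0)    # complete overlap
separated = purity_at_yield(10.0, 15.0)  # baseline separation
partial = purity_at_yield(10.0, 11.0)    # intermediate case
print(f"overlap {overlap:.3f}, partial {partial:.3f}, separated {separated:.6f}")
```

With equal amounts of both components, complete overlap gives 50% purity and baseline separation approaches 100%, which brackets the range expected for the sensitivity study.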
The sensitivities are visualized with Pareto charts of standardized effects in Figure 8. It can be seen that every parameter is relevant for the target values except the Langmuir coefficients, the axial dispersion coefficient, and the feed concentration. The Langmuir coefficients include the maximum loading, as shown in Equation (7). Their irrelevance, combined with the irrelevance of the feed concentration, indicates that the column is not fully loaded. Of course, full column utilization is desired for continuous chromatography processes, but it is not necessary for the model validation experiments, which are the focus here.
For closely eluting components, it is obvious that the Henry coefficients should be of major importance, which is the case. The same applies to column length in isocratic elution chromatography. Similar explanations can be found for all parameters and their outcome. No unexpected behavior was found and the sensitivity is as expected.
Step 3 In this case study, isocratic chromatography is investigated. For model validation, the center point of the step 2 DoEs is used. The corresponding experimental and model parameters are listed in Table 4. Additionally, one operating point with a volumetric flow of 15 mL/min and one with a feed concentration of 31.25 g/L for cycloheptanone and a factor of 1.5 more for cyclopentanone were investigated. The latter was not part of the original design space; nevertheless, due to solid parameter determination, the model should be valid beyond these boundaries. Simulations were done once with all parameters at their average value, with one parameter set to minimum or maximum, with all parameters at minimum or maximum, and with 30 Monte Carlo runs. Chromatograms for experimental and simulated runs are compared in Figure 9. There are three series of experiments with 15 runs each. Between the series, new feed and eluent were prepared, but with the same recipe. If the simulations are only compared to the first series (grey lines), it can be seen that the variation between simulations is larger than that between experiments. Nevertheless, if all three series are taken into account, the deviation between the experiments is larger. In this case, all simulations lie within the experimental envelope curve.
Although the peaks of the envelope curves look extremely wide and there is a large overlap between the envelope curves of cyclopentanone and cycloheptanone, high purity and yield were always achieved. The average value for purity was around 99.2%, with a minimum of 99.0% and a maximum of 99.4%. It is important to note that although the retention times of both peaks vary a lot, both peaks usually shifted in the same direction. Thus, the separation itself was little affected. Similar conclusions can be drawn for the other two validation experiments, shown in Figure 10.
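The envelope comparison used for validation can be sketched in a few lines. The example below is a toy stand-in: a single Gaussian peak replaces the chromatography model, retention time and peak width are drawn from normal distributions with made-up means and standard deviations, and the "experimental" runs are also generated synthetically. Only the run counts (30 Monte Carlo simulations, three series of 15 experiments) are taken from the text.

```python
import math
import random

def gaussian_peak(t, retention_time, sigma):
    """Unit-area Gaussian peak as a stand-in for a simulated chromatogram."""
    return math.exp(-0.5 * ((t - retention_time) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def envelope(runs):
    """Pointwise minimum and maximum over a list of equally sampled runs."""
    lower = [min(values) for values in zip(*runs)]
    upper = [max(values) for values in zip(*runs)]
    return lower, upper

def fraction_inside(curve, lower, upper):
    """Share of time points at which a curve lies within the envelope."""
    hits = sum(1 for c, lo, hi in zip(curve, lower, upper) if lo <= c <= hi)
    return hits / len(curve)

rng = random.Random(42)  # fixed seed so that repeated calls give the same runs
times = [0.1 * i for i in range(101)]  # 0 to 10 min in 0.1 min steps

def sample_runs(n_runs, rt_mean, rt_sd, sigma_mean, sigma_sd):
    """Draw retention time and peak width per run and evaluate the toy peak."""
    runs = []
    for _ in range(n_runs):
        rt = rng.gauss(rt_mean, rt_sd)
        sigma = max(1e-6, rng.gauss(sigma_mean, sigma_sd))
        runs.append([gaussian_peak(t, rt, sigma) for t in times])
    return runs

simulated = sample_runs(30, 5.0, 0.10, 0.50, 0.02)     # hypothetical scatter
experimental = sample_runs(45, 5.0, 0.25, 0.50, 0.05)  # hypothetical scatter
exp_lower, exp_upper = envelope(experimental)
shares = [fraction_inside(run, exp_lower, exp_upper) for run in simulated]
print(f"lowest share of points inside the experimental envelope: {min(shares):.2f}")
```

Because the envelope is built from a finite number of runs, a curve that is physically plausible can still fall slightly outside it at single time points, so a criterion such as "share of points inside" is a more forgiving check than strict inclusion.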

Step 4
Again, simulations from step 3 are evaluated statistically. The resulting correlation loadings plot is given in Figure 11. In step 3, both simulations and experiments showed deviations in retention time, but only minor variance in purity and yield. Although this indicates a very stable operating point, it carries less information for statistical evaluation. This is reflected in the low explanatory power of factors 1 and 2 in Figure 11. Nevertheless, some interesting conclusions can be drawn. In this case, factors 1 and 2 mostly explain variance in cycloheptanone and cyclopentanone yield. As expected, a lower volumetric flow results in better yield. The same goes for the feed concentration and the Henry coefficient of the earlier eluting component. A lower Henry coefficient for this component would shift this peak to the front, reducing the overlap with the stronger binding component. All the other parameters are less important in the factor 1 and 2 plane. Nevertheless, both factors combined only explain 71% variance. Factors 3 to 5, in this case, are also important for variance, but are left out here since no additional information can be extracted.
In summary, the process model for the simulated moving-bed chromatography test system cyclopentanone and cycloheptanone on normal phase media is implemented and validated. This model, parametrized and tested as batch chromatography, is now ready to be used in continuous chromatography modelling with as many columns as needed, which considerably simplifies process development and comparison.

Discussion and Conclusions
In this paper, model validation for model-based process development under quality-by-design aspects has been addressed. The presented approach can be divided into four steps, of which the first is left out since it is known from literature.
The second step targets the sensitivity of the process and the model to parameter changes within the predetermined design space. In terms of model validation, this step should reveal flaws in the model equations or programming by showing behavior in conflict with expectations. Furthermore, it improves process comprehension. The IgG case study, for example, proved to be less sensitive to gradient steepness, which is not expected for gradient separations, but turned out to be correct due to the similarity in elution behavior of target and side components. Thus, optimizing the elution profile should become less important in further process development.
It is important to note, however, that this conclusion is only valid for the problem at hand processed in the given design space. It might be completely different for a similar antibody from a different cell culture, even when processed on the same column. This is due to the dominant effect of the adsorption equilibrium, which turned out to be significant in both case studies, although only for the linear part.
Parameter determination is crucial for modelling. Both case studies followed slightly different approaches. Nevertheless, qualified personnel in a dedicated laboratory should be able to measure all parameters in less than four weeks. Offline analytics is often the critical part that can extend project duration significantly. Significant improvements are gained by implementation of process analytical technology, like inline concentration measurements [58,92]. Besides, not every parameter has to be determined experimentally. Many parameters can be calculated using correlations or are taken over from previous simulation studies. For separations of similar components carried out on the same column only the isotherm parameters change. Therefore, experienced process development teams may reduce parameter determination effort to a few days.
Step 3 focused on model validation. Two and three operating points were considered in the first and second case study, respectively. Model parameter determination errors were used to conduct Monte Carlo simulation studies, which were compared to experiments. The variance of the simulations and the variance gained by repeating the same experiment multiple times proved to be in the same range. The envelope curve of the simulations lies within the envelope curve of the experiments. This proves the model to be valid and capable of replacing further experiments.
Step 4 focuses more on process development by statistical evaluation of the results from step 3, revealing the dependencies between input/process parameters and target values.
For both cases, the IgG purification and the small molecule example, the presented approach was successful, resulting in valid process models and deeper process knowledge for further process development.