Robust Process Design in Pharmaceutical Manufacturing under Batch-to-Batch Variation

: Model-based concepts have been proven to be beneﬁcial in pharmaceutical manufacturing, thus contributing to low costs and high quality standards. However, model parameters are derived from imperfect, noisy measurement data, which result in uncertain parameter estimates and sub-optimal process design concepts. In the last two decades, various methods have been proposed for dealing with parameter uncertainties in model-based process design. Most concepts for robustiﬁcation, however, ignore the batch-to-batch variations that are common in pharmaceutical manufacturing processes. In this work, a probability-box robust process design concept is proposed. Batch-to-batch variations were considered to be imprecise parameter uncertainties, and modeled as probability-boxes accordingly. The point estimate method was combined with the back-off approach for efﬁcient uncertainty propagation and robust process design. The novel robustiﬁcation concept was applied to a freeze-drying process. Optimal shelf temperature and chamber pressure proﬁles are presented for the robust process design under batch-to-batch variation.


Introduction
To implement Quality by Design (QbD) concepts, and to ensure optimally designed processes, over the last two decades model-based process design has become an important tool in pharmaceutical manufacturing and process systems engineering [1][2][3][4][5]. For instance, dynamic process models support recent activities of the Food and Drug Administration (FDA) [6] and the International Council for Harmonisation (ICH) Q11 guideline [7] regarding QbD, and the quantification of process variability [8]. Although uncertainties in process models and parameters are considered, and are frequently incorporated in robust process design concepts [9][10][11][12][13], the applied algorithms are commonly based on perfect uncertainty measures, i.e., using specific probability density functions (PDFs). In addition to probability-based concepts for robust process design, scenario-based methods exist [14][15][16]. Simulation studies seek the worst-case scenario for which the process is optimized, even when the worst-case scenario rarely occurs in reality, and thus, lead to robust but extremely conservative designs with considerable performance losses. Therefore, robust design concepts for pharmaceutical processes, which aim to maximize process performance while satisfying critical process constraints under probabilistic uncertainties, are preferred, to provide the proper trade-off between process performance and robustness [2,17]. Probabilistic uncertainties, in turn, are the result of noisy experimental data and system identification routines that assume a particular experimental setting, while neglecting batch-to-batch variation effects [17]. In the pharmaceutical industry, the batch operation is the standard pharmaceutical process, where the optimal shelf temperature and chamber pressure profiles are derived for optimal process efficiency and process quality attributes. The paper is organized as follows. Section 2 covers the basics of the robust process design under batch-to-batch variation. In Section 3, an effective solution strategy with the PEM and the back-off-based design is introduced. Section 4 summarizes the results from the p-box robust process design of a freeze-drying process. Conclusions can be found in Section 5.

Robust Process Design
In what follows, the basics of the probability-based process design are summarized. Starting with the standard probability-based robust optimization framework, an extension to imprecise uncertainties representing model parameter uncertainties under batch-to-batch variation is given.

Probability-Based Robust Optimization
In the literature, various concepts of robust process design exist. Traditional methods for propagating and quantifying model uncertainties are probabilistic and frequently used in robust process design. Here, the interested reader is referred to [9,11,13,16,30,35,37] and references therein. The general structure of the original probability-based robust process design reads as: subject to: where t ∈ [0, t f ] is the time, u ∈ R n u is the vector of the control variables, and p ∈ R n p is the vector of the time-invariant parameters. x d ∈ R n x d and x a ∈ R n xa are the differential and algebra states; that is, The initial conditions for the differential states are given by x 0 . Uncertainties can exist in the parameters and the initial conditions ξ = [p; x 0 ], where the probability space (Ω, F , P) is defined with the sample space Ω, the σ-algebra F , and the probability measure P. Φ(M(x t f )) denotes the robust formulation of the Mayer objective term M(x t f ) that is used for the nominal process design. Equations (1b) and (1c) are the model equations with g d : R (n x d +n xa )×n u ×n p → R n x d and g a : R (n x d +n xa )×n u ×n p → R n xa . P v in Equation (1e) is the probability of violating the inequality constraints h nq : R (n x d +n xa )×n u ×n p → R n nq . ε nq is the tolerance factor that gives the maximum acceptable probability for constraint violations. [u min , u max ] are the upper and lower boundaries for the control variables. For a conventional robust process design, parameters uncertainties ξ are characterized with well-defined probability distributions F Ξ (ξ). The probability of constraint violations can be approximated with statistical moments, and thus, the robust inequality constraints in Equation (1e) read as: where E[·] and Var[·] indicate the mean and variance calculated over the probability space of ξ, and β ξ determines the robust design's conservatism to the variation of the model parameters ξ uncertainties.

Imprecise Uncertainties
In the case of imprecise uncertainties, the conventional robust process design concept can be generalized with the parametric p-box approach, where the uncertainties of the parameters ξ depend on the hyper-parameters θ of the parametric probability distributions: where θ is specified by upper and lower bounds, and D Θ = [θ l 1 , θ u 1 ] × . . . [θ l n θ , θ u n θ ] denotes the feasible domain of these hyper-parameters. According to the p-box notation, the probability of a constraint violation can be expressed as a bounded interval P v ∈ [P l v , P u v ] with: In the case of the p-box robust process design, the upper probability bound is of interest, to guarantee a safe operation: If the upper boundary of P v is lower than or equal to ε nq , then Equation (1e) holds for all realizations of hyper-parameters θ and P v ∈ [P l v , P u v ], respectively. To avoid solving a cumbersome double-loop sampling or optimization problem [31], Equation (5) can be, as for a conventional robust design, also approximated with statistical moments according to: where β θ determines the conservativeness of the p-box robust design results from the variation of hyper-parameters θ. Note the direct link to PBA, where the first term of Equation (6) refers to the averaged value of the uncertain boundary, and the second term measures the variation. Thus, with β θ = 2.32, the 99% confidence interval of the uncertain upper limit is calculated, and the upper bound of the upper bound approximates the plausibility function sufficiently. Please note that the 99% confidence interval indicates the interval of the probability distribution, in which the constraints are satisfied, and does not have to be symmetric. Evaluating the plausibility function for a robust process design might result in too-conservative designs under considerable performance loss. Note that the plausibility function assigns the inequality constraints the highest probability. Alternatively, setting β θ = −2.32 leads to the lowest probability of the inequality constraint under batch-to-batch variation, and thus, approximates the belief function accordingly. Both strategies, i.e., β θ = 2.32 and β θ = −2.32, are considered within the PEM-based back-off approach, and the general structure of the double-loop approach is summarized in Figure 2.  Figure 2. Illustration of the outer and inner loop setting for evaluating the inequality constraint under aleatory and epistemic uncertainty. Sampling from epistemic uncertainty (a) results in aleatory uncertainty realization (b), which is propagated to the inequality constraint (c) via the process model. Re-sampling from the epistemic uncertainty (a) helps to quantify the variation in the upper limit of the inequality constraint (d).

PEM-Based Back-Off Approach
Before introducing the basic notation of back-off-based process design, the concept of the PEM is introduced, and how it can be efficiently used for problems of imprecise uncertainty propagation.

Point Estimate Method
The conventional robust and p-box robust process design problems are solved with the back-off approach [12], where the back-offs are calculated with Equations (2) and (6), respectively. The statistical moments used in Equations (2) and (6) are approximated with the PEM, as it is more efficient than standard methods for uncertainty propagation [35]. Depending on the underlying parameter distribution, specialized sample points and weight factors w i can be derived, and evaluated for uncertainty propagation [35,41]. In the case of aleatory parameter uncertainties, 2n 2 ξ + 1 PEM sample points must be used to evaluate Equation (2), assuming Gaussian distributions for the model parameter uncertainties. In detail, the deterministic 2n 2 ξ + 1 sample points are generated by the first three generator functions (GF[0], GF[±ϑ], and GF[±ϑ, ±ϑ]) defined in [41], where ϑ controls the exploration of the n ξ -dimensional parameter space. Using specific weight factors for each generator function results in the final approximation scheme for the mean value: where 18 , w 2 = 1 36 , and ϑ depends on the specification of the Gaussian distribution [41]. Similarly, the variance can be estimated with the following equation: With Equations (7) and (8), the conventional robust design can be realized, but they ignore the batch-to-batch variation and epistemic uncertainty, respectively. To incorporate epistemic uncertainty, the outer loop of uncertainty propagation must be considered; see Figure 2. To do so, the scaling factor ϑ and weights w i of the PEM are adapted to uniform probability distributions [35,41]. Thus, for imprecise parameter uncertainties, and the given nested uncertainty propagation problem in Equation (6), 2n 2 ξ + 1 PEM sample points for the model parameters and 2n 2 θ + 1 PEM sample points for the bounded hyper-parameters θ are evaluated that result in 4n 2 θ n 2 ξ + 2(n 2 θ + n 2 ξ ) + 1 total PEM sample points. Please note that the deterministic sampling scheme from the PEM can be easily parallelized, while ensuring reproducible results.

Back-Off Realization
For the back-off strategy, the inequality constraint of Equation (1e) is considered in its deterministic form first: To guarantee that the inequality constraint is fulfilled under imprecise uncertainties, a back-off term b c to the constraint at the nominal parameter vector p is introduced as: In conventional robust process design with precise parameter uncertainties, the back-off term b c can be calculated with the following equation [12]: in which h nq,nom represents the nominal value of the inequality constraints, and β ξ determines the robustness that could be obtained with this back-off term. For instance, b c calculated with β ξ = 2.32 could provide a robust design, where 99% of the process realizations under aleatory uncertainty do not violate the inequality constraints.
In comparison with the conventional robust design, the p-box robust process design determines the back-off term b c with Equation (6) and reads as: in which β ξ and β θ decide the robustness of the individual batch and the batch-to-batch variation, respectively. For instance, the back-off term determined with β ξ = 2.32 and β θ = 2.32 could provide robust design, with which 99% of different configurations of parameter uncertainties and process realizations will have the desired robust performance; that is, the probability of a constraint violation is smaller than 99%. Different values for β ξ and β θ could also be used, depending on the preferred robustness level required for the process under study. For more details regarding the PEM-based back-off design and the selection of proper values for b c , we refer to our preview works [12,13] and references in those works.

Case Study
The freeze-drying process, also known as lyophilization, is used extensively in pharmaceutical manufacturing to stabilize APIs that have limited storage time in aqueous solutions, for example, therapeutic protein formulations and vaccines [10]. The primary drying process, which dominates the overall energy consumption of the lyophilization process, is considered the most critical step [44].
In this study, we analyze the robust process design of the primary drying process in the presence of imprecise parameter uncertainties due to batch-to-batch variations. Thus, we advance our preliminary work on robust process design for the primary drying process which was based on precise parameter uncertainties [45], and we extend recent model-based studies and experimental work on inter-vial heterogeneity in general [46][47][48].

Mathematical Model of the Primary Drying Process
The mathematical model of the primary drying process used in this work is adapted from [10,44], and the overall setup is illustrated in Figure 3. The mass transfer equation of vapor, which represents the dynamics of the sublimation process at the sublimation surface, is given as: where m sub is the mass of ice removed by sublimation. A p , P c , and R p are the cross-sectional area of the product, the chamber pressure, and the dried product resistance to the vapor flux, respectively. The heat used for sublimation is assumed to be equal to the heat transferred from the heating shelf: where K v , A v , and T s are the heat transfer coefficient, the outer cross-sectional area of the vial, and the shelf temperature, respectively. Figure 3. Illustration of the freeze-drying process. Model parameters can represent the dynamic processes of a single vial (a) appropriately, but may fail for all vials that are handled in the freeze-dryer (b) due to batch-to-batch variations. P i is the vapor pressure at the sublimation interface which depends on T i [49]:

(a) (b)
T i is the temperature at the sublimation interface, and is calculated with the energy balance equation given in [44]. K v , A v , and T s are the heat transfer coefficient, the outer cross-sectional area of the vial, and the shelf temperature, respectively. ∆H s is the heat of sublimation which also depends on T i [49]. T B = T i + ∆T is the temperature at the bottom of the vial. ∆T is the temperature difference across the frozen layer [10]: where L f is the height of the frozen layer and can be linked to m sub via: L total , ρ I , and are the total height of the product layer, the density of the ice, and the volume of the ice fraction, respectively. Nominal values and units for the parameters and the initial conditions can be found in [10,44], and the used nominal parameter values are summarized in Table 1.

Optimal Process Design Strategy
This case study aims to maximize the efficiency of the primary drying step under parameter uncertainties in R p and K v , while ensuring the product quality at the same time. The shelf temperature and the chamber pressure are adapted to maximize the total mass of ice removed by sublimation, and to minimize the operating time. To avoid irreversible product damage of the API cake, temperature T i is maintained to be smaller than the critical collapse temperature T crit = −34 • C [10]. Feasible operation intervals for chamber pressure P c and shelf temperature T s are [5,30] Pa and [−40, 30] • C, respectively.
First, the optimization problem is solved in the absence of parameter uncertainties for the nominal design. Second, precise uncertainties in parameters R p and K v are included for the conventional robust process design; that is, batch-to-batch variation effects are ignored. According to [10], we assume that the uncertainties of R p and K v follow a Gaussian distribution; i.e., R p ∼ N (56,000, 5600 2 ) and K v ∼ N (11.47, 1.15 2 ). Finally, we assume imprecise parameter uncertainties in R p and K v for the p-box robust process design as introduced in Section 2. According to the parametric p-box concept, the interval of the hyper-parameters and the type of probability distribution families are listed in Table 2. For the sake of demonstration, the performance of the proposed framework, i.e., precise parameter uncertainties and imprecise parameter uncertainties, are assumed according to the information from [10] and the reference therein. The number of model evaluations needed to calculate the back-offs for the p-box robust design is 297 for each iteration, and the back-offs converge at the 4th iteration. The optimization problems were implemented in MATLAB (2017a), and solved within the CasADi framework [50] using the nonlinear programming (NLP) solver IPOPT [51].

Results and Discussion
In Figure 4, we show the designed profiles of the shelf temperature and the chamber pressure for the nominal, robust, and p-box robust designs of the primary drying step. For the nominal design, T i is kept at its upper boundary to ensure higher vapor pressure P i at the sublimation interface, and thus, to accelerate the sublimation process according to Equations (13) and (15). In the beginning, P c is set to 9.6 Pa to achieve a higher sublimation speed and is decreased gradually to compensate for the influence of the decreasing height of the frozen layer following Equation (16). However, with the existence of (imprecise) parameter uncertainties, the variation of temperature at the sublimation interface will lead to significant violations of the critical temperature which is necessary for maintaining the quality of dried API product. Therefore, results from the robust designs attempt to reduce the temperature of the heating shelf to avoid quality failures of the API cake, while the pressure is maintained at its lower boundary to maximize the efficiency of the freeze-drying process. The shelf temperature for the conventional robust design decreases, while that for the nominal design remains at the upper limit; see Figure 4a. Moreover, the shelf temperature for the p-box robust design (plausibility function, β θ = 2.32) decreases further as the effect of imprecise parameter uncertainties is taken into account with the highest probability. The chamber pressure for the robust and p-box robust designs is also lower than that for the nominal design, and is kept at the minimum to increase the efficiency of the sublimation process; see Figure 4b. The increase in robustness is at the cost of decreased performance; that is, the drying time of the p-box robust design shown in Figure 5 is longer than that for the nominal and conventional robust designs. In addition, the effect of batch-to-batch variation and epistemic uncertainties is also illustrated with the p-box design (belief function, β θ = −2.32). The shelf temperature and chamber pressure profiles are close to those of the nominal case.  Figure 4. Designed profiles for the shelf temperature T s (a) and the chamber pressure P c (b) of the nominal, robust, p-box (plausibility function) robust, and p-box (belief function) robust designs.
In Figure 5, we illustrate the probability distributions of T i calculated with the parameter uncertainties of different hyper-parameter realizations for the nominal, robust, and p-box robust designs. Results from the nominal design violate the upper limit of T i in most scenarios, as indicated in Figure 5a, and thus, the designed shelf temperature and chamber pressure profiles are unlikely to be beneficial for most of the vials that are handled in the freeze-dryer. In Figure 5b, we show that the conventional robust design increases the robustness of the process, but with significant constraint violations when the hyper-parameters are different from the one used for the conventional robust process design, i.e., when there is considerable batch-to-batch variation. In the case of batch-to-batch variation and imprecise parameter uncertainties, the p-box robust design (plausibility function, β θ = 2.32) ensures the lowest number of constraint violations; see Figure 5c. To demonstrate the effect of batch-to-batch variation further, the p-box design (belief function, β θ = −2.32) is given in Figure 5d. Considering the belief function, the p-box design is far from robust, and is close to the nominal design, as indicated in Figure 5d. The performance of the nominal, robust, and p-box robust designs is further analyzed, and validated for the imprecise parameter uncertainties of R p and K v with Monte Carlo simulations. To this end, 1000 realizations of the hyper-parameters (epistemic uncertainty) and 1000 samples of parameters R p and K v from each realization (aleatory uncertainty) are generated, which leads to 10 6 samples in total for the double-loop approach (Figure 2). In Figure 6, we summarize the number of constraint violations determined from the 1000 model evaluations with the parameter samples generated from the probability distributions of parameter uncertainties, with a fixed hyper-parameter realization. The normalized histograms of the violation number, in turn, are obtained from the 1000 realizations of the hyper-parameters. Please note that for the sake of validation, we aim for a robust design for the individual batches and batch-to-batch variations. Thus, two parameters, β ξ = 2.32 and β θ = 2.32, are selected, and determine the final robustness level of the designed process. (1) β ξ = 2.32 (i.e., ε nq is set to 1%) attempts to have a design with which the constraint violation number should be lower than or equal to 10 in the case of 1000 parameter samples in single realization of the hyper-parameters, and (2) β θ = 2.32 attempts to guarantee that fewer than or equal to 10 realizations of hyper-parameters will not obey the first desired robustness. The probability that the violation number ≥ 10 is the highest for the nominal design; see Figure 6a. As indicated in Figure 6b, the conventional robust process design has a certain effect of, but results in a considerable number of constraint violation, due to the neglected batch-to-batch variation. For the p-box robust design (plausibility function, β θ = 2.32) in Figure 6c, the number of constraint violations fulfills the given specification for almost all sample realizations. Only a small number of samples do not fulfill the given specification of the violation number, which can be attributed to the approximation errors of the PEM used for the inner and outer loops of uncertainty propagation ( Figure 2). The effect of the batch-to-batch variation becomes also obvious for the alternative p-box design (belief function, Figure 6d. The likelihood of a constraint violation increases drastically when compared with the previous p-box design (plausibility function, β θ = 2.32), and has a performance similar to that of the nominal design. To summarize, the probability that the violation number is larger than 10 is equal to 98.6%, 78.9%, 2%, and 96.8% for the nominal, conventional robust, p-box (plausibility function), and p-box (belief functions) designs, respectively. As can be observed, the proposed p-box (plausibility function) approach can handle the imprecise parameter uncertainties and provide a process design which is robust enough for not only an individual batch but also for batch-to-batch variations.

Conclusions
In this work, we introduced a p-box robust process design to compensate for batch-to-batch variation and imprecise parameter uncertainties, which were expressed as parametric p-boxes. The notation of the robust inequality constraint was adapted according to the parametric p-boxes, and further approximated with statistical moments that were calculated efficiently. Moreover, combining the point estimate method with a back-off strategy for robust design implementation has been proven to be beneficial in terms of the computational costs for the p-box robust process design concept. The efficiency of the proposed strategy was successfully demonstrated with a freeze-drying process, as a highly relevant pharmaceutical manufacturing process. The results of the p-box robust process design were compared with the results from the nominal and conventional robust designs. The proposed strategy performed quite well, and could ensure the robustness of the inequality constraint, even in the presence of batch-to-batch variation and imprecise parameter uncertainties, respectively. For the p-box design, the two scenarios of plausibility and belief function illustrated the considerable impact of batch-to-batch variation on the optimal process design results. Thus, future work will focus on rigorous sensitivity studies of robust process designs for pharmaceutical processes that have imprecise parameter uncertainties, and systematic analysis of the effect of epistemic and aleatory uncertainties. Moreover, the proposed approach could also be incorporated into advanced control strategies, e.g., model predictive control, to guarantee the robustness of process constraints in the presence of imprecise parameter uncertainties, which is also an interesting aspect for novel Quality by Control concepts.
Author Contributions: X.X. and R.S. designed the study. X.X. performed the and prepared the draft. R.S. provided feedback on the content and participated in writing the manuscript.

Funding:
We acknowledge support from the German Research Foundation and the Open Access Publication Funds of the Technische Universität Braunschweig.