Handling Measurement Delay in Iterative Real-Time Optimization Methods

Abstract: This paper is concerned with the real-time optimization (RTO) of chemical plants, i.e., the optimization of the steady-state operating points during operation, based on inaccurate models. Specifically, modifier adaptation is employed to cope with the plant-model mismatch, which corrects the plant model and the constraint functions by bias and gradient correction terms that are computed from measured variables at the steady-states of the plant. This implies that the sampling time of the iterative RTO scheme is lower-bounded by the time to reach a new steady-state after the previously computed inputs were applied. If analytical process measurements (PAT technology) are used to obtain the steady-state responses, time delays occur due to the measurement delay of the PAT device and due to the transportation delay if the samples are transported to the instrument via pipes. This situation is quite common because the PAT devices can often only be installed at a certain distance from the measurement location. The presence of these time delays slows down the iterative real-time optimization, as the time from the application of a new set of inputs to receiving the steady-state information increases further. In this paper, a proactive perturbation scheme is proposed to efficiently utilize the idle time by intelligently scheduling the process inputs, taking into account the time delays, to obtain the steady-state process measurements. The performance of the proposed proactive perturbation scheme is demonstrated for two examples, the Williams-Otto reactor benchmark and a lithiation process. The simulation results show that the proposed proactive perturbation scheme can speed up the convergence to the true plant optimum significantly.


Introduction
There is a strong interest in the process industries to identify the best steady-state operating conditions of processes such that they are operated at their economic optimum while meeting constraints on product qualities, emissions, and equipment capabilities. Optimal operating conditions of a process can be identified by formulating and solving an optimization problem in which a cost (or profit) function is minimized (or maximized) while respecting all process limitations, on the basis of a mathematical model of the plant. As the parameters of the optimization problem (e.g., prices for raw materials, energy, and products) and the behavior of the plant vary over time, such an optimization is performed repeatedly during plant operation, which is called real-time optimization (RTO) [1,2]. At each call of the optimizer, the problem at hand is a constrained nonlinear program (NLP), which is usually solved by gradient-based algorithms, e.g., SQP solvers [3,4] or interior point algorithms [5,6]. These solvers compute local optima, which is sufficient under the tacit assumption that only minor adaptations are made and the problem is convex in the region of interest. As far as the model-based optimization is concerned, in principle other optimization methods can also be applied, e.g., derivative-free methods [7,8], population-based methods [9], or nature-inspired algorithms [10]. While efficient algorithms are available for the solution of the optimization problems, a major practical problem is the mismatch between the model and the true behavior of the plant. No matter how the optimum is computed, if the model and the behavior of the plant do not match, the computed optimum will not be the true optimum of the plant, and the solution may even violate the constraints of the plant.
Of course, the model could be discarded completely and a brute-force optimization could be performed using any derivative-free or meta-heuristic algorithm, but this implies performing a large number of exploratory moves on the real plant, which is highly undesirable. In particular, large constraint violations of internal variables must be avoided, which is impossible if the optimization is completely model-free. Therefore, a combination of model-based and data-based optimization is advantageous. The classical approach to this problem is to add a parameter estimation element, called the two-step approach [11].
In the two-step approach, some model parameters are updated repeatedly using least-squares estimation and the optimization problem with the updated model parameters is solved to identify the process optimum. The performance of the two-step approach is discussed in detail in Chen and Joseph [12]. The two-step approach is designed to handle parametric plant-model mismatch, which means that the model can, with the correct parameter values, represent the true plant exactly. However, there is always also structural plant-model mismatch, i.e., even with optimized parameters, the predictions of the model are inaccurate. This issue was addressed by the "Integrated system optimization and parameter estimation" (ISOPE) approach proposed in Roberts [13]. In the ISOPE scheme, in addition to updating the model parameters in each iteration, the objective function of the model-based optimization problem is modified iteratively by adding bias and gradient correction terms so that the solution converges iteratively to the optimum of the real plant. These correction terms (also called modifiers) are estimated from the available measurements.
Later, Tatjewski [14] showed that the model parameters do not have to be updated in each iteration to converge to the process optimum. Based on this insight, Tatjewski [14] simplified the ISOPE approach and proposed the redesigned-ISOPE scheme, which omits the parameter-update step of the ISOPE approach. The redesigned-ISOPE scheme was extended in Gao and Engell [15] to handle process-dependent constraints by adding bias and gradient correction terms also to the constraint functions, and was termed iterative gradient modification optimization (IGMO). The IGMO scheme of Gao and Engell [15] was analyzed in detail in Marchetti et al. [16], where it was given the name modifier adaptation (MA).
The efficiency of MA-based iterative RTO methods depends on the accuracy of the gradient modifiers, which contain the gradients of the response of the cost function and of the constraints to the inputs. As the plant gradients usually cannot be measured directly, they have to be computed from the available measurements. The traditional way of doing this is to use finite differences, preferably employing the past input moves as in Gao and Engell [15], where this was enhanced with probing moves if the gradient estimation problem becomes ill-conditioned. However, in finite differences-based schemes, a trade-off between the approximation error and the effect of measurement noise has to be found. Therefore, Gao et al. [17] proposed to use quadratic approximation (QA) for the approximation of the plant gradients, which is more robust to measurement noise than finite differences (FD) or Broyden's formula [18]. The resulting modifier adaptation with quadratic approximation (MAWQA) scheme combines elements of MA, QA, and derivative-free optimization (DFO) [7].
For all adaptation-based iterative RTO methods, it is a prerequisite that the adapted process model is adequate at the (unknown) optimum of the real plant [1,16,19-21]. A model is considered adequate if the Lagrangian of the model-based optimization problem satisfies the necessary second-order conditions of optimality at the optimum of the real process. In Faulwasser and Bonvin [22], it was proposed to use second-order correction terms (modifiers) in the objective and constraint functions to match the Hessian of the nominal model to that of the plant. However, it is difficult to estimate the plant Hessian [23] from real data. Model adequacy was addressed in François and Bonvin [24] by using a convex model approximation of the objective and constraint functions. This may, however, slow down the rate of convergence to the process optimum. Ahmad et al. [25] addressed model adequacy by updating some model parameters only if the model adequacy conditions are satisfied at the current operating input. In Gottu Mukkula and Engell [26], a guaranteed model adequacy (GMA) scheme for MAWQA was proposed, where model adequacy is ensured by strictly convex quadratic approximations of the objective and constraint functions if the goal of the optimization is to minimize an objective function. The MAWQA with GMA scheme may converge to the process optimum in fewer iterations than MAWQA [26].
No matter how the optimization model is corrected, by parameter adaptation, by modifier adaptation, or by a combination of both [23], the solution of the resulting model-based optimization problem can be performed by any algorithm. In our work, we use gradient-based algorithms, as the cost functions and the constraints are smooth, no discrete variables are involved, and guaranteed (local) optimality is important.
Modifier adaptation-based iterative RTO methods require steady-state process measurements to compute a new input, which imposes a lower limit on the time between iterates. This can be alleviated by the use of transient process measurements [27-31], but this is outside the scope of this paper. In addition to the time required to reach a steady-state, the convergence to a process optimum may also be slowed down by delays in obtaining measurement information. Considerable measurement delays may occur due to the time required for the measurement device (e.g., gas chromatography) to analyze the sample and due to the remote positioning of a measurement device. Figure 1 illustrates a situation where the time delay is caused by the remote positioning of the measurement device for safety reasons, e.g., due to harsh conditions at the plant. Measurement devices like NMR have to be placed in a certified (e.g., ATEX, IECEx) enclosure located at a distance from the process equipment. The thin, long tubing that carries the sample from the plant to the remotely positioned measurement device can cause a significant delay, and this is quite common in the process industries. Gottu Mukkula et al. [32,33] proposed two schemes in which additional plant perturbations are performed during the waiting period. The purpose is to gain additional information about the process instead of remaining idle until the effect of a new operating input has propagated through the measurement setup. Here, we propose a strategy in the context of MA that proactively schedules the input moves and thereby significantly reduces the effect of the time delay caused by the remote positioning of a measurement device. The input perturbations in the proposed method are computed by solving an optimization problem using the latest measurement information.

The rest of the paper is organized as follows. Firstly, the MA scheme used in this paper is presented in detail. Then, the previously proposed approaches to handle the time delay due to the positioning of a measurement device are presented. Thereafter, the new proactive perturbation scheme to effectively handle the time delay is described. Finally, the performance of the proposed scheme is analyzed using the Williams-Otto reactor benchmark and a lithiation reaction case study.

Model
Consider the steady-state mathematical models $y = F_p(u)$ and $\hat{y} = F_m(u)$, where the former is the exact representation of a process and the latter is the nominal model that was built to represent the underlying process. The mapping functions $F_p: \mathbb{R}^{n_u} \rightarrow \mathbb{R}^{n_y}$ and $F_m: \mathbb{R}^{n_u} \rightarrow \mathbb{R}^{n_y}$ map the $n_u$-dimensional vector of manipulated variables $u$ to the $n_y$-dimensional vectors of measured variables $y$ and $\hat{y}$. Let $J: \mathbb{R}^{n_u} \times \mathbb{R}^{n_y} \rightarrow \mathbb{R}$ represent the objective function to be minimized, which is continuous and twice differentiable with respect to $u$, and let $G: \mathbb{R}^{n_u} \times \mathbb{R}^{n_y} \rightarrow \mathbb{R}^{n_c}$ be an $n_c$-dimensional vector of constraint functions, also continuous and twice differentiable with respect to $u$, which may include process, safety, and quality restrictions.
An optimal input of the process lying within the bounds $u_L$ and $u_U$ can in theory be obtained by solving
$$u_p^* = \arg\min_{u_L \le u \le u_U} J(y, u) \quad \text{s.t.} \quad G(y, u) \le 0, \quad y = F_p(u), \tag{1}$$
but the plant mapping (1b), $y = F_p(u)$, is not available. The optimum $u_m^*$ results from solving the optimization problem using the nominal model,
$$u_m^* = \arg\min_{u_L \le u \le u_U} J(\hat{y}, u) \quad \text{s.t.} \quad G(\hat{y}, u) \le 0, \quad \hat{y} = F_m(u), \tag{2}$$
and may differ from $u_p^*$ considerably if $F_m \neq F_p$. To simplify the notation, we from here on replace $J(y, u)$, $G(y, u)$ by $J_p(u)$, $G_p(u)$ and $J(\hat{y}, u)$, $G(\hat{y}, u)$ by $J_m(u)$, $G_m(u)$.
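To make the distinction between the plant optimum and the model optimum concrete, the following sketch contrasts the two for a hypothetical one-dimensional example; the mappings below are toy stand-ins, not taken from the case studies in this paper:

```python
from scipy.optimize import minimize

# Hypothetical toy mappings (one input, one output): F_p is the "true"
# plant response, F_m an inaccurate nominal model of it.
def F_p(u):   # plant
    return (u - 2.0) ** 2 + 1.0

def F_m(u):   # nominal model (plant-model mismatch)
    return (u - 1.0) ** 2 + 0.5

# With the cost J(y, u) := y, problems (1) and (2) reduce to minimizing
# the respective response over u in [u_L, u_U].
bounds = [(0.0, 4.0)]
u_star_p = minimize(lambda u: F_p(u[0]), x0=[0.5], bounds=bounds).x[0]
u_star_m = minimize(lambda u: F_m(u[0]), x0=[0.5], bounds=bounds).x[0]
# u_star_m differs from u_star_p because F_m != F_p
```

Applying the inputs computed from the nominal model thus leaves the plant at a suboptimal operating point, which is what the correction schemes below address.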

Modifier Adaptation
Modifier adaptation handles plant-model mismatch by adding and iteratively updating a gradient correction term to the objective function, and bias and gradient correction terms to the constraint functions of the model-based optimization problem (2), as described in Gao and Engell [15] and Marchetti et al. [16]. In the $k$th iteration, the modified objective and constraint functions of the model-based optimization problem, i.e., the MA problem, are
$$J_{ad}^k(u) := J_m(u) + \left(\nabla J_p^k - \nabla J_m^k\right)^T \left(u - u^k\right), \tag{3a}$$
$$G_{ad}^k(u) := G_m(u) + \left(G_p^k - G_m^k\right) + \left(\nabla G_p^k - \nabla G_m^k\right)^T \left(u - u^k\right), \tag{3b}$$
where $\nabla J_p^k$, $\nabla J_m^k$ are the gradients of the plant objective function $J_p(u)$ and of the nominal model objective function $J_m(u)$ with respect to the process input at the $k$th iteration. Similarly, $\nabla G_p^k$, $\nabla G_m^k$ represent the gradients of the plant constraint functions $G_p(u)$ and of the nominal model constraint functions $G_m(u)$ with respect to the process input at the $k$th iteration. The modified optimization problem that is solved in the $k$th iteration is
$$\hat{u}^{k+1} = \arg\min_{u_L \le u \le u_U} J_{ad}^k(u) \quad \text{s.t.} \quad G_{ad}^k(u) \le 0, \tag{4}$$
where the determined optimum $\hat{u}^{k+1}$ is the $(k+1)$th input which is applied to the plant. The gradients of the plant objective function $\nabla J_p^k$ and of the plant constraint functions $\nabla G_p^k$ can be obtained, for example, using finite differences, for which additional plant perturbations are required. Methods to approximate the plant gradients using the past inputs and steady-state measurements, without the need for additional plant perturbations, are proposed in Roberts [18], Brdyś and Tatjewski [34], and Gao et al. [35].
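A minimal sketch of one MA iteration for a hypothetical one-dimensional problem may look as follows; the toy functions, the bounds, and the use of finite differences for the plant gradients are assumptions for illustration only:

```python
from scipy.optimize import minimize

# Hypothetical plant and model (1-D toy functions, for illustration only).
J_p = lambda u: (u - 2.0) ** 2          # plant cost (would be measured)
J_m = lambda u: (u - 1.0) ** 2          # mismatched model cost
G_p = lambda u: u - 3.0                 # plant constraint, G_p(u) <= 0
G_m = lambda u: u - 2.5                 # model constraint

def fd_grad(f, u, h=1e-4):
    """Central finite-difference gradient (one way to get plant gradients)."""
    return (f(u + h) - f(u - h)) / (2.0 * h)

def ma_step(u_k):
    """Build the modified functions (3) at u_k and solve problem (4)."""
    lam_J = fd_grad(J_p, u_k) - fd_grad(J_m, u_k)   # gradient modifier
    eps_G = G_p(u_k) - G_m(u_k)                     # bias modifier
    lam_G = fd_grad(G_p, u_k) - fd_grad(G_m, u_k)
    J_ad = lambda u: J_m(u[0]) + lam_J * (u[0] - u_k)
    G_ad = lambda u: G_m(u[0]) + eps_G + lam_G * (u[0] - u_k)
    res = minimize(J_ad, x0=[u_k], bounds=[(0.0, 5.0)], method="SLSQP",
                   constraints=[{"type": "ineq", "fun": lambda u: -G_ad(u)}])
    return res.x[0]

u = 0.5                       # initial operating point
for _ in range(10):           # iterate until the modifiers stop changing
    u = ma_step(u)            # converges to the plant optimum
```

In a real application, `J_p` and `G_p` would be replaced by (noisy) steady-state measurements, which is precisely why the time to obtain them matters.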
In MAWQA, the gradients of the plant objective function ($\nabla J_p^k$) and of the plant constraint functions ($\nabla G_p^k$) are obtained using surrogate QA models as long as the cardinality condition, i.e., the minimum number of data points required to identify all the parameters of the QA function, is fulfilled. A minimum of $n_{min} := \frac{(n_u + 1)(n_u + 2)}{2}$ data points, i.e., process inputs and their corresponding steady-state process measurements, are required to fit a surrogate QA function $Q$ for the plant objective function and for each of the plant constraint functions. Upon satisfying the cardinality condition, $\nabla J_p^k$ and $\nabla G_p^k$ are obtained by evaluating the analytical derivatives of the surrogate QA models at $u^k$. A surrogate QA model can be defined as
$$Q(u) := \sum_{i=1}^{n_u} \sum_{j=i}^{n_u} a_{i,j}\, u_i u_j + \sum_{i=1}^{n_u} b_i u_i + c,$$
where $p := \{a_{1,1}, \ldots, a_{n_u,n_u}, b_1, \ldots, b_{n_u}, c\}$ are the parameters of the surrogate QA model. To obtain a surrogate QA model for each of the objective and constraint functions, a subset $\mathbb{U}^k$ of at least $n_{min}$ data points is selected from the set $\mathcal{U}^k$ of all data points available until the $k$th iteration. Wenzel et al. [36] reported a comparative study of existing methods to screen $\mathbb{U}^k$ from $\mathcal{U}^k$. In general, the screening methods attempt to include distant data points which are well distributed around $u^k$ ($\mathbb{U}^k_{dist}$) and can act as anchor points for the QA. They also include all neighboring data points ($\mathbb{U}^k_{nb}$), which lie within the inner circle around $u^k$. The inner circle is defined as an $n_u$-dimensional sphere around $u^k$ with the tuning parameter $\Delta u$ as its radius. As proposed in Gao et al. [17], the quality of the distribution of the data points in $\mathbb{U}^k_{dist}$ can be evaluated using a so-called conditionality-check criterion: the inverse of the condition number of $s^k$, $\kappa^{-1}(s^k)$, has to be greater than a desired value $\delta$. Here, $s^k$ is defined as
$$s^k := [\mathbb{U}^k_{dist}] - [u^k],$$
where $[\mathbb{U}^k_{dist}]$ and $[u^k]$ are matrix representations of the set $\mathbb{U}^k_{dist}$ and of $u^k$.
If the conditionality-check criterion fails, additional plant perturbations are performed and added to $\mathcal{U}^k$ to improve the distribution of the anchor points used for fitting the QA function.
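The fit of a surrogate QA model, the evaluation of its analytical gradient, and the conditionality check can be sketched as follows for $n_u = 2$; the sampled plant cost and the sampling points are hypothetical:

```python
import numpy as np

def fit_quadratic(U, y):
    """Least-squares fit of Q(u) = a11*u1^2 + a12*u1*u2 + a22*u2^2
    + b1*u1 + b2*u2 + c for n_u = 2 (6 parameters, so n_min = 6)."""
    u1, u2 = U[:, 0], U[:, 1]
    X = np.column_stack([u1**2, u1 * u2, u2**2, u1, u2, np.ones(len(U))])
    p, *_ = np.linalg.lstsq(X, y, rcond=None)
    return p

def quad_gradient(p, u):
    """Analytical gradient of the fitted surrogate at u."""
    a11, a12, a22, b1, b2, _ = p
    return np.array([2 * a11 * u[0] + a12 * u[1] + b1,
                     a12 * u[0] + 2 * a22 * u[1] + b2])

# Hypothetical plant cost, sampled at more than (n_u+1)(n_u+2)/2 = 6 points.
plant = lambda u: (u[0] - 1.0) ** 2 + 2.0 * (u[1] + 0.5) ** 2
rng = np.random.default_rng(0)
U = rng.uniform(-2.0, 2.0, size=(10, 2))
y = np.array([plant(u) for u in U])

p = fit_quadratic(U, y)
g = quad_gradient(p, np.array([0.0, 0.0]))   # analytic value is [-2, 2]

# conditionality check on the spread of the points around u^k = 0:
# the inverse condition number of s^k must exceed a threshold delta
s = U - np.zeros(2)
inv_cond = 1.0 / np.linalg.cond(s)
```

Because the sampled function here is itself quadratic, the fit recovers the gradient exactly; with noisy measurements, the least-squares fit averages out the noise, which is the robustness advantage of QA over finite differences mentioned above.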
As the data points in $\mathbb{U}^k$ consist of input variables positioned locally around $u^k$, the QA functions fitted using $\mathbb{U}^k$ are also local approximations of $J_p$ and $G_p$. In MAWQA, this is taken into account by restricting the process input for the next iteration, $\hat{u}^{k+1}$, obtained by solving the optimization (4), to lie inside an ellipsoidal trust region; a corresponding constraint is added to the modifier adaptation problem in (4), where the tuning parameter $\gamma$ scales the trust region [17]. In addition, there are also elements of DFO [7], namely the criticality check, the quality check, and a possible switch to an optimization based on the quadratic approximation model. We refer the readers to Gao et al. [17,37] for a detailed discussion of MAWQA.
Convergence to the plant optimum is assumed when an evaluation function $C$, which is computed from the approximated plant gradients $\nabla J^i_{p,j}$ (the gradient of the plant objective function with respect to the $j$th input variable at the $i$th iterate) over the iterations $i \in \{k - N_c, \ldots, k\}$, satisfies $C \le |\varepsilon J^k_{p,best}|$ [38]. The tuning parameter $N_c$ is the number of iterations for which the condition $C \le |\varepsilon J^k_{p,best}|$ has to be satisfied continuously. $J^k_{p,best}$ is the best value of the objective function realized by the plant up to the $k$th iteration, and $\varepsilon$ is a tuning parameter that must be chosen taking into account the measurement noise and the required tolerance of the value of the optimum.
Figure 1 illustrates a situation that typically occurs in the process industries when MA-based RTO methods are implemented to optimize a process. Consider a process that reaches a steady-state for constant inputs within $\tau_p$ time units and a measurement device that requires a maximum of $\tau_m$ time units to analyze a sample. An additional $\tau_d$ time units are required for the sample to be transported to the measurement device, making the total time to obtain steady-state measurements after a change of the inputs $\tau_t := \tau_p + \tau_d + \tau_m$. If the measurement device is located at a remote position, this leads to a waiting period. Figure 2 illustrates the waiting period in an iteration of a MA algorithm. If an input $u^{k+1}$ is given to the plant at time $t$, its corresponding steady-state plant measurement becomes available at time $t + \tau_t$. The time between iterations in MA is thus at least $\tau_t$ time units if the plant gradients are computed using past inputs and measurements, and even larger if additional perturbations are performed. During the waiting period, the plant remains at a steady-state and no additional process information is gained, hence the convergence to the optimum steady-state is slowed down. Gottu Mukkula et al.
[32,33] proposed active perturbation strategies that perform input perturbations during the waiting period to gain additional process information, which is used later to drive the process to its optimum faster. Figure 2 illustrates the active perturbation scheme where additional perturbing inputs are given to the process, for example, between $t + \tau_t + \tau_p$ and $t + 2\tau_t$, to gain additional process information that can be used in the subsequent iterations, once the measurements of the effects of the input perturbations arrive. In general, the maximum possible number of input perturbations that can be given to a process within the waiting period can be computed as [32,33]
$$P_{max} = \left\lfloor \frac{\tau_d + \tau_m}{\max\{\tau_p, \tau_m\}} \right\rfloor.$$
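Under the assumption that each perturbation occupies $\max\{\tau_p, \tau_m\}$ time units of the waiting period $\tau_d + \tau_m$ (a simplified reading that is consistent with the numerical example in the results section), $P_{max}$ can be computed as:

```python
import math

def max_perturbations(tau_p, tau_d, tau_m):
    """Number of extra inputs that fit into the waiting period tau_d + tau_m
    if each perturbation occupies max(tau_p, tau_m) time units (assumed
    form, consistent with the paper's numerical examples)."""
    return math.floor((tau_d + tau_m) / max(tau_p, tau_m))

P_max = max_perturbations(3, 12, 3)   # Williams-Otto settings -> 5
```

With the delay values used later for the Williams-Otto example ($\tau_p = 3$, $\tau_d = 12$, $\tau_m = 3$ min), this yields the $P_{max} = 5$ used there.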

Active Perturbation Strategies
In the following subsections, we present the active perturbation strategies proposed in Gottu Mukkula et al. [32,33]. Both strategies are illustrated using an example case with two input variables and by assuming the maximum possible number of perturbations per iteration ($P_{max}$) to be 4.
Figure 2. Illustration of the standard modifier adaptation method and of the active perturbation method.

Active Perturbation around the Current Input (APCI)
According to the method proposed in Gottu Mukkula et al. [32], perturbing inputs around the input of the current iteration, i.e., $u^k$, are given to the process during the waiting period. Figure 3 illustrates this active perturbation method. Let $u^k$ be the input given to the plant in the $k$th iteration at time $t$. The plant reaches its steady-state at $t + \tau_p$, and the steady-state measurements for $u^k$ are available at $t + \tau_t$. During the waiting period, i.e., from $t + \tau_p$ to $t + \tau_t$, $P_{max} = 4$ additional perturbations, represented by circles in the figure, are performed around $u^k$. Redundant probing is avoided by suppressing perturbation inputs that are closer than a threshold $\xi$ to one of the earlier inputs. For example, one of the perturbing inputs in Figure 3 is suppressed as it is in close vicinity to $u^{k-1}$ and would therefore be redundant. During the remaining waiting period, which becomes available due to the suppression of some perturbations, the plant is given the best known input ($u^k_{p,best}$) to avoid a loss of performance. As the perturbing inputs are not computed optimally, not all of them provide useful information. For example, in Figure 3, only the perturbation inputs marked by ⊕ are used by the QA scheme for computing the gradient at $u^{k+1}$.

Active Perturbation around an Estimate of the Next Input (APENI)
Here, the process is perturbed around an estimate of the next input, $\hat{u}^{k+1}_e$, instead of around the current input $u^k$. Due to the unavailability of the measurements for the input $u^k$ during the waiting period, the plant gradients at $u^k$ cannot be computed. The next input $u^{k+1}$ therefore cannot be calculated precisely and is instead estimated using the following assumption: if $u^k$ is close to $u^{k-1}$, the difference between the plant and the model gradients at $u^k$ is approximately equal to the gradient difference at $u^{k-1}$. The same is assumed for the bias correction term $(G^k_p - G^k_m)$. In other words, the modifiers of the optimization problem are unchanged. The input of the next iteration can then be estimated by solving a modified MA problem in which the modifiers of iteration $k-1$ are reused. The problem for computing $\hat{u}^{k+1}_e$ is the same as the problem (4) for computing $u^{k+1}$, except that the modifiers are those of iteration $k-1$ and the trust region is now centered around $u^k$. In Figure 4a, an ideal scenario for an active perturbation around the next input is illustrated. In the ideal case, the assumption on the modifiers is valid, i.e., $\nabla J^k_p - \nabla J^k_m = \nabla J^{k-1}_p - \nabla J^{k-1}_m$, so $\hat{u}^{k+1}_e = u^{k+1}$, and therefore the measurement information from the perturbed inputs is used to compute the inputs in the upcoming iterations. Figure 4b illustrates a nonideal scenario where the assumption about the modifiers does not hold, so $\hat{u}^{k+1}_e \neq u^{k+1}$. Therefore, as shown in Figure 4b, not all measurement information from the perturbing inputs may be useful in the upcoming iterations. Redundant perturbation inputs are suppressed similarly to the active perturbation around the current input (APCI) method.
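The APENI idea of reusing the modifiers of iteration $k-1$ can be sketched for a hypothetical one-dimensional model; the functions and numbers below are illustrative assumptions, not the method's exact implementation:

```python
# Toy nominal model cost (hypothetical): J_m(u) = (u - 1)^2.
def grad_J_m(u):
    return 2.0 * (u - 1.0)

def estimate_next_input(u_k, lam_prev, bounds=(0.0, 5.0)):
    """Estimate u^{k+1} by reusing the gradient modifier lam_prev
    (= grad J_p - grad J_m evaluated at u^{k-1}) in the modified objective
    J_m(u) + lam_prev * (u - u_k), then clipping to the input bounds."""
    # stationary point: grad_J_m(u) + lam_prev = 0  =>  u = 1 - lam_prev / 2
    u = 1.0 - lam_prev / 2.0
    return min(max(u, bounds[0]), bounds[1])

# If the plant gradient at u^{k-1} exceeded the model gradient by -2,
# the estimated next input is u = 2, without waiting for the
# measurements of u^k to arrive.
u_next_est = estimate_next_input(u_k=0.8, lam_prev=-2.0)
```

The perturbations for the waiting period are then placed around `u_next_est` rather than around the current input.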

Proactive Perturbation Scheme
Although the perturbation schemes proposed in Gottu Mukkula et al. [32,33] gain additional measurement information by perturbing the process during the waiting period, they are suboptimal as they are based on heuristics. Moreover, the measurement information gained from the additional input perturbations is not used immediately. Therefore, in this paper, we propose a proactive perturbation scheme where the additional perturbation inputs are computed by solving an optimization problem using all available process information. The key idea of the proactive perturbation scheme is illustrated in Figure 5 using an example case with $n_u = 1$, $P_{max} = 5$, and where the plant gradients are estimated using QA. At $t := 0$, an input $u^0$ is given to the process; the corresponding measurements become available after time $\tau_t$. In the initial (0th) iteration, FD is used for gradient approximation as there are no past data points. Therefore, after applying $u^0$ for $\max\{\tau_p, \tau_m\}$ units of time, the process is operated with $u^0_{fd_1}$ for gradient approximation; the corresponding measurements are available at time $\zeta := \tau_t + (n_u \times \max\{\tau_p, \tau_m\})$, and a new iteration input $u^6$ is computed by solving the MA problem (4). During the waiting period of the 0th iteration, i.e., from $(n_u + 1) \times \max\{\tau_p, \tau_m\}$ to $\zeta$, $P_{max}$ additional perturbation moves ($u^1$ to $u^5$) around $u^0$ are performed to gain additional knowledge about the process. For $n_u := 1$, steady-state measurements for at least $n_{min} := 3$ different inputs are required for the process gradients to be approximated using QA (cardinality condition). We note that $n_{min}$ varies if other methods are used for plant gradient estimation using past data. In this example, after time $\zeta + \max\{\tau_p, \tau_m\}$, there are already measurements for $n_{min}$ inputs. From here on, an MA problem is solved each time a new measurement is available, as the plant gradients are computed using past inputs and measurements.
From this point onward, each input to the process is optimal with respect to the available process knowledge, which makes the proposed proactive perturbation scheme more efficient than the active perturbation methods proposed in Gottu Mukkula et al. [32,33]. A flowsheet for the implementation of the proposed proactive perturbation scheme is presented in Figure 6. In the flowsheet, $\phi$ is the total number of inputs given to the plant, including the inputs whose steady-state measurements are not yet available, and $\psi$ is the total number of inputs given to the plant for which the steady-state plant measurements are already available. The major steps in the flowsheet are:
1. perturbation input for gradient estimation using FD;
2. perturbation input for gradient estimation using FD, also considered as an input ($u^k$) for the MA problem formulation;
3. perturbation input to gain additional information during the waiting period;
4. perturbation input to gain additional information during the waiting period, also considered as an input ($u^k$) for the MA problem formulation;
5. input ($u^{k+1}$) computed by solving an MA problem (4) with plant gradients approximated using FD;
6. perturbation input considered as an input ($u^k$) for the MA problem formulation, to ensure a continuous flow of steady-state measurements every $\max\{\tau_p, \tau_m\}$ units of time;
7. input ($u^{k+1}$) computed by solving an MA problem with gradients approximated from past (at least $n_{min}$) measurements.
In all the active perturbation strategies discussed in this section, to avoid frequent probing of the plant with unsuccessful inputs and a resulting deterioration of the plant performance, a list of unsuccessful moves is stored. An input $u^k$ from MA is applied to the plant only if the number of past inputs on the list that are not farther than $\varphi$ from $u^k$ is less than a predefined number $\chi_{count}$. The parameters $\varphi$ and $\chi_{count}$ are chosen depending on the expected level of measurement noise.
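The pipelining effect that the flowsheet achieves can be illustrated with a small timing sketch (a hypothetical helper, using the delay values of the Williams-Otto example): once the measurement pipeline is full, a fresh steady-state measurement arrives every $\max\{\tau_p, \tau_m\}$ minutes even though each individual measurement is delayed by $\tau_t$.

```python
def proactive_timeline(tau_p, tau_d, tau_m, n_inputs):
    """Apply a new input every max(tau_p, tau_m) minutes; each steady-state
    measurement arrives tau_t = tau_p + tau_d + tau_m minutes after its
    input was applied (simplified timing model of the proactive scheme)."""
    step = max(tau_p, tau_m)
    tau_t = tau_p + tau_d + tau_m
    apply_t = [k * step for k in range(n_inputs)]
    arrive_t = [t + tau_t for t in apply_t]
    return apply_t, arrive_t

# Williams-Otto delay values: tau_p = 3, tau_d = 12, tau_m = 3 min.
apply_t, arrive_t = proactive_timeline(3, 12, 3, 6)
# consecutive measurements arrive max(tau_p, tau_m) = 3 min apart
gaps = [b - a for a, b in zip(arrive_t, arrive_t[1:])]
```

The first measurement still takes the full $\tau_t = 18$ min, but after that the transport and analysis delays no longer limit the iteration rate.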

Process Description
The Williams-Otto reactor case study [39] is a benchmark that is widely used to analyze the performance of RTO methods. Reactants A and B react in a CSTR to produce the products E and P, and a side product G. While the reaction in the plant follows a three-step reaction mechanism, the nominal model considers a two-step reaction mechanism, ignoring the formation of an intermediate component C, leading to a structural plant-model mismatch. The reaction mechanisms of the plant and of the nominal model are:

Plant:
$$A + B \rightarrow C, \qquad B + C \rightarrow P + E, \qquad C + P \rightarrow G$$
Nominal model:
$$A + 2B \rightarrow P + E, \qquad A + B + P \rightarrow G$$
The goal is to maximize the profit function
$$J(y, u) := (1143.38\, x_{P,ss} + 25.92\, x_{E,ss})\, F_R - 76.23\, F_A - 114.34\, F_B,$$
where $x_{P,ss}$, $x_{E,ss}$ are the measured steady-state concentrations of the products P and E, and $F_R$ is the sum of the feed flow rates of the reactants A and B, i.e., $F_R := F_A + F_B$. For the purpose of illustrating the performance of the proposed proactive perturbation scheme, we consider $\tau_p$, $\tau_d$, and $\tau_m$ to be 3, 12, and 3 min, respectively.
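As a sketch, the profit function with the coefficients given above can be written as follows (the function name and argument order are our own):

```python
def wo_profit(x_P_ss, x_E_ss, F_A, F_B):
    """Profit J for the Williams-Otto case study, using the coefficients
    stated above; F_R is the total feed flow rate F_A + F_B."""
    F_R = F_A + F_B
    return (1143.38 * x_P_ss + 25.92 * x_E_ss) * F_R \
           - 76.23 * F_A - 114.34 * F_B
```

In the RTO loop, `x_P_ss` and `x_E_ss` would be the (delayed, noisy) steady-state concentration measurements, so the measured profit inherits the measurement delay discussed above.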

Simulation Results
The modifier adaptation scheme MAWQA with GMA [26] is used in the simulation study to compare the active perturbation around the current input, the active perturbation around an estimate of the next input, and the proposed proactive perturbation scheme, using the tuning parameters listed in Table 1. In MAWQA with GMA, the plant gradients are estimated from past inputs and steady-state measurements using QA. All schemes are initialized at $u^0 := [3, 100]$. For the simulation study, the measured plant profit is corrupted by Gaussian noise with zero mean and a standard deviation of 1.0; the accepted variance of $J$ for convergence is $\varepsilon := 0.01$ (cf. Table 1). Figure 7 illustrates the evolution of the (scaled) inputs and of the plant profit function for MAWQA with GMA, MAWQA with GMA with the active perturbation schemes APCI and APENI, and MAWQA with GMA with the proposed proactive perturbation scheme. Descriptions of all markers in the figure are given in Table 2. The axis for the value of the profit function can be found on the right side of the plot and the axis for the (scaled) inputs to the plant on the left. Each vertical grid line in the plot represents the completion of an iteration, which includes all steps from giving an input to the process to computing a new input by solving a modifier adaptation problem. For MAWQA with GMA and MAWQA with GMA with the active perturbation schemes APCI and APENI, after applying $u^0$ at $t = 0$, two input steps (perturbations) with a scaled step length $\Delta h_u$ are performed to compute the plant gradients using finite differences. Each input is applied to the plant until it reaches a steady-state, i.e., for $\tau_p := 3$ min. The steady-state plant measurement for each applied input is obtained $\tau_t := 18$ min after applying the input. Upon computing the plant gradients from the steady-state plant measurements, the modifier adaptation problem (4) is solved to compute the input $u^1$.
As the cardinality condition is not met, i.e., $|\mathcal{U}^1| < n_{min}$, $n_u := 2$ additional perturbations are performed after $u^1$ to compute the plant gradients. As in the 0th iteration, the MA problem (4) is solved in the 1st iteration to compute $u^2$. From here on, the cardinality condition is always satisfied, and therefore QA is used for the gradient approximation in all further iterations. In MAWQA with GMA, additional input perturbations are made only if the criticality-check criterion is not satisfied, for example in the 2nd iteration after applying $u^2$ to the plant.
Table 2. Description of markers in Figure 7. Input markers in red and black refer to the input variable $u_1$; blue and magenta input markers refer to the input variable $u_2$.

Marker | Description
[ , ]  | successful-iteration input ($[u_1, u_2]$)
[ , ]  | unsuccessful-iteration input ($[u_1, u_2]$)
[ , ]  | perturbation input ($[u_1, u_2]$)
       | value of the plant profit function

In MAWQA with GMA with the active perturbation schemes APCI and APENI, additional perturbations are made once the cardinality condition is satisfied, i.e., from the 2nd iteration onwards, and only if there are no past inputs lying closer than $\xi$ to the scaled planned perturbation inputs. In MAWQA with GMA with APCI, the additional perturbations are stopped from the 11th iteration, as from $k := 11$ there are no past inputs lying farther than $\xi$ from all the planned perturbation inputs. In MAWQA with GMA with APENI, the perturbation inputs in the $k$th iteration are chosen around an estimate of the iteration input for the $(k+1)$th iteration. From iteration $k := 10$ onwards, all planned perturbation inputs are not farther than $\xi$ from earlier inputs, so all active perturbations are stopped.
In MAWQA with GMA with the proposed proactive perturbation scheme, the steady-state measurements for $u^0$ and for the FD perturbation inputs are available after $\zeta := 24$ min. During the waiting period, i.e., from $(n_u + 1) \times \max\{\tau_p, \tau_m\} := 9$ min to $\zeta := 24$ min, $P_{max} := 5$ additional input perturbations $u^0_{ap_1}, \ldots, u^0_{ap_5}$ are performed to gain additional plant information. As the input perturbations for which $\phi \ge n_{min}$ are considered as inputs from MA (according to the flowsheet in Figure 6), among the 5 additional perturbations, $u^0_{ap_3}, \ldots, u^0_{ap_5}$ are renamed as $u^1$, $u^2$, $u^3$. At 24 min, all these steady-state measurements are available, and therefore an MA problem is formulated and solved to compute the next input, $u^4$. For the next iteration, as $\phi := 9$ and thus $\phi \ge n_{min}$, FD is no longer used for the gradient approximation. However, after applying $u^4$, there are not yet enough measurements for QA, i.e., $\psi := 4$. As $n_{min} - \psi := 2$, the input perturbations $u^5$ and $u^6$ are added to switch smoothly from FD-based to QA-based gradient approximation. After $u^6$, $\psi = n_{min}$, and a new steady-state measurement is available every $\max\{\tau_p, \tau_m\}$ time units, thereby overcoming $\tau_d$ and $\tau_m$.
From here on, a new iteration input is computed every max{τ_p, τ_m} := 3 min, taking into account the latest measurement information available.
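The timing arithmetic of this walkthrough can be sketched as follows. The helper below is a hypothetical illustration; it assumes that ζ is the time at which the first steady-state measurement becomes available and that one perturbation can be applied every max{τ_p, τ_m} minutes:

```python
import math

def waiting_period_perturbations(n_u, tau_p, tau_m, zeta):
    """Number of extra perturbations that fit into the waiting period.

    A new input can be applied every max(tau_p, tau_m) minutes. The idle
    time starts after the initial input and the n_u finite-difference
    inputs have settled, at (n_u + 1) * max(tau_p, tau_m), and ends when
    the first steady-state measurement arrives at zeta.
    """
    step = max(tau_p, tau_m)
    start = (n_u + 1) * step          # FD inputs for gradient estimation applied
    idle = zeta - start               # waiting time until the first measurement
    return max(0, math.floor(idle / step))

# Numbers from this walkthrough: n_u = 2, max{tau_p, tau_m} = 3 min,
# first measurement at zeta = 24 min, so 5 perturbations fit.
print(waiting_period_perturbations(2, 3, 3, 24))  # -> 5
```

This reproduces P_max := 5 for the numbers above: the idle time is 24 − 9 = 15 min, divided by a 3 min step.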
Although all the schemes converged to an input in the neighborhood of the plant optimum, the MAWQA with GMA scheme took the most time, 375 min and 20 iterations, to converge. The time of convergence is indicated by a red dashed vertical line in Figure 7. MAWQA with GMA with APCI and with APENI gained additional plant information from suboptimal input perturbations and therefore converged in 303 min (16 iterations) and 249 min (13 iterations), respectively. Finally, MAWQA with GMA with the proactive perturbation scheme converged in only 123 min, with 31 iterations. It gained more information per unit of time from the additional perturbations than the previously proposed active perturbation schemes, and it also uses the measurement information immediately once it is available.
The mean profit function over 100 simulations for all schemes is shown in Figure 8, together with the mean and standard deviation of the times of convergence for all modifier adaptation schemes. The mean and standard deviation of the profit function did not vary significantly after convergence. With the proactive perturbation scheme, the profit function converges to the plant optimum earliest, followed by the active perturbation schemes APCI and APENI and the standard modifier adaptation scheme. The mean time of convergence for MAWQA with GMA if τ_d = 0, shown in green, is longer than the mean time of convergence of the proposed proactive perturbation scheme, as the proposed scheme also overcomes τ_m in addition to τ_d. This illustrates that the proposed scheme overcame the effect of the time delay (τ_d >> τ_p, τ_m) in this case.
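Summary statistics of this kind, a per-iteration mean together with a percentile band over repeated noisy runs, can be computed along the following lines. The nearest-rank percentile used here is a simplification, and the trajectories are hypothetical:

```python
import statistics

def summarize(runs):
    """Per-iteration mean and nearest-rank 10th/90th percentiles.

    runs: list of equal-length profit trajectories, one per noise
    realization. Returns one (mean, p10, p90) tuple per iteration.
    """
    out = []
    for values in zip(*runs):
        ordered = sorted(values)
        n = len(ordered)
        p10 = ordered[round(0.10 * (n - 1))]   # nearest-rank percentile
        p90 = ordered[round(0.90 * (n - 1))]
        out.append((statistics.fmean(values), p10, p90))
    return out

# Three hypothetical 2-iteration profit trajectories:
print(summarize([[1, 2], [3, 4], [5, 6]]))  # -> [(3.0, 1, 5), (4.0, 2, 6)]
```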
The Friedman ranking test [40] was performed for a ranking comparison of the proposed proactive perturbation scheme with standard MAWQA, and the APCI and APENI schemes based on the time of convergence for 100 realizations of the measurement noise. The Friedman test is a nonparametric statistical test to recognize performance differences among different methods across multiple test samples. For each realization of the measurement noise, the methods are ranked based on their time of convergence. The lowest mean rank for a method implies its consistently superior performance, and vice versa for the highest mean rank. The mean rank of the standard scheme with no additional perturbations is 3.87, for APCI it is 2.6, for APENI 2.51, and the proposed proactive perturbation scheme has a mean rank of 1.02. Thus, the proposed scheme clearly showed a superior performance when compared to that of the other methods.
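The mean-rank computation underlying this comparison can be sketched as follows, as a minimal pure-Python version with average ranks for ties; the convergence times in the example are hypothetical:

```python
def friedman_mean_ranks(times):
    """Mean Friedman rank per method.

    times: list of rows, one per noise realization; each row holds the
    time of convergence of every method for that realization. Methods
    are ranked per row (1 = fastest, ties get the average rank) and the
    ranks are averaged over all realizations.
    """
    def row_ranks(row):
        order = sorted(range(len(row)), key=lambda i: row[i])
        ranks = [0.0] * len(row)
        i = 0
        while i < len(order):
            j = i
            while j + 1 < len(order) and row[order[j + 1]] == row[order[i]]:
                j += 1
            avg = (i + j) / 2.0 + 1.0      # average rank for the tie group
            for k in range(i, j + 1):
                ranks[order[k]] = avg
            i = j + 1
        return ranks

    sums = [0.0] * len(times[0])
    for row in times:
        for m, r in enumerate(row_ranks(row)):
            sums[m] += r
    return [s / len(times) for s in sums]

# Hypothetical convergence times (min) for four schemes over three
# noise realizations (standard, APCI, APENI, proactive):
times = [[375, 303, 249, 123],
         [360, 310, 255, 130],
         [390, 295, 260, 125]]
print(friedman_mean_ranks(times))  # -> [4.0, 3.0, 2.0, 1.0]
```

The Friedman test statistic itself is then computed from these ranks; `scipy.stats.friedmanchisquare` provides a ready-made implementation.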

Process Description
The lithiation case study is a real case where τ_d >> τ_p, τ_m [41]. The iterative real-time optimization method MAWQA was successfully applied to the real process by Gottu Mukkula et al. [41], but there is still scope for improving the time of convergence to the real process optimum due to the large waiting period in the process. In this simulation study, we aim to show that the plant optimum can be identified much earlier with the proposed proactive perturbation scheme.
A lithiation reaction [41] is performed to produce lithium 2-nitrodiphenylamine (component H) in a containerized reactor module developed within the F³-factory project [42] at the INVITE GmbH facility. All chemical components involved in the lithiation reaction are listed in Table 3, and the reaction mechanism is shown in Figure 9. The main modules of the containerized reactor, i.e., the coiled tubular reactor, a NIR flow cell, an online-NMR sensor [43] and a product filter, are shown in Figure 10. The byproduct lithium fluoride (component G) precipitates along the length of the tubular reactor, and the solid G is filtered out before the reaction mixture from the outlet of the coiled reactor is collected in a product tank. A filtered sample of the reaction mixture is continuously fed to an online-NMR measurement system and to a NIR flow cell to measure the concentration of the product in the reaction mixture at the reactor outlet [43].

Figure 9. Detailed reaction mechanism of the lithiation reaction. The names of the chemical components are given in Table 3.

The steady-state plant model was developed assuming negligible dispersion coefficients for concentrations and temperature, where r_i is the reaction rate of component i. The model parameters are provided in Table 4, and the reaction rate constants k_i, i = 1, . . . , 4, depend on the temperature via an Arrhenius-type equation. Although a four-step reaction takes place in the reactor, the nominal model instead considers a single reaction, leading to a structural plant-model mismatch. In the steady-state nominal model, the reaction rates r_i follow the elementary rate law, with the rate constant k defined by an Arrhenius equation, k = 17.8 exp(−E/(R T_R)). The goal is to maximize the plant profit function.
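As a rough illustration of this kind of model, a dispersion-free steady-state plug-flow balance with a single elementary reaction and Arrhenius kinetics can be sketched as below. The lumped reaction A + B -> products and all numerical values except the pre-exponential factor 17.8 are assumptions for illustration, not the identified model from [41]:

```python
import math

def steady_state_profile(cA0, cB0, T, v, L, n=1000, A=17.8, E_over_R=3000.0):
    """Outlet concentrations of a plug-flow (tubular) reactor at steady
    state, neglecting dispersion: v dC/dz = -r(C, T) for an elementary
    reaction A + B -> products, integrated with a simple Euler scheme.

    A mirrors the pre-exponential factor 17.8 from the nominal model;
    E_over_R (activation energy over gas constant, in K) is assumed.
    """
    k = A * math.exp(-E_over_R / T)   # Arrhenius rate constant
    dz = L / n
    cA, cB = cA0, cB0
    for _ in range(n):
        r = k * cA * cB               # elementary rate law
        cA -= r / v * dz
        cB -= r / v * dz
    return cA, cB

# Hypothetical operating point: 1.0 and 1.2 mol/L feeds, 300 K,
# 0.01 m/s flow velocity, 1 m reactor length.
cA, cB = steady_state_profile(1.0, 1.2, 300.0, 0.01, 1.0)
print(cA, cB)
```

By stoichiometry, both reactants are consumed by the same amount, which is a quick sanity check on the integration.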

Simulation Results
All tuning parameters used in the simulation study are listed in Table 1. The RTO schemes are initialized at u^0 := [3.58, 3.58], and the measured variable C_H is corrupted by Gaussian noise with zero mean and a standard deviation of 1.0. Similar to the Williams-Otto reactor case study, model adequacy is enforced using the guaranteed model adequacy (GMA) approach proposed in Gottu Mukkula and Engell [26]. Convergence to the process optimum is assumed in this case study when the profit from N_c consecutive iterations is greater than 1.175 × 10^5 $. Figure 11 illustrates the mean and the 10 to 90 percentile range of the profit function, and the mean time of convergence for 100 simulations of MAWQA with GMA, MAWQA with GMA with the active perturbation schemes APCI and APENI, and MAWQA with GMA with the proposed proactive perturbation scheme. The deviations in the profit function are due to the measurement noise, which affects the computation of the next inputs via the calculation of the quadratic approximation. The figure shows that the proposed proactive perturbation scheme converges faster than MAWQA with GMA and the two active perturbation variants proposed earlier. The reason for this is that the idle time is used better by intelligently scheduling the process inputs, taking into account the time delays to obtain the steady-state process measurements. Figure 12 shows the mean value of the profit function from MAWQA with GMA, MAWQA with GMA with APCI and APENI, and MAWQA with GMA with the proposed proactive perturbation scheme. The mean time of convergence and the standard deviation of the time of convergence for 100 simulations are shown as error bars. In addition to the faster convergence of the proposed proactive perturbation scheme, its standard deviation is also smaller than that of the other schemes.
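The convergence test used here, profit above a threshold for N_c consecutive iterations, can be sketched as a small predicate; the numbers in the example are hypothetical:

```python
def converged(profits, threshold, n_c):
    """True once the profit of the last n_c consecutive iterations all
    exceed the threshold (1.175e5 $ in this case study; n_c is a tuning
    parameter of the scheme)."""
    if len(profits) < n_c:
        return False
    return all(p > threshold for p in profits[-n_c:])

print(converged([1.10e5, 1.18e5, 1.19e5], 1.175e5, 2))  # -> True
```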
Similar to the Williams-Otto reactor case study, the mean time of convergence of the proposed proactive perturbation scheme is even shorter than that of the MAWQA with GMA scheme with τ_d = 0. The mean Friedman rank [40] of the standard scheme with no additional perturbations is 3.64, for APCI it is 2.82, for APENI it is 2.5, and for the proposed proactive perturbation scheme it is 1.06. The time to compute a new input on an Intel(R) Core(TM) i7-4790 CPU @ 3.60 GHz × 4 with a 64-bit Windows 10 operating system ranged between 0.01 and 0.02 s for both case studies. Such a short computation time would enable the use of the proposed proactive perturbation scheme also for processes with very short settling times.

Conclusions
In this paper, we address iterative RTO with plant-model mismatch in the presence of a time delay caused by the positioning of an analytic measurement device at some distance from the place where the process stream is sampled. The time delay leads to a longer time of convergence to the real process optimum for modifier adaptation methods, during which the plant is not operated optimally. Perturbation schemes that handle this time delay by using the period in which one must wait for the transfer of the sample and the analytic measurement for further probing of the plant were proposed earlier, but they rely on heuristics for the choice of the plant perturbations. In this paper, a new proactive perturbation scheme is proposed to handle the time delay, in which additional perturbation inputs are computed by solving an optimization problem using all available process information, in contrast to the earlier proposed active perturbation schemes.
In the proposed proactive perturbation approach, the measurement delay affects the time of convergence only once, at the beginning, until enough plant measurements for gradient approximation using quadratic approximation are available. Thereafter, the proactive perturbation scheme computes a new input by solving a modifier adaptation problem as soon as new measurement information is available, with a sampling time equal to the maximum of the settling time of the plant and the time taken by the measurement device to analyze a sample. The performance of the proposed scheme was evaluated for two case studies and compared with that of the previously proposed active perturbation schemes. The proposed proactive perturbation scheme drives the plant to its optimum significantly faster than the other approaches, and it significantly reduces the effect of the measurement time delay in both case studies, because it collects more useful information from the perturbations. An important issue in the application of MA schemes to real plants is to freeze and restart the adaptation when the input moves are mainly caused by measurement noise. Initial proposals for this can be found in [38], but more research on this is required in the future.

Conflicts of Interest:
The authors declare no conflict of interest.

Symbols
The following symbols are used in this manuscript:

u  input variables
n_u  number of input variables
F_p(u)  true process model
F_m(u)  nominal process model
y  measured variables
ŷ  values of the measured variables estimated using the nominal process model
n_y  number of measured variables
n_c  number of constraint functions in the optimization problem
u*_p  true process optimum
u*_m  optimum computed using the nominal process model
u^L  lower bound of the input variables
u^U  upper bound of the input variables
J(y, u)  cost function in the optimization problem
J_p(u)  cost function computed using true process measurements
J_m(u)  cost function computed using estimated values of the measured variables
J^{ad,k}_m(u)  cost function of the modifier adaptation problem in the kth iteration
J^{ad,k}_{m,e}(u)  cost function of the optimization problem to estimate û^{k+1}
G(y, u)  constraint functions of the optimization problem
G_p(u)  constraint functions evaluated using the true process measurements
G_m(u)  constraint functions evaluated using estimated values of the measured variables
G^{ad,k}_m(u)  constraint functions of the modifier adaptation problem in the kth iteration
G^{ad,k}_{m,e}(u)  constraint functions of the optimization problem to estimate û^{k+1}
∇  gradient operator
Q(p, u)  quadratic function
p  parameters of the quadratic function
number of parameters in the quadratic function
U^k  set of all data points available until the kth iteration of MA
U^k  selected data points for quadratic approximation in the kth iteration of MA
U^k_dist  anchor points in U^k
U^k_nb  neighboring points in U^k
∆u  radius of the inner circle (tuning parameter)
∆h_u  step length for finite differences (tuning parameter)
δ  minimum value for the inverse of the condition number (tuning parameter)
s^k  inverse of the condition number in the kth iteration of MA
û^{k+1}  input variables for the (k + 1)th iteration, computed by solving the MA problem in the kth iteration
u^{k+1}_e  estimated input variables for the (k + 1)th iteration
γ  tuning parameter to scale the trust region
cov  covariance operator
C  evaluation function to identify the convergence of MA to the true process optimum
ε  accepted variance of J_p(u) for convergence (tuning parameter)
J^k_{p,best}  best value of the cost function computed using the plant measurements until the kth iteration of MA
N_c  number of iterations for which the convergence criterion has to be satisfied (tuning parameter)
τ_p  time taken by the process to reach steady-state after a change of the inputs
τ_d  time required for the sample to reach (or be transported to) the measurement device
τ_m  time required for the measurement device to analyze the sample
τ_t  total time to obtain steady-state measurements after a change of the inputs
P_max  maximum possible number of input perturbations in the waiting period
φ  number of inputs given to the process, including the inputs whose measurements are not yet available
ψ  number of inputs given to the process for which the measurements are already available
ξ  threshold to stop active perturbations (tuning parameter)
ϕ  threshold for an unsuccessful iteration (tuning parameter)
χ_count  number of times an unsuccessful input can be probed (tuning parameter)