Offset-Free Economic MPC Based on Modiﬁer Adaptation: Investigation of Several Gradient-Estimation Techniques

: Various offset-free economic model predictive control schemes that include a disturbance model and the modiﬁer-adaptation principle have been proposed in recent years. These schemes are able to reach plant optimality asymptotically even in the presence of plant–model mismatch. All schemes are affected by a major issue that is common to all modiﬁer-adaptation formulations, namely, plant optimality (note that convergence per se does not require perfect plant gradients) requires perfect knowledge of static plant gradients, which is a piece of information not known in most practical applications. To address this issue, we present two gradient-estimation techniques, one based on Broyden’s update and the other one on linear regression. We apply these techniques for the estimation of either the plant gradients or the modiﬁers directly. The resulting economic MPC schemes are tested in a simulation and compared on two benchmark examples of different complexity with respect to both convergence speed and robustness to measurement noise.


Introduction
Economic model predictive control (eMPC) [1] has been proposed to overcome the traditional separation between control and optimization layers in industrial process operations [2,3]. eMPC merges standard MPC with the economic optimization layer that is characteristic of real-time optimization (RTO) [2]. In particular, this combination is obtained by substituting the MPC tracking objective with economic performance [4]. There are many similarities between MPC and eMPC, with one of the main challenges being optimality under plant-model mismatch [3]. Note that the tracking-MPC (tMPC) and RTO communities have dealt with this challenge for many years, and two main solutions have been established. The first one concerns tMPC schemes and consists of augmenting the nominal model with a disturbance model so as to generate a steady-state offset correction to set-point changes [5,6]. A comprehensive review about offset-free tMPC can be found in [7]. On the other hand, RTO formulations have incorporated the so-called modifier-adaptation (MA) idea [8,9]. The essence of the MA strategy is to force the KKT optimality conditions of the modified model to match those of the plant upon convergence by using zeroth-and first-order correction terms [10].
To handle plant-model mismatch in eMPC schemes, a methodology that combines the augmented model used in tMPC [7] and the MA approach used in RTO [8] was first proposed in [11] and labeled "offset-free eMPC" (OF-eMPC). Alternative OF-eMPC formulations have been proposed in recent years [11][12][13]. These formulations can overcome plant-model mismatch and let the controlled plant reach plant optimality upon convergence. This is done with the help of zeroth-and first-order modifier terms that exploit an alternative version of MA, labeled "output MA" (MAy). Instead of modifying the cost function and the constraints as in traditional MA [8], MAy sets the modifier terms on the outputs. A recent comprehensive analysis of MAy and its convergence properties can be found in [14].
Despite its ability to converge to plant optimality in the presence of plant-model mismatch, OF-eMPC suffers from a major caveat to proper MA schemes. Indeed, these schemes require accurate knowledge of static plant gradients with respect to inputs. This represents a major challenge, especially when dealing with noisy measurements. Even the most recent MA investigations, which strive for feasibility improvement [15], faster convergence [16], applicability to periodic systems [17], or large-scale systems [18], are still based on the assumption that the plant gradients are perfectly known. Nevertheless, several approaches for estimating plant gradients have been proposed, as detailed next.
The gradient estimation techniques found in literature can be divided into two classes [10]. The first class, and by far the largest, includes steady-state perturbation methods that rely on steady-state data (e.g., [19][20][21]). These methods are the ones that better replicate the concept of steadiness of RTO iterations. On the other hand, it might be advantageous with slow processes to use transient data and estimate steady-state gradients on the go [22][23][24]. In this case, we speak of dynamic perturbation methods, which exhibit a compromise between convergence speed and accuracy. Concretely, gradient approximations are computed using either a finite-difference approach (see [10] for an overview of available approaches) or some type of system identification [21,24,25]. Exploiting the so-called directional modifier-adaptation concept [20], recent works have shown how privileged directions can ease gradient estimation-thereby making the approach more reactive to plant-model mismatch [26,27]-and how these directions can be computed efficiently [28]. Jeong et al. [29] recently compared the performance of different gradient estimation methods with MA schemes. These methods include multivariate linear regression, partial least-squares regression, and principal component analysis. It is shown that the use of latent-variable space can result in fast convergence and stability near plant optimality.
It is worth noting that researchers have also tried to avoid working with plant gradient estimation. Navia et al. [30] formulated a nested optimization problem, in which the modifiers are computed via gradient-free optimization in the outer loop. Although the nested implementation was proved to be robust in the presence of noisy measurements and capable of reaching plant optimality, it was rather slow. Another modifier-estimation technique based on optimization has been proposed recently in [31]. In this case, the limiting factor becomes the choice of weighting factors in the optimization problem and the speed at which plant optimality can be reached.
This paper aims at investigating alternative solutions to the gradient estimation problem. In particular, we propose two main techniques to estimate static plant gradients, namely, a Broyden's update and linear regression. Note that these techniques have already been used to estimate plant gradients and modifiers in the RTO context, but not in the control context, that is, using fast sampling time and dynamic optimization. Another important advantage over our previous work [21,25] is not requiring repeated local perturbations of the process. Moreover, taking inspiration from the aforementioned "plant gradient-free" methodologies [30,31], we also use the same two techniques to estimate the modifiers directly.
This work represents an extension to [32], but it also goes further by comparing several gradient-estimation techniques. This is important to steer future research on how MA can best be combined with eMPC.
The paper is organized as follows. The problem definition and the constituents of OF-eMPC are detailed in Section 2. Several gradient-estimation techniques based on either plant gradient estimation or modifier estimation are discussed in Section 3, while Section 4 presents two OF-eMPC algorithms. The resulting OF-eMPC schemes are then compared via a numerical example and a standard benchmark case study in Section 5. Finally, conclusions are presented in Section 6.

Preliminaries
The OF-eMPC algorithm described in this paper deals with a general nonlinear plant and cost. Let us detail the main concepts and techniques.

Plant and Cost Specifications
The plant to be optimized is expressed by the following discrete-time nonlinear dynamic system: where x p ∈ R n xp , u ∈ R n u and y p ∈ R n y are the plant states, inputs, and outputs, respectively; x + p are the successor states. The unknown functions f p : R n xp × R n u → R n xp and h p : R n xp → R n y are assumed to be differentiable. The measured plant outputs y p at time k ∈ Z are noted y p,k . Furthermore, the inputs and outputs are bounded, with u min , u max , y min , and y max being the corresponding lower and higher bounds. Plant (1) is said to be optimized when the known and differentiable function e : R n y × R n u → R is minimized. Hence, the triple (x p ,ū ,ȳ p ) that defines the economic optimum equilibrium of Plant (1) can be computed as: subject to We introduce the following assumption about Problem (3)- (7): The triple (x p ,ū ,ȳ p ) is the unique solution to Problem (3)-(7).

Remark 1 (Unknown plant optimum).
Given that the functions f p and h p in Equation (1) are typically unknown, the solution triple of Problem (3)- (7) is also unknown.

Nominal and Augmented Models
We consider the following nominal process model: in which x and x + ∈ R n x denote the current and successor states, respectively. It is assumed that the functions f : R n x × R n u → R n x and h : R n x → R n y are known and differentiable. Augmenting Model (8) linearly with the integral states d ∈ R n d , called disturbances, has proven to be very effective for improving the performance of tMPC under plant-model mismatch [5,6,[33][34][35]. In the general formulation, the effects of the disturbances on the states and outputs are modeled using the matrices B d ∈ R n x ×n d and C d ∈ R n y ×n d : The resulting augmented functions F : R n x × R n u × R n d → R n x and H : R n x × R n d → R n y are assumed to be continuous and consistent with f and h, that is, To guarantee that the augmented model can represent the plant without steady-state offset, we introduce the following assumption (see [35] (Remark 8)): Assumption 2 (Observability). The augmented system (9) is observable.

State and Disturbance Estimation
At each time step, the estimation procedure includes two steps, namely, prediction and filtering. First, using the augmented model (9), one computes the predicted valuesx k ,d k , andŷ k asx wherex * k−1 andd * k−1 are the estimates of x k−1 and d k−1 at time k − 1. Then, using output measurements, these predicted values are filtered appropriately to give the estimatesx * k ,d * k , andŷ * k .
There are different types of estimators. For example, the so-called moving-horizon estimator (MHE) processes past output measurements in an optimization framework [36]. In contrast, a Luenberger observer or Kalman filter uses only the current measurements y p,k to estimate the augmented states from their predicted values. This is the type of estimator used in this study and is detailed next.
The prediction errors at time k are expressed as k = y p,k −ŷ k (11) and used to estimate the augmented states as follows: Note that the matrices K x ∈ R n x ×n y and K d ∈ R n d ×n y must be chosen to form an asymptotically stable observer, which implies n y = n d and K d to be invertible [5,35].

Output Modifiers
The output-correction scheme adopted in this study follows the MAy formulation that was originally developed in the RTO literature [14] and firstly applied in the eMPC framework by [11]. The correction is based on zeroth-and first-order terms that represent the steady-state deviations between plant and model outputs. In our implementation, we use the disturbance estimatesd * as zeroth-order corrections. The first-order modifier matrix at time k is noted Λ k ∈ R n y ×n u , with Λ 0 = 0. Its evolution over time is described by the following filtering equation: where σ is a scalar filter constant ∈ (0, 1], the operators D u (·) are the derivatives of the considered output functions (either g p or g) with respect to the inputs u, andū k−1 are the input steady-state targets computed at the previous iteration by the target calculation described in the next subsection. The plant steady-state input-to-output maps g p : R n u → R n y are unknown, and D u g p (ū k−1 ) must be estimated experimentally from input and output measurements. The model steady-state input-to-output maps g : R n u → R n y are calculated from Model (8) upon imposing the target inputsū k−1 as follows: Remark 3 (Model gradients). Since a linear disturbance model is used, the derivatives of Models (8) and (9) with respect to u are identical.

Target Calculation with Modifiers
The equilibrium triple (x k , u k , y k ) at time k is calculated via steady-state optimization. The modifiers described in Section 2.4 correct the outputs so as to eliminate any firstorder plant-model mismatch. The following steady-state target problem is solved at each iteration: subject to The modifier terms Λ k (u −ū k−1 ) in Equation (17) ensure KKT matching between Problems (3)- (7) and (15)- (19) [11,14].

Economic MPC with Modifiers
The finite-horizon optimal control problem (FHOCP) typical of eMPC is modified similarly to the method in Equation (17). Defining the generic state and input sequences x := {χ 0 , χ 1 , . . . , χ N } and u := {ν 0 , ν 1 , . . . , ν N−1 }, the optimal problem solved over N steps at each decision time k reads subject to The ordered pair (x k , u k ) is the unique solution to Problem (20)- (26) The inputs implemented in the plant are the first ones of the optimal sequence u k , that is, We remark that, since the modifier terms in Equations (17) and (23) serve the same goal, for consistency purposes, the steady-state input targets used in Problem (20)-(26) areū k−1 and notū k . Moreover, for ensuring closed-loop stability of eMPC at nominal conditions, the following assumption is introduced: Assumption 4. Steady-state operation is assumed to be more profitable than oscillating behavior, that is, the system is dissipative with respect to the stage cost e (·) [37].
Furthermore, to be able to deal with plant-model mismatch, we also need the following assumption: Note that Assumption 5 is consistent with the convergence requirements of tracking offset-free MPC [7], modifier-adaptation schemes [9,14], and offset-free economic MPC [12].

Gradient or Modifier Estimation
The issue limiting the practical applicability of MA methods (and thus, also of MAy), is the need to gather appropriate information about the plant-model mismatch. In particular, it is necessary to accurately estimate the steady-state plant gradients D u g p (·) for the OF-eMPC algorithm in Section 2 to work properly. As a matter of fact, convergence to plant optimality is only ensured with perfect gradient estimation [38].

Basic Idea
This section compares two methodologies based on steady-state perturbation methods. We also compare the performance when one uses only measurements taken at the current iteration versus data collected at several (previous) iterations.
In Section 2.4, the first-order modifiers Λ k are updated at every sampling time. This works fine if the plant measurements correspond to steady-state conditions; however, when dealing with transient measurements, the system must reach some sort of "steadiness" before updating Λ k . For this reason, with all the methods investigated next, the modifiers are updated every M sampling times, where M ∼ τ st τ is a tuning parameter, with τ st being the time to reach steady state and τ being the sampling time. This way, we let the system reach quasi-stationary conditions between two successive modifier updates, thereby satisfying the necessity of operating with (near) steady-state measurements. This implies a time-scale separation between the control and gradient-estimation tasks. Since Λ k is updated recursively, it helps to start the recursion with a fair estimate of Λ. In this initialization phase, we apply n u appropriate input perturbations and collect quasi-stationary output measurements, with which we can construct an initial nonzero Λ 0 . The next two subsections will address plant gradient estimation and modifier adaptation, respectively.

Plant Gradient Estimation
We propose two techniques for the estimation of plant output gradients.

Broyden's Update for Plant Gradient Estimation
Broyden's update offers a recursive way of updating the estimated gradients using current and past measurements [10,39,40]. The advantage of Broyden's update is that it does not require additional perturbations; however, being a numerical method, sufficient variation between two consecutive inputs is needed for the scheme not to fail. The technique is a standard secant method in nonlinear programming for updating estimates of first-order derivatives, such as Jacobian matrices [41].
At the k th iteration, D u g p,k , which are estimates of D u g p (ū k−1 ), are updated by defining and using Broyden's formula as follows: The gradients of the model output functions g(·) can be computed from their definition (14) using the implicit function theorem: Then, the gradient differences are calculated as follows: When applying Equation (30), care must be taken to avoid ill-conditioning when ∆U k → 0. Hence, the step given by Equation (30) is performed only if ∆U k ≥ ρ u , where ρ u is a chosen threshold.

Linear Regression for Plant Gradient Estimation
This method relies on output measurements at n s + 1 steady-state operating points corresponding to the inputs u k , u k−M , . . . , u k−n s M , with n s ≥ n u . Using such a data set, the plant output gradients are computed as linear interpolation if n u = n s or linear regression if n s > n u . These gradients are often called simplex gradients [42]. A similar approach is used in [43]. In the present work, we use n s past points, with n s > n u , to estimate the n u -dimensional gradients since the resulting redundancy helps deal with measurement noise [20].
Hence, at the k th iteration, we set n s k = min n s , k M − 1 to construct the following matrices: The length of the sets U k and Y p,k increases at each iteration from n u , when only n u + 1 operation points are available, to n s . Defining the simplex gradients are computed as the least-squares solution to Equation (35): The matrix U † k in Equation (36) can be poorly conditioned if the successive points do not extend in all directions of the input space [42]. This ill-conditioning can lead to erroneous gradient estimates; hence, as with Broyden's update, Step (36) is performed only if U k 1 ≥ ρ u . Then, the gradient differences are calculated as in (32).

Modifier Estimation
In this section, we propose to use the two techniques developed in Section 3.2 to estimate the differences between the plant and model gradients directly, that is, ∆g k . In fact, direct estimation of modifiers is often preferred over the estimation of the individual gradients since the plant and the model are approximated using the same numerical scheme. A graphical explanation can be derived similarly to the one made for linear interpolation in [10, Figure 3], but this is beyond the scope of the present paper.

Broyden's Update for Modifier Estimation
Analogously to Section 3.2.1, we define the following quantities: which leads to Broyden's update: It should be noted that when applying Equation (39), the same issue mentioned for Equation (30) of avoiding ill-conditioning must be considered.

Linear Regression for Modifier Estimation
Analogously to Equation (34), at the k th iteration, we construct the following matrix: with δy k defined in Equation (37). The matrix of differences between plant and model gradients is computed as the least-squares solution to the following regression problem: for which the solution is As above, we need to consider the ill-conditioning problem occurring when U k → 0.

Role of Disturbance Model
The role played by the disturbance model is not the same for plant gradient estimation and modifier estimation. This issue is discussed next.

Plant Gradient Estimation
In the case of plant gradient estimation, the computation of D u g p,k in Equation (30) or Equation (36) depends on the disturbance model that has been selected: • Using a linear disturbance model, as in (9), implies an equivalence between the nominal and augmented models for computing D u g(·), as in (31), since the disturbances d would not appear in any of the derivatives involved, that is, D x F(·), D x H(·), and D u F(·). Hence, the disturbance variables are not involved in the gradient calculation. • In contrast, using a nonlinear disturbance model, it would not be possible to discard the disturbance variables when calculating gradients. Moreover, a nonlinear disturbance model may be more challenging to implement; however, if implemented correctly, the first-order modifier terms might become superfluous (see [11] for an example).
Hence, in the case of plant gradient estimation, the most straightforward option is to work with a linear disturbance model and use first-order modifiers.

Modifier Estimation
The driving terms ∆E k in Broyden's update (39) and ∆E k in the linear regression update (42) use the output errors δy k (the differences between the plant and nominal model outputs) in place of the prediction errors k (the differences between the plant and augmented model outputs). As a matter of fact, using or to compute ∆g k in Equation (39) or Equation (42) would not work. The implicit secant equation behind the Broyden update, namely, k − k−M = ∆g k ∆U k , does not hold true becaused k =d k−M . The same problem appears when dealing with the linear regression update and Equation (41). Hence, the nominal model (8) is used for estimation, while the augmented model (9) is used for optimization to ensure convergence.

Offset-Free Economic MPC Algorithms
This sections formulates algorithmically the two OF-eMPC schemes based on Broyden's update and linear regression. The way the algorithms are initialized is presented first.

Algorithm Initialization
To improve convergence speed, the OF-eMPC schemes require a good initial estimate for ∆g, which translates into an appropriate Λ 0 . This can be achieved by perturbing each input individually around the current operating point and calculating the corresponding gradient elements, which requires n u input perturbations. Following each perturbation, one must wait M sampling times for the system to approach a steady state.
Next, we detail this initialization phase. Let k 0 denote the iteration at which the plant first reaches steady state, with the inputs u k 0 . The following control law is then used during the initialization phase, for j = 1, . . . , n u : where s j is the amplitude of a step of duration M in the direction e j , with e j being a unit vector in input space. One sees that the j th component of u is perturbed individually by the amount of s j e j during M iterations. The initialization phase ends at iteration k f := k 0 + n u M.
The j th column of the plant gradient to be used in Equation (30) is estimated as while the j th column of the gradient differences to be used in Equation (39) is Note that, unlike Broyden's update, the linear regression method in Equation (36) or in Equation (42) does not require Jacobian initialization, albeit the initialization phase is still used.
These estimated plant gradients or gradient differences are used to compute Λ k as follows:

OF-eMPC Algorithm Using Broyden's Update
The block diagram and algorithm describing the OF-eMPC scheme with Broyden's update are given in Figure 1 and Algorithm 1. For the sake of brevity, the initialization phase is not shown, and only the behavior for k > k f is displayed.
As can be seen in Figure 1, gradient estimates are computed after k 0 + n u M iterations. The modifier matrix is updated every M time steps, but only when the difference between two successive inputs is not too close to zero. This is needed to avoid calculating gradient differences based on little informative data, that is, to avoid ill-conditioning of the updating Equation (30) or Equation (39). Hence, the parameter ρ u can be seen as a tuning parameter for the performance of the gradient estimation method.

OF-eMPC Algorithm Using Linear Regression
The block diagram and algorithm describing the OF-eMPC scheme that uses linear regression are given in Figure 2 and Algorithm 2. Here, the initialization phase is not shown, and only the behavior for k > k f is displayed.

Ill-Conditioning
Ill-conditioning of ∆g k in Equations (30), (36), (39), or (36) results from too small a value of ∆U k or U k . When ∆U k or U k tends to zero, the aforementioned ∆g k can no longer be calculated. This happens when two measurements corresponding to (almost) the same steady state are used in ∆U k or U k , that is, with u k−1 u k−M−1 . Hence, before calculating ∆g k , we compare the value of ∆U k or U k with the threshold value ρ u , which depends on the input-output sensitivity and the size of measurement noise. This is an engineering choice that the designer has to make based on the process at hand.
We underline that, if modifier update does not take place due to ill-conditioning, and thus the previous values of the gradients or modifiers are used, Problems (15) and (20) are still solved as part of OF-eMPC. On the other hand, if the inputs u get "stuck" far away from plant optimality, the system requires additional excitation for accurate estimation of plant gradients or modifiers.

Simulation Results
Two examples taken from the literature are used to compare the performance of five eMPC schemes, all using the same nominal model, cost function, and sampling time. There is significant plant-model mismatch, and the augmented model (9) reads The estimator of Section 2.3 is the deadbeat Kalman filter with the parameters K x = 0 and K d = I. In addition, all eMPC controllers use the target calculation (15) and the FHOCP (20). The five controllers are the following: eMPC0 assumes the plant gradients to be perfectly known, i.e., D u g p,k = D u g p (ū k−1 ). This controller is not practically implementable but is used as reference. 2.
eMPC1 uses the Broyden's gradient update described in Section 3.2.1.

3.
eMPC2 estimates the plant gradient using linear regression as per Section 3.2.2. The number of data points in the regression is n s = 4. 4.
eMPC3 uses the Broyden's modifier update described in Section 3.3.1.

5.
eMPC4 estimates the modifiers using linear regression as per Section 3.3.2. The number of data points in the regression is also n s = 4. Note that, in the presence of plant-model mismatch, the use of the disturbance model (51) by itself is not sufficient to eliminate offset, as reaching plant optimality requires perfect knowledge of plant gradients [25,32]. When not otherwise specified, M = 15 is chosen for all controllers. Moreover, the filter constant for updating the modifiers in Equation (13) is σ = 0.5, and the tolerance threshold for avoiding numerical ill-conditioning is ρ u = 5 · 10 −4 .

CSTR with Competitive Reactions
The first example is a continuous stirred-tank reactor (CSTR), in which two consecutive reactions take place: The objective is to maximize the profit expressed as the difference between the revenues generated by the desired product B and the cost of the raw material A. The following system of ordinary differential equations describes the reactor dynamics: where x 1 and x 2 represent the molar concentrations of A and B in the reactor, the corresponding feed concentrations are c A0 and c B0 , and the input u is the feed flow rate regulated through a valve. For the sake of simplicity, the reactor is assumed to have a constant volume V and to be isothermal. Both states are assumed to be measured. The reactor parameters are taken from [25]. Optimal steady-state performance is computed by solving the following optimization problem: with the running profit Optimal operation corresponds to u s = 1.043 m 3 /min, x 1,s = 0.511 kmol/m 3 , and x 2,s = 0.467 kmol/m 3 with the maximal profit p(u , x ) = 0.906 e/min.

Model
We assume plant-model mismatch and design the controller with erroneous values of the two kinetic parameters: the model values known by the controller arek 1 = 0.5 min −1 andk 2 = 0.4 min −1 , meaning that, compared to the true values k 1 = 1.0 min −1 and k 2 = 0.05 min −1 , the first reaction rate is underestimated, while the second one is overestimated. The economic cost function to be minimized in Problem (20) is with the sampling time τ = 0.25 min. Furthermore, the dynamics (53) is discretized using an implicit Euler method and the initial conditions are x 1,0 = x 2,0 = 0.1 kmol/m 3 . Note that, with τ st 4 min, the choice M = 15 is quite appropriate.

Reactor Performance
The input and the cost function of the reactor with the five aforementioned controllers are depicted in Figure 3. We can immediately see that all schemes reach plant optimality even under the presence of plant-model mismatch. Before reaching the initialization phase at t = 3.75 min, the behavior is the same for the practical controllers eMPC1-eMPC4 as they all use the disturbance model (51) and no first-order correction. eMPC0 is, of course, significantly better as it uses perfect knowledge of plant gradients. Λ k is updated for the first time at time step k f = k 0 + n u M = 30, which corresponds to t = 7.5 min. After initialization, we see a slight difference between the schemes that use Broyden's update (eMPC1 and eMPC3) and those that rely on linear regression (eMPC2 and eMPC4). All of them start to move towards plant optimality, with the methodologies using Broyden's update reaching steady state faster. We observe a steplike profile, since the corrections are updated every M time samples. The average convergence time for the proposed methods is about 40 min.
To speed up convergence, one can reduce the value of M, thereby looking for a compromise between convergence speed and steadiness of the collected samples. Figure 4 shows the reactor performance with the five controllers when M has been reduced from 15 to 5. For the sake of comparison, the same initial steady state was used, that is, we have k 0 = 15 and k f = 20, with t = 5 min being the first time that Λ k is updated. All schemes reach plant optimality in about half the time of the previous case. Note also that, with all schemes, the profit (bottom panel of Figure 4) quickly reaches plant optimality. Only eMPC3 presents some erratic behavior that could result from poor numerical conditioning when Broyden's update is used on the gradient difference. However, adjusting the filtering parameter α can improve the situation.

Process
As a well-known RTO example [10,44], the Williams-Otto reactor is a nonisothermal CSTR, in which three reactions take place: The reactor feed consists of two components: Species A fed at the constant flow rate Q A with molar concentration c A0 , and Species B added at the variable flow rate Q B with molar concentration c B0 . The desired products are P and E, while C and G are intermediate and undesired products, respectively. The reactor outlet flow rate Q r is set equal to the sum of the two inlet flow rates, that is, Q r = Q A + Q B , with the reactor volume V being constant. Mimicking realistic process conditions, only the molar concentrations of the two desired products are assumed to be measured, that is, y p = c P c E T . The reactor temperature T r is manipulated, assuming ideal cooling. The kinetic parameters depend on the reactor temperature via an Arrhenius-type law: For the sake of brevity, the system dynamics are not reported here but can be found in [12]. The reactor has two inputs, namely, the temperature T r and the flow rate Q B . The profit to be maximized is where p P , p E , p A , and p B are the molar prices of products and reactants. The plant parameters can be found in [32] and in Table 1. The model used for control design comprises only two reactions: for which the kinetic parameters are reported in Table 2.
while the plant has 6 states, In addition, the inputs are bounded as follows:

Reactor Optimization
The sampling time in this example is τ = 2 min, which calls for Mτ = 30 min between two successive modifier updates. In this case, with τ st = 35 min, the choice M = 15 is also quite appropriate. The performance of the five controllers is depicted in Figure 5, together with the optimal operating point that is unknown to the controllers. Λ-initialization is performed in the time interval of 30-90 min. The starting point for the model state vector is and the first measurement from the plant reads y p,0 = 1.1 0.6 T .
A behavior similar to that shown in Figure 3 can be observed, namely, (i) all schemes reach plant optimality; (ii) before the initialization phase, the practical schemes eMPC1-4 behave alike since the modifier matrix is still zero. Then, from k 0 τ = 30 to k f τ = 90 min, the two inputs are excited individually and sequentially as described in Section 4.1. Once a nonzero modifier matrix has been computed, we see a difference between the schemes that use plant-gradient estimation (eMPC1 and eMPC2) and those that estimate the gradient differences directly (eMPC3 and eMPC4). In this example, the schemes that rely on plantgradient estimation do not work well: eMPC1 and eMPC2 fail to reach plant optimality; in addition, eMPC2 converges slowly, while the second input with eMPC1 gets stuck to its upper bound. In contrast, eMPC3 and eMPC4 are able to reach plant optimality. These results indicate that the two schemes based on estimating the gradient differences give the best performance.
In order to check the robustness of these last two schemes, we propose to corrupt the measurements with white noise, i.e., Plant (1) is modified as follows: where v ∈ R n y is a white noise signal with covariance R v = 10 −3 I n y . Figure 6 shows the effect of noise on the convergence performance of eMPC3 and eMPC4. It is seen that linear regression handles measurement noise better than Broyden's update since the former uses measurements stemming from several measurement points and therefore possesses a stronger filtering effect. Note that eMPC3 and eMPC4 perform very similarly right after the initialization phase when the number of measurements points is the same. As shown in the top panels of Figure 6, as time advances and more measurements become available to eMPC4, the inputs are able to move much quicker than with eMPC3. Another illustration of the improved resilience of eMPC4 with respect to measurement noise is shown by the profit profile in the bottom panel of Figure 6.

Final Considerations
An interesting result is the improvement in estimation quality when the modifiers are estimated directly rather than computed from the estimated plant gradients and the predicted model gradients. This is related to the nature of the estimation error. The gradient estimation error encompasses two terms resulting from truncation error and measurement noise, respectively (see [9] for a more comprehensive discussion). These two error types are quite different in nature, with the former being systematic and the latter being random. On this basis, one can consider two different ways of estimating the modifiers:

1.
Estimate the plant gradients from measurements, compute the model gradients "analytically" from the model, and evaluate the modifiers as their differences, as per Section 3.2. This approach is very sensitive to truncation errors, as these errors affect the estimation of plant gradients but not of model gradients.

2.
Estimate the plant gradients from measurements, estimate the model gradients from the model using the same numerical scheme, and evaluate the modifiers as their differences. This is equivalent to estimating the modifiers directly from measurements (see Section 3.3). In this case, the truncation errors tend to cancel out upon computing the differences between plant and model gradients.

Conclusions
Different offset-free economic model predictive control schemes that combine an augmented model structure and first-order modifiers on the outputs have been proposed in recent years. These schemes proved to converge asymptotically to plant optimality, even under plant-model mismatch. The main caveat of this formulation is the requirement of perfect knowledge of static plant gradients. This paper has investigated two different techniques to estimate plant gradients using steady-state measurements, namely, Broyden's update and linear regression. These techniques have been adapted and applied to estimate either the plant gradient or the modifiers directly. Consequently, four different approaches have been tested via two examples: in the first (single-input) example, all methods give satisfactory results, and also when the time between modifier updates is reduced to boost convergence speed. In the second (multi-input) example, only the approach that estimates the modifiers directly succeeded in reaching plant optimality. The approach was tested using both Broyden's update and linear regression. It was found that the approach that uses linear regression is clearly superior in handling measurement noise. In general, although the most effective estimation method should be chosen according to the specific case, it appears from these results that estimating the modifiers directly is more effective than estimating the plant gradient alone. Furthermore, in the presence of measurement noise, modifier estimation is best implemented via linear regression.
Note that these findings are not the end of the story, as this investigation was limited to plant gradient estimation using as many steady-state measurements as possible (M = 15 in this study). As mentioned in the introduction, one would like to be able to estimate static plant gradients using transient data, so as to speed up convergence even more (ideally, M = 1). Hence, a similar type of study to analyze the performance of eMPC in the presence of plant-model mismatch and the availability of transient measurements is very welcome. Finally, in the present paper, closed-loop stability has been assumed as per Assumption 5. However, the stability of eMPC under plant-model mismatch is still an open issue, which may motivate future research.