Extremum Seeking Control for Discrete-Time with Quantized and Saturated Actuators

This paper proposes an extremum-seeking controller (ESC) design for a class of discrete-time nonlinear control systems subject to input constraints or quantized inputs. The proposed method implements a proportional-integral ESC design along with a discrete-time anti-windup mechanism. The anti-windup enforces input saturation while preserving the input dither signal. The technique incorporates a mechanism for adjusting the amplitude of the extremum-seeking control dither signal. This mechanism ensures that any violation of constraints due to the dither signal is removed while keeping the probing signal active. An amplitude update routine is also proposed. The amplitude update is coupled with a saturation bias estimation algorithm that accounts for the inherent bias associated with systems operated at or near saturation conditions. The amplitude update is designed to reduce the dither signal when the system approaches the optimum. It also enforces a lower bound on the amplitude to guarantee that excitation conditions are maintained.


Introduction
Extremum-seeking control (ESC) has become one of the leading approaches to solving real-time optimization problems [1]. Following the seminal work of Krstic and coworkers ([2][3][4][5][6][7]), this general and practically relevant control approach is equipped with an established and well-understood theoretical framework, as highlighted in the proof of Krstic and Wang [2]. The standard perturbation-based ESC algorithm has been generalized in various forms to handle output and input constraints. Constrained ESC was first considered in [8], where a trajectory tracking approach was used to address the constrained ESC problem for a class of nonlinear systems with parametric uncertainties. In this approach, a barrier function or interior-point method was used to enforce constraints and feasibility of the closed-loop trajectories. A similar model-free extremum-seeking approach was presented in [9]. A Lagrangian, saddle-point ESC technique is proposed in [10], and a similar approach is proposed in [11] to handle a class of stochastic control systems. In [12], a Shahshahani gradient approach was proposed. This technique allows one to handle ESC problems subject to linear constraints by a simple reformulation of the gradient descent dynamics. In contrast to Lagrangian-based techniques, the main advantage of the Shahshahani gradient and barrier function approaches is the ability to preserve feasibility throughout the optimization.
For ESC in the presence of input constraints, a variety of techniques have been proposed. In [13] and [14], a projection algorithm is used to solve ESC problems subject to constraints on the decision variables. In [15], a comprehensive study of anti-windup mechanisms for standard ESC is presented. The approach draws a parallel between penalty (barrier) function methods and an anti-windup mechanism. A proof of convergence of the ESC in the presence of constraints is provided. In [16], a simple anti-windup algorithm for a standard ESC technique is implemented experimentally for the real-time optimization of airside economizers.
The vast majority of existing results on ESC have focussed on continuous-time systems, as is the case for the existing approaches for ESC in the presence of constraints. Although discrete-time systems can be treated in an essentially similar fashion, the application of gradient descent in a discrete-time setting requires some care. A discrete-time version of the standard ESC loop was studied in [4,6], where convergence results similar to those for continuous-time systems are obtained. A similar algorithm was also proposed in [17] for the tuning of PID controllers in unknown dynamical systems using ESC. Discrete-time ESC subject to stochastic perturbations is studied in [18]. The use of approximate parameterizations of the unknown cost function using quadratic functions was recently proposed in [19]. An alternative ESC-like method was proposed in [20]. In this study, a trajectory-based technique is used to analyze the properties of nonlinear optimization algorithms as dynamical systems. It is shown that properties of the nonlinear optimization algorithms are suitable to assess the convergence of certain classes of ESC applied in a sampled-data approach. This method was recently studied in the context of global sampling methods in [21], where trajectory-based properties of nonlinear optimization methods are used to establish robust convergence. The main objective of the trajectory-based techniques is to analyze the properties of optimization algorithms assuming that they can converge to the true optimum using only the measurement of the objective function and possibly the constraints.
This paper proposes an extremum-seeking controller (ESC) design for a class of discrete-time nonlinear control systems subject to input constraints. Two actuation scenarios are considered. In the first scenario, we consider the ESC in the presence of saturated inputs. The proposed method generalizes the discrete-time proportional-integral ESC proposed in [22] to incorporate a new discrete-time anti-windup mechanism for ESC. One contribution of this study is the development of a saturation bias estimation mechanism that can be used to remove the impact of dither on or near the saturation level. This mechanism ensures that violations of the constraints due to the dither signal are removed without the introduction of a gradient estimation bias. Moreover, it allows the system to remain responsive to changes in the system despite operating on or very close to the saturation level. An amplitude update routine is also proposed as a discrete-time generalization of the method proposed in [23]. The amplitude update is coupled with the saturation bias estimation algorithm to account for the inherent bias associated with systems operated at or near saturation conditions.
In the second scenario, we adapt the application of the anti-reset windup strategy and the saturation bias estimation routine to handle systems with quantized actuators. We focus on ESC design for systems with "on/off" actuators. Since the excitation signal is limited to the on or off position, the application of the saturation bias estimation is able to remove the impact of the dither to allow the ESC system to converge to the correct position. Such actuators have not been treated in the literature.
The paper is organized as follows. A description of the ESC problem along with the key assumptions is given in Section 2. The proportional-integral ESC controller is presented in Section 3. The anti-windup mechanism and the amplitude adjustment mechanism are described in Section 4. The design of ESC for quantized actuators is presented in Section 5. Simulation examples are presented in Section 6, followed by brief conclusions and proposed future work in Section 7.

Problem Description
We consider a class of nonlinear systems of the form: where $x_k \in \mathbb{R}^n$ is the vector of state variables at time $k$, $u_k$ is the input variable at time $k$ taking values in $U \subset \mathbb{R}$, and $y_k \in \mathbb{R}$ is the objective function at step $k$, to be minimized. It is assumed that $f(x_k)$ and $g(x_k)$ are smooth vector-valued functions and that $h(x_k)$ is an unknown smooth function. The objective is to stabilize the system at the equilibrium conditions, $x^*$ and $u^*$, that achieve the minimum value of $y = h(x^*)$ subject to saturation of the input. The input variable, $u_k$, is required to lie in the interval $U = [u^-, u^+]$. At equilibrium, the state variables are given by the map $x = \pi(u)$ that solves the following equation: The corresponding equilibrium cost function is given by: The steady-state optimization problem is to find the minimizer $u^*$ of the equilibrium cost function subject to $u^* \in U$. The set $D(u)$ represents a neighbourhood of the equilibrium $x = \pi(u)$.
The steady-state cost function, (u), meets the following assumptions.

Assumption 1. The nonlinear system is such that
∂x∂x T > βI, ∀x ∈ R n where β is a strictly positive constant.
Following [22], we write the cost dynamics as: where The following assumptions are required to ensure the stability of the closed-loop system.

Assumption 3.
There exists a function $u_k = \alpha_F(x_k, \hat{u}_k)$ that solves the identity: This assumption states that the feedback: is well defined.
The following stabilizability condition for the nonlinear system subject to input saturation is also required.

Assumption 4.
There exists a positive definite function (x) that satisfies the following inequalities: with positive constant α e and ∀û ∈ U and

Proportional-Integral Discrete-Time ESC
In this section, we present the basic PI-ESC controller and the proposed parameter (gradient) estimation algorithm.

PI-ESC Controller
From (4), the objective function dynamics are parameterized as follows: where the time-varying parameters $\theta_{0,k}$ and $\theta_{1,k}$ are identified with $\theta_{0,k} = \Psi_{0,k}$ and $\theta_{1,k} = \Psi_{1,k}^T$. The unknown parameters $\theta_{0,k}$ and $\theta_{1,k}$ must be estimated using a parameter estimation approach described in the next subsection. We let $\hat{\theta}_{0,k}$ and $\hat{\theta}_{1,k}$ denote the estimates of $\theta_{0,k}$ and $\theta_{1,k}$, respectively. The proportional-integral extremum-seeking controller is given by: where $k_g$ and $\tau_I$ are positive constants to be assigned. The term $d_k$ is a dither signal used to provide sufficient excitation in closed loop. The dither signal is bounded such that $|d_k| \le D$, where the known positive constant $D$ denotes the amplitude of the dither. In this study, the estimation of the time-varying parameters $\theta_{0,k}$ and $\theta_{1,k}$ is performed using the estimation routine described in [22]. The estimation routine implements a modified recursive least squares approach for the estimation of time-varying parameters. It is described briefly in Appendix A.
The stability properties of the ESC considered are summarized in Appendix B.

Input Constrained ESC
In this section, we present the main contribution of this study. The proposed technique incorporates three mechanisms for the solution of ESC problems in the presence of input constraints. The first is a standard anti-windup mechanism that exploits the proportional-integral formulation of the ESC considered. The second is a dither bias estimation routine that eliminates the bias introduced when the dither signal pushes the input to its saturation limit. The third is a dither amplitude update that reduces the dither signal when the system has converged to its optimal value or to its optimal saturation limit.

Anti-Windup Mechanism
In this paper, we propose the use of an anti-windup mechanism for the proportional-integral ESC controller (5). A block diagram of the mechanism is shown in Figure 1.
Figure 1. Anti-windup proportional-integral ESC.
In Figure 1, C(z) represents the discrete-time transfer function of the proportional-integral controller The mechanism places the dither addition after the anti-windup loop but before the final saturation. This mechanism guarantees that the dither signal is not removed when the system operates at the saturation limits. It also guarantees that the dithered input does not violate the input constraints.
The operator $\mathrm{Sat}(\cdot)$ denotes the saturation function onto the interval $U = [u^-, u^+]$:

$$\mathrm{Sat}(u) = \begin{cases} u^+, & u > u^+ \\ u, & u^- \le u \le u^+ \\ u^-, & u < u^- \end{cases}$$

The proposed dynamics of the anti-windup mechanism are given by: The anti-windup loop is such that, in the absence of saturation, the control law reduces to the proportional-integral law and the control law becomes: Note that $\mathrm{Sat}(\cdot)$ remains in the control loop to ensure that the added dither signal does not cause input constraint violation.
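The placement of the dither relative to the two saturation blocks can be sketched in code. The following is a minimal illustration only: the back-calculation form of the anti-windup correction, the gain name `k_aw`, the constant gradient estimate, and the fixed dither amplitude are assumptions introduced for this sketch and are not taken from the paper's equations.

```python
import math

def sat(u, u_min, u_max):
    """Saturation function: clip u to the interval [u_min, u_max]."""
    return max(u_min, min(u_max, u))

def pi_esc_step(u_hat, grad_est, dither, k_g, tau_i, k_aw, u_min, u_max):
    """One step of a PI law with back-calculation anti-windup (assumed form)."""
    v = u_hat - k_g * grad_est                     # unsaturated PI command, no dither
    v_sat = sat(v, u_min, u_max)                   # saturation inside the anti-windup loop
    u_applied = sat(v_sat + dither, u_min, u_max)  # dither added, then final saturation
    # Integral action plus a correction that is zero whenever v is not saturated.
    u_hat_next = u_hat - grad_est / tau_i + k_aw * (v_sat - v)
    return u_applied, u_hat_next

# Hypothetical scenario: a constant gradient estimate keeps pushing the input upward.
u_hat, applied = 0.3, []
for k in range(200):
    d_k = 0.05 * math.sin(2 * k)
    u_k, u_hat = pi_esc_step(u_hat, grad_est=-1.0, dither=d_k, k_g=0.1,
                             tau_i=5.0, k_aw=0.5, u_min=0.0, u_max=0.6)
    applied.append(u_k)
```

In this sketch the applied input never leaves the interval [0, 0.6], the integrator state settles instead of growing without bound, and the negative half of the dither still reaches the plant while the command sits at the upper limit. With `k_aw = 0` the integrator state would instead grow by 0.2 at every step.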
One of the difficulties associated with such an approach is that the saturation creates a bias in the dither signal. This is problematic in cases where the optimum input lies close to or on the saturation limit. This bias in the dither signal can lead to a bias in the estimation of the parameters. As a result, the value of the parameter estimate $\hat{\theta}_{1,k}$ does not converge to zero even when the true value $\theta_{1,k}$ vanishes.
It is, therefore, imperative to provide a mechanism to introduce the dither signal that prevents the estimation bias. We consider two mechanisms in this study.
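The bias created by the saturation can be seen numerically. The short sketch below assumes a nominal input held exactly at the upper limit and a fixed, hypothetical dither amplitude `a`; it compares the average of the raw sinusoidal dither with the average of the dither that actually survives the saturation.

```python
import math

def sat(u, u_min, u_max):
    """Saturation function: clip u to the interval [u_min, u_max]."""
    return max(u_min, min(u_max, u))

u_min, u_max = 0.0, 0.6   # input interval used in the simulation section
a = 0.1                   # hypothetical fixed dither amplitude
n = 10000                 # number of samples averaged

# Dither as generated, and dither effectively applied when the input sits at u_max.
raw = [a * math.sin(2 * k) for k in range(n)]
effective = [sat(u_max + d, u_min, u_max) - u_max for d in raw]

mean_raw = sum(raw) / n
mean_effective = sum(effective) / n
```

The raw dither averages to approximately zero, whereas the effective dither keeps only its negative half-wave and averages to approximately -a/π. This nonzero mean is the bias that enters the regressor and distorts the gradient estimate.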

Saturation Bias Estimation
Let us consider the case in which the optimum occurs on the upper saturation level $\bar{u}$. At the optimum, the control signal for the ESC is given by The filter (or regressor) vector yields: One of the key properties of the dither signal is that In the absence of input saturation, the average regressor is such that In the presence of a bias in the dither signal, the regressor vector does not average to the correct value. As a result, the parameter estimation of $\theta_{1,k}$ is subject to a bias, and the system would converge to an erroneous optimum state and input.
In this section, we design an update mechanism that accounts for this saturation bias. This is achieved by introducing a signal $\delta_k$ in the control, $u_k$, such that the average input is unbiased. That is, for a fixed value of the input $u_k = \bar{u}$, the following property is achieved: We first define the variable The bias estimation update proposed in this study is given by:

Proposition 1. The saturation bias estimate update (10) is such that:
Proof. For Statement 1, the conclusion is straightforward. The proof of Statement 2 is as follows. To establish the property (9), we first compute the average in the case where the value of the input is at one of its saturation limits. Let us consider the case where the input is at its upper limit, $u^+$. From a set of $N$ samples of the input, assume that there are $N_1$ samples at which $\mathrm{Sat}(u^+ + d_k + \delta_k) = u^+$, with the remaining $N_2$ samples for which $\mathrm{Sat}(u^+ + d_k + \delta_k) < u^+$.
Let $\mu_2(j)$, $j = 1, \ldots, N_2$, denote the indices of the samples that are not saturated. As a result, we can decompose the averaged quantity as follows: Thus if one considers the update (10). Following the above argument, we average both sides by summing over $N$ samples. Let us consider the situation where the input is at its upper saturation limit $u^+$ and decompose the overall average into $N_1$ saturated values and $N_2$ inputs whose perturbed value is not saturated. This yields This gives the following recursion of sums: For every sample from the set of points that are saturated at step $k$, it follows that $\delta_{\mu_1(j)+1} = \delta_{\mu_1(j)}$. As a result, we can write Defining the variable $\bar{\delta}$ we obtain the following recursion: As a result, we see that the average $\bar{\delta}_k$ approaches the negative value of the mean dither $\bar{d}_k$. Consequently, the bias (9) is completely removed by the saturation bias update (10). This completes the proof of Statement 2.
In cases where the optimum lies on or close to a saturation limit, the update (10) has the effect of removing the dither signal from the applied input. The dither itself is not removed, however; it is simply compensated for by the bias estimate $\delta_k$. If a disturbance affects the system and moves the optimum inside the saturation limits, the dither signal resumes and the ESC operates in the normal way.

Dither Amplitude Update
In this study, we consider a dither signal of the form $d_k = a_k \sin(\nu_k)$, where $\nu_k$ can be taken to be a zero-mean Gaussian variable or simply $\nu_k = \omega k$ for some frequency $\omega$. The amplitude of the dither signal, $a_k$, is obtained using an amplitude update. Let the upper or lower limit of $u_k$ be denoted generically by $\bar{u}$. We first define the signal: The proposed amplitude update is given by: where $a_0 \ge 0$, $\sigma_1$, $\gamma_1$ and $\gamma_2$ are tuning parameters. This mechanism provides two actions to adjust the amplitude of the dither signal. The term $\gamma_1 \frac{2}{\pi} \tan^{-1}(\Theta_k)$ decreases the amplitude when the gradient estimate decreases or when the system has reached an equilibrium corresponding to a saturated input level. In [23], a similar amplitude update is proposed. The proposed method complements that approach in two ways. First, we adjust for the situation in which the input has stabilized on a saturation level. In this case, the estimated value of $\theta_{1,k}$ cannot reach 0. As a result, the approach of [23] using only $\hat{\theta}_{1,k}$ would yield a larger value of the amplitude. If the optimization does not lead to a saturated value of the input, the update acts as the update of [23] and reduces the amplitude to a suitable lower level.
Second, the proposed method assigns a minimum value to the amplitude $a_k$. As the proof of stability of the PI-ESC algorithm demonstrates [22], the practical stability of the unknown optimum requires a persistent dither signal with $a_k > 0$ for all $k$. In practice, setting $a_k = 0$ would prevent the system from responding to changes in the optimum that may arise from changing conditions. This property of the ESC system was recognized in [23], which required a fixed lower bound for the amplitude. However, the choice of this lower bound can be conservative. The second term in the update (11), $\gamma_2 \lambda_{\min}[\Sigma_k]$, aims to increase the amplitude $a_k$ when the smallest eigenvalue of the matrix $\Sigma_k$ decreases. This update guarantees a minimum amount of excitation in the system in order to respond to possible process changes.
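The two building blocks named in the text, the sinusoidal dither and the arctangent term, can be evaluated in isolation. The sketch below is illustrative only: it assumes that the signal passed to the arctangent is a nonnegative scalar, and the function names and sample values are hypothetical. It does not reproduce the full amplitude update (11).

```python
import math

def dither(a_k, k, omega=2.0):
    """Sinusoidal dither d_k = a_k * sin(omega * k)."""
    return a_k * math.sin(omega * k)

def arctan_term(theta_signal, gamma_1):
    """Bounded term gamma_1 * (2 / pi) * atan(theta_signal)."""
    return gamma_1 * (2.0 / math.pi) * math.atan(theta_signal)

gamma_1 = 0.1
# Hypothetical values of the nonnegative signal fed to the arctangent.
samples = [arctan_term(x, gamma_1) for x in (0.0, 0.01, 1.0, 100.0)]
```

For nonnegative arguments the term is zero when its argument vanishes, increases monotonically, and never exceeds `gamma_1`. A large gradient estimate therefore cannot drive the amplitude contribution beyond a level fixed by the tuning parameter, while a vanishing argument removes the contribution.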
The action of the amplitude update can be summarized as follows.
The combination of the anti-windup (Figure 1) and the amplitude update (11) provides an effective mechanism to minimize the bias of the system arising from the saturation. It also removes the need for the tuning of the amplitude. We demonstrate this in simulation in Section 6.

ESC for Systems with Quantized Actuators
The three mechanisms proposed in the previous section can be easily adapted to a situation where the actuators of the system are limited to quantized (or on-off) input settings. In this case, we consider an actuator whose on-off action can be implemented using a hysteresis mechanism of the form: where the hysteresis constant is small and positive. The function $\Gamma(u)$ implements the discrete actuator using a hysteresis mechanism.
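An on-off actuator with hysteresis can be sketched as a small stateful relay. This is a generic illustration rather than the paper's definition of the actuator map: the switching threshold at the midpoint of the input interval, the class name, and the argument names are assumptions made for the example.

```python
class HysteresisRelay:
    """On/off actuator with a hysteresis band of half-width eps around a threshold."""

    def __init__(self, u_low, u_high, threshold, eps, state_on=False):
        self.u_low = u_low          # output in the "off" position
        self.u_high = u_high        # output in the "on" position
        self.threshold = threshold  # assumed switching point (not from the paper)
        self.eps = eps              # small positive hysteresis constant
        self.state_on = state_on

    def __call__(self, u):
        # Switch only when the command leaves the band; otherwise hold the last state.
        if u > self.threshold + self.eps:
            self.state_on = True
        elif u < self.threshold - self.eps:
            self.state_on = False
        return self.u_high if self.state_on else self.u_low

relay = HysteresisRelay(u_low=0.0, u_high=0.6, threshold=0.3, eps=0.01)
outputs = [relay(u) for u in (0.2, 0.305, 0.32, 0.295, 0.28)]
```

Commands that fall inside the band leave the actuator in its previous position, so small dither-induced fluctuations around the threshold do not cause chattering between the two settings.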
In this study, we propose a quantized actuator ESC using the mechanism depicted in Figure 2.
Figure 2. Proportional-integral ESC with quantized actuator.
As above, we consider the anti-windup ESC given by: Since the ESC only provides quantized control action u k = u + or u k = u − , we must consider a saturation bias estimation to eliminate the bias and remove the b presented in Section 4.3. The reason for this is that the update (14) yields the required property of the saturation bias as the system reaches the limits. The proposed bias update is given by: where The amplitude of the dither signal is implemented as in Section 4.3.

Anti-Windup PIESC
We consider the application of the PI-ESC approach to the following linear discrete-time system: where $a_1 = 0.8$, The input variable $u_k$ is constrained to values over the interval $[0, 0.6]$. We consider the PIESC algorithm with $k = 0.1$ and $\tau_I = 5$. The estimation routine parameters are set to $\alpha = 0.25$, $\sigma = 10^{-5}$ and $K_k = 0.99$. The amplitude update is used with $\gamma_1 = 0.1$, $\gamma_2 = 0.01$ and $\sigma_1 = 0.1$ with an initial condition $a_0 = 0$. The dither signal is $d_k = a_k \sin(2k)$. The choice of these tuning parameters reflects the tuning guidelines presented in [22]. The saturation bias update is implemented with $\lambda = 0.05$. This choice guarantees that the bias update responds quickly to changing conditions. The simulation results are shown in Figures 3 and 4. Figure 3 shows the input and output trajectories for the resulting closed-loop system. It also shows the changes in the amplitude of the dither signal. Figure 4 shows the corresponding parameter estimates.
In this simulation, the location of the optimum is changed over time. For $k \in [0, 200]$, the optimum occurs at $y^* = 1$ with $u^* = 0.6$. This places the minimizer directly on the input constraint. The ESC system identifies the optimum correctly. The amplitude update rule is also able to reduce the amplitude to a suitable minimum level. For $k \in [200, 300]$, the unknown optimum occurs at $y^* = 2$ and $u^* = 0.4$. Since the dither amplitude update routine (11) prevented the amplitude from reaching values that are too small, the PIESC is able to respond quickly to the change in conditions. For the period $k \in [300, 400]$, the unconstrained minimum occurs at $y^* = 5$ and $u^* = 0.8$. As required in this case, the PIESC system converges to the saturated value of $u_k = 0.6$ with cost $y_k = 6$. Finally, for $k \ge 400$, the system cannot reach the unconstrained optimum $y^* = 2$ and $u^* = -0.4$ but converges correctly to the lower saturation level of $u_k = 0$ with cost $y_k = 6$.
Overall, the PIESC with anti-windup mechanism performs effectively. First, the system is able to perform the optimization task in the presence of input saturation. Second, the proposed approach allows the partial removal of the dither signal when the system operates at saturation.

Quantized Actuator ESC
In this section, we consider the same dynamical system subject to the quantized actuator: where the hysteresis constant is set to 0.01 for the purpose of simulation. The proposed dither signal is given by the sine wave signal: where $\delta_k$ is the bias update and $\mathrm{sign}(\cdot)$ is the sign function. The parameters, $p_1$ and $q_1$, are chosen as follows: We consider the PIESC algorithm with $k = 0.1$ and $\tau_I = 5$. The estimation routine parameters are set to $\alpha = 0.25$, $\sigma = 10^{-5}$ and $K_k = 0.99$. The amplitude update is used with $\gamma_1 = 0.1$, $\gamma_2 = 0.01$ and $\sigma_1 = 0.9$ with an initial condition $a_0 = 0.5$. We consider the bias update (14) with update parameter $\lambda = 0.95$.
The simulation results are shown in Figure 5. The ESC system is able to respond quickly to the changing conditions. The optimal setting of the discrete actuator is correctly identified: $u^* = 0.6$ for $0 \le t \le 200$, $u^* = 0$ for $200 < t \le 300$, $u^* = 0.6$ for $300 < t \le 400$ and $u^* = 0$ for $t \ge 400$. The combination of the amplitude update and the bias estimation works effectively in this case. The system reintroduces excitation in response to the change in conditions. The resulting excitation introduces a short sequence of discrete switches that vanish once the correct optimum is identified.

Conclusions
This study proposed an extremum-seeking control algorithm for systems subject to saturated or quantized actuators. The approach couples a well-known anti-reset windup technique with a saturation bias estimation routine that improves the performance of the ESC near or on the saturation level by removing the impact of the dither signal. An amplitude update is also proposed to further improve the performance of the ESC system.