A PI Controller with a Robust Adaptive Law for a Dielectric Electroactive Polymer Actuator

Abstract: Dielectric electroactive polymer actuators are an important new class of transducers in control system applications. Designing a high-performance controller for these devices is a challenging task. In this work, a PI controller was studied for a dielectric electroactive polymer actuator. The pole placement problem for the closed-loop system with the PI controller was analyzed, and the limitations of a PI controller in this problem are discussed. Analytic rules for the PI controller gains were obtained, which makes an extension to adaptive control possible. To minimize the influence of unmodeled dynamics, a robust adaptive control law is applied. The robust adaptive control was analyzed in a number of simulations and experiments.


Introduction
Dielectric electroactive polymer (DEAP) actuators are being intensively developed as a new generation of transducers [1-3]. The properties of DEAP actuators enable a wide range of practical applications. For instance, DEAP actuators have recently been applied in the construction of speakers [4] and pumps [5,6], and in robotics applications [7,8]. DEAP actuators are constructed from an elastic membrane covered with two electrodes [9,10]. Models of DEAP actuators have been studied in depth in the literature [2,11-13], both to design the actuator geometry and to design control systems.
The controller design is a crucial aspect of the control system's performance. The most popular design technique for DEAP actuators uses a PID controller. Model-based control design methods are presented in [12,14,15]. In [12,15], a model of a DEAP actuator was used to design a robust controller, and various structures of the PID controller were also analyzed. A PID controller was also successfully applied in [16,17]. In these works, the PID controller was tuned offline, and the presented approaches cannot be directly incorporated into online tuning. In [18], a self-tuning method was applied to tune the PID gains using the methodology of [19], in which the gains are tuned with the gradient descent method.
In this work, a PI controller was studied on the basis of a second-order model. The advantage of the presented design is that it uses only the proportional and integral parts of the PID controller, so the derivative of the control error is not required. From the general properties of a PID controller [20], it is known that the pole placement problem for a PI controller cannot be solved for higher-order plants. Since the DEAP actuator is modeled by a system of at least second order, this limitation also applies to the DEAP actuator. The properties of self-tuning methods for PID controllers have been studied in many works [20,21]. In this study, the certainty equivalence method was applied. First, the design rules were obtained for known parameters. Then, the adaptive controller was designed on the basis of robust adaptive control, which has been widely discussed [22-25].
At the beginning of this paper, the model of the DEAP actuator is described. Then, the design of the PI controller is analyzed for the second-order model, and the design limitations of a PI controller are shown for the DEAP actuator. The main advantage of the presented approach is that the PI controller gains are given by analytic rules. This makes it possible to calculate the gains online, and therefore adaptive control can be applied by identifying the parameters. The obtained PI controller is used with online parameter identification by applying the certainty equivalence method. We improved the robustness of the PI controller by extending it with an adaptive algorithm. Further, to make the identifier robust to unmodeled dynamics, we applied robust adaptive laws (as described, for instance, in [22]). The results were verified in extensive simulations and experiments.
The paper is organized as follows. Descriptions of the DEAP actuator model and its linearized form are given in Section 2. The PI controller, its key elements, tuning guidelines and its properties are presented in Section 3. Section 4 describes robust adaptation processes and the extension of a PI controller to an adaptive version. Simulations and experiments are presented in Sections 5 and 6. There we include the simulations' and experiments' goals, assumptions, and results. The paper ends with a few concluding remarks in Section 7.

DEAP Actuator Model
The DEAP actuator is described by a nonlinear model [11,12,26], which can be written in the state space form as:
$$\dot{x} = f(x) + g(x)\,u^2, \qquad y = h(x), \tag{1}$$
where $f(x)$, $g(x)$, and $h(x)$ are the state space representation functions; $x$ is the state; and $u$ is the input voltage. The output is the distance $y$. The definition of the model can be found in [26]. Furthermore, different working configurations of the DEAP actuator were also studied in [1,2,11,12,15]. The static input linearization $v = u^2$ is applied to cancel the input nonlinearity. This linearization was successfully applied in [12,27]. In this work, the DEAP actuator was modeled by a linear model [27] for some working point $y_n = h(x_n)$, $v_n = u_n^2$, where $f(x_n) + g(x_n)u_n^2 = 0$. The transfer function can be represented as:
$$G_{DEAP}(s) = \frac{k_a (s + z_0)}{(s + s_0)\left(s^2 + 2\alpha_a s + \omega_a^2 + \alpha_a^2\right)}, \tag{2}$$
where $k_a$, $z_0$, $s_0$, $\alpha_a$, and $\omega_a$ define the transfer function. The input and the output of the transfer function are given by $y_\Delta = y - y_n$ and $v_\Delta = v - v_n$. Generally, the dynamics of the DEAP actuator can be split into two parts. A short-term oscillation is described by the part $G_{fast}(s) = \frac{k_a}{s^2 + 2\alpha_a s + \omega_a^2 + \alpha_a^2}$, and a long-time relaxation process is defined by the part $G_{slow}(s) = \frac{s + z_0}{s + s_0}$. The steady state gain is given by $k_s = k_a z_0$. The parameters of the transfer function depend on the actuator material, geometry, and working point.
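To make the split into fast and slow dynamics concrete, the following minimal Python sketch assembles the numerator and denominator coefficients of the linearized transfer function as the product of the oscillatory second-order part and the first-order relaxation part. The function names and the numerical parameter values are hypothetical and serve only as an illustration; they are not the identified values of the actuator.

```python
def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (highest power first)."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, p_i in enumerate(p):
        for j, q_j in enumerate(q):
            out[i + j] += p_i * q_j
    return out


def deap_transfer_function(k_a, z_0, s_0, alpha_a, omega_a):
    """Return (num, den) of k_a (s + z_0) / ((s + s_0)(s^2 + 2 alpha_a s + alpha_a^2 + omega_a^2))."""
    num = [k_a, k_a * z_0]                                    # k_a * (s + z_0)
    fast_den = [1.0, 2.0 * alpha_a, alpha_a**2 + omega_a**2]  # oscillatory part
    slow_den = [1.0, s_0]                                     # relaxation pole
    return num, poly_mul(slow_den, fast_den)


# Hypothetical parameter values, chosen only to illustrate the structure.
num, den = deap_transfer_function(k_a=2.0, z_0=0.5, s_0=0.4, alpha_a=3.0, omega_a=40.0)
print(num, den)
```

The coefficient lists use the highest power of s first, so they can be passed directly to common control or signal-processing toolboxes if a simulation of the linear model is needed.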

PI Controller
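The PI controller is applied to drive the DEAP actuator. The goal of the control system is to follow the reference $r$. The control error is defined as $e_c = r - y_\Delta$, where $y_\Delta$ is the output in the working point coordinates. It is assumed that the state of the actuator is unavailable, so only output feedback is possible. The structure of the PI controller is as follows:
$$v_\Delta(t) = k_p e_c(t) + k_i \int_0^t e_c(\tau)\,d\tau. \tag{3}$$
In this design, the controller is constructed based only on the short-time part of the model. This reduction is possible as long as the pole/zero pair of $G_{slow}(s)$ represents slow dynamics. The plant is represented by:
$$G_{fast}(s) = \frac{b_0}{s^2 + a_1 s + a_0}, \tag{4}$$
where $b_0 = k_a$, $a_1 = 2\alpha_a$, and $a_0 = \alpha_a^2 + \omega_a^2$. The closed-loop system is given by:
$$G_{cl}(s) = \frac{b_0 (k_p s + k_i)}{s^3 + a_1 s^2 + (a_0 + b_0 k_p)s + b_0 k_i}. \tag{5}$$
Let us denote the closed-loop poles as $-p_0$, $-\alpha + j\omega$, and $-\alpha - j\omega$. Then the denominator of the closed-loop system is given by $(s + p_0)\left(s^2 + 2\alpha s + \alpha^2 + \omega^2\right)$. The comparison of the two forms of the denominator gives us:
$$a_1 = 2\alpha + p_0, \qquad a_0 + b_0 k_p = \alpha^2 + \omega^2 + 2\alpha p_0, \qquad b_0 k_i = p_0\left(\alpha^2 + \omega^2\right). \tag{6}$$
The poles of $s^2 + 2\alpha s + \alpha^2 + \omega^2$ influence the behavior of the closed-loop system. Let $\Delta = -4\omega^2$ denote the discriminant of this polynomial. If $\omega$ is a real nonzero number, then the system is underdamped ($\Delta < 0$); if $\omega = 0$, then the system is critically damped ($\Delta = 0$); and if $\omega$ is an imaginary nonzero number, then the system is overdamped ($\Delta > 0$).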
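The coefficient comparison described above can be checked numerically. The sketch below, written under the assumption that the closed loop is formed with unity output feedback around the second-order plant, builds the closed-loop denominator from the gains and compares it with the expanded pole form. The helper names and the numbers are hypothetical. The square of ω is passed instead of ω itself so that the overdamped case (imaginary ω, negative square) is covered as well.

```python
def closed_loop_denominator(b0, a1, a0, kp, ki):
    """Coefficients of s^3 + a1 s^2 + (a0 + b0 kp) s + b0 ki (highest power first)."""
    return [1.0, a1, a0 + b0 * kp, b0 * ki]


def pole_form_denominator(p0, alpha, omega_sq):
    """Expansion of (s + p0)(s^2 + 2 alpha s + alpha^2 + omega^2)."""
    quad = alpha**2 + omega_sq
    return [1.0, 2.0 * alpha + p0, quad + 2.0 * alpha * p0, p0 * quad]


def gains_from_poles(b0, a0, p0, alpha, omega_sq):
    """Solve the second and third matching conditions for kp and ki."""
    kp = (alpha**2 + omega_sq + 2.0 * alpha * p0 - a0) / b0
    ki = p0 * (alpha**2 + omega_sq) / b0
    return kp, ki


# Hypothetical plant and pole choice; p0 is fixed by the constraint a1 = 2 alpha + p0.
b0, a1, a0 = 2.0, 6.0, 1609.0
alpha = 2.0
p0 = a1 - 2.0 * alpha
kp, ki = gains_from_poles(b0, a0, p0, alpha, omega_sq=1700.0)
print(kp, ki)
print(closed_loop_denominator(b0, a1, a0, kp, ki))
print(pole_form_denominator(p0, alpha, 1700.0))
```

Because only two gains are available, the first matching condition cannot be influenced by the controller; it acts as a constraint that ties p0 to α.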
The following property describes the limitation of a PI controller for a DEAP actuator. From (6), the equation with $k_p$ and $\omega$ is rewritten as:
$$\omega^2 = a_0 + b_0 k_p - \alpha^2 - 2\alpha p_0. \tag{7}$$
It is easy to see that $k_p$ can be chosen large enough to make $\omega$ real. However, to obtain an imaginary $\omega$ we must have:
$$a_0 - \alpha^2 - 2\alpha p_0 < 0 \tag{8}$$
or a negative $k_p$. A negative $k_p$ causes positive feedback, which is very rarely used in control systems. Hence, only positive values of $k_p$ are taken into account. Using (6), (8) is written as:
$$a_0 + 3\alpha^2 - 2 a_1 \alpha < 0. \tag{9}$$
The minimum of the left side of the above inequality is attained for $\alpha = \frac{a_1}{3}$ and is equal to $a_0 - \frac{a_1^2}{3}$. This means that if $a_0 - \frac{a_1^2}{3}$ is negative, then the PI controller is able to dampen the oscillations in the control system. However, if $a_0 - \frac{a_1^2}{3}$ is positive, then the PI controller cannot dampen the oscillations. In general, it is therefore not always possible to dampen the oscillations with a PI controller. In the case of the DEAP actuator described in [26], the parameters satisfy $a_0 \gg a_1$. This also means that a DEAP actuator with a PI controller has oscillations in the closed-loop system and $\omega$ takes real values.
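The damping condition discussed above reduces to a sign test on a single expression in the plant coefficients. The following sketch is a minimal illustration of that test; the function names and the example coefficients are hypothetical, and the brute-force scan is included only to confirm where the minimum lies.

```python
def min_omega_sq_at_zero_gain(a0, a1):
    """Minimum over alpha of a0 + 3 alpha^2 - 2 a1 alpha, attained at alpha = a1 / 3."""
    return a0 - a1**2 / 3.0


def pi_can_remove_oscillations(a0, a1):
    """True if some positive kp allows an imaginary omega (no oscillatory pole pair)."""
    return min_omega_sq_at_zero_gain(a0, a1) < 0.0


# Hypothetical lightly damped plant (a0 much larger than a1): oscillations remain.
print(pi_can_remove_oscillations(a0=1609.0, a1=6.0))   # False
# Hypothetical well damped plant: the PI controller can remove the oscillations.
print(pi_can_remove_oscillations(a0=10.0, a1=6.0))     # True

# Brute-force confirmation that the minimum of the quadratic is at alpha = a1 / 3.
a0, a1 = 10.0, 6.0
alphas = [k * 0.001 for k in range(0, 3001)]           # 0 <= alpha <= a1 / 2
best = min(alphas, key=lambda al: a0 + 3.0 * al**2 - 2.0 * a1 * al)
print(best)
```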
In general, the PI controller coefficients can be expressed as:
$$k_p = \frac{\alpha^2 + \omega^2 + 2\alpha p_0 - a_0}{b_0}, \qquad k_i = \frac{p_0\left(\alpha^2 + \omega^2\right)}{b_0}, \tag{10}$$
considering the constraint $a_1 = 2\alpha + p_0$. Therefore, the goal is to specify the $\alpha$, $\omega$, and $p_0$ which give the desired performance of the closed-loop system. Further, the PI controller does not allow one to freely choose all of the poles of the closed-loop system. This is visible in (6), which has three equations and two variables ($k_p$ and $k_i$). For this reason, the problem of pole placement is redefined as an optimization problem. The goal is to maximally dampen the oscillations. Therefore, the target is to choose $p_0$ and $\alpha$ such that:
$$\max_{p_0,\,\alpha}\ \min(p_0, \alpha), \tag{11}$$
taking into account the constraint $p_0 = -2\alpha + a_1$ from (6) and $\alpha, p_0 > 0$ to assure stability. The solution of the above problem is $\alpha = p_0$. Thus, from (6), the coefficients are given by $\alpha = p_0 = \frac{a_1}{3}$. Next, let us define the imaginary part of the pole (7) as:
$$\omega(r) = \sqrt{a_0 - \tfrac{1}{3}a_1^2 + b_0\left(k_{p,min} + r\right)}, \tag{12}$$
where the proportional gain $k_{p,min} \geq 0$ is some minimal value of $k_p$ and $r \geq 0$ is an auxiliary variable. As discussed earlier, the expression $a_0 - \frac{1}{3}a_1^2$ is assumed to be positive (because $a_0 \gg a_1$ for the considered DEAP actuator). Let us consider that the goal is to choose the minimal $\omega$. Since $\omega(r)$ is an increasing function of $r$, the minimum is attained for $r = 0$.
The final values of the control gains are equal to:
$$k_p = k_{p,min}, \qquad k_i = \frac{a_1}{3 b_0}\left(a_0 - \frac{2}{9}a_1^2 + b_0 k_{p,min}\right). \tag{13}$$
The stability of the closed-loop system is assured because the values of $\alpha$ and $p_0$ are positive, and hence the poles have negative real parts. Further, in the simulation part it will be shown that the gains calculated by (13) lead to large stability margins. It is worth pointing out that the controller coefficients are calculated directly from the parameters of the DEAP actuator model, which is an important advantage in adaptive control.
A linear PI controller can be designed to control a nonlinear system around a working point using the linearized model [20,28]. In our work, the linear model at a working point is defined by the transfer function $G_{DEAP}(s)$. The model $G_{DEAP}(s)$ is the linearized form of the nonlinear model, given by the transfer function (2), in which two components can be distinguished (a short-term oscillation part and a long-time relaxation process). It is also worth pointing out that our work uses the static input linearization $v = u^2$ to cancel out the input nonlinearity. Such an approach for a DEAP actuator is well known in the literature [12,27] and makes it possible to control the DEAP actuator with linear controllers such as PIDs and model reference controllers. Additionally, the extension of the control scheme with the adaptation presented in the next section allows for the correction of parameter values, which may vary due to changes in external conditions or when moving away from the working point.
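The gain rule can be illustrated with a short sketch. It assumes the choices made in the text (equal real parts for all three poles, set to one third of a1, and the smallest admissible proportional gain) and derives the integral gain from the matching conditions. The function name and the numbers are hypothetical, and the stability check uses the standard Routh-Hurwitz condition for a monic cubic.

```python
def pi_gains(b0, a1, a0, kp_min):
    """Gains for alpha = p0 = a1 / 3 and kp = kp_min (minimal oscillation frequency)."""
    alpha = a1 / 3.0
    p0 = a1 / 3.0
    omega_sq = a0 - a1**2 / 3.0 + b0 * kp_min   # from the second matching condition
    kp = kp_min
    ki = p0 * (alpha**2 + omega_sq) / b0        # from the third matching condition
    return kp, ki, alpha, p0, omega_sq


def is_stable_cubic(c2, c1, c0):
    """Routh-Hurwitz test for s^3 + c2 s^2 + c1 s + c0."""
    return c2 > 0 and c1 > 0 and c0 > 0 and c2 * c1 > c0


# Hypothetical plant coefficients, for illustration only.
b0, a1, a0 = 2.0, 6.0, 1609.0
kp, ki, alpha, p0, omega_sq = pi_gains(b0, a1, a0, kp_min=0.5)
print(kp, ki)                                                # 0.5 1602.0
print(is_stable_cubic(a1, a0 + b0 * kp, b0 * ki))            # True
```

Since the rule is a closed-form expression in the plant coefficients, it can be re-evaluated at every sample once estimates of those coefficients are available.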
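The static input linearization can be sketched as a small conversion routine that maps the controller output in working point coordinates back to the physical voltage. This is a minimal illustration: the function name, the clamping of negative squared voltages to zero, and the optional upper voltage limit are assumptions added here for safety and are not taken from the paper.

```python
import math


def voltage_from_linearized_input(v_delta, u_n, u_max=None):
    """Convert the controller output v_delta (in kV^2, around v_n = u_n^2) to a voltage u."""
    v = u_n**2 + v_delta          # back to the absolute linearized input v = u^2
    v = max(v, 0.0)               # assumption: clamp, since u^2 cannot be negative
    u = math.sqrt(v)
    if u_max is not None:         # assumption: optional amplifier limit
        u = min(u, u_max)
    return u


print(voltage_from_linearized_input(0.0, u_n=3.5))     # 3.5
print(voltage_from_linearized_input(3.75, u_n=3.5))    # 4.0
```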

Robust Adaptation
In this section, the extension of the PI controller to an adaptive version is shown. Further, we show different modifications of the adaptive control law which assure robustness. By robustness, we mean the ability to assure the stability of an adaptive system in the presence of uncertainties, unmodeled dynamics, or disturbances. If one of those phenomena exists in an adaptive system, instability of the control system or parameter unboundedness can occur [22,25]. A summary of robust adaptive laws is given, for instance, in [22]. In the simulations in our work, the unmodeled dynamics were the difference between the linear and nonlinear models (the models did not match perfectly). Additionally, in the experiments, all uncertainties related to the physical implementation (such as measurement noise and mismatch between the model and the physical device) influenced the adaptive system.
The results of simulations show that a second-order identifier is too simple to perform identification with. Therefore, the identifier was built based on the third-order model (14), which was obtained from (2). The parameters of the model $G_{fast}(s)$ defined in (4) are obtained from the identified model by the calculations (15), where $-s_0$ is a pole of (14), as presented in (2). In the DEAP actuator the state is not available. Therefore, a nonminimal state representation is applied with the filter $\Lambda(s)$ [22,23]. The DEAP actuator model is third-order; hence, the filter is defined as:
$$\Lambda(s) = s^3 + \lambda_2 s^2 + \lambda_1 s + \lambda_0. \tag{16}$$
The resulting representation and its vectors, including the parameter vector $\theta$, are given in (17). The signals $v_\Delta$ and $y_\Delta$ are directly measured from the input and output of the actuator, and the signal $z_\Delta$ is available even for unknown parameters.
The values of $\theta$ are assumed to be unknown. For this reason, the estimate of the parameters is denoted by $\hat{\theta}$. The normalized adaptive law (see [22]) is designed to estimate the parameter values. The adaptation error $\epsilon_a$ is given by (18), which includes the normalizing term $m^2$. In this work, the least-squares adaptive law with a forgetting factor is considered. The adaptive law was modified to obtain robustness: the parameters $\hat{\theta}$ are constrained, the adaptive law has a dead zone, and the norm of the covariance matrix $P$ is limited. The idea of these extensions is widely discussed in the literature [22]. The reason for the robust extension is to reduce the influence of the DEAP actuator's nonlinearities, which exist in the control system as unmodeled dynamics.
The dead zone is applied so that the parameters are updated only if the adaptation error $\epsilon_a$ is large. The adaptation law with the dead zone follows [22], with $g_0$ denoting the threshold of the dead zone. The norm of the covariance matrix $P$ is constrained as in the leakage algorithm, where $R_0 > 1$ defines the threshold on the matrix norm $\|P(t)\|$.
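The interplay of normalization, dead zone, and covariance bounding can be sketched as one forward Euler step of a least-squares estimator with a forgetting factor. The exact laws of the paper are not reproduced here: the normalization m² = 1 + φᵀφ, the discontinuous dead zone on the normalized error, the Frobenius norm, and the rule that a covariance update is accepted only if the norm stays within R0 or does not grow are all assumptions in the spirit of the textbook forms in the robust adaptive control literature. The function and variable names are hypothetical.

```python
import math


def frobenius(M):
    """Frobenius norm of a matrix stored as a list of rows."""
    return math.sqrt(sum(v * v for row in M for v in row))


def robust_ls_step(theta, P, phi, z, dt, beta, g0, R0):
    """One Euler step of normalized least squares with forgetting, dead zone, bounded P."""
    n = len(theta)
    m_sq = 1.0 + sum(x * x for x in phi)                # assumed normalization term
    z_hat = sum(t * x for t, x in zip(theta, phi))      # predicted signal
    eps = (z - z_hat) / m_sq                            # normalized adaptation error
    P_phi = [sum(P[i][j] * phi[j] for j in range(n)) for i in range(n)]

    # Dead zone: adapt only when the error exceeds the threshold g0.
    if abs(eps) > g0:
        theta = [theta[i] + dt * eps * P_phi[i] for i in range(n)]

    # Covariance with forgetting factor beta (P is assumed symmetric).
    P_new = [[P[i][j] + dt * (beta * P[i][j] - P_phi[i] * P_phi[j] / m_sq)
              for j in range(n)] for i in range(n)]
    # Bounding: accept the candidate only if it respects R0 or does not grow.
    if frobenius(P_new) <= R0 or frobenius(P_new) <= frobenius(P):
        P = P_new
    return theta, P, eps


# Hypothetical two-parameter example with a persistently exciting regressor.
theta, P = [0.0, 0.0], [[10.0, 0.0], [0.0, 10.0]]
for k in range(2000):
    t = k * 0.01
    phi = [math.sin(t), math.cos(t)]
    z = 2.0 * phi[0] - 1.0 * phi[1]                     # "true" parameters: 2 and -1
    theta, P, _ = robust_ls_step(theta, P, phi, z, dt=0.01, beta=0.05, g0=0.0, R0=100.0)
print(theta)
```

A larger g0 stops adaptation earlier and therefore trades steady-state accuracy for insensitivity to noise and unmodeled dynamics, while a smaller R0 limits the adaptation gain.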
The parameters $\theta$ define the transfer function properties such as gain, poles, and zeros. Let the vector $p$ consist of $[k_s, s_0, \alpha_a, \omega_a, z_0]$, which describes the transfer function of a DEAP actuator. Then, based on the parameters $\theta$, it is possible to find the value of $p = f(\theta)$. In this work, the constraints are defined for the parameters $p$ rather than for $\theta$: each element of $p$ is assumed to lie within a range around its nominal value, and this requirement is expressed through the function $S_{in}$. The goal of the adaptive law is to keep the function $S_{in}$ below 1, which assures that the values of $p$ stay in some range around the nominal values. The parameter update law is modified accordingly. Discontinuous equations such as (20) and (22) cannot assure that the constraints are maintained after discretization [22]. If the forward Euler method is assumed, the conditions must therefore be written for the discretized update to assure that the constraints are maintained.
The scheme of the presented adaptive controller is shown in Figure 1. The parameters of the transfer function $G_{DEAP}(s)$ described in (14) are estimated by an identifier. The parameter vector $\theta$ is defined in (17) (the estimated parameters are denoted by $\hat{\theta}$) and contains all parameters of the transfer function $G_{DEAP}(s)$. From the parameter vector $\theta$, the parameters of $G_{fast}(s)$ are obtained by the transformation defined in (15). Based on the estimated parameters of $G_{fast}(s)$, the gains of the PI controller are found using (13). Relying on the certainty equivalence principle [29], it is assumed that the parameters estimated by the adaptive laws recover the performance of the system. The presented adaptive laws are called normalized due to the term $m^2$ applied in (18). This term ensures that even if the signals $y$ and $u$ are ill posed, the normalized signal $\epsilon_a$ remains usable in the adaptive laws (more precisely, it belongs to the class $L_\infty$) [22].
Further, the extension of the adaptive law with the leakage and the dead zone ensures that the estimated parameters are bounded even under plant uncertainty and in the presence of disturbances. If the system is persistently excited, then the parameters can be estimated with a bounded error.
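The idea of checking the constraint on the discretized update, rather than on the continuous-time law, can be sketched as follows. The definition of the constraint function used here (the largest relative deviation from the nominal value, scaled so that the value 1 corresponds to the edge of the allowed range) is a hypothetical stand-in, since the exact function of the paper is not reproduced in this text. The mapping from identifier parameters to physical parameters is likewise passed in as a placeholder function.

```python
def s_in(p, p_nom, rel_range=0.5):
    """Hypothetical constraint function: below 1 while every p_i is within its range."""
    return max(abs(p_i - n_i) / (rel_range * abs(n_i)) for p_i, n_i in zip(p, p_nom))


def projected_euler_step(theta, theta_dot, dt, to_physical, p_nom):
    """Forward Euler step that is accepted only if the candidate keeps S_in below 1."""
    candidate = [t + dt * d for t, d in zip(theta, theta_dot)]
    if s_in(to_physical(candidate), p_nom) < 1.0:
        return candidate
    return list(theta)               # reject the step: keep the previous estimate


# Hypothetical example with an identity mapping between theta and p.
identity = lambda th: th
p_nom = [2.0, 4.0]
print(projected_euler_step([2.0, 4.0], [10.0, 0.0], 0.05, identity, p_nom))  # accepted
print(projected_euler_step([2.0, 4.0], [10.0, 0.0], 0.20, identity, p_nom))  # rejected
```

Testing the candidate after the Euler step guarantees that the stored estimate never leaves the allowed set, which a condition evaluated only on the current estimate cannot guarantee for a finite step size.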

Simulations
For the initial conditions, the PI controller gains were equal to $k_p = 0.014$ and $k_i = 2.02$ (underestimated), and $k_p = 0.014$ and $k_i = 3$ (overestimated). The minimal value of the gain $k_{p,min}$ was set to 0.014. The PI controller gains calculated by formula (13) lead to a stable closed loop if the true parameters are known. This is visible in Figure 2, where in the nominal case (meaning that the PI controller gains are calculated for the true plant parameters) the Nyquist criterion is satisfied [30]. Further, the system has large stability margins (around 9.8 dB and 90° for the gain and phase margin, respectively). Additionally, to present an example of parameter sensitivity, the underestimated and overestimated PI controller gains were applied to the nominal plant. In both cases the characteristics of the system varied, which shows the importance of having an adaptive controller to compensate.
The reference is a square wave signal with period $T = 8$ s and amplitude 0.1 mm. The other parameters of the adaptive controller were $\beta = 0.05$ and $P(0) = 10^4 I_{5\times 5}$. The nonminimal state representation filter (16) was defined by $\lambda_2 = 30$, $\lambda_1 = 300$, and $\lambda_0 = 1000$, which places a triple pole at $s = -10$. To make the simulation more realistic, noise with amplitude $10^{-6}$ was added to the DEAP actuator's output.
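The coefficients of a third-order filter polynomial with all poles at the same location follow from expanding the product of its linear factors. The short sketch below, with a hypothetical helper name, performs this expansion for a triple pole at s = -10, in descending powers of s.

```python
def poly_from_roots(roots):
    """Coefficients (highest power first) of the monic polynomial with the given roots."""
    coeffs = [1.0]
    for r in roots:
        # Multiply the current polynomial by (s - r).
        coeffs = [a - r * b for a, b in zip(coeffs + [0.0], [0.0] + coeffs)]
    return coeffs


# (s + 10)^3 = s^3 + 30 s^2 + 300 s + 1000
print(poly_from_roots([-10.0, -10.0, -10.0]))   # [1.0, 30.0, 300.0, 1000.0]
```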
To make the results clearer, the reference output $y_r$ was introduced as the response of a nominal closed-loop system. The nominal closed-loop system was the DEAP actuator linear model with nominal parameters and a PI controller with gains calculated based on the plant model. The reference output describes the output of the adaptive system when all parameters converge to their nominal values. The reference error $e_r = y - y_r$ shows the difference between the current output of the actuator and the reference output.
The first analysis is presented for cases with under/overestimated $\alpha_a$ and with adaptation turned on/off. The aim of this analysis was to show the advantage of the adaptive system. The parameters of the robust adaptive controller were set to $R_0 = 100$ and $g_0 = 1.0 \times 10^{-5}$. The simulation results are shown in Figure 3. At the beginning of the adaptation process, shown in Figure 3a,b, the reference error $e_r$ is large for all cases. There is also a visible difference between the underestimated and overestimated $\alpha_a$. At the end of the adaptation process, shown in Figure 3c,d, the reference error of the PI controller with adaptation is much lower than that of the static PI controller. The improvement in control system behavior is also visible in the performance indexes given in Table 1. The performance improvement of the adaptive PI controller was at least 40% compared to the static alternative.
The robustness of an adaptive control system becomes visible after a long running time. Therefore, the analysis was run for a long time: $T_f = 800$ s. To simplify the visualization, the function $\gamma$ is introduced for a signal $s(t)$ defined on $t \in [0, T_f]$. The goal of the function $\gamma$ is to display the peak of the absolute value of the signal within a window $T$. This is useful in the analysis of the reference error in long processes. The visualization of the reference error with the function $\gamma$ is shown in Figure 4.
It is worth mentioning that the influence of the robust adaptive laws is visible in the analysis of the long process.
The second analysis was performed for the dead zone $g_0$ ($R_0$ was constant and equal to $10^4$, which is a large value). The results are shown in Figure 5 for the following values of $g_0$: $0$, $10^{-5}$, $10^{-4}$, $10^{-3}$, $0.5 \times 10^{-2}$, and $10^{-2}$. In the case of large values of $g_0$, the function $S_{in}$ did not produce large values (near 1). However, the reference error also did not decrease. Therefore, the influence of adaptation on the system was limited, and hence there was little improvement in control error. Decreasing $g_0$ caused the reference error to get close to 0; however, the parameters moved further from the nominal values.
The third analysis was performed for the constraint on the norm of the matrix $P$. The results for the values $R_0 = 10^2$, $5 \times 10^2$, $10^3$, $5 \times 10^3$, $7.5 \times 10^3$, and $10^4$ are shown in Figure 6. It is visible that increasing $R_0$ did not have a strong influence on the reference error. However, the constraint prevented the parameters from growing in the adaptive control system. Hence, it improved the robustness properties of the control system.
In the last part, a summary simulation of the proposed algorithms is presented in Figure 7. The goal of the figure is to show the differences in parameter transients for various adaptive laws. The values of the parameters are represented by the function $S_{in}$, which is close to 0 if a given parameter is estimated near the nominal (true) value. If the projection is not applied, its values can grow without limit. It is visible in Figure 7 that an adaptive law without projection, leakage, and a dead zone can cause unbounded parameters. Thanks to the application of the projection, $S_{in}$ was limited to below 1, as presented in Figure 7. The modifications of the adaptive laws described in Section 4 led to estimates close to the nominal (true) values of the parameters. From this point of view, the adaptive law is called a robust adaptive law.
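A windowed peak of the absolute value, as used for visualizing long error records, can be sketched for a uniformly sampled signal as follows. The placement of the window (trailing, ending at the current sample) and the function signature are assumptions made for this illustration; the paper's exact definition is not reproduced in this text.

```python
def gamma(signal, dt, window):
    """Peak of |signal| over a trailing window of the given duration, per sample."""
    n = max(1, int(round(window / dt)))          # window length in samples
    abs_vals = [abs(v) for v in signal]
    return [max(abs_vals[max(0, k - n + 1):k + 1]) for k in range(len(abs_vals))]


# Hypothetical short record sampled every 1 s, with a 2 s window.
print(gamma([0.0, -3.0, 1.0, 0.5, 0.2], dt=1.0, window=2.0))
# [0.0, 3.0, 3.0, 1.0, 0.5]
```

Choosing the window equal to the period of the reference signal gives one envelope value per reference cycle, which makes slow trends in the error easier to see.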

Experiments
The experiments on a DEAP actuator were performed to verify the presented controller. The construction of the DEAP actuator was based on a mass bias. The laboratory setup is presented in Figure 8. The distance was measured with a laser sensor, and the voltage was applied by a high-voltage amplifier. The signals were processed by a data acquisition card connected to a computer. The construction of the actuator was the same as in [26]; however, because a new production sample was used, the identification of parameters was performed once again.
To get information about the behavior of the system, a series of step responses was measured in different working conditions. Due to working in the local coordinates of the transfer function $G_{DEAP}(s)$ (2), the control signal $v_\Delta$ was the input and $y_\Delta$ was the output. The output was filtered by a low-pass Butterworth filter with a cut-off frequency of 100 Hz. The results of the step responses are shown in Figure 9. It is visible that varying the nominal voltage $u_n$ caused the damping of the response to vary. In the case of a varying mass, the oscillation period is slightly different. This was one of the motivations for building an adaptive controller, because varying working conditions lead to different transfer function parameters.
Further, relying on the step responses, the identification of the transfer function coefficients was performed. The process was split into two steps. In the first one, the parameters of the transfer function $G_{fast}(s)$, which is responsible for the fast dynamics, were identified. In the second one, the transfer function $G_{slow}(s)$ was identified. In both cases the parameters were found for a single operating condition, that is, a constant mass and a nominal voltage. In the case of $G_{fast}(s)$, the optimization problem was solved for the parameters of $G_{fast}(s)$ with a goal function based on the measured and modeled responses, where $T_{fast}$ is the duration of a fast step response (in our experiments it was 2.5 s), $y_m$ is the measured response, and $y_{fast}$ is the response of the transfer function $G_{fast}(s)$.
The goal function takes into account steps with different amplitudes $v_\Delta^i$. In our experiment, eight steps were performed (four with positive amplitude and four with negative amplitude): the voltage $u$ was set to $u = u_n \pm 0.25$ kV, $u = u_n \pm 0.5$ kV, $u = u_n \pm 0.75$ kV, or $u = u_n \pm 1$ kV. The optimization was performed by the Nelder-Mead algorithm. The identification of the slow dynamics was performed by solving an analogous problem, where $T_{slow}$ is the final time of a slow step response (in our experiments it was 30 s), and $y_{slow}$ is the response of the transfer function $G_{DEAP}(s)$. An example comparison of the identified model with the measurement is presented in Figure 10. The parameters of the transfer functions for different conditions are presented in Table 2.
To verify the properties of the adaptive control system, the experiments were performed with the described PI controller. The experiment was performed for the working point $u_n = 3.5$ kV and a mass of 59 g. The identified parameters were used to set the ranges of the control system parameters. Each parameter's range was defined as 0.5 of the nominal value. The initial parameters were set to the nominal values, aside from the value of $\alpha_a$, which was decreased by 40%. The other parameters were the same as in the simulation. The sample transients with $R_0 = 10$ and $g_0 = 0.01$ are presented in Figure 11a-c. It is visible that the reference error was much greater at the beginning of the adaptation process. In the steady state, in which the adaptation had finished the estimation of the parameters, the reference error was significantly lower. Further, the output transients show minimal oscillations. This shows that a PI controller cannot fully dampen the oscillations, as was shown earlier in this paper. It is worth mentioning that the control signal did not contain any noise. This was due to the strong influence of the integral action and the low value of the proportional gain.
Relying on the presented results, we can state that the adaptation improves the performance of the system. The analysis of robustness in the adaptive controller was performed by varying the parameters: $R_0 = 10$ and $10^4$; and $g_0 = 0$ and $0.01$. It is worth pointing out that a small value of $R_0$ constrains the matrix $P$ and hence increases robustness. In the case of the dead zone parameter $g_0$, the value of 0.01 should improve robustness because the adaptation law does not act for small errors. The summary of transients is shown in Figure 11d-f. In this figure, the reference error is transformed by the function $\gamma$ (28) to increase visibility. The transients of the error show that the adaptation decreases the error. Further, the algorithm with no robustness ($g_0 = 0$, $R_0 = 10^4$) has a high level of error (it can be regarded as the nominal case with a standard adaptive law). The case with the highest robustness ($g_0 = 0.01$, $R_0 = 10$) has a low level of error and very smooth transients of the estimated control gains. The performance indexes were computed for all four cases and are presented in Table 3. It is clear that applying one of the robustness algorithms significantly improves the quality of control.
Table 3. Performance indexes for the experiment with different levels of robust adaptation ($g_0 = 0$: no dead zone; a large $R_0$ approximates an unconstrained covariance matrix $P$).
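The fit of the fast dynamics described above can be sketched as a cost function that compares measured step responses with the analytic step response of an underdamped second-order model. The sum of squared errors over all samples and steps is an assumed form of the goal function, and the function names and numerical values are hypothetical. The data in the example are synthetic, generated from the model itself, and do not represent measurements.

```python
import math


def fast_step_response(t, v_step, k_a, alpha_a, omega_a):
    """Step response of k_a / (s^2 + 2 alpha_a s + alpha_a^2 + omega_a^2) to a step v_step."""
    gain = k_a / (alpha_a**2 + omega_a**2)               # steady-state gain of the fast part
    decay = math.exp(-alpha_a * t)
    osc = math.cos(omega_a * t) + alpha_a / omega_a * math.sin(omega_a * t)
    return v_step * gain * (1.0 - decay * osc)


def fit_cost(params, times, steps):
    """Sum of squared errors over all steps; steps is a list of (v_step, measured_samples)."""
    k_a, alpha_a, omega_a = params
    cost = 0.0
    for v_step, y_meas in steps:
        for t, y in zip(times, y_meas):
            cost += (y - fast_step_response(t, v_step, k_a, alpha_a, omega_a)) ** 2
    return cost


# Synthetic "measurements" from hypothetical true parameters, two step amplitudes.
true_params = (2.0, 3.0, 40.0)
times = [k * 0.005 for k in range(500)]                  # 2.5 s at 5 ms sampling
steps = [(v, [fast_step_response(t, v, *true_params) for t in times]) for v in (1.0, -1.0)]
print(fit_cost(true_params, times, steps))               # 0.0
print(fit_cost((2.0, 3.5, 41.0), times, steps) > 0.0)    # True
```

Such a function can be handed to a derivative-free optimizer; for example, `scipy.optimize.minimize` accepts `method="Nelder-Mead"`, which matches the algorithm named in the text.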

Conclusions
This work presents the design of a PI controller for a DEAP actuator. The simplicity of the controller limits how far the behavior of the control system can be tuned: pole placement for the closed-loop system is only partially possible. As a result of the design process, simple rules for the proportional and integral gains were given. Further, the extension with adaptive parameters was described. When parameter variations cause the PI controller to be tuned for non-nominal parameters, the adaptive system improves the performance significantly. The influence of the robust adaptive laws was analyzed in the simulations. It was shown that bounding the covariance matrix improves the robustness of the control system. Further, the experiments performed in the laboratory confirmed the theoretical and simulation results.
Further research topics include the application of the developed control to different geometries of the DEAP actuator. Moreover, we plan to work on the development of more advanced estimation techniques implementing the sensorless interaction control approach.

Conflicts of Interest:
The authors declare no conflict of interest.
