Data-Driven Adaptive Controller Based on Hyperbolic Cost Function for Non-Affine Discrete-Time Systems with Variant Control Direction

Abstract: As technology evolves, increasingly complex non-affine systems are created. These systems are hard to model, whereas most controller designs require information about the system. Such information is especially hard to obtain for systems with varying control directions. Therefore, this study introduces a novel data-driven estimator and controller tailored for single-input single-output non-affine discrete-time systems, focusing on cases where the control direction varies over time and the mathematical model of the system is completely unknown. The estimator and controller are constructed using a Multiple-input Fuzzy Rules Emulated Network (MiFREN) framework. The weight vectors are updated through gradient descent optimization, which employs a unique cost function that multiplies the error by its hyperbolic tangent. The stability analyses demonstrate, via Lyapunov functions, that both the estimation and tracking errors are uniformly ultimately bounded (UUB). To validate the results, we present force-control experiments executed on the z-axis of a drive-controlled 3D scanning robot, a system with a varying control direction, together with comparison results against a state-of-the-art controller. The results show a mean absolute percentage tracking error smaller than one percent in the steady state and the expected variation in the system's control direction.


Introduction
In the continually growing landscape of control systems, adaptive controllers have garnered significant attention owing to their adeptness in handling the intricate dynamics of modern systems, which are often unknown and highly non-linear [1,2]. With these sophisticated systems, there has been a surge in the availability of system status information, enabling adaptive controllers to rely less on system knowledge. This shift has led to the emergence of data-driven controllers (DDCs) and model estimators [3,4]. Researchers typically categorize the adaptation of DDCs into online learning, offline learning, and hybrid approaches that combine both online and offline learning strategies.
Among the most popular online DDC methods, model-free adaptive control (MFAC) [5][6][7][8] has the advantages of a low computational cost compared with other online methods and of requiring no information about the system besides the control direction. This method has mainly been used for non-linear systems. Regarding the offline methods, iterative feedback tuning (IFT) [9,10] and virtual reference feedback tuning (VRFT) [11,12] focus on parameter adaptation/identification; IFT adapts over iterations according to the gradient descent method, whereas VRFT looks for the global minimum of the available data. Both require some controlled experiments on the system to gather data on its behavior prior to controlling it. Offline methods usually have lower computational costs than online methods but require more information about the system. No offline methods have reported results for systems with varying control directions. Hybrid methods include the popular iterative learning control (ILC) [13,14], which is commonly used for systems with repetitive tasks. This method has the advantage of decreasing the error in each cycle but requires some prior knowledge of the system. All of these controllers share a common disadvantage: they need to know the control direction of the system.
In 1983, Roger D. Nussbaum [15] proposed an adaptive control solution for systems where the control direction is unknown. His controller could deal with the unknown control direction by adapting a control gain; the proposed function slowly adapts to the unknown control direction. It has been used to solve different problems such as those pertaining to non-affine systems [16,17], non-linear systems [18], switched systems [19], reducing the computational cost [20,21], industrial applications [22,23], and time-varying control gains with no change in sign [24]. Unfortunately, to the authors' knowledge, only very few articles address the problem of systems with varying control directions (control gain with sign change) [25,26]. Both articles use a Nussbaum-type function alongside fuzzy observers and controllers. Their focus is on affine non-linear systems; they only report simulation results to validate their theoretical analysis, and one of them does not present a graph of the control-related parameter estimation.
Unlike affine systems, non-affine systems exhibit a non-linear relation between the system output and the control input [27][28][29]. This property makes it very difficult to find an exact solution for them. Non-affine systems have countless applications, such as active magnetic bearings, aircraft dynamics, biochemical processes, dynamic models in pendulum control, underwater vehicles, and so on [30,31]. The common approach to controlling non-affine systems is adaptive control, mostly involving neural networks, where the control direction is either known or estimated with the Nussbaum gain [32][33][34]. Some of these systems also show time-varying sign behavior in the relation between the system output and the control input. Therefore, our focus is the development of an adaptive controller for non-affine systems with time-varying control direction behavior.
This work presents two significant contributions. First, we introduce a novel controller capable of effectively managing non-affine, non-linear discrete-time systems with varying control directions. This adaptive controller showcases remarkable versatility in handling such systems, enabling precise control even amidst changing control directions. Then, we propose a novel cost function, employing the hyperbolic tangent of the error multiplied by the error, diverging from traditional quadratic or absolute functions. Investigations reveal that this innovative cost function facilitates faster responses to aggressive system changes while ensuring smoother control laws and estimated function responses. These enhancements significantly improve the overall system performance and may find valuable applications in redundancy scenarios for future works.
The rest of this work is organized as follows: Section 2 outlines the requirements and assumptions of the systems to control. Section 3 introduces the model estimator and its update law according to the gradient descent method; we propose an unusual cost function, the hyperbolic tangent of the error multiplied by the error, and the stability proof of the estimator is provided at the end of Section 3. Section 4 develops a model-free adaptive controller, where the weight vector is also updated according to the gradient descent method with a cost function similar to the estimator's; the stability proof of the closed-loop system is provided at the end of Section 4. Section 5 shows the performance of the controller and estimator for a highly non-linear system with a changing control direction and also provides a comparison of the experimental results with a state-of-the-art controller. Finally, Section 6 offers suggestions for future work and concluding remarks.

Problem Formulation
A single-input single-output non-affine non-linear discrete-time system is described as

y(k + 1) = F(y(k), . . . , y(k − n_y), u(k), . . . , u(k − n_u)), (1)

with unknown indices n_y and n_u. It has the following affine representation:

y(k + 1) = f(k) + g(k)u(k) + ε_h(•), (2)

where f(k) and g(k) are unknown functions and ε_h(•) is a bounded residual error, provided the non-affine system (1) meets the following assumptions:

Assumption 1. The non-linear function F(•) in (1) must be continuous with respect to the control law u(k). This implies that

∂y(k + 1)/∂u(k) = g(k), (3)

where 0 < |g(k)| ≤ g_M and g_M is an unknown positive constant. Therefore, the system (1) is controllable with unknown and varying control directions.
According to Assumption 1, the control direction is determined by utilizing the sign function of g(k) in (3) as

sign{g(k)} = sign{∆y(k + 1)/∆u(k)}, (4)

where ∆u(k) ≠ 0. Therefore, the equivalent model based on MiFREN [35] is developed in the following section.
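To make Assumption 1 and the sign rule in (4) concrete, the following is a minimal numerical sketch. The toy plant below is a hypothetical example (not the paper's robot), chosen only so that its control gain g(k) changes sign over time; the finite-difference sign estimate follows the PPD approximation used later in Section 5.

```python
import numpy as np

# Hypothetical toy non-affine plant (illustration only, not the paper's system):
#   y(k+1) = sin(y(k)) + cos(0.05*k) * u(k) + 0.1*u(k)**3
# Its effective control gain g(k) = dy(k+1)/du(k) = cos(0.05*k) + 0.3*u(k)**2
# changes sign over time, which is exactly the situation Assumption 1 allows.

def plant(y, u, k):
    return np.sin(y) + np.cos(0.05 * k) * u + 0.1 * u**3

def control_direction(y_next, y_prev, u, u_prev):
    """Estimate sign{g(k)} from finite differences, as in Equation (4);
    returns 0 when du(k) = 0, where (4) is not defined."""
    du = u - u_prev
    if du == 0.0:
        return 0
    return int(np.sign((y_next - y_prev) / du))
```

For example, with u changing from 0 to 1 and the output rising from 0 to 1.1, the estimated direction is +1; with the output falling, it is −1.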

Model Estimator
In this work, a class of non-linear discrete-time systems described in (1) and (2) is considered, where f(k) and g(k) are unknown functions. Therefore, the adaptive network MiFREN is used to estimate those functions as

f(k) ≈ β*_fᵀ φ(k),    g(k) ≈ β*_gᵀ φ(k),

where φ(k) is the multidimensional vector of the membership functions and β*_f and β*_g are the unknown ideal weight vectors.
By utilizing the equivalent model based on MiFREN, the dynamics in (5)-(7) are estimated as

ŷ(k + 1) = f̂(k) + ĝ(k)u(k). (8)

Therefore, the MiFREN implementation leads to

f̂(k) = β_fᵀ(k) φ(k),    ĝ(k) = β_gᵀ(k) φ(k),

where β_f(k) and β_g(k) are iterative weight vectors used to estimate the functions f(k) and g(k), respectively. This implementation is illustrated in the diagram of Figure 1, which shows that the inputs of the estimator are the system output and the estimation error at the kth iteration, to avoid causality issues. Both inputs enter simple fuzzy membership functions µ and are later combined in the multidimensional membership function φ(k).
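A minimal sketch of this estimator structure follows. The Gaussian membership shape, the choice of three fuzzy sets per input, and the product-based combination into φ(k) are illustrative assumptions; the paper's actual MiFREN tuning is not specified here.

```python
import numpy as np

# Illustrative membership design (assumed): three Gaussian sets per input.
CENTERS = np.array([-1.0, 0.0, 1.0])
WIDTH = 1.0

def memberships(x):
    """Simple Gaussian membership functions mu for one scalar input."""
    return np.exp(-((x - CENTERS) ** 2) / (2 * WIDTH**2))

def phi(y_k, e_hat_k):
    """Multidimensional membership vector phi(k): product (outer) combination
    of the per-input memberships, flattened and normalized."""
    m = np.outer(memberships(y_k), memberships(e_hat_k)).ravel()
    return m / m.sum()

def estimate(y_k, e_hat_k, u_k, beta_f, beta_g):
    """yhat(k+1) = fhat(k) + ghat(k)*u(k), with fhat = beta_f^T phi and
    ghat = beta_g^T phi, as in (8)."""
    p = phi(y_k, e_hat_k)
    f_hat = beta_f @ p
    g_hat = beta_g @ p
    return f_hat + g_hat * u_k, p, g_hat
```

Because φ(k) is normalized, setting every entry of β_g to the same constant makes ĝ(k) equal to that constant, which is a convenient sanity check.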
Then, the estimation proceeds as explained in Equations (8)-(10). The weight vectors β_f(k) and β_g(k) are updated by the gradient descent method. Thus, the estimation error ê(k + 1) is introduced as

ê(k + 1) = y(k + 1) − ŷ(k + 1). (11)

Recalling (8)-(10) with (11), it yields

ê(k + 1) = y(k + 1) − β_fᵀ(k) φ(k) − β_gᵀ(k) φ(k) u(k). (12)

The cost function Ê(k + 1) is then selected as a positive semi-definite function:

Ê(k + 1) = tanh(ê(k + 1)) ê(k + 1). (13)

It is worth observing that the proposed cost function (13) is developed here to reduce the high-frequency behavior, which will be discussed later in the experimental results. Therefore, the update laws for the weight parameters are formulated by the gradient descent method as

β_f(k + 1) = β_f(k) − η_f ∂Ê(k + 1)/∂β_f(k) (14)

and

β_g(k + 1) = β_g(k) − η_g ∂Ê(k + 1)/∂β_g(k), (15)

where η_f and η_g are the learning rates for β_f and β_g, respectively.
Using the chain rule, the derivative of the cost function (13) is obtained as

∂Ê(k + 1)/∂β_f(k) = −h_ê(k + 1) φ(k), (16)

where

h_ê(k + 1) ≜ sech²(ê(k + 1)) ê(k + 1) + tanh(ê(k + 1)). (17)

Then, the derivative of the cost function (13) with respect to β_g(k) is calculated as

∂Ê(k + 1)/∂β_g(k) = −h_ê(k + 1) φ(k) u(k). (18)

Thus, substituting (16) into (14) and (18) into (15), the update laws for the weight parameters can be formulated as

β_f(k + 1) = β_f(k) + η_f h_ê(k + 1) φ(k) (19)

and

β_g(k + 1) = β_g(k) + η_g h_ê(k + 1) φ(k) u(k). (20)

It is seen that η_f and η_g can play an important role in the model performance. Thus, η_f and η_g are determined as in the following theorem.
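Before the theorem, the update laws (19) and (20) can be sketched as below. The learning rates and the scalar regression case used in the usage note are illustrative assumptions.

```python
import numpy as np

# Gradient-descent update laws (19)-(20) driven by the hyperbolic cost
# E = tanh(e)*e, whose derivative with respect to e is
# h(e) = sech^2(e)*e + tanh(e).

def h(e):
    return (1.0 / np.cosh(e)) ** 2 * e + np.tanh(e)

def update_estimator(beta_f, beta_g, phi_k, u_k, e_hat_next,
                     eta_f=0.1, eta_g=0.1):
    """One gradient-descent step on the estimator weight vectors:
    beta_f += eta_f * h(ehat) * phi,  beta_g += eta_g * h(ehat) * phi * u."""
    beta_f = beta_f + eta_f * h(e_hat_next) * phi_k
    beta_g = beta_g + eta_g * h(e_hat_next) * phi_k * u_k
    return beta_f, beta_g
```

In a scalar toy case (φ(k) = 1, constant input u = 1, target output 5), repeatedly applying this step drives ŷ = β_f + β_g toward the target, illustrating the convergence the theorem formalizes.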
Theorem 1. A class of non-affine non-linear discrete-time systems (1) that can be represented in an affine way (5), meeting the three assumptions made in Section 2, can also be estimated by (8) based on MiFREN. The estimation error, along with the internal signals, is convergent when the estimator parameters are designed following the conditions in (21).

Proof. To verify the convergence of the estimation error and of the internal signals of the estimator, let us select a suitable Lyapunov function, whose differentiation is calculated in (24). Utilizing the learning laws developed in (19) and (20), we obtain (25) and (26), respectively. By substituting (25) and (26) into (24), we have (27). According to (21), where η_f = η_g = η_T, the relation in (27) can be rewritten as (28). From the definition of the estimation error (12) and by employing (28), it yields (30). Let us define ξ_ê(k + 1) ≜ ê(k + 1) h_ê(k + 1); thus, from the last equation, (31) follows. Figure 2 shows the cost function Ê(ê(k)), a positive semi-definite function dependent on the estimation error ê(k). Furthermore, Figure 3 shows that h_ê(k + 1) is a bounded function regardless of the value of ê(k + 1), with the limits −1.2 ≤ h_ê(k + 1) ≤ 1.2. Given that ε_h(k) h_ê(k + 1) is a bounded function with an unknown sign, the Lyapunov function differentiation (30) can be rewritten by bounding this cross term, using the fact that, for positive definite values a and b and an unknown value c, the product ca can be bounded by ba. Limits for the learning rate η_T are then derived according to the known functions. It is suggested to set the learning rate as above when no offline learning has taken place, to accelerate the error convergence to a bounded compact set; in this case, a zero initial weight vector is recommended if there is no human knowledge of the system behavior to pass into the neural network. If the initial parameters of the estimator are obtained by offline learning or previous behavioral knowledge of the system, the learning rate can be set accordingly. Recalling the
differentiation of the Lyapunov function (31), together with the previous statements, it can be seen that the differentiation is negative semi-definite when the estimation-error-related parameter ξ_ê(k + 1) is bounded. This boundary concludes the stability proof, where the estimation error and the estimator's internal signals are established as UUB according to the proposed Lyapunov function (for more information, see the Lyapunov extension Theorem 2.5.7 in [36]).
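The bound −1.2 ≤ h_ê(k + 1) ≤ 1.2 used in the proof can be verified numerically. The extrema of h(e) = sech²(e) e + tanh(e) occur where e·tanh(e) = 1 (e ≈ ±1.19968), and at such a point tanh(e) = 1/e, so h(e) = 1/e + e(1 − 1/e²) = e; the peak magnitude is therefore ≈ 1.19968, strictly below 1.2. A quick grid check:

```python
import numpy as np

# Numerical check of the proof's bound: |h(e)| <= 1.2 for all e, where
# h(e) = sech^2(e)*e + tanh(e) is the derivative of the cost tanh(e)*e.

def h(e):
    return (1.0 / np.cosh(e)) ** 2 * e + np.tanh(e)

e = np.linspace(-50.0, 50.0, 200_001)
peak = np.max(np.abs(h(e)))
print(f"max |h(e)| on the grid: {peak:.5f}")  # close to 1.19968
```

This also confirms that h saturates: far from the origin h(e) → sign(e), so large errors produce bounded, smooth updates, which is the stated motivation for the hyperbolic cost.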
The following section proposes an adaptive controller that can deal with varying and unknown control directions, based on information provided by the estimator.

Data-Based Model-Free Adaptive Controller
An adaptive data-driven controller is proposed with a direct control scheme (Figure 4). The adaptive controller depends on the closed-loop system tracking error and the desired trajectory. The update of the weight vector β_c(k) depends on the model estimator, as described in this section.
Defining the system's tracking error as

e(k + 1) = r(k + 1) − y(k + 1), (35)

where r(k + 1) is the desired trajectory, we propose a MiFREN-based adaptive controller as

u(k) = β_cᵀ(k) φ_c(k), (36)

where φ_c(k) is a multidimensional membership-function vector and β_c(k) is the weight vector for the controller. The fundamental difference of the proposed controller lies in the update method for the weight vector, which is performed with the gradient descent method and a novel cost function: the hyperbolic tangent of the tracking error multiplied by the tracking error. It is worth noticing that this is a model-free adaptive controller, and no further information on the system is required at this point. The update method for the weight vector β_c(k) is discussed in more detail in this section. The update law of the weight vector is defined according to the gradient descent method as

β_c(k + 1) = β_c(k) − η ∂E(k + 1)/∂β_c(k), (37)

with the cost function E(k + 1) = tanh(e(k + 1)) e(k + 1).
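A closed-loop sketch of the controller (36) with the update law (37)/(40) follows. The affine toy plant used in the usage note, the scalar membership vector, and the learning rate are illustrative assumptions, not the experimental system.

```python
import numpy as np

# Sketch of the model-free controller u(k) = beta_c^T phi_c(k) with the
# gradient-descent update driven by the hyperbolic cost E = tanh(e)*e.

def h(e):
    return (1.0 / np.cosh(e)) ** 2 * e + np.tanh(e)

def control(beta_c, phi_c):
    """Controller (36): inner product of weights and membership vector."""
    return beta_c @ phi_c

def update_controller(beta_c, phi_c, e_next, g_hat, eta=0.05):
    """Update law (40): beta_c += eta * h(e(k+1)) * ghat(k) * phi_c(k).
    Since ghat(k) comes from the estimator, a sign change in the control
    direction automatically flips the adaptation direction."""
    return beta_c + eta * h(e_next) * g_hat * phi_c
```

On a simple stable affine plant y(k+1) = 0.5 y(k) + 2 u(k) with a constant reference r = 1 and the true gain supplied as ĝ, iterating this loop drives the tracking error to zero.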
As was stated for the estimator cost function, this type of function has the advantage of being smooth near the origin, unlike functions with absolute values. The partial derivative needed for the update law (37) is obtained with the chain rule, and the partial derivative of the tracking error with respect to the weight vector is obtained as

∂e(k + 1)/∂β_c(k) = −(∂y(k + 1)/∂u(k)) φ_c(k). (38)

The term ∂y(k + 1)/∂u(k), according to Theorem 1, is estimated by ĝ(k) in (39). Then, the partial derivative of the cost function is approximated as

∂E(k + 1)/∂β_c(k) ≈ −h_e(k + 1) ĝ(k) φ_c(k),

and substituting this into the update law (37), we obtain a feasible update law

β_c(k + 1) = β_c(k) + η h_e(k + 1) ĝ(k) φ_c(k), (40)

with h_e(k + 1) ≜ sech²(e(k + 1)) e(k + 1) + tanh(e(k + 1)). Per the stability proof of Theorem 2, the update law can also be stated as (41).

Theorem 2. A class of non-affine non-linear discrete-time systems (1) represented in an affine way (2) is estimated as (8) if the original system (1) follows the assumptions described in Section 2. The tracking error, along with the internal signals, is convergent with the system estimator according to Theorem 1, the controller (36), and the update law (40) or (41), if the parameters are designed following the conditions derived below.

Proof. To verify the convergence of the closed-loop tracking error and of the system's internal signals, a Lyapunov function is selected, and its differentiation is calculated in (43). Defining β̃_c(k + 1) as the difference between the ideal weight vector β*_c and the current iteration β_c(k + 1), and substituting the update law (40), the weight vector error is described as (44). In a similar sense, considering that the ideal weight produces the ideal controller and no tracking error, and substituting into the tracking error (35), it is rearranged as (45). With Equations (44) and (45), the Lyapunov function differentiation (43) is rewritten as (46) and, after some mathematical manipulation, finally takes a compact form. From the last equation, the learning rate η needs to be
positive definite. The term multiplied by η also needs to be positive definite; to ensure this inequality, the learning rate is bounded, and the boundary of the learning rate follows.
Considering that 2e(k + 1) ≥ h_e(k + 1) and that ε(k) is small enough to be negligible, the learning-rate boundary is rewritten, and the final boundary of the learning rate is obtained. In a similar sense, we can say that η ≤ sign{g(k)} g_min g(k); substituting this into the update law (40) and using (39), the law is rearranged as (50). If we define the corresponding compact term and substitute it in (50), we obtain the update law in (41).
From the previous boundary, it is derived that the Lyapunov differentiation (46) can be rewritten accordingly. With the knowledge of the boundaries of the function h_e(k + 1), it is also deduced that |h_e(k + 1)| ≤ 1.2; replacing these boundaries, the last equation is rearranged. For negative or positive constants a and b, the inequality (a + b)² ≤ 2a² + 2b² is always met; with that property in mind, the last equation can be rewritten once more. To ensure stability in the resulting expression, we must analyze the boundaries of the different terms. The boundary of the tracking error e(k + 1) is established from that equation and is defined with Ω_e ≜ 1.44 η² g²(k) φ_c²(k) + 2ε²(k).
On the other hand, the boundary of the weight vector error β̃_c(k) is defined from the same equation, where, by adding and subtracting 1.2η, some terms can be rearranged into a perfect-square binomial. Isolating the term β̃_c(k), its bound is obtained. This concludes the stability proof, with the boundedness of the closed-loop system's tracking error and internal signals. The next section shows experimental results to validate the proposed controller's performance.

Validation Results
For the experiments, we used a Cartesian robot whose motor speed is controlled by a driver (with frequency and direction as the input) and whose output is the sensed force; both the input and output correspond to the z-axis of the robot. The robot was designed at Cinvestav Saltillo. It uses servo-motors with a terminal voltage of 60 VDC, a continuous torque of 0.353 Nm, and an incremental encoder (AMT102), controlled with a generic driver. The generic driver is connected to a computer through an NI DAQ Multifunction SCB-68, which also controls the Agilent 33220A pulse generator and the B&K Precision 1666 power supply. The TW Transducer 9105-TW-MINI58 force sensor is also connected to the PC with an ATI Industrial Automation 9620-05-DAQ. The control algorithm runs on MATLAB 2013, and the computer has a GenuineIntel 7 processor, 2 GB of RAM, and an integrated hard disk of 150 MB. Figure 5 shows a picture of the experimental setup, and Figure 6 shows a diagram of how the system works. For performance comparison, the controller proposed by M. L. Corradini in the article "A Robust Sliding-Mode Based Data-Driven Model-Free Adaptive Controller" [37] was replicated.
The control algorithms are written in MATLAB 2013. Given the type of system, the control law needs to be separated into the direction of the motor and the pulse speed; the control law information is divided into the motor direction d_u(k) and the frequency u_f(k) sent to the driver. This means that if the control law equals u(k) = 5, the motor direction d_u(k) = 1 will move the motor to the right and the driver will be sent a pulse frequency of u_f(k) = 5 kHz. On the other hand, if the control law equals u(k) = −5, the motor direction d_u(k) = 0 will move the motor to the left and the driver will be sent a pulse frequency of u_f(k) = 5 kHz. The figures show the controllers' performance, and Table 1 shows the metric results for both controllers. As seen in both the figures and the table, the proposed controller has a larger tracking error at the beginning than the comparison controller but performs better in the final cycle. The proposed controller performs better in the initial and final cycles of the estimation. It can be seen that both controllers exhibit high-frequency disturbances in both the estimator and the system performance, and that the proposed controller settles more slowly.
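The splitting described above can be sketched as follows (assuming, per the two examples in the text, that a non-negative control law maps to direction bit 1 and that the frequency is the magnitude of u(k); the behavior at exactly u(k) = 0 is an assumption):

```python
def split_control_law(u):
    """Split the signed control law u(k) into the direction bit d_u(k)
    (1 = right, 0 = left) and the pulse frequency u_f(k) in kHz sent to
    the driver, matching the examples in the text."""
    d_u = 1 if u >= 0 else 0
    u_f = abs(u)
    return d_u, u_f
```

For instance, u(k) = 5 yields (1, 5), i.e., move right at 5 kHz, and u(k) = −5 yields (0, 5).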
Figure 12 shows the PPD estimation ĝ(k) as proposed, in comparison with φ̂_1(k) of the comparison controller. The PPD is usually approximated as ∂y(k + 1)/∂u(k) ≈ ∆y(k + 1)/∆u(k), with ∆y(k + 1) = y(k + 1) − y(k) and ∆u(k) = u(k) − u(k − 1). Figure 12 shows this approximation according to the system performance under each controller. It can be seen that the high frequency in both controllers causes the estimation to be scattered. Figure 13 shows an approximation of the PPD of the system with each controller, where it can be seen that the comparison controller produces more sign changes. In contrast, if we consider how the control direction and the PPD should behave with a smooth input to the system, as shown in Figures 14 and 15, the direction of the function ĝ(k) behaves as expected during the test, unlike the comparison controller.
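A guarded version of this finite-difference PPD approximation can be sketched as below; returning no estimate when ∆u(k) vanishes avoids the division the approximation forbids (the threshold eps is an illustrative assumption).

```python
def ppd_estimate(y_next, y_curr, u_curr, u_prev, eps=1e-9):
    """Finite-difference approximation of the pseudo-partial derivative:
    g(k) ~= (y(k+1) - y(k)) / (u(k) - u(k-1)), valid only when the input
    actually changed; returns None otherwise."""
    du = u_curr - u_prev
    if abs(du) < eps:
        return None
    return (y_next - y_curr) / du
```

The scatter seen in Figure 12 corresponds to this ratio being evaluated while high-frequency components dominate both ∆y and ∆u.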

Conclusions
Our proposed controller and estimator address the challenges posed by unknown non-affine discrete-time systems with varying control directions. They incorporate novel cost functions, which play a crucial role in adapting to changes in the control direction and ensuring system stability. Through rigorous analysis based on Lyapunov theory, we proved the convergence of these methods, which instills confidence in their effectiveness.
Experimental validation conducted on a force-feedback control system, characterized by its time-varying control direction, demonstrates the practical utility of the proposed estimator and controller. The results illustrate the smooth and adaptive nature of the system's response to changes in the control direction, highlighting the efficacy of the proposed methods in real-world scenarios.
While the proposed controller may initially exhibit a slower response compared to state-of-the-art alternatives, its adaptive nature ultimately enables remarkable performance. By continuously estimating and adjusting to the control direction, the system achieves impressive tracking accuracy and robustness over time. Other systems can implement the proposed estimator and controller by acquiring data for estimator training and spending some time training with the actual system. Additionally, it is worth noting that, owing to the MiFREN base of these methods, human knowledge can be transferred into the system through the initialization of the weight vectors, which will then be refined by the learning algorithm.

Figure 6. Cartesian robot setup diagram.

The error metrics presented include the sum square error (SSE), defined as SSE = Σ_k e²(k).
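For reference, the two error metrics reported (the SSE above and the mean absolute percentage error from the abstract) can be computed as below; the handling of zero reference values in MAPE is an assumption, since the paper's exact steady-state windowing is not specified.

```python
def sse(errors):
    """Sum square error: SSE = sum_k e(k)^2."""
    return sum(e * e for e in errors)

def mape(reference, output):
    """Mean absolute percentage error, skipping samples where r(k) = 0
    (an assumption to avoid division by zero)."""
    pairs = [(r, y) for r, y in zip(reference, output) if r != 0]
    return 100.0 * sum(abs((r - y) / r) for r, y in pairs) / len(pairs)
```

For example, a reference of 1 N tracked within ±0.01 N gives a MAPE of 1%, matching the sub-one-percent steady-state figure claimed in the abstract.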