These authors contributed equally to this work.

This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/3.0/).

The variable structure strategy is widely used for the control of sensor-actuator systems modeled by Euler-Lagrange equations. However, accurate knowledge of the model structure and model parameters is often required for the control design. In this paper, we consider model-free variable structure control of a class of sensor-actuator systems, where only the online input and output of the system are available while the mathematical model of the system is unknown. The problem is formulated from an optimal control perspective, and the implicit form of the control law is obtained analytically by using the principle of optimality. The control law and the optimal cost function are then solved iteratively. Simulations demonstrate the effectiveness and the efficiency of the proposed method.

With the development of mechatronics, automatic systems consisting of sensors for perception and actuators for action are increasingly used in applications [

Due to the importance of Euler-Lagrange equations in modeling many real sensor-actuator systems, much attention has been paid to the control of such systems. According to the type of constraints, Euler-Lagrange systems can be categorized into Euler-Lagrange systems without nonholonomic constraints (e.g., fully-actuated manipulator [

In this paper, we propose a self-learning control method applicable to Euler-Lagrange systems. In contrast to existing work on intelligent control of Euler-Lagrange systems, the stability of the closed-loop system with the proposed method is proven in theory. On the other hand, different from model-based design strategies, such as backstepping-based design [

The remainder of this paper is organized as follows: in Section 2, preliminaries on Euler-Lagrange systems and variable structure control are given briefly. In Section 3, the problem is formulated as a constrained optimization problem and the critic model and the action model are employed to approximate the optimal mappings. The control law is then derived in Section 4. In Section 5, simulations are given to show the effectiveness of the proposed method. The paper is concluded in Section 6.

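In this paper, we are concerned with the following sensor-actuator system in the Euler-Lagrange form,
$$D(q)\ddot{q} + C(q,\dot{q})\dot{q} + \phi(q) = u$$
where $q \in \mathbb{R}^{n}$, $D(q) \in \mathbb{R}^{n \times n}$, $C(q,\dot{q}) \in \mathbb{R}^{n \times n}$, $\phi(q) \in \mathbb{R}^{n}$ and $u \in \mathbb{R}^{n}$. Defining the state variables $x_{1} = q$ and $x_{2} = \dot{q}$, the system can be written as
$$\dot{x}_{1} = x_{2}, \qquad \dot{x}_{2} = -D^{-1}(x_{1})\left(C(x_{1},x_{2})x_{2} + \phi(x_{1})\right) + D^{-1}(x_{1})u$$
Note that $D(x_{1})$ is invertible as it is positive definite. The control objective is to asymptotically stabilize the Euler-Lagrange system, i.e., to design a mapping $(x_{1},x_{2}) \to u$ such that $x_{1} \to 0$ and $x_{2} \to 0$ when time elapses.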

As an effective design strategy, variable structure control finds applications in many different types of control systems, including the Euler-Lagrange system. The method stabilizes the dynamics of a nonlinear system by steering the state to an elaborately designed sliding surface, on which the state inherently evolves towards the zero state. Particularly for the Euler-Lagrange system considered here, we define $s = s(x_{1},x_{2})$ as follows:
$$s = c_{0}x_{1} + x_{2}$$
where $c_{0} > 0$ is a constant. On the sliding surface $s = c_{0}x_{1} + x_{2} = 0$, substituting $x_{2} = \dot{x}_{1}$ gives the dynamics of $x_{1}$ as $\dot{x}_{1} = -c_{0}x_{1}$; for $c_{0} > 0$, $x_{1}$ asymptotically converges to zero. Also we know $x_{2} = 0$ when $x_{1} = 0$ according to $c_{0}x_{1} + x_{2} = 0$. Therefore, we conclude that the states $x_{1}$, $x_{2}$ on the sliding surface $s = 0$ converge to zero. A positive definite function of $s$, such as $V = \|s\|^{2}$, is often used to design the control law. For stability consideration, the time derivative of
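The sliding-surface argument can be checked numerically. The sketch below is only an illustration under stated assumptions: it takes a scalar state, writes the surface as s = c0*x1 + x2 with a hypothetical gain c0 = 2, and integrates x1' = x2 with forward Euler while the state is constrained to s = 0. The function name, step size and initial condition are not taken from the paper.

```python
def simulate_on_surface(x1_0, c0, dt=0.01, steps=500):
    """Integrate x1' = x2 while the state is held on s = c0*x1 + x2 = 0.

    On the surface x2 = -c0*x1, so x1 obeys x1' = -c0*x1 and decays for c0 > 0.
    Returns the list of (x1, x2) pairs visited, including the initial one.
    """
    x1 = x1_0
    trajectory = [(x1, -c0 * x1)]
    for _ in range(steps):
        x2 = -c0 * x1        # constraint imposed by s = 0
        x1 = x1 + dt * x2    # forward Euler step of x1' = x2
        trajectory.append((x1, -c0 * x1))
    return trajectory


# Hypothetical values chosen only for illustration.
traj = simulate_on_surface(x1_0=1.0, c0=2.0)
x1_end, x2_end = traj[-1]
print(abs(x1_end) < 1e-3, abs(x2_end) < 1e-3)
```

Both states shrink together because x2 is tied to x1 on the surface, so the design effort goes into reaching the surface rather than into stabilizing each state separately.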

About the Euler-Lagrange

Without loss of generality, we stabilize the system (

In this paper, we set the origin as the desired operating point, _{0}_{1} + _{2}, which measures the distance from the desired sliding surface _{1},_{2}, _{n}^{T}_{i}_{i}>_{i}_{k},u_{k}+_{∞}) is the control sequence starting from the _{k}_{k}_{k}_{k}_{k}_{i}_{0} is the cost function for _{0} is a function of _{0}, _{1},…, _{∞}) and

In this section, we present the strategy to solve the constrained optimization problem efficiently without knowing the model information of the sensor-actuator system. We first investigate the optimality condition of

Denoting

subject to: (

According to the principle of optimality [_{k}

Define the Bellman operator

Then, the optimality condition in
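To make the role of the Bellman operator concrete, the following toy sketch repeatedly applies a discounted backup of the standard form J(x) = min over u of [U(x) + gamma * J(x')] until the cost settles. Everything specific here is an assumption for illustration: the five-point state grid, the three actions, the binary utility that is zero only at the origin, and the `step` function. In particular, `step` plays the part of a known model purely to keep the toy self-contained, whereas the method in this paper works from measured input-output data.

```python
GAMMA = 0.95                   # discount factor
STATES = [-2, -1, 0, 1, 2]     # hypothetical discretized states
ACTIONS = [-1, 0, 1]           # hypothetical control actions


def utility(x):
    """Hypothetical utility: no cost at the origin, unit cost elsewhere."""
    return 0.0 if x == 0 else 1.0


def step(x, u):
    """Toy transition, saturated at the ends of the grid."""
    return max(-2, min(2, x + u))


def bellman_backup(J):
    """One application of the Bellman operator to a cost table J."""
    return {
        x: min(utility(x) + GAMMA * J[step(x, u)] for u in ACTIONS)
        for x in STATES
    }


def greedy_action(J, x):
    """Action minimizing the one-step lookahead cost under J."""
    return min(ACTIONS, key=lambda u: utility(x) + GAMMA * J[step(x, u)])


J = {x: 0.0 for x in STATES}
for _ in range(50):
    J = bellman_backup(J)

print(J)
print([greedy_action(J, x) for x in STATES])
```

The cost table stops changing after a few sweeps, and the greedy action with respect to it moves every state toward the origin, which is the fixed-point behavior the iteration relies on.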

Note that the function _{k}

The control action is held constant in the duration between the

In the previous sections, the iteration (

Note that the optimal cost _{n}_{n}, W_{c}_{n}, W_{a}_{c}_{a}

In order to train the critic model with the desired input-output correspondence, we define the following error at time step

Note that _{c}_{c}

As to the action model, the optimal control

Then, similar to the update rule of _{c}_{a}_{a}

_{c}_{a}
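As a rough illustration of how a linearly parameterized critic can be trained from an error signal, the sketch below performs a gradient-style weight update driven by a temporal-difference error. It is not the update rule of this paper: the error definition e = U + gamma * J(x_next) - J(x), the feature vector [1, x, x^2], the step size 0.1 and all function names are assumptions made for the example.

```python
def features(x):
    """Hypothetical feature vector for a scalar state."""
    return [1.0, x, x * x]


def critic_value(w_c, x):
    """Linearly parameterized critic: J(x) is approximated by w_c . features(x)."""
    return sum(w * f for w, f in zip(w_c, features(x)))


def critic_update(w_c, x_k, x_next, utility_k, gamma=0.95, step_size=0.1):
    """One training step; returns the new weights and the error that drove it."""
    error = utility_k + gamma * critic_value(w_c, x_next) - critic_value(w_c, x_k)
    phi = features(x_k)
    new_w = [w + step_size * error * f for w, f in zip(w_c, phi)]
    return new_w, error


# Replay one hypothetical transition twice to see the error shrink.
w_c = [0.0, 0.0, 0.0]
w_c, err1 = critic_update(w_c, x_k=1.0, x_next=0.5, utility_k=1.0)
w_c, err2 = critic_update(w_c, x_k=1.0, x_next=0.5, utility_k=1.0)
print(err1, err2)
```

Only measured states and utilities enter the update, which is what allows this kind of training to proceed without a model of the plant.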

In this section, we consider the simulation implementation of the proposed control strategy. The dynamics given in

The cart-pole system, as sketched in

The cart-pole model used in this work is the same as that in [

^{2}, acceleration due to gravity;

_{c}

_{c}

_{p}

This system has four state variables:

Define
_{5} = _{c}_{6}(_{7}(

By choosing
_{1}, u_{2}]^{T}, u_{1} _{2} ∈ ℝ}.

In the simulation experiment, we set the discount factor γ = 0.95, the sliding surface parameter _{1} = 2, _{2} = 24. The feasible control action set Ω is defined as $\Omega = \{u = [u_{1},u_{2}]^{T},\ u_{1} \in \mathbb{R},\ u_{2} \in \mathbb{R},\ u_{1} = u_{2} = \pm 10\ \text{Newtons}\}$. This definition corresponds to the bang-bang control widely used in industry. To keep the output of the action model within the feasible set, the output of the action network is clamped to 10 if it is greater than or equal to zero and to −10 if it is less than zero. The sampling period τ is set to 0.02 seconds. Both the critic model and the action model are linearly parameterized. The step size of the critic model, which is _{c}_{a}_{c}_{a}
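The clamping rule for the action output can be written in a few lines. The sketch below follows the rule as stated in the text (10 newtons for a non-negative raw output, −10 newtons otherwise); the linear action model, its weights and the feature vectors are hypothetical and serve only to exercise the clamp.

```python
FORCE = 10.0   # bang-bang force magnitude in newtons, as stated in the text


def clamp_bang_bang(raw_output, force=FORCE):
    """Map a raw action-model output onto the two-valued feasible set."""
    return force if raw_output >= 0.0 else -force


def linear_action(w_a, phi):
    """Linearly parameterized action model: raw output = w_a . phi."""
    return sum(w * f for w, f in zip(w_a, phi))


# Hypothetical weights and feature vectors, chosen only to exercise the clamp.
w_a = [0.5, -1.0]
samples = [[1.0, 0.2], [1.0, 0.5], [1.0, 0.9]]
forces = [clamp_bang_bang(linear_action(w_a, phi)) for phi in samples]
print(forces)
```

A raw output of exactly zero maps to the positive force, matching the "greater than or equal to zero" condition, so the applied force is never zero.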

In this paper, self-learning variable structure control is considered for a class of sensor-actuator systems. The control problem is formulated from the optimal control perspective and solved via iterative methods. In contrast to existing methods, this method does not need prior knowledge of an accurate mathematical model. The critic model and the action model are introduced to make the method more practical. Simulations show that the control law obtained by the proposed method indeed achieves the control objective. Future work on this topic includes the theoretical proof of convergence and exploration of the performance limit of the proposed strategy. Also, the control of other mechanical systems modeled by Euler-Lagrange systems, such as manipulators

Shuai Li would like to share with the readers a poem by Rabindranath Tagore: “The traveler has to knock at every alien door to come to his own and one has to wander through all the outer worlds to reach the innermost shrine at the end.” The authors would like to acknowledge the support of the National Natural Science Foundation of China under Grant No. 61172165 and the Guangdong Science Foundation of China under Grant No. S2011010006116 and No. 10151802904000013.

The cart-pole system.

State profiles of the cart-pole system with the proposed control strategy.