Nonlinear Model Predictive Control of Single-Link Flexible-Joint Robot Using Recurrent Neural Network and Differential Evolution Optimization

Anlong Zhang; Zhiyun Lin; Bo Wang; Zhimin Han

doi:10.3390/electronics10192426

¹

Artificial Intelligence Institute, School of Automation, Hangzhou Dianzi University, Hangzhou 310018, China

²

Department of Electrical and Electronic Engineering, Southern University of Science and Technology, Shenzhen 518055, China

^*

Author to whom correspondence should be addressed.

Electronics 2021, 10(19), 2426; https://doi.org/10.3390/electronics10192426

This article belongs to the Special Issue Predictive and Learning Control in Engineering Applications

Abstract

A recurrent neural network (RNN) and differential evolution optimization (DEO) based nonlinear model predictive control (NMPC) technique is proposed for position control of a single-link flexible-joint (FJ) robot. First, a simple three-layer recurrent neural network with rectified linear units as an activation function (ReLU-RNN) is employed for approximating the system dynamic model. Then, using the RNN predictive model and model predictive control (MPC) scheme, an RNN and DEO based NMPC controller is designed, and the DEO algorithm is used to solve the controller. Finally, comparative numerical simulations demonstrate the efficiency and performance of the proposed approach. The merit of this method is that not only is the control precision satisfied, but also the overshoots and the residual vibration are well suppressed.

Keywords:

flexible-joint robot; nonlinear model predictive control; differential evolution; recurrent neural network

1. Introduction

The control of the flexible-joint (FJ) robot has been a major research topic in the field of control theory and engineering for several decades [1,2,3,4,5,6,7]. The FJ robot benefits from inbuilt compliance, which provides low output impedance, shock tolerance, and accurate force control [8]. Due to these benefits, the FJ robot has been widely used in applications where the robot interacts with environments or with humans, such as monopod hopping robots and exoskeletons [9]. However, the FJ robot is an under-actuated, strongly coupled nonlinear system [10]. The control of such a complex nonlinear system is a difficult task. Therefore, the goal of this study is to design a suitable controller for a single-link FJ robot, which can also be utilized for complicated nonlinear systems.

Model-free methods have been widely employed in the field of FJ robot control. The earliest influential control approach is the traditional proportional-derivative (PD) method with gravity compensation [11,12]. Different types of PD controllers have been proposed because of their simplicity and practicability [13,14,15]. To deal with the overshoots and residual vibration, a fuzzy proportional-integral-derivative (PID) controller was proposed to suppress the elastic torsional vibration [16], and a nonlinear state feedback controller was employed to suppress the residual vibration of the FJ robot [10]. Although these techniques have achieved good performance in FJ robot control, certain issues still require attention. For instance, the parameters of model-free controllers must be adjusted according to the requirements of system performance. The control performance is sensitive to the controller parameters, and these parameters are difficult to tune. Unlike the model-free methods, model predictive control (MPC), a primary model-based control method, handles performance requirements efficiently. For example, in [17,18], the MPC controller has been utilized for robot manipulator trajectory tracking. For high-precision position tracking of the robot arm, a data-driven MPC method has been proposed, which has shown performance improvement compared to the PID method [19]. The MPC controller has performed well in process control, automotive systems, and robotics due to its advantages of versatility, robustness, and safety guarantees [20,21,22,23,24]. However, the major challenge in the MPC method is obtaining an accurate system dynamic model. The model of the FJ robot system is prone to uncertain disturbances, inaccurate parameters, and unknown model functions (e.g., the friction model), making MPC implementation difficult.
The neural network (NN) approach has been widely used as a strong tool in dealing with uncertainties and unknown model functions. For instance, the NN method has been employed to approximate the friction model for implementing friction compensation [25,26], and it was applied to estimate the unknown model parameters and uncertainties for achieving adaptive control [27,28,29]. Nevertheless, to the best of our knowledge, few works use the NN method to approximate the FJ robot system dynamics.

Recently, research on merging MPC and NN techniques has increased [30], in which the NN methods are utilized to deal with the difficulty of modeling system dynamics. For example, in [31], a deep recurrent neural network (RNN) MPC architecture has been established to slice foods. In [32,33,34], deep NN was used to approximate soft robot dynamics for implementing MPC. Besides, NN has been utilized to approximate the MPC laws in [35,36,37,38,39]. In robot systems, the optimization problem of MPC remains a challenge due to the nonlinear dynamic model and other non-convex constraints [40]. In addition, robot systems often suffer from the deadlock problem, which has been well investigated in [41,42,43]. A suitable method for solving the nonlinear MPC (NMPC) is differential evolution optimization (DEO). DEO is a heuristic method proposed in [44], which is effective for solving numerical optimization problems. DEO has been designed as a stochastic parallel direct search method, and there are many studies on parallel DEO [45,46,47,48]. The DEO algorithm has the benefit of being a global optimization technique that is simple to understand and implement, is robust, and has few parameters to adjust. Due to its advantages, DEO has been extensively investigated [49] and successfully applied in diverse fields, including robot manipulator systems [50], mobile robots [51,52,53], autonomous cars [54], spectrum sensing systems [55], and permanent magnet synchronous motor systems [56]. Although integrating MPC and NN methods has produced good results in robot applications, few studies focus on the position control of the FJ robot. On the one hand, the FJ robot system dynamics are hard to obtain. On the other hand, the optimization process of NMPC is a nonlinear programming problem, which is tough to solve.

In this study, we present an RNN and DEO based NMPC approach for position control of a single-link FJ robot. The RNN is employed to approximate the system dynamics, and the DEO algorithm is applied to solve the NMPC controller. The key contributions of this research are summarized as follows:

First, an RNN and DEO based NMPC method is proposed for the position control of a single-link FJ robot. The merit of this method is that not only is the control precision satisfied, but also the overshoots and the residual vibration are well suppressed.
To overcome the difficulty of modeling, a simple three-layer RNN with leaky rectified linear units as an activation function (ReLU-RNN) is established to approximate the FJ robot dynamic model with satisfactory precision. Then, according to the RNN predictive model and MPC approach, an RNN and DEO based NMPC controller is designed, in which the DEO algorithm is applied to solve the controller.
Finally, to demonstrate the efficiency and performance of this technique, numerical simulations compare our method with the PD method and the differential dynamic programming (DDP) [57] MPC approach. The simulation results illustrate that the performance of this technique is superior to that of the PD and DDP MPC methods.

The remainder of this paper is organized as follows. In Section 2, the dynamic model of the single-link FJ robot, including the direct-current (DC) motor dynamics, is established. In Section 3, the controller design is presented. The numerical simulations are reported in Section 4. Finally, the conclusion is given in Section 5.

2. Single-Link FJ Robot System Model

In this section, we establish the single-link FJ robot dynamic model with the DC motor dynamics being considered. The single-link FJ robot system, which can rotate in a vertical plane, is shown in Figure 1.

Figure 1. The architecture of single-link FJ robot system.

The system comprises two parts, as shown in Figure 1. The left part is the motor side, which includes a motor drive board, a DC motor, and a gear reduction box. The right part is the link side, which is composed of a massless link and a load. The two sections are linked by an elastic element, which is modeled as a linear spring. The FJ robot rotates in a vertical plane with the assumption that the elastic element can only deform in the direction of joint rotation [4]. The driving torque provided by the DC motor is

τ_{m}

, and the gear reduction ratio is

1 : N

. The motor side torque is

τ_{2} = N τ_{m}

. The stiffness of the linear spring is K. The angular position of the motor side is

θ_{2}

, and the link side angular position is

θ_{1}

. When the joint rotates, the joint can deform in the direction of rotation, and the torque can be represented by

τ_{1} (φ) = K (θ_{2} - θ_{1})

, where

φ = θ_{2} - θ_{1}

denotes the deformation of a linear spring.

{\dot{θ}}_{1}

and

{\dot{θ}}_{2}

stand for the angular velocity of the link side and the motor side, respectively. Similarly,

{\ddot{θ}}_{1}

and

{\ddot{θ}}_{2}

symbolize the angular acceleration of the link side and the motor side, respectively. For the sake of simplicity, we presume the viscous damping on the link side and the motor side to be

B_{1} ({\dot{θ}}_{1}) = K_{f_{1}} {\dot{θ}}_{1}

and

B_{2} ({\dot{θ}}_{2}) = K_{f_{2}} {\dot{θ}}_{2}

, where

K_{f_{1}}

and

K_{f_{2}}

denote the damping coefficients of the link side and the motor side, respectively.

G (θ_{1}) = m g l \sin (θ_{1})

represents the gravity torque, where m is the mass of the load, g is the gravitational acceleration, and l is the length of the massless link. The moments of inertia of the link side and the motor side are

J_{1}

and

J_{2}

, respectively. Then, based on the Euler–Lagrangian equations, the system dynamics is formulated as (1) [58]

\begin{cases} J_{1} \ddot{θ}_{1} + G (θ_{1}) + B_{1} (\dot{θ}_{1}) = τ_{1} (φ), \\ J_{2} \ddot{θ}_{2} + τ_{1} (φ) + B_{2} (\dot{θ}_{2}) = τ_{2} . \end{cases}

(1)

Since the motor is employed to actuate the system, the motor dynamics are also included in the system dynamic model. The motor dynamics are described by (2)

\begin{cases} τ_{m} = K_{τ} i, \\ R i + L \dot{i} + K_{e} \dot{θ}_{m} = U_{V}, \end{cases}

(2)

where

K_{τ}

is the motor torque coefficient, i denotes the motor armature current, R represents the armature circuit resistance, L stands for the armature circuit inductance,

K_{e}

is the back electromotive force coefficient,

{\dot{θ}}_{m}

denotes the angular velocity of the motor rotor, and

U_{V}

symbolizes motor armature voltage.

The torque produced by the motor is transmitted to the motor side using a gear reduction box as shown in Figure 1. We suppose that there is no transmission loss. Then, based on Equation (2), we attain

\begin{cases} τ_{2} = N K_{τ} i, \\ \dot{θ}_{2} = \frac{1}{N} \dot{θ}_{m} . \end{cases}

(3)

According to the above analysis, combining (1)–(3), the system dynamics including the motor dynamics can be described as (4)

\begin{cases} J_{1} \ddot{θ}_{1} + m g l \sin (θ_{1}) + K_{f_{1}} \dot{θ}_{1} = K (θ_{2} - θ_{1}), \\ J_{2} \ddot{θ}_{2} + K (θ_{2} - θ_{1}) + K_{f_{2}} \dot{θ}_{2} = N K_{τ} i, \\ R i + L \dot{i} + N K_{e} \dot{θ}_{2} = U_{V} . \end{cases}

(4)

Let us define

x_{1} = θ_{1}

,

x_{2} = {\dot{θ}}_{1}

,

x_{3} = θ_{2}

,

x_{4} = {\dot{θ}}_{2}

,

x_{5} = i

,

u = U_{V}

. Then, the system dynamics can be formulated by the following state-space expressions (5) and (6)

\dot{X} (t) = \begin{bmatrix} 0 & 1 & 0 & 0 & 0 \\ - \frac{K}{J_{1}} & - \frac{K_{f_{1}}}{J_{1}} & \frac{K}{J_{1}} & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ \frac{K}{J_{2}} & 0 & - \frac{K}{J_{2}} & - \frac{K_{f_{2}}}{J_{2}} & \frac{N K_{τ}}{J_{2}} \\ 0 & 0 & 0 & - \frac{N K_{e}}{L} & - \frac{R}{L} \end{bmatrix} X (t) - \begin{bmatrix} 0 \\ \frac{m g l}{J_{1}} \sin x_{1} (t) \\ 0 \\ 0 \\ 0 \end{bmatrix} + \begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \\ \frac{1}{L} \end{bmatrix} u (t),

(5)

Y (t) = [1, 0, 0, 0, 0] X (t),

(6)

where

X (t) = {[x_{1} (t), x_{2} (t), x_{3} (t), x_{4} (t), x_{5} (t)]}^{T}

denotes the system state vector, and

Y (t)

symbolizes the system output.

u (t)

stands for the control input of the system.

This model omits several effects, such as an accurate friction model, gear backlash, and mechanical transmission efficiency. Besides, precise model parameters are difficult to obtain. Such a nonlinear system is difficult to control because its model is not accurately known. Thus, we present an RNN and DEO based NMPC method, which can be utilized for complicated nonlinear systems.
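As an illustration of how the state-space model (5) can be simulated to generate state-input data, the following minimal sketch evaluates the right-hand side of (5) and advances it with a fourth-order Runge-Kutta step. All numerical parameter values in `P`, the function names, and the choice of integrator are assumptions made for this example only; they are not taken from the paper.

```python
import numpy as np

# Hypothetical parameter values, chosen only for illustration (not from the paper).
P = dict(J1=0.05, J2=0.02, K=10.0, Kf1=0.1, Kf2=0.1, m=0.2, g=9.81, l=0.3,
         N=10.0, Kt=0.05, Ke=0.05, R=2.0, L=0.01)

def fj_dynamics(x, u, p=P):
    """Right-hand side of Eq. (5); x = [theta1, dtheta1, theta2, dtheta2, i]."""
    x1, x2, x3, x4, x5 = x
    dx1 = x2
    dx2 = (p["K"] * (x3 - x1) - p["Kf1"] * x2
           - p["m"] * p["g"] * p["l"] * np.sin(x1)) / p["J1"]
    dx3 = x4
    dx4 = (p["N"] * p["Kt"] * x5 - p["K"] * (x3 - x1) - p["Kf2"] * x4) / p["J2"]
    dx5 = (u - p["R"] * x5 - p["N"] * p["Ke"] * x4) / p["L"]
    return np.array([dx1, dx2, dx3, dx4, dx5])

def rk4_step(x, u, dt, p=P):
    """One Runge-Kutta (RK4) step with the input u held constant over dt."""
    k1 = fj_dynamics(x, u, p)
    k2 = fj_dynamics(x + 0.5 * dt * k1, u, p)
    k3 = fj_dynamics(x + 0.5 * dt * k2, u, p)
    k4 = fj_dynamics(x + dt * k3, u, p)
    return x + dt / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
```

Calling `rk4_step` repeatedly with a sampled input sequence yields discrete-time state-input pairs of the kind a data-driven predictive model could be trained on.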

3. Controller Design

3.1. Nonlinear Model Predictive Control

Based on our system, the fundamental scheme of NMPC is detailed in this subsection. We utilize the discrete-time nonlinear autoregressive exogenous dynamic model to represent the system state-space Equation (5), which is capable of predicting future states for long-time series. At time step k, the state

X (k + 1)

is predicted by (7)

X (k + 1) = f_{p} (X_{k}, U_{k}),

(7)

where

X_{k} = [X^{T} (k), X^{T} (k - 1), . . ., X^{T} (k - d_{x} + 1)]

represents system state time series from kth time step through

k - d_{x} + 1

th time step, correspondingly,

U_{k} = [u (k), u (k - 1), . . ., u (k - d_{u} + 1)]

depicts the control input time series.

d_{x}

and

d_{u}

stand for the length of time series of system state and control input, respectively.

f_{p} (\cdot)

signifies a nonlinear function.

For long-time series prediction, the predicted system state is transmitted into

X_{k}

recurrently, for example (8)

X (k + 2) = f_{p} (X_{k + 1}, U_{k + 1}) .

(8)

At

k + 1

th time step predicted system state

X (k + 1)

is transmitted into

X_{k}

, and system state time series is updated as

X_{k + 1} = [X^{T} (k + 1), X^{T} (k), . . ., X^{T} (k - d_{x} + 2)]

. Similarly, exogenous control input is transmitted into

U_{k}

and the control input time series is updated as

U_{k + 1} = [u (k + 1), u (k), . . ., u (k - d_{u} + 2)]

. This formula is not only useful for establishing NMPC controller but also convenient for approximating the model using NN.
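The recursive use of (7) and (8) can be sketched as a rollout over fixed-length history buffers. In this sketch, `f_p` is any callable standing in for the nonlinear function, and the function name `rollout` and the newest-first buffer layout are assumptions for illustration.

```python
import numpy as np
from collections import deque

def rollout(f_p, x_hist, u_hist, u_plan):
    """Multi-step prediction following Eqs. (7)-(8).

    x_hist: the d_x most recent states, newest first (X(k), X(k-1), ...).
    u_hist: the d_u most recent inputs up to u(k-1), newest first.
    u_plan: planned inputs u(k), u(k+1), ...
    """
    xs = deque(x_hist, maxlen=len(x_hist))
    us = deque(u_hist, maxlen=len(u_hist))
    preds = []
    for u in u_plan:
        us.appendleft(u)                  # U_k now starts with u(k); oldest input drops out
        x_next = f_p(list(xs), list(us))  # X(k+1) = f_p(X_k, U_k)
        xs.appendleft(x_next)             # feed the prediction back into X_{k+1}
        preds.append(x_next)
    return np.array(preds)
```

Because each prediction is pushed back into the state buffer, model errors can accumulate along the horizon, which is one reason the accuracy of the predictive model matters.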

Then, we consider the discrete-time nonlinear system (7) to express the MPC scheme. Equation (7) including constraints can be rewritten as (9)

\begin{aligned} \hat{X} (k + 1) &= f_{n} (X_{k}, U_{k}), \\ X_{k} &\in X, \quad k = 0, 1, \ldots, N, \\ U_{k} &\in U, \quad k = 0, 1, \ldots, N, \end{aligned}

(9)

where

X \subset ℜ^{5}

symbolizes system state vector constraints, and

U \subset ℜ

denotes control input constraints.

f_{n} (\cdot)

stands for the nonlinear function, which is approximated by ReLU-RNN. N is the prediction horizon.

A nonlinear MPC controller works by minimizing the performance criterion such as (10)

U^{🟉} (k) = \underset{[u_{0}^{🟉}, u_{1}^{🟉}, \ldots, u_{N - 1}^{🟉}]}{\arg\min} J (X (k), U (k)),

(10)

where

X (k) = [X_{k}, X_{k + 1}, . . ., X_{k + N - 1}]

and

U (k) = [u_{0}, u_{1}, . . ., u_{N - 1}]

symbolize the system state information and control input to be optimized, respectively. The cost function is denoted by

J (X (k), U (k))

.

U^{🟉} (k) = [u_{0}^{🟉}, u_{1}^{🟉}, . . ., u_{N - 1}^{🟉}]

signifies the optimized control input series. Each state-input pair satisfies Equation (9) with constraints. When the control input series are optimized, only the first term

u_{0}^{🟉}

is applied to the system until the next time step, and the system state measurements are updated at the next time step. Then, the optimization procedure is repeated at each time step, which runs as a closed-loop.

In the field of robot control, it is very important to realize accurate position control, speed control, and torque control. In practical applications, the accuracy of position control directly affects the performance of the robot. During position control, residual vibration is easily excited due to the existence of elastic elements [10]. Therefore, position control is the most important and most challenging task in FJ robot control. In this study, in order to achieve accurate position control, the position variable is selected as the control objective of the NMPC controller, and we design the quadratic cost function as (11)

J (X (k), U (k)) = α \sum_{j = 0}^{N - 1} {[{\hat{x}}_{1} (k + j + 1) - x_{1}^{ref} (k + j + 1)]}^{2} + β \sum_{j = 0}^{N - 1} {[u (k + j + 1) - u (k + j)]}^{2},

(11)

subject to the terminal constraint (12)

\hat{X} (k + N) = 0,

(12)

where

{\hat{x}}_{1} (k + j + 1)

denotes predicted position state and

x_{1}^{ref} (k + j + 1)

represents the reference trajectory.

u (k + j)

stands for the system control input at time step

k + j

. Due to the constraints of the control input in the real system, this term is introduced into the objective function as adjustment.

α

and

β

are the penalty coefficients of the position tracking error and the control input variation, respectively.
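A minimal sketch of evaluating the cost (11) for one candidate input sequence is given below. It assumes that the input increments are formed from the previously applied input `u_prev` followed by the N planned inputs; this indexing convention, the function name, and the default weights are illustrative assumptions rather than the authors' implementation.

```python
import numpy as np

def nmpc_cost(x1_pred, x1_ref, u_plan, u_prev, alpha=1.0, beta=0.1):
    """Quadratic cost in the spirit of Eq. (11).

    x1_pred, x1_ref: predicted and reference positions over the horizon (length N).
    u_plan: candidate inputs over the horizon (length N).
    u_prev: last input applied to the system (assumed starting point for increments).
    """
    x1_pred = np.asarray(x1_pred, dtype=float)
    x1_ref = np.asarray(x1_ref, dtype=float)
    u_all = np.concatenate(([u_prev], np.asarray(u_plan, dtype=float)))
    tracking = np.sum((x1_pred - x1_ref) ** 2)   # position tracking term
    smoothness = np.sum(np.diff(u_all) ** 2)     # input variation term
    return alpha * tracking + beta * smoothness
```

Raising `beta` relative to `alpha` penalizes abrupt input changes more strongly, at the expense of slower tracking.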

We conclude from the above analysis that this technique is flexible, as the cost function can be designed for various objectives. For example, velocity and torque control can be implemented by simply changing the cost function. However, implementing an NMPC controller is rather challenging, and there are two main problems in designing the controller. The first is how to create an accurate predictive model, and the second is how to solve the optimization problem successfully. We utilize a ReLU-RNN to approximate the system dynamic model in Section 3.2 to overcome the difficulty of nonlinear system modeling. The DEO algorithm, which will be described in Section 3.3, is used to optimize the control inputs.

3.2. Dynamics Model Approximation Using ReLU-RNN

We approximate the discrete-time dynamic model (7) by utilizing a simple three-layer ReLU-RNN, which is displayed in Figure 2.

Figure 2. The ReLU-RNN architecture used to approximate system dynamic model.

From Figure 2, the input of the hidden neuron

a_{i, k}

is computed by (13)

a_{i, k} = W_{i}^{x} X_{k} + W_{i}^{u} U_{k} + b_{i},

(13)

where

W_{i}^{x}

and

W_{i}^{u}

are the weight vectors of the system state and control input for ith hidden neuron.

b_{i}

signifies the bias of ith hidden neuron. The ith hidden neuron output

h_{i, k}

is evaluated by (14)

h_{i, k} = δ (a_{i, k}),

(14)

where

δ (\cdot)

represents the nonlinear activation function. We choose the leaky ReLU function as the nonlinear activation function because it is computationally efficient and helps prevent vanishing gradients. The leaky ReLU function is defined as (15)

δ (ϑ) = \begin{cases} ϑ, & \text{if } ϑ \geq 0, \\ 0.01 ϑ, & \text{if } ϑ < 0, \end{cases}

(15)

where

ϑ

indicates the input variable of leaky ReLU function. Then, the predicted output is expressed as (16)

{\hat{x}}_{j} (k + 1) = W_{j}^{o} h_{k} + b_{j}^{o}, j = 1, 2, . . ., 5,

(16)

where

W_{j}^{o}

symbolizes the weight vector of jth output neuron concerning hidden layer neurons, and

b_{j}^{o}

is the bias of jth output neuron.

h_{k} = [h_{1, k}, h_{2, k}, . . ., h_{q, k}]

, where q is the number of hidden neurons. Combining (13)–(16), ReLU-RNN is capable of estimating system dynamic model by (17)

\hat{X} (k + 1) = f_{n} (W, b, X_{k}, U_{k}),

(17)

where

W

and

b

represent the weights and bias of the NN, respectively.
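The forward pass (13)-(17) can be sketched in a few lines of NumPy. Here the per-neuron weight vectors for the state and input histories are stacked into a single hidden-layer matrix `W_h`, which is equivalent to (13). The dictionary layout, the function names, the random initialization, and the default layer sizes (30 inputs, 15 hidden neurons, 5 outputs, as reported later in this subsection) are assumptions for illustration.

```python
import numpy as np

def leaky_relu(a, slope=0.01):
    """Leaky ReLU of Eq. (15)."""
    return np.where(a >= 0, a, slope * a)

def init_params(n_in=30, n_hidden=15, n_out=5, seed=0):
    """Small random weights and zero biases (illustrative initialization)."""
    rng = np.random.default_rng(seed)
    return {"W_h": rng.normal(0.0, 0.1, (n_hidden, n_in)), "b_h": np.zeros(n_hidden),
            "W_o": rng.normal(0.0, 0.1, (n_out, n_hidden)), "b_o": np.zeros(n_out)}

def relu_rnn_predict(params, x_hist, u_hist):
    """One-step prediction of Eq. (17) from stacked state and input histories."""
    z = np.concatenate([np.ravel(x_hist), np.ravel(u_hist)])  # network input [X_k, U_k]
    h = leaky_relu(params["W_h"] @ z + params["b_h"])          # Eqs. (13)-(14)
    return params["W_o"] @ h + params["b_o"]                   # Eq. (16)
```

For multi-step prediction, the returned state estimate would be pushed back into the state history before the next call.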

Each batch of training data contains 1000 randomly selected state-input pairs throughout the training process. The state-input pairs are generated by simulating the discretized system model (5). We choose the mean squared error (MSE) as the loss function, which is defined as (18)

\begin{aligned} ℓ ({\hat{X}}_{k}, X_{k}) &= \frac{1}{N} \sum_{j = 1}^{N} \| \hat{X} (k + j) - X (k + j) \|^{2} \\ &= \frac{1}{N} \sum_{j = 1}^{N} \| f_{n} (W, b, X_{k + j - 1}, U_{k + j - 1}) - X (k + j) \|^{2}, \end{aligned}

(18)

where

{\hat{X}}_{k} = \{\hat{X} (k + 1), \hat{X} (k + 2), \ldots, \hat{X} (k + N)\}

symbolize the predicted values of system states. According to (18), the backpropagation method may be used to obtain the weight gradients

\frac{\partial ℓ (\hat{X}, X)}{\partial W}

and bias gradients

\frac{\partial ℓ (\hat{X}, X)}{\partial b}

. Then, we adopt the Adam algorithm [59], a gradient-based optimization method, to train the network. The learning rate is set to

1.0 \times 10^{- 5}

, and training on an Intel(R) Core(TM) i7-8550U CPU is completed after 50 min. The parameters of ReLU-RNN are set as follows.

d_{x}

and

d_{u}

are set to 5. The number of neurons in the hidden layer is 15, in the input layer is 30, and in the output layer is 5. The training findings are displayed in Section 4.
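The stated input-layer size is consistent with stacking the state and input histories, assuming that each of the d_x past states contributes its five components and each of the d_u past inputs contributes one scalar:

```latex
n_{\mathrm{in}} = 5\, d_{x} + d_{u} = 5 \times 5 + 5 = 30, \qquad
n_{\mathrm{hidden}} = 15, \qquad
n_{\mathrm{out}} = 5 .
```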

3.3. RNN and DEO Based NMPC Controller

In this subsection, we first introduce the DEO algorithm, which is used to solve the NMPC optimization problem. Then, the RNN and DEO based NMPC controller is designed in detail.

The standard DEO is commonly denoted as DE/rand/1/bin [44]. A randomly initialized population

P

consists of

N_{P}

individuals. Each individual is an N-dimensional vector, where N corresponds to the prediction horizon of NMPC, and is represented by

U_{i} = [u_{i, 1}, u_{i, 2}, . . ., u_{i, N}]

. The

U_{i}

corresponds to the control input

U_{k + N}

that will be optimized. The evolutionary generation time in DEO is expressed by

G = 0, 1, 2, . . ., G_{m}

, where

G_{m}

signifies the maximum number of generations. At the Gth generation, the ith individual of the population is designated as

U_{i}^{G} = [u_{i, 1}^{G}, u_{i, 2}^{G}, . . ., u_{i, N}^{G}]

with each element of

U_{i}^{G}

constrained to

[u_{L}, u_{U}]

.

u_{L}

and

u_{U}

are the lower bound and upper bound of the control input, respectively. The population varies during the evolution process;

P^{G}

stands for the Gth generation population, and the initial population

P^{0}

is randomly generated with the boundary constraint

[u_{L}, u_{U}]

. The basic DEO algorithm operation procedure contains initialization, mutation, crossover, and selection, which are detailed as follows.

Initialization: To establish the initial point of the optimization search, the population needs to be initialized. Generally, one way to build an initial population is to randomly select from the values within a given boundary constraint. It is a common assumption that all populations with random initialization conform to a uniform probability distribution. Typically, each jth element of the ith individual in the

P^{0}

is initialized by (19)

u_{i, j}^{0} = u_{L} + \mathrm{rand} (0, 1) \cdot (u_{U} - u_{L}), \quad (i = 1, 2, \ldots, N_{P}, \; j = 1, 2, \ldots, N),

(19)

where

\mathrm{rand} (0, 1)

denotes a uniformly distributed random number in

[0, 1]

.
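The initialization rule (19) maps directly onto a vectorized NumPy expression. The function name and argument names below are illustrative; note that NumPy draws from the half-open interval [0, 1), so the upper bound itself is never sampled.

```python
import numpy as np

def init_population(n_pop, horizon, u_low, u_high, rng):
    """Uniform random initial population, one row per individual (Eq. (19))."""
    return u_low + rng.random((n_pop, horizon)) * (u_high - u_low)
```

A typical call would be `init_population(20, 10, -1.0, 1.0, np.random.default_rng(0))`, where the population size, horizon, and bounds are placeholder values.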

Mutation: For each individual vector

U_{i}^{G}

, a mutant vector

V_{i}^{G} = [ν_{i, 1}^{G}, ν_{i, 2}^{G}, . . ., ν_{i, N}^{G}]

at generation G is generated by (20)

V_{i}^{G} = U_{r_{1}}^{G} + F \cdot (U_{r_{2}}^{G} - U_{r_{3}}^{G}), \quad r_{1} \neq r_{2} \neq r_{3} \neq i, \quad (i = 1, 2, \ldots, N_{P}),

(20)

where

r_{1}, r_{2}, r_{3} \in \{1, 2, 3, \ldots, N_{P}\}

represent randomly chosen indices.

F \in [0, 2]

is the scaling factor of the difference vector

(U_{r_{2}}^{G} - U_{r_{3}}^{G})

. If the element

ν_{i, j}^{G}

of the mutant individual violates the boundary of the feasible search space, a simple remedy is to replace the element with a new one generated by Equation (19). Another method is boundary absorption, which is described as (21)

ν_{i, j}^{G} = \begin{cases} u_{L}, & \text{if } ν_{i, j}^{G} < u_{L}, \\ u_{U}, & \text{if } ν_{i, j}^{G} > u_{U}, \end{cases} \quad (i = 1, 2, \ldots, N_{P}, \; j = 1, 2, \ldots, N),

(21)

The mainstream mutation strategies are described as follows (22)–(27)

(1): DE/rand/1/bin

$V_{i}^{G} = U_{r_{1}}^{G} + F \cdot (U_{r_{2}}^{G} - U_{r_{3}}^{G}), r_{1} \neq r_{2} \neq r_{3} \neq i,$

(22)

(2): DE/rand/2/bin

$V_{i}^{G} = U_{r_{1}}^{G} + F \cdot (U_{r_{2}}^{G} - U_{r_{3}}^{G}) + F \cdot (U_{r_{4}}^{G} - U_{r_{5}}^{G}), r_{1} \neq r_{2} \neq r_{3} \neq r_{4} \neq r_{5} \neq i,$

(23)

(3): DE/best/1/bin

$V_{i}^{G} = U_{best}^{G} + F \cdot (U_{r_{1}}^{G} - U_{r_{2}}^{G}), r_{1} \neq r_{2} \neq i,$

(24)

(4): DE/best/2/bin

$V_{i}^{G} = U_{best}^{G} + F \cdot (U_{r_{1}}^{G} - U_{r_{2}}^{G}) + F \cdot (U_{r_{3}}^{G} - U_{r_{4}}^{G}), r_{1} \neq r_{2} \neq r_{3} \neq r_{4} \neq i,$

(25)

(5): DE/current-to-best/1/bin

$V_{i}^{G} = U_{i}^{G} + F \cdot (U_{best}^{G} - U_{r_{1}}^{G}) + F \cdot (U_{r_{2}}^{G} - U_{r_{3}}^{G}), r_{1} \neq r_{2} \neq r_{3} \neq i,$

(26)

(6): DE/rand-to-best/1/bin

$V_{i}^{G} = U_{r}^{G} + F \cdot (U_{best}^{G} - U_{r_{1}}^{G}) + F \cdot (U_{r_{2}}^{G} - U_{r_{3}}^{G}), r_{1} \neq r_{2} \neq r_{3} \neq i,$

(27)

where

r_{1}, r_{2}, r_{3}, r_{4}, r_{5} \in \{1, 2, \ldots, N_{P}\}

are randomly chosen individual indices.

U_{best}^{G}

denotes the individual with the best fitness in the Gth generation.
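Two of the mutation strategies, DE/rand/1 (22) and DE/best/2 (25), together with boundary absorption (21), can be sketched as follows. The chained inequalities on the indices are interpreted here as "mutually distinct and different from i", which is an assumption; the helper names are also illustrative.

```python
import numpy as np

def pick_distinct(n_pop, exclude, count, rng):
    """Draw `count` mutually distinct indices in 0..n_pop-1, none equal to `exclude`."""
    candidates = np.delete(np.arange(n_pop), exclude)
    return rng.choice(candidates, size=count, replace=False)

def mutate_rand_1(pop, i, F, rng):
    """DE/rand/1 mutant vector, Eq. (22)."""
    r1, r2, r3 = pick_distinct(len(pop), i, 3, rng)
    return pop[r1] + F * (pop[r2] - pop[r3])

def mutate_best_2(pop, best, i, F, rng):
    """DE/best/2 mutant vector, Eq. (25); `best` is the current best individual."""
    r1, r2, r3, r4 = pick_distinct(len(pop), i, 4, rng)
    return best + F * (pop[r1] - pop[r2]) + F * (pop[r3] - pop[r4])

def absorb(v, u_low, u_high):
    """Boundary absorption, Eq. (21): clamp each element to [u_low, u_high]."""
    return np.clip(v, u_low, u_high)
```

DE/best/2 needs at least five individuals in the population so that four distinct indices other than i can be drawn.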

Crossover: To maintain the diversity of the population, the crossover operation is introduced. Binomial crossover strategy is most frequently utilized, which is expressed as (28)

z_{i, j}^{G} = \begin{cases} ν_{i, j}^{G}, & \text{if } \mathrm{rand} (i) \leq C_{r} \text{ or } j = \mathrm{randint} (j), \\ u_{i, j}^{G}, & \text{otherwise}, \end{cases}

(28)

where

Z_{i}^{G} = [z_{i, 1}, z_{i, 2}, . . ., z_{i, N}]

stands for the trial vector.

C_{r} \in [0, 1]

is the crossover rate, which determines how many elements are inherited from the mutant vector.

\mathrm{randint} (j)

is a randomly generated integer in

[1, N]

, which is used to make sure that at least one element of the trial vector is inherited from the mutant vector.
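A minimal sketch of the binomial crossover (28) is shown below. The function name is illustrative, and indices are zero-based as usual in Python, so the forced index is drawn from 0 to N-1 rather than 1 to N.

```python
import numpy as np

def binomial_crossover(u, v, cr, rng):
    """Trial vector of Eq. (28): take v[j] with probability cr, else keep u[j]."""
    n = len(u)
    mask = rng.random(n) <= cr
    mask[rng.integers(n)] = True   # force at least one element from the mutant
    return np.where(mask, v, u)
```

With `cr` close to 1 the trial vector is almost entirely the mutant; with `cr` close to 0 it differs from the parent in little more than the forced element.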

Selection: In the selection procedure, the fitness function of the DEO algorithm is designed according to the control objective. Since our goal is to execute position control of a single-link FJ robot, the cost function detailed in Equation (11) is designed as the fitness function (29) of the DEO algorithm.

f (U_{i}) = α \sum_{j = 0}^{N - 1} {[{\hat{x}}_{1} (k + j + 1) - x_{1}^{ref} (k + j + 1)]}^{2} + β \sum_{j = 0}^{N - 1} {(u_{i, j + 1} - u_{i, j})}^{2} .

(29)

The best individual in the current population is selected by calculating the fitness function. The selection method is represented by (30)

U_{i}^{G + 1} = \begin{cases} Z_{i}^{G}, & \text{if } f (Z_{i}^{G}) < f (U_{i}^{G}), \\ U_{i}^{G}, & \text{otherwise}, \end{cases}

(30)

In this paper, we adopt DE/best/2/bin (25) as the mutation strategy of the DEO algorithm. An adaptive mutation factor is applied to scale the difference vectors, which is described as (31)

F = F_{0} \times 2^{λ}, λ = e^{1 - G_{m} / (G_{m} + 1 - G)},

(31)

where

F_{0}

signifies the initial mutation factor. At the beginning of the evolution, the adaptive mutation factor is approximately

2 F_{0}

, which is a relatively large value. The diversity of individuals is thus maintained, which helps avoid premature convergence. In the later evolution period, the mutation factor is close to

F_{0}

, so better individuals are retained and damage to the optimal solution is avoided. In this way, the probability of finding the global optimal solution is enhanced. The flow chart of the DEO algorithm is shown in Figure 3.

Figure 3. The flow chart of DEO algorithm.
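The behaviour of the adaptive mutation factor (31) can be checked numerically with a short sketch. The function name and the sample values of the initial factor and maximum generation are assumptions for illustration.

```python
import math

def adaptive_F(F0, G, G_max):
    """Adaptive mutation factor of Eq. (31)."""
    lam = math.exp(1.0 - G_max / (G_max + 1.0 - G))
    return F0 * 2.0 ** lam

# Illustrative values: F0 = 0.5 and 100 generations.
start = adaptive_F(0.5, 0, 100)     # slightly above 2 * F0
middle = adaptive_F(0.5, 50, 100)   # between F0 and 2 * F0
end = adaptive_F(0.5, 100, 100)     # essentially F0
```

The factor decays monotonically from roughly twice the initial value to the initial value itself, so exploration dominates early generations and refinement dominates late ones.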

The RNN and DEO based NMPC controller architecture is illustrated in Figure 4. First, we employ the ReLU-RNN described in Section 3.2 to approximate the system dynamic model. Then, the ReLU-RNN predictive model is used to predict the forward dynamics of the system and is integrated into the NMPC architecture to design the NMPC controller. Finally, the DEO algorithm is utilized to optimize the control inputs. Only the first term

u_{0}^{🟉}

is applied to the system, and the whole procedure runs as a closed-loop.

Figure 4. The architecture of RNN and DEO based NMPC controller.

Based on the sampled system state information and the control inputs that will be optimized, the predicted position states

{\hat{x}}_{1} (k + j + 1)

will be obtained via the ReLU-RNN predictive model (17). Then, the fitness function (29) is computed, and the system control inputs can be optimized via the DEO algorithm. The process of optimization via DEO algorithm is detailed in Algorithm 1.

Algorithm 1 The optimization process of DEO.

Input:

Individual dimension: N
Maximum evolution generation: $G_{m}$
Individual lower bound: $u_{L}$
Individual upper bound: $u_{U}$

Output:

The best fitness $f (U_{best}^{G_{m}})$ and the best individual $U_{best}^{G_{m}}$

Initialize parameters: $C_{r}$ , $F_{0}$ , and $N_{P}$
Randomly initialize the population: $U^{0} = [U_{1}^{0}, U_{2}^{0}, \ldots, U_{N_{P}}^{0}]$
for $G = 0$ to $G_{m}$ do
    Evaluate $f (U_{i}^{G})$ , $i = 1, 2, \ldots, N_{P}$ , and select the individual with the best fitness as $U_{best}^{G}$
    Evaluate the adaptive mutation factor: $F = F_{0} \times 2^{λ}$ , $λ = e^{1 - G_{m} / (G_{m} + 1 - G)}$
    for $i = 1$ to $N_{P}$ do
        Randomly generate $r_{1}, r_{2}, r_{3}, r_{4}$ with $r_{1} \neq r_{2} \neq r_{3} \neq r_{4} \neq i$
        $V_{i}^{G} = U_{best}^{G} + F \cdot (U_{r_{1}}^{G} - U_{r_{2}}^{G}) + F \cdot (U_{r_{3}}^{G} - U_{r_{4}}^{G})$
        Randomly generate $\mathrm{randint} (j) \in \{1, 2, \ldots, N\}$
        for $j = 1$ to N do
            if $\mathrm{rand} (0, 1) \leq C_{r}$ or $j = \mathrm{randint} (j)$ then
                $z_{i, j}^{G} \leftarrow ν_{i, j}^{G}$
            else
                $z_{i, j}^{G} \leftarrow u_{i, j}^{G}$
            end if
            if $z_{i, j}^{G} \leq u_{L}$ then
                $z_{i, j}^{G} \leftarrow u_{L}$
            end if
            if $z_{i, j}^{G} \geq u_{U}$ then
                $z_{i, j}^{G} \leftarrow u_{U}$
            end if
        end for
        if $f (Z_{i}^{G}) \leq f (U_{i}^{G})$ then
            $U_{i}^{G + 1} \leftarrow Z_{i}^{G}$
        else
            $U_{i}^{G + 1} \leftarrow U_{i}^{G}$
        end if
    end for
end for
return the best fitness $f (U_{best}^{G_{m}})$ and the best individual $U_{best}^{G_{m}}$ .
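The steps of Algorithm 1 can be sketched as a compact, self-contained Python function. This is an illustrative implementation under stated assumptions, not the authors' code: the function name, the default values of the population size, crossover rate, and initial mutation factor, the synchronous population update, and the quadratic test function in the usage note are all hypothetical choices.

```python
import math
import numpy as np

def deo_minimize(fitness, dim, u_low, u_high,
                 n_pop=20, g_max=100, cr=0.9, f0=0.5, seed=0):
    """DE/best/2/bin with the adaptive mutation factor of Eq. (31)."""
    rng = np.random.default_rng(seed)
    pop = u_low + rng.random((n_pop, dim)) * (u_high - u_low)       # Eq. (19)
    cost = np.array([fitness(ind) for ind in pop])
    for g in range(g_max + 1):
        best = pop[np.argmin(cost)].copy()
        F = f0 * 2.0 ** math.exp(1.0 - g_max / (g_max + 1.0 - g))   # Eq. (31)
        new_pop, new_cost = pop.copy(), cost.copy()
        for i in range(n_pop):
            others = np.delete(np.arange(n_pop), i)
            r1, r2, r3, r4 = rng.choice(others, size=4, replace=False)
            v = best + F * (pop[r1] - pop[r2]) + F * (pop[r3] - pop[r4])  # Eq. (25)
            mask = rng.random(dim) <= cr                            # Eq. (28)
            mask[rng.integers(dim)] = True                          # forced element
            z = np.clip(np.where(mask, v, pop[i]), u_low, u_high)   # bounds
            fz = fitness(z)
            if fz <= cost[i]:                                       # greedy selection
                new_pop[i], new_cost[i] = z, fz
        pop, cost = new_pop, new_cost
    k = int(np.argmin(cost))
    return pop[k], float(cost[k])
```

For example, `deo_minimize(lambda u: float(np.sum((u - 0.3) ** 2)), dim=3, u_low=-1.0, u_high=1.0)` searches for the minimizer of a simple quadratic within the box constraints. In the NMPC setting, `fitness` would instead evaluate (29) using the predictive model.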

A corresponding summary of the RNN and DEO based NMPC scheme can be presented as follows:

Step 1: Obtain the current system states, together with the saved history of system states and control inputs, from the single-link flexible-joint robot system.
Step 2: Based on the system state information, use the ReLU-RNN predictive model to predict the future position states over N time steps.
Step 3: According to the predicted system state information and the designed cost function, use the DEO (Algorithm 1) to solve the NMPC controller.
Step 4: Apply the first term ( $u_{0}^{🟉}$ ) of the optimized control inputs to the system until the next time step.
Step 5: Advance the time step by one ( $k = k + 1$ ), update the saved history of system state information, and return to Step 1.
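The receding-horizon loop in Steps 1 to 5 can be sketched with generic callables. To keep the example short and self-contained, it uses toy stand-ins: a scalar integrator plant instead of the FJ robot, the same integrator as the predictive model instead of the ReLU-RNN, and a coarse grid search over constant inputs instead of DEO. All names and numerical values are assumptions for illustration.

```python
import numpy as np

def receding_horizon(plant_step, solve_ocp, x0, x_ref, n_steps):
    """Closed-loop MPC: optimise, apply the first input, shift, repeat."""
    x, u_prev, history = x0, 0.0, []
    for _ in range(n_steps):
        u_plan = solve_ocp(x, u_prev, x_ref)   # Steps 1-3: measure, predict, optimise
        u0 = float(u_plan[0])                  # Step 4: apply only the first input
        x = plant_step(x, u0)
        u_prev = u0                            # Step 5: shift and repeat
        history.append((x, u0))
    return history

def toy_plant(x, u):
    """Toy scalar integrator (stand-in for the robot)."""
    return x + 0.1 * u

def toy_solver(x, u_prev, x_ref, horizon=5):
    """Grid search over constant inputs in [-1, 1] (stand-in for DEO)."""
    best_u, best_cost = 0.0, np.inf
    for u in np.linspace(-1.0, 1.0, 21):
        preds = x + 0.1 * u * np.arange(1, horizon + 1)  # model rollout
        cost = np.sum((preds - x_ref) ** 2)
        if cost < best_cost:
            best_u, best_cost = u, cost
    return np.full(horizon, best_u)

history = receding_horizon(toy_plant, toy_solver, x0=0.0, x_ref=0.5, n_steps=30)
```

Replacing `toy_solver` with a DEO-based solver and `toy_plant` with the robot (or its simulation) preserves the loop structure unchanged.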

3.4. Control Stability Analysis

The NMPC is obtained for the plant (9) by minimizing the cost function (11) subject to the terminal constraint (12). Clearly,

J (X (k), U (k)) \geq 0

and

J (X (k), U (k)) = 0

only if

U (k) = 0

, and

J (X (k), U (k))

is decrescent. We assume that

X (k) = 0

and

U (k) = 0

is an equilibrium condition for the plant:

0 = f (0, 0)

. The MPC control law is

U^{🟉} (k) = [u_{0}^{🟉}, u_{1}^{🟉}, . . ., u_{N - 1}^{🟉}]

. Thus, the equilibrium point

X (k) = 0

and

U (k) = 0

is stable, providing that the optimization problem is feasible and is solved at each time step [23,60,61].

We define $J^{*}(X(k), U^{*}(k))$ as the optimal value of $J(X(k), U^{*}(k))$, which corresponds to the optimal control input $U^{*}(k)$. It is clear that $J^{*}(X(k+1), U^{*}(k+1)) \geq 0$, and $J^{*}(X(k), U^{*}(k)) = 0$ only if $U^{*}(k) = 0$. We will show that $J^{*}(X(k+1), U^{*}(k+1)) \leq J^{*}(X(k), U^{*}(k))$, and hence that $J^{*}(X(k), U^{*}(k))$ is a Lyapunov function for the closed-loop system.

As usual in stability proofs, we assume that the ReLU-RNN predictive model is perfect, so that the predicted and real state trajectories coincide: $X(k+j) = \hat{X}(k+j)$ if $u(k+j) = u^{*}(k+j)$.

Let us define

$$J(X(k), U(k)) = \min_{U} \sum_{j=0}^{N-1} G(X(k), U(k)) \tag{32}$$

where

$$G(X(k), U(k)) = \alpha \left[x_{1}(k+j+1) - x_{1}^{ref}(k+j+1)\right]^{2} + \beta \left[u(k+j+1) - u(k+j)\right]^{2} \tag{33}$$

With this assumption, we have

$$\begin{aligned} J^{*}(X(k+1), U(k+1)) &= \min_{U} \sum_{j=1}^{N} G(X(k+j+1), U(k+j)) \\ &= \min_{U} \left\{ \sum_{j=1}^{N} G(X(k+j), U(k+j-1)) - G(X(k+1), U(k)) + G(X(k+N+1), U(k+N)) \right\} \\ &\leq -G(X(k+1), U(k)) + J^{*}(X(k), U(k)) + \min_{U} \left\{ G(X(k+N+1), U(k+N)) \right\}. \end{aligned} \tag{34}$$

Since the terminal constraint is assumed to be satisfied and the optimization problem is assumed to be feasible, we can choose $U(k+N) = 0$ and remain at $X(k) = 0$, which gives

$$\min_{U} \left\{ G(X(k+N+1), U(k+N)) \right\} = 0. \tag{35}$$

Since $G(X(k), U(k)) \geq 0$, we can conclude that $J^{*}(X(k+1), U(k+1)) \leq J^{*}(X(k), U(k))$. Thus, $J^{*}(X(k), U(k))$ is a Lyapunov function, and we conclude by Lyapunov’s theorem that the equilibrium $X(k) = 0$, $U(k) = 0$ is stable.
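To make the quantities in (32) and (33) concrete, the following sketch evaluates the stage cost and its sum over the horizon for given predicted positions and inputs. The function names and the weight values are hypothetical. Because (33) involves $u(k+j+1)$ for $j = N-1$, the sketch assumes the caller supplies $N+1$ inputs; how the last difference is handled in the original implementation is not stated in this section.

```python
def stage_cost(x1_next, x1_ref_next, u_next, u_now, alpha, beta):
    """Stage cost (33): weighted tracking error plus weighted input change."""
    return alpha * (x1_next - x1_ref_next) ** 2 + beta * (u_next - u_now) ** 2

def horizon_cost(x1_pred, x1_ref, u_seq, alpha=1.0, beta=0.1):
    """Sum of stage costs over j = 0..N-1 for one candidate input sequence.

    x1_pred[j] and x1_ref[j] stand for x1(k+j+1) and its reference;
    u_seq[j] stands for u(k+j), so u_seq must hold N + 1 values (assumption).
    """
    n = len(x1_pred)
    if len(x1_ref) != n or len(u_seq) != n + 1:
        raise ValueError("expected N predictions, N references, N + 1 inputs")
    return sum(stage_cost(x1_pred[j], x1_ref[j], u_seq[j + 1], u_seq[j],
                          alpha, beta)
               for j in range(n))

# Two-step example with placeholder weights alpha = 2.0 and beta = 0.5.
example = horizon_cost([1.0, 2.0], [1.0, 1.0], [0.0, 1.0, 1.0],
                       alpha=2.0, beta=0.5)
```

In the example, the first step contributes only an input-change penalty and the second only a tracking penalty, so each term of (33) can be checked by hand.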

4. Numerical Simulations

In this section, numerical simulations created in MATLAB 2019b demonstrate the effectiveness and performance of the proposed method. To verify its superiority, the conventional PD and DDP MPC methods were used for comparison. DDP MPC is an NMPC technique that uses DDP to solve the MPC controller. To achieve a fair comparison, the predictive model and cost function used in DDP MPC were the same as those of the RNN and DEO based NMPC method, and the parameters of the controllers were carefully tuned.

The model approximated by the ReLU-RNN, described in Section 3.2, was used to design the NMPC controller. The discretized system dynamic model (5) was simulated in MATLAB and regarded as the real system platform. Table 1 lists the parameters of the simulated model.

Table 1. The simulation model parameters of single-link FJ robot.

The MSE (18) was employed to evaluate the performance of the approximated model. Table 2 shows the model approximation results. The MSE of every state was below $10^{-6}$, which indicates that the learned model predicts the system states accurately.
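As an illustration of how a per-state error like the one reported in Table 2 can be computed, the sketch below assumes that (18) is the usual mean of squared prediction errors, taken separately for each state. The function name and the sample data are hypothetical.

```python
def per_state_mse(predicted, actual):
    """Mean squared error per state component.

    predicted and actual are lists of samples; each sample is a list with
    one value per state (for example x1..x5).
    """
    n = len(predicted)
    if n == 0 or n != len(actual):
        raise ValueError("need the same non-zero number of samples")
    dims = len(predicted[0])
    return [sum((p[d] - a[d]) ** 2 for p, a in zip(predicted, actual)) / n
            for d in range(dims)]

# Two samples of a two-state system (made-up numbers).
mse = per_state_mse([[1.0, 2.0], [3.0, 4.0]],
                    [[1.0, 0.0], [1.0, 4.0]])
```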

Table 2. The MSE of the ReLU-RNN predictive model.

Figure 5 displays the multi-step prediction of the ReLU-RNN predictive model, and Figure 6 shows the corresponding absolute errors. The figures show that, even when predicting 20 time steps ahead, the ReLU-RNN predictive model performed satisfactorily and could be used to establish an NMPC controller.

Figure 5. The progress of multi-step prediction.

Figure 6. The absolute errors of multi-step prediction.

In the simulations, the time step for the proposed method and DDP MPC was 20 ms, and the prediction horizon was five time steps. The parameters of the DEO algorithm were $F_{0} = 0.5$, $CR = 0.5$, $NP = 30$, and $G_{m} = 200$. The control inputs were constrained to $[-24\ \mathrm{V}, 24\ \mathrm{V}]$.

The experimental results reported in [44,62] have shown that the DEO has good convergence properties. To demonstrate the convergence of the DEO, the cost values of the optimization process are plotted in Figure 7 and Figure 8. Figure 7 displays the cost values over the evolutionary iterations at five adjacent time steps. It can be seen that the cost value converged to a fixed value after 80 iterations. Figure 8 shows the optimized cost values during the target tracking process. The cost values converged to a small value as the time step increased, which indicates that the DEO could solve the proposed controller effectively.

Figure 7. The cost values of the optimization process at five adjacent time steps.

Figure 8. The cost values of the optimization process at each time step.

Figure 9 and Figure 10 depict the tracking performance and control actions of the different controllers, while Table 3 lists their state errors. The results illustrate that all of the controllers could control the system effectively.

Figure 9. The target tracking process of different controllers.

Figure 10. The control actions of different controllers.

Table 3. Comparison of the state error of different controllers.

Figure 9 indicates that there were some overshoots and residual vibration in the system response when controlled by the PD and DDP MPC methods. This is due to the elastic element in the FJ robot, which makes overshoots and residual vibration easy to excite. Nevertheless, Figure 9 shows that the proposed controller was able to reduce the overshoots and suppress the residual vibration.

Table 3 shows that the state error of our controller (0.0037) was smaller than that of the PD controller (0.0049). The DDP MPC controller achieved higher precision (0.0016) than our controller, but a closer look at the tracking process in Figure 9 shows that the response of our controller was smooth, with few overshoots and well-suppressed vibration. Figure 10 depicts the control actions. The control signal of the DDP MPC controller fluctuated greatly, the PD controller presented smaller fluctuations, and the proposed controller had the smallest fluctuations. Fluctuations in the control signal have a great influence on the system, potentially reducing the service life of the robot and even leading to mechanical damage. For this reason, control signal fluctuations can, to some extent, matter more than control precision, which indicates that our strategy is more suitable for FJ robot control.

A closed-loop system is important in the presence of external disturbances. To verify that the proposed controller is robust to external disturbances, we added external disturbances to the system. Figure 11 and Figure 12 show the system responses with external disturbances. As can be seen from Figure 11, the system responded quickly and remained stable, and the control performance was satisfactory. Figure 12 depicts the control actions, which demonstrate that the proposed controller could be solved efficiently by the DEO and achieved good robustness against external disturbances.

Figure 11. The target tracking process with external disturbances.

Figure 12. The control actions with external disturbances.

Based on the above investigation, we conclude that the RNN and DEO based NMPC method performed better than the PD and DDP MPC methods. In addition, this method achieved good robustness against external disturbances. The merit of the proposed method is that not only was the control precision satisfied, but the overshoots and residual vibration were also well suppressed.

5. Conclusions

This work presents an RNN and DEO based NMPC approach for position control of a single-link FJ robot. First, the system dynamic model was approximated using a simple three-layer ReLU-RNN. Then, based on the RNN predictive model and the MPC method, the RNN and DEO based NMPC controller was designed, in which the DEO algorithm was used to optimize the control inputs. Finally, the effectiveness and performance of the proposed technique were verified through comparative numerical simulations. The simulation results show that the proposed method is superior to the PD and DDP MPC methods: it minimizes overshoots and suppresses residual vibration while satisfying the control precision.

Because DEO is a stochastic optimization algorithm that is inherently parallel, a parallel DEO can speed up the optimization process. In future work, considering the optimization solution time, we will evaluate an RNN and parallel DEO based NMPC approach for implementing real-time NMPC and verify it further by experiments. In addition, we intend to apply it to multi-degree-of-freedom FJ robots.

Author Contributions

Conceptualization, A.Z., Z.L., B.W. and Z.H.; methodology, A.Z.; software, A.Z.; validation, A.Z., Z.L., B.W. and Z.H.; formal analysis, A.Z.; investigation, A.Z.; resources, A.Z., Z.L., B.W. and Z.H.; writing—original draft preparation, A.Z.; writing—review and editing, A.Z.; visualization, A.Z.; supervision, Z.L., B.W. and Z.H.; project administration, A.Z. and Z.L.; funding acquisition, Z.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Spong, M.W. Adaptive control of flexible joint manipulators. Syst. Control Lett. 1989, 13, 15–21.
2. Brogliato, B.; Ortega, R.; Lozano, R. Global tracking controllers for flexible-joint manipulators: A comparative study. Automatica 1995, 31, 941–956.
3. Kim, M.S.; Lee, J.S. Adaptive tracking control of flexible-joint manipulators without overparametrization. J. Robot. Syst. 2004, 21, 369–379.
4. Huang, A.C.; Chen, Y.C. Adaptive sliding control for single-link flexible-joint robot with mismatched uncertainties. IEEE Trans. Control Syst. Technol. 2004, 12, 770–775.
5. Ibrir, S.; Xie, W.F.; Su, C.Y. Observer-based control of discrete-time Lipschitzian non-linear systems: Application to one-link flexible joint robot. Int. J. Control 2005, 78, 385–395.
6. Akyuz, I.H.; Yolacan, E.; Ertunc, H.M.; Bingul, Z. PID and state feedback control of a single-link flexible joint robot manipulator. In Proceedings of the 2011 IEEE International Conference on Mechatronics, Istanbul, Turkey, 13–15 April 2011; pp. 409–414.
7. Liu, X.; Yang, C.; Chen, Z.; Wang, M.; Su, C.Y. Neuro-adaptive observer based control of flexible joint robot. Neurocomputing 2018, 275, 73–82.
8. Yin, W.; Sun, L.; Wang, M.; Liu, J. Nonlinear state feedback position control for flexible joint robot with energy shaping. Robot. Auton. Syst. 2018, 99, 121–134.
9. Wang, M.; Sun, L.; Yin, W.; Dong, S.; Liu, J. A novel sliding mode control for series elastic actuator torque tracking with an extended disturbance observer. In Proceedings of the 2015 IEEE International Conference on Robotics and Biomimetics (ROBIO), Zhuhai, China, 6–9 December 2015; pp. 2407–2412.
10. Sun, L.; Yin, W.; Wang, M.; Liu, J. Position control for flexible joint robot based on online gravity compensation with vibration suppression. IEEE Trans. Ind. Electron. 2017, 65, 4840–4848.
11. Tomei, P. A simple PD controller for robots with elastic joints. IEEE Trans. Automat. Control 1991, 36, 1208–1213.
12. De Luca, A.; Siciliano, B.; Zollo, L. PD control with on-line gravity compensation for robots with elastic joints: Theory and experiments. Automatica 2005, 41, 1809–1819.
13. Alvarez-Ramirez, J.; Cervantes, I. PID regulation of robot manipulators with elastic joints. Asian J. Control 2003, 5, 32–38.
14. De Luca, A.; Flacco, F. A PD-type regulator with exact gravity cancellation for robots with flexible joints. In Proceedings of the 2011 International Conference on Robotics and Automation (ICRA), Shanghai, China, 9–13 May 2011; pp. 317–323.
15. Albu-Schäffer, A.; Petit, C.O.F. Energy shaping control for a class of underactuated euler-lagrange systems. In Proceedings of the 10th IFAC Symposium on Robot Control, Dubrovnik, Croatia, 5–7 September 2012; Springer: Berlin/Heidelberg, Germany, 2012; Volume 45, pp. 567–575.
16. Ju, J.; Zhao, Y.; Zhang, C.; Liu, Y. Vibration suppression of a flexible-joint robot based on parameter identification and fuzzy PID control. Algorithms 2018, 11, 189.
17. Tang, Q.; Chu, Z.; Qiang, Y.; Wu, S.; Zhou, Z. Trajectory tracking of robotic manipulators with constraints based on model predictive control. In Proceedings of the 17th International Conference on Ubiquitous Robots (UR), Kyoto, Japan, 22–26 June 2020; pp. 23–28.
18. Wilson, J.; Charest, M.; Dubay, R. Non-linear model predictive control schemes with application on a 2 link vertical robot manipulator. Robot. Comput.-Integr. Manuf. 2016, 41, 23–30.
19. Carron, A.; Arcari, E.; Wermelinger, M.; Hewing, L.; Hutter, M.; Zeilinger, M.N. Data-driven model predictive control for trajectory tracking with a robotic arm. IEEE Robot. Autom. Lett. 2019, 4, 3758–3765.
20. Poignet, P.; Gautier, M. Nonlinear model predictive control of a robot manipulator. In Proceedings of the 6th International Workshop on Advanced Motion Control. Proceedings (Cat. No.00TH8494), Nagoya, Japan, 30 March–1 April 2000; pp. 401–406.
21. De Nicolao, G.; Magni, L.; Scattolini, R. Robust predictive control of systems with uncertain impulse response. Automatica 1996, 32, 1475–1479.
22. Magni, L.; Sepulchre, R. Stability margins of nonlinear receding-horizon control via inverse optimality. Syst. Control Lett. 1997, 32, 241–245.
23. Mayne, D.Q.; Rawlings, J.B.; Rao, C.V.; Scokaert, P.O. Constrained model predictive control: Stability and optimality. Automatica 2000, 36, 789–814.
24. Hewing, L.; Wabersich, K.P.; Menner, M.; Zeilinger, M.N. Learning-based model predictive control: Toward safe learning in control. Annu. Rev. Control Robot. Auton. Syst. 2020, 3, 269–296.
25. Guo, K.; Pan, Y.; Yu, H. Composite learning robot control with friction compensation: A neural network-based approach. IEEE Trans. Ind. Electron. 2019, 66, 7841–7851.
26. Liu, X.; Zhao, F.; Ge, S.S.; Wu, Y.; Mei, X. End-effector force estimation for flexible-joint robots with global friction approximation using neural networks. IEEE Trans. Ind. Inform. 2019, 15, 1730–1741.
27. Liu, Y.J.; Li, J.; Tong, S.; Chen, C.L.P. Neural network control-based adaptive learning design for nonlinear systems with full-state constraints. IEEE Trans. Neural Netw. Learn. Syst. 2016, 27, 1562–1571.
28. He, W.; Chen, Y.; Yin, Z. Adaptive neural network control of an uncertain robot with full-state constraints. IEEE Trans. Cybern. 2016, 46, 620–629.
29. He, W.; Yan, Z.; Sun, Y.; Ou, Y.; Sun, C. Neural-learning-based control for a constrained robotic manipulator with flexible joints. IEEE Trans. Neural Netw. Learn. Syst. 2018, 29, 5993–6003.
30. Bai, G.; Meng, Y.; Liu, L.; Luo, W.; Gu, Q.; Liu, L. Review and comparison of path tracking based on model predictive control. Electronics 2019, 8, 1077.
31. Lenz, I.; Knepper, R.A.; Saxena, A. DeepMPC: Learning deep latent features for model predictive control. In Proceedings of the Robotics: Science and Systems XI, Rome, Italy, 13–17 July 2015.
32. Gillespie, M.T.; Best, C.M.; Townsend, E.C.; Wingate, D.; Killpack, M.D. Learning nonlinear dynamic models of soft robots for model predictive control with neural networks. In Proceedings of the 2018 International Conference on Soft Robotics (RoboSoft), Livorno, Italy, 24–28 April 2018; pp. 39–45.
33. Hyatt, P.; Wingate, D.; Killpack, M.D. Model-based control of soft actuators using learned non-linear discrete-time models. Front. Robot. AI 2019, 6, 22.
34. Hyatt, P.; Killpack, M.D. Real-time nonlinear model predictive control of robots using a graphics processing unit. IEEE Robot. Autom. Lett. 2020, 5, 1468–1475.
35. Li, D.; Li, D. Adaptive neural tracking control for an uncertain state constrained robotic manipulator with unknown time-varying delays. IEEE Trans. Syst. Man Cybern. Syst. 2018, 48, 2219–2228.
36. Karg, B.; Lucia, S. Efficient representation and approximation of model predictive control laws via deep learning. IEEE Trans. Cybern. 2020, 50, 3866–3878.
37. Thuruthel, T.G.; Falotico, E.; Renda, F.; Laschi, C. Model-based reinforcement learning for closed-loop dynamic control of soft robotic manipulators. IEEE Trans. Robot. 2018, 35, 124–134.
38. Hu, Y.; Su, H.; Fu, J.; Karimi, H.R.; Ferrigno, G.; De Momi, E.; Knoll, A. Nonlinear model predictive control for mobile medical robot using neural optimization. IEEE Trans. Ind. Electron. 2020, 68, 12636–12645.
39. Cao, Y.; Huang, J.; Xiong, C. Single-layer learning-based predictive control with echo state network for pneumatic-muscle-actuators-driven exoskeleton. IEEE Trans. Cogn. Dev. Syst. 2021, 13, 80–90.
40. Kumar, S.S.P.; Tulsyan, A.; Gopaluni, B.; Loewen, P. A deep learning architecture for predictive control. In Proceedings of the 10th IFAC Symposium on Advanced Control of Chemical Processes ADCHEM, Shenyang, China, 25–27 July 2018; pp. 512–517.
41. Damasceno, B.C.; Xie, X. Deadlock-free scheduling of manufacturing systems using petri nets and dynamic programming. In Proceedings of the 14th IFAC World Congress 1999, Beijing, China, 5–9 July 1999; pp. 4870–4875.
42. Fahmy, S.; Balakrishnan, S.; ElMekkawy, T. Deadlock prevention and performance oriented supervision in flexible manufacturing cells: A hierarchical approach. Robot. Comput.-Integr. Manuf. 2011, 27, 591–603.
43. Foumani, M.; Gunawan, I.; Smith-Miles, K. Resolution of deadlocks in a robotic cell scheduling problem with post-process inspection system: Avoidance and recovery scenarios. In Proceedings of the 2015 IEEE International Conference on Industrial Engineering and Engineering Management (IEEM), Singapore, 6–9 December 2015; pp. 1107–1111.
44. Storn, R.; Price, K. Differential evolution-a simple and efficient adaptive scheme for global optimization over continuous spaces. J. Glob. Optim. 1997, 11, 341–359.
45. Tasoulis, D.K.; Pavlidis, N.G.; Plagianakos, V.P.; Vrahatis, M.N. Parallel differential evolution. In Proceedings of the 2004 Congress on Evolutionary Computation, Portland, OR, USA, 19–23 June 2004; Volume 2, pp. 2023–2029.
46. Wang, H.; Rahnamayan, S.; Wu, Z. Parallel differential evolution with self-adapting control parameters and generalized opposition-based learning for solving high-dimensional optimization problems. J. Parallel Distrib. Comput. 2013, 73, 62–73.
47. Pedroso, D.M.; Bonyadi, M.R.; Gallagher, M. Parallel evolutionary algorithm for single and multi-objective optimisation: Differential evolution and constraints handling. Appl. Soft Comput. 2017, 61, 995–1012.
48. Zibin, P. Performance analysis and improvement of parallel differential evolution. arXiv 2021, arXiv:2101.06599.
49. Opara, K.R.; Arabas, J. Differential evolution: A survey of theoretical analyses. Swarm Evol. Comput. 2019, 44, 546–558.
50. Al-Dabbagh, R.D.; Kinsheel, A.; Mekhilef, S.; Baba, M.S.; Shamshirband, S. System identification and control of robot manipulator based on fuzzy adaptive differential evolution algorithm. Adv. Eng. Softw. 2014, 78, 60–66.
51. Zhang, B.; Sun, X.; Liu, S.; Deng, X. Adaptive differential evolution-based receding horizon control design for multi-UAV formation reconfiguration. Int. J. Control Autom. 2019, 17, 3009–3020.
52. Jhang, J.Y.; Lin, C.J.; Young, K.Y. Cooperative carrying control for multi-evolutionary mobile robots in unknown environments. Electronics 2019, 8, 298.
53. Chen, C.H.; Lin, C.J.; Jeng, S.Y.; Lin, H.Y.; Yu, C.Y. Using ultrasonic sensors and a knowledge-based neural fuzzy controller for mobile robot navigation control. Electronics 2021, 10, 466.
54. Guo, H.; Cao, D.; Chen, H.; Sun, Z.; Hu, Y. Model predictive path following control for autonomous cars considering a measurable disturbance: Implementation, testing, and verification. Mech. Syst. Signal Process. 2019, 118, 41–60.
55. Gul, N.; Kim, S.M.; Ahmed, S.; Khan, M.S.; Kim, J. Differential evolution based machine learning scheme for secure cooperative spectrum sensing system. Electronics 2021, 10, 1687.
56. Wei, Y.; Wei, Y.; Sun, Y.; Qi, H.; Li, M. An advanced angular velocity error prediction horizon self-tuning nonlinear model predictive speed control strategy for PMSM system. Electronics 2021, 10, 1123.
57. Mayne, B.D. A second-order gradient method for determining optimal trajectories of non-linear discrete-time systems. Int. J. Control 1966, 3, 85–95.
58. Slotine, J.J.E.; Li, W. Applied Nonlinear Control; Number 1; Prentice Hall: Englewood Cliffs, NJ, USA, 1991.
59. Kingma, D.; Ba, J. Adam: A method for stochastic optimization. In Proceedings of the 2nd International Conference on Learning Representations (ICLR), Banff, AB, Canada, 14–16 April 2014.
60. Kwon, W.H.; Han, S.H. Receding Horizon Control: Model Predictive Control for State Models; Springer Science & Business Media: London, UK, 2006.
61. Maciejowski, J.M. Predictive Control: With Constraints; Pearson Education Limited, Prentice Hall: London, UK, 2002.
62. Storn, R. System design by constraint adaptation and differential evolution. IEEE Trans. Evol. Comput. 1999, 3, 22–34.

Table 1. The simulation model parameters of single-link FJ robot.

Parameters	Values	Parameters	Values
$J_{1}$	0.8 kg·m$^{2}$	$R$	5.3 Ω
$J_{2}$	0.1 kg·m$^{2}$	$K_{f_{1}}$	2.0
$N$	200	$K_{f_{2}}$	2.0
$K$	70 Nm/rad	$m$	0.3 kg
$L$	$1.4 \times 10^{-5}$ H	$l$	0.5 m
$K_{\tau}$	$9.3 \times 10^{-3}$ Nm/A	$g$	9.8 m/s$^{2}$
$K_{e}$	0.1 V/rad/s	-	-

Table 2. The MSE of the ReLU-RNN predictive model.

States	$x_{1}$ (rad)	$x_{2}$ (rad/s)	$x_{3}$ (rad)	$x_{4}$ (rad/s)	$x_{5}$ (A)
MSE	$3.14 \times 10^{- 7}$	$4.76 \times 10^{- 7}$	$2.60 \times 10^{- 7}$	$1.44 \times 10^{- 7}$	$6.25 \times 10^{- 8}$

Table 3. Comparison of the state error of different controllers.

Controller	Proposed	PD	DDP MPC
State error	0.0037	0.0049	0.0016

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
