Data-Driven Optimal Preview Repetitive Control of Linear Discrete-Time Systems

Li, Xiang-Lai; Wu, Qiu-Lin

doi:10.3390/math13213501

Open AccessArticle

Data-Driven Optimal Preview Repetitive Control of Linear Discrete-Time Systems

by

Xiang-Lai Li

^1,* and

Qiu-Lin Wu

²

¹

College of Electrical and Information Engineering, Hunan Institute of Engineering, Xiangtan 411101, China

²

School of Automation and Electronic Information, Xiangtan University, Xiangtan 411105, China

^*

Author to whom correspondence should be addressed.

Mathematics 2025, 13(21), 3501; https://doi.org/10.3390/math13213501

Submission received: 24 July 2025 / Revised: 26 September 2025 / Accepted: 14 October 2025 / Published: 2 November 2025

(This article belongs to the Special Issue Advances and Applications for Data-Driven/Model-Free Control)

Download

Browse Figures

Versions Notes

Abstract

This paper investigates the problem of data-driven optimal preview repetitive control of linear discrete-time systems. Firstly, by integrating prior information into the preview time domain, an augmented state-space system is established. Secondly, the original output tracking problem is mathematically reconstructed and transformed into the optimization problem form of a linear quadratic tracking (LQR). Furthermore, a Q-function-based iterative algorithm is designed to dynamically calculate the optimal tracking control gain based solely on online measurable data. This method has a dual-breakthrough feature: it neither requires prior knowledge of system dynamics nor provides an initial stable controller. Finally, the superiority of the proposed scheme is verified through numerical simulation experiments.

Keywords:

data-driven preview repetitive control; discrete-time systems; reinforcement learning

MSC:

93D21

1. Introduction

In practical control system applications, there is often a situation where the system tracks a specific periodic signal. Repetitive control (RC) is a high-precision control method that tracks a reference signal with a known period [1]. Preview control (PC) is a control method that improves the performance of closed-loop systems by utilizing known future information from target signals or interference signals [2]. The combination of PC and RC, known as preview repetitive control [3,4] (PRC), was first developed in the early 1990s. It utilizes both the internal model compensation mechanism of repetitive learning and the compensation of preview information; thus, it can significantly improve the control performance of closed-loop systems.

Recently, a great deal of research has been devoted to the theory and applications of PRC, and various structures and algorithms have been devised [5,6,7,8]. In [5], a design method of discrete-time sliding-mode preview repetitive servo systems was proposed. Using a parameter-dependent Lyapunov function method and linear matrix inequality (LMI) techniques, a robust guaranteed-cost PRC law was proposed in [6]. Based on the optimal control theory, the problem of optimal PRC for continuous-time linear systems was investigated in [7] by solving the algebraic Riccati equation (ARE). Using a two-dimensional model approach, observer-based PRC for uncertain discrete-time systems was studied in [8]. In [9], a discrete-time PRC scheme was proposed for linear discrete-time periodic systems. To compensate for unknown external disturbances, the problem of PRC with equivalent-input disturbance (EID) for uncertain continuous-time systems was presented in [10]. The problem of Padé approximation-based optimal PRC with EID was proposed in [11]. In [12], the PRC for discrete-time linear parameter-varying systems with a time-varying delay was investigated. However, the above results depend on a perfect understanding of the system model.

In model driven-based control, a mathematical model is first established and the controller is then designed based on this model [13,14]. The inherent problem of system modeling is the motivation for research in the field of data-driven control (DDC) technology [15,16]. An approach of designing an optimal preview output tracking controller using online measurable data and a Q-function-based iteration algorithm was developed [17]. A data-driven framework to control a complex network optimally and without any knowledge of the network dynamics was develop in [18]. The fault-tolerant consensus control problem for a general class of linear continuous-time multi-agent systems (MASs) was considered in [19]. In [20], the problem of the decoupled eata-based optimal control policy for a nonlinear stochastic systems was addressed. In [21], a data-based iteration learning control for multiphase batch processes was investigated. A data-based approach to linear quadratic tracking (LQT) control design was presented in [22]. In [23], a novel adaptive dynamic programming-based learning algorithm was proposed by only accessing the input and output data to solve the LQT problem. In [24], a robust data-driven finite-horizon linear quadratic regulator control problem of unknown linear time-invariant systems was addressed. DDC-based

H_{\infty}

optimal tracking control for systems in the presence of external disturbance was presented in [25,26].

Note that the mechanism model with system matrices and order needs to be known in existing PC/RC studies. Actually, many practical plants are hard to model, particularly for the complicated modern control systems and complex networks [27]. On the other hand, the above studies do not reflect the research interests in data-driven PC and RC compensation information. These observations inspire our current study. In this paper, we investigate the problem of data-driven preview repetitive control (DD-PRC) of linear discrete-time systems. The main contributions of this paper are summarized as follows: (i) By taking advantage of the preview information of the desired tracking signal as well as the periodic learning ability of repetitive control, the original control problem is converted to an LQT one under an augmented state-space model. (ii) In order to solve the DDC-based LQT problem, a Q-function-based iterative algorithm is then designed to dynamically calculate the optimal tracking control gain instead of solving the ARE.

The rest of this paper is organized follows. Section 2 gives the problem formulation. The data-driven PRC law design method is given in Section 3. In Section 4, the numerical simulation is presented to demonstrate the validity of the design method. Finally, the conclusions are drawn in Section 5.

2. Problem Formulation

Consider a discrete-time linear system given by

\{\begin{matrix} x (k + 1) = A x (k) + B u (k), \\ y (k) = C x (k), \end{matrix}

(1)

where

x (k) \in R^{n}

is the system state,

y (k) \in R^{m}

is the measurement output, and

u (k) \in R^{p}

is the control input.

A, B

, and C are the corresponding appropriate dimensional matrices.

The configuration of the data-driven PRC system is shown in Figure 1, where

G (z)

is the controlled plant as described in System (1).

r (k) \in R^{m}

denotes the constant reference trajectory signal with a period of L.

C_{R} (z)

is the basic repetitive control block, and

v (k)

is the output of the repetitive controller and is given by

v (k) = \{\begin{matrix} e (k), & 0 \leq k < L, \\ v (k - L) + e (k), & k \geq L, \end{matrix}

(2)

where

e (k) = r (k) - y (k)

is the tracking error.

The following assumptions are needed for System (1).

Assumption 1.

Assume that

A, B

, and C satisfy both

(A, B)

controllable and

(A, C)

observable.

Assumption 2.

The preview length of the reference trajectory signal

r (k)

is limited to M (

M < L

). That is, at the current moment k, the current value of

r (k)

and the M-step future values

r (k), r (k + 1), \dots, r (k + M)

are available. Additionally, it is assumed that the future values beyond the reference trajectory forward range L are set to 0, i.e.,

r (k + L) = 0

.

Remark 1.

Note that Assumption 1 is a necessary assumption for the optimal state feedback control, while Assumption 2 is a standard working hypothesis introduced to facilitate the PC mathematical development [2,3,4]. It indicates that the reference signal’s values are known in the actual system for a certain time period, during which the reference signal exceeds the preview length, and the impact is very small.

The objective of this paper is to design a DD-PRC law

u (k)

with the form of

u (k) = \sum_{j = 0}^{L - 1} K_{v j} v (k - j) + K_{x} x (k) + \sum_{j = 0}^{M} K_{r j} r (k + j),

(3)

where

K_{v j}

and

K_{r j}

are the gains of the repetitive controller and the preview compensator, respectively, and

K_{x}

is the gain of the feedback controller, such that the system output

y (k)

tracks the reference signal

r (k)

and the quadratic cost function

J = \sum_{k = 0}^{\infty} [e^{T} (k) Q_{e} e (k) + Δ u^{T} (k) R Δ u (k)]

(4)

is minimized subject to system (1), where

Q_{e}

and R are symmetric positive definite and semi-positive definite symmetric matrices, respectively, and

Δ u (k) = u (k) - u (k - L) .

Now, define the following L-order difference operator:

\{\begin{matrix} Δ x (k) = x (k) - x (k - L), \\ Δ y (k) = y (k) - y (k - L), \\ Δ u (k) = u (k) - u (k - L), \\ Δ r (k) = r (k) - r (k - L) . \end{matrix}

(5)

Substituting (5) into System (1) yields

\{\begin{matrix} Δ x (k + 1) = A Δ x (k) + B Δ u (k), \\ e (k + 1) = Δ r (k + 1) - C A Δ x (k) - C B Δ u (k) \\ + e (k + 1 - L) . \end{matrix}

(6)

Introducing the state vector

{\bar{x}}_{d} (k) = {[e (k), e (k - 1), \dots, e (k - L + 1), Δ x (k)]}^{T},

(7)

Therefore,

\{\begin{matrix} {\bar{x}}_{d} (k + 1) = \bar{A} {\bar{x}}_{d} (k) + D_{r} Δ r (k + 1) + D Δ u (k), \\ Δ y (k) = C_{1} {\bar{x}}_{d} (k), \end{matrix}

(8)

where

\begin{matrix} \bar{A} = [\begin{matrix} 0 & \dots & 0 & I & - C A \\ I & \dots & 0 & 0 & 0 \\ ⋮ & ⋱ & ⋮ & ⋮ & ⋮ \\ 0 & \dots & I & 0 & 0 \\ 0 & \dots & 0 & 0 & A \end{matrix}], \\ D = {[\begin{matrix} - C B & 0 & \dots & 0 & B \end{matrix}]}^{T}, \\ D_{r} = {[\begin{matrix} I & 0 & \dots & 0 & 0 \end{matrix}]}^{T}, C_{1} = [\begin{matrix} 0 & \dots & 0 & C \end{matrix}] . \end{matrix}

Since M-step future values

r (k), r (k + 1), \dots, r (k + M)

are available, a new

m (M + 1) \times 1

state vector

X_{r} (k)

can be defined as

X_{r} (k) = {[\begin{matrix} Δ r (k) & Δ r (k + 1) & \dots & Δ r (k + M) \end{matrix}]}^{T} .

(9)

Combining Equations (8) and (9) yields

\bar{X} (k + 1) = Φ_{x} \bar{X} (k) + D_{x} Δ u (k),

(10)

where

\begin{matrix} \bar{X} (k) = [\begin{matrix} {\bar{x}}_{d} (k) \\ X_{r} (k) \end{matrix}] \in R^{n + L + m (M + 1)}, \\ D_{x} = [\begin{matrix} D \\ 0 \end{matrix}] \in R^{[n + L + m (M + 1)] \times p}, \\ Φ_{x} = [\begin{matrix} \bar{A} & D_{x r} \\ 0 & A_{r} \end{matrix}] \in R^{[n + L + m (M + 1)] \times [n + L + m (M + 1)]}, \\ D_{x r} = [\begin{matrix} 0 & D_{r} & 0 & \dots & 0 \end{matrix}] \in R^{(n + L) \times m (M + 1)}, \\ A_{r} = (\begin{matrix} 0 & I & 0 & \dots & 0 \\ 0 & ⋱ & ⋱ & ⋱ & ⋮ \\ ⋮ & ⋱ & ⋱ & 0 \\ 0 & \dots & \dots & 0 & I \\ 0 & \dots & \dots & 0 & 0 \end{matrix}) \in R^{m (M + 1) \times m (M + 1)} . \end{matrix}

With these definitions, the performance index (4) can be rewritten as

\bar{J} = \sum_{k = 0}^{\infty} [{\bar{X}}^{T} (k) Q \bar{X} (k) + Δ u^{T} (k) R Δ u (k)]

(11)

where,

Q = diag [\begin{matrix} Q_{e} & \dots & 0 \end{matrix}]

.

Thus, the control problem associated with Discrete System (1) under Performance Index (4) is converted into an optimal LQT control problem of the Augmented System (10), subject to Performance Index (11).

According to (2) and (13), we can get

\begin{matrix} Δ u (k) & = & \sum_{j = 0}^{L - 1} K_{v j} e (k - j) + K_{x} Δ x (k) \\ + & \sum_{j = 0}^{M} K_{r j} Δ r (k + j) \\ = & - K_{1} {\bar{x}}_{d} (k) - K_{2} X_{r} (k) \\ = & - K \bar{X} (k), \end{matrix}

(12)

where

\begin{matrix} K_{1} = - [K_{v 0}, K_{v 1}, \dots, K_{v (L - 1)}, K_{x}], \\ K_{2} = - [K_{r 0}, K_{r 1}, \dots, K_{r M}], \\ K = - [K_{1}, K_{2}] . \end{matrix}

Hence, if the optimal controller

Δ u (k) = - K \bar{X} (k)

subject to (10) and (11) is solved, then the DD-PRC law (13) will be derived. In fact, using (5) and (12), we can find that

u (k) = \sum_{j = 0}^{L - 1} K_{v j} v (k - j) + K_{x} x (k) + \sum_{j = 0}^{M} K_{r j} r (k + j) .

(13)

It can be seen that the above control law

u (k)

consists of three terms. The first term represents RC, which is used to eliminate the steady-state tracking error. The second term describes the state-feedback control, which is used to guarantee the stability of closed-loop system. The third term denotes the preview action, which is used to improve the transient response of systems. The combination of these three terms can achieve the satisfactory tracking performance.

3. Data-Driven PRC Law Design

It has been shown in [28] that the optimal controller can be obtained if we can find a unique positive definite solution of the following algebraic Riccati equation (ARE):

P = Φ_{x}^{T} P Φ_{x} + Q - Φ_{x}^{T} P D_{x} (R + D_{x}^{T} P D_{x}) D_{x}^{T} P Φ_{x} .

(14)

Then, we can obtain the optimal controller:

Δ u (k) = - K \bar{X} (k),

(15)

where

K = {(R + D_{x}^{T} P D_{x})}^{- 1} D_{x}^{T} P Φ_{x} .

(16)

This problem can be solved using the iterative algorithm [17,28] to approximate the solution of ARE (14), which is given as Algorithm 1 for comparison in the next section.

Algorithm 1 Iterative algorithm for solving K

1.: Set $K_{0}$ and $j = 0$ , where j denotes iteration index.
2.: Solve the following equation for $P_{j + 1}$ ,

$\begin{matrix} P_{j + 1} & = & Φ_{x}^{T} P_{j} Φ_{x} + Q \\ - & Φ_{x}^{T} P_{j} D_{x} (R + D_{x}^{T} P_{j} D_{x}) D_{x}^{T} P_{j} Φ_{x} . \end{matrix}$

(17)
3.: Calculate $K_{j + 1}$ by

$K_{j + 1} = {(R + D_{x}^{T} P_{j + 1} D_{x})}^{- 1} D_{x}^{T} P_{j + 1} Φ_{x} .$

(18)
4.: Set $j = j + 1$ and repeat step 2–3 until $∥P_{j + 1} - P_{j}∥ \leq ε$ with a small constant $ε$ $(ε > 0)$ . Output $P_{j + 1}$ as the solution of ARE (14).

Note that the above iterative algorithm, Algorithm 1, is essentially model-based. In this section, we will develop a model-free control approach by using online input and output data.

For the LQT problem, we define the following cost function:

V (\bar{X} (k)) = \sum_{k = 0}^{\infty} [{\bar{X}}^{T} (k) Q \bar{X} (k) + Δ u^{T} (k) R Δ u (k)],

(19)

whose optimal value can be parameterized as

V (\bar{X} (k)) = {\bar{X}}^{T} (k) P \bar{X} (k) .

Using the well-known Bellman equation, we get

\begin{matrix} V^{*} (\bar{X} (k)) & = R (\bar{X} (k), Δ u (k)) + V (\bar{X} (k + 1)) \\ = R (\bar{X} (k), Δ u (k)) \\ + V^{*} (Φ_{x} \bar{X} (k) + D_{x} Δ u (k)), \end{matrix}

(20)

where

R (.) = {\bar{X}}^{T} (k) Q \bar{X} (k) + Δ u^{T} (k) R Δ u (k)

can be viewed as the gain from reinforcement learning, and the corresponding

V (\bar{X} (k + 1))

function can be viewed as the cost function.

Define a Q-function, also known as action value function

Q (\bar{X} (k), u (k))

, given by

Q (\bar{X} (k), Δ u (k)) = R (\bar{X} (k), Δ u (k)) + V (\bar{X} (k + 1)) .

(21)

From (21), it can be observed that this Q-function describes the performance associated with the state–action pair

(\bar{X} (k), Δ u (k)) .

For the proposed optimal controller

u (k)

, it can be derived from Equations (20) and (21) that

V (\bar{X} (k)) = Q (\bar{X} (k), Δ u (k))

and therefore the Q-function also has a corresponding Bellman formation:

\begin{matrix} Q (\bar{X} (k), Δ u (k)) & = & R (\bar{X} (k), Δ u (k)) \\ + & Q (\bar{X} (k + 1), Δ u (k + 1)) . \end{matrix}

(22)

From the optimality of Bellman’s formula, the optimal value function

V^{*} (\bar{X} (k))

and the optimal controller

Δ u^{*} (k)

is given by

V^{*} (\bar{X} (K)) = min_{Δ u (k)} Q^{*} (\bar{X} (k), Δ u (k))

(23)

u^{*} (k) = arg min_{Δ u (k)} Q^{*} (\bar{X} (k), Δ u (k)) .

(24)

It follows from the Q-function (21) that

\begin{matrix} Q^{*} (\bar{X} (k), Δ u (k)) \\ = & {\bar{X}}^{T} (k) Q_{a} \bar{X} (k) + u^{T} (k) R Δ u (k) \\ + {[Φ_{x} \bar{X} (k) + D_{x} Δ u (k)]}^{T} P [Φ_{x} \bar{X} (k) + D_{x} Δ u (k)] \\ = & {[\begin{matrix} \bar{X} (k) \\ Δ u (k) \end{matrix}]}^{T} G [\begin{matrix} \bar{X} (k) \\ Δ u (k) \end{matrix}], \end{matrix}

(25)

where

\begin{matrix} G & = [\begin{matrix} G_{11} & G_{12} \\ G_{12}^{T} & G_{22} \end{matrix}] \\ = [\begin{matrix} Q_{a} + Φ_{x}^{T} P Φ_{x} & Φ_{x}^{T} P D_{x} \\ D_{x}^{T} P Φ_{x} & R + D_{x}^{T} P D_{x} \end{matrix}] . \end{matrix}

(26)

By solving

\frac{\partial Q^{*} (\bar{X} (k), Δ u (k))}{\partial Δ u (k)} = 0

, we can obtain

Δ u^{*} (k) = - K \bar{X} (k) = - G_{22}^{- 1} G_{12}^{T} \bar{X} (k)

(27)

If P is the only positive definite solution obtained from the ARE (14), then

K = G_{22}^{- 1} G_{12}^{T}

, which is equal to the gain of the optimal LQT control.

The Q-learning scheme generates the sequences

Q^{j} (\bar{X} (k), Δ u (k))

, and

Δ u (k)

can converge to

Q^{*} (\bar{X} (k), Δ u (k))

,

Δ u^{*} (k)

, respectively, which has been proven in [29]. It is worth pointing out that this Q-learning scheme requires a priori information of system dynamics and an initially stabilizing control gain. To overcome this drawback, in the following section, based on the Q-function value, a data-driven based algorithm will be developed to obtain the optimal control gain.

To implement the iterative algorithm, the iterative Q-function

Q^{j} (\bar{X}, Δ u)

is denoted as

Q^{j} (\bar{X}, u) = {[\begin{matrix} \bar{X} (k) \\ Δ u (k) \end{matrix}]}^{T} G^{j} [\begin{matrix} \bar{X} (k) \\ Δ u (k) \end{matrix}],

(28)

where

G^{j} = [\begin{matrix} G_{11}^{j} & G_{12}^{j} \\ {(G_{12}^{j})}^{T} & G_{22}^{j} \end{matrix}] .

(29)

It follows from (27), (28), and (29) that

Δ u^{j} = - K_{j} \bar{X} (k) = - {(G_{22}^{j})}^{- 1} {(G_{12}^{j})}^{T} \bar{X} (k) .

(30)

Using (17) and (28), we have

\begin{matrix} {[\begin{matrix} \bar{X} (k) \\ Δ u (k) \end{matrix}]}^{T} Π^{j + 1} [\begin{matrix} \bar{X} (k) \\ Δ u (k) \end{matrix}] \\ = & {[\begin{matrix} \bar{X} (k + 1) \\ - K_{j} \bar{X} (k + 1) \end{matrix}]}^{T} Π^{j} [\begin{matrix} \bar{X} (k + 1) \\ - K_{j} \bar{X} (k + 1) \end{matrix}] \\ + {\bar{X}}^{T} (k) Q_{a} \bar{X} (k) + Δ u^{T} (k) R Δ u (k), \end{matrix}

(31)

where

Π^{j} = [\begin{matrix} Q_{a} + Φ_{x}^{T} P_{j} Φ_{x} & Φ_{x}^{T} P_{j} D_{x} \\ D_{x}^{T} P_{j} Φ_{x} & R + D_{x}^{T} P_{j} D_{x} \end{matrix}] = G^{j} .

(32)

In the following, for computational simplicity, the Kronecker product is introduced to convert

G^{j}

into a vector representation. Note that

a^{T} W b = (b^{T} \otimes a^{T}) vec (W),

(33)

where

a \in R^{1 \times m}, W \in R^{m \times m}, b \in R^{m \times 1}

, ⊗ is the Kronecker product, and

vec (W)

is the vector formed by stacking the columns of matrix

W .

Therefore, it follows from (31) that

ρ_{j} (X, Δ u) vec (G^{j + 1}) = π_{j}

(34)

where

\begin{matrix} ρ_{j} (X, Δ u) = {[\begin{matrix} \bar{X} (k) \\ Δ u (k) \end{matrix}]}^{T} \otimes {[\begin{matrix} \bar{X} (k) \\ Δ u (k) \end{matrix}]}^{T} \\ π_{j} = \{{[\begin{matrix} \bar{X} (k + 1) \\ - K_{j} \bar{X} (k + 1) \end{matrix}]}^{T} \otimes {[\begin{matrix} \bar{X} (k + 1) \\ - K_{j} \bar{X} (k + 1) \end{matrix}]}^{T}\} \\ \times vec (G^{j}) + ({\bar{X}}^{T} (k) \otimes {\bar{X}}^{T} (k)) vec (Q_{a}) \\ + (Δ u^{T} (k) \otimes Δ u {(k)}^{T}) vec (R) . \end{matrix}

(35)

For a given positive integer N, denote

Ξ = [ρ_{1}^{T}, ρ_{2}^{T}, \dots, ρ_{N}^{T}]

and

Θ = [π_{1}^{T}, π_{2}^{T}, \dots, π_{N}^{T}]

. Then by collecting sample data, we have

Ξ^{T} vec (G^{j + 1}) = Θ .

(36)

By using the least squares method, unknown parameter vectors can be calculated by

vec (G^{j + 1}) = {(Ξ Ξ^{T})}^{- 1} Θ .

(37)

Remark 2.

Note that the matrix Ξ is used in each iteration, so the measured data can be fully utilized. Thus, we should collect enough data so that the rank of Ξ is equal to the unknown independent parameter possessed by the vector G, ensuring that Equation (37) has a unique solution. The rank condition is bound up with the persistence of excitation or exploratory signals, which ensures that enough information about system dynamics is collected.

On the other hand, the convergence of the least squares solution depends on the convexity of the problem and the initial conditions of the iterative algorithm. When the least squares problem is convex, the iterative method usually converges to the optimal solution. If it is a non-convex problem, there may be convergence difficulties. By adding constraints, such as adjusting the threshold and the maximum number of iterations, the convergence of the non-convex problem can be improved.

Now, a data-driven algorithm is developed for solving K in (30), referred to as Algorithm 2.

Algorithm 2 Data-driven algorithm for solving K

1.: Set $K_{0}$ and $j = 0$ , where j denotes iteration index.
2.: Calculate $G^{j + 1}$ by (37).
3.: Controller update

$\begin{matrix} Δ u {(k)}^{j + 1} = - K^{j + 1} \bar{X} (k), \\ K_{j + 1} = {(G_{22}^{j + 1})}^{- 1} {(G_{12}^{j + 1})}^{T} . \end{matrix}$
4.: Set $j = j + 1$ and repeat step 2–3 until $∥G^{j} - G^{j - 1}∥ \leq ε$ with a small constant $ε$ $(ε > 0)$ . Output $K_{j + 1}$ as the optimal controller gain.

Remark 3.

Algorithm 2 is a DDC method. The algorithm is implemented using past input, output, and preview reference data measured by the system, which does not require any priori knowledge of the system dynamics.

Remark 4.

Algorithm 2 is a DD-PRC algorithm for linear discrete-time systems. According to the data-driven robust fault-tolerant control studied in [15], Algorithm 2 can be extended to nonlinear systems by using the Markov parameter sequence identification method.

4. Simulation Results

In this section, the effectiveness of the proposed algorithm is verified by a simulation example.

Consider the following linear discrete system:

\{\begin{matrix} x (k + 1) = [\begin{matrix} 0.6 & 0.1 \\ 0.45 & 0.3 \end{matrix}] x (k) + [\begin{matrix} 0.3 \\ 0.2 \end{matrix}] u (k), \\ y (k) = [1 2] x (k), \end{matrix}

(38)

where

x (0) = {[1, - 1]}^{T}

. Assume that the reference signal with period

L = 5

is given by

\begin{matrix} r (k) = sin (\frac{2 π}{5} k) + 0.25 sin (\frac{4 π}{5} k) + 0.5 sin (\frac{6 π}{5} k) . \end{matrix}

The performance index is denoted as (4) with

Q = 5

and

R = 1

. Set the preview length to

M = 3

.

By some calculation, we can find that

(A, B)

is controllable and

(A, C)

is observable. By solving the ARE (14), we obtain the optimal control gain:

\begin{matrix} K^{*} = [0.2310 0.0050 - 0.0637 - 0.2180 - 1.1284 \\ 1.5194 1.2777 0 - 1.1284 - 0.2180 - 0.0637] . \end{matrix}

Set

K_{0} = 0,

ε = 10^{- 4} .

Using Algorithm 1 to solve for K, at

j = 6

th iteration, we get the optimal control gain

K_{6}^{1} = K^{*}

. While using Algorithm 2 to solve for K, at

j = 6

th iteration, we get

\begin{matrix} K_{6}^{2} = [0.2311 0.0051 - 0.0636 - 0.2179 - 1.1283 \\ 1.5192 1.2775 0 - 1.1283 - 0.2178 - 0.0637] . \end{matrix}

Altough the

K_{6}^{1}

obtained by Algorithm 1 is more accurate, it depends on the prior information of system matrices. On account of the inevitability of parametric variations or unmodeled dynamics, it is usually hard to obtain the perfect model of a physical system. In the next section, we will use

K_{6}^{2}

to perform the simulation.

The simulation results are given in Figure 2, Figure 3 and Figure 4. Figure 2 shows the norm of the difference between the controller gain K and the optimal tracking controller gain

K^{*}

during the iteration process. Figure 3 shows the system output

y (k)

and the reference signal

r (k)

under the data-driven PRC law (13) at

j = 6

th iteration. The tracking error is depicted in Figure 4. It can be seen from these Figures that the data-based algorithm proposed in this paper is capable of realizing tracking control tasks successfully.

To test the robustness, we compared the simulation results between our approach and the Q-learning method proposed in [24]. Add a disturbance signal

d (k) = sin (\frac{2 π}{5} k)

to the input signal of the control system. The simulation result is shown in Figure 5. It can be seen that both the outputs can track the reference signal accurately. However, DD-PRC can more effectively reduce the tracking error. The reason is the combination of preview and repetitive learning.

Furthermore, Gaussian white noise is also added to the disturbance term

d (k)

. Noise with varying variances is selected, and 50 Monte Carlo experiments are conducted for each variance level. This process yields the mean and standard deviation (SD) of the sum square error (SSE) for the two controllers under different noise intensities, as presented in Table 1. As is evident from Table 1, we can conclude that the DD-PRC method has superior anti-interference ability compared to the Q-learning method proposed in [24].

5. Conclusions

In this paper, a data-based PRC algorithm for addressing the tracking control problem of linear discrete-time systems is designed. The algorithm transforms the system’s optimal tracking problem into a linear quadratic regulation problem. A data-driven approach is proposed without relying on system dynamics, enabling the resolution of LQT using solely measured input, output, and reference trajectory data. Finally, MATLAB R2022a simulation results demonstrate that the proposed data-driven PRC algorithm exhibits satisfactory tracking performance.

Author Contributions

Methodology, X.-L.L.; Software, Q.-L.W.; Formal analysis, X.-L.L.; Data curation, Q.-L.W. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed at the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

References

Hara, S.; Yamamoto, Y.; Omata, T.; Nakano, M. Repetitive control system: A new type servo system for periodic exogenous signals. IEEE Trans. Autom. Control 1988, 33, 659–668. [Google Scholar] [CrossRef]
Hać, A. Optimal linear preview control of active vehicle suspension. Veh. Syst. Dyn. 1992, 21, 167–195. [Google Scholar] [CrossRef]
Nakamura, H. Preview repetitive control considering delays in manipulation and detection. J. Autom. Control 1994, 30, 871–873. [Google Scholar]
Egami, T. A design of low-degree optimal preview repetitive control system. Trans. Soc. Instrum. Control Eng. 1999, 35, 297–299. [Google Scholar] [CrossRef]
Tadashi, S.; Tadashi, E.; Takeshi, T. Design of discrete-time sliding-mode preview-repetitive control systems. Trans. Inst. Syst. Control Inf. Eng. 2005, 18, 312–321. [Google Scholar]
Lan, Y.H.; Xia, J.J.; Shi, Y.X. Robust guaranteed-cost preview repetitive control for polytopic uncertain discrete-time systems. Algorithms 2019, 12, 20. [Google Scholar] [CrossRef]
Lan, Y.H.; He, J.L.; Li, P.; She, J.H. Optimal preview repetitive control with application to permanent magnet synchronous motor drive system. J. Franklin Inst. 2020, 357, 10194–10210. [Google Scholar] [CrossRef]
Li, L. Observer-based preview repetitive control for uncertain discrete-time systems. Int. J. Robust Nonlinear Control 2021, 31, 1103–1121. [Google Scholar] [CrossRef]
Li, L.; Lu, Y. Preview repetitive control for linear discrete-time periodic systems. Int. J. Adapt. Control Signal Process. 2022, 36, 412–428. [Google Scholar] [CrossRef]
Lan, Y.H.; Zhao, J.Y. Design of a preview repetitive control with equivalent-input-disturbance system based on a continuous-discrete 2D model. J. Franklin Inst. 2023, 360, 1884–1903. [Google Scholar] [CrossRef]
Lan, Y.H.; Zhao, J.Y. Improving track performance by combining padé-approximation-based preview repetitive control and equivalent-input-disturbance. J. Electr. Eng. Technol. 2024, 19, 3781–3794. [Google Scholar] [CrossRef]
Li, L.; Lu, Y.; Meng, X. Preview repetitive control for discrete-time linear parameter-varying systems with a time-varying delay. Trans. Inst. Meas. Control 2025, 01423312241300756. [Google Scholar] [CrossRef]
Rguigui, H.; Elghribi, M. Practical stabilization for a class of tempered fractional-order nonlinear fuzzy systems. Asian J. Control 2025. [Google Scholar] [CrossRef]
Chen, Y.; Wang, H.; Wang, X. Precision fixed-time formation control for multi-AUV systems with full state constraints. Mathematics 2025, 13, 1451. [Google Scholar] [CrossRef]
Han, K.Z.; Feng, J. Data driven robust fault tolerant linear quadratic preview control of discrete-time linear systems with completely unknown dynamics. Int. J. Control 2019, 94, 49–59. [Google Scholar] [CrossRef]
Huang, J.W.; Gao, J.W. How could data integrate with control? A review on data-based control strategy. Int. J. Dyn. Control 2020, 8, 1189–1199. [Google Scholar] [CrossRef]
Liu, Z.Y.; Wu, H.N. Data-driven optimal preview output tracking of linear discrete-time systems. In Proceedings of the 38th Chinese Control Conference, Guangzhou, China, 27–30 July 2019; pp. 1973–1978. [Google Scholar]
Baggio, G.; Bassett, D.S.; Pasqualetti, F. Data-driven control of complex networks. Nat. Commun. 2021, 12, 1429. [Google Scholar] [CrossRef]
Gao, C.; Wang, Z.; He, X.; Yue, D. Sampled-data-based fault-tolerant consensus control for multi-agent systems: A data privacy preserving scheme. Automatica 2021, 133, 109847. [Google Scholar] [CrossRef]
Wang, R.; Parunandi, K.S.; Yu, D.; Kalathil, D.; Chakravorty, S. Decoupled data-based approach for learning to control nonlinear dynamical Systems. IEEE Trans. Autom. Control 2022, 67, 3582–3589. [Google Scholar] [CrossRef]
Geng, Y.; Ruan, X.; Yang, X.; Zhou, Q. Data-based iterative learning control for multiphase batch processes. Asian J. Control 2023, 25, 1392–1406. [Google Scholar] [CrossRef]
Yan, Y.; Bao, J.; Huang, B. An approach to data-based linear quadratic optimal control. IEEE Control Syst. Lett. 2024, 8, 1120–1125. [Google Scholar] [CrossRef]
Xie, K.; Zheng, Y.; Jiang, Y.; Lan, W.; Yu, X. Optimal dynamic output feedback control of unknown linear continuous-time systems by adaptive dynamic programming. Automatica 2024, 163, 111601. [Google Scholar] [CrossRef]
Wang, L.; Liu, W.; Li, Y.; Sun, J.; Wang, G. Data-based online linear quadratic gaussian control from noisy data. Int. J. Robust Nonlinear Control 2025. [Google Scholar] [CrossRef]
Dao, P.N.; Dao, Q.H. H_∞ tracking control for perturbed discrete-time systems using On/Off policy Q-learning algorithms. Chaos Solitons Fractals 2025, 197, 116459. [Google Scholar] [CrossRef]
Peyman, A.; Aref, S.; Mehdi, R. Data-based H_∞ optimal tracking control of completely unknown linear systems under input constraints. IET Control Theory Appl. 2025, 19, 70022. [Google Scholar]
Yu, T.; Zhao, Y.; Wang, J.; Liu, J. Event-triggered sliding mode control for switched genetic regulatory networks with persistent dwell time. Nonlinear Anal. Hybrid Syst. 2022, 44, 101135. [Google Scholar] [CrossRef]
Kiumarsi, B.; Lewis, F.L.; Naghibi-Sistani, M.B.; Karimpour, A. Optimal tracking control of unknown discrete-time linear systems using input-output measured data. IEEE Trans. Cybern. 2015, 45, 2770–2779. [Google Scholar] [CrossRef] [PubMed]
Song, S.; Zhu, M.; Dai, X.; Gong, D. Model-free optimal tracking control of nonlinear input-affine discrete-time systems via an iterative deterministic Q-learning algorithm. IEEE Trans. Neural Netw. Learn. Syst. 2022, 35, 999–1012. [Google Scholar] [CrossRef]

Figure 1. Configuration of data-driven PRC system.

Figure 2. The values of

∥ K^{*} - K ∥

.

Figure 2. The values of

∥ K^{*} - K ∥

.

Figure 3. The system output and the reference signal.

Figure 4. The tracking error.

Figure 5. The tracking errors under different controllers.

Table 1. The comparison of SSE with different controller.

Controller	SSE	SD of SSE
Q-learning method [24]	1.9875	0.0354
DD-PRC	1.5846	0.0338

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Li, X.-L.; Wu, Q.-L. Data-Driven Optimal Preview Repetitive Control of Linear Discrete-Time Systems. Mathematics 2025, 13, 3501. https://doi.org/10.3390/math13213501

AMA Style

Li X-L, Wu Q-L. Data-Driven Optimal Preview Repetitive Control of Linear Discrete-Time Systems. Mathematics. 2025; 13(21):3501. https://doi.org/10.3390/math13213501

Chicago/Turabian Style

Li, Xiang-Lai, and Qiu-Lin Wu. 2025. "Data-Driven Optimal Preview Repetitive Control of Linear Discrete-Time Systems" Mathematics 13, no. 21: 3501. https://doi.org/10.3390/math13213501

APA Style

Li, X.-L., & Wu, Q.-L. (2025). Data-Driven Optimal Preview Repetitive Control of Linear Discrete-Time Systems. Mathematics, 13(21), 3501. https://doi.org/10.3390/math13213501

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Menu

Data-Driven Optimal Preview Repetitive Control of Linear Discrete-Time Systems

Abstract

1. Introduction

2. Problem Formulation

3. Data-Driven PRC Law Design

4. Simulation Results

5. Conclusions

Author Contributions

Funding

Data Availability Statement

Conflicts of Interest

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI