1. Introduction
Multivariable systems are commonly referred to as Multi-Input and Multi-Output (MIMO) systems. In the field of identification, methods for univariate systems have matured, such as the least squares (LS) method [1,2], the stochastic gradient method [3,4], and the maximum-likelihood method [5,6]. With the development of control theory and the needs of engineering practice, most industrial plants are described by multivariable systems with complex structures, nonlinearities, incomplete information, and uncertain disturbances [7,8,9,10]. Therefore, the study of multivariable systems is more valuable than that of univariate systems [11,12].
There are two main models for describing MIMO systems: the state space equation (SSE) model and the transfer function matrix (TFM) model [13,14]. Over the years, scholars have proposed different parameter identification methods for these two models.
When the SSE model has a measurable state, the controlled model can be converted into a regression model, to which conventional identification algorithms can be applied; e.g., the iterative identification algorithm derived in Ref. [15] for SSE models with unknown parameters and a measurable state can effectively identify the parameters of the system. Although the ultimate goal of Ref. [16] is to identify the TFM model, it first identifies the SSE model and then exploits the relationship between the SSE model and the TFM model. That paper first used the subspace method to estimate the state space matrices (A, B, C, D); the covariance expression of the input-related matrices (B, D) was then derived based on the first-order perturbation method, and the covariance of the transfer function was calculated by combining the covariance of (B, D) with that of (A, C). Subspace identification methods have developed particularly rapidly in recent decades; when the state of the SSE model is not measurable, these methods identify the linear time-invariant state-space model directly from the input and output data [17]. A novel system identification method based on invariant subspace theory was proposed in Ref. [18], aiming to solve the identification of continuous-time linear time-invariant systems by combining time-domain and frequency-domain methods.
When the TFM is used to describe the system model for parameter identification, the MIMO system is usually decoupled before identification [19,20,21,22]. Liu derived a coupled stochastic gradient algorithm for multivariable systems, which decomposes the system model into multiple single-output subsystems for parameter identification; although this method has high accuracy, it does not match realistic industrial production models [19]. Ref. [23] presented a dynamic model of a plate-fin heat exchanger, derived the calculation formula of the time constant from the heat exchanger mechanism model using the Laplace transform, and put forward an identification method based on the relationship between heat exchanger efficiency and heat transfer thermal resistance. Ref. [24] avoided matrix inversion by decomposing the multivariable CAR-like system into two subsystems and deriving a joint stochastic gradient and least squares identification algorithm to estimate the system parameters.
Traditional identification methods consider only empirical risk and are therefore prone to overfitting. In 1995, Vapnik et al. proposed the support vector machine (SVM) based on the Vapnik–Chervonenkis (VC) theory of statistical learning and the principle of structural risk minimization [25]. SVM has a rigorous theoretical and mathematical foundation, does not easily fall into local optima or the curse of dimensionality, depends only weakly on the number of samples, and has strong generalization ability [26].
SVM was first used solely for classification problems [27,28,29,30]; however, as research has progressed, the SVM algorithm has been refined and enhanced, and its use has been extended to regression problems.
Huang and colleagues used SVM and the least squares support vector machine (LS-SVM) to identify the difference equation models of linear and nonlinear systems [31]. The findings indicate that the SVM and LS-SVM techniques outperform neural networks in terms of generalization capacity and identification accuracy. The LS-SVM algorithm is better suited for dynamic system identification since it is faster and more resilient to noise than SVM. Castro-Garcia et al. presented a technique that uses the regularization capability of LS-SVM to identify MIMO Hammerstein systems; the identification process requires only a few basic assumptions and performs well in the presence of Gaussian white noise [32].
Parameter identification is more challenging for systems with colored noise than for systems with white noise disturbances [33,34,35]. Systems with colored noise, however, are more consistent with real-world production processes [36,37].
Zhao divided a system disturbed by colored noise into a system part and a noise part and estimated the parameters of the two parts separately [38]. However, specific noise levels cannot be observed in real industrial activities, so the method does not apply to on-site production. Xu et al. used a filtering idea to convert the colored noise system into two models containing white noise [39]; this method directly changes the system structure and also generates new noise during the conversion. Zhang et al. considered parameter identification in the presence of additive noise and used prior knowledge of the instantaneous mixing matrix in the independent component analysis (ICA) model to accurately estimate the additive noise at the system output [40].
Real-time parameter estimation is of great significance in modern engineering and control systems: it not only improves the performance and robustness of the system but also supports fault detection, optimizes the production process, reduces maintenance costs, and enhances system safety. To realize real-time parameter estimation, the time taken by each estimation run must not be too long, and parameter optimization can be used to reduce the estimation time. Parameter optimization has a very wide range of applications, including hyperparameter optimization [41], process optimization [42,43], and so on.
The parameter estimation problem for multivariate systems under the interference of outliers and colored noise is considered in this paper. The main contributions are summarized as follows.
(1) To solve the problem of estimating the parameters of the TFM, the parameters of the equivalent difference equation model were first estimated, and the TFM parameters were then derived from them;
(2) In this paper, a method was proposed to deal with the parameter identification problem effectively using support vector regression (SVR). The method can well solve the problem of parameter estimation under the interference of outliers and colored noise;
(3) To address the issue of long parameter search time in the abovementioned method, a new SVR method combining stochastic search and Bayesian optimization was proposed;
(4) A two-input single-output simulation system was used to verify the effectiveness and accuracy of the algorithm. In addition, a water tank system was constructed for system identification to verify the effectiveness of the algorithm in real systems.
The remainder of this article is organized as follows.
Section 2 introduces the related knowledge of SVR, analyzes the anti-interference property of SVR, and introduces the parameter estimation algorithm based on SVR in detail.
Section 3 introduces the SVR algorithm combining random search and Bayesian optimization.
Section 4 verifies the effectiveness of the algorithm through a dual-input, single-output simulation system, and verifies the usability of the algorithm in real systems through data from actual water tank systems.
Section 5 contains the concluding remarks.
2. Parameter Estimation for MIMO Systems Under Noise Interference
2.1. Support Vector Regression Algorithm
There are $m$ sets of input and output data samples $(\mathbf{x}_i, y_i)$, $i = 1, \dots, m$, where $\mathbf{x}_i \in \mathbb{R}^n$ is the input vector of the $i$th data sample, expressed as $\mathbf{x}_i = [x_{i1}, x_{i2}, \dots, x_{in}]^{\mathrm{T}}$, and $y_i$ is the corresponding output.
The established linear regression function is shown in Equation (1):
\[
f(\mathbf{x}) = \langle \mathbf{w}, \mathbf{x} \rangle + b, \tag{1}
\]
where $\langle \cdot, \cdot \rangle$ denotes the dot product in $\mathbb{R}^n$.
We construct an SVR regression problem for each output $y_j$ of the system.
The SVR algorithm is obtained by introducing an insensitive loss function on the basis of SVM classification. The basic idea is to find an optimal plane that minimizes the deviation of all training samples from that plane. Currently, the most widely used SVR model is $\varepsilon$-SVR, which introduces a linear $\varepsilon$-insensitive loss function (as shown in Equation (2)).
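For reference, the standard linear $\varepsilon$-insensitive loss, which we take Equation (2) to denote, is

```latex
L_\varepsilon\bigl(y, f(\mathbf{x})\bigr) =
\begin{cases}
0, & \lvert y - f(\mathbf{x}) \rvert \le \varepsilon, \\
\lvert y - f(\mathbf{x}) \rvert - \varepsilon, & \text{otherwise}.
\end{cases}
```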
The goal of $\varepsilon$-SVR is to find a function $f(\mathbf{x})$ that deviates from the actually obtained targets $y_i$ by at most $\varepsilon$ for all training data while being as flat as possible [44]. The flatness of Equation (1) can be achieved by minimizing the norm, i.e., $\frac{1}{2}\lVert \mathbf{w} \rVert^2$ [45]. This is shown in Equation (3). The principle of SVR is shown in Figure 1.
The biggest difference between SVR and traditional regression algorithms is that SVR tolerates a certain amount of fitting error.
In practice, setting $\varepsilon$ too small cannot guarantee that all sample points lie within the $\varepsilon$-tube, while setting it too large causes the regression hyperplane to be affected by outliers. Slack variables $\xi_i$ and $\xi_i^*$ are therefore introduced to describe the degree to which sample points deviate from the $\varepsilon$-tube, so that both empirical and structural risks are taken into account.
The objective function is updated as Equation (4), where $C$ is the penalty factor, a user-set parameter.
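In the standard soft-margin form, which we take Equation (4) to denote, the problem reads

```latex
\min_{\mathbf{w},\,b,\,\xi,\,\xi^*}\;
\frac{1}{2}\lVert \mathbf{w} \rVert^2 + C\sum_{i=1}^{m}\bigl(\xi_i + \xi_i^*\bigr)
\quad \text{s.t.}\;
\begin{cases}
y_i - \langle \mathbf{w}, \mathbf{x}_i \rangle - b \le \varepsilon + \xi_i,\\
\langle \mathbf{w}, \mathbf{x}_i \rangle + b - y_i \le \varepsilon + \xi_i^*,\\
\xi_i,\;\xi_i^* \ge 0,\quad i = 1, \dots, m.
\end{cases}
```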
Based on the constraints, Lagrange multipliers $\alpha_i, \alpha_i^*, \mu_i, \mu_i^*$ are introduced, and constructing the Lagrange function (as in Equation (5)) transforms the optimization problem into a dual problem. The stationarity conditions are obtained by taking the partial derivatives of the Lagrangian with respect to $\mathbf{w}$, $b$, $\xi_i$, and $\xi_i^*$, respectively, and setting the results to zero.
The dual problem of the original problem is then obtained.
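A minimal statement of this dual in its standard form (our reconstruction from the usual $\varepsilon$-SVR derivation, with the notation above):

```latex
\max_{\alpha,\,\alpha^*}\;
-\frac{1}{2}\sum_{i,j=1}^{m}(\alpha_i-\alpha_i^*)(\alpha_j-\alpha_j^*)
 \langle \mathbf{x}_i, \mathbf{x}_j \rangle
-\varepsilon\sum_{i=1}^{m}(\alpha_i+\alpha_i^*)
+\sum_{i=1}^{m} y_i(\alpha_i-\alpha_i^*)
\quad \text{s.t.}\;
\sum_{i=1}^{m}(\alpha_i-\alpha_i^*)=0,\quad
0 \le \alpha_i,\,\alpha_i^* \le C.
```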
A kernel function is a very important concept in SVR algorithms; it maps the input feature vectors into a high-dimensional space for computation. Commonly used kernel functions include the linear kernel, the polynomial kernel, the radial basis function (RBF) kernel, and the sigmoid kernel. Different kernel functions are suited to different problems, and choosing an appropriate kernel can improve the performance of SVR. The specific calculation formulas are shown in Table 1.
Support vector machines with kernel functions have a structure similar to that of neural networks. Since we are studying a linear MIMO system, we use the linear kernel function.
Solving Equation (5) yields the optimal solution $(\alpha_i, \alpha_i^*)$, and the regression function is shown in Equation (8):
\[
f(\mathbf{x}) = \sum_{i=1}^{n_{sv}} (\alpha_i - \alpha_i^*)\,\langle \mathbf{x}_i, \mathbf{x} \rangle + b, \tag{8}
\]
where $n_{sv}$ represents the number of support vectors.
Currently, there are many toolkits for SVM, such as LIBLINEAR, mySVM, and LIBSVM. Among them, LIBSVM (Version 3.35) is an open-source SVM package developed by the team of Prof. Chih-Jen Lin at National Taiwan University; it provides a MATLAB interface and is easy to use. In this paper, this package was used for data training and prediction.
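To make the workflow concrete, a minimal Python sketch is given below. It is not the paper's MATLAB/LIBSVM script, but scikit-learn's SVR wraps the same LIBSVM solver; the data and the parameter vector theta_true are synthetic placeholders.

```python
# Minimal sketch: fit a linear epsilon-SVR to synthetic regression data.
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(200, 3))              # regressor matrix
theta_true = np.array([0.8, -0.5, 0.3])                # hypothetical true parameters
y = X @ theta_true + 0.01 * rng.standard_normal(200)   # noisy output samples

model = SVR(kernel="linear", C=10.0, epsilon=0.01)     # linear kernel, per Section 2.1
model.fit(X, y)
print(model.coef_)   # estimated parameter vector; close to theta_true
```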
2.2. Anti-Interference Analysis of SVR
Observations or values that deviate significantly from the other data points in a data set are called outliers. Numerous factors, such as measurement errors, data entry problems, natural variability, and atypical events, can lead to outliers.
Colored noise is a noise signal with an uneven frequency–energy distribution. In contrast to Gaussian white noise, whose energy is uniformly distributed over the frequency spectrum, colored noise exhibits a frequency dependence or bias.
White noise and outliers are independent at different points in time, whereas colored noise is correlated in time. In addition, colored noise signals are often mixed with useful signals and are difficult to distinguish.
Due to the principle and structural properties of SVR, parameter identification using SVR offers high interference resistance, for the following reasons:
(1) Support Vector Regression employs the principle of structural risk minimization, which aims to minimize the empirical error and the confidence range, i.e., to minimize both the training error and the model complexity. In this way, SVR is able to find a balance between reducing model complexity and improving fitting accuracy, thus enhancing the generalization ability of the model.
Specifically, the goal of SVR is to optimize the model by minimizing structural risk, which involves not only minimizing the training error but also keeping the model complexity manageable. The L2 regularization term plays a key role here: by restricting the L2 norm of the parameter vector $\mathbf{w}$, it ensures that the model parameters do not become too large, which controls model complexity and prevents the model from overfitting the noise in the training data. By compressing the parameter values, L2 regularization reduces the model's sensitivity to the input features and thus improves its generalization ability. Noise is usually random and irregular, and an overly complex model may capture these noise signals, leading to overfitting; L2 regularization makes the model smoother and avoids overreacting to chance noise patterns in the data;
(2) The generalization performance of SVR depends heavily on the appropriate choice of the parameter C, which controls the weight of the empirical risk within the structural risk. To find the optimal value of C, a grid search method can be used; the specific grid is elaborated in Section 2.3;
(3) SVR improves the robustness and immunity of the model by maximizing the distance between the support vector and the regression hyperplane. Maximizing this spacing means that the model pays more attention to trends in the overall data distribution, rather than being overly sensitive to small changes (i.e., noise) in individual sample points. This strategy allows SVR to better handle training data with noise and avoid overfitting the model by fitting noise;
(4) Using an insensitive loss function allows a tolerance band of width $\varepsilon$ to exist between the regression predictions and the true values, and the loss function does not count prediction errors within that band. This means that the model is insensitive to small noise, as small perturbations do not lead to additional penalties.
2.3. SVR-Based Parameter Estimation for MIMO Systems
A linear multivariate system is shown in Figure 2. The state space equation description of a system is not unique, and when the state of the SSE model is not measurable, not only the parameters but also the state of the model must be estimated, which complicates the identification; it is therefore more advantageous to describe the system to be identified with the transfer function matrix model.
The TFM model of the MIMO system is described as shown in Equation (9).
The parameter set K described in Equation (10) and the parameter set T described in Equation (11) are the sets of parameters to be identified.
It is first necessary to discretize the continuous model (9) of the multivariate system into an impulse TFM model, as shown in Equation (12). In this paper, the discretization was carried out using the bilinear transformation method ($s = \frac{2}{T_s}\cdot\frac{z-1}{z+1}$, with $T_s$ the sampling time).
Regarding the above equation, the specific description of each element of the impulse TFM is shown in Equation (13), where $n_b$ is the input signal order and $n_a$ is the output signal order.
Extracting the least common multiple of the denominators in each row of the impulse TFM leads to Equation (14), and the impulse TFM model is then converted into a difference equation model (Equation (15)).
The difference equation model parameters (Equation (16)) are first estimated, and the parameters K and T are then recovered by inverting the discretization.
Specifically, after the difference equation model has been determined, K and T are solved inversely according to the formula of the bilinear transformation method; this can be achieved using the d2c function in MATLAB.
For the MIMO system, we split the system into multiple Multi-Input and Single-Output (MISO) systems based on each output and study each MISO system.
Take as an example a two-input, two-output system (Equation (17)); it is discretized into a difference equation, described by Equation (18).
The inverse solution for K and T is performed according to the formula of the bilinear transformation method, the expression of which is shown in Equation (19), where $T_s$ is the system sampling time.
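As a one-element illustration of this inversion (a worked example under our notation, not the paper's Equation (19)), consider a first-order term $G(s) = K/(Ts+1)$ sampled with period $T_s$:

```latex
% Substituting s = (2/T_s)(z-1)/(z+1) into G(s) = K/(Ts+1) gives
G(z)=\frac{b\,(z+1)}{z+a},\qquad
a=\frac{T_s-2T}{T_s+2T},\quad
b=\frac{K\,T_s}{T_s+2T},
% so the identified a and b are inverted to recover the continuous parameters:
T=\frac{T_s}{2}\cdot\frac{1-a}{1+a},\qquad
K=\frac{2b}{1+a}.
```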
Parameter estimation for MIMO systems can be viewed as finding the relationship between the input vector X and the multiple outputs $y_1, \dots, y_p$. For each output $y_j$, an SVR model is built to describe its relationship with the input vector X.
For each output $y_j$, we define an independent SVR optimization problem, analogous to Equation (4), in which $\mathbf{w}_j$ is the weight vector of the SVR regression problem corresponding to output signal $y_j$.
Each SVR model is then trained individually on its training dataset.
After collecting the input and output signals, the dataset is constructed according to the data requirements of LIBSVM, consisting of a label dataset and a data dataset; for the regression problem, the label dataset holds the dependent variable (the output to be predicted) and the data dataset holds the independent variables (the regressors). Parameter identification of multivariate systems based on SVR requires the construction of datasets as shown in Equations (22) and (23).
The dataset was divided into training and test sets according to a certain ratio. The linear kernel function was selected based on the characteristics of the linear multivariate system, and the parameter C was then optimized using the grid search method to achieve the best prediction result.
When the kernel function is linear, the trained support vector weights and support vectors can be used to recover the model parameters, as shown in Equation (24).
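A sketch of this training and recovery step is given below, again in scikit-learn rather than the paper's MATLAB toolchain. Here X_train and y_train stand for the datasets of Equations (22) and (23); for a linear kernel, $\mathbf{w} = \sum_i (\alpha_i - \alpha_i^*)\mathbf{x}_i$, which we take to be what Equation (24) expresses.

```python
# Sketch: grid search over C, then recover the difference-equation parameters
# from the dual coefficients (alpha_i - alpha_i^*) and the support vectors.
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVR

grid = GridSearchCV(
    SVR(kernel="linear", epsilon=0.01),
    param_grid={"C": [0.1, 1, 10, 100, 1000]},   # candidate penalty factors
    cv=5,
)
grid.fit(X_train, y_train)                        # datasets of Eqs. (22)-(23), assumed given
best = grid.best_estimator_
w = best.dual_coef_ @ best.support_vectors_       # w = sum_i (a_i - a_i*) x_i
print(w, best.intercept_)                         # recovered model parameters
```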
The specific process of SVR-based parameter identification is shown in
Figure 3.
2.4. Unbiasedness and Error Convergence
For traditional estimation algorithms, such as the LS method, the most important properties of the method are unbiasedness and convergence of the error variance to zero.
SVR-based parameter estimation methods do not directly optimize for or guarantee unbiasedness; they are instead concerned with finding a model that strikes a balance between model complexity and prediction error. This bias helps reduce overfitting and improves the generalization ability of the model. Therefore, for the problem considered in this paper, parameter estimation under noise interference, SVR-based algorithms can achieve better results than traditional parameter estimation methods.
Whether the error variance of SVR-based parameter estimation methods converges to zero depends on a number of factors, including the choice of model, the complexity of the feature space, the nature of the dataset, and the setting of the regularization parameters. Since the goal of SVR is to jointly minimize model complexity and prediction error, complete convergence of the error variance to zero cannot be achieved.
Although increasing the complexity of the feature mapping can improve the accuracy of the parameter estimation, the model may then be overfitted. For the data studied in this paper, i.e., data containing noise, overfitting would cause the parameter estimates to be heavily influenced by the noise.
3. SVR Algorithm Using Stochastic Search and Bayesian Optimization
Stochastic search is a global search method that does not rely on gradient information, and it searches for optimal solutions by randomly sampling in the solution space. Unlike deterministic search algorithms, stochastic search does not rely on a priori knowledge and is insensitive to the choice of initial points, and thus exhibits unique robustness in dealing with discontinuous and nonconvex optimization problems. The core of the stochastic search algorithm is to utilize randomness to traverse the solution space, and it does not use any directed search strategy, which means that the decision at each step of the algorithm is independent of the previous step.
Its main process is as follows:
(1) Define the parameter space. Define the parameter space to be optimized according to the specific problem;
(2) Initial sampling. Randomly sample a set of initial parameter values from the parameter space according to a certain sampling method, such as uniform distribution, Gaussian distribution, and so on;
(3) Evaluation of objective function. Design the objective function according to the specific problem, and then calculate the corresponding objective function value for each set of parameter values;
(4) Update the best parameters. Record the best combination of parameters found and their corresponding objective function values;
(5) Iterative search. Repeat steps 2 to 4 until a predetermined number of iterations is reached or the stopping condition is met.
Compared with grid search, random search does not need to search for all parameter combinations, which greatly reduces the number of searches and time. Especially when the parameter space is very large, stochastic search can find the near-optimal parameter combinations with less computational resources and time cost.
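A minimal sketch of steps (1) to (5) for the single hyperparameter C is given below; the log-uniform sampling law, the search range, and the cross-validation objective are our assumptions, and X_train and y_train are assumed available from Section 2.3.

```python
# Minimal random-search sketch over the SVR penalty factor C.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVR

rng = np.random.default_rng(0)                     # step 1: space is C in [1e-2, 1e4]
best_C, best_score = None, -np.inf
for _ in range(30):                                # step 5: fixed iteration budget
    C = 10.0 ** rng.uniform(-2, 4)                 # step 2: random sample from the space
    score = cross_val_score(SVR(kernel="linear", C=C),
                            X_train, y_train, cv=5).mean()   # step 3: objective
    if score > best_score:                         # step 4: record the best so far
        best_C, best_score = C, score
print(best_C)
```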
Bayesian optimization is a global optimization method based on a probabilistic model, which searches for the optimal hyperparameters by constructing a probabilistic model of the objective function.
The core of Bayesian optimization lies in the use of a probabilistic model to model a black-box function and the use of this model to guide the process of finding the optimal solution. Probabilistic models use prior knowledge and observations to update the posterior distribution as a means of predicting the value of a function at points that have not yet been observed. The first step in constructing the model is to determine the prior distribution, which is usually based on assumptions about the prior knowledge of the problem.
The probabilistic model is then updated according to Bayes' rule as new observations are obtained. Constructing a probabilistic model requires selecting an appropriate family of probability distributions; a Gaussian process (GP) is commonly used.
Its specific implementation process is as follows:
(1) Initial sampling. Randomly select some sampling points and calculate the corresponding objective function values;
(2) Construct a surrogate model. Build a surrogate model of the objective function from the initial samples using a Gaussian process;
(3) Optimize the surrogate model. Use the surrogate model to predict the objective function values and uncertainties of new parameter points. Calculate the acquisition function based on these predictions to balance the trade-off between exploration and exploitation: exploration selects points with higher uncertainty to discover new optima, while exploitation selects points with better expected objective function values to improve the optimal solution using existing information;
(4) Update the surrogate model. Select the next parameter point under the guidance of the acquisition function and calculate its objective function value. Add the new data point to the existing data set to update the surrogate model;
(5) Iterative repetition. Repeat steps 3 and 4 to gradually reduce the parameter space and find the optimal parameters. The iterative process ends when the preset number of iterations or convergence conditions are reached.
Bayesian optimization predicts which parameter combinations are likely to yield the best performance by constructing a probabilistic model of the objective function and then selects the parameters to be evaluated next based on this predictive model. This approach is more efficient than random and grid searches and can find near-optimal parameters with fewer evaluations, especially when objective function evaluations are costly. In addition, Bayesian optimization is an iterative optimization process in which each step uses the information from the previous steps to improve the next search. This means that as the search proceeds, the optimization process becomes progressively more efficient and precise and is able to focus on the regions likely to contain the optimal parameters.
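A sketch of this loop using scikit-optimize's gp_minimize as the GP surrogate and acquisition machinery is shown below; the library and the expected-improvement acquisition are our choices, not prescribed by the paper, and X_train and y_train are again assumed available.

```python
# Sketch: Bayesian optimization of the SVR penalty factor C with a GP surrogate.
from skopt import gp_minimize
from skopt.space import Real
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVR

def objective(params):
    (C,) = params
    # negated cross-validation score, since gp_minimize minimizes
    return -cross_val_score(SVR(kernel="linear", C=C),
                            X_train, y_train, cv=5).mean()

result = gp_minimize(
    objective,
    [Real(1e-2, 1e4, prior="log-uniform", name="C")],  # parameter space
    n_calls=25,           # total objective evaluations
    acq_func="EI",        # expected improvement balances steps (3)-(4)
    random_state=0,
)
print(result.x)           # best C found
```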
When the search space is large and the problem complexity is high, stochastic search covers the solution space well but lacks direction and efficiency. Bayesian optimization, on the other hand, provides directed and highly efficient search, but it requires strong prior knowledge of the problem and performs erratically in high-dimensional spaces. Combining the two aims to exploit the global search capability of stochastic search and the local search efficiency of Bayesian optimization.
The specific steps of the algorithm for combining the two are as follows:
(1) Define the parameter space. Define the range of values and distribution of each hyperparameter;
(2) Initial exploration using random search. Determine the number of initial searches according to the specific problem and computational resources, and collect the initial data points after a certain number of random searches;
(3) Construct a Bayesian optimization surrogate model. Build the surrogate model of the objective function from the initial data points collected in step 2;
(4) Optimize the surrogate model. Use the surrogate model to predict the objective function values and uncertainties for new parameter points. Calculate the acquisition function based on these predictions to balance the trade-off between exploration and exploitation: exploration selects points with higher uncertainty to discover new optima, while exploitation selects points with better expected objective function values to improve the optimal solution using existing information;
(5) Update the surrogate model. Select the next parameter point under the guidance of the acquisition function and calculate its objective function value. Add the new data point to the existing data set to update the surrogate model;
(6) Iterative repetition. Repeat steps 4 and 5 to gradually reduce the parameter space and find the optimal parameters. The iterative process ends when the preset number of iterations or convergence conditions are reached.
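Under the same assumptions as before, the combination can be sketched by seeding gp_minimize with the random-search evaluations through its x0/y0 arguments; this is our mapping of steps (2) and (3) onto scikit-optimize, and `objective` is reused from the previous sketch.

```python
# Sketch: random-search evaluations seed the GP surrogate (steps 2-3),
# then acquisition-guided iterations refine the search (steps 4-6).
import numpy as np
from skopt import gp_minimize
from skopt.space import Real

rng = np.random.default_rng(0)
x0 = [[10.0 ** rng.uniform(-2, 4)] for _ in range(10)]  # step 2: random exploration
y0 = [objective(x) for x in x0]                         # their objective values

result = gp_minimize(
    objective,
    [Real(1e-2, 1e4, prior="log-uniform", name="C")],
    x0=x0, y0=y0,         # step 3: seed the surrogate model with the random points
    n_calls=15,           # steps 4-6: further GP-guided evaluations
    acq_func="EI",
    random_state=0,
)
print(result.x)
```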
5. Conclusions
In this study, we investigated the performance of SVR and RSBO-SVR algorithms for parameter identification in multivariate systems, including challenging conditions such as common outliers and colored noise disturbances. The results showed that both the RSBO-SVR and SVR algorithms utilize the structural risk minimization principle and the L2 regularization term to effectively reduce the complexity of the model while maintaining robustness to noise disturbances. Specifically, the use of the maximum margin strategy and insensitive loss function enabled the model to effectively resist noise and outliers.
For parameter identification of multivariate systems, the RLS algorithm was completely ineffective when there were outliers in the collected input and output data, whereas the SVR and RSBO-SVR algorithms retained high identification accuracy under the same conditions. In addition, to solve the problem of the long optimization time of the SVR parameters, the RSBO-SVR algorithm combining random search and Bayesian optimization was proposed. When the system contained colored noise, RLS could identify the parameters, but the error was very large, and the identified system could not correctly reflect the system characteristics; in contrast, colored noise had much less of an effect on the SVR and RSBO-SVR algorithms. Compared with SVR, RSBO-SVR had a faster estimation speed.
In addition, the water tank experiments demonstrated that these algorithms can be applied to the identification of real systems: the maximum output error between the estimated model and the actual model was only 1.5% for the SVR algorithm and no more than 3% for the RSBO-SVR algorithm. Although the accuracy of RSBO-SVR is lower than that of SVR, it still meets the requirements of industrial production, and its estimation time was 99.38% shorter than that of SVR.