A Filtering-Based Stochastic Gradient Estimation Method for Multivariate Pseudo-Linear Systems Using the Partial Coupling Concept

Abstract: Solutions for enhancing parameter identification effects for multivariate equation-error systems in random interference and parameter coupling conditions are considered in this paper. To avoid the impact of colored noises on parameter identification precision, an appropriate filter is utilized to process the autoregressive moving average noise. Then, the filtered system is transformed into a number of sub-identification models based on the system output dimensions. Based on negative gradient search, a new multivariate filtering algorithm employing a partial coupling approach is proposed, and a conventional gradient algorithm is derived for comparison. With the partial coupling approach and the data filtering method, parameter identification for multivariate equation-error systems achieves high estimation accuracy and computational efficiency. Two simulations are performed to demonstrate the proposed method's effectiveness.


Introduction
The foundation of industrial automatic production and intelligent control is a precise model of the production processes [1]. With the expansion of production scale, multivariate systems have been widely used in production processes [2][3][4]. Parameter estimation for multivariate systems plays a considerable role in system identification and has attracted much attention from researchers in recent decades. Multivariate systems are more difficult to identify than scalar systems because they have more unknown parameters, are subject to more complex random interferences, and have coupled parameters in some channels [5][6][7]. Good identification results are often not achieved if identification approaches for scalar systems are applied to multivariate systems without modification. Some improved methods for the identification of multivariate systems have been studied recently [8][9][10]. For instance, Mari et al. combined the Schur restabilization technique and a covariance fitting algorithm to propose a parameter estimation method for finite dimensional multivariate linear stochastic systems [11]. Luo and Manikas proposed an iterative method and a nonlinear optimization algorithm for suppressing mutual target interference in multitarget parameter estimation [12]. Zhang et al. identified the parameters of a multivariate uncertain regression model with a maximum likelihood identification algorithm [13]. Oigard et al. studied an expectation maximization algorithm for heavy-tailed processes with a multivariate normal inverse Gaussian distribution, which identifies parameters quickly and accurately [14].
Although identification methods for multivariate systems have steadily grown in number, researchers continue to seek methods with higher identification efficiency and accuracy. In terms of improving estimation efficiency, in addition to the decomposition identification method, the coupling identification approach can also effectively reduce the amount of identification computation [15][16][17]. The basis of the coupling identification approach is to transform a multivariate system into several identification subsystems and then couple the identification results of the subsystems so that they are correlated [18][19][20]. There are some studies on coupling identification methods for multivariate systems. Ding studied parameter identification for non-uniformly sampled systems and proposed a partially coupled algorithm based on the stochastic gradient method. A simulation in that paper showed that the new algorithm requires less calculation than the standard stochastic gradient algorithm [21]. Inspired by the coupling concept, Zhou designed a nonlinear partially coupled parameter identification algorithm for multivariate radial basis function-based hybrid models, which reduces the amount of calculation by handling the associated items introduced by model decomposition [22]. Huang et al. provided a coupled probability representation of model coupling in feature-based image source identification, which improved the identification accuracy significantly [23]. Wang addressed parameter identification problems of nonlinear multivariate systems and developed a coupled gradient method by introducing the coupling idea, which realizes subsystem-coupled computation [24].
In parameter estimation, the data filtering approach improves the identification precision by modifying the structure of the system interference noise through an appropriate filter without changing the system's input-output relationship [25][26][27]. The data filtering approach has been applied to scalar system identification in some studies. Ji and Jiang utilized a data filter to process the collected data in order to reduce the effect of colored noise on identification precision for generalized time-varying systems [28]. Imani studied a maximum-likelihood parameter identification method for partially observed Boolean dynamical systems by using a Boolean Kalman filter [29]. Zhang developed a filtering hierarchical maximum likelihood iterative algorithm for nonlinear systems by applying the data filtering approach and the multi-innovation identification method, which obtains highly precise parameter estimates and tracks time-varying parameters well [30]. Chen et al. proposed a multi-step-length gradient iterative algorithm for ARX models with the application of a modified Kalman filter. The Kalman filter was designed to enhance the unmeasurable output estimates, which improved the parameter identification accuracy [31]. Li and Liu addressed parameter identification problems in bilinear systems and presented iterative methods with high estimation accuracy by utilizing the particle filtering approach [32].
The least squares estimation algorithm, gradient estimation algorithm, least mean square estimation algorithm, and stochastic approximation estimation algorithm are all classical methods in the field of system identification. The least squares method is a basic parameter estimation method that can be used for dynamic system identification as well as for static system parameter fitting [33,34]. The gradient identification method searches for parameter estimates along the negative gradient direction of the criterion function [35][36][37]. Compared with the least squares method, the gradient identification algorithm has lower computational complexity because it does not involve the covariance matrix. By extending the gradient identification algorithm and combining it with other methods, estimation algorithms with high identification performance can be obtained. Zhang and Ding proposed an optimal adaptive filtering algorithm for filter design by combining the data filtering approach with the gradient method [38]. Roman et al. derived a gradient descent method for identifying the parameters of a linear wave equation from experimental boundary data [39]. Chen et al. identified the parameters of time-delay rational state-space systems and presented two improved gradient descent algorithms by utilizing an intelligent search method and a momentum method, which had faster convergence and higher computational efficiency [40]. Kulikova studied adaptive filtering methods based on the gradient algorithm for identifying unknown parameters of pairwise linear Gaussian systems [41].
The data filtering approach can improve the parameter identification precision for multivariate systems by transforming colored noises with complex structures into white noises with simple structures [42][43][44]. At the same time, the coupling identification method can effectively speed up identification, and the gradient identification approach can quickly search for the optimal estimates [45][46][47]. Therefore, motivated by the advantages of these three methods, this paper combines the data filtering approach and the coupling identification method based on gradient search to identify the parameters of multivariate equation-error systems. The introduction of the data filtering approach overcomes the influence of colored noise on identification precision. The use of the coupling identification method reduces the computation of the identification algorithm. The main highlights of this paper are summarized as follows.
(1) A filter is used to transform the autoregressive moving average noise of multivariate pseudo-linear systems into white noise by applying the data filtering approach. The filtered system is converted into a number of subsystem identification models based on the system output dimensions according to the coupling identification method.
(2) A filtering-based multivariate gradient algorithm employing the partial coupling concept for multivariate pseudo-linear systems is proposed. Additionally, a conventional multivariate gradient algorithm is derived for comparison. The proposed algorithm has higher identification precision and higher computational efficiency than the conventional algorithm.
The structure of this paper is as follows. The multivariate pseudo-linear system is presented and the system identification obstacles are analyzed in Section 2. A new gradient algorithm based on the coupling identification approach and the data filtering method is proposed in Section 3. Section 4 derives a conventional multivariate gradient algorithm. Convergence of the proposed method is discussed in Section 5. In Section 6, two simulations are performed to demonstrate the effectiveness of the proposed methods. Finally, Section 7 provides some conclusions of this paper.

Problem Description
We first introduce some notation to keep the paper concise and clear. $I_m$ is an identity matrix of size $m \times m$. $\mathbf{1}_{m \times n}$ denotes a matrix of size $m \times n$ whose elements are all 1. The norm of a matrix $A$ is defined by $\|A\|^2 := \mathrm{tr}[AA^{\mathrm{T}}]$, and the superscript T represents the matrix/vector transpose. The symbol $\otimes$ stands for the Kronecker product. For a matrix $B$, the column-stacking operation gives a vector consisting of all columns of $B$ arranged in order. According to the type of colored noise, multivariate pseudo-linear systems can be divided into different types. In this paper, we consider systems where the noise is of the autoregressive moving average type. Such systems are widely present in industrial processes, and their structure is described by Equation (1), where $y(k) \in \mathbb{R}^m$ is the measurable system output vector, $\Phi(k) \in \mathbb{R}^{m \times n}$ is the system information matrix formed from the system input-output data, $\theta \in \mathbb{R}^n$ is the unknown system parameter vector to be identified, and $v(k)$ is the noise vector from which the colored noise is generated through the autoregressive moving average noise model. In general, suppose that the orders $m$, $n$, $n_c$, and $n_d$ of the system are known, and that $y(k) = 0$, $\Phi(k) = 0$, and $v(k) = 0$ for $k \leqslant 0$.
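The notation above can be checked numerically. The following is a minimal sketch in Python with NumPy; the helper names `col` and `sq_norm` and the example matrices are illustrative assumptions and do not come from the paper.

```python
import numpy as np

def col(B: np.ndarray) -> np.ndarray:
    """Stack the columns of B, in order, into a single vector."""
    return B.flatten(order="F")

def sq_norm(A: np.ndarray) -> float:
    """Squared matrix norm ||A||^2 := tr[A A^T]."""
    return float(np.trace(A @ A.T))

# Illustrative matrices only.
A = np.array([[1.0, 2.0], [3.0, 4.0]])
B = np.array([[0.0, 1.0], [1.0, 0.0]])

I2 = np.eye(2)            # identity matrix I_m with m = 2
ones = np.ones((2, 3))    # all-ones matrix of size m x n with m = 2, n = 3
kron_AB = np.kron(A, B)   # Kronecker product of A and B, size 4 x 4
col_A = col(A)            # columns of A stacked: [1, 3, 2, 4]
norm_A = sq_norm(A)       # 1 + 4 + 9 + 16 = 30
```

The trace-based norm coincides with the squared Frobenius norm, which is why the sum of squared entries appears in the last comment.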
Define the parameter vector $\eta$, the parameter matrix $\kappa$, the information vector $\psi(k)$, and the information matrix $\Psi(k)$. Based on Equation (2), the identification model of the system in (1) is represented as Equation (9). Uniting the information matrix $\Psi(k)$ with the information vector $\psi(k)$, and the parameter vector $\eta$ with the parameter matrix $\kappa$, gives the new information matrix $\Omega(k)$ and the new parameter vector $\vartheta$, with which Equation (9) is changed into Equation (12). The goal is to find effective identification methods to estimate the unknown parameters $\theta$, $c_i$, and $D_i$ contained in $\vartheta$. The analysis shows that if the identification model in (12) is used directly, superfluous calculations are generated in the parameter estimation process because the Kronecker product introduces a large number of zero elements into the information matrix $\Omega(k)$. To enhance the identification performance for the system in (1), a more efficient identification method is needed.
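The remark about zero elements can be illustrated with a small numerical sketch. It assumes, for illustration only, that the noise part of the information matrix has the Kronecker form $\psi^{\mathrm{T}}(k) \otimes I_m$, which follows from the identity $\kappa^{\mathrm{T}}\psi(k) = (\psi^{\mathrm{T}}(k) \otimes I_m)\,\mathrm{vec}(\kappa^{\mathrm{T}})$; the exact definition used in the paper may differ, and the dimensions and random values below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n_d = 3, 2                                  # hypothetical output dimension and noise order
psi = rng.standard_normal(m * n_d)             # stands in for the information vector psi(k)
kappa_T = rng.standard_normal((m, m * n_d))    # stands in for the parameter matrix kappa^T

# Kronecker-structured information matrix psi^T (x) I_m, size m x (m * m * n_d).
Psi = np.kron(psi[None, :], np.eye(m))

# Both expressions give the same m-vector.
direct = kappa_T @ psi
via_kron = Psi @ kappa_T.flatten(order="F")

# Fraction of entries of Psi that are exactly zero.
zero_fraction = float(np.mean(Psi == 0.0))
```

Under this assumed structure, a fraction $1 - 1/m$ of the entries is zero, so most multiplications in a direct implementation act on zeros. This is the kind of overhead that a decomposition into $m$ row-wise subsystems avoids.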

The Filtering-Based Multivariate Partially Coupled Gradient Algorithm
An analysis of System (1) shows that the presence of colored noise reduces the parameter identification precision. To overcome the adverse effects of the disturbances, the data filtering method is adopted to convert the colored noise into white noise. An appropriate choice is to take $c(z)$ as the filter for System (1). First, multiply both sides of Equation (1) by $c(z)$ to obtain Equation (13). Define the filtered output vector $y_f(k)$ and the filtered information matrix $\Phi_f(k)$; then, Equation (13) can be rewritten as Equation (14). Let $\kappa_i^{\mathrm{T}} \in \mathbb{R}^{1 \times m n_d}$ be the $i$th row of the parameter matrix $\kappa^{\mathrm{T}}$, $y_{fi}(k) \in \mathbb{R}$ be the $i$th element of the filtered output vector $y_f(k)$, and $\Phi_{fi}(k) \in \mathbb{R}^{1 \times n}$ be the $i$th row of the filtered information matrix $\Phi_f(k)$. Equation (14) is then transformed into $m$ sub-identification models, as described by Equation (16). In Equation (16), the vectors $\theta$ and $\psi(k)$ are common to all subsystems, which matches the characteristics of a partially coupled identification model. Next, a parameter estimation algorithm employing the partial coupling concept is derived in detail for the identification model in (16).
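The filtering step at the start of this section can be sketched as follows. The sketch assumes the common convention $c(z) = 1 + c_1 z^{-1} + \cdots + c_{n_c} z^{-n_c}$ with zero initial conditions; the function name `filter_with_c` and all numerical values are illustrative assumptions, and in the actual algorithm the coefficients $c_i$ are unknown and are replaced by their estimates.

```python
import numpy as np

def filter_with_c(x: np.ndarray, c: np.ndarray) -> np.ndarray:
    """Apply c(z) = 1 + c_1 z^-1 + ... + c_nc z^-nc along the time axis (axis 0).

    x[k] is the sample at time k (a vector or a matrix per time step);
    samples before the start of the record are taken as zero.
    """
    x = np.asarray(x, dtype=float)
    x_f = x.copy()
    for i, c_i in enumerate(c, start=1):
        x_f[i:] += c_i * x[:-i]      # adds c_i * x(k - i) for all k >= i
    return x_f

# Small deterministic example with one filter coefficient c_1 = 0.5.
y = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
y_f = filter_with_c(y, np.array([0.5]))

# Autoregressive noise w(k) = -0.5 w(k-1) + v(k) is mapped back to v(k).
rng = np.random.default_rng(1)
v = rng.standard_normal((50, 2))
w = np.zeros_like(v)
for k in range(50):
    w[k] = v[k] - (0.5 * w[k - 1] if k > 0 else 0.0)
recovered = filter_with_c(w, np.array([0.5]))
```

The second part shows why $c(z)$ is a natural filter choice: it cancels the autoregressive part of the noise, leaving only the moving average part to be handled by the sub-identification models.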
Define a gradient criterion function $J_1(\theta, \kappa_i)$ for the new identification model in (16). Minimizing $J_1(\theta, \kappa_i)$ by gradient search yields the gradient relationships in (17)-(20). However, the estimates $\hat{\theta}(k)$ and $\hat{\kappa}_i(k)$ in (17)-(20) cannot be computed because $y_{fi}(k)$, $\Phi_{fi}(k)$, and $\psi(k)$ involve the unknown terms $c_i$ and $v(k)$. With the parameter vector $\tau$ defined in Equation (22), $y_f(k)$ and $\Phi_f(k)$ are represented by Equations (23) and (24). However, $y_f(k)$ and $\Phi_f(k)$ still cannot be calculated because $c_i$ is unknown. The auxiliary model identification method [48] is a classical method for solving system identification problems with unmeasured variables. Its essential idea is to replace an unknown variable with the output of an auxiliary model, that is, with its estimate, whenever its true value cannot be obtained. Here, using the auxiliary model identification method and Equation (22), replacing $c_i$ with the estimates $\hat{c}_i(k)$ forms the estimate $\hat{\tau}(k)$. After that, replacing the unknown parameters $c_i$ and $\tau$ in (23) and (24) with the estimates $\hat{c}_i(k)$ and $\hat{\tau}(k)$ gives the estimates $\hat{y}_f(k)$ and $\hat{\Phi}_f(k)$. Meanwhile, according to Equation (5), the estimate $\hat{\psi}(k)$ is formed. Define the noise information matrix, with which Equation (7) can be rewritten.

Define an intermediate vector $w_n(k)$. Thus, the noise model is rewritten as Equation (29). To obtain the estimate $\hat{\tau}(k)$, define another gradient criterion function $J_2(\tau)$ for the noise model in Equation (29). Minimizing $J_2(\tau)$ by gradient search gives the corresponding gradient relationship. The parameter vector estimate $\hat{\tau}(k)$ cannot be calculated directly because $w_n(k)$ and $\chi(k)$ are unknown; replacing them with the estimates $\hat{w}_n(k)$ and $\hat{\chi}(k)$ solves this problem. According to Equation (8), substituting the estimate $\hat{\theta}(k)$ for the unknown term $\theta$ gives the estimate $\hat{w}(k)$. Similarly, according to Equation (14), replacing the unknown terms $\Phi_f(k)$, $\theta$, and $\kappa$ with their estimates $\hat{\Phi}_f(k)$, $\hat{\theta}(k)$, and $\hat{\kappa}(k)$ gives the estimate $\hat{v}(k)$.
The algorithm in (17)-(20) contains redundant estimates because $\theta$ is computed $m$ times. To reduce the excess computation, $\hat{\theta}_i$ is used instead of $\hat{\theta}$ in (17) and (19). Meanwhile, substituting the estimates $\hat{\Phi}_{fi}^{\mathrm{T}}(k)$, $\hat{y}_{fi}(k)$, and $\hat{\psi}(k)$ for the unknown terms $\Phi_{fi}^{\mathrm{T}}(k)$, $y_{fi}(k)$, and $\psi(k)$ yields the new algorithm in (36)-(39). In recursive algorithms, the parameter estimates approach the true values as the data length increases. Consequently, the estimate $\hat{\theta}_{i-1}(k)$ of the $(i-1)$th subsystem at time $k$ is generally closer to the true value $\theta$ than the estimate $\hat{\theta}_i(k-1)$ of the $i$th subsystem at time $k-1$. Accordingly, in Algorithm (36)-(39), substitute $\hat{\theta}_{i-1}(k)$ for $\hat{\theta}_i(k-1)$ on the right-hand side of Equation (36), and substitute $\hat{\theta}_m(k-1)$ for $\hat{\theta}_1(k-1)$ in Equation (36) when $i = 1$. To conclude, the filtering-based multivariate partially coupled generalized extended stochastic gradient (F-M-PC-GESG) algorithm is given in (40)-(58).
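The partial coupling idea described above, in which each subsystem starts from the most recent estimate of its predecessor, can be sketched in a deliberately simplified setting. The sketch below is not the F-M-PC-GESG algorithm: it assumes white noise, omits the filter and the noise parameters $\kappa_i$ and $\tau$, and uses the normalized step size $1/r_i(k)$ with $r_i(k) = r_i(k-1) + \|\varphi_i(k)\|^2$, which is a common choice in stochastic gradient identification but is an assumption here. The function name and all numerical values are hypothetical.

```python
import numpy as np

def partially_coupled_sg(Phi: np.ndarray, Y: np.ndarray) -> np.ndarray:
    """Simplified partially coupled stochastic gradient recursion.

    Phi has shape (K, m, n): Phi[k, i] is the regressor row of subsystem i.
    Y has shape (K, m): Y[k, i] is the scalar output of subsystem i.
    Assumes y_i(k) = Phi_i(k) theta + v_i(k) with white noise v_i(k).
    """
    K, m, n = Phi.shape
    theta = np.zeros(n)   # shared estimate passed along the subsystem chain
    r = np.ones(m)        # r_i(0) = 1, one normalizing scalar per subsystem
    for k in range(K):
        for i in range(m):
            phi = Phi[k, i]
            r[i] += phi @ phi                  # r_i(k) = r_i(k-1) + ||phi_i(k)||^2
            innovation = Y[k, i] - phi @ theta
            # For i > 0, theta is the estimate of subsystem i-1 at time k;
            # for i = 0, it is the estimate of the last subsystem at time k-1.
            theta = theta + phi / r[i] * innovation
    return theta

rng = np.random.default_rng(0)
K, m, n = 2000, 2, 3
theta_true = np.array([0.5, -0.3, 0.8])        # illustrative values only
Phi = rng.standard_normal((K, m, n))
Y = Phi @ theta_true + 0.1 * rng.standard_normal((K, m))

theta_hat = partially_coupled_sg(Phi, Y)
initial_error = np.linalg.norm(theta_true)     # error of the zero initial estimate
final_error = np.linalg.norm(theta_hat - theta_true)
```

Because the shared estimate is updated $m$ times per sampling instant using only scalar operations and inner products, no Kronecker-structured information matrix is ever formed.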
The calculation steps of the F-M-PC-GESG algorithm in (40)-(58) are presented as follows.
Increase $k$ by 1 if $k < K$, and return to Step 2. Otherwise, obtain the parameter estimates $\hat{\theta}(k)$, $\hat{\kappa}(k)$, and $\hat{\tau}(k)$ and stop.
The calculation steps show that the F-M-PC-GESG algorithm uses interactive estimation. That is, after the initial values are set, the estimate $\hat{\tau}(k)$ is obtained first and is then used to calculate the estimates $\hat{\theta}_i(k)$ and $\hat{\kappa}_i(k)$. The loop continues until the estimates $\hat{\tau}(k)$, $\hat{\theta}_i(k)$, and $\hat{\kappa}_i(k)$ are stable.
The schematic diagram of the F-M-PC-GESG algorithm in (40)-(58) is given in Figure 1. It indicates that the estimates $\hat{\theta}_i(k)$ are common to the subsystems, whereas each $\hat{\kappa}_i(k)$ is estimated separately. Figure 1 shows how the partially coupled identification approach of the F-M-PC-GESG algorithm operates.

The Multivariate Generalized Extended Stochastic Gradient Algorithm
The gradient algorithm is a classical identification method that does not compute a covariance matrix, which substantially improves computational efficiency. In this section, the standard stochastic gradient method, without the improvements of the previous section, is used to identify the parameters of the multivariate system. Define another quadratic criterion function $J_3(\vartheta)$ for the identification model in (12), and suppose that $\mu(k)$ is the step size. Minimizing $J_3(\vartheta)$ by gradient search gives the gradient relationship for the estimate $\hat{\vartheta}(k)$. The obstacle in identification is that $\hat{\vartheta}(k)$ cannot be calculated because $v(k)$ and $w(k)$ in $\Omega(k)$ are unmeasurable. Substitute the estimates $\hat{v}(k)$ and $\hat{w}(k)$ for the terms $v(k)$ and $w(k)$; these estimates are calculated from Equations (8) and (12). Because $\Psi(k)$ also involves the unmeasurable $w(k)$, define $\hat{\Psi}(k)$ using the estimate $\hat{w}(k)$. Additionally, because $\Omega(k)$ involves the unmeasurable terms $\psi(k)$ and $\Psi(k)$, define $\hat{\Omega}(k)$ using the estimates $\hat{\psi}(k)$ and $\hat{\Psi}(k)$. Subsequently, the multivariate generalized extended stochastic gradient (M-GESG) algorithm in (65)-(73) is obtained.
The computation procedure of the M-GESG algorithm in (65)-(73) is as follows.

1. Let $k = 1$, set the initial values, including $\hat{\vartheta}(0)$, and set the data length $K$.

Increase $k$ by 1 if $k < K$, and return to Step 2. Otherwise, obtain the parameter estimate $\hat{\vartheta}(k)$ and stop.
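The core recursion behind the procedure above can be sketched in a simplified form. The sketch below is not the full M-GESG algorithm: it assumes white noise, so that no noise estimates are needed and the information matrix is fully measurable, and it assumes the normalized step size $\mu(k) = 1/r(k)$ with $r(k) = r(k-1) + \|\Phi(k)\|^2$, which is a common choice but is not confirmed by the text. The function name and all numerical values are hypothetical.

```python
import numpy as np

def multivariate_sg(Phi: np.ndarray, Y: np.ndarray) -> np.ndarray:
    """Simplified multivariate stochastic gradient recursion.

    Phi has shape (K, m, n) and Y has shape (K, m).
    Assumes y(k) = Phi(k) theta + v(k) with white noise v(k).
    """
    K, m, n = Phi.shape
    theta = np.zeros(n)   # initial estimate
    r = 1.0               # r(0) = 1
    for k in range(K):
        Phi_k = Phi[k]
        r += np.sum(Phi_k ** 2)                # ||Phi(k)||^2 = tr[Phi(k) Phi(k)^T]
        innovation = Y[k] - Phi_k @ theta      # e(k) = y(k) - Phi(k) theta(k-1)
        theta = theta + Phi_k.T @ innovation / r
    return theta

rng = np.random.default_rng(0)
K, m, n = 3000, 2, 3
theta_true = np.array([0.5, -0.3, 0.8])        # illustrative values only
Phi = rng.standard_normal((K, m, n))
Y = Phi @ theta_true + 0.1 * rng.standard_normal((K, m))

theta_hat = multivariate_sg(Phi, Y)
initial_error = np.linalg.norm(theta_true)     # error of the zero initial estimate
final_error = np.linalg.norm(theta_hat - theta_true)
```

Only the scalar $r(k)$ is propagated between steps, which reflects the statement that the gradient algorithm does not generate a covariance matrix.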

Convergence Analysis
The convergence of the F-M-PC-GESG algorithm is analyzed in this section. Suppose that $\{\mathcal{F}_k\}$ is the $\sigma$-algebra sequence generated by $v(k)$, and that $\{v(k), \mathcal{F}_k\}$ is a martingale difference sequence on a probability space $\{\Omega, \mathcal{F}, P\}$ [49]. The sequence $\{v(k)\}$ satisfies conditions (Q1) and (Q2).
Lemma 1 establishes inequalities for the systems in (16) and (29) and the F-M-PC-GESG algorithm in (40)-(58).
Theorem 1. For the systems in (16) and (29) and the F-M-PC-GESG algorithm in (40)-(58), suppose that (Q1) and (Q2) hold, and that there exist positive constants $\lambda_1$, $\lambda_2$, and $\lambda_3$ independent of $k$, and an integer $K$, such that the persistent excitation condition holds. Then, the parameter estimation errors $\|\hat{\theta}_i(k) - \theta\|$, $\|\hat{\kappa}(k) - \kappa\|$, and $\|\hat{\tau}(k) - \tau\|$ given by the F-M-PC-GESG algorithm converge to zero in the mean square sense.
The above analysis shows that the proposed algorithm makes the parameter estimation errors of multivariate pseudo-linear systems converge to zero in the presence of random disturbances. In other words, the proposed algorithm can estimate the unknown parameters accurately and also exhibits a degree of stability.
Theorem 1 can be proved in a way similar to that in [50], and the proof is omitted here.

Simulations
This section demonstrates the superior identification performance of the F-M-PC-GESG algorithm through two simulations.
Example 1. Consider the following multivariate equation-error autoregressive moving average system. The parameter vector to be identified is $\vartheta$, and the noise vector $v(k) \in \mathbb{R}^2$ has zero mean. $\sigma_1^2$ and $\sigma_2^2$ are the variances of $v_1(k)$ and $v_2(k)$. Taking the noise variances $\sigma_1^2 = \sigma_2^2 = 0.20^2$ and using the M-GESG algorithm and the F-M-PC-GESG algorithm to identify the system parameters, the parameter estimates and their errors $\delta := \|\hat{\vartheta}(k) - \vartheta\| / \|\vartheta\|$ are given in Table 1. The parameter identification errors versus $k$ are given in Figure 2. The parameter estimates $\hat{\theta}_1(k)$, $\hat{\theta}_2(k)$, $\hat{\theta}_3(k)$, $\hat{\theta}_4(k)$, $\hat{c}_1(k)$, $\hat{c}_2(k)$, $\hat{d}_{11}(k)$, $\hat{d}_{12}(k)$, $\hat{d}_{21}(k)$, and $\hat{d}_{22}(k)$ versus $k$ of the F-M-PC-GESG algorithm are given in Figures 3 and 4.
Example 2. Consider another multivariate equation-error autoregressive moving average system. The configuration of the simulation in this example is the same as in Example 1. Set the noise variances $\sigma_1^2 = 0.50^2$ and $\sigma_2^2 = 0.40^2$, and use the M-GESG algorithm and the F-M-PC-GESG algorithm to identify the system parameters. The parameter estimates and their errors are given in Table 2. The parameter identification errors versus $k$ are given in Figure 5. Based on Tables 1 and 2 and Figures 2-5, the identification performance of the algorithms is analyzed as follows.

1. Tables 1 and 2 and Figures 2 and 5 indicate that the parameter identification errors of the M-GESG and F-M-PC-GESG algorithms decrease with increasing data length $k$. This reveals that both algorithms are valid for parameter identification of the multivariate equation-error autoregressive moving average system.
2. Figures 2 and 5 show that the F-M-PC-GESG algorithm achieves higher parameter identification precision than the M-GESG algorithm under the same data length and noise variances.
3. Figures 3 and 4 show that the F-M-PC-GESG algorithm rapidly obtains precise parameter estimates.
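The relative estimation error $\delta$ reported in the tables and figures is the norm of the estimation error divided by the norm of the true parameter vector. A minimal sketch of this computation is given below; the function name `relative_error` and the numerical values are illustrative assumptions and are not taken from the simulations.

```python
import numpy as np

def relative_error(estimate, true_value) -> float:
    """Relative estimation error ||estimate - true_value|| / ||true_value||."""
    estimate = np.asarray(estimate, dtype=float)
    true_value = np.asarray(true_value, dtype=float)
    return float(np.linalg.norm(estimate - true_value) / np.linalg.norm(true_value))

# Illustrative values only.
delta = relative_error([0.9, 2.1], [1.0, 2.0])
```

Evaluating this quantity at each sampling instant $k$ gives an error curve of the kind plotted against $k$.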

Conclusions
This paper presents methods for improving parameter identification for multivariate pseudo-linear systems under conditions of random interference and parameter coupling, which provides modular solutions for modeling and forecasting real multivariate time series. Taking into account colored noises and high-dimensional unknown parameters, the original system is filtered by the designed filter and then transformed into several subsystem identification models by utilizing the coupling identification approach. A new filtering-based multivariate gradient algorithm is proposed, which has higher parameter estimation precision and higher identification efficiency than the conventional multivariate gradient algorithm. Convergence analysis and simulation experiments confirm that the F-M-PC-GESG algorithm obtains the unknown parameter estimates precisely and rapidly. Future research directions include applying the proposed methods to parameter identification problems for other linear or nonlinear models under random interference in various engineering systems.

Figure 1. The schematic diagram of the F-M-PC-GESG algorithm.