1. Introduction
The field of system identification focuses on the development and analysis of estimation methodologies for system models using experimental data, i.e., estimating the set of system models that best fit measurements [1,2,3,4]. The formulation of system identification algorithms is generally based on a stochastic problem, where measurement errors are treated as stochastic processes that can affect the accuracy of the estimates of the system model parameters. Modeling the measurement error behavior with Gaussian or other symmetric distributions [1,5] is a classical assumption in the development of estimation algorithms. However, real-world dynamic systems can be affected by uncertainty sources with asymmetric distributions, and estimation algorithms perform poorly when symmetric distributions are assumed [6]. In this context, it is crucial to address asymmetries in error models, especially for multiple-input multiple-output (MIMO) systems, which require a more flexible representation of the uncertainty behavior.
Many system identification algorithms in the literature have been developed under the Maximum Likelihood (ML) principle, motivated by its valuable statistical properties [1,3]. This approach has been used under symmetric and asymmetric distribution assumptions in a wide range of applications, from simple single-input single-output (SISO) systems, such as dynamic systems with a Finite Impulse Response (FIR) structure [7], systems with quantized data [8], and non-linear systems [9,10], to astronomy [11,12], estimation [13], medicine [14], and communications [15], among others. Similarly, ML estimation approaches in the time domain [1,2] and the frequency domain [16] have been considered for MIMO systems, with applications in robust control [17,18], power electronics [19,20], and biomedical engineering [21], to mention a few.
In particular, ML estimators are asymptotically unbiased; that is, the estimated values approach the true values when the number of measurements is large. Nevertheless, if the set of measurements is small, the estimates deviate from the true values as a result of the measurement noise variance [1,2]. However, there are scenarios in which a large number of measurements is available and the estimated models still change between estimations because there exist sources of uncertainty that the system model does not incorporate [22,23]. In [24], the authors proved that Bayesian methods yield more accurate estimates than classical ML estimation when a class of exponential distribution is assumed to handle censored data. In [25], various approaches to estimating the parameters of a class of exponential distribution were analyzed. The authors considered censored samples with mechanical and medical data, obtaining better performance with a Bayesian approach under asymmetry assumptions than with the ML method.
To illustrate the behavior of biased ML estimates, we consider the discrete-time system models shown in Figure 1 as two-input, two-output (2 × 2) MIMO linear dynamic systems, with an output signal, a deterministic input signal, a measurement noise signal, and the backward shift operator. The main goal is to obtain an ML estimate of the vector of parameters that parameterizes the MIMO system model. The estimates are computed from the measurement set for both the system model shown in Figure 1a, referred to as Model I, and that in Figure 1b, called Model II, where N is the data length. Without loss of generality, we assume that all transfer functions in Figure 1 are second-order FIR–MIMO systems. For Model I, an additive measurement noise is considered. Similarly, for Model II, we add an error term, parameterized by a stochastic vector of parameters, which models structural and parametric uncertainties in the MIMO system.
We perform 50 numerical simulations to obtain different data sets. Then, for each data set, the MIMO system model is estimated for both Model I and Model II using the Prediction Error Method (PEM) for MIMO systems [2]. The principal gains of each estimated MIMO system model are then computed. The principal gains are important characteristics used to assess the robustness of the system model to plant variations and its sensitivity to disturbances, and to design robust multivariable control strategies that ensure stability and performance margins [19,26,27]. In Figure 2 and Figure 3, the blue and red shaded regions represent the areas in which the magnitudes of the largest and smallest principal gains, respectively, of the 50 estimations lie. The solid lines (blue and red) represent the principal gains of the true MIMO system model. In Figure 2a, we observe small estimation biases when the number of measurements used for the ML estimations is large. These biases increase as the number of measurements decreases, as shown in Figure 2b. In contrast, in Figure 3, due to the incorporation of parametric uncertainties, the estimated MIMO system model yields biased principal gains for both large and small data lengths (see Figure 3a and Figure 3b, respectively). Based on this, it is desirable to develop system identification methodologies that provide estimated models capable of quantifying an error-model that captures parametric and/or structural uncertainties.
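To make the principal-gain computation concrete, the following MATLAB sketch evaluates the largest and smallest principal gains of a 2 × 2 second-order FIR model as the singular values of its frequency response matrix; the coefficient matrices B0, B1, and B2 are illustrative placeholders, not the models used in Figure 2 and Figure 3.

```matlab
% Minimal sketch: principal gains of a 2x2 second-order FIR model
% G(q) = B0 + B1*q^{-1} + B2*q^{-2}, evaluated on a frequency grid.
B0 = [1.0 0.2; 0.1 0.8];            % hypothetical coefficient matrices
B1 = [0.5 0.1; 0.0 0.4];
B2 = [0.2 0.0; 0.1 0.1];
w  = linspace(0, pi, 256);          % frequency grid (rad/sample)
sv = zeros(2, numel(w));            % largest and smallest principal gains
for k = 1:numel(w)
    z  = exp(-1j*w(k));
    Gw = B0 + B1*z + B2*z^2;        % 2x2 frequency response at w(k)
    s  = svd(Gw);                   % singular values = principal gains
    sv(:, k) = [max(s); min(s)];
end
semilogy(w, sv(1, :), 'b', w, sv(2, :), 'r');
xlabel('Frequency (rad/sample)'); ylabel('Principal gains');
```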
One alternative is to consider the uncertainty model as part of the estimation algorithm, combining a nominal model with an error-model. This idea can be adopted for designing robust control systems based on a norm criterion [28,29] or variable-structure controllers [30]. This approach to model error modeling has been addressed in a deterministic framework, in which the error-model is bounded by a set of possible solutions [31]. However, this methodology has been developed for SISO system models and does not guarantee that the set of solutions adequately describes the uncertainty in the corresponding dynamic system. In [23], the uncertainty modeling is based on residual dynamics computed from an available nominal SISO system model using PEM.
On the other hand, the Stochastic Embedding (SE) approach has been used to describe uncertainty in dynamic systems by assuming that the system model is a realization belonging to a probability space and that the parameters that define the error-model are characterized by a probability density function (PDF). In this context, the error-model can be quantified via an ML estimation that treats the error-model parameters as hidden (latent) variables, solving the associated optimization problem with the Expectation-Maximization (EM) algorithm [32] under Gaussian assumptions for the error-model distribution [33]. In [34], a Bayesian approach is adopted to jointly characterize the nominal system model and the error-model as realizations of random variables under certain prior distributions; the posterior distributions of the system model parameters are then estimated. However, this approach does not provide a separate derivation of a nominal system model and an error-model.
In [35], a more flexible scenario is proposed in which the error-model distribution can be represented as a Gaussian Mixture Model (GMM). An iterative EM-based algorithm is then developed to solve the estimation problem for both the parameters of the nominal system model and the Gaussian mixture parameters that define the error-model distribution. GMMs have traditionally been used in filtering [36,37], communications [38], tracking [39], and probabilistic modeling [40], among others. The flexibility of GMMs lies in the fact that they can approximate non-Gaussian PDFs according to the Wiener approximation theorem, which establishes that any PDF with compact support can be approximated by a linear combination of Gaussian distributions [41]. This idea has been utilized to develop estimation algorithms under non-Gaussian assumptions for dynamic systems (see, e.g., [6]). However, this approach has been considered only for SISO dynamic systems with a multiplicative error-model structure.
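As a minimal illustration of the Wiener approximation idea, the following MATLAB sketch approximates a compactly supported (uniform) PDF by a linear combination of Gaussian densities; the number of components, means, widths, and weights are illustrative choices, not values taken from the paper.

```matlab
% Minimal sketch: approximating a uniform PDF on [-1,1] by a linear
% combination of Gaussian densities (Wiener approximation theorem spirit).
x      = linspace(-1.5, 1.5, 600);
target = double(abs(x) <= 1) / 2;               % uniform PDF on [-1, 1]
mu     = linspace(-0.9, 0.9, 7);                % evenly spaced component means
sigma  = 0.18;                                  % common component width
alpha  = ones(1, numel(mu)) / numel(mu);        % equal mixing weights
gmm    = zeros(size(x));
for l = 1:numel(mu)
    gmm = gmm + alpha(l) * exp(-(x - mu(l)).^2 / (2*sigma^2)) / (sqrt(2*pi)*sigma);
end
plot(x, target, 'k--', x, gmm, 'b');            % compare target PDF and GMM
legend('Uniform PDF', 'Gaussian mixture approximation');
```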
In this paper, we extend the SE approach adopted in [35] to a class of MIMO linear dynamic systems, considering an additive or a multiplicative structure for the error-model term under non-Gaussian assumptions. We develop an ML estimation algorithm within the SE framework to estimate separately the MIMO nominal system model and the non-Gaussian multivariate error-model distribution as a GMM. Combining the SE approach with a linear combination of Gaussian distributions produces flexible scenarios in which the non-Gaussian distribution that describes the uncertainty is not a GMM but can be approximated by one. The main contributions of the paper are the following:
- (a) A Maximum Likelihood methodology using the SE approach is developed to model parametric and structural uncertainties in a class of MIMO linear dynamic systems. The resulting estimated model combines a nominal MIMO model with an additive or multiplicative multivariate error-model distribution represented as a GMM.
- (b) An iterative EM-based algorithm is proposed to solve the optimization problem associated with the ML estimation, obtaining the estimator for the MIMO nominal system model parameters and closed-form expressions for the estimators of the multivariate GMM parameters that characterize the error-model PDF.
The paper is organized as follows: Section 2 outlines the problem of interest, which considers modeling the error-model for MIMO linear dynamic systems using the SE approach. In Section 3, the estimation problem using the ML principle with GMMs is addressed. In Section 4, an iterative EM-based algorithm is developed to solve the associated ML estimation problem. Numerical simulations, considering scenarios with additive and multiplicative uncertainty models, are shown in Section 5. Finally, in Section 6, we present our conclusions.
3. Maximum Likelihood Estimation for Multivariable Model Error Modeling Using GMMs
Typically, the analysis of uncertainty in MIMO systems has been addressed within a deterministic context, assuming that this uncertainty is bounded [46,47]. In contrast, the SE framework adopts a stochastic approach. Specifically, we assume the existence of a PDF, parameterized by a vector of parameters, that describes the stochastic behavior of the vector of parameters of the error-model [22,35]. In order to obtain the ML estimator for the nominal system model and the error-model PDF in (1), we consider that the observed data and the input signal data are sets of measurements from each independent experiment r. Then, the MIMO dynamic system in (1) can be described as follows:
Additive error-model (2):
Multiplicative error-model (3):
where ⊗ represents the Kronecker product and the regressor of the multiplicative case corresponds to the input signal filtered by the nominal model. The first term on the right-hand side represents the output response of the nominal model in (1), and the remaining term is the output response of the additive error-model or of the multiplicative error-model, respectively. Notice that the multiplicative regressor incorporates the structure of the nominal model in (4) into the data of the input signal. Both regressors correspond to the error-model structure in (9), which refers to the additive error-model in (2) or to the multiplicative error-model in (3), respectively.
The vector of parameters to be estimated is defined so that it contains the nominal model parameters in (11), the GMM parameters in (14) that define the error-model PDF in (13), and the covariance matrix of the zero-mean multivariable Gaussian noise signal in (1). Thus, the ML estimator is obtained as follows:
Lemma 1. Consider the vector of parameters to be estimated using (11) and (14). Under the standing assumptions, the ML estimator for the MIMO system in (1) is given by the expression below, where the corresponding quantities are computed as follows:
- (a) If an additive error-model (2) is considered, then from (15), we have the following:
- (b) If a multiplicative error-model (3) is considered, then from (18), we have the following:
Proof. Consider the set of output measurements and the input signal data set for each experiment from the MIMO system model in (1). Then, the MIMO system model in (1) can be expressed as follows:
where the regressor corresponds to the one in (16) if an additive error-model approach is considered, or to the one in (19) if a multiplicative error-model approach is adopted. Then, the likelihood function for the system model in (28) is
where the argument is the vector of parameters to be estimated. From the random variable transformation theorem in [48] and utilizing the GMM in (13), the likelihood function is obtained by marginalizing with respect to the latent variable as follows:
where
and I is the identity matrix with appropriate dimensions. Then, the log-likelihood function is given by the following:
Let us consider the random variables x and y with Gaussian probability density functions, where C is a constant with appropriate dimensions. Then, using the well-known Woodbury matrix identity [49], the following identities are satisfied:
where
Using (36) in (34), and then solving (33) using (35), we obtain the following:
where the regressor term corresponds to the one in (16) if an additive error-model approach is considered, or to the one in (19) if a multiplicative error-model approach is used. This completes the proof. □
Remark 2. Notice that the estimation of the systems in (15) and (18) can also be handled from a Bayesian perspective, where both vectors of parameters are modeled as multivariate random variables with specific prior distributions. Under this Bayesian approach, and for particular system model structures, it is possible to estimate a unified model of both the nominal model and the error-model [33]. In our proposed SE framework, the structure and complexity of both the nominal system model and the uncertainty model are defined by the user, which provides more flexibility to obtain suitable models. The result in Lemma 1 presents the likelihood function using the simultaneous information from M experiments to obtain an estimate of the vector of parameters. This contrasts with the classical ML formulation, where data from a single experiment are used to estimate the model parameters of interest. On the other hand, the term in (24) and (25) represents a regressor matrix constructed from the r-th experiment data set of the multivariable input. The term in (26) and (27) is a regressor matrix built by filtering the r-th multivariable input signal through the nominal system model.
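The following MATLAB sketch illustrates how such a regressor matrix can be built for a 2 × 2 FIR model using the Kronecker product, under the common vectorization y_t = (phi_t' ⊗ I) θ with θ = vec([B0 ... Bn]); the function name fir_regressor and this particular parameter ordering are illustrative assumptions and do not reproduce the exact definitions in (24)–(27).

```matlab
% Minimal sketch: stacked regressor for a 2x2 FIR model of order n built
% from one experiment's input data, using Kronecker block rows.
function Psi = fir_regressor(u, n)
    % u : N x 2 matrix of input samples for one experiment
    % n : FIR order
    [N, nu] = size(u);
    ny  = 2;                                        % two outputs
    Psi = zeros(ny*N, ny*nu*(n+1));
    for t = 1:N
        phi = zeros(nu*(n+1), 1);
        for k = 0:n
            if t-k >= 1
                phi(k*nu+(1:nu)) = u(t-k, :)';      % current and past inputs
            end
        end
        Psi((t-1)*ny+(1:ny), :) = kron(phi', eye(ny));  % Kronecker block row
    end
end
```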
Solving the optimization problem in (22) involves inverting large-dimensional matrices, which can be challenging for gradient-based optimization methods (e.g., quasi-Newton or trust-region). Moreover, the likelihood function involves the logarithm of a summation over the GMM components, and it becomes difficult to optimize when the number of GMM components is large. In this context, formulating an iterative algorithm based on the EM algorithm is advantageous, since it typically provides closed-form expressions for the estimators of the parameters that define the GMM [50].
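To illustrate why the direct optimization is awkward, the following MATLAB sketch evaluates a marginal log-likelihood of this form, i.e., a log of a sum over GMM components for each experiment, using the log-sum-exp trick for numerical stability; all variable names (resid, Psi, Lam, alphas, mus, Sigmas) are placeholders, and the exact construction of the residuals and regressors follows Lemma 1, not this sketch.

```matlab
% Minimal sketch: marginal log-likelihood as a sum (over experiments) of the
% log of a sum (over GMM components) of Gaussian densities.
function L = gmm_loglik(resid, Psi, Lam, alphas, mus, Sigmas)
    % resid{r}: residual vector of experiment r; Psi{r}: its regressor;
    % Lam: noise covariance; alphas, mus{l}, Sigmas{l}: GMM parameters.
    M = numel(resid);  K = numel(alphas);  L = 0;
    for r = 1:M
        logterms = zeros(K, 1);
        for l = 1:K
            m = Psi{r} * mus{l};                          % component mean
            S = Psi{r} * Sigmas{l} * Psi{r}' + Lam;       % component covariance
            e = resid{r} - m;
            logterms(l) = log(alphas(l)) ...
                - 0.5*(log(det(S)) + e'*(S\e) + numel(e)*log(2*pi));
        end
        c = max(logterms);                                % log-sum-exp trick
        L = L + c + log(sum(exp(logterms - c)));
    end
end
```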
4. An Iterative Algorithm for Model Error Modeling in Multivariable Systems
The Expectation-Maximization algorithm is an iterative optimization methodology used to develop identification algorithms for both linear and nonlinear dynamic systems in the time domain [51] and in the frequency domain [52]. In such algorithms, a sequence of parameter estimates is computed for the parameter vector, ensuring convergence to a local maximum of the likelihood function associated with the dynamic system of interest [32]. Specifically, the formulation of the EM algorithm with GMMs adopts a data augmentation approach, in which a discrete hidden random variable acts as an indicator that determines which GMM component an observation comes from [50,53].
In the SE context, an EM-based algorithm can be formulated by considering the parameter vector that defines the error-model as the hidden variable [35]. To solve the estimation problem in (22), an EM algorithm with GMMs is developed starting from the definition of the likelihood function using the observed data and the hidden variable. In other words, we define the likelihood function in (23) using the complete data set. Hence, the EM-based algorithm is given by
where the expected value and the PDF of a random variable a given a random variable b are denoted as usual, the current estimate of the vector of parameters is used in the conditioning, and the joint PDF of the observations from the M experiments and of the error-model parameters from the experiments defines the auxiliary function of the EM-based algorithm. Note that (45) and (46) correspond to the E-step and M-step of the EM algorithm, respectively [32,53]. In general terms, the EM algorithm can be summarized as follows:
- (i) Choose an initial value for the vector of parameters to be estimated.
- (ii) For each iteration, compute the E-step using the auxiliary function in (45).
- (iii) Compute the M-step to obtain the updated estimate by solving the optimization problem in (46).
- (iv) Return to step (ii) until convergence or until the maximum number of EM algorithm iterations is reached.
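A minimal MATLAB skeleton of steps (i)–(iv) is shown below; e_step() and m_step() are hypothetical placeholders standing in for the computations of Lemma 2 and Theorem 1, and beta0, max_iter, and tol are user-chosen values.

```matlab
% Minimal sketch of the EM iteration in steps (i)-(iv).
beta = beta0;                           % (i) initial parameter estimate
for i = 1:max_iter
    Q      = e_step(data, beta);        % (ii) auxiliary-function quantities
    beta_n = m_step(data, Q, beta);     % (iii) maximize the auxiliary function
    if norm(beta_n - beta) / norm(beta) < tol   % (iv) convergence test
        beta = beta_n;  break;
    end
    beta = beta_n;
end
```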
Based on the description above, and inspired by the procedure in [54], the E-step of the EM-based algorithm can be computed as follows:
Lemma 2. Consider the vector of parameters to be estimated using (11) and (14). For the system of interest in (1), the E-step is given by the expression below, where the corresponding quantities are computed as follows:
- (a) If an additive error-model (15) is considered, then from (45), we have the expressions below, where tr(·) is the trace operator and the required quantities are computed utilizing (24) and (25), respectively, with the current estimates.
- (b) If a multiplicative error-model (18) is considered, then from (45), we have the corresponding expressions, where the required quantities are computed utilizing (26) and (27), respectively, with the current estimates.
Proof. Consider the log-likelihood function in (33) as follows:
with the corresponding term given in (34). This term can be expressed as follows [54]:
where
Using Jensen's inequality, it can be shown that this term is a decreasing function for any value of its argument [54]. From (34)–(36), the term can be expressed as follows:
where
Let us consider the following identity for a random variable z:
with the corresponding mean and covariance. Using (65) to compute the required expectation in (60), we obtain the following:
where the conditional mean and covariance are computed utilizing (39)–(41) as follows:
Using (66) in (60), and solving the integral utilizing (36), we obtain the following:
where the conditional mean is given in (66) and
where the weighting term is given in (62). Finally, the regressor corresponds to the one in (16) if an additive error-model approach is used, or to the one in (19) if a multiplicative error-model approach is considered. Then, substituting (70) into (57), we directly obtain the auxiliary function in (47). This completes the proof. □
From Lemma 2, the term in (48) represents the posterior probability that the observations from the r-th experiment come from the l-th GMM component defining the PDF of the error-model [55,56]. On the other hand, to compute the M-step of the EM-based algorithm, we need to solve (46). Since it is not possible to obtain a closed-form solution for all the estimators of the parameters of interest, we define a solution based on the coordinate descent algorithm and on the concept of Generalized EM as follows (see, e.g., [57,58]):
- (1) Fix the nominal model parameter vector at its current estimate in (47) and solve (46) with respect to the GMM parameters in (14) and the noise variance, obtaining their updated estimates.
- (2) Fix the estimated GMM parameters and noise variance in (47) and solve (46) with respect to the nominal model parameters, obtaining their updated estimate.
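For illustration, the following MATLAB sketch computes the posterior (responsibility) weights described above, i.e., the probability that the r-th experiment is generated by the l-th mixture component at the current estimates; it reuses the placeholder names of the earlier likelihood sketch (resid, Psi, Lam, alphas, mus, Sigmas) and is not the exact expression in (48).

```matlab
% Responsibility weights: posterior probability that experiment r comes from
% mixture component l, evaluated at the current parameter estimates.
gamma = zeros(M, K);
for r = 1:M
    logw = zeros(1, K);
    for l = 1:K
        m = Psi{r} * mus{l};                        % component-conditional mean
        S = Psi{r} * Sigmas{l} * Psi{r}' + Lam;     % component-conditional covariance
        e = resid{r} - m;
        logw(l) = log(alphas(l)) - 0.5*(log(det(S)) + e'*(S\e));
    end
    w           = exp(logw - max(logw));            % stable normalization
    gamma(r, :) = w / sum(w);
end
```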
From Lemma 2, the M-step of the proposed EM-based algorithm can be computed as follows:
Theorem 1. Consider the MIMO system given in (1) and the vector of parameters to be estimated. Under the standing assumptions, the solution of the optimization problem stated in (46) utilizing the auxiliary function (47) is given by the expressions below. Then, the M-step in the proposed EM-based algorithm can be carried out as follows:
- (a) If an additive error-model (15) is considered, the estimators (72)–(76) are computed using the quantities in (50) and (51), together with the following expression:
- (b) If a multiplicative error-model (18) is considered, the estimators (72)–(76) are computed using the quantities in (54) and (55), together with the following expression:
Proof. Using (70) and (71), the auxiliary function in (47) can be expressed as follows:
Taking the derivative of (79) with respect to the corresponding parameter and equating it to zero yields
Then, we obtain the following:
From (81), if we consider an additive error-model, the required quantity is the one in (50) and we directly obtain (73). If we consider a multiplicative error-model, it corresponds to the one in (54).
Next, taking the derivative of (79) with respect to the next parameter and equating it to zero yields
Then, we obtain the following:
From (83), if we consider an additive error-model, the required quantities are those in (50) and (51), and we directly obtain (74). If we consider a multiplicative error-model, they are given by (54) and (55).
Then, taking the derivative of (79) with respect to the remaining parameter and equating it to zero yields
Noting the corresponding identity, we then obtain
If we consider an additive error-model, the required quantities are those in (50) and (51), and we directly obtain (75). If we consider a multiplicative error-model, they are given by (54) and (55).
For the mixing weights, we use a Lagrange multiplier in order to deal with the corresponding constraint. Then, from (79), we define the Lagrangian as follows:
Taking the partial derivatives of (87) with respect to the mixing weights and the multiplier and equating them to zero, we obtain
Then, taking the summation over the mixture components in (88) and utilizing (89), we have
Substituting (90) into (88), and simplifying, we directly obtain (72).
Finally, from (86), substituting in the corresponding estimates, we obtain (77) and (78) for an additive error-model and a multiplicative error-model, respectively. □
The results of Theorem 1 show that closed-form expressions are obtained for both the GMM parameters of the error-model distribution and the noise variance. In addition, the estimation problem for the nominal model parameters in (76) can be solved utilizing traditional gradient-based optimization methods (see, e.g., [6,59]). Finally, the proposed iterative algorithms for an additive error-model and a multiplicative error-model with GMMs are summarized in Algorithm 1 and Algorithm 2, respectively.
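As a rough sketch of how the closed-form GMM updates of the M-step can be organized (the exact estimators are those in (72)–(75)), the following MATLAB fragment updates mixing weights, means, and covariances from responsibility-weighted conditional moments; gamma, etahat{r,l}, and Sigcond{r,l} are placeholder names for the E-step quantities, not the paper's symbols.

```matlab
% Minimal sketch of closed-form GMM updates from responsibility-weighted
% conditional moments of the error-model parameters (E-step quantities).
p          = numel(etahat{1,1});
alphas_new = sum(gamma, 1) / M;                         % mixing weights
for l = 1:K
    w  = gamma(:, l);  sw = sum(w);
    mu_num = zeros(p, 1);
    for r = 1:M
        mu_num = mu_num + w(r) * etahat{r,l};           % weighted conditional means
    end
    mus_new{l} = mu_num / sw;
    Sig_num = zeros(p, p);
    for r = 1:M
        d = etahat{r,l} - mus_new{l};
        Sig_num = Sig_num + w(r) * (Sigcond{r,l} + d*d');  % weighted second moments
    end
    Sigmas_new{l} = Sig_num / sw;
end
```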
Algorithm 1 Iterative algorithm with GMMs for additive error-model
Inputs: the measurement data sets from the M experiments, the number of GMM components K, and initial values of the nominal model, GMM, and noise parameters. Outputs: the estimated nominal model, GMM, and noise parameters.
1: Initialization
2: procedure E-step
3: Compute the regressor quantities from (24) and (25).
4: Compute the posterior weights from (48).
5: Compute the conditional quantities from (50) and (51).
6: end procedure
7: procedure M-step
8: Estimate the GMM parameters and the noise variance from (72)–(75).
9: Compute the quantity in (77) using the current estimates.
10: Estimate the nominal model parameters by solving (76).
11: end procedure
12: if the stopping criterion is not satisfied then
13: return to step 2
14: else
15: set the outputs to the current estimates
16: end if
17: End
Algorithm 2 Iterative algorithm with GMMs for multiplicative error-model
Inputs: the measurement data sets from the M experiments, the number of GMM components K, and initial values of the nominal model, GMM, and noise parameters. Outputs: the estimated nominal model, GMM, and noise parameters.
1: Initialization
2: procedure E-step
3: Compute the regressor quantities from (26) and (27).
4: Compute the posterior weights from (48).
5: Compute the conditional quantities from (54) and (55).
6: end procedure
7: procedure M-step
8: Estimate the GMM parameters and the noise variance from (72)–(75).
9: Compute the quantity in (78) using the current estimates.
10: Estimate the nominal model parameters by solving (76).
11: end procedure
12: if the stopping criterion is not satisfied then
13: return to step 2
14: else
15: set the outputs to the current estimates
16: end if
17: End
5. Numerical Simulations
In this section, we consider three simple examples to show the benefits and performance of our proposed iterative algorithm for model error modeling with an SE approach in the MIMO systems described in (1). Using numerical simulations is a common way to verify the performance of new estimation algorithms, as it allows the exploration of different scenarios that cannot be safely tested in an experimental design [60,61].
All the numerical examples consider a two-input, two-output (2 × 2) MIMO system. The first example adopts the SE approach with an additive error-model in (2). The nominal system model in (4) corresponds to an FIR system model. The error-model in (9) corresponds to a second-order FIR system model with a two-component GMM for the error-model distribution. Similarly, the second example considers FIR system models for both the nominal model in (4) and the multiplicative error-model in (3). The error-model distribution corresponds to a two-component overlapped Gaussian mixture distribution. In both examples, it is assumed that there is no modeling uncertainty at low frequency. Furthermore, a traditional MIMO system estimation is performed using an FIR structure for each channel, with a non-linear least squares (NLS) search method for the prediction error minimization, implemented via the Matlab tfest() function. This estimation aims to validate the error-model dynamic behavior using a data set different from the one used to estimate the MIMO system model parameters with the SE approach.
On the other hand, in the third numerical example, we focus on analyzing the flexibility of the proposed iterative algorithm for the MIMO system in (1). In this case, the model in (1) is defined by the model structure in (5), where the parameters are given by nominal values perturbed by a parametric uncertainty with a non-Gaussian distribution that does not correspond to a GMM.
For all the examples, the simulation setup is as follows:
- (1) The data length is .
- (2) The number of independent experiments is .
- (3) The number of Monte Carlo (MC) simulations is .
- (4) The stopping criterion is a small relative change of the parameter estimate between iterations, measured with the vector norm ‖·‖, or reaching 1000 iterations of the EM-based algorithm (see the sketch below).
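A minimal MATLAB sketch of this stopping rule is shown below; the tolerance value is illustrative, since the threshold used in the paper is not reproduced here, and beta_old, beta_new, and iter are placeholder names.

```matlab
% Minimal sketch of the stopping rule in item (4): small relative change of
% the parameter estimate between EM iterations, or 1000 iterations reached.
tol  = 1e-6;   % illustrative tolerance, not the value used in the paper
stop = (norm(beta_new - beta_old) / norm(beta_old) < tol) || (iter >= 1000);
```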
The initial values for the proposed EM-based algorithm are obtained as follows:
- (I) For the nominal MIMO model structure, M estimates are obtained from the set of independent experimental data. For this purpose, the bj() function from Matlab's System Identification toolbox is used, considering a Box–Jenkins model structure [1,2]. Then, the initial values correspond to the average of all the corresponding estimates.
- (II) For the linear regression structure of the MIMO error-model, M estimates are obtained using the bj() function with the corresponding FIR system model. Then, for this set of estimates, the following procedure, based on the k-means algorithm, is used [50]:
- (II.1) The mixing weights are all chosen to be equal.
- (II.2) The covariance matrix is chosen to be diagonal and identical to the sample variance of the estimates.
- (II.3) The means are chosen to be evenly spaced between the maximum and minimum estimated values of each component of the estimates.
Initialization methodologies for iterative EM algorithms with GMMs have been extensively studied in the literature (see, e.g., [62,63,64]), where it has been shown that a careful initialization of the Gaussian mixture parameters yields accurate estimates and also improves the rate of convergence. These methods typically assume that a GMM describes a probabilistic model of the observed data and require a preliminary short run of an EM algorithm to obtain the initial values of the GMM parameters. However, the problem addressed with the SE approach is different, since we assume that the multivariate GMM describes the PDF of a hidden variable that defines the dynamic behavior of the error-model. Our experience shows that the procedure described above yields accurate estimates of the system model parameters without requiring additional runs of the EM algorithm.
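A minimal MATLAB sketch of steps (II.1)–(II.3) is given below, assuming the M per-experiment error-model parameter estimates are stacked in a matrix Eta0 (M rows, one parameter vector per row); the variable names are placeholders.

```matlab
% Minimal sketch of the GMM initialization in (II.1)-(II.3).
K      = 2;                                   % number of GMM components
p      = size(Eta0, 2);                       % error-model parameter dimension
alpha0 = ones(1, K) / K;                      % (II.1) equal mixing weights
Sigma0 = diag(var(Eta0, 0, 1));               % (II.2) diagonal sample-variance covariance
mu0    = zeros(K, p);
for j = 1:p                                   % (II.3) evenly spaced means per coordinate
    mu0(:, j) = linspace(min(Eta0(:, j)), max(Eta0(:, j)), K)';
end
```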
5.1. Example 1: Multivariable FIR System Model with Additive Error-Model
Consider the MIMO system in (1) with an additive error-model (2). The true (but unknown) values of the nominal model parameters are fixed, the input signal is deterministic, and the measurement noise is zero-mean Gaussian. We also consider that there is no modeling uncertainty at low frequency, i.e., we focus on uncertainties in the MIMO system at high frequencies. Then, the error-model distribution is a GMM with two components. The vector of parameters to be estimated comprises the nominal model parameters, the GMM parameters, and the noise variance.
Figure 4 shows the results of the parameter estimates for the nominal MIMO model. The large red cross indicates the true value of the parameter vector, and the black circles represent the estimated nominal model values for all Monte Carlo simulations. Table 1 presents the mean values of the estimated nominal model parameters and the noise variance across all MC realizations. We observe a small bias in the estimates compared to the true values. This effect can be mitigated by utilizing a larger number of measurements, N, in each independent experiment. The standard deviation of the estimates is similar in each case.
Figure 5 shows the estimated multivariate GMM for the distributions of the error-model parameters. Figure 5a,b show the estimated GMM for each of the two parameter groups, respectively. The Gaussian mixture distributions are drawn utilizing the mean values of the estimated GMM parameters over all MC simulations. We observe the bimodal behavior of the error-model distributions described in (94).
On the other hand, to describe the dynamic behavior obtained with the SE approach, we compute the frequency response of the principal gains of the estimated MIMO system model. The principal gains provide a metric to evaluate the estimated MIMO system model in terms of how, and to what extent, different linear combinations of the input signals are amplified in the output signals, as well as the degree of channel coupling [17,51]. Here, we compute the principal gains of M realizations of the MIMO system model, where the nominal part is the estimated MIMO nominal system model obtained from the estimated values in Table 1, and the error-model part is obtained from M independent realizations drawn from the GMM in (13) with the mean values of the estimated parameters over all MC simulations. In addition, we estimate the MIMO system using the NLS algorithm [1] with a second-order FIR multivariable system model. In order to validate the error-model behavior, we consider the data from 25 independent experiments and obtain an estimate of a second-order FIR multivariable model for each independent experiment.
Figure 6 shows the magnitude of the principal gains corresponding to the estimated MIMO system model using the SE approach. In Figure 6a, the solid blue line represents the largest principal gain of the estimated nominal MIMO system model, and the blue-shaded region corresponds to the area that describes the error-model behavior with respect to the largest principal gain. Similarly, in Figure 6b, the red solid line is the smallest principal gain of the estimated nominal MIMO system model, and the red-shaded area corresponds to the estimated uncertainty region in which the smallest principal gains lie. In both cases, the dotted black lines represent the principal gains computed from the MIMO system models estimated with the NLS algorithm. We observe that these estimated principal gains lie within the shaded (blue and red) regions that describe the error-model behavior obtained with the SE approach.
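For reference, the following MATLAB sketch shows one way such shaded uncertainty regions can be reproduced: draw error-model parameter realizations from a GMM, add the resulting FIR error-model response to a nominal response, and keep the envelope of the principal gains over frequency. All numerical values (GMM weights, means, covariances, and the nominal coefficients) are illustrative placeholders, not the estimates reported in Table 1.

```matlab
% Minimal sketch: principal-gain uncertainty envelope from GMM realizations
% of an additive FIR error-model added to a nominal 2x2 response.
alpha_hat = [0.5 0.5];                                   % illustrative GMM weights
mu_hat    = {0.3*ones(8,1), -0.3*ones(8,1)};             % illustrative means (8 = 2x2 x 2 lags)
Sigma_hat = {0.01*eye(8), 0.01*eye(8)};                  % illustrative covariances
G0  = @(z) [1.0 0.2; 0.1 0.8] + [0.5 0.1; 0.0 0.4]*z;    % illustrative nominal response
w   = linspace(0, pi, 200);                              % frequency grid (rad/sample)
svE = [inf(1, numel(w)); zeros(1, numel(w))];            % [min; max] gain envelopes
for r = 1:25                                             % error-model realizations
    l   = find(rand <= cumsum(alpha_hat), 1);            % pick a mixture component
    eta = mu_hat{l} + chol(Sigma_hat{l}, 'lower')*randn(8, 1);
    D   = reshape(eta, 2, 4);                            % [D1 D2]: 2x2 FIR error-model matrices
    for k = 1:numel(w)
        z  = exp(-1j*w(k));
        Ge = D(:, 1:2)*z + D(:, 3:4)*z^2;                % additive error-model response
        s  = svd(G0(z) + Ge);                            % principal gains of nominal + error-model
        svE(1, k) = min(svE(1, k), min(s));
        svE(2, k) = max(svE(2, k), max(s));
    end
end
semilogy(w, svE(2, :), 'b', w, svE(1, :), 'r');          % largest/smallest gain envelopes
```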
5.2. Example 2: Multivariable FIR System Model with Multiplicative Error-Model
In this example, we consider the MIMO system in (1) with a multiplicative error-model (3). The true (but unknown) values of the nominal model parameters are fixed, together with a deterministic input signal and a zero-mean Gaussian noise signal, and we again consider that there is no modeling uncertainty at low frequency, i.e., we focus on uncertainties in the MIMO system at high frequencies. Then, the error-model distribution is an overlapped GMM with two components. As in the previous example, the vector of parameters to be estimated comprises the nominal model parameters, the GMM parameters, and the noise variance.
Figure 7 presents the nominal MIMO system model parameter estimates. The large red cross corresponds to the true values of the nominal system model parameters, and the black circles represent the parameter estimates of the nominal MIMO system model for all MC realizations. Table 2 summarizes the mean values of the estimated nominal system model parameters and the noise variance. In contrast to the previous example, we observe that the multiplicative error-model approach improves the estimation accuracy, as the dynamic behavior of the nominal model weights the model uncertainty. In this case, the standard deviation of the estimates remains similar across parameters.
Figure 8 shows the estimated GMM distribution for the error-model parameters. Figure 8a,b show the estimated GMM for each of the two parameter groups, respectively. The GMM distributions are computed utilizing the mean values of the estimated GMM parameters over all MC simulations. We observe the overlapped behavior of the multivariate error-model distribution described in (98).
For this example, we compute the principal gains of the M realizations of the MIMO system model, where the nominal part is the estimated MIMO nominal system model given by the estimated values in Table 2, and the error-model part is obtained from M independent realizations drawn from the GMM in (13) with the mean values of the estimated parameters over all MC realizations. We also estimate the system using the NLS algorithm [1] with a second-order FIR multivariable system model. We consider the data from 25 independent experiments and obtain an estimate of a second-order FIR multivariable model for each independent experiment in order to validate the uncertainty dynamic behavior.
Figure 9a,b show the magnitude of the largest and smallest principal gains corresponding to the estimated MIMO system model. The solid blue and red lines represent the largest and smallest principal gains computed from the estimated nominal MIMO system model, respectively. The shaded regions (blue and red) correspond to the error-model region in which the principal gains of the true system lie. We observe that all principal gains from the MIMO systems estimated with the NLS algorithm (black dotted lines) lie within the error-model area, that is, the estimations lie in the uncertainty region described with the SE approach.
5.3. Example 3: A General MIMO System with Non-Gaussian Mixture Error-Model Distribution
In this example, we consider the MIMO system model in (1) with uncertainties in the parameters of the system model. Each uncertain parameter is given by a nominal value plus a perturbation term that is uniformly distributed.
In the SE framework, we consider a multiplicative error-model and a nominal MIMO system model for this system, with the distribution of the error-model parameters in (103) modeled as a GMM. The vector of parameters to be estimated comprises the nominal model parameters, the GMM parameters, and the noise variance. In this case, the advantage of the proposed algorithm lies in the fact that a non-Gaussian PDF can be accurately approximated by a linear combination of Gaussian distributions [65]. Gaussian mixture approximations enhance the accuracy of the estimates when Gaussian assumptions are relaxed (see, e.g., [6,35]).
In order to validate the proposed SE approach for the MIMO system in (100), we compute the principal gains of M independent realizations of the vector of parameters in (99). The nominal parameter values correspond to the average of the estimated parameters of the nominal MIMO system model over all MC simulations, and the error-model parameters are drawn from a GMM as in (13) using the average of the estimated Gaussian mixture parameters from the MC simulations. In addition, we estimate the MIMO system model with the system model structure in (102) using the NLS algorithm. In particular, we use the measurements of 25 independent experiments and estimate a model for each experiment.
Figure 10a,b show the frequency response of the largest and smallest principal gains, respectively, for the estimated MIMO system model with the SE approach. The solid blue and red lines represent the largest and smallest principal gains of the estimated nominal MIMO system model. The two shaded areas are the uncertainty regions in which the principal gains computed from the M realizations of the parameters in (99) lie. The dotted black lines represent the principal gains of the MIMO system models estimated using the NLS algorithm. We observe that these principal gains lie within the uncertainty regions despite the fact that the SE system model in (99) does not correspond to the system model structure in (102).
Remark 3. In the numerical simulations, we consider that the MIMO system model structure (both the nominal model and the error-model) and the number of components, K, of the GMM are known. However, an information criterion, e.g., Akaike's information criterion [66], can be used to obtain a candidate reduced-complexity model that explains the dynamic behavior with good accuracy from a set of system models in which the true system model does not lie; see, e.g., [67].