On the Uncertainty Identification for Linear Dynamic Systems Using Stochastic Embedding Approach with Gaussian Mixture Models

In control and monitoring of manufacturing processes, it is key to understand model uncertainty in order to achieve the required levels of consistency, quality, and economy, among others. In aerospace applications, models need to be very precise and able to describe the entire dynamics of an aircraft. In addition, the complexity of modern real systems has rendered deterministic models impractical, since they cannot adequately represent the behavior of disturbances in sensors and actuators, or tool and machine wear, to name a few. Thus, it is necessary to deal with model uncertainties in the dynamics of the plant by incorporating a stochastic behavior. These uncertainties could also affect the effectiveness of fault diagnosis methodologies used to increase the safety and reliability of real-world systems. Determining suitable dynamic system models of real processes is essential to obtain effective process control strategies and accurate fault detection and diagnosis methodologies that deliver good performance. In this paper, a maximum likelihood estimation algorithm for uncertainty modeling in linear dynamic systems is developed utilizing a stochastic embedding approach. In this approach, system uncertainties are accounted for as a stochastic error term in a transfer function. Here, we model the error-model probability density function as a finite Gaussian mixture model. For the estimation of the nominal model and the probability density function of the parameters of the error-model, we develop an iterative algorithm based on the Expectation-Maximization algorithm using the data from independent experiments. The benefits of our proposal are illustrated via numerical simulations.


Introduction
System identification is a scientific discipline that studies techniques for modeling and estimating dynamic systems from experimental data, i.e., a number of experiments are performed on the system, and a model is then fitted to the measured data by estimating the corresponding system model parameters [1,2]. Most system identification techniques available in the literature assume that the system model to be fitted lies in the model set [3]. However, real processes are of arbitrary complexity, and using a complex model structure can lead to large variance estimation errors [4]. Moreover, the experimental data are typically obtained in the presence of sensor errors and measurement noise, modeling errors, and external disturbances that can degrade the accuracy of the system model; see, e.g., Reference [5]. This scenario has been considered for sensor or actuator failure diagnosis and isolation in systems with structural uncertainties and noisy measurements, utilizing residual analysis and Bayesian frameworks to develop failure detection procedures [6][7][8][9].
In Figure 1, a typical representation of a real process and measured experimental data is shown, where u_t denotes the input measurements, y_t the output measurements, and ω_t the measurement noise. The real process (true system) is represented by the interaction of three components: (i) an actuator, (ii) a plant, and (iii) a sensor. This representation has been used to develop system identification methodologies for flight vehicle development, in order to obtain accurate models and to validate mathematical models of the flight vehicle [10]. These system models are required, for example, to perform fault diagnosis and adaptive control, to analyze handling qualities specification compliance, and to develop high-fidelity aerodynamic databases for flight simulators, among other tasks [11]. However, the quality of these results can be affected by the uncertainties introduced by unmodeled (unknown) dynamics, the instrumentation, model simplifications, and measurement noise and sensor errors [12].
On the other hand, in Reference [13], modeling and parameter identification in structural health monitoring was addressed. The authors considered that the multi-sensor data in structural health monitoring can provide incomplete information due to multi-sensor faults or errors, and then quantified the corresponding uncertainty with a Bayesian framework. Similarly, in Reference [14], a sensor-orientated approach for space situational awareness and traffic management uncertainty quantification was presented, analyzing the implications of the sensor uncertainties for the performance of these systems. A mathematical formulation based on the least-squares method was developed, providing a methodology to represent the navigation and tracking errors as an uncertainty volume that accurately depicts the size and orientation in space situational awareness evolution.

[Figure 1 depicts the real process as the cascade of an actuator, a plant, and a sensor, with input u_t and output y_t; the annotations indicate the sensor locations, the sensor model (calibration factors, bias errors, etc.), and the unknown dynamics parameters.]
Figure 1. Representation of a real dynamic system with data measurements for system identification.
For these reasons, a suitable formulation of a model that incorporates uncertainties is an important aspect to consider in system identification methods, in order to obtain system models that represent as closely as possible the real process behavior [1,2,10]. Because of its valuable statistical properties [1][2][3], many identification algorithms involve the maximum likelihood (ML) principle with particular system models, such as dynamic systems [15][16][17], static systems [18], systems with quantized output data [19][20][21][22], and communications [23,24], to mention a few. In particular, ML estimators are only asymptotically unbiased; that is, when the number of measurements is not large, the estimates can be far from the true value. This bias is typically accounted for by the measurement noise variance. However, there are scenarios in which the number of sampled data is, in theory, large enough, and the estimates still change considerably from one experiment to another, i.e., there is an uncertainty in the system model that cannot be accounted for by the measurement noise and a large variance alone.
To illustrate the stochastic embedding approach, the magnitude of the frequency response of a linear dynamic system (assumed to be the true system) is shown in Figure 2, where G^[r](q^-1) is given by

G^[r](q^-1) = G_o(q^-1) [1 + G_∆^[r](q^-1)], (1)

where q^-1 denotes the backward shift operator (q^-1 x_t = x_{t-1}), r denotes a realization of the true system, G_o(q^-1) is a fixed nominal model, and G_∆^[r](q^-1) denotes the r-th realization of a relative error-model for the true system. The red-shaded region represents the area in which possible true systems, G^[r](q^-1), i.e., realizations of the true system (represented by black dotted lines), can lie. The error-model can introduce model parameter uncertainties in the low-frequency range and model structure uncertainties in the high-frequency range [10]. This system model behavior can be encountered, for instance, in an aeroservoelastic system model (see, e.g., References [5,25]). Aeroservoelastic systems include dynamic coupling due to structural, control, sensor, and actuator dynamics that cover both the low- and high-frequency ranges. This means that, in an ML framework, the estimates can change considerably from one experiment to another, even if the number of measurements used to obtain an ML estimate is large, due to the structural and parametric uncertainties in the true system model. Hence, it is desirable that the models obtained from system identification techniques provide a quantification of their uncertainty, i.e., that the uncertainty modeling be part of the structure, estimating a nominal model together with an error-model [26][27][28]. This approach to uncertainty modeling can be used for robust control design, in which optimization methods can be used to obtain controllers that minimize the expected variation (variance) of the true system performance from a given desired behavior [29], or to obtain a probabilistic solution for robust control design [30,31].
This alternative view of modeling the error-model has been addressed in different frameworks. In Reference [32], the set membership approach was used to deal with the problem of uncertainty modeling in dynamic systems. The authors of Reference [32] considered the error-model estimation in a deterministic framework, in which the error is unknown but bounded, obtaining a set of possible solutions. However, this method does not guarantee a small set of solutions that properly describes the system model uncertainty. In Reference [33], classical prediction error methods (PEM) [2] were used to obtain an estimation of the nominal model. Then, the unmodeled dynamics were estimated from the residual dynamics using PEM, obtaining the corresponding model error modeling (MEM). Nevertheless, this methodology assumes that the nominal model is available. In References [26,34], a stochastic embedding (SE) approach was used to describe the model uncertainty by considering the model as a realization drawn from an underlying probability space, where the parameters that define the error-model are characterized by a probability density function (PDF). However, a flexible error-model distribution is needed to obtain a correct description of the system model uncertainty. In this approach, the uncertainty can be quantified by using ML estimation, considering the parameters of the error-model as hidden variables [4,35] and solving the associated estimation problem with an expectation-maximization (EM) [36] based algorithm under Gaussian assumptions for the error-model distribution. References [4,37] also adopted a Bayesian perspective, where the parameters that define both the nominal model and the error-model can be modeled as realizations of random variables with certain prior distributions, and the posterior densities of the parameters can be estimated. There are works closely related to this framework [38,39] based on the kernel approach for system identification.
Under this Bayesian framework, it is possible to obtain a unified system model of both the nominal model and the error-model for the system in Equation (1) if all system models are considered finite impulse response (FIR) systems. However, this approach does not deliver the nominal model and the error-model separately when more complex model structures are involved in the true system.
In this paper, we propose an ML identification algorithm for modeling the uncertainty in a general class of linear dynamic systems using the SE approach. That is, we aim at obtaining an estimate of the nominal model and of the red-shaded region in Figure 2. We also consider that the error-model distribution is given by a Gaussian mixture model (GMM). GMMs have been utilized in non-linear filtering [40][41][42] and in the identification of dynamic systems within an ML framework [43,44] and a Bayesian framework [45][46][47]. Error-model modeling with the SE approach for FIR systems has also been considered with Gaussian mixture distributions [48]. Moreover, GMMs have been used to approximate non-Gaussian distributions in the ML framework (see, e.g., References [17,47,49]), based on the Wiener approximation theorem, which establishes that any PDF with compact support can be approximated by a finite sum of Gaussian distributions (Reference [50], Theorem 3). Combining the SE approach with Gaussian mixture distributions provides a flexible scenario for uncertainty modeling when the error-model distribution does not correspond to a GMM but can be approximated by one; the latter corresponds to the Wiener approximation theorem. The contributions of this paper can be summarized as follows: (i) We obtain the likelihood function for a linear dynamic system modeled with the SE approach and a finite Gaussian mixture distribution. The likelihood function is computed by marginalizing the vector of parameters of the error-model as a hidden variable. (ii) We propose an EM algorithm to solve the associated ML estimation problem with GMMs, obtaining the estimates of the vector of parameters that define the nominal model and closed-form expressions for the GMM estimators of the error-model distribution.
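To make the Wiener-approximation argument above concrete, the following numpy-only sketch fits a κ-component GMM by plain EM to samples drawn from a uniform density, which has compact support but is not a Gaussian sum. All names, the choice κ = 3, and the iteration count are illustrative assumptions, not taken from this paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Samples from a uniform density on [-1, 1]: compact support, non-Gaussian,
# hence approximable by a finite Gaussian sum per the cited Wiener theorem.
x = rng.uniform(-1.0, 1.0, size=2000)

# Plain 1-D EM for a kappa-component GMM (illustrative sketch only).
kappa = 3
alpha = np.full(kappa, 1.0 / kappa)          # mixing weights
mu = np.linspace(x.min(), x.max(), kappa)    # evenly spaced initial means
var = np.full(kappa, x.var())                # initial variances

for _ in range(200):
    # E-step: responsibilities zeta[i, r] = P(component i | sample r)
    logp = (-0.5 * (x[None, :] - mu[:, None]) ** 2 / var[:, None]
            - 0.5 * np.log(2 * np.pi * var[:, None])
            + np.log(alpha[:, None]))
    logp -= logp.max(axis=0, keepdims=True)  # stabilize before exponentiation
    zeta = np.exp(logp)
    zeta /= zeta.sum(axis=0, keepdims=True)
    # M-step: closed-form GMM updates
    nk = zeta.sum(axis=1)
    alpha = nk / x.size
    mu = (zeta @ x) / nk
    var = (zeta * (x[None, :] - mu[:, None]) ** 2).sum(axis=1) / nk
```

The fitted mixture spreads its components across the support of the uniform density, illustrating how a flexible GMM can approximate a non-Gaussian error-model distribution.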
The remainder of the paper is as follows: In Section 2, the system of interest for uncertainty modeling with the SE framework is stated. In Section 3, the estimation problem for uncertainty modeling in a linear dynamic system is addressed using the SE approach with GMMs. In Section 4, an EM algorithm is presented to solve the estimation problem.
Numerical simulation examples are presented in Section 5. Finally, in Section 6, we present our conclusions.

System Description
The system of interest in this paper is as follows:

y_t^[r] = G^[r](q^-1) u_t^[r] + H(q^-1, θ) ω_t^[r], (2)

where r = 1, ..., M denotes the r-th realization of the system, M corresponds to the number of independent experiments or batches, t = 1, ..., N denotes the t-th measurement, N is the data length, y_t is the output signal, u_t is the input signal, ω_t is a zero-mean Gaussian white noise with variance σ², and H(q^-1, θ) is the noise model parametrized by θ. We consider that the system G^[r](q^-1) can be described as follows (see, e.g., References [26,37]):

G^[r](q^-1) = G_o(q^-1, θ) [1 + G_∆(q^-1, η^[r])], (3)

where G_o(q^-1, θ) is the nominal system model parametrized by θ, and G_∆(q^-1, η^[r]) is the error-model parametrized by η^[r]. Here, we consider that the PDF of the error-model in Equation (3) is a finite Gaussian mixture distribution given by:

p(η^[r]) = Σ_{i=1}^{κ} α_i N(η^[r]; μ_i, Γ_i), with Σ_{i=1}^{κ} α_i = 1 and α_i ≥ 0, (4)

where N(η; μ_i, Γ_i) represents a Gaussian PDF with mean value μ_i and covariance matrix Γ_i. For simplicity of the presentation, we consider that H(q^-1, θ) in Equation (2) is part of the nominal model, i.e., it is only parametrized by θ. The nominal system model in Equation (3) is given by

G_o(q^-1, θ) = B(q^-1, θ) / A(q^-1, θ), (5)

where A(q^-1, θ) = 1 + a_1 q^-1 + · · · + a_{n_a} q^{-n_a} and B(q^-1, θ) = b_1 q^-1 + · · · + b_{n_b} q^{-n_b}.
Similarly, the noise system model in Equation (2) is given by

H(q^-1, θ) = C(q^-1, θ) / D(q^-1, θ),

with C(q^-1, θ) = 1 + c_1 q^-1 + · · · + c_{n_c} q^{-n_c} and D(q^-1, θ) = 1 + d_1 q^-1 + · · · + d_{n_d} q^{-n_d}. Notice that we use the term nominal model to refer to the system model G_o(q^-1, θ) and the noise system model H(q^-1, θ). We also consider that the error-model G_∆(q^-1, η^[r]) in Equation (3) is a linear regression as follows:

G_∆(q^-1, η^[r]) = η_0^[r] + η_1^[r] q^-1 + · · · + η_{n_∆−1}^[r] q^{-(n_∆−1)}, (12)

where η^[r] = [η_0^[r] · · · η_{n_∆−1}^[r]]^T ∈ R^{n_∆×1}.
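As a sketch of how data from the structure in Equations (2) and (3) can be generated, the snippet below simulates M experiments with a first-order nominal model, a two-tap FIR error-model, and a two-component GMM for η^[r]. All numerical values are illustrative assumptions (the nominal model values are borrowed from Example 1 later in the paper).

```python
import numpy as np
from scipy.signal import lfilter

rng = np.random.default_rng(1)
N, M = 100, 5                    # data length, number of experiments
b, a = [0.0, 1.0], [1.0, 0.5]    # G_o = q^-1 / (1 + 0.5 q^-1)  (assumed)
c, d = [1.0, 0.1], [1.0, 0.8]    # H = (1 + 0.1 q^-1) / (1 + 0.8 q^-1)
# Assumed two-component GMM for the FIR error-model taps eta = [eta0, eta1]^T
alpha = [0.5, 0.5]
mus = [np.array([0.2, -0.2]), np.array([-0.2, 0.2])]
Gammas = [0.01 * np.eye(2), 0.01 * np.eye(2)]

Y = []
for r in range(M):
    u = rng.normal(0.0, np.sqrt(10.0), N)      # input signal
    w = rng.normal(0.0, np.sqrt(0.1), N)       # measurement noise
    i = rng.choice(2, p=alpha)                 # pick a GMM component
    eta = rng.multivariate_normal(mus[i], Gammas[i])
    g_u = lfilter(b, a, u)                     # G_o(q^-1) u_t
    # y = G_o (1 + G_delta) u + H w, with G_delta the FIR filter in eta
    y = g_u + lfilter(eta, [1.0], g_u) + lfilter(c, d, w)
    Y.append(y)
```

Each experiment r carries its own realization η^[r], so the M output records jointly reflect both the measurement noise and the stochastic error-model.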

Standing Assumptions
The problem of interest is to estimate the vector of parameters, β = [θ^T γ^T σ²]^T, that collects the parameters of the nominal model, the error-model distribution, and the noise variance. In addition, we consider that β_0 is the true vector of parameters that defines the true model. In order to formulate the ML estimation algorithm, we introduce the following standing assumptions:

Assumption 1. The system in Equation (2) is operating in open loop, and the input signal u_t^[r] is an exogenous deterministic signal for each independent experiment r. In addition, the data of M independent experiments are available.

Assumption 2. For each experiment r, the error-model parameter vector η^[r] is an independent realization drawn from the GMM in Equation (4).

Assumption 3. The signals in Equation (2) satisfy regularity conditions, guaranteeing that the ML estimate β̂_ML converges (in probability or a.s.) to the true solution β_0 as N → ∞.

Assumption 4. The orders n_a, n_b, n_c, n_d, and n_∆ of the polynomials of the system in Equation (2) and the number of components κ of the error-model distribution in Equation (4) are known.

Assumption 5. The system in Equation (2) is asymptotically stable, and its models G^[r](q^-1) and H(q^-1, θ) have no poles or zeros on the unit circle and no pole-zero cancellations. The noise system model H(q^-1, θ) is also a minimum-phase system.

Assumption 2 is needed to develop an ML methodology based on uncertainty modeling with the SE approach. Assumption 3 is necessary so that the estimation algorithm retains the asymptotic properties of the ML estimator. Assumption 4 can be relaxed by considering a dynamic system model that includes parametric uncertainties with an error-model structure different from the one we assume in Equation (3). We address this case in Section 5. Assumption 5 is necessary to obtain an asymptotically unbiased ML estimator [51] and a system model that is controllable [52].

Maximum Likelihood Estimation for Uncertainty Modeling Using GMMs
In this section, we develop an ML estimation algorithm for the system in Equation (2) to obtain both the nominal model and the error-model distribution. We consider that the observed data Y^[1:M] = [Y^[1] · · · Y^[M]], with Y^[r] = [y_1^[r] · · · y_N^[r]]^T, is a collection of measurements for each experiment. We use capital letters to denote the vector of signals, u_t or ω_t, for t = 1, ..., N. Then, the system in Equation (2) can be described as follows:

Y^[r] = G(θ) U^[r] + Ψ^[r](θ) η^[r] + W^[r], (13)

where the term G(θ)U^[r] corresponds to the output response of the nominal model structure G(θ), Ψ^[r](θ)η^[r] corresponds to the output signal due to the error-model structure in Equation (3), and W^[r] ∼ N(0, σ² I_N) (I_x represents the identity matrix with dimension x). Notice that G(θ) involves the nominal model structure utilized for G_o(q^-1, θ) and H(q^-1, θ) in Equation (2), while Ψ^[r](θ) describes the model structure given by the product of the nominal model structure and the error-model regressors in Equation (3).

Remark 1. The system model in Equation (13) considers that the error-model structure Ψ^[r](θ)η^[r] is a linear regression, while the nominal model structure G(θ) can have an arbitrary structure [37]. Considering the error-model as a linear regression provides flexibility and lower computational complexity, and it encompasses FIR models of arbitrary order as well as Laguerre and Kautz models (see, e.g., Reference [53]).
We define the vector of parameters to be estimated as β = [θ^T γ^T σ²]^T in order to formulate the ML estimator for the system in Equation (13), where θ is the vector of parameters of the nominal model, γ is the vector of parameters of the error-model distribution as a GMM in Equation (4), and σ² is the noise variance. The corresponding ML estimation algorithm using GMMs is obtained as follows:

Lemma 1. Consider the vector of parameters to be estimated, β = [θ^T γ^T σ²]^T, using Equations (5) and (14). Under the standing assumptions and using Equation (13), the ML estimator for the system model in Equation (2) is given by:

β̂_ML = arg max_β ℓ(β), (15)

where the log-likelihood function ℓ(β) is given by

ℓ(β) = Σ_{r=1}^{M} log Σ_{i=1}^{κ} α_i N(Y^[r]; G(θ)U^[r] + Ψ^[r](θ)μ_i, Σ_i^[r]), with Σ_i^[r] = σ² I_N + Ψ^[r](θ) Γ_i [Ψ^[r](θ)]^T. (16)

Proof. See Appendix A.
The result obtained from Lemma 1 differs from the classical ML formulation, where the measurement data of one experiment are used to obtain one estimate of the parameters of interest (see, e.g., References [1,2]). In contrast, from Lemma 1, the ML estimator considers the measurements of M independent experiments to obtain one estimate of the system model, i.e., the nominal model and the error-model distribution as a GMM. Notice that μ̂_ir and Σ̂_ir in Equations (17) and (18), respectively, depend on the vector of parameters β.
The optimization problem in Equation (15) is typically solved using gradient-based methods. However, the log-likelihood function in Equation (16) contains a matrix Σ_i^[r] of dimension N × N, which makes the optimization problem cumbersome [37,54]. Moreover, the log-likelihood function involves the logarithm of a summation that depends on the number of components κ of the GMM, and it may be difficult to optimize when the number of components in the GMM is large [55]. The EM algorithm [36] can deal with this difficulty, since the combination of an EM algorithm with GMMs typically provides closed-form estimators for the GMM parameters [56,57].
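A direct numerical sketch of this likelihood evaluation is shown below, under the assumption that, conditioned on GMM component i, Y^[r] is Gaussian with mean G(θ)U^[r] + Ψ^[r](θ)μ_i and covariance σ²I_N + Ψ^[r](θ)Γ_i[Ψ^[r](θ)]^T. The function and argument names are hypothetical, and the N × N covariance is formed explicitly, which illustrates why this direct evaluation becomes cumbersome for large N.

```python
import numpy as np
from scipy.stats import multivariate_normal

def loglik(Y, GU, Psi, alpha, mus, Gammas, sigma2):
    """Mixture marginal log-likelihood over M experiments (sketch).

    Y[r], GU[r]: N-vectors (output and nominal-model response);
    Psi[r]: N x n_delta regressor matrix of the error-model.
    """
    ll = 0.0
    for Yr, GUr, Pr in zip(Y, GU, Psi):
        N = Yr.size
        # Marginal of Y[r] under component i: mean G(theta)U + Psi mu_i,
        # covariance sigma^2 I + Psi Gamma_i Psi^T.
        comps = [a_i * multivariate_normal.pdf(
                     Yr, GUr + Pr @ m_i,
                     sigma2 * np.eye(N) + Pr @ G_i @ Pr.T)
                 for a_i, m_i, G_i in zip(alpha, mus, Gammas)]
        ll += np.log(sum(comps))
    return ll
```

For long records, the component densities underflow and the N × N inversions dominate the cost; the EM route discussed above avoids optimizing this expression directly.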

An EM Algorithm with GMMs for Uncertainty Modeling in Linear Dynamic Systems
The EM algorithm is a popular tool for identifying linear and non-linear dynamic systems in the time domain (see, e.g., References [58,59]) and in the frequency domain [60]. In particular, a common strategy in the formulation of an EM algorithm with GMMs is to consider a hidden (latent) variable, modeled as an indicator, that determines from which GMM component an observation comes [57]. To solve the estimation problem in Equation (15), we define the hidden variable ζ_{1:M}, where ζ_r indicates the GMM component in Equation (4) from which the error-model parameter vector η^[r] is drawn. Hence, the EM algorithm is given by (see, e.g., References [36,57]):

Q(β, β̂^(m)) = E{ log p(Y^[1:M], ζ_{1:M} | β) | Y^[1:M], β̂^(m) }, (19)

β̂^(m+1) = arg max_β Q(β, β̂^(m)), (20)

where E{a|b} and p(a|b) denote the expected value and the PDF of the random variable a given the random variable b, respectively, β̂^(m) is the current estimate, p(Y^[1:M], ζ_{1:M}|β) is the joint PDF of Y^[1:M] and ζ_{1:M}, and Q(β, β̂^(m)) is the auxiliary function of the EM algorithm. Notice that Equations (19) and (20) correspond to the E-step and M-step of the EM algorithm, respectively [36].
In order to develop the iterative algorithm in Equations (19) and (20), we obtain the following result, which will be used to compute Q(β, β̂^(m)) in Equation (19):

Lemma 2. Consider the vector of parameters to be estimated, β = [θ^T γ^T σ²]^T, defined in Equations (5) and (14). Then, the complete log-likelihood function log p(Y^[1:M], ζ_{1:M}|β) is given by Equation (21).

Proof. See Appendix B.
Based on the coordinate descent algorithm [61], we obtain a new estimate in Equation (20) using the following steps: (1) Fix the vector of parameters θ at its value from the current iteration, θ̂^(m), and optimize Equation (19) with respect to the GMM parameters γ and the noise variance σ². (2) Fix the GMM parameters and the noise variance at their updated values, γ̂^(m+1) and σ̂²^(m+1), and solve the optimization problem in Equation (20) to obtain θ̂^(m+1). From Lemma 2, our proposed EM algorithm can be computed as follows:

Theorem 1. Consider the vector of parameters to be estimated, β = [θ^T γ^T σ²]^T, defined in Equations (5) and (14) for the system in Equation (2). Substituting Equation (21) in Equation (19), the E-step in the EM algorithm is given by Equation (22), in which the quantities ζ̂_ir^(m), μ̂_ir, and Σ̂_ir admit closed-form expressions. Consider the maximization problem stated in Equation (20) using Equation (22). The M-step is then carried out as follows: solving Equation (20) using Equation (22) with θ = θ̂^(m) yields the GMM and noise variance updates in Equations (26)-(29), and θ̂^(m+1) is subsequently obtained from Equation (34).

Proof. See Appendix C.
From Theorem 1, we obtain closed-form expressions for the GMM and noise variance estimators. They are computed from the definition of the auxiliary function Q(β, β̂^(m)) using the complete log-likelihood function in Equation (21). This differs from the procedure proposed in Reference [49], where an auxiliary function is built without explicitly defining a hidden variable. However, the auxiliary function obtained in Reference [49] is equal to Equation (19), and the estimators are obtained from Equations (26)-(29) and Equation (34).
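To make the E/M structure concrete, here is a sketch of one EM pass over the GMM part with θ held fixed, as in step (1) of the coordinate-descent scheme above. The per-component posterior mean and covariance of η^[r] are the standard Gaussian conditioning formulas; they are stated here as an assumption about the form of the closed-form updates, not a verbatim transcription of the paper's Equations (26)-(29).

```python
import numpy as np
from scipy.stats import multivariate_normal

def em_step(Y, GU, Psi, alpha, mus, Gammas, sigma2):
    """One E/M pass for the GMM part of the error-model (sketch, theta fixed)."""
    M, kappa = len(Y), len(alpha)
    n_d = mus[0].size
    zeta = np.zeros((kappa, M))              # responsibilities
    mu_post = np.zeros((kappa, M, n_d))      # E{eta | Y, component i}
    S_post = np.zeros((kappa, M, n_d, n_d))  # Cov{eta | Y, component i}
    for r, (Yr, GUr, Pr) in enumerate(zip(Y, GU, Psi)):
        N = Yr.size
        for i in range(kappa):
            Sig = sigma2 * np.eye(N) + Pr @ Gammas[i] @ Pr.T
            e = Yr - GUr - Pr @ mus[i]
            zeta[i, r] = alpha[i] * multivariate_normal.pdf(
                Yr, GUr + Pr @ mus[i], Sig)
            K = Gammas[i] @ Pr.T @ np.linalg.inv(Sig)   # Gaussian conditioning
            mu_post[i, r] = mus[i] + K @ e
            S_post[i, r] = Gammas[i] - K @ Pr @ Gammas[i]
    zeta /= zeta.sum(axis=0, keepdims=True)  # normalize over components
    nk = zeta.sum(axis=1)
    alpha_new = nk / M                       # weight update (cf. Eq. (26))
    mus_new = [zeta[i] @ mu_post[i] / nk[i] for i in range(kappa)]
    Gammas_new = [sum(zeta[i, r] * (S_post[i, r]
                        + np.outer(mu_post[i, r] - mus_new[i],
                                   mu_post[i, r] - mus_new[i]))
                      for r in range(M)) / nk[i]
                  for i in range(kappa)]
    return alpha_new, mus_new, Gammas_new, zeta
```

A full iteration would then re-estimate θ and σ² with the GMM parameters fixed, per step (2) of the coordinate-descent scheme.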

Numerical Examples
In this section, we present two numerical examples with different approaches to illustrate the benefits of our proposal for modeling both the nominal model and the error-model in the system of Equation (2). Numerical simulation is commonly used to test new estimation algorithms, in order to evaluate estimation performance under conditions that could reduce safety levels, or in problems for which experiments cannot be planned [62,63].
In the first example, we consider a variant of the example used in Reference [37], holding all standing assumptions stated in Section 2.2. The nominal system model (G_o(q^-1, θ) and H(q^-1, θ)) corresponds to a Box-Jenkins model. For simplicity, we also consider that the error-model (G_∆(q^-1, η^[r])) corresponds to the second-order FIR system in Equation (12) with an overlapped Gaussian mixture error-model distribution. We assume that there are no dynamic uncertainties at low frequencies, i.e., the error-model has zero static gain. In this case, we do not carry out a comparison with other approaches, since the simulated data are generated using our proposed framework with SE and GMMs.
In contrast, we consider a second example to illustrate the flexibility of our proposed algorithm for modeling the error-model for the system in Equation (2). Assumption 4 is relaxed by considering a model structure that does not correspond to Equation (3). Instead, we consider the system in Equation (2) given by a resistor-capacitor model with non-Gaussian-sum parameter uncertainties, detailed below. For both examples, the simulation setup is as follows: (1) The data length is N = 100. (2) The number of experiments is M = 100. (3) The number of Monte Carlo (MC) simulations is 25. (4) The stopping criterion is satisfied when the relative change of the estimates between consecutive iterations is small enough, or when 2000 iterations of the EM algorithm have been reached, where ‖·‖ denotes the general vector norm operator.
Notice that each MC simulation corresponds to an estimation obtained using the data from M independent experiments.
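The stopping rule can be sketched as a relative-change test on the parameter vector; the tolerance value below is an assumption, since the text only specifies the 2000-iteration cap.

```python
import numpy as np

def converged(beta_new, beta_old, it, tol=1e-6, max_it=2000):
    """Relative-change stopping rule for the EM iterations (sketch).

    tol is an assumed value; max_it matches the cap stated in the text.
    """
    denom = max(np.linalg.norm(beta_old), np.finfo(float).tiny)
    rel = np.linalg.norm(beta_new - beta_old) / denom
    return rel < tol or it >= max_it
```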
On the other hand, for the numerical examples, the initialization of the estimation algorithm is as follows: For the nominal model initialization, we estimate a system model for each independent experiment using PEM and a sufficiently flexible (high-order) structure. Then, we choose the estimated system model, and the corresponding estimated noise variance, with the smallest number of parameters that best describes all the previously estimated models. For the error-model FIR transfer function, we adopt a similar approach for different FIR filter orders (n_∆). Here, we choose a sufficiently complex FIR system by assuming the nominal model is equal to one (G_o(q^-1, θ) = 1). The number of Gaussian mixture model components (κ) is defined by the user, and it is typically a large number. Once the parameters have been estimated, it is possible to discard the Gaussian components with small estimated weights (e.g., smaller than 1/(100κ)). Finally, for a given κ, the initial distribution of the parameters of the error-model is chosen as follows: (1) From all the estimated FIR models, compute the average value of the coefficient corresponding to each tap. (2) Evenly space the initial mean values of the n_∆-dimensional GMM between the estimated maximum and minimum value of each tap. (3) Set the covariances of the n_∆-dimensional GMM equal to a diagonal covariance matrix with the sample variance of each tap on the diagonal. (4) Set the mixing weight of each GMM component equal to 1/κ.
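Steps (2)-(4) of this initialization can be sketched as follows; the function name and array layout are illustrative assumptions.

```python
import numpy as np

def init_gmm(fir_estimates, kappa):
    """Initial GMM for the error-model taps, following steps (2)-(4).

    fir_estimates: (M, n_delta) array with one FIR tap estimate per experiment.
    """
    taps = np.asarray(fir_estimates)
    # (2) evenly space the initial means between per-tap min and max
    mus = np.linspace(taps.min(axis=0), taps.max(axis=0), kappa)
    # (3) diagonal covariances with the sample variance of each tap
    Gammas = [np.diag(taps.var(axis=0, ddof=1)) for _ in range(kappa)]
    # (4) uniform mixing weights
    alpha = np.full(kappa, 1.0 / kappa)
    return alpha, mus, Gammas
```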

Example 1: A General System with a Gaussian Mixture Error-Model Distribution
Consider the true system G^[r](q^-1) in Equation (2) with a nominal system model as follows:

G_o(q^-1, θ) = b_1^0 q^-1 / (1 + a_1^0 q^-1), H(q^-1, θ) = (1 + c_1^0 q^-1) / (1 + d_1^0 q^-1), (36)

and the error-model in Equation (12) given by

G_∆(q^-1, η^[r]) = η_0^[r] + η_1^[r] q^-1, (37)

where b_1^0 = 1, a_1^0 = 0.5, c_1^0 = 0.1, and d_1^0 = 0.8. The input signal is u_t^[r] ∼ N(0, σ_u²), σ_u² = 10, and the noise source is ω_t^[r] ∼ N(0, σ²), σ² = 0.1. In this example, we focus on uncertainties in the system model at high frequencies, i.e., η_1 = −η_0. We also consider that, for each experiment, η^[r] = [η_0 η_1]^T is drawn from a Gaussian mixture distribution in Equation (4). The initial values used for our proposed EM algorithm correspond to θ̂^(0) = [1.2510 0.4523 0.0772 0.7546]^T. Table 1 shows the estimation results of the nominal model parameters and the noise variance. Figure 3a shows the results of the estimation of the error-model distribution. The blue line corresponds to the average of all the estimated GMMs. It is clear that the estimated PDF is similar to the true PDF. On the other hand, we note that the SE method does not directly provide an uncertainty region. However, we compute M realizations of the model, r = 1, ..., M, where Ĝ_o(q^-1, θ̂) is the estimated nominal model and the η^[r] are obtained from M different realizations of the GMM in Equation (4) with the estimated parameters γ̂ obtained using our proposed SE method. In addition, we use PEM [1,2] to estimate the system model G^[r](q^-1) with a high-order FIR model. We consider the measurements of 25 independent experiments, and we estimate a 10th-order FIR system model, Ĝ(q^-1, θ)_(FIR), for each independent experiment in order to validate the estimated uncertainty region. Figure 3b shows the magnitude of the frequency response corresponding to the average over all MC simulations of the estimated nominal model (blue line). The red-dashed line corresponds to the nominal system model in Equation (36), i.e., G_o(q^-1, θ). The blue-shaded region represents the estimated uncertainty region of the true system. We observe an accurate estimation of the nominal system model. We also observe that the estimated FIR models (black-dotted lines) lie in the uncertainty region obtained with our proposed method.
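The Monte Carlo construction of the uncertainty region described above can be sketched as follows. The nominal model uses the parameter values of Equation (36), while the GMM parameters shown are placeholders standing in for the estimated γ̂, not values reported in the paper.

```python
import numpy as np
from scipy.signal import freqz

rng = np.random.default_rng(2)
# Nominal model values from Equation (36): G_o = q^-1 / (1 + 0.5 q^-1)
b_hat, a_hat = [0.0, 1.0], [1.0, 0.5]
# Placeholder two-component GMM for eta = [eta0, eta1]^T (assumed values)
alpha_hat = [0.5, 0.5]
mu_hat = [np.array([0.3, -0.3]), np.array([-0.3, 0.3])]
Gam_hat = [0.005 * np.eye(2), 0.005 * np.eye(2)]

w = np.linspace(1e-2, np.pi, 256)          # frequency grid (rad/sample)
mags = []
for _ in range(100):                       # M realizations of the model
    i = rng.choice(2, p=alpha_hat)
    eta = rng.multivariate_normal(mu_hat[i], Gam_hat[i])
    _, Go = freqz(b_hat, a_hat, worN=w)    # nominal frequency response
    _, Gd = freqz(eta, [1.0], worN=w)      # FIR error-model response
    mags.append(np.abs(Go * (1.0 + Gd)))   # |G_o (1 + G_delta)|
mags = np.array(mags)
lower, upper = mags.min(axis=0), mags.max(axis=0)  # shaded-region bounds
```

Plotting `lower` and `upper` against `w` reproduces the kind of shaded uncertainty region shown in Figure 3b.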

Example 2: A Resistor-Capacitor System Model with Non-Gaussian-Sum Uncertainties
Consider the system in Equation (2) given by a discretized resistor-capacitor model with time constant τ^[r], where T_s = 0.01 is the sampling period. We consider that the sampling period T_s is known.
For the SE method, we consider an output-error nominal model with an FIR error-model. The distribution of the parameters η^[r] of the error-model is modeled using a GMM with γ = {α_i, μ_i, Γ_i}_{i=1}^κ, κ = 2 components. The vector of parameters to be estimated is β = {τ_o, γ, σ²}. As in the previous example, the uncertainty region is obtained from Equation (38) with r = 1, ..., M, where η^[r] is obtained from M different realizations of the GMM in Equation (4) with the estimated parameters γ̂. In addition, the initial values used for our proposed EM algorithm are given by τ̂_o^(0) = 9 × 10^-3 and σ̂²^(0) = 0.1247. For comparison purposes, we estimate the system uncertainty and the nominal model using the MEM approach [33]. The estimated nominal model, Ĝ_o(q^-1, θ)_(MEM), corresponds to the average of all PEM estimates obtained using the original structure given in Equation (39), i.e., n_a = n_b = 1 and n_c = n_d = 0, from all the independent experiments. In order to obtain the residuals, we compute ε_t = y_t − Ĝ_o(q^-1, θ)_(MEM) u_t for each independent experiment. The error-model, G_ε(q^-1, θ_ε), is obtained from the residual data as ε_t = G_ε(q^-1, θ_ε) u_t + v_t, where v_t is a zero-mean Gaussian noise sequence that is assumed to be uncorrelated with the input signal u_t. Thus, we consider a 10th-order FIR system model for G_ε(q^-1, θ_ε), and we use PEM [2] to estimate the error-model parameters θ_ε for each experiment. The uncertainty region is computed by adding, frequency by frequency, the average of all the estimated error-models to the nominal model. Finally, we consider the measurements of 25 independent experiments, and we obtain an estimate of the system model in Equation (39), Ĝ(q^-1, θ)_(PEM), using PEM with n_a = n_b = 1 and n_c = n_d = 0 for each independent experiment, in order to validate the uncertainty region estimated with both the SE and MEM approaches.
Figure 4a shows the magnitude of the frequency response corresponding to the nominal model and the uncertainty region estimated using the SE approach. The red-dashed line corresponds to the system model in Equation (39) considering the time constant without uncertainty, i.e., G(q^-1, τ_o). The blue line corresponds to the average over all MC simulations of the estimated nominal model. We observe an accurate estimation of the nominal model, with τ̂_o = 9.7 × 10^-3 ± 0.17 × 10^-3 and noise variance σ̂² = 0.133 ± 5.9 × 10^-3. The blue-shaded area corresponds to the uncertainty region estimated for the true system. Similarly, Figure 4b shows the magnitude of the frequency response corresponding to the estimated nominal model (blue line) and the uncertainty region estimated from the residuals using MEM (blue-shaded area). We also observe an accurate estimation of the nominal model, with τ̂_o = 10.1 × 10^-3 ± 2.30 × 10^-3 and noise variance σ̂² = 0.099 ± 2.85 × 10^-3. In addition, we observe in Figure 4a,b that the PEM estimates, Ĝ(q^-1, θ)_(PEM), lie in the uncertainty region obtained using both the SE approach and MEM. However, the SE method describes the uncertainty region better than the MEM method, specifically the error-model at low frequencies.
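The MEM residual step can be sketched as an ordinary least-squares FIR fit from the input to the residuals; the function name is hypothetical, and PEM for an FIR model with white noise v_t reduces to exactly this least-squares problem.

```python
import numpy as np

def mem_error_model(u, eps, order=10):
    """Least-squares FIR fit of the residuals (MEM step, sketch).

    u: input signal; eps: residuals y - G_o_hat u.
    Returns the FIR taps theta_eps of G_eps(q^-1, theta_eps).
    """
    N = len(u)
    Phi = np.zeros((N, order))
    for k in range(order):          # regressor column k: input delayed by k
        Phi[k:, k] = u[:N - k]
    theta_eps, *_ = np.linalg.lstsq(Phi, eps, rcond=None)
    return theta_eps
```

Repeating this fit per experiment and averaging the resulting frequency responses yields the MEM uncertainty region compared against the SE region in Figure 4.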

Conclusions
In this paper, we have addressed the uncertainty modeling problem by combining the SE approach with GMMs. We proposed an identification algorithm based on the ML principle to estimate both the nominal model and the distribution of the parameters of the error-model as a GMM. An EM algorithm was proposed to solve the ML estimation problem, providing closed-form expressions for the estimators of the GMM parameters and the noise variance. In the simulation examples, we considered two scenarios: (i) identification of a system model with a structure defined with the SE approach, using a nominal model and an error-model given by an FIR system with a Gaussian mixture distribution, and (ii) identification of a system model with non-Gaussian parameter uncertainties that does not correspond to the SE framework. For both cases, we obtained accurate estimates of the nominal system model. Finally, our proposed method provided a good description of the uncertainty region of the system in the simulation examples.

Data Availability Statement:
No new data were created or analyzed in this study. Data sharing is not applicable to this article.

Conflicts of Interest:
The authors declare no conflict of interest.

Abbreviations
The following abbreviations are used in this manuscript: ML (maximum likelihood), EM (expectation-maximization), GMM (Gaussian mixture model), SE (stochastic embedding), PEM (prediction error method), MEM (model error modeling), FIR (finite impulse response), PDF (probability density function), MC (Monte Carlo).

Appendix A. Proof of Lemma 1

Consider that the observed data Y^[1:M] is a collection of measurements for each experiment r = 1, ..., M. Then, the likelihood function, L(β), for the system model in Equation (13) using the GMM in Equation (4) can be obtained by marginalizing the hidden variable [η^[1] · · · η^[M]]. The log-likelihood function, ℓ(β), is then given by Equation (A4). Substituting Equations (A2) and (A3) in K_i(β, η^[r]), and using the identities in Equations (A5)-(A7), we obtain Equation (A8). Substituting Equation (A8) in Equation (A4), we directly obtain the log-likelihood function in Equation (16). Then, the ML estimator corresponds to Equation (15).

Appendix B. Computing the Complete Likelihood Function
From the likelihood function in Equation (16), we obtain the per-component quantities μ̂_ir and Σ̂_ir given by Equations (A9) and (A10), respectively. Let us consider the indicator variable ζ_{1:M} introduced in Section 4. Defining the complete data set as {Y^[1:M], ζ_{1:M}} and utilizing Bayes' rule, the complete likelihood function is obtained in Equations (A16) and (A17). The complete log-likelihood function is then given by Equation (A20). Using Equations (A16) and (A17) in Equation (A20), we directly obtain Equation (21).
Appendix C. Proof of Theorem 1

For the parameter α_i, we define F(α_i), subject to Σ_{i=1}^κ α_i = 1. Then, using a Lagrange multiplier ν to deal with the constraint on α_i, we define the Lagrangian in Equation (A33). Taking the derivative of Equation (A33) with respect to α_i and ν and equating to zero, we obtain Equation (A34). Then, α_i = (Σ_{r=1}^M ζ̂_ir^(m))/ν. Taking the summation over i = 1, ..., κ, using Equation (A35), and noticing that Σ_{i=1}^κ ζ̂_ir^(m) = 1, we have ν = M. Then, we can obtain Equation (26). Notice that 0 ≤ α̂_i^(m+1) ≤ 1 holds even though we did not explicitly consider this constraint in Equation (A33).