A Two-Filter Approach for State Estimation Utilizing Quantized Output Data

Filtering and smoothing algorithms are key tools for developing decision-making strategies and parameter identification techniques in different areas of research, such as economics, financial data analysis, communications, and control systems. These algorithms are used to estimate the system state based on the sequentially available noisy measurements of the system output. In real-world systems, the noisy measurements can suffer a significant loss of information due to (among other causes): (i) the reduced resolution of the cost-effective sensors typically used in practice or (ii) a digitalization process for storing or transmitting the measurements through a communication channel using a minimum amount of resources. Thus, obtaining suitable state estimates in this context is essential. In this paper, Gaussian sum filtering and smoothing algorithms are developed in order to deal with noisy measurements that are also subject to quantization. In this approach, the probability mass function of the quantized output given the state is characterized by an integral equation. This integral was approximated by using a Gauss–Legendre quadrature; hence, a model with a Gaussian mixture structure was obtained. This model was then used to develop the filtering and smoothing algorithms. The benefits of this proposal, in terms of the accuracy of the estimation and the computational cost, are illustrated via numerical simulations.


Introduction
It is well known that discrete-time dynamical systems can be described by first-order difference equations relating internal variables called states [1]. State estimation is a scientific discipline that studies methodologies and algorithms for estimating the state of dynamical systems from input-output measurements [2,3]. There are a variety of applications that use state estimation, such as control [4][5][6][7], parameter identification [8][9][10], power systems [11,12], fault detection [13][14][15][16][17], prognosis [18,19], cyber-physical systems [20], hydrologic and geophysical data assimilation [21,22], maritime tracking [23], consensus-based state estimation using wireless sensor networks [24][25][26], navigation systems [27], and transportation and highway traffic management [28][29][30], to mention a few. Depending on the measurements that are used, two classes of state estimation algorithms can be distinguished: filtering and smoothing. Filtering algorithms estimate the current state using measurements up to the current instant, whereas smoothing algorithms estimate the state at some time in the past using measurements up to the current instant [23,31].
In general, the experimental noisy data can suffer a significant loss of information introduced by low-resolution, cost-effective sensors [32] or by the digitalization process [33]. For linear systems and Gaussian noises (without quantization), the optimal estimators of the system state are the celebrated Kalman filter [1] and smoother [22]. However, in the most general case, i.e., nonlinear systems and non-Gaussian noises, it is not possible to obtain an optimal estimator, because the computation of some integrals in the filtering and smoothing equations is difficult or the integrals are simply intractable. As mentioned above, the quantization process is a nonlinear map that produces a significant loss of information on the system dynamics, which leads to biased state estimates and an incorrect characterization of the filtering and smoothing probability density functions (PDFs). In this context, several suboptimal filtering and smoothing algorithms for state-space systems with quantized data have been developed, for instance, standard [49,54], unscented [55], and extended [36] Kalman filters for quantized data, in which some structural elements of the state-space models and the quantizer are exploited. Sequential Monte Carlo methods have also been used for filtering and smoothing with quantized data, where complex integrals are approximated by a set of weighted samples called particles [31], which define (approximately) a desired PDF. However, the most common difficulty in these methods is dealing with the quantizer model. Some approaches have been proposed for this purpose, such as a gradient-based approximation of the quantizer [56] or modeling the quantizer as uniformly distributed additive noise [32,57]. In [58], an approximation of the integral in the non-Gaussian probability mass function (PMF) of the quantized output given the state was proposed by using Gaussian quadrature rules in order to deal with binary data.
This approximation naturally yields an explicit Gaussian mixture model (GMM) form for the PMF of the quantized output. In this paper, the approximation of this integral was extended by considering more general (finite- and infinite-level) quantizers, and it was used to develop Gaussian sum filtering and smoothing algorithms.

Main Contributions
The main contributions of this work are:

1. Developing an explicit model (of the GMM form) for the PMF of the quantized output, considering finite- and infinite-level quantizers, in order to solve the filtering and smoothing recursions in closed form within a Bayesian framework;
2. Designing Gaussian sum filtering and smoothing algorithms to deal with quantized data, providing closed-form expressions for the state estimates and for the filtering and smoothing PDFs.
The filtering algorithm for quantized data presented in this paper includes, as a particular case, the filtering algorithm presented in [58], where only the case of binary data was considered. Additionally, the smoothing algorithm presented here, based on the approximation of the PMF of the quantized output given the state, is completely novel. The remainder of the paper is organized as follows: In Section 2, the problem of interest is defined. In Section 3, the characterization and approximation of the PMF of the quantized data given the state are presented, and, using this explicit model, the Gaussian sum filtering and smoothing algorithms are developed. In Section 4, some examples are presented to show the benefits of this proposal in terms of accuracy and computational cost. Finally, in Section 5, conclusions are presented.

System Model
Consider the state-space model for a discrete-time, linear time-invariant system with quantized output (see Figure 2):

x_{t+1} = A x_t + B u_t + w_t, (1)
z_t = C x_t + D u_t + v_t, (2)
y_t = q{z_t}, (3)

where x_t ∈ R^n is the state vector, z_t ∈ R is the nonquantized output, y_t ∈ R is the quantized output, and u_t ∈ R^m is the input of the system. The matrices are A ∈ R^{n×n}, B ∈ R^{n×m}, C ∈ R^{1×n}, and D ∈ R^{1×m}. The nonlinear map q{·} is the quantizer. The state noise w_t ∈ R^n and the output noise v_t ∈ R are zero-mean white Gaussian noises with covariance matrices Q and R, respectively. The system in (1)-(3) can be described using the initial state distribution and the state transition and conditional nonquantized output probability distributions:

p(x_1) = N_{x_1}(µ_1, P_1), (4)
p(x_{t+1}|x_t) = N_{x_{t+1}}(A x_t + B u_t, Q), (5)
p(z_t|x_t) = N_{z_t}(C x_t + D u_t, R), (6)

where N_x(µ, P) represents the PDF of a Gaussian distribution with mean µ and covariance matrix P in the variable x. The initial condition x_1, the model noise w_t, and the measurement noise v_t are statistically independent random variables.
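As a concrete illustration, the system above can be simulated in a few lines. The sketch below assumes a uniform (infinite-level) quantizer q{z} = ∆_q·round(z/∆_q); the matrix values, the function name, and the simulation horizon are illustrative choices, not the paper's example.

```python
import numpy as np

def simulate_quantized_lti(A, B, C, D, Q, R, u, x1, delta_q, rng):
    """Simulate x_{t+1} = A x_t + B u_t + w_t, z_t = C x_t + D u_t + v_t,
    and y_t = q{z_t} with a uniform quantizer of step delta_q."""
    n, N = A.shape[0], u.shape[1]
    x = np.zeros((n, N))
    z = np.zeros(N)
    x[:, 0] = x1
    for t in range(N):
        v = np.sqrt(R) * rng.standard_normal()           # output noise v_t ~ N(0, R)
        z[t] = (C @ x[:, t] + D @ u[:, t]).item() + v    # z_t = C x_t + D u_t + v_t
        if t < N - 1:
            w = rng.multivariate_normal(np.zeros(n), Q)  # state noise w_t ~ N(0, Q)
            x[:, t + 1] = A @ x[:, t] + B @ u[:, t] + w  # x_{t+1} = A x_t + B u_t + w_t
    y = delta_q * np.round(z / delta_q)                  # y_t = q{z_t}
    return x, z, y

# Illustrative second-order system (values assumed for demonstration only)
A = np.array([[0.9, 0.1], [0.0, 0.7]])
B = np.array([[1.0], [0.5]])
C = np.array([[1.0, 0.0]])
D = np.array([[0.0]])
Q = 0.01 * np.eye(2)
R = 0.1
u = np.ones((1, 200))
x, z, y = simulate_quantized_lti(A, B, C, D, Q, R, u, np.zeros(2),
                                 1.0, np.random.default_rng(1))
```

Each quantized sample y_t is a multiple of ∆_q, while the underlying z_t remains continuous; this is the data the filtering and smoothing algorithms below operate on.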

Quantizer Model
Let the quantizer q{·} : R → V be a map from the real line, partitioned into the set of intervals {J_i ⊂ R : i ∈ I}, to the finite or countably infinite set V = {β_i ∈ R : i ∈ I} [59], where I is a set of indices defined by the quantizer type. In this work, two quantizers were considered. The first is an infinite-level quantizer, in which the output of the quantizer has infinitely (countably) many quantization levels, corresponding to the index set:

I = {. . . , −1, 0, 1, . . .}. (7)

The definition of the infinite-level quantizer is as follows (see Figure 3 (left)):

y_t = q{z_t} = β_i if z_t ∈ J_i, (8)

where the sets J_i = {z_t : q_{i−1} ≤ z_t < q_i} are disjoint intervals and each β_i is the value that the quantizer takes in the region J_i. The second is a finite-level quantizer, in which the output of the quantizer is limited to minimum and maximum values (saturated quantizer), corresponding to the index set:

I = {1, 2, . . . , L}. (9)

The definition of the finite-level quantizer (see Figure 3 (right)) is similar to (8), where I is defined in (9) and the extreme regions J_1 and J_L are semi-infinite intervals.

Problem Definition
The problem of interest can be defined as follows: Given the available data u_{1:N} = {u_1, u_2, . . . , u_N} and y_{1:N} = {y_1, y_2, . . . , y_N}, where N is the data length, obtain the filtering and smoothing PDFs of the state given the quantized measurements, p(x_t|y_{1:t}) and p(x_t|y_{1:N}), respectively; the state estimators:

x̂_{t|t} = E{x_t|y_{1:t}}, (10)
x̂_{t|N} = E{x_t|y_{1:N}}, (11)

and the corresponding covariance matrices of the estimation error:

Σ_{t|t} = E{(x_t − x̂_{t|t})(x_t − x̂_{t|t})^T | y_{1:t}}, (12)
Σ_{t|N} = E{(x_t − x̂_{t|N})(x_t − x̂_{t|N})^T | y_{1:N}}, (13)

where t ≤ N and E{x|y} denotes the conditional expectation of x given y.

Gaussian Sum Filtering and Smoothing for Quantized Data
Here, the Gaussian sum filtering and smoothing algorithms for quantized output data are explained in detail.

Gaussian Mixture Models
Gaussian mixture models refer to a convex combination of Gaussian densities corresponding to a random variable ζ ∈ R^n. Then, the PDF can be written as [60]:

p(ζ) = Σ_{i=1}^{M} ϕ_i N_ζ(υ_i, Γ_i), with ϕ_i ≥ 0 and Σ_{i=1}^{M} ϕ_i = 1,

where ϕ_i is the ith mixing weight, υ_i ∈ R^n is the ith mean, and Γ_i ∈ R^{n×n} is the ith covariance matrix. GMMs are used in a variety of applications, such as approximating non-Gaussian densities [61][62][63] and filtering and smoothing [64,65], to mention a few.
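The mixture density above can be evaluated directly from its parameters. The helper below (an illustrative implementation, not taken from the paper) evaluates p(ζ) at a point ζ ∈ R^n:

```python
import numpy as np

def gmm_pdf(zeta, weights, means, covs):
    """Evaluate p(zeta) = sum_i phi_i * N_zeta(upsilon_i, Gamma_i)."""
    zeta = np.atleast_1d(zeta)
    n = zeta.size
    total = 0.0
    for phi, mu, gamma in zip(weights, means, covs):
        mu = np.atleast_1d(mu)
        gamma = np.atleast_2d(gamma)
        diff = zeta - mu
        # Gaussian normalization constant and quadratic form
        norm = np.sqrt((2 * np.pi) ** n * np.linalg.det(gamma))
        expo = -0.5 * diff @ np.linalg.solve(gamma, diff)
        total += phi * np.exp(expo) / norm
    return total
```

Because the weights are convex, the mixture integrates to one whenever each component is a proper Gaussian density.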

General Bayesian Framework
The well-known equations for Bayesian filtering (see, e.g., [31]) are given by:

p(x_t|y_{1:t}) = p(y_t|x_t) p(x_t|y_{1:t−1}) / p(y_t|y_{1:t−1}), (14)
p(x_{t+1}|y_{1:t}) = ∫ p(x_{t+1}|x_t) p(x_t|y_{1:t}) dx_t, (15)

where (14) and (15) are the measurement-update and time-update equations, respectively. The PDF p(x_{t+1}|x_t) is obtained from the model in (5), and p(y_t|y_{1:t−1}) is a normalization constant. On the other hand, the well-known formula for Bayesian smoothing (see, e.g., [31]) is defined by:

p(x_t|y_{1:N}) = p(x_t|y_{1:t}) ∫ [p(x_{t+1}|x_t) p(x_{t+1}|y_{1:N}) / p(x_{t+1}|y_{1:t})] dx_{t+1}. (16)

However, the PDF in (16) is difficult to compute because of the division by a non-Gaussian density. This difficulty was overcome in [64], where the smoothing equations were separated into a two-filter formulation: (i) the first formula, called the backward filter, defined by the following recursion:

p(y_{t+1:N}|x_t) = ∫ p(y_{t+1:N}|x_{t+1}) p(x_{t+1}|x_t) dx_{t+1}, (17)
p(y_{t:N}|x_t) = p(y_t|x_t) p(y_{t+1:N}|x_t), (18)

where (17) and (18) are the backward-prediction and backward-measurement-update equations, respectively, and (ii) the second formula, given by:

p(x_t|y_{1:N}) = p(x_t|y_{1:t−1}) p(y_{t:N}|x_t) / p(y_{t:N}|y_{1:t−1}), (19)

where p(x_t|y_{1:t−1}) is the time-update equation from the filtering step, p(y_{t:N}|y_{1:t−1}) is a normalization constant, and p(y_{t:N}|x_t) is obtained using the backward recursion given in (17) and (18). Due to the non-Gaussianity of p(y_t|x_t), the integrals in both the filtering and backward-filtering algorithms in (15) and (17), respectively, are difficult to compute or intractable. Widely used methods to deal with these integrals are Monte-Carlo-based algorithms, such as particle filtering and smoothing; see, e.g., [31,66]. These methods represent the posterior distributions p(x_t|y_{1:t}) and p(x_t|y_{1:N}) by a set of weighted random samples. However, in general, they exhibit a growing computational complexity as the model dimension increases. Here, an alternative method to compute the PMF p(y_t|x_t) using the Gauss-Legendre quadrature is proposed. This procedure results in a GMM form. Hence, the integrals in both the filtering and backward-filtering recursions in (15) and (17), respectively, can be computed in closed form.

Remark 1.
Notice that since y t is a discrete random variable, the measurement-update equation in (14) and the backward-measurement-update equation in (18) comprise PDFs and a PMF. Hence, generalized probability density functions are used here; see, e.g., [67].

Computing an Explicit Model of p(y t |x t )
From (8), it is observed that the output z_t ∈ J_i is mapped to a single value β_i. Then, the probability that y_t takes the value β_i is the same as the probability of z_t belonging to the set J_i, as shown in Figure 4. Notice that the quantizer regions J_i include finite and semi-infinite intervals. Figure 4. The shaded area represents the probability of y_t taking the value β_i, which is equal to the probability of z_t belonging to the set J_i. Here, P{·} denotes probability.
In the following theorem, the characterization of p(y_t|x_t) is formalized via the computation of the probability P{z_t ∈ J_i} shown in Figure 4, through the integral definition of probabilities [68]. Then, the approximation of this integral by the Gauss-Legendre quadrature rule is presented (see, e.g., [69]).

Theorem 1. Consider the system (1)-(3) and the infinite- and finite-level quantizers defined in (8). Then, the PMF of the discrete random variable y_t given the state x_t is given by:

p(y_t|x_t) = ∫_{a_t}^{b_t} N_r(0, R) dr, (20)

where a_t and b_t are functions of the boundary values of each region of the quantizers and are defined in Table 1. In addition, the integral in (20) can be approximated using the Gauss-Legendre quadrature rule, yielding:

p(y_t|x_t) ≈ Σ_{τ=1}^{K} ω_τ ς_τ^t N_{η_τ^t}(µ_τ^t, R), (21)

where K is the number of points of the Gauss-Legendre quadrature rule (defined by the user), ς_τ^t, η_τ^t, and µ_τ^t are defined in Table 1, and ω_τ and ψ_τ are the weights and points defined by the quadrature rule, given in, e.g., [69].

Table 1. Integral limits and parameters of Theorem 1. ILQ and FLQ mean infinite- and finite-level quantizers, respectively.

ILQ: y_t = β_i with i ∈ I in (7), and FLQ: y_t = β_i with i = 2, . . . , L − 1:  a_t = q_{i−1} − Cx_t − Du_t,  b_t = q_i − Cx_t − Du_t.
FLQ: y_t = β_1:  a_t = −∞,  b_t = q_1 − Cx_t − Du_t.
FLQ: y_t = β_L:  a_t = q_{L−1} − Cx_t − Du_t,  b_t = +∞.
(The corresponding values of ς_τ^t, η_τ^t, and µ_τ^t follow from the changes of variables used in the proof.)

Proof. From the infinite- and finite-level quantizers, it is observed that the random variable y_t can only take the values β_i with i in the index sets given in (7) and (9). Then, the probability of each value y_t = β_i, with i ranging over (7) for the ILQ and over (9) for the FLQ, is the same as the probability that the random variable z_t belongs to the set J_i. This probability can be obtained from the distribution function as follows:

P{y_t = β_i|x_t} = P{z_t ∈ J_i|x_t}, (22)

where P{·} denotes probability. Considering the infinite-level quantizer and the output equation in (2), the following expression is obtained for y_t = β_i with i ∈ I in (7):

P{y_t = β_i|x_t} = P{a_t ≤ v_t < b_t}, (23)

where a_t = q_{i−1} − Cx_t − Du_t and b_t = q_i − Cx_t − Du_t. Additionally, for the finite-level quantizer, (23) holds for y_t = β_i with i = 2, . . . , L − 1, and for y_t = β_1 and y_t = β_L, the following holds:

P{y_t = β_1|x_t} = P{v_t < b_t}, (24)
P{y_t = β_L|x_t} = P{v_t ≥ a_t}, (25)

where b_t = q_1 − Cx_t − Du_t and a_t = q_{L−1} − Cx_t − Du_t. Then, using the fact that p(v_t) = N_{v_t}(0, R) and using (23)-(25), the integral in (20) is obtained, where the integral limits are given in Table 1.
On the other hand, the Gauss-Legendre quadrature for approximating a definite integral over the interval [−1, 1] is given by:

∫_{−1}^{1} f(ψ) dψ ≈ Σ_{τ=1}^{K} ω_τ f(ψ_τ), (26)

where ω_τ and ψ_τ are the quadrature weights and the roots of the order-K Legendre polynomial, respectively; see, e.g., [69]. Notice that the integral over [a_t, b_t] can be mapped onto [−1, 1] using r = ((b_t − a_t)ψ + b_t + a_t)/2, and, using the definition in (26), this integral is approximated by:

∫_{a_t}^{b_t} f(r) dr ≈ ((b_t − a_t)/2) Σ_{τ=1}^{K} ω_τ f(((b_t − a_t)ψ_τ + b_t + a_t)/2). (27)

Since N_v(0, R) = N_{v+Cx_t+Du_t}(Cx_t + Du_t, R), defining ς_τ^t = (q_i − q_{i−1})/2, η_τ^t = ((q_i − q_{i−1})ψ_τ + q_i + q_{i−1})/2, and µ_τ^t = Cx_t + Du_t for i ∈ I in (7) yields:

p(y_t|x_t) ≈ Σ_{τ=1}^{K} ω_τ ς_τ^t N_{η_τ^t}(µ_τ^t, R), (28)

and the approximation of p(y_t|x_t) given in (21) for the infinite-level quantizer is derived.
For the finite-level quantizer, (28) holds for y_t = β_i with i = 2, . . . , L − 1. Then, the approximation of p(y_t|x_t) for y_t = β_1 and y_t = β_L is defined. First, it is observed that the integral over the semi-infinite interval [a_t, ∞) can be mapped onto the interval (0, 1] using r = a_t + (1 − ξ)/ξ, so that:

∫_{a_t}^{∞} f(r) dr = ∫_{0}^{1} f(a_t + (1 − ξ)/ξ) (1/ξ²) dξ. (29)

Then, using an appropriate change of variable, it can be mapped onto the interval (−1, 1], and, using the Gauss-Legendre approximation given in (26) with suitably defined ς_τ^t, η_τ^t, and µ_τ^t (see Table 1), the approximation of p(y_t|x_t) for y_t = β_L in (21) is obtained. A similar procedure can be applied to the integral over the semi-infinite interval (−∞, b_t], using r = b_t − (1 − ξ)/ξ, so that the approximation of p(y_t|x_t) for y_t = β_1 in (21) is obtained. This completes the proof.
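The finite-interval case of Theorem 1 can be checked numerically. The sketch below (function names and the CDF-based validation helper are ours, not the paper's) approximates P{a_t ≤ v_t < b_t} for v_t ~ N(0, R) with a K-point Gauss-Legendre rule and compares it to the exact Gaussian-CDF value:

```python
import math
import numpy as np

def normal_pdf(r, R):
    """Density N_r(0, R) of a scalar zero-mean Gaussian with variance R."""
    return math.exp(-0.5 * r * r / R) / math.sqrt(2.0 * math.pi * R)

def prob_interval_gl(a, b, R, K):
    """K-point Gauss-Legendre approximation of P{a <= v < b}, v ~ N(0, R),
    as in the finite-interval case of Theorem 1."""
    psi, omega = np.polynomial.legendre.leggauss(K)  # nodes/weights on [-1, 1]
    half = 0.5 * (b - a)        # interval scaling factor (the varsigma term)
    mid = 0.5 * (b + a)
    eta = half * psi + mid      # quadrature nodes mapped onto [a, b]
    return half * sum(w * normal_pdf(r, R) for w, r in zip(omega, eta))

def prob_interval_exact(a, b, R):
    """Exact value via the Gaussian CDF, for validation only."""
    s = math.sqrt(R)
    Phi = lambda x: 0.5 * (1.0 + math.erf(x / (s * math.sqrt(2.0))))
    return Phi(b) - Phi(a)
```

Because the integrand is smooth, the Gauss-Legendre approximation converges very quickly: for moderate intervals, K around 10 already matches the exact probability to near machine precision.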

Remark 2.
Notice that any quadrature rule, such as Newton-Cotes, Gauss-Laguerre, or Gauss-Hermite (see, e.g., [69,70]), used to approximate the integral in (20) yields a weighted sum of Gaussian distributions evaluated at a linear function of the values q_i and the state x_t. Furthermore, it is possible to interpret p(y_t|x_t) as a weighted sum of Gaussian distributions in the random variable x_t. This weighted sum is denoted as the GMM structure. Thus, this structure is considered for developing the Gaussian sum filtering and smoothing algorithms that deal with quantized data.

Remark 3.
Notice that in [70], a suboptimal filtering algorithm called the quadrature Kalman filter was developed, where a linear regression is used to linearize the nonlinear process and measurement functions via the Gauss-Hermite quadrature rule. This approach is not directly applicable to the problem of interest in this paper, since the quantizer is a nondifferentiable nonlinearity.
On the other hand, the cubature Kalman filter [70] and smoother [71] are approaches that use the spherical-radial cubature rule to approximate the n-dimensional integrals when computing the expected values in (10)-(13) in a nonlinear state-space model. These integral approximations are obtained under the assumption that p(x t |y 1:t ) and p(x t |y 1:N ) are Gaussian distributed. The difference between these approaches and the proposed method in this paper is that the Gauss quadrature rule was used to approximate the integral in the probabilistic model of p(y t |x t ). It is clear that in this paper, the Gaussian assumption in the filtering and smoothing densities was not used. In fact, it is shown that p(x t |y 1:t ) and p(x t |y 1:N ) are non-Gaussian PDFs.

Understanding the Gaussian Sum Filtering and Smoothing Algorithms
The general Bayesian framework for filtering leads to the product of p(y_t|x_t) and p(x_t|y_{1:t−1}), as shown in (14). From the definition of p(y_t|x_t) in (21), it is observed that p(x_t|y_{1:t}) results from the product of two GMMs. This, in turn, results in a GMM with more components than the individual factors p(y_t|x_t) and p(x_t|y_{1:t−1}). This implies that the time-update equation p(x_{t+1}|y_{1:t}) in (15) is also a GMM distribution. A similar situation occurs in the backward-measurement-update equation p(y_{t:N}|x_t) in (18), which is the product of the two GMM structures p(y_t|x_t) and p(y_{t+1:N}|x_t). This yields another GMM structure with more components than the individual factors, which implies that the backward-prediction equation in (17) is also a GMM structure. For clarity of presentation, it is necessary to express the product of two GMMs (structures) as a single one.
In order to understand the transformation of the product of two summations into another summation, consider the sums g = Σ_{τ=1}^{K} g_τ and f = Σ_{ℓ=1}^{M} f_ℓ. Then, for each two-tuple (τ, ℓ), where τ = 1, . . . , K and ℓ = 1, . . . , M, the product h = f g is another summation with KM terms h_k = f_ℓ g_τ, indexed by k = (ℓ − 1)K + τ. Then, reordering these terms, the following sum is obtained:

h = Σ_{k=1}^{KM} h_k, with h_k = f_ℓ g_τ and k = (ℓ − 1)K + τ.
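The reindexing above can be verified with a short script (variable names are illustrative):

```python
import numpy as np

# Flattening the product of two summations: with f = sum_{l=1}^{M} f_l and
# g = sum_{tau=1}^{K} g_tau, the product h = f * g becomes a single sum of
# K*M terms h_k = f_l * g_tau indexed by k = (l - 1)*K + tau.
K, M = 3, 4
rng = np.random.default_rng(0)
g_terms = rng.random(K)
f_terms = rng.random(M)

h_terms = np.zeros(K * M)
for l in range(1, M + 1):           # 1-based indices, as in the text
    for tau in range(1, K + 1):
        k = (l - 1) * K + tau       # flattened index
        h_terms[k - 1] = f_terms[l - 1] * g_terms[tau - 1]

# The flattened sum equals the product of the two original sums.
assert np.isclose(h_terms.sum(), f_terms.sum() * g_terms.sum())
```

This is exactly the bookkeeping used below when the double sums in the measurement and backward-measurement updates are rewritten as single sums.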

Gaussian Sum Filtering for Quantized Data
Using the approximation of the PMF p(y_t|x_t) defined in Theorem 1, the Gaussian sum filtering algorithm can be derived using (14) and (15) as follows:

Theorem 2 (Gaussian sum filter). Consider the system in (4)-(6) and the approximation of p(y_t|x_t) in (21). The filtering equations for state-space systems with quantized output data are defined as follows: p(x_1) = N_{x_1}(µ_1, P_1) is the prior distribution of the initial state. Then, for t = 1, . . . , N, the measurement-update and time-update equations for the Gaussian sum filtering are defined as follows:

Measurement update: The PDF of the current state x_t given the current and previous measurements, that is, y_1, . . . , y_t, is the GMM given by:

p(x_t|y_{1:t}) = Σ_{k=1}^{M_{t|t}} γ_{t|t}^k N_{x_t}(x̂_{t|t}^k, Σ_{t|t}^k), (33)

where γ_{t|t}^k, x̂_{t|t}^k, and Σ_{t|t}^k are given in Appendix B, M_{t|t} = K M_{t|t−1}, and M_{t|t−1} is the number of Gaussian components in the time-update step. The initial values satisfy M_{1|0} = 1, γ_{1|0} = 1, x̂_{1|0} = µ_1, and Σ_{1|0} = P_1.

Time update:
The PDF of the future state x_{t+1}, i.e., the one-step-ahead prediction given the measurements y_1, . . . , y_t, is the GMM given by:

p(x_{t+1}|y_{1:t}) = Σ_{k=1}^{M_{t+1|t}} γ_{t+1|t}^k N_{x_{t+1}}(x̂_{t+1|t}^k, Σ_{t+1|t}^k), (34)

where M_{t+1|t} = M_{t|t}, and γ_{t+1|t}^k, x̂_{t+1|t}^k, and Σ_{t+1|t}^k are given in Appendix B.
Proof. Consider the recursion in (14) and (15). The required PDF p(x_{t+1}|x_t) and PMF p(y_t|x_t) are obtained from (5) and (21), respectively. Then, using the measurement-update equation in (14) and Lemma 2 in [65] (p. 86), the following expression is obtained:

p(x_t|y_{1:t}) ∝ Σ_{ℓ=1}^{M_{t|t−1}} Σ_{τ=1}^{K} γ_{t|t−1}^ℓ ω_τ ς_τ^t N_{η_τ^t}(κ_τ^t, V_t) N_{x_t}(x̂_{t|t}^k, Σ_{t|t}^k), (35)

where κ_τ^t, V_t, x̂_{t|t}^k, and Σ_{t|t}^k are defined in (A11), (A12), (A7), and (A8), respectively. Notice that N_{η_τ^t}(κ_τ^t, V_t) is a coefficient, since the measurement y_t is available. Then, rewriting the double summation in (35) as a single one with the new index k = (ℓ − 1)K + τ, the measurement-update equation in (33) is derived, defining M_{t|t} = K M_{t|t−1} and the weights γ_{t|t}^k as in (A9), and computing the corresponding normalization constant. On the other hand, using the time-update equation in (15) and Lemma 2 in [65] (p. 86), the following equation is obtained:

p(x_{t+1}|y_{1:t}) = Σ_{k=1}^{M_{t|t}} γ_{t|t}^k ∫ N_{x_{t+1}}(A x_t + B u_t, Q) N_{x_t}(x̂_{t|t}^k, Σ_{t|t}^k) dx_t, (36)

where x̂_{t+1|t}^k and Σ_{t+1|t}^k are defined in (A14) and (A15), respectively. Then, solving this integral, the time-update equation in (34) is derived, defining M_{t+1|t} = M_{t|t} and the weights γ_{t+1|t}^k as in (A13), and computing the corresponding normalization constant. This completes the proof.
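The per-component computation in the measurement update is a standard Kalman-style update of one prior Gaussian against one Gaussian likelihood factor centered at a quadrature node. The sketch below is a minimal illustration for a scalar output, omitting the D u_t term; the function name and interface are ours, not the paper's:

```python
import numpy as np

def component_measurement_update(x_prior, P_prior, C, eta, R):
    """Update a single Gaussian component N_x(x_prior, P_prior) with the
    likelihood factor N_eta(C x, R), as done for each (component, node)
    pair. Returns the posterior mean/covariance and the scalar coefficient
    N_eta(C x_prior, C P_prior C^T + R)."""
    C = np.atleast_2d(C)                        # 1 x n output matrix
    S = (C @ P_prior @ C.T).item() + R          # innovation variance
    K_gain = (P_prior @ C.T) / S                # n x 1 gain
    innov = eta - (C @ x_prior).item()          # innovation at the node eta
    x_post = x_prior + (K_gain * innov).ravel()
    P_post = P_prior - K_gain @ C @ P_prior
    coeff = np.exp(-0.5 * innov**2 / S) / np.sqrt(2 * np.pi * S)
    return x_post, P_post, coeff
```

In the full algorithm, `coeff` (together with the quadrature weight ω_τ and scale ς_τ^t) multiplies the old mixture weight, and the results are renormalized to obtain the new weights.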

Computing the State Estimator x̂_{t|t} from a GMM
Given p(x_t|y_{1:t}) in (33) as a GMM, the state estimator in (10) and the covariance matrix of the estimation error in (12) can be computed as follows:

x̂_{t|t} = Σ_{k=1}^{M_{t|t}} γ_{t|t}^k x̂_{t|t}^k, (37)
Σ_{t|t} = Σ_{k=1}^{M_{t|t}} γ_{t|t}^k [Σ_{t|t}^k + (x̂_{t|t}^k − x̂_{t|t})(x̂_{t|t}^k − x̂_{t|t})^T]. (38)
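These moment formulas are straightforward to implement; the helper below (an illustrative implementation) returns the weighted mean and the error covariance, which adds the spread-of-means term to the weighted component covariances:

```python
import numpy as np

def gmm_moments(weights, means, covs):
    """Mean and covariance of a GMM: x_hat = sum_k gamma_k x_k and
    Sigma = sum_k gamma_k [Sigma_k + (x_k - x_hat)(x_k - x_hat)^T]."""
    weights = np.asarray(weights, dtype=float)   # shape (M,)
    means = np.asarray(means, dtype=float)       # shape (M, n)
    covs = np.asarray(covs, dtype=float)         # shape (M, n, n)
    x_hat = weights @ means
    sigma = np.zeros_like(covs[0])
    for w, m, P in zip(weights, means, covs):
        d = (m - x_hat).reshape(-1, 1)
        sigma += w * (P + d @ d.T)
    return x_hat, sigma
```

The same routine applies verbatim to the smoothing density, using the smoothing weights, means, and covariances instead.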

Backward Filtering for Quantized Data
Using the approximation of the PMF p(y_t|x_t) defined in Theorem 1, the backward filter recursion can be derived as follows:

Theorem 3 (Backward filtering). Consider the system in (4)-(6) and the approximation of p(y_t|x_t) in (21). Then, the backward filter for state-space systems with quantized output data is:

Initialization: For t = N, the backward measurement update is given by:

p(y_N|x_N) = Σ_{k=1}^{S_{N|N}} ω_k ς_k^N N_{η_k^N}(µ_k^N, R), (39)

where S_{N|N} = K, and ς_k^N, η_k^N, and µ_k^N are defined in Table 1. Then, for t = N − 1, . . . , 1, the backward-prediction and backward-measurement-update equations are defined as follows:

Backward prediction: The backward-prediction equation is given by:

p(y_{t+1:N}|x_t) = Σ_{k=1}^{S_{t|t+1}} λ_{t|t+1}^k exp(−(1/2) x_t^T F_{t|t+1}^k x_t + G_{t|t+1}^{kT} x_t + H_{t|t+1}^k), (41)

where S_{t|t+1} = S_{t+1|t+1}, S_{t+1|t+1} is the number of components in the backward-measurement-update step, and λ_{t|t+1}^k, F_{t|t+1}^k, G_{t|t+1}^{kT}, and H_{t|t+1}^k are given in Appendix C.
Backward measurement update: The distribution of y_{t:N}|x_t evaluated at y_t, . . . , y_N is:

p(y_{t:N}|x_t) = Σ_{k=1}^{S_{t|t}} λ_{t|t}^k exp(−(1/2) x_t^T F_{t|t}^k x_t + G_{t|t}^{kT} x_t + H_{t|t}^k), (42)

where S_{t|t} = K S_{t|t+1}, S_{t|t+1} is the number of components in the backward-prediction step, and λ_{t|t}^k, F_{t|t}^k, G_{t|t}^{kT}, and H_{t|t}^k are given in Appendix C.
Proof. Consider the recursion in (17) and (18). The required PDF p(x_{t+1}|x_t) and PMF p(y_t|x_t) are given by (5) and (21), respectively. The proof is carried out by induction in reverse time: first, it is verified that the result holds for t = N − 1; then, it is assumed to be true for t = s + 1; and finally, it is verified that it holds for t = s. Notice that the recursion starts at t = N with p(y_{N:N}|x_N) = p(y_N|x_N), which is defined in (39). From (17), at time t = N − 1, the backward-prediction step is defined as:

p(y_{N:N}|x_{N−1}) = ∫ p(y_{N:N}|x_N) p(x_N|x_{N−1}) dx_N, (43)

where p(y_{N:N}|x_N) is given by (39) and p(x_N|x_{N−1}) is defined by the system model in (5). From the definition given in (43), Equation (44) is obtained. Then, completing the square and solving the integral in (44), Equation (45) is obtained, where S_{N−1|N} = S_{N|N} and the remaining terms in (45) are defined in (A22)-(A26), but evaluated at t = N − 1. The backward-measurement-update step in (18) is as follows:

p(y_{N−1:N}|x_{N−1}) = p(y_{N−1}|x_{N−1}) p(y_{N:N}|x_{N−1}). (46)

Thus, using (45) in the definition given in (46), Equation (47) is obtained. Finally, rewriting the double summation in (47) as a single one with the new index k = (ℓ − 1)K + τ results in (48), where S_{N−1|N−1} = K S_{N−1|N} and the remaining terms in (48) are the same as the ones shown in (A16)-(A21), but evaluated at t = N − 1. Thus, it is concluded that the result holds for t = N − 1. A similar procedure shows that both the backward-prediction and backward-measurement-update steps of the backward filter yield the same expressions in (A22)-(A26) and (A16)-(A21), but evaluated at t = s. Thus, it is concluded that Theorem 3 holds for all t.
From Theorems 2 and 3, it is clear that the number of elements in the mixture grows exponentially with every iteration, making the algorithm computationally intractable after a few iterations. In addition, it would be necessary to store and manage a large amount of information in every iteration of the Gaussian sum filtering and of the backward recursion. Therefore, an algorithm that reduces the number of GMM components should be applied in every iteration of these two algorithms in order to keep the number of components bounded. Different methods have been proposed to perform this kind of procedure, termed Gaussian sum reduction, such as pruning, joining, and integral-squared-error-based methods; see [70]. In this work, the Kullback-Leibler approach for Gaussian sum reduction proposed in [64,72] was used. The idea behind the Gaussian sum reduction is to transform the GMM {ϕ_i, υ_i, Γ_i}_{i=1}^{J} into a mixture with a smaller number of components. In [64], it was suggested to use a measure of dissimilarity between two components and to pool the pair of components that minimizes this measure. Then, based on this idea, in [72], the Kullback-Leibler information number was used as the measure of dissimilarity. The author in [72] provided an algorithm to merge two components so that the merged component preserves the first- and second-order moments of the original two components, which is given by:

(ϕ_m, υ_m, Γ_m) = M{(ϕ_i, υ_i, Γ_i), (ϕ_j, υ_j, Γ_j)},

where M{·, ·} is a merging function such that:

ϕ_m = ϕ_i + ϕ_j,
υ_m = (ϕ_i υ_i + ϕ_j υ_j)/ϕ_m,
Γ_m = [ϕ_i (Γ_i + (υ_i − υ_m)(υ_i − υ_m)^T) + ϕ_j (Γ_j + (υ_j − υ_m)(υ_j − υ_m)^T)]/ϕ_m.

The merging function is applied to the two components i and j (i ≠ j) that minimize the dissimilarity measure D(i, j), defined as:

D(i, j) = (1/2)[(ϕ_i + ϕ_j) log det Γ_m − ϕ_i log det Γ_i − ϕ_j log det Γ_j].

The function D(i, j) satisfies D(i, j) = D(j, i) and D(i, i) = 0. This implies that the total number of pairs considered for merging is reduced to 0.5 J(J − 1). The authors in [73] used Runnalls' algorithm in a Bayesian filtering environment and modified it to include a user-defined threshold for the number of components after reduction and a user-defined threshold ε that satisfies D(i, j) < ε. In Algorithm 1, the steps for implementing the Gaussian sum filtering are summarized.
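The merge step and the Kullback-Leibler-based dissimilarity can be sketched as follows (an illustrative implementation of the moment-preserving merge and of Runnalls-style dissimilarity; function names are ours):

```python
import numpy as np

def merge_components(phi_i, ups_i, gam_i, phi_j, ups_j, gam_j):
    """Moment-preserving merge M{.,.}: the merged component keeps the total
    weight and the first- and second-order moments of the pair."""
    phi = phi_i + phi_j
    ups = (phi_i * ups_i + phi_j * ups_j) / phi
    di, dj = ups_i - ups, ups_j - ups
    gam = (phi_i * (gam_i + np.outer(di, di))
           + phi_j * (gam_j + np.outer(dj, dj))) / phi
    return phi, ups, gam

def dissimilarity(phi_i, ups_i, gam_i, phi_j, ups_j, gam_j):
    """Kullback-Leibler-based dissimilarity of merging components i and j:
    D = 0.5[(phi_i + phi_j) log det Gam_m - phi_i log det Gam_i
            - phi_j log det Gam_j]."""
    phi, _, gam = merge_components(phi_i, ups_i, gam_i, phi_j, ups_j, gam_j)
    logdet = lambda M: np.linalg.slogdet(M)[1]
    return 0.5 * (phi * logdet(gam)
                  - phi_i * logdet(gam_i) - phi_j * logdet(gam_j))
```

In a reduction pass, the pair with the smallest dissimilarity is merged repeatedly until the component count (or the dissimilarity threshold ε) is reached.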
The backward recursion in Theorem 3 is used to obtain the smoothing PDF in (19). For this purpose, p(y_{t:N}|x_t) is converted into a GMM structure in the random variable x_t. Then, the Gaussian sum reduction algorithm is applied to the GMM structure of p(y_{t:N}|x_t) to obtain:

p(y_{t:N}|x_t) ≈ Σ_{k=1}^{S_red} δ_{t|t}^k N_{x_t}(z_{t|t}^k, U_{t|t}^k), (54)

where S_red is the number of Gaussian components kept after the Gaussian reduction procedure and δ_{t|t}^k, z_{t|t}^k, and U_{t|t}^k are the corresponding weight, mean, and covariance matrix. This reduced GMM structure is used to obtain the smoothing PDFs. However, for the next step of the backward recursion, it is necessary to convert the reduced GMM structure back into the backward filter form to obtain the reduced version of the backward-measurement-update step in (42). This conversion process between the backward filter and the GMM structure is summarized in Lemma A3 in Appendix A. In Algorithm 2, the steps for implementing the backward filter are summarized.

Algorithm 2:
Backward-filtering algorithm for quantized output data
1 Input: The initial backward measurement update p(y_N|x_N) given in (39), i.e., S_{N|N}, λ_{N|N}^k, F_{N|N}^k, G_{N|N}^{kT}, and H_{N|N}^k. The set {ς_τ^t, η_τ^t, µ_τ^t} for t = 1, . . . , N computed in Algorithm 1.
2 for t = N − 1 to 1 do
3   Backward prediction:
4   Set S_{t|t+1} = S_{t+1|t+1}.
5   for k = 1 to S_{t|t+1} do
6     Compute and store λ_{t|t+1}^k, F_{t|t+1}^k, G_{t|t+1}^{kT}, and H_{t|t+1}^k according to Theorem 3.
7   end
8   Backward measurement update:
9   Set S_{t|t} = K S_{t|t+1}.
10  for ℓ = 1 to S_{t|t+1} do
11    for τ = 1 to K do
12      Compute the index k = (ℓ − 1)K + τ.
13      Compute and store λ_{t|t}^k, F_{t|t}^k, G_{t|t}^{kT}, and H_{t|t}^k according to Theorem 3.
14    end
15  end
16  Compute the GMM structure of p(y_{t:N}|x_t) using Lemma A3 in Appendix A. Perform the Gaussian sum reduction algorithm according to [73] to obtain the reduced GMM structure of p(y_{t:N}|x_t) given in (54).
17  Compute and store the backward filter form of the reduced version of p(y_{t:N}|x_t) using Lemma A3 in Appendix A.
18 end
19 Output: The backward prediction p(y_{t+1:N}|x_t) and the backward measurement update p(y_{t:N}|x_t) for t = N, . . . , 1.

Smoothing Algorithm with Quantized Data
In order to obtain the smoothing PDF in (19), the GMM structure of p(y_{t:N}|x_t) in the random variable x_t given in (54) is used. This GMM structure is multiplied by the time-update equation p(x_t|y_{1:t−1}) of the filtering algorithm given in (34). Then, the smoothing PDF is obtained from the following:

Theorem 4 (Gaussian sum smoothing). Consider the system in (4)-(6). Given p(x_t|y_{1:t−1}), p(x_N|y_{1:N}), and p(y_{t:N}|x_t), the smoothing PDF at time t = N is given by p(x_N|y_{1:N}), and for t = N − 1, . . . , 1, the PDF p(x_t|y_{1:N}) is the GMM given by:

p(x_t|y_{1:N}) = Σ_{k=1}^{S_{t|N}} γ_{t|N}^k N_{x_t}(x̂_{t|N}^k, Σ_{t|N}^k), (55)

where S_{t|N} = M_{t|t−1} S_red, M_{t|t−1} and S_red are given in (34) and (54), respectively, and γ_{t|N}^k, x̂_{t|N}^k, and Σ_{t|N}^k are given in Appendix D.
Given p(x_t|y_{1:N}) in (55) as a GMM, the state estimator in (11) and the covariance matrix of the estimation error in (13) can be computed as follows:

x̂_{t|N} = Σ_{k=1}^{S_{t|N}} γ_{t|N}^k x̂_{t|N}^k, (59)
Σ_{t|N} = Σ_{k=1}^{S_{t|N}} γ_{t|N}^k [Σ_{t|N}^k + (x̂_{t|N}^k − x̂_{t|N})(x̂_{t|N}^k − x̂_{t|N})^T]. (60)

In Algorithm 3, the steps to implement the Gaussian sum smoothing are summarized.

Algorithm 3:
Gaussian sum smoothing algorithm for quantized output data
1 Input: The PDFs p(x_t|y_{1:t−1}) and p(x_N|y_{1:N}) obtained from Algorithm 1, and p(y_{t:N}|x_t) obtained from Algorithm 2.
2 for t = N − 1 to 1 do
3   Set S_{t|N} = M_{t|t−1} S_red.
4   for ℓ = 1 to S_red do
5     for τ = 1 to M_{t|t−1} do
6       Compute the index k = (ℓ − 1)M_{t|t−1} + τ.
7       Compute and store γ_{t|N}^k, x̂_{t|N}^k, and Σ_{t|N}^k according to Theorem 4.
8     end
9   end
10  Compute the state estimation in (11) and the covariance matrix of the estimation error in (13) according to (59) and (60).
11 end
12 Output: The state estimation in (11), the covariance matrix of the estimation error in (13), and the smoothing PDFs p(x_t|y_{1:N}) for t = 1, . . . , N.
Computing the Smoothing Joint PDF p(x_{t+1}, x_t|y_{1:N})

In Theorems 2-4, the filtering and smoothing problems for quantized data are solved. However, many strategies for the identification of state-space systems [8,9] require the joint PDF p(x_{t+1}, x_t|y_{1:N}), which, for the quantized output data case, is given by the following:

Theorem 5. Consider the system in (4)-(6), the PDF p(x_t|y_{1:t}) and the backward-prediction equation p(y_{t+1:N}|x_{t+1}) given in Theorems 2 and 3, respectively, and the PDF p(x_{t+1}|x_t) given in Equation (5). Then, for t = N − 1, . . . , 1, the joint PDF p(x_{t+1}, x_t|y_{1:N}) is the GMM given by:

p(x_{t+1}, x_t|y_{1:N}) = Σ_{k=1}^{S_{t,t+1}} α_k N_{χ_t}(χ̂_{t|N}^k, E_{t|N}^k), (61)

where S_{t,t+1} = M_{t|t} S_{t+1|t+1}, M_{t|t} and S_{t+1|t+1} are given in (33) and (42), respectively, α_k, χ̂_{t|N}^k, and E_{t|N}^k are given in Appendix E, and χ_t is the extended vector given by:

χ_t = [x_{t+1}^T  x_t^T]^T. (62)

At time t = N, the joint PDF p(x_{N+1}, x_N|y_{1:N}) is given by (61) with the expressions in (63).

Proof. Using Bayes' theorem, the joint PDF p(x_{t+1}, x_t|y_{1:N}) can be obtained as:

p(x_{t+1}, x_t|y_{1:N}) = p(y_{t+1:N}|x_{t+1}) p(x_{t+1}|x_t) p(x_t|y_{1:t}) / p(y_{t+1:N}|y_{1:t}), (64)

where p(y_{t+1:N}|y_{1:t}) is a normalization constant. The required PDFs p(x_t|y_{1:t}) and p(x_{t+1}|x_t) and the backward-prediction equation p(y_{t+1:N}|x_{t+1}) are given in (33), (5), and (42), respectively. Using Lemma A2, p(x_t|y_{1:t}) in (33) can be rewritten in exponential form, where J_t^{τT} and L_t^τ are defined in (A44) and (A45), respectively. Considering p(y_{t+1:N}|x_{t+1}), p(x_t|y_{1:t}), and p(x_{t+1}|x_t), together with the extended vector of the state given in (62), the exponents of these three functions can be written in terms of χ_t. Then, adding these expressions, a single quadratic form in χ_t is derived, where F_t^τ, G_t^{τT}, and H_t^τ are defined in (A41), (A42), and (A43), respectively. Completing the square and expressing the double summation as a single one with the new index k = (ℓ − 1)M_{t|t} + τ, it is obtained that the joint PDF has the form of (61), where S_{t,t+1} = M_{t|t} S_{t+1|t+1}, ᾱ_k is defined in (A39), and χ̂_{t|N}^k and E_{t|N}^k are defined in (A37) and (A38), respectively. Finally, the smoothing joint distribution in (64) for t = N − 1, . . .
, 1 is derived by computing the normalized weights as in (A36). Notice that, for t = N, the smoothing joint PDF p(x_{N+1}, x_N|y_{1:N}) = p(x_{N+1}|x_N) p(x_N|y_{1:N}) is obtained as in (61), considering the expressions given in (63). This completes the proof.
In Algorithm 4, the steps to implement the Gaussian sum smoothing for computing the joint PDF p(x t+1 , x t |y 1:N ) are summarized.

Numerical Example
In this section, a numerical example is presented to illustrate the benefits of this paper's proposal. Furthermore, a practical simulation example is studied: the problem of estimating the tank liquid level in a two-tank system. Typically, a numerical simulation approach is used for testing new algorithms and designs in applications of state estimation [74], control [75], and system identification [63], among others. This approach is used to evaluate the performance of the estimation algorithms, in order to avoid the safety problems that can occur in real-world processes.
To evaluate the performance of the proposed filtering and smoothing methods for quantized data, a comparison with three classical techniques is presented: standard Kalman filtering and smoothing [76], quantized Kalman filtering and smoothing [49,77], and particle filtering and smoothing [31]. Notice that it is also possible to implement a version of the particle filtering and smoothing algorithms using the approximation of p(y t |x t ) given in (21); however, in the numerical examples run for filtering, the computation time was high. In order to validate the approximation in Theorem 1, the standard particle filtering and smoothing algorithms with a large number of particles were used, where p(y t |x t ) was computed from the integral in (20) using the MATLAB function mvncdf, which evaluates the multivariate normal cumulative distribution function. The filtering and smoothing PDFs provided by the particle filter and smoother with 20,000 particles were taken as the ground truth.
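For intuition, the scalar-output case of this computation can be sketched in a few lines: the exact probability mass in (20) reduces to a difference of normal CDFs, and a K-point Gauss–Legendre rule over the quantization cell reproduces the Gaussian-mixture-type approximation of Theorem 1. The following Python sketch is a one-dimensional illustration only; the function names, and the use of Python rather than MATLAB, are ours, not the paper's:

```python
import math
import numpy as np

def p_exact(q_lo, q_hi, cx, sigma):
    # Exact P(y_t = q_i | x_t) for a scalar output z_t = C x_t + v_t with
    # v_t ~ N(0, sigma^2): a difference of standard normal CDFs over the cell.
    Phi = lambda t: 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))
    return Phi((q_hi - cx) / sigma) - Phi((q_lo - cx) / sigma)

def p_quadrature(q_lo, q_hi, cx, sigma, K=10):
    # K-point Gauss-Legendre rule on [q_lo, q_hi]; each node tau_k contributes
    # a Gaussian term (as a function of cx), giving the Gaussian mixture
    # structure that the filtering and smoothing algorithms exploit.
    nodes, weights = np.polynomial.legendre.leggauss(K)
    tau = 0.5 * (q_hi - q_lo) * nodes + 0.5 * (q_hi + q_lo)
    pdf = np.exp(-0.5 * ((tau - cx) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))
    return 0.5 * (q_hi - q_lo) * float(np.dot(weights, pdf))
```

For a cell [−0.5, 0.5) with C x t = 0 and σ = 1, the two expressions closely agree already at K = 10, consistent with the small K values used in the examples below.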
In the following examples, the discrete-time system in state-space form given in (1)-(3) is used with: where the quantizer is defined in terms of the round function in MATLAB and the quantization step ∆ q . The infimum and supremum values of the sets J i defined in (8), q i−1 and q i , can be calculated for the infinite-level quantizer as q i−1 = y t − 0.5∆ q and q i = y t + 0.5∆ q for every level i. The experiments were carried out on a computer with the following specifications: processor: Intel(R) Core(TM) i5-8300H CPU @ 2.30 GHz; RAM: 8.00 GB; operating system: Windows 10 with MATLAB 2020b.
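The quantizer and its cell endpoints can be sketched as follows; floor(z/∆ q + 0.5) is used to mimic MATLAB's round (which rounds halves away from zero) for positive arguments, since Python's built-in round rounds halves to even. Function names are ours:

```python
import math

def quantize(z, delta_q):
    # Infinite-level uniform quantizer: maps z to the nearest multiple of
    # delta_q; floor(z/delta_q + 0.5) mimics MATLAB's round for z >= 0.
    return delta_q * math.floor(z / delta_q + 0.5)

def cell_bounds(y, delta_q):
    # Endpoints q_{i-1} and q_i of the cell J_i around a quantized value y:
    # q_{i-1} = y - 0.5*delta_q and q_i = y + 0.5*delta_q, as in the text.
    return y - 0.5 * delta_q, y + 0.5 * delta_q
```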

Example 1: First-Order System
In this example, the following state-space system was considered: where w t ∼ N w t (0, 1) and v t ∼ N v t (0, 0.5). Furthermore, the input was drawn from N u t (0, 2) and x 1 ∼ N x 1 (1, 1). In Figure 5, the filtering PDFs for some time instants are shown. The quantization step and the number of Gaussian components used to approximate p(y t |x t ) were ∆ q = 7 and K = 10, respectively. Furthermore, S red = K. GT stands for ground truth; QKF, KF, PF, and GSF stand for the quantized Kalman filter, Kalman filter, particle filter, and Gaussian sum filter, respectively. Figure 5 shows that the filtering PDFs obtained with our proposal, the Gaussian sum filter, were those that best fit the ground truth. In contrast, the PDFs obtained with the Kalman filter, quantized Kalman filter, and particle filter differed from the ground truth. Notice that the results obtained using the particle filter with 100 particles were good at t = 3, 9, 40, whilst the resulting PDFs for t = 25, 32, 59, 72, 82, 99 differed slightly more from the ground truth. However, the performance of particle filtering can be improved by increasing the number of particles, at the cost of an increase in the execution time.
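A minimal simulation sketch of such a first-order setup follows. The coefficients a, b, c below are illustrative placeholders (the paper's numerical values are not reproduced in this excerpt), while the noise statistics, input, initial state, and ∆ q = 7 follow the text, reading the second argument of N(·,·) as a variance:

```python
import math
import random

random.seed(0)
a, b, c = 0.5, 1.0, 1.0      # placeholder first-order coefficients (not the paper's)
delta_q = 7.0                # quantization step used in Example 1

x = random.gauss(1.0, 1.0)   # x_1 ~ N(1, 1)
ys = []
for t in range(100):
    u = random.gauss(0.0, math.sqrt(2.0))          # input u_t ~ N(0, 2)
    z = c * x + random.gauss(0.0, math.sqrt(0.5))  # noisy output, v_t ~ N(0, 0.5)
    y = delta_q * math.floor(z / delta_q + 0.5)    # quantized measurement
    ys.append(y)
    x = a * x + b * u + random.gauss(0.0, 1.0)     # state update, w_t ~ N(0, 1)
```

Each recorded y is a multiple of ∆ q , which is the only output information available to the quantized-data filters compared in this example.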
To compare the accuracy of the state estimation, 100 Monte Carlo trials were run, and the mean-squared error between the true state and the estimation obtained by the Kalman filter, quantized Kalman filter, Gaussian sum filter, and particle filter was computed as follows: where x t is the true state (which in a real-world system is not available, but in simulations can be used to analyze the performance of the estimation techniques), x̂ t|t is the estimate of the system state, and ‖ · ‖ 2 2 denotes the squared Euclidean norm. In Figure 6, the box plots corresponding to 100 Monte Carlo runs for ∆ q = {3, 5, 7} are shown. Figure 6 shows that the MSE between the true state and the estimation obtained with the Kalman filter and quantized Kalman filter increased rapidly as ∆ q increased, whereas the MSE of the state estimation obtained with the Gaussian sum filter and particle filter increased slowly.
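For the scalar state of this example, the MSE criterion above can be sketched as (function name ours):

```python
def mse(x_true, x_est):
    # Mean-squared error between the true state trajectory and its estimate:
    # (1/N) * sum_t ||x_t - xhat_{t|t}||^2, with scalar states here.
    assert len(x_true) == len(x_est)
    return sum((xt - xe) ** 2 for xt, xe in zip(x_true, x_est)) / len(x_true)
```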
In Figure 7, the smoothing PDFs at time t = 100 are shown. The quantization step was chosen from ∆ q = {3, 5, 7}, and the number of Gaussian components used to approximate p(y t |x t ) was chosen from K = {6, 8, 10}. Furthermore, the number of Gaussian components kept after the reduction procedure was S red = K. In order to obtain an adequate number of particles with which to compare the particle smoother against the Gaussian sum smoother, 100 Monte Carlo simulations were carried out to find the number of particles that yielded smoothing PDFs as close to the true PDFs as those of the Gaussian sum smoother using K = {6, 8, 10}. For comparison purposes, Par(K) is defined as the particle smoother execution time using the number of particles that produces a result similar to the Gaussian sum smoother with K components. The L 2 -norm of the difference between the true and the estimated PDFs was used as the measure of similarity: where q represents the true PDF and q̂ represents the estimated PDF, which was chosen so that ‖q − q̂‖ 2 < 1 × 10 −6 . The approximated number of particles (labeled in each PDF in Figure 7) was used to compare the execution time of both algorithms, the Gaussian sum smoother and the particle smoother, and the results are shown in Table 2. Figure 7 shows that the smoothing PDFs obtained with our proposal using a small number of Gaussian components, as well as the PDFs obtained using the particle smoother with a large number of particles, fit the ground truth. In contrast, the PDFs obtained with the Kalman smoother and quantized Kalman smoother differed from the ground truth. Furthermore, the execution times in Table 2 show that the time required to run the Gaussian sum smoother was less than that of the particle smoother, which needs a large number of particles to produce a result similar to the Gaussian sum smoother.

Table 2. Time in seconds required to perform the smoothing algorithm for the scalar system. Par(K) represents the number of particles (labeled in Figure 7) that produce a similar result to the Gaussian sum smoother with K components. KS, QKS, GSS, and PS stand for the Kalman smoother, quantized Kalman smoother, Gaussian sum smoother, and particle smoother, respectively.

From the results shown in Figures 5-7 and Table 2, it can be concluded that:

• The filtering and smoothing PDFs are non-Gaussian, although the process and output noises in (1) and (2) are Gaussian distributed;
• The accuracy of the standard and quantized Kalman filtering and smoothing decreased as the quantization step increased;
• The state estimates obtained with the particle filter and smoother were similar to the results obtained using the Gaussian sum filter and smoother. However, the characterization of the filtering and smoothing PDFs given by the Gaussian sum filter and smoother was better than that of the PDFs obtained by the particle filter and smoother. Notice that a correct characterization of a PDF is important when high-order moments need to be computed, especially in system identification tasks;
• In order to implement the Gaussian sum filter and smoother, the parameters K (the number that defines the quality of the p(y t |x t ) approximation) and S red (the number of Gaussian components kept after the Gaussian sum reduction algorithm) need to be chosen by the user. These parameters can be found in a simulation study by trial and error and should be set as a trade-off between the time complexity and the accuracy of the estimation. A large value of K produces an accurate estimate, but a high computational load;
• The larger the quantization step, the larger the number of Gaussian components K needed to approximate p(y t |x t ) accurately. However, for a large quantization step, the number K needed to obtain a good approximation of the filtering and smoothing PDFs is relatively small compared to the number of particles required to obtain similar results using the particle filter and smoother;
• The maximum number of Gaussian components kept after the Gaussian reduction procedure is important for the accuracy of the approximation. In the simulations, S red = K was used. Furthermore, it was noticed that once an adequate S red was defined, increasing this value did not produce a significant improvement in the estimation. However, this increase in S red was critical for the resulting numerical complexity of the algorithm (and hence the execution time), since the Gaussian sum reduction procedure (e.g., Kullback-Leibler reduction) requires more time to reduce a large number of Gaussian components;
• The Gaussian sum smoother execution time for all values of ∆ q was small. This occurred because, in each case, a relatively small number of Gaussian components was used to approximate p(y t |x t ). In contrast, the particle smoother execution time varied for different values of ∆ q : as ∆ q decreased, the L 2 -norm between the Gaussian sum smoother and the ground truth decreased, and a larger number of particles was required to obtain a comparable L 2 -norm between the particle smoother and the ground truth.
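As a rough illustration of the role of S red, a pruning-style mixture reduction can be sketched as below. This simply keeps the S red highest-weight components and renormalizes; it is a cruder stand-in for the Kullback-Leibler-based merging mentioned above, which merges similar components rather than discarding them (function name ours):

```python
def reduce_mixture(weights, means, covs, s_red):
    # Prune a Gaussian mixture to its s_red highest-weight components and
    # renormalize the kept weights so they sum to one.
    order = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)[:s_red]
    kept = sum(weights[i] for i in order)
    return ([weights[i] / kept for i in order],
            [means[i] for i in order],
            [covs[i] for i in order])
```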

Real-World Application: Tank Liquid Level
In this example, the problem of estimating the tank liquid level in a two-tank system is considered, using the measurements taken by a low-cost sensor based on a variable resistor attached to an arm with a floater; see Figure 8. A linearized model of this system can be found in [78]. Here, it was assumed that h 2 can be measured and h 1 cannot. The discrete-time version of the model in [78] with sample time 0.1 s was considered, with state matrix [0.9959 0.0041; 0.0041 0.9959], where x t = [h 1 h 2 ] T and u t = [ f 1 − k f 2 + k] T with k = 0.4111. The sensor that measures h 2 varies its resistance in discrete steps, with minimum and maximum values β 1 = 2 and β L = 10. To simulate this system, it was considered that w t ∼ N w t (0, 0.001I 2 ) and v t ∼ N v t (0, 0.0001); the input [ f 1 f 2 ] T was drawn from N([10 2] T , 10I 2 ); and the initial condition was x 1 ∼ N x 1 ([10 5] T , 0.01I 2 ). For this example, 100 Monte Carlo runs were simulated. In Figure 9 (left), the output z t , which corresponds to the nonquantized values of h 2 , and the output y t , which corresponds to the measurements given by the sensor (for one of the Monte Carlo runs), are shown. In this figure, the level of quantization in the measurements can be observed. In Figure 9 (right), the MSE between the true and estimated state is shown. It was observed that the proposal presented in this paper, the Gaussian sum filter, yielded the most accurate estimation of h 1 , followed by the particle filter, Kalman filter, and quantized Kalman filter. In this example, K = 50 and 1000 particles were used to implement the Gaussian sum filter and particle filter, respectively.

Figure 9. (Left) The quantized and nonquantized measurements; (right) the MSE between the true and estimated filtered state. KF, QKF, GSF, and PF stand for the Kalman filter, quantized Kalman filter, Gaussian sum filter, and particle filter, respectively.
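The saturating sensor in this example can be sketched as a finite-level quantizer clipped at the minimum and maximum readings β 1 = 2 and β L = 10. The step size ∆ q below is an assumed illustrative value, as the text does not state it here (function name ours):

```python
import math

def sensor_quantize(z, delta_q=1.0, beta_min=2.0, beta_max=10.0):
    # Finite-level sensor model: uniform steps of size delta_q (assumed value),
    # saturated at the minimum and maximum readings beta_1 = 2 and beta_L = 10.
    y = delta_q * math.floor(z / delta_q + 0.5)
    return min(max(y, beta_min), beta_max)
```

Unlike the infinite-level quantizer of Example 1, levels beyond the saturation bounds are collapsed onto β 1 or β L , so the outermost cells J 1 and J L are half-infinite.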
In this example, a relatively high number of Gaussian components (K = 50) was required to obtain a good estimation of the filtering and smoothing distributions, and hence of the state. This increased the execution time, since the Gaussian sum reduction algorithm needed more time to deal with a high number of Gaussian components in every iteration, resulting in similar execution times for our proposed algorithm and the traditional particle filter. However, the execution time of the Gaussian sum filter and smoother was smaller than that of the ground truth (the particle filter and smoother with 20,000 particles).

Conclusions
In this paper, Gaussian sum filtering and smoothing algorithms for linear time-invariant state-space systems with quantized output data were developed. An approximation of the integral equation that defines the probability mass function of the quantized data given the current state, p(y t |x t ), using Gaussian quadrature was considered. This approximation naturally yielded an explicit mathematical model with a GMM structure for this probability function. Using the approximation of p(y t |x t ) summarized in Theorem 1, it was possible to solve in closed form the general filtering and smoothing equations for quantized data. This allowed for closed-form expressions of the system state estimators given the quantized data, x̂ t|t and x̂ t|N . Via numerical simulations, it was shown that approximating p(y t |x t ) with a small number of Gaussian components was adequate, yielding an approximation comparable to the true filtering and smoothing PDFs given by the particle approach (using a large number of particles, namely 20,000). This reduced number of Gaussian components allowed for a low computational load, especially when the system order increased. In addition, our results showed an overall lower computational load for the proposed techniques, since the number of Gaussian components was considerably smaller than the number of particles used in particle filtering and smoothing.
The proposed Gaussian sum filtering and smoothing algorithms can be utilized, in principle, to develop system identification algorithms and control strategies having quantized output measurements.
Proof. Directly expand the exponential argument and reorder the terms in the variable x.
Lemma A3. Consider the backward filter function p(y t:N |x t ) given in (42). Its GMM structure is given by: where σ k t|t = σ̄ k t|t / ∑ S t|t s=1 σ̄ s t|t is the normalized mixing weight, and ν k t|t = (F k t|t ) −1 G k t|t and Ω k t|t = (F k t|t ) −1 are the mean vector and the covariance matrix, respectively, with: On the other hand, consider the GMM structure given by (54). Then, the backward filter form in (42) is obtained using the following: Proof. Directly by using Lemma A1.
are the weights, means, and covariance matrices of the time-update equation in the Gaussian sum filter.