State Space Modeling with Non-Negativity Constraints Using Quadratic Forms

State space model representation is widely used for the estimation of unobservable (hidden) random variables when noisy observations of the associated stochastic process are available. When the state vector is subject to constraints, the standard Kalman filtering algorithm can no longer be used in the estimation procedure, since it assumes the linearity of the model. This issue is considered in what follows for the case of hidden variables that have to be non-negative. This restriction, which is common in many real applications, can be handled by describing the dynamic system of the hidden variables through non-negative definite quadratic forms. Such a model could describe any process where a positive component represents “gain”, while the negative one represents “loss”; the observation is derived from the difference between the two components, which stands for the “surplus”. Here, a thorough analysis of the conditions that have to be satisfied for the existence of non-negative estimates of the hidden variables is presented via the use of the Karush–Kuhn–Tucker conditions.


Introduction
State space modeling is used for estimating (revealing) the dynamic evolution of hidden variable processes. In some cases, the state vector, which includes the hidden components, is subject to constraints that arise either from the physical meaning of the states or from mathematical properties that have to be satisfied. For example, state space models with constraints are used in camera surveillance [1,2], navigation [3], and biological systems [4]. In finance in particular, the hidden variables are often subject to non-negativity constraints or, more generally, have to be bounded. For example, in the Vasicek model [5] and its extension [6], the interest rates are considered to be hidden random variables subject to non-negativity constraints, while in [7,8], the eigenvalues of the VAR process were restricted within the unit circle. Within finance, a discrete state space model could also be implemented for the estimation of the hidden jump components of asset returns [9,10]. The use of jumps has been proposed for the description of the dynamics of asset prices, since they can explain some of the empirical characteristics of asset prices, e.g., the departure from normality or the presence of leptokurtosis (see for example [11]).
When dealing with state space models that are subject to constraints, the Kalman filtering algorithm [12] can no longer be used, since it assumes linearity in the model. In the domain of nonlinear filters, the particle filtering approach (see for example [13-16]) has wide applicability; it adopts resampling techniques for the estimation of the state vector at every time t. However, the use of resampling techniques adds considerable computational cost to the estimation procedure.
In this work, the observation is defined as the difference between two one-sided components, with added noise. The components are considered to be hidden random variables, and therefore, a state space model is established, where the state equation describes the dynamic evolution of the two hidden components. This equation represents a first-order Markov process, i.e., all the information needed for the estimation of the components at time t is derived from the components at time t − 1, and no other information from past times is needed. Moreover, the state vector is subject to non-negativity constraints that have to be taken into account for its estimation over time. Such a model could describe, for example, the evolution of a system where the positive component represents "gain", while the negative one represents "loss"; the observation is derived from the difference between the two components, which stands for the "surplus", with added noise. In asset pricing, an asset return can be defined as the noisy difference between two non-negative return jump components, and the jump components are considered to be hidden variables. Another example could be the one-dimensional random walk, where a positive jump could represent (the measure of) a move to the right and a negative jump (the measure of) a move to the left, while the observation could be a function of the two jump components given at discrete times. To handle such problems, non-negative definite quadratic forms are adopted in the state equation for the dynamic evolution of the two components. In this case, the recursive equations of the Kalman filter cannot be used for the estimation of the state vector, since this filter assumes linearity in the measurement and state equations.
To this end, this work first derives the recursive equations for the estimation of the state vector based on the state space model representation with non-negative definite quadratic forms in the state equation and their Taylor expansions. Then, a thorough analysis of the necessary conditions that have to be satisfied in order to obtain the non-negative estimations at every time t is provided. In Proposition 1, the stationary points of the optimization problem with the non-negative constraints are given by using the Karush-Kuhn-Tucker conditions, while in Proposition 2, the necessary conditions for the existence of feasible solutions in the constrained optimization problem are provided.
Overall, this work proposes a method in state space modeling representation, which can be used when dealing with hidden components that are subject to non-negativity constraints. The method results in the formulation of a constrained optimization problem for which the stationary points are derived via Proposition 1, and the necessary conditions for the existence of feasible solutions in this optimization problem are provided via Proposition 2; to that end, the iterative formulas for the minimum variance a posteriori estimators for the (hidden) state vector are illustrated. Moreover, the proposed method has a low computational burden compared to other nonlinear filtering methods that can be used in state space modeling with inequality constraints and are based on resampling techniques (e.g., particle filtering).
The paper is organized as follows. In Section 2, the state space model proposed for the estimation of the two jump components is established. Two non-negative quadratic forms are adopted to describe the dynamic evolution of the two-sided components subject to their non-negative restrictions. In Section 3, the recursive equations of the second-order Kalman filter are presented, while in Section 4, a thorough analysis of the conditions that have to be fulfilled so as to have non-negative estimations is presented. The results of this analysis are summarized in Propositions 1 and 2. In Section 5, an illustrative example concerning the evolution of positive and negative jumps of asset returns is presented to demonstrate the theoretical results. Finally, Section 6 concludes on the findings and provides suggestions for future work.

State Space Model
In this section, a state space model representation is illustrated considering the case where there are two hidden processes subject to non-negativity constraints. The state equation that describes the dynamic evolution of the hidden components adopts the use of non-negative definite quadratic forms, while the measurement equation is linear.
The state equation is given by: or equivalently: where:
• $z_t = (X_t, Y_t)^T$ stands for the state vector;
• $w_t$ stands for the noise, and it is assumed that 11 > 0 and g where the k-th element equals 1, and the other element equals 0.

The measurement equation is given by the relation:

$$R_t = H z_t + e_t, \tag{3}$$

where $H = (1 \;\; -1)$ and $e_t \sim N(0, V)$. Moreover, it is assumed that $E(e_k w_j^T) = 0$.

Evidently, state Equation (2) describes an (unobservable) first-order non-negative valued Markovian process, the evolution and characteristics of which (e.g., periodicity, convergence, etc.) depend on the structure (values) of the associated noisy observation sequence. The aim of this study is to estimate (reveal) the Markovian process (2) (i.e., the matrices $G^{(k)}$, $k = 1, 2$, and $Q$) through the observation Equation (3), given that the components of the state vector have to be non-negative. For this purpose, Model (2)-(3) adopts non-negative definite quadratic forms to describe the dynamic evolution of the two hidden components, so as to ensure that the estimates of the components will be non-negative. To that end, the extended Kalman filter of second order is proposed in order to estimate, at every time $t$, the state vector $z_t$ that incorporates the hidden jump components. Note that the noise component in Relation (2) is multiplicative rather than additive.
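The two ingredients of the model can be illustrated with a minimal Python sketch. It does not reproduce the state equation itself. It only shows that a quadratic form built on a non-negative definite matrix cannot be negative, and how a noisy observation is formed as $R = Hz + e$ with $H = (1, -1)$. The helper names `quadratic_form`, `gram`, and `observe`, the matrix used to build `G`, and all numerical values are illustrative assumptions.

```python
import random

def quadratic_form(G, u):
    """Return u^T G u for a square matrix G (list of rows) and a vector u."""
    n = len(u)
    return sum(u[i] * G[i][j] * u[j] for i in range(n) for j in range(n))

def gram(A):
    """Return A^T A, which is non-negative definite for any real matrix A."""
    rows, cols = len(A), len(A[0])
    return [[sum(A[r][i] * A[r][j] for r in range(rows)) for j in range(cols)]
            for i in range(cols)]

def observe(z, V, rng):
    """Measurement R = H z + e with H = (1, -1) and e ~ N(0, V)."""
    x, y = z
    return (x - y) + rng.gauss(0.0, V ** 0.5)

rng = random.Random(0)
G = gram([[1.0, -2.0], [0.5, 3.0]])   # hypothetical non-negative definite matrix

# Evaluate the quadratic form at random points of either sign.
values = [quadratic_form(G, [rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)])
          for _ in range(1000)]
min_value = min(values)               # stays non-negative whatever the sign of u

# With V = 0 the observation is exactly "gain" minus "loss".
noise_free = observe((0.8, 0.3), 0.0, rng)
```

The sign of the observation carries no constraint, whereas each hidden component does. This is why the non-negativity has to be built into the state dynamics rather than into the measurement.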
Next, the extended Kalman filter of second order is described and its iterative equations for the estimation of the state vector are presented.

Extended Kalman Filter of Second Order
Model (2) and (3) presented in Section 2 is nonlinear, and consequently, the standard recursive Kalman filter algorithm cannot be used for the estimation of the state vector. To derive the recursive equations for the estimation of the hidden states, taking into consideration that the state Equation (2) is a quadratic form, the following notation is used:
• $\hat{z}_t^-$: the a priori estimate of the state vector $z_t$, i.e., without taking into consideration the measurement at time $t$;
• $\hat{z}_t^+$: the a posteriori estimate of the state vector $z_t$, i.e., by considering the measurement at time $t$;
• $P_t^-$, $P_t^+$: the variance-covariance matrices of the a priori and a posteriori estimation errors of $z_t$, respectively, i.e., $P_t^- = E[(z_t - \hat{z}_t^-)(z_t - \hat{z}_t^-)^T]$ and $P_t^+ = E[(z_t - \hat{z}_t^+)(z_t - \hat{z}_t^+)^T]$.

According to (2), $z_{t,k}$, $k = 1, 2$, is a function of the random variables $z_{t-1}$ and $w_{t-1}$, i.e., $z_{t,k} = z_{t,k}(z_{t-1}, w_{t-1})$. Then, using the second-order Taylor expansion of $z_{t,k}$ at $(\hat{z}_{t-1}^+, 0)$, it is derived that: (1). By equating the mean values in Relation (4), the a priori estimate of $z_t$ (prediction stage) is derived, that is: and the entries of the respective variance-covariance matrix $P_t^-$ are given by the relation, where $(P_t^-)_{k,m}$ denotes the $(k, m)$-element of matrix $P_t^-$ and $\mathrm{tr}(\cdot)$ denotes the trace of the respective matrix. Taking into consideration the properties of the trace of a matrix, it is derived after some algebraic manipulations on Relations (5) and (6) that: Regarding the a posteriori estimates of $z_t$, it is taken into account that the joint distribution of $z_t$ and $R_t$ is normal, based on the relation: Then, we make use of the following Lemma (see for example [17]):

Lemma 1. Let $x$, $y$ be two random variables that are jointly normally distributed with:

$$E\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} \mu_x \\ \mu_y \end{pmatrix} \quad \text{and} \quad \Sigma = \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix}.$$
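Then, $(x \mid y) \sim N(\mu', \Sigma')$, where:

$$\mu' = \mu_x + \Sigma_{12} \Sigma_{22}^{-1} (y - \mu_y), \qquad \Sigma' = \Sigma_{11} - \Sigma_{12} \Sigma_{22}^{-1} \Sigma_{21}.$$

Based on Lemma 1, the a posteriori estimate of $z_t$ (update stage) and the related variance-covariance matrix $P_t^+$ are given by, By using the recursive Relations (7)-(10), the hidden components can be estimated at every time $t$.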
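The update stage can be sketched in Python under the assumption that it takes the standard Kalman form implied by Lemma 1 for a scalar observation with $H = (1, -1)$: gain $K_t = P_t^- H^T (H P_t^- H^T + V)^{-1}$, estimate $\hat{z}_t^+ = \hat{z}_t^- + K_t (R_t - H \hat{z}_t^-)$, and covariance $P_t^+ = (I - K_t H) P_t^-$. The function name `kalman_update` and the numerical inputs are hypothetical, and the expressions are the textbook ones rather than a transcription of Relations (9) and (10).

```python
def kalman_update(z_prior, P_prior, R, V):
    """Standard Kalman update for a scalar observation R = H z + e, H = (1, -1).

    z_prior: a priori estimate [x, y]; P_prior: 2x2 symmetric covariance;
    V: observation noise variance. Returns (z_post, P_post, K).
    """
    H = (1.0, -1.0)
    # P H^T (a 2-vector) and b = H P H^T (a scalar).
    PHt = [P_prior[i][0] * H[0] + P_prior[i][1] * H[1] for i in range(2)]
    b = H[0] * PHt[0] + H[1] * PHt[1]
    K = [PHt[i] / (b + V) for i in range(2)]            # Kalman gain
    a = R - (H[0] * z_prior[0] + H[1] * z_prior[1])     # innovation
    z_post = [z_prior[i] + K[i] * a for i in range(2)]
    # P+ = (I - K H) P-
    HP = [H[0] * P_prior[0][j] + H[1] * P_prior[1][j] for j in range(2)]
    P_post = [[P_prior[i][j] - K[i] * HP[j] for j in range(2)] for i in range(2)]
    return z_post, P_post, K

# Hypothetical inputs: a non-negative prior and a large positive observation.
z_post, P_post, K = kalman_update([0.2, 0.1], [[1.0, 0.0], [0.0, 1.0]], R=1.1, V=1.0)
```

In this example the prior estimate is non-negative, yet the second component of `z_post` comes out negative. An unconstrained update therefore does not preserve non-negativity by itself.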
Next, a detailed investigation regarding the existence of non-negative solutions (i.e., non-negative a posteriori estimations of z t ) derived from (9) is presented.

Investigation of the State Space Model
In what follows, we present an investigation of the conditions that have to be satisfied in order to obtain non-negative a posteriori estimates of the state vector $z_t$. Relation (7) ensures the existence of non-negative a priori estimates of $z_t$ at every time $t$. However, the a posteriori estimates of $z_t$ given by (9) may not fulfil the non-negativity condition, since they depend on the term $K_t (R_t - H \hat{z}_t^-)$, the sign of which is not time invariant. Therefore, in order to ensure that the a posteriori unbiased estimator $\hat{z}_t^+$ is a minimum variance estimator under the non-negativity restrictions that its components must satisfy, the following optimization problem arises, where $\hat{z}_t^+ \succeq 0$.
The symbol $\succeq$ (or $\preceq$) is used for the elementwise inequality, while $z_t = (X_t, Y_t)^T$ is given by Equation (1) (or (2)). The following Proposition 1 provides the set of stationary points related to the optimization problem (11), subject to the non-negativity restrictions. This set includes the optimal solution, i.e., the unbiased minimum variance estimator $\hat{z}_t^+$. In what follows, we use the following notations:
Taking into consideration Remark 1, it is assumed in the sequel that $a_t \neq 0$ for every $t$.

Proposition 1. The weight matrix $K_t$ and the stationary points related to the optimization problem (11) are given by the relations: which leads to the solution: which leads to the solution: , which leads to the solution: which leads to the solution:

Proof. The Lagrangian function related to the optimization problem (11) is defined as: Based on (10), it is derived that: while (by assuming the dependence of $\hat{z}_{t,i}^+$ on $R_t$ and $\hat{z}_{t,i}^-$, $i = 1, 2$, as provided in Kalman filtering): $\hat{z}_{t,1}^+ = \hat{z}_{t,1}^- + a_t K_{t,1}$ and $\hat{z}_{t,2}^+ = \hat{z}_{t,2}^- + a_t K_{t,2}$. By calculating the first derivative of the Lagrangian function and equating it to 0, it is derived that: where $\lambda = (\lambda_1, \lambda_2)^T$. Thus, matrix $K_t$ has to satisfy the following condition (by noticing that $P_t^-$ is symmetric): based on the constraints [18]: The following cases have to be considered:

(i) The two constraint conditions are inactive. Then, $\lambda_1 = \lambda_2 = 0$, and the optimization problem, leading to (14), is transformed into the unconstrained one considered in the case of the Kalman filter. It is derived that: which is the well-known Kalman gain matrix. The related solution in terms of the a posteriori estimator $\hat{z}_t^+$ is: Relation (16) constitutes a possible solution of the optimization problem (11), and it has to satisfy the constraint $\hat{z}_t^+ \succeq 0$;

(ii) The first constraint condition is inactive (i.e., $\lambda_1 = 0$), while the second one is active. Then, the following two cases are considered:
(a) If $\lambda_2 = 0$, then we are led to the unconstrained optimization problem presented in Case (i), and the solution must satisfy the non-negativity restrictions, i.e., $\hat{z}_t^+ \succeq 0$;
(b) If $\hat{z}_{t,2}^- + a_t K_{t,2} = 0$, it is derived via the active constraint condition that: By using (17), Relation (14) is transformed into: Consequently, where $b_t = H P_t^- H^T \geq 0$. By using (17) and (18), it is derived that:

(iii) The first constraint condition is active, while the second one is inactive (i.e., $\lambda_2 = 0$).
The following two cases are considered:
(a) If $\lambda_1 = 0$, then we obtain the unconstrained optimization problem presented in Case (i), and the solution must fulfil the non-negativity restrictions, i.e., $\hat{z}_t^+ \succeq 0$;
(b) If $\hat{z}_{t,1}^- + a_t K_{t,1} = 0$ and $\lambda_1 \neq 0$, then it is derived that: and Relation (14) is transformed into: where $b_t = H P_t^- H^T \geq 0$. By using (19) and (20), it is derived that: and consequently: $\hat{z}_{t,1}^+ = 0$ and:

(iv) The two constraint conditions are active, i.e., $\hat{z}_{t,1}^- + a_t K_{t,1} = 0$ and $\hat{z}_{t,2}^- + a_t K_{t,2} = 0$. In this case, we have to seek solutions such that $\lambda_1, \lambda_2 \geq 0$.
Based on the active constraint conditions, it is derived that: i.e., $K_t = -a_t^{-1} \hat{z}_t^-$, resulting in the relation, The state vector $\hat{z}_t^+ = 0$ constitutes a feasible solution, and it has to be checked whether Relation (14) is satisfied with $\lambda_1, \lambda_2 \geq 0$.
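For reference, the generic Karush-Kuhn-Tucker conditions that produce the four cases of the proof can be written as follows. This is a sketch in which $f(K_t)$ is a placeholder for the objective of problem (11), which is not reproduced here, and the sign convention chosen for the Lagrangian is an assumption.

```latex
\begin{aligned}
&\min_{K_t} f(K_t) \quad \text{subject to} \quad
  \hat{z}^{+}_{t,i} = \hat{z}^{-}_{t,i} + a_t K_{t,i} \ge 0, \quad i = 1, 2, \\
&\mathcal{L}(K_t, \lambda) = f(K_t)
  - \sum_{i=1}^{2} \lambda_i \left( \hat{z}^{-}_{t,i} + a_t K_{t,i} \right), \\
&\frac{\partial \mathcal{L}}{\partial K_{t,i}} = 0, \qquad
  \lambda_i \ge 0, \qquad
  \lambda_i \left( \hat{z}^{-}_{t,i} + a_t K_{t,i} \right) = 0, \qquad i = 1, 2.
\end{aligned}
```

The complementary slackness condition in the last line forces, for each $i$, either $\lambda_i = 0$ (inactive constraint) or $\hat{z}_{t,i}^+ = 0$ (active constraint). The two choices for each of the two components give Cases (i) to (iv).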
In what follows, Proposition 2 provides the necessary conditions for the existence of feasible solutions regarding the constrained filter.
constitutes a feasible solution, if: and: and: a t (P −

Remark 2. Given the low computational cost, the four possible solutions of the constrained optimization problem (11) can be examined one by one in order to find the optimal solution. In any case, the necessary conditions presented in Proposition 2 can be examined simultaneously to obtain a more comprehensive view when searching for the optimal solution.
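The one-by-one examination of the four candidate solutions can be sketched in Python under a loudly stated assumption: the objective is taken to be the trace of the Joseph-form covariance, $\mathrm{tr}[(I - K_t H) P_t^- (I - K_t H)^T + K_t V K_t^T]$, which may differ from the objective of problem (11). The formulas in the code are therefore not those of Proposition 1. The function name `constrained_gain`, the tolerance `tol`, and the numerical inputs are hypothetical.

```python
from itertools import product

def constrained_gain(z_prior, P_prior, R, V, tol=1e-12):
    """Enumerate the four active-set cases and return the feasible KKT point.

    Assumed objective: tr[(I - K H) P (I - K H)^T + K V K^T] with H = (1, -1),
    subject to z_prior + a K >= 0, where a = R - H z_prior must be non-zero.
    """
    H = (1.0, -1.0)
    PHt = [P_prior[i][0] * H[0] + P_prior[i][1] * H[1] for i in range(2)]
    c = H[0] * PHt[0] + H[1] * PHt[1] + V               # b + V
    a = R - (H[0] * z_prior[0] + H[1] * z_prior[1])     # innovation
    if a == 0.0:
        raise ValueError("the innovation a must be non-zero")
    candidates = []
    for active in product([False, True], repeat=2):
        K, lam = [0.0, 0.0], [0.0, 0.0]
        for i in range(2):
            if active[i]:
                K[i] = -z_prior[i] / a                  # forces z_post[i] = 0
                lam[i] = 2.0 * (c * K[i] - PHt[i]) / a  # from stationarity
            else:
                K[i] = PHt[i] / c                       # Kalman gain entry
        z_post = [0.0 if active[i] else z_prior[i] + a * K[i] for i in range(2)]
        ok = all(z >= -tol for z in z_post) and all(l >= -tol for l in lam)
        # Objective up to the additive constant tr(P).
        cost = c * (K[0] ** 2 + K[1] ** 2) - 2.0 * (PHt[0] * K[0] + PHt[1] * K[1])
        candidates.append((active, K, z_post, lam, ok, cost))
    feasible_set = [cand for cand in candidates if cand[4]]
    return min(feasible_set, key=lambda cand: cand[5]), candidates

best, candidates = constrained_gain([0.2, 0.1], [[1.0, 0.0], [0.0, 1.0]], R=1.1, V=1.0)
active, K, z_post, lam, ok, cost = best
```

For these inputs the unconstrained case is rejected because its second component is negative, and the accepted candidate is the one with only the second constraint active. Under the assumed objective the problem is convex and separable in the entries of $K_t$, so the accepted point coincides with setting the negative components of the unconstrained estimate to zero. This simplification is a property of the assumed objective and need not carry over to problem (11).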
Next, an illustrative application of the described methodology is presented regarding the estimation (revelation) of the two-sided jump components of asset returns.

Application: Estimation of the Two-Sided Jump Components of the NASDAQ Index
In this section, an application of the methodology analyzed in Section 4 is presented, concerning the estimation of the hidden two-sided jump components of the NASDAQ index for the three-year period 2006-2008. To estimate the parameters of the model, i.e., the parameter set $\varphi = (G^{(1)}, G^{(2)}, \sigma_x^2, \sigma_y^2, V)$, the maximum likelihood estimation method is used, taking into consideration that the distribution of $R_t$ conditioned on $z_t$ is normal, i.e., $R_t \mid z_t \sim N(H \hat{z}_t^-, H P_t^- H^T + V)$. Therefore, the log-likelihood function, LogL, is of the form: where,
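The evaluation of the log-likelihood can be sketched in Python under the assumption that it is the usual sum of Gaussian log-densities implied by the stated distribution, with mean $H \hat{z}_t^-$ and variance $H P_t^- H^T + V$ at each time $t$. The function name `gaussian_log_likelihood` and the input values are hypothetical, and the exact expression used in the source is not reproduced here.

```python
import math

def gaussian_log_likelihood(innovations, variances):
    """Sum over t of log N(v_t; 0, F_t) for a scalar observation.

    innovations: v_t = R_t - H z_prior_t
    variances:   F_t = H P_prior_t H^T + V (must be positive)
    """
    if len(innovations) != len(variances):
        raise ValueError("sequences must have equal length")
    total = 0.0
    for v, F in zip(innovations, variances):
        if F <= 0.0:
            raise ValueError("innovation variance must be positive")
        total += -0.5 * (math.log(2.0 * math.pi) + math.log(F) + v * v / F)
    return total

# Hypothetical two-step example.
log_l = gaussian_log_likelihood([0.0, 1.0], [1.0, 2.0])
```

In a maximum likelihood routine, the filter would be run for each candidate parameter set to produce the innovations and their variances, and a numerical optimizer would maximize the returned value.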

Conclusions
In this work, the topic of state space modeling with non-negativity constraints was considered. For that purpose, a state space model was constructed where the state equation, which describes the dynamic evolution of the components of the hidden state vector, was expressed via non-negative definite quadratic forms and represents a non-negative valued Markovian stochastic process of order one. Due to the inequality conditions, a constrained optimization problem arises in deriving estimators for the states that are unbiased and of minimum variance. A thorough analysis was presented via Propositions 1 and 2, concerning the stationary points of the optimization problem along with the conditions that have to be satisfied in order to derive non-negative estimates of the state vector at every time. In particular, Proposition 2 provides necessary conditions for a stationary point to constitute a feasible solution. The proposed method constitutes an alternative for handling state space models with non-negativity constraints, and it has a low computational burden compared to resampling-based estimation methods.
Regarding future work, the generalization of the proposed method for the case of an n-dimensional non-negative state vector, n > 2, could be examined. This is a challenging problem in many applications. For example, in navigation problems, for n = 3, state space models with non-negativity constraints are suitable to describe the distance covered during the motion of a vehicle, if we let the three non-negative components of the state vector represent the measures of the velocities (speeds) along the axes in R 3 .