Article

Non-Negativity of a Quadratic form with Applications to Panel Data Estimation, Forecasting and Optimization

1
Indian School of Business, Gachibowli, Hyderabad, Telangana 500032, India
2
Gies College of Business, University of Illinois at Urbana-Champaign, Champaign, IL 61820, USA
3
Department of Economics, School of Management and Economics, University of Peloponnese, 22100 Peloponnese, Greece
4
Durham University Business School, Durham DH1 3LB, UK
*
Author to whom correspondence should be addressed.
Bhimasankaram Pochiraju was Professor at the Indian School of Business in India. He passed away on 1 April 2018. He and Sridhar Seshadri worked on this paper when they were at the Indian School of Business.
Stats 2020, 3(3), 185-202; https://doi.org/10.3390/stats3030015
Submission received: 26 May 2020 / Revised: 28 June 2020 / Accepted: 30 June 2020 / Published: 6 July 2020

Abstract

For a symmetric matrix B, we determine the class of matrices Q such that Q^tBQ is non-negative definite and apply the result to panel data estimation and forecasting, in particular to the Hausman test for the endogeneity of the random effects in panel data models. We show that the test can be performed if the estimated error variances in the fixed and random effects models satisfy a specific inequality. If the inequality fails, we discuss the restrictions under which the test can still be performed. We show that estimators satisfying the inequality exist. Furthermore, we discuss an application to a constrained quadratic minimization problem with an indefinite objective function.
JEL Classification:
C01; C02; C61

1. Introduction

Optimization of quadratic structures and corrections to the construction of covariance matrices have a long history in econometrics and financial economics. From the early stages of simultaneous equations modeling, to testing panel data models, to recent advances in handling volatilities and correlations of financial returns in large systems, to applications in portfolio management, all of these problems contain a component of either quadratic structures and covariance estimation or some form of optimization based on them. The importance of the above in forecasting cannot be overstated, as they all relate to decision-making at some future time period: a good panel data model can be used to generate out-of-sample forecasts, and a well-constructed covariance matrix can be used to optimize the weights of a portfolio or to build a model for volatility and correlation forecasting. Although we rarely admit it, most models and mathematical operations are, in the end, geared towards forecasting.
In this paper we present a novel mathematical approach for solving a particular class of quadratic optimization problems with applications in econometrics, statistics and portfolio construction. At the center of these mathematical derivations is a symmetric indefinite matrix, in applications a covariance matrix. The indefinite nature of this matrix can come from many sources, but the common ground is rank deficiency or rank indeterminacy caused by redundant information in the variables from which the matrix is computed: a large number of variables involved, or some deficiency in the structure of the underlying problem. This indeterminacy leads to further problems in different set-ups: in the context of a matrix version of the well-known Hausman test in econometrics, the difference in covariance matrices required for the application of the test might not be positive definite; in the context of a large portfolio optimization, the covariance matrix of the financial returns might not be positive definite. Admittedly, many solutions have been proposed for these problems; for example, the Hausman test can be computed with a regression approach so that no matrix inversion is required (or a generalized inverse can be used, as has been done in many parts of the relevant literature), and in the context of a rank-deficient covariance a number of corrections have been proposed. So, what are the new insights that this paper has to offer? First, in the context of the Hausman test the results offer a very clear rule as to when the test (in its matrix form) can be applied. This is useful as a specification tool, and it is very simple to compute and report; when the underlying conditions do not hold for the test to be performed in its matrix form, some caution is warranted, both in model specification and model performance. Second, in the context of portfolio optimization the results suggest a particular procedure for optimizing a portfolio under a given set of constraints without reference to the potential problems of rank indeterminacy of the covariance matrix. This allows for a direct solution of the optimization problem even with sample covariance matrices computed from a limited number of observations. Finally, it should be noted that the optimization problems discussed in this paper have potential for other applications as well. For example, one might consider a generalized least squares-like problem for which the covariance matrix has rank indeterminacy because of, say, fewer observations than variables. In that case, given constraints, the solution proposed here can be utilized. Of course, such a model can then be used in forecasting.

2. Notations

In this paper, we obtain the class of all Q such that Q^tBQ is non-negative definite (nnd), where B is a given symmetric indefinite matrix (the terms 'non-negative' and 'nonnegative' are used interchangeably in the literature). Why is this important? Primarily because there has been very active recent interest in this topic in the literature [1,2,3], but even more importantly because of the vast array of applications in statistics, finance and economics, most notably in panel data econometrics and quadratic optimization, as elaborated in the subsequent sections.
(a) Let θ be a parametric vector of interest and t be an unbiased estimator of θ, the dispersion matrix of which depends on some parameters γ. The estimated dispersion matrix D̂(t) based on γ̂ may turn out to be an indefinite matrix. It is of interest to find the class of all linear parametric functions Q^tθ for which the estimated dispersion matrix of the unbiased estimator Q^tt is non-negative definite. Take a specific instance where D̂(t) = σ^2((1 − ρ̂)I + ρ̂11^t) and ρ̂ < −1/(n − 1), where t is of order n × 1. In this case D̂(t) is indefinite.
(b) Again, let θ be a parametric vector of interest and let t_1 and t_2 be two unbiased estimators of θ. We say that t_1 is superior to t_2 if D̂(t_2) − D̂(t_1) is non-negative definite, or equivalently if D̂(t_1) is below D̂(t_2) under the Löwner order [4]. Suppose neither of t_1 and t_2 is superior to the other. It is then of interest to find sets of linear functions Q^tθ of θ such that Q^t(D̂(t_2) − D̂(t_1))Q is non-negative definite. For a specific case, we consider the fixed effects and random effects panel data models [5] to examine when the issue of endogeneity in the random effects model can be checked using the Hausman test. We study the following:
(i)
Suppose we choose and fix the functional form of the estimators of the variance components. We obtain the class of all regressor matrices for which we can perform the Hausman test.
(ii)
Suppose for given data on regressors the difference in the estimated dispersion matrices is indefinite. We obtain the class of linear compounds of the regression coefficient vector for which we can perform the Hausman test.
(iii)
We note that there always exists an estimator of the variance of the white noise part of the error in the random effects model for which the difference in the estimated dispersion matrices of the fixed effects and random effects estimators of the regression coefficient vector is indeed non-negative definite. Reference [6]’s estimator is one such estimator of the variance component mentioned above. There can be others.
(iv)
We extend the above results to the cases where either the random error or the random effects or both are heteroscedastic.
We are aware that there are alternatives to the traditional Hausman test, such as the one in Reference [7], which incorporates the time-invariant parts of the regressors in the random effects model. However, the most popular test to date is the traditional Hausman test.
(c) Consider the problem of minimizing x^tBx subject to Ax = 0. Clearly, this is equivalent to the unconstrained minimization of u^t(I − A^+A)B(I − A^+A)u, which has a finite solution if and only if (I − A^+A)B(I − A^+A) is non-negative definite. (A^+ denotes the Moore-Penrose inverse of A.) Thus, it is often of interest to explicitly obtain the class of all vectors x such that x^tBx ≥ 0, where B is an indefinite real symmetric matrix. Unfortunately, this class is not a subspace of R^n. It is also not a convex set. In this paper, we characterize the class of all matrices Q such that Q^tBQ is non-negative definite (nnd). We then study the problem of minimizing a quadratic form x^tBx subject to Ax = b, where B is an indefinite matrix and b is in the column space of A. Given a matrix A, we characterize the class of all real symmetric matrices B and vectors b in the column space of A for which the aforementioned problem has a finite solution. It turns out that one of the key conditions for the minimization problem to have a finite solution is the non-negative definiteness of Q^tBQ for a suitable orthogonal projection matrix Q.
In Section 3, we state the results on non-negative definite matrices and generalized inverses which will be needed in the later sections. In Section 4, we obtain necessary and sufficient conditions for Q^tBQ to be non-negative definite (when B is a symmetric indefinite matrix). Based on this, we develop an algorithm to generate all such matrices Q. We then specialize to the cases where B has (i) just one negative eigenvalue and (ii) just one positive eigenvalue. As a special case of (i), we consider the intraclass correlation matrix, which arises naturally as the dispersion matrix in random effects models. Section 5 contains remarks connecting these results to matrix partial orders. In Section 6, we study, in detail, the issues related to performing the Hausman test mentioned in (b) above. In Section 7, we show that the problem of finding the class of all matrices Q such that Q^tBQ is nnd is equivalent to the solution of the quadratic optimization problem: minimize x^tBx subject to Ax = 0, varying over matrices A. As we shall show, the connection between Q and A comes through the relationship that the null space of A equals the column space of Q. Given A, we then determine the class of all matrices Q such that N(A) = C(Q). Likewise, given Q, we also determine the class of all matrices A such that N(A) = C(Q). In Section 8, we study, in some detail, the constrained optimization problem of minimizing x^tBx subject to Ax = b, where b ∈ C(A) and B is a symmetric indefinite matrix. We consider two cases: (i) the problem has a solution for some non-null vector b ∈ C(A) and (ii) the problem has a solution for every non-null b ∈ C(A). Finally, Section 9 concludes.
We use real vectors and matrices in this paper and use the following notations. For a matrix A, ρ(A), tr(A), C(A), N(A), A^t, A^-, A^+, P_A denote respectively the rank, trace, column space, null space, transpose, a generalized inverse, the Moore-Penrose inverse and the orthogonal projector onto the column space of A. For a positive integer r, 1_r denotes a column vector with r components, each equal to 1. Further, J̄_r denotes the matrix of order r × r each element of which is 1/r. Clearly J̄_r = P_{1_r}. The orthogonal projector I_r − J̄_r is denoted by E_r. For matrices A and B, A ⊗ B denotes the Kronecker product defined as ((a_ij B)). A symmetric matrix A is said to be non-negative definite (nnd) if x^tAx ≥ 0 for all vectors x. The symbol diag(A, B, C) denotes the block diagonal matrix [A 0 0; 0 B 0; 0 0 C]. For a random vector ϵ, E(ϵ) and D(ϵ) denote the expectation vector and the dispersion matrix of ϵ. Also, cov(α, ξ) denotes the covariance matrix of α with ξ.
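As a small illustration of this notation, the following sketch (Python/NumPy; our own illustration, not part of the original text) builds J̄_r, E_r and a Kronecker product, and verifies that J̄_r and E_r are orthogonal projectors with orthogonal ranges.

```python
import numpy as np

def J_bar(r):
    """r x r matrix with every entry 1/r; the orthogonal projector onto span(1_r)."""
    return np.full((r, r), 1.0 / r)

def E(r):
    """Orthogonal projector onto the orthogonal complement of 1_r: E_r = I_r - J_bar_r."""
    return np.eye(r) - J_bar(r)

r = 4
for P in (J_bar(r), E(r)):
    assert np.allclose(P @ P, P)         # idempotent
    assert np.allclose(P, P.T)           # symmetric
assert np.allclose(J_bar(r) @ E(r), 0)   # orthogonal ranges

# Kronecker product A (x) B as ((a_ij B))
A = np.array([[1.0, 2.0], [3.0, 4.0]])
print(np.kron(A, J_bar(2)))
```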

3. Preliminaries

In this section, we provide a few results which are well-known and which will be used in the later sections of this paper.
Lemma 1.
Let M = [P Q; Q^t S] be a real symmetric matrix, where P and S are real symmetric matrices. Then M is nnd if and only if
(i)
P is nnd,
(ii)
C(Q) ⊆ C(P), and
(iii)
S − Q^tP^-Q is nnd.
(In view of (ii), Q^tP^-Q is invariant under the choices of generalized inverses of P.)
The following lemma is well known. For a proof, please see Reference [8].
Lemma 2.
Let A and B be matrices of order m × n. Then AA^t = BB^t if and only if A = BT, where T is an orthogonal matrix.
For the proofs of the following Lemmas 3–5, please see Rao and Mitra [11].
Lemma 3.
Let A be a matrix of order m × n. Let b ∈ C(A). Let G be a generalized inverse of A. Then
(i)
N(A) = C(I − GA).
(ii)
The class of all solutions to Ax = b is given by Gb + (I − GA)ζ, where ζ is arbitrary.
Lemma 4.
Let A be an m × n matrix of rank r (> 0). Let A = U [Δ 0; 0 0] V^t be a singular value decomposition of A, where U and V are orthogonal matrices and Δ is a positive definite (pd) diagonal matrix of order r × r. Then the class of generalized inverses of A is given by V [Δ^{-1} L; M N] U^t, where L, M, N are arbitrary. In particular, the Moore-Penrose inverse A^+ of A is given by V [Δ^{-1} 0; 0 0] U^t.
Lemma 5.
Let C and D be nnd matrices of the same order. Then there exists a nonsingular matrix T such that  T t CT and T t DT are diagonal matrices.
The following lemma on quadratic optimization is well-known.
Lemma 6.
Let C be a real symmetric matrix of order n × n. Then the function f(x) = (1/2)x^tCx − d^tx has a minimum value if and only if C is nnd and d ∈ C(C), in which case the minimum value is given by −(1/2)d^tC^+d. Furthermore, if C = S^t [Λ 0; 0 0] S is a spectral decomposition of C, where S is orthogonal and Λ is a diagonal positive definite matrix of order r × r, then the optimal value is achieved by all vectors x of the form
x = C^+d + S^t (0, z^t)^t
for any z ∈ R^{n−r}.
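A minimal numerical sketch of Lemma 6 (Python/NumPy; our own illustration with a randomly generated nnd matrix): for nnd C and d ∈ C(C), the minimum of f equals −(1/2)d^tC^+d, attained at C^+d, and adding a null-space direction of C leaves the value unchanged.

```python
import numpy as np

rng = np.random.default_rng(0)

# Build a singular but nnd matrix C and a vector d in its column space.
M = rng.standard_normal((5, 3))
C = M @ M.T                       # rank 3, nnd
d = C @ rng.standard_normal(5)    # guarantees d in C(C)

C_pinv = np.linalg.pinv(C)
x_star = C_pinv @ d
f = lambda x: 0.5 * x @ C @ x - d @ x

# Minimum value matches -1/2 d^t C^+ d.
assert np.isclose(f(x_star), -0.5 * d @ C_pinv @ d)

# Adding a null-space direction of C leaves the value unchanged.
_, _, Vt = np.linalg.svd(C)
z = Vt[-1]                        # a vector in N(C), since rank(C) = 3 < 5
assert np.isclose(f(x_star + 2.7 * z), f(x_star))
```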
The following result is well-known for researchers in parallel sums of matrices and shorted operators. For a proof, see Reference [4].
Lemma 7.
Let A and B be nnd matrices of the same order. Then C(A) ∩ C(B) = C(A(A + B)^-B), where (A + B)^- is any generalized inverse of A + B.
The following lemmas are well known. For proofs, please see Rao and Bhimasankaram [8].
Lemma 8.
Let A and B be positive definite matrices of the same order. Then B^{-1} − A^{-1} is nnd if and only if A − B is nnd.
Lemma 9.
Let A be a nonsingular matrix of order n × n and let u and v be vectors of order n × 1. Then A + uv^t is nonsingular if and only if 1 + v^tA^{-1}u ≠ 0. Also, if 1 + v^tA^{-1}u ≠ 0, then (A + uv^t)^{-1} = A^{-1} − A^{-1}uv^tA^{-1}/(1 + v^tA^{-1}u).
Lemma 10.
Let A and D be nonsingular matrices of orders n × n and r × r respectively, and let B and C be matrices of orders n × r and r × n respectively. Then A + BDC is nonsingular if and only if W = D^{-1} + CA^{-1}B is nonsingular. Also, if W is nonsingular, then (A + BDC)^{-1} = A^{-1} − A^{-1}B(D^{-1} + CA^{-1}B)^{-1}CA^{-1}.
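The following quick numerical check of Lemma 10 (the Woodbury identity) is our own illustration in Python/NumPy and assumes the random draws keep A + BDC nonsingular; it is not part of the original text.

```python
import numpy as np

rng = np.random.default_rng(1)
n, r = 6, 2
A = rng.standard_normal((n, n)); A = A @ A.T + np.eye(n)   # symmetric pd, hence nonsingular
D = rng.standard_normal((r, r)); D = D @ D.T + np.eye(r)
B = rng.standard_normal((n, r))
C = rng.standard_normal((r, n))

A_inv = np.linalg.inv(A)
W = np.linalg.inv(D) + C @ A_inv @ B          # W = D^{-1} + C A^{-1} B

lhs = np.linalg.inv(A + B @ D @ C)            # assumes A + BDC nonsingular for this draw
rhs = A_inv - A_inv @ B @ np.linalg.inv(W) @ C @ A_inv
assert np.allclose(lhs, rhs)
```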
Lemma 11.
Let A and B be matrices of orders m × n and n × m respectively. Then the non-null eigenvalues of AB and BA are identical.
Lemma 12.
Let A 1 , , A k be commuting real symmetric matrices of the same order. Then they have a simultaneous spectral decomposition.

4. Non-Negative Definiteness of Q t BQ

Let B be a real symmetric indefinite matrix of order n × n and let Q be a matrix with n rows. In this section, we investigate the conditions under which Q t BQ is non-negative definite. We shall also give a method of constructing all such matrices Q.
Let B be a real symmetric indefinite matrix of order n × n . Let a spectral decomposition of B be given by
B = (P_1 : P_2 : P_3) diag(Δ_1, −Δ_2, 0) (P_1 : P_2 : P_3)^t,
where Δ_i is a positive definite diagonal matrix of order r_i × r_i, i = 1, 2, and P = (P_1 : P_2 : P_3) is an orthogonal matrix, P_i being a matrix of order n × r_i, i = 1, 2, 3, such that r_1 + r_2 + r_3 = n.
We prove
Theorem 1.
Let B, P, Δ_1, and Δ_2 be as specified above. Let Q be a matrix of order n × s. Write R_i = Q^tP_i, i = 1, 2, 3, and R = (R_1 : R_2 : R_3). Let ρ(R_i) = ω_i, i = 1, 2. Then Q^tBQ is nnd if and only if there exists a matrix L with number of columns equal to ω_1 such that
LL^t = R_1Δ_1R_1^t
and
LΛL^t = R_2Δ_2R_2^t,
where Λ is a diagonal nnd matrix with exactly ω_2 diagonal elements in (0,1] and the rest equal to 0.
Proof. 
Notice that Q = PR^t and Q^tBQ = RP^tP diag(Δ_1, −Δ_2, 0) P^tPR^t = R_1Δ_1R_1^t − R_2Δ_2R_2^t.
‘If part’:
Q^tBQ = R_1Δ_1R_1^t − R_2Δ_2R_2^t = LL^t − LΛL^t = L(I − Λ)L^t is nnd, since I − Λ is nnd.
‘Only if’ part:
Notice that R_1Δ_1R_1^t and R_2Δ_2R_2^t are both nnd. Since Q^tBQ = R_1Δ_1R_1^t − R_2Δ_2R_2^t is nnd, we have C(R_2) ⊆ C(R_1). By Lemma 5, there exists a nonsingular matrix T such that
R_1Δ_1R_1^t = T [Γ 0; 0 0] T^t
and
R_2Δ_2R_2^t = T [Ω_1 0; 0 Ω_2] T^t,
where Γ is a positive definite diagonal matrix of order ω_1 × ω_1 and diag(Ω_1, Ω_2) is a diagonal nnd matrix of rank ω_2 (Ω_1 is of order ω_1 × ω_1).
Now, R_1Δ_1R_1^t − R_2Δ_2R_2^t is nnd ⟺ T [Γ 0; 0 0] T^t − T [Ω_1 0; 0 Ω_2] T^t is nnd ⟺ Γ − Ω_1 is nnd and Ω_2 = 0. Writing T = (T_1 : T_2), where T_1 has ω_1 columns, we have
R_1Δ_1R_1^t = T_1ΓT_1^t
and
R_2Δ_2R_2^t = T_1Ω_1T_1^t.
Writing L = T_1Γ^{1/2}, we have
R_1Δ_1R_1^t = LL^t
and
R_2Δ_2R_2^t = LΛL^t,
where Λ = Γ^{-1/2}Ω_1(Γ^{-1/2})^t. Further, since Γ − Ω_1 is nnd, so is I − Λ = Γ^{-1/2}(Γ − Ω_1)(Γ^{-1/2})^t. Clearly, all diagonal elements of Λ are in [0, 1]. Since ρ(Λ) = ρ(Ω_1) = ρ(R_2) = ω_2, exactly ω_2 diagonal elements of Λ lie in the interval (0, 1]. Q.E.D. □
Given a real symmetric matrix B, we now give a method of generating all matrices Q such that Q^tBQ is nnd. Let B be as specified just before Theorem 1. Clearly, ω_2 ≤ ω_1. Also, ω_2 ≤ ρ(Δ_2). Thus, ω_2 ≤ l = min{ω_1, ρ(Δ_2)}.
We now prove that Algorithm 1 yields the class of all Q such that Q^tBQ is nnd. First, notice that for each Q obtained through Algorithm 1,
R_1Δ_1R_1^t = LL^t
and
R_2Δ_2R_2^t = LVS_1 [D_1^{1/2} 0; 0 0] UΔ_2^{-1/2} Δ_2 Δ_2^{-1/2}U^t [D_1^{1/2} 0; 0 0]^t S_1^tV^tL^t = LVS_1 [D_1 0; 0 0] S_1^tV^tL^t = LΛL^t,
where Λ is 0 or is orthogonally similar to a diagonal nnd matrix with at most l = min{ω_1, ρ(Δ_2)} diagonal elements in (0,1]. Hence, by Theorem 1, Q^tBQ is nnd.
Next, let Q^tBQ be nnd. Then there exist R_1, R_2 such that Q^tBQ = R_1Δ_1R_1^t − R_2Δ_2R_2^t, and, by Theorem 1, a matrix M with rank ρ(R_1) and a diagonal matrix Λ with rank ρ(R_2) such that MM^t = R_1Δ_1R_1^t and MΛM^t = R_2Δ_2R_2^t, where Λ and I − Λ are nnd.
Therefore, by Lemma 2, R_1Δ_1^{1/2} = MV^t for some orthogonal matrix V. Without loss of generality, define L = MV^t. Then
LVΛV^tL^t = R_2Δ_2R_2^t,
or
LVS [D 0; 0 0] S^tV^tL^t = R_2Δ_2R_2^t,
where S is a permutation matrix. So,
R_2Δ_2R_2^t = LVS_1 [D 0; 0 0] S_1^tV^tL^t,
where S_1 is a semi-permutation matrix with ρ(Δ_2) columns. By Lemma 2,
R_2Δ_2^{1/2} = LVS_1 [D^{1/2} 0; 0 0] U,
where U is an orthogonal matrix. Q.E.D.
Algorithm 1 demonstrates how to construct the class of all Q such that Q t BQ is nnd in an organized manner. However, it is clear that even when R 1 is fixed, the class of all Q such that Q t BQ is nnd is neither a subspace nor a convex set.
Algorithm 1: The Pochiraju algorithm.
Step 1: Choose R_1 and R_3 arbitrarily. (Once R_1 and R_3 are chosen and fixed, their ranks ω_1 and ω_3 automatically get fixed.)
Step 2: Construct L = R_1Δ_1^{1/2}.
Step 3: Choose ω_2 arbitrarily such that 0 ≤ ω_2 ≤ l = min{ω_1, ρ(Δ_2)}.
Step 4: Choose D = diag(d_1, d_2, …, d_{ω_2}, 0, …, 0) = diag(D_1 : 0), where each d_i is an arbitrary number in (0,1].
Step 5: Construct R_2 = R_1Δ_1^{1/2}TΔ_2^{-1/2}, where T is an arbitrary matrix of rank ω_2 with singular values in [0,1]. (This is achieved as follows: choose S_1 to be an arbitrary semi-permutation matrix of order ω_1 × ρ(Δ_2), U an arbitrary orthogonal matrix of order ρ(Δ_2) × ρ(Δ_2), and construct R_2 = LVS_1 [D_1^{1/2} 0; 0 0] UΔ_2^{-1/2}, where V is an orthogonal matrix.)
Step 6: Construct Q = PR^t, where R = (R_1 : R_2 : R_3).
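The following compact numerical sketch (Python/NumPy; our own illustration, with arbitrary choices of R_1, R_3 and of the factor T in the equivalent form R_2 = R_1Δ_1^{1/2}TΔ_2^{-1/2}) builds one member Q of the class and verifies that Q^tBQ is nnd even though B itself is indefinite.

```python
import numpy as np

rng = np.random.default_rng(2)

# A symmetric indefinite B with a zero eigenvalue, built from a chosen spectrum.
n = 5
P, _ = np.linalg.qr(rng.standard_normal((n, n)))       # orthogonal eigenvector matrix
eigvals = np.array([3.0, 1.5, -2.0, -0.5, 0.0])        # two positive, two negative, one zero
B = P @ np.diag(eigvals) @ P.T

pos, neg, null = eigvals > 0, eigvals < 0, eigvals == 0
P1, P2, P3 = P[:, pos], P[:, neg], P[:, null]
D1 = np.diag(eigvals[pos])                             # Delta_1
D2 = np.diag(-eigvals[neg])                            # Delta_2 (positive)

s = 3                                                  # number of columns of Q
R1 = rng.standard_normal((s, P1.shape[1]))             # Step 1: arbitrary R1, R3
R3 = rng.standard_normal((s, P3.shape[1]))

# Steps 2-5 in the equivalent form R2 = R1 Delta1^{1/2} T Delta2^{-1/2},
# with the singular values of T in [0, 1].
T = rng.standard_normal((P1.shape[1], P2.shape[1]))
T = T / np.linalg.norm(T, 2)                           # scale so the largest singular value is 1
R2 = R1 @ np.sqrt(D1) @ T @ np.diag(1.0 / np.sqrt(np.diag(D2)))

# Step 6: assemble Q = P R^t with R = (R1 : R2 : R3).
R = np.hstack([R1, R2, R3])
Q = np.hstack([P1, P2, P3]) @ R.T

# Q^t B Q should be nnd even though B is indefinite.
w = np.linalg.eigvalsh(Q.T @ B @ Q)
assert w.min() > -1e-10
print("eigenvalues of Q^t B Q:", np.round(w, 6))
```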
We now consider two special cases where the construction of the class of all Q such that Q t BQ is nnd becomes simple: (i) B has just one negative eigenvalue and (ii) B has just one positive eigenvalue.
Case (i): B has just one negative eigenvalue.
Choose R_1 and R_3 arbitrarily. Let LL^t = R_1Δ_1R_1^t, where the number of columns in L is ρ(R_1). Since there is only one negative eigenvalue, denote its absolute value by δ_2 and the corresponding matrix R_2 by the vector r_2. Now LΛL^t = δ_2r_2r_2^t. Clearly, Λ has at most one nonzero (positive) diagonal element (say λ), which can appear in any diagonal position. So LΛL^t = λl_il_i^t, where 0 ≤ λ ≤ 1. So r_2 = √(λ/δ_2) l_i. The class of all r_2 is obtained by choosing λ arbitrarily such that 0 ≤ λ ≤ 1 and an arbitrary column (say the ith column) l_i of L, and constructing r_2 = √(λ/δ_2) l_i. Then Q = P(R_1 : r_2 : R_3)^t.
As a special instance, we consider the estimated intraclass correlation matrix B, where the estimated intraclass correlation coefficient ρ̂ < −1/(n − 1). We now obtain the class of all Q such that Q^tBQ is nnd.
Here Δ_1 = (1 − ρ̂)I, Δ_2 = −(1 + (n − 1)ρ̂), P_1P_1^t = I − 11^t/n and P_2P_2^t = 11^t/n, and P_3 does not exist since there is no zero eigenvalue.
Choose R_1 arbitrarily and let L be a matrix of maximum column rank such that LL^t = (1 − ρ̂)R_1R_1^t. Choose a column (arbitrarily), say l_i, and a number λ in the interval (0,1] (arbitrarily). Construct r_2 = √(λ/(−(1 + (n − 1)ρ̂))) l_i. Construct Q = (P_1 : P_2)(R_1 : r_2)^t. These are all the matrices Q such that Q^tBQ is nnd.
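A hedged sketch of this special instance (Python/NumPy; our own illustration with made-up n, s, ρ̂ and an arbitrary column index and λ): the construction above yields a Q with Q^tBQ nnd although the intraclass correlation matrix B is indefinite.

```python
import numpy as np

rng = np.random.default_rng(3)
n, s = 6, 3
rho = -0.4                                   # rho < -1/(n-1) = -0.2, so B is indefinite
B = (1 - rho) * np.eye(n) + rho * np.ones((n, n))

# Orthonormal basis: P2 spans 1_n, P1 spans its orthogonal complement (random completion).
H, _ = np.linalg.qr(np.column_stack([np.ones(n), rng.standard_normal((n, n - 1))]))
P2, P1 = H[:, :1], H[:, 1:]

delta2 = -(1 + (n - 1) * rho)                # absolute value of the single negative eigenvalue
R1 = rng.standard_normal((s, n - 1))         # arbitrary R1
L = np.sqrt(1 - rho) * R1                    # L L^t = (1 - rho) R1 R1^t
lam, i = 0.8, 0                              # arbitrary lambda in (0, 1] and column index
r2 = np.sqrt(lam / delta2) * L[:, i]         # r2 = sqrt(lambda / delta2) * l_i

Q = np.column_stack([P1, P2]) @ np.vstack([R1.T, r2[None, :]])
assert np.linalg.eigvalsh(Q.T @ B @ Q).min() > -1e-10
```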
Case (ii): B has just one positive eigenvalue.
Let δ_1 be the positive eigenvalue. Since R_1 has just one column, we denote it by r_1. Choose r_1 and R_3 arbitrarily. Denote L by l, since L has only one column. Then ll^t = δ_1r_1r_1^t. So, l = √δ_1 r_1.
Since Λ is a 1 × 1 matrix, we denote it by λ. As per Theorem 1, 0 ≤ λ ≤ 1. Choose and fix λ such that 0 ≤ λ ≤ 1. Then, by Theorem 1,
λll^t = R_2Δ_2R_2^t.
Notice that
ρ(R_2) = ρ(R_2Δ_2R_2^t) = ρ(λll^t) = ρ(l) = 1 (for λ > 0).
Write R_2 = uv^t, where u and v are column vectors. Then
λll^t = (v^tΔ_2v) uu^t.
Clearly, u is a scalar multiple of l. Choose v arbitrarily, and then u = √(λ/(v^tΔ_2v)) l. Construct R_2 = uv^t and Q = P(r_1 : R_2 : R_3)^t.
It may be noted that even in these two simple cases, the class of all Q such that Q t BQ is nnd is a complex structure. (Neither of them is an affine space).

5. Remarks

In Theorem 1, we have obtained a solution to the following problem in matrix partial orders:
Suppose two real symmetric matrices C and D are not related by the Löwner order. What is the class of all matrices Q such that Q^tCQ is below Q^tDQ under the Löwner order?
Comparison of estimators of vector-valued parameters is quite common in sample surveys, where no estimator is uniformly superior to the others because the difference in the estimated dispersion matrices of the estimators, say Δ, is indefinite (for details, see Section 6.1 of Reference [9]). The results of this section help in identifying the subsets of linear functions of such parameters for which one estimator is superior to the other, by finding the class of all Q for which Q^tΔQ is nnd.

6. Hausman Test

The usual Hausman test, used to test for endogeneity in the random effects model [5], cannot be performed if the difference between the estimated dispersion matrices of the regression coefficient estimators in the fixed effects and random effects models (with homoscedastic structures for the error in the fixed effects model and for the random error and the random effects in the random effects model), denoted by Π, is not non-negative definite. In this section, we study the difference matrix Π in detail. Since we do not know the regressor matrix at the design stage, we first study when Π is nnd for every choice of the regressor matrix. It turns out that Π is nnd for all regressor matrices X if and only if the estimated variance of the error in the fixed effects model is at least as large as the estimator of the variance component of the random noise part of the error in the random effects model. When the difference in the estimated dispersion matrices of the errors in the fixed effects and random effects models is not nnd, using Algorithm 1 we obtain explicitly the class of all regressor matrices X for which Π is nnd. Owing to this structure, we show that when the number of regressors is larger than the number of individuals, Π cannot be non-negative definite. Finally, for a given regressor matrix X, if Π is not nnd, we find an explicit expression for the class of all linear functions of the regression coefficients for which the Hausman test can be performed. We note that Reference [6]'s estimator of the variance component of the random noise part satisfies the property that the estimated variance of the error in the fixed effects model is at least as large as the estimator of the variance component of the random noise part of the error in the random effects model. Thus, with this choice of the estimators of the variance components, the Hausman test can be performed for all regressor matrices X. We observe that we can always find estimators of the variance components such that the difference in the dispersion matrices of the error structures in the fixed and random effects models is non-negative definite. Finally, we show that for a suitable choice of the variance component estimators, the difference between the estimated dispersion matrices of the regression coefficient estimators in the fixed effects and random effects models is non-negative definite even when there is heteroscedasticity in the random effects or the random error or both.
We briefly introduce the homoscedastic fixed and random effects panel data models. For details please see References [5,7]. Consider a balanced panel (y_{it}, x_{it}^t), t = 1, …, T; i = 1, …, N, where y_{it} is the response and x_{it}^t is a 1 × k vector of values on k regressors for the ith individual at time point t. Denote Y_i = (y_{i1}, …, y_{iT})^t, X_i = (x_{i1}, …, x_{iT})^t, and X = (X_1^t, …, X_N^t)^t. Let 1 denote a column vector of appropriate order with each component equal to 1. Denote F = diag(1, …, 1) = I_N ⊗ 1_T, where each 1 is of order T × 1.
The fixed effects specification is given by Y = Fα + Xβ + ϵ, where α = (α_1, …, α_N)^t is the vector of fixed effects (treated as non-stochastic), β = (β_1, …, β_k)^t is the vector of regression coefficients (also treated as non-stochastic), and ϵ is a random error vector of order NT × 1 with E(ϵ) = 0 and D(ϵ) = σ_F^2 I. (In the fixed effects model, it is assumed that the observational errors are all uncorrelated and have the same variance, denoted by σ_F^2.)
The random effects specification is given by Y = 1_{NT}μ + Fα + Xβ + ξ, where β is as specified in the fixed effects model, ξ plays the role of ϵ, and α is now treated as random with E(α) = 0, D(α) = σ_α^2 I, cov(α, ξ) = 0 and D(ξ) = σ_ξ^2 I. We shall denote σ_1^2 = σ_ξ^2 + Tσ_α^2.
If we denote η = Fα + ξ in the random effects model, we get D(η) = Ω = Tσ_α^2(I_N ⊗ J̄_T) + σ_ξ^2(I_N ⊗ I_T) = σ_1^2(I_N ⊗ J̄_T) + σ_ξ^2(I_N ⊗ E_T).
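A short sketch (Python/NumPy; our own illustration, with made-up values of the variance components) of the error covariance Ω and the identity just stated:

```python
import numpy as np

N, T = 4, 3
sigma_alpha2, sigma_xi2 = 0.7, 1.2
sigma_1sq = sigma_xi2 + T * sigma_alpha2

J_T = np.full((T, T), 1.0 / T)               # J-bar_T
E_T = np.eye(T) - J_T

Omega = T * sigma_alpha2 * np.kron(np.eye(N), J_T) + sigma_xi2 * np.kron(np.eye(N), np.eye(T))
Omega_alt = sigma_1sq * np.kron(np.eye(N), J_T) + sigma_xi2 * np.kron(np.eye(N), E_T)
assert np.allclose(Omega, Omega_alt)         # the two expressions for Omega coincide
```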
The usual fixed and random effects estimators of β , denoted by β F ^ and β R ^ are given by
β̂_F = (X^t(I_N ⊗ E_T)X)^{-1}X^t(I_N ⊗ E_T)Y and β̂_R = (X^tE_{1NT}(E_{1NT}ΩE_{1NT})^{-}E_{1NT}X)^{-1}X^tE_{1NT}(E_{1NT}ΩE_{1NT})^{-}E_{1NT}Y.
Also,
D(β̂_F) = σ_F^2(X^t(I_N ⊗ E_T)X)^{-1} and D(β̂_R) = (X^tE_{1NT}(E_{1NT}ΩE_{1NT})^{-}E_{1NT}X)^{-1}.
Let s_F^2, s_α^2, s_ξ^2, s_1^2 (= s_ξ^2 + Ts_α^2) denote the estimators of σ_F^2, σ_α^2, σ_ξ^2, σ_1^2 respectively, and let D̂(β̂_F) and D̂(β̂_R) denote the estimators of D(β̂_F) and D(β̂_R) in which σ_F^2, σ_α^2, σ_ξ^2, σ_1^2 are replaced by s_F^2, s_α^2, s_ξ^2, s_1^2 respectively. (Ω is replaced by Ω̂, obtained by plugging in the estimators of the variance components.)
As a step towards checking when D̂(β̂_F) − D̂(β̂_R) is nnd for all X, we obtain the spectral decomposition of E_{1NT}(E_{1NT}Ω̂E_{1NT})^{-}E_{1NT} in the following lemma.
Lemma 13.
The spectral decomposition of E_{1NT}(E_{1NT}Ω̂E_{1NT})^{-}E_{1NT} is given by (1/s_1^2)(E_N ⊗ J̄_T) + (1/s_ξ^2)(I_N ⊗ E_T).
Proof. 
Let us start by simplifying E_{1NT}Ω̂E_{1NT}.
First,
E_{1NT}Ω̂ = [(I_N ⊗ I_T) − (J̄_N ⊗ J̄_T)][s_1^2(I_N ⊗ J̄_T) + s_ξ^2(I_N ⊗ E_T)]
= s_1^2(I_N ⊗ J̄_T) + s_ξ^2(I_N ⊗ E_T) − s_1^2(J̄_N ⊗ J̄_T)
(since (J̄_N ⊗ J̄_T)(I_N ⊗ E_T) = J̄_N ⊗ J̄_TE_T = 0)
= s_1^2(E_N ⊗ J̄_T) + s_ξ^2(I_N ⊗ E_T). Now,
E_{1NT}Ω̂E_{1NT} = [s_1^2(E_N ⊗ J̄_T) + s_ξ^2(I_N ⊗ E_T)][(I_N ⊗ I_T) − (J̄_N ⊗ J̄_T)] = s_1^2(E_N ⊗ J̄_T) + s_ξ^2(I_N ⊗ E_T).
Notice that E_N ⊗ J̄_T and I_N ⊗ E_T are both orthogonal projectors and their product is 0. Hence, (1) is the spectral decomposition of E_{1NT}Ω̂E_{1NT}. One generalized inverse (in fact, the Moore-Penrose inverse) of E_{1NT}Ω̂E_{1NT} is (1/s_1^2)(E_N ⊗ J̄_T) + (1/s_ξ^2)(I_N ⊗ E_T).
Further, since Ω̂ is positive definite with probability 1, C(E_{1NT}) ⊆ C(E_{1NT}Ω̂E_{1NT}) with probability 1. Therefore, E_{1NT}(E_{1NT}Ω̂E_{1NT})^{-}E_{1NT} is invariant under the choices of generalized inverses of E_{1NT}Ω̂E_{1NT}.
Now,
E_{1NT}(E_{1NT}Ω̂E_{1NT})^{-}E_{1NT}
= [(I_N ⊗ I_T) − (J̄_N ⊗ J̄_T)][(1/s_1^2)(E_N ⊗ J̄_T) + (1/s_ξ^2)(I_N ⊗ E_T)][(I_N ⊗ I_T) − (J̄_N ⊗ J̄_T)]
= [(1/s_1^2)(E_N ⊗ J̄_T) + (1/s_ξ^2)(I_N ⊗ E_T)][(I_N ⊗ I_T) − (J̄_N ⊗ J̄_T)]
= (1/s_1^2)(E_N ⊗ J̄_T) + (1/s_ξ^2)(I_N ⊗ E_T). Q.E.D.
We are now ready to prove
Theorem 2.
The difference in the estimated dispersion matrices, Π = D̂(β̂_F) − D̂(β̂_R), is nnd for all X if and only if s_F^2 ≥ s_ξ^2.
Proof. 
D̂(β̂_F) − D̂(β̂_R) is nnd if and only if s_F^2[X^t(I_N ⊗ E_T)X]^{-1} − [X^tE_{1NT}[E_{1NT}Ω̂E_{1NT}]^{-}E_{1NT}X]^{-1} is nnd, which (by Lemma 8) holds if and only if X^tE_{1NT}[E_{1NT}Ω̂E_{1NT}]^{-}E_{1NT}X − (1/s_F^2)X^t(I_N ⊗ E_T)X is nnd.
But E_{1NT}[E_{1NT}Ω̂E_{1NT}]^{-}E_{1NT} − (1/s_F^2)(I_N ⊗ E_T) = (1/s_1^2)(E_N ⊗ J̄_T) + (1/s_ξ^2)(I_N ⊗ E_T) − (1/s_F^2)(I_N ⊗ E_T) = (1/s_1^2)(E_N ⊗ J̄_T) + (1/s_ξ^2 − 1/s_F^2)(I_N ⊗ E_T) (in fact, this is the spectral decomposition), which is nnd if and only if 1/s_ξ^2 − 1/s_F^2 ≥ 0, that is, s_F^2 ≥ s_ξ^2. Q.E.D.
If a computed estimator s_ξ^2 is larger than s_F^2, it is clear from Theorem 2 that Π = D̂(β̂_F) − D̂(β̂_R) fails to be nnd at least for some X. We now determine the class of all X for which Π is nnd, so that the Hausman test can be performed for the entire β vector. Towards this end, as already noted in the proof of Theorem 2, the spectral decomposition of S = E_{1NT}[E_{1NT}Ω̂E_{1NT}]^{-}E_{1NT} − (1/s_F^2)(I_N ⊗ E_T) is given by
(1/s_1^2)(E_N ⊗ J̄_T) + (1/s_ξ^2 − 1/s_F^2)(I_N ⊗ E_T). From the spectral decomposition of S it is clear that the distinct eigenvalues of S are 1/s_1^2, 1/s_ξ^2 − 1/s_F^2 and 0, with algebraic multiplicities N − 1, NT − N and 1 respectively. Now we can use Algorithm 1 to obtain the class of all X for which Π is nnd. □
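A hedged numerical sketch of the feasibility check implied by Theorem 2 (Python/NumPy; the values of N, T and of the variance estimators are our own illustrative choices): we build the matrix S from the preceding paragraph, verify the spectral form used in the proof, and report whether Π can be nnd for every X.

```python
import numpy as np

N, T = 3, 4
s_F2, s_xi2, s_alpha2 = 1.0, 1.3, 0.5        # note s_xi2 > s_F2 in this example
s_1sq = s_xi2 + T * s_alpha2

J_T = np.full((T, T), 1.0 / T); E_T = np.eye(T) - J_T
J_N = np.full((N, N), 1.0 / N); E_N = np.eye(N) - J_N
E_1NT = np.eye(N * T) - np.full((N * T, N * T), 1.0 / (N * T))

Omega_hat = s_1sq * np.kron(np.eye(N), J_T) + s_xi2 * np.kron(np.eye(N), E_T)

middle = np.linalg.pinv(E_1NT @ Omega_hat @ E_1NT)
S = E_1NT @ middle @ E_1NT - np.kron(np.eye(N), E_T) / s_F2

# Spectral form used in the proof of Theorem 2.
S_claimed = (1.0 / s_1sq) * np.kron(E_N, J_T) + (1.0 / s_xi2 - 1.0 / s_F2) * np.kron(np.eye(N), E_T)
assert np.allclose(S, S_claimed)

# Pi is nnd for every X iff S is nnd, i.e. iff s_F2 >= s_xi2.
print("nnd for all X:", np.linalg.eigvalsh(S).min() > -1e-10)   # False here, since s_xi2 > s_F2
```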
We prove
Theorem 3.
Let (C_l, C_l^t) be a rank factorization of E_l, where l is a positive integer. Let s_ξ^2 > s_F^2. Then the class of all X for which Π is nnd is given by X = (1/√T)(C_N ⊗ 1_T)R_1^t + (I_N ⊗ C_T)R_2^t + (1/√(NT))(1_N ⊗ 1_T)R_3^t, where
(i)
R_1 and R_3 are arbitrary,
(ii)
R_2 = R_1W, where W is an arbitrary matrix, of rank not greater than that of R_1, having singular values in the interval [0, θ], with θ = [s_1^2(1/s_F^2 − 1/s_ξ^2)]^{-1/2}.
Proof. 
Notice that (1/√T)(C_N ⊗ 1_T), I_N ⊗ C_T and (1/√(NT))(1_N ⊗ 1_T) are orthonormal bases of the eigenspaces of (I − P_{1NT})[(I − P_{1NT})Ω̂(I − P_{1NT})]^{-}(I − P_{1NT}) − (1/s_F^2)(I_N ⊗ E_T) corresponding to the eigenvalues 1/s_1^2, 1/s_ξ^2 − 1/s_F^2 and 0 respectively. The result now follows from Algorithm 1. □
Since S has N T N negative eigen-values, R 2 in the expression for X in Theorem 3 is heavily restricted. Thus, for a large class of matrices X , we cannot perform the Hausman test for all linear parametric functions. We shall now concentrate on this situation (namely, the difference matrix Π is not nnd) and obtain the class of all linear parametric functions for which we can still perform the Hausman test.
Notice that the Hausman test can be performed on estimable linear functions Aβ if and only if AΠA^t is non-negative definite. Also, Aβ is estimable if and only if A is of the form A = Z(I_N ⊗ E_T)X for some Z.
First observe that
D̂(β̂_F) − D̂(β̂_R) = s_F^2[X^t(I_N ⊗ E_T)X]^{-1} − [X^t((1/s_1^2)(E_N ⊗ J̄_T) + (1/s_ξ^2)(I_N ⊗ E_T))X]^{-1}
= (s_F^2 − s_ξ^2)[X^t(I_N ⊗ E_T)X]^{-1} + (X^t(I_N ⊗ E_T)X)^{-1}X^t(I_N ⊗ J̄_T)
[(s_1^2/s_ξ^2)I_{NT} + (I_N ⊗ J̄_T)X(X^t(I_N ⊗ E_T)X)^{-1}X^t(I_N ⊗ J̄_T)]^{-1}(I_N ⊗ J̄_T)X(X^t(I_N ⊗ E_T)X)^{-1} (applying Lemma 10).
Consider Aβ. The class of all A such that the Hausman test can be performed for Aβ is completely determined by the class of all Z such that A = Z(I_N ⊗ E_T)X. Write C = (I_N ⊗ E_T)X and D = (I_N ⊗ J̄_T)X. In these expressions, C and D are the time-varying and time-invariant parts of X, respectively.
We want to determine the class of all Z such that A Π A t is nnd.
Now,
AΠA^t = ZΦZ^t, where
Φ = (s_F^2 − s_ξ^2)P_C + C(C^tC)^{-1}D^t[(s_1^2/s_ξ^2)I + D(C^tC)^{-1}D^t]^{-1}D(C^tC)^{-1}C^t.
As before, we need to get the spectral decomposition of Φ in order to determine the class of all Z such that Z Φ Z t is nnd. We first prove
Lemma 14.
(a) Let s = ρ(D(C^tC)^{-1}D^t). Then s ≤ ρ(C).
(b) The non-null eigenvalues of Φ are (s_F^2 − s_ξ^2) + ((s_1^2/s_ξ^2) + λ_i)^{-1}λ_i, i = 1, …, s, and s_F^2 − s_ξ^2, i = s + 1, …, k, where λ_1, …, λ_s are the non-null eigenvalues of D(C^tC)^{-1}D^t.
Proof. 
(a) is trivial.
(b) follows from the following facts: (i) the non-null eigenvalues of BF and FB are the same, including multiplicities; (ii) B and I + B commute; (iii) the eigenvalues of αI + B are obtained by adding α to the eigenvalues of B; (iv) if two real symmetric matrices commute, they have a simultaneous spectral decomposition; (v) if B and F are two real symmetric matrices of the same order such that C(B) ⊆ C(F), the null space of F is contained in that of B. Q.E.D. □
As a consequence we have the following
Theorem 4.
The spectral decomposition of Φ is given by Φ = GΔG^t, where the columns of G form an orthonormal basis of C(C) and Δ is a diagonal matrix whose diagonal entries are the non-null eigenvalues of Φ as detailed in Lemma 14.
Using the spectral decomposition of Φ, we can determine the class of all Z such that ZΦZ^t is nnd using Algorithm 1. From there we can get the class of all A = Z(I_N ⊗ E_T)X such that we can perform the Hausman test for Aβ.
We now show that there is at least one good estimator of σ ξ 2 which satisfies Theorem 2.
Notice that s_F^2 = R_o^2/(NT − N − k), where R_o^2 is the sum of squared residuals in the fixed effects model. Amemiya's estimator of σ_ξ^2, namely s_ξ^2, is R_o^2/(NT − k − 1) (see page 16 of Reference [7]). Clearly s_F^2 ≥ s_ξ^2 for this choice. Amemiya [6] obtains some optimal properties of this estimator. Thus we have proved
Theorem 5.
For the choice of the estimators s_F^2 = R_o^2/(NT − N − k) and s_ξ^2 = R_o^2/(NT − k − 1) of the error variance in the fixed effects model and of the variance of the random component of the error in the random effects model respectively, where R_o^2 is the sum of squared residuals in the fixed effects model, the difference in the estimated dispersion matrices of the regression coefficient estimators in the fixed effects and random effects models is non-negative definite.
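A small sketch of the choice of estimators in Theorem 5 (Python/NumPy; the simulated panel and the within-estimator implementation are our own illustration): both s_F^2 and s_ξ^2 are computed from the within (fixed effects) residual sum of squares R_o^2, and the inequality of Theorem 2 then holds automatically.

```python
import numpy as np

rng = np.random.default_rng(5)
N, T, k = 10, 6, 3

X = rng.standard_normal((N * T, k))
alpha = np.repeat(rng.standard_normal(N), T)           # individual effects
beta = np.array([1.0, -0.5, 2.0])
y = alpha + X @ beta + rng.standard_normal(N * T)

# Within transformation: subtract individual means (apply I_N (x) E_T to the data).
def within(Z):
    return Z - np.repeat(Z.reshape(N, T, -1).mean(axis=1), T, axis=0)

Xw, yw = within(X), within(y[:, None])[:, 0]
beta_F = np.linalg.lstsq(Xw, yw, rcond=None)[0]         # fixed effects (within) estimator
RSS = np.sum((yw - Xw @ beta_F) ** 2)                   # R_o^2 in the paper's notation

s_F2 = RSS / (N * T - N - k)                            # error variance, fixed effects model
s_xi2 = RSS / (N * T - k - 1)                           # Amemiya's estimator of sigma_xi^2
print(s_F2, s_xi2, s_F2 >= s_xi2)                       # the inequality of Theorem 2 holds
```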
So far, we considered the case where both the random effects and the random error are homoscedastic. We now examine the case where one or both of them are heteroscedastic. Specifically, we explore whether we can find estimators of the variance components whereby the difference in the dispersion matrices of the design parameter estimators corresponding to the fixed effects and random effects specifications respectively is non-negative definite for all X .
Let us first write down the fixed effects and random effects specifications with heteroscedasticity.
The fixed effects specification is given by Y = Fα + Xβ + ϵ, where α = (α_1, …, α_N)^t is the vector of fixed effects (treated as non-stochastic), β = (β_1, …, β_k)^t is the vector of regression coefficients (also treated as non-stochastic), and ϵ is a random error vector of order NT × 1 with E(ϵ) = 0 and D(ϵ) = diag(σ_1^2, σ_2^2, …, σ_N^2) ⊗ I_T.
The random effects specification is given by Y = 1_{NT}μ + Fα + Xβ + ξ, where α, β, ξ are as specified in the fixed effects model except that α is treated as random with E(α) = 0, D(α) = diag(w_1^2, …, w_N^2), cov(α, ξ) = 0 and D(ξ) = diag(r_1^2, …, r_N^2) ⊗ I_T.
If we denote η = Fα + ξ in the random effects model, we get D(η) = Ω = (T diag(w_1^2, …, w_N^2) ⊗ J̄_T) + (diag(r_1^2, …, r_N^2) ⊗ I_T).
Consider the random effects model. Let the estimated dispersion matrices of the random effects and the random error be denoted by D_α = diag(ŵ_1^2, …, ŵ_N^2) and D_ξ ⊗ I_T, where D_ξ = diag(r̂_1^2, …, r̂_N^2).
Hence the estimated error dispersion matrix is Ω̂ = (TD_α ⊗ J̄_T) + (D_ξ ⊗ I_T).
Further, the estimated dispersion matrix of the error in the fixed effects specification is D̂(ϵ) = D_F ⊗ I_T, where D_F = diag(σ̂_1^2, σ̂_2^2, …, σ̂_N^2).
The fixed and random effects estimators of β, denoted by β̂_F and β̂_R, are given by
β̂_F = (X^t(D_F^{-1} ⊗ E_T)X)^{-1}X^t(D_F^{-1} ⊗ E_T)Y and β̂_R = (X^tE_{1NT}(E_{1NT}Ω̂E_{1NT})^{-}E_{1NT}X)^{-1}X^tE_{1NT}(E_{1NT}Ω̂E_{1NT})^{-}E_{1NT}Y.
Also,
D̂(β̂_F) = (X^t(D_F^{-1} ⊗ E_T)X)^{-1} and D̂(β̂_R) = (X^tE_{1NT}(E_{1NT}Ω̂E_{1NT})^{-}E_{1NT}X)^{-1}.
We now proceed to evaluate the difference in the estimated dispersion matrices of the fixed effects and random effects estimators. As before, the difference in the dispersion matrices of the design parameter estimators corresponding to the fixed effects and random effects specifications respectively is non-negative definite for all X if and only if E_{1NT}(E_{1NT}Ω̂E_{1NT})^{-}E_{1NT} − (D_F^{-1} ⊗ E_T) is nnd. We start by computing
E_{1NT}Ω̂E_{1NT} = (I_N ⊗ I_T − J̄_N ⊗ J̄_T)(TD_α ⊗ J̄_T + D_ξ ⊗ E_T + D_ξ ⊗ J̄_T)(I_N ⊗ I_T − J̄_N ⊗ J̄_T)
= (TD_α ⊗ J̄_T + D_ξ ⊗ E_T + D_ξ ⊗ J̄_T − TJ̄_ND_α ⊗ J̄_T − J̄_ND_ξ ⊗ J̄_T)(I_N ⊗ I_T − J̄_N ⊗ J̄_T)
= TD_α ⊗ J̄_T + D_ξ ⊗ E_T + D_ξ ⊗ J̄_T − TJ̄_ND_α ⊗ J̄_T − J̄_ND_ξ ⊗ J̄_T − TD_αJ̄_N ⊗ J̄_T − D_ξJ̄_N ⊗ J̄_T + TJ̄_ND_αJ̄_N ⊗ J̄_T + J̄_ND_ξJ̄_N ⊗ J̄_T
= D_ξ ⊗ E_T + (TD_α + D_ξ − TJ̄_ND_α − J̄_ND_ξ − TD_αJ̄_N − D_ξJ̄_N + TJ̄_ND_αJ̄_N + J̄_ND_ξJ̄_N) ⊗ J̄_T
= D_ξ ⊗ E_T + (T(E_ND_αE_N) + E_ND_ξE_N) ⊗ J̄_T = D_ξ ⊗ E_T + (E_N(TD_α + D_ξ)E_N) ⊗ J̄_T.
Now,
E_{1NT}(E_{1NT}Ω̂E_{1NT})^{-}E_{1NT} = {(I_N ⊗ I_T) − (J̄_N ⊗ J̄_T)}{D_ξ^{-1} ⊗ E_T + (E_N(TD_α + D_ξ)E_N)^{-} ⊗ J̄_T}{(I_N ⊗ I_T) − (J̄_N ⊗ J̄_T)}
= D_ξ^{-1} ⊗ E_T + (E_N(TD_α + D_ξ)E_N)^{-} ⊗ J̄_T − J̄_N(E_N(TD_α + D_ξ)E_N)^{-} ⊗ J̄_T − (E_N(TD_α + D_ξ)E_N)^{-}J̄_N ⊗ J̄_T + J̄_N(E_N(TD_α + D_ξ)E_N)^{-}J̄_N ⊗ J̄_T = D_ξ^{-1} ⊗ E_T + E_N(E_N(TD_α + D_ξ)E_N)^{-}E_N ⊗ J̄_T.
Thus, we proved
Theorem 6.
D̂(β̂_F) − D̂(β̂_R) under the heteroscedastic specification is nnd for all X if and only if (D_ξ^{-1} − D_F^{-1}) ⊗ E_T + E_N(E_N(TD_α + D_ξ)E_N)^{-}E_N ⊗ J̄_T is nnd.
We can always get a positive definite estimator of the dispersion matrix of the random error, that is, of D_ξ. For the difference in the estimated dispersion matrices D̂(β̂_F) − D̂(β̂_R) to be nnd, we need both (D_ξ^{-1} − D_F^{-1}) and E_N(TD_α + D_ξ)E_N to be nnd. We can use an Amemiya-type estimator to make the first expression nnd. It is easy to see that for the second expression to be nnd, it is sufficient that (TD_α + D_ξ) is nnd, which is indeed the case (adapt Equation (2.21) of Reference [7] for each individual). Hence, we can always find error component estimators such that D̂(β̂_F) − D̂(β̂_R) under the heteroscedastic specification is nnd.
The cases where the random error alone or the random effects alone are heteroscedastic are simple special cases of the case we have discussed above. However, if there is heteroscedasticity in the random error, performing the Hausman test requires that T is large, for otherwise the large sample chi-square test will not be valid.
In the next section, we shall show that the problem of finding the class of all Z such that Z Φ Z t is nnd is equivalent to solving a quadratic optimization problem.

7. A Quadratic Optimization Problem

In a previous section we obtained the class of all matrices Q such that Q t BQ is nnd where B is a symmetric indefinite matrix. In this section, for a given symmetric indefinite matrix B , we establish the connection between the following two problems:
(a)
When is Q t BQ nnd?
(b)
When does x^tBx have a minimum subject to Ax = 0?
We prove
Theorem 7.
Let B be a real symmetric indefinite matrix. Then Q^tBQ is nnd if and only if x^tBx has a minimum subject to Ax = 0, where N(A) = C(Q).
Proof. 
We note that the orthogonal projectors onto N(A) and C(Q) are (I − A^+A) and QQ^+ respectively. Hence N(A) = C(Q) if and only if (I − A^+A) = QQ^+.
‘If part’:
x^tBx has a minimum subject to Ax = 0
⟹ (I − A^+A)B(I − A^+A) is nnd
⟹ QQ^+BQQ^+ is nnd
⟹ Q^tBQ = Q^tQQ^+BQQ^+Q is nnd.
‘Only if part’:
Q^tBQ is nnd
⟹ Q(Q^tQ)^+Q^tBQ(Q^tQ)^+Q^t is nnd
⟹ QQ^+BQQ^+ is nnd
⟹ x^tBx has a minimum subject to Ax = 0.
Given A, the class A_A of all Q such that N(A) = C(Q) can be obtained as follows. Let (C, C^t) be a rank factorization of (I − A^+A).
Let A_A = {Q : Q = CT, where T is an arbitrary full row-rank matrix}. Now, Q ∈ A_A ⟹ C(Q) = C(C) (since T is of full row rank) = C(I − A^+A) = N(A).
Conversely, let N(A) = C(Q). Then Q = CT for some matrix T. Also ρ(Q) = dimension of N(A) = dimension of C(I − A^+A) = ρ(C). Since C has a left inverse, ρ(Q) = ρ(T). So (C, T) is a rank factorization of Q. Hence Q ∈ A_A. Q.E.D.
Let Q be a given matrix. We now obtain the class of all matrices A such that N(A) = C(Q). Let (D, D^t) be a rank factorization of (I − QQ^+). Then the class D_Q of all matrices A such that N(A) = C(Q) is given by
D_Q = {A : A = WD^t, where W is an arbitrary full column-rank matrix}.
Proof follows along similar lines to the earlier case. □
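A brief sketch of the two rank-factorization constructions above (Python/NumPy; our own illustration with randomly drawn matrices, which are almost surely of the required ranks): given A, build a Q with N(A) = C(Q); given Q, build an A with the same property.

```python
import numpy as np

rng = np.random.default_rng(7)

def rank_factor(P, tol=1e-10):
    """Return C with C C^t = P and full column rank, for a symmetric nnd P (e.g. a projector)."""
    w, V = np.linalg.eigh(P)
    keep = w > tol
    return V[:, keep] * np.sqrt(w[keep])

A = rng.standard_normal((3, 6))                            # rank 3 with probability 1
P_null = np.eye(6) - np.linalg.pinv(A) @ A                 # orthogonal projector onto N(A)
C = rank_factor(P_null)

T = rng.standard_normal((C.shape[1], 4))                   # arbitrary full row-rank factor
Q = C @ T                                                  # a member of the class A_A

# Check N(A) = C(Q): the two orthogonal projectors coincide.
assert np.allclose(P_null, Q @ np.linalg.pinv(Q))

# Conversely, a member of D_Q: A_new = W D^t with (D, D^t) a rank factorization of I - QQ^+.
D = rank_factor(np.eye(6) - Q @ np.linalg.pinv(Q))
W = rng.standard_normal((5, D.shape[1]))                   # full column rank with probability 1
A_new = W @ D.T
assert np.allclose(np.eye(6) - np.linalg.pinv(A_new) @ A_new, Q @ np.linalg.pinv(Q))
```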

8. A Quadratic Optimization Problem with Non-Homogeneous Linear Constraints

In this section, we consider the problem of minimization of a quadratic form x t Bx subject to linear constraints Ax = b , where B is an n × n symmetric matrix, A an m × n matrix, and b an m × 1 vector. The case where B is a pd matrix is well-known [8]. The case where B is nnd is described in Reference [10]. In this section, for given matrices A and B where B is symmetric (not necessarily nnd), we study when the minimum exists in the following cases:
(i)
For some non-null vector b ∈ R^m.
(ii)
For all non-null vectors b ∈ R^m.
We shall notice that for a suitable matrix Q , Q t BQ being nnd forms an important condition for the existence of a finite solution to the minimization problem. We shall then proceed to characterize the class of all matrices B and vectors b (given a matrix A ), such that x t Bx has a finite minimum subject to Ax = b .
We prove
Theorem 8.
Let B be a real symmetric matrix of order n × n. Let A be an m × n matrix and let b ∈ C(A). Consider the minimization problem:
Minimize x^tBx subject to Ax = b.
Write H_1 = (I − A^+A)B(I − A^+A) and H_2 = (I − A^+A)BA^+A.
(a) The problem has a finite solution for some non-null vector b if and only if
(2) H_1 is nnd, and
(3) H_1(H_1 + H_2H_2^t)^{-}H_2 ≠ 0.
(b) The problem has a finite solution for every b ∈ C(A) if and only if (2) holds and
(4) C(H_2) ⊆ C(H_1).
Let b = Au. The minimum value in either case is given by
(5) u^tA^+ABA^+Au − u^tH_2^tH_1^+H_2u,
and the minimum value is achieved at all vectors x of the form
(6) x = A^+Au − H_1^+H_2u + S^t(0, ζ^t)^t,
where H_1 = S^t [Γ 0; 0 0] S is a spectral decomposition of H_1, S being orthogonal, Γ a diagonal positive definite matrix, and ζ an arbitrary vector in R^{n−r}, r being the rank of H_1.
Proof. 
By Lemma 3, the class of all x satisfying Ax = b is given by
(7) x = A^+b + (I − A^+A)ζ,
where ζ is arbitrary. Substituting (7) into x^tBx, we get
(8) x^tBx = b^t(A^+)^tBA^+b + 2b^t(A^+)^tB(I − A^+A)ζ + ζ^t(I − A^+A)B(I − A^+A)ζ.
Thus, constrained minimization of x^tBx subject to Ax = b is equivalent to unconstrained minimization of the right-hand side of (8) over ζ. Since b ∈ C(A), we can write b = Au for some u. Now the theorem follows from Lemmas 6 and 7. Q.E.D. □
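A hedged numerical sketch of Theorem 8 (Python/NumPy; our own illustration, with an indefinite B constructed, via the identification given after the theorem, so that the conditions for a finite solution for every b hold): the minimum value and minimizer computed from H_1 and H_2 are checked against random feasible points.

```python
import numpy as np

rng = np.random.default_rng(8)
n, m = 6, 2
A = rng.standard_normal((m, n))
r = np.linalg.matrix_rank(A)                       # = 2 with probability 1
U, sv, Vt = np.linalg.svd(A)
V = Vt.T

# B = V [[R11, R21^t], [R21, R22]] V^t with R11 negative definite (so B is indefinite),
# R22 nnd, and R21 = R22 D so that C(R21) is contained in C(R22).
R11 = -np.eye(r)
M = rng.standard_normal((n - r, n - r)); R22 = M @ M.T
R21 = R22 @ rng.standard_normal((n - r, r))
B = V @ np.block([[R11, R21.T], [R21, R22]]) @ V.T

P = np.eye(n) - np.linalg.pinv(A) @ A              # projector onto N(A)
H1 = P @ B @ P
H2 = P @ B @ (np.linalg.pinv(A) @ A)

u = rng.standard_normal(n)
b = A @ u
AplusA = np.linalg.pinv(A) @ A
min_val = u @ AplusA @ B @ AplusA @ u - u @ H2.T @ np.linalg.pinv(H1) @ H2 @ u
x_star = np.linalg.pinv(A) @ b - np.linalg.pinv(H1) @ H2 @ u

assert np.allclose(A @ x_star, b)                  # feasible
assert np.isclose(x_star @ B @ x_star, min_val)    # attains the value in the theorem
for _ in range(200):                               # random feasible points never do better
    x = x_star + P @ rng.standard_normal(n)
    assert x @ B @ x >= min_val - 1e-8
```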
Let us identify (2)–(6) in terms of a singular value decomposition of A. Write b = Au. Let A = U [Δ 0; 0 0] V^t be a singular value decomposition of A.
Write V^tBV = [R_11 R_12; R_21 R_22], where R_11 is of the same order as Δ and R_22 is a square matrix.
It is easy to see that H_1 = V [0 0; 0 R_22] V^t and H_2 = V [0 0; R_21 0] V^t.
Hence, (2) is equivalent to saying that R_22 is nnd, (3) is equivalent to saying that C(R_21) ∩ C(R_22) ≠ {0}, (4) is equivalent to the statement that C(R_21) ⊆ C(R_22), and (5) translates to the expression u^tV [R_11 − R_12R_22^+R_21 0; 0 0] V^tu.
Let ρ(R_22) = s, and let R_22 = M [Γ 0; 0 0] M^t be a spectral decomposition of R_22, where M is orthogonal and Γ is a diagonal positive definite matrix of order s × s. Then a spectral decomposition of H_1 is given by V [I 0; 0 M] P [Γ 0; 0 0] P^t [I 0; 0 M^t] V^t, where P is a suitable permutation matrix. In view of this, (6) translates to the expression V [I 0; −R_22^+R_21 0] V^tu + V [I 0; 0 M] P (0, ζ^t)^t, where ζ is an arbitrary vector in R^{n−s}.
We note that (2), namely H 1 should be nnd, is a key factor for the existence of a finite solution to the constrained optimization problem under consideration, which falls into the line of investigation in Section 1.
Let A be a given m × n matrix of rank r. The above identification helps us in characterizing the class of all real symmetric matrices B and the class of all vectors b C ( A ) such that the constrained optimization problem
Minimize x t B x subject to Ax = b
has a finite solution.
As before, let A = U [Δ 0; 0 0] V^t be a singular value decomposition of A, where U and V are orthogonal matrices and Δ is a positive definite diagonal matrix of order r × r. Write V^tBV = [R_11 R_12; R_21 R_22], where R_11 is of order r × r and R_22 is of order (n − r) × (n − r). Since b ∈ C(A), write b = Au. Characterizing B and b is equivalent to characterizing R_11, R_21, R_22, and u.
If the minimization problem is to have a finite solution for every b ∈ C(A), then the class of all B is given by B = V [R_11 R_21^t; R_21 R_22] V^t,
where (a) R_11 is an arbitrary real symmetric matrix of order r × r,
(b) R_22 is an arbitrary nnd matrix of order (n − r) × (n − r), and
(c) R_21 = R_22D, where D is an arbitrary matrix of order (n − r) × r.
If the minimization problem is to have a solution for some non-null vector b ∈ C(A), then the class of all B is given by B = V [R_11 R_21^t; R_21 R_22] V^t, satisfying (a) and (b) as above and
(d) ((Fy : G), W) is a rank factorization of R_21, where
(i)
(F, F^t) is a rank factorization of R_22,
(ii)
y is an arbitrary non-null vector,
(iii)
G is arbitrary such that (Fy : G) is of full column rank, and
(iv)
W is an arbitrary full row-rank matrix.
The class of all b is obtained as follows:
Using B as obtained above, compute J = H_1(H_1 + H_2H_2^t)^{-}H_2H_2^t. Let w be an arbitrary non-zero vector in C(J). Let u be an arbitrary solution of H_2u = w. Compute b = Au. Notice that b ≠ 0, for if b = 0, then Au = 0 and hence H_2u = 0, which is a contradiction, since H_2u = w ≠ 0.
We prove
Theorem 9.
Let B be a symmetric indefinite matrix of order n × n, A be an m × n matrix, and b be an m × 1 vector in the column space of A. If x^tBx has a finite minimum subject to Ax = b for every b ∈ C(A), then there exists a generalized inverse G of A such that
(9) (I − GA)^tB(I − GA) is nnd, and
(10) BGA = (GA)^tB.
Proof. 
Since x^tBx has a finite minimum subject to Ax = b for every b, by Theorem 8 we have that (I − A^+A)B(I − A^+A) is nnd. Whenever G is a generalized inverse of A, (I − A^+A)(I − GA) = I − GA. Now it follows that (I − GA)^t(I − A^+A)B(I − A^+A)(I − GA) = (I − GA)^tB(I − GA) is nnd. Further, from the discussion after Theorem 8, it follows that R_22 is nnd and C(R_21) ⊆ C(R_22).
Since R_22 is nnd, (I − GA)^tB(I − GA) = V [−ΔM^t; I] R_22 [−MΔ  I] V^t is nnd, whatever M may be. Further, since C(R_21) ⊆ C(R_22), there exists a matrix T such that R_21 = R_22T. Write M = −TΔ^{-1}. Now it is easy to verify that, for this choice of M, G = V [Δ^{-1} L; M N] U^t is a generalized inverse of A (where L and N are arbitrary) such that BGA = (GA)^tB. Q.E.D. □
Remark 1.
A + does not in general have the properties (9) and (10).
Is there anything special about a generalized inverse G of A satisfying (9) and (10)?
It turns out that every generalized inverse G of A satisfying (9) and (10) is in fact a minimum semi-norm generalized inverse of A under a suitable semi-inner product. To see this, construct
S = U [Δ^{-1}(kI − R_11)Δ^{-1}  0; 0  0] U^t,
where k > tr(R_12R_22^{-}R_21). Then
B + A^tSA = V [kI  R_21^t; R_21  R_22] V^t
is nnd, since R_22 is nnd, C(R_21) ⊆ C(R_22) and kI − R_21^tR_22^{-}R_21 is nnd (by virtue of the choice of k and Lemma 1). Also, AGA = A and
(B + A^tSA)GA = BGA + A^tSA = (GA)^tB + A^tSA,
which is thus symmetric. Hence, G is a minimum semi-norm generalized inverse of A under the semi-inner product (x, y) = y^t(B + A^tSA)x (see Theorem 1.4 of Reference [11]).

9. Conclusions, Limitations and Future Research

In this article we discussed extensively, and provided new results for, the optimization of quadratic structures and the corresponding corrections to the construction of covariance matrices. This is an area with a long history in econometrics and financial economics, with applications ranging from testing panel data models and handling volatility to portfolio management. Furthermore, the implications for forecasting are numerous, as all of these problems relate to decision-making at some future time period: panel data models can be used to generate out-of-sample forecasts, and a covariance matrix can be used to optimize a portfolio or in a model for volatility and correlation forecasting. The economic implications of all of the aforementioned are profound. We have presented a novel statistical approach for solving a particular class of quadratic optimization problems. At the center of the mathematical derivations is a symmetric indefinite matrix, the indefinite nature of which can come from many sources, usually rank deficiency or rank indeterminacy driven by redundant information in the variables from which the matrix is computed. This indeterminacy leads to subsequent problems in the context of a matrix version of the well-known Hausman test in econometrics and in the context of large portfolio optimization, where the covariance matrix of the financial returns might not be positive definite. This is the body of literature and applications we contribute to.
Like any other statistical derivation, this paper comes with the usual limitations and caveats of statistical analysis. We provide solutions to well-known problems for which alternative solutions exist [6], and as such no solution is universally better; given the collected samples, the researcher may have to try an extensive array of available tools, to which we emphatically contribute one more here.
For future research, we leave the investigation of further application areas for our propositions, as well as simulations over a wide range of panel data and optimization problems.

Author Contributions

Conceptualization, B.P.; methodology, B.P. and S.S.; software, D.D.T.; validation, B.P., S.S. and D.D.T.; formal analysis, B.P. and S.S.; investigation, B.P., S.S., D.D.T.; resources, B.P. and S.S.; data curation, S.S.; writing–original draft preparation, B.P.; writing–review and editing, S.S., D.D.T., and K.N.; supervision, B.P.; project administration, B.P. and K.N. All authors have read and agreed to the published version of the manuscript. (S.S. and K.N. on behalf of B.P.)

Funding

This research received no external funding.

Acknowledgments

Bhimasankaram Pochiraju has acknowledged and Sridhar Seshadri gratefully acknowledges the research support from the Applied Statistics and Computing Lab, Indian School of Business. Nikolopoulos gratefully acknowledges the research support from Indian School of Business during his two visits in the campuses in Hyderabad in 2017, and the extended stay in Mohali in early 2018 while delivering the PGP elective module on ‘Forecasting Analytics’.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Linton, O.; Tang, H. Estimation of the Kronecker Covariance Model by Quadratic Form; Cambridge Working Papers in Economics 2050; Faculty of Economics, University of Cambridge: Cambridge, UK, 2020. [Google Scholar]
  2. Eriksson, A.; Preve, D.; Yu, J. Forecasting Realized Volatility Using a Nonnegative Semiparametric Model. J. Risk Financ. Manag. 2019, 12, 139. [Google Scholar] [CrossRef] [Green Version]
  3. Toloo, M.; Mensah, E. Robust optimization with nonnegative decision variables: A DEA approach. Comput. Ind. Eng. 2019, 127, 313–325. [Google Scholar] [CrossRef]
  4. Mitra, S.; Bhimasankaram, P.; Malik, S. Matrix Partial Orders. Shorted Operators and Applications; World Scientific: Singapore, 2010. [Google Scholar]
  5. Greene, W. Econometric Analysis, 7th ed.; Prentice Hall: Upper Saddle River, NJ, USA, 2012. [Google Scholar]
  6. Amemiya, T. The estimation of variance components in a variance components model. Int. Econ. Rev. 1971, 12, 1–13. [Google Scholar] [CrossRef]
  7. Baltagi, B. Econometric Analysis of Panel Data, 5th ed.; Wiley: New York, NY, USA, 2013. [Google Scholar]
  8. Rao, A.; Bhimasankaram, P. Linear Algebra, 2nd ed.; Hindustan Book Agency: New Delhi, India, 2000. [Google Scholar]
  9. Goga, C. Variance Estimators in Survey Sampling. Available online: http://goga.perso.math.cnrs.fr/ChapVar1_coursBesan.pdf (accessed on 24 March 2008).
  10. Kambo, N. Mathematical Programming Techniques; Affiliated East West Press: New Delhi, India, 1984. [Google Scholar]
  11. Rao, C.; Mitra, S. Generalized Inverse of Matrices and Its Applications; Wiley: New York, NY, USA, 1971. [Google Scholar]
