1. Introduction
In this paper, we obtain the class of all matrices $Q$ such that $Q'BQ$ is non-negative definite (nnd), where $B$ is a given symmetric indefinite matrix (the terms 'non-negative' and 'nonnegative' are used interchangeably in the literature). Why is this important? Primarily because of the very active recent interest in the topic in the literature [1,2,3], but even more importantly because of the vast array of applications in statistics, finance and economics, most notably in panel data econometrics and quadratic optimization, as elaborated in the subsequent sections.
(a) Let $\theta$ be a parametric vector of interest and $\hat\theta$ be an unbiased estimator of $\theta$, the dispersion matrix of which depends on some unknown parameters. The estimated dispersion matrix, based on estimates of these parameters, may turn out to be an indefinite matrix. It is then of interest to find the class of all linear parametric functions $L\theta$ for which the estimated dispersion matrix of the corresponding estimator $L\hat\theta$ is non-negative definite. Take a specific instance where $D(\hat\theta)$ has the estimated intraclass correlation structure $\hat\sigma^2\{(1-\hat\rho)I_n + \hat\rho J_n\}$ and let $\hat\rho < -1/(n-1)$, where $J_n$ is the matrix of order $n \times n$ with all elements equal to 1. In this case the estimated dispersion matrix is indefinite.
(b) Again, let $\theta$ be a parametric vector of interest and let $\hat\theta_1$ and $\hat\theta_2$ be two unbiased estimators of $\theta$. We say that $\hat\theta_1$ is superior to $\hat\theta_2$ if $D(\hat\theta_2) - D(\hat\theta_1)$ is non-negative definite or, equivalently, if $D(\hat\theta_1)$ is below $D(\hat\theta_2)$ under the Löwner order [4]. Suppose neither of $\hat\theta_1$ and $\hat\theta_2$ is superior to the other. It is of interest to find sets of linear functions $L\theta$ of $\theta$ such that $D(L\hat\theta_2) - D(L\hat\theta_1)$ is non-negative definite. For a specific case, we consider the fixed effects and random effects panel data models [5] to examine when the issue of endogeneity in the random effects model can be checked using the Hausman test. We study the following:
- (i) Suppose we choose and fix the functional form of the estimators of the variance components. We obtain the class of all regressor matrices for which we can perform the Hausman test.
- (ii) Suppose, for given data on the regressors, the difference in the estimated dispersion matrices is indefinite. We obtain the class of linear compounds of the regression coefficient vector for which we can perform the Hausman test.
- (iii) We note that there always exists an estimator of the variance of the white noise part of the error in the random effects model for which the difference in the estimated dispersion matrices of the fixed effects and random effects estimators of the regression coefficient vector is indeed non-negative definite. Reference [6]'s estimator is one such estimator of the variance component mentioned above. There can be others.
- (iv) We extend the above results to the cases where either the random error or the random effects or both are heteroscedastic.
We are aware that there are alternatives to the traditional Hausman test, such as that of Reference [7], which incorporates the time-invariant parts of the regressors in the random effects model. However, the most popular test to date is the traditional Hausman test.
(c) Consider the problem of minimizing $x'Bx$ subject to $Ax = b$. Clearly, this is equivalent to an unconstrained minimization problem (substitute the general solution of $Ax = b$ and minimize over the free vector), which has a finite solution only if $(I - A^+A)B(I - A^+A)$ is non-negative definite ($A^+$ denotes the Moore-Penrose inverse of $A$). Thus, it is often of interest to explicitly obtain the class of all vectors $x$ such that $x'Bx \geq 0$, where $B$ is an indefinite real symmetric matrix. Unfortunately, this class is not a subspace of $\mathbb{R}^n$. It is also not a convex set. In this paper, we characterize the class of all matrices $Q$ such that $Q'BQ$ is non-negative definite (nnd). We then study the problem of minimization of a quadratic form $x'Bx$ subject to $Ax = b$, where $B$ is an indefinite matrix and $b$ is in the column space of $A$. Given a matrix $A$, we characterize the class of all real symmetric matrices $B$ and vectors $b$ in the column space of $A$ for which the aforementioned problem has a finite solution. It turns out that one of the key conditions for the minimization problem to have a finite solution is the non-negative definiteness of $PBP$ for a suitable orthogonal projection matrix $P$.
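As a concrete illustration of the non-convexity just mentioned, here is a small numerical sketch (ours, not part of the original development, with an ad hoc choice of $B$): two vectors satisfy $x'Bx \geq 0$ while their midpoint does not.

```python
import numpy as np

# Indefinite symmetric B: one positive and one negative eigenvalue.
B = np.array([[1.0, 0.0],
              [0.0, -1.0]])

x1 = np.array([1.0, 0.9])    # x1' B x1 = 1 - 0.81 = 0.19 >= 0
x2 = np.array([-1.0, 0.9])   # x2' B x2 = 0.19 >= 0
mid = (x1 + x2) / 2          # midpoint (0, 0.9)

for v in (x1, x2, mid):
    print(v, v @ B @ v)
# The midpoint gives -0.81 < 0: the set {x : x'Bx >= 0} is not convex.
```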
In Section 3, we state the results on non-negative definite matrices and generalized inverses which will be needed in the later sections. In Section 4, we obtain necessary and sufficient conditions for $Q'BQ$ to be non-negative definite (when $B$ is a symmetric indefinite matrix). Based on this, we develop an algorithm to generate all such matrices $Q$. We then specialize to the cases where $B$ has (i) just one negative eigenvalue and (ii) just one positive eigenvalue. As a special case of (i), we consider the intraclass correlation matrix, which arises naturally as the dispersion matrix in random effects models. In Section 6, we study, in detail, the issues related to performing the Hausman test mentioned in (b) above. In Section 7, we show that the problem of finding the class of all matrices $Q$ such that $Q'BQ$ is nnd is equivalent to solving the quadratic optimization problem: minimize $x'Bx$ subject to $Ax = 0$, with $A$ varying over a suitable class of matrices. As we shall show, the connection between $Q$ and $A$ comes through the relationship that the null space of $A$ equals the column space of $Q$. Given $Q$, we then determine the class of all matrices $A$ such that $\mathcal{N}(A) = \mathcal{C}(Q)$. Likewise, given $A$, we also determine the class of all matrices $Q$ such that $\mathcal{C}(Q) = \mathcal{N}(A)$. In Section 8, we study, in some detail, the constrained optimization problem of minimizing $x'Bx$ subject to $Ax = b$, where $b \neq 0$ and $B$ is a symmetric indefinite matrix. We consider two cases: (i) the problem has a solution for some non-null vector $b$, and (ii) the problem has a solution for every non-null $b$ in the column space of $A$. Finally, Section 9 concludes.
2. Notations

We use real vectors and matrices in this paper and use the following notations. For a matrix $A$: $\rho(A)$, $\mathrm{tr}(A)$, $\mathcal{C}(A)$, $\mathcal{N}(A)$, $A'$, $A^-$, $A^+$ and $P_A$ denote respectively the rank, trace, column space, null space, transpose, a generalized inverse, the Moore-Penrose inverse and the orthogonal projector onto the column space of $A$. For a positive integer $r$, $\mathbf{1}_r$ denotes a column vector with $r$ components where each component is 1. Further, $J_r$ denotes the matrix of order $r \times r$ each element of which is 1, and $\bar{J}_r$ the matrix of order $r \times r$ each element of which is $1/r$. Clearly $\bar{J}_r = J_r/r$. The orthogonal projector $I_r - \bar{J}_r$ is denoted by $E_r$. For matrices $A = (a_{ij})$ and $B$, $A \otimes B$ denotes the Kronecker product defined as $(a_{ij}B)$. A symmetric matrix $A$ is said to be non-negative definite (nnd) if $x'Ax \geq 0$ for all vectors $x$. The symbol $\mathrm{diag}(A, B, C)$ denotes a block diagonal matrix with diagonal blocks $A$, $B$ and $C$. For a random vector $y$, $E(y)$ and $D(y)$ denote the expectation vector and the dispersion matrix of $y$. Also, $\mathrm{Cov}(x, y)$ denotes the covariance matrix of $x$ with $y$.
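As a quick numerical companion to these notations, the following sketch (ours; the variable names are ad hoc) verifies that $\bar{J}_r$ and $E_r$ are orthogonal projectors with product zero, and that Kronecker products of projectors are again projectors:

```python
import numpy as np

r = 4
one_r = np.ones((r, 1))                 # the vector 1_r
J_bar = one_r @ one_r.T / r             # J_bar_r: every element equals 1/r
E_r = np.eye(r) - J_bar                 # its orthogonal complement projector

assert np.allclose(J_bar @ J_bar, J_bar)   # idempotent
assert np.allclose(E_r @ E_r, E_r)         # idempotent
assert np.allclose(J_bar @ E_r, 0)         # mutually orthogonal
# Kronecker products of projectors are again projectors:
P = np.kron(np.eye(3), J_bar)
assert np.allclose(P @ P, P)
```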
4. Non-Negative Definiteness of $Q'BQ$

Let $B$ be a real symmetric indefinite matrix of order $n$ and let $Q$ be a matrix with $n$ rows. In this section, we investigate the conditions under which $Q'BQ$ is non-negative definite. We shall also give a method of constructing all such matrices $Q$.
Let $B$ be a real symmetric indefinite matrix of order $n$. Let a spectral decomposition of $B$ be given by
$$B = P_1\Delta_1P_1' - P_2\Delta_2P_2',$$
where $\Delta_i$ is a positive definite diagonal matrix of order $n_i$, $i = 1, 2$, and $P = (P_1 : P_2 : P_3)$ is an orthogonal matrix, $P_i$ being a matrix of order $n \times n_i$, $i = 1, 2, 3$, such that $n_1 + n_2 + n_3 = n$.
We prove
Theorem 1. Let $B$, $P_i$ and $\Delta_i$ be as specified above. Let $Q$ be a matrix of order $n \times m$. Write $X_i = P_i'Q$, $i = 1, 2, 3$, so that $Q = P_1X_1 + P_2X_2 + P_3X_3$. Let $Y_i = \Delta_i^{1/2}X_i$, $i = 1, 2$. Then $Q'BQ$ is nnd if and only if there exists a matrix $L$, with number of columns equal to $n_1$, such that $Y_2 = LY_1$ and $L'L$ is orthogonally similar to Λ, where Λ is a diagonal nnd matrix with exactly $\rho(L)$ diagonal elements in (0,1] and the rest equal to 0.

Proof. Notice that $Q'BQ = X_1'\Delta_1X_1 - X_2'\Delta_2X_2$ and $X_i'\Delta_iX_i = Y_i'Y_i$, $i = 1, 2$.
'If part': $Q'BQ = Y_1'Y_1 - Y_1'L'LY_1 = Y_1'(I - L'L)Y_1$ is nnd since $I - L'L$ is nnd.

'Only if' part: Notice that $Y_1'Y_1$ and $Y_2'Y_2$ are both nnd. Since $Q'BQ = Y_1'Y_1 - Y_2'Y_2$ is nnd, we have $\mathcal{N}(Y_1'Y_1) \subseteq \mathcal{N}(Y_2'Y_2)$. By Lemma 5, there exists a nonsingular matrix $T$ such that $Y_1'Y_1 = T'\,\mathrm{diag}(\Delta, 0)\,T$ and $Y_2'Y_2 = T'\,\mathrm{diag}(D, 0)\,T$, where $\Delta$ is a positive definite diagonal matrix of order $r = \rho(Y_1)$ and $\mathrm{diag}(D, 0)$ is a diagonal nnd matrix of rank $\rho(Y_2)$ ($D$ is of order $r \times r$). $Q'BQ$ is nnd $\Rightarrow$ $T'\,\mathrm{diag}(\Delta - D, 0)\,T$ is nnd $\Rightarrow$ $\mathrm{diag}(\Delta - D, 0)$ is nnd, and hence $\Delta - D$ is nnd. Writing $T = (T_1' : T_2')'$, where $T_1$ has $r$ rows, we have
$$Y_1'Y_1 = T_1'\Delta T_1 \quad\text{and}\quad Y_2'Y_2 = T_1'DT_1.$$
Writing $\Lambda = \Delta^{-1/2}D\Delta^{-1/2}$, we have
$$Y_1'Y_1 = (\Delta^{1/2}T_1)'(\Delta^{1/2}T_1) \quad\text{and}\quad Y_2'Y_2 = (\Delta^{1/2}T_1)'\Lambda(\Delta^{1/2}T_1),$$
where $\Lambda$ is diagonal. By Lemma 2, $Y_1 = V_1\Delta^{1/2}T_1$ and $Y_2 = V_2\Lambda^{1/2}\Delta^{1/2}T_1$ for suitable matrices $V_1$ and $V_2$ with orthonormal columns; taking $L = V_2\Lambda^{1/2}V_1'$, we get $Y_2 = LY_1$ and $L'L = V_1\Lambda V_1'$, which is orthogonally similar to a diagonal nnd matrix. Further, since $\Delta - D$ is nnd, so is $I - \Lambda$. Clearly, all diagonal elements of $\Lambda$ are in [0, 1]. Since $\rho(D) = \rho(Y_2)$, exactly $\rho(Y_2) = \rho(L)$ diagonal elements of $\Lambda$ lie in the interval (0, 1]. Q.E.D. □
Given a real symmetric matrix $B$, we now give a method of generating all matrices $Q$ such that $Q'BQ$ is nnd. Let $B$ be as specified just before Theorem 1. Clearly, $Q = PP'Q$. Also, $P'Q = (X_1' : X_2' : X_3')'$. Thus, $Q = P_1X_1 + P_2X_2 + P_3X_3$.
We now prove that Algorithm 1 yields the class of all $Q$ such that $Q'BQ$ is nnd. First, notice that for each $Q$ obtained through Algorithm 1, $Y_2 = LY_1$, where $L'L$ is 0 or is orthogonally similar to a diagonal nnd matrix with at most $\min(n_1, n_2)$ diagonal elements in (0,1]. Hence, by Theorem 1, $Q'BQ$ is nnd.
Next, let $Q'BQ$ be nnd. Then, by Theorem 1, there exists $L$ with $Y_2 = LY_1$ such that $L'L$ is orthogonally similar to a diagonal nnd matrix $\Lambda$ with exactly $l = \rho(L)$ diagonal elements in (0,1] and the rest equal to 0. Therefore, by Lemma 2, $L$ admits a decomposition $L = U_1D^{1/2}V_1'$, where $D$ is a diagonal matrix of order $l$ whose diagonal elements (the nonzero eigenvalues of $L'L$) lie in (0,1], and $U_1$ and $V_1$ have orthonormal columns. Without loss of generality, write $U_1 = US_1$, where $U$ is an orthogonal matrix of order $n_2$ and $S_1$ is a semi-permutation matrix with $l$ columns. Thus $L$ is exactly of the form constructed in Step 5 of Algorithm 1, and hence $Q$ is of the form generated by the algorithm. Q.E.D.
Algorithm 1 demonstrates how to construct the class of all $Q$ such that $Q'BQ$ is nnd in an organized manner. However, it is clear that, even when $B$ is fixed, the class of all $Q$ such that $Q'BQ$ is nnd is neither a subspace nor a convex set.
Algorithm 1: The Pochiraju algorithm.
Step 1: Choose $X_1$ and $X_3$ arbitrarily. (Once $X_1$ and $X_3$ are chosen and fixed, their ranks $\rho(X_1)$ and $\rho(X_3)$ automatically get fixed.)
Step 2: Construct $Y_1 = \Delta_1^{1/2}X_1$.
Step 3: Choose $l$ arbitrarily such that $0 \leq l \leq \min(n_1, n_2)$.
Step 4: Choose $D = \mathrm{diag}(d_1, \ldots, d_l)$, where each $d_i$ is an arbitrary number in (0,1].
Step 5: Construct $Y_2 = LY_1$, where $L$ is an arbitrary matrix of rank $l$ with singular values in [0,1]. (This is actually achieved as follows: choose $S_1$ to be an arbitrary semi-permutation matrix of order $n_2 \times l$, let $U$ be an arbitrary orthogonal matrix of order $n_2$, and construct $L = US_1D^{1/2}V_1'$, where $V_1$ consists of $l$ columns of an arbitrary orthogonal matrix $V$ of order $n_1$.)
Step 6: Construct $Q = P_1X_1 + P_2X_2 + P_3X_3$, where $X_2 = \Delta_2^{-1/2}Y_2$.
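A minimal NumPy sketch (ours; function and variable names are ad hoc, and the contraction $L$ is generated directly in its singular value form) that draws random choices for Steps 1-6 and verifies that the resulting $Q'BQ$ is nnd:

```python
import numpy as np

rng = np.random.default_rng(0)

def pochiraju_Q(B, m, tol=1e-10):
    """Generate one Q (n x m) with Q'BQ nnd, following Algorithm 1."""
    w, P = np.linalg.eigh(B)                 # B = P diag(w) P'
    pos, neg = w > tol, w < -tol
    P1, P2, P3 = P[:, pos], P[:, neg], P[:, ~(pos | neg)]
    D1 = np.diag(np.sqrt(w[pos]))            # Delta_1^{1/2}
    D2 = np.diag(np.sqrt(-w[neg]))           # Delta_2^{1/2}
    n1, n2 = P1.shape[1], P2.shape[1]

    X1 = rng.standard_normal((n1, m))        # Step 1: X1 (and X3) arbitrary
    X3 = rng.standard_normal((P3.shape[1], m))
    Y1 = D1 @ X1                             # Step 2
    l = min(n1, n2)                          # Step 3: rank of L
    d = rng.uniform(0.05, 1.0, size=l)       # Step 4: singular values in (0,1]
    U, _ = np.linalg.qr(rng.standard_normal((n2, l)))   # orthonormal columns
    V, _ = np.linalg.qr(rng.standard_normal((n1, l)))
    L = U @ np.diag(d) @ V.T                 # Step 5: contraction of rank l
    Y2 = L @ Y1
    X2 = np.linalg.solve(D2, Y2)             # Step 6: X2 = Delta_2^{-1/2} Y2
    return P1 @ X1 + P2 @ X2 + P3 @ X3

B = np.diag([3.0, 1.0, -2.0, 0.0])           # symmetric indefinite, one zero eigenvalue
Q = pochiraju_Q(B, m=3)
print(np.linalg.eigvalsh(Q.T @ B @ Q).min())  # >= 0 up to rounding: Q'BQ is nnd
```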
We now consider two special cases where the construction of the class of all $Q$ such that $Q'BQ$ is nnd becomes simple: (i) $B$ has just one negative eigenvalue and (ii) $B$ has just one positive eigenvalue.
Case (i): B has just one negative eigenvalue.
Here $n_2 = 1$, so $P_2$ is a column vector; denote the single negative eigenvalue by $-\delta$ ($\delta > 0$) and write $P_2 = p_2$. Choose $X_1$ and $X_3$ arbitrarily and let $Y_2 = LY_1$, where $L$, having one row and $n_1$ columns, is a row vector $l'$. Clearly, $L'L = ll'$ has exactly one nonzero (positive) eigenvalue, namely $\lambda = l'l$, which must lie in (0,1]. The class of all $Q$ is therefore obtained by choosing $\lambda$ arbitrarily in (0,1], choosing $l$ arbitrarily with $l'l = \lambda$, and constructing $X_2 = \delta^{-1/2}l'Y_1$. Then $Q = P_1X_1 + p_2X_2 + P_3X_3$.
As a special instance, we consider the estimated intraclass correlation matrix $B = (1-\hat\rho)I_n + \hat\rho J_n$, where the estimated intraclass correlation coefficient satisfies $\hat\rho < -1/(n-1)$. We now obtain the class of all $Q$ such that $Q'BQ$ is nnd.
Here $n_1 = n - 1$ and $n_2 = 1$, with $\Delta_1 = (1-\hat\rho)I_{n-1}$, $\delta = -\{1 + (n-1)\hat\rho\}$ and $p_2 = n^{-1/2}\mathbf{1}_n$; $P_3$ does not exist since there is no zero eigenvalue.
Choose $X_1$ arbitrarily and let $P_1$ be a matrix whose columns form an orthonormal basis of $\mathcal{N}(\mathbf{1}_n')$. Choose a vector $l$ (arbitrarily) with $l'l = \lambda$ for a number $\lambda$ in the interval (0,1] (arbitrarily). Construct $X_2 = \delta^{-1/2}l'\Delta_1^{1/2}X_1$ and $Q = P_1X_1 + p_2X_2$. These are all the matrices $Q$ such that $Q'BQ$ is nnd.
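A quick numerical check (ours, with ad hoc values of $n$ and $\hat\rho$) that such an estimated intraclass correlation matrix is indeed indefinite with exactly one negative eigenvalue:

```python
import numpy as np

n, rho = 5, -0.4                   # rho < -1/(n-1) = -0.25
B = (1 - rho) * np.eye(n) + rho * np.ones((n, n))
print(np.sort(np.linalg.eigvalsh(B)))
# One negative eigenvalue 1 + (n-1)*rho = -0.6; the rest equal 1 - rho = 1.4.
```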
Case (ii): B has just one positive eigenvalue.
Let $\delta_1$ be the positive eigenvalue. Since $P_1$ has just one column, we denote it by $p_1$; thus $n_1 = 1$. Choose $X_1$ and $X_3$ arbitrarily. Denote $L$ by $l$, since $L$ has only one column. Then $L'L = l'l$, a scalar. So, $l'l$ lies in (0, 1] (or $l = 0$).

Since $X_1$ is a $1 \times m$ matrix, we denote it by $x_1'$. As per Theorem 1, $Y_2 = \delta_1^{1/2}\,l\,x_1'$. Choose and fix $l$ such that $l'l \leq 1$. Then, by Theorem 1, $Q'BQ = \delta_1(1 - l'l)\,x_1x_1'$ is nnd. Construct $X_2 = \delta_1^{1/2}\Delta_2^{-1/2}\,l\,x_1'$ and $Q = p_1x_1' + P_2X_2 + P_3X_3$.
It may be noted that, even in these two simple cases, the class of all $Q$ such that $Q'BQ$ is nnd is a complex structure. (In neither case is it an affine space.)
6. Hausman Test
The usual Hausman test for endogeneity in the random effects model [5] cannot be performed if the difference between the estimated dispersion matrices of the regression coefficient estimators in the fixed effects and random effects models (with homoscedastic structures for the error in the fixed effects model and for the random error and the random effects in the random effects model), denoted by $\Pi$, is not non-negative definite. In this section, we study the difference matrix $\Pi$ in detail. Since we do not know the regressor matrix at the design stage, we first study when $\Pi$ is nnd for every choice of the regressor matrix. It turns out that $\Pi$ is nnd for all regressor matrices $X$ if and only if the estimated variance of the error in the fixed effects model is at least as big as the estimate of the variance component of the random noise part of the error in the random effects model. When the difference in the estimated dispersion matrices of the errors in the fixed effects and the random effects models is not nnd, using Algorithm 1, we obtain explicitly the class of all regressor matrices $X$ for which $\Pi$ is nnd. Owing to this structure, we show that when the number of regressors is larger than the number of individuals, $\Pi$ cannot be non-negative definite. Finally, for a given regressor matrix $X$, if $\Pi$ is not nnd, we find an explicit expression for the class of all linear functions of the regression coefficients for which the Hausman test can be performed. We note that Reference [6]'s estimator of the variance component of the random noise part satisfies the property that the estimated variance of the error in the fixed effects model is at least as big as the estimate of the variance component of the random noise part of the error in the random effects model. Thus, with this choice of the estimators of the variance components, the Hausman test can be performed for all regressor matrices $X$. We observe that we can always get estimators of the variance components such that the difference in the dispersion matrices of the error structures in the fixed and random effects models is non-negative definite. Finally, we show that, for a suitable choice of the variance component estimators, the difference between the estimated dispersion matrices of the regression coefficient estimators in the fixed effects and random effects models is non-negative definite even when there is heteroscedasticity in the random effects or the random error or both.
We introduce briefly the homoscedastic fixed and random effects panel data models. For details, please see References [5,7]. Consider a balanced panel data set $\{(y_{it}, x_{it}) : i = 1, \ldots, N;\ t = 1, \ldots, T\}$, where $y_{it}$ is the response and $x_{it}$ is a $k \times 1$ vector of regressor values on $k$ regressors for the $i$th individual at time point $t$. Denote $y_i = (y_{i1}, \ldots, y_{iT})'$, $X_i = (x_{i1} : \cdots : x_{iT})'$, $y = (y_1' : \cdots : y_N')'$ and $X = (X_1' : \cdots : X_N')'$. Let $\mathbf{1}$ denote a column vector of appropriate order where each component is 1. Denote $Z_\alpha = I_N \otimes \mathbf{1}_T$, where $\mathbf{1}_T$ is of order $T \times 1$.
The fixed effects specification is given by $y = Z_\alpha\alpha + X\beta + \nu$, where $\alpha$ is the $N \times 1$ vector of fixed effects (treated as non-stochastic), $\beta$ is the $k \times 1$ vector of regression coefficients (also treated as non-stochastic), and $\nu$ is a random error vector of order $NT \times 1$ with $E(\nu) = 0$ and $D(\nu) = \sigma^2I_{NT}$. (In the fixed effects model, it is assumed that the observational errors are all uncorrelated and have the same variance, denoted by $\sigma^2$.)
The random effects specification is given by $y = Z_\alpha\alpha + X\beta + \nu$, where $y$, $X$ and $\beta$ are as specified in the fixed effects model except that $\alpha$ is treated as random with $E(\alpha) = 0$ and $D(\alpha) = \sigma_\alpha^2I_N$, $D(\nu) = \sigma_\nu^2I_{NT}$, and $\mathrm{Cov}(\alpha, \nu) = 0$. We shall denote $\bar P = I_N \otimes \bar J_T$ and $\bar Q = I_{NT} - \bar P = I_N \otimes E_T$.
If we denote $u = Z_\alpha\alpha + \nu$ in the random effects model, we get $D(u) = \Omega = Z_\alpha(\sigma_\alpha^2I_N)Z_\alpha' + \sigma_\nu^2I_{NT} = \sigma_\alpha^2(I_N \otimes J_T) + \sigma_\nu^2I_{NT}$.
The usual fixed and random effects estimators of $\beta$, denoted by $\hat\beta_{FE}$ and $\hat\beta_{RE}$, are given by
$$\hat\beta_{FE} = (X'\bar QX)^{-1}X'\bar Qy \quad\text{and}\quad \hat\beta_{RE} = (X'\Omega^{-1}X)^{-1}X'\Omega^{-1}y.$$
Let $\hat\sigma^2$, $\hat\sigma_\nu^2$ and $\hat\sigma_\alpha^2$ denote the estimators of $\sigma^2$, $\sigma_\nu^2$ and $\sigma_\alpha^2$ respectively, and let $\hat D(\hat\beta_{FE})$ and $\hat D(\hat\beta_{RE})$ denote the estimators of $D(\hat\beta_{FE}) = \sigma^2(X'\bar QX)^{-1}$ and $D(\hat\beta_{RE}) = (X'\Omega^{-1}X)^{-1}$, where $\sigma^2$, $\sigma_\nu^2$ and $\sigma_\alpha^2$ are replaced by their estimators respectively. ($\Omega$ is replaced by $\hat\Omega$, obtained by plugging in the estimators of the variance components.)
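For concreteness, the following simulation sketch (ours; it assumes the within and GLS formulas displayed above and uses ad hoc parameter values) constructs $\bar P$, $\bar Q$ and $\Omega$, confirms the decomposition derived in Lemma 13 below, and computes $\hat\beta_{FE}$ and $\hat\beta_{RE}$:

```python
import numpy as np

rng = np.random.default_rng(1)
N, T, k = 20, 5, 2
sig_nu2, sig_alpha2, beta = 0.5, 1.0, np.array([1.0, -2.0])

J_bar = np.ones((T, T)) / T
P_bar = np.kron(np.eye(N), J_bar)          # between projector
Q_bar = np.eye(N * T) - P_bar              # within projector

X = rng.standard_normal((N * T, k))
alpha = rng.normal(0, np.sqrt(sig_alpha2), N)
nu = rng.normal(0, np.sqrt(sig_nu2), N * T)
y = np.repeat(alpha, T) + X @ beta + nu    # y = Z_alpha a + X b + nu

# Omega = sigma_alpha^2 (I_N x J_T) + sigma_nu^2 I_NT
Omega = sig_alpha2 * np.kron(np.eye(N), np.ones((T, T))) + sig_nu2 * np.eye(N * T)
lam2 = T * sig_alpha2 + sig_nu2
assert np.allclose(Omega, lam2 * P_bar + sig_nu2 * Q_bar)   # Lemma 13's form

Omega_inv = P_bar / lam2 + Q_bar / sig_nu2                  # spectral inverse
b_fe = np.linalg.solve(X.T @ Q_bar @ X, X.T @ Q_bar @ y)    # within estimator
b_re = np.linalg.solve(X.T @ Omega_inv @ X, X.T @ Omega_inv @ y)
print(b_fe, b_re)
```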
As a step towards checking when $\Pi = \hat D(\hat\beta_{FE}) - \hat D(\hat\beta_{RE})$ is nnd for all $X$, we obtain the spectral decomposition of $\hat\Omega$ in the following lemma.

Lemma 13. The spectral decomposition of $\hat\Omega$ is given by
$$\hat\Omega = \hat\lambda^2(I_N \otimes \bar J_T) + \hat\sigma_\nu^2(I_N \otimes E_T), \quad\text{where } \hat\lambda^2 = T\hat\sigma_\alpha^2 + \hat\sigma_\nu^2. \quad (1)$$

Proof. Let us start with simplifying $\hat\Omega$. First, $I_N \otimes J_T = T(I_N \otimes \bar J_T)$ (since $J_T = T\bar J_T$). Now, using $I_{NT} = I_N \otimes \bar J_T + I_N \otimes E_T$,
$$\hat\Omega = \hat\sigma_\alpha^2(I_N \otimes J_T) + \hat\sigma_\nu^2I_{NT} = (T\hat\sigma_\alpha^2 + \hat\sigma_\nu^2)(I_N \otimes \bar J_T) + \hat\sigma_\nu^2(I_N \otimes E_T). \ \square$$

Notice that $I_N \otimes \bar J_T$ and $I_N \otimes E_T$ are both orthogonal projectors and their product is 0. Hence, (1) is the spectral decomposition of $\hat\Omega$. One generalized inverse (in fact, the Moore-Penrose inverse) of $\hat\Omega$ is $\hat\lambda^{-2}(I_N \otimes \bar J_T) + \hat\sigma_\nu^{-2}(I_N \otimes E_T)$.
Further, since $\hat\Omega$ is positive definite with probability 1, $\hat\Omega^- = \hat\Omega^{-1}$ with probability 1. Therefore, $X'\hat\Omega^-X$ is invariant under the choices of generalized inverses of $\hat\Omega$. Now,
$$\hat\Omega^{-1} = \hat\lambda^{-2}\bar P + \hat\sigma_\nu^{-2}\bar Q. \quad \text{Q.E.D.}$$
We are now ready to prove
Theorem 2. The difference in the estimated dispersion matrices, $\Pi = \hat D(\hat\beta_{FE}) - \hat D(\hat\beta_{RE})$, is nnd for all $X$ if and only if $\hat\sigma^2 \geq \hat\sigma_\nu^2$.

Proof. $\Pi = \hat\sigma^2(X'\bar QX)^{-1} - (X'\hat\Omega^{-1}X)^{-1}$ is nnd for all $X$ if and only if $X'\hat\Omega^{-1}X - \hat\sigma^{-2}X'\bar QX$ is nnd for all $X$, which holds if and only if $\hat\Omega^{-1} - \hat\sigma^{-2}\bar Q$ is nnd.
But
$$\hat\Omega^{-1} - \hat\sigma^{-2}\bar Q = \hat\lambda^{-2}\bar P + (\hat\sigma_\nu^{-2} - \hat\sigma^{-2})\bar Q$$
(in fact, this is the spectral decomposition), which is nnd if and only if $\hat\sigma_\nu^{-2} \geq \hat\sigma^{-2}$, or $\hat\sigma^2 \geq \hat\sigma_\nu^2$. Q.E.D.
If a computed estimate $\hat\sigma_\nu^2$ is larger than $\hat\sigma^2$, it is clear from Theorem 2 that $\Pi$ is not nnd at least for some $X$. We now determine the class of all $X$ for which $\Pi$ is nnd, so that the Hausman test can be performed for the entire vector $\beta$. Towards this end, as we already noted in the proof of Theorem 2, the spectral decomposition of $\Psi = \hat\Omega^{-1} - \hat\sigma^{-2}\bar Q$ is given by
$$\Psi = \hat\lambda^{-2}\bar P + (\hat\sigma_\nu^{-2} - \hat\sigma^{-2})\bar Q.$$
From this spectral decomposition it is clear that the distinct eigenvalues of $\Psi$ are $\hat\lambda^{-2}$ and $\hat\sigma_\nu^{-2} - \hat\sigma^{-2}$, with algebraic multiplicities $N$ and $N(T-1)$ respectively. Now we can use Algorithm 1 (with $B = \Psi$ and $Q = X$) to obtain the class of all $X$ for which $\Pi$ is nnd. □
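The criterion of Theorem 2 is easy to probe numerically; the sketch below (ours, assuming the estimated dispersion matrices $\hat\sigma^2(X'\bar QX)^{-1}$ and $(X'\hat\Omega^{-1}X)^{-1}$ as above, with ad hoc variance values) contrasts the two regimes:

```python
import numpy as np

rng = np.random.default_rng(2)
N, T, k = 10, 4, 3
J_bar = np.ones((T, T)) / T
P_bar = np.kron(np.eye(N), J_bar)
Q_bar = np.eye(N * T) - P_bar

def Pi(X, s2_fe, s2_nu, s2_alpha):
    """Estimated dispersion difference of the FE and RE slope estimators."""
    lam2 = T * s2_alpha + s2_nu
    Om_inv = P_bar / lam2 + Q_bar / s2_nu
    return s2_fe * np.linalg.inv(X.T @ Q_bar @ X) - np.linalg.inv(X.T @ Om_inv @ X)

X = rng.standard_normal((N * T, k))
ok = Pi(X, s2_fe=1.2, s2_nu=1.0, s2_alpha=0.5)   # sigma^2 >= sigma_nu^2
bad = Pi(X, s2_fe=0.8, s2_nu=1.0, s2_alpha=0.5)  # sigma^2 <  sigma_nu^2
print(np.linalg.eigvalsh(ok).min())   # >= 0 up to rounding: Pi is nnd
print(np.linalg.eigvalsh(bad).min())  # typically negative for a generic X
```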
We prove
Theorem 3. Suppose $\hat\sigma^2 < \hat\sigma_\nu^2$. Let $G_1$ and $G_2$ be matrices whose columns form orthonormal bases of $\mathcal{C}(\bar P)$ and $\mathcal{C}(\bar Q)$ respectively. Then the class of all $X$ for which Π is nnd is given by $X = G_1X_1 + G_2X_2$, where
- (i) $X_1$ (of order $N \times k$) is arbitrary, and
- (ii) $X_2 = \Gamma X_1$, where $\Gamma$ is an arbitrary matrix, of rank not greater than that of $X_1$, having singular values in the interval $[0, \theta]$, with $\theta = \hat\lambda^{-1}(\hat\sigma^{-2} - \hat\sigma_\nu^{-2})^{-1/2}$.

Proof. Notice that the columns of $G_1$ and $G_2$ are orthonormal bases of the eigenspaces of $\Psi$ corresponding to the eigenvalues $\hat\lambda^{-2}$ and $\hat\sigma_\nu^{-2} - \hat\sigma^{-2}$ respectively. The result follows on applying Theorem 1 and Algorithm 1 to $\Psi$. Q.E.D.
Since $\Psi$ has $N(T-1)$ negative eigenvalues, $\Gamma$ in the expression for $X$ in Theorem 3 is heavily restricted. Thus, for a large class of matrices $X$, we cannot perform the Hausman test for all linear parametric functions. We shall now concentrate on this situation (namely, the difference matrix $\Pi$ is not nnd) and obtain the class of all linear parametric functions for which we can still perform the Hausman test.
Notice that the Hausman test can be performed on estimable linear functions $F\beta$ if and only if the corresponding difference in estimated dispersion matrices is non-negative definite. Also, $F\beta$ is estimable if and only if $F$ is of the form $C\bar QX$ for some matrix $C$.
First observe that $\hat D(F\hat\beta_{FE}) - \hat D(F\hat\beta_{RE}) = F\,\Pi\,F'$ (applying Lemma 10).
Consider $F = C\bar QX$. The class of all $F$ such that the Hausman test can be performed for $F\beta$ is completely determined by the corresponding class of matrices $C$. Write $X_w = \bar QX$ and $X_b = \bar PX$. In these expressions, $X_w$ and $X_b$ are the time-varying and time-invariant parts of $X$.
We want to determine the class of all $C$ such that $C\Phi C'$ is nnd, where, now,
$$F\,\Pi\,F' = C\,\Phi\,C' \quad\text{with}\quad \Phi = X_w\,\Pi\,X_w'.$$
As before, we need to get the spectral decomposition of $\Phi$ in order to determine the class of all $C$ such that $C\Phi C'$ is nnd. We first prove
Lemma 14. (a) Let $\Phi = X_w\,\Pi\,X_w'$. Then $\mathcal{C}(\Phi) \subseteq \mathcal{C}(X_w)$.
(b) The non-null eigenvalues of Φ coincide, with multiplicities, with the non-null eigenvalues of $\Pi X_w'X_w$.
Proof. (a) is trivial.
(b) follows from the following facts: (i) the non-null eigenvalues of $AB$ and $BA$ are the same, including multiplicities; (ii) $\hat\Omega^{-1}$ and $\bar Q$ commute; (iii) the eigenvalues of $A + cI$ are obtained by adding $c$ to the eigenvalues of $A$; (iv) if two real symmetric matrices commute, they have a simultaneous spectral decomposition. Moreover, if $A$ and $B$ are two real symmetric matrices of the same order such that $\mathcal{C}(B) \subseteq \mathcal{C}(A)$, then the null space of $A$ is contained in that of $B$. Q.E.D. □
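Facts (i) and (iii) used in the proof are easily confirmed numerically (a small sketch of ours):

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((4, 6))
Bm = rng.standard_normal((6, 4))

ev_AB = np.linalg.eigvals(A @ Bm)        # 4 eigenvalues
ev_BA = np.linalg.eigvals(Bm @ A)        # 6 eigenvalues: same nonzeros plus zeros
nz = lambda w: np.sort_complex(w[np.abs(w) > 1e-9])
assert np.allclose(nz(ev_AB), nz(ev_BA))  # fact (i): nonzero spectra agree

C = rng.standard_normal((5, 5))
C = C + C.T                               # symmetric test matrix
c = 2.5
assert np.allclose(np.linalg.eigvalsh(C + c * np.eye(5)),
                   np.linalg.eigvalsh(C) + c)   # fact (iii): shift by c
```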
As a consequence, we have the following
Theorem 4. The spectral decomposition of Φ is given by $\Phi = V\Delta V'$, where the columns of $V$ form an orthonormal basis of $\mathcal{C}(\Phi)$ and Δ is a diagonal matrix whose diagonal entries are the non-null eigenvalues of Φ, as detailed in Lemma 14.
Using the spectral decomposition of $\Phi$, we can determine the class of all $C$ such that $C\Phi C'$ is nnd using Algorithm 1. From there, we can get the class of all $F$ such that we can perform the Hausman test for $F\beta$.
We now show that there is at least one good estimator of $\sigma_\nu^2$ which satisfies Theorem 2. Notice that $\hat\sigma^2 = \mathrm{RSS}/\{N(T-1)-k\}$, where RSS is the sum of squared residuals in the fixed effects model. Amemiya's estimator of $\sigma_\nu^2$, namely $\hat\sigma_\nu^2$, is $\mathrm{RSS}/(NT)$ (see page 16 of Reference [7]). Clearly $\hat\sigma^2 \geq \hat\sigma_\nu^2$ for this choice. Amemiya (1971) obtains some optimal properties of this estimator. Thus, we have proved
Theorem 5. For the choice of the estimators $\hat\sigma^2 = \mathrm{RSS}/\{N(T-1)-k\}$ and $\hat\sigma_\nu^2 = \mathrm{RSS}/(NT)$ of the error variance in the fixed effects model and the variance of the random component of the error in the random effects model respectively, where RSS is the sum of squared residuals in the fixed effects model, the difference in the estimated dispersion matrices of the regression coefficient estimators in the fixed effects and random effects models is non-negative definite.
So far, we considered the case where both the random effects and the random error are homoscedastic. We now examine the case where one or both of them are heteroscedastic. Specifically, we explore whether we can find estimators of the variance components whereby the difference in the dispersion matrices of the design parameter estimators corresponding to the fixed effects and random effects specifications respectively is non-negative definite for all $X$.
Let us first write down the fixed effects and random effects specifications with heteroscedasticity.
The fixed effects specification is given by $y = Z_\alpha\alpha + X\beta + \nu$, where $\alpha$ is the vector of fixed effects (treated as non-stochastic), $\beta$ is the vector of regression coefficients (also treated as non-stochastic), and $\nu$ is a random error vector of order $NT \times 1$ with $E(\nu) = 0$ and $D(\nu) = \Sigma$, a positive definite matrix.
The random effects specification is given by $y = Z_\alpha\alpha + X\beta + \nu$, where $y$, $X$ and $\beta$ are as specified in the fixed effects model except that $\alpha$ is treated as random with $E(\alpha) = 0$ and $D(\alpha) = \Sigma_\alpha$, $D(\nu) = \Sigma_\nu$, and $\mathrm{Cov}(\alpha, \nu) = 0$.
If we denote $u = Z_\alpha\alpha + \nu$ in the random effects model, we get $D(u) = \Omega = Z_\alpha\Sigma_\alpha Z_\alpha' + \Sigma_\nu$.
Consider the random effects model. Let the estimated dispersion matrices of the random effects and the random error be denoted by $\hat\Sigma_\alpha$ and $\hat\Sigma_\nu$, where $\hat\Sigma_\alpha = \mathrm{diag}(\hat\sigma_{\alpha 1}^2, \ldots, \hat\sigma_{\alpha N}^2)$.
Hence the estimated error dispersion matrix is $\hat\Omega = Z_\alpha\hat\Sigma_\alpha Z_\alpha' + \hat\Sigma_\nu$.
Further, the estimated dispersion matrix of the error in the fixed effects specification is $\hat\Sigma$, where $\hat\Sigma = \mathrm{diag}(\hat\Sigma_1, \ldots, \hat\Sigma_N)$, $\hat\Sigma_i$ being the estimated error dispersion matrix for the $i$th individual.
The fixed and random effects estimators of $\beta$, denoted by $\hat\beta_{FE}$ and $\hat\beta_{RE}$, are given by the corresponding generalized least squares expressions, with $\hat\Sigma$ and $\hat\Omega$ used as the respective error dispersion matrices.
We now proceed to evaluate the difference in the estimated dispersion matrices of the fixed effects and random effects estimators. As before, the difference in the dispersion matrices of the design parameter estimators corresponding to the fixed effects and random effects specifications respectively is non-negative definite for all $X$ if and only if $\hat\Omega^{-1} - \bar Q(\bar Q\hat\Sigma\bar Q)^+\bar Q$ is nnd. We start by computing $\bar Q\hat\Omega\bar Q$.
Now, $\bar QZ_\alpha = (I_N \otimes E_T)(I_N \otimes \mathbf{1}_T) = I_N \otimes E_T\mathbf{1}_T = 0$, so that $\bar Q\hat\Omega\bar Q = \bar Q\hat\Sigma_\nu\bar Q$.
Thus, we proved
Theorem 6. $\Pi$ under the heteroscedastic specification is nnd for all $X$ if and only if $\hat\Omega^{-1} - \bar Q(\bar Q\hat\Sigma\bar Q)^+\bar Q$ is nnd.
We can always get a positive definite estimator of the dispersion matrix of the random error, that is, $\hat\Sigma_\nu$. For the difference in the estimated dispersion matrices, $\Pi$, to be nnd, we need both $\bar Q(\hat\Sigma - \hat\Sigma_\nu)\bar Q$ and $Z_\alpha\hat\Sigma_\alpha Z_\alpha'$ to be nnd. We can use an Amemiya-type estimator to make the first expression nnd. It is easy to see that for the second expression to be nnd, it is sufficient that $\hat\Sigma_\alpha$ is nnd. This is indeed nnd. (Adapt equation 2.21 of Reference [7] for each individual.) Hence, we can always find error component estimators such that $\Pi$ under the heteroscedastic specification is nnd.
The cases where the random error alone or the random effects alone are heteroscedastic are simple special cases of the case discussed above. However, if there is heteroscedasticity in the random error, performing the Hausman test requires that $T$ be large, for otherwise the large-sample chi-square test will not be valid.
In the next section, we shall show that the problem of finding the class of all $Q$ such that $Q'BQ$ is nnd is equivalent to solving a quadratic optimization problem.
8. A Quadratic Optimization Problem with Non-Homogeneous Linear Constraints
In this section, we consider the problem of minimization of a quadratic form $x'Bx$ subject to linear constraints $Ax = b$, where $B$ is an $n \times n$ symmetric matrix, $A$ an $m \times n$ matrix, and $b$ an $m \times 1$ vector. The case where $B$ is a pd matrix is well known [8]. The case where $B$ is nnd is described in Reference [10]. In this section, for given matrices $A$ and $B$, where $B$ is symmetric (not necessarily nnd), we study when the minimum exists in the following cases:
- (i) for some non-null vector $b \in \mathcal{C}(A)$;
- (ii) for all non-null vectors $b \in \mathcal{C}(A)$.

We shall notice that, for a suitable matrix $P$, $PBP$ being nnd forms an important condition for the existence of a finite solution to the minimization problem. We shall then proceed to characterize the class of all matrices $B$ and vectors $b$ (given a matrix $A$) such that $x'Bx$ has a finite minimum subject to $Ax = b$.
We prove
Theorem 8. Let $B$ be a real symmetric matrix of order $n$. Let $A$ be an $m \times n$ matrix and let $b \in \mathcal{C}(A)$. Consider the minimization problem:
Minimize $x'Bx$ subject to $Ax = b$.
Write $P = I - A^+A$ and $W = PBP$.
(a) The problem has a finite solution for some non-null vector $b$ if and only if
$$W \ \text{is nnd} \quad (2)$$
and
$$PBA^+b \in \mathcal{C}(W) \ \text{for some non-null } b \in \mathcal{C}(A). \quad (3)$$
(b) The problem has a finite solution for every $b \in \mathcal{C}(A)$ if and only if (2) holds and
$$\mathcal{C}(PBA^+) \subseteq \mathcal{C}(W). \quad (4)$$
Write $x_0 = A^+b$. The minimum value in either case is given by
$$x_0'Bx_0 - x_0'BW^+Bx_0, \quad (5)$$
and the minimum value is achieved at all vectors of the form
$$x = x_0 - W^+Bx_0 + PV_2\zeta, \quad (6)$$
where $W = V\,\mathrm{diag}(\Gamma, 0)\,V'$ is a spectral decomposition of $W$, $V = (V_1 : V_2)$ being orthogonal, Γ a diagonal positive definite matrix, ζ an arbitrary vector in $\mathbb{R}^{n-r}$, $r$ being the rank of $W$.
Proof. By Lemma 3, the class of all $x$ satisfying $Ax = b$ is given by
$$x = A^+b + (I - A^+A)y, \quad (7)$$
where $y$ is arbitrary. Invoking (7) into $x'Bx$, we get
$$x'Bx = x_0'Bx_0 + 2x_0'BPy + y'Wy. \quad (8)$$
Thus, constrained minimization of $x'Bx$ subject to $Ax = b$ is equivalent to unconstrained minimization of the right-hand side of (8) over $y$. Since $b \in \mathcal{C}(A)$, we can write $b = Aw$ for some $w$. Now, the theorem follows from Lemmas 6 and 7. Q.E.D. □
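The substitution (7)-(8) translates directly into code; the following sketch (ours, with the Moore-Penrose inverse playing the role of $A^+$) checks conditions (2) and (3) and returns a minimizer together with the minimum value (5):

```python
import numpy as np

def constrained_min(B, A, b, tol=1e-9):
    """Minimize x'Bx subject to Ax = b via x = A^+ b + P y, P = I - A^+ A."""
    Ap = np.linalg.pinv(A)
    P = np.eye(A.shape[1]) - Ap @ A
    W = P @ B @ P                                  # condition (2): W must be nnd
    if np.linalg.eigvalsh(W).min() < -tol:
        return None                                # objective unbounded below
    x0 = Ap @ b
    g = P @ B @ x0                                 # condition (3): g must lie in C(W)
    Wp = np.linalg.pinv(W)
    if np.linalg.norm(W @ Wp @ g - g) > tol:
        return None
    x_star = x0 - Wp @ g                           # one minimizer, as in (6)
    return x_star, x_star @ B @ x_star             # minimum value, as in (5)

B = np.array([[1.0, 0.0, 0.0],
              [0.0, -1.0, 0.0],
              [0.0, 0.0, 2.0]])                    # indefinite
A = np.array([[0.0, 1.0, 0.0]])                    # constraint pins down x_2
b = np.array([1.0])
x_star, val = constrained_min(B, A, b)
print(x_star, val)                                 # x* = (0, 1, 0), minimum value -1
```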
Let us identify (2)–(6) in terms of a singular value decomposition of $A$. Let us write $a = \rho(A)$. Let $A = U\,\mathrm{diag}(\Sigma, 0)\,V'$ be a singular value decomposition of $A$.
Write $V = (V_1 : V_2)$, where $V_1$ has $a$ columns (the same number as the order of $\Sigma$) and $V$ is a square matrix of order $n$.
It is easy to see that $A^+A = V_1V_1'$ and $P = I - A^+A = V_2V_2'$.
Hence, (2) is equivalent to saying that $V_2'BV_2$ is nnd, (3) is equivalent to saying that $V_2'BA^+b \in \mathcal{C}(V_2'BV_2)$ for some non-null $b \in \mathcal{C}(A)$, (4) is equivalent to the statement that $\mathcal{C}(V_2'BV_1) \subseteq \mathcal{C}(V_2'BV_2)$, and (5) translates to the expression $x_0'\{B - BV_2(V_2'BV_2)^+V_2'B\}x_0$.
Let $W_2 = V_2'BV_2$, and let $W_2 = R\,\mathrm{diag}(\Gamma, 0)\,R'$ be a spectral decomposition of $W_2$, where $R$ is orthogonal and $\Gamma$ is a diagonal positive definite matrix of order $r$. Then a spectral decomposition of $W$ is given by $W = \tilde V\,\mathrm{diag}(\Gamma, 0)\,\tilde V'$, where $\tilde V = (V_2R : V_1)S$ for a suitable permutation matrix $S$. In view of this, (6) translates to the expression $x = x_0 - W^+Bx_0 + V_2R_2\zeta$, where $R_2$ consists of the last $n - a - r$ columns of $R$ and $\zeta$ is an arbitrary vector in $\mathbb{R}^{n-a-r}$.
We note that (2), namely that $PBP$ should be nnd, is a key factor for the existence of a finite solution to the constrained optimization problem under consideration, which falls in line with the investigation in Section 1.
Let $A$ be a given matrix of rank $a$. The above identification helps us in characterizing the class of all real symmetric matrices $B$ and the class of all vectors $b$ such that the constrained optimization problem
Minimize $x'Bx$ subject to $Ax = b$
has a finite solution.
As before, let $A = U\,\mathrm{diag}(\Sigma, 0)\,V'$ be a singular value decomposition of $A$, where $U$ and $V$ are orthogonal matrices and $\Sigma$ is a positive definite diagonal matrix of order $a$. Write $V = (V_1 : V_2)$, where $V_1$ is of order $n \times a$ and $V_2$ is of order $n \times (n-a)$. Since $B = VV'BVV'$, write $V'BV = \begin{pmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{pmatrix}$, with $B_{ij} = V_i'BV_j$. Characterizing $B$ and $b$ is equivalent to characterizing $B_{11}$, $B_{12}$, $B_{22}$ and $b$.
If the minimization problem is to have a finite solution for every $b \in \mathcal{C}(A)$, then the class of all $B$ is given by $B = V\begin{pmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{pmatrix}V'$,
where (a) $B_{11}$ is an arbitrary real symmetric matrix of order $a$;
(b) $B_{22}$ is an arbitrary nnd matrix of order $n - a$;
(c) $B_{21} = B_{22}\Theta$ and $B_{12} = B_{21}'$, where $\Theta$ is an arbitrary matrix of order $(n-a) \times a$.
If the minimization problem is to have a solution for some non-null vector $b$, then the class of all $B$ is given by the same form, satisfying (a) and (b) as above, and
(d) $((E\mu : G), H)$ is a rank factorization of $B_{21}$, where
- (i) $(E, F)$ is a rank factorization of $B_{22}$,
- (ii) $\mu$ is an arbitrary non-null vector,
- (iii) $G$ is arbitrary such that $(E\mu : G)$ is of full column rank, and
- (iv) $H$ is an arbitrary matrix of full row rank.
The class of all $b$ is obtained as follows: Using $H$ as obtained above, let $\eta$ be an arbitrary non-zero scalar and let $w$ be an arbitrary solution of $Hw = \eta e_1$, where $e_1$ is the first column of the identity matrix (a solution exists since $H$ is of full row rank). Compute $b = AV_1w$. Notice that $b \neq 0$, for, if $b = 0$, then $V_1w \in \mathcal{N}(A) = \mathcal{C}(V_2)$, and hence $w = 0$, which is a contradiction, since $Hw = \eta e_1 \neq 0$.
We prove
Theorem 9. Let $B$ be a symmetric indefinite matrix of order $n$, $A$ be an $m \times n$ matrix, and $b$ be an $m \times 1$ vector in the column space of $A$. If $x'Bx$ has a finite minimum subject to $Ax = b$ for every $b \in \mathcal{C}(A)$, then there exists a generalized inverse $G$ of $A$ such that
$$BGA = (BGA)' \quad (9)$$
and
$$(I - A^+A)BGA = 0. \quad (10)$$

Proof. Since $x'Bx$ has a finite minimum subject to $Ax = b$ for every $b \in \mathcal{C}(A)$, by Theorem 8, $W = PBP$ is nnd, where $P = I - A^+A$, and $\mathcal{C}(PBA^+) \subseteq \mathcal{C}(W)$. Whenever $G$ is a generalized inverse of $A$, $AGA = A$. Further, from the discussion after Theorem 7, it follows that $V_2'BV_2$ is nnd and $\mathcal{C}(V_2'BV_1) \subseteq \mathcal{C}(V_2'BV_2)$.
Since $W$ is nnd, $M'WM$ is nnd, whatever $M$ be. Further, since $\mathcal{C}(PBA^+) \subseteq \mathcal{C}(W)$, there exists a matrix $T$ such that $PBA^+ = WT$. Write $M = -W^+PBA^+$. Now it is easy to verify that, for this choice of $M$, $G = A^+ + M + N(I - AA^+)$ is a generalized inverse of $A$ (where $N$ is arbitrary) such that (9) and (10) hold. Q.E.D. □
Remark 1. $A^+$ does not in general have the properties (9) and (10).
Is there anything special about a generalized inverse G of A satisfying (9) and (10)?
It turns out that every generalized inverse $G$ of $A$ satisfying (9) and (10) is in fact a minimum semi-norm generalized inverse of $A$ under a suitable semi-inner product. To see this, construct $M_0 = B + kA'A$, where $k$ is a sufficiently large positive number. Then, $M_0$ is nnd, since $V_2'BV_2$ is nnd, $\mathcal{C}(V_2'BV_1) \subseteq \mathcal{C}(V_2'BV_2)$, and $V_1'(B + kA'A)V_1 - V_1'BV_2(V_2'BV_2)^+V_2'BV_1$ is nnd (by virtue of the choice of $k$ and Lemma 1). Also, $M_0GA = BGA + kA'AGA = BGA + kA'A$, and is thus symmetric, by (9). Further, $(I - A^+A)M_0GA = (I - A^+A)BGA = 0$, by (10). Hence, $G$ is a minimum semi-norm generalized inverse of $A$ under the semi-inner product $(x, y) = x'M_0y$ (see Theorem 1.4 of Reference [11]).