The Data-Constrained Generalized Maximum Entropy Estimator of the GLM: Asymptotic Theory and Inference

Maximum entropy methods of parameter estimation are appealing because they impose no additional structure on the data, other than that explicitly assumed by the analyst. In this paper we prove that the data constrained GME estimator of the general linear model is consistent and asymptotically normal. The approach we take in establishing the asymptotic properties concomitantly identifies a new computationally efficient method for calculating GME estimates. Formulae are developed to compute asymptotic variances and to perform Wald, likelihood ratio, and Lagrangian multiplier statistical tests on model parameters. Monte Carlo simulations are provided to assess the performance of the GME estimator in both large and small sample situations. Furthermore, we extend our results to maximum cross-entropy estimators and indicate a variant of the GME estimator that is unbiased. Finally, we discuss the relationship of GME estimators to Bayesian estimators, pointing out the conditions under which an unbiased GME estimator would be efficient.


Introduction
Information theoretic estimators have been receiving increasing attention in the econometric-statistics literature [1][2][3][4][5][6][7]. In other work, [3] proposed an information theoretic estimator based on minimization of the Kullback-Leibler Information Criterion as an alternative to optimally-weighted generalized method of moments estimation. This specific estimator handles weakly dependent data generating mechanisms, and under reasonable regularity assumptions it is consistent and asymptotically normally distributed. Subsequently, [1] proposed an information theoretic estimator based on minimization of the Cressie-Read discrepancy statistic as an alternative approach to inference in moment condition models. [1] also identified a special case of the Cressie-Read statistic, the Kullback-Leibler Information Criterion (e.g., maximum entropy), as being preferred over other estimators (e.g., empirical likelihood) because of its efficiency and robustness properties. Special issues of the Journal of Econometrics (March 2002) and Econometric Reviews (May 2008) were devoted to this particular topic of information estimators.
Historically, information theoretic estimators have been motivated in several ways. The Cressie-Read statistic directly minimizes an information based concept of closeness between the estimated and empirical distribution [1]. Alternatively, the maximum entropy principle is based on an axiomatic approach that defines a unique objective function to measure uncertainty of a collection of events [8][9][10]. Interest in maximum entropy estimators stems from the prospect of recovering and processing information when the underlying sampling model is incompletely or incorrectly known and the data are limited, partial, or incomplete [10]. To date the principle of maximum entropy has been applied in an abundance of circumstances, including in the fields of econometrics and statistics [11][12][13][14][15][16][17], economic theory and applications [18][19][20][21][22][23][24], accounting and finance [25][26][27], and resource and agricultural economics [28][29][30][31][32]. Moreover, widely used econometric software packages are now incorporating procedures to calculate maximum entropy estimators in their latest releases (e.g., SAS, SHAZAM, and GAUSSX).
In most cases, rigorous investigation of the small and large sample properties of information theoretic estimators has lagged far behind empirical applications [3]. Exceptions include [1][2][3], who examined information theoretic alternatives to generalized method of moments estimation; [14], who derived the statistical properties of the generalized maximum entropy estimator in the context of modeling multinomial response data; and [10], who provided asymptotic properties for the moment-constrained generalized maximum entropy (GME) estimator for the general linear model (showing it is asymptotically equivalent to ordinary least squares). An alternative information theoretic estimator of the general linear model (GLM), yet to be rigorously investigated but that has arisen in empirical applications (e.g., [24]), is the purely data-constrained formulation of the generalized maximum entropy estimator [10]. In a purely data-constrained formulation the regression model itself, as opposed to moment conditions of it, represents the constraining function to the entropy objective function. In the maximum entropy framework, unlike ordinary least squares or maximum likelihood estimators of the GLM, moment constraints are not necessary to uniquely identify parameter estimates. Moreover, there exist distinct differences between the data-constrained and moment-constrained versions of the GME for the GLM. For example, [10] have shown the data-constrained GME estimator to be mean square error superior to the moment-constrained GME estimator of the GLM in selected Monte Carlo experiments.
Our paper contributes to the econometric literature in several ways. First, regularity conditions are identified that provide a solid foundation from which to develop statistical properties of the data-constrained GME estimator of the GLM and hypothesis tests on model parameters. Given the regularity conditions, we define a conditional maximum entropy function to rigorously prove consistency and asymptotic normality. As demonstrated in this paper, the data-constrained GME estimator is not asymptotically equivalent to the moment-constrained GME estimator or the ordinary least squares estimator. However, the GME estimator is shown to be nearly asymptotically efficient. Moreover, we derive formulae to compute the asymptotic variance of the proposed estimator. This allows us to define classical Wald, Likelihood Ratio, and Lagrange Multiplier tests for testing hypotheses about model parameters.
Second, theoretical extensions to unbiased, cross entropy, and Bayesian estimation are also identified. Further, we demonstrate that the GME specification can be extended from finite-discrete parameter and error spaces to infinite-continuous parameter and error spaces. Alternative formulations of the data-constrained GME estimator of the GLM under selected regularity conditions, and the implications for properties of the estimator, are also discussed.
Third, to complement the theoretical results, Monte Carlo experiments are used to compare the performance of the data-constrained GME estimates to least squares estimates for small and medium size samples. The performance of the GME estimator is tested relative to selected distributions of the errors, to the user supplied supports of the parameters and errors, and to its robustness to model misspecification. Monte Carlo experiments are also performed to examine the size and power of the Wald, Likelihood Ratio, and Lagrange Multiplier test statistics.
Fourth, insight into computational efficiency and guidelines for setting boundaries of parameter and error support spaces are discussed. The conditional maximum entropy formulation utilized in the proof of asymptotic properties provides a basis for a new computationally efficient method of calculating GME estimates. The approach involves a nonlinear search over a K-vector of coefficient parameters, which is much more efficient than numerical approaches proposed elsewhere in the literature. Finally, practical guidelines for setting boundaries of parameter and error support spaces are analyzed and discussed.

The Data-Constrained GME Formulation
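Let $Y = X\beta + \varepsilon$ represent the general linear model with $Y$ being an $N \times 1$ dependent variable vector, $X$ being a fixed $N \times K$ matrix of explanatory variables, $\beta$ being a $K \times 1$ vector of parameters, and $\varepsilon$ being an $N \times 1$ vector of disturbance terms. (All of our results can be extended to stochastic $X$. For example, if the rows $X_{i\cdot}$ are iid with a positive definite covariance matrix, then the asymptotic properties are identical to those developed below.) The GME rule for defining the estimator of the unknown $\beta$ in the general linear model formulation is given by

$$\hat\beta = Z\hat p, \qquad (\hat p, \hat w) = \arg\max_{p,\,w}\left\{-p'\ln(p) - w'\ln(w)\right\} \qquad (1)$$

subject to $Y = XZp + Vw$, $\;1_K = (I_K \otimes 1_M')\,p$, $\;1_N = (I_N \otimes 1_J')\,w$, $\;p \ge [0]$, $\;w \ge [0]$.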
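In the preceding formulation, the matrices $Z$ and $V$ are $K \times KM$ and $N \times NJ$ matrices of support points for the $\beta$ and $\varepsilon$ vectors, respectively, as:

$$Z = \begin{bmatrix} z_1' & 0 & \cdots & 0 \\ 0 & z_2' & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & z_K' \end{bmatrix}, \qquad V = \begin{bmatrix} v_1' & 0 & \cdots & 0 \\ 0 & v_2' & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & v_N' \end{bmatrix},$$

where $z_k = (z_{k1}, \ldots, z_{kM})'$ and $v_i = (v_{i1}, \ldots, v_{iJ})'$ are vectors of support points, $p = (p_1', \ldots, p_K')'$ is the $KM \times 1$ vector of weights on the support points for $\beta$, and similarly $w = (w_1', \ldots, w_N')'$ is the $NJ \times 1$ vector of weights on the support points for $\varepsilon$.

The basic principle underlying the estimator for $\beta$ is to choose an estimate that contains only the information available. In this way the maximum entropy estimator is not constrained by any extraneous assumptions. The information used is the observed information contained in the data, the information contained in the constraints on the admissible values of $\beta$, and the information inherent in the structure of the model, including the choice of the supports for the $\beta_k$'s. In effect, the information set used in estimation is shrunk to the boundary of the observed data and the parameter constraint information. Because the objective function value increases as the weights in $p_k$ and $w_i$ are more uniformly distributed, any deviation from uniformity represents the effect of the data constraints on the weighting of the support points used for representing $\beta$ and $\varepsilon$. This fact also motivates the interpretation of the GME as a shrinkage-type estimator that in the absence of constraints on $\beta$ will shrink $\hat\beta$ to the centers of the supports defined in the specification of $Z$.

We next establish consistency and asymptotic normality results for the GME estimator under general regularity conditions on the specification of the estimation problem.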
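The role of the support matrices and weights can be illustrated with a small numerical sketch. The support values, the weight vectors, and the helper names (`block_diag_supports`, `entropy`, `matvec`) below are hypothetical and chosen only for illustration; the sketch simply shows that uniform weights maximize the entropy objective and map to the centers of the supports, which is the shrinkage interpretation described above.

```python
import math

def block_diag_supports(supports):
    """Arrange row vectors of support points into a block-diagonal matrix."""
    width = sum(len(s) for s in supports)
    rows, offset = [], 0
    for s in supports:
        row = [0.0] * width
        row[offset:offset + len(s)] = s   # place this support vector in its block
        rows.append(row)
        offset += len(s)
    return rows

def entropy(weights):
    """Entropy -sum(w * ln w), treating 0 * ln 0 as 0."""
    return -sum(w * math.log(w) for w in weights if w > 0.0)

def matvec(matrix, vector):
    """Plain matrix-vector product."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]

# Hypothetical example: K = 2 coefficients with M = 3 support points each.
Z = block_diag_supports([[-5.0, 0.0, 5.0], [-1.0, 1.0, 3.0]])

p_uniform = [1.0 / 3.0] * 6                      # uninformed weights
p_tilted = [0.1, 0.2, 0.7, 1 / 3, 1 / 3, 1 / 3]  # weights pulled by "data"

beta_uniform = matvec(Z, p_uniform)  # centers of the two supports
beta_tilted = matvec(Z, p_tilted)    # first coefficient moves away from center

print(beta_uniform, entropy(p_uniform))
print(beta_tilted, entropy(p_tilted))
```

Any departure from uniform weights lowers the entropy value, so the estimator moves a coefficient away from the center of its support only to the extent the data constraint requires.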

Consistency and Asymptotic Normality of the GME Estimator
Regularity Conditions. To establish asymptotic results for the GME estimator, we utilize the following regularity conditions for the problem of estimating $\beta$ in $Y = X\beta + \varepsilon$.
R1. The $\varepsilon_i$'s are iid with $c_1 + \delta \le \varepsilon_i \le c_J - \delta$ for some $\delta > 0$ and large enough finite positive $c_J = -c_1$.
R2. The pdf of $\varepsilon$, $f(e)$, is symmetric around zero.
R3. …
R4. $X$ has full column rank.
R5. $N^{-1}X'X$ is $O(1)$ and the smallest eigenvalue of $N^{-1}X'X$ exceeds $\epsilon$ for all $N > N^*$, for some $\epsilon > 0$, and where $N^*$ is some positive integer.
R6. …
The conditions R4-R6 on $X$ are familiar analogues to typical assumptions made in the least squares context for establishing asymptotic properties of the least squares estimator of $\beta$. We utilize condition R6 to simplify the demonstration of asymptotic normality, but the result can be established under weaker conditions, as alluded to in the proof. Finally, our proof of the asymptotic results will utilize symmetry of the disturbance distribution, which is the content of condition R2.
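Reformulated GME Rule. The asymptotic results are derived within the context of the following representation of the GME model, represented in scalar notation to facilitate exposition of the proof. The GME representation described below is completely consistent with the formulation in Section 2 under the condition that the support points represented by the vector $v_i$ are chosen to be symmetrically dispersed around 0. We use the same vector of support points for each of the $\varepsilon_i$'s, consistent with the iid nature of the disturbances, and so henceforth $v_\ell$ refers to the common $\ell$th scalar support point in the development below. The representation is also more general than the representation in Section 2 in the sense that different numbers of support points can be used for the representation of different $\beta_k$ parameters. The constrained maximum entropy problem is as follows:

$$\max_{b,\,p,\,w}\left\{-\sum_{k=1}^{K}\sum_{\ell=1}^{J_k} p_{k\ell}\ln(p_{k\ell}) - \sum_{i=1}^{N}\sum_{\ell=1}^{J} w_{i\ell}\ln(w_{i\ell})\right\} \qquad (2)$$

subject to:
C1. $\sum_{\ell=1}^{J_k} z_{k\ell}\,p_{k\ell} = b_k$, $k = 1, \ldots, K$
C2. $\sum_{\ell=1}^{J} v_\ell\,w_{i\ell} = e_i(b) = Y_i - X_{i\cdot}b$, $i = 1, \ldots, N$
C3. $c_1 = v_1 < v_2 < \cdots < v_J = c_J$
C4. $v_\ell = -v_{J+1-\ell}$, $\ell = 1, \ldots, J$ (thus for odd $J$, $v_{(J+1)/2} = 0$)
C5. $\sum_{\ell=1}^{J_k} p_{k\ell} = 1$, $k = 1, \ldots, K$
C6. $\sum_{\ell=1}^{J} w_{i\ell} = 1$, $i = 1, \ldots, N$

As will become apparent, the nonnegativity restrictions on $p_{k\ell}$ and $w_{i\ell}$ are inherently enforced by the structure of the optimization problem itself, and thus need not be explicitly incorporated into the constraint set.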
Asymptotic Properties.The following theorem establishes the consistency and asymptotic normality of the GME estimator of β in the GLM.

With the addition of regularity condition R6, the GME estimator is asymptotically normally distributed as
for appropriate definitions of 2 , , and Proof.Define the maximized entropy function, conditional on b   , as: ( 1) ( , : The optimal value of 1 ( , , ) in the conditionally-maximized entropy function is given by: ( ) arg max ln( ) , which is the maximizing solution to the Lagrangian: The optimal value of i w  is then: ( ( ( ))) ( ( ( ))) Similarly, the optimal value of 1 ( , , ) in the conditionally-maximized entropy function is given by: , which is the maximizing solution to the Lagrangian: .
The optimal value of k p  is then: where ( ) Substituting the optimal solutions for the k p  's and i w  's into (2) obtains the conditional maximum value function: Define the gradient vector of ( ) ( ) as ( ) , where ( )   and ( ( )) e   are 1 K  and 1 N  vectors of Lagrangian multipliers.It follows that the Hessian matrix of ( )  F  is given by: Regarding the functional form of the derivatives of the Lagrangian multipliers appearing in the definition of ( ) H  , it follows from (C2) that: , so that from (3): , and thus: Also, based on (C1): Because the denominators of the terms in the definition of the k H  's are positive valued, it follows that ( ) H  is a negative definite matrix, because X X  is positive definite.Now consider the case where    , so that: are iid with mean zero, and thus:  is bounded as well.In addition, ( ( )) i e   is symmetrically distributed around zero because the i  's are so distributed, and, from (4): e e e e It follows that ( ( ( ))) 0  .Then, using a multivariate version of Liapounov's central limit theorem, and given condition R6 (asymptotic normality can be established without regularity condition R6.In fact, the boundedness properties on the X-matrix stated in R5 would be sufficient.See [33] for a related proof under the weaker regularity conditions).
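As a reading aid, the concentration step used in the proof can be sketched as follows. This is a derivation under stated assumptions rather than a transcription of the original displays: the multiplier labels $\gamma_i$ and $\eta_k$, the sign convention (weights proportional to $\exp(\text{multiplier} \times \text{support point})$), and the normalizing sums $\Psi$ and $\Psi_k$ are notation introduced here for illustration.

```latex
% Conditional on b = \tau, with residuals e_i(\tau) = Y_i - X_{i\cdot}\tau.
% Maximizing entropy subject to a unit-sum and a mean constraint gives
% exponential (Gibbs) weights:
\begin{align*}
w_{i\ell}(\tau) &= \frac{\exp(\gamma_i v_\ell)}{\Psi(\gamma_i)},
  & \Psi(\gamma) &= \sum_{m=1}^{J}\exp(\gamma v_m),
  & \sum_{\ell} v_\ell\, w_{i\ell}(\tau) &= e_i(\tau), \\
p_{k\ell}(\tau) &= \frac{\exp(\eta_k z_{k\ell})}{\Psi_k(\eta_k)},
  & \Psi_k(\eta) &= \sum_{m=1}^{J_k}\exp(\eta z_{km}),
  & \sum_{\ell} z_{k\ell}\, p_{k\ell}(\tau) &= \tau_k .
\end{align*}
% Substituting back yields the conditional maximum value function:
\[
F(\tau) = \sum_{k=1}^{K}\bigl[\ln\Psi_k(\eta_k) - \eta_k\tau_k\bigr]
        + \sum_{i=1}^{N}\bigl[\ln\Psi(\gamma_i) - \gamma_i e_i(\tau)\bigr].
\]
% Because \Psi'(\gamma)/\Psi(\gamma) equals the constrained mean, the envelope
% argument gives dS/de = -\gamma for each entropy term S, so that
\[
\frac{\partial F(\tau)}{\partial \tau_k}
  = -\eta_k(\tau_k) + \sum_{i=1}^{N} X_{ik}\,\gamma_i(e_i(\tau)),
\qquad
G(\tau) = -\eta(\tau) + X'\gamma(e(\tau)).
\]
% The multipliers are increasing in their constrained means, with
% d\gamma/de = 1/\mathrm{Var}_{w}(v) > 0 and d\eta_k/d\tau_k = 1/\mathrm{Var}_{p_k}(z_k) > 0, hence
\[
H(\tau) = -\operatorname{diag}\!\left(\frac{1}{\mathrm{Var}_{p_k}(z_k)}\right)
          - X'\operatorname{diag}\!\left(\frac{1}{\mathrm{Var}_{w_i}(v)}\right)X ,
\]
% which is negative definite when X'X is positive definite.
```

Under this convention the positivity of the variances in the denominators is what delivers negative definiteness of the Hessian, and the symmetry of the $v_\ell$ around zero makes $\gamma$ an odd function of the residual.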

Consistency
For any τ, represent the conditional maximum value function, ( ) F  , by a second order Taylor series around β as: where *  lies between τ and β.The value of the quadratic term in the expansion can be bounded by: where  [34].The smallest eigenvalue exhibits a positive lower bound given by whatever the value of *  .
The value of the linear term in the expansion is bounded in probability; that is, 0    and for ( ) , there exists a finite ( ) A  such that:  .It follows from Equations ( 7)-( 9) that, for all , and the GME estimator of β is consistent.

Asymptotic Normality
Expand G(b) in a Taylor series around β, where ˆarg max ( ) where *  is between  and β.In general, different *  points will be required to represent the different coordinate functions in ( ) and  is a consistent estimator of β; , and: where d denotes equivalence of limiting distributions.Using ( )  , note that: ( ( )) , 1, ,  , Slutsky's Theorem [34] implies that: Note that holding the support of  constant, one can reduce the interval (c 1 , c J ).As 0   , the asymptotic variance of ( )  may tend to zero, but cannot grow without bound.For example, if at 0, 0 Also note that, for large samples, the parameters reliance on the supports vanishes.In contrast, the supports on the errors influence the computed covariance matrix.Finally, for non-homogenous errors, the covariance matrix estimator could be adjusted following a standard White's covariance correction.

Cross-Entropy Extensions
To extend the previous asymptotic results to the case of cross-entropy maximization [10], first suppose that  3) and ( 5), ( ( ( ))) ( ( ( ))) . Thus, the maximization problem given by Equation ( 2) and Conditions C1-C6 is equivalent to: with obvious changes being made to C1-C6.The only alterations needed to the preceding proof are: More generally, the same representation ( 11)-( 13) applies for any 0, 0 . Furthermore, Equations ( 12) and ( 13 1 and can be imposed.Using Equations ( 11), (12), and (13), we have characterized the maximum cross entropy solution.Upon substitution of Equations ( 11)-( 13) in the appropriate arguments, all results, including the results in the next section on statistical testing, apply to the maximum cross-entropy paradigm.

Statistical Tests
The GME estimator $\hat\beta = Z\hat p$ is consistent and asymptotically normally distributed. Therefore, asymptotically valid normal and $\chi^2$ test statistics can be used to test hypotheses about $\beta$. For empirical implementation of such tests a consistent estimate of the asymptotic covariance matrix of $\hat\beta$ will be required.

An estimate of
, where: An estimate of the variance, 2   , of the i  's can be constructed as 2 . Then the asymptotic covariance matrix of  can be estimated by: Alternatively, ξ can be estimated by: Then:

Asymptotically Normal Tests
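Because $\hat\beta$ is asymptotically normally distributed, the statistic $T_Z = (\hat\beta_k - \beta_k^0)\big/\widehat{\mathrm{std}}(\hat\beta_k)$, where $\widehat{\mathrm{std}}(\hat\beta_k)$ is the square root of the $k$th diagonal element of the estimated asymptotic covariance matrix of $\hat\beta$, is asymptotically N(0,1) under the null hypothesis $H_0\!: \beta_k = \beta_k^0$.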
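A minimal sketch of how such an asymptotically normal test might be computed is given below. The function name `z_test` and the numerical inputs are hypothetical; the standard error is assumed to come from whichever estimate of the asymptotic covariance matrix the analyst has adopted.

```python
import math

def z_test(estimate, null_value, std_error):
    """Return (statistic, two-sided p-value) based on the N(0,1) limit."""
    stat = (estimate - null_value) / std_error
    # For a standard normal Z, P(|Z| > |stat|) = erfc(|stat| / sqrt(2)).
    p_value = math.erfc(abs(stat) / math.sqrt(2.0))
    return stat, p_value

# Hypothetical numbers: estimate 1.2, hypothesized value 1.0, std. error 0.1.
stat, p_value = z_test(1.2, 1.0, 0.1)
print(stat, p_value)
```

The null hypothesis is rejected at the 0.05 level when the reported p-value falls below 0.05, which corresponds to a statistic larger than about 1.96 in absolute value.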

Wald Tests
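Wald tests of linear restrictions on the elements of $\beta$ can be expressed in the usual form. Let $H_0\!: R\beta = r$ be the null hypothesis to be tested, where $R$ is an $L \times K$ matrix with $\mathrm{rank}(R) = L \le K$, and let $\widehat{\mathrm{Cov}}(\hat\beta)$ denote a consistent estimate of the asymptotic covariance matrix of $\hat\beta$. Thus, the Wald test statistic has a $\chi^2$ limiting distribution as:

$$W = (R\hat\beta - r)'\left[R\,\widehat{\mathrm{Cov}}(\hat\beta)\,R'\right]^{-1}(R\hat\beta - r) \xrightarrow{d} \chi^2_L$$

under the null hypothesis $H_0$. Similarly, for nonlinear restrictions $g(\beta) = [0]$, where $g(\beta)$ is a continuously differentiable $L$-dimensional vector function with Jacobian $J(\beta) = \partial g(\beta)/\partial\beta'$ of rank $L \le K$, it follows that:

$$W = g(\hat\beta)'\left[J(\hat\beta)\,\widehat{\mathrm{Cov}}(\hat\beta)\,J(\hat\beta)'\right]^{-1}g(\hat\beta) \xrightarrow{d} \chi^2_L$$

under the null hypothesis.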
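For the joint hypothesis considered later in the Monte Carlo experiments (two coefficients restricted to given values, so $L = 2$), the usual Wald form can be sketched without any matrix library. The function name and the numerical inputs below are hypothetical, and the variance and covariance arguments stand for the relevant 2 × 2 block of whatever covariance estimate is in use.

```python
import math

def wald_two_restrictions(b2, b3, c, d, var2, var3, cov23):
    """Wald statistic and p-value for H0: beta_2 = c and beta_3 = d."""
    d2, d3 = b2 - c, b3 - d
    det = var2 * var3 - cov23 ** 2
    if det <= 0.0:
        raise ValueError("covariance block must be positive definite")
    # Quadratic form d' S^{-1} d using the closed-form inverse of a 2x2 matrix.
    stat = (var3 * d2 * d2 - 2.0 * cov23 * d2 * d3 + var2 * d3 * d3) / det
    # The chi-square(2) survival function is exactly exp(-x / 2).
    p_value = math.exp(-0.5 * stat)
    return stat, p_value

# Hypothetical estimates tested against c = 1 and d = -1.
stat, p_value = wald_two_restrictions(1.2, -0.9, 1.0, -1.0, 0.01, 0.01, 0.0)
print(stat, p_value)
```

With more than two restrictions the same quadratic form applies, but a general matrix inverse and chi-square distribution function would be needed.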

Likelihood Ratio Tests
To establish a pseudo-likelihood ratio test of functional restrictions on the β vector, first note that: which follows from Equations ( 7) and ( 10) and the fact that 1 ( ) p N H       .Thus:

Now let ˆR
 be a restricted GME estimator of β.Thus,   for a general hypothesis. Then: under the null hypothesis.

Lagrange Multiplier Tests
Define R, r, g, J, and ˆR  as above.Then a Lagrangian multiplier test of functional restrictions on β can be based on the fact that: under the null hypothesis.

Monte Carlo Simulations
A Monte Carlo experiment was conducted to explore the sampling behavior of test statistics based on the generalized maximum entropy estimator. The data were generated based on a linear model containing an intercept term, a dichotomous explanatory variable, and two continuously measured explanatory variables. The results of the Monte Carlo experiment also add perspective to the simulation results on the bias and mean square error of the maximum entropy estimator generated previously by [10].
The linear model $Y = X\beta + \varepsilon$ is specified as that are censored at the mean ±3 standard deviations, and outcomes of the disturbance term are defined as , where $U_i \overset{iid}{\sim}$ Uniform(0,1). The support points for the disturbance terms were specified as $V = (-10, 0, 10)'$ (recall C2 and C3) for all experiments. Three different sets of support points were specified for the $\beta$-vector, given by $Z_I$, $Z_{II}$, and:

$$Z_{III} = \begin{bmatrix} -10 & 0 & 10 \\ -10 & 0 & 10 \\ -10 & 0 & 10 \\ -10 & 0 & 10 \end{bmatrix}$$

(recall C1). The support points in $Z_I$ were chosen to be most favorable to the GME estimator, where the elements of the true $\beta$-vector are located in the center of their respective supports and the widths of the supports are relatively narrow. The supports represented by $Z_{II}$ are tilted to the left of β

In the course of calculating values of the test statistics, both unrestricted and restricted (by $\beta_2 = c$ and/or $\beta_3 = d$) GME estimators needed to be calculated. Therefore, bias and mean square error measures relating to these and the least squares estimators were calculated as well. Monte Carlo results for the test statistics and for the unrestricted GME and OLS estimators are presented in Tables 1 and 2, respectively, while results relating to the restricted GME and OLS estimators are presented in Table 3.
Because the choice of which asymptotic covariance matrix to use in calculating the T Z and Wald tests was inconsequential, only the results for the second suggested covariance matrix representation are presented here.
Regarding properties of the test statistics, their behavior under a true $H_0$ is consistent with the behavior expected from the respective asymptotic distributions when n is large (sample size of 1,600), their sizes being approximately 0.05 regardless of the choice of support for $\beta$. The sizes of the tests remain within 0.01 of their asymptotic size when n decreases to 400, except for the Lagrange Multiplier test under support $Z_{II}$, which has a slightly larger size. Across all support choices and ranging over all sample sizes from small to large, the sizes of the $T_Z$ and Wald tests remain in the 0 to 0.10 range; for $Z_I$ supports and small sample sizes, the sizes of the tests are substantially less than 0.05. Results were similar for the pseudo-likelihood and Lagrange Multiplier tests, except for the cases of $Z_{II}$ support and n ≤ 100, where the size of the test increased as high as 0.36 for the pseudo-likelihood test and 0.73 for the Lagrange Multiplier test when n = 25.
Table 1. Rejection Probabilities for True 2 Hypotheses.

The powers of the tests were all substantial in rejecting false null hypotheses except for the $T_Z$ test in the case of $Z_{II}$ support and the smallest sample size, the latter result being indicative of a notably biased test. Overall, the choice of support did impact the power of the tests for rejecting the errant hypotheses, although the effect was small for all but the $T_Z$ test.

Supports
In the case of unrestricted estimators and the most favorable support choice ($Z_I$), the GME estimator dominated the OLS estimator in terms of MSE, and GME superiority was substantial for sample sizes of n ≤ 100 (Table 2). The GME-$Z_I$ estimator and, of course, the OLS estimator, were unbiased, with the GME-$Z_I$ estimator exhibiting substantially smaller variances for smaller n. The choice of support has a significant effect on the bias and MSE of the GME estimator for small sample sizes. Neither the GME-$Z_{II}$ nor the GME-$Z_{III}$ estimator dominates the OLS estimator, although the GME-$Z_{III}$ estimator is generally the better estimator across the various sample sizes. When n = 25, the GME-$Z_{II}$ estimator offers notable improvement over OLS for estimating three of the four elements of $\beta$, but is significantly worse for estimating $\beta_2$. For larger sample sizes, the GME-$Z_{II}$ estimator is generally inferior to the OLS estimator. Although the centers of the $Z_{III}$ support are on average further from the true β's than are the centers of the $Z_{II}$ support, the wider widths of the former result in a superior GME estimator.
The results for the restricted GME estimators in Table 3 indicate that under the errant constraints  Estimator

Asymmetric Error Supports
We present further Monte Carlo simulations to show that regularity condition R2, which assumes symmetry of the disturbance term, is not a necessary condition for identification of the GME slope parameters. It is demonstrated below that if the supports of the error distribution are asymmetric, then only the intercept term of the GME regression estimator is asymptotically biased.
The Monte Carlo experiments that follow are identical to those above except for the specification of the user supplied support points for the error terms and the underlying true error distribution. To illustrate the impact of asymmetric errors, experiments are based on one set of support points symmetric about zero, $V_I = (-10, 0, 10)'$, and two sets of support points not symmetric about zero, $V_{II} = (-5, 5, 15)'$ and $V_{III} = (-5, 0, 15)'$. The support $V_{II}$ is a simple translation of $V_I$ by five positive units, retaining symmetry about its center of 5. The asymmetric support $V_{III}$ translates the truncation points by five positive units, but retains the center support point 0. The true error distribution is generated in two ways: a symmetric distribution specified as a N(0,1) distribution truncated at (−3, 3), and an asymmetric distribution specified as a Beta(3,2) translated and scaled from support (0,1) to (−3, 3), with mean 0.6. Supports on the coefficient terms are retained as $Z_I$, providing symmetric support points about the true coefficient values.
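The two error-generating designs described above can be sketched with the Python standard library. The function names, seed, and sample size are hypothetical choices for illustration; rejection sampling is one simple way to draw from the truncated normal and is not necessarily the method the authors used.

```python
import random

# Error supports as described in the text; V_III is inferred from the
# description (truncation points shifted by 5, center point kept at 0).
V_I, V_II, V_III = (-10.0, 0.0, 10.0), (-5.0, 5.0, 15.0), (-5.0, 0.0, 15.0)

def truncated_normal(rng, bound=3.0):
    """Draw from N(0,1) truncated to (-bound, bound) by rejection sampling."""
    while True:
        draw = rng.gauss(0.0, 1.0)
        if -bound < draw < bound:
            return draw

def scaled_beta(rng, a=3.0, b=2.0, low=-3.0, high=3.0):
    """Draw from Beta(a, b) translated and scaled from (0, 1) to (low, high)."""
    return low + (high - low) * rng.betavariate(a, b)

rng = random.Random(12345)           # arbitrary seed for reproducibility
n_draws = 20000
symmetric = [truncated_normal(rng) for _ in range(n_draws)]
asymmetric = [scaled_beta(rng) for _ in range(n_draws)]

mean_symmetric = sum(symmetric) / n_draws    # close to 0
mean_asymmetric = sum(asymmetric) / n_draws  # close to 6 * 0.6 - 3 = 0.6
print(mean_symmetric, mean_asymmetric)
```

The nonzero mean of the scaled Beta(3,2) errors is what the intercept absorbs in the asymmetric experiments, which is consistent with only the intercept being persistently biased.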
Monte Carlo experiments presented in Tables 4 and 5 are generated for sample sizes 25, 100, and 400 with 1,000 replications for each sample size. Consider first the case in which the true distribution is symmetric about zero. Slope coefficients for error supports that are not symmetric about zero appear biased in smaller samples. However, the bias and MSE of the slope coefficients decrease as the sample size increases. Next, suppose the true distribution is asymmetric. For symmetric and asymmetric supports only the intercept terms are persistently biased, diverging from the true parameter values as the sample size increases. These results demonstrate the robustness of GME slope coefficients to asymmetric error distributions and user supplied supports.

Further Results
Unbiased GME Estimation. It is apparent from the proof of the theorem in Section 3 that the $p_{k\ell}\ln(p_{k\ell})$ terms are asymptotically uninformative. It is instructive to note that if these terms are deleted from the GME objective function and the resulting objective function is then maximized through choosing b and w subject to constraints C2-C4 and C6, the resulting GME estimator is in fact unbiased for estimating $\beta$. This follows because the $\varepsilon_i$'s are iid mean zero and symmetrically distributed around zero, and the new estimator, say $\tilde\beta$, is such that $\tilde\beta - \beta$ is a symmetric function of the $\varepsilon_i$'s.
Bayesian Analogues.As pointed out by [35] maximum entropy methods can be motivated as an empirical Bayes rule.We expand on their analogy by noting a strong formal parallel to the traditional Bayesian framework of inference.In particular, one can view as the maximum entropy analogue to the log of a non-normalized Bayesian prior and ( ) and ( ) , the maximum likelihood estimator of β is   , and if one adds priors ~k is the Bayesian posterior mode estimator of β.We note the following consequences of these equivalences.First, if the support points 1 , , J v v  can be chosen so that f  is very close to the true distribution of , then the GME estimator should be nearly asymptotically efficient.Second, in finite samples the prior information influences  such that  is generally not unbiased.Third, the support points used in the GME estimator have no particular relationship to the points of support of the distribution of a discrete random variable.The distributions f  and k f  are absolutely continuous for any choice of Z and V.
The previous Monte Carlo results illustrate the Bayesian-like character of the maximum entropy results.The GME with reasonably narrow points of support centered on the true values of β dominated the OLS estimator and was sometimes far better.On the other hand, the GME performed poorly when the points of support were similarly narrow and mis-centered by only one-eighth the range of the points of support.In the latter case, mean squared errors were often much worse than OLS and biases were often substantial.Finally, wider points of support, even though they were the most mis-centered of the cases examined, were quite similar to OLS results for moderate to large sample sizes, and provided some degree of improvement over OLS for small samples.
Finally, the GME approach is a special case of generalized cross entropy, which incorporates a reference probability distribution over support points.This allows a direct method of including prior information, akin to a Bayesian framework.However, in a classical sense, the empirical estimation strategies are inherently different.
GME Calculation Method. The conditional maximum entropy formulation (2) utilized in the proof of asymptotic results represents the basis for a computationally efficient method of obtaining GME estimates. In particular, maximizing $F(\tau)$ through choice of τ involves a nonlinear search over a vector of relatively low dimension (K) as opposed to searching over the (KM + NJ) dimensional space of (p,w) values. In the process of concentrating the objective function, note that the needed Lagrange multiplier functions can be expressed as elementary functions for three support points or fewer, and still exist in closed form (using inverse hyperbolic functions) for support vectors having five elements. As a point of comparison, the calculation of GME estimates in the Monte Carlo experiment with N = 1,600 was completed in a matter of seconds on a 133 MHz personal computer. Such a calculation would be intractable, let alone efficient, in the space of (p,w) values. We note further that the dual algorithm of [10] would still involve a search over a space of dimension N = 1,600, which would be infeasible here and in other problems in which the number of data points is large.
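The low-dimensional search can be illustrated for the simplest case of a single slope coefficient with three symmetric support points for both the coefficient and the errors. The sketch below is an illustration under assumptions, not the authors' implementation: the function names, the simulated data, the sign convention (weights proportional to exp(multiplier × support point)), and the use of bisection on the concentrated gradient are all choices made here. For supports (−c, 0, c) the multiplier has the elementary closed form used in `multiplier`.

```python
import math
import random

def multiplier(mean, c):
    """Multiplier g such that weights proportional to exp(g * s) on the
    supports (-c, 0, c) have the given mean (requires |mean| < c)."""
    r = abs(mean) / c
    if r >= 1.0:
        raise ValueError("mean must lie strictly inside the support")
    # With u = exp(|g| * c), the mean condition is (1 - r)u^2 - r u - (1 + r) = 0.
    u = (r + math.sqrt(4.0 - 3.0 * r * r)) / (2.0 * (1.0 - r))
    return math.copysign(math.log(u) / c, mean)

def gme_slope(x, y, c_beta, c_err, tol=1e-12):
    """Concentrated GME estimate of beta in y_i = beta * x_i + e_i."""
    # Feasible slopes keep beta inside its support and every residual
    # strictly inside the error support.
    lo, hi = -c_beta, c_beta
    for xi, yi in zip(x, y):
        if xi != 0.0:
            a, b = sorted(((yi - c_err) / xi, (yi + c_err) / xi))
            lo, hi = max(lo, a), min(hi, b)
    if lo >= hi:
        raise ValueError("no slope satisfies the support constraints")
    pad = 1e-9 * (hi - lo)
    lo, hi = lo + pad, hi - pad

    def gradient(t):
        # Derivative of the concentrated entropy: data term minus prior term.
        data_term = sum(xi * multiplier(yi - xi * t, c_err) for xi, yi in zip(x, y))
        return data_term - multiplier(t, c_beta)

    # The gradient is strictly decreasing in t, so bisection locates its root.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gradient(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical simulated data: true slope 1, uniform errors on (-1, 1).
rng = random.Random(0)
x = [rng.gauss(0.0, 1.0) for _ in range(200)]
y = [1.0 * xi + rng.uniform(-1.0, 1.0) for xi in x]

beta_gme = gme_slope(x, y, c_beta=5.0, c_err=10.0)
beta_ols = sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)
print(beta_gme, beta_ols)
```

The search is over a single number regardless of the sample size, since each observation contributes only a closed-form multiplier evaluation. For K coefficients the same idea would require a K-dimensional optimizer in place of bisection.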

Conclusions
We have shown that the data-constrained GME estimator of the GLM is consistent and asymptotically normal as long as the coefficients and errors obey the constraints of the constrained maximum entropy problem.Furthermore, we have demonstrated the possibility that the GME estimator can be asymptotically efficient.Thus, depending on the distribution of the errors, GME may be more or less efficient than alternatives such as least squares.We performed Monte Carlo tests showing that the quality of the GME estimates depends on the quality of the supports chosen.The Monte Carlo results suggest that GME with wide supports will often perform better than OLS while providing some robustness to misspecification.
We have shown how all the conventional types of asymptotic tests can be calculated for GME estimates.In the Monte Carlo study these asymptotic tests performed extremely well in samples of 400 or more.In smaller samples the tests performed less well, particularly when the supports were narrow, although some of the results were quite acceptable.We have also demonstrated that all our results can be applied to a maximum cross-entropy estimator.While our focus has been on asymptotic properties, we have also shown how the entropy terms involving the coefficients play a role analogous to a Bayesian prior.Furthermore, these terms are asymptotically uninformative and can be omitted if the researcher wishes to use an unbiased GME estimator.

the optimal value of the Lagrangian multiplier i  under the condition b  It follows from the symmetry of the v i 's around zero that: 1 1 can be used to test hypotheses about the values of the k  's.
L = rank ( ) R K for a linear hypothesis or L = rank ( ( )) q K 1 and β 2 and to the right of β 3 and β 4 by 1 unit, with the widths of the supports being the same as their counterparts in Z I .The last set of supports represented by Z III are wider and effectively define an upper bound of 10 on the absolute values of each of the elements of β.To explore the respective sizes of the various tests presented in Section IV, the hypothesis 0 using the Wald, pseudo-likelihood, and Lagrange Multiplier tests, with c and d set equal to the true values of β 2 and β 3 , i.e., c = 1 and d = −1.Critical values of the tests were based on their respective asymptotic distributions and a 0.05 level of significance.An observation on the power of the respective tests was obtained by performing a test of significance whereby c = d = 0 in the preceding hypotheses.All scenarios were analyzed using 10,000 Monte Carlo repetitions, and sample sizes of n = 25, 100, 400, and 1,600 were examined.
the GME dominates the OLS estimator for all sample sizes and for all support choices.The superiority of the GME estimator is substantial for smaller sample sizes, but dissipates as sample size increases.The results suggest a misspecification robustness of the GME estimator that deserves further investigation.
to the non-normalized log of the probability density kernel or log-likelihood function.For any given set of support points Z and V, we can define functions and [10]r original formulation,[10]required i

Table 2. $E(\hat\beta_i)$ and Mean Square Error Measures: Unrestricted Estimators.

Table 4. Mean and MSE of 1,000 Monte Carlo Simulations with True Distribution Symmetric. Symmetric and Asymmetric Error Supports and Coefficient Support $Z_I$.

Table 5. Mean and MSE of 1,000 Monte Carlo Simulations with True Distribution Asymmetric. Symmetric and Asymmetric Error Supports and Coefficient Support $Z_I$.