
# Heteroskedasticity in One-Way Error Component Probit Models

ENSEA, Abidjan 08, Côte d'Ivoire
Econometrics 2019, 7(3), 35; https://doi.org/10.3390/econometrics7030035
Received: 13 April 2019 / Revised: 5 August 2019 / Accepted: 7 August 2019 / Published: 11 August 2019

## Abstract

This paper introduces an estimation procedure for a random effects probit model in the presence of heteroskedasticity and a likelihood ratio test for homoskedasticity. The cases where the heteroskedasticity is due to individual effects, idiosyncratic errors, or both are analyzed. Monte Carlo simulations show that the test performs well when the degree of heteroskedasticity is high. Furthermore, the power of the test increases with larger individual and time dimensions. The robustness analysis shows that applying the wrong approach may generate misleading results except for the case where both individual effects and idiosyncratic errors are modelled as heteroskedastic.

## 1. Introduction

The problem that heteroskedasticity presents for panel data regression has been widely discussed in the literature (Baltagi 2008; Baltagi et al. 2006; Montes-Rojas and Sosa-Escudero 2011). Let us consider the one-way error component model, i.e., with the error term defined as $u_{it} = \mu_i + \nu_{it}$, $i = 1, \ldots, N$, $t = 1, \ldots, T$, where the individual effects $\mu_i$ and the idiosyncratic errors $\nu_{it}$ are assumed to be random (i.e., $\mu_i \sim iid(0, \sigma_\mu^2)$ and $\nu_{it} \sim iid(0, \sigma_\nu^2)$). Several authors (Baltagi 1988; Mazodier and Trognon 1978; Randolph 1988; Wansbeek 1989, among others) consider different types of heteroskedasticity depending upon whether the individual effects ($\mu_i$), the idiosyncratic errors ($\nu_{it}$), or both are heteroskedastic. Baltagi et al. (2006) and later Montes-Rojas and Sosa-Escudero (2011) proposed Lagrange Multiplier (LM) test procedures to check for the presence of heteroskedasticity in linear models for various cases. However, such test procedures for panel data binary choice models are lacking. In addition, to the best of my knowledge, there is no existing procedure to estimate a heteroskedastic probit model on panel data.
The use of random effects probit models for panel data has been popularized due to the problem of incidental parameters (Baltagi 2008; Lancaster 2000). Since this model is generally applied to micro-panels, heteroskedasticity problems are likely to arise. One must account for heteroskedasticity since ignoring it could result in misleading conclusions about the coefficients and the interpretation of marginal effects (Greene 2012). Two approaches are used to calculate the marginal effects after probit models in applied work: (i) integrating with respect to the individual effects, or (ii) assuming the individual effects to be null (Bland and Cook 2018). In case (i), the probability of a positive outcome is given by $\Pr(y_{it} = 1 \mid X_{it}) = \Phi\left(\frac{X_{it}\beta}{\sqrt{\sigma_\nu^2 + \sigma_\mu^2}}\right)$, while in case (ii) this probability is given by $\Pr(y_{it} = 1 \mid X_{it}, \mu_i = 0) = \Phi\left(\frac{X_{it}\beta}{\sigma_\nu}\right)$, where $\Phi$ and $\phi$ are respectively the standard normal cumulative distribution and density functions. Thus, the marginal effect of variable $x_k$ (denoted $me(x_k)$) is given by Equation (1) for case (i) and Equation (2) for case (ii):
$me(x_k) = \frac{\beta_k}{\sqrt{\sigma_\nu^2 + \sigma_\mu^2}}\, \phi\left(\frac{X_{it}\beta}{\sqrt{\sigma_\nu^2 + \sigma_\mu^2}}\right) \qquad (1)$
$me(x_k) = \frac{\beta_k}{\sigma_\nu}\, \phi\left(\frac{X_{it}\beta}{\sigma_\nu}\right). \qquad (2)$
In Equations (1) and (2), it clearly appears that the marginal effects estimated in both cases (i) and (ii) depend on the variance components. Since these variance components are functions of individual characteristics in the presence of heteroskedasticity, a homoskedastic model yields misestimated marginal effects. This stresses the need for an empirical implementation of an estimation and test procedure to deal with heteroskedasticity in panel probit models.
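The dependence of the marginal effects on the variance components can be illustrated with a minimal sketch of the two marginal effect formulas. The function name `marginal_effect` and all numerical values below are hypothetical and chosen only for illustration; they are not taken from the paper's results.

```python
import math


def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def marginal_effect(beta_k, xb, sigma_nu=1.0, sigma_mu=0.0, integrate_mu=True):
    """Marginal effect of x_k on Pr(y = 1).

    integrate_mu=True  -> case (i): individual effects integrated out.
    integrate_mu=False -> case (ii): individual effects set to zero.
    """
    if integrate_mu:
        scale = math.sqrt(sigma_nu ** 2 + sigma_mu ** 2)
    else:
        scale = sigma_nu
    return beta_k / scale * norm_pdf(xb / scale)


# Hypothetical values: coefficient, linear index, and variance components.
beta_k, xb = 0.8, 0.5
sigma_mu_homo = math.sqrt(0.2)
# Assumed heteroskedastic case: sd of mu_i scaled by exp(Z * theta) with Z = 0.9, theta = 0.7.
sigma_mu_hetero = sigma_mu_homo * math.exp(0.9 * 0.7)

me_homo = marginal_effect(beta_k, xb, sigma_mu=sigma_mu_homo)
me_hetero = marginal_effect(beta_k, xb, sigma_mu=sigma_mu_hetero)
print(me_homo, me_hetero)
```

In this sketch the case (i) marginal effect changes as soon as the standard deviation of the individual effects varies with an observed characteristic, which is the mechanism the paragraph above describes.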
The aim of this paper is to introduce an estimation procedure that accounts for this heteroskedasticity using the Gauss–Hermite quadrature scheme.¹ In addition, the paper aims to provide a likelihood ratio (LR) test procedure for homoskedasticity in a panel probit model that allows one to investigate various forms of heteroskedasticity under the alternative hypothesis. Monte Carlo simulations are conducted to estimate the power and the empirical size of the test, and a robustness analysis is completed to ensure that the test and estimation procedures perform well. Results suggest that the estimation procedure performs well and that its performance also depends on the quadrature parameters. The LR test has excellent power when the degree of heteroskedasticity is high, and its performance depends on sample size when the degree of heteroskedasticity is low. The contribution of this paper to the literature is twofold. Firstly, it introduces a procedure to estimate a panel probit model with heteroskedasticity. The procedure allows one to deal with different sources of heteroskedasticity. Secondly, based on the power and the empirical size of the test, it shows that the LR test for homoskedasticity performs well. The robustness of the estimation procedure and the test performance have been assessed using an extensive Monte Carlo simulation.
The rest of this paper is organized as follows. Section 2 presents the different forms of heteroskedasticity encountered in the literature and derives the likelihood estimator in a general setting. Section 3 discusses the estimation requirements and test procedures to deal with heteroskedasticity. In Section 4, the power and the empirical size of the test as well as the bias and the mean square error of the estimated parameters are computed based on Monte Carlo simulations. Section 5 presents robustness analysis. Section 6 presents a case study that illustrates the estimation of the parameters and marginal effects in the presence of heteroskedasticity. Section 7 concludes.

## 2. Heteroskedasticity and Likelihood Function

This section discusses the different types of heteroskedasticity encountered in the literature and specifies the likelihood function.

#### 2.1. Different Sources of Heteroskedasticity

Consider the following one-way error components probit model:
$y_{it} = \mathbb{1}_{\mathbb{R}^{+}}\left(X_{it}\beta + u_{it}\right) \quad \forall\, i = 1, \ldots, N;\ t = 1, \ldots, T, \qquad (3)$
where $u_{it}$ is decomposed into individual unobserved effects ($\mu_i$) and idiosyncratic errors ($\nu_{it}$). We consider the random effects model. Classical assumptions for the estimation of a random effects model are the following: (i) the individual effects $\mu_i$ are independent from the idiosyncratic errors $\nu_{it}$, and (ii) the explanatory variables $X_{it}$ are independent from the individual effects $\mu_i$ and the idiosyncratic errors $\nu_{it}$. In addition, some assumptions are made on the variance components to deal with heteroskedasticity issues. These assumptions lead to three cases of heteroskedasticity identified in the literature:
• Heteroskedasticity à la Mazodier and Trognon (1978): the heteroskedasticity is due to the individual effects. Thus, $\mu_i \sim iid(0, \sigma_{\mu_i}^2)$ and $\nu_{it} \sim iid(0, \sigma_\nu^2)$.
• Heteroskedasticity à la Baltagi (1988) and Wansbeek (1989): the heteroskedasticity is due to the idiosyncratic errors. Thus, $\mu_i \sim iid(0, \sigma_\mu^2)$ and $\nu_{it} \sim iid(0, \sigma_{\nu_{it}}^2)$.
• Heteroskedasticity à la Randolph (1988): the heteroskedasticity is due to both the individual effects and the idiosyncratic errors. Thus, $\mu_i \sim iid(0, \sigma_{\mu_i}^2)$ and $\nu_{it} \sim iid(0, \sigma_{\nu_{it}}^2)$. An alternative specification by Verbon (1980) is to consider that $\mu_i \sim iid(0, \sigma_{\mu_i}^2)$ and $\nu_{it} \sim iid(0, \sigma_{\nu_i}^2)$.
In this paper, the heteroskedastic component is assumed to be a function of some observed variables. More specifically, when the heteroskedasticity is due to the $\mu_i$, the variance depends on time-invariant exogenous variables $Z_{\mu_i}$ through the standard deviation $\sigma_{\mu_i} = \sigma_\mu h_\mu(Z_{\mu_i}'\theta_\mu)$. Alternatively, when the heteroskedasticity is due to the $\nu_{it}$, the variance depends on exogenous variables $Z_{\nu_{it}}$ through the standard deviation $\sigma_{\nu_{it}} = \sigma_\nu h_\nu(Z_{\nu_{it}}'\theta_\nu)$. The approach of Verbon (1980) can be modelled with idiosyncratic errors whose standard deviation is $\sigma_{\nu_i} = \sigma_\nu h_\nu(Z_{\nu_i}'\theta_\nu)$. The functions $h_\mu(\cdot)$ and $h_\nu(\cdot)$ are twice continuously differentiable and satisfy $h_\mu(\cdot) > 0$, $h_\nu(\cdot) > 0$, $h_\mu(0) = 1$ and $h_\nu(0) = 1$. $Z_{\mu_i}$ and $Z_{\nu_{it}}$ are vectors of regressors that include no constant term. Note that the variance of the idiosyncratic errors is set to one ($\sigma_\nu = 1$) in order to avoid identification problems (Greene 2012). This identification problem occurs when a constant term is included, since it implies that the variance of the idiosyncratic errors will not be 1 when $\theta_\nu$ is null.
In the rest of the paper, as in Montes-Rojas and Sosa-Escudero (2011) and Baltagi et al. (2006), the results are reported for the functions $h_\mu(\cdot)$ and $h_\nu(\cdot)$ set to exponential functions. Then, $h_\mu(Z_{\mu_i}'\theta_\mu) = \exp(Z_{\mu_i}'\theta_\mu)$ and $h_\nu(Z_{\nu_{it}}'\theta_\nu) = \exp(Z_{\nu_{it}}'\theta_\nu)$. Thus, the standard deviation of the individual effects can be rewritten as $\sigma_{\mu_i} = \sigma_\mu \exp(Z_{\mu_i}'\theta_\mu) = \exp(\lambda_0 + Z_{\mu_i}'\theta_\mu)$, with $\lambda_0 = \log(\sigma_\mu)$.
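The exponential specification can be sketched in a few lines. The helper names `sigma_mu_i` and `sigma_nu_it` are hypothetical; the only value taken from the paper is $\sigma_\mu^2 = 0.2$, which Section 4 pairs with $\lambda_0 = -0.8$.

```python
import math


def sigma_mu_i(z_mu, theta_mu, lambda0):
    """Standard deviation of mu_i under the exponential form: exp(lambda0 + z'theta)."""
    index = sum(z * th for z, th in zip(z_mu, theta_mu))
    return math.exp(lambda0 + index)


def sigma_nu_it(z_nu, theta_nu):
    """Standard deviation of nu_it with sigma_nu normalised to one: exp(z'theta)."""
    index = sum(z * th for z, th in zip(z_nu, theta_nu))
    return math.exp(index)


# lambda0 = log(sigma_mu); with sigma_mu^2 = 0.2 this is roughly -0.8.
lambda0 = math.log(math.sqrt(0.2))
print(round(lambda0, 3))

# With theta = 0 both functions collapse to the homoskedastic values,
# which is why no constant term is needed in Z.
print(sigma_mu_i([0.5], [0.0], lambda0) ** 2)
print(sigma_nu_it([0.3], [0.0]))
```

Because $h(0) = 1$, setting the heteroskedasticity parameters to zero returns exactly the homoskedastic variance components, so the homoskedastic model is a restriction of this one.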

#### 2.2. Likelihood Function

The individual level likelihood is given by:
$L_i = \int_{\mathbb{R}} \prod_{t=1}^{T_i} \Phi(\epsilon_{it})\, \phi_\mu(\mu_i)\, d\mu_i \qquad (4)$
where $\Phi$ denotes the standard normal cumulative distribution function, $\phi_\mu$ denotes the density function of a normal distribution with mean 0 and standard deviation equal to that of $\mu_i$, that is $\sigma_{\mu_i} = \sigma_\mu h_\mu(Z_{\mu_i}'\theta_\mu)$, and
$\epsilon_{it} = q_{it}\, \frac{X_{it}\beta + \mu_i}{h_\nu(Z_{\nu_{it}}'\theta_\nu)}, \qquad q_{it} = 2 y_{it} - 1.$
Note that this general form of the likelihood function covers homoskedasticity and each of the aforementioned heteroskedasticity cases. The homoskedastic model is given by $\theta_\mu = \theta_\nu = 0$. The heteroskedastic model where the $\mu_i$ are heteroskedastic is given by $\theta_\mu \neq 0$ and $\theta_\nu = 0$; the heteroskedastic model where the $\nu_{it}$ are heteroskedastic is given by $\theta_\mu = 0$ and $\theta_\nu \neq 0$; while the heteroskedastic model where both $\mu_i$ and $\nu_{it}$ are heteroskedastic is given by $\theta_\mu \neq 0$ and $\theta_\nu \neq 0$.

## 3. Estimation and Tests

The estimation procedure was based on a Gauss–Hermite quadrature scheme. This section discusses the requirements for the use of this approach. Then, the procedure to test for homoskedasticity is presented.

#### 3.1. Estimation Requirements

Given that the likelihood function (Equation (4)) has an integral form, it is common in the literature to use a numerical integration method. Gauss–Hermite quadrature is used to provide an approximation of the likelihood function that has a tractable form for likelihood maximization algorithms (Liu and Pierce 1994; Naylor and Smith 1982). It consists in approximating the integral $\int_{\mathbb{R}} g(x) \exp(-x^2)\, dx$ by a weighted sum of the function $g(\cdot)$ evaluated at some specific points called nodes.
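As a minimal sketch of the basic (non-adaptive) rule, the weighted sum can be computed with NumPy's `numpy.polynomial.hermite.hermgauss`, which returns nodes and weights for the weight function $\exp(-x^2)$. The helper name `gauss_hermite_integral` and the test integrand are illustrative choices, not part of the paper.

```python
import numpy as np


def gauss_hermite_integral(g, n_nodes):
    """Approximate the integral of g(x) * exp(-x^2) over the real line."""
    nodes, weights = np.polynomial.hermite.hermgauss(n_nodes)
    return float(np.sum(weights * g(nodes)))


# Illustrative integrand g(x) = x^2, for which the integral equals sqrt(pi) / 2.
approx = gauss_hermite_integral(lambda x: x ** 2, n_nodes=5)
exact = np.sqrt(np.pi) / 2.0
print(approx, exact)
```

A rule with $Q$ nodes integrates polynomials of degree up to $2Q - 1$ exactly, so the polynomial example is reproduced up to rounding error; for non-polynomial integrands such as the probit likelihood the result is an approximation whose quality depends on $Q$.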
The Gauss–Hermite quadrature scheme used herein is the one proposed by Liu and Pierce (1994). Let $Q$ denote the number of quadrature points, and let $x_q$ and $w_q$, $q = 1, \ldots, Q$, denote respectively the quadrature points (nodes) and their corresponding weights. Then, the individual level likelihood given in Equation (4) can be approximated by a weighted sum as follows:
$L_i = \int_{\mathbb{R}} \prod_{t=1}^{T_i} \Phi(\epsilon_{it})\, \phi_\mu(\mu_i)\, d\mu_i \approx \sum_{q=1}^{Q} w_q^{*}\, g(x_q^{*}),$
with
$g(\mu_i) = \prod_{t=1}^{T_i} \Phi(\epsilon_{it})\, \phi_\mu(\mu_i),$
$x_q^{*} = \gamma + \sqrt{2}\,\sigma x_q, \qquad w_q^{*} = \sqrt{2}\,\sigma w_q \exp(x_q^2),$
$\gamma = \arg\max_{\mu_i} g(\mu_i), \qquad \sigma = \left[ -\left. \frac{\partial^2 \log g(\mu_i)}{\partial \mu_i^2} \right|_{\mu_i = \gamma} \right]^{-1/2},$
$\frac{\partial^2 \log g(\mu_i)}{\partial \mu_i^2} = -\sum_{t=1}^{T_i} \frac{\epsilon_{it}\,\phi(\epsilon_{it})\,\Phi(\epsilon_{it}) + \phi(\epsilon_{it})^2}{h_\nu(Z_{\nu_{it}}'\theta_\nu)^2\, \Phi(\epsilon_{it})^2} - \frac{1}{\sigma_\mu^2\, h_\mu(Z_{\mu_i}'\theta_\mu)^2},$
where $\phi$ denotes the density function of the standard normal distribution.²
The individual level log-likelihood function depends on the selected number of quadrature points $Q$. A discussion, based on empirical applications, of the effect of the number of quadrature points on the estimation results and the computing time is presented by Moussa and Delattre (2018). Researchers can check the impact of a selected number of quadrature points on the results. A quadrature points check is conducted in Section 5 for the model with heteroskedasticity due to both $\mu_i$ and $\nu_{it}$, which is the most complete case of heteroskedasticity in panel models.³
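The adaptive rule can be sketched as follows for a single individual. This is an illustrative reading of the scheme, not the author's implementation: the function name `individual_likelihood`, the Newton iterations used to locate the mode $\gamma$, and the example inputs are all assumptions. The derivatives are obtained by differentiating $\log g(\mu_i)$ directly, with `h_nu` holding $h_\nu(Z_{\nu_{it}}'\theta_\nu)$ for each period and `s_mu` the standard deviation $\sigma_{\mu_i}$.

```python
import math
import numpy as np

_erf = np.vectorize(math.erf)


def norm_cdf(z):
    return 0.5 * (1.0 + _erf(np.asarray(z, dtype=float) / math.sqrt(2.0)))


def norm_pdf(z):
    z = np.asarray(z, dtype=float)
    return np.exp(-0.5 * z ** 2) / math.sqrt(2.0 * math.pi)


def individual_likelihood(y, xb, h_nu, s_mu, n_nodes=12):
    """Adaptive Gauss-Hermite approximation of L_i for one individual.

    y: 0/1 outcomes, xb: linear index X_it * beta, h_nu: sd of nu_it,
    s_mu: sd of mu_i. All arrays have length T_i.
    """
    y, xb, h_nu = (np.asarray(a, dtype=float) for a in (y, xb, h_nu))
    q = 2.0 * y - 1.0

    def g(mu):
        eps = q * (xb + mu) / h_nu
        prior = math.exp(-0.5 * (mu / s_mu) ** 2) / (s_mu * math.sqrt(2.0 * math.pi))
        return float(np.prod(norm_cdf(eps))) * prior

    def derivatives(mu):
        # First and second derivatives of log g with respect to mu.
        eps = q * (xb + mu) / h_nu
        lam = norm_pdf(eps) / norm_cdf(eps)
        d1 = float(np.sum(q * lam / h_nu)) - mu / s_mu ** 2
        d2 = -float(np.sum(lam * (eps + lam) / h_nu ** 2)) - 1.0 / s_mu ** 2
        return d1, d2

    # Newton iterations for the mode gamma (log g is concave in mu).
    gamma = 0.0
    for _ in range(50):
        d1, d2 = derivatives(gamma)
        step = d1 / d2
        gamma -= step
        if abs(step) < 1e-10:
            break
    _, d2 = derivatives(gamma)
    sigma = (-d2) ** -0.5

    # Shifted and rescaled nodes and weights.
    x, w = np.polynomial.hermite.hermgauss(n_nodes)
    nodes = gamma + math.sqrt(2.0) * sigma * x
    weights = math.sqrt(2.0) * sigma * w * np.exp(x ** 2)
    return float(sum(wq * g(mq) for wq, mq in zip(weights, nodes)))


# Hypothetical inputs for one individual observed over three periods.
L_i = individual_likelihood(y=[1, 0, 1], xb=[0.4, -0.2, 0.9],
                            h_nu=[1.0, 1.2, 0.9], s_mu=0.5)
print(L_i)
```

One way to sanity-check such a sketch is the single-period homoskedastic case, where the integral has the closed form $\Phi\left(X_{it}\beta / \sqrt{1 + \sigma_\mu^2}\right)$, and the fact that the likelihoods of all possible outcome sequences must sum to one.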

#### 3.2. Test Procedure

As described by Greene (2012), testing for heteroskedasticity can be analyzed using a misspecification test procedure. The homoskedastic panel probit model can be viewed as a restricted model in which $\theta_\mu$ and $\theta_\nu$ are constrained to be null ($Z_{\mu_i}$ and $Z_{\nu_{it}}$ are omitted from the model). Thus, the homoskedastic probit model is nested in the heteroskedastic one. The omitted variables tests in the literature are based on the likelihood-ratio (LR), the Lagrange multiplier (LM), and the Wald tests. These three tests are asymptotically equivalent. However, this equivalence is valid for probit models only if the error components are homoskedastic and uncorrelated over time (Lechner 1995). The following relationship between the test statistics has been proved for linear models: $Wald \geq LR \geq LM$ (Johnston and DiNardo 2001). This implies that the LM test is less likely to reject the null hypothesis of homoskedasticity. In the literature, the LM test is mostly used to test for homoskedasticity in linear models, even for panel data models (Baltagi et al. 2006; Montes-Rojas and Sosa-Escudero 2011). The LR test is mainly used for nonlinear models. For the cross-section probit model, Davidson and MacKinnon (1984) show that the LR test performs well. The Wald test performs poorly in finite samples when testing nonlinear hypotheses (Davidson and MacKinnon 1984; Wooldridge 2001). The power of the LM test for homoskedasticity in probit models may be problematic since it fails to distinguish between heteroskedasticity and the simple omission of a variable from the index function (Greene 2018; Davidson and MacKinnon 1984).
For the aforementioned reasons, the heteroskedasticity tests used herein are based on the LR test procedure. The LR test addresses the change in model fit when new variables are added (Wooldridge 2001). Thus, it requires the estimation of both the full heteroskedastic and the homoskedastic models. Since the aim of this paper is to propose an estimation procedure for a random effects probit model for panel data in the presence of heteroskedasticity, the LR statistic is easy to compute. The LR test statistic is given by:
$LR = 2\left(\log L_U - \log L_R\right) \sim \chi^2(p)$
where $\log L_R$ and $\log L_U$ denote the log-likelihoods of the restricted and unrestricted models respectively, and $p$ is the number of parameters that are omitted in the homoskedastic panel probit model, i.e., the dimension (number of columns) of $Z_{\mu_i}$ or $Z_{\nu_{it}}$, or the sum of the dimensions of $Z_{\mu_i}$ and $Z_{\nu_{it}}$.
Following the sources of the heteroskedasticity and as specified by Baltagi et al. (2006), three types of hypotheses can be tested. These hypotheses are related to the joint test for homoskedasticity of both individual effects and idiosyncratic errors ($H_0: \theta_\mu = \theta_\nu = 0$) and to the two marginal tests for homoskedasticity of one of the aforementioned error components assuming the other component is homoskedastic (i.e., $H_0: \theta_\mu = 0 \mid \theta_\nu = 0$ and $H_0: \theta_\nu = 0 \mid \theta_\mu = 0$).
Monte Carlo simulations are conducted in Section 4 to check for the robustness of the test by estimating its power and empirical size. The power of the test is defined as the percentage of rejection at 5% significance level of the null hypothesis of homoskedasticity in presence of heteroskedasticity. The empirical size refers to the percentage of false rejection at 5% significance level of the null hypothesis of homoskedasticity.
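Computing the LR statistic and its p-value from two fitted log-likelihoods can be sketched as follows. The log-likelihood values in the example are hypothetical, and `chi2_sf` is a small standard-library helper written for this sketch (valid for integer degrees of freedom); in practice a statistics library would normally supply the chi-square tail probability.

```python
import math


def chi2_sf(x, df):
    """Upper tail probability of a chi-square variable with integer df."""
    if x <= 0.0:
        return 1.0
    h = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-h) * sum_{k < df/2} h^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= h / k
            total += term
        return math.exp(-h) * total
    # Odd df: start from df = 1 and apply the incomplete gamma recurrence.
    total = math.erfc(math.sqrt(h))
    for k in range(1, (df - 1) // 2 + 1):
        total += math.exp(-h) * h ** (k - 0.5) / math.gamma(k + 0.5)
    return total


def lr_test(loglik_unrestricted, loglik_restricted, df, alpha=0.05):
    """Return the LR statistic, its p-value, and the rejection decision."""
    stat = 2.0 * (loglik_unrestricted - loglik_restricted)
    p_value = chi2_sf(stat, df)
    return stat, p_value, p_value < alpha


# Hypothetical joint test with one variable in each of Z_mu and Z_nu (p = 2).
stat, p_value, reject = lr_test(-1520.4, -1526.9, df=2)
print(stat, p_value, reject)
```

The degrees of freedom follow the definition of $p$ above: one per column of the variance regressors that the homoskedastic model omits.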

## 4. Monte Carlo Experiments

The Monte Carlo⁴ experiments conducted herein are based on data generated as follows. For $i = 1, \ldots, N$ and $t = 1, \ldots, T$, the binary dependent variable is generated as:
$y_{it} = \mathbb{1}_{\mathbb{R}^{+}}\left(\alpha_0 + \alpha_1 X_{1it} + \alpha_2 X_{2it} + \mu_i + \nu_{it}\right),$
where $X_1$ and $X_2$ are generated from a random uniform distribution. The error components $\mu_i$ and $\nu_{it}$ are generated following a normal data generating process with zero mean and standard deviations $\sigma_{\mu_i} = \sigma_\mu h_\mu(Z_{\mu_i}'\theta_\mu)$ and $\sigma_{\nu_{it}} = h_\nu(Z_{\nu_{it}}'\theta_\nu)$ respectively. The time-invariant variable $Z_{\mu_i}$ and the variable $Z_{\nu_{it}}$ are generated from a random uniform distribution. The parameters of the index function are set to $\alpha_0 = 1.5$, $\alpha_1 = 0.8$, and $\alpha_2 = -2$. For each type of heteroskedasticity presented in Section 2, nine (9) Monte Carlo experiments in which $N = 50, 100, 500$ and $T = 5, 10, 20$ are conducted with 5000 replications.
To estimate the power of the test, two cases are considered. The first set of experiments consists of a generated dataset with a low degree of heteroskedasticity (i.e., setting $\theta_\mu = 0.7$ and $\theta_\nu = 0.6$). A second set of experiments consists of a generated dataset with a high degree of heteroskedasticity (i.e., setting $\theta_\mu = 2.1$ and $\theta_\nu = 1.8$) for each of the 27 aforementioned models. In these experiments, the variance of the individual effects is set to $\sigma_\mu^2 = 0.2$ (i.e., $\lambda_0 = -0.8$). The results of these experiments are presented in Section 4.1. The empirical size of the test is estimated using a generated dataset with no heteroskedasticity (i.e., setting $\theta_\mu = \theta_\nu = 0$). A well performing test would be such that its empirical size does not significantly differ from the nominal size of 5%. The results of these simulations are presented in Section 4.1.
Further Monte Carlo experiments have been conducted to cover several situations. These experiments consist of setting the following parameters: $\sigma_\mu^2 = 2, 6$, $\theta_\mu = 0, 1, 2, 3$ and $\theta_\nu = 0, 1, 2, 3$. The results of this second set of experiments for $N = 50, 500$ and $T = 5, 20$ are reported in Table A1, Table A2 and Table A3 in Appendix D.
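One replication of this data generating process might be sketched as below, using the exponential variance functions and the low-heteroskedasticity parameter values as defaults. The function name `simulate_panel` is hypothetical, and the support of the uniform draws is an assumption: the text does not state it, so the unit interval is used here.

```python
import numpy as np


def simulate_panel(n, t, theta_mu=0.7, theta_nu=0.6, sigma_mu2=0.2,
                   alpha=(1.5, 0.8, -2.0), seed=0):
    """Simulate one heteroskedastic random effects probit panel."""
    rng = np.random.default_rng(seed)
    # Assumption: all uniform variables are drawn on [0, 1).
    x1 = rng.uniform(size=(n, t))
    x2 = rng.uniform(size=(n, t))
    z_mu = rng.uniform(size=(n, 1))   # time-invariant variance regressor
    z_nu = rng.uniform(size=(n, t))   # time-varying variance regressor

    sd_mu = np.sqrt(sigma_mu2) * np.exp(z_mu * theta_mu)
    sd_nu = np.exp(z_nu * theta_nu)   # sigma_nu normalised to one
    mu = rng.normal(size=(n, 1)) * sd_mu
    nu = rng.normal(size=(n, t)) * sd_nu

    latent = alpha[0] + alpha[1] * x1 + alpha[2] * x2 + mu + nu
    y = (latent > 0).astype(int)
    return {"y": y, "x1": x1, "x2": x2, "z_mu": z_mu, "z_nu": z_nu}


data = simulate_panel(n=50, t=5, seed=1)
print(data["y"].shape, data["y"].mean())
```

Setting `theta_mu=0` and `theta_nu=0` in this sketch yields the homoskedastic design used for the empirical size, while the larger values 2.1 and 1.8 correspond to the high-heteroskedasticity design.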

#### 4.1. Power and Empirical Size of the Test

Table 1 shows the power of the LR test for each of the aforementioned experiments. The Monte Carlo experiment for the marginal test $H_0: \theta_\mu = 0 \mid \theta_\nu = 0$ reveals that the test performs well in the case of a high degree of heteroskedasticity even for small samples (for $N = 50$ and $T = 5$, the power of the LR test is 81.72%). However, in the case of a low degree of heteroskedasticity, the test does not perform well on a sample with small $N$. For $N = 50$, the power of the test is 12.04% when $T = 5$, and increasing $T$ to 10 and 20 yields a power of 19.9% and 27.4% respectively. In contrast, the power of the LR test is very high for a sample with large $N$. For $N = 500$, the power of the test is 69.54% for $T = 5$, 94.14% for $T = 10$ and 99.26% for $T = 20$. Nonetheless, increasing $\sigma_\mu^2$ from 0.2 to a larger value, say 2 or 6, results in a decrease in the power of the test when $T$ is large and in an increase in the power of the test when $T$ is small (see Table A2 in Appendix D).
For the marginal test $H_0: \theta_\nu = 0 \mid \theta_\mu = 0$, the Monte Carlo experiment shows that the performance of the test is mixed for small samples in the presence of high heteroskedasticity. With $N = 50$ and $T = 5$, the power of the test is 47.78%. However, increasing the time dimension to $T = 10$ and $T = 20$ raises the power of the test drastically, to 93.7% and 99.98% respectively. In the case of a low degree of heteroskedasticity, the power of the test is very high when $N$ or $T$ is large. When $N$ is small, the power is low (22.86% with $N = 50$ and $T = 5$) and increases with $T$ (39.74% with $T = 10$ and 67.14% with $T = 20$). Nonetheless, increasing $\sigma_\mu^2$ from 0.2 to a larger value, say 2 or 6, results in a drastic increase in the power of the test (see Table A3 in Appendix D).
As for the joint test $H_0: \theta_\mu = \theta_\nu = 0$, the Monte Carlo experiment reveals that the test performs well in the case of high heteroskedasticity even when $N$ is small. For $N = 50$, the power of the test is 65.16% with $T = 5$ and reaches 100% with $T = 20$. In the case of a low degree of heteroskedasticity, the power of the test is excellent when $N$ or $T$ is large. For small $N$, the power of the test is low. It is 14.62% when $N = 50$ and $T = 5$; with $T$ fixed at 5, increasing $N$ to 500 yields a drastic increase in the power of the test (96.74%). Fixing $N$ at 50 and increasing $T$ to 10 and 20 yields a power of 35.16% and 63.74% respectively. Furthermore, increasing $\sigma_\mu^2$ from 0.2 to a larger value, say 2 or 6, results in an increase in the power of the test (see Table A1 in Appendix D).
Table 2 presents the empirical size of the test based on the Monte Carlo experiments described above. The empirical size of the test $H_0: \theta_\mu = 0 \mid \theta_\nu = 0$ varies between 4.54% and 5.36%. The empirical size of the test $H_0: \theta_\nu = 0 \mid \theta_\mu = 0$ varies between 4.48% and 5.54%, and that of the joint test $H_0: \theta_\mu = \theta_\nu = 0$ varies between 4.52% and 5.14%. None of these empirical sizes differs significantly from the nominal size⁵ of the test.
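One common way to judge whether an empirical size differs from the nominal size is a normal-approximation band for a binomial proportion. The sketch below applies it to the ranges quoted above with 5000 replications; this criterion is an assumption made for illustration and is not necessarily the one used by the author.

```python
import math


def size_band(nominal=0.05, replications=5000, z=1.96):
    """Approximate 95% band for the rejection rate under the null hypothesis."""
    se = math.sqrt(nominal * (1.0 - nominal) / replications)
    return nominal - z * se, nominal + z * se


low, high = size_band()
# Smallest and largest empirical sizes quoted for the three tests.
reported = [0.0454, 0.0536, 0.0448, 0.0554, 0.0452, 0.0514]
all_inside = all(low <= s <= high for s in reported)
print(round(low, 4), round(high, 4), all_inside)
```

Under this criterion the band is roughly 4.40% to 5.60%, so every reported empirical size falls inside it.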

#### 4.2. Bias and Mean Square Error of the Estimates

This subsection aims at evaluating the robustness of the proposed estimation procedure. For this purpose, the modelling approach for the case where both $\mu_i$ and $\nu_{it}$ are heteroskedastic is used. The parameters of the model are those set in Section 4. The bias and the mean square error (MSE) of the estimates are computed based on 5000 replications for $N = 50, 500$ and $T = 5, 20$. The results are presented in Table 3.
The results suggest that the MSEs of both the index function and the variance parameters decrease with the number of observations. The biases of the index function parameters are lower than 5% regardless of the individual and time dimensions of the panel. The bias of the parameters of the variances of $\mu_i$ and $\nu_{it}$ becomes lower as the time dimension of the panel increases. It reaches 5% for $(N, T) = (500, 20)$.
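The two summary measures can be computed from a vector of replicated estimates as in the following sketch. The function name `bias_and_mse` and the four estimates are hypothetical; the relative bias is included on the assumption that the percentages quoted above are expressed relative to the true parameter value.

```python
import numpy as np


def bias_and_mse(estimates, true_value):
    """Bias, mean square error, and relative bias across replications."""
    estimates = np.asarray(estimates, dtype=float)
    bias = estimates.mean() - true_value
    mse = np.mean((estimates - true_value) ** 2)
    return float(bias), float(mse), float(bias / true_value)


# Hypothetical estimates of a parameter whose true value is 0.8.
bias, mse, rel_bias = bias_and_mse([0.7, 0.9, 0.8, 1.0], true_value=0.8)
print(bias, mse, rel_bias)
```

The MSE equals the variance of the estimates plus the squared bias, which is why it can keep falling with the sample size even when the bias is already small.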

#### 4.3. Robustness of Validity

In this subsection, the robustness of validity of the test procedure is assessed using the framework described by Montes-Rojas and Sosa-Escudero (2011). The aim is to assess how a departure from normality of the data generating process (DGP) of the error components might affect the results of the test. For this purpose, the empirical size of the tests ($H_0: \theta_\mu = 0 \mid \theta_\nu = 0$, $H_0: \theta_\nu = 0 \mid \theta_\mu = 0$, and $H_0: \theta_\mu = \theta_\nu = 0$) is computed for $N = 50$ and $T = 5$ using 5000 replications. The empirical sizes are estimated for the normal, Student's t with three degrees of freedom, exponential, uniform, and chi-square DGPs. The results are presented in Table 4.
The results suggest that a deviation from the normal DGP has heavy consequences for the empirical size of the test. The largest effect on the empirical size of the test is observed for the exponential DGP. These results were expected since the estimation procedure, i.e., the Gauss–Hermite quadrature, is accurate only when the integrand has a Gaussian factor.

## 5. Robustness Analysis

To further check the robustness of the proposed approach, three analyses are conducted. Based on data simulated with the parameters set in Section 4, the first analysis consists in checking whether the estimation procedure provides estimates that are consistent with the data generating process. This analysis complements the measures of bias and MSE in Section 4.2 by focusing on the difference between each estimated parameter and the DGP. The second robustness analysis consists in checking the effect of the number of quadrature points on the estimated parameters. The third analysis focuses on the robustness to misspecified heteroskedasticity, i.e., how the test procedure performs when a researcher applies the wrong test.

#### 5.1. Application Examples and Comparisons

For each of the three cases of heteroskedasticity described in Section 2, two applications are provided: (i) the first on a random sample of size $N = 500$ and $T = 5$, and (ii) the second on a random sample of size $N = 500$ and $T = 20$. For each of these applications, comparisons with the homoskedastic panel probit and the heteroskedastic pooled probit models are provided. The log-likelihood and the LR statistics are provided for the heteroskedastic pooled probit and the heteroskedastic panel probit models. Estimates are reported in Appendix E: those for models with heteroskedasticity due to $\mu_i$ are provided in Table A4, those for models with heteroskedasticity due to $\nu_{it}$ in Table A5, and those for models with heteroskedasticity due to both $\mu_i$ and $\nu_{it}$ in Table A6.
Results suggest that in the presence of heteroskedasticity due to $\mu_i$, the pooled heteroskedastic model underestimates the heteroskedastic factor (coefficient of variable $Z_{\mu_i}$) for both the $T = 5$ and $T = 20$ models. The homoskedastic part of the variance is well estimated using the homoskedastic panel probit model. It also appears that the parameters estimated from the homoskedastic and the heteroskedastic panel probit models are not different. However, as expected, the pooled model yields biased parameter estimates, especially when $T$ is large.
In the presence of heteroskedasticity due to $\nu_{it}$, the pooled heteroskedastic model gives correct estimates of the heteroskedastic factor (coefficient of variable $Z_{\nu_{it}}$) and of the parameters of the model with $T = 20$. However, with $T = 5$, the estimated parameters are different from those of the data generating process (DGP). These estimates are not different from those provided by the pooled heteroskedastic model. The homoskedastic panel probit model yields estimates of parameters and variance components that differ from the DGP.
The estimation of the model with heteroskedasticity due to both $\mu_i$ and $\nu_{it}$ by a homoskedastic panel probit model leads to parameters that are different from the DGP. The heteroskedastic pooled probit model yields an estimated individual effects variance that is different from the DGP.

#### 5.2. Quadrature Points Check

The data generated for the examples in Section 5.1 with $N = 500$ and $T = 20$ are used for the quadrature points check. This quadrature check is conducted on the heteroskedastic model where the heteroskedasticity is due to both $\mu_i$ and $\nu_{it}$, which is the most general case of heteroskedasticity.
The quadrature points check shows that using $Q$ under 10, in this example, leads to significant differences between the estimated parameters and the DGP (see details in Table A7 in Appendix F). For $Q = 10$ or more quadrature points, the estimated parameters are not significantly different from the DGP. Furthermore, as the number of quadrature points increases, the estimated parameters converge to the DGP's values and become more accurate. This result has also been found in several applications that use Gauss–Hermite quadrature (Baltagi 2008; Moussa and Delattre 2018). Moreover, for $Q = 10$, the relative difference in log-likelihood is around $0.001$ while the relative difference in the LR statistics is around $0.1$. In terms of computation time, convergence is generally reached quickly. The model takes from 41 seconds for $Q = 6$ to 133 seconds for $Q = 20$ to converge. However, the computation time may vary considerably according to the number of explanatory variables and the sample size.
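The idea behind a quadrature points check can be illustrated on a toy integral with a known answer: for $Z \sim N(0, 1)$, $E[\Phi(a + bZ)] = \Phi\left(a / \sqrt{1 + b^2}\right)$. The sketch below uses the basic (non-adaptive) Gauss–Hermite rule and hypothetical values of $a$ and $b$; it does not reproduce the model or the figures reported in Table A7.

```python
import math
import numpy as np


def norm_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def gh_expectation(a, b, n_nodes):
    """Gauss-Hermite approximation of E[Phi(a + b * Z)] for Z ~ N(0, 1)."""
    x, w = np.polynomial.hermite.hermgauss(n_nodes)
    values = [norm_cdf(a + b * math.sqrt(2.0) * xq) for xq in x]
    return float(np.dot(w, values)) / math.sqrt(math.pi)


a, b = 0.5, 1.0  # hypothetical index and random effect scale
exact = norm_cdf(a / math.sqrt(1.0 + b ** 2))
errors = {q: abs(gh_expectation(a, b, q) - exact) / exact for q in (2, 6, 10, 20)}
for q, err in errors.items():
    print(q, err)
```

In a real application no closed form is available, so the same check is run by comparing the log-likelihood and the estimates across successive values of $Q$ and stopping when the relative changes become negligible.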

#### 5.3. Misspecified Heteroskedasticity: Effects of Applying the Wrong Approach

To check for robustness, this subsection analyzes what happens when researchers apply the wrong heteroskedasticity modelling approach to a model with heteroskedasticity. For example, what happens if, in the presence of heteroskedasticity due to both $\mu_i$ and $\nu_{it}$, researchers apply the procedure for the estimation of heteroskedasticity due to $\mu_i$? A second set of robustness checks consists in applying one of the three tests to a homoskedastic model. For this purpose, the data generated in Section 5.1 for the examples with $N = 500$ and $T = 20$ are used. For each case, the number of quadrature points is set to $Q = 10$. For example, on a dataset generated with heteroskedastic $\mu_i$, the modelling approaches for heteroskedasticity due to $\nu_{it}$ and for heteroskedasticity due to both $\mu_i$ and $\nu_{it}$ are applied.
Table 5 shows the results of the LR tests and the variance components for each of the aforementioned cases. Results suggest that when only the $\mu_i$ are heteroskedastic, applying the modelling approach for heteroskedasticity due to $\nu_{it}$ results in incorrect estimates of the variance components, and the LR test indicates the presence of heteroskedasticity due to $\nu_{it}$. The results of the Monte Carlo simulations presented in Table A3 in Appendix D show that the acceptance rate of the null hypothesis in such a situation varies between 6.6% and 96.4% according to the panel's dimensions and the degree of heteroskedasticity. Furthermore, the higher the variance of $\mu_i$, the higher the acceptance rate. In contrast, and as expected, if researchers apply the modelling approach for heteroskedasticity due to both $\mu_i$ and $\nu_{it}$, the results are consistent with those obtained when applying the right approach, and the parameter $\theta_\nu$ that indicates the presence of heteroskedasticity due to $\nu_{it}$ is not significantly different from zero. The same results hold for the case where only the $\nu_{it}$ are heteroskedastic, except that the LR test indicates no heteroskedasticity due to $\mu_i$ when this modelling approach is used. The results from the Monte Carlo simulations presented in Table A1 in Appendix D show that the acceptance rate of the null hypothesis does not significantly differ from 5%. In the case where both $\mu_i$ and $\nu_{it}$ are heteroskedastic, applying the wrong modelling approach identifies the related heteroskedasticity while the other forms are ignored. For example, if researchers apply the modelling approach for heteroskedasticity due to $\nu_{it}$, the LR test indicates the existence of heteroskedasticity due to $\nu_{it}$ while the heteroskedasticity from the individual effects is ignored. The Monte Carlo simulations conducted (see Table A2 and Table A3 in Appendix D) show that the power of the test is close to that of the right test.
This result suggests that it is better to start with the modelling approach for heteroskedasticity due to both $\mu_i$ and $\nu_{it}$. Then, if one of the sources makes no significant contribution to heteroskedasticity, researchers can turn to the other source using the specific modelling approach.
Table 6 shows the results of applying each of the three tests to a situation where there is no heteroskedasticity. As expected, the LR tests indicate homoskedasticity for all three modelling approaches. The Monte Carlo simulations conducted show that the acceptance rate of the null hypothesis does not differ significantly from the nominal size of the test (see Table A1, Table A2 and Table A3 in Appendix D).

## 6. Case Study

In this section, the illustration dataset for panel probit models used by Greene (2012) and refereed as Example 17.11 pp. 274–275 is used. This dataset is related to German health care utilization and contains 26,326 observations with $N = 7293$ and T varying between 1 and 7. The model estimates the effects of socioeconomic variables (age, income, kids, education, and marital status) on the probability to visit a doctor. The results by Greene (2012) are replicated and the marginal effects are calculated with respect to the two approaches aforementioned. Then, the heteroskedastic probit model in the more general setting where both $μ i$ and $ν i t$ are heteroskedastic is estimated and the marginal effects are computed. Table 7 shows the results of the estimates for the two approaches.
The LR test leads to the rejection of the null hypothesis of homoskedasticity. Thus, heteroskedasticity due to both $\mu_i$ and $\nu_{it}$ is present. The estimated parameters differ significantly between the homoskedastic and the heteroskedastic models, and the same holds for the marginal effects. However, the differences are smaller for the marginal effects integrated with respect to $\mu_i$ than for the ones computed assuming $\mu_i = 0$. Using the former approach, the results suggest that an additional year of age increases the probability of visiting a doctor by 0.55 percentage points in the homoskedastic model and by 0.61 percentage points in the heteroskedastic model. These estimates are 0.69 and 0.76 percentage points, respectively, using the second approach. An additional year of education reduces the probability of visiting a doctor by 0.92 percentage points in the homoskedastic model and by 0.38 percentage points in the heteroskedastic model. Assuming $\mu_i = 0$, an additional year of education reduces the probability of visiting a doctor by 1.16 percentage points in the homoskedastic model, while the effect is not significant in the heteroskedastic model.
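To make the difference between the two marginal-effect definitions concrete, the sketch below covers the homoskedastic random effects probit only, assuming $\nu_{it} \sim N(0, 1)$ and $\mu_i \sim N(0, \sigma_\mu^2)$. In that case the effect integrated over $\mu_i$ has the standard closed form $\beta_k (1 + \sigma_\mu^2)^{-1/2} \phi\big(x'\beta / \sqrt{1 + \sigma_\mu^2}\big)$, while the effect at $\mu_i = 0$ is $\beta_k \phi(x'\beta)$. The code is not the Stata code of Appendix A; the index value 0.5 is a hypothetical evaluation point, and a simple midpoint rule is used for the numerical check rather than Gauss-Hermite quadrature.

```python
import math


def phi(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def me_at_zero_effect(beta_k, index):
    """Marginal effect evaluated at mu_i = 0: beta_k * phi(x'beta)."""
    return beta_k * phi(index)


def me_integrated_closed_form(beta_k, index, sigma_mu):
    """Marginal effect averaged over mu_i ~ N(0, sigma_mu^2), closed form."""
    s = math.sqrt(1.0 + sigma_mu ** 2)
    return beta_k / s * phi(index / s)


def me_integrated_numeric(beta_k, index, sigma_mu, n=4000, width=8.0):
    """Same average computed with a midpoint rule over +/- width standard deviations."""
    lo = -width * sigma_mu
    h = 2.0 * width * sigma_mu / n
    total = 0.0
    for i in range(n):
        mu = lo + (i + 0.5) * h
        total += phi(index + mu) * phi(mu / sigma_mu) / sigma_mu * h
    return beta_k * total


# Age coefficient and sigma_mu of the homoskedastic model in Table 7;
# the index value 0.5 is a hypothetical evaluation point.
beta_age, sigma_mu, index = 0.0201, 0.9007, 0.5
print(me_at_zero_effect(beta_age, index))
print(me_integrated_closed_form(beta_age, index, sigma_mu))
print(me_integrated_numeric(beta_age, index, sigma_mu))
```

At this evaluation point the integrated effect is smaller in absolute value than the effect at $\mu_i = 0$, which matches the ordering of the M.E.+ and M.E.++ columns for age in Table 7; the numbers themselves are not meant to reproduce the table.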

## 7. Conclusions

The random effects probit model has become popular because of the incidental parameters problem encountered with fixed effects models for binary outcomes in panel data. However, researchers do not test for the presence of heteroskedasticity in the error terms and therefore do not control for it when estimating these models. This paper proposes an estimation procedure that accounts for heteroskedasticity in the individual effects and in the idiosyncratic errors, separately and jointly, as well as an LR test for homoskedasticity.
A Monte Carlo experiment was conducted to estimate the power of the test. It shows that the LR test generally performs well. However, on samples with a low degree of heteroskedasticity, the power of the test is around 20% for panels with small N and T, but it increases drastically with larger N and T. The analysis also shows that applying the wrong estimation and test procedures may yield misleading conclusions about heteroskedasticity.

## Funding

This research received no external funding.

## Acknowledgments

I would like to thank Désiré Kanga and Vakaramoko Diaby for their helpful comments on an earlier version of the manuscript.

## Conflicts of Interest

The author declares no conflict of interest.

## Appendix A. STATA Code for Computing the Marginal Effects

## Appendix B. STATA Code for Generating the Dataset

## Appendix C. STATA Code for Monte Carlo Experiments

## Appendix D. Power of the Test for Different Degrees of Heteroskedasticity

#### Appendix D.1. Testing for the Joint Hypothesis

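Table A1. Power and size of the LR test based on 5000 replications: case of $H_0: \theta_\mu = \theta_\nu = 0$.

| $\theta_\mu$ | $\theta_\nu$ | $\sigma_\mu^2=2$; $(N,T)=(50,5)$ | $\sigma_\mu^2=2$; $(N,T)=(50,20)$ | $\sigma_\mu^2=2$; $(N,T)=(500,5)$ | $\sigma_\mu^2=2$; $(N,T)=(500,20)$ | $\sigma_\mu^2=6$; $(N,T)=(50,5)$ | $\sigma_\mu^2=6$; $(N,T)=(50,20)$ | $\sigma_\mu^2=6$; $(N,T)=(500,5)$ | $\sigma_\mu^2=6$; $(N,T)=(500,20)$ |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 4.48 | 4.88 | 4.44 | 5.42 | 5.58 | 5.52 | 5.6 | 5.6 |
| 0 | 1 | 12.68 | 91.24 | 99.48 | 100 | 6.6 | 92.24 | 98.42 | 100 |
| 0 | 2 | 35.88 | 99.92 | 100 | 100 | 31.84 | 100 | 100 | 100 |
| 0 | 3 | 50.22 | 99.94 | 100 | 100 | 38.42 | 100 | 100 | 100 |
| 1 | 0 | 22.08 | 39.08 | 99.48 | 100 | 13.36 | 29.3 | 98.46 | 100 |
| 1 | 1 | 26.56 | 97.56 | 100 | 100 | 14.46 | 90.04 | 99.6 | 100 |
| 1 | 2 | 46.36 | 100 | 100 | 100 | 49.26 | 100 | 100 | 100 |
| 1 | 3 | 45.78 | 100 | 100 | 100 | 62.56 | 100 | 100 | 100 |
| 2 | 0 | 30.79 | 72.34 | 100 | 100 | 21.08 | 43.12 | 100 | 100 |
| 2 | 1 | 54.14 | 99.26 | 100 | 100 | 23.08 | 75.34 | 100 | 100 |
| 2 | 2 | 78.28 | 100 | 100 | 100 | 65.62 | 100 | 100 | 100 |
| 2 | 3 | 79.28 | 100 | 100 | 100 | 87.06 | 100 | 100 | 100 |
| 3 | 0 | 50.92 | 73.8 | 100 | 100 | 30.08 | 58.46 | 100 | 100 |
| 3 | 1 | 59.52 | 96.88 | 100 | 100 | 37.72 | 56.46 | 100 | 100 |
| 3 | 2 | 90.3 | 100 | 100 | 100 | 57.88 | 99.92 | 100 | 100 |
| 3 | 3 | 95.46 | 100 | 100 | 100 | 93.8 | 100 | 100 | 100 |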

#### Appendix D.2. Testing for the Marginal Hypothesis of No Heteroskedasticity in Individual Effects Given Homoskedastic Idiosyncratic Errors

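Table A2. Power and size of the LR test based on 5000 replications: case of $H_0: \theta_\mu = 0 \mid \theta_\nu = 0$.

| $\theta_\mu$ | $\theta_\nu$ | $\sigma_\mu^2=2$; $(N,T)=(50,5)$ | $\sigma_\mu^2=2$; $(N,T)=(50,20)$ | $\sigma_\mu^2=2$; $(N,T)=(500,5)$ | $\sigma_\mu^2=2$; $(N,T)=(500,20)$ | $\sigma_\mu^2=6$; $(N,T)=(50,5)$ | $\sigma_\mu^2=6$; $(N,T)=(50,20)$ | $\sigma_\mu^2=6$; $(N,T)=(500,5)$ | $\sigma_\mu^2=6$; $(N,T)=(500,20)$ |
|---|---|---|---|---|---|---|---|---|---|
| 0 | | 5.26 | 5.46 | 5.06 | 5.02 | 4.42 | 5.08 | 4.76 | 4.72 |
| 1 | | 24.86 | 47.12 | 99.54 | 100 | 5.88 | 15.76 | 77.84 | 98.68 |
| 2 | | 35.16 | 71.7 | 100 | 100 | 6.22 | 17.68 | 90.52 | 100 |
| 3 | | 29.14 | 64.98 | 100 | 100 | 4.94 | 15.28 | 99.32 | 100 |
| | 0 | 5.26 | 5.46 | 5.06 | 5.02 | 4.42 | 5.08 | 4.76 | 4.72 |
| | 1 | 5.78 | 5.46 | 5.56 | 4.74 | 5.06 | 4.82 | 4.72 | 4.4 |
| | 2 | 5.26 | 5.06 | 5.58 | 4.46 | 5.56 | 4.62 | 5.02 | 4.42 |
| | 3 | 5.12 | 4.94 | 4.96 | 4.44 | 5.44 | 4.74 | 4.92 | 4.42 |
| 0 | 0 | 5.26 | 5.46 | 5.06 | 5.02 | 4.42 | 5.08 | 4.76 | 4.72 |
| 0 | 1 | 5.78 | 5.46 | 5.56 | 4.74 | 5.06 | 4.82 | 4.72 | 4.4 |
| 0 | 2 | 5.26 | 5.06 | 5.58 | 4.46 | 5.56 | 4.62 | 5.02 | 4.42 |
| 0 | 3 | 5.12 | 4.94 | 4.96 | 4.44 | 5.44 | 4.74 | 4.92 | 4.42 |
| 1 | 0 | 24.86 | 47.12 | 99.54 | 100 | 5.88 | 15.76 | 77.84 | 98.68 |
| 1 | 1 | 27.2 | 53.2 | 99.6 | 100 | 15.66 | 34.26 | 96.3 | 100 |
| 1 | 2 | 23.88 | 45.78 | 97.2 | 100 | 22.56 | 42.84 | 98.64 | 100 |
| 1 | 3 | 14 | 33.56 | 79.58 | 100 | 19.08 | 31.72 | 92.44 | 100 |
| 2 | 0 | 35.16 | 71.7 | 100 | 100 | 6.22 | 17.68 | 90.52 | 100 |
| 2 | 1 | 59.46 | 91.98 | 100 | 100 | 24.76 | 59.52 | 100 | 100 |
| 2 | 2 | 67 | 94.98 | 100 | 100 | 55.48 | 89.68 | 100 | 100 |
| 2 | 3 | 52.72 | 89.16 | 100 | 100 | 55.9 | 88.36 | 100 | 100 |
| 3 | 0 | 29.14 | 64.98 | 100 | 100 | 4.94 | 15.28 | 99.32 | 100 |
| 3 | 1 | 66.46 | 95.96 | 100 | 100 | 21.56 | 59.06 | 100 | 100 |
| 3 | 2 | 88.42 | 99.8 | 100 | 100 | 65.1 | 97.06 | 100 | 100 |
| 3 | 3 | 86.9 | 99.66 | 100 | 100 | 83.36 | 99.58 | 100 | 100 |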

#### Appendix D.3. Testing for the Marginal Hypothesis of no Heteroskedasticity in Idiosyncratic Errors Given Homoskedastic Individual Effects

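Table A3. Power and size of the LR test based on 5000 replications: case of $H_0: \theta_\nu = 0 \mid \theta_\mu = 0$.

| $\theta_\mu$ | $\theta_\nu$ | $\sigma_\mu^2=2$; $(N,T)=(50,5)$ | $\sigma_\mu^2=2$; $(N,T)=(50,20)$ | $\sigma_\mu^2=2$; $(N,T)=(500,5)$ | $\sigma_\mu^2=2$; $(N,T)=(500,20)$ | $\sigma_\mu^2=6$; $(N,T)=(50,5)$ | $\sigma_\mu^2=6$; $(N,T)=(50,20)$ | $\sigma_\mu^2=6$; $(N,T)=(500,5)$ | $\sigma_\mu^2=6$; $(N,T)=(500,20)$ |
|---|---|---|---|---|---|---|---|---|---|
| | 0 | 4.64 | 4.48 | 4.44 | 5.1 | 5.56 | 5.6 | 5.6 | 5.6 |
| | 1 | 20.5 | 95.6 | 99.8 | 100 | 11.12 | 96.62 | 99.5 | 100 |
| | 2 | 54.28 | 99.98 | 100 | 100 | 52.8 | 100 | 100 | 100 |
| | 3 | 66.18 | 99.96 | 100 | 100 | 62.46 | 100 | 100 | 100 |
| 0 | | 4.64 | 4.48 | 4.44 | 5.1 | 5.56 | 5.6 | 5.6 | 5.6 |
| 1 | | 6.6 | 8.2 | 39.06 | 43.18 | 23.96 | 23.98 | 87.02 | 96.4 |
| 2 | | 16.08 | 23.78 | 77.06 | 96.1 | 15.54 | 34.34 | 99.96 | 99.42 |
| 3 | | 25.28 | 29.19 | 97.98 | 98.72 | 24.58 | 49.7 | 99.96 | 100 |
| 0 | 0 | 4.64 | 4.48 | 4.44 | 5.1 | 5.56 | 5.6 | 5.6 | 5.6 |
| 0 | 1 | 20.5 | 95.6 | 99.8 | 100 | 11.12 | 96.62 | 99.5 | 100 |
| 0 | 2 | 54.28 | 99.98 | 100 | 100 | 52.8 | 100 | 100 | 100 |
| 0 | 3 | 66.18 | 99.96 | 100 | 100 | 62.46 | 100 | 100 | 100 |
| 1 | 0 | 6.6 | 8.2 | 39.06 | 43.18 | 23.96 | 23.98 | 87.02 | 96.4 |
| 1 | 1 | 11.52 | 96.46 | 99.54 | 100 | 34.2 | 87.7 | 84.38 | 100 |
| 1 | 2 | 50.92 | 100 | 100 | 100 | 53.16 | 100 | 100 | 100 |
| 1 | 3 | 60.96 | 100 | 100 | 100 | 76.9 | 100 | 100 | 100 |
| 2 | 0 | 16.08 | 23.78 | 77.06 | 96.1 | 15.54 | 34.34 | 99.96 | 99.42 |
| 2 | 1 | 24.22 | 86.38 | 91.18 | 100 | 25.24 | 50.28 | 100 | 100 |
| 2 | 2 | 45.94 | 100 | 100 | 100 | 34.06 | 99.98 | 100 | 100 |
| 2 | 3 | 70.08 | 100 | 100 | 100 | 78.88 | 100 | 100 | 100 |
| 3 | 0 | 25.28 | 29.19 | 97.98 | 98.72 | 24.58 | 49.7 | 99.96 | 100 |
| 3 | 1 | 33.71 | 56.64 | 98.04 | 99.92 | 32.36 | 59.6 | 100 | 100 |
| 3 | 2 | 48.88 | 99.98 | 100 | 100 | 51.34 | 99.32 | 100 | 100 |
| 3 | 3 | 67.52 | 100 | 100 | 100 | 65.98 | 100 | 100 | 100 |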

## Appendix E. Application and Comparisons

#### Appendix E.1. Application and Comparison for Data Generated with Individual Effects Heteroskedastic

Table A4. Estimated index function and variance parameters.

| Variables | DGP | Homoskedastic panel probit | Heteroskedastic pooled probit | Heteroskedastic panel probit |
|---|---|---|---|---|
| With $(N, T) = (500, 5)$ | | | | |
| $\log L$ | | −1241.7666 | −1283.699 | −1238.6702 |
| LR stat | | | 9.52 *** | 6.2328 ** |
| The estimated index function parameters | | | | |
| $X_1$ | 0.8 | 0.8285 [0.6008; 1.0561] *** | 0.87 [0.6053; 1.1348] *** | 0.8289 [0.6021; 1.0557] *** |
| $X_2$ | −2 | −1.9558 [−2.2014; −1.7102] *** | −2.1149 [−2.5151; −1.7146] *** | −1.9628 [−2.2077; −1.7179] *** |
| Intercept | 1.5 | 1.3993 [1.2052; 1.5922] *** | 1.4881 [1.206; 1.7702] *** | 1.3992 [1.2077; 1.5907] *** |
| The variance parameters | | | | |
| $Z_{\mu i}$ | 0.7 | | 0.4086 [0.1485; 0.6686] *** | 0.7147 [0.1192; 1.3103] ** |
| $\lambda_0$ | −0.8 | −0.8713 [−1.2052; −0.5373] *** | | −0.8449 [−1.257; −0.4327] *** |
| $\sigma_\nu$ (assumed) | 1 | 1 | 1 | 1 |
| With $(N, T) = (500, 20)$ | | | | |
| $\log L$ | | −4500.4005 | −4928.991 | −4492.8442 |
| LR stat | | | 8.81 *** | 20.5892 *** |
| The estimated index function parameters | | | | |
| $X_1$ | 0.8 | 0.7235 [0.6114; 0.8356] *** | 0.6531 [0.5392; 0.7671] *** | 0.719 [0.6073; 0.8307] *** |
| $X_2$ | −2 | −2.0158 [−2.1368; −1.8949] *** | −1.8368 [−1.9884; −1.6852] *** | −2.0038 [−2.1238; −1.8838] *** |
| Intercept | 1.5 | 1.5913 [1.4813; 1.7012] *** | 1.4502 [1.3315; 1.5689] *** | 1.5688 [1.4651; 1.6725] *** |
| The variance parameters | | | | |
| $Z_{\mu i}$ | 0.7 | | 0.182 [0.0613; 0.3028] *** | 0.6953 [0.39; 1.0016] *** |
| $\lambda_0$ | −0.8 | −0.8252 [−1.0042; −0.6462] *** | | −0.7853 [−0.9646; −0.6059] *** |
| $\sigma_\nu$ (assumed) | 1 | 1 | 1 | 1 |

95% confidence intervals in brackets; ***: Significant at the 1% level. **: Significant at the 5% level.

#### Appendix E.2. Application and Comparison for Data Generated with Idiosyncratic Errors Heteroskedastic

Table A5. Estimated index function and variance parameters.

| Variables | DGP | Homoskedastic panel probit | Heteroskedastic pooled probit | Heteroskedastic panel probit |
|---|---|---|---|---|
| With $(N, T) = (500, 5)$ | | | | |
| $\log L$ | | −1356.3042 | −1363.723 | −1352.2662 |
| LR stat | | | 5.99 ** | 8.0787 *** |
| The estimated index function parameters | | | | |
| $X_1$ | 0.8 | 0.7217 [0.515; 0.9284] *** | 0.752 [0.511; 0.9931] *** | 0.8878 [0.6085; 1.167] *** |
| $X_2$ | −2 | −1.5232 [−1.7353; −1.3111] *** | −1.698 [−2.0197; −1.3763] *** | −1.9064 [−2.2846; −1.5283] *** |
| Intercept | 1.5 | 1.0644 [0.9026; 1.2262] *** | 1.2019 [0.9675; 1.4364] *** | 1.3267 [1.0544; 1.599] *** |
| The variance parameters | | | | |
| $Z_{\nu it}$ | 0.6 | | 0.3358 [0.0649; 0.6067] *** | 0.4247 [0.1352; 0.7143] *** |
| $\sigma_\mu$ | 0.45 | 0.3857 [0.2935; 0.507] *** | | 0.4805 [0.3619; 0.638] *** |
| With $(N, T) = (500, 20)$ | | | | |
| $\log L$ | | −5230.5929 | −5280.282 | −5189.8208 |
| LR stat | | | 89.12 *** | 83.3667 *** |
| The estimated index function parameters | | | | |
| $X_1$ | 0.8 | 0.5561 [0.4553; 0.6568] *** | 0.7271 [0.5877; 0.8665] *** | 0.7618 [0.6191; 0.9046] *** |
| $X_2$ | −2 | −1.5742 [−1.6786; −1.4698] *** | −2.0745 [−2.2619; −1.8871] *** | −2.1414 [−2.3319; −1.951] *** |
| Intercept | 1.5 | 1.2223 [1.1361; 1.3086] *** | 1.5999 [1.4578; 1.7419] *** | 1.638 [1.4917; 1.7842] *** |
| The variance parameters | | | | |
| $Z_{\nu it}$ | 0.6 | | 0.6417 [0.5062; 0.7772] *** | 0.603 [0.4731; 0.7329] *** |
| $\sigma_\mu$ | 0.45 | 0.3549 [0.3144; 0.4007] *** | | 0.44 [0.3883; 0.4987] *** |

95% confidence intervals in brackets; ***: Significant at the 1% level. **: Significant at the 5% level.

#### Appendix E.3. Application and Comparison for Data Generated with Both Individual Effects and Idiosyncratic Heteroskedastic

Table A6. Estimated index function and variance parameters.

| Variables | DGP | Homoskedastic panel probit | Heteroskedastic pooled probit | Heteroskedastic panel probit |
|---|---|---|---|---|
| With $(N, T) = (500, 5)$ | | | | |
| $\log L$ | | −1378.8704 | −1407.272 | −1371.3114 |
| LR stat | | | 11.00 *** | 15.1319 *** |
| The estimated index function parameters | | | | |
| $X_1$ | 0.8 | 0.7061 [0.4946; 0.9175] *** | 0.8069 [0.5124; 1.1015] *** | 0.8886 [0.6023; 1.1749] *** |
| $X_2$ | −2 | −1.4263 [−1.6449; −1.2077] *** | −1.6477 [−2.0588; −1.2365] *** | −1.7918 [−2.1511; −1.4324] *** |
| Intercept | 1.5 | 1.0207 [0.8499; 1.1915] *** | 1.1883 [0.8931; 1.4836] *** | 1.2692 [1.002; 1.5365] *** |
| The variance parameters | | | | |
| $Z_{\mu i}$ | 0.7 | | 0.1062 [−0.1847; 0.3971] | 0.6385 [0.0271; 1.2499] ** |
| $\lambda_0$ | −0.8 | −1.1807 [−1.5411; −0.8204] *** | | −0.767 [−1.2244; −0.3096] *** |
| $Z_{\nu it}$ | 0.6 | | 0.4681 [0.1734; 0.7628] *** | 0.448 [0.155; 0.7411] *** |
| With $(N, T) = (500, 20)$ | | | | |
| $\log L$ | | −5245.4829 | −5447.011 | −5200.2317 |
| LR stat | | | 63.51 *** | 92.7282 *** |
| The estimated index function parameters | | | | |
| $X_1$ | 0.8 | 0.5571 [0.4553; 0.659] *** | 0.7218 [0.5743; 0.8692] *** | 0.7407 [0.6011; 0.8804] *** |
| $X_2$ | −2 | −1.5148 [−1.62; −1.4096] *** | −1.9041 [−2.117; −1.6912] *** | −1.9696 [−2.1449; −1.7943] *** |
| Intercept | 1.5 | 1.1794 [1.0879; 1.271] *** | 1.468 [1.3057; 1.6302] *** | 1.5026 [1.3669; 1.6383] *** |
| The variance parameters | | | | |
| $Z_{\mu i}$ | 0.7 | | 0.1148 [−0.0189; 0.2485] * | 0.7565 [0.4228; 1.0902] *** |
| $\lambda_0$ | −0.8 | −1.4512 [−1.6468; −1.2556] *** | | −0.9136 [−1.1258; −0.7013] *** |
| $Z_{\nu it}$ | 0.6 | | 0.5498 [0.4099; 0.6896] *** | 0.5255 [0.4108; 0.6603] *** |

95% confidence intervals in brackets; ***: Significant at the 1% level. **: Significant at the 5% level. *: Significant at the 10% level.

## Appendix F. Estimates for Different Numbers of Quadrature Points

Table A7. Changes in parameters and in log-likelihood with respect to the number of quadrature points Q.

| Variables | DGP | $Q = 6$ | $Q = 8$ | $Q = 10$ | $Q = 12$ | $Q = 14$ | $Q = 16$ | $Q = 18$ | $Q = 20$ |
|---|---|---|---|---|---|---|---|---|---|
| $\log L$ | | −5233.2859 | −5210.0882 | −5200.2317 | −5195.4211 | −5192.3082 | −5190.1575 | −5189.2627 | −5188.4647 |
| Wald stat | | 66.3381 | 81.5596 | 94.3733 | 103.304 | 109.889 | 115.235 | 116.835 | 117.604 |
| LR stat | | 67.0431 | 80.8018 | 92.7282 | 100.563 | 106.433 | 110.666 | 112.443 | 114.037 |
| $X_1$ | 0.8 | 0.6809 [0.5508; 0.811] *** | 0.7143 [0.579; 0.8495] *** | 0.7407 [0.6011; 0.8804] *** | 0.7598 [0.6167; 0.9028] *** | 0.7751 [0.6292; 0.9211] *** | 0.7889 [0.6403; 0.9375] *** | 0.7962 [0.646; 0.9464] *** | 0.8035 [0.6517; 0.9553] *** |
| $X_2$ | −2 | −1.8256 [−1.9899; −1.6612] *** | −1.9058 [−2.0755; −1.7361] *** | −1.9696 [−2.1449; −1.7943] *** | −2.0171 [−2.1978; −1.8365] *** | −2.0568 [−2.2428; −1.8709] *** | −2.0906 [−2.2814; −1.8999] *** | −2.1087 [−2.3029; −1.9145] *** | −2.1265 [−2.3244; −1.9287] *** |
| Intercept | 1.5 | 1.3551 [1.2299; 1.4803] *** | 1.4401 [1.3098; 1.5703] *** | 1.5026 [1.3669; 1.6383] *** | 1.5479 [1.4067; 1.6892] *** | 1.5852 [1.4384; 1.7319] *** | 1.6162 [1.4642; 1.7681] *** | 1.6332 [1.4774; 1.7889] *** | 1.6497 [1.4901; 1.8094] *** |
| $Z_{\mu i}$ | 0.7 | 0.8259 [0.4615; 1.1903] *** | 0.7739 [0.4341; 1.1137] *** | 0.7565 [0.4228; 1.0902] *** | 0.7366 [0.4036; 1.0696] *** | 0.7436 [0.4094; 1.0778] *** | 0.7582 [0.4241; 1.0923] *** | 0.7598 [0.425; 1.0947] *** | 0.7697 [0.4347; 1.1046] *** |
| $\lambda_0$ | −0.8 | −1.0917 [−1.3352; −0.8482] *** | −0.984 [−1.2023; −0.7656] *** | −0.9136 [−1.1258; −0.7013] *** | −0.8603 [−1.0733; −0.6474] *** | −0.831 [−1.0463; −0.6157] *** | −0.8104 [−1.0266; −0.5941] *** | −0.7967 [−1.015; −0.5784] *** | −0.7887 [−1.0077; −0.5697] *** |
| $Z_{\nu it}$ | 0.6 | 0.4013 [0.2782; 0.5244] *** | 0.4761 [0.3529; 0.5993] *** | 0.5355 [0.4108; 0.6603] *** | 0.5783 [0.4512; 0.7053] *** | 0.6142 [0.4842; 0.7442] *** | 0.6435 [0.5109; 0.7761] *** | 0.6593 [0.5243; 0.7943] *** | 0.6747 [0.5372; 0.8122] *** |
| $\Delta$ in $\log L$ | | | 0.0044 | 0.0019 | 0.0009 | 0.0006 | 0.0004 | 0.0002 | 0.0002 |
| $\Delta$ in Wald stat | | | 0.2259 | 0.155 | 0.0936 | 0.0632 | 0.0482 | 0.0138 | 0.0065 |
| $\Delta$ in LR stat | | | 0.2022 | 0.1458 | 0.0836 | 0.0578 | 0.0394 | 0.0159 | 0.0141 |
| $\Delta$ in param. | | 0.162 | 0.1022 | 0.0631 | 0.0335 | 0.0341 | 0.0465 | 0.0533 | 0.0599 |
| Time (sec) | | 41 | 53 | 62 | 79 | 96 | 106 | 130 | 133 |

95% confidence intervals in brackets; ***: Significant at the 1% level. $\Delta$ denotes the relative difference defined as $\frac{|x - y|}{1 + |y|}$. It is calculated to assess the variation in the log-likelihood ($\log L$), the LR statistic and the parameters when the number of quadrature points Q increases. $\Delta$ in parameters is calculated as the maximum relative difference between parameters for two different Q.
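The relative difference used in Table A7 can be recomputed from the reported log-likelihoods and LR statistics. The sketch below assumes that $y$ in $\frac{|x - y|}{1 + |y|}$ is the value at the previous (smaller) Q and $x$ the value at the current Q; the footnote does not state this explicitly, but the reported $\Delta$ rows are matched to four decimals under this reading. Function names are illustrative.

```python
def relative_difference(new, old):
    """Relative difference |new - old| / (1 + |old|), with old taken at the previous Q."""
    return abs(new - old) / (1.0 + abs(old))


def successive_deltas(values):
    """Relative differences between consecutive entries, rounded to four decimals."""
    return [round(relative_difference(b, a), 4) for a, b in zip(values, values[1:])]


# Values reported in Table A7 for Q = 6, 8, ..., 20.
log_l = [-5233.2859, -5210.0882, -5200.2317, -5195.4211,
         -5192.3082, -5190.1575, -5189.2627, -5188.4647]
lr_stat = [67.0431, 80.8018, 92.7282, 100.563,
           106.433, 110.666, 112.443, 114.037]

print(successive_deltas(log_l))    # compare with the "Delta in LogL" row
print(successive_deltas(lr_stat))  # compare with the "Delta in LR stat" row
```

The log-likelihood changes by less than 0.1% between successive Q from $Q = 10$ onwards, whereas the LR statistic still moves by several percent.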

## References

1. Baltagi, Badi H. 1988. An alternative heteroscedastic error component model, problem 88.2.2. Econometric Theory 4: 349–50.
2. Baltagi, Badi H. 2008. Econometric Analysis of Panel Data, 4th ed. Hoboken: John Wiley & Sons.
3. Baltagi, Badi H., Georges Bresson, and Alain Pirotte. 2006. Joint LM test for homoskedasticity in a one-way error component model. Journal of Econometrics 134: 401–17.
4. Bland, James R., and Amanda C. Cook. 2018. Random effects probit and logit: Understanding predictions and marginal effects. Applied Economics Letters 26: 116–23.
5. Davidson, Russell, and James G. MacKinnon. 1984. Convenient specification tests for logit and probit models. Journal of Econometrics 25: 241–62.
6. Gould, William, Jeffrey Pitblado, and William Sribney. 2010. Maximum Likelihood Estimation With Stata, 4th ed. College Station: Stata Press.
7. Greene, William H. 2012. Econometric Analysis, 7th ed. Upper Saddle River: Prentice Hall.
8. Greene, William H. 2018. Econometric Analysis, 8th ed. New York: Pearson.
9. Johnston, John, and John DiNardo. 2001. Econometric Methods, 4th ed. New York: The McGraw-Hill Companies.
10. Lancaster, Tony. 2000. The incidental parameter problem since 1948. Journal of Econometrics 95: 391–413.
11. Lechner, Michael. 1995. Some specification tests for probit models estimated on panel data. Journal of Business and Economic Statistics 13: 475–88.
12. Liu, Qing, and Donald A. Pierce. 1994. A note on Gauss-Hermite quadrature. Biometrika 83: 624–29.
13. Mazodier, Pascal, and Alain Trognon. 1978. Heteroskedasticity and stratification in error components models. Annales de l’INSEE 30: 451–82.
14. Montes-Rojas, Gabriel, and Walter Sosa-Escudero. 2011. Robust tests for heteroskedasticity in the one-way error components model. Journal of Econometrics 160: 300–10.
15. Moussa, Richard, and Eric Delattre. 2018. On the estimation of causality in a bivariate dynamic probit model on panel data with Stata software: A technical review. Theoretical Economics Letters 8: 1257–78.
16. Naylor, Jennifer C., and Adrian F. M. Smith. 1982. Applications of a method for the efficient computation of posterior distributions. Applied Statistics 31: 214–25.
17. Randolph, William C. 1988. A transformation for heteroscedastic error components regression models. Economics Letters 27: 349–54.
18. Verbon, Harrie. 1980. Testing for heteroscedasticity in a model of seemingly unrelated regression equations with variance components. Economics Letters 5: 149–53.
19. Wansbeek, Tom. 1989. An alternative heteroscedastic error components model, solution 88.1.1. Econometric Theory 5: 326.
20. Wooldridge, Jeffrey M. 2001. Econometric Analysis of Cross Section and Panel Data. Cambridge: The MIT Press.
1 A user-written Stata ado file is provided for these purposes. This ado file extends the existing Stata commands hetprobit and xtprobit, re to account for each of the types of heteroskedasticity considered in the literature on one-way error component panel models. Stata code for computing the marginal effects after the proposed estimation procedure is given in Appendix A.
2 The estimation procedure described above has been implemented as a user-written Stata ado file using Stata’s d0 procedure for maximum likelihood estimation (see Gould et al. 2010; Moussa and Delattre 2018).
3 For all other applications presented herein, $Q = 10$ is used as the number of quadrature points.
4 An example of the Stata code for the experiment on the power of the test in the presence of heteroskedasticity due to both $\mu_i$ and $\nu_{it}$ with $N = 100$ and $T = 5$ is provided in Appendix C. Appendix B reports the Stata code used to generate the data.
5 The empirical size estimated on 5000 replications is significantly different from the nominal size of 5% if it does not lie between 4.4% and 5.6%. These thresholds are calculated as $0.05 \pm 1.96 \sqrt{\frac{0.05 \times 0.95}{5000}}$.
Table 1. Power of the likelihood ratio (LR) test for homoskedasticity based on 5000 replications.

| Dimensions | Obs. | $H_0: \theta_\mu = 0 \mid \theta_\nu = 0$ (%) | $H_0: \theta_\nu = 0 \mid \theta_\mu = 0$ (%) | $H_0: \theta_\mu = \theta_\nu = 0$ (%) |
|---|---|---|---|---|
| Low degree of heteroskedasticity: $\sigma_\mu^2 = 0.2$, $\theta_\mu = 0.7$ and $\theta_\nu = 0.6$ | | | | |
| $(N, T) = (50, 5)$ | 250 | 12.04 | 23.86 | 14.62 |
| $(N, T) = (100, 5)$ | 500 | 19.88 | 43.32 | 31.68 |
| $(N, T) = (500, 5)$ | 2500 | 69.54 | 97.22 | 96.74 |
| $(N, T) = (50, 10)$ | 500 | 19.9 | 39.74 | 35.16 |
| $(N, T) = (100, 10)$ | 1000 | 34.04 | 65.62 | 64.72 |
| $(N, T) = (500, 10)$ | 5000 | 94.14 | 99.98 | 100 |
| $(N, T) = (50, 20)$ | 1000 | 27.4 | 67.14 | 63.74 |
| $(N, T) = (100, 20)$ | 2000 | 50.58 | 93.2 | 92.94 |
| $(N, T) = (500, 20)$ | 10,000 | 99.26 | 100 | 100 |
| High degree of heteroskedasticity: $\sigma_\mu^2 = 0.2$, $\theta_\mu = 2.1$ and $\theta_\nu = 1.8$ | | | | |
| $(N, T) = (50, 5)$ | 250 | 81.72 | 47.78 | 65.16 |
| $(N, T) = (100, 5)$ | 500 | 98.44 | 85 | 96.2 |
| $(N, T) = (500, 5)$ | 2500 | 100 | 100 | 100 |
| $(N, T) = (50, 10)$ | 500 | 94.72 | 93.7 | 98.12 |
| $(N, T) = (100, 10)$ | 1000 | 98.88 | 99.9 | 99.96 |
| $(N, T) = (500, 10)$ | 5000 | 100 | 100 | 100 |
| $(N, T) = (50, 20)$ | 1000 | 98.36 | 99.98 | 100 |
| $(N, T) = (100, 20)$ | 2000 | 100 | 100 | 100 |
| $(N, T) = (500, 20)$ | 10,000 | 100 | 100 | 100 |
Table 2. Empirical size of the LR test for homoskedasticity based on 5000 replications.

| Dimensions | Obs. | $H_0: \theta_\mu = 0 \mid \theta_\nu = 0$ (%) | $H_0: \theta_\nu = 0 \mid \theta_\mu = 0$ (%) | $H_0: \theta_\mu = \theta_\nu = 0$ (%) |
|---|---|---|---|---|
| $(N, T) = (50, 5)$ | 250 | 4.82 | 4.98 | 4.54 |
| $(N, T) = (100, 5)$ | 500 | 4.7 | 5.02 | 4.52 |
| $(N, T) = (500, 5)$ | 2500 | 4.64 | 5.34 | 4.68 |
| $(N, T) = (50, 10)$ | 500 | 5.36 | 5.46 | 4.72 |
| $(N, T) = (100, 10)$ | 1000 | 4.54 | 4.74 | 5.14 |
| $(N, T) = (500, 10)$ | 5000 | 4.62 | 4.48 | 4.68 |
| $(N, T) = (50, 20)$ | 1000 | 5.08 | 5.00 | 4.94 |
| $(N, T) = (100, 20)$ | 2000 | 4.92 | 4.48 | 5.04 |
| $(N, T) = (500, 20)$ | 10,000 | 4.58 | 5.54 | 5.04 |
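
The empirical sizes in Table 2 can be compared with the band within which a size estimated on 5000 replications is not significantly different from the nominal 5%, namely 0.05 plus or minus 1.96 standard errors of a binomial proportion. The sketch below recomputes that band and checks the Table 2 entries against it; the function and variable names are illustrative.

```python
import math


def size_band(nominal=0.05, replications=5000, z=1.96):
    """Band around the nominal size implied by a normal approximation to the binomial."""
    half_width = z * math.sqrt(nominal * (1.0 - nominal) / replications)
    return nominal - half_width, nominal + half_width


low, high = size_band()
print(f"band: {100 * low:.1f}% to {100 * high:.1f}%")  # band: 4.4% to 5.6%

# Empirical sizes (in %) reported in Table 2, row by row.
table2 = [4.82, 4.98, 4.54, 4.70, 5.02, 4.52, 4.64, 5.34, 4.68,
          5.36, 5.46, 4.72, 4.54, 4.74, 5.14, 4.62, 4.48, 4.68,
          5.08, 5.00, 4.94, 4.92, 4.48, 5.04, 4.58, 5.54, 5.04]
outside = [v for v in table2 if not (100 * low <= v <= 100 * high)]
print(outside)  # []
```

All 27 entries fall inside the band, so none of the reported sizes differs significantly from 5% by this criterion.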
Table 3. Bias and mean square error (MSE) of the estimates based on 5000 replications.

| Parameter | DGP | $(N,T)=(50,5)$: Bias | $(N,T)=(50,5)$: MSE | $(N,T)=(50,20)$: Bias | $(N,T)=(50,20)$: MSE | $(N,T)=(500,5)$: Bias | $(N,T)=(500,5)$: MSE | $(N,T)=(500,20)$: Bias | $(N,T)=(500,20)$: MSE |
|---|---|---|---|---|---|---|---|---|---|
| Parameters of the index function | | | | | | | | | |
| $\alpha_0$ | 1.5 | 0.0009 | 0.2435 | 0.0814 | 0.0554 | 0.0402 | 0.0225 | 0.0606 | 0.0126 |
| $\alpha_1$ | 0.8 | 0.0040 | 0.2474 | 0.0331 | 0.0474 | 0.0237 | 0.0221 | 0.0400 | 0.0062 |
| $\alpha_2$ | −2 | 0.0072 | 0.4323 | 0.0834 | 0.0814 | 0.0530 | 0.0388 | 0.0909 | 0.0172 |
| Parameters of the variances of $\mu_i$ and $\nu_{it}$ | | | | | | | | | |
| $\lambda_0$ | −0.8 | 0.1683 | 1.5406 | 0.0609 | 0.2058 | 0.0517 | 0.0846 | 0.0407 | 0.0177 |
| $\theta_\mu$ | 0.7 | 0.1660 | 1.1061 | 0.0232 | 0.4044 | 0.0412 | 0.1204 | 0.0353 | 0.0316 |
| $\theta_\nu$ | 0.6 | 0.0721 | 0.2369 | 0.0618 | 0.0447 | 0.0456 | 0.0225 | 0.0301 | 0.0119 |
Table 4. Empirical size of the test based on 5000 replications for $N = 50$ and $T = 5$.

| DGP | $H_0: \theta_\mu = 0 \mid \theta_\nu = 0$ | $H_0: \theta_\nu = 0 \mid \theta_\mu = 0$ | $H_0: \theta_\mu = \theta_\nu = 0$ |
|---|---|---|---|
| Normal | 4.82 | 4.98 | 5.54 |
| Student (3) | 6.38 | 7.86 | 8.32 |
| Exponential | 17.56 | 7.18 | 21.36 |
| Uniform | 5.74 | 7.68 | 8.7 |
| Chi-square | 3.24 | 5.8 | 4.5 |
Table 5. Estimated variance components and LR tests on wrong models.

| Model | $\mu_i$ heteroskedastic: (1) | $\mu_i$ heteroskedastic: (2) | $\nu_{it}$ heteroskedastic: (3) | $\nu_{it}$ heteroskedastic: (4) | $\mu_i$ and $\nu_{it}$ heteroskedastic: (5) | $\mu_i$ and $\nu_{it}$ heteroskedastic: (6) |
|---|---|---|---|---|---|---|
| $\log L$ | −4499.6355 | −4492.5752 | −5231.2132 | −5189.5929 | −5235.0043 | −5210.8403 |
| LR stat | 7.0066 *** | 21.1272 *** | 0.5819 | 83.8224 *** | 23.1831 *** | 71.5111 *** |
| The variance parameters | | | | | | |
| $Z_{\mu i}$ | | 0.6952 [0.3888; 1.0016] *** | 0.1588 [−0.2497; 0.5673] | 0.1392 [−0.2652; 0.5436] | 0.8079 [0.4655; 1.1504] *** | |
| $\lambda_0$ | | −0.8047 [−0.9916; −0.6178] *** | −1.1308 [−1.3659; −0.8957] *** | −0.8886 [−1.1252; −0.6519] *** | −1.1642 [−1.3771; −0.9513] *** | |
| $\sigma_\mu$ | 0.5985 [0.5369; 0.6672] *** | | | | | 0.6012 [0.5419; 0.6669] *** |
| $Z_{\nu it}$ | −0.1974 [−0.3455; −0.0493] *** | −0.0442 [−0.1627; 0.0743] | | 0.6025 [0.4726; 0.7324] *** | | 0.5479 [0.4223; 0.6734] *** |

95% confidence intervals in brackets; ***: Significant at the 1% level. In columns (1) and (2), the dataset is generated with $\mu_i$ heteroskedastic, and the modelling and test approaches for heteroskedasticity due to $\nu_{it}$ (column 1) and to both $\mu_i$ and $\nu_{it}$ (column 2) are applied. In columns (3) and (4), the dataset is generated with $\nu_{it}$ heteroskedastic, and the modelling and test approaches for heteroskedasticity due to $\mu_i$ (column 3) and to both $\mu_i$ and $\nu_{it}$ (column 4) are applied. In columns (5) and (6), the dataset is generated with both $\mu_i$ and $\nu_{it}$ heteroskedastic, and the modelling and test approaches for heteroskedasticity due to $\mu_i$ (column 5) and to $\nu_{it}$ (column 6) are applied.
Table 6. Estimated variance components and LR tests on homoskedastic model.

| Model | (1) | (2) | (3) |
|---|---|---|---|
| $\log L$ | −4536.406 | −4535.2483 | −4535.1225 |
| LR stat | 0.2644 | 2.5797 | 2.8313 |
| The variance parameters | | | |
| $Z_{\mu i}$ | 0.0856 [−0.2406; 0.4118] | | 0.0835 [−0.2428; 0.4098] |
| $\lambda_0$ | −0.7031 [−0.886; −0.5203] *** | | −0.7452 [−0.9357; −0.5548] *** |
| $\sigma_\mu$ | | 0.4938 [0.4421; 0.5514] *** | |
| $Z_{\nu it}$ | | −0.0987 [−0.2197; 0.0222] | −0.0985 [−0.2194; 0.0224] |

95% confidence intervals in brackets; ***: Significant at the 1% level. The data used for the results in this table are generated with no heteroskedasticity. Then, the modelling and test approaches for heteroskedasticity due to $\mu_i$ (column 1), to $\nu_{it}$ (column 2) and to both $\mu_i$ and $\nu_{it}$ (column 3) are applied.
Table 7. Estimated coefficients and marginal effects.

| Variables | Homoskedastic model: Coef. | Homoskedastic model: M.E.+ | Homoskedastic model: M.E.++ | Heteroskedastic model: Coef. | Heteroskedastic model: M.E.+ | Heteroskedastic model: M.E.++ |
|---|---|---|---|---|---|---|
| age | 0.0201 [0.0175; 0.0228] *** | 0.0055 [0.0048; 0.0062] *** | 0.0069 [0.0061; 0.0078] *** | 0.0027 [0.0023; 0.0032] *** | 0.0061 [0.0055; 0.0066] *** | 0.0076 [0.007; 0.0082] *** |
| income | −0.0032 [−0.1341; 0.1278] | −0.0009 [−0.0366; 0.0349] | −0.0011 [−0.0463; 0.0441] | −0.001 [−0.0256; 0.0237] | −0.0212 [−0.0581; 0.0157] | −0.0316 [−0.0747; 0.0115] |
| kids | −0.1538 [−0.2079; −0.0996] *** | −0.0420 [−0.0567; −0.0272] *** | −0.053 [−0.0717; −0.0344] *** | −0.0336 [−0.0433; −0.0238] *** | −0.0497 [−0.064; −0.0353] *** | −0.0549 [−0.0707; −0.039] *** |
| education | −0.0337 [−0.0462; −0.0212] *** | −0.0092 [−0.0126; −0.0058] *** | −0.0116 [−0.0159; −0.0073] *** | −0.0065 [−0.0083; −0.0046] *** | −0.0038 [−0.0066; −0.001] *** | −0.0018 [−0.0051; 0.0014] |
| married | 0.0163 [−0.0477; 0.0803] | 0.0045 [−0.013; 0.0219] | 0.0056 [−0.0164; 0.0277] | 0.0005 [−0.0101; 0.011] | 0.0007 [−0.0149; 0.0163] | 0.0008 [−0.0164; 0.018] |
| intercept | 0.0341 [−0.1591; 0.2273] | | | 0.0558 [0.0251; 0.0864] *** | | |
| The variance parameters: variance of $\mu_i$ | | | | | | |
| female | | | | −0.0766 [−0.1101; −0.0431] *** | | |
| $\lambda_0$ | | | | −2.1074 [−2.1311; −2.0837] *** | | |
| $\sigma_\mu$ | 0.9007 [0.8649; 0.9379] *** | | | | | |
| The variance parameters: variance of $\nu_{it}$ | | | | | | |
| age | | | | −0.0215 [−0.0232; −0.0198] *** | | |
| income | | | | 0.2098 [0.028; 0.3916] ** | | |
| education | | | | −0.061 [−0.0691; −0.0529] *** | | |
| $\log L$ | −16,273.964 | | | −14,019.325 | | |
| LR stat | | | | 4509.45 *** | | |

95% confidence intervals in brackets; ***: Significant at the 1% level; **: Significant at the 5% level; +: marginal effects by integrating with respect to $\mu_i$; ++: marginal effects assuming $\mu_i = 0$. The coefficients of the homoskedastic model are those reported by Greene (2012).