1. Introduction
Identification and estimation of strategic interaction models have recently received a great deal of attention in econometrics, owing to the growing interest in, and application of, stochastic games in various fields, including industrial organization, labor, political and international economics. Most of the existing literature has focused on discrete choice games; see [1,2] for a survey of recent results. In this literature, the observed data are assumed to arise from an equilibrium of a game played by a finite number of players, and therefore to be correlated across players. Typically, the number of players is assumed to be fixed, and the asymptotic inferential theory relies on a large number of independent repetitions of the same game in different markets or in a single market at different points of time. Two notable exceptions are Menzel [3] and Xu [4], who develop the inferential theory of discrete choice games based on a large number of players.
In this paper, we develop a new model of a static game of incomplete information with a large number of players. The model has two key distinguishing features. First, the strategies are subject to threshold effects and can be interpreted as dependent censored random variables, e.g., R&D investment and labor supply. Second, the game is played in a single market and is not repeated over time. To develop the asymptotic theory, we instead assume that the number of players grows unboundedly, and that the players reside on an exogenously given lattice, so that the vector of their choices and characteristics can be viewed as a dependent random field, which can be handled by the limit theorems for near-epoch dependent (NED) random fields established by Jenish and Prucha [5].
We derive this model explicitly in two game-theoretical applications: (i) R&D investment by firms under strategic complementarities; and (ii) labor supply decisions by women under peer effects. The set-up is standard for a static game of incomplete information: each player's payoff function depends on her own choice, the choices of other players, her commonly observed characteristics, and her private characteristic unobserved by other players; players move simultaneously based on their expectations about the choices of other players; and in equilibrium, players have self-consistent expectations, see [6], i.e., their subjective expectations coincide with the expectation based on the equilibrium distribution of strategies conditional on commonly observed variables. We assume that private characteristics are i.i.d. normal across players, and prove existence and uniqueness of the pure strategy equilibrium under some mild conditions. We then show that the censored equilibrium strategies also satisfy the NED property under the same conditions.
Under normality of private shocks, the equilibrium strategies boil down to a Tobit econometric model. However, in contrast to the standard Tobit model, our censored model involves a non-zero threshold parameter that needs to be estimated. Therefore, we use the following two-step semiparametric procedure: we first estimate the threshold by the minimum order statistic of the uncensored subsample, and then estimate the remaining parameters either by the maximum likelihood or the least squares method. Unlike in the standard Tobit model, the maximum likelihood estimator does not strictly dominate the least squares estimator in our model because of the discontinuous dependence of the likelihood on the first-step estimator, which may amplify finite-sample biases stemming from the first-step estimation. This provides a rationale for considering least squares estimation as an alternative to the maximum likelihood procedure. We establish consistency and asymptotic distributions of these estimators. The minimum order statistic is n-consistent and asymptotically exponentially distributed, while the maximum likelihood and least squares estimators are root-n consistent and asymptotically normal. A Monte Carlo study suggests that all these estimators perform well in finite samples.
Finally, we address the computational challenges of our game with a large number of players. The standard estimation of games involves computing the equilibrium for each alternative parameter value and then optimizing the objective function over parameter values, and thus presents a formidable computational burden. To tackle it, we use the constrained optimization algorithm proposed by Su and Judd [7], which treats the equilibrium equations as constraints and optimizes simultaneously over parameters and equilibrium variables, thereby avoiding calculation of the equilibrium at each iteration on the parameter value. Su and Judd [7] show the equivalence of this constrained optimization problem to the original problem. Our simulations confirm the viability and significant computational efficiency of the Su-Judd algorithm in our model.
To our knowledge, the proposed censored model has not been considered in the existing literature. Most of the existing results have dealt with discrete choice games, e.g., [8]. Recently, Xu and Lee [9] analyzed a spatially autoregressive (SAR) Tobit model, which can be viewed as a censored version of the Cliff-Ord-type linear SAR model with a known spatial weight matrix. Xu and Lee [9] establish the NED property as well as consistency and asymptotic normality of the maximum likelihood estimator using the limit theorems of Jenish and Prucha [5]. Though not explicitly demonstrated, this SAR Tobit model can be interpreted as an equilibrium of a static game of complete information, while our model is a game of incomplete information with a different concept of equilibrium and, consequently, qualitatively different implications. Moreover, the presence of latent endogenous variables and a non-zero threshold in our model poses additional statistical and computational difficulties. Thus, the two papers are complementary to each other.
The paper is organized as follows. Section 2 describes and derives the model in two examples. Section 3 establishes existence and uniqueness of the equilibrium, and proves the NED property of the equilibrium strategies. Section 4 discusses identification and estimation of the model. Consistency and asymptotic distributions of the estimators are established in Section 5. Section 6 contains a Monte Carlo study, and Section 7 concludes. All proofs are collected in the appendices.
2. Model
In this paper, we are concerned with estimation of an econometric model, Equation (1), in which the choice of agent i is censored at a common nonstochastic threshold and, when uncensored, depends linearly on (i) agent i's expectations, given its information set, about the choices of its neighbors within the neighborhood of radius r, which contains a fixed number of neighbors k that does not depend on i; (ii) the vector of agent i's observed characteristics; and (iii) agent i's private characteristic, observed only by agent i. The vector of unknown coefficients is common to all agents. The distribution of the private shocks is known to all players; the shocks are assumed to be i.i.d. normal across players and independent of the observed characteristics. The information set of each player consists of the entire state vector of commonly observed characteristics and the player's own private information.
The choice of player i is assumed to be directly affected only by its neighbors in a fixed neighborhood of the known radius r, with respect to some socio-economic metric. However, it will be indirectly affected by all other players. The number of neighbors within the r-neighborhood of each agent is assumed to be fixed and equal to k. To avoid the incidental parameters problem, the k interaction coefficients, measuring the effect of these k neighbors, are assumed to depend only on the relative locations of i and j, but not on i or j themselves. We formally specify the metric and the neighborhood structure in the following section.
The above assumptions seem reasonable in many empirical settings. For example, in its R&D decision, a firm would take into account the R&D of its neighbors within a certain distance in the geographic (or product characteristic) space, rather than that of all firms in the market. This is because technological spillovers, knowledge diffusion and labor mobility, the main determinants of R&D diffusion, are usually confined to a limited geographical or technological area.
Aside from the unobserved heterogeneity captured by the private shocks, we do not allow individual heterogeneity in the parameters. The reason is that the asymptotic theory rests on a single repetition of the game with the number of agents growing to infinity; allowing heterogeneous parameters across individuals would clearly result in inconsistency.
Model (1) is fairly general for applications. It can arise as a system of best response functions of a static game of incomplete information among n players. Below, we derive these equations for two strategic interaction models: (i) R&D investment by firms; and (ii) labor supply by women. In these models, the decisions of players exhibit strategic complementarities and are subject to threshold effects.
2.1. Spillovers in R&D Investment
A large body of empirical evidence suggests the presence of technology and R&D spillovers among firms, e.g., [10]. Audretsch and Feldman [10] find that knowledge spillovers are more prevalent in industries that exhibit spatial clustering. Positive R&D spillovers may occur through several channels, including knowledge transfers, labor mobility and imitation. Therefore, it is reasonable to expect the magnitude of such a spillover effect to depend on the geographical and technological distances between firms. As a result, firms' R&D expenditures may be spatially correlated, and the magnitude of this correlation often decays with the distance between firms.
The literature distinguishes two major channels through which R&D can raise firms' profits: cost-reducing and demand-creating effects. The former allows firms to carry out process improvements leading to efficiency gains and cost reduction, while the latter enables firms to improve the quality of their product and thereby boost demand. Levin and Reiss [11] analyze a model of monopolistic competition with both demand-creating and cost-reducing R&D spillovers across n firms. Based on a sample of US manufacturing firms, the authors find statistically significant, sizeable spillovers in cost-reducing R&D and insignificant, small spillovers in demand-creating R&D in most industries. Levin and Reiss [11] also find the elasticity of product quality with respect to the firm's own R&D to be much higher than that of cost with respect to the firm's own R&D. Other theoretical models of R&D spillovers include d'Aspremont and Jacquemin [12] and Motta [13], among others.
Yet all these papers model R&D investment as a continuous variable, thereby neglecting the strong empirical evidence that a sizeable proportion of firms do not undertake R&D activities; see, e.g., [14]. One plausible explanation is that the demand-creating effect of R&D is subject to threshold effects: quality can be raised only after a certain minimum level of R&D investment is attained, and R&D has no effect on quality below this level. Thus, R&D expenditure can be viewed as a censored decision variable whose optimal values below a certain threshold are unobserved. This type of model in the single-firm setup is analyzed by Gonzalez and Jaumandreu [15].
To study spatial spillovers in R&D investment, we develop a simple model of strategic interaction with a censored decision variable that incorporates the empirical findings discussed above. We consider a single, monopolistically competitive industry composed of a large number, n, of firms, each producing a brand of the same product differentiated by quality. Each firm i sets a price and chooses an R&D expenditure, which determines its product quality and hence its demand. To derive the demand, we employ a variant of the Dixit-Stiglitz [16] model of monopolistic competition in which the CES utility of a representative consumer is augmented with a preference for quality, governed by a quality sensitivity parameter. Utility maximization yields the demand for firm i as a function of its quality-adjusted price, the elasticity of substitution between the quality-adjusted goods, consumer's income I, and a quality-adjusted price index. To obtain non-increasing marginal demand for quality, we restrict the quality sensitivity parameter. If the number of firms is large, it is reasonable to assume that the effect of a single firm's decision on the industry price index is negligible, i.e., the index is constant, and to normalize it.
Following Gonzalez and Jaumandreu [15], we assume that the firm's own R&D expenditure affects only its product quality, subject to a technological constraint, where the threshold is the minimum investment required for quality improvements and δ is the R&D sensitivity parameter. Throughout, we measure investment as one plus the R&D expenditure, so that the logarithm of the censored investment is defined for zero values. This is a convenient normalization, which does not affect the results.
Furthermore, in light of the above empirical findings, we assume that other firms' R&D has only a cost-reducing effect on firm i, and that this effect is limited to the fixed r-neighborhood of firm i. Specifically, firm i's marginal cost depends on a vector of observed cost determinants of firm i, on the log R&D choices of the firms in its r-neighborhood, and on firm i's idiosyncratic cost component; the associated coefficient measures the strength of this spillover effect. Throughout, firm i's own choice variable is the log of its investment.
Suppose that all firms observe the commonly observed characteristics, but each firm's idiosyncratic cost component is observed only by that firm. Given this uncertainty about the choices of other firms, following Durlauf [17], see also [6], we assume that each firm i decides on its R&D investment based on its beliefs about the choices of the other firms, which are formed as the conditional expectation given all the information available to firm i. Based on these beliefs, firms simultaneously choose their price and R&D investment to maximize profits subject to the technological constraint, i.e., they solve problem (2) and (3), which also accounts for the cost of investment. The nonstochastic threshold is assumed to be observed and constant across all firms.
Lemma 1. The solution to optimization problem (2) and (3) is given by (1), with the reduced-form coefficients determined by the structural parameters of the model.

If the spillover coefficient is positive, i.e., R&D of the neighbors has a cost-reducing effect on firm i, then both the probability and the intensity of firm i's R&D increase with the expected R&D of its neighbors. In other words, there are strategic complementarities, or positive externalities, in the R&D decisions of firms. Furthermore, the probability of R&D is also increasing in (i) the elasticity of demand with respect to quality, higher ϵ; (ii) the elasticity of quality with respect to R&D, higher δ; and (iii) the market power, lower ν. The latter is consistent with the Schumpeterian argument that economies of scale make R&D more attractive to large firms than to small firms.
2.2. Peer Effects in Female Labor Supply
Our next example involves social interactions in female labor supply. Suppose the utility of female i is defined over her consumption and leisure, with a parameter characterizing her relative preference for consumption over leisure. The weight on leisure captures peer effects that depend on the labor supply decisions of female i's peers in her social neighborhood, referred to as friends: it depends on the vector of observed characteristics of woman i, on the log labor supply of woman i's friends, and on her private characteristic, unobserved by other women. As in the previous example, we add one to the labor supply before taking logs, so that the log of the censored labor supply is defined for zero values. In the presence of positive peer effects, the peer-effect coefficient is positive, which implies mutual reinforcement of the choices within the social group.
As before, all women observe the commonly observed characteristics, but each woman's private characteristic is observed only by herself. Woman i makes her decision based on her beliefs about the choices of her peer group, which are formed as the conditional expectation given all the information available to woman i. Based on these beliefs, women simultaneously maximize their utility subject to threshold effects, i.e., they solve problem (4) and (5), where w is the wage, T is the time endowment, and the threshold is the reservation labor income, which can be interpreted as welfare or other government transfers. The nonstochastic threshold is assumed to be observed and constant across women.
Lemma 2. The solution to optimization problem (4) and (5) is given by (1), with the reduced-form coefficients determined by the structural parameters of the model.

If the peer-effect coefficient is positive, i.e., there are positive peer effects, then both the probability and the magnitude of woman i's labor supply increase with the expected labor supply of her peers.
3. Equilibrium: Characterization and Weak Dependence
We assume that in equilibrium, players have self-consistent expectations, i.e., their subjective expectations or beliefs coincide with the expectation based on the equilibrium distribution of strategies conditional on the commonly observed characteristics. That is, each player's belief about a neighbor's choice equals the conditional expectation of that choice taken with respect to the equilibrium conditional distribution of strategies. The last equality follows from the independence of the private shocks across players and their independence of the commonly observed characteristics.
Suppose that the private shocks are i.i.d. normal with mean zero and variance σ². Taking the conditional expectation of Equation (1) with respect to the equilibrium distribution of strategies, conditional on the commonly observed characteristics, yields the system of equilibrium equations (7), where Φ and ϕ are, respectively, the c.d.f. and p.d.f. of the standard normal distribution.
Provided that they are well-defined, the strategies are independent across i conditional on the commonly observed characteristics, and have censored normal distributions with means given by the equilibrium conditional expectations, a common variance and a common nonstochastic threshold. In equilibrium, the conditional means satisfy system (7). If this system has a unique solution, the corresponding equilibrium strategies will also be unique with probability 1, since a censored normal variable is uniquely characterized by its mean, variance and threshold. This leads to the following characterization of equilibrium.
Definition 1. An equilibrium is a set of policy functions whose conditional mean functions satisfy system (7).

A similar characterization of equilibrium in discrete games of incomplete information is used in [8].
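To make the characterization concrete, the following sketch (in Python) computes an equilibrium by iterating a censored-normal best-response map until a fixed point is reached. The notation is ours, not the paper's: xb collects the exogenous part of the latent index, W is the 0/1 neighbor matrix of the r-neighborhoods, and the censored mean assumes that censored outcomes are recorded as zero, as in the R&D example; the exact form of system (7) is given in the paper.

```python
import numpy as np
from scipy.stats import norm

def censored_mean(mu, sigma, gamma):
    """E[y] for y = y* if y* >= gamma and y = 0 otherwise, with y* ~ N(mu, sigma^2);
    a stylized stand-in for the conditional mean functions entering system (7)."""
    c = (gamma - mu) / sigma
    return mu * (1.0 - norm.cdf(c)) + sigma * norm.pdf(c)

def solve_equilibrium(xb, W, alpha, sigma, gamma, tol=1e-10, max_iter=1000):
    """Solve the equilibrium system m = E[y | x] by fixed-point iteration.
    Under a contraction condition such as Assumption 1, the map is a contraction
    and the iteration converges to the unique equilibrium."""
    m = xb.copy()                      # initial guess for the conditional means
    for _ in range(max_iter):
        mu = xb + alpha * (W @ m)      # latent index given current beliefs
        m_new = censored_mean(mu, sigma, gamma)
        if np.max(np.abs(m_new - m)) < tol:
            return m_new
        m = m_new
    raise RuntimeError("fixed-point iteration did not converge")
```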
An appealing feature of Equation (1) is that it reduces to the popular Tobit model, which is part of any regression package. However, the difficulty is that the latent index depends on latent regressors, namely the equilibrium expectations of the neighbors' choices. Thus, one would first need to obtain consistent estimates of the latent regressors, and then use any consistent estimation procedure for the Tobit model.
Since consistency of any estimation method hinges upon uniqueness of equilibrium, we first prove existence and uniqueness of the pure strategy equilibrium. To this end, we maintain the following assumption.
Assumption 1. The shocks are i.i.d. normal with mean zero and variance σ², and the interaction coefficients satisfy a contraction condition, stated in terms of the p.d.f. ϕ of the standard normal distribution, that bounds the total strength of interactions within the k-neighbor neighborhoods.
This assumption restricts the strength of interactions, captured by the coefficients α: interactions must not be too strong for a stable equilibrium to exist. Intuitively, if the interactions are long-ranged and too strong, then the effect of remote neighbors is substantial and may lead to instability and multiple equilibria. Since the condition involves only the estimated coefficients, and the number of neighbors, k, is typically known, the assumption is testable.
Assumption 1 is similar to Assumptions B and C in [4], which restrict the strength of interactions to obtain a unique equilibrium in a discrete choice game of social interactions.
Based on this assumption, we can now show existence and uniqueness of equilibrium.
Theorem 1. Under Assumption 1, there exists a unique equilibrium of model (1).

In general, without restrictions on the parameters, multiple equilibria could occur. If one does not want to impose restrictions directly, one can use the Mathematical Program with Equilibrium Constraints (MPEC) routine to deal with multiple equilibria implicitly, by choosing the equilibrium that maximizes the empirical likelihood.
In equilibrium, the policy variables will be correlated across players. To characterize their dependence, we assume that the process is indexed by a vector of locations on the integer lattice, and hence can be viewed as a random field. In other words, the data-generating process is a triangular array of vector-valued random fields defined on a probability space and observed on a sequence of sample regions. In the following, to simplify notation, we suppress the index denoting the sample region. Furthermore, we denote by ‖·‖ the Euclidean norm and by ‖·‖_p the L_p-norm.
Assumption 2. The data-generating process is a triangular array of random fields indexed by locations in a sequence of sample regions whose cardinality tends to infinity as n → ∞. The distance between players i and j is measured by the Euclidean metric.
This assumption implies that the players’ locations are exogenous, i.e., they are known and determined outside the model. Extensions to endogenous locations would require explicit modeling of the location choice, and would therefore considerably complicate the model. This extension is an interesting direction for future research.
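As an illustration of the neighborhood structure implied by Assumption 2, the following minimal Python sketch builds the r-neighborhoods for players located on a two-dimensional integer lattice; the function name and the 0/1 neighbor-matrix representation are our own conventions, not the paper's.

```python
import numpy as np

def build_neighbors(coords, r):
    """Return the 0/1 neighbor matrix W implied by the Euclidean metric:
    W[i, j] = 1 iff 0 < ||coords[i] - coords[j]|| <= r.
    coords: (n, d) array of lattice locations; r: neighborhood radius."""
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    return ((dist > 0) & (dist <= r)).astype(float)

# Example: on a 5 x 5 grid with radius r = 1, each interior player has k = 4 neighbors.
coords = np.array([(i, j) for i in range(5) for j in range(5)])
W = build_neighbors(coords, 1.0)
```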
Given Assumption 2, it turns out that the equilibrium policy variables satisfy a weak dependence condition known as near-epoch dependence (NED), see [5], under the same condition that ensures uniqueness of equilibrium. For ease of reference, we state the definition of NED random fields.
Definition 2. The triangular array of random fields is L_p-NED on the input field iff the L_p-distance between each variable and its conditional expectation, given the input variables located within distance m of its location, is bounded by a scaling constant times ψ(m), for some sequence ψ(m) → 0 as m → ∞.
Theorem 2. Suppose Assumptions 1 and 2 hold; then (i) the equilibrium conditional means are NED on the input field, with NED numbers whose expression involves the integer part of the distance m; (ii) the equilibrium strategies are NED on the input field, with NED numbers involving a constant c that does not depend on m; and (iii) the censored equilibrium strategies are NED on the input field as well.
The value of the constant c is given in the proof, but it is not important for what follows.
4. Identification and Estimation
We now discuss identification and estimation of our model. Let θ denote the vector of the remaining coefficients of model (1), i.e., all unknown parameters other than the threshold γ. Given the specification, it is natural to identify and estimate all unknown parameters based on the likelihood function; the log likelihood function of the model is given in (8). Likelihood function (8) involves an unknown threshold parameter, which is in contrast to the standard Tobit model, where the threshold is assumed to be known and equal to zero. The maximum likelihood (ML) estimator of γ is the minimum order statistic of the uncensored subsample. More specifically, partition the dependent variable and the regressor matrix into a censored and an uncensored part, with the subscripts indicating whether observations come from the censored or the uncensored subsample, and take the minimum of the uncensored observations. As shown in Proposition 1 below, this minimum order statistic is a consistent estimator of γ. The ML estimators of the other parameters θ can then be obtained by standard differentiation techniques.
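A minimal sketch of this first step, assuming (as in the R&D example) that censored outcomes are recorded as zero, so that the uncensored subsample consists of the strictly positive observations:

```python
import numpy as np

def estimate_threshold(y):
    """First-step estimator of gamma: the minimum order statistic of the
    uncensored subsample. Censored outcomes are assumed to be recorded as zero."""
    y = np.asarray(y)
    uncensored = y[y > 0]
    if uncensored.size == 0:
        raise ValueError("no uncensored observations")
    return uncensored.min()
```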
Assumption 3. Suppose (i) a moment condition holds for the uncensored observations; and (ii) the regressor process is a stationary α-mixing process whose mixing coefficients satisfy a standard summability condition.

Proposition 1. Under Assumptions 1–3, n times the estimation error of the minimum order statistic converges in distribution to an exponential random variable, with a rate parameter that depends on the distribution of the regressor vector from the uncensored subsample.
Thus, the minimum order statistic estimator of γ is n-consistent and asymptotically exponentially distributed. For an i.i.d. sample, this result has been established by Carson and Sun [19], so Proposition 1 extends [19] to the spatially dependent case. The superconsistency of the estimator is a well-known consequence of the dependence of the support of the outcome on γ.
Proposition 1 implies that γ is identified. The remaining parameters, θ, can now be identified from the likelihood function. Alternatively, one can identify θ from the conditional mean function of the censored outcome and estimate it by the least squares procedure in (10).
In contrast to the standard Tobit model with zero threshold, the ML estimator does not strictly dominate the least squares (LS) estimator in our model, due to the presence of the first-step estimator. The reason is that the LS objective function is continuous in γ, while the likelihood function is not. The latter implies that small finite-sample biases in the first-step estimator may cause sizeable finite-sample biases in the ML estimates of θ. This prediction is confirmed by the simulation results of Section 6, which suggest larger finite-sample biases in the ML than in the LS estimator. This is the main rationale for considering the LS procedure as an alternative to the ML procedure in our model.
Thus, estimation of model (1) can be carried out in two steps. First, estimate the threshold parameter γ by the minimum order statistic of the uncensored subsample, and substitute it for the true threshold in (8) and (10). Then, estimate the remaining parameters θ in (8) and (10) by the ML or LS procedures, respectively. Note that the least squares estimator of γ in (1) would be imprecise due to near multicollinearity of the intercept and the threshold. Therefore, we use the first-step estimator of γ in both procedures.
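For concreteness, a sketch of the second-step ML objective, written under our notational conventions for the case where censored outcomes are recorded as zero, with the threshold replaced by its first-step estimate and the latent index computed from the equilibrium expectations of the neighbors' choices:

```python
import numpy as np
from scipy.stats import norm

def tobit_loglik(y, mu, sigma, gamma_hat):
    """Stylized log likelihood in the spirit of (8): censored observations (y == 0)
    contribute the probability that the latent outcome falls below the threshold;
    uncensored observations contribute the normal density of the observed outcome.
    mu is the latent conditional mean, e.g. b + alpha * (W @ m) + X @ beta."""
    y, mu = np.asarray(y), np.asarray(mu)
    censored = (y <= 0)
    c = (gamma_hat - mu[censored]) / sigma
    ll_censored = norm.logcdf(c).sum()
    resid = (y[~censored] - mu[~censored]) / sigma
    ll_uncensored = (norm.logpdf(resid) - np.log(sigma)).sum()
    return ll_censored + ll_uncensored
```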
We now present sufficient conditions for identification of θ.
Assumption 4. Suppose (i) at least one of the components of the regressor vector has full support; and (ii) the relevant second-moment matrix of the regressors is positive definite.
Theorem 3. Under Assumptions 1–4, the population objective function is uniquely maximized at the true parameter value, both for (A) the log likelihood function defined in (8), and for (B) the least squares objective function defined in (10).
Practically, the second-step estimation of θ could be implemented through the following nested fixed-point (NFXP) algorithm: (i) in an inner loop, for a given θ, find the unique solution of the equilibrium equations (7) by the fixed-point algorithm; and (ii) in an outer loop, search over θ to maximize the objective function. Writing the solution of the equilibrium equations (7) as a function of θ, the resulting estimator can be represented as in (11), where the objective is either the log likelihood function defined in (8) or minus the squared deviation of the outcome from the conditional mean defined in (10). This formulation makes explicit the dependence of the equilibrium variables on the estimated parameters. Given the superconsistency of the first-step estimator, the resulting second-step maximum likelihood or least squares estimators of θ will be root-n consistent, asymptotically normal and independent of the first-step estimator, as shown in Theorem 4 below. However, the NFXP algorithm will be computationally costly for large cross-sectional datasets.
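A compact sketch of this NFXP loop, reusing the solve_equilibrium and tobit_loglik sketches above; the parameter packing and the derivative-free optimizer are our illustrative choices, not the paper's implementation.

```python
import numpy as np
from scipy.optimize import minimize

def nfxp_estimate(y, X, W, gamma_hat, theta0):
    """Nested fixed-point estimation: the inner loop solves the equilibrium
    system for the current parameter value; the outer loop maximizes the
    log likelihood over theta = (b, alpha, beta_1, ..., beta_K, sigma)."""
    def neg_loglik(theta):
        b, alpha, sigma = theta[0], theta[1], abs(theta[-1])
        beta = theta[2:-1]
        xb = b + X @ beta
        m = solve_equilibrium(xb, W, alpha, sigma, gamma_hat)   # inner fixed point
        mu = xb + alpha * (W @ m)
        return -tobit_loglik(y, mu, sigma, gamma_hat)

    result = minimize(neg_loglik, theta0, method="Nelder-Mead")
    return result.x
```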
To overcome this problem, we instead use the constrained optimization algorithm proposed by Su and Judd [7]. The idea is to solve the constrained optimization problem (12), in which the equilibrium system (7), written in vector form, enters as a set of constraints. Note that the vector of equilibrium expectations in this formulation does not depend on θ; it is chosen simultaneously with θ to maximize the objective function subject to the equilibrium constraints. This obviates the need to solve the multi-dimensional fixed point problem for the equilibrium expectations at each iteration on θ.
Su and Judd [7] prove the equivalence of problems (11) and (12) provided that the model is identified. They also demonstrate the computational advantage of this constrained optimization algorithm over the NFXP algorithm in the context of a single-agent dynamic discrete choice model. In particular, they show that the proposed algorithm leads, on average, to a ten-fold reduction of the computational time relative to the NFXP algorithm.
Since our model is identified by Theorem 3, the maximizer of problem (11) equals the maximizer of problem (12) by Proposition 1 of Su and Judd [7], and one can thus replace the computationally intensive problem (11) by the simpler problem (12). We investigate the performance of the constrained optimization algorithm for our model in the Monte Carlo study of Section 6.
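The following sketch illustrates the constrained formulation: the parameters and the n-vector of equilibrium expectations are stacked into one decision vector, the equilibrium equations enter as equality constraints, and a general-purpose NLP solver handles both. It reuses the censored_mean and tobit_loglik sketches above; in practice, a dedicated large-scale solver would be used rather than SLSQP.

```python
import numpy as np
from scipy.optimize import minimize

def mpec_estimate(y, X, W, gamma_hat, theta0):
    """Su-Judd style estimation: maximize the likelihood jointly over theta and the
    equilibrium expectations m, imposing m = E[y | x] as equality constraints."""
    n_theta = len(theta0)
    z0 = np.concatenate([theta0, np.maximum(np.asarray(y, float), 0.0)])

    def unpack(z):
        theta, m = z[:n_theta], z[n_theta:]
        b, alpha, sigma = theta[0], theta[1], abs(theta[-1])
        beta = theta[2:-1]
        mu = b + X @ beta + alpha * (W @ m)
        return mu, sigma, m

    def neg_obj(z):
        mu, sigma, _ = unpack(z)
        return -tobit_loglik(y, mu, sigma, gamma_hat)

    def equilibrium_gap(z):                 # equals zero exactly at an equilibrium
        mu, sigma, m = unpack(z)
        return m - censored_mean(mu, sigma, gamma_hat)

    result = minimize(neg_obj, z0, method="SLSQP",
                      constraints=[{"type": "eq", "fun": equilibrium_gap}])
    return result.x[:n_theta]
```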
5. Consistency and Asymptotic Normality
We next show consistency and asymptotic normality of the maximum likelihood and least squares estimators. To this end, we need the following assumption.
Assumption 5. Suppose (i) the parameter spaces Θ and Γ are compact and contain the true parameter values in their interior; and (ii) the regressors and equilibrium variables satisfy suitable moment conditions.
Part (i) of Assumption 5 is the standard condition on the parameter space and the true parameter value; Part (ii) is used to verify uniform convergence of various sample functions. Generally, the above assumptions are slightly stronger than those in the fully parametric Tobit model with zero threshold, since our Tobit estimator of θ relies on a nonparametric first-step estimator of γ.
Theorem 4. Under Assumptions 1–5, the maximum likelihood and least squares estimators are both consistent and asymptotically normal, i.e., after centering at the true value and scaling by the square root of the sample size, they converge in distribution to a normal random vector with mean zero and a finite asymptotic covariance matrix.
Thus, both the maximum likelihood and least squares estimators of θ are root-n consistent and asymptotically normal. To conduct inference, it remains to obtain a consistent estimate of the asymptotic covariance matrix. For this purpose, one can employ a spatial HAC estimator in which pairs of observations are weighted by a d-dimensional symmetric kernel evaluated at their distance divided by a bandwidth parameter. Jenish [20] proves consistency of this estimator for more general nonparametric estimators of γ. In our model, consistency is achieved by bandwidth parameters satisfying a suitable rate condition.
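A sketch of such a kernel-weighted covariance estimator for the score contributions; the Bartlett kernel of the scalar distance and the helper name are illustrative choices, not necessarily the kernel used in the paper.

```python
import numpy as np

def spatial_hac(scores, coords, bandwidth):
    """Kernel-weighted covariance of score/moment contributions:
    V = (1/n) * sum_i sum_j K(d_ij / h) * s_i s_j', with K a Bartlett kernel
    that downweights distant pairs and vanishes beyond the bandwidth h.
    scores: (n, p) contributions evaluated at the estimates; coords: (n, d) locations."""
    scores = np.asarray(scores, float)
    coords = np.asarray(coords, float)
    n, p = scores.shape
    V = np.zeros((p, p))
    for i in range(n):
        dist = np.linalg.norm(coords - coords[i], axis=1)
        w = np.maximum(1.0 - dist / bandwidth, 0.0)       # Bartlett weights
        V += np.outer(scores[i], (w[:, None] * scores).sum(axis=0))
    return V / n
```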
6. Numerical Results
In this section, we examine the finite sample properties of the maximum likelihood (ML) and least squares (LS) estimators of our censored model, as well as the performance of the Su-Judd [7] algorithm.
Throughout, the data reside on the two-dimensional integer lattice, with each observation indexed by its vector of coordinates. The data are simulated on a rectangular grid of locations. To control for boundary effects, we discard the 300 outer boundary points along each of the axes and use the remaining sample for estimation.
Our experiment consists of two stages: (i) simulation; and (ii) estimation. In the first stage, we first simulate two i.i.d. processes, the observed characteristics and the private shocks, which are independent of each other. Next, using the fixed-point algorithm, we generate the equilibrium conditional means according to the equilibrium system (13), and then the latent outcomes according to the model, with the parameters set at their true values. Last, we form the observed censored process by applying the threshold.
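A compact sketch of this first stage under illustrative parameter values (not those of the paper), reusing the build_neighbors and solve_equilibrium sketches from above and omitting the boundary-trimming step:

```python
import numpy as np

rng = np.random.default_rng(0)
side = 40
coords = np.array([(i, j) for i in range(side) for j in range(side)])
n = side * side

x = rng.standard_normal(n)                    # i.i.d. observed characteristic
eps = rng.standard_normal(n)                  # i.i.d. private shock
b, alpha, beta, sigma, gamma = 1.0, 0.2, 1.0, 1.0, 0.5   # illustrative values only

W = build_neighbors(coords, 1.0)              # r-neighborhoods on the lattice
xb = b + beta * x
m = solve_equilibrium(xb, W, alpha, sigma, gamma)    # equilibrium expectations
y_star = xb + alpha * (W @ m) + sigma * eps          # latent outcomes
y = np.where(y_star >= gamma, y_star, 0.0)           # censored observed outcomes
```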
In the second stage, we first construct the minimum order statistic estimator of γ, and then use the Su-Judd [7] constrained optimization algorithm to estimate the remaining parameters θ. As discussed in Section 4, we estimate the n-dimensional vector of endogenous variables jointly with θ, instead of computing it at each iteration on θ, i.e., we solve the constrained optimization problem in which the equilibrium system (13), written in vector form, enters as a set of constraints.
Table 1. Estimation of the auto-regressive parameter α.

Maximum Likelihood
| Sample Size | Mean | Bias (%) | SD | RMSE | 25th Pct | 50th Pct | 75th Pct |

Least Squares
| Sample Size | Mean | Bias (%) | SD | RMSE | 25th Pct | 50th Pct | 75th Pct |
Both the ML and LS estimators of α behave well for all sample sizes. The finite-sample bias declines rapidly from about 1.1% to 0.3% in the case of the ML estimator, and from 0.55% to 0.08% in the case of the LS estimator. These results suggest that a five-fold increase in the sample size leads to a more than three-fold reduction in the ML bias, and a more than six-fold decrease in the LS bias, which is consistent with our asymptotic theory. The standard errors also fall off rapidly with the sample size. A similar pattern is observed for the estimates of the slope β, shown in Table 2.

Table 3 contains the minimum order statistic estimates of γ. The finite-sample bias diminishes from 4.5% to 0.8%, which means that a five-fold increase in the sample size is associated with a more than five-fold reduction in the bias. This is in line with the theoretical prediction of n-consistency of the minimum order statistic.
Table 2. Estimation of the slope β.

Maximum Likelihood
| Sample Size | Mean | Bias (%) | SD | RMSE | 25th Pct | 50th Pct | 75th Pct |

Least Squares
| Sample Size | Mean | Bias (%) | SD | RMSE | 25th Pct | 50th Pct | 75th Pct |
Table 3. Estimation of the threshold γ.

Min. Order Statistics
| Sample Size | Mean | Bias (%) | SD | RMSE | 25th Pct | 50th Pct | 75th Pct |
Next, Table 4 and Table 5 present the estimates of the intercept b and the standard deviation σ, respectively. The maximum likelihood estimates of b and σ exhibit larger biases than those of α and β. However, the biases decrease as the sample size increases: from 5.5% to 1.4% in the case of b, and from 15.9% to 6.9% in the case of σ. Thus, the biases still halve when the sample size increases four-fold, consistent with the asymptotic theory. The larger small-sample biases could be explained by weak identification, or near multicollinearity, introduced by the inverse Mills ratio, which is approximately linear over a wide range of its argument.
Table 4. Estimation of the intercept b.

Maximum Likelihood
| Sample Size | Mean | Bias (%) | SD | RMSE | 25th Pct | 50th Pct | 75th Pct |

Least Squares
| Sample Size | Mean | Bias (%) | SD | RMSE | 25th Pct | 50th Pct | 75th Pct |
Table 5. Estimation of the standard deviation σ.

Maximum Likelihood
| Sample Size | Mean | Bias (%) | SD | RMSE | 25th Pct | 50th Pct | 75th Pct |

Least Squares
| Sample Size | Mean | Bias (%) | SD | RMSE | 25th Pct | 50th Pct | 75th Pct |
Interestingly, the LS estimates of all parameters, including b and σ, have smaller finite-sample biases than the respective ML estimates. The reason is that the LS objective function is continuous in the first-step nonparametric estimator of γ, while the likelihood function is not, and small first-step biases in γ may get disproportionately amplified and translate into sizeable second-step biases in θ. Consequently, the biases in the ML estimates of b and σ due to weak identification are further exacerbated by discontinuity of the likelihood function. Nevertheless, as expected, the LS estimator has larger standard errors than the ML estimator across all parameters. Thus, in contrast to the standard Tobit model with zero-threshold, the ML estimator does not strictly dominate the LS estimator in our model.
Finally, Table 6 reports the computational time and the number of converged iterations for the Su-Judd [7] algorithm. The algorithm performs well for all sample sizes: it converges in almost 99% of the Monte Carlo iterations, and the time costs are less than two hours even for the largest sample sizes. For comparison, the NFXP algorithm would take about 130–150 hours to estimate the model for the same sample sizes. Thus, the Su-Judd [7] algorithm offers considerable time savings over standard nested fixed-point algorithms.
Table 6. Algorithm Performance.

Maximum Likelihood
| Sample Size | | | | | |
|---|---|---|---|---|---|
| Number of converged iterations | 1000 | 999 | 991 | 992 | 961 |
| Run Time | 2 min | 6 min | 27 min | 61 min | 119 min |

Least Squares
| Sample Size | | | | | |
|---|---|---|---|---|---|
| Number of converged iterations | 942 | 996 | 997 | 999 | 998 |
| Run Time | 2 min | 5 min | 24 min | 57 min | 112 min |
Overall, the simulation results are consistent with our asymptotic theory: the finite-sample biases and standard errors of the ML and LS estimators decay rapidly with the sample size. Moreover, the Su-Judd [7] constrained optimization algorithm appears to be a viable and effective numerical procedure for estimating games with a large number of players, including our model.
7. Conclusions
In this paper, we study identification and estimation of a static game of incomplete information with censored strategies. Specifically, we show existence and uniqueness of an equilibrium as well as its weak dependence property under a condition that restricts the strength of interactions among the players. We then show identification of the parameters and estimate them by maximum likelihood and least squares procedures. The resulting estimators are shown to be consistent and asymptotically normal. We also demonstrate application of our results to modeling spillovers in firms’ R&D investment and peer effects in female labor supply.
One direction for future research is to relax the normality assumption on the errors and obtain identification under more general error distributions whose conditional mean functions satisfy contraction mapping conditions, similar to the one used in the paper. Another extension could be to allow for random threshold effects in the outcome variable, using some parametric family of distributions. One can also allow for truncated strategies by slight modifications in the likelihood function and Assumption 1. Finally, instead of the regular lattice, one can consider players located at the nodes of some graph, which describes the network structure, as in the social interactions literature.