Article

Comparing Groups of Decision-Making Units in Efficiency Based on Semiparametric Regression

1
Department of Statistics, Sookmyung Women’s University, Seoul 04310, Korea
2
Department of Statistics (Institute of Applied Statistics), Jeonbuk National University, Jeollabuk-do 54896, Korea
*
Author to whom correspondence should be addressed.
Mathematics 2020, 8(2), 233; https://doi.org/10.3390/math8020233
Submission received: 26 December 2019 / Revised: 22 January 2020 / Accepted: 7 February 2020 / Published: 11 February 2020
(This article belongs to the Special Issue Uncertainty Quantification Techniques in Statistics)

Abstract

We consider a stochastic frontier model in which a deviation of output from the production frontier consists of two components, a one-sided technical inefficiency and a two-sided random noise. In such a situation, we develop a semiparametric regression-based test and compare the technical efficiencies of the different decision-making unit groups, assuming that the production frontier function is the same for all the groups. Our test performs better than the previously proposed ones for the same purpose in numerical studies, and also has the theoretical advantage of working under more general assumptions. To illustrate our method, we apply the proposed test to Program for International Student Assessment (PISA) 2015 data and investigate whether an efficiency difference exists between male and female student groups at a specific age in terms of learning time and achievement in mathematics.

1. Introduction

Efficiency comparison between groups is currently used in various fields such as banking, insurance, sports, and R&D investment evaluation. Numerous empirical studies frequently analyze group efficiency using so-called Data Envelopment Analysis (DEA). DEA is a body of techniques for measuring relative efficiency by comparing it with the possible frontiers of decision-making units (DMUs) with multiple inputs and outputs. Here, the term DMU is used to collectively refer to all the units in which the production activity takes place. In the DEA framework, the DMU efficiency scores of each group can be obtained after specifying some assumptions appropriate to the situation, and then the comparison of the efficiency distributions of the groups is made on the basis of their obtained scores. For example, Golany and Storberg [1] and Lee et al. [2] applied non-parametric tests, such as the Mann–Whitney (MW) and Kruskal–Wallis tests, to the efficiency scores. Cummins et al. [3] introduced a dummy variable to indicate the groups, and then regressed the efficiency scores on the dummy variable. Simar and Zelenyuk [4] adapted the test developed in Li [5] to the DEA context and applied it to the obtained scores, to test the equality of efficiency distributions. O’Donnell et al. [6] used the concept of a meta-frontier to compare the technical efficiencies of firms that may be classified into different groups.
However, this stream of research under the DEA framework has a limitation in that it does not consider the noise factor in the production process. DEA typically assumes that the inefficiency of the DMU is the only reason its production falls short of the maximum output, but many uncontrolled factors can obviously contribute as well. Recognizing this, Aigner and Chu [7] and Meeusen and van den Broeck [8] first proposed the stochastic frontier model (SFM), which allows for two sources of unobserved variation in output: the technical inefficiency of the production unit and the noise that represents the effect of innumerable uncontrollable factors. For an illustrative comparison between the DEA and SFM frameworks, see Figure 1.
Nowadays, the stochastic frontier model is used in a large body of literature on production. Hence, we see the need for a method to compare the efficiency of groups under the SFM framework. One pioneering work in this direction is Banker et al. [9], who developed five DEA-based hypothesis tests to compare the efficiency of groups under SFM. Although that paper is an important development toward group efficiency comparison under SFM, its tests leave room for improvement.
First, their rather strong assumptions might limit the applicability of the proposed methods. For their parametric tests, they assumed the equality of both noise variance and inefficiency variance across groups. Second, their theoretical justification of the proposed methods needs to be checked. As regards their ordinary least squares (OLS) test of the mean difference in inefficiency, they provided its asymptotic normality as theoretical basis, but to our knowledge, such asymptotic normality is difficult to obtain because of the slow convergence rate of the DEA estimator when the number of input variables is greater than or equal to 2. The same comment is given in Section 3.2 of Simar and Wilson [10] on a similar type of asymptotic normality result as proof of Proposition 1 in Banker and Natarajan [11]. Finally, because they used the DEA methods for SFM, the tests they developed were based not directly on inefficiency itself, but on the inefficiency contaminated by positive measurement error due to noise. This indirect approach can lower the performance of their tests.
This observation has motivated us to develop a theoretically sound tool for comparison of group inefficiencies in the presence of noise. We develop such a methodology using a semi-parametric regression technique instead of DEA methods. The newly developed test performs better than the tests of Banker et al. [9] in numerical studies. It also has the theoretical advantage of working under more general assumptions compared to Banker et al. [9].
The rest of this paper is organized as follows. Section 2 describes our proposed test for group inefficiency comparison. We then perform some simulation studies and compare our test with the tests proposed by Banker et al. [9] in Section 3. We illustrate our method by applying the proposed test to Program for International Student Assessment (PISA) 2015 data and investigate whether an efficiency difference exists between male and female student groups at a specific age in terms of learning time and achievement in mathematics in Section 4. Section 5 provides some discussion and future research topics.

2. Group Efficiency Comparison under SFM

Assume that we have observations on $n$ DMUs, where each observation consists of a vector of $p$ inputs $X_i = (X_{1,i}, \ldots, X_{p,i})^\top$ and the corresponding output $Y_i$. We consider the case where the $n$ DMUs can be divided into two distinct groups with $n_l$ observations in group $l$ ($n = n_1 + n_2$). We assume the following stochastic frontier model for two groups of DMUs:
$$\begin{aligned} \text{(The first group)} \quad & Y_i = \phi(X_i) + \varepsilon_{1,i}, \quad i \in \{1, \ldots, n_1\}; \\ \text{(The second group)} \quad & Y_i = \phi(X_i) + \varepsilon_{2,i}, \quad i \in \{n_1 + 1, \ldots, n_1 + n_2\}, \end{aligned} \tag{1}$$
where $\varepsilon_{l,i} = V_{l,i} - U_{l,i}$, $V_{l,i}$ is a random noise term of the $l$th group with $E(V_{l,i} \mid X_i) = 0$, and $U_{l,i}$ is an inefficiency term of the $l$th group with $U_{l,i} \ge 0$ for $l = 1, 2$. We assume that the same production technology is applied to both DMU groups. Hence, the production frontier function $\phi(\cdot)$ is the same throughout the groups, as in Banker et al. [9]. Under this model, we need to estimate the difference $E(U_1) - E(U_2)$ and test the hypothesis
$$H_0: E(U_1) - E(U_2) = 0 \quad \text{vs.} \quad H_1: E(U_1) - E(U_2) > 0 \ (< 0) \tag{2}$$
to know which DMU group is more efficient. A novelty of our approach is that the test is implemented without imposing any parametric assumption on the frontier function $\phi(\cdot)$, and with minimal assumptions on inefficiency and random noise. Banker et al. [9] also implemented their tests without any parametric assumption on $\phi(\cdot)$, but with additional restrictive parametric assumptions on noise and inefficiency. In the following sections, we first review the work of Banker et al. [9] and then explain the development of our semiparametric regression-based test.

2.1. The Previous Work

To apply the DEA methods to SFM, Banker et al. [9] assumed that the random noise variables $V_{1,i}, V_{2,i}$ are bounded above by $V_{\max}$, that is, $V_{1,i}, V_{2,i} \le V_{\max}$. Under this assumption, they transformed model (1) as
$$Y_i = (\phi(X_i) + V_{\max}) - (V_{\max} - V_{l,i} + U_{l,i}) \equiv \tilde{\phi}(X_i) - \tilde{U}_{l,i}, \quad l = 1, 2. \tag{3}$$
Since $\tilde{U}_{l,i} = (V_{\max} - V_{l,i}) + U_{l,i} \ge 0$, they considered the translated production function $\tilde{\phi}(\cdot) = \phi(\cdot) + V_{\max}$ as a new production function, and $\tilde{U}_{l,i}$ as the inefficiency of the DEA framework. The new inefficiency $\tilde{U}_{l,i}$ can be estimated as $\hat{\tilde{\phi}}(X_i) - Y_i$ after $\tilde{\phi}(\cdot)$ is estimated using the conventional DEA methods. After estimating $\tilde{U}_{l,i}$ using DEA methods, they used it for group efficiency comparison. This approach is advantageous in that it uses the strength of the existing well-developed DEA techniques. However, it has one disadvantage: the tests are based not on the inefficiency $U_{l,i}$ itself, but on the inefficiency contaminated by the positive measurement error $V_{\max} - V_{l,i}$ due to random noise. Additionally, the distributional properties of the inefficiency estimated using DEA methods are generally hard to derive or quite complicated, making it very difficult to develop a statistical test theory based on the estimated inefficiency (the estimate of $\tilde{U}_{l,i}$). Hence, we are motivated to develop a test for (2) directly based on the inefficiency $U_{l,i}$. We explain this in the following section.

2.2. The Proposed Test

This section introduces our approach to testing the hypothesis in (2). Unlike Banker et al. [9], we do not require either the noise variance or the inefficiency variance to be equal across groups. Moreover, we allow the distributions of the composite error $\varepsilon$ and the input vector $X_i$ to differ across groups, reflecting the production environment of each group. Specifically, the variance of $V_l$ and the mean of the inefficiency $U_l$ can differ by group, as can the conditional distribution of $X_i$ given group $l$.
First, model (1) can be written as two nonparametric mean regression models as follows:
$$Y_i = [\phi(X_i) - E(U_1)] + [V_{1,i} - (U_{1,i} - E(U_1))] \equiv \phi^*(X_i) + \varepsilon^*_{1,i}, \quad i \in \{1, \ldots, n_1\}; \tag{4}$$
$$Y_i = [E(U_1) - E(U_2)] + [\phi(X_i) - E(U_1)] + [V_{2,i} - (U_{2,i} - E(U_2))] \equiv \beta_0 + \phi^*(X_i) + \varepsilon^*_{2,i}, \quad i \in \{n_1 + 1, \ldots, n_1 + n_2\}, \tag{5}$$
where $E(\varepsilon^*_{1,i}) = E(\varepsilon^*_{2,i}) = 0$ and $\beta_0 = E(U_1) - E(U_2)$. If a dummy variable is defined for the groups by letting $T_i = 0$ for $i \in \{1, \ldots, n_1\}$ and $T_i = 1$ for $i \in \{n_1 + 1, \ldots, n_1 + n_2\}$, the two models (4) and (5) can be integrated into a single partial linear semiparametric regression model as follows:
$$Y_i = \beta_0 T_i + \phi^*(X_i) + \varepsilon^*_i, \quad i \in \{1, \ldots, n\}, \tag{6}$$
where $\varepsilon^*_i = (1 - I(T_i = 1))\,\varepsilon^*_{1,i} + I(T_i = 1)\,\varepsilon^*_{2,i}$ and $E(\varepsilon^*_i \mid T_i, X_i) = 0$. Using model (6), we can test hypothesis (2) by testing the hypothesis
$$H_0: \beta_0 = 0 \quad \text{vs.} \quad H_1: \beta_0 > 0 \ (< 0). \tag{7}$$
Note that $Var(\varepsilon^*_i \mid T_i, X_i) = (1 - I(T_i = 1))\,Var(\varepsilon^*_{1,i}) + I(T_i = 1)\,Var(\varepsilon^*_{2,i})$. Thus, model (6) is a heteroscedastic partial linear model. Liang [12] and Ma et al. [13] studied model (6) when $X_i$ is univariate. By extending their theory to the case where $X_i$ is multivariate, we can test hypothesis (7). In Appendix A, we prove the asymptotic normality of the kernel-based profile estimator of $\beta_0$ based on a local linear smoother when $X_i$ is multivariate, and provide the necessary assumptions for it. As with the estimator in Liang [12], the kernel-based profile estimator of $\beta_0$ when $X_i$ is multivariate is given as
$$\hat{\beta}_0 = \left(T^\top (I - S)^\top (I - S) T\right)^{-1} T^\top (I - S)^\top (I - S) Y \equiv H Y, \tag{8}$$
where $T = (T_1, \ldots, T_n)^\top$, $Y = (Y_1, \ldots, Y_n)^\top$, and $S$ is the smoother matrix for estimating the vector $(E(\cdot \mid X_1), \ldots, E(\cdot \mid X_n))^\top$. If we choose local linear regression as the smoothing method, the smoothing matrix $S = [s_{X_1}^\top \cdots s_{X_n}^\top]^\top$ stacks the row vectors $s_{X_1}, \ldots, s_{X_n}$, each of which is the smoother vector
$$s_x = e_1^\top \left(X_x^\top W_x X_x\right)^{-1} X_x^\top W_x,$$
where $e_1 = (1, 0, \ldots, 0)^\top$ is a $(p + 1) \times 1$ vector; $W_x = \mathrm{diag}\{K_h(X_1 - x), \ldots, K_h(X_n - x)\}$ for some kernel function $K$ and bandwidth vector $h = (h_1, \ldots, h_p)^\top$; and
$$X_x = \begin{pmatrix} 1 & (X_1 - x)^\top \\ \vdots & \vdots \\ 1 & (X_n - x)^\top \end{pmatrix}.$$
Here, $K_h(X_i - x) = \prod_{j=1}^{p} h_j^{-1} K((X_{j,i} - x_j)/h_j)$. From the theorem in Appendix A, under some regularity conditions, $\sqrt{n}(\hat{\beta}_0 - \beta_0)$ is asymptotically normal with mean zero and variance $\sigma^2 = E(\tilde{T}^2)^{-2} E\{(\varepsilon^* \tilde{T})^2\}$, where $\tilde{T} = T - E(T \mid X)$. Using a consistent estimator $\hat{\sigma}^2$ of $\sigma^2$, we can test (7) at significance level $\alpha$ by rejecting $H_0$ if $Z = \hat{\beta}_0 / (\hat{\sigma}/\sqrt{n}) \ge z_\alpha$ (or $\le -z_\alpha$), where $z_\alpha$ is the $(1 - \alpha)$-quantile of the standard normal distribution.
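The estimator in (8) can be illustrated with a short numerical sketch. The code below is only an illustration under our own assumptions, not the implementation used in the paper: the Epanechnikov kernel, the fixed bandwidth `h=0.15`, the toy frontier `sin(2*pi*x)`, and the function names `local_linear_smoother` and `profile_estimate` are all hypothetical choices made for the example.

```python
import numpy as np

def epanechnikov(u):
    """Epanechnikov kernel; symmetric with support [-1, 1]."""
    return 0.75 * np.clip(1.0 - u**2, 0.0, None)

def local_linear_smoother(X, h):
    """Smoother matrix S whose k-th row is s_x evaluated at x = X_k."""
    X = np.asarray(X, dtype=float).reshape(len(X), -1)   # n x p
    n, p = X.shape
    h = np.broadcast_to(np.asarray(h, dtype=float), (p,))
    S = np.empty((n, n))
    for k in range(n):
        D = X - X[k]                                     # rows (X_i - x)
        w = np.prod(epanechnikov(D / h) / h, axis=1)     # product kernel K_h
        Z = np.column_stack([np.ones(n), D])             # local design matrix X_x
        ZtW = Z.T * w                                    # Z' W without forming W
        S[k] = np.linalg.solve(ZtW @ Z, ZtW)[0]          # e_1' (Z'WZ)^{-1} Z'W
    return S

def profile_estimate(Y, T, S):
    """Regress (I - S)Y on (I - S)T without an intercept, as in (8)."""
    T_res = T - S @ T
    Y_res = Y - S @ Y
    return float(T_res @ Y_res / (T_res @ T_res))

# Toy data with assumed values (not the paper's simulation design).
rng = np.random.default_rng(0)
n = 300
X = rng.uniform(0.0, 1.0, n)
T = (np.arange(n) >= n // 2).astype(float)   # dummy: 0 = group 1, 1 = group 2
Y = 0.5 * T + np.sin(2 * np.pi * X) + rng.normal(0.0, 0.3, n)

S = local_linear_smoother(X, h=0.15)
beta_hat = profile_estimate(Y, T, S)
print(f"beta_hat = {beta_hat:.3f}  (value used to simulate: 0.5)")
```

Because the local linear fit reproduces constants and linear functions exactly, each row of `S` sums to one, which is a convenient sanity check on any implementation of the smoother.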
As regards the estimation of $\sigma^2$, we can first directly estimate the variance $\sigma^2$ using the estimates $\hat{\varepsilon}^*_i$ and $\hat{E}(T \mid X_i)$, where $\hat{\varepsilon}^*_i = Y_i - \hat{\beta}_0 T_i - \hat{\phi}^*(X_i)$ and $\hat{\phi}^*(\cdot)$ is the local linear estimator of $\phi^*(\cdot)$ based on $Y_i - \hat{\beta}_0 T_i$, $i = 1, \ldots, n$. We can also estimate it using the sandwich covariance estimate based on (8),
$$\widehat{Var}(\hat{\beta}_0 \mid T, X_1, \ldots, X_n) = H\, \widehat{Var}(Y \mid T, X_1, \ldots, X_n)\, H^\top.$$
The matrix $\widehat{Var}(Y \mid T, X_1, \ldots, X_n)$ is diagonal, with the $i$th diagonal element equal to
$$\hat{E}(\varepsilon^{*2} \mid T_i, X_i) = (1 - I(T_i = 1))\,\widehat{Var}(\varepsilon^*_1) + I(T_i = 1)\,\widehat{Var}(\varepsilon^*_2),$$
where
$$\widehat{Var}(\varepsilon^*_1) = n_1^{-1} \sum_{i: T_i = 0} (\hat{\varepsilon}^*_i)^2 - \Bigl(n_1^{-1} \sum_{i: T_i = 0} \hat{\varepsilon}^*_i\Bigr)^2,$$
$$\widehat{Var}(\varepsilon^*_2) = n_2^{-1} \sum_{i: T_i = 1} (\hat{\varepsilon}^*_i)^2 - \Bigl(n_2^{-1} \sum_{i: T_i = 1} \hat{\varepsilon}^*_i\Bigr)^2.$$
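The sandwich estimate and the resulting Z statistic can be sketched in a few lines. This is a hedged illustration for the single-input case only: the closed-form local linear weights, the Epanechnikov kernel, the bandwidth, the toy data, and the function names `local_linear_1d` and `sandwich_z_test` are our own assumptions for the example rather than the authors' code.

```python
import math
import numpy as np

def local_linear_1d(x, h):
    """Local linear smoother matrix for a single input (Epanechnikov kernel)."""
    d = x[None, :] - x[:, None]                  # d[k, i] = x_i - x_k
    w = np.clip(1.0 - (d / h) ** 2, 0.0, None)   # kernel constant cancels below
    s1 = (w * d).sum(axis=1, keepdims=True)
    s2 = (w * d**2).sum(axis=1, keepdims=True)
    raw = w * (s2 - d * s1)
    return raw / raw.sum(axis=1, keepdims=True)

def sandwich_z_test(Y, T, S):
    """Return (beta_hat, standard error, Z) with group-wise residual variances."""
    n = len(Y)
    R = np.eye(n) - S
    T_res = R @ T
    H = (T_res @ R) / (T_res @ T_res)            # row vector with beta_hat = H @ Y
    beta_hat = float(H @ Y)
    resid = R @ (Y - beta_hat * T)               # eps_hat = (I - S)(Y - beta_hat T)
    var_y = np.empty(n)
    for g in (0, 1):
        mask = T == g
        e = resid[mask]
        var_y[mask] = np.mean(e**2) - np.mean(e) ** 2
    se = math.sqrt(float(np.sum(H**2 * var_y)))  # H diag(var_y) H'
    return beta_hat, se, beta_hat / se

# Toy heteroscedastic data with assumed values (not the paper's design).
rng = np.random.default_rng(1)
n = 300
X = rng.uniform(0.0, 1.0, n)
T = (np.arange(n) >= n // 2).astype(float)
noise_sd = np.where(T == 0, 0.6, 0.3)            # unequal variances across groups
Y = 0.5 * T + np.sin(2 * np.pi * X) + rng.normal(0.0, 1.0, n) * noise_sd

beta_hat, se, z = sandwich_z_test(Y, T, local_linear_1d(X, h=0.15))
p_upper = 0.5 * math.erfc(z / math.sqrt(2.0))    # one-sided p-value for H1: beta_0 > 0
print(f"beta_hat={beta_hat:.3f}, se={se:.3f}, Z={z:.2f}, p={p_upper:.4f}")
```

Here `se` estimates the standard deviation of the estimator itself, so it plays the role of $\hat{\sigma}/\sqrt{n}$ in the rejection rule.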
Since the frontier function is generally (coordinatewise) non-decreasing with respect to the input variables, one might consider it necessary to impose such monotonicity on $\phi^*(\cdot)$. However, from Theorem 2.1 in Huang [14], such an imposition will not decrease the asymptotic variance of $\hat{\beta}_0$; that is, it brings no theoretical improvement in performance. We therefore choose to develop the test without the monotonicity assumption for simplicity.
Note that our test directly estimates the mean difference in inefficiency $E(U_1) - E(U_2)$ using the semiparametric regression technique. Thus, the proposed test can work under assumptions that are more general than those in Banker et al. [9]. In particular, we do not have to assume that the noise has a finite upper support bound ($V_{\max}$). The tests in Banker et al. [9] need this assumption because they estimate $\tilde{U}_{l,i} = (V_{\max} - V_{l,i}) + U_{l,i}$ and use it as a surrogate for $U_{l,i}$; the extra term $V_{\max} - V_{l,i}$ may hamper those tests and degrade their performance.

3. Numerical Studies

In this section, we compare the performance of our test with those of Banker et al. [9]. We consider single and multiple input cases and use sandwich formulas to estimate the variances of the estimators.

3.1. Single Input Case

We first consider a single input case using the following model:
$$\begin{aligned} Y_i &= \phi(X_i) + V_{1,i} - U_{1,i}, \quad i \in \{1, \ldots, n_1\}; \\ Y_i &= \phi(X_i) + V_{2,i} - U_{2,i}, \quad i \in \{n_1 + 1, \ldots, n_1 + n_2\}, \end{aligned}$$
where $\phi(x) = 30x - 9x^2$, $X \sim U(0, 1)$, $U_{l,i} \sim N^+(0, \sigma_{l,u}^2)$, and $V_{l,i}$ follows the truncated normal distribution with mean 0 and variance $\sigma_{l,v}^2$, which lies within $(-6\sigma_{l,v}, 6\sigma_{l,v})$, $l = 1, 2$. Here, $N^+$ stands for a normal distribution limited to the domain $[0, \infty)$. As for $V_{l,i}$, we try two cases to reflect both equal and unequal error variances between groups. We set $\sigma_{1,v} = \sigma_{2,v} = 1$ for the equal error variance case and $\sigma_{1,v} = 2$, $\sigma_{2,v} = 1$ for the unequal variance case. To evaluate the type I error rate and power, we again consider two cases based on whether or not a mean difference ($\beta_0$) exists between group inefficiencies: $\sigma_{1,u} = \sigma_{2,u} = 1$ ($\beta_0 = 0$) and $\sigma_{1,u} = \sqrt{2}$, $\sigma_{2,u} = 1$ ($\beta_0 = 0.3305$). The latter value follows because the mean of $N^+(0, \sigma^2)$ is $\sigma\sqrt{2/\pi}$, so that $\beta_0 = (\sqrt{2} - 1)\sqrt{2/\pi} \approx 0.3305$. Here, the type I error rate is the rate at which a test declares a group difference in mean inefficiency when there is none, and the power is the rate at which it declares a group difference when the groups really differ in inefficiency.
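As a check on the design, the reported value $\beta_0 = 0.3305$ matches the difference of half-normal means $\sigma\sqrt{2/\pi}$ with scales $\sqrt{2}$ and 1. The sketch below verifies this and generates one data set in the spirit of the single-input design. It rests on our own assumptions: the frontier is read as $30x - 9x^2$, the truncated noise is drawn by simple rejection (so its variance is only approximately the nominal one), and the function names are hypothetical.

```python
import math
import numpy as np

def half_normal_mean(sigma):
    """Mean of |N(0, sigma^2)|, i.e. of the N^+(0, sigma^2) distribution."""
    return sigma * math.sqrt(2.0 / math.pi)

def truncated_noise(rng, sigma, size, k=6.0):
    """Draw N(0, sigma^2) values restricted to (-k*sigma, k*sigma) by rejection."""
    v = rng.normal(0.0, sigma, size)
    bad = np.abs(v) >= k * sigma
    while bad.any():
        v[bad] = rng.normal(0.0, sigma, bad.sum())
        bad = np.abs(v) >= k * sigma
    return v

def simulate_single_input(n1, n2, sigma_u, sigma_v, rng):
    """Return (X, T, Y) for two groups sharing the frontier 30x - 9x^2."""
    n = n1 + n2
    T = np.r_[np.zeros(n1), np.ones(n2)]
    X = rng.uniform(0.0, 1.0, n)
    U = np.abs(np.r_[rng.normal(0.0, sigma_u[0], n1),
                     rng.normal(0.0, sigma_u[1], n2)])
    V = np.r_[truncated_noise(rng, sigma_v[0], n1),
              truncated_noise(rng, sigma_v[1], n2)]
    Y = 30.0 * X - 9.0 * X**2 + V - U
    return X, T, Y

beta0 = half_normal_mean(math.sqrt(2.0)) - half_normal_mean(1.0)
print(f"beta_0 = {beta0:.4f}")

rng = np.random.default_rng(42)
X, T, Y = simulate_single_input(50, 50, (math.sqrt(2.0), 1.0), (1.0, 1.0), rng)
```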
We consider three sample sizes, $n = 100$, 200, and 400; each group makes up approximately 50% of the sample, and the number of replications is 1000. For comparison, we report the type I error and power of the following six tests at significance level $\alpha = 0.05$: our proposed test (PT), the OLS test, the T-test, the Mann-Whitney (MW) test, the Kolmogorov-Smirnov (KS) test, and the F-test. The last five tests are from Banker et al. [9]. We used a plug-in principle (see Ruppert et al. [15]) to find the bandwidth for our PT.
The test results are reported in Table 1. Four of the tests, all except the KS test and the F-test, seem to respect the significance level in both the equal and unequal variance cases. The KS test, however, clearly shows a larger type I error rate than expected for unequal error variances, and the F-test seems to be conservative, giving much smaller type I error probabilities than expected. As regards power, our PT generally performs best. In the unequal variance cases with $n = 200, 400$, the KS test has larger power than our PT, but the KS test is not reliable there since it tends to reject the null hypothesis too easily. Finally, all tests tend to show higher power with larger sample sizes.
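Type I error rates and power of the kind reported here are Monte Carlo rejection frequencies. The sketch below shows only that bookkeeping: to stay short it uses a placeholder one-sided two-sample z-test and a placeholder normal data generator, both of which are our own stand-ins and not any of the six tests or the simulation model of this section.

```python
import numpy as np
from statistics import NormalDist

def z_test_upper(y1, y2, alpha=0.05):
    """Placeholder test: reject when mean(y2) - mean(y1) is significantly > 0."""
    se = np.sqrt(y1.var(ddof=1) / len(y1) + y2.var(ddof=1) / len(y2))
    z = (y2.mean() - y1.mean()) / se
    return z >= NormalDist().inv_cdf(1.0 - alpha)

def make_simulator(n1, n2, shift):
    """Placeholder generator: two normal samples whose means differ by `shift`."""
    def simulate(rng):
        return rng.normal(0.0, 1.0, n1), rng.normal(shift, 1.0, n2)
    return simulate

def rejection_rate(simulate, test, reps, rng):
    """Fraction of simulated data sets on which `test` rejects H0."""
    return float(np.mean([test(*simulate(rng)) for _ in range(reps)]))

rng = np.random.default_rng(2020)
# No true difference: the rejection rate estimates the type I error rate.
type1 = rejection_rate(make_simulator(50, 50, 0.0), z_test_upper, 1000, rng)
# True difference present: the rejection rate estimates the power.
power = rejection_rate(make_simulator(50, 50, 0.5), z_test_upper, 1000, rng)
print(f"type I error ~ {type1:.3f}, power ~ {power:.3f}")
```

With 1000 replications, the Monte Carlo standard error of an estimated rate near 0.05 is about 0.007, which is worth keeping in mind when comparing entries that differ in the second decimal place.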

3.2. Multiple Input Case

We next consider a multiple input case with p = 3 . All the components in the simulation, except for the frontier function ϕ , are the same as in the single input case. As regards the frontier function, we consider two scenarios: the production function has an additive form, and the production function does not have an additive form. The additive assumption on the production function is used in Ferrara and Vidoli [16], but it may not be satisfied in some cases. However, in case of multiple covariates, the practical applicability of our proposed method may become worse since it requires multivariate smoothing and therefore suffers from the well-known “curse of dimensionality” problem as the dimension of the covariates becomes higher. In this case, additive modeling can be a meaningful alternative. For this, we try to estimate the difference in group efficiencies and test whether it is zero with an alternative estimating strategy, where we employ a backfitting procedure, which is a well-known estimating approach under the additive assumption. See Appendix B for details of the alternative method. Considering these two scenarios (additive and non-additive production functions), we examine how our PT(n) and its alternative based on the additive assumption, PT(a), behave depending on the validity of the assumption. The model considered here is
$$\begin{aligned} Y_i &= \phi(X_i) + V_{1,i} - U_{1,i}, \quad i \in \{1, \ldots, n_1\}; \\ Y_i &= \phi(X_i) + V_{2,i} - U_{2,i}, \quad i \in \{n_1 + 1, \ldots, n_1 + n_2\}, \end{aligned}$$
where $X_{j,i}$, $j = 1, 2, 3$, are generated from $U(0, 1)$ independently. For the first scenario, we set $\phi(x) = (30x_1 - 9x_1^2) + (5 + 2\arctan(10(x_2 - 0.5))) + (4x_3)$; this has an additive structure. For the second scenario, we consider $\phi(x) = 4x_1 + 7x_2 + 5x_3 + 8x_1x_2 + 10x_2x_3 + 9x_1x_3 - 1.1$. Note that both these production functions are concave. For PT(n), we select bandwidths by a generalized cross validation method (see Hastie and Tibshirani [17]), and for PT(a), we adopt a plug-in principle, as in the single input case.
The performances of PT(n) and PT(a) as well as the other five tests are reported in Table 2 and Table 3. From Table 2, our PT(n) outperforms the competitors overall, especially in the unequal variances case. Note that its type I error rates do not deviate much from 0.05, which means that the type I error rate is under control as desired; those of other competitors such as the OLS, T and KS tests tend to be a bit smaller than this level in the case of equal variances, and considerably larger in the case of unequal variances. The MW test seems to respect the level like PT(n), but PT(n) turns out to be more powerful than MW. Note that PT(a) shows good results in terms of type I error rate, with power comparable to that of the MW test. Its power is lower than that of PT(n), but this is natural since the true production function is not additive. The F-test seems to be anticonservative, leading to too high a type I error probability, especially in the unequal variances case. However, in the equal variance case, the F-test shows very good performance in terms of type I error rate and power in large samples, as reported in Banker et al. [9]. From Table 3, our two proposed tests outperform their competitors when the additive assumption is true. Note that under the additive assumption, both PT(a) and PT(n) correctly specify the model. In our simulation, PT(a) slightly outperforms PT(n): the type I error rates of both are close to 0.05, and the power of PT(a) is larger than that of PT(n). In the case of equal variances, the type I error rates of the five competitors tend to be below 0.05; when the variances are unequal, their power becomes much lower than that of our proposed tests, although overall they show satisfactory type I errors. The only exception is the F-test, which shows the largest power in the unequal variances case, but this merit is undermined by type I errors considerably larger than those of the other tests.

4. Application to PISA 2015 Data

In this section, we applied our PT to PISA 2015 data and tested the efficiency difference between male and female student groups at a specific age in terms of learning time ($X_i$) and achievement ($Y_i$) in mathematics. The data can be downloaded from http://www.oecd.org/pisa/data/. PISA is a worldwide study to evaluate educational systems by measuring the scholastic performance of 15-year-old school students in mathematics, science, and reading. As production data, we considered the regional averages of the students’ learning time and achievement in mathematics based on the test results of the 2015 version. Out of the 103 regions in the data, two regions, Nova Scotia in Canada and Chile, were excluded from our analysis in view of their outlier characteristics in efficiency analysis.
Usually, international large-scale assessments data include measurement errors at the individual as well as group level. Therefore, we considered the following stochastic frontier model for such data:
$$\begin{aligned} \text{(Male Students)} \quad & Y_i = \phi(X_i) + V_i^{male} - U_i^{male}, \quad i \in \{1, \ldots, 101\}; \\ \text{(Female Students)} \quad & Y_i = \phi(X_i) + V_i^{female} - U_i^{female}, \quad i \in \{102, \ldots, 202\}. \end{aligned}$$
In this model, we assumed that there would be no gender difference in learning ability from a biological point of view and used the same production frontier for both gender groups. This means that all the socio-economic characteristics that differentiate the gender groups were placed in the random error terms and not introduced in the frontier function.
Table 4 shows summary statistics of each student group data. We applied the six tests in Section 3 to the data and calculated the p-values for the following hypothesis testing:
$$H_0: E(U^{male}) = E(U^{female}) \quad \text{vs.} \quad H_1: E(U^{male}) < E(U^{female}).$$
From Samuelsson and Samuelsson [18], it is known that male students are often more involved in mathematics classes than female students. Additionally, since women are more often involved in domestic chores than men, and families and relatives often free up time for men to study, male students are likely to be in an environment where they can focus more on studying than female students. Hence, we expected the test results to indicate that male students were, on average, more efficient than female students.
Table 5 gives the test results. At a significance level of around $\alpha = 0.05$, our PT, the MW test, and the KS test (with a p-value slightly higher than $\alpha = 0.05$) supported the hypothesis that on average male students are more efficient in mathematics than female students. However, the OLS test, the T-test and the F-test reported no significant difference in learning efficiency between the two groups. The reason for this could be the somewhat restrictive assumptions required for their validity. Thus, these three tests seem to face the risk of unreasonable results if the assumptions are not satisfied in practice, but our PT does not seem to suffer from this problem.

5. Discussion and Conclusions

In this study, we developed tests with sound statistical theory for group efficiency comparison under SFM; in numerical simulations, they performed considerably better than the previously proposed tests. However, there is still room for improvement in our methods.
First, since we perform full nonparametric modeling for the frontier function $\phi(\cdot)$, which can be multivariate, our test might suffer from the “curse of dimensionality” and require high-order kernels for implementation with four or more input variables. In such a situation, we can consider an alternative test in the same spirit as our test when the frontier function $\phi(X)$ has an additive structure, that is, $\phi(X) = \sum_{j=1}^{p} \phi_j(X_j)$, or can be well approximated by one.
Second, we only deal with one output case, which limits practical applications. Our methods should be extended to cover the case of multi-output production frontiers, which DEA methods cover.
Third, we assume the same production frontier for both groups, which is a clear limitation in practice since this situation is not frequently observed. If it is important to assume separate production frontier functions for different groups, one can use the meta-frontier approach. O’Donnell et al. [6] proposed a meta-frontier approach to compare the group technical efficiencies under the stochastic frontier framework. Their method has the advantage that it can be used without assuming a common frontier function. However, its use can sometimes be restricted by the assumption that the frontier production function is log-linear.
Finally, if one is interested in estimating the mean inefficiency of each group, we refer to Noh and Van Keilegom [19], which is a recent work in that direction.

Author Contributions

Conceptualization, H.N.; methodology, H.N. and S.J.Y.; software, H.N. and S.J.Y.; formal analysis, H.N. and S.J.Y.; investigation, H.N. and S.J.Y.; writing–original draft preparation, H.N. and S.J.Y.; writing–review and editing, H.N. and S.J.Y. All authors have read and agreed to the published version of the manuscript.

Funding

H. Noh was supported by the Basic Science Research Program through the National Research Foundation of Korea funded by the Ministry of Education (NRF-2017R1D1A1A09000804), and S.J.Y. was supported by research funds for newly appointed professors of Jeonbuk National University in 2018.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A

In this appendix, we provide details of the asymptotic normality of the proposed estimator $\hat{\beta}_0$ in Section 2.2. For this, we first list the relevant assumptions.
Assumption
  • The kernel function $K$ is symmetric, and Lipschitz continuous in $[-1, 1]$.
  • $\phi$ is twice partially continuously differentiable.
  • The density functions of $X_j$ ($j \in \{1, \ldots, p\}$) are continuous, and bounded away from zero and infinity on their supports $C_j$, which are bounded.
  • $V_1$, $V_2$, $U_1$ and $U_2$ have finite second moments.
  • For $j \in \{1, \ldots, p\}$, $h_j$ is asymptotic to $n^{-a}$ for some $a > 0$ such that $n \bigl(\prod_{j=1}^{p} h_j\bigr)^2 / \log n \to \infty$ and $n h_j^8 \to 0$ as $n$ goes to infinity.
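To see what the last assumption implies for the admissible bandwidth rates, one can substitute $h_j \asymp n^{-a}$ into both conditions. The short calculation below is our own reading of the assumption and is offered only as a sketch:

```latex
\begin{align*}
\frac{n \bigl(\prod_{j=1}^{p} h_j\bigr)^{2}}{\log n}
  \asymp \frac{n^{1-2ap}}{\log n} \to \infty
  &\iff a < \frac{1}{2p}, \\
n h_j^{8} \asymp n^{1-8a} \to 0
  &\iff a > \frac{1}{8}.
\end{align*}
```

Both conditions can hold together only if $1/8 < 1/(2p)$, that is, $p < 4$. For $p = 3$ the admissible range is $1/8 < a < 1/6$, and for $p \ge 4$ it is empty, which appears consistent with the remark in Section 5 that four or more input variables would call for high-order kernels.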
Theorem A1.
Under the above assumptions,
$$\sqrt{n}(\hat{\beta}_0 - \beta_0) \xrightarrow{d} N(0, \sigma^2),$$
where
$$\sigma^2 = E(\tilde{T}^2)^{-2} E(\varepsilon^{*2} \tilde{T}^2), \quad \tilde{T} = T - E(T \mid X).$$
Proof. 
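We write
$$\sqrt{n}(\hat{\beta}_0 - \beta_0) = \left\{n^{-1} T^\top (I - S)^\top (I - S) T\right\}^{-1} \times n^{-1/2} T^\top (I - S)^\top \left(\varepsilon^* + \phi^* - S(Y - \beta_0 T)\right),$$
where $\phi^* = (\phi^*(X_1), \ldots, \phi^*(X_n))^\top$ and $\varepsilon^* = (\varepsilon^*_1, \ldots, \varepsilon^*_n)^\top$. It suffices to show that
$$n^{-1} T^\top (I - S)^\top (I - S) T \xrightarrow{p} E(\tilde{T}^2), \quad \text{and} \tag{A1}$$
$$n^{-1/2} T^\top (I - S)^\top \left(\varepsilon^* + \phi^* - S(Y - \beta_0 T)\right) \xrightarrow{d} N(0, E(\varepsilon^{*2} \tilde{T}^2)). \tag{A2}$$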
To prove these, we first give the following fact:
$$\sup_{x \in C_1 \times \cdots \times C_p} \left|\hat{\xi}(x) - \xi(x)\right| = O_p\!\left(n^{-2a} + n^{-(1 - ap)/2} \sqrt{\log n}\right), \tag{A3}$$
where $\xi(x) = E(R \mid X = x)$ and $\hat{\xi}(x)$ is its local linear estimator. That is, $\hat{\xi}(x) = s_x R$ with $R = (R_1, \ldots, R_n)^\top$ when $R_j$ is a response variable. (A3) can be shown from the standard theory of kernel smoothing. Note that
$$(I - S) T = (T - E(T \mid X)) + (E(T \mid X) - S T).$$
The second term on the right-hand side of the above equation is $o_p(1)$ from (A3). This proves (A1).
Next, we write
$$\begin{aligned} n^{-1/2} T^\top (I - S)^\top \left(\varepsilon^* + \phi^* - S(Y - \beta_0 T)\right) &= n^{-1/2} (T - E(T \mid X))^\top \varepsilon^* + n^{-1/2} (T - E(T \mid X))^\top (\phi^* - S(Y - \beta_0 T)) \\ &\quad + n^{-1/2} (E(T \mid X) - S T)^\top \varepsilon^* + n^{-1/2} (E(T \mid X) - S T)^\top (\phi^* - S(Y - \beta_0 T)) \\ &\equiv n^{-1/2} (T - E(T \mid X))^\top \varepsilon^* + A_{1,n} + A_{2,n} + A_{3,n}. \end{aligned}$$
Since $n^{-1/2} (T - E(T \mid X))^\top \varepsilon^* = N(0, E(\varepsilon^{*2} \tilde{T}^2)) + o_p(1)$, it is enough to show that $A_{j,n} = o_p(1)$, $j = 1, 2, 3$, to claim (A2).
To treat $A_{1,n}$, we note that
$$\sup_{x_j \in C_j} \left|\frac{\partial}{\partial x_j} \hat{\xi}(x) - \frac{\partial}{\partial x_j} \xi(x)\right| = O_p\!\left(n^{-2a} + n^{-(1 - ap - 2a)/2} \sqrt{\log n}\right), \quad j \in \{1, \ldots, p\}, \tag{A4}$$
and denote $\hat{\phi}^* = S(Y - \beta_0 T)$. Then,
$$A_{1,n} = n^{-1/2} \sum_{i=1}^{n} (T_i - E(T_i \mid X_i)) (\phi^*(X_i) - \hat{\phi}^*(X_i)).$$
Let $\mathcal{G}$ denote a class of functions satisfying $|g(x) - g(y)| \le \|x - y\|$ for $x, y \in [0, 1]^p$. Then, $n^{a_0} (\hat{\phi}^*(\cdot) - \phi^*(\cdot))$ belongs to $\mathcal{G}$ with probability tending to 1 from (A3) and (A4), where $a_0 < \max\{2a, (1 - ap - 2a)/2\}$. We can show that the $\delta$-entropy of $\mathcal{G}$ for the supremum norm satisfies
$$H(\delta, \mathcal{G}) \le K \left(\log \frac{1}{\delta} + \frac{1}{\delta^{p}}\right)$$
for some constant $K$. Here, we consider an empirical process
$$n^{-1/2} \sum_{i=1}^{n} (T_i - E(T_i \mid X_i))\, g(X_i), \quad g \in \mathcal{G},$$
indexed by $\mathcal{G}$. Then, $E[(T_i - E(T_i \mid X_i))\, g(X_i)] = 0$, and by Corollary 8.8 of van de Geer [20], we conclude that $\sup_{g \in \mathcal{G}} \bigl|n^{-1/2} \sum_{i=1}^{n} (T_i - E(T_i \mid X_i))\, g(X_i)\bigr| = O_p(1)$, which yields $A_{1,n} = o_p(1)$. Note that the exponential tail condition, required to apply the empirical process technique, is automatically satisfied in our case since $T$ is a binary variable.
As for $A_{2,n}$, we first note that $E(A_{2,n} \mid (T_1, X_1), \ldots, (T_n, X_n)) = 0$. Moreover,
$$E(A_{2,n}^2 \mid (T_1, X_1), \ldots, (T_n, X_n)) \le n^{-1} \left[\mathrm{var}(\varepsilon^*_{1,1}) + \mathrm{var}(\varepsilon^*_{2,1})\right] (E(T \mid X) - S T)^\top (E(T \mid X) - S T) = O_p\!\left(n^{-4a} + n^{-(1 - ap)} \log n\right)$$
from (A3). This establishes $A_{2,n} = o_p(1)$. Finally, $A_{3,n} = O_p\!\left(n^{1/2 - 4a} + n^{-1/2 + ap/2} \log n\right) = o_p(1)$ from (A3), which completes the proof. □

Appendix B

In this appendix, we describe an alternative test for (2) when the frontier function $\phi(X)$ has an additive structure, that is, $\phi(X) = \sum_{j=1}^{p}\phi_j(X_j)$. Under the additive structure assumption, model (1) can be written as two nonparametric mean regression models:
$$
\begin{aligned}
Y_i &= \Bigl[\sum_{j=1}^{p} E\phi_j(X_{j,i}) - E(U_1)\Bigr] + \sum_{j=1}^{p}\bigl(\phi_j(X_{j,i}) - E\phi_j(X_j)\bigr) + \bigl[V_{1,i} - (U_{1,i} - E(U_1))\bigr] \\
&\equiv \mu + \sum_{j=1}^{p}\phi_j^{*}(X_{j,i}) + \varepsilon_{1,i}^{*}, \quad i \in \{1, \ldots, n_1\};
\end{aligned}
\tag{A5}
$$
$$
\begin{aligned}
Y_i &= \Bigl[\sum_{j=1}^{p} E\phi_j(X_{j,i}) - E(U_1)\Bigr] + \bigl[E(U_1) - E(U_2)\bigr] + \sum_{j=1}^{p}\bigl(\phi_j(X_{j,i}) - E\phi_j(X_j)\bigr) + \bigl[V_{2,i} - (U_{2,i} - E(U_2))\bigr] \\
&\equiv \mu + \beta_0 + \sum_{j=1}^{p}\phi_j^{*}(X_{j,i}) + \varepsilon_{2,i}^{*}, \quad i \in \{n_1 + 1, \ldots, n_1 + n_2\},
\end{aligned}
\tag{A6}
$$
where $\phi_j^{*}(X_j) = \phi_j(X_j) - E\phi_j(X_j)$, so that $E(\phi_j^{*}(X_j)) = 0$ for $j \in \{1, \ldots, p\}$. If we introduce the same dummy variable $T_i$ as in the single input case, models (A5) and (A6) can be integrated into a single semiparametric regression model, namely a (heteroscedastic) partial linear additive model:
$$Y_i = \mu + \beta_0 T_i + \sum_{j=1}^{p}\phi_j^{*}(X_{j,i}) + \varepsilon_i^{*}, \quad i \in \{1, \ldots, n\}, \tag{A7}$$
where $\varepsilon_i^{*} = (1 - I(T_i = 1))\,\varepsilon_{1,i}^{*} + I(T_i = 1)\,\varepsilon_{2,i}^{*}$ and $E(\varepsilon_i^{*} \mid T_i, X_i) = 0$. Partial linear additive models have been studied by several authors, for example, Fan et al. [21], Fan and Li [22], Li [23], and Wei and Liu [24]. For the test, we use the profile least squares estimator of $\beta_0$ in Wei and Liu [24]. However, Wei and Liu [24] derived the asymptotic distribution of the estimator of the parametric component vector (in our case, the estimator of $\beta = (\mu, \beta_0)^{\top}$) only under the homoscedasticity assumption on the error (Theorem 2.1 of their paper), so we extend their result to the heteroscedastic case.
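To make the pooled model concrete, the following minimal sketch simulates two groups that share one additive frontier and builds the dummy $T_i$. The component functions, the exponential inefficiency distributions, the noise levels, and the name `simulate_two_groups` are all illustrative assumptions and are not taken from the paper. With exponential inefficiencies, $\beta_0 = E(U_1) - E(U_2)$ is simply the difference of the two means.

```python
import numpy as np

def simulate_two_groups(n1, n2, mean_u1=1.0, mean_u2=0.5, sd_v=(0.2, 0.4), seed=0):
    """Simulate Y = phi_1(X_1) + phi_2(X_2) + V - U for two groups with a common frontier.

    Every distribution and component function below is a hypothetical choice.
    Returns Y, the group dummy T (0 = group 1, 1 = group 2), X, and the frontier values.
    """
    rng = np.random.default_rng(seed)
    n = n1 + n2
    T = np.r_[np.zeros(n1), np.ones(n2)]
    X = rng.uniform(0.0, 1.0, size=(n, 2))
    frontier = np.sqrt(X[:, 0]) + np.log1p(X[:, 1])      # additive frontier (assumed)
    # One-sided inefficiency U and two-sided noise V, with group-specific distributions.
    U = np.where(T == 0, rng.exponential(mean_u1, n), rng.exponential(mean_u2, n))
    V = np.where(T == 0, rng.normal(0.0, sd_v[0], n), rng.normal(0.0, sd_v[1], n))
    return frontier + V - U, T, X, frontier

Y, T, X, frontier = simulate_two_groups(20000, 20000)
dev = Y - frontier                                        # equals V - U
beta0_hat = dev[T == 1].mean() - dev[T == 0].mean()       # estimates E(U1) - E(U2)
print(round(beta0_hat, 3))                                # should be near 0.5 here
```

In practice the frontier is unknown, so this oracle difference is not available; it only illustrates why the coefficient of $T_i$ in the pooled model carries the efficiency difference.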
To introduce the profile least squares estimator of $\beta_0$ using the method of Wei and Liu [24], we define some notation. Let
$$
T_{des} = \begin{pmatrix} 1 & T_1 \\ \vdots & \vdots \\ 1 & T_n \end{pmatrix} = (1_n, T), \quad
S_T = \begin{pmatrix} I_n & S_1^{*} & \cdots & S_1^{*} \\ S_2^{*} & I_n & \cdots & S_2^{*} \\ \vdots & & \ddots & \vdots \\ S_p^{*} & S_p^{*} & \cdots & I_n \end{pmatrix}, \quad
C = \begin{pmatrix} S_1^{*} \\ S_2^{*} \\ \vdots \\ S_p^{*} \end{pmatrix},
\tag{A8}
$$
where $S_j$ is the smoothing matrix for local linear regression with respect to the $j$th ($j \in \{1, \ldots, p\}$) covariate vector $X_j = (X_{j,1}, \ldots, X_{j,n})^{\top}$ with kernel function $K(\cdot)$ and bandwidth $h_j$, $S_j^{*} = (I_n - 1_n 1_n^{\top}/n)\,S_j$, and $1_n = (1, \ldots, 1)^{\top}$ is of length $n$. Additionally, we define the additive smoother matrix $W_j$ as $W_j = E_j S_T^{-1} C$, where $E_j$ is a partitioned matrix of dimension $n \times np$ with the $n \times n$ identity matrix as the $j$th "block" and zeros elsewhere. Then, the profile least squares estimator of $\beta = (\mu, \beta_0)^{\top}$ is obtained as the estimator of the coefficient vector $\beta$ of the synthetic linear regression model
$$Y_i - \tilde{Y}_i = (T_{des,i} - \tilde{T}_{des,i})^{\top}\beta + \varepsilon_i^{*}, \tag{A9}$$
where $W_M = \sum_{j=1}^{p} W_j$, $\tilde{Y} = (\tilde{Y}_1, \ldots, \tilde{Y}_n)^{\top} = W_M Y$ and $\tilde{T}_{des} = (\tilde{T}_{des,1}, \ldots, \tilde{T}_{des,n})^{\top} = W_M T_{des}$. Additionally, since $W_M 1_n = (0, \ldots, 0)^{\top}$, the linear model (A9) becomes
$$Y_i - \tilde{Y}_i = \mu + (T_i - \tilde{T}_i)\,\beta_0 + \varepsilon_i^{*},$$
where $\tilde{T} = (\tilde{T}_1, \ldots, \tilde{T}_n)^{\top} = W_M T$. Hence, after a standard calculation in linear model theory, we obtain the profile least squares estimator of $\beta_0$ as
$$\hat{\beta}_0 = \bigl[T^{\top}(I_n - W_M)^{\top}(I_n - J)(I_n - W_M)T\bigr]^{-1}\,T^{\top}(I_n - W_M)^{\top}(I_n - J)(I_n - W_M)Y,$$
where $J = 1_n 1_n^{\top}/n$. Using the results employed to prove Theorem 2.1 in Wei and Liu [24], we show below that, under some regularity conditions, $\sqrt{n}(\hat{\beta}_0 - \beta_0)$ is asymptotically normal with mean zero and variance
$$\sigma_{add}^{2} = \bigl[E(\tilde{T}^{2})\bigr]^{-2}\,E(\varepsilon^{*2}\tilde{T}^{2}),$$
where $\tilde{T} = T - E(T) - \sum_{j=1}^{p}\bigl[E(T \mid X_j) - E(T)\bigr]$. Once we have a consistent estimate of $\sigma_{add}^{2}$, we can test (7) at a given significance level $\alpha$. As in the single input case, we can directly estimate the variance $\sigma_{add}^{2}$ using the estimates $\hat{\varepsilon}_i^{*}$ and $\hat{E}(T \mid X_{j,i})$. Here, $\hat{E}(T \mid X_{j,i})$ can be obtained as the $i$th element of $S_j (T_1, \ldots, T_n)^{\top}$. Alternatively, we can estimate the variance with the sandwich formula based on (A11), following steps similar to those in Section 2.2.
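The construction above can be sketched numerically. The code below is a minimal illustration, not the authors' implementation: it assumes an Epanechnikov kernel, fixed user-supplied bandwidths, a plug-in sandwich-type estimate of $\sigma_{add}^{2}$ that uses the residualized dummy $(I_n - J)(I_n - W_M)T$ in place of $\tilde{T}$, and a two-sided normal test. The function names and the simulated data are hypothetical.

```python
import math
import numpy as np

def local_linear_smoother(x, h):
    """n x n local linear smoothing matrix S_j (Epanechnikov kernel, an assumed choice)."""
    x = np.asarray(x, dtype=float)
    S = np.empty((x.size, x.size))
    for i in range(x.size):
        d = x - x[i]
        u = d / h
        w = np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u**2), 0.0)
        s0, s1, s2 = w.sum(), (w * d).sum(), (w * d**2).sum()
        S[i] = w * (s2 - s1 * d) / (s0 * s2 - s1**2)      # local linear weights at x[i]
    return S

def additive_smoother(X, h):
    """W_M = sum_j E_j S_T^{-1} C, built from the centered smoothers S_j^*."""
    n, p = X.shape
    I, J = np.eye(n), np.full((n, n), 1.0 / n)
    S_star = [(I - J) @ local_linear_smoother(X[:, j], h[j]) for j in range(p)]
    # Block row j has I_n on the diagonal and S_j^* everywhere else.
    S_T = np.block([[I if k == j else S_star[j] for k in range(p)] for j in range(p)])
    Z = np.linalg.solve(S_T, np.vstack(S_star))           # S_T^{-1} C, shape (n*p, n)
    return sum(Z[j * n:(j + 1) * n] for j in range(p))    # add up the p blocks W_j

def profile_ls_test(Y, T, X, h):
    """Profile least squares estimate of beta_0 with a sandwich-type z-test."""
    n = len(Y)
    W_M = additive_smoother(X, h)
    R = (np.eye(n) - np.full((n, n), 1.0 / n)) @ (np.eye(n) - W_M)   # (I-J)(I-W_M)
    t_res, y_res = R @ T, R @ Y
    beta0 = (t_res @ y_res) / (t_res @ t_res)
    eps = y_res - beta0 * t_res
    sigma2 = n * np.sum(t_res**2 * eps**2) / np.sum(t_res**2) ** 2   # plug-in for sigma_add^2
    z = math.sqrt(n) * beta0 / math.sqrt(sigma2)
    p_value = math.erfc(abs(z) / math.sqrt(2.0))          # two-sided normal p-value
    return beta0, sigma2, z, p_value

# Hypothetical data with beta_0 = 0.5 and group-dependent error variance.
rng = np.random.default_rng(1)
n = 200
X = rng.uniform(0.0, 1.0, (n, 2))
T = rng.integers(0, 2, n).astype(float)
Y = 2.0 + 0.5 * T + np.sin(2 * np.pi * X[:, 0]) + X[:, 1] ** 2 + rng.normal(0.0, 0.1 + 0.1 * T)
beta0_hat, sigma2_hat, z, p_value = profile_ls_test(Y, T, X, h=[0.2, 0.2])
print(beta0_hat, p_value)
```

Because $(I_n - J)$ is symmetric and idempotent, the ratio of inner products of the residualized vectors equals the matrix expression for $\hat{\beta}_0$. Solving the $np \times np$ system directly is only practical for moderate $n$ and $p$.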
Now, we show the asymptotic property of the profile least squares estimator of $\beta_0$. For this, we first list the relevant assumptions.
Assumption
  • The kernel function $K$ is symmetric and Lipschitz continuous on $[-1, 1]$.
  • $\phi_j$ ($j \in \{1, \ldots, p\}$) are twice continuously differentiable.
  • The density functions of $X_j$ ($j \in \{1, \ldots, p\}$) are continuous and bounded away from zero and infinity on their supports, which are bounded.
  • $V_1$, $V_2$, $U_1$ and $U_2$ have finite second moments.
  • For $j \in \{1, \ldots, p\}$, $h_j \to 0$, $n h_j / \log n \to \infty$ and $n h_j^{8} \to 0$ as $n$ goes to infinity.
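As a quick check of the bandwidth condition, suppose (purely as an illustration, not a choice made in the paper) that $h_j = c\,n^{-a}$ for constants $c > 0$ and $a > 0$. Then
$$h_j \to 0 \iff a > 0, \qquad \frac{n h_j}{\log n} = \frac{c\,n^{1-a}}{\log n} \to \infty \iff a < 1, \qquad n h_j^{8} = c^{8} n^{1-8a} \to 0 \iff a > \tfrac{1}{8},$$
so any rate with $1/8 < a < 1$ satisfies the last assumption. For example, $a = 1/5$ gives $n h_j^{8} = c^{8} n^{-3/5} \to 0$.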
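Theorem A2.
Under the above assumptions,
$$\sqrt{n}\,(\hat{\beta}_0 - \beta_0) \xrightarrow{d} N(0, \sigma_{add}^{2}),$$
where
$$\sigma_{add}^{2} = \bigl[E(\tilde{T}^{2})\bigr]^{-2}\,E(\varepsilon^{*2}\tilde{T}^{2}), \qquad \tilde{T} = T - E(T) - \sum_{j=1}^{p}\bigl[E(T \mid X_j) - E(T)\bigr].$$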
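Proof.
$\hat{\beta}_0$ can be expressed as follows:
$$\hat{\beta}_0 = \bigl[T^{\top}(I_n - W_M)^{\top}(I_n - J)(I_n - W_M)T\bigr]^{-1}\,T^{\top}(I_n - W_M)^{\top}(I_n - J)(I_n - W_M)Y,$$
where $T = (T_1, \ldots, T_n)^{\top}$ and $J = 1_n 1_n^{\top}/n$. Then,
$$\sqrt{n}\,(\hat{\beta}_0 - \beta_0) = \bigl[n^{-1}\,T^{\top}(I_n - W_M)^{\top}(I_n - J)(I_n - W_M)T\bigr]^{-1} \times n^{-1/2}\,T^{\top}(I_n - W_M)^{\top}(I_n - J)(I_n - W_M)(\phi^{*} + \varepsilon^{*}),$$
where $\phi^{*} = \bigl(\sum_{j=1}^{p}\phi_j^{*}(X_{j,1}), \ldots, \sum_{j=1}^{p}\phi_j^{*}(X_{j,n})\bigr)^{\top}$ and $\varepsilon^{*} = (\varepsilon_1^{*}, \ldots, \varepsilon_n^{*})^{\top}$. Here, the term associated with the intercept $\mu$ vanishes because $W_M 1_n = (0, \ldots, 0)^{\top}$. To prove the theorem, it suffices to show that
$$n^{-1}\,T^{\top}(I_n - W_M)^{\top}(I_n - J)(I_n - W_M)T = n^{-1}\sum_{i=1}^{n}\Bigl\{T_i - E(T_i) - \sum_{j=1}^{p}\bigl[E(T_i \mid X_{j,i}) - E(T_i)\bigr]\Bigr\}^{2} + o_p(1) \tag{A13}$$
and
$$n^{-1/2}\,T^{\top}(I_n - W_M)^{\top}(I_n - J)(I_n - W_M)(\phi^{*} + \varepsilon^{*}) = n^{-1/2}\sum_{i=1}^{n}\Bigl\{T_i - E(T_i) - \sum_{j=1}^{p}\bigl[E(T_i \mid X_{j,i}) - E(T_i)\bigr]\Bigr\}\,\varepsilon_i^{*} + o_p(1). \tag{A14}$$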
Note that $(I_n - J)(I_n - W_M)T = (I_n - W_M)T - 1_n\bar{T}$, where $\bar{T} = n^{-1}\sum_{i=1}^{n} T_i$, because $1_n^{\top} W_M = (0, \ldots, 0)$. Then, one can easily see that
$$n^{-1}\,T^{\top}(I_n - W_M)^{\top}(I_n - J)(I_n - W_M)T = n^{-1}(T - 1_n\mu_T)^{\top}(I_n - W_M)^{\top}(I_n - W_M)(T - 1_n\mu_T) + O_p(n^{-1/2})$$
for $\mu_T = E(T_1)$. Therefore, Equation (A13) can be verified as in the proof of Lemma 6.2 in Wei and Liu [24]. For Equation (A14), note that
$$n^{-1/2}\,T^{\top}(I_n - W_M)^{\top}(I_n - J)(I_n - W_M)\phi^{*} = n^{-1/2}(T - 1_n\mu_T)^{\top}(I_n - W_M)^{\top}(I_n - W_M)(\phi^{*} - 1_n\mu_{\phi^{*}}) + O_p(n^{-1/2}) \tag{A15}$$
and
$$n^{-1/2}\,T^{\top}(I_n - W_M)^{\top}(I_n - J)(I_n - W_M)\varepsilon^{*} = n^{-1/2}(T - 1_n\mu_T)^{\top}(I_n - W_M)^{\top}(I_n - W_M)\varepsilon^{*} + O_p(n^{-1/2}), \tag{A16}$$
where $\mu_{\phi^{*}} = E\bigl(\sum_{j=1}^{p}\phi_j^{*}(X_{j,1})\bigr)$. Then, we have
$$(I_n - W_M)\,\varepsilon^{*} = \varepsilon^{*} - \sum_{j=1}^{p} S_j\,\varepsilon^{*} + O_p\Bigl(\sqrt{n}\,\sum_{j=1}^{p} h_j^{4}\Bigr)$$
from Lemma B.6 in [25]. Note that this holds as long as the conditional variance of $\varepsilon^{*}$ given the covariates exists, which is guaranteed by the fourth assumption above (finite second moments). Wei and Liu [24] used a similar fact under the homoscedastic error assumption. Then, with a derivation similar to that in Lemma 6.3 of Wei and Liu [24], we can show that (A15) converges to zero in probability and that (A16) can be written as
$$n^{-1/2}\sum_{i=1}^{n}\Bigl\{T_i - E(T_i) - \sum_{j=1}^{p}\bigl[E(T_i \mid X_{j,i}) - E(T_i)\bigr]\Bigr\}\,\varepsilon_i^{*} + o_p(1),$$
to complete the proof. □
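The two algebraic facts used in the proof can be checked directly from $W_M 1_n = 0$, $1_n^{\top} W_M = 0^{\top}$ and $J = 1_n 1_n^{\top}/n$ (here $0$ denotes a zero vector of length $n$, and $\bar{T} = n^{-1}\sum_{i=1}^{n} T_i$):
$$(I_n - J)(I_n - W_M)\,1_n = (I_n - J)\,1_n = 1_n - n^{-1}\,1_n (1_n^{\top} 1_n) = 0,$$
$$(I_n - J)(I_n - W_M)\,T = (I_n - W_M)T - n^{-1}\,1_n\bigl(1_n^{\top}T - 1_n^{\top}W_M T\bigr) = (I_n - W_M)T - 1_n\bar{T}.$$
The first identity is why the intercept $\mu$ drops out of $\sqrt{n}(\hat{\beta}_0 - \beta_0)$. The second shows that applying $(I_n - J)$ only recenters the residualized dummy by its sample mean.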

References

  1. Golany, B.; Storberg, J. A data envelopment analysis of the operational efficiencies of bank branches. Interfaces 1999, 29, 14–26.
  2. Lee, H.; Park, Y.; Choi, H. Comparative evaluation of performance of national R&D programs with heterogeneous objectives: A DEA approach. Eur. J. Oper. Res. 2009, 196, 847–855.
  3. Cummins, J.D.; Weiss, M.A.; Zi, H. Organizational form and efficiency: The coexistence of stock and mutual property-liability insurers. Manag. Sci. 1999, 45, 1254–1269.
  4. Simar, L.; Zelenyuk, V. On testing equality of distributions of technical efficiency scores. Econom. Rev. 2006, 25, 497–522.
  5. Li, Q. Nonparametric testing of closeness between two unknown distribution functions. Econom. Rev. 1996, 15, 261–274.
  6. O’Donnell, C.J.; Rao, D.S.P.; Battese, G.E. Metafrontier frameworks for the study of firm-level efficiencies and technology ratios. Empir. Econ. 2008, 34, 231–255.
  7. Aigner, D.; Chu, S. On estimating the industry production function. Am. Econ. Rev. 1968, 58, 826–839.
  8. Meeusen, W.; van den Broeck, J. Efficiency estimation from Cobb-Douglas production functions with composed error. Int. Econ. Rev. 1977, 18, 435–444.
  9. Banker, R.D.; Zheng, Z.; Natarajan, R. DEA-based hypothesis tests for comparing two groups of decision making units. Eur. J. Oper. Res. 2010, 206, 231–238.
  10. Simar, L.; Wilson, P.W. Two-stage DEA: Caveat emptor. J. Product. Anal. 2011, 36, 205–218.
  11. Banker, R.D.; Natarajan, R. Evaluating contextual variables affecting productivity using data envelopment analysis. Oper. Res. 2008, 56, 48–58.
  12. Liang, H. Estimation in partially linear models and numerical comparisons. Comput. Stat. Data Anal. 2006, 50, 675–687.
  13. Ma, Y.; Chiou, J.M.; Wang, N. Efficient semiparametric estimator for heteroscedastic partially linear models. Biometrika 2006, 93, 75–84.
  14. Huang, J. A note on estimating a partly linear model under monotonicity constraints. J. Stat. Plan. Inference 2002, 107, 343–351.
  15. Ruppert, D.; Sheather, S.J.; Wand, M.P. An effective bandwidth selector for local least squares regression. J. Am. Stat. Assoc. 1995, 90, 1257–1270.
  16. Ferrara, G.; Vidoli, F. Semiparametric stochastic frontier models: A generalized additive model approach. Eur. J. Oper. Res. 2017, 258, 761–777.
  17. Hastie, T.J.; Tibshirani, R.J. Generalized Additive Models; Chapman and Hall/CRC: New York, NY, USA, 1990.
  18. Samuelsson, M.; Samuelsson, J. Gender differences in boys’ and girls’ perception of teaching and learning mathematics. Open Rev. Educ. Res. 2016, 3, 18–34.
  19. Noh, H.; Van Keilegom, I. On relaxing the distributional assumption of stochastic frontier models. J. Korean Stat. Soc. 2020, in press.
  20. Van de Geer, S. Empirical Processes in M-Estimation; Cambridge University Press: Cambridge, UK, 2000.
  21. Fan, J.; Härdle, W.; Mammen, E. Direct estimation of low-dimensional components in additive models. Ann. Stat. 1998, 26, 943–971.
  22. Fan, Y.; Li, Q. A kernel-based method for estimating additive partially linear models. Stat. Sin. 2003, 13, 739–762.
  23. Li, Q. Efficient estimation of additive partially linear models. Int. Econ. Rev. 2000, 41, 1073–1092.
  24. Wei, C.H.; Liu, C. Statistical inference on semi-parametric partial linear additive models. J. Nonparametric Stat. 2012, 24, 809–823.
  25. Fan, J.; Jiang, J. Nonparametric inferences for additive models. J. Am. Stat. Assoc. 2005, 100, 890–907.
Figure 1. Comparison between Data Envelopment Analysis (DEA) (left panel) and SFM (right panel) frameworks ($y_i$: output, $x_i$: input, $\phi(x_i)$: the maximum output that can be obtained from the input $x_i$, $\epsilon_i$: deviation from the production frontier function $\phi(x_i)$, $u_i$: technical inefficiency, $v_i$: noise). Note that the technical inefficiency $u_i$ is a nonnegative random variable with unknown distribution.
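Table 1. Type I error and power of the single input case with equal and unequal error variances. Type I error is the rejection rate when $\beta_0 = 0$; power is the rejection rate when $\beta_0 = 0.3305$.

| Variances | n | Type I: PT | Type I: OLS | Type I: T | Type I: MW | Type I: KS | Type I: F | Power: PT | Power: OLS | Power: T | Power: MW | Power: KS | Power: F |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Equal ($\sigma_{1,v} = \sigma_{2,v}$) | 100 | 0.052 | 0.050 | 0.050 | 0.050 | 0.037 | 0.013 | 0.377 | 0.336 | 0.320 | 0.283 | 0.200 | 0.152 |
| | 200 | 0.062 | 0.058 | 0.056 | 0.062 | 0.047 | 0.006 | 0.595 | 0.555 | 0.547 | 0.483 | 0.397 | 0.300 |
| | 400 | 0.063 | 0.064 | 0.062 | 0.061 | 0.052 | 0.003 | 0.848 | 0.822 | 0.818 | 0.761 | 0.665 | 0.533 |
| Unequal ($\sigma_{1,v} \neq \sigma_{2,v}$) | 100 | 0.047 | 0.065 | 0.060 | 0.050 | 0.062 | 0.037 | 0.337 | 0.337 | 0.328 | 0.276 | 0.277 | 0.279 |
| | 200 | 0.053 | 0.064 | 0.063 | 0.052 | 0.099 | 0.029 | 0.505 | 0.468 | 0.463 | 0.388 | 0.521 | 0.401 |
| | 400 | 0.044 | 0.051 | 0.050 | 0.048 | 0.179 | 0.026 | 0.738 | 0.722 | 0.712 | 0.645 | 0.836 | 0.650 |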
Table 2. Type I error and power of the multiple input case with equal and unequal error variances when the true production function is not additive. Type I error is the rejection rate when $\beta_0 = 0$; power is the rejection rate when $\beta_0 = 0.3305$.

| Variances | n | Type I: PT(a) | Type I: PT(n) | Type I: OLS | Type I: T | Type I: MW | Type I: KS | Type I: F | Power: PT(a) | Power: PT(n) | Power: OLS | Power: T | Power: MW | Power: KS | Power: F |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Equal ($\sigma_{1,v} = \sigma_{2,v}$) | 100 | 0.066 | 0.053 | 0.044 | 0.043 | 0.054 | 0.040 | 0.141 | 0.220 | 0.294 | 0.234 | 0.235 | 0.170 | 0.133 | 0.492 |
| | 200 | 0.059 | 0.077 | 0.049 | 0.050 | 0.044 | 0.035 | 0.113 | 0.328 | 0.488 | 0.473 | 0.471 | 0.356 | 0.268 | 0.644 |
| | 400 | 0.048 | 0.056 | 0.039 | 0.039 | 0.037 | 0.031 | 0.064 | 0.509 | 0.765 | 0.705 | 0.706 | 0.580 | 0.517 | 0.815 |
| Unequal ($\sigma_{1,v} \neq \sigma_{2,v}$) | 100 | 0.061 | 0.063 | 0.101 | 0.099 | 0.054 | 0.046 | 0.316 | 0.199 | 0.258 | 0.346 | 0.340 | 0.173 | 0.143 | 0.649 |
| | 200 | 0.058 | 0.074 | 0.139 | 0.137 | 0.062 | 0.053 | 0.381 | 0.302 | 0.433 | 0.583 | 0.580 | 0.347 | 0.331 | 0.849 |
| | 400 | 0.048 | 0.060 | 0.128 | 0.130 | 0.046 | 0.093 | 0.385 | 0.468 | 0.682 | 0.793 | 0.792 | 0.530 | 0.650 | 0.947 |
Table 3. Type I error and power of the multiple input case with equal and unequal error variances when the true production function is additive. Type I error is the rejection rate when $\beta_0 = 0$; power is the rejection rate when $\beta_0 = 0.3305$.

| Variances | n | Type I: PT(a) | Type I: PT(n) | Type I: OLS | Type I: T | Type I: MW | Type I: KS | Type I: F | Power: PT(a) | Power: PT(n) | Power: OLS | Power: T | Power: MW | Power: KS | Power: F |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Equal ($\sigma_{1,v} = \sigma_{2,v}$) | 100 | 0.054 | 0.072 | 0.046 | 0.045 | 0.041 | 0.034 | 0.065 | 0.364 | 0.328 | 0.213 | 0.208 | 0.170 | 0.132 | 0.279 |
| | 200 | 0.066 | 0.060 | 0.063 | 0.061 | 0.056 | 0.045 | 0.045 | 0.578 | 0.551 | 0.387 | 0.384 | 0.318 | 0.258 | 0.403 |
| | 400 | 0.054 | 0.055 | 0.034 | 0.034 | 0.033 | 0.033 | 0.018 | 0.820 | 0.801 | 0.572 | 0.571 | 0.503 | 0.419 | 0.517 |
| Unequal ($\sigma_{1,v} \neq \sigma_{2,v}$) | 100 | 0.057 | 0.070 | 0.066 | 0.062 | 0.044 | 0.048 | 0.144 | 0.301 | 0.295 | 0.236 | 0.228 | 0.153 | 0.121 | 0.381 |
| | 200 | 0.068 | 0.060 | 0.089 | 0.089 | 0.054 | 0.060 | 0.137 | 0.501 | 0.481 | 0.415 | 0.414 | 0.281 | 0.290 | 0.566 |
| | 400 | 0.057 | 0.054 | 0.058 | 0.058 | 0.032 | 0.056 | 0.096 | 0.719 | 0.695 | 0.588 | 0.586 | 0.446 | 0.535 | 0.720 |
Table 4. Summary statistics of our PISA 2015 data.

| Group | Variable | min | $Q_1$ | median | mean | $Q_3$ | max |
|---|---|---|---|---|---|---|---|
| male | $X_i$ | 27.89 | 39.52 | 41.72 | 43.10 | 47.81 | 56.70 |
| | $Y_i$ | 338.5 | 470.6 | 499.7 | 483.9 | 513.8 | 565.6 |
| female | $X_i$ | 25.23 | 38.99 | 41.49 | 41.98 | 45.26 | 56.67 |
| | $Y_i$ | 339.0 | 456.9 | 487.9 | 474.8 | 501.9 | 565.0 |
Table 5. p-values of the six tests to detect an efficiency difference between the groups of male and female students in terms of learning time and achievement in mathematics.

| Test | PT | OLS | T | MW | KS | F |
|---|---|---|---|---|---|---|
| p-value | 0.049 | 0.150 | 0.150 | 0.044 | 0.057 | 0.322 |

Share and Cite

MDPI and ACS Style

Noh, H.; Yang, S.J. Comparing Groups of Decision-Making Units in Efficiency Based on Semiparametric Regression. Mathematics 2020, 8, 233. https://doi.org/10.3390/math8020233
