A Simultaneous Stochastic Frontier Model with Dependent Error Components and Dependent Composite Errors: An Application to Chinese Banking Industry

Abstract: The paper develops a simultaneous equations stochastic frontier model (SFM) in which the random noise and inefficiency components of each equation are dependent and the composite errors are dependent across all equations, with both kinds of dependence modelled by copula functions. First, the feasibility of the model was verified via two simulation studies. The model was then applied to assess the cost efficiency and market power of the Chinese banking industry using panel data on 37 banks covering the period 2013–2018. The results confirmed that our simultaneous SFM with dependent random noise and inefficiency components outperformed both its predecessor, a simultaneous SFM with dependent composite errors but independent random noise and inefficiency components in the individual SFMs, and the conventional single-equation SFM. Apart from the statistical and computational superiority of the model, we also find that Chinese banks in general have a high level of cost efficiency and that competition in the Chinese banking industry exists mainly among state-owned banks and joint-stock banks. Both economies of scale and diseconomies of scale were found across different banks. Also, the state-owned banks adopted the most sophisticated technologies, which allowed them to operate with the highest level of cost efficiency.


Introduction
The stochastic frontier model (SFM) reflects the functional relationship between inputs and outputs produced by enterprises under a given level of technology. For a production function, the SFM is used to calculate the efficiency of the production technology by measuring the gap between actual output and the maximum potential output given technology and input levels. For a cost function, the cost efficiency of an enterprise can be calculated by comparing actual cost with the potential minimum cost given technology, input and output prices. The conventional SFM decomposes the composite error into two components, statistical noise and inefficiency, thereby separating inefficiency from external random shocks or measurement errors and thus avoiding overestimation of inefficiency [1]. Given this advantage, the SFM has been applied in a wide variety of research fields to estimate the efficiency of firms and/or economic agents, including the banking and finance sectors [2][3][4], the agricultural production sector [5,6], and the energy industry and environmental performance [7,8]. In recent decades, estimation of the SFM has been extended through semi-parametric and nonparametric approaches, various assumptions regarding the marginal distributions, and so on. However, these extensions of the single-equation (univariate) SFM cannot deal with multiple outputs. Moreover, neglecting dependence among multiple outputs may lead to biased estimates of technical efficiency [9]. Therefore, when there are multiple and potentially correlated outputs, it is meaningful to apply simultaneous SFMs.
The existing literature extends the single-equation SFM to handle multiple outputs, or constructs multiple SFMs, in several ways. For example, Fernández et al. [10] combined a parametric aggregator of outputs with the single-equation SFM to deal with multiple outputs. This transformation offers one way to handle multi-output problems, but it is still constrained by the single-equation model, and the aggregation of multiple outputs can be an issue in its own right. Later, Ferreira and Steel [11] applied a multivariate skewed distribution to model the skewness of the composed error terms in multi-output equations of stochastic production frontiers. Carta and Steel [12] first proposed a multi-output SFM using copula functions to link the inefficiency error terms. Afterwards, Lai and Huang [9] proposed a multiple SFM with correlated composite errors using copula functions and used maximum likelihood estimation to draw inference from the model. They showed that omitting the dependence between composite errors could result in severely biased estimates of technical efficiency. Subsequently, Huang et al. [13] applied the copula-based simultaneous SFM to measure cost efficiency and the Lerner index for Russia, the Czech Republic, Poland, and other countries. In addition, Huang et al. [14] employed the approach to measure competition, innovation and efficiency in Taiwan's banking industry.
The literature reviewed above shows that scholars have successfully extended single-equation SFMs to simultaneous SFMs. These studies focused on modelling dependence between either the composite errors or the inefficiency terms of the equations in simultaneous SFMs, and all of these models were shown to be more effective than the single-equation SFM. However, as in the case of the conventional SFM, simultaneous SFMs were also developed under the assumption that the statistical noise and inefficiency components are independent. In fact, dependence between the two error components of the SFM should not be ignored. The logic behind this argument is that correlation between statistical noise and inefficiency may arise because factors beyond the control of firms can also affect their efficiency.
Some studies have shown that relaxing the restrictive independence assumption between the statistical noise and inefficiency components can remarkably improve the performance of the conventional SFM. For example, Bandyopadhyay and Das [15] developed an SFM in which the error components were jointly distributed as a truncated bivariate normal, under the condition that the distribution of the observational error is negatively skewed. Soon after, Smith [16] proposed a copula-based SFM with dependent error components, relaxing the independence assumption between statistical noise and inefficiency. Following that, many scholars have applied the copula-based SFM to assess the efficiency of decision-making units in various fields, for example to analyze the technical efficiency of Moroccan municipalities [17] and of intercrop coffee production in northern Thailand [18]. Further, Sriboonchitta et al. [19] proposed a double-copula SFM with sample selection, extending the standard SFM with sample selection by modelling dependent error components using copula functions. Thus, it is important to allow for dependence not only between composite errors across multiple equations but also between the statistical noise (random error) and inefficiency components in the simultaneous SFM framework. In view of this, we propose a simultaneous SFM with dependent error components in each equation as well as correlated composite errors across equations, which is not currently available in the existing literature. We apply our proposed model to an empirical panel data set of 37 Chinese banks covering the period 2013-2018 to measure the market power, cost efficiency, and technology gap ratio of the Chinese banking industry.
Literature abounds on the efficiency of the Chinese banking industry, examining cost, production and/or profit efficiency from different perspectives. Among these, some studies investigated the interrelationship between efficiency and market power (competition) of Chinese banks, but with contrasting evidence. For instance, Lin et al. [20] found that competition from foreign banks promotes the efficiency of domestic banks, while Fungáčová et al. [21] concluded that increased competition has no significant relationship with the efficiency of Chinese banks. Other studies mainly focused on the influence of financial reforms [22], risk preference [23,24] and bank ownership types [25,26] on the efficiency of Chinese banks. The evidence showed that financial reforms and risk taking were both key determinants of the efficiency of Chinese banks. Nevertheless, the conclusions on the relationship between efficiency and bank ownership type are rather mixed. For instance, Berger et al. [25] and Fungáčová et al. [27] provided evidence that state-owned banks had the lowest efficiency compared with other banks, while Chen et al. [28] argued that Chinese state-owned banks were more efficient than other bank groups. In addition, Wang et al. [29] concluded that there is no significant difference in efficiency between banks of different ownership types.
Most of these studies, along with others, mainly applied the SFM and/or data envelopment analysis (DEA) and their extensions to analyze the efficiency of Chinese banks. Chen et al. [26], Jiang and He [30] and Zhu et al. [24] measured the efficiency of Chinese banks within the DEA framework, combining DEA with support vector machines, the Malmquist index, and multi-directional efficiency analysis, respectively. The SFM has also been widely applied and extended to estimate the efficiency of the Chinese banking industry, for example by Lin et al. [20], Yin et al. [31] and Fungáčová et al. [27]. In addition, Silva et al. [32] compared stochastic frontier analysis (SFA) and DEA approaches to the efficiency analysis of Chinese banks and concluded that SFA and DEA provided consistent results for the industry overall but not for individual banks. Previous research has provided a systematic and comprehensive analysis of the efficiency of the Chinese banking industry with respect to different research interests and methodologies. However, the application of simultaneous SFMs to the Chinese banking industry is quite limited. Huang et al. [33] developed a stochastic network model to assess the efficiency of Chinese banks under multistage production processes, with the help of copula methods. To the best of our knowledge, there is no research applying a simultaneous SFM with dependent error components to analyze the efficiency of the Chinese banking industry.
Therefore, the contribution of this study to the literature on SFM developments and applications is threefold. First, we develop a simultaneous SFM with dependent error components and dependent composite errors to measure efficiency, which not only allows for dependence between the composite errors of seemingly unrelated stochastic frontier functions, but also captures the dependence between the random noise and inefficiency components within each equation. This is the novelty of our model, which circumvents the limitations of the existing literature on simultaneous SFMs. Second, we verify the reasonableness and feasibility of relaxing the restrictive assumption of independence between the statistical noise and inefficiency components in the individual SFM and the simultaneous SFM by conducting two simulation studies, thereby providing evidence about the consequences of ignoring correlation between the error components. Third, our model is applied to an empirical panel data set of 37 Chinese banks covering the period 2013-2018 to measure the market power, cost efficiency, scale economies, and technology gap ratio of the Chinese banking industry, which in turn contributes to the relatively limited literature on such analysis of Chinese banks in recent years.
The remaining sections of this paper are arranged as follows. In Section 2, we present the basic theory of copula functions and the copula-based SFM, and then establish the simultaneous SFM with dependent error components. Section 3 describes the detailed process and main findings of the two simulation studies. In Section 4, we apply our proposed model to panel data on Chinese banks and summarize the empirical results. Section 5 concludes.

Methodology
In this section, the basic concepts of copulas are introduced first. Then, we review the theoretical foundations of the copula-based SFM. On this basis, we propose a simultaneous SFM with dependent error components, which not only allows for dependence between the composite errors of two stochastic frontier functions but also captures the dependence between the random noise and inefficiency components within each equation.

Copula Functions
The concept of the copula originated from Sklar's theorem. A copula joins the univariate distribution functions of random variables to form a multivariate (joint) distribution function that describes the dependence structure among the variables [34,35]. Kreinovich et al. [36] mentioned that the copula is the most efficient way of representing multidimensional distributions and, thus, has been successfully used in many applications in statistics. A bivariate copula is a cumulative distribution function (CDF) of two random variables with uniform margins on $[0, 1]$ and support contained in $[0, 1]^2$ [37]. A copula function can be expressed in terms of a joint distribution function $H$ of two random variables $X$ and $Y$, such that
$$H(x, y) = C(F(x), G(y)) \quad (1)$$
and
$$C(u_1, u_2) = H(F^{-1}(u_1), G^{-1}(u_2)), \quad (2)$$
where $C(\cdot,\cdot)$ is the copula function, $u_1, u_2 \in [0, 1]$ are the uniform margins, $F(\cdot)$ and $G(\cdot)$ are the continuous marginal distribution functions of $X$ and $Y$, and $F^{-1}(\cdot)$ and $G^{-1}(\cdot)$ are the corresponding quantile functions [38][39][40].
The joint probability density function (PDF) of $X$ and $Y$ is factorized as
$$h(x, y) = f(x)\, g(y)\, c(F(x), G(y)), \quad (3)$$
where $f(x)$ and $g(y)$ are the marginal densities of $X$ and $Y$ and $c(\cdot,\cdot)$ is the PDF of the copula distribution. Many copula families have been developed to model dependence between variables, and different copula families model dependence in different ways. Elliptical copulas and Archimedean copulas are two parametric copula families. Elliptical copulas, such as the Gaussian copula and the Student's t copula, do not have closed-form expressions and are radially symmetric. In contrast, Archimedean copulas, including the Frank, Gumbel, Clayton, and Joe copulas, admit explicit formulas and have simpler forms. Detailed expressions of the commonly used copula families can be found in Sriboonchitta et al. [41] and Wiboonpongse et al. [18].
Different copulas have different parameter ranges, so the degree of dependence modelled by different copulas cannot be compared directly through the values of the copula parameters. Instead, we may extract Kendall's tau coefficient from the copula functions to compare correlations. Kendall's tau is the difference between the probability of concordance and the probability of discordance of two independent pairs of random variables [42]. Kendall's tau ($\tau$) for the random vector $(X, Y)^{\top}$ is defined as
$$\tau = P[(X - X')(Y - Y') > 0] - P[(X - X')(Y - Y') < 0], \quad (4)$$
where $(X', Y')^{\top}$ is an independent copy of $(X, Y)^{\top}$. Kendall's tau is expressible in terms of a copula function:
$$\tau = 4 \int_0^1 \int_0^1 C(u_1, u_2)\, dC(u_1, u_2) - 1, \quad (5)$$
where $u_1$ and $u_2$ are values of the uniform margins.
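To illustrate why copula parameters are compared through Kendall's tau rather than directly, the sketch below evaluates the standard closed-form expressions of tau for three families. The function names and the parameter values are illustrative assumptions and do not come from the paper.

```python
import math

def kendall_tau_gaussian(rho):
    """Kendall's tau implied by a Gaussian copula with correlation rho."""
    return 2.0 / math.pi * math.asin(rho)

def kendall_tau_clayton(theta):
    """Kendall's tau implied by a Clayton copula with parameter theta > 0."""
    return theta / (theta + 2.0)

def kendall_tau_gumbel(theta):
    """Kendall's tau implied by a Gumbel copula with parameter theta >= 1."""
    return 1.0 - 1.0 / theta

# Illustrative parameter values only: the parameters live on different
# scales, but tau puts the implied dependence on one common scale.
tau_gaussian = kendall_tau_gaussian(0.7)
tau_clayton = kendall_tau_clayton(2.0)
tau_gumbel = kendall_tau_gumbel(2.0)
```

Here a Clayton parameter of 2 and a Gumbel parameter of 2 imply the same tau, while a Gaussian parameter of 0.7 implies a slightly lower one, even though 0.7 and 2 are not comparable as raw numbers.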

Copula-Based Stochastic Frontier Model
An SFM breaks down the composite error into two components: a normally distributed random error term $w$, which takes into account uncontrollable exogenous factors, and a non-negative error term $u$, which represents a firm's technical inefficiency [43]. The two error components are assumed to be independent in the conventional SFM. Nevertheless, this assumption of independence can be relaxed by applying copula functions to model the dependence between $w$ and $u$ [16,18], which is the basis of the so-called copula-based SFM. The basic form of a copula-based SFM is given by
$$y = x^{\top}\beta + \varepsilon,$$
with
$$\varepsilon = w - u$$
and
$$H(w, u) = C(F_W(w), F_U(u); \theta), \quad (7)$$
where the output $y$ is positively valued, $x$ ($k \times 1$) is a vector of regressors, and $\beta$ ($k \times 1$) is a vector of unknown parameters. The composite error $\varepsilon$ contains two components: a symmetric noise $w$, which is typically assumed to be normally distributed, $w \sim N(0, \sigma_w^2)$, and a non-negative inefficiency term $u$, which is usually assumed to follow a gamma, half-normal or exponential distribution. For a stochastic production function, the error term has the specification $\varepsilon = w - u$ while, for a stochastic frontier cost function, it is specified as $\varepsilon = w + u$ instead [44]. $H(w, u)$ is the joint CDF of $w$ and $u$, modelled by a copula function $C(\cdot,\cdot)$ with copula parameter $\theta$, and $F_W(w)$ and $F_U(u)$ represent the marginal CDFs of $w$ and $u$, respectively.
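The copula-based SFM reduces to the conventional SFM when $w$ and $u$ are independent, such that
$$H(w, u) = F_W(w)\, F_U(u),$$
while, if $w$ and $u$ are dependent, the copula-based SFM is also referred to as the SFM with dependent error components. The joint PDF of $(w, u)$ is expressed by
$$h(w, u) = f_W(w)\, f_U(u)\, c_\theta(F_W(w), F_U(u)),$$
where $f_W(w)$ and $f_U(u)$ denote the PDFs of $w$ and $u$, and $c_\theta(\cdot,\cdot)$ is the density of the copula. The likelihood function of the copula-based SFM for a sample of $N$ observations is written as
$$L(\beta, \sigma_w, \sigma_u, \theta) = \prod_{i=1}^{N} f_\varepsilon(\varepsilon_i) = \prod_{i=1}^{N} f_\varepsilon(y_i - x_i^{\top}\beta),$$
where $f_\varepsilon(\varepsilon)$ is the density function of $\varepsilon$, which can be obtained by the following steps [19]. First, the joint density $h(w, u)$ can be obtained using Equations (3) and (7). Transforming $(w, u)$ to $(u, \varepsilon)$, with $w = \varepsilon + u$ for a production frontier, we get the density function of $(u, \varepsilon)$ [16]:
$$h(u, \varepsilon) = f_U(u)\, f_W(\varepsilon + u)\, c_\theta(F_W(\varepsilon + u), F_U(u)).$$
Second, marginalizing out $u$, we obtain the density function of $\varepsilon$ as
$$f_\varepsilon(\varepsilon) = \int_0^{\infty} h(u, \varepsilon)\, du$$
or
$$f_\varepsilon(\varepsilon) = E_U\left[ f_W(\varepsilon + u)\, c_\theta(F_W(\varepsilon + u), F_U(u)) \right],$$
where $E_U[\cdot]$ represents the expectation with respect to $u$. Then, $f_\varepsilon(\varepsilon)$ can be approximated by
$$f_\varepsilon(\varepsilon) \approx \frac{1}{R} \sum_{r=1}^{R} f_W(\varepsilon + u_r)\, c_\theta(F_W(\varepsilon + u_r), F_U(u_r)),$$
where $u_r$, $r = 1, \dots, R$, is a sequence of $R$ random draws from the distribution of $u$, obtained for example by scaling draws from a standard half-normal distribution by $\sigma_u$. The technical efficiency $TE$ can be derived as
$$TE = E[\exp(-u) \mid \varepsilon] = \frac{1}{f_\varepsilon(\varepsilon)} \int_0^{\infty} \exp(-u)\, h(u, \varepsilon)\, du.$$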
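The simulation step can be sketched in a few lines. The sketch below assumes a production frontier (so that the noise equals the composite error plus the inefficiency), a half-normal inefficiency term and a Gaussian copula; it replaces random draws with a deterministic midpoint grid on (0, 1) so that the result is reproducible. All function names are hypothetical, and the closed-form density under independence is included only as a sanity check.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()

def gaussian_copula_density(z1, z2, rho):
    """Gaussian copula density evaluated at normal scores z = Phi^{-1}(F(.))."""
    one_minus = 1.0 - rho * rho
    quad = rho * rho * (z1 * z1 + z2 * z2) - 2.0 * rho * z1 * z2
    return math.exp(-quad / (2.0 * one_minus)) / math.sqrt(one_minus)

def simulated_density(eps, sigma_w, sigma_u, rho, n_draws=2000):
    """Approximate the density of eps = w - u by averaging over draws of u."""
    total = 0.0
    for r in range(n_draws):
        p = (r + 0.5) / n_draws                            # value of F_U(u_r) on a midpoint grid
        u = sigma_u * STD_NORMAL.inv_cdf((1.0 + p) / 2.0)  # half-normal quantile
        w = eps + u                                        # production frontier: w = eps + u
        z_w = w / sigma_w                                  # normal score of F_W(w)
        z_u = STD_NORMAL.inv_cdf(p)                        # normal score of F_U(u_r)
        f_w = STD_NORMAL.pdf(z_w) / sigma_w                # normal density of the noise
        total += f_w * gaussian_copula_density(z_w, z_u, rho)
    return total / n_draws

def independent_density(eps, sigma_w, sigma_u):
    """Closed-form normal/half-normal density of eps = w - u under independence."""
    sigma = math.hypot(sigma_w, sigma_u)
    lam = sigma_u / sigma_w
    return 2.0 / sigma * STD_NORMAL.pdf(eps / sigma) * STD_NORMAL.cdf(-eps * lam / sigma)
```

With the copula parameter set to zero the copula density equals one, so `simulated_density` should agree with `independent_density` up to approximation error; a non-zero parameter reweights each draw by the copula density.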

Simultaneous Stochastic Frontier Model with Dependent Error Components
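In the simultaneous SFM, which has also been referred to as seemingly unrelated stochastic frontier regressions, the stochastic frontier functions are estimated simultaneously. In this subsection, we propose a copula-based simultaneous SFM which allows for dependence between the random noise and inefficiency components of each stochastic frontier function, as well as dependence between the composite errors of the two equations. For short, we refer to the proposed model as the simultaneous SFM with dependent error components, to distinguish it from the copula-based simultaneous SFM first introduced by Lai and Huang [9] and further developed by Huang et al. [13,14] (hereafter referred to as the simultaneous SFM with dependent composite errors). The basic form of the simultaneous SFM with dependent error components is expressed by
$$y_1 = x_1^{\top}\beta_1 + \varepsilon_1, \qquad y_2 = x_2^{\top}\beta_2 + \varepsilon_2,$$
where each composite error $\varepsilon_j$, $j = 1, 2$, combines a noise term $w_j$ and an inefficiency term $u_j$ as in Section 2.2, and the dependence of the error terms can be modelled by three copula functions, such that
$$H_1(w_1, u_1) = C_1(F_{W_1}(w_1), F_{U_1}(u_1); \theta_1),$$
$$H_2(w_2, u_2) = C_2(F_{W_2}(w_2), F_{U_2}(u_2); \theta_2),$$
$$H_{12}(\varepsilon_1, \varepsilon_2) = C_{12}(F_{\varepsilon_1}(\varepsilon_1), F_{\varepsilon_2}(\varepsilon_2); \theta_{12}).$$
Similar to the copula-based SFM, the noise term $w_j \sim N(0, \sigma_{w_j}^2)$, $j = 1, 2$, is usually assumed to be normally distributed, while the inefficiency term $u_j$, $j = 1, 2$, could follow a half-normal, exponential or gamma distribution. $F_W(\cdot)$, $F_U(\cdot)$, and $F_\varepsilon(\cdot)$ are the CDFs of $w$, $u$, and $\varepsilon$, respectively. The parameters $\theta_1$, $\theta_2$, and $\theta_{12}$ are the parameters of the three copulas $C_1(\cdot,\cdot)$, $C_2(\cdot,\cdot)$, and $C_{12}(\cdot,\cdot)$, which model the dependence between $w_1$ and $u_1$, $w_2$ and $u_2$, and $\varepsilon_1$ and $\varepsilon_2$, respectively. The simultaneous SFM with dependent composite errors, proposed by Lai and Huang [9], can be regarded as a special case of the simultaneous SFM with dependent error components in which the error components are assumed to be independent, such that
$$H_j(w_j, u_j) = F_{W_j}(w_j)\, F_{U_j}(u_j), \quad j = 1, 2.$$
The likelihood function of the simultaneous SFM with dependent error components can be written as
$$L(\Omega) = \prod_{i=1}^{N} c_{12}\left(F_{\varepsilon_1}(\varepsilon_{1i}), F_{\varepsilon_2}(\varepsilon_{2i}); \theta_{12}\right) f_{\varepsilon_1}(\varepsilon_{1i})\, f_{\varepsilon_2}(\varepsilon_{2i}),$$
with
$$f_{\varepsilon_j}(\varepsilon_j) \approx \frac{1}{R} \sum_{r=1}^{R} f_{W_j}(w_{jr})\, c_j\left(F_{W_j}(w_{jr}), F_{U_j}(u_{jr}); \theta_j\right),$$
where $w_{jr}$ is the noise value implied by $\varepsilon_j$ and $u_{jr}$ ($w_{jr} = \varepsilon_j + u_{jr}$ for a production frontier and $w_{jr} = \varepsilon_j - u_{jr}$ for a cost frontier), $\Omega$ denotes the total possible parameter space, and $u_{jr}$, $j = 1, 2$ and $r = 1, \dots, R$, is a sequence of $R$ random draws from a specific distribution. Therefore, the log-likelihood function of the simultaneous SFM with dependent error components can be expressed by
$$\ln L(\Omega) = \sum_{i=1}^{N} \left[ \ln c_{12}\left(F_{\varepsilon_1}(\varepsilon_{1i}), F_{\varepsilon_2}(\varepsilon_{2i}); \theta_{12}\right) + \ln f_{\varepsilon_1}(\varepsilon_{1i}) + \ln f_{\varepsilon_2}(\varepsilon_{2i}) \right].$$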

Simulation Study
The major advantage of simulation studies is that they help evaluate the behavior of statistical models, as some "truth" is known from the data-generating process. This helps us to compare the performance and quality of one model against competing methods [45]. To check the reasonableness of our proposed simultaneous SFM with dependent error components, we perform two simulation experiments in this section. In the first simulation experiment, we compare the performance of the copula-based SFM with that of the conventional SFM under the "truth" that the error components are correlated and the inefficiency terms are known. The second simulation compares the performance of our proposed simultaneous SFM with dependent error components with that of the simultaneous SFM with dependent composite errors of Huang et al. [13,14].

Comparative Study of Copula-Based SFM and Conventional SFM
We conducted a simulation experiment to compare the performance of the copula-based SFM introduced in Section 2.2 with that of the conventional SFM, under the assumption that the error components were dependent. Our simulation was based on a simple SFM with a single explanatory variable, expressed by
$$y = \beta x + \varepsilon, \quad (29)$$
where $y$ is the output vector, $x$ is the vector of the single explanatory variable, and $\beta$ is the unknown parameter to be estimated. The composite error is expressed as $\varepsilon = w + u$, where the error component $w$ represents the statistical noise, while $u$ stands for the inefficiency term. Here the marginal distribution of $u$ was assumed to be half-normal, $|N(0, \sigma_u^2)|$, while the random noise term $w$ was assumed to be normally distributed, $N(0, \sigma_w^2)$. We assumed $w$ and $u$ to be dependent in the true model.
The data-generating mechanism consisted of the following steps:
• Set up the values of the parameters. To generate a simulated data set, the true parameters of the SFM were fixed at $\beta = 10$, $\sigma_u = 0.7$, and $\sigma_w = 0.7$. We chose the Gaussian copula to model the dependence between $w$ and $u$, with the copula parameter set to $\theta = 0.7$.
• Simulate the distributions of $u$ and $w$. We first simulated the distribution of $u$ ($F_U$) by generating a sequence of 1000 random draws from the Halton sequence. Then, the conditional distribution of $w$ given $u$ ($F_{W|U}$) was simulated from the Gaussian copula using the "BiCopCondSim" function in the R software.
• Obtain simulated data of $u$ and $w$ from their simulated distributions. The inefficiency term $u$ was generated by computing the inverse of the half-normal distribution $|N(0, \sigma_u^2)|$ at the distribution values $F_U$ obtained in the last step; the statistical noise term $w$ was computed as the inverse of the normal distribution $N(0, \sigma_w^2)$ at the distribution values $F_{W|U}$. The composite errors were then computed as $\varepsilon = w + u$.
• Simulate data of the variables $x$ and $y$. The explanatory variable $x$ was generated from uniform random numbers on the interval $[0, 1]$, while the dependent variable $y$ was generated according to Equation (29).
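The steps above can be sketched as follows. This is a minimal illustration rather than the original procedure: it uses Python's pseudo-random generator in place of the Halton sequence, and it samples the Gaussian copula through the conditional normal formula instead of the "BiCopCondSim" function. The function name `simulate_dataset` is hypothetical.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()

def simulate_dataset(n, beta, sigma_u, sigma_w, rho, rng):
    """Draw n pairs (x, y) from y = beta * x + w + u with Gaussian-copula dependence."""
    data = []
    for _ in range(n):
        # Distribution value of u, kept strictly inside (0, 1) for the inverse CDF.
        p_u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        z_u = STD_NORMAL.inv_cdf(p_u)
        # Normal score of w conditional on u under a Gaussian copula.
        z_w = rho * z_u + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
        # Invert the marginals: half-normal quantile for u, normal quantile for w.
        u = sigma_u * STD_NORMAL.inv_cdf((1.0 + p_u) / 2.0)
        w = sigma_w * z_w
        # Regressor on [0, 1) and dependent variable from the frontier equation.
        x = rng.random()
        data.append((x, beta * x + w + u))
    return data

# Parameter values follow the first bullet; the seed is arbitrary.
sample = simulate_dataset(200, beta=10.0, sigma_u=0.7, sigma_w=0.7, rho=0.7,
                          rng=random.Random(2020))
```

Passing the generator explicitly keeps each simulated data set reproducible, which matters when the same data are later fed to two competing estimators.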
We generated 500 data sets of size $N = 200$, based on the above process. We estimated the conventional SFM with the package "frontier" in the R software and used the estimated coefficients as the starting values of $\beta$, $\sigma_u$, and $\sigma_w$ to estimate the copula-based SFM and the conventional SFM. Next, we estimated the two models 500 times each using maximum simulated likelihood; see Greene [46] for details. The simulated log-likelihood was maximized using the Broyden-Fletcher-Goldfarb-Shanno (BFGS) algorithm in the R software. The performance of the two models was then compared by the Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC).
The true values (True) of the parameters (Para) and the summary statistics of the parameters estimated by simulation are summarized in Table 1. The estimation accuracy of the parameters from the two models was compared by the values of the Mean Absolute Error (MAE) and the Mean Absolute Percentage Error (MAPE). The overall MAE and MAPE for the parameters $\beta$, $\sigma_u$, $\sigma_w$, and $\theta$ from the copula-based SFM are 0.057 and 0.05, which are lower than the overall MAE (0.316) and MAPE (0.428) from the conventional SFM. In particular, the MAE and MAPE of $\sigma_u$ estimated from the copula-based SFM were clearly lower than those from the conventional SFM, which indicates that $\sigma_u$ was estimated much more accurately by the copula-based SFM than by the conventional SFM. Furthermore, the average AIC and BIC of the copula-based SFM were both lower than those of the conventional SFM, implying that the copula-based SFM outperformed the conventional SFM and gave a better fit to the data. From the simulation study, the estimated parameters $\beta$, $\sigma_u$, $\sigma_w$, and $\theta$ from the copula-based SFM were found to be closer to their true values, with mean values of 9.993, 0.698, 0.700, and 0.700, respectively. It is remarkable that the mean (0.698) and median (0.703) of $\sigma_u$ from the copula-based SFM were much closer to the true value (0.7) than the mean (1.566) and median (1.563) of $\sigma_u$ from the conventional SFM. Moreover, the parameter $\sigma_w$ was also estimated more precisely by the copula-based SFM than by the conventional SFM. Therefore, ignoring dependence between the error components $w$ and $u$ in the conventional SFM may lead to biased estimates, as was also established by Smith [16], Sriboonchitta et al. [41] and Wiboonpongse et al. [18].
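For reference, the two accuracy measures can be computed as in the sketch below. The estimates used are made-up numbers for illustration, not values from Table 1, and MAPE is reported as a fraction rather than a percentage, which is an assumption about the scaling used in the paper.

```python
def mean_absolute_error(estimates, true_value):
    """Average absolute deviation of the estimates from the true value."""
    return sum(abs(e - true_value) for e in estimates) / len(estimates)

def mean_absolute_percentage_error(estimates, true_value):
    """Average absolute deviation relative to the true value, as a fraction."""
    return sum(abs((e - true_value) / true_value) for e in estimates) / len(estimates)

# Hypothetical estimates of a parameter whose true value is 0.7.
hypothetical_estimates = [0.68, 0.71, 0.72]
mae = mean_absolute_error(hypothetical_estimates, 0.7)
mape = mean_absolute_percentage_error(hypothetical_estimates, 0.7)
```

Because MAPE divides by the true value, it allows errors on parameters of different magnitudes, such as a slope of 10 and a scale of 0.7, to be averaged into one overall figure.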
Further, Figure 1a,b plot the histograms and kernel densities of the parameters estimated by the conventional SFM and the copula-based SFM, respectively. Generally speaking, the kernel density of each parameter from the copula-based SFM fit the histogram well and was quite close to a normal distribution. However, the parameter $\sigma_u$ estimated by the conventional SFM clearly deviated from its true value, as shown in Figure 1a. In contrast, the $\sigma_u$ estimated by the copula-based SFM was much closer to its true value, as shown in Figure 1b. Therefore, the copula-based SFM outperformed the conventional SFM when the error components were correlated.

Comparative Study of Two Copula-Based Simultaneous SFMs
Our second simulation study was aimed at comparing the performance of two copula-based simultaneous SFMs: the simultaneous SFM with dependent composite errors and the simultaneous SFM with dependent error components. The simultaneous SFM in this simulation study can be expressed by
$$y_1 = \beta_0 + \beta_1 x_1 + \varepsilon_1 \quad (30)$$
and
$$y_2 = \alpha_0 + \alpha_1 x_2 + \varepsilon_2, \quad (31)$$
where $\varepsilon_1$ and $\varepsilon_2$ are composite errors made up of the noise terms $w_1$ and $w_2$ and the inefficiency terms $u_1$ and $u_2$. We allowed for dependence between the error components, such that $w_1$ was correlated with $u_1$ and $w_2$ was correlated with $u_2$. Meanwhile, the composite errors $\varepsilon_1$ and $\varepsilon_2$ were also assumed to be dependent. The inefficiency terms $u_1$ and $u_2$ were assumed to be half-normally distributed ($|N(0, \sigma_{u_j}^2)|$, $j = 1, 2$), while the noise terms $w_1$ and $w_2$ were assumed to be normally distributed ($N(0, \sigma_{w_j}^2)$, $j = 1, 2$).
The data generation process can be summarized as follows:
• Simulate values of the parameters. The coefficients $\beta_0$, $\beta_1$, $\alpha_0$, and $\alpha_1$ were generated from uniformly distributed random numbers on the interval [5, 10], the standard deviations of the inefficiency terms $u_1$ and $u_2$ were generated from uniform random numbers on the interval [2, 3], and the standard deviations of the noise terms $w_1$ and $w_2$ were simulated from uniform random numbers on the interval [0. … (30) and (31), respectively.
Following the above steps, we generated 200 data sets of size $N = 500$ for each model. We first estimated the conventional SFM with the package "frontier" in the R software, and used the estimated coefficients as the starting values of $\beta_0$, $\beta_1$, $\sigma_{u_1}$, $\sigma_{w_1}$, $\alpha_0$, $\alpha_1$, $\sigma_{u_2}$, and $\sigma_{w_2}$; the starting values of the three copula parameters $\theta_1$, $\theta_2$ and $\theta_3$ were all set at 0.5. Then, we estimated the simultaneous SFM with dependent error components and the simultaneous SFM with dependent composite errors using maximum simulated likelihood and compared the quality of the two models by AIC and BIC. The simultaneous SFM with dependent error components was the true model in this simulation study, as the data were generated under the assumption that the error components were correlated. Thus, the simultaneous SFM with dependent composite errors was mis-specified.
In Table 2, we summarize the statistics of the AIC and BIC calculated from the two copula-based simultaneous SFMs. The average AIC and BIC of the simultaneous SFM with dependent error components are both lower than those of the simultaneous SFM with dependent composite errors, with an average difference of −228.2 (−219.7) for AIC (BIC) between the two models. Further, the difference in AIC between the two models is plotted in Figure 2. In 197 of the 200 simulations, the AIC of the simultaneous SFM with dependent error components was smaller than the AIC of the simultaneous SFM with dependent composite errors. We can conclude that the mis-specified model (with larger AIC and BIC) lost more information and, thus, was of lower quality than the true model. Therefore, if the error components of the SFM are correlated, the simultaneous SFM with dependent error components provides a better fit to the data.
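As a small worked check on how the two criteria relate, the sketch below uses the standard definitions of AIC and BIC. The log-likelihood values are hypothetical, chosen only so that the AIC difference equals the reported average of −228.2; the parameter counts (11 versus 9) assume that the two models differ only by the two extra copula parameters.

```python
import math

def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln L."""
    return 2.0 * n_params - 2.0 * log_likelihood

def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion: k ln(n) - 2 ln L."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood

N_OBS = 500                      # size of each simulated data set
LL_DEPENDENT = -1000.0           # hypothetical log-likelihood, 11 parameters
LL_COMPOSITE_ONLY = -1116.1      # hypothetical log-likelihood, 9 parameters

aic_diff = aic(LL_DEPENDENT, 11) - aic(LL_COMPOSITE_ONLY, 9)
bic_diff = bic(LL_DEPENDENT, 11, N_OBS) - bic(LL_COMPOSITE_ONLY, 9, N_OBS)
penalty_gap = bic_diff - aic_diff   # equals 2 * (ln(500) - 2)
```

Under these assumptions the BIC difference comes out close to −219.8, and the gap of roughly 8.4 between the two differences is what two additional parameters cost under BIC relative to AIC at this sample size, in line with the gap of 8.5 between the reported averages.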

Brief Summary
The results of the two simulation studies confirmed the reasonableness and feasibility of allowing for possible correlation between the error components in the SFM as well as in the simultaneous SFM. The conclusions from the first simulation study challenged the ubiquitous assumption of independence between the statistical noise and inefficiency components in the conventional SFM. When the two error components were correlated, estimation by the conventional SFM led to relatively larger errors than the copula-based SFM, which modelled the dependence of the error components by copula functions. Furthermore, in 197 of the 200 simulations in the second simulation study, the AIC of the simultaneous SFM with dependent error components was lower than the AIC of the simultaneous SFM with dependent composite errors. This result strongly supports the conclusion that the simultaneous SFM with dependent error components can provide a better fit to the data than the simultaneous SFM with dependent composite errors developed by Huang et al. [13,14]. Therefore, it is advisable to relax the restrictive assumption of independence between the two error components when analyzing real economic problems using either the SFM or the simultaneous SFM.

An Application to the Chinese Banking Industry
In this section, the simultaneous SFM with dependent error components (as proposed in Section 2) is applied to a balanced panel dataset of the Chinese banking industry. We estimated the market power, cost efficiency and meta-frontier technology gap ratio of 37 Chinese commercial banks over the period 2013 to 2018. Estimation based on the simultaneous SFM with dependent composite errors and on the single-equation SFM was also carried out for comparison.

Model and Data
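In this application, we jointly estimated a cost frontier function and an output price frontier function using the simultaneous SFM with dependent error components, with total cost and output price as the dependent variables. The simultaneous SFM with dependent error components combined a translog cost function and an output price function using copulas, which can be specified as
$$\ln TC_{it} = \mathrm{TL}(y_{it}, p_{1it}, p_{2it}, p_{3it}, t) + \varepsilon_{1it} \quad (32)$$
and
$$P^{*}_{it} = MC_{it} + \varepsilon_{2it}, \quad (33)$$
where $\mathrm{TL}(\cdot)$ denotes the translog function of output, the three input prices, and the time trend, $\varepsilon_1 = w_1 + u_1$, $\varepsilon_2 = w_2 + u_2$, and
$$H_1(w_1, u_1) = C_1(F_{W_1}(w_1), F_{U_1}(u_1); \theta_1), \quad (34)$$
$$H_2(w_2, u_2) = C_2(F_{W_2}(w_2), F_{U_2}(u_2); \theta_2), \quad (35)$$
$$H_{12}(\varepsilon_1, \varepsilon_2) = C_{12}(F_{\varepsilon_1}(\varepsilon_1), F_{\varepsilon_2}(\varepsilon_2); \theta_{12}). \quad (36)$$
Equation (32) represents the translog cost function of bank $i$ at time $t$ (also referred to as the cost frontier function), where $TC$ denotes the total cost; $y$ represents the output; $p_j$ ($j = 1, 2, 3$) denote the input prices of labor, capital, and funds, respectively; and $t$ is the time trend, indicating technical change over time. The error component $w_1$ represents the random noise, while $u_1$ stands for the non-negative inefficiency term. Equation (33) is the so-called output price frontier function, where $P^{*}_{it}$ and $MC_{it}$ stand for the output price and the marginal cost of bank $i$ at time $t$, respectively [14,25]. The error component $w_2$ stands for the random noise in the price frontier, while $u_2$ is a non-negative random variable measuring the extent to which price deviates from $MC_{it}$. The implied marginal cost $MC_{it}$ is calculated by taking the partial derivative of total cost $TC_{it}$ with respect to output $y_{it}$:
$$MC_{it} = \frac{\partial TC_{it}}{\partial y_{it}} = \frac{TC_{it}}{y_{it}} \left( \frac{\partial \ln TC_{it}}{\partial \ln y_{it}} \right). \quad (37)$$
Thus, it is crucial to allow the total cost (Equation (32)) and the output price (Equation (33)) to be correlated. In the simultaneous SFM with dependent error components, the dependence between the error components $w_1$ and $u_1$, and between $w_2$ and $u_2$, was modelled by copulas, as shown in Equations (34) and (35). Moreover, the composed error terms $\varepsilon_1$ and $\varepsilon_2$ were also permitted to be dependent, following Equation (36).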
Based on the estimation results of the simultaneous SFM with dependent error components, a set of indices measuring the market power, cost efficiency, and meta-frontier technology gap ratio of banks can be derived. First, the Lerner Index (LI), a well-established measure of the market power (competition) of firms, was calculated as $E(u_2 \mid \varepsilon_2)/P^{*}_{it}$. Second, the measure of scale economies (SC) can be obtained as $\partial \ln TC / \partial \ln y$, which is the term in parentheses in Equation (37). Further, the cost efficiency (CE), technology gap ratio (TGR) and meta-frontier cost efficiency (MCE) can be obtained; for more details, see Huang et al. [13,14].
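A minimal numerical sketch of the first two indices follows. All input values are hypothetical and serve only to show how the quantities fit together: marginal cost follows from the cost elasticity of output, and the Lerner index divides the expected mark-up by the output price.

```python
def marginal_cost(total_cost, output, cost_elasticity):
    """Marginal cost as (TC / y) times the elasticity d ln TC / d ln y."""
    return total_cost / output * cost_elasticity

def lerner_index(expected_markup, price):
    """Lerner index as the expected price mark-up over marginal cost, divided by price."""
    return expected_markup / price

# Hypothetical bank-year: total cost 50, total assets 1000, cost elasticity 0.9.
scale_measure = 0.9                      # SC: elasticity of cost with respect to output
mc = marginal_cost(50.0, 1000.0, scale_measure)
# Hypothetical output price 0.06 and expected mark-up 0.015.
li = lerner_index(0.015, 0.06)
```

A cost elasticity below one, as in this example, means that cost grows less than proportionally with output, which is the usual reading of economies of scale; a value above one indicates diseconomies of scale.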
We used a balanced panel dataset of 222 observations for 37 Chinese commercial banks from 2013 to 2018, to avoid selection bias from unbalanced data or a non-random sample. Our sample included all 6 state-owned banks and 12 joint-stock banks of China. Furthermore, we selected 19 Chinese city commercial banks and rural commercial banks that were listed on the A-share or H-share stock market by the end of 2018. The total assets of the sample banks accounted for approximately 68% of the total assets of the Chinese banking industry during the sample period, which indicates that these 37 banks are representative of the Chinese banking industry. The data for all variables were gathered from the Bankscope database. Missing values were filled in manually from the annual reports of individual banks. We divided the 37 Chinese banks into four subsamples according to their ownership types: (a) state-owned commercial banks (SOCBs), (b) joint-stock commercial banks (JSCBs), (c) city commercial banks (CCBs) and (d) rural commercial banks (RCBs). The output of banks was proxied by their total assets ($Y$). The price of labor ($W_1$), price of capital ($W_2$), and price of funds ($W_3$) were calculated as the ratio of personnel expenses to number of employees, the ratio of operating expenses to total fixed assets, and the ratio of interest expenses to all types of deposits, respectively. Total cost ($TC$) was calculated as the sum of personnel expenses, operating expenses, and interest expenses. The output price $P^{*}$ was defined as the ratio of total revenue ($TR$) to output ($Y$). Following Shamshur and Weill [47] and Huang et al. [13,14], we used the consumer price index (CPI) of China to deflate all variables from nominal values into real values, with base year 2013 = 100.
The descriptive statistics of the variables are summarized in Table 3. SOCBs played the dominant role in the Chinese banking industry. Regarding bank size, SOCBs were the largest among all types of banks and were, on average, roughly four times as large as JSCBs in terms of total assets. In contrast, CCBs and RCBs were much smaller than SOCBs and JSCBs. However, the three input prices and the output price of SOCBs were the lowest compared with JSCBs, CCBs and RCBs during the sample period.
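As a rough illustration of the variable construction described above, the following sketch builds the input prices, total cost and output price for one bank-year. The class and field names, the `build_variables` helper, the dictionary keys and all numbers are hypothetical and are not taken from the Bankscope data. Monetary amounts are deflated before the ratios are formed, which is one plausible reading of the deflation step rather than a documented detail of the study.

```python
from dataclasses import dataclass


@dataclass
class BankYear:
    """Nominal figures for one bank in one year (hypothetical field names)."""
    personnel_expenses: float
    operating_expenses: float
    interest_expenses: float
    employees: float
    fixed_assets: float
    deposits: float
    total_assets: float
    total_revenue: float


def deflate(nominal: float, cpi: float, base: float = 100.0) -> float:
    """Convert a nominal amount to real terms using a CPI with base year = 100."""
    return nominal * base / cpi


def build_variables(b: BankYear, cpi: float) -> dict:
    """Return output, three input prices, total cost and output price in real terms."""
    pe = deflate(b.personnel_expenses, cpi)
    oe = deflate(b.operating_expenses, cpi)
    ie = deflate(b.interest_expenses, cpi)
    y = deflate(b.total_assets, cpi)
    tr = deflate(b.total_revenue, cpi)
    return {
        "Y": y,                                     # output: total assets
        "W1": pe / b.employees,                     # price of labor
        "W2": oe / deflate(b.fixed_assets, cpi),    # price of capital
        "W3": ie / deflate(b.deposits, cpi),        # price of funds
        "TC": pe + oe + ie,                         # total cost
        "P_star": tr / y,                           # output price
    }


# Made-up bank-year, with the CPI at its base-year value of 100.
example = BankYear(50.0, 80.0, 200.0, 10.0, 400.0, 8000.0, 10000.0, 600.0)
variables = build_variables(example, cpi=100.0)
```

Note that ratios of two monetary amounts (the price of capital, the price of funds and the output price) are unchanged by deflation, whereas the price of labor and total cost are not.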

Estimation Results
In our application, the random noise terms $v_1$ and $v_2$ were assumed to be normally distributed ($v_j \sim N(0, \sigma_{vj}^{2})$, $j = 1, 2$), while the inefficiency terms $u_1$ and $u_2$ were assumed to be either half-normally (HN) or exponentially (EX) distributed. We considered Gaussian (G) and Frank (F) copulas to model dependence between the random error and inefficiency terms. The simultaneous SFM with dependent error components was estimated by maximum simulated likelihood using 500 draws from the Halton sequence. The best-fitting model was selected according to the AIC values.
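To make the simulation step more concrete, the sketch below generates Halton draws and uses them to approximate an expectation by averaging, which is the basic device behind maximum simulated likelihood. It is only an illustration: `halton` and `simulated_mean` are hypothetical helper names, and the integrand is a toy function rather than the likelihood contribution of the model.

```python
def halton(n: int, base: int = 2) -> list:
    """Return the first n points of the one-dimensional Halton sequence."""
    points = []
    for i in range(1, n + 1):
        value, fraction, k = 0.0, 1.0 / base, i
        # Radical inverse: mirror the base-b digits of i around the radix point.
        while k > 0:
            value += (k % base) * fraction
            k //= base
            fraction /= base
        points.append(value)
    return points


def simulated_mean(func, draws) -> float:
    """Approximate E[func(U)] for U ~ Uniform(0, 1) by averaging over the draws."""
    return sum(func(u) for u in draws) / len(draws)


draws = halton(500)  # 500 draws, matching the number used in the application
approx = simulated_mean(lambda u: u * u, draws)  # the exact value is 1/3
```

In a simulated likelihood, the toy integrand would be replaced by the density of the composite error evaluated at each draw, and the log of the resulting average would be summed over observations.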
Figure 3 plots the AIC values of all considered models. As shown, the best-fitting model (GFG) is the one based on a Gaussian copula to capture dependence between $v_1$ and $u_1$, a Frank copula for dependence between $v_2$ and $u_2$, and a Gaussian copula for dependence between $\varepsilon_1$ and $\varepsilon_2$, with the lowest AIC value (−2255.7) among all considered models. It is worth mentioning that the models in which the inefficiency terms $u_1$ and $u_2$ followed an HN distribution were all superior to the models using an EX distribution, indicating that the marginal distribution of the inefficiency terms has a crucial influence on the goodness of fit, especially in our case.
Figure 3. AIC values for each copula-based simultaneous SFM (the symbol "G" stands for "Gaussian copula"; "F" represents "Frank copula"; and "HN" and "EX" represent half-normal and exponential distributions, respectively. The three copulas are listed in the order of the dependence between $v_1$ and $u_1$, $v_2$ and $u_2$, and $\varepsilon_1$ and $\varepsilon_2$).
The estimation results of the cost function from the best-fitting simultaneous SFM with dependent error components (SSFMDEC) are shown in Table 4. For comparison, we also report the results obtained by the simultaneous SFM with dependent composite errors (SSFMDCE) of Huang et al. [13,14], in which the dependence between $v_1$ and $u_1$ and between $v_2$ and $u_2$ was modelled by independence copulas. To allow a like-for-like comparison with our simultaneous SFM with dependent error components, the dependence between $\varepsilon_1$ and $\varepsilon_2$ in the SSFMDCE was also modelled by a Gaussian copula and the inefficiency terms were assumed to follow HN distributions. The estimation results of the conventional single-equation SFM are also provided for comparison.
The results yield some clear conclusions. First, most of the estimated coefficients of the cost function in SSFMDEC and SSFMDCE were significant at the 1% level, while many parameters were not significant in the conventional single-equation SFM. Thus, the two copula-based simultaneous SFMs (SSFMDEC and SSFMDCE) provided more efficient estimates than the single-equation SFM. Second, the AIC of the single-equation SFM was much larger than that of the other two models, indicating that both versions of the simultaneous SFM were preferable to the single-equation SFM. Third, the AIC of SSFMDEC was lower than the AIC of SSFMDCE. Thus, the simultaneous SFM with dependent error components outperformed the simultaneous SFM with dependent composite errors. Finally, it is important to note that the Gaussian copula parameter $\theta_1$ and the Frank copula parameter $\theta_2$ in SSFMDEC were both significant, which indicates that it is reasonable to allow dependence between the error components in the simultaneous SFM.
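The AIC-based comparison above can be sketched in a few lines. The model labels, log-likelihoods and parameter counts below are made-up placeholders, not the values behind Table 4; only the formula AIC = 2k − 2 ln L and the rule that the lowest AIC is preferred are standard.

```python
def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


# Hypothetical (log-likelihood, number of parameters) pairs for three models.
candidates = {
    "model_a": (105.0, 10),
    "model_b": (100.0, 8),
    "model_c": (60.0, 5),
}

aic_values = {name: aic(ll, k) for name, (ll, k) in candidates.items()}
best_model = min(aic_values, key=aic_values.get)
```

With these placeholder inputs the richer model is preferred despite its extra parameters, because its gain in log-likelihood outweighs the penalty of 2 per parameter.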

Various Measures of Interest
The parameters estimated by the simultaneous SFM with dependent error components were then used to compute a set of indicators of the level of competition, cost efficiency and technology gap ratio of Chinese banks: the Lerner Index (LI), scale economies (SC), cost efficiency (CE), technology gap ratio (TGR) and meta-frontier cost efficiency (MCE).

4.3.1. The Lerner Index (LI) and Scale Economies (SC)
Summary statistics of LI and SC are presented in Table 5. A bank's LI reflects the relative markup of the market output price ($P^{*}$) over marginal cost ($MC$), and is defined by $LI = (P^{*} - MC)/P^{*}$. LI is a measure of the market power of firms and can be regarded as the inverse of the level of market competition. The values of LI range from 0 (perfect competition) to 1 (pure monopoly) [48]. First, we obtained an average LI of 33.5% for the entire sample, with group means ranging from 31.1% to 38.0%. This result is close to the statistics of the Federal Reserve Economic Data (FRED), which reported an average LI of 34.9% in the banking market of China from 1997 to 2014. Compared with banks in developed countries during the same period (for example, an average LI of 26.8% for banks in the United States), Chinese banks possessed higher market power. Second, there were some discrepancies in competition levels between different types of banks. The average LI was 31.1% for SOCBs, 31.4% for JSCBs, 34.1% for CCBs, and 38.0% for RCBs. Competition was highest among SOCBs and JSCBs, which did not differ substantially from each other, followed by CCBs, whereas RCBs had the lowest competition level on average. This finding indicates that competition in the Chinese banking industry mainly exists among SOCBs and JSCBs. The reason is that RCBs and CCBs are limited to a certain territorial area, typically serving local residents and small enterprises. Compared with SOCBs and JSCBs, CCBs and RCBs have the advantages of more flexible operation modes, lower transaction costs, and better understanding of local conditions. Hence, RCBs and CCBs were found to have greater market power than SOCBs and JSCBs. Our results are in line with the findings of Tan [49] and Fungáčová [21], who documented that competition among SOCBs and JSCBs was higher than among CCBs.
With respect to economies of scale, a company operates under diminishing, constant, or increasing returns to scale if SC is greater than, equal to, or less than one, respectively [50]. The mean and median values of SC for the entire sample were greater than 1, indicating that Chinese banks operated under slightly decreasing returns to scale overall. Among them, all values of SC for SOCBs and JSCBs were greater than 1. Thus, diseconomies of scale occurred in SOCBs and JSCBs. In contrast, CCBs and RCBs achieved overall economies of scale, also known as increasing returns to scale. According to Berger and Humphrey [51], small banks can obtain scale economies by increasing their size, while further increases in size may result in diseconomies of scale after a certain point. This statement is further supported by the findings of Barros et al. [52] and Athanasoglou et al. [53]. Thus, our finding is plausible because the average sizes of SOCBs and JSCBs were much larger than those of CCBs and RCBs based on the $TC$ and $Y$ values, as shown in Table 3.
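The two measures discussed in this subsection are simple enough to write down directly. The sketch below computes the Lerner Index as the markup of the output price over marginal cost, relative to the output price, and classifies an SC value using the convention quoted above (greater than, equal to, or less than one). The function names, the tolerance and the input numbers are illustrative assumptions, not estimates from Table 5.

```python
def lerner_index(price: float, marginal_cost: float) -> float:
    """Lerner Index: (price - marginal cost) / price."""
    if price <= 0:
        raise ValueError("price must be positive")
    return (price - marginal_cost) / price


def returns_to_scale(sc: float, tol: float = 1e-6) -> str:
    """Classify SC: above one means diseconomies, below one means economies."""
    if sc > 1.0 + tol:
        return "decreasing returns to scale"
    if sc < 1.0 - tol:
        return "increasing returns to scale"
    return "constant returns to scale"


# Made-up inputs: an output price of 0.06 and a marginal cost of 0.04.
li_example = lerner_index(price=0.06, marginal_cost=0.04)
regime_large_bank = returns_to_scale(1.03)
regime_small_bank = returns_to_scale(0.97)
```

A value of zero corresponds to price equal to marginal cost (perfect competition), and the index approaches one as marginal cost becomes negligible relative to price.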
Next, we take a closer look at the dynamics of the market power of Chinese banks. Figure 4 plots the average LI for different types of banks by year. First, the increasing average LI provides evidence of enhanced market power of Chinese banks during the period 2013-2018, both as a whole and within specific groups. Second, the market power of RCBs remained the highest among all types of banks in each year, followed by CCBs, while the market power of SOCBs remained low and did not vary much over the whole period. However, the average LI of JSCBs showed some variation: the market power of JSCBs remained the lowest before 2014, then increased from 2014 to 2016 and reached the average level of the whole sample. Third, it is worth noting that the market power of JSCBs, CCBs, and RCBs all decreased in 2016, indicating increased competition among these banks. This result may be due to the tax reform implemented in China in 2016, which replaced business tax (BT) with value-added tax (VAT) in several service sectors, including the financial service sector. The tax burden of many banks and financial companies increased at the beginning of the tax reform, thereby raising operating costs for banks. Therefore, the market power of Chinese banks, especially the smaller banks, decreased in 2016.
The gaps in market power amongst banks, as measured by the difference between the maximum LI and the minimum LI within each group, are plotted in Figure 5. The difference in LI tended to fall for every type of bank, indicating that the gap in market power amongst banks has narrowed in recent years. Regarding SOCBs, the gap in market power remained relatively stable from 2013 to 2016. However, the gap has narrowed since 2016 due to the ever-increasing market power of the Postal Savings Bank of China (PSBC). The Industrial and Commercial Bank of China (ICBC) had the highest market power among all SOCBs from 2013 to 2017, whereas the market power of PSBC was the lowest in those years. The gap in market power for JSCBs remained stable, with a slightly downward trend, implying that the market power of joint stock banks was changing. The market power of the China Merchants Bank (CMBC) remained the highest every year, while the China Bohai Bank (CBHB) faced the highest level of competition in most years (except for 2017). Turning to CCBs and RCBs, the differences in LI shrank from 2013 to 2018, indicating that the gap in market power among banks within the two groups was diminishing. In addition, the banks with the highest and lowest market power changed from year to year within CCBs and RCBs.
SOCBs are the mainstay of China's commercial banks with respect to asset and loan sizes, and most state revenues and expenses are handled by state-owned banks, while the operation of JSCBs is largely attributed to the contributions of shareholders. Thus, the market power of SOCBs and JSCBs was relatively stable, so that the gap in market power within these two types of banks varied only slightly over the period under consideration. In contrast, CCBs and RCBs are mainly controlled by local governments and enterprises, and are characterized by large numbers, regional restrictions and instability. Therefore, there were more obvious changes in market power among CCBs and RCBs.
CE measures how well a bank performs relative to the "best-practice" bank under the same conditions, or how close it is to the minimum cost level [25]. We observed that the Chinese banks were operating at a very high level of cost efficiency, with an average CE value of 0.982. This result agrees with the findings of Hsiao et al. [22], which indicated that the cost efficiency of Chinese banks had been converging and getting closer to one by the end of 2012. As our sample period was from 2013 to 2018, it is unsurprising that high CEs were observed for Chinese banks. It is worth noting that the estimated CEs are not comparable among different bank groups. However, the efficiencies of banks with different ownership types can be compared based on MCE [54], which is explained in detail later.
TGR measures the distance between the group frontier technology and the meta-frontier technology. A larger TGR value indicates that more advanced technologies were utilized by the group, such that the group's cost frontier is closer to the meta-cost frontier [54]. The average TGR for the four groups of banks ranged from 0.857 (for RCBs) to 0.946 (for SOCBs), with a mean of 0.899 for the whole sample. The six state-owned banks used the most advanced technology, followed by CCBs and JSCBs. In contrast, the six RCBs, as a group, were found to have the least sophisticated technology, as indicated by their group cost frontier deviating farthest from the meta-frontier. Banks in rural areas have had a short time to develop, face obvious geographical restrictions, and lag in information technology, for example lacking high-speed wireless networks, big data, and cloud computing facilities. Credit risk management strategies and online banking services have also been shown to be lacking in RCBs [55]. Thus, an obvious technology gap exists between RCBs and other banks.
MCE makes the efficiency of individual banks comparable across groups. The average MCE of the entire sample was 0.883. SOCBs were the most efficient of the four groups, with an average MCE of 0.923, followed by CCBs (0.893) and JSCBs (0.872), while RCBs were the least efficient banks, with an average MCE of 0.834. These results are in line with many previous studies of the cost efficiency of Chinese banks, such as Chen et al. [28]. Nevertheless, our findings are contrary to those of Lee and Huang [56], who showed that JSCBs and RCBs were more cost efficient than CCBs and the four biggest SOCBs. The different conclusions may be due to the different samples of banks selected, the different methodologies applied and the different periods covered.
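As background to the figures above, the standard meta-frontier decomposition links the three measures as MCE = TGR × CE; the section itself does not spell this out, so it is stated here as an assumption about how the reported values relate. The sketch below checks that the product of the reported sample means of CE (0.982) and TGR (0.899) rounds to the reported mean MCE (0.883). The function name is hypothetical, and because a mean of products generally differs from a product of means, the agreement is only approximate in principle.

```python
def meta_cost_efficiency(ce: float, tgr: float) -> float:
    """Meta-frontier cost efficiency as the product of CE and TGR."""
    for name, value in (("ce", ce), ("tgr", tgr)):
        if not 0.0 < value <= 1.0:
            raise ValueError(f"{name} must lie in (0, 1], got {value}")
    return ce * tgr


# Sample means reported in the text: CE = 0.982 and TGR = 0.899.
mce_from_means = meta_cost_efficiency(ce=0.982, tgr=0.899)
mce_rounded = round(mce_from_means, 3)
```

Because both factors lie in (0, 1], MCE can never exceed either CE or TGR, which is why a group with a high within-group CE can still rank low once its technology gap is taken into account.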

Brief Summaries
Our findings on various bank behaviors can be summarized as follows. First, the LI results indicated that competition in the Chinese banking industry mainly existed among state-owned banks and joint stock banks. Moreover, the market power of Chinese banks had increased in recent years, and gaps in market power amongst banks were declining gradually. Second, economies of scale were found in CCBs and RCBs, while diseconomies of scale occurred in SOCBs and JSCBs. Third, the Chinese banks in our sample were, in general, operating with a high level of cost efficiency from 2013 to 2018. The findings from TGR and MCE were consistent too: SOCBs were found to have the most advanced technology, leading to the highest MCE, followed by CCBs and JSCBs, while the TGR and MCE of RCBs were the lowest among the four groups of banks.

Conclusions
In this study, we proposed a simultaneous SFM with dependent error components by using copula functions. The flexibility of this model is that it allows for dependence between the random noise and inefficiency components of each individual SFM as well as dependence between the composite errors of the simultaneous stochastic frontier equations. We first verified the reasonableness of allowing for such dependence between the statistical noise and inefficiency components when estimating a single-equation SFM and/or a simultaneous SFM using two simulation studies. The results confirmed that the copula-based SFM outperforms the conventional SFM when the two error components, i.e., the random noise and inefficiency components, are correlated in practice; for the case of simultaneous equations, it was demonstrated that ignoring dependence between the random noise and inefficiency components could result in biased estimates. We then applied our model to measure the performance of the Chinese banking industry. The empirical analysis again confirmed that our simultaneous SFM with dependent error components was superior to the other two models. Finally, we estimated the market power, economies of scale, cost efficiency and technology gap ratios of 37 Chinese banks over the period 2013-2018 based on the estimation results from the simultaneous SFM with dependent error components.
Results from the empirical application reveal that Chinese banks operated at a high level of cost efficiency and that there was a high level of competition amongst state-owned banks and joint stock banks, implying low market power. On the other hand, CCBs and RCBs enjoyed high market power. Economies of scale were found in some groups of banks and diseconomies of scale in others. The state-owned banks acquired the most advanced technologies, which enabled them to operate at the highest level of cost efficiency.
The following policy implications can be drawn from the empirical results of the study. First, Chinese banks experiencing diseconomies of scale should consider reducing their operation size to remain competitive in the global financial market. Second, there is a need to acquire superior and advanced technologies in operation, which would help to reduce the efficiency gaps among Chinese banks. A suite of advanced technologies is available within the banking sector, which the lagging banks should explore and adopt in order to remain competitive in a global financial market.
The main contribution of our study is that it provides a new method to estimate a simultaneous SFM under less restrictive assumptions, which is a valuable addition to the existing literature on SFM developments. Our study could serve as a useful reference for future studies, both in terms of applying innovative research methods and in terms of empirical applications with less restrictive assumptions. As for directions for future research, first, other copula families could be incorporated to specify dependence between the random error and inefficiency components instead of the Gaussian and Frank copulas used in our study. Second, future research could consider other marginal distributions for the inefficiency terms instead of the half-normal and exponential distributions applied in our study. Third, addressing possible serial autocorrelation and heteroskedasticity when applying our proposed model to panel data could also be a useful direction.

Figure 1. (a) Histogram and kernel density of the estimated parameters from conventional SFM. (b) Histogram and kernel density of the estimated parameters from copula-based SFM.

Figure 2. Differences between AICs of two copula-based simultaneous SFMs. Note: D.AIC is the AIC of the simultaneous SFM with dependent error components minus the AIC of the simultaneous SFM with dependent composite errors.

Figure 4. The average Lerner Index for different types of banks by year.

Figure 5. The values of maximum, minimum, and difference of Lerner Index for the four studied bank groups. (The left-hand ordinate indicates the LI values and the right-hand ordinate indicates the differences between maximum and minimum LI within each group; CCB: China Construction Bank, CGB: China Guangfa Bank, JJCCB: Bank of Jiujiang, BOCD: Bank of Chengdu, HSB: Huishang Bank, BQD: Bank of Qingdao, BOJ: Bank of Jiangsu, TCCB: Tianjin City Commercial Bank, WJRCB: Jiangsu Wujiang Rural Commercial Bank, CRCB: Chongqing Rural Commercial Bank, CSRCB: Jiangsu Changshu Rural Commercial Bank, ZRCB: Jiangsu Zhangjiagang Rural Commercial Bank, GRCB: Guangzhou Rural Commercial Bank.)

Table 1. Simulation results of the conventional SFM and copula-based SFM.
5,1]. The dependence between $u_1$ and $v_1$, $u_2$ and $v_2$, and $\varepsilon_1$ and $\varepsilon_2$ was modeled by Gaussian copulas, where the copula parameters $\theta_1$, $\theta_2$, and $\theta_{12}$ were simulated from uniform random numbers on the interval [0.7, 0.95].
• Simulate distributions of $u_1$ and $v_1$, and $u_2$ and $v_2$. To simulate the data of $u_1$ and $v_1$, we first simulated the distribution of $u_1$ ($F(u_1)$) by generating a sequence of 500 random draws from the Halton sequence. Next, we simulated the conditional distribution of $v_1$ given $u_1$ ($F(v_1)$) from a Gaussian copula $C_G(v_1 \mid u_1)$ using the "BiCopCondSim" function in R, setting the copula parameter to $\theta_1$. The distributions of $u_2$ ($F(u_2)$) and $v_2$ ($F(v_2)$) were simulated following the same procedure as for $u_1$ and $v_1$, with the copula parameter set to $\theta_2$.
• Generate values for $u_1$, $v_1$, $\varepsilon_1$, $u_2$, $v_2$, and $\varepsilon_2$. In this step, we generated the values of $u_1$ and $v_1$, as well as $u_2$ and $v_2$, given their distributions simulated in the last step. The inefficiency term $u_1$ ($u_2$) was computed by the inverse of the half-normal distribution with a mean of zero and standard deviation of $\sigma_{u1}$ ($\sigma_{u2}$); the noise term $v_1$ ($v_2$) was computed by the inverse of the normal distribution with a mean of zero and standard deviation of $\sigma_{v1}$ ($\sigma_{v2}$); and the composite errors were computed by $\varepsilon_1 = v_1 + u_1$ and $\varepsilon_2 = v_2 + u_2$.
• Generate values for variables $x_1$ and $y_1$, and $x_2$ and $y_2$. The explanatory variables $x_1$ and $x_2$ were simulated as uniform random numbers on the interval [0, 1], while the dependent variables $y_1$ and $y_2$ were calculated according to Equations (
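The error-generation steps listed above can be sketched for a single equation as follows. This is a hedged illustration in Python using only the standard library, not the authors' R code: the conditional draw from the Gaussian copula is written out by hand as a stand-in for the "BiCopCondSim" call, the independent uniform draws come from a pseudo-random generator, and the helper names, seed, and parameter values (correlation 0.8, standard deviations 1.0 and 0.5) are assumptions chosen for the example.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, provides cdf and inv_cdf


def halton(n: int, base: int = 2) -> list:
    """First n points of the one-dimensional Halton sequence."""
    points = []
    for i in range(1, n + 1):
        value, fraction, k = 0.0, 1.0 / base, i
        while k > 0:
            value += (k % base) * fraction
            k //= base
            fraction /= base
        points.append(value)
    return points


def clip(p: float, eps: float = 1e-12) -> float:
    """Keep a probability strictly inside (0, 1) so inv_cdf is defined."""
    return min(max(p, eps), 1.0 - eps)


def conditional_gaussian_copula(p_given: float, w: float, rho: float) -> float:
    """Return F(v) drawn conditionally on F(u) = p_given under a Gaussian copula.

    w is an independent Uniform(0, 1) draw and rho is the copula correlation.
    """
    z = rho * STD_NORMAL.inv_cdf(p_given) + math.sqrt(1.0 - rho ** 2) * STD_NORMAL.inv_cdf(w)
    return clip(STD_NORMAL.cdf(z))


def simulate_errors(n: int, rho: float, sigma_u: float, sigma_v: float, seed: int = 1) -> list:
    """Simulate (u, v, eps) triples with dependent inefficiency and noise."""
    rng = random.Random(seed)
    rows = []
    for p_u in halton(n):
        p_v = conditional_gaussian_copula(p_u, clip(rng.random()), rho)
        u = sigma_u * STD_NORMAL.inv_cdf((1.0 + p_u) / 2.0)  # half-normal quantile
        v = sigma_v * STD_NORMAL.inv_cdf(p_v)                # normal quantile
        rows.append((u, v, v + u))                           # composite error
    return rows


sample = simulate_errors(n=500, rho=0.8, sigma_u=1.0, sigma_v=0.5)
```

Setting `rho` to zero reduces the conditional draw to the independent uniform itself, which recovers the conventional case of independent noise and inefficiency.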

Table 2. Summary statistics of AIC and BIC of two copula-based simultaneous SFMs.

Table 3. Descriptive statistics of the variables used.
Note: Sample means are reported (thousands of real Chinese yuan, base year 2013); numbers in parentheses are standard deviations.

Table 4. Parameter estimation results of the cost function.
Note: Significance at the 0.01, 0.05, and 0.10 levels is indicated by ***, **, and *, respectively; SSFMDEC is short for simultaneous SFM with dependent error components; SSFMDCE is short for simultaneous SFM with dependent composite errors; $W_1$ is arbitrarily selected as the numeraire to satisfy the homogeneity restriction in input prices; in the single-equation SFM, $\sigma = \sqrt{\sigma_u^{2} + \sigma_v^{2}}$ and $\gamma = \sigma_u^{2}/\sigma^{2}$.

Table 5. Summary statistics of Lerner Index and Scale Economies.
4.3.2. Cost Efficiency (CE), Technology Gap Ratio (TGR), and Meta-Frontier Cost Efficiency (MCE)
CE, TGR, and MCE are the most widely used measures of efficiency of firms. The descriptive statistics of the CE, TGR and MCE for Chinese banks are displayed in Table 6.

Table 6. Summary statistics of Cost Efficiency, Technology Gap Ratio, and Meta-frontier Cost Efficiency.