Appendix A. Methodological Derivations for TFP Estimation
TFP is fundamentally a firm-level (micro-level) concept. Owing to data limitations, however, early studies estimated TFP at the macro level; as micro-level enterprise data have become more widely available, firm-level TFP estimation has attracted growing attention. When a production function is fitted, the input factors cannot fully explain total output, and the unexplained residual is the productivity term, namely TFP. It reflects improvements in production efficiency driven by technological progress and institutional optimization.
To estimate TFP, the production function is specified in Cobb–Douglas form:

$$Y_{it} = A_{it} K_{it}^{\alpha} L_{it}^{\beta} \tag{A1}$$

where $Y_{it}$ is the output of firm $i$ in period $t$, $A_{it}$ denotes TFP, $K_{it}$ represents capital input, and $L_{it}$ denotes labor input. Taking the logarithm of Equation (A1) yields the following linear form:

$$y_{it} = \alpha k_{it} + \beta l_{it} + \varepsilon_{it} \tag{A2}$$

Lowercase letters denote logarithms, and the logarithmic form of $A_{it}$ is incorporated into $\varepsilon_{it}$. Direct estimation of Equation (A2) using OLS would result in sample selection bias and simultaneity bias. To maximize profits, firms adjust input factors in real time based on current production conditions, leading to correlation between the error term and the regressors. Therefore, OLS is not suitable for direct estimation. Marschak and Andrews (1944) [90] suggest that this issue can be addressed by decomposing $\varepsilon_{it}$:

$$y_{it} = \alpha k_{it} + \beta l_{it} + \varpi_{it} + e_{it} \tag{A3}$$

The error term is decomposed into two components, $\varpi_{it}$ and $e_{it}$. Here, $\varpi_{it}$ captures the portion of the error that reflects the firm’s adjustment of input factors in response to its current production conditions, while $e_{it}$ represents the true random error. Various methods have been developed to address this issue.
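The simultaneity problem can be illustrated with a small simulation. The sketch below is not part of the original derivation: the parameter values, the rule linking labor to productivity, and the helper `ols_two_regressors` are illustrative assumptions. It generates log output from a log-linear Cobb–Douglas form, lets labor respond to the productivity shock, and shows that OLS then overstates the labor coefficient; the OLS residual is what a naive approach would report as log TFP.

```python
import random

def ols_two_regressors(y, x1, x2):
    """OLS of y on a constant, x1 and x2 (normal equations on demeaned data)."""
    n = len(y)
    my, m1, m2 = sum(y) / n, sum(x1) / n, sum(x2) / n
    d1 = [v - m1 for v in x1]
    d2 = [v - m2 for v in x2]
    dy = [v - my for v in y]
    s11 = sum(a * a for a in d1)
    s22 = sum(b * b for b in d2)
    s12 = sum(a * b for a, b in zip(d1, d2))
    s1y = sum(a * c for a, c in zip(d1, dy))
    s2y = sum(b * c for b, c in zip(d2, dy))
    det = s11 * s22 - s12 * s12
    b1 = (s1y * s22 - s2y * s12) / det
    b2 = (s2y * s11 - s1y * s12) / det
    return my - b1 * m1 - b2 * m2, b1, b2

random.seed(0)
ALPHA, BETA = 0.4, 0.6   # assumed true capital and labor elasticities
N = 5000

omega = [random.gauss(0, 0.5) for _ in range(N)]        # productivity seen by the firm
k = [random.gauss(5, 1) for _ in range(N)]              # log capital, exogenous here
l = [3 + 0.8 * w + random.gauss(0, 1) for w in omega]   # log labor reacts to productivity
y = [ALPHA * ki + BETA * li + w + random.gauss(0, 0.1)
     for ki, li, w in zip(k, l, omega)]

const, alpha_hat, beta_hat = ols_two_regressors(y, k, l)
# Naive log TFP: the part of log output not explained by the inputs (constant included).
log_tfp = [yi - alpha_hat * ki - beta_hat * li for yi, ki, li in zip(y, k, l)]

print(f"alpha_hat = {alpha_hat:.3f} (true {ALPHA})")
print(f"beta_hat  = {beta_hat:.3f} (true {BETA}), biased upward by simultaneity")
```

Because capital is drawn independently of productivity in this toy setup, only the labor coefficient is distorted; in real data both inputs may respond to productivity.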
① Fixed Effects Estimation Method. For panel data, a firm fixed effects model can address endogeneity arising from unobserved heterogeneity across firms. However, this method applies only to panel datasets, and because it relies solely on within-firm variation over time, it discards the information contained in cross-sectional differences between firms and may not fully identify the parameters to be estimated. Moreover, a fixed effects model requires the assumption that the productivity component $\varpi_{it}$ is time-invariant, which is difficult to satisfy in real-world scenarios.
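A minimal sketch of the within (fixed effects) transformation follows, under the strong assumption stated above that productivity is constant over time within each firm. All numbers, the simulated panel, and the helper names `ols_two_regressors` and `demean_by_firm` are hypothetical and serve only to illustrate the mechanics.

```python
import random

def ols_two_regressors(y, x1, x2):
    """OLS of y on a constant, x1 and x2 (normal equations on demeaned data)."""
    n = len(y)
    my, m1, m2 = sum(y) / n, sum(x1) / n, sum(x2) / n
    d1 = [v - m1 for v in x1]
    d2 = [v - m2 for v in x2]
    dy = [v - my for v in y]
    s11 = sum(a * a for a in d1)
    s22 = sum(b * b for b in d2)
    s12 = sum(a * b for a, b in zip(d1, d2))
    s1y = sum(a * c for a, c in zip(d1, dy))
    s2y = sum(b * c for b, c in zip(d2, dy))
    det = s11 * s22 - s12 * s12
    b1 = (s1y * s22 - s2y * s12) / det
    b2 = (s2y * s11 - s1y * s12) / det
    return my - b1 * m1 - b2 * m2, b1, b2

def demean_by_firm(values, firm_ids):
    """Within transformation: subtract each firm's own time average."""
    totals, counts = {}, {}
    for v, f in zip(values, firm_ids):
        totals[f] = totals.get(f, 0.0) + v
        counts[f] = counts.get(f, 0) + 1
    return [v - totals[f] / counts[f] for v, f in zip(values, firm_ids)]

random.seed(1)
ALPHA, BETA = 0.4, 0.6        # assumed true elasticities
FIRMS, YEARS = 500, 6

firm, k, l, y = [], [], [], []
for i in range(FIRMS):
    omega_i = random.gauss(0, 0.5)          # time-invariant productivity (the FE assumption)
    for _ in range(YEARS):
        k_it = random.gauss(5, 1)
        l_it = 3 + 0.8 * omega_i + random.gauss(0, 1)   # labor correlated with omega_i
        firm.append(i)
        k.append(k_it)
        l.append(l_it)
        y.append(ALPHA * k_it + BETA * l_it + omega_i + random.gauss(0, 0.1))

_, _, beta_pooled = ols_two_regressors(y, k, l)
_, alpha_fe, beta_fe = ols_two_regressors(
    demean_by_firm(y, firm), demean_by_firm(k, firm), demean_by_firm(l, firm))

print(f"pooled OLS beta = {beta_pooled:.3f}, within beta = {beta_fe:.3f} (true {BETA})")
```

The within estimator recovers the labor coefficient here only because the simulated productivity term never changes over time; once productivity varies within a firm, the demeaning no longer removes it.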
② OP Method. To address the limitations of the fixed effects estimation method, Olley and Pakes (1992) [91] proposed a consistent semiparametric estimator. Specifically, they assume that firms adjust their investment strategies based on current production conditions, and that unobserved productivity can be proxied by the firm’s current investment. The core idea of the OP method is to establish a functional relationship between capital stock and investment.
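$$K_{it+1} = (1-\delta) K_{it} + I_{it} \tag{A4}$$

Let $I_{it}$ and $K_{it}$ denote the firm’s current investment and capital stock, respectively, and let $\delta$ be the depreciation rate. Equation (A4) implies an orthogonal relationship between the two. This formulation requires the assumption that firms’ expectations of future $\varpi_{it}$ are significantly and positively correlated with their current investment decisions. Based on this, the following optimal investment function can be established:

$$i_{it} = i_t(\varpi_{it}, k_{it}) \tag{A5}$$

Taking the inverse of Equation (A5) and assuming $h_t(\cdot) = i_t^{-1}(\cdot)$, the unobservable productivity term $\varpi_{it}$ can be expressed as:

$$\varpi_{it} = h_t(i_{it}, k_{it}) \tag{A6}$$

Substituting Equation (A6) into the production function yields:

$$y_{it} = \beta l_{it} + \alpha k_{it} + h_t(i_{it}, k_{it}) + e_{it} \tag{A7}$$

$$\phi_{it} = \alpha k_{it} + h_t(i_{it}, k_{it}) \tag{A8}$$

Equation (A8) shows that $\phi_{it}$ consists of two components: capital stock and investment. Define the estimated value of $\phi_{it}$ as $\hat{\phi}_{it}$. In the first step, estimate $\beta$ as follows:

$$y_{it} = \beta l_{it} + \phi_{it} + e_{it} \tag{A9}$$

Directly estimating Equation (A9) gives a consistent and unbiased estimate of the coefficient on $l_{it}$. The second step estimates the capital coefficient $\alpha$. The key to this step is the quantity $V_{it}$, which is defined by the following equation:

$$V_{it} = y_{it} - \hat{\beta} l_{it} \tag{A10}$$

Estimate the following equation:

$$V_{it} = \alpha k_{it} + g(\hat{\phi}_{it-1} - \alpha k_{it-1}) + \mu_{it} + e_{it} \tag{A11}$$

In this equation, $g(\cdot)$ is a function of $\hat{\phi}_{it-1}$ and $k_{it-1}$. Equation (A11) is estimated using higher-order polynomials of $\hat{\phi}_{it-1}$ and $k_{it-1}$. Since both $k_{it}$ and $k_{it-1}$ appear in Equation (A11), so that $\alpha$ enters both linearly and inside $g(\cdot)$, nonlinear least squares must be used. After all coefficients in Equation (A11) have been estimated, the residual of the production function (Equation (A1) in logarithms) is computed, which yields the log value of TFP.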
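The first step of the OP procedure can be sketched as follows. This is an illustration under assumed values, not the authors' implementation: the simulated data, the linear investment rule, the second-order polynomial, and the helper `ols` are all hypothetical. Log output is regressed on log labor plus a polynomial in log investment and log capital, which absorbs the unobserved productivity term; the second step (nonlinear least squares for the capital coefficient) is deliberately omitted.

```python
import random

def ols(X, y):
    """Solve the normal equations (X'X) b = X'y by Gauss-Jordan elimination."""
    p = len(X[0])
    A = [[sum(row[a] * row[b] for row in X) for b in range(p)]
         + [sum(row[a] * yi for row, yi in zip(X, y))]
         for a in range(p)]
    for c in range(p):
        pivot = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[pivot] = A[pivot], A[c]
        for r in range(p):
            if r != c:
                factor = A[r][c] / A[c][c]
                A[r] = [u - factor * v for u, v in zip(A[r], A[c])]
    return [A[r][p] / A[r][r] for r in range(p)]

random.seed(2)
ALPHA, BETA = 0.4, 0.6     # assumed true elasticities
N = 3000

omega = [random.gauss(0, 0.5) for _ in range(N)]             # unobserved productivity
k = [random.gauss(5, 1) for _ in range(N)]                   # log capital
inv = [0.5 + 1.0 * w + 0.3 * ki for w, ki in zip(omega, k)]  # log investment, increasing in omega
l = [3 + 0.8 * w + random.gauss(0, 1) for w in omega]        # log labor reacts to omega
y = [ALPHA * ki + BETA * li + w + random.gauss(0, 0.1)
     for ki, li, w in zip(k, l, omega)]

k_bar, i_bar = sum(k) / N, sum(inv) / N

def poly(ki, ii):
    """Second-order polynomial in centered (k, i), standing in for phi(i, k)."""
    dk, di = ki - k_bar, ii - i_bar
    return [dk, di, dk * dk, di * di, dk * di]

X_naive = [[1.0, li, ki] for li, ki in zip(l, k)]
X_op = [[1.0, li] + poly(ki, ii) for li, ki, ii in zip(l, k, inv)]

beta_naive = ols(X_naive, y)[1]
coef = ols(X_op, y)
beta_op = coef[1]
# Fitted phi: everything in the first-step fit except the labor term.
phi_hat = [coef[0] + sum(c * x for c, x in zip(coef[2:], row[2:])) for row in X_op]

print(f"naive OLS beta = {beta_naive:.3f}, OP first-step beta = {beta_op:.3f} (true {BETA})")
```

Note that the capital coefficient is not identified in this step, because capital enters both directly and through the proxy function; that is why a second step is needed.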
③ LP Method. The OP method requires a monotonic relationship between current investment and productivity, so firms with zero investment in the current period must be dropped from the sample. To address this issue, Levinsohn and Petrin (2003) [92] proposed replacing investment with intermediate input as the proxy variable. Data on intermediate inputs are generally more accessible than investment data. The LP method also provides several ways to test the validity of the selected proxy, thereby expanding the range of viable proxy variables. As a result, the LP method offers greater flexibility in proxy selection.
④ GMM Method. Blundell and Bond (1998) [93] suggested incorporating instrumental variables into the production function to address endogeneity problems. The lagged values of explanatory variables can serve as instruments. However, the GMM method has two main drawbacks. First, $\varpi_{it}$ may be influenced by both short-term and long-term factors. Second, GMM models require transformations such as differencing and lagging, which demand long time-series data. The following fixed effects model is specified:
Take the first difference of Equation (A12).
The two-period lagged terms are considered the optimal instrumental variables. Serial correlation in $\varpi_{it}$ may cause current technology shocks to correlate with past input factors. The following model is specified:
Substituting into model (A13), we obtain:
The production function is expressed as:
By applying first-differencing to eliminate fixed effects:
Use as an instrumental variable to estimate the parameter ρ.
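The idea of differencing out the fixed effect and instrumenting with lagged levels can be sketched in a few lines. The example below is a simplified single-input, just-identified IV illustration (in the spirit of Anderson–Hsiao), not the full GMM estimator described above; the data-generating process, parameter values, and variable names are all assumptions made for illustration.

```python
import random

random.seed(3)
BETA = 0.6                    # assumed true input elasticity
FIRMS, PERIODS = 2000, 8

dy, dl, z = [], [], []        # first differences and the lagged-level instrument
for _ in range(FIRMS):
    eta = random.gauss(0, 0.5)            # firm fixed effect
    l_prev = eta + random.gauss(0, 1)     # start near the firm's long-run mean
    ls, ys = [], []
    for _t in range(PERIODS):
        u = random.gauss(0, 0.6)          # shock observed by the firm (simultaneity)
        l_t = 0.5 * l_prev + 0.5 * eta + 0.8 * u + random.gauss(0, 1)
        ls.append(l_t)
        ys.append(BETA * l_t + eta + u)
        l_prev = l_t
    for t in range(2, PERIODS):
        dy.append(ys[t] - ys[t - 1])      # differencing removes eta
        dl.append(ls[t] - ls[t - 1])
        z.append(ls[t - 2])               # two-period lag of the input level

def centered(v):
    m = sum(v) / len(v)
    return [x - m for x in v]

dy_c, dl_c, z_c = centered(dy), centered(dl), centered(z)

# OLS on first differences: still biased, because dl is correlated with the differenced shock.
beta_fd_ols = sum(a * b for a, b in zip(dl_c, dy_c)) / sum(a * a for a in dl_c)
# IV on first differences, using the twice-lagged level as the instrument.
beta_iv = sum(a * b for a, b in zip(z_c, dy_c)) / sum(a * b for a, b in zip(z_c, dl_c))

print(f"FD-OLS beta = {beta_fd_ols:.3f}, FD-IV beta = {beta_iv:.3f} (true {BETA})")
```

The instrument is valid in this simulation because the shock is serially uncorrelated; with serially correlated shocks, as the text notes, lagged inputs can become correlated with the current disturbance.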
⑤ GNR Method. When both labor and intermediate materials are treated as static input factors, the ACF method of Ackerberg, Caves, and Frazer requires additional conditions to estimate a Cobb–Douglas–type gross-output production function. Gandhi et al. (2020) [94] address this issue by adding share-regression equations for static inputs. Doraszelski and Jaumandreu (2013) [95] and Grieco et al. (2016) [96] also estimate production functions by imposing first-order condition (FOC) restrictions for static inputs; however, the former requires information on factor prices, and the latter is not applicable to the Cobb–Douglas functional form.
Let the wage (labor price) be (
), the price of intermediate materials be (
), and normalize the output price to 1. The firm’s problem is:
Taking the first-order condition with respect to intermediate materials (
) yields:
Multiplying both sides by (
), dividing by (
), and taking logarithms gives:
where (
) is the share of intermediate-materials expenditure in gross output. Since (
) is an error term with zero mean, ordinary least squares (OLS) on the above share equation yields estimates (
) and (
). Given arbitrary values for (
) and (
), productivity can be written as
Invoking the model setup, the production function can be re-expressed as
where
is approximated by a high-order polynomial. Using the “structural” relationships implied by firms’ input choices and the evolution of productivity, we construct moment conditions
and estimate (
) and (
) by GMM, where the instrument set is
In summary, the GNR method does not require a monotonic relationship between productivity and investment (as in OP) or materials (as in ACF/LP). Compared with proxy-variable approaches, GNR therefore relies on fewer assumptions. However, like proxy methods, GNR requires separating the disturbance term via the share-regression equation in order to form valid moments for estimation.
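The share-regression step can be illustrated for the Cobb–Douglas case. The sketch below rests on several assumptions that are not spelled out in the text: materials are chosen before the zero-mean shock is realised, so the log materials share equals a constant minus that shock, and the constant is corrected by the sample mean of the exponentiated residuals. All symbols, parameter values, and the simulated data are hypothetical.

```python
import math
import random

random.seed(4)
BETA_K, BETA_L, BETA_M = 0.2, 0.3, 0.5   # assumed elasticities (illustrative)
SIGMA_EPS = 0.2                          # std. dev. of the ex post output shock
P_M = 1.5                                # assumed materials price; output price is 1
N = 5000
E_TRUE = math.exp(SIGMA_EPS ** 2 / 2)    # E[exp(eps)] for a normal shock

log_share = []
for _ in range(N):
    k = random.gauss(5, 1)               # log capital
    l = random.gauss(3, 1)               # log labor
    omega = random.gauss(0, 0.5)         # productivity known to the firm
    eps = random.gauss(0, SIGMA_EPS)     # shock realised after inputs are chosen
    # Log materials from the first-order condition of expected profit maximisation.
    m = (math.log(BETA_M * E_TRUE / P_M) + omega + BETA_K * k + BETA_L * l) / (1 - BETA_M)
    y = omega + eps + BETA_K * k + BETA_L * l + BETA_M * m
    log_share.append(math.log(P_M) + m - y)   # log of materials expenditure over output

# With Cobb-Douglas the share regression has only a constant, so OLS is a sample mean.
gamma_hat = sum(log_share) / N
eps_hat = [gamma_hat - s for s in log_share]          # recovered shocks
e_hat = sum(math.exp(e) for e in eps_hat) / N         # estimate of E[exp(eps)]
beta_m_hat = math.exp(gamma_hat) / e_hat

print(f"beta_m_hat = {beta_m_hat:.3f} (true {BETA_M})")
```

The recovered residuals `eps_hat` are the piece that is subtracted from output before the remaining coefficients are estimated by GMM in the second stage, which is not shown here.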
Table A1.
Variable Definitions.
| Variable | Definition |
|---|---|
| TFP | Total Factor Productivity, calculated using the LP method. |
| DT | Digital Transformation Index, constructed based on text mining of annual reports. |
| LEV | Leverage ratio, defined as total liabilities/total assets. |
| AGE | Firm age, measured as the number of years since establishment. |
| SIZE | Firm size, measured as the natural logarithm of total assets. |
| IAR | Intangible assets ratio, calculated as intangible assets/total assets. |
| TOP10 | Ownership concentration, measured as the shareholding ratio of the top 10 shareholders. |
| ORG | Ownership type dummy, equals 1 if the firm is state-owned, 0 otherwise. |
| INDE | Proportion of independent directors on the board. |
| Innovate | Innovation capacity, measured as ln(1 + number of patent applications). |
| HC | Human capital structure, measured as the proportion of employees with a bachelor’s degree or above. |
| CER | Cost efficiency, proxied by the ratio of total operating cost to revenue. |
| FAT | Fixed asset turnover ratio, used to capture operational efficiency. |
| ITR | Inventory turnover ratio, also used to capture operational efficiency. |
| HHI | Industry competition level, measured using the Herfindahl–Hirschman Index (HHI). |
| DINF | Digital infrastructure, proxied by the number of broadband access ports per capita in each province. |
| IPP | Intellectual property protection, measured by the number of IP-related case rulings in local courts. |
| MI | Marketization index, taken from China’s provincial marketization index database. |
| HPE | High-polluting enterprise dummy variable, equals 1 if firm is classified as HPE, 0 otherwise. |
| TQ | Tobin’s Q, used as a proxy for firm market value. |
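As a small illustration of how some of the ratio and logarithmic controls in Table A1 are computed, the sketch below builds LEV, SIZE, IAR, and Innovate for one firm-year. The field names in the input record and the function `build_controls` are hypothetical and do not correspond to any particular database.

```python
import math

def build_controls(firm):
    """Compute a few Table A1 controls from raw firm-year fields (hypothetical names)."""
    return {
        "LEV": firm["total_liabilities"] / firm["total_assets"],
        "SIZE": math.log(firm["total_assets"]),
        "IAR": firm["intangible_assets"] / firm["total_assets"],
        # ln(1 + x) keeps firms with zero patent applications in the sample.
        "Innovate": math.log(1 + firm["patent_applications"]),
    }

example = {
    "total_assets": 2_000_000.0,
    "total_liabilities": 800_000.0,
    "intangible_assets": 100_000.0,
    "patent_applications": 0,
}
controls = build_controls(example)
print(controls)
```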
Table A2.
Robustness test—replacing the explained variables.
| | (1) | (2) | (3) | (4) | (5) |
|---|---|---|---|---|---|
| DT | 0.0032 *** | 0.0022 * | 0.0022 *** | 0.0020 *** | 0.0049 *** |
| | (3.92) | (1.66) | (4.59) | (4.38) | (4.90) |
| LEV | −0.0210 *** | −0.3210 *** | −0.0031 | −0.0013 | −0.0307 *** |
| | (−3.43) | (−29.87) | (−0.77) | (−0.34) | (−4.14) |
| AGE | 0.0419 *** | −0.0134 | 0.0344 *** | 0.0334 *** | 0.0394 *** |
| | (3.71) | (−0.71) | (4.87) | (4.93) | (2.88) |
| SIZE | 0.0611 *** | 0.0240 *** | 0.0708 *** | 0.0718 *** | 0.0580 *** |
| | (35.66) | (9.37) | (64.21) | (67.49) | (27.52) |
| IAR | −0.1880 *** | 0.0698 * | −0.0628 *** | −0.0487 *** | −0.2549 *** |
| | (−8.38) | (1.67) | (−4.47) | (−3.56) | (−9.03) |
| TOP10 | 0.0001 | 0.0005 *** | 0.0001 | 0.0001 | 0.0002 |
| | (1.04) | (2.77) | (1.24) | (1.20) | (1.26) |
| ORG | 0.0373 *** | 0.0227 *** | 0.0233 *** | 0.0217 *** | 0.0449 *** |
| | (38.00) | (11.66) | (38.02) | (37.15) | (38.08) |
| CR | −0.0001 | −0.0000 | −0.0001 | −0.0001 | −0.0001 |
| | (−0.49) | (−0.07) | (−1.04) | (−1.13) | (−0.31) |
| INDE | 1.2569 *** | 0.2386 *** | 1.6721 *** | 1.7321 *** | 1.1230 *** |
| | (43.62) | (5.08) | (95.19) | (103.37) | (32.19) |
| ROE | 0.0032 *** | 0.0022 * | 0.0022 *** | 0.0020 *** | 0.0049 *** |
| | (3.92) | (1.66) | (4.59) | (4.38) | (4.90) |
| Constant | −0.0210 *** | −0.3210 *** | −0.0031 | −0.0013 | −0.0307 *** |
| | (−3.43) | (−29.87) | (−0.77) | (−0.34) | (−4.14) |
| N | 51,458 | 51,335 | 51,458 | 51,458 | 51,458 |
| R² | 0.4830 | 0.1589 | 0.6842 | 0.7036 | 0.3989 |
Table A3.
Robustness test—replacing the explained variables, eliminating municipality data, shortening the sample period, and adjusting the clustering level.
| | (1) | (2) | (3) | (4) | (5) | (6) |
|---|---|---|---|---|---|---|
| DT1 | 0.0124 *** | | | | | |
| | (3.14) | | | | | |
| DT2 | | 0.2003 *** | | | | |
| | | (3.93) | | | | |
| | | | 0.0007 *** | | | |
| | | | (2.66) | | | |
| DT | | | | 0.0043 *** | 0.0036 *** | 0.0035 *** |
| | | | | (6.43) | (5.07) | (9.24) |
| LEV | −0.0091 * | −0.0096 ** | −0.0106 *** | −0.0131 ** | −0.0097 | −0.0096 *** |
| | (−1.90) | (−2.01) | (−6.52) | (−2.49) | (−1.63) | (−3.24) |
| AGE | 0.0316 *** | 0.0326 *** | 0.0356 *** | 0.0273 *** | 0.0291 ** | 0.0331 *** |
| | (3.66) | (3.82) | (10.69) | (2.89) | (2.56) | (6.88) |
| SIZE | 0.0695 *** | 0.0687 *** | 0.0694 *** | 0.0680 *** | 0.0666 *** | 0.0686 *** |
| | (51.86) | (51.40) | (164.07) | (45.90) | (37.30) | (81.19) |
| IAR | −0.1015 *** | −0.1094 *** | −0.1099 *** | −0.1087 *** | −0.0907 *** | −0.1084 *** |
| | (−5.94) | (−6.43) | (−17.62) | (−5.62) | (−4.34) | (−11.33) |
| TOP10 | 0.0001 | 0.0001 | 0.0001 *** | 0.0002 * | 0.0001 | 0.0001 *** |
| | (1.42) | (1.63) | (4.78) | (1.82) | (0.90) | (2.93) |
| ORG | 0.0288 *** | 0.0288 *** | 0.0291 *** | 0.0285 *** | 0.0279 *** | 0.0288 *** |
| | (38.71) | (38.81) | (65.51) | (33.38) | (30.34) | (32.73) |
| INDE | −0.0001 | −0.0001 | −0.0001 * | −0.0000 | −0.0000 | −0.0001 |
| | (−1.00) | (−0.95) | (−1.72) | (−0.42) | (−0.36) | (−1.34) |
| Constant | 1.5171 *** | 1.5200 *** | 1.4898 *** | 1.5331 *** | 1.5391 *** | 1.5057 *** |
| | (70.30) | (70.98) | (34.21) | (65.89) | (50.78) | (92.61) |
| N | 50,125 | 50,675 | 51,434 | 41,495 | 26,948 | 51,130 |
| R² | 0.5978 | 0.5966 | 0.6252 | 0.6027 | 0.5819 | 0.9032 |
Table A4.
Endogeneity test—instrumental variables.
| | (1) | (2) | (3) | (4) | (5) | (6) |
|---|---|---|---|---|---|---|
| Telephone | 0.0474 *** | | | | | |
| | (3.84) | | | | | |
| Distance | | | −0.0008 *** | | | |
| | | | (−12.78) | | | |
| ADIG | | | | | 0.4934 *** | |
| | | | | | (21.78) | |
| DT | | −0.0433 ** | | 0.0209 *** | −0.0744 | 0.0152 *** |
| | | (−2.18) | | (3.97) | (−1.40) | (4.27) |
| LEV | −0.0401 *** | −0.0115 *** | −0.1131 *** | 0.0334 *** | −0.1434 | −0.0089 * |
| | (−1.50) | (−5.24) | (−4.45) | (18.47) | (−1.15) | (−1.86) |
| AGE | −0.1893 *** | 0.0242 *** | −0.1488 *** | 0.0080 *** | 0.2253 *** | 0.0355 *** |
| | (−3.41) | (4.27) | (−8.65) | (5.86) | (14.87) | (4.13) |
| SIZE | 0.2343 *** | 0.0796 *** | 0.1507 *** | 0.0740 *** | 0.1113 *** | 0.0659 *** |
| | (35.27) | (17.00) | (34.97) | (86.41) | (0.59) | (40.42) |
| IAR | −0.0776 | −0.1122 *** | 0.1515 *** | −0.1513 *** | −0.0047 *** | −0.1073 *** |
| | (−0.76) | (−14.05) | (1.52) | (−22.57) | (−4.54) | (−6.26) |
| TOP10 | −0.0054 *** | −0.0001 | −0.0040 *** | 0.0005 *** | 0.0079 | 0.0002 ** |
| | (−10.80) | (−1.08) | (−11.60) | (16.85) | (0.94) | (2.33) |
| ORG | 0.0067 | 0.0291 *** | 0.0151 | 0.0261 *** | 0.2202 *** | 0.0287 *** |
| | (0.88) | (49.07) | (1.38) | (35.31) | (35.76) | (38.55) |
| INDE | −0.0042 *** | −0.0003 *** | 0.0029 *** | −0.0006 *** | −0.0042 *** | −0.00003 |
| | (−4.69) | (−2.59) | (3.31) | (−9.18) | (−3.12) | (−0.31) |
| N | 51,099 | 51,099 | 45,951 | 45,951 | 51,035 | 51,035 |
| R² | | 0.1274 | | 0.6620 | | 0.4351 |
| | 14.72 *** | | 30.93 *** | | 474.43 *** | |
| | | 34.719 [12.41] | | 163.241 [16.38] | | 474.432 [16.38] |
| | | 16.342 *** | | 163.052 *** | | 253.981 *** |
Table A5.
Endogeneity test—DID and Heckman two-step method.
| | (1) | (2) |
|---|---|---|
| DID | 0.0026 *** | |
| | (2.75) | |
| DT | | 0.0248 *** |
| | | (7.09) |
| IMR | | −0.1301 |
| | | (−10.32) |
| LEV | −0.0081 *** | −0.1053 |
| | (−4.64) | (−0.94) |
| AGE | 0.0379 *** | 0.2912 *** |
| | (10.67) | (5.01) |
| SIZE | 0.0709 *** | 0.4953 *** |
| | (163.11) | (32.86) |
| IAR | −0.1179 *** | −1.4122 *** |
| | (−17.78) | (−8.67) |
| TOP10 | 0.0001 *** | 0.0009 * |
| | (3.46) | (1.85) |
| ORG | 0.0290 *** | 0.2238 *** |
| | (60.71) | (28.15) |
| INDE | −0.0001 ** | 0.0008 |
| | (−2.34) | (0.67) |
| Constant | 1.5042 *** | 1.5055 *** |
| | (146.44) | (120.29) |
| N | 45,951 | 51,458 |
| R² | 0.6050 | 0.9038 |
Table A6.
Heterogeneity Analysis—Categories of DT in Enterprises.
| | (1) | (2) |
|---|---|---|
| DTA | 0.0025 *** | |
| | (3.83) | |
| UT | | 0.0041 *** |
| | | (6.28) |
| LEV | −0.0098 ** | −0.0096 ** |
| | (−2.04) | (−2.01) |
| AGE | 0.0327 *** | 0.0330 *** |
| | (3.83) | (3.88) |
| SIZE | 0.0689 *** | 0.0688 *** |
| | (52.00) | (51.75) |
| IAR | −0.1084 *** | −0.1081 *** |
| | (−6.38) | (−6.37) |
| TOP10 | 0.0001 | 0.0001 |
| | (1.58) | (1.52) |
| ORG | 0.0288 *** | 0.0288 *** |
| | (38.99) | (38.86) |
| INDE | −0.0001 | −0.0001 |
| | (−0.90) | (−0.86) |
| Constant | 1.5187 *** | 1.5179 *** |
| | (71.24) | (71.42) |
| N | 51,458 | 51,458 |
| R² | 0.5959 | 0.5968 |
Table A7.
Heterogeneity Test—External Macro Environment.
| | (1) | (2) | (3) |
|---|---|---|---|
| | | | TFP |
| DT × DINF | 0.0070 *** | | |
| | (2.88) | | |
| DINF | −0.0171 *** | | |
| | (−3.20) | | |
| DT × IPP | | 0.0135 *** | |
| | | (4.29) | |
| IPP | | −0.0086 | |
| | | (−0.74) | |
| DT × MI | | | 0.0007 *** |
| | | | (3.05) |
| MI | | | −0.0004 |
| | | | (−0.35) |
| DT | 0.0009 | 0.0140 *** | 0.0107 *** |
| | (1.11) | (5.58) | (4.27) |
| LEV | −0.0077 * | −0.0086 * | −0.0012 |
| | (−1.68) | (−1.78) | (−0.25) |
| AGE | 0.0121 *** | 0.0347 *** | 0.0303 *** |
| | (3.69) | (4.06) | (3.51) |
| SIZE | 0.0680 *** | 0.0687 *** | 0.0680 *** |
| | (53.16) | (51.64) | (49.45) |
| IAR | −0.0995 *** | −0.1053 *** | −0.1034 *** |
| | (−6.12) | (−6.13) | (−6.18) |
| TOP10 | 0.0001 * | 0.0001 | 0.0001 |
| | (1.68) | (1.62) | (1.35) |
| ORG | 0.0291 *** | 0.0288 *** | 0.0281 *** |
| | (39.69) | (38.62) | (38.22) |
| INDE | −0.0001 | −0.0001 | −0.0001 |
| | (−1.08) | (−0.71) | (−1.11) |
| Constant | 1.5234 *** | 1.5075 *** | 1.5261 *** |
| | (59.98) | (65.77) | (64.71) |
| N | 50,669 | 50,842 | 49,348 |
| R² | 0.6035 | 0.5976 | 0.6077 |
Table A8.
Heterogeneity Test—Internal Micro Characteristics.
| | (1) | (2) | (3) |
|---|---|---|---|
| DT × HP | −0.0021 *** | | |
| | (−3.88) | | |
| HP | −0.0001 | | |
| | (−0.05) | | |
| DT × HHI | | 0.0029 ** | |
| | | (2.24) | |
| HHI | | −0.0044 * | |
| | | (−1.65) | |
| DT × TQ | | | 0.0003 *** |
| | | | (2.60) |
| TQ | | | −0.0001 *** |
| | | | (−3.61) |
| DT | 0.0039 *** | 0.0031 *** | 0.0024 *** |
| | (12.90) | (9.18) | (3.63) |
| LEV | −0.0099 *** | −0.0097 *** | −0.0065 |
| | (−6.12) | (−5.99) | (−1.39) |
| AGE | 0.0333 *** | 0.0329 *** | 0.0340 *** |
| | (9.98) | (9.85) | (4.01) |
| SIZE | 0.0686 *** | 0.0686 *** | 0.0684 *** |
| | (169.11) | (169.20) | (52.08) |
| IAR | −0.1083 *** | −0.1087 *** | −0.1069 *** |
| | (−17.51) | (−17.59) | (−6.27) |
| TOP10 | 0.0001 *** | 0.0001 *** | 0.0001 |
| | (4.38) | (4.35) | (1.49) |
| ORG | 0.0288 *** | 0.0288 *** | 0.0285 *** |
| | (63.22) | (63.13) | (37.93) |
| INDE | −0.0001 | −0.0001 | −0.0001 |
| | (−1.54) | (−1.53) | (−0.70) |
| Constant | 1.5181 *** | 1.5196 *** | 1.5191 *** |
| | (175.75) | (175.42) | (71.92) |
| N | 51,415 | 51,447 | 50,530 |
| R² | 0.5969 | 0.5967 | 0.5928 |