Abstract
This paper uses the quasi-natural experiment of the National Big Data Comprehensive Pilot Zone (NBDCPZ) in China to examine the impact of market-oriented allocation of data elements on enterprises’ New Quality Productive Forces (NQPF). Based on panel data from China’s A-share listed enterprises on the Shanghai and Shenzhen stock exchanges between 2011 and 2022, this study employs a robust policy evaluation method, the multi-way fixed effects staggered difference-in-differences (MWFE Staggered DID) method, to analyze the impact of the NBDCPZ on NQPF comprehensively. The key findings are as follows: First, the NBDCPZ significantly boosts the NQPF of enterprises within its jurisdiction. Second, the NBDCPZ enhances NQPF by accelerating enterprise digital transformation, and digital talent amplifies the promotional effect of the NBDCPZ on enterprise digital transformation. Third, the NQPF-enhancing effects are more pronounced for privately owned enterprises (POEs), foreign-invested enterprises (FIEs), and smaller enterprises, whereas the NBDCPZ exhibits an inhibitory impact on state-owned enterprises (SOEs) and large enterprises. Fourth, the promotional effect of the NBDCPZ on enterprises’ NQPF varies across industries. Furthermore, regional (city-level) digital infrastructure and financial development levels amplify the NQPF-enhancing effects of the NBDCPZ.
1. Introduction
The concept of “New Quality Productive Forces” (NQPF) was first introduced by Chinese President Xi Jinping in September 2023 during an inspection in Heilongjiang. NQPF represents the most advanced productive forces of the present, characterized by the empowerment of traditional industries through high technology and marked by an evident increase in total factor productivity. The importance of developing NQPF lies in two aspects. First, the world is currently undergoing a fresh round of scientific and technological revolution and industrial transition. Digital technologies such as Artificial Intelligence are continuously emerging, reshaping traditional modes of production and promoting the transformation of the traditional industrial mix. Under these new circumstances, traditional productive forces are no longer aligned with future development. Second, China’s economy has shifted from a stage of high-speed development to one of high-quality development. Accompanied by high consumption, high emissions, and high pollution, high-speed development has become unsustainable, necessitating a shift from extensive growth patterns to green and efficient production modes. By contrast, NQPF is characterized by innovation-driven green development [1]. It can promote the upgrading of traditional industries, green transformation, efficiency enhancement, and productivity transition, thereby reducing dependence on resources and environmental damage and fostering high-quality economic development. In short, NQPF is conducive to achieving sustainable development. As enterprises are the micro-carriers of social production, their NQPF is the core element that constitutes a country’s NQPF and determines the overall level of China’s NQPF. Therefore, the key to developing NQPF lies in promoting enterprises’ NQPF.
Since the first National Big Data Comprehensive Pilot Zone (NBDCPZ) was established in Guizhou, China, in September 2015, a total of eight pilot zones have been established. The NBDCPZ aims to provide a testing ground for the innovation, application, and dissemination of digital technologies, including Big Data, Artificial Intelligence, Blockchain, and 5G networks. Furthermore, it seeks to explore effective strategies for digital technologies to facilitate the digital transformation of traditional enterprises, which aligns closely with enterprises’ NQPF. Therefore, important questions arise: Can the NBDCPZ promote enterprises’ NQPF? If so, what is the underlying mechanism? Are there variations in its effects? Clarifying these issues holds theoretical and practical importance for constructing a scholarly framework that establishes the causal relationship between the NBDCPZ and enterprises’ NQPF and for fully leveraging the policy dividends of the NBDCPZ to foster enterprises’ NQPF, thereby driving high-quality economic development.
Existing research on the cultivation effect of the NBDCPZ on NQPF can be categorized into three main types: The first category examines the impact of the NBDCPZ on the NQPF of provinces [2] and cities [3,4]. The second category explores how the NBDCPZ enhances NQPF in specific industries, such as agriculture [5]. The third category investigates the impact of the NBDCPZ on enterprise-level NQPF, suggesting that the NBDCPZ facilitates improvement through mechanisms such as increasing market attention and competition intensity [6], promoting industry–university–research collaboration, expanding enterprise knowledge breadth, enhancing enterprise absorption capacity [7], improving labor innovation efficiency, alleviating financing constraints, and optimizing industrial structure [8]. In summary, while the potential of the NBDCPZ in cultivating NQPF is widely acknowledged, research on its enterprise-level promotion effects remains relatively scarce. Furthermore, the underlying mechanisms linking the NBDCPZ to enterprise-level NQPF, and the heterogeneity of this relationship, require further exploration. Additionally, how to effectively leverage NBDCPZs in practice to enhance enterprise-level NQPF remains a critical question worthy of in-depth study.
The potential contributions of this paper fall into three aspects: a. Enriching research on the impact of the NBDCPZ in promoting enterprises’ NQPF: Scholars have constructed indicator systems for enterprises’ NQPF from different perspectives. According to the classification of primary indicators, the existing indicator systems are mainly divided into three categories: “labor force, labor tools [8],” “laborers, objects of labor, means of labor [2],” and “technology, green, digital [4],” with the impact mechanisms varying across these systems. This paper differs from the existing literature either in the indicator system used [6,7] or in the impact mechanisms examined under the same indicator system [8], which helps to enrich current research content. b. Expanding the scope of research on the mechanisms through which the NBDCPZ enhances enterprises’ NQPF: Although some of the existing literature points out from various angles that the NBDCPZ promotes NQPF by breaking barriers to factor circulation, it fails to recognize the positive role of removing such barriers in facilitating enterprises’ digital transformation, thereby enhancing NQPF. In addition, this paper proposes that digital talent moderates the effect of NBDCPZs on enterprises’ digital transformation. c. Providing more reference points for policymaking: This paper supplements the heterogeneity analysis in existing research and, combined with conclusions from the mechanism analysis, provides additional insights for policy formulation, thereby promoting the digital economy.
2. Policies, Mechanisms, and Hypotheses
2.1. Introduction to China’s NBDCPZ Policies
As a key national initiative to advance market-oriented allocation reform of data elements, China’s NBDCPZ serves as a regional innovation hub focusing on three core domains: institutional innovation, data development and utilization, and industrial integration. As of the data cutoff date for this study, eight such zones had been established: Guizhou, Beijing–Tianjin–Hebei (this pilot zone consists of three provincial regions: Beijing, Tianjin, and Hebei), Pearl River Delta (although the Pearl River Delta is a sub-region within Guangdong, this paper regards the entire Guangdong province as a pilot zone due to the broad policy influence), Shanghai, Henan, Chongqing, Shenyang, and Inner Mongolia (https://www.ndrc.gov.cn/wsdwhfz/202304/t20230410_1353437.html, accessed on 6 September 2025).
Institutional Innovation: Developing pioneering mechanisms for data property rights registration, hierarchical classification standards, and public data ownership confirmation. Establishing foundational rules for data transaction ecosystems.
Data Development and Utilization: The zones prioritize building a comprehensive data elements system through institutional breakthroughs in critical areas such as public data aggregation, governance, and authorized operation, thereby driving data elements development and utilization.
Industrial Integration: Accelerating “Data Elements or Digital Technologies+” applications across strategic sectors such as healthcare, transportation, and agriculture.
2.2. Mechanisms and Hypotheses
NQPF, as defined in this paper, represents the comprehensive performance of enterprises across four dimensions: Living Labor, Materialized Labor, Hard Technology, and Soft Technology. Consequently, this subsection will clarify the promotion effect of NBDCPZs on these four dimensions of enterprises’ NQPF through the following three mechanisms.
2.2.1. Mediation Effect of Digital Transformation on NBDCPZs Promoting NQPF
The impact of NBDCPZs on promoting enterprises’ digital transformation. Firstly, as digital infrastructure improves and digital industries develop in NBDCPZs, the real-world application scenarios of digital technologies are expanding, and the deep integration of advanced digital technologies, such as Big Data, Artificial Intelligence, Cloud Computing, Cloud Storage, and Blockchain, with traditional industries is becoming more widespread. This includes establishing digital factories, workshops, and production lines, thereby accelerating enterprises’ digital transformation.
Secondly, NBDCPZs offer digital subsidies, financial support, and a market–institutional environment that fosters the growth of digital enterprises. For instance, Guizhou has supported the popularization and application of digital technology by small and medium-sized enterprises (SMEs): qualified projects are granted subsidies of up to 30 percent of the total project investment, capped at 5 million RMB. Digital subsidies can significantly stimulate enterprises’ innovation activities [9] and effectively promote enterprises’ digital transformation.
Thirdly, NBDCPZs break down data barriers and promote the gathering, integration, and open sharing of Big Data from every field, making the data elements more available and abundant for enterprises. In light of the resource-based theory, data elements, an emerging factor of production, can be combined with traditional factors of production, including capital, labor, etc. [10], making procurement, production, transportation, warehousing, sales, and other parts more digitized.
Fourthly, NBDCPZs delve into the value of Big Data for governmental, commercial, and civil purposes, illustrating the significant social benefits of the digital economy to enterprises. Moreover, the NBDCPZ policies can send signals of a developing digital economy, fostering a digital competition environment. These factors increase enterprises’ willingness to utilize modern digital technologies and data elements, thereby achieving digital transformation.
The impact of promoting digital transformation on enterprises’ NQPF. Firstly, digitalized and precise production can optimize the allocation of various resources and potentially speed up the production process. This encourages enterprises to purchase more machinery and equipment through fixed asset investment to expand production, thereby raising the percentage of manufacturing costs (an increase in Materialized Labor). Expanded production increases operational revenues (an enhancement in Soft Technology), which in turn enables enterprises to gain greater favor from investors in the bond market, thereby obtaining more financing (an enhancement in Soft Technology). Secondly, stakeholder theory suggests that an enterprise’s digital transformation impels it not only to actively fulfill its social responsibilities but also to disclose enterprise-related information. This facilitates access to financing from banks and other relevant financial institutions (an enhancement in Soft Technology). Thirdly, digital transformation also enhances enterprises’ innovation efficiency and increases investment in innovation (an enhancement in Hard Technology). Concurrently, the demand for highly skilled personnel also rises (an increase in Living Labor).
Furthermore, as operational revenues and financing grow, enterprises have more funds to invest in technological R&D (an enhancement in Hard Technology), to hire highly skilled personnel (an increase in Living Labor), and to purchase more advanced machinery and equipment for fixed asset investment (an increase in Materialized Labor). Additionally, enterprises’ operational revenues continue to rise (an enhancement in Soft Technology), thereby establishing a virtuous cycle. Consequently, the enterprises’ NQPF can be comprehensively and continuously improved.
2.2.2. Moderation Effect of Digital Talent on NBDCPZs Promoting Digital Transformation
In NBDCPZs, the local digital economy and industries are experiencing growth, along with the rapid gathering of numerous national-level science and technology innovation platforms, innovation and entrepreneurship complexes, universities, scientific research institutes, enterprises, and other innovation institutions. This convergence can provide a wealth of employment and development opportunities for information technology (IT) professionals, thereby fostering a digital talent agglomeration effect and enhancing the supply of digital talent for enterprises [11].
With a greater supply of digital talent, enterprises have opportunities to hire more digital talent. Moreover, the aggregation of a significant number of highly educated and skilled professionals can generate a spillover effect of knowledge and digital technologies [12]. Therefore, the digital transformation progress of enterprises can be accelerated.
With the above analyses, this paper proposes hypotheses 1–3 as follows (refer to Figure 1):
Figure 1.
Mechanisms for the promotion effect of NBDCPZs on enterprises’ NQPF.
Hypothesis 1.
NBDCPZs can promote the enterprises’ NQPF within the zone.
Hypothesis 2.
NBDCPZs can promote digital transformation, thereby promoting the enterprises’ NQPF.
Hypothesis 3.
Digital talent can strengthen the promoting effect of NBDCPZs on enterprises’ digital transformation.
3. Model and Data
3.1. Model Specification
The establishment of NBDCPZs can be regarded as a quasi-natural experiment, and the difference-in-differences method (DID) is a common approach used to examine the effects of policy shocks. Given that the timing of the establishment of NBDCPZs varies from city to city, this paper draws on Callaway and Sant’Anna and proposes using the MWFE Staggered DID to assess the impact of NBDCPZ policies on enterprises’ NQPF [13]. It is worth noting that the DID method can inherently mitigate the endogeneity problem due to omitted variables and reverse causality. The model is set up as follows:

$$NQPF_{it} = \alpha + \beta \, DID_{it} + \gamma X_{it} + \mu_i + \lambda_t + \delta_c + \varepsilon_{it}$$

Here, $i$ denotes the enterprise, $t$ denotes the year, and $c$ denotes the city in which the enterprise is located; $NQPF_{it}$ is the explained variable, indicating the enterprise’s NQPF; $DID_{it}$ is a dummy variable for the policy effect of NBDCPZs; $X_{it}$ denotes a series of controlled variables; $\mu_i$ represents the individual fixed effect; $\lambda_t$ represents the time fixed effect; $\delta_c$ represents the city fixed effect; $\varepsilon_{it}$ represents the random error term.
3.2. Description of Variables
3.2.1. Explained Variable
The explained variable of this paper is enterprises’ NQPF (NQPF). NQPF represents the most advanced productive forces at present, characterized by enhanced innovation capability and marked by an evident increase in total factor productivity, and it drives knowledge-based economic growth. Scholars have developed indicator systems for NQPF from various perspectives [14,15], yet this paper posits that defining productivity in accordance with the two-factor theory of productivity is more rational. Consequently, this paper adopts the approach of Song et al. (2024) [14]. Utilizing the two-factor theory of productivity, the NQPF indicator system is constructed from two dimensions: labor force and production tools. The entropy method, which is widely used in existing studies to construct indicator systems, is adopted to measure NQPF (see Appendix A for details). The indicator system is outlined as follows (refer to Table 1):
Table 1.
NQPF indicator system.
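As an illustration of how an entropy-based composite index can be computed, the following is a minimal sketch of the generic entropy weight method. It assumes min-max normalization and treats every indicator as positively oriented; the procedure actually used in this paper is the one described in Appendix A and may differ in these details. The function names `entropy_weights` and `composite_score` are hypothetical.

```python
import math

def entropy_weights(rows):
    """Entropy weights for indicator columns.

    rows: list of observations, each a list of indicator values
    (at least two observations are required so that log(n) > 0).
    Returns (normalised columns, weights).
    """
    n = len(rows)
    cols = list(zip(*rows))
    # Step 1: min-max normalise each indicator (assumed positively oriented).
    norm = []
    for col in cols:
        lo, hi = min(col), max(col)
        span = hi - lo
        norm.append([(v - lo) / span if span > 0 else 0.0 for v in col])
    # Step 2: information entropy of each indicator.
    entropies = []
    for col in norm:
        total = sum(col)
        if total == 0:
            entropies.append(1.0)  # a constant indicator carries no information
            continue
        e = 0.0
        for v in col:
            p = v / total
            if p > 0:
                e -= p * math.log(p)
        entropies.append(e / math.log(n))
    # Step 3: weights are proportional to 1 - entropy.
    d = [1.0 - e for e in entropies]
    s = sum(d)
    weights = [x / s for x in d] if s > 0 else [1.0 / len(d)] * len(d)
    return norm, weights

def composite_score(rows):
    """Weighted sum of normalised indicators for each observation."""
    norm, weights = entropy_weights(rows)
    return [sum(w * col[i] for w, col in zip(weights, norm))
            for i in range(len(rows))]
```

Indicators with more dispersion across enterprises receive larger weights, while an indicator that is identical for all enterprises receives a weight of zero.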
3.2.2. Explanatory Variable
The explanatory variable is the NBDCPZ Policy Effect dummy variable (DID) calculated by multiplying the experimental group dummy variable Treat with the experimental period dummy variable Time. The NBDCPZ (Guizhou) was inaugurated in September 2015, marking China’s first NBDCPZ. Subsequently, in October 2016, two cross-regional NBDCPZs were established in the Beijing–Tianjin–Hebei area and the Pearl River Delta, along with four regional demonstration NBDCPZs in Shanghai, Henan, Chongqing, and Shenyang. Additionally, an NBDCPZ for promoting the development of Big Data infrastructures was set up in Inner Mongolia. This paper treats the establishment of these NBDCPZs as a policy shock, with enterprises within the zones designated as the experimental group, where Treat equals 1, and 0 otherwise. It is important to note that due to the Pearl River Delta NBDCPZ’s influence extending across the entire Guangdong Province, all enterprises within Guangdong are considered part of the experimental group. For enterprises in the experimental group, the dummy variable Time is set to 1 during the experimental period if the sample year is greater than or equal to the establishment time of the NBDCPZ in which the enterprises are located; otherwise, it is set to 0.
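The construction of DID described above can be sketched as follows. The start years are read from the establishment dates given in the text (September 2015 for Guizhou and October 2016 for the remaining zones); the mapping `PILOT_START_YEAR`, the region labels, and the function `did_dummy` are hypothetical names used only for illustration, and matching enterprises to regions is assumed to have been done beforehand.

```python
# Hypothetical lookup: region of the enterprise -> first treated sample year.
PILOT_START_YEAR = {
    "Guizhou": 2015,
    "Beijing": 2016, "Tianjin": 2016, "Hebei": 2016,
    "Guangdong": 2016,  # whole province treated as the Pearl River Delta zone
    "Shanghai": 2016, "Henan": 2016, "Chongqing": 2016, "Shenyang": 2016,
    "Inner Mongolia": 2016,
}

def did_dummy(region, year):
    """Return (Treat, Time, DID) for one enterprise-year observation."""
    start = PILOT_START_YEAR.get(region)
    treat = 1 if start is not None else 0
    # Time is 1 only for treated enterprises from the establishment year onward.
    time = 1 if treat == 1 and year >= start else 0
    return treat, time, treat * time
```

For example, `did_dummy("Guizhou", 2015)` gives `(1, 1, 1)`, whereas `did_dummy("Guangdong", 2015)` gives `(1, 0, 0)` because the Pearl River Delta zone was established in 2016.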
3.2.3. Controlled Variables
To mitigate potential endogeneity arising from omitted variables, enterprise size, enterprise age, leverage, revenue growth, ownership concentration, size of the board of directors, ratio of independent directors, management capabilities, and the extent of separation of ownership and control rights are incorporated as controlled variables [14]. Additionally, this paper introduces the individual fixed effect to capture the influence on enterprises’ NQPF of other potential enterprise-level factors that do not change over time, the time fixed effect to capture the impact of variables that change only over time, and the city fixed effect to capture the influence on enterprises’ NQPF of other potential city-level factors that do not change over time.
The variable names, symbols, and measures for the aforementioned variables are detailed in Table 2.
Table 2.
Details of variables.
3.3. Description of Data Sources and Processing
This study examines the effect of NBDCPZs on the NQPF of A-share enterprises listed on the Shanghai and Shenzhen stock exchanges from 2011 to 2022. The explanatory variable, DID, is sourced from the Chinese government’s official website. Enterprise-level data is compiled from the China Stock Market & Accounting Research (CSMAR) database and the Chinese Research Data Services Platform (CNRDS). The data processing steps include the following: a. Exclusion of enterprises from the financial and real estate sectors. b. Exclusion of enterprises with ST, *ST, and PT designations. c. Exclusion of enterprises listed after 2016. d. Exclusion of enterprises with nine or fewer observations. e. Winsorization of continuous variables at the 1% and 99% levels using the Stata winsor2 command. f. Except for the dummy DID, transformation of all other variables by adding 1 and taking the natural logarithm.
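Steps e and f can be sketched in a few lines. The sketch below is an illustration only: it uses a linear-interpolation percentile, which may differ slightly from the percentile definition used by the Stata command, and the helper names `percentile`, `winsorize`, and `log1p_transform` are hypothetical.

```python
import math

def percentile(sorted_vals, q):
    """Linear-interpolation percentile (q in [0, 100]) of a sorted list."""
    if not sorted_vals:
        raise ValueError("empty input")
    pos = (len(sorted_vals) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    frac = pos - lo
    return sorted_vals[lo] * (1 - frac) + sorted_vals[hi] * frac

def winsorize(values, lower=1.0, upper=99.0):
    """Clamp values below/above the given percentiles to those percentiles."""
    s = sorted(values)
    lo_cut, hi_cut = percentile(s, lower), percentile(s, upper)
    return [min(max(v, lo_cut), hi_cut) for v in values]

def log1p_transform(values):
    """ln(1 + x) for each value; requires x > -1."""
    return [math.log1p(v) for v in values]
```

Applying `log1p_transform(winsorize(x))` to each continuous variable mirrors the order of steps e and f; extreme values are pulled in to the cut-offs rather than dropped, so the sample size is unchanged.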
4. Empirical Results and Discussion
4.1. Parallel Trend Test
The premise of using DID (difference in differences) is to satisfy the parallel trends prerequisite. Therefore, this paper establishes several dummy variables for relative periods ranging from 1 year to n years before the policy shock, the time of the policy shock, and relative periods ranging from 1 year to n years post-shock. All these dummy variables are then incorporated into the model to regress against the explained variables. If the regression coefficients of the dummy variables for relative periods ranging from 1 year to n years before the policy shock are not significantly different from zero (or the coefficients fall within a range that includes zero), they are considered to pass the parallel trend test. This test is similar but not identical to the one proposed by Beck et al. (2010) [16]. The details of the procedure are as follows:
Firstly, this paper sets up dummy variables for relative periods ranging from 1 year to n years before the establishment of the NBDCPZ, using D_n to represent them. For example, if a city establishes an NBDCPZ in 2015, D_1 takes the value of 1 when the year is 2014 for each enterprise located in this city; otherwise, it takes the value of 0. The values for enterprises in other cities of the experimental group are determined analogously, while the values for the control group are set to 0. Given that the earliest sample year in this paper is 2011 and the earliest policy shock occurs in 2015, it is necessary to establish D_2, D_3, and D_4 (refer to D_1 for their meanings and values). This paper uses D_4 to represent the dummy variable for four years or more before the establishment of the NBDCPZ and omits it from the regression. For example, when a city established an NBDCPZ in 2016, D_4 is assigned a value of 1 for years earlier than or equal to 2012, and a value of 0 for other years.
Secondly, this paper also establishes the dummy variables for the time of the policy shock and for relative periods ranging from 1 year to n years post-shock, using D to denote the former and Dn (without an underscore) to denote the latter (refer to D_1 for their meanings and values). Since the sample period ends in 2022 and the earliest policy shock occurred in 2015, the value of n ranges from 1 to 7.
Finally, this paper incorporates D_3–D_1, D, and D1–D7 into the model for regression analysis and presents the trend graph depicting the range of regression coefficient values for D_3–D_1, D, and D1–D7 within a 90% confidence interval, as illustrated in Figure 2. The figure indicates that the range of regression coefficients for D_3–D_1 includes 0, signifying that the parallel trend assumption holds. Consequently, this paper can employ the MWFE Staggered DID approach to assess the impact of the NBDCPZ on enterprises’ NQPF.
Figure 2.
Results of the parallel trend test.
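The relative-period dummies used in the parallel trend test can be generated as in the sketch below. The naming follows the text (D_4 to D_1 before the shock, D at the shock, D1 to D7 after it); the function `event_dummies` is a hypothetical helper, and `start_year=None` is assumed to mark a control-group enterprise.

```python
def event_dummies(year, start_year):
    """Relative-period dummies for one enterprise-year observation."""
    names = ["D_4", "D_3", "D_2", "D_1", "D"] + [f"D{k}" for k in range(1, 8)]
    out = dict.fromkeys(names, 0)
    if start_year is None:      # control group: every dummy stays 0
        return out
    rel = year - start_year     # years relative to the policy shock
    if rel <= -4:
        out["D_4"] = 1          # four or more years before the shock
    elif rel < 0:
        out[f"D_{-rel}"] = 1    # one to three years before the shock
    elif rel == 0:
        out["D"] = 1            # year of the shock
    elif rel <= 7:
        out[f"D{rel}"] = 1      # one to seven years after the shock
    return out
```

In the regression, all of these dummies except D_4 enter the model alongside the fixed effects and controlled variables, so each estimated coefficient is read relative to the period four or more years before the shock.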
4.2. Correlation Analysis
To check for multicollinearity, a correlation analysis was conducted on the explanatory variable and control variables. The correlation coefficients for each pair of variables are presented in Table 3. Table 3 shows that the absolute values of all the correlation coefficients are below 0.8 (indeed, below 0.25). Therefore, multicollinearity is unlikely to be a serious concern.
Table 3.
Correlation coefficients.
4.3. Baseline Regression
The estimation results of the effect of NBDCPZs on enterprises’ NQPF are presented in Table 4. Column (1) includes neither fixed effects nor controlled variables; the result is significant but not reliable, as other influencing factors have yet to be accounted for. Column (2) adds only individual fixed effects, time fixed effects, and city fixed effects, and the DID coefficient is positive and significant at the 1% level. Column (3) further adds several controlled variables at the enterprise level, and the DID coefficient remains significantly positive. In summary, the establishment of an NBDCPZ significantly promotes enterprises’ NQPF: the level of NQPF of enterprises within the zone increased by 2.00% after the establishment of the NBDCPZ, which is a relatively evident enhancement effect.
Table 4.
Results of baseline regression.
Regarding the controlled variables, enterprise size, ownership concentration, size of the board of directors, and ratio of independent directors significantly inhibit the improvement of enterprises’ NQPF, whereas enterprise size, enterprise age, leverage and management capabilities are the important factors that contribute to the improvement of the enterprises’ NQPF.
In conclusion, the baseline regression indicates that NBDCPZs contribute significantly to the level of enterprises’ NQPF. Next, multiple robustness tests are conducted to further verify the accuracy of the above findings.
4.4. Robustness Test
4.4.1. Placebo Test (Combination of Virtual Experimental Group Method and Virtual Experimental Period Method)
Inspired by Beck (2010), this paper adopts the following method for the placebo test [17]: Firstly, a number of enterprises comparable to the size of the original experimental group is randomly selected from all enterprises to form the new experimental group. The experimental group dummy variable, Treat, is reset to 0 for all enterprises, and enterprises in the newly selected experimental group are then assigned a value of 1 for Treat. Secondly, a year is randomly selected from the variable year for each enterprise to represent the time of the policy shock on that enterprise. Next, for each enterprise, the experimental period dummy variable Time is assigned a value of 1 when the variable year is greater than or equal to the randomly selected time of the policy shock; otherwise, it takes the value of 0. Finally, the experimental group dummy variable, Treat, is multiplied by the experimental period dummy variable, Time, to obtain the new experimental effect dummy variable, DID. The model is then re-estimated with this DID to obtain one regression coefficient and the corresponding p-value. The process was repeated 500 times to obtain 500 DID coefficients and their p-values. Scatter plots of these 500 DID regression coefficients against their p-values, along with a kernel density plot of the regression coefficients, are depicted in Figure 3.
Figure 3.
Results of the placebo test.
Figure 3 indicates that the placebo DID coefficients are distributed roughly evenly around 0, and the majority of these coefficients have p-values of 0.1 or above. This suggests that the randomly assigned placebo policy, with a virtual experimental group and a virtual experimental period, does not significantly affect enterprises’ NQPF, meaning the placebo test is passed and the baseline regression conclusions remain valid.
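The random assignment behind the placebo test can be sketched as follows. Only the construction of the placebo DID variable is shown; the regression itself is passed in as a user-supplied callback `estimate`, which is a hypothetical stand-in for the model of Section 3.1. The function names and the `(firm_id, year)` panel layout are likewise assumptions made for illustration.

```python
import random

def placebo_did(panel, n_treated, rng):
    """Build one placebo DID assignment.

    panel: list of (firm_id, year) observations.
    n_treated: number of enterprises to place in the virtual experimental group.
    rng: a random.Random instance.
    Returns a dict mapping (firm_id, year) to the placebo DID value.
    """
    years_by_firm = {}
    for firm, year in panel:
        years_by_firm.setdefault(firm, []).append(year)
    firms = sorted(years_by_firm)
    # Virtual experimental group: a random draw of enterprises.
    fake_treated = set(rng.sample(firms, n_treated))
    # Virtual experimental period: a random shock year for each enterprise.
    fake_start = {f: rng.choice(years_by_firm[f]) for f in firms}
    return {(f, y): int(f in fake_treated and y >= fake_start[f])
            for f, y in panel}

def run_placebo(panel, n_treated, estimate, reps=500, seed=0):
    """Repeat the placebo assignment and collect whatever `estimate` returns."""
    rng = random.Random(seed)
    return [estimate(placebo_did(panel, n_treated, rng)) for _ in range(reps)]
```

If `estimate` returns a coefficient and its p-value, the resulting list of 500 pairs can be plotted directly as the scatter and kernel density plots described above.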
4.4.2. Replacement of Identification Strategy (Stacked DID)
Goodman-Bacon (2021) showed that the DID regression coefficient can be decomposed into a weighted average of the DID regression coefficients under four types of controlled experiments [18]. The sample can be divided into four types of controlled experiments according to the status of the experimental and control groups: enterprises treated early versus enterprises never treated; enterprises treated late versus enterprises never treated; enterprises treated early versus enterprises treated late (but not yet treated); and enterprises treated late versus enterprises treated early (already treated). The control group in the fourth type of controlled experiment is not reasonable, because it has already received the treatment. If the regression coefficients of the fourth type of controlled experiment are close to those of the remaining three types, or if their weights are small, the DID regression coefficients of the baseline regression can be accepted. Otherwise, the accuracy of the DID regression coefficients will be affected, and even diametrically opposite results may be obtained. Such estimation biases can be diagnosed using the Bacon Decomposition and Negative Weight Diagnostics and remedied using a variety of heterogeneity-robust DID estimators. Heterogeneity-robust estimators specifically include DID with Multiple Groups and Times (DID Multiple GT), DID Imputation, MWFE Stacked DID (also known as Stacked Event), Callaway and Sant’Anna DID (CSDID), Event Study, Event Study Interact, Two-Stage DID, Two-Step DID (DID2S), and so on.
To test whether the baseline regression results are unbiased, this study draws on previous studies and chooses to employ MWFE Stacked DID to conduct a robustness test [19,20,21]. Stacked DID is roughly divided into three steps: a. Divide the sample into subsamples based on the time of the policy shock. b. Decide on the experimental and control groups within the subsamples. c. Combine all subsamples and estimate the coefficients. Columns (1) and (2) of Table 5 represent the regression results of the baseline regression and the stacked DID, respectively. Compared with the baseline regression, the DID coefficient of Stacked DID has not changed significantly and remains robust in terms of significance, meaning that the conclusions drawn from the baseline regression still hold.
Table 5.
Regression results of stacked DID.
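The three steps of Stacked DID can be sketched as follows. This is an illustration under stated assumptions rather than the exact procedure used here: the control group in each subsample is assumed to consist of never-treated enterprises only, no event window is imposed, and the row layout (`firm`, `year`, `start`) and the function `build_stack` are hypothetical.

```python
def build_stack(rows):
    """Stack one subsample per treatment cohort.

    rows: list of dicts with keys "firm", "year", and "start"
    (the treatment year, or None for never-treated enterprises).
    """
    # Step a: one subsample per policy-shock year (cohort).
    cohorts = sorted({r["start"] for r in rows if r["start"] is not None})
    stacked = []
    for g in cohorts:
        for r in rows:
            is_cohort = r["start"] == g
            is_control = r["start"] is None   # assumed: never-treated only
            # Step b: keep the cohort's enterprises and the clean controls.
            if not (is_cohort or is_control):
                continue
            stacked.append({
                "stack": g,
                "firm": r["firm"],
                "year": r["year"],
                "treat": int(is_cohort),
                "did": int(is_cohort and r["year"] >= g),
            })
    # Step c: the combined rows are returned for a single pooled regression.
    return stacked
```

In the pooled regression, the fixed effects would typically be interacted with the `stack` identifier so that each cohort is compared only with its own control group; control enterprises therefore appear once per subsample.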
4.4.3. Controlling for the Impact of Other Policy Factors
Cross-border e-commerce essentially belongs to the category of the digital economy, and the preferential policies of the integrated pilot zones for cross-border e-commerce (IPZCE) will stimulate enterprises to accelerate both their integration into cross-border e-commerce platforms and their digital transformation. Therefore, the IPZCE policy could also promote enterprises’ NQPF by promoting their digital transformation. Omitting the positive impact of the IPZCE policy may bias the estimation of the DID coefficients. This study therefore sets the dummy variable IPZCE for IPZCE policies. If a city establishes a pilot zone in a certain year, the variable IPZCE for the city’s enterprises takes the value of 1 in that year and the following years; otherwise, it takes the value of 0. Column (1) of Table 6 shows the results of the baseline regression, and Column (2) shows the regression results after controlling for the IPZCE policy. The results indicate that the DID coefficients and their significance did not change significantly when considering the impact of the IPZCE policy. Therefore, the baseline regression conclusions still hold.
Table 6.
Regression results when considering the impact of the integrated pilot zone for cross-border e-commerce.
4.4.4. Controlling for Expected Effect
Before the implementation of any policy, individual expectations are likely to have an impact on the relevant economic variables, as the implementation of any policy is often preceded by signals. In other words, there may be expected effects, which means that “before the policy is implemented, the effect comes first”. If the expected factor is omitted, the regression results may be inaccurate. Based on this, this paper considers introducing a treatment effect dummy variable EXPECT in a relative time of 1 year before the establishment of the NBDCPZ, so as to control the impact of the expected factor on enterprises’ NQPF. When the year of enterprise in the experimental group is exactly 1 year earlier than the establishment of the NBDCPZ in which the enterprise is located, EXPECT takes the value of 1; otherwise, it takes the value of 0.
Columns (1) and (2) of Table 7 show the baseline regression results and the regression results after controlling for the expected effect, respectively. By comparison, the DID regression coefficient after controlling for the expected effect did not change markedly, with its value increasing only slightly. Overall, the expected effect did not significantly impact the results of the baseline regression. This suggests that the baseline regression conclusions still hold after controlling for the effect of the expected factor on enterprises’ NQPF.
Table 7.
Regression results after controlling for the expected effect.
4.4.5. Adjusting the Sample
The COVID-19 pandemic began at the end of 2019 and may interfere with the estimation results. Therefore, the regression is re-estimated after excluding the samples from 2020 onwards. Columns (1) and (2) of Table 8 show the results for the baseline regression and for the sample excluding 2020 and beyond, respectively. The results imply that the conclusions of the baseline regression still hold.
Table 8.
Regression results after adjusting the samples.
In conclusion, the findings of the baseline regression are supported, and Hypothesis 1 is confirmed by the seven types of robustness tests mentioned above.
4.5. Endogeneity Discussion and Model Limitations
Endogeneity discussion: This paper addresses endogeneity problems in the following ways: a. The explained variable uses enterprise-level data, while the independent variable uses city-level data, so the former is unlikely to influence the latter and reverse causation is unlikely. b. The DID approach can mitigate the endogeneity issue resulting from omitted variables. In addition, the estimation controls for individual, year, and city fixed effects and a series of individual characteristic variables, which further mitigates the endogeneity problem caused by omitted variables. c. The parallel trend test helps to address the endogeneity problem resulting from selection bias.
Model limitations: The aforementioned method effectively mitigates endogeneity problems, but this paper may still suffer from endogeneity problems caused by other factors, such as measurement errors, which could be one of the limitations of the model in this paper. It is important to note that endogeneity problems cannot be entirely resolved but can only be effectively mitigated to obtain estimates that are close to the true state and are acceptable. Furthermore, the model presented in this paper may have additional shortcomings, including the following: a. It does not account for the spillover effect of the NBDCPZ from one pilot city to another. b. It does not consider the nonlinear policy effect.
4.6. Mechanism Test
4.6.1. The Mediation Effect of Digital Transformation on the NBDCPZ Promoting NQPF
Hypothesis 2 states that the NBDCPZ can promote enterprises’ NQPF by promoting digital transformation. This subsection examines Hypothesis 2. Following Baron and Kenny (1986), this paper constructs the mediating effects model as follows to conduct a mechanism test [22]:
where DIG represents the extent of digital transformation. The variables in Equations (2) and (3) have a similar meaning to those in Equation (1) (see Section 3.1).
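In the conventional Baron and Kenny layout, the three steps take the form sketched below. The coefficient symbols, the control vector $X_{it}$, and the fixed-effect terms are notational assumptions made here for illustration; they follow the description of the baseline model (firm, year, and city fixed effects plus firm-level controls) and are not a reproduction of the typeset Equations (1)–(3).

```latex
\begin{align}
NQPF_{it} &= \alpha_0 + \alpha_1 DID_{it} + \lambda_1' X_{it} + \mu_i + \nu_t + \theta_c + \varepsilon_{it} \\
DIG_{it}  &= \beta_0 + \beta_1 DID_{it} + \lambda_2' X_{it} + \mu_i + \nu_t + \theta_c + \varepsilon_{it} \\
NQPF_{it} &= \gamma_0 + \gamma_1 DID_{it} + \gamma_2 DIG_{it} + \lambda_3' X_{it} + \mu_i + \nu_t + \theta_c + \varepsilon_{it}
\end{align}
```

Under this layout, mediation is indicated when $\beta_1$ and $\gamma_2$ are both significant, and it is partial when $\gamma_1$ remains significant but is smaller than $\alpha_1$.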
Following Wu (2021), this paper counts the frequency of words related to digital transformation in the annual reports of listed enterprises and uses this frequency to measure their level of digital transformation [23]. The keyword list and the technical details of the frequency analysis can be found in Appendix B. This measure serves as the mediator for testing Hypothesis 2. The regression results are presented in Columns 1–3 of Table 9. They indicate that the DID coefficients in Equations (1) and (2) are both significantly positive, as are the coefficients on DID and DIG in Equation (3), with the DID coefficient in Equation (1) being greater than that in Equation (3). This suggests that the NBDCPZ can indirectly improve enterprises’ NQPF by promoting their digital transformation, thereby supporting Hypothesis 2. Since the DID coefficient in Equation (2) and the DIG coefficient in Equation (3) are both significantly positive, there is no need to conduct a Sobel test.
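The keyword-frequency measure can be illustrated with a minimal Python sketch. The function names and the sample sentence are hypothetical, the keyword subset is taken from Table A5, and the sketch is not the code supplied in Annex C; in particular, any text cleaning or scaling applied there is not reproduced here.

```python
from typing import Dict, Iterable


def keyword_counts(report_text: str, keywords: Iterable[str]) -> Dict[str, int]:
    """Count non-overlapping occurrences of each keyword in one annual report."""
    return {kw: report_text.count(kw) for kw in keywords}


def digital_transformation_frequency(report_text: str, keywords: Iterable[str]) -> int:
    """Total keyword frequency, used here as the raw digitalization measure."""
    return sum(keyword_counts(report_text, keywords).values())


# Hypothetical one-sentence "report" and a three-keyword subset of Table A5.
sample_text = "公司持续投入大数据和云计算,并建设大数据平台。"
sample_keywords = ["大数据", "云计算", "人工智能"]

counts = keyword_counts(sample_text, sample_keywords)
total = digital_transformation_frequency(sample_text, sample_keywords)
print(counts, total)
```

One design point worth noting: some keywords in the list are substrings of others (for example, 移动互联 and 移动互联网), so a naive substring count such as this one would count the longer term under both entries unless the overlap is handled explicitly.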
Table 9.
Results of the mechanism test.
4.6.2. The Moderation Effect of Digital Talent on the NBDCPZ Promoting Digitalization
To test the moderation effect of digital talent on the NBDCPZ promoting digitalization (Hypothesis 3), this paper constructs a moderating model as follows:
where TAL represents the level of digital talent. The variables in Equation (4) have a similar meaning to those in Equation (2).
This paper uses the share of practitioners in the information transmission, computer service, and software industries in the total workforce across all industries to measure the supply scale of digital talent in NBDCPZs. This variable serves as the moderator for testing Hypothesis 3. Column 4 of Table 9 shows that digital talent can amplify the promotion effect of NBDCPZs on enterprises’ digital transformation, thereby supporting Hypothesis 3.
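A moderating model of this kind is commonly written with an interaction term, as sketched below. The coefficient symbols, the city-year indexing of TAL, and the control and fixed-effect terms are assumptions made for illustration and are not a reproduction of the typeset Equation (4).

```latex
\begin{equation}
DIG_{it} = \delta_0 + \delta_1 DID_{it} + \delta_2 \left( DID_{it} \times TAL_{ct} \right) + \delta_3 TAL_{ct}
         + \lambda' X_{it} + \mu_i + \nu_t + \theta_c + \varepsilon_{it}
\end{equation}
```

In this form, a significantly positive $\delta_2$ would indicate that a larger supply of digital talent strengthens the effect of the NBDCPZ on digital transformation.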
4.7. Heterogeneity Analysis
4.7.1. The Impact of the NBDCPZ on the NQPF of Enterprises of Varying Ownership Structures
This paper considers that the promotion effect of the NBDCPZ on enterprises’ NQPF may vary among enterprises of different ownership structures. The sample can be divided into three categories: SOEs, POEs, and FIEs. The estimated results for the subsamples are shown in Table 10.
Table 10.
Regression results of enterprises of varying ownership structures.
Columns (1)–(3) of Table 10 show that the NBDCPZ has a promotional effect on the NQPF of both POEs and FIEs, whereas the NQPF of SOEs is negatively affected. The possible reasons are as follows. Digital technology talents are the key elements for enterprises’ digital transformation, and the hiring process is simpler and faster in POEs and FIEs than in SOEs. According to institutional inertia theory, POEs and FIEs are less prone than SOEs to path dependence and behavioral rigidity and can adjust business processes, organizational structure, and development strategies in a timely manner. Therefore, under the NBDCPZ policy, POEs and FIEs can respond more swiftly and recruit digital technology talents to meet the demands of digital transformation. Incentive theory also suggests that the benefit motivation of POEs and FIEs is greater than that of SOEs, so under digital subsidy policies they pursue digital transformation more actively. Similarly, risk appetite theory indicates that POEs and FIEs are more likely to seize the opportunities of digital economic development. Moreover, because POEs and FIEs offer more favorable remuneration, digital technology talents may migrate from SOEs to POEs and FIEs, which may hinder the development of NQPF within SOEs.
4.7.2. The Impact of the NBDCPZ on the NQPF of Enterprises of Different Sizes
Previous studies have shown that enterprises’ digital transformation is affected by enterprise size. Therefore, this paper explores the impact of the NBDCPZ on the NQPF of enterprises of varying sizes. Consistent with the measurement of the control variable, the enterprise’s total assets are used to represent enterprise size. This paper calculates the average size of each enterprise from 2011 to 2022, as well as the average size of the entire sample, and takes the latter as the threshold. When an enterprise’s average size is larger than the threshold, the enterprise is categorized as a large-scale enterprise; otherwise, it is categorized as a small-scale enterprise. The two categories of enterprises were regressed separately, and the results are presented in Table 11.
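The grouping rule can be illustrated with a minimal Python sketch. The function name and the toy records are hypothetical, and the sketch assumes that “the average size of the entire sample” means the pooled mean over all firm-year observations, which the text does not state explicitly.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, List, Tuple


def classify_by_size(
    records: List[Tuple[str, int, float]]
) -> Tuple[Dict[str, str], float]:
    """records holds (firm_id, year, size); returns firm labels and the threshold."""
    sizes_by_firm = defaultdict(list)
    for firm_id, _year, size in records:
        sizes_by_firm[firm_id].append(size)

    # Threshold: pooled mean over all firm-year observations (an assumption).
    threshold = mean(size for _firm, _year, size in records)

    labels = {
        firm_id: "large" if mean(sizes) > threshold else "small"
        for firm_id, sizes in sizes_by_firm.items()
    }
    return labels, threshold


# Hypothetical two-firm panel.
toy_records = [
    ("A", 2011, 100.0), ("A", 2012, 300.0),
    ("B", 2011, 10.0), ("B", 2012, 30.0),
]
labels, threshold = classify_by_size(toy_records)
print(labels, threshold)
```

Because each firm is classified once on its period average, it stays in the same subsample for every year, which keeps the two regression samples fixed over time.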
Table 11.
Regression results of enterprises of different sizes.
Columns (1) and (2) of Table 11 show that the NBDCPZ significantly promotes the NQPF of small-scale enterprises, whereas it inhibits that of large-scale enterprises. The possible reasons are as follows. Large-scale enterprises already possess greater reserves of capital, technology, talent, and other factors. In contrast, small-scale enterprises, owing to financial constraints, cannot afford the high costs of R&D, data purchase, talent hiring, and digital platform building that developing NQPF requires. These constraints can be alleviated through the digital subsidies and digital infrastructure construction provided by NBDCPZs. Small enterprises also resemble POEs or FIEs, while large enterprises resemble SOEs, so the results can likewise be explained by incentive theory, risk appetite theory, and institutional inertia theory. Consequently, small enterprises’ NQPF is more likely to be enhanced by the NBDCPZ’s policies.
4.7.3. The Impact of the NBDCPZ on the NQPF of Enterprises of Various Industries
This study also examines the varying promotional effects of the NBDCPZ on NQPF across different industries. The sample can be divided into three categories: primary industry, secondary industry (including manufacturing), and tertiary industry. The estimated results are presented in Table 12.
Table 12.
Regression results of enterprises of various industries.
Table 12 indicates that the NBDCPZ significantly promotes the NQPF of the secondary industry, with a more evident effect on the manufacturing industry. However, the NQPF of the primary industry is significantly inhibited, and that of the tertiary industry is not significantly affected. The possible reasons are as follows. The secondary industry, particularly manufacturing, has a significant demand for digital elements such as data and could benefit greatly from NBDCPZs. At the same time, the NBDCPZ mainly aims to promote the digitalization and development of the secondary industry, with a particular focus on manufacturing. Therefore, production factors such as capital, labor, and technology may flow from other industries to the secondary industry, possibly leading to an inhibitory effect on the primary industry and no significant effect on the tertiary industry.
4.7.4. The Impact of the NBDCPZ on the NQPF of Enterprises Located in Cities with Different Levels of Digital Infrastructure
To some extent, advanced digital infrastructure can accelerate the progress of enterprises in improving their NQPF. Consequently, a more advanced digital infrastructure could potentially accelerate the process of the NBDCPZ in promoting enterprises’ NQPF. The digital infrastructure level of each city is measured using six indicators: density of long-distance fiber optic cable lines, per capita Internet broadband access ports, percentage of employees in the information transmission, computer services, and software industry, per capita telecommunications business income, cellphone penetration rate, and Internet penetration rate. Based on the median of the average digital infrastructure index for each city from 2006 to 2021, the entire sample is divided into two groups: enterprises located in cities with a higher level of digital infrastructure and enterprises located in cities with a lower level of digital infrastructure. The regression results for the subsamples are shown in Table 13.
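The median-based grouping can be illustrated with a minimal Python sketch. The function name and the toy index values are hypothetical, and the sketch assumes that cities exactly at the median are assigned to the lower group, which the text does not specify.

```python
from statistics import mean, median
from typing import Dict, Tuple


def split_cities_by_median(
    index_by_city: Dict[str, Dict[int, float]]
) -> Tuple[Dict[str, str], float]:
    """index_by_city maps city -> {year: index value}; returns groups and cutoff."""
    city_average = {
        city: mean(list(by_year.values())) for city, by_year in index_by_city.items()
    }
    cutoff = median(list(city_average.values()))
    # Cities exactly at the median go to the lower group (an assumption).
    groups = {
        city: "high" if avg > cutoff else "low" for city, avg in city_average.items()
    }
    return groups, cutoff


# Hypothetical index values for three cities over two years.
toy_index = {
    "X": {2006: 1.0, 2007: 3.0},
    "Y": {2006: 4.0, 2007: 6.0},
    "Z": {2006: 0.5, 2007: 1.5},
}
groups, cutoff = split_cities_by_median(toy_index)
print(groups, cutoff)
```

Each enterprise then inherits the group of the city in which it is located, and the baseline model is estimated separately on the two subsamples.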
Table 13.
Regression results of enterprises located in cities with different levels of digital infrastructure.
From Table 13, for those enterprises located in cities with a high level of digital infrastructure, the NBDCPZ significantly promotes enterprises’ NQPF. Conversely, for those enterprises located in cities with low levels of digital infrastructure, the NBDCPZ policy does not have a significant effect.
4.7.5. The Impact of the NBDCPZ on the NQPF of Enterprises Located in Cities with Different Levels of Financial Development
Many enterprises encounter a shortage of funds when developing NQPF. The level of financial development in a city can effectively alleviate enterprises’ financing constraints and facilitate easier access to exogenous financing. Although the NBDCPZ offers financial support to enterprises for the development of NQPF through the digital subsidy policy, over half of the costs associated with digital transformation and the enhancement of NQPF are borne by the enterprises themselves. Thus, a high level of financial development in the city can expand financing channels, making it easier for enterprises to secure external funds for enhancing NQPF. Moreover, a robust level of financial development in the city can expand the government’s financing avenues, offering more sufficient financial backing for the execution of the NBDCPZ’s policies, thereby more effectively promoting the enhancement of local enterprises’ NQPF. The level of financial development in a city is gauged by the proportion of bank loans in the region’s (city’s) GDP. The sample is then categorized into two groups: high-level and low-level, according to the median of the average financial development level of each city from 2006 to 2021. The regression results are presented in Table 14.
Table 14.
Regression results of enterprises located in cities with different levels of financial development.
From Table 14, the promotion of enterprises’ NQPF by the NBDCPZ’s policies is significant in cities with high-level financial development, whereas the effects of these policies are not evident in cities with low-level financial development.
5. Research Conclusions and Policy Implications
5.1. Research Conclusions
Utilizing panel data from A-share listed enterprises in China’s Shanghai and Shenzhen stock markets from 2011 to 2022, this paper uses the MWFE Staggered DID method to examine the impact of establishing NBDCPZs on enterprises’ NQPF. The following specific conclusions are drawn: a. The establishment of the NBDCPZ significantly enhances the NQPF of enterprises within these zones. b. Mechanism tests demonstrate that promoting digital transformation is the pathway through which the NBDCPZ boosts NQPF, and digital talent can amplify the promotional effect of the NBDCPZ on enterprises’ digital transformation. c. The NBDCPZ significantly promotes NQPF in POEs, FIEs, and smaller enterprises, while significantly inhibiting it in SOEs and large enterprises. d. The promotional effect of the NBDCPZ on enterprises’ NQPF differs across industries. e. Regional (city-level) digital infrastructure and financial development levels amplify the NQPF-enhancing effects of the NBDCPZ.
The findings of this study contribute to a deeper evaluation of the effectiveness of the NBDCPZ policy in driving micro-level enterprises’ NQPF and reveal a mechanism (promoting digital transformation) through which the NBDCPZ enhances enterprises’ NQPF. The findings also have implications for the formulation of policies aimed at further enhancing enterprises’ NQPF. The following policy implications are proposed.
5.2. Policy Implications
5.2.1. Establish a Special Fund for Enterprises’ Digital Transformation to Alleviate the Financing Constraints of Enterprises
As one of the important mechanisms through which the NBDCPZ promotes enterprises’ NQPF, enterprises’ digital transformation should be taken as an important leverage point to advance enterprises’ NQPF. The success of enterprises’ digital transformation hinges on digital talents and the research and application of digital technology, which entail substantial financial costs. However, most enterprises face financing constraints. Consequently, the government could establish a special fund for enterprises’ digital transformation within the NBDCPZ. This fund would be used to subsidize enterprises in hiring digital technology talents and in conducting research on and applications of digital technology, thereby assisting them in enhancing their digital transformation capabilities and promoting the enhancement of NQPF. In addition, the government should guide relevant departments and enterprises to explore the potential of Big Data and other modern information technologies to empower financial innovation and development. The government should progressively improve platforms such as Big Data investment and financing platforms to narrow the information gap between borrowers and lenders, manage financial risks, improve lending efficiency, and expand enterprises’ financing channels. This will aid enterprises’ digital transformation and NQPF enhancement.
5.2.2. Utilize the Agglomeration Effect of the NBDCPZ to Foster the Concentration of Digital Industries and Digital Talent
The policy of the NBDCPZ effectively sends signals to develop the digital economy. This is beneficial for the entire digital industry chain, attracting the aggregation of digital industries and talents and further aiding the development of enterprises’ NQPF. Therefore, on one hand, it is essential to continue enhancing the publicity of the NBDCPZ policy and to improve policy support through preferential measures such as tax exemptions, financial subsidies, and the establishment of special funds. Simplifying the administrative approval process and implementing other measures are necessary to optimize the business environment, thereby reducing operational difficulties for enterprises and attracting more digital enterprises to enter the pilot zone. On the other hand, relevant government departments need to attract more digital talents to the pilot zone by addressing issues related to housing and settlement, employment for spouses, and school enrollment for children. Moreover, multiple measures are required to fully leverage the agglomeration effect of industries and talents in the NBDCPZ, thereby promoting enterprises’ digital transformation and the rapid improvement of NQPF.
5.2.3. Implementing Varying Policies for Enterprises of Different Sizes or Ownership Structures
As indicated in the heterogeneity analysis section, the positive impact of the NBDCPZ on enterprises’ NQPF is larger in small-scale and non-state-owned enterprises. This indicates that the government can adopt enterprise-specific policies in promoting the enhancement of enterprises’ NQPF. Specifically, small-scale and non-state-owned enterprises exhibit greater dynamism in innovation but often lack capital and talent. Consequently, the government should focus on providing financial assistance to support these enterprises’ digital transformation and NQPF development.
5.2.4. Strengthen the Building of Digital Infrastructure, Integrate and Share Data Elements, and Promote the Utilization and Popularization of Digital Technology
The construction of digital infrastructure is a fundamental project to promote the development of China’s network power, which is beneficial for the advancement of NQPF. Therefore, the government should promote the construction and improvement of three categories of infrastructures: network infrastructure represented by 5G, 6G, Satellite Internet, and other new-generation Broadband (Mobile) Communication technologies; information service infrastructure represented by Big Data centers and Industrial Internet; and computing power support infrastructure represented by Cloud Computing centers and Super-Computing centers. Taking full advantage of Big Data, Artificial Intelligence, and other technologies to accelerate the integration of data elements and share them with enterprises, the government can fully unlock the potential value of data elements and turn them into a driving force for enterprises’ NQPF. By leveraging digital infrastructure, the government can vigorously promote the application and popularization of Artificial Intelligence, Big Data, Blockchain, and other digital technologies in the digital transformation of enterprises and the development of NQPF.
Supplementary Materials
The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/su17188262/s1. Annex A, Stata Regression Code and Data; Annex B, Stata Code and Raw Data for Entropy Weighting Method; Annex C, Python Code for Digital Transformation Keywords Frequency Analysis.
Author Contributions
Conceptualization, W.H.; methodology, Y.Z.; validation, Y.Z., G.L. and T.S.; formal analysis, T.S.; resources, W.H.; data curation, G.L.; writing—original draft preparation, Y.Z., G.L. and W.H.; writing—review and editing, Y.Z. and W.H. All authors have read and agreed to the published version of the manuscript.
Funding
This research was funded by the National Natural Science Foundation of China (grant number 72302221), Major Humanities and Social Sciences Research Projects in Zhejiang Higher Education Institutions (grant number 2023QN113), and the Natural Science Foundation of Zhejiang Province (grant number LQ22G020001).
Institutional Review Board Statement
Not applicable.
Informed Consent Statement
Not applicable.
Data Availability Statement
Code and data are provided as annexes (Supplementary Materials) to this article. Access is currently available through this anonymous private link: https://www.scidb.cn/anonymous/N25hWUJ2 (accessed on 7 September 2025).
Conflicts of Interest
The authors declare no conflicts of interest.
Abbreviations
The following abbreviations are used in this manuscript:
| NBDCPZ | National Big Data Comprehensive Pilot Zone |
| NQPF | New Quality Productive Forces |
| MWFE Staggered DID | the multi-way fixed effects staggered difference-in-differences |
| POEs | privately owned enterprises |
| FIEs | foreign-invested enterprises |
| SOEs | state-owned enterprises |
Appendix A. Entropy Weighting Method for NQPF Index Construction
Note: All data and Stata code are uploaded as Annex B. You can use the data and code to replicate the calculation process, results, and sensitivity analysis of entropy weights for NQPF.
Appendix A.1. Raw Indicators and Units
The raw indicators and their units are as follows:
Table A1.
Explanation of all sub-indicators.
| Indicator Code | Sub-Indicators | Raw Indicators | Raw Data Unit |
|---|---|---|---|
| A1 | Percentage of R&D salary | (Salaries and wages in R&D expenses)/Operational revenues | Chinese Yuan (CNY) |
| A2 | Percentage of R&D staff | Number of R&D staff/Number of employees | Headcount |
| A3 | Percentage of educated staff | Number of people with master’s degree or higher/Number of employees | Headcount |
| B1 | Percentage of fixed assets | Fixed assets/Total assets | Chinese Yuan (CNY) |
| B2 | Percentage of manufacturing costs | (Subtotal of cash outflows from operating activities + Depreciation of fixed assets + Amortization of intangible assets + Provision for impairment − Cash paid for purchases of goods and labor − Wages paid to employees)/(Subtotal of cash outflows from operating activities + Depreciation of fixed assets + Amortization of intangible assets + Provision for impairment) | Chinese Yuan (CNY) |
| C1 | Percentage of R&D depreciation and amortization | (Depreciation and amortization in R&D expenses)/Operational revenues | Chinese Yuan (CNY) |
| C2 | Percentage of R&D lease expenses | (Lease expenses in R&D expenses)/Operational revenues | Chinese Yuan (CNY) |
| C3 | Percentage of R&D direct investment | (Direct investment in R&D expenses)/Operational revenues | Chinese Yuan (CNY) |
| C4 | Percentage of intangible assets | Intangible assets/Total assets | Chinese Yuan (CNY) |
| D1 | Turnover rate of total assets | Operational revenues/Average total assets | Chinese Yuan (CNY) |
| D2 | Inverse of equity multiplier | Owners’ equity/Total assets | Chinese Yuan (CNY) |
Note: a. All missing raw data points were replaced with 0 prior to ratio calculation. The average proportion of missing values in the raw data used to calculate the sub-indicators is about 17.82%: 58.93% for Salaries and wages in R&D expenses, 2.02% for Operational revenues, 37.78% for Number of R&D staff, 0.05% for Number of employees, 0.05% for Fixed assets, 0.00% for Total assets, Subtotal of cash outflows from operating activities, and Owners’ equity, 0.15% for Depreciation of fixed assets, 1.60% for Amortization of intangible assets, 2.50% for Provision for impairment, 2.09% for Cash paid for purchases of goods and labor, 0.01% for Wages paid to employees, 1.35% for Intangible assets, and 12.32% for Average total assets. Samples with a missing value for any sub-indicator are removed. A sensitivity check in which missing values are left missing is reported at the end of Appendix A. b. Following the approach of defining “Living Labor” as the proportion of personnel costs, “Materialized Labor” is defined as the proportion of non-personnel costs. First, “(Subtotal of cash outflows from operating activities + Depreciation of fixed assets + Amortization of intangible assets + Provision for impairment)” represents the total cost. Then, “(Subtotal of cash outflows from operating activities + Depreciation of fixed assets + Amortization of intangible assets + Provision for impairment − Cash paid for purchases of goods and labor − Wages paid to employees)” represents the non-personnel costs; it is obtained by subtracting the two personnel cost items, “Cash paid for purchases of goods and labor” and “Wages paid to employees”, from the total cost. Finally, “Materialized Labor” is the non-personnel costs divided by the total cost.
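The B2 calculation described in note b can be illustrated with a minimal Python sketch. The function and argument names and the example amounts are hypothetical; returning None when the total cost is zero is an assumption made here to avoid division by zero, not a rule stated in the paper.

```python
from typing import Optional


def materialized_labor_share(
    cash_outflows: float,
    depreciation: float,
    amortization: float,
    impairment: float,
    purchases_paid: float,
    wages_paid: float,
) -> Optional[float]:
    """Share of non-personnel costs in total cost (sub-indicator B2)."""
    total_cost = cash_outflows + depreciation + amortization + impairment
    if total_cost == 0:
        return None  # ratio undefined; handling is an assumption of this sketch
    non_personnel_cost = total_cost - purchases_paid - wages_paid
    return non_personnel_cost / total_cost


# Hypothetical amounts in CNY.
b2 = materialized_labor_share(1000.0, 50.0, 30.0, 20.0, 600.0, 200.0)
print(round(b2, 4))
```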
Appendix A.2. Monotonicity and Directionality
All eleven secondary indicators (A1, A2, A3, B1, B2, C1, C2, C3, C4, D1, D2) are conceptualized as positive indicators for the NQPF construct. A higher value for any indicator is assumed to represent a higher level of productive forces. Consequently, no directionality reversal was necessary before normalization.
Appendix A.3. Normalization Steps
To render the different secondary indicators comparable and dimensionless, they were normalized to a [0, 1] scale using the Min–Max method. The formula for a positive indicator is
S_Xij = (Xij − min(Xj))/(max(Xj) − min(Xj)), i = 1, ……, n (number of observations); j = 1, ……, m
where Xij is the raw value of the jth secondary indicator (any one of A1, A2, A3, B1, B2, C1, C2, C3, C4, D1, D2) for firm-year i, and max(Xj) and min(Xj) are the maximum and minimum values of that secondary indicator across the entire sample, respectively.
To avoid undefined values in subsequent entropy calculations, normalized values of 0 were replaced with a negligible constant (0.000001).
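The normalization step, including the replacement of zeros, can be illustrated with a minimal Python sketch. The function name and the example values are hypothetical; raising an error for a constant indicator is an assumption of this sketch, since the text does not discuss that case.

```python
from typing import List


def min_max_normalize(values: List[float], floor: float = 0.000001) -> List[float]:
    """Scale one indicator to [0, 1], then replace exact zeros with a small constant."""
    lowest, highest = min(values), max(values)
    if highest == lowest:
        raise ValueError("indicator is constant; Min-Max scaling is undefined")
    scaled = [(v - lowest) / (highest - lowest) for v in values]
    # Zeros would make ln(P_ij) undefined in the entropy step.
    return [floor if s == 0 else s for s in scaled]


# Hypothetical raw values of one secondary indicator.
normalized = min_max_normalize([2.0, 4.0, 6.0])
print(normalized)
```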
Appendix A.4. Entropy and Weight Formulas
The entropy weight method, an objective weighting technique, was applied based on the normalized matrix. The procedure is as follows:
Appendix A.4.1. Calculate Proportion Pij
For S_Xij, the proportion Pij is calculated as
Pij = S_Xij/Σi S_Xij, where the sum runs over i = 1, ……, n
Appendix A.4.2. Calculate Entropy Value ej
The entropy value for the jth secondary indicator is
ej = −(1/ln(n)) Σi Pij ln(Pij)
where n is the number of firm-year observations.
Appendix A.4.3. Calculate Degree of Divergence dj
The divergence of intrinsic information for each secondary indicator is
dj = 1 − ej
Appendix A.4.4. Calculate Entropy Weight wj
The objective weight for each secondary indicator is
wj = dj/Σj dj
where m is the number of indicators (11 in this case) and the sum runs over j = 1, ……, m.
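The four steps of Appendix A.4 can be illustrated with a minimal Python sketch. It uses the standard entropy weight expressions (proportion, entropy scaled by 1/ln(n), divergence, and normalized weight), which are assumed here to match the procedure described; the function name and the toy data are hypothetical, and this is not the Stata code supplied in Annex B.

```python
import math
from typing import List


def entropy_weights(columns: List[List[float]]) -> List[float]:
    """columns holds m indicators, each a list of n normalized, strictly positive values."""
    n = len(columns[0])
    divergences = []
    for column in columns:
        total = sum(column)
        proportions = [value / total for value in column]          # P_ij
        entropy = -sum(p * math.log(p) for p in proportions) / math.log(n)  # e_j
        divergences.append(1.0 - entropy)                          # d_j
    divergence_sum = sum(divergences)
    return [d / divergence_sum for d in divergences]               # w_j


# Hypothetical data: a constant indicator and a dispersed one.
toy_columns = [
    [1.0, 1.0, 1.0, 1.0],
    [0.000001, 0.2, 0.5, 1.0],
]
weights = entropy_weights(toy_columns)
print(weights)
```

The constant indicator carries no information for distinguishing observations, so its entropy is at the maximum and its weight is close to zero, while the dispersed indicator receives almost all of the weight. The same logic is consistent with sub-indicators such as B2 and D2 receiving very small weights in Table A2.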
Appendix A.5. Final Weights
The entropy weights were calculated on the entire panel dataset (2010–2022) rather than separately for each year. This approach ensures consistency in the weighting scheme across time, allowing for longitudinal comparability of the NQPF index. The final weights for each indicator are as follows:
Table A2.
Weights of all sub-indicators.
| Indicator | Weight |
|---|---|
| A1 | 0.2565175 |
| A2 | 0.0217321 |
| A3 | 0.0263404 |
| B1 | 0.0087363 |
| B2 | 0.0000116 |
| C1 | 0.2492229 |
| C2 | 0.1387839 |
| C3 | 0.2775407 |
| C4 | 0.0148563 |
| D1 | 0.0062573 |
| D2 | 0.00000087 |
Appendix A.6. Sensitivity Analysis to Alternative Normalization Methods
To test the robustness of the entropy weights to the normalization technique, we computed weights using two alternative methods:
a. Z-Score Normalization (Shifted to Non-Negative): S_Xij = (Xij − mean(Xj))/sd(Xj) (i = 1, 2, …, n (observations number); j = 1, 2, ……, m (11)), followed by a translation to ensure non-negativity (S_Xij = S_Xij − min(S_Xj)). Similarly, normalized values of 0 were replaced with a negligible constant (0.000001).
b. Vector Normalization: S_Xij = Xij/sqrt(Σi Xij²), where the sum runs over i = 1, 2, …, n.
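The two alternative normalizations can be illustrated with a minimal Python sketch. The function names and example values are hypothetical, and the use of the sample standard deviation (n − 1 in the denominator) is an assumption, since the text does not state which definition of sd is applied.

```python
import math
from statistics import mean, stdev
from typing import List


def zscore_shifted(values: List[float], floor: float = 0.000001) -> List[float]:
    """Z-score normalization, shifted to be non-negative, with zeros replaced."""
    mu, sd = mean(values), stdev(values)  # sample sd is an assumption
    z = [(v - mu) / sd for v in values]
    z_min = min(z)
    shifted = [s - z_min for s in z]
    return [floor if s == 0 else s for s in shifted]


def vector_normalize(values: List[float]) -> List[float]:
    """Divide each value by the Euclidean norm of the whole indicator column."""
    norm = math.sqrt(sum(v * v for v in values))
    return [v / norm for v in values]


# Hypothetical raw values of one secondary indicator.
z_values = zscore_shifted([1.0, 2.0, 3.0])
v_values = vector_normalize([3.0, 4.0])
print(z_values, v_values)
```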
The entropy weights from all three methods (Method 1 (Min–Max), Method 2 (Z-Score), and Method 3 (Vector)) are roughly consistent. This is visually confirmed by the entropy weights matrix and heatplot below (see Figure A1), which shows C > A > B > (or =) D according to the relative importance of the secondary indicator sets (each one of A, B, C, and D represents a secondary indicator set). The high Spearman correlation coefficients (ρ ≥ 0.90) between each pair of weight vectors from the three methods further confirm the robustness of the weighting scheme to the choice of normalization technique (see Table A3).
Conclusions: The construction of the NQPF index is robust. The entropy weights are stable and not sensitive to the chosen normalization technique, as evidenced by the sensitivity analysis.
Figure A1.
Sensitivity Analysis—Matrix and Heatplot of Entropy Weights under Different Normalization Methods. The Entropy Weights show nearly uniform color across all methods for each secondary indicator set (A, B, C, and D), indicating identical weights.
Table A3.
Sensitivity Analysis—Spearman’s Rank Correlation Coefficients Matrix for Entropy Weight Vectors from Different Normalization Methods.
| | Method 1 (Min–Max) | Method 2 (Z-Score) | Method 3 (Vector) |
|---|---|---|---|
| Method 1 | 1.000 | 1.000 ** | 0.900 |
| Method 2 | 1.000 ** | 1.000 | 0.900 |
| Method 3 | 0.900 | 0.900 | 1.000 |
Note: ** indicates significance at the 5% level.
Appendix A.7. A Sensitivity Check in Which Missing Values Are Left Missing
When missing values in the raw data used to calculate the sub-indicators are kept, the entropy weights from all three methods (Method 1 (Min–Max), Method 2 (Z-Score), and Method 3 (Vector)) are roughly consistent. This is visually confirmed by the entropy weights matrix and heatplot below (see Figure A2), which shows C > A > B > (or =) D according to the relative importance of the secondary indicator sets (each one of A, B, C, and D represents a secondary indicator set). The high Spearman correlation coefficients (ρ ≥ 0.836) between each pair of weight vectors from the three methods further confirm the robustness of the weighting scheme to the choice of normalization technique (see Table A4).
Next, we compare the weight set obtained by Method 1 in Figure A1 with that obtained by Method 1 in Figure A2. The results indicate that the weights of only four sub-indicators have changed slightly, suggesting that the weight set has hardly changed. Moreover, the NQPF calculated from the former weight set differs slightly from that calculated from the latter, but the relative NQPF values across years for a given enterprise, and across enterprises in a given year, remain unchanged.
Figure A2.
Sensitivity Analysis—Matrix and Heatplot of Entropy Weights under Different Normalization Methods (When keeping missing values of raw data). The Entropy Weights show nearly uniform color across all methods for each secondary indicator set (A, B, C, and D), indicating identical weights.
Table A4.
Sensitivity Analysis—Spearman’s Rank Correlation Coefficients Matrix for Entropy Weight Vectors from Different Normalization Methods (when keeping missing values of raw data).
| | Method 1 (Min–Max) | Method 2 (Z-Score) | Method 3 (Vector) |
|---|---|---|---|
| Method 1 | 1.000 | 1.000 ** | 0.836 |
| Method 2 | 1.000 ** | 1.000 | 0.836 |
| Method 3 | 0.836 | 0.836 | 1.000 |
Note: ** indicates significance at the 5% level.
Appendix B. Digital Transformation Keywords List and Its Frequency Analysis Details
Appendix B.1. Keyword List
Table A5.
Keyword list of digital transformation.
| 人工智能 | 异构数据 | 数字货币 | 智能穿戴 |
| Artificial Intelligence | Heterogeneous Data | Digital Currency | Smart Wearables |
| 商业智能 | 征信 | 分布式计算 | 智慧农业 |
| Business Intelligence | Credit Reporting | Distributed Computing | Smart Agriculture |
| 图像理解 | 增强现实 | 差分隐私技术 | 智能交通 |
| Image Understanding | Augmented Reality | Differential Privacy Technology | Intelligent Transportation |
| 投资决策辅助系统 | 混合现实 | 智能金融合约 | 智能医疗 |
| Investment Decision Support System | Mixed Reality | Smart Financial Contracts | Smart Healthcare |
| 智能数据分析 | 虚拟现实 | 移动互联网 | 智能客服 |
| Intelligent Data Analytics | Virtual Reality | Mobile Internet | Intelligent Customer Service |
| 智能机器人 | 云计算 | 工业互联网 | 智能家居 |
| Intelligent Robots | Cloud Computing | Industrial Internet | Smart Home |
| 机器学习 | 流计算 | 移动互联 | 智能投顾 |
| Machine Learning | Stream Computing | Mobile Connectivity | Intelligent Investment Advisory |
| 深度学习 | 图计算 | 互联网医疗 | 智能文旅 |
| Deep Learning | Graph Computing | Internet Healthcare | Smart Culture and Tourism |
| 语义搜索 | 内存计算 | 电子商务 | 智能环保 |
| Semantic Search | In-Memory Computing | E-Commerce | Smart Environmental Protection |
| 生物识别技术 | 多方安全计算 | 移动支付 | 智能电网 |
| Biometric Technology | Secure Multi-Party Computation | Mobile Payment | Smart Grid |
| 人脸识别 | 类脑计算 | 第三方支付 | 智能营销 |
| Facial Recognition | Neuromorphic Computing | Third-Party Payment | Smart Marketing |
| 语音识别 | 绿色计算 | NFC支付 | 数字营销 |
| Speech Recognition | Green Computing | NFC Payment | Digital Marketing |
| 身份验证 | 认知计算 | 智能能源 | 无人零售 |
| Identity Authentication | Cognitive Computing | Smart Energy | Unmanned Retail |
| 自动驾驶 | 融合架构 | B2B | 互联网金融 |
| Autonomous Driving | Converged Architecture | Business-to-Business | Internet Finance |
| 自然语言处理 | 亿级并发 | B2C | 数字金融 |
| Natural Language Processing | Hundreds of Millions-Level Concurrency | Business-to-Consumer | Digital Finance |
| 大数据 | EB级存储 | C2B | Fintech |
| Big Data | Exabyte-Level Storage | Customer-to-Business | Financial Technology |
| 数据挖掘 | 物联网 | C2C | 金融科技 |
| Data Mining | Internet of Things | Customer-to-Customer | Fintech |
| 文本挖掘 | 信息物理系统 | O2O | 量化金融 |
| Text Mining | Cyber-Physical Systems | Online-to-Offline | Quantitative Finance |
| 数据可视化 | 区块链 | 网联 | 开放银行 |
| Data Visualization | Blockchain | Networked Connectivity | Open Banking |
Appendix B.2. Analysis Steps
Step 1: Download all annual reports of listed companies from cninfo (http://www.cninfo.com.cn/, accessed on 1 July 2024), the only official website approved by the China Securities Regulatory Commission for the information disclosure of listed companies.
Step 2: Use Python (v3.8.3) to perform word frequency analysis on the reports with the above digital transformation keywords (the Python code is provided in Annex C).
Step 3: Import the processed results into Stata (v17.0 MP-Parallel Edition) for data integration and further statistical analysis.
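The handoff from Step 2 to Step 3 can be sketched as follows: keyword counts per firm-year are written to a CSV file that Stata can read with `import delimited`. This is only an illustration under stated assumptions. The column names `stkcd`, `year`, and `dt_total`, the record layout, and the choice to sum raw counts into a single total are hypothetical and are not taken from the paper or from Annex C.

```python
import csv
import io

def write_firm_year_counts(records, fileobj):
    """Write one row per firm-year with the total keyword count."""
    # Hypothetical column names; ASCII headers keep the file easy to import.
    writer = csv.DictWriter(fileobj, fieldnames=["stkcd", "year", "dt_total"])
    writer.writeheader()
    for rec in records:
        writer.writerow({
            "stkcd": rec["stkcd"],
            "year": rec["year"],
            # Assumption: the firm-year measure is the plain sum of counts.
            "dt_total": sum(rec["counts"].values()),
        })

# Hypothetical records, as they might come out of the word frequency step.
records = [
    {"stkcd": "000001", "year": 2020, "counts": {"云计算": 3, "大数据": 5}},
    {"stkcd": "000002", "year": 2020, "counts": {"云计算": 0, "大数据": 1}},
]

# An in-memory buffer stands in for a real file opened with newline="".
buffer = io.StringIO()
write_firm_year_counts(records, buffer)
csv_text = buffer.getvalue()
```

Stock codes are kept as strings here because they carry leading zeros, which may need to be preserved explicitly when the file is imported.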
Appendix B.3. Technical Details
The construction details of the digital transformation dictionary (DIG) are as follows:
- The complete keyword dictionary is reported in Table A5, and the set of terms is derived from Wu (2021) (see Reference [23] for details).
- For Chinese text segmentation, we apply the jieba tokenizer, with an additional procedure to protect digital transformation keywords. Specifically, each keyword is temporarily replaced by a unique placeholder of the form “_DT_WORD {a digital transformation word}__”, ensuring that jieba does not break multi-word entries during tokenization. After tokenization, the placeholders are replaced by the original keywords.
- Regarding text cleaning, we remove common stop-words based on the widely accepted Harbin Institute of Technology (HIT) stop-word list.
- Unlike some studies that restrict the analysis to particular sections (e.g., MD&A), we apply no section filter, so that digital transformation characteristics are captured across the entire annual report.
All procedures, including tokenization, keyword protection, stop-word filtering, and frequency computation, are implemented in Python (v3.8.3). For replication and transparency, we provide the full code in Annex C.
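The keyword protection and counting procedure described above can be sketched in a few lines of Python. This is a hedged illustration, not the code in Annex C. The placeholder format `_DT_WORD_<index>__`, the function names, and the sample sentence are assumptions. The tokenizer is passed in as a callable so that the sketch runs without jieba; in practice `jieba.lcut` could be supplied. As a variant of the described procedure, the text is split on the placeholders so that the tokenizer never sees them, which avoids depending on how a given tokenizer treats underscores.

```python
import re
from collections import Counter

PLACEHOLDER = re.compile(r"(_DT_WORD_\d+__)")

def protect(text, keywords):
    """Replace each keyword with a unique placeholder, longest keywords first."""
    mapping = {}
    for i, kw in enumerate(sorted(keywords, key=len, reverse=True)):
        token = f"_DT_WORD_{i}__"
        mapping[token] = kw
        text = text.replace(kw, token)
    return text, mapping

def tokenize_protected(text, keywords, tokenize):
    """Tokenize text while keeping every keyword as a single token."""
    protected, mapping = protect(text, keywords)
    tokens = []
    # Splitting on a capturing group keeps the placeholders in the result.
    for chunk in PLACEHOLDER.split(protected):
        if chunk in mapping:
            tokens.append(mapping[chunk])    # restore the original keyword
        elif chunk:
            tokens.extend(tokenize(chunk))   # ordinary text goes to the tokenizer
    return tokens

def keyword_frequencies(text, keywords, tokenize, stop_words=frozenset()):
    """Count keyword occurrences after stop-word filtering."""
    tokens = [t for t in tokenize_protected(text, keywords, tokenize)
              if t not in stop_words]
    counts = Counter(tokens)
    return {kw: counts[kw] for kw in keywords}

# Hypothetical sample; `list` (character split) stands in for jieba.lcut.
keywords = ["移动互联网", "移动互联", "云计算", "大数据"]
text = "公司推进移动互联网与云计算,布局移动互联和大数据,云计算投入持续增加。"
tokens = tokenize_protected(text, keywords, list)
freqs = keyword_frequencies(text, keywords, list, stop_words={"与", "和"})
```

Replacing longer keywords first matters for overlapping entries in Table A5 such as 移动互联网 and 移动互联: otherwise the shorter term would be matched inside the longer one and counted in its place.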
References
- Xie, F.; Jiang, N.; Kuang, X. Towards an Accurate Understanding of ‘New Quality Productive Forces’. Econ. Political Stud. 2025, 13, 1–15.
- Men, Y. Big Data Enablement, Dual Innovation and the Development of New Quality Productive Forces: Evidence from National Big Data Comprehensive Pilot Zone. Zhejiang Acad. J. 2025, 198–209.
- Xing, M.; Chen, D.; Zhang, H. Big Data Empowerment and the Development of Regional New Quality Productive Forces: A Quasi-Natural Experiments of National Big Data Comprehensive Pilot Zone. Sci. Technol. Prog. Policy 2024, 41, 23–31.
- Zhao, P.; Zhu, Y.; Zhao, L. National Big Data Comprehensive Experimental Zone and New Quality Productivity: Based on Empirical Evidence from 230 Cities. J. Chongqing Univ. (Soc. Sci. Ed.) 2024, 30, 62–78.
- Chen, X.; Chang, H.; Deng, R. Research on the Influence of Data Elements on the Agricultural New Quality Productivity: A Quasi-Natural Experiment Based on National Big Data Comprehensive Pilot Zones. Foreign Econ. Relat. Trade 2025, 46–51+93.
- Xu, L.; Deng, L. How National Big Data Pilot Zones Empower New Quality Productive Forces: An Empirical Study Based on a Multi-Time Difference-in-Differences Model. China Circ. Econ. 2025, 149–155.
- Li, H.; Li, Z.; Yang, K.; Gao, Y. How Big Data Policy Foster New Quality Productive Forces in Enterprises: Evidence from a Quasi-Natural Experiment of the National Big Data Comprehensive Pilot Zone. Forum Sci. Technol. China 2025, 137–149.
- Yang, J.; Ding, J.; Liu, Y. How Does the Digital Economy Empower Enterprises with New Quality Productivity—Quasi-Natural Experiments Based on the National Big Data Comprehensive Pilot Area. J. Harbin Univ. Commer. (Soc. Sci. Ed.) 2025, 3–20. Available online: https://kns.cnki.net/kcms2/article/abstract?v=bJ89lKU86K_dau9V6MkMkl38qULC9Choelycnpqy-gqPgl_abdp13R1WNnZgsqoEFfiLQNIYJ7_ubANL0vhYEDzQSyDmI9usyAkda4yweV5xw3TS_AgJhjEpmhee-69gnjUow0K8qUsPQCk5HWmoO263rM4lgmDp4wmJ28iNVfPNO6FQ1xGrcg==&uniplatform=NZKPT&language=CHS (accessed on 7 September 2025).
- Brown, J.; Matsa, D.A. Boarding a Sinking Ship? An Investigation of Job Applications to Distressed Firms. J. Financ. 2016, 71, 507–550.
- Mikalef, P.; Pappas, I.O.; Krogstie, J.; Giannakos, M. Big Data Analytics Capabilities: A Systematic Literature Review and Research Agenda. Inf. Syst. E-Bus. Manag. 2018, 16, 547–578.
- Liu, C.; Chen, L.; Wei, X. Impact of Data Element Agglomeration on Scientific and Technological Innovation: A Quasi-Natural Experiment Based on Big Data Comprehensive Pilot Areas. J. Shanghai Univ. Financ. Econ. 2023, 25, 107–121.
- Sun, W.; Mao, N.; Lan, F.; Wang, L. Policy Empowerment, Digital Ecosystem and Enterprise Digital Transformation: A Quasi Natural Experiment Based on the National Big Data Comprehensive Experimental Zone. China Ind. Econ. 2023, 9, 117–135.
- Callaway, B.; Sant’Anna, P.H. Difference-in-Differences with Multiple Time Periods. J. Econom. 2021, 225, 200–230.
- Song, J.; Zhang, J.; Pan, Y. Research on the Impact of ESG Development on New Quality Productive Forces of Enterprises—Empirical Evidence from Chinese A-Share Listed Companies. Contemp. Econ. Manag. 2024, 46, 1–11.
- Liu, Y.; He, Z. Synergistic Industrial Agglomeration, New Quality Productive Forces and High-Quality Development of the Manufacturing Industry. Int. Rev. Econ. Financ. 2024, 94, 103373.
- Beck, T.; Levine, R.; Levkov, A. Big Bad Banks? The Winners and Losers from Bank Deregulation in the United States. J. Financ. 2010, 65, 1637–1667.
- Topalova, P. Factor Immobility and Regional Impacts of Trade Liberalization: Evidence on Poverty from India. Am. Econ. J. Appl. Econ. 2010, 2, 1–41.
- Goodman-Bacon, A. Difference-in-Differences with Variation in Treatment Timing. J. Econom. 2021, 225, 254–277.
- Cengiz, D.; Dube, A.; Lindner, A.; Zipperer, B. The Effect of Minimum Wages on Low-Wage Jobs. Q. J. Econ. 2019, 134, 1405–1454.
- Deshpande, M.; Li, Y. Who Is Screened Out? Application Costs and the Targeting of Disability Programs. Am. Econ. J. Econ. Policy 2019, 11, 213–248.
- Baker, A.C.; Larcker, D.F.; Wang, C.C.Y. How Much Should We Trust Staggered Difference-in-Differences Estimates? J. Financ. Econ. 2022, 144, 370–395.
- Baron, R.M.; Kenny, D.A. The Moderator–Mediator Variable Distinction in Social Psychological Research: Conceptual, Strategic, and Statistical Considerations. J. Personal. Soc. Psychol. 1986, 51, 1173.
- Wu, F.; Hu, H.; Lin, H.; Ren, X. Enterprise Digital Transformation and Capital Market Performance: Empirical Evidence from Stock Liquidity. J. Manag. World 2021, 37, 130–144+10.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).