3.1. Model Setting
- (1) Global Spatial Autocorrelation Test
Geary’s $C$ index and the global Moran’s $I$ index are commonly used for spatial correlation analysis [49]. Geary’s $C$ is a statistic used in spatial data analysis, primarily to measure the spatial autocorrelation of a variable. The formula is as follows:
$$C = \frac{(n-1)\sum_{i=1}^{n}\sum_{j=1}^{n} w_{ij}(x_i - x_j)^2}{2\left(\sum_{i=1}^{n}\sum_{j=1}^{n} w_{ij}\right)\sum_{i=1}^{n}(x_i - \bar{x})^2} \tag{1}$$
where $C$ is Geary’s index, $n$ is the sample size, $\bar{x}$ is the average GTFP of cities, $x_i$ is the GTFP of city $i$, and $w_{ij}$ is the $(i, j)$ element of the spatial weight matrix.
$C$ typically ranges between 0 and 2, with values below 1 indicating a positive spatial correlation, values above 1 suggesting a negative spatial correlation, and a value of 1 indicating a random distribution.
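As an illustration of how Geary’s index behaves, the following minimal sketch computes it under the conventional definition, $C = (n-1)\sum_i\sum_j w_{ij}(x_i - x_j)^2 \,/\, [2(\sum_i\sum_j w_{ij})\sum_i (x_i - \bar{x})^2]$. The function name `gearys_c`, the four-city contiguity matrix, and the GTFP values are hypothetical and are not taken from the paper’s data.

```python
import numpy as np

def gearys_c(x, w):
    """Geary's C for values x (length n) and an n-by-n spatial weight matrix w."""
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    n = x.size
    diff_sq = (x[:, None] - x[None, :]) ** 2           # (x_i - x_j)^2 for every pair
    numerator = (n - 1) * np.sum(w * diff_sq)
    denominator = 2.0 * w.sum() * np.sum((x - x.mean()) ** 2)
    return numerator / denominator

# Hypothetical example: four cities on a line, each linked to its adjacent cities.
w_line = np.array([[0, 1, 0, 0],
                   [1, 0, 1, 0],
                   [0, 1, 0, 1],
                   [0, 0, 1, 0]], dtype=float)

c_clustered = gearys_c([1.0, 1.0, 3.0, 3.0], w_line)    # similar values are adjacent
c_alternating = gearys_c([1.0, 3.0, 1.0, 3.0], w_line)  # neighbors always differ
print(c_clustered, c_alternating)  # 0.5 1.5
```

The clustered pattern yields a value below 1 (positive spatial correlation), and the alternating pattern yields a value above 1 (negative spatial correlation).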
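In contrast to Geary’s $C$, which is built on pairwise comparisons between neighboring data points, Moran’s $I$ is designed to assess the broader relational dynamics between neighboring data points, that is, how they co-vary around the overall mean [50]. The formula is as follows:
$$I = \frac{\sum_{i=1}^{n}\sum_{j=1}^{n} w_{ij}(x_i - \bar{x})(x_j - \bar{x})}{S^2 \sum_{i=1}^{n}\sum_{j=1}^{n} w_{ij}} \tag{2}$$
where $x_i$ is the GTFP of city $i$, $\bar{x}$ is the average GTFP of cities, $S^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2$ represents the sample variance, and $w_{ij}$ denotes the $(i, j)$ element of the spatial weight matrix.
$I$ has a value between −1 and 1, with values near 1 denoting a strong positive spatial correlation, values near −1 denoting a strong spatial disparity (negative spatial correlation), and values close to zero indicating a random spatial distribution.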
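The global Moran’s index can be sketched in a few lines under the conventional definition, $I = \sum_i\sum_j w_{ij}(x_i - \bar{x})(x_j - \bar{x}) \,/\, (S^2 \sum_i\sum_j w_{ij})$, with the variance $S^2$ computed using divisor $n$ (an assumption). The function name `morans_i` and the toy inputs are hypothetical.

```python
import numpy as np

def morans_i(x, w):
    """Global Moran's I for values x and an n-by-n spatial weight matrix w."""
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    z = x - x.mean()                   # deviations from the mean
    s2 = np.mean(z ** 2)               # variance with divisor n (assumed convention)
    return (z @ w @ z) / (s2 * w.sum())

# Hypothetical example: four cities on a line, each linked to its adjacent cities.
w_line = np.array([[0, 1, 0, 0],
                   [1, 0, 1, 0],
                   [0, 1, 0, 1],
                   [0, 0, 1, 0]], dtype=float)

i_clustered = morans_i([1.0, 1.0, 3.0, 3.0], w_line)    # similar values are adjacent
i_alternating = morans_i([1.0, 3.0, 1.0, 3.0], w_line)  # neighbors always differ
```

Here the clustered pattern gives a positive index (1/3) and the strictly alternating pattern gives −1, matching the interpretation of the sign described above.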
- (2) Local Spatial Autocorrelation Test
Local spatial autocorrelation explores the spatial correlation and heterogeneity of a sample city’s GTFP with its neighboring cities from a local spatial perspective. The local Moran’s index is calculated as follows:
$$I_i = z_i \sum_{j \neq i} w_{ij} z_j \tag{3}$$
where $z_i$ and $z_j$ represent the normalized values of green total factor productivity observations for cities $i$ and $j$, respectively, and $w_{ij}$ is the spatial weight between the two cities. In the Moran scatter plot, four quadrants, high–high (HH), low–high (LH), low–low (LL), and high–low (HL), represent distinct local spatial correlation patterns. HH and LL indicate positive spatial correlation, whereas LH and HL indicate negative spatial correlation.
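The quadrant classification described above can be illustrated with a short sketch. It assumes the usual form $I_i = z_i \sum_j w_{ij} z_j$ with a row-standardized weight matrix and values standardized by the population standard deviation; both conventions, the function name `local_moran`, and the toy data are assumptions made for illustration.

```python
import numpy as np

def local_moran(x, w):
    """Return local Moran's I_i and the Moran scatter plot quadrant of each city."""
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    w = w / w.sum(axis=1, keepdims=True)    # row-standardize (assumes no isolated city)
    z = (x - x.mean()) / x.std()            # standardized values z_i
    lag = w @ z                             # weighted average of neighbors' z_j
    local_i = z * lag
    labels = np.where(z >= 0,
                      np.where(lag >= 0, "HH", "HL"),
                      np.where(lag >= 0, "LH", "LL"))
    return local_i, labels

# Hypothetical example: four cities on a line, each linked to its adjacent cities.
w_line = np.array([[0, 1, 0, 0],
                   [1, 0, 1, 0],
                   [0, 1, 0, 1],
                   [0, 0, 1, 0]], dtype=float)

local_i, labels = local_moran([1.0, 2.0, 8.0, 9.0], w_line)
print(labels.tolist())  # ['LL', 'LL', 'HH', 'HH']
```

In this toy case the two low-value cities and the two high-value cities each sit next to similar neighbors, so every local index is positive.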
- (3) Spatial Econometric Models
Given the extensive and close economic links between regions in China, this paper posits that industrial robot adoption in one region may affect the GTFP of surrounding regions. To explore this, we specify the following more general Spatial Durbin Model (SDM):
$$GTFP_{it} = \rho \sum_{j=1}^{n} w_{ij}\, GTFP_{jt} + \beta\, Robot_{it} + \theta \sum_{j=1}^{n} w_{ij}\, Robot_{jt} + \gamma X_{it} + \mu_i + \nu_t + \varepsilon_{it} \tag{4}$$
where $GTFP_{it}$ denotes the green total factor productivity of city $i$ in period $t$; $Robot_{it}$ denotes the robot penetration of city $i$ in period $t$; $X_{it}$ denotes the set of control variables that affect urban green total factor productivity and vary with $i$ and $t$; $w_{ij}$ is the $(i, j)$ element of the spatial weight matrix; and $\beta$ and $\gamma$ are the coefficients on local robot penetration and the control variables. The term $\mu_i$ represents city fixed effects, which control for city-specific factors that affect urban green total factor productivity but do not vary over time; $\nu_t$ represents time fixed effects, which control for period-specific factors that affect urban green total factor productivity but do not vary across cities; $\varepsilon_{it}$ is a random disturbance term; $\rho$ is the spatial autoregressive coefficient, capturing the effect of neighboring cities’ GTFP on local GTFP; and $\theta$ is the extent to which the penetration of industrial robots in other cities affects local GTFP.
Given the feedback effect inherent in spatial lag terms, the point estimates from the SDM might not accurately represent the impact of industrial robot adoption on GTFP. To avoid this bias, this study uses the partial-derivative approach to decompose the impact into three distinct components: direct, indirect, and total effects [51]. The specific calculation formulas are as follows:
$$S = (I_n - \rho W)^{-1}(\beta I_n + \theta W), \qquad \text{Direct} = \frac{1}{n}\,\mathrm{tr}(S), \qquad \text{Total} = \frac{1}{n}\,\iota_n^{T} S\, \iota_n, \qquad \text{Indirect} = \text{Total} - \text{Direct} \tag{5}$$
where $S$ is the effect matrix calculated from the estimated parameters of the SDM ($\rho$ is the spatial autoregressive coefficient, and $\beta$ and $\theta$ are the coefficients on local and spatially lagged robot penetration), $W$ is the spatial weight matrix, $I_n$ is the identity matrix, and $\iota_n$ is a vector of ones. The direct effect is the average of the diagonal elements of $S$, reflecting the impact of local industrial robot adoption on local GTFP. The indirect effect is the average row sum of the off-diagonal elements of $S$, reflecting the impact of industrial robot adoption on the GTFP of other regions through spatial associations. The total effect is the sum of the direct and indirect effects.
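The decomposition can be sketched numerically. The snippet below assumes the standard partial-derivative matrix for an SDM, $S = (I_n - \rho W)^{-1}(\beta I_n + \theta W)$, where $\rho$ is the spatial autoregressive coefficient and $\beta$ and $\theta$ are the coefficients on local and spatially lagged robot penetration. The function name `sdm_effects` and all parameter values are hypothetical and are not estimates from this study.

```python
import numpy as np

def sdm_effects(rho, beta, theta, w):
    """Direct, indirect and total effects from the SDM partial-derivative matrix."""
    w = np.asarray(w, dtype=float)
    n = w.shape[0]
    identity = np.eye(n)
    # Solve (I - rho W) S = (beta I + theta W) instead of forming the inverse explicitly.
    s = np.linalg.solve(identity - rho * w, beta * identity + theta * w)
    direct = np.trace(s) / n          # average own-city effect (diagonal of S)
    total = s.sum() / n               # average row sum of S
    indirect = total - direct         # average row sum of the off-diagonal elements
    return direct, indirect, total

# Hypothetical row-standardized weights for four cities on a line.
w = np.array([[0.0, 1.0, 0.0, 0.0],
              [0.5, 0.0, 0.5, 0.0],
              [0.0, 0.5, 0.0, 0.5],
              [0.0, 0.0, 1.0, 0.0]])

direct, indirect, total = sdm_effects(rho=0.3, beta=0.5, theta=0.2, w=w)
```

With a row-standardized $W$, the total effect reduces to $(\beta + \theta)/(1 - \rho)$, which equals 1.0 for these illustrative values and offers a quick sanity check on the computation.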
3.2. Variable Selection and Data Description
- (1) Core Explanatory Variable: Industrial Robot Penetration
The current approach to measuring robot penetration is mainly based on Acemoglu and Restrepo (2020) [15], who used the “Bartik instrumental variable” construction method to measure robot penetration at the commuting-zone level in the United States from 1993 to 2007. Because the robot stock data available from the International Federation of Robotics (IFR) are limited, covering mainly industrial robots at the national industry level, numerous researchers have extended Acemoglu and Restrepo’s method [15] to measure robot penetration at the micro (enterprise), meso (industry), and macro (city, province, or country) levels. China has witnessed remarkable growth in industrial robot adoption and has emerged as a significant global market, and many scholars have used this method to depict the penetration of industrial robots at the enterprise and city levels in China. For example, Zhang et al. (2025) use the Bartik instrumental variable method to measure robot adoption in China at the city level [52]. Existing studies have confirmed that the results obtained from this method are consistent with the development trend in China [53]. Thus, referring to Wang and Dong [54], this paper constructs industrial robot penetration at the city level in China using the Bartik instrumental variable method, as shown in Equation (6):
$$Robot_{it} = \sum_{j} \frac{L_{ij,2006}}{L_{i,2006}} \cdot \frac{R_{jt}}{L_{j,2006}} \tag{6}$$
where $L_{ij,2006}$ represents employment in manufacturing industry $j$ in city $i$ in the base period of 2006, $L_{i,2006}$ represents total manufacturing employment in city $i$ in the base period of 2006, $R_{jt}$ is the national stock of industrial robots in industry $j$ in period $t$, and $L_{j,2006}$ denotes national employment in manufacturing industry $j$ in the base year. The year 2006 is chosen as the base period for employment data because predetermined employment shares cannot be contemporaneously correlated with later shifts in employment structures at the national and industry levels, which helps keep the share weights exogenous.
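The shift–share construction can be sketched as follows: each city’s base-year industry employment shares are multiplied by the national robot density of the corresponding industry and summed over industries. The function name `robot_penetration` and all figures are hypothetical, and the sketch assumes that the listed industries together make up a city’s manufacturing employment.

```python
import numpy as np

def robot_penetration(city_industry_emp, national_industry_emp, national_robot_stock):
    """City-level robot penetration for one period via shift-share aggregation.

    city_industry_emp: base-year employment, shape (cities, industries)
    national_industry_emp: base-year national employment by industry
    national_robot_stock: national robot stock by industry in the period of interest
    """
    emp = np.asarray(city_industry_emp, dtype=float)
    shares = emp / emp.sum(axis=1, keepdims=True)          # base-year industry shares per city
    density = (np.asarray(national_robot_stock, dtype=float)
               / np.asarray(national_industry_emp, dtype=float))  # robots per worker
    return shares @ density

# Hypothetical data: two cities, two manufacturing industries.
penetration = robot_penetration(
    city_industry_emp=[[80, 20], [50, 50]],
    national_industry_emp=[1000, 500],
    national_robot_stock=[100, 10],
)
print(penetration)  # approximately [0.084, 0.06]
```

The city that is more specialized in the robot-intensive industry receives the higher penetration value, even though both cities face the same national robot densities.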
- (2) Explained Variable: Green Total Factor Productivity (GTFP)
Using a non-radial, non-angular directional distance function as a framework, this paper follows Liu et al. (2020) [55] and constructs a DEA model of green total factor productivity, based on the global technology set and in the form of a Luenberger index, to measure GTFP at the city level. The generalized non-radial directional distance function under the global technological frontier is defined as follows:
$$\vec{D}^{G}(x, y, b; g) = \sup\left\{ w^{T}\beta : \left( (x, y, b) + g \cdot \mathrm{diag}(\beta) \right) \in T^{G} \right\} \tag{7}$$
where $x$, $y$, and $b$ are the vectors of input factors, desirable outputs, and undesirable outputs; $T^{G}$ is the global production technology set; $w$ denotes the weight vector associated with input factors, desirable outputs, and undesirable outputs; $g$ is the direction vector, which indicates the expected direction of efficiency improvement, that is, the reduction of input factors, the increase in desirable outputs, and the decrease in undesirable outputs; and $\beta$ represents the directional distance function value of each variable, also known as the scaling factor, indicating the possible proportion of input reduction, desirable output increase, and undesirable output decrease.
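One common way to evaluate a non-radial directional distance function is as a linear program. The sketch below is not the authors’ implementation: it assumes constant returns to scale, a direction vector $g = (-x_0, y_0, -b_0)$ tied to the evaluated unit, strong disposability of inputs and desirable outputs, and equality constraints on undesirable outputs. Under a global frontier, the reference set would pool all cities across all periods. The function name `nddf` and the toy data are hypothetical.

```python
import numpy as np
from scipy.optimize import linprog

def nddf(x0, y0, b0, X, Y, B, weights):
    """Non-radial directional distance of one unit against a reference set.

    X, Y, B: inputs, desirable outputs and undesirable outputs of the reference
    observations (one row per observation). weights: one weight per variable,
    ordered as inputs, desirable outputs, undesirable outputs.
    """
    X, Y, B = (np.asarray(a, dtype=float) for a in (X, Y, B))
    x0, y0, b0 = (np.asarray(a, dtype=float) for a in (x0, y0, b0))
    n = X.shape[0]
    m, s, k = x0.size, y0.size, b0.size
    # Decision variables: [lambda (n), beta_x (m), beta_y (s), beta_b (k)].
    c = -np.concatenate([np.zeros(n), np.asarray(weights, dtype=float)])  # linprog minimizes

    # Inputs: sum_n lambda_n x_n + beta_x * x0 <= x0
    a_in = np.hstack([X.T, np.diag(x0), np.zeros((m, s + k))])
    # Desirable outputs: -sum_n lambda_n y_n + beta_y * y0 <= -y0
    a_out = np.hstack([-Y.T, np.zeros((s, m)), np.diag(y0), np.zeros((s, k))])
    # Undesirable outputs: sum_n lambda_n b_n + beta_b * b0 = b0
    a_bad = np.hstack([B.T, np.zeros((k, m + s)), np.diag(b0)])

    res = linprog(c, A_ub=np.vstack([a_in, a_out]), b_ub=np.concatenate([x0, -y0]),
                  A_eq=a_bad, b_eq=b0, bounds=(0, None), method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return -res.fun

# Hypothetical reference set: unit A is efficient, unit B uses twice the input
# and emits twice the pollutant for the same desirable output.
X = [[1.0], [2.0]]
Y = [[1.0], [1.0]]
B = [[1.0], [2.0]]
w_equal = [1 / 3, 1 / 3, 1 / 3]
d_a = nddf(X[0], Y[0], B[0], X, Y, B, w_equal)
d_b = nddf(X[1], Y[1], B[1], X, Y, B, w_equal)
```

In this toy case the efficient unit has a distance of 0 and the dominated unit has a distance of 1/3, so a larger value means the unit lies further from the frontier.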
Based on the Luenberger productivity index form (Chambers et al., 1996) [56], the $GTFP$ for period $t+1$ is defined as follows:
$$GTFP_{t}^{t+1} = \vec{D}^{G}(x^{t}, y^{t}, b^{t}; g) - \vec{D}^{G}(x^{t+1}, y^{t+1}, b^{t+1}; g) \tag{8}$$
where $\vec{D}^{G}(\cdot)$ is the value of the global directional distance function for the city in the corresponding period.
The Non-radial Directional Distance Function (NDDF) framework is used to gauge city-level GTFP in China by tracking each city’s deviation from the production technology frontier. In this framework, a GTFP value exceeding 0 indicates that a city in period $t+1$ is nearer to the frontier than in the base period $t$, indicating an improvement in GTFP. Conversely, a GTFP value lower than 0 suggests a regression in GTFP. In this study, the input factors include capital stock (K), labor (L), and total energy consumption (E), with GDP (Y) as the desirable output, and CO$_2$ emissions and PM2.5 concentration (PM) as undesirable outputs. Weights reflect policymakers’ priorities in adjusting variables. While different weights can be assigned based on research needs, many scholars agree that equal weighting of inputs and outputs is reasonable when there is no prior information [57]. This paper therefore assumes that inputs, desirable outputs, and undesirable outputs are of equal importance, which is also in line with the current emphasis of Chinese macro-policy on pollution reduction, carbon reduction, and green expansion and growth. Following the approach of Zhang et al. (2013), equal weights of 1/3 were assigned to desirable outputs, undesirable outputs, and input factors, respectively [58], a method that has been widely used in China’s environmental efficiency analysis. This approach aims to avoid subjective weighting biases while ensuring that the model balances economic growth objectives, resource utilization efficiency, and environmental constraints in the optimization process, consistent with the concept of sustainable development. The weights were then evenly distributed among the specific types of desirable outputs, undesirable outputs, and input factors, considering the synergistic impact of capital, labor, and energy on urban development. Thus, the weights for the variables (K, L, E, Y, CO$_2$, PM) were set as $w = (1/9, 1/9, 1/9, 1/3, 1/6, 1/6)^{T}$, which remains constant throughout the analysis, ensuring consistent optimization trajectories.
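The weighting rule and the index can be checked with a few lines of arithmetic. The sketch below splits a weight of 1/3 per group evenly within each group (three inputs, one desirable output, two undesirable outputs) and treats the Luenberger-type GTFP change as the difference between a city’s frontier distance in the base period and in the current period. The function names and distance values are hypothetical.

```python
from fractions import Fraction

def equal_group_weights(n_inputs, n_desirable, n_undesirable):
    """Give each group a weight of 1/3 and split it evenly within the group."""
    third = Fraction(1, 3)
    return ([third / n_inputs] * n_inputs
            + [third / n_desirable] * n_desirable
            + [third / n_undesirable] * n_undesirable)

def luenberger_gtfp(distance_base, distance_current):
    """GTFP change: positive when the city has moved closer to the frontier."""
    return distance_base - distance_current

# Order: K, L, E | Y | CO2, PM2.5
weights = equal_group_weights(3, 1, 2)
print([str(v) for v in weights])  # ['1/9', '1/9', '1/9', '1/3', '1/6', '1/6']

# Hypothetical distances to the global frontier in two consecutive periods.
gtfp_change = luenberger_gtfp(distance_base=0.25, distance_current=0.125)
```

The weights sum to one, and the positive change of 0.125 in the toy example corresponds to a city that has moved toward the frontier.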
- (3) Control Variables
Following the existing literature, we include a set of control variables: (1) openness level (FDI), measured by the logarithm of foreign investment inflows; (2) industrial structure (STR), measured by the secondary industry’s value-added ratio; (3) market dynamism (MAR), measured by the ratio of urban entrepreneurs to employees; (4) digitalization level (DIGITAL), measured by the logarithm of the number of internet users; (5) tax burden (COST), measured by the VAT burden on industry; and (6) degree of government intervention (GOV), measured by fiscal expenditure per capita.
- (4) Spatial Weight Matrix
Because the spatial correlation of GTFP across cities is associated not only with geographic distance but also, and more strongly, with economic development, this study constructs an economic–geographical nested weight matrix, as shown in Equation (9):
$$W = \varphi W_{d} + (1 - \varphi) W_{e} \tag{9}$$
where $W_{d}$ represents the inverse distance weight matrix, $W_{e}$ represents the economic distance weight matrix, calculated based on the average GDP growth rate during the observation period, and the value of $\varphi$ is set to 0.5.
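A nested weight matrix of this kind can be sketched as a convex combination of a geographic and an economic matrix. The snippet below makes several assumptions that the text does not specify: economic distance is taken as the inverse of the absolute gap in average GDP growth, both matrices are row-standardized before mixing, and no two cities share exactly the same growth rate. The function names and data are hypothetical.

```python
import numpy as np

def row_standardize(w):
    """Scale each row so that it sums to one."""
    return w / w.sum(axis=1, keepdims=True)

def nested_weight_matrix(distance, avg_gdp_growth, phi=0.5):
    """Combine inverse-distance and economic-distance weights with mixing share phi."""
    d = np.asarray(distance, dtype=float)
    g = np.asarray(avg_gdp_growth, dtype=float)
    n = d.shape[0]
    off_diag = ~np.eye(n, dtype=bool)           # a city is not its own neighbor

    w_geo = np.zeros((n, n))
    w_geo[off_diag] = 1.0 / d[off_diag]         # inverse geographic distance

    gap = np.abs(g[:, None] - g[None, :])
    w_eco = np.zeros((n, n))
    w_eco[off_diag] = 1.0 / gap[off_diag]       # inverse growth gap (assumed form)

    return phi * row_standardize(w_geo) + (1 - phi) * row_standardize(w_eco)

# Hypothetical data: pairwise distances and average GDP growth rates for three cities.
distance = [[0, 1, 2],
            [1, 0, 1],
            [2, 1, 0]]
growth = [0.06, 0.08, 0.12]
w_nested = nested_weight_matrix(distance, growth, phi=0.5)
```

Because each component matrix is row-standardized, the combined matrix also has rows that sum to one and a zero diagonal.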
- (5) Data Description
This study measures city-level industrial robot penetration through three key steps. First, we calculate sectoral robot penetration rates using national manufacturing employment data from the China Stock Market & Accounting Research Database (CSMAR) and robot stock statistics from the International Federation of Robotics (IFR). Second, city–industry weights are constructed based on the distribution of urban manufacturing employment derived from the China Industrial Enterprise Database. Finally, these components are integrated to derive city-level penetration rates through a shift–share aggregation method.
The IFR data, compiled from global robot manufacturers, provide authoritative country–industry–year statistics that address coverage gaps in the imported data and ensure the accuracy of our core explanatory variable. We manually align two-digit industry codes to resolve discrepancies between the International Standard Industrial Classification (ISIC) and China’s 2002 National Economic Industry Classification. National subsector employment data for the base year (2006) are extracted using IFR-compatible industry coding rules. For city-level estimates, we reconcile firm-level employment records (China Industrial Enterprise Database) with aggregated totals from the China City Statistical Yearbook. This dual-source approach addresses the lack of official city–sector employment statistics while ensuring consistency with macroeconomic trends. As shown in Figure 2, the derived penetration index aligns closely with China’s industrial automation trajectory, which supports the validity of the methodology in this paper. In the empirical analysis, this variable is log-transformed.
Following the approach of Wu et al. (2014) [59], city-level energy data were inferred from provincial-level energy consumption data and nighttime light data. The formula is $E_{ct} = E_{it} \times DN_{ct} / DN_{it}$, where $E_{ct}$ is the estimated energy consumption of city $c$ in year $t$, $E_{it}$ is the total energy consumption of province $i$ in year $t$, $DN_{it}$ is the sum of the grayscale values of all grids in province $i$ in year $t$, and $DN_{ct}$ is the corresponding sum for city $c$. CO$_2$ emission data were sourced from the Center for Global Environmental Research website, with city-level data extracted based on grid information. PM2.5 concentration data were obtained from Washington University in St. Louis, with city-level panel data on PM2.5 concentrations obtained by clipping and aggregating the grid data within China. The remaining data were sourced from the China City Statistical Yearbook, resulting in a balanced panel of 273 Chinese cities from 2007 to 2019. Variable names and descriptive statistics are shown in Table 1.
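The nighttime-light downscaling of provincial energy consumption mentioned above can be illustrated with a proportional allocation rule, in which a city receives the share of its province’s energy use that matches its share of the province’s summed light values. This proportional form is an assumption about the allocation step, and the function name `city_energy` and all numbers are hypothetical.

```python
def city_energy(province_energy, province_light, city_light):
    """Allocate provincial energy use to a city in proportion to its night-light share."""
    return province_energy * city_light / province_light

# Hypothetical province: total energy use of 1000 units and summed light values of 500,
# split between two cities with light sums of 300 and 200.
energy_a = city_energy(province_energy=1000.0, province_light=500.0, city_light=300.0)
energy_b = city_energy(province_energy=1000.0, province_light=500.0, city_light=200.0)
print(energy_a, energy_b)  # 600.0 400.0
```

When the cities’ light sums add up to the provincial total, the allocated values add up to the provincial energy consumption, so the downscaling preserves the provincial aggregate.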