4.1. Model Specification
To systematically evaluate the impact pathways through which industrial robot adoption affects labor resource allocation efficiency at the city level, this study first establishes a baseline two-way fixed effects regression model that incorporates both city fixed effects and year fixed effects. The model is then extended to a mediation analysis framework by introducing two key intermediary variables, industrial upgrading and urban innovation capacity. This modeling strategy not only facilitates the estimation of the direct effect of industrial robot adoption on labor resource allocation but, more importantly, helps to uncover the underlying transmission mechanisms. Specifically, it allows us to examine how industrial robots indirectly reshape intra-city and interregional labor allocation patterns through intermediary channels such as industrial restructuring and urban innovation systems, thereby enhancing the overall efficiency of factor markets.
For empirical testing, this study uses panel data on Chinese prefecture-level cities and takes the industrial robot penetration index and the labor resource mismatch index as core variables to identify the impact of industrial robot application on the efficiency of labor resource allocation. Accordingly, the following baseline regression model is constructed:
$$\mathit{Lab}_{ct} = \alpha_0 + \alpha_1 \mathit{Robots}_{ct} + \gamma' \mathit{CV}_{ct} + \mu_c + \lambda_t + \varepsilon_{ct} \quad (1)$$
In this model, $c$ and $t$ denote the city and year dimensions, respectively. $\mathit{Lab}_{ct}$ represents the labor misallocation index of city $c$ in year $t$, reflecting the efficiency of labor resource allocation, where lower values indicate higher allocation efficiency. $\mathit{Robots}_{ct}$ measures the penetration of industrial robots and is the core variable for measuring the level of industrial robot application in prefecture-level cities in China.
$\mathit{CV}_{ct}$ is a vector of control variables, including indicators such as city economic size, demographic structure, educational attainment, and industrial base. $\mu_c$ denotes city fixed effects, which control for unobservable regional heterogeneity, while $\lambda_t$ represents year fixed effects, accounting for temporal shocks and national trends. $\varepsilon_{ct}$ is the idiosyncratic error term.
The coefficient $\alpha_1$ is the core parameter of this paper. If its estimated value is significantly negative, the penetration of industrial robots helps to reduce the level of mismatch; that is, the application of industrial robots significantly improves the efficiency of urban labor resource allocation.
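The two-way fixed effects logic described above can be illustrated with a minimal Python sketch. It assumes a balanced panel and a single regressor without controls, in which case the two-way within estimator equals OLS on double-demeaned data. The function names and the synthetic numbers are hypothetical and are not drawn from the study's data.

```python
def double_demean(panel):
    """Remove city means and year means from a balanced panel.

    panel maps (city, year) -> value.
    """
    cities = sorted({c for c, _ in panel})
    years = sorted({t for _, t in panel})
    grand = sum(panel.values()) / len(panel)
    city_mean = {c: sum(panel[(c, t)] for t in years) / len(years) for c in cities}
    year_mean = {t: sum(panel[(c, t)] for c in cities) / len(cities) for t in years}
    return {(c, t): v - city_mean[c] - year_mean[t] + grand
            for (c, t), v in panel.items()}


def twfe_slope(y, x):
    """Slope of y on x after absorbing city and year fixed effects."""
    y_dd, x_dd = double_demean(y), double_demean(x)
    num = sum(x_dd[k] * y_dd[k] for k in x_dd)
    den = sum(x_dd[k] ** 2 for k in x_dd)
    return num / den


# Synthetic balanced panel (hypothetical): true slope is -0.5,
# plus additive city effects (2.0 * c) and year effects (0.3 * t).
cities, years = range(4), range(5)
x = {(c, t): 0.1 * (c + 1) * (t + 1) for c in cities for t in years}
y = {(c, t): -0.5 * x[(c, t)] + 2.0 * c + 0.3 * t for c in cities for t in years}
slope = twfe_slope(y, x)
```

Because the synthetic outcome contains no noise, the estimator recovers the slope used to generate the data; with real data, controls and clustered standard errors would also be needed.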
To further identify the mechanisms through which industrial robot adoption influences labor resource allocation efficiency, this study extends the baseline model by incorporating a mediation analysis framework. The goal is to examine how robot adoption may indirectly affect allocation efficiency through specific structural variables.
Mediation analysis, first systematically proposed by Baron and Kenny [67], is one of the most widely used approaches for mechanism identification. It investigates whether the effect of an independent variable (X) on a dependent variable (Y) is transmitted through a mediating variable (M). The model follows a stepwise regression-based path analysis framework, decomposing the total effect into direct and indirect effects, and determines the presence of mediation by testing the statistical significance of each path coefficient.
In recent years, as mediation analysis has gained traction in econometrics and the social sciences, it has been extensively applied to research in labor economics, urban economics, and technology diffusion. The method is particularly well-suited to uncovering the structural mechanisms underlying complex economic phenomena. Compared with traditional regression models, mediation analysis emphasizes hierarchical causal structures, transmission logic among variables, and explanatory power with respect to mechanisms—making it especially appropriate for the “technology–structure–allocation” transmission chain that this study seeks to investigate.
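In the specific empirical design, this study selects two mediating variables: urban industrial structure upgrading ($\mathit{ISU}$) and urban innovation capacity ($\mathit{Inno}$). Let $M_{ct}$ denote either mediator for city $c$ in year $t$. The three-step estimation procedure for mediation analysis is as follows:
Step 1: Test whether the application of industrial robots significantly affects allocation efficiency by estimating the total effect of the application of industrial robots on $\mathit{Lab}_{ct}$.
$$\mathit{Lab}_{ct} = \alpha_0 + \alpha_1 \mathit{Robots}_{ct} + \gamma' \mathit{CV}_{ct} + \mu_c + \lambda_t + \varepsilon_{ct} \quad (2)$$
Step 2: Test whether the application of industrial robots significantly affects the mediating variable ($M_{ct}$).
$$M_{ct} = \beta_0 + \beta_1 \mathit{Robots}_{ct} + \beta_2' \mathit{CV}_{ct} + \mu_c + \lambda_t + \varepsilon_{ct} \quad (3)$$
Step 3: Incorporate the mediating variable into the baseline regression model to examine whether it serves as a transmission channel between the application of industrial robots and labor allocation efficiency.
$$\mathit{Lab}_{ct} = \theta_0 + \theta_1 \mathit{Robots}_{ct} + \theta_2 M_{ct} + \theta_3' \mathit{CV}_{ct} + \mu_c + \lambda_t + \varepsilon_{ct} \quad (4)$$
In Models (2) to (4), $\mathit{CV}_{ct}$ is the vector of control variables, $\mu_c$ and $\lambda_t$ are city and year fixed effects, and $\varepsilon_{ct}$ is the error term.
According to the stepwise regression logic, if coefficient $\alpha_1$ in Model (2), $\beta_1$ in Model (3), and $\theta_2$ in Model (4) are all statistically significant, this indicates that the mediation pathway is valid and that the impact of industrial robot adoption on allocation efficiency operates at least in part through an indirect mechanism.
Furthermore, if $\theta_1$ becomes statistically insignificant after including the mediating variable, this suggests a case of full mediation. If $\theta_1$ remains significant but its absolute value decreases, the result is interpreted as partial mediation.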
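The three-step procedure can be sketched in a few lines of Python. The sketch below is illustrative only: it omits control variables and fixed effects, uses a small hand-written OLS routine, and runs on made-up numbers; the names `ols`, `robots`, `isu`, and `lab` are hypothetical. It also shows the accounting identity behind the decomposition: the total effect equals the direct effect plus the product of the two indirect path coefficients.

```python
def ols(y, cols):
    """OLS with intercept via normal equations; returns [intercept, b1, b2, ...]."""
    n = len(y)
    X = [[1.0] + [col[r] for col in cols] for r in range(n)]
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(X[r][i] * X[r][j] for r in range(n)) for j in range(k)]
         + [sum(X[r][i] * y[r] for r in range(n))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for i in range(k):
        p = max(range(i, k), key=lambda r: abs(A[r][i]))
        A[i], A[p] = A[p], A[i]
        piv = A[i][i]
        A[i] = [v / piv for v in A[i]]
        for r in range(k):
            if r != i:
                f = A[r][i]
                A[r] = [vr - f * vi for vr, vi in zip(A[r], A[i])]
    return [A[i][k] for i in range(k)]


# Hypothetical illustrative data (not from the study)
robots = [0.1, 0.4, 0.3, 0.8, 0.6, 0.9, 0.2, 0.7]
isu = [6.0, 6.3, 6.1, 6.9, 6.4, 6.8, 6.2, 6.5]
lab = [0.50, 0.42, 0.47, 0.30, 0.38, 0.29, 0.46, 0.35]

total = ols(lab, [robots])[1]            # Step 1: total effect of robots on Lab
a = ols(isu, [robots])[1]                # Step 2: effect of robots on the mediator
_, direct, b = ols(lab, [robots, isu])   # Step 3: direct effect and mediator effect
indirect = a * b
```

In this linear setting `total` equals `direct + indirect` up to rounding, which is why a shrinking direct coefficient after adding the mediator is read as evidence of mediation.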
To strengthen the reliability of the mediation effect identification, this study employs the nonparametric Bootstrap resampling method to test the significance of the mediation pathway. Originally proposed by Efron [68], the Bootstrap method generates an empirical distribution of parameter estimates by repeatedly resampling with replacement from the original sample. Because it does not require the estimated indirect effect to follow a normal distribution, it supports robust inference on indirect effects. The Bootstrap method is particularly effective when the sampling distribution of the estimator is asymmetric or skewed, or when the sample size is limited, and it has become a mainstream approach in mediation analysis.
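A percentile Bootstrap test of the indirect effect can be sketched as follows. This is a simplified illustration under stated assumptions: observations are resampled individually, controls and fixed effects are omitted, and the data are simulated with arbitrary coefficients; the function names are hypothetical.

```python
import random


def indirect_effect(x, m, y):
    """a * b, where a is the slope of m on x and b is the coefficient on m in y ~ x + m."""
    n = len(x)
    mx, mm, my = sum(x) / n, sum(m) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    smm = sum((mi - mm) ** 2 for mi in m)
    sxm = sum((xi - mx) * (mi - mm) for xi, mi in zip(x, m))
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    smy = sum((mi - mm) * (yi - my) for mi, yi in zip(m, y))
    a = sxm / sxx
    b = (sxx * smy - sxm * sxy) / (sxx * smm - sxm ** 2)
    return a * b


def bootstrap_ci(x, m, y, reps=1000, alpha=0.05, seed=0):
    """Percentile confidence interval for the indirect effect."""
    rng = random.Random(seed)
    n = len(x)
    draws = []
    for _ in range(reps):
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        draws.append(indirect_effect([x[i] for i in idx],
                                     [m[i] for i in idx],
                                     [y[i] for i in idx]))
    draws.sort()
    lo = draws[int(reps * alpha / 2)]
    hi = draws[int(reps * (1 - alpha / 2)) - 1]
    return lo, hi


# Simulated data with arbitrary (hypothetical) path coefficients
rng = random.Random(42)
x = [rng.gauss(0, 1) for _ in range(200)]
m = [0.8 * xi + rng.gauss(0, 0.5) for xi in x]
y = [0.3 * xi + 0.5 * mi + rng.gauss(0, 0.5) for xi, mi in zip(x, m)]
lo, hi = bootstrap_ci(x, m, y)
```

The mediation pathway is judged significant when the interval excludes zero. In a city panel, one would typically resample whole cities instead of single observations so that within-city dependence is preserved.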
In summary, this study adopts a three-pronged modeling strategy, comprising baseline regression, mediation analysis, and Bootstrap testing, to systematically identify both the direct effects and indirect mechanisms through which industrial robot adoption influences labor resource allocation efficiency at the urban level. This integrated econometric framework provides a comprehensive approach to understanding the interaction between technological advancement and factor allocation, and offers empirical support for designing efficient labor allocation systems that are adaptive to ongoing technological transformation.
4.2. Variable Description and Data Sources
To empirically examine the effects and underlying mechanisms of industrial robot adoption on labor resource allocation efficiency at the urban level, this study constructs a multidimensional variable system that includes the dependent variable, core explanatory variable, mediating variables, and a set of control variables. These variables collectively reflect the dynamic characteristics of urban systems in terms of technological shocks, structural evolution, and allocative performance. The definitions and measurement methods for each type of variable are as follows:
4.2.1. Dependent Variable: Labor Resource Allocation Efficiency (Lab)
This study uses a labor misallocation index as the core indicator to measure labor resource allocation efficiency at the city level. This selection is based on both solid theoretical foundations and practical advantages. According to resource misallocation theory, in a perfectly competitive market without institutional distortions, the marginal productivity of production factors across regions and sectors should converge. Any divergence signals resource misallocation, which leads to a total factor productivity (TFP) level below its potential optimum. As a fundamental production input, labor plays a critical role in determining the operational efficiency and growth potential of both urban and national economies. Therefore, measuring the degree of labor misallocation helps uncover distortions arising from variations in market structure, policy environments, or technological shocks across cities, offering both theoretical and practical significance.
From an empirical perspective, the labor misallocation index can be constructed using standardized, publicly available data with a transparent and replicable calculation path. This reduces reliance on subjective assumptions and enhances comparability across studies. Compared to surface-level indicators such as employment or unemployment rates, the misallocation index provides a more intrinsic measure of disparities in marginal labor productivity, thereby capturing structural adjustments in the labor market. This makes it particularly suitable for analyzing reallocation dynamics under the influence of technological change.
This study focuses on how labor resource allocation efficiency evolves during the diffusion of industrial robot technologies. Existing literature shows that industrial robots reshape labor markets through two main channels: (1) directly substituting for low-skill, repetitive jobs, and (2) creating new tasks and skill demands, thereby transforming the structure of labor demand and inducing dynamic changes in marginal productivity. Accordingly, the labor misallocation index serves as an ideal dependent variable to capture the reallocation effects triggered by robot adoption, aligning closely with the theoretical logic and core research objectives of this paper.
From a dynamic analysis perspective, the index is also highly sensitive to temporal changes and structural shifts. Utilizing a panel data framework allows for the tracking of allocation efficiency trajectories across cities and over time, thereby enabling the identification of temporal effects associated with technology diffusion, policy interventions, and market adjustments—further enhancing the explanatory power of the empirical analysis.
Depreciation Rates
Following the methodology of Wu [70], fixed asset investment is divided into three categories: (1) construction and installation, (2) equipment and tools purchases, and (3) other costs. Each category is assigned a separate depreciation rate: 8.12%, 17.08%, and 12.1%, respectively. These values are based on empirical estimates of asset lifespan and economic depreciation derived from large-scale micro-level firm data in China. Compared to a uniform depreciation rate, this category-specific approach captures internal differences in investment composition and reduces systematic bias in capital stock estimation.
Given regional and temporal heterogeneity in investment structures, the study uses data from the China Fixed Asset Investment Yearbook and the Statistical Yearbook of China’s Investment Fields to compute the annual proportion of each investment category in provincial-level total investment. These proportions are used to calculate a weighted average economic depreciation rate for each province, which is then assigned to cities within the corresponding province. This approach balances data availability with regional accuracy, enhancing both the granularity and validity of the capital stock estimates.
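The weighted average depreciation rate described above amounts to a share-weighted sum of the three category rates. A minimal sketch, in which the category rates come from the text while the investment shares and the names `RATES` and `weighted_depreciation` are hypothetical:

```python
# Category-specific depreciation rates stated in the text
RATES = {"construction": 0.0812, "equipment": 0.1708, "other": 0.121}


def weighted_depreciation(shares):
    """Share-weighted depreciation rate; shares are normalized to sum to one."""
    total = sum(shares.values())
    return sum(RATES[k] * shares[k] / total for k in RATES)


# Hypothetical provincial investment composition for one year
example_shares = {"construction": 0.6, "equipment": 0.3, "other": 0.1}
delta = weighted_depreciation(example_shares)  # about 0.112
```

A province whose investment tilts toward equipment therefore receives a higher overall depreciation rate than one dominated by construction.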
Based on the settings described above, capital stock for each city and year is calculated using the standard perpetual inventory method (PIM) formula:
$$K_t = (1 - \delta_t) K_{t-1} + \frac{I_t}{P_t}$$
In the formula, $K_t$ represents the capital stock in year $t$, $I_t$ is the nominal fixed asset investment in year $t$, $P_t$ is the fixed asset investment price index (base year: 2006), and $\delta_t$ denotes the weighted economic depreciation rate in year $t$. The initial capital stock is set as twice the completed fixed asset investment in 2006. Prior studies have shown that setting the base-year capital stock at 1.5 to 2 times the investment value can effectively mitigate early-stage estimation errors and ensure the stability and plausibility of the long-term capital stock series.
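The perpetual inventory recursion can be sketched as follows, assuming the usual form in which last year's stock is carried forward net of depreciation and the current year's deflated investment is added. The function name and all input numbers are hypothetical; the base-year multiple of 2 follows the text.

```python
def capital_stock(investment, price_index, depreciation, base_multiple=2.0):
    """Capital stock series by the perpetual inventory method.

    investment[0] is the base-year (2006) nominal investment, and
    price_index[0] is the base-year deflator (1.0 by construction).
    """
    stock = [base_multiple * investment[0] / price_index[0]]
    for t in range(1, len(investment)):
        real_investment = investment[t] / price_index[t]
        stock.append((1 - depreciation[t]) * stock[-1] + real_investment)
    return stock


# Hypothetical three-year series for one city
k = capital_stock(
    investment=[100.0, 110.0, 120.0],
    price_index=[1.0, 1.05, 1.10],
    depreciation=[0.10, 0.10, 0.10],
)
```

Here the base-year stock is 200, and each later value adds deflated investment to 90% of the previous stock.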
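After estimating the capital stock, this study incorporates city-level GDP and employment data to construct a Cobb–Douglas production function and estimate factor output elasticities. The general form of the production function is defined as follows:
$$Y_{ct} = A K_{ct}^{\beta_{K_c}} L_{ct}^{\beta_{L_c}}$$
where $Y_{ct}$ is the real GDP of city $c$ in year $t$ (measured in 2006 constant prices), $K_{ct}$ and $L_{ct}$ represent capital and labor inputs, respectively; $A$ is a constant denoting technology level; and $\beta_{K_c}$, $\beta_{L_c}$ are the output elasticities of capital and labor. It is assumed that returns to scale are constant, i.e., $\beta_{K_c} + \beta_{L_c} = 1$. Taking natural logarithms on both sides and adding city fixed effects, year fixed effects, and an error term yields a linearized regression equation:
$$\ln Y_{ct} = \ln A + \beta_{K_c} \ln K_{ct} + \beta_{L_c} \ln L_{ct} + \mu_c + \lambda_t + \varepsilon_{ct}$$
A two-way fixed effects panel regression is employed to estimate the equation, with $\mu_c$ and $\lambda_t$ controlling for unobserved city-specific and year-specific effects and $\varepsilon_{ct}$ denoting the error term, thereby ensuring the robustness of the elasticity estimates.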
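To illustrate how an output elasticity is recovered from a log-linear Cobb–Douglas form, the sketch below imposes constant returns to scale by regressing log output per worker on log capital per worker. It is a simplification: a single pooled elasticity is estimated, fixed effects are omitted, and the data are generated from a known elasticity of 0.4. All names and numbers are hypothetical.

```python
import math


def estimate_elasticities(output, capital, labor):
    """Estimate (beta_K, beta_L) under constant returns to scale.

    Uses ln(Y/L) = ln A + beta_K * ln(K/L), so beta_L = 1 - beta_K.
    """
    xs = [math.log(k / l) for k, l in zip(capital, labor)]
    ys = [math.log(y / l) for y, l in zip(output, labor)]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    beta_k = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
              / sum((x - mx) ** 2 for x in xs))
    return beta_k, 1 - beta_k


# Synthetic data generated with A = 2 and beta_K = 0.4 (hypothetical values)
capital = [100.0, 200.0, 400.0, 800.0]
labor = [50.0, 60.0, 70.0, 80.0]
output = [2.0 * k ** 0.4 * l ** 0.6 for k, l in zip(capital, labor)]
beta_k, beta_l = estimate_elasticities(output, capital, labor)
```

With noise-free synthetic data the regression returns the elasticity used to generate it; real data would require the fixed effects described above.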
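Based on the estimated elasticities, the relative distortion coefficients for capital and labor in each city are calculated as follows:
$$\hat{\gamma}_{K_c} = \frac{K_c}{K} \Big/ \frac{s_c \beta_{K_c}}{\beta_K}, \qquad \hat{\gamma}_{L_c} = \frac{L_c}{L} \Big/ \frac{s_c \beta_{L_c}}{\beta_L}$$
where $K_c$ and $L_c$ are the capital and labor inputs of city $c$, $K$ and $L$ denote the national total capital and labor, $s_c$ is the share of city $c$ in total national output, $\beta_{K_c}$ and $\beta_{L_c}$ are the estimated output elasticities of city $c$, and $\beta_K$ and $\beta_L$ represent the national average output elasticities of capital and labor, respectively. A distortion coefficient closer to 1 indicates an allocation of factors closer to the optimum; greater deviation implies more severe resource misallocation.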
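Subsequently, the capital and labor misallocation indices $\tau_{K_c}$ and $\tau_{L_c}$ are calculated based on the relative distortion coefficients $\hat{\gamma}_{K_c}$ and $\hat{\gamma}_{L_c}$:
$$\tau_{K_c} = \left| \frac{1}{\hat{\gamma}_{K_c}} - 1 \right|, \qquad \tau_{L_c} = \left| \frac{1}{\hat{\gamma}_{L_c}} - 1 \right|$$
These indices measure the absolute deviation of unit input efficiency from the optimal configuration. Following the method of Xie et al. [71], the absolute value is taken to ensure that the indices are non-negative and increase monotonically with the degree of misallocation, for ease of interpretation.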
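The two calculation steps can be sketched for a single factor as follows. The sketch assumes the form commonly used in the misallocation literature: the distortion coefficient compares a city's actual input share with its output share scaled by relative elasticity, the national average elasticity is output-weighted, and the index is the absolute value of the reciprocal minus one. These forms, the function name, and the two-city numbers are assumptions for illustration.

```python
def misallocation(inputs, outputs, elasticities):
    """Return a list of (distortion coefficient, misallocation index) per city."""
    total_input = sum(inputs)
    total_output = sum(outputs)
    shares = [y / total_output for y in outputs]
    # Output-weighted national average elasticity (an assumption of this sketch)
    beta_bar = sum(s * b for s, b in zip(shares, elasticities))
    results = []
    for x, s, b in zip(inputs, shares, elasticities):
        gamma = (x / total_input) / (s * b / beta_bar)
        tau = abs(1.0 / gamma - 1.0)
        results.append((gamma, tau))
    return results


# Two hypothetical cities with equal output but unequal labor input
res = misallocation(inputs=[60.0, 40.0], outputs=[50.0, 50.0],
                    elasticities=[0.5, 0.5])
```

The first city holds 60% of labor while producing half of output, giving a coefficient of 1.2 (over-allocation); the second holds 40%, giving 0.8 (under-allocation). Both yield a positive index.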
In summary, through investment deflation, region-specific depreciation rate estimation, capital stock computation via the perpetual inventory method, output elasticity estimation, and misallocation index calculation, this study constructs a rigorous and comprehensive measure of labor resource allocation efficiency at the city level. This provides a solid empirical foundation for investigating how industrial robot application influences factor allocation efficiency.
4.2.2. Core Explanatory Variable: Industrial Robot Adoption at the Prefecture Level (Robots)
In measuring the regional application level of industrial robot technologies, constructing an indicator that is exogenous, representative, and operationalizable has long been a critical challenge in empirical research. Given that China’s statistical system has not yet established a standardized official index for robot adoption intensity at the city level, this study, drawing on international research paradigms and data availability, selects the penetration rate of industrial robots at the prefecture level as the core explanatory variable. This measure not only aligns closely with theoretical expectations and methodological rigor but also provides a solid foundation for subsequent causal identification and mechanism analysis.
From an econometric perspective, the measure of “robot adoption intensity” at the city level in this study is not simply based on the number of robots installed locally. Instead, it follows the Bartik instrumental variable approach proposed by Acemoglu and Restrepo [51], combining national-level changes in robot penetration across industries with city-level baseline employment structures to construct each city’s exposure to technological shocks. The underlying logic is that cities whose baseline employment is more concentrated in industries experiencing rapid national-level robot adoption are more likely to face stronger technological substitution or restructuring in the future.
Essentially, this variable captures the degree to which a city is exposed to exogenous technological trends due to its historically determined industrial and employment structure. It helps to mitigate endogeneity concerns arising from local policy choices, economic adjustments, or firm-level behaviors. From a data feasibility standpoint, the International Federation of Robotics (IFR) provides authoritative and continuous data on robot deployment across industries in major countries and regions since 1993. Meanwhile, China’s national statistical system provides industry-level employment data from economic censuses and annual labor statistics, which can be used to construct the baseline employment structure at the city level.
By linking these two datasets, this study builds an industrial robot adoption index at the city level that is both exogenous and comparable across time and space. This design ensures data quality and enhances the external validity and generalizability of the findings.
In summary, selecting the prefecture-level industrial robot stock penetration rate as the core explanatory variable for measuring the level of industrial robot application reflects both theoretical considerations and practical data constraints. This indicator not only closely captures the logic of industrial robot adoption in practice but also provides a solid foundation for identifying the causal effects of technological progress on labor resource allocation efficiency. By employing a Bartik-style instrumental variable design, it further enhances the robustness of causal inference, offering more scientifically grounded and policy-relevant empirical evidence.
The robot exposure index for city $c$ in year $t$ is calculated as follows:
$$\mathit{Robots}_{ct} = \sum_{i} \frac{E_{ic,2010}}{E_{c,2010}} \cdot \frac{R_{it}}{E_{i,2010}}$$
where $R_{it}$ denotes the total number of industrial robots deployed in industry $i$ nationwide in year $t$, sourced from the International Federation of Robotics (IFR); $E_{i,2010}$ is the number of employees in industry $i$ nationwide in 2010, serving as a normalization base; and $E_{ic,2010}$ represents the number of workers employed in industry $i$ in city $c$ in the base year 2010, capturing the city’s initial industrial structure.
This formula can be interpreted as a weighted average of national robot adoption intensity across industries, with the weights reflecting each city’s baseline exposure to robot-intensive sectors. The specific weight is defined as follows:
$$w_{ic} = \frac{E_{ic,2010}}{E_{c,2010}}$$
where $E_{c,2010}$ is the total number of employees in city $c$ in 2010, ensuring that the weighting captures the relative employment share of each industry in the city’s baseline structure.
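The shift-share construction described above can be sketched in Python. The function name, industry labels, and all figures are hypothetical; the sketch only shows how baseline employment shares weight national robot density by industry.

```python
def robot_exposure(city_employment, national_employment, national_robots):
    """Shift-share robot exposure of one city in one year.

    city_employment:     base-year (2010) employment by industry in the city
    national_employment: base-year (2010) employment by industry nationwide
    national_robots:     national robot stock by industry in the year of interest
    """
    city_total = sum(city_employment.values())
    exposure = 0.0
    for industry, employed in city_employment.items():
        weight = employed / city_total  # baseline employment share
        density = national_robots[industry] / national_employment[industry]
        exposure += weight * density
    return exposure


# Hypothetical two-industry example
exposure = robot_exposure(
    city_employment={"automotive": 300.0, "textiles": 700.0},
    national_employment={"automotive": 1000.0, "textiles": 2000.0},
    national_robots={"automotive": 50.0, "textiles": 10.0},
)
```

Because the weights are fixed at their base-year values, changes in a city's exposure over time come only from national robot adoption by industry and not from local employment adjustments.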
The industrial robot usage data employed in this study are sourced from the International Federation of Robotics (IFR) Global Robot Database, which provides detailed records of industrial robot deployments disaggregated by country and industry. Although IFR does not directly report city-level robot usage, we estimate each city’s robot exposure intensity by combining national-level industry-specific robot penetration rates with the industrial composition of each prefecture-level city.
Data on city-level industrial structures and the working-age population (ages 16–64) are primarily drawn from China’s Sixth National Population Census, which is highly authoritative and offers comprehensive coverage.
The year 2010 is selected as the baseline year for two main reasons. First, from the perspective of data completeness and comparability, 2010 is the only year in recent history when both a nationwide economic census and population census were conducted simultaneously. This provides detailed and consistent data on industry employment structures and demographic composition at the city level. The spatial coverage, industry classification, and statistical standards in this year are highly consistent across sources, offering a solid empirical foundation for constructing a comparable and accurate city-level robot exposure index.
Second, the choice of 2010 ensures both temporal relevance and reasonable lag structure. Around this time, China entered a phase of rapid industrial robot adoption. Using 2010 as the baseline year captures the initial responsiveness of urban industrial structures to the emerging technology diffusion, while allowing sufficient time lags to analyze the subsequent impacts on labor resource allocation efficiency. Fixing the baseline year also helps mitigate the potential endogeneity arising from contemporaneous changes in industry structure, thereby enhancing the exogeneity of the constructed instrument.
Moreover, using national-level robot penetration by industry as the exogenous source of variation ensures horizontal comparability and consistency over time. National robot diffusion trends can be regarded as exogenous technological shocks, largely unaffected by individual city-level policies or market dynamics, which strengthens the instrument’s validity [72].
In conclusion, this study constructs a city-level measure of industrial robot adoption by combining national industry-level robot penetration rates with city-specific baseline employment structures using a Bartik-type instrument. This variable exhibits strong theoretical grounding, practical feasibility, and a robust identification strategy, thereby providing a critical foundation for analyzing the causal impact of industrial robots on labor resource allocation efficiency in subsequent empirical analyses.
4.2.3. Control Variables (CV)
In econometric modeling, the omission of key explanatory variables may lead to biased estimates, thereby compromising causal identification between the core explanatory and dependent variables. To address this concern and enhance the robustness and validity of the empirical analysis, this study incorporates a series of control variables that capture macroeconomic conditions and institutional factors that may systematically affect labor resource allocation efficiency. The selection of these variables is grounded in theoretical insights and empirical findings from prior literature [14,40,50,51,59], covering key dimensions such as urban economic development level, urban financial development, local government behavior, infrastructure construction, local science and technology investment, and opening up to the outside world.
① Level of Economic Development (GDP)
This study uses the logarithm of per capita GDP at the prefecture level as a core indicator of regional economic development. This variable reflects not only the overall economic output capacity of a city but also its ability to absorb, allocate, and redistribute labor resources. It serves as a fundamental condition for evaluating labor allocation efficiency.
② Financial Development (FIN)
The level of financial development is a key institutional foundation for technology diffusion and industrial upgrading. Higher financial efficiency enhances firms’ access to credit and financing, thereby improving their ability to invest in high-tech equipment such as industrial robots and implement job restructuring. To capture this effect, the study uses the ratio of year-end loan balances of financial institutions to regional GDP as an indicator of financial development, reflecting the influence of regional capital markets on technology investment and employment structure.
③ Government Intervention (GOV)
In the context of China’s institutional framework, local governments play a central role in shaping economic development. Their policy preferences, fiscal expenditure structures, and administrative behavior exert both direct and indirect influences on factor market allocation. To measure the degree of government intervention, this study uses the ratio of local general public budget expenditure to regional GDP. This proxy reflects the extent to which government resources are involved in economic activities and partially captures their impact on industrial restructuring and employment orientation.
④ Infrastructure Level (BASE)
Infrastructure represents the foundational conditions for regional economic operations. High-quality infrastructure such as transportation, water supply, and energy can reduce production costs, improve inter-regional connectivity, and enhance the capacity to attract labor and capital, thereby improving labor resource allocation. This study measures infrastructure development using the ratio of total fixed asset investment to GDP, capturing the supportive role of infrastructure in facilitating factor agglomeration and urban functionality.
⑤ Government Science and Technology Expenditure (SCI)
Public expenditure on science and technology is a critical enabler of technological innovation and the development of high-end manufacturing, and serves as an institutional basis for AI deployment. Government spending in this area accelerates R&D and the diffusion of technologies such as robotics, and indirectly influences structural labor reallocation by promoting industrial upgrading. Accordingly, this study measures technological support intensity by the ratio of local government science and technology expenditure to GDP.
⑥ Openness to the Outside World (OPEN)
In recent years, regional openness has been widely recognized as a major source of economic dynamism. Highly open regions tend to possess stronger resource integration capacity, higher technological absorptive capacity, and more active industrial linkages. These features promote regional technological progress and industrial restructuring, thereby altering labor demand structures. This study uses the ratio of total imports and exports to city-level GDP as a proxy for openness, capturing the influence of outward-oriented economic activity on labor allocation.
4.2.4. Mechanism Variables
① Industrial Structure Upgrading (ISU): As one of the key mediating variables in this study, the level of industrial structure upgrading reflects the degree and trajectory of a region’s economic transition from low-end to high-end sectors. It serves as a core indicator for evaluating industrial optimization and structural advancement. In the existing literature, industrial upgrading is widely regarded as a hallmark of high-quality economic development and is frequently employed to analyze structural transformation driven by technological progress, changes in factor allocation efficiency, and industrial transition dynamics. Incorporating this variable into the mediation analysis framework enables a deeper understanding of how industrial robot adoption may influence labor allocation efficiency through structural evolution pathways.
Current approaches to quantifying industrial upgrading fall into four main categories: the industrial output ratio method [69], the industrial hierarchy index method [59], the Moore structural change index [73], and the cosine similarity (angle) method [74]. Each method has its own strengths and limitations and is suited to different research objectives and analytical contexts.
The first is the industrial output ratio method, a more traditional approach that typically uses the share of tertiary industry value added in regional GDP as the core indicator. It reflects the rising prominence of the service sector within the overall economy. This metric is straightforward and data-accessible, making it widely used in studies of China’s industrial evolution, particularly the growing dominance of services. However, the main limitation of this method lies in its simplicity: it captures only the quantity of the tertiary sector share without addressing inter-sectoral relationships, value-added capabilities, or the quality of structural transformation. It also fails to reflect the full upgrading trajectory from primary to tertiary industries, thus producing a somewhat “flattened” index.
The second is the industrial hierarchy index method, which assigns different weights to the three major sectors (typically 1 for primary, 2 for secondary, and 3 for tertiary industries) to construct an index that reflects the direction of structural change. Compared to the output ratio method, this approach better captures the “directionality” of transformation from traditional to modern sectors. However, its subjective weighting system may undermine accuracy, and its performance is limited in contexts with strong structural heterogeneity or complex internal industrial dynamics. Furthermore, its comparability over time is often weak, limiting its usefulness in longitudinal panel studies.
The third method is the Moore structural change index, which quantifies the absolute magnitude of changes in sectoral shares over time to reflect the intensity of structural evolution. This index is particularly suitable for dynamic trajectory analysis, as it captures how quickly structural change is occurring. Nevertheless, it suffers from a critical limitation: it does not distinguish between the direction or quality of change. For example, a shift from primary to tertiary industries and a regression from tertiary back to primary may yield the same index value, thereby weakening the index’s explanatory power as a proxy for upgrading.
The fourth method, which is adopted in this paper, is the cosine angle method. It builds on the strengths of the Moore structural change index while placing special emphasis on the direction and quality of structural change. The basic principle is to treat the value-added shares of the primary, secondary, and tertiary industries in a region as a vector, compute the angles between this vector and three unit vectors representing ideal industrial structures, and combine these angles with weights into an index of structural sophistication. Specifically, let the industrial structure vector of a region be $X_0 = (x_1, x_2, x_3)$, which represents the shares of the primary, secondary, and tertiary industries in local GDP. Then set three ideal unit vectors, $X_1 = (1, 0, 0)$, $X_2 = (0, 1, 0)$, and $X_3 = (0, 0, 1)$, which represent the ideal structural states dominated by the primary, secondary, and tertiary industries, respectively. The angle $\theta_j$ between $X_0$ and $X_j$ ($j = 1, 2, 3$), obtained from the cosine of the angle between the two vectors, measures how close the actual structure is to each ideal state. The angles are then assigned the weights (3, 2, 1) according to the level of industrial development to form the final index of industrial structure upgrading, calculated as follows:
$$\theta_j = \arccos\left( \frac{X_0 \cdot X_j}{\lVert X_0 \rVert \, \lVert X_j \rVert} \right), \qquad \mathit{ISU} = \sum_{k=1}^{3} \sum_{j=1}^{k} \theta_j = 3\theta_1 + 2\theta_2 + \theta_3$$
The key advantages of this method are twofold. First, it captures the evolutionary direction of industrial transformation from low-end to high-end sectors. Second, it reflects the degree of deviation from the ideal high-end structure by measuring the angular distance from the tertiary-industry-dominated vector. A larger ISU value indicates a stronger orientation toward a service-led, advanced industrial structure. This method addresses the limitations of traditional structural metrics in directional interpretation and explanatory power, and has been increasingly adopted in recent studies on regional industrial upgrading and urban development quality.
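A minimal sketch of the cosine angle calculation is given below. It assumes the index is the weighted sum of the angles between the sector-share vector and the three unit axes, with weights (3, 2, 1) on the primary, secondary, and tertiary axes, which is consistent with a larger value indicating a more service-led structure. The function name and the example shares are hypothetical.

```python
import math


def isu_index(shares):
    """Industrial structure upgrading index from (primary, secondary, tertiary) shares."""
    norm = math.sqrt(sum(s * s for s in shares))
    # The cosine of the angle with unit axis j is simply x_j / ||x||
    thetas = [math.acos(min(1.0, s / norm)) for s in shares]
    return 3 * thetas[0] + 2 * thetas[1] + thetas[2]


# Hypothetical structures
agrarian = isu_index((1.0, 0.0, 0.0))      # purely primary
service_led = isu_index((0.0, 0.0, 1.0))   # purely tertiary
mixed = isu_index((0.10, 0.45, 0.45))
```

The purely primary structure gives 1.5π and the purely tertiary structure gives 2.5π, so the index for any actual city lies between these two bounds and rises as the structure moves toward services.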
② Urban Innovation Capacity (Inno): Within the analytical framework that investigates how technological progress affects labor resource allocation efficiency through structural mechanisms, urban innovation capacity emerges as a critical, multidimensional construct. It encompasses a region’s capacity for technology absorption, knowledge generation, and institutional support, serving as a composite indicator of a city’s technological supply capabilities, factor allocation efficiency, and policy effectiveness. Urban innovation capacity reflects not only a region’s current level of technological advancement but also its resilience and adaptability in response to science-driven economic transformation.
From the perspective of resource allocation, regions with higher levels of innovation capacity tend to possess more agile and efficient mechanisms for integrating production factors. These regions are better positioned to anticipate and adapt to structural adjustments triggered by technological change, enabling more accurate and efficient matches within the labor market. Therefore, incorporating innovation capacity as a mediating variable in this study not only enriches the understanding of potential causal mechanisms, but also provides a pivotal analytical lens for identifying structural transformation pathways across macro, meso, and micro levels.
In the earlier literature, urban innovation capacity was typically proxied by patent data. For instance, Guan and Liu [75] employed the share of invention patents in total granted patents at the prefecture level as a representative metric. This approach was widely applied in early studies of regional innovation because patent statistics are transparent, readily available, and easy to operationalize. However, as innovation activities have become increasingly diverse and systemically embedded in urban economies, the limitations of patent-counting metrics have become more pronounced.
First, patent counts primarily capture the output side of innovation processes, without adequately reflecting the practical utility, commercial value, or real-world application of technological outcomes. Second, due to significant cross-industry variation in reliance on patenting practices, such indicators may produce misleading comparisons across regions. High-tech industries typically exhibit a high density of patent output, whereas traditional manufacturing or resource-intensive sectors may contribute fewer patents, leading to potential biases in inter-city assessments. Finally, patent data may also be distorted by local government incentives, corporate patenting strategies, and inconsistencies in R&D reporting, all of which undermine their reliability as comprehensive indicators of a region’s true innovation capacity.
To overcome these limitations, this study adopts the Urban and Industrial Innovation Index published in the China Urban and Industrial Innovation Report by Kou and Liu [76]. This index offers a multidimensional and comprehensive evaluation framework covering four dimensions: “innovation environment, innovation input, innovation output, and innovation performance”. It integrates key indicators such as the density of R&D personnel, the number of high-tech enterprises, patent applications, and the value of technology market transactions. This framework captures the full process of innovation activity, from resource input to institutional responsiveness and economic outcomes, and thus provides a more robust and theoretically grounded measure than single-dimension proxies.
Because the latest publicly available version of the report extends only to 2021, this study extends the dataset to 2023 following the methodology outlined in the original report. During the extension, all city-level innovation indices were harmonized to the 2020 administrative division standard, and industrial innovation components were aligned with the Industrial Classification for National Economic Activities of China (GB/T 4754–2017) [77]. Due to space constraints, the calculation method and technical details of the indicators are not repeated here; they can be found in the public documentation of the “China City and Industry Innovation Report (2017)”.
4.3. Data Sources and Descriptive Statistics
This study is based on a panel dataset covering 280 prefecture-level cities in China from 2006 to 2023. The primary data sources include the China Statistical Yearbook, China City Statistical Yearbook, China Labor Statistical Yearbook, and other relevant statistical yearbooks and sector-specific statistical reports.
The core explanatory variable, prefecture-level industrial robot application intensity, is constructed using sector-level robot installation data published by the International Federation of Robotics (IFR). Following the Bartik-style shift-share design of Acemoglu and Restrepo [14], the measure captures city-level exposure to national trends in industrial automation across sectors.
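A shift-share exposure measure of this kind can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's exact construction: it assumes that each city's sectoral employment shares (taken from some base year) are used to weight national robot density by sector, and the function name, sector labels, and numbers are all hypothetical.

```python
def robot_exposure(city_emp_shares, national_robots, national_employment):
    """Bartik-style city exposure to robots for a single year.

    city_emp_shares:     sector -> city's employment share in that sector
                         (assumed to be fixed at a base year)
    national_robots:     sector -> national robot stock in that sector
    national_employment: sector -> national employment in that sector
    Returns the share-weighted sum of national robot densities.
    """
    exposure = 0.0
    for sector, share in city_emp_shares.items():
        # National robot density: robots per unit of sector employment.
        density = national_robots[sector] / national_employment[sector]
        exposure += share * density
    return exposure


# Hypothetical inputs for one city (not data from the paper).
shares = {"automotive": 0.25, "textiles": 0.75}
robots = {"automotive": 800, "textiles": 100}
employment = {"automotive": 200, "textiles": 400}  # e.g. thousand workers
example_exposure = robot_exposure(shares, robots, employment)
print(example_exposure)
```

Holding the city shares fixed at a base year means that variation over time comes only from national sector-level robot adoption, which is what makes the measure an exposure to national automation trends rather than a direct count of local installations.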
The dependent variable, labor resource allocation efficiency, is constructed based on misallocation theory, with foundational data drawn from various statistical yearbooks. Two mechanism variables are included: (1) industrial upgrading, measured by the relative value-added shares of different industries, and (2) urban innovation capacity, quantified using the China Urban and Industrial Innovation Report compiled by Kou and Liu [76].
The model further incorporates a set of control variables to account for macroeconomic and institutional influences on labor allocation efficiency. These controls include government intervention, infrastructure development, science and technology expenditure, and financial development, with data derived from the same official statistical sources.
To address potential issues related to missing values or outliers, supplementary information from other statistical bulletins and the National Economic and Social Development Statistical Communiqués was utilized to ensure the accuracy and objectivity of the dataset.
Summary statistics for all key variables are presented in Table 1.