4.1. Baseline Regression
Table 4 reports the baseline regression results. The dependent variable is the firm-level skill premium, and the core explanatory variable is the firm’s level of artificial intelligence. Columns (1) through (3) sequentially add regional controls, firm-level controls, and the full set of controls. The coefficient on AI is significantly positive in all specifications. In column (3), the estimated coefficient on AI is 0.072 and is significant at the 1 percent level, indicating that firms with a higher level of AI exhibit a larger skill premium. This suggests that AI development may widen the wage gap between high-skilled and low-skilled workers within firms by increasing the relative demand for high-skilled labor or raising their marginal productivity. This finding is consistent with the conclusions of Hémous and Olsen (2022) and Lankisch et al. (2017), who show that technological progress increases the demand for and productivity of high-skilled labor, thereby widening income disparities between high-skilled and low-skilled workers [75,76]. In terms of economic significance, because both AI and SP are measured in logarithmic form, the estimated coefficient can be interpreted approximately as an elasticity. In column (3), a 1 percent increase in AI is associated with an average increase of about 0.072 percent in the skill premium. Further, combining this estimate with the standard deviation of AI reported in Table 2 (0.353), a one-standard-deviation increase in firm-level AI is associated with an increase of about 2.54 percent in the skill premium (0.072 × 0.353 ≈ 0.0254). This suggests that the effect of AI on the skill premium is not only statistically significant but also economically meaningful.
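As a worked check of the economic-significance calculation, the lines below restate the arithmetic using only the column (3) coefficient and the Table 2 standard deviation. The symbols $\hat{\beta}$ and $\sigma_{AI}$ are notation introduced here for illustration. The last line applies the standard exact conversion of a log-point change and is not a figure reported in the paper.

```latex
\begin{align*}
\Delta \ln SP &\approx \hat{\beta} \times \sigma_{AI} = 0.072 \times 0.353 \approx 0.0254, \\
\%\Delta SP &\approx 100 \times 0.0254 = 2.54, \\
\%\Delta SP_{\text{exact}} &= 100 \times \left(e^{0.0254} - 1\right) \approx 2.57 .
\end{align*}
```

The approximate and exact figures differ only in the second decimal place, so the reading of about 2.5 percent is not sensitive to the log approximation.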
With respect to the control variables, population density (Pop), GDP per capita (Pgdp), the level of industrial upgrading (Upgrade), internet penetration rate (Internet), and road network density (Road_density) are all significantly positive, indicating that the skill premium is larger in regions with higher levels of economic development, a more advanced industrial structure, and better infrastructure conditions. At the firm level, firm size (Size), leverage (Lev), return on assets (ROA), cash flow capacity (CashFlow), firm value (Tobin’s Q), and CEO duality (Dual) all exhibit significantly positive effects. By contrast, the level of human capital (Hcapital) and firm growth (Growth) do not pass conventional significance tests. Overall, the baseline results support Hypothesis 1, indicating that artificial intelligence significantly increases the skill premium within firms.
4.2. Endogeneity Tests
To mitigate potential endogeneity in firms’ artificial intelligence development, this paper employs two instrumental variables: the interaction between prefecture-level base station density and time, and the interaction between terrain relief and time [77,78,79]. The identification strategy is based on the fact that the development, training, and application of AI technologies are not purely digital processes, but rely heavily on physical infrastructure conditions, including data transmission capacity, computing connectivity, and low-latency communication networks. A higher density of base stations provides the bandwidth redundancy and low-latency environment required for cloud computing, large-scale deep learning, and AI deployment, thereby facilitating firms’ engagement in AI-related activities [80]. Meanwhile, terrain relief affects the marginal cost of constructing and maintaining communication infrastructure. More rugged terrain raises the cost of laying optical fiber, installing base stations, and maintaining communication networks, which may hinder the expansion of digital infrastructure and the diffusion of AI technologies, creating a geographically determined barrier to technology adoption [81].
The validity of these instruments depends on the absence of direct effects on the skill premium outside the AI channel. Terrain relief is a long-standing geographical characteristic determined well before the sample period and is therefore plausibly exogenous to firms’ current wage-setting and skill demand decisions [82]. Although base station density reflects communication infrastructure, the deployment of mobile communication networks in China has been largely shaped by national strategies such as the “Broadband China” initiative and the “New Infrastructure” program, rather than by firm-level labor demand or wage structures [83].
In addition, the baseline specification controls for internet penetration and traditional transportation infrastructure, to account for potential non-AI channels through which digital or physical infrastructure may affect the skill premium. Taken together, these considerations suggest that base station density and terrain relief are more likely to affect the skill premium through firms’ differential exposure to AI adoption than to directly determine wage gaps between high- and low-skilled workers.
Table 5 reports the two-stage least squares estimates. Columns (1) and (2) present the first-stage and second-stage results based on the first instrumental variable, columns (3) and (4) report the corresponding results based on the second instrumental variable, and columns (5) and (6) report the results when both instrumental variables are included simultaneously. The first-stage results show that both instrumental variables are significantly and positively associated with firms’ AI development at the 1 percent level, indicating strong relevance.
Turning to the identification tests, the Kleibergen–Paap rk Wald F-statistic is 22.1 for the first instrumental variable and 17.5 for the second instrumental variable, both of which are above the conventional threshold for weak-instrument tests. The corresponding Kleibergen–Paap rk LM statistics are 35.2 and 27.6, respectively, indicating that the model is not underidentified. When both instruments are used jointly, the Kleibergen–Paap rk Wald F-statistic is 18.2, suggesting that weak instrument concerns are unlikely to affect the results. In addition, the Durbin–Wu–Hausman endogeneity test statistics are 19.6 and 16.3, respectively, both significant at the 1 percent level, supporting the view that AI is endogenous and that the instrumental variables approach is warranted. Moreover, in the specification that uses both instruments, the Hansen J test fails to reject the null hypothesis of instrument validity (p = 0.27), which is consistent with the joint exogeneity of the two instruments.
The second-stage results show that, after accounting for potential endogeneity, firms’ level of artificial intelligence development still has a significantly positive effect on the skill premium. Specifically, the estimated coefficient is 0.083 when the first instrumental variable is used and 0.076 when the second instrumental variable is used. When both instruments are included simultaneously, the estimated coefficient remains positive and significant at 0.079. These estimates are close to the baseline results, indicating that the positive effect of AI on the firm-level skill premium remains robust after addressing endogeneity.
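The two-step logic behind the estimates in Table 5 can be sketched on synthetic data. The snippet below is a minimal illustration, not the paper's estimation code. The variable names (`z`, `x`, `y`, `u`), the sample size, and the data-generating coefficients are all assumptions chosen for the example. It omits the control variables, fixed effects, and robust standard errors that the actual specification would include, and it reports point estimates only, because standard errors from a manually run second stage are not valid.

```python
import numpy as np


def ols(y, X):
    """Least-squares coefficients of y on the columns of X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


def tsls(y, x, z, controls=None):
    """Point estimate of the 2SLS coefficient on the endogenous regressor x.

    z holds the excluded instrument(s); controls are optional exogenous
    regressors. A constant is always included.
    """
    n = len(y)
    const = np.ones((n, 1))
    W = const if controls is None else np.column_stack([const, controls])
    Z = np.column_stack([W, z])          # first-stage regressors
    x_hat = Z @ ols(x, Z)                # first stage: fitted values of x
    X2 = np.column_stack([W, x_hat])     # second stage uses fitted values
    return ols(y, X2)[-1]


# Synthetic example (assumed values): u is an unobserved confounder that
# raises both x and y, so OLS overstates the true effect of 0.5.
rng = np.random.default_rng(42)
n = 50_000
z = rng.normal(size=n)                   # instrument, unrelated to u
u = rng.normal(size=n)                   # unobserved confounder
x = 0.8 * z + u + rng.normal(size=n)     # endogenous regressor
y = 0.5 * x + u + rng.normal(size=n)     # outcome

beta_ols = ols(y, np.column_stack([np.ones(n), x]))[1]
beta_iv = tsls(y, x, z)
print(f"OLS: {beta_ols:.3f}  2SLS: {beta_iv:.3f}  (true effect: 0.5)")
```

In this constructed example the confounder biases OLS upward, while the instrumented estimate stays near the true value. In the paper, by contrast, the IV and baseline estimates are close to each other.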
To further address concerns regarding the exclusion restriction, we provide graphical evidence on the dynamic evolution of the skill premium across regions stratified by quartiles of the instrumental variables (Figure 4). China’s AI-related policy environment experienced a marked intensification during the sample period. Since the release of Made in China 2025 in 2015, which laid the foundation for intelligent manufacturing, the density of AI-related policies has continued to increase. The 2016 “Internet Plus” Artificial Intelligence Three-Year Action Plan initiated industrial deployment, while the 2017 New Generation Artificial Intelligence Development Plan formally elevated AI development to a national strategy. Subsequently, in 2018, policy initiatives issued by the Ministry of Industry and Information Technology and the Ministry of Education further promoted AI-related industrial implementation and talent cultivation. The cumulative effect of these policies provided strong institutional support for the rapid diffusion of AI after 2017 and strengthened firms’ demand for high-skilled labor. Accordingly, this study treats 2017 as a key policy turning point marking the onset of large-scale AI diffusion, which provides a natural temporal benchmark for examining the differential evolution of the skill premium across regions with varying initial conditions.
As shown in Figure 4, during the pre-treatment period, the trends in the skill premium are largely parallel across quartiles of both base station density and terrain relief. This indicates that regions with different initial conditions did not exhibit systematically different trajectories in the skill premium before the large-scale diffusion of AI technologies. In contrast, clear divergence emerges only after 2018–2019. Regions with higher base station density experience a faster increase in the skill premium, showing a pronounced J-shaped pattern, while regions with more favorable terrain conditions display a more gradual but persistent divergence. This timing pattern suggests that these regional characteristics do not directly affect the skill premium but rather influence it through differential exposure to AI diffusion. The graphical evidence is therefore consistent with the exclusion restriction, although it cannot establish its validity on its own.
Overall, the instrumental variables estimates are broadly consistent with the baseline regression results. These findings provide further support for the robustness of the paper’s main conclusion that artificial intelligence is positively associated with the skill premium within firms.
4.3. Robustness Checks
4.3.1. Sensitivity Tests for the Construction of the Dependent Variable
Given that the proxy for low-skilled wages may be sensitive to the definition of the grouping cell, this paper further conducts sensitivity tests along both the regional and industry dimensions. Specifically, while the baseline regression uses the year × prefecture-level city × two-digit industry cell, this paper alternatively reconstructs the low-skilled wage proxy using the year × province × two-digit industry cell and the year × prefecture-level city × three-digit industry cell, and then re-estimates the firm skill premium accordingly.
Table 6 reports the results of these sensitivity tests. In column (1), the low-skilled wage proxy is reconstructed using the year × province × two-digit industry cell, and the coefficient on AI is 0.069, significant at the 1 percent level. In column (2), the low-skilled wage proxy is reconstructed using the year × prefecture-level city × three-digit industry cell, and the coefficient on AI is 0.065, significant at the 5 percent level. After changing the grouping cell definition, the coefficient on the core explanatory variable remains positive and statistically significant, and its magnitude remains close to the baseline estimate of 0.072.
Overall, these results indicate that the conclusion that artificial intelligence significantly increases the skill premium within firms does not depend on a particular grouping cell definition and is therefore robust.
4.3.2. Alternative Measures of the Dependent Variable
To further examine the robustness of the baseline results, this paper adopts three alternative measures of the firm-level skill premium.
First, worker skill types are redefined on the basis of educational attainment. Following the existing literature, employees with a junior college degree or above are classified as high-skilled workers, while the remaining employees are classified as low-skilled workers. On this basis, the firm-level skill premium is reconstructed from the perspective of educational attainment [84,85]. This measure can, to some extent, capture income differentiation among workers with different levels of education within firms.
Second, an alternative indicator is constructed based on the pay gap between research and development personnel and ordinary employees. Relative to ordinary employees, research and development personnel typically possess stronger professional knowledge, technical ability, and innovative capacity, and they play a key role in technology development, the commercialization of innovation outcomes, and process optimization. Accordingly, this paper further divides employees into research and development personnel and non-research personnel and uses the ratio of average annual compensation of research and development personnel to that of ordinary employees to capture pay differentials across job categories within firms. This provides a supplementary test of the main findings from the perspective of job functions.
Third, this paper further constructs an alternative indicator based on the pay ratio between executives and ordinary employees. Compared with ordinary employees, executives generally occupy positions associated with higher managerial responsibility, stronger decision-making power, and greater strategic influence, and their compensation often reflects the market valuation of managerial and organizational skills. Accordingly, this paper uses the ratio of executive compensation to the average compensation of ordinary employees as another proxy for the firm-level skill premium. Although this measure does not directly correspond to the wage gap between high-skilled and low-skilled workers in a strict sense, it provides an additional supplementary perspective on within-firm income differentiation.
Table 7 reports the robustness results after replacing the dependent variable. In column (1), the skill premium measure constructed on the basis of educational attainment is used, and the coefficient on AI is 0.067, significant at the 1 percent level. It is worth noting that the coefficient based on the education-based skill premium is slightly smaller than that in the baseline regression. This subtle difference suggests that task-based measures may more accurately capture the specific functional shifts in labor demand driven by AI, whereas education-based proxies, being coarser in nature, might introduce slight attenuation bias due to the inclusion of non-task-related educational signals.
In column (2), the alternative measure based on the ratio of average annual compensation of research and development personnel to that of ordinary employees is used, and the coefficient on AI is 0.054, significant at the 5 percent level. In column (3), the alternative measure based on the ratio of executive compensation to that of ordinary employees is used, and the coefficient on AI is 0.062, significant at the 1 percent level. These results show that, after remeasuring the firm-level skill premium using different definitions, the estimated coefficient on the core explanatory variable remains significantly positive, and its magnitude remains close to that in the baseline regression.
Overall, the empirical results based on alternative dependent variables are consistent with the baseline findings. This indicates that the positive association between artificial intelligence and the firm-level skill premium remains robust to alternative constructions of the dependent variable.
4.3.3. Alternative Measures of the Independent Variable
To further examine the robustness of the baseline results, this paper replaces the measurement of the core independent variable.
First, firm-level AI is remeasured using the number of AI patent grants. Compared with the number of patent applications, patent grants are subject to a more stringent review process and can therefore better reflect the maturity, validity, and quality of the underlying technologies. Accordingly, this paper reconstructs the core explanatory variable using the number of firms’ AI patent grants in the robustness checks.
Second, this paper uses an AI indicator constructed from annual report texts as an alternative explanatory variable. Specifically, based on the AI keyword dictionary, this paper identifies and counts AI-related terms in the annual reports of listed firms, and uses the natural logarithm of one plus the frequency of AI-related keywords in annual reports as an alternative core explanatory variable. Compared with patent-based measures, this text-based indicator complements the analysis by capturing firms’ attention to, strategic positioning in, and application of AI from the perspective of corporate information disclosure, thereby providing another feasible measure of firms’ AI-related activities.
Third, this paper further constructs an alternative AI indicator based on the Management Discussion and Analysis (MD&A) section of annual reports. Specifically, using the same AI keyword dictionary, this paper counts the frequency of AI-related terms appearing in the MD&A section and uses the natural logarithm of one plus this frequency as another alternative measure of firm-level AI development. Compared with the AI keyword frequency calculated from the full annual report, the MD&A-based indicator is more focused on firms’ managerial discussion of business operations, strategic planning, and technology deployment, and therefore may better capture firms’ substantive attention to and application of AI in their operating activities.
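The text-based measures described above share one construction: count dictionary hits in a document, then take the natural logarithm of one plus the count. A minimal sketch of that step follows. The function name `ai_keyword_index`, the two English keywords, and the sample sentence are hypothetical placeholders. The paper's actual AI keyword dictionary and its text-processing pipeline for annual reports are not reproduced here.

```python
import math
import re


def ai_keyword_index(text, keywords):
    """Return ln(1 + total number of keyword occurrences in text).

    Matching is case-insensitive and counts non-overlapping occurrences.
    """
    lowered = text.lower()
    count = sum(
        len(re.findall(re.escape(keyword.lower()), lowered))
        for keyword in keywords
    )
    return math.log(1 + count)


# Hypothetical dictionary and text, for illustration only.
keywords = ["machine learning", "deep learning"]
sample = (
    "We deploy machine learning and deep learning in production; "
    "machine learning also supports our quality control."
)

full_report_index = ai_keyword_index(sample, keywords)  # 3 hits -> ln(4)
empty_index = ai_keyword_index("", keywords)            # 0 hits -> ln(1) = 0
print(round(full_report_index, 3), empty_index)
```

Applying the same function to the full annual report and to the MD&A section alone would give the two text-based indicators. The log transformation keeps firms with zero hits in the sample and compresses the long right tail of keyword counts.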
These indicators capture firms’ AI development from both textual disclosures and technological innovation, allowing different data sources to mutually validate firms’ AI activities. We further conduct Pearson correlation tests for these variables, and the results are reported in Table 8. The correlation coefficient between Lnpatent_app and Lnpatent_grant is 0.882 and significant at the 1 percent level, indicating that the two patent-based measures are closely related and capture similar aspects of firms’ AI-related technological innovation. Meanwhile, the correlation coefficient between Lnwords and Lnwords_MD&A is 0.863 and significant at the 1 percent level, suggesting strong consistency between the two text-based measures. The correlations between text-based and patent-based measures are positive and statistically significant, ranging from 0.286 to 0.318. Overall, these findings provide supplementary support for the consistency and validity of the alternative AI measures used in this paper.
Table 9 reports the robustness results after replacing the core explanatory variable. Column (1) remeasures firm-level AI using the number of AI patent grants, column (2) remeasures the core explanatory variable using the AI indicator constructed from annual report texts, and column (3) further remeasures firm-level AI using the AI keyword frequency in the MD&A section of annual reports. The results show that, after replacing the explanatory variable, both the sign and the statistical significance of the estimated coefficients remain consistent with the baseline results, and the coefficient magnitudes remain within a reasonable range. Specifically, the estimated coefficients on the AI variable are 0.088 in column (1), 0.105 in column (2), and 0.103 in column (3), all of which are significant at the 1 percent level.
Overall, these results indicate that the positive association between artificial intelligence and the skill premium within firms remains robust to alternative measures of the core explanatory variable.
4.3.4. AI Pilot Zone Policy Shock
To further examine the robustness of the baseline results, this paper exploits the policy of establishing National New Generation Artificial Intelligence Innovation and Development Pilot Zones as a quasi natural experiment. Since 2019, the Ministry of Science and Technology has successively promoted the establishment of these pilot zones. By the end of 2021, three batches covering 18 cities had been approved, including Beijing, Shanghai, Tianjin, Shenzhen, Hangzhou, Hefei, Deqing County, Chongqing, Chengdu, Xi’an, Jinan, Guangzhou, Wuhan, Suzhou, Changsha, Zhengzhou, Shenyang, and Harbin. Because the pilot zone policy was implemented in multiple waves over time, this paper constructs a staggered difference-in-differences model to examine the effect of the AI innovation and development pilot zone policy on the skill premium within firms, thereby providing an additional test of the robustness of the main findings.
This paper specifies the staggered difference-in-differences model in Equation (3). Here, DID_it is the policy treatment variable. If the city in which firm i is located has been approved as a National New Generation Artificial Intelligence Innovation and Development Pilot Zone by year t, then DID_it is assigned a value of 1 for that year and all subsequent years. Otherwise, it is assigned a value of 0. The definitions of the other variables are the same as those in Equation (1).
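The coding rule for the staggered treatment variable can be sketched as follows. This is an illustrative assumption about how the rule is implemented, not the paper's code. The function name `did_indicator`, the city labels, and the approval years are hypothetical and do not correspond to the actual pilot zone batches.

```python
from typing import Optional


def did_indicator(approval_year: Optional[int], year: int) -> int:
    """Return 1 from the approval year onward, 0 before it.

    approval_year is None for cities that are never approved, which are
    coded 0 in every year.
    """
    if approval_year is None:
        return 0
    return 1 if year >= approval_year else 0


# Hypothetical approval years for three placeholder cities.
approval = {"city_A": 2019, "city_B": 2021, "city_C": None}

# City-by-year treatment status; every firm inherits its city's value.
panel = {
    (city, year): did_indicator(first_year, year)
    for city, first_year in approval.items()
    for year in range(2018, 2023)
}

for city in approval:
    print(city, [panel[(city, year)] for year in range(2018, 2023)])
```

Because treatment switches on at different dates across cities and never switches off, cities approved later act as controls for cities approved earlier until their own approval year.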
To further test the parallel trends assumption, this paper adopts an event-study approach and specifies the dynamic effects model in Equation (4).
Here, the year immediately preceding policy implementation, that is k = −1, is used as the reference period in the event-study analysis. Each event-time term denotes the interaction between the treatment group indicator and a dummy variable for the k-th year relative to the timing of policy implementation in the city where firm i is located. This paper denotes the interaction terms for the periods before policy implementation as prek, the interaction term for the year of implementation as current, and the interaction terms for the periods after policy implementation as postk. When k < 0, the coefficients capture the dynamic effects in the pre-policy periods. When k = 0, the coefficient captures the effect in the year of policy implementation. When k > 0, the coefficients capture the dynamic effects in the post-policy periods. Given the sample period, this paper includes in the regression the dynamic effects for up to four years before implementation and up to three years after implementation.
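The event-time terms can be built mechanically from each city's approval year. The sketch below follows the naming in the text (prek, current, postk), the window of four leads and three lags, and the omitted reference period k = −1. The function names are hypothetical. Leaving observations outside the window with all dummies equal to zero is an assumption of this sketch, since the text does not say whether such observations are binned or dropped.

```python
from typing import Dict, Optional


def event_label(k: int) -> str:
    """Map relative year k to the label used in the text."""
    if k < 0:
        return f"pre{-k}"
    if k == 0:
        return "current"
    return f"post{k}"


def event_time_dummies(
    approval_year: Optional[int],
    year: int,
    k_min: int = -4,
    k_max: int = 3,
    reference: int = -1,
) -> Dict[str, int]:
    """Return the event-time dummies for one city-year observation.

    The reference period is omitted, so an observation at k = -1 has all
    dummies equal to 0. Never-treated cities are 0 throughout.
    """
    dummies = {
        event_label(k): 0
        for k in range(k_min, k_max + 1)
        if k != reference
    }
    if approval_year is not None:
        label = event_label(year - approval_year)
        if label in dummies:  # inside the window and not the reference
            dummies[label] = 1
    return dummies


# Hypothetical city approved in 2019, observed in 2021 (k = 2).
example = event_time_dummies(2019, 2021)
print(example)
```

With pre1 omitted, each estimated coefficient is read as the treated-control gap in that relative year compared with the gap one year before implementation.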
Column (1) of Table 10 reports the regression results based on this quasi natural experiment. The coefficient on DID is 0.055 and is significant at the 5 percent level, indicating that the AI innovation and development pilot zone policy significantly increases the skill premium within firms. Column (2) of Table 10 and Figure 5 further report the results of the parallel trends test. During the four years before policy implementation, the coefficients on the interaction terms for all pre-policy periods are statistically insignificant. This suggests that, prior to policy implementation, there is no significant difference in the trend of the skill premium between the treated and control groups, which provides support for the parallel trends assumption. At the same time, the policy effect emerges gradually after implementation. The coefficients on post2 and post3 are 0.059 and 0.085, respectively, with the latter being significant at the 1 percent level. This indicates that the positive effect of the policy shock on the firm-level skill premium is characterized by a certain degree of lag and accumulation.
Overall, the results from the quasi natural experiment based on the AI innovation and development pilot zone policy are consistent with the baseline regression findings and further support the core conclusion of this paper that AI increases the skill premium within firms.
4.3.5. Alternative Sample Restrictions
To further examine the robustness of the baseline results, this paper conducts tests based on alternative sample restrictions. First, the sample is restricted to manufacturing firms only. Manufacturing is a sector in which the application of AI is more direct. Its production processes are more standardized, and its employment structure and skill differentiation are also more pronounced. It is therefore particularly informative to examine the effect of AI on the skill premium within the manufacturing sample. Second, firms located in municipalities directly under the central government, namely Beijing, Shanghai, Tianjin, and Chongqing, are excluded. Since these municipalities differ markedly from other cities in terms of AI development, talent concentration, and the broader economic environment, they may introduce additional interference into the estimation. Excluding them therefore helps reduce the influence of sample heterogeneity.
Column (1) of Table 11 reports the regression results using only the manufacturing sample, while column (2) reports the results after excluding firms located in the municipalities directly under the central government. The results show that, after restricting the sample to manufacturing firms or excluding the municipalities, the coefficient on AI remains significantly positive at the 1 percent level, and its magnitude is broadly consistent with that in the baseline regression. These findings indicate that the positive effect of AI on the skill premium within firms remains significant after changing the sample scope, further supporting the robustness of the main conclusion.
4.4. Mechanism Analysis
To further examine the transmission mechanisms through which artificial intelligence affects the skill premium within firms, this paper conducts mechanism tests based on the baseline regressions. Drawing on the mediation analysis framework proposed by Baron and Kenny (1986) [86], and combining it with the Bootstrap method to test indirect effects, this paper specifies the mediation models in Equations (5) and (6).
Here, the mediating variable captures the specific transmission mechanism through which AI affects the skill premium within firms. Equation (5) is the mediator equation and is mainly used to examine whether AI significantly affects the mechanism variable. If the coefficient on AI in Equation (5) is significant, this indicates that the application of AI has a significant effect on the corresponding mechanism variable. Equation (6) is the mediation effect equation. Building on the baseline regression, it includes both the AI variable and the mediating variable in order to examine the effect of the mediating variable on the firm skill premium as well as the direct effect of AI adoption. In this specification, the coefficient on the mediating variable captures its effect on the firm skill premium, while the coefficient on AI captures the direct effect of AI adoption on the firm skill premium after controlling for the mediating variable. If both the coefficient on AI in Equation (5) and the coefficient on the mediating variable in Equation (6) are significant, this indicates that the corresponding channel plays a role in the process through which AI affects the firm skill premium. This paper further uses the Bootstrap method to test the significance of the indirect effect, thereby improving the reliability of the mechanism identification results. The remaining control variables and fixed effects are specified in the same way as in the baseline regression model.
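The bootstrap test of the indirect effect can be sketched on synthetic data as follows. This is a minimal illustration under stated assumptions, not the paper's procedure. It uses a simple percentile bootstrap that resamples observations, omits the controls, fixed effects, and lag structure of the actual models, and takes the indirect effect to be the product of the AI coefficient in the mediator equation and the mediator coefficient in the outcome equation. The names `ai`, `m`, and `sp` and all data-generating values are hypothetical.

```python
import numpy as np


def fit(y, *regressors):
    """OLS coefficients of y on a constant plus the given regressors."""
    X = np.column_stack([np.ones(len(y)), *regressors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


def indirect_effect(ai, m, sp):
    """Product-of-coefficients indirect effect of ai on sp through m."""
    a = fit(m, ai)[1]       # mediator equation: m on ai
    b = fit(sp, ai, m)[2]   # outcome equation: sp on ai and m
    return a * b


# Synthetic data (assumed values): true indirect effect = 0.5 * 0.4 = 0.2.
rng = np.random.default_rng(0)
n = 2000
ai = rng.normal(size=n)
m = 0.5 * ai + rng.normal(size=n)
sp = 0.3 * ai + 0.4 * m + rng.normal(size=n)

point = indirect_effect(ai, m, sp)

# Percentile bootstrap: resample rows with replacement and re-estimate.
draws = []
for _ in range(500):
    idx = rng.integers(0, n, size=n)
    draws.append(indirect_effect(ai[idx], m[idx], sp[idx]))
lower, upper = np.percentile(draws, [2.5, 97.5])

print(f"indirect effect: {point:.3f}, 95% CI: [{lower:.3f}, {upper:.3f}]")
```

A 95 percent interval that excludes zero is read as evidence of a nonzero indirect effect, which is the criterion applied to each channel in Table 12. With firm panel data, resampling whole firms rather than single observations would be the more natural choice.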
Accordingly, the following analysis examines four potential mediating channels, namely the substitution effect, productivity effect, capital deepening effect, and technological upgrading effect. To alleviate potential reverse causality concerns, all mediating variables are included in their one-period lagged form. The corresponding results are reported in Table 12.
4.4.1. Substitution Effect
This paper uses the share of routine labor (RoutineShare) to capture the substitution effect, measured as the proportion of workers in routine occupations relative to total employment. Columns (1) and (2) of Table 12 report the results. Column (1) shows that the coefficient of AI on RoutineShare is −0.058 and significant at the 1 percent level, indicating that AI significantly reduces the share of routine labor within firms. Column (2) shows that the coefficient on RoutineShare is −0.079 and significant at the 5 percent level, suggesting that a higher share of routine labor is associated with a lower skill premium. Meanwhile, after controlling for the mediating variable, the coefficient of AI on the skill premium is 0.069 and remains significant at the 1 percent level. The bootstrap results show that the indirect effect through this channel is 0.005, with a 95 percent confidence interval of [0.002, 0.009], excluding zero. This indirect effect accounts for about 6.9 percent of the total effect, indicating that the substitution effect constitutes a partial mediating channel through which AI affects the skill premium, thus supporting Hypothesis 2.
4.4.2. Productivity Effect
This paper uses total factor productivity (TFP), estimated using the Olley–Pakes (OP) method, to capture the productivity effect. Columns (3) and (4) of Table 12 report the results. Column (3) shows that the coefficient of AI on TFP is 0.123 and significant at the 1 percent level, indicating that AI significantly enhances firm productivity. Column (4) shows that the coefficient on TFP is 0.208 and significant at the 1 percent level, suggesting that higher productivity is associated with a higher skill premium. After controlling for the mediating variable, the coefficient of AI on the skill premium decreases to 0.048 and remains significant at the 5 percent level, implying that part of the effect of AI operates through productivity improvement. The bootstrap results show that the indirect effect through this channel is 0.026, with a 95 percent confidence interval of [0.015, 0.039], excluding zero. This indirect effect accounts for about 36.1 percent of the total effect, indicating that the productivity effect constitutes an important mediating channel through which AI affects the skill premium, thus supporting Hypothesis 3.
4.4.3. Capital Deepening Effect
This paper uses the capital-labor ratio (CapLabRatio), defined as the ratio of net fixed assets to total employment, to capture capital deepening. Columns (5) and (6) of Table 12 report the results. Column (5) shows that the coefficient of AI on CapLabRatio is 0.118 and significant at the 1 percent level, indicating that AI significantly increases firms’ capital intensity. Column (6) shows that the coefficient on CapLabRatio is 0.136 and significant at the 1 percent level, suggesting that capital deepening is associated with a higher skill premium. After controlling for the mediating variable, the coefficient of AI on the skill premium is 0.057 and remains significant at the 1 percent level. The bootstrap results show that the indirect effect through this channel is 0.016, with a 95 percent confidence interval of [0.008, 0.026], excluding zero. This indirect effect accounts for about 22.2 percent of the total effect, indicating that capital deepening also serves as a partial mediating channel through which AI affects the skill premium, thus supporting Hypothesis 4.
4.4.4. Technological Upgrading Effect
This paper uses research and development intensity (RDIntensity), measured as the ratio of R&D expenditure to operating revenue, to capture technological upgrading. Columns (7) and (8) of Table 12 report the results. Column (7) shows that the coefficient of AI on RDIntensity is 0.034 and significant at the 1 percent level, indicating that AI significantly increases firms’ R&D intensity. Column (8) shows that the coefficient on RDIntensity is 0.184 and significant at the 1 percent level, suggesting that technological upgrading is associated with a higher skill premium. After controlling for the mediating variable, the coefficient of AI on the skill premium is 0.065 and remains significant at the 1 percent level. The bootstrap results show that the indirect effect through this channel is 0.006, with a 95 percent confidence interval of [0.003, 0.012], excluding zero. This indirect effect accounts for about 8.3 percent of the total effect, indicating that technological upgrading constitutes a partial mediating channel through which AI affects the skill premium, thus supporting Hypothesis 5.
Taken together, the four channels account for about 73.5 percent of the total effect of AI on the skill premium (6.9 + 36.1 + 22.2 + 8.3), indicating that they jointly explain a substantial share of the overall effect. Because each channel is estimated in a separate model, this sum should be read as approximate, as the channels may partly overlap. Figure 6 further illustrates the proportional contributions of the different effects to the total effect, thereby providing a visual decomposition of the relative importance of each channel.
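The channel shares reported in Sections 4.4.1 to 4.4.4 can be checked from the coefficients in Table 12. The sketch below assumes that each indirect effect equals the product of the AI coefficient in the mediator equation and the mediator coefficient in the outcome equation, rounded to three decimals, and that shares are taken relative to the baseline coefficient of 0.072. The bootstrap point estimates reported in the text match these products, but the product rule itself is an assumption of this check rather than a statement of how the paper computed them.

```python
# Baseline coefficient on AI, column (3) of Table 4.
TOTAL_EFFECT = 0.072

# (AI -> mediator coefficient, mediator -> skill premium coefficient),
# as reported in Table 12.
channels = {
    "substitution (RoutineShare)": (-0.058, -0.079),
    "productivity (TFP)": (0.123, 0.208),
    "capital deepening (CapLabRatio)": (0.118, 0.136),
    "technological upgrading (RDIntensity)": (0.034, 0.184),
}

# Indirect effect per channel, rounded to three decimals as in the text.
indirect = {name: round(a * b, 3) for name, (a, b) in channels.items()}

# Share of the total effect, in percent, rounded to one decimal.
shares = {
    name: round(100 * effect / TOTAL_EFFECT, 1)
    for name, effect in indirect.items()
}

# The combined figure is the sum of the rounded channel shares.
total_share = round(sum(shares.values()), 1)

for name in channels:
    print(f"{name}: indirect = {indirect[name]:.3f}, share = {shares[name]:.1f}%")
print(f"sum of channel shares = {total_share:.1f}%")
```

The two negative coefficients in the substitution channel multiply to a positive indirect effect: AI lowers the routine share, and a lower routine share is associated with a higher skill premium.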