Toward Sustainable Workforce Development: How AI Reshapes Skill Demand Structure—Evidence from 67 Million Job Postings in China

Zhang, Ling; Zhang, Chenglei

doi:10.3390/su18104905

Open Access Article

by Ling Zhang ¹ and Chenglei Zhang ²,*

¹ School of Economics, Jinan University, Guangzhou 510632, China

² School of Economics and Trade, Guangdong University of Finance, Guangzhou 510521, China

* Author to whom correspondence should be addressed.

Sustainability 2026, 18(10), 4905; https://doi.org/10.3390/su18104905

Submission received: 4 April 2026 / Revised: 8 May 2026 / Accepted: 11 May 2026 / Published: 14 May 2026

(This article belongs to the Topic AI for Sustainable Development: Innovations, Challenges, and Real-World Applications)

Abstract

How artificial intelligence (AI) reshapes the internal structure of firm-level skill demand remains largely uncharted. Using approximately 67 million online job postings from two major Chinese recruitment platforms (2019–2024), we construct firm-by-year potential AI exposure via semantic matching between AI patent texts and detailed occupation task descriptions, and we decompose exposure into displacement and augmentation components based on task routineness. From job-description text, we measure the demand shares of four skill categories and their within-category importance. Identification comes from within-firm variation under firm and city-by-year fixed effects. Displacement and augmentation exposure exhibit opposing relationships with skill demand: displacement is negatively associated with the routine cognitive share, while augmentation is positively associated with the nonroutine analytical share. Both forms of exposure are associated with a de-coring pattern: a shallower and more dispersed skill portfolio in which within-category importance diverges from share movements, concentrated among low entry-threshold, small firms. Reskilling policy should therefore emphasize portfolio breadth and portable competency frameworks rather than deeper single-track specialization, particularly for workers in small, lower-threshold firms.

Keywords:

artificial intelligence; skill demand; sustainable workforce development; education policy; displacement–augmentation decomposition; online job postings; de-coring

1. Introduction

Rapid advances in artificial intelligence (AI) are reshaping the tasks that workers perform and the skills that firms seek. These shifts matter for sustainable workforce development because mismatches between evolving employer demand and the competencies that education systems produce can undermine both labor market inclusiveness and the quality of workforce investment [1,2,3]. Yet, most evidence on AI and labor demand examines aggregate exposure and occupation-level outcomes [4,5,6,7], leaving the within-firm reconfiguration of skill demand largely uncharted. This paper provides a firm-level empirical characterization of AI-driven skill demand restructuring, with particular attention to dimensions that existing studies have examined only partially: the opposing roles of displacement and augmentation, the reconfiguration of skill portfolio depth (measured throughout by O*NET importance ratings), and the divergence between skill share movements and within-category importance movements.

Unlike earlier automation technologies that primarily displaced labor from routine tasks, AI can simultaneously displace and augment within the same occupation [3,8]. This dual character means that AI may shift skill demand within a firm in multiple directions at once, but existing studies rarely decompose AI exposure into its displacing and augmenting components, making it difficult to identify the direction of skill reallocation. Two questions, therefore, remain open: how does AI alter the internal structure of skill demand within firms, and which segments of the labor market bear the greatest restructuring burden? Disentangling displacement from augmentation is a prerequisite for answering them and for designing education and training responses.

Beyond the decomposition gap, existing studies have focused primarily on changes in the quantity or type of jobs demanded [4,9,10,11] rather than on the internal reconfiguration of skill portfolios. Yet if AI also alters the importance rating at which each skill category is demanded, and with it the average importance and dispersion of a firm’s skill mix, then curriculum designs calibrated to produce deep specialists in narrow domains may be structurally misaligned with evolving employer needs, even when they target the correct skill categories. These multi-dimensional shifts are what this paper sets out to characterize. Its findings inform education and workforce development policy along the dimensions most directly relevant to the sustainable development agenda [12].

This paper asks whether AI reshapes firm skill demand through upgrading and specialization or instead through a flattening of the internal skill hierarchy. It has three specific objectives. The first is to decompose firm-level AI exposure into displacement and augmentation components and test whether the two exhibit opposing relationships with skill demand. The second is to document changes in portfolio-level depth, namely average skill importance and dispersion, that analyses of shares alone cannot capture. The third is to examine whether share movements and within-category importance movements can diverge in direction, and to identify which segments of the firm distribution bear the greatest restructuring pressure.

We use approximately 67 million online job postings from two major Chinese recruitment platforms (2019–2024) to construct firm-year measures of AI exposure and skill demand. Exposure is built from semantic matching between AI patents and occupational task descriptions following Kogan et al. [13], decomposed into displacement and augmentation components; skill demand is built from job-description text mapped to the four-category O*NET taxonomy [14]. Identification comes from within-firm variation with firm and city-by-year fixed effects. The empirical analysis covers composition (skill shares), reconfiguration (portfolio-level depth), and divergence between within-category importance movements and share movements, with heterogeneity along firm characteristics. Full details of the measures, the econometric model, and robustness appear in Section 3.

This paper contributes to several strands of the literature. First, it extends the displacement–augmentation decomposition of Kogan et al. [13] from the occupation level to the firm level, making it possible to observe how skill demand shifts simultaneously across multiple categories within the same firm. The decomposition reveals that the two types of exposure exhibit opposing relationships with skill demand, a finding that aggregate exposure measures cannot capture. Second, this paper moves beyond the directional reallocation of skill shares examined in prior work [4] by documenting changes in the shape of the skill portfolio, specifically the decline in average importance and the rise in dispersion, and by showing that share movements and within-category importance movements can diverge in direction. Together, these results constitute what we term de-coring, a joint pattern that existing firm-level studies on AI and labor demand have characterized only partially. Third, by demonstrating that de-coring is concentrated among low entry-threshold, small firms, this paper connects micro-level evidence on AI-driven skill restructuring to the design of education and workforce development policy, an issue of growing importance in the sustainable development literature [1,12].

Two specific gaps underlie the contributions above: firm-level studies have examined hiring quantities, employment arrangements, and demand for AI-specific skills, but not the internal composition of the skill portfolio; and evidence on how AI reshapes skill demand in large developing-economy labor markets remains limited, despite these markets being the settings where vocational-track misalignment is most consequential for workforce development. These two gaps motivate the design choices summarized in the preceding paragraph.

The remainder of this paper is organized as follows. Section 2 reviews the related literature and develops the theoretical framework with testable hypotheses. Section 3 describes the empirical design, data, and variable construction. Section 4 presents the baseline results, skill portfolio characteristics, and robustness checks. Section 5 reports the mechanism and heterogeneity analyses. Section 6 discusses the findings and their implications for education policy and sustainable workforce development.

2. Literature Review and Theoretical Framework

2.1. Related Literature

The analysis builds on two theoretical traditions. The skill-biased technological change (SBTC) literature [15,16] emphasizes that technological progress raises the relative demand for skilled labor. The task framework [14,17] shifts the unit of analysis from workers to tasks, showing that computers substitute for routine cognitive and manual tasks while complementing abstract and interpersonal tasks, a pattern formalized as employment polarization [18] and extended to social skills by Deming [19] and Deming and Kahn [20]. Building on the task framework, Acemoglu and Restrepo [21,22] decompose the employment effects of automation into a displacement effect, a productivity effect, and a reinstatement effect, with the net direction depending on the relative magnitudes of these channels. This decomposition motivates the empirical distinction between displacement and augmentation exposure adopted in this paper.

However, applying this framework to AI requires recognizing that AI is distinct from earlier automation technologies: it can automate not only routine tasks but also a subset of nonroutine cognitive tasks [5,23,24], making its potential impact broader than previous automation waves [25]. Early estimates of job-level automation risk range widely [26,27], but more recent evidence points to a different margin of adjustment. Rather than eliminating jobs wholesale, AI appears to operate primarily within occupations by reshaping skill requirements [28], a finding that motivates the within-firm analysis pursued in this paper. A further structural feature of the literature is its temporal bifurcation: most pre-2023 evidence was calibrated on machine-learning rather than generative-AI capabilities, whereas post-2023 evidence on generative AI is still thin. Our 2019–2024 panel straddles both periods, though we do not separately identify Gen-AI-specific effects; distinguishing the two generations of AI impact remains a direction for future work.

A large empirical literature measures AI exposure at the occupation level and relates it to employment or wage outcomes, from US vacancy data [4,6,7] to cross-country comparisons [10] and recent multi-country Gen-AI exposure estimates produced by international institutions [29]. European evidence has extended this work to sixteen labor markets with distinct labor-market institutions [30], and national-panel studies have added German [31], Dutch [32], and Danish [33] microdata on within-occupation adjustment and worker-level responses. Autor et al. [34] show that labor-augmenting and labor-automating innovations have countervailing effects on occupational employment, providing direct support for the opposing channels emphasized by the task model. Freund and Mann [31] formalize the within-occupation adjustment margin, showing that AI-induced task automation reshapes the composition and returns of remaining tasks within a job. However, a common limitation of occupation-level analysis is that changes in employment and wages reflect both firm hiring decisions and workers’ occupational choices, making it difficult to isolate the adjustment of skill demand within firms.

Estimates from this literature do not fully converge. The task-level substitution of routine cognitive and manual work is robustly reproduced across measures and institutional settings [18,31,34]; however, the direction and magnitude of AI’s net effect on nonroutine cognitive work remain contested. Estimates based on patent-task similarity [13,35] place the automation frontier deeper into cognitive work than do skill-based exposure measures [6,7], producing systematically different rankings of affected occupations. Worker-level evidence is similarly split: Humlum and Vestergaard [33] document occupational switching without aggregate earnings change in Denmark, while Brynjolfsson, Chandar, and Chen [36] find disproportionate employment losses among young workers in the United States. These discrepancies motivate firm-level evidence on compositional outcomes, which are masked when aggregate employment or earnings serve as the dependent variable.

This limitation has motivated firm-level studies using online job posting data [20,37,38]. Recent work has examined AI and firm-level hiring quantities [4], employment arrangements [32], firm growth and product innovation [9], demand for AI-specific skills [10], and the pace of demand-side adjustment [11].

However, three substantive limitations constrain the inferences this body of work can support about AI-driven skill restructuring. First, AI exposure is almost uniformly entered as a single aggregate index, conflating channels whose theoretical signs are opposite; the few studies that distinguish AI-related from non-AI skill demand [10] do so in the dependent variable rather than in the exposure measure itself, leaving the decomposition between displacement and augmentation on the exposure side untested. Our paper inherits the patent-task matching framework of Kogan et al. [13] (and the known limitation that it captures potential rather than realized adoption) but recovers the decomposition by restricting the patent sample and splitting tasks into routine and nonroutine subsets. Second, the dependent variable is typically quantity-based, such as postings, vacancy shares, or hires, rather than the internal composition of the skill portfolio, so changes that reshape portfolio depth without changing observed quantities are silent in the data. Third, evidence is primarily from the US and a handful of European labor markets; the skill-restructuring implications of AI in large developing-economy contexts, where vocational-track specialization is deeper and retraining infrastructure thinner, are underexamined.

This paper extends the displacement–augmentation decomposition of Kogan et al. [13] from the occupation level to the firm level, switches the dependent variable from quantities to the internal composition of skill demand (shares, portfolio-level depth, and within-category importance), and implements this analysis on a Chinese 67-million-posting panel. While Freund and Mann [31] model within-occupation task transformation, their analysis does not track how these adjustments aggregate to firm-level skill demand patterns; the present paper operates at precisely this level. More broadly, cross-country comparisons of AI labor-market exposure consistently show that the signs and dominant margins of adjustment differ between advanced and emerging economies [29,30]; these divergences plausibly reflect a mix of labor-market institutions, stages of AI diffusion, and measurement-instrument differences, none of which the current literature has conclusively separated. This reinforces the case for firm-level evidence from large developing-economy settings rather than direct extrapolation from US or Northern European contexts.

2.2. Theoretical Framework and Hypotheses

Building on the task-based model of Acemoglu and Restrepo [21,22] and the exposure decomposition proposed by Kogan et al. [13], we distinguish two types of AI exposure with potentially opposing relationships with skill demand. Kogan et al. [13] classify occupational tasks into routine and nonroutine categories and construct two technology exposure indices via semantic matching between patent texts and task descriptions. This paper adopts their decomposition but restricts the patent sample to AI patents, yielding displacement AI exposure built over routine tasks and augmentation AI exposure built over nonroutine tasks. To fix notation, let occupation $j$ in year $t$ have routine task set $T_{j}^{R}$ and nonroutine task set $T_{j}^{N}$. The two exposure measures are defined as:

$$\mathrm{AIE}_{j,t}^{R} = A\{\mathrm{sim}(p, \tau) : p \in P_{t}^{AI},\ \tau \in T_{j}^{R}\}$$

$$\mathrm{AIE}_{j,t}^{N} = A\{\mathrm{sim}(p, \tau) : p \in P_{t}^{AI},\ \tau \in T_{j}^{N}\}$$

where $P_{t}^{AI}$ is the set of AI patents in year $t$, $\mathrm{sim}(\cdot)$ is a semantic similarity function, and $A(\cdot)$ is an aggregation function with threshold truncation. Displacement exposure captures the potential of AI to cover codifiable, pattern-based tasks; augmentation exposure captures coverage over tasks requiring integrative judgment and interpersonal interaction.
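The two exposure definitions can be illustrated with a minimal sketch. The source does not specify the similarity function or the exact form of the aggregator, so everything concrete here is an assumption made for illustration: cosine similarity stands in for sim, a mean over patent-task pairs with sub-threshold values set to zero stands in for A, and the threshold of 0.5 and the two-dimensional toy embeddings are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def exposure(patent_vecs, task_vecs, threshold=0.5):
    """Assumed form of A: mean of patent-task similarities, with values
    below `threshold` truncated to zero before averaging."""
    if not patent_vecs or not task_vecs:
        return 0.0
    total = 0.0
    for p in patent_vecs:
        for tau in task_vecs:
            s = cosine(p, tau)
            total += s if s >= threshold else 0.0
    return total / (len(patent_vecs) * len(task_vecs))


# Toy embeddings (hypothetical, for illustration only).
ai_patents = [[1.0, 0.0], [0.0, 1.0]]   # stands in for P_t^{AI}
routine_tasks = [[1.0, 0.0]]            # stands in for T_j^R
nonroutine_tasks = [[1.0, 1.0]]         # stands in for T_j^N

aie_r = exposure(ai_patents, routine_tasks)     # displacement exposure
aie_n = exposure(ai_patents, nonroutine_tasks)  # augmentation exposure
print(aie_r, aie_n)
```

The only difference between the two measures is the task set over which similarities are aggregated, which mirrors the routine versus nonroutine split in the definitions.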

The correspondence between the two empirical measures and the theoretical channels in Acemoglu and Restrepo [21,22] is not one-to-one. Displacement exposure primarily captures the displacement effect, partially offset by productivity gains. Augmentation exposure captures the productivity and reinstatement effects jointly. The directional predictions in the hypotheses below therefore reflect the dominant channel within each measure rather than an exact correspondence with a single theoretical mechanism.

From occupations to firms. Having defined exposure at the occupation level, we now aggregate to the firm. The relationship between AI exposure and firm-level skill demand operates through two margins: reallocation across jobs and reconfiguration of the skill mix within jobs [19,39]. Define the demand weight of firm $f$ in year $t$ for skill category $k$ as $w_{f,k} = \sum_{j} \pi_{f,j} \cdot \phi_{j,k}$, where $\pi_{f,j}$ is the share of job $j$ within the firm and $\phi_{j,k}$ is the weight of skill $k$ in job $j$. Differentiating with respect to AI exposure yields:

$$\frac{\partial w_{f,k}}{\partial \mathrm{AIE}_{f}} = \sum_{j} \frac{\partial \pi_{f,j}}{\partial \mathrm{AIE}_{f}} \cdot \phi_{j,k} + \sum_{j} \pi_{f,j} \cdot \frac{\partial \phi_{j,k}}{\partial \mathrm{AIE}_{f}}$$

The first term on the right-hand side captures cross-job reallocation (the effect of AI exposure on the job-mix weights $\pi_{f,j}$), and the second captures within-job reconfiguration (the effect on the skill weights $\phi_{j,k}$ within each job).
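A discrete analogue of this decomposition may help fix intuition. The sketch below compares one firm in a low-exposure and a high-exposure state and splits the change in the demand weight for a single skill category into a reallocation term, a reconfiguration term, and a second-order interaction term (which vanishes in the infinitesimal version above). All numbers are hypothetical.

```python
def demand_weight(pi, phi):
    """w = sum_j pi_j * phi_j for one skill category."""
    return sum(p * f for p, f in zip(pi, phi))


def decompose(pi0, phi0, pi1, phi1):
    """Split w1 - w0 into reallocation, reconfiguration, interaction."""
    reallocation = sum((p1 - p0) * f0 for p0, p1, f0 in zip(pi0, pi1, phi0))
    reconfiguration = sum(p0 * (f1 - f0) for p0, f0, f1 in zip(pi0, phi0, phi1))
    interaction = sum(
        (p1 - p0) * (f1 - f0) for p0, p1, f0, f1 in zip(pi0, pi1, phi0, phi1)
    )
    return reallocation, reconfiguration, interaction


# Two jobs; phi is the weight of one skill category (say, routine cognitive).
pi0, phi0 = [0.6, 0.4], [0.5, 0.2]   # low-exposure state (hypothetical)
pi1, phi1 = [0.5, 0.5], [0.4, 0.2]   # high-exposure state (hypothetical)

realloc, reconf, inter = decompose(pi0, phi0, pi1, phi1)
total = demand_weight(pi1, phi1) - demand_weight(pi0, phi0)
print(realloc, reconf, inter, total)
```

In this toy case the routine-intensive job loses staffing share (reallocation) and also becomes less routine-intensive internally (reconfiguration), and the three terms add up to the total change in the demand weight.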

The dominant margin differs by exposure type. For displacement exposure, the primary channel is cross-job reallocation: firms reduce staffing for routine-intensive jobs ($\pi_{f,j}$ falls), and routine skill demand declines through the first term. For augmentation exposure, the primary channel is within-job reconfiguration: AI takes over routine components within jobs, raising the weight of nonroutine skills in $\phi_{j,k}$, and demand for nonroutine skills expands through the second term. The skill demand shares measured in the empirical analysis capture both margins simultaneously.

To fix ideas, consider a representative firm in cross-border e-commerce adopting generative AI (GAI) tools for content creation. In this stylized case, GAI integration can raise a designer’s daily throughput from 8 to 45 requests by shifting the task from manual execution to AI supervision. This transition implies two key shifts in skill demand: first, a decline in the necessity of narrow technical proficiency (e.g., Photoshop); and second, the emergence of ‘digital operations’ roles that prioritize broad, cross-modal auditing skills over deep, single-domain specialization. This example illustrates how GAI reshapes the skill frontier by substituting for specialized tasks while complementing oversight capabilities.

Hypothesis 1 (Composition).

Displacement exposure reduces the routine cognitive skill share, while augmentation exposure increases the nonroutine analytical skill share.

The O*NET importance rating provides a measure of within-skill depth orthogonal to the share dimension. Two channels link exposure to depth.

Displacement exposure is expected to lower portfolio-wide average importance because it removes high-importance routine jobs from the portfolio. Routine cognitive skills tend to carry the highest average importance in our sample (descriptive evidence in Table 1), so reducing their weight mechanically lowers the portfolio mean.

Augmentation exposure is expected to lower average importance through a different channel: by enabling AI-assisted execution of tasks that previously required deep expertise, it reduces the importance threshold per nonroutine skill. Augmentation exposure is additionally expected to raise skill dispersion, because expanding workers’ task boundaries beyond their original specialization shifts the skill profile toward a more even multi-category configuration.

Hypothesis 2 (Reconfiguration).

Both types of AI exposure lower average skill importance, and augmentation exposure also raises skill dispersion, producing a shallower and more dispersed skill portfolio.

A shallower, more dispersed skill portfolio is not merely a neutral recomposition; it threatens sustainable workforce development through three channels.

First, shallower portfolios erode professional identity and firm-specific human capital. A concentrated core of deeply required skills is the foundation on which workers build occupational identity, negotiate wages, and accumulate the firm-specific human capital that underwrites internal promotion. A portfolio in which no single category is demanded at high depth flattens this foundation, leaving workers with capabilities that are broadly applicable yet nowhere indispensable.

Second, shallower portfolios weaken long-term bargaining power. When within-category importance falls, the marginal productivity anchored to any given skill declines, weakening workers’ ability to extract wage premia or progress along a clear career ladder even when job titles remain nominally intact.

Third, dispersed skill demand imposes a compounded reskilling burden. Dispersed demand forces workers to maintain competence across multiple categories rather than deepening one, implying a continuous rather than episodic learning investment that narrow vocational systems are poorly structured to support. The burden does not fall evenly: empirical evidence shows that the reciprocal relationship between perceived employability and work-related learning is weak and concentrated among workers who already perceive themselves as employable [40]. Under bounded agency (the idea that individual reskilling effort is constrained by structural barriers of education, age, and sector position [40]), workers facing the most structural barriers are systematically less able to convert reskilling opportunities into renewed employability, even when training is nominally available. De-coring therefore widens the set of skills any single worker must maintain and amplifies an existing employability–learning gap.

These channels together imply that de-coring threatens workforce sustainability not through outright job loss but through a gradual hollowing out of the expertise that underwrites long-term career stability and wage progression.

Hypothesis 3 (Divergence).

For at least some skill categories, share movements and within-category importance movements diverge in direction under AI exposure.

We use de-coring as a compact label for the three-part pattern in Hypotheses 2 and 3: declining average importance, rising dispersion, and share–importance divergence within skill categories (i.e., within a given category, shares and importance coefficients can move in opposite directions rather than together). The label captures the idea that AI exposure hollows out the high-importance core of the skill portfolio while spreading demand more thinly across a wider set of categories.

Relation to existing concepts. We use de-coring as a descriptive label rather than a new theoretical construct, and it covers a joint signature that prior descriptors each capture only in part. Employment polarization [18,41] and routine-biased technological change [17] characterize shifts in employment shares across skill or task categories and are silent on within-category depth; skill dilution and job hollowing-out [27] speak respectively to within-job broadening and middle-skill decline but not to share–importance divergence. What de-coring adds is a single empirical signature that combines three margins, namely, shares, portfolio-level depth, and within-category depth, observed jointly in the same firm-year data.

Figure 1 provides a conceptual illustration of the de-coring pattern. Panel A depicts a pre-exposure baseline portfolio with a clearly identifiable core: one skill category (typically routine cognitive in administrative-intensive firms, or nonroutine social in service-oriented firms) carries both a disproportionately high share and a high within-category average importance, while the remaining three categories are peripheral on both dimensions. Panel B depicts the post-exposure portfolio under de-coring: the former core category may retain or even gain share, yet its within-category importance falls; peripheral categories expand in share but are demanded at shallower depth; and the portfolio’s mean importance falls and its across-category dispersion rises. The transition is therefore not a reallocation from one concentrated core to another but a flattening of the portfolio’s internal hierarchy, the pattern we label de-coring.

Figure 1. Conceptual illustration of the de-coring pattern (not based on empirical data). Panels (A,B) each show the four skill categories (nonroutine analytical, NRA; nonroutine social, NRS; routine cognitive, RC; and routine manual, RM) with two bars per category: blue = share, orange = within-category average importance. Panel (A) (baseline portfolio): RC is the high-share, high-importance core. Panel (B) (de-cored portfolio): shares equalize, importance is universally lower, and within-category depth diverges from share movements. Panel (C): portfolio-level summary, with mean importance falling (↓) and dispersion (1 − HHI) rising (↑). “Baseline” and “De-cored” are theoretical labels representing the same firm under low versus high AI exposure, not a split of the data by calendar time. The bars are stylized for expositional clarity; the directional shifts they depict (importance decline, share equalization, and share–importance divergence) are empirically documented by the regression evidence in Tables 2–4.

Table 2. AI Exposure and Firm Skill Demand Structure: Baseline Results.

	(1) NRA	(2) NRS	(3) RC	(4) RM
Panel A
AI exposure	−0.072 ***	−0.192 ***	−0.145 ***	0.409 ***
	(0.010)	(0.014)	(0.009)	(0.018)
β × SD	−0.001	−0.003	−0.002	+0.007
Observations	3,613,645	3,613,645	3,613,645	3,613,645
Within R²	0.0010	0.0005	0.0007	0.0058
Panel B
Displacement AI exposure	−0.380 ***	−0.298 ***	−0.361 ***	1.039 ***
	(0.017)	(0.029)	(0.019)	(0.033)
β × SD(R)	−0.004	−0.003	−0.003	+0.010
Augmentation AI exposure	0.033 ***	−0.148 ***	−0.013	0.128 ***
	(0.010)	(0.014)	(0.009)	(0.011)
β × SD(N)	+0.000	−0.002	−0.000	+0.002
Observations	3,472,561	3,472,561	3,472,561	3,472,561
Within R²	0.0016	0.0005	0.0010	0.0128
Control variables	Yes	Yes	Yes	Yes
Fixed effects	Firm + City × Year	Firm + City × Year	Firm + City × Year	Firm + City × Year

Notes: Standard errors two-way clustered by firm and city × year are reported in parentheses. *** p < 0.01. β × SD denotes standardized effect sizes: SD (Total) = 0.016, SD (Displacement) = 0.009, and SD (Augmentation) = 0.013.

Table 3. AI Exposure and Firm Skill Portfolio Characteristics.

	(1) Skill Importance	(2) Skill Dispersion (1 − HHI)
Displacement AI exposure	−0.706 ***	0.003
	(0.040)	(0.027)
Augmentation AI exposure	−0.123 ***	0.088 ***
	(0.016)	(0.012)
Observations	3,472,561	3,472,561
Within R²	0.0034	0.1651
Control variables	Yes	Yes
Fixed effects	Firm + City × Year	Firm + City × Year

Notes: Standard errors two-way clustered by firm and city × year are reported in parentheses. *** p < 0.01. Skill importance is the O*NET importance-weighted average across all skills appearing in a firm’s postings. Skill dispersion is the complement of the Herfindahl index of firm-year skill shares (1 − HHI); higher values indicate a more even distribution.
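The two portfolio-level outcomes described in the notes to Table 3 can be sketched as follows. This is an illustrative reading rather than the authors' code: weighting importance ratings by mention counts is an assumption, and the shares and ratings are hypothetical.

```python
def skill_dispersion(shares):
    """1 - HHI of skill shares; shares are renormalized to sum to one."""
    total = sum(shares.values())
    if total <= 0:
        raise ValueError("shares must have a positive sum")
    return 1.0 - sum((s / total) ** 2 for s in shares.values())


def mean_importance(skill_hits):
    """Count-weighted mean of importance ratings.

    skill_hits: list of (importance_rating, mention_count) pairs.
    Weighting by mention count is an assumption of this sketch.
    """
    n = sum(count for _, count in skill_hits)
    if n == 0:
        raise ValueError("no skill mentions")
    return sum(imp * count for imp, count in skill_hits) / n


# Hypothetical firm-year skill shares across the four categories.
concentrated = {"NRA": 0.10, "NRS": 0.10, "RC": 0.70, "RM": 0.10}
even = {"NRA": 0.25, "NRS": 0.25, "RC": 0.25, "RM": 0.25}

d_conc = skill_dispersion(concentrated)
d_even = skill_dispersion(even)
avg_imp = mean_importance([(4.0, 3), (2.0, 1)])
print(d_conc, d_even, avg_imp)
```

With four categories, 1 − HHI ranges from 0 (all demand in one category) to 0.75 (perfectly even shares), so a move from the concentrated to the even portfolio raises measured dispersion.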

Table 4. Category-Level Skill Importance.

	(1) NRA Importance	(2) NRS Importance	(3) RC Importance	(4) RM Importance
Displacement AI exposure	1.273 ***	−0.264 ***	0.308 ***	−1.120 ***
	(0.106)	(0.019)	(0.016)	(0.033)
Augmentation AI exposure	−0.076 **	0.011	0.083 ***	−0.053 ***
	(0.038)	(0.010)	(0.009)	(0.017)
Observations	3,064,371	3,369,772	3,270,724	2,563,270
Within R²	0.0026	0.0004	0.0010	0.0042
Control variables	Yes	Yes	Yes	Yes
Fixed effects	Firm + City × Year	Firm + City × Year	Firm + City × Year	Firm + City × Year

Notes: Standard errors two-way clustered by firm and city × year are reported in parentheses. *** p < 0.01, ** p < 0.05. Observations vary across columns because firm-year observations with zero skill hits in a given category are excluded from the corresponding importance regression.

3. Materials and Methods

3.1. Econometric Model

Empirically, the analysis proceeds in three steps. First, occupation-level AI exposure is constructed from patent-task semantic similarity. Second, exposure is aggregated to the firm-year level using firms’ recruitment composition. Third, job-description texts are used to extract the firm-level skill demand structure.

The baseline specification, estimated at the firm-year level, is:

$$Y_{f,t} = \beta\, \mathrm{AI}_{f,t-1} + X_{f,t}^{\prime} \theta + \mu_{f} + \delta_{c(f,t),t} + \varepsilon_{f,t}$$

where $Y_{f,t}$ is the skill structure outcome (the share of each of the four skill categories); $\mathrm{AI}_{f,t-1}$ is the firm’s AI exposure, lagged one period; $X_{f,t}$ includes the log number of job postings; $\mu_{f}$ denotes firm fixed effects; and $\delta_{c(f,t),t}$ denotes city-by-year fixed effects. Because the four skill shares sum to one, the coefficients sum to zero across equations and capture reallocation across skill categories. The city $c(f,t)$ is defined as the modal recruitment city of firm $f$ in year $t$; robustness to a time-invariant city assignment is verified in Section 4.4. Standard errors are two-way clustered by firm and city-year [42].
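The role of the two sets of fixed effects can be illustrated with a minimal within-estimator sketch. It is not the authors' estimation code: it handles a single regressor, omits the controls and the two-way clustered standard errors, and removes the fixed effects by alternating group demeaning. The toy panel, the fixed-effect values, and the slope of 0.5 are hypothetical, and the exposure column is taken to be already lagged.

```python
from collections import defaultdict


def demean_by(values, groups):
    """Subtract the group mean from each value."""
    sums, counts = defaultdict(float), defaultdict(int)
    for v, g in zip(values, groups):
        sums[g] += v
        counts[g] += 1
    return [v - sums[g] / counts[g] for v, g in zip(values, groups)]


def within_transform(values, firm, city_year, n_iter=50):
    """Alternate firm and city-year demeaning to sweep out both effects."""
    out = list(values)
    for _ in range(n_iter):
        out = demean_by(out, firm)
        out = demean_by(out, city_year)
    return out


def fe_slope(y, x, firm, city_year):
    """OLS slope of y on x after removing firm and city-year effects."""
    y_t = within_transform(y, firm, city_year)
    x_t = within_transform(x, firm, city_year)
    return sum(a * b for a, b in zip(x_t, y_t)) / sum(a * a for a in x_t)


# Toy panel: (firm, city-year cell, lagged exposure). Hypothetical values.
rows = [
    ("A", "c1-2020", 0.10), ("A", "c1-2021", 0.30),
    ("B", "c1-2020", 0.20), ("B", "c1-2021", 0.20),
    ("C", "c2-2020", 0.00), ("C", "c2-2021", 0.50),
    ("D", "c2-2020", 0.40), ("D", "c2-2021", 0.10),
]
firm_fe = {"A": 0.20, "B": -0.10, "C": 0.05, "D": 0.30}
cy_fe = {"c1-2020": 0.00, "c1-2021": 0.02, "c2-2020": -0.03, "c2-2021": 0.01}
true_beta = 0.5

firm = [r[0] for r in rows]
city_year = [r[1] for r in rows]
x = [r[2] for r in rows]
y = [true_beta * r[2] + firm_fe[r[0]] + cy_fe[r[1]] for r in rows]

beta_hat = fe_slope(y, x, firm, city_year)
print(beta_hat)
```

Because the outcome is generated without noise, the estimator recovers the assumed slope up to floating-point error, which shows that only within-firm, within-city-year variation in exposure identifies the coefficient.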

To separately examine the two types of exposure, AI exposure is decomposed:

$$Y_{f,t} = \beta^{R} \mathrm{AI}_{f,t-1}^{R} + \beta^{N} \mathrm{AI}_{f,t-1}^{N} + X_{f,t}^{\prime} \theta + \mu_{f} + \delta_{c(f,t),t} + \varepsilon_{f,t}$$

Comparing $\hat{\beta}^{R}$ and $\hat{\beta}^{N}$ provides a direct test of Hypothesis 1. Because firm and city-by-year fixed effects absorb the vast majority of variation in skill shares, within R² values are expected to be low. We therefore report standardized effect sizes, computed as each coefficient multiplied by the standard deviation of its exposure measure, as a complementary benchmark for gauging economic significance.
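As a worked check of the arithmetic, the sketch below recomputes the standardized effect sizes (coefficient times standard deviation) from the share coefficients and standard deviations reported with Table 2, and verifies the adding-up property that the four share coefficients sum to zero. The standard deviations are published to three decimals only, so a recomputed product can differ from the tabulated β × SD entry in the last digit.

```python
# Share coefficients as reported in Table 2 (Panel A: total; Panel B: split).
coefs = {
    "total": {"NRA": -0.072, "NRS": -0.192, "RC": -0.145, "RM": 0.409},
    "displacement": {"NRA": -0.380, "NRS": -0.298, "RC": -0.361, "RM": 1.039},
    "augmentation": {"NRA": 0.033, "NRS": -0.148, "RC": -0.013, "RM": 0.128},
}
# Standard deviations of the exposure measures from the Table 2 notes
# (rounded to three decimals in the source).
sd = {"total": 0.016, "displacement": 0.009, "augmentation": 0.013}

# Standardized effect: change in the share for a one-SD change in exposure.
standardized = {
    kind: {cat: beta * sd[kind] for cat, beta in row.items()}
    for kind, row in coefs.items()
}

# Adding-up: shares sum to one, so coefficients sum to zero across columns.
adding_up = {kind: sum(row.values()) for kind, row in coefs.items()}

for kind, row in standardized.items():
    print(kind, {cat: round(v, 4) for cat, v in row.items()})
print(adding_up)
```

Read this way, a one-standard-deviation increase in displacement exposure is associated with a shift of roughly one percentage point in the routine manual share, with offsetting declines spread across the other three categories.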

This design is best interpreted as recovering conditional correlations within firms rather than as a quasi-experimental causal estimate. Three design features determine what it can and cannot reveal. First, decomposing exposure into displacement and augmentation components identifies asymmetric relationships that aggregate-exposure studies cannot resolve, regardless of their identification strategy, because the two channels partially offset. Second, the dependent variable is the internal composition of skill demand, a margin on which quasi-experimental evidence at a comparable scale remains limited in any institutional context. Third, the panel’s 67-million-posting scale permits within-firm identification for 3.62 million firm-years, an order of magnitude beyond most existing AI-labor studies. We complement the design with Oster [43] bounds for omitted-variable sensitivity (Section 4.4) and sub-sample robustness across geographic, weighting, and period perturbations.

3.2. Data

This paper draws on three data sources. The primary source consists of job vacancy advertisements from two major Chinese online recruitment platforms, 51job and Zhaopin, covering 2019–2024. These platforms are among the largest online recruitment channels in China, covering both standard and nonstandard employment arrangements [44], though their coverage is concentrated in formal-sector urban employment and is less systematic for rural and lower-tier-city labor markets. Each record contains the job title, employer name, city, job description, education and experience requirements, salary range, and number of positions. Cleaning proceeds in three steps. First, records with a missing job description or with fewer than 20 Chinese characters of description text are dropped. Second, duplicate postings are removed by collapsing records that share the same combination of standardized employer name, job title, posting date, and advertised salary range within the same firm-month; only the earliest record is retained. Third, employer names are standardized by mapping each full Chinese company name to a unique firm identifier, ensuring consistent firm-level linkage across years and across the two platforms. After cleaning, the working sample contains approximately 66.98 million job-posting records covering roughly 16.30 million unique job titles. Per-step removal counts are reported in Appendix A.4.

The platforms’ geographic coverage is not uniform. Tier-1 and tier-2 cities (following the standard NDRC and Yicai New First-Tier Cities Research Institute classification: 4 tier-1 cities and roughly 15 new-tier-1 plus tier-2 cities) are quantitatively over-represented in online recruitment volume: in the cleaned sample, these nineteen cities account for roughly 71% of postings (34% from the four tier-1 cities and a further 37% from the fifteen new-tier-1 and tier-2 cities), materially higher than their combined share of national urban employment, as is common in online recruitment samples. Prefectures below tier-3 and rural labor markets are correspondingly under-represented. Because the identification strategy relies on within-firm variation with firm and city-by-year fixed effects, time-invariant geographic over-representation is absorbed. Time-varying geographic composition effects are further probed by re-estimating the baseline on geographically stratified sub-samples (Appendix B Table A4); the sign and significance of the main baseline coefficients are preserved. The estimates should still be interpreted as characterizing the urban, formal-sector, online-recruiting segment of the Chinese labor market.

Occupational task descriptions are drawn from the 2022 edition of the Chinese Dictionary of Occupational Classifications, which provides a four-level coding system with task description texts for each detailed occupation. Skill classification data come from the 35 general skills and associated importance ratings in the U.S. O*NET database, which capture general cognitive, social, and technical ability dimensions and have been applied in cross-country settings [18]. The potential limitations of applying this taxonomy to the Chinese labor market are discussed in Section 6.

AI patent data are obtained from the IncoPat patent database, which mirrors the China National Intellectual Property Administration (CNIPA) records, covering 2018–2024 (lagged one year relative to the 2019–2024 posting panel). The AI-patent subset is identified using the approach of WIPO Technology Trends 2019: artificial intelligence [35], which combines IPC classification codes with keyword searches. Detailed identification criteria and patent sample statistics are reported in Appendix A.2.

3.3. Variables

A key design feature of the variable construction is the separation of data sources: the key explanatory variable (AI exposure) is derived from patent-task semantic matching via a job-title-to-occupation crosswalk, while the dependent variable (skill demand structure) is extracted from job-description texts. Because the two rely on distinct textual inputs, they are not mechanically linked.

3.3.1. Key Explanatory Variable: Firm-Year AI Exposure

Firm-year AI exposure is constructed in two steps, following the patent-task semantic matching approach of Kogan et al. [13] and adapting the firm-level aggregation strategy of Liu et al. [44]. First, occupation-level AI exposure is computed at the detailed occupation level (7-digit codes in the Chinese Dictionary of Occupational Classifications). A pretrained sentence embedding model encodes task description texts from the Dictionary and AI patent texts into dense vectors, and cosine similarity is computed for each task-patent pair. To compute displacement and augmentation exposure separately, the 10,160 unique task descriptions in the Dictionary are classified as routine (6322; 62.22%) or nonroutine (3838; 37.78%) using the large language model GLM-4.7, based on codifiability and rule structure. The classification is validated against Kogan et al. [13] and Acemoglu and Restrepo [22] at the occupation level. For each year, task-level exposure scores are computed as the sum of threshold-truncated similarities across the top 50 most similar patents (Appendix A.2); scores are then summed by task type within each occupation to yield routine-task exposure and nonroutine-task exposure, defined as:

ξ_{o, t}^{R} = \sum_{τ \in T_{o}^{R}} \sum_{p \in P_{τ, t}^{K}} \max \{0, \mathrm{sim}(p, τ) - κ_{t}\}

and analogously ξ_{o, t}^{N}, summed over T_{o}^{N}. Total occupation-level exposure is the sum across both task types, ξ_{o, t}^{total} = ξ_{o, t}^{R} + ξ_{o, t}^{N}. Details on the embedding model, threshold calibration, and differences from Kogan et al. [13] are provided in Appendix A.1.
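The aggregation in the exposure formula can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline. The function names, the toy similarity values, and the threshold of 0.5 are assumptions made for the example; the calibrated κ_t and the embedding model are described in Appendix A.1. K = 50 follows the top-50 rule stated in the text.

```python
def task_exposure(similarities, kappa, top_k=50):
    """Sum of threshold-truncated similarities over the top_k most similar patents."""
    top = sorted(similarities, reverse=True)[:top_k]  # the set P^K_{tau,t}
    return sum(max(0.0, s - kappa) for s in top)


def occupation_exposure(tasks, kappa, top_k=50):
    """Aggregate task-level scores by task type within one occupation and year.

    tasks: list of (task_type, similarities); task_type is "R" (routine)
    or "N" (nonroutine), similarities are cosine similarities to patents.
    """
    xi = {"R": 0.0, "N": 0.0}
    for task_type, similarities in tasks:
        xi[task_type] += task_exposure(similarities, kappa, top_k)
    xi["total"] = xi["R"] + xi["N"]
    return xi


# Hypothetical occupation with two routine tasks and one nonroutine task.
toy_tasks = [
    ("R", [0.62, 0.55, 0.40]),
    ("R", [0.30, 0.20]),
    ("N", [0.58, 0.45]),
]
xi = occupation_exposure(toy_tasks, kappa=0.5)  # kappa=0.5 is an assumed value
```

In this toy case only similarities above the threshold contribute, so the second routine task adds nothing to the routine score.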

What the exposure measure captures. The measure is constructed from semantic similarity between AI patent texts and occupational task descriptions and therefore captures potential technological exposure, the degree to which the tasks typical of a firm’s job mix are covered by the AI technology frontier, rather than observed AI adoption at the firm. A firm with high measured exposure need not have deployed AI in practice, and conversely, early adopters in low-exposure industries may be invisible to the measure. The estimated associations therefore characterize how firm-level skill demand co-moves with the industry-facing AI opportunity set, not with firm-level adoption decisions. Any classical measurement error that remains biases estimated coefficients toward zero, so significant relationships should be read as conservative.

Second, occupation-level scores are mapped to individual job postings via a job-title-to-occupation crosswalk. After normalization, the 16.30 million unique raw titles reduce to roughly 11.07 million standardized titles; because the distribution is heavily long-tailed (the top 200,000 standardized titles already cover 74.67% of all postings), matching is applied to these top 200,000 titles. A two-stage procedure, fuzzy character-level matching followed by semantic vector matching, successfully maps about 101,000 standardized titles to detailed occupations, covering roughly 30.78 million postings (46.0% of the cleaned sample). Firm-year total, displacement, and augmentation exposure are then computed as posting-count-weighted averages across all matched jobs within a firm.
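The firm-year aggregation step can be illustrated as follows. This is a hedged sketch under simple assumptions: each matched posting carries the routine and nonroutine scores of its occupation, and the record layout (`firm`, `year`, `xi_R`, `xi_N`) and function name are hypothetical. Averaging over postings is equivalent to weighting each occupation by its posting count within the firm-year.

```python
from collections import defaultdict


def firm_year_exposure(postings):
    """Average occupation-level scores over a firm's matched postings in a year.

    postings: iterable of dicts with keys "firm", "year", "xi_R", "xi_N",
    where the xi values are the scores of the posting's matched occupation.
    """
    acc = defaultdict(lambda: [0.0, 0.0, 0])  # sum of xi_R, sum of xi_N, count
    for p in postings:
        cell = acc[(p["firm"], p["year"])]
        cell[0] += p["xi_R"]
        cell[1] += p["xi_N"]
        cell[2] += 1
    return {
        key: {"AI_R": r / n, "AI_N": nn / n, "AI_total": (r + nn) / n}
        for key, (r, nn, n) in acc.items()
    }


# Toy data: firm A posts two jobs in one occupation and one in another.
toy_postings = [
    {"firm": "A", "year": 2021, "xi_R": 0.2, "xi_N": 0.0},
    {"firm": "A", "year": 2021, "xi_R": 0.2, "xi_N": 0.0},
    {"firm": "A", "year": 2021, "xi_R": 0.0, "xi_N": 0.4},
    {"firm": "B", "year": 2021, "xi_R": 0.1, "xi_N": 0.3},
]
exposure = firm_year_exposure(toy_postings)
```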

Matched-sample representativeness. In the absence of a formal matched-vs-unmatched descriptive table, we describe the likely direction of selection on qualitative grounds. The long-tail concentration of unmatched titles implies that matched postings are plausibly drawn disproportionately from standardized roles linked to the detailed occupation system, likely concentrated in larger firms, formal-sector industries, and higher-tier cities whose labor markets dominate online recruitment. Postings in household services, micro-retail, and composite-role positions in smaller firms are under-represented. The most likely direction of selection is that matched-sample estimates understate the routine-cognitive share (because the excluded long-tail positions include many nonroutine-social service roles) and therefore likely deliver conservative displacement-on-routine-cognitive coefficients. The robustness checks in Section 4.4 that reweight postings by headcount and that restrict to geographically stratified sub-samples (Appendix B Table A4) allow the reader to gauge how the estimates shift as matched-sample representativeness changes. A formal matched-vs-unmatched descriptive table is deferred to future work; the present estimates therefore characterize the urban, formal-sector, online-recruiting segment of the Chinese labor market.

3.3.2. Dependent Variable: Firm-Year Skill Demand Structure

Following the task classification framework of Acemoglu and Autor [14], we group the 35 general skills in the O*NET database into four categories: nonroutine analytical (NRA, 11 skills involving abstract reasoning and technical judgment), nonroutine social (NRS, 9 skills involving interpersonal interaction and communication), routine cognitive (RC, 6 skills involving rule-based information processing), and routine manual (RM, 9 skills involving standardized equipment operation and resource scheduling). Only routine manual appears in the four-category grouping because the 35 general skills include equipment-operation and resource-scheduling skills that fit the routine-manual definition but do not include the physical-dexterity indicators needed to construct a nonroutine-manual category (for example, assembly, physical repair, or caregiving). Nonroutine manual is therefore omitted. The complete mapping of all 35 skills to four categories is provided in Appendix C.

For each job posting j, verb–noun task units are extracted from the description text and mapped to the 35 O*NET skills using the large language model GLM-4.7, following the extraction procedure in Liu et al. [44], yielding a posting-level skill hit indicator m_{j, s} \in \{0, 1\}. The extraction is validated in two ways: LLM-extracted skill vectors achieve Cohen’s kappa above 0.70 against independent manual coding on 500 postings, and firm-year skill shares are positively correlated with occupation-level O*NET importance ratings. The firm-level demand share for skill category k is:

{Share}_{f, t, k} = \frac{\sum_{j \in J_{f, t}} \sum_{s \in S (k)} m_{j, s}}{\sum_{j \in J_{f, t}} \sum_{s = 1}^{35} m_{j, s}}
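As an illustration of the share formula, the sketch below counts skill hits by category over one firm-year's postings and divides by the total number of hits. The skill identifiers (`s01` to `s04`) and their category assignments are placeholders invented for the example; the actual mapping of the 35 skills is given in Appendix C.

```python
def skill_shares(posting_hits, category_of):
    """Category shares of skill hits for one firm-year.

    posting_hits: list of sets, one per posting, holding the skills s
                  with m_{j,s} = 1.
    category_of:  dict mapping each skill to "NRA", "NRS", "RC", or "RM".
    Returns None when the firm-year has no identifiable skill content.
    """
    counts = {"NRA": 0, "NRS": 0, "RC": 0, "RM": 0}
    for hits in posting_hits:
        for skill in hits:
            counts[category_of[skill]] += 1
    total = sum(counts.values())
    if total == 0:
        return None
    return {k: c / total for k, c in counts.items()}


# Placeholder mapping and three toy postings (not the paper's data).
toy_mapping = {"s01": "NRA", "s02": "NRS", "s03": "RC", "s04": "RM"}
toy_hits = [{"s01", "s02"}, {"s02", "s03"}, {"s04"}]
shares = skill_shares(toy_hits, toy_mapping)
```

Because every hit falls into exactly one category, the four shares sum to one, which is what forces the coefficients to sum to zero across the four share equations.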

The empirical analysis uses three groups of dependent variables corresponding to Hypotheses 1–3: the four skill category shares, average skill importance and dispersion, and within-category importance for each skill type.

Generative AI tools, including Anthropic Claude, were used only for language editing and limited programming assistance during data processing. All analyses, interpretations, and conclusions were independently completed and verified by the authors.

4. Results

4.1. Descriptive Analysis

The final sample contains approximately 3.62 million firm-year observations from 2019 to 2024. The sample is restricted to firms that published at least one posting with identifiable skill content and at least one posting mappable to a detailed occupation with AI exposure information. The Pearson correlation between displacement and augmentation exposure is 0.026, indicating no serious multicollinearity concern.

Table 1 reports summary statistics. Among the skill shares, nonroutine social is the largest (0.447), followed by routine cognitive (0.283), nonroutine analytical (0.187), and routine manual (0.083). The category-level importance measures reveal a different ordering: routine cognitive skills have the highest average importance (3.099), while nonroutine social ranks first in share but only second in importance. This divergence between the share and importance rankings is worth noting, as the two dimensions need not move in the same direction. Aggregate AI exposure, displacement exposure, and augmentation exposure all have means close to zero with highly right-skewed distributions, so within-firm identification is driven by the extensive margin (zero to positive) and by intensity variation within the positive range.

4.2. Baseline Results

Table 2 reports baseline results with firm and city-by-year fixed effects. Panel A shows that aggregate AI exposure is significantly positively associated with the routine manual share and negatively associated with the remaining three shares. Panel B enters displacement and augmentation exposure simultaneously, revealing that the two relate to skill structure in different directions. Panel B is restricted to firm-years with both routine and nonroutine exposure scores (3,472,561 firm-years, 96% of Panel A).

Displacement exposure is significantly negatively associated with the routine cognitive share and positively associated with the routine manual share, which carries the largest coefficient in absolute value. A one-standard-deviation increase in displacement exposure is associated with a 0.4 percentage point decrease in the nonroutine analytical share and a 1.0 percentage point increase in the routine manual share; relative to the respective means of 0.187 and 0.083, these magnitudes are modest but economically meaningful.

To translate these percentage-point shifts into an intuitive headcount scale, we express them per 1000 postings. At this scale, a one-SD rise in displacement exposure implies about 10 more routine-manual roles, four fewer nonroutine-analytical roles, and three fewer roles each for the routine-cognitive and nonroutine-social categories, summing to zero by construction. Because the median firm in our sample posts only eight jobs per year, the firm-level reallocation implied by any single coefficient is below one role; the per-1000 figure is a unit-conversion device rather than the actual volume at any single firm. The economically meaningful interpretation is the aggregate pattern across the 1.4 million firms in the panel, on the order of tens of thousands of roles reallocated per year under a one-SD shift in displacement exposure.

Augmentation exposure is significantly positively associated with the nonroutine analytical share and with the routine manual share; it is negatively associated with the nonroutine social share and is not significant for routine cognitive. Among the non-manual categories, NRA is the only share whose augmentation coefficient is positive, consistent with AI complementing rather than substituting for analytical tasks. The augmentation-NRA coefficient of 0.033, combined with an augmentation-exposure standard deviation of 0.013, implies that a one-SD rise in augmentation exposure is associated with roughly 0.4 additional analytically oriented roles per 1000 postings, small at the firm level but a consistent directional shift across the 67-million-posting panel.
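The unit conversion behind these figures is simple arithmetic and can be checked directly. The sketch below uses only numbers reported in the text (the augmentation coefficient of 0.033, the standard deviation of 0.013, and the rounded per-1000 displacement shifts); the function name is a hypothetical helper.

```python
def roles_per_thousand(coef, sd_exposure, postings=1000):
    """Standardized effect (coefficient x SD) rescaled to roles per `postings`."""
    return coef * sd_exposure * postings


# Augmentation on the nonroutine analytical share: 0.033 x 0.013 x 1000.
aug_nra = roles_per_thousand(0.033, 0.013)

# Rounded per-1000 shifts under a one-SD rise in displacement exposure,
# as reported in the text; share changes must net out to zero.
displacement_shift = {"RM": 10, "NRA": -4, "RC": -3, "NRS": -3}
net_change = sum(displacement_shift.values())
```

The first quantity evaluates to roughly 0.43, matching the "roughly 0.4 additional roles per 1000 postings" in the text, and the displacement shifts net to zero as the adding-up constraint requires.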

The value of decomposition is most directly illustrated by the nonroutine analytical share. The aggregate coefficient is −0.072, whereas the displacement coefficient of −0.380 and the augmentation coefficient of 0.033 operate in opposite directions and largely offset each other (Table 2, Panel B). Without decomposition, the positive augmentation–analytical relationship cannot be detected.

The results are consistent with Hypothesis 1. The routine manual and nonroutine social shares do not exhibit opposing coefficient signs across the two types of exposure: the routine manual share is positively associated with both types of exposure, consistent with the low susceptibility of physical-presence tasks to algorithmic substitution, while the nonroutine social share is negatively associated with both, consistent with this category containing a substantial codifiable component alongside tasks that resist standardization. These patterns were left as empirical questions in Hypothesis 1, and the results confirm that the clearest opposing-direction pattern was obtained for the two categories where directional predictions were specified: routine cognitive and nonroutine analytical.

4.3. Skill Portfolio Characteristics and Category-Level Skill Importance

Share variables measure relative weights across skill categories but do not capture whether the average importance of demanded skills changes (Hypothesis 2) or whether within-category importance movements align with share movements (Hypothesis 3). This section examines these dimensions using decomposed exposure throughout.

Table 3 reports results for two skill portfolio characteristics: skill importance (the O*NET importance-weighted average across all skills in a firm’s postings) and skill dispersion (1 − HHI of firm-year skill shares) [4]. For importance, both types of exposure have significantly negative coefficients, but the displacement coefficient, at −0.706, is substantially larger than the augmentation coefficient of −0.123, consistent with Hypothesis 2: displacement targets routine tasks that carry above-average importance weights. For dispersion, augmentation exposure is significantly positive while displacement is not significant, consistent with augmentation being associated with a more evenly distributed skill profile. Taken together, the results point to a skill portfolio that is shallower in average importance and more dispersed across skill types, the pattern defined as de-coring in Hypothesis 2.
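For intuition about the dispersion measure, 1 − HHI can be computed as below. This sketch assumes the index is taken over the four category shares, which the text does not state explicitly. The final line applies the formula to the Table 1 mean shares purely as a numerical illustration; the dispersion of the mean shares is not the same as the mean firm-year dispersion.

```python
def dispersion(shares):
    """1 - HHI: zero when one category holds all demand, higher when spread out."""
    return 1.0 - sum(v * v for v in shares.values())


concentrated = {"NRA": 1.0, "NRS": 0.0, "RC": 0.0, "RM": 0.0}
even = {"NRA": 0.25, "NRS": 0.25, "RC": 0.25, "RM": 0.25}

d_concentrated = dispersion(concentrated)  # lower bound of the measure
d_even = dispersion(even)                  # upper bound with four categories

# Illustration only: formula applied to the Table 1 mean shares.
d_table1_means = dispersion({"NRS": 0.447, "RC": 0.283, "NRA": 0.187, "RM": 0.083})
```

With four categories the measure ranges from 0 to 0.75, so a positive augmentation coefficient indicates a move toward the evenly spread end of that range.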

Table 4 examines whether share movements and importance movements within each category move in the same direction. Routine manual skills provide the clearest illustration of de-coring: the share is positively associated with displacement exposure (Table 2), yet the within-category importance coefficient is significantly negative at −1.120. Routine cognitive skills move in the opposite direction: the share falls under displacement, yet importance rises, with a coefficient of 0.308. Nonroutine analytical skills present a more nuanced pattern: under augmentation, the share rises but importance declines, with a coefficient of −0.076, consistent with the hypothesis that augmentation lowers the depth at which analytical skills are demanded even as it expands the share of jobs requiring them.

These divergences indicate that inferring the direction of skill demand from share movements alone can be misleading, consistent with Hypothesis 3 and completing the empirical characterization of de-coring.

4.4. Robustness Check

The baseline results are assessed along four dimensions: alternative aggregation (summing rather than averaging posting-level exposure scores to construct firm-year AI exposure), alternative weighting (advertised headcount rather than posting count), excluding the COVID-19 period (2020), and replacing city-by-year fixed effects with time-invariant city fixed effects. These checks address concerns about the sensitivity of results to how posting-level exposure is aggregated to the firm level, the weighting scheme applied in that aggregation, the influence of a macroeconomic shock that disrupted hiring patterns, and the definition of the geographic fixed effect, respectively.

Table 5 reports robustness checks for both skill portfolio characteristics (columns 1–2) and category-level importance (columns 3–6). For skill importance, displacement coefficients range from −0.687 to −0.744 across the four panels (baseline: −0.706), and augmentation retains a negative and significant coefficient in all cases. For skill dispersion, augmentation is significantly positive in three of the four panels; the exception is Panel B (headcount weighting), where the displacement coefficient turns marginally significant instead. For category-level importance, the displacement sign pattern, positive for NRA and RC and negative for NRS and RM, is identical to the baseline across all four panels, with all coefficients significant at the 1% level. The augmentation coefficients are also broadly consistent: the signs match the baseline in nearly all cases, and the few instances where significance is lost (e.g., augmentation on NRA importance in Panels A and C) involve coefficients that were already modest in the baseline specification. The share-importance divergence documented in Section 4.3 is therefore not an artifact of baseline measurement choices.

Taken together, the robustness checks indicate that both the reconfiguration pattern (Hypothesis 2) and the divergence pattern (Hypothesis 3) are stable across a range of specification choices. Full specification details, including sample sizes and within R², are reported in Appendix B Table A1. The remainder of this section complements these checks with a formal assessment of sensitivity to omitted variable bias using the method of Oster [43].

Table 6 reports the Oster [43] sensitivity analysis, which assesses how strongly selection on unobservable variables would need to correlate with the treatment variable, relative to the observed controls, to drive each estimated coefficient to zero. Under the conventional threshold of |δ| > 1, the coefficient is considered robust to omitted variable bias. Nearly all significant coefficients pass this threshold. In Panel A, displacement on importance yields δ > 100 and augmentation on importance yields δ = −10.6, both far exceeding the threshold in absolute value. In Panel B, all four divergence coefficients for displacement yield |δ| > 100, and the three significant augmentation coefficients likewise yield |δ| well above one. The only exception is augmentation on dispersion (δ = 0.4), which falls below the threshold. This result reflects the high explanatory power of the posting-count control for this particular outcome: within R² increases from 0.0002 to 0.165 when ln(postings) is added, leaving little residual variation for the Oster statistic to work with, rather than indicating a genuine omitted-variable vulnerability.

Taken together, the robustness checks in Table 5 and Table 6 indicate that both the reconfiguration and divergence patterns are stable across alternative specifications and are unlikely to be driven by omitted variable bias, with the partial exception of the augmentation–dispersion relationship.

5. Mechanism and Heterogeneity

The preceding sections document the de-coring pattern. This section examines the channels through which these relationships may operate, combining an analysis of job-level characteristics (Section 5.1) with a prediction-based heterogeneity analysis (Section 5.2).

5.1. AI Exposure and Job Characteristics

If displacement exposure is associated with skill demand restructuring primarily through cross-job reallocation, it should also be associated with observable changes in the entry requirements of the jobs that firms post. Table 7 provides evidence on this prediction. Displacement exposure is significantly negatively associated with education requirements, experience requirements, and log median salary, consistent with firms under higher displacement exposure recruiting for lower-threshold positions. Augmentation exposure shows a different pattern: it has no statistically significant relationship with education requirements but is significantly negatively associated with experience requirements and log salary. The asymmetry between the two types of exposure on education requirements, the most direct measure of the skill threshold for job entry, is consistent with the theoretical distinction between cross-job reallocation (displacement) and within-job reconfiguration (augmentation): displacement is associated with lower entry barriers across all three dimensions, whereas augmentation is associated with lower experience and salary requirements but not with lower education thresholds.

5.2. Heterogeneity of De-Coring

The finding that displacement exposure is associated with lower job entry barriers (Table 7) generates a testable prediction: the de-coring pattern should be more pronounced among firms where barriers are already low. Table 8 tests this by interacting displacement and augmentation exposure with a dummy D_{high} indicating whether the firm’s base-year median of each characteristic exceeds the sample median.
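The construction of the high-group dummy can be sketched as follows. This is an illustrative reading rather than the authors' code: it assumes the "sample median" is the median across firms of their base-year values, and the firm labels, values, and function name are hypothetical.

```python
from statistics import median


def high_dummy(base_year_values):
    """Return 1 for firms whose base-year value exceeds the cross-firm median.

    base_year_values: dict mapping firm id to the firm's base-year median
    of one characteristic (education, experience, wage, or firm size).
    """
    cutoff = median(base_year_values.values())
    return {firm: int(value > cutoff) for firm, value in base_year_values.items()}


# Toy base-year medians of required years of education for four firms.
toy_education = {"A": 12, "B": 16, "C": 9, "D": 15}
d_high = high_dummy(toy_education)

# The interaction term is exposure multiplied by the dummy (toy exposure value).
toy_exposure = 0.02
interaction_b = toy_exposure * d_high["B"]
interaction_c = toy_exposure * d_high["C"]
```

Under this coding, the coefficient on the uninteracted exposure term describes low-group firms, and the interaction coefficient gives the difference for high-group firms.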

For skill importance, the displacement × D_{high} interaction is positive and significant across all four panels (education, experience, wage, and firm size), indicating that the negative association between displacement exposure and average skill importance is concentrated among low-barrier, small firms. The firm-size interaction is the largest at 0.376, suggesting that scale is a particularly strong moderator. For skill dispersion, the augmentation × D_{high} interaction is negative and significant across three of the four dimensions (education, experience, and firm size), indicating that the positive association between augmentation exposure and skill dispersion is likewise concentrated among lower-threshold, smaller firms.

These results complement the evidence in Table 7 through a different empirical strategy: displacement exposure is associated with lower entry barriers and its relationship with importance is strongest where barriers are already low; augmentation’s relationship with dispersion is also strongest among lower-threshold firms. Both patterns are concentrated among firms with lower entry barriers and smaller size, the segment of the firm distribution with the fewest reskilling resources and therefore the greatest vulnerability to skill demand restructuring. As supplementary evidence, directly controlling for job characteristics attenuates the displacement coefficient on category-level importance by 9–24% while leaving augmentation coefficients virtually unchanged (Appendix B Table A2 and Table A3), consistent with displacement operating partly through observable job-level channels.

6. Discussion and Conclusions

6.1. Discussion

The findings support three substantive insights about how AI is reshaping firm-level skill demand, each tied to one of the hypotheses and to a specific gap identified in Section 2. Before turning to interpretation, we emphasize that throughout this discussion “AI exposure” refers to potential technological exposure (the AI-applicability of a firm’s task content) rather than observed firm-level AI adoption; the relationships we report are between skill demand and task-level AI applicability, not effects of realized AI use.

First, displacement and augmentation pull in opposite directions on skill composition (Hypothesis 1), and aggregate exposure measures obscure the split. The opposing relationships we document echo the countervailing effects at the occupation level in Autor et al. [34] and extend them to firm-level skill composition. Empirically, this means that single-index AI exposure studies necessarily under-report the AI-driven reallocation of skill demand, because displacement-induced declines and augmentation-induced increases in the same skill category partially cancel (Table 2 Panel B).

Second, the primary skill-demand co-movement with AI exposure is compositional restructuring rather than aggregate contraction (Hypothesis 2), and it concentrates where reskilling capacity is thinnest. Our firm-level evidence on declining average importance and rising dispersion (Table 3) is the employer-side counterpart of the worker-side adjustment margin documented by Humlum and Vestergaard [33]. The concentration of de-coring among low entry-threshold, small firms sits opposite the large-firm AI-growth concentration by Babina et al. [9] and compounds the young-worker vulnerability by Brynjolfsson, Chandar, and Chen [36]: the workers most exposed to de-coring are those with the least access to employer-sponsored retraining infrastructure.

Third, de-coring is a complementary descriptor of AI-driven restructuring rather than a rival to polarization (Hypothesis 3). Employment polarization [18,41] and routine-biased technological change [17] characterize cross-category employment shares; de-coring captures within-firm portfolio depth and the share–importance divergence that only firm-level composition data can reveal (Table 4). The policy-relevant addition is that workforce strategy may need to emphasize within-occupation skill broadening alongside occupational transition support.

The Chinese institutional context provides a relevant backdrop for interpreting these results. China’s vocational education system has historically been organized around narrow occupational tracks, which the de-coring evidence suggests are particularly misaligned with the direction in which AI is shifting employer demand; the welfare implications are more acute here than in systems with stronger general-education foundations or well-developed adult-learning infrastructures [29,30]. The supply side of AI in China has also grown faster than its reskilling infrastructure [45], with platform and gig employment expanding rapidly over the same window [44] and overlapping with displacement exposure in ways our specification does not directly identify. Finally, labor market dualism sustained by the household registration (hukou) system shapes who de-coring affects most: our data capture the urban, formal-sector segment, and informal and rural workers (not observed here) are likely exposed to similar or greater restructuring pressure with even thinner safety nets. Evidence from representative industry practices in cross-border e-commerce echoes this pattern: the integration of GAI leads to a bifurcation in skill demand, where requirements intensify for high-level technical roles involved in AI development, while concurrently declining for operational roles that focus on AI-assisted content consumption and execution.

6.2. Conclusions and Limitation

In sum, AI primarily reshapes the structure of skill demand within firms rather than its aggregate level: it shifts the depth and composition of skills, with the heaviest restructuring falling on small, low entry-threshold employers whose workforces have the least redundant capacity to absorb it. The central empirical implication is that AI may reorganize the internal hierarchy of firm skill demand even when aggregate employment effects appear modest. These conclusions concern the firm-level association between potential AI exposure and skill-demand restructuring; the magnitudes should not be read as causal effects of realized AI adoption. Two policy implications follow, both directionally consistent with the sustainable-workforce-development agenda.

First, educational systems should emphasize versatile, multi-dimensional competencies rather than job-specific specialization, because the share–importance divergence cautions against relying on occupation-level shares alone for workforce planning. The reciprocal employability–learning relationship is weak and concentrated among workers who already perceive themselves as employable [40], so under bounded agency, workers in low entry-threshold small firms are precisely those least likely to be activated by generic upskilling supply. Pairing broad-competency foundations with the cultivation of informal-learning capabilities (trial-and-error, peer collaboration, and post-task reflection), which shape employability indirectly through career adaptability and sustainability [46], addresses the most fragile link in the learning–adaptability–employability chain. Second, targeted reskilling support is needed for the small-firm segment, where restructuring exposure is concentrated but reskilling capacity is thinnest, an asymmetry compounded by unequal AI tool adoption among lower-earning workers [47].

Three concrete design principles follow: modular and portable credentialing that recognizes competencies across firms and occupations; firm–government cost-sharing (training vouchers, sectoral skills councils, and cross-category credential tax credits); and longitudinal learning accounts that accumulate entitlements across employers. How firms introduce AI also matters: participatory design, explicit job redesign alongside tool adoption, and preservation of discretionary judgment space shape whether augmentation hollows out expertise or deepens it, suggesting that implementation-quality standards (analogous to workplace health and safety standards) should steer firms toward employee-centered practices rather than defaulting to cost-minimizing substitution. These prescriptions should be read with two caveats: the evaluation base for the instruments we suggest is thin, even in economies that have experimented with them at scale, and moving from diagnostic evidence to policy design requires addressing financing burden-sharing, administrative capacity, and political feasibility, none of which this paper evaluates.

Three limitations bound the inferences. First, online postings capture only the formal, internet-mediated recruitment channel; informal labor sectors are systematically underrepresented, and whether the de-coring pattern extends to them is an open question. Second, the analysis establishes conditional correlations rather than causal effects; Oster sensitivity tests indicate robustness to omitted-variable bias for most results, except the augmentation–dispersion relationship (Section 4.4), and we cannot rule out that part of the observed restructuring reflects coincident shocks (COVID-19, platform-economy expansion, and China–US trade tensions), although Panel C of Table 5 confirms the baseline pattern when 2020 is dropped. Third, our exposure measure captures potential technological exposure (the similarity between AI patent texts and the task descriptions of the occupations a firm recruits for) rather than realized AI adoption at the firm level. The estimated relationships should therefore be read as how skill demand co-moves with the AI-applicability of tasks, not as the effects of firm-level AI implementation; whether and how each firm internalizes AI tools remains unobserved. Construct-level validation against independent firm-level AI-adoption surveys is a direction for future work.

Future research may extend the analysis along several directions. The release of large language models in late 2022 provides a potential source of quasi-exogenous variation for studying the causal effects of a discrete increase in AI capability on skill demand structure, though the selective nature of AI adoption would need careful treatment. Separating the extensive margin (entry and exit of job types) from the intensive margin (changes in requirements within continuing jobs) would clarify the micro-level adjustment process, and examining how de-coring interacts with the rise of nonstandard employment [44] would connect the skill-demand and employment-structure margins. Cross-country replication, particularly within the comparative framework established by recent IMF [29,48] and European [30] work, would help establish the generalizability of the de-coring pattern.

Author Contributions

Conceptualization, L.Z. and C.Z.; methodology, L.Z.; software, L.Z.; validation, L.Z. and C.Z.; formal analysis, L.Z.; investigation, C.Z.; resources, L.Z.; data curation, L.Z.; writing—original draft preparation, L.Z.; writing—review and editing, C.Z.; visualization, L.Z.; supervision, C.Z.; project administration, C.Z.; funding acquisition, C.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded and supported by the Institute of Digital Economy and Financial Powerhouse Building (2024WZJD015).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Acknowledgments

The authors used Anthropic Claude (https://claude.ai) for language editing and limited programming assistance during manuscript preparation and data processing. The authors have reviewed and edited the output and take full responsibility for the content of this publication.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A. Technical Details of Key Explanatory Variable Construction

Appendix A.1. Choice of Sentence Embedding Model

Ref. [7] construct document vectors from GloVe word embeddings with TF-IDF weighting. This paper instead adopts pretrained sentence embeddings using paraphrase-multilingual-MiniLM-L12-v2 from the sentence-transformers framework, mapping each text to a 384-dimensional dense vector. Compared with GloVe-based averaging, pretrained sentence embeddings capture richer syntactic and semantic information and are particularly well suited to Chinese, which does not use whitespace to delimit word boundaries. Compared with larger multilingual models such as XLM-RoBERTa, MiniLM achieves comparable performance at several times the encoding speed, making it suitable for matching hundreds of thousands of patent texts against nearly ten thousand task descriptions annually. All vectors are L2-normalized so that inner products equal cosine similarities. Each task description is encoded as a single vector; each patent’s title and abstract are concatenated and encoded as one vector.
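The role of the normalization step can be illustrated with a short sketch. The Python snippet below uses toy four-dimensional vectors as stand-ins for the 384-dimensional embeddings (the values and helper names are illustrative assumptions, not the authors' code) and checks that the inner product of L2-normalized vectors coincides with the cosine similarity of the raw vectors.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    """Cosine similarity computed directly from the raw vectors."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


# Toy stand-ins for a task embedding and a patent embedding (illustrative values).
task_vec = [0.2, -1.0, 0.5, 3.0]
patent_vec = [1.5, 0.3, -0.7, 2.0]

inner_product_after_norm = dot(l2_normalize(task_vec), l2_normalize(patent_vec))
cosine_of_raw = cosine(task_vec, patent_vec)
```

This equivalence is what lets an inner-product search over normalized vectors return cosine-similarity rankings directly.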

Appendix A.2. Vector Retrieval and Threshold Calibration

For a given year $t$, a FAISS index is constructed from all AI patent vectors, and each task description serves as a query to retrieve the top $K = 50$ most similar patents by cosine similarity. Following the sparsification approach of ref. [7], an annual threshold $\tau_{t}$ is introduced to zero out sub-threshold pairs. The threshold is set at the 60th percentile of the distribution of each task’s top-matched patent similarity within that year (Ref. [7] use the 80th percentile; the lower threshold reflects the similarity score distribution of the Chinese-language corpus).
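The retrieval and calibration logic can be sketched in a few lines of Python. The snippet below is a minimal illustration under stated assumptions: a brute-force search stands in for the FAISS index, vectors are assumed to be L2-normalized already, and the linear-interpolation percentile is one possible convention (the text does not specify which is used). All function names are hypothetical.

```python
import heapq


def top_k_similarities(task_vec, patent_vecs, k):
    """Return the k largest inner products between one task and all patents, descending."""
    sims = (sum(t * p for t, p in zip(task_vec, pv)) for pv in patent_vecs)
    return heapq.nlargest(k, sims)


def percentile(values, q):
    """Linear-interpolation percentile (q in [0, 100]) of a non-empty list."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def annual_threshold(task_vecs, patent_vecs, k=50, q=60.0):
    """Threshold for one year: q-th percentile of each task's best-match similarity."""
    top_matches = [top_k_similarities(tv, patent_vecs, k)[0] for tv in task_vecs]
    return percentile(top_matches, q)
```

Because the threshold is computed from the distribution of best matches within a year, it adapts to the overall similarity level of that year's patent corpus rather than imposing a fixed cutoff.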

After threshold truncation, the similarity score between task $i$ and patent $k$ is:

$$\tilde{s}_{i,k} = \max\left(0,\; s_{i,k} - \tau_{t}\right) \tag{A1}$$

The AI exposure score of task $i$ in year $t$ is the sum of truncated similarities: $x_{i,t} = \sum_{k=1}^{K} \tilde{s}_{i,k}$. Task-level scores are summed by task type within each occupation to yield total exposure $\xi_{o,t}^{total}$, routine-task exposure $\xi_{o,t}^{R}$, and nonroutine-task exposure $\xi_{o,t}^{N}$.
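As a worked illustration of the truncation and aggregation, the Python sketch below computes a task-level exposure score from a list of retrieved similarities and then sums task scores by type within one occupation. It assumes every task is flagged as either routine or nonroutine, so that total exposure is the sum of the two components; the boolean flag and the function names are hypothetical stand-ins for the task-routineness classification.

```python
def task_exposure(top_k_sims, tau):
    """Task-level exposure: sum over retrieved patents of max(0, similarity - tau)."""
    return sum(max(0.0, s - tau) for s in top_k_sims)


def occupation_exposure(tasks, tau):
    """Sum task-level exposure by task type within one occupation.

    `tasks` is a list of (is_routine, top_k_sims) pairs.
    """
    routine = sum(task_exposure(sims, tau) for is_r, sims in tasks if is_r)
    nonroutine = sum(task_exposure(sims, tau) for is_r, sims in tasks if not is_r)
    return {"total": routine + nonroutine, "R": routine, "N": nonroutine}


# One routine and one nonroutine task with three retrieved similarities each (toy values).
example = occupation_exposure(
    [(True, [0.9, 0.6, 0.4]), (False, [0.7, 0.5, 0.2])],
    tau=0.5,
)
```

In this toy case the routine task contributes 0.4 + 0.1 and the nonroutine task contributes 0.2, while all similarities at or below the threshold contribute nothing.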

Appendix A.3. Job Title Normalization and Staged Matching

Job titles are mapped to detailed occupations through a pipeline of deduplication, normalization, staged matching, and back-mapping. First, approximately 16.30 million unique titles are deduplicated and normalized by removing benefit descriptions, irrelevant affixes, and embedded location or department identifiers, producing approximately 11.07 million normalized titles. The distribution exhibits a pronounced long tail: the top 200,000 normalized titles account for 74.67 percent of all postings; matching is applied to these titles.

Matching proceeds in two stages. The first uses character-level similarity with a high threshold (fuzzy matching); the second encodes normalized titles and occupation description texts into vectors and applies cosine similarity with a dual criterion requiring both a high top score and a sufficient gap between the top and second-ranked candidates. The procedure matches approximately 101,000 normalized titles, corresponding to roughly 30.78 million job postings (coverage rate: 46.0%). Each posting’s AI exposure is defined as the exposure score of its mapped detailed occupation in the corresponding year.
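A minimal Python sketch of the two-stage logic follows. It uses `difflib.SequenceMatcher` from the standard library as one possible character-level similarity, and takes precomputed cosine similarities as input for the second stage. The threshold values (0.9, 0.75, 0.05) are placeholders chosen for illustration; the text does not report the values actually used.

```python
from difflib import SequenceMatcher


def stage1_fuzzy(title, occupation_names, threshold=0.9):
    """Character-level match: return the best occupation if its ratio clears the threshold."""
    scored = [(SequenceMatcher(None, title, name).ratio(), name) for name in occupation_names]
    score, best = max(scored)
    return best if score >= threshold else None


def stage2_dual_criterion(similarities, min_top=0.75, min_gap=0.05):
    """Embedding match: accept the top candidate only if it is strong and unambiguous.

    `similarities` maps occupation name -> cosine similarity with the title.
    """
    ranked = sorted(similarities.items(), key=lambda kv: kv[1], reverse=True)
    if len(ranked) < 2:
        return None  # the gap criterion is undefined with a single candidate
    (top_name, top_score), (_, second_score) = ranked[0], ranked[1]
    if top_score >= min_top and top_score - second_score >= min_gap:
        return top_name
    return None


def match_title(title, occupation_names, similarities):
    """Try stage 1 first and fall back to stage 2 only if stage 1 finds nothing."""
    return stage1_fuzzy(title, occupation_names) or stage2_dual_criterion(similarities)
```

The gap requirement in the second stage trades coverage for precision: a title that is almost equally close to two occupations is left unmatched rather than assigned arbitrarily.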

Appendix A.4. Data Processing Pipeline

Figure A1 summarizes the end-to-end data processing pipeline from raw platform scrapes to the final firm-year analysis sample. The pipeline comprises five stages: (1) raw ingestion of postings from 51job and Zhaopin (2019–2024); (2) record-level filtering, which drops postings whose job description is missing or shorter than 20 Chinese characters; (3) within-firm-month deduplication, which collapses records sharing the same (employer name, job title, posting date, salary range) tuple, retaining only the earliest record; (4) firm linkage, which standardizes full Chinese company names to a unique firm identifier to ensure cross-year and cross-platform consistency; and (5) occupation matching via the two-stage title-to-detailed-occupation mapping described in Appendix A.3, followed by firm-year aggregation. Stages (2)–(4) together yield the cleaned 66.98-million-record corpus; stage (5) reduces this to approximately 30.78 million matched postings, corresponding to 3.62 million firm-year observations that constitute the final analysis sample.

Figure A1. Data processing pipeline. The vertical flowchart traces a posting from raw ingestion (Stage 1) through record-level filtering (Stage 2), within-firm-month deduplication (Stage 3), firm-name standardization and identifier linkage (Stage 4), and two-stage occupation matching (Stage 5) to the final firm-year analysis sample. Volume annotations on the inter-stage arrows and the final box correspond to the figures reported in Section 3.2 and Appendix A.3: ≈66.98 M cleaned records/≈1.41 M unique firms after Stage 4, and ≈30.78 M matched postings/3.62 M firm-year observations after Stage 5. The side note flags that the unmatched ~54% of records sit in the long tail of title diversity (the top 200,000 normalized titles already cover 74.67% of all postings).
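Stages 2 and 3 of the pipeline can be sketched in Python as below. Two points are assumptions rather than details given in the text: the length filter is read as counting CJK ideographs only, and "earliest record" is operationalized through a hypothetical `scraped_at` field. Field and function names are illustrative.

```python
import re

CJK = re.compile(r"[\u4e00-\u9fff]")  # CJK Unified Ideographs block


def passes_length_filter(description, min_chars=20):
    """Stage 2: keep postings whose description has at least `min_chars` Chinese characters."""
    return description is not None and len(CJK.findall(description)) >= min_chars


def deduplicate(postings):
    """Stage 3: keep the earliest record per (employer, title, posting date, salary range)."""
    earliest = {}
    for p in sorted(postings, key=lambda rec: rec["scraped_at"]):
        key = (p["employer"], p["title"], p["posting_date"], p["salary_range"])
        earliest.setdefault(key, p)  # the first record seen per key is the earliest
    return list(earliest.values())


postings = [
    {"employer": "A", "title": "t", "posting_date": "2021-03-01",
     "salary_range": "8k-10k", "scraped_at": "2021-03-05"},
    {"employer": "A", "title": "t", "posting_date": "2021-03-01",
     "salary_range": "8k-10k", "scraped_at": "2021-03-02"},
    {"employer": "B", "title": "t", "posting_date": "2021-03-01",
     "salary_range": "8k-10k", "scraped_at": "2021-03-03"},
]
kept = deduplicate(postings)
```

Sorting before inserting into the dictionary means each key retains its earliest record without a second pass over the data.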

Appendix B. Tables

Table A1. Further Robustness Checks.

	(1) Importance	(2) Dispersion	(3) NRA Importance	(4) NRS Importance	(5) RC Importance	(6) RM Importance
Panel A: Alternative aggregation method (sum)
displacement AI exposure	−0.150 *** (0.013)	0.030 *** (0.009)	0.429 *** (0.040)	−0.070 *** (0.007)	0.096 *** (0.007)	−0.269 *** (0.015)
augmentation AI exposure	−0.025 *** (0.003)	0.021 *** (0.002)	−0.004 (0.008)	−0.001 (0.002)	0.018 *** (0.002)	−0.013 *** (0.004)
Observations	3,472,561	3,472,561	3,064,371	3,369,772	3,270,724	2,563,270
Within R²	0.0024	0.1651	0.0029	0.0004	0.0010	0.0023
Panel B: Alternative weighting scheme (headcount)
displacement AI exposure	−0.687 *** (0.035)	−0.010 (0.024)	1.090 *** (0.090)	−0.241 *** (0.017)	0.281 *** (0.014)	−1.044 *** (0.029)
augmentation AI exposure	−0.120 *** (0.015)	0.072 *** (0.011)	−0.094 *** (0.035)	0.010 (0.010)	0.078 *** (0.008)	−0.040 ** (0.016)
Observations	3,472,561	3,472,561	3,064,371	3,369,772	3,270,724	2,563,270
Within R²	0.0036	0.1651	0.0025	0.0004	0.0010	0.0043
Panel C: Excluding the COVID-19 Period
displacement AI exposure	−0.744 *** (0.052)	−0.036 (0.037)	1.252 *** (0.163)	−0.303 *** (0.022)	0.319 *** (0.023)	−1.079 *** (0.040)
augmentation AI exposure	−0.130 *** (0.018)	0.101 *** (0.014)	−0.034 (0.043)	−0.005 (0.012)	0.088 *** (0.011)	−0.067 *** (0.019)
Observations	3,106,379	3,106,379	2,719,285	3,006,594	2,915,582	2,267,212
Within R²	0.0031	0.1677	0.0035	0.0006	0.0009	0.0032
Panel D: Time-Invariant City Fixed Effects
displacement AI exposure	−0.695 *** (0.039)	0.003 (0.027)	1.277 *** (0.105)	−0.259 *** (0.018)	0.305 *** (0.017)	−1.121 *** (0.033)
augmentation AI exposure	−0.120 *** (0.016)	0.089 *** (0.011)	−0.074 ** (0.038)	0.010 (0.010)	0.082 *** (0.009)	−0.055 *** (0.017)
Observations	3,475,135	3,475,135	3,066,567	3,372,217	3,273,092	2,564,917
Within R²	0.0035	0.1667	0.0026	0.0005	0.0009	0.0042

Notes: Standard errors two-way clustered by firm and city × year are reported in parentheses. *** p < 0.01, ** p < 0.05. All specifications include ln(postings) as a control. Panels A–C use firm and city × year fixed effects; Panel D uses firm and time-invariant city fixed effects.

Table A2. Mechanism: Sequential Adding for Skill Portfolio Characteristics.

	(1) Importance	(2) Dispersion
Panel A: Baseline specification
displacement AI exposure	−0.706 *** (0.040)	0.003 (0.027)
augmentation AI exposure	−0.123 *** (0.016)	0.088 *** (0.012)
Observations	3,472,561	3,472,561
Within R²	0.0034	0.1651
Panel B: Adding all job characteristics
displacement AI exposure	−0.767 *** (0.039)	0.184 *** (0.024)
augmentation AI exposure	−0.142 *** (0.016)	0.083 *** (0.011)
Observations	3,271,541	3,271,541
Within R²	0.0289	0.2118
control variables	Yes	Yes
Fixed effects	Firm + City × Year	Firm + City × Year

Notes: Standard errors two-way clustered by firm and city × year in parentheses. *** p < 0.01. Panel A includes only ln(postings). Panel B adds education requirements, experience requirements, and log median salary. All specifications include ln(postings) as a control.

Table A3. Mechanism: Sequential Adding for Category-Level Importance.

	(1) NRA Importance	(2) NRS Importance	(3) RC Importance	(4) RM Importance
Panel A: Baseline specification
displacement AI exposure	1.273 *** (0.106)	−0.264 *** (0.019)	0.308 *** (0.016)	−1.120 *** (0.033)
augmentation AI exposure	−0.076 ** (0.038)	0.011 (0.010)	0.083 *** (0.009)	−0.053 *** (0.017)
Observations	3,064,371	3,369,772	3,270,724	2,563,270
Within R²	0.0026	0.0004	0.0010	0.0042
Panel B: Adding all job characteristics
displacement AI exposure	1.097 *** (0.100)	−0.201 *** (0.017)	0.258 *** (0.015)	−1.024 *** (0.032)
augmentation AI exposure	−0.107 *** (0.036)	0.012 (0.011)	0.074 *** (0.009)	−0.047 *** (0.018)
Observations	2,918,643	3,183,672	3,099,978	2,458,556
Within R²	0.0374	0.0127	0.0173	0.0089
control variables	Yes	Yes	Yes	Yes
Fixed effects	Firm + City × Year	Firm + City × Year	Firm + City × Year	Firm + City × Year

Notes: Standard errors two-way clustered by firm and city × year in parentheses. *** p < 0.01, ** p < 0.05. Panel A includes only ln(postings). Panel B adds education requirements, experience requirements, and log-median salary. Displacement coefficient attenuation: NRA 13.8%, NRS 23.9%, RC 16.2%, and RM 8.6%. All specifications include ln(postings) as a control.
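The attenuation figures in the notes to Table A3 can be reproduced from the displacement coefficients in Panels A and B. The Python sketch below assumes attenuation is defined as the percentage reduction in the absolute coefficient when job characteristics are added; this definition is inferred from the reported numbers rather than stated in the text.

```python
# Displacement coefficients from Table A3 (Panel A: baseline; Panel B: with job characteristics).
baseline = {"NRA": 1.273, "NRS": -0.264, "RC": 0.308, "RM": -1.120}
with_controls = {"NRA": 1.097, "NRS": -0.201, "RC": 0.258, "RM": -1.024}


def attenuation_pct(b0, b1):
    """Percentage shrinkage in coefficient magnitude from baseline b0 to controlled b1."""
    return 100.0 * (abs(b0) - abs(b1)) / abs(b0)


attenuation = {
    category: round(attenuation_pct(baseline[category], with_controls[category]), 1)
    for category in baseline
}
```

For example, the NRA coefficient falls from 1.273 to 1.097, a reduction of 0.176, or about 13.8 percent of the baseline magnitude.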

Table A4. Baseline Regressions Stratified by City Tier.

	(1) NRA Share	(2) NRS Share	(3) RC Share	(4) RM Share
Panel A: Tier-1 + New Tier-1 cities (19 cities)
Total AI exposure	−0.0585 *** (0.0126)	−0.1598 *** (0.0190)	−0.1285 *** (0.0116)	+0.3468 *** (0.0227)
Displacement exposure (R)	−0.3579 *** (0.0215)	−0.2067 *** (0.0409)	−0.3284 *** (0.0258)	+0.8931 *** (0.0455)
Augmentation exposure (N)	+0.0299 *** (0.0115)	−0.1317 *** (0.0169)	−0.0138 (0.0113)	+0.1156 *** (0.0129)
Observations	2,286,918	2,286,918	2,286,918	2,286,918
Panel B: Other cities (outside Tier-1 and New Tier-1)
Total AI exposure	−0.0632 *** (0.0155)	−0.2578 *** (0.0196)	−0.1888 *** (0.0153)	+0.5098 *** (0.0228)
Displacement exposure (R)	−0.3569 *** (0.0245)	−0.4451 *** (0.0364)	−0.4364 *** (0.0281)	+1.2384 *** (0.0374)
Augmentation exposure (N)	+0.0687 *** (0.0183)	−0.1786 *** (0.0235)	−0.0072 (0.0175)	+0.1172 *** (0.0199)
Observations	1,326,727	1,326,727	1,326,727	1,326,727

Notes: OLS estimates of the baseline specification by city-tier sub-sample. Dependent variables are the firm-year shares of the four O*NET skill categories (NRA = nonroutine analytical; NRS = nonroutine social; RC = routine cognitive; RM = routine manual). The displacement (R) and augmentation (N) exposure measures are constructed as in Section 3.3.1. All specifications include firm fixed effects, city-by-year fixed effects, and ln(postings) as a control. Standard errors are two-way clustered by firm and city-by-year (in parentheses). *** p < 0.01. Tier-1 cities follow the conventional Chinese policy-document and academic-literature designation: Beijing, Shanghai, Guangzhou, Shenzhen. The 15 New Tier-1 cities follow the Yicai New First-Tier Cities Research Institute classification (2023 edition): Chengdu, Chongqing, Hangzhou, Wuhan, Suzhou, Xi’an, Tianjin, Nanjing, Changsha, Zhengzhou, Dongguan, Qingdao, Kunming, Ningbo, Hefei. Panel B includes all firm-years whose modal recruitment city falls outside these 19 cities.

Appendix C. Skill Classification

Following the task classification framework of ref. [8], the 35 general skills in the O*NET database are grouped into four categories along two dimensions: cognitive versus manual and routine versus nonroutine. Because the 35 O*NET general skills are concentrated in the cognitive and social domains and do not include physical dexterity or manual flexibility indicators, the nonroutine manual category in the original five-category framework cannot be constructed. The final classification retains four categories.

Nonroutine analytical (NRA, 11 skills): critical thinking, complex problem solving, operations analysis, systems analysis, systems evaluation, judgment and decision making, technology design, programming, science, troubleshooting, and equipment selection. These correspond to nonroutine cognitive-analytical tasks in [8], involving abstract reasoning, systems modeling, and technical judgment.

Nonroutine social (NRS, 9 skills): active listening, speaking, social perceptiveness, coordination, persuasion, negotiation, instructing, service orientation, and management of personnel resources. These correspond to nonroutine cognitive-social tasks, characterized by interpersonal interaction, contextual judgment, and communication.

Routine cognitive (RC, 6 skills): reading comprehension, writing, mathematics, monitoring, active learning, and learning strategies. These correspond to rule-based information processing and knowledge acquisition activities with relatively high codifiability.

Routine manual (RM, 9 skills): quality control analysis, operation monitoring, operation and control, equipment maintenance, repairing, installation, time management, management of financial resources, and management of material resources. The first six skills correspond directly to standardized equipment operation and maintenance monitoring. The remaining three (time management, management of financial resources, and management of material resources) are not physical tasks in the narrow sense, but they share the defining characteristic of routine work as identified in ref. [13]: execution follows explicit, codifiable rules and standardized procedures with limited discretion, and performance is evaluated against operational benchmarks rather than judgment-based criteria. They are therefore grouped with routine manual rather than with cognitive or social categories.
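The four-category grouping above can be written down as a simple lookup table. The Python sketch below encodes the 35 skills exactly as listed and adds a hypothetical helper that turns a list of skills detected in a posting into category shares; how skills are detected in job-description text, and how mentions are weighted, is outside the scope of this illustration.

```python
SKILL_CATEGORIES = {
    "NRA": [
        "critical thinking", "complex problem solving", "operations analysis",
        "systems analysis", "systems evaluation", "judgment and decision making",
        "technology design", "programming", "science", "troubleshooting",
        "equipment selection",
    ],
    "NRS": [
        "active listening", "speaking", "social perceptiveness", "coordination",
        "persuasion", "negotiation", "instructing", "service orientation",
        "management of personnel resources",
    ],
    "RC": [
        "reading comprehension", "writing", "mathematics", "monitoring",
        "active learning", "learning strategies",
    ],
    "RM": [
        "quality control analysis", "operation monitoring", "operation and control",
        "equipment maintenance", "repairing", "installation", "time management",
        "management of financial resources", "management of material resources",
    ],
}

# Reverse lookup from skill name to category code.
SKILL_TO_CATEGORY = {
    skill: category for category, skills in SKILL_CATEGORIES.items() for skill in skills
}


def category_shares(mentioned_skills):
    """Share of skill mentions falling in each category (equal weight per mention)."""
    counts = {category: 0 for category in SKILL_CATEGORIES}
    for skill in mentioned_skills:
        counts[SKILL_TO_CATEGORY[skill]] += 1
    total = sum(counts.values())
    return {category: (n / total if total else 0.0) for category, n in counts.items()}


shares = category_shares(["programming", "critical thinking", "speaking", "writing"])
```

By construction the four shares sum to one whenever at least one skill is mentioned, which mirrors the way the mean shares in the summary statistics add up to unity.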

References

1. Vinuesa, R.; Azizpour, H.; Leite, I.; Balaam, M.; Dignum, V.; Domisch, S.; Felländer, A.; Langhans, S.D.; Tegmark, M.; Fuso Nerini, F. The Role of Artificial Intelligence in Achieving the Sustainable Development Goals. Nat. Commun. 2020, 11, 233.
2. Korinek, A.; Stiglitz, J. Artificial Intelligence, Globalization, and Strategies for Economic Development; NBER Working Paper No. 28453; National Bureau of Economic Research: Cambridge, MA, USA, 2021.
3. World Economic Forum. Future of Jobs Report 2023; World Economic Forum: Geneva, Switzerland, 2023.
4. Acemoglu, D.; Autor, D.; Hazell, J.; Restrepo, P. Artificial Intelligence and Jobs: Evidence from Online Vacancies. J. Labor Econ. 2022, 40, S293–S340.
5. Webb, M. The Impact of Artificial Intelligence on the Labor Market; Working Paper; Stanford University: Stanford, CA, USA, 2019.
6. Felten, E.W.; Raj, M.; Seamans, R. A Method to Link Advances in Artificial Intelligence to Occupational Abilities. AEA Pap. Proc. 2018, 108, 54–57.
7. Felten, E.W.; Raj, M.; Seamans, R. Occupational, Industry, and Geographic Exposure to Artificial Intelligence: A Novel Dataset and Its Potential Uses. Strateg. Manag. J. 2021, 42, 2195–2217.
8. Acemoglu, D.; Restrepo, P. Automation and New Tasks: How Technology Displaces and Reinstates Labor. J. Econ. Perspect. 2019, 33, 3–30.
9. Babina, T.; Fedyk, A.; He, A.; Hodson, J. Artificial Intelligence, Firm Growth, and Product Innovation. J. Financ. Econ. 2024, 151, 103745.
10. Alekseeva, L.; Azar, J.; Giné, M.; Samila, S.; Taska, B. The Demand for AI Skills in the Labor Market. Labour Econ. 2021, 71, 102002.
11. Bessen, J.E. AI and Jobs: The Role of Demand; NBER Working Paper No. 24235; National Bureau of Economic Research: Cambridge, MA, USA, 2018.
12. United Nations. Transforming Our World: The 2030 Agenda for Sustainable Development; United Nations: New York, NY, USA, 2015.
13. Kogan, L.; Papanikolaou, D.; Schmidt, L.D.W.; Seegmiller, B. Technology and Labor Displacement: Evidence from Linking Patents with Worker-Level Data; NBER Working Paper No. 31846; National Bureau of Economic Research: Cambridge, MA, USA, 2023.
14. Acemoglu, D.; Autor, D. Skills, Tasks and Technologies: Implications for Employment and Earnings. In Handbook of Labor Economics; Ashenfelter, O., Card, D., Eds.; Elsevier: Amsterdam, The Netherlands, 2011; Volume 4B, pp. 1043–1171.
15. Katz, L.F.; Murphy, K.M. Changes in Relative Wages, 1963–1987: Supply and Demand Factors. Q. J. Econ. 1992, 107, 35–78.
16. Goldin, C.; Katz, L.F. The Race Between Education and Technology; Harvard University Press: Cambridge, MA, USA, 2008.
17. Autor, D.H.; Levy, F.; Murnane, R.J. The Skill Content of Recent Technological Change: An Empirical Exploration. Q. J. Econ. 2003, 118, 1279–1333.
18. Goos, M.; Manning, A.; Salomons, A. Explaining Job Polarization: Routine-Biased Technological Change and Offshoring. Am. Econ. Rev. 2014, 104, 2509–2526.
19. Deming, D.J. The Growing Importance of Social Skills in the Labor Market. Q. J. Econ. 2017, 132, 1593–1640.
20. Deming, D.J.; Kahn, L.B. Skill Requirements across Firms and Labor Markets: Evidence from Job Postings for Professionals. J. Labor Econ. 2018, 36, S337–S369.
21. Acemoglu, D.; Restrepo, P. The Race between Man and Machine: Implications of Technology for Growth, Factor Shares, and Employment. Am. Econ. Rev. 2018, 108, 1488–1542.
22. Acemoglu, D.; Restrepo, P. Tasks, Automation, and the Rise in U.S. Wage Inequality. Econometrica 2022, 90, 1973–2016.
23. Eloundou, T.; Manning, S.; Mishkin, P.; Rock, D. GPTs Are GPTs: Labor Market Impact Potential of LLMs. Science 2024, 384, 1306–1308.
24. Brynjolfsson, E.; Mitchell, T.; Rock, D. What Can Machines Learn, and What Does It Mean for Occupations and the Economy? AEA Pap. Proc. 2018, 108, 43–47.
25. Frank, M.R.; Autor, D.; Bessen, J.E.; Brynjolfsson, E.; Cebrian, M.; Deming, D.J.; Feldman, M.; Groh, M.; Lobo, J.; Moro, E.; et al. Toward Understanding the Impact of Artificial Intelligence on Labor. Proc. Natl. Acad. Sci. USA 2019, 116, 6531–6539.
26. Frey, C.B.; Osborne, M.A. The Future of Employment: How Susceptible Are Jobs to Computerisation? Technol. Forecast. Soc. Change 2017, 114, 254–280.
27. Nedelkoska, L.; Quintini, G. Automation, Skills Use and Training; OECD Social, Employment and Migration Working Papers No. 202; OECD Publishing: Paris, France, 2018.
28. Georgieff, A.; Milanez, A. What Happened to Jobs at High Risk of Automation? OECD Social, Employment and Migration Working Papers No. 255; OECD Publishing: Paris, France, 2021.
29. Cazzaniga, M.; Jaumotte, F.; Li, L.; Melina, G.; Panton, A.; Pizzinelli, C.; Rockall, E.J.; Tavares, M.M. Gen-AI: Artificial Intelligence and the Future of Work; IMF Staff Discussion Note SDN2024/001; International Monetary Fund: Washington, DC, USA, 2024.
30. Albanesi, S.; Dias da Silva, A.; Jimeno, J.F.; Lamo, A.; Wabitsch, A. New Technologies and Jobs in Europe. Econ. Policy 2025, 40, 71–139.
31. Freund, L.; Mann, L. Job Transformation, Specialization, and the Labor Market Effects of AI; CESifo Working Paper No. 12072; CESifo: Munich, Germany, 2025.
32. Bessen, J.E.; Goos, M.; Salomons, A.; van den Berge, W. What Happens to Workers at Firms That Automate? Rev. Econ. Stat. 2025, 107, 125–141.
33. Humlum, A.; Vestergaard, E. Still Waters, Rapid Currents: Early Labor Market Transformation under Generative AI; NBER Working Paper No. 33777; National Bureau of Economic Research: Cambridge, MA, USA, 2025.
34. Autor, D.H.; Chin, C.; Salomons, A.; Seegmiller, B. New Frontiers: The Origins and Content of New Work, 1940–2018. Q. J. Econ. 2024, 139, 1399–1465.
35. WIPO. WIPO Technology Trends 2019: Artificial Intelligence; World Intellectual Property Organization: Geneva, Switzerland, 2019.
36. Brynjolfsson, E.; Chandar, B.; Chen, R. Canaries in the Coal Mine? Six Facts About the Recent Employment Effects of Artificial Intelligence; Working Paper; Stanford Digital Economy Lab: Stanford, CA, USA, 2025.
37. Atalay, E.; Phongthiengtham, P.; Sotelo, S.; Tannenbaum, D. The Evolution of Work in the United States. Am. Econ. J. Appl. Econ. 2020, 12, 1–34.
38. Hershbein, B.; Kahn, L.B. Do Recessions Accelerate Routine-Biased Technological Change? Evidence from Vacancy Postings. Am. Econ. Rev. 2018, 108, 1737–1772.
39. Autor, D.H.; Handel, M.J. Putting Tasks to the Test: Human Capital, Job Tasks, and Wages. J. Labor Econ. 2013, 31, S59–S96.
40. Houben, E.; De Cuyper, N.; Kyndt, E.; Forrier, A. Learning to Be Employable or Being Employable to Learn: The Reciprocal Relation Between Perceived Employability and Work-Related Learning. J. Career Dev. 2021, 48, 443–458.
41. Autor, D.H.; Dorn, D. The Growth of Low-Skill Service Jobs and the Polarization of the US Labor Market. Am. Econ. Rev. 2013, 103, 1553–1597.
42. Cameron, A.C.; Gelbach, J.B.; Miller, D.L. Robust Inference with Multiway Clustering. J. Bus. Econ. Stat. 2011, 29, 238–249.
43. Oster, E. Unobservable Selection and Coefficient Stability: Theory and Evidence. J. Bus. Econ. Stat. 2019, 37, 187–204.
44. Liu, X.; Zhang, H.; Tian, X. Artificial Intelligence, Human Capital Structure and Flexible Work: Empirical Evidence from Online Job Postings in China. Jingji Yanjiu (Econ. Res. J.) 2025, 60, 240–260. (In Chinese)
45. OECD. OECD Skills Outlook 2019: Thriving in a Digital World; OECD Publishing: Paris, France, 2019.
46. Gemmano, C.G.; Manuti, A. Developing to Sustain: How Informal Learning Shapes University Students’ Future Careers. Educ. Train. 2026, 68, 39–57.
47. Humlum, A.; Vestergaard, E. The Unequal Adoption of ChatGPT Exacerbates Existing Inequalities among Workers. Proc. Natl. Acad. Sci. USA 2025, 122, e2414972121.
48. Jaumotte, F.; Kim, J.; Koll, D.; Li, E.; Li, L.; Melina, G.; Song, A.; Tavares, M.M. Bridging Skill Gaps for the Future: New Jobs Creation in the AI Age; Staff Discussion Note SDN2026/001; International Monetary Fund: Washington, DC, USA, 2026.

Table 1. Summary Statistics.

	Observations	Mean	SD	Min	Median	Max
Panel A: dependent variables
nonroutine analytical share (NRA)	3,619,934	0.187	0.140	0.000	0.176	1.000
nonroutine social share (NRS)	3,619,934	0.447	0.196	0.000	0.438	1.000
routine cognitive share (RC)	3,619,934	0.283	0.149	0.000	0.277	1.000
routine manual share (RM)	3,619,934	0.083	0.110	0.000	0.053	1.000
skill importance (avg)	3,619,934	2.749	0.203	1.244	2.759	3.597
skill dispersion (1 − HHI)	3,619,934	0.827	0.175	0.000	0.884	0.960
NRA importance	3,173,058	2.230	0.385	1.606	2.153	3.518
NRS importance	3,505,690	2.894	0.134	2.598	2.881	3.597
RC importance	3,400,401	3.099	0.109	2.561	3.118	3.484
RM importance	2,644,651	2.078	0.194	1.244	2.150	2.408
Panel B: independent variables
AI exposure	3,619,934	0.003	0.016	0.000	0.000	0.986
displacement AI exposure	3,546,635	0.002	0.009	0.000	0.000	0.500
augmentation AI exposure	3,551,629	0.001	0.013	0.000	0.000	0.986
Panel C: control variables
log posting count	3,619,934	2.067	1.067	0.693	1.946	11.515
Panel D: job characteristics
years of education required	3,515,620	13.342	1.960	9.000	14.000	20.000
years of experience required	3,486,075	1.942	1.555	0.000	1.844	10.000
log median salary	3,619,928	8.902	0.479	0.693	8.882	16.812
years of education (restricted sample)	3,105,762	14.120	1.258	9.000	14.000	20.000
years of experience (restricted sample)	3,083,804	2.395	1.563	0.000	2.000	10.000

Notes: The restricted sample for years of education and years of experience excludes postings with open requirements (i.e., “no restriction” on education or experience).
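The skill dispersion measure in Panel A is reported as 1 − HHI. A minimal Python sketch of such an index is given below for an arbitrary vector of non-negative weights; which units the shares are computed over (individual skills, categories, or weighted mentions) is an assumption left open here, and the function name is hypothetical.

```python
def dispersion(weights):
    """One minus the Herfindahl-Hirschman index of the normalized weights."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    return 1.0 - sum((w / total) ** 2 for w in weights)


concentrated = dispersion([1.0])            # all weight on a single item
even_four = dispersion([1, 1, 1, 1])        # four equally weighted items
skewed = dispersion([0.7, 0.1, 0.1, 0.1])   # one dominant item
```

The index is zero when all weight sits on a single item and rises toward 1 − 1/n as n items become equally weighted, so a higher value indicates a more dispersed portfolio.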

Table 5. Robustness Checks.

	(1) Importance	(2) Dispersion	(3) NRA Imp.	(4) NRS Imp.	(5) RC Imp.	(6) RM Imp.
Panel A: Alternative aggregation method (sum)
displacement AI exposure	−0.150 *** (0.013)	0.030 *** (0.009)	0.429 *** (0.040)	−0.070 *** (0.007)	0.096 *** (0.007)	−0.269 *** (0.015)
augmentation AI exposure	−0.025 *** (0.003)	0.021 *** (0.002)	−0.004 (0.008)	−0.001 (0.002)	0.018 *** (0.002)	−0.013 *** (0.004)
Panel B: Alternative weighting scheme (headcount)
displacement AI exposure	−0.687 *** (0.035)	−0.010 (0.024)	1.090 *** (0.090)	−0.241 *** (0.017)	0.281 *** (0.014)	−1.044 *** (0.029)
augmentation AI exposure	−0.120 *** (0.015)	0.072 *** (0.011)	−0.094 *** (0.035)	0.010 (0.010)	0.078 *** (0.008)	−0.040 ** (0.016)
Panel C: Excluding the COVID-19 Period
displacement AI exposure	−0.744 *** (0.052)	−0.036 (0.037)	1.252 *** (0.163)	−0.303 *** (0.022)	0.319 *** (0.023)	−1.079 *** (0.040)
augmentation AI exposure	−0.130 *** (0.018)	0.101 *** (0.014)	−0.034 (0.043)	−0.005 (0.012)	0.088 *** (0.011)	−0.067 *** (0.019)
Panel D: Time-Invariant City Fixed Effects
displacement AI exposure	−0.695 *** (0.039)	0.003 (0.027)	1.277 *** (0.105)	−0.259 *** (0.018)	0.305 *** (0.017)	−1.121 *** (0.033)
augmentation AI exposure	−0.120 *** (0.016)	0.089 *** (0.011)	−0.074 ** (0.038)	0.010 (0.010)	0.082 *** (0.009)	−0.055 *** (0.017)

Notes: Standard errors two-way clustered by firm and city × year in parentheses. *** p < 0.01, ** p < 0.05. All specifications include ln(postings) as a control. Panels A–C use firm and city × year fixed effects; Panel D uses firm and time-invariant city fixed effects. Observations and within R² are reported in Appendix B Table A1.

Table 6. Oster [43] Sensitivity Analysis.

	Displacement		Augmentation
	β	δ	β	δ
Panel A: Reconfiguration (Table 3)
Skill importance	−0.706 ***	>100	−0.123 ***	−10.6
Skill dispersion	—	—	0.088 ***	0.4 †
Panel B: Divergence (Table 4)
NRA importance	1.273 ***	>100	−0.076 **	−2.4
NRS importance	−0.264 ***	<−100	—	—
RC importance	0.308 ***	>100	0.083 ***	>100
RM importance	−1.120 ***	<−100	−0.053 ***	>100

Notes: δ is the Oster [43] statistic measuring how strong selection on unobservables must be, relative to observables, to drive β to zero ($R_{\max} = 1.3R^{2}$). |δ| > 1 indicates robustness. ‘—’ denotes insignificant baseline coefficients. † Dispersion δ is conservative due to the high R² contribution of ln(postings). *** p < 0.01, ** p < 0.05.
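For intuition about how δ responds to coefficient and R² movements, the Python sketch below implements the simplified linear approximation discussed in Oster [43], solved for the δ that sets the bias-adjusted coefficient to zero. This is an assumption-laden shortcut, not the exact calculation behind the table, and the inputs in the example are hypothetical numbers rather than estimates from this paper.

```python
def oster_delta_approx(beta_uncontrolled, r2_uncontrolled,
                       beta_controlled, r2_controlled, rmax_multiplier=1.3):
    """Approximate delta that drives the bias-adjusted coefficient to zero.

    Uses beta* ~ beta_c - delta * (beta_u - beta_c) * (R_max - R_c) / (R_c - R_u)
    and solves beta* = 0 for delta, with R_max = rmax_multiplier * R_c (capped at 1).
    """
    r_max = min(1.0, rmax_multiplier * r2_controlled)
    movement = beta_uncontrolled - beta_controlled
    if movement == 0 or r_max == r2_controlled:
        raise ValueError("delta is undefined without coefficient or R-squared movement")
    return beta_controlled * (r2_controlled - r2_uncontrolled) / (movement * (r_max - r2_controlled))


# Hypothetical inputs: the coefficient moves from -0.8 to -0.7 as R-squared rises from 0.1 to 0.2.
delta = oster_delta_approx(-0.8, 0.1, -0.7, 0.2)
```

In this toy case δ is roughly 11.7, meaning selection on unobservables would need to be more than eleven times as strong as selection on observables to explain away the coefficient. Small coefficient movements relative to R² movements produce large |δ| values.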

Table 7. Results of Mechanism Analysis.

	(1) Education	(2) Experience	(3) Log Salary
displacement AI exposure	−11.289 *** (0.459)	−2.576 *** (0.330)	−2.064 *** (0.174)
augmentation AI exposure	0.035 (0.164)	−1.250 *** (0.129)	−0.164 *** (0.058)
Observations	3,377,316	3,349,538	3,472,557
Within R²	0.0250	0.0013	0.0244
control variables	Yes	Yes	Yes
Fixed effects	Firm + City × Year	Firm + City × Year	Firm + City × Year

Notes: Standard errors two-way clustered by firm and city × year in parentheses. *** p < 0.01. The dependent variable in each column is the firm–year average of the indicated job characteristic. All specifications include ln(postings) as a control.

Table 8. Results of Heterogeneity of De-Coring.

	(1) Importance	(2) Dispersion
Panel A: Education requirements
$\text{displacement} \times D_{\text{high}}$	0.245 *** (0.042)	−0.003 (0.030)
$\text{augmentation} \times D_{\text{high}}$	0.084 *** (0.029)	−0.058 *** (0.017)
Observations	3,472,561	3,472,561
Within R²	0.0172	0.2346
Panel B: Experience requirements
$\text{displacement} \times D_{\text{high}}$	0.146 *** (0.043)	0.025 (0.029)
$\text{augmentation} \times D_{\text{high}}$	0.043 (0.027)	−0.056 *** (0.018)
Observations	3,472,561	3,472,561
Within R²	0.0195	0.2257
Panel C: Wage level
$\text{displacement} \times D_{\text{high}}$	0.224 *** (0.045)	0.147 *** (0.028)
$\text{augmentation} \times D_{\text{high}}$	−0.004 (0.025)	−0.039 ** (0.017)
Observations	3,472,561	3,472,561
Within R²	0.0105	0.2509
Panel D: Firm size
$\text{displacement} \times D_{\text{high}}$	0.376 *** (0.044)	0.132 *** (0.029)
$\text{augmentation} \times D_{\text{high}}$	0.058 ** (0.027)	−0.074 *** (0.018)
Observations	3,472,561	3,472,561
Within R²	0.0239	0.2266
main effects	Yes	Yes
control variables	Yes	Yes
Fixed effects	Firm + City × Year	Firm + City × Year

Notes: Standard errors two-way clustered by firm and city × year in parentheses. *** p < 0.01, ** p < 0.05. All specifications include ln(postings) as a control.
