Article

Stage-Wise Systemic Evolution of China’s Digital Economy: Evidence from Topic Modeling of Think Tank Reports

1 School of Economics and Management, Xiamen University of Technology, Xiamen 361024, China
2 School of Business, Sun Yat-sen University, Guangzhou 510275, China
* Author to whom correspondence should be addressed.
Systems 2026, 14(5), 495; https://doi.org/10.3390/systems14050495
Submission received: 8 March 2026 / Revised: 21 April 2026 / Accepted: 26 April 2026 / Published: 1 May 2026

Abstract

With the in-depth advancement of the “Digital China” initiative, policies and research discourses related to the digital economy have continued to evolve, making it necessary to systematically examine their stage-specific characteristics and underlying logic from a long-term perspective. Accordingly, this study adopts information society theory as the analytical framework and selects the annual series of reports on China’s digital economy development published by the China Academy of Information and Communications Technology (CAICT) from 2015 to 2024 as the research corpus. Using text mining techniques and Latent Dirichlet Allocation (LDA) topic modeling, this paper conducts a longitudinal examination of the stage-wise systemic evolution of key topics in China’s digital economy development. The findings indicate that over the past decade, the topic structure of China’s digital economy has followed a clear evolutionary trajectory, progressing from “informatization-driven development” to “platform expansion,” and subsequently to “data factors and institutional governance.” In the early stage, the focus was on information infrastructure development and industrial integration; the middle stage shifted toward the platform economy and enterprise growth; more recently, the emphasis has increasingly been placed on the construction of data factor markets and the improvement of governance frameworks. This process of topic evolution not only reflects changes in the practical forms of the digital economy but also reveals the ongoing adjustment of the state’s cognitive framework and governance logic regarding digital economy development. These findings provide empirical evidence for understanding the systemic evolution of China’s digital economy over time.
By identifying the stage-specific pathways of China’s digital economy, this study extends the application of information society theory within this context and provides new empirical evidence for understanding the evolutionary logic underlying high-quality digital economy development.

1. Introduction

In 2015, Guizhou Province announced the launch of China’s first provincial-level big data strategy and, in the subsequent years, rapidly developed a regional model centered on data centers, algorithmic applications, and digital government services [1]. This seemingly localized initiative, however, reflects a broader national trend of rapid expansion and institutional transformation in China’s digital economy. In recent years, numerous studies have shown that large-scale digital infrastructure development and the function-oriented application of information technologies can significantly shape the structural pathways of national digital transformation, for example, by promoting digital industrialization and intelligent social governance [2,3]. From the “Internet Plus” initiative to the construction of “Digital China,” and further to the formal recognition of data as a new factor of production, China has experienced a digital leap over the past decade that is rarely observed globally. The digital economy has not only become a critical driver of industrial upgrading and the modernization of public governance [3,4], but has also progressively reshaped national development logic and resource allocation mechanisms [5,6]. Meanwhile, with the rapid expansion of the platform economy, the institutionalization of data as a production factor, and the deepening of regulatory frameworks, policy discourses and strategic planning related to the digital economy have continued to evolve, revealing internal dynamics that are more complex than simple narratives of growth. Existing studies have demonstrated that national-level policy documents and research outputs can systematically reflect key issues in digital economy development and their temporal dynamics [7,8].
Therefore, a systematic examination of consecutively published reports enables an analysis of how core topics in the digital economy evolve across different stages, in response to policy orientations, technological change, and practical demands, thereby capturing both stage-wise characteristics and the broader systemic evolution of the digital economy.
In recent years, research on China’s digital economy has largely focused on specific issues, such as the expansion and regulation of platform enterprises [9,10], pathways of industrial digital transformation [11,12], and the circulation of data factors and institutional construction [13,14]. While these studies are important for understanding different dimensions of the digital economy, research that systematically examines the topic structure of China’s digital economy and traces its stage-wise evolutionary pathways remains relatively limited. Meanwhile, with the advancement of text mining methods, existing studies have begun to analyze the evolution of discursive systems in policy documents, seeking to uncover the changing logics of national strategic articulation through techniques such as keyword modeling and agenda analysis [15,16]. This line of research provides strong methodological support for analyzing the cognitive structures and decision-making orientations underlying policy discourse.
However, existing text-based analyses rarely conduct systematic longitudinal tracking of serialized think tank publications across multiple years. Consider the China Digital Economy Development Research Report released by the China Academy of Information and Communications Technology (CAICT): this report series represents a flagship output of a national-level policy research institution. Although the series does not constitute a formal policy document, it has consistently reflected research judgments and assessment outcomes regarding trends in digital economy development over an extended period. Characterized by strong foresight, continuity, and policy influence, this body of work remains underexplored in academic research. In response to these gaps, this study is grounded in information society theory and draws on ten editions of the China Digital Economy Development Research Report published between 2015 and 2024 as its data source.
This study aims to address the following research questions:
RQ1: How have the core conceptual systems of key topics in China’s digital economy evolved over the past decade?
RQ2: What developmental pathway of China’s digital economy is revealed through this process of topic evolution?
This study integrates policy analysis with text visualization methods and divides the research period into three stages based on major policy milestones and the context of technological evolution: the exploratory start-up period (2015–2017), the policy consolidation and platform economy expansion period (2018–2020), and the regulatory governance and data factor market construction period (2021–2024). This periodization enables an examination of the systemic evolution of China’s digital economy across stages. Methodologically, the study employs the “Weiciyun” text analysis platform to conduct word frequency analysis, semantic visualization, and LDA-based topic modeling of the report texts, in order to identify long-term topic trajectories and reveal stage-specific patterns of topic evolution in China’s digital economy development. The contribution of this study lies in its systematic analysis of topic shifts in a decade-long series of research reports on China’s digital economy from a stage-based perspective, thereby revealing the underlying logics of structural transformation across different stages. On this basis, the three-stage framework of informatization-driven development, platform expansion, and institutional governance goes beyond treating information society theory as a general interpretive lens and refines its contextualized application. Specifically, it links key elements emphasized in the theory, such as the development of information infrastructure, the expansion of network-based organizational forms, and the growing importance of data and governance, to observable topic shifts identified through text mining and LDA topic modeling. This approach clarifies how the development pathways of China’s digital economy empirically reflect and operationalize the underlying mechanisms highlighted in information society theory, and provides a theoretical reference for international comparative research.
The remainder of the paper is structured as follows. Section 2 reviews the theoretical foundations and related literature. Section 3 presents the research design. Section 4 reports the results of the text analysis and presents their visualization. Section 5 discusses the findings in terms of theoretical and practical implications. The final section concludes this study, outlines its limitations, and suggests directions for future research.

2. Theoretical Foundations and Literature Review

2.1. Information Society Theory

Information society theory provides a key theoretical perspective for understanding digital transformation and the rise of the digital economy. Classical studies argue that as information and knowledge gradually replace traditional production factors as the core resources for allocation, social structures are transitioning from an industrial society to an information society dominated by the production, processing, and dissemination of information [17]. Furthermore, within the framework of the “network society,” information technologies, communication networks, and data flows constitute a new logic of social organization, in which economic activities are carried out within highly interconnected network structures [18]. Amid profound social transformations occurring on a global scale (e.g., advances in information technology), the academic community has built upon this tradition to systematically update and critically reassess information society theory. For instance, in revisiting the theory of the “network society,” Castells [19] argues that information and communication technologies (ICTs) are not merely tools, but constitute a foundational paradigm for social restructuring. However, the rise of digital platforms, data capital, and algorithmic governance has given the information society new power configurations and institutional tensions, necessitating an extension of the original framework to incorporate greater attention to datafication, platformization, and governance.
Meanwhile, some scholars have begun to reexamine the transformation of the information society from the perspective of a “digital society,” arguing that the information society entails not only changes in technological structures but also a multidimensional process encompassing institutional reconfiguration, shifts in patterns of action, and the reorganization of social relations. For example, Liu [20] describes the reshaping of social structure and social order brought about by digitalization from five perspectives: technological digitalization, value digitalization, action digitalization, cultural digitalization, and normative digitalization. The study maintains that the wave of digitalization represents a further deepening of the existing foundations of the information society, and it emphasizes the construction of a coherent conceptual system of the digital rather than a complete paradigmatic rupture [20]. Jamanbalayeva [21] conceptualizes the digital society as a social formation centered on data flows and network connectivity, emphasizing the importance of balancing technocratic and humanitarian dimensions in the process of digitalization and arguing that equitable access to digital technologies should be ensured to prevent the exacerbation of digital inequality. Thus, the formation of the information society or the digital society is not achieved instantaneously, but unfolds as a gradual historical process characterized by ongoing tensions and interactions among technological diffusion, industrial restructuring, and institutional construction.
On this basis, conceptualizing the digital economy as a concentrated manifestation of the information society in the economic domain has strong theoretical validity. On the one hand, information society theory emphasizes the diffusion of information infrastructure, network connectivity, and digital technologies as prerequisites for societal transition to a new stage of development. This perspective closely aligns with the stepwise logic in digital economy research that progresses from digital infrastructure to industrial digitalization, platform-based economies, and data as a production factor, and in some cases even extends to cybersecurity [22,23]. On the other hand, the recent literature on the digital society and the digital economy indicates that as data become a key factor of production and platforms emerge as central organizational forms [24], economic development tends to undergo a multilayered evolution, shifting from technology-driven growth to platform ecosystems and subsequently to institution- and governance-oriented trajectories [25]. However, the specific pathways of this evolution vary across industries and national contexts, and therefore require empirical research for validation and refinement [26]. In other words, information society theory offers a theoretical expectation that the development of the digital economy may exhibit stage-based and structural patterns of evolution, but it does not specify concrete stage divisions or their internal mechanisms. These aspects need to be examined and reconstructed through engagement with specific contexts and relevant empirical evidence.
Therefore, this study adopts information society theory as its theoretical foundation and does not assume a priori that the development of the digital economy necessarily follows a fixed sequence of stages. Instead, it draws on the theory’s emphasis on gradual change, stage-like development, and structural reorganization to elucidate why the topic evolution of the digital economy may display stage-based characteristics. More specifically, the stage logic identified in this study, namely the progression from information to platforms and further to data governance, can be understood in relation to the core propositions of information society theory. The early stage corresponds to the diffusion of information infrastructure and the expansion of information flows. The intermediate stage reflects the rise of network-based organizational forms and platform-mediated interactions. The later stage is associated with the increasing centrality of data as a production factor and the institutionalization of governance structures. In other words, information society theory provides this study with a theoretical bridge linking the topic evolution observed in report texts to the deeper structural transformations of the digital economy. On the one hand, it enables a macro-level interpretation of semantic shifts in Chinese think tank reports on the digital economy from 2015 to 2024 as stage-specific reflections of the information society process within the Chinese context. On the other hand, the empirical identification of these stages through textual analysis, such as shifts in high-frequency terms, semantic network structures, and topic distributions, provides concrete evidence of how the theoretical mechanisms of information society theory are manifested in a specific national context. At the same time, theoretical caution is warranted, as the stage-like topic evolution of digital economy discourse is an empirical finding derived from specific texts and a particular historical period.
As such, it represents a contextualized application of information society theory to China’s digital economy, which requires further examination and extension through broader datasets and comparative perspectives in future research.

2.2. Related Literature

As digital technologies become deeply embedded in the functioning of economic and social systems, the digital economy is widely regarded as a critical force driving transformations in modes of production and institutional restructuring. Existing studies generally emphasize the central roles of digital infrastructure, data resources, and platform-based organizational forms in enhancing industrial efficiency, stimulating innovation, and fostering the emergence of new business models. For example, the OECD [27] provides a systematic overview of global trends in the development of the digital economy, highlighting its role in enhancing industrial efficiency, promoting digital innovation, and facilitating the emergence of new business models. Building on this, Rong [22] argues that the digital economy is not merely an outcome of technological evolution but also a major driver of institutional innovation and governance restructuring. From this perspective, research on the digital economy should extend across multiple dimensions, including the construction of digital infrastructure, the operational logic of digital platforms, the governance of data rights, cross-sectoral collaboration mechanisms, and the design of policy frameworks, thereby forming a comprehensive analytical framework for understanding how the digital economy operates [22].
A substantial body of research on the operational mechanisms and developmental pathways of the digital economy has focused on key issues such as the governance of data as a factor of production, the regulation of the platform economy, and the integration of digital technologies with the real economy. These studies indicate that the development of data factor markets plays a significant role in enhancing firms’ innovative capabilities [28], while the clarification of data property rights, the facilitation of data circulation, and the improvement of regulatory frameworks constitute the institutional foundations for sustainable digital economy development [29,30]. At the industrial level, scholars have proposed multidimensional pathways for the deep integration of digital technologies with the real economy, such as systematic advancement models built around integrated ecosystems, full value chains, and factor systems [31]. In addition, under the conditions of the platform economy, issues of market competition, data concentration, and algorithmic governance have generated new regulatory challenges, necessitating the establishment of coordinated mechanisms that link antitrust regulation with data governance [32]. Recent research has emphasized that effective digital governance plays a critical role in shaping the outcomes of digital transformation and sustainable development, highlighting the need to integrate governance frameworks into the analysis of digital economy evolution [33]. Overall, these studies collectively establish a theoretical foundation concerning the structural logic, institutional requirements, and developmental pathways of the digital economy, thereby providing systematic support for understanding the economic and governance effects of digital transformation.
Against the backdrop of increasingly sophisticated research on mechanisms and institutions, scholars have pursued methodological diversification. Many studies have explored topics related to the digital economy using approaches such as literature-based analysis (e.g., Rong [22]), qualitative comparative analysis (e.g., Cong et al. [34]), and quasi-natural experimental designs (e.g., Hunjra et al. [35]). However, there remains a notable lack of research that reveals the stage-based topic evolution of the digital economy from a long-term longitudinal perspective. In the field of policy research, some scholars have begun to examine the evolutionary characteristics, thematic structure, and quantitative features of policy texts through text analysis and related computational methods. For instance, Cai et al. [36] examined China’s digital economy policies through text analysis and identified the evolutionary characteristics of the policy system underpinning the sustainable development of the digital economy. In a related vein, Shen et al. [16] conducted a quantitative evaluation of China’s science and technology financial policies since the 13th Five-Year Plan by combining text analysis, content analysis, keyword frequency extraction, and the PMC-AE index model, thereby identifying major policy characteristics, differences in policy quality, and directions for policy optimization. However, this line of research has primarily focused on formal policy documents such as laws and regulations, government work reports, and planning outlines and has not yet given sufficient attention to topic changes reflected in continuously published think tank outputs grounded in both policy expectations and observed outcomes.
At the methodological level, text mining, natural language processing, and visualization techniques provide new tools for this study. With the maturation of approaches such as LDA topic modeling, word embedding analysis, and social network analysis, scholars have increasingly explored ways to integrate quantitative analyses of policy semantics with theoretical interpretations [37,38]. Recent studies have demonstrated that topic modeling techniques can reveal the semantic structure and temporal evolution of policy portfolios, providing a quantitative lens for understanding the dynamics of national innovation and governance agendas [39]. The aforementioned studies seek to identify core issues, evolutionary trajectories, and the distribution of discursive weight among different actors within policy texts, emphasizing that language functions not merely as a carrier of information but also as a crucial mechanism for constructing social reality and shaping policy cognition. However, in research on the digital economy, such methods are predominantly applied to cross-sectional comparisons or stage-based policy evaluations. They have rarely been employed in longitudinal analyses of cross-year, serialized research texts, nor have they systematically examined topic evolution in a dynamic manner anchored in key historical junctures and technological iteration contexts.

2.3. Research Framework

In summary, existing studies have generated substantial findings regarding the impact mechanisms of the digital economy and the diffusion trends of policy themes. However, they have paid insufficient attention to the long-term and stage-based topic evolution reflected in strategic research texts. In particular, for continuous and highly forward-looking think tank outputs (e.g., research reports), the academic literature still lacks systematic text mining and semantic analysis. Building upon this research gap, this study draws on information society theory and uses ten consecutive reports published between 2015 and 2024 as its research corpus. By employing “Weiciyun” to conduct word-frequency statistics, semantic visualization, and topic classification, the study constructs an integrated research framework (see Figure 1) to uncover the stage-based patterns and characteristics of topic evolution in China’s digital economy. Compared with existing literature, this study not only supplements longitudinal analyses of strategic perceptions of the digital economy, but also provides a novel methodological pathway and empirical evidence for understanding how the state constructs the meaning of the digital economy, shifts topic priorities, and transforms governance logics across stages of development.

3. Research Design

3.1. Research Data

The data for this study are derived from the series of China Digital Economy Development Research Reports published by CAICT between 2015 and 2024, as detailed in Table 1. CAICT is an authoritative research institution directly affiliated with the Ministry of Industry and Information Technology of China and has long been responsible for national digital economy strategy development, policy formulation, and technological research. Accordingly, its reports are widely regarded as reliable and of substantial academic value.
This series of reports encompasses multiple dimensions of China’s digital economy development, including industrial structure, infrastructure development, regional disparities, technological convergence and innovation, platform economy governance, the development of data as a production factor, and policy and regulatory evolution. The reports are grounded in systematic macro-level data analysis, enterprise surveys, and policy evaluations, providing rich informational content and strong analytical depth.
In terms of data type, the research objects comprise research-oriented textual materials, primarily in the form of officially published reports in PDF format. These documents are full-text and structurally consistent, making them suitable for subsequent text processing tasks such as word segmentation, word frequency analysis, and visualization. The corpus is substantial and spans a relatively long period (2015–2024), providing strong longitudinal comparability. In addition, the data selection demonstrates methodological rigor in several respects: (1) Authority and representativeness. As a government-affiliated think tank, CAICT produces publications that closely align with national strategic priorities, reflecting an official and systematic understanding of digital economy development. (2) Continuity and comparability. The reports are released annually with a consistent structure and thematic focus, facilitating cross-year comparisons and the identification of evolving trends in key topics. (3) Breadth and typicality of content. The reports cover a wide range of core issues, including macro-level trends, regional distribution, industrial digitalization, and data governance, thereby providing comprehensive coverage and a holistic perspective. (4) Suitability of textual characteristics. These texts are information-dense, linguistically standardized, and terminologically consistent, making them well suited for natural language processing tasks, such as word frequency analysis, keyword extraction, and topic evolution analysis.
Therefore, selecting this series of reports as the data source ensures both the reliability and representativeness of the textual data and provides a solid foundation for subsequent quantitative text analysis.

3.2. Data Classification

To better analyze the stage-specific characteristics of digital economy development and the trajectory of policy evolution, this study divides the development process into three stages: the exploratory start-up period (2015–2017), the policy consolidation and platform economy expansion period (2018–2020), and the regulatory governance and data factor market construction period (2021–2024). This classification is primarily based on the issuance of key national-level policy documents and major structural changes within the industry. More specifically, the periodization reflects a combined consideration of three factors: (1) major policy milestones that mark shifts in national digital economy strategy, (2) substantive changes in the developmental focus of the digital economy, and (3) the evolving policy and research discourse surrounding digital economy development as reflected in the report series. Therefore, the stage division is not derived from a single criterion, but is grounded in a synthesis of policy context and developmental dynamics, which together provide a coherent basis for distinguishing different phases of digital economy development. The specific stages and their characteristics are presented in Figure 2. This periodization framework provides a clear analytical foundation for subsequent analysis and clarifies the evolution of topic features across different stages in the report texts.
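The periodization above can be expressed as a simple lookup from report year to stage. The following is a minimal sketch: the year boundaries come from the text, while the function names `stage_of` and `group_by_stage` and the shortened labels are illustrative assumptions, not part of the study's tooling.

```python
# Stage boundaries as stated in the text; labels are shortened for illustration.
STAGES = [
    (2015, 2017, "exploratory start-up"),
    (2018, 2020, "policy consolidation and platform economy expansion"),
    (2021, 2024, "regulatory governance and data factor market construction"),
]


def stage_of(year):
    """Return the stage label for a report year, or raise if out of range."""
    for start, end, label in STAGES:
        if start <= year <= end:
            return label
    raise ValueError(f"year {year} is outside the 2015-2024 study period")


def group_by_stage(years):
    """Group report years by stage, preserving the order of the stages."""
    groups = {label: [] for _, _, label in STAGES}
    for year in sorted(years):
        groups[stage_of(year)].append(year)
    return groups
```

Calling `group_by_stage(range(2015, 2025))` assigns three reports to each of the first two stages and four to the third, matching the ten-report corpus described in Section 3.1.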

3.3. Research Methods

To systematically reveal the stage-specific evolution of topics in China’s digital economy from 2015 to 2024, this study adopts a text mining approach, using the “Weiciyun” software (version 2.7) as the primary analytical tool. By integrating multiple technical methods, including word frequency analysis, semantic network analysis, sentiment analysis, and topic modeling, the textual materials contained in the report series are analyzed quantitatively and visualized.
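For readers unfamiliar with LDA topic modeling, the sketch below shows one standard way to fit such a model, collapsed Gibbs sampling, using only the Python standard library. It is an illustration of the general technique and not a description of how Weiciyun implements it. The function name `lda_gibbs` and the hyperparameter values (`alpha`, `beta`, `n_iter`) are assumptions chosen for the example; the source does not report the settings used in the study.

```python
import random


def lda_gibbs(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA on tokenized documents.

    Returns (doc_topic, topic_word, vocab): per-document topic proportions,
    per-topic word distributions, and the sorted vocabulary list.
    Hyperparameter defaults are illustrative assumptions.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    index = {w: i for i, w in enumerate(vocab)}
    n_words = len(vocab)
    ndk = [[0] * n_topics for _ in docs]            # topic counts per document
    nkw = [[0] * n_words for _ in range(n_topics)]  # word counts per topic
    nk = [0] * n_topics                             # total tokens per topic
    assign = []
    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            k = rng.randrange(n_topics)
            zs.append(k)
            ndk[d][k] += 1
            nkw[k][index[w]] += 1
            nk[k] += 1
        assign.append(zs)
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k, wi = assign[d][i], index[w]
                # Remove the token's current assignment from the counts.
                ndk[d][k] -= 1
                nkw[k][wi] -= 1
                nk[k] -= 1
                # Full conditional, proportional to
                # (ndk + alpha) * (nkw + beta) / (nk + n_words * beta).
                weights = [
                    (ndk[d][t] + alpha) * (nkw[t][wi] + beta)
                    / (nk[t] + n_words * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = k
                ndk[d][k] += 1
                nkw[k][wi] += 1
                nk[k] += 1
    # Smoothed point estimates from the final state of the sampler.
    doc_topic = [
        [(ndk[d][t] + alpha) / (len(doc) + n_topics * alpha)
         for t in range(n_topics)]
        for d, doc in enumerate(docs)
    ]
    topic_word = [
        [(nkw[t][v] + beta) / (nk[t] + n_words * beta) for v in range(n_words)]
        for t in range(n_topics)
    ]
    return doc_topic, topic_word, vocab
```

Each row of `doc_topic` and `topic_word` is a probability distribution, so the highest-weight words in a `topic_word` row can be read as a topic's characteristic terms. In practice, an established library would normally be preferred over a hand-written sampler.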
From a theoretical perspective, the methodological approach of this study is grounded in information society theory, which posits that digital economy-related texts reflect the state’s stage-specific cognitive mapping of information structures, flows, and institutional transformations across different periods. In this study, these theoretical elements are analytically connected to empirical patterns identified through text mining and topic modeling, thereby enabling an examination of how abstract theoretical mechanisms are reflected in observable textual structures. On the one hand, a structured analysis of the high-frequency distribution of keywords, the centrality of information nodes, and their co-occurrence relationships can reveal the characteristics of the digital economy’s information structure reflected in the reports at each stage, as well as its evolutionary logic. On the other hand, by tracing the emergence, intensification, and substitution of core informational elements (such as “information,” “platform,” “data,” and “governance”), it is possible to further identify the structural shifts in digital economy topics accompanying the development of the information society. At the technical level, the study integrates natural language processing (NLP) techniques with text visualization methods to address the research questions in a data-driven manner: (1) the stage-specific evolution of modes of expression and core conceptual frameworks of digital economy topics over the past decade; and (2) the strategic logic of national development reflected in the evolution of digital economy topics and their dynamic interaction with policy contexts.
In addition, prior to formal analysis, the report texts were preprocessed. First, all reports were converted into machine-readable TXT format and systematically cleaned, including the removal of cover pages, tables of contents, and redundant structured information, while retaining the main textual content. Subsequently, dictionary resources were optimized to reflect the thematic characteristics of the texts. For example, domain-specific terms such as “platform-based enterprises” and “platform economy” were incorporated into a custom dictionary, while synonymous expressions such as “increase” and “improve” were grouped within a synonym dictionary. Finally, for lexical resources including the general dictionary, stop-word list, and sentiment lexicon, this study adopted the default dictionaries provided by the official Weiciyun software package (for details, see [40]). These dictionaries are pre-configured for Chinese text processing. No additional manual modifications were made. Furthermore, domain-specific terms were incorporated exclusively through a separate custom dictionary, without altering the original default lexicons, thereby ensuring methodological transparency and reproducibility.
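The three dictionary steps described above (custom terms, synonym grouping, and stop-word removal) can be sketched as follows. This is a toy illustration under stated assumptions: the actual report texts are in Chinese and require word segmentation, which Weiciyun handles internally; here English text and whitespace-style tokenization stand in for that step. The dictionaries `CUSTOM_TERMS`, `SYNONYMS`, and `STOP_WORDS` are hypothetical miniatures built from the examples in the text and do not reproduce the software's default lexicons.

```python
import re

# Hypothetical toy resources; the study's actual dictionaries are not reproduced here.
CUSTOM_TERMS = ["platform economy", "platform-based enterprises"]
SYNONYMS = {"improve": "increase"}  # map each variant to one canonical form
STOP_WORDS = {"the", "of", "and", "a", "to", "in"}


def preprocess(text):
    """Lowercase, protect custom terms, tokenize, merge synonyms, drop stop words."""
    text = text.lower()
    # Join multi-word custom terms with underscores so they survive tokenization;
    # longer terms are handled first to avoid partial matches.
    for term in sorted(CUSTOM_TERMS, key=len, reverse=True):
        text = text.replace(term, term.replace(" ", "_"))
    tokens = re.findall(r"[a-z_\-]+", text)
    tokens = [SYNONYMS.get(t, t) for t in tokens]
    return [t for t in tokens if t not in STOP_WORDS]
```

Keeping domain terms in a separate custom list, as the study does, means the default lexicons stay untouched and the same preprocessing can be rerun on the original inputs.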

4. Results and Visualization

To avoid analytical redundancy and to clarify the role of each method, this study integrates word frequency analysis, semantic network analysis, and topic modeling as complementary approaches. Word frequency analysis identifies salient terms, semantic network analysis captures relationships among concepts, and topic modeling reveals latent thematic structures, together enabling a multi-dimensional interpretation of the text.

4.1. Word Frequency Analysis

Word frequency analysis is a method that identifies core concepts and their relative importance by calculating the frequency of words in a text [41]. In this study, word frequency analysis reveals the distribution of high-frequency terms across different periods in the report series and captures shifts in topic focus over time. It thus provides a foundation for subsequent semantic network analysis and topic evolution analysis. The top 20 most frequent words are presented in Table 2.
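As a minimal sketch of this calculation, the relative frequency of a term can be taken as its count divided by the total number of tokens in a stage. This normalization is an assumption about how the reported frequencies are computed, and the function name `relative_frequencies` is hypothetical.

```python
from collections import Counter


def relative_frequencies(tokens, top_n=20):
    """Return the top_n terms with count / total-token frequency."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return [(word, count / total) for word, count in counts.most_common(top_n)]


# Toy token list, not drawn from the report corpus.
stage_tokens = ["data"] * 3 + ["platform"]
print(relative_frequencies(stage_tokens))
# [('data', 0.75), ('platform', 0.25)]
```

Because the values are normalized by stage length, they can be compared across stages even when the underlying reports differ in size.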
In the first stage, high-frequency terms were mainly concentrated in “information economy,” “development,” “digital economy,” “information,” “Internet,” “network,” and “industry.” Among these, “information economy” appeared most frequently, indicating that China’s digital economy remained at an early stage of development. The focus was primarily on informatization and Internet infrastructure development, emphasizing the transformation and upgrading of traditional industries through the “Internet Plus” strategy. Meanwhile, terms such as “development,” “economy,” and “growth” suggest that the digital economy was primarily framed as a driver of macroeconomic growth and industrial development. The co-occurrence of “Internet” with “innovation” and “industry” reflects strong expectations regarding the integration of the Internet with the real economy. Overall, the lexical characteristics of this stage align closely with the “exploratory start-up phase” proposed in this study, indicating that the digital economy remained at a preliminary stage characterized by informatization and industrial integration.
Upon entering the second stage, the focal keywords shift markedly. The term “digital economy” replaces “information economy” as the central concept, indicating that it has evolved from a policy initiative and social experimental construct into a core component of the national development strategy. Meanwhile, the frequent co-occurrence of terms such as “platform,” “enterprise,” and “employment” reflects the rapid expansion of platform-based firms and the emergence of the platform economy as a dominant form within the digital economy. During this period, the digital economy was viewed not only as a key driver of technological and industrial transformation but also as a mechanism for employment generation and broader social restructuring. In addition, the frequency of the term “data” increased significantly at this stage (0.0188, compared with 0.0067 in the previous stage). These lexical features align closely with the phase characterized by “policy consolidation” and “platform prosperity,” indicating that China’s digital economy had shifted from a focus on foundational infrastructure toward platform-centered expansion and the initial formation of institutional frameworks.
In the third stage, the keyword system evolves further, with “data” emerging as the most prominent term, while words such as “governance,” “factors,” and “market” simultaneously enter the high-frequency set. This result indicates that the focus of China’s digital economy has shifted further toward the institutional development of data resource ownership, circulation, and market-based allocation, marking a transition from “factor recognition” to a stage of institutionalized governance. The term “governance” consistently ranks among the most frequent keywords, highlighting the accelerated establishment of regulatory frameworks in areas such as antitrust, data security, personal information protection, and algorithmic regulation. Meanwhile, “digital transformation” has emerged as a new key term, reflecting that the scope of the digital economy has expanded beyond a single industrial domain to encompass broader systemic transformation across society.
Taken together, the three stages reveal a clear progression in core concepts and high-frequency terms within the report texts, shifting from an “information economy–Internet” orientation that emphasizes informatization and industrial integration, to a “digital economy–platform” orientation characterized by platform-driven growth and expansion, and further to a “data–market” orientation that focuses on the data factor market and the development of regulatory frameworks. This progression underscores the stage-based leapfrogging nature of China’s digital economy development trajectory.
Beyond the longitudinal changes in overall keywords, place-related terms in the report texts also exhibit clear stage-specific differences. To further reveal the spatial dimensions of the discourse on digital economy development, this study conducts a statistical analysis of the geographical names appearing in the reports, with the results presented in Figure 3.
Figure 3 shows that the distribution of place names in the report texts exhibits a transition from an eastern-region-led pattern to nationwide expansion. In the first stage, the most frequently mentioned place name is “China,” followed by the United States, Zhejiang, Guizhou, the United Kingdom, Guangdong, and Jiangsu. This distribution indicates that digital economy development was discussed from both a national perspective and a comparative international viewpoint while also highlighting the exploratory experiences of eastern regions such as Zhejiang, Guangdong, and Jiangsu. These areas possess relatively strong foundations in the Internet industry, manufacturing digitalization, and informatization. In addition, the prominence of Guizhou is noteworthy, suggesting that western China had already achieved considerable progress through early initiatives in big data development. Overall, the geographical distribution during this period reflects a pattern characterized by the coexistence of a nationwide framework and exploratory practices in representative regions.
In the second stage, although “China” continued to dominate, the frequencies of place names such as Guangdong, Zhejiang, Jiangsu, Shanghai, Beijing, Sichuan, Fujian, and Chongqing increased markedly. This indicates that, in spatial terms, the digital economy expanded beyond the traditional eastern coastal provinces to a broader range of regions. In particular, the inclusion of inland areas such as Sichuan and Chongqing demonstrates the role of the digital economy in driving regional development and reflects a shift from an “east-led” pattern toward nationwide expansion. Meanwhile, the emergence of regional terms such as the “Yangtze River Delta” suggests that China’s digital economy had begun to move beyond individual cities or provinces toward more coordinated regional development.
By the third stage, the distribution of place names exhibits clear diversification. Shanghai, Beijing, and Shenzhen appear most frequently, underscoring the benchmark role of core cities in digital economy development and governance. At the same time, locations such as Shandong, Zhejiang, Hangzhou, and Guangzhou are mentioned repeatedly, highlighting the important role of provinces and cities with strong industrial bases and advanced levels of digitalization. Notably, the frequent use of regional expressions such as the “Yangtze River Delta,” along with repeated references to places including Guizhou, Chongqing, Wuhan, Shanxi, Suzhou, and Hubei, indicates that the reports at this stage frame the spatial logic of the digital economy as one of nationwide deployment and regional coordination. This pattern closely corresponds to the transition of the digital economy from rapid expansion to high-quality development.
Across the three stages, the distribution of place names in the report texts exhibits a clear evolutionary trajectory. In the first stage, references are centered on “China,” emphasizing international comparisons and highlighting representative eastern provinces. In the second stage, while the eastern regions remain prominent, attention expands to central and western provinces and introduces regional concepts. By the third stage, a spatial configuration emerges characterized by core cities, integrated regions, and multiple supporting locations. This progression indicates that the reports gradually develop a cognitive framework moving from a national perspective to key regions and ultimately to coordinated regional development. It reflects the spatial evolution of the digital economy from an exploratory start-up phase toward comprehensive governance.

4.2. Semantic Network Analysis

Semantic network analysis is a method that reveals the internal semantic structure and conceptual relationships of a text by examining co-occurrence patterns among words [42]. In this study, the approach serves two main purposes. First, it illustrates the clustering patterns and interconnections among key terms within the report texts. Second, it provides an intuitive representation of the evolutionary logic of topic systems across stages, thereby offering strong support for understanding the dynamic shifts in digital economy development. The results of the semantic network analysis are presented in Figure 4.
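The co-occurrence logic behind such a network can be sketched as follows. The sketch assumes that two terms co-occur when they appear in the same sentence and that node importance is approximated by weighted degree. Both choices, and the function names, are illustrative assumptions rather than a description of the software actually used.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_edges(sentences):
    """Count, for each unordered term pair, the sentences containing both."""
    edges = Counter()
    for tokens in sentences:
        for a, b in combinations(sorted(set(tokens)), 2):
            edges[(a, b)] += 1
    return edges


def weighted_degree(edges):
    """Sum the edge weights attached to each term (a simple centrality proxy)."""
    degree = Counter()
    for (a, b), weight in edges.items():
        degree[a] += weight
        degree[b] += weight
    return degree


# Toy sentences, not drawn from the report corpus.
sentences = [["data", "market", "factor"], ["data", "market"], ["platform", "enterprise"]]
edges = cooccurrence_edges(sentences)
print(edges[("data", "market")])      # 2
print(weighted_degree(edges)["data"])  # 3
```

Terms with the highest weighted degree would sit at the center of the plotted network, while weakly connected terms would fall into the peripheral layers.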
In the first stage, the semantic network is centered on the core node “information economy.” The first peripheral layer is composed of terms such as “information,” “enterprises,” and “Internet,” while the second peripheral layer includes words such as “technology,” “networks,” “industries,” and “fields,” forming a typical semantic cluster characterized by informatization–industrial integration. The overall network structure at this stage is relatively simple, indicating that the development trajectory of the digital economy primarily revolved around the “Internet Plus” initiative and industrial transformation. The analysis emphasizes the enabling role of information technology in upgrading traditional industries. Meanwhile, the presence of terms such as “traditional” and “industrial” highlights the transitional nature of this period, during which new and traditional industrial forms coexisted.
In the second stage, the semantic network becomes significantly more complex, forming a dual-core structure centered on “digital economy” and “data.” The report texts not only continue to emphasize “industries” and “technology,” but also highlight the close connections between “platforms” and sectors such as the “service industry” and “manufacturing,” reflecting the prominent role of the platform economy during this period. At the same time, “data,” as a secondary core node, indicates that China had gradually established a cognitive framework recognizing data as a new factor of production. The emergence of peripheral nodes such as “industrial Internet” and “collaboration” further suggests an increasing emphasis on cross-industry and cross-regional integration. Overall, the semantic network in this stage exhibits characteristics of multiple cores and cross-domain connectivity, which aligns closely with the deepening development of the digital economy and the expansion of the platform economy.
In the third stage, the semantic network further evolves into a structure that transcends the conventional notion of the “digital economy,” with “data” emerging as the dominant central node. Not only does data retain its core status from the previous stage, but it also becomes closely linked with terms such as “transactions,” “factors,” “markets,” and “industries,” reflecting a strategic shift toward the development of a data factor market. Meanwhile, a new triangular core composed of “data–industry–market” takes shape, indicating that computing capacity development, data capability building, industrial coordination, and multi-scenario application have become key directions for digital economic growth. Moreover, “cities” emerge as increasingly significant peripheral nodes, highlighting the growing importance of domains such as digital cities and urban governance. The incorporation of emerging technological nodes such as “artificial intelligence” and “intelligence,” together with spatial expressions related to national (e.g., nationwide) and urban (e.g., city) scales, suggests that the digital economy at this stage has fully shifted toward a development logic characterized by institutionalization, intelligentization, and coordinated expansion from localized points to broader systemic coverage.
Taken together, the evolution of the semantic networks across the three stages clearly reveals a shift in the topic focus of the report texts: from “informatization-driven development and industrial integration” in the first stage, to “platform prosperity and the rise of data as a production factor” in the second stage, and ultimately to “data dominance and the institutionalization of factor markets” in the third stage. This progression not only reflects changes in the central topics of the reports but also mirrors the underlying developmental trajectory of China’s digital economy, which has gradually evolved from technology-driven growth to platform expansion and ultimately to data governance.

4.3. Sentiment Analysis and Topic Classification

4.3.1. Sentiment Analysis

Sentiment analysis is a method that reveals the attitudinal and emotional orientation of a text by identifying and quantifying positive, neutral, and negative tendencies within it [43]. Through sentiment analysis, the overall emotional distribution of the report texts across different stages can be intuitively presented, thereby uncovering their underlying value judgments and attitudinal orientations toward digital economy development. This approach complements the analysis of topic characteristics and stage-specific differences. In this study, sentiment analysis was conducted separately for the report texts of the three stages, as illustrated in Figure 5.
The sentiment analysis results for stage one indicate that the development of China’s digital economy was characterized by a distinctly positive overall attitude. Positive sentiment accounts for 71.93%, neutral sentiment 22.37%, and negative sentiment 5.70%. The scatter plot shows that most statements are concentrated within the positive range of 0–20, with the highest density observed at 10.61, suggesting that this stage was primarily oriented toward policy support and positive framing.
The report texts in stage two continue to exhibit an overall positive orientation; however, compared with stage one, the proportion of negative sentiment increases to 14.22%. The scatter plot indicates that most statements remain concentrated within the positive range of 0–20, with the highest density occurring around 11.45, suggesting that positive affirmation continues to predominate overall. Nevertheless, the higher share of negative sentiment, primarily distributed within the −20 to 0 range, indicates that the emotional tone of the reports became more diverse and complex during this stage.
Compared with stage two, the report texts in stage three exhibit a substantial increase in positive sentiment, rising from 62.58% to 75.88%. Similarly, most statements remain concentrated within the positive range of 0–20, with the highest density located around 11.19. Negative sentiment decreases by nearly half compared with the previous stage, indicating that the overall tone is predominantly affirmative and constructive.
The sentiment analysis across the three stages indicates that the development of China’s digital economy has progressed from an initial phase marked by strong constructive expectations, to a more complex intermediate phase shaped by emerging problems and challenges, and finally to a later phase in which discourse returns to a predominantly constructive and affirmative tone as the digital economy matures. Moreover, it is important to interpret these sentiment distributions with caution. The report texts analyzed in this study are produced by policy-oriented and think tank institutions, which often adopt a relatively formalized and positively framed discursive style. As a result, the observed prevalence of positive sentiment may partly reflect institutionalized modes of expression rather than purely substantive evaluations of development outcomes or policy confidence. Therefore, the sentiment analysis results should be understood as indicative of the overall tone and framing of the discourse, rather than as direct measures of underlying attitudes or objective conditions.
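To make the construction of these shares concrete, the following minimal sketch converts statement-level sentiment scores into positive, neutral, and negative percentages. It assumes that a score above zero is positive, a score of exactly zero is neutral, and a score below zero is negative. The actual cut-offs applied by the software are not documented here, and the function name `sentiment_shares` is hypothetical.

```python
def sentiment_shares(scores):
    """Return the percentage of positive, neutral, and negative statements."""
    n = len(scores)
    if n == 0:
        raise ValueError("at least one score is required")
    positive = sum(1 for s in scores if s > 0)
    negative = sum(1 for s in scores if s < 0)
    neutral = n - positive - negative
    return {
        "positive": round(100 * positive / n, 2),
        "neutral": round(100 * neutral / n, 2),
        "negative": round(100 * negative / n, 2),
    }


# Toy scores, not drawn from the report corpus.
print(sentiment_shares([10.5, 3.0, 0.0, -4.0]))
# {'positive': 50.0, 'neutral': 25.0, 'negative': 25.0}
```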

4.3.2. Topic Classification

Topic classification aims to automatically cluster texts using machine learning models to identify latent core topics and their semantic characteristics [44]. In this study, topic classification is used to reveal the semantic focal points and distribution of core themes across different stages of the report series, thereby enabling a systematic understanding of the evolutionary logic of the digital economy’s topic structure. Specifically, this study employs the LDA topic modeling algorithm embedded in the “Weiciyun” platform. Through probabilistic modeling, the method estimates the weight of each topic within the texts and extracts representative keywords, which are then analyzed in conjunction with sentiment distributions to interpret their evolutionary characteristics.
Notably, prior to conducting the topic analysis, we evaluated the perplexity, coherence, and topic distinctiveness of the texts in each stage based on the LDA model in order to determine the optimal number of topics. After determining the optimal number of topics, this study uses the automated output function of the “Weiciyun” platform to visually determine the relative importance of topics across stages and adopts the system-generated topic score distribution map (see Figure 6). Based on the topic probability scores calculated by the model, the figure represents the weight of each topic in the texts at each stage through variations in bubble size, thereby illustrating structural differences in topic focus across stages.
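For reference, perplexity in the LDA literature is commonly defined as the exponentiated negative average log-likelihood per token, so that lower values indicate better fit. The notation below (a corpus D of M documents, where document d has word sequence w_d and N_d tokens) is introduced here for illustration, and whether the platform computes exactly this quantity is an assumption.

```latex
\mathrm{perplexity}(D) = \exp\!\left( - \frac{\sum_{d=1}^{M} \log p(\mathbf{w}_d)}{\sum_{d=1}^{M} N_d} \right)
```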
(1) Stage one: the nascent exploratory stage of the digital economy
In determining the number of topics, this study considers perplexity, coherence, and topic visualization results. First, based on perplexity trends, as the number of topics increases from 1 to 10, the model’s perplexity shows a continuous decline. Specifically, the decrease is substantial in the range of K = 1 to K = 3, remains relatively pronounced from K = 3 to K = 5, and becomes noticeably more gradual when K ≥ 6, indicating that further increases in the number of topics yield only limited improvements in model fit. Based on this, the reasonable range for the number of topics is preliminarily identified as K = 4 to K = 6. Building on this, the variation in the coherence metric is further examined. The results show that coherence reaches a relatively high level at K = 5, indicating strong semantic cohesion within topics under this setting. However, relying solely on the coherence metric is insufficient to fully capture the distinctions between topics. Therefore, we further incorporate inter-topic distance visualization for analysis. The results show that when K = 5, some topics exhibit substantial overlap in two-dimensional space, with several topics forming highly clustered groups. This indicates strong semantic similarity among these topics, insufficient discriminability, and a degree of redundancy. In contrast, when the number of topics is set to K = 4, the topics are more evenly dispersed, the degree of overlap between bubbles is significantly reduced, and the topic boundaries are clearer, thereby better reflecting differences among distinct semantic structures. Overall, perplexity has already entered a stage of diminishing returns, and although K = 5 performs slightly better in terms of coherence, it suffers from notable topic overlap. By comparison, K = 4 not only maintains satisfactory model fit and semantic coherence, but also improves topic distinctiveness and structural clarity.
It should be noted that under larger numbers of topics, the coherence metric may increase due to topic fragmentation; thus, this study places greater emphasis on topic discriminability and semantic interpretability in making the final decision. Accordingly, the number of topics is set to four. The relevant results are shown in Figure 7.
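The notion of diminishing returns in perplexity can be made operational with a simple rule, sketched below. The sketch computes the relative drop in perplexity at each step in K and reports the first K at which the drop falls below a threshold. The 5% threshold, the function names, and the perplexity values are made-up illustrations and are not the values obtained in this study.

```python
def relative_drops(perplexity_by_k):
    """Relative perplexity decrease when moving from the previous K to K."""
    ks = sorted(perplexity_by_k)
    return {
        k: (perplexity_by_k[prev] - perplexity_by_k[k]) / perplexity_by_k[prev]
        for prev, k in zip(ks, ks[1:])
    }


def first_flat_k(perplexity_by_k, threshold=0.05):
    """Smallest K whose step from the previous K improves fit by < threshold."""
    for k, drop in sorted(relative_drops(perplexity_by_k).items()):
        if drop < threshold:
            return k
    return None


# Hypothetical perplexity curve, for illustration only.
curve = {1: 1000, 2: 800, 3: 640, 4: 560, 5: 500, 6: 485, 7: 475}
print(first_flat_k(curve))  # 6
```

A rule of this kind only narrows the candidate range. As discussed above, the final choice still depends on coherence and on how clearly the topics are separated.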
The topic classification results for stage one (see Table 3) indicate that the digital economy during this period was still in a phase of conceptual emergence and path exploration. The topic structure was primarily shaped by key terms such as “information,” “information economy,” “development,” and “enterprise.” From an overall perspective, the semantic profile reveals a pronounced orientation toward informatization, enterprise innovation, and sustained attention to macro-level development issues.
Specifically, the theme of the “information economy” occupies a central position with the highest average score of 0.81. Its sentiment distribution shows that positive sentiment accounts for 64.29%, while neutral sentiment accounts for a substantial share (32.14%). High-frequency terms such as “integration,” “sector,” “proportion,” “region,” and “nationwide” indicate that, during this period, the digital economy was primarily discussed as a pathway through which informatization could promote macroeconomic structural optimization. In other words, the “information economy” theme reflects that, in exploring the concept of the digital economy, governments and research institutions tended to situate it within an information-driven growth framework, emphasizing its structural role in regional development, industrial restructuring, and overall economic expansion.
Second, the theme of “information” has an average score of 0.71, with positive sentiment accounting for 89.71%, making it the most affirmatively perceived theme in stage one. Its representative high-frequency terms include “technology,” “service,” “industry,” “investment,” and “infrastructure,” indicating that the primary focus of digital economy development during this period was on the improvement of information infrastructure, the diffusion of informatization applications, and the integration of technology into industrial sectors. Information infrastructure was regarded not only as a foundational prerequisite for the emergence of the digital economy but also as a critical technological underpinning for industrial upgrading and the digitalization of public services. Accordingly, it was associated with highly positive policy expectations at this stage.
The “development” theme also receives a relatively high score (0.80) and exhibits a positive sentiment proportion of 67.06%. Its key terms, such as “index,” “network,” “scale,” “Internet Plus,” and “trend,” collectively reflect the policy discourse of the “Internet Plus” strategy that characterized stage one. The semantic structure of this theme suggests that the digital economy had not yet developed an independent, fully formed conceptual framework at this stage. Instead, it evolved primarily within the context of industrial development driven by the “Internet Plus” initiative, emphasizing the role of network connectivity and economies of scale in promoting industrial growth and economic transformation.
In addition, the “enterprise” theme has a score of 0.71, with positive sentiment accounting for 75.19%. Its high-frequency terms, including “innovation,” “market,” “platform,” “traditional,” and “model,” reflect the dual role of enterprises in the early stage of digitalization. On the one hand, enterprises acted as key drivers of technological innovation and organizational transformation. On the other hand, their transition remained in an exploratory phase from traditional to digital models. Consequently, this theme exhibits a clear positive orientation while also retaining a certain proportion of neutral and negative sentiment (15.04% and 9.77%, respectively), indicating that early corporate digital transformation was still accompanied by uncertainty and adaptation challenges.
Taken together, the topic structure of stage one exhibits the characteristic features of informatization-driven development and industrial integration exploration. Evidence from theme scores, sentiment distributions, and semantic clustering of high-frequency terms suggests that the digital economy was primarily understood as a dynamic linkage among informatization initiatives, new technologies, and industrial growth. Further analysis of the report texts reveals that although the concept of the “digital economy” had already appeared, it had not yet emerged as an independent, high-weight theme. Instead, it was largely embedded within broader macro-level topics such as the “information economy” and “development.” This indicates that the digital economy was still in an early stage of conceptual development, with its strategic positioning, conceptual boundaries, and institutional framework not yet fully clarified. These findings are highly consistent with this study’s overall assessment of the exploratory start-up period. Specifically, the digital economy was largely regarded as an extension of information society development, representing a policy vision in which industrial growth would be driven by information infrastructure and internet technologies. Its strategic framework was still in formation, marking a transitional phase from “informatization” to “digitalization.”
(2) Stage two: platform economy as the core with more complex sentiment patterns
In stage two, this study similarly determines the number of topics by considering perplexity, coherence, and topic visualization results. First, from the trend of perplexity, as the number of topics increases from 1 to 10, the model’s perplexity shows a continuous decline. Specifically, the decrease is substantial in the range of K = 1 to K = 4, while the downward trend becomes noticeably more gradual when K ≥ 5, indicating that further increases in the number of topics yield only limited improvements in model fit. Based on this, the reasonable range for the number of topics is preliminarily identified around K = 4. Building on this, the variation in the coherence metric is further examined. The results show that coherence reaches its maximum at K = 4, and then declines or fluctuates as the number of topics increases, suggesting that increasing the number of topics beyond four does not enhance semantic cohesion within topics and may instead reflect over-segmentation. To validate the rationality of the topic structure, this study compares inter-topic distance visualizations under different topic numbers. The results indicate that when K = 4, the topics are relatively well dispersed in two-dimensional space, with low levels of overlap and a clear overall structure. In contrast, when K = 5, some topics exhibit notable clustering and overlap, with several topics located in close proximity, indicating strong semantic similarity and a degree of topic redundancy. In sum, given that perplexity has already entered a stage of diminishing returns, K = 4 not only achieves optimal coherence but also demonstrates superior topic distinctiveness and structural clarity. Thus, the number of topics is set to four. The relevant results are shown in Figure 8.
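Visual overlap between topics can also be checked numerically. The sketch below uses the Jensen-Shannon divergence between topic-word distributions, a common choice for inter-topic distance maps, and flags topic pairs whose divergence falls below a threshold. This is an assumed stand-in for the visual inspection described above, not the procedure implemented by the platform. The threshold, the function names, and the toy distributions are hypothetical.

```python
import math
from itertools import combinations


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2): 0 = identical, 1 = disjoint."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Terms with zero probability in x contribute nothing.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def overlapping_pairs(topics, threshold=0.1):
    """Return topic-name pairs whose word distributions are nearly identical."""
    return [
        (a, b)
        for a, b in combinations(sorted(topics), 2)
        if js_divergence(topics[a], topics[b]) < threshold
    ]


# Toy topic-word distributions over a four-word vocabulary.
topics = {
    "t1": [0.5, 0.5, 0.0, 0.0],
    "t2": [0.5, 0.5, 0.0, 0.0],
    "t3": [0.0, 0.0, 0.5, 0.5],
}
print(overlapping_pairs(topics))  # [('t1', 't2')]
```

A candidate K that produces many flagged pairs would correspond to the clustered, redundant topics observed in the visualization, whereas a K with no flagged pairs would correspond to well-dispersed topics.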
Table 4 shows that the topic focus of stage two is concentrated on four main areas: “platform,” “enterprise,” “data,” and the “digital economy.” However, compared with stage one, both the sentiment structure and semantic orientation become more complex. Among these, the “platform” theme has an average score of 0.71, with a polarized sentiment distribution: positive sentiment accounts for 41.18%, neutral sentiment for 32.35%, and negative sentiment for 26.47%. High-scoring terms such as “internet,” “user,” “management,” “regulation,” and “issue” indicate that while the platform economy expanded user connectivity and improved service efficiency, it also revealed governance imbalances, regulatory lag, and a concentration of emerging issues. As a result, platforms were increasingly discussed within a framework characterized by the coexistence of expansion and regulation.
By contrast, the “enterprise” theme has a higher average score of 0.73, with positive sentiment accounting for 77.98%, while neutral and negative sentiments represent 14.28% and 7.74%, respectively. High-scoring terms such as “field,” “service,” “production,” “innovation,” and “integration” indicate that enterprises across different domains leveraged digital technologies to enhance service models, restructure production processes, and achieve multidimensional integration driven by innovation. This theme highlights how enterprises undertook business upgrading and structural optimization based on platforms and digital infrastructure, and is therefore associated with a distinctly positive evaluation.
The “data” theme has an average score of 0.65, slightly lower than the other three themes, yet its sentiment structure is predominantly positive (81.40% positive, 13.95% neutral, and 4.65% negative). Its high-scoring terms cluster around “factors,” “market,” “transaction,” “risk,” and “value,” indicating that data was regarded as a key factor of production with significant value, and that the development of data markets and transaction mechanisms was a key direction during this phase. Meanwhile, the inclusion of the term “risk” suggests growing awareness of potential security, privacy, and systemic risks associated with data circulation, trading, and marketization. Thus, while affirming the factorization and market potential of data, this theme reflects a shift from a resource-based perspective toward a dual assessment of value and risk.
Meanwhile, the “digital economy” theme maintains the highest average score in this stage (0.79), while its sentiment distribution exhibits a relatively balanced pattern: positive sentiment accounts for 56.03%, neutral sentiment for 28.45%, and negative sentiment for 15.52%. High-scoring terms such as “digitalization,” “industry,” “sector,” “proportion,” and “scale” indicate that the role of the digital economy in shaping structure, share, and scale became a central focus, reflecting an emphasis on digitalization-driven macroeconomic restructuring. The relatively high proportion of neutral sentiment suggests that, alongside rapid expansion in scale and industrial upgrading, the digital economy was also subject to certain constraints. As a result, its development began to shift from a singular growth-oriented trajectory toward a more balanced pattern that integrates quality and structural optimization.
Collectively, stage two is characterized by a topic pattern in which platform expansion coexists with emerging risks, data factors gain increasing importance, and the digital economy enters a more balanced phase of development. The textual sentiment reflects a tension between expansionary optimism and cautious governance, which aligns closely with the policy environment of this period, marked by strengthened platform regulation and ongoing structural transformation of the digital economy.
(3) Stage three: coupled advancement of institutional governance, factor markets, and enterprise actors
As in the previous two stages, this study determines the final number of topics in the third stage by considering perplexity, coherence, and topic visualization results (for details, see Figure 9).
Based on perplexity trends, as the number of topics increases, perplexity declines markedly in the range of K = 1 to K = 4, while the rate of decrease becomes progressively more gradual when K ≥ 5, indicating that the model fit has entered a stage of diminishing returns. Regarding coherence, although the highest value is observed at K = 9, this peak occurs at a relatively large number of topics and may result from topic fragmentation, which can enhance local semantic consistency but does not necessarily reflect an improvement in the overall semantic structure. By comparison, K = 5 represents a local optimum, but it still requires assessment in terms of topic distinctiveness. From the inter-topic distance visualization, when K = 5, several topics exhibit notable overlap, indicating strong semantic similarity and a degree of redundancy. In contrast, when K = 4, the topics are more evenly dispersed, with lower levels of overlap and a clearer overall structure, better capturing differences among distinct semantic dimensions. Taken together, perplexity has already stabilized, and although larger values of K achieve higher coherence, they carry a risk of over-segmentation. By contrast, K = 4 demonstrates superior performance in terms of topic distinctiveness and structural clarity. Hence, the number of topics is set to four.
As shown in Table 5, the topic structure of stage three exhibits pronounced diversification, with semantic attention centered on key issues such as “digitalization,” “enterprise,” “data,” and the “digital economy,” alongside a clear strengthening of institutional governance mechanisms. Among these, the “enterprise” and “data” themes have relatively high average scores (0.72 and 0.71, respectively), and both display positive sentiment proportions exceeding 70%, indicating that notable progress was achieved during this period in enterprise digital transformation and the development of data factor markets. Key terms associated with the “enterprise” theme, including “technology,” “integration,” “intelligence,” and “industrial chain,” reflect the deep embedding of digital technologies within production systems and the increasingly prominent role of enterprises in the industrial internet and industrial coordination. Meanwhile, high-frequency terms for the “data” theme, such as “factor,” “system,” “transaction,” “institution,” and “regulation,” indicate accelerated progress toward the institutionalization and standardization of data factor markets, with data governance capacity emerging as a central pillar supporting the development of the digital economy.
It is noteworthy that although the “digitalization” theme has a score of 0.70, the proportion of negative sentiment reaches 12.50%, a level that is relatively high compared with the other themes. This suggests that as digitalization increasingly permeates industrial and social systems, the processes of market expansion, industrial scaling, and regional development are accompanied by certain risks, pressures, and uncertainties. For example, the advancement of digital industrialization may encounter structural bottlenecks or market volatility, among other challenges.
In addition, the “digital economy” theme in stage three continues to maintain a relatively high average score (0.73) and a positive sentiment proportion approaching 80%. Its key terms, including “promotion,” “enhancement,” “growth,” “driving force,” and “advantage,” indicate that the digital economy still plays a prominent leading role at the macro-strategic level and is regarded as a key driver of optimizing economic structure and enhancing growth resilience. Taken as a whole, compared with earlier stages in which the topic structure leaned toward technological deployment or platform expansion, the topic system of stage three clearly reflects the coordinated advancement of institutional governance, factor markets, and enterprise actors. This pattern signifies a new stage in the development of the digital economy, characterized by a transition from rapid expansion to institutionalization, standardization, and high-quality development.
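The per-theme indicators discussed above (an average sentiment score plus the shares of positive and negative text) can be illustrated with a short sketch. It rests on stated assumptions: scores are taken to lie between 0 and 1, the cutoffs `pos_cut` and `neg_cut` are hypothetical, and the sample scores are invented for illustration. The scoring rules of the tool actually used in the study are not described in this section and may differ.

```python
from statistics import mean


def summarize_sentiment(scores, pos_cut=0.6, neg_cut=0.4):
    """Summarize segment-level sentiment scores for one theme.

    Assumes scores in [0, 1]; segments at or above pos_cut count as positive,
    segments at or below neg_cut count as negative, and the rest as neutral.
    Both cutoffs are illustrative assumptions.
    """
    n = len(scores)
    positive = sum(s >= pos_cut for s in scores)
    negative = sum(s <= neg_cut for s in scores)
    return {
        "mean_score": round(mean(scores), 2),
        "positive_pct": round(100 * positive / n, 2),
        "negative_pct": round(100 * negative / n, 2),
        "neutral_pct": round(100 * (n - positive - negative) / n, 2),
    }


# Hypothetical segment scores for a single theme (not data from the article).
theme_scores = [0.9, 0.8, 0.85, 0.75, 0.7, 0.65, 0.5, 0.3]
summary = summarize_sentiment(theme_scores)
print(summary)
```

Reporting the mean alongside the class shares is useful because a theme can have a fairly high average score while still containing a non-trivial share of negative segments, which is the pattern noted for the “digitalization” theme.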
A synthesis of the topic classification results across the three stages reveals a clear evolutionary trajectory of China’s digital economy, progressing from technology introduction to platform dominance and ultimately to institutional governance. During the exploratory start-up period (stage one), themes concentrated on “information,” “enterprises,” and “development,” highlighting informatization development, industrial integration, and the catalytic role of the internet, and reflecting a period in which the digital economy was still undergoing conceptual formation and path exploration. In stage two, the topic focus shifted from “information” to “platforms,” “enterprises,” and “data,” with “platforms” emerging as the most distinctive core issue while exhibiting pronounced sentiment polarization. This pattern suggests that the platform economy facilitated industrial chain restructuring and service innovation, yet was accompanied by structural controversies such as monopolistic tendencies and regulatory lag, thereby displaying a transitional character in which prosperity coexisted with risk.
In stage three, as the digital economy entered a period of deepening institutional development, the topic structure underwent a marked reconfiguration. “Enterprises” and “data” emerged as the most heavily weighted core themes, while “digitalization” and the “digital economy” functioned as overarching frameworks. In particular, the high scores and strong positive sentiment associated with the “data” theme indicate that the institutional construction of data factor markets has become a central component in the narrative of digital economy development. Although “governance” did not appear as a standalone theme, its logic is embedded within themes such as “data” through keywords including “system,” “institution,” and “regulation,” reflecting the integrated role of institutional governance in the evolution of the digital economy. Taken together, stage three fully demonstrates a historical shift in digital economy development from the identification of production factors to institutional consolidation, and from a platform-centered logic to a governance-oriented logic.
From the perspective of the longitudinal evolution of topic logic, this trajectory progresses from an initial phase of “technology-driven exploratory construction” to a stage characterized by “risk awareness amid platform expansion,” and ultimately to “governance-led institutional system building.” This evolutionary process reveals the semantic updating of digital economy themes and suggests alignment with broader developmental trends. Meanwhile, because these findings are derived from a continuous series of reports issued by a single think tank institution, they should be understood as reflecting a specific yet influential analytical perspective on China’s digital economy, rather than the full range of possible institutional or societal interpretations.

5. Discussion

5.1. Theoretical Implications

First, this study clarifies the fundamental trajectory of topic evolution in China’s digital economy over the past decade. Existing research has examined the key mechanisms driving China’s digital economy from multiple perspectives, including detailed analyses of platform economy expansion and regulation [9,10], pathways of industrial digital transformation [4,11], and the circulation of data as a production factor alongside institutional development [13,14]. Such studies are valuable for explaining specific issues; however, they generally adopt a single-issue focus or a cross-sectional perspective and seldom provide a systematic account of the overall evolutionary trajectory of digital economy development topics from a long-term, longitudinal standpoint. Although some policy-text–based studies have begun to examine topic evolution (e.g., Shen et al. [16]; Cai et al. [36]), their research objects are primarily confined to formal policy documents, leaving the continuous analysis of strategic research reports relatively underexplored. By conducting a longitudinal textual analysis of a series of research reports on China’s digital economy development over the past decade, this study systematically maps the stage-based evolution of the topic structure of China’s digital economy. It reveals a clear trajectory of transformation from “informatization-driven development” to “platform expansion,” and subsequently to “data as a production factor and institutional governance,” thereby illustrating the stage-wise systemic evolution of China’s digital economy. Compared with existing studies that offer parallel examinations of discrete topics, this paper uncovers the underlying progressive logic of digital economy development topics from the perspective of overall cognitive structure, thereby providing a more systematic analytical framework for understanding the stage-specific characteristics of China’s digital economy. 
However, this framework is constructed from the longitudinal analysis of reports issued by a single national-level think tank and thus should be regarded as an analytically valuable but not exhaustive account of how China’s digital economy has been understood across different institutional contexts.
Second, this study extends the theoretical application of information society theory to the digital economy. Information society theory and network society theory emphasize the foundational role of information technologies, network structures, and data flows in socio-economic transformation [17,18], and recent studies have further employed these frameworks to explain the formation of digital societies and the emergence of digital power structures [19,20]. In research on the digital economy, these theories are typically used as a macro-level analytical framework to illustrate overarching trends in digital transformation; however, systematic empirical analyses of how these theoretical dynamics are manifested in topic evolution within specific national contexts remain limited. By identifying a stage-wise trajectory from information to platforms and further to data governance through longitudinal text analysis, this study provides an empirical mapping of the core mechanisms proposed by information society theory. Rather than treating the theory as a general interpretive background, this study makes its underlying mechanisms more analytically explicit by examining how shifts in topic prevalence and semantic structures correspond to different aspects of the information society process. In particular, the transition from information infrastructure to platform-based organization and further to data-centered governance reflects a progressive deepening of informational logic, network structures, and institutional arrangements emphasized by the theory. Moreover, the textual evidence generated in this study, including changes in keyword centrality, semantic clustering, and topic structures, demonstrates that these theoretical dynamics are empirically observable in the evolving discourse of national-level research reports. 
In this sense, this study advances the application of information society theory by showing how its key mechanisms can be traced through measurable changes in topic structures, thereby strengthening the connection between abstract theory and empirical analysis.
Third, while this study is theoretically grounded in information society theory, it does not assume a fixed stage model of digital economy development. Instead, by analyzing changes in the topic structure of think tank report texts, it demonstrates how the information society process has unfolded in China’s digital economy. The findings indicate that topic evolution in these reports exhibits a progressive pattern, reflecting a gradual transformation of informational structures, organizational forms, and institutional arrangements. Rather than reiterating a predefined stage sequence, this pattern is inductively derived from the textual data, thereby providing empirical grounding for the core propositions of information society theory regarding the gradual reconfiguration of social structures [19,20]. In this sense, the contribution of this study lies not only in its alignment with existing theoretical expectations, but also in its ability to render these abstract mechanisms empirically observable through topic evolution in national-level think tank reports. It thus operationalizes the application of information society theory within the context of digital economy research, providing a data-driven perspective on how theoretical dynamics are manifested in a specific national setting.
Finally, by taking authoritative research reports as its primary analytical material, this study addresses a long-standing gap in the textual analysis literature, where such sources have received relatively limited attention. Existing text-based studies on the digital economy have predominantly focused on policy documents, strategic plans, and laws and regulations [15,36], based on the assumption that formal policy texts most directly reflect national strategic orientations. Although this research orientation is well justified, it has to some extent overlooked the mediating role of think tank outputs in shaping policy practice and the evolution of development topics. In particular, continuously published national-level think tank research reports, despite their substantial practical influence, have long remained underexamined in systematic textual analysis. By taking think tank reports published by CAICT as the research object, this study treats them as a key textual source for tracing topic evolution in the course of China’s digital economy development. Compared with existing studies that predominantly focus on policy texts, this paper broadens the range of source materials in digital economy text research and demonstrates the distinctive theoretical value of think tank research texts for understanding the underlying logic of national digital economy development.

5.2. Practical Implications

First, this study highlights the need to enhance the coordination mechanism of digital economy policies from the perspective of staged evolution. The analysis in this study indicates that the topics of China’s digital economy development exhibit a clear pattern of phased progression: initially driven by informatization, subsequently shifting toward platform expansion, and more recently focusing on data as a production factor and institutional governance. This evolutionary trajectory indicates that digital economy policies cannot remain effective over the long term under a single, fixed institutional framework; instead, policy priorities must be continuously adjusted in accordance with different stages of development. Therefore, at the policymaking level, a stage-oriented approach should be strengthened to avoid relying on static policies to address continually evolving developmental needs. Meanwhile, greater coordination across government departments is required to foster synergy among technological, industrial, and governance policies at different stages, thereby enhancing the overall coherence and adaptability of the digital economy policy system.
Second, this study suggests that there is a need to cultivate both the consensus and the capacity to transition from a platform-driven model to an industrial support system centered on firm capabilities and data as a key production factor. The findings indicate that platforms occupy a central position during the intermediate stage of the digital economy, but their importance is gradually superseded in later stages by enterprise actors and data elements. This shift implies that the industrial support system of the digital economy should move away from reliance on platform-scale expansion and instead place greater emphasis on strengthening firms’ digital capabilities and improving the allocation efficiency of data resources. In practice, the primary role of enterprises in technological application, data governance, and industrial coordination should be strengthened to consolidate the micro-level foundations of digital economy development. At the same time, institutional arrangements for the data factor market should be further developed and refined so that data can generate value through regulated circulation, thereby providing sustained momentum for high-quality development of the digital economy.
Furthermore, we recommend advancing the transformation of the digital economy from “point-based breakthroughs” to a “systematic layout” through regional coordination. Evidence from place-name frequency and spatial semantic analyses indicates that China’s digital economy has evolved from early pilot explorations concentrated in a few regions to a development pattern that increasingly emphasizes interregional coordination and nationwide deployment. This suggests that the digital economy has entered a new stage characterized by a transition from “point-based breakthroughs” to “systematic advancement.” Accordingly, at the regional development level, it is essential to avoid simply replicating the development trajectories of leading regions. Instead, stronger interregional division of labor and cooperation should be promoted to foster a multi-tiered and differentiated spatial structure for the digital economy. By reinforcing the leading role of core cities and enhancing regional coordination mechanisms, the overall coherence and sustainability of digital economy development at the national level can be further enhanced.
Finally, this study highlights the importance of embedding governance logic throughout the entire process of digital economy development in order to prevent the risks associated with “governance lag.” The findings reveal that governance-related issues become significantly more prominent in the later stages of the digital economy and are increasingly integrated with topics such as data, indicating that governance logic has emerged as a crucial pillar supporting its development. This result further suggests that governance should not be treated merely as a reactive response after problems emerge, but rather should be embedded throughout the entire process of digital economy development. In practice, therefore, industrial development and institutional construction should be advanced in parallel, with efforts to improve regulatory frameworks and coordination mechanisms to mitigate the accumulation of risks such as platform monopolies, data security threats, and social inequities. By proactively embedding and institutionalizing governance logic, a more reliable foundation can be established for the long-term and stable development of the digital economy.

6. Conclusions

From the analytical perspective of information society theory, this study takes the series of China Digital Economy Development Research Reports released by CAICT from 2015 to 2024 as its research corpus. Using text mining and topic analysis methods, this paper systematically examines the stage-based evolutionary characteristics of the topics underlying China’s digital economy development. The findings indicate that over the past decade, the topic structure of China’s digital economy has followed a clear progressive trajectory, evolving from “informatization-driven development” to “platform expansion,” and further to “data factors and institutional governance.” This trajectory indicates a shifting pattern in the research discourse of think tank reports and the analytical framing of the digital economy, moving from technological introduction, through the expansion of organizational and industrial forms, to the consolidation of institutional arrangements. Across these stages, the core focus of the digital economy has shifted from infrastructure development and industrial integration to the growth of the platform economy and enterprise development and ultimately to the construction of data factor markets and the improvement of governance systems. This topic evolution reflects changes in the analytical focus of the reports and suggests how governance-related issues are increasingly emphasized in the understanding of the digital economy, thereby providing empirical evidence on the evolution of discursive representations in think tank research texts, rather than definitive proof of macro-level structural transformation in the real economy. This is because the findings are derived from think tank report texts and therefore reflect a structured analytical perspective, rather than direct evidence of formal policy implementation or institutional change.
Although this study seeks to systematically reveal the topic evolution of China’s digital economy from a longitudinal perspective, several limitations remain. First, with regard to data sources, the analysis relies solely on a series of reports published by a single think tank institution. While these reports offer both authority and temporal continuity, they primarily reflect the evolving research perspective of this institution on China’s digital economy development, rather than a fully comprehensive representation of all possible viewpoints across policy, industry, and regional contexts. As such, the findings should be understood as capturing the stage-wise evolution of this institutional research perspective on the digital economy, and their generalizability to other policy actors, regions, or industrial contexts remains limited. Future research could incorporate policy documents, local development plans, or corporate reports to enable cross-validation and enhance the robustness of the findings. Nevertheless, given the authority of the institution, the close alignment of its reports with national policy agendas, and their role as policy-oriented research outputs, the findings still provide valuable insights into the mainstream analytical understanding of China’s digital economy development over time. Second, at the methodological level, this study primarily relies on text mining techniques such as word frequency analysis, semantic network analysis, and topic modeling. It therefore offers only limited insight into the causal relationships among topics, the institutional drivers behind them, and the underlying mechanisms. Future research could address this limitation by incorporating qualitative comparative approaches or in-depth case studies.
Finally, in terms of theoretical extension, although this study adopts information society theory as its explanatory framework and applies it in a contextualized manner to the analysis of digital economy theme evolution, it remains an open question whether similar trajectories emerge under different national and institutional settings. This issue calls for further verification through cross-national comparative research. Future studies may extend this line of inquiry by integrating multi-source data, combining methodological approaches, and adopting an international comparative perspective, thereby offering a more comprehensive understanding of the structural logic and evolutionary patterns of digital economy development.

Author Contributions

Conceptualization, G.X. and Y.T.; methodology, G.X.; software, G.X.; validation, G.X. and Y.T.; formal analysis, R.Z.; investigation, G.X.; resources, G.X. and Y.T.; data curation, G.X.; writing—original draft preparation, G.X.; writing—review and editing, G.X., Y.T. and R.Z.; visualization, G.X.; supervision, Y.T.; project administration, Y.T.; funding acquisition, G.X. and R.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Youth Project of Ministry of Education Humanities and Social Science Research, China, grant number 24YJC630299; the Doctoral Support Project of Fujian Provincial Social Science Fund Project, grant number FJ2024BF051; Natural Science Foundation of Xiamen Municipality, grant number 3502Z202573072; High-Level Scientific Research Project of Xiamen University of Technology, grant number YSK23014R; Innovation Strategy Project of Fujian Province Science and Technology Program, grant number 2026R0114; and the 2024 High-Level Talents Project in Social Sciences at Xiamen University of Technology, grant number YSK24005R.

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the authors on request.

Acknowledgments

The authors express their gratitude to the editor and anonymous reviewers for their numerous constructive comments and encouragement, which have significantly enhanced our paper.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
LDA: Latent Dirichlet Allocation
CAICT: the China Academy of Information and Communications Technology

References

  1. Belt and Road Portal. Guizhou Accelerates Construction of National Big Data (Guizhou) Comprehensive Pilot Zone. Available online: https://eng.yidaiyilu.gov.cn/p/99781.html?utm_source=chatgpt.com (accessed on 22 May 2025).
  2. Ma, R.; Lin, B. Digital infrastructure construction drives green economic transformation: Evidence from Chinese cities. Humanit. Soc. Sci. Commun. 2023, 10, 460.
  3. Chen, X.; Tang, X.; Xu, X. Digital technology-driven smart society governance mechanism and practice exploration. Front. Eng. Manag. 2023, 10, 319–338.
  4. Deevi, D.P.; Allur, N.S.; Dondapati, K.; Chetlapalli, H.; Kodadi, S.; Perumal, T. The impact of the digital economy on industrial structure upgrading and sustainable entrepreneurial growth. Electron. Commer. Res. 2024, 26, 863–887.
  5. Yang, H. Construction of the new development dynamic and development of digital economy: Internal logic and policy focus. China Political Econ. 2023, 6, 92–113.
  6. Qi, L.; Yang, L.; Ta, M. The Impact of Digital Economy on Regional Resource Allocation Efficiency: An Analysis Based on Resource Flow Speed and Direction. Financ. Res. Lett. 2025, 82, 107644.
  7. Wang, G.; Yang, Y. Quantitative evaluation of digital economy policy in Heilongjiang Province of China based on the PMC-AE index model. Sage Open 2024, 14, 21582440241234435.
  8. Detthamrong, U.; Nguyen, L.T.; Jaroenruen, Y.; Takhom, A.; Chaichuay, V.; Chotchantarakun, K.; Chansanam, W. Topic Modeling Analytics of Digital Economy Research: Trends and Insights. J. Scientometr. Res. 2024, 13, 448–458.
  9. McKnight, S.; Kenney, M.; Breznitz, D. Regulating the platform giants: Building and governing China’s online economy. Policy Internet 2023, 15, 243–265.
  10. Huang, Y. ‘Strong regulations’ of China’s platform economy: A preliminary assessment. China Econ. J. 2022, 15, 125–138.
  11. Xing, X.; Chen, T.; Yang, X.; Liu, T. Digital transformation and innovation performance of China’s manufacturers? A configurational approach. Technol. Soc. 2023, 75, 102356.
  12. Meng, T.; Li, Q.; He, C.; Dong, Z. Research on the configuration path of manufacturing enterprises’ digital servitization transformation. Int. Rev. Econ. Financ. 2025, 98, 103952.
  13. Ouyang, Y.; Hu, M. The impact of data elements marketization on corporate financing constraints: Quasi-experimental evidence from the establishment of data trading platforms in China. Financ. Res. Lett. 2024, 69, 106132.
  14. Zhang, L.; Zhang, X. Impact of digital government construction on the intelligent transformation of enterprises: Evidence from China. Technol. Forecast. Soc. Change 2025, 210, 123787.
  15. Hong, S.; Wang, T.; Fu, X.; Li, G. Research on quantitative evaluation of digital economy policy in China based on the PMC index model. PLoS ONE 2024, 19, e0298312.
  16. Shen, H.; Xiong, P.; Yang, L.; Zhou, L. Quantitative evaluation of science and technology financial policies based on the PMC-AE index model: A case study of China’s science and technology financial policies since the 13th five-year plan. PLoS ONE 2024, 19, e0307529.
  17. Bell, D. The Coming of Post-Industrial Society: A Venture in Social Forecasting; Basic Books: New York, NY, USA, 1973; Available online: https://www.hachettebookgroup.com/titles/daniel-bell/the-coming-of-post-industrial-society/9780465097135/?utm_source=chatgpt.com/?lens=basic-books (accessed on 20 June 2025).
  18. Castells, M. The Rise of the Network Society; Blackwell: Oxford, UK, 1996; Available online: https://books.google.com/books?hl=zh-CN&lr=&id=FihjywtjTdUC&oi=fnd&pg=PA1975&dq=The+Rise+of+the+Network+Society%EF%BC%8CCastells,+M.+(1996)&ots=l5Zn-UDSe-&sig=QPMnz98oG9TLiAhLfNYCBP3qQ2A#v=onepage&q=The%20Rise%20of%20the%20Network%20Society%EF%BC%8CCastells%2C%20M.%20(1996)&f=false (accessed on 22 June 2025).
  19. Castells, M. The network society revisited. Am. Behav. Sci. 2023, 67, 940–946.
  20. Liu, Y. Five perspectives on the digital: A sociological interpretation. J. Chin. Sociol. 2025, 12, 21.
  21. Jamanbalayeva, S.; Burova, E.; Sagikyzy, A.; Zhanabayeva, D.; Adamidi, A. The paradigm of the digital society: Synthesis of technocratic and socio-humanitarian approaches. Cogent Soc. Sci. 2025, 11, 2513460.
  22. Rong, K. Research agenda for the digital economy. J. Digit. Econ. 2022, 1, 20–31.
  23. Milskaya, E.; Seeleva, O. Main directions of development of infrastructure in digital economy. In IOP Conference Series: Materials Science and Engineering; IOP Publishing: Bristol, UK, 2019; Volume 497, p. 012081. Available online: https://iopscience.iop.org/article/10.1088/1757-899X/497/1/012081/meta (accessed on 29 June 2025).
  24. Gawer, A. Digital platforms and ecosystems: Remarks on the dominant organizational forms of the digital age. Innovation 2022, 24, 110–124.
  25. Gorwa, R. What is platform governance? Inf. Commun. Soc. 2019, 22, 854–871.
  26. Mayer, M.; Nock, P.J. Digital fragmentations, technological sovereignty and new perspectives on the global digital political economy. Glob. Political Econ. 2025, 4, 2–13.
  27. OECD. OECD Digital Economy Outlook 2020; OECD Publishing: Paris, France, 2020.
  28. Xu, N.; Xu, L.; Yan, X.W. Data factor marketization empowering enterprise innovation quality: New evidence from Chinese patent citations. Int. Rev. Econ. Financ. 2025, 103, 104433.
  29. Huang, J. The rise of data property rights in China: How does it compare with the EU data act and what does it mean for digital trade with China? J. Int. Econ. Law 2024, 27, 462–479.
  30. Xianbin, T.; Qiong, W. Sustainable digital economy through good governance: Mediating roles of social reforms and economic policies. Front. Psychol. 2021, 12, 773022.
  31. Dou, R.; Hou, Y.; Lin, K.Y.; Si, S.; Wei, Y. Transforming digital value chain ecosystems for dual-carbon target: An exploration of the BDS-RAS framework. Comput. Ind. Eng. 2024, 188, 109861.
  32. Kira, B.; Sinha, V.; Srinivasan, S. Regulating digital ecosystems: Bridging the gap between competition policy and data protection. Ind. Corp. Change 2021, 30, 1337–1360.
  33. Ghazal Masri, S.; El-Fadel, M. Governance of Digital Transformation for Sustainable Development: Aligning Digital Innovation with the Sustainable Development Goals. Front. Sustain. Cities 2026, 8, 1743552.
  34. Cong, X.; Liu, B.; Wang, L.; Su, P.; Zhang, S.; Liu, Y.; Ustinovičius, L.; Skibniewski, M.J. Exploration of multiple enhancing pathways of digital economy development of city clusters using fuzzy-set qualitative comparative analysis. Technol. Econ. Dev. Econ. 2024, 30, 1769–1804.
  35. Hunjra, A.I.; Zhao, S.; Goodell, J.W.; Liu, X. Digital economy policy and corporate low-carbon innovation: Evidence from a quasi-natural experiment in China. Financ. Res. Lett. 2024, 60, 104910.
  36. Cai, L.; Xiao, J.; Zuo, R. Research on the Evolution Characteristics of Policy System That Supports the Sustainability of Digital Economy: Text Analysis Based on China’s Digital Economy Policies. Sustainability 2025, 17, 3876.
  37. Zhang, W.; Zhang, M.; Yuan, L.; Fan, F. Social network analysis and public policy: What’s new? J. Asian Public Policy 2023, 16, 115–145.
  38. Malandrino, A. Comparing qualitative and quantitative text analysis methods in combination with document-based social network analysis to understand policy networks. Qual. Quant. 2024, 58, 2543–2570.
  39. Zhou, Q.; Xu, K.; Acur, N. How Do Governments Frame Responsible Innovation: A Topic Modelling Analysis of National Policy Portfolios. Technol. Soc. 2026, 86, 103257.
  40. Weiciyun. Analytical Principles. Available online: https://www.weiciyun.com/file/#/question-explain?id=%e5%88%86%e6%9e%90%e5%8e%9f%e7%90%86 (accessed on 20 February 2025).
  41. Feuerriegel, S.; Maarouf, A.; Bär, D.; Geissler, D.; Schweisthal, J.; Pröllochs, N.; Robertson, C.E.; Rathje, S.; Hartmann, J.; Mohammad, S.M.; et al. Using natural language processing to analyse text data in behavioural science. Nat. Rev. Psychol. 2025, 4, 96–111. [Google Scholar] [CrossRef]
  42. Tang, R.; Moon, J.; Heo, G.R.; Lee, W.S. Exploring the knowledge structure and potential research areas of sustainable tourism in sustainable development: Based on text mining and semantic network analysis. Sustain. Dev. 2024, 32, 3037–3054. [Google Scholar] [CrossRef]
  43. Zhan, J.; Jin, B. Does Pollyanna hypothesis hold true in death narratives? A sentiment analysis approach. Acta Psychol. 2024, 245, 104238. [Google Scholar] [CrossRef] [PubMed]
  44. Zimmermann, J.; Champagne, L.E.; Dickens, J.M.; Hazen, B.T. Approaches to improve preprocessing for Latent Dirichlet Allocation topic modeling. Decis. Support Syst. 2024, 185, 114310. [Google Scholar] [CrossRef]
Figure 1. Analytical framework of the study.
Figure 2. Stages of China’s digital economy development, 2015–2024.
Figure 3. Frequency distribution of place names.
Figure 4. Results of semantic network analysis. Note: the words in this figure (a–c) are translated from Chinese.
Figure 5. Sentiment analysis results across the three stages.
Figure 6. Scores of topics across different stages.
Figure 7. Results of topic number determination for stage one.
Figure 8. Results of topic number determination for stage two.
Figure 9. Results of topic number determination for stage three.
Table 1. Data sources.

| No. | Report Title | Publication Date |
|---|---|---|
| 1 | Research Report on the Development of China’s Digital Economy (2024) | August 2024 |
| 2 | Research Report on the Development of China’s Digital Economy (2023) | April 2023 |
| 3 | Research Report on the Development of China’s Digital Economy (2022) | May 2022 |
| 4 | White Paper on the Development of China’s Digital Economy | April 2021 |
| 5 | White Paper on the Development of China’s Digital Economy (2020) | July 2020 |
| 6 | White Paper on the Development of China’s Digital Economy and Employment (2019) | April 2019 |
| 7 | White Paper on the Development of China’s Digital Economy and Employment (2018) | April 2018 |
| 8 | White Paper on the Development of China’s Digital Economy (2017) | July 2017 |
| 9 | White Paper on the Development of China’s Information Economy (2016) | September 2016 |
| 10 | Research Report on China’s Information Economy (2015) | September 2015 |
Table 2. Results of word frequency analysis.

| Term (Stage One) | Frequency | TF-IDF | Term (Stage Two) | Frequency | TF-IDF | Term (Stage Three) | Frequency | TF-IDF |
|---|---|---|---|---|---|---|---|---|
| Information economy | 0.022206304 | 0.007969536 | Digital economy | 0.022244541 | 0.008178431 | Data | 0.027152832 | 0.0126185 |
| Development | 0.020296084 | 0.007066332 | Data | 0.018859502 | 0.010750125 | Digital economy | 0.016636497 | 0.008195494 |
| Growth | 0.013276027 | 0.006904027 | Development | 0.017073987 | 0.005908436 | Development | 0.015717036 | 0.006806867 |
| Digital economy | 0.013276027 | 0.008761065 | Growth | 0.014321318 | 0.006336213 | Growth | 0.01448151 | 0.007045111 |
| Information | 0.010553964 | 0.007297361 | Platform | 0.009485548 | 0.00617809 | Digitalization | 0.011407063 | 0.006718955 |
| Production | 0.009551098 | 0.006429269 | Digitalization | 0.008592791 | 0.005418033 | Enterprise | 0.009970405 | 0.006942691 |
| Enterprise | 0.009551098 | 0.007176993 | Enterprise | 0.008332403 | 0.005535057 | Industry | 0.008447548 | 0.005199522 |
| Economy | 0.009073543 | 0.006067389 | Employment | 0.006844474 | 0.006890357 | Platform | 0.006895957 | 0.004870712 |
| Internet | 0.0082617 | 0.005751204 | Industry | 0.006807276 | 0.004706777 | Construction | 0.006723558 | 0.004269865 |
| Innovation | 0.007402101 | 0.005483759 | Field | 0.006546888 | 0.00437784 | Realization | 0.006608626 | 0.00421582 |
| Index | 0.007402101 | 0.007113021 | Sector | 0.006472492 | 0.004795069 | Service | 0.006120162 | 0.004427828 |
| Technology | 0.007211079 | 0.005577232 | Internet | 0.006286501 | 0.004560339 | Promotion | 0.006033963 | 0.003884168 |
| Service | 0.007163324 | 0.005124882 | Service | 0.00610051 | 0.004617026 | Application | 0.005890297 | 0.004410171 |
| Field | 0.006829035 | 0.005168393 | Information | 0.006063311 | 0.004367991 | Digital | 0.005689165 | 0.004389651 |
| Data | 0.00678128 | 0.005442545 | Governance | 0.005951717 | 0.004812902 | Governance | 0.005401833 | 0.005119167 |
| Network | 0.006638013 | 0.005450255 | Production | 0.005914518 | 0.004819257 | Technology | 0.005143235 | 0.003889345 |
| Integration | 0.006303725 | 0.004541184 | Promotion | 0.005728527 | 0.003543229 | Innovation | 0.004798437 | 0.003858263 |
| Become | 0.005730659 | 0.004215684 | Industrial | 0.005691329 | 0.004745666 | Factors | 0.004626038 | 0.004177729 |
| Industry | 0.005539637 | 0.004729389 | Innovation | 0.00546814 | 0.004079792 | Market | 0.004511105 | 0.004499196 |
| Application | 0.005444126 | 0.004272849 | Application | 0.005282149 | 0.003677054 | Digital transformation | 0.004453639 | 0.004022037 |
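
As a reading aid for the Frequency and TF-IDF columns of Table 2, the sketch below computes a relative term frequency and an averaged TF-IDF weight on a toy corpus. It is only an illustration under stated assumptions: the exact weighting formula behind the reported values is not given in this section, so the smoothed IDF variant, the function name `term_stats`, and the toy tokens are all hypothetical choices, not the authors' procedure.

```python
import math
from collections import Counter


def term_stats(docs):
    """Return {term: (relative_frequency, mean_tfidf)} for tokenised documents.

    Assumptions (not taken from the article):
    - relative frequency = term count / total tokens in the corpus
    - TF-IDF = per-document TF x smoothed IDF, averaged over all documents
    """
    n_docs = len(docs)
    total_tokens = sum(len(d) for d in docs)
    corpus_counts = Counter(t for d in docs for t in d)
    doc_freq = Counter(t for d in docs for t in set(d))

    stats = {}
    for term, count in corpus_counts.items():
        rel_freq = count / total_tokens
        # Smoothed IDF: a common textbook variant, chosen here for illustration.
        idf = math.log((1 + n_docs) / (1 + doc_freq[term])) + 1
        tfidf = sum((d.count(term) / len(d)) * idf for d in docs if d) / n_docs
        stats[term] = (rel_freq, tfidf)
    return stats


# Hypothetical toy corpus: each inner list stands for one tokenised report.
toy_docs = [
    ["data", "platform", "data"],
    ["data", "governance"],
    ["platform", "growth", "growth", "growth"],
]
toy_stats = term_stats(toy_docs)
for term, (freq, tfidf) in sorted(toy_stats.items(), key=lambda kv: -kv[1][0]):
    print(f"{term:<12}{freq:.4f}  {tfidf:.4f}")
```

Under these assumptions, a term that appears in every document receives a lower IDF than one concentrated in a few documents, which is why the Frequency and TF-IDF rankings within a stage need not coincide.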
Table 3. Topic classification results of report texts in stage one.

| Topic Name | Average Score | Sentiment Distribution (Positive/Neutral/Negative) | High-Scoring Terms (Examples) |
|---|---|---|---|
| Information | 0.71 | 89.71%/8.09%/2.20% | Technology, service, industry, investment, and infrastructure |
| Information economy | 0.81 | 64.29%/32.14%/3.57% | Integration, sector, proportion, region, and nationwide |
| Development | 0.80 | 67.06%/32.94%/0.00% | Index, network, scale, internet plus, and trend |
| Enterprise | 0.71 | 75.19%/15.04%/9.77% | Innovation, market, platform, traditional, and model |
Table 4. Topic classification results of report texts in stage two.

| Topic Name | Average Score | Sentiment Distribution (Positive/Neutral/Negative) | High-Scoring Terms (Examples) |
|---|---|---|---|
| Platform | 0.71 | 41.18%/32.35%/26.47% | Internet, user, management, regulation, and issue |
| Enterprise | 0.73 | 77.98%/14.28%/7.74% | Field, service, production, innovation, and integration |
| Data | 0.65 | 81.40%/13.95%/4.65% | Factors, market, transactions, risk, and value |
| Digital economy | 0.79 | 56.03%/28.45%/15.52% | Digitalization, industry, sector, proportion, and scale |
Table 5. Topic classification results of report texts in stage three.

| Topic Name | Average Score | Sentiment Distribution (Positive/Neutral/Negative) | High-Scoring Terms (Examples) |
|---|---|---|---|
| Digitalization | 0.70 | 61.61%/25.89%/12.50% | Market, sector, nationwide, scale, and digital industrialization |
| Enterprise | 0.72 | 79.44%/16.11%/4.45% | Technology, integration, industrial internet, intelligence, and industrial chain |
| Data | 0.71 | 78.20%/14.28%/7.52% | Factors, system, transactions, institutions, and regulation |
| Digital economy | 0.73 | 79.91%/10.96%/9.13% | Promotion, enhancement, growth, driving force, and advantage |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Xie, G.; Tian, Y.; Zhang, R. Stage-Wise Systemic Evolution of China’s Digital Economy: Evidence from Topic Modeling of Think Tank Reports. Systems 2026, 14, 495. https://doi.org/10.3390/systems14050495


Note that from the first issue of 2016, this journal uses article numbers instead of page numbers.
