Article

Decoding the Developmental Trajectory of the New Power System in China via Bibliometric and Visual Analysis

by Yinan Wang 1, Heng Chen 1,*, Minghong Liu 2, Mingyuan Zhou 1, Lingshuang Liu 2 and Yan Zhang 3
1 School of Energy Power and Mechanical Engineering, North China Electric Power University, Beijing 102206, China
2 State Grid Xinjiang Electric Power Company Economic and Technological Research Institute, Ürümqi 830063, China
3 State Grid Xinjiang Electric Power Corporation, Ürümqi 830063, China
* Author to whom correspondence should be addressed.
Energies 2025, 18(18), 4809; https://doi.org/10.3390/en18184809
Submission received: 30 July 2025 / Revised: 24 August 2025 / Accepted: 8 September 2025 / Published: 10 September 2025

Abstract

Under the twin imperatives of climate change mitigation and sustainable development, achieving a low-carbon transformation of power systems has become a national priority. To clarify this objective, China issued the Blue Book on the Development of New Power System, which comprehensively defines the guiding concepts and characteristic features of a new power system. In this study, natural language processing-based keyword extraction techniques were applied to the document, employing both the TF-IDF and TextRank algorithms to identify its high-frequency terms as characteristic keywords. These keywords were then used as topic queries in the Web of Science Core Collection, yielding 1568 relevant publications. CiteSpace was employed to perform a bibliometric analysis of these records, extracting research hotspots in the new power system domain and tracing their evolutionary trajectories. The analysis revealed that “renewable energy” appeared 247 times as the core high-frequency term, while “energy storage” exhibited both high frequency and high centrality, acting as a bridge across multiple subfields. This pattern suggests that research in the new power system field has evolved from a foundation in renewable energy and storage toward smart grids, market mechanisms, carbon capture, and artificial intelligence applications. Taken together, these results indicate that early research was primarily grounded in renewable energy and storage technologies, which provided the technical basis for subsequent exploration of smart grids and market mechanisms. In the more recent stage, under the dual-carbon policy and digital intelligence imperatives, research hotspots have further expanded toward carbon capture, utilization, and storage (CCUS) and artificial intelligence applications. Looking ahead, interdisciplinary studies focusing on intelligent dispatch and low-carbon transition are poised to emerge as the next major research frontier.

1. Introduction

Within the context of climate change mitigation and the pursuit of sustainable development, establishing an energy system that is clean, low carbon, safe, and highly efficient has become a central objective of national energy strategy [1]. Renewable energy, as a vital approach to carbon emissions mitigation, has assumed an ever more prominent role in the decarbonization of energy systems [2]. However, the intermittency, unpredictability, and variability of renewable energy create substantial challenges for strengthening the balancing and support functions of the power system, and new energy integration remains a serious challenge [3]. The low-carbon transformation of the power sector has accordingly become a global priority. China, as the world’s largest energy consumer and carbon emitter, has taken a leading role in promoting this transition. According to the China Electricity Council, by the end of 2023, China’s total installed power generation capacity had reached 2.92 TW, with renewable energy accounting for 1.45 TW [4]. Of this, wind power exceeded 430 GW, and solar PV surpassed 610 GW [5], both ranking first worldwide. The share of non-fossil fuels in power generation also rose above 36%, highlighting the accelerating pace of the energy transition. To guide this process, the National Energy Administration issued the Blue Book on the Development of New Power System in 2023, which comprehensively defines the concept and features of China’s new power system. However, despite rapid growth in publications on this topic, there is still a lack of systematic evaluation of the field’s developmental trajectory and shifting research hotspots. Clarifying the evolutionary path of China’s new power system is not only necessary for understanding the technological shift from fossil fuel reliance to renewable-dominated systems but also of great practical significance.
Such analysis provides theoretical insights and policy references for achieving the “dual-carbon” targets while offering strategic guidance for balancing renewable integration, energy storage deployment, smart grid development, and emerging digital intelligence technologies [6].
Natural language processing (NLP) represents a key discipline within artificial intelligence, encompassing diverse research themes across multiple domains. Within NLP, keyword extraction constitutes a fundamental task [7]. Depending on the retrieval methodology, keyword extraction approaches can be broadly classified into supervised and unsupervised methods [8]. Traditional supervised learning frames keyword extraction as a classification task [9]. It requires manually labeling keywords and then iteratively refining the model to improve accuracy [10]. This approach incurs high time costs and demands large volumes of annotated data [11]. In contrast, unsupervised methods do not require prior labeling; they employ statistical analysis and modeling of candidate terms to automatically extract keywords from documents [12]. Owing to the high efficiency of unsupervised keyword extraction, numerous researchers have applied term-based techniques across diverse fields. Anshul Saxena et al. [13] utilized the term frequency–inverse document frequency (TF-IDF) algorithm to extract features from a 10,000-term vocabulary. They then trained predictive models using stochastic gradient descent, support vector classification (SVC), and relevance vector machine algorithms, ultimately developing a complication prediction model with high accuracy. Liu Hao et al. [14] proposed a keyword extraction model that integrates a sentiment dictionary with TF-IDF through a weighted allocation mechanism. This model achieved a 13.9% increase in accuracy compared with a conventional rule-based sentiment dictionary approach, and a 7.7% improvement over a standalone TF-IDF weighting scheme.
Their results demonstrate that although BoW performs effectively in short text classification tasks, TF-IDF remains the preferred technique for keyword extraction in search engine applications. Accordingly, this study employs both the TF-IDF algorithm and the TextRank algorithm to perform keyword extraction on the document.
Bibliometric analysis represents an approach that combines mathematical and statistical methods to capture the evolving dynamics across diverse research domains [15,16]. This methodology has now been widely adopted for hotspot analysis across a broad range of academic disciplines. In the medical domain, Dino Fanfan et al. [17] applied bibliometric techniques to examine the 100 most influential publications in the field of sarcomas worldwide and conducted a detailed assessment of the countries and journals contributing most significantly to this research area. In the field of economics, Godwin Ahiase et al. [18] employed the bibliometric software VOSviewer (version 1.6.17) to perform a quantitative analysis of the literature in the realm of Digital Financial Inclusion (DFI), revealing a pronounced increase in publication output in 2022. In the computer science domain, Nikolaj Goranin et al. [19] performed a bibliometric analysis of the Internet of Things market literature indexed in the Web of Science (WOS) database, revealing the field’s key emerging research trends.
Recent studies have begun to explicitly investigate the developmental trajectories of power systems from multiple perspectives. At the industry level, Hui Wei et al. [20] employed a dynamic multidimensional cloud model under the PESTEL framework to quantitatively evaluate the evolution of China’s virtual power plant (VPP) industry between 2015 and 2023, highlighting the critical roles of policy, law, and economic drivers while identifying limitations in marketization and large-scale deployment. At the system level, Yilun Luo et al. [21] proposed a hybrid system dynamics model that integrates optimization modules within a system dynamics framework to simulate power mix trajectories in liberalized electricity markets, with particular emphasis on the impacts of carbon pricing and capacity mechanisms. Their results demonstrate that such a hybrid approach provides more realistic insights into system evolution, especially under high shares of variable renewable energy (VRE). At the policy level, Tingting Liu [22] applied data visualization and paradigm trajectory analysis to map the evolution of China’s wind power policies, identifying five distinct phases characterized by subsidy reforms, market mechanism construction, and policy goal decomposition. The study also underscored persistent challenges such as grid integration and flexibility. These studies collectively provide a multidimensional perspective on the development trajectory of power systems. However, they also reveal certain shortcomings: industry-oriented analyses often lack connection to conceptual policy evolution, system models may overlook discourse and institutional drivers, and policy research rarely links research findings to broader academic hot topics.
In summary, natural language processing and bibliometric analysis have been applied, advanced, and integrated across multiple disciplines, and existing trajectory studies on power systems have largely focused on specific subfields such as virtual power plants, power mix modeling, or wind power policy. However, no study has employed both methods to conduct a macro-level investigation of the development trajectory of the new power system domain. Addressing this gap, the present study combines NLP-based keyword extraction with bibliometric analysis to systematically trace how policy-driven concepts, articulated in the Blue Book on the Development of New Power System, have shaped and been interpreted within the scholarly literature. Specifically, high-frequency keywords identified by both the TF-IDF and TextRank algorithms were selected as representative features and used as topic terms to query the Web of Science Core Collection. The retrieved publications were then analyzed bibliometrically using CiteSpace to identify research hotspots in the new power system domain and to trace their evolutionary trajectories.
The main innovations of this study are as follows: (1) This study integrates natural language processing (TF-IDF and TextRank algorithms) with bibliometric analysis (CiteSpace), thereby establishing a replicable methodological framework for extracting and analyzing research hotspots in the new power system domain. (2) By systematically analyzing 1568 publications from the Web of Science Core Collection (2015–2024), this research reveals the temporal evolution of research themes, highlighting the shifts from renewable energy and storage to smart grids, market mechanisms, CCUS, and deep learning. (3) The study provides a conceptual mapping of the multi-stage development of new power system research, offering insights into the trajectory from foundational renewable energy studies to interdisciplinary integration. (4) Policy implications are derived from the empirical findings, emphasizing the necessity of intelligent dispatch, flexible operation, and low-carbon transition, which provide guidance for China’s ongoing power system transformation.

2. Methodology

2.1. Analysis Method

This study establishes a combined framework of a natural language processing model and a CiteSpace-based bibliometric visualization model to extract research hotspots in the new power system domain and analyze their evolutionary trajectories. First, in the Data Preprocessing phase, the Blue Book on the Development of New Power System is ingested as raw text and segmented using the Jieba Chinese tokenizer (version 0.42.1, available at https://github.com/fxsjy/jieba, accessed on 1 January 2023). The resulting tokens are then cleaned and filtered through a custom “power industry lexicon” and a standard stop-word list to remove irrelevant or overly frequent terms and other noise. Next, in the keyword extraction phase, two complementary algorithms are applied: a TF-IDF model to quantify each term’s discriminative power across the corpus and the TextRank algorithm to capture term importance via a graph-based ranking approach. Each method yields high-frequency keywords. The choice of TF-IDF and TextRank is particularly suited to the analysis of the Blue Book for two reasons. First, the Blue Book is a concise yet domain-specific policy document, in which technical terms appear with distinctive statistical distributions. TF-IDF effectively highlights these discriminative terms. Second, as the document emphasizes conceptual frameworks and interlinked strategies, TextRank is advantageous in capturing semantically important terms through graph-based ranking. Compared with more complex methods such as LDA or BERT-based embeddings, which require large-scale training corpora and introduce interpretability challenges, TF-IDF and TextRank provide a transparent, computationally efficient, and complementary approach. This ensures that both statistically distinctive and semantically central concepts are identified, aligning with the objectives of this bibliometric study. 
Finally, in the Bibliometric and Visual Analysis phase, these keywords are used to retrieve relevant articles from the Web of Science Core Collection. A co-word matrix is constructed and pruned using a projection pursuit method to simplify network edges. On the pruned network, co-occurrence analysis, clustering, and temporal evolution analyses are conducted to reveal the field’s thematic structure and dynamics. Leveraging CiteSpace, we then generate a keyword co-occurrence map, cluster network visualizations based on Latent Semantic Analysis (LSA) and Log-Likelihood Ratio (LLR) algorithms, and a keyword timeline, thereby providing a clear, quantitative depiction of research hotspots, intellectual clusters, and developmental trends in the new power systems domain. The research process is illustrated in Figure 1.

2.2. Natural Language Processing Model (NLP)

2.2.1. Jieba Word Segmentation Tool

The text segmentation technique is a method for dividing continuous text into discrete lexical units. With the ongoing advancement of natural language processing technologies, notable progress has been made in the field of Chinese word segmentation, and numerous high-quality, open-source segmentation software tools have emerged accordingly. Among these, Jieba segmentation has gained considerable recognition due to its high efficiency, precise performance, exceptional support for Chinese, and superior extensibility. Additionally, Jieba segmentation allows for user-defined vocabularies and stop-word lists, offering significant flexibility. Consequently, this system utilizes Jieba for text segmentation tasks.
Jieba segmentation provides three segmentation modes: Full Mode, Precise Mode, and Search Engine Mode. The Full Mode attempts to segment as many words as possible from the sentence, potentially resulting in duplication of characters. The default mode of Jieba is the Precise Mode, which follows the library’s built-in segmentation strategy, ensuring each word extracted is unique within the dictionary and avoiding redundancy, thus making it particularly suitable for general text analysis tasks. The Search Engine Mode tends to split longer phrases into shorter segments to better cater to specific requirements, such as search engine queries.
Given that the power industry involves highly specialized terminology and a wide range of content, the word segmentation model must accurately extract terminology while maintaining text integrity. Therefore, the Search Engine Mode was selected to perform the word segmentation task. To further improve word segmentation accuracy, this study developed a custom vocabulary for the power industry within Jieba, comprising approximately 120 domain-specific terms such as “new power system”, “Source-Grid-Load-Storage”, “virtual power plant”, “urban power grid”, “flexibility retrofit of coal power”, “renewable energy consumption”, “pumped storage”, and “hydrogen energy technology”. Additionally, a stop-word list was applied to filter out common function words and irrelevant punctuation marks, such as “show”, “should”, and “.”.
To ensure transparency and reproducibility, the construction of the custom vocabulary and stop-word list followed strict standards. The custom vocabulary primarily draws from domain-specific terminology appearing in the Blue Book on the Development of New Power System and national power industry standards and was supplemented and verified by two senior experts with over ten years of experience in power system research to ensure comprehensiveness and accuracy. The stop-word list is adapted from publicly available standard stop-word lists and iteratively optimized by excluding high-frequency function words such as “the” and “and” and filler words with low semantic value in the domain context. The criteria for retaining or removing keywords are clearly defined: terms semantically related to the new power system domain are retained, such as “renewable energy”, “energy storage”, and “carbon neutrality”. Conversely, extremely low-frequency, generic non-technical terms, such as “development” and “issue”, and words that are statistically insignificant in TF-IDF or TextRank weights are removed. In this study, “noise” is operationally defined as high-frequency generic terms lacking thematic specificity, such as “system” or “technology” when used alone, punctuation marks, and grammatical particles.
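The filtering rules described above can be illustrated with a minimal sketch. It operates on tokens that have already been segmented and does not call Jieba itself. The word lists and the names `STOP_WORDS`, `GENERIC_TERMS`, `is_noise`, and `clean_tokens` are hypothetical stand-ins, since the actual lists used in the study are not reproduced in the text.

```python
import string

# Hypothetical stand-ins for the study's lists, which are not published in the text.
STOP_WORDS = {"show", "should", "the", "and"}
GENERIC_TERMS = {"system", "technology", "development", "issue"}
CJK_PUNCT = "。，、；：？！“”‘’（）《》"


def is_noise(token):
    """Return True for punctuation, stop words, and generic terms used alone."""
    t = token.strip()
    if not t:
        return True
    if all(ch in string.punctuation or ch in CJK_PUNCT for ch in t):
        return True
    low = t.lower()
    return low in STOP_WORDS or low in GENERIC_TERMS


def clean_tokens(tokens):
    """Keep only tokens that are not classified as noise."""
    return [t.strip() for t in tokens if not is_noise(t)]


tokens = ["new power system", "should", "energy storage", ".",
          "technology", "pumped storage", "the"]
cleaned = clean_tokens(tokens)
```

In this sketch a generic word such as “system” is discarded only when it stands alone, whereas a compound term such as “new power system” is retained, mirroring the operational definition of noise given above.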

2.2.2. Jieba Based on the TF-IDF Algorithm

The TF-IDF algorithm, grounded in statistical principles, quantitatively assesses the importance of a term within a single document relative to an entire corpus. It combines term frequency (TF) and inverse document frequency (IDF) to provide a precise evaluation of word significance [23]. TF measures the frequency of a specific term in the target document, whereas IDF indicates the rarity of that term across the entire corpus. For a dataset D containing |M| documents, the term frequency $TF_{i,j}$ of term $W_{i,j}$ (the i-th term of the j-th document) and its inverse document frequency $IDF_{i,j}$ are defined by Equations (1) and (2), respectively.
$$TF_{i,j} = \frac{n_{i,j}}{\sum_{k} n_{k,j}} \tag{1}$$
$$IDF_{i,j} = \log \frac{|M|}{1 + \left|\{d \in D : W_{i,j} \in d\}\right|} \tag{2}$$
Here, $n_{i,j}$ denotes the number of occurrences of term $W_{i,j}$ in document j; $\sum_{k} n_{k,j}$ denotes the total number of term occurrences in document j; $|M|$ denotes the number of documents in D; and $\left|\{d \in D : W_{i,j} \in d\}\right|$ denotes the number of documents in which the term appears. If a term appears in no document, this count is zero; one is therefore added to it so that the denominator remains strictly positive.
Combining TF and IDF yields a weight for a given term. A larger weight indicates a higher importance of the term in the document. The detailed computation is presented in Equation (3).
$$t_{i,j} = TF_{i,j} \times IDF_{i,j} = \frac{n_{i,j}}{\sum_{k} n_{k,j}} \times \log \frac{|M|}{1 + \left|\{d \in D : W_{i,j} \in d\}\right|} \tag{3}$$
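As an illustration of the TF-IDF weighting defined by Equations (1)–(3), the following sketch computes the weights for a small toy corpus using only the Python standard library. It assumes the natural logarithm and the product form of the weight; the function name `tf_idf` and the sample documents are hypothetical and are not taken from the study.

```python
import math
from collections import Counter


def tf_idf(docs):
    """docs: list of token lists. Returns one {term: weight} dict per document."""
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter()
    for doc in docs:
        df.update(set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = sum(counts.values())                 # all term occurrences in the document
        doc_weights = {}
        for term, n in counts.items():
            tf = n / total                           # Equation (1)
            idf = math.log(n_docs / (1 + df[term]))  # Equation (2), smoothed denominator
            doc_weights[term] = tf * idf             # Equation (3)
        weights.append(doc_weights)
    return weights


# Toy corpus (illustrative only).
docs = [
    ["energy", "storage", "energy", "grid"],
    ["grid", "market"],
    ["grid", "carbon"],
    ["market", "carbon"],
]
weights = tf_idf(docs)
top_term = max(weights[0], key=weights[0].get)
```

With the smoothed denominator, a term that occurs in every document receives a slightly negative IDF, so such terms fall to the bottom of the ranking rather than merely scoring zero.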

2.2.3. Jieba Based on TextRank Algorithm

TextRank represents an enhanced adaptation of the PageRank algorithm [24], and it is widely employed in text analysis tasks, especially for the automatic identification and extraction of salient terms. PageRank, which underlies TextRank, was originally designed to address webpage ranking and hyperlink analysis. Its fundamental concept treats the World Wide Web as a graph in which each webpage constitutes a node and each hyperlink serves as an edge connecting two nodes. A hyperlink from one page to another signifies a directed edge from the source node to the target node. The importance of a webpage increases with the number of incoming links. Initially, PageRank assigns an identical weight to every node; through successive iterations, these weights are updated to reflect the relative importance of each node within the entire network. This iterative procedure converges to a stable set of values—known as PageRank scores—which quantify the significance of each webpage for search engine ranking purposes.
Building on this concept, the TextRank algorithm models a document as a graph in which terms or sentences function as nodes. Iterative computations are performed on these nodes until convergence is reached, producing weight values that are then sorted in descending order. The top N nodes correspond to the most significant keywords. The core computation of TextRank is defined by Equation (4).
$$s(v_i) = (1 - \lambda) + \lambda \sum_{j \in N(v_i)} \frac{1}{|N(v_j)|}\, s(v_j) \tag{4}$$
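A minimal sketch of the iteration in Equation (4) is given below, where λ is the damping factor and N(v) is the set of neighbors of node v. It assumes an undirected, unweighted co-occurrence graph in which two terms are linked when they fall within the same sliding window. The function name `textrank`, the fixed iteration count, and the sample tokens are illustrative assumptions rather than a description of Jieba's internal implementation.

```python
from collections import defaultdict


def textrank(tokens, window=4, damping=0.85, iterations=100):
    """Rank terms by iterating Equation (4) on a co-occurrence graph."""
    # Link each token to the next (window - 1) tokens, so that a window
    # spans `window` consecutive tokens in total.
    neighbors = defaultdict(set)
    for i, term in enumerate(tokens):
        for other in tokens[i + 1: i + window]:
            if other != term:
                neighbors[term].add(other)
                neighbors[other].add(term)
    # Every node starts with the same score.
    scores = {term: 1.0 for term in neighbors}
    for _ in range(iterations):
        scores = {
            term: (1 - damping) + damping * sum(
                scores[nb] / len(neighbors[nb]) for nb in neighbors[term]
            )
            for term in neighbors
        }
    # Highest score first; the top N entries are the extracted keywords.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Illustrative token sequence (not taken from the Blue Book).
tokens = ["renewable", "energy", "storage", "grid", "energy",
          "storage", "market", "energy", "dispatch"]
ranked = textrank(tokens)
```

A fixed number of iterations is used here for simplicity; a convergence threshold on the change in scores between iterations would serve equally well.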

2.3. Bibliometric and Visual Analysis Model Based on Citespace

2.3.1. Data Source

Because the concept of “new power system” derives from the Blue Book on the Development of New Power System in China, researchers outside China may employ different terminology, so a search restricted to this phrase could overlook pertinent studies. To address this issue, this study selected the ten most frequent terms generated by the two NLP algorithms as thematic keywords and conducted a Topic Search in the WOS Core Collection covering the years 2015–2024, using the following Boolean expression: TS = (“New energy” OR “Source-Grid-Load-Storage” OR “Power market” OR “Renewable energy” OR “Advanced energy storage” OR “System safety” OR “Pumped-storage hydroelectricity” OR “Carbon capture, utilization, and storage (CCUS)” OR “New energy generation” OR “Power supply”). The initial search returned 1958 records. To ensure quality and relevance, several exclusion criteria were applied. First, non-Article publication types (e.g., proceedings papers, letters, editorials, book chapters, and meeting abstracts) were excluded [25]. Second, duplicate entries and records with incomplete bibliographic metadata were removed. Finally, the remaining publications were manually screened by reviewing their titles, abstracts, and author keywords to eliminate papers unrelated to the theme “new power system”. After this multi-stage filtering, a total of 1568 relevant publications were identified and used as input data for CiteSpace. The specific process is shown in Figure 2. All bibliographic information (author names, source titles, publication years, citation counts, and so forth) was exported in plain-text format for subsequent scientometric analysis.
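Because a missing operator or parenthesis changes the meaning of a Boolean query, the topic expression can be assembled programmatically from the keyword list. The sketch below is one way to do so; the helper name `build_topic_query` is hypothetical, and the output is simply a string to be pasted into the search interface.

```python
# The ten topic keywords reported in the text.
KEYWORDS = [
    "New energy",
    "Source-Grid-Load-Storage",
    "Power market",
    "Renewable energy",
    "Advanced energy storage",
    "System safety",
    "Pumped-storage hydroelectricity",
    "Carbon capture, utilization, and storage (CCUS)",
    "New energy generation",
    "Power supply",
]


def build_topic_query(keywords):
    """Quote each phrase and join all phrases with OR inside one TS=(...) clause."""
    quoted = [f'"{kw}"' for kw in keywords]
    return "TS=(" + " OR ".join(quoted) + ")"


query = build_topic_query(KEYWORDS)
```

Generating the expression from a list guarantees that every pair of adjacent phrases is separated by exactly one OR and that the parentheses are balanced.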
To assess the robustness of the search strategy, sensitivity analyses were conducted by modifying the search parameters. Specifically, three variations were tested: the inclusion of review articles in addition to articles, removal of the English-language restriction, and extension of the timespan to 2010–2024. The results of these analyses are visualized in Figure 3. While the number of retrieved and included records varied slightly across scenarios, the resulting cluster structures and thematic trends remained consistent. This confirms that the search strategy is stable, and the dataset is representative of the research landscape of the new power system.

2.3.2. Analysis Method Based on Citespace

CiteSpace is a bibliometric analysis and visualization tool developed by Professor Chaomei Chen at Drexel University [26]. It is widely employed for research trend analysis, knowledge graph construction, frontier identification, and mapping the trajectories of scientific development [27]. By mining and extracting metadata, such as authors, institutions, countries, and keywords, from the retrieved publications, co-occurrence and clustering analyses were performed [28]. The workflow comprises document conversion, time-slice configuration, threshold range adjustment, Pathfinder pruning, and other related steps [29]. The primary methods of co-occurrence analysis include co-citation analysis and co-word analysis [30]. In this study, we configured the CiteSpace parameters as follows: the analysis was conducted with a 1-year slice length covering 2015–2025. The selection criteria employed a g-index with k = 15, a link retaining factor of 2.5, and cosine similarity for link strength. Keywords were chosen as node types, and Pathfinder pruning was consistently applied. The resulting network contained 278 nodes and 412 links (density = 0.0107), with the largest component including 99% of nodes. Co-citation analysis is a bibliometric method used to identify the intellectual affinity between two publications [31]. This approach holds that when two papers are cited jointly within a third document, they share an intellectual connection [32]. Frequently co-cited works form distinct clusters, which together constitute a co-citation network. Analysis of this network enables the identification of a field’s seminal publications and principal thematic areas. Co-word analysis differs from co-citation analysis in that it treats keywords as network nodes and creates links according to their co-occurrence within the same document [33]. By doing so, it uncovers the semantic structure of research themes and traces the evolution of topical hotspots [34].
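The co-word construction and the reported network density can be illustrated with a short sketch. It assumes that the cosine link strength between two keywords is their co-occurrence count divided by the square root of the product of their individual frequencies, and that density is the share of possible undirected links that are present. The function names and sample records are hypothetical, and Pathfinder pruning is not reproduced here.

```python
import math
from collections import Counter
from itertools import combinations


def coword_network(records):
    """records: list of keyword lists, one per publication."""
    freq = Counter()  # number of records containing each keyword
    pair = Counter()  # number of records containing both keywords of a pair
    for keywords in records:
        unique = sorted(set(keywords))
        freq.update(unique)
        pair.update(combinations(unique, 2))
    # Cosine-normalized link strength: c_ij / sqrt(c_i * c_j).
    strength = {
        (a, b): c / math.sqrt(freq[a] * freq[b]) for (a, b), c in pair.items()
    }
    return freq, strength


def density(n_nodes, n_links):
    """Share of the possible undirected links that are present."""
    return 2 * n_links / (n_nodes * (n_nodes - 1))


# Illustrative records (not drawn from the dataset).
records = [
    ["renewable energy", "energy storage"],
    ["renewable energy", "energy storage", "smart grid"],
    ["renewable energy", "electricity market"],
]
freq, strength = coword_network(records)
reported_density = density(278, 412)
```

Under this definition, 278 nodes and 412 links give 2 × 412 / (278 × 277) ≈ 0.0107, which agrees with the density reported above.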
Clustering analysis in this study employs two principal algorithms, LSA [35] and the LLR method [36]. LSA is a matrix factorization-based technique for uncovering latent semantic structures within text [37]. First, this method constructs a term–document co-occurrence matrix for the corpus under analysis, where each row represents a term and each column represents a document. The entries in the matrix typically correspond to raw term frequencies or TF-IDF-weighted values [38]. The term–document matrix is then subjected to Singular Value Decomposition (SVD). By retaining the largest singular values and their corresponding vectors, dimensionality is reduced, noise is eliminated, and the underlying thematic structure is extracted [39]. Finally, within the reduced latent semantic space, both documents and terms are represented as vectors, and semantic overlap between text segments is quantified using measures such as cosine similarity [40]. Unlike LSA, the LLR method computes LLR values and applies statistical tests to identify high-frequency, highly specific keywords, rendering it especially well suited for delineating topic boundaries and extracting representative labels [41].
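To make the LLR idea concrete, the sketch below computes the log-likelihood ratio statistic (G²) for a 2 × 2 contingency table that compares how often a term occurs inside and outside a cluster. This is the standard form of the test; whether CiteSpace applies exactly this variant when labeling clusters is an assumption, and the function names are hypothetical.

```python
import math


def _term(observed, expected):
    # Contribution O * ln(O / E), with the convention 0 * ln(0) = 0.
    return observed * math.log(observed / expected) if observed > 0 else 0.0


def log_likelihood_ratio(k11, k12, k21, k22):
    """G^2 statistic for a 2x2 table with a non-zero total.

    k11: records in the cluster that contain the term
    k12: records in the cluster that do not contain it
    k21: records outside the cluster that contain it
    k22: records outside the cluster that do not contain it
    """
    total = k11 + k12 + k21 + k22
    row1, row2 = k11 + k12, k21 + k22
    col1, col2 = k11 + k21, k12 + k22
    # Pair each observed count with its expected count under independence.
    cells = [
        (k11, row1 * col1 / total),
        (k12, row1 * col2 / total),
        (k21, row2 * col1 / total),
        (k22, row2 * col2 / total),
    ]
    return 2 * sum(_term(o, e) for o, e in cells)


# A term confined to one cluster scores high; an evenly spread term scores zero.
specific = log_likelihood_ratio(10, 0, 0, 10)
uniform = log_likelihood_ratio(5, 5, 5, 5)
```

A term that is concentrated in one cluster yields a large statistic and is therefore a good candidate label, whereas a term distributed evenly across clusters yields a value of zero.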
In summary, CiteSpace applies information visualization techniques, bibliometric methods, and data mining algorithms to analyze the citations and content of the published scientific literature, thereby detecting and visualizing emerging research trends [42].

3. Results and Discussion

3.1. Keyword Extraction

The Blue Book on the Development of New Power System in China is a strategic document issued by China’s National Energy Administration in 2023 that first introduced the concept of the new power system. To extract its characteristic terminology, this study applied Jieba word segmentation with a custom dictionary and stop-word list to the Blue Book text. Keywords were then identified and ranked using both TF-IDF and TextRank algorithms. For TextRank, the damping factor λ was set to 0.85, the co-occurrence window size was fixed at 4, and the initial Top K was set to 50, from which the top 6 keywords were retained; TF-IDF used the same settings. Since the Blue Book is in Chinese, keyword extraction was first conducted in Chinese, and an explicit cross-language mapping was constructed for WoS queries (e.g., “新能源”→“new energy” and “源网荷储”→“Source-Grid-Load-Storage”). The results are presented in Table 1 and Table 2. Terms such as new energy, new power system, and renewable energy consistently appeared among the highest-ranked keywords in both methods, underscoring the central role of renewable technologies in new power system design. The phrase Source-Grid-Load-Storage also ranked prominently, reflecting sustained interest in multi-energy coordination and integrated system architectures. Accordingly, this study selected the following 10 keywords for Topic Search queries in the Web of Science Core Collection: new energy, Source-Grid-Load-Storage, power market, renewable energy, advanced energy storage, system safety, pumped-storage hydroelectricity, carbon capture, utilization, and storage (CCUS), new energy generation, and power supply.
To validate the robustness of the TF-IDF and TextRank keyword extraction process, we adopted a two-step procedure. First, the extracted keywords were cross-checked against domain-specific terminology emphasized in the Blue Book on the Development of New Power System. Terms such as “renewable energy”, “energy storage”, “Source-Grid-Load-Storage”, and “carbon capture” were found to be consistent with the central concepts repeatedly highlighted in the Blue Book. Second, we compared the extracted terms with those reported in recent review articles and policy analyses. For example, Cavus [43] emphasized “renewable energy”, “smart grid”, “system safety”, and “advanced energy storage” as cornerstones of modern grid transformation, while Husin et al. [44] highlighted the integration challenges of “renewable energy sources”, “grid stability”, and “smart grid technologies”. This high degree of overlap between our extracted terms and those highlighted in the authoritative literature and bibliometric analyses confirms the validity of our keyword set.

3.2. Analysis of Annual Publications

Between 2015 and 2024, a total of 1568 publications were identified using the specified retrieval strategy. Figure 4 presents both annual and cumulative publication counts, using one-year time slices. Since 2015, the annual output in the new power system domain has risen steadily, with the cumulative total growing by approximately 3.5 times over the decade, signaling increasing scholarly interest and expanding research output. In particular, annual publications jumped from 119 in 2019 to 164 in 2020, driven by energy transition and carbon neutrality policies alongside breakthroughs in energy storage and flexible dispatch technologies. After a slight downturn in 2021, output rebounded sharply in 2022–2024, reaching a record 282 publications in 2024. These trends indicate that research activity in new power systems is accelerating. Under ongoing policy support and continued technological advancement, both the volume of publications and the depth of research are expected to keep growing.

3.3. Analysis of Keyword Co-Occurrence

Term frequency refers to the number of times a term appears within the analyzed document [45]. By analyzing term frequencies, high-frequency keywords within the new power system domain can be identified. These keywords serve as proxies for the research hotspots that emerged in this field between 2015 and 2024. Table 3 lists the ten most frequent keywords. It should be noted that the keywords in Table 1 and Table 3 are derived from different perspectives. Table 1 presents high-frequency keywords extracted directly from the Blue Book on the Development of New Power System through semantic analysis, reflecting the terminology emphasized in the policy document itself. By contrast, Table 3 lists the high-frequency keywords obtained after using the Table 1 terms as queries for Web of Science literature retrieval, followed by co-occurrence analysis in CiteSpace, thereby reflecting academic research hotspots. Together, the two sets of keywords provide complementary insights into policy discourse and scholarly focus. Analysis indicates that “renewable energy” emerges as the field’s central term, while “energy storage” and “storage” exhibit both high frequency and high centrality. This suggests that energy storage not only constitutes a major research hotspot but also serves as a pivotal bridge connecting subthemes such as modeling, operation, and system integration. Specifically, energy storage plays a bridging role by enabling the large-scale integration of renewable energy into the power grid and by supporting the stable operation of smart grids. On the one hand, storage technologies absorb excess electricity from variable sources such as wind and solar, thereby mitigating fluctuations and enhancing the reliability of renewable energy integration. On the other hand, they provide flexibility for smart grid operation by balancing supply and demand in real time and ensuring system resilience.
Therefore, “energy storage” occupies a central position across multiple research subfields. Although the term electricity market ranks high in frequency, its centrality value is low (0.01). This suggests that research on electricity market mechanisms remains relatively limited and has not yet established connections with other themes. In contrast, topics such as integration and operation serve as more effective bridges between research areas. In particular, integration is emphasized in the roadmap for new power system development as a pivotal strategy to enhance system flexibility, resilience, and decarbonization. Energy system integration entails the coordinated operation of multiple energy carriers, enabling cross-sectoral synergies and holistic optimization. For example, electricity–gas integration has been shown to improve flexibility and system stability under high renewable energy penetration by leveraging gas infrastructure as a buffer for variable electricity supply [46]. Likewise, electricity–heat integration allows joint scheduling of electricity generation and thermal energy systems, thereby improving energy utilization efficiency and supporting deep decarbonization pathways [47]. These examples illustrate that integration is not merely a keyword but also a critical research frontier and strategic approach for advancing the new power system.
Co-word co-occurrence denotes the simultaneous appearance of two or more keywords within a single document, thereby constituting a co-word relationship [48]. Co-word frequency quantifies the extent to which two topics are addressed concurrently within a research corpus [49]. Mapping identical keywords to the same alphanumeric codes enables the construction of a co-word matrix [50]. As shown in Figure 5, node size reflects keyword frequency, and edge thickness indicates co-occurrence strength. Concentric color rings, progressing outward, represent sustained activity for each year from 2015 through 2024. Analysis of Figure 5 indicates that “renewable energy”, “energy storage”, and “storage” have maintained high frequency and centrality since the outset of research, while spanning subfields such as model construction, system integration, performance evaluation, and market operation. “Wind power” and “integration” illustrate the research focus on integrating wind power generation with multi-energy complementary systems. “Power market”, “operation”, and “power system” highlight focused investigations into market mechanisms and dispatch strategies.
It is noteworthy that in recent years, the red outer rings of keywords such as “optimization”, “technology”, and “impact” have become particularly prominent, indicating that optimization algorithms and technology impact assessment are emerging as new research frontiers. The co-occurrence network map systematically reconstructs the research hotspots in the new power system domain focused on renewable energy and energy storage and underscores the intermediary role of modeling and integration studies.
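The counting behind Table 3 and Figure 5 can be illustrated with a minimal Python sketch. It assumes each publication is represented by a list of keywords; the `records` list is hypothetical toy data rather than the 1568 retrieved publications, and the function names are illustrative. CiteSpace applies its own preprocessing and pruning, which this sketch does not attempt to reproduce.

```python
from collections import Counter
from itertools import combinations

# Hypothetical keyword lists, one per publication (not the study's data).
records = [
    ["renewable energy", "energy storage", "integration"],
    ["renewable energy", "wind power", "integration"],
    ["energy storage", "smart grid"],
    ["renewable energy", "energy storage"],
]


def keyword_frequencies(records):
    """Count in how many publications each keyword appears."""
    counts = Counter()
    for keywords in records:
        counts.update(set(keywords))
    return counts


def cooccurrence_counts(records):
    """Count how often each unordered keyword pair shares a publication."""
    pairs = Counter()
    for keywords in records:
        for a, b in combinations(sorted(set(keywords)), 2):
            pairs[(a, b)] += 1
    return pairs


def cooccurrence_matrix(records):
    """Map each keyword to an integer code and build a symmetric co-word matrix."""
    vocab = sorted({kw for keywords in records for kw in keywords})
    code = {kw: i for i, kw in enumerate(vocab)}
    matrix = [[0] * len(vocab) for _ in vocab]
    for (a, b), n in cooccurrence_counts(records).items():
        matrix[code[a]][code[b]] = n
        matrix[code[b]][code[a]] = n
    return vocab, matrix


freq = keyword_frequencies(records)
pairs = cooccurrence_counts(records)
vocab, matrix = cooccurrence_matrix(records)
print(freq.most_common(3))
```

In a network view of this matrix, the keyword counts would drive node size and the pair counts would drive edge thickness.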

3.4. Analysis of Keyword Clusters

3.4.1. LSA-Based Keyword Clustering Results

LSA was applied to the input keyword dataset to perform cluster analysis. Under the selected CiteSpace settings, at most ten clusters could be generated, and the final solution produced ten clusters, which are shown in Figure 6. Notably, the figure presents only the top-ranked terms for each cluster. Analysis indicates that “#0 renewable energy” is the largest cluster and encompasses the core themes of the new power system domain. Research on renewable energy has remained highly active throughout all time periods, confirming the Blue Book on the Development of New Power System’s emphasis on the “clean, low-carbon” objective and the principle that non-fossil energy sources will serve as the primary basis for installed capacity and power generation. Next, “#1 energy storage” and “#7 energy storage system” correspond to the key construction points of the energy storage subsystem within the four elements: Source, Grid, Load, and Storage. The distribution of nodes across multiple years demonstrates that energy storage serves as a crucial support for flexibility and adaptability in new power system construction, reflecting its sustained and growing research value. The “#3 power system operation” and “#2 surface water energy” clusters correspond directly to the “smart integration” and hydropower development strategies outlined in the Blue Book on the Development of New Power System. Their core terms emphasize system dispatch and water resources, and an examination of cluster color coding reveals that this domain is emerging as a mid-to-late-stage research hotspot. Cluster “#4 carbon capture” is dominated by green nodes, indicating that research on carbon capture and utilization has been progressing steadily since the mid-term.
Clusters “#5 Divisia index” and “#6 supersonic separation”, though smaller in size, are composed mainly of recent (orange-to-red) nodes, revealing that emerging metrics and processes—namely the Divisia index and supersonic separation—have become frontiers in current research. Finally, Clusters “#8 market power” and “#9 green energy” jointly illustrate the converging trend between electricity market mechanisms and the green power economy.
In summary, the LSA clustering results highlight the central role of renewable energy and energy storage research. They also show current hotspots evolving toward the coordinated integration of wind, solar, and hydropower, intertwined with system operation and market mechanisms, alongside the rise of carbon capture and advanced process innovations. These themes all point to the four defining characteristics outlined in the Blue Book on the Development of New Power System: safe and efficient operation, a clean low-carbon footprint, flexibility and adaptability, and intelligent integration.
Table 4 provides a detailed summary of the Top K = 10 clustering results for the literature in the new power system domain based on LSA. Each cluster is characterized by its size, its silhouette coefficient, its average publication year, and the primary and secondary keywords extracted by LSA [51]. “LSA primary” refers to the terms directly extracted from the co-word matrix built using the documents within the cluster (including titles, abstracts, and keywords). These terms capture the cluster’s internal semantic characteristics or “self-descriptive” features. “LSA secondary” refers to the terms extracted by applying LSA to the set of documents that cite the cluster’s publications (the cluster’s external citation network). These terms highlight the vocabulary and focus areas used by external researchers when citing the cluster, thereby revealing the cluster’s extended semantics or “external contextual” features in subsequent studies [52]. Analysis of Table 4 indicates that the silhouette coefficients of the clusters range from 0.779 to 0.970, demonstrating strong intra-cluster homogeneity and clear inter-cluster separation.
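As a reminder of what the silhouette coefficient measures: for each item, a is the mean distance to the other members of its own cluster and b is the mean distance to the nearest other cluster, giving s = (b − a) / max(a, b). The sketch below averages s per cluster. It assumes Euclidean distance on hypothetical 2-D points and at least two clusters; how CiteSpace measures distances between keywords internally is not described here, so this is an illustration of the metric only.

```python
from collections import defaultdict
from math import dist


def cluster_silhouettes(points, labels):
    """Return {cluster label: mean silhouette of its members}.

    Assumes at least two clusters; singleton clusters get s = 0 by convention.
    """
    members = defaultdict(list)
    for idx, lab in enumerate(labels):
        members[lab].append(idx)

    scores = defaultdict(list)
    for i, lab in enumerate(labels):
        own = [j for j in members[lab] if j != i]
        if not own:
            scores[lab].append(0.0)
            continue
        # a: mean distance to the rest of the item's own cluster
        a = sum(dist(points[i], points[j]) for j in own) / len(own)
        # b: smallest mean distance to any other cluster
        b = min(
            sum(dist(points[i], points[j]) for j in idxs) / len(idxs)
            for other, idxs in members.items()
            if other != lab
        )
        denom = max(a, b)
        scores[lab].append(0.0 if denom == 0 else (b - a) / denom)
    return {lab: sum(vals) / len(vals) for lab, vals in scores.items()}


# Hypothetical, well-separated toy clusters.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels = [0, 0, 0, 1, 1, 1]
silhouettes = cluster_silhouettes(points, labels)
print(silhouettes)
```

Values close to 1 indicate compact, well-separated clusters, which is the sense in which the range 0.779 to 0.970 is read above.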
Cluster size decreases progressively with the cluster index from #0 to #9. Cluster “#0 renewable energy” (size 34; silhouette coefficient 0.845; average publication year 2017) is the largest cluster. Primary keywords include “renewable energy”, “CO2 emissions”, and “renewable energy output”, reflecting the field’s core focus on clean energy generation. Secondary keywords, including “economic growth”, “nuclear energy”, “vector autoregression”, and “electricity market coupling”, indicate an extension of research themes into economic modeling and market coupling. Cluster #1 (size 31; silhouette coefficient 0.945; average publication year 2018) and Cluster #3 (size 27; silhouette coefficient 0.952; average publication year 2017) further underscore a sustained emphasis on renewable energy and storage. Cluster #1 highlights technological diversification, namely “wind energy” and “solar energy”, while Cluster #3 focuses on operational issues such as power system operation and sensitivity analysis. Secondary keywords, including “generation planning” and “demand-side management”, suggest an evolving research trajectory toward integrating technical models with system-level planning and demand response strategies. Cluster “#4 carbon capture” (size 24; silhouette coefficient 0.863; average publication year 2020) and Cluster “#8 market power” (size 22; silhouette coefficient 0.779; average publication year 2020) both predominantly emerged after 2019, reflecting the policy-driven emphasis on carbon neutrality and electricity market reform. Cluster “#6 supersonic separation” (size 23; silhouette coefficient 0.946; average publication year 2021) features the keywords “supersonic separation” and “modal analysis”, reflecting the recent emergence of advanced separation technologies and methodological innovations.
Cluster “#7 energy storage system” (size 23; silhouette coefficient 0.924; average publication year 2018) bridges technological research and data-driven methods, with secondary keywords like “deep learning” and “photovoltaic power forecasting” indicating the application of AI techniques in renewable energy prediction and hybrid system optimization.
The contrast between primary and secondary keywords further elucidates the evolving research trajectory in the new power system domain. Primary keywords denote each cluster’s core technical focus, while secondary keywords reveal how these themes are situated within broader research contexts, for example, economic evaluation (“Divisia index” and “risk assessment”) and system reliability (“dynamic Pareto” and “fault distribution”).
The ten policy-driven keywords from the Blue Book exhibit a high degree of consistency and complementarity with the LSA clustering results. “New energy” appears as a central theme in Cluster #0 and Cluster #9. “Renewable energy” is present across almost all clusters, highlighting its dominant role in the field. “Advanced energy storage” is closely related to Cluster #1, Cluster #3, and Cluster #7, indicating that storage research spans both integration with renewables and techno-economic issues such as cost and remote supply. “Power market” corresponds directly to Cluster #8, underscoring the role of market mechanisms and competition. “CCUS” maps onto Cluster #4 and Cluster #6, confirming its emergence as an independent research front. By contrast, “Source-Grid-Load-Storage” does not explicitly appear as a cluster label but is reflected in Clusters #1 and #3, which emphasize renewable energy, storage, and system operation. “System safety” is not an explicit cluster but is partially represented in Cluster #3 and Cluster #7, both of which relate to system reliability. “Pumped-storage hydroelectricity” does not explicitly appear but is implicitly included in storage-related Clusters #1 and #7 as a specific technology pathway. “New energy generation” is associated with Cluster #2 and Cluster #6, showing its linkage to solar and innovative generation methods. Finally, “Power supply” corresponds to Cluster #7, highlighting supply issues in remote regions and economic considerations. Overall, the Blue Book keywords capture the strategic core concepts, while the clustering results disaggregate them into concrete research themes, validating the alignment between policy priorities and academic hotspots, while also revealing emerging directions such as waste heat utilization, bidding strategies, and AI-driven forecasting not explicitly covered by the policy document.
Table 4. LSA-based clustering analysis results (Top K = 10).
| Cluster Number | Size | Silhouette | Main Year | Main Keywords (LSA Primary) | Main Keywords (LSA Secondary) |
| --- | --- | --- | --- | --- | --- |
| 0 | 34 | 0.845 | 2017 | renewable energy, CO2 emission, and renewable energy production | economic growth, nuclear energy, vector autoregression, and power market coupling |
| 1 | 31 | 0.945 | 2018 | renewable energy, energy storage, wind energy, and solar energy | power generation, renewable generation, and seasonal multi-energy demands |
| 2 | 29 | 0.933 | 2018 | renewable energy, surface water energy, solar PV energy, solar thermal energy, and waste heat energy | renewable energy sources, energy efficiency, electricity generation, data envelopment analysis, and non-renewable energy sources |
| 3 | 27 | 0.952 | 2017 | renewable energy, energy storage, power system operation, and sensitivity analysis | power generation, power generation planning, energy resources, and demand side management |
| 4 | 24 | 0.863 | 2020 | carbon capture and greenhouse gases | carbon neutrality, carbon dioxide, emission reduction, and carbon management |
| 5 | 24 | 0.907 | 2017 | renewable energy, Divisia index, energy intensity, risk assessment, and CO2 reduction | carbon capture, enhanced oil recovery, business model, economic evaluation, and risk assessment |
| 6 | 23 | 0.946 | 2021 | carbon capture, supersonic separation, climate change, and modal analysis | renewable energy, technological innovation, power generation, and crucial barriers |
| 7 | 23 | 0.924 | 2018 | energy storage system, electricity cost, and remote area electricity supply | deep learning, PV power forecasting, short-term memory, and hybrid renewable energy |
| 8 | 22 | 0.779 | 2020 | market power, market power prediction, market power detection, neuro-fuzzy systems, and bidding strategy | thermal energy storage, concentrated solar power, liquid metals, and solar tower |
| 9 | 21 | 0.970 | 2018 | renewable energy, green energy, and renewable energy law | energy management strategy, dynamic Pareto, and power distribution faults |

3.4.2. LLR-Based Keyword Clustering Results

The LLR algorithm was applied to the input keyword dataset for cluster analysis. Under the selected CiteSpace settings, at most ten clusters could be generated, and the final solution produced ten clusters; the resulting cluster map is presented in Figure 7. Analysis of Figure 7 reveals that cluster “#0 economic growth” is the largest, encompassing core topics such as renewable energy investment and macroeconomic impacts. This indicates that early research in the new power system domain focused heavily on the interplay between clean energy and economic growth. Clusters “#3 smart grid” and “#8 market power” highlight interdisciplinary studies on microgrids and electricity market mechanisms; the prevalence of mid-to-late period nodes shows that these areas have been hotspots since the study’s middle phase. Clusters “#5 security of supply” and “#6 carbon capture”, while distinct, jointly reflect two frontier demands: grid reliability and optimization of carbon reduction processes. Finally, cluster “#7 deep learning”, dominated by recent-year nodes, signals the rapid rise of AI methods in load forecasting and system optimization. Overall, the LSA method excels at uncovering latent semantics and implicit themes, whereas the LLR approach is particularly effective at delineating topic boundaries and highlighting high-frequency terminology.
Similar to the LSA approach [38], top-10 clustering of new power system keywords using the LLR algorithm produced the results summarized in Table 5. Examination of the clusters’ silhouette coefficients confirms that each cluster exhibits high cohesion and clear separation, indicating robust clustering quality. Based on the LLR values and associated p-level metrics, the representativeness and statistical significance of each cluster’s thematic keywords can be further assessed. The term “CCUS” in Cluster #4 (LLR = 46.34, p = 1.0 × 10⁻⁴) and the term “carbon capture” in Cluster #6 (LLR = 42.52, p = 1.0 × 10⁻⁴) exhibit exceptionally high LLR values. These figures indicate that both keywords occur at frequencies far above the corpus average and are highly unlikely to result from random variation, thereby demonstrating strong thematic specificity. Moreover, “market power” in Cluster #8 (LLR = 40.88, p = 1.0 × 10⁻⁴) and “economic growth” in Cluster #0 (LLR = 39.75, p = 1.0 × 10⁻⁴) both exhibit exceptional discriminative strength for their respective themes, reaffirming the centrality of market mechanisms and macroeconomic growth. By comparison, “thermal inertia” in Cluster #3 (LLR = 7.23, p = 0.01) and “demand response” in Cluster #9 (LLR = 6.71, p = 0.01) yield lower LLR scores yet remain statistically significant at the 0.01 level, underscoring the enduring relevance of smart-grid control and demand-side management in new power system research.
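To make the relationship between an LLR score and its p-level concrete, the following Python sketch computes Dunning's log-likelihood ratio (G²) for a 2 × 2 contingency table and converts a score into a significance level using the chi-square distribution with one degree of freedom. This is an assumption about how such values are commonly derived, not a description of CiteSpace's internal code; the contingency counts in the usage lines are hypothetical. Under this assumption, the reported pairs such as (46.34, 1.0 × 10⁻⁴) and (7.23, 0.01) fall into the expected significance bands.

```python
import math


def log_likelihood_ratio(k11, k12, k21, k22):
    """Dunning's G^2 for a 2x2 table.

    k11: documents in the cluster that contain the term
    k12: documents in the cluster that do not
    k21: documents outside the cluster that contain the term
    k22: documents outside the cluster that do not
    """
    observed = ((k11, k12), (k21, k22))
    rows = (k11 + k12, k21 + k22)
    cols = (k11 + k21, k12 + k22)
    total = rows[0] + rows[1]
    g2 = 0.0
    for i in range(2):
        for j in range(2):
            o = observed[i][j]
            if o > 0:  # a zero cell contributes nothing to the sum
                expected = rows[i] * cols[j] / total
                g2 += o * math.log(o / expected)
    return 2.0 * g2


def chi2_sf_1df(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


# Significance bands assumed for reporting, strictest first.
P_LEVELS = (1e-4, 1e-3, 5e-3, 1e-2, 5e-2)


def p_level(llr):
    """Return the strictest band that the score satisfies, or None."""
    p = chi2_sf_1df(llr)
    for level in P_LEVELS:
        if p <= level:
            return level
    return None


# Hypothetical counts: a term concentrated in one cluster of 40 documents.
score = log_likelihood_ratio(30, 10, 20, 940)
print(round(score, 2), p_level(score))
print(p_level(46.34), p_level(7.23))
```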
Analysis of the primary terms within each cluster reveals distinct thematic emphases. Cluster #0 “economic growth” highlights the early literature’s core focus on interactions between clean energy and macroeconomic growth. Cluster #1 emphasizes keywords such as “pumped storage”, “optimal design”, “genetic algorithm”, “biomass gasification”, and “green hydrogen”, reflecting critical technologies related to pumped-storage optimization and multi-energy coordination. Cluster #2 underscores diversified approaches within the renewable energy transition. Cluster #3, characterized by keywords such as “smart grid”, “reinforcement learning”, “integrated energy system”, and “thermal inertia”, illustrates the convergence of intelligent grids and reinforcement learning in system integration and control. Cluster #4, featuring keywords such as “CCUS” and “capture”, indicates the growing prominence of carbon capture, utilization, and storage technologies in the context of the “dual carbon” targets. Cluster #5 demonstrates sustained attention toward electricity system reliability and pricing mechanisms. Collectively, the high LLR values and significant p-levels of these keywords not only validate the distinctiveness of each thematic cluster but also clearly depict the multidimensional evolution of research from clean energy and economic growth through energy storage and intelligent control to market mechanisms and carbon management.
The ten policy-driven keywords from the Blue Book are also well reflected in the LLR clustering results. “New energy” corresponds directly to Cluster #0 and Cluster #2, reaffirming its central role. “Renewable energy” appears across Clusters #0, #2, and #4, linking closely with themes of energy transition and carbon neutrality. “Advanced energy storage” is associated with Cluster #7 and Cluster #9, indicating that storage research spans both system-level integration and intelligent control. “Power market” aligns with Cluster #8, highlighting market competition and regulatory mechanisms. “CCUS” is explicitly central in Cluster #4 and Cluster #6, confirming its emergence as an independent research front. “Source-Grid-Load-Storage” does not explicitly appear but is reflected in Cluster #3, emphasizing multi-energy integration and intelligent scheduling. “System safety” is captured in Cluster #5, focusing on reliability and security of power supply. “Pumped-storage hydroelectricity” does not appear explicitly but can be considered part of storage-related Cluster #7. “New energy generation” maps onto Cluster #2 and Cluster #9, stressing innovative generation pathways. Finally, “Power supply” corresponds to Cluster #5, highlighting supply reliability and distribution optimization. Overall, the Blue Book keywords align closely with the LLR clusters in core themes, while the clustering analysis further uncovers emerging directions not explicitly mentioned in the policy document, such as the use of deep learning in storage optimization, time-series forecasting, and demand response, thereby enriching the technical and frontier dimensions of the study.
Table 5. Clustering analysis results based on LLR (Top K = 10).
| Cluster Number | Size | Silhouette | Main Year | Main Keywords (LLR, p-level) |
| --- | --- | --- | --- | --- |
| 0 | 34 | 0.845 | 2017 | economic growth (39.75, 1.0 × 10⁻⁴), renewable energy (23.36, 1.0 × 10⁻⁴), non-renewable energy (18.3, 1.0 × 10⁻⁴), and electricity generation (14.15, 0.001) |
| 1 | 31 | 0.945 | 2018 | pumped storage (20.82, 1.0 × 10⁻⁴), optimal design (14.54, 0.001), genetic algorithm (12.13, 0.001), biomass gasification (10.23, 0.005), and green hydrogen (10.23, 0.005) |
| 2 | 29 | 0.933 | 2018 | renewable energy (25, 1.0 × 10⁻⁴), renewable energy sources (15.79, 1.0 × 10⁻⁴), renewable energies (13.43, 0.001), energy transition (12.46, 0.001), and solar energy (12.46, 0.001) |
| 3 | 27 | 0.952 | 2017 | smart grid (11.96, 0.001), reinforcement learning (10.91, 0.001), integrated energy system (9.19, 0.005), and thermal inertia (7.23, 0.01) |
| 4 | 24 | 0.863 | 2020 | CCUS (46.34, 1.0 × 10⁻⁴), renewable energy (20.51, 1.0 × 10⁻⁴), capture (17.88, 1.0 × 10⁻⁴), carbon neutrality (13.56, 0.001), and mineral carbonation (11.37, 0.001) |
| 5 | 24 | 0.907 | 2017 | security of supply (16.2, 1.0 × 10⁻⁴), pricing (10.79, 0.005), energy security (8.3, 0.005), and distribution network (7.12, 0.01) |
| 6 | 23 | 0.946 | 2021 | carbon capture (42.52, 1.0 × 10⁻⁴), CCUS (16.61, 1.0 × 10⁻⁴), utilization (14.44, 0.001), and vibration (12.57, 0.001) |
| 7 | 23 | 0.924 | 2018 | deep learning (26.97, 1.0 × 10⁻⁴), energy storage system (13.78, 0.001), voltage control (9.71, 0.005), time series forecasting (9.71, 0.005), and energy storage applications (9.71, 0.005) |
| 8 | 22 | 0.779 | 2020 | market power (40.88, 1.0 × 10⁻⁴), system dynamics (12.27, 0.001), thermal energy storage (12.16, 0.001), regulation (10.6, 0.005), and carbon emission reduction (9.48, 0.005) |
| 9 | 21 | 0.970 | 2018 | distributed generation (13, 0.001), biomass (11.07, 0.001), energy management strategy (11.07, 0.001), demand response (6.71, 0.01), and sustainable development (6.68, 0.01) |

3.4.3. Comparative Analysis of Clustering Results

In order to enhance the robustness of clustering analysis, both LSA and LLR algorithms were employed, and their results were systematically compared. The LSA algorithm, which is grounded in latent semantic representation, tends to capture broader conceptual linkages between terms. As a result, LSA-generated clusters usually integrate thematically related but lexically diverse concepts into unified knowledge domains. By contrast, the LLR algorithm is more sensitive to statistically significant co-occurrence patterns, producing clusters with sharper boundaries and stronger term exclusivity. This distinction is evident in our results: for instance, both LSA and LLR identify “renewable energy integration” and “energy storage” as dominant clusters. However, LSA aggregates them into a broader thematic cluster labeled “energy system transition,” thereby emphasizing interdisciplinary convergence, while LLR decomposes them into two subclusters, namely “renewable energy and grid coordination” and “advanced storage technologies,” providing greater granularity. From the perspective of cluster density and modularity, the LSA clustering produced 8 clusters, with Q = 0.742 and S = 0.876, while the LLR clustering yielded 12 clusters, with Q = 0.758 and S = 0.892. This consistency in modularity and silhouette values further demonstrates the robustness of the clustering results.
Moreover, both methods consistently highlight renewable energy and storage as dominant themes, while their complementary emphases—macro-level integration in LSA versus micro-level diversification in LLR—offer richer insights into the structural evolution of the field. Together, these findings demonstrate that applying both LSA and LLR not only validates the robustness of the clustering results but also provides complementary perspectives, substantially strengthening the validity and interpretability of our empirical findings.
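For readers unfamiliar with the modularity value Q quoted above, the sketch below computes Newman's modularity for a small partitioned graph as the fraction of edges falling inside communities minus the fraction expected if edges were placed at random with the same node degrees. It assumes an unweighted, undirected network and a hypothetical toy graph; CiteSpace networks are weighted and pruned, so this illustrates the definition rather than reproducing the reported values.

```python
def modularity(edges, community):
    """Newman modularity Q for an undirected, unweighted graph.

    edges: list of (u, v) pairs, each undirected edge listed once
    community: dict mapping every node to its community label
    """
    m = len(edges)
    degree_sum = {}  # total degree of the nodes in each community
    inside = 0       # number of edges with both ends in one community
    for u, v in edges:
        for node in (u, v):
            label = community[node]
            degree_sum[label] = degree_sum.get(label, 0) + 1
        if community[u] == community[v]:
            inside += 1
    expected = sum((d / (2 * m)) ** 2 for d in degree_sum.values())
    return inside / m - expected


# Hypothetical toy network: two triangles joined by a single edge.
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"),
         ("c", "d")]
two_groups = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
one_group = {node: 0 for node in two_groups}

q_split = modularity(edges, two_groups)
q_single = modularity(edges, one_group)
print(round(q_split, 3), round(q_single, 3))
```

A partition that matches the network's natural grouping yields a clearly positive Q, whereas placing every node in a single community yields Q = 0.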

3.5. Emerging Trends of New Power System

A time-series analysis was performed on the clustered keyword groups, resulting in Figure 8. The figure shows the evolution of research hotspots in new power systems over the ten-year period from 2015 to 2024. Analysis of the temporal evolution of each cluster shows that “#2 renewable energy” and “#0 economic growth” have remained the largest nodes with the highest-frequency connections since 2015, confirming that research on clean low-carbon synergies between renewable energy technologies and macroeconomic development has consistently formed the backbone of the new power system domain. In 2020, “#1 pumped storage” and “#3 smart grid” became markedly active, underscoring the crucial roles of storage optimization and intelligent grids in achieving the “flexibility and adaptability” and “intelligent integration” objectives. By 2024, nodes for “#4 CCUS” and “#6 carbon capture” expanded sharply and their concentric rings shifted from green to red, reflecting a research surge in carbon capture and utilization technologies driven by dual-carbon policies. Concurrently, “#7 deep learning” grew from its initial emergence in 2020 to prominently larger, red-shaded nodes by 2024, demonstrating the rapid ascent of AI methods in load forecasting and system optimization. In contrast, “#5 security of supply” and “#8 market power” have exhibited steadily diminishing node sizes since the mid-period, indicating a slight decline in interest in traditional grid reliability and market mechanism research amid the rise of new themes. In summary, this timeline map not only reflects the multi-stage evolution from foundational energy and economic evaluation through intelligent storage to carbon reduction and AI-driven innovations but also provides guidance for future research hotspots in the new power system domain.
In response to the recent surge of research hotspots in “CCUS” and “deep learning,” it is important to analyze the driving factors behind these trends. In the case of CCUS, research in the early stage (around 2010) primarily focused on life-cycle assessment and geological storage, which established the foundation for environmental feasibility. By 2015, significant breakthroughs were achieved in CO2 capture and separation technologies, such as solvent absorption and membrane-based processes, which pushed the field toward technical maturity. Around 2018, techno-economic analysis began to gain attention, marking a shift toward large-scale deployment and industrial applications. After 2020, the scope of research further expanded to CO2 conversion pathways (e.g., methanol and methane synthesis), accompanied by the rise of novel materials such as metal–organic frameworks (MOFs), which opened new possibilities for efficient CO2 capture and utilization [53]. These technological milestones collectively explain the accelerated growth of CCUS-related research observed in the temporal evolution analysis over the past five years. For the research field of “deep learning,” in the early stage (around 2015), studies primarily employed shallow neural networks for load forecasting and fault detection, which laid the foundation for the application of data-driven methods in power systems. Between 2016 and 2018, with the rise of deep architectures such as convolutional neural networks (CNNs) and recurrent neural networks (RNNs), large-scale power data could be modeled more effectively, thereby promoting their applications in frequency analysis, stability assessment, and state estimation. During the period from 2018 to 2020, the introduction of deep reinforcement learning (DRL) represented a significant breakthrough, as this method demonstrated strong adaptability and autonomous control potential in complex and dynamic power grid environments [54]. 
Since 2020, deep learning has gradually expanded to integrated energy systems, showing particular advantages in frequency analysis and stability control within electricity–heat and electricity–gas coupling scenarios [55]. In recent years, research has further extended to model interpretability and real-time deployment, such as employing graph neural networks (GNNs) for power grid topology modeling and developing lightweight models for edge computing, which reduced the technical barriers to online applications. These technological milestones collectively explain why “deep learning” has shown an accelerated growth trend in temporal evolution analysis of power system research.
The Kleinberg burst detection model was employed for computational analysis, and the identified terms were ranked according to their burst intensities and onset times, with the results presented in Table 6. Analysis indicates that the evolution of research hotspots in the new power system domain can be divided into three phases. In Phase I (2015–2018), “electricity market” (strength = 7.96) and “wind power” (strength = 7.93) exhibited exceptionally high burst intensities. Concurrently, “energy policy” (strength = 5.51) and “smart grid” (strength = 3.32) also demonstrated significant bursts, suggesting that the early stage of new power system development was focused on electricity market reform and smart grid technologies. In Phase II (2016–2021), “demand-side management” (strength = 3.45) and “flexibility” (strength = 2.88) remained focal points, underscoring the critical support that demand response and system flexibility provide for the integration of large-scale renewable energy. Simultaneously, the concurrent bursts of “economic growth” (strength = 2.92) and “cointegration” (strength = 2.74) reflect scholars’ extensive exploration of the relationship between clean energy and macroeconomic dynamics through cointegration analysis. In Phase III (2020–2024), “neural network” (strength = 3.16) rose sharply, revealing the widespread application of deep learning in load forecasting and dispatch optimization, while “capture” (strength = 3.19) highlighted the research surge in carbon capture and storage technologies under the “dual-carbon” strategic framework.
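The idea behind Kleinberg-style burst detection can be sketched in a few lines of Python. The version below is a simplified two-state, batched variant: each year is a batch, a baseline state emits the keyword at the overall rate p0, a burst state emits it at s × p0, and entering the burst state costs γ ln n. A Viterbi pass finds the cheapest state sequence. The parameters `s` and `gamma`, the function names, and the yearly counts are all assumptions for illustration; the settings used in CiteSpace are not stated here, and the weight computed below need not match the strengths reported in Table 6.

```python
import math


def _fit_cost(p, r, d):
    """Negative log-likelihood of r hits in d documents at rate p.

    The binomial coefficient is omitted because it is the same for both states.
    """
    return -(r * math.log(p) + (d - r) * math.log(1.0 - p))


def detect_bursts(relevant, totals, s=2.0, gamma=1.0):
    """Return a list of (start_index, end_index, weight) burst intervals.

    relevant[t]: documents in batch t that contain the keyword
    totals[t]:   all documents in batch t
    Assumes 0 < sum(relevant) < sum(totals).
    """
    n = len(relevant)
    p0 = sum(relevant) / sum(totals)
    p1 = min(s * p0, 0.9999)
    rates = (p0, p1)
    up_cost = gamma * math.log(n)

    best = [0.0, float("inf")]  # the automaton starts in the baseline state
    back = []                   # back[t][j]: best previous state for state j
    for r, d in zip(relevant, totals):
        new_best, pointers = [], []
        for j in (0, 1):
            # Moving up to the burst state is penalized; moving down is free.
            candidates = [best[i] + (up_cost if j > i else 0.0) for i in (0, 1)]
            i_best = 0 if candidates[0] <= candidates[1] else 1
            new_best.append(candidates[i_best] + _fit_cost(rates[j], r, d))
            pointers.append(i_best)
        best = new_best
        back.append(pointers)

    # Trace the cheapest path backwards from the final batch.
    state = 0 if best[0] <= best[1] else 1
    states = [state]
    for pointers in reversed(back[1:]):
        state = pointers[state]
        states.append(state)
    states.reverse()

    bursts, t = [], 0
    while t < n:
        if states[t] == 1:
            start, weight = t, 0.0
            while t < n and states[t] == 1:
                weight += (_fit_cost(p0, relevant[t], totals[t])
                           - _fit_cost(p1, relevant[t], totals[t]))
                t += 1
            bursts.append((start, t - 1, weight))
        else:
            t += 1
    return bursts


# Hypothetical yearly counts for one keyword, 2015 through 2024.
years = list(range(2015, 2025))
relevant = [2, 2, 2, 2, 2, 2, 2, 20, 20, 20]
totals = [100] * 10
bursts = detect_bursts(relevant, totals)
for start, end, weight in bursts:
    print(years[start], years[end], round(weight, 2))
```

With these toy counts, the only interval assigned to the burst state is the final three years, which is the kind of onset-and-duration information that a burst table summarizes.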

4. Conclusions

Since the release of the Blue Book on the Development of New Power System, research in this domain has steadily emerged as a focal topic. In recent years, both the number of publications and citation counts have risen rapidly; however, no comprehensive evaluation of the field’s developmental trajectory and shifting research hotspots has yet been undertaken. Therefore, this paper employs bibliometric methods to systematically review the literature on new power systems. To ensure data integrity, keyword extraction was first performed on the document using a natural language processing model. The extracted keywords served as topic terms for a Web of Science search, yielding 1568 relevant records. These publications were then subjected to descriptive statistical analysis, including overall publication growth trends, co-occurrence network analysis, and keyword clustering. From this study, the following conclusions were drawn:
(1) The TF-IDF and TextRank algorithms were applied to the Blue Book on the Development of New Power System to extract high-frequency thematic terms. These terms were used as topic queries in the Web of Science Core Collection for the period 2015–2024, yielding 1568 relevant publications. Annual output increased steadily, with the cumulative total rising approximately 3.5-fold over the decade. However, the distribution across subfields remains uneven, indicating that systematic exploration is still lacking in emerging areas such as multi-energy complementarity and electricity–gas/heat coupling. This suggests that policymakers should strengthen guidance and funding support for cross-domain research when promoting multi-energy demonstration projects and standardization.
(2) The extracted publications were subjected to keyword co-occurrence analysis using CiteSpace. “Renewable energy” emerged as the central term with a frequency of 247 occurrences. “Energy storage” demonstrated both high frequency and high centrality, indicating that storage not only represents a major research hotspot but also acts as a crucial bridge linking subfields such as modeling, operation, and system integration. Nevertheless, the mechanisms through which storage facilitates renewable energy integration, enhances demand-side flexibility, and supports smart grid operation remain insufficiently clarified. From a policy perspective, it is necessary to accelerate the establishment of multi-level storage incentive mechanisms and promote the integration of storage with market mechanisms and carbon trading systems.
(3) Keyword clustering using both the LSA and LLR algorithms produced silhouette coefficients above 0.779 for all clusters, indicating that the clusters are internally consistent and well separated. Consistent with the co-occurrence analysis, renewable energy and energy storage remained central throughout; smart grid technologies, market mechanisms, and demand-side management emerged prominently during the mid-term; and carbon capture, utilization, and storage (CCUS) and deep learning surged to prominence in the later phase under the dual-carbon and digital intelligence imperatives. However, comparative research on the maturity, scaling pathways, and integration of these frontier technologies into system planning and policy frameworks is still insufficient. Policymakers should strategically coordinate the development trajectories of CCUS and artificial intelligence, clarifying their roles in energy security and decarbonization policies.
(4) A comprehensive analysis of keywords in the new power system literature reveals a multi-stage evolution: from foundational energy and economic assessment, through intelligent energy storage, to carbon reduction and AI-driven innovations. Future interdisciplinary research themes—particularly intelligent dispatch, flexible operation, and low-carbon transition—are poised to become focal points in the new power system domain. However, this study is subject to certain limitations: for instance, the dataset was primarily derived from the Web of Science Core Collection, potentially omitting regional or non-English literature. Moreover, keyword-based clustering may oversimplify the complex interactions among policy, technology, and markets. Future studies should integrate multi-database sources, deepen modeling of policy–technology–market coupling mechanisms, and enhance the translation of research findings into practical policy formulation and industrial demonstration.
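The keyword-extraction step summarized in conclusion (1) can be illustrated with a minimal TF-IDF sketch. The conclusions do not specify how the Blue Book was segmented or tokenized, or which IDF variant was used, so everything here is an assumption made for illustration: the function name `tfidf_keywords`, the split into pre-tokenized segments, the smoothed IDF (the form ln((1 + N) / (1 + df)) + 1), the summation of scores across segments, and the toy tokens, which are not taken from the Blue Book.

```python
import math
from collections import Counter


def tfidf_keywords(segments, top_k=5):
    """Rank terms by their TF-IDF score summed over all segments.

    `segments` is a list of token lists (hypothetical pre-tokenized
    sections of a policy document). Returns the top_k terms.
    """
    n_segments = len(segments)

    # Document frequency: in how many segments does each term occur?
    doc_freq = Counter()
    for tokens in segments:
        doc_freq.update(set(tokens))

    scores = Counter()
    for tokens in segments:
        if not tokens:
            continue  # skip empty segments to avoid division by zero
        term_counts = Counter(tokens)
        for term, count in term_counts.items():
            tf = count / len(tokens)
            # Smoothed IDF (an assumed variant): terms found in every
            # segment still receive a positive weight.
            idf = math.log((1 + n_segments) / (1 + doc_freq[term])) + 1
            scores[term] += tf * idf

    return [term for term, _ in scores.most_common(top_k)]


# Toy input, invented for illustration only.
segments = [
    ["new_energy", "storage", "grid"],
    ["new_energy", "market"],
    ["new_energy", "storage", "ccus"],
]
top_terms = tfidf_keywords(segments, top_k=2)
print(top_terms)
```

In a workflow like the one described, the top-ranked terms would then serve as topic queries in a literature database. A graph-based ranker such as TextRank could be run on the same tokens as a cross-check, since the two methods weight terms on different grounds.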

Author Contributions

Conceptualization, Y.W. and M.Z.; methodology, Y.W., M.Z. and L.L.; resources, L.L.; funding acquisition, H.C.; formal analysis, H.C.; investigation, M.L.; visualization, Y.Z.; data curation, Y.W. and M.L.; writing—original draft preparation, Y.W.; writing—review and editing, Y.W.; supervision, H.C.; project administration, L.L., M.L. and Y.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Science and Technology Project of State Grid Corporation of China (Grant No. 1400-202456284A-1-1-ZN, Research on the collaborative allocation and feedback control technology of project reservation for business optimization). The authors declare that this study received funding from the State Grid Corporation of China. The funder had the following involvement with the study: Participated in research and provided public data required for research.

Data Availability Statement

The data presented in this study are available on request from the corresponding author.

Conflicts of Interest

Authors Minghong Liu and Lingshuang Liu were employed by the State Grid Xinjiang Electric Power Company Economic and Technological Research Institute; Yan Zhang was employed by the State Grid Xinjiang Electric Power Corporation. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

  1. Cheng, Q.; Li, F.; Luo, S.; Wu, W.; Zhang, N.; Kang, C. Research on the Planning Methodology Framework and Key Supporting Technologies for New-type Power Systems. Power Syst. Technol. 2025, 49, 2219–2231. [Google Scholar] [CrossRef]
  2. Zhang, A.H.; Şirin, S.M.; Fan, C.; Bu, M. An analysis of the factors driving utility-scale solar PV investments in China: How effective was the feed-in tariff policy? Energy Policy 2022, 167, 113044. [Google Scholar] [CrossRef]
  3. Østergaard, P.A.; Duić, N.; Noorollahi, Y.; Kalogirou, S.A. Advances in renewable energy for sustainable development. Renew. Energy 2023, 219, 119377. [Google Scholar] [CrossRef]
  4. Kataray, T.; Nitesh, B.; Yarram, B.; Sinha, S.; Cuce, E.; Shaik, S.; Vigneshwaran, P.; Roy, A. Integration of smart grid with renewable energy sources: Opportunities and challenges—A comprehensive review. Sustain. Energy Technol. 2023, 58, 103363. [Google Scholar] [CrossRef]
  5. Jia, S. The Development Trend and Measures of China’s Photovoltaic New Energy Industry. Highlights Bus. Econ. Manag. 2024, 45, 238–244. [Google Scholar] [CrossRef]
  6. TUP. The National Energy Administration organized the release of the “Blue Book on the Development of New Power Systems”. iEnergy 2023, 2, 87–88. [Google Scholar] [CrossRef]
  7. Cheng, M.Y.; Kusoemo, D.; Gosno, R.A. Text mining-based construction site accident classification using hybrid supervised machine learning. Autom. Constr. 2020, 118, 103265. [Google Scholar] [CrossRef]
  8. de Groof, R.; Xu, H. Automatic topic discovery of online hospital reviews using an improved LDA with Variational Gibbs Sampling. In Proceedings of the 2017 IEEE International Conference on Big Data (Big Data), Boston, MA, USA, 11–14 December 2017; pp. 4022–4029. [Google Scholar]
  9. Olusegun, R.; Oladunni, T.; Audu, H.; Houkpati, Y.; Bengesi, S. Text Mining and Emotion Classification on Monkeypox Twitter Dataset: A Deep Learning-Natural Language Processing (NLP) Approach. IEEE Access 2023, 11, 49882–49894. [Google Scholar] [CrossRef]
  10. Naithani, K.; Raiwani, Y.P. Realization of natural language processing and machine learning approaches for text-based sentiment analysis. Expert Syst. 2023, 40, e13114. [Google Scholar] [CrossRef]
  11. Yu, H.; Xiong, F.; Chen, Z. Text Classification Based on Natural Language Processing and Machine Learning in Multi-Label Corpus. ACM Trans. Asian Low-Resour. Lang. Inf. Process. 2024, 23, 115. [Google Scholar] [CrossRef]
  12. Ennajari, H.; Bouguila, N.; Bentahar, J. Combining Knowledge Graph and Word Embeddings for Spherical Topic Modeling. IEEE Trans. Neural Netw. Learn. Syst. 2023, 34, 3609–3623. [Google Scholar] [CrossRef] [PubMed]
  13. Saxena, A.; McGranaghan, P.; Rubens, M.; Salami, J.; Tonse, R.; Lindeman, A.A.; Keller, M.S.; Lindeman, P.; Veledar, E. Natural language processing (NLP) and machine learning (ML) model for predicting CMS OP-35 categories among patients receiving chemotherapy. J. Clin. Oncol. 2021, 39, e13591. [Google Scholar] [CrossRef]
  14. Hao, L.; Xi, C.; Xiao, L. A Study of the Application of Weight Distributing Method Combining Sentiment Dictionary and TF-IDF for Text Sentiment Analysis. IEEE Access 2022, 10, 32280–32289. [Google Scholar] [CrossRef]
  15. Razzaq, S.; Malik, A.K.; Raza, B.; Khattak, H.A.; Zegarra, G.W.M.; Molina, J. Research Collaboration Influence Analysis Using Dynamic Co-authorship and Citation Networks. Int. J. Interact. Multimed. Artif. Intell. 2022, 7, 103. [Google Scholar] [CrossRef]
  16. Ruirui, L.; Peipei, D.; Chen, L. Hotspots and emerging trends in the research area of agricultural application of biochar: Visualization analysis based on bibliometrics. Sci. Technol. Soc. 2021, 21, 14440–14450. [Google Scholar]
  17. Fanfan, D.; Olorunlogbon, O.; Figueroa, Y.T.; Trent, J.C. Publication trends in sarcoma research: A bibliometric analysis. J. Clin. Oncol. 2023, 41, e23530. [Google Scholar] [CrossRef]
  18. Ahiase, G.; Umar, A.; Saeed, A.M.M.; Rasuman, M.A.; Suri, E.R.D.R. Research Trends in Digital Financial Inclusion: A Bibliometric Analysis using VOSviewer. Int. J. Inform. Inf. Syst. Comput. Eng. 2024, 5, 132–145. [Google Scholar]
  19. Goranin, N.; Hora, S.K.; Čenys, A. A Bibliometric Review of Intrusion Detection Research in IoT: Evolution, Collaboration, and Emerging Trends. Electronics 2024, 13, 3210. [Google Scholar] [CrossRef]
  20. Wei, H.; Zhang, J.; Wang, W.; Zhu, X. Development trajectory of China’s virtual power plants Industry: Comprehensive quantitative evaluation using a novel dynamic multidimensional cloud model. J. Clean. Prod. 2025, 508, 145576. [Google Scholar] [CrossRef]
  21. Luo, Y.; Ahmadi, E.; McLellan, B.C.; Tezuka, T. A hybrid system dynamics model for power mix trajectory simulation in liberalized electricity markets considering carbon and capacity policy. Renew. Energy 2024, 233, 121164. [Google Scholar] [CrossRef]
  22. Liu, T. Data visualization-based paradigm trajectory exploration towards development and shifts for China’s wind power policy. Renew. Sustain. Energy Rev. 2025, 217, 115715. [Google Scholar] [CrossRef]
  23. Jones, K.S. A statistical interpretation of term specificity and its application in retrieval. J. Doc. 2004, 60, 493–502. [Google Scholar] [CrossRef]
  24. Jun, H.; Im, D.; Kim, H. An RDF Metadata-Based Weighted Semantic Pagerank Algorithm. Int. J. Web Semant. Technol. 2016, 7, 11–24. [Google Scholar] [CrossRef]
  25. Shi, R.; Wan, X. A bibliometric analysis of knowledge mapping in Chinese education digitalization research from 2012 to 2022. Humanit. Soc. Sci. Commun. 2024, 11, 505. [Google Scholar] [CrossRef]
  26. Chen, C. CiteSpace II: Detecting and visualizing emerging trends and transient patterns in scientific literature. J. Am. Soc. Inf. Sci. Technol. 2005, 57, 359–377. [Google Scholar] [CrossRef]
  27. Chen, C.; Chen, Y.; Horowitz, M.; Hou, H.; Liu, Z.; Pellegrino, D.A. Towards an explanatory and computational theory of scientific discovery. J. Informetr. 2009, 3, 191–209. [Google Scholar] [CrossRef]
  28. Losse, M.; Geissdoerfer, M. Mapping socially responsible investing: A bibliometric and citation network analysis. J. Clean. Prod. 2021, 296, 126376. [Google Scholar] [CrossRef]
  29. Wei, X.; Liu, Q.; Pu, A.; Wang, S.; Chen, F.; Zhang, L.; Zhang, Y.; Dong, Z.; Wan, X. Knowledge Mapping of bioeconomy: A bibliometric analysis. J. Clean. Prod. 2022, 373, 133824. [Google Scholar] [CrossRef]
  30. Rojas-Lamorena, Á.J.; García, S.D.B.; Pilar, J.M.A. A review of three decades of academic research on brand equity: A bibliometric approach using co-word analysis and bibliographic coupling. J. Bus. Res. 2022, 139, 1067–1083. [Google Scholar] [CrossRef]
  31. Fang, S.; Wei, Y.; Wang, S. 30 years of exchange rate analysis and forecasting: A bibliometric review. J. Econ. Surv. 2024, 38, 973–1007. [Google Scholar] [CrossRef]
  32. Chu, S.; Deng, T.; Cheng, H. The role of social media advertising in hospitality, tourism and travel: A literature review and research agenda. Int. J. Contemp. Hosp. Manag. 2020, 32, 3419–3438. [Google Scholar] [CrossRef]
  33. Fauzi, M.A. A bibliometric review on knowledge management in tourism and hospitality: Past, present and future trends. Int. J. Contemp. Hosp. Manag. 2022, 35, 2178–2201. [Google Scholar] [CrossRef]
  34. Fauzi, M.A.; Abdul Rahman, A.R.; Lee, C.K. A systematic bibliometric review of the United Nation’s SDGS: Which are the most related to higher education institutions? Int. J. Sustain. High. Educ. J. Sustain. High. Educ. 2022, 24, 637–659. [Google Scholar] [CrossRef]
  35. Hassani, A.; Iranmanesh, A.; Mansouri, N. Text mining using nonnegative matrix factorization and latent semantic analysis. Neural Comput. Appl. 2021, 33, 13745–13766. [Google Scholar] [CrossRef]
  36. Daghyani, M.; Zamzami, N.; Bouguila, N. Toward an Efficient Computation of Log-Likelihood Functions in Statistical Inference: Overdispersed Count Data Clustering. In Mixture Models and Applications; Bouguila, N., Fan, W., Eds.; Springer International Publishing: Cham, Switzerland, 2020; pp. 155–176. [Google Scholar]
  37. Shi, F.; Chen, L.; Han, J.; Childs, P. A Data-Driven Text Mining and Semantic Network Analysis for Design Information Retrieval. J. Mech. Des. Mech. Des. 2017, 139, 111402. [Google Scholar] [CrossRef]
  38. Kherwa, P.; Bansal, P. Latent Semantic Analysis: An Approach to Understand Semantic of Text. In Proceedings of the 2017 International Conference on Current Trends in Computer, Electrical, Electronics and Communication (CTCEEC), Mysuru, India, 8–9 September 2017; pp. 870–874. [Google Scholar]
  39. Cheng, Q.; Zhu, Y.; Song, J.; Zeng, H.; Wang, S.; Sun, K.; Zhang, J. Bert-Based Latent Semantic Analysis (Bert-LSA): A Case Study on Geospatial Data Technology and Application Trend Analysis. Appl. Sci. 2021, 11, 11897. [Google Scholar] [CrossRef]
  40. Lakshmi, A.; Latha, D. A Hybrid Model of Latent Semantic Analysis with Graph-Based Text Summarization on Telugu Text. In Intelligent System Design: Proceedings of INDIA 2022; Bhateja, V., Sunitha, K.V.N., Chen, Y., Zhang, Y., Eds.; Springer Nature Singapore: Singapore, 2023; pp. 179–186. [Google Scholar]
  41. Quan, C.; Liu, F.; Qi, L.; Tie, Y. LRT-CLUSTER: A New Clustering Algorithm Based on Likelihood Ratio Test to Identify Driving Genes. Interdiscip. Sci. Comput. Life Sci. 2023, 15, 217–230. [Google Scholar] [CrossRef]
  42. Xu, J.; Liu, T. Technological paradigm-based approaches towards challenges and policy shifts for sustainable wind energy development. Energy Policy 2020, 142, 111538. [Google Scholar] [CrossRef]
  43. Cavus, M. Advancing Power Systems with Renewable Energy and Intelligent Technologies: A Comprehensive Review on Grid Transformation and Integration. Electronics 2025, 14, 1159. [Google Scholar] [CrossRef]
  44. Erdiwansyah, F.; Mahidin, F.; Husin, H.; Nasaruddin, F.; Zaki, M.; Muhibbuddin, F. A critical review of the integration of renewable energy sources with various technologies. Prot. Control. Mod. Power Syst. 2021, 6, 3. [Google Scholar] [CrossRef]
  45. Diéguez-Santana, K.; González-Díaz, H. Machine learning in antibacterial discovery and development: A bibliometric and network analysis of research hotspots and trends. Comput. Biol. Med. 2023, 155, 106638. [Google Scholar] [CrossRef] [PubMed]
  46. Zhang, S.; Zhang, X.; Zhang, R.; Gu, W.; Cao, G. N-1 Evaluation of Integrated Electricity and Gas System Considering Cyber-Physical Interdependence. IEEE Trans. Smart Grid 2025, 16, 3728–3742. [Google Scholar] [CrossRef]
  47. Khatibi, M.; Bendtsen, J.D.; Stoustrup, J.; Mølbak, T. Exploiting Power-to-Heat Assets in District Heating Networks to Regulate Electric Power Network. IEEE Trans. Smart Grid 2021, 12, 2048–2059. [Google Scholar] [CrossRef]
  48. Tang, K.Y.; Chang, C.Y.; Hwang, G.J. Trends in artificial intelligence-supported e-learning: A systematic review and co-citation network analysis (1998–2019). Interact. Learn. Environ. 2021, 31, 2134–2152. [Google Scholar] [CrossRef]
  49. Xu, W.; Dai, T.; Shen, Z. Effect sizes and research directions of technology application in museum learning: Evidence obtained by integrating meta-analysis with co-citation network analysis. J. Comput. Assist. Learn. 2021, 38, 565–580. [Google Scholar] [CrossRef]
  50. Smojver, V.; Štorga, M.; Zovak, G. Exploring knowledge flow within a technology domain by conducting a dynamic analysis of a patent co-citation network. J. Knowl. Manag. 2020, 25, 433–453. [Google Scholar] [CrossRef]
  51. Wen, J.; Aishan, T.; Halik, Ü.; Wei, Z.; Wumaier, M. A Bibliometric and Visualized Analysis of Research Progress and Trends on Decay and Cavity Trees in Forest Ecosystem over 20 Years: An Application of the CiteSpace Software. Forests 2022, 13, 1437. [Google Scholar] [CrossRef]
  52. Rawat, K.S.; Sood, S.K. Knowledge mapping of computer applications in education using CiteSpace. Comput. Appl. Eng. Educ. 2021, 29, 1324–1339. [Google Scholar] [CrossRef]
  53. Agbejule, A.; Sempron-Namuag, P. A bibliometric analysis of carbon capture, utilization and storage (CCUS): Identifying barriers and drivers. Appl. Energy 2025, 400, 126604. [Google Scholar] [CrossRef]
  54. Zhang, Y.; Shi, X.; Zhang, H.; Cao, Y.; Terzija, V. Review on deep learning applications in frequency analysis and control of modern power system. Int. J. Electr. Power Energy Syst. 2022, 136, 107744. [Google Scholar] [CrossRef]
  55. Akhtar, S.; Adeel, M.; Iqbal, M.; Namoun, A.; Tufail, A.; Kim, K. Deep learning methods utilization in electric power systems. Energy Rep. 2023, 10, 2138–2151. [Google Scholar] [CrossRef]
Figure 1. The research process.
Figure 2. The PRISMA flow diagram.
Figure 3. Sensitivity analysis of the search strategy.
Figure 4. Trend of publication counts.
Figure 5. Keyword co-occurrence network map.
Figure 6. Clustering analysis map based on LSA (Top K = 10).
Figure 7. Clustering analysis map based on LLR (Top K = 10).
Figure 8. Distribution of the keyword clusters on the timeline.
Table 1. Results of document keyword extraction based on the TF-IDF algorithm.

| Rank | Term | Term Frequency |
|---|---|---|
| 1 | New energy | 65 |
| 2 | Source-Grid-Load-Storage | 34 |
| 3 | Power market | 23 |
| 4 | Renewable energy | 21 |
| 5 | Advanced energy storage | 18 |
| 6 | System safety | 15 |
| 7 | Pumped-storage hydroelectricity | 13 |
| 8 | Carbon capture, utilization, and storage (CCUS) | 13 |
| 9 | New energy generation | 13 |
| 10 | Power supply | 11 |

Table 2. Results of document keyword extraction based on the TextRank algorithm.

| Rank | Term | Raw Count | TextRank Weight |
|---|---|---|---|
| 1 | New power system | 12,294 | 0.4770 |
| 2 | Power system | 7594 | 0.4368 |
| 3 | Generation-Grid-Load-Storage | 3603 | 0.2029 |
| 4 | Distributed energy | 3250 | 0.1746 |
| 5 | Renewable energy | 2674 | 0.1575 |
| 6 | Electricity market | 2606 | 0.1491 |

Table 3. Top 10 keywords by frequency ranking.

| Rank | Keywords | Frequency | Centrality |
|---|---|---|---|
| 1 | Renewable energy | 247 | 0.06 |
| 2 | Model | 127 | 0.05 |
| 3 | Energy storage | 119 | 0.06 |
| 4 | Wind power | 73 | 0.05 |
| 5 | Performance | 73 | 0.05 |
| 6 | Storage | 69 | 0.08 |
| 7 | Power system | 63 | 0.05 |
| 8 | Electricity market | 57 | 0.01 |
| 9 | Operation | 56 | 0.04 |
| 10 | Integration | 56 | 0.05 |

Table 6. Burst disciplines in the field of new power system during 2015–2024.

| Keywords | Year | Strength | Begin | End |
|---|---|---|---|---|
| Electricity market | 2015 | 7.96 | 2015 | 2018 |
| Wind power | 2015 | 7.93 | 2015 | 2018 |
| Energy policy | 2015 | 5.51 | 2015 | 2018 |
| Smart grid | 2015 | 3.32 | 2015 | 2019 |
| Demand-side management | 2016 | 3.45 | 2016 | 2020 |
| Flexibility | 2016 | 2.88 | 2016 | 2019 |
| Economic growth | 2016 | 2.92 | 2017 | 2021 |
| Cointegration | 2017 | 2.74 | 2017 | 2020 |
| Neural network | 2020 | 3.16 | 2020 | 2023 |
| Capture | 2021 | 3.19 | 2021 | 2024 |

Note: In the original table, a 2015–2024 timeline column uses red bars to mark the burst period of each keyword; those periods are given here by the Begin and End columns. The sequence of the red highlights from left to right represents the chronological order in which each keyword became a research hotspot during 2015–2024.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Wang, Y.; Chen, H.; Liu, M.; Zhou, M.; Liu, L.; Zhang, Y. Decoding the Developmental Trajectory of the New Power System in China via Bibliometric and Visual Analysis. Energies 2025, 18, 4809. https://doi.org/10.3390/en18184809
