1. Introduction
According to United Nations projections, by 2050, 68% of the global population will reside in urban areas, with China’s urbanization process being particularly significant. It is estimated that China’s urban population will increase by 255 million by 2050 [1]. This trend imposes unprecedented pressures on urban development. As urbanization advances, cities face complex challenges such as resource inefficiency, environmental pollution, and aging infrastructure [2,3,4,5]. These issues not only hinder sustainable development but also directly affect the quality of life and well-being of urban residents [6]. To address these challenges, the United Nations’ New Urban Agenda and Sustainable Development Goal 11 advocate for building inclusive, safe, resilient, and sustainable cities and human settlements [7], highlighting the strategic importance of incorporating urban renewal into sustainable development agendas.
Urban renewal has been formally incorporated into China’s national core agenda and positioned as the strategic blueprint for urban development over the coming decades [8,9]. The 20th National Congress further emphasized accelerating the transformation of development models in megacities, promoting urban renewal initiatives, enhancing infrastructure construction, and building livable, resilient, and smart cities [10,11,12]. Urban renewal policies serve as institutional instruments for guiding resource allocation, regulating implementation pathways, and coordinating multiple stakeholders. The scientific and coordinated nature of these policy systems directly influences the effectiveness of urban renewal initiatives. Therefore, extracting meaningful themes, identifying inter-policy synergies, and analyzing evolutionary patterns from a vast and complex corpus of policy documents are essential for understanding the dynamics of policy development and improving policy effectiveness.
Recent academic studies on urban renewal policies have explored various dimensions. Ye et al. [13] examine the background factors driving policy innovation, highlighting the role of structural-instrumental, cultural-institutional, and environmental perspectives. Nachmany et al. [14] focus on the political economy and institutional dynamics influencing policy changes, while other work evaluates policy effectiveness through comparative studies such as those on Hong Kong and Macau [15]. Overall, quantitative content analysis of urban renewal policy texts remains scarce, and existing work is largely limited to discussing policy impact and effectiveness. As policy research increasingly moves toward quantitative and objective methodologies, topic modeling has emerged as a powerful tool for textual analysis. Among topic models, BERTopic has gained prominence for generating more coherent and interpretable topics. It has been applied in various domains, including government reports [16], social media analysis [17], policy evaluation [18], and general policy text mining [19]. Hence, BERTopic offers a promising approach for exploring semantic-level themes and policy orientations. This study analyzes urban renewal policy documents issued by central and local governments in China. By applying the BERTopic model at the sentence level, it aims to identify granular themes and uncover their temporal evolution. The goal is to depict the developmental characteristics and internal logic of urban renewal policies in China, thereby offering practical references for evidence-based policy formulation.
3. Materials and Methods
3.1. Overall Research Framework
This study performs topic identification and evolutionary analysis on urban renewal policy texts in China. The research steps are shown in Figure 1.
(1) Data collection and cleaning: Urban renewal policy documents issued by central and local (provincial and municipal) governments in China were retrieved with keyword searches for “urban renewal”, “old residential community renovation”, “urban village transformation”, “shantytown redevelopment”, and “dilapidated housing reconstruction”. A Python-based crawler was developed to download the relevant policy texts. To improve model performance, the collected policy documents were segmented into individual sentences. The Jieba library was then used for Chinese word segmentation, and stopwords and invalid or redundant content were removed. This process transformed long, unstructured policy texts into a structured corpus of short texts suitable for model input.
(2) Topic modeling: The BERTopic model was trained on the preprocessed short-text corpus. Topics were extracted from the urban renewal policy sentences, and visualizations were generated using dimensionality reduction algorithms. Identified topics were clustered according to semantic similarity, and each cluster was interpreted through qualitative content analysis to reveal its thematic orientation.
(3) Multi-perspective analysis: Topics and topic clusters were first identified for national urban renewal policies, and topic intensity was then computed to reveal trends in policy priorities over time. A comparative analysis further summarized the commonalities and differences in theme setting, content coverage, and other aspects of urban renewal policies across China’s four major economic regions.
3.2. Data Source and Selection
The policy texts analyzed in this study were obtained from the PKULaw database [43,44] (https://www.pkulaw.com/, accessed on 15 June 2025). Our data collection process complied with relevant Chinese regulations and the terms of service of the PKULaw website. The search covered the period from 1 January 2000 to 31 May 2025, using the keywords urban renewal, old residential community renovation, urban village transformation, shantytown redevelopment, and dilapidated housing reconstruction to retrieve relevant policy titles. A total of 1568 policy documents were retrieved, including 83 issued by central government agencies and 1485 by local governments. The following inclusion criteria were applied for data screening: (1) the document must be a current or soon-to-be-effective policy issued by the central or local government; (2) the document must be an official policy type, including laws, regulations, provisions, opinions, notices, and methods, and excluding informal decisions such as approvals, responses, or public announcements; (3) the policy content must be directly related to urban renewal. After screening, 1144 documents were retained as the final dataset for analysis (see Supplementary Materials). Notably, although China’s urban renewal practice began earlier, policies issued before 2000 were sparse and scattered. Therefore, this study focuses on policies issued from 2000 onward, which more systematically represent urban renewal efforts in China.
3.3. Text Preprocessing
Before performing BERTopic topic modeling, we conducted text preprocessing on the raw data to ensure the quality and relevance of the dataset, thereby improving data integrity and reducing computational overhead. This study followed the data preprocessing methods proposed by Jelodar H. [32] and Su Y. S. [33], and the text cleaning and optimization were completed through the following four steps:
- (1) Removing duplicate texts: We manually reviewed the unstructured text data to identify and remove duplicate articles, ensuring the uniqueness of each text in the dataset. This step helped avoid bias in the analysis results caused by duplicate data.
- (2) Removing irrelevant texts: By combining manual review with keyword filtering, we excluded texts that were clearly unrelated to the theme of “urban renewal”. This step focused the research scope, enhancing the accuracy and interpretability of subsequent topic modeling.
- (3) Cleaning unrelated elements: We systematically removed HTML tags, URL links, special symbols, and pure numeric characters from the text. These elements typically carry no semantic meaning, and removing them reduces noise in subsequent natural language processing.
- (4) Tokenization and stopword removal: We used the widely adopted Chinese text segmentation tool Jieba to tokenize the cleaned text, breaking continuous text into the smallest units with independent meanings. To accommodate the specialized terminology in the urban renewal field, we incorporated a custom domain-specific dictionary (e.g., “old residential areas”, “urban health checks”, “age-friendly renovation”, “micro-renovation”) during tokenization to prevent incorrect splitting of professional terms. Additionally, a stopword list was constructed from the Harbin Institute of Technology stopword list, specific terms such as names of people and places, special characters such as “@”, “#”, and emojis, and custom stopwords (e.g., “second phase”, “three years”, “a batch”, “mentioned above”) to eliminate irrelevant words.
It is important to note that during tokenization, we fully considered the characteristics of specialized terminology in the urban renewal field. If the tokenization results were inaccurate (e.g., failing to correctly identify compound terms like “old residential areas” or “micro-renovation”), it could lead to deviations in subsequent topic feature extraction. Core concepts might be fragmented into meaningless word fragments, reducing the significance of key terms and negatively affecting the quality and interpretability of the topic model. To address this, we imported the custom domain-specific dictionary and conducted iterative adjustments to maximize the accuracy of tokenization, ensuring the reliability of the analysis results.
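The cleaning and filtering steps above can be illustrated with a minimal sketch. It uses only the Python standard library; the regular expressions, the function names (`clean_text`, `split_sentences`, `filter_tokens`), and the sentence delimiters are assumptions made for illustration, not the authors' actual rules. Word segmentation itself is left to Jieba in the study and is not reproduced here.

```python
import re

# Hypothetical patterns: the exact cleaning rules used in the study are not given.
HTML_TAG = re.compile(r"<[^>]+>")
URL = re.compile(r"(?:https?://|www\.)[A-Za-z0-9./?=&%_#:-]+")
SENTENCE = re.compile(r"[^。！？；!?;]+[。！？；!?;]?")


def clean_text(raw: str) -> str:
    """Remove HTML tags, URL links, and whitespace from one policy document."""
    text = HTML_TAG.sub("", raw)
    text = URL.sub("", text)
    return re.sub(r"\s+", "", text)


def split_sentences(text: str) -> list[str]:
    """Split cleaned text into sentences at common Chinese and ASCII end marks."""
    return SENTENCE.findall(text)


def filter_tokens(tokens: list[str], stopwords: set[str]) -> list[str]:
    """Drop stopwords, pure numeric tokens, and tokens without any word character.

    In the study the tokens would come from Jieba with a custom dictionary
    (for example via jieba.lcut); here they are passed in directly so the
    sketch has no third-party dependency.
    """
    return [
        tok for tok in tokens
        if tok not in stopwords and not tok.isdigit() and re.search(r"\w", tok)
    ]


raw = "<p>推进城市更新。详见 https://example.com/a?b=1 改造老旧小区！</p>"
sentences = split_sentences(clean_text(raw))
kept = filter_tokens(["推进", "城市更新", "。", "2020", "的"], {"的"})
```

Keeping the cleaning, sentence splitting, and token filtering as separate functions makes it easy to rerun only the tokenization stage while the custom dictionary is being adjusted iteratively.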
3.4. BERTopic Modeling
In recent years, topic models have been widely applied in policy text analysis due to their ability to identify the underlying semantic structures and latent topics in texts [45]. Because the policy documents analyzed in this study are relatively long, encoding whole documents that span multiple topics into a single embedding may lose information [46]. Jin J. et al. [38], through sentence-level processing of government work reports, found that using BERTopic allowed for more precise handling of the semantic complexity and ambiguity of texts. Li Q. et al. [47] segmented news texts into sentence-level units, which yielded better topic consistency and coherence. Therefore, this study adopts sentence-level analysis. Each sentence is treated as an independent unit, and sentences are clustered based on their semantic and contextual features. This approach enables a clearer understanding of topics with contextual characteristics, presenting the multidimensional nature of the documents and improving the accuracy of topic clustering.
This study employs BERTopic [37], a deep learning-based topic modeling technique, to identify the latent topics embedded in China’s central and local urban renewal policy documents. BERTopic uses pre-trained BERT embeddings to capture rich semantic representations of text, making it particularly effective for context-sensitive topic modeling. The existing research using BERTopic for topic modeling generally follows these key steps [48]:
1. Embedding: This step involves converting natural language into a format that can be effectively processed by computers. The algorithm uses BERT or Sentence Transformers along with pre-trained language models to generate document embeddings from a set of documents. To extract sentence representations, we use the pre-trained “bert-base-chinese” [49] model as the embedding model within the BERTopic framework. This model is pre-trained on Chinese text and converts each sentence into a numerical vector that captures its textual semantics.
2. Dimensionality reduction: The high-dimensional embeddings are reduced using UMAP (Uniform Manifold Approximation and Projection) to preserve semantic relationships while minimizing computational complexity [50]. UMAP is well-suited for maintaining the global and local structure of the data during projection.
3. Clustering: The HDBSCAN (Hierarchical Density-Based Spatial Clustering of Applications with Noise) algorithm clusters the reduced embeddings based on density and semantic similarity. HDBSCAN excels at detecting clusters of arbitrary shape and handling noise, assigning documents to core or peripheral topic clusters accordingly [51].
4. Topic representation: To extract representative keywords for each topic, the class-based TF-IDF (c-TF-IDF) algorithm is applied, which calculates the term importance within each cluster. Terms with the highest c-TF-IDF scores are considered core descriptors of the topic [52]. The c-TF-IDF formula is defined as follows:
$$W_{t,c} = tf_{t,c} \times \log\left(1 + \frac{A}{tf_{t}}\right)$$
where $tf_{t,c}$ is the frequency of word $t$ in class $c$, $tf_{t}$ is the frequency of word $t$ across all classes, and $A$ is the average number of words per class.
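As an illustration of the topic representation step, the sketch below computes class-based TF-IDF weights as the frequency of a word in a class multiplied by log(1 + A / frequency of the word across all classes), where A is the average number of words per class. The function name `c_tf_idf`, the toy English tokens, and the use of raw counts are assumptions for illustration; library implementations may additionally normalize the term frequencies.

```python
import math
from collections import Counter


def c_tf_idf(class_tokens: dict[str, list[str]]) -> dict[str, dict[str, float]]:
    """Return a c-TF-IDF weight for every (class, word) pair.

    class_tokens maps a class (topic) label to the tokens of all sentences
    assigned to that class, concatenated into one list.
    """
    tf_per_class = {label: Counter(tokens) for label, tokens in class_tokens.items()}

    # Frequency of each word across all classes.
    tf_total: Counter = Counter()
    for counts in tf_per_class.values():
        tf_total.update(counts)

    # A: average number of words per class.
    avg_words = sum(tf_total.values()) / len(class_tokens)

    return {
        label: {
            word: freq * math.log(1 + avg_words / tf_total[word])
            for word, freq in counts.items()
        }
        for label, counts in tf_per_class.items()
    }


# Toy example with two hypothetical topics.
weights = c_tf_idf({
    "topic_a": ["land", "land", "loan"],
    "topic_b": ["loan", "gas"],
})
top_a = max(weights["topic_a"], key=weights["topic_a"].get)
top_b = max(weights["topic_b"], key=weights["topic_b"].get)
```

In this toy input, "loan" appears in both classes and is therefore down-weighted relative to words that are concentrated in a single class.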
After obtaining the latent topics of the texts using BERTopic, we calculated the topic intensity to represent the relative weight of each topic in the entire policy text. The specific calculation formula is as follows:
$$P_{k} = \frac{1}{N}\sum_{i=1}^{N} \theta_{ik}$$
where $P_{k}$ denotes the intensity of Topic k, $N$ represents the total number of sentences, and $\theta_{ik}$ indicates the probability value of Topic k in the i-th sentence.
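A minimal sketch of the topic intensity calculation follows: the intensity of Topic k is taken as the mean, over all sentences, of the probability that each sentence belongs to Topic k. The function name `topic_intensity` and the toy probability matrix are illustrative assumptions.

```python
def topic_intensity(sentence_topic_probs: list[list[float]]) -> list[float]:
    """Average the per-sentence topic probabilities column by column.

    sentence_topic_probs[i][k] is the probability of Topic k in sentence i.
    """
    if not sentence_topic_probs:
        raise ValueError("at least one sentence is required")
    n_sentences = len(sentence_topic_probs)
    n_topics = len(sentence_topic_probs[0])
    return [
        sum(row[k] for row in sentence_topic_probs) / n_sentences
        for k in range(n_topics)
    ]


# Three sentences, two topics (toy values).
intensity = topic_intensity([[0.8, 0.2], [0.4, 0.6], [0.6, 0.4]])
```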
3.5. BERTopic Parameter Configuration and Validation
In this study, the bert-base-chinese pre-trained sentence transformer was used for text embedding within the BERTopic model, with UMAP employed for dimensionality reduction. The parameters were set as follows: n_neighbors = 120, min_dist = 0.0, n_components = 5, and metric = “cosine”. The HDBSCAN clustering configuration was set to min_cluster_size = 180 and metric = “euclidean”. The number of topics was not pre-specified manually but was automatically inferred by the algorithm.
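The reported configuration could be wired together roughly as follows. This is a sketch under stated assumptions rather than the authors' script: it assumes the `bertopic`, `umap-learn`, `hdbscan`, and `sentence-transformers` packages are installed, loads `bert-base-chinese` through `SentenceTransformer`, and adds `prediction_data=True` and `calculate_probabilities=True` so that per-sentence topic probabilities are available. The dictionary names and `build_topic_model` are hypothetical.

```python
# Parameter values as reported in Section 3.5; the dictionary names are illustrative.
UMAP_PARAMS = {"n_neighbors": 120, "min_dist": 0.0, "n_components": 5, "metric": "cosine"}
HDBSCAN_PARAMS = {"min_cluster_size": 180, "metric": "euclidean"}


def build_topic_model():
    """Assemble a BERTopic model from the reported UMAP and HDBSCAN settings."""
    # Imported lazily so the configuration above can be inspected
    # without the heavy third-party dependencies installed.
    from bertopic import BERTopic
    from hdbscan import HDBSCAN
    from sentence_transformers import SentenceTransformer
    from umap import UMAP

    # Assumption: the checkpoint is wrapped by sentence-transformers for pooling.
    embedding_model = SentenceTransformer("bert-base-chinese")
    return BERTopic(
        embedding_model=embedding_model,
        umap_model=UMAP(**UMAP_PARAMS),
        # prediction_data=True is an added assumption, needed for probabilities.
        hdbscan_model=HDBSCAN(prediction_data=True, **HDBSCAN_PARAMS),
        calculate_probabilities=True,
        # The number of topics is left unset so that it is inferred from the clusters.
    )
```

With a list of preprocessed sentences, `topics, probs = build_topic_model().fit_transform(sentences)` would return one topic label per sentence and the probability matrix used for topic intensity.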
Topic coherence score was the core quantitative metric for evaluating the quality of the topic model. It measured the semantic association strength among high-frequency terms within a topic, reflecting human perception of structured topics. In other words, it evaluated the interrelationship between the top k words in a given topic. Higher coherence scores indicated that the topic was cohesive, clear, and relevant, while lower scores suggested a lack of clarity, presence of noise, or irrelevance [53]. To validate the necessity of sentence-level segmentation, we compared sentence-level and full-text-level granularities. The full-text model had a C_V coherence score of 0.389, whereas the sentence-level model used in this study achieved a C_V coherence score of 0.656, indicating that sentence-level segmentation produced markedly more coherent topics.
To evaluate the accuracy and separability of document clustering, the Davies–Bouldin Index (DBI) was used, a well-established method for internal validation. This index identified the least favorable group pairs by utilizing within-group variance and between-group centroid distances [54]. The DBI for this model was 0.313, indicating good separation between topic clusters.
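To make the Davies–Bouldin Index concrete, the following sketch computes it for small clusters of points: for each cluster it finds the worst ratio of summed within-cluster scatter to centroid distance against any other cluster, then averages these worst cases. Lower values indicate better separation. The function name and the toy 2D points are assumptions; the study's clusters live in the reduced embedding space.

```python
import math

Point = tuple[float, ...]


def davies_bouldin(clusters: list[list[Point]]) -> float:
    """Davies-Bouldin Index for a list of clusters (each a list of points)."""
    # Centroid of each cluster: coordinate-wise mean.
    centroids = [
        tuple(sum(coords) / len(points) for coords in zip(*points))
        for points in clusters
    ]
    # Scatter of each cluster: mean distance of its points to the centroid.
    scatter = [
        sum(math.dist(p, centroid) for p in points) / len(points)
        for points, centroid in zip(clusters, centroids)
    ]
    k = len(clusters)
    worst_ratios = [
        max(
            (scatter[i] + scatter[j]) / math.dist(centroids[i], centroids[j])
            for j in range(k) if j != i
        )
        for i in range(k)
    ]
    return sum(worst_ratios) / k


# Well-separated clusters give a low index; nearby clusters give a high one.
separated = davies_bouldin([[(0.0, 0.0), (0.0, 2.0)], [(10.0, 0.0), (10.0, 2.0)]])
crowded = davies_bouldin([[(0.0, 0.0), (0.0, 2.0)], [(1.0, 0.0), (1.0, 2.0)]])
```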
To ensure the accuracy and reliability of the machine learning results, we conducted a thorough manual review of the generated topics and their related policies. A team of three experts carefully assessed the coherence and relevance of each topic, reviewing the keywords and policies. To ensure reliability among the evaluators, we applied Fleiss’ kappa, a metric used to evaluate the consistency of three or more raters on categorical variables [55]. Our analysis yielded a kappa value of 0.63 for the 34 topics, indicating a substantial level of agreement among the raters, ensuring that the unsupervised model’s output was meaningful and insightful. These metrics collectively supported the coherence, robustness, and semantic validity of the topic structure derived through BERTopic.
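For reference, Fleiss’ kappa can be computed from a table that counts, for each rated item, how many raters chose each category. The sketch below is a minimal implementation under the assumption that every item is rated by the same number of raters; the function name and the toy ratings (three raters, two categories) are illustrative and are not the experts' actual judgments.

```python
def fleiss_kappa(counts: list[list[int]]) -> float:
    """Fleiss' kappa for a table where counts[i][j] is the number of raters
    who assigned item i to category j. Every row must sum to the same
    number of raters.
    """
    n_items = len(counts)
    n_raters = sum(counts[0])

    # Observed agreement: mean over items of the share of agreeing rater pairs.
    per_item = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(per_item) / n_items

    # Expected agreement from the overall category proportions.
    category_totals = [sum(col) for col in zip(*counts)]
    p_expected = sum((t / (n_items * n_raters)) ** 2 for t in category_totals)

    return (p_observed - p_expected) / (1 - p_expected)


# Four topics rated by three raters as "coherent" or "not coherent" (toy data).
partial = fleiss_kappa([[3, 0], [2, 1], [1, 2], [0, 3]])
perfect = fleiss_kappa([[3, 0], [0, 3], [3, 0], [0, 3]])
```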
4. Results and Discussion
4.1. Overall Identification of Policy Themes
Figure 2 illustrates the declining trend in keyword importance weights within each topic. It shows that for most topics, a few high-ranking keywords carried the greatest representational weight, while the importance of other keywords declined rapidly. This suggests that each topic could be effectively represented by approximately three to five core keywords, with additional keywords providing only marginal improvements in interpretability.
We conducted topic identification on Chinese urban renewal policy texts using the BERTopic model. After multiple experimental iterations, we ultimately identified 34 topics (labeled Topic 0 to Topic 33). Each identified topic was represented by a set of characteristic terms with different weights, which collectively illustrated the main content of the topic, as detailed in Table 1.
In terms of policy focus areas, the topics fall mainly into the following aspects: (1) Infrastructure and renovation of old residential areas: Topics 0, 1, 14, and 26 show that the policies address the core needs of residents’ daily lives by upgrading community facilities, promoting the renovation of old residential areas and urban villages, and incorporating infrastructure construction into resettlement projects. The goal is to improve living conditions, enhance urban functions, and ensure the effective implementation of renovation tasks. (2) Land acquisition and resettlement compensation: Topics 4, 9, 22, and 27 indicate that the policies aim to balance urban development with residents’ rights. By standardizing the land acquisition process and clarifying compensation standards, they seek to keep land acquisition lawful and orderly. (3) Financial support and diverse fundraising: Topics 3, 20, and 32 show that the policies broaden funding sources through financial subsidies, tax incentives, and loan support, and encourage social capital to participate in urban renewal projects. (4) Policy norms and execution assurance: Topics 6, 8, 15, and 25 show that the policies emphasize defining urban renewal plans and policies through issuance, formulation, review, and approval, so that all tasks are carried out in a structured manner. In addition, Topics 10, 21, and 33 show that the policies strengthen accountability through legal means such as penalizing violations, clarifying civil liabilities, and introducing compulsory enforcement by courts and administrative litigation, which supports compliance in policy execution.
4.2. Overall Identification of Policy Directions
To further understand the relationships among identified topics, the study conducted visual analyses using a topic distance map and a cosine similarity heatmap. These tools helped to uncover the latent structure of urban renewal policy discourse.
Figure 3 presents a 2D topic distance map alongside a cosine similarity heatmap; Figure 4 shows the topic hierarchical clustering diagram. The topic distance map applied dimensionality reduction to project the 34 topics onto a two-dimensional plane. In this map, the size of each circle indicated the relative importance of the topic, and the spatial proximity between circles reflected semantic similarity. The cosine similarity heatmap revealed pairwise semantic correlations among the 34 topics. Darker shades of blue indicated stronger similarity (values closer to 1), while lighter shades represented weaker associations (values closer to 0). The heatmap confirmed the internal coherence of certain topic clusters and the divergence among others. These findings reinforced the reliability of the BERTopic clustering results and validated the natural grouping of topics. Together, these visual analytics revealed the semantic structure underlying China’s urban renewal policy discourse. They demonstrated the systematic and hierarchical nature of policy formulation, offering insights into how thematic priorities evolved and interconnected within the broader urban governance framework.
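The pairwise values behind a cosine similarity heatmap can be sketched as follows. Each topic is assumed to be represented by a vector (for example a topic embedding); the function names and the toy two-dimensional vectors are illustrative, and the study's actual topic vectors are not reproduced here.

```python
import math


def cosine_similarity(u: list[float], v: list[float]) -> float:
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def similarity_matrix(vectors: list[list[float]]) -> list[list[float]]:
    """Symmetric matrix of pairwise cosine similarities between topic vectors."""
    return [[cosine_similarity(u, v) for v in vectors] for u in vectors]


# Three toy topic vectors: the first and third are orthogonal.
matrix = similarity_matrix([[1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
```

Entry `matrix[i][j]` corresponds to one heatmap cell: values near 1 would be drawn in darker shades and values near 0 in lighter shades.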
Based on the clustering results, evaluation metrics, and expert judgment, this study categorizes the 34 topics into five overarching dimensions: Spatial Improvement and Facility Upgrades, Project Collaboration and Approval, Land Acquisition and Resettlement Compensation, Fiscal Incentives and Funding Support, and Institutional Guarantees and Long-term Governance.
Table 2 presents the number and proportion of relevant sentences for the 34 identified topics. The data show that urban renewal policy texts have distinct areas of focus:
1. Spatial Improvement and Facility Upgrades (35.48%): Optimizing physical space layout and improving infrastructure and public services are key tasks in the urban renewal process. These factors are directly related to enhancing urban functions and improving residents’ quality of life, representing a critical aspect of the “hard power” enhancement in urban renewal.
2. Project Collaboration and Approval (30.08%): This theme highlights the complexity of urban renewal as a systemic project that relies heavily on coordination among multiple stakeholders and efficient approval processes. The policies emphasize breaking down departmental barriers, integrating resources from enterprises, governments, and communities, simplifying approval procedures, and improving efficiency, with the aim of removing institutional obstacles and ensuring the smooth implementation of urban renewal projects.
3. Land Acquisition and Resettlement Compensation (15.56%): This theme reflects the focus on safeguarding people’s livelihood rights during urban renewal. While advancing urban space restructuring and functional upgrades, it is essential to coordinate interests in the land requisition process. Reasonable compensation and resettlement policies protect the rights of those affected by land acquisition and support the social stability of the renewal process.
4. Fiscal Incentives and Funding Support (13.51%): This theme highlights the policies’ focus on securing financial support for urban renewal. Because urban renewal projects often involve large investments and long payback periods, relying on a single funding source is insufficient. The policies address this by establishing diversified fiscal and tax incentive mechanisms, such as subsidies, tax benefits, and special funds, to attract social capital, provide continuous financial support for projects, and promote the sustainable development of urban renewal.
5. Institutional Guarantees and Long-term Governance (5.37%): The relatively low proportion suggests that current policies focus more on specific implementation aspects, and that attention to institution building and sustainable governance mechanisms still has room for improvement.
4.3. Overall Analysis of the Evolutionary Trends of Policy Themes
To further explore the evolution of the thematic intensity of urban renewal policies in China, this study conducted an evolutionary analysis of the topics generated by the BERTopic model. The data were sliced by year to show the evolution of five key policy themes in urban renewal over the past 25 years. This allowed for a clearer understanding of the changing trends in thematic intensity and government focus. The topic strength distribution of urban renewal policies over the years is shown in Table 3, and the trends were analyzed through the thematic intensity trend line (Figure 5).
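The yearly slicing can be sketched as a group-by over sentences: each sentence carries the issuing year of its policy and a vector of topic probabilities, and the per-year intensity of a topic is the mean probability among that year's sentences. Whether the study normalizes within each year in exactly this way is an assumption, as are the function name `yearly_intensity` and the toy records.

```python
from collections import defaultdict


def yearly_intensity(
    records: list[tuple[int, list[float]]],
) -> dict[int, list[float]]:
    """Mean topic probabilities per year.

    records holds one (year, topic_probabilities) pair per sentence.
    """
    sums: dict[int, list[float]] = {}
    counts: dict[int, int] = defaultdict(int)
    for year, probs in records:
        if year not in sums:
            sums[year] = [0.0] * len(probs)
        for k, p in enumerate(probs):
            sums[year][k] += p
        counts[year] += 1
    # Years are returned in chronological order for plotting a trend line.
    return {
        year: [total / counts[year] for total in sums[year]]
        for year in sorted(sums)
    }


trend = yearly_intensity([
    (2020, [0.2, 0.8]),
    (2019, [0.9, 0.1]),
    (2019, [0.7, 0.3]),
])
```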
1. Project Collaboration and Approval: This aligns with the broader shift in China’s urban governance from government-led approaches to multi-stakeholder collaboration. Early high-intensity topics focus on vertical management processes such as planning review (Topic 15), reflecting the government-dominated project-based system. For example, Topic 8 showed a significant change in 2019 based on Pettitt’s change-point detection. This timing coincides with the release of the Central Committee’s decision on modernizing China’s system and governance capacity, which emphasizes “building a well-structured, scientifically sound, and effective institutional system.” This reflects increased attention to institutional design and multi-actor participation in urban–rural development, marking a transition toward more collaborative, institutionalized, and procedural governance.
2. Land Acquisition and Resettlement Compensation: This corresponds to the strategic shift in Chinese urban governance from large-scale new development to improved stock management and finer governance. In the early 2000s, rising emphasis on topics like housing expropriation and compensation (Topics 4 and 9) addressed demands during rapid urbanization. The 2011 Regulation on the Expropriation and Compensation of Houses on State-owned Land emphasizes public interest and protection of residents’ rights. A change point detected in Topic 9 in 2020 aligns with the Ministry of Housing and Urban–Rural Development’s notice against excessive demolition and construction. This national policy stresses controlled expropriation, renovation, and quality of life. The decline in Topic 9’s intensity indicates a move away from large-scale land redevelopment toward more regulated, rights-protecting, and refined governance.
3. Spatial Improvement and Facility Upgrades: This reflects China’s people-centered and sustainability-oriented development strategy. After 2010, under the “New Urbanization” initiative, policy attention shifted toward improving human living environments. The growing emphasis on facilities enhancement (Topic 0) and old residential neighborhoods (Topic 26) reflects this priority. From 2016 to 2020, the focus on old neighborhood renovation intensified, reinforced by national guidelines issued in 2020. This phase highlights organic regeneration of existing urban areas, social integration, public participation, and green development, demonstrating more refined and human-centered urban governance.
4. Fiscal Incentives and Funding Support: This reveals the evolution of financing mechanisms in China’s urban renewal. Around 2014, central government interventions such as monetary compensation in shantytown redevelopment and policy bank loans led to increased attention on lending (Topic 3). After 2020, policy encouraged private investment (Topic 5), reflecting a shift toward market-driven and sustainable renewal models. This trend aligns with broader market reforms and modernized governance, though challenges remain in balancing policy incentives with fiscal sustainability.
5. Institutional Guarantees and Long-term Governance: This theme reflects the deep transformation from “project-driven” to “institutional empowerment”. In 2013, the Decision of the CPC Central Committee on Several Major Issues Concerning Comprehensively Deepening Reform first put forward the “modernization of national governance”. Under the strategy of comprehensively advancing the rule of law, the topic of project contracts (Topic 24) gained more attention, and the topic of civil liability (Topic 21) was further strengthened. This shows that policies focus on clarifying rights and responsibilities via standardized contracts and ensuring implementation quality through professional qualifications. The Pettitt change-point detection method shows that Topic 30 had a change point in 2020. This matches the higher demand for project quality, liability tracing, and law-based supervision after urban renewal entered the large-scale implementation stage. This indicates that national governance is shifting from “project-oriented thinking” to “institution-oriented thinking”. The goal is to reduce transaction risks and improve governance efficiency by building a stable legal and contractual framework. However, overall attention to institutional guarantees and long-term governance remains insufficient, and this gap needs to be addressed through further policy refinement and resource allocation.
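The change points cited above rely on Pettitt’s test. A minimal sketch of the test is given below: for every split position it sums the signs of all differences between earlier and later observations, takes the position with the largest absolute sum, and uses the common approximation p ≈ 2·exp(−6K² / (n³ + n²)) for significance. The function name, the return convention, and the toy series are assumptions; the authors' implementation and the yearly intensity series are not reproduced here.

```python
import math


def pettitt_test(series: list[float]) -> tuple[int, int, float]:
    """Pettitt's change-point test.

    Returns (t, K, p): the series is split into series[:t] and series[t:],
    K is the maximum absolute rank-sign statistic, and p is the approximate
    two-sided p-value. t is 0 when no split has a non-zero statistic.
    """
    n = len(series)
    best_t, best_u = 0, 0
    for t in range(1, n):
        # Sum of signs of (earlier - later) over all pairs across the split.
        u = 0
        for i in range(t):
            for j in range(t, n):
                diff = series[i] - series[j]
                u += (diff > 0) - (diff < 0)
        if abs(u) > abs(best_u):
            best_t, best_u = t, u
    k = abs(best_u)
    p_value = min(1.0, 2.0 * math.exp(-6.0 * k * k / (n ** 3 + n ** 2)))
    return best_t, k, p_value


# Toy intensity series with a level shift after the fifth observation.
split, statistic, p_value = pettitt_test([1, 1, 1, 1, 1, 5, 5, 5, 5, 5])
```

If the series is indexed by year, `years[split]` would be the first year of the new regime under this convention.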
4.4. Comparative Analysis of Policies in Different Regions
To gain deeper insights into the policy focus and strategic differentiation under different development conditions, and to accurately capture the government’s attention on urban renewal, this study categorizes the four major economic regions of China based on the policies released by the State Council, as shown in Table 4.
Using the BERTopic model, this study identifies and analyzes the policy themes of urban renewal for the four major economic regions, with the results displayed in Figure 6. These results clearly show both similarities and distinctive regional characteristics in the thematic structure of urban renewal policies across the four regions.
Through a comparative analysis of the urban renewal policy themes across the four regions, the following common features emerge: (1) Land Acquisition and Resettlement Compensation: All four regions have clear policy themes surrounding land acquisition and resettlement compensation, reflecting the collaborative nature of the policies. In the Eastern region, topics like Topic 3 and Topic 4 emphasize land requisition and resettlement, while in the Western region, Topic 5 and Topic 6 focus on compensation and relocation. In the Central region, Topic 0 and Topic 8 highlight similar themes, and in the Northeastern region, Topic 4 and Topic 7 also reflect the land acquisition and resettlement paradigm. This indicates that all regions focus on standardizing the requisition process to ensure smooth integration of resources and protect the rights of affected groups. (2) Funding Support and Financing Innovation: Each region proposes various financial and fiscal support mechanisms to address the funding needs of urban renewal projects. In the Eastern region, Topic 5 and Topic 7 involve resettlement investment and subsidies, while in the Western region, Topic 7 and Topic 17 mention special loans and bond financing. The Central region’s Topic 11 and Topic 17 emphasize special funds and evaluation mechanisms, while the Northeastern region’s Topic 3 addresses monetization and financial arrangements. Despite the varying economic strengths across the regions, all have been actively exploring diverse funding paths and improving funding assurance mechanisms. (3) Facility Improvement and Quality Upgrades: This reflects the high priority placed on improving living conditions and responding to public demands. In the Eastern region, Topic 6 and Topic 11 emphasize functional optimization and upgrades to infrastructure like fire safety; in the Western region, Topic 13 stresses improving housing conditions and enhancing the quality of housing. 
In the Central region, Topic 12 focuses on gas facilities in old residential areas, a critical public service need. In the Northeastern region, Topic 5 shows a clear policy focus on quality upgrades.
In terms of specific policy orientations, the economic regions show distinct characteristics: (1) Eastern Region: The focus is on fine-grained urban renewal management, market-based operations, and policy standardization. For example, Topic 0 reflects a refined approach to optimizing existing space, while Topic 14 emphasizes the market-oriented allocation of land resources. Topic 16 underscores the importance of policy rigor and standardization. (2) Western Region: The focus is on urban–rural integration and financial support, emphasizing both urban and rural spatial renewal and strengthening funding mechanisms. Topic 0 highlights the region’s emphasis on integrated rural–urban renewal, and Topics 7, 15, and 17 form a diverse financing system with special loans, purchasing in lieu of construction, and expanded social capital participation to ensure the smooth implementation of urban renewal projects. (3) Central Region: The focus is on public services and livelihood security. Topics 12 and 14 reflect the attention to residents’ daily needs, especially the renewal of aging infrastructure like gas facilities. Topics such as Topic 2 and Topic 15 reflect broader efforts to ensure improved urban quality and better public service delivery, enhancing the public’s sense of gain and well-being. (4) Northeastern Region: The focus is on the management of industrial heritage and subsiding areas. Topics 1 and 8 reflect the region’s efforts to address the issues of old industrial bases, including ecological restoration and exploring transformation paths through industrial heritage and cultural tourism integration.
The regional differences in urban renewal policies arise mainly from the following factors: (1) Development Stage Differences: The Eastern region has entered the post-industrialization stage, with limited room to expand construction land; the Central and Western regions are in a phase of accelerated urbanization; the Northeastern region, as an old industrial base, faces rapid industrial decline and needs to resolve historical issues to create conditions for transformation and upgrading. (2) Economic Foundation and Fiscal Capacity: The Eastern region has a developed economy, strong local government finances, and abundant social capital; the Western region has a weaker economic foundation and limited fiscal capacity, and cannot rely solely on its own resources to advance large-scale urban renewal; the Central region's fiscal capacity lies between that of the Eastern and Western regions, and its urban renewal yields significant social benefits; the Northeastern region faces downward economic pressure and weak fiscal support. (3) Geographical Conditions: The Eastern region benefits from a long coastline, superior ports, and geographic advantages that help attract global capital and talent. The Western region covers a vast area with a fragile ecology; it is rich in energy and mineral resources, but these are unevenly distributed, so urban renewal there needs to take ecological carrying capacity into account. The Central region, located inland, is a key transportation hub and population concentration area, and focuses on ensuring basic livelihoods and facilitating the smooth flow of people and goods. The Northeastern region has rich ecological resources as well as a concentration of traditional energy and heavy industries.
5. Conclusions and Recommendations
This study analyzes 1144 urban renewal policy documents issued in China between 2000 and 2025 using the BERTopic model. It reveals their thematic evolution and regional variations, providing a basis for optimizing policy design.
Key Findings: (1) Policy theme structure: BERTopic-based recognition and clustering extracts 34 sub-themes of urban renewal policy, which are further grouped into five core directions. This classification highlights both the systematic and hierarchical nature of urban renewal policies. (2) Theme evolutionary trends: China's urban renewal governance paradigm has shifted from a government-led, incremental expansion model focused on engineering projects to a more collaborative approach centered on stock optimization and institutional empowerment. (3) Regional policy differences: Regional policies show both commonalities and distinct regional characteristics owing to differences in development stages, economic conditions, and geographic contexts. The Eastern region emphasizes fine-grained management and market-based operations, while the Western region stresses urban–rural coordination and diversified financing models. The Central region focuses on public services and livelihood improvement, and the Northeastern region pays more attention to industrial heritage and ecological governance.
Policy implications and recommendations: (1) Strengthen multi-department collaborative governance. Led by housing authorities, coordinate departments such as natural resources and finance to create policy synergy. Streamline approvals, build a digital platform with performance metrics (e.g., utilization and online processing rates), and integrate project tracking for transparent, cost-effective processes. (2) Enhance funding and incentives. Establish a dedicated urban renewal fund that combines central and local finances and prioritizes public welfare projects. Use interest subsidies and guarantees to attract social capital for renovation and industrial development. (3) Tailor policies to regional disparities. The Eastern region should focus on refined management and smart, low-carbon renewal; the Central region on old community upgrades and public services; the Western region on urban–rural integration and ecological and cultural preservation; and the Northeastern region on revitalizing industrial heritage and managing subsidence areas to support urban transformation.
Research contributions: (1) Moving beyond traditional manual coding or word-frequency methods, this study introduces the BERTopic model to urban renewal policy analysis, enabling sentence-level semantic mining and evolutionary tracking. This approach significantly improves the accuracy and depth of policy theme identification and offers a new technical pathway for quantitative policy text research. (2) A comparative analysis of policy texts across four major economic regions reveals both commonalities and differentiated features in policy orientation, providing a scientific basis for regionally coordinated and tailored policy design by central and local governments.
Research limitations: (1) This study does not deeply explore the spatial heterogeneity of urban renewal policies across different regions. (2) It has not fully verified the causal relationship between regional economic/fiscal differences and policy discourse.
Research prospects: Future research can integrate Geographic Information System (GIS) technology or spatial econometric methods to further reveal how policy implementation differs across regions and what spatial spillover effects it produces. It can also apply causal inference methods, such as panel data analysis, to examine in depth how economic foundations shape policy orientations, thereby providing a more scientific basis for optimizing regional coordinated development strategies.