Review

Evolving from Rules to Learning in Urban Modeling and Planning Support Systems

1 Department of Urban Planning, School of Architecture, Southeast University, Nanjing 210096, China
2 Department of Sustainable Development, Environmental Science and Engineering, KTH Royal Institute of Technology, 100 44 Stockholm, Sweden
Urban Sci. 2025, 9(12), 508; https://doi.org/10.3390/urbansci9120508
Submission received: 14 October 2025 / Revised: 23 November 2025 / Accepted: 26 November 2025 / Published: 1 December 2025
(This article belongs to the Special Issue Research on Plural Values in Sustainable Urban Planning)

Abstract

Urban modeling is being reshaped by advances in artificial intelligence (AI) and data-rich sensing. This review assembles an integrated evidence base connecting spatial dynamic modeling (SDM), planning support systems (PSSs), urban analytics, and governance concerns. We analyze 1290 publications (2000–2025) using a reproducible pipeline that combines structured literature retrieval with retrieval-augmented generation (RAG) for semantic screening and evidence extraction. Bibliometric mapping and a rigorous coding framework structure the synthesis. The results reveal three linked trajectories. First, SDM has progressed from rule-based simulation toward learned spatial representations using deep and multimodal learning. Second, PSS has evolved from static analytical tools to interactive and participatory environments that embed AI for scenario exploration and stakeholder engagement. Third, governance themes such as transparency, fairness, and accountability have gained importance but remain unevenly implemented in modeling workflows. Building on these findings, we advance AI-aligned SDM, which integrates explainability, uncertainty reporting, documentation, and participation into model design to strengthen institutional accountability and evidence-based planning. A forward research agenda emphasizes methodological fusion between simulation and learning, institutional design for continuous model stewardship, and epistemic pluralism connecting local knowledge with AI to advance equitable and transparent urban governance.

1. Introduction

Urban systems are being redefined by dense data streams, pervasive sensing, and rapidly improving artificial intelligence (AI) [1,2]. The unprecedented pace of population growth and digital transformation has intensified pressures on cities to manage complexity, sustainability, and equity within increasingly data-driven decision environments [3]. These shifts do not merely add new tools to the planning repertoire; they change how urban knowledge is produced, contested, and used in decisions [4,5]. Over the past three decades, Spatial Dynamic Modeling (SDM) and Planning Support Systems (PSSs) have matured from separate traditions into converging instruments for data-rich urban analysis [6]. Early SDM framed cities as evolving spatial processes driven by local interactions and transition rules, using cellular automata (CA), agent-based models (ABMs), Markov chains, and system dynamics to render change visible and debatable [7,8,9,10]. That generation excelled at interpretability and communicability: parameters were explicit, uncertainty could be probed through sensitivity analysis, and simulated maps supported institutional deliberation. Yet the very clarity of the rule sets limited transferability across contexts and made calibration laborious [11,12].
A second wave emphasized hybridization. Researchers coupled rule-based schemes with statistical and econometric components, allowing urban change to respond to socioeconomic drivers, infrastructure, and policy constraints rather than geometry alone [13,14]. Ensemble and Bayesian formulations further improved predictive fidelity and provided a way to express uncertainty within comparable experimental designs [15,16]. In parallel, validation routines became more careful and comparative, and the field learned to speak in both process and prediction. The hybrid era also raised enduring questions for planning: how can emerging AI and data infrastructures improve generalizability and interpretability at the same time, what should count as a convincing explanation for a forecasted pattern, and how should models communicate uncertainty, bias, and limits of applicability when transferred across cities with different data regimes or institutional capacities [13,17]?
Building on this trajectory, recent developments in GeoAI mark a third inflection. Deep learning and vision–language modeling (VLM) let algorithms infer spatial regularities from examples, integrating satellite imagery, street-level scenes, and textual corpora to learn morphological and functional signatures at multiple scales [18,19]. This capacity shifts SDM away from hand-crafted neighborhood rules toward learned representations that connect visual, semantic, and statistical signals of urban form and activity. Work in multimodal pretraining for remote sensing and urban indicators illustrates how semantic signals can transfer across geographies while preserving local specificity [20]. While these advances deliver gains in granularity and transfer performance, they place new demands on interpretation. Saliency maps and feature attribution can help, but practitioners still need to understand when a model is extrapolating beyond its experience, how it handles minority geographies, and why particular patterns are emphasized over others [21,22].
During the same period, PSS evolved from desktop extensions of a Geographic Information System (GIS) into web-based, interactive environments designed to support collaborative exploration, multi-criteria trade-offs, and co-design of scenarios [23,24]. These platforms did not simply make analysis more convenient; they reframed analysis as dialogue. Interfaces became spaces where oppositional values could be juxtaposed and where assumptions could be surfaced and contested [25,26,27]. At the same time, the field learned hard lessons about implementation pitfalls, including organizational fit, skill requirements, and the fragility of adoption beyond pilots [28]. As AI modules and decision-support routines entered these systems, the boundary between modeling and governance blurred further, with examples from operational decision contexts demonstrating both opportunities and the need for auditable logic and traceability in recommendations [29]. The promise is clear: decision rooms where people can query models in natural language, receive explainable outputs, and iterate policy levers with immediate feedback. The risk is equally clear: if automation advances faster than institutional capacity to scrutinize it, then speed can substitute for judgment.
At this intersection, a persistent imbalance is visible. Technical papers optimize accuracy metrics, while governance scholarship articulates fairness, transparency, and digital rights [30,31,32]. Few studies make these threads meet inside an operational workflow [33,34,35]. Tooling for fairness diagnostics and transparency has matured in computer science, yet adaptation to spatial settings and planning interfaces remains limited [33,36]. Governance debates point to rights, accountability, and data justice that are directly implicated when model choices affect exposure to environmental risk or access to services [37]. It is telling that many planning pilots report impressive prototypes but limited organizational uptake. Our reading is that legitimacy has become a first-order design requirement, not a downstream add-on. Legitimacy here includes interpretability that is meaningful to non-specialists, uncertainty communication that travels well in policy settings, and participation that shapes the model rather than merely reacts to its outputs.
These gaps are further amplified as cities adopt digital-twin infrastructures, autonomous data pipelines, and AI-driven decision systems [38,39]. Always-on synchronization between assets, flows, and behaviors promises tighter coupling between observation and action, but it also hardens implicit choices about what counts as signal, whose perspectives are encoded, and how exceptions are handled. If left implicit, these choices can narrow the range of plausible futures the system is able to represent. Conversely, when made explicit and contestable, they can anchor more honest planning conversations about trade-offs, distributional consequences, and institutional responsibilities [40].
To address this challenge, Figure 1 introduces the conceptual framework that organizes the remainder of this review. It synthesizes methodological, participatory, and governance dimensions into a shared socio-technical ecosystem. SDM, PSS, and AI governance are positioned as interacting pathways that co-produce model behavior and institutional use. This framework also clarifies how subsequent sections proceed: methods and modeling trajectories, participation and interface design in PSS, and accountability mechanisms that include transparency protocols, fairness audits, documentation, and stewardship.
We adopt AI-aligned SDM as a working stance. By this we mean modeling practices that pair predictive intelligence with built-in explainability, calibrated uncertainty, and portable documentation and that are embedded in a PSS capable of hosting participation as a design input rather than an afterthought. The goal is not to supplant accuracy but to recognize that accuracy alone cannot carry the weight of public decisions. In settings where data are uneven and authority is distributed, intelligibility and accountability are part of what makes a model useful [41,42].
This review assembles and analyzes 1290 publications from 2000 to 2025 using a reproducible pipeline that combines structured literature retrieval with retrieval-augmented semantic synthesis and bibliometric mapping. Rather than cataloguing techniques, we read the literature for design tensions that recur across cases and for the institutional conditions under which models travel from prototypes to practice. Guided by this stance, the review is organized around two research questions:
RQ1. How have SDM and PSSs co-evolved from rule-based and hybrid approaches to data-driven and multimodal learning, and what methodological turning points have actually changed how planning decisions are made?
RQ2. What constitutes an implementable framework for AI-aligned SDM in real institutions, including the minimum set of practices for interpretability, uncertainty reporting, documentation, and participation that can sustain legitimacy over time?
Our intention is pragmatic. We do not argue for a single best technique, but for workflows that make technical power and institutional judgment co-productive. Accordingly, the following sections trace the historical evolution and current practice of SDM and PSSs, followed by a synthesis of actionable insights. Section 2 explains the methodological design, Section 3 presents results, Section 4 interprets their implications, and Section 5 concludes.

2. Materials and Methods

Understanding how AI is reshaping SDM and PSSs requires a methodological design that can bridge technical precision and conceptual inclusivity. The field spans at least three epistemic traditions: quantitative modeling grounded in spatial science, design-oriented decision support rooted in planning theory, and normative debates informed by data governance and digital ethics. Each domain carries its own terminology, venues, and evaluation criteria, which complicates the construction of a unified evidence base. To address this, we implemented a hybrid review protocol that integrates systematic search, AI-assisted screening, and bibliometric–semantic synthesis. The protocol emphasizes transparency, reproducibility, and explicit checks against automation bias, consistent with contemporary best practices for systematic reviews and evidence mapping.

2.1. Research Design and Scope

This review adopts a structured and reproducible literature synthesis design inspired by systematic review principles but tailored for integrative analysis rather than strict PRISMA compliance [43,44,45]. It defines the overall scope and methodological boundaries of the study, linking the conceptual motivations introduced earlier to the procedural steps that follow. The temporal window was 2000 to 2025, capturing three successive methodological generations: rule-based SDM (CA, ABM, Markov, and system dynamics), hybrid models integrating statistical or econometric drivers, and the rise of AI-enhanced and multimodal modeling pipelines. Seminal studies published before 2000 were retained when identified through cross-referencing to maintain historical continuity.
Four bibliographic platforms were queried to balance disciplinary breadth and index coverage: Web of Science Core Collection, Scopus, IEEE Xplore, and Google Scholar via API aggregation. This combination captured peer-reviewed journals and high-impact conference proceedings that are often missed by a single index [46]. We restricted inclusion to peer-reviewed journal articles and indexed conference proceedings; theses, dissertations, patents, and grey literature were excluded. Searches were run between 1 March 2025 and 31 July 2025. To manage Google Scholar noise and duplication, we capped per-query returns, filtered by language (English), excluded patents and citation-only entries, and performed two-stage deduplication (DOI/URL equivalence, then title–year fuzzy matching at a 0.88 similarity threshold with ±1-year tolerance). This produced 4516 raw records, reduced to 2410 unique records after cleaning.
A pre-registered keyword framework (Table 1) guided query construction. Five thematic lenses were used: modeling approaches, urban simulation and planning applications, decision support and PSS design, AI and data integration, and ethics or governance. Boolean search strings were harmonized across databases while maintaining native syntax, and keywords were joined by logical “AND” between domains and “OR” within each domain to balance coverage of the five domain themes. For example, field operators such as TITLE-ABS-KEY in Scopus and TS = in Web of Science were adapted to ensure comparable retrieval structures. This balanced structure reduced over-representation of any single theme, such as land-use modeling or digital-twin systems.
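The AND-between, OR-within joining logic described above can be sketched as follows. The keyword lists here are illustrative placeholders rather than the full pre-registered vocabulary of Table 1, and the field wrappers mirror the Scopus and Web of Science operators mentioned in the text:

```python
# Sketch: render the five thematic lenses into database-specific Boolean
# strings -- OR within a lens, AND between lenses. Lens contents are
# illustrative, not the full Table 1 keyword framework.
LENSES = {
    "modeling": ["cellular automata", "agent-based model", "spatial dynamic model*"],
    "planning": ["urban simulation", "land-use change"],
    "pss": ["planning support system*", "decision support"],
    "ai": ["deep learning", "GeoAI", "multimodal learning"],
    "governance": ["fairness", "transparency", "accountability"],
}

def build_query(field_wrapper):
    """Join keywords with OR inside each lens and AND across lenses,
    wrapping the whole expression in a platform-specific field operator."""
    blocks = [
        "(" + " OR ".join(f'"{kw}"' for kw in kws) + ")"
        for kws in LENSES.values()
    ]
    return field_wrapper(" AND ".join(blocks))

scopus_query = build_query(lambda q: f"TITLE-ABS-KEY({q})")
wos_query = build_query(lambda q: f"TS=({q})")
```

Keeping the lens dictionary as the single source of truth and varying only the field wrapper is one way to harmonize strings across databases while preserving native syntax.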
To evaluate completeness, pilot runs were compared against a benchmark set of influential SDM and PSS papers identified from prior reviews. The finalized keyword set achieved high recall, retrieving more than ninety percent of these benchmark publications across databases. As an additional balance check, no single thematic lens contributed more than one third of results in the initial aggregation. This procedure reduced disciplinary bias and ensured inclusion of both technical and governance-oriented studies. The combined search produced 4516 records prior to cleaning. Each item was retained in raw JSON to preserve index-specific metadata for validation of bibliometric metrics and traceability from analytical figures back to source records. Author-country affiliations were standardized and mapped to UN world regions to enable regional descriptors; the synthesis reports global patterns with regional notes where salient, given uneven coverage across time and venues.
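The completeness check against the benchmark set reduces to a set-based recall computation. The sketch below uses placeholder identifiers, not the actual benchmark DOIs:

```python
def benchmark_recall(retrieved_dois, benchmark_dois):
    """Fraction of benchmark papers present in the retrieved set
    (the completeness check described above)."""
    hits = benchmark_dois & retrieved_dois
    return len(hits) / len(benchmark_dois)

# Toy example with placeholder identifiers: 3 of 5 benchmarks retrieved.
retrieved = {"10.1/a", "10.1/b", "10.1/c", "10.1/d"}
benchmark = {"10.1/a", "10.1/b", "10.1/c", "10.1/e", "10.1/f"}
recall = benchmark_recall(retrieved, benchmark)  # 0.6
```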

2.2. Data Harvesting and Cleaning

All records were normalized to a common schema containing standardized bibliographic and source fields. Because many publications appeared in multiple indexes, careful deduplication was essential to avoid inflation of bibliometric measures. A two-stage procedure was implemented.
The first stage removed exact duplicates using DOI and URL equivalence. When conflicting identifiers were detected, the DOI was retained as the canonical key due to its persistence across platforms. The second stage applied fuzzy Title–Year matching, using normalized titles and a similarity threshold of 0.88 with a one-year tolerance. This step eliminated minor punctuation or spelling differences while preserving distinct versions when substantial revision could be confirmed, for example, a conference paper and its extended journal article.
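A minimal sketch of the stage-two check follows, using Python's standard-library `SequenceMatcher` as a stand-in for the similarity metric (the review does not specify which string-similarity measure was used, so the 0.88 threshold is applied to this ratio only for illustration):

```python
import re
from difflib import SequenceMatcher

def normalize(title):
    """Lower-case and strip punctuation and extra whitespace before comparison."""
    return re.sub(r"\s+", " ", re.sub(r"[^\w\s]", "", title.lower())).strip()

def is_duplicate(rec_a, rec_b, threshold=0.88, year_tol=1):
    """Stage-two fuzzy Title-Year match: similarity >= 0.88 with +/-1-year tolerance."""
    if abs(rec_a["year"] - rec_b["year"]) > year_tol:
        return False
    sim = SequenceMatcher(
        None, normalize(rec_a["title"]), normalize(rec_b["title"])
    ).ratio()
    return sim >= threshold

a = {"title": "Urban growth modelling with cellular automata.", "year": 2014}
b = {"title": "Urban Growth Modelling with Cellular Automata", "year": 2015}
c = {"title": "Agent-based simulation of housing markets", "year": 2014}
```

Here `a` and `b` collapse to the same normalized title within the year tolerance and are flagged as duplicates, while `c` survives as a distinct record.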
All deduplication and normalization operations were logged with timestamps and batch identifiers. Manual inspection of a stratified sample confirmed that false positives were below two percent. The resulting dataset contained 2410 unique records with uniform fields, ready for screening and semantic enrichment. Each record retained its provenance tag indicating the original database source, enabling later analysis of cross-index coverage and potential publication bias. This harmonized dataset formed the basis for AI-assisted screening and retrieval-augmented synthesis. Early harmonization ensured that subsequent steps, from semantic embedding to bibliometric mapping, operated on a consistent and auditable corpus.

2.3. AI-Assisted Screening and Retrieval-Augmented Synthesis

The size of the corpus and the uneven abstract quality required an enhanced screening strategy that could combine computational efficiency with human oversight. Traditional title–abstract screening alone would have risked excluding studies that employed unconventional terminology or embedded modeling details deep within full-text sections [47,48,49]. We therefore developed an AI-assisted retrieval and synthesis workflow (Figure 2) based on retrieval-augmented generation (RAG) principles, executed in a controlled Azure OpenAI environment using Python (version 3.1.1). This arrangement improves recall for studies using atypical terminology while preserving a fully logged and reproducible process (see Appendix A for full model, batching, and logging details).
All records passing initial cleaning were processed with Pandas in a standardized pipeline. Where institutional access permitted, full-text PDFs were downloaded and parsed alongside abstracts. Each document was divided into overlapping text segments of approximately 800 words to capture contextual continuity across paragraphs. These segments were converted into vector representations using an embedding model (text-embedding-3-large) optimized for semantic similarity in scientific text [48,49]. A vector store was then constructed to support targeted retrieval queries corresponding to the major themes of the review.
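The segmentation step can be sketched as an overlapping word-window function. The 100-word overlap below is an illustrative assumption, since the text specifies only the approximate 800-word segment size; the embedding call itself is left as a comment:

```python
def chunk_words(text, size=800, overlap=100):
    """Split a document into overlapping word windows (~800 words each)
    so that context carries across paragraph boundaries."""
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break
    return chunks

doc = " ".join(f"w{i}" for i in range(2000))  # stand-in for a parsed full text
segments = chunk_words(doc)
# Each segment would then be embedded (e.g., with text-embedding-3-large)
# and written to the vector store with its record ID and page span.
```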
To identify whether a document met the inclusion criteria, the RAG workflow was queried with a curated set of prompts reflecting the review’s analytical focus. Examples included “comparison of cellular automata and agent-based models in urban expansion modeling,” “integration of digital twins within planning support systems,” “use of deep or multimodal learning for urban functional mapping,” and “application of fairness or transparency frameworks in AI-assisted planning.” The retrieval model ranked document segments by cosine similarity to these prompts, returning both text excerpts and associated metadata such as record ID, page span, and similarity score. This ensured that even studies lacking explicit keywords in their titles or abstracts could still be captured if relevant methods or discussions were embedded elsewhere in the text.
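Retrieval ranking by cosine similarity might look like the following sketch, with toy three-dimensional vectors standing in for the actual embeddings and metadata strings standing in for the record ID and page span:

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def rank_segments(query_vec, store):
    """Rank stored segment vectors by similarity to a query embedding,
    returning (metadata, score) pairs in descending order."""
    scored = [(meta, cosine(query_vec, vec)) for meta, vec in store]
    return sorted(scored, key=lambda x: x[1], reverse=True)

# Toy vector store: (record-id:page-span, embedding) pairs.
store = [
    ("rec-001:p3", [0.9, 0.1, 0.0]),
    ("rec-002:p7", [0.1, 0.9, 0.2]),
    ("rec-003:p1", [0.7, 0.6, 0.1]),
]
ranking = rank_segments([1.0, 0.2, 0.0], store)
```

The top-ranked excerpts, together with their metadata, are what the downstream classifier sees, which is how studies without the target keywords in their titles or abstracts can still surface.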
Relevance classification was performed by a deterministic instance of GPT-5-nano deployed through a secure Azure environment. The model produced structured outputs in JSON format, labelling each record as Relevant, Partially relevant, or Excluded, together with a short rationale derived from the retrieved passages. Deterministic parameter settings, schema validation, and token limits were enforced to prevent non-reproducible variability. Each batch of classifications was automatically logged with a time stamp, version tag, and process identifier. All classifications were produced with deterministic settings (temperature = 0, fixed token cap = 400, and standardized batch size = 24) to ensure reproducible outputs, as detailed in Appendix A.
To maintain reliability, all AI-generated labels were subjected to human verification. Two reviewers independently audited stratified samples representing different publication years, venues, and provisional thematic clusters. Reviewers were blinded to the model’s preliminary labels during evaluation to avoid confirmation bias. Discrepancies were discussed collectively until a consensus decision was reached, and the resulting adjudicated set was used to fine-tune prompt phrasing for subsequent batches. Inter-rater agreement reached κ = 0.73 (commonly interpreted as “substantial”), which is above a conventional acceptability threshold around 0.70 [50].
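The reported agreement statistic is Cohen's kappa, which can be computed from two raters' labels as follows (the labels below are toy data, not the study's audit sample):

```python
def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two raters labelling the same items:
    (observed agreement - chance agreement) / (1 - chance agreement)."""
    n = len(labels_a)
    cats = set(labels_a) | set(labels_b)
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    p_e = sum((labels_a.count(c) / n) * (labels_b.count(c) / n) for c in cats)
    return (p_o - p_e) / (1 - p_e)

# R = Relevant, P = Partially relevant, E = Excluded (toy audit sample)
rater_1 = ["R", "R", "E", "P", "R", "E", "R", "P"]
rater_2 = ["R", "R", "E", "R", "R", "E", "P", "P"]
kappa = cohens_kappa(rater_1, rater_2)  # 0.6 on this toy sample
```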
Several safeguards were incorporated to minimize automation bias. Time and batch limits were applied to prevent drift in model behavior, identical titles were cached to avoid redundant re-screening, and schema validation checks ensured consistent output formats. In cases where the model returned incomplete or ambiguous results, the record was automatically marked as Partially relevant pending manual review. This conservative procedure favored inclusion and minimized the risk of false negatives. Controls such as accountability, verification prompts, and limits on unreviewed automation are recommended in the human-factors and informatics literature to reduce automation complacency and bias [51,52,53].
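The schema-validation check and its conservative fallback can be sketched as below; the field names (`label`, `rationale`) are illustrative assumptions about the structured output format:

```python
import json

ALLOWED_LABELS = {"Relevant", "Partially relevant", "Excluded"}

def validate_classification(raw):
    """Schema check applied to each model response; malformed or ambiguous
    outputs fall back to 'Partially relevant' pending manual review."""
    try:
        obj = json.loads(raw)
    except json.JSONDecodeError:
        return {"label": "Partially relevant", "rationale": "unparseable output"}
    if obj.get("label") not in ALLOWED_LABELS or not obj.get("rationale"):
        return {"label": "Partially relevant", "rationale": "schema violation"}
    return obj

ok = validate_classification('{"label": "Relevant", "rationale": "CA-ABM comparison"}')
fallback = validate_classification('{"label": "maybe"}')
```

Defaulting ambiguous cases toward inclusion rather than exclusion is what makes the procedure conservative with respect to false negatives.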

2.4. Coding, Bibliometric Mapping, and Quality Assessment

Following the completion of screening, all 1290 publications were imported into a structured analytical database for systematic coding and bibliometric evaluation. The purpose of this phase was to translate the diverse body of literature into a consistent and analyzable form, enabling quantitative exploration of patterns and qualitative interpretation of methodological and thematic trends. Because the corpus spanned multiple disciplines and publication types, a detailed codebook was essential to ensure conceptual consistency and transparency.
Each record was assigned a unique identifier and linked to its metadata, including title, authorship, publication year, journal or conference venue, and DOI. Coding was performed manually within a controlled data environment to minimize human error and preserve provenance. The schema was designed to capture not only the technical characteristics of each study but also its thematic orientation and governance relevance. Table 2 summarizes the six major dimensions used in the coding framework.
The codebook was developed iteratively. An initial draft was prepared based on concepts drawn from the keyword framework and refined after examining a stratified sample of twenty percent of the corpus. Two coders conducted joint calibration sessions until terminology and classification criteria were harmonized. Ambiguous cases were resolved through discussion, and decisions were logged to maintain an audit trail. Once calibration was complete, the remaining records were coded independently using standardized templates. Consistency checks were conducted at regular intervals to ensure that conceptual drift did not occur over time.
Quality assessment accompanied the bibliometric mapping to evaluate the evidential robustness of the corpus. Because methodological heterogeneity prevented formal meta-analysis, a qualitative rubric was used. Each publication was evaluated across four axes: transparency of model description, clarity of data provenance, rigor of validation, and attention to social or ethical dimensions. Scores were assigned on a three-point scale (0 = low, 1 = medium, 2 = high). Aggregated results from this assessment are discussed in Section 3, where they are used to interpret differences in research maturity among clusters. These bibliometric mappings and co-occurrence analyses represent the modeling outputs of the text-mining stage, translating the semantically processed corpus into interpretable thematic clusters used in the next section.
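Aggregating the rubric might look like the following sketch; the axis field names and scores are illustrative, with each axis scored on the 0–2 scale described above:

```python
from statistics import mean

AXES = ["model_transparency", "data_provenance", "validation_rigor", "ethics_attention"]

def cluster_profile(records):
    """Mean score per rubric axis for a set of coded publications
    (0 = low, 1 = medium, 2 = high)."""
    return {axis: round(mean(r[axis] for r in records), 2) for axis in AXES}

# Two toy coded records from one thematic cluster.
coded = [
    {"model_transparency": 2, "data_provenance": 1, "validation_rigor": 2, "ethics_attention": 0},
    {"model_transparency": 1, "data_provenance": 2, "validation_rigor": 1, "ethics_attention": 1},
]
profile = cluster_profile(coded)
```

Comparing such per-axis profiles across clusters is one way to make the "research maturity" differences discussed in Section 3 concrete.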

3. Results

3.1. Overview of the Corpus and Cluster Composition

The analytical corpus consists of 1290 publications spanning 2000 to 2025, which together document the field’s shift from rule-based simulation to AI-augmented urban analytics and governance. The span is long enough to observe not only method turnover but also the migration of interfaces, participation practices, and governance language into mainstream modeling. Co-occurrence analysis of author keywords yields a stable four-cluster solution that is also recognizable in practice. One cluster centers on SDM, bringing together CA, ABM, Markov, and system dynamics approaches that aim to replicate or predict urban change. A second cluster focuses on PSSs, emphasizing interfaces that enable participatory scenario design, visualization, and decision dialogue. A third cluster captures AI and advanced analytics, including deep learning, computer vision, and multimodal techniques applied to land change, urban morphology, and functional zoning. The fourth and smallest cluster addresses governance and ethics, gathering work on transparency, fairness, documentation, and public participation. Across the full period, SDM and AI account for more than half of the items, PSSs contribute roughly one quarter, and governance forms a smaller but rapidly growing share. This imbalance is consistent with a field where technical capacity expands faster than institutional uptake, which helps to explain recurrent implementation gaps.
Figure 3 shows annual publication counts by theme from 2000 to 2025. The dashed vertical line at 2015 marks the point at which deep learning diffuses into remote sensing and urban-form analysis. Prior to 2015, annual volume is modest and dominated by SDM, while PSSs appear at low levels and governance is largely absent. In the shaded post-2015 window the total volume rises sharply, with the curve steepening after 2018 and exceeding 250 records by 2025. The AI layer becomes the largest single component in the final years, yet SDM does not decline, which indicates refinement and hybridization rather than displacement. PSSs grow gradually across the period and become most visible in years where cross-domain work intensifies. Governance remains the smallest layer in absolute terms but accelerates alongside AI after 2018, suggesting that ethical and analytical debates are increasingly pulled into the same studies instead of proceeding in parallel.
The chronological layering visible in the co-occurrence network, derived from the same corpus, is consistent with this series. Publications from 2000 to 2010 cluster around SDM topics such as calibration of neighborhood rules and basic agent interactions. Between 2010 and 2015 the literature broadens to include hybrid CA–ABM frameworks and first-generation PSS prototypes. After 2015 the density of AI-related nodes increases markedly, especially for remote sensing and urban morphology, while governance terms appear from 2018 onward and attach most frequently to AI and PSSs, indicating where accountability concerns are being operationalized.
Figure 4 summarizes the share of annual output that spans at least two domains. The heatmap confirms a move from specialization to integration. Pairings of PSSs with SDM appear early in the 2000s, fade during the mid period, and resurge after 2016 as interfaces reconnect to modeling pipelines. The AI with SDM overlay emerges after 2016 and darkens steadily through 2020 to 2025, which reflects learned representations embedded in spatial dynamics tasks. Overlays of AI with governance and AI with PSSs become visible from 2019 onward and reach the darkest intensity bin in the most recent years, indicating that a meaningful fraction of annual publications now integrate methods, interfaces, and accountability within single studies. The triple overlay combining AI, governance, and PSSs is largely absent before 2020, then appears and strengthens, a pattern that aligns with the rise of explainability toolchains, documentation protocols, and participatory front ends in applied projects.
Regional diffusion is asynchronous. Earlier consolidation appears in long-standing research hubs, with marked acceleration in several rapidly urbanizing regions after 2015; given uneven coverage, these are reported as descriptors rather than formal cross-region comparisons. This temporal divergence reflects differences in data infrastructure, funding mechanisms, and policy programs that have shaped the pace and direction of AI integration in urban modeling. Geographic patterns in Figure 5 reveal concentrated productivity with widening participation. Output is highest in the United States (162 records) and China (145), followed by the United Kingdom (59), Germany (44), Italy (41), and India (39). Additional hubs include Canada (49), Australia (28), Turkey (24), Japan (23), The Netherlands (22), Russia (20), Switzerland (19), and Spain (19). The distribution suggests that access to training data, compute resources, and collaborative institutions is uneven, which likely influences both the kinds of problems that are studied and the readiness of agencies to adopt AI-enabled systems. Countries that invest in open data and digital-twin programs tend to appear more frequently in the overlays that combine AI with PSSs and governance, consistent with lower marginal costs of integrating models into live decision settings.

3.2. Evolution of SDM

The trajectory of SDM over the past twenty-five years reflects the continuous search for a balance between interpretability, empirical fidelity, and computational performance. The corpus demonstrates a clear methodological transition from rule-based paradigms rooted in geographical theory to data-driven models informed by AI. This evolution can be divided into three broad phases, each characterized by distinct theoretical motivations and technical advances, yet connected by an enduring concern with how urban form and function can be represented through dynamic spatial processes.
The first phase, spanning roughly the early 2000s, was dominated by classical CA, ABM, and hybrid Markov-chain frameworks [7,8,54]. These approaches treated urban change as a bottom-up process, where local interactions and neighborhood rules collectively produced emergent spatial patterns. CA-based models offered planners a powerful visual logic: a grid-based abstraction that could simulate land conversion, diffusion, and containment effects [55,56]. Agent-based extensions introduced behavioral variability, allowing researchers to represent household decisions, developer competition, or transport accessibility as drivers of spatial evolution [57,58]. The literature from this period shows intense experimentation with calibration, neighborhood structures, and transition probabilities, often validating against remote-sensing and census data [59,60].
The second phase, emerging between 2010 and 2015, marked a turning point toward hybridization and integration with statistical learning [61]. Researchers began to embed CA and ABM logics within regression-based or econometric formulations, enabling the incorporation of socioeconomic, infrastructural, and environmental variables as dynamic constraints [15,56]. The hybrid CA–logistic regression models and geographically weighted frameworks of this period improved predictive accuracy while retaining a mechanistic link to spatial processes [62]. At the same time, the availability of higher-resolution remote-sensing imagery and global datasets facilitated cross-city comparisons and multi-scale validation [63,64]. This era also saw the rise of probabilistic formulations such as Bayesian calibration and ensemble modeling, allowing the uncertainty of simulation outcomes to be quantified [12]. Within the corpus, studies from this phase exhibit a clear methodological sophistication but remain largely deterministic in their workflow, with limited automation of model selection or parameter optimization.
The third and most recent phase, beginning around 2015, coincides with the diffusion of deep learning and GeoAI into spatial analysis [18,19]. Here the modeling emphasis shifted from prescriptive rules to learned representations. Convolutional and recurrent neural networks began to be used for urban growth prediction, land-cover classification, and morphological pattern recognition [5]. Transformer-based VLMs further extended this capability by linking textual descriptions, images, and geospatial attributes in a shared latent space [65]. This multimodal capacity allowed SDM to move beyond pure spatial simulation toward semantic understanding of urban form and function. For example, models could infer accessibility, density gradients, or typological patterns directly from imagery without explicit coding of neighborhood effects. The bibliometric evidence shows a sharp increase in the frequency of AI-related keywords after 2018, corresponding to a broader disciplinary shift toward data-driven modeling.
Across these phases, a recurring methodological tension becomes evident between interpretability and autonomy. Rule-based SDM frameworks offered clarity but required manual calibration, whereas AI-driven models delivered higher accuracy at the cost of transparency [66]. A growing number of publications, particularly after 2020, attempt to bridge this divide by embedding explainable AI techniques within spatial modeling workflows. Examples include the use of saliency maps to interpret spatial attention in convolutional networks, or feature attribution methods that quantify the influence of socioeconomic variables on predicted urban expansion [40,67]. Such studies signal an emerging consensus that future SDM should not merely replicate spatial patterns but also make the reasoning process intelligible to decision-makers.
The changing vocabulary of modeling practice also reflects this methodological transformation. Across the literature, early rule-based terms such as cellular automata, Markov chains, and system dynamics appear most frequently in the 2000–2010 period. After 2015, the terminology shifts sharply toward data-driven approaches, including machine learning, GeoAI, deep learning, and digital-twin concepts. This linguistic evolution parallels the substantive trajectory of SDM, moving from prescriptive rule systems toward adaptive, AI-enhanced frameworks capable of learning spatial and semantic patterns from multimodal data. The trend underscores how conceptual and technical vocabularies have co-evolved alongside methodological advancements, even without relying on explicit visualizations.
Despite this progress, the corpus also reveals that most AI-enhanced SDM models remain primarily technical prototypes, often validated on a single metropolitan region. Few studies systematically evaluate generalizability across diverse urban contexts or assess the social implications of model deployment [68,69]. Only a small subset explicitly addresses ethical dimensions such as data bias, model transparency, or accessibility of tools to planning institutions with limited resources. This asymmetry highlights the ongoing challenge of aligning methodological innovation with governance relevance.

3.3. The Transformation of PSSs

The transformation of PSSs from static analytical tools into adaptive, participatory, and increasingly intelligent environments represents one of the most significant shifts in urban modeling practice during the study period [70]. This conclusion is supported by the bibliometric evidence shown in Figure 4, which reveals the growing co-occurrence of PSSs with AI and governance themes. The progressive overlap of these clusters indicates that PSSs have evolved beyond standalone decision-support utilities to become integrated environments linking modeling, participation, and accountability. Within this context, the corpus demonstrates that PSSs have evolved not only as computational infrastructures but also as epistemic frameworks that shape how evidence is produced, interpreted, and debated within planning processes. This transformation can be traced through three overlapping trajectories: functional expansion, participatory integration, and algorithmic augmentation.
During the early 2000s, PSSs were largely conceived as extensions of Geographic Information Systems, designed to provide planners with analytical capabilities for scenario evaluation and impact assessment [71]. These systems often functioned as stand-alone desktop applications, integrating spatial databases with visualization and simulation modules. The focus was on supporting expert decision-making through improved access to spatial information. Studies from this period emphasize tool development, interface design, and technical interoperability [23]. The primary objective was efficiency: to automate the manipulation of spatial layers, accelerate scenario testing, and visualize the outcomes of SDM [72]. Although these early systems successfully embedded SDM outputs into decision workflows, they were limited by rigid architectures and minimal opportunities for stakeholder interaction [24].
Between 2010 and 2015, PSSs began to adopt a more interactive and participatory orientation [25]. Advances in web-based technologies and open-source geospatial platforms facilitated the development of systems that could operate across institutional boundaries. Planners, researchers, and sometimes community representatives could now engage with simulation results through web interfaces that allowed real-time exploration of planning scenarios. This participatory turn marked an epistemological change: models were no longer regarded solely as expert instruments but as boundary objects that mediated dialogue among multiple actors [73]. The literature from this period introduces concepts such as co-design, collaborative mapping, and participatory visualization [74]. Empirical studies document workshops where stakeholders adjust model parameters and observe the resulting urban growth trajectories, illustrating how computational tools can support deliberation as well as prediction [72].
The emergence of open data ecosystems further expanded the reach of PSSs. Governmental agencies and research institutions began to share spatial datasets through online portals, enabling systems to integrate live or near-real-time information [75]. This openness increased transparency but also introduced new technical challenges related to data harmonization and quality control [76]. A number of studies in the corpus experiment with linking PSSs to crowdsourced data and social media feeds, demonstrating early attempts to incorporate citizen-generated knowledge into formal planning workflows [77,78]. These initiatives reflect a gradual redefinition of expertise in urban analytics, where data curation and interpretation become collective rather than individual acts.
After 2015, the trajectory of PSSs converged with that of AI-enhanced modeling [29]. The integration of machine-learning algorithms into decision-support systems transformed the role of the planner from a model operator to a model interpreter. Systems began to include predictive modules capable of automatically generating alternative land-use scenarios, detecting spatial anomalies, or suggesting optimal configurations based on multi-criteria optimization [79]. AI components also enabled personalization of interfaces, adapting displayed information to user preferences and past interaction patterns. The literature highlights the emergence of “smart” or “adaptive” PSSs that learn from user feedback, effectively functioning as interactive agents within the planning process.
More recent studies extend this integration by embedding deep-learning models for image recognition, semantic segmentation, and accessibility analysis directly within PSS environments [32]. These models enable near-instantaneous visualizations of future urban configurations based on combinations of policy inputs and environmental constraints. At the same time, the rapid automation of analytical tasks has revived debates on the balance between human judgment and algorithmic recommendation. Several publications explicitly question the implications of AI-driven PSSs for procedural fairness, accountability, and the transparency of decision pathways [30,80]. While few systems currently incorporate formal ethical audits, a growing subset of projects adopt open-source licensing or participatory data governance to maintain trust and verifiability [81,82].
The bibliometric evidence supports this temporal narrative. Keyword frequency analyses show that terms such as “participation,” “interactive,” and “web-based planning” begin to rise after 2010, while “AI,” “deep learning,” and “digital twin” dominate after 2018. The co-occurrence network displays a convergence zone where PSS and AI clusters intersect, illustrating that planning systems increasingly serve as platforms through which advanced analytical methods become accessible to non-technical users. Recent bibliometric and keyword analyses highlight this trend: venues that historically published SDM or urban informatics research now regularly feature participatory and AI-enabled PSS applications, suggesting a disciplinary fusion that blurs the line between modeling and governance.
Despite these advances, the corpus reveals persistent asymmetries in the distribution of technical capacity. Most AI-integrated PSSs originate from high-income contexts with strong data infrastructures, while systems designed for resource-constrained environments remain focused on visualization rather than predictive analytics. Moreover, only a small fraction of studies report long-term adoption beyond pilot phases, indicating that institutional integration remains uneven [28]. Researchers increasingly recognize that technical sophistication alone does not guarantee planning relevance; usability, interpretability, and governance alignment are equally critical [17,83]. As a result, the most recent literature emphasizes co-production between model developers, planners, and communities as a precondition for sustainable system implementation.

3.4. AI, Governance, and the Ethics of Urban Modeling

The governance and ethical dimensions of SDM and PSS have gradually moved from peripheral discussion to a central research concern. Although this thematic cluster remains the smallest within the corpus, it reveals the most rapid rate of growth, particularly since 2018. The expansion of machine learning and automated decision-making in planning has intensified debates on fairness, transparency, and inclusion, highlighting the need to design modeling ecosystems that align computational intelligence with societal values.
Early publications touching on governance appear around the mid-2000s and are mainly conceptual [26,84]. They frame PSSs as instruments for improving institutional accountability and civic participation, yet without explicit reference to algorithmic processes. The focus during this period rests on participatory GIS and stakeholder workshops, emphasizing inclusivity through dialogue rather than through the internal design of models. Ethical concerns were therefore procedural: who participates, whose knowledge counts, and how scenarios are communicated. While these studies established the normative foundation for participatory planning, they offered little guidance on how ethics could be operationalized in data-driven modeling.
A second wave of governance-oriented studies emerges between 2015 and 2020, coinciding with the acceleration of AI adoption in urban analytics. Researchers began to recognize that algorithmic systems could both enhance and constrain participation [31]. On one hand, automation enabled broader access to simulation results through interactive interfaces; on the other, it introduced opacity in how predictions were generated. Within this context, the discourse on algorithmic transparency gained prominence. Several studies propose model-documentation protocols and open-source repositories as mechanisms to foster trust and reproducibility [85,86]. Others experiment with “white-box” modeling techniques, combining interpretable statistical components with neural networks to maintain a balance between predictive power and clarity of reasoning [87].
Parallel to transparency, the notion of algorithmic fairness entered the literature [88]. Scholars observed that biases in input data, such as uneven coverage of remote-sensing imagery, demographic imbalance in training samples, or differing spatial resolutions, could translate into systematic distortions in predicted urban outcomes [89]. Fairness metrics originally developed in computer science were adapted for spatial applications, allowing researchers to quantify disparities in model performance across geographic or socioeconomic groups. A few studies explicitly test fairness interventions by re-weighting samples or incorporating contextual covariates that capture urban inequality [90,91]. These contributions mark an important methodological shift: ethical principles begin to be expressed through measurable indicators rather than purely discursive arguments.
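The adaptation of fairness metrics to spatial settings can be illustrated with a minimal accuracy-gap measure across geographic groups. The grouping variable and the toy predictions below are synthetic; published studies use richer metrics, but the disparity logic is the same.

```python
def group_accuracy_gap(y_true, y_pred, groups):
    """Disparity in predictive accuracy across spatial or socioeconomic
    groups: max minus min per-group accuracy (0 means parity)."""
    acc = {}
    for g in set(groups):
        idx = [i for i, gg in enumerate(groups) if gg == g]
        acc[g] = sum(y_true[i] == y_pred[i] for i in idx) / len(idx)
    return max(acc.values()) - min(acc.values()), acc

# Synthetic example: a land-conversion classifier performs worse in district B
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 0, 0, 1, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]
gap, per_group = group_accuracy_gap(y_true, y_pred, groups)
```

A non-zero gap of this kind is precisely what the re-weighting interventions mentioned above aim to reduce.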
Participation also evolved conceptually during this phase. Instead of viewing inclusion as external to modeling, several projects embed participatory logic directly within computational workflows. Examples include collaborative model calibration sessions, crowdsourced data validation, and iterative feedback loops where stakeholder preferences are used to refine learning parameters [92,93]. These approaches treat AI not as a replacement for human judgment but as a facilitator of distributed cognition among planners, citizens, and machines. The literature documents both successes and challenges in such co-production efforts. While they enhance legitimacy and local relevance, they also expose disparities in technical literacy and data governance capacity across participating institutions.
After 2020, ethical and governance discussions increasingly intersect with the design of urban digital twins [38,39]. These large-scale, data-integrated systems aspire to represent cities as living computational entities, continuously updated through sensors, imagery, and social data. The corpus indicates that scholars and practitioners are now debating not only the technical feasibility of digital twins but also their normative implications. Issues such as data ownership, consent, and the right to be represented within simulations have become salient. A subset of articles introduces frameworks for “responsible digital twins,” emphasizing auditability, data minimization, and user agency [94,95]. These efforts reflect a growing consensus that AI-aligned urban models must incorporate ethical constraints from the outset rather than as post-hoc safeguards.
Bibliometric analyses reinforce the conceptual cross-fertilization occurring in this cluster. Governance-related keywords such as “fairness,” “transparency,” “inclusion,” and “ethics” appear increasingly connected to technical terms like “deep learning,” “GeoAI,” and “digital twin” in the co-term network. The proximity of these terms in the semantic space suggests that ethical and methodological discourses are beginning to merge. Thematic overlays show that, unlike earlier phases where governance formed a distinct and peripheral cluster, recent publications integrate ethical discussions directly into modeling or system design sections. This integration marks a structural transformation in how the community conceptualizes responsibility in computational planning.
Despite these advances, significant gaps remain. Most ethical frameworks are still presented as conceptual propositions rather than empirically validated methodologies. Few studies measure the long-term impact of transparency or participation protocols on planning outcomes. Moreover, there is uneven geographic representation: the majority of empirical studies addressing fairness or inclusion originate from technologically advanced regions, while research from low- and middle-income contexts remains under-represented [96]. This imbalance raises questions about the global applicability of AI-governance frameworks that presume robust data infrastructures and regulatory capacity.

4. Discussion

4.1. Thematic Convergence of SDM, PSSs, and AI Governance

The evolution of SDM, PSSs, and governance-oriented research has followed parallel yet increasingly intertwined paths. Over the twenty-five-year review period, what began as separate lines of inquiry, including spatial simulation, decision-support design, and participatory planning, has progressively merged into an integrated ecosystem of AI-aligned urban modeling [2,71]. This convergence represents more than a technical synthesis; it marks a transformation in the epistemology of planning—how knowledge about cities is produced, communicated, and legitimized.
From a methodological standpoint, the intersection of SDM and PSSs has matured through successive cycles of integration. In the early 2000s, dynamic models were typically developed as standalone analytical engines, with outputs manually transferred into decision-support interfaces [81]. As systems became more interactive and data-rich, these boundaries blurred. Models are now embedded directly within planning environments, allowing continuous feedback between simulation and deliberation [61,92]. This coupling has elevated the role of modeling from predictive to exploratory, positioning SDM as an instrument for negotiating alternative urban futures rather than merely forecasting them. The corpus shows that as soon as interactive PSSs became widespread, references to co-creation, scenario negotiation, and collaborative modeling increased sharply.
The infusion of AI into this landscape has accelerated both the opportunities and the tensions inherent in convergence. Deep learning and GeoAI now underpin many SDM applications, while PSSs increasingly employ natural language processing, pattern recognition, and recommendation algorithms to support user interaction. The combination of predictive automation and participatory visualization produces new modalities of urban knowledge, where algorithmic insight and human judgment are continuously interwoven [97]. Such hybrid intelligence redefines traditional planning workflows: instead of proceeding linearly from model development to policy interpretation, decision processes now unfold as iterative and dialogical exchanges [98]. However, as the results in Section 3 indicated, integration remains uneven. Technical interoperability between modeling and decision-support systems has advanced rapidly, but normative integration of transparency, accountability, and inclusivity lags behind.
Conceptually, this convergence signals a shift from instrumental rationality toward reflexive rationality in urban modeling. Early SDM studies operated under a positivist logic, focused on reproducing urban patterns as accurately as possible. PSS research introduced procedural rationality by emphasizing deliberation and collective reasoning [99]. The rise of AI governance frameworks adds a reflexive dimension: systems are now expected to recognize their own epistemic limits and ethical implications [100]. Models are thus evaluated not only by accuracy or usability but also by how they treat uncertainty, bias, and representation. This shift reflects a broader cultural realignment, from prediction toward responsibility, where modeling is valued not only for its explanatory capacity but for its contribution to democratic legitimacy in urban decision-making.
At a practical level, the corpus indicates that convergence has fostered methodological hybridization. Many recent studies combine spatial simulation, multi-criteria decision analysis, and fairness auditing within a single workflow [79,101]. AI-augmented PSSs, for example, use machine learning to generate scenario alternatives that are then evaluated through participatory scoring sessions explicitly addressing ethical trade-offs. Such frameworks illustrate how technical and normative dimensions can reinforce each other when coherently designed. Nevertheless, persistent barriers remain: interoperability gaps between open-source and proprietary systems, limited institutional capacity for algorithmic auditing, and a lack of standardized documentation for model behavior [86].
Thematic convergence is also visible in the changing language of research. Keyword co-occurrence patterns show that terms traditionally associated with modeling, such as simulation or validation, are now accompanied by governance-oriented terms like transparency, accountability, and inclusion. This linguistic shift suggests that ethical and participatory considerations are no longer peripheral but intrinsic to the design of models themselves [102]. In this sense, convergence can be understood as a form of disciplinary hybridization, where computational and societal logics evolve together. Yet convergence does not imply equilibrium. Most progress continues to occur in the technical domain, while governance innovation remains fragmented. The next challenge is to ensure that advances in computational modeling are matched by equally robust institutional design and ethical accountability. Without such balance, the integration of AI into planning risks reproducing inequalities in knowledge, capacity, and influence.

4.2. Toward Responsible Modeling

The convergence of technical and ethical agendas within urban analytics requires a fundamental rethinking of what constitutes responsibility in modeling. As AI-enhanced SDM and PSS become central to decision-making, responsibility extends beyond achieving technical accuracy to include inclusivity, transparency, and accountability. These dimensions are not supplementary to performance; they are conditions for legitimacy [95].
Inclusivity functions as both a procedural and an epistemic requirement. While participatory design has become a normative expectation, it remains inconsistently applied across geographic and institutional contexts [84]. Many projects invite stakeholder input during early design but retain little flexibility once models are operationalized. The most effective examples, often within collaborative digital-twin initiatives, instead treat participation as an iterative process embedded throughout calibration, scenario generation, and evaluation [61]. This approach recognizes that local knowledge complements data-driven inference, reducing the risk of epistemic exclusion, where communities most affected by decisions remain underrepresented in the data guiding them [82].
Transparency, in turn, is the enabling condition for accountability. In classical SDM, transparency was inherent in explicit transition rules and parameters. With deep learning and multimodal models, interpretability has become a technical and institutional challenge [21,67]. Researchers are increasingly integrating explainable-AI methods, such as feature attribution, attention visualization, and sensitivity mapping, to clarify how predictions relate to input variables [103]. These techniques enhance interpretability but are not substitutes for institutional transparency. Genuine openness also requires full documentation of data provenance, assumptions, and limits of model validity. Trust arises less from algorithmic clarity alone than from the visibility of the entire modeling process—who designed it, with what data, and under which constraints [104].
Accountability connects inclusivity and transparency through shared governance structures that define who is responsible for the design, deployment, and consequences of models. The corpus reveals a growing interest in practical accountability mechanisms, from ethics charters and audit protocols to participatory oversight boards [85]. Although many remain conceptual, they indicate an emerging recognition that responsibility cannot rest solely with individual modelers. Instead, it must be distributed across developers, planners, and the public affected by modeling decisions. Empirical examples, particularly within open-source urban analytics consortia, show that collective stewardship can balance innovation with restraint [80].
Taken together, these dimensions outline the contours of responsible modeling. They shift evaluation from statistical performance to a broader view of models as socio-technical systems embedded within networks of expertise, data, and power. Responsibility thus becomes a design property rather than a post-hoc ethical statement. Emerging research suggests complementing traditional accuracy metrics with procedural indicators such as participation diversity, data accessibility, and clarity of communication [105]. Such composite evaluation frameworks capture the multidimensional nature of model quality in contemporary governance. The central task now is to institutionalize these practices without compromising scientific rigor, transforming responsibility from aspiration into verifiable protocol [42].

4.3. Research Agenda for AI-Aligned Urban Governance

The integration of AI into SDM and PSSs marks a decisive turning point in how cities are analyzed, simulated, and governed. Yet across the review, technical progress outpaces institutional adaptation. To align computational innovation with societal goals, future research should center on inclusivity, accountability, and long-term public value across three fronts: methodological fusion, institutional design, and epistemic pluralism.
Methodological fusion addresses the collapsing boundary between simulation, learning, and deliberation. Future SDM development should move beyond static workflows of calibration and prediction toward dynamic architectures capable of continual learning [69,106]. The integration of multimodal models linking textual, visual, and spatial data can produce richer urban representations, but interpretability must remain a design constraint. In parallel, ecological indicators, social equity metrics, and behavioral dynamics should be incorporated so AI-aligned SDM reflects coupled human–environment systems [107,108,109]. Integrating these factors through multimodal data and participatory modeling can improve both contextual validity and policy relevance. Research should prioritize coupling predictive modules with explainability and uncertainty quantification that can be directly communicated through PSS interfaces. Moreover, transferability should be demonstrated through external validation and out-of-sample transfer tests across heterogeneous urban settings, with performance reported by city type, data regime, and time period; this will be crucial for global equity [68]. Future evaluations should adopt regionally stratified benchmarks and incorporate policy-program metadata to contextualize adoption speed and trajectories across different urban systems. Open benchmarks, shared ontologies, and transparent datasets will enable cumulative progress [110,111].
Institutional design concerns the organizational arrangements that make responsible modeling sustainable. As AI-aligned tools move into policy formation, ownership, oversight, and accountability become central [94]. Comparative research should examine how planning cultures and governance structures influence adoption, resistance, or transformation of AI-based systems. New forms of algorithmic stewardship, such as dedicated units or networks that maintain, audit, and update urban models, could ensure that data flows, parameters, and safeguards evolve alongside cities themselves [85]. These methods can be operationalized in widely used environments, including QGIS and ArcGIS ModelBuilder for workflow integration, UrbanSim and CityEngine for scenario design, web-based PSS dashboards built on PostGIS and GeoPandas, and AI toolchains such as scikit-learn and PyTorch with explainability modules such as SHAP and LIME.
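As one illustration of the explainability modules named above, the sketch below implements model-agnostic permutation importance in plain Python; SHAP and LIME offer richer, library-backed attribution, but the underlying question, how much accuracy drops when a feature is scrambled, is the same. The toy classifier and data are invented for demonstration.

```python
import random

def permutation_importance(predict, X, y, n_repeats=20, seed=0):
    """Model-agnostic feature attribution: the mean drop in accuracy
    when one feature column is shuffled. Larger drop = more influence."""
    rng = random.Random(seed)

    def accuracy(rows):
        return sum(predict(r) == yi for r, yi in zip(rows, y)) / len(y)

    base = accuracy(X)
    scores = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            col = [row[j] for row in X]
            rng.shuffle(col)  # break the feature-target association
            shuffled = [row[:j] + [col[i]] + row[j + 1:] for i, row in enumerate(X)]
            drops.append(base - accuracy(shuffled))
        scores.append(sum(drops) / n_repeats)
    return scores

# Toy classifier that (by construction) uses only feature 0
predict = lambda row: int(row[0] > 0.5)
X = [[0.9, 0.1], [0.2, 0.8], [0.7, 0.3], [0.1, 0.9], [0.8, 0.5], [0.3, 0.6]]
y = [1, 0, 1, 0, 1, 0]
imp = permutation_importance(predict, X, y)
```

Because the toy classifier ignores feature 1, its importance is zero, while shuffling feature 0 visibly degrades accuracy. This legibility is exactly what makes such attributions useful inside PSS dashboards.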
Epistemic pluralism calls for recognizing that no single modeling paradigm captures the full complexity of urban life. Future work should explore hybrid knowledge systems that combine quantitative inference with qualitative narratives, local expertise, and experiential data [112]. Participatory modeling can evolve toward co-creation, where citizens directly shape representations of their environments. Advances in VLMs and interactive visualization now enable multilingual, multimodal participation, allowing communities to engage with complex information intuitively [113,114]. Such developments democratize not only access to data but also the power to define which data and models matter.
Across these three frontiers, the overarching goal is to develop AI-aligned urban governance that is both intelligent and reflexive. The field must shift from asking how accurately models predict urban change to how constructively they participate in shaping equitable and sustainable urban futures. Achieving this balance demands methodological rigor, ethical commitment, and institutional imagination. As cities become increasingly mediated by data and algorithms, the key question is no longer how to model the city, but how to ensure modeling remains accountable to its inhabitants.
In answer to RQ2, the implementable framework emerging from the corpus consists of four minimally sufficient practices: (i) interpretability legible to non-specialists; (ii) uncertainty reporting that disaggregates data noise, model error, and scenario variability in policy-ready formats; (iii) documentation and provenance via model cards, data statements, and version control for audit and reproducibility; and (iv) participation as a design input, with stakeholder feedback captured through PSS sessions to iteratively inform specification. These elements recur in the governance cluster and co-occur with PSS and AI themes, indicating that institutional legitimacy hinges on pairing predictive capability with these procedural safeguards. Concretely, this review distills six routine practices for planning workflows: interpretability for non-specialists, uncertainty disaggregation and reporting, documentation and provenance for auditability, participation embedded through PSS sessions, clear calibration and validation reporting standards, and a reproducible RAG-assisted protocol for literature screening and evidence synthesis.
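The documentation practice can be made concrete with a minimal, hypothetical model-card record. The field names and values below follow no formal standard and are invented for illustration; the point is that provenance, disaggregated uncertainty, and participation logs can travel with the model as auditable data.

```python
import json

# A hypothetical model-card record; every field name and value here is
# illustrative, not drawn from any published system or standard.
model_card = {
    "model": "urban-growth-ca-lr",
    "version": "1.2.0",
    "data_provenance": {
        "sources": ["satellite-derived land cover", "municipal cadastre"],
        "temporal_coverage": "2000-2020",
    },
    "uncertainty": {  # disaggregated as the text recommends
        "data_noise": "input classification accuracy with error bars",
        "model_error": "hold-out validation statistics",
        "scenario_variability": "per-cell spread across policy scenarios",
    },
    "interpretability": "per-variable attribution maps shipped with outputs",
    "participation_log": ["PSS workshop feedback folded into this version"],
}

# Serialize for version control and audit
serialized = json.dumps(model_card, indent=2)
```

Keeping such a record under version control gives auditors a diff-able trail of what changed between model releases.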

5. Conclusions

Over the past twenty-five years, SDM and PSSs have evolved from distinct traditions into an integrated framework for urban decision-making. In direct response to RQ1, three turning points stand out. Early rule-based models made assumptions explicit and interpretable, allowing planners to engage with spatial dynamics rather than simply observe outcomes. Hybrid approaches strengthened empirical validation and comparability across contexts, bridging theory and application. Most recently, data-driven and multimodal learning expanded the capacity to read complex urban patterns and enabled interactive, scenario-based deliberation. In step with these shifts, PSSs moved from analytical displays to participatory environments, reframing models as instruments for negotiation and shared understanding. Together, these changes have shifted planning from static forecasts toward iterative reasoning, where assumptions, uncertainty, and values are openly examined.
The synthesis also responds explicitly to RQ2 by identifying four minimally sufficient practices that make AI-aligned SDM implementable in practice within institutions: interpretability in plain, non-technical language; transparent reporting of uncertainty that distinguishes data noise, model error, and scenario variability; thorough documentation and provenance for auditability and reproducibility; and participation treated as a design input, with stakeholder feedback captured through PSS sessions to inform model specification and refinement. Where these practices are present, adoption is smoother and trust more durable; where they are absent, technical sophistication rarely translates into sustained use.
Two challenges remain, and they mark the main prospects for future approaches and tools. Technical capability continues to outpace mechanisms for scrutiny and governance, and systematic evidence of cross-city generalization is limited. Addressing these gaps will require embedding transparency, reproducibility, fairness, and transfer testing as core design requirements in next-generation SDM and PSS workflows, rather than treating them as later-stage procedural formalities. In sum, the field is moving from forecasts that ask to be believed toward workflows that invite examination—models that organize judgment rather than replace it, so that predictive power, interpretability, uncertainty, and participation operate together in public.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available upon request from the corresponding author.

Acknowledgments

The author thanks the colleagues and reviewers who provided helpful comments that improved the clarity and quality of this manuscript.

Conflicts of Interest

The author declares no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
ABM	Agent-Based Model
AI	Artificial Intelligence
CA	Cellular Automata
GIS	Geographic Information System
PSS	Planning Support System
RAG	Retrieval-Augmented Generation
SDM	Spatial Dynamic Modeling
VLM	Vision–Language Model

Appendix A

Appendix A.1. Models, Endpoint, and Runtime Configuration

The screening notebook uses the Azure OpenAI Python v1 SDK with configuration loaded from environment variables at runtime. Defaults are printed when the run starts for provenance—auth mode, endpoint, and deployment (e.g., “Auth mode: key; Endpoint: …/conflict.openai.azure.com/; Deployment: gpt-5-nano”). The code sets the API version to 2025-01-01-preview and the chat deployment to gpt-5-nano unless overridden; an optional embeddings deployment text-embedding-3-large was used to convert segmented text into vectors for semantic retrieval. These are read from AZURE_OPENAI_API_VERSION, AZURE_OPENAI_CHAT_DEPLOYMENT, and AZURE_OPENAI_EMBED_DEPLOYMENT, with the endpoint at AZURE_OPENAI_ENDPOINT. Authentication can be an API key or Entra ID; the client is constructed accordingly, and the notebook echoes the resolved settings at run start.
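As an illustrative sketch of this configuration logic (the function and dictionary names below are ours, not the notebook's; only the AZURE_OPENAI_* environment keys and the default values come from the text above), the resolution of defaults and overrides can be expressed as:

```python
import os

# Defaults mirroring Appendix A.1; each can be overridden via environment variables.
DEFAULTS = {
    "api_version": "2025-01-01-preview",
    "chat_deployment": "gpt-5-nano",
    "embed_deployment": "text-embedding-3-large",
}

def resolve_config(env=os.environ):
    """Resolve endpoint/deployment settings from the environment, falling back
    to the defaults above, and echo them at run start for provenance."""
    cfg = {
        "endpoint": env.get("AZURE_OPENAI_ENDPOINT", ""),
        "api_version": env.get("AZURE_OPENAI_API_VERSION", DEFAULTS["api_version"]),
        "chat_deployment": env.get("AZURE_OPENAI_CHAT_DEPLOYMENT", DEFAULTS["chat_deployment"]),
        "embed_deployment": env.get("AZURE_OPENAI_EMBED_DEPLOYMENT", DEFAULTS["embed_deployment"]),
        # Key-based auth if an API key is present; otherwise Entra ID.
        "auth_mode": "key" if env.get("AZURE_OPENAI_API_KEY") else "entra",
    }
    print(f"Auth mode: {cfg['auth_mode']}; Endpoint: {cfg['endpoint']}; "
          f"Deployment: {cfg['chat_deployment']}")
    return cfg
```

The resolved dictionary would then be passed to the Azure OpenAI client constructor; echoing it first makes every run's provenance visible in the log.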

Appendix A.2. Tasks, Prompts, and JSON Schema

Two chat tasks are used: (i) title–abstract triage and clustering and (ii) country-of-origin enrichment. For triage, a short developer message orients the model to the review’s scope—SDM, PSS, AI/advanced analytics (e.g., GeoAI/VLMs/Digital Twins), and governance/ethics—and enforces “Output strict JSON only.” The user instruction then requires a JSON array where each element has a decision code d ∈ {R,P,E} and a cluster list c drawn from {SDM,PSS,AI,Governance}; it also tells the model to use pre_tags as hints and caps the cluster list at three. (These phrasings appear verbatim in the notebook strings that build the messages.) Each record passed to the model contains the cleaned title plus a compact bundle of abstract snippets and any pre-tags.
The enrichment pass uses the same chat deployment with temperature = 0 to maximize determinism. Instructions require canonical sovereign country names and one primary_country (preferably the first author’s location) and allow inference from weak signals (institution/city/email) when affiliations are missing. If any row in a batch returns empty, a single-row force retry is triggered with stricter wording that must output at least one country. Rows that already contain country fields are skipped; the script never overwrites non-empty values.
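The output contract for the triage task can be illustrated with a small validator (a sketch; the function name is ours, and the notebook may enforce the schema differently). It checks exactly the constraints stated above: a JSON array of the expected length, a decision code d in {R,P,E}, and a cluster list c of at most three entries drawn from the four allowed clusters.

```python
import json

ALLOWED_DECISIONS = {"R", "P", "E"}                    # decision codes as listed in the prompt
ALLOWED_CLUSTERS = {"SDM", "PSS", "AI", "Governance"}  # allowed cluster labels

def validate_triage(raw: str, n_expected: int) -> list:
    """Parse a triage response and enforce the schema: each element has a
    decision code d in {R,P,E} and a cluster list c of at most three labels."""
    items = json.loads(raw)
    if not isinstance(items, list) or len(items) != n_expected:
        raise ValueError("response is not a JSON array of the expected length")
    for it in items:
        if it.get("d") not in ALLOWED_DECISIONS:
            raise ValueError(f"bad decision code: {it.get('d')!r}")
        clusters = it.get("c", [])
        if len(clusters) > 3 or not set(clusters) <= ALLOWED_CLUSTERS:
            raise ValueError(f"bad cluster list: {clusters!r}")
    return items
```

Validating against this contract immediately after each batch makes schema violations surface as retryable errors rather than silent corpus noise.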

Appendix A.3. Evidence Packing, Domain Cues, and Pre-Tags

Before model calls, evidence is packed tightly to promote reproducibility and reduce noise: the notebook caps each item to ≤6 snippets and ≤1200 characters in total, prioritizing diversity of cues rather than raw length. A fast, case-insensitive cue-matching step infers pre-tags: e.g., SDM cues include “cellular automata,” “agent-based,” “Markov,” “system dynamics,” “urban growth,” “land-use change,” and SLEUTH/CLUE-S; PSS cues include “planning support system,” “decision support,” “scenario,” “what-if,” and “geodesign.” The helper _has_any() checks cue presence; infer_pretags_fast() builds the pre-tag list from title and abstract. When the cues are unambiguous, the logic can directly label an item R with up to three clusters (deduplicated, order-preserving) or skip to E—only ambiguous items are sent to the model. The pre-tags are then included in the prompt so the model can treat them as weak priors.

Appendix A.4. Batching, Timeouts, Parsing, and Retries

Screening runs in batches of 24 records per API call with MAX_WORKERS = 3 threads, a 35 s call timeout, and a 400-token completion cap—enough for the required JSON array. To keep runs bounded, guardrails are set to MAX_ITEMS = 5000, MAX_RUNTIME_MIN = 60, and MAX_ERRORS = 8 consecutive failures. The notebook uses tqdm to surface progress and prints a banner with the resolved endpoint/auth/deployment. JSON parsing first attempts to load the full response; on failure, a regex extracts the top-level array [...] and retries parsing, returning an empty list on persistent failure so the retry logic can kick in. Within the timeout window, the call loop can toggle response-format strictness to salvage arrays when a model prepends text despite the “JSON only” instruction.
Country enrichment prefers larger batches (BATCH_SIZE = 50) because rows are short and arrays compress well. It sets MAX_BATCH_RETRIES = 2 with a base back-off and MAX_FORCE_RETRIES = 2 for stubborn single rows; progress and outputs are printed during the run. All Azure settings for this script mirror the screening configuration and are echoed at startup.
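The fallback parsing logic described above can be sketched as follows (a simplified version; the notebook's routine may differ, and the greedy regex is adequate only because a single top-level array is expected):

```python
import json
import re

def parse_json_array(raw: str) -> list:
    """Parse a response expected to be a JSON array. On failure, extract the
    top-level [...] span with a regex and retry; return [] if both attempts
    fail, so the caller's retry logic can take over."""
    try:
        out = json.loads(raw)
        return out if isinstance(out, list) else []
    except json.JSONDecodeError:
        pass
    # Greedy match from the first '[' to the last ']' across newlines.
    match = re.search(r"\[.*\]", raw, flags=re.DOTALL)
    if match:
        try:
            out = json.loads(match.group(0))
            return out if isinstance(out, list) else []
        except json.JSONDecodeError:
            pass
    return []
```

Returning an empty list instead of raising keeps a single malformed response from aborting a batch: the empty result is indistinguishable from a failed call, so the same back-off and retry path handles both cases.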

Appendix A.5. Inputs, Outputs, Determinism, and Run Artifacts

The screening stage consumes a harvested Excel (e.g., data_exports/auto_harvest.xlsx) and writes two main artifacts: screened_literature.xlsx (all items with labels) and final_corpus.xlsx (in-scope after inclusion rules). The enrichment stage reads the final corpus and produces final_corpus_enriched.xlsx, adding Primary_Origin_Country and All_Origin_Countries, plus summary sheets. The notebook prints resolved file paths on successful writes, which doubles as a simple batch log.
No pseudo-random number generator is used in either script, so there is no PRNG seed to report. Determinism is encouraged via temperature = 0 for the country task, fixed batching order, strict JSON-only prompting, and consistent concurrency limits; residual non-determinism may arise from server scheduling or thread interleaving. The SEED_FILE variable in the notebook refers to an input sheet name (the pre-harvested Excel), not a random seed.

Appendix A.6. Corpus Provenance, Venue Levels, and Author Affiliations

The final corpus of 1290 publications was compiled from Web of Science Core Collection, Scopus, IEEE Xplore, and Google Scholar (via API aggregation). Web of Science, Scopus, and IEEE Xplore are subscription databases, while Google Scholar ensured comprehensive recall. Only peer-reviewed journal articles and indexed conference proceedings were included; theses and grey literature were excluded. Open-access status was not used as a filter, but full texts were retrieved whenever institutional access permitted.
Venue “level” (journal or indexed conference) was recorded from index metadata but not used as a quality filter to avoid penalizing computational research published in conference proceedings while maintaining comparability with journals. These metadata were retained for stratified checks reported in the results.
Author affiliations were parsed from index records. The corpus is predominantly authored by universities and public research institutes, with practitioner or industry co-authorship present in a minority of records. Student theses were excluded by design. When affiliation fields were incomplete, institution names were inferred from email domains or affiliation strings. All literature records have been fully documented, including identifiers and metadata fields, and can be made available to qualified researchers upon reasonable request.

References

  1. Bittencourt, J.C.N.; Costa, D.G.; Portugal, P.; Vasques, F. A Survey on Adaptive Smart Urban Systems. IEEE Access 2024, 12, 102826–102850. [Google Scholar] [CrossRef]
  2. Batty, M. Urban Analytics; SAGE: London, UK, 2019. [Google Scholar]
  3. Townsend, A. Smart Cities: Big Data, Civic Hackers, and the Quest for a New Utopia; W.W. Norton: New York, NY, USA, 2013. [Google Scholar]
  4. Li, W.; Arundel, S.T.; Gao, S.; Goodchild, M.F.; Hu, Y.; Wang, S.; Zipf, A. GeoAI for Science and the Science of GeoAI. J. Spat. Inf. Sci. 2024, 29, 1–17. [Google Scholar] [CrossRef]
  5. Chen, W.; Wu, A.N.; Biljecki, F. Classification of Urban Morphology with Deep Learning: Application on Urban Vitality. Comput. Environ. Urban Syst. 2021, 90, 101706. [Google Scholar] [CrossRef]
  6. Cai, Z.; Kwak, Y.; Cvetkovic, V.; Deal, B.; Mörtberg, U. Urban Spatial Dynamic Modeling Based on Urban Amenity Data to Inform Smart City Planning. Anthropocene 2023, 42, 100387. [Google Scholar] [CrossRef]
  7. He, C.; Okada, N.; Zhang, Q.; Shi, P.; Zhang, J. Modeling Urban Expansion Scenarios by Coupling Cellular Automata and System Dynamics in Beijing, China. Appl. Geogr. 2006, 26, 323–345. [Google Scholar] [CrossRef]
  8. Guan, D.; Li, H.; Inohae, T.; Su, W.; Nagaie, T.; Hokao, K. Modeling Urban Land-Use Change by Integrating Cellular Automata and Markov Chains. Ecol. Model. 2011, 222, 3761–3772. [Google Scholar] [CrossRef]
  9. Arsanjani, J.J.; Helbich, M.; Vaz, E.D.N. Spatiotemporal Simulation of Urban Growth Patterns Using Agent-Based Modeling: The Case of Tehran, Iran. Cities 2013, 32, 33–42. [Google Scholar] [CrossRef]
  10. Richardson, G.P. Reflections on the Foundations of System Dynamics. Syst. Dyn. Rev. 2011, 27, 219–243. [Google Scholar] [CrossRef]
  11. Kocabas, V.; Dragicevic, S. Bayesian Networks and Agent-Based Modeling Approach for Urban Land-Use and Population Density Change: A BNAS Model. J. Geogr. Syst. 2013, 15, 403–426. [Google Scholar] [CrossRef]
  12. Dotson, T. Trial-and-error urbanism: Addressing obduracy, uncertainty and complexity in urban planning and design. J. Urban. Int. Res. Placemak. Urban Sustain. 2015, 9, 148–165. [Google Scholar] [CrossRef]
  13. Cai, Z.; Wang, B.; Cong, C.; Cvetkovic, V. Spatial Dynamic Modelling for Urban Scenario Planning: A Case Study of Nanjing, China. Environ. Plan. B Urban Anal. City Sci. 2020, 47, 1380–1396. [Google Scholar] [CrossRef]
  14. Tayyebi, A.; Pijanowski, B.C.; Pekin, B. Two rule-based urban growth boundary models applied to the Tehran metropolitan area, Iran. Appl. Geogr. 2011, 31, 908–918. [Google Scholar] [CrossRef]
  15. Freni, G.; Mannina, G.; Viviani, G. Urban runoff modelling uncertainty: Comparison among Bayesian and pseudo-Bayesian methods. Environ. Model. Softw. 2009, 24, 1100–1111. [Google Scholar] [CrossRef]
  16. Lin, D.; Zhu, R.; Yang, J.; Meng, L. An Open-Source Framework of Generating Network-Based Transit Catchment Areas by Walking. ISPRS Int. J. Geo Inf. 2020, 9, 467. [Google Scholar] [CrossRef]
  17. Cordova-Pozo, K.; Rouwette, E.A. Types of scenario planning and their effectiveness: A review of reviews. Futures 2023, 149, 103153. [Google Scholar] [CrossRef]
  18. Weng, X.; Pang, C.; Xia, G.-S. Vision–Language Modeling Meets Remote Sensing: Models, datasets, and perspectives. IEEE Geosci. Remote Sens. Mag. 2025, 13, 276–323. [Google Scholar] [CrossRef]
  19. Hao, X.; Chen, W.; Yan, Y.; Zhong, S.; Wang, K.; Wen, Q.; Liang, Y. UrbanVLP: Multi-Granularity Vision–Language Pretraining for Urban Socioeconomic Indicator Prediction. In Proceedings of the AAAI Conference on Artificial Intelligence, Philadelphia, PA, USA, 25 February–4 March 2025; Volume 39, pp. 28061–28069. [Google Scholar] [CrossRef]
  20. Wu, M.; Huang, Q.; Gao, S.; Zhang, Z. Mixed land use measurement and mapping with street view images and spatial context-aware prompts via zero-shot multimodal learning. Int. J. Appl. Earth Obs. Geoinf. 2023, 125, 103591. [Google Scholar] [CrossRef]
  21. Liu, P.; Zhang, Y.; Biljecki, F. Explainable Spatially Explicit Geospatial Artificial Intelligence. Environ. Plan. B Urban Anal. City Sci. 2024, 51, 1104–1123. [Google Scholar] [CrossRef]
  22. Wang, Z.; Feng, T.; Safikhani, A.; Tepe, E. Enhancing Transparency in Land-Use Change Modeling: Leveraging Explainable AI Techniques for Urban Growth Prediction with Spatially Distributed Insights. Comput. Environ. Urban Syst. 2025, 121, 102322. [Google Scholar] [CrossRef]
  23. Sermet, Y.; Demir, I. GeospatialVR: A web-based virtual reality framework for collaborative environmental simulations. Comput. Geosci. 2022, 159, 105010. [Google Scholar] [CrossRef]
  24. Li, X.; Yue, J.; Wang, S.; Luo, Y.; Su, C.; Zhou, J.; Xu, D.; Lu, H. Development of Geographic Information System Architecture Feature Analysis and Evolution Trend Research. Sustainability 2024, 16, 137. [Google Scholar] [CrossRef]
  25. Geertman, S.; Allan, A.; Pettit, C.; Stillwell, J. (Eds.) Planning Support Science for Smarter Urban Futures; Springer: Cham, Switzerland, 2017. [Google Scholar] [CrossRef]
  26. Innes, J.E.; Booher, D.E. Planning with Complexity: An Introduction to Collaborative Rationality for Public Policy; Routledge: New York, NY, USA, 2010. [Google Scholar]
  27. Forester, J. The Deliberative Practitioner: Encouraging Participatory Planning Processes; MIT Press: Cambridge, MA, USA, 1999. [Google Scholar]
  28. Jiang, H.; Geertman, S.; Witte, P. Avoiding the Planning Support System Pitfalls? Lessons from the PSS Implementation Gap. Environ. Plan. B Urban Anal. City Sci. 2020, 47, 1343–1360. [Google Scholar] [CrossRef]
  29. Lartey, D.; Law, K.M. Artificial intelligence adoption in urban planning governance: A systematic review of advancements in decision-making, and policy making. Landsc. Urban Plan. 2025, 258, 105337. [Google Scholar] [CrossRef]
  30. Mitchell, M.; Wu, S.; Zaldivar, A.; Barnes, P.; Vasserman, L.; Hutchinson, B.; Spitzer, E.; Raji, I.D.; Gebru, T. Model Cards for Model Reporting. In Proceedings of the Conference on Fairness, Accountability, and Transparency, Atlanta, GA, USA, 29–31 January 2019; ACM: New York, NY, USA, 2019; pp. 220–229. [Google Scholar] [CrossRef]
  31. Gebru, T.; Morgenstern, J.; Vecchione, B.; Vaughan, J.W.; Wallach, H.; Daumé, H., III; Crawford, K. Datasheets for Datasets. Commun. ACM 2021, 64, 86–92. [Google Scholar] [CrossRef]
  32. Raji, I.D.; Smart, A.; White, R.N.; Mitchell, M.; Gebru, T.; Hutchinson, B.; Smith-Loud, J.; Theron, D.; Barnes, P. Closing the AI Accountability Gap: Defining an End-to-End Framework for Internal Algorithmic Auditing. In Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency, Barcelona, Spain, 27–30 January 2020; ACM: New York, NY, USA, 2020; pp. 33–44. [Google Scholar] [CrossRef]
  33. Sokol, K.; Santos-Rodriguez, R.; Flach, P. FAT Forensics: A Python Toolbox for Algorithmic Fairness, Accountability and Transparency. Softw. Impacts 2022, 14, 100406. [Google Scholar] [CrossRef]
  34. Badami, A.A. Management of the image of the city in urban planning: Experimental methodologies in the colour plan of the Egadi Islands. Urban Des. Int. 2025, 30, 21–36. [Google Scholar] [CrossRef]
  35. Jobin, A.; Ienca, M.; Vayena, E. The Global Landscape of AI Ethics Guidelines. Nat. Mach. Intell. 2019, 1, 389–399. [Google Scholar] [CrossRef]
  36. Ye, X.; Yigitcanlar, T.; Goodchild, M.; Huang, X.; Li, W.; Shaw, S.L.; Newman, G. Artificial intelligence in urban science: Why does it matter? Ann. GIS 2025, 31, 181–189. [Google Scholar] [CrossRef]
  37. Brugere, C.; Bansal, T.; Kruijssen, F.; Williams, M. Humanizing aquaculture development: Putting social and human concerns at the center of future aquaculture development. J. World Aquac. Soc. 2023, 54, 482–526. [Google Scholar] [CrossRef]
  38. Zarrabi, H.; Doost Mohammadian, M.R. Fusion of Digital Twin, Internet-of-Things and Artificial Intelligence for Urban Intelligence. In Digital Twin Computing for Urban Intelligence; Pourroostaei Ardakani, S., Cheshmehzangi, A., Eds.; Urban Sustainability; Springer: Singapore, 2024; pp. 79–102. [Google Scholar] [CrossRef]
  39. Nechesov, A.; Dorokhov, I.; Ruponen, J. Virtual Cities: From Digital Twins to Autonomous AI Societies. IEEE Access 2025, 13, 13866–13903. [Google Scholar] [CrossRef]
  40. Pröbstl, F.; Paulsch, A.; Zedda, L.; Nöske, N.; Santos, E.M.C.; Zinngrebe, Y. Biodiversity policy integration in five policy sectors in Germany: How can we transform governance to make implementation work? Earth Syst. Gov. 2023, 16, 100175. [Google Scholar] [CrossRef]
  41. Cash, D.W.; Clark, W.C.; Alcock, F.; Dickson, N.M.; Eckley, N.; Guston, D.H.; Jäger, J.; Mitchell, R.B. Knowledge Systems for Sustainable Development. Proc. Natl. Acad. Sci. USA 2003, 100, 8086–8091. [Google Scholar] [CrossRef] [PubMed]
  42. Stilgoe, J.; Owen, R.; Macnaghten, P. Developing a Framework for Responsible Innovation. Res. Policy 2013, 42, 1568–1580. [Google Scholar] [CrossRef]
  43. Page, M.J.; McKenzie, J.E.; Bossuyt, P.M.; Boutron, I.; Hoffmann, T.C.; Mulrow, C.D.; Shamseer, L.; Tetzlaff, J.M.; Akl, E.A.; Brennan, S.E.; et al. The PRISMA 2020 Statement: An Updated Guideline for Reporting Systematic Reviews. BMJ 2021, 372, n71. [Google Scholar] [CrossRef] [PubMed]
  44. Miake-Lye, I.M.; Hempel, S.; Shanman, R.; Shekelle, P.G. What Is an Evidence Map? A Systematic Review of Published Evidence Maps and Their Definitions, Methods, and Products. Syst. Rev. 2016, 5, 28. [Google Scholar] [CrossRef]
  45. Tricco, A.C.; Lillie, E.; Zarin, W.; O’Brien, K.K.; Colquhoun, H.; Levac, D.; Moher, D.; Peters, M.D.J.; Horsley, T.; Weeks, L.; et al. PRISMA Extension for Scoping Reviews (PRISMA-ScR): Checklist and Explanation. Ann. Intern. Med. 2018, 169, 467–473. [Google Scholar] [CrossRef]
  46. Stallings, J.; Vance, E.; Yang, J.; Vannier, M.W.; Liang, J.; Pang, L.; Dai, L.; Ye, I.; Wang, G. Determining scientific impact using a collaboration index. Proc. Natl. Acad. Sci. USA 2013, 110, 9680–9685. [Google Scholar] [CrossRef]
  47. Dennstädt, F.; Zink, J.; Putora, P.M.; Hastings, J.; Cihoric, N. Title and abstract screening for literature reviews using large language models: An exploratory study in the biomedical domain. Syst. Rev. 2024, 13, 158. [Google Scholar] [CrossRef]
  48. Li, M.; Sun, J.; Tan, X. Evaluating the effectiveness of large language models in abstract screening: A comparative analysis. Syst. Rev. 2024, 13, 219. [Google Scholar] [CrossRef]
  49. Cohan, A.; Feldman, S.; Beltagy, I.; Downey, D.; Weld, D.S. SPECTER: Document-Level Representation Learning Using Citation-Informed Transformers. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (ACL), Online, 5–10 July 2020; pp. 2270–2282. [Google Scholar] [CrossRef]
  50. McHugh, M.L. Interrater Reliability: The Kappa Statistic. Biochem. Med. 2012, 22, 276–282. [Google Scholar] [CrossRef]
  51. Parasuraman, R.; Riley, V. Humans and Automation: Use, Misuse, Disuse, Abuse. Hum. Factors 1997, 39, 230–253. [Google Scholar] [CrossRef]
  52. Skitka, L.J.; Mosier, K.L.; Burdick, M.D. Does Automation Bias Decision-Making? Int. J. Hum. Comput. Stud. 1999, 51, 991–1006. [Google Scholar] [CrossRef]
  53. Goddard, K.; Roudsari, A.; Wyatt, J.C. Automation Bias: A Systematic Review of Frequency, Effect Mediators, and Mitigators. J. Am. Med. Inform. Assoc. 2012, 19, 121–127. [Google Scholar] [CrossRef]
  54. Benenson, I.; Torrens, P. Geosimulation: Automata-Based Modeling of Urban Phenomena; Wiley: Chichester, UK, 2004. [Google Scholar]
  55. Herold, M.; Couclelis, H.; Clarke, K.C. The role of spatial metrics in the analysis and modeling of urban land use change. Comput. Environ. Urban Syst. 2005, 29, 369–399. [Google Scholar] [CrossRef]
  56. Stanilov, K. Accessibility and land use: The case of suburban Seattle, 1960–1990. Reg. Stud. 2003, 37, 783–794. [Google Scholar] [CrossRef]
  57. Heppenstall, A.; Crooks, A.; See, L.; Batty, M. (Eds.) Agent-Based Models of Geographical Systems; Springer: Dordrecht, The Netherlands, 2012. [Google Scholar] [CrossRef]
  58. Macal, C.M.; North, M.J. Tutorial on Agent-Based Modeling and Simulation. J. Simul. 2010, 4, 151–162. [Google Scholar] [CrossRef]
  59. Pontius, R.G., Jr.; Millones, M. Death to Kappa: Birth of Quantity Disagreement and Allocation Disagreement for Accuracy Assessment. Int. J. Remote Sens. 2011, 32, 4407–4429. [Google Scholar] [CrossRef]
  60. Wu, F. Calibration of stochastic cellular automata: The application to rural–urban land conversions. Int. J. Geogr. Inf. Sci. 2002, 16, 795–818. [Google Scholar] [CrossRef]
  61. Daniel, C.; Pettit, C. Charting the Past and Possible Futures of Planning Support Systems: Results of a Citation Network Analysis. Environ. Plan. B Urban Anal. City Sci. 2022, 49, 1875–1892. [Google Scholar] [CrossRef]
  62. Zhu, X.X.; Tuia, D.; Mou, L.; Xia, G.; Zhang, L.; Xu, F.; Fraundorfer, F. Deep Learning in Remote Sensing: A Comprehensive Review. IEEE Geosci. Remote Sens. Mag. 2017, 5, 8–36. [Google Scholar] [CrossRef]
  63. Ma, L.; Liu, Y.; Zhang, X.; Ye, Y.; Yin, G.; Johnson, B.A. Deep Learning in Remote Sensing Applications: A Meta-Analysis and Review. ISPRS J. Photogramm. Remote Sens. 2019, 152, 166–177. [Google Scholar] [CrossRef]
  64. Sedano, F.; Kempeneers, P.; Strobl, P.; Kucera, J.; Vogt, P.; Seebach, L.; San-Miguel-Ayanz, J. A cloud mask methodology for high-resolution remote sensing data combining information from high and medium resolution optical sensors. ISPRS J. Photogramm. Remote Sens. 2011, 66, 588–596. [Google Scholar] [CrossRef]
  65. Liu, F.; Chen, D.; Guan, Z.; Zhou, X.; Zhu, J.; Ye, Q.; Fu, L.; Zhou, J. RemoteCLIP: A Vision–Language Foundation Model for Remote Sensing. arXiv 2023, arXiv:2306.11029. [Google Scholar] [CrossRef]
  66. Ribeiro, M.T.; Singh, S.; Guestrin, C. “Why Should I Trust You?”: Explaining Any Classifier. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD ’16), San Francisco, CA, USA, 13–17 August 2016; pp. 1135–1144. [Google Scholar] [CrossRef]
  67. Lundberg, S.M.; Lee, S.-I. A Unified Approach to Interpreting Model Predictions. Adv. Neural Inf. Process. Syst. 2017, 30, 4765–4774. [Google Scholar]
  68. Tuia, D.; Persello, C.; Bruzzone, L. Domain Adaptation for the Classification of Remote Sensing Data: An Overview of Recent Advances. IEEE Geosci. Remote Sens. Mag. 2016, 4, 41–57. [Google Scholar] [CrossRef]
  69. Reichstein, M.; Camps-Valls, G.; Stevens, B.; Jung, M.; Denzler, J.; Carvalhais, N.; Prabhat, F. Deep Learning and Process Understanding for Data-Driven Earth System Science. Nature 2019, 566, 195–204. [Google Scholar] [CrossRef]
  70. Brail, R.; Klosterman, R. (Eds.) Planning Support Systems; ESRI Press: Redlands, CA, USA, 2001. [Google Scholar]
  71. Geertman, S.; Stillwell, J. (Eds.) Planning Support Systems: Best Practice and New Methods; Springer: Dordrecht, The Netherlands, 2009. [Google Scholar] [CrossRef]
  72. Barredo, J.I.; Kasanko, M.; McCormick, N.; Lavalle, C. Modelling dynamic spatial processes: Simulation of urban future scenarios through cellular automata. Landsc. Urban Plan. 2003, 64, 145–160. [Google Scholar] [CrossRef]
  73. Star, S.L.; Griesemer, J.R. Institutional Ecology, ‘Translations’ and Boundary Objects: Amateurs and Professionals in Berkeley’s Museum of Vertebrate Zoology, 1907–1939. Soc. Stud. Sci. 1989, 19, 387–420. [Google Scholar] [CrossRef]
  74. Haklay, M. How Good Is Volunteered Geographical Information? A Comparative Study of OpenStreetMap and Ordnance Survey Datasets. Environ. Plan. B Plan. Des. 2010, 37, 682–703. [Google Scholar] [CrossRef]
  75. Janssen, M.; Charalabidis, Y.; Zuiderwijk, A. Benefits, Adoption Barriers and Myths of Open Data and Open Government. Inf. Syst. Manag. 2012, 29, 258–268. [Google Scholar] [CrossRef]
  76. Linders, D. Towards open development: Leveraging open data to improve the planning and coordination of international aid. Gov. Inf. Q. 2013, 30, 426–434. [Google Scholar] [CrossRef]
  77. Goodchild, M.F. Citizens as Sensors: The World of Volunteered Geography. GeoJournal 2007, 69, 211–221. [Google Scholar] [CrossRef]
  78. Kamel Boulos, M.N.; Resch, B.; Crowley, D.N.; Breslin, J.G.; Sohn, G.; Burtner, E.R.; Pike, W.A.; Jezierski, E.; Chuang, K.-Y.S. Crowdsourcing, citizen sensing and sensor web technologies for public and environmental health surveillance and crisis management: Trends, OGC standards and application examples. Int. J. Health Geogr. 2011, 10, 67. [Google Scholar] [CrossRef] [PubMed]
  79. Malczewski, J.; Rinner, C. Multicriteria Decision Analysis in Geographic Information Science; Springer: New York, NY, USA, 2015. [Google Scholar] [CrossRef]
  80. Veale, M.; Van Kleek, M.; Binns, R. Fairness and Accountability Design Needs for Algorithmic Support in High-Stakes Public Sector Decision-Making. In Proceedings of the CHI Conference on Human Factors in Computing Systems, Honolulu, HI, USA, 11–16 May 2018; Association for Computing Machinery: New York, NY, USA, 2018; p. 440. [Google Scholar] [CrossRef]
  81. Voinov, A.; Bousquet, F. Modelling with Stakeholders. Environ. Model. Softw. 2010, 25, 1268–1281. [Google Scholar] [CrossRef]
  82. D’Ignazio, C.; Klein, L.F. Data Feminism; MIT Press: Cambridge, MA, USA, 2020. [Google Scholar] [CrossRef]
  83. Nowak, M.; Pantyley, V.; Blaszke, M.; Fakeyeva, L.; Lozynskyy, R.; Petrisor, A.-I. Spatial planning at the national level: Comparison of legal and strategic instruments in a case study of Belarus, Ukraine, and Poland. Land 2023, 12, 1364. [Google Scholar] [CrossRef]
  84. Arnstein, S.R. A Ladder of Citizen Participation. J. Am. Inst. Plann. 1969, 35, 216–224. [Google Scholar] [CrossRef]
  85. National Institute of Standards and Technology (NIST). Artificial Intelligence Risk Management Framework (AI RMF 1.0); NIST AI 100-1; U.S. Department of Commerce: Gaithersburg, MD, USA, 2023. Available online: https://nvlpubs.nist.gov/nistpubs/ai/nist.ai.100-1.pdf (accessed on 11 September 2025).
  86. Arnold, M.; Bellamy, R.; Hind, M.; Houde, S.; Mehta, S.; Mojsilović, A.; Nair, R.; Ramamurthy, K.N.; Olteanu, A.; Piorkowski, D.; et al. FactSheets: Increasing Trust in AI Services through Supplier’s Declarations of Conformity. IBM J. Res. Dev. 2020, 64, 6:1–6:13. [Google Scholar] [CrossRef]
  87. Guidotti, R.; Monreale, A.; Ruggieri, S.; Turini, F.; Giannotti, F.; Pedreschi, D. A Survey of Methods for Explaining Black Box Models. ACM Comput. Surv. 2018, 51, 93. [Google Scholar] [CrossRef]
  88. Mehrabi, N.; Morstatter, F.; Saxena, N.; Lerman, K.; Galstyan, A. A Survey on Bias and Fairness in Machine Learning. ACM Comput. Surv. 2021, 54, 115. [Google Scholar] [CrossRef]
  89. Yu, D.; Fang, C. Remote Sensing with Spatial Big Data: A Review and Renewed Perspective of Urban Studies in Recent Decades. Remote Sens. 2023, 15, 1307. [Google Scholar] [CrossRef]
  90. Kamiran, F.; Calders, T. Data Preprocessing Techniques for Classification without Discrimination. Knowl. Inf. Syst. 2012, 33, 1–33. [Google Scholar] [CrossRef]
  91. Zafar, M.B.; Valera, I.; Rodriguez, M.G.; Gummadi, K.P. Fairness Beyond Disparate Treatment & Impact: Learning Classification without Disparate Mistreatment. In Proceedings of the 26th International World Wide Web Conference, Perth, Australia, 3–7 April 2017; pp. 1171–1180. [Google Scholar] [CrossRef]
  92. Goodspeed, R. Scenario Planning for Cities and Regions: Managing Uncertainty, Complexity, and Change; Lincoln Institute of Land Policy: Cambridge, MA, USA, 2020. [Google Scholar]
  93. Brabham, D.C. Crowdsourcing the Public Participation Process for Planning Projects. Plan. Theory 2009, 8, 242–262. [Google Scholar] [CrossRef]
  94. OECD. Recommendation of the Council on Artificial Intelligence; OECD/LEGAL/0449; Organisation for Economic Co-Operation and Development: Paris, France, 2019. [Google Scholar]
  95. European Commission High-Level Expert Group on AI. Ethics Guidelines for Trustworthy AI; European Commission: Brussels, Belgium, 2019. [Google Scholar]
  96. Howe, B.; Brown, J.M.; Han, B.; Herman, B.; Weber, N.; Yan, A.; Yang, S.; Yang, Y. Integrative Urban AI to Expand Coverage, Access, and Equity of Urban Data. Eur. Phys. J. Spec. Top. 2022, 231, 1741–1752. [Google Scholar] [CrossRef] [PubMed]
  97. Amershi, S.; Weld, D.; Vorvoreanu, M.; Fourney, A.; Nushi, B.; Collisson, P.; Suh, J.; Iqbal, S.; Bennett, P.N.; Inkpen, K.; et al. Guidelines for Human–AI Interaction. In Proceedings of the CHI Conference on Human Factors in Computing Systems, Glasgow, UK, 4–9 May 2019; Association for Computing Machinery: New York, NY, USA, 2019; p. 3. [Google Scholar] [CrossRef]
  98. Healey, P. Collaborative Planning: Shaping Places in Fragmented Societies; Macmillan: London, UK, 1997. [Google Scholar]
  99. Schön, D.A. The Reflective Practitioner: How Professionals Think in Action; Basic Books: New York, NY, USA, 1983. [Google Scholar]
  100. Voß, J.-P.; Bauknecht, D.; Kemp, R. (Eds.) Reflexive Governance for Sustainable Development; Edward Elgar: Cheltenham, UK, 2006. [Google Scholar]
  101. Malczewski, J. GIS and Multicriteria Decision Analysis; John Wiley & Sons: New York, NY, USA, 1999. [Google Scholar]
  102. Zellner, M.L. Participatory modeling for collaborative landscape and environmental planning: From potential to realization. Landsc. Urban Plan. 2024, 247, 105063. [Google Scholar] [CrossRef]
  103. Selvaraju, R.R.; Cogswell, M.; Das, A.; Vedantam, R.; Parikh, D.; Batra, D. Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization. In Proceedings of the 2017 IEEE International Conference on Computer Vision (ICCV 2017), Venice, Italy, 22–29 October 2017; pp. 618–626. [Google Scholar] [CrossRef]
  104. Bender, E.M.; Friedman, B. Data Statements for Natural Language Processing: Toward Mitigating System Bias and Enabling Better Science. Trans. Assoc. Comput. Linguist. 2018, 6, 587–604. [Google Scholar] [CrossRef]
  105. Floridi, L.; Cowls, J.; Beltrametti, M.; Chatila, R.; Chazerand, P.; Dignum, V.; Luetge, C.; Madelin, R.; Pagallo, U.; Rossi, F.; et al. AI4People—An Ethical Framework for a Good AI Society: Opportunities, Risks, Principles, and Recommendations. Minds Mach. 2018, 28, 689–707. [Google Scholar] [CrossRef]
  106. Karpatne, A.; Atluri, G.; Faghmous, J.H.; Steinbach, M.; Banerjee, A.; Ganguly, A.; Shekhar, S.; Samatova, N.; Kumar, V. Theory-Guided Data Science: A New Paradigm for Scientific Discovery. IEEE Trans. Knowl. Data Eng. 2017, 29, 2318–2331. [Google Scholar] [CrossRef]
  107. Cai, Z.; Page, J.; Cvetkovic, V. Urban Ecosystem Vulnerability Assessment to Support Climate-Resilient City Development. Urban Plan. 2021, 6, 227–239. [Google Scholar] [CrossRef]
  108. Minaei, M.; Salar, Y.S.; Zwierzchowska, I.; Azinmoghaddam, F.; Hof, A. Exploring inequality in green space accessibility for women-Evidence from Mashhad, Iran. Sustain. Cities Soc. 2025, 126, 106406. [Google Scholar] [CrossRef]
  109. Salehi, S.; Naghshineh, R.; Ahmadian, A. Determine of Proxemic Distance Changes before and During COVID-19 Pandemic with Cognitive Science Approach. PsyArXiv 2025. [Google Scholar] [CrossRef]
  110. Gröger, G.; Plümer, L. CityGML—Interoperable Semantic 3D City Models. ISPRS J. Photogramm. Remote Sens. 2012, 71, 12–33. [Google Scholar] [CrossRef]
  111. Wilkinson, M.D.; Dumontier, M.; Aalbersberg, I.J.; Appleton, G.; Axton, M.; Baak, A.; Blomberg, N.; Boiten, J.W.; da Silva Santos, L.B.; Bourne, P.E.; et al. The FAIR Guiding Principles for Scientific Data Management and Stewardship. Sci. Data 2016, 3, 160018. [Google Scholar] [CrossRef] [PubMed]
  112. Funtowicz, S.O.; Ravetz, J.R. Science for the Post-Normal Age. Futures 1993, 25, 739–755. [Google Scholar] [CrossRef]
  113. Klemmer, K.; Rolf, E.; Robinson, C.; Mackey, L.; Rußwurm, M. SatCLIP: Global, General-Purpose Location Embeddings with Satellite Imagery. In Proceedings of the AAAI Conference on Artificial Intelligence, Philadelphia, PA, USA, 25 February–4 March 2025; Volume 39, pp. 4347–4355. [Google Scholar] [CrossRef]
  114. Zhang, Z.; Zhao, T.; Guo, Y.; Yin, J. RS5M and GeoRSCLIP: A Large Scale Vision–Language Dataset and a Large Vision–Language Model for Remote Sensing. arXiv 2023, arXiv:2306.11300. [Google Scholar] [CrossRef]
Figure 1. Conceptual framework linking SDM, PSS, and AI-aligned governance.
Figure 2. Literature retrieval and RAG-assisted screening workflow. Records retrieved from major databases (n = 4516) were deduplicated (n = 2410), screened with a GPT-based classifier, and refined through human validation to produce the final analytical corpus (n = 1290).
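The deduplication and screening stages of the Figure 2 workflow can be sketched as follows. This is a minimal illustration, not the review's actual pipeline: the record fields, the sample data, and the keyword-based relevance test (standing in for the GPT-based classifier and human validation) are all assumptions.

```python
def dedupe(records):
    """Drop duplicate records, keyed by DOI where present and by a
    normalized title otherwise."""
    seen, unique = set(), []
    for rec in records:
        key = rec.get("doi") or rec["title"].lower().strip()
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique

def screen(records, is_relevant):
    """Keep only records flagged relevant by a classifier (in the review,
    an LLM-based check over title and abstract, then human validation)."""
    return [rec for rec in records if is_relevant(rec)]

# Illustrative records, not drawn from the actual corpus.
records = [
    {"doi": "10.1/a", "title": "CA urban growth model"},
    {"doi": "10.1/a", "title": "CA urban growth model"},   # duplicate DOI
    {"doi": None,     "title": "Protein folding review"},  # off-topic
]
keywords = ("urban", "planning support", "cellular automata")
corpus = screen(dedupe(records),
                lambda r: any(k in r["title"].lower() for k in keywords))
print(len(corpus))  # 1 record survives deduplication and screening
```

In the actual workflow, the relevance function would wrap a retrieval-augmented prompt rather than a keyword match, but the filter structure is the same.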
Figure 3. Annual publication trends by theme (2000–2025).
Figure 4. Evolution of multi-theme overlays (% of annual output, 2000–2025).
Figure 5. Global distribution of publications on SDM and related themes.
Table 1. Keyword framework for literature retrieval.
Domain: Representative keywords
Modeling approaches: spatial dynamic modeling; cellular automata; agent-based model; hybrid modeling; system dynamics; Markov chain urban modeling
Urban simulation and planning applications: urban growth simulation; urban expansion modeling; urban land change modeling; urban functional typologies; fine-grained urban modeling
Decision-support and planning tools: planning support system; decision support framework; urban analytics platform; urban scenario modeling; participatory planning tools
AI and advanced data integration: urban AI; vision–language model; deep learning urban modeling; machine learning urban dynamics; digital twins; urban big data analytics; GeoAI; generative AI for planning
Ethics, inclusivity, and governance: inclusive urban modeling; data justice; algorithmic fairness; AI governance; digital inclusion; citizen-centric urban AI; urban digital rights
Table 2. Coding dimensions and representative attributes.
Dimension: Representative attributes and description
Model family: CA, ABM, hybrid models integrating Markov or system-dynamics components, deep learning or GeoAI models, VLM, and digital twin frameworks.
Application domain: Urban growth and expansion modeling, accessibility or mobility studies, functional or morphological mapping, climate resilience assessment, and other spatial planning applications.
PSS role: Software prototype, analytical platform, participatory decision-support interface, or integrated scenario engine linking simulation with stakeholder interaction.
Validation approach: Use of performance metrics such as FoM, Kappa, or F1 score, cross-scale or temporal transfer tests, sensitivity analyses, and external benchmark comparisons.
Governance and inclusion: Presence of fairness audits, stakeholder participation, transparency protocols, data rights frameworks, or explicit discussion of ethical AI and inclusion.
Openness and reproducibility: Availability of open data, public code repositories, model documentation, or Supplementary Materials facilitating replication.
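Two of the validation metrics named in the coding framework, Cohen's Kappa for categorical map agreement and the Figure of Merit (FoM) widely used for land-change models, can be computed from confusion counts as below. The counts are illustrative, not results from the reviewed studies.

```python
def cohens_kappa(tp, fp, fn, tn):
    """Cohen's Kappa for a binary change/no-change map comparison."""
    n = tp + fp + fn + tn
    po = (tp + tn) / n                                            # observed agreement
    pe = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n**2   # chance agreement
    return (po - pe) / (1 - pe)

def figure_of_merit(hits, misses, false_alarms):
    """FoM = correctly predicted change / (all observed or predicted change)."""
    return hits / (hits + misses + false_alarms)

# Illustrative cell counts for 100 pixels.
print(round(cohens_kappa(40, 10, 5, 45), 3))  # 0.7
print(round(figure_of_merit(40, 5, 10), 3))   # 0.727
```

FoM is generally preferred over Kappa for urban growth simulations because it conditions on change cells only, so abundant persistent (no-change) pixels cannot inflate the score.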

Share and Cite

MDPI and ACS Style

Cai, Z. Evolving from Rules to Learning in Urban Modeling and Planning Support Systems. Urban Sci. 2025, 9, 508. https://doi.org/10.3390/urbansci9120508
