Review

Unleashing the Potential of Large Language Models in Urban Data Analytics: A Review of Emerging Innovations and Future Research

1 Department of Geography, Hong Kong Baptist University, Hong Kong, China
2 Department of Urban Planning and Design, The University of Hong Kong, Hong Kong, China
* Author to whom correspondence should be addressed.
Smart Cities 2025, 8(6), 201; https://doi.org/10.3390/smartcities8060201
Submission received: 9 October 2025 / Revised: 19 November 2025 / Accepted: 21 November 2025 / Published: 28 November 2025

Highlights

What are the main findings?
  • The review shows how large language models operate across the full urban analytics pipeline and documents their functions using evidence from 178 studies.
  • The review identifies stable patterns in multimodal integration, synthetic data generation, and human-in-the-loop use in urban workflows.
What are the implications of the main findings?
  • The findings clarify how LLMs improve urban analytical capacity through stronger data integration and more accessible analytical interfaces.
  • The 3E framework offers a concise structure to guide future research on data expansion, model enhancement, and advanced urban applications.

Abstract

This paper presents a comprehensive review of emerging innovations and future research directions leveraging Large Language Models (LLMs) for urban data analytics, examining how cities generate, structure, and use information to support planning and operational decisions. While LLMs show promise in addressing critical challenges faced by urban stakeholders—including data integration, accessibility, and cross-domain analysis—their applications and effectiveness in urban contexts remain largely unexplored and fragmented across disciplines. Through our systematic analysis of 178 papers, we examine the impact of LLMs across the four key stages of urban data analytics: collection, preprocessing, modeling, and post-analysis. Our review encompasses various urban domains, including transportation, urban planning, disaster management, and environmental monitoring, identifying how LLMs can transform analytical approaches in these fields. We identify current trends, innovative applications, and challenges in integrating LLMs into urban analytics workflows. Based on our findings, we propose a 3E framework for future research directions: Expanding information dimensions, Enhancing model capabilities, and Executing advanced applications. This framework provides a structured approach to emphasize key opportunities in the field. Our study concludes by discussing critical challenges, including hallucination, scalability, fairness, and ethical concerns, and emphasizes the need for interdisciplinary collaboration among researchers and urban practitioners to fully realize the potential of LLMs in creating smarter, more sustainable urban environments.

1. Introduction

1.1. Background

Urban data analytics has emerged as a critical field in the era of smart cities and digital transformation. As urban areas continue to grow and evolve, the volume, variety, and velocity of data generated within these environments have increased exponentially. This data encompasses diverse sources, from sensor networks and Internet of Things (IoT) devices to social media feeds and administrative records, offering unprecedented potential for understanding and improving urban systems.
While urban data analytics has traditionally employed statistical methods, machine learning algorithms, and domain-specific models to extract insights from this wealth of information, the inherent complexity and heterogeneity of urban data present significant challenges for practical implementation. City planners, policymakers, and community organizations face multiple obstacles that impede effective utilization of urban data [1]. These include fragmented data silos across departments, resource-intensive processing requirements for unstructured data sources (e.g., public feedback and policy documents), and the challenge of translating technical analyses into actionable insights for non-technical stakeholders. Additionally, conventional analytical approaches often prove inadequate for time-sensitive scenarios, particularly during urban emergencies like natural disasters or critical infrastructure failures. These limitations not only constrain agile, evidence-based decision-making but also highlight the pressing need for more integrated and efficient analytical frameworks in urban governance, which relies on coordinated information flows and routine decision processes across multiple institutional domains [2]. Urban science and digital governance research describe cities as systems sustained by interdependent information structures that enable coordination and institutional linkages, and this perspective clarifies why fragmented data environments hinder the formation of coherent analytical capacity.
Large Language Models (LLMs) have emerged as a transformative technology in artificial intelligence (AI), fundamentally reshaping natural language processing capabilities [3]. These sophisticated AI systems, trained on extensive textual corpora, exhibit unprecedented proficiency in comprehending and generating human-like text across diverse domains [4]. Their capacity to process and synthesize information from heterogeneous sources positions them as particularly valuable tools for addressing the multidimensional challenges inherent in urban data analytics [5].
The integration of LLMs into urban data analytics represents a fundamental shift in methodological approaches to complex urban challenges [1]. These models facilitate the integration of disparate data sources, enable the extraction of meaningful patterns from unstructured text, and support sophisticated interpretations of urban phenomena. Moreover, their intuitive natural language interfaces democratize access to advanced analytical capabilities, empowering diverse stakeholders to engage meaningfully with urban data and contribute to evidence-based decision-making processes [5].
As cities globally pursue enhanced sustainability, resilience, and livability, data-driven insights have become indispensable to effective urban planning and management. The synergy between LLMs and urban data analytics creates unprecedented opportunities for addressing critical urban challenges—from optimizing transportation infrastructure and energy systems to improving public service delivery and strengthening community engagement initiatives. This technological convergence promises to transform how urban planners, policymakers, and communities collaborate to shape the future of our cities.

1.2. Motivation and Objectives

The rapid advancement of LLMs has revolutionized numerous fields, such as geospatial science, transportation, and autonomous systems [3,4,6,7,8], yet their systematic integration into urban data analytics remains insufficiently explored. Recent surveys have examined LLMs in urban contexts but focused primarily on system architectures and model development. The authors of [5,9] presented model-centric reviews on urban foundation models and envisioned urban general intelligence systems, while the authors of [10] proposed foundational platforms for creating embodied agents for various urban tasks. However, there is no comprehensive review examining how to systematically integrate LLMs into the urban data analytics pipeline. This gap is particularly critical, as urban practitioners require concrete guidance for the implementation of LLMs within existing analytical workflows—spanning data collection, preprocessing, modeling, and post-analysis stages. Our review addresses this practical need by examining how LLMs can enhance each phase of the urban analytics pipeline, providing actionable insights that enable researchers and practitioners to effectively leverage these powerful models in real-world urban applications.
This review is guided by the following research questions that frame the scope of inquiry:
  • What characteristics and capabilities of LLMs make them suitable for urban data analytics, and how do these properties align with the requirements of urban analytical workflows?
  • How are LLMs currently being applied across urban data analytics tasks, and what approaches or implementations demonstrate their practical utility?
  • What research directions and innovation opportunities emerge from the existing body of work, and how might these directions improve the effectiveness of LLMs in urban data analytics?
  • What technical and operational challenges constrain the integration of LLMs into urban data analytics, and what barriers require further investigation?
The key contributions of this review include the following:
  • We present a comprehensive analysis of 178 papers, offering a panoramic view of the current state of LLM applications in urban analytics. This extensive review unveils emerging trends and innovative applications across diverse urban domains, providing valuable insights into both the potential and challenges of these technologies.
  • We develop a novel review framework based on the key steps in urban data analytics: data collection, preprocessing, modeling, and post-analysis. This framework ensures a systematic examination and categorization of LLM applications throughout the urban data analytics pipeline.
  • We propose a 3E framework (Expanding information dimensions, Enhancing model capabilities, and Executing advanced applications) to chart future directions of LLMs in urban analytics. This framework serves as a roadmap for future research and innovation in this burgeoning field.
The rest of this paper is structured as follows: Section 2 provides an overview of LLMs, tracing their history, evolution, key characteristics, and capabilities. Section 3 delves into the applications of LLMs in urban data analytics, discussing various stages, such as data collection, preprocessing, modeling, and post-analysis. Section 4 explores future directions and challenges, focusing on potential advancements in data dimensions, model capabilities, and applications, as well as addressing issues like hallucination, scalability, fairness, and ethical concerns. Finally, Section 5 concludes the paper.

2. Overview of Large Language Models

LLMs represent a significant leap in natural language processing (NLP) and AI. They are advanced neural network-based systems designed to understand and generate human-like text by processing vast amounts of textual data. The evolution of LLMs is rooted in decades of research, beginning with early statistical models and progressing through neural language models to the sophisticated transformer-based models that dominate the field today [11]. This section provides an overview of LLMs, tracing their evolution, key technologies, and capabilities.

2.1. Landscape of Large Language Models

The development of language models progressed through three preliminary waves [12]: statistical language models (early 2000s), which model language as a probability distribution over sequences of words [13]; neural language models (early 2010s), which introduced word embeddings with innovations like Word2Vec and GloVe [14,15]; and pre-trained language models (late 2010s), which established the paradigm of pre-training on large text corpora followed by task-specific fine-tuning, notably featuring the Transformer architecture, e.g., BERT and GPT in 2018 [16,17].
Large Language Models (LLMs) represent the fourth and current wave, characterized by transformer-based architectures with billions of parameters. Early examples like GPT-2, GPT-3, T5, and Switch Transformer expanded scale and capability, supporting broader generalization across tasks [18,19,20]. Instruction tuning and alignment methods later enabled models to follow user prompts more reliably, as illustrated by InstructGPT and ChatGPT [21]. Recent systems, including GPT-5, Claude 3, Gemini, and Llama 2/3, have further increased model capacity and robustness. The rapid growth of this ecosystem enables applications in domains that handle heterogeneous, text-rich information, such as urban analytics.
As shown in Table 1, LLMs are often grouped by parameter scale. Small models like BERT (under 1 billion parameters) prioritize efficiency, making them suitable for resource-constrained applications [17,18]. Medium-sized models, such as GPT-2, balance performance and resource needs [22,23]. Large models like Llama 3-70B, Mixtral 8x7B, and Qwen2-72B (10–100 billion parameters) offer significant performance improvements [24,25], while very large models such as GPT-4, Grok 3, and DeepSeek-V2 (over 100 billion parameters) push the boundaries of what is possible, often demonstrating emergent abilities [26,27]. Accessibility further shapes the LLM landscape. Open-source families such as Grok, LLaMA, DeepSeek, and Qwen provide access to model architectures and weights, supporting adaptation for specialized needs. Closed models like GPT-4, Claude, and Gemini are mainly accessed through APIs, which limits transparency. This distinction influences reproducibility and auditability in analytical contexts.
As LLMs continue to evolve, they are increasingly being augmented with external knowledge and tools, enabling more effective interaction with users and environments. The ability to continually improve through feedback data collected via interactions, such as reinforcement learning with human feedback (RLHF), points to a future where these models become even more powerful and adaptive.

2.2. Key Characteristics and Capabilities

2.2.1. Key Technologies in Developing LLMs

The development of LLMs involves a sophisticated pipeline of technologies and methodologies, as summarized in Table 2. At the core of most modern LLMs is the Transformer architecture, with variants including encoder-only (e.g., BERT), decoder-only (e.g., GPT), and encoder–decoder models [12]. This architecture supports the modeling of long-range dependencies in heterogeneous urban text. Data preparation involves collection, cleaning, and deduplication, and urban sources such as administrative documents, sensor descriptions, and social media require consistent preprocessing. Tokenization and positional encoding structure the input for training and help the model process sequential patterns common in urban records.
The training process typically involves pre-training on massive unlabeled datasets, followed by fine-tuning for specific urban analytics tasks. Crucial to LLM development for urban applications is the concept of alignment, ensuring that model outputs reflect human values and urban planning principles through techniques like RLHF [12]. Given the computational intensity of LLMs, efficient training and inference techniques such as optimized frameworks, parameter-efficient fine-tuning, and model compression are essential, especially when deploying these models in resource-constrained urban environments or for real-time city management applications.
These technologies strengthen the capacity of LLMs to analyze complex textual information in urban datasets, and continuing advances aim to improve their efficiency and relevance for urban contexts.
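To make the positional-encoding step mentioned above concrete, the sinusoidal scheme from the original Transformer can be sketched in a few lines. This is a minimal illustration only; many recent LLMs use learned or rotary positional encodings instead.

```python
import math

def sinusoidal_positional_encoding(seq_len: int, d_model: int) -> list[list[float]]:
    """Sinusoidal positional encodings from the original Transformer:
    PE(pos, 2i)   = sin(pos / 10000^(2i / d_model))
    PE(pos, 2i+1) = cos(pos / 10000^(2i / d_model))"""
    pe = [[0.0] * d_model for _ in range(seq_len)]
    for pos in range(seq_len):
        for i in range(0, d_model, 2):
            angle = pos / (10000 ** (i / d_model))
            pe[pos][i] = math.sin(angle)       # even dimensions: sine
            if i + 1 < d_model:
                pe[pos][i + 1] = math.cos(angle)  # odd dimensions: cosine
    return pe

pe = sinusoidal_positional_encoding(seq_len=4, d_model=8)
```

Because each position maps to a unique pattern of phases, the model can infer relative ordering of tokens, which matters for sequential urban records such as timestamped sensor logs.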

2.2.2. Major Capabilities of LLMs

Large language models exhibit a hierarchy of capabilities that can be categorized into three progressive levels: basic, intermediate, and advanced (as outlined in Table 3) [11,12,28].
  • At the basic level, LLMs excel in fundamental text processing (e.g., summarization and sentiment analysis), simple text generation, and simple question answering. Recent studies demonstrate these capabilities through tasks such as generative approaches to aspect-based sentiment analysis and automatic thematic extraction from large text collections [29]. In urban data analytics, LLMs can efficiently process city documents, analyze social media sentiment about urban issues, and classify transportation complaints. For example, they can analyze thousands of resident feedback submissions to identify emerging neighborhood concerns or track sentiment trends about public transit quality.
  • At the intermediate level, LLMs demonstrate advanced text understanding, advanced text generation ability (e.g., code generation), complex reasoning, and sophisticated task execution. Studies assessing SQL generation quality for models such as GPT-3.5 and Gemini show how LLMs handle structured analytical tasks and compositional instructions [30]. In urban contexts, these capabilities enable LLMs to interpret complex zoning regulations, generate comprehensive urban development reports, and solve multifaceted problems like the optimization of traffic flow. They can translate technical urban planning documents into accessible language for public consumption or analyze patterns in accident reports to identify unsafe intersections.
  • At the advanced level, LLMs push the boundaries with sophisticated knowledge reasoning, tool planning, simulation, multimodal capabilities, and advanced system integration. Recent advances in large-scale information extraction and multimodal reasoning demonstrate how LLMs coordinate tools and integrate diverse signals in complex analytical environments [31]. In urban data analytics, LLMs create comprehensive urban knowledge graphs connecting demographic, economic, and infrastructure data. They simulate urban scenarios (e.g., traffic pattern changes after infrastructure modifications), integrate with IoT networks for real-time city monitoring, and enable collaborative multi-agent systems for emergency response coordination. For instance, an LLM-powered system could integrate traffic sensor data, weather forecasts, and event schedules to predict congestion patterns and dynamically adjust traffic signals.
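As a minimal sketch of the basic-level capabilities described above, the snippet below assembles a sentiment-classification prompt for resident transit feedback. The function name and prompt wording are illustrative assumptions, not drawn from any reviewed study; the resulting string would be sent to whichever chat-style LLM API a project uses.

```python
def build_sentiment_prompt(feedback: list[str]) -> str:
    """Assemble a classification prompt for a chat-style LLM.
    The instruction text here is purely illustrative."""
    header = (
        "Classify the sentiment of each resident comment about public "
        "transit as positive, negative, or neutral.\n"
        "Answer with one label per line.\n\n"
    )
    # Number each comment so the model's labels can be matched back.
    body = "\n".join(f"{i + 1}. {text}" for i, text in enumerate(feedback))
    return header + body

prompt = build_sentiment_prompt([
    "The new tram line cut my commute in half.",
    "Bus 42 was 40 minutes late again.",
])
```

Batching comments into one numbered prompt keeps per-comment cost low and makes the model's line-by-line answers straightforward to parse back into a table.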

2.3. Review Methods

Building on the examination of LLM capabilities and their relevance to urban data analytics, the review followed a structured PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) method to capture research published in this emerging area. The procedure is summarized in Figure 1, which outlines the stages of identification, screening, eligibility assessment, and final inclusion. Searches were conducted across multiple academic databases—Web of Science, IEEE Xplore, PubMed, Scopus, Google Scholar, and arXiv—using a comprehensive search strategy that combined LLM-related terms (“natural language processing”, “large language model”, “generative pre-trained transformer”, “GPT”, “NLP”, and “LLM”) with urban-focused keywords (“city”, “urban”, “geospatial”, and “built environment”). Given the significant advancements marked by ChatGPT’s release in December 2022, we focused on publications from 2023 onward to capture the latest innovations. We restricted the search to studies published between 1 January 2023 and 31 December 2024 to ensure temporal consistency and enable reproducibility. Our search was limited to English-language documents, including articles, conference papers, reviews, book chapters, and books. We applied the following inclusion criteria to identify studies as eligible: (1) addressing at least one urban-focused topic, such as planning, governance, mobility, or the built environment; (2) involving the use of NLP or LLM techniques, including transformer-based models like GPT and BERT; and (3) utilizing data relevant to urban contexts, such as geospatial datasets, planning documents, or policy texts. To ensure comprehensive coverage, we supplemented our database searches with reference chasing, systematically reviewing the reference lists of selected papers to identify additional relevant studies that may have been missed in our initial search.
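For reproducibility, the boolean search string described above can be assembled programmatically, for example when scripting database exports. This is a sketch only; quoting and field syntax differ across databases such as Scopus and Web of Science.

```python
# Keyword groups as listed in the review methodology.
LLM_TERMS = [
    "natural language processing", "large language model",
    "generative pre-trained transformer", "GPT", "NLP", "LLM",
]
URBAN_TERMS = ["city", "urban", "geospatial", "built environment"]

def build_query(group_a: list[str], group_b: list[str]) -> str:
    """Combine two keyword groups into a '(A1 OR ...) AND (B1 OR ...)' string."""
    def quote(terms: list[str]) -> str:
        return " OR ".join(f'"{t}"' for t in terms)
    return f"({quote(group_a)}) AND ({quote(group_b)})"

query = build_query(LLM_TERMS, URBAN_TERMS)
```

Quoting multi-word terms keeps phrases like "built environment" intact rather than matching the words independently.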
Screening was performed independently by two reviewers who evaluated titles and abstracts, achieving an inter-rater agreement of κ = 0.92, indicating strong consistency in screening decisions. Full-text assessments resolved any discrepancies through discussion. A brief risk-of-bias assessment examined model transparency, data clarity, and evaluation robustness. The overall literature exhibited low-to-moderate risk across these dimensions, with several studies presenting clear methodological descriptions, while others provided limited detail on datasets or reproducibility.
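The reported inter-rater agreement is Cohen's kappa, which corrects raw agreement for the agreement expected by chance. A minimal sketch of the computation on toy screening decisions:

```python
from collections import Counter

def cohens_kappa(rater_a: list[int], rater_b: list[int]) -> float:
    """Cohen's kappa: (p_o - p_e) / (1 - p_e), where p_o is observed
    agreement and p_e is chance agreement from each rater's label rates."""
    n = len(rater_a)
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_e = sum((freq_a[k] / n) * (freq_b[k] / n) for k in freq_a)
    return (p_o - p_e) / (1 - p_e)

# Toy include/exclude decisions (1 = include, 0 = exclude)
kappa = cohens_kappa([1, 1, 0, 1, 0], [1, 1, 0, 0, 0])
```

A kappa of 0.92, as reported above, lies well inside the range conventionally read as near-perfect agreement.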
This rigorous process yielded 178 papers that directly address LLMs in urban data analytics or closely related fields. These papers form the foundation of our review, representing the current landscape of this emerging interdisciplinary field. In the following section, we analyze these papers in detail, examining the specific urban data analytics tasks being addressed, the LLM techniques employed, and the outcomes achieved. This analysis provides insights into the field’s current state, highlights emerging trends, and identifies promising directions for future research on the application of LLMs to urban data analytics.

3. Applications of LLMs in Urban Data Analytics

3.1. Urban Data Analytics Landscape

Urban data analytics is a multidisciplinary field that encompasses the collection, analysis, and interpretation of data across various urban domains, including transportation systems, physical infrastructure, public health, and environmental monitoring. This field is essential for modern city management, providing critical insights that inform policy decisions, optimize resource allocation, and enhance service delivery to residents. By leveraging diverse data sources—from traffic sensors and air quality monitors to social media feeds and administrative records—urban analytics helps address complex challenges such as traffic congestion, environmental pollution, inefficient public services, and infrastructure maintenance. However, the field faces significant obstacles, including fragmented data ecosystems across municipal departments, the overwhelming volume and variety of information streams, and the demand for real-time processing capabilities that can generate actionable insights. LLMs present transformative opportunities in this context by enabling sophisticated data integration, advanced pattern recognition, and predictive analytics. Through their natural language processing capabilities, LLMs help overcome traditional analytical barriers, facilitating more comprehensive understanding of urban dynamics and enhancing the overall effectiveness of data-driven city management.
A review of the collected 178 papers demonstrates the wide-ranging applications of LLMs in urban data analytics, as summarized in Table 4. Transportation dominates the field, with 59 studies, encompassing traffic control, driver behavior analysis, and mobility forecasting. Urban development and planning follow, with 21 papers, exploring areas such as road network generation, urban renewal initiatives, and region profiling. Disaster management and social dynamics represent significant research areas, with 14 and 13 studies, respectively, addressing issues from flood detection to crime prediction and emergency response coordination.
Additional research domains encompass tourism (nine studies), environmental science (five studies), and a cluster of emerging areas including economy, public health, and building energy (three studies each). A substantial portion of the literature—48 papers—concentrates on fundamental technological advances in urban analytics rather than specific applications, addressing critical capabilities such as remote sensing, geospatial analytics, and visual intelligence. This distribution reveals a marked disparity: while transportation and urban development have attracted significant scholarly attention, domains such as public health, building energy, and economic applications remain relatively underexplored, presenting compelling opportunities for future investigation.
Beyond the distribution of applications across domains, the reviewed studies reveal common socio-technical patterns in how LLM outputs interact with urban decision processes. Across transportation, planning, disaster management, and social governance, model-generated analyses typically enter institutional workflows through reporting pipelines, operational dashboards, or automated decision-support tools. These outputs are then interpreted, validated, and often adjusted by planners, analysts, or emergency coordinators before influencing policy actions. This cross-domain pattern highlights a shared dependency on human-in-the-loop structures that manage hallucination risks, reconcile fairness concerns, and provide audit trails for institutional accountability. It also shows that policy uptake depends not only on model accuracy but on the traceability of reasoning, documentation quality, and alignment with existing governance protocols. These observations provide a socio-technical layer that complements the domain-level mapping and clarifies how LLM-enabled analytics propagate through real-world urban decision systems.
Figure 2 indicates the current trends, applications, future directions, and challenges of LLMs in urban data analytics. The subsequent sections examine the transformative role of LLMs across the fundamental workflow of urban analytics: data collection, preprocessing, modeling, and post-analysis. Each phase is analyzed to demonstrate how LLMs augment conventional approaches, enhance operational efficiency, and generate more nuanced insights. This comprehensive evaluation illustrates how LLM integration advances smart city initiatives and facilitates sustainable urban development.

3.2. Data Collection and Generation

Data collection and generation constitute the foundation of urban data analytics, supplying the critical raw materials necessary for meaningful analysis. The accuracy and comprehensiveness of collected data directly determine whether derived insights are both representative and actionable, while sophisticated data generation methods can effectively simulate scenarios or address gaps in sparse datasets. LLMs are transforming these foundational processes by bringing unprecedented natural language processing capabilities to urban contexts. These models can parse unstructured text from diverse sources, extract relevant information from planning documents and citizen feedback, and generate synthetic data that captures the complexity of urban systems. By automating and enhancing both data collection and generation, LLMs enable urban analysts to work with richer, more comprehensive datasets, ultimately leading to more robust and actionable insights for city planning and management (Figure 3).

3.2.1. Data Collection

Urban data analytics leverages diverse data types to capture the multifaceted nature of urban environments. These data types—including structured data, visual information, textual content, and specialized categories—provide distinct perspectives on urban systems and support various applications in urban studies. Figure 4 presents an overview of the primary data types utilized in urban analytics, illustrating key examples and their respective sources.
The integration of LLMs into urban data collection has introduced several significant innovations. Table 5 outlines the specific LLM-assisted methods and their applications across various data collection tasks. LLMs extract critical information from textual sources such as policies, regulations, and reports [146,166] while also facilitating the analysis of user-generated content on social media platforms to derive insights and sentiments [112,153]. Furthermore, LLMs streamline the automation of data collection from APIs by generating code or developing agents based on natural language queries [164,174]. They also synthesize urban data from diverse online sources, including Web articles and databases [116], and extract information from visual sources such as photographs and satellite imagery through advanced image understanding capabilities [47]. In urban survey research, LLMs improve both the design and processing stages by generating survey questions and automating response analysis [34,203]. Additionally, they facilitate conversational interactions through chatbots and voice assistants to gather both structured and unstructured data via natural language communication [142].
Beyond handling diverse data types, LLMs excel at integrating heterogeneous data sources to create comprehensive, multi-dimensional representations of urban environments. Autonomous GIS [164] exemplifies this capability by using LLMs to dynamically select appropriate data sources, generate and execute code to access various APIs and Web services, and automatically transform disparate data formats from sources such as Census Bureau demographics, OpenStreetMap, satellite imagery, and environmental sensors into compatible formats for unified spatial analysis. StreetViewLLM [202] demonstrates another approach to data integration by combining LLMs with chain-of-thought reasoning and multimodal data sources. This system merges street-view imagery with geographic coordinates and textual data to improve geospatial predictions for urban environments. When tested across seven global cities, StreetViewLLM consistently outperformed benchmark models (such as XGBoost, ResNet50) by at least 49.43% in capturing urban indicators such as population density, healthcare accessibility, and environmental features. Similarly, [146] enhanced GPT-4 by providing access to up-to-date and reliable climate knowledge sources, creating a hybrid system that delivers significantly improved performance on climate-related queries. When evaluated by climate experts, their hybrid ChatClimate achieved perfect 5/5 accuracy scores on critical climate questions, compared to standard GPT-4’s average of 2.5/5, demonstrating how timely access to authoritative external knowledge sources can dramatically improve LLM reliability in specialized domains.
Table 5. LLM-assisted data collection methods for urban analytics.
Data Source | LLM-Assisted Method and Description | Examples
Document | Document-based Data Extraction: Extracts relevant information from various textual sources, such as policies, regulations, forms, and reports, using LLMs. | [146,166]
Social Media | Social Media Data Harvesting: Collects and analyzes user-generated content from social media platforms to extract urban-related insights and sentiment. | [112,153]
API | API Data Harvesting: Develops LLM-powered agents or generates code to collect spatial and other types of data from various APIs as requested by users through natural language queries. | [164,174]
Web | LLM-powered Web Scraping and Search Engine: Utilizes online LLM agents to gather, filter, and synthesize urban data from diverse Web sources and online databases. | [116]
Image | Data Extraction from Images: Leverages image understanding and captioning capabilities to collect data from visual urban sources like photographs and satellite imagery. | [47]
Survey | Intelligent Survey Design and Processing: Employs LLMs for survey question generation, adaptive questioning, and automated analysis of survey responses. | [34,203]
Chatbot | Conversational Data Collection: Utilizes chatbots or voice assistants to gather structured and unstructured data through natural language interactions. | [142]
The incorporation of LLMs in data collection processes not only streamlines existing workflows but also enables novel approaches to urban analytics. By leveraging LLM capabilities, urban analysts can develop deeper, more nuanced understanding of urban dynamics, ultimately supporting more informed and effective decision-making in urban planning and management.
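Because several of the collection methods in Table 5 depend on LLMs returning machine-readable output, a validation step is commonly placed between the model and the analytics pipeline. The sketch below checks a hypothetical JSON record from document-based extraction; the field names and schema are illustrative assumptions, not drawn from the reviewed studies.

```python
import json

# Hypothetical schema for an extracted planning-document record.
REQUIRED_FIELDS = {"document_id", "policy_topic", "effective_date"}

def parse_extraction(llm_output: str) -> dict:
    """Validate a JSON record returned by an LLM extraction step; raise
    ValueError on malformed output so bad records never enter the
    analytics pipeline silently."""
    try:
        record = json.loads(llm_output)
    except json.JSONDecodeError as exc:
        raise ValueError(f"LLM returned non-JSON output: {exc}") from exc
    missing = REQUIRED_FIELDS - record.keys()
    if missing:
        raise ValueError(f"Missing fields: {sorted(missing)}")
    return record

record = parse_extraction(
    '{"document_id": "ZB-2024-17", "policy_topic": "zoning", '
    '"effective_date": "2024-06-01"}'
)
```

Failing loudly at this boundary is one practical way to contain the hallucination risks discussed later in the review.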

3.2.2. Data Generation

As urban environments become increasingly complex, the demand for comprehensive data in urban analytics has grown substantially. Traditional urban analytics often struggles with fundamental data limitations: incomplete datasets, systematic biases, insufficient coverage of under-represented areas, and the high cost of comprehensive data collection. These constraints have historically hindered our ability to create accurate models and make informed decisions about urban planning and development. The emergence of LLMs has opened new possibilities for addressing these critical data gaps through sophisticated synthetic data generation.
Purposes and Applications
LLMs serve crucial roles in data generation across the urban analytics pipeline, contributing to both pre-modeling and post-modeling stages, as illustrated in Table 6:
  • Pre-modeling: During the foundational stage of urban analytics, LLMs enhance data preparation through three key capabilities. First, they excel at scenario creation by generating realistic urban situations—from routine traffic patterns to complex scenarios like public events or emergency situations—providing researchers with diverse datasets for simulation and analysis. Second, LLMs address data scarcity through augmentation, expanding limited datasets by synthesizing plausible variations of existing data points, which is particularly valuable when studying under-represented neighborhoods or rare urban events. Third, these models facilitate the creation of multimodal datasets by transforming unstructured text (such as planning documents, social media posts, or citizen feedback) into structured formats while also integrating various data types, such as text-enhanced visual data or location-based information, with textual descriptions. This comprehensive approach enables more nuanced analysis of urban phenomena by capturing multiple dimensions of city life that traditional data collection methods might miss.
  • Post-modeling: After urban models are developed, LLMs enhance their validation, interpretation, and practical application in three ways. First, through scenario testing, LLMs generate diverse test cases—such as different weather conditions, varying traffic volumes, or unexpected events—allowing urban planners to evaluate model robustness across a wide range of realistic situations. Second, these models excel at interpretation and explanation by converting complex analytical outputs into clear, contextual narratives that stakeholders can understand. For instance, when a traffic model predicts congestion patterns, an LLM can explain the underlying factors in plain language, making the insights accessible to policymakers and citizens alike. Third, LLMs support suggestion generation by synthesizing model outputs into actionable recommendations, such as by proposing specific policy interventions based on predicted outcomes or generating detailed implementation strategies for urban development projects. This capability transforms raw analytical results into practical guidance for urban planning and policy-making.
The versatility of LLMs in generating diverse data types—including text-to-text transformations, image–text pairs, and text-to-3D spatial models—significantly expands their utility in urban analytics. These capabilities enable more comprehensive analyses by capturing multiple dimensions of urban phenomena, ultimately empowering planners and policymakers to base their decisions on a richer, more nuanced understanding of urban systems, as summarized in Figure 5.
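The augmentation capability described above can be sketched as follows. The "LLM" is stubbed with simple paraphrase templates so the example runs standalone; in a real pipeline each variation would come from prompting a model, and the record fields (`issue`, `place`) are hypothetical.

```python
import random

random.seed(42)  # deterministic for illustration

def augment_record(record: dict, n: int = 3) -> list[dict]:
    """Produce plausible synthetic variations of a scarce data point.
    Templates stand in for LLM-generated paraphrases."""
    templates = [
        "Residents report {issue} near {place}.",
        "{issue} has been observed around {place} this week.",
        "Complaints about {issue} at {place} are increasing.",
    ]
    return [
        {**record,
         "text": t.format(issue=record["issue"], place=record["place"]),
         "synthetic": True}  # flag synthetic rows so they can be audited later
        for t in random.sample(templates, n)
    ]

seed = {"issue": "street flooding", "place": "the local market", "synthetic": False}
for r in augment_record(seed):
    print(r["text"])
```

Flagging synthetic rows explicitly, as done here, matters downstream: evaluation sets should exclude them to avoid inflating performance estimates.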
Scenario and Solution Generation
Scenario generation represents one of the most impactful applications of LLMs in urban data generation and is particularly valuable in contexts where real scenario data is scarce or when exploring potential future urban conditions. Transportation systems have emerged as a primary beneficiary of LLM-generated scenarios. In autonomous driving, traditional testing approaches rely on limited predefined scenarios that inadequately capture the complexity of real-world driving situations. LLMs address this limitation by creating synthetic environments that simulate challenging urban conditions, thereby enhancing both the scalability and realism of autonomous vehicle evaluation [68,77,87]. OmniTester [87], for instance, leverages multimodal LLMs to generate realistic, challenging testing scenarios for autonomous vehicles by harnessing these models’ world knowledge and reasoning capabilities. The framework achieved a 100% success rate in scenario generation, substantially outperforming versions without chain-of-thought reasoning (40% success), demonstrating the critical importance of these components for reliable AV testing. In traffic simulation, conventional approaches demand extensive manual configuration and programming expertise, which limits scenario diversity and adaptability to evolving urban conditions. LLMs overcome these constraints by translating natural language descriptions into executable simulation code, dramatically reducing technical barriers to creating complex scenarios. ChatSUMO [84], for example, converts user prompts into executable SUMO code, enabling automatic generation of customized urban scenarios with minimal input. The system achieved 96% accuracy for real-world simulations and demonstrated effective customization capabilities for road networks, traffic lights, and vehicle behaviors.
Beyond scenario simulation, LLMs exhibit remarkable capabilities in generating practical urban solutions across multiple domains. Across these domains, the process through which LLM-generated scenarios and solutions enter analytical workflows follows a similar pattern, where outputs are reviewed and interpreted before informing operational or planning decisions. In disaster response, DisasterResponseGPT [111] employs LLMs (e.g., GPT-3.5 and GPT-4) to rapidly generate valid disaster response plans by incorporating planning guidelines in the initial prompt. The AI-generated plans proved comparable to human-generated ones in terms of quality while offering superior modification flexibility in real-time situations. In urban planning, [106] demonstrated LLMs’ effectiveness through a framework integrating role play, collaborative generation, and feedback iteration for land-use planning. Their approach achieved superior satisfaction (0.784) and inclusion (0.794) metrics, outperforming human experts while rivaling state-of-the-art methods such as deep reinforcement learning by effectively processing and balancing the needs of 1000 distinct stakeholders.
The impact of the LLM-driven scenario and solution generation extends to both pre- and post-modeling stages of urban analytics. During pre-modeling, it facilitates the generation of rare cases that might be under-represented in real-world data, ensuring more comprehensive model training. In post-modeling, it enables scenario comparison and supports enhanced decision-making by allowing planners and policymakers to explore a wide range of potential outcomes.

3.3. Preprocessing

Preprocessing transforms raw urban data into analysis-ready formats while addressing critical challenges in data quality, consistency, and relevance. LLMs have fundamentally transformed this phase by automating and enhancing preprocessing tasks, significantly improving both efficiency and reliability in urban data analytics. Figure 6 and Table 7 outline the primary preprocessing categories—data quality, representation, dimensionality, and distribution—along with their corresponding LLM-based solutions. By addressing these preprocessing challenges comprehensively, LLMs enable urban researchers and practitioners to extract more accurate insights from complex urban data, ultimately facilitating more informed decision-making in urban management.

3.3.1. Data Quality Issues

LLMs provide sophisticated solutions for ensuring data quality across multiple dimensions. For instance, they standardize textual descriptions and eliminate inconsistencies, as demonstrated in urban region profiling [99] and human mobility prediction [70]. LLMs also excel at identifying relevant features while filtering out irrelevant information—a capability essential for driver behavior analysis [37] and semantic footprint mapping [107]. Additionally, LLMs automate labeling tasks in social media sentiment analysis [145] and perform context-aware data imputation for missing values in traffic datasets [72].
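The imputation idea in [72] can be made concrete with a minimal sketch. An LLM-based imputer conditions on rich context (weather, events, neighboring sensors); the neighbor-mean fill below is a deliberately simple stand-in that shows only the fill-from-context principle, not the cited method.

```python
def impute_missing(series, window=2):
    """Fill each missing traffic count (None) with the mean of its
    nearest observed neighbors within a fixed window."""
    filled = list(series)
    for i, v in enumerate(filled):
        if v is None:
            neighbors = [x for x in filled[max(0, i - window): i + window + 1]
                         if x is not None]
            filled[i] = round(sum(neighbors) / len(neighbors), 1)
    return filled

counts = [120, 135, None, 150, 160]
print(impute_missing(counts))  # [120, 135, 141.2, 150, 160]
```

An LLM replaces the averaging rule with context-aware reasoning—e.g., filling a gap during a known road closure with a lower value than the neighbors alone would suggest.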

3.3.2. Data Representation Issues

LLMs excel at converting between different data representations to meet specific analytical requirements. They extract structured information from unstructured text, enabling event–event relation extraction [126] and urban itinerary planning [103]. Furthermore, LLMs adapt data representations to match spatial or temporal requirements in geospatial knowledge extraction [104] and time-series forecasting [187]. Notably, they generate textual descriptions for non-textual data, facilitating cross-modal alignment in satellite image text retrieval [97] and 3D scene understanding [105].
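Structured extraction from unstructured text, the first capability above, can be sketched as follows. In practice the LLM is prompted to return the JSON record directly; here regexes stand in for the model so the example runs standalone, and the notice text, field names, and `extract_permit_record` helper are all illustrative assumptions.

```python
import re, json

def extract_permit_record(text: str) -> dict:
    """Toy stand-in for LLM-based structured extraction: pull typed
    fields out of an unstructured planning notice."""
    record = {}
    m = re.search(r"at\s+([\w\s]+?)\s+(?:is|was)", text)
    record["site"] = m.group(1) if m else None
    m = re.search(r"(\d+)\s*storeys?", text)
    record["storeys"] = int(m.group(1)) if m else None
    record["use"] = "residential" if "residential" in text.lower() else "other"
    return record

notice = "A residential tower at Nathan Road is proposed with 32 storeys."
print(json.dumps(extract_permit_record(notice)))
```

The benefit of the LLM version is schema flexibility: adding a field is a prompt change rather than a new hand-written pattern.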

3.3.3. Data Dimensionality and Distribution Issues

LLMs address complex challenges in data dimensionality and distribution through advanced techniques. They generate semantically rich, lower-dimensional representations of high-dimensional data, which are particularly valuable in urban mobility pattern analysis [42]. For data distribution issues, LLMs generate synthetic data to enrich limited datasets or address class imbalances. This capability proves essential in urban renewal knowledge base creation [100], traffic accident analysis [46], and rare event prediction [120]. Through these comprehensive preprocessing capabilities, LLMs significantly enhance dataset quality and usability, enabling more robust and balanced urban analytics.
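The dimensionality-reduction role can be illustrated with the simplest possible embedding stand-in, the hashing trick: arbitrary text is projected into a fixed low-dimensional vector. Real LLM embeddings additionally carry semantics (similar trips map to nearby vectors); this sketch fixes only the dimensionality and is not the method of [42].

```python
import hashlib

def hashed_embedding(text: str, dim: int = 8) -> list[float]:
    """Project text into a fixed dim-length unit vector via token hashing.
    A stand-in for a semantic LLM embedding of, e.g., a trip description."""
    vec = [0.0] * dim
    for token in text.lower().split():
        h = int(hashlib.md5(token.encode()).hexdigest(), 16)
        vec[h % dim] += 1.0  # each token increments one hashed dimension
    norm = sum(v * v for v in vec) ** 0.5 or 1.0
    return [v / norm for v in vec]  # normalize so vectors are comparable

v = hashed_embedding("morning commute from Kowloon to Central by metro")
print(len(v))
```

Downstream clustering or classification then operates on these compact vectors instead of raw high-dimensional records.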

3.3.4. Representative Application

One representative application of LLMs in data preprocessing is sentiment analysis in social media, where these models excel at extracting nuanced insights from massive volumes of unstructured data with unprecedented accuracy and contextual understanding. Recent implementations demonstrate their transformative impact: CrisisSense-LLM [121], designed for disaster-related post classification, achieved a 15-fold performance improvement over the baseline LLaMA2-chat model by simultaneously categorizing event types, informativeness, and human aid involvement. This multi-label approach significantly enhances situational awareness during crisis management. The ALEX framework [153] combined data augmentation with LLM explanation mechanisms to excel in public health analysis, achieving F1 scores of 94.97%, 89.13%, and 88.17% on COVID diagnosis detection, therapy sentiment analysis, and social anxiety analysis tasks, respectively, substantially outperforming established models like BERT, BERTweet, and CT-BERT. Similarly, [145]’s analysis of 1.26 million nuclear power-related tweets demonstrated that LLM implementations achieved up to 96% classification accuracy while showing superior resistance to overfitting compared to traditional machine learning methods. These applications highlight LLMs’ effectiveness in analyzing complex social topics requiring nuanced understanding.
In conclusion, the integration of LLMs in urban data preprocessing fundamentally enhances data quality, consistency, and usability by automating complex tasks and addressing longstanding challenges. This advancement not only reduces manual labor but also produces reliable, analysis-ready datasets that enable more accurate urban analytics and deeper understanding of urban systems, ultimately supporting more sophisticated applications and informed decision-making.

3.4. Modeling

Urban analytics modeling creates computational representations that analyze and predict urban phenomena, offering direct insights into dynamics like traffic patterns, land-use evolution, environmental impacts, and socioeconomic trends. LLMs transform this field by enabling more responsive and nuanced approaches to urban system modeling that are capable of processing complex relationships and supporting sophisticated decision-making processes. The integration of LLMs within urban modeling frameworks operates across three complementary domains (Figure 7): prompt engineering, LLM agents, and foundation models.

3.4.1. Prompt Engineering

Prompt engineering, a crucial technique in harnessing the power of LLMs, involves the strategic design of inputs to guide model outputs. Essentially, it is the art and science of crafting questions or instructions for LLMs such as GPT-4, aiming to elicit precise and relevant responses. It serves as a bridge between raw data and meaningful insights, directly influencing the model’s interpretation and output generation. This approach is particularly vital in urban analytics, where the complexity and heterogeneity of data demand nuanced understanding and interpretation.
Representative Prompting Techniques
In general, prompt engineering techniques in urban analytics can be categorized into two main types (Figure 8): typical prompting and reasoning techniques. Typical prompting focuses on how to present tasks or questions to the model, often varying in the amount of context or examples provided. This category includes zero-shot and few-shot prompting, which differ in the number of examples given to the model before it performs a task. Reasoning techniques, on the other hand, are designed to enhance the model’s problem-solving and analytical capabilities. Methods such as Chain of Thought (CoT) and Tree of Thought (ToT) guide the model through more complex, multi-step reasoning processes, making them particularly valuable for intricate urban analytics tasks.
Figure 8 outlines key prompting techniques in LLMs. Among these, zero-shot, few-shot, and CoT prompting are currently the most prevalent in urban analytics. More advanced techniques like ToT and Graph of Thought (GoT), along with methods addressing hallucination and automation, are emerging but less common in urban applications due to their complexity. Readers interested in these advanced techniques can refer to the survey reported in [204].
Figure 8. Common prompting techniques in large language models [18,19,205,206,207,208].
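The three prevalent techniques—zero-shot, few-shot, and CoT—differ only in how the prompt string is assembled, which a short sketch makes concrete. The 311-complaint task and example pairs are hypothetical; the "Let's think step by step" suffix is the standard zero-shot CoT trigger phrase.

```python
def zero_shot(task: str) -> str:
    """No examples: the model relies entirely on pre-trained knowledge."""
    return task

def few_shot(task: str, examples: list[tuple[str, str]]) -> str:
    """Prepend worked question/answer pairs before the real task."""
    shots = "\n".join(f"Q: {q}\nA: {a}" for q, a in examples)
    return f"{shots}\nQ: {task}\nA:"

def chain_of_thought(task: str) -> str:
    """Append the standard trigger that elicits step-by-step reasoning."""
    return f"{task}\nLet's think step by step."

task = "Classify this 311 complaint: 'Broken streetlight on Elm St.'"
print(few_shot(task, [("'Pothole on 5th Ave.'", "roads"),
                      ("'Overflowing bin at the park.'", "sanitation")]))
```

In urban analytics the few-shot examples typically encode domain conventions (label taxonomies, units, coordinate formats) that the base model would otherwise guess at.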
Content-Wise Prompting
In urban analytics, prompt engineering often focuses on the content and structure of prompts, typically implemented in zero-shot or few-shot formats. Table 8 illustrates the diverse strategies and techniques employed in content-wise prompting for urban data analytics. For example, role and simulation strategies such as role-playing and scenario simulation prompts enable LLMs to emulate different stakeholders in urban settings, facilitating participatory planning and improving autonomous driving models [78,106]. Context enhancement strategies enrich prompts with relevant spatial and temporal information, improving geospatial predictions and integrating real-time data for applications like traffic signal control [64,104]. Task decomposition aspects, including CoT prompting, guide LLMs through complex reasoning processes, enhancing interpretability in tasks such as human mobility prediction [70]. Domain knowledge incorporation techniques embed specialized urban and geographical expertise into prompts, improving location description extraction and generating comprehensive reports for urban applications [116,117]. Robustness and optimization aspects focus on improving LLM performance and reliability in urban contexts, including by assessing spatial–temporal information handling and optimizing solutions for urban systems [157,194].
Prompts for Forecasting
Forecasting represents a distinctive application of prompting in urban analytics that extends beyond the LLM uses listed in Table 8. Unlike standard conversational applications, these approaches employ large language models more analogously to traditional machine learning methods for spatiotemporal predictions [33,94]. Groundbreaking research from NeurIPS [187] demonstrated that by encoding time-series data as numerical strings, models like GPT-3 and LLaMA-2 can perform zero-shot extrapolation with effectiveness comparable to or exceeding that of purpose-built forecasting models. This innovative approach bridges the gap between general-purpose language models and specialized predictive systems, opening new avenues for urban data analysis without requiring domain-specific model training.
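The key trick in [187] is the encoding: numeric series become digit strings so the model's next-token prediction doubles as numeric extrapolation. Exact tokenization details vary by model family; the sketch below shows only the idea, with separators chosen for illustration.

```python
def encode_series(values, sep=", ", digit_sep=" "):
    """Encode a numeric series as a digit string an LLM can continue.
    Spacing digits keeps each digit a separate token in many tokenizers."""
    return sep.join(digit_sep.join(str(int(v))) for v in values)

def decode_value(text, digit_sep=" "):
    """Turn a model completion like '1 6 5' back into a number."""
    return int(text.replace(digit_sep, ""))

prompt = encode_series([120, 135, 150])
print(prompt)  # 1 2 0, 1 3 5, 1 5 0
# The LLM is asked to continue this string; a hypothetical completion
# "1 6 5" would decode to the forecast 165:
print(decode_value("1 6 5"))  # 165
```

Because no model weights are touched, the same encoder works across tasks—one reason the approach transfers so readily to urban time series.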
Prompt engineering has demonstrated significant efficacy in real-time traffic forecasting and management. PromptGAT [69] exemplifies this approach by integrating domain knowledge with real-time traffic states for traffic signal control. When evaluated during challenging snowy conditions, the prompt-engineered system demonstrated remarkable resilience, achieving an approximately 7% reduction in average travel time compared to conventional direct-transfer methods. This improvement demonstrates the potential for prompt adaptation in real-time traffic management without requiring extensive retraining. xTP-LLM [88] further illustrates the dynamic capabilities of prompt engineering in real-time traffic prediction. By incorporating contextual information about sudden events like accidents or severe weather conditions into input prompts, the system can rapidly adjust its predictions. For instance, when a severe sandstorm scenario was introduced into the prompt, the model intelligently predicted a reduction in traffic volume from 532 to 352 vehicles, accounting for visibility limitations. These advancements collectively indicate that prompt engineering offers a promising path toward more adaptive, responsive, and intelligent traffic management systems capable of handling real-world complexity with minimal computational overhead.
In summary, prompt engineering is a vital technique in leveraging LLMs for urban analytics. By using carefully crafted prompts, researchers can significantly enhance the capabilities of LLMs to provide accurate and actionable insights. On the other hand, continuous refinement and validation remain crucial to ensuring the reliability of LLM outputs in real-world urban scenarios.

3.4.2. LLM Agents

As urban data analytics grows increasingly complex, there is a rising need for sophisticated tools that can autonomously navigate, interpret, and act upon vast amounts of urban data. LLM agents address this need by enhancing the capabilities of traditional LLMs with additional functionalities, allowing them to perform complex, multi-step tasks autonomously. These AI systems build upon the linguistic and analytical capabilities of LLMs, augmenting them with features such as sustained reasoning, access to external tools, and the ability to break down problems into manageable subtasks. These enhancements overcome limitations inherent in prompt-based interactions, such as the inability to use tools, handle multimodal data, and automate complex processes.
Key Components
Based on our review of previous studies, we have summarized LLM agents into four core components that work in synergy (as illustrated in Figure 9):
  • Knowledge: This foundational element encompasses contextual information and domain expertise, allowing the agent to draw upon relevant urban-specific data for its tasks.
  • Planning: This component empowers the agent to break down complex tasks and create strategies to solve them, incorporating various planning approaches and feedback mechanisms.
  • Memory: By storing and recalling information from past interactions, the memory component allows the agent to evolve and learn, informing future actions.
  • Action: This component translates the agent’s decisions into specific outcomes and tool uses, directly interacting with the environment to achieve defined goals.
As shown in Figure 9, these components interact in a cyclical process: the LLM agent receives a query or task, which initiates the planning process. The knowledge informs planning, which utilizes memory to create more effective strategies. The action component then executes these plans, with the results feeding back into the knowledge and memory components, continuously refining the agent’s capabilities. To facilitate the development of agents, several general-purpose frameworks have emerged, such as AutoGen [209], LangChain [95], MetaGPT [169], and AutoGPT [210]. These tools help streamline the process of managing different workflows and handling data efficiently.
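The four-component cycle above can be condensed into a minimal loop. The planning step, which a framework like AutoGen or LangChain would delegate to an LLM, is stubbed with keyword matching; the class name, query, and knowledge fields are illustrative assumptions.

```python
class UrbanAgent:
    """Minimal sketch of the knowledge/planning/memory/action cycle:
    knowledge informs planning, actions execute it, results feed memory."""

    def __init__(self, knowledge: dict):
        self.knowledge = knowledge        # domain facts / context
        self.memory: list[str] = []       # record of past results

    def plan(self, query: str) -> list[str]:
        # Stub for an LLM call decomposing the query into subtasks.
        return [f"look up {k}" for k in self.knowledge if k in query.lower()]

    def act(self, step: str) -> str:
        key = step.removeprefix("look up ")
        result = f"{key} = {self.knowledge[key]}"
        self.memory.append(result)        # feedback loop into memory
        return result

    def run(self, query: str) -> list[str]:
        return [self.act(step) for step in self.plan(query)]

agent = UrbanAgent({"aqi": 62, "traffic": "moderate"})
print(agent.run("Report current AQI and traffic conditions"))
```

Production agents add what this sketch omits: tool calling, retries on failed actions, and persistent memory across sessions.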
LLM Agents in Urban Analytics
While these general frameworks provide a solid foundation, many researchers in urban analytics are developing specialized agent frameworks for specific tasks, incorporating domain-specific knowledge and methodologies for more nuanced applications.
Based on the major purposes and key features being utilized, we have summarized six primary aspects of LLM agents in urban analytics (Table 9). These agents excel at task planning and automation, coordinating complex urban management tasks such as disaster response planning [111] and autonomous driving [32]. They enhance knowledge retrieval and information search, rapidly synthesizing data from various sources for applications like flood detection [109] and remote sensing analysis [158]. LLM agents also provide human-like reasoning and decision support, analyzing multiple factors for urban planning [10] and driver behavior analysis [37]. They facilitate natural language interaction, making complex urban systems more accessible through applications like vehicle co-pilot systems [44] and conversational geospatial assistants [190]. In data analysis and forecasting, these agents process large datasets for tasks such as time-series forecasting [184] and traffic performance analysis [58].
Multimodal integration represents a significant application of LLM agents in urban environments, enabling comprehensive modeling and management through the synthesis of diverse data types, as demonstrated by systems like VELMA and CityCraft (Table 9) [90,189]. VELMA [189], an embodied LLM agent operating in Street View environments, transforms visual observations and navigation trajectories into natural language prompts for action prediction, achieving approximately 25% relative improvement in task completion compared to previous state-of-the-art models. Similarly, LA-Light [57] illustrates the effectiveness of multimodal integration in traffic signal control by converting intersection layouts, real-time sensor data, and traffic conditions into structured natural language prompts for decision-making, resulting in a 20.4% reduction in average waiting time during sensor outages compared to leading reinforcement learning methods. These examples highlight how LLM agents can effectively integrate and process multiple data modalities to enhance urban systems’ performance.
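The serialization step these systems share—turning structured sensor readings into a natural-language prompt—is straightforward to sketch. The field names and question below are illustrative, not LA-Light's actual prompt schema.

```python
def state_to_prompt(intersection: dict) -> str:
    """Serialize intersection state into a text prompt an LLM can reason
    over, including degraded-sensing conditions."""
    lines = [f"Intersection {intersection['id']} status:"]
    for approach, queue in sorted(intersection["queues"].items()):
        lines.append(f"- {approach} approach: {queue} vehicles queued")
    offline = intersection.get("failed_sensors", [])
    if offline:  # surfacing sensor failures lets the model hedge its decision
        lines.append(f"Note: sensors offline on {', '.join(offline)}.")
    lines.append("Which phase should receive the next green signal?")
    return "\n".join(lines)

prompt = state_to_prompt({
    "id": "K-12",
    "queues": {"north": 14, "south": 6, "east": 9, "west": 2},
    "failed_sensors": ["east"],
})
print(prompt)
```

Explicitly stating sensor outages in the prompt is what allows an LLM controller to reason around missing data, the scenario in which LA-Light's reported gains were measured.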
Across these applications, we also observed that studies often adopt heterogeneous task definitions, datasets, and evaluation metrics, especially in multimodal and agent-based settings. To increase comparability, Table 9 consolidates representative exemplars and standardizes their task scopes, data sources, and performance measures. This matrix provides a cross-study reference point that highlights where evaluation protocols are aligned, where benchmarks are missing, and where results cannot yet be compared directly.
These diverse applications demonstrate the versatility and potential of LLM agents in addressing complex urban challenges. LLM agents represent a significant leap forward in the application of artificial intelligence to urban data analytics. By combining the linguistic and analytical capabilities of LLMs with autonomous decision-making and tool integration, these agents are poised to become indispensable allies in our quest for smarter, more sustainable urban futures.

3.4.3. Fine-Tuning and Foundation Models

While agents and prompt engineering have shown promise in adapting LLMs to urban data analytics tasks, they may not be the optimal path towards urban general intelligence [5]. To address this, scholars are increasingly turning to fine-tuning techniques and developing specialized foundation models for urban analytics.
Fine-Tuning
Fine-tuning is a process that adapts pre-trained LLMs to perform specific tasks or operate within particular domains. This technique is crucial in urban data analytics, as it allows models to understand and process city-specific terminology, patterns, and contextual nuances. Fine-tuning typically involves further training of a pre-trained model on a smaller, task-specific dataset, which can significantly improve performance on urban-related tasks. Table 10 presents four common fine-tuning techniques used in urban analytics research:
  • In-context fine-tuning (Prompt engineering): This technique involves crafting specific prompts and adding few-shot examples to guide the model’s behavior without changing its parameters. It is useful for quick adaptations and when labeled data is scarce.
  • Supervised fine-tuning: This approach involves fine-tuning a pre-trained model on a labeled dataset specific to the target task. It is ideal when substantial labeled data is available and domain-specific performance is crucial.
  • Instruction fine-tuning: This method involves fine-tuning a pre-trained model on a dataset of instructions and their corresponding outputs. It is particularly effective for task-specific applications and improving model interpretability.
  • Parameter-Efficient Fine-tuning (PEFT): PEFT techniques aim to adapt pre-trained models to new tasks while updating only a small subset of the model’s parameters. This approach is suitable when computational resources are limited or when preserving general knowledge is important.
The choice of fine-tuning technique in urban applications depends on factors such as data availability, computational resources, and the specific requirements of the task. For instance, in-context fine-tuning might be preferred for rapid prototyping or when dealing with novel urban challenges, while supervised fine-tuning could be more appropriate for well-defined tasks with ample labeled data.
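The parameter savings behind PEFT methods such as LoRA follow from simple arithmetic: a frozen weight matrix W is adapted through a low-rank product BA, so only m·r + r·n values are trained instead of m·n. The pure-Python sketch below (no ML framework) illustrates the bookkeeping; the tiny matrices are arbitrary illustrative values.

```python
# Conceptual LoRA update: W_eff = W + B @ A, where only A (r x n) and
# B (m x r) are trainable and W stays frozen.
def matmul(X, Y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)]
            for row in X]

m, n, r = 4, 4, 1                      # full size vs. LoRA rank
W = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(m)]  # frozen
A = [[0.1, 0.2, 0.3, 0.4]]             # r x n, trainable
B = [[1.0], [0.0], [0.0], [0.0]]       # m x r, trainable

delta = matmul(B, A)                   # low-rank update B @ A
W_eff = [[W[i][j] + delta[i][j] for j in range(n)] for i in range(m)]

full, lora = m * n, m * r + r * n
print(f"trainable params: {lora} vs full fine-tuning: {full}")
```

At realistic scales (m = n = 4096, r = 8) the same arithmetic gives roughly 65K trainable parameters against 16.8M for the full matrix, which is why PEFT fits the resource constraints typical of urban research groups.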
Urban Foundation Models
Urban foundation models are large-scale models pre-trained on diverse urban datasets, serving as a robust starting point for various urban analytics tasks. These models can be further fine-tuned for specific applications, offering a comprehensive understanding of urban contexts and the ability to transfer knowledge across related urban tasks. However, due to limited training data and resources, most urban foundation models are adapted versions of well-known open-source large models, fine-tuned to fit urban contexts. This approach leverages the general knowledge embedded in existing LLMs while incorporating urban-specific insights. Despite these constraints, the development of foundation models has progressed beyond text-based applications, with a variety of models emerging to address different aspects of urban analytics, as illustrated in Table 11. These models range from language-based and vision-related models to time-series, spatiotemporal, text-to-3D, and multimodal models, each tailored to meet specific urban analytics needs.
To systematize this rapidly evolving field, [5,9] developed comprehensive frameworks for understanding urban foundation models. [10] introduced Urban Generative Intelligence (UGI), which integrates LLMs with urban systems to enhance generative capabilities in cities. They employ CityGPT, which creates embodied agents capable of navigating simulated urban environments through natural language interfaces, addressing multifaceted challenges across physical, social, economic, and environmental domains. Despite these advances, evaluation studies [180] demonstrated that current foundation models still lag behind specialized task-specific models in multimodal applications, particularly in POI-based urban function classification and remote sensing image scene classification. These findings highlight persistent challenges in developing truly versatile urban foundation models that can excel across diverse urban computing tasks.
In summary, fine-tuning and foundation models play a critical role in enhancing the applicability and accuracy of LLMs in urban analytics. Fine-tuning customizes LLMs for specific tasks, while foundation models provide a versatile and scalable base. These techniques collectively expand the potential of LLMs, enabling more precise and insightful urban analytics across various subdomains. However, a major barrier to the widespread adoption of fine-tuning and foundation models in urban analytics remains the lack of a comprehensive, large-scale database for training. As the field advances, the development of such databases will be crucial for realizing the full potential of these approaches in urban data analytics.

3.4.4. Performance Evaluation

The last aspect in modeling is performance evaluation. As LLMs increasingly permeate urban data analytics, rigorous performance evaluation becomes crucial. Proper assessment ensures that these models not only meet the complex demands of urban environments but also provide reliable insights for decision-making. In the context of urban analytics, performance evaluation goes beyond traditional metrics, encompassing both technical capabilities and human-centered considerations.
To analyze the evaluation landscape in LLM urban analytics, we have grouped common evaluation aspects into two broad categories: model-oriented and human-oriented aspects, as summarized in Table 12. While this categorization serves to illustrate the diverse facets of performance evaluation, it is important to note that many of these aspects are interconnected and may overlap in practice.
Model-Oriented Evaluation
Model-oriented evaluation focuses on the technical performance and capabilities of LLMs in urban data analytics tasks. Key aspects include the following:
  • Quality of Generated Data: Assessing the accuracy and realism of synthetic data produced by LLMs is crucial in urban analytics, where data quality directly impacts decision-making. For example, ref. [77] evaluated the realism of AI-generated driving scenarios using metrics like collision rates and emergency braking incidents, demonstrating the importance of high-quality synthetic data in urban transportation planning.
  • Prediction Performance: Measuring the accuracy of LLM predictions on various urban tasks is essential for ensuring reliable outcomes. Ref. [63] assessed traffic flow prediction accuracy using metrics such as Mean Absolute Error (MAE) and Root Mean Square Error (RMSE), highlighting the critical role of precise predictions in urban mobility management.
  • Task Completion: The ability of LLMs to successfully complete complex urban tasks is another important aspect, particularly in agent-based applications. Ref. [172] measured task completion in GeoLLM-Engine by assessing both the functional correctness of an agent’s tool usage and the success rate of achieving the expected end state of a task, using automated model-checking techniques that compare the agent’s actions and outcomes against pre-defined ground truths. In traffic signal control tasks, a fine-tuned model [85] achieved a 23.6% reduction in average vehicle delay and an 18.9% increase in throughput, while a prompt-based system [64] reduced average waiting time by 21.4%.
  • Robustness, Generalization, and Efficiency: While less common in current urban analytics applications, these aspects are gaining importance as LLMs become more integrated into urban systems. Evaluating LLM performance under varying conditions or with noisy input data (robustness), on unseen data or in new urban environments (generalization), and in terms of computational resources and time required (efficiency) is critical for ensuring the practical applicability of LLMs in diverse urban contexts. For instance, ref. [70] tested LLM-based mobility prediction models on days with and without public events to assess robustness, ref. [97] evaluated the performance of urban cross-modal retrieval models across different cities for generalization, and ref. [50] compared the efficiency of LLM-based approaches with that of traditional optimization methods for delivery route planning.
Human-Oriented Evaluation
Human-oriented evaluation considers the interaction between LLMs and human users in urban analytics applications:
  • Interpretability: Evaluating the explainability of LLM decisions and outputs is crucial for building trust and understanding, especially in urban planning and policy-making contexts. Ref. [57] assessed the quality and relevance of explanations provided by LLMs for traffic control decisions, demonstrating the importance of transparent AI systems in urban management.
  • Human–AI Collaboration: Assessing how well LLMs support and enhance human decision-making in urban contexts is vital for maximizing the synergy between human expertise and AI capabilities. Ref. [35] examined the impact of ChatGPT assistance on human performance in traffic simulation tasks, highlighting the potential of LLMs to augment human problem-solving in complex urban scenarios.
  • Ethical Considerations: Evaluating LLM adherence to ethical guidelines and potential biases in urban applications is essential for ensuring fair and responsible AI use in city planning and management. Ref. [135] assessed LLM-generated content for potential biases in tourism marketing, underscoring the need for vigilance in maintaining ethical standards as LLMs become more prevalent in shaping urban narratives and experiences.
Evaluation Metrics
To effectively evaluate the aforementioned aspects in urban analytics, researchers employ a variety of metrics and methods, as summarized in Table 13. These can be broadly categorized into quantitative metrics and qualitative methods:
  • Quantitative Metrics: These provide numerical measures of LLM performance. They include accuracy for tasks like question answering or text classification [167], exact matching for precise output evaluation [46], common machine learning metrics such as F1 score [126], response time for real-time applications [37], consistency across different input formats [160], and task-specific metrics like collision rate in autonomous driving simulations [79]. These metrics offer a standardized way to assess and compare LLM performance across various urban analytics tasks.
  • Qualitative Methods: These provide deeper insights into LLM performance and user interaction. They include expert reviews to assess accuracy and relevance [146], user satisfaction surveys to gauge usability and effectiveness [90], in-depth case studies to understand model behavior in specific scenarios [155], and ethical assessments to evaluate adherence to guidelines and potential biases [127]. These methods are crucial for understanding the nuanced performance of LLMs in complex urban contexts and their impact on stakeholders.
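For concreteness, two of the quantitative metrics noted above, exact matching and token-level F1, can be sketched as follows; the reference answer and predictions are purely illustrative:

```python
from collections import Counter

def exact_match(prediction, reference):
    """1 if the normalized strings match exactly, else 0."""
    return int(prediction.strip().lower() == reference.strip().lower())

def token_f1(prediction, reference):
    """Token-overlap F1, a softer measure than exact match for free-form answers."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    common = Counter(pred_tokens) & Counter(ref_tokens)
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)

# Hypothetical QA outputs for an urban query.
ref = "Victoria Harbour waterfront"
print(exact_match("victoria harbour waterfront", ref))
print(round(token_f1("the Victoria Harbour area", ref), 3))
```

In practice, exact match is appropriate for closed-form outputs (e.g., a station name), while token F1 better reflects partially correct free-text answers.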
It is important to note that the urban context often requires domain-specific evaluation criteria. Researchers should consider developing tailored metrics that capture the nuances of urban systems and align with local planning goals and regulations. The combination of quantitative metrics and qualitative methods provides a comprehensive evaluation framework, ensuring that LLMs not only perform well technically but also meet the complex needs of urban stakeholders and ethical standards.

3.5. Post-Analysis

The final crucial component in the pipeline of urban analytics is post-analysis. This stage plays a vital role in transforming the complex analytical outputs into more practical insights for decision makers and stakeholders. Post-analysis involves the application of additional engineering and presentation techniques to modeling results, significantly enhancing their accessibility and practical utility. The linguistic nature of LLMs has introduced new possibilities for post-analysis in urban analytics, offering advanced capabilities for processing, interpreting, and presenting complex urban data. The post-analysis techniques leveraging LLMs in urban analytics can be broadly categorized into three main areas, as summarized in Table 14:
  • Interactivity: LLMs have enabled more interactive and responsive urban data analytics systems, primarily through QA systems and automatic surveys. LLM-powered QA systems allow users to interact with urban data in natural language, as seen in [109]’s system for answering queries about flood situations in real time and [57]’s system for providing explanations for traffic control decisions. LLMs can also assist in conducting surveys and analyzing public opinion on urban issues. Ref. [138] used LLMs to assess customer satisfaction in tourism, while ref. [145] analyzed public opinion on nuclear power. These interactive capabilities make urban data more accessible to non-experts, enable rapid information retrieval, and allow for more efficient and comprehensive assessment of public sentiment on urban policies and developments.
  • Accessibility: LLMs have significantly improved the accessibility of urban data analysis results through various techniques. These include automatic report generation, result visualization, and result/decision explanation. For instance, ref. [58] demonstrated the use of LLMs in creating traffic advisory reports, while ref. [105] showcased their application in producing detailed ecological construction reports. In terms of visualization, ref. [34] utilized LLMs to visualize geospatial trends in transit feedback, and ref. [164] employed them to create maps and charts for COVID-19 death-rate analysis. In addition, LLMs can provide explanations for results or decisions, as demonstrated by [57] in the context of traffic signal control decisions and by [48] in explaining predictions for autonomous driving. These capabilities collectively enhance the understanding and usability of complex urban data analysis results.
  • Decision Support: LLMs have significantly enhanced decision-support capabilities in urban analytics through scenario/policy simulation, decision analysis, and personalized recommendations. In scenario simulation, ref. [113] demonstrated the use of LLMs in simulating disaster scenarios for education, while ref. [59] used them to generate activity patterns under different conditions. For decision analysis, ref. [60] showed how LLMs could aid in traffic management and urban planning decisions, and ref. [152] demonstrated their use in supporting policy-making for health and disaster management. In terms of personalized recommendations, ref. [171] used LLMs to create tailored travel itineraries, while ref. [113] developed a system for providing personalized emergency guidance. These capabilities collectively provide valuable insights for urban planning and policy formulation, assist in complex decision-making processes, and allow for more targeted and effective urban services.
LLMs are revolutionizing urban governance by creating seamless pipelines that transform raw city data into actionable intelligence for informed urban decision-making. For example, ref. [108] demonstrated that LLM-based multi-agent systems achieve 94–99% accuracy in routing urban queries, with response quality scores (G-Eval: 0.68–0.74) significantly outperforming standalone LLMs (0.30–0.38), thereby enabling more efficient resource allocation in urban planning. In transportation management, TrafficGPT [33] maintained 100% accuracy in providing traffic recommendations—surpassing GPT-4’s variable 80–100% accuracy—delivering reliable guidance for traffic resource optimization and congestion mitigation. Complementing these operational tools, TrajLLM [211] generates interpretable mobility insights for urban planning, while UrbanGPT [94] provides spatio-temporal analysis to anticipate urban trends. Together, these applications showcase how LLMs can effectively translate vast urban datasets into concrete policy recommendations and operational decisions, supporting evidence-based governance across transportation, infrastructure, and public services.
While these LLM-powered techniques offer significant advantages, it is important to note some considerations. For example, data privacy must be carefully managed, particularly in interactive systems, to protect sensitive urban data. Bias mitigation strategies should be employed to ensure fair and equitable urban analytics, as LLMs can potentially perpetuate or amplify biases present in training data. In conclusion, post-analysis techniques powered by LLMs play a crucial role in bridging the gap between complex urban analytics and practical decision-making. By making analytical results more accessible, interactive, and actionable, these techniques have the potential to significantly enhance urban planning and policy-making processes.

3.6. Summary

LLMs have emerged as a transformative force in urban data analytics, revolutionizing every stage of the analytical process, from data collection to post-analysis. Their integration has enhanced the efficiency, accuracy, and scope of urban data processing and interpretation. LLMs have demonstrated remarkable versatility across various urban domains, with applications ranging from transportation and urban planning to disaster management and social dynamics.
The impact of LLMs is evident in their ability to generate synthetic data, automate complex preprocessing tasks, and create sophisticated models through techniques like prompt engineering and the development of specialized urban foundation models. LLM agents have further extended these capabilities, enabling autonomous navigation of complex urban datasets and facilitating more intuitive human–AI collaboration. In post-analysis, LLMs have significantly improved the accessibility and interpretability of results, offering enhanced visualization, natural language explanations, and interactive query systems.
While the current applications of LLMs in urban analytics are impressive, they also point towards exciting future possibilities and challenges. As we look towards the future, questions arise about how to expand information dimensions for wider urban applications, how to enhance urban-specific foundation models for more intelligent solutions, and how to optimize LLM agents and workflows for complex urban problems. The next section will delve into these future directions, exploring the potential trajectories of LLM applications in urban analytics and the research priorities that will shape the cities of tomorrow.

4. Future Directions and Challenges

4.1. Overview of the 3E Framework

To guide future research and maximize the impact of LLMs on urban data analytics, we propose the 3E framework, comprising three interconnected pillars (Figure 10): expanding information dimensions, enhancing model capabilities, and executing advanced applications. This framework follows the natural progression of urban data workflows—from collection and integration, through analytical processing, to practical implementation. The structure of the framework emerged from an inductive synthesis of the 178 studies examined in this review. Each study was reviewed with respect to its position in the urban analytics pipeline and its primary contribution type, and recurring patterns were consolidated through iterative discussion among the authors.
  • Expanding Information Dimensions (Data): This pillar focuses on enriching the data ecosystem by integrating diverse sources (structured, unstructured, visual, and temporal) to create comprehensive digital representations of urban systems. By breaking down data silos and enabling access to domain-specific knowledge, it establishes a foundation for holistic urban analysis.
  • Enhancing Model Capabilities (Model): This pillar addresses the development of LLMs specifically designed to process urban data complexities. This includes creating models that effectively handle multimodal information, capture spatial–temporal dynamics, and operate efficiently at scale—transforming raw urban data into actionable intelligence.
  • Executing Advanced Applications (Application): This pillar bridges technological innovation and real-world impact by deploying LLMs to address complex urban challenges, exploring applications in real-time decision-making, multi-agent simulations, and cross-domain collaboration to deliver tangible benefits for urban planning and policy-making.
The framework is situated alongside existing perspectives in urban analytics. Pipeline-oriented formulations in urban computing provide structured descriptions of how urban data progress through stages of collection, preprocessing, modeling, and post-analysis [212]. Discussions on emerging urban foundation models offer another perspective, centering on model architectures and training regimes [9]. The 3E framework contributes to this landscape by organizing research directions that recur across stages and model classes, providing a lens for understanding how LLM-related advances shape data, modeling practices, and application development.
The 3E framework is designed to be flexible and adaptable across diverse urban contexts—from metropolitan areas to smaller municipalities—and various domains, including transportation, environmental management, and public service delivery. Progress in each pillar enables advancements in others, creating a synergistic approach to improving urban analytics and addressing tomorrow’s urban challenges.

4.2. Expanding Information Dimensions (Data)

The first pillar of the 3E framework, expanding information dimensions, builds on the understanding that information flows and multimodal data constitute the foundational layer of urban technological systems, forming the basis upon which a richer and more integrated data ecosystem for LLM applications can be developed [213] (Figure 11). Urban environments generate vast amounts of heterogeneous data, ranging from structured datasets like sensor readings to unstructured data such as social media posts and visual imagery. However, the full potential of this data remains untapped due to siloed storage, inconsistent formats, and limited integration across domains.
This pillar addresses these challenges by emphasizing the integration of diverse data sources—structured, unstructured, visual, and temporal—to create a more comprehensive digital representation of urban systems. By breaking down data silos and enabling access to domain-specific knowledge, this pillar lays the foundation for more accurate and holistic urban analysis. The integration of multimodal data and the use of techniques like Retrieval-Augmented Generation (RAG) are key to unlocking new insights into urban dynamics, enabling LLMs to provide more contextually relevant and actionable intelligence for urban stakeholders.

4.2.1. Multimodal Data Integration

As urban environments become increasingly complex and data-rich, the integration of multimodal data sources presents both a significant opportunity and a challenge for urban data analytics. While recent research has begun to explore the potential of combining different data modalities, there is still considerable room for advancement in this area.
Expanding Integration Scope
One key direction for future research lies in expanding the scope of multimodal integration. Urban environments generate a wealth of diverse data types, including tabular data, images, textual documents, and street-level imagery. Future studies should aim to develop methodologies that can effectively synthesize three or more of these modalities simultaneously, providing a more comprehensive and nuanced understanding of urban dynamics [158,172,177]. This integration could be achieved through specialized LLM agents that coordinate across data types, orchestrating workflows that process each modality through appropriate tools before synthesizing results. For example, such systems could analyze satellite imagery alongside mobility patterns and social media sentiment to identify and explain emerging urban hotspots, helping planners respond more effectively to changing neighborhood dynamics.
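A minimal sketch of such a modality-routing workflow is shown below; the handler functions and their outputs are hypothetical stand-ins for real vision, mobility, and sentiment models, which a production system would replace with tool-calling LLM agents:

```python
# Each handler is a hypothetical stand-in for a modality-specific model or tool.
def analyze_imagery(data):
    return f"imagery: detected {data['hotspots']} activity hotspots"

def analyze_mobility(data):
    return f"mobility: {data['trips']} trips, peak at {data['peak']}"

def analyze_sentiment(data):
    return f"sentiment: {data['polarity']} toward the area"

HANDLERS = {
    "satellite_imagery": analyze_imagery,
    "mobility_traces": analyze_mobility,
    "social_media": analyze_sentiment,
}

def orchestrate(observations):
    """Route each modality to its handler, then merge the partial findings
    into a single summary a coordinating LLM could reason over."""
    findings = [HANDLERS[modality](payload) for modality, payload in observations.items()]
    return "; ".join(findings)

obs = {
    "satellite_imagery": {"hotspots": 3},
    "mobility_traces": {"trips": 12000, "peak": "18:00"},
    "social_media": {"polarity": "positive"},
}
print(orchestrate(obs))
```

The design choice here is deliberate: each modality is processed by its own specialized component, and only the synthesized textual findings are passed to a downstream reasoning step, which keeps each tool independently replaceable.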
Advanced Modalities
Furthermore, as technology evolves, there is an opportunity to incorporate more advanced and complex data modalities into urban analytics frameworks. For instance, the integration of 3D urban models could provide valuable insights into spatial relationships and urban morphology [105,196], implemented through specialized code generation capabilities that transform raw 3D data into queryable formats for LLM analysis. This could enable planners to rapidly assess how proposed building developments might affect pedestrian comfort, sunlight exposure, and visual corridors throughout different seasons. Similarly, the analysis of video data could capture temporal dynamics in urban settings [47], achieved through tool-using LLM systems that extract and interpret movement patterns from traffic cameras to predict congestion hours before it occurs, allowing for proactive traffic management interventions. Social media data could offer crowd-sourced insights into public sentiment and behavior [153], with LLM workflows processing diverse text and image content to identify emerging community concerns before they become formal complaints.
Potential Applications and Summary
Future research could explore fusing video feeds, social media reports, and geolocation data to support real-time crowd monitoring and emergency response—for example, detecting congestion or safety risks during city events and enabling timely intervention. Such systems would leverage agent-based approaches where specialized LLMs monitor different data streams and collaborate to identify anomalies requiring attention, providing emergency managers with integrated situational awareness during festivals, protests, or natural disasters. In disaster scenarios, these multimodal systems could integrate aerial imagery, social media reports, and sensor data to create real-time damage assessments and prioritize response efforts based on both physical infrastructure damage and human needs.
To date, LLM agents and tool-use approaches have proven to be effective and accessible methods for multimodal analysis in urban contexts. These approaches offer significant advantages, particularly for urban scholars, as they do not require extensive training or computational resources. As such, they remain a valuable and practical direction for future research and application. Alongside these methods, the development of large multimodal models presents additional opportunities for advancing the field. By pursuing these diverse research directions, we can unlock new insights into urban systems and enhance our ability to address complex urban challenges while ensuring that advancements in urban analytics remain accessible to a wide range of scholars and practitioners.

4.2.2. Retrieval-Augmented Generation (RAG)

Building upon the concept of integrating domain-specific knowledge, Retrieval-Augmented Generation (RAG) presents a promising approach to enhance LLMs’ capabilities in urban analytics. RAG allows models to access and utilize external knowledge bases during inference, potentially addressing the challenge of incorporating domain-specific urban planning principles, socio-economic theories, and environmental science concepts. In practical terms, RAG works by first retrieving relevant information from specialized databases or documents based on the query, then using this retrieved information to generate more accurate and contextually appropriate responses—essentially giving LLMs access to knowledge beyond their training data.
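The retrieve-then-generate loop just described can be illustrated with a minimal sketch; here a crude lexical-overlap score stands in for embedding-based retrieval, the knowledge-base snippets are hypothetical, and the generation step is represented only by the assembled prompt:

```python
def score(query, document):
    """Crude lexical relevance: shared-token count (a stand-in for
    embedding similarity in a real RAG system)."""
    return len(set(query.lower().split()) & set(document.lower().split()))

def retrieve(query, corpus, k=2):
    """Return the k most relevant documents for the query."""
    return sorted(corpus, key=lambda doc: score(query, doc), reverse=True)[:k]

def build_prompt(query, corpus):
    """Assemble an augmented prompt; the generation step itself would be an LLM call."""
    context = "\n".join(f"- {doc}" for doc in retrieve(query, corpus))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer using the context above."

# Hypothetical knowledge base of urban planning snippets.
corpus = [
    "Zoning code R-2 limits building height to 12 metres in residential districts",
    "Flood zone A requires elevated foundations for new construction",
    "Bus route 7 was rerouted in 2023 due to bridge maintenance",
]
print(build_prompt("What is the height limit in residential zoning districts?", corpus))
```

Because the retrieved context is injected at inference time, the knowledge base can be updated (new zoning rules, fresh sensor records) without retraining the underlying model, which is the central practical advantage of RAG for urban applications.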
Potential Applications
The opportunities presented by RAG in urban analytics are significant. For instance, a RAG-enabled system could analyze traffic patterns by retrieving historical congestion data for specific intersections, then generate optimized signal timing recommendations that adapt to both seasonal variations and special events. Similarly, in disaster management, RAG could retrieve building codes and flood-zone maps when analyzing vulnerable infrastructure, enabling precise evacuation planning tailored to neighborhood demographics. In urban planning applications, RAG could access zoning regulations and development histories when evaluating proposed projects, generating comprehensive impact assessments that consider both legal compliance and community needs.
Moreover, RAG can be applied creatively to solve complex urban challenges. A recent innovative application reported in [214] demonstrates how RAG can be used to redefine image geolocalization as a text generation task. The proposed Img2Loc system combines CLIP-based image representations with large multi-modality models like GPT-4V or LLaVA, achieving competitive performance in precise location prediction from images without additional model training. This approach allows urban planners to automatically identify locations from citizen-submitted photos of infrastructure issues, retrieving relevant maintenance records and jurisdictional information to expedite repairs.
This approach opens up exciting possibilities for urban analytics. For instance, similar techniques could be applied to automatically geolocate and contextualize user-submitted urban images, providing valuable data for city planning and management. By retrieving relevant land-use policies and development histories for specific locations, planners can make more informed decisions about proposed changes. RAG could also enhance the analysis of historical urban imagery, helping to track changes in cityscapes over time with unprecedented accuracy by retrieving temporal data on construction permits, demographic shifts, and economic indicators to explain observed physical transformations.

4.2.3. Other Data Dimensions

It is important to note that LLM-empowered urban analytics is still in its early stages, with many data-related directions mentioned in previous sections still having ample room for exploration. While we have highlighted two areas we believe are particularly promising and offer significant opportunities for advancement, there are some other data-focused directions to consider:
  • Human-as-Sensor Data Sources: Research is needed on the effective incorporation of unstructured human observations (social media, feedback forms, and community forums) as reliable data sources for urban analytics. Key challenges include the development of data validation frameworks for citizen-reported information and addressing representation biases in voluntarily contributed urban data.
  • Synthetic Urban Data: Opportunities exist for the use of LLMs to generate synthetic urban datasets that can fill gaps in historical records while preserving statistical properties. Future research should focus on validating the fidelity of LLM-generated synthetic data against known urban patterns and ensuring it correctly reflects urban system interdependencies.
  • Cross-Domain Data Integration: Research is needed on the creation of standardized approaches for the merging of datasets across traditionally separate domains (e.g., transportation, public health, and economic development) while maintaining semantic consistency. This includes developing shared data schemas and crosswalks between different urban data taxonomies.
  • Large-Scale Urban Data Processing: Investigations into efficient methods for processing city-scale heterogeneous datasets with LLMs represent an important research direction. This includes developing techniques for partitioning and processing massive urban datasets while maintaining context across analysis segments.
  • Privacy-Preserving Data Techniques: Critical research is needed on the incorporation of differential privacy and anonymization techniques specifically designed for urban data when processed through LLMs, especially for sensitive information like mobility patterns, utility usage, and public service access.
By focusing on multimodal data integration and RAG, while keeping in mind the broader landscape of data dimensions, we can push the boundaries of what is possible in LLM-powered urban analytics. These advancements will set the stage to enhance model capabilities and execute advanced applications, which will be discussed in subsequent sections. Such directions also point to the need for shared benchmark datasets and standardized evaluation procedures that can support reproducible testing of multimodal fusion, data-quality robustness, and cross-domain integration.

4.3. Enhancing Model Capabilities (Model)

The second pillar, enhancing model capabilities, aims to improve the ability of LLMs to understand and process the unique complexities of urban environments. It corresponds to the analytical-capability tier commonly identified in urban computing research, where system performance hinges on advances in the modeling of heterogeneous data, spatio-temporal reasoning, and multimodal integration. This component acknowledges that general-purpose LLMs, while powerful, require specific enhancements to effectively process and interpret the multifaceted nature of urban environments. This pillar focuses on advancing the technical capabilities of LLMs to develop multimodal foundation models; enhance temporal and spatial awareness; and create smaller, more efficient models for large-scale analysis. By advancing capabilities in spatial awareness, temporal reasoning, and cross-domain knowledge transfer, this pillar ensures that LLMs can transform raw urban data into actionable intelligence. These advancements are critical in supporting sophisticated analytical tasks, such as predicting urban growth patterns, modeling traffic flow, and understanding the interplay between environmental and social factors in urban systems.

4.3.1. Large Foundation Models

Multimodal Foundation Models
The diverse nature of urban data necessitates models capable of processing multiple data types simultaneously. While agent-based techniques have shown promise in multimodal integration (Section 4.2.1), developing fine-tuned foundation models represents another crucial advancement. Research conducted to date has produced various foundation models, though most focus on single modalities [10]. Urban environments, however, generate heterogeneous data across text, images, sensor readings, and video—integrating these diverse sources into unified models presents both challenges and opportunities.
Multimodal foundation models offer transformative potential for urban understanding. They could enhance environmental monitoring by processing satellite imagery, sensor data, and demographic information simultaneously, enabling accurate predictions of pollution dispersion and heat island effects at neighborhood scales. Such capabilities represent steps toward Urban General Intelligence (UGI) [5,9,10], as illustrated in Figure 12.
Video integration presents a particularly promising frontier. Urban environments generate vast amounts of video data through traffic cameras, surveillance systems, and social media. Recent developments like MM-VID [215] demonstrate how LLMs, combined with vision and audio processing, can transform video into detailed textual descriptions. These advances enable applications like real-time accident detection, where camera feeds analyzed alongside sensor data improve emergency response and reduce congestion through automated detour implementation.
Urban scholars can contribute to the development of multimodal foundation models through several pathways, the first of which is by fine-tuning existing general-purpose LLMs with urban-specific datasets that combine text, imagery, and structured data. This approach requires fewer computational resources while yielding models adapted to urban contexts. For instance, researchers could fine-tune models on paired datasets of urban policy documents and corresponding spatial data visualizations to improve cross-modal understanding. Second, domain experts can develop specialized training datasets that capture the nuanced relationships between different urban data modalities, such as the connection between satellite imagery and socioeconomic indicators. Finally, interdisciplinary collaborations between urban planners, computer scientists, and social scientists can help develop evaluation frameworks specifically designed to assess how well models transfer knowledge across modalities in urban applications.
Enhanced Temporal and Spatial Awareness
Urban systems are inherently dynamic and spatially complex, necessitating models with improved temporal and spatial awareness for accurate urban analytics. This involves creating large models that can effectively capture and reason about long-term temporal patterns and complex spatial relationships across various urban domains.
Recent advancements in this area are exemplified by the development of spatio-temporal LLMs such as UrbanGPT [94]. These models aim to address the challenge of data scarcity in urban sensing scenarios by integrating spatio-temporal dependency encoders with the instruction-tuning paradigm. This approach enables LLMs to comprehend complex interdependencies across time and space, facilitating more comprehensive and accurate predictions, even under data-scarce conditions.
While research on spatio-temporal LLMs conducted to date has primarily focused on transportation issues, there is significant potential to expand these models to other urban aspects. Future studies could explore the application of such models to predict and analyze patterns in air pollution by tracking how pollutants spread over time and space, weather dynamics by predicting localized temperature and rainfall changes, traffic flow by forecasting how congestion patterns develop across the city, and other critical urban phenomena. This expansion could lead to more accurate predictions of urban growth patterns, improved traffic flow modeling, and a better understanding of how spatial and temporal factors influence various urban systems.

4.3.2. Smaller Models for Large-Scale Analysis

The ability to process and analyze large-scale datasets—both temporal and spatial—is crucial for advancing urban analytics. Currently, the computational requirements of LLMs impose limitations on their application to large-scale analyses. To address this, there is growing interest in developing smaller, more efficient models. These models reduce computational requirements and costs, potentially enhance privacy, and make AI-driven urban analytics more accessible to a wider range of cities and organizations with limited resources. Such advancements could enable comprehensive analyses of global urbanization trends, cross-city comparisons, and long-term urban evolution studies.
Opportunities
Urban scholars can work toward developing these smaller yet powerful models through several approaches. Knowledge distillation techniques offer a promising path, where researchers can transfer capabilities from large models to smaller ones by training compact models to mimic the outputs of larger models on urban-specific tasks. For example, a distilled model focused specifically on traffic pattern analysis could run locally on city transportation department servers, providing real-time congestion predictions and route optimization with minimal latency. Similarly, domain-specific fine-tuning allows scholars to adapt foundation models to urban contexts using relatively small datasets of city-specific information, creating specialized urban-focused models that outperform general LLMs on targeted tasks. A fine-tuned model could analyze neighborhood-level social media sentiment during disaster events, helping emergency managers prioritize response resources based on real-time community needs assessment.
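The soft-target objective at the heart of knowledge distillation can be sketched as follows; the logits and temperature are illustrative, and the loss shown is the temperature-scaled KL term of a standard (Hinton-style) distillation recipe, not a complete training loop:

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; a higher temperature yields softer probabilities."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q):
    """KL(p || q): how far the student distribution q is from the teacher p."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical logits for a 3-class congestion forecast (low/medium/high).
teacher_logits = [2.0, 0.5, -1.0]
student_logits = [1.5, 0.8, -0.5]
T = 2.0  # distillation temperature

teacher_probs = softmax(teacher_logits, T)
student_probs = softmax(student_logits, T)
# The student is trained to minimize this term (scaled by T^2 in standard recipes),
# typically combined with a cross-entropy loss on the hard labels.
loss = kl_divergence(teacher_probs, student_probs) * T ** 2
print(f"distillation loss: {loss:.4f}")
```

The softened teacher distribution conveys how the large model ranks the wrong classes as well as the right one, which is precisely the "dark knowledge" that lets a compact urban model inherit behavior it could not learn from hard labels alone.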
Modular architectures represent another avenue, where urban researchers can develop interchangeable components for different analytical tasks with proper workflow and agents. For instance, a city planning department could deploy a lightweight model for routine zoning compliance checks that processes thousands of building permit applications daily, flagging only complex cases for human review and reducing processing time from weeks to hours. In climate adaptation planning, smaller specialized models could continuously analyze IoT sensor networks monitoring urban heat islands, triggering automated cooling interventions like adaptive shading systems or water features in vulnerable neighborhoods during extreme heat events.
In summary, while these three areas represent our primary interests with respect to enhancing model capabilities, it is important to note that other aspects, such as interpretability and ethical considerations, remain crucial. As we advance in these priority areas, parallel efforts should continue to address these important aspects of model development. These advancements will enable more accurate, efficient, and responsible use of AI in urban planning and management, paving the way for smarter and more sustainable cities across a wide range of contexts and scales.

4.4. Executing Advanced Applications (Application)

The third pillar, executing advanced applications, addresses the translation of analytical advances into real interventions for urban systems. Urban analytics and socio-technical systems models commonly describe an operational layer where analytical outputs are enacted through planning, simulation, and decision-oriented actions, and this pillar aligns with that structure. While advancements in data integration and model capabilities are essential, their true value lies in their application to real-world urban problems. This pillar focuses on the practical implementation of LLMs in urban analytics, with an emphasis on enhancing existing workflows, enabling multi-agent collaborations, and developing innovative applications across various urban domains. By automating routine analytical tasks, facilitating human–AI collaboration, and creating realistic simulations of urban environments, this pillar ensures that LLMs deliver tangible benefits for urban planning, management, and policy-making. The integration of LLM-powered agents into urban workflows has the potential to revolutionize decision-making processes, enabling more efficient, adaptive, and holistic solutions to urban challenges. From real-time adaptation to cross-domain collaboration, this pillar explores the diverse ways in which LLMs can be applied to create smarter, more sustainable cities.

4.4.1. Enhancing Existing Workflows

The integration of LLMs into urban data analytics processes offers substantial room for innovation. While current LLM workflows often require significant manual work, future research should focus on enhanced automation and adaptivity. By automating routine analytical tasks, scholars and practitioners could allocate more resources to addressing core urban problems, potentially leading to breakthroughs in understanding complex urban phenomena.
The development of LLM workflows typically involves creating a sequence of interconnected LLM operations that transform raw urban data into actionable insights. These workflows can be constructed using visual programming interfaces or code-based frameworks that connect LLM capabilities with domain-specific tools. Research opportunities in this area include the development of standardized workflow templates for common urban analyses (e.g., land use forecasting, mobility pattern analysis, or community sentiment assessment) that practitioners can easily adapt to their local contexts.
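A chained workflow of this kind can be sketched as follows. Here `call_llm` is a hypothetical placeholder for any hosted or local model API, and the step instructions are illustrative examples of a community-sentiment template, not a prescribed design.

```python
from typing import Callable

def call_llm(prompt: str) -> str:
    """Hypothetical stand-in for an LLM API call; a real workflow would
    invoke a hosted or locally deployed model here."""
    return f"[model output for: {prompt[:40]}...]"

def make_step(instruction: str) -> Callable[[str], str]:
    """Wrap a fixed instruction into a reusable pipeline step."""
    def step(payload: str) -> str:
        return call_llm(f"{instruction}\n\nInput:\n{payload}")
    return step

# A reusable template: raw citizen comments -> cleaned text ->
# extracted concerns -> summary for planners.
pipeline = [
    make_step("Normalize and deduplicate these citizen comments."),
    make_step("Extract the main planning concerns as a bullet list."),
    make_step("Summarize the concerns for a planning report."),
]

def run_pipeline(raw_data: str) -> str:
    """Pass the data through each step, feeding each output to the next."""
    result = raw_data
    for step in pipeline:
        result = step(result)
    return result
```

Because each step is a self-contained function, a template like this can be adapted to local contexts simply by swapping instructions or inserting domain-specific tools between steps.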
However, automation must be balanced with improved human–AI collaboration. Ref. [216] highlights the challenges LLMs face in “decision-oriented dialogues”, emphasizing the importance of human involvement in urban planning processes, especially for ethical considerations. Future research should develop effective collaboration mechanisms between human experts and LLM agents, creating intuitive interfaces that leverage both human expertise and AI capabilities. For example, in participatory urban planning processes, LLMs could analyze thousands of citizen comments to identify key concerns and synthesize them into actionable design recommendations, while human planners maintain oversight of the final decision-making process to ensure equity considerations are properly addressed.
By enhancing workflows with LLM agents, urban data analytics can become more efficient and powerful, potentially improving our understanding of urban systems and supporting data-driven planning strategies. However, as [216] demonstrates, significant improvements are still needed in LLMs’ performance for complex decision-making scenarios, particularly when compared to human assistants. This balance between automation and human–AI collaboration represents a critical area for future research and development in urban data analytics.

4.4.2. Multi-Agent Collaborations

The future of urban data analytics lies in harnessing the power of LLMs through sophisticated multi-agent frameworks. These frameworks promise to revolutionize collaboration and decision-making in complex urban scenarios, offering unprecedented insights and solutions. Recent advancements, such as the “generative agents” introduced in [134], demonstrate the potential for the creation of highly realistic simulations of urban environments and human behavior. LLM-powered agents can simulate daily activities, form opinions, and even reflect on past experiences to plan future actions, enabling more dynamic and nuanced urban modeling.
LLM agents function as autonomous software entities that perform specific tasks or represent different stakeholders in urban systems. Their development involves designing specialized prompts that define roles, knowledge bases, decision-making parameters, and interaction capabilities. When orchestrated to work together, these agents can simulate complex urban dynamics or collaboratively solve multi-faceted problems. Effective agent systems require clear role definitions (such as transportation planners, environmental analysts, and community representatives), well-designed communication protocols, and appropriate evaluation metrics to assess performance in urban contexts.
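The role definitions and shared-transcript interaction described above can be sketched minimally as follows; `call_llm` is again a hypothetical stub, and the role prompts are illustrative.

```python
from dataclasses import dataclass, field

def call_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real LLM call."""
    return f"<response to: {prompt.splitlines()[0]}>"

@dataclass
class UrbanAgent:
    """An LLM-backed agent defined by a role prompt and a running memory."""
    role: str
    system_prompt: str
    memory: list = field(default_factory=list)

    def respond(self, topic: str, transcript: list) -> str:
        context = "\n".join(transcript[-5:])  # last few turns as shared context
        prompt = f"{self.system_prompt}\nDiscussion so far:\n{context}\nTopic: {topic}"
        reply = call_llm(prompt)
        self.memory.append(reply)  # agent retains its own contributions
        return f"{self.role}: {reply}"

def deliberate(agents, topic, rounds=2):
    """Round-robin deliberation: each agent reads the shared transcript
    and contributes in turn, simulating stakeholder negotiation."""
    transcript = []
    for _ in range(rounds):
        for agent in agents:
            transcript.append(agent.respond(topic, transcript))
    return transcript
```

Evaluation metrics would then be applied to the resulting transcript, for example, checking whether each stakeholder's stated constraints are addressed in the final plan.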
The potential of multi-agent-based urban simulations extends beyond mere representation to powerful scenario generation tools for real-world planning. In transportation, these systems could simulate the impacts of congestion pricing policies by modeling how different citizen groups might adapt their behaviors, helping identify equity issues before implementation. In disaster management, agents representing emergency responders, infrastructure managers, and vulnerable populations could optimize evacuation plans by revealing bottlenecks under different conditions. Additional research opportunities span housing development—where agents could model interactions between developers, regulators, and community members to forecast neighborhood changes under different zoning policies—and public health, where simulations could track disease transmission while accounting for behavioral adaptations. This collaborative approach enables holistic problem-solving that addresses urban challenges from multiple perspectives simultaneously.

4.4.3. Others

The applications of LLM agents and workflows in urban data analytics extend across various domains, including transportation, the environment, urban development, and planning. Researchers can leverage their domain-specific expertise to design innovative future studies, adapting these technologies to address unique urban challenges.
While human–AI collaboration and multi-agent systems are key areas of development, the application of LLM agents in urban analytics remains in its early stages, offering numerous opportunities for exploration. Table 15 summarizes additional future directions, including real-time adaptation, integration with IoT and sensor networks, cross-domain agent collaboration, cross-lingual and cultural adaptation, interpretable and transparent AI, and ethical AI and bias mitigation. As urban data analytics evolves alongside LLM technology, we anticipate the emergence of novel applications that will further transform our understanding and management of urban environments.
Beyond the three pillars discussed above, the framework points toward a research agenda that can be operationalized through clearer evaluation settings and reproducible experimental schemes. Future studies could develop geo-grounded benchmark suites that assess spatial reasoning, temporal alignment, and multimodal integration for representative urban tasks. Complementary to these resources, standardized procedures for hallucination auditing and fairness assessment would support more consistent evaluation of LLM behavior within urban governance workflows. For agentic systems, research could define task success criteria and reliability thresholds under incomplete information and human-in-the-loop verification protocols that reflect real operational constraints. These elements transform the 3E framework from a conceptual structure into a set of implementable directions that can guide empirical investigations across diverse urban contexts.

4.5. Potential Applications of the 3E Framework

While the previous sections examined each pillar of the 3E framework separately, their integration offers promising approaches to urban challenges. The following conceptual scenarios illustrate how combining expanding information dimensions, enhancing model capabilities, and executing advanced applications could address pressing urban issues across different urban contexts. Each scenario highlights how the interaction between the three pillars could enable improved understanding and management of complex urban systems.

4.5.1. Urban Resilience and Disaster Management

Climate change has intensified the frequency and severity of extreme weather events, challenging cities’ disaster preparedness systems. The 3E framework offers a comprehensive approach to enhancing urban resilience. Through expanding information dimensions, municipalities can integrate previously siloed data sources—meteorological forecasts, IoT sensor networks, critical infrastructure status, historical disaster response data, and citizen reports via social media—creating a comprehensive situational awareness platform. By enhancing model capabilities, specialized LLMs with advanced spatio-temporal reasoning can process this multimodal data to generate high-precision predictions of disaster impacts at neighborhood scales, accounting for complex interactions between natural and built environments. The execution of advanced applications enables multi-agent systems to coordinate emergency responses by automatically prioritizing resource allocation, generating context-specific evacuation strategies, and simulating intervention scenarios to optimize preparedness investments. This integrated approach could significantly improve response times and resource utilization during extreme events.

4.5.2. Urban Mobility and Transportation Systems

Transportation networks represent complex systems where efficiency, equity, and sustainability objectives frequently conflict. The 3E framework provides an integrated methodology for optimizing urban mobility. By expanding information dimensions, transportation authorities can integrate heterogeneous data streams from traffic sensors, public transit operations, ride-sharing platforms, pedestrian counts, and economic activity indicators. Enhanced model capabilities enable LLMs to identify recurrent patterns and anomalies across multiple time frames and spatial scales, from immediate congestion to seasonal variations, while accounting for the interdependence between transportation modes. This analytical foundation supports advanced applications such as predictive traffic management systems that proactively adjust signal timing, dynamic public transit routing that responds to emerging demand patterns, and equity-focused accessibility analysis that ensures transportation benefits are distributed fairly across demographic groups. Such applications could potentially reduce average commute times while simultaneously improving accessibility for underserved communities.

4.5.3. Sustainable Urban Development

Balancing economic growth, environmental sustainability, and social equity presents persistent challenges for urban development. The 3E framework offers a methodological approach to addressing these multidimensional concerns. By expanding information dimensions, urban planners can synthesize land-use data, environmental indicators, economic metrics, demographic patterns, and qualitative community feedback to create comprehensive digital representations of neighborhoods. Enhanced model capabilities allow for sophisticated analyses that capture complex interdependencies between development decisions and their multi-generational impacts on communities and ecosystems. These capabilities enable advanced applications such as multi-stakeholder planning platforms, where LLM-powered agents represent different interests (environmental, economic, and community) to facilitate more balanced negotiations, and scenario simulation tools that project long-term outcomes of alternative development strategies across multiple sustainability metrics. This approach could support more deliberative planning processes that achieve improvements in both sustainability indicators and community satisfaction.

4.5.4. Summary

These conceptual applications illustrate how the 3E framework provides not only a theoretical structure for research advancement but also a practical pathway for addressing complex urban challenges. By systematically integrating innovations across data types, model capabilities, and applications, cities could more effectively harness the transformative potential of LLMs for urban analytics and decision-making.
It is important to note that urban data analytics encompasses a broad spectrum of subdomains beyond those discussed above. The 3E framework is designed to be adaptable and extendable to numerous other urban contexts, including tourism management, urban environmental analysis, building energy management, public health analytics, smart grid optimization, and waste management. Each of these domains presents unique data challenges, modeling requirements, and application opportunities that could benefit from the structured approach offered by this framework. As LLM technologies continue to evolve, we anticipate that researchers and practitioners across these diverse urban subdomains will adapt and extend the 3E framework to address their specific analytical needs and challenges.

4.6. Discussion

While pushing the boundaries of LLMs in urban data analytics, several critical challenges need to be addressed to fully realize their potential. This section focuses on four key challenges that are particularly pertinent to urban analytics: hallucination and trustworthiness, scalability and computational requirements, fairness in urban decision-making, and ethical and privacy concerns. Although these issues have been explored in computer science, their implications for public-sector decision-making and the governance structures surrounding urban analytics remain insufficiently examined.

4.6.1. Hallucination and Trustworthiness

LLMs have demonstrated a tendency to generate plausible but factually incorrect information, a phenomenon known as “hallucination” [169]. This poses a significant challenge in urban analytics, where accuracy is crucial for informed decision-making. Recent quantitative evaluations provide compelling evidence of this problem: HALLUSIONBENCH, a comprehensive benchmark for multimodal reasoning evaluation, revealed that even state-of-the-art models like GPT-4V achieved only 31.42% accuracy on visually grounded question-answering tasks, while most other models performed below 16% [217]. SelfCheckGPT, applied to standardized traffic datasets (the Waymo Open Dataset and the PREPER CITY dataset), revealed significant model-dependent hallucination rates in visual traffic agent identification (e.g., vehicles and pedestrians), ranging from 8.47% in GPT-4o to 29.86% in LLaVA, with performance fluctuating significantly across model architectures and geographic contexts [218]. The hallucination problem is further exacerbated in urban environments due to their dynamic nature, where rapid infrastructure development, changing mobility patterns, and evolving regulations quickly render data obsolete, creating a particularly challenging domain for LLM deployment.
To address this challenge, future research should focus on developing robust fact-checking mechanisms and integrating up-to-date, authoritative urban data sources with LLMs. Techniques such as retrieval-augmented generation and information verification agents show promise in enhancing the reliability of LLM outputs. Additionally, interpretability remains essential in public-facing analytical workflows, as decision-makers require transparent reasoning chains when models contribute to policy or operational recommendations. The development of standardized benchmarking datasets across various urban subdomains is crucial. While some efforts have been made in transportation [4], they remain in early stages, with other subdomains lagging even further behind.
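As a minimal illustration of the retrieval-augmented pattern, the sketch below ranks documents from an authoritative corpus by simple term overlap and assembles a grounded prompt that instructs the model to answer only from retrieved context. A production system would use dense embeddings or BM25 and a real model call; all names and scoring choices here are illustrative.

```python
import math
from collections import Counter

def tokenize(text: str):
    return [w.lower().strip(".,") for w in text.split()]

def score(query: str, doc: str) -> float:
    """Term-overlap relevance score, length-normalized; a didactic
    stand-in for BM25 or embedding similarity."""
    q, d = Counter(tokenize(query)), Counter(tokenize(doc))
    overlap = sum((q & d).values())
    return overlap / math.sqrt(len(tokenize(doc)) + 1)

def retrieve(query: str, corpus: list, k: int = 2):
    """Return the k most relevant documents for the query."""
    ranked = sorted(corpus, key=lambda doc: score(query, doc), reverse=True)
    return ranked[:k]

def grounded_prompt(query: str, corpus: list) -> str:
    """Assemble a prompt constraining the model to the retrieved,
    authoritative context, reducing the room for hallucination."""
    context = "\n".join(f"- {d}" for d in retrieve(query, corpus))
    return ("Answer using ONLY the context below; say 'unknown' if "
            f"the context is insufficient.\nContext:\n{context}\n"
            f"Question: {query}")
```

Keeping the corpus under the data owner's control also allows outdated records, a key failure mode in dynamic urban settings, to be refreshed without retraining the model.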

4.6.2. Scalability and Computational Requirements

The training and deployment of LLMs demand substantial computational resources, presenting a significant barrier to their widespread adoption in urban analytics. This challenge extends beyond the initial training phase to the operational deployment of these models, where real-time analysis of diverse data streams is often necessary. Recent research demonstrates that optimized architectures and inference strategies frequently outperform raw computational scaling for urban analytics applications. For example, empirical experiments with Llemma-7B using search-based inference methods reduced error rates from 58% to 52% compared to the larger 34B model under equivalent compute constraints, specifically when employing weighted majority voting approaches [219]. Similarly, in geospatial applications, GeoLLM-Engine-100k maintained nearly identical performance metrics to GeoLLM-Engine-10k (76.81% vs. 77.35% success rate), despite processing ten times more queries and handling over half a million tool calls [172]. The deployment strategy further impacts system performance and cost-effectiveness: cloud-based deployments offer powerful computational resources and scalability but introduce latency challenges and ongoing subscription costs that may compromise real-time urban applications, while edge computing enables local data processing with significantly reduced latency but requires upfront hardware investments and faces inherent computational constraints [220]. Urban computing systems are often fragmented across agencies, creating additional challenges in coordinating compute availability, model updates, and data synchronization.
To address these computational constraints, future research directions should focus on both model architecture and deployment strategies. In addition to the direction of smaller models mentioned in Section 4.3, techniques such as quantization, pruning, and knowledge distillation can further reduce computational requirements while preserving essential capabilities for urban analytics tasks [23]. Domain-specific LLMs fine-tuned on urban data can potentially achieve better performance-to-computation ratios by embedding specialized knowledge directly into model parameters. Distributed deployment strategies that balance edge and cloud computing can also support governance requirements such as latency control, data sovereignty, and operational reliability within city management systems.
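The storage side of the quantization trade-off mentioned above can be illustrated with a minimal symmetric int8 scheme. This is a didactic sketch rather than a substitute for library implementations, which additionally handle per-channel scales, activation quantization, and calibration.

```python
import numpy as np

def quantize_int8(weights: np.ndarray):
    """Symmetric post-training quantization of a weight tensor to int8.
    Returns the quantized values plus the scale needed to dequantize."""
    scale = np.abs(weights).max() / 127.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

# Illustrative check of the accuracy/size trade-off on random weights.
rng = np.random.default_rng(42)
w = rng.normal(0, 0.02, size=(256, 256)).astype(np.float32)
q, s = quantize_int8(w)
err = np.abs(w - dequantize(q, s)).mean()   # mean absolute rounding error
ratio = w.nbytes / q.nbytes                 # float32 -> int8: 4x smaller
```

The 4x memory reduction comes at a bounded per-weight error of at most half a quantization step, which is why such schemes often preserve task performance while easing deployment on resource-constrained municipal hardware.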

4.6.3. Fairness in Urban Decision-Making

The deployment of LLMs in high-stakes urban contexts raises significant fairness concerns, particularly in areas like law enforcement and urban policy, where algorithmic decisions can have profound societal impacts. This challenge stems from LLMs’ tendency to absorb and amplify existing societal biases embedded in their training data, potentially leading to discriminatory outcomes that disproportionately affect marginalized communities when applied to urban decision-making processes. Recent studies examining predictive policing algorithms have revealed how systems trained on historical arrest data perpetuate racial biases by targeting minority communities at higher rates, essentially encoding and reinforcing patterns of systemic discrimination already present in law enforcement practices [221]. As urban centers increasingly integrate LLMs for critical decision support in resource allocation, zoning determinations, and law enforcement, these fairness issues pose substantial risks of further entrenching social inequalities and undermining public trust in algorithmic governance systems.
To address these concerns, future research should adopt a multi-faceted approach that combines technical and governance solutions. On the technical side, researchers could explore fairness-aware modeling techniques that explicitly constrain algorithms against demographic disparities, develop robust transparency mechanisms that make model decisions interpretable to stakeholders, and ensure training data adequately represents diverse urban communities. Beyond technical fixes, implementing participatory governance frameworks is essential, giving affected communities meaningful input throughout the system development life cycle. Systematic fairness evaluation through standardized domain-specific benchmarks, regular equity impact assessments, and inclusive stakeholder engagement—prioritizing historically marginalized communities alongside domain experts and policymakers—will promote more equitable and context-sensitive urban decision-making systems.
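Simple disparity screens of the kind invoked in fairness audits can be computed directly from decision logs. The sketch below derives per-group selection rates and the widely used four-fifths screening ratio from (group, decision) pairs; the data layout is illustrative, and a real audit would also test statistical significance and conditional metrics.

```python
from collections import defaultdict

def selection_rates(decisions):
    """Per-group positive-decision rates from (group, approved) pairs,
    e.g. permit approvals or service allocations by neighborhood."""
    totals, positives = defaultdict(int), defaultdict(int)
    for group, approved in decisions:
        totals[group] += 1
        positives[group] += int(approved)
    return {g: positives[g] / totals[g] for g in totals}

def disparate_impact_ratio(decisions):
    """Ratio of the lowest to the highest group selection rate; values
    below ~0.8 are commonly flagged for review (the 'four-fifths rule')."""
    rates = selection_rates(decisions)
    return min(rates.values()) / max(rates.values())
```

Running such a check routinely over LLM-assisted allocation decisions gives auditors a concrete, reproducible signal to complement the participatory governance mechanisms discussed above.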

4.6.4. Ethical and Privacy Issues

The application of LLMs in urban analytics raises significant ethical and privacy concerns. These models, trained on vast datasets, may inadvertently perpetuate or amplify existing biases, potentially leading to unfair or discriminatory outcomes in urban planning and management. The challenge extends beyond personal information from individual citizens to include sensitive data from governments, institutions, and companies.
On a positive note, the open-source LLM movement offers a potential solution. By using open-sourced LLMs, users and organizations can deploy their own internal models, mitigating some privacy concerns [222]. Future research could also prioritize the development of privacy-preserving techniques for training and deploying LLMs in urban contexts. This could include advanced data anonymization methods, federated learning approaches, and differential privacy techniques. Establishing ethical guidelines for the use of LLMs in urban decision-making processes is crucial, with a focus on fairness, transparency, and accountability. Research into making LLMs more interpretable and explainable will also be vital in addressing these ethical concerns and building trust among stakeholders.
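As a minimal illustration of one such privacy-preserving technique, the sketch below applies the Laplace mechanism to a counting query, which has sensitivity 1, so epsilon-differential privacy is obtained by adding noise with scale 1/epsilon. The scenario and parameter choices are illustrative.

```python
import numpy as np

def laplace_count(true_count: int, epsilon: float, rng=None) -> float:
    """Release a count with epsilon-differential privacy via the Laplace
    mechanism: for a counting query (sensitivity 1), add noise drawn
    from Laplace(0, 1/epsilon)."""
    rng = rng or np.random.default_rng()
    return true_count + rng.laplace(loc=0.0, scale=1.0 / epsilon)

# Example: publish the number of households reporting a noise complaint
# in a district without exposing whether any individual record is present.
rng = np.random.default_rng(0)
noisy = laplace_count(412, epsilon=1.0, rng=rng)
```

Smaller epsilon gives stronger privacy at the cost of noisier statistics, a trade-off that city agencies would need to set explicitly when releasing LLM-derived aggregates.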
Beyond model-level properties, future research also needs to examine how LLM-enabled analytics are embedded in urban governance workflows. Outputs from LLM systems typically enter decision processes through reports, dashboards, and operational tools, where they are interpreted and validated by planners, analysts, and public officials. Human-in-the-loop designs that require expert review of model-derived recommendations before implementation, together with documentation of data sources, prompts, and post-processing steps, can support transparency and ex post auditability along the analytics pipeline. These socio-technical arrangements are central to managing hallucination risks, ensuring that fairness assessments are acted upon, and aligning LLM-based tools with existing accountability structures in public-sector decision-making.

5. Conclusions

This review has systematically examined the integration of large language models into urban data analytics, analyzing their impact across the four typical steps of the analytical process: data collection, preprocessing, modeling, and post-analysis. Our investigation spanned various urban domains, including transportation, urban planning, disaster management, and environmental monitoring, revealing the transformative potential of LLMs in each stage.
In data collection, LLMs enhance the extraction and synthesis of information from diverse sources. For preprocessing, they excel in tasks such as data cleaning and feature extraction. In modeling, techniques like prompt engineering and the use of LLM agents and fine-tuned foundation models enable more sophisticated urban analysis. Post-analysis benefits from LLMs through improved data visualization and report generation, making insights more accessible to stakeholders.
Looking ahead, we propose a 3E framework for future directions in LLM-powered urban analytics:
  • Expanding Information Dimensions: This pillar focuses on multimodal data integration and retrieval-augmented generation to enable more comprehensive urban analyses.
  • Enhancing Model Capabilities: This pillar involves developing multimodal foundation models, improving temporal and spatial awareness, and creating efficient smaller models for large-scale analysis.
  • Executing Advanced Applications: This pillar concerns the enhancement of existing workflows through automation and human–AI collaboration and the exploration of multi-agent systems for complex urban simulations.
Beyond these technical trajectories, broader governance considerations will influence how LLMs shape urban analytics. The reliability of model outputs, the interpretability required for public-sector decisions, and the equity implications of algorithmic recommendations all carry significant weight in institutional contexts. Urban data ecosystems also involve fragmented responsibilities, heterogeneous data standards, and varying capacities across agencies, which affect the feasibility and accountability of LLM-based systems. As research progresses, integrating socio-technical perspectives with LLM development will be essential. Addressing hallucination, fairness, and privacy concerns requires not only technical advances but also procedures for auditing, transparency, and stakeholder involvement. These elements are becoming central to ensuring that LLMs contribute constructively to urban decision-making and support more equitable and sustainable urban environments. At the same time, the current evidence base remains uneven across domains, and many demonstrations have not yet been validated in large-scale or real-world settings. These limitations highlight the need for systematic benchmarks and reproducible evaluation protocols that can guide future empirical studies. The insights synthesized in this review are intended to support researchers and urban practitioners who plan to incorporate LLMs into data-driven urban decision processes.

Author Contributions

Conceptualization, F.J., J.M. and Y.J.; methodology, F.J. and Y.J.; software, F.J.; validation, F.J.; formal analysis, F.J. and Y.J.; investigation, F.J.; resources, F.J. and J.M.; data curation, F.J.; writing—original draft preparation, F.J.; writing—review and editing, F.J. and Y.J.; visualization, F.J. and Y.J.; supervision, J.M.; project administration, F.J. and J.M.; funding acquisition, J.M. All authors have read and agreed to the published version of the manuscript.

Funding

This study was supported by the Seed Fund for Collaborative Research (No. 2207101592) from The University of Hong Kong.

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the authors on request.

Acknowledgments

During the preparation of this work, the authors used ChatGPT 4o to improve readability and language. After using this tool, the authors reviewed and edited the content as needed and take full responsibility for the content of the publication.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Zou, X.; Yan, Y.; Hao, X.; Hu, Y.; Wen, H.; Liu, E.; Zhang, J.; Li, Y.; Li, T.; Zheng, Y.; et al. Deep Learning for Cross-Domain Data Fusion in Urban Computing: Taxonomy, Advances, and Outlook. Inf. Fusion 2025, 113, 102606. [Google Scholar] [CrossRef]
  2. Bettencourt, L.M. The origins of scaling in cities. Science 2013, 340, 1438–1441. [Google Scholar] [CrossRef]
  3. Wang, S.; Hu, T.; Xiao, H.; Li, Y.; Zhang, C.; Ning, H.; Zhu, R.; Li, Z.; Ye, X. GPT, large language models (LLMs) and generative artificial intelligence (GAI) models in geospatial science: A systematic review. Int. J. Digit. Earth 2024, 17, 2353122. [Google Scholar] [CrossRef]
  4. Yan, H.; Li, Y. A Survey of Generative AI for Intelligent Transportation Systems. arXiv 2023. [Google Scholar] [CrossRef]
  5. Zhang, W.; Han, J.; Xu, Z.; Ni, H.; Lyu, T.; Liu, H.; Xiong, H. Towards Urban General Intelligence: A Review and Outlook of Urban Foundation Models. arXiv 2025. [Google Scholar] [CrossRef]
  6. Cui, C.; Ma, Y.; Cao, X.; Ye, W.; Zhou, Y.; Liang, K.; Chen, J.; Lu, J.; Yang, Z.; Liao, K.D.; et al. A Survey on Multimodal Large Language Models for Autonomous Driving. In Proceedings of the 2024 IEEE/CVF Winter Conference on Applications of Computer Vision Workshops (WACVW), Waikoloa, HI, USA, 1–6 January 2024; pp. 958–979. [Google Scholar] [CrossRef]
  7. Sufi, F. A systematic review on the dimensions of open-source disaster intelligence using GPT. J. Econ. Technol. 2024, 2, 62–78. [Google Scholar] [CrossRef]
  8. Saka, A.; Taiwo, R.; Saka, N.; Salami, B.A.; Ajayi, S.; Akande, K.; Kazemi, H. GPT models in construction industry: Opportunities, limitations, and a use case validation. Dev. Built Environ. 2024, 17, 100300. [Google Scholar] [CrossRef]
  9. Zhang, W.; Han, J.; Xu, Z.; Ni, H.; Liu, H.; Xiong, H. Urban Foundation Models: A Survey. In Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Barcelona, Spain, 25–29 August 2024; pp. 6633–6643. [Google Scholar] [CrossRef]
  10. Xu, F.; Zhang, J.; Gao, C.; Feng, J.; Li, Y. Urban Generative Intelligence (UGI): A Foundational Platform for Agents in Embodied City Environment. arXiv 2023. [Google Scholar] [CrossRef]
  11. Zhao, W.X.; Zhou, K.; Li, J.; Tang, T.; Wang, X.; Hou, Y.; Min, Y.; Zhang, B.; Zhang, J.; Dong, Z.; et al. A Survey of Large Language Models. arXiv 2023, arXiv:2303.18223. [Google Scholar]
  12. Minaee, S.; Mikolov, T.; Nikzad, N.; Chenaghlu, M.; Socher, R.; Amatriain, X.; Gao, J. Large Language Models: A Survey. arXiv 2024. [Google Scholar] [CrossRef]
  13. Chen, S.F.; Goodman, J. An empirical study of smoothing techniques for language modeling. Comput. Speech Lang. 1999, 13, 359–394. [Google Scholar] [CrossRef]
  14. Bengio, Y.; Ducharme, R.; Vincent, P. A Neural Probabilistic Language Model. In Proceedings of the Advances in Neural Information Processing Systems; Leen, T., Dietterich, T., Tresp, V., Eds.; MIT Press: Cambridge, MA, USA, 2000; Volume 13. [Google Scholar]
  15. Mikolov, T.; Chen, K.; Corrado, G.; Dean, J. Efficient Estimation of Word Representations in Vector Space. arXiv 2013, arXiv:1301.3781. [Google Scholar]
  16. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, L.; Polosukhin, I. Attention Is All You Need. arXiv 2023, arXiv:1706.03762. [Google Scholar]
  17. Devlin, J.; Chang, M.W.; Lee, K.; Toutanova, K. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. arXiv 2019, arXiv:1810.04805. [Google Scholar] [CrossRef]
  18. Radford, A.; Wu, J.; Child, R.; Luan, D.; Amodei, D.; Sutskever, I. Language models are unsupervised multitask learners. OpenAI Blog 2019, 1, 9. [Google Scholar]
  19. Brown, T.; Mann, B.; Ryder, N.; Subbiah, M.; Kaplan, J.D.; Dhariwal, P.; Neelakantan, A.; Shyam, P.; Sastry, G.; Askell, A.; et al. Language Models are Few-Shot Learners. In Proceedings of the Advances in Neural Information Processing Systems; Larochelle, H., Ranzato, M., Hadsell, R., Balcan, M., Lin, H., Eds.; Curran Associates, Inc.: Vancouver, BC, Canada, 2020; pp. 1877–1901. [Google Scholar]
  20. Fedus, W.; Zoph, B.; Shazeer, N. Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity. arXiv 2022, arXiv:2101.03961. [Google Scholar] [CrossRef]
  21. Ouyang, L.; Wu, J.; Jiang, X.; Almeida, D.; Wainwright, C.L.; Mishkin, P.; Zhang, C.; Agarwal, S.; Slama, K.; Ray, A.; et al. Training language models to follow instructions with human feedback. arXiv 2022, arXiv:2203.02155. [Google Scholar] [CrossRef]
22. Gemma Team. Gemma: Open Models Based on Gemini Research and Technology. arXiv 2024, arXiv:2403.08295. [Google Scholar] [CrossRef]
  23. Abdin, M.; Aneja, J.; Awadalla, H.; Awadallah, A.; Awan, A.A.; Bach, N.; Bahree, A.; Bakhtiari, A.; Bao, J.; Behl, H.; et al. Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone. arXiv 2024, arXiv:2404.14219. [Google Scholar] [CrossRef]
  24. Grattafiori, A.; Dubey, A.; Jauhri, A.; Pandey, A.; Kadian, A.; Al-Dahle, A.; Letman, A.; Mathur, A.; Schelten, A.; Vaughan, A.; et al. The Llama 3 Herd of Models. arXiv 2024, arXiv:2407.21783. [Google Scholar] [CrossRef]
  25. Jiang, A.Q.; Sablayrolles, A.; Roux, A.; Mensch, A.; Savary, B.; Bamford, C.; Chaplot, D.S.; de las Casas, D.; Hanna, E.B.; Bressand, F.; et al. Mixtral of Experts. arXiv 2024, arXiv:2401.04088. [Google Scholar] [CrossRef]
  26. OpenAI. GPT-4 Technical Report. arXiv 2024, arXiv:2303.08774. [Google Scholar]
27. Gemini Team, Google. Gemini: A Family of Highly Capable Multimodal Models. arXiv 2024, arXiv:2312.11805. [Google Scholar]
  28. Chang, Y.; Wang, X.; Wang, J.; Wu, Y.; Yang, L.; Zhu, K.; Chen, H.; Yi, X.; Wang, C.; Wang, Y.; et al. A Survey on Evaluation of Large Language Models. ACM Trans. Intell. Syst. Technol. 2024, 15, 1–45. [Google Scholar] [CrossRef]
  29. Chumakov, S.; Kovantsev, A.; Surikov, A. Generative approach to aspect based sentiment analysis with GPT language models. Procedia Comput. Sci. 2023, 229, 284–293. [Google Scholar] [CrossRef]
  30. Rosca, C.M.; Stancu, A. Quality Assessment of GPT-3.5 and Gemini 1.0 Pro for SQL Syntax. Comput. Stand. Interfaces 2025, 95, 104041. [Google Scholar] [CrossRef]
  31. Crooks, A.; Chen, Q. Exploring the new frontier of information extraction through large language models in urban analytics. Environ. Plan. B Urban Anal. City Sci. 2024, 51, 565–569. [Google Scholar] [CrossRef]
  32. Cui, C.; Ma, Y.; Cao, X.; Ye, W.; Wang, Z. Drive as You Speak: Enabling Human-Like Interaction with Large Language Models in Autonomous Vehicles. arXiv 2023. [Google Scholar] [CrossRef]
  33. Zhang, S.; Fu, D.; Liang, W.; Zhang, Z.; Yu, B.; Cai, P.; Yao, B. TrafficGPT: Viewing, Processing and Interacting with Traffic Foundation Models. Transp. Policy 2024, 150, 95–105. [Google Scholar] [CrossRef]
  34. Leong, M.; Abdelhalim, A.; Ha, J.; Patterson, D.; Pincus, G.L.; Harris, A.B.; Eichler, M.; Zhao, J. MetRoBERTa: Leveraging Traditional Customer Relationship Management Data to Develop a Transit-Topic-Aware Language Model. arXiv 2023. [Google Scholar] [CrossRef]
  35. Villarreal, M.; Poudel, B.; Li, W. Can ChatGPT Enable ITS? The Case of Mixed Traffic Control via Reinforcement Learning. arXiv 2023. [Google Scholar] [CrossRef]
  36. Zhang, Z.; Amiri, H.; Liu, Z.; Züfle, A.; Zhao, L. Large Language Models for Spatial Trajectory Patterns Mining. arXiv 2023. [Google Scholar] [CrossRef]
  37. Zhang, K.; Wang, S.; Jia, N.; Zhao, L.; Han, C.; Li, L. Integrating visual large language model and reasoning chain for driver behavior analysis and risk assessment. Accid. Anal. Prev. 2024, 198, 107497. [Google Scholar] [CrossRef]
  38. Sha, H.; Mu, Y.; Jiang, Y.; Chen, L.; Xu, C.; Luo, P.; Li, S.E.; Tomizuka, M.; Zhan, W.; Ding, M. LanguageMPC: Large Language Models as Decision Makers for Autonomous Driving. arXiv 2023. [Google Scholar] [CrossRef]
  39. Ge, J.; Chang, C.; Zhang, J.; Li, L.; Na, X.; Lin, Y.; Li, L.; Wang, F.Y. LLM-Based Operating Systems for Automated Vehicles: A New Perspective. IEEE Trans. Intell. Veh. 2024, 9, 4563–4567. [Google Scholar] [CrossRef]
  40. Wong, I.A.; Lian, Q.L.; Sun, D. Autonomous travel decision-making: An early glimpse into ChatGPT and generative AI. J. Hosp. Tour. Manag. 2023, 56, 253–263. [Google Scholar] [CrossRef]
  41. Li, X.; Liu, E.; Shen, T.; Huang, J.; Wang, F.Y. ChatGPT-Based Scenario Engineer: A New Framework on Scenario Generation for Trajectory Prediction. IEEE Trans. Intell. Veh. 2024, 9, 4422–4431. [Google Scholar] [CrossRef]
  42. Wang, X.; Fang, M.; Zeng, Z.; Cheng, T. Where Would I Go Next? Large Language Models as Human Mobility Predictors. arXiv 2024. [Google Scholar] [CrossRef]
  43. Liu, Y.; Kuai, C.; Ma, H.; Liao, X.; He, B.Y.; Ma, J. Semantic Trajectory Data Mining with LLM-Informed POI Classification. arXiv 2024. [Google Scholar] [CrossRef]
  44. Wang, S.; Zhu, Y.; Li, Z.; Wang, Y.; Li, L.; He, Z. ChatGPT as Your Vehicle Co-Pilot: An Initial Attempt. IEEE Trans. Intell. Veh. 2023, 8, 4706–4721. [Google Scholar] [CrossRef]
  45. Fu, D.; Li, X.; Wen, L.; Dou, M.; Cai, P.; Shi, B.; Qiao, Y. Drive Like a Human: Rethinking Autonomous Driving with Large Language Models. arXiv 2023. [Google Scholar] [CrossRef]
  46. Zheng, O.; Abdel-Aty, M.; Wang, D.; Wang, Z.; Ding, S. ChatGPT Is on the Horizon: Could a Large Language Model Be Suitable for Intelligent Traffic Safety Research and Applications? arXiv 2023. [Google Scholar] [CrossRef]
  47. Ha, S.V.U.; Le, H.D.A.; Nguyen, Q.Q.V.; Chung, N.M. DAKRS: Domain Adaptive Knowledge-Based Retrieval System for Natural Language-Based Vehicle Retrieval. IEEE Access 2023, 11, 90951–90965. [Google Scholar] [CrossRef]
  48. Peng, M.; Guo, X.; Chen, X.; Zhu, M.; Chen, K.; Wang, F.Y. LC-LLM: Explainable Lane-Change Intention and Trajectory Predictions with Large Language Models. arXiv 2024. [Google Scholar] [CrossRef]
  49. Syum Gebre, T.; Beni, L.; Tsehaye Wasehun, E.; Elikem Dorbu, F. AI-Integrated Traffic Information System: A Synergistic Approach of Physics Informed Neural Network and GPT-4 for Traffic Estimation and Real-Time Assistance. IEEE Access 2024, 12, 65869–65882. [Google Scholar] [CrossRef]
  50. Liu, Y.; Wu, F.; Liu, Z.; Wang, K.; Wang, F.; Qu, X. Can language models be used for real-world urban-delivery route optimization? Innov. 2023, 4, 100520. [Google Scholar] [CrossRef]
  51. Zhang, Q.; Mott, J.H. An Exploratory Assessment of LLM’s Potential Toward Flight Trajectory Reconstruction Analysis. arXiv 2024. [Google Scholar] [CrossRef]
  52. Mo, B.; Xu, H.; Zhuang, D.; Ma, R.; Guo, X.; Zhao, J. Large Language Models for Travel Behavior Prediction. arXiv 2023. [Google Scholar] [CrossRef]
  53. Chen, J.; Lin, B.; Xu, R.; Chai, Z.; Liang, X.; Wong, K.Y.K. MapGPT: Map-Guided Prompting with Adaptive Path Planning for Vision-and-Language Navigation. arXiv 2024. [Google Scholar] [CrossRef]
  54. Cui, Y.; Huang, S.; Zhong, J.; Liu, Z.; Wang, Y.; Sun, C.; Li, B.; Wang, X.; Khajepour, A. DriveLLM: Charting the Path Toward Full Autonomous Driving with Large Language Models. IEEE Trans. Intell. Veh. 2024, 9, 1450–1464. [Google Scholar] [CrossRef]
  55. Mao, J.; Qian, Y.; Ye, J.; Zhao, H.; Wang, Y. GPT-Driver: Learning to Drive with GPT. arXiv 2023. [Google Scholar] [CrossRef]
  56. Yang, Y.; Zhang, W.; Lin, H.; Liu, Y.; Qu, X. Applying masked language model for transport mode choice behavior prediction. Transp. Res. Part A Policy Pract. 2024, 184, 104074. [Google Scholar] [CrossRef]
  57. Wang, M.; Pang, A.; Kan, Y.; Pun, M.O.; Chen, C.S.; Huang, B. LLM-Assisted Light: Leveraging Large Language Model Capabilities for Human-Mimetic Traffic Signal Control in Complex Urban Environments. arXiv 2024. [Google Scholar] [CrossRef]
  58. Wang, B.; Cai, Z.; Karim, M.M.; Liu, C.; Wang, Y. Traffic Performance GPT (TP-GPT): Real-Time Data Informed Intelligent ChatBot for Transportation Surveillance and Management. arXiv 2024. [Google Scholar] [CrossRef]
  59. Wang, J.; Jiang, R.; Yang, C.; Wu, Z.; Onizuka, M.; Shibasaki, R.; Koshizuka, N.; Xiao, C. Large Language Models as Urban Residents: An LLM Agent Framework for Personal Mobility Generation. arXiv 2024. [Google Scholar] [CrossRef]
  60. Zhao, C.; Wang, X.; Lv, Y.; Tian, Y.; Lin, Y.; Wang, F.Y. Parallel Transportation in TransVerse: From Foundation Models to DeCAST. IEEE Trans. Intell. Transp. Syst. 2023, 24, 15310–15327. [Google Scholar] [CrossRef]
  61. Grigorev, A.; Saleh, A.S.M.K.; Ou, Y. IncidentResponseGPT: Generating Traffic Incident Response Plans with Generative Artificial Intelligence. arXiv 2024. [Google Scholar] [CrossRef]
  62. Sultan, R.I.; Li, C.; Zhu, H.; Khanduri, P.; Brocanelli, M.; Zhu, D. GeoSAM: Fine-tuning SAM with Sparse and Dense Visual Prompting for Automated Segmentation of Mobility Infrastructure. arXiv 2024. [Google Scholar] [CrossRef]
  63. Guo, X.; Zhang, Q.; Jiang, J.; Peng, M.; Yang, H.F.; Zhu, M. Towards Responsible and Reliable Traffic Flow Prediction with Large Language Models. arXiv 2024. [Google Scholar] [CrossRef]
  64. Lai, S.; Xu, Z.; Zhang, W.; Liu, H.; Xiong, H. LLMLight: Large Language Models as Traffic Signal Control Agents. arXiv 2024. [Google Scholar] [CrossRef]
  65. Liao, H.; Shen, H.; Li, Z.; Wang, C.; Li, G.; Bie, Y.; Xu, C. GPT-4 enhanced multimodal grounding for autonomous driving: Leveraging cross-modal attention with large language models. Commun. Transp. Res. 2024, 4, 100116. [Google Scholar] [CrossRef]
  66. Du, H.; Teng, S.; Chen, H.; Ma, J.; Wang, X.; Gou, C.; Li, B.; Ma, S.; Miao, Q.; Na, X.; et al. Chat with ChatGPT on Intelligent Vehicles: An IEEE TIV Perspective. IEEE Trans. Intell. Veh. 2023, 8, 2020–2026. [Google Scholar] [CrossRef]
  67. Qu, X.; Lin, H.; Liu, Y. Envisioning the future of transportation: Inspiration of ChatGPT and large models. Commun. Transp. Res. 2023, 3, 100103. [Google Scholar] [CrossRef]
  68. Güzay, Ç.; Özdemir, E.; Kara, Y. A Generative AI-driven Application: Use of Large Language Models for Traffic Scenario Generation. In Proceedings of the 2023 14th International Conference on Electrical and Electronics Engineering (ELECO), Bursa, Turkey, 30 November–2 December 2023. [Google Scholar] [CrossRef]
  69. Da, L.; Gao, M.; Mei, H.; Wei, H. Prompt to Transfer: Sim-to-Real Transfer for Traffic Signal Control with Prompt Learning. arXiv 2024. [Google Scholar] [CrossRef]
  70. Liang, Y.; Liu, Y.; Wang, X.; Zhao, Z. Exploring Large Language Models for Human Mobility Prediction under Public Events. arXiv 2023. [Google Scholar] [CrossRef]
  71. Shi, Y.; Lv, F.; Wang, X.; Xia, C.; Li, S.; Yang, S.; Xi, T.; Zhang, G. Open-TransMind: A New Baseline and Benchmark for 1st Foundation Model Challenge of Intelligent Transportation. arXiv 2023. [Google Scholar] [CrossRef]
  72. Zhang, K.; Zhou, F.; Wu, L.; Xie, N.; He, Z. Semantic understanding and prompt engineering for large-scale traffic data imputation. Inf. Fusion 2024, 102, 102038. [Google Scholar] [CrossRef]
  73. Zheng, O.; Abdel-Aty, M.; Wang, D.; Wang, C.; Ding, S. TrafficSafetyGPT: Tuning a Pre-trained Large Language Model to a Domain-Specific Expert in Transportation Safety. arXiv 2023. [Google Scholar] [CrossRef]
  74. Tang, Y.; Dai, X.; Lv, Y. Large Language Model-Assisted Arterial Traffic Signal Control. IEEE J. Radio Freq. Identif. 2024, 8, 322–326. [Google Scholar] [CrossRef]
  75. Tian, Y.; Li, X.; Zhang, H.; Zhao, C.; Li, B.; Wang, X.; Wang, X.; Wang, F.Y. VistaGPT: Generative Parallel Transformers for Vehicles with Intelligent Systems for Transport Automation. IEEE Trans. Intell. Veh. 2023, 8, 4198–4207. [Google Scholar] [CrossRef]
  76. De Zarzà, I.; De Curtò, J.; Roig, G.; Calafate, C.T. LLM Multimodal Traffic Accident Forecasting. Sensors 2023, 23, 9225. [Google Scholar] [CrossRef]
  77. Adekanye, O.A.M. LLM-Powered Synthetic Environments for Self-Driving Scenarios. In Proceedings of the AAAI Conference on Artificial Intelligence, Vancouver, BC, Canada, 20–27 February 2024; pp. 23721–23723. [Google Scholar] [CrossRef]
  78. Li, Y.; Li, L.; Wu, Z.; Bing, Z.; Xuanyuan, Z.; Knoll, A.C.; Chen, L. UnstrPrompt: Large Language Model Prompt for Driving in Unstructured Scenarios. IEEE J. Radio Freq. Identif. 2024, 8, 367–375. [Google Scholar] [CrossRef]
  79. Jin, Y.; Shen, X.; Peng, H.; Liu, X.; Qin, J.; Li, J.; Xie, J.; Gao, P.; Zhou, G.; Gong, J. SurrealDriver: Designing Generative Driver Agent Simulation Framework in Urban Contexts based on Large Language Model. arXiv 2023. [Google Scholar] [CrossRef]
  80. Liu, Y. Large language models for air transportation: A critical review. J. Air Transp. Res. Soc. 2024, 2, 100024. [Google Scholar] [CrossRef]
  81. Wang, X.; Wang, D.; Chen, L.; Lin, Y. Building Transportation Foundation Model via Generative Graph Transformer. arXiv 2023. [Google Scholar] [CrossRef]
  82. Yang, Z.; Jia, X.; Li, H.; Yan, J. LLM4Drive: A Survey of Large Language Models for Autonomous Driving. arXiv 2024. [Google Scholar] [CrossRef]
  83. Zhang, Z.; Sun, Y.; Wang, Z.; Nie, Y.; Ma, X.; Li, R.; Sun, P.; Ban, X. Large Language Models for Mobility Analysis in Transportation Systems: A Survey on Forecasting Tasks. arXiv 2025. [Google Scholar] [CrossRef]
  84. Li, S.; Azfar, T.; Ke, R. ChatSUMO: Large Language Model for Automating Traffic Scenario Generation in Simulation of Urban MObility. arXiv 2024. [Google Scholar] [CrossRef]
  85. Yuan, Z.; Lai, S.; Liu, H. CoLLMLight: Cooperative Large Language Model Agents for Network-Wide Traffic Signal Control. arXiv 2025. [Google Scholar] [CrossRef]
  86. Onsu, M.A.; Lohan, P.; Kantarci, B.; Syed, A.; Andrews, M.; Kennedy, S. Leveraging Multimodal-LLMs Assisted by Instance Segmentation for Intelligent Traffic Monitoring. arXiv 2025. [Google Scholar] [CrossRef]
  87. Lu, Q.; Wang, X.; Jiang, Y.; Zhao, G.; Ma, M.; Feng, S. Multimodal Large Language Model Driven Scenario Testing for Autonomous Vehicles. arXiv 2024. [Google Scholar] [CrossRef]
  88. Guo, X.; Zhang, Q.; Jiang, J.; Peng, M.; Zhu, M.; Yang, H. Towards Explainable Traffic Flow Prediction with Large Language Models. arXiv 2024. [Google Scholar] [CrossRef]
  89. Wang, D.; Lu, C.T.; Fu, Y. Towards Automated Urban Planning: When Generative and ChatGPT-like AI Meets Urban Planning. arXiv 2023. [Google Scholar] [CrossRef]
  90. Deng, J.; Chai, W.; Huang, J.; Zhao, Z.; Huang, Q.; Gao, M.; Guo, J.; Hao, S.; Hu, W.; Hwang, J.N.; et al. CityCraft: A Real Crafter for 3D City Generation. arXiv 2024. [Google Scholar] [CrossRef]
  91. Balsebre, P.; Huang, W.; Cong, G.; Li, Y. City Foundation Models for Learning General Purpose Representations from OpenStreetMap. arXiv 2023, arXiv:2310.00583. [Google Scholar] [CrossRef]
  92. Aghzal, M.; Plaku, E.; Yao, Z. Can Large Language Models be Good Path Planners? A Benchmark and Investigation on Spatial-temporal Reasoning. arXiv 2024. [Google Scholar] [CrossRef]
  93. Chen, Y.; Wang, X.; Xu, G. GATGPT: A Pre-trained Large Language Model with Graph Attention Network for Spatiotemporal Imputation. arXiv 2023. [Google Scholar] [CrossRef]
  94. Li, Z.; Xia, L.; Tang, J.; Xu, Y.; Shi, L.; Xia, L.; Yin, D.; Huang, C. UrbanGPT: Spatio-Temporal Large Language Models. arXiv 2024. [Google Scholar] [CrossRef]
  95. Chen, J.; Xu, W.; Cao, H.; Xu, Z.; Zhang, Y.; Zhang, Z.; Zhang, S. Multimodal Road Network Generation Based on Large Language Model. arXiv 2024. [Google Scholar] [CrossRef]
  96. Buitrago-Esquinas, E.M.; Puig-Cabrera, M.; Santos, J.A.C.; Custódio-Santos, M.; Yñiguez-Ovando, R. Developing a hetero-intelligence methodological framework for sustainable policy-making based on the assessment of large language models. MethodsX 2024, 12, 102707. [Google Scholar] [CrossRef]
  97. Zhong, S.; Hao, X.; Yan, Y.; Zhang, Y.; Song, Y.; Liang, Y. UrbanCross: Enhancing Satellite Image-Text Retrieval with Cross-Domain Adaptation. arXiv 2024. [Google Scholar] [CrossRef]
  98. Hao, X.; Chen, W.; Yan, Y.; Zhong, S.; Wang, K.; Wen, Q.; Liang, Y. UrbanVLP: Multi-Granularity Vision-Language Pretraining for Urban Socioeconomic Indicator Prediction. arXiv 2025. [Google Scholar] [CrossRef]
  99. Yan, Y.; Wen, H.; Zhong, S.; Chen, W.; Chen, H.; Wen, Q.; Zimmermann, R.; Liang, Y. UrbanCLIP: Learning Text-enhanced Urban Region Profiling with Contrastive Language-Image Pretraining from the Web. arXiv 2024. [Google Scholar] [CrossRef]
  100. Wang, X.; Ling, X.; Zhang, T.; Li, X.; Wang, S.; Li, Z.; Zhang, L.; Gong, P. Optimizing and Fine-tuning Large Language Model for Urban Renewal. arXiv 2023. [Google Scholar] [CrossRef]
  101. Zenkert, J.; Fathi, M. Taxonomy Mining from a Smart City CMS using the Multidimensional Knowledge Representation Approach. In Proceedings of the 2024 IEEE 14th Annual Computing and Communication Workshop and Conference (CCWC), Las Vegas, NV, USA, 8–10 January 2024. [Google Scholar] [CrossRef]
  102. Jang, K.M.; Chen, J.; Kang, Y.; Kim, J.; Lee, J.; Duarte, F. Understanding Place Identity with Generative AI. arXiv 2023. [Google Scholar] [CrossRef]
  103. Tang, Y.; Wang, Z.; Qu, A.; Yan, Y.; Wu, Z.; Zhuang, D.; Kai, J.; Hou, K.; Guo, X.; Zheng, H.; et al. ITINERA: Integrating Spatial Optimization with Large Language Models for Open-domain Urban Itinerary Planning. arXiv 2024. [Google Scholar] [CrossRef]
  104. Manvi, R.; Khanna, S.; Mai, G.; Burke, M.; Lobell, D.; Ermon, S. GeoLLM: Extracting Geospatial Knowledge from Large Language Models. arXiv 2024. [Google Scholar] [CrossRef]
  105. Chen, Y.; Zhang, S.; Han, T.; Du, Y.; Zhang, W.; Li, J. Chat3D: Interactive understanding 3D scene-level point clouds by chatting with foundation model for urban ecological construction. ISPRS J. Photogramm. Remote Sens. 2024, 212, 181–192. [Google Scholar] [CrossRef]
  106. Zhou, Z.; Lin, Y.; Li, Y. Large Language Model Empowered Participatory Urban Planning. arXiv 2024. [Google Scholar] [CrossRef]
  107. Berragan, C.; Singleton, A.; Calafiore, A.; Morley, J. Mapping Great Britain’s semantic footprints through a large language model analysis of Reddit comments. Comput. Environ. Urban Syst. 2024, 110, 102121. [Google Scholar] [CrossRef]
  108. Kalyuzhnaya, A.; Mityagin, S.; Lutsenko, E.; Getmanov, A.; Aksenkin, Y.; Fatkhiev, K.; Fedorin, K.; Nikitin, N.O.; Chichkova, N.; Vorona, V.; et al. LLM Agents for Smart City Management: Enhancing Decision Support Through Multi-Agent AI Systems. Smart Cities 2025, 8, 19. [Google Scholar] [CrossRef]
  109. Kumbam, P.R.; Vejre, K.M. FloodLense: A Framework for ChatGPT-based Real-time Flood Detection. arXiv 2024. [Google Scholar] [CrossRef]
  110. Hao, Y.; Qi, J.; Ma, X.; Wu, S.; Liu, R.; Zhang, X. An LLM-Based Inventory Construction Framework of Urban Ground Collapse Events with Spatiotemporal Locations. ISPRS Int. J. Geo-Inf. 2024, 13, 133. [Google Scholar] [CrossRef]
  111. Goecks, V.G.; Waytowich, N.R. DisasterResponseGPT: Large Language Models for Accelerated Plan of Action Development in Disaster Response Scenarios. arXiv 2023. [Google Scholar] [CrossRef]
  112. Soomro, S.e.h.; Boota, M.W.; Zwain, H.M.; Soomro, G.e.Z.; Shi, X.; Guo, J.; Li, Y.; Tayyab, M.; Aamir Soomro, M.H.A.; Hu, C.; et al. How effective is twitter (X) social media data for urban flood management? J. Hydrol. 2024, 634, 131129. [Google Scholar] [CrossRef]
  113. Xue, Z.; Xu, C.; Xu, X. Application of ChatGPT in natural disaster prevention and reduction. Nat. Hazards Res. 2023, 3, 556–562. [Google Scholar] [CrossRef]
  114. Han, J.; Zheng, Z.; Lu, X.Z.; Chen, K.Y.; Lin, J.R. Enhanced earthquake impact analysis based on social media texts via large language model. Int. J. Disaster Risk Reduct. 2024, 109, 104574. [Google Scholar] [CrossRef]
  115. Ou, R.; Yan, H.; Wu, M.; Zhang, C. A Method of Efficient Synthesizing Post-disaster Remote Sensing Image with Diffusion Model and LLM. In Proceedings of the 2023 Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), Taipei, Taiwan, 31 October–3 November 2023; pp. 1549–1555. [Google Scholar] [CrossRef]
  116. Colverd, G.; Darm, P.; Silverberg, L.; Kasmanoff, N. FloodBrain: Flood Disaster Reporting by Web-based Retrieval Augmented Generation with an LLM. arXiv 2023. [Google Scholar] [CrossRef]
  117. Hu, Y.; Mai, G.; Cundy, C.; Choi, K.; Lao, N.; Liu, W.; Lakhanpal, G.; Zhou, R.Z.; Joseph, K. Geo-knowledge-guided GPT models improve the extraction of location descriptions from disaster-related social media messages. Int. J. Geogr. Inf. Sci. 2023, 37, 2289–2318. [Google Scholar] [CrossRef]
  118. Akinboyewa, T.; Ning, H.; Lessani, M.N.; Li, Z. Automated Floodwater Depth Estimation Using Large Multimodal Model for Rapid Flood Mapping. arXiv 2024. [Google Scholar] [CrossRef]
  119. Ziaullah, A.W.; Ofli, F.; Imran, M. Monitoring Critical Infrastructure Facilities During Disasters Using Large Language Models. arXiv 2024. [Google Scholar] [CrossRef]
  120. Xia, Y.; Huang, Y.; Qiu, Q.; Zhang, X.; Miao, L.; Chen, Y. A Question and Answering Service of Typhoon Disasters Based on the T5 Large Language Model. ISPRS Int. J. Geo-Inf. 2024, 13, 165. [Google Scholar] [CrossRef]
  121. Yin, K.; Liu, C.; Mostafavi, A.; Hu, X. CrisisSense-LLM: Instruction Fine-Tuned Large Language Model for Multi-label Social Media Text Classification in Disaster Informatics. arXiv 2025. [Google Scholar] [CrossRef]
  122. Chen, W.; Su, Y.; Zuo, J.; Yang, C.; Yuan, C.; Chan, C.M.; Yu, H.; Lu, Y.; Hung, Y.H.; Qian, C.; et al. AgentVerse: Facilitating Multi-Agent Collaboration and Exploring Emergent Behaviors. arXiv 2023. [Google Scholar] [CrossRef]
  123. Zhang, J.; Xu, X.; Zhang, N.; Liu, R.; Hooi, B.; Deng, S. Exploring Collaboration Mechanisms for LLM Agents: A Social Psychology View. arXiv 2024. [Google Scholar] [CrossRef]
  124. Wang, Z.; Chiu, Y.Y.; Chiu, Y.C. Humanoid Agents: Platform for Simulating Human-like Generative Agents. arXiv 2023. [Google Scholar] [CrossRef]
  125. Li, G.; Hammoud, H.A.A.K.; Itani, H.; Khizbullin, D.; Ghanem, B. CAMEL: Communicative Agents for "Mind" Exploration of Large Language Model Society. arXiv 2023. [Google Scholar] [CrossRef]
  126. Huang, F.; Huang, Q.; Zhao, Y.; Qi, Z.; Wang, B.; Huang, Y.; Li, S. A Three-Stage Framework for Event-Event Relation Extraction with Large Language Model. In Proceedings of the Neural Information Processing, Changsha, China, 20–23 November 2023; Luo, B., Cheng, L., Wu, Z.G., Li, H., Li, C., Eds.; Springer Nature Singapore: Singapore, 2024; pp. 434–446. [Google Scholar] [CrossRef]
  127. Manvi, R.; Khanna, S.; Burke, M.; Lobell, D.; Ermon, S. Large Language Models Are Geographically Biased. arXiv 2024. [Google Scholar] [CrossRef]
  128. Liu, R.; Yang, R.; Jia, C.; Zhang, G.; Zhou, D.; Dai, A.M.; Yang, D.; Vosoughi, S. Training Socially Aligned Language Models on Simulated Social Interactions. arXiv 2023. [Google Scholar] [CrossRef]
  129. Sarzaeim, P.; Mahmoud, Q.H.; Azim, A. Experimental Analysis of Large Language Models in Crime Classification and Prediction. In Proceedings of the 37th Canadian Conference on Artificial Intelligence, Guelph, ON, Canada, 27–31 May 2024. [Google Scholar]
  130. Sarzaeim, P.; Mahmoud, Q.H.; Azim, A. A Framework for LLM-Assisted Smart Policing System. IEEE Access 2024, 12, 74915–74929. [Google Scholar] [CrossRef]
  131. Kim, J.; Lee, B. AI-Augmented Surveys: Leveraging Large Language Models and Surveys for Opinion Prediction. arXiv 2024. [Google Scholar] [CrossRef]
  132. Suzuki, R.; Arita, T. An evolutionary model of personality traits related to cooperative behavior using a large language model. Sci. Rep. 2024, 14, 5989. [Google Scholar] [CrossRef]
  133. Gao, C.; Lan, X.; Lu, Z.; Mao, J.; Piao, J.; Wang, H.; Jin, D.; Li, Y. S3: Social-network Simulation System with Large Language Model-Empowered Agents. arXiv 2023. [Google Scholar] [CrossRef]
  134. Park, J.S.; O’Brien, J.C.; Cai, C.J.; Morris, M.R.; Liang, P.; Bernstein, M.S. Generative Agents: Interactive Simulacra of Human Behavior. arXiv 2023. [Google Scholar] [CrossRef]
  135. Zhang, Y.; Prebensen, N.K. Co-creating with ChatGPT for tourism marketing materials. Ann. Tour. Res. Empir. Insights 2024, 5, 100124. [Google Scholar] [CrossRef]
  136. Mich, L.; Garigliano, R. ChatGPT for e-Tourism: A technological perspective. Inf. Technol. Tour. 2023, 25, 1–12. [Google Scholar] [CrossRef]
  137. Gursoy, D.; Li, Y.; Song, H. ChatGPT and the hospitality and tourism industry: An overview of current trends and future research directions. J. Hosp. Mark. Manag. 2023, 32, 579–592. [Google Scholar] [CrossRef]
  138. Carvalho, I.; Ivanov, S. ChatGPT for tourism: Applications, benefits and risks. Tour. Rev. 2023, 79, 290–303. [Google Scholar] [CrossRef]
  139. Xie, J.; Zhang, K.; Chen, J.; Zhu, T.; Lou, R.; Tian, Y.; Xiao, Y.; Su, Y. TravelPlanner: A Benchmark for Real-World Planning with Language Agents. arXiv 2024. [Google Scholar] [CrossRef]
  140. Xie, J.; Liang, Y.; Liu, J.; Xiao, Y.; Wu, B.; Ni, S. QUERT: Continual Pre-training of Language Model for Query Understanding in Travel Domain Search. arXiv 2023. [Google Scholar] [CrossRef]
  141. Yao, J. Elevating Urban Tourism: Data-Driven Insights and AI-Powered Personalization with Large Language Models Brilliance. In Proceedings of the 2023 IEEE 3rd International Conference on Social Sciences and Intelligence Management (SSIM), Taichung, Taiwan, 15–17 December 2023; pp. 138–143. [Google Scholar] [CrossRef]
  142. Balamurali, O.; Abhishek Sai, A.; Karthikeya, M.; Anand, S. Sentiment Analysis for Better User Experience in Tourism Chatbot using LSTM and LLM. In Proceedings of the 2023 9th International Conference on Signal Processing and Communication (ICSC), Noida, India, 21–23 December 2023; pp. 456–462. [Google Scholar] [CrossRef]
  143. Fan, Z.; Chen, C. CuPe-KG: Cultural perspective–based knowledge graph construction of tourism resources via pretrained language models. Inf. Process. Manag. 2024, 61, 103646. [Google Scholar] [CrossRef]
  144. Chen, S.; Long, G.; Shen, T.; Jiang, J. Prompt Federated Learning for Weather Forecasting: Toward Foundation Models on Meteorological Data. arXiv 2023. [Google Scholar] [CrossRef]
  145. Kwon, O.H.; Vu, K.; Bhargava, N.; Radaideh, M.I.; Cooper, J.; Joynt, V.; Radaideh, M.I. Sentiment analysis of the United States public support of nuclear power on social media using large language models. Renew. Sustain. Energy Rev. 2024, 200, 114570. [Google Scholar] [CrossRef]
  146. Vaghefi, S.A.; Stammbach, D.; Muccione, V.; Bingler, J.; Ni, J.; Kraus, M.; Allen, S.; Colesanti-Senni, C.; Wekhof, T.; Schimanski, T.; et al. ChatClimate: Grounding conversational AI in climate science. Commun. Earth Environ. 2023, 4, 480. [Google Scholar] [CrossRef]
  147. Agathokleous, E.; Saitanis, C.J.; Fang, C.; Yu, Z. Use of ChatGPT: What does it mean for biology and environmental science? Sci. Total Environ. 2023, 888, 164154. [Google Scholar] [CrossRef]
  148. Chen, S.; Long, G.; Jiang, J.; Liu, D.; Zhang, C. Foundation Models for Weather and Climate Data Understanding: A Comprehensive Survey. arXiv 2023. [Google Scholar] [CrossRef]
  149. Li, N.; Gao, C.; Li, M.; Li, Y.; Liao, Q. EconAgent: Large Language Model-Empowered Agents for Simulating Macroeconomic Activities. arXiv 2024. [Google Scholar] [CrossRef]
  150. Han, X.; Wu, Z.; Xiao, C. “Guinea Pig Trials” Utilizing GPT: A Novel Smart Agent-Based Modeling Approach for Studying Firm Competition and Collusion. arXiv 2024. [Google Scholar] [CrossRef]
  151. Horton, J.J. Large Language Models as Simulated Economic Agents: What Can We Learn from Homo Silicus? arXiv 2023. [Google Scholar] [CrossRef]
  152. Sifat, R.I. ChatGPT and the Future of Health Policy Analysis: Potential and Pitfalls of Using ChatGPT in Policymaking. Ann. Biomed. Eng. 2023, 51, 1357–1359. [Google Scholar] [CrossRef] [PubMed]
  153. Jiang, Y.; Qiu, R.; Zhang, Y.; Zhang, P.F. Balanced and Explainable Social Media Analysis for Public Health with Large Language Models. arXiv 2023. [Google Scholar] [CrossRef]
  154. Guevara, M.; Chen, S.; Thomas, S.; Chaunzwa, T.L.; Franco, I.; Kann, B.H.; Moningi, S.; Qian, J.M.; Goldstein, M.; Harper, S.; et al. Large language models to identify social determinants of health in electronic health records. Npj Digit. Med. 2024, 7, 6. [Google Scholar] [CrossRef]
  155. Zhang, L.; Chen, Z. Large language model-based interpretable machine learning control in building energy systems. Energy Build. 2024, 313, 114278. [Google Scholar] [CrossRef]
  156. Jiang, G.; Ma, Z.; Zhang, L.; Chen, J. EPlus-LLM: A large language model-based computing platform for automated building energy modeling. Appl. Energy 2024, 367, 123431. [Google Scholar] [CrossRef]
  157. Huang, C.; Li, S.; Liu, R.; Wang, H.; Chen, Y. Large Foundation Models for Power Systems. arXiv 2023. [Google Scholar] [CrossRef]
  158. Guo, H.; Su, X.; Wu, C.; Du, B.; Zhang, L.; Li, D. Remote Sensing ChatGPT: Solving Remote Sensing Tasks with ChatGPT and Visual Models. arXiv 2024. [Google Scholar] [CrossRef]
  159. Fernandez, A.; Dube, S. Core Building Blocks: Next Gen Geo Spatial GPT Application. arXiv 2023. [Google Scholar] [CrossRef]
  160. Jiang, Y.; Yang, C. Is ChatGPT a Good Geospatial Data Analyst? Exploring the Integration of Natural Language into Structured Query Language within a Spatial Database. ISPRS Int. J. Geo-Inf. 2024, 13, 26. [Google Scholar] [CrossRef]
  161. Zhan, Y.; Xiong, Z.; Yuan, Y. SkyEyeGPT: Unifying Remote Sensing Vision-Language Tasks via Instruction Tuning with Large Language Model. arXiv 2024. [Google Scholar] [CrossRef]
  162. Kuckreja, K.; Danish, M.S.; Naseer, M.; Das, A.; Khan, S.; Khan, F.S. GeoChat: Grounded Large Vision-Language Model for Remote Sensing. arXiv 2023. [Google Scholar] [CrossRef]
  163. Yuan, Z.; Xiong, Z.; Mou, L.; Zhu, X.X. ChatEarthNet: A Global-Scale Image-Text Dataset Empowering Vision-Language Geo-Foundation Models. arXiv 2024. [Google Scholar] [CrossRef]
  164. Li, Z.; Ning, H. Autonomous GIS: The next-generation AI-powered GIS. Int. J. Digit. Earth 2023, 16, 4668–4686. [Google Scholar] [CrossRef]
  165. Hämäläinen, P.; Tavast, M.; Kunnari, A. Evaluating Large Language Models in Generating Synthetic HCI Research Data: A Case Study. In Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems (CHI’23), Hamburg, Germany, 23–28 April 2023; pp. 1–19. [Google Scholar] [CrossRef]
  166. Fu, J.; Han, H.; Su, X.; Fan, C. Towards Human-AI Collaborative Urban Science Research Enabled by Pre-trained Large Language Models. arXiv 2023. [Google Scholar] [CrossRef]
  167. Roberts, J.; Lüddecke, T.; Das, S.; Han, K.; Albanie, S. GPT4GEO: How a Language Model Sees the World’s Geography. arXiv 2023. [Google Scholar] [CrossRef]
  168. Li, Z.; Zhou, W.; Chiang, Y.Y.; Chen, M. GeoLM: Empowering Language Models for Geospatially Grounded Language Understanding. arXiv 2023. [Google Scholar] [CrossRef]
  169. Hong, S.; Zhuge, M.; Chen, J.; Zheng, X.; Cheng, Y.; Zhang, C.; Wang, J.; Wang, Z.; Yau, S.K.S.; Lin, Z.; et al. MetaGPT: Meta Programming for A Multi-Agent Collaborative Framework. arXiv 2024. [Google Scholar] [CrossRef]
  170. Xue, H.; Salim, F.D. PromptCast: A New Prompt-Based Learning Paradigm for Time Series Forecasting. IEEE Trans. Knowl. Data Eng. 2024, 36, 6851–6864. [Google Scholar] [CrossRef]
  171. Yang, J.; Ding, R.; Brown, E.; Qi, X.; Xie, S. V-IRL: Grounding Virtual Intelligence in Real Life. arXiv 2024. [Google Scholar] [CrossRef]
  172. Singh, S.; Fore, M.; Stamoulis, D. GeoLLM-Engine: A Realistic Environment for Building Geospatial Copilots. arXiv 2024. [Google Scholar] [CrossRef]
  173. Osco, L.P.; de Lemos, E.L.; Gonçalves, W.N.; Ramos, A.P.M.; Junior, J.M. The Potential of Visual ChatGPT For Remote Sensing. arXiv 2023. [Google Scholar] [CrossRef]
  174. Zhang, Y.; Wei, C.; Wu, S.; He, Z.; Yu, W. GeoGPT: Understanding and Processing Geospatial Tasks through An Autonomous GPT. arXiv 2023. [Google Scholar] [CrossRef]
175. Zhou, T.; Niu, P.; Wang, X.; Sun, L.; Jin, R. One Fits All: Power General Time Series Analysis by Pretrained LM. arXiv 2023. [Google Scholar] [CrossRef]
  176. Mooney, P.; Cui, W.; Guan, B.; Juhász, L. Towards Understanding the Geospatial Skills of ChatGPT: Taking a Geographic Information Systems (GIS) Exam. In Proceedings of the 6th ACM SIGSPATIAL International Workshop on AI for Geographic Knowledge Discovery, Hamburg, Germany, 13 November 2023; pp. 85–94. [Google Scholar] [CrossRef]
  177. Kang, Y.; Zhang, Q.; Roth, R. The Ethics of AI-Generated Maps: A Study of DALLE 2 and Implications for Cartography. arXiv 2023. [Google Scholar] [CrossRef]
  178. Jakubik, J.; Roy, S.; Phillips, C.E.; Fraccaro, P.; Godwin, D.; Zadrozny, B.; Szwarcman, D.; Gomes, C.; Nyirjesy, G.; Edwards, B.; et al. Foundation Models for Generalist Geospatial Artificial Intelligence. arXiv 2023. [Google Scholar] [CrossRef]
  179. Zhu, X.; Chen, Y.; Tian, H.; Tao, C.; Su, W.; Yang, C.; Huang, G.; Li, B.; Lu, L.; Wang, X.; et al. Ghost in the Minecraft: Generally Capable Agents for Open-World Environments via Large Language Models with Text-based Knowledge and Memory. arXiv 2023. [Google Scholar] [CrossRef]
  180. Mai, G.; Huang, W.; Sun, J.; Song, S.; Mishra, D.; Liu, N.; Gao, S.; Liu, T.; Cong, G.; Hu, Y.; et al. On the Opportunities and Challenges of Foundation Models for Geospatial Artificial Intelligence. arXiv 2023. [Google Scholar] [CrossRef]
  181. Deng, C.; Zhang, T.; He, Z.; Xu, Y.; Chen, Q.; Shi, Y.; Fu, L.; Zhang, W.; Wang, X.; Zhou, C.; et al. K2: A Foundation Language Model for Geoscience Knowledge Understanding and Utilization. arXiv 2023. [Google Scholar] [CrossRef]
  182. Fulman, N.; Memduhoğlu, A.; Zipf, A. Distortions in Judged Spatial Relations in Large Language Models. arXiv 2024. [Google Scholar] [CrossRef]
  183. Mei, L.; Mao, J.; Hu, J.; Tan, N.; Chai, H.; Wen, J.R. Improving First-stage Retrieval of Point-of-interest Search by Pre-training Models. ACM Trans. Inf. Syst. 2023, 42, 1–27. [Google Scholar] [CrossRef]
  184. Chang, C.; Wang, W.Y.; Peng, W.C.; Chen, T.F. LLM4TS: Aligning Pre-Trained LLMs as Data-Efficient Time-Series Forecasters. arXiv 2024. [Google Scholar] [CrossRef]
  185. Feng, S.; Lyu, H.; Chen, C.; Ong, Y.S. Where to Move Next: Zero-shot Generalization of LLMs for Next POI Recommendation. arXiv 2024. [Google Scholar] [CrossRef]
  186. Yan, Z.; Li, J.; Li, X.; Zhou, R.; Zhang, W.; Feng, Y.; Diao, W.; Fu, K.; Sun, X. RingMo-SAM: A Foundation Model for Segment Anything in Multimodal Remote-Sensing Images. IEEE Trans. Geosci. Remote Sens. 2023, 61, 5625716. [Google Scholar] [CrossRef]
  187. Gruver, N.; Finzi, M.; Qiu, S.; Wilson, A.G. Large Language Models Are Zero-Shot Time Series Forecasters. arXiv 2024. [Google Scholar] [CrossRef]
  188. Bhandari, P.; Anastasopoulos, A.; Pfoser, D. Are Large Language Models Geospatially Knowledgeable? arXiv 2023. [Google Scholar] [CrossRef]
  189. Schumann, R.; Zhu, W.; Feng, W.; Fu, T.J.; Riezler, S.; Wang, W.Y. VELMA: Verbalization Embodiment of LLM Agents for Vision and Language Navigation in Street View. arXiv 2024. [Google Scholar] [CrossRef]
  190. Balsebre, P.; Huang, W.; Cong, G. LAMP: A Language Model on the Map. arXiv 2024. [Google Scholar] [CrossRef]
  191. Naveen, P.; Maheswar, R.; Trojovský, P. GeoNLU: Bridging the gap between natural language and spatial data infrastructures. Alex. Eng. J. 2024, 87, 126–147. [Google Scholar] [CrossRef]
  192. Roberts, J.; Lüddecke, T.; Sheikh, R.; Han, K.; Albanie, S. Charting New Territories: Exploring the Geographic and Geospatial Capabilities of Multimodal LLMs. arXiv 2024. [Google Scholar] [CrossRef]
  193. Ji, Y.; Gao, S. Evaluating the Effectiveness of Large Language Models in Representing Textual Descriptions of Geometry and Spatial Relations. arXiv 2023. [Google Scholar] [CrossRef]
  194. Gurnee, W.; Tegmark, M. Language Models Represent Space and Time. arXiv 2024. [Google Scholar] [CrossRef]
  195. Juhász, L.; Mooney, P.; Hochmair, H.H.; Guan, B. ChatGPT as a mapping assistant: A novel method to enrich maps with generative AI and content derived from street-level photographs. In Proceedings of the Spatial Data Science Symposium 2023, Virtual, 5–6 September 2023. [Google Scholar] [CrossRef]
  196. Hong, Y.; Zhen, H.; Chen, P.; Zheng, S.; Du, Y.; Chen, Z.; Gan, C. 3D-LLM: Injecting the 3D World into Large Language Models. arXiv 2023. [Google Scholar] [CrossRef]
  197. Gao, C.; Lan, X.; Li, N.; Yuan, Y.; Ding, J.; Zhou, Z.; Xu, F.; Li, Y. Large Language Models Empowered Agent-based Modeling and Simulation: A Survey and Perspectives. arXiv 2023, arXiv:2312.11970. [Google Scholar] [CrossRef]
  198. Huang, X.; Liu, W.; Chen, X.; Wang, X.; Wang, H.; Lian, D.; Wang, Y.; Tang, R.; Chen, E. Understanding the planning of LLM agents: A survey. arXiv 2024. [Google Scholar] [CrossRef]
  199. Jin, M.; Wen, Q.; Liang, Y.; Zhang, C.; Xue, S.; Wang, X.; Zhang, J.; Wang, Y.; Chen, H.; Li, X.; et al. Large Models for Time Series and Spatio-Temporal Data: A Survey and Outlook. arXiv 2023. [Google Scholar] [CrossRef]
  200. Xi, Z.; Chen, W.; Guo, X.; He, W.; Ding, Y.; Hong, B.; Zhang, M.; Wang, J.; Jin, S.; Zhou, E.; et al. The Rise and Potential of Large Language Model Based Agents: A Survey. arXiv 2023. [Google Scholar] [CrossRef]
  201. Feng, J.; Du, Y.; Liu, T.; Guo, S.; Lin, Y.; Li, Y. CityGPT: Empowering Urban Spatial Cognition of Large Language Models. arXiv 2024. [Google Scholar] [CrossRef]
  202. Li, Z.; Xu, J.; Wang, S.; Wu, Y.; Li, H. StreetviewLLM: Extracting Geographic Information Using a Chain-of-Thought Multimodal Large Language Model. arXiv 2024. [Google Scholar] [CrossRef]
  203. Zou, Z.; Mubin, O.; Alnajjar, F.; Ali, L. A pilot study of measuring emotional response and perception of LLM-generated questionnaire and human-generated questionnaires. Sci. Rep. 2024, 14, 2781. [Google Scholar] [CrossRef]
  204. Sahoo, P.; Singh, A.K.; Saha, S.; Jain, V.; Mondal, S.; Chadha, A. A Systematic Survey of Prompt Engineering in Large Language Models: Techniques and Applications. arXiv 2024. [Google Scholar] [CrossRef]
  205. Wei, J.; Wang, X.; Schuurmans, D.; Bosma, M.; Xia, F.; Chi, E.; Le, Q.V.; Zhou, D. Chain-of-thought prompting elicits reasoning in large language models. Adv. Neural Inf. Process. Syst. 2022, 35, 24824–24837. [Google Scholar]
  206. Yao, S.; Yu, D.; Zhao, J.; Shafran, I.; Griffiths, T.; Cao, Y.; Narasimhan, K. Tree of thoughts: Deliberate problem solving with large language models. Adv. Neural Inf. Process. Syst. 2024, 36, 11809–11822. [Google Scholar]
  207. Yao, Y.; Li, Z.; Zhao, H. Beyond Chain-of-Thought, Effective Graph-of-Thought Reasoning in Language Models. arXiv 2024. [Google Scholar] [CrossRef]
  208. Wang, X.; Wei, J.; Schuurmans, D.; Le, Q.; Chi, E.; Narang, S.; Chowdhery, A.; Zhou, D. Self-Consistency Improves Chain of Thought Reasoning in Language Models. arXiv 2023. [Google Scholar] [CrossRef]
  209. Wu, Q.; Bansal, G.; Zhang, J.; Wu, Y.; Li, B.; Zhu, E.; Jiang, L.; Zhang, X.; Zhang, S.; Liu, J.; et al. AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation. arXiv 2023. [Google Scholar] [CrossRef]
  210. Yang, H.; Yue, S.; He, Y. Auto-GPT for Online Decision Making: Benchmarks and Additional Opinions. arXiv 2023. [Google Scholar] [CrossRef]
  211. Ju, C.; Liu, J.; Sinha, S.; Xue, H.; Salim, F. TrajLLM: A Modular LLM-Enhanced Agent-Based Framework for Realistic Human Trajectory Simulation. arXiv 2025. [Google Scholar] [CrossRef]
  212. Zheng, Y.; Capra, L.; Wolfson, O.; Yang, H. Urban computing: Concepts, methodologies, and applications. ACM Trans. Intell. Syst. Technol. (TIST) 2014, 5, 1–55. [Google Scholar] [CrossRef]
  213. Batty, M. The New Science of Cities; MIT Press: Cambridge, MA, USA, 2013. [Google Scholar]
  214. Zhou, Z.; Zhang, J.; Guan, Z.; Hu, M.; Lao, N.; Mu, L.; Li, S.; Mai, G. Img2Loc: Revisiting Image Geolocalization using Multi-modality Foundation Models and Image-based Retrieval-Augmented Generation. In Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR’24), Washington, DC, USA, 14–18 July 2024; pp. 2749–2754. [Google Scholar] [CrossRef]
  215. Lin, K.; Ahmed, F.; Li, L.; Lin, C.C.; Azarnasab, E.; Yang, Z.; Wang, J.; Liang, L.; Liu, Z.; Lu, Y.; et al. MM-VID: Advancing Video Understanding with GPT-4V(ision). arXiv 2023, arXiv:2310.19773. [Google Scholar] [CrossRef]
  216. Lin, J.; Tomlin, N.; Andreas, J.; Eisner, J. Decision-Oriented Dialogue for Human-AI Collaboration. Trans. Assoc. Comput. Linguist. 2024, 12, 892–911. [Google Scholar] [CrossRef]
  217. Guan, T.; Liu, F.; Wu, X.; Xian, R.; Li, Z.; Liu, X.; Wang, X.; Chen, L.; Huang, F.; Yacoob, Y.; et al. HallusionBench: An Advanced Diagnostic Suite for Entangled Language Hallucination and Visual Illusion in Large Vision-Language Models. arXiv 2024. [Google Scholar] [CrossRef]
  218. Dona, M.A.M.; Cabrero-Daniel, B.; Yu, Y.; Berger, C. LLMs Can Check Their Own Results to Mitigate Hallucinations in Traffic Understanding Tasks. arXiv 2024. [Google Scholar] [CrossRef]
  219. Wu, Y.; Sun, Z.; Li, S.; Welleck, S.; Yang, Y. Inference Scaling Laws: An Empirical Analysis of Compute-Optimal Inference for Problem-Solving with Language Models. arXiv 2025. [Google Scholar] [CrossRef]
  220. Ding, D.; Mallick, A.; Wang, C.; Sim, R.; Mukherjee, S.; Ruhle, V.; Lakshmanan, L.V.S.; Awadallah, A.H. Hybrid LLM: Cost-Efficient and Quality-Aware Query Routing. arXiv 2024. [Google Scholar] [CrossRef]
  221. Hung, T.W.; Yen, C.P. Predictive Policing and Algorithmic Fairness. Synthese 2023, 201, 206. [Google Scholar] [CrossRef]
  222. Ge, Y.; Hua, W.; Mei, K.; Ji, J.; Tan, J.; Xu, S.; Li, Z.; Zhang, Y. OpenAGI: When LLM Meets Domain Experts. arXiv 2023. [Google Scholar] [CrossRef]
Figure 1. Study selection by the PRISMA method.
Figure 2. Overall framework of LLMs in urban data analytics.
Figure 3. Overview of LLM-assisted urban data collection and generation.
Figure 4. Representative data types used in urban data analytics.
Figure 5. Data generation types [37,68,74,90,101,105,106,115,141,145,156,162,173,196].
Figure 6. Overview of LLM-based solutions in urban data preprocessing.
Figure 7. Overview of LLM-based modeling in urban analytics.
Figure 9. Components of LLM agents and their interactions.
Figure 10. Future direction framework.
Figure 11. Future directions in expanding information dimensions (Data) for urban analytics.
Figure 12. Urban foundation model [5,9,10].
Table 1. Categories of large language models.
| Classification | Category | Description | Examples |
|---|---|---|---|
| Size | Small | Number of parameters ≤ 1B | BERT |
| Size | Medium | 1B < number of parameters ≤ 10B | GPT-2, ChatGLM3, Phi-3-small |
| Size | Large | 10B < number of parameters ≤ 100B | Llama 3-70B, Gemma 27B, Mixtral 8x7B, Qwen2-72B |
| Size | Very Large | 100B < number of parameters | GPT-4, Grok 3, Gemini 2.5, Claude 3 Opus, DeepSeek-V2 |
| Open Source | Yes | Model and weights are available | Grok, LLaMA, Gemma, DeepSeek, Qwen |
| Open Source | No | Model and weights are not publicly available | GPT-4, Claude, Gemini |
Table 2. Key technologies and steps in developing large language models. For a detailed review of the related technologies and approaches, please refer to [12].
| Step | Description | Representative Technologies/Approaches |
|---|---|---|
| 1. Model Architecture | Fundamental structure of the LLM | Transformer-based architectures; Encoder-only models (e.g., BERT); Decoder-only models (e.g., GPT); Encoder–decoder models |
| 2. Data Preparation | Preparing datasets for training | Data collection; Cleaning; Filtering; Deduplication |
| 3. Tokenization | Converting text into processable tokens | BytePairEncoding; WordPieceEncoding; SentencePieceEncoding |
| 4. Positional Encoding | Adding position information to tokens | Absolute positional embeddings; Relative positional embeddings; Rotary positional embeddings; Relative positional bias |
| 5. Pre-training | Initial training on large unlabeled datasets | Next token prediction; Masked language modeling; Mixture of Experts (MoE) |
| 6. Fine-tuning and Instruction Tuning | Adapting models for specific tasks or instructions | Supervised Fine-Tuning (SFT); Instruction tuning datasets; Self-instruction |
| 7. Alignment | Aligning model outputs with human values | Reinforcement Learning from Human Feedback (RLHF); Reinforcement Learning from AI Feedback (RLAIF); Direct Preference Optimization (DPO); Kahneman–Tversky Optimization (KTO) |
| 8. Decoding Strategies | Techniques for generating text from the model | Greedy search; Beam search; Top-k sampling; Top-p (nucleus) sampling |
| 9. Efficient Training/Inference | Reducing computational costs | Optimized training frameworks (e.g., ZeRO and RWKV); Low-Rank Adaptation (LoRA); Knowledge distillation; Quantization |
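The sampling strategies listed under step 8 can be made concrete with a short sketch. The helper functions below filter a toy logit vector: top-k keeps only the k highest-scoring tokens, while top-p (nucleus) keeps the smallest set of tokens whose cumulative probability exceeds p. The logit values are purely illustrative.

```python
import numpy as np

def top_k_filter(logits, k):
    """Keep the k highest logits; mask the rest to -inf."""
    out = np.full_like(logits, -np.inf)
    idx = np.argsort(logits)[-k:]
    out[idx] = logits[idx]
    return out

def top_p_filter(logits, p):
    """Nucleus filtering: keep the smallest set of tokens whose
    cumulative probability exceeds p; mask the rest to -inf."""
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    order = np.argsort(probs)[::-1]           # tokens by descending probability
    cum = np.cumsum(probs[order])
    cutoff = np.searchsorted(cum, p) + 1      # number of tokens kept
    out = np.full_like(logits, -np.inf)
    keep = order[:cutoff]
    out[keep] = logits[keep]
    return out

logits = np.array([2.0, 1.0, 0.5, -1.0, -3.0])
kept_k = int(np.isfinite(top_k_filter(logits, 2)).sum())    # 2 tokens survive
kept_p = int(np.isfinite(top_p_filter(logits, 0.9)).sum())  # smallest 90% nucleus
```

In a full decoder, the filtered logits would be renormalized with a softmax and a token sampled from the surviving set; greedy search is the limiting case k = 1.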
Table 3. Capabilities of language models at different levels.
| Level | Category | Capabilities |
|---|---|---|
| Basic | Text Processing | Summarization; Simplification; Sentiment Analysis; Named Entity Recognition; Topic Modeling; Text Classification; Keyword Extraction |
| Basic | Text Generation | Simple Text Continuation; Basic Generative Writing |
| Basic | Language Tasks | Translation (common language pairs); Language Identification; Basic Grammar Correction |
| Basic | Simple Question Answering | Boolean QA; Basic Multi-choice QA |
| Intermediate | Advanced Text Understanding | Reading Comprehension; Contextual Understanding; Inference Generation |
| Intermediate | Advanced Text Generation | Long-form Content Creation; Code Generation; Story Generation with Consistency |
| Intermediate | Complex Reasoning and QA | Common Sense Reasoning; Arithmetic Problem Solving; Logical Reasoning; Step-by-step Problem Solving; Open-ended QA; Multi-hop Reasoning QA |
| Intermediate | Task Understanding and Execution | Complex Instruction Following; Task Definition from Examples; Few-shot Learning; Zero-shot Learning |
| Advanced | Knowledge Integration and Reasoning | Knowledge Graph Construction and Querying; External Knowledge Base Utilization; Symbolic Reasoning; Causal Reasoning; Analogical Reasoning |
| Advanced | Tool Use and Planning | Function Calling; API Integration; Tool Planning and Selection; Task Decomposition |
| Advanced | Simulation and Acting | Physical Acting Simulation; Advanced Virtual Acting; Complex Role-playing; Scenario Generation and Analysis |
| Advanced | Multimodal Capabilities | Image Understanding and Generation; Audio Processing and Generation; Video Analysis; Cross-modal Reasoning |
| Advanced | Advanced System Integration | Autonomous Agent Behavior; Multi-agent Collaboration; Integration with Robotic Systems |
Table 4. Application areas of LLMs in urban data analytics.
| Domain | Count | Representative Applications | Related Papers |
|---|---|---|---|
| Transportation | 59 | Autonomous Driving, Traffic Modeling and Analysis, Traffic Safety, Transportation Planning and Management, and Driving and Travel Behavior | [4,6,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88] |
| Urban Development and Planning | 21 | Road Network Generation, Urban Renewal, Urban Itinerary Planning, Urban Footprints, and Urban Region Profiling | [10,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103,104,105,106,107,108] |
| Disaster Management | 14 | Flood Detection, Disaster Response, Earthquake Impact Analysis, Typhoon Disaster Management, and Disaster Management | [7,109,110,111,112,113,114,115,116,117,118,119,120,121] |
| Social Dynamics | 13 | Social Psychology, Crime Prediction, Emergency Management, Human Behavior Simulation, Social Networks, and Opinion Prediction | [122,123,124,125,126,127,128,129,130,131,132,133,134] |
| Tourism | 9 | Tourism Marketing, Tourism Chatbot, e-Tourism, and Travel Planning | [135,136,137,138,139,140,141,142,143] |
| Environmental Science | 5 | Weather Forecasting, Climate Science, and City Environment | [144,145,146,147,148] |
| Economy | 3 | Economic Simulation, Firm Competition and Collusion, and Macroeconomic Simulation | [149,150,151] |
| Public Health | 3 | Health Policy Analysis and Health Determinants | [152,153,154] |
| Urban Building Energy | 3 | Building Energy Systems, Building Energy Modeling, and Power Systems | [155,156,157] |
| Others (mainly technology-focused) | 48 | Remote Sensing, Geospatial Analytics, Time-series Analysis, Geo-entity Understanding, Virtual Intelligence, Information Retrieval, and Cartography | [1,3,5,158,159,160,161,162,163,164,165,166,167,168,169,170,171,172,173,174,175,176,177,178,179,180,181,182,183,184,185,186,187,188,189,190,191,192,193,194,195,196,197,198,199,200,201,202] |
Table 6. Purposes and applications of LLM-oriented data generation in urban analytics.
| Aspect | Purpose | Description | Examples |
|---|---|---|---|
| Pre-Modeling | Scenario Creation | Enables the generation of specific urban scenarios, such as traffic conditions or public events, for simulation and analysis purposes. | [68,70,77] |
| Pre-Modeling | Data Augmentation | Increases the diversity and volume of data available for model training, helping to overcome data sparsity or imbalance issues in urban datasets. | [72,115] |
| Pre-Modeling | Creation of Multimodal Datasets | Converts unstructured textual data into structured formats or combines different data modalities, facilitating further analysis or enhancing model training in urban contexts. | [99,156,162] |
| Post-Modeling | Scenario Testing | Tests urban models under various generated scenarios to evaluate performance under different conditions, ensuring robustness and reliability. | [73,74,79] |
| Post-Modeling | Interpretation and Explanation | Provides clear, human-readable explanations of complex urban model outputs, enhancing understanding and decision-making. | [106,196] |
| Post-Modeling | Suggestion Generation | Generates data-driven insights and recommendations for urban planning and policy-making based on model outputs, supporting evidence-based decision processes. | [106,141] |
Table 7. LLM-based solutions for preprocessing issues in urban data analytics.
| Issue Category | Specific Issue | LLM-Based Solution and Description | Examples |
|---|---|---|---|
| Data Quality Issues | Data Inconsistency | Text Refinement: Using LLMs to standardize textual descriptions and remove inconsistencies. | Urban region profiling [99]. Human mobility prediction [70]. |
| Data Quality Issues | Irrelevant Information | Relevant Feature Identification: Focusing on critical elements while filtering out irrelevant information. | Driver behavior analysis [37]. Semantic footprint mapping [107]. |
| Data Quality Issues | Labeling Inefficiency | Intelligent Automated Annotation: Using LLMs to automatically label or categorize large datasets, improving efficiency. | Social media sentiment analysis [145]. Social network simulation [133]. |
| Data Quality Issues | Missing Data | Contextual Imputation: Leveraging LLMs to infer and fill in missing data points based on context. | Traffic data imputation [72]. Urban sensor network data completion [175]. |
| Data Representation Issues | Unstructured Data | Structured Information Extraction: Employing techniques like Auto-CoT to derive structured data from unstructured inputs. | Event–event relation extraction [126]. Urban itinerary planning [103]. |
| Data Representation Issues | Data Incompatibility | Contextual Data Transformation: Guiding data transformations to suit specific spatial or temporal requirements. | Geospatial knowledge extraction [104]. Time-series forecasting [187]. |
| Data Representation Issues | Multimodal Data Misalignment | Cross-Modal Alignment: Generating textual descriptions for non-textual data to facilitate integration. | Satellite image text retrieval [97] and 3D scene understanding [105]. |
| Data Dimensionality Issues | High-Dimensional Sparse Data | Semantic Compression: Using LLMs to generate lower-dimensional, semantically rich representations of high-dimensional data. | Urban mobility pattern analysis [42]. Spatial–temporal event modeling [110]. |
| Data Dimensionality Issues | Lack of Contextual Information | Context Enhancement: Generating detailed, context-rich descriptions to augment low-dimensional data. | Autonomous driving in unstructured scenarios [78] and trajectory prediction [41]. |
| Data Distribution Issues | Data Scarcity | Synthetic Data Generation: Extracting information and generating new data points to enrich limited datasets. | Urban renewal knowledge base creation [100]. Post-disaster image captioning [115]. |
| Data Distribution Issues | Imbalanced Data | Minority-Class Augmentation: Using LLMs to generate synthetic examples for under-represented classes. | Traffic accident analysis [46] and rare-event prediction in urban systems [120]. |
Table 8. Representative content-wise prompting techniques for urban data analytics.
| Strategy | Technique | Example |
|---|---|---|
| Role and Simulation | Role-Playing Prompts | Facilitating participatory urban planning by designing prompts aligned with specific roles (e.g., planners and residents) [106]. |
| Role and Simulation | Scenario Simulation Prompts | Improving autonomous driving in complex environments by developing structured prompts for unstructured scenarios [78]. |
| Role and Simulation | Interview Simulation Prompts | Generating synthetic HCI research data by designing prompts that mimic real interview scenarios [165]. |
| Context Enhancement | Context-enhanced Prompts | Improving geospatial predictions by including coordinates, reverse-geocoded addresses, and nearby places [104]. |
| Context Enhancement | Information Integration | Enhancing traffic signal control by integrating real-time observations [64]. |
| Context Enhancement | Retrieval-Augmented Generation (RAG) | Improving typhoon disaster Q&A by integrating retrieved passages from external knowledge bases [120]. |
| Task Decomposition | Chain-of-Thought (CoT) Prompting | Enhancing interpretability and accuracy of human mobility predictions by guiding step-by-step reasoning [70]. |
| Task Decomposition | Few-Shot CoT | Building realistic driver agents using interview data from real drivers as few-shot prompts [79]. |
| Task Decomposition | Task-specific Prompts | Executing various remote sensing vision-language tasks using task-specific identifiers and instructions [161]. |
| Domain Knowledge Incorporation | Geo-guided Prompts | Improving extraction of location descriptions from social media by encoding geo-knowledge into question-answering statements [117]. |
| Domain Knowledge Incorporation | Expert-guided Prompts | Generating comprehensive flood reports by developing prompts with feedback from domain experts [116]. |
| Domain Knowledge Incorporation | Domain-specific Prompts | Extracting relevant taxonomies for smart city projects using prompts with domain-specific terminology and rules [101]. |
| Robustness and Optimization | Robustness Testing Prompts | Assessing LLMs’ ability to recall spatial and temporal information in different contexts [194]. |
| Robustness and Optimization | Iterative Improvement Prompts | Optimizing power system solutions by using historical solution–cost pairs for iterative improvements [157]. |
| Robustness and Optimization | Parameter-inclusive Prompts | Generating accurate traffic scenario files by including necessary parameters in prompts [68]. |
Table 9. Common LLM agent applications in urban data analytics.
| Purpose/Use Case | Description | Examples |
|---|---|---|
| Task Planning and Automation | LLM agents automate complex tasks by breaking them down into subtasks, planning execution sequences, and coordinating multiple agents or systems. | Multi-agent collaboration [122], autonomous driving [32], disaster response planning [111], and travel planning [139]. |
| Knowledge Retrieval and Information Search | LLM agents leverage extra knowledge bases to quickly retrieve and synthesize relevant information from various sources, enhancing decision-making in urban contexts. | Flood detection [109], remote sensing analysis [158], geospatial data analysis [160], and traffic information systems [49]. |
| Human-like Reasoning and Decision Support | These agents mimic human cognitive processes to develop complex decision supports, considering multiple factors and potential outcomes in urban scenarios. | Urban planning and simulation [10], driver behavior analysis [37], macroeconomic simulations [149], and crime prediction [130]. |
| Natural Language Interaction | LLM agents facilitate natural language interactions between humans and urban systems, making complex information more accessible and user-friendly. | Vehicle co-pilot systems [44], conversational assistants for geospatial tasks [190], and interactive 3D scene understanding [105]. |
| Data Analysis and Forecasting | These agents analyze large datasets to identify patterns, make predictions, and generate insights for urban planning and management. | Time-series forecasting [184], next POI recommendation [185], traffic performance analysis [58], and firm competition modeling [150]. |
| Multimodal Integration | LLM agents combine various data types (text, images, and 3D models) to create more comprehensive and intuitive urban modeling and navigation systems. | Vision-and-language navigation [189], 3D city generation [90], and traffic signal control [57]. |
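The task-planning pattern underlying many of these agent applications can be sketched as a planner that decomposes a request into tool calls whose outputs are chained. In this minimal illustration the plan is hard-coded and the tools are stubs; a deployed agent would obtain the plan from the LLM itself and dispatch to real retrieval or analysis services.

```python
# Stub tool registry: names and behaviors are illustrative, not a real API.
TOOLS = {
    "retrieve_traffic": lambda q: f"traffic data for {q}",
    "summarize": lambda text: f"summary({text})",
}

def plan(task):
    """Return an ordered list of (tool, argument) steps.
    A real agent would ask the LLM for this plan; here it is canned."""
    return [("retrieve_traffic", task), ("summarize", None)]

def run_agent(task):
    """Execute the plan, feeding each step's output into the next."""
    result = None
    for tool_name, arg in plan(task):
        tool = TOOLS[tool_name]
        result = tool(arg if arg is not None else result)  # chain outputs
    return result

out = run_agent("downtown corridor")
```

Frameworks such as AutoGen [209] generalize this loop to multiple conversing agents, but the decompose-dispatch-chain skeleton is the same.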
Table 10. Common Fine-tuning Techniques in Urban Analytics Research.
| Fine-Tuning Technique | Description | Examples |
|---|---|---|
| In-context fine-tuning (prompt engineering) | This technique involves crafting specific prompts to guide the model’s behavior without changing its parameters. It leverages the model’s existing knowledge to perform tasks in a zero-shot or few-shot learning manner. | Empowering autonomous agents for urban tasks [122] and adapting models for localized weather forecasting [144]. |
| Supervised fine-tuning | This approach involves fine-tuning a pre-trained model on a labeled dataset specific to the target task. It allows the model to adapt its knowledge to a particular domain or application. | Enhancing tourism marketing text generation [135], developing transit-aware language models [34], and improving building energy modeling [156]. |
| Instruction fine-tuning | This method involves fine-tuning a pre-trained model on a dataset of instructions and their corresponding outputs. It helps the model better understand and follow specific instructions for various tasks. | Aligning urban models with human preferences [10] and improving traffic signal control through imitation learning [64]. |
| Parameter-Efficient Fine-Tuning (PEFT) | PEFT techniques aim to adapt pre-trained models to new tasks while updating only a small subset of the model’s parameters. This approach reduces computational resources and storage requirements. | Analyzing driver behavior [37], enhancing urban renewal QA systems [100], and predicting traffic flow patterns [63]. |
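PEFT methods such as LoRA can be illustrated in a few lines: the frozen weight matrix is augmented with a trainable low-rank update B·A, so only r·(d_in + d_out) parameters are trained instead of d_in·d_out. This is a minimal NumPy sketch with arbitrarily chosen dimensions; with B initialized to zero, the adapted layer initially reproduces the base model exactly.

```python
import numpy as np

def lora_forward(x, W, A, B, alpha=16):
    """Forward pass of a linear layer with a LoRA adapter:
    y = x W^T + (alpha/r) * x (B A)^T, where W is frozen and
    only the low-rank factors A (r x d_in) and B (d_out x r) train."""
    r = A.shape[0]
    return x @ W.T + (alpha / r) * (x @ A.T @ B.T)

rng = np.random.default_rng(0)
d_in, d_out, r = 64, 32, 4
W = rng.normal(size=(d_out, d_in))      # frozen pre-trained weight
A = rng.normal(size=(r, d_in)) * 0.01   # trainable down-projection
B = np.zeros((d_out, r))                # trainable up-projection (zero init)
x = rng.normal(size=(1, d_in))

base_out = x @ W.T
adapted_out = lora_forward(x, W, A, B)

# Trainable fraction: 4*(64+32) = 384 parameters vs. 64*32 = 2048 in W.
trainable = r * (d_in + d_out)
```

During fine-tuning only A and B receive gradients; at deployment the product B·A can be merged back into W, so inference cost is unchanged.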
Table 11. Urban foundation models: types and examples.
| Model Type | Description | Examples |
|---|---|---|
| Language-based Models | Foundation models trained on vast text corpora, adapted for the understanding and generation of human-like text in urban contexts. | Autonomous vehicle interactions [32], urban planning [89], and geospatial tasks [174]. |
| Vision-related Models | Models that process, analyze, or generate visual information or integrate visual and textual data for urban applications. | Map generation [177], urban scene understanding [158], and remote sensing [173]. |
| Time-Series Models | Models specialized in analyzing sequential data over time, adapted for urban-related temporal patterns. | Flood detection [109], weather forecasting [144], and urban trend analysis [170]. |
| Spatiotemporal Models | Models that integrate both spatial and temporal dimensions, crucial for understanding dynamic urban phenomena. | Spatio-temporal prediction [94], path planning [92], and geospatial analysis [178]. |
| Text-to-3D Models | Emerging models that translate textual descriptions into three-dimensional representations, applicable to urban design and planning. | 3D city generation [90]. |
| Multimodal Models | Models that integrate multiple data types (e.g., text, image, and geospatial data) for comprehensive urban analysis. | N/A; mostly agents [171]. |
Table 12. Evaluation aspects of large language models in urban data analytics.
| Category | Evaluation Aspect | Description | Application Example |
|---|---|---|---|
| Model-oriented | Quality of Generated Data | Assessing the accuracy and realism of synthetic data produced by LLMs. | Evaluating the realism of AI-generated driving scenarios using metrics like collision rates and emergency braking incidents [77]. |
| Model-oriented | Prediction Performance | Measuring the accuracy of LLM predictions on various urban tasks. | Assessing traffic flow prediction accuracy using metrics like MAE and RMSE [63]. |
| Model-oriented | Task Completion | Assessing the ability of LLMs to successfully complete complex urban tasks. | Measuring the success rate of LLM agents in completing Minecraft tasks as a proxy for urban planning scenarios [179]. |
| Model-oriented | Robustness | Evaluating LLM performance under varying conditions or with noisy input data. | Testing LLM-based mobility prediction models on days with and without public events [70]. |
| Model-oriented | Generalization | Measuring LLM performance on unseen data or in new urban environments. | Evaluating the performance of urban cross-modal retrieval models across different cities [97]. |
| Model-oriented | Efficiency | Measuring computational resources and time required for LLM urban analytics tasks. | Comparing the efficiency of LLM-based approaches to traditional optimization methods for delivery route planning [50]. |
| Human-oriented | Interpretability | Evaluating the explainability of LLM decisions and outputs. | Assessing the quality and relevance of explanations provided by LLMs for traffic control decisions [57]. |
| Human-oriented | Human–AI Collaboration | Evaluating how well LLMs support and enhance human decision-making in urban contexts. | Assessing the impact of ChatGPT assistance on human performance in traffic simulation tasks [35]. |
| Human-oriented | Ethical Considerations | Assessing LLM adherence to ethical guidelines and potential biases in urban applications. | Evaluating LLM-generated content for potential biases in tourism marketing [135]. |
Table 13. Common evaluation metrics for LLMs and AI agents.
| Category | Metric | Description | Examples |
| --- | --- | --- | --- |
| Quantitative | Accuracy | Measures the proportion of correct answers out of all predictions for tasks such as question answering or text classification. | GPT-4 GEO accuracy [167]. |
| | Exact Matching | Calculates the percentage of predictions that perfectly match the ground truth, often used in tasks requiring precise outputs. | Exact matching in accident automation tasks [46]. |
| | ML/DL Metrics | Common machine learning and deep learning metrics such as precision, recall, and F1 score. | F1 score in event extraction [126]. |
| | Response Time | Measures how quickly the model responds to inputs, which is crucial for real-time applications and user experience. | Inference time in driving behavior analysis [37]. |
| | Consistency | Evaluates the variance in model outputs when inputs are presented in different formats, assessing robustness and reliability. | Consistency in spatial queries [160]. |
| | Task-specific Metrics | User-defined custom metrics tailored to individual tasks, such as performance indicators in simulations or domain-specific benchmarks. | Collision rate in autonomous driving [79]. |
| Qualitative | Expert Review | Assessment of model outputs by domain experts, providing insights into accuracy, relevance, and quality that quantitative metrics may not capture. | IPCC authors evaluating climate responses [146]. |
| | User Satisfaction | Surveys or feedback from users interacting with the LLM or LLM agent to gauge satisfaction levels, usability, and perceived effectiveness. | User preference scores for city layouts [90]. |
| | Case Studies | In-depth analysis of specific examples or scenarios to understand model behavior in detail, often revealing nuances in performance and decision-making. | Simulated case study for building HVAC control [155]. |
| | Ethical Assessment | Evaluation of the model’s adherence to ethical guidelines and safety constraints, including tests for bias, fairness, and potential harmful outputs. | Bias score for geographic biases [127]. |
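To make the quantitative metrics in Table 13 concrete, the sketch below computes exact matching and binary precision/recall/F1 over paired predictions and ground-truth labels. The function names and the example data are illustrative, not taken from any of the reviewed studies:

```python
def exact_match(preds, refs):
    """Fraction of predictions identical to the ground truth (after trimming whitespace)."""
    return sum(p.strip() == r.strip() for p, r in zip(preds, refs)) / len(refs)

def precision_recall_f1(preds, refs, positive="yes"):
    """Binary precision, recall, and F1 for a single positive label."""
    tp = sum(p == positive and r == positive for p, r in zip(preds, refs))
    fp = sum(p == positive and r != positive for p, r in zip(preds, refs))
    fn = sum(p != positive and r == positive for p, r in zip(preds, refs))
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return precision, recall, f1

# Hypothetical evaluation of LLM outputs against labeled data.
em = exact_match(["flooded", "clear"], ["flooded", "blocked"])   # 0.5
p, r, f = precision_recall_f1(["yes", "yes", "no"], ["yes", "no", "yes"])
```

In practice, libraries such as scikit-learn provide equivalent, more robust implementations; the point here is only to show what each metric measures.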
Table 14. Common post-analysis techniques using LLMs in urban analytics.
| Major Features | Techniques | Application Examples |
| --- | --- | --- |
| Interactivity | QA System | Answering queries about flood situations in real time [109] and providing explanations for traffic control decisions [57]. |
| | Automatic Survey | Assessing customer satisfaction in tourism [138] and analyzing public opinion on nuclear power [145]. |
| Accessibility | Report Generation | Generating traffic advisory reports [58] and creating detailed ecological construction reports [105]. |
| | Result Visualization | Visualizing geospatial trends in transit feedback [34] and creating maps and charts for COVID-19 death-rate analysis [164]. |
| | Result/Decision Explanation | Providing reasoning for traffic signal control decisions [57] and explaining predictions in autonomous driving [48]. |
| Decision Support | Scenario/Policy Simulation | Simulating disaster scenarios for education [113] and generating activity patterns under different conditions [59]. |
| | Decision Analysis | Assisting in traffic management and urban planning decisions [60] and supporting policy-making in health and disaster management [152]. |
| | Personalized Recommendations | Generating tailored travel itineraries [171] and providing personalized emergency guidance [113]. |
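Several of the post-analysis techniques in Table 14, such as report generation and decision explanation, reduce in practice to constructing a structured prompt around analytical outputs before sending it to an LLM. The sketch below shows one hypothetical prompt builder; the function name, section headings, and example findings are illustrative assumptions, not an interface from the reviewed studies:

```python
def build_report_prompt(domain, findings, audience="city planners"):
    """Assemble a structured prompt asking an LLM to draft an analytical report."""
    bullet_list = "\n".join(f"- {f}" for f in findings)
    return (
        f"You are an urban analytics assistant writing for {audience}.\n"
        f"Domain: {domain}\n"
        f"Key findings:\n{bullet_list}\n"
        "Write a concise report with sections: Summary, Findings, Recommendations."
    )

# Hypothetical usage: the returned string would be passed to any chat-style LLM API.
prompt = build_report_prompt("transportation", ["peak-hour congestion rose 12%"])
```

Keeping prompt assembly separate from the model call in this way makes the pipeline easy to audit, which matters for the accessibility and decision-support uses listed above.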
Table 15. Future directions for LLM agents in urban data analytics.
| Future Direction | Description |
| --- | --- |
| Enhanced Automation and Adaptivity | Development of multi-agent systems for automated geographical information systems, urban data analytics, and spatial problem-solving. |
| Enhanced Human–AI Collaboration | Exploring new paradigms for effective collaboration between human experts and LLM agents in urban decision-making processes. |
| Multi-Agent Collaboration | Developing more sophisticated multi-agent frameworks that leverage LLMs for better collaboration and decision-making in complex urban scenarios. |
| Realistic Simulations and Scenario Generation | Use of LLM agents to create more realistic and complex simulations of urban environments and human behavior. |
| Real-time Adaptation | Creating LLM agents that can quickly adapt to changing urban conditions and provide real-time insights and recommendations. |
| Integration with IoT and Sensor Networks | Combining LLM capabilities with data from IoT devices and urban sensor networks for more comprehensive urban analytics. |
| Cross-domain Agent Collaboration | Enabling LLM agents to transfer knowledge across different urban domains (e.g., transportation, energy, and healthcare) for more holistic urban management. |
| Cross-lingual and Cultural Adaptation | Developing LLM agents that can effectively operate across different languages and cultural contexts in diverse urban environments. |
| Interpretable and Transparent AI | Focus on creating LLM-based multi-agent systems that can explain their decisions and actions, enhancing trust and usability. |
| Ethical AI and Bias Mitigation | Addressing ethical concerns and mitigating biases in LLM-based urban decision-making processes. |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

Jiang, F.; Ma, J.; Jin, Y. Unleashing the Potential of Large Language Models in Urban Data Analytics: A Review of Emerging Innovations and Future Research. Smart Cities 2025, 8, 201. https://doi.org/10.3390/smartcities8060201
