LLM4GIS: Large Language Models for GIS

Special Issue Editors


Prof. Dr. Huayi Wu
Guest Editor
State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, Wuhan, China
Interests: GeoAI; large language models; geospatial code generation; geospatial modeling

Prof. Dr. Zhipeng Gui
Guest Editor
School of Remote Sensing and Information Engineering, Wuhan University, Wuhan, China
Interests: GeoAI; spatiotemporal analysis; big data mining; web GIS; geoprocessing workflow modeling

Special Issue Information

Dear Colleagues,

The integration of large language models (LLMs) into geographic information systems (GISs) marks a transformative shift in GeoAI. As the volume and complexity of spatial data continue to grow, LLMs offer a new paradigm for automating spatial analysis, enabling natural language interactions, and enhancing analytical reasoning within GIS environments. By leveraging recent advances in prompt engineering, retrieval-augmented generation (RAG), autonomous agents, and multi-agent frameworks, LLMs significantly extend the usability, scalability, and adaptability of GIS platforms across technical and non-technical user groups.

Despite the promising potential of LLMs in GIS, their deep integration still faces multiple challenges. Spatial data are inherently heterogeneous, structurally complex, and semantically layered, limiting the generalization capabilities of LLMs in understanding geographic entities, coordinate systems, and spatial contexts. Moreover, GIS analysis often involves multi-step, hierarchical tasks, while user intentions—especially from non-expert users—tend to be vague, discontinuous, or imprecise in natural language. LLMs must not only accurately infer such complex spatial intentions but also assist users in clarifying their goals through task reformulation, semantic guidance, and parameter suggestions. From a technical perspective, challenges remain in designing spatially aware token representations, integrating multimodal geospatial inputs (e.g., imagery, remote sensing, and sensor data), modeling spatial relations and topology within contextual embeddings, and improving reasoning controllability and interpretability without sacrificing spatial precision. Importantly, different GIS applications involve distinct underlying scientific problems, including trajectory modeling and prediction in transportation, map language understanding and visual semantics in cartography, and syntax design, workflow reasoning, and metadata integration in geospatial code generation. These diverse tasks impose higher demands on the expressive, inferential, and knowledge integration capabilities of LLMs, while also opening up rich avenues for research.

Given the above challenges, this Special Issue focuses on innovative research on LLMs in GIS, welcoming contributions that range from foundational methodologies to real-world implementations. Topics of interest include intelligent spatial data processing (such as automated spatial data cleaning and transformation, as well as intelligent geospatial report generation); multimodal data fusion (integrating imagery, sensor data, and textual information to enable the unified modeling of complex spatial semantics); advances in spatial analysis and interaction (including LLM-enhanced spatial analysis methods and natural language interfaces for simplifying GIS operations); applications of LLMs in geospatial decision support systems (covering domains such as urban planning, environmental monitoring, and climate modeling); model optimization strategies for geospatial tasks (such as domain-specific token design, fine-tuning methods, and alignment with spatial ontologies); and a range of emerging interdisciplinary directions, including geospatial code generation, map generation from textual input, trajectory forecasting, and knowledge-driven spatial dialogue systems.

We invite interdisciplinary contributions from artificial intelligence, spatial data science, data visualization, geospatial modeling, urban planning, and other relevant fields, aiming to drive the deep integration of LLMs and GIS. Topics to be considered include (but are not limited to) the following:

  • LLMs for GIS data processing;
  • LLMs for geospatial report generation;
  • LLMs for multimodal GIS integration;
  • LLMs for enhancing spatial data accessibility;
  • LLMs in urban planning and environmental analysis;
  • LLMs for natural language querying in real-time GISs;
  • LLMs for improving GIS user experience;
  • LLMs for GIS map generation;
  • LLMs for geospatial question answering and intelligent querying;
  • LLMs for geospatial code generation;
  • LLMs for trajectory prediction and spatiotemporal analysis;
  • LLMs for spatial database querying and NL interfaces;
  • LLMs for causal inference in geospatial analysis;
  • LLMs for explainable spatial reasoning;
  • LLMs for user intent analysis and geospatial task decomposition.

We invite researchers in these fields to submit original research, methodological papers, case studies, or review articles. This Special Issue aims to highlight the significant applications of LLMs in GIS analysis and provide a platform for academic exchange and best practices in this innovative area.

Prof. Dr. Huayi Wu
Prof. Dr. Zhipeng Gui
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, proceed to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the Special Issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 250 words) can be sent to the Editorial Office for assessment.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. ISPRS International Journal of Geo-Information is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1900 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • large language models (LLMs)
  • GeoAI
  • multi-agent collaboration
  • geospatial knowledge embedding and enhancement
  • multimodal data fusion
  • geospatial data processing
  • geospatial code generation
  • GIS report generation
  • spatial analysis and urban planning

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad-scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • Reprint: MDPI Books provides the opportunity to republish successful Special Issues in book format, both online and in print.

Further information on MDPI's Special Issue policies is available on the MDPI website.

Published Papers (7 papers)


Research

24 pages, 3010 KB  
Article
Retrieval-Augmented Generation-Based Earth Surface System Association Network Optimization and Data Recommendation
by Jiangbing Sun, Yan Zhang, Longxing Tian, Jiali Li, Miao Tian, Jie Chen, Liufeng Tao and Qinjun Qiu
ISPRS Int. J. Geo-Inf. 2026, 15(5), 199; https://doi.org/10.3390/ijgi15050199 - 2 May 2026
Viewed by 362
Abstract
The scientific data of the Earth surface system is characterized by multi-source heterogeneity and dynamic correlation, so constructing an efficient data association network and enabling intelligent knowledge services is a hot topic. Nevertheless, confronted with the existing challenges of onerous data acquisition, inadequate precision of data recommendation, excessive time and labor consumption, as well as insufficient semantic reasoning in intelligent question-and-answer (Q&A) systems, we propose an intelligent framework that integrates dynamic optimization and retrieval-augmented generation (RAG) technology to address the problems of strong subjectivity in the setting of edge weight thresholds in association networks and insufficient semantic inference in intelligent Q&A. First, a multidimensional association network is constructed based on metadata features, redundant edge pruning is achieved through dynamic threshold analysis, and key nodes are identified by combining complex network centrality theory to optimize network structure and storage efficiency. Secondly, the RAG-based intelligent Q&A model is designed to transform the association triples into a paragraph-based knowledge base, generate a domain Q&A dataset using a large language model GPT-4o, and fine-tune the word embedding model to improve the semantic representation accuracy. Experiments show that the number of network edges is reduced by about 70% after optimization, and the node importance analysis accurately identifies key data nodes; the fine-tuned model improves each index by 6% on average in the retrieval task, and the Q&A system significantly outperforms the traditional method in terms of indexes such as relevance and completeness. This study provides innovative solutions for the intelligent service of scientific data in Earth surface systems and promotes the deep integration of association networks and generative AI.
(This article belongs to the Special Issue LLM4GIS: Large Language Models for GIS)

19 pages, 4539 KB  
Article
Urban Housing Conflicts in Large Canadian Cities: A Spatio-Temporal and Semantic Analysis Using Large Language Models
by Catherine Trudelle, Christophe Claramunt, Eliott Libner and Rodolphe Gonzales
ISPRS Int. J. Geo-Inf. 2026, 15(5), 193; https://doi.org/10.3390/ijgi15050193 - 1 May 2026
Viewed by 375
Abstract
This paper introduces a comparative analysis of urban housing conflicts across eight major Canadian cities, Toronto, Vancouver, Québec, Ottawa, Calgary, Edmonton, St. John’s, and Halifax, over a 20-year period. Using Large Language Models (LLMs), we implement a structured workflow to extract, classify, and organize more than one thousand conflict instances from diverse textual sources, including municipal reports, media archives, and non-governmental organization publications. The methodological contribution lies in demonstrating how an LLM-assisted pipeline, combining schema-based extraction, prompt perturbation, and a two-phase calibration procedure, can generate structured, multi-city conflict datasets while addressing challenges such as output homogenization and sensitivity to prompt design. The findings highlight both shared national tendencies and city-specific configurations with post-2020 conflicts intensifying. Overall, the study proposes a transparent workflow for applying LLMs to conflict-related text analysis and offers an exploratory overview of the spatial, temporal, and semantic regularities of housing conflicts in Canadian cities.

23 pages, 1894 KB  
Article
Theft Address Extraction and Classification from Chinese Judicial Documents Based on Large Language Model
by Zengli Wang, Xiang Li, Xiaoping Rui, Linfang Ding and Jingjing Li
ISPRS Int. J. Geo-Inf. 2026, 15(3), 124; https://doi.org/10.3390/ijgi15030124 - 13 Mar 2026
Viewed by 695
Abstract
Judicial documents have become a significant data source for crime geography research, offering advantages in accessibility and scale compared to highly restricted police-recorded crime data. However, extracting crime addresses from these texts is challenging due to sparse, inconsistent, and incomplete address information. Without proper classification, errors in geocoding and spatial analysis can arise, compromising data quality. To address these limitations, we employed large language models (LLMs) and a structured prompt engineering strategy tailored for this task. Specifically, we propose a fine-tuned LLM, named CAEC_LLM, to extract addresses from judicial documents and classify these crime addresses at various categories with different spatial scales. Experimental results demonstrate that the model achieved an F1-score of 0.79 for address extraction and a classification accuracy of up to 0.74 for the best-performing category, significantly outperforming other LLMs. This study makes two primary contributions: (1) designing an address classification scheme specifically for crime addresses, and (2) developing a fine-tuned LLM for extracting and classifying crime addresses from Chinese judicial documents, enabling LLMs to be used to classify crime addresses into different categories on a spatial scale. These advancements facilitate more accurate crime pattern analysis and data-driven urban planning.

24 pages, 7660 KB  
Article
Reasoning over Heterogeneous Geospatial Schemas: Aligning Authoritative Taxonomies and Collaborative Folksonomies Through Large Language Models
by Fabíola Andrade Souza and Silvana Philippi Camboim
ISPRS Int. J. Geo-Inf. 2026, 15(2), 87; https://doi.org/10.3390/ijgi15020087 - 18 Feb 2026
Viewed by 780
Abstract
Semantic interoperability remains a critical challenge in Spatial Data Infrastructures (SDIs), particularly when aligning authoritative taxonomies with collaborative folksonomies. Traditional alignment tools often fail to bridge the semantic and structural asymmetry between these schemas. This paper evaluates the capability of Large Language Models (LLMs), specifically distinguishing between traditional architectures and emerging Large Reasoning Models (LRMs), to perform semantic alignment between the Brazilian national topographic data model standard (EDGV) and OpenStreetMap (OSM). Using a formal ontology as a prompting scaffold, we tested seven model versions (including ChatGPT 5, DeepSeek R1, and Gemini 2.5) on their ability to bridge the gap between rigid hierarchical classes and the dynamic, ‘long-tail’ vocabulary of the folksonomy. Results reveal a distinct trade-off: while traditional LLMs exhibited ‘lexical rigidity’ and popularity bias—failing to map low-frequency tags—Reasoning Models demonstrated significantly improved capacity for semantic expansion, correctly identifying complex many-to-one (n:1) relationships across linguistic barriers. However, this reasoning depth often came at the cost of ‘hallucination by over-specification’ and syntactic instability in generating OWL code. We conclude that a neuro-symbolic approach, positioning LRMs as ‘Semantic Catalysts’ within a Human-in-the-Loop (HITL) workflow, provides a viable pathway for interoperability, balancing generative power with the need for logical rigor and spatial validation.

26 pages, 2554 KB  
Article
Semi-Automated Reporting from Environmental Monitoring Data Using a Large Language Model-Based Chatbot
by Angelica Lo Duca, Rosa Lo Duca, Arianna Marinelli, Donatella Occhiuto and Alessandra Scariot
ISPRS Int. J. Geo-Inf. 2026, 15(2), 80; https://doi.org/10.3390/ijgi15020080 - 14 Feb 2026
Viewed by 803
Abstract
Producing high-quality analytical reports for the environmental domain is typically time-consuming and requires significant human expertise. This paper describes MeteoChat, a semi-automatic framework for efficiently generating specialized environmental reports from heterogeneous environmental data. MeteoChat utilizes a Large Language Model (LLM) fine-tuned and integrated with Retrieval-Augmented Generation (RAG). The system’s core is its plug-and-play philosophy, which separates analytical reasoning from the data source and the report’s intended audience. The fine-tuning phase uses data-agnostic, parameterized question–context–answer triples defined by an environmental expert to teach the LLM domain-specific analytical logic and audience-appropriate communication styles. Subsequently, the RAG phase integrates the model with actual datasets, which are processed via an Extract–Transform–Load (ETL) workflow to generate statistical summaries. This architectural separation ensures that the same reporting engine can operate on different sources, such as meteorological time series, satellite imagery, or geographical data, without additional training. Users interact with the system via a web-based conversational interface, where responses are tailored for either technical experts (using explicit calculations and tables) or the general public (using simplified, narrative language). MeteoChat has been tested with real data extracted from the micrometeorological network of ARPA Lazio.

28 pages, 8127 KB  
Article
CARAG: Context-Aware Retrieval-Augmented Generation for Railway Operation and Maintenance Question Answering over Spatial Knowledge Graph
by Wenkui Zheng, Mengzheng Yang, Yanfei Ren, Haoyu Wang, Chun Zeng and Yong Zhang
ISPRS Int. J. Geo-Inf. 2026, 15(2), 78; https://doi.org/10.3390/ijgi15020078 - 14 Feb 2026
Cited by 1 | Viewed by 1027
Abstract
General-purpose large language models excel at open-domain question answering, but in railway operation and maintenance (O&M) scenarios they still suffer from hallucinated knowledge and poor domain adaptation. In practice, railway O&M knowledge mainly arises from two heterogeneous sources: spatio-temporal data such as train trajectories, which are organized along the spatial layout of railway lines, and domain documents such as operating rules, which exhibit varying degrees of structural regularity. Traditional retrieval-augmented generation (RAG) systems usually flatten these multi-source data into a single unstructured text space and perform global retrieval in one embedding space, which easily introduces noisy context and makes it difficult to precisely target knowledge for specific lines, sections, or equipment states. To overcome these limitations, we propose CARAG, a context-aware RAG framework tailored to railway O&M data. CARAG treats domain documents and spatial data as a unified knowledge substrate and builds a spatial knowledge graph with concept and instance levels. On top of this knowledge graph, a GraphReAct-based multi-turn interaction mechanism guides the LLM to reason and act over the concept knowledge graph, dynamically navigating to spatially and semantically relevant candidate regions, within which vector retrieval and instance-level graph retrieval are performed. Experiments show that CARAG significantly outperforms baseline RAG methods on RAGAS metrics, confirming the effectiveness of structure-guided multi-step reasoning for question answering over multi-source heterogeneous railway O&M data.

30 pages, 3799 KB  
Article
Geospatial Knowledge-Base Question Answering Using Multi-Agent Systems
by Jonghyeon Yang and Jiyoung Kim
ISPRS Int. J. Geo-Inf. 2026, 15(1), 35; https://doi.org/10.3390/ijgi15010035 - 8 Jan 2026
Viewed by 1370
Abstract
Large language models (LLMs) have advanced geospatial artificial intelligence; however, geospatial knowledge-base question answering (GeoKBQA) remains underdeveloped. Prior systems have relied on handcrafted rules and have omitted the splitting of datasets into training, validation, and test sets, thereby hindering fair evaluation. To address these gaps, we propose a prompt-based multi-agent LLM framework (based on GPT-4o) that translates natural-language questions into executable GeoSPARQL. The architecture comprises an intent analyzer, multi-grained retrievers that ground concepts and properties in the OSM tagging schema and map geospatial relations to the GeoSPARQL/OGC operator inventory, an operator-aware intermediate representation aligned with SPARQL/GeoSPARQL 1.1, and a query generator. Our approach was evaluated on the GeoKBQA test set using 20 few-shot exemplars per agent. It achieved 85.49 EM (GPT-4o) with less supervision than fine-tuned baselines trained on 3574 instances and substantially outperformed a single-agent GPT-4o prompt. Additionally, we evaluated GPT-4o-mini, which achieved 66.74 EM in a multi-agent configuration versus 47.10 EM with a single agent. The observations showed that the multi-agent gain was higher for the larger model. Our results indicate that, beyond scale, the framework’s structure is important; thus, principled agentic decomposition yields a sample-efficient, execution-faithful path beyond template-centric GeoKBQA under a fair, hold-out evaluation protocol.
