Search Results (218)

Search Parameters:
Keywords = semantic user interactions

15 pages, 1515 KiB  
Article
Ontology-Based Data Pipeline for Semantic Reaction Classification and Research Data Management
by Hendrik Borgelt, Frederick Gabriel Kitel and Norbert Kockmann
Computers 2025, 14(8), 311; https://doi.org/10.3390/computers14080311 (registering DOI) - 1 Aug 2025
Abstract
Catalysis research is complex and interdisciplinary, involving diverse physical effects and challenging data practices. Research data often capture only selected aspects, such as specific reactants and products, limiting their utility for machine learning and the implementation of FAIR (Findable, Accessible, Interoperable, Reusable) workflows. To improve this, semantic structuring through ontologies is essential. This work extends established ontologies by refining logical relations and integrating semantic tools such as the Web Ontology Language (OWL) and the Shapes Constraint Language (SHACL). It incorporates application programming interfaces from chemical databases, such as the Kyoto Encyclopedia of Genes and Genomes (KEGG) and the National Institutes of Health’s PubChem database. A key innovation lies in automatically decomposing chemical substances through database entries and chemical identifier representations to identify functional groups, enabling more generalized reaction classification. Using new semantic functionality, functional groups are flexibly addressed, improving the classification of reactions such as saponification and ester cleavage with simultaneous oxidation. A graphical user interface (GUI) supports user interaction with the knowledge graph, enabling ontological reasoning and querying. This approach demonstrates improved specificity of the newly established ontology over its predecessors and offers a more user-friendly interface for engaging with structured chemical knowledge. Future work will focus on expanding ontology coverage to support a wider range of reactions in catalysis research.

20 pages, 980 KiB  
Article
Dynamic Decoding of VR Immersive Experience in User’s Technology-Privacy Game
by Shugang Li, Zulei Qin, Meitong Liu, Ziyi Li, Jiayi Zhang and Yanfang Wei
Systems 2025, 13(8), 638; https://doi.org/10.3390/systems13080638 (registering DOI) - 1 Aug 2025
Abstract
The formation mechanism of Virtual Reality (VR) Immersive Experience (VRIE) is notably complex; this study aimed to dynamically decode its underlying drivers by innovatively integrating Flow Theory and Privacy Calculus Theory, focusing on Perceptual-Interactive Fidelity (PIF), Consumer Willingness to Immerse in Technology (CWTI), and the applicability of Loss Aversion Theory. To achieve this, we analyzed approximately 30,000 user reviews from Amazon using Latent Semantic Analysis (LSA) and regression analysis. The findings reveal that user attention’s impact on VRIE is non-linear, suggesting an optimal threshold, and confirm PIF as a central influencing mechanism; furthermore, CWTI significantly moderates users’ privacy calculus, thereby affecting VRIE, while Loss Aversion Theory showed limited explanatory power in the VR context. These results provide a deeper understanding of VR user behavior, offering significant theoretical guidance and practical implications for future VR system design, particularly in strategically balancing user cognition, PIF, privacy concerns, and individual willingness. Full article

23 pages, 1127 KiB  
Article
NOVA: A Retrieval-Augmented Generation Assistant in Spanish for Parallel Computing Education with Large Language Models
by Gabriel A. León-Paredes, Luis A. Alba-Narváez and Kelly D. Paltin-Guzmán
Appl. Sci. 2025, 15(15), 8175; https://doi.org/10.3390/app15158175 - 23 Jul 2025
Viewed by 556
Abstract
This work presents the development of NOVA, an educational virtual assistant designed for the Parallel Computing course, built using a Retrieval-Augmented Generation (RAG) architecture combined with Large Language Models (LLMs). The assistant operates entirely in Spanish, supporting native-language learning and increasing accessibility for students in Latin American academic settings. It integrates vector and relational databases to provide an interactive, personalized learning experience that supports the understanding of complex technical concepts. Its core functionalities include the automatic generation of questions and answers, quizzes, and practical guides, all tailored to promote autonomous learning. NOVA was deployed in an academic setting at Universidad Politécnica Salesiana. Its modular architecture includes five components: a relational database for logging, a vector database for semantic retrieval, a FastAPI backend for managing logic, a Next.js frontend for user interaction, and an integration server for workflow automation. The system uses the GPT-4o mini model to generate context-aware, pedagogically aligned responses. To evaluate its effectiveness, a test suite of 100 academic tasks was executed—55 question-and-answer prompts, 25 practical guides, and 20 quizzes. NOVA achieved a 92% excellence rating, a 21-second average response time, and 72% retrieval coverage, confirming its potential as a reliable AI-driven tool for enhancing technical education. Full article

19 pages, 782 KiB  
Article
On the Rate-Distortion Theory for Task-Specific Semantic Communication
by Jingxuan Chai, Huixiang Zhu, Yong Xiao, Guangming Shi and Ping Zhang
Entropy 2025, 27(8), 775; https://doi.org/10.3390/e27080775 - 23 Jul 2025
Viewed by 207
Abstract
Semantic communication has attracted considerable interest due to its potential to support emerging human-centric services, such as holographic communications, extended reality (XR), and human-machine interactions. Unlike traditional communication systems, which focus on minimizing symbol-level distortion (e.g., bit error rate, signal-to-noise ratio), semantic communication aims to deliver the intended meaning to the destination user; how well it does so is often quantified by various statistical divergences, referred to as semantic distances. A unified framework for quantifying the rate-distortion tradeoff of semantic communication under different task-specific semantic distance measures is still lacking. To tackle this problem, we propose a task-specific rate-distortion theory for semantic communication in which different task-specific statistical divergence metrics can be considered. To investigate the impact of different semantic distance measures on the achievable rate, we consider two popular tasks, classification and signal generation. We present closed-form expressions of the semantic rate-distortion functions for these two tasks and compare their performance under various scenarios. Extensive experimental results are presented to verify our theoretical results.
(This article belongs to the Special Issue Semantic Information Theory)

20 pages, 709 KiB  
Article
SKGRec: A Semantic-Enhanced Knowledge Graph Fusion Recommendation Algorithm with Multi-Hop Reasoning and User Behavior Modeling
by Siqi Xu, Ziqian Yang, Jing Xu and Ping Feng
Computers 2025, 14(7), 288; https://doi.org/10.3390/computers14070288 - 18 Jul 2025
Viewed by 244
Abstract
To address the limitations of existing knowledge graph-based recommendation algorithms, including insufficient utilization of semantic information and inadequate modeling of user behavior motivations, we propose SKGRec, a novel recommendation model that integrates knowledge graph and semantic features. The model constructs a semantic interaction graph (USIG) of user behaviors and employs a self-attention mechanism and a ranked optimization loss function to mine user interactions in fine-grained semantic associations. A relationship-aware aggregation module is designed to dynamically integrate higher-order relational features in the knowledge graph through the attention scoring function. In addition, a multi-hop relational path inference mechanism is introduced to capture long-distance dependencies to improve the depth of user interest modeling. Experiments on the Amazon-Book and Last-FM datasets show that SKGRec significantly outperforms several state-of-the-art recommendation algorithms on the Recall@20 and NDCG@20 metrics. Comparison experiments validate the effectiveness of semantic analysis of user behavior and multi-hop path inference, while cold-start experiments further confirm the robustness of the model in sparse-data scenarios. This study provides a new optimization approach for knowledge graph and semantic-driven recommendation systems, enabling more accurate capture of user preferences and alleviating the problem of noise interference. Full article

33 pages, 2593 KiB  
Article
Methodological Exploration of Ontology Generation with a Dedicated Large Language Model
by Maria Assunta Cappelli and Giovanna Di Marzo Serugendo
Electronics 2025, 14(14), 2863; https://doi.org/10.3390/electronics14142863 - 17 Jul 2025
Viewed by 319
Abstract
Ontologies are essential tools for representing, organizing, and sharing knowledge across various domains. This study presents a methodology for ontology construction supported by large language models (LLMs), with an initial application in the automotive sector. Specifically, a user preference ontology for adaptive interfaces in autonomous machines was developed using ChatGPT-4o. Based on this case study, the results were generalized into a reusable methodology. The proposed workflow integrates classical ontology engineering methodologies with the generative and analytical capabilities of LLMs. Each phase follows well-established steps: domain definition, term elicitation, class hierarchy construction, property specification, formalization, population, and validation. A key innovation of this approach is the use of a guiding table that translates domain knowledge into structured prompts, ensuring consistency across iterative interactions with the LLM. Human experts play a continuous role throughout the process, refining definitions, resolving ambiguities, and validating outputs. The ontology was evaluated in terms of logical consistency, structural properties, semantic accuracy, and inferential completeness, confirming its correctness and coherence. Additional validation through SPARQL queries demonstrated its reasoning capabilities. This methodology is generalizable to other domains, if domain experts adapt the guiding table to the specific context. Despite the support provided by LLMs, domain expertise remains essential to guarantee conceptual rigor and practical relevance. Full article
(This article belongs to the Special Issue Role of Artificial Intelligence in Natural Language Processing)

21 pages, 1118 KiB  
Review
Integrating Large Language Models into Robotic Autonomy: A Review of Motion, Voice, and Training Pipelines
by Yutong Liu, Qingquan Sun and Dhruvi Rajeshkumar Kapadia
AI 2025, 6(7), 158; https://doi.org/10.3390/ai6070158 - 15 Jul 2025
Viewed by 1319
Abstract
This survey provides a comprehensive review of the integration of large language models (LLMs) into autonomous robotic systems, organized around four key pillars: locomotion, navigation, manipulation, and voice-based interaction. We examine how LLMs enhance robotic autonomy by translating high-level natural language commands into low-level control signals, supporting semantic planning and enabling adaptive execution. Systems like SayTap improve gait stability through LLM-generated contact patterns, while TrustNavGPT achieves a 5.7% word error rate (WER) under noisy voice-guided conditions by modeling user uncertainty. Frameworks such as MapGPT, LLM-Planner, and 3D-LOTUS++ integrate multi-modal data—including vision, speech, and proprioception—for robust planning and real-time recovery. We also highlight the use of physics-informed neural networks (PINNs) to model object deformation and support precision in contact-rich manipulation tasks. To bridge the gap between simulation and real-world deployment, we synthesize best practices from benchmark datasets (e.g., RH20T, Open X-Embodiment) and training pipelines designed for one-shot imitation learning and cross-embodiment generalization. Additionally, we analyze deployment trade-offs across cloud, edge, and hybrid architectures, emphasizing latency, scalability, and privacy. The survey concludes with a multi-dimensional taxonomy and cross-domain synthesis, offering design insights and future directions for building intelligent, human-aligned robotic systems powered by LLMs. Full article

26 pages, 2873 KiB  
Article
Interactive Content Retrieval in Egocentric Videos Based on Vague Semantic Queries
by Linda Ablaoui, Wilson Estecio Marcilio-Jr, Lai Xing Ng, Christophe Jouffrais and Christophe Hurter
Multimodal Technol. Interact. 2025, 9(7), 66; https://doi.org/10.3390/mti9070066 - 30 Jun 2025
Viewed by 573
Abstract
Retrieving specific, often instantaneous, content from hours-long egocentric video footage based on hazily remembered details is challenging. Vision–language models (VLMs) have been employed to enable zero-shot text-based content retrieval from videos. However, they fall short when the textual query contains ambiguous terms or when users do not specify their queries precisely enough, leading to vague semantic queries. Such queries can refer to several different video moments, not all of which are relevant, making it harder to pinpoint content. We investigate the requirements for an egocentric video content retrieval framework that helps users handle vague queries. First, we narrow down vague query formulation factors and limit them to ambiguity and incompleteness. Second, we propose a zero-shot, user-centered video content retrieval framework that leverages a VLM to provide video data and query representations that users can incrementally combine to refine queries. Third, we compare our proposed framework to a baseline video player and analyze user strategies for answering vague video content retrieval scenarios in an experimental study. We report that both perform similarly and that users favor our proposed framework. In terms of navigation strategies, users value classic interactions when initiating their search and rely on the abstract semantic video representation to refine the resulting moments.

20 pages, 1159 KiB  
Article
Visualization of a Multidimensional Point Cloud as a 3D Swarm of Avatars
by Leszek Luchowski and Dariusz Pojda
Appl. Sci. 2025, 15(13), 7209; https://doi.org/10.3390/app15137209 - 26 Jun 2025
Viewed by 235
Abstract
This paper proposes an innovative technique for representing multidimensional datasets using icons inspired by Chernoff faces. Our approach combines classical projection techniques with the explicit assignment of selected data dimensions to avatar (facial) features, leveraging the innate human ability to interpret facial traits. We introduce a semantic division of data dimensions into intuitive and technical categories, assigning the former to avatar features and projecting the latter into a four-dimensional (or higher) spatial embedding. The technique is implemented as a plugin for the open-source dpVision visualization platform, enabling users to interactively explore data in the form of a swarm of avatars whose spatial positions and visual features jointly encode various aspects of the dataset. Experimental results with synthetic test data and a 12-dimensional dataset of Portuguese Vinho Verde wines demonstrate that the proposed method enhances interpretability and facilitates the analysis of complex data structures. Full article

31 pages, 2406 KiB  
Article
Enhancing Mathematical Knowledge Graphs with Large Language Models
by Antonio Lobo-Santos and Joaquín Borrego-Díaz
Modelling 2025, 6(3), 53; https://doi.org/10.3390/modelling6030053 - 24 Jun 2025
Viewed by 553
Abstract
The rapid growth in scientific knowledge has created a critical need for advanced systems capable of managing mathematical knowledge at scale. This study presents a novel approach that integrates ontology-based knowledge representation with large language models (LLMs) to automate the extraction, organization, and reasoning of mathematical knowledge from LaTeX documents. The proposed system enhances Mathematical Knowledge Management (MKM) by enabling structured storage, semantic querying, and logical validation of mathematical statements. The key innovations include a lightweight ontology for modeling hypotheses, conclusions, and proofs, and algorithms for optimizing assumptions and generating pseudo-demonstrations. A user-friendly web interface supports visualization and interaction with the knowledge graph, facilitating tasks such as curriculum validation and intelligent tutoring. The results demonstrate high accuracy in mathematical statement extraction and ontology population, with potential scalability for handling large datasets. This work bridges the gap between symbolic knowledge and data-driven reasoning, offering a robust solution for scalable, interpretable, and precise MKM. Full article

20 pages, 955 KiB  
Article
Natural Language Interfaces for Structured Query Generation in IoD Platforms
by Anıl Sezgin
Drones 2025, 9(6), 444; https://doi.org/10.3390/drones9060444 - 18 Jun 2025
Viewed by 572
Abstract
The increasing complexity of Internet of Drones (IoD) platforms demands more accessible ways for users to interact with unmanned aerial vehicle (UAV) data systems. Traditional methods requiring technical API knowledge create barriers for non-specialist users in dynamic operational environments. To address this challenge, we propose a retrieval-augmented generation (RAG) architecture that enables natural language querying over UAV telemetry, mission, and detection data. Our approach builds a semantic retrieval index from structured application programming interface (API) documentation and uses lightweight large language models to map user queries into executable API calls validated against platform schemas. This design minimizes fine-tuning needs, adapts to evolving APIs, and ensures schema conformity for operational safety. Evaluations conducted on a curated IoD dataset show 91.3% endpoint accuracy, 87.6% parameter match rate, and 95.2% schema conformity, confirming the system’s robustness and scalability. The results demonstrate that combining retrieval-augmented semantic grounding with structured validation bridges the gap between human intent and complex UAV data access, improving usability while maintaining a practical level of operational reliability. Full article

23 pages, 5424 KiB  
Article
Interactive Maintenance of Space Station Devices Using Scene Semantic Segmentation
by Haoting Liu, Chuanxin Liao, Xikang Li, Zhen Tian, Mengmeng Wang, Haiguang Li, Xiaofei Lu, Zhenhui Guo and Qing Li
Aerospace 2025, 12(6), 542; https://doi.org/10.3390/aerospace12060542 - 15 Jun 2025
Viewed by 326
Abstract
A novel interactive maintenance method for space station in-orbit devices using scene semantic segmentation technology is proposed. First, a wearable and handheld system is designed to capture images of the astronaut’s front-view scene in the space station and play them on a handheld terminal in real time. Second, the system quantitatively evaluates the lighting condition of the scene by calculating image quality evaluation parameters. If the lighting is inadequate, a prompt reminds the astronaut to adjust the environmental illumination. Third, the system adopts an improved DeepLabV3+ network for semantic segmentation of the astronaut’s forward-view scene images. In the improved network, the original backbone is replaced with a lightweight convolutional neural network, MobileNetV2, which has a smaller model size and lower computational complexity. The convolutional block attention module (CBAM) is introduced to improve the network’s feature perception ability. The atrous spatial pyramid pooling (ASPP) module is also used to encode multi-scale information accurately. Extensive simulation experiments indicate that the accuracy, precision, and mean intersection over union of the proposed algorithm can exceed 95.0%, 96.0%, and 89.0%, respectively. Ground application experiments also show that the proposed technique can effectively shorten the working time of the system user.
(This article belongs to the Section Astronautics & Space Science)

31 pages, 5948 KiB  
Article
Intelligent Digital Twin for Predicting Technology Discourse Patterns: Agent-Based Modeling of User Interactions and Sentiment Dynamics in DeepSeek Discourse Case
by Kaihang Zhang, Changqi Dong, Yifeng Guo, Guang Yu and Jianing Mi
Systems 2025, 13(6), 451; https://doi.org/10.3390/systems13060451 - 8 Jun 2025
Cited by 1 | Viewed by 537
Abstract
Understanding user interaction patterns during technology-triggered public discourse provides critical insights into how emerging technologies gain social meaning. This study develops an intelligent digital twin framework for modeling discourse dynamics around DeepSeek, an indigenous large language model that generated approximately 250,000 social media interactions during a 13-day period. By integrating LLM-enhanced semantic analysis with agent-based modeling, we create a comprehensive virtual representation that captures both content characteristics and behavioral dynamics. Our analysis identifies six distinct thematic domains that structure public engagement: Technological Competition, Technological Breakthrough, User Feedback, Financial Market, Social Influence, and Information Security. The agent-based simulation reveals distinctive participation and sentiment patterns across different user segments, with general users expressing stronger positive sentiments than domain experts and institutional accounts. Network analysis demonstrates the evolution from random-like initial connection patterns to scale-free structures with pronounced influence hubs. The simulation results illuminate how individual behaviors aggregate to produce complex discourse patterns, offering insights into the micro-mechanisms underlying technology reception. This research advances digital twin methodologies beyond physical systems into social phenomena, providing a framework for anticipating public responses to technological innovations and informing more effective communication strategies. Full article

19 pages, 1303 KiB  
Article
GLARA: A Global–Local Attention Framework for Semantic Relation Abstraction and Dynamic Preference Modeling in Knowledge-Aware Recommendation
by Runbo Liu, Lili He and Junhong Zheng
Appl. Sci. 2025, 15(12), 6386; https://doi.org/10.3390/app15126386 - 6 Jun 2025
Viewed by 311
Abstract
Knowledge graph-enhanced recommendation has gained increasing attention for its ability to provide structured semantic context. However, most existing approaches struggle with two critical challenges: the sparsity of long-tail relations in knowledge graphs and the lack of adaptability to users’ dynamic preferences. In this paper, we propose GLARA, a novel recommendation framework that combines semantic abstraction and behavioral adaptation through a two-stage modeling process. First, a Virtual Relational Knowledge Graph (VRKG) is constructed by clustering semantically similar relations into higher-level virtual groups, which alleviates relation sparsity and enhances generalization. Then, a global Local Weighted Smoothing (LWS) module and a local Graph Attention Network (GAT) are integrated to jointly refine item and user representations: LWS propagates information within each virtual relation subgraph to improve semantic consistency, while GAT dynamically adjusts neighbor importance based on recent interaction signals. Extensive experiments on Last.FM and MovieLens-1M demonstrate that GLARA outperforms state-of-the-art methods, achieving up to 5.8% improvements in NDCG@20, especially in long-tail and cold-start scenarios. Additionally, case studies confirm the model’s interpretability by tracing recommendation paths through clustered semantic relations. This work offers a flexible and interpretable solution for robust recommendation under sparse and dynamic conditions. Full article

27 pages, 6504 KiB  
Article
A Natural Language-Based Automatic Identification System Trajectory Query Approach Using Large Language Models
by Xuan Guo, Shutong Yu, Jinxue Zhang, Huanyu Bi, Xiaohui Chen and Junnan Liu
ISPRS Int. J. Geo-Inf. 2025, 14(5), 204; https://doi.org/10.3390/ijgi14050204 - 16 May 2025
Viewed by 570
Abstract
The trajectory data collected by an Automatic Identification System (AIS) are an essential resource for various ships, and effective filtering and querying approaches are fundamental for managing these data. Natural language has become the preferred way to express complex query requirements and intents, due to its intuitiveness and universal applicability. In light of this, we propose a natural language-based AIS trajectory query approach using large language models. Firstly, trajectory textualization was designed to convert the time sequences of trajectories into semantic descriptions by segmenting AIS trajectories, extracting semantics, and constructing trajectory documents. Then, the semantic trajectory querying was completed by rewriting queries, retrieving AIS trajectories, and generating answers. Finally, comparative experiments were conducted to highlight the improvements in accuracy and relevance achieved by our proposed method over traditional approaches. Furthermore, a human study demonstrated the user-friendly interaction experience enabled by our approach. Additionally, we conducted an ablation study to illustrate the significant contributions of each module within our framework. The results demonstrate that our approach effectively bridges the gap between AIS trajectories and natural language query intents, offering an intuitive, user-friendly, and accessible solution for domain experts and novices. Full article