3.2. Study Area and Data
The data source for this study is the open-source AIS data published on the Danish Maritime Administration’s website. The research area is a rectangular region between 52.70° and 59.22° north latitude and between 0.16° and 18.58° east longitude. This maritime region lies at the intersection of the Baltic Sea and the North Sea and serves as a major European maritime traffic hub with abundant ship trajectory data resources, including AIS data and satellite remote sensing data. These data encompass numerous ship trajectories, providing ample support for constructing trajectory prediction models and for in-depth analysis of ship trajectory prediction.
For this study, over 33 million data points from 1 March 2023 to 3 March 2023 were selected as the dataset. Each AIS data entry contains 27 fields, with the commonly used fields categorized into three types, as shown in Table 2. These include the Maritime Mobile Service Identity (MMSI), ship name, latitude and longitude, and rate of turn. Different message types serve different purposes. For instance, studies often use the ship’s unique identifier, the MMSI, to classify the data and reorganize the AIS records, which are originally sorted as a time series, so as to obtain the trajectories of individual ships over a specific period.
The units and data types of the AIS data fields used in this study are listed in
Table 3.
3.3. Data Preprocessing
3.3.1. Data Cleaning
The original AIS dataset is very large and contains heterogeneous information, and data loss and errors are likely to occur during transmission, so data cleaning must be carried out. The data cleaning steps in this paper are as follows (a code sketch of these rules follows the list):
- (1)
Remove data from outside the study area.
- (2)
The MMSI, the ship’s official unique identifier, is a nine-digit number; records with invalid MMSIs are removed.
- (3)
AIS data contains 27 fields. Some fields, such as type of position fixing device, are not used and need to be removed.
- (4)
Drift points in the trajectory are eliminated by removing records whose speed exceeds 50 knots.
- (5)
Ships that are too small are strongly affected by wind direction, wind speed, currents, etc., and are difficult to forecast, so ships that are less than 5 m long and less than 3 m wide are eliminated.
- (6)
Because the LLMs in this study rely on semantic information for training, it is necessary to screen for records with relatively complete, high-quality information. Consequently, records in which certain fields (such as MMSI, SOG, COG, navigation status, ship name, destination, ship type, ROT, etc.) were empty, “Unknown”, “Unknown Value”, “Undefined”, or “NAN” were eliminated.
- (7)
Filter out data with navigation statuses of “Moored,” “At anchor,” “Aground,” and other statuses indicating the vessel is not in motion.
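As a concrete illustration of these rules, the following pandas sketch applies them to a raw AIS export. The column names (e.g., “Latitude”, “SOG”, “Navigational status”) and the file name are assumptions about the CSV layout and may need to be adapted to the actual Danish Maritime Administration files.

```python
import pandas as pd

# Assumed column names for the raw AIS export; adjust to the actual file layout.
df = pd.read_csv("aisdk_2023-03-01.csv")

# (1) Keep only points inside the study area.
df = df[df["Latitude"].between(52.70, 59.22) & df["Longitude"].between(0.16, 18.58)]

# (2) Keep only valid nine-digit MMSI identifiers.
df = df[df["MMSI"].astype(str).str.fullmatch(r"\d{9}")]

# (3) Drop unused fields, e.g., the type of position fixing device.
df = df.drop(columns=["Type of position fixing device"], errors="ignore")

# (4) Remove drift points with speeds above 50 knots.
df = df[df["SOG"] <= 50]

# (5) Remove very small vessels (less than 5 m long and less than 3 m wide).
df = df[~((df["Length"] < 5) & (df["Width"] < 3))]

# (6) Remove records whose semantic fields are empty or placeholder values.
semantic_cols = ["MMSI", "SOG", "COG", "Navigational status",
                 "Name", "Destination", "Ship type", "ROT"]
placeholders = {"Unknown", "Unknown Value", "Undefined", "NAN", ""}
df = df.dropna(subset=semantic_cols)
for col in semantic_cols:
    df = df[~df[col].astype(str).isin(placeholders)]

# (7) Keep only vessels that are actually under way.
df = df[~df["Navigational status"].isin(["Moored", "At anchor", "Aground"])]
```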
3.3.2. Trajectory Extraction
After data cleaning, the data are still not the track of a single ship but a mixture of track points from many ships. The MMSI is therefore used to group these track points and extract the sailing track of each ship (a sketch of these steps follows the list). The specific steps are as follows:
- (1)
MMSI is used to group the track points and arrange them in ascending order of time.
- (2)
A ship sends an AIS message to the maritime management center every two seconds while under way and every three minutes even when at anchor. Because messages arrive so frequently, a huge amount of data is generated. Such high-frequency, short-interval data are not necessary for this study, so the trajectory data are downsampled: one sampling point is retained every 10 min.
- (3)
If the time interval between consecutive trajectory points is too large, the correlation between the resulting points is too weak, which is not conducive to the learning of the subsequent LLMs. Therefore, a trajectory containing a time interval of more than one hour is divided into two trajectories; if such intervals occur in more than one place, the trajectory is divided into multiple segments.
- (4)
If the number of trajectory points is too small, it is not sufficient to support trajectory prediction, so trajectories with fewer than 6 points are eliminated.
- (5)
Due to potential data transmission errors between vessels and the Maritime Data Management Center, anomalies like the one shown in
Table 4 can arise. Examination of the AIS records reveals that one trajectory point’s longitude lost a digit during transmission. In this study, trajectories with adjacent points having a longitude or latitude difference exceeding 1° are directly eliminated.
- (6)
Finally, the processed tracks are renumbered to obtain the unique identification of each track.
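The trajectory extraction steps above can be condensed into the following sketch; the column names and the function structure are assumptions for illustration rather than the exact implementation used in this study.

```python
import pandas as pd

def extract_trajectories(df, rule="10min", gap="1h", min_points=6, max_jump=1.0):
    """Sketch of the trajectory extraction steps; assumes the cleaned data
    contain columns 'MMSI', 'Timestamp', 'Latitude', and 'Longitude'."""
    trajectories = []
    for mmsi, g in df.groupby("MMSI"):
        # (1) Sort each ship's points in ascending time order.
        g = g.copy()
        g["Timestamp"] = pd.to_datetime(g["Timestamp"])
        g = g.sort_values("Timestamp").set_index("Timestamp")
        # (2) Downsample to one point every 10 minutes.
        g = g.resample(rule).first().dropna(subset=["Latitude"])
        # (3) Split wherever the gap between consecutive points exceeds one hour.
        seg_id = (g.index.to_series().diff() > pd.Timedelta(gap)).cumsum()
        for _, seg in g.groupby(seg_id):
            # (4) Discard segments with fewer than 6 points.
            if len(seg) < min_points:
                continue
            # (5) Discard segments with a >1 degree jump between adjacent points.
            if (seg["Latitude"].diff().abs().max() > max_jump
                    or seg["Longitude"].diff().abs().max() > max_jump):
                continue
            trajectories.append(seg)
    # (6) Renumber the surviving trajectories to obtain unique track identifiers.
    return {track_id: seg for track_id, seg in enumerate(trajectories)}
```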
To enrich trajectory information, this study additionally collected trajectory-related Point of Interest (POI) information and integrated it into the trajectory analysis as auxiliary semantic information. Specifically, for each trajectory, the moving speed and course angle were first calculated from the latitude/longitude coordinates and timestamps of its trajectory points, and inflection points and stay points were then detected using preset thresholds. Based on the coordinates of the detected inflection and stay points, and using distance as the query criterion, the open-source geographic database GeoNames was queried for POI information, further enriching the semantic dimension of the trajectory.
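One possible realization of this POI query is sketched below, assuming the public GeoNames findNearbyJSON web service; the username, search radius, and the 30°/0.5 kn thresholds for inflection and stay points are illustrative assumptions rather than values reported in this study.

```python
import requests

def nearest_poi(lat, lon, username="demo", radius_km=10):
    """Query GeoNames for the nearest named place to an inflection or stay point.
    'username' must be a registered GeoNames account; radius is illustrative."""
    resp = requests.get(
        "http://api.geonames.org/findNearbyJSON",
        params={"lat": lat, "lng": lon, "radius": radius_km,
                "maxRows": 1, "username": username},
        timeout=10,
    )
    rows = resp.json().get("geonames", [])
    return rows[0]["name"] if rows else None

def turning_angle(c1, c2):
    """Absolute change in course (degrees) between consecutive points."""
    d = abs(c2 - c1) % 360
    return min(d, 360 - d)

def annotate_points(track, turn_thresh=30.0, stay_speed=0.5):
    """Detect inflection/stay points in a list of dicts with keys
    'lat', 'lon', 'sog', 'cog', and attach the nearest POI name."""
    pois = []
    for i, row in enumerate(track):
        is_stay = row["sog"] < stay_speed
        is_turn = i > 0 and turning_angle(track[i - 1]["cog"], row["cog"]) > turn_thresh
        if is_stay or is_turn:
            pois.append((i, nearest_poi(row["lat"], row["lon"])))
    return pois
```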
To adapt to numerical models, traditional AIS data preprocessing methods typically only retain pure numerical fields (e.g., longitude, latitude, speed, course), while text fields (e.g., vessel type, navigational status, IMO number) are either discarded directly or encoded into discrete numerical values (e.g., “anchored”: 0, “underway”: 1), resulting in the loss of semantic associations. In contrast, this study retains and optimizes text fields with high semantic value through the aforementioned semantic retention strategies and further adds auxiliary semantic fields. Specifically, we not only preserve text information such as “vessel type (e.g., fishing, cargo, oil tanker)” and “navigational status (e.g., underway, anchored, berthed)” but also incorporate POI information as public domain knowledge for association.
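The difference between the two preprocessing philosophies can be illustrated with a single hypothetical AIS record; the text template below is purely illustrative and not the exact serialization used in this study.

```python
record = {
    "lat": 55.871, "lon": 12.674, "sog": 11.3, "cog": 47.0,
    "ship_type": "Cargo", "nav_status": "Under way using engine",
    "destination": "GDYNIA", "poi": "Oresund",
}

# Traditional numeric preprocessing: text fields collapsed to integer codes,
# losing the semantic relations between categories.
numeric_view = [record["lat"], record["lon"], record["sog"], record["cog"], 1, 0]

# Semantic-retention view: text fields kept as natural language so the LLM
# can exploit their meaning (wording here is an illustrative assumption).
text_view = (
    f"A {record['ship_type'].lower()} ship ({record['nav_status'].lower()}) "
    f"near {record['poi']} heading {record['cog']} degrees at {record['sog']} knots, "
    f"position ({record['lat']}, {record['lon']}), bound for {record['destination']}."
)
```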
After the above steps, a total of 5415 trajectories were obtained, as shown in Figure 3. The structure of the processed ship trajectories is shown in Table 5. Because of track segmentation, one MMSI may correspond to multiple tracks, and each track has its own number and its own sequence of track points.
3.5. Fine-Tuning Method
The model architecture employed in this paper is illustrated in
Figure 4. It represents the mainstream LLM architecture, commonly referred to as a Dense LLM, which adopts and modifies the decoder part of the classic Transformer, i.e., a decoder-only Transformer architecture. The Embedding module converts input tokens into high-dimensional vectors that capture the semantic and contextual information of the words, so that the subsequent Transformer layers can process the data effectively. RMSNorm uses the root mean square to normalize the hidden states in the neural network. Its mathematical expression is given in Equation (4), where $x$ denotes the input vector and $\gamma$ represents the scaling factor. This method stabilizes the distribution of activation values in deep networks, helping to accelerate training and improve model performance. The Rotary Positional Encoding (RoPE) mechanism injects positional information into the model: RoPE provides relative positional information through rotational transformations, enabling the model to better understand positional relationships within sequences. Grouped Query Attention (GQA) is an improved attention mechanism that enhances model efficiency and performance by grouping queries. Skip connections are adopted in the architecture to facilitate gradient flow and enhance training stability. SwiGLU, a variant of Gated Linear Units (GLUs), is used in the Feed-Forward Network (FFN); it introduces nonlinear activation functions to strengthen the model’s expressive power. The final model architecture consists of N repeated core Transformer modules.
LLMs themselves do not have trajectory prediction capabilities and need to be fine-tuned with datasets to fully understand this specific field. The main methods for fine-tuning LLMs include full fine-tuning and parameter-efficient fine-tuning (PEFT). Full fine-tuning requires updating all parameters of the pre-trained model on the new task, which demands a large amount of computing resources and time, leading to a significant decrease in efficiency. This involves not only the storage requirements of the model itself, but also the processing and storage of a large number of key parameters during the training process. PEFT offers an effective strategy to manage this situation. By selectively updating only a portion of the model’s parameters and “freezing” the majority of the remaining parameters, it significantly reduces the number of parameters that need to be trained. PEFT not only retains the knowledge that the model has acquired from the original training data, but also ensures that the new training task does not disrupt this existing knowledge.
Recent studies, such as PromptCast [
54] and Time-LLM [
57], have demonstrated the remarkable potential of using meticulously engineered prompts to elicit time series forecasting capabilities from pre-trained LLMs, representing a highly parameter-efficient paradigm. However, this prompt-based approach is inherently a ’frozen-model, tuned-input’ strategy. Its performance is heavily reliant on the breadth of the pre-trained model’s inherent knowledge and the alignment between the prompts and the model’s internal representations. For tasks characterized by strong domain specificity and complex patterns—such as maritime data with intricate spatiotemporal dependencies—its adaptability and performance ceiling can be limited. In contrast, fine-tuning, which updates a subset of the model’s parameters, enables a deeper assimilation of target domain features. Therefore, we propose a novel “Prompt + PEFT” paradigm to achieve highly effective and efficient adaptation of LLMs for maritime time series forecasting tasks.
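Under this paradigm, each trajectory is converted into an instruction-style training sample. The following is a hypothetical example of such a sample; the wording, coordinates, and field layout are chosen purely for illustration and may differ from the templates used in this study.

```python
# Hypothetical instruction-style SFT sample for trajectory forecasting.
sample = {
    "instruction": (
        "You are a maritime trajectory forecaster. Given the recent AIS track "
        "of a vessel, predict its position at the next 10-minute time step."
    ),
    "input": (
        "Cargo ship, under way using engine, destination GDYNIA. Track: "
        "(55.80, 12.60, 11.2 kn, 45 deg) -> (55.84, 12.63, 11.3 kn, 46 deg) -> "
        "(55.87, 12.67, 11.3 kn, 47 deg)"
    ),
    "output": "(55.91, 12.71)",
}
```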
Low-Rank Adaptation (LoRA) [14] is a mainstream PEFT method for fine-tuning LLMs. Our work constitutes a conventional application of the established LoRA framework; no modifications were made to its internal architecture. In the LoRA method, the weights of the original LLM are frozen, i.e., they remain unchanged during training, so as to preserve the general knowledge that the model learned during the pre-training phase. The core idea of LoRA is to approximate the model update with a low-rank matrix decomposition that first reduces and then restores the dimensionality. As shown in Figure 5, the pre-trained weights $W \in \mathbb{R}^{d \times k}$ are fixed during training, and only the matrices $A \in \mathbb{R}^{r \times k}$ and $B \in \mathbb{R}^{d \times r}$ are trained. The dimensions of the parameter matrices $A$ and $B$ are chosen such that their product $BA \in \mathbb{R}^{d \times k}$, which coincides with the dimensionality of the pre-trained weight matrix $W$, thereby enabling the corresponding parameters to be updated via element-wise summation with $W$. At this point, the parameter update cost is dominated by the low-rank matrices $B$ and $A$, with a total of $r(d + k)$ parameters, as opposed to the original $W$ with $d \times k$ parameters. Since $r \ll \min(d, k)$, it follows that $r(d + k) \ll dk$, thus significantly reducing the number of parameters that require updating.
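A worked example with assumed dimensions (not taken from this study) makes the saving concrete:

```python
# Worked example with assumed dimensions; not specific to this study.
d, k, r = 4096, 4096, 8           # weight matrix W is d x k, LoRA rank r
full_params = d * k               # 16,777,216 parameters in W
lora_params = r * (d + k)         # 65,536 parameters in B and A combined
print(lora_params / full_params)  # ~0.0039, i.e., about 0.4% of the original
```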
At the beginning of training, matrix $A$ is initialized from a Gaussian distribution with a mean of 0, i.e., $A \sim \mathcal{N}(0, \sigma^2)$, while matrix $B$ is initialized to 0, i.e., $B = 0$. This initialization scheme ensures that the LoRA branch $\Delta W = BA$ is 0 before training commences. Consequently, fine-tuning starts from the original pre-trained weights $W$, guaranteeing an identical starting point to full fine-tuning. During training, for input $x$, the forward propagation of the model is updated to $h = Wx + \Delta W x = Wx + BAx$, where $\Delta W = BA$. In this process, the original parameters $W$ are frozen; the gradients for $B$ and $A$ are computed as shown in Equation (5), and the back-propagated gradients are likewise given in Equation (6). This means that although $W$ participates in both the forward and backward passes, no gradients are computed for it, and consequently its values are not updated. In practice, the weight update $\Delta W$ is scaled by a factor of $\alpha / r$ before being merged into the pre-trained weights, i.e., $W' = W + \frac{\alpha}{r} BA$. This scaling factor calibrates the influence of the LoRA update. A smaller $\alpha$ value diminishes the impact of $\Delta W$, often leading to less pronounced fine-tuning effects; conversely, a larger $\alpha$ value amplifies its contribution, which increases the risk of overfitting on the downstream task. It is common practice to fix the ratio $\alpha / r$ for a given task. At inference time, $\Delta W$ is directly merged into $W$ according to the equation above, so no additional latency is introduced compared to the original LLM.
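The mechanics described above can be condensed into a minimal PyTorch sketch of a LoRA-wrapped linear layer; it follows the standard LoRA formulation rather than the implementation of any particular library.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Minimal LoRA wrapper around a frozen linear layer (sketch).
    Forward pass: h = W x + (alpha / r) * B A x, with W frozen,
    A initialized from a Gaussian and B initialized to zero."""

    def __init__(self, base: nn.Linear, r=8, alpha=16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():          # freeze the pre-trained weights W
            p.requires_grad = False
        d_out, d_in = base.weight.shape
        self.A = nn.Parameter(torch.randn(r, d_in) * 0.01)  # Gaussian init, mean 0
        self.B = nn.Parameter(torch.zeros(d_out, r))         # zero init => BA = 0
        self.scale = alpha / r

    def forward(self, x):
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

    @torch.no_grad()
    def merge(self):
        """Fold the low-rank update into W for inference: W <- W + (alpha/r) B A."""
        self.base.weight += self.scale * (self.B @ self.A)
```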
Specifically, for Transformer-based models, LoRA only needs to fine-tune the self-attention part of each layer. The self-attention mechanism works through three key matrices: the query matrix $W_Q$, the key matrix $W_K$, and the value matrix $W_V$. In our design, as illustrated in Figure 5, LoRA is applied separately to the $W_Q$, $W_K$, and $W_V$ matrices. We only need to introduce the low-rank matrices $A$ and $B$ to represent the weight updates, rather than representing $W_Q$, $W_K$, and $W_V$ directly. After training is completed, the outputs are transformed through the mapping matrices to obtain the results presented in Equation (7). All models in this study use a supervised fine-tuning (SFT) method with an instruction dataset. SFT typically requires only a small amount of labeled data to effectively guide the model to capture the desired pattern in a particular task.
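In practice, this setup can be expressed with the Hugging Face PEFT library as sketched below; the base model name and the target module names ("q_proj", "k_proj", "v_proj") are assumptions that depend on the specific LLM architecture being fine-tuned.

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

# Sketch of applying LoRA to the attention projections; the checkpoint name
# and module names are illustrative assumptions, not this study's exact setup.
model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")
config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj"],  # W_Q, W_K, W_V
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, config)
model.print_trainable_parameters()  # only the LoRA matrices remain trainable
```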