3.2. Study Area and Data
The data source for this study is the open-source AIS data published on the Danish Maritime Administration’s website. The research area is a rectangular region between 52.70° and 59.22° north latitude and between 0.16° and 18.58° east longitude. This maritime region lies at the intersection of the Baltic Sea and the North Sea and serves as a major European maritime traffic hub with abundant ship trajectory data resources, including AIS data and satellite remote sensing data. These data encompass numerous ship trajectories, providing ample support for constructing trajectory prediction models and for in-depth analysis of ship trajectory prediction.
For this study, over 33 million data points from 1 March 2023 to 3 March 2023 were selected as the dataset. Each AIS data entry contains 27 fields, with the commonly used fields categorized into three types, as shown in Table 2. These include the Maritime Mobile Service Identity (MMSI), ship name, latitude and longitude, and rate of turn. Different message types serve different purposes. For instance, studies often use the ship’s unique identifier, the MMSI, to classify the data and reorganize the AIS records, which are originally sorted as a time series, so as to obtain the trajectories of individual ships over a specific period.
The units and data types of the AIS data fields used in this study are listed in
Table 3.
3.3. Data Preprocessing
3.3.1. Data Cleaning
The original AIS dataset is very large and contains heterogeneous information, and data loss and errors are likely to occur during transmission, so data cleaning must be carried out. The data cleaning steps in this paper are as follows (a code sketch of these rules follows the list):
- (1)
Remove data from outside the study area.
- (2)
The MMSI, the ship’s official unique identifier, is a nine-digit number; records with invalid MMSIs are removed.
- (3)
AIS data contains 27 fields. Some fields, such as type of position fixing device, are not used and need to be removed.
- (4)
Drift points in the trajectory are eliminated by removing records whose speed exceeds 50 knots.
- (5)
Ships that are too small are strongly affected by wind direction, wind speed, currents, etc., and are difficult to forecast, so ships that are less than 5 m long and less than 3 m wide are eliminated.
- (6)
Because the LLMs in this study rely on semantic information for training, it is necessary to screen for records with relatively complete, high-quality information. Consequently, records in which certain fields (such as MMSI, SOG, COG, navigation status, ship name, destination, ship type, ROT, etc.) were empty, “Unknown”, “Unknown Value”, “Undefined”, or “NAN” were eliminated.
- (7)
Filter out data with navigation statuses of “Moored,” “At anchor,” “Aground,” and other statuses indicating the vessel is not in motion.
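As a concrete illustration of these rules, the following pandas sketch applies them to a raw AIS export. The column names (e.g., “Latitude”, “SOG”, “Navigational status”) and the file name are assumptions about the CSV layout and may need to be adapted to the actual Danish Maritime Administration files.

```python
import pandas as pd

# Assumed column names for the raw AIS export; adjust to the actual file layout.
df = pd.read_csv("aisdk_2023-03-01.csv")

# (1) Keep only points inside the study area.
df = df[df["Latitude"].between(52.70, 59.22) & df["Longitude"].between(0.16, 18.58)]

# (2) Keep only valid nine-digit MMSI identifiers.
df = df[df["MMSI"].astype(str).str.fullmatch(r"\d{9}")]

# (3) Drop unused fields, e.g., the type of position fixing device.
df = df.drop(columns=["Type of position fixing device"], errors="ignore")

# (4) Remove drift points with speeds above 50 knots.
df = df[df["SOG"] <= 50]

# (5) Remove very small vessels (less than 5 m long and less than 3 m wide).
df = df[~((df["Length"] < 5) & (df["Width"] < 3))]

# (6) Remove records whose semantic fields are empty or placeholder values.
semantic_cols = ["MMSI", "SOG", "COG", "Navigational status",
                 "Name", "Destination", "Ship type", "ROT"]
placeholders = {"Unknown", "Unknown Value", "Undefined", "NAN", ""}
df = df.dropna(subset=semantic_cols)
for col in semantic_cols:
    df = df[~df[col].astype(str).isin(placeholders)]

# (7) Keep only vessels that are actually under way.
df = df[~df["Navigational status"].isin(["Moored", "At anchor", "Aground"])]
```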
3.3.2. Trajectory Extraction
After data cleaning, the data are still not the track of a single ship but a mixture of track points from many ships. The MMSI is therefore used to group these track points and extract the sailing track of each ship (a sketch of these steps follows the list). The specific steps are as follows:
- (1)
MMSI is used to group the track points and arrange them in ascending order of time.
- (2)
A ship sends an AIS message to the maritime management center every two seconds while under way and every three minutes even when at anchor. Because messages arrive so frequently, a huge amount of data is generated. Such high-frequency, short-interval data are not necessary for this study, so the trajectory data are downsampled: one sampling point is retained every 10 min.
- (3)
If the time interval between consecutive trajectory points is too large, the correlation between the resulting points is too weak, which is not conducive to the learning of the subsequent LLMs. Therefore, a trajectory containing a time interval of more than one hour is divided into two trajectories; if such intervals occur in more than one place, the trajectory is divided into multiple segments.
- (4)
If the number of trajectory points is too small, it is not sufficient to support trajectory prediction, so trajectories with fewer than 6 points are eliminated.
- (5)
Due to potential data transmission errors between vessels and the Maritime Data Management Center, anomalies like the one shown in
Table 4 can arise. Examination of the AIS records reveals that one trajectory point’s longitude lost a digit during transmission. In this study, trajectories with adjacent points having a longitude or latitude difference exceeding 1° are directly eliminated.
- (6)
Finally, the processed tracks are renumbered to obtain the unique identification of each track.
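The trajectory extraction steps above can be condensed into the following sketch; the column names and the function structure are assumptions for illustration rather than the exact implementation used in this study.

```python
import pandas as pd

def extract_trajectories(df, rule="10min", gap="1h", min_points=6, max_jump=1.0):
    """Sketch of the trajectory extraction steps; assumes the cleaned data
    contain columns 'MMSI', 'Timestamp', 'Latitude', and 'Longitude'."""
    trajectories = []
    for mmsi, g in df.groupby("MMSI"):
        # (1) Sort each ship's points in ascending time order.
        g = g.copy()
        g["Timestamp"] = pd.to_datetime(g["Timestamp"])
        g = g.sort_values("Timestamp").set_index("Timestamp")
        # (2) Downsample to one point every 10 minutes.
        g = g.resample(rule).first().dropna(subset=["Latitude"])
        # (3) Split wherever the gap between consecutive points exceeds one hour.
        seg_id = (g.index.to_series().diff() > pd.Timedelta(gap)).cumsum()
        for _, seg in g.groupby(seg_id):
            # (4) Discard segments with fewer than 6 points.
            if len(seg) < min_points:
                continue
            # (5) Discard segments with a >1 degree jump between adjacent points.
            if (seg["Latitude"].diff().abs().max() > max_jump
                    or seg["Longitude"].diff().abs().max() > max_jump):
                continue
            trajectories.append(seg)
    # (6) Renumber the surviving trajectories to obtain unique track identifiers.
    return {track_id: seg for track_id, seg in enumerate(trajectories)}
```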
To enrich trajectory information, this study additionally collected trajectory-related Point of Interest (POI) information and integrated it into the trajectory analysis as auxiliary semantic information. Specifically, for each trajectory, the moving speed and course angle were first calculated from the latitude/longitude coordinates and timestamps of its trajectory points, and inflection points and stay points were then detected using preset thresholds. Based on the coordinates of the detected inflection and stay points, and using distance as the query criterion, the open-source geographic database GeoNames was queried for POI information, further enriching the semantic dimension of the trajectory.
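One possible realization of this POI query is sketched below, assuming the public GeoNames findNearbyJSON web service; the username, search radius, and the 30°/0.5 kn thresholds for inflection and stay points are illustrative assumptions rather than values reported in this study.

```python
import requests

def nearest_poi(lat, lon, username="demo", radius_km=10):
    """Query GeoNames for the nearest named place to an inflection or stay point.
    'username' must be a registered GeoNames account; radius is illustrative."""
    resp = requests.get(
        "http://api.geonames.org/findNearbyJSON",
        params={"lat": lat, "lng": lon, "radius": radius_km,
                "maxRows": 1, "username": username},
        timeout=10,
    )
    rows = resp.json().get("geonames", [])
    return rows[0]["name"] if rows else None

def turning_angle(c1, c2):
    """Absolute change in course (degrees) between consecutive points."""
    d = abs(c2 - c1) % 360
    return min(d, 360 - d)

def annotate_points(track, turn_thresh=30.0, stay_speed=0.5):
    """Detect inflection/stay points in a list of dicts with keys
    'lat', 'lon', 'sog', 'cog', and attach the nearest POI name."""
    pois = []
    for i, row in enumerate(track):
        is_stay = row["sog"] < stay_speed
        is_turn = i > 0 and turning_angle(track[i - 1]["cog"], row["cog"]) > turn_thresh
        if is_stay or is_turn:
            pois.append((i, nearest_poi(row["lat"], row["lon"])))
    return pois
```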
To adapt to numerical models, traditional AIS data preprocessing methods typically only retain pure numerical fields (e.g., longitude, latitude, speed, course), while text fields (e.g., vessel type, navigational status, IMO number) are either discarded directly or encoded into discrete numerical values (e.g., “anchored”: 0, “underway”: 1), resulting in the loss of semantic associations. In contrast, this study retains and optimizes text fields with high semantic value through the aforementioned semantic retention strategies and further adds auxiliary semantic fields. Specifically, we not only preserve text information such as “vessel type (e.g., fishing, cargo, oil tanker)” and “navigational status (e.g., underway, anchored, berthed)” but also incorporate POI information as public domain knowledge for association.
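The difference between the two preprocessing philosophies can be illustrated with a single hypothetical AIS record; the text template below is purely illustrative and not the exact serialization used in this study.

```python
record = {
    "lat": 55.871, "lon": 12.674, "sog": 11.3, "cog": 47.0,
    "ship_type": "Cargo", "nav_status": "Under way using engine",
    "destination": "GDYNIA", "poi": "Oresund",
}

# Traditional numeric preprocessing: text fields collapsed to integer codes,
# losing the semantic relations between categories.
numeric_view = [record["lat"], record["lon"], record["sog"], record["cog"], 1, 0]

# Semantic-retention view: text fields kept as natural language so the LLM
# can exploit their meaning (wording here is an illustrative assumption).
text_view = (
    f"A {record['ship_type'].lower()} ship ({record['nav_status'].lower()}) "
    f"near {record['poi']} heading {record['cog']} degrees at {record['sog']} knots, "
    f"position ({record['lat']}, {record['lon']}), bound for {record['destination']}."
)
```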
After the above steps, a total of 5415 trajectories were obtained, as shown in Figure 3. The structure of the processed ship trajectories is shown in Table 5. Because of track segmentation, one MMSI may correspond to multiple tracks, and each track has its own number and its own sequence of track points.
3.5. Fine-Tuning Method
The model architecture employed in this paper is illustrated in
Figure 4. It represents the mainstream LLM architecture, commonly referred to as a Dense LLM, which adopts and modifies the decoder part of the classic Transformer, i.e., a decoder-only Transformer architecture. The Embedding module converts input tokens into high-dimensional vectors that capture the semantic and contextual information of the words, so that the subsequent Transformer layers can process the data effectively. RMSNorm uses the root mean square to normalize the hidden states in the neural network. Its mathematical expression is given in Equation (4), where $x$ denotes the input vector and $\gamma$ represents the scaling factor. This method stabilizes the distribution of activation values in deep networks, helping to accelerate training and improve model performance. The Rotary Positional Encoding (RoPE) mechanism injects positional information into the model: RoPE provides relative positional information through rotational transformations, enabling the model to better understand positional relationships within sequences. Grouped Query Attention (GQA) is an improved attention mechanism that enhances model efficiency and performance by grouping queries. Skip connections are adopted in the architecture to facilitate gradient flow and enhance training stability. SwiGLU, a variant of Gated Linear Units (GLUs), is used in the Feed-Forward Network (FFN); it introduces nonlinear activation functions to strengthen the model’s expressive power. The final model architecture consists of N repeated core Transformer modules.
LLMs themselves do not have trajectory prediction capabilities and need to be fine-tuned with datasets to fully understand this specific field. The main methods for fine-tuning LLMs include full fine-tuning and parameter-efficient fine-tuning (PEFT). Full fine-tuning requires updating all parameters of the pre-trained model on the new task, which demands a large amount of computing resources and time, leading to a significant decrease in efficiency. This involves not only the storage requirements of the model itself, but also the processing and storage of a large number of key parameters during the training process. PEFT offers an effective strategy to manage this situation. By selectively updating only a portion of the model’s parameters and “freezing” the majority of the remaining parameters, it significantly reduces the number of parameters that need to be trained. PEFT not only retains the knowledge that the model has acquired from the original training data, but also ensures that the new training task does not disrupt this existing knowledge.
Recent studies, such as PromptCast [
54] and Time-LLM [
57], have demonstrated the remarkable potential of using meticulously engineered prompts to elicit time series forecasting capabilities from pre-trained LLMs, representing a highly parameter-efficient paradigm. However, this prompt-based approach is inherently a ’frozen-model, tuned-input’ strategy. Its performance is heavily reliant on the breadth of the pre-trained model’s inherent knowledge and the alignment between the prompts and the model’s internal representations. For tasks characterized by strong domain specificity and complex patterns—such as maritime data with intricate spatiotemporal dependencies—its adaptability and performance ceiling can be limited. In contrast, fine-tuning, which updates a subset of the model’s parameters, enables a deeper assimilation of target domain features. Therefore, we propose a novel “Prompt + PEFT” paradigm to achieve highly effective and efficient adaptation of LLMs for maritime time series forecasting tasks.
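Under this paradigm, each trajectory is converted into an instruction-style training sample. The following is a hypothetical example of such a sample; the wording, coordinates, and field layout are chosen purely for illustration and may differ from the templates used in this study.

```python
# Hypothetical instruction-style SFT sample for trajectory forecasting.
sample = {
    "instruction": (
        "You are a maritime trajectory forecaster. Given the recent AIS track "
        "of a vessel, predict its position at the next 10-minute time step."
    ),
    "input": (
        "Cargo ship, under way using engine, destination GDYNIA. Track: "
        "(55.80, 12.60, 11.2 kn, 45 deg) -> (55.84, 12.63, 11.3 kn, 46 deg) -> "
        "(55.87, 12.67, 11.3 kn, 47 deg)"
    ),
    "output": "(55.91, 12.71)",
}
```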
Low-Rank Adaptation (LoRA) [14] is a mainstream PEFT method for fine-tuning LLMs. Our work constitutes a conventional application of the established LoRA framework; no modifications were made to its internal architecture. In the LoRA method, the weights of the original LLM are frozen, i.e., they remain unchanged during training, so as to preserve the general knowledge that the model learned during the pre-training phase. The core idea of LoRA is to approximate the model update with a low-rank matrix decomposition that first reduces and then restores the dimensionality. As shown in Figure 5, the pre-trained weights $W \in \mathbb{R}^{d \times k}$ are fixed during training, and only the matrices $A \in \mathbb{R}^{r \times k}$ and $B \in \mathbb{R}^{d \times r}$ are trained. The dimensions of the parameter matrices $A$ and $B$ are chosen such that their product $BA \in \mathbb{R}^{d \times k}$, which coincides with the dimensionality of the pre-trained weight matrix $W$, thereby enabling the corresponding parameters to be updated via element-wise summation with $W$. At this point, the parameter update cost is dominated by the low-rank matrices $B$ and $A$, with a total of $r(d + k)$ parameters, as opposed to the original $W$ with $d \times k$ parameters. Since $r \ll \min(d, k)$, it follows that $r(d + k) \ll dk$, thus significantly reducing the number of parameters that require updating.
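A worked example with assumed dimensions (not taken from this study) makes the saving concrete:

```python
# Worked example with assumed dimensions; not specific to this study.
d, k, r = 4096, 4096, 8           # weight matrix W is d x k, LoRA rank r
full_params = d * k               # 16,777,216 parameters in W
lora_params = r * (d + k)         # 65,536 parameters in B and A combined
print(lora_params / full_params)  # ~0.0039, i.e., about 0.4% of the original
```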
At the beginning of training, matrix $A$ is initialized from a Gaussian distribution with a mean of 0, i.e., $A \sim \mathcal{N}(0, \sigma^2)$, while matrix $B$ is initialized to 0, i.e., $B = 0$. This initialization scheme ensures that the LoRA branch $\Delta W = BA$ is 0 before training commences. Consequently, fine-tuning starts from the original pre-trained weights $W$, guaranteeing an identical starting point to full fine-tuning. During training, for input $x$, the forward propagation of the model is updated to $h = Wx + \Delta W x = Wx + BAx$, where $\Delta W = BA$. In this process, the original parameters $W$ are frozen; the gradients for $B$ and $A$ are computed as shown in Equation (5), and the back-propagated gradients are likewise given in Equation (6). This means that although $W$ participates in both the forward and backward passes, no gradients are computed for it, and consequently its values are not updated. In practice, the weight update $\Delta W$ is scaled by a factor of $\alpha / r$ before being merged into the pre-trained weights, i.e., $W' = W + \frac{\alpha}{r} BA$. This scaling factor calibrates the influence of the LoRA update. A smaller $\alpha$ value diminishes the impact of $\Delta W$, often leading to less pronounced fine-tuning effects; conversely, a larger $\alpha$ value amplifies its contribution, which increases the risk of overfitting on the downstream task. It is common practice to fix the ratio $\alpha / r$ for a given task. At inference time, $\Delta W$ is directly merged into $W$ according to the equation above, so no additional latency is introduced compared to the original LLM.
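The mechanics described above can be condensed into a minimal PyTorch sketch of a LoRA-wrapped linear layer; it follows the standard LoRA formulation rather than the implementation of any particular library.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Minimal LoRA wrapper around a frozen linear layer (sketch).
    Forward pass: h = W x + (alpha / r) * B A x, with W frozen,
    A initialized from a Gaussian and B initialized to zero."""

    def __init__(self, base: nn.Linear, r=8, alpha=16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():          # freeze the pre-trained weights W
            p.requires_grad = False
        d_out, d_in = base.weight.shape
        self.A = nn.Parameter(torch.randn(r, d_in) * 0.01)  # Gaussian init, mean 0
        self.B = nn.Parameter(torch.zeros(d_out, r))         # zero init => BA = 0
        self.scale = alpha / r

    def forward(self, x):
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

    @torch.no_grad()
    def merge(self):
        """Fold the low-rank update into W for inference: W <- W + (alpha/r) B A."""
        self.base.weight += self.scale * (self.B @ self.A)
```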
Specifically, for Transformer-based models, LoRA only needs to fine-tune the self-attention part of each layer. The self-attention mechanism works through three key matrices: the query matrix $W_Q$, the key matrix $W_K$, and the value matrix $W_V$. In our design, as illustrated in Figure 5, LoRA is applied separately to the $W_Q$, $W_K$, and $W_V$ matrices. We only need to introduce the low-rank matrices $A$ and $B$ to represent the weight updates, rather than representing $W_Q$, $W_K$, and $W_V$ directly. After training is completed, the outputs are transformed through the mapping matrices to obtain the results presented in Equation (7). All models in this study use a supervised fine-tuning (SFT) method with an instruction dataset. SFT typically requires only a small amount of labeled data to effectively guide the model to capture the desired pattern in a particular task.
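In practice, this setup can be expressed with the Hugging Face PEFT library as sketched below; the base model name and the target module names ("q_proj", "k_proj", "v_proj") are assumptions that depend on the specific LLM architecture being fine-tuned.

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

# Sketch of applying LoRA to the attention projections; the checkpoint name
# and module names are illustrative assumptions, not this study's exact setup.
model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")
config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj"],  # W_Q, W_K, W_V
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, config)
model.print_trainable_parameters()  # only the LoRA matrices remain trainable
```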