Navigating the Future: A Novel PCA-Driven Layered Attention Approach for Vessel Trajectory Prediction with Encoder–Decoder Models

Er, Fusun; Yalman, Yıldıray

doi:10.3390/app15168953


by Fusun Er * and Yıldıray Yalman

Department of Computer Engineering, Piri Reis University, Tuzla 34940, Istanbul, Turkey

* Author to whom correspondence should be addressed.

Appl. Sci. 2025, 15(16), 8953; https://doi.org/10.3390/app15168953

Submission received: 17 June 2025 / Revised: 5 August 2025 / Accepted: 6 August 2025 / Published: 14 August 2025


Abstract

This study introduces a novel deep learning architecture for vessel trajectory prediction based on Automatic Identification System (AIS) data. The motivation stems from the increasing importance of maritime transport and the need for intelligent solutions to enhance safety and efficiency in congested waterways, particularly with respect to collision avoidance and real-time traffic management. Special emphasis is placed on river navigation scenarios, which limit maneuverability and therefore demand higher forecasting precision than open-sea navigation. To address these challenges, we propose a Principal Component Analysis (PCA)-driven layered attention mechanism integrated within an encoder–decoder model. The PCA step reduces redundancy and enhances the representation of spatiotemporal features, allowing the layered attention modules to focus more effectively on salient positional and movement patterns across multiple time steps. This dual-level integration offers a deeper contextual understanding of vessel dynamics. A carefully designed evaluation framework with statistical hypothesis testing demonstrates the superiority of the proposed approach. The model achieved a mean positional error of 0.0171 nautical miles (SD: 0.0035), with a minimum error of 0.0006 nautical miles, outperforming existing benchmarks. These results confirm that our PCA-enhanced attention mechanism significantly reduces prediction errors, offering a promising pathway toward safer and smarter maritime navigation, particularly in traffic-critical riverine systems. While the current evaluation focuses on short-term horizons in a single river section, the methodology can be extended to complex environments such as congested ports or multi-ship interactions and to medium-term or long-term forecasting to further enhance operational applicability and generalizability.

Keywords: layered attention mechanism; encoder–decoder models; vessel trajectory prediction; maritime safety

1. Introduction

Maritime transport is the backbone of global trade, handling over 80 percent of cargo volumes and continuing to grow at over 2 percent annually (UNCTAD, 2023) [1]. This rapid expansion has resulted in increased congestion and heightened safety risks, particularly in confined waterways where accurate trajectory forecasting is essential for effective collision avoidance and maritime traffic management. As these challenges escalate, there is a growing demand for advanced, data-driven technologies that can enhance navigational safety and operational efficiency. Among these, precise vessel trajectory prediction, that is, forecasting a ship’s future positions and movements based on historical AIS data, plays a pivotal role.

Automatic Identification System (AIS) data [2] has become the cornerstone of modern vessel trajectory prediction systems due to its rich, real-time stream of navigational information. AIS provides detailed reports on vessel identity, position (latitude and longitude), speed over ground (SOG), course over ground (COG), and navigational status, all of which are essential for modeling movement patterns for maritime navigation [3]. These broadcasts, transmitted via VHF radio and received by both nearby vessels and coastal stations, enable continuous monitoring of maritime traffic. The widespread availability, standardized structure, and global coverage of AIS data make it an invaluable resource for developing predictive models. Moreover, its consistent timestamped nature facilitates the application of deep learning techniques that rely on sequential data, thus explaining its central role in collision avoidance systems, traffic management, and maritime domain awareness initiatives [4].

Leveraging AIS data for reliable trajectory forecasting involves significant challenges, despite its wide availability [5]. Vessel movement is influenced by numerous dynamic factors, including traffic density, environmental conditions, and navigational constraints, which vary significantly across different geographical contexts. Rivers, for instance, introduce unique challenges due to narrow channels, fluctuating water levels, and strong currents, requiring higher predictive precision compared to open-sea navigation. These spatiotemporal variabilities introduce dynamics and localized dependencies that conventional models often fail to capture, highlighting the need for adaptive and fine-grained predictive frameworks tailored specifically to inland waterway conditions.

Existing studies on vessel trajectory prediction vary considerably in terms of input feature selection, which significantly influences predictive performance. A common approach involves using only geospatial coordinates—latitude and longitude—as model inputs [6,7,8,9], treating trajectory forecasting as a pure spatial extrapolation problem. More recent methods have incorporated additional navigational features such as speed over ground (SOG) and course over ground (COG) to enrich the input space and better capture vessel dynamics [10,11,12,13,14,15,16,17,18]. However, the increased dimensionality can introduce noise and redundancy, especially in regions with high spatiotemporal variability. Furthermore, many of these models process features in isolation or concatenate them without considering their latent interdependencies.

Recently, deep learning models based on AIS data have become the dominant approach for trajectory prediction, with attention mechanisms playing a crucial role in capturing complex spatial and temporal dependencies. While attention-enhanced LSTMs and transformers have shown promising results in pedestrian and vehicle prediction tasks [19,20], their integration into maritime trajectory forecasting is still evolving. Some studies have adopted attention-based models for AIS data [15,21], whereas others rely on simpler or traditional methods [10,13], especially in inland waterway scenarios [7,16]. This variation highlights a gap and motivates the development of more tailored, attention-driven architectures for vessel trajectory prediction.

Attention mechanisms (AMs) are widely applied in machine learning across a broad range of tasks, including natural language processing (NLP), computer vision (CV), speech recognition, and behavior recognition. Depending on how the attention scores are computed, these mechanisms can be categorized into additive attention, multiplicative attention, self-attention, key–value attention, and other variants [22]. This study adopts multiplicative attention, specifically the formulation proposed by Luong et al. [23], as it provides a favorable balance between computational efficiency and alignment accuracy, which is an important consideration for large-scale AIS datasets with long sequential dependencies. Unlike additive attention, which requires an additional feed-forward network to compute scores, the multiplicative form directly measures similarity through dot products (optionally with a learned projection), making it less computationally demanding while still effectively capturing relevant temporal context.

It is important to note that many studies in vessel trajectory prediction rely primarily on limited input features such as latitude and longitude (LAT and LON) or a small set including speed over ground (SOG) and course over ground (COG). In real-world scenarios, access to comprehensive environmental and operational data can be restricted or unavailable. This study addresses the challenge of effectively utilizing such limited data to improve prediction performance. Rather than requiring extensive and hard-to-obtain inputs, the proposed approach focuses on maximizing the value of commonly available AIS data. This contribution is significant as it offers a practical solution applicable in data-constrained environments, which are common in maritime operations.

To make the most of these limited inputs, we propose a unified feature integration framework that incorporates Principal Component Analysis (PCA) within a layered attention mechanism designed for encoder–decoder architectures. Unlike prior studies that use either raw geospatial or concatenated navigational inputs, this approach jointly encodes LAT, LON, SOG, and COG into a compact lower-dimensional representation that captures the most informative variance while filtering out redundancy. The experimental results demonstrate that this fused representation outperforms models relying solely on geospatial inputs or simple feature concatenations, offering improved generalization and robustness across diverse maritime environments and significantly enhancing predictive accuracy, particularly in complex inland waterways.

In contrast to many existing studies that report only the best-case results, this study adopts a more rigorous evaluation methodology. Instead of focusing on a single optimal prediction, we compute and report the mean and standard deviation of performance metrics over the entire test set. Furthermore, we apply statistical hypothesis testing to compare models based on the distribution of their prediction errors, providing a more robust and interpretable assessment of model performance. This approach ensures that the observed improvements are not incidental but generalize reliably across diverse trajectory samples.

The main objectives of this study are presented in Section 2. The experimental approach applied to achieve these objectives, including the proposed attention mechanism, is detailed in Section 3. The experimental results and discussions are provided in Section 4. Finally, conclusions and future work directions are summarized in Section 5.

2. Definitions and Research Objectives

This section introduces formal definitions essential for structuring the data used in our experiments and outlines the research objectives guiding our predictive modeling approach.

2.1. Key Definitions

To represent vessel movement using AIS data, the following concepts were defined formally.

Definition 1.

Timestamped Point: A timestamped point is a vector of AIS-derived attributes at a given time. Formally, the $i^{th}$ timestamped point of the $k^{th}$ ship is denoted as follows:

$$P_{i}^{k} = (\mathrm{lon}_{i}^{k}, \mathrm{lat}_{i}^{k}, \mathrm{cog}_{i}^{k}, \mathrm{sog}_{i}^{k}, t_{i}^{k}),$$

where $\mathrm{lon}$ and $\mathrm{lat}$ denote the ship’s location, $\mathrm{cog}$ and $\mathrm{sog}$ indicate course and speed over ground, and $t$ is the timestamp.

Definition 2.

Ship Trajectory: A ship trajectory is an ordered sequence of timestamped points:

$$T^{k} = (P_{1}^{k}, P_{2}^{k}, \dots, P_{m_{k}}^{k}),$$

where $m_{k}$ denotes the number of available AIS observations for vessel $k$.

Definition 3.

Trajectory Segment: A trajectory segment is a non-overlapping subseries extracted from a ship trajectory, consisting of two consecutive parts: an input sequence and its corresponding prediction targets. Formally, given a trajectory $T^{k} = (P_{1}^{k}, \dots, P_{m_{k}}^{k})$, where each $P_{i}^{k}$ is an AIS point with an associated timestamp $t_{i}^{k}$, a segment $S_{j}^{k}$ is defined as

$$S_{j}^{k} = (X_{j}^{k}, Y_{j}^{k}),$$

where the input sequence is

$$X_{j}^{k} = (P_{i}^{k}, \dots, P_{i + H_{x} - 1}^{k}),$$

and the prediction targets are selected based on future timestamps:

$$Y_{j}^{k} = \{ P_{r}^{k} \mid t_{r}^{k} \geq t_{i + H_{x} - 1}^{k} + \delta_{q},\ \delta_{q} \in \{\delta_{1}, \dots, \delta_{f}\} \}.$$

Here, each $\delta_{q} \in \mathbb{R}^{+}$ denotes a forecast horizon in minutes. For each $\delta_{q}$, the corresponding target point $P_{r}^{k}$ is the first AIS point whose timestamp is equal to or just after the required future time.

Trajectory segments are extracted such that they do not overlap. Depending on the total duration and sampling density of a trajectory, multiple segments can be generated from a single trajectory.
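The target-selection rule in Definition 3 can be illustrated with a minimal sketch. It assumes timestamps are expressed in minutes and sorted in ascending order; the function name `select_targets` and the sample values are hypothetical and are not taken from the study's implementation. The permissible horizon deviation is not applied here.

```python
from bisect import bisect_left


def select_targets(timestamps, last_input_idx, horizons):
    """Pick, for each horizon, the first point at or after the required time.

    timestamps     -- ascending timestamps (minutes) of one trajectory
    last_input_idx -- index of the last point of the input sequence
    horizons       -- forecast horizons delta_q in minutes
    Returns one index per horizon, or None if the trajectory ends too early.
    """
    t_now = timestamps[last_input_idx]
    targets = []
    for delta in horizons:
        # bisect_left returns the first index r with timestamps[r] >= t_now + delta
        r = bisect_left(timestamps, t_now + delta)
        targets.append(r if r < len(timestamps) else None)
    return targets


# Hypothetical trajectory; the input sequence ends at index 4 (t = 4.0 min).
timestamps = [0.0, 1.0, 2.1, 3.0, 4.0, 5.6, 6.4, 7.5, 8.6]
print(select_targets(timestamps, 4, [1.5, 2.5, 3.5, 4.5]))  # [5, 7, 7, 8]
```

Note that two horizons can map to the same AIS point when reports are sparse, as the second and third horizons do in this example.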

These definitions establish the structure of the sequential data used for trajectory forecasting.

2.2. Research Objectives

The overarching goal of this study is to evaluate the effectiveness of a PCA-enhanced layered attention mechanism for vessel trajectory prediction. To this end, we define three research objectives:

RO1: To identify the optimal combination of historical input length and forecast horizon by systematically analyzing their joint impact on trajectory prediction accuracy.
RO2: To assess the added value of dynamic navigational features (SOG and COG) in improving prediction accuracy beyond using spatial coordinates alone.
RO3: To evaluate whether incorporating attention mechanisms—particularly with PCA-guided feature fusion—enhances model performance across different maritime environments.

These objectives are designed in a progressive manner, where RO1 and RO2 explore essential factors affecting prediction accuracy, and RO3 builds upon these findings to assess the core contribution of this study—the PCA-enhanced layered attention mechanism. By aligning all objectives toward this overarching goal, this study aims to provide a comprehensive evaluation of how strategic feature integration and attention mechanisms can improve vessel trajectory prediction.

3. Methodology

To achieve the defined research objectives, the methodology of this study is organized into two main phases: data preprocessing and experimental evaluation. The data preprocessing phase is executed once and serves as a shared foundation for all subsequent experiments. Figure 1 presents a high-level overview of the overall workflow, highlighting the two major phases and their interdependencies. The outputs of the preprocessing step include the Preprocessed Ship Trajectory Segment Database (PSD) and the training–test split indexes, which are consistently used across all experiments to ensure fair comparisons.

As illustrated in Figure 1, the methodology is governed by three sets of parameters: Sampling Parameters, Experimental Parameters, and Deep Learning Parameters. Sampling and Deep Learning Parameters remain fixed throughout all experiments, whereas Experimental Parameters are tailored to each individual experiment configuration.

3.1. Methodological Parameters

This section explains the methodological parameters: Sampling parameters, Experimental Parameters, and Deep Learning Parameters (shown in Figure 1).

Sampling parameters determine the criteria for selecting trajectory segment(s) from a given ship trajectory, ensuring that the segments have a consistent structure in terms of the number of points and the time intervals between them. The histPointCount parameter determines the number of timestamped points, where the time interval between two adjacent points is histTimeInterval with a permissible deviation of ±histTimeIntervalDev. Then, the selected $m^{th}$ timestamped point of the trajectory represents the current time. Finally, the short-, medium-, and long-term forecast horizons are defined by the sampling parameter forecastHorizon with a permissible deviation of ±forecastHorizonDev.

Sampling parameters were common to all experiments, and their values are defined as follows: histPointCount, histTimeInterval, and histTimeIntervalDev were set to 5, 1, and 0.2, respectively. Four forecast horizons were defined, so forecastHorizons and forecastHorizonsDev are arrays of four decimal values, which are set to (1.5, 2.5, 3.5, 4.5) and (0.5, 0.5, 0.5, 0.5), respectively.
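As a rough illustration of how these sampling constraints might be checked, the sketch below validates a candidate history window and a candidate target point. It assumes all times are in minutes (the unit stated for forecast horizons in Definition 3); the constant and function names are hypothetical, and the study's own extraction procedure may differ.

```python
HIST_POINT_COUNT = 5          # histPointCount
HIST_TIME_INTERVAL = 1.0      # histTimeInterval (assumed to be minutes)
HIST_TIME_INTERVAL_DEV = 0.2  # histTimeIntervalDev


def is_valid_history(timestamps, start,
                     count=HIST_POINT_COUNT,
                     interval=HIST_TIME_INTERVAL,
                     dev=HIST_TIME_INTERVAL_DEV):
    """True if `count` points from `start` are spaced interval +/- dev apart."""
    window = timestamps[start:start + count]
    if len(window) < count:
        return False
    return all(abs((b - a) - interval) <= dev
               for a, b in zip(window, window[1:]))


def within_horizon(t_now, t_target, horizon, dev):
    """True if the target lies within horizon +/- dev of the current time."""
    return abs((t_target - t_now) - horizon) <= dev


print(is_valid_history([0.0, 1.1, 2.0, 3.1, 4.0], 0))
print(within_horizon(4.0, 5.6, horizon=1.5, dev=0.5))
```

A trajectory for which no window passes both checks would yield no segment, which matches the remark that some trajectories may not produce any valid trajectory segments.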

The experimental parameters are applied to the PSD database to extract subsets of trajectory segments, denoted as $X'$ and $Y'$, based on the experimental hypotheses. There are four parameters: numInT (number of historical points), numOutT (number of forecast points), numInM (input feature mask), and indexForecast (forecast step). The subset $X'$ consists of the last numInT timestamped points before the prediction step, while $Y'$ includes numOutT timestamped points starting from the appropriate position determined by indexForecast. By default, LAT and LON are included in each time point of $X'$. The inclusion of additional features such as COG and SOG is controlled by the Boolean-valued numInM mask. The $X'$ time series is used as input to the deep learning model, which is trained to predict $Y'$. The final model output corresponds to the last element in the $Y'$ sequence.
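One possible reading of this subset extraction is sketched below. The tuple layout `(lat, lon, sog, cog)`, the split of the mask into two Boolean flags, and the mapping of indexForecast to a zero-based start position are all assumptions made for illustration; the source does not specify these details.

```python
def build_subset(history, targets, num_in_t, num_out_t, index_forecast,
                 use_sog=False, use_cog=False):
    """Build (X', Y') for one segment.

    history  -- list of (lat, lon, sog, cog) tuples, oldest first
    targets  -- list of (lat, lon, sog, cog) tuples, one per forecast horizon
    The two flags play the role of the Boolean numInM mask (assumed layout).
    """
    x = []
    for lat, lon, sog, cog in history[-num_in_t:]:  # last numInT points
        row = [lat, lon]                            # LAT and LON are always kept
        if use_sog:
            row.append(sog)
        if use_cog:
            row.append(cog)
        x.append(row)
    start = index_forecast - 1                      # horizons are numbered from 1
    y = [[p[0], p[1]] for p in targets[start:start + num_out_t]]
    return x, y


history = [(30.50, -91.50, 5.0, 90.0),
           (30.51, -91.49, 5.1, 91.0),
           (30.52, -91.48, 5.2, 92.0)]
targets = [(30.53, -91.47, 5.3, 93.0),
           (30.54, -91.46, 5.4, 94.0)]
x, y = build_subset(history, targets, num_in_t=2, num_out_t=1,
                    index_forecast=2, use_sog=True)
print(x, y)
```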

Deep Learning Parameters: The models were trained on the dataset for up to 30 epochs, using a batch size of 500 and an initial learning rate of $10^{-3}$. The learning rate was decreased at each epoch by a gradient decay factor of $9 \times 10^{-2}$. To avoid over-fitting, a dropout layer with a rate of $10^{-1}$ was included in the model, and early stopping was employed: training ends before all 30 epochs are completed if the performance criteria are met, indicating that the training procedure has likely converged. These training parameter values are kept the same for all model training procedures to ensure fairness.

3.2. Preprocessing Stage

As illustrated in Figure 1, the preprocessing phase of the study involves a pipeline that converts raw AIS data into the PSD database with training–test splits. This pipeline is composed of two internal processes: the first (extract segments) produces the segment database (SD), which is the input of the second (preprocess segments).

This study utilizes freely accessible historical raw AIS messages provided by the international data platform Marine Cadastre. Marine Cadastre is known for its comprehensive repository of marine data, providing access to AIS messages from ships transiting global waterways; each message contains a ship’s identity, location, speed, course, and timestamp, and the data are distributed in CSV format. The AIS data used here cover the Mississippi River region (longitude: −91.7 to −91.2; latitude: 30.5 to 31.5) in January–March 2023. Records were excluded if the navigational status was not “Under way using its engine” (status value of zero), if latitude and longitude were outside the defined range, or if heading and COG properties were invalid. The Mississippi River, one of North America’s longest and most economically significant waterways, presents a unique set of vessel traffic properties [24]. Spanning a vast area from the Midwest to the Gulf of Mexico, its extensive network of navigable channels accommodates a wide range of vessel types, supporting the transportation of goods crucial to the U.S. economy.

Since the raw AIS data entries obtained are not suitable for directly feeding LSTM models, the raw data entries with the same vessel number were grouped together and sorted in ascending order by timestamp to form a vessel trajectory dataset (TD). The dataset contains multiple vessels navigating within the same waterway. In this study, each trajectory is modeled independently based solely on its own historical AIS data. Vessel-to-vessel interactions and potential collision scenarios are not explicitly modeled, as the primary objective is to forecast single-vessel trajectories.

The obtained trajectories vary in both the time intervals between consecutive timestamped points and the total number of timestamped points. To construct a dataset suitable for training the predictive model, a segment extraction procedure was applied based on predefined sampling parameter values. As a result, trajectories were partitioned into non-overlapping trajectory segments that satisfy the specified sampling constraints. An illustration depicting the relationship between a given trajectory and its corresponding extracted segment is presented in Figure 2. Trajectories that do not meet the defined sampling conditions may not yield any valid trajectory segments.

To ensure consistent feature scaling, facilitate faster convergence during training, and enhance the model’s generalization capability, a normalization procedure was applied to all four features in the dataset. Specifically, the speed over ground (SOG) and course over ground (COG) values were normalized using the min–max normalization technique, which is defined as

$$X_{norm} = \frac{X - X_{\min}}{X_{\max} - X_{\min}} \tag{1}$$

where $X$ represents the original feature value (i.e., SOG or COG), and $X_{\min}$ and $X_{\max}$ denote the minimum and maximum values observed within the trajectory segment dataset, respectively.

The latitude (LAT) and longitude (LON) features were transformed into relative values with respect to the first data point of the input sequence within each trajectory segment. Formally, for each trajectory segment, the relative position at time step $t$ is computed as follows:

$$\begin{aligned} \mathrm{LAT}_{rel}^{(t)} &= \mathrm{LAT}^{(t)} - \mathrm{LAT}^{(1)} \\ \mathrm{LON}_{rel}^{(t)} &= \mathrm{LON}^{(t)} - \mathrm{LON}^{(1)} \end{aligned} \tag{2}$$

where $\mathrm{LAT}^{(1)}$ and $\mathrm{LON}^{(1)}$ denote the latitude and longitude values of the first timestamped point in the input sequence, respectively.
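The two transformations in Equations (1) and (2) can be sketched in a few lines. The function names are hypothetical; the guard for a constant feature (where the denominator of Equation (1) would be zero) is an added assumption rather than something described in the study.

```python
def min_max(values):
    """Equation (1): scale one feature (e.g., all SOG values in the dataset) to [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumed handling of a constant feature to avoid division by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def to_relative(lat, lon):
    """Equation (2): express positions relative to the first input point."""
    return ([v - lat[0] for v in lat],
            [v - lon[0] for v in lon])


print(min_max([0.0, 5.0, 10.0]))                     # [0.0, 0.5, 1.0]
print(to_relative([30.5, 30.75], [-91.5, -91.25]))   # ([0.0, 0.25], [0.0, 0.25])
```

Because the minimum and maximum are taken over the whole trajectory segment dataset, `min_max` should be called once per feature on all values, not separately for each segment.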

3.3. Experimental Stage

To explore the research objectives outlined in Section 2.2, a series of experiments were designed and conducted following the preprocessing stage. To ensure consistency and comparability across experiments, all were evaluated on the same Preprocessed Ship Trajectory Segment Database (PSD) with identical training and testing splits. The procedures involved in the “Construct Experimental Datasets” and “Evaluate Hypothesis” stages (see Figure 1) depend on the specific hypothesis set for each experiment and are detailed separately in the following Experimental Design subsections.

3.3.1. Experimental Design for Research Objective 1

To investigate the impact of the number of historical timestamped data points on prediction performance, twelve configurations were tested by varying the parameters numInT and indexForecast. The numInT parameter was set to 2, 3, and 4, while indexForecast was set to 1, 2, 3, and 4, corresponding to four different forecast horizons. This resulted in a total of twelve combinations. The output length, numOutT, was fixed at one in all cases.

To evaluate performance differences across these settings, four statistical hypotheses were formulated—one for each forecast horizon. The analysis involved computing descriptive statistics, testing for normality, and applying either repeated-measure ANOVA or Friedman tests with post hoc comparisons depending on the distribution characteristics. Table 1 summarizes the hypotheses.
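For the non-parametric branch of this procedure, the Friedman statistic can be computed by hand as a sanity check. The sketch below is a plain-Python illustration without tie correction; in practice one would normally call `scipy.stats.friedmanchisquare`, which also corrects for ties. The closed-form p-value shown applies only to the three-group case (two degrees of freedom), which is the setting here because three input lengths are compared per horizon. The function names are hypothetical.

```python
import math


def average_ranks(row):
    """Rank the values in one row (1 = smallest), averaging ranks over ties."""
    order = sorted(range(len(row)), key=lambda j: row[j])
    ranks = [0.0] * len(row)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and row[order[j + 1]] == row[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for m in range(i, j + 1):
            ranks[order[m]] = avg
        i = j + 1
    return ranks


def friedman_statistic(*groups):
    """Friedman chi-square for k related groups of equal length n (no tie correction)."""
    k, n = len(groups), len(groups[0])
    rank_sums = [0.0] * k
    for row in zip(*groups):  # one row = the same test sample under each configuration
        for j, r in enumerate(average_ranks(row)):
            rank_sums[j] += r
    return 12.0 / (n * k * (k + 1)) * sum(s * s for s in rank_sums) - 3.0 * n * (k + 1)


def p_value_three_groups(chi2):
    """Chi-square survival function for 2 degrees of freedom (k = 3 groups)."""
    return math.exp(-chi2 / 2.0)


# Hypothetical distance errors for the same four samples under three input lengths.
stat = friedman_statistic([1, 2, 3, 4], [2, 3, 4, 5], [3, 4, 5, 6])
print(stat, p_value_three_groups(stat))
```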

3.3.2. Experimental Design for Research Objective 2

This part of the study examines whether incorporating SOG and COG as additional features improves predictions when only a limited number of historical timestamped points are available. Four configurations were considered based on the inclusion of these features. In all cases, numInT was set to 2, numOutT was 1, and indexForecast varied from 1 to 4.

The different configurations are denoted by binary superscripts: 00 indicates that neither SOG nor COG was included; 01 and 10 indicate that only SOG or only COG was included, respectively; 11 represents that both features were included. Hypothesis testing followed the same procedure as in the first objective, using a significance level of 0.01. The hypotheses are summarized in Table 2.

3.3.3. Experimental Design for Research Objective 3

The third objective focuses on evaluating the effect of attention mechanisms in the prediction architecture. For each forecast horizon, a separate experiment was conducted. The indexForecast parameter was set accordingly (from 1 to 4), and numOutT remained fixed at 1. The input length numInT was determined based on the best-performing configuration from the first objective, denoted as a, b, c, and d.

Each configuration was tested with and without the attention mechanism. The corresponding means are represented by $\mu$ (with attention) and $\mu'$ (without attention). For each horizon, a statistical test was performed to determine whether the attention mechanism led to significant performance changes. Depending on normality, either paired t-tests or Wilcoxon signed-rank tests were used. Table 3 outlines the hypotheses.

3.3.4. The Proposed Deep Learning Model with a Novel Layered Attention Mechanism

This section describes the deep learning methodology corresponding to the “Train DL Model” phase illustrated in Figure 1. The trained model has an encoder–decoder architecture with an attention mechanism and is designed to process the time series dataset (ED), whose state vector features are determined by the parameter inclusion mask defined in the “Experimental Parameters”.

The encoder comprises an LSTM network that learns a latent representation of the ED dataset. Subsequently, an attention mechanism is applied to enhance the model’s ability to focus on relevant features, thereby improving prediction accuracy. In this study, the widely adopted Luong attention mechanism is used as a baseline for comparison against the proposed novel attention approach. Finally, the decoder LSTM generates the predicted geospatial location coordinates (latitude and longitude) of the vessel.

The proposed layered attention mechanism, depicted in Figure 3, modifies the encoder and attention components to incorporate all four AIS features—latitude (LAT), longitude (LON), course over ground (COG), and speed over ground (SOG)—based on the premise that each feature contains critical information essential for enhancing predictive accuracy.

Specifically, the encoder consists of three separate LSTM branches trained independently on geospatial information (LAT and LON), COG, and SOG features to learn distinct latent representations (hidden states). A dot-product attention operation is applied separately to each hidden state, yielding respective context vectors representing weighted hidden states.
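A dot-product attention step of the kind applied to each branch can be sketched as follows. This is a generic Luong-style dot score in plain Python, not the authors' implementation; using a decoder state as the query, and the names `dot_attention`, `decoder_state`, and `encoder_states`, are assumptions for illustration.

```python
import math


def dot_attention(decoder_state, encoder_states):
    """Return (weights, context) for dot-product attention.

    decoder_state  -- query vector (assumed to come from the decoder)
    encoder_states -- list of hidden-state vectors, one per input time step
    """
    # Alignment scores: dot product between the query and each hidden state.
    scores = [sum(d * h for d, h in zip(decoder_state, hs))
              for hs in encoder_states]
    # Numerically stable softmax over time steps.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Context vector: attention-weighted sum of the hidden states.
    dim = len(encoder_states[0])
    context = [sum(w * hs[i] for w, hs in zip(weights, encoder_states))
               for i in range(dim)]
    return weights, context


weights, context = dot_attention([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
print(weights, context)
```

In the layered design described above, such a step would be run once per branch (geospatial, COG, and SOG), giving three context vectors.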

To reduce noise and redundancy, Principal Component Analysis (PCA) is applied to each context vector to retain 90% of the explained variance, followed by inverse PCA to restore the original dimensionality. This PCA–inverse PCA (PCA–iPCA) procedure transforms the high-dimensional hidden states into a compact orthogonal principal component space, which improves computational efficiency and enhances the attention mechanism’s focus on the most informative latent features learned by each LSTM branch.
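The PCA–iPCA step can be illustrated with a short NumPy sketch based on the singular value decomposition. It assumes that PCA is fitted over a batch of context vectors (rows are samples, columns are hidden units); the source does not state over which set the components are estimated, so this choice and the name `pca_ipca` are hypothetical.

```python
import numpy as np


def pca_ipca(context, variance_to_keep=0.90):
    """Project onto the leading principal components, then map back.

    context -- array of shape (n_samples, n_features)
    Returns the restored array (same shape) and the number of components kept.
    """
    mean = context.mean(axis=0)
    centered = context - mean
    # Rows of vt are the principal directions; s**2 is proportional to variance.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    explained = s ** 2
    total = explained.sum()
    if total == 0:
        return context.copy(), 0
    ratio = np.cumsum(explained) / total
    # Smallest number of components whose cumulative variance reaches the target.
    r = min(int(np.searchsorted(ratio, variance_to_keep)) + 1, len(s))
    components = vt[:r]
    scores = centered @ components.T          # PCA: compact representation
    restored = scores @ components + mean     # inverse PCA: original dimensionality
    return restored, r


rng = np.random.default_rng(0)
batch = rng.normal(size=(32, 8))
restored, kept = pca_ipca(batch)
print(restored.shape, kept)
```

The restored vectors have the original dimensionality but lie in the subspace spanned by the retained components, which is how the step discards the low-variance directions.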

Finally, the refined context vectors are integrated in a multi-branch, layered attention architecture that effectively combines all four AIS features to produce more accurate vessel position predictions.

In Figure 3, the context vector marked with a single red asterisk (*) corresponds to the reference model’s attention vector, whereas the context vector marked with double red asterisks (**) denotes the enhanced context vector used in the proposed model. The proposed approach improves upon the baseline by incorporating additional movement information (COG and SOG) into the attention mechanism, thereby enriching the context’s representation.

3.3.5. Performance Evaluation Metrics

The performance of the trained deep learning model is evaluated using the Haversine distance to quantify the difference between the actual and predicted geospatial locations. This metric, referred to as the “distance error (DE)”, measures the great-circle distance between the true and predicted positions on the Earth’s surface.

The following equation is the haversine formula used to calculate the great-circle distance between two points, where the first point represents the ground truth location and the second point represents the predicted location. Let the points be $(\varphi_{1}, \lambda_{1})$ and $(\varphi_{2}, \lambda_{2})$, where $\varphi$ represents latitude, $\lambda$ represents longitude, $\Delta\varphi = \varphi_{2} - \varphi_{1}$, $\Delta\lambda = \lambda_{2} - \lambda_{1}$, and $r$ is Earth’s radius (mean radius = 6371 km). The output of the equation, denoted as $d$, is called the “distance error”; the lower the distance error, the more accurate the model’s predictions are considered to be. Finally, distance errors are calculated for each datum in the test dataset, and the resulting set of distance errors is referred to as the metric distribution (MD).

$$d = 2 r \cdot \arcsin\left(\sqrt{\sin^{2}\left(\frac{\Delta\varphi}{2}\right) + \cos(\varphi_{1}) \cdot \cos(\varphi_{2}) \cdot \sin^{2}\left(\frac{\Delta\lambda}{2}\right)}\right) \tag{3}$$
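Equation (3) translates directly into code. The sketch below takes coordinates in decimal degrees and uses the mean Earth radius given above; the conversion to nautical miles (1 nautical mile = 1.852 km) is included only because the reported errors are expressed in that unit. The function names are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0
KM_PER_NAUTICAL_MILE = 1.852


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    # Clamp to 1.0 to guard against rounding slightly above the arcsin domain.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def distance_error_nm(true_pos, pred_pos):
    """Distance error between (lat, lon) pairs, in nautical miles."""
    return haversine_km(*true_pos, *pred_pos) / KM_PER_NAUTICAL_MILE


# Hypothetical ground-truth and predicted positions inside the study region.
print(distance_error_nm((30.9000, -91.5000), (30.9002, -91.5001)))
```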

4. Results and Discussion

As summarized in Figure 1, the methodology of this study consists of two parts: a preprocessing phase and an experimental phase. As an outcome of the preprocessing phase, a total of 773 ship trajectories were obtained from raw data, which were collected as separate CSV files for each day between 1 January and 31 March. Each trajectory is composed of a different number of timestamped points with different state vector values. Ship trajectories are characterized by various metrics: duration, distance, number of timestamped points, average duration between timestamped points, average SOG, and average COG. A total of 18,649 “trajectory segments” were sampled to compose the segment database (SD), of which 3730 were reserved for the testing procedure. In this study, the same training–test split is used in each experiment to ensure result comparability when interpreting the findings. Finally, the experimental phase results are grouped below in terms of the respective research objectives.

4.1. Evaluation of Hypotheses H1 to H4

Table 4 presents the descriptive statistics and results of normality assessments for the performance metric distributions (MDs). Shapiro–Wilk tests were conducted for each group, revealing that all twelve distributions significantly deviated from normality ($p < 0.0001$), which was further supported by skewness and kurtosis values. Despite the large sample size ($n = 3730$), the assumption of normality was not met. Therefore, non-parametric statistical methods were adopted to ensure the robustness of the analysis. Specifically, the Friedman test, the non-parametric alternative to repeated-measures ANOVA, was employed to evaluate differences across the three groups within each hypothesis. The results revealed statistically significant differences for Hypothesis-1 ($\chi^{2} = 25.46$, $p < 0.0001$), Hypothesis-2 ($\chi^{2} = 2823.54$, $p < 0.0001$), Hypothesis-3 ($\chi^{2} = 5669.23$, $p < 0.0001$), and Hypothesis-4 ($\chi^{2} = 5961.03$, $p < 0.0001$), providing strong evidence of differences among the groups within each hypothesis.

All twelve distributions are leptokurtic (kurtosis > 3), indicating heavier tails compared to a normal distribution and a higher likelihood of extreme values. Among them, $μ_{4}^{2}$ exhibits the highest kurtosis value, suggesting the most pronounced tail heaviness and the greatest presence of outliers. Additionally, the distributions with positive skewness, where the median is lower than the mean of the distribution, indicate that while most values are relatively low, a few large values are pulling the mean upward.
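The shape statistics discussed here can be computed from central moments. The sketch below assumes the moment-based (population) definitions with non-excess kurtosis, which matches the "kurtosis > 3" threshold used in the text. The function name and the toy sample are illustrative assumptions, and the paper's software may apply small-sample corrections.

```python
def moments_summary(errors):
    """Return mean, median, skewness and (non-excess) kurtosis of a sample."""
    n = len(errors)
    mean = sum(errors) / n
    m2 = sum((x - mean) ** 2 for x in errors) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in errors) / n  # third central moment
    m4 = sum((x - mean) ** 4 for x in errors) / n  # fourth central moment
    ordered = sorted(errors)
    mid = n // 2
    median = ordered[mid] if n % 2 else (ordered[mid - 1] + ordered[mid]) / 2
    return {
        "mean": mean,
        "median": median,
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2,  # equals 3 for a normal distribution
    }


# Toy sample (made up): nine small errors and one large outlier.
toy_sample = [0.01] * 9 + [0.11]
summary = moments_summary(toy_sample)
print(summary)
```

In this toy sample a single large error is enough to push the mean above the median and the kurtosis above 3, which is the pattern the paragraph describes.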

Evaluating the distributions for each hypothesis across multiple metrics (mean, standard deviation, median, skewness, and kurtosis) shows that the distributions with the smallest mean values ($μ_{1}^{2}$ for H1, $μ_{2}^{2}$ for H2, $μ_{3}^{4}$ for H3, and $μ_{4}^{2}$ for H4) also generally exhibit favorable characteristics across the other metrics. These distributions not only have the lowest means, indicating better overall performance, but also show lower variability and consistent central tendencies, making them the most stable and reliable choices within each hypothesis.

The evaluation of the performance metrics across different forecast horizons and historical time points reveals nuanced insights into the optimal input length for accurate vessel trajectory prediction. For Forecasts 1, 2, and 4, the distributions with two historical time points ($μ_{1}^{2}$, $μ_{2}^{2}$, and $μ_{4}^{2}$) consistently demonstrate the lowest mean error values along with favorable stability indicators such as lower standard deviations and relatively balanced skewness. This suggests that for these horizons, a shorter historical window is sufficient to capture the vessel’s movement dynamics effectively, leading to more reliable and computationally efficient predictions. Conversely, Forecast 3 shows a distinct pattern where the distribution with four historical points ($μ_{3}^{4}$) outperforms the others, indicating that the model benefits from a longer temporal context at this forecast horizon to better accommodate more complex or less predictable vessel behavior.

Although distributions with the smallest mean errors generally indicate better average performance, a closer look at kurtosis and skewness reveals the presence of heavier tails and outliers, particularly in some cases such as $μ_{4}^{2}$, which exhibits high kurtosis, suggesting more extreme values despite the low mean. This highlights the importance of considering not only the central tendency but also the variability and tail behavior of the distributions when selecting the optimal historical length. Hence, while $μ_{4}^{2}$ shows promising average accuracy for Forecast 4, practitioners should be cautious of potential outliers that could impact operational reliability. Balancing mean error and distribution shape characteristics leads to a more robust model choice, underscoring the need for adaptive historical input lengths tailored to specific forecast horizons to optimize both accuracy and stability in vessel trajectory prediction.

One of the trajectories whose evaluation metrics are summarized in Table 4 was selected to illustrate the distance errors at the first, second, third, and fourth forecast horizons, as shown in Figure 4. The sequence of blue carets represents the current time point at the top, with historical time points arranged consecutively below. Black-filled circles denote the ground truth positions used for comparison, while three sets of predicted positions are overlaid on the topographical map using red, green, and blue filled circles. Red circles correspond to predictions based on two historical data points, green circles represent predictions using three historical points, and blue circles indicate predictions relying on a single historical data point.

As seen in Table 4, predictions incorporating two historical time points (red) generally achieve lower distance errors compared to those based on a single point (blue), demonstrating improved accuracy. Additionally, predictions utilizing three historical points (green) occasionally show further refinement but with diminishing returns, consistent with the observed mean and variability metrics. This suggests that while increasing the number of historical points can enhance prediction accuracy, the marginal benefit decreases beyond two points for most forecast horizons. These visual and statistical insights support the selection of optimal historical context lengths depending on the forecast horizon and model complexity.
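The distance errors discussed in this subsection are separations between a predicted and a ground-truth position, expressed in nautical miles. As a minimal sketch, the code below assumes a great-circle (haversine) distance on a spherical Earth. The metric actually used in the study is defined in Section 3.3.5 and may differ. The function name and the two coordinate pairs are hypothetical.

```python
import math

EARTH_RADIUS_NMI = 6371.0 / 1.852  # mean Earth radius (km) converted to nautical miles


def haversine_nmi(lat1, lon1, lat2, lon2):
    """Great-circle distance in nautical miles between two points given in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (lat1, lon1, lat2, lon2))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_NMI * math.asin(math.sqrt(a))


# Hypothetical ground-truth and predicted positions (LAT, LON in degrees).
truth = (38.9000, -90.2000)
predicted = (38.9003, -90.2004)
error_nmi = haversine_nmi(*truth, *predicted)
print(f"distance error: {error_nmi:.4f} nmi")
```

One arc-minute of latitude is roughly one nautical mile, which gives a quick sanity check for any implementation of this kind.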

4.2. Evaluation of Hypotheses H5 to H8

The results presented in Table 5 build on the best-performing configurations in Table 4, namely $μ_{1}^{2}$, $μ_{2}^{2}$, $μ_{3}^{4}$, and $μ_{4}^{2}$, which correspond to the $μ_{*}^{00}$ distributions in H5–H8 and represent baseline predictions using only LAT and LON. This section evaluates the effect of adding SOG and/or COG to the baseline. For three hypotheses (H5, H6, and H8), the baseline $μ_{*}^{00}$ remains the most effective, yielding the lowest mean distance errors and displaying stable distribution properties such as lower standard deviation and moderate kurtosis. This suggests that, in these scenarios, enriching the input with SOG and COG does not yield additional benefits, and it may even degrade model performance by introducing complexity or noise. However, an exception is observed in H7, where $μ_{3}^{01}$, which includes COG in addition to LAT and LON, outperforms the baseline. This implies that course information contributes positively to prediction accuracy for longer-term forecasting. In summary, while SOG and COG can enhance prediction in specific contexts, particularly in longer horizons, their contribution is not consistent. A selective and context-aware feature design is therefore essential for optimizing trajectory prediction models. Although these dynamic features contain valuable information, as emphasized in the motivation of this study, their predictive power may remain underutilized unless they are effectively and appropriately incorporated into the modeling process.

4.3. Evaluation of Hypotheses H9 to H12

The evaluation results presented in Table 6 demonstrate the effectiveness of the proposed model that integrates SOG and COG. Compared to the reference model, the proposed model yields significantly lower error values across all four hypotheses (H9–H12), with all differences statistically significant at $p < 0.001$.

For example, in H9 and H10, the mean distance errors were reduced from 0.041 to 0.029 and from 0.023 to 0.017, respectively, indicating a substantial improvement in short-term forecasts. Although the improvements in H11 and H12 are smaller in magnitude, the consistent advantage across forecast horizons demonstrates the robustness of the proposed approach.
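The size of these improvements can be expressed as a relative reduction in mean error. The short sketch below uses the rounded means from Table 6, so the percentages are approximate. The dictionary and function names are illustrative.

```python
# (reference mean, proposed mean) in nautical miles, as reported in Table 6.
TABLE6_MEANS = {
    "H9": (0.041, 0.029),
    "H10": (0.023, 0.017),
    "H11": (0.042, 0.042),
    "H12": (0.033, 0.032),
}


def relative_reduction(reference, proposed):
    """Percentage by which the proposed mean error is lower than the reference."""
    return 100.0 * (reference - proposed) / reference


reductions = {h: relative_reduction(ref, prop) for h, (ref, prop) in TABLE6_MEANS.items()}
for name, pct in reductions.items():
    print(f"{name}: {pct:.1f}% lower mean error")
```

On these rounded values the reduction is about 29% for H9 and 26% for H10, while for H11 and H12 the rounded means differ by at most 0.001 nautical miles.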

These findings validate the motivation behind this study: While SOG and COG contain valuable navigational signals, their benefits can only be fully realized when they are properly encoded into the model. The proposed model offers a more structured and context-aware incorporation of these features, leading to measurable performance gains. This highlights the importance of architectural design in leveraging auxiliary motion attributes such as speed and course for trajectory prediction.

Vessel trajectory prediction fundamentally involves forecasting a vessel’s future positions based on sequential, timestamped AIS data collected over time. Over recent decades, a wide spectrum of methodologies has emerged to extract meaningful patterns and trends from vessel trajectory data. The Automatic Identification System (AIS) serves as the predominant data source for these investigations owing to its high reliability, extensive global coverage, and rich informational content. Originally developed for vessel traffic management, AIS transmits real-time data at fixed intervals, including critical positional parameters (latitude and longitude) and dynamic movement metrics such as speed over ground (SOG) and course over ground (COG).

Seminal studies leverage AIS data to elucidate regional collision patterns, thereby delineating high-risk zones and furnishing crucial insights to bolster maritime safety [25,26]. The majority of existing research concentrates on the four principal AIS attributes, namely location, SOG, COG, and timestamps [10,11,12,13,14,15,16,17,18], while a subset restricts analysis exclusively to positional coordinates [6,7,8,9]. Other work extends beyond AIS attributes: Bi et al. [27] augmented AIS-based models with environmental variables, including wind and wave data, to enhance prediction precision.

A considerable segment of the literature targets coastal zones, where factors such as rising sea levels and coastal erosion introduce complexities to long-term trajectory forecasting. For example, Slaughter et al. [11] conducted long-term trajectory predictions by employing all four core AIS attributes. Coastal environments present particular challenges for short-term prediction tasks due to their dynamic nature, elevated traffic densities, and swiftly fluctuating environmental conditions. Models designed for these areas must adapt quickly to real-time updates. Studies such as [6,15,17,21] focused on short-term trajectory predictions in coastal regions. Among them, Sekhon et al. [21] relied only on location data, while Alam et al. [10] and Cheng et al. [13] achieved better results by incorporating longer historical sequences, even with the trade-off between accuracy and real-time performance.

Research efforts have also expanded into archipelagic environments, characterized by intricate traffic flows and challenging natural topographies. Li et al. [12], for instance, analyzed medium-to-short-term trajectory predictions across both coastal and archipelagic zones. Inland waterways, exemplified by rivers, introduce further complexities arising from narrow, sinuous channels, fluctuating water levels, and robust currents [7,9,16].

Recently, deep learning techniques have surged in prominence within maritime trajectory prediction, with attention mechanisms playing a pivotal role in augmenting model efficacy. Attention mechanisms enable models to selectively concentrate on the most salient portions of sequential inputs, a critical capability for time-series forecasting tasks. For instance, Zhou et al. [19] applied attention to pedestrian trajectory prediction, while Messaoud et al. [20] used it in vehicle forecasting. In the maritime domain, attention-based LSTM architectures have proven effective in capturing long-term dependencies [7,11,15,21]. Transformer-based models [12], CNN-GRU hybrids [27], and spatiotemporal graph networks [14] further show how attention helps improve predictive accuracy.

Despite the growing popularity of attention mechanisms, some researchers prefer simpler, more computationally efficient models that omit them [8,10,13,18]. These models offer advantages in terms of speed but often struggle to identify the most informative features in high-dimensional settings.

The experimental results demonstrate that the proposed model outperforms recent studies in terms of distance error measured in nautical miles. Alam et al. [10] reported distance errors of approximately 370 m, 742 m, and 1.2 km for forecast horizons of 10, 20, and 30 min, respectively—corresponding to roughly 0.2, 0.4, and 0.65 nautical miles. Slaughter et al. [11] achieved a mean error of 0.88 km (0.475 nautical miles). Li et al. [12] reported a best position error of 0.0006 degrees, corresponding to approximately 0.036 nautical miles (latitude) or 0.028 nautical miles (longitude) around Zhoushan. Zhang et al. [14] reported an average of 0.8 nautical miles over ten sample routes, while Wang et al. [15] achieved 508 m (0.274 nautical miles) for a 1 min horizon. You et al. [16] reported an RMSE of 0.00386 degrees in an inland scenario.
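The unit conversions behind this comparison are simple enough to check directly. The sketch below assumes the international definition of the nautical mile (1852 m) and the usual approximation of 60 nautical miles per degree of latitude. The helper names are illustrative, and the input figures are the ones quoted above.

```python
METERS_PER_NMI = 1852.0      # international nautical mile, exact by definition
NMI_PER_DEGREE_LAT = 60.0    # one arc-minute of latitude is about one nautical mile


def meters_to_nmi(meters):
    """Convert a distance in meters to nautical miles."""
    return meters / METERS_PER_NMI


def lat_degrees_to_nmi(degrees):
    """Convert a north-south offset in degrees of latitude to nautical miles."""
    return degrees * NMI_PER_DEGREE_LAT


# Errors reported in meters by the cited studies, as quoted in the text.
reported_m = {
    "Alam et al., 10 min": 370,
    "Alam et al., 20 min": 742,
    "Alam et al., 30 min": 1200,
    "Slaughter et al.": 880,
    "Wang et al., 1 min": 508,
}
for label, meters in reported_m.items():
    print(f"{label}: {meters_to_nmi(meters):.3f} nmi")
print(f"Li et al., best (latitude): {lat_degrees_to_nmi(0.0006):.3f} nmi")
```

A longitude offset shrinks by the cosine of the latitude, so converting errors reported in degrees of longitude additionally requires knowing where the data were collected.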

In comparison, our proposed model achieved significantly lower mean distance errors—ranging between 0.017 and 0.042 nautical miles across all tested forecast horizons and input feature combinations (see Table 6). These results highlight the effectiveness of our attention-based architecture, particularly in capturing the latent dynamics conveyed by SOG and COG, which have often been underutilized in the literature despite their recognized potential.

While acknowledging the inherent variability across studies—stemming from differences in forecast horizons, regional focus, input features, and error metrics—we took careful measures to ensure a fair and context-aware comparison. To minimize inconsistencies, all reported errors from the literature were converted to nautical miles, providing a consistent basis for evaluation. Moreover, rather than merely aggregating the overall results, our approach includes hypothesis-driven evaluations that examine the influence of different input configurations, including SOG and COG, under controlled experimental settings. This design choice strengthens the validity of comparisons and offers a deeper understanding of the proposed model’s performance in relation to prior works.

5. Conclusions and Future Work

This study presents a deep learning-based framework for vessel trajectory prediction that systematically explores the impact of incorporating dynamic ship-based features—specifically, course over ground (COG) and speed over ground (SOG)—alongside geospatial information. A series of controlled experiments conducted across multiple forecast horizons demonstrated that leveraging these auxiliary inputs can significantly improve predictive accuracy. The experimental design enabled a structured comparison of different model configurations, revealing the influence of both the number of historical timestamped points and the integration of attention mechanisms.

In particular, the proposed layered attention architecture, which separately encodes geospatial, SOG, and COG information through dedicated LSTM modules before fusing them via attention, effectively enhances the model’s ability to capture temporal and directional patterns. This selective focus on relevant features contributes to more accurate trajectory forecasts, especially in dynamic and constrained maritime environments.
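The fusion step can be pictured with a generic attention example. The sketch below is not the proposed PCA-driven layered attention mechanism: it omits the LSTM encoders and the PCA step entirely and shows only how scaled dot-product attention could weight three per-feature encodings against a query. All vectors, dimensions, and names are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_fuse(query, encodings):
    """Scaled dot-product attention over per-feature encodings."""
    scale = math.sqrt(len(query))
    scores = [sum(q * e for q, e in zip(query, enc)) / scale for enc in encodings]
    weights = softmax(scores)
    fused = [sum(w * enc[i] for w, enc in zip(weights, encodings))
             for i in range(len(query))]
    return weights, fused


# Hypothetical 4-dimensional summaries standing in for the three encoder outputs.
geo_enc = [0.9, 0.1, 0.0, 0.2]   # stand-in for the LAT/LON encoder
sog_enc = [0.1, 0.8, 0.1, 0.0]   # stand-in for the SOG encoder
cog_enc = [0.0, 0.2, 0.7, 0.1]   # stand-in for the COG encoder
decoder_state = [1.0, 0.0, 0.5, 0.0]  # hypothetical query vector

weights, fused = attention_fuse(decoder_state, [geo_enc, sog_enc, cog_enc])
print([round(w, 3) for w in weights])
```

The weights sum to one, so the fused vector is a convex combination of the three encodings, with the largest share going to the encoding most aligned with the query.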

Quantitative results further validate the effectiveness of the proposed approach. The model achieved a mean position error of 0.0171 nautical miles with a standard deviation of 0.0035 across the entire test dataset. The minimum observed error was as low as 0.0006 nautical miles, indicating the model’s capacity for highly precise short-term forecasting. These results not only outperform recent benchmarks but also highlight the potential of multi-feature attention-enhanced models for operational maritime applications.

The results consistently demonstrate that the proposed framework achieves measurable improvements over baseline models by effectively integrating dynamic navigational features (SOG and COG) into geospatial features (LAT and LON). These improvements align with the initial hypotheses, which predicted that (i) incorporating SOG and COG could enhance trajectory prediction under certain forecast horizons and (ii) the PCA-driven layered attention mechanism would more effectively encode these features within the LSTM architecture. The findings validate these hypotheses and confirm that the research objectives have been successfully met.

Future work may extend this methodology to more complex navigational scenarios, such as congested port approaches or inland waterways, and integrate additional domain-specific data, including vessel type, maneuvering status, and environmental variables (e.g., wind, current, water depth), to further improve the robustness, adaptability, and generalizability of the model.

Future work will also investigate extending the prediction horizon to medium- and long-term ranges (e.g., beyond 10 min), which are crucial for many practical maritime applications. From a methodological standpoint, further studies will explore the integration and empirical evaluation of a broader range of attention mechanisms, including transformer-based architectures and purely self-attentive models. Additionally, hybrid schemes combining PDLA with these alternative attention strategies will be examined to identify potential performance improvements.

Ultimately, the proposed architecture aims to serve as a scalable and interpretable solution for real-world, data-driven maritime trajectory forecasting tasks.

Author Contributions

Conceptualization, F.E. and Y.Y.; methodology, F.E.; software, F.E.; validation, F.E. and Y.Y.; formal analysis, F.E.; investigation, F.E.; resources, F.E.; data curation, F.E.; writing—original draft preparation, F.E. and Y.Y.; writing—review and editing, F.E. and Y.Y.; visualization, F.E.; supervision, Y.Y.; project administration, Y.Y.; funding acquisition, F.E. and Y.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This study is part of the projects entitled “Evaluation of Recurrent Neural Network Architectures Containing Attention Mechanism for Ship Navigation Prediction” (STB Project Code: 089300) and “Investigation and Optimization of AIS and Bathymetry Data Using Data Fusion Methods for Computer-Aided Ship Route Planning” (STB Project Code: 100611), supported by Piri Reis University.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Acknowledgments

The authors would like to express their sincere gratitude to Cüneyt Ezgi, Technopark Operations Manager, for his valuable support and encouragement throughout the course of this study. During the preparation of this manuscript, the authors used OpenAI’s ChatGPT (model: GPT-4, accessed via chatgpt.com) for assistance with language editing, refining technical phrasing, and improving the clarity of definitions and descriptions. After using this tool, the authors reviewed and edited the content as needed and take full responsibility for the final version of this manuscript.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

AIS	Automatic Identification System
LAT	Latitude
LON	Longitude
SOG	Speed over Ground
COG	Course over Ground
PCA	Principal Component Analysis

References

1. UNCTAD. Review of Maritime Transport 2023. 2023. Available online: https://unctad.org/publication/review-maritime-transport-2023 (accessed on 24 September 2024).
2. IMO (International Maritime Organization). AIS: The Automatic Identification System for Ships. 2001. Available online: https://www.imo.org/en/OurWork/Safety/Pages/AIS.aspx (accessed on 24 September 2024).
3. Emmens, T.; Amrit, C.; Abdi, A.; Ghosh, M. The promises and perils of Automatic Identification System data. Expert Syst. Appl. 2021, 178, 114975. Available online: https://www.sciencedirect.com/science/article/pii/S0957417421004164 (accessed on 1 August 2025).
4. Zhu, Q.; Xi, Y.; Weng, J.; Han, B.; Hu, S.; Ge, Y. Intelligent ship collision avoidance in maritime field: A bibliometric and systematic review. Expert Syst. Appl. 2024, 252, 124148. Available online: https://www.sciencedirect.com/science/article/pii/S0957417424010145 (accessed on 1 August 2025).
5. He, Z.; Liu, C.; Chu, X.; Wu, W.; Zheng, M.; Zhang, D. Dynamic domain-based collision avoidance system for autonomous ships: Real experiments in coastal waters. Expert Syst. Appl. 2024, 255, 124805. Available online: https://www.sciencedirect.com/science/article/pii/S0957417424016725 (accessed on 1 August 2025).
6. Wang, S.; Li, Y.; Xing, H.; Zhang, Z. Vessel trajectory prediction based on spatio-temporal graph convolutional network for complex and crowded sea areas. Ocean Eng. 2024, 298, 117232. Available online: https://www.sciencedirect.com/science/article/pii/S0029801824005699 (accessed on 1 August 2025).
7. Gao, D.W.; Zhu, Y.S.; Zhang, J.F.; He, Y.K.; Yan, K.; Yan, B.R. A novel MP-LSTM method for ship trajectory prediction based on AIS data. Ocean Eng. 2021, 228, 108956. Available online: https://www.sciencedirect.com/science/article/pii/S0029801821003917 (accessed on 1 August 2025).
8. Liu, R.; Hu, K.; Liang, M.; Li, Y.; Liu, X.; Yang, D. QSD-LSTM: Vessel trajectory prediction using long short-term memory with quaternion ship domain. Appl. Ocean Res. 2023, 136, 103592. Available online: https://www.sciencedirect.com/science/article/pii/S0141118723001335 (accessed on 1 August 2025).
9. Ma, Q.; Du, X.; Zhang, M.; Wang, H.; Lang, X.; Mao, W. A spatial-temporal attention method for the prediction of multi ship time headways using AIS data. Ocean Eng. 2024, 311, 118927. Available online: https://www.sciencedirect.com/science/article/pii/S0029801824022650 (accessed on 1 August 2025).
10. Alam, M.; Spadon, G.; Etemad, M.; Torgo, L.; Milios, E. Enhancing short-term vessel trajectory prediction with clustering for heterogeneous and multi-modal movement patterns. Ocean Eng. 2024, 308, 118303. Available online: https://www.sciencedirect.com/science/article/pii/S002980182401641X (accessed on 1 August 2025).
11. Slaughter, I.; Charla, J.; Siderius, M.; Lipor, J. Vessel trajectory prediction with recurrent neural networks: An evaluation of datasets, features, and architectures. J. Ocean Eng. Sci. 2024, 10, 229–238. Available online: https://www.sciencedirect.com/science/article/pii/S2468013324000081 (accessed on 1 August 2025).
12. Li, H.; Jiao, H.; Yang, Z. AIS data-driven ship trajectory prediction modelling and analysis based on machine learning and deep learning methods. Transp. Res. Part E Logist. Transp. Rev. 2023, 175, 103152. Available online: https://www.sciencedirect.com/science/article/pii/S1366554523001400 (accessed on 1 August 2025).
13. Cheng, R.; Liang, M.; Li, H.; Yuen, K. Benchmarking feed-forward randomized neural networks for vessel trajectory prediction. Comput. Electr. Eng. 2024, 119, 109499. Available online: https://www.sciencedirect.com/science/article/pii/S0045790624004269 (accessed on 1 August 2025).
14. Zhang, X.; Liu, J.; Gong, P.; Chen, C.; Han, B.; Wu, Z. Trajectory prediction of seagoing ships in dynamic traffic scenes via a gated spatio-temporal graph aggregation network. Ocean Eng. 2023, 287, 115886. Available online: https://www.sciencedirect.com/science/article/pii/S0029801823022709 (accessed on 1 August 2025).
15. Wang, S.; Li, Y.; Xing, H. A novel method for ship trajectory prediction in complex scenarios based on spatio-temporal features extraction of AIS data. Ocean Eng. 2023, 281, 114846. Available online: https://www.sciencedirect.com/science/article/pii/S0029801823012301 (accessed on 1 August 2025).
16. You, L.; Xiao, S.; Peng, Q.; Claramunt, C.; Han, X.; Guan, Z.; Zhang, J. ST-Seq2Seq: A Spatio-Temporal Feature-Optimized Seq2Seq Model for Short-Term Vessel Trajectory Prediction. IEEE Access 2020, 8, 218565–218574.
17. Xiao, Y.; Li, X.; Yao, W.; Chen, J.; Hu, Y. Bidirectional Data-Driven Trajectory Prediction for Intelligent Maritime Traffic. IEEE Trans. Intell. Transp. Syst. 2023, 24, 1773–1785.
18. Liu, Z.; Qi, W.; Zhou, S.; Zhang, W.; Jiang, C.; Jie, Y.; Li, C.; Guo, Y.; Guo, J. Hybrid deep learning models for ship trajectory prediction in complex scenarios based on AIS data. Appl. Ocean Res. 2024, 153, 104231. Available online: https://www.sciencedirect.com/science/article/pii/S0141118724003523 (accessed on 1 August 2025).
19. Zhou, H.; Ren, D.; Xia, H.; Fan, M.; Yang, X.; Huang, H. AST-GNN: An attention-based spatio-temporal graph neural network for Interaction-aware pedestrian trajectory prediction. Neurocomputing 2021, 445, 298–308. Available online: https://www.sciencedirect.com/science/article/pii/S092523122100388X (accessed on 1 August 2025).
20. Messaoud, K.; Yahiaoui, I.; Verroust-Blondet, A.; Nashashibi, F. Attention Based Vehicle Trajectory Prediction. IEEE Trans. Intell. Veh. 2021, 6, 175–185.
21. Sekhon, J.; Fleming, C. A Spatially and Temporally Attentive Joint Trajectory Prediction Framework for Modeling Vessel Intent. In Proceedings of the 2nd Conference On Learning For Dynamics And Control, Berkeley, CA, USA, 10–11 June 2020; Volume 120, pp. 318–327. Available online: https://proceedings.mlr.press/v120/sekhon20a.html (accessed on 1 August 2025).
22. Hu, K.; Xu, K. An overview: Attention mechanisms in multi-agent reinforcement learning. Neurocomputing 2024, 598, 128015.
23. Britz, D.; Goldie, A.; Luong, M.T.; Le, Q. Massive Exploration of Neural Machine Translation Architectures. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, Copenhagen, Denmark, 7–11 September 2017; pp. 1442–1451.
24. Carlson, B.; Propst, D.; Jackson, R. Economic Impact of Recreation on the Upper Mississippi River System; Technical Report; U.S. Army Corps of Engineers, Waterways Experiment Station: Vicksburg, MS, USA, 1995; EL-95-16.
25. Liu, Z.; Wu, Z.; Zheng, Z. Novel framework for regional collision risk identification based on AIS data. Appl. Ocean Res. 2019, 89, 261–272. Available online: https://www.sciencedirect.com/science/article/pii/S0141118718308198 (accessed on 1 August 2025).
26. Liu, T.; Xu, X.; Lei, Z.; Zhang, X.; Sha, M.; Wang, F. A multi-task deep learning model integrating ship trajectory and collision risk prediction. Ocean Eng. 2023, 287, 115870. Available online: https://www.sciencedirect.com/science/article/pii/S0029801823022540 (accessed on 1 August 2025).
27. Bi, J.; Gao, M.; Bao, K.; Zhang, W.; Zhang, X.; Cheng, H. A CNNGRU-MHA method for ship trajectory prediction based on marine fusion data. Ocean Eng. 2024, 310, 118701. Available online: https://www.sciencedirect.com/science/article/pii/S0029801824020390 (accessed on 1 August 2025).

Figure 1. An illustration of the methodology applied in this study.

Figure 2. Sampling trajectory segments based on predefined sampling parameters.

Figure 3. Schematic representation of the proposed layered attention mechanism.

Figure 4. Predictions across different forecast horizons.

Table 1. Hypothesis tests for different forecast horizons.

Forecast Horizon	Null Hypothesis ( $H_{0}$ )	Alternative Hypothesis ( $H_{A}$ )
H1: 1–2 min	$μ_{1}^{2} = μ_{1}^{3} = μ_{1}^{4}$	At least one mean differs
H2: 2–3 min	$μ_{2}^{2} = μ_{2}^{3} = μ_{2}^{4}$	At least one mean differs
H3: 3–4 min	$μ_{3}^{2} = μ_{3}^{3} = μ_{3}^{4}$	At least one mean differs
H4: 4–5 min	$μ_{4}^{2} = μ_{4}^{3} = μ_{4}^{4}$	At least one mean differs

Notation: $μ_{1}^{3}$ denotes the mean prediction error when using three historical timestamped points for the first forecast horizon.

Table 2. Hypothesis tests for different forecast horizons (COG and SOG inclusion).

Forecast Horizon	Null Hypothesis ( $H_{0}$ )	Alternative Hypothesis ( $H_{A}$ )
H5: 1–2 min	$μ_{1}^{00} = μ_{1}^{01} = μ_{1}^{10} = μ_{1}^{11}$	At least one mean differs
H6: 2–3 min	$μ_{2}^{00} = μ_{2}^{01} = μ_{2}^{10} = μ_{2}^{11}$	At least one mean differs
H7: 3–4 min	$μ_{3}^{00} = μ_{3}^{01} = μ_{3}^{10} = μ_{3}^{11}$	At least one mean differs
H8: 4–5 min	$μ_{4}^{00} = μ_{4}^{01} = μ_{4}^{10} = μ_{4}^{11}$	At least one mean differs

Notation: $μ_{1}^{01}$ denotes the population mean when SOG is included but COG is not for the first forecast horizon.

Table 3. Hypothesis tests for different forecast horizons (effect of attention mechanism).

Forecast Horizon	Null Hypothesis ( $H_{0}$ )	Alternative Hypothesis ( $H_{A}$ )
H9: 1–2 min	$μ_{1}^{a} = μ_{1}^{' a}$	The means differ
H10: 2–3 min	$μ_{2}^{b} = μ_{2}^{' b}$	The means differ
H11: 3–4 min	$μ_{3}^{c} = μ_{3}^{' c}$	The means differ
H12: 4–5 min	$μ_{4}^{d} = μ_{4}^{' d}$	The means differ

Notation: $μ_{1}^{a}$ and $μ_{1}^{' a}$ denote the mean prediction errors with and without the attention mechanism, respectively, for the first forecast horizon.

Table 4. Statistics of evaluation metrics for Hypotheses H1 to H4.

		Descriptive Statistics						Shapiro–Wilk
	$μ$	Range	Mean	Std	Median	S	K	p-Value	W
H1	$μ_{1}^{2}$	[0.0018–0.1209]	(*) 0.042	(*) 0.014	(*) 0.043	−0.26	3.95	<0.01	0.97
H1	$μ_{1}^{3}$	[0.0007–0.1407]	0.053	0.015	0.054	−0.17	4.62	<0.01	0.98
H1	$μ_{1}^{4}$	[0.0004–0.1420]	0.044	0.014	0.044	0.62	5.98	<0.01	0.96
H2	$μ_{2}^{2}$	[0.0011–0.0623]	(*) 0.023	(*) 0.006	(*) 0.022	1.30	7.85	<0.01	0.91
H2	$μ_{2}^{3}$	[0.0031–0.0963]	0.033	0.007	0.033	1.39	9.94	<0.01	0.90
H2	$μ_{2}^{4}$	[0.0042–0.0681]	0.029	0.006	0.029	0.27	6.79	<0.01	0.94
H3	$μ_{3}^{2}$	[0.0162–0.0857]	0.056	0.007	0.056	−0.69	5.97	<0.01	0.96
H3	$μ_{3}^{3}$	[0.0102–0.0808]	0.046	0.006	0.047	−0.82	8.15	<0.01	0.90
H3	$μ_{3}^{4}$	[0.0085–0.0822]	(*) 0.034	(*) 0.005	(*) 0.033	1.10	8.48	<0.01	0.93
H4	$μ_{4}^{2}$	[0.0079–0.0889]	(*) 0.033	(*) 0.006	(*) 0.033	1.40	12.03	<0.01	0.88
H4	$μ_{4}^{3}$	[0.0200–0.0963]	0.060	0.006	0.061	−0.79	9.96	<0.01	0.89
H4	$μ_{4}^{4}$	[0.0060–0.0994]	0.052	0.007	0.052	0.36	7.02	<0.01	0.94

(*) The smallest values among the groups for each hypothesis. S: skewness; K: kurtosis; W: W value for the Shapiro–Wilk normality test.

Table 5. Statistics of evaluation metrics for Hypotheses H5 to H8.

		Descriptive Statistics						Shapiro–Wilk
	$μ$	Range	Mean	Std	Median	S	K	p-Value	W
H5	$μ_{1}^{00}$	[0.0018–0.1209]	(*) 0.041	(*) 0.013	0.043	−0.27	3.95	<0.01	0.97
H5	$μ_{1}^{01}$	[0.0123–0.1808]	0.080	0.016	0.081	0.35	4.13	<0.01	0.99
H5	$μ_{1}^{10}$	[0.0004–0.1508]	0.045	0.022	0.044	0.37	3.10	<0.01	0.98
H5	$μ_{1}^{11}$	[0.0069–0.2121]	0.087	0.026	0.086	0.09	3.34	<0.01	0.99
H6	$μ_{2}^{00}$	[0.0011–0.0623]	(*) 0.023	(*) 0.005	0.022	1.30	7.84	<0.01	0.91
H6	$μ_{2}^{01}$	[0.0307–0.1184]	0.066	0.008	0.066	0.45	5.05	<0.01	0.97
H6	$μ_{2}^{10}$	[0.0011–0.1836]	0.058	0.023	0.058	0.24	3.45	<0.01	0.99
H6	$μ_{2}^{11}$	[0.0825–0.5754]	0.271	0.056	0.272	0.58	4.36	<0.01	0.97
H7	$μ_{3}^{00}$	[0.0162–0.0857]	0.055	0.006	0.056	−0.69	5.96	<0.01	0.95
H7	$μ_{3}^{01}$	[0.0170–0.0881]	(*) 0.042	(*) 0.005	0.041	0.63	7.19	<0.01	0.94
H7	$μ_{3}^{10}$	[0.0015–0.1848]	0.047	0.020	0.046	0.82	5.29	<0.01	0.96
H7	$μ_{3}^{11}$	[0.0007–0.3001]	0.107	0.040	0.107	0.43	3.67	<0.01	0.98
H8	$μ_{4}^{00}$	[0.0079–0.0889]	(*) 0.033	(*) 0.005	0.032	1.40	12.03	<0.01	0.87
H8	$μ_{4}^{01}$	[0.0033–0.0882]	0.035	0.006	0.035	0.92	9.95	<0.01	0.89
H8	$μ_{4}^{10}$	[0.0109–0.2294]	0.097	0.028	0.102	−0.28	2.83	<0.01	0.97
H8	$μ_{4}^{11}$	[0.0100–0.2729]	0.119	0.031	0.119	0.04	3.31	<0.01	0.99

(*) The smallest values among the groups for each hypothesis. S: skewness; K: kurtosis; W: W value for the Shapiro–Wilk normality test.

Table 6. Statistics of evaluation metrics for Hypotheses H9 to H12.

H	Reference Model (See Table 5)	Proposed Model	p-Value
H9	[0.0018–0.1209] 0.041 ± 0.013	[0.0011–0.1083] 0.029 ± 0.011	$p < 0.001$
H10	[0.0011–0.0623] 0.023 ± 0.005	[0.0006–0.0380] 0.017 ± 0.003	$p < 0.001$
H11	[0.0170–0.0881] 0.042 ± 0.005	[0.0161–0.0809] 0.042 ± 0.006	$p < 0.001$
H12	[0.0079–0.0889] 0.033 ± 0.005	[0.0044–0.0866] 0.032 ± 0.005	$p < 0.001$

Values are presented as [min–max] mean ± standard deviation.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
