A Comparative Study of Univariate Models for Baltic Dry Index Forecasting

Huang, Juan; Chu, Ching-Wu; Hsu, Hsiu-Li

doi:10.3390/forecast8010011

by Juan Huang ¹, Ching-Wu Chu ²,* and Hsiu-Li Hsu ³

¹ Navigation Institute, Jimei University, Ximen 361021, China

² Department of Shipping and Transportation Management, National Taiwan Ocean University, Keelung 202301, Taiwan

³ Department of Navigation and Shipping Transportation Management, Taipei University of Marine Technology, Taipei 11174, Taiwan

* Author to whom correspondence should be addressed.

Forecasting 2026, 8(1), 11; https://doi.org/10.3390/forecast8010011

Submission received: 26 December 2025 / Revised: 23 January 2026 / Accepted: 28 January 2026 / Published: 2 February 2026

(This article belongs to the Section Forecasting in Economics and Management)

Highlights

What are the main findings?

A comparative analysis of six univariate forecasting models shows that the hybrid EMD–SVR–GWO approach delivers the highest short-term forecasting accuracy for the Baltic Dry Index.
The proposed model consistently outperforms traditional econometric and deep learning benchmarks under volatile market conditions.

What are the implications of the main findings?

Hybrid decomposition–learning frameworks provide a robust solution for forecasting nonstationary shipping indices.
More accurate Baltic Dry Index forecasts can support improved chartering, fleet management, and risk control decisions in the dry bulk shipping market.

Abstract

The Baltic Dry Index (BDI) measures the cost of transporting dry bulk commodities such as coal, iron ore, and grain. As a key indicator of global trade, supply chain dynamics, and overall economic activity, accurate short-term forecasting of the BDI is crucial. This paper compares six univariate methods to obtain a more precise short-term BDI prediction model, providing valuable insights for decision-makers. The six forecasting techniques include Grey Forecast, ARIMA, Support Vector Regression, LSTM, GRU and EMD-SVR-GWO. Model performance is evaluated using three common metrics: Mean Absolute Error (MAE), Root Mean Squared Error (RMSE), and Mean Absolute Percentage Error (MAPE). Our findings reveal that the novel EMD-SVR-GWO model outperforms the other univariate methods, demonstrating superior accuracy in forecasting monthly BDI trends. This study contributes to improved BDI prediction, aiding managers in strategic planning and decision-making.

Keywords:

univariate forecasting models; Baltic Dry Index; Empirical Mode Decomposition; support vector regression; Grey Wolf Optimizer

1. Introduction

The Baltic Dry Index (BDI) serves as a critical barometer for the global shipping industry, reflecting the costs of transporting dry bulk commodities such as coal, iron ore, and grain. As an economic indicator, the BDI provides valuable insights into the health of global trade, supply chain dynamics, and overall economic activity. Given its relevance, accurate forecasting of the BDI is of paramount importance for a diverse range of stakeholders, including shipping companies, commodity traders, investors, policymakers, and economists.

Historically, the BDI has exhibited significant volatility, influenced by a myriad of factors. These factors include global economic cycles, seasonal demand fluctuations, geopolitical events, and changes in shipping capacity. For instance, economic downturns often lead to reduced demand for raw materials, subsequently lowering shipping rates and the BDI. Conversely, periods of economic growth drive up demand for commodities, resulting in higher shipping rates. Additionally, events such as trade disputes, natural disasters, and changes in regulatory environments can cause sudden and unpredictable movements in the BDI. This volatility poses substantial challenges for forecasting but also presents opportunities for leveraging advanced analytical techniques to improve prediction accuracy.

Traditional forecasting methods, such as time series analysis and econometric models, have been employed with varying degrees of success in predicting BDI movements. Although these methods provide a foundation for understanding past trends and making short-term forecasts, they often struggle to capture the complex, non-linear relationships inherent in the factors influencing the BDI.

In recent years, advancements in machine learning and artificial intelligence have opened new avenues for improving forecast accuracy. Techniques such as artificial neural networks (ANNs), support vector machines (SVMs), and ensemble learning methods offer the potential to model complex patterns and interactions that traditional methods may overlook. These contemporary approaches can incorporate a wider range of variables and are capable of learning from large datasets, making them well-suited for forecasting highly volatile indices like the BDI.

Forecasting the Baltic Dry Index is crucial for the shipping industry as it offers insights into global trade trends, aiding in fleet management, capacity planning, and route optimization. Accurate BDI forecasts help shipping companies reduce costs, improve efficiency, and make informed investment decisions. Additionally, understanding shipping demand trends aids in predicting commodity prices and managing risks, ultimately enhancing the competitiveness and operational efficiency of shipping companies. Thus, forecasting BDI has attracted considerable attention from both academic researchers and industry practitioners. The primary objective of this study is to compare various univariate forecasting techniques in order to develop a more accurate short-term prediction model for the BDI, offering practical insights for the shipping sector.

The structure of this paper is as follows: Section 2 reviews relevant literature, Section 3 details the research methodology, Section 4 analyzes and compares the forecasting results, and Section 5 concludes the study with key findings and implications.

2. Literature Review

BDI forecasting methods can be broadly categorized into three groups: conventional econometric models, nonlinear models and machine learning techniques, and ensemble machine learning methods. These approaches have become the most widely adopted methodologies for BDI analysis in academic research.

2.1. Econometric Model

Econometric models were among the first developed for BDI forecasting, and they have been widely employed in validation studies. Both univariate and multivariate econometric methodologies are commonly applied in this research. Previous studies frequently utilized models such as the vector error correction model (VECM), vector autoregression (VAR), and autoregressive integrated moving average (ARIMA). For instance, Veenstra and Franses [1] developed a time series-based VAR model to forecast the BDI, identifying non-stationarity in the data, which raised concerns about the stability of various modelling techniques. Cullinane et al. [2] used the ARIMA model for modifications to the Baltic Freight Index (BFI), demonstrating its superior predictive power. Kavussanos and Alizadeh [3] examined the seasonality of dry bulk freight rates, employing seasonal ARIMA and VAR models to assess fluctuations across different vessel sizes. Tsioumas et al. [4] introduced a multivariate vector autoregressive model with exogenous variables (VARX), which outperformed ARIMA in BDI forecasting accuracy. Papailias et al. [5] showed that trigonometric regression can lead to improved predictions and then used their forecasting results in an investment exercise to show how they can support improved risk management in the freight sector. Zhang et al. [6] compared ARIMA, ARIMA-GARCH models, and artificial neural networks (ANNs), noting that econometric models performed better in one-step-ahead forecasts, while ANN-based algorithms achieved a higher direction matching rate in weekly and monthly data. More recently, Katris et al. [7] proposed data-driven model selection using ARIMA, fractional ARIMA (FARIMA), and ARIMA/FARIMA models with GARCH and EGARCH errors, marking the first use of FARIMA and MARS models in BDI forecasting.

2.2. Nonlinear Models and Machine Learning

Traditional econometric models often struggle to effectively capture the nonstationary and nonlinear characteristics of BDI time series. As a result, recent years have seen the increasing use of nonlinear regression models and machine learning techniques in BDI prediction research. Common methods include artificial neural networks, support vector machines (SVM), and nonlinear regression. SVMs are praised for their ability to approximate nonlinear functions and generalize well. For instance, Yang et al. [8] showed that SVM models can successfully account for BDI’s nonlinear properties and developed a freight early warning system based on SVM to predict fluctuations in shipping market prices. Bao et al. [9] proposed a new BDI forecasting model using support vector machines with correlation-based feature selection. The study examined macroeconomic indicators and freight index fluctuations to construct the model. Comparative results showed SVM outperforms neural networks in trend accuracy and forecast precision, aiding market risk management.

Similarly, Li and Parsons [10] employed neural networks to forecast short- and long-term monthly tanker freight rates. Their results demonstrated that neural networks outperformed traditional ARIMA models, particularly for long-term forecasts. Building on this, Sahin et al. [11] introduced three ANN techniques for predicting BDI, conducting detailed comparisons to identify the most effective model. In recent years, advanced models like convolutional neural networks (CNNs) and long short-term memory (LSTM) networks have gained recognition as top-performing predictive neural networks, though their use in BDI forecasting remains underexplored.

Additionally, both econometric models and AI methods have their individual limitations. Hybrid forecasting approaches have emerged as a way to combine these models, leveraging their strengths to offset individual weaknesses. Such integrated systems tend to offer higher accuracy. For example, Chou and Lin [12] developed a fuzzy neural network combined with technical indicators to predict the BDI, achieving better accuracy than either technique alone.

Most recently, Bae [13] utilized deep learning models to predict the trends of the Baltic Dry Index (BDI), comparing the forecasting effects of algorithms such as Recurrent Neural Networks (RNN) and determining the optimal parameters. Liu et al. [14] proposed a BiLSTM-based system for BDI forecasting, incorporating the grey relational degree analysis method to select seven highly correlated factors as inputs. This model outperformed common machine learning models like support vector regression (SVR) and regression, as well as the standard LSTM neural network.

2.3. Ensemble Machine Learning Models

Ensemble learning has been extensively employed to enhance model performance, serving as an effective method for boosting the predictive power of individual models. Leonov and Nikolov [15] found that freight pricing in shipping markets is highly volatile. They analyzed fluctuations in the Baltic Panamax 2A and 3A freight rates using a novel approach in shipping economics: a hybrid wavelet-neural network model. In a study on Baltic Dry Index prediction, Zeng et al. [16] applied Empirical Mode Decomposition (EMD) to break down the original BDI series into several intrinsic mode functions (IMFs). Each IMF and combined component was modelled using an Artificial Neural Network, yielding a forecasting method based on EMD and ANN. Their results showed that this approach outperformed other techniques such as ANN and VAR.

Similarly, Kamal et al. [17] developed a deep learning ensemble model by integrating Recurrent Neural Networks, Long Short-Term Memory, and Gated Recurrent Units (GRU) to improve BDI forecasting. The ensemble method outperformed both a single econometric model and a single machine learning model, highlighting its superior predictive capabilities.

Su et al. [18] predicted the Baltic Dry Index using a deep integrated model (CNN-BiLSTM-AM) comprising a convolutional neural network, bi-directional long short-term memory, and the attention mechanism (AM). Their findings indicate that the integrated model CNN-BiLSTM-AM encompasses the nonlinear and nonstationary characteristics of the shipping industry, and it has a greater prediction accuracy than any single model.

Most recently, research has shifted toward deep learning and hybrid intelligent systems, reflecting the growing need to capture the complex, non-linear dynamics of global shipping markets. Kim et al. [19] advanced the field by integrating financial market data into machine learning models, complemented with SHAP explanations to enhance interpretability and transparency of predictions. Similarly, Atsalaki et al. [20] employed a neuro-fuzzy inference system (ANFIS), combining the adaptability of neural networks with the rule-based reasoning of fuzzy logic, thereby improving forecasting accuracy in volatile environments. Zhang [21] introduced a GRU-based deep learning framework that demonstrated strong performance in identifying market trends and cyclical patterns, while Li et al. [22] applied LSTM architectures to capture long-term dependencies in BDI time series.

Existing BDI forecasting studies largely rely on single models, leaving a gap in the use of structured hybrid approaches. Our contribution is a unified framework that systematically integrates EMD, SVR, and GWO. Although each method is well known, their coordinated use—EMD for decomposition, SVR for IMF-level forecasting, and GWO for optimizing the final weights—has not been previously examined for BDI prediction. This approach fills a methodological gap by offering a more robust way to capture the complex dynamics of the series.

Univariate forecasting is simple, efficient, and easy to interpret as it relies solely on past values of the variable being predicted, requiring less data and computational resources. It is robust against noise from external factors and suitable for quick, short-term forecasts. Additionally, it serves as a strong baseline for comparing more complex models. Based on the above literature, this study proposes a comparison of univariate models, including GM(1,1), ARIMA, SVR, LSTM, GRU and EMD-SVR-GWO.

The forecasting models examined in this study span a range of methodological complexity, each suited to different time-series characteristics. GM(1,1) performs well in small-sample settings and weak-information environments but offers limited capacity for modelling nonlinear patterns. ARIMA remains a strong benchmark for linear and stationary data, though it is less effective under structural changes or nonlinear dynamics. SVR improves flexibility through kernel-based nonlinear mapping, making it suitable when relationships are complex but data are limited. LSTM and GRU further enhance forecasting capability by learning long-range dependencies and nonlinear interactions directly from the data, performing well in series with memory effects, regime shifts, or high variability. Finally, our proposed EMD-SVR-GWO framework uses EMD for decomposition, SVR for IMF-level forecasting, and GWO for optimizing the final weights. Together, these methods provide a comprehensive basis for evaluating forecasting performance across linear, nonlinear, and hybrid approaches.

3. Methods

To ensure a coherent comparison, all forecasting models are evaluated within a unified framework using the same monthly BDI dataset, identical in-sample and out-of-sample partitions, and a common one-step-ahead forecasting horizon. The models are presented in increasing order of methodological complexity—from traditional statistical methods to machine learning and hybrid ensemble approaches—allowing a transparent assessment of predictive performance. Although deep learning models such as LSTM and GRU typically benefit from larger datasets, they are included here as benchmark nonlinear learners rather than as fully optimized deep architectures, allowing a fair comparison of modelling paradigms under identical data constraints.

3.1. Grey Forecasting Model

Grey theory, initially proposed by Deng [23], is designed to analyze systems with incomplete or limited information. A key advantage of grey models is their ability to generate forecasts using relatively small datasets. In particular, the GM(1,1) model does not require the underlying time series to be stationary, nor does it rely on probabilistic or ergodic assumptions as in stochastic time-series models such as ARIMA. Instead, GM(1,1) employs an accumulated generating operation (AGO) to transform the original sequence into a more regular form, enabling effective modelling of trend behaviour under limited data conditions. These characteristics make GM(1,1) especially suitable for short-term forecasting problems with small samples and high uncertainty.

The GM(1,1) model refers to a first-order, single-variable grey model. The procedure for constructing GM(1,1) is outlined as follows:

Let the original data sequence be

x^{(0)} = (x^{(0)} (1), x^{(0)} (2), x^{(0)} (3), \dots, x^{(0)} (n)),

(1)

where $x^{(0)}(k)$ denotes the original observation in period k. The sequence $x^{(1)}$ is defined as

x^{(1)} = \left(\sum_{k = 1}^{1} x^{(0)} (k), \sum_{k = 1}^{2} x^{(0)} (k), \dots, \sum_{k = 1}^{n} x^{(0)} (k)\right) = (x^{(1)} (1), x^{(1)} (2), x^{(1)} (3), \dots, x^{(1)} (n)) .

(2)

Equation (2) is obtained by applying the accumulated generating operation (AGO) to Equation (1).

Before building the model, a class ratio test must be performed to ensure that the sequence is suitable for modelling. The class ratio $σ^{(1)}(k)$ is defined as follows:

σ^{(1)} (k) = \frac{x^{(1)} (k - 1)}{x^{(1)} (k)}, \quad k \geq 2.

(3)

If $σ^{(1)}(k) \in (0, 1)$, then $x^{(1)}(k)$ is appropriate for modelling.

After finishing the class ratio test, the GM(1,1) model is formulated as a first-order differential equation for $x^{(1)}(k)$:

\frac{d x^{(1)}}{d k} + a x^{(1)} = b,

(4)

where a and b are coefficients to be estimated.

Next, the ordinary least squares method is applied to Equation (4) to estimate the coefficients a and b. Once $\hat{a}$ and $\hat{b}$ are estimated, predictions are generated by substituting them into the following equation:

{\hat{x}}^{(1)} (k + 1) = \left(x^{(0)} (1) - \frac{\hat{b}}{\hat{a}}\right) e^{- \hat{a} k} + \frac{\hat{b}}{\hat{a}}, \quad \text{with } {\hat{x}}^{(1)} (1) = x^{(0)} (1).

(5)

Finally, the forecasted values of the original series are obtained by applying the inverse accumulated generating operation (IAGO) to recover ${\hat{x}}^{(0)}(k + 1)$ as follows:

{\hat{x}}^{(0)} (k + 1) = {\hat{x}}^{(1)} (k + 1) - {\hat{x}}^{(1)} (k)

(6)
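The GM(1,1) procedure in Equations (1)–(6) can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' implementation. It assumes the common discretization in which the least squares regression uses background values (means of consecutive AGO terms), a detail the text above does not spell out. The function names `gm11_fit` and `gm11_predict` and the toy series are hypothetical.

```python
import math


def gm11_fit(x0):
    """Estimate the GM(1,1) coefficients (a, b) by ordinary least squares."""
    n = len(x0)
    # Accumulated generating operation (AGO), as in Equation (2).
    x1 = []
    running = 0.0
    for value in x0:
        running += value
        x1.append(running)
    # Assumption: background values are means of consecutive AGO terms.
    z = [0.5 * (x1[k] + x1[k - 1]) for k in range(1, n)]
    y = x0[1:]
    # Simple linear regression of x0(k) on z(k): x0(k) = -a * z(k) + b.
    m = len(z)
    sum_z, sum_y = sum(z), sum(y)
    sum_zz = sum(v * v for v in z)
    sum_zy = sum(zi * yi for zi, yi in zip(z, y))
    slope = (m * sum_zy - sum_z * sum_y) / (m * sum_zz - sum_z ** 2)
    intercept = (sum_y - slope * sum_z) / m
    return -slope, intercept


def gm11_predict(x0, a, b, steps=1):
    """Fitted values for periods 1..n followed by `steps` forecasts."""

    def x1_hat(k):
        # Equation (5); k = 0 corresponds to period 1.
        return (x0[0] - b / a) * math.exp(-a * k) + b / a

    values = [x0[0]]
    for k in range(1, len(x0) + steps):
        # Inverse AGO, Equation (6).
        values.append(x1_hat(k) - x1_hat(k - 1))
    return values


series = [100 * 1.1 ** k for k in range(6)]  # hypothetical toy data
a_hat, b_hat = gm11_fit(series)
forecast = gm11_predict(series, a_hat, b_hat, steps=1)[-1]
```

On a series that grows at a roughly constant rate, as in this toy example, the estimated a is negative and the one-step forecast continues the exponential trend.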

3.2. SARIMA

The seasonal model employed in this study belongs to the general class of univariate models originally introduced by Box and Jenkins [24]. SARIMA has been widely applied across various disciplines and remains a cornerstone in time series forecasting. A fundamental requirement in constructing this model is that the time series must be stationary, meaning its probabilistic properties remain constant over time.

SARIMA extends the ARIMA framework by incorporating seasonal components, making it suitable for time series data exhibiting seasonality. The model is typically expressed as SARIMA(p, d, q)(P, D, Q)s, where

p represents the order of the non-seasonal autoregressive terms;
d denotes the degree of non-seasonal differencing;
q corresponds to the order of the non-seasonal moving average terms;
P, D, and Q represent the seasonal autoregressive, seasonal differencing, and seasonal moving average orders, respectively;
s indicates the length of the seasonal cycle.

The general representation of the SARIMA model can be written as

φ_{p} (B) Φ_{P} (B^{s}) \nabla^{d} \nabla_{s}^{D} Z_{t} = θ_{q} (B) Θ_{Q} (B^{s}) ε_{t},

(7)

where

$φ_{p}(B) = 1 - φ_{1} B - φ_{2} B^{2} - \dots - φ_{p} B^{p}$ is the non-seasonal autoregressive operator of order p;
$Φ_{P}(B^{s})$ is the seasonal autoregressive operator of order P;
B is the backshift operator, with $B^{m} Z_{t} = Z_{t - m}$;
s is the season length;
$\nabla^{d} = {(1 - B)}^{d}$ is the non-seasonal differencing operator;
$\nabla_{s}^{D} = {(1 - B^{s})}^{D}$ is the seasonal differencing operator;
$Z_{t}$ is the observation at time t, which is stationary after differencing;
$θ_{q}(B) = 1 - θ_{1} B - θ_{2} B^{2} - \dots - θ_{q} B^{q}$ is the non-seasonal moving average operator of order q;
$Θ_{Q}(B^{s})$ is the seasonal moving average operator of order Q;
$ε_{t}$ is white noise with zero mean and constant variance.

The model-building process involves three main stages: identification, parameter estimation, and diagnostic checking. In the identification phase, the tentative structure of the AR and MA terms (p and q) is determined by analyzing the autocorrelation and partial autocorrelation patterns. Once an initial specification is chosen, parameter estimation is carried out, often using methods such as maximum likelihood or least squares. Finally, diagnostic checking assesses the adequacy of the fitted model by examining residuals to ensure they exhibit the properties of white noise. If the diagnostics confirm model suitability, the SARIMA model is then employed for forecasting.
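The differencing operators in Equation (7) can be illustrated with a short Python sketch. It applies $(1 - B^{s})^{D}$ and then $(1 - B)^{d}$ to a made-up monthly series with a linear trend and a fixed 12-month pattern. The helper name `difference` and the toy data are assumptions for illustration only; the sketch covers the differencing step and does not fit a SARIMA model.

```python
def difference(series, lag=1, order=1):
    """Apply the operator (1 - B^lag)^order to a list of observations."""
    out = list(series)
    for _ in range(order):
        # Each pass subtracts the value observed `lag` periods earlier.
        out = [out[t] - out[t - lag] for t in range(lag, len(out))]
    return out


# Hypothetical monthly data: linear trend plus a repeating 12-month pattern.
pattern = [5, 3, 0, -2, -4, -5, -4, -2, 0, 2, 4, 3]
z = [2.0 * t + pattern[t % 12] for t in range(48)]

# Seasonal differencing (s = 12, D = 1) removes the repeating pattern,
# and first differencing (d = 1) removes what is left of the trend.
w = difference(difference(z, lag=12, order=1), lag=1, order=1)
```

Each pass shortens the series by `lag` observations, so `w` has 48 − 12 − 1 = 35 entries, all zero for this deterministic example. With real data the differenced series would instead be inspected through its autocorrelation and partial autocorrelation patterns, as in the identification stage described above.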

3.3. Long Short-Term Memory Model

Deep learning has become increasingly important in time-series forecasting due to its ability to capture complex nonlinear and long-range temporal dependencies. Among recurrent neural network architectures, the Long Short-Term Memory network proposed by Hochreiter and Schmidhuber [25] addresses the vanishing-gradient problem inherent in traditional RNNs by introducing explicit memory cells and gating mechanisms. Given the nonlinear, volatile, and path-dependent nature of the Baltic Dry Index, LSTM provides an effective nonlinear benchmark model for comparison with the proposed EMD–SVR–GWO framework.

An LSTM cell contains three multiplicative gates (input, forget, and output) together with a cell state that carries long-term information. For a time sequence $\{x_{t}\}$, the LSTM cell updates are defined as

i_{t} = σ (W_{i} x_{t} + U_{i} h_{t - 1} + b_{i}),

f_{t} = σ (W_{f} x_{t} + U_{f} h_{t - 1} + b_{f}),

o_{t} = σ (W_{o} x_{t} + U_{o} h_{t - 1} + b_{o}),

\tilde{c}_{t} = \tanh (W_{c} x_{t} + U_{c} h_{t - 1} + b_{c}),

c_{t} = f_{t} ⊙ c_{t - 1} + i_{t} ⊙ \tilde{c}_{t},

h_{t} = o_{t} ⊙ \tanh (c_{t}),

where $i_{t}$, $f_{t}$, and $o_{t}$ are the input, forget, and output gates; $c_{t}$ is the cell state; $\tilde{c}_{t}$ is the candidate cell state; $h_{t}$ is the hidden state; $W_{*}$, $U_{*}$, and $b_{*}$ are trainable weight matrices and biases; $σ(\cdot)$ denotes the logistic function; and ⊙ represents elementwise multiplication.

The forget gate $f_{t}$ regulates the retention of past information, enabling the network to model both short-term shocks and long-term cycles.
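As a concrete reading of the update equations above, the following sketch runs a single-unit LSTM cell, so every weight matrix reduces to a scalar. The parameter values and the short input sequence are made up purely to exercise the equations; no training is performed, and the function name `lstm_step` is hypothetical.

```python
import math


def sigmoid(v):
    """Logistic function used by the three gates."""
    return 1.0 / (1.0 + math.exp(-v))


def lstm_step(x_t, h_prev, c_prev, p):
    """One update of a single-unit LSTM cell; p maps names to scalar weights."""
    i_t = sigmoid(p["W_i"] * x_t + p["U_i"] * h_prev + p["b_i"])  # input gate
    f_t = sigmoid(p["W_f"] * x_t + p["U_f"] * h_prev + p["b_f"])  # forget gate
    o_t = sigmoid(p["W_o"] * x_t + p["U_o"] * h_prev + p["b_o"])  # output gate
    c_tilde = math.tanh(p["W_c"] * x_t + p["U_c"] * h_prev + p["b_c"])
    c_t = f_t * c_prev + i_t * c_tilde  # keep part of the old state, add new
    h_t = o_t * math.tanh(c_t)          # expose a gated view of the state
    return h_t, c_t


# Hypothetical weights: everything zero except the candidate input weight.
names = ("W_i", "U_i", "b_i", "W_f", "U_f", "b_f",
         "W_o", "U_o", "b_o", "W_c", "U_c", "b_c")
params = {name: 0.0 for name in names}
params["W_c"] = 1.0

h, c = 0.0, 0.0
for x in [0.2, -0.1, 0.4]:  # a short, made-up scaled input sequence
    h, c = lstm_step(x, h, c, params)
```

With all gate weights at zero, each gate equals 0.5, so the cell retains half of its previous state at every step. A trained network learns gate weights that make this retention depend on the input.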

3.4. Gated Recurrent Unit Model

The Gated Recurrent Unit model, introduced by Chung et al. [26], is a simplified gated RNN architecture designed to capture long-term dependencies with fewer parameters than LSTM. GRU merges the input and forget mechanisms into a single update gate, improving computational efficiency while maintaining competitive forecasting accuracy. This makes GRU particularly suitable for small and moderately sized datasets such as the monthly BDI series.

A GRU cell contains two gates: the reset gate $r_{t}$ and the update gate $z_{t}$. The transition equations are

z_{t} = σ (W_{z} x_{t} + U_{z} h_{t - 1} + b_{z}),

r_{t} = σ (W_{r} x_{t} + U_{r} h_{t - 1} + b_{r}),

{\tilde{h}}_{t} = \tanh (W_{h} x_{t} + U_{h} (r_{t} ⊙ h_{t - 1}) + b_{h}),

h_{t} = (1 - z_{t}) ⊙ h_{t - 1} + z_{t} ⊙ {\tilde{h}}_{t} .

Here,

$z_{t}$ controls the interpolation between past information and newly computed content;
$r_{t}$ controls how much of the previous hidden state enters the candidate computation;
${\tilde{h}}_{t}$ is the candidate hidden state.

Compared with LSTM, GRU eliminates the explicit cell state $c_{t}$, leading to a faster and more compact model.
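The GRU transition equations can be traced in the same way with a single-unit cell, where all weights are scalars. The weights and inputs below are invented for illustration and the function name `gru_step` is hypothetical; the sketch shows the gating arithmetic only, not a trained forecaster.

```python
import math


def sigmoid(v):
    """Logistic function used by the update and reset gates."""
    return 1.0 / (1.0 + math.exp(-v))


def gru_step(x_t, h_prev, p):
    """One update of a single-unit GRU cell; p maps names to scalar weights."""
    z_t = sigmoid(p["W_z"] * x_t + p["U_z"] * h_prev + p["b_z"])  # update gate
    r_t = sigmoid(p["W_r"] * x_t + p["U_r"] * h_prev + p["b_r"])  # reset gate
    # The reset gate scales the previous state inside the candidate only.
    h_tilde = math.tanh(p["W_h"] * x_t + p["U_h"] * (r_t * h_prev) + p["b_h"])
    # The update gate interpolates between the old state and the candidate.
    return (1.0 - z_t) * h_prev + z_t * h_tilde


# Hypothetical weights: everything zero except the candidate input weight.
gru_names = ("W_z", "U_z", "b_z", "W_r", "U_r", "b_r", "W_h", "U_h", "b_h")
gru_params = {name: 0.0 for name in gru_names}
gru_params["W_h"] = 1.0

h = 0.0
for x in [0.2, -0.1, 0.4]:  # a short, made-up scaled input sequence
    h = gru_step(x, h, gru_params)
```

The function returns only the hidden state because the GRU has no separate cell state to carry forward.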

3.5. The EMD-SVR-GWO Model

3.5.1. EMD Model

Empirical Mode Decomposition is a data-driven technique developed by Norden E. Huang [27] in the late 1990s for analyzing non-linear and non-stationary time series data. Unlike traditional signal processing methods that rely on predefined basis functions (e.g., Fourier or wavelet transforms), EMD adaptively decomposes a signal into a set of intrinsic mode functions based on the inherent characteristics of the data. This makes EMD particularly useful for a wide range of applications, including signal processing, economics, geophysics, and biomedical engineering.

The EMD process consists of the following steps:

(1) Identify extrema: Detect all local maxima and minima in the signal. These points indicate where the signal changes direction and are used to construct the upper and lower envelopes.
(2) Interpolate: Using the identified extrema, create smooth upper and lower envelopes by interpolating between the maxima and minima, respectively. Common interpolation methods include cubic splines or piecewise linear functions.
(3) Mean calculation: Compute the mean of the upper and lower envelopes. This mean represents the local trend of the signal.
(4) Sifting process: Subtract the mean from the original signal to obtain a candidate IMF. This component is called a “proto-IMF.” The sifting process involves repeating the steps of extrema identification, interpolation, and mean calculation on the proto-IMF until it meets the IMF criteria: (i) the number of zero crossings and extrema must either be equal or differ by at most one; (ii) the mean value of the envelopes should be zero at every point.
(5) Residual calculation: Once a valid IMF is obtained, subtract it from the original signal to get a residual. This residual represents the remaining signal after extracting the first IMF.
(6) Repeat: Apply the EMD process recursively to the residual to extract further IMFs until the residual becomes a monotonic function or has no more than two extrema. The decomposition is complete when the residual is either a constant, a monotonic trend, or contains no further oscillatory modes.
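The six steps above can be sketched as a simplified decomposition in plain Python. Two simplifications are assumed and should be read as such: the envelopes use piecewise linear interpolation held flat beyond the outermost extrema, and sifting runs for a fixed number of passes instead of testing the IMF criteria. All function names and the toy signal are hypothetical, and this is not the implementation used in the study.

```python
import math


def local_extrema(x):
    """Step 1: indices of strict interior local maxima and minima."""
    inner = range(1, len(x) - 1)
    maxima = [i for i in inner if x[i - 1] < x[i] > x[i + 1]]
    minima = [i for i in inner if x[i - 1] > x[i] < x[i + 1]]
    return maxima, minima


def envelope(x, idx):
    """Step 2: piecewise linear envelope through the points (i, x[i])."""
    env, j = [], 0
    for t in range(len(x)):
        if t <= idx[0]:
            env.append(x[idx[0]])       # assumption: flat before first extremum
        elif t >= idx[-1]:
            env.append(x[idx[-1]])      # assumption: flat after last extremum
        else:
            while idx[j + 1] < t:
                j += 1
            lo, hi = idx[j], idx[j + 1]
            frac = (t - lo) / (hi - lo)
            env.append(x[lo] + frac * (x[hi] - x[lo]))
    return env


def sift(x, n_sift=10):
    """Steps 3-4: repeatedly subtract the envelope mean (fixed pass count)."""
    h = list(x)
    for _ in range(n_sift):
        maxima, minima = local_extrema(h)
        if not maxima or not minima:
            break
        upper, lower = envelope(h, maxima), envelope(h, minima)
        h = [v - 0.5 * (u + l) for v, u, l in zip(h, upper, lower)]
    return h


def emd(x, max_imfs=5, n_sift=10):
    """Steps 5-6: peel off IMFs until the residual has at most two extrema."""
    imfs, residual = [], list(x)
    for _ in range(max_imfs):
        maxima, minima = local_extrema(residual)
        if len(maxima) + len(minima) <= 2:
            break
        imf = sift(residual, n_sift)
        imfs.append(imf)
        residual = [r - c for r, c in zip(residual, imf)]
    return imfs, residual


# Hypothetical signal: a fast oscillation riding on a slow upward trend.
signal = [math.sin(2 * math.pi * t / 8) + 0.05 * t for t in range(64)]
imfs, residual = emd(signal)
```

By construction the IMFs and the final residual sum back to the input signal, which is the property that allows component-level forecasts to be recombined later.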

EMD has been applied across various fields due to its ability to handle non-linear and non-stationary data. In signal processing, it facilitates denoising, feature extraction, and fault detection in mechanical and electrical systems. In economics, it supports the analysis of financial time series, market trends, and economic cycles. In geophysics, it is used for seismic signal analysis, ocean wave modelling, and climate data interpretation. In biomedical engineering, EMD is instrumental in processing physiological signals such as electroencephalograms, electrocardiograms, and speech signals.

The advantages of EMD include its data-driven nature, which eliminates the need for predefined basis functions and makes it versatile for various types of data. Additionally, its adaptability allows it to handle non-linear and non-stationary signals effectively.

In summary, Empirical Mode Decomposition is a powerful and adaptive method for analyzing complex signals. By decomposing signals into intrinsic mode functions, EMD provides a flexible approach for revealing underlying patterns and trends in non-linear and non-stationary data. Its wide range of applications and ability to handle diverse data types make it an essential tool in various scientific and engineering disciplines.

3.5.2. SVR Model

Although different models in this study employ different notational conventions, they are unified by a common forecasting objective. Time-series models such as GM(1,1) and ARIMA express forecasts explicitly in terms of time indices (e.g., $y_{t}$, $y_{t-1}$), whereas machine learning models such as SVR formulate the problem in a supervised-learning form. In this study, these formulations are linked by constructing input–output pairs from the univariate BDI series, where lagged observations $[y_{t-1}, y_{t-2}, \dots, y_{t-p}]$ serve as inputs and the next-period value $y_{t}$ is the prediction target. As a result, all models generate one-step-ahead forecasts under a unified time-series forecasting framework, ensuring methodological consistency and comparability.
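The construction of input–output pairs from lagged observations can be made concrete with a small sketch. The helper name `make_lagged_pairs` and the five-value series are hypothetical; the lag order p is a modelling choice that the passage above does not fix.

```python
def make_lagged_pairs(series, p):
    """Build supervised pairs ([y_{t-1}, ..., y_{t-p}], y_t) from one series."""
    inputs, targets = [], []
    for t in range(p, len(series)):
        # Most recent lag first, matching the ordering used in the text.
        inputs.append([series[t - k] for k in range(1, p + 1)])
        targets.append(series[t])
    return inputs, targets


toy_series = [10, 20, 30, 40, 50]  # made-up values, not BDI data
X, y = make_lagged_pairs(toy_series, p=2)
# X == [[20, 10], [30, 20], [40, 30]] and y == [30, 40, 50]
```

A series of length n yields n − p pairs, so a larger lag order trades training examples for longer input histories.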

Support Vector Regression is a robust and versatile machine learning algorithm used for regression tasks. Originating from the principles of Support Vector Machines, SVR aims to predict continuous outcomes based on input features. While SVM is widely recognized for its effectiveness in classification problems, SVR extends these capabilities to the realm of regression, making it a valuable tool in various applications, such as financial forecasting, time series prediction, and biological data analysis.

At its core, SVR is designed to find a function that approximates the relationship between input features and a continuous target variable. The primary objective of SVR is to minimize the error of predictions while maintaining a degree of robustness by considering only a subset of the training data, known as support vectors. These support vectors are the critical elements of the training set that define the model.

SVR operates by introducing a margin of tolerance (epsilon), within which errors are ignored. This margin is referred to as the epsilon-insensitive zone. The essence of SVR is to ensure that the predictions lie within this margin for as many data points as possible, thereby balancing the complexity of the model with its predictive accuracy.

The SVR problem can be formulated as follows:

Given a training dataset $\{(x_{i}, y_{i})\}_{i = 1}^{n}$, where $x_{i}$ represents the input features and $y_{i}$ the target variable, the goal is to find a function f(x) that deviates from the actual targets $y_{i}$ by at most ϵ. The function f(x) is typically expressed as f(x) = ⟨w, x⟩ + b, where w is the weight vector and b is the bias term. The optimization problem is defined as follows (Vapnik [28]):

\min_{w, b} \frac{1}{2} ‖w‖^{2} + C \sum_{i = 1}^{n} (ξ_{i} + ξ_{i}^{*})

subject to

y_{i} - ⟨w, x_{i}⟩ - b \leq ϵ + ξ_{i},

⟨w, x_{i}⟩ + b - y_{i} \leq ϵ + ξ_{i}^{*},

ξ_{i}, ξ_{i}^{*} \geq 0.

Here, $ξ_{i}$ and $ξ_{i}^{*}$ are slack variables that allow for deviations greater than ϵ, and C is a regularization parameter that determines the trade-off between the flatness of f(x) and the amount by which deviations larger than ϵ are tolerated.

One of the strengths of SVR, inherited from SVM, is the ability to handle non-linear relationships through the kernel trick. By mapping the input features into a higher-dimensional space using a kernel function, SVR can fit complex, non-linear functions. Commonly used kernels include the linear, polynomial, and radial basis function (RBF) kernels. In this research, the radial basis function is adopted.
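To make the ϵ-insensitive formulation and the RBF kernel tangible, the sketch below evaluates the primal objective for a fixed linear function and computes one kernel value. It does not solve the optimization problem. The function names, the kernel width `gamma`, and all numbers are assumptions chosen for illustration.

```python
import math


def rbf_kernel(u, v, gamma):
    """RBF kernel exp(-gamma * ||u - v||^2); gamma is an assumed width."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(u, v)))


def slack_variables(y, f, eps):
    """Smallest slacks satisfying the two constraints for one observation."""
    xi = max(0.0, (y - f) - eps)       # target lies above the eps-tube
    xi_star = max(0.0, (f - y) - eps)  # target lies below the eps-tube
    return xi, xi_star


def primal_objective(w, b, X, Y, C, eps):
    """Value of 0.5 * ||w||^2 + C * sum(xi + xi_star) for a linear f."""
    total = 0.5 * sum(wi * wi for wi in w)
    for x, y in zip(X, Y):
        f = sum(wi * xi for wi, xi in zip(w, x)) + b
        xi, xi_star = slack_variables(y, f, eps)
        total += C * (xi + xi_star)
    return total


# Made-up data: the first point falls inside the tube and costs nothing.
X = [[1.0], [2.0], [3.0]]
Y = [1.05, 2.5, 2.0]
value = primal_objective(w=[1.0], b=0.0, X=X, Y=Y, C=10.0, eps=0.1)
similarity = rbf_kernel([1.0, 2.0], [1.5, 2.0], gamma=0.5)
```

Only the second and third observations contribute slack here, which reflects why the fitted function ends up depending on a subset of the training points.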

3.5.3. GWO

The Grey Wolf Optimizer (GWO) is a population-based metaheuristic optimization algorithm proposed by Mirjalili et al. [29]. It is inspired by the cooperative hunting behaviour and leadership hierarchy observed in grey wolf packs. GWO has attracted increasing attention in forecasting and optimization studies due to its simple structure, limited number of control parameters, and strong capability to balance global exploration and local exploitation. Only the elements of GWO directly relevant to parameter optimization in this study are summarized below; full algorithmic details are available in Mirjalili et al. [29].

In GWO, each candidate solution represents a grey wolf, and the quality of a solution is evaluated using a predefined fitness function. During each iteration, the best three solutions are designated as the alpha (α), beta (β), and delta (δ) wolves, which are assumed to have the most accurate knowledge of the prey’s location (i.e., the optimal solution). All remaining solutions are classified as omega (ω) wolves and update their positions by following these three leaders. This hierarchical structure allows the algorithm to guide the search process toward promising regions of the solution space while maintaining population diversity.

The encircling behaviour of grey wolves during hunting is mathematically modelled as:

\vec{D} = |\vec{C} \cdot \vec{X_{p}} (t) - \vec{X} (t)|,

\vec{X} (t + 1) = \vec{X_{p}} (t) - \vec{A} \cdot \vec{D},

where $\vec{X}(t)$ denotes the position vector of the grey wolf at iteration t, $\vec{X_{p}}(t)$ represents the position vector of the prey, $\vec{D}$ is the distance vector used to determine the new position of the grey wolf, and $\vec{A}$ and $\vec{C}$ are coefficient vectors defined as follows:

\vec{A} = 2 \vec{a} \cdot \vec{r_{1}} - \vec{a}

\vec{C} = 2 \vec{r_{2}},

with the components of $\vec{a}$ decreasing linearly from 2 to 0 over the iterations, and $\vec{r_{1}}$ and $\vec{r_{2}}$ random vectors uniformly distributed in [0, 1].

The parameter a plays a critical role in controlling the search behaviour of GWO. When ∣A∣ > 1, grey wolves move away from the current best solutions, encouraging global exploration of the search space. Conversely, when ∣A∣ < 1, wolves converge toward the estimated prey position, leading to local exploitation. This adaptive mechanism enables GWO to effectively transition from exploration to exploitation as the optimization progresses, reducing the risk of premature convergence.

To model cooperative hunting, GWO assumes that the true prey position is approximated by the three leading wolves (α, β, and δ). The positions of the remaining wolves are updated according to

\vec{X} (t + 1) = \frac{\vec{X_{1}} + \vec{X_{2}} + \vec{X_{3}}}{3} .

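Here, $\vec{X_{1}}$, $\vec{X_{2}}$, and $\vec{X_{3}}$ denote the candidate positions obtained by applying the encircling equations above with the α, β, and δ wolves, respectively, in place of the prey. This averaging mechanism allows the search process to incorporate multiple elite solutions simultaneously, improving robustness and stability compared with algorithms that rely on a single global best solution.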
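The update rules above can be condensed into a short, generic minimiser. The following is a minimal sketch and not the implementation used in this study: the function name `gwo_minimize`, the population size, the iteration count, the box bounds, the clipping step, and the shifted-sphere test objective are all illustrative assumptions.

```python
import random


def gwo_minimize(fitness, dim, bounds, n_wolves=20, n_iter=200, seed=0):
    """Minimise `fitness` over a box with a basic Grey Wolf Optimizer (sketch)."""
    rng = random.Random(seed)
    lo, hi = bounds
    wolves = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_wolves)]
    best, best_val = None, float("inf")

    for t in range(n_iter):
        # Rank the pack; the three fittest wolves act as alpha, beta, delta.
        ranked = sorted(wolves, key=fitness)
        leaders = [w[:] for w in ranked[:3]]
        lead_val = fitness(leaders[0])
        if lead_val < best_val:
            best, best_val = leaders[0][:], lead_val

        a = 2.0 * (1.0 - t / n_iter)  # decreases linearly from 2 towards 0

        new_wolves = []
        for w in wolves:
            pos = []
            for j in range(dim):
                candidates = []
                for leader in leaders:
                    A = 2.0 * a * rng.random() - a        # A = 2 a r1 - a
                    C = 2.0 * rng.random()                # C = 2 r2
                    D = abs(C * leader[j] - w[j])         # D = |C X_p - X|
                    candidates.append(leader[j] - A * D)  # X_p - A D
                x_new = sum(candidates) / 3.0             # (X1 + X2 + X3) / 3
                pos.append(min(max(x_new, lo), hi))       # assumption: clip to the box
            new_wolves.append(pos)
        wolves = new_wolves

    # Also consider the final population before returning.
    final = min(wolves, key=fitness)
    final_val = fitness(final)
    if final_val < best_val:
        best, best_val = final[:], final_val
    return best, best_val


def shifted_sphere(x):
    # Toy objective (assumption): minimum value 0 at x = (0.5, 0.5, 0.5).
    return sum((xi - 0.5) ** 2 for xi in x)


best, best_val = gwo_minimize(shifted_sphere, dim=3, bounds=(-2.0, 2.0))
print(best, best_val)
```

In this sketch the random coefficients are drawn independently for each dimension and each leader, and the best solution seen so far is stored separately so that it cannot be lost when the whole pack moves.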

In this study, GWO is employed to optimize the combination weights used to reconstruct the final BDI forecast from the SVR-predicted intrinsic mode functions (IMFs). Each candidate solution corresponds to a vector of weights assigned to individual IMFs, and the fitness function is defined by the reconstruction error between the weighted forecast and the reference series. Through iterative position updates, GWO identifies the weight configuration that minimizes forecasting error, thereby improving the accuracy and robustness of the final aggregated prediction.

Figure 1 illustrates the overall flow of the GWO algorithm applied in this study.

3.5.4. Model Construction of EMD-SVR-GWO

The EMD-SVR-GWO model is a hybrid forecasting framework designed to improve prediction accuracy by leveraging the strengths of three advanced techniques: Empirical Mode Decomposition, Support Vector Regression, and the Grey Wolf Optimizer. This model is particularly effective for complex, nonlinear, and non-stationary time series data, such as financial indices, economic indicators, or environmental data. The construction of the EMD-SVR-GWO model involves three primary stages:

1.: Empirical Mode Decomposition

EMD is a data-driven decomposition technique that breaks down the original data into a set of components called Intrinsic Mode Functions and a residual trend. Each IMF represents oscillatory modes embedded within the data, capturing different frequency characteristics, while the residual captures the long-term trend.

The decomposition enables the isolation of distinct patterns (high-frequency fluctuations, medium-term cycles, and long-term trends), which are modelled separately in the next stage.

2.: Support Vector Regression

After decomposition, each IMF and the residual are treated as independent sub-series. Support Vector Regression is applied to predict each component individually. SVR is chosen for its robustness in handling nonlinear relationships and its strong generalization capabilities.

Each IMF captures different dynamics of the original time series, and SVR models are trained separately to forecast these dynamics effectively.

3.: Grey Wolf Optimizer

The final stage involves the reconstruction of the original time series by combining the predicted IMFs and residual. To achieve the most accurate forecast, the Grey Wolf Optimizer is employed to determine the optimal combination of these forecasts. GWO is a metaheuristic optimization algorithm inspired by the hunting behavior and social hierarchy of grey wolves in nature.

The EMD-SVR-GWO hybrid model offers a powerful and flexible framework for time series forecasting. EMD enhances the model by decomposing complex data into simpler components, SVR captures nonlinear relationships within each component, and GWO optimizes the final forecast by adjusting the weights for the most accurate reconstruction. This approach is highly effective for datasets with mixed trends, seasonality, and noise, providing superior forecasting performance compared to traditional single-model methods. Figure 2 presents the flowchart of the EMD-SVR-GWO model.

4. Data and Results

This section begins with a description of the dataset used in the analysis, followed by the presentation of results from all forecasting models. Subsequently, the models’ performance is assessed using three evaluation criteria, and their forecasting accuracy is compared.

4.1. BDI Time Series Data

Monthly Baltic Dry Index data from January 2019 to August 2024 are used in this study. Monthly values are obtained by averaging daily BDI observations from the Eastmoney database, yielding 68 observations. Table 1 reports the original monthly BDI data used in the empirical analysis to ensure transparency and reproducibility, while Figure 3 presents a graphical view of the series, illustrating its overall evolution and nonstationary behavior.

Table 2 summarizes the descriptive statistics, highlighting the variability and right-skewed distribution of the data. The monthly BDI series has a mean of 1743.264 and exhibits substantial variability, with a standard deviation of 851.541 and a wide range from 460.6 to 4819.95. The positive skewness (1.244) and moderate kurtosis (2.380) further indicate a right-skewed distribution with some extreme values. The sample consists of 68 monthly observations.

For model evaluation, the dataset is divided into an in-sample period from January 2019 to December 2023 and an out-of-sample period from January to August 2024.

4.2. Model Results

This section summarizes the outcomes produced by the six models, while their forecasting accuracy is compared in the following section.

4.2.1. Grey Forecasting

The accuracy of forecasts generated by the grey model depends on the length of the initial data segment selected from the time series. To address this, we executed multiple grey forecasts using different initial sequence sizes. The best results, indicated by the lowest prediction errors, occurred when the initial sequence length was set to four observations. The procedure for producing forecasts with the grey model involves the following steps:

1.: Class Ratio Test:

Before applying the GM(1,1) model, the data must satisfy the class ratio condition. As described in Section 3.1, the ratio values $σ(k)$ should lie between 0 and 1 for the series to qualify for grey modeling. The class ratio test results, summarized in Table 3, confirm that the grey model is suitable for this dataset.

2.: Accumulated Generating Operation (AGO):

Following Equation (2), the AGO was applied to obtain the new sequence. The resulting values are presented in Table 4.

3.: Mean value generating sequence

We calculated the mean value generating sequence, as shown in Table 5.

4.: Time series prediction model

The coefficients a and b were estimated using the least squares method, yielding the following values:

a = −0.1749295723 and b = 1313.9202714553.

Substituting these estimates into the GM(1,1) prediction equation gives

{\hat{X}}^{(0)} (k + 1) = (1 - e^{\hat{a}}) \left[ X^{(0)} (1) - \frac{\hat{b}}{\hat{a}} \right] e^{- \hat{a} k}

(8)

Equation (8) is then used to compute the predicted values of the time series. The forecasts for the period from January 2024 to August 2024 are presented in Table 6.
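The four steps above can be sketched in a few lines of code. This is an illustrative sketch rather than the authors' implementation: the function names `gm11_fit` and `gm11_forecast` are hypothetical, and the input window is assumed to be the last four in-sample values of Table 1 (September to December 2023). Fitting that window appears to reproduce the reported a and b to the precision of the rounded table values, which suggests, but does not confirm, that it was the first window used. How the window is advanced for later months is not stated in this section, so only the first out-of-sample step is computed.

```python
import math


def gm11_fit(x0):
    """Estimate the GM(1,1) coefficients a and b by ordinary least squares."""
    # Accumulated generating operation (AGO): running sum of the raw series.
    x1 = []
    total = 0.0
    for v in x0:
        total += v
        x1.append(total)
    # Mean value generating sequence: average of consecutive AGO values.
    z1 = [0.5 * (x1[k] + x1[k - 1]) for k in range(1, len(x1))]
    y = list(x0[1:])
    # Regress y on z1 in the form y = -a * z1 + b (closed-form simple regression).
    n = len(z1)
    z_mean = sum(z1) / n
    y_mean = sum(y) / n
    s_zy = sum((z - z_mean) * (v - y_mean) for z, v in zip(z1, y))
    s_zz = sum((z - z_mean) ** 2 for z in z1)
    slope = s_zy / s_zz
    a = -slope
    b = y_mean - slope * z_mean
    return a, b


def gm11_forecast(x0, a, b, steps):
    """Evaluate Equation (8) for k = n, ..., n + steps - 1 (beyond the sample)."""
    n = len(x0)
    scale = (1.0 - math.exp(a)) * (x0[0] - b / a)
    return [scale * math.exp(-a * k) for k in range(n, n + steps)]


# Assumed window: September to December 2023 from Table 1.
window = [1378.05, 1867.68, 1760.25, 2537.63]
a, b = gm11_fit(window)
first_step = gm11_forecast(window, a, b, steps=1)
print(a, b, first_step)
```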

4.2.2. ARIMA Model

Figure 3 shows that the monthly BDI series exhibits a clear trend, suggesting nonstationarity. This observation is confirmed by the Augmented Dickey–Fuller (ADF) test, which fails to reject the null hypothesis of a unit root (ADF statistic = −2.377, p = 0.1483). Accordingly, first differencing is applied to obtain a stationary series suitable for ARIMA modeling.

For methodological transparency, we first report results obtained using the classical Box–Jenkins framework. After differencing, the ACF and PACF plots, as shown in Figure 4 and Figure 5, were utilized to identify potential models, and several ARIMA specifications were examined using standard residual diagnostics. Among the candidate models, ARIMA(0,1,1) satisfies the residual independence requirement based on the Ljung–Box test, whereas alternative specifications fail to meet adequacy conditions. However, despite being statistically admissible, the forecasting performance of this model remains limited.

It is acknowledged that while the Box–Jenkins identification–estimation–diagnostic procedure is historically important, contemporary time-series forecasting increasingly relies on automated, information-criterion–based model selection methods. Accordingly, ACF and PACF plots are used here primarily for illustrative purposes rather than as decisive selection tools.

To obtain the final ARIMA specification, we employ the auto_arima procedure from the pmdarima Python package (version 2.1.1), which systematically evaluates candidate models using a stepwise search strategy based on information criteria (AIC, BIC, and HQIC). This automated procedure identifies an ARIMA(1,0,0) model as optimal. Despite its parsimonious structure, this specification achieves superior out-of-sample forecasting performance compared to more complex alternatives.

The difference between the classical ARIMA(0,1,1) and the automated ARIMA(1,0,0) selection primarily reflects the sensitivity of manual differencing decisions in the Box–Jenkins framework. Over-differencing may induce spurious moving-average behavior, whereas the automated procedure evaluates a broader model space and identifies that the BDI series is more appropriately represented as a stationary autoregressive process. Since AR models preserve long-run persistence while avoiding unnecessary complexity, the ARIMA(1,0,0) specification provides a more robust balance between parsimony and predictive accuracy.

Based on these considerations, the ARIMA(1,0,0) model selected by the automated procedure is adopted for forecasting. The final ARIMA(1,0,0) forecasts are reported in Table 6. Additional diagnostic results and intermediate model comparisons are reported in Appendix A for completeness.
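To illustrate how a stationary ARIMA(1,0,0) model behaves over several steps, the sketch below applies the standard AR(1) forecast recursion, in which each h-step forecast equals the process mean plus the last deviation from the mean scaled by the h-th power of the autoregressive coefficient. The coefficient `phi = 0.8` is purely hypothetical, since the estimated coefficients are not reported in this section. The mean is taken from Table 2 and the starting value is the December 2023 observation from Table 1, both used here only as convenient inputs.

```python
def ar1_forecast(last_value, mean, phi, steps):
    """h-step forecasts of a stationary AR(1): mean + phi**h * (last - mean)."""
    return [mean + (phi ** h) * (last_value - mean) for h in range(1, steps + 1)]


# phi is a hypothetical value chosen for illustration only.
path = ar1_forecast(last_value=2537.63, mean=1743.264, phi=0.8, steps=8)
print([round(v, 2) for v in path])
```

With any coefficient between 0 and 1 the path moves monotonically toward the mean, which is one way to understand why forecasts from this model class look smooth over an eight-step horizon.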

4.2.3. LSTM

To ensure consistency with other univariate models studied, the BDI sequence is normalized to the interval [0, 1]. A sliding window of k past observations (here k = 5) is used as input to predict the next period:

X_{t} = \{x_{t - k + 1}, \dots, x_{t}\} \to x_{t + 1} .

The LSTM architecture employed in this study consists of a single LSTM layer with 32 hidden units using a hyperbolic tangent activation function, followed by a fully connected output layer. The model is trained using the Adam optimizer, with mean squared error (MSE) selected as the loss function to guide optimization.

The model is trained for 200 epochs until convergence. After training, multi-step-ahead forecasts (H = 8) are generated recursively:

{\hat{x}}_{t + h} = f ({\hat{x}}_{t + h - 1}, {\hat{x}}_{t + h - 2}, \dots, {\hat{x}}_{t + h - k}), h = 1, \dots, 8 .

The final LSTM forecasts (after rescaling) are reported in Table 6 for direct comparison with GM(1,1), ARIMA, SVR, GRU, and EMD–SVR–GWO models.
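The data preparation and the recursive forecasting loop described above can be sketched independently of the network itself. In the following minimal example the trained LSTM is replaced by a placeholder, `mean_model`, which is an assumption made only to keep the code self-contained. The helper names are hypothetical, and the input is the first eight rows of Table 1 rather than the full training set.

```python
def minmax_scale(series):
    """Scale a series to [0, 1] and return the bounds needed to undo it."""
    lo, hi = min(series), max(series)
    return [(v - lo) / (hi - lo) for v in series], lo, hi


def make_windows(series, k):
    """Pair each window of k past values with the value that follows it."""
    X = [series[i:i + k] for i in range(len(series) - k)]
    y = [series[i + k] for i in range(len(series) - k)]
    return X, y


def recursive_forecast(one_step_model, history, k, horizon):
    """Feed each prediction back into the input window to reach `horizon` steps."""
    window = list(history[-k:])
    out = []
    for _ in range(horizon):
        nxt = one_step_model(window)
        out.append(nxt)
        window = window[1:] + [nxt]  # drop the oldest value, append the forecast
    return out


def mean_model(window):
    # Placeholder for a trained one-step model (assumption): the window mean.
    return sum(window) / len(window)


# First eight monthly values of Table 1 (January to August 2019).
series = [1063.32, 628.75, 680.82, 773.25, 1037.05, 1174.40, 1869.74, 1981.86]
scaled, lo, hi = minmax_scale(series)
X, y = make_windows(scaled, k=5)
preds_scaled = recursive_forecast(mean_model, scaled, k=5, horizon=8)
preds = [p * (hi - lo) + lo for p in preds_scaled]  # undo the scaling
print(len(X), preds)
```

Because every forecast beyond the first step depends on earlier forecasts rather than on observed values, errors can compound over the horizon in this scheme.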

4.2.4. GRU

The preprocessing procedures for GRU follow those of the LSTM model. Using the same sliding-window inputs ensures comparability across models. The GRU model used in this study is configured with a single GRU layer containing 32 units, followed by a dense output layer. The training process employs the Adam optimizer with mean squared error as the loss function, and the model is trained over 200 epochs to ensure effective learning.

Given parameters θ, the GRU forecasting function is

{\hat{x}}_{t + 1} = g_{θ} (x_{t - k + 1}, \dots, x_{t}) .

As with the LSTM, multi-step forecasting is performed recursively:

{\hat{x}}_{t + h} = g_{θ} ({\hat{x}}_{t + h - 1}, \dots, {\hat{x}}_{t + h - k}), h = 1, \dots, 8 .

This unified setup allows a direct comparison of performance between LSTM and GRU and ensures consistency with the other forecasting models evaluated in this study. The final GRU forecasts are reported in Table 6.

4.2.5. SVR

Support Vector Regression is a powerful machine learning technique for predicting the Baltic Dry Index. In this study, the 68 monthly BDI data points described in Section 4.1 are used. The first 60 consecutive months serve as input features for the model, while the last 8 months are used as output targets.

To prevent training errors caused by high-dimensional data or large variations in feature values, the dataset must be normalized before training the SVR model. The standardization process follows the formula

X^{'} = \frac{X - μ}{σ}

where

X is the original feature value;
μ is the mean of the feature in the training set;
σ is the standard deviation of the feature in the training set;
$X^{'}$ is the standardized value.
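A minimal sketch of this standardization is given below. The key point is that μ and σ are estimated on the training portion only and then reused unchanged for later data. The helper names, the use of the population standard deviation, and the toy six/two split of the first eight values of Table 1 are assumptions made for illustration.

```python
import statistics


def fit_standardizer(train):
    """Estimate mean and standard deviation from the training values only."""
    mu = statistics.fmean(train)
    sigma = statistics.pstdev(train)  # assumption: population standard deviation
    return mu, sigma


def standardize(values, mu, sigma):
    """Apply X' = (X - mu) / sigma with previously fitted statistics."""
    return [(v - mu) / sigma for v in values]


# Toy split of the first eight values of Table 1 (illustration only).
train = [1063.32, 628.75, 680.82, 773.25, 1037.05, 1174.40]
later = [1869.74, 1981.86]

mu, sigma = fit_standardizer(train)
train_std = standardize(train, mu, sigma)
later_std = standardize(later, mu, sigma)  # reuses the training statistics
print(mu, sigma, later_std)
```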

Next, the SVR model is trained using an appropriate kernel function, which determines how the model captures non-linear relationships in the data. The three most common kernels are linear, polynomial, and radial basis function. For BDI prediction, the RBF kernel is selected due to its effectiveness in handling complex, non-linear patterns.

Finally, the trained SVR model is deployed for forecasting future BDI values. It is used to generate one-step predictions for the last 8 months of the dataset. The predicted BDI values for the time series from January 2024 to August 2024 are presented in Table 6.

4.2.6. EMD-SVR-GWO

In the proposed EMD-SVR-GWO hybrid ensemble model, the first stage applies EMD to decompose the BDI series into several IMFs. Using too many IMFs may degrade the final result because the forecast errors of the individual IMFs accumulate. To limit this effect, the original data are decomposed into four IMFs. Figure 6 presents the decomposition results of the BDI via EMD, where all IMFs are listed from the highest frequency to the lowest frequency.

In the second stage, SVR is applied to each IMF. The RBF kernel is selected for SVR. The trained SVR model is then used to forecast the next 8 periods of each IMF.

The third stage optimizes the weight of each IMF using GWO with a least-squares criterion. A least-squares criterion requires actual data as the reference, but in forecasting the actual values are unknown. In this study, the predicted values obtained from simple regression are therefore used in place of the actual data. The optimal weight for each IMF is shown in Table 7. The predicted value of each IMF is then multiplied by its weight, and the weighted values are summed to obtain the predicted BDI. The predicted BDI values are shown in Table 6.
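The weighted recombination and the least-squares criterion it is scored against can be written compactly. The sketch below uses made-up component forecasts and a made-up reference series purely to show the arithmetic. The function names are hypothetical, and no optimizer is included: any search routine, GWO among them, could be used to minimise `squared_error` over the weight vector.

```python
def combine(weights, component_forecasts):
    """Weighted sum of the per-component forecasts at each horizon step."""
    horizon = len(component_forecasts[0])
    return [
        sum(w * comp[h] for w, comp in zip(weights, component_forecasts))
        for h in range(horizon)
    ]


def squared_error(weights, component_forecasts, reference):
    """Least-squares fitness: sum of squared gaps to the reference series."""
    combined = combine(weights, component_forecasts)
    return sum((c - r) ** 2 for c, r in zip(combined, reference))


# Toy inputs (assumptions): two components forecast over three steps.
components = [[10.0, -5.0, 2.0], [100.0, 110.0, 120.0]]
true_weights = [0.5, 1.0]
reference = combine(true_weights, components)

print(squared_error(true_weights, components, reference))
print(squared_error([1.0, 1.0], components, reference))
```

Equal weights correspond to the plain sum of component forecasts, so comparing the two fitness values shows what the weighting step is meant to improve upon.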

The dominance of the lowest-frequency IMF indicates that medium- to long-term components carry the majority of predictive power for monthly BDI movements, while high-frequency components mainly contribute noise. This finding is consistent with the economic interpretation of freight indices, which are driven primarily by persistent demand–supply imbalances rather than short-term fluctuations.

4.3. Comparison of Forecasting Results

The BDI forecasts for the out-of-sample period (January to August 2024) were generated using each forecasting approach. The predicted values, along with the actual BDI figures for comparison, are presented in Table 6 and Figure 7.

Figure 7 compares the actual BDI with forecasts from different models over the eight-step horizon. The actual BDI shows a sharp increase in the third period followed by a decline in the fourth period, after which it remains stable, reflecting short-term market volatility. GM(1,1) shows large oscillations and substantial deviations, while ARIMA(1,0,0) produces overly smooth forecasts that fail to capture variability. SVR captures the general trend but underestimates fluctuations. LSTM and GRU generate steadily decreasing forecasts, suggesting a tendency to overestimate trend dynamics. In contrast, the proposed EMD–SVR–GWO model provides stable predictions that remain close to the observed BDI range, indicating improved robustness and balanced forecasting performance.

Yokum and Armstrong [30] carried out two studies examining experts’ views on the criteria for selecting forecasting methods. Their findings indicated that accuracy is regarded as the most important factor by the majority of researchers. However, because no single accuracy metric is universally applicable to all forecasting scenarios, multiple measures are often employed to provide a comprehensive evaluation of model performance. As noted by Makridakis and Hibon [31], the ranking of models can vary depending on the metric applied.

Model performance is assessed using root mean squared error (RMSE), mean absolute error (MAE), and mean absolute percentage error (MAPE), which directly measure out-of-sample forecasting accuracy and are widely used in time-series forecasting studies. The coefficient of determination (R²) is not emphasized, as it assumes independent observations and may yield misleading values for nonstationary and highly volatile series such as the BDI. These measures are defined as follows:

\mathrm{RMSE} = \sqrt{\frac{\sum_{i = 1}^{n} (Y_{i} - \hat{Y}_{i})^{2}}{n}}

(9)

\mathrm{MAE} = \frac{\sum_{i = 1}^{n} |Y_{i} - \hat{Y}_{i}|}{n}

(10)

\mathrm{MAPE} = \frac{100 \sum_{i = 1}^{n} \left| \frac{Y_{i} - \hat{Y}_{i}}{Y_{i}} \right|}{n}

(11)

where $Y_{i}$ and $\hat{Y}_{i}$ denote the actual and predicted values of the time series for period i, respectively. All three performance metrics yield positive values, and a smaller value indicates a more accurate forecasting model.
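Equations (9) to (11) translate directly into code. The following sketch uses only the standard library. The input numbers are arbitrary illustrative values, not results from this study.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error, Equation (9)."""
    n = len(actual)
    return math.sqrt(sum((y - f) ** 2 for y, f in zip(actual, predicted)) / n)


def mae(actual, predicted):
    """Mean absolute error, Equation (10)."""
    n = len(actual)
    return sum(abs(y - f) for y, f in zip(actual, predicted)) / n


def mape(actual, predicted):
    """Mean absolute percentage error in percent, Equation (11)."""
    n = len(actual)
    return 100.0 * sum(abs((y - f) / y) for y, f in zip(actual, predicted)) / n


# Arbitrary illustrative values (not taken from the paper).
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 380.0]

print(rmse(actual, predicted), mae(actual, predicted), mape(actual, predicted))
```

Note that MAPE divides by the actual value, so it is defined only when no actual value is zero, a condition the BDI series satisfies.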

The comparative evaluation of forecasting accuracy across all methods is summarized in Table 8. According to the results, the EMD-SVR-GWO model delivers the highest accuracy, as it consistently records the lowest values across all three metrics. ARIMA ranks as the second most effective model, regardless of which measure is applied. The difference in accuracy between SVR and ARIMA is relatively minor. In contrast, the Grey forecasting approach performs the poorest in predicting the monthly BDI.

A commonly cited reference for the MAPE-based classification is Lewis [32]. In this work, Lewis outlines that a MAPE below 10% indicates excellent forecast accuracy, 10–20% is good, 20–50% is acceptable, and values above 50% are considered poor.
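The classification can be expressed as a small lookup function. This is a sketch only: the function name is hypothetical, and the treatment of values that fall exactly on a boundary (10, 20, or 50) is an assumption, since the thresholds are stated only as ranges. The inputs are the MAPE values reported for the six models in this study.

```python
def lewis_category(mape_percent):
    """Map a MAPE (in percent) to the accuracy label described above."""
    if mape_percent < 10:
        return "excellent"
    if mape_percent < 20:    # assumption: exactly 10 counts as "good"
        return "good"
    if mape_percent <= 50:   # assumption: exactly 20 and 50 count as "acceptable"
        return "acceptable"
    return "poor"


reported_mape = {
    "EMD-SVR-GWO": 8.3455,
    "ARIMA": 14.2829,
    "SVR": 15.3686,
    "LSTM": 16.3638,
    "GRU": 24.0836,
    "Grey Forecast": 27.26,
}
labels = {name: lewis_category(value) for name, value in reported_mape.items()}
print(labels)
```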

According to Lewis’s MAPE-based classification, the proposed EMD-SVR-GWO model achieves a MAPE of 8.3455%, corresponding to excellent forecast accuracy. In comparison, the ARIMA, SVR, and LSTM models yield MAPEs of 14.2829%, 15.3686%, and 16.3638%, respectively, which fall within the good accuracy range, while the GRU and Grey Forecast models exhibit acceptable performance (with MAPEs of 24.0836% and 27.26%, respectively). It should be noted that LSTM and GRU are included primarily as benchmark deep-learning methods, and their results are interpreted with caution due to the limited sample size, particularly in the test set. Deep neural networks typically require substantially larger datasets to fully exploit their representational capacity, and their comparatively weaker performance in this study likely reflects data constraints rather than inherent model deficiencies. Overall, the results consistently demonstrate the robustness and superior predictive accuracy of the proposed EMD-SVR-GWO framework for BDI forecasting under small-sample conditions.

5. Conclusions

From an industry perspective, improved short-term BDI forecasts can support chartering decisions, fleet deployment, and risk management in volatile freight markets, particularly during periods of structural imbalance such as post-pandemic recovery. The Baltic Dry Index is a key indicator of global shipping costs and economic activity, influencing decisions in trade, investment, and policymaking. However, its high volatility makes accurate forecasting essential. Reliable predictions help shipping companies optimize operations, help investors anticipate market trends, and help policymakers make informed decisions, reducing financial uncertainty and improving strategic planning in the shipping and trade industries.

In forecasting research, synergistic effects arise when multiple complementary methods are combined so that their strengths reinforce one another and improve overall predictive accuracy. In our model, the three-stage EMD-SVR-GWO framework creates such a synergistic effect because each method strengthens a different part of the forecasting process. EMD reduces noise and separates key patterns, SVR captures nonlinear relationships within each component, and GWO optimizes how these component forecasts are combined. Our model fills a methodological gap by offering a more robust way to capture the complex dynamics of the series.

This paper compares various univariate forecasting methods to develop a more precise short-term BDI prediction model, providing valuable insights for decision-makers. Six forecasting techniques are examined: Grey Forecast, ARIMA, Support Vector Regression, LSTM, GRU, and EMD-SVR-GWO. Our results indicate that EMD-SVR-GWO outperforms ARIMA and the other four methods (SVR, LSTM, GRU, and Grey Forecast). Our proposed approach goes beyond simply combining EMD with SVR by introducing an additional weighted combination step. Notably, this step plays a significant role in enhancing the overall forecasting accuracy. The EMD-SVR-GWO model achieves a MAPE of 8.3455%, which falls in the excellent category.

From an industry perspective, the novel EMD-SVR-GWO model provided by this study serves as a valuable reference for ship-owners and charterers in making chartering decisions. For instance, if the Baltic Dry Index is projected to rise, ship owners should consider purchasing new or second-hand vessels or securing time charter contracts. If they already hold long-term transportation contracts, sub-chartering their vessels may be a strategic move. On the other hand, charterers should act promptly to secure time charter contracts or long-term transportation agreements with ship-owners. Conversely, if the Baltic Dry Index is expected to decline, the opposite strategies should be adopted.

Despite the strong forecasting performance of the proposed framework, several limitations should be acknowledged from an economic and market perspective. First, the dataset covers a relatively short time span, which may limit the model’s ability to capture long-term structural changes in the global freight market, such as shifts in trade patterns, regulatory interventions, or major economic cycles. As a result, the predictive results identified in this study may be less stable during periods of significant market restructuring or regime change.

Second, although the multi-stage hybrid structure is effective in extracting nonlinear patterns, it may face challenges under extreme market conditions, such as sudden freight rate surges driven by geopolitical events, supply chain disruptions, or abrupt demand shocks. In such environments, purely data-driven models may adjust less rapidly, potentially affecting short-term forecasting reliability.

Third, the study adopts a univariate framework, which precludes the explicit inclusion of external economic drivers of freight market fluctuations, such as commodity prices, exchange rates, and financial market sentiment. It also evaluates only the Grey Wolf Optimizer (GWO) for weight optimization.

Future research may extend the proposed framework to a multivariate setting by incorporating economically meaningful variables. Recent studies show that commodity prices, exchange rates, and volatility indices significantly enhance BDI forecasting accuracy (Kim et al. [19]), while future prices of aluminum, iron ore, cotton, thermal coal, and equity market indicators such as the NASDAQ Composite Index also exhibit strong explanatory power (Li et al. [22]). These factors can be integrated into an EMD-based hybrid framework using multivariate SVR or deep learning models. Moreover, alternative metaheuristic algorithms, such as Particle Swarm Optimization, may be explored to further improve robustness under highly volatile freight market conditions.

Author Contributions

Conceptualization, J.H. and C.-W.C.; data curation, J.H. and H.-L.H.; formal analysis, J.H. and C.-W.C.; software, J.H.; validation, C.-W.C.; visualization, J.H. and H.-L.H.; writing—original draft, J.H. and H.-L.H.; writing—review and editing, C.-W.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The datasets generated during the current study are available from the corresponding author upon reasonable request.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A. Supplementary ARIMA Model Diagnostics and Selection Results

Appendix A.1. Classical ARIMA Diagnostic Results

To provide transparency while avoiding unnecessary detail in the main text, Table A1 summarizes the diagnostic results of candidate ARIMA models evaluated using the classical Box–Jenkins framework. Residual adequacy was assessed using the Ljung–Box Q test at multiple lags.

Table A1. Diagnostic summary of candidate ARIMA models (classical Box–Jenkins approach).

Model	Differencing	Ljung–Box Test (p-Values)	Residual Adequacy
ARIMA(2,1,0)	First difference	Most p-Values < 0.05	Not adequate
ARIMA(2,1,0)	First difference	All p-Values < 0.001	Not adequate
ARIMA(0,1,1)	First difference	All p-Values > 0.05	Adequate

Note: Only models satisfying residual independence were considered further. Detailed test statistics are omitted for brevity.

Appendix A.2. Automated ARIMA Model Selection

Automated model selection was conducted using the auto_arima function from the pmdarima Python package. The procedure evaluates candidate models using information criteria and a stepwise search strategy.

Table A2. Summary of auto_arima selection results.

Criterion	Value
Selected model	ARIMA(1,0,0)
AIC	1086.442
BIC	1093.272
HQIC	1089.161

The automated procedure selected a parsimonious ARIMA(1,0,0) specification, which provided superior out-of-sample forecasting performance compared to more complex alternatives.

Appendix A.3. Software Environment

SAS (Version 9.2): used for classical Box–Jenkins diagnostics
Python: pmdarima (auto_arima)
Evaluation metrics: MAE, RMSE, MAPE

References

Veenstra, A.W.; Franses, P.H. A co-integration approach to forecasting freight rates in the dry bulk shipping sector. Transp. Res. Part A 1997, 31, 447–458.
Cullinane, K.P.B.; Mason, K.J.; Cape, M. A comparison of models for forecasting the Baltic freight index: Box-Jenkins revisited. Int. J. Marit. Econ. 1999, 1, 15–39.
Kavussanos, M.G.; Alizadeh, M.A.H. Seasonality patterns in dry bulk shipping spot and time charter freight rates. Transp. Res. Part E 2001, 37, 443–467.
Tsioumas, V.; Papadimitriou, S.; Smirlis, Y.; Zahran, S.Z. A novel approach to forecasting the bulk freight market. Asian J. Shipp. Logist. 2017, 33, 33–41.
Papailias, F.; Thomakos, D.D.; Liu, J. The Baltic Dry Index: Cyclicalities, forecasting and hedging strategies. Empir. Econ. 2017, 52, 255–282.
Zhang, X.; Xue, T.; Stanley, H.E. Comparison of econometric models and artificial neural networks algorithms for the prediction of baltic dry index. IEEE Access 2018, 7, 1647–1657.
Katris, C.; Kavussanos, M.G. Time series forecasting methods for the Baltic dry index. J. Forecast. 2021, 40, 1540–1565.
Yang, H.; Dong, F.; Ogandaga, M. Forewarning of freight rate in shipping market based on support vector machine. In Traffic and Transportation Studies; ASCE Library: Reston, VA, USA, 2008; pp. 295–303.
Bao, J.; Pan, L.; Xie, Y. A new BDI forecasting model based on support vector machine. In Proceedings of the 2016 IEEE Information Technology, Networking, Electronic and Automation Control Conference, Chongqing, China, 20–22 May 2016; pp. 65–69.
Li, J.; Parsons, M.G. Forecasting tanker freight rate using neural networks. Marit. Policy Manag. 1997, 24, 9–30.
Şahin, B.; Gürgen, S.; Ünver, B.; Altin, İ. Forecasting the Baltic Dry Index by using an artificial neural network approach. Turk. J. Electr. Eng. Comput. Sci. 2018, 26, 1673–1684.
Chou, C.C.; Lin, K.S. A fuzzy neural network combined with technical indicators and its application to Baltic Dry Index forecasting. J. Mar. Eng. Technol. 2019, 18, 82–91.
Bae, S.H.; Lee, G.; Park, K.S. A Baltic Dry Index prediction using deep learning models. J. Korea Trade 2021, 25, 17–36.
Liu, B.; Wang, X.; Zhao, S.; Xu, Y. Prediction of Baltic Dry Index Based on GRA-BiLSTM Combined Model. Int. J. Marit. Eng. 2023, 165, 217–228.
Leonov, Y.; Nikolov, V. A wavelet and neural network model for the prediction of dry bulk shipping indices. Marit. Econ. Logist. 2012, 14, 319–333.
Zeng, Q.; Qu, C.; Ng, A.K.; Zhao, X. A new approach for Baltic Dry Index forecasting based on empirical mode decomposition and neural networks. Marit. Econ. Logist. 2016, 18, 192–210.
Kamal, I.M.; Bae, H.; Sunghyun, S.; Yun, H. DERN: Deep ensemble learning model for short-and long-term prediction of baltic dry index. Appl. Sci. 2020, 10, 1504.
Su, M.; Park, K.S.; Bae, S.H. A new exploration in Baltic Dry Index forecasting learning: Application of a deep ensemble model. Marit. Econ. Logist. 2024, 26, 21–43.
Kim, H.S.; Kim, D.H.; Choi, S.Y. Baltic dry index forecast using financial market data: Machine learning methods and SHAP explanations. PLoS ONE 2025, 20, e0325106.
Atsalaki, I.; Atsalakis, G.S.; Melas, K.D.; Michail, N.A. Baltic dry index forecasting using a neuro-fuzzy inference system. J. Econ. Financ. 2025, 49, 682–709.
Zhang, H. Grasping the trend of the shipping market: A Baltic Dry Index prediction method based on deep learning. Inf. Syst. Econ. 2025, 6, 129–137.
Li, W.; Bao, H.; Lei, X. Research on BDI index prediction based on LSTM Neural Network. J. Int. Econ. Glob. Gov. 2025, 2, 67–88.
Deng, J.L. Introduction to grey system theory. J. Grey Syst. 1989, 1, 1–24.
Box, G.E.P.; Jenkins, G.M.; Reinsel, G.C. Time Series Analysis, Forecasting and Control; Prentice-Hall: Englewood Cliffs, NJ, USA, 1994.
Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Comput. 1997, 9, 1735–1780.
Chung, J.; Gulcehre, C.; Cho, K.; Bengio, Y. Empirical evaluation of gated recurrent neural networks on sequence modeling. arXiv 2014, arXiv:1412.3555.
Huang, N.E.; Shen, Z.; Long, S.R.; Wu, M.C.; Shih, H.H.; Zheng, Q.; Liu, H.H. The empirical mode decomposition and the Hilbert spectrum for nonlinear and non-stationary time series analysis. Proc. R. Soc. Lond. A 1998, 454, 903–995.
Vapnik, V. The Nature of Statistical Learning Theory; Springer Science & Business Media: Berlin/Heidelberg, Germany, 1999.
Mirjalili, S.; Mirjalili, S.M.; Lewis, A. Grey wolf optimizer. Adv. Eng. Softw. 2014, 69, 46–61.
Yokuma, J.T.; Armstrong, J.S. Beyond accuracy: Comparison of criteria used to select forecasting methods. Int. J. Forecast. 1995, 11, 591–597. [Google Scholar] [CrossRef][Green Version]
Makridakis, S.; Hibon, M. The M3-competition: Results, conclusions and implications. Int. J. Forecast. 2000, 164, 451–476. [Google Scholar] [CrossRef]
Lewis, C. Industrial Forecasting: Principles and Practice; Chapman & Hall: London, UK, 1982. [Google Scholar]

Figure 1. Flowchart of the Grey Wolf Optimization algorithm.

Figure 2. Flowchart of the EMD-SVR-GWO model.

Figure 3. Time-series plot of monthly BDI (2019–2024) showing trend and nonstationary behavior.

Figure 4. The SAS output of the ACF for the first difference of the original data.

Figure 5. The SAS output of the PACF for the first difference of the original data.

Figure 6. Data decomposition results of BDI.

Figure 7. Actual and Predicted BDI.

Table 1. Monthly Baltic Dry Index dataset used for empirical analysis.

Year	Month	BDI	Year	Month	BDI
2019	1	1063.32	2022	1	1760.80
2019	2	628.75	2022	2	1834.90
2019	3	680.82	2022	3	2464.09
2019	4	773.25	2022	4	2220.37
2019	5	1037.05	2022	5	2943.05
2019	6	1174.40	2022	6	2389.45
2019	7	1869.74	2022	7	2078.48
2019	8	1981.86	2022	8	1412.36
2019	9	2256.41	2022	9	1487.14
2019	10	1827.61	2022	10	1814.67
2019	11	1419.29	2022	11	1298.95
2019	12	1380.71	2022	12	1453.41
2020	1	701.59	2023	1	908.81
2020	2	460.60	2023	2	658.35
2020	3	601.09	2023	3	1410.04
2020	4	663.90	2023	4	1480.33
2020	5	489.11	2023	5	1416.05
2020	6	1146.45	2023	6	1081.77
2020	7	1634.39	2023	7	1040.19
2020	8	1514.33	2023	8	1149.86
2020	9	1410.77	2023	9	1378.05
2020	10	1630.77	2023	10	1867.68
2020	11	1180.38	2023	11	1760.25
2020	12	1243.67	2023	12	2537.63
2021	1	1657.50	2024	1	1673.61
2021	2	1499.60	2024	2	1658.16
2021	3	2017.61	2024	3	2232.90
2021	4	2475.05	2024	4	1731.33
2021	5	2965.26	2024	5	1890.36
2021	6	2932.00	2024	6	1922.00
2021	7	3187.95	2024	7	1925.30
2021	8	3718.10	2024	8	1716.24
2021	9	4286.45
2021	10	4819.95
2021	11	2780.45
2021	12	2835.22

Source: Daily data were collected from https://data.eastmoney.com/cjsj/hyzs_EMI00107664.html (accessed on 1 January 2025); the monthly data were prepared by the authors.

Table 2. Monthly BDI Data—Descriptive Statistics.

Mean	1743.264
Median	1645.945
Standard Deviation	851.541
Variance	725,122.842
Kurtosis	2.380
Skewness	1.244
Range	4359.350
Minimum	460.600
Maximum	4819.950
Number of observations	68
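The location and range figures in Table 2 can be recomputed directly from the monthly values in Table 1. The following is a minimal sketch using only the Python standard library; the variable names are illustrative, and the use of the sample (n − 1) standard deviation is an assumption about how Table 2 was produced. Skewness and kurtosis are omitted because their values depend on which estimator a package uses.

```python
import statistics

# Monthly BDI values transcribed from Table 1 (January 2019 to August 2024).
bdi = [
    1063.32, 628.75, 680.82, 773.25, 1037.05, 1174.40,      # 2019
    1869.74, 1981.86, 2256.41, 1827.61, 1419.29, 1380.71,
    701.59, 460.60, 601.09, 663.90, 489.11, 1146.45,        # 2020
    1634.39, 1514.33, 1410.77, 1630.77, 1180.38, 1243.67,
    1657.50, 1499.60, 2017.61, 2475.05, 2965.26, 2932.00,   # 2021
    3187.95, 3718.10, 4286.45, 4819.95, 2780.45, 2835.22,
    1760.80, 1834.90, 2464.09, 2220.37, 2943.05, 2389.45,   # 2022
    2078.48, 1412.36, 1487.14, 1814.67, 1298.95, 1453.41,
    908.81, 658.35, 1410.04, 1480.33, 1416.05, 1081.77,     # 2023
    1040.19, 1149.86, 1378.05, 1867.68, 1760.25, 2537.63,
    1673.61, 1658.16, 2232.90, 1731.33, 1890.36, 1922.00,   # 2024
    1925.30, 1716.24,
]

n_obs = len(bdi)
mean_bdi = statistics.fmean(bdi)
median_bdi = statistics.median(bdi)
range_bdi = max(bdi) - min(bdi)
# Assumption: Table 2 reports the sample standard deviation (n - 1 denominator).
sample_sd = statistics.stdev(bdi)

print(f"n={n_obs}, mean={mean_bdi:.3f}, median={median_bdi:.3f}")
print(f"min={min(bdi):.3f}, max={max(bdi):.3f}, range={range_bdi:.3f}")
print(f"sample sd={sample_sd:.3f}")
```

The count, mean, median, minimum, maximum, and range obtained this way agree with the corresponding rows of Table 2.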

Table 3. Class ratio test.

2023/10	2023/11	2023/12
0.4246	0.6484	0.6636

Table 4. Accumulated generated sequence.

2023/9	2023/10	2023/11	2023/12
1378.05	3245.73	5005.98	7543.61

Table 5. Mean value generating sequence.

2023/9~10	2023/10~11	2023/11~12
2311.89	4125.855	6274.795
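The intermediate grey-forecasting quantities in Tables 3 to 5 follow from the four raw BDI values for 2023/9 to 2023/12 in Table 1. The sketch below reproduces them with illustrative variable names. Note that the values in Table 3 coincide with ratios of consecutive entries of the accumulated sequence in Table 4, so the sketch computes them that way; whether this matches the authors' exact definition of the class ratio is an assumption.

```python
from itertools import accumulate

# Raw BDI values for 2023/9 to 2023/12, taken from Table 1.
x0 = [1378.05, 1867.68, 1760.25, 2537.63]

# Accumulated generated sequence (Table 4): running sum of the raw series.
x1 = list(accumulate(x0))

# Ratios of consecutive accumulated values; these reproduce the
# numbers listed in Table 3 (assumed to be how the table was built).
ratios = [x1[k - 1] / x1[k] for k in range(1, len(x1))]

# Mean value generating sequence (Table 5): average of adjacent
# entries of the accumulated sequence.
z1 = [0.5 * (x1[k - 1] + x1[k]) for k in range(1, len(x1))]

print("accumulated:", [round(v, 2) for v in x1])
print("ratios:     ", [round(v, 4) for v in ratios])
print("mean values:", [round(v, 3) for v in z1])
```

In a GM(1,1) model, the mean value sequence `z1` and the raw series `x0` are the inputs to the least-squares step that estimates the development coefficient and the grey input.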

Table 6. Actual and predicted BDI.

Month (2024)	Actual BDI	Grey Forecast	ARIMA (1,0,0)	SVR	LSTM	GRU	EMD-SVR-GWO
1	1673.61	2871.9526	2305.03	1402.1657	2179.7861	2314.69	1903.5586
2	1658.16	1915.8460	2213.99	1441.9440	2201.937	2428.1724	1902.6450
3	2232.90	1179.6161	2136.49	1484.2511	2177.8552	2379.2947	1901.7250
4	1731.33	2517.5462	2070.52	1527.7972	2143.251	2322.227	1900.8048
5	1890.36	1941.7500	2014.37	1571.3694	2082.6338	2255.5176	1899.8906
6	1922.00	1614.6354	1966.57	1613.8601	2066.8604	2195.521	1898.9877
7	1925.30	2044.0939	1925.88	1654.2896	2041.5121	2141.5312	1898.1015
8	1716.24	1947.6592	1891.24	1691.8222	2022.9413	2101.729	1897.2360

Table 7. The optimal weight for each IMF.

IMF	Weight
IMF 1	0.0037773231
IMF 2	0.0010685708
IMF 3	0.0008962573
IMF 4	1.0

Table 8. Performance of various methods of forecasting BDI.

Forecasting Method	MAE	MAPE (%)	RMSE
Grey forecast	500.561	27.2611	651.4161
ARIMA(1,0,0)	245.875	14.2829	331.657
SVR	295.301	15.3686	352.329
LSTM	284.620	16.3638	333.568
GRU	423.597	24.0836	471.434
EMD-SVR-GWO	151.977	8.3455	188.801
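The error measures in Table 8 can be checked against the actual and predicted values in Table 6. The sketch below does this for two of the six methods, using the usual definitions of MAE, MAPE (in percent), and RMSE; the function and variable names are illustrative and not taken from the authors' code.

```python
import math

# Actual BDI for the eight hold-out months (Table 6).
actual = [1673.61, 1658.16, 2232.90, 1731.33,
          1890.36, 1922.00, 1925.30, 1716.24]

# Predictions for two of the methods, transcribed from Table 6.
forecasts = {
    "ARIMA(1,0,0)": [2305.03, 2213.99, 2136.49, 2070.52,
                     2014.37, 1966.57, 1925.88, 1891.24],
    "EMD-SVR-GWO": [1903.5586, 1902.6450, 1901.7250, 1900.8048,
                    1899.8906, 1898.9877, 1898.1015, 1897.2360],
}

def mae(y, f):
    """Mean absolute error."""
    return sum(abs(a - b) for a, b in zip(y, f)) / len(y)

def mape(y, f):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs(a - b) / abs(a) for a, b in zip(y, f)) / len(y)

def rmse(y, f):
    """Root mean squared error."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y, f)) / len(y))

scores = {
    name: (mae(actual, pred), mape(actual, pred), rmse(actual, pred))
    for name, pred in forecasts.items()
}

for name, (m, p, r) in scores.items():
    print(f"{name:13s} MAE={m:8.3f}  MAPE={p:7.4f}%  RMSE={r:8.3f}")
```

The remaining methods can be checked the same way by adding their columns from Table 6 to `forecasts`.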

