Article

Data-Driven Prediction of Deep-Sea Near-Seabed Currents: A Comparative Analysis of Machine Learning Algorithms

1 State Key Laboratory of Satellite Ocean Environment Dynamics, Second Institute of Oceanography, Ministry of Natural Resources, Hangzhou 310012, China
2 Key Laboratory of Ocean Space Resource Management Technology, Ministry of Natural Resources, Marine Academy of Zhejiang Province, Hangzhou 310012, China
3 Beijing Pioneer Hi-Tech Development Corporation, Beijing 100048, China
* Author to whom correspondence should be addressed.
Remote Sens. 2025, 17(18), 3131; https://doi.org/10.3390/rs17183131
Submission received: 26 June 2025 / Revised: 23 August 2025 / Accepted: 29 August 2025 / Published: 9 September 2025

Abstract

Deep-sea mining has garnered significant global attention, and accurate prediction of ocean currents plays a critical role in optimizing the design of sediment plume monitoring networks associated with mining activities. Using near-seabed mooring data from the Western Pacific M2 block (Beijing Pioneer polymetallic nodule Exploration Area, BPEA), this study trained four machine learning models—LSTM, XGBoost, ARIMA, and SVR—on current velocity to generate 96 h forecasts. Key findings include the following: LSTM and ARIMA models outperformed XGBoost and SVR in near-seabed current prediction. 1 h ahead forecasts substantially improved accuracy over rolling predictions (an iterative process where predicted values are treated as observed values for subsequent prediction steps), reducing zonal current (east–west component) RMSE from 2.395 cm/s to 1.120 cm/s and meridional current (north–south component) RMSE from 2.024 cm/s to 1.224 cm/s. For practical deployment, 3 h ahead forecasts achieved a zonal current RMSE of 1.412 cm/s.

1. Introduction

In recent years, the depletion of terrestrial mineral resources and the large global demand for minerals have made deep-sea mineral extraction a focus of attention, and the sediment plumes generated by deep-sea mining have likewise attracted considerable interest [1]. Sediment plumes are categorized into mid-water plumes and collector plumes. Mid-water plumes are generated during the discharge of tailings water after mineral extraction [2], whereas collector plumes are generated during the cutting and collection operations of mining vehicles [3]. Sediment plumes propagate, spread, and settle, increasing water-column turbidity. This affects the benthic environment [4,5,6] and impairs benthic organisms’ feeding and communication abilities [7,8,9,10,11]. Such impacts may further reduce local biodiversity. Therefore, more research on sediment plumes is needed before large-scale commercial mining is undertaken. Current research on plume evolution has employed both in situ measurements and numerical simulations. These studies have identified several key factors influencing plume diffusion: the current field [5,12,13], seafloor topography [2], sediment characteristics [14] and suspended sediment concentration, the location of mining vehicles [15], and the initial discharge rate [16]. Consequently, accurate simulation of deep-sea near-seabed currents during mining operations could provide a scientific foundation for the optimal design of plume monitoring networks.
Methodologies for ocean current prediction fall into three primary categories. The first relies on numerical modeling, such as the widely used Regional Ocean Modeling System (ROMS) [17], for simulation and prediction. Farrara et al. developed a 3-D ocean model based on ROMS to simulate surface currents; the results agree well with Acoustic Doppler Current Profiler (ADCP) observations, demonstrating the model’s capability to capture surface current variability [18]. Bolaños et al. assessed model sensitivity by comparing numerical models with observational data and found that model configuration, spatial resolution, and related settings significantly affect the accuracy of near-seabed current simulations [19].
The second category comprises time-series-based methods for ocean current prediction, including classical statistical methods such as moving average (MA) and Autoregressive Integrated Moving Average (ARIMA) models, which can be considered basic machine learning (ML) techniques. These approaches do not account for spatial interactions, yet they demonstrate significant strength in temporal dimension forecasting [20].
The third approach applies neural-network-based ML models to ocean current prediction. Characterized by vast repositories and strong spatiotemporal regularity, marine observational datasets allow ML techniques to efficiently extract key data features, enabling robust modeling and prediction of spatiotemporal current distributions. ML has become a powerful tool for ocean current forecasting due to its ability to process large-scale spatiotemporal data. Neural networks, including recurrent architectures such as Long Short-Term Memory (LSTM) networks and Artificial Neural Networks (ANNs), have demonstrated strong performance in predicting both surface and subsurface currents [21,22,23]. However, their accuracy tends to degrade in long-term forecasts [24]. Recent studies highlight the effectiveness of hybrid models. Convolutional Neural Network (CNN)-based approaches, such as the network for the prediction of sea surface currents (SSC-net) [25] and the Convolutional Neural Network-Gated Recurrent Unit (CNN-GRU) [20], have improved prediction accuracy, especially for surface currents. Additionally, radar-data-driven ANN models [26] and numerical–ML hybrid frameworks [27] have shown promise in short-term forecasting and extreme value prediction. Nevertheless, challenges remain in extending prediction horizons and improving deep-water current modeling.
However, current research on ocean currents predominantly focuses on surface currents, which can be predicted using features such as tides and inputs such as wind forcing [26]. In contrast, the study of deep-sea near-seabed currents remains challenging and costly due to the complex topography and harsh environment of the deep ocean. Moreover, existing prediction approaches, aside from rolling forecast techniques, often rely solely on 1 h ahead forecasting [21,25], which is inadequate for practical applications requiring longer-term predictions in deep-sea environments. Traditional methods, such as numerical models, face limitations arising from modeling assumptions, inaccuracies and inadequacies in boundary conditions and forcing functions, and high computational costs [19,22,27,28]. Consequently, it is difficult to apply these methods to the practical prediction of ocean currents. This study evaluates four machine learning models—ARIMA, LSTM, eXtreme Gradient Boosting (XGBoost), and Support Vector Regression (SVR)—for 96 h current prediction using deep-sea near-bottom mooring data from the mining area. Our research is directly linked to field trials conducted during seabed mining collector testing, which had an operational duration of 100 h. For modeling simplicity and alignment with standard forecasting windows, we approximated this period as 96 h (4 days) in our analysis. The comparative analysis provides critical scientific support for real-time monitoring of mining-induced sediment plumes.
This study is structured as follows: Section 2 details the data and methods, including the implementation of four models, along with three evaluation metrics. Section 3 presents the results, showcasing predictions of zonal and meridional components up to 96 h from the four models, while rigorously evaluating their performance under variable-sized sliding windows, which determine how many past time points are used to predict the next value, across the two methodologies. Section 4 provides the concluding remarks.

2. Materials and Methods

2.1. Data

The ocean current data come from a moored current meter deployed by Beijing Pioneer Hi-Tech Development Corporation Ltd. (Beijing, China) in the M2 area of the Beijing Pioneer polymetallic nodule Exploration Area (BPEA) in the Western Pacific Ocean; the water depth at the deployment site is 5571 m (star marker in Figure 1). The data are hourly and underwent quality control via the 3σ rule prior to machine learning training. In this study, we use the current meter record from the near-seabed layer (5534 m depth), in which the maximum current velocity is 18.40 cm/s and the mean velocities of the east and north components are −1.20 cm/s and −0.06 cm/s, respectively. The current rose diagram (Figure 2a) reveals a dominant east–west bidirectional current pattern. The current transport vector diagram at the mooring position (Figure 2b) shows that, for most of the record, the zonal component of the current is dominant. Harmonic analysis of the mooring data identifies the diurnal K1 and semi-diurnal M2 constituents as the principal tides, with major-axis amplitudes of 1.351 cm/s and 0.647 cm/s, respectively. This indicates that the tidal signal is substantially weaker than the non-tidal current, a characteristic that makes accurate forecasting particularly challenging.
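The 3σ screening can be implemented as a simple pass over each velocity component; the sketch below is a minimal illustration, and the choice to fill flagged samples by linear interpolation, like the function name, is our assumption rather than the authors' documented procedure.

```python
import numpy as np
import pandas as pd

def three_sigma_qc(series: pd.Series) -> pd.Series:
    """Minimal 3-sigma quality control for an hourly current component (cm/s).

    Samples more than 3 standard deviations from the mean are flagged and
    filled by linear interpolation (the fill strategy is an assumption).
    """
    mu, sigma = series.mean(), series.std()
    outliers = (series - mu).abs() > 3 * sigma   # 3-sigma rule
    cleaned = series.where(~outliers)            # flagged samples become NaN
    return cleaned.interpolate(limit_direction="both")

# Illustrative use on synthetic hourly zonal velocities with one injected spike
u = pd.Series(np.random.normal(-1.2, 2.0, 1000))
u.iloc[100] = 50.0
u_clean = three_sigma_qc(u)
```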

2.2. Methods

Since mining activities predominantly commence during summer and autumn, this study selected data from July to October 2022 (comprising 2952 data points), dividing the dataset into training and testing sets at an approximate 8:2 ratio. This resulted in a training set of 2361 values and a testing set of 591 values. A predictive model was subsequently developed to forecast four consecutive days of current patterns (1–4 November 2022, totaling 96 h).
Table 1 presents the extreme and mean values of the zonal and meridional current components for both the training and test sets. For the zonal component, the training set exhibits a maximum speed of 14.952 cm/s and a mean speed of 3.032 cm/s, while the test set shows higher variability, with a maximum of 16.902 cm/s, a minimum of 0.026 cm/s, and a mean speed of 6.089 cm/s. The average velocity of the zonal component is positive (0.254 cm/s) in the training set but sharply negative (−6.081 cm/s) in the test set, indicating a reversal in the dominant current direction between the datasets. For the meridional component, the training set has a maximum speed of 10.542 cm/s and a mean speed of 2.161 cm/s, whereas the test set records a lower maximum (7.643 cm/s) and a nearly identical mean speed (2.155 cm/s). The vector-averaged velocities for this component are close to zero in both sets (0.073 cm/s and −0.683 cm/s), suggesting weaker overall meridional currents.
Four machine learning approaches—ARIMA, LSTM, XGBoost, and SVR—were employed to analyze temporal dependencies using sliding windows ranging from 6 h to 60 h (6 h, 12 h, 24 h, 36 h, 48 h, 60 h). The framework quantitatively evaluates how prediction accuracy evolves with varying data refresh intervals through systematic window-length comparisons.
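To make the sliding-window setup concrete, the sketch below turns an hourly velocity series into supervised (input, target) pairs for a given window length; the function name and the synthetic example series are our own illustrative assumptions.

```python
import numpy as np

def make_windows(series: np.ndarray, window: int, horizon: int = 1):
    """Build (X, y) pairs: each row of X holds `window` past hours and
    y holds the value(s) `horizon` steps ahead."""
    X, y = [], []
    for t in range(len(series) - window - horizon + 1):
        X.append(series[t:t + window])
        y.append(series[t + window:t + window + horizon])
    return np.asarray(X), np.asarray(y)

# e.g. a 48 h window predicting the next hour, on a synthetic diurnal signal
u = np.sin(np.arange(720) * 2 * np.pi / 24)
X, y = make_windows(u, window=48, horizon=1)
```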

2.2.1. Autoregressive Integrated Moving Average Model, ARIMA

Classical time series methods include the autoregressive (AR) and moving average (MA) models. The AR model captures the longer historical trend of the data and makes predictions from that trend, but it does not handle abrupt changes or noisy data well; the MA model handles abrupt or noisy data well, but it cannot capture the longer historical trend as the AR model does.
The ARIMA model combines the advantages of the AR and MA models with differencing, allowing it to handle more complex time series problems. As proposed by Box and Jenkins [30], it is usually denoted ARIMA(p, d, q), where p is the number of autoregressive terms, indicating how many past hourly observations are used to predict the current value; d is the order of differencing, indicating how many times the series is differenced; and q is the number of moving average terms, indicating how many past prediction errors are included in the prediction equation. ARIMA performs well in univariate time series forecasting and is effective in capturing trends and seasonal features in the data [30]. To obtain the best forecasting results in this study, the Akaike Information Criterion (AIC) [31] and a grid search are used to determine the values of (p, d, q).
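A minimal sketch of the AIC-guided grid search, assuming the statsmodels ARIMA implementation and illustrative search ranges for (p, d, q) (the ranges are our assumptions, not the paper's stated values):

```python
import itertools
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

def select_arima_order(train, p_max=5, d_max=2, q_max=5):
    """Grid-search (p, d, q) and keep the order with the lowest AIC."""
    best_aic, best_order = np.inf, None
    for p, d, q in itertools.product(range(p_max + 1), range(d_max + 1), range(q_max + 1)):
        try:
            res = ARIMA(train, order=(p, d, q)).fit()
        except Exception:
            continue                          # skip non-convergent combinations
        if res.aic < best_aic:
            best_aic, best_order = res.aic, (p, d, q)
    return best_order, best_aic

# order, _ = select_arima_order(u_train)
# forecast = ARIMA(u_train, order=order).fit().forecast(steps=96)
```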

2.2.2. Long Short-Term Memory Network, LSTM

LSTM was proposed by Hochreiter and Schmidhuber in 1997 to solve the problem of gradient vanishing or exploding in Recurrent Neural Networks (RNNs) when dealing with long sequences [32,33,34,35]. LSTM is able to effectively capture long-term dependencies in long sequences through the introduction of memory cells and gating structures [36,37]. LSTM memory cells employ a linear cyclic structure to maintain information stability over long distances, enabling memorization across extended time intervals. The gating structure consists of three components: an input gate regulates how much new information enters the cell state; a forget gate determines whether to retain or discard old information within the cell state; an output gate then generates the current output based on the updated cell state. Through collaborative operation, these gating mechanisms allow LSTM to selectively memorize and update its cell state, significantly enhancing the model’s ability to learn complex sequential patterns [38]. In current prediction, the LSTM model treats the current data as a time series and learns periodic features and long-term dependencies from historical currents [39]. Our LSTM architecture comprises three sequential LSTM layers followed by a fully connected (dense) layer. The first two layers are each configured with 150 units, while the third layer contains 128 units. To mitigate overfitting, a dropout rate of 0.2 is applied after each layer.
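The stacked architecture described above can be expressed, for example, in Keras as below; the layer sizes and dropout rate follow the text, while the optimizer, loss, and return-sequence settings are our assumptions.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_lstm(window: int, horizon: int = 1) -> tf.keras.Model:
    """Three stacked LSTM layers (150, 150, 128 units), dropout 0.2 after each,
    followed by a dense output layer, as described in the text."""
    model = models.Sequential([
        layers.Input(shape=(window, 1)),
        layers.LSTM(150, return_sequences=True),
        layers.Dropout(0.2),
        layers.LSTM(150, return_sequences=True),
        layers.Dropout(0.2),
        layers.LSTM(128),
        layers.Dropout(0.2),
        layers.Dense(horizon),
    ])
    model.compile(optimizer="adam", loss="mse")   # training setup is an assumption
    return model

# model = build_lstm(window=48)
# model.fit(X[..., None], y, epochs=100, batch_size=32, validation_split=0.1)
```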

2.2.3. eXtreme Gradient Boosting, XGBoost

XGBoost, developed by Chen et al. [41], is an enhanced version of the Gradient Boosting Decision Tree (GBDT) algorithm [40]. Its core principle is to build more effective decision trees that minimize the objective function. XGBoost converges toward strong solutions faster than many other boosting algorithms and is well suited to regression tasks. A key advantage of XGBoost is its use of regularization, which effectively prevents overfitting and enhances generalization capability. For regression tasks specifically, the regularization term helps smooth the final learned weights, so the model delivers strong performance on both training data and unseen data [41].
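For reference, a hedged sketch of an XGBoost regressor on sliding-window lag features; the hyperparameter values are illustrative placeholders, not the tuned settings used in this study.

```python
from xgboost import XGBRegressor

# Illustrative configuration; reg_lambda is the L2 regularization term
# discussed above, and all values here are assumptions.
xgb_model = XGBRegressor(
    n_estimators=300,
    max_depth=4,
    learning_rate=0.05,
    reg_lambda=1.0,
    objective="reg:squarederror",
)
# xgb_model.fit(X, y.ravel())          # X: lagged-window features, y: next-hour value
# y_pred = xgb_model.predict(X_test)
```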

2.2.4. Support Vector Regression, SVR

In 1995, Vapnik proposed Support Vector Machines (SVMs) for classification tasks, and SVR is an extension of SVMs to regression tasks [42]. SVR is robust to outliers, can handle high-dimensional data with many features, and can capture nonlinear relationships between input and output variables [43,44]. SVR learns the mapping by identifying a hyperplane that minimizes the error between input features and target variables while maximizing the margin between the hyperplane and the data. This hyperplane, or function, fits the data within a specified error tolerance ϵ, so that the majority of data points fall inside this tolerance band. The goal of SVR is thus to find a function for which the prediction error of the majority of data points does not exceed ϵ [45].
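A minimal SVR sketch under the same lag-feature setup, assuming an RBF kernel and illustrative hyperparameters; the epsilon parameter corresponds to the error-tolerance band described above.

```python
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Feature scaling plus epsilon-insensitive regression; values are assumptions.
svr_model = make_pipeline(
    StandardScaler(),
    SVR(kernel="rbf", C=10.0, epsilon=0.1),
)
# svr_model.fit(X, y.ravel())
# y_pred = svr_model.predict(X_test)
```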

2.2.5. Forecasting Methodology

This study utilizes time series current data to establish predictive models that extrapolate future trends from historical observations. After the models are trained on the historical data, a time series segment U = {u1, u2, …, un} (where n denotes the sliding window length) serves as the input for forecasting future values. As illustrated in the technical schematic (Figure 3), the rolling prediction mechanism operates recursively: the model first predicts un+1 from input U0, then updates the window to U1 = {u2, u3, …, un+1} for predicting un+2, and propagates sequentially through 96 prediction steps. However, this approach suffers from cumulative error propagation: each prediction cycle incorporates previous estimation errors, so forecast accuracy progressively degrades with increasing prediction horizon.
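A minimal sketch of this recursive rolling loop, assuming a fitted model that maps a flattened window of past values to a single next-hour prediction (the function name and reshape convention are ours):

```python
import numpy as np

def rolling_forecast(model, history: np.ndarray, steps: int = 96) -> np.ndarray:
    """Recursive rolling forecast: each prediction is appended to the window
    and reused as input for the next step, so errors accumulate over time."""
    window = history.copy()
    preds = []
    for _ in range(steps):
        yhat = float(np.ravel(model.predict(window.reshape(1, -1)))[0])
        preds.append(yhat)
        window = np.append(window[1:], yhat)   # slide the window forward by one hour
    return np.asarray(preds)
```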
To enhance the prediction accuracy, a data updating approach is adopted with varying updating intervals (1 h–6 h, with an interval of 1 h). The evaluation compares different update durations, employing distinct prediction strategies:
For the 1 h update, we implement a pure forecasting approach:
(1) Given the initial sequence U0 = {u1, u2, …, un}, predict un+1;
(2) For the subsequent prediction un+2: (i) replace un+1 with its observed value un+1 (true); (ii) remove the oldest entry u1; (iii) form the updated sequence U1 = {u2, …, un+1 (true)};
(3) Repeat iteratively to generate 96 predictions.
For multi-hour updates (2–6 h), we employ a block-update strategy (demonstrated here for 3 h updates; a code sketch follows this list):
(1) Using U0, predict three consecutive values (un+1, un+2, un+3);
(2) Update the sequence by (i) replacing the predictions with the observed values (un+1 (true), un+2 (true), un+3 (true)); (ii) removing the three oldest entries (u1, u2, u3); (iii) forming U1 = {u4, …, un+1 (true), …, un+3 (true)};
(3) Predict the next three values (un+4, un+5, un+6) using U1;
(4) Continue until 96 predictions are generated.
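A minimal sketch of the block-update loop described above, assuming a fitted model that returns `block` forecasts per call and an array `observed` holding the measurements that arrive during the 96 h verification period (all names are ours):

```python
import numpy as np

def block_update_forecast(model, history: np.ndarray, observed: np.ndarray,
                          block: int = 3, total: int = 96) -> np.ndarray:
    """Predict `block` hours at a time, then replace those predictions with
    the corresponding observed values before forecasting the next block."""
    window = history.copy()
    preds = []
    for start in range(0, total, block):
        yhat = np.ravel(model.predict(window.reshape(1, -1)))[:block]
        preds.extend(yhat)
        true_block = observed[start:start + block]      # measurements used for the update
        window = np.append(window[block:], true_block)  # slide the window by `block` hours
    return np.asarray(preds[:total])
```

Setting block = 1 recovers the 1 h pure-forecasting case, while block values of 2–6 correspond to the multi-hour update intervals evaluated in Section 3.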

2.2.6. Evaluation Metrics

In this paper, mean absolute error (MAE), root mean square error (RMSE), median absolute error (MedAE), and R-squared (R2) are used to evaluate each model.
MAE is the average of the absolute error between the predicted value and the true value, which measures the average degree of deviation between the predicted value and the observed value. A smaller MAE indicates better model prediction performance.
$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n} \left| y_i - \hat{y}_i \right|$
RMSE is the square root of the mean of the squared deviations between the predicted and true values [46]. It measures the magnitude of the prediction error; the smaller the RMSE, the closer the predicted values are to the true values.
$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2}$
MedAE is the median of the absolute error between the predicted value and the true value, which is insensitive to outliers and better reflects the centralized trend of the data. The smaller the MedAE, the higher the accuracy of the model prediction.
$\mathrm{MedAE} = \mathrm{median}\left( \left| y_1 - \hat{y}_1 \right|, \ldots, \left| y_n - \hat{y}_n \right| \right)$
R2, also known as the coefficient of determination, measures the proportion of variance in the dependent variable that is predictable from the independent variables. It provides an indication of the goodness of fit of a model, with values closer to 1 indicating that the model explains a greater portion of the variance. The formula for R2 is as follows:
$R^2 = 1 - \frac{\sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2}{\sum_{i=1}^{n} \left( y_i - \bar{y} \right)^2}$
where n denotes the number of data points, $y_i$ is the actual value, $\hat{y}_i$ is the predicted value, and $\bar{y}$ is the mean of the true values.
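These four metrics can be computed directly, for example with scikit-learn, as in the sketch below.

```python
import numpy as np
from sklearn.metrics import (mean_absolute_error, mean_squared_error,
                             median_absolute_error, r2_score)

def evaluate(y_true: np.ndarray, y_pred: np.ndarray) -> dict:
    """MAE, RMSE, MedAE (cm/s) and R2 for a 96 h forecast against observations."""
    return {
        "MAE": mean_absolute_error(y_true, y_pred),
        "RMSE": float(np.sqrt(mean_squared_error(y_true, y_pred))),
        "MedAE": median_absolute_error(y_true, y_pred),
        "R2": r2_score(y_true, y_pred),
    }
```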

3. Results

3.1. Rolling Forecasts

The trained models were applied to forecast near-seabed current data for the period 1–5 November 2022. For zonal currents, the predicted results of each model under different sliding windows are shown in Figure 4. Figure 4a reveals that the 6 h and 12 h sliding window predictions demonstrate limited accuracy and fail to capture the overall trend. Three models converge to near-constant baselines, while XGBoost exhibits large errors, including inverse trends, with persistent fluctuations. When the sliding window is increased to 24 h and above, the predictions of the four models are generally more consistent with the trend of the observed data, but remain insensitive to certain mutation points and extreme points. Analyzing the error scatter plots (Figure 4b) and evaluation metrics (Table 2) reveals that the LSTM and ARIMA models outperform XGBoost and SVR under the 48 h sliding window configuration. The 48 h window yields the best overall results across all tested window sizes (6 h–60 h), which is why it is presented in the rightmost column for comparison. Specifically, LSTM achieves an RMSE of 2.395 cm/s, while ARIMA reaches 2.383 cm/s—both significantly lower than XGBoost (3.337 cm/s) and SVR (3.631 cm/s). The optimal 48 h window size likely represents a balance between two extremes: smaller windows fail to capture long-term dependencies, while larger windows may introduce excessive noise.
Moreover, although ARIMA achieved a marginally lower RMSE (2.383 cm/s) than LSTM (2.395 cm/s), LSTM emerges as the optimal model under the 48 h sliding window configuration when evaluated across both RMSE and the proportion of predictions within ±20% of observed values. This conclusion is substantiated by LSTM’s superior predictive precision, where 42 out of 96 predictions (43.75%) fall within ±20% of observed values—decisively exceeding ARIMA’s 38 predictions (39.58%) and representing the highest proportion among all models. The rolling forecast results systematically underestimate observed values, with errors accumulating over time. While this approach captures overall trends well, it struggles to predict abrupt changes. This limitation leads to forecast deviations exceeding ±20% from observed values at change points.
For meridional currents (Figure 5), models exhibit limitations similar to those for zonal currents, particularly with short-term windows (6 h and 12 h), where they fail to capture key trends and produce overly linear predictions. While XGBoost predictions fluctuate and differ significantly from measured values, all four models capture the general trend when the window increases to 24 h or longer. However, they miss key mutation points and extreme values. Notably, predictions consistently underestimate meridional currents and sometimes show an opposite direction to the measurements. Quantitative evaluation metrics confirm improved model accuracy at 24 h sliding windows. Specifically, the LSTM model achieves the best performance at a 36 h window size (RMSE = 2.024 cm/s), surpassing ARIMA (2.131 cm/s), SVR (2.196 cm/s), and XGBoost (2.373 cm/s).

3.2. Updating Forecasts

To further improve adaptability, we employ an updating method that continuously integrates the latest measured data into the input sequence, refining each hourly prediction iteratively. This iterative forecasting method operates on an hourly cycle: each prediction step generates a real-time (1 h) forecast, then incorporates the newest measurement to update the input sequence before proceeding to the next prediction.
While XGBoost, LSTM, and ARIMA predictions for zonal currents (Figure 6a) track observed values well overall, their accuracy diminishes at extremes. Analyzing the evaluation indexes of each model (Table 3), the results show that the optimal model for a 1 h ahead forecast is the LSTM model with a sliding window of 48 h (Figure 6b), with an RMSE of 1.120 cm/s; 74 of its 96 predicted values fall within ±20% of the true values, more than 75% of the forecast length. Both the XGBoost model (64 predictions within ±20% of true values) and the ARIMA model (71 predictions within ±20% of true values) have over 66% of their predictions falling within the ±20% error margin. All four models show significantly improved zonal current prediction compared to rolling forecasts. As visualized in Figure 3 (Methodology and Evaluation), this performance enhancement stems from the 1 h ahead updating approach, which incorporates new observations at each step and thus prevents the error propagation inherent to rolling methods, which accumulate discrepancies over time.
Meridional current predictions generated by the updating method are presented in Figure 7. Predictions using the real-time (1 h) updating data method generally align with observed trends but exhibit phase lag. Additionally, all models show limited sensitivity to abrupt change points. Furthermore, Figure 7 reveals minimal performance variation across models under 1 h ahead forecast methods. Unlike the results from the zonal current rolling forecast method, all models achieve relatively low RMSE values at a 36 h sliding window, indicating a higher prediction accuracy at this configuration, with LSTM and SVR reaching an RMSE of 1.220 cm/s, while XGBoost and ARIMA achieve a slightly lower RMSE of 1.120 cm/s.

3.3. Data Updates Based on Lead Times

In deep-sea mining operations, lead time—the period between issuing a bottom current forecast and the actual occurrence of predicted hydrodynamic conditions—plays a pivotal role in ensuring operational safety and efficiency. Extending this lead time becomes particularly crucial when predicting deep-sea near-seabed currents that directly impact mining equipment and operations.
Figure 8 quantitatively demonstrates an inverse correlation between lead time and prediction accuracy, with model performance progressively deteriorating as the forecast window extends from 1 h to 6 h. Among the models, LSTM and ARIMA maintain comparatively better overall performance than XGBoost and SVR. Comprehensive analysis across all evaluation metrics identifies the 48 h sliding window as optimal. Figure 9 specifically demonstrates the LSTM model’s MAE and RMSE performance at varying lead times using this window configuration. Both MAE (blue bars) and RMSE (red line) exhibit a rising trend as the lead time increases, indicating that prediction errors grow with the duration of the forecast. This aligns with the typical behavior of time series models, where uncertainty accumulates over longer prediction horizons. This poses a challenge for seabed mining operations, where advance current forecasts are critical for safety and efficiency. Optimal performance occurs at a 1 h lead time (MAE = 0.903 cm/s, RMSE = 1.120 cm/s, R2 = 0.84). Notably, the model exhibits controlled error growth at extended lead times, with the 3 h forecast achieving an RMSE of 1.412 cm/s for zonal currents, corresponding to merely 15% of the observed velocity range (–10 to 0 cm/s). In particular, the 3 h lead time balances prediction precision with practical utility: while 1 h forecasts exhibit the lowest errors, they offer limited advance notice for seabed mining operations. In contrast, 3 h predictions extend the foresight period without incurring the steep error growth observed in longer horizons (4–6 h). This makes the 3 h interval a pragmatic choice, as it aligns with the operational need for extended current forecasts while keeping errors within tolerable margins. Thus, the results underscore that a 3 h lead time with a 48 h sliding window effectively balances prediction accuracy and advance warning. Specifically, for 3 h ahead predictions (Figure 10), over 70% of outputs from both LSTM (68/96) and ARIMA (69/96) models fall within ±20% of true values. This provides actionable insights for dynamic ocean current management in seabed mining contexts.
For meridional currents (Figure 11), all models (XGBoost, LSTM, ARIMA, SVR) exhibit a consistent decline in prediction accuracy as forecast duration increases. ARIMA demonstrates exceptional univariate time series modeling capacity [30], achieving optimal performance with a 36 h sliding window under a 2 h lead time (RMSE = 1.293 cm/s). Notably, ARIMA outperforms other models across all update intervals, highlighting its robustness in handling univariate temporal dependencies.
In contrast, while the LSTM model demonstrates superior performance in zonal current prediction, its effectiveness in meridional current forecasting remains limited. This discrepancy originates from the non-stationary characteristics of the meridional component, marked by absent periodic patterns and indiscernible trends. Such characteristics pose fundamental challenges for LSTM networks in capturing intrinsic spatiotemporal dependencies, ultimately leading to elevated prediction errors [21].

4. Discussion

This study employs four machine learning approaches—XGBoost, LSTM, ARIMA, and SVR—to analyze deep-sea near-seabed current measurements obtained from a 5534 m deep mooring current meter deployed in the M2 sector of the Beijing Pioneer polymetallic nodule Exploration Area (BPEA). Through a combination of rolling prediction methodology and time-varied data refresh intervals, the models demonstrate capability in generating 96 h current velocity forecasts while maintaining operational feasibility under extreme deep-sea conditions. The results are as follows:
(1)
Under the rolling forecast framework, LSTM achieves optimal zonal current prediction with 48 h windows. For meridional currents, the models capture broad trends at windows of 24 h or longer but exhibit intermittent current reversals relative to the observations.
(2)
LSTM and ARIMA models demonstrate superior predictive performance in near-seabed current forecasting compared to other approaches, highlighting LSTM’s capacity to capture long-term dependencies and process complex nonlinear patterns, while leveraging ARIMA’s parametric advantages in modeling univariate time series.
(3)
The 3 h ahead forecast optimally balances operational needs and error control, serving as the ideal solution for dynamic current management.
The complex nature of the local deep-sea current regime underscores the practical value of our 3–6 h updating forecast method for deep-sea mining in the BPEA. Practical implementation of timed data updates faces challenges such as device communication limits and onboard computational constraints. In future work, we will analyze the data characteristics in greater depth and identify the factors affecting the near-seabed current in order to extend the prediction horizon and improve accuracy.

Author Contributions

Conceptualization, H.B., D.X., Z.Y., J.W., N.L. and Y.P.; methodology, H.B., D.X. and Z.Y.; software, H.B.; validation, H.B., D.X., Z.Y., J.W. and C.Y.; formal analysis, H.B.; investigation, H.B.; resources, D.X.; data curation, C.Y., J.W. and N.L.; writing—original draft preparation, H.B.; writing—review and editing, H.B., Z.Y. and D.X.; visualization, H.B.; supervision, D.X.; project administration, D.X.; funding acquisition, D.X. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the National Key R&D Program of China (Grant Nos. 2022YFC2803901, 2022YFC2806603, and 2022YFC2803905) and the Research Fund of Zhejiang Province (Grant No. 330000210130313013006).

Data Availability Statement

Data will be made available on request.

Conflicts of Interest

Author Yuntian Pang was employed by the company Beijing Pioneer Hi-Tech Development Corporation. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:
BPEA: Beijing Pioneer polymetallic nodule Exploration Area
AIC: Akaike Information Criterion
ANN: Artificial Neural Network
AR: Autoregressive
ARIMA: Autoregressive Integrated Moving Average Model
GBDT: Gradient Boosting Decision Tree
GEBCO: General Bathymetric Chart of the Oceans
LSTM: Long Short-Term Memory Network
MA: Moving average
MAE: Mean absolute error
MedAE: Median absolute error
RMSE: Root mean squared error
R2: R-squared
SSC-net: The Network for the Prediction of Sea Surface Currents
SVM: Support Vector Machines
SVR: Support Vector Regression
XGBoost: eXtreme Gradient Boosting

References

  1. Hein, J.R.; Mizell, K.; Koschinsky, A.; Conrad, T.A. Deep-Ocean Mineral Deposits as a Source of Critical Metals for High- and Green-Technology Applications: Comparison with Land-Based Resources. Ore Geol. Rev. 2013, 51, 1–14. [Google Scholar] [CrossRef]
  2. Ouillon, R.; Muñoz-Royo, C.; Alford, M.H.; Peacock, T. Advection-Diffusion-Settling of Deep-Sea Mining Sediment Plumes. Part 1: Midwater Plumes. Flow 2022, 2, E22. [Google Scholar] [CrossRef]
  3. Ouillon, R.; Muñoz-Royo, C.; Alford, M.H.; Peacock, T. Advection–Diffusion Settling of Deep-Sea Mining Sediment Plumes. Part 2. Collector Plumes. Flow 2022, 2, E23. [Google Scholar] [CrossRef]
  4. Washburn, T.W.; Turner, P.J.; Durden, J.M.; Jones, D.O.B.; Weaver, P.; Van Dover, C.L. Ecological Risk Assessment for Deep-Sea Mining. Ocean Coast. Manag. 2019, 176, 24–39. [Google Scholar] [CrossRef]
  5. Aleynik, D.; Inall, M.E.; Dale, A.; Vink, A. Impact of Remotely Generated Eddies on Plume Dispersion at Abyssal Mining Sites in the Pacific. Sci. Rep. 2017, 7, 16959. [Google Scholar] [CrossRef]
  6. Gillard, B.; Purkiani, K.; Chatzievangelou, D.; Vink, A.; Iversen, M.H.; Thomsen, L. Physical and Hydrodynamic Properties of Deep Sea Mining-Generated, Abyssal Sediment Plumes in the Clarion Clipperton Fracture Zone (Eastern-Central Pacific). Elem. Sci. Anthr. 2019, 7, 5. [Google Scholar] [CrossRef]
  7. Fuchida, S.; Yokoyama, A.; Fukuchi, R.; Ishibashi, J.; Kawagucci, S.; Kawachi, M.; Koshikawa, H. Leaching of Metals and Metalloids from Hydrothermal Ore Particulates and Their Effects on Marine Phytoplankton. ACS Omega 2017, 2, 3175–3182. [Google Scholar] [CrossRef]
  8. Hu, V.J.H. Ingestion of Deep-Sea Mining Discharge by Five Species of Tropical Copepods. Water Air Soil Pollut. 1981, 15, 433–440. [Google Scholar] [CrossRef]
  9. Jones, D.O.B.; Kaiser, S.; Sweetman, A.K.; Smith, C.R.; Menot, L.; Vink, A.; Trueblood, D.; Greinert, J.; Billett, D.S.M.; Arbizu, P.M.; et al. Biological Responses to Disturbance from Simulated Deep-Sea Polymetallic Nodule Mining. PLoS ONE 2017, 12, e0171750. [Google Scholar] [CrossRef]
  10. Robison, B.H. Conservation of Deep Pelagic Biodiversity. Conserv. Biol. 2009, 23, 847–858. [Google Scholar] [CrossRef]
  11. Wilber, D.H.; Clarke, D.G. Biological Effects of Suspended Sediments: A Review of Suspended Sediment Impacts on Fish and Shellfish with Relation to Dredging Activities in Estuaries. N. Am. J. Fish. Manag. 2001, 21, 855–875. [Google Scholar] [CrossRef]
  12. Ouillon, R.; Kakoutas, C.; Meiburg, E.; Peacock, T. Gravity Currents from Moving Sources. J. Fluid Mech. 2021, 924, A43. [Google Scholar] [CrossRef]
  13. Muñoz-Royo, C.; Ouillon, R.; El Mousadik, S.; Alford, M.H.; Peacock, T. An in Situ Study of Abyssal Turbidity-Current Sediment Plumes Generated by a Deep Seabed Polymetallic Nodule Mining Preprototype Collector Vehicle. Sci. Adv. 2022, 8, eabn1219. [Google Scholar] [CrossRef]
  14. Vendettuoli, D.; Clare, M.A.; Sumner, E.J.; Cartigny, M.J.B.; Talling, P.; Wood, J.; Bailey, L.; Azpiroz-Zabala, M.; Paull, C.K.; Gwiazda, R.; et al. Global Monitoring Data Shows Grain Size Controls Turbidity Current Structure. Earth Space Sci. Open Res. Arch. 2020. [Google Scholar]
  15. Weaver, P.P.E.; Aguzzi, J.; Boschen-Rose, R.E.; Colaço, A.; De Stigter, H.; Gollner, S.; Haeckel, M.; Hauton, C.; Helmons, R.; Jones, D.O.B.; et al. Assessing Plume Impacts Caused by Polymetallic Nodule Mining Vehicles. Mar. Policy 2022, 139, 105011. [Google Scholar] [CrossRef]
  16. Lee, W.Y.; Li, A.C.Y.; Lee, J.H.W. Structure of a Horizontal Sediment-Laden Momentum Jet. J. Hydraul. Eng. 2013, 139, 124–140. [Google Scholar] [CrossRef]
  17. Shchepetkin, A.F.; McWilliams, J.C. The Regional Oceanic Modeling System (ROMS): A Split-Explicit, Free-Surface, Topography-Following-Coordinate Oceanic Model. Ocean Model. 2005, 9, 347–404. [Google Scholar] [CrossRef]
  18. Farrara, J.D.; Chao, Y.; Li, Z.; Wang, X.; Jin, X.; Zhang, H.; Li, P.; Vu, Q.; Olsson, P.Q.; Schoch, G.C.; et al. A Data-Assimilative Ocean Forecasting System for the Prince William Sound and an Evaluation of Its Performance during Sound Predictions 2009. Cont. Shelf Res. 2013, 63, S193–S208. [Google Scholar] [CrossRef]
  19. Bolaños, R.; Tornfeldt Sørensen, J.V.; Benetazzo, A.; Carniel, S.; Sclavo, M. Modelling Ocean Currents in the Northern Adriatic Sea. Cont. Shelf Res. 2014, 87, 54–72. [Google Scholar] [CrossRef]
  20. Thongniran, N.; Vateekul, P.; Jitkajornwanich, K.; Lawawirojwong, S.; Srestasathiern, P. Spatio-Temporal Deep Learning for Ocean Current Prediction Based on HF Radar Data. In Proceedings of the 2019 16th International Joint Conference on Computer Science and Software Engineering (JCSSE), Chonburi, Thailand, 10–12 July 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 254–259. [Google Scholar]
  21. Bayındır, C. Predicting the ocean currents using deep learning. TWMS J. Appl. Eng. Math. 2023, 13, 373–385. [Google Scholar]
  22. Dauji, S.; Deo, M.C.; Bhargava, K. Prediction of Ocean Currents with Artificial Neural Networks. ISH J. Hydraul. Eng. 2015, 21, 14–27. [Google Scholar] [CrossRef]
  23. Immas, A.; Do, N.; Alam, M.-R. Real-Time in Situ Prediction of Ocean Currents. Ocean Eng. 2021, 228, 108922. [Google Scholar] [CrossRef]
  24. Kusnanti, E.A.; Novitasari, D.C.R.; Setiawan, F.; Fanani, A.; Hafiyusholeh, M.; Permata Sari, G.I. Predicting Velocity and Direction of Ocean Surface Currents Using Elman Recurrent Neural Network Method. J. Inf. Syst. Eng. Bus. Intell. 2022, 8, 21–30. [Google Scholar] [CrossRef]
  25. Chae, J.; Jin, H.; Chang, I.; Kim, Y.H.; Park, Y.; Kim, Y.T.; Kang, B.; Kim, M.; Ju, H.; Park, J. Prediction of Sea Surface Current Around the Korean Peninsula Using Artificial Neural Networks. J. Geophys. Res. Mach. Learn. Comput. 2024, 1, e2024JH000168. [Google Scholar] [CrossRef]
  26. Ren, L.; Hu, Z.; Hartnett, M. Short-Term Forecasting of Coastal Surface Currents Using High Frequency Radar Data and Artificial Neural Networks. Remote Sens. 2018, 10, 850. [Google Scholar] [CrossRef]
  27. Saha, D.; Deo, M.C.; Joseph, S.; Bhargava, K. A Combined Numerical and Neural Technique for Short Term Prediction of Ocean Currents in the Indian Ocean. Env. Syst. Res. 2016, 5, 4. [Google Scholar] [CrossRef]
  28. Rozier, D.; Birol, F.; Cosme, E.; Brasseur, P.; Brankart, J.M.; Verron, J. A Reduced-Order Kalman Filter for Data Assimilation in Physical Oceanography. SIAM Rev. 2007, 49, 449–465. [Google Scholar] [CrossRef]
  29. GEBCO Compilation Group. GEBCO 2024 Grid. 2024. Available online: https://www.gebco.net/data-products-gridded-bathymetry-data/gebco2024-grid (accessed on 5 June 2025).
  30. Newbold, P. ARIMA Model Building and the Time Series Analysis Approach to Forecasting. J. Forecast. 1983, 2, 23–35. [Google Scholar] [CrossRef]
  31. Kisi, O.; Guven, A. A Machine Code-Based Genetic Programming for Suspended Sediment Concentration Estimation. Adv. Eng. Softw. 2010, 41, 939–945. [Google Scholar] [CrossRef]
  32. Sainath, T.N.; Vinyals, O.; Senior, A.; Sak, H. Convolutional, Long Short-Term Memory, Fully Connected Deep Neural Networks. In Proceedings of the 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Brisbane, QLD, Australia, 19–24 April 2015; IEEE: Piscataway, NJ, USA, 2015; pp. 4580–4584. [Google Scholar]
  33. Rumelhart, D.E.; Hintont, G.E.; Williams, R.J. Learning Representations by BackPropagating Errors. Nature 1986, 323, 533–536. [Google Scholar] [CrossRef]
  34. Hochreiter, S.; Schmidhuber, J. Long Short-Term Memory. Neural Comput. 1997, 9, 1735–1780. [Google Scholar] [CrossRef]
  35. Staudemeyer, R.C.; Morris, E.R. Understanding LSTM—A Tutorial into Long Short-Term Memory Recurrent Neural Networkss. arXiv 2019, arXiv:1909.09586. [Google Scholar]
  36. Kirbas, I.; Kerem, A. Short-Term Wind Speed Prediction Based on Artificial Neural Network Models. Meas. Control 2016, 49, 183–190. [Google Scholar] [CrossRef]
  37. Kerem, A.; Kirbas, I.; Saygin, A. Performance Analysis of Time Series Forecasting Models for Short Term Wind Speed Prediction. In Proceedings of the International Conference on Engineering and Natural Sciences (ICENS), Sarajevo, Bosnia and Herzegovina, 24–28 May 2016; pp. 2733–2739. [Google Scholar]
  38. Yu, Y.; Si, X.; Hu, C.; Zhang, J. A Review of Recurrent Neural Networks: LSTM Cells and Network Architectures. Neural Comput. 2019, 31, 1235–1270. [Google Scholar] [CrossRef]
  39. Qian, P.; Feng, B.; Liu, X.; Zhang, D.; Yang, J.; Ying, Y.; Liu, C.; Si, Y. Tidal Current Prediction Based on a Hybrid Machine Learning Method. Ocean Eng. 2022, 260, 111985. [Google Scholar] [CrossRef]
  40. Friedman, J.H. Greedy Function Approximation: A Gradient Boosting Machine. Ann. Stat. 2001, 29, 1189–1232. [Google Scholar] [CrossRef]
  41. Chen, T.; Guestrin, C. XGBoost: A Scalable Tree Boosting System. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; ACM: San Francisco, CA, USA, 2016; pp. 785–794. [Google Scholar]
  42. Vapnik, V.N. The Nature of Statistical Learning Theory; Springer: New York, NY, USA, 1995; ISBN 978-1-4757-2442-4. [Google Scholar]
  43. Kolukula, S.S.; Pln, M. Enhancing Observations Data: A Machine-Learning Approach to Fill Gaps in the Moored Buoy Data. Results Eng. 2025, 26, 104708. [Google Scholar] [CrossRef]
  44. Treiber, N.A.; Heinermann, J.; Kramer, O. Wind Power Prediction with Machine Learning. In Computational Sustainability; Lässig, J., Kersting, K., Morik, K., Eds.; Studies in Computational Intelligence; Springer International Publishing: Cham, Switzerland, 2016; Volume 645, pp. 13–29. ISBN 978-3-319-31856-1. [Google Scholar]
  45. Smola, A.J.; Schölkopf, B. A Tutorial on Support Vector Regression. Stat. Comput. 2004, 14, 199–222. [Google Scholar] [CrossRef]
  46. Ghorbani, M.A.; Khatibi, R.; FazeliFard, M.H.; Naghipour, L.; Makarynskyy, O. Short-Term Wind Speed Predictions with Machine Learning Techniques. Meteorol. Atmos. Phys. 2016, 128, 57–72. [Google Scholar] [CrossRef]
Figure 1. (a) Location map of the M2 polymetallic nodule mining area (white polygon) and mooring station (red pentagram). (b) Bathymetric chart near the mooring location (red pentagram) within the region 18.6–20°N, 153–154.4°E. Bathymetric data were obtained from the General Bathymetric Chart of the Oceans (GEBCO, 2024) [29].
Figure 2. Ocean current characteristics: (a) Current rose diagram showing velocity distribution (blue: 0–6 cm/s; purple: 7–12 cm/s; pink: >12 cm/s); (b) Displacement vectors at mooring position (origin: 0,0) from 25 October 2021–19 November 2022, with monthly color coding (black: October 2021–June 2022; red: July 2022; yellow: August 2022; purple: September 2022; green: October 2022; blue: November 2022).
Figure 3. Technology Roadmap. The observed zonal and meridional component data were split into training and test sets at an approximate 8:2 ratio and utilized for training four models: LSTM, ARIMA, XGBoost, and SVR, with all models employing grid search for parameter optimization. Two prediction methodologies were implemented: rolling prediction and data updating prediction (taking 3 h data updates as an example). In subsequent visualizations, purple denotes original observed data, pink represents model-predicted data, and yellow indicates actual measurements used for data updates.
Figure 4. (a) The zonal current predictions from each model were generated using a rolling forecast method with sliding windows of 6 h, 12 h, 24 h, 36 h, and 60 h. In each panel, observed values are shown in red, while model predictions are represented as follows: XGBoost (blue), LSTM (green), ARIMA (yellow), and SVR (purple). (b) Prediction results and evaluation metrics for each model, evaluated with a 48 h sliding window. The scatter plot identifies results within ±20% of observations (red) and outliers beyond this range (blue).
Figure 5. Meridional current predictions (left) and evaluation metrics (right) generated through rolling forecasts with sliding windows (6 h–60 h). Observed values are shown in red, while model predictions are color-coded: XGBoost (blue), LSTM (green), ARIMA (yellow), and SVR (purple). Performance metrics are represented by blue bars (MAE) and a red line (RMSE) across all models.
Figure 6. Same format as Figure 4 but using the updating method.
Figure 7. Same format as Figure 5 but using the updating method.
Figure 8. Evaluation metrics for zonal current predictions by each model under varying sliding window durations (6 h, 12 h, 24 h, 36 h, 48 h, 60 h) using data updating methods (XGBoost: blue; LSTM: green; ARIMA: orange; SVR: purple) (unit: cm/s).
Figure 9. Evaluation metrics for zonal current predictions by the LSTM model with a sliding window of 48 h under different lead times (unit: cm/s). The blue bars and red line indicate MAE and RMSE, respectively.
Figure 10. The R2 values and the count of predictions within ±20% of the true values for the ARIMA and LSTM models in 1–3 h ahead forecasts. Blue represents the LSTM model, and green represents the ARIMA model. The bars denote R2 values, while the dots indicate the count of predictions falling within ±20% of the true values. In each subplot, the model labeled with a numerical value is the one with the R2 closest to 1 at the current update time, and the model with the highest count of predictions within the ±20% range is marked in parentheses above the corresponding dot.
Figure 11. Evaluation metrics for meridional current predictions by each model under varying sliding window durations (6 h, 12 h, 24 h, 36 h, 48 h, 60 h) using data updating methods (XGBoost: blue; LSTM: green; ARIMA: orange; SVR: purple) (unit: cm/s).
Table 1. Summary statistics of the zonal and meridional current components in the training and test sets (unit: cm/s).
Component | Dataset | Maximum | Minimum | Average Velocity 1 | Average Speed 2
Zonal component | Training set | 14.952 | 0.001 | 0.254 | 3.032
Zonal component | Test set | 16.902 | 0.026 | −6.081 | 6.089
Meridional component | Training set | 10.542 | 0.000 | 0.073 | 2.161
Meridional component | Test set | 7.643 | 0.020 | −0.683 | 2.155
1 Average velocity: The displacement divided by the time taken (vector quantity). 2 Average speed: The total distance traveled divided by the time taken (scalar quantity).
Table 2. Evaluation metrics for zonal current predictions using rolling forecast methods across models (unit: cm/s).
Model | Sliding Window | MAE | RMSE | MedAE | Error ≤ 20% Count (Total: 96)
XGBoost | 6 h | 4.062 | 4.710 | 2.018 | 8
XGBoost | 12 h | 3.760 | 4.388 | 2.360 | 10
XGBoost | 24 h | 2.377 | 2.941 | 1.740 | 32
XGBoost | 36 h | 2.449 | 2.983 | 2.145 | 25
XGBoost | 48 h | 2.767 | 3.337 | 2.471 | 26
XGBoost | 60 h | 2.723 | 3.280 | 2.502 | 25
LSTM | 6 h | 2.492 | 3.165 | 2.224 | 37
LSTM | 12 h | 2.284 | 2.967 | 1.733 | 35
LSTM | 24 h | 2.087 | 2.756 | 1.615 | 40
LSTM | 36 h | 2.566 | 3.003 | 1.506 | 17
LSTM | 48 h | 1.925 | 2.395 | 1.259 | 42
LSTM | 60 h | 2.586 | 3.557 | 1.389 | 37
ARIMA | 6 h | 4.503 | 5.008 | 1.816 | 3
ARIMA | 12 h | 3.378 | 3.963 | 1.943 | 11
ARIMA | 24 h | 2.092 | 2.562 | 1.181 | 31
ARIMA | 36 h | 2.061 | 2.535 | 1.322 | 34
ARIMA | 48 h | 1.880 | 2.383 | 1.342 | 38
ARIMA | 60 h | 1.860 | 2.346 | 1.416 | 39
SVR | 6 h | 5.001 | 5.540 | 1.938 | 2
SVR | 12 h | 4.880 | 5.426 | 2.061 | 3
SVR | 24 h | 3.489 | 3.918 | 1.312 | 9
SVR | 36 h | 3.197 | 3.716 | 1.448 | 11
SVR | 48 h | 3.188 | 3.631 | 1.371 | 9
SVR | 60 h | 3.055 | 3.520 | 1.382 | 9
Table 3. Evaluation metrics for 1 h ahead zonal current forecasts across models (unit: cm/s).
Model | Sliding Window | MAE | RMSE | MedAE | Error ≤ 20% Count (Total: 96)
XGBoost | 6 h | 1.248 | 1.565 | 0.998 | 61
XGBoost | 12 h | 1.184 | 1.477 | 0.897 | 57
XGBoost | 24 h | 1.131 | 1.385 | 1.059 | 57
XGBoost | 36 h | 1.089 | 1.326 | 0.943 | 62
XGBoost | 48 h | 1.092 | 1.318 | 1.018 | 64
XGBoost | 60 h | 1.051 | 1.291 | 0.966 | 65
LSTM | 6 h | 1.083 | 1.325 | 1.035 | 60
LSTM | 12 h | 1.071 | 1.303 | 0.945 | 67
LSTM | 24 h | 0.980 | 1.188 | 0.992 | 69
LSTM | 36 h | 0.972 | 1.190 | 0.841 | 68
LSTM | 48 h | 0.903 | 1.120 | 0.899 | 74
LSTM | 60 h | 0.926 | 1.182 | 0.888 | 73
ARIMA | 6 h | 1.179 | 1.477 | 0.944 | 63
ARIMA | 12 h | 1.105 | 1.355 | 1.076 | 64
ARIMA | 24 h | 0.999 | 1.228 | 0.878 | 68
ARIMA | 36 h | 0.940 | 1.168 | 0.899 | 69
ARIMA | 48 h | 0.943 | 1.176 | 0.915 | 71
ARIMA | 60 h | 0.949 | 1.182 | 0.896 | 68
SVR | 6 h | 1.475 | 1.769 | 0.897 | 41
SVR | 12 h | 1.495 | 1.787 | 0.820 | 40
SVR | 24 h | 1.551 | 1.833 | 0.803 | 38
SVR | 36 h | 1.732 | 2.039 | 0.918 | 36
SVR | 48 h | 1.677 | 1.965 | 0.919 | 34
SVR | 60 h | 1.615 | 1.909 | 0.907 | 39
