Article

Neural Network-Based Ship Power Load Forecasting

1 College of Mechanical and Marine Engineering, Beibu Gulf University, Qinzhou 535011, China
2 Guangxi Key Laboratory of Marine Engineering Equipment and Technology, Qinzhou 535011, China
3 College of Computer Science, Shandong Xiehe University, Jinan 250109, China
4 College of Electronic and Information Engineering, Beibu Gulf University, Qinzhou 535011, China
* Author to whom correspondence should be addressed.
J. Mar. Sci. Eng. 2025, 13(9), 1766; https://doi.org/10.3390/jmse13091766
Submission received: 1 August 2025 / Revised: 5 September 2025 / Accepted: 11 September 2025 / Published: 12 September 2025
(This article belongs to the Section Ocean Engineering)

Abstract

This study combines an experimental semi-physical simulation model of an electric propulsion tugboat with four different neural networks to create a real-time simulation model for forecasting total power loads with small samples. The results of repeated experiments demonstrate that the BP neural network effectively forecasts the power load. Subsequently, addressing the limitations of traditional BP neural networks, an optimization approach employing an enhanced particle swarm algorithm and attention mechanism was developed, thereby improving the model’s prediction accuracy and robustness. The experiment shows that the improved prediction model achieves an R2 value of 97.42%, demonstrating its effectiveness in forecasting changes in the short-term power load of ships as parameters change. In actual operation, ships can allocate power reasonably and in a timely manner according to the load forecast results, thereby improving the efficiency of the power grid.

1. Introduction

Ship load forecasting is one of the core processes in ship power system design and optimization. The goal is to establish a high-precision load forecasting model through comprehensive analysis of historical data on ship power demand, operating conditions, and environmental factors, thereby ensuring the stability, economy, and safety of ship power systems [1]. As the intelligent and green transformation of ships accelerates, power load forecasting is becoming increasingly important. On the one hand, intelligent ships place higher demands on the dynamic response capabilities of power systems, requiring accurate forecasting to achieve optimal energy dispatch. On the other hand, the International Maritime Organization (IMO) has imposed strict restrictions on carbon emissions, prompting the development of clean energy and hybrid power systems for ships, with load forecasting becoming a key link in system planning and operation [2]. In response to the problems in ship power load forecasting, researchers have proposed a variety of forecasting methods, which can be divided into two main categories: traditional and modern methods. Traditional forecasting methods include time-series methods, regression analysis methods, grey model methods, etc.; modern forecasting methods mainly include neural networks and deep learning methods [3,4].
Time-series analysis: Time-series feature modeling based on historical load data, including autoregressive integrated moving average (ARIMA) and seasonal decomposition (STL), is suitable for scenarios with obvious periodicity and trends, and it performs well under stable ship load conditions. Fu Cifu et al. analyzed seawater intrusion events along the Bohai and Yellow Sea coasts through numerical simulation, providing environmental correlation data support for short-term load fluctuation forecasting [5].
Regression analysis method: Forecasts are made by establishing linear or nonlinear relationships between the load and influencing factors. This method is highly explanatory and is suitable for analyzing the correlation between the load capacity and power load. Common methods are linear regression (LR) and least-squares support vector machine (LSSVM). Li et al. used principal component regression to extract several key factors that influence load forecast results and derived the analytical form of the model [6]. Dhaval et al. used multiple linear regression (MLR) to make short-term load forecasts, using MLR to forecast power loads one day in advance and calculating regression coefficients using the least-squares estimation method. The model achieved an accuracy rate of 95% [7]. He Runfeng et al. proposed an improved SVM method for power load forecasting, which comprehensively considers the meteorological factors that affect the effectiveness of power load forecasting. It uses a multiple linear regression model to fit the effects between various working conditions and then uses the vulture search algorithm to optimize the parameters in the support vector machine, thereby improving the effectiveness of the forecast [8].
Grey model method: Based on the grey system theory of small samples and poor information, the randomness of the generated sequence is weakened. The typical model is GM(1,1). This method is suitable for scenarios with small data volumes and high volatility. It performs well in ship repair cost forecasting, but in ship power load forecasting, data pre-processing is required to improve robustness. Wei Mingkui et al. proposed a fractional-order grey forecast model optimized using BFGS-FA for load forecasting. Through optimization, they obtained the optimal order of the fractional-order grey forecast model [9].
Neural network model: The neural network simulates nonlinear relationships through multiple layers of neurons and has strong generalization capabilities. Guo Cheng et al. combined LSTM and neural network models with attention mechanisms to forecast power loads. This method achieved good forecasting results but had the disadvantage of long training times [10]. Tao Juan et al. used a neural network method consisting of a set of trained artificial neural networks iteratively combined to perform short-term load forecasting. By comparing the forecasting performance with BNN, ARMA, Hyb ANN, and SI-WNN, they demonstrated that the results obtained using this method had smaller errors and fluctuations [11]. Zuleta-Elles I et al. conducted substantive experiments using artificial neural networks and ARIMA models based on real microgrid data, and the results proved that neural network models outperform ARIMA models in specific situations [12].
Deep learning methods: Deep learning methods utilize deep neural networks, such as long short-term memory (LSTM) networks, convolutional neural networks (CNNs), and recurrent neural networks (RNNs), to capture long-term dependencies in time-series data or combine attention mechanisms to optimize feature extraction. These methods demonstrate exceptional capabilities in analyzing and processing large, multi-dimensional datasets. Kwon BS proposed an LSTM-based mid-term load forecasting algorithm that significantly improves forecast accuracy by integrating weather data with photovoltaic capacity estimates [13]. Zhang Yu and others designed and implemented a CNN-LSTM time-series forecast method based on the attention mechanism, which solved the problem of LSTM’s inability to obtain multiple feature space connections. They then proposed adding an attention mechanism after the LSTM layer to increase the influence of important time steps in LSTM and further reduce multi-step forecast errors [14,15,16]. In the specific context of short-term load forecasting, Lin Han et al. proposed a short-term load prediction model based on TCA-CNN-LSTM. The integration of a convolutional neural network (CNN) with a long short-term memory (LSTM) network has been demonstrated to enhance prediction accuracy to a significant degree. The model effectively captures long-term dependencies and spatial characteristics within time-series data, rendering it suitable for forecasting tasks involving complex datasets [17]. Gu Yien et al. investigated methods for short-term power load forecasting in vessels under adverse sea conditions [18]. By optimizing the architecture of the model and its parameters, high-precision forecasts were achieved in complex environments, thus providing crucial support for the stable operation of shipboard electrical systems. Significant advances have also been made in the field of non-intrusive load monitoring (NILM). 
Schirmer and Mporas provided a comprehensive review of NILM techniques, analyzing the strengths and challenges of current approaches [19]. Ramadan et al. explored the potential of NILM and the Internet of Things (IoT) in the context of residential microgrid energy management [20]. The authors demonstrated the potential applications of NILM within smart grids, highlighting its role in enhancing energy management and efficiency. In their comparative analysis of machine learning techniques within NILM, Shabir et al. highlighted performance differences among algorithms in load decomposition and energy management. This analysis thus provides a direction for future research [21].
Despite this progress, accurate power load forecasting remains challenging. First, limited dataset sizes constrain model generalization, particularly in small-sample scenarios. Second, high computational complexity and cost hinder widespread practical deployment. Furthermore, the impact of environmental factors (such as sea conditions and wind speed) on shipboard power loads warrants further investigation. Future research should address these issues by refining model architectures, optimizing algorithms, and expanding datasets, thereby improving the accuracy and reliability of power load forecasts. Methods based on CNNs and LSTM networks, which rely on extensive data and incur long computation times, are not suitable for the short-term power load forecasting problem considered in this study, which involves small data volumes. The present study therefore addresses the characteristics of limited data and restricted input parameters by testing multiple neural network algorithms and selecting a prediction model with a relatively simple structure that is straightforward to implement and offers the highest stability. The model is then refined through the integration of particle swarm optimization and the attention mechanism, with the objective of achieving greater precision in predicting the power load of the ship's electric propulsion system.
Load forecasting enables the accurate estimation of the power requirements of tugboats under varying operational conditions, thus facilitating the rational allocation of power supply equipment in terms of both quantity and capacity. It is vital to ensure that the operational number and power output of the supply equipment are scheduled in accordance with the forecast results. This will ensure that the equipment operates consistently within the high-efficiency working range. This, in turn, will reduce fuel consumption and energy wastage. The objective of this study is to establish a highly versatile power load forecast model that is capable of maintaining a certain degree of forecast accuracy even when the sample size is insufficient, and that can be applied to other forecasting scenarios.

2. Classification of Load Forecasting

According to different time classifications, power load forecasts can be divided into short-term power load forecasts and medium- to long-term power load forecasts. Short-term power load forecasts for ships are usually made on a daily, hourly, or even shorter time basis to predict the power loads during that period, with a greater emphasis on timeliness than in medium- to long-term load forecasts. The differences between different types of load forecasts are shown in Table 1.
Given that tugboats near ports are characterized by short-term and high-efficiency work, short-term load forecasts for electric propulsion tugboats were selected for this study. Based on the laboratory’s complete electric propulsion system simulation model for tugboats, the generated power load data is forecast, and the simulation model is optimized to provide a reference for actual ship navigation. The following will determine the appropriate forecasting method based on the operating status of the tugboat.

3. Load Forecasting Methods

Unlike specialized vessels such as cargo ships and offshore drilling vessels, tugboats are subject to several common operating conditions during a single operating cycle, including mooring, entering and leaving port, sailing, and towing operations. In this study, only the above four typical operating conditions are considered when performing statistical analysis on the forecast samples. After determining the operating conditions required for simulation, it is also necessary to select an appropriate calculation method in order to obtain more accurate power load forecast values. In view of the nonlinear factors of ships’ electric propulsion loads, neural networks are used for forecasting in this study.
Neural networks simulate the ‘receive–process–transmit–output’ process of biological neurons and use the connection weights of a large number of neurons to automatically extract and map data patterns, ultimately completing tasks such as classification and regression [22]. Figure 1 shows the basic model of a neural network. The input layer mimics the function of dendrites, receiving raw data and passing it on to subsequent layers. The hidden layer simulates information processing in cell bodies. Each neuron calculates a weighted sum of the input signals and uses an activation function to determine whether to activate, thereby achieving a nonlinear transformation and extracting data features. The output layer is similar to the output of a synapse, integrating the features processed by the hidden layer and outputting the final predicted value.

3.1. BP Neural Network

BP neural networks are multi-layer feedforward neural networks that adjust network parameters by propagating signals forward and errors backward [23]. The network calculates the error between the actual and expected output using a loss function and adjusts the connection weights and thresholds between neurons in each layer using the gradient descent method based on the error gradient, causing the error to decrease along the gradient direction. This process of forward propagation and backpropagation is repeated continuously. In theory, the BP neural network can approximate any nonlinear function with arbitrary precision, enabling it to handle various complex nonlinear problems. Figure 2 shows the network structure of a three-layer BP neural network.
As shown in Figure 2, the BP neural network consists of an input layer, hidden layers, and an output layer. The neurons within each layer are connected in a unidirectional manner, with signals propagating sequentially from the input layer towards the output layer. During training, the weights are continuously adjusted to optimize network performance. The weight and threshold correction formulas for the output layer are as follows:
$$\Delta w_{ki} = \eta \sum_{p=1}^{P} \sum_{k=1}^{L} \left( T_k^p - o_k^p \right) \varphi'(net_k)\, y_i, \qquad \Delta a_k = \eta \sum_{p=1}^{P} \sum_{k=1}^{L} \left( T_k^p - o_k^p \right) \varphi'(net_k) \tag{1}$$
The weight and threshold correction formulas for the hidden layer are as follows:
$$\Delta w_{ij} = \eta \sum_{p=1}^{P} \sum_{k=1}^{L} \left( T_k^p - o_k^p \right) \varphi'(net_k)\, w_{ki}\, \phi'(net_i)\, x_j, \qquad \Delta \theta_i = \eta \sum_{p=1}^{P} \sum_{k=1}^{L} \left( T_k^p - o_k^p \right) \varphi'(net_k)\, w_{ki}\, \phi'(net_i) \tag{2}$$
In these equations, $\phi$ and $\varphi$ are the activation functions of the hidden and output layers, respectively; $\theta_i$ and $a_k$ are the thresholds of the hidden- and output-layer neuron nodes, respectively; $w_{ki}$ is the connection weight between the $i$th hidden neuron and the $k$th output neuron, where $k = 1, \dots, L$; $x_j$ is the input value of the $j$th input neuron; $o_k$ is the output value of the $k$th output neuron; and $w_{ij}$ is the connection weight between the $j$th input neuron and the $i$th hidden neuron, where $i = 1, \dots, q$ and $j = 1, \dots, M$.
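The two correction formulas above can be sketched numerically. The following is a minimal illustrative Python/NumPy example (the study itself implemented its models in MATLAB 2019a); it assumes a tanh hidden layer and a linear output layer and omits the threshold terms for brevity, so all function and variable names here are hypothetical.

```python
import numpy as np

def bp_update(x, t, W1, W2, eta=0.01):
    """One forward/backward pass of a three-layer BP network.

    x: input vector (M,); t: target vector (L,)
    W1: hidden-layer weights (q, M); W2: output-layer weights (L, q)
    Assumptions: tanh hidden activation, linear output, no thresholds.
    """
    # Forward propagation
    net_i = W1 @ x               # hidden pre-activations net_i
    y = np.tanh(net_i)           # hidden outputs y_i
    o = W2 @ y                   # output-layer values o_k (linear)

    # Backward propagation of the error (T_k - o_k)
    delta_k = t - o                           # output-layer error term
    delta_i = (W2.T @ delta_k) * (1 - y**2)   # chain rule through tanh'

    # Gradient-descent corrections, following Eqs. (1)-(2)
    W2 = W2 + eta * np.outer(delta_k, y)
    W1 = W1 + eta * np.outer(delta_i, x)
    return W1, W2, o
```

Repeating this update drives the output error down along the gradient direction, which is the forward-propagation/backpropagation loop described above.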
The steps for predicting ship power load using a BP neural network are shown in Figure 3.

3.2. RBF Neural Network

RBF neural networks utilize radial basis functions as their activation function. In comparison with backpropagation neural networks, radial basis function neural networks have been shown to exhibit superior training speeds, typically achieving satisfactory training outcomes within a reduced number of iterations. At the same time, they have good predictive capabilities for unknown data, can effectively handle noise and interference in data, and have strong robustness. The structure of an RBF neural network is shown in Figure 4.
The fundamental distinction between RBF and BP neural networks is rooted in their different activation functions. BP neural networks commonly employ global functions such as the sigmoid and therefore perform global approximation, whereas RBF neural networks utilize radial basis functions, which exhibit local approximation characteristics.
The steps for predicting the ship power load using RBF neural networks are shown in Figure 5.

3.3. Elman Neural Network

The Elman neural network is a recurrent neural network with memory capabilities, enabling it to process time-dependent data. The Elman neural network builds upon the basic structure of the backpropagation (BP) network by adding a memory layer to store the output state of the hidden layer from the previous time step, thereby achieving memory functionality. Through the memory layer, the output of the hidden layer from the previous time step is fed back to the hidden layer, allowing the hidden layer to utilize historical information to process current inputs and capture dynamic patterns in the data. Compared to feedforward neural networks, Elman neural networks have stronger computational capabilities and are particularly suitable for processing data with temporal order and time-dependent relationships, effectively uncovering time-dependent relationships within the data. They can model complex nonlinear dynamic systems, describing the state changes and interactions of the system at different time points. The Elman neural network structure is shown in Figure 6.
In the context of BP, RBF, or Elman neural networks, the selection of the hidden layer neuron count is contingent upon the intrinsic relationships between disparate samples. The accuracy of the predictions made based on the samples does not directly correlate with the number of neurons in the hidden layer. In order to ensure that the network model learns effectively, it is necessary to make node adjustments within a reasonable range based on prediction results during practical operation. The steps of ship power load forecasting using an Elman neural network are illustrated in Figure 7.

3.4. LSTM Neural Network

A long short-term memory (LSTM) network is a type of artificial neural network that incorporates a crucial component known as the memory cell, which is responsible for storing and maintaining long-term state information. The model builds upon recurrent neural networks by introducing three additional logic control units: the input gate, the forget gate, and the output gate [24]. This enables the network to preserve its state across multiple time steps, thereby effectively capturing and maintaining long-term dependencies. Its structural configuration is illustrated in Figure 8.
As shown in Figure 8, the core of the LSTM neural network is the memory cell, which transmits information linearly across time steps to prevent information loss, serving as the carrier of long-term memory. The memory cell controls the inflow, outflow, and forgetting of information through the input gate, forget gate, and output gate to protect and control information. The input gate filters the input information at the current time to determine whether it can enter the memory cell. The forget gate determines which historical information in the memory cell should be forgotten. The output gate determines how much of the memory cell’s output can be used as the current time’s output.
The previous hidden state is denoted by $h_{t-1}$, the current input by $x_t$, and the output of the forget gate's $\sigma$ layer by $f_t$. The calculation formula is as follows:
$$f_t = \sigma\!\left( W_f \cdot \left[ h_{t-1}, x_t \right] + b_f \right) \tag{3}$$
The $\sigma$ layer of the input gate determines which information needs to be updated: its output $i_t$ controls the amount of new information added, and the tanh layer generates a candidate update vector $\tilde{C}_t$. The cell state is then renewed using Formula (6).
$$i_t = \sigma\!\left( W_i \cdot \left[ h_{t-1}, x_t \right] + b_i \right), \qquad \tilde{C}_t = \tanh\!\left( W_c \cdot \left[ h_{t-1}, x_t \right] + b_c \right) \tag{4}$$
The output gate produces $O_t$ through its $\sigma$ layer, which determines which part of the cell state is output. The cell state is passed through a tanh layer and multiplied by $O_t$ to obtain the current hidden state $h_t$. The output gate calculation formula is
$$O_t = \sigma\!\left( W_o \cdot \left[ h_{t-1}, x_t \right] + b_o \right), \qquad h_t = O_t \ast \tanh\left( C_t \right) \tag{5}$$
Finally, memory cells are renewed:
$$C_t = f_t \ast C_{t-1} + i_t \ast \tilde{C}_t \tag{6}$$
LSTM builds more complex models by stacking multiple LSTM layers. Multiple LSTM units in each LSTM layer are interconnected to form a hierarchical structure. When recurrent networks contain multiple layers, the input x t of the second layer is the output h t of the first layer.
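Equations (3)-(6) describe one time step of an LSTM cell. As an illustrative Python/NumPy sketch (the study's models were built in MATLAB 2019a, and all names here are hypothetical), the gate computations over the concatenated vector $[h_{t-1}, x_t]$ can be written as:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, C_prev, W, b):
    """One LSTM time step following Eqs. (3)-(6).

    W: dict of gate weight matrices applied to [h_prev, x_t];
    b: dict of gate bias vectors. Keys: 'f', 'i', 'c', 'o'.
    """
    z = np.concatenate([h_prev, x_t])          # [h_{t-1}, x_t]
    f_t = sigmoid(W['f'] @ z + b['f'])         # forget gate, Eq. (3)
    i_t = sigmoid(W['i'] @ z + b['i'])         # input gate, Eq. (4)
    C_tilde = np.tanh(W['c'] @ z + b['c'])     # candidate update, Eq. (4)
    C_t = f_t * C_prev + i_t * C_tilde         # memory-cell renewal, Eq. (6)
    o_t = sigmoid(W['o'] @ z + b['o'])         # output gate, Eq. (5)
    h_t = o_t * np.tanh(C_t)                   # current hidden state, Eq. (5)
    return h_t, C_t
```

Stacking layers then amounts to feeding the returned `h_t` of one layer as the `x_t` of the next, as described below.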

4. Load Forecasting Models

4.1. Load Data Statistics

The training of neural networks necessitates a sufficient number of samples. In this study, the laboratory’s inherent hardware-in-the-loop simulation model of a power propulsion tugboat and its hardware-in-the-loop simulation experimental equipment platform were employed to conduct experiments and obtain training data. The hardware-in-the-loop simulation model and experimental equipment platform (Electric Propulsion System Experimental Platform (Beibu Gulf University, Qinzhou, China)) are illustrated in Figure 9 and Figure 10.
As shown in Figure 9, the red section represents the simulation model of the diesel generator set on board the vessel, the green section represents the simulation model of the rectifier transformer, the pink section represents the simulation model of the frequency converter, the cyan section represents the simulation model of the propulsion motor, and the yellow section represents the simulation model of the propeller.
In the course of the experiment, the load statistics of a single propulsion motor and propeller were measured over a period of 1500 s. The 60 Hz load motor driving the propeller was used to simulate fluctuations in electrical load under various operational conditions, including forward movement, mooring, reverse thrust, and towing, by altering parameters such as rotational speed, drag force, water flow velocity, and wind speed. This process yielded 300 sets of sample data, shown in Figure 11. Despite the relatively limited time span covered by the test data, factors such as rotational speed and drag force were comprehensively adjusted to cover power load fluctuations under the various operational conditions, so the samples are reasonably representative for short-term power load forecasting. Although accuracy may be compromised to some degree for medium- to long-term predictions, incorporating additional training data into the model retains the potential to generate highly accurate forecast results.
Next, using the input parameters, a neural network was used to predict the load, and the forecast results were compared with the actual values to verify the reliability of the model.

4.2. Neural Network Parameter Settings

(1) Uniform parameter settings
Before forecasting the power load, it is necessary to determine the basic parameter settings of the neural network based on the input. Based on the sample variable parameters and the total output of a single load, the number of neurons in the input and output layers was determined to be 5 and 1, respectively. The number of neurons in the hidden layer is generally determined using an empirical formula combined with trial and error. The empirical formula is as follows:
$$X = \sqrt{N + M} + \alpha \tag{7}$$
In the formula, X represents the number of hidden layer neurons; N represents the number of input layer neurons; M represents the number of output layer neurons; α is a tuning constant with values ranging from 1 to 10, typically determined through trial and error by adjusting α ’s value to observe its impact on forecast results and selecting the scenario with the highest fit. Using this method, the number of hidden layer neurons is determined to be 7. The maximum number of iterations is set to 1000, with an error precision requirement of 1 × 10−5. The training, validation, and test sets consist of 70%, 15%, and 15% of the data, respectively.
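This empirical rule is a one-line computation. As an illustrative sketch (function name and the specific choice of α are assumptions for the example; the paper determines α by trial and error):

```python
import math

def hidden_neurons(n_inputs, n_outputs, alpha):
    """Empirical rule of Eq. (7): X = sqrt(N + M) + alpha, rounded."""
    return round(math.sqrt(n_inputs + n_outputs) + alpha)
```

With N = 5 inputs, M = 1 output, and α chosen as 5 within the 1-10 trial range, the rule yields the 7 hidden neurons used in this study.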
Before starting the forecast work, data normalization must be performed to avoid low training accuracy and slow convergence speed due to differences in dimensions.
$$x = \frac{x_0 - x_{\min}}{x_{\max} - x_{\min}} \tag{8}$$
In the formula, $x$ is the normalized value of the sample data; $x_0$ is the initial value of the data; and $x_{\min}$ and $x_{\max}$ are the minimum and maximum values of the data, respectively.
After the forecast is completed, the output results are denormalized to convert them into values with the same scale as the original data:
$$x_0 = \left( x_{\max} - x_{\min} \right) x + x_{\min} \tag{9}$$
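Equations (8) and (9) form an exact round trip. A minimal illustrative sketch in Python (names are hypothetical; the study performed this step in MATLAB):

```python
def normalize(x0, x_min, x_max):
    """Min-max scaling of Eq. (8): maps x0 into [0, 1]."""
    return (x0 - x_min) / (x_max - x_min)

def denormalize(x, x_min, x_max):
    """Inverse mapping of Eq. (9): restores the original scale."""
    return (x_max - x_min) * x + x_min
```

Applying `denormalize` to a normalized value recovers the original sample, which is why the network can be trained entirely on [0, 1] data without losing the physical load values.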
After the forecast is completed, the model’s reliability is evaluated using metrics such as the mean absolute error (MAE), root mean square error (RMSE), and coefficient of determination (R2):
$$MAE = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right|, \qquad RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2}, \qquad R^2 = 1 - \frac{\sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2}{\sum_{i=1}^{n} \left( y_i - \bar{y} \right)^2} \tag{10}$$
In these equations, $n$ is the sample size, $y_i$ is the true value, $\hat{y}_i$ is the predicted value, and $\bar{y}$ is the mean of the true values.
For neural network models, the closer the R2 value is to 1, the higher the model’s fit. The magnitude of the MAE and RMSE values is related to the size of the sample output values, and the smaller the relative values, the higher the model’s fit.
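The three metrics of Eq. (10) can be computed directly from the true and predicted series. An illustrative Python/NumPy sketch (function name is an assumption for the example):

```python
import numpy as np

def evaluate(y_true, y_pred):
    """MAE, RMSE and R^2 as defined in Eq. (10)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    mae = np.mean(np.abs(y_true - y_pred))
    rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))
    ss_res = np.sum((y_true - y_pred) ** 2)          # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return mae, rmse, r2
```

A perfect forecast gives MAE = RMSE = 0 and R² = 1; the closer R² is to 1, the better the fit.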
(2) Independent parameter settings
LSTM neural networks typically require a smaller learning rate due to their complex network structure; larger learning rates may lead to unstable training. In contrast, neural networks such as BP can use relatively large learning rates in certain scenarios. Accordingly, the learning rates for the BP, RBF, and Elman neural networks are set to 0.01, while that for the LSTM neural network is set to 0.001.
In terms of activation function selection, the sigmoid and tanh functions have smooth nonlinear characteristics, capable of mapping inputs to a finite interval. The ReLU function, which is a piecewise linear function that grows linearly in the positive interval, is computationally simple and helps mitigate the vanishing gradient problem. The Gaussian function has local nonlinear characteristics and is suitable for local data fitting. BP neural networks can select different activation functions based on specific problems. This study uses the tanh function, which has a smooth S-shaped curve and is suitable for handling continuous-value forecasts. RBF neural networks use Gaussian functions for local approximation. Elman neural networks select appropriate activation functions based on the characteristics of sequence data. In this study, the tanh function is used. LSTM neural networks use sigmoid and tanh functions to implement gating mechanisms and information transmission.
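For reference, the four activation function families mentioned above can be written as follows (an illustrative Python/NumPy sketch; the Gaussian centre c and width sigma are assumed parameters):

```python
import numpy as np

def sigmoid(z):
    """Smooth S-shaped curve mapping inputs onto (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def tanh(z):
    """Smooth S-shaped curve mapping inputs onto (-1, 1)."""
    return np.tanh(z)

def relu(z):
    """Piecewise linear; grows linearly for z > 0, mitigating vanishing gradients."""
    return np.maximum(0.0, z)

def gaussian(z, c=0.0, sigma=1.0):
    """Local response around centre c, as used in RBF networks."""
    return np.exp(-((z - c) ** 2) / (2.0 * sigma ** 2))
```

The global sigmoid/tanh functions respond over the whole input range, whereas the Gaussian responds only near its centre, which is the local-versus-global approximation distinction drawn in Section 3.2.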

4.3. Load Forecasting Work

The following inputs should be selected: velocity, thrust, propeller speed, water flow velocity, and wind speed. A predictive model with power load as the output is then established. Prior to partitioning the dataset into training, validation, and test sets, it is necessary to perform random shuffling on the original dataset. In order to facilitate neural network training, it is necessary to simplify the data through a normalization procedure. After determining all parameters, a neural network was established using MATLAB 2019a. Predictions were performed using the four distinct neural network architectures. Subsequent to this, the results underwent renormalization, with prediction curves and model reliability evaluation metrics being generated. The findings are presented in Figure 12 and Table 2, in which a typical local region of the graph is enlarged to illustrate the differences between the various methods.
Analysis of the forecast curves and numerical values shows that the BP neural network produces results more closely aligned with the actual values across a broader range of sample sequence segments. Although all four neural network types can autonomously learn nonlinear relationships from historical data, the BP neural network is particularly well-suited for forecasting irregular, highly volatile loads. The model displays particularly strong trend-following capabilities, even in areas of significant curve fluctuation, indicating its relative advantage in power load forecasting. Despite the satisfactory fitting performance exhibited by the model, there remains scope for improvement in its accuracy. Without considering errors arising from statistical processing of simulated power load data, the BP neural network will now undergo optimization to enhance its power load forecast accuracy.

5. Forecast Model Optimization

5.1. Improved Particle Swarm Algorithm

Particle swarm optimization (PSO) is inspired by the foraging behavior of flocks of birds or schools of fish. It is a global optimization algorithm based on swarm intelligence, designed to solve continuous-variable optimization problems. It has an inherent parallel search mechanism and is particularly suitable for complex optimization areas where traditional methods are ineffective [23].
In the PSO algorithm, each particle represents a candidate solution in the solution space. Through the simulation of individual and group collaboration, the algorithm gradually approaches the optimal solution. The basic particle swarm optimization algorithm updates the velocity and position according to Equation (11).
$$v_{ij}(t+1) = \omega\, v_{ij}(t) + c_1 r_1 \left[ p_{ij}(t) - x_{ij}(t) \right] + c_2 r_2 \left[ p_{gj}(t) - x_{ij}(t) \right], \qquad x_{ij}(t+1) = x_{ij}(t) + v_{ij}(t+1) \tag{11}$$
Here, $c_1$ and $c_2$ are learning factors, usually taken as $c_1 = c_2 = 2$; $\omega$ is the inertia weight; $r_1$ and $r_2$ are random numbers in the range 0–1, used to increase the randomness of particle flight; $v_{ij}$ is the particle velocity, whose range is $\left[ -v_{\max}, v_{\max} \right]$; and $v_{\max}$ is a constant used to limit the particle velocity.
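One update of Eq. (11) for a whole swarm can be sketched as follows (illustrative Python/NumPy; function and parameter names are assumptions, and the study's implementation was in MATLAB):

```python
import numpy as np

def pso_step(x, v, p_best, g_best, w=0.8, c1=2.0, c2=2.0, v_max=1.0, rng=None):
    """One velocity/position update of Eq. (11).

    x, v: positions and velocities, shape (n_particles, n_dims);
    p_best: per-particle best positions; g_best: global best position.
    """
    if rng is None:
        rng = np.random.default_rng()
    r1 = rng.random(x.shape)                  # random factors in [0, 1)
    r2 = rng.random(x.shape)
    v_new = w * v + c1 * r1 * (p_best - x) + c2 * r2 * (g_best - x)
    v_new = np.clip(v_new, -v_max, v_max)     # enforce |v| <= v_max
    return x + v_new, v_new
```

Iterating this step, while updating `p_best` and `g_best` from the fitness function, drives the swarm toward the optimum.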
BP neural networks are sensitive to initial weights and prone to local minima, which can lead to inaccurate power load forecasts; this problem can be solved by using PSO to improve the weight threshold. Traditional PSO optimization becomes less efficient in the later stages of iteration, but local optima can be avoided by dynamically adjusting the learning factor c . Based on this idea, the improved particle swarm optimization (IPSO) algorithm optimizes the initial weights and thresholds of the BP neural network to enhance its global search capability and bring it closer to the global optimum, thereby improving the forecast accuracy and robustness of the BP network.
This study introduces a compression factor $\lambda$ to improve the convergence of the PSO algorithm by controlling the particle velocity and preventing divergence. The velocity update formula is modified to
$v_{id}^{t+1} = \lambda \left[ v_{id}^{t} + c_1(t)\, r_1 \left( p_{id}^{t} - x_{id}^{t} \right) + c_2(t)\, r_2 \left( p_{gd}^{t} - x_{id}^{t} \right) \right]$
where $p_{id}$ is the individual best position of particle $i$ in dimension $d$, $p_{gd}$ is the global best position, and the improved $c_1(t)$ and $c_2(t)$ are time-varying, non-symmetric learning factors.
$c_1(t) = c_{1\max} - \left( c_{1\max} - c_{1\min} \right) \dfrac{t}{t_{\max}}, \qquad c_2(t) = c_{2\max} - \left( c_{2\max} - c_{2\min} \right) \dfrac{t}{t_{\max}}$
where $c_{1\max}$ and $c_{1\min}$ are the maximum and minimum values of the local learning factor, and $c_{2\max}$ and $c_{2\min}$ are the maximum and minimum values of the global learning factor. $\lambda$ can be determined by the following formula:
$\lambda = \dfrac{2}{\left| 2 - \varphi - \sqrt{\varphi^{2} - 4\varphi} \right|}, \qquad \varphi = c_1 + c_2$
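A sketch of these IPSO ingredients — linearly decreasing learning factors and the compression factor $\lambda$ — might look as follows. The factor bounds are illustrative choices (not the paper's values), picked so that $\varphi = c_1 + c_2 > 4$ holds throughout the run and the square root stays real, as the standard constriction form requires:

```python
import numpy as np

def learning_factor(t, t_max, c_max=2.5, c_min=2.0):
    """c(t) = c_max - (c_max - c_min) * t / t_max, decreasing over the run."""
    return c_max - (c_max - c_min) * t / t_max

def compression_factor(c1, c2):
    """lambda = 2 / |2 - phi - sqrt(phi^2 - 4*phi)|, with phi = c1 + c2 > 4."""
    phi = c1 + c2
    return 2.0 / abs(2.0 - phi - np.sqrt(phi * phi - 4.0 * phi))

def ipso_velocity(v, x, p_best, g_best, t, t_max, rng):
    """Improved velocity update with lambda applied to the whole bracket."""
    c1 = learning_factor(t, t_max)                        # local factor
    c2 = learning_factor(t, t_max, c_max=2.6, c_min=2.1)  # non-symmetric bounds
    lam = compression_factor(c1, c2)
    r1, r2 = rng.random(x.shape), rng.random(x.shape)
    return lam * (v + c1 * r1 * (p_best - x) + c2 * r2 * (g_best - x))
```

With the classic symmetric choice $c_1 = c_2 = 2.05$, this reproduces the well-known constriction value $\lambda \approx 0.729$.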

5.2. Attention Mechanism

BP neural networks are sensitive to the completeness and quality of the sample data when performing load forecasting; if data are missing or contain significant noise, forecast errors increase substantially. The attention mechanism (AM) is a computational module that mimics the human visual and cognitive systems' ability to selectively focus on important information, enhancing the model's sensitivity to key features by dynamically allocating weights. Its core idea is to let the model automatically focus on the more important parts of the input during processing: learnable parameters compute an importance weight for each part of the input, and a weighted sum of these weights with the original input yields a context vector concentrated on key information. These weights change dynamically with the input content and task objectives rather than remaining fixed.

In contrast to the fixed weights of a traditional BP network, the attention mechanism adjusts the importance of individual features based on the contextual information of the current input. This allows the model to automatically prioritize the most relevant feature combinations under diverse operating conditions, including the towing, maneuvering, and standby states of a vessel. This adaptive weight allocation suppresses interference from noisy features and mitigates the negative impact of redundant information on predictions. It also improves gradient propagation efficiency: training is accelerated by selectively amplifying the gradient signals of significant features, while the entropy-maximizing property of the attention distribution provides implicit regularization, preventing overfitting caused by excessive reliance on a single feature. Together, these properties enhance the accuracy and reliability of ship load prediction, particularly in complex and dynamic environments.
To enhance the backpropagation (BP) neural network with the attention mechanism, an attention layer is first added after the input layer to highlight key features, suppress noise, and compute the feature weights:
$\alpha_i = \mathrm{softmax}\left( W^{\mathrm{T}} \tanh\left( U x_i \right) \right)$
where $x_i$ is the original input feature and $\alpha_i$ is its attention weight.
Subsequently, attention modules are inserted between the hidden layers, and the sigmoid function is used to dynamically adjust the activation intensity of the neurons.
$\tilde{h}_t = \mathrm{Sigmoid}\left( W_{\alpha} \left[ h_{t-1}, x_t \right] \right) \odot h_t$
The flow of information from the hidden layer output $h_t$ is thus controlled through a gating mechanism.
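The two attention components described above — the input-layer weighting $\alpha_i = \mathrm{softmax}(W^{\mathrm{T}} \tanh(U x_i))$ and the sigmoid gate on the hidden output — can be sketched as below. The vector shapes and random parameters are illustrative assumptions, not the paper's trained values:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - np.max(z))  # subtract max for numerical stability
    return e / e.sum()

def attention_weights(x, U, w):
    """alpha_i = softmax(w^T tanh(U * x_i)): one importance weight per feature.

    x : (n,) input features; U : (k,) projection; w : (k,) scoring vector.
    """
    scores = np.array([w @ np.tanh(U * xi) for xi in x])
    return softmax(scores)  # nonnegative weights summing to 1

def gated_hidden(h_prev, x_t, h_t, W_a):
    """Sigmoid gate on the hidden output: gate * h_t controls information flow.

    W_a : (len(h_t), len(h_prev) + len(x_t)) gate weight matrix.
    """
    z = W_a @ np.concatenate([h_prev, x_t])
    gate = 1.0 / (1.0 + np.exp(-z))  # gate values in (0, 1)
    return gate * h_t
```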

5.3. Improved Forecast Model

To optimize the BP neural network with IPSO and AM, its parameters must first be determined. With the attention mechanism added, the BP neural network performs best with a dual-hidden-layer architecture. The basic parameter ranges were derived from Equation (7), and the parameters were then perturbed and cross-validated. The final configuration uses eight neurons in the first hidden layer and six in the second, both with the ReLU activation function. All other parameters remain as in the preceding section.
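For concreteness, the forward pass of the chosen architecture — two hidden layers of eight and six ReLU neurons — might be sketched as follows. The random initialization and the single linear output neuron are illustrative assumptions; in the paper's method, the initial weights and thresholds come from IPSO:

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

class TwoHiddenLayerBP:
    """Forward pass only: input -> 8 ReLU -> 6 ReLU -> linear output."""

    def __init__(self, n_in, n_out=1, seed=42):
        rng = np.random.default_rng(seed)
        self.W1, self.b1 = rng.normal(0, 0.1, (8, n_in)), np.zeros(8)
        self.W2, self.b2 = rng.normal(0, 0.1, (6, 8)), np.zeros(6)
        self.W3, self.b3 = rng.normal(0, 0.1, (n_out, 6)), np.zeros(n_out)

    def forward(self, x):
        h1 = relu(self.W1 @ x + self.b1)   # first hidden layer, 8 neurons
        h2 = relu(self.W2 @ h1 + self.b2)  # second hidden layer, 6 neurons
        return self.W3 @ h2 + self.b3      # linear output (load forecast)
```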
Next, parameters such as the particle swarm size, maximum iteration count, inertia weight, learning factors, and compression factor are set, with the specific values given in Table 3. The positions and velocities of the particles are randomly initialized, with the particle positions corresponding to the initial values of the weights and thresholds in the BP neural network. The improved neural network model runs in a GPU environment; the simulation results are shown in Figure 13 and Table 4, with a typical local region enlarged to demonstrate the effect of each method more clearly.
The improved particle swarm algorithm and attention mechanism raised the forecast model's R2 to 0.9742. The experimental results show that this scheme significantly improves the forecast accuracy and generalization ability of the model through the deep integration of an intelligent optimization algorithm with the neural network, making it particularly suitable for complex nonlinear power load forecasting. The results are more precise than those of the recent TCA-CNN-LSTM short-term load forecasting model proposed by Lin et al. [17] and the VMD-ARIMA-based power load forecast method for electric propulsion vessels in [16], which achieved 95% accuracy. Although LSTM often outperforms BP networks in the literature, the limited inputs, outputs, and data volume in this study meant that the more complex LSTM network showed no advantage; the simpler BP network converged more readily to near-optimal results, its straightforward structure facilitating control over robustness and computational efficiency.
To validate the model's efficacy under constrained data, cross-validation was performed on the optimized IPSO-AM-BP model with k = 5. The validation yielded an average mean absolute error (MAE) of 13.5605, average root mean square error (RMSE) of 21.3082, and average R2 of 0.9618. These results demonstrate the model's high precision and robustness.
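The reported metrics follow their standard definitions; a sketch of MAE, RMSE, R², and the k = 5 fold split is given below (the random permutation seed is an arbitrary choice):

```python
import numpy as np

def metrics(y_true, y_pred):
    """Return (MAE, RMSE, R^2) for a forecast against the actual series."""
    err = y_true - y_pred
    mae = np.mean(np.abs(err))
    rmse = np.sqrt(np.mean(err ** 2))
    r2 = 1.0 - np.sum(err ** 2) / np.sum((y_true - y_true.mean()) ** 2)
    return mae, rmse, r2

def kfold_indices(n, k=5, seed=0):
    """Shuffle indices 0..n-1 and return k (train_idx, test_idx) pairs."""
    folds = np.array_split(np.random.default_rng(seed).permutation(n), k)
    return [(np.concatenate([f for j, f in enumerate(folds) if j != i]), folds[i])
            for i in range(k)]
```

Averaging `metrics` over the k held-out folds gives the cross-validated MAE, RMSE, and R² reported above.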
The methodology presented here is grounded in simulation data and relatively well-defined factors influencing the electrical load. In practice, however, varying sea conditions, meteorological factors, and other variables may introduce nonlinearities and noise absent from the simulated environment. In such scenarios, the proposed approach can still yield accurate predictions through local modifications and extensions. When confronted with more complex environmental parameters such as sea state and weather, a preliminary factor analysis should first determine whether these elements exert a significant influence or merely introduce noise. If they are significant, the corresponding variables should be incorporated into the model as inputs or substituted for less influential ones. If they merely introduce noise, ensemble forecasting can be employed instead: small perturbations are applied to the inputs or model parameters, a prediction is produced for each perturbed set, and a weighted average of these predictions yields stable results that are insensitive to the noise. Further proposals and results will be explored in subsequent research.
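The ensemble idea described here — perturb the inputs slightly, predict per member, and take a weighted average — can be sketched as follows; the member count, noise level, and uniform default weights are illustrative assumptions rather than values from the paper:

```python
import numpy as np

def ensemble_forecast(model, x, n_members=20, noise_std=0.02, weights=None, seed=0):
    """Weighted average of forecasts from slightly perturbed copies of the input.

    model : callable mapping an input array to a scalar forecast.
    """
    rng = np.random.default_rng(seed)
    preds = np.array([model(x + rng.normal(0.0, noise_std, size=np.shape(x)))
                      for _ in range(n_members)])
    if weights is None:
        weights = np.full(n_members, 1.0 / n_members)  # equal weights by default
    return float(weights @ preds)
```

Averaging over many lightly perturbed members smooths out noise-driven variation in any single forecast.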

6. Conclusions

In this study, tugboats were selected for short-term power load forecasting, with the aim of applying the laboratory’s semi-physical simulation model of tugboats to build a convenient, fast, highly accurate, and versatile small-sample power load forecasting model. Based on the tugboat simulation model, the total load data of the electric propulsion system were collected as samples under various working conditions, and multiple neural network models were established to predict the power load. Reasonable improvements to the model were made based on the results.
Experiments show that the BP neural network converges quickly and achieves high accuracy in power load forecasting, but its tendency to fall into local optima prevents a closer fit to the actual values. Therefore, an improved particle swarm algorithm and an attention mechanism were introduced to optimize the BP neural network and bring its solution closer to the global optimum. Comparing the predicted values of the improved model with the actual values revealed a 5.18% improvement in accuracy over the unoptimized model, confirming the feasibility of this approach and providing a more comprehensive and effective solution for short-term power load forecasting on ships.
The findings of this study indicate that optimized operation of shipboard generator sets is achievable through high-precision forecasting, thereby reducing fuel consumption and lowering operational costs. Concurrently, this model supports intelligent energy management systems, facilitating refined energy management and preventive maintenance to enhance operational efficiency and reliability. It is anticipated that this model will play a positive role in the intelligent and green transformation of shipping.

Author Contributions

Conceptualization, C.Q. and P.L.; Methodology, P.L.; Software, H.L. and P.L.; Validation, H.L.; Formal analysis, H.L. and W.W.; Resources, C.Q.; Data curation, H.L., P.L., H.H., Z.Z. and J.S.; Writing—original draft, H.L.; Writing—review & editing, H.L.; Visualization, W.H.; Supervision, C.Q., W.Q., W.H. and Y.Z.; Project administration, C.Q. and W.Q.; Funding acquisition, C.Q. All authors have read and agreed to the published version of the manuscript.

Funding

This work was financially supported by the Guangxi Natural Science Foundation Joint Special Project (Beibu Gulf University Special Project, Grant No. 2025GXNSFHA069069).

Data Availability Statement

The data relates to laboratory projects and cannot be disclosed at this time.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Daxi, L. Research Report on the Current Status and Investment Prospects of China’s Shipbuilding Industry; Zhiyan Consulting: Beijing, China, 2024. [Google Scholar]
  2. International Maritime Organization. IMO Initial Strategy for the Reduction of Greenhouse Gas Emissions from Ships; International Maritime Organization: London, UK, 2018. [Google Scholar]
  3. Qian, Y.; Kong, Y.; Huang, C. Review of power load forecast research. Sichuan Electr. Power Technol. 2023, 46, 37–43+58. [Google Scholar]
  4. Zou, J.; Yang, S.; Liu, X.; Wang, H.; Liu, L.; Guo, X.; Zhang, H.; Qiu, Z.; Gai, Z. Impacts of Wind Assimilation on Error Correction of Forecasted Dynamic Loads from Wind, Wave, and Current for Offshore Wind Turbines. J. Mar. Sci. Eng. 2025, 13, 1211. [Google Scholar] [CrossRef]
  5. Fu, C.; Yu, F.; Dong, J.; Gao, Y.; Li, M. Analysis of the Causes of Sea Water Inundation Events along the Bohai and Yellow Sea Coasts in October 2024 Based on Numerical Simulation. Mar. Forecast 2025, 42, 1–10. [Google Scholar]
  6. Li, M.; Liu, D. Power load forecast based on improved regression method. Power Syst. Technol. 2006, 30, 99–104. [Google Scholar]
  7. Dhaval, D. Short-term load forecasting with using multiple linear regression. Int. J. Electr. Comput. Eng. 2020, 10, 3911–3917. [Google Scholar] [CrossRef]
  8. He, R.; Huang, Y. Research on an improved support vector machine method for power load forecasting. Hongshuihe 2022, 41, 94–99. [Google Scholar]
  9. Wei, M.; Zhou, Q.; Cai, S.; Jiang, S.; Lu, L.; Zhang, Z.; Zhou, B. Medium- and long-term load forecasting based on BFGS-FA optimization of fractional-order grey models. J. Guangxi Univ. Nat. Sci. Ed. 2020, 45, 270–276. [Google Scholar]
  10. Guo, C.; Wang, X.; Wang, B.; Wang, J. Short-term power load forecast method based on multi-layer fusion neural network model. Comput. Mod. 2021, 10, 94–99+106. [Google Scholar]
  11. Tao, J.; Zou, H.; Zhou, D. Short-term load prediction model based on enhanced artificial neural networks. Electr. Eng. Mater. 2021, 2, 53–56. [Google Scholar] [CrossRef]
  12. Zuleta-Elles, I.; Bautista-Lopez, A.; Cataño-Valderrama, M.J.; Marín, L.G.; Jiménez-Estévez, G.; Mendoza-Araya, P. Load Forecasting for Different Forecast Horizons using ANN and ARIMA models. In Proceedings of the 2021 IEEE CHILEAN Conference on Electrical, Electronics Engineering, Information and Communication Technologies (CHILECON), Valparaíso, Chile, 6–9 December 2021; pp. 1–7. [Google Scholar]
  13. Kwon, B.S.; Song, K.B. Mid-term load forecasting algorithm for large-scale power systems based on deep learning considering the impact of behind-the-meter solar PV generation. J. Electr. Eng. Technol. 2025, 20, 1183–1192. [Google Scholar] [CrossRef]
  14. Ma, M.; Li, X.; Fan, H.; Qin, L.; Wei, L. Actual Truck Arrival Prediction at a Container Terminal with the Truck Appointment System Based on the Long Short-Term Memory and Transformer Model. J. Mar. Sci. Eng. 2025, 13, 405. [Google Scholar] [CrossRef]
  15. Kim, J.-Y.; Oh, J.-S. Multiple Feature Extraction Long Short-Term Memory Using Skip Connections for Ship Electricity Forecasting. J. Mar. Sci. Eng. 2023, 11, 1690. [Google Scholar] [CrossRef]
  16. Zhang, Y.; Chen, G.; Li, J. Research and Application of CNN-LSTM Time Series Prediction Method Based on Attention Mechanism. J. Inn. Mong. Univ. Nat. Sci. Ed. 2022, 53, 516–521. [Google Scholar]
  17. Lin, H.; Hao, Z.; Guo, J.; Wu, Y. Research on Short-Term Load Forecasting Based on TCA-CNN-LSTM. Electr. Meas. Instrum. 2023, 60, 73–80. [Google Scholar] [CrossRef]
  18. Gu, Y.; Yang, Z.; Gao, H.; Tang, Y. Method for short-term power load forecast of ships in rough sea conditions. Mar. Eng. Equip. Technol. 2023, 10, 117–123. [Google Scholar]
  19. Schirmer, A.; Mporas, I. A Survey on Non-Intrusive Load Monitoring. IEEE Trans. Smart Grid 2023, 14, 769–784. [Google Scholar] [CrossRef]
  20. Ramadan, A. Residential Microgrid Energy Management Based on Non-Intrusive Load Monitoring and Internet of Things. Smart Cities 2024, 7, 1907–1935. [Google Scholar] [CrossRef]
  21. Shabir, A.; Vasilyeva, K.; Hockmabad, H.N.; Huseyin, O.; Petrenkov, E.; Belikov, J. Comparative Analysis of Machine Learning Techniques for Non-Intrusive Load Monitoring. Electronics 2024, 13, 1420. [Google Scholar] [CrossRef]
  22. Yan, T.S. An Improved Genetic Algorithm and Its Blending Application with Neural Network. In Proceedings of the 2010 2nd International Workshop on Intelligent Systems and Applications, Wuhan, China, 22–23 May 2010; pp. 1–4. [Google Scholar]
  23. Ma, H.; Leng, S.; Aihara, K.; Chen, L. Randomly Distributed Embedding Making Short-Term High-Dimensional Data Predictable. Proc. Natl. Acad. Sci. USA 2022, 115, E9994–E10002. [Google Scholar] [CrossRef]
  24. Kawakami, K. Supervised Sequence Labelling with Recurrent Neural Networks; Technical University of Munich: Munich, Germany, 2021. [Google Scholar]
Figure 1. Basic model of a neural network.
Figure 2. BP neural network.
Figure 3. BP neural network forecast steps.
Figure 4. RBF neural network.
Figure 5. RBF neural network forecast steps.
Figure 6. Elman neural network.
Figure 7. Elman neural network forecast steps.
Figure 8. LSTM neural network architecture.
Figure 9. A complete electric propulsion system simulation model.
Figure 10. Semi-physical simulation experimental equipment.
Figure 11. Power load sample.
Figure 12. Comparison of four neural networks' forecasts.
Figure 13. IPSO-AM-BP neural network forecast simulation results.
Table 1. Different load forecasts by time classification.

Classification | Short-Term Load Forecasting | Medium- to Long-Term Load Forecasting
Time range | Minutes–day | Week–year
Main function | Optimize grid operation, reduce operating costs, formulate maintenance plans, and respond to emergencies | Guide power system planning, arrange equipment repair and maintenance, formulate fuel supply plans, support power market operations, strategic planning and decision-making, investment decision-making and risk assessment, energy resource planning, and environmental impact assessment
Influencing factors | Short-term factors such as weather, social activities, holidays, and emergencies | Long-term factors such as economic development, population growth, industrial restructuring, energy policy, technological advancement, and climatic conditions
Data characteristics | High volatility, strong randomness, high correlation | Relatively stable changes, obvious trends, significant seasonal characteristics
Forecast methods | Time-series analysis, neural networks, machine learning, etc. | Regression analysis, grey-box modeling, time-series analysis, machine learning, etc.
Forecast accuracy requirements | High | Relatively low
Model update frequency | High | Low
Seasonal effects | Small | Large
Real-time requirements | High | Relatively low
Table 2. Evaluation index values of four models.

Evaluation Criteria | BP | RBF | Elman | LSTM
MAE | 26.0522 | 33.8848 | 33.2272 | 28.6458
RMSE | 36.2053 | 46.7057 | 40.7507 | 39.0238
R2 | 0.9207 | 0.8779 | 0.8919 | 0.9055
Table 3. IPSO-BP neural network parameter settings.

Parameters | Set Value
Particle swarm size | 30
Maximum number of iterations | 50
Inertia weight | Initial value 0.9, gradually decreasing to 0.4
Learning factors | c1 = 1.5, c2 = 2.0
Compression factor | 0.53
Table 4. Evaluation index values of improved BP neural networks.

Evaluation Criteria | BP | IPSO-BP | AM-BP | IPSO-AM-BP
MAE | 27.5468 | 22.6451 | 21.2564 | 11.3428
RMSE | 33.9464 | 27.3275 | 28.2163 | 18.2663
R2 | 0.9262 | 0.9433 | 0.9410 | 0.9742
