3.1. Beluga Whale Optimization Algorithm
BWO is a population-based algorithm in which each beluga whale represents a candidate solution that is updated during the optimization process. The matrix of search-agent positions is modeled as shown in Equation (1).
where n is the population size and d is the dimension. The corresponding fitness values are given in Equation (2).
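Equations (1) and (2) are not reproduced above; a reconstruction following the standard BWO formulation is:

```latex
% Population and fitness matrices (Equations (1)-(2)), standard BWO form
X =
\begin{bmatrix}
x_{1,1} & x_{1,2} & \cdots & x_{1,d} \\
x_{2,1} & x_{2,2} & \cdots & x_{2,d} \\
\vdots  & \vdots  & \ddots & \vdots  \\
x_{n,1} & x_{n,2} & \cdots & x_{n,d}
\end{bmatrix},
\qquad
F_X =
\begin{bmatrix}
f(x_{1,1}, x_{1,2}, \dots, x_{1,d}) \\
f(x_{2,1}, x_{2,2}, \dots, x_{2,d}) \\
\vdots \\
f(x_{n,1}, x_{n,2}, \dots, x_{n,d})
\end{bmatrix}
```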
BWO transitions from exploration to exploitation depending on the balance factor Bf, whose mathematical model is shown in Equation (3), where B0 changes randomly between (0, 1) at each iteration, and T and t are the maximum and current iteration numbers, respectively. When Bf > 0.5, beluga individuals are in the exploration stage; when Bf ≤ 0.5, they are in the exploitation stage.
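Equation (3) is not reproduced above; in the standard BWO formulation, the balance factor takes the form:

```latex
% Balance factor (Equation (3)), standard BWO form
B_f = B_0 \left( 1 - \frac{t}{2T} \right)
```

Because B0 is redrawn at every iteration while the factor 1 - t/(2T) decays from 1 to 0.5, Bf exceeds 0.5 less and less often, which gradually shifts the population from exploration toward exploitation.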
During the exploration phase, the position of each beluga whale is determined by paired swimming, and the updated position is given in Equation (4), where $p_j$ is a dimension index randomly selected from the $d$ dimensions, $X_{i,p_j}^{t+1}$ is the new position of the $i$th beluga whale in the $p_j$th dimension, and $X_{i,p_j}^{t}$ and $X_{r,p_1}^{t}$ are the current positions of the $i$th and $r$th beluga whales, respectively. The updated position reflects the synchronized or mirrored behavior of beluga whales while swimming or diving, depending on whether the selected dimension is even or odd. The random numbers $r_1$ and $r_2$, both in (0, 1), randomize the mirrored fin motion between paired whales.
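As a concrete illustration, below is a minimal Python sketch of the pair-swimming update of Equation (4), following the standard BWO formulation; the function name and variable layout are illustrative, not from the paper:

```python
import numpy as np

def bwo_exploration_step(X, i, rng):
    """Pair-swimming update (Equation (4)), standard BWO form.

    X   : (n, d) population matrix of current positions
    i   : index of the beluga whale being updated
    rng : numpy.random.Generator
    """
    n, d = X.shape
    r = rng.integers(n)          # partner whale r, chosen at random
    pj = rng.permutation(d)      # randomly selected dimension indices p_j
    r1, r2 = rng.random(), rng.random()
    x_new = X[i].copy()
    for j in range(d):
        diff = X[r, pj[0]] - X[i, pj[j]]
        if j % 2 == 0:           # even dimension: synchronized swimming
            x_new[pj[j]] = X[i, pj[j]] + diff * (1 + r1) * np.sin(2 * np.pi * r2)
        else:                    # odd dimension: mirrored swimming
            x_new[pj[j]] = X[i, pj[j]] + diff * (1 + r1) * np.cos(2 * np.pi * r2)
    return x_new
```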
Beluga whales hunt their prey by sharing their location information with each other. A Levy flight strategy is introduced in the exploitation phase of BWO to enhance convergence, as shown in Equation (5), where $C_1 = 2r_4(1 - t/T)$ measures the random jump intensity of the Levy flight and $r_4$ is a random number in (0, 1). LF is the Levy flight function, expressed in Equations (6) and (7), where u and v are normally distributed random numbers and β is a constant with a value of 1.5.
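In the standard BWO formulation, Equation (5) takes the form $X_i^{t+1} = r_3 X_{best}^{t} - r_4 X_i^{t} + C_1 \cdot LF \cdot (X_r^{t} - X_i^{t})$, with $r_3$ a random number in (0, 1) and $X_{best}^{t}$ the best position found so far. The Levy flight function of Equations (6) and (7) can then be sketched as follows; the 0.05 scaling and the Mantegna-style σ follow the standard formulation:

```python
import numpy as np
from math import gamma, pi, sin

def levy_flight(d, beta=1.5, rng=None):
    """Levy flight step (Equations (6)-(7)), standard BWO form.

    sigma follows Mantegna's formula; u and v are zero-mean Gaussian draws.
    """
    if rng is None:
        rng = np.random.default_rng()
    sigma = (gamma(1 + beta) * sin(pi * beta / 2)
             / (gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    u = rng.normal(0.0, sigma, d)
    v = rng.normal(0.0, 1.0, d)
    return 0.05 * u / np.abs(v) ** (1 / beta)
```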
To simulate the behavior of whale fall at each iteration while keeping the population size unchanged, the updated position is established from the positions of the beluga whales and the step size of the whale fall, as represented in Equation (8), where $X_{step}$ is the step size of the whale fall, expressed in Equation (9). In Equation (9), $C_2 = 2W_f n$ is a step factor related to the whale fall probability and the population size, and $u_b$ and $l_b$ represent the upper and lower bounds of the variables, respectively. The whale fall probability $W_f$ is calculated as a linear function of the iteration number, as shown in Equation (10).
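A reconstruction of Equations (8)-(10), following the standard BWO formulation, with $r_5$, $r_6$, and $r_7$ random numbers in (0, 1):

```latex
% Whale fall model (Equations (8)-(10)), standard BWO form
X_i^{t+1} = r_5 X_i^{t} - r_6 X_r^{t} + r_7 X_{step}, \qquad
X_{step} = (u_b - l_b) \exp\!\left(-\frac{C_2 t}{T}\right), \qquad
W_f = 0.1 - 0.05 \frac{t}{T}
```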
3.2. EIBWO Proposal and Validation
In its exploration and exploitation stages, the original BWO lacks sufficient search capability, resulting in low population diversity and reduced solution accuracy. To address these shortcomings, this study proposes an enhanced and improved beluga whale optimization algorithm (EIBWO). The improvement measures of EIBWO are as follows:
- (1) Initialization of the beluga whale population based on a chaotic mapping strategy
To obtain high-quality first-generation positions, accelerate the convergence of EIBWO, and reduce computational cost, this study exploits the randomness and ergodicity of a chaotic mapping strategy to generate the positions of the first-generation population, as detailed in Equations (11) and (12), where rM represents the scale of the chaotic mapping. When α = 4, the map is in a fully chaotic state. $l_b$ and $u_b$ represent the minimum and maximum boundaries of the search space, respectively.
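As an illustration, below is a minimal sketch of logistic-map initialization consistent with this description; the logistic form of the map and the scaling into [lb, ub] are assumptions, since Equations (11) and (12) are not reproduced here:

```python
import numpy as np

def chaotic_init(n, d, lb, ub, alpha=4.0, rng=None):
    """Logistic-map population initialization (Equations (11)-(12), assumed form).

    z_{k+1} = alpha * z_k * (1 - z_k) is fully chaotic for alpha = 4;
    the chaotic sequence is then mapped onto the search bounds [lb, ub].
    """
    if rng is None:
        rng = np.random.default_rng()
    Z = np.empty((n, d))
    Z[0] = rng.random(d)                         # random seed values in (0, 1)
    for k in range(1, n):
        Z[k] = alpha * Z[k - 1] * (1 - Z[k - 1]) # logistic chaotic map
    return lb + Z * (ub - lb)                    # scale into the search space
```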
- (2) Sine dynamic adaptive factor
To improve the local search capability of EIBWO, a sine adaptive factor is added to the position update formula; its expression is defined in Equation (13). Introducing the sine dynamic adaptive factor S into Equation (4) yields the improved position update equation given in Equation (14).
- (3) Individual position disturbance strategy in the beluga whale population
To ensure that the population maintains high diversity throughout the optimization process, a random position perturbation strategy is proposed. The detailed strategy is as follows: during each iteration of EIBWO, a disturbance frequency fr is set and a random number rp is drawn within the [0, 1] interval; fr and rp are then compared to determine whether the population positions are disturbed. Specifically, when fr ≤ rp, individual positions are updated according to Equation (14); when fr > rp, individual positions are updated according to Equations (15) and (16), where aw is a convergence coefficient used to guide individuals toward the convergence direction, with a value range of [0.3, 0.7], and randn(0, σ²) denotes a Gaussian random number with mean 0 and variance σ².
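Since Equations (13)-(16) are not reproduced above, the sketch below shows one plausible reading of the disturbance logic. Only the fr/rp comparison, the aw range of [0.3, 0.7], and the randn(0, σ²) noise come from the text; the form of S and the exact perturbed-update rule are illustrative assumptions:

```python
import numpy as np

def disturbed_update(x_i, x_best, t, T, sigma, rng):
    """Individual position disturbance (Equations (14)-(16), assumed forms).

    x_i    : current position vector of one beluga whale
    x_best : best position found so far
    sigma  : standard deviation of the Gaussian disturbance
    """
    fr = rng.random()                  # disturbance frequency in [0, 1]
    rp = rng.random()                  # random comparison value in [0, 1]
    if fr <= rp:
        # Equation (14): regular update scaled by the sine adaptive factor S.
        # The form of S (Equation (13)) is an assumption for illustration.
        S = np.sin(np.pi / 2 * (t / T))
        return x_i + S * (x_best - x_i) * rng.random()
    # Equations (15)-(16): pull toward the convergence direction, plus noise.
    aw = rng.uniform(0.3, 0.7)                 # convergence coefficient in [0.3, 0.7]
    noise = rng.normal(0.0, sigma, x_i.shape)  # randn(0, sigma^2)
    return aw * x_i + (1 - aw) * x_best + noise
```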
To verify the convergence and optimization performance of EIBWO, the traditional BWO algorithm, multi-verse optimization (MVO), the seagull optimization algorithm (SOA), and particle swarm optimization (PSO) were selected as comparison algorithms. In the same testing environment, six benchmark test functions were used to test all algorithms; the benchmark test functions are detailed in Table 1. Some algorithms have unique parameters, whose settings are shown in Table 2.
As shown in Table 2, Vmax and Vmin are the maximum and minimum wormhole existence rates in MVO; Cmax and Cmin are the inertia factors in PSO; C1 and C2 are the acceleration constants in PSO; and d is the control factor in SOA. The population size was set to 50, the number of iterations for all algorithms to 300, and the dimension to 30.
To ensure the objectivity of testing, each test function was run 50 times for each algorithm, and the average value and standard deviation of the convergence values were calculated over the runs. The test results of the algorithms on the different test functions are shown in Table 3.
The statistical results in Table 3 show that EIBWO achieved the optimal convergence value on the unimodal test function F3 and the multimodal test function F5. Although EIBWO did not obtain the optimal value on the other test functions, the standard deviation of its results being 0 indicates that EIBWO is robust and converges to stable results.
3.3. Extreme Learning Machine
Due to its excellent learning ability, the single-hidden-layer feed-forward neural network (SLFN) is widely used in fields such as lifespan prediction and pattern recognition. However, traditional SLFN has some inherent problems: the gradient descent method must be iterated many times during training to correct the network thresholds and weights, the network is sensitive to the choice of learning rate, and training times are long.
To address the issues of traditional SLFN, scholars have developed a new network model, the extreme learning machine (ELM). Compared with traditional SLFN, ELM has a faster learning speed and stronger generalization ability. The network structure of ELM is shown in Figure 2 [38,39].
As shown in Figure 2, ELM consists of an input layer, a hidden layer, and an output layer, with the neurons of adjacent layers connected in sequence.
It is assumed that the numbers of neurons in the input, hidden, and output layers are ni, nh, and no, respectively, and that the corresponding numbers of input and output variables are ni and no. The connection weight matrix between the input layer and the hidden layer is wi, as shown in Equation (17), where wijk is the connection weight between the jth neuron in the hidden layer and the kth neuron in the input layer.
The connection weight matrix between the output layer and the hidden layer is wo, as given in Equation (18), where wojk is the connection weight between the jth neuron in the hidden layer and the kth neuron in the output layer. The hidden-layer neuron threshold matrix is q, as shown in Equation (19).
For a training set containing nm samples, the input matrix X* and output matrix Y* are as shown in Equations (20) and (21). With G(·) as the activation function of the hidden layer, the network output matrix O is given by Equations (22) and (23), where $w_{i,j}$ denotes the jth row of the input weight matrix $w_i$ and $x_k$ denotes the kth input sample (the kth column of X*).
The above expression can be written compactly as Equation (24), where H is the output matrix of the hidden layer, as shown in Equation (25).
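A reconstruction of the hidden-layer output matrix of Equation (25), following the standard ELM formulation:

```latex
% Hidden-layer output matrix (Equation (25)), standard ELM form
H =
\begin{bmatrix}
G(w_{i,1} \cdot x_1 + q_1) & \cdots & G(w_{i,n_h} \cdot x_1 + q_{n_h}) \\
\vdots & \ddots & \vdots \\
G(w_{i,1} \cdot x_{n_m} + q_1) & \cdots & G(w_{i,n_h} \cdot x_{n_m} + q_{n_h})
\end{bmatrix}_{n_m \times n_h}
```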
In ELM, if the number of hidden-layer neurons equals the number of samples in the training set, the network output can approximate the training set with zero error, as expressed in Equation (26).
When the training set contains a large number of samples, the number of hidden-layer neurons is usually kept smaller than the number of samples to reduce computational cost. In this case, the network training error can approach an arbitrarily small value ε, as shown in Equation (27). The output weight matrix wo is then obtained by solving the least-squares problem in Equation (28); the solution is given in Equation (29), where H+ is the Moore-Penrose generalized inverse of H.
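To make the training procedure concrete, below is a minimal NumPy sketch of ELM; the sigmoid activation and the class interface are illustrative choices, while the pseudoinverse solution mirrors Equation (29):

```python
import numpy as np

class ELM:
    """Minimal extreme learning machine (illustrative sketch)."""

    def __init__(self, n_in, n_hidden, rng=None):
        if rng is None:
            rng = np.random.default_rng()
        self.wi = rng.uniform(-1, 1, (n_hidden, n_in))  # random input weights
        self.q = rng.uniform(-1, 1, n_hidden)           # random hidden thresholds
        self.wo = None                                  # output weights, solved in fit()

    def _hidden(self, X):
        # Hidden-layer output matrix H (Equation (25)), sigmoid as G(.)
        return 1.0 / (1.0 + np.exp(-(X @ self.wi.T + self.q)))

    def fit(self, X, Y):
        H = self._hidden(X)
        self.wo = np.linalg.pinv(H) @ Y   # wo = H+ Y* (Equation (29))
        return self

    def predict(self, X):
        return self._hidden(X) @ self.wo
```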
ELM is the basic model selected for PV prediction in this study. As with other machine learning models, the performance of ELM is affected by its randomly assigned parameters. Given the strong randomness and volatility of PV power, high regression capability is required of the model, and improper selection of the random parameter values directly degrades the regression performance of ELM. To accurately characterize the uncertainty of PV power within ELM, EIBWO is used to optimize the connection weights and thresholds of ELM, thereby improving the predictive performance of ELM and enhancing the accuracy of PV power prediction.
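As a hedged sketch of how EIBWO can drive ELM training, each EIBWO individual can encode the input weights wi and thresholds q, with fitness equal to the training RMSE of the resulting ELM; this encoding and fitness choice are illustrative assumptions, not necessarily the paper's exact scheme:

```python
import numpy as np

def elm_fitness(theta, X_train, Y_train, n_in, n_hidden):
    """Decode one EIBWO individual into ELM parameters and score it (assumed scheme)."""
    wi = theta[:n_hidden * n_in].reshape(n_hidden, n_in)  # input weights
    q = theta[n_hidden * n_in:]                           # hidden thresholds
    H = 1.0 / (1.0 + np.exp(-(X_train @ wi.T + q)))       # hidden-layer outputs
    wo = np.linalg.pinv(H) @ Y_train                      # least-squares output weights
    err = H @ wo - Y_train
    return float(np.sqrt(np.mean(err ** 2)))              # RMSE as fitness

# EIBWO would minimize elm_fitness over theta (dimension n_hidden * (n_in + 1));
# the best theta then parameterizes the ELM used for PV power prediction.
```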