Article

Enhanced Short-Term PV Power Forecasting via a Hybrid Modified CEEMDAN-Jellyfish Search Optimized BiLSTM Model

1 Suihua University Key Laboratory of Mechanical and Electrical Engineering Materials Preparation and Application, Suihua University, Suihua 152000, China
2 School of Instrument Science and Engineering, Harbin Institute of Technology, Harbin 150001, China
* Author to whom correspondence should be addressed.
Energies 2025, 18(13), 3581; https://doi.org/10.3390/en18133581
Submission received: 27 May 2025 / Revised: 29 June 2025 / Accepted: 3 July 2025 / Published: 7 July 2025
(This article belongs to the Section A2: Solar Energy and Photovoltaic Systems)

Abstract

Accurate short-term photovoltaic (PV) power forecasting is crucial for ensuring the stability and efficiency of modern power systems, particularly given the intermittent and nonlinear characteristics of solar energy. This study proposes a novel hybrid forecasting model that integrates complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN), the jellyfish search (JS) optimization algorithm, and a bidirectional long short-term memory (BiLSTM) neural network. First, the original PV power signal was decomposed into intrinsic mode functions using a modified CEEMDAN method to better capture the complex nonlinear features. Subsequently, the fast Fourier transform and improved Pearson correlation coefficient (IPCC) were applied to identify and merge similar-frequency intrinsic mode functions, forming new composite components. Each reconstructed component was then forecasted individually using a BiLSTM model, whose parameters were optimized by the JS algorithm. Finally, the predicted components were aggregated to generate the final forecast output. Experimental results on real-world PV datasets demonstrate that the proposed CEEMDAN-JS-BiLSTM model achieves an R² of 0.9785, a MAPE of 8.1231%, and an RMSE of 37.2833, outperforming several commonly used forecasting models by a substantial margin in prediction accuracy. This highlights its effectiveness as a promising solution for intelligent PV power management.

1. Introduction

Clean energy plays a crucial role in sustainable development and carbon reduction. Among various sources, PV generation is particularly promising due to its abundance, safety, and environmental friendliness [1,2]. However, PV output is highly intermittent and affected by weather, seasons, and sunlight duration [3,4], posing challenges to power system stability and efficiency. To address this, accurate PV power forecasting has become essential for guiding system operation and improving reliability [5,6,7].
Prediction methods mainly focus on point, interval, and probabilistic forecasts [8]. The basic PV power prediction methods can be classified into three categories: artificial intelligence technology [9,10], physical model technology [11], and statistical analysis technology [12]. Physical model techniques estimate PV power by integrating factors such as radiation, wind speed, temperature, and weather conditions; however, they require accurate parameters of the PV system and its operating conditions [13], and building an accurate model is challenging because of the complexity of the system. Statistical analysis techniques rely heavily on large amounts of historical data to establish relationships between input and output factors, so the quality of the historical data directly affects the prediction accuracy of PV power. Techniques such as time-series methods [14], grey theory [15], and particle swarm optimization [16] have been commonly used. Recently, artificial intelligence technologies have gained significant importance in PV power forecasting because of their exceptional learning capabilities. Artificial intelligence methods typically involve data cleaning, correlation analysis, normalization, and identification of anomalies and missing data.
PV forecasting is inherently complex, and single-model approaches often fall short in performance. Consequently, hybrid models that integrate multiple techniques have gained popularity [17,18,19,20,21,22]. For example, Wang et al. [23] combined lightweight CNNs with LSTM to extract local features and capture spatial dependencies, further incorporating Transformer and temporal CNNs for improved sequence prediction. Liang et al. [24] proposed a hybrid model combining CEEMDAN, permutation entropy, and BiLSTM, achieving enhanced accuracy and robustness. Hosseini et al. [25] integrated genetic algorithms and particle swarm optimization to improve deep neural network performance. Bai et al. [26] introduced a spatiotemporal graph convolutional recurrent network for multi-site PV prediction. Recent studies have further extended hybrid forecasting frameworks. Zhai et al. [27] developed a VMD–SSA–Transformer–LSTM model that effectively captures complex interactions between meteorological variables and PV power, achieving high accuracy with reduced data requirements. Piantadosi et al. [28] proposed a Transformer-based forecasting method trained with open-source weather data, demonstrating superior accuracy without relying on local irradiance measurements. Tao et al. [29] introduced PTFNet, a physically informed model that combines parallel structures and meteorological features, achieving competitive results across multiple public datasets and forecast horizons. Although these methods have improved forecasting accuracy to some extent, they still exhibit key limitations: insufficient decomposition–reconstruction strategies for highly nonstationary signals, underutilization of frequency-domain features and inter-component correlations, and reliance on manually tuned hyperparameters without adaptive optimization, all of which limit their generalization and robustness.
Signal decomposition and reconstruction is a key strategy to reduce uncertainty and improve prediction accuracy [30,31,32]. Common signal decomposition techniques include EMD, EEMD, CEEMD, CEEMDAN, VMD, ITD, and UPITD [33,34]. Among these, VMD [35] requires predefined mode numbers and is sensitive to parameter settings, limiting its adaptability to highly nonstationary data. ITD [36], while simple and easy to implement, lacks theoretical rigor and is vulnerable to noise, resulting in poor decomposition and reconstruction accuracy. UPITD [37] improves ITD by incorporating uncertainty principles but suffers from high computational cost, parameter sensitivity, and limited interpretability. EEMD [38] alleviates mode mixing by introducing Gaussian noise, yet may retain residual noise and produce spurious modes. To address these issues, CEEMDAN [39] was proposed to further suppress residual noise and improve decomposition clarity. Given its balance between accuracy, adaptability, and interpretability, CEEMDAN is adopted in this study to preprocess PV data by decomposing it into IMFs, thus enhancing data quality for modeling. Prior studies [40,41] have suggested that IMFs can be grouped based on shared characteristics and predicted separately before aggregation. However, determining grouping thresholds often relies on subjective heuristics, which may cause improper grouping and affect forecast accuracy. This highlights the need for an adaptive method to identify correlations among IMFs and enhance prediction performance.
After decomposition, predictive models such as LSTM [42,43], kernel extreme learning machine [44], and backpropagation neural networks [45] are commonly employed to forecast PV subcomponents. Among these, BiLSTM improves upon the standard LSTM by processing input sequences in both forward and backward directions, enabling it to capture both past and future contextual information simultaneously. This bidirectional structure enhances the modeling of complex temporal dependencies and typically results in higher predictive accuracy and better generalization in time series forecasting tasks. However, its performance is highly sensitive to hyperparameters such as learning rate, number of hidden units, batch size, and training epochs. Improper settings can lead to slow convergence, overfitting, or poor accuracy. Conventional tuning methods, such as manual adjustment or grid search, are time-consuming and often ineffective in high-dimensional spaces. Therefore, an efficient and robust optimization strategy is essential to improve forecasting accuracy and generalization in PV applications.
To address the intermittency and volatility challenges in photovoltaic power forecasting, this study proposes a hybrid short-term PV prediction model. First, the CEEMDAN algorithm was used to decompose the original power data into multiple IMFs to enhance prediction accuracy. Then, by combining the FFT and IPCC methods, multiple IMFs with similar frequency characteristics were grouped and merged to form new composite components. Next, a BiLSTM model independently predicts each composite component, and the final power forecast is obtained by integrating the predictions from all the groups. Additionally, the JS algorithm was introduced to optimize the BiLSTM hyperparameters. By balancing global exploration and local exploitation, the JS algorithm effectively avoids local optima and significantly accelerates the convergence.
The main contributions of this study are as follows:
1.
The proposed method applies CEEMDAN for signal decomposition and FFT for extracting IMF frequency features. IPCC is introduced to cluster and reconstruct IMFs based on frequency similarity, enhancing input stability and clarity.
2.
A BiLSTM-based predictor is developed, with hyperparameters efficiently optimized via the JS algorithm to improve accuracy and generalization while avoiding manual tuning pitfalls.
3.
By combining CEEMDAN, IPCC, JS, and BiLSTM, the proposed hybrid framework effectively addresses nonlinearity, nonstationarity, and structural complexity in PV power forecasting.
The remainder of this paper is structured as follows. Section 2 presents an overview of the theoretical background, including decomposition techniques, forecasting methods, similarity evaluation approaches, and parameter optimization strategies for PV power output. Section 3 describes the proposed method. Section 4 compares the forecasting performance of the proposed method with that of traditional approaches. Finally, Section 5 provides the conclusion of the paper.

2. Theoretical Background

2.1. CEEMDAN Theory

Although the EEMD algorithm can mitigate mode mixing to some extent, the resulting IMFs still suffer from residual noise and limited decomposition precision. To overcome these issues, the CEEMDAN algorithm [39] was introduced, aiming to improve signal decomposition accuracy and enhance the interpretability of extracted modes.
Let x(t) denote the original PV power signal, and let δ_i(t) represent the i-th realization of unit-variance Gaussian white noise. The CEEMDAN process begins by adding modulated white noise with amplitude ω_0 to form x(t) + ω_0 δ_i(t). For each of the n realizations, EMD is applied to extract the first IMF, which is then averaged to yield:
m_1(t) = \frac{1}{n} \sum_{i=1}^{n} \mathrm{EMD}_1\left[ x(t) + \omega_0 \delta_i(t) \right]
The first residual is then calculated as:
r_1(t) = x(t) - m_1(t)
To obtain the second IMF, modulated noise δ_i(t) is added to the residual r_1(t), and EMD is applied again. Averaging over n realizations yields:
m_2(t) = \frac{1}{n} \sum_{i=1}^{n} \mathrm{EMD}_1\left[ r_1(t) + \omega_1 \, \mathrm{EMD}_1\left( \delta_i(t) \right) \right]
This recursive process continues for k = 1, 2, ..., K, with the (k+1)-th IMF computed by:
m_{k+1}(t) = \frac{1}{n} \sum_{i=1}^{n} \mathrm{EMD}_1\left[ r_k(t) + \omega_k \, \mathrm{EMD}_k\left( \delta_i(t) \right) \right]
where the updated residual is:
r_{k+1}(t) = r_k(t) - m_{k+1}(t)
The iteration continues until the residual contains fewer than two extrema. Finally, the reconstructed signal is given by:
x(t) = \sum_{k=1}^{K} m_k(t) + r(t)
In this study, CEEMDAN is applied to decompose the raw PV power signal into a series of IMFs, thereby improving data quality by reducing nonstationarity and noise, and enabling more effective downstream prediction modeling.
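For readers who wish to reproduce this step, the following is a minimal sketch of the decomposition using the open-source PyEMD package (an assumption; the paper does not state which implementation was used). The noise amplitude of 0.2 and the 100 realizations match the settings reported later in Section 3.1, and the synthetic series is only a placeholder for the real PV data.

```python
# A minimal sketch (not the authors' code): decompose a PV power series into IMFs
# with CEEMDAN, assuming the third-party PyEMD package ("pip install EMD-signal").
# Constructor argument names may differ between PyEMD versions.
import numpy as np
from PyEMD import CEEMDAN

# Synthetic stand-in for the 30-min PV power series x(t)
t = np.linspace(0, 10, 990)
x = np.maximum(0, np.sin(0.6 * t) + 0.3 * np.sin(7 * t) + 0.1 * np.random.randn(t.size))

ceemdan = CEEMDAN(trials=100, epsilon=0.2)   # 100 realizations, noise amplitude 0.2
imfs = ceemdan(x)                            # array of shape (n_modes, len(x))
print("number of extracted modes:", imfs.shape[0])
```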

2.2. Improved Pearson Correlation Coefficient (IPCC)

2.2.1. Pearson Correlation Coefficient (PCC)

The PCC is commonly used to evaluate the correlation and similarity between two time series. If the two series are positively correlated, the corresponding PCC is positive; conversely, if they are negatively correlated, the PCC is negative. The closer the PCC is to 1 or −1, the stronger the correlation; conversely, the closer the PCC is to 0, the weaker the correlation. The mathematical expression is as follows:
r_0 = \frac{ \sum_{i=1}^{M} \left( D_{1i} - \frac{1}{M}\sum_{j=1}^{M} D_{1j} \right) \left( D_{2i} - \frac{1}{M}\sum_{j=1}^{M} D_{2j} \right) }{ \sqrt{ \sum_{i=1}^{M} \left( D_{1i} - \frac{1}{M}\sum_{j=1}^{M} D_{1j} \right)^{2} } \cdot \sqrt{ \sum_{i=1}^{M} \left( D_{2i} - \frac{1}{M}\sum_{j=1}^{M} D_{2j} \right)^{2} } }
where D_{1i} is the i-th point value in the first time series, D_{2i} is the i-th point value in the second time series, and M is the number of data points.

2.2.2. Improved Pearson Correlation Coefficient (IPCC)

In PV forecasting, the standard PCC is often sensitive to noise, which can distort the assessment of similarity between IMFs. To mitigate this issue, we propose the IPCC. By replacing the original time series with local cumulative means, IPCC suppresses high-frequency noise effects and provides a more robust measure of the underlying relationships among IMFs.
The observed PV power signal x(t) can be modeled as the sum of the true power component and additive noise, formulated as:
x(t) = P_0 + n
where P_0 denotes the ideal noise-free PV signal and n represents the noise component.
To reduce the influence of noise on correlation measurement, IPCC substitutes the raw time series values D_{1i} and D_{2i} with their local cumulative means, defined by:
\tilde{D}_{1i} = \frac{1}{i-1} \sum_{j=1}^{i} D_{1j}, \quad i > 1
\tilde{D}_{2i} = \frac{1}{i-1} \sum_{j=1}^{i} D_{2j}, \quad i > 1
Using these smoothed sequences, the correlation coefficient is recalculated as:
r_0 = \frac{ \sum_{i=1}^{M} \left( \tilde{D}_{1i} - \bar{\tilde{D}}_{1} \right) \left( \tilde{D}_{2i} - \bar{\tilde{D}}_{2} \right) }{ \sqrt{ \sum_{i=1}^{M} \left( \tilde{D}_{1i} - \bar{\tilde{D}}_{1} \right)^{2} } \cdot \sqrt{ \sum_{i=1}^{M} \left( \tilde{D}_{2i} - \bar{\tilde{D}}_{2} \right)^{2} } }
where \bar{\tilde{D}}_{1} and \bar{\tilde{D}}_{2} denote the mean values of the smoothed sequences and M is the number of data points.
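A compact NumPy sketch of the IPCC computation is given below; it follows the cumulative-mean definition above (the 1/(i−1) scaling for i > 1 is taken literally from the equations, and leaving the first sample unsmoothed is an assumption).

```python
import numpy as np

def local_cumulative_mean(d):
    """Replace each point (i > 1) by the scaled cumulative sum (1/(i-1)) * sum_{j<=i} d_j."""
    d = np.asarray(d, dtype=float)
    csum = np.cumsum(d)
    out = csum.copy()
    i = np.arange(2, d.size + 1)          # 1-based index i = 2, 3, ..., M
    out[1:] = csum[1:] / (i - 1)
    return out                            # out[0] is kept as d[0] (assumption)

def ipcc(d1, d2):
    """Improved Pearson correlation coefficient between two equal-length series."""
    s1 = local_cumulative_mean(d1)
    s2 = local_cumulative_mean(d2)
    s1 = s1 - s1.mean()
    s2 = s2 - s2.mean()
    return float(np.sum(s1 * s2) / np.sqrt(np.sum(s1 ** 2) * np.sum(s2 ** 2)))
```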

2.3. Jellyfish Search Algorithm (JS)

The JS algorithm simulates the foraging behavior of jellyfish by integrating both exploration and exploitation in a time-controlled search process. Initially, a population of n_pop jellyfish is generated within the search space defined by the upper and lower bounds U_b and L_b. The initial positions are diversified using the logistic chaotic map as follows:
X_{i+1} = \eta X_i (1 - X_i), \quad 0 \le X_0 \le 1
where η = 4 and X_0 ∉ {0, 0.25, 0.5, 0.75, 1} to avoid fixed points. To control the behavioral transition over time, a control factor C(t) is introduced:
C(t) = \left| \left( 1 - \frac{t}{\mathrm{Max}_{iter}} \right) \cdot \left( 2 \cdot \mathrm{rand}(0,1) - 1 \right) \right|
Depending on C(t), each jellyfish updates its position by passive drifting or active movement. When C(t) ≥ 0.5, passive movement mimics drifting with ocean currents:
X_i(t+1) = X_i(t) + \mathrm{rand}(0,1) \cdot \left( X_{best} - \beta \cdot \mathrm{rand}(0,1) \cdot \mu \right)
where X_best is the global best position, μ is the mean position of the population, and β = 3. If C(t) < 0.5, two types of active behaviors are possible: random drifting
X_i(t+1) = X_i(t) + \gamma \cdot \mathrm{rand}(0,1) \cdot (U_b - L_b)
with γ = 0.1, or directed movement influenced by comparisons with another randomly selected individual:
X_i(t+1) = X_i(t) + \mathrm{rand}(0,1) \cdot \mathrm{Direction}
where
\mathrm{Direction} = \begin{cases} X_j(t) - X_i(t), & \text{if } f(X_i) \ge f(X_j) \\ X_i(t) - X_j(t), & \text{if } f(X_i) < f(X_j) \end{cases}
and X_j(t) is the position of a randomly selected jellyfish. This unified mechanism enables an adaptive search between exploration and exploitation, thereby enhancing the optimizer’s robustness and global convergence capability.
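The position-update rules above can be condensed into a single routine. The sketch below is a schematic minimization step, not the authors' implementation; in particular, the rule used here to choose between the two active motions (rand > 1 − C(t)) follows the original JS paper [48] and is an assumption, since the text does not state it explicitly.

```python
import numpy as np

def js_step(X, fitness, X_best, t, max_iter, lb, ub, beta=3.0, gamma=0.1, rng=None):
    """One Jellyfish Search update of the whole population X (rows = candidates)."""
    rng = rng or np.random.default_rng()
    mu = X.mean(axis=0)                                        # mean position of the swarm
    for i in range(X.shape[0]):
        c = abs((1 - t / max_iter) * (2 * rng.random() - 1))   # time control factor C(t)
        if c >= 0.5:                                           # drift with the ocean current
            X[i] = X[i] + rng.random() * (X_best - beta * rng.random() * mu)
        elif rng.random() > 1 - c:                             # random drifting (assumed rule)
            X[i] = X[i] + gamma * rng.random() * (ub - lb)
        else:                                                  # directed movement w.r.t. a random X_j
            j = int(rng.integers(X.shape[0]))
            direction = X[j] - X[i] if fitness(X[i]) >= fitness(X[j]) else X[i] - X[j]
            X[i] = X[i] + rng.random() * direction
        X[i] = np.clip(X[i], lb, ub)                           # keep candidates inside the bounds
    return X
```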

2.4. Bidirectional Long Short-Term Memory (BiLSTM)

The BiLSTM networks enhance the standard LSTM by processing input sequences in both forward and backward directions. This bidirectional structure allows the model to capture information from both past and future time steps, thereby improving the learning of temporal dependencies. The structure of the standard LSTM network is shown in Figure 1, while the architecture of a BiLSTM network is shown in Figure 2.
The LSTM unit computations include the forget gate f_t, input gate i_t, candidate cell state C̃_t, cell state C_t, output gate o_t, and hidden state h_t, which are described below:
f_t = \sigma\left( W_f \cdot [h_{t-1}, x_t] + b_f \right)
i_t = \sigma\left( W_i \cdot [h_{t-1}, x_t] + b_i \right)
\tilde{C}_t = \tanh\left( W_c \cdot [h_{t-1}, x_t] + b_c \right)
C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t
o_t = \sigma\left( W_o \cdot [h_{t-1}, x_t] + b_o \right)
h_t = o_t \odot \tanh(C_t)
The forward and backward hidden states in a BiLSTM are computed independently by:
\overrightarrow{h}_t = \mathrm{LSTM}_{\mathrm{forward}}\left( x_t, \overrightarrow{h}_{t-1}, \overrightarrow{C}_{t-1} \right)
\overleftarrow{h}_t = \mathrm{LSTM}_{\mathrm{backward}}\left( x_t, \overleftarrow{h}_{t+1}, \overleftarrow{C}_{t+1} \right)
The activation functions used are defined as follows:
\sigma(x) = \frac{1}{1 + e^{-x}}
\tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}
By leveraging information from both directions, BiLSTM networks improve sequence modeling and are widely used in areas such as speech recognition, natural language processing, and time series analysis.
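Because the experiments in Section 4 are implemented in PyTorch, a BiLSTM forecaster for one component could be sketched as follows. This is a generic illustration, not the authors' network: the hidden size, dropout, and the equal width of the two layers are placeholder choices that an optimizer such as JS would tune.

```python
import torch
import torch.nn as nn

class BiLSTMForecaster(nn.Module):
    """Two-layer bidirectional LSTM mapping a window of past values to the next value."""
    def __init__(self, input_size=1, hidden_size=64, num_layers=2, dropout=0.1):
        super().__init__()
        self.bilstm = nn.LSTM(input_size, hidden_size, num_layers=num_layers,
                              batch_first=True, bidirectional=True, dropout=dropout)
        self.head = nn.Linear(2 * hidden_size, 1)    # concatenated forward/backward states

    def forward(self, x):                            # x: (batch, seq_len, input_size)
        out, _ = self.bilstm(x)                      # out: (batch, seq_len, 2 * hidden_size)
        return self.head(out[:, -1, :])              # predict the next step

# Example: predict the next 30-min value from a 22-step (one-day) window
model = BiLSTMForecaster()
window = torch.randn(8, 22, 1)                       # batch of 8 dummy input windows
print(model(window).shape)                           # -> torch.Size([8, 1])
```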

3. Proposed Methodology

In this study, we propose a hybrid photovoltaic power forecasting model based on CEEMDAN, the JS optimization algorithm, and BiLSTM. The model comprises three fundamental modules: decomposition, reconstruction, and prediction. The overall workflow can be summarized into three key steps: data preprocessing; signal decomposition and reconstruction based on the fast Fourier transform and IPCC; and forecasting using the JS-optimized BiLSTM model.
We propose a novel hybrid short-term PV forecasting model, named CEEMDAN-JS-BiLSTM, whose structure is illustrated in Figure 3. The overall workflow of the proposed model is as follows:
Step 1: Data preprocessing. The IQR method was used to identify and remove outliers in the PV power series. The missing values were then filled using interpolation techniques, resulting in a complete and continuous dataset with a 30 min time interval (a minimal preprocessing sketch is given after the step list). This step helps eliminate the impact of abnormal fluctuations and data incompleteness on the model accuracy, thereby improving the quality and stability of the input data for subsequent modeling.
Step 2: Signal decomposition and reconstruction. The preprocessed PV power series was decomposed using CEEMDAN, extracting a set of IMFs with different frequency characteristics to enable multi-scale information separation. Each IMF is then analyzed in the frequency domain using the fast Fourier transform and grouped based on frequency similarity using the IPCC method. These grouped components were subsequently reconstructed to obtain a set of new components with similar frequency features (NIMFs). This step effectively alleviates the mode mixing problem and enhances the stationarity and predictability of each subsequence, providing a clearer and more structured input for the prediction model.
Step 3: Power forecasting. Each NIMF subsequence was fed into a BiLSTM network optimized by the JS algorithm for modeling and prediction. The final PV power forecast was obtained by summing the predicted outputs of all subsequences. The BiLSTM network captures both forward and backward temporal dependencies in the sequence, enhancing the model’s ability to learn the dynamic time-series features. The JS optimization algorithm improves the efficiency of parameter tuning during training, further enhancing the accuracy and robustness of the prediction.
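As referenced in Step 1, a minimal pandas sketch of the IQR cleaning and 30 min resampling is shown below. The column name, the 1.5 × IQR fences, and linear interpolation are illustrative assumptions rather than the authors' exact settings.

```python
import numpy as np
import pandas as pd

def preprocess_pv(df: pd.DataFrame, col: str = "power") -> pd.DataFrame:
    """IQR-based outlier removal followed by interpolation onto a 30-min grid.
    Sketch of Step 1; assumes df has a DatetimeIndex of hourly PV power records."""
    q1, q3 = df[col].quantile([0.25, 0.75])
    iqr = q3 - q1
    lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    df.loc[(df[col] < lower) | (df[col] > upper), col] = np.nan   # mark outliers as missing

    df = df.resample("30min").mean()                 # hourly records -> 30-min grid
    df[col] = df[col].interpolate(method="linear")   # fill gaps and new grid points
    return df

# Usage (hypothetical): clean = preprocess_pv(raw_df)
# In practice, only the 07:00-17:00 daytime window would be kept, as in Section 4.
```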
To enhance the understanding of the proposed hybrid forecasting framework, Algorithm 1 presents the complete model workflow integrating CEEMDAN, IPCC, JS optimization, and BiLSTM prediction.
Algorithm 1 Hybrid forecasting framework with CEEMDAN–JS–BiLSTM.
Input: PV signal x(t)
Output: Forecasted PV power P̂(t)
// Step 1: CEEMDAN decomposition
Decompose x(t) into IMFs: [IMF_1, ..., IMF_N] = CEEMDAN(x(t))
Compute the FFT of each IMF: [f_1, ..., f_N] = FFT(IMF_1, ..., IMF_N)
// Step 2: IMF clustering using IPCC
for j = 1 to N − 1 do
    Compute similarity: IPCC_j = IPCC(f_j, f_{j+1})
    if IPCC_j > 0.5 then
        Merge adjacent IMFs: NIMF_j = IMF_j + IMF_{j+1}
    else
        Retain original IMF: NIMF_j = IMF_j
    end if
end for
// Step 3: Forecasting with JS-optimized BiLSTM
for each NIMF_i do
    Use JS to optimize the BiLSTM hyperparameters
    Train and predict: NIMF̂_i = BiLSTM(NIMF_i)
end for
// Step 4: Reconstruction
Reconstruct the final prediction: P̂(t) = Σ_i NIMF̂_i
return P̂(t)

3.1. Classification of the IMF Based on FFT and IPCC

In this section, an adaptive classification method is proposed to group IMFs based on their frequency characteristics. The aim was to cluster similar fluctuating components in the PV power data to improve the forecasting accuracy. Because low-index IMFs typically contain high-frequency information, the method first applies FFT to each IMF to obtain their frequency spectra. The IPCC is then used to measure the similarity between adjacent IMF frequency spectra. If the similarity exceeds a set threshold r 0 , the corresponding IMFs are merged into an NIMF. This approach helps reduce mode mixing and enhances the quality of the decomposed signals.
To reduce spectral leakage during the frequency analysis, a Hamming window was applied in the FFT process. For the CEEMDAN decomposition, the white noise amplitude was set to 0.2 and the number of realizations to 100. This parameter combination has been demonstrated in previous studies [46] to yield stable decomposition performance.
Formally, the CEEMDAN decomposition of PV data x(t) can be written as
x(t) = \mathrm{CEEMDAN}(x(t)) = IMF_1 + IMF_2 + \cdots + IMF_N
where N denotes the number of IMFs. The frequency spectrum of each IMF is
W_i = \mathrm{FFT}(IMF_i), \quad i = 1, 2, \ldots, N
The similarity between the spectra of adjacent IMFs is calculated by
r_i = \mathrm{IPCC}(W_i, W_{i+1}), \quad i = 1, 2, \ldots, N-1
If r_i > r_0, the IMFs IMF_i and IMF_{i+1} are combined to form a new IMF (NIMF).
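Putting the pieces together, the grouping rule of this subsection can be sketched as follows. It reuses the ipcc() helper sketched in Section 2.2 and applies the Hamming window mentioned above; the chained merging of consecutive similar IMFs is an interpretation of the rule that reproduces the three groups obtained in Section 4.

```python
import numpy as np

def group_imfs(imfs: np.ndarray, threshold: float = 0.5) -> list:
    """Merge adjacent IMFs whose windowed FFT magnitude spectra are similar (Section 3.1).
    Relies on the ipcc() helper defined earlier; imfs has shape (n_imfs, n_samples)."""
    window = np.hamming(imfs.shape[1])                    # reduce spectral leakage
    spectra = [np.abs(np.fft.rfft(imf * window)) for imf in imfs]

    groups = [[0]]                                        # indices of IMFs per NIMF
    for i in range(1, len(imfs)):
        if ipcc(spectra[i - 1], spectra[i]) > threshold:  # similar -> same group
            groups[-1].append(i)
        else:                                             # dissimilar -> start a new group
            groups.append([i])
    return [imfs[g].sum(axis=0) for g in groups]          # reconstructed NIMFs

# e.g. nimfs = group_imfs(imfs) -> three NIMFs for the Changzhou dataset of Section 4
```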

3.2. BiLSTM Optimization with JS

To enhance the performance of the BiLSTM model in photovoltaic power forecasting, this study employed the JS algorithm to optimize its hyperparameters. JS simulates the behavior of jellyfish drifting with ocean currents and performing random movements, balancing exploration and exploitation to efficiently search for the global optimum. Specifically, the BiLSTM model adopted in this study consists of a two-layer architecture. Each candidate solution represents a set of BiLSTM hyperparameter configurations, denoted as a vector H_i = {h_1, h_2, ε, n, b}, where h_1 and h_2 indicate the number of hidden units in the first and second BiLSTM layers, respectively; ε is the learning rate; n denotes the number of training iterations; and b represents the batch size. The search range for each parameter is defined based on the literature [47], with n specifically set to the discrete values {20, 50, 100, 200}. At the initialization stage, a population of candidate solutions is randomly generated within the defined bounds. Each candidate is used to construct and train a BiLSTM model, whose forecasting performance is evaluated using the Mean Absolute Percentage Error as the fitness function, reflecting the quality of the given parameter configuration.
During the optimization process, the positions of candidate solutions are updated via two primary mechanisms:
(1) Ocean Current Movement, where the candidate moves toward the best solution as
H_i^{new} = H_i + r \cdot (H_{best} - H_i)
where r \sim U(0,1) is a uniform random variable.
(2) Jellyfish Movement. Each candidate explores the solution space by perturbing itself with reference to two randomly selected solutions:
H_i^{new} = H_i + s \cdot (H_j - H_k)
where s \sim U(-1,1) determines the direction and magnitude of the step size.
The JS algorithm probabilistically selects between exploration and exploitation at each iteration. After each update, the new hyperparameters retrain the BiLSTM model, and fitness is re-evaluated. If improved, the candidate and possibly the global best are updated. After several iterations, the algorithm converges to the optimal hyperparameter vector H_best, which enhances the model’s accuracy, robustness, and generalization. Notably, the JS algorithm relies on only two control parameters: the population size n_pop and the maximum number of iterations t_max. According to prior studies [48], and considering the low dimensionality of the BiLSTM hyperparameter space (comprising only five variables), this work sets n_pop = 40 and t_max = 100, resulting in at most 40 × 100 = 4000 model evaluations. In addition to this fixed iteration cap, a dynamic stopping criterion is introduced: if the objective function’s change remains below 10^{-12} for 10 consecutive iterations, the algorithm is considered to have converged and terminates early. This dual termination mechanism ensures sufficient search depth while improving computational efficiency and convergence stability. Moreover, existing research [48] suggests that for optimization problems of similar dimensionality, the JS algorithm typically achieves stable convergence under comparable population and iteration configurations. From a theoretical perspective, the total computational complexity of the JS-based BiLSTM hyperparameter optimization can be expressed as O(n_pop · t_max · C_train), where C_train denotes the time required to train and evaluate a single BiLSTM model. Since each position update involves retraining the model, the overall complexity is primarily dominated by the cost of training. To balance search effectiveness and computational burden, a moderate parameter configuration is adopted in this study. The complete optimization process is described in Algorithm 2.
Algorithm 2 JS optimization for BiLSTM hyperparameters.
Input: Search space H = {h_1, h_2, ε, n, b}, population size n_pop, maximum iterations t_max
Output: Optimal parameters H_best
Initialize population P = {H_1, ..., H_{n_pop}} randomly
Evaluate fitness f(H_i) for all H_i ∈ P
H_best ← argmin f(H_i), c ← 0
for t = 1 to t_max do
    f_prev ← f(H_best)
    for i = 1 to n_pop do
        Sample p ~ U(0,1)
        Update H_i^new using ocean current or jellyfish motion
        Evaluate f(H_i^new)
        if f(H_i^new) < f(H_i) then
            H_i ← H_i^new
            if f(H_i) < f(H_best) then
                H_best ← H_i
            end if
        end if
    end for
    if variation of H_best < 10^{-12} for 10 iterations then
        break
    end if
end for
return H_best
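To connect Algorithm 2 with the network of Section 2.4, a candidate vector can be decoded into a trainable model and scored by validation MAPE roughly as follows. The decoding order (h_1, h_2, ε, n, b) follows the text; the two independently sized layers, the Adam optimizer, and handling the batch size through a DataLoader are illustrative assumptions, not the authors' exact training setup.

```python
import torch
import torch.nn as nn

class TwoLayerBiLSTM(nn.Module):
    """BiLSTM with independently sized layers h1 and h2, as described in Section 3.2."""
    def __init__(self, h1: int, h2: int):
        super().__init__()
        self.l1 = nn.LSTM(1, h1, batch_first=True, bidirectional=True)
        self.l2 = nn.LSTM(2 * h1, h2, batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * h2, 1)

    def forward(self, x):                 # x: (batch, seq_len, 1)
        x, _ = self.l1(x)
        x, _ = self.l2(x)
        return self.head(x[:, -1, :])

def fitness(H, train_loader, val_x, val_y):
    """Decode H = [h1, h2, lr, epochs, batch] and return validation MAPE (%) as the
    JS objective. The batch size is assumed to be applied when building train_loader."""
    h1, h2, lr, epochs = int(H[0]), int(H[1]), float(H[2]), int(H[3])
    model = TwoLayerBiLSTM(h1, h2)
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        for xb, yb in train_loader:       # yb: (batch, 1)
            opt.zero_grad()
            loss_fn(model(xb), yb).backward()
            opt.step()
    with torch.no_grad():                 # MAPE on the validation split (no zero targets)
        pred = model(val_x)
        return float(torch.abs((pred - val_y) / val_y).mean() * 100)
```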

3.3. Prediction Results Evaluation

The effectiveness of the prediction can be evaluated using the coefficient of determination (R²), RMSE, and MAPE, as expressed by the following equations:
(1) Determination coefficient R²:
R^2 = 1 - \frac{RSS}{TSS}
RSS = \sum_{i=1}^{N} (x_i - \hat{x}_i)^2
TSS = \sum_{i=1}^{N} (x_i - \bar{x})^2
(2) Root mean square error (RMSE):
\mathrm{RMSE} = \sqrt{ \frac{1}{N} \sum_{i=1}^{N} (\hat{x}_i - x_i)^2 }
(3) Mean absolute percentage error (MAPE):
\mathrm{MAPE} = \frac{1}{N} \sum_{i=1}^{N} \left| \frac{\hat{x}_i - x_i}{x_i} \right| \times 100\%
where x_i and \hat{x}_i represent the actual and predicted values, respectively, N is the number of samples, and \bar{x} is the mean of the actual values.
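For completeness, the three metrics can be computed directly from these definitions, for example:

```python
import numpy as np

def evaluate(y_true: np.ndarray, y_pred: np.ndarray) -> dict:
    """R^2, RMSE and MAPE as defined above (MAPE assumes no zero actual values)."""
    rss = np.sum((y_true - y_pred) ** 2)
    tss = np.sum((y_true - y_true.mean()) ** 2)
    return {
        "R2": 1.0 - rss / tss,
        "RMSE": np.sqrt(np.mean((y_pred - y_true) ** 2)),
        "MAPE": np.mean(np.abs((y_pred - y_true) / y_true)) * 100.0,
    }
```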

4. Result and Discussion

This study used PV power output data from the Changzhou Power Grid to evaluate the effectiveness of the proposed method. The dataset spanned 21 May 2018, to 4 July 2018. To construct the forecasting model, PV power data recorded between 7:00 a.m. and 5:00 p.m. each day were selected. Because the original dataset only contained hourly power values, data preprocessing was performed to enhance the prediction accuracy. Specifically, outliers were removed using the IQR method, and interpolation was applied to generate one data point every 30 min. Consequently, each day contained 22 data points, as shown in Figure 4a. By arranging these daily data points in chronological order, a complete dataset consisting of 990 data points was obtained, as illustrated in Figure 4b. The proposed model was implemented using the PyTorch framework. All experiments were conducted on a workstation equipped with an Intel Core i7-12700H @ 2.30 GHz processor, 32 GB RAM, and an NVIDIA T600 Laptop GPU.
The PV power output exhibited significant fluctuations, suggesting the presence of multiple superimposed components within the signal. To address the nonstationarity of the data and improve the forecasting performance, this study applies a signal decomposition strategy to isolate the intrinsic components, as detailed in Section 3. The PV power data from 21 May to 3 July 2018, were used for model training, whereas the remaining data were reserved for testing and evaluating the predictive performance.
In practice, short-term PV power output is volatile because it depends on variable temperature, wind velocity, atmospheric pressure, solar radiation, and other factors. The PV power signal therefore contains components with different degrees of volatility. EEMD and its developments, such as CEEMD and CEEMDAN, can decompose a signal into IMFs of different volatility and are particularly suitable for nonstationary signals [24,38,39]. The first IMF contains the most volatile component; as the IMF index increases, the volatility of the decomposed component decreases. Since volatility is reflected in the frequency content, the overall volatility of the PV power can be separated into components of different volatility, which helps enhance the prediction accuracy of PV power.
In this study, the CEEMDAN method was applied to decompose the PV power output, as shown in Figure 5, resulting in six IMF components (IMF1–IMF6) with progressively decreasing frequency density. However, their waveform characteristics in the time domain were not clearly distinguishable. To better differentiate and characterize these IMFs, the method proposed in Section 3 was applied, transforming each IMF using FFT, as shown in Figure 6. The analysis shows that IMF1 and IMF2 both exhibit pronounced sideband features on the left side of the main frequency, with highly similar spectra overall, especially in the regions between the main frequency and the sidebands (highlighted in red). In contrast, starting from IMF3, these original sideband features clearly disappear. Additionally, IMF3, IMF4, and IMF5 display new types of sideband components different from those in IMF1 and IMF2 (highlighted in blue), while IMF6 contains no sideband features at all. Based on these spectral characteristics, IMF1 and IMF2 were grouped as the first group, IMF3, IMF4, and IMF5 as the second group, and IMF6 was classified as the third group.
To verify the effectiveness of the analysis process, the IPCC was employed to evaluate the similarity between the FFT spectra of adjacent IMF components, as shown in Figure 7. The analysis results indicate that the IPCC is 0.67 between IMF1 and IMF2, 0.41 between IMF2 and IMF3, 0.64 between IMF3 and IMF4, 0.63 between IMF4 and IMF5, and 0.24 between IMF5 and IMF6. The detailed values are listed in Table 1. The demarcation between similar and nonsimilar sequential IMFs therefore lies between 0.41 and 0.63. A margin was kept on both sides of this interval (raising the lower bound and lowering the upper bound), so the IPCC threshold is set at 0.5 in this paper. The IPCC values between IMF2 and IMF3 and between IMF5 and IMF6 are below this threshold, whereas the IPCC values among the other components exceed the threshold. Based on this, IMF1 and IMF2 were grouped as the first set, IMF3, IMF4, and IMF5 were grouped as the second set, and IMF6 alone formed the third set. The reconstructed NIMFs are presented in Figure 8.
In the initial stage of the proposed forecasting model, CEEMDAN was first applied to decompose the clustered data. Then, the IMFs obtained from the decomposition are reconstructed using FFT and the IPCC to generate a new reconstructed time series. Next, these reconstructed time series were predicted using a JS-optimized BiLSTM model. Finally, the overall photovoltaic power forecast was obtained by summing the prediction results of each reconstructed time series. Figure 9 presents the comparison between the predicted results of each reconstructed component and the actual observations on the test set.
In this study, the BiLSTM method optimized by JS was applied to predict the NIMF for each group, and the predicted data were then reconstructed to generate the final photovoltaic power prediction. Additionally, TCN [49], PV-Net [50], LSTM-TCN [51], and DeepFEDformer [52] were used for comparison of prediction performance. Figure 10 presents the prediction results of these methods. To comprehensively evaluate the prediction effectiveness, the metrics R², MAPE, and RMSE introduced in Section 3 were employed to assess the final predictions, and the results are shown in Figure 11. The proposed method achieved the highest R² value and the lowest MAPE among all the compared methods, and its RMSE was also significantly lower. Specifically, the RMSE of the proposed method is 37.2833, which is much lower than those of TCN (89.7451), PV-Net (94.2504), LSTM-TCN (89.0610), and DeepFEDformer (76.4208). Regarding the R² metric, the proposed method improves the prediction accuracy by 5.121% compared to TCN, 5.417% compared to PV-Net, 5.129% compared to LSTM-TCN, and 2.165% compared to DeepFEDformer. For MAPE, the improvements are 37.040%, 42.213%, 43.14%, and 22.631%, respectively. These results demonstrate the superior performance of the proposed method for PV prediction.
To statistically verify the significance of performance differences among the compared forecasting models, we conducted a Friedman test based on the RMSE values obtained from 10 independent experiments. The corresponding p-value is far below the significance threshold of 0.05. This indicates that the differences in prediction accuracy among the models are statistically significant. The average ranks for each model across the repeated experiments are as follows: the proposed method achieved the best performance with an average rank of 1.00, followed by DeepFEDformer (2.55), LSTM-TCN (3.25), TCN (3.60), and PV-Net (4.60). These rankings further confirm the consistent superiority of our proposed CEEMDAN–JS–BiLSTM method over the other baseline models. Figure 12 illustrates the average ranks of all models, highlighting the robustness and reliability of the proposed framework across multiple runs.
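The Friedman test itself is available in SciPy; the sketch below illustrates the procedure on placeholder RMSE arrays, which are synthetic stand-ins for the ten recorded runs rather than the paper's raw results.

```python
import numpy as np
from scipy.stats import friedmanchisquare

# Placeholder RMSE results from 10 repeated runs of each model (illustrative only)
rng = np.random.default_rng(0)
rmse = {
    "Proposed":      37 + rng.normal(0, 1.0, 10),
    "DeepFEDformer": 76 + rng.normal(0, 2.0, 10),
    "LSTM-TCN":      89 + rng.normal(0, 2.0, 10),
    "TCN":           90 + rng.normal(0, 2.0, 10),
    "PV-Net":        94 + rng.normal(0, 2.0, 10),
}

stat, p_value = friedmanchisquare(*rmse.values())      # one sample array per model
data = np.column_stack(list(rmse.values()))            # shape: (runs, models)
avg_rank = np.mean([np.argsort(np.argsort(row)) + 1 for row in data], axis=0)

print(f"Friedman statistic = {stat:.3f}, p = {p_value:.4g}")
for model, r in zip(rmse, avg_rank):
    print(f"{model}: average rank {r:.2f}")
```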
To comprehensively evaluate the effectiveness of the proposed CEEMDAN–JS–BiLSTM model, we conducted a series of ablation experiments to isolate the contribution of each component. Specifically, five model variants were constructed and compared: (1) BiLSTM Only: The baseline model without decomposition or optimization; (2) CEEMDAN-BiLSTM: A model that applies signal decomposition but without grouping or JS optimization; (3) JS-BiLSTM: A model that incorporates hyperparameter optimization but no decomposition; (4) CEEMDAN-Grouped BiLSTM: A model with decomposition and FFT–IPCC-based grouping, but without JS optimization; (5) Full Proposed Model (CEEMDAN–JS–BiLSTM): The complete version integrating all modules. Each model variant was trained and tested under identical conditions using the PV dataset. The prediction performance was evaluated using three standard metrics: RMSE, MAPE, and R². The results are summarized in Table 2, and a visual comparison of error distributions is provided in Figure 13.
The ablation results in Table 2 and Figure 13 clearly demonstrate the individual and combined effects of CEEMDAN decomposition, frequency-domain IMF grouping, and JS-based hyperparameter optimization on BiLSTM forecasting performance.
Compared with the baseline BiLSTM model, incorporating CEEMDAN decomposition alone (CEEMDAN-BiLSTM) yields a notable improvement across all metrics, reducing RMSE from 104.89 to 76.32, lowering MAPE from 16.97% to 13.44%, and increasing R 2 from 0.9032 to 0.9433. This confirms that decomposing the original signal into mode-specific components enables the model to better capture multi-scale temporal patterns and suppresses noise interference. Similarly, applying JS-based optimization alone (JS-BiLSTM) also enhances performance, with RMSE dropping to 87.45, MAPE decreasing to 14.74%, and R 2 improving to 0.9265. This highlights the contribution of the JS algorithm in adaptively exploring the hyperparameter space. By effectively balancing global exploration and local exploitation, JS guides the training process toward parameter configurations that improve both generalization and robustness, especially in the presence of noisy or nonlinear data. The performance is further improved when frequency-domain grouping is introduced (CEEMDAN-Grouped BiLSTM), leading to a sharp decrease in RMSE to 46.43, a reduction in MAPE to 10.25%, and an increase in R 2 to 0.9673. This indicates that grouping similar IMFs based on their frequency characteristics not only reduces mode redundancy but also enhances the signal representation, allowing the model to focus on more coherent and relevant features. Ultimately, the full model (CEEMDAN–JS–BiLSTM) delivers the best performance across all evaluation metrics (RMSE: 37.28, MAPE: 8.12%, R 2 : 0.9785), demonstrating the synergistic effects of signal decomposition, frequency-domain grouping, and adaptive hyperparameter optimization. The inclusion of the JS algorithm proves particularly valuable in fine-tuning the BiLSTM model, allowing it to fully leverage the benefits of the decomposed signal structure. These results confirm that each component contributes uniquely to the model’s predictive power, and their integration produces a robust and accurate forecasting framework for short-term PV power prediction.
To validate the significance of differences among the ablation models, we conducted a Friedman test on RMSE scores from 10 repeated runs. The p-value was below 0.05, confirming that each component—CEEMDAN, JS optimization, and frequency grouping—contributes meaningfully to forecasting accuracy. As shown in Figure 14, CEEMDAN–JS–BiLSTM achieved the best performance with an average rank of 1.30, demonstrating advantages in both stability and accuracy. The strong performance of CEEMDAN-Grouped BiLSTM highlights the value of frequency-domain grouping, while the consistency of CEEMDAN–JS–BiLSTM further validates the effectiveness of JS optimization. These results suggest that the integrated design improves both robustness and precision. Future work may explore adaptive fusion strategies and intelligent search mechanisms to further enhance generalization.
To demonstrate the stability and effectiveness of the proposed JS-based hyperparameter optimization process, we conduct convergence experiments on multiple independent runs. Figure 15 illustrates the convergence curves from two independent runs of the JS-based hyperparameter optimization applied to the BiLSTM model. Despite different random initializations resulting in distinct starting MAPE values, both runs exhibit a rapid decline in MAPE within the first 20 iterations. By around iteration 30, both curves converge toward a stable region, demonstrating the algorithm’s ability to efficiently explore and exploit the hyperparameter space. Beyond this point, the MAPE values remain stable with minor fluctuations, indicating successful convergence and robust optimization performance. The slight differences in convergence speed between the two runs highlight the stochastic nature of the JS algorithm, while the consistent final accuracy confirms its effectiveness in reliably identifying well-performing hyperparameter configurations for the CEEMDAN–JS–BiLSTM forecasting model.

5. Conclusions

This study presents a short-term PV power forecasting framework that integrates CEEMDAN decomposition, FFT-based frequency-domain grouping, and BiLSTM models optimized using the JS algorithm. The CEEMDAN–FFT–IPCC module enables fine-grained decomposition and reconstruction of PV signals, effectively capturing multi-scale fluctuations, while the JS algorithm adaptively tunes BiLSTM hyperparameters to significantly enhance predictive accuracy and robustness. Experimental results on real-world datasets demonstrate that the proposed method outperforms several strong baseline models in terms of forecasting precision.
Beyond these improvements, the JS optimizer demonstrates stable convergence and a balanced exploration–exploitation capability in the hyperparameter space, which is particularly beneficial for BiLSTM models that are sensitive to parameter configurations. Nevertheless, we recognize several limitations in the current study. The proposed method has only been validated on a single dataset, and its generalizability across different geographical regions, seasonal variations, and PV system types remains to be systematically assessed. In addition, the computational cost associated with repeated BiLSTM training may restrict large-scale or real-time deployment in its current form.
Future research will aim to address these limitations in several ways. First, expanding evaluation to multi-regional and heterogeneous datasets will help examine the robustness and adaptability of the model. Second, developing adaptive JS parameter control strategies for unknown scenarios may further improve optimization performance and reduce the reliance on fixed parameter settings. Third, incorporating real-time meteorological inputs could enhance forecasting responsiveness under rapidly changing conditions. Finally, designing lightweight forecasting models or applying model compression techniques could facilitate practical deployment on edge devices and embedded platforms.

Author Contributions

Conceptualization, Y.L. (Yanhui Liu) and J.W.; Methodology, Y.L. (Yanhui Liu) and J.W.; Software, J.W.; Validation, L.S. (Lingyun Song), Y.L. (Yicheng Liu), and J.W.; Formal analysis, Y.L. (Yicheng Liu); Investigation, Y.L. (Yicheng Liu); Resources, L.S. (Liqun Shen); Data curation, L.S. (Lingyun Song); Writing—original draft preparation, Y.L. (Yanhui Liu); Writing—review and editing, J.W., L.S. (Lingyun Song), Y.L. (Yicheng Liu), and L.S. (Liqun Shen); Visualization, J.W.; Supervision, L.S. (Liqun Shen); Project administration, Y.L. (Yanhui Liu) and J.W.; Funding acquisition, J.W. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Fundamental Research Foundation for Universities of Heilongjiang Province under Grant Number YWF10236240123. The APC was funded by Harbin Institute of Technology and Suihua University.

Data Availability Statement

The data generated or analyzed during this study are available from the corresponding author upon reasonable request.

Acknowledgments

The authors would like to thank Harbin Institute of Technology and Suihua University for their support.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

PV	Photovoltaic
LSTM	Long Short-Term Memory
CNN	Convolutional Neural Network
CEEMDAN	Complete Ensemble Empirical Mode Decomposition with Adaptive Noise
BiLSTM	Bidirectional Long Short-Term Memory
VMD	Variational Mode Decomposition
SSA	Singular Spectrum Analysis
PTFNet	Physically informed Temporal Fusion Network
EMD	Empirical Mode Decomposition
EEMD	Ensemble Empirical Mode Decomposition
CEEMD	Complete Ensemble Empirical Mode Decomposition
ITD	Intrinsic Time-scale Decomposition
UPITD	Uniform Phase Intrinsic Time-scale Decomposition
IMF	Intrinsic Mode Function
FFT	Fast Fourier Transform
IPCC	Improved Pearson Correlation Coefficient
JS	Jellyfish Search
PCC	Pearson Correlation Coefficient
IQR	Interquartile Range
NIMF	New Intrinsic Mode Function
RMSE	Root Mean Square Error
MAPE	Mean Absolute Percentage Error
TCN	Temporal Convolutional Network

References

  1. Gu, B.; Shen, H.; Lei, X. Forecasting and uncertainty analysis of day-ahead photovoltaic power using a novel forecasting method. Appl. Energy 2021, 299, 117291. [Google Scholar] [CrossRef]
  2. Mayer, M.; Yang, D. Pairing ensemble numerical weather prediction with ensemble physical model chain for probabilistic photovoltaic power forecasting. Renew. Sustain. Energy Rev. 2023, 175, 113171. [Google Scholar] [CrossRef]
  3. Cui, S.; Lyu, S.; Ma, Y.; Wang, K. Improved informer PV power short-term prediction model based on weather typing and AHA-VMD-MPE. Energy 2024, 307, 132766. [Google Scholar] [CrossRef]
  4. Ling, H.; Liu, M.; Fang, Y. Deep Edge-Based Fault Detection for Solar Panels. Sensors 2024, 24, 5348. [Google Scholar] [CrossRef]
  5. Wang, Y.; Yao, Y.; Zou, Q.; Zhao, K.; Hao, Y. Forecasting a Short-Term Photovoltaic Power Model Based on Improved Snake Optimization, Convolutional Neural Network, and Bidirectional Long Short-Term Memory Network. Sensors 2024, 24, 3897. [Google Scholar] [CrossRef]
  6. Joseph, L.P.; Deo, R.C.; Prasad, R.; Salcedo-Sanz, S.; Raj, N.; Soar, J. Near real-time wind speed forecast model with bidirectional LSTM networks. Renew. Energy 2023, 204, 39–58. [Google Scholar] [CrossRef]
  7. Li, S.; Ma, W.; Liu, Z.; Duan, Y.; Tian, C. Short-Term Prediction of Wind Power Based on NWP Error Correction with Time GAN and LSTM-TCN. In Energy Power and Automation Engineering; Springer: Singapore, 2024; pp. 939–948. [Google Scholar]
  8. Wang, L.; Mao, M.; Xie, J.; Liao, Z.; Zhang, H.; Li, H. Accurate solar PV power prediction interval method based on frequency-domain decomposition and LSTM model. Energy 2023, 262, 125592. [Google Scholar] [CrossRef]
  9. Khan, Z.A.; Hussain, T.; Baik, S.W. Dual stream network with attention mechanism for photovoltaic power forecasting. Appl. Energy 2023, 338, 120916. [Google Scholar] [CrossRef]
  10. Rai, A.; Shrivastava, A.; Jana, K.C. Differential attention net: Multi-directed differential attention based hybrid deep learning model for solar power forecasting. Energy 2023, 263, 125746. [Google Scholar] [CrossRef]
  11. Wolff, B.; Kühnert, J.; Lorenz, E.; Kramer, O.; Heinemann, D. Comparing support vector regression for PV power forecasting to a physical modeling approach using measurement, numerical weather prediction, and cloud motion data. Solar Energy 2016, 135, 197–208. [Google Scholar] [CrossRef]
  12. Brabec, M.; Pelikán, E.; Krc, P.; Eben, K.; Musilek, P. Statistical modeling of energy production by photovoltaic farms. In Proceedings of the Electric Power and Energy Conference (EPEC), Halifax, NS, Canada, 25–27 August 2010; IEEE: Piscataway, NJ, USA, 2010; pp. 1–6. [Google Scholar]
  13. Markovics, D.; Martin, J. Comparison of machine learning methods for photovoltaic power forecasting based on numerical weather prediction. Renew. Sustain. Energy Rev. 2022, 161, 112364. [Google Scholar] [CrossRef]
  14. Yang, D.; Dong, Z. Operational photovoltaics power forecasting using seasonal time series ensemble. Solar Energy 2018, 166, 529–541. [Google Scholar] [CrossRef]
  15. Qian, W.; Sui, A. A novel structural adaptive discrete grey prediction model and its application in forecasting renewable energy generation. Expert Syst. Appl. 2021, 186, 115761. [Google Scholar] [CrossRef]
  16. Douiri, M.R. Particle swarm optimized neuro-fuzzy system for photovoltaic power forecasting model. Solar Energy 2019, 184, 91–104. [Google Scholar] [CrossRef]
  17. Akhter, M.N.; Mekhilef, S.; Mokhlis, H.; Almohaimeed, Z.M.; Muhammad, M.A.; Khairuddin, A.S.M.; Akram, R.; Hussain, M.M. An hour-ahead PV power forecasting method based on an RNN-LSTM model for three different PV plants. Energies 2022, 15, 2243. [Google Scholar] [CrossRef]
  18. Li, G.; Guo, S.; Li, X.; Cheng, C. Short-term Forecasting Approach Based on bidirectional long short-term memory and convolutional neural network for Regional Photovoltaic Power Plants. Sustain. Energy Grids Netw. 2023, 34, 101019. [Google Scholar] [CrossRef]
  19. Chen, Z.R.; Bai, Y.L.; Hong, J.T. Constructing two-stream input matrices in a convolutional neural network for photovoltaic power prediction. Eng. Appl. Artif. Intell. 2024, 135, 108814. [Google Scholar] [CrossRef]
  20. Liu, Q.; Li, Y.; Jiang, H.; Chen, Y.; Zhang, J. Short-term photovoltaic power forecasting based on multiple mode decomposition and parallel bidirectional long short term combined with convolutional neural networks. Energy 2024, 286. [Google Scholar] [CrossRef]
  21. Xue, H.; Ma, J.; Zhang, J.; Jin, P.; Wu, J.; Du, F. Power Forecasting for Photovoltaic Microgrid Based on MultiScale CNN-LSTM Network Models. Energies 2024, 17, 3877. [Google Scholar] [CrossRef]
  22. Yu, J.; Li, X.; Yang, L.; Li, L.; Huang, Z.; Shen, K.; Yang, X.; Yang, X.; Xu, Z.; Zhang, D.; et al. Deep Learning Models for PV Power Forecasting. Energies 2024, 17, 3973. [Google Scholar] [CrossRef]
  23. Wang, S.; Huang, Y. Spatio-temporal photovoltaic prediction via a convolutional based hybrid network. Comput. Electr. Eng. 2025, 123, 110021. [Google Scholar] [CrossRef]
  24. Liang, J.; Yin, L.; Xin, Y.; Li, S.; Zhao, Y.; Song, T. Short-term photovoltaic power prediction based on CEEMDAN-PE and BiLSTM neural network. Electr. Power Syst. Res. 2025, 246, 111706. [Google Scholar] [CrossRef]
  25. Hosseini, E.; Saeedpour, B.; Banaei, M.; Ebrahimy, R. Optimized deep neural network architectures for energy consumption and PV production forecasting. Energy Strategy Rev. 2025, 59, 101704. [Google Scholar] [CrossRef]
  26. Bai, M.; Zhou, G.; Yao, P.; Dong, F.; Chen, Y.; Zhou, Z.; Yang, X.; Liu, J.; Yu, D. Deep multi-attribute spatial-temporal graph convolutional recurrent neural network-based multivariable spatial-temporal information fusion for short-term probabilistic forecast of multi-site photovoltaic power. Expert Syst. Appl. 2025, 279, 127458. [Google Scholar] [CrossRef]
  27. Zhai, C.; He, X.; Cao, Z.; Abdou-Tankari, M.; Wang, Y.; Zhang, M. Photovoltaic power forecasting based on VMD-SSA-Transformer: Multidimensional analysis of dataset length, weather mutation and forecast accuracy. Energy 2025, 324, 135971. [Google Scholar] [CrossRef]
  28. Piantadosi, G.; Dutto, S.; Galli, A.; De Vito, S.; Sansone, C.; Di Francia, G. Photovoltaic power forecasting: A Transformer based framework. Energy AI 2024, 18, 100444. [Google Scholar] [CrossRef]
  29. Tao, K.; Zhao, J.; Tao, Y.; Qi, Q.; Tian, Y. Operational day-ahead photovoltaic power forecasting based on transformer variant. Appl. Energy 2024, 373, 123825. [Google Scholar] [CrossRef]
  30. Qu, Z.; Hou, X.; Li, J.; Hu, W. Short-term wind farm cluster power prediction based on dual feature extraction and quadratic decomposition aggregation. Energy 2024, 290, 130155. [Google Scholar] [CrossRef]
  31. Xie, T.; Zhang, G.; Liu, H.; Liu, F.; Du, P. A hybrid forecasting method for solar output power based on variational mode decomposition, deep belief networks and autoregressive moving average. Appl. Sci. 2018, 8, 1901. [Google Scholar] [CrossRef]
  32. Khelifi, R.; Guermoui, M.; Rabehi, A.; Taallah, A.; Zoukel, A.; Ghoneim, S.S.; Bajaj, M.; AboRas, K.M.; Zaitsev, I. Short-Term PV Power Forecasting Using a Hybrid TVF-EMD-ELM Strategy. Int. Trans. Electr. Energy Syst. 2023, 2023, 6413716. [Google Scholar] [CrossRef]
  33. Wu, S.; Guo, H.; Zhang, X.; Wang, F. Short-Term Photovoltaic Power Prediction Based on CEEMDAN and Hybrid Neural Networks. IEEE J. Photovoltaics 2024, 14, 960–969. [Google Scholar] [CrossRef]
  34. Feng, H.; Yu, C. A novel hybrid model for short-term prediction of PV power based on KS-CEEMDAN-SE-LSTM. Renew. Energy Focus 2023, 47, 100497. [Google Scholar] [CrossRef]
  35. Wang, L.; Liu, Y.; Li, T.; Xie, X.; Chang, C. Short-term PV power prediction based on optimized VMD and LSTM. IEEE Access 2020, 8, 165849–165862. [Google Scholar] [CrossRef]
  36. Jodaei, A.; Moravej, Z.; Pazoki, M. Effective protection scheme for transmission lines connected to large scale photovoltaic power plants. Electr. Power Syst. Res. 2024, 228, 110103. [Google Scholar] [CrossRef]
  37. Ma, J.; Bai, X.; Ma, F.; Zhuo, S.; Sun, B.; Li, C. Convolutional Neural Network Design Based on Weak Magnetic Signals and Its Application in Aircraft Bearing Fault Diagnosis. IEEE Sensors J. 2024, 24, 36031–36043. [Google Scholar] [CrossRef]
  38. Jiang, Y.; Zheng, L.; Ding, X. Ultra-short-term prediction of photovoltaic output based on an LSTM-ARMA combined model driven by EEMD. J. Renew. Sustain. Energy 2021, 13, 046103. [Google Scholar] [CrossRef]
  39. Huang, Y.; Liu, J.; Zhang, Z.; Li, D.; Li, X.; Wang, G. Dynamic Combination Forecasting for Short-Term Photovoltaic Power. IEEE Trans. Artif. Intell. 2024, 5, 5277–5289. [Google Scholar] [CrossRef]
  40. Sareen, K.; Panigrahi, B.K.; Shikhola, T. A short-term solar irradiance forecasting modelling approach based on three decomposition algorithms and adaptive neuro-fuzzy inference system. Expert Syst. Appl. 2023, 231, 120770. [Google Scholar] [CrossRef]
  41. Li, S.; Wang, J.; Zhang, H.; Liang, Y. Solar photovoltaic power forecasting system with online manner based on adaptive mode decomposition and multi-objective optimization. Comput. Electr. Eng. 2024, 118, 109407. [Google Scholar] [CrossRef]
  42. Zhang, D.; Chen, B.; Zhu, H.; Goh, H.H.; Dong, Y.; Wu, T. Short-term wind power prediction based on two-layer decomposition and BiTCN-BiLSTM-attention model. Energy 2023, 285, 128762. [Google Scholar] [CrossRef]
  43. Agga, A.; Abbou, A.; Labbadi, M.; El Houm, Y.; Ali, I.H.O. CNN-LSTM: An efficient hybrid deep learning architecture for predicting short-term photovoltaic power production. Electr. Power Syst. Res. 2022, 208, 107908. [Google Scholar] [CrossRef]
  44. Li, N.; Li, L.; Zhang, F.; Jiao, T.; Wang, S.; Liu, X.; Wu, X. Research on short-term photovoltaic power prediction based on multi-scale similar days and ESN-KELM dual core prediction model. Energy 2023, 277, 127557. [Google Scholar] [CrossRef]
  45. Yang, X.; Wang, S.; Peng, Y.; Chen, J.; Meng, L. Short-term photovoltaic power prediction with similar-day integrated by BP-AdaBoost based on the Grey-Markov model. Electr. Power Syst. Res. 2023, 215, 108966. [Google Scholar] [CrossRef]
  46. Liu, Y.; Zuo, H.; Liu, Z.; Fu, Y.; Jia, J.J.; Dhupia, J.S. Electrostatic signal self-adaptive denoising method combined with CEEMDAN and wavelet threshold. Aerospace 2024, 11, 491. [Google Scholar] [CrossRef]
  47. Gong, J.; Qu, Z.; Zhu, Z.; Xu, H. Parallel TimesNet-BiLSTM model for ultra-short-term photovoltaic power forecasting using STL decomposition and auto-tuning. Energy 2025, 320, 135286. [Google Scholar] [CrossRef]
  48. Chou, J.S.; Truong, D.N. A novel metaheuristic optimizer inspired by behavior of jellyfish in ocean. Appl. Math. Comput. 2021, 389, 125535. [Google Scholar] [CrossRef]
  49. Li, Y.; Song, L.; Zhang, S.; Kraus, L.; Adcox, T.; Willardson, R.; Komandur, A.; Lu, N. A TCN-based hybrid forecasting framework for hours-ahead utility-scale PV forecasting. IEEE Trans. Smart Grid 2023, 14, 4073–4085. [Google Scholar] [CrossRef]
  50. Abdel-Basset, M.; Hawash, H.; Chakrabortty, R.K.; Ryan, M. PV-Net: An innovative deep learning approach for efficient forecasting of short-term photovoltaic energy production. J. Clean. Prod. 2021, 303, 127037. [Google Scholar] [CrossRef]
  51. Limouni, T.; Yaagoubi, R.; Bouziane, K.; Guissi, K.; Baali, E.H. Accurate one step and multistep forecasting of very short-term PV power using LSTM-TCN model. Renew. Energy 2023, 205, 1010–1024. [Google Scholar] [CrossRef]
  52. Wen, Y.; Pan, S.; Li, X.; Li, Z.; Wen, W. Improving multi-site photovoltaic forecasting with relevance amplification: DeepFEDformer-based approach. Energy 2024, 299, 131479. [Google Scholar] [CrossRef]
Figure 1. Structure of the LSTM neural network.
Figure 2. Structure of the BiLSTM network.
Figure 3. The structure of the CEEMDAN-JS-BiLSTM model.
Figure 4. PV power raw data: (a) the recorded raw data between 7:00 a.m. and 5:00 p.m. (from 21 May 2018 to 4 July 2018), with each line representing the data for one day; (b) the complete dataset formed by arranging the daily data points in chronological order.
Figure 5. Decomposed IMFs of the PV power output obtained by the CEEMDAN method.
Figure 6. FFT transform of the decomposed IMFs of the PV power output.
Figure 7. Improved Pearson correlation coefficient of adjacent FFT waveforms.
Figure 8. Reconstructed IMFs (NIMFs) obtained by the IPCC rule.
Figure 9. Test set prediction results versus actual values for the reconstructed time series components.
Figure 10. Predicted PV power output of the compared methods.
Figure 11. Performance comparison of different prediction methods on the test set in terms of R², MAPE, and RMSE.
Figure 12. Ranking plot of model performance.
Figure 13. Bar graphs of the performance metrics of the different models in the ablation experiments.
Figure 14. Ranking diagram for the ablation models.
Figure 15. Convergence curves of two independent runs of the JS-based hyperparameter optimization for the CEEMDAN–JS–BiLSTM model.
Table 1. IPCC-based similarity between consecutive IMF components.
IMF Pair	IPCC Value	Correlation
IMF1 and IMF2	0.67	High
IMF2 and IMF3	0.41	Low
IMF3 and IMF4	0.64	High
IMF4 and IMF5	0.63	High
IMF5 and IMF6	0.24	Low
Table 2. Performance indicators of different models in the ablation experiments.
Model	RMSE	MAPE (%)	R²
BiLSTM	104.8925	16.9725	0.9032
CEEMDAN-BiLSTM	76.3219	13.4378	0.9433
JS-BiLSTM	87.4532	14.7441	0.9265
CEEMDAN-Grouped BiLSTM	46.4325	10.2461	0.9673
CEEMDAN–JS–BiLSTM	37.2833	8.1231	0.9785