4.2. Data Description
The present study employs data obtained from a Longyuan wind farm, collected from October to December 2024. The data comprise the power output measured at 15 min intervals, yielding 96 sampling points per day. The dataset is divided into a training set and a testing set at a ratio of 8:2. The primary objective of the study is short-term (one-step-ahead) wind power forecasting: the model predicts the next 15 min interval from historical data. This is formulated as a regression problem in which the features consist of the preceding 24 h window of wind power and meteorological data, and the target output is the wind power at the subsequent 15 min timestep. The initial wind power data collection is shown in
Figure 7.
Figure 8 illustrates the feature-label mapping and the temporal relationship.
Figure 9 depicts the sliding window approach, which generates successive training samples. The training and testing datasets are split chronologically, ensuring that the training data comes before the test data. This prevents future information from leaking into the training phase, which is crucial for time series forecasting.
Figure 7 displays the raw wind power time series collected from the Longyuan wind farm. The horizontal axis represents time (sampled at 15 min intervals), while the vertical axis shows the corresponding power output values (in kW or MW). The figure clearly illustrates the pronounced non-stationarity, volatility, and intermittency of wind power output, attributable to the random nature of meteorological factors such as wind speed and direction, and highlights the challenge of forecasting directly from the raw sequence. Consequently, it underscores the necessity of employing Variational Mode Decomposition (VMD) for signal preprocessing to reduce sequence complexity, as adopted in this paper.
Figure 8 clearly delineates the forecasting task of this study in the form of a sequence diagram. The diagram explicitly labels:
- Historical Window: a 24 h period (comprising 96 sampling points at 15 min intervals), containing the wind power output and associated meteorological characteristics within this timeframe.
- Forecasting Point: the 15 min time point immediately following the historical window (i.e., the 97th point), representing the target value to be predicted by the model.
- Timeline: denoted by “t − 95” to “t” for the 96 consecutive time points within the historical window, and “t + 1” for the future time point to be forecasted.
This diagram visually illustrates the single-step-ahead forecasting configuration described herein: utilizing the past 24 h of data to forecast the power output 15 min ahead, thereby establishing the temporal correspondence between features (X) and labels (y).
Figure 9 illustrates how a sliding window approach constructs a sample set for model training and testing from the original time series. The figure shows the window sliding rightward along the time axis at a fixed length (24 h), advancing one time step (15 min) per slide. The data within each window constitutes an input sample (features), while the power value at the immediately subsequent time point serves as that sample’s output label. This approach transforms a long time series into a sequence of supervised learning samples suitable for training machine learning models such as Support Vector Machines (SVMs). The diagram further illustrates the principle of temporal integrity in dataset construction: training and test sets are partitioned chronologically to prevent data leakage when the model predicts future data.
The division between training and test sets is strictly conducted in chronological order, ensuring that training data precedes test data temporally. This partitioning method prevents the leakage of future information during the training phase, adhering to the fundamental principles of time series forecasting.
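The windowing and chronological splitting described above can be sketched as follows; `make_samples` and `chrono_split` are illustrative helper names of ours, not code from the original study:

```python
def make_samples(series, window=96):
    """Build one-step-ahead supervised samples:
    X = the previous 24 h (96 points at 15 min intervals), y = the next point."""
    X, y = [], []
    for t in range(window, len(series)):
        X.append(series[t - window:t])  # the preceding `window` points
        y.append(series[t])             # the immediately following point
    return X, y

def chrono_split(X, y, train_ratio=0.8):
    """Chronological 8:2 split: every training sample precedes every test
    sample in time, so no future information leaks into training."""
    cut = int(len(X) * train_ratio)
    return X[:cut], y[:cut], X[cut:], y[cut:]
```

In practice the feature window would also carry the meteorological variables alongside power; the sketch keeps a single series for brevity.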
4.3. Outlier Handling
During the acquisition of wind power data, sensor malfunctions and power-related failures can lead to erroneous or missing data [33]. Consequently, outliers are an inevitable part of the dataset, and they can severely affect the accuracy of power forecasting. Identifying abnormal wind power data involves detecting and analyzing missing values, duplicate values, and outliers [34,35].
The quartile method has been introduced as a data analysis technique that employs the quantiles of the data to identify outliers (see
Figure 10 for a visual representation of the principle).
Q2 and IQR denote the median and the interquartile range, respectively, and the interquartile range is calculated as in Equation (28):

IQR = Q3 − Q1 (28)

where Q1 and Q3 denote the first and third quartiles, respectively. Data values exceeding the upper limit or falling below the lower limit are considered outliers. The upper and lower limits are calculated as in Equations (29) and (30):

Upper limit = Q3 + 1.5 × IQR (29)

Lower limit = Q1 − 1.5 × IQR (30)
Outliers are regarded as anomalous data points. Considering that wind power exhibits a continuous variation pattern with a high correlation among three consecutive points, the mean of the preceding and succeeding values is adopted for data imputation [36,37].
Implementation to Prevent Data Leakage: The quartiles (Q1, Q3) and the resulting upper and lower limits are calculated solely from the training dataset. This ensures that the criteria for identifying outliers are derived independently, without any information from the test set. The same calculated bounds are then applied to screen the entire dataset (including the test set) for consistency, but the test set does not influence the bound determination.
Considering that wind power exhibits a continuous variation pattern with a high correlation among consecutive points, detected outliers are imputed using the mean of the immediately preceding and succeeding valid values within the same dataset partition (training or test set). For edge cases (e.g., the first or last point being an outlier), a simple linear interpolation from neighboring valid points is used.
Clarification on Temporal Integrity: This imputation method is applied independently within the training and testing phases. During model training, only historical data (the training set) is available. Any outlier in the training set is replaced using the mean of its adjacent values within the training set, which are chronologically prior and posterior to it. This process does not utilize any future information from the test set. During the testing phase, if an outlier is identified in the test set input features (based on the pre-calculated bounds from the training set), its imputation similarly relies only on adjacent values within the test set sequence itself. This strict separation guarantees that the model’s training process is not contaminated by future or test information, adhering to fundamental time-series forecasting principles.
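A minimal sketch of this train-only quartile screening and neighbour-mean imputation follows. The function names are ours, `statistics.quantiles` (with its default exclusive method) stands in for whatever quartile estimator the study used, and edge points fall back to the nearest valid value rather than full linear interpolation:

```python
from statistics import quantiles

def fit_bounds(train):
    # Q1, Q3 and IQR come from the training set only (cf. Equations (28)-(30)),
    # so the test set never influences the outlier criteria.
    q1, _, q3 = quantiles(train, n=4)
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

def impute_outliers(series, lower, upper):
    """Replace out-of-bound points with the mean of the nearest valid
    neighbours on each side, within a single dataset partition."""
    clean = list(series)
    valid = [lower <= v <= upper for v in clean]
    for i, ok in enumerate(valid):
        if ok:
            continue
        prev = next((clean[j] for j in range(i - 1, -1, -1) if valid[j]), None)
        nxt = next((clean[j] for j in range(i + 1, len(clean)) if valid[j]), None)
        if prev is not None and nxt is not None:
            clean[i] = (prev + nxt) / 2  # mean of preceding and succeeding values
        else:
            clean[i] = prev if prev is not None else nxt  # edge-case fallback
    return clean
```

Because `valid` is frozen before the loop, an imputed value is never itself reused as a neighbour for another outlier.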
4.5. Data Decomposition
The selection of the optimal number of decomposed modes, K, is of paramount importance for the effectiveness of VMD. The present study employs a comprehensive multi-criteria decision-making approach, determining the optimal K value by integrating center frequency stability analysis with sample entropy assessment. With regard to the VMD, it is important to note that training and testing are conducted separately. During the training phase, VMD is applied solely to historical wind power sequences within the training set, with the resulting modal components (IMFs) and their statistical characteristics (such as center frequency and sample entropy) derived entirely from the training data. Test set data is excluded from the decomposition process, thereby fundamentally preventing test information leakage from the decomposition stage into the model training phase.
4.5.1. Center Frequency Stability Analysis
The center frequency analysis examines the stabilization of the last component’s frequency across different K values, as presented in
Table 4. When the center frequency of the final IMF stabilizes, it indicates thorough decomposition without generating redundant noise-dominant modes.
As shown in
Table 4, when K ≥ 4, the variation in the center frequency of the last IMF becomes significantly smaller, indicating that the decomposition is already thorough. However, a critical observation emerges at K = 8: the center frequency difference between the last two IMFs (IMF7 and IMF8) decreases dramatically to 0.001 Hz, compared to 1.576 Hz at K = 7. This minimal frequency separation at K = 8 suggests potential modal aliasing or the decomposition of noise into separate modes, indicating over-decomposition.
4.5.2. Sample Entropy Analysis for Decomposition Adequacy Assessment
To quantitatively evaluate decomposition quality and avoid subjective judgment, Sample Entropy (SE) analysis was introduced. Sample Entropy measures the complexity and regularity of time series: lower SE values indicate more regular, predictable sequences. The SE distributions for different K values are presented in
Table 5.
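Sample Entropy can be computed as below. This is a generic textbook implementation (template length m = 2, tolerance r = 0.2 × standard deviation by default), not the study's exact code:

```python
import math

def sample_entropy(x, m=2, r=None):
    """SampEn = -ln(A/B), where B counts template pairs of length m within
    tolerance r (Chebyshev distance) and A counts the same for length m+1.
    Lower values indicate a more regular, predictable sequence."""
    n = len(x)
    if r is None:
        mean = sum(x) / n
        r = 0.2 * math.sqrt(sum((v - mean) ** 2 for v in x) / n)

    def count_matches(length):
        # Use n - m templates for both lengths, per the standard definition.
        templates = [x[i:i + length] for i in range(n - m)]
        hits = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):
                if max(abs(a - b) for a, b in zip(templates[i], templates[j])) <= r:
                    hits += 1
        return hits

    b, a = count_matches(m), count_matches(m + 1)
    return float("inf") if a == 0 or b == 0 else -math.log(a / b)
```

A perfectly periodic series scores near zero, while an irregular series scores higher, which is exactly the property exploited when comparing decompositions.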
4.5.3. Comprehensive Decision-Making for K = 7 Selection
The selection of K = 7 as the optimal decomposition mode number is based on the following multi-faceted analysis:
- (1)
Sample Entropy Marginal Benefit Analysis
The average sample entropy (excluding the last component, typically representing residual noise) shows a consistent and significant reduction from K = 2 to K = 7, with reduction rates ranging from 21.5% to 24.6%. This indicates substantial gains in signal regularity with each additional mode. However, from K = 7 to K = 8, the reduction rate drops sharply to 15.6%, signaling diminishing returns and suggesting that additional modes beyond K = 7 contribute minimally to signal decomposition while potentially capturing noise.
- (2)
Avoidance of Over-Decomposition
At K = 8, IMF8 exhibits an extremely low sample entropy value of 0.01. In information theory, such near-zero entropy values indicate sequences with minimal information content, likely representing noise decomposed into artificial modes. This phenomenon, combined with the negligible center frequency difference between IMF7 and IMF8 (0.001 Hz), strongly suggests over-decomposition at K = 8.
- (3)
Computational Efficiency Balance
While VMD complexity increases with K, the performance improvement from K = 7 to K = 8 is marginal. The significant computational cost increase does not justify the minimal gain in decomposition quality, making K = 7 the practical optimum for balancing accuracy and efficiency.
- (4)
Center Frequency Distribution Rationality
At K = 7, all IMFs maintain distinct center frequency separations (minimum difference of 1.576 Hz between IMF6 and IMF7), effectively avoiding modal aliasing. In contrast, K = 8 shows nearly identical frequencies for the last two modes, indicating frequency overlap and potential information redundancy.
- (5)
Reconstruction Accuracy Sufficiency
Empirical testing showed that the reconstruction error at K = 7 was 0.14%, while at K = 8 it was 0.13%, an improvement of only 0.01 percentage points. This negligible enhancement confirms that K = 7 provides sufficient decomposition accuracy for subsequent forecasting tasks.
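The decision rules above can be combined into a simple selector. The entropy and frequency-gap values used below are illustrative stand-ins that mimic the reported reduction rates, not the actual entries of Tables 4 and 5, and the function is our sketch of the multi-criteria procedure:

```python
def select_k(avg_se, freq_gap, min_reduction=0.20, min_gap=0.01):
    """Accept each additional mode while (a) the average sample entropy still
    drops by at least `min_reduction` relative to K-1 and (b) the centre
    frequencies of the last two IMFs stay at least `min_gap` Hz apart
    (guarding against over-decomposition and modal aliasing)."""
    ks = sorted(avg_se)
    best = ks[0]
    for k in ks[1:]:
        reduction = (avg_se[k - 1] - avg_se[k]) / avg_se[k - 1]
        if reduction < min_reduction or freq_gap[k] < min_gap:
            break
        best = k
    return best
```

With values mirroring the paper's pattern (21–25% entropy reductions up to K = 7, a 15.6% drop at K = 8, and a 0.001 Hz gap at K = 8), either criterion alone rejects K = 8 and the selector returns 7.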
4.5.4. Final VMD with K = 7
Based on the above comprehensive analysis considering decomposition adequacy, computational efficiency, and avoidance of over-decomposition, this study selects K = 7 as the optimal VMD mode number. The VMD results for the training historical data with K = 7 are presented in
Figure 12.
As illustrated in
Figure 12, the wind power sub-series obtained through VMD with K = 7 exhibit distinct regularity and periodicity while effectively capturing the variation trends of the original data. Each sub-series highlights local characteristics more clearly, providing a stable foundation for subsequent forecasting models.
4.7. Model Testing
In order to evaluate the prediction performance of the model proposed in this paper, we predict the test set data and compare the results with those of other models. Specifically, the same data are first fed into the other models to construct the comparison baselines. The pure SVM model and the SVM augmented with the VMD strategy are then compared with the VMD-IDBO-SVM model, which incorporates the additional improvement strategies. The relevant parameter settings are shown in
Table 6. The comparison plots of the prediction results of the SVM and VMD-IDBO-SVM models are shown in
Figure 13, and the results of the error comparison are shown in
Figure 14. The 3D prediction performance scatter plot is shown in
Figure 15, the regression scatter plot is shown in
Figure 16, and the correlation coefficient heat map is shown in
Figure 17. In order to validate the robustness of the key parameter selection, preliminary sensitivity analyses were conducted on the VMD mode number K and the IDBO population size. With all other conditions held constant, varying K between 5 and 9 produced less than 5% variation in the average RMSE on the validation set. This finding suggests that the model maintains consistent performance around the optimal value of K ≈ 7. The final parameter settings for each model employed in this study are summarized in
Table 6. Specifically, the VMD-IDBO-SVM model utilizes K = 7 modes for VMD. The IDBO algorithm parameters adhere to the settings outlined in
Table 3. The SVM’s C and γ parameters are independently optimized by IDBO for each IMF sub-sequence rather than being fixed values, demonstrating the model’s adaptive capability. All preprocessing steps in this study—including outlier handling, normalization, feature selection, and signal decomposition—were performed independently on the training set. The test set was used solely for final performance evaluation. This rigorous data isolation strategy ensures the reliability and unbiasedness of model performance assessment, eliminating the possibility of overfitting due to data leakage.
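The under-5% RMSE variation claim corresponds to a relative-range calculation such as the following; the RMSE values shown are hypothetical placeholders, not the study's measurements:

```python
def relative_rmse_range(rmse_by_k):
    """Spread of validation RMSE across K values, relative to the mean RMSE."""
    vals = list(rmse_by_k.values())
    mean = sum(vals) / len(vals)
    return (max(vals) - min(vals)) / mean

# Hypothetical validation RMSEs for K = 5..9, centred near the optimum K = 7
rmse_by_k = {5: 4.31, 6: 4.22, 7: 4.13, 8: 4.18, 9: 4.27}
```

A result below 0.05 would support the paper's statement that performance is stable around K = 7.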
The figure compares the prediction results of the traditional Support Vector Machine (SVM) model and the VMD-IDBO-SVM model proposed in this paper on the test set, with sample points on the horizontal axis and wind power values on the vertical axis. The red or solid lines (predictions of the VMD-IDBO-SVM model) lie closer to the true-value curves than the black or dashed lines (predictions of the traditional SVM model). Especially in regions with sharp fluctuations, the VMD-IDBO-SVM model still tracks the real changes well, which indicates its strong fitting ability and adaptability.
The graph below illustrates the prediction error (i.e., residuals) of the SVM in comparison to the VMD-IDBO-SVM model on the test set. The error value is indicative of the discrepancy between the predicted value and the true value. The graph reveals that the error fluctuation range of VMD-IDBO-SVM is more limited and the distribution is more concentrated, suggesting that its prediction results are more stable and that the error control effect is superior to that of the traditional SVM model.
The plot illustrates the relationship among the values predicted, the true measurements, and other salient features (e.g., time or wind speed) in a three-dimensional scatter format. The closer the distribution of points is to the diagonal plane, the higher the prediction accuracy is. The point set of the VMD-IDBO-SVM model is more tightly clustered around the diagonal line, suggesting that it maintains high prediction consistency in different dimensions.
The figure shows a two-dimensional scatter plot of predicted versus true values with a regression line. Ideally, the points are evenly distributed on both sides of the regression line and close to the diagonal. The more concentrated point set and the regression-line slope closer to 1 for the VMD-IDBO-SVM model suggest better linear fitting ability and prediction accuracy.
This map illustrates the Pearson correlation coefficients among the characteristic variables and the wind power output. Darker colors (e.g., blue) indicate stronger positive correlations, while lighter colors (e.g., pink) indicate stronger negative correlations. The figure shows that the wind speed at different heights is highly correlated with the power output, which verifies the reasonableness of the feature selection.
The proposed model is further evaluated by comparing the predictions of models built with individual strategies: the original SVM, VMD-SVM, VMD-DBO-SVM, and VMD-IDBO-SVM. The single-strategy prediction comparison graph is shown in
Figure 18, and a locally enlarged view of this comparison is shown in
Figure 19.
The figure compares the prediction results of several single-strategy models, including SVM, VMD-SVM, VMD-DBO-SVM, and VMD-IDBO-SVM. VMD-IDBO-SVM performs best over the whole time series, especially at peaks and valleys, where its predictions are closer to the true values, suggesting better overall performance than the other single-strategy models.
The figure is a locally enlarged version of
Figure 18, highlighting the details of the prediction for a particular time interval. It is evident that VMD-IDBO-SVM still tracks the true value accurately in the fast-change interval, while the other models show lag or bias, further validating its local fitting ability.
To examine the optimization capability of IDBO within the combined model, combined models incorporating different optimization algorithms are compared. SSA, PSO, and GWO are each added to the VMD-SVM model to form three prediction models, which are compared with the combined prediction model proposed in this study under identical iterative optimization conditions. The test set prediction results are presented in
Figure 20. A local enlargement of the test results is shown in
Figure 21.
The figure compares the predictive performance of different optimization algorithms (SSA, PSO, GWO, IDBO) integrated with the VMD-SVM model. The horizontal axis represents the sample indices, while the vertical axis denotes the power output values. The prediction curves generated by the VMD-IDBO-SVM model demonstrate the closest alignment with the true value curves, indicating that the IDBO algorithm exhibits a distinct advantage in parameter optimization.
This figure is a partial enlargement of
Figure 20, which highlights the details of the prediction in a certain high volatility interval. the VMD-IDBO-SVM still closely follows the actual values at the inflection point, while the other combined models show different degrees of deviation, which proves their robustness in dealing with the non-stationary sequences.
From
Figure 20 and
Figure 21, it can be seen that the differing optimization abilities of the various algorithms also affect the accuracy of the prediction models. The VMD-IDBO-SVM model fits the actual values most closely and tracks them most accurately at abrupt inflection points. The prediction models are evaluated using three indexes, the results of which are shown in
Table 7.
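The three evaluation indexes reported in Table 7 (MAE, RMSE, and R²) can be computed as follows; these are the standard textbook definitions, implemented here as a sketch rather than the study's own code:

```python
import math

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(a - b) for a, b in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true))

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((a - b) ** 2 for a, b in zip(y_true, y_pred))
    ss_tot = sum((a - mean_y) ** 2 for a in y_true)
    return 1 - ss_res / ss_tot
```

Lower MAE and RMSE and an R² closer to 1 indicate better predictions, which is how the models in Table 7 are ranked.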
As illustrated in
Table 7, which compares the prediction performance of the different models on the test set. The results show a stepwise improvement in model performance. The traditional SVM model has the lowest prediction accuracy (MAE = 36.236, RMSE = 43.302, R² = 0.704) because no decomposition or optimization strategy is introduced. After the introduction of VMD (VMD-SVM), the errors are significantly reduced (MAE = 12.161, RMSE = 15.059) and R² improves to 0.813, verifying the key role of VMD in reducing data non-stationarity and volatility and providing a more stable basis for subsequent prediction. The further introduction of optimization algorithms yields continued improvement: VMD-SSA-SVM, VMD-PSO-SVM, and VMD-GWO-SVM improve progressively, while the VMD-DBO-SVM model demonstrates strong competitiveness (MAE = 6.357, RMSE = 7.993, R² = 0.921). The VMD-IDBO-SVM model presented in this paper outperforms all the other models examined, reducing MAE and RMSE substantially to 3.315 and 4.130, respectively, while attaining an R² of 0.985. This result fully verifies the significant advantages of the IDBO algorithm over the other optimization algorithms in parameter optimization and global search capability, as well as the effectiveness of the proposed hybrid modeling framework in improving the accuracy and stability of short-term wind power forecasts.