1. Introduction
The power output of offshore wind farms depends strongly on complex and highly variable marine meteorological conditions [1,2]. Under extreme operating conditions (EOCs), such as strong winds, typhoons, abrupt wind direction shifts, intensified turbulence, and pronounced atmospheric instability, wind power exhibits drastic fluctuations and abrupt transitions, characterized by pronounced non-stationarity, intermittency, and strong nonlinearity [3,4]. The complexity of offshore wind environments is driven by multi-scale environmental interactions, and the resulting thermodynamic and aerodynamic processes call for robust forecasting models capable of capturing severe non-stationary fluctuations under extreme marine conditions [5]. Under such conditions, the prediction errors of conventional forecasting models are often significantly amplified, which constrains grid dispatch decisions, reserve capacity allocation, and the secure and stable operation of offshore wind farms [6,7]. Therefore, developing high-precision offshore wind power forecasting methods tailored to EOCs has become a critical scientific and engineering challenge in wind power grid integration and intelligent dispatch [8,9,10].
Existing offshore wind power forecasting approaches can generally be categorized into physics-based models, statistical models, and data-driven models [11,12,13]. Physics-based models rely heavily on accurate meteorological forecasts and detailed turbine parameters [14,15,16]. Under EOCs, they struggle to accurately characterize rapidly evolving nonlinear coupling processes, and their model construction and parameter calibration are typically complex and computationally intensive. Statistical models are generally built upon linear or weakly nonlinear assumptions, and their predictive performance deteriorates significantly when confronted with sudden disturbances and severe fluctuations [17,18,19]. In recent years, with the advancement of computational capabilities and the accumulation of large-scale operational data, deep learning methods have achieved substantial progress in wind power forecasting owing to their end-to-end modeling capacity [20,21].
While recurrent architectures like Long Short-Term Memory (LSTM) networks and their bidirectional variants effectively model long-term dependencies [22,23,24], their sequential computation and insufficient local feature extraction limit their performance during high-frequency power fluctuations under EOCs [25].
Conversely, Temporal Convolutional Networks (TCN) offer high parallel computational efficiency and strong local temporal feature extraction via causal dilated convolutions [26,27,28]. However, they struggle to capture complex long-term dependencies and high-order nonlinear mappings independently [29].
Crucially, these existing deep learning forecasting models generally rely on fixed-form activation functions. This limitation restricts their functional expressiveness when approximating the highly complex and rapidly evolving nonlinear dynamics encountered during extreme marine conditions [30,31]. The Kolmogorov–Arnold Network (KAN), as a novel neural architecture, replaces fixed activation functions in conventional networks with learnable one-dimensional functions. Theoretically, this design endows KAN with enhanced function approximation capability and greater flexibility in nonlinear representation, offering a new paradigm for modeling complex systems [32,33,34]. In this sense, KAN sits between traditional multilayer perceptrons (MLPs) and symbolic regression, combining scalable learning capacity with a degree of mathematical interpretability. However, the application of KAN in wind power forecasting, particularly for time-series modeling under EOCs, remains relatively limited, and its integration mechanisms with mainstream temporal deep learning models warrant further investigation [35].
Based on the foregoing analysis, this study proposes a hybrid forecasting architecture, termed TCN–BiLSTM–KAN, to address the structural limitations of conventional deep learning models under EOCs. Rather than a simple mechanical combination of existing computational modules, this architecture is specifically designed to align with the complex physical dynamics of offshore wind power. A Temporal Convolutional Network is first employed to suppress high-frequency turbulence disturbances and construct a denoised local feature space. A Bidirectional Long Short-Term Memory network then models the sequential temporal dependencies. Crucially, instead of relying on traditional fully connected layers, a Kolmogorov–Arnold Network is integrated as the core nonlinear mapping module. By utilizing learnable spline functions, this design mathematically avoids the memory inertia typical of recurrent networks and the local minima often encountered by attention mechanisms, enabling optimal adaptive approximation of steep power ramps. Experimental results confirm that this targeted structural regularization provides a reliable technical solution for the secure grid integration of offshore wind power.
The main contributions of this study are summarized as follows:
(1) A multi-source feature fusion forecasting framework is designed specifically for extreme marine conditions. By employing the learnable spline functions of the Kolmogorov–Arnold Network to replace conventional fixed activation functions, the architecture provides a targeted mathematical approach to approximate the complex nonlinear fluid dynamics embedded in meteorological data.
(2) The proposed methodology mitigates the generalization degradation caused by the long-tail distribution of extreme samples. Empirical evaluations demonstrate that the architecture reduces the root mean square error to 3.58 MW under severe operating conditions, objectively mitigating phase lag and amplitude attenuation during abrupt power ramp events.
(3) The structural robustness of the model is validated through cross-site application. In an independent offshore site evaluation, the architecture maintains a goodness-of-fit of 97.18 percent, indicating its preliminary spatial adaptability within similar marine environments and demonstrating its capacity to capture the underlying physical mappings of wind energy conversion without relying exclusively on site-specific stationary characteristics.
The remainder of this paper is organized as follows. Section 2 defines the physical characteristics of EOCs and quantitatively analyzes the correlations among multi-source data. Section 3 develops the TCN–BiLSTM–KAN multi-source feature fusion forecasting framework based on learnable spline functions. Section 4 comprehensively validates the model’s robustness and spatial generalization capability through extreme condition analyses, ablation studies, and independent cross-site testing. Finally, Section 5 concludes the paper.
2. Data Description
2.1. Data Preprocessing
The dataset used in this study was obtained from the operational records of an offshore wind farm located in Fujian Province, China, with a total installed capacity of 48 MW. To comprehensively evaluate the predictive performance of the model under complex marine conditions, long-term operational data from 1 January 2022 to 1 April 2023 were used as the primary dataset, with a temporal resolution of 15 min. After rigorous outlier cleaning and threshold-based screening, 4000 valid samples representing EOCs were accurately extracted from the long-term time series. Meanwhile, to ensure balanced and scientifically sound model evaluation, an additional 4000 samples representing normal operating conditions and 4000 samples covering comprehensive operating scenarios were extracted in parallel, forming the final dataset used for comparative experiments.
The dataset includes key variables such as atmospheric pressure, relative humidity, cloud cover, wind speed and wind direction at 10 m and 100 m heights, ambient temperature, solar radiation intensity, precipitation, and the actual power output of the wind farm. These multidimensional features collectively provide a comprehensive representation of the complex operating environment and power response characteristics of offshore wind power systems.
Regarding the justification of multi-source meteorological features, while wind speed acts as the primary mechanical driver of turbine power output, auxiliary variables such as solar irradiance play a critical thermodynamic role. In the marine atmospheric boundary layer, solar irradiance dictates the differential heating of the ocean surface, thereby continuously influencing thermal stratification and atmospheric stability. This thermodynamic variance directly alters the local vertical wind shear profile and convective circulation patterns. Consequently, integrating solar irradiance into the forecasting framework allows the deep learning architecture to effectively capture the complex nonlinear interactions between thermal dynamics and aerodynamic fluctuations, providing essential predictive information during extreme or rapidly transitioning marine weather events.
During the data preprocessing stage, in order to preserve genuine physical fluctuations under EOCs while accurately removing pseudo-noise caused by sensor faults, the Pauta criterion (3σ rule) was adopted to identify outliers. For a sampling point $x_i$ in the time series, if the following condition is satisfied, it is identified as an outlier and replaced with a null value:
$$|x_i - \mu| > 3\sigma$$
In the above expression, μ and σ denote the sample mean and standard deviation of the variable, respectively.
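Subsequently, missing and abnormal segments were processed using a classification strategy. For short gaps with missing durations ≤ 1 h (continuous steps ≤ 4), a linear interpolation method was applied:
$$x_t = x_{t_a} + \frac{t - t_a}{t_b - t_a}\left(x_{t_b} - x_{t_a}\right), \quad t_a < t < t_b$$
where $t_a$ and $t_b$ denote the nearest valid sampling instants before and after the gap.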
Local reconstruction was performed to satisfy the strict temporal continuity requirements of TCN and BiLSTM. For large missing segments exceeding 1 h caused by communication interruptions or similar issues, the data were directly truncated and removed to avoid introducing artificial bias.
Finally, all multi-source meteorological and power output variables were temporally aligned and scaled using min-max normalization:
$$x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$$
The input features were uniformly mapped to the [0, 1] interval to eliminate the adverse effects of dimensional differences and numerical scale variations on model training.
For model development and evaluation, the dataset was chronologically divided into training, validation, and test sets with a ratio of 8:1:1.
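The preprocessing steps described above can be sketched in a few functions. This is a minimal illustration using only the Python standard library, not the authors' pipeline: the function names (`mask_outliers`, `fill_short_gaps`, `min_max_scale`) are hypothetical, missing values are represented as `None`, and gaps longer than four steps are simply left as `None` so that the caller can truncate those segments.

```python
from statistics import mean, stdev


def mask_outliers(values, k=3.0):
    """Apply the 3-sigma rule: points with |x - mu| > k * sigma become None."""
    valid = [v for v in values if v is not None]
    mu, sigma = mean(valid), stdev(valid)
    return [None if v is not None and abs(v - mu) > k * sigma else v
            for v in values]


def fill_short_gaps(values, max_gap=4):
    """Linearly interpolate interior runs of None that are at most max_gap long.

    Longer runs (and runs touching either end) are left as None.
    """
    out = list(values)
    i, n = 0, len(out)
    while i < n:
        if out[i] is not None:
            i += 1
            continue
        start = i
        while i < n and out[i] is None:
            i += 1
        end = i  # index of the first valid point after the gap, or n
        if start > 0 and end < n and (end - start) <= max_gap:
            left, right = out[start - 1], out[end]
            for j in range(start, end):
                frac = (j - (start - 1)) / (end - (start - 1))
                out[j] = left + frac * (right - left)
    return out


def min_max_scale(values):
    """Map the valid entries to [0, 1]; None entries are passed through."""
    valid = [v for v in values if v is not None]
    lo, hi = min(valid), max(valid)
    span = hi - lo
    return [None if v is None else ((v - lo) / span if span else 0.0)
            for v in values]
```

In a chronological 8:1:1 split, the minimum and maximum would normally be taken from the training portion only and reused for the validation and test sets; the text does not state which convention was used.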
2.2. Definition of Extreme Operating Conditions
To identify extreme operating conditions (EOCs) during the operation of the offshore wind power system, this study constructs discriminative features from two perspectives, wind speed variation characteristics and power response behavior, based on data with a 15 min temporal resolution. Let $v_t$ denote the wind speed at a height of 100 m at time $t$, and $P_t$ the corresponding active power output of the wind farm.
First, to characterize abrupt wind speed variations over short time scales, the absolute change in wind speed between two consecutive time steps is defined as:
$$\Delta v_t = |v_t - v_{t-1}|$$
This indicator effectively captures rapid increases or decreases in wind speed over a 15 min time scale. Furthermore, to characterize the intensity of short-term wind speed fluctuations and their non-stationary characteristics, the standard deviation of wind speed within a 1 h sliding time window is adopted as a statistical metric, defined as:
$$\sigma_t = \sqrt{\frac{1}{N}\sum_{i=0}^{N-1}\left(v_{t-i} - \bar{v}_t\right)^2}, \qquad \bar{v}_t = \frac{1}{N}\sum_{i=0}^{N-1} v_{t-i}$$
The window length $N = 4$ corresponds to four consecutive 15 min sampling intervals. In addition, to characterize abrupt power increases or decreases caused by drastic wind speed variations or turbine control strategies, the absolute change in power output between two consecutive time steps is introduced:
$$\Delta P_t = |P_t - P_{t-1}|$$
The above three categories of features characterize the typical behavioral patterns of wind turbines under EOCs from the perspectives of abrupt wind speed changes, high wind speed volatility, and severe power responses.
With regard to the determination of extreme condition thresholds, a quantile-based adaptive thresholding method is adopted to avoid reliance on empirical parameters or the rated operating conditions of a specific wind farm. Specifically, the 95th percentiles of the wind speed change $\Delta v_t$, wind speed fluctuation intensity $\sigma_t$, and power change $\Delta P_t$ are computed from the sample data and used as the extreme condition thresholds for the corresponding features:
$$T_{\Delta v} = Q_{0.95}(\Delta v_t), \quad T_{\sigma} = Q_{0.95}(\sigma_t), \quad T_{\Delta P} = Q_{0.95}(\Delta P_t)$$
Here, Q0.95(·) denotes the 95th percentile of the empirical distribution. This threshold selection reflects the low-probability, high-impact statistical characteristics of EOCs. It ensures the representativeness of extreme samples while preventing excessive sample scarcity caused by overly stringent thresholds, thereby maintaining the stability of subsequent model training and evaluation.
Based on the above thresholds, a time step is classified as an extreme operating condition if any of the defined features exceeds its corresponding threshold at that time. The decision rule is defined as:
$$\mathrm{EOC}_t = \begin{cases} 1, & \Delta v_t > T_{\Delta v} \ \text{or} \ \sigma_t > T_{\sigma} \ \text{or} \ \Delta P_t > T_{\Delta P} \\ 0, & \text{otherwise} \end{cases}$$
Here, $T_{\Delta v}$, $T_{\sigma}$, and $T_{\Delta P}$ denote the extreme condition thresholds for wind speed change, wind speed fluctuation intensity, and power change, respectively. This OR-logic-based decision criterion enables the simultaneous identification of multiple types of extreme behaviors, including abrupt wind speed changes, sustained high-volatility wind conditions, and severe power fluctuations induced by meteorological factors or turbine control strategies.
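The quantile-based labeling rule can be illustrated with a short standard-library sketch. The helper names (`quantile`, `abs_diff`, `rolling_std`, `label_eoc`) are hypothetical, and several details are assumptions not stated in the text: the quantile uses linear interpolation between order statistics, the rolling standard deviation divides by the window length, the first time step is given a change of zero, and early windows with fewer than four points use the points available.

```python
import math


def quantile(values, q):
    """Empirical quantile with linear interpolation between order statistics."""
    xs = sorted(values)
    pos = q * (len(xs) - 1)
    lo, hi = math.floor(pos), math.ceil(pos)
    return xs[lo] + (pos - lo) * (xs[hi] - xs[lo])


def abs_diff(series):
    """|x_t - x_{t-1}|; the first step has no predecessor and is set to 0."""
    return [0.0] + [abs(b - a) for a, b in zip(series, series[1:])]


def rolling_std(series, window=4):
    """Population standard deviation over the trailing `window` samples."""
    out = []
    for t in range(len(series)):
        w = series[max(0, t - window + 1): t + 1]
        m = sum(w) / len(w)
        out.append(math.sqrt(sum((x - m) ** 2 for x in w) / len(w)))
    return out


def label_eoc(wind, power, q=0.95, window=4):
    """Flag a step as extreme if any feature exceeds its q-quantile threshold."""
    dv = abs_diff(wind)
    sv = rolling_std(wind, window)
    dp = abs_diff(power)
    t_dv, t_sv, t_dp = quantile(dv, q), quantile(sv, q), quantile(dp, q)
    return [a > t_dv or b > t_sv or c > t_dp for a, b, c in zip(dv, sv, dp)]
```

Because the three conditions are combined with OR, the flagged share can exceed 5% of the samples, which is consistent with a proportion of extreme samples above the nominal 5% tail of any single feature.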
After applying the proposed method to classify EOCs in the offshore wind power dataset, extreme samples account for approximately 10.81% of the total observations. This proportion is broadly consistent with the occurrence frequency of abnormal or highly volatile conditions in real-world wind power system operations, indicating that the proposed identification method achieves a sound balance between statistical rigor and engineering practicality. Because the thresholds are adaptively determined using quantile-based criteria, the proposed method exhibits strong transferability across different wind farms and operational environments, thereby providing a consistent and reliable basis for constructing forecasting models under extreme conditions and for conducting cross-site comparative experiments.
By applying the predefined extreme condition thresholds to the entire long-term dataset, exactly 4636 extreme condition samples were initially identified. After applying the aforementioned cleaning rules to eliminate abnormal and invalid readings, a total of 636 abnormal records were removed. This rigorous screening process ultimately yielded a high-quality subset of exactly 4000 valid extreme condition samples.
To verify the validity of the above threshold settings and evaluate the model’s sensitivity to the definition of extreme operating condition boundaries, a threshold sensitivity analysis was further conducted.
Figure 1 illustrates the evolution of prediction error (RMSE) for each model when the screening thresholds are set to 90%, 95%, and 98%, respectively.
The analysis indicates that as the threshold criteria become stricter, the predictive performance of the conventional baseline models exhibits a pronounced stepwise degradation. For instance, when the threshold is set at 98%, the RMSE values of the CNN-BiLSTM and CNN-GRU-Attention models surge to 9.56 MW and 10.15 MW, respectively, revealing their representational limitations in handling high-frequency and high-intensity distortion features. In contrast, across the wide threshold range from 90% to 98%, the RMSE of the proposed model increases only marginally from 3.12 MW to 3.72 MW, consistently maintaining the best performance while keeping error fluctuations within a very narrow margin. This sensitivity analysis demonstrates that the superior predictive robustness of the proposed model does not depend on any specific data partition point. It also confirms that adopting the 95th percentile as the baseline threshold effectively captures and amplifies system characteristics under extreme meteorological conditions while preserving the model’s generalization capability.
2.3. Correlation of Multi-Source Data
To analyze the linear relationships between different meteorological factors and wind power output, and to provide a basis for subsequent feature selection and model development, a statistical correlation analysis was conducted between each meteorological variable and wind power output in the dataset. Considering the continuity of the data and its widespread use in engineering applications, the Pearson correlation coefficient was employed to quantify the degree of linear correlation between variables.
Let random variables X and Y denote any two meteorological or power-related features. The Pearson correlation coefficient between them is defined as:
$$\rho_{X,Y} = \frac{\mathrm{Cov}(X, Y)}{\sigma_X \sigma_Y}$$
Here, Cov(·) denotes the covariance, and σX and σY represent the standard deviations of variables X and Y, respectively. The correlation coefficient ranges from −1 to 1; a larger absolute value indicates a stronger linear relationship between the two variables.
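As a small worked illustration of this definition, the coefficient can be computed directly from two equally long sequences. The function name `pearson` is a hypothetical choice, and the sketch assumes neither input is constant (otherwise the denominator is zero).

```python
import math


def pearson(x, y):
    """Pearson correlation: covariance divided by the product of std devs.

    The 1/n factors cancel between numerator and denominator, so plain
    sums of centered products are sufficient.
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)
```

Applying the function to every pair of feature columns yields the symmetric matrix that a correlation heatmap visualizes.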
The selected features include atmospheric pressure, relative humidity, cloud cover, wind speed and wind direction at 10 m, temperature, solar irradiance, precipitation, wind speed and wind direction at 100 m, and wind farm power output. The Pearson correlation coefficients among these variables were calculated, and the corresponding correlation heatmap is presented in Figure 2.
Figure 2 visually illustrates the initial linear correlations between multi-source meteorological features and wind power output. The results indicate that wind speeds at 100 m and 10 m exhibit the strongest positive correlations with power output, with coefficients of 0.783 and 0.781, respectively, confirming wind speed as the primary physical driving factor. Atmospheric pressure shows a moderate positive correlation, whereas temperature and wind direction exhibit clear negative correlations, preliminarily confirming the linear constraints imposed by air density variations and wake effects on wind power generation.
Although the heatmap reveals strong multicollinearity between the wind speeds at 100 m and 10 m, neither variable was removed in the subsequent modeling process. From a physical perspective, the difference between wind speeds at these two heights forms the vertical wind shear profile, which is a key meteorological trigger for abrupt aerodynamic load changes and severe power fluctuations under extreme marine conditions; removing either variable would result in the loss of critical vertical spatial information. From an algorithmic perspective, unlike traditional linear regression models that are sensitive to multicollinearity, the deep neural architecture proposed in this study is robust to feature collinearity and can perform dimensionality reduction and feature-level fusion of highly correlated multidimensional wind speed sequences through front-end causal dilated convolutions. The underlying deep nonlinear aerodynamic mapping relationships among these multi-source features are subsequently explored and extracted through the core KAN for high-dimensional feature representation.
Regarding the strong multicollinearity observed between the 10 m and 100 m wind speed variables, both features are deliberately retained. From an aerodynamic perspective, preserving both surface-level and hub-height wind velocities enables the network to implicitly capture the vertical wind shear profile. The differential gradient between the surface friction flow and the higher-altitude free stream provides critical indicators regarding atmospheric boundary layer stability under extreme marine weather. From an algorithmic perspective, unlike traditional linear statistical models that suffer from variance inflation, the proposed deep learning architecture natively processes highly correlated multidimensional inputs. Specifically, the causal dilated convolutions within the Temporal Convolutional Network effectively act as a robust feature extractor, adaptively assigning optimal connection weights to both the shared trends and the distinct differential variances of these spatial sequences without the risk of overfitting.
3. Integrated Prediction Model Architecture
3.1. Overall Architecture Design
To address the pronounced non-stationarity, multi-scale fluctuation characteristics, and complex nonlinear mapping relationships exhibited by offshore wind power sequences under EOCs, this study proposes a hybrid forecasting model that integrates a TCN, a BiLSTM, and a KAN, referred to as TCN–BiLSTM–KAN. A schematic diagram of the model architecture is presented in Figure 3.
The overall architecture follows a hierarchical modeling paradigm consisting of local feature extraction, global temporal dependency modeling, and high-order nonlinear mapping. First, the TCN module extracts local abrupt-change features and multi-scale temporal patterns from the raw wind power time series under EOCs. Subsequently, the BiLSTM module performs bidirectional modeling on the high-dimensional feature sequences generated by the TCN to capture long-term temporal dependencies. Finally, the KAN module is incorporated to conduct nonlinear function approximation on the deep temporal features, thereby further enhancing the model’s representational capacity and predictive robustness under EOCs.
3.2. TCN Structure
Under EOCs, offshore wind power time series often contain frequent short-term abrupt changes, high-frequency oscillations, and localized anomalous patterns. To effectively capture such local temporal features, a TCN is introduced as the front-end feature extraction module of the proposed model. The TCN employs a causal convolution structure to ensure that no future information is incorporated during the forecasting process. For a one-dimensional time series $x_t$, the causal convolution can be expressed as:
$$y_t = \sum_{k=0}^{K-1} \omega_k \, x_{t - d \cdot k}$$
where $K$ denotes the kernel size, $\omega_k$ represents the convolutional weights, and $d$ is the dilation factor. When $d = 1$, the above formulation reduces to the standard causal convolution; when $d > 1$, it corresponds to the dilated causal convolution.
Additionally, the TCN employs a residual connection structure to mitigate the vanishing gradient problem in deep network training, and its output can be expressed as:
$$o = \sigma\left(x + F(x)\right)$$
Here, F(·) denotes the convolutional transformation, and σ(·) represents the nonlinear activation function.
In this study, the TCN module serves to effectively extract local abrupt-change features under EOCs; capture multi-scale, high-frequency temporal patterns; and provide stable and robust feature representations for subsequent temporal modeling. In the specific implementation of the TCN module, four stacked residual blocks are employed. To systematically expand the temporal receptive field, the dilation factors for these consecutive layers are strictly configured as 1, 2, 4, and 8, ensuring comprehensive coverage of historical meteorological dependencies.
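The causal dilated convolution and the effect of the dilation schedule 1, 2, 4, 8 can be illustrated with a minimal standard-library sketch. The function names are hypothetical, positions before the start of the series are treated as zeros (left zero-padding), and the kernel size of 3 and the single convolution per residual block used in the usage note are assumptions, since the text does not specify them.

```python
def causal_dilated_conv(x, weights, dilation=1):
    """Compute y_t = sum_k w_k * x_{t - d*k}, using zeros before the series start.

    Only current and past samples are read, so y_t never depends on the future.
    """
    out = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(weights):
            idx = t - dilation * k
            if idx >= 0:
                acc += w * x[idx]
        out.append(acc)
    return out


def receptive_field(kernel_size, dilations, convs_per_block=1):
    """Number of past-and-present steps visible to one output of the stack."""
    return 1 + convs_per_block * (kernel_size - 1) * sum(dilations)
```

Under these assumptions, `receptive_field(3, [1, 2, 4, 8])` gives 31 time steps, i.e. roughly 7.75 h of history at a 15 min resolution; with two convolutions per block the same schedule would cover 61 steps.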
3.3. BiLSTM Structure
Although the TCN excels at extracting local features, its ability to capture long-term dependencies remains limited. To further model the global temporal characteristics of wind power under EOCs, a BiLSTM is introduced.
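The LSTM unit regulates information flow through a gating mechanism, and its core computations are as follows:
$$\begin{aligned} f_t &= \sigma\left(W_f [h_{t-1}, x_t] + b_f\right) \\ i_t &= \sigma\left(W_i [h_{t-1}, x_t] + b_i\right) \\ \tilde{c}_t &= \tanh\left(W_c [h_{t-1}, x_t] + b_c\right) \\ c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \\ o_t &= \sigma\left(W_o [h_{t-1}, x_t] + b_o\right) \\ h_t &= o_t \odot \tanh(c_t) \end{aligned}$$
where $f_t$, $i_t$, and $o_t$ are the forget, input, and output gates, $c_t$ and $h_t$ are the cell and hidden states, σ(·) is the sigmoid function, and ⊙ denotes element-wise multiplication.
The BiLSTM simultaneously models the time series in both forward and backward directions, and its output is given by:
$$h_t = \left[\overrightarrow{h}_t \, ; \overleftarrow{h}_t\right]$$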
The BiLSTM module serves to capture long-term dependencies in wind power sequences, leverage bidirectional information to enhance understanding of complex temporal structures, and improve the model’s overall temporal modeling capability under EOCs.
3.4. KAN Structure
Under EOCs, the input–output relationship of wind power often exhibits highly complex nonlinear mappings. Traditional neural networks rely on fixed activation functions, which limit their expressive capability in such scenarios. To address this, a KAN is introduced as the output mapping module of the model.
According to the Kolmogorov–Arnold representation theorem, a multivariate function can be expressed as a superposition of univariate functions. KAN replaces the fixed activation functions in conventional neural networks with learnable univariate functions, which can be formulated as:
$$y = \sum_{i=1}^{n} \phi_i(x_i)$$
Here, ϕi(·) denotes a learnable univariate function, typically parameterized using spline or piecewise functions. In this study, KAN receives the high-dimensional features HBiLSTM from the BiLSTM and performs a precise approximation of complex nonlinear relationships through adaptive function mapping.
In the specific algorithmic configuration of the Kolmogorov–Arnold Network module, a cubic B-spline parameterization is utilized to guarantee optimal mathematical smoothness and mapping capacity. The degree of the learnable splines is strictly set to 3, ensuring second-order continuous differentiability for stable gradient propagation. Concurrently, the number of grid intervals is configured to 5. This configuration provides enough parameters to fit the steep power curves under marine conditions, without introducing redundant nodes that typically cause structural overfitting.
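To make the spline configuration concrete, the following standard-library sketch evaluates cubic B-spline bases on a uniform grid with 5 intervals and uses them as learnable univariate functions in a single KAN-style layer. It is an illustration under stated assumptions rather than the authors' implementation: the function names are hypothetical, the input range [0, 1] is assumed from the min-max normalization, the grid is extended by `degree` knots on each side, and the additional base-activation term used in some KAN implementations is omitted.

```python
def make_knots(lo=0.0, hi=1.0, grid=5, degree=3):
    """Uniform knot vector extended by `degree` knots beyond each boundary."""
    h = (hi - lo) / grid
    return [lo + (i - degree) * h for i in range(grid + 2 * degree + 1)]


def bspline_basis(x, knots, degree=3):
    """Values of all B-spline basis functions at x (Cox-de Boor recursion)."""
    # Degree 0: indicator functions of the half-open knot intervals.
    basis = [1.0 if knots[i] <= x < knots[i + 1] else 0.0
             for i in range(len(knots) - 1)]
    for d in range(1, degree + 1):
        nxt = []
        for i in range(len(knots) - d - 1):
            left = (x - knots[i]) / (knots[i + d] - knots[i]) * basis[i]
            right = ((knots[i + d + 1] - x)
                     / (knots[i + d + 1] - knots[i + 1]) * basis[i + 1])
            nxt.append(left + right)
        basis = nxt
    return basis  # length = grid + degree


def kan_layer(x, coeffs, knots, degree=3):
    """y_j = sum_i phi_{j,i}(x_i), with phi_{j,i}(x) = sum_m c_m * B_m(x).

    coeffs[j][i] holds the spline coefficients of the edge from input i
    to output j.
    """
    bases = [bspline_basis(xi, knots, degree) for xi in x]
    return [sum(sum(c * b for c, b in zip(coeffs[j][i], bases[i]))
                for i in range(len(x)))
            for j in range(len(coeffs))]
```

With degree 3 and 5 grid intervals, each univariate function has 5 + 3 = 8 trainable coefficients, which is the sense in which the configuration keeps the parameter count per edge small.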
The core functions of the KAN module are to enhance the model’s ability to approximate complex nonlinear relationships under EOCs, reduce dependence on fixed-form activation functions, and improve the model’s representational flexibility and predictive robustness.
3.5. Training Strategies and Optimization Parameters
During the model training phase, the Mean Squared Error (MSE) was adopted as the loss function to measure the discrepancy between the predicted power output and the actual power generation. Its quadratic penalty weights large errors more heavily, so the gradients are dominated by the severe, long-tailed power fluctuations under extreme marine conditions, which helps limit dangerous under-predictions during critical grid dispatch scenarios. Model parameters were optimized using the Adam optimization algorithm to balance convergence speed and adaptive learning during training.
To ensure effective learning under EOCs, the number of training epochs was set to 80, while the batch size was deliberately configured as 5. Under extreme marine conditions, severe power fluctuations and ramping events inherently belong to sparse long-tail samples. Conventional large batch sizes (e.g., 32 or 64) tend to dilute or average the gradient information of these rare extreme samples with numerous stable samples within the same batch. The micro-batch strategy adopted in this study introduces moderate stochastic gradient noise, thereby significantly enhancing the sensitivity of gradient updates and encouraging the model to respond more actively to local meteorological perturbations, enabling it to better capture the highly nonlinear fluctuations of wind power sequences under extreme conditions.
In terms of hyperparameter configuration and stability control, the initial learning rate was set to $1 \times 10^{-4}$ to prevent training oscillations caused by excessively large parameter updates and to promote smoother convergence. Meanwhile, to ensure training stability and reproducibility of the deep architecture composed of BiLSTM and KAN when handling highly distorted extreme data, an optimization constraint mechanism was employed instead of the conventional early stopping strategy. Specifically, a weight decay coefficient of $1 \times 10^{-5}$ was incorporated into the Adam optimizer to implement L2 regularization, constraining model complexity and reducing the risk of overfitting. During backpropagation, gradient clipping with a maximum norm of 1.0 was applied to limit the risk of gradient explosion when the network attempts to fit extreme spikes.
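The settings reported in this subsection can be collected in one place, together with a sketch of gradient clipping by global norm. The dictionary keys and the function name `clip_by_global_norm` are hypothetical, and gradients are represented as a flat list of floats purely for illustration; a deep learning framework would apply the same rescaling to its parameter tensors.

```python
import math

# Values as reported in the text; the key names are illustrative only.
TRAIN_CONFIG = {
    "loss": "mse",
    "optimizer": "adam",
    "epochs": 80,
    "batch_size": 5,
    "learning_rate": 1e-4,
    "weight_decay": 1e-5,
    "max_grad_norm": 1.0,
}


def clip_by_global_norm(grads, max_norm=1.0):
    """Rescale the gradient vector so that its L2 norm is at most max_norm.

    Gradients already inside the norm ball are returned unchanged, so the
    update direction is always preserved.
    """
    total = math.sqrt(sum(g * g for g in grads))
    if total <= max_norm:
        return list(grads)
    scale = max_norm / total
    return [g * scale for g in grads]
```

For example, a gradient of `[3.0, 4.0]` has norm 5 and is rescaled by a factor of 0.2, whereas `[0.3, 0.4]` has norm 0.5 and is left untouched.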
With the above training strategies and optimization settings, the proposed TCN–BiLSTM–KAN hybrid model achieves highly stable training while delivering high-accuracy and robust predictions of offshore wind power under EOCs.
4. Case Study Analysis
4.1. Experiment Settings
All experiments in this study were conducted on the same platform to ensure consistency in model training and inference. The hardware platform is equipped with an Intel(R) Core(TM) i7-10700F CPU @ 2.90 GHz (Intel, Santa Clara, CA, USA), an NVIDIA GeForce RTX 3060 GPU (NVIDIA, Santa Clara, CA, USA) for accelerated computation, and 16 GB of RAM, and runs Windows 11 (64-bit). The software environment consists of Python 3.12, with neural network models implemented using the PyTorch (version 2.9.0) deep learning framework.
To comprehensively evaluate the superiority of the proposed KAN-based hybrid model for offshore wind power forecasting under EOCs, four representative deep learning models were selected as baselines for comparative experiments. These models encompass variants of convolutional neural networks, recurrent units, attention mechanisms, and Transformer architectures, specifically: CNN-BiLSTM, TCN-BiGRU, Transformer-KAN, and CNN-GRU-Attention.
To ensure fairness and objectivity in the comparative experiments, a controlled-variable strategy was strictly adopted during the model training stage so that all models were optimized under equivalent conditions. The core hyperparameters of all deep learning and machine learning baseline models were configured according to widely recognized standard settings in the relevant research domain. In addition, all models were trained and optimized under a completely unified external training environment. Specifically, the Adam optimizer was consistently employed across all models, while the initial learning rate, batch size, and maximum number of training epochs were kept identical.
To quantitatively evaluate the forecasting performance of the models under EOCs, this study employs Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), and the coefficient of determination (R²) as evaluation metrics.
RMSE and MAE measure the magnitude of deviation between predicted and observed values, with smaller values indicating higher prediction accuracy. R² quantifies the proportion of variance in the data explained by the model, with values closer to 100% indicating better model performance.
RMSE is sensitive to large-error samples, such as abrupt power changes under EOCs, with units in megawatts (MW).
MAE reflects the average magnitude of prediction errors, providing good robustness, with units in megawatts (MW).
R² measures the model’s ability to explain the variability in wind power fluctuations, expressed as a percentage (%).
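The three metrics follow their standard definitions, which can be written out as a short standard-library sketch. The function names are hypothetical, and the coefficient of determination is returned as a percentage to match the convention used in this paper.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error, in the units of the target (MW here)."""
    n = len(y_true)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / n)


def mae(y_true, y_pred):
    """Mean absolute error, in the units of the target (MW here)."""
    n = len(y_true)
    return sum(abs(a - b) for a, b in zip(y_true, y_pred)) / n


def r2_percent(y_true, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot, as a percentage."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((a - b) ** 2 for a, b in zip(y_true, y_pred))
    ss_tot = sum((a - mean_true) ** 2 for a in y_true)
    return 100.0 * (1.0 - ss_res / ss_tot)
```

A single large miss illustrates why RMSE reacts more strongly than MAE: for observations `[1, 2, 3, 4]` and predictions `[1, 2, 3, 6]`, the MAE is 0.5 while the RMSE is 1.0.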
4.2. Robustness Verification Under Extreme Conditions
To further assess the robustness and generalization capability of the models under the most severe grid operating conditions, this study focuses on a subset of extreme scenarios characterized by strong volatility and steep power ramps, conducting a targeted evaluation of each model’s forecasting performance.
Figure 4 shows that the actual power exhibits pronounced non-stationary oscillations and steep ramp-up and ramp-down behaviors. Among the compared models, the proposed model achieves the closest alignment with the actual values, capturing both peak and trough extremes and responding rapidly at abrupt power changes with minimal phase lag. Baseline models such as CNN-BiLSTM, CNN-GRU-Attention, and XGBoost exhibit pronounced smoothing effects in highly oscillatory regions, making it difficult to reproduce the rich high-frequency details present under EOCs. To provide a qualitative analysis of model performance under extreme marine weather, a highly volatile operational window between samples 270 and 340 is extracted as a specific case study. As illustrated in the forecasting curves, this period represents an extreme event characterized by rapid aerodynamic variations: the real power trajectory exhibits a rapid increase near sample 270, followed by a sharp decrease around sample 300. During these fluctuations, benchmarks such as the TCN-BiGRU and CNN-BiLSTM models show noticeable prediction lags and deviations, whereas the proposed architecture maintains consistent tracking performance. By utilizing the nonlinear mapping of the Kolmogorov–Arnold Network, the proposed model follows the real power curve closely, mitigating the forecasting delays commonly observed in baseline methods during sudden weather events. This qualitative evidence supports the dynamic tracking ability of the proposed algorithm under extreme marine conditions.
Table 1 and Figure 5 present detailed quantitative error metrics for the proposed KAN-based hybrid model and the baseline models under these high-risk scenarios.
In terms of RMSE, which is sensitive to large errors, the proposed model achieves only 3.58 MW. This represents a 56.8% reduction compared to the worst-performing baseline, CNN-BiLSTM, and a 42.3% decrease relative to Transformer-KAN. It also outperforms other representative methods such as XGBoost, with an RMSE of 6.32 MW, and Informer, with 5.57 MW. These results indicate that under abrupt wind speed changes and steep power ramps, prediction errors in conventional deep learning models are significantly amplified, whereas the proposed model, leveraging KAN’s strong nonlinear approximation capability, mitigates the overall impact of extreme errors. Regarding R2, the proposed model achieves 97.84%, making it the only model to surpass 95% under EOCs, whereas baselines such as XGBoost and Informer reach only 89.68% and 92.25%, respectively. Notably, although CNN-GRU-Attention achieves a relatively high R2 of 91.80%, its RMSE reaches 8.12 MW. This highlights a potential limitation of the attention mechanism: while it captures the trend of power fluctuations through weight allocation, it exhibits substantial deviations in amplitude prediction. In contrast, the proposed model captures both trends and amplitudes, indicating that KAN spline functions are well suited to modeling large-magnitude fluctuations.
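The relative reductions quoted above follow from simple arithmetic on the reported RMSE values. The short sketch below recomputes them for the baselines whose RMSE is stated in the text; the helper name `relative_reduction` is an illustrative assumption.

```python
def relative_reduction(baseline_rmse, model_rmse):
    """Percentage by which model_rmse is lower than baseline_rmse."""
    return 100.0 * (baseline_rmse - model_rmse) / baseline_rmse

PROPOSED_RMSE_MW = 3.58  # reported RMSE of the proposed model under EOCs

# Baseline RMSE values (MW) as stated in the text for extreme conditions.
baseline_rmse_mw = {"CNN-BiLSTM": 8.29, "XGBoost": 6.32, "Informer": 5.57}

reductions = {
    name: relative_reduction(rmse, PROPOSED_RMSE_MW)
    for name, rmse in baseline_rmse_mw.items()
}
# reductions["CNN-BiLSTM"] rounds to 56.8, matching the figure quoted above.
```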
Furthermore, the significant reduction in RMSE is consistent with a shorter prediction delay. When power levels change suddenly, any slow reaction creates a large gap between the predicted and actual values, which RMSE penalizes heavily. The much lower error of the proposed model therefore suggests that it reacts to sudden weather changes more promptly than the baseline models.
However, the accuracy of Transformer-KAN remains substantially lower than that of the proposed hybrid model. This further underscores the necessity of the proposed integration strategy: relying solely on KAN for function approximation is insufficient, and KAN must be combined with targeted feature extraction modules that handle non-stationary noise under extreme conditions in order to realize its full potential.
The recurrent baselines CNN-BiLSTM and TCN-BiGRU perform poorly under EOCs, with RMSE values of 8.29 MW and 6.88 MW, respectively. This is primarily attributed to the inherent memory inertia of RNNs. When encountering abrupt wind speed spikes or drops, RNNs tend to smooth historical information, resulting in noticeable phase lag in predictions and an inability to respond promptly to high-frequency, sudden power fluctuations.
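Phase lag of this kind can be quantified in a simple way. The sketch below is not the procedure used in this study: it assumes the lag is estimated as the shift, in samples, that minimizes the mean squared error between the forecast and a delayed copy of the actual series. The function name `estimate_lag`, the `max_lag` bound, and the sample data are hypothetical.

```python
def estimate_lag(actual, predicted, max_lag=5):
    """Return the shift k >= 0 that minimizes the MSE between predicted[t]
    and actual[t - k]; k > 0 means the forecast trails the actual series."""
    best_lag, best_mse = 0, float("inf")
    for k in range(max_lag + 1):
        # Compare each forecast value with the actual value k steps earlier.
        squared = [(predicted[t] - actual[t - k]) ** 2
                   for t in range(k, len(actual))]
        mse = sum(squared) / len(squared)
        if mse < best_mse:  # strict inequality keeps the smallest lag on ties
            best_lag, best_mse = k, mse
    return best_lag

# Hypothetical ramp event in MW and a forecast that trails it by two samples.
actual = [10, 10, 12, 18, 30, 38, 40, 35, 22, 12, 10, 10]
lagged_forecast = [actual[0]] * 2 + actual[:-2]

lag = estimate_lag(actual, lagged_forecast)
```

A forecast that merely repeats recent history yields a positive lag here, which is the behavior the text attributes to recurrent baselines during abrupt changes.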
Regarding computational overhead, the training times indicate a favorable trade-off between accuracy and efficiency. The proposed architecture requires 59.32 s for training, while the TCN-BiGRU baseline requires 58.18 s. Although the BiLSTM uses more gates per unit than the BiGRU and therefore adds some computation, the resulting overhead of 1.14 s (about 2%) is practically negligible. Given the large reduction in forecasting error, this marginal increase in computational cost is justified by the gains in predictive precision and stability under complex marine environments.
Despite the robust predictive performance achieved under complex marine environments, the proposed architecture possesses inherent limitations. Specifically, the deep temporal extraction mechanisms and the empirical distribution fitting processes heavily rely on the continuity and quality of the input meteorological sequences. In practical offshore applications, scenarios involving prolonged sensor failures or extreme data sparsity can severely disrupt the input temporal dependencies. Under such extreme missing-data conditions, the model might struggle to maintain its localized feature representation capabilities, potentially leading to noticeable performance degradation.
4.3. Comprehensive Performance Under Normal Operating Conditions and Full Operating Conditions
Under normal operating conditions, after excluding extreme samples, wind speed fluctuations become relatively moderate, mainly testing the model’s ability to capture baseline trends in wind power output. To further evaluate generalization capability, datasets under normal and extreme conditions were merged, and overall performance testing was conducted. The corresponding results are shown in Figure 6.
Figure 6 illustrates that, even under normal conditions, wind power exhibits significant periodic fluctuations ranging from 5 MW to 40 MW. The proposed model accurately tracks rapid ramp-up (samples 160–200) and recovery phases (samples 230–260) with negligible phase lag. In contrast, baseline models capture the general trend but show reduced accuracy at local turning points.
Under full operating conditions, which include both steady and highly fluctuating intervals, the proposed model maintains robust and consistent performance across different regimes. It achieves accurate tracking in both low-power and high-oscillation scenarios without the typical trade-off observed in conventional models. Quantitative results are provided in Table 2 and Figure 7.
Even under stable operating conditions, the proposed KAN-based hybrid model maintains a clear advantage. It achieves an RMSE of 2.08 MW, an MAE of only 1.46 MW, and an R2 as high as 98.20%. This indicates that, in the absence of severe disturbances, the KAN approximates the physical wind power curve very closely, with minimal noise. Other representative methods, such as the ensemble learning model XGBoost with an RMSE of 3.24 MW and the time-series model Informer with an RMSE of 3.15 MW, exhibit noticeably higher errors. Although the TCN-BiGRU model performs reasonably, with an RMSE of 2.94 MW, its error remains approximately 41% higher than that of the proposed model. This demonstrates that the KAN is not solely suited to extreme nonlinear scenarios; it also outperforms conventional gated recurrent units in regression tasks within regular operating ranges.
In the comprehensive test set encompassing various complex operating conditions, the proposed model achieves an RMSE of 2.52 MW, lower than that of all comparative models. While advanced baselines such as Informer and Transformer-KAN achieve relatively close RMSE values of 2.63 MW and 2.67 MW, respectively, a substantial gap exists in terms of R2. The proposed model maintains an R2 of 97.38%, making it the only architecture among the compared models that exceeds 95%. In contrast, the R2 of Transformer-KAN is only 92.85%, and that of Informer only 91.04%. Although Transformer-KAN achieves relatively small average numerical errors, its ability to explain the trend of power fluctuations is significantly weaker than that of the proposed model, and it tends to lose critical local features. The proposed model and Transformer-KAN both incorporate the KAN structure, and both outperform the traditional models, whose RMSE values remain above 3.00 MW. This further supports the effectiveness of KAN in modeling wind power time series. Nevertheless, the proposed hybrid architecture demonstrates clear superiority over the Transformer-KAN combination, indicating that feature engineering modules specifically designed for offshore wind power characteristics are crucial for enhancing overall predictive accuracy.
From normal to comprehensive operating conditions, CNN-BiLSTM exhibits a clear performance decline, with R2 decreasing from 91.88% to 90.46%, indicating its susceptibility to interference from extreme samples. Similarly, XGBoost and Informer experience noticeable R2 drops to 90.68% and 91.04%, respectively. In contrast, the proposed model shows only a slight decrease in R2 from 98.20% to 97.38%, demonstrating strong robustness. This indicates that the model does not merely memorize stationary patterns but learns the underlying physical mapping between wind speed and power output, enabling smooth transitions across different operating conditions.
To quantitatively evaluate the environmental adaptability and predictive accuracy of the models under varying marine conditions, Figure 8 compares the R2 values of the proposed model and the baseline models across extreme, normal, and comprehensive operating conditions. Under normal and comprehensive operating conditions, all models demonstrate fundamental fitting capability. However, under extreme conditions, the predictive performance of the baseline models deteriorates markedly, with the R2 of CNN-BiLSTM dropping below 89%. In contrast, the proposed model maintains the highest predictive accuracy across all three operating conditions, and its goodness-of-fit under extreme scenarios shows no significant degradation, creating the largest performance gap relative to the comparison models. This comparison indicates that the proposed multi-source feature fusion architecture alleviates forecasting bottlenecks under extreme marine conditions and remains robust across all three scenarios.
4.4. Cross-Site Generalization Ability Experiment
To further evaluate the initial spatial adaptability of the proposed model and reduce the potential bias caused by a single dataset, we introduced an independent offshore wind farm, designated as Site B, in the same sea area for verification, with a geographical distance of approximately 70 km from the main site. In this experiment, the network architectures and hyperparameters of all models remained unchanged, and separate training and testing were conducted on the Site B dataset. The prediction curves are shown in Figure 9.
As illustrated by the cross-site prediction results in Figure 9, when confronted with the different meteorological distribution at Site B, the conventional deep learning model CNN-BiLSTM exhibits severe peak clipping and phase lag during deep drops and extreme ramping intervals, failing to align with actual extreme values. Meanwhile, Transformer-KAN appears to overfit, producing abnormal high-frequency distortions in several regions that do not exist in the real physical system. In contrast, the prediction trajectory of the proposed model closely matches the actual power, staying synchronized with the extreme mutation points without noticeable lag or deviation. Visually, this suggests that the TCN–BiLSTM–KAN architecture is less sensitive to site-specific environmental noise and captures the underlying wind-to-power relationship. The results of independent testing under EOCs at Site B for each model are presented in Table 3 and Figure 10.
Among the baseline models, the RMSE of models such as Transformer-KAN rose above 11 MW. The traditional ensemble method XGBoost recorded the highest RMSE of 11.55 MW alongside a notably low R2 of 86.93%. This indicates that forecasting at Site B is significantly more challenging than at the original site, reflecting stronger non-stationarity at this location. Even under such harsh conditions, the proposed model maintains an RMSE of 7.50 MW and a high R2 of 97.18%, which indicates that the proposed TCN–BiLSTM–KAN architecture is highly robust. Regardless of changes in the data source, the architecture captures wind power fluctuation patterns through TCN-based feature denoising and KAN’s adaptive approximation, without relying on stationary characteristics specific to a particular site.
Although Transformers theoretically offer global modeling capabilities, when applied to independent, high-noise, and highly volatile data such as that of Site B, their complex attention mechanisms are prone to poor local minima, which degrades performance sharply. This vulnerability is evident in the advanced Informer model, which achieves an RMSE of only 10.52 MW and an R2 of 89.01%, further indicating the limitations of sparse attention under severe spatial heterogeneity. In contrast, the proposed model leverages temporal constraints from the BiLSTM and the efficient representation of KAN to achieve structural regularization. This allows the model to converge stably when trained independently at different sites, demonstrating engineering stability superior to that of the Transformer-KAN architecture.
Compared with conventional CNN-BiLSTM and TCN-BiGRU models, whose RMSE values are 10.92 MW and 10.31 MW, respectively, the proposed model achieves a 27–31% improvement in forecasting accuracy at Site B. This substantial performance gap indicates that incorporating the Kolmogorov–Arnold network for function approximation is more efficient than conventional fully connected layers when handling spatially heterogeneous offshore wind power data. These experimental results provide solid empirical support for the broader application of the proposed model across large-scale offshore wind farm clusters.
4.5. Ablation Experiment
To validate the necessity and contribution of each key module (TCN, BiLSTM, and KAN) in the proposed hybrid architecture for extreme-condition forecasting, three variant models were designed for ablation experiments. All experiments were conducted on the extreme-condition dataset, with the comparison models configured as TCN-BiLSTM, TCN-KAN, and BiLSTM-KAN.
Figure 11 visually presents the prediction trajectories of the variant models in the ablation study within extreme oscillatory intervals. Comparing TCN-BiLSTM with the ground truth reveals that this model exhibits pronounced peak clipping at power crests, failing to reach the true maximum values. This directly demonstrates the limitations of conventional fully connected layers in handling extreme nonlinear mappings. Although TCN-KAN and BiLSTM-KAN show improvements over TCN-BiLSTM, phase deviations or amplitude oscillations still occur at certain rapid turning points. The proposed model exhibits the best overall fitting performance. It not only overcomes the peak-clipping issue and accurately reconstructs power spikes, but also demonstrates high stability during trough recoveries, thereby validating the synergistic enhancement of the TCN–BiLSTM–KAN composite architecture in feature extraction, temporal memory modeling, and nonlinear regression. The quantitative results of the ablation study are presented in Table 4 and Figure 12.
To strictly isolate and highlight the contribution of the nonlinear mapping module, an ablation study comparing the proposed model against the TCN-BiLSTM-MLP baseline was conducted. When replacing the learnable spline functions of the Kolmogorov–Arnold Network with the fixed activation functions of a standard Multi-Layer Perceptron, the RMSE increases from 3.58 MW to 5.01 MW, and the R2 drops from 97.84% to 95.69%. This specific comparison effectively demonstrates that traditional fully connected layers exhibit limitations in approximating the pronounced non-convexity of wind power curves under EOCs. The learnable activation functions constitute a critical mathematical mechanism for capturing complex aerodynamic power conversion characteristics, proving that the performance gain originates from the novel mapping mechanism rather than merely the structural depth of the network.
A comparison of the proposed model with TCN-BiLSTM, whose R2 values are 97.84% and 88.24%, respectively, shows that removing the KAN layer and reverting to a conventional deep learning architecture causes the most dramatic performance degradation. The RMSE increases sharply from 3.58 MW to 6.33 MW, an error amplification of approximately 76.8%. This indicates that under EOCs, the mapping between wind speed and power output exhibits pronounced non-convexity and irregularity. Traditional fully connected layers struggle to approximate such complex fluid-dynamic characteristics, whereas the learnable spline functions in KAN constitute the critical mechanism for addressing this challenge.
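To illustrate what a learnable activation means in this context, the sketch below fits a single univariate function expressed as a weighted sum of basis functions, where the weights are the trainable parameters. It is a deliberately simplified stand-in and not the KAN layer used in this study: it assumes piecewise-linear (degree-1 B-spline) hat functions on a uniform grid, whereas KAN implementations typically use higher-order B-splines, and the normalized saturating cubic `power_curve`, the class name `SplineEdge`, the learning rate, and the step count are all hypothetical.

```python
def hat_basis(x, knots):
    """Degree-1 B-spline (hat) basis values at x on a uniform knot grid."""
    h = knots[1] - knots[0]
    return [max(0.0, 1.0 - abs(x - k) / h) for k in knots]

class SplineEdge:
    """phi(x) = sum_i c_i * B_i(x), with learnable coefficients c_i."""

    def __init__(self, knots):
        self.knots = knots
        self.coef = [0.0] * len(knots)

    def __call__(self, x):
        return sum(c * b for c, b in zip(self.coef, hat_basis(x, self.knots)))

    def fit(self, xs, ys, lr=4.0, steps=200):
        """Plain full-batch gradient descent on the mean squared error."""
        basis = [hat_basis(x, self.knots) for x in xs]
        n = len(xs)
        for _ in range(steps):
            errors = [sum(c * b for c, b in zip(self.coef, row)) - y
                      for row, y in zip(basis, ys)]
            grads = [2.0 / n * sum(e * row[i] for e, row in zip(errors, basis))
                     for i in range(len(self.coef))]
            self.coef = [c - lr * g for c, g in zip(self.coef, grads)]

def mean_squared_error(f, xs, ys):
    return sum((f(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def power_curve(x):
    """Hypothetical normalized power curve: cubic rise, saturating at rated power."""
    return min(x ** 3, 1.0)

knots = [0.125 * i for i in range(13)]   # uniform grid on [0, 1.5]
xs = [0.025 * i for i in range(61)]      # normalized wind speed samples
ys = [power_curve(x) for x in xs]

edge = SplineEdge(knots)
initial_mse = mean_squared_error(edge, xs, ys)
edge.fit(xs, ys)
final_mse = mean_squared_error(edge, xs, ys)
```

Because each coefficient only affects the function near its own knot, the fitted shape can follow the cubic rise and the flat rated region independently, which a single fixed activation cannot do without additional layers.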
Compared with the proposed model, TCN-KAN lacks the BiLSTM layer, which raises the RMSE by 2.11 MW (from 3.58 MW to 5.69 MW). Although TCN possesses an expanded receptive field, it is inherently more effective at capturing local features. When dealing with long-duration extreme events such as typhoons or sustained ramping processes, the gating mechanism of BiLSTM retains long-range historical state information. The ablation results indicate that relying solely on TCN and KAN significantly weakens the model’s capacity to capture long-term temporal dependencies.
Comparing the proposed model with BiLSTM-KAN, removing the TCN layer increases the RMSE from 3.58 MW to 5.58 MW. Under EOCs, raw data are often accompanied by high-frequency noise and turbulence-induced disturbances. The causal dilated convolutions in TCN effectively function as a high-efficiency feature filter, preserving informative waveform patterns while suppressing high-frequency noise. Directly feeding noisy raw data into the BiLSTM can interfere with gradient propagation, thereby degrading predictive accuracy.
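The causal dilated convolution mentioned above can be sketched in a few lines. This is a minimal illustration with a fixed, hand-chosen kernel, assuming zero padding on the left; an actual TCN learns its kernel weights and stacks several such layers with residual connections. The function names and sample data are hypothetical.

```python
def causal_dilated_conv1d(x, weights, dilation=1):
    """y[t] = sum_j weights[j] * x[t - j * dilation], treating x as zero
    before the start of the series, so y[t] never depends on future samples."""
    y = []
    for t in range(len(x)):
        acc = 0.0
        for j, w in enumerate(weights):
            idx = t - j * dilation
            if idx >= 0:
                acc += w * x[idx]
        y.append(acc)
    return y

def receptive_field(kernel_size, dilations):
    """Number of past samples (including the current one) seen by a stack
    of causal layers with the given dilations."""
    return 1 + (kernel_size - 1) * sum(dilations)

# Hypothetical series; the two-tap averaging kernel skips one sample (dilation 2).
signal = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
filtered = causal_dilated_conv1d(signal, [0.5, 0.5], dilation=2)

# Changing only the last sample leaves all earlier outputs unchanged.
perturbed = causal_dilated_conv1d(signal[:-1] + [100.0], [0.5, 0.5], dilation=2)
```

With kernel size 2 and dilations 1, 2, and 4, `receptive_field(2, [1, 2, 4])` gives 8 samples, which shows how stacking dilated layers widens the temporal context without enlarging the kernel.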
5. Conclusions
To address the critical challenges of low forecasting accuracy and poor generalization under extreme marine conditions, this study proposes a multi-source spatiotemporal feature fusion forecasting model utilizing a hybrid TCN, BiLSTM, and KAN architecture. Multidimensional validations using real operational data demonstrate that KAN significantly outperforms conventional multilayer perceptrons in function approximation. This superiority is attributed to its learnable spline-based activation functions, which are critical for capturing the highly nonlinear and volatile characteristics of offshore wind power.
Under extreme marine evaluations involving severe wind speed fluctuations and abrupt wave height changes, the proposed model achieves an RMSE of 3.58 MW and an R2 value of 97.84%. This performance effectively mitigates the peak clipping and phase lag issues prevalent in conventional deep learning models during severe power ramping periods. Consequently, these improvements offer substantial engineering value for the stable operation of offshore wind plants and the reliable dispatch of onshore grids.
Ablation experiments confirm that the local feature denoising of TCN, the long-term temporal memory of BiLSTM, and the nonlinear regression of KAN form a highly synergistic framework. Removing any single module degrades forecasting performance significantly, increasing the RMSE by over 50%. Ultimately, this hybrid architecture effectively decouples high-frequency noise, temporal dependencies, and nonlinear abrupt variations, demonstrating strong adaptability to complex marine operating environments.
In independent cross-site validation at a second wind farm located approximately 70 km away in the same sea area, with a distinct wind–wave regime, the proposed model maintains a robust R2 value of 97.18%. In contrast, the advanced Transformer-KAN benchmark degrades severely due to overfitting, resulting in an RMSE exceeding 11 MW. This demonstrates that the proposed architecture not only enables refined single-site forecasting but also exhibits preliminary spatial adaptability within similar marine environments, mitigating the performance degradation typically induced by spatial data distribution shifts.
Overall, the proposed model achieves high accuracy under normal offshore conditions (RMSE: 2.08 MW) and strong robustness under extreme marine conditions (RMSE: 3.58 MW). It thereby provides practical technical support for enhancing power system resilience under extreme marine weather, optimizing grid reserve allocation, and ensuring the secure and efficient grid integration of offshore wind clusters, in line with the focus of the Journal of Marine Science and Engineering (JMSE) on marine engineering and offshore energy.
In practice, grid operators could integrate this forecasting model as a modular engine into existing Energy Management Systems via standard application programming interfaces, thereby supporting secure and dynamic power dispatch during extreme marine weather events.
While current evaluations demonstrate preliminary spatial adaptability, future research will focus on rigorously validating the broad transferability of the model using data from geographically distributed wind farms under diverse climatic conditions. Additionally, considering the profound seasonal non-stationarities of the target offshore environment, such as the alternation between summer typhoons and winter cold surges, developing dynamically adaptive seasonal thresholds to precisely isolate distinct meteorological phenomena represents a vital trajectory for future physical modeling enhancements.