A Deep Hybrid CNNDBiLSTM Model for Short-Term Wind Speed Forecasting in Wind-Rich Regions of Tasmania, Australia

Neupane, Ananta; Raj, Nawin; Deo, Ravinesh

doi:10.3390/en18246390


by Ananta Neupane, Nawin Raj * and Ravinesh Deo

School of Mathematics, Physics and Computing, Springfield Campus, University of Southern Queensland, Toowoomba, QLD 4350, Australia

* Author to whom correspondence should be addressed.

Energies 2025, 18(24), 6390; https://doi.org/10.3390/en18246390

Submission received: 8 July 2025 / Revised: 26 November 2025 / Accepted: 28 November 2025 / Published: 5 December 2025

(This article belongs to the Section A3: Wind, Wave and Tidal Energy)


Abstract

Accurate and reliable short-term wind speed forecasting plays a crucial role in the efficient operation and integration of wind energy generation. This study introduces a deep hybrid model that combines Convolutional Neural Networks (CNN) with Double Bidirectional Long Short-Term Memory (DBiLSTM) networks to enhance wind speed forecasting accuracy in Australia. Thirteen years of hourly wind speed data were collected from two sites with high wind potential in Tasmania, Australia. The CNN component captures local temporal patterns, while the DBiLSTM layers model long-range dependencies in both forward and backward directions. The proposed CNNDBiLSTM model was compared against three traditional benchmark models: Multiple Linear Regression (MLR), Support Vector Regression (SVR), and Categorical Boosting (CatBoost). The proposed framework can support wind farm planning, operational reliability, and grid integration strategies within the renewable energy sector. A comprehensive evaluation framework across both Australian study sites (Flinders Island Airport, Scottsdale) showed that the CNNDBiLSTM consistently outperformed the baseline models. It achieved the highest correlation coefficients (r = 0.987–0.988), the lowest error rates (RMSE = 0.392–0.402, MAE = 0.294–0.310), and superior scores across multiple efficiency metrics (ENS, WI, LM). The CNNDBiLSTM demonstrated strong adaptability across coastal and inland environments, showing potential for real-world use in renewable-energy resource forecasting. The wind speed analysis and forecasts indicate that Flinders Island, with higher and more consistent wind speeds, is a more viable option for large-scale wind energy generation than Scottsdale.

Keywords:

deep learning; CNNBiLSTM hybrid model; wind speed forecasting; time series forecasting; renewable energy

1. Introduction

The world’s dependence on fossil fuels has led to serious issues like global warming and pollution. Moreover, fossil fuels are not a sustainable long-term energy source. In contrast, renewable energy, especially wind and solar energy, is cleaner and widely available. Wind energy produces neither pollution nor waste [1]. To achieve the United Nations’ Sustainable Development Goals (SDGs) by 2030, a worldwide shift from fossil fuels to renewable energy sources is essential.

Wind energy holds immense potential as part of the global renewable energy transition. In 2024, the world’s total energy supply reached 648 exajoules (EJ), of which more than 80% was still derived from fossil fuels, with oil, coal, and natural gas contributing 29.8%, 27.3%, and 23%, respectively [2]. Renewables accounted for around 15% of global energy use, but their role in electricity generation is growing rapidly, representing 33% of the global power mix [2]. Within this renewable portfolio, wind energy has emerged as one of the fastest-expanding technologies, with global installed capacity reaching 1021 GW by the end of 2023 [2,3]. According to the REN21 Global Status Report (2024), total installed renewable energy capacity surpassed 3800 GW in 2023, of which wind contributed a substantial share [4]. China leads with 440 GW of wind power, followed by the United States with about 150 GW, while other countries such as Brazil, India, and Germany have also made significant investments [4,5]. These figures highlight the central role of wind energy in diversifying the global energy mix and reducing reliance on fossil fuels.

Renewable sources such as wind energy play a crucial role in developing a cleaner and more sustainable future. Unlike fossil fuels, wind power does not pollute the air or contribute to climate change. It helps to cut carbon emissions, protects nature, and keeps the energy supply secure. By the end of 2024, the world had installed a total of 1136 GW of wind power, of which 117 GW was added in that same year. China and the United States were the leading countries in wind power capacity. However, despite these figures, wind power accounted for less than 7% of global energy consumption in 2024; renewables made up 90% of all expansion in the power sector, with wind energy contributing only 20% [6].

In Australia, 836 MW of wind energy was integrated into the grid in 2024, bringing the total wind power capacity to 12.3 GW by the end of that year. However, this accounted for just 13.4% of the country’s total electricity generation [6]. A significant portion, 65%, of Australia’s electricity was still generated from fossil fuels at the end of 2024. Australia is situated in a region with some of the highest wind energy potential globally, particularly in the southern part of the country (Figure 1 and Figure 2). Several locations experience average wind speeds exceeding 9.4 m/s, which rank among the highest worldwide. Given this vast potential, there is an opportunity to expand wind power capacity and accelerate the transition toward a more sustainable and renewable energy mix. Enhancing the share of wind power in Australia’s total energy generation would help lower greenhouse gas emissions, mitigate the impacts of climate change, and strengthen long-term energy security. Australia’s federal, state, and territory governments have set various targets to transition from fossil fuels to renewable energy sources. While some states are making significant progress toward their goals, others are lagging behind. For instance, Tasmania and South Australia aimed to derive 50% of their electricity from renewable sources by 2025 and successfully achieved this target [7]. However, more populous states have yet to reach similar milestones [7]. The debate among Australian energy policymakers and politicians revolves around whether to focus on promoting the renewable energy industry or to continue subsidizing the well-established fossil fuel sector [8]. Additionally, the renewable energy industry faces several technological challenges, such as energy storage issues, irregular supply, and the requirement for large production areas [9]. Figure 2 illustrates Australia’s wind-rich zones and the selected study sites.

Wind speed forecasting faces significant challenges due to its inherent randomness, nonlinearity, and nonstationary characteristics. Over the years, researchers have developed a variety of modeling techniques, broadly classified into three categories: physical models, statistical models, and AI-based models. Physical models are grounded in atmospheric physics, statistical models rely on historical data patterns, and AI models utilize machine learning and deep learning algorithms to capture complex nonlinear relationships [11]. Physical and statistical models are generally less accurate than AI-based models, especially for short-term predictions [12,13]. Accordingly, this study proposes a hybrid deep learning Convolutional Neural Network with Double Bidirectional Long Short-Term Memory Network (CNNDBiLSTM) model to forecast wind speed for two study sites (Scottsdale and Flinders) in Tasmania, Australia. The CNN structure enables the hybrid model to extract high-level nonlinear spatial features from the input sequences. The CNN helps to reduce noise, identify local patterns, and provide a more informative feature space for sequential learning before the BiLSTM phase. The two Bidirectional LSTM layers enhance the temporal forecasting by learning the long-range dependencies in both forward and backward directions. Stacking two BiLSTM layers further deepens the temporal modelling capability, which helps to capture the subtle lags, rapid changes, and periodic behavior that are common in wind speed datasets. This study also uses three benchmark models, Multiple Linear Regression (MLR), Support Vector Regression (SVR), and Categorical Boosting (CatBoost), to compare forecasting results with the proposed hybrid model.

This research paper is structured into five main sections. Section 1 introduces the global and national energy context, including Australia’s wind energy potential and current challenges in the sector. Section 2 provides a detailed literature review, model background, data preprocessing methods, and evaluation metrics, highlighting the key elements of the proposed architecture. Section 3 presents a comparative performance analysis between the benchmark and proposed models, supported by graphical and statistical evidence. Section 4 presents the discussion of the model results, and Section 5 concludes the study by summarizing the principal findings and contributions of this research.

2. Literature Review

The global emphasis on sustainable and renewable energy sources has positioned wind power as a key alternative among various clean energy options [14]. Accurate forecasting is essential for effective grid operation, cost-efficient dispatch planning, and overall system reliability, particularly as the proportion of wind energy in power systems continues to increase. Although wind energy is abundant and clean, it is inherently variable and uncertain due to its dependence on atmospheric conditions [15]. This variability poses significant challenges for grid integration, particularly in maintaining the balance between supply and demand in real time [16]. To overcome these challenges, researchers have developed advanced forecasting methods ranging from statistical models to machine learning and hybrid approaches, all aimed at enhancing prediction accuracy. Short-term forecasts are especially vital for operational decisions such as unit commitment, load balancing, and reserve allocation [16]. Moreover, as wind penetration levels rise, the economic consequences of forecast inaccuracies become increasingly significant, influencing market operations and elevating the demand for ancillary services [17]. Regional differences in wind regimes, terrain complexity, and the quality of available data can also substantially impact forecasting performance, underscoring the necessity for site-specific model design and evaluation [18].

2.1. Model Theoretical Framework

2.1.1. Multiple Linear Regression (MLR) Model

Over the past several decades, particularly before the rise of artificial intelligence techniques, the Multiple Linear Regression (MLR) model has been one of the most widely used tools for forecasting [19]. MLR models predict a dependent variable from a set of independent variables, quantifying the relationships among them through estimated regression coefficients. Each independent variable is associated with a specific coefficient, enabling the model to quantify its individual contribution to the dependent variable. These independent variables may be either continuous or categorical, but they must be expressed numerically to enter the regression equation [20]. The MLR approach is particularly effective when the relationships between variables are linear, and it has been successfully applied in various domains, including long-term global solar radiation forecasting in Australia [21] and electricity consumption prediction in Italy [22]. However, since many environmental variables exhibit nonlinear behavior, the predictive accuracy of MLR is often limited when compared with more advanced machine learning and deep learning techniques. The theoretical formulation and statistical foundations of the MLR model are well established in classical regression literature [19,20,21,22]. Therefore, only a conceptual overview is provided here, while detailed mathematical derivations can be found in references [23,24].
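For orientation, the standard textbook form of the MLR model and its ordinary least squares estimator can be written as follows. The symbols (p predictors x_1 to x_p, coefficients β, error term ε, design matrix X) are introduced here for illustration and are not notation from this study.

```latex
y = \beta_{0} + \beta_{1} x_{1} + \beta_{2} x_{2} + \dots + \beta_{p} x_{p} + \varepsilon

\hat{\boldsymbol{\beta}} = (X^{\top} X)^{-1} X^{\top} \mathbf{y}
```

Each coefficient β_j measures the change in y per unit change in x_j with the other predictors held fixed, which is the sense in which the model quantifies individual contributions.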

2.1.2. Support Vector Regression (SVR) Model

Support Vector Regression (SVR) is a powerful and flexible method commonly used for nonlinear forecasting. SVR operates by identifying a function that best fits the data within a specified error margin (epsilon), focusing primarily on the most influential data points known as support vectors [25]. It is particularly effective for addressing complex and nonlinear problems [26]. When appropriate parameters and input features are selected, SVR can deliver more accurate and stable forecasting outcomes than many conventional regression techniques [27]. This method has been applied successfully in various forecasting domains, including rainfall forecasting [28], load forecasting [29], and solar power forecasting [30].

The mathematical formulation and optimization principles of SVR are comprehensively discussed in the foundational work of Vapnik (1995) [31,32] and subsequent applications in energy forecasting studies [25,26,27,28,29,30]. Only a conceptual overview is presented here to maintain focus on the methodological context relevant to the present research.
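The role of the error margin can be illustrated with the epsilon-insensitive loss that underlies SVR. The following is a minimal sketch using only the Python standard library; the function names, the epsilon value, and the sample values are hypothetical and are not taken from the study.

```python
def epsilon_insensitive_loss(y_true, y_pred, epsilon=0.1):
    """Zero inside the epsilon tube, growing linearly outside it."""
    return max(0.0, abs(y_true - y_pred) - epsilon)


def mean_epsilon_loss(observed, predicted, epsilon=0.1):
    """Average epsilon-insensitive loss over paired values."""
    losses = [epsilon_insensitive_loss(o, p, epsilon)
              for o, p in zip(observed, predicted)]
    return sum(losses) / len(losses)


# Hypothetical wind speeds (m/s), for illustration only.
observed = [5.0, 6.0, 7.0]
predicted = [5.05, 6.5, 6.0]
tube_loss = mean_epsilon_loss(observed, predicted, epsilon=0.1)
```

Only predictions that fall outside the tube contribute to the loss, which is why the fitted function depends mainly on the support vectors.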

2.1.3. Categorical Boosting (CatBoost) Model

CatBoost is an open-source machine learning algorithm specifically designed to handle datasets containing both numerical and categorical features [33]. It was developed by researchers at Yandex, a multinational technology company. CatBoost addresses key statistical limitations and practical challenges encountered in traditional gradient boosting methods, particularly those related to the processing of categorical variables and the mitigation of prediction bias [34]. It has been successfully applied in several forecasting domains, including solar radiation forecasting [35], short-term weather forecasting [36], and short-term load forecasting [33].

The theoretical formulation and optimization details of CatBoost are comprehensively described by Prokhorenkova et al. (2018), the original developers of the algorithm [34]. For the purposes of this study, only the conceptual framework is summarized, as CatBoost serves primarily as a benchmark model for comparison with the proposed deep-learning approach.

2.1.4. Convolutional Neural Network (CNN)

Convolutional Neural Networks (CNNs) use layers of filters to automatically identify local signals in input data [37]. These extracted features are then processed through nonlinear activation and pooling layers, which help reduce data dimensionality and enhance the model’s ability to generalize [38]. Although CNNs were originally developed for computer vision applications, they have also demonstrated strong performance in time-series forecasting tasks such as wind speed prediction by effectively capturing short-term dependencies and filtering out noise from raw signals [37,39]. The primary strength of CNNs lies in their capability to extract informative and discriminative features from raw data, making them an essential component of hybrid deep learning frameworks for both preprocessing and feature extraction. CNNs have been used in several time series forecasting tasks, such as wind power prediction [38] and gold price forecasting [40].

Mathematically, CNN can be defined based on the convolution operation, which applies a filter (kernel) over an input signal to extract features:

Let $y = [y_{1}, y_{2}, \dots, y_{n}]$ be the input sequence, $u = [u_{1}, u_{2}, \dots, u_{k}]$ the convolution filter of length k, and $b \in \mathbb{R}$ the bias. The one-dimensional (1D) convolution output at position i is then defined by

s_{i} = \sum_{j = 1}^{k} u_{j} \cdot y_{i + j - 1} + b

(1)

where s_i = result of applying the kernel at position i, u_j = j-th filter weight, y_{i + j - 1} = element of the input segment covered by the filter, k = filter size, and i = position of the sliding window.

A nonlinear activation is applied to the convolution result

z_{i} = σ (s_{i})

(2)

The rectified linear unit (ReLU) is defined as

σ (s_{i}) = \max (0, s_{i})

(3)

With the pooling window (size p), the operation is defined as

z_{i}^{pooled} = \max (z_{i}, z_{i + 1}, \dots, z_{i + p - 1})

(4)

Then the output of the dense layer is defined as

\hat{y} = f (W z + b)

(5)

where W = weight matrix, z = flattened feature vector, b = bias, and f = activation function.

Then the combination of all of the above processes is as follows

y \xrightarrow{\text{Convolution with } u} s \xrightarrow{\text{ReLU}} z \xrightarrow{\text{Pooling}} z^{pooled} \xrightarrow{\text{Dense}} \hat{y}

(6)

In this study, the above equation represents how the CNN component processes historical wind speed data to extract key temporal features. The convolutional layer ($u$) identifies short-term variations and gust patterns, while the ReLU activation introduces nonlinearity and filters out irrelevant signals. The pooling layer ($z^{pooled}$) reduces data noise and focuses on the most dominant features. Finally, the dense layer produces the feature output ($\hat{y}$), which is then passed to the DBiLSTM network for learning long-term dependencies and generating the final wind speed forecast.
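The chain of operations in Equations (1) to (6) can be traced on a toy sequence. The sketch below uses only the Python standard library; the filter weights, the non-overlapping pooling stride, the identity activation in the dense layer, and the input values are all assumptions chosen for illustration, not settings from the study.

```python
def conv1d(y, u, b):
    """Eq. (1): weighted sum of k consecutive inputs plus a bias (valid positions only)."""
    k = len(u)
    return [sum(u[j] * y[i + j] for j in range(k)) + b
            for i in range(len(y) - k + 1)]


def relu(s):
    """Eqs. (2)-(3): rectified linear activation."""
    return [max(0.0, v) for v in s]


def max_pool(z, p):
    """Eq. (4), applied to non-overlapping windows of size p (the stride is an assumption)."""
    return [max(z[i:i + p]) for i in range(0, len(z) - p + 1, p)]


def dense(z, W, b):
    """Eq. (5) with an identity activation f (an assumption)."""
    return [sum(w * v for w, v in zip(row, z)) + bi for row, bi in zip(W, b)]


# Hypothetical input sequence and parameters.
y = [1.0, 2.0, 4.0, 3.0, 1.0, 0.0]
u = [1.0, -1.0]                        # responds to drops between neighbours
s = conv1d(y, u, b=0.0)                # [-1.0, -2.0, 1.0, 2.0, 1.0]
z = relu(s)                            # [0.0, 0.0, 1.0, 2.0, 1.0]
z_pooled = max_pool(z, p=2)            # [0.0, 2.0]
y_hat = dense(z_pooled, W=[[0.5, 0.5]], b=[0.1])
```

In a trained network the filter and dense weights are learned from data rather than fixed by hand as they are here.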

2.1.5. Long Short-Term Memory (LSTM)

Long Short-Term Memory (LSTM) networks are a specialized type of Recurrent Neural Network (RNN) designed to overcome one of the major challenges in sequence learning, capturing long-term dependencies [39]. Traditional RNNs process sequential data by transmitting information from one time step to the next through a hidden state [38]. However, they often encounter difficulties in learning patterns across long sequences due to vanishing or exploding gradient problems during training. This limitation was addressed by Hochreiter and Schmidhuber, who introduced the LSTM architecture in 1997. LSTMs resolve this issue through an architecture that incorporates memory cells and gating mechanisms, enabling the network to regulate the flow of information [39]. These gates allow the model to selectively retain, update, or discard information, significantly improving its ability to learn long-term temporal dependencies in time-series data [40].

LSTM models combined with other data refinement processes are widely used in gold price time series forecasting [40], short-term wind power forecasting [38], and power load forecasting [39].

Mathematically, its components are defined by the following:

(a): Forget gate

$f_{t} = σ (W_{f} \cdot [h_{t - 1}, x_{t}] + b_{f})$

(7)

(b): Input gate

$i_{t} = σ (W_{i} \cdot [h_{t - 1}, x_{t}] + b_{i})$

(8)

$\tilde{C_{t}} = \tanh (W_{C} \cdot [h_{t - 1}, x_{t}] + b_{C})$

(9)

(c): Cell state update

$C_{t} = f_{t} \cdot C_{t - 1} + i_{t} \cdot \tilde{C_{t}}$

(10)

(d): Output gate and the hidden state

$o_{t} = σ (W_{o} \cdot [h_{t - 1}, x_{t}] + b_{o})$

(11)

$h_{t} = o_{t} \cdot \tanh (C_{t})$

(12)

where σ = sigmoid activation function, tanh = hyperbolic tangent function, $x_{t}$ = input at time step t, $h_{t - 1}$ = previous hidden state, W = weight matrices, and b = biases.
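Equations (7) to (12) can be read as a single update step. The following minimal sketch implements that step with the Python standard library only; the dictionary layout of the parameters, the weight values, and the scaled input speeds are hypothetical and serve only to make the example runnable.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def affine(W, b, v):
    """Return W v + b for a list-of-rows matrix W."""
    return [sum(w * x for w, x in zip(row, v)) + bi for row, bi in zip(W, b)]


def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step following Eqs. (7)-(12)."""
    concat = h_prev + x_t                                                 # [h_{t-1}, x_t]
    f = [sigmoid(v) for v in affine(params["Wf"], params["bf"], concat)]  # Eq. (7)
    i = [sigmoid(v) for v in affine(params["Wi"], params["bi"], concat)]  # Eq. (8)
    c_tilde = [math.tanh(v)
               for v in affine(params["Wc"], params["bc"], concat)]       # Eq. (9)
    c = [fv * cp + iv * ct
         for fv, cp, iv, ct in zip(f, c_prev, i, c_tilde)]                # Eq. (10)
    o = [sigmoid(v) for v in affine(params["Wo"], params["bo"], concat)]  # Eq. (11)
    h = [ov * math.tanh(cv) for ov, cv in zip(o, c)]                      # Eq. (12)
    return h, c


# Hypothetical parameters: hidden size 1, input size 1.
params = {
    "Wf": [[0.1, 0.2]], "bf": [0.0],
    "Wi": [[0.3, -0.1]], "bi": [0.0],
    "Wc": [[0.2, 0.4]], "bc": [0.0],
    "Wo": [[-0.2, 0.1]], "bo": [0.0],
}
h, c = [0.0], [0.0]
for speed in [0.51, 0.54, 0.50]:        # hypothetical scaled wind speeds
    h, c = lstm_step([speed], h, c, params)
```

The forget gate scales the previous cell state and the input gate scales the candidate, so the cell state can carry information across many steps when f stays close to 1.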

2.1.6. Bidirectional LSTM (BiLSTM)

Bidirectional LSTM (BiLSTM) networks are an extended version of the traditional LSTM model, designed to improve sequence modeling by leveraging information from both past and future time steps. Standard LSTMs process data in a single direction, generally from past to future [41], which limits the model to prior context only. The BiLSTM overcomes this limitation by employing two parallel LSTM layers: one that processes the sequence forward and another that processes it backward [41]. At each time step, the outputs from both directions are combined, enabling the model to capture richer contextual dependencies and achieve higher prediction accuracy, particularly for stochastic datasets such as wind speed [42]. BiLSTM models have been used for sound speed prediction [42], long-term traffic flow forecasting [43], and heat rate prediction [44]. A BiLSTM model applies two standard LSTMs: a forward LSTM, which processes the sequence from t = 1 to T, and a backward LSTM, which processes the sequence from t = T to 1. Each LSTM layer (Figure 3) maintains its own hidden state and cell state and operates its own set of gate parameters. By merging the outputs from both layers at each time step, the BiLSTM captures complete contextual information from the entire input sequence, thereby improving temporal representation and predictive performance.

2.1.7. Deep Bidirectional LSTM (DBiLSTM)

Deep Bidirectional Long Short-Term Memory (DBiLSTM) is an advanced variant of the recurrent neural network that stacks multiple bidirectional LSTM layers on top of each other [45]. This layered structure helps the model learn increasingly complex and abstract patterns in sequential data. Each layer reads the input in both directions, forward (left to right) and backward (right to left), and integrates the contextual information from both paths to achieve a more comprehensive understanding of the sequence at every time step [46]. By feeding the output of one bidirectional layer into the next, the network enhances its ability to capture both short-term fluctuations and long-range temporal dependencies. DBiLSTM has shown impressive results in challenging tasks such as Global Horizontal Irradiance (GHI) prediction [47], financial risk analysis and prediction [48], air quality forecasting [49], and time series forecasting [45]. A typical DBiLSTM model is built by stacking two layers of bidirectional LSTMs. Each layer reads the data in both directions, (a) past to future and (b) future to past, before passing the learned features to the subsequent layer. This hierarchical structure allows the model to develop a deeper and more holistic understanding of sequential dynamics at multiple levels [50].

L = number of DBiLSTM layers (depth)

x_{t} = input at time step t

h_{t}^{(l) \to}, h_{t}^{(l) \leftarrow} = hidden states at layer l for the forward and backward directions

C_{t}^{(l) \to}, C_{t}^{(l) \leftarrow} = corresponding cell states

\tilde{C_{t}^{(l) \to}}, \tilde{C_{t}^{(l) \leftarrow}} = candidate cell states

For layer l = 1 to L

For the Forward direction

f_{t}^{(l) \to} = σ (W_{f}^{(l) \to} [h_{t - 1}^{(l) \to}, x_{t}^{(l)}] + b_{f}^{(l) \to})

(13)

i_{t}^{(l) \to} = σ (W_{i}^{(l) \to} [h_{t - 1}^{(l) \to}, x_{t}^{(l)}] + b_{i}^{(l) \to})

(14)

\tilde{C_{t}^{(l) \to}} = \tanh (W_{C}^{(l) \to} [h_{t - 1}^{(l) \to}, x_{t}^{(l)}] + b_{C}^{(l) \to})

(15)

C_{t}^{(l) \to} = f_{t}^{(l) \to} ⊙ C_{t - 1}^{(l) \to} + i_{t}^{(l) \to} ⊙ \tilde{C_{t}^{(l) \to}}

(16)

o_{t}^{(l) \to} = σ (W_{o}^{(l) \to} [h_{t - 1}^{(l) \to}, x_{t}^{(l)}] + b_{o}^{(l) \to})

(17)

h_{t}^{(l) \to} = o_{t}^{(l) \to} ⊙ \tanh (C_{t}^{(l) \to})

(18)

For the Backward direction

f_{t}^{(l) \leftarrow} = σ (W_{f}^{(l) \leftarrow} [h_{t + 1}^{(l) \leftarrow}, x_{t}^{(l)}] + b_{f}^{(l) \leftarrow})

(19)

i_{t}^{(l) \leftarrow} = σ (W_{i}^{(l) \leftarrow} [h_{t + 1}^{(l) \leftarrow}, x_{t}^{(l)}] + b_{i}^{(l) \leftarrow})

(20)

\tilde{C_{t}^{(l) \leftarrow}} = \tanh (W_{C}^{(l) \leftarrow} [h_{t + 1}^{(l) \leftarrow}, x_{t}^{(l)}] + b_{C}^{(l) \leftarrow})

(21)

C_{t}^{(l) \leftarrow} = f_{t}^{(l) \leftarrow} ⊙ C_{t + 1}^{(l) \leftarrow} + i_{t}^{(l) \leftarrow} ⊙ \tilde{C_{t}^{(l) \leftarrow}}

(22)

o_{t}^{(l) \leftarrow} = σ (W_{o}^{(l) \leftarrow} [h_{t + 1}^{(l) \leftarrow}, x_{t}^{(l)}] + b_{o}^{(l) \leftarrow})

(23)

h_{t}^{(l) \leftarrow} = o_{t}^{(l) \leftarrow} ⊙ \tanh (C_{t}^{(l) \leftarrow})

(24)

For input to layer l

If l = 1, the input is the original sequence

x_{t}^{(1)} = x_{t}

(25)

If l > 1 the input is the concatenated output of the previous layer

x_{t}^{(l)} = [h_{t}^{(l - 1) \to}; h_{t}^{(l - 1) \leftarrow}]

(26)

For the final output

After processing all L layers

h_{t}^{final} = [h_{t}^{(L) \to}; h_{t}^{(L) \leftarrow}]

(27)

where σ = sigmoid activation, tanh = hyperbolic tangent, ⊙ = element-wise multiplication, [a; b] = vector concatenation, and $W^{(l) *}$, $b^{(l) *}$ = learnable parameters of each gate, direction, and layer.

In this study, the final output of the DBiLSTM model ($h_{t}^{final}$) represents the merged hidden states from both the forward ($h_{t}^{(L) \to}$) and backward ($h_{t}^{(L) \leftarrow}$) directions in the last bidirectional layer. This concatenation allows the network to capture information from past and future wind patterns at each time step. The layer’s learnable parameters $(W^{(l) *}, b^{(l) *})$ and nonlinear activations (sigmoid and tanh) regulate how information is updated and retained. Through this process, $h_{t}^{final}$ forms a complete temporal representation of the wind sequence, which the model uses to produce precise short-term forecasts. This bidirectional integration helps the proposed CNNDBiLSTM model represent complex wind dynamics more effectively than single-direction or shallow networks.
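The stacking and concatenation rules in Equations (25) to (27) can be sketched in a few lines. The example below uses only the Python standard library and a deliberately tiny configuration: each direction has a scalar hidden state, the weights are random, and the input values are hypothetical. It illustrates the data flow only and is not the trained architecture of this study.

```python
import math
import random


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def make_cell(input_size, rng):
    """Random illustrative parameters for one direction with a scalar hidden state."""
    def gate():
        return {"w": [rng.uniform(-0.5, 0.5) for _ in range(input_size + 1)], "b": 0.0}
    return {name: gate() for name in ("f", "i", "c", "o")}


def cell_step(cell, x_t, h_prev, c_prev):
    """One gated update; h_prev is the neighbouring hidden state in the scan direction."""
    v = [h_prev] + x_t
    pre = {n: sum(w * a for w, a in zip(g["w"], v)) + g["b"] for n, g in cell.items()}
    f, i, o = sigmoid(pre["f"]), sigmoid(pre["i"]), sigmoid(pre["o"])
    c = f * c_prev + i * math.tanh(pre["c"])
    return o * math.tanh(c), c


def run_direction(cell, seq, reverse=False):
    """Scan the sequence forward (t = 1..T) or backward (t = T..1)."""
    order = range(len(seq) - 1, -1, -1) if reverse else range(len(seq))
    h, c, out = 0.0, 0.0, [0.0] * len(seq)
    for t in order:
        h, c = cell_step(cell, seq[t], h, c)
        out[t] = h
    return out


def bilstm_layer(fwd, bwd, seq):
    """Concatenate forward and backward hidden states at each time step."""
    hf = run_direction(fwd, seq)
    hb = run_direction(bwd, seq, reverse=True)
    return [[a, b] for a, b in zip(hf, hb)]


def dbilstm(seq, layers):
    x = seq                                  # Eq. (25): first layer sees the raw input
    for fwd, bwd in layers:
        x = bilstm_layer(fwd, bwd, x)        # Eq. (26): next layer sees [h_fwd; h_bwd]
    return x                                 # Eq. (27): output of the last layer


rng = random.Random(0)
layers = [(make_cell(1, rng), make_cell(1, rng)),    # layer 1: scalar input
          (make_cell(2, rng), make_cell(2, rng))]    # layer 2: 2-dimensional input
wind = [[5.1], [5.4], [5.0], [6.2]]                  # hypothetical wind speeds (m/s)
h_final = dbilstm(wind, layers)
```

Because the backward scan starts at the last time step, its state at time t summarises everything after t, while the forward state summarises everything before it.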

2.1.8. Deep Learning Based Wind Speed Forecasting: A Critical Review

In recent years, deep learning has emerged as a leading approach for short-term wind forecasting [51]. It can model the nonlinear, non-stationary, and multi-scale temporal characteristics of wind data that traditional statistical methods frequently fail to represent [52,53]. The existing research in this area generally falls into four main categories: (i) Recurrent Neural Networks (RNN) and their advanced variants including Long Short-Term Memory (LSTM), Gated Recurrent Units (GRU), and Bidirectional LSTM (BiLSTM); (ii) Convolutional Neural Networks (CNNs), which are used to identify local temporal patterns and filter out noise; (iii) hybrid architectures that integrate CNNs with recurrent networks to simultaneously extract features and capture sequential dependencies; and (iv) decomposition-based hybrid models that first preprocess the raw time series data using techniques such as Empirical Mode Decomposition (EMD), Variational Mode Decomposition (VMD), Singular Spectrum Analysis (SSA), or Seasonal-Trend Decomposition (STL) to enhance stationarity before applying deep learning [53].

Recurrent neural networks, especially LSTM and GRU architectures, have shown robust performance in short-term wind forecasting due to their ability to capture long-range temporal dependencies in sequential data [53]. Reviews of deep learning methods show that LSTM models outperform traditional approaches like Support Vector Regression (SVR) and Random Forest (RF) in modelling temporal wind speed patterns [54]. Building on this, the LSTM framework was enhanced with a graph-based sequence reconstruction strategy, achieving higher accuracy and stability than sequential prediction [29,54].

CNNs have also received significant attention in wind forecasting, owing to their capacity to capture local temporal structures and filter out noise from high-frequency time series [52]. When applied to one-dimensional wind speed data, CNNs are particularly effective at identifying short-term fluctuation patterns, which enhances model robustness in highly volatile and uncertain wind conditions [53].

Hybrid architectures that combine convolutional and recurrent networks, such as CNN-LSTM and CNN-BiLSTM, leverage the complementary strengths of both components: the CNN layer extracts hierarchical local features from the input sequence, while the recurrent layer captures long-range temporal dependencies [55,56]. A two-dimensional regional CNN-LSTM model that jointly learns spatial and temporal patterns achieved higher forecasting accuracy than the standalone LSTM [51]. Similarly, a CNN-LSTM model for non-stationary wind fields showed significant improvements, reducing RMSE and MAE relative to baseline models [51]. Together, these studies underscore the effectiveness of hybrid designs in enhancing short-term wind forecasting, especially in atmospherically complex or highly variable conditions [51,52,54].

Recent studies have addressed the challenges of non-stationarity and noise in wind time series data by integrating signal decomposition methods with deep learning models. One study used a hybrid framework combining Empirical Mode Decomposition (EMD) and its variants with CNN-BiLSTM for short-term wind speed forecasting and reported improved predictive stability compared to standard LSTM approaches [56]. Similar decomposition-based strategies such as VMD-LSTM, EMD-BiLSTM, SSA-LSTM, and STL-LSTM have proven effective in isolating quasi-stationary subcomponents from raw wind data before model training [53,56]. However, the performance of these hybrids is sensitive to the choice of decomposition parameters and can vary depending on local wind characteristics and site-specific conditions [53,54,55,56].

Despite significant advances, several weaknesses remain in the current wind forecasting literature. Many studies rely on data from only a single location, making it difficult to assess how well models perform across different climates or terrains. Often, new models are compared against baseline methods that are either too simple or not properly tuned, which prevents firm conclusions about their true advantage. Moreover, few studies break down their models to test how much each part, such as the CNN or the recurrent layer, actually contributes to performance. Evaluation is also frequently limited to just two error metrics, RMSE and MAE, while ignoring other useful indicators like correlation, model efficiency (e.g., Nash-Sutcliffe), or measures of systematic bias. These additional metrics can reveal important aspects of forecast quality that RMSE and MAE alone cannot capture. Finally, some models use too many input features, which not only increases complexity but also raises the risk of overfitting and reduces reliability when the model is applied to new or real-world data.

To address these limitations, this study proposes a hybrid CNNDBiLSTM architecture that combines convolutional feature extraction with deep bidirectional sequential modeling. The model leverages the CNN’s ability to capture local temporal patterns and the DBiLSTM’s capacity to learn long-range dependencies in both forward and backward time directions. This research evaluates the framework against MLR and carefully tuned SVR and CatBoost baselines using hourly wind speed data from two high wind potential sites in Tasmania, Australia. Performance is assessed through a comprehensive set of metrics, including the Pearson correlation coefficient (r), Nash-Sutcliffe efficiency (ENS), Willmott’s index of agreement (WI), root mean square error (RMSE), mean absolute error (MAE), Legates-McCabe index (LM), mean absolute percentage error (MAPE), and coefficient of determination (R²), to provide a multifaceted evaluation of accuracy, reliability, and systematic bias. This integrated methodology offers a robust and generalizable approach to short-term wind speed forecasting across diverse regional contexts.
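Most of the listed metrics can be computed directly from paired observed and forecast values. The sketch below uses the commonly cited textbook definitions, which are an assumption here because the exact formulas are not given in this section; R² is omitted because its definition varies between studies. The sample values are hypothetical.

```python
import math


def evaluation_metrics(obs, pred):
    """Common forecast-quality metrics from observed and predicted values."""
    n = len(obs)
    mean_o = sum(obs) / n
    mean_p = sum(pred) / n
    err = [p - o for o, p in zip(obs, pred)]
    sse = sum(e * e for e in err)                 # sum of squared errors
    sae = sum(abs(e) for e in err)                # sum of absolute errors
    dev_o = [o - mean_o for o in obs]
    dev_p = [p - mean_p for p in pred]
    r = sum(a * b for a, b in zip(dev_o, dev_p)) / math.sqrt(
        sum(a * a for a in dev_o) * sum(b * b for b in dev_p))
    wi_den = sum((abs(p - mean_o) + abs(o - mean_o)) ** 2 for o, p in zip(obs, pred))
    return {
        "r": r,
        "RMSE": math.sqrt(sse / n),
        "MAE": sae / n,
        # MAPE is undefined when an observation is zero (calm hours).
        "MAPE": 100.0 / n * sum(abs(e / o) for e, o in zip(err, obs)),
        "ENS": 1.0 - sse / sum(d * d for d in dev_o),
        "WI": 1.0 - sse / wi_den,
        "LM": 1.0 - sae / sum(abs(d) for d in dev_o),
    }


# Hypothetical observed and forecast wind speeds (m/s).
obs = [4.0, 5.0, 6.0, 7.0]
pred = [4.5, 5.0, 6.0, 6.5]
scores = evaluation_metrics(obs, pred)
```

ENS, WI, and LM all equal 1 for a perfect forecast; ENS and LM fall to 0 when the forecast is no better than predicting the observed mean, which makes them a useful complement to RMSE and MAE.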

2.2. Data and Methodology

This section of the study outlines the procedures used to develop, train, and evaluate the forecasting models. It describes the entire workflow, including data cleaning and preprocessing, the configuration of traditional machine learning models, and the design of the proposed CNN-DBiLSTM deep hybrid learning framework. In addition, it explains the training strategies adopted to optimise model performance and ensure reliable forecasting results.

Data Processing

In this study, wind speed data were collected from two meteorological stations in Tasmania, Australia: Scottsdale (Westminster Road) (147.48° E, 41.17° S) and Flinders Island Airport (148.00° E, 40.09° S). These sites were chosen because they provide long-term, high-quality datasets (Figure 2) and are situated near regions with strong potential for wind energy development. The dataset comprises hourly wind speed readings, totaling 113,952 observation points. The data were obtained from the Australian Government Bureau of Meteorology, which supports their consistency and accuracy. During preprocessing, only a very small proportion of data was found to be missing: 0.17% at Flinders Island and 0.12% at Scottsdale. These percentages refer to the share of missing values within each individual dataset, rather than the total across sites. The gaps were filled in Statistical Package for the Social Sciences (SPSS) software (version 26) using the average wind speed for the same time periods in other years, which helps to preserve natural seasonal and temporal patterns. For model development, the data were divided into three subsets: training (60%), validation (20%), and testing (20%). To prepare the data for forecasting, a Partial Autocorrelation Function (PACF) analysis was used to identify the most influential time lag. The results showed that the wind speed at time t − 1 had the strongest influence on the value at time t, so the model uses f(t − 1) as the input and f(t) as the target wind speed (Figure 4). Using a single lag simplifies the model and reduces the risk of multicollinearity, which can occur when multiple highly correlated lags are included as inputs. Furthermore, previous studies in wind and power forecasting have shown that lag-1 is often the most informative predictor for short-term horizons, with additional lags providing only marginal improvements at the cost of higher model complexity [12,20,38].
While multi-lag approaches can sometimes enhance long-term forecasting, for short-term applications such as those considered in this study, lag-1 provides an effective and parsimonious input representation.
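The lag-1 input representation described above can be illustrated with a short NumPy sketch. The function names and the sample values are hypothetical and are not taken from the study. The sketch relies on the fact that the partial autocorrelation at lag 1 equals the ordinary lag-1 autocorrelation, so it does not reproduce a full PACF analysis over many lags.

```python
import numpy as np

def lag1_pairs(series):
    """Build (input, target) pairs where the input is f(t - 1) and the target is f(t)."""
    values = np.asarray(series, dtype=float)
    return values[:-1], values[1:]

def lag1_partial_autocorrelation(series):
    """PACF at lag 1, which coincides with the lag-1 autocorrelation."""
    values = np.asarray(series, dtype=float)
    centered = values - values.mean()
    return float(np.sum(centered[1:] * centered[:-1]) / np.sum(centered ** 2))

# Hypothetical hourly wind speeds, for illustration only.
wind = np.array([5.1, 5.4, 6.0, 6.3, 5.9, 5.2, 4.8, 5.0])
x_prev, y_now = lag1_pairs(wind)          # x_prev[k] precedes y_now[k] by one hour
pacf_1 = lag1_partial_autocorrelation(wind)
```

A series of N observations yields N − 1 input-target pairs, because the first observation has no predecessor.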

These careful preprocessing steps ensure the dataset is clean, consistent, and well-aligned in time, therefore providing a solid base for accurate wind speed forecasting in these promising regions. Figure 5 below shows the overall summary of the modelling process.

After the missing values were imputed, the data were normalized so that input values lie on a comparable scale (0 to 1), which improves the stability and convergence of the AI models. Normalization used the min-max method, which rescales all values to the range 0 to 1 based on the minimum and maximum values. The dataset was then partitioned into training (60%), validation (20%), and testing (20%) subsets. Hyperparameter tuning was carried out to optimize parameters such as the learning rate, number of neurons, batch size, kernel size, and regularization strength. A grid search strategy was employed, in which combinations of hyperparameter values were tested on the trial dataset. This step is important for balancing predictive accuracy, convergence stability, and generalization to unseen data.
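The preprocessing order described above (gap filling, min-max normalization, then a 60/20/20 partition) can be sketched as follows. This is a minimal NumPy illustration, not the SPSS procedure used in the study. It assumes the series has been arranged as a years-by-time-slots array with aligned slots and that the partition is chronological, neither of which is stated in the text. All function names and sample values are hypothetical.

```python
import numpy as np

def fill_gaps_with_cross_year_mean(by_year):
    """Replace NaN entries with the mean of the same time slot in the other years."""
    data = np.array(by_year, dtype=float)          # shape: (years, time slots)
    slot_means = np.nanmean(data, axis=0)          # mean per slot, ignoring gaps
    gap_rows, gap_cols = np.nonzero(np.isnan(data))
    data[gap_rows, gap_cols] = slot_means[gap_cols]
    return data

def min_max_scale(values):
    """Rescale values to [0, 1]; also return the bounds needed to invert the scaling."""
    values = np.asarray(values, dtype=float)
    lo, hi = values.min(), values.max()
    if hi == lo:
        raise ValueError("cannot scale a constant series")
    return (values - lo) / (hi - lo), lo, hi

def chronological_split(values, train_frac=0.6, val_frac=0.2):
    """Split into training, validation, and testing portions in time order (assumed)."""
    n = len(values)
    train_end = int(round(n * train_frac))
    val_end = train_end + int(round(n * val_frac))
    return values[:train_end], values[train_end:val_end], values[val_end:]

# Hypothetical data: 3 years x 4 time slots with one missing reading.
raw = [[5.0, 6.0, 7.0, 8.0],
       [np.nan, 6.0, 9.0, 8.0],
       [7.0, 6.0, 8.0, 8.0]]
filled = fill_gaps_with_cross_year_mean(raw)
scaled, lo, hi = min_max_scale(filled.ravel())     # year-by-year series
train, val, test = chronological_split(scaled)
```

Forecasts produced on the scaled series can be mapped back to physical units with `prediction * (hi - lo) + lo`.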

2.3. Proposed CNNDBiLSTM Architecture and Evaluation Metrics

The hybrid CNNDBiLSTM model was developed to combine the advantages of both convolutional neural networks (CNNs) and recurrent neural networks (specifically, bidirectional long short-term memory networks—BiLSTMs). The model architecture was determined through a trial-and-error approach, beginning with a one-dimensional convolutional layer (Conv1D) that employs five filters and a kernel size of seven. This initial layer is responsible for extracting local temporal features and short-term patterns from the input wind speed sequences.

Subsequently, two weighted Bidirectional LSTM layers, each comprising 20 hidden units, are used to model long-term temporal dependencies in both forward and backward directions. By processing the sequence bidirectionally, the model effectively captures contextual information from both past and future time steps, enhancing its predictive understanding of wind speed behaviour. Following the recurrent layers, a flatten layer transforms the multi-dimensional output into a one-dimensional vector to ensure compatibility with the subsequent fully connected layers. This is followed by a dense (fully connected) layer with 64 neurons and ReLU activation, which facilitates the learning of complex nonlinear relationships within the data. Finally, an output layer with a single neuron and a linear activation function is used to generate the continuous wind speed forecasts.
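To make the layer stack above concrete, the following sketch traces output shapes and trainable-parameter counts through the described layers. It is an illustration under stated assumptions, not the authors' implementation: the convolution is assumed to use "same" padding so the time dimension is preserved, both BiLSTM layers are assumed to return full sequences and to concatenate their two directions, the LSTM count follows the common convention of one bias vector per gate, and the input window length is left as a parameter because it is not specified for the convolutional layer. All function names are hypothetical.

```python
def conv1d_params(kernel_size, in_channels, filters):
    # One weight per kernel position, input channel, and filter, plus one bias per filter.
    return kernel_size * in_channels * filters + filters

def lstm_params(input_dim, units):
    # Four gates, each with input weights, recurrent weights, and a bias vector.
    return 4 * (units * (input_dim + units) + units)

def dense_params(input_dim, units):
    return input_dim * units + units

def cnn_dbilstm_summary(window_length, channels=1):
    """Return (layer name, output shape, parameter count) for each layer."""
    flat = window_length * 40                       # 2 directions x 20 units per step
    return [
        ("Conv1D(5 filters, kernel 7)", (window_length, 5), conv1d_params(7, channels, 5)),
        ("BiLSTM(20) first", (window_length, 40), 2 * lstm_params(5, 20)),
        ("BiLSTM(20) second", (window_length, 40), 2 * lstm_params(40, 20)),
        ("Flatten", (flat,), 0),
        ("Dense(64, ReLU)", (64,), dense_params(flat, 64)),
        ("Dense(1, linear)", (1,), dense_params(64, 1)),
    ]

# Window length 1 corresponds to the single lag-1 input used in this study.
summary = cnn_dbilstm_summary(window_length=1)
total_params = sum(count for _, _, count in summary)
for name, shape, count in summary:
    print(f"{name:30s} {str(shape):10s} {count}")
```

Under these assumptions the recurrent layers account for most of the parameters when the window is short, while the first dense layer grows linearly with the window length.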

All models were implemented using Python software (conda 4.14.0), leveraging the TensorFlow and Scikit-learn libraries. The CNNDBiLSTM model was trained using the Adam optimizer with a learning rate of 0.001. The mean squared error (MSE) was used as the loss function. The training process was set to run for a maximum of 100 epochs, with a batch size of 32. To avoid overfitting, early stopping was applied with a patience of 10 epochs, based on the validation loss.
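The early-stopping rule described above (patience of 10 epochs on the validation loss, at most 100 epochs) can be expressed independently of any deep learning library. The sketch below is a plain-Python illustration of that rule applied to a hypothetical sequence of validation losses. It does not train a model, and the function name is an assumption.

```python
def early_stopping_epoch(val_losses, patience=10, max_epochs=100):
    """Return (stop_epoch, best_epoch), both 1-indexed, for a sequence of validation losses."""
    best_loss = float("inf")
    best_epoch = 0
    epochs_without_improvement = 0
    last_epoch = 0
    for epoch, loss in enumerate(val_losses[:max_epochs], start=1):
        last_epoch = epoch
        if loss < best_loss:
            best_loss = loss
            best_epoch = epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break                                # patience exhausted
    return last_epoch, best_epoch

# Hypothetical losses: improvement for five epochs, then a plateau.
losses = [0.50, 0.40, 0.35, 0.33, 0.32] + [0.33] * 20
stop_epoch, best_epoch = early_stopping_epoch(losses)
```

In this example the best validation loss occurs at epoch 5 and training stops at epoch 15, after ten consecutive epochs without improvement.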

For the SVR and CatBoost models, hyperparameter tuning was performed using grid search combined with 5-fold cross-validation. In the case of SVR, the best-performing parameters were found to be C = 100, epsilon = 0.1, and gamma = 0.01. For CatBoost, key parameters such as tree depth, learning rate, and the number of boosting iterations were optimized automatically, with early stopping used to determine the ideal number of iterations based on validation performance.
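A grid search with 5-fold cross-validation for SVR, as described above, might be set up with scikit-learn roughly as follows. The candidate grid, the RBF kernel choice, the scoring metric, and the synthetic data are all assumptions made for illustration. Only the reported optimum (C = 100, epsilon = 0.1, gamma = 0.01) comes from the study, and this toy example is not expected to reproduce it.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVR

# Synthetic stand-in for scaled lag-1 inputs and targets (not the study's data).
rng = np.random.default_rng(0)
x_prev = rng.uniform(0.0, 1.0, size=(200, 1))
y_now = 0.9 * x_prev.ravel() + rng.normal(0.0, 0.02, size=200)

# Hypothetical candidate values; the reported optimum is included in the grid.
param_grid = {
    "C": [1, 10, 100],
    "epsilon": [0.01, 0.1],
    "gamma": [0.01, 0.1],
}

search = GridSearchCV(
    SVR(kernel="rbf"),
    param_grid,
    cv=5,                                   # 5-fold cross-validation
    scoring="neg_root_mean_squared_error",  # higher (less negative) is better
)
search.fit(x_prev, y_now)
best_params = search.best_params_
```

With an integer `cv`, scikit-learn uses contiguous, unshuffled folds for regressors. Whether the study used this scheme or a time-aware split is not stated.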

All experiments were carried out on a standard desktop GPU from the NVIDIA RTX series. The CNNDBiLSTM model demonstrated efficient training times, completing each session in under ten minutes, which supports its suitability for real-time or operational wind speed forecasting applications.

Forecasting performance was assessed using the following evaluation metrics.

Pearson Correlation Coefficient (r)

$r = \frac{\sum_{i=1}^{N} (Y_{i} - \bar{Y})(\hat{Y}_{i} - \bar{\hat{Y}})}{\sqrt{\sum_{i=1}^{N} (Y_{i} - \bar{Y})^{2}} \cdot \sqrt{\sum_{i=1}^{N} (\hat{Y}_{i} - \bar{\hat{Y}})^{2}}}, \quad -1 \leq r \leq 1$

(28)

Nash–Sutcliffe Efficiency (ENS)

$\mathrm{ENS} = 1 - \frac{\sum_{i=1}^{N} (Y_{i} - \hat{Y}_{i})^{2}}{\sum_{i=1}^{N} (Y_{i} - \bar{Y})^{2}}, \quad -\infty < \mathrm{ENS} \leq 1$

(29)

Willmott’s Index of Agreement (WI)

$\mathrm{WI} = 1 - \frac{\sum_{i=1}^{N} (Y_{i} - \hat{Y}_{i})^{2}}{\sum_{i=1}^{N} (|\hat{Y}_{i} - \bar{Y}| + |Y_{i} - \bar{Y}|)^{2}}, \quad 0 \leq \mathrm{WI} \leq 1$

(30)

Root Mean Square Error (RMSE)

$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (Y_{i} - \hat{Y}_{i})^{2}}$

(31)

Mean Absolute Error (MAE)

$\mathrm{MAE} = \frac{1}{N} \sum_{i=1}^{N} |Y_{i} - \hat{Y}_{i}|$

(32)

Legate and McCabe Index (LM)

$\mathrm{LM} = 1 - \frac{\sum_{i=1}^{N} |Y_{i} - \hat{Y}_{i}|}{\sum_{i=1}^{N} (|\hat{Y}_{i} - \bar{Y}| + |Y_{i} - \bar{Y}|)}$

(33)

Mean Absolute Percentage Error (MAPE)

$\mathrm{MAPE} = \frac{100}{N} \sum_{i=1}^{N} \left| \frac{Y_{i} - \hat{Y}_{i}}{Y_{i}} \right|$

(34)

Coefficient of Determination (R²)

$R^{2} = 1 - \frac{\sum_{i=1}^{N} (Y_{i} - \hat{Y}_{i})^{2}}{\sum_{i=1}^{N} (Y_{i} - \bar{Y})^{2}}$

(35)

where

$Y_{i}$: Observed wind speed at time step i,

$\hat{Y}_{i}$: Predicted wind speed at time step i,

$\bar{Y}$: Mean of observed values,

$\bar{\hat{Y}}$: Mean of predicted values,

$cv_{Y}$: Coefficient of variation of observed values,

$cv_{\hat{Y}}$: Coefficient of variation of predicted values,

$N$: Total number of data points.

The performance of forecasting models can be assessed using a range of statistical metrics that capture different aspects of predictive accuracy. The correlation coefficient (r) evaluates the strength and direction of the linear association between observed and predicted values, with values close to unity denoting strong agreement. The Nash–Sutcliffe Efficiency (ENS) provides a relative measure of predictive skill by comparing the model’s performance to the mean of observations, where values near one indicate excellent predictions and negative values suggest poor performance. To further assess agreement, Willmott’s Index of Agreement (WI) normalises prediction errors against the potential error and ranges from zero (no agreement) to one (perfect agreement). The Root Mean Square Error (RMSE) and the Mean Absolute Error (MAE) quantify the magnitude of prediction errors, where RMSE penalises larger deviations more strongly, while MAE provides a balanced average of absolute differences. The Legate and McCabe Index (LM) refines this assessment by comparing absolute prediction errors with deviations from the observed mean, offering a robust measure of model reliability. The Mean Absolute Percentage Error (MAPE) conveys forecast accuracy in percentage terms. Finally, the coefficient of determination (R²) reflects the proportion of variance in observed data explained by model predictions, with higher values signifying stronger explanatory power.
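The metrics in Equations (28) to (35) can be computed directly from paired observed and predicted values. The following NumPy sketch implements each formula exactly as written in this section. The function name and the sample values are hypothetical. Note that Equations (29) and (35) are written identically here, so ENS and R² return the same number in this sketch.

```python
import numpy as np

def evaluate_forecast(observed, predicted):
    """Compute the evaluation metrics of Equations (28)-(35) as stated in the text."""
    y = np.asarray(observed, dtype=float)
    y_hat = np.asarray(predicted, dtype=float)
    err = y - y_hat
    dev_obs = y - y.mean()                      # Y_i minus mean of observations
    dev_pred = y_hat - y_hat.mean()             # predicted minus mean of predictions
    spread = np.abs(y_hat - y.mean()) + np.abs(dev_obs)
    sse = np.sum(err ** 2)
    sst = np.sum(dev_obs ** 2)
    return {
        "r": float(np.sum(dev_obs * dev_pred)
                   / np.sqrt(np.sum(dev_obs ** 2) * np.sum(dev_pred ** 2))),
        "ENS": float(1.0 - sse / sst),
        "WI": float(1.0 - sse / np.sum(spread ** 2)),
        "RMSE": float(np.sqrt(np.mean(err ** 2))),
        "MAE": float(np.mean(np.abs(err))),
        "LM": float(1.0 - np.sum(np.abs(err)) / np.sum(spread)),
        "MAPE": float(100.0 * np.mean(np.abs(err / y))),  # undefined if any y == 0
        "R2": float(1.0 - sse / sst),
    }

# Hypothetical observed and predicted wind speeds, for illustration only.
observed = np.array([4.0, 5.0, 6.0, 7.0])
predicted = np.array([4.5, 5.0, 5.5, 7.0])
scores = evaluate_forecast(observed, predicted)
```

MAPE divides by the observed value, so calm periods with zero wind speed would need to be handled separately before this formula is applied.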

3. Results and Discussion

This section presents a comprehensive performance evaluation of four forecasting models (MLR, SVR, CatBoost, and the proposed CNNDBiLSTM) across two meteorological locations: Flinders Island Airport and Scottsdale (Westminster Road) in Tasmania, Australia. The comparison is organized by performance metrics and model behavior.

3.1. Model Performance Overview

The overall forecasting results for both sites are presented in Table 1 and Table 2. The proposed CNNDBiLSTM model achieved the best performance across all statistical indicators, producing the lowest RMSE and MAE and the highest r, R², WI, ENS, and LM. At Flinders Island Airport, the model recorded an RMSE of 3.33 m/s and an r of 0.969, outperforming CatBoost, MLR, and SVR, which showed larger errors and weaker agreement with the observed data. High WI (0.984) and ENS (0.935) values confirm a close match between predicted and measured wind speeds. These results suggest that combining convolutional feature extraction with bidirectional deep learning (CNNDBiLSTM) enables the model to capture both short-term fluctuations and long-range temporal dependencies typical of coastal wind regimes. In contrast, the baseline models, which rely on simpler regression or kernel structures, were unable to represent the non-linear and non-stationary characteristics of the wind series.

At the Scottsdale (Westminster Road) site, CNNDBiLSTM again produced the most accurate results, with RMSE = 1.06 m/s, MAE = 0.38 m/s, and r = 0.987. CatBoost ranked second but exhibited greater deviations during rapid wind changes, while MLR and SVR consistently showed higher error levels. The hybrid model maintained strong stability and generalization, with R² = 0.973 and WI = 0.993, indicating reliable performance under local terrain conditions. The consistent superiority of CNNDBiLSTM across both locations demonstrates its ability to learn key temporal patterns in complex wind fields, providing a robust framework for short-term wind speed forecasting in practical energy applications.

3.2. Prediction Error Distribution and Forecasting Accuracy

Figure 5, Figure 6, Figure 7, Figure 8, Figure 9, Figure 10, Figure 11 and Figure 12 provide a comparative visual analysis of the forecasting performance of the four models, MLR, SVR, CatBoost, and CNNDBiLSTM, across the two study sites: Flinders Island Airport and Scottsdale (Westminster Road). Figure 6 and Figure 7 present scatter plots of observed versus predicted wind speeds. The closer clustering of data points along the 1:1 line in the CNNDBiLSTM panels indicates superior predictive accuracy compared with the wider scatter in the MLR, SVR, and CatBoost models. For Flinders Island, the CNNDBiLSTM achieved the highest coefficient of determination (R² = 0.94), while Scottsdale recorded an even stronger fit (R² = 0.973). These results highlight the model’s enhanced ability to capture nonlinear dependencies in wind speed dynamics relative to conventional approaches.

Figure 8 and Figure 9 show the prediction-error histograms. At both sites, the CNNDBiLSTM demonstrates a sharply peaked distribution near zero, signifying minimal error and high forecast reliability. In contrast, the other models display broader error ranges and heavier tails, implying larger deviations from the observed values. The overall reduction in error variance confirms that the deep hybrid framework successfully generalises the temporal and seasonal variability of wind patterns.

Figure 10 and Figure 11 illustrate time series comparisons of observed and predicted wind speeds for the first 100 data points. The CNNDBiLSTM closely tracks the actual wind speed fluctuations at both locations, preserving short-term variations and sudden changes. Meanwhile, the benchmark models exhibit noticeable lag and amplitude mismatches, particularly during rapid wind transitions.

Figure 12 and Figure 13 summarize the statistical characteristics of the prediction errors. The CNNDBiLSTM shows the lowest mean, standard deviation, and median errors, as well as the smallest maximum deviation, underscoring its consistency and robustness. In comparison, the MLR, SVR, and CatBoost models present higher variability and error magnitudes, particularly under fluctuating wind conditions.

4. Discussion

The CNNDBiLSTM model showed the most balanced and consistently strong performance across both sites. Its hybrid design, combining convolutional layers for feature extraction with bidirectional LSTM layers for sequence learning, allowed it to recognize short-term fluctuations as well as broader temporal trends in the wind-speed data. This advantage can be seen from its lower RMSE and MAE values and higher R² and r across both locations. The model also maintained stable accuracy in different topographies and climatic settings, indicating good adaptability. CatBoost performed reasonably well and ranked second overall, yet it tended to underestimate wind speeds during periods of rapid change. By contrast, MLR and SVR, being linear and kernel-based models, could not adequately represent the complex and non-stationary structure of wind dynamics. These observations suggest that the CNNDBiLSTM architecture offers a reliable and adaptable framework for short-term wind-speed forecasting.

The improvement achieved by CNNDBiLSTM is the result of two complementary mechanisms. First, the convolutional component functions as a local filter that isolates meaningful short-term variations such as gusts and ramp events, while controlling random fluctuations. This cleaner representation is then processed by the deep bidirectional LSTM layers, which learn temporal dependencies in both forward and backward directions. As a result, the model aligns current predictions with both past and near future information, reducing phase-shift errors that are common in unidirectional networks. These combined effects explain its consistently higher r, WI, and ENS values and the smaller RMSE and MAE observed across both coastal and inland environments. The hybrid structure, therefore, not only improves numerical accuracy but also provides a better reflection of the oscillatory and intermittent behaviour characteristic of real wind fields.

To further evaluate the efficiency of the hybrid CNNDBiLSTM model, an ablation study was carried out on its components. Three standalone models (CNN, LSTM, and BiLSTM) were compared with the full CNNDBiLSTM architecture. Performance was evaluated with the RMSE and R² metrics, as shown in Table 3 below.

The ablation study clearly shows the benefit of the hybrid architecture, which yields better predictions than any of the standalone models. At Flinders Island, CNNDBiLSTM achieved an RMSE of 3.3346 and an R² of 0.9398, indicating a high degree of accuracy and close agreement with the observed data. This performance is notably better than that of the individual models: CNN-only (RMSE = 4.6791, R² = 0.8746), BiLSTM-only (RMSE = 4.6737, R² = 0.8749), and LSTM-only (RMSE = 4.7386, R² = 0.8713). The Scottsdale results follow a similar trend, further supporting the superiority of the CNNDBiLSTM model. Here, the RMSE of CNNDBiLSTM is 1.0553 and the R² is 0.9733, reflecting a more precise fit. Figure 14 and Figure 15 compare the hybrid CNNDBiLSTM model predictions with those of the standalone models.
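The size of the ablation gain at Flinders Island can be expressed as a relative RMSE reduction. The short calculation below uses only the RMSE values quoted above. The variable names are illustrative.

```python
# RMSE values for Flinders Island as reported in the ablation study.
standalone_rmse = {"CNN": 4.6791, "BiLSTM": 4.6737, "LSTM": 4.7386}
hybrid_rmse = 3.3346

# Percentage reduction in RMSE achieved by the hybrid relative to each standalone model.
reduction_pct = {
    name: 100.0 * (rmse - hybrid_rmse) / rmse
    for name, rmse in standalone_rmse.items()
}

for name, pct in reduction_pct.items():
    print(f"CNNDBiLSTM vs {name}-only: {pct:.1f}% lower RMSE")
```

On these figures the hybrid reduces RMSE by roughly 29% relative to each standalone model (about 28.7% against CNN-only and BiLSTM-only, and about 29.6% against LSTM-only).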

5. Conclusions

This study presents the development and systematic evaluation of a hybrid deep learning model, CNNDBiLSTM, designed for short-term wind speed forecasting at two meteorological stations in Tasmania, Australia: Flinders Island Airport and Scottsdale (Westminster Road). These locations were chosen for their distinct wind profiles and reliable long-term data records. To assess the model’s performance, it was compared against three established forecasting methods: Multiple Linear Regression (MLR), Support Vector Regression (SVR), and Categorical Boosting (CatBoost). Across both sites, the CNNDBiLSTM consistently outperformed the baseline models. It achieved the highest correlation coefficients (r = 0.987–0.988), the lowest error rates (RMSE = 0.392–0.402, MAE = 0.294–0.310), and superior scores across multiple efficiency metrics (ENS, WI, LM). Beyond its numerical accuracy, the findings highlight the practical value of integrating convolutional feature extraction with bidirectional sequential learning for modelling highly variable wind dynamics. The CNNDBiLSTM demonstrated strong adaptability across coastal and inland environments, showing potential for real-world use in turbine control, grid integration, and renewable-energy resource planning. The wind speed analysis and forecasting show Flinders with higher and consistent wind speed as a more viable option for large-scale wind energy generation than Scottsdale in Tasmania.

While the results provide promising insights into the use of a deep hybrid CNNDBiLSTM model for short-term wind speed forecasting in wind-rich regions of Australia, several areas of improvement offer opportunities for further research. First, the present model relies primarily on historical wind speed data, so future studies could integrate additional meteorological variables such as temperature, pressure, and humidity to enhance the robustness of the model under different weather regimes. Several meteorological variables affect wind speed dynamics in different, non-proportional, and non-linear ways. Therefore, these variables may lead to a better predictive performance of the proposed model. It is also important to incorporate uncertainty quantification methods, given the dynamic and chaotic nature of wind speed. If implemented in a real wind energy conversion system, a model with the capability to generate uncertainty level predictions can provide a major decision tool for wind energy providers [57]. In addition, the use of probabilistic prediction methods, or the integration of such a method into the existing model, can generate information on the reliability of the model to help evaluate financial or other risk if the model is used in a real-time wind energy system [4,57].

Furthermore, financial risk in the energy sector is also closely related to the harmonic waveforms injected into the electricity grid by wind and other renewable energy sources [57]. Therefore, future studies could also include probabilistic model design with uncertainty evaluation methods to better understand and manage the harmonics in the grid. Real-time implementation strategies could also improve operational reliability of the grid if the proposed model with such improvements were used in a system.

Author Contributions

Conceptualization, A.N. and N.R.; methodology, A.N. and N.R.; software, A.N.; validation, A.N. and N.R.; formal analysis, A.N. and N.R.; investigation, A.N.; resources, A.N.; data curation, A.N.; writing original draft preparation, A.N.; writing review and editing, A.N. and N.R.; visualization, A.N.; supervision, N.R. and R.D.; project administration, A.N. and N.R. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

1. Data has been accessed from Bureau of Meteorology Australia (BoM). 2. Maps obtained from the Global Wind Atlas version 4.0, a free, web-based application developed, owned, and operated by the Technical University of Denmark (DTU). The Global Wind Atlas version 4.0 was released in partnership with the World Bank Group, utilizing data provided by Vortex, using funding provided by the Energy Sector Management Assistance Program (ESMAP). For additional information: https://globalwindatlas.info accessed on 10 April 2020.

Conflicts of Interest

The authors declared no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:

ABP	Absolute Percentage Bias
AI	Artificial Intelligence
BiLSTM	Bidirectional Long Short-Term Memory Network
CatBoost	Categorical Boosting (a gradient boosting machine learning algorithm)
CNN	Convolutional Neural Network
CNNDBiLSTM	Convolutional Neural Network—Double Bidirectional Long Short-Term Memory
DBiLSTM	Double Bidirectional Long Short-Term Memory
ENS	Nash–Sutcliffe Coefficient (Efficiency)
LM	Legate and McCabe Index
LSTM	Long Short-Term Memory
MAE	Mean Absolute Error
MAPE	Mean Absolute Percentage Error
MLR	Multiple Linear Regression
MSE	Mean Squared Error
PACF	Partial Autocorrelation Function
R²	Coefficient of Determination
RBF	Radial Basis Function (Kernel)
RMSE	Root Mean Square Error
SDG	Sustainable Development Goals
SVR	Support Vector Regression
WI	Willmott’s Index of Agreement

References

1. Archer, C.L.; Jacobson, M.Z. Evaluation of global wind power. J. Geophys. Res. Atmos. 2005, 110, 1–20.
2. International Energy Agency (IEA). Global Energy Review 2025; IEA: Paris, France, 2025.
3. International Renewable Energy Agency (IRENA). Renewable Capacity Statistics 2024; IRENA: Abu Dhabi, United Arab Emirates, 2024.
4. International Energy Agency (IEA). Renewables 2024. In Global Status Report—Global Overview; IEA: Paris, France, 2024.
5. BP plc. BP Energy Outlook 2024; BP plc: London, UK, 2024.
6. Almalki, H.; Safaei, B.; Kolamroudi, M.K.; Sahmani, S.; Arman, S.; Shekoofa, O. Systematic literature review on the design, efficiency and fabrication of wind turbine blades. Int. J. Ambient. Energy 2024, 45, 2374057.
7. Australian Government. Australian Energy Statistics, Table O Electricity Generation by Fuel Type 2023–24; Department of Climate Change, Energy, the Environment and Water: Canberra, Australia, 2025.
8. Cheung, G.; Davies, P.J. In the transformation of energy systems: What is holding Australia back? Energy Policy 2017, 109, 96–108.
9. Guidolin, M.; Alpcan, T. Transition to sustainable energy generation in Australia: Interplay between coal, gas and renewables. Renew. Energy 2019, 139, 359–367.
10. Davis, N.N.; Badger, J.; Hahmann, A.N.; Hansen, B.O.; Mortensen, N.G.; Kelly, M.; Larsén, X.G.; Olsen, B.T.; Floors, R.; Lizcano, G.; et al. The Global Wind Atlas: A High-Resolution Dataset of Climatologies and Associated Web-Based Application. Bull. Am. Meteorol. Soc. 2023, 104, E1507–E1525.
11. Kumar, K.; Prabhakar, P.; Verma, A. Advancements in Wind Power Forecasting: A Comprehensive Review. J. Electron. Electr. Eng. 2025, 4, 298–322.
12. Jiang, Y.; Song, Z.; Kusiak, A. Very short-term wind speed forecasting with Bayesian structural break model. Renew. Energy 2013, 50, 637–647.
13. Slater, L.J.; Arnal, L.; Boucher, M.-A.; Chang, A.Y.-Y.; Moulds, S.; Murphy, C.; Nearing, G.; Shalev, G.; Shen, C.; Speight, L.; et al. Hybrid forecasting: Blending climate predictions with AI models. Hydrol. Earth Syst. Sci. 2023, 27, 1865–1889.
14. Strielkowski, W.; Civín, L.; Tarkhanova, E.; Tvaronavičienė, M.; Petrenko, Y. Renewable energy in the sustainable development of electrical power sector: A review. Energies 2021, 14, 8240.
15. Kleidon, A. Physical limits of wind energy within the atmosphere and its use as renewable energy: From the theoretical basis to practical implications. Meteorol. Z. 2021, 30, 203–225.
16. Veers, P.; Dykes, K.; Lantz, E.; Barth, S.; Bottasso, C.L.; Carlson, O.; Clifton, A.; Green, J.; Green, P.; Holttinen, H.; et al. Grand challenges in the science of wind energy. Science 2019, 366, eaau2027.
17. Kiviluoma, J. Managing Wind Power Variability and Uncertainty Through Increased Power System Flexibility. Ph.D. Thesis, Aalto University, Espoo, Finland, 2013.
18. Zhao, L.; Liu, C.; Yang, C.; Liu, S.; Zhang, Y.; Li, Y. A location-centric transformer framework for multi-location short-term wind speed forecasting. Energy Convers. Manag. 2025, 328, 119627.
19. Ehteram, M.; Banadkooki, F.B. A Developed Multiple Linear Regression (MLR) Model for Monthly Groundwater Level Prediction. Water 2023, 15, 3940.
20. Mogos, A.S.; Salauddin, M.; Liang, X.; Chung, C.Y. An Effective Very Short-Term Wind Speed Prediction Approach Using Multiple Regression Models. IEEE Can. J. Electr. Comput. Eng. 2022, 45, 242–253.
21. Deo, R.C.; Şahin, M. Forecasting long-term global solar radiation with an ANN algorithm coupled with satellite-derived (MODIS) land surface temperature (LST) for regional locations in Queensland. Renew. Sustain. Energy Rev. 2017, 72, 828–848.
22. Bianco, V.; Manca, O.; Nardini, S. Electricity consumption forecasting in Italy using linear regression models. Energy 2009, 34, 1413–1421.
23. Montgomery, D.C.; Peck, E.A.; Vining, G.G. Introduction to Linear Regression Analysis, 5th ed.; John Wiley & Sons, Inc.: Hoboken, NJ, USA, 2012.
24. Breiman, L.; Friedman, J.H. Predicting Multivariate Responses in Multiple Linear Regression. J. R. Stat. Soc. Ser. B 1997, 59, 3–54.
25. Raj, N.; Murali, J.; Singh-Peterson, L.; Downs, N. Prediction of Sea Level Using Double Data Decomposition and Hybrid Deep Learning Model for Northern Territory, Australia. Mathematics 2024, 12, 2376.
26. Supriya; Ashutosh, S.; Priyanka, S.; Kumar, P.R. Forecasting Methods for Renewable Power Generation; Wiley-Scrivener Publishing: Hoboken, NJ, USA, 2025.
27. Moges, G.; McDonnell, K.; Delele, M.A.; Ali, A.N.; Fanta, S.W. Development and comparative analysis of ANN and SVR-based models with conventional regression models for predicting spray drift. Environ. Sci. Pollut. Res. 2023, 30, 21927–21944.
28. Adaryani, F.R.; Jamshid Mousavi, S.; Jafari, F. Short-term rainfall forecasting using machine learning-based approaches of PSO-SVR, LSTM and CNN. J. Hydrol. 2022, 614, 128463.
29. Tan, Z.; Zhang, J.; He, Y.; Zhang, Y.; Xiong, G.; Liu, Y. Short-Term Load Forecasting Based on Integration of SVR and Stacking. IEEE Access 2020, 8, 227719–227728.
30. Das, U.K.; Tey, K.S.; Bin Idris, M.Y.I.; Mekhilef, S.; Seyedmahmoudian, M.; Stojcevski, A.; Horan, B. Optimized Support Vector Regression-Based Model for Solar Power Generation Forecasting on the Basis of Online Weather Reports. IEEE Access 2022, 10, 15594–15604.
31. Vapnik, V.N. The Nature of Statistical Learning Theory, 2nd ed.; Springer: New York, NY, USA, 1995.
32. Drucker, H.; Burges, C.J.C.; Kaufman, L.; Smola, A.; Vapnik, V. Support Vector Regression Machines. In Advances in Neural Information Processing Systems 9; MIT Press: Cambridge, MA, USA, 1997; pp. 155–161.
33. Zhang, L.; Jánošík, D. Enhanced short-term load forecasting with hybrid machine learning models: CatBoost and XGBoost approaches. Expert Syst. Appl. 2024, 241, 122686.
34. Prokhorenkova, L.; Gusev, G.; Vorobev, A.; Dorogush, A.V.; Gulin, A. CatBoost: Unbiased boosting with categorical features. arXiv 2018, arXiv:1706.09516.
35. Kim, H.; Park, S.; Park, H.J.; Son, H.G.; Kim, S. Solar Radiation Forecasting Based on the Hybrid CNN-CatBoost Model. IEEE Access 2023, 11, 13492–13500.
36. Diao, L.; Niu, D.; Zang, Z.; Chen, C. Short-term Weather Forecast Based on Wavelet Denoising and Catboost. In Proceedings of the 2019 Chinese Control Conference (CCC), Guangzhou, China, 27–30 July 2019.
37. Lecun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature 2015, 521, 436–444.
38. Chen, Y.; Zhao, H.; Zhou, R.; Xu, P.; Zhang, K.; Dai, Y.; Zhang, H.; Zhang, J.; Gao, T. CNN-BiLSTM Short-Term Wind Power Forecasting Method Based on Feature Selection. IEEE J. Radio Freq. Identif. 2022, 6, 922–927.
39. Yao, Z.; Zhang, T.; Wang, Q.; Zhao, Y. Short-Term Power Load Forecasting of Integrated Energy System Based on Attention-CNN-DBILSTM. Math. Probl. Eng. 2022, 2022, 1075698.
40. Livieris, I.E.; Pintelas, E.; Pintelas, P. A CNN–LSTM model for gold price time-series forecasting. Neural Comput. Appl. 2020, 32, 17351–17360.
41. Schuster, M.; Paliwal, K.K. Bidirectional Recurrent Neural Networks. IEEE Trans. Signal Process. 1997, 45, 2673–2681.
42. Wei, Z.; Shaohua, J.; Gang, B.; Yang, C.; Chengyang, P.; Haixing, X. A Method for Sound Speed Profile Prediction Based on CNN-BiLSTM-Attention Network. J. Mar. Sci. Eng. 2024, 12, 414.
43. Méndez, M.; Merayo, M.G.; Núñez, M. Long-term traffic flow forecasting using a hybrid CNN-BiLSTM model. Eng. Appl. Artif. Intell. 2023, 121, 106041.
44. Lin, H.; Zhang, S.; Li, Q.; Li, Y.; Li, J.; Yang, Y. A new method for heart rate prediction based on LSTM-BiLSTM-Att. Measurement 2023, 207, 112384.
45. Yildirim, Ö. A novel wavelet sequences based on deep bidirectional LSTM network model for ECG signal classification. Comput. Biol. Med. 2018, 96, 189–202.
46. Xu, Y.; Li, D.; Wang, Z.; Guo, Q.; Xiang, W. A deep learning method based on convolutional neural network for automatic modulation classification of wireless signals. Wirel. Netw. 2019, 25, 3735–3746.
47. Madhiarasan, M. Bayesian optimisation algorithm based optimised deep bidirectional long short term memory for global horizontal irradiance prediction in long-term horizon. Front. Energy Res. 2025, 13, 1499751.
48. Cheng, Y.; Xu, Z.; Chen, Y.; Wang, Y.; Lin, Z.; Liu, J. A Deep Learning Framework Integrating CNN and BiLSTM for Financial Systemic Risk Analysis and Prediction. arXiv 2025, arXiv:2502.06847.
49. Zhang, J.; Li, S. Air Quality Index Forecast in Beijing Based on CNN–LSTM Multi-Model. Chemosphere 2022, 308, 136180.
50. Du, S.; Li, T.; Yang, Y.; Horng, S.J. Deep Air Quality Forecasting Using Hybrid Deep Learning Framework. IEEE Trans. Knowl. Data Eng. 2021, 33, 2412–2424.
51. Singh, S.K.; Jha, S.K.; Gupta, R. Enhancing the accuracy of wind speed estimation model using an efficient hybrid deep learning algorithm. Sustain. Energy Technol. Assess. 2024, 61, 103603.
52. Chen, Y.; Wang, Y.; Dong, Z.; Su, J.; Han, Z.; Zhou, D.; Zhao, Y.; Bao, Y. 2-D regional short-term wind speed forecast based on CNN-LSTM deep learning model. Energy Convers. Manag. 2021, 244, 114451.
53. Wang, Y.; Zou, R.; Liu, F.; Zhang, L.; Liu, Q. A review of wind speed and wind power forecasting with deep neural networks. Appl. Energy 2021, 304, 117766.
54. Wang, K.; Tang, X.Y.; Zhao, S. Robust multi-step wind speed forecasting based on a graph-based data reconstruction deep learning method. Expert Syst. Appl. 2024, 238, 121886.
55. Bashir, T.; Wang, H.; Tahir, M.; Zhang, Y. Wind and solar power forecasting based on hybrid CNN-ABiLSTM, CNN-transformer-MLP models. Renew. Energy 2025, 239, 122055.
56. Nguyen, T.H.T.; Phan, Q.B. Hourly day ahead wind speed forecasting based on a hybrid model of EEMD, CNN-Bi-LSTM embedded with GA optimization. Energy Rep. 2022, 8, 53–60.
57. Australian Renewable Energy Agency (ARENA); University of Wollongong. Harmonic Study—Large Renewable Energy Generators: Final Report; Australian Renewable Energy Agency: Canberra, Australia, 2022. Available online: https://arena.gov.au/assets/2022/05/harmonic-study-large-renewable-energy-generators-report.pdf (accessed on 11 September 2025).

Figure 1. Average wind speed at 10 m above ground level across Australia and the surrounding region. Areas with high wind potential are predominantly located along the coastal zones [10]. Wind speed map accessed on 9 July 2025 from Global Wind Atlas 4.0 (https://globalwindatlas.info).

Figure 2. Areas of wind potential in Australia at 50 m above ground level, where wind speed exceeds 8 m/s. The Tasmanian study sites show high wind speed potential, indicated by dense red markers.

Figure 3. The model features two LSTM layers operating in parallel: one processes the input sequence from start to end (left to right), while the other processes it in the opposite direction (right to left). At each time step t, the hidden states from both directions are concatenated to produce the final output.
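
The bidirectional scheme described in the Figure 3 caption can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the helper names (`lstm_pass`, `bilstm`, `init_params`), the hidden size of 8, the random weights, and the synthetic 24-step input are all assumptions chosen only to show how the forward and backward hidden states are aligned and concatenated at each time step.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_pass(x, W, U, b):
    """Run a single LSTM over x of shape (T, d_in); return hidden states (T, d_h)."""
    T = x.shape[0]
    d_h = U.shape[1]
    h = np.zeros(d_h)
    c = np.zeros(d_h)
    out = np.zeros((T, d_h))
    for t in range(T):
        z = W @ x[t] + U @ h + b           # stacked gate pre-activations, shape (4 * d_h,)
        i, f, o, g = np.split(z, 4)        # input, forget, output gates and candidate
        i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)
        g = np.tanh(g)
        c = f * c + i * g                  # cell state update
        h = o * np.tanh(c)                 # hidden state update
        out[t] = h
    return out

def bilstm(x, fwd_params, bwd_params):
    """Concatenate forward and backward hidden states at every time step."""
    h_fwd = lstm_pass(x, *fwd_params)
    # Process the reversed sequence, then flip back so row t matches time step t.
    h_bwd = lstm_pass(x[::-1], *bwd_params)[::-1]
    return np.concatenate([h_fwd, h_bwd], axis=1)

def init_params(rng, d_in, d_h):
    """Hypothetical random initialisation (illustration only, not trained weights)."""
    return (rng.normal(scale=0.1, size=(4 * d_h, d_in)),
            rng.normal(scale=0.1, size=(4 * d_h, d_h)),
            np.zeros(4 * d_h))

rng = np.random.default_rng(0)
x = rng.normal(size=(24, 1))               # synthetic stand-in for 24 hourly values
fwd = init_params(rng, d_in=1, d_h=8)
bwd = init_params(rng, d_in=1, d_h=8)
H = bilstm(x, fwd, bwd)
print(H.shape)                             # (24, 16)
```

Each row of `H` holds 8 forward and 8 backward units, so the output width is twice the hidden size. In a deep (stacked) variant, `H` would serve as the input to a second bidirectional layer.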

Figure 4. Partial Autocorrelation Function (PACF) of wind speed data at Flinders Island Airport (TAS). The first lag shows a much stronger partial correlation than the other lags.
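
As a rough illustration of what the PACF in Figure 4 measures, the sketch below estimates partial autocorrelations by ordinary least squares: the value at lag k is the last coefficient when the series is regressed on its previous k values. The function name `pacf_ols` and the synthetic AR(1) series are assumptions made for this example; the source does not state which estimator or software was used.

```python
import numpy as np

def pacf_ols(x, nlags):
    """Estimate partial autocorrelations at lags 1..nlags by OLS regression."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    n = len(x)
    values = []
    for k in range(1, nlags + 1):
        # Columns hold x[t-1], ..., x[t-k] for t = k, ..., n-1.
        X = np.column_stack([x[k - j: n - j] for j in range(1, k + 1)])
        y = x[k:]
        coef, *_ = np.linalg.lstsq(X, y, rcond=None)
        values.append(coef[-1])            # coefficient on x[t-k] is the lag-k PACF
    return np.array(values)

# Synthetic AR(1) series (illustration only, not the Tasmanian data).
rng = np.random.default_rng(42)
phi = 0.9
series = np.zeros(5000)
for t in range(1, len(series)):
    series[t] = phi * series[t - 1] + rng.normal()

pacf_values = pacf_ols(series, nlags=5)
print(np.round(pacf_values, 3))
```

For a series dominated by its previous value, the lag-1 estimate is large and the higher lags stay near zero, which is the pattern the caption describes.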

Figure 5. Overall summary of the data modelling process for wind speed prediction.

Figure 6. Scatterplots for all models, showing the line of best fit and R² between observed and predicted values at the Flinders Island Airport (TAS) study site.

Figure 7. Scatterplots for all models, showing the line of best fit and R² between observed and predicted values at the Scottsdale (TAS) study site.

Figure 8. Absolute prediction error for all models, showing the spread of errors at the Flinders Island Airport (TAS) study site.

Figure 9. Absolute prediction error for all models, showing the spread of errors at the Scottsdale (TAS) study site.

Figure 10. Time series comparison of observed and predicted wind speed over 100 data points for all models at the Flinders Island Airport (TAS) study site. Solid lines show observed values and dashed lines show predicted values.

Figure 11. Time series comparison of observed and predicted wind speed over 100 data points for all models at the Scottsdale (TAS) study site. Solid lines show observed values and dashed lines show predicted values.

Figure 12. Statistical summary of model prediction error for the Flinders Island Airport (TAS) study site.

Figure 13. Statistical summary of model prediction error for the Scottsdale (TAS) study site.

Figure 14. Scatterplots of the single and hybrid models in the ablation study, showing the line of best fit and R² between observed and predicted values at the Flinders (TAS) study site.

Figure 15. Scatterplots of the single and hybrid models in the ablation study, showing the line of best fit and R² between observed and predicted values at the Scottsdale (TAS) study site.

Table 1. Model performance metrics for Flinders.

Model	CNNDBiLSTM	CatBoost	MLR	SVR
MAE	2.4898	4.2872	4.4505	4.5569
MSE	11.1199	31.5768	33.544	33.0762
RMSE	3.3346	5.6193	5.7917	5.7512
R²	0.9398	0.8158	0.8042	0.8107
MAPE	13.0376	23.0132	24.2287	26.8134
r	0.9694	0.9032	0.8968	0.9004
WI	0.9841	0.9476	0.9434	0.9409
ENS	0.935	0.8154	0.8039	0.8066
LM	0.7716	0.6067	0.5917	0.582

Table 2. Model performance metrics for Scottsdale.

Model	CNNDBiLSTM	CatBoost	MLR	SVR
MAE	0.3757	3.3579	2.9256	2.8295
MSE	1.1137	16.41	14.2716	13.6034
RMSE	1.0553	4.0509	3.7778	3.6883
R²	0.9733	0.9069	0.6512	0.6676
MAPE	4.1056	43.9152	28.4467	27.357
r	0.9865	0.9523	0.807	0.8171
WI	0.9931	0.8323	0.8853	0.8935
ENS	0.9728	0.5989	0.6512	0.6675
LM	0.9256	0.3354	0.421	0.44
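
The metrics listed in Tables 1 and 2 can be computed along the following lines. This is a hedged sketch using the commonly cited definitions of Willmott's index (WI), the Nash–Sutcliffe efficiency (ENS), and the Legates–McCabe index (LM); the function name `forecast_metrics` and the four-point toy data are assumptions, and the paper's exact formulas may differ in detail.

```python
import numpy as np

def forecast_metrics(obs, pred):
    """Return common forecast-accuracy metrics for observed and predicted values."""
    obs = np.asarray(obs, dtype=float)
    pred = np.asarray(pred, dtype=float)
    err = obs - pred
    dev = obs - obs.mean()                 # deviation of observations from their mean
    sse = np.sum(err ** 2)
    return {
        "MAE": np.mean(np.abs(err)),
        "MSE": np.mean(err ** 2),
        "RMSE": np.sqrt(np.mean(err ** 2)),
        # MAPE is undefined when any observation equals zero (e.g. calm hours).
        "MAPE": 100.0 * np.mean(np.abs(err / obs)),
        "r": np.corrcoef(obs, pred)[0, 1],
        "WI": 1.0 - sse / np.sum((np.abs(pred - obs.mean()) + np.abs(dev)) ** 2),
        "ENS": 1.0 - sse / np.sum(dev ** 2),
        "LM": 1.0 - np.sum(np.abs(err)) / np.sum(np.abs(dev)),
    }

# Toy values for illustration only (not taken from the study data).
m = forecast_metrics([2.0, 4.0, 6.0, 8.0], [3.0, 4.0, 5.0, 8.0])
for name, value in m.items():
    print(f"{name}: {value:.4f}")
```

WI, ENS, and LM all equal 1 for a perfect forecast. ENS and LM fall to 0 when the forecast is no better than the observed mean, and LM is less sensitive to large individual errors because it uses absolute rather than squared differences.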

Table 3. Ablation study model performance metrics for Flinders and Scottsdale.

Flinders Results			Scottsdale Results
Model	RMSE	R²	Model	RMSE	R²
CNNDBiLSTM	3.3346	0.9398	CNNDBiLSTM	1.0553	0.9733
CNN	4.6791	0.8746	CNN	2.8792	0.8243
BiLSTM	4.6737	0.8749	BiLSTM	2.8421	0.8227
LSTM	4.7386	0.8713	LSTM	2.9331	0.8180

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Neupane, A.; Raj, N.; Deo, R. A Deep Hybrid CNNDBiLSTM Model for Short-Term Wind Speed Forecasting in Wind-Rich Regions of Tasmania, Australia. Energies 2025, 18, 6390. https://doi.org/10.3390/en18246390

