Article

Implementing a Hybrid Quantum Neural Network for Wind Speed Forecasting: Insights from Quantum Simulator Experiences

Department of Electrical Engineering, Chung Yuan Christian University, Taoyuan City 320314, Taiwan
*
Author to whom correspondence should be addressed.
Energies 2025, 18(7), 1771; https://doi.org/10.3390/en18071771
Submission received: 21 February 2025 / Revised: 20 March 2025 / Accepted: 27 March 2025 / Published: 1 April 2025
(This article belongs to the Section A3: Wind, Wave and Tidal Energy)

Abstract: The intermittent nature of wind speed poses challenges for its widespread utilization as an electrical power generation source. As the integration of wind energy into the power system increases, accurate wind speed forecasting becomes crucial, since the reliable scheduling of wind power generation heavily relies on precise wind speed forecasts. This paper presents an extended work that focuses on a hybrid model for 24 h ahead wind speed forecasting. The proposed model combines residual Long Short-Term Memory (LSTM) and a quantum neural network executed on a quantum simulator, leveraging the support of NVIDIA Compute Unified Device Architecture (CUDA). To ensure the desired accuracy, a comparative analysis is conducted, examining the qubit count and quantum circuit depth of the proposed model. The execution time required by the model is significantly reduced when the GPU incorporates CUDA, accounting for only 8.29% of the time required by a classical CPU. In addition, different quantum embedding layers with various entangler layers in the quantum neural network are explored. The simulation results utilizing an offshore wind farm dataset demonstrate that a proper number of qubits and a suitable embedding layer can achieve favorable 24 h ahead wind speed forecasts.

1. Introduction

In recent years, the utilization of wind power has gained significant importance in many countries due to the increasing need for additional reserve capacity and sustainable energy sources [1]. However, the volatile nature of wind power generation poses challenges in ensuring reliable and cost-effective operation [2,3]. As a result, researchers and industry decision-makers are increasingly interested in developing more accurate and efficient methods for wind speed forecasting. By improving the ability to forecast wind speed over extended time horizons, it becomes possible to optimize the utilization of wind power resources and mitigate the associated operational costs.

1.1. Classification of Existing Works in Wind Speed/Power Forecasting

Wind speed/power forecasting can be classified into three categories: physical-based, statistical-based, and artificial intelligence (AI)-based methods [4]. Among these, deep learning-based methods in AI have the ability to learn from historical datasets and adapt to ever-changing conditions. As a result, they are particularly suitable for handling the volatile, non-linear, and intermittent nature of wind speed [5]. Numerous papers have provided detailed reviews on wind speed or power forecasting, which can be found in references [6,7,8,9,10].

1.2. Applications of Quantum Computing in Power and Energy Engineering

Quantum computing (QC) has recently garnered significant attention due to its unique characteristics, such as quantum entanglement and quantum superposition, which offer advantages over classical computing [11]. One notable work by Ajagekar explored energy systems optimization using QC, marking it as one of the first articles in the field of power and energy engineering to utilize QC. The study focused on topics such as facility location allocation for energy systems infrastructure development, unit commitment, and heat exchanger network synthesis [12]. Deng et al. developed an optimization solution based on quantum annealing for real-time building HVAC control. This novel approach yielded similar results with less than a 2% difference compared to traditional optimization methods while significantly reducing computational time from hours to seconds [13]. Ajagekar also proposed a hybrid QC-based deep learning framework for fault diagnosis in power grids, utilizing extracted features from faulty signals [14]. Zhou et al. devised a quantum transient stability assessment method using QC to predict transient stability for bulk power systems [15]. Feng et al. combined QC and Surrogate Lagrangian Relaxation to solve unit commitment (UC) problems, showcasing the potential of QC in this domain [16]. Nikmehr et al. presented a quantum UC approach that leveraged the superposition and entanglement of quantum bits (qubits) to solve subproblems [17]. Additionally, Feng et al. demonstrated a proof of concept for quantum power flow algorithms, utilizing QC to support various power system analytics [18]. Nikmehr et al. applied QC to power system reliability assessment by implementing a quantum amplitude estimation algorithm to evaluate reliability indices [19]. In a recent study, the authors proposed a hybrid quantum model for 24 h ahead wind speed forecasting [20].
The potential of QC in various areas of energy sustainability and its contribution to addressing climate change were reviewed in [21].

1.3. Quantum Computer, Quantum/Digital Annealer, and Quantum Simulator

Noisy Intermediate-Scale Quantum (NISQ) technology has become increasingly accessible to the scientific community, showcasing the advantages of quantum hardware in various research areas. However, the reliability of modern quantum computers faces challenges due to environmental hardware noise, which can lead to a loss of accuracy [22]. Several companies, including IBM, Google, Intel, and Rigetti, have developed their universal NISQ-based quantum computers. In contrast, D-Wave’s quantum annealer [23] and Fujitsu’s digital annealer [24] have been developed for combinatorial optimization and quadratic unconstrained binary optimization (QUBO), respectively. Quantum annealers have Quantum Processing Units (QPUs) with more than 5000 qubits, which naturally provide low-energy solutions. On the other hand, digital annealers emulate qubits in a digital circuit and focus on solving QUBO. However, neither quantum annealers nor digital annealers are universal quantum computers; they are designed for specific purposes. Since the aforementioned universal/non-universal quantum computers often require accounts and/or payments, quantum simulators have gained popularity. These simulators can be run on local workstations using quantum computing software libraries/platforms, such as PennyLane [25]. Additionally, the graphics processing unit (GPU) cards in local workstations can be utilized to execute parameterized quantum circuits (PQCs) for quantum computation [26]. Quantum simulators are well suited for developing hybrid quantum and classical algorithms and provide a convenient hardware option until universal quantum computers mature and become widely available.

1.4. Insights and Contributions of This Work

This paper serves as an extended study building upon the authors’ previous work [20], which utilized a hybrid quantum model for 24 h ahead wind speed forecasting using a quantum simulator. In the previous study [20], a hybrid model combining residual Long Short-Term Memory (LSTM) with a quantum neural network (QNN) was proposed. This paper aims to further explore the effectiveness of the hybrid quantum model in wind speed forecasting. The insights of this study have the potential to improve both the accuracy and practical applicability of quantum computing in this domain.
The key contributions of this paper can be summarized as follows, addressing aspects not covered in the previous study [20]:
(a)
Examination of the architectural parameters of the QNN using the Quantum Approximate Optimization Algorithm (QAOA) embedding layer, such as the number of qubits and wires, to enhance accuracy and overcome barren plateaus.
(b)
Evaluation of the model’s performance on different computing hardware facilities, including NVIDIA GPU cuDNN, GPU-only, and CPU, to demonstrate the applicability of modern quantum simulators.
(c)
Comparison of the performance of the QAOA-based QNN with other quantum embedding and layer circuits, highlighting the advantages of the QAOA-based QNN approach.
(d)
Comparison of the performance of different quantum simulation platforms, such as PennyLane and TorchQuantum.
This paper is arranged as follows: Section 2 discusses the theoretical background of quantum computing. Section 3 describes the methodology. Section 4 presents the simulated results. Section 5 draws conclusions and provides recommendations for future work.

2. Background of Quantum Neural Networks

This section describes the basics of quantum computing. Specifically, Quantum Approximate Optimization Algorithm (QAOA) constructs the main topology of a quantum neural network [27].

2.1. Qubits and Ansatz

Quantum bits, or qubits, which can be in a superposition of states and are entangled with one another, are the building blocks of quantum computers. Identifying the ideal set of qubits and their states to represent a solution to a problem can be challenging, especially as the problem’s complexity and qubit count rise. Making this task simpler is possible using an ansatz. In the context of quantum computing, the term “ansatz” refers to a proposed initial approximation or guess for the quantum state of a system [28].
In many cases, finding the exact solution to a quantum problem is computationally intractable due to the exponential growth of the state space with the number of qubits. Instead, researchers often employ variational algorithms, in which an ansatz for the quantum state is chosen and its parameters are iteratively optimized to minimize some objective function.
After a simple quantum state is initialized as a starting point, PQCs consisting of quantum gates are then applied to the initial state to transform it into a more complex state that potentially contains valuable information for solving the problem at hand. The choice of the initial state and the set of gates is guided by intuition, practical knowledge, and mathematical insights.
Once the ansatz (parameterized circuit) is constructed, classical optimization techniques are employed to find the optimal values for the variables in the circuit. The objective is to minimize the energy of the state, thereby obtaining the best possible solution to the problem. The variational quantum eigensolver (VQE) is a commonly used technique for tackling optimization problems and simulating quantum chemistry systems [28].
By iteratively adjusting the parameters of the quantum circuit and evaluating the resulting energy, the optimization process aims to converge towards the lowest energy state, which corresponds to the optimal solution of the problem. This interplay between classical optimization and quantum circuit evaluation enables the VQE approach to leverage the power of quantum computing while utilizing classical methods to guide the optimization process [28].

2.2. Variational Quantum Eigensolver (VQE)

The ground state energy of a given Hamiltonian can be determined using the VQE [28]. The features of datasets are prepared as inputs for the VQE algorithm. To ensure compatibility with quantum computation, it may be necessary to normalize or scale the features. Once preprocessed, these features are used to construct a Hamiltonian that represents the desired outcome. The variables within the Hamiltonian are associated with operators that affect the quantum state and reflect the system’s overall energy. These Hamiltonian terms are then converted into Pauli operators, where each term is represented by a tensor product of Pauli matrices (Pauli-X, Pauli-Y, and Pauli-Z).
By encoding the problem in this way, it becomes compatible with quantum operations. Commuting Pauli operators can be combined to improve quantum processing and reduce the number of gates required. Applying these commuting operators simultaneously helps conserve computational resources. Simplification of the quantum circuit can be achieved by identifying groups of commuting operators and representing them as sums of products of Pauli strings [28].
The trial state, known as the ansatz state, is constructed on the quantum computer or simulator using a sequence of quantum gates. The features can be used to initialize the QAOA ansatz. Parametrized two-qubit gates, referred to as mixing and driving Hamiltonians, are applied to rotate the ansatz state into the measurement basis. “Measurement basis” refers to the computational basis (e.g., the eigenstates of the Pauli-Z operator, |0〉 and |1〉), which is commonly used in quantum measurements. When the ansatz state is rotated into the measurement basis, it means that a sequence of quantum gates transforms the quantum state into a form where meaningful measurements can be performed to extract expectation values. These gates introduce entanglement between qubits and allow exploration of different states. The parameters in these gates are optimized to minimize the energy expectation value. The expectation value obtained from measurements is used to update the parameters in the ansatz circuit. The goal is to find the optimal parameters that minimize the expectation value, which corresponds to finding the ground state energy of the Hamiltonian. Traditional optimization algorithms or machine learning techniques, such as the Adam optimizer, can be employed to update the parameters and iteratively improve the approach.
The process of rotating into the measurement basis, computing the expectation value, and updating the parameters is repeated for a predetermined number of iterations or until a convergence criterion is met. Iterative parameter updates are applied to the ansatz circuit, resulting in computed expectation values that progressively improve. The process is repeated until the desired level of precision is achieved or a stopping condition is satisfied.
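The variational loop described above can be illustrated with a minimal single-qubit NumPy sketch. This is not the paper's implementation: the Hamiltonian is simply the Pauli-Z operator (ground energy −1), the ansatz is RY(θ)|0⟩, and plain gradient descent with parameter-shift gradients plays the role of the classical optimizer.

```python
import numpy as np

# Pauli-Z Hamiltonian for a single qubit; its ground-state energy is -1.
Z = np.array([[1.0, 0.0], [0.0, -1.0]])

def ansatz(theta):
    """RY(theta)|0> -- a one-parameter trial (ansatz) state."""
    return np.array([np.cos(theta / 2), np.sin(theta / 2)])

def energy(theta):
    """Energy expectation value <psi|Z|psi> of the trial state."""
    psi = ansatz(theta)
    return float(psi @ Z @ psi)

# Classical optimization loop: gradient descent with parameter-shift gradients.
theta, lr = 0.5, 0.4
for _ in range(200):
    grad = 0.5 * (energy(theta + np.pi / 2) - energy(theta - np.pi / 2))
    theta -= lr * grad

print(round(energy(theta), 4))  # -1.0, the ground-state energy
```

Each iteration evaluates the circuit at shifted parameter values to obtain an exact gradient, then updates θ classically, mirroring the quantum/classical interplay of the VQE.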

2.3. Quantum Approximate Optimization Algorithm (QAOA)

QAOA is a quantum computing-based algorithm used to solve optimization problems [27,29]. This algorithm leverages the advantages of quantum computing by using a parameterized quantum circuit (PQC) to approximate solutions to optimization problems and can handle various practical problems. The foundation of this algorithm lies in the quantum phase estimation algorithm and the Quantum Optimization Loop (QOL). QAOA can be performed through the alternating operations of the following two steps [29]:
(a)
Creating a PQC called the Mixing Circuit, which transforms the initial state into a list of candidate solutions. The parameters of this circuit are part of the objective function to be optimized.
(b)
Creating a PQC called the Phase Circuit, which adjusts the circuit’s parameters to increase the expected value of the objective function at each iteration.
Based on the above explanation, the state of a p-level QAOA can be defined by alternately applying the phase operator \(U_P(\gamma_p) = e^{-i\gamma_p H_P}\) (where \(H_P\) is the phase Hamiltonian) and the mixing operator \(U_M(\beta_p) = e^{-i\beta_p H_M}\) (where \(H_M\) is the mixing Hamiltonian), as shown below [29]. The number p determines how many times these operations are applied sequentially:
\[ |\gamma, \beta\rangle = U_M(\beta_p)\, U_P(\gamma_p) \cdots U_M(\beta_1)\, U_P(\gamma_1)\, |s\rangle \tag{1} \]
Here, \(|s\rangle\) represents the initial state, p is an integer greater than or equal to 1, and the 2p variational parameters are \(\gamma = (\gamma_1, \ldots, \gamma_p)\) and \(\beta = (\beta_1, \ldots, \beta_p)\). After performing computations and measurements on the state (1), the expected value of \(H_P\) is [29]:
\[ \langle \gamma, \beta |\, H_P \,| \gamma, \beta \rangle = f(\gamma, \beta) \tag{2} \]
where \(f(\gamma, \beta)\) is the expected value of the objective function. By iteratively searching for the optimal values of γ and β, the maximum or minimum value of \(\langle H_P \rangle\) can be obtained. The iterative process of finding the optimal parameters utilizes conventional optimization methods. Therefore, QAOA falls under the category of hybrid quantum and classical algorithms. The gradients of the expectation value with respect to the variational parameters are typically computed using the parameter-shift rule for the tunable (trained) parameters γ and β [29]. This is detailed in Appendix A.
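To make the alternating construction of the state (1) concrete, the following NumPy sketch builds a p-level QAOA state for a toy two-qubit problem. The choices here are illustrative assumptions, not the paper's setup: the phase Hamiltonian is a single ZZ coupling, the mixer is a transverse field on both qubits, and |s⟩ is the uniform superposition.

```python
import numpy as np

I2 = np.eye(2)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.diag([1.0, -1.0])

# Toy two-qubit problem: phase Hamiltonian H_P = Z(x)Z (diagonal),
# mixing Hamiltonian H_M = X(x)I + I(x)X (commuting single-qubit terms).
HP = np.kron(Z, Z)

def exp_minus_i_beta_X(beta):
    """Single-qubit mixing factor exp(-i*beta*X) = cos(beta) I - i sin(beta) X."""
    return np.cos(beta) * I2 - 1j * np.sin(beta) * X

def qaoa_state(gammas, betas):
    """|gamma, beta> = U_M(beta_p) U_P(gamma_p) ... U_M(beta_1) U_P(gamma_1) |s>."""
    psi = np.full(4, 0.5, dtype=complex)            # |s> = |+>|+>, uniform superposition
    for g, b in zip(gammas, betas):
        psi = np.exp(-1j * g * np.diag(HP)) * psi   # phase operator U_P (diagonal)
        UM = np.kron(exp_minus_i_beta_X(b), exp_minus_i_beta_X(b))  # mixer U_M
        psi = UM @ psi
    return psi

def f(gammas, betas):
    """Objective f(gamma, beta) = <gamma,beta| H_P |gamma,beta>."""
    psi = qaoa_state(gammas, betas)
    return float(np.real(np.conj(psi) @ (HP @ psi)))

print(round(f([0.0], [0.0]), 4))             # 0.0 for the unoptimized state
print(round(f([np.pi / 4], [np.pi / 8]), 4)) # 1.0: these p=1 angles reach an eigenstate
```

Classically sweeping or optimizing (γ, β) over f then completes the hybrid loop described in the text.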
Figure 1 shows a QAOA example for the full embedding circuit using two layers (repeated two times), three features (x1, x2, x3), four wires, and RY local fields [30]. The feature-encoding circuit associates features with the angles of RX rotations. The Ising ansatz consists of trainable two-qubit ZZ interactions, which are represented by θ 1 ~ θ 4 and θ 9 ~ θ 12 across two wires.
The Hadamard gate (denoted as a symbol H in Figure 1) is responsible for mapping computational basis states to superposition states and vice versa. The Ising ansatz, which involves trainable two-qubit ZZ interactions, facilitates entanglement between two qubits. This allows for the full implementation of quantum qubit characteristics within the QAOA circuits.
When comparing a traditional neural network to a QNN (quantum circuit), their configurations and functions exhibit similarities. Information is processed through weighted sums generated by neurons, and the connections between neurons are determined using heuristics in a traditional neural network. However, a QNN provides a more structured and expressive framework for computation due to the following reasons: (a) Fixed qubit wires: The architecture of a QNN is inherently defined by the number of qubits in the quantum system, establishing a fixed set of computational pathways. (b) Algorithm-governed interactions: Unlike classical networks, where connections are often manually tuned, QNN interactions are dictated by specific quantum algorithms, such as the QAOA. (c) Depth control via parameter “p”: The number of layers in a QNN is determined by the parameter p in the QAOA, allowing systematic control over circuit depth. These characteristics enable QNNs to capture complex representations more efficiently compared to traditional neural networks.
Despite the differences between VQE and QAOA, they share a common idea of using a parameterized quantum circuit and classically optimizing the parameters to find an optimal solution. In fact, QAOA can be viewed as a specific case of VQE, where the objective function to be maximized is derived from an optimization problem [27]. This relationship highlights the flexibility and generality of the variational approach in quantum algorithms. In summary, while VQE is primarily used for finding ground state energies of Hamiltonians in quantum chemistry, and QAOA is designed for solving combinatorial optimization problems, both algorithms employ a similar variational approach and can be used for related tasks.

3. Methodology

The presented hybrid model for 24 h ahead wind speed forecasting integrates a deep learning-based residual Long Short-Term Memory (LSTM) with a quantum neural network (QNN). This hybrid approach leverages the strengths of both techniques to enhance the accuracy and effectiveness of wind speed predictions. The proposed hybrid model is implemented using a quantum simulator on an NVIDIA RTX 3080 GPU (manufactured in Taipei, Taiwan).
This paper utilizes Parameterized Quantum Circuits (PQCs) to build a QNN. The software platform used is PennyLane, a Python framework (Python 3.12 in this work) for quantum programs that serves as a template platform for quantum machine learning and optimization, combining quantum and classical computations. PennyLane extends existing machine learning libraries, such as TensorFlow v2.13, to handle quantum computations. It provides a library of quantum circuit architecture templates composed of quantum gates.

3.1. Residual LSTM

Residual learning is a powerful technique for accelerating and facilitating the convergence of neural networks. By separating a spatial-domain shortcut path from a temporal-domain cell update, a residual LSTM becomes more adaptable in handling vanishing or exploding gradients. In the LSTM architecture, a memory cell is controlled by input and forget gate networks [31]. The forget gate in an LSTM layer determines the extent to which the previous memory value should be retained for the subsequent time step. Similarly, an input gate modifies the most recent input to the memory cells. Additionally, a shortcut path is constructed in the output layer. The residual LSTM equations are provided in Appendix B.
In the proposed method, as depicted in Figure 2, which illustrates the architecture with three LSTM layers, the l -th LSTM layer receives its own hidden state ( h t l ) as an input ( x t l ), which is then processed through element-wise addition. A dense layer with 10 neurons, matching the number of qubits implemented in the subsequent QNN, is inserted between the LSTM and QNN layers to facilitate feature transformation. The model inputs consist of wind speed measurements at time steps t, t − 1, t − 2, and so on, up to t − 25. The output layer is a dense layer with 24 neurons, representing the forecasted wind speeds at time steps t + 1, t + 2, up to t + 24 (24 h ahead predictions).
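The element-wise residual addition described above can be sketched for a single LSTM cell step in NumPy. This is a standalone illustration, not the paper's trained model: the hidden size of 8 and the random weights are assumptions for the example, and the input and hidden dimensions are taken equal so the shortcut addition is well defined.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_cell_step(x, h_prev, c_prev, W, U, b):
    """One LSTM time step; W, U, b stack the input, forget, cell, and output gates."""
    n = h_prev.size
    z = W @ x + U @ h_prev + b                          # all four gate pre-activations
    i, f, g, o = z[:n], z[n:2*n], z[2*n:3*n], z[3*n:]
    c = sigmoid(f) * c_prev + sigmoid(i) * np.tanh(g)   # memory-cell update
    h = sigmoid(o) * np.tanh(c)                         # hidden state
    return h, c

n = 8                                    # hidden size = input size (illustrative)
W = rng.normal(scale=0.1, size=(4 * n, n))
U = rng.normal(scale=0.1, size=(4 * n, n))
b = np.zeros(4 * n)

x = rng.normal(size=n)                   # layer input x_t^l
h, c = lstm_cell_step(x, np.zeros(n), np.zeros(n), W, U, b)
h_res = h + x                            # residual shortcut: element-wise addition
print(h_res.shape)                       # (8,)
```

The shortcut `h + x` is what lets gradients bypass the gated temporal path, which is the mechanism the residual LSTM relies on.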

3.2. Quantum Neural Network (QNN)

When leveraging the advantages of quantum computing, classical information needs to be processed and converted into quantum bits (qubits). In addition to the traditional classical position encoding, classical information also undergoes angle embedding to represent classical data as quantum states and encode them as a set of angles in a quantum circuit. In quantum computing, an entanglement layer can be established between qubits to perform quantum computations, and it can also create parameterized quantum circuits for training specific tasks.
Please note that Quantum Neural Tangent Kernels (QNTKs) are a theoretical framework used to analyze the trainability of QNNs by examining their tangent kernel behavior [32]. While QNTK provides valuable insights into QNN dynamics, it is not an optimization algorithm but a tool for analyzing optimization landscapes. Among quantum algorithms, the QAOA requires shallower quantum circuit depth, whereas the VQE is more hardware-intensive. VQE involves measuring numerous Pauli terms in the Hamiltonian, while QAOA features a simpler cost function. Given these considerations, this paper employs QAOA embedding as the quantum layer in the proposed QNN.
The residual LSTM in the proposed method is augmented with a dense layer connected to a QNN, as depicted in Figure 2. The QNN incorporates the QAOA embedding as its quantum layer. The utilization of the QAOA embedding template is employed in the proposed method to allow for the training of features. This enables the computation of gradients with respect to feature values. In the proposed approach, both weights and features are provided as positional inputs to the QNode to facilitate training. By encoding N characteristics using n qubits, where n > N, the QAOA embedding layer utilizes a multilayer, trainable quantum circuit based on the QAOA ansatz. This architecture enables the efficient representation of complex data relationships while facilitating the fine-grained optimization of feature and weight interactions during training, thereby enhancing the model’s robustness and convergence [30].
The residual LSTM model is connected with a dense layer with 10 neurons. This is followed by the QAOA-based QNN with 10 qubits. The aforementioned number of neurons and qubits will be further explored to evaluate their impact on the performance of the proposed model in Section 4. The QNN is implemented using PennyLane’s integrated simulator, which will be presented in Section 3.3. Since the QNN generates quantum states rather than real numbers, a dense layer with 24 neurons is added after the QNN to map the quantum outputs to the 24 h forecast values. The bias and weight parameters of the LSTM layers, along with the quantum parameters of the QAOA-based QNN, are optimized using the Adam optimizer. The performance of the complete prediction model is evaluated using metrics such as R-squared (R2), mean absolute error (MAE), and root mean square error (RMSE).
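The three evaluation metrics mentioned above have standard definitions, which the following NumPy sketch implements directly (this is a generic illustration, not code from the paper):

```python
import numpy as np

def r2_mae_rmse(y_true, y_pred):
    """R-squared, mean absolute error, and root mean square error."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    resid = y_true - y_pred
    ss_res = np.sum(resid ** 2)                          # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)       # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    mae = float(np.mean(np.abs(resid)))
    rmse = float(np.sqrt(np.mean(resid ** 2)))
    return float(r2), mae, rmse

# Sanity check: a perfect forecast scores R2 = 1, MAE = 0, RMSE = 0.
y = np.array([5.2, 6.1, 7.4, 6.8])                       # toy wind speeds in m/s
print(r2_mae_rmse(y, y))                                 # (1.0, 0.0, 0.0)
```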

3.3. Implementation of QNN

A single layer in a QAOA-based QNN comprises two key components: a Hamiltonian-based operation and a variational circuit. The first component, known as problem Hamiltonian encoding, maps the input features onto the quantum system by evolving the quantum state under a Hamiltonian that encapsulates the data. This process ensures that the problem’s characteristics are effectively embedded in the quantum state. The second component employs a variational ansatz, inspired by a one-dimensional Ising model, which applies a parameterized quantum circuit to explore the solution space. Together, these components operate within the QAOA framework to optimize the quantum state for the target task [30]. The feature tensor, weight tensor, wires, and local field are all the parameters considered within this template [33], where the type of local field used is specified. After optimizing the quantum circuit, measurements of the output state provide approximations of the optimal feature and weight values. The optimization of these features and weights is then further handled by traditional optimizers, such as the Adam optimizer.
In this paper, a TensorFlow-compatible quantum node for the Keras layer was developed using PennyLane’s different embedding techniques, including the QAOA embedding. These embedding techniques allowed the encoding of features and classical input data into quantum states, which were utilized in the QNN. Additionally, QNN layer circuit templates in PennyLane were employed as parameterized quantum circuits that can be trained to modify the parameters in each repetition. These parameterized quantum circuits replaced the classical fully connected layers in traditional wind power forecasting, resulting in improved forecasting accuracy. This is because QNNs leverage superposition and entanglement to enhance feature representation and enable higher-dimensional data encoding.

3.4. Quantum Simulator

The use of a quantum simulator is intended to establish a baseline performance without hardware-induced errors, as is commonly performed in early-stage quantum algorithm research. It is acknowledged that noise and decoherence present challenges in current quantum devices, but ongoing advancements in quantum error mitigation techniques, such as zero-noise extrapolation and variational error correction, continue to improve hardware stability. Future evaluations on platforms like IBM Q, Rigetti, and D-Wave are planned to assess the practical performance of the model, with considerations for mitigating noise effects. As quantum hardware evolves, increasing reliability is expected to enhance the applicability of hybrid quantum–classical forecasting models.
Conventional computers can simulate quantum computing by utilizing a quantum simulator. A quantum simulator is capable of performing calculations that mimic the behavior of a quantum computer, albeit at a significant computational cost. In this study, a hybrid quantum model is presented, and it utilizes PennyLane’s lightning.gpu device as the simulator. This choice enables efficient linear algebra calculations for simulating the evolution of quantum state vectors. The adjoint differentiation approach is supported, and parallelization across necessary observables is facilitated [33]. To leverage GPU acceleration for circuit simulation, the PennyLane-Lightning-GPU plugin extends the C++-based PennyLane-Lightning state vector simulator and offloads computations to the NVIDIA cuQuantum SDK.
Table 1 displays the specific templates and layers utilized for the QNN in the proposed method. These templates and layers in the PennyLane were compared to assess their performance and effectiveness on the accuracy of wind power/speed forecasting.

3.5. Compute Unified Device Architecture (CUDA)

The proposed method was implemented using both a Central Processing Unit (CPU) and a Graphical Processing Unit (GPU). When it comes to resolving AI computational problems, CPU-based algorithms often take an impractical amount of time [34]. The GPU, on the other hand, is a specialized integrated circuit that works in conjunction with the CPU to efficiently handle 2D and 3D graphics processing tasks. With its ability to perform thousands of operations per cycle, the GPU surpasses the CPU in terms of speed and emphasizes high throughput. One notable distinction between the CPU and the GPU is that the former focuses on minimizing latency, while the latter prioritizes maximizing throughput.
The CUDA C programming language was utilized for constructing the CUDA parallel computing platform. CUDA offers several advantages such as shared memory usage, cost-effectiveness, and meeting the evolving demands of the gaming industry for GPU advancements [35]. Additionally, CUDA provides the CUDA Deep Neural Network (cuDNN) library, which includes GPU-accelerated implementations of various deep neural network components. These implementations are precisely optimized for commonly used techniques such as pooling, normalization, and activation layers. By leveraging cuDNN, developers can efficiently accelerate their deep learning computations on GPUs.
Figure 3 illustrates the process flow of a CUDA-enabled GPU, showcasing the interaction between the CPU and the GPU during program execution. The CUDA program follows a three-step process flow in the following order: (i) The input data are transferred from the CPU memory to the GPU memory, enabling the GPU to access and process the data efficiently. (ii) The GPU program is loaded and executed, leveraging the computational power of the GPU to perform the desired computations. (iii) The results of the GPU computation are copied from the GPU memory back to the CPU memory, allowing the CPU to access and utilize the computed results [36]. This data transfer between the CPU and GPU ensures seamless coordination and the efficient utilization of both processing units in a CUDA-enabled system.

3.6. Solution Steps for Proposed Methodology

The aforementioned methods for implementing the hybrid residual LSTM-QNN for wind speed forecasting can be summarized as follows:
  • Step 1: Data Preparation
1.1. Collect historical wind speed data from the Fuhai offshore wind farm dataset.
1.2. Normalize wind speed measurements for stable training.
1.3. Structure input data using time steps t, t – 1, ..., t – 25 as features and t + 1, ..., t + 24 as target outputs.
  • Step 2: Residual LSTM Model Construction
2.1. Design an LSTM architecture with three layers.
2.2. Implement residual connections to enhance gradient flow.
  • Step 3: Quantum Neural Network (QNN) Design
3.1. Encode classical LSTM-transformed features into quantum states via different embedding techniques (as shown in Table 1), including the QAOA embedding.
3.2. Construct a QNN using a 10-qubit QAOA-based ansatz.
3.3. Establish entanglement between qubits using parameterized quantum circuits.
  • Step 4: Hybrid Model Integration and Training using a Quantum Simulator
4.1. Combine residual LSTM and QNN through a connecting dense layer with 10 neurons to transform LSTM features for QNN input.
4.2. Deploy PennyLane’s lightning.gpu simulator for efficient state-vector evolution.
4.3. Utilize GPU acceleration via the NVIDIA cuQuantum SDK.
4.4. Optimize/train both LSTM biases/weights and QNN parameters by the Adam optimizer using a quantum simulator.
  • Step 5: Prediction and Performance Evaluation
5.1. Predict wind speed for the next 24 h using the well-trained hybrid model.
5.2. Evaluate forecasting performance using R-squared (R2), MAE, and RMSE metrics.
5.3. Compare the accuracy of wind speed forecasting by varying embedding techniques, entanglement layers, and the number of qubits.
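As a minimal sketch of the sliding-window construction in Step 1.3, the following code splits a normalized series into 26-step inputs (t − 25, …, t) and 24-step targets (t + 1, …, t + 24). The function name make_windows and the toy series are illustrative, not the authors’ code.

```python
import numpy as np

def make_windows(series, n_in=26, n_out=24):
    """Split a 1-D series into (input, target) pairs: steps t-25..t as
    features and t+1..t+24 as targets, as in Step 1.3."""
    X, Y = [], []
    for start in range(len(series) - n_in - n_out + 1):
        X.append(series[start:start + n_in])
        Y.append(series[start + n_in:start + n_in + n_out])
    return np.array(X), np.array(Y)

# Normalized toy series standing in for the 8760 hourly wind speed samples.
speeds = np.linspace(0.0, 1.0, 100)
X, Y = make_windows(speeds)
```

Applied to the full 8760-sample dataset, the same construction yields one training example per admissible starting hour.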
Figure 4 presents a diagrammatic representation of the solution steps, illustrating the intricate interactions between classical LSTM components and quantum processing elements in Step 3 (Quantum Neural Network (QNN) Design). In this step, an embedding layer, as shown in Table 1, is used to encode classical data into a quantum state within the PennyLane platform. Specifically, the embedding layer maps classical information into the Hilbert space of a quantum system, enabling quantum circuits to process it effectively. This transformation is achieved using parameterized quantum circuits to convert classical features into quantum states.
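To make the embedding step concrete, the sketch below simulates Angle Embedding (one of the templates in Table 1) in plain NumPy: each classical feature becomes the angle of an RY rotation on its own qubit, mapping the feature vector into a 2^n-dimensional quantum state. This is a pedagogical stand-in for PennyLane’s built-in template; the function name angle_embed and the toy feature values are assumptions for illustration.

```python
import numpy as np

def angle_embed(features):
    """Encode each classical feature as an RY rotation angle on its own
    qubit: |psi> = RY(x_1)|0> ⊗ ... ⊗ RY(x_n)|0>."""
    state = np.array([1.0])
    for x in features:
        qubit = np.array([np.cos(x / 2.0), np.sin(x / 2.0)])  # RY(x)|0>
        state = np.kron(state, qubit)  # tensor product builds the full state
    return state

psi = angle_embed([0.3, 1.1, 2.0])  # 3 features -> 2^3 = 8 amplitudes
```

The resulting state vector is normalized, so it is a valid quantum state that a subsequent variational circuit can process.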

4. Results

The Fuhai offshore wind farm in Taiwan provides the input wind speed data for the proposed method. Hourly wind speed measurements (in m/s) were recorded from 1 May 2017 to 30 April 2018, resulting in a total of 8760 data points. The Fuhai offshore wind farm has significant wind power generation potential due to the consistently strong winds in the Taiwan Strait, the body of water between China and Taiwan. In winter, prevailing winds blow from north to south, while in summer they reverse, blowing from south to north. Detailed meteorological datasets for the wind farm can be purchased from sources such as https://www.thewindpower.net/index.php (accessed on 13 October 2022). The mean wind speeds at the Fuhai wind farm are 13.1, 8.88, 5.55, and 7.29 m/s in winter, spring, summer, and autumn, respectively, with corresponding standard deviations of 4.57, 5.48, 3.15, and 5.19 m/s [20].
For training/validation and testing, 80% and 20% of the total data were used, respectively. The proposed method was implemented in Python with the Keras package, PennyLane, cuDNN, and TensorFlow as the backend. The simulations were executed on the Ubuntu 18 operating system with an NVIDIA GeForce RTX 3080 GPU (10 GB of VRAM) as the graphics hardware. An Intel Core i7-6700 CPU, running at 3.40 GHz, was used for comparison.
While the model was trained and tested on the Fuhai offshore wind farm dataset in Taiwan, it is acknowledged that wind characteristics vary across regions. The residual LSTM component of the model is expected to retain its effectiveness with fixed hyperparameters and topology parameters, as its architecture captures temporal dependencies in a robust manner. However, the parameters of QNN would require further training when applied to other wind farms, and the concept of transfer learning is expected to be implemented to adapt the QNN efficiently to new datasets.

4.1. Comparative Studies of Runtime

To investigate the runtime on various computing resources, the single-layered QNN was modeled with ten qubits and wires. The following three resources were used: (a) CPU only; (b) a single GPU with cuDNN disabled; and (c) a single GPU with cuDNN. The execution times (hrs:mins:secs) for training the proposed model were 13:56:56, 7:23:20, and 1:09:02, respectively. When a classical computer with only a CPU was used, more than 13 h of execution time was required to obtain a result. The required GPU time was nearly half of the CPU time. When the GPU incorporated CUDA, the required time was greatly reduced to only 8.29% of that required by a classical CPU.
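The reported ratios can be checked directly from the logged wall-clock times. This is an illustrative calculation, not part of the authors’ pipeline; the computed CUDA ratio (about 8.2%) is close to the reported 8.29%, with the small difference likely due to rounding of the logged times.

```python
def to_seconds(hms):
    """Convert an 'hh:mm:ss' execution time to seconds."""
    h, m, s = (int(p) for p in hms.split(":"))
    return 3600 * h + 60 * m + s

cpu  = to_seconds("13:56:56")   # (a) CPU only
gpu  = to_seconds("07:23:20")   # (b) single GPU, cuDNN disabled
cuda = to_seconds("01:09:02")   # (c) single GPU with cuDNN/CUDA

gpu_ratio  = gpu / cpu    # roughly one half of the CPU time
cuda_ratio = cuda / cpu   # on the order of 8% of the CPU time
```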

4.2. Comparative Studies of Accuracy

To achieve high accuracy, the numbers of qubits and wires are determined herein. The number of qubits equals the number of wires. The aim is to obtain the best number of qubits and wires that fit the proposed model. The statistical methods that were used for evaluation were mean squared error (MSE), root mean square error (RMSE), and mean absolute error (MAE), as shown in Figure 5. Figure 6 presents the R-squared score (R2) of the results, which is very close to unity in the case of 10 qubits. Both Figure 5 and Figure 6 reveal that ten qubits in modeling the QNN yielded the best result.
Figure 7 shows the forecasted and actual wind speeds per unit on a typical day in the fall season. The forecasted values are very close to the actual values.

4.3. Comparative Studies of Accuracy with Other QNN and Traditional Methods

The proposed method uses the QAOA embedding layer for its QNN, and is compared with combinations of PennyLane’s embeddings and layers: (a) Angle Embedding with Basic Entangler Layers, (b) Angle Embedding with Strongly Entangling Layers, (c) Amplitude Embedding with Basic Entangler Layers, (d) Amplitude Embedding with Strongly Entangling Layers, (e) IQP Embedding with Basic Entangler Layers, (f) IQP Embedding with Strongly Entangling Layers, and (g) the proposed method incorporating the QAOA embedding layer. The traditional residual LSTM, cascaded with a dense layer and excluding quantum-related layers, is also employed for comparison. The parameters of this traditional residual LSTM, referred to as case (h), are identical to those of the proposed method.
The proposed method achieves superior performance in terms of error indices, including mean squared error (MSE), coefficient of determination (R2), root mean squared error (RMSE), and mean absolute error (MAE), compared to other methods, as shown in Table 2. Let N be the total number of observations. These performance metrics are defined by Equations (3)–(6):
\mathrm{MSE} = \frac{1}{N}\sum_{i=1}^{N}\left(y_i-\hat{y}_i\right)^2   (3)
R^2 = 1-\frac{\sum_{i=1}^{N}\left(y_i-\hat{y}_i\right)^2}{\sum_{i=1}^{N}\left(y_i-\bar{y}\right)^2}   (4)
\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(y_i-\hat{y}_i\right)^2}   (5)
\mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N}\left|y_i-\hat{y}_i\right|   (6)
where \hat{y}_i, y_i, and \bar{y} denote the forecasted value, the actual value, and the mean of the actual values, respectively. R^2 represents the correlation between the actual and predicted values. If R^2 is close to unity, then the forecasted values are close to the actual values; a lower R^2 indicates a poorer fit. MSE, MAE, and RMSE capture the measured errors between the actual and forecasted values. Restated, R^2 quantifies the proportion of variance explained by the model, while MSE, MAE, and RMSE focus on prediction accuracy and error magnitude.
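The four error indices above can be sketched in NumPy as follows; the function name metrics and the toy values are illustrative.

```python
import numpy as np

def metrics(y, y_hat):
    """Compute MSE, RMSE, MAE, and the R-squared score for forecasts y_hat
    against actual values y."""
    err = y - y_hat
    mse  = np.mean(err ** 2)
    rmse = np.sqrt(mse)
    mae  = np.mean(np.abs(err))
    r2   = 1.0 - np.sum(err ** 2) / np.sum((y - np.mean(y)) ** 2)
    return mse, rmse, mae, r2

# Toy per-unit wind speeds, not values from the Fuhai dataset.
y     = np.array([0.50, 0.62, 0.55, 0.71])
y_hat = np.array([0.48, 0.60, 0.57, 0.70])
mse, rmse, mae, r2 = metrics(y, y_hat)
```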
The values of these error indices obtained by the proposed method (QNN (g) in Table 2) are closer to zero, indicating higher accuracy in wind speed forecasting. Additionally, the R2 score, which measures the goodness of fit of the model, is close to one, further validating the effectiveness of the proposed method. Among the eight QNNs in Table 2, the QNN incorporating Angle Embedding with Basic Entangler Layers (QNN (a) in Table 2) tends to achieve the least accuracy in terms of MSE, R2 score, and RMSE. This is because Angle Embedding with Basic Entangler Layers is the simplest QNN configuration.
As shown in Table 2, the accuracy achieved by the traditional residual LSTM without quantum-related layers (method (h) in Table 2) is slightly lower than that of the proposed method. This is attributed to the QAOA embedding layer in the proposed approach, which enables QNNs to explore a larger computational space (leveraging quantum superposition) and to encode and process variable correlations more efficiently (utilizing quantum entanglement).
The computational times for the proposed method, including both model training and model testing, are compared to other approaches. Table 3 indicates that the proposed method (QNN (g) in Table 3) exhibits shorter training time and longer testing time compared to the other approaches. This is attributed to the utilization of the QAOA embedding layer, which is based on the QAOA ansatz, as a built-in template in the proposed method. This enables gradient computations with respect to both the features and weights arguments, enhancing the optimization process. On the other hand, IQP Embedding with Basic and Strongly Entangling Layers (QNNs (e) and (f) in Table 3) requires long training times due to the complexity of quantum operations and the high number of parameters that need to be optimized. These embeddings involve highly entangled quantum circuits, leading to more computational overhead and a challenging optimization landscape. Additionally, the high-dimensional data encoding and the need for multiple gradient evaluations further increase the training time. The other methods employ combinations of embeddings and templates that do not support gradient computations, resulting in longer training times. However, during the testing phase, the proposed method’s ability to compute gradients leads to longer testing times compared to the other approaches.
In Table 3, the training and testing times for the traditional residual LSTM without quantum-related layers (method (h) in Table 3) are the shortest among all the methods. This is because it has the fewest tunable parameters, resulting in lower computational complexity and faster optimization. However, the testing times for all the methods are comparable.
Overall, the suggested method’s shorter training time, compared to cases (a)–(f), and improved optimization capabilities, enabled by the QAOA embedding layer, contribute to its computational efficiency. Despite the longer testing time, the proposed method offers enhanced performance and flexibility by supporting gradient computations for both features and weights.

4.4. Comparative Studies of Accuracy and Runtime Using Different Quantum Simulators

In addition to PennyLane, other quantum simulators, such as TorchQuantum [37], are also widely used today. TorchQuantum is designed for integration with the Torch deep learning framework, enabling the seamless implementation of hybrid QNNs within Torch architectures while leveraging CUDA acceleration. This subsection compares the results obtained from these two quantum simulators in terms of accuracy (evaluated using R² scores and error indices) and runtime performance.
Table 4 presents a performance comparison of the proposed method implemented using PennyLane and TorchQuantum. The results indicate that TorchQuantum yields higher MSE, RMSE, and MAE values, along with a lower R² score. This disparity is likely due to TorchQuantum’s emphasis on integrating quantum circuits into a broader machine learning environment, as opposed to PennyLane’s specialization in quantum machine learning. Although TorchQuantum performs well, differences in optimization techniques, quantum-specific capabilities, and integration with classical networks contribute to the observed performance gap. Specifically, PennyLane provides a powerful platform for implementing variational quantum circuits and quantum neural networks, allowing for more flexible prototyping of variational circuit formulations. TorchQuantum, on the other hand, would benefit from an improved implementation of the parameter-shift rule for gradient computation and better use of GPU acceleration for state vector simulations.
Table 5 presents a comparison of the execution time and the number of iterations required by the two quantum simulators. According to Table 5, the PennyLane implementation is faster in terms of training time, likely due to its use of the lightning.qubit simulator, which is optimized for efficient quantum circuit simulations. The testing times are nearly identical between the two platforms, and the number of iterations is similar, indicating that the primary performance difference lies in the training phase rather than in other aspects of the model’s execution.
In addition to the comparisons of statistical performance metrics, execution time, and number of iterations, Table 6 supports the explanations for the results in Table 4 and Table 5 by detailing the key architectural differences and optimization strategies of PennyLane [35,36] and TorchQuantum [37]. This comparison offers insight into why certain configurations, such as the number of qubits, embedding technique, or entanglement layers, affect the performance metrics (e.g., accuracy and training time) presented in Table 4 and Table 5. By linking architectural choices to the observed outcomes, Table 6 provides context for understanding the model’s behavior in the wind speed forecasting task.

5. Conclusions

This study explores the advantages of integrating quantum computing with deep learning algorithms for 24 h ahead wind speed forecasting using high-performance computing. The parallel computation capabilities of GPUs with CUDA enable the efficient execution of hybrid classical–quantum models in a quantum simulator. Based on the results of comparative studies, the following observations can be made:
(a)
Hybrid quantum algorithms, such as QAOA, offer the combined benefits of classical and quantum algorithms, resulting in significantly improved forecasting accuracy.
(b)
QNNs based on the QAOA layer provide more accurate predictions due to the gradient computation support for both features and weights, facilitating better optimization.
(c)
The selection of the appropriate number of wires and qubits in the QNN layer is crucial, as it can lead to favorable evaluation metrics and reduced training time.
(d)
Quantum simulators utilizing CUDA-based GPUs serve as a suitable hardware platform for studying quantum algorithms, especially in cases where general-purpose quantum computers are not widely available. The training time required by CUDA-based GPUs is found to be the shortest, followed by GPUs, and then CPUs.
The proposed hybrid classical–quantum model may face limitations in scalability on real quantum hardware, sensitivity to noise and decoherence, computational overhead from hybrid processing, and training challenges due to barren plateaus. In addition, generalization to different wind farms requires transfer learning for the QNN. Although classical methods are effective, quantum models may offer advantages in representing non-linear relationships through superposition and entanglement. This research provides insights into potential future benefits as quantum hardware matures and advances.
In future research, the exploration of other quantum simulators, such as the IBM-Q simulator or the D-Wave platform, is recommended. Additionally, studying state-of-the-art quantum layers or embedding layers with approaches like high expressibility, low-depth quantum circuits, and quantum embedding kernels will contribute to further advancements in this field.

Author Contributions

Conceptualization, Y.-Y.H.; methodology, Y.-Y.H.; software, J.B.D.S.; validation, J.B.D.S.; formal analysis, J.B.D.S.; investigation, J.B.D.S.; resources, Y.-Y.H.; data curation, J.B.D.S.; writing—original draft preparation, J.B.D.S.; writing—review and editing, Y.-Y.H.; visualization, J.B.D.S.; supervision, Y.-Y.H.; project administration, Y.-Y.H.; funding acquisition, Y.-Y.H. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Science and Technology Council, Taiwan, grant number NSTC 113-2218-E-008-016 and NSTC 112-2221-E-033-010-MY3.

Data Availability Statement

Data is available upon request due to restrictions.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A

In the Quantum Approximate Optimization Algorithm (QAOA), explicit gradient computation is essential for parameter optimization. The gradients of the expectation value with respect to the variational parameters are typically computed using the parameter-shift rule. Here is a breakdown of the explicit gradient equations:
1. Expectation Value in QAOA
Given a Hamiltonian H and a QAOA ansatz state |\psi(\theta)\rangle, the expectation value is as follows:
\langle H \rangle = \langle \psi(\theta) | H | \psi(\theta) \rangle
where \theta = (\gamma_1, \beta_1, \ldots, \gamma_p, \beta_p) are the tunable parameters of the QAOA circuit.
2. Parameter-Shift Rule for QAOA Gradient
For a parameter \theta_i (e.g., \gamma_k or \beta_k, k = 1, 2, \ldots, p) that parameterizes a unitary operator, the gradient of the expectation value follows the parameter-shift rule:
\frac{\partial \langle H \rangle}{\partial \theta_i} = \frac{1}{2} \left[ \langle H \rangle\!\left( \theta_i + \frac{\pi}{2} \right) - \langle H \rangle\!\left( \theta_i - \frac{\pi}{2} \right) \right]
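A minimal numeric check of the parameter-shift rule on a single-qubit toy circuit: for RY(θ)|0⟩ measured in Z, the expectation value is cos θ, and the shifted-difference formula reproduces the analytic derivative −sin θ exactly. The function names are illustrative, not the authors’ implementation.

```python
import math

def expval(theta):
    """<Z> for the state RY(theta)|0>, which equals cos(theta)."""
    return math.cos(theta)

def parameter_shift(theta):
    """Gradient of <Z> via the parameter-shift rule:
    0.5 * [<Z>(theta + pi/2) - <Z>(theta - pi/2)]."""
    return 0.5 * (expval(theta + math.pi / 2) - expval(theta - math.pi / 2))

theta = 0.7
grad = parameter_shift(theta)  # analytic derivative is -sin(theta)
```

Because the rule is exact (not a finite-difference approximation) for gates generated by operators with eigenvalues ±1/2, the same two-evaluation recipe applies to each γ_k and β_k of the QAOA circuit.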

Appendix B

This appendix provides mathematical expressions for the residual LSTM, which integrates a residual connection into a standard LSTM structure to enhance gradient flow and improve training stability. Below are the key equations for a residual LSTM [31]:
Let the symbol \odot denote the Hadamard product (element-wise product). For each time step t, given the input x_t, the hidden state h_{t-1}, and the cell state c_{t-1}:
Forget Gate: f_t^l = \sigma\left(W_f^l x_t^l + U_f^l h_{t-1}^l + b_f^l\right)
Input Gate: i_t^l = \sigma\left(W_i^l x_t^l + U_i^l h_{t-1}^l + b_i^l\right)
Candidate Cell State: \hat{c}_t^l = \tanh\left(W_c^l x_t^l + U_c^l h_{t-1}^l + b_c^l\right)
Updated Cell State: c_t^l = f_t^l \odot c_{t-1}^l + i_t^l \odot \hat{c}_t^l
Output Gate: o_t^l = \sigma\left(W_o^l x_t^l + U_o^l h_{t-1}^l + b_o^l\right)
Updated Hidden State: \tilde{h}_t^l = o_t^l \odot \tanh\left(c_t^l\right)
Shortcut Connection: h_t^l = \tilde{h}_t^l + h_t^{l-1}
where the superscript l is the layer index and the subscripts i, f, and o represent the input, forget, and output gates, respectively. The symbol b denotes a bias vector, \sigma denotes the sigmoid function, and W and U are weight matrices.
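A single layer-l time step of the residual LSTM above can be sketched in NumPy. The toy dimension, random weights, and the choice of shortcut input are assumptions for illustration, not the trained model from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4  # toy hidden size, not the paper's setting

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# One weight set per gate (f, i, c, o): W acts on the layer input,
# U on the previous hidden state, b is the bias.
W = {g: rng.standard_normal((d, d)) * 0.1 for g in "fico"}
U = {g: rng.standard_normal((d, d)) * 0.1 for g in "fico"}
b = {g: np.zeros(d) for g in "fico"}

def residual_lstm_step(x, h_prev, c_prev, h_lower):
    f = sigmoid(W["f"] @ x + U["f"] @ h_prev + b["f"])      # forget gate
    i = sigmoid(W["i"] @ x + U["i"] @ h_prev + b["i"])      # input gate
    c_hat = np.tanh(W["c"] @ x + U["c"] @ h_prev + b["c"])  # candidate cell
    c = f * c_prev + i * c_hat                              # updated cell state
    o = sigmoid(W["o"] @ x + U["o"] @ h_prev + b["o"])      # output gate
    h_tilde = o * np.tanh(c)                                # updated hidden state
    return h_tilde + h_lower, c                             # shortcut connection

x = rng.standard_normal(d)
# For the lowest layer, the layer input itself is used as the shortcut
# term h^{l-1} (an assumption in this sketch).
h, c = residual_lstm_step(x, np.zeros(d), np.zeros(d), x)
```

The element-wise products mirror the \odot operations, and the final addition implements the shortcut connection that improves gradient flow across stacked layers.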

References

  1. Zhao, Z.; Luo, X.; Xie, J.; Gong, S.; Guo, J.; Ni, Q. Decentralized grid-forming control strategy and dynamic characteristics analysis of high-penetration wind power microgrids. IEEE Trans. Sustain. Energy 2022, 13, 2211–2225. [Google Scholar] [CrossRef]
  2. Akhtar, I.; Kirmani, S.; Jameel, M. Reliability assessment of power system considering the impact of renewable energy sources integration into grid with advanced intelligent strategies. IEEE Access 2021, 9, 32485–32497. [Google Scholar] [CrossRef]
  3. Ali, S.W.; Sadiq, M.; Terriche, Y.; Ahmad, S.; Naqvi, R.; Quang, L.; Hoang, N.; Mutarraf, M.U. Offshore wind farm-grid integration: A review on infrastructure, challenges, and grid solutions. IEEE Access 2021, 9, 102811–102827. [Google Scholar] [CrossRef]
  4. Yousuf, M.U.; Al-Bahadly, I.; Avci, E. Current perspective on the accuracy of deterministic wind speed and power forecasting. IEEE Access 2019, 7, 159547–159564. [Google Scholar] [CrossRef]
  5. Lipu, M.S.H.; Miah, M.S.; Hannan, M.A.; Hussain, A.; Sarker, M.R.; Ayob, A. Artificial intelligence based hybrid forecasting approaches for wind power generation: Progress, challenges and prospects. IEEE Access 2021, 9, 102460–102489. [Google Scholar] [CrossRef]
  6. Chen, Y.; Hu, X.; Zhang, L. A review of ultra-short-term forecasting of wind power based on data decomposition-forecasting technology combination model. Energy Rep. 2022, 8, 14200–14219. [Google Scholar] [CrossRef]
  7. Qian, Z.; Pei, Y.; Zareipour, H.; Chen, N. A review and discussion of decomposition-based hybrid models for wind energy forecasting applications. Appl. Energy 2019, 235, 939–953. [Google Scholar] [CrossRef]
  8. Vikash, K.S.; Rajesh, K.; Ameena, S.A.; Sujil, A.; Ehsan, H.F. Learning based short term wind speed forecasting models for smart grid applications: An extensive review and case study. Electr. Power Syst. Res. 2023, 222, 109502. [Google Scholar] [CrossRef]
  9. Liu, H.; Chen, C.; Lv, X.; Wu, X.; Liu, M. Deterministic wind energy forecasting: A review of intelligent predictors and auxiliary methods. Energy Convers. Manag. 2019, 195, 328–345. [Google Scholar] [CrossRef]
  10. Wang, Y.; Zou, R.; Liu, F.; Zhang, L.; Liu, Q. A review of wind speed and wind power forecasting with deep neural networks. Appl. Energy 2021, 304, 117766. [Google Scholar] [CrossRef]
  11. Daley, A.J.; Bloch, I.; Kokail, C.; Flannigan, S.; Pearson, N.; Troyer, M.; Zoller, P. Practical quantum advantage in quantum simulation. Nature 2022, 607, 667–676. [Google Scholar] [CrossRef] [PubMed]
  12. Ajagekar, A.; You, F. Quantum computing for energy systems optimization: Challenges and opportunities. Energy 2019, 179, 76–89. [Google Scholar] [CrossRef]
  13. Deng, Z.; Wang, X.; Dong, B. Quantum computing for future real-time building HVAC controls. Appl. Energy 2023, 334, 120621. [Google Scholar] [CrossRef]
  14. Ajagekar, A.; You, F. Quantum computing based hybrid deep learning for fault diagnosis in electrical power systems. Appl. Energy 2021, 303, 117628. [Google Scholar] [CrossRef]
  15. Zhou, Y.; Zhang, P. Noise-resilient quantum machine learning for stability assessment of power systems. IEEE Trans. Power Syst. 2023, 38, 475–487. [Google Scholar] [CrossRef]
  16. Feng, F.; Zhang, P.; Bragin, M.A.; Zhou, Y. Novel resolution of unit commitment problems through quantum surrogate Lagrangian relaxation. IEEE Trans. Power Syst. 2023, 38, 2460–2471. [Google Scholar] [CrossRef]
  17. Nikmehr, N.; Zhang, P.; Bragin, M.A. Quantum distributed unit commitment: An application in microgrids. IEEE Trans. Power Syst. 2022, 37, 3592–3603. [Google Scholar] [CrossRef]
  18. Feng, F.; Zhou, Y.; Zhang, P. Quantum power flow. IEEE Trans. Power Syst. 2021, 36, 3810–3812. [Google Scholar] [CrossRef]
  19. Nikmehr, N.; Zhang, P. Quantum-inspired power system reliability assessment. IEEE Trans. Power Syst. 2023, 38, 3476–3490. [Google Scholar] [CrossRef]
  20. Hong, Y.Y.; Santos, J.B.D. Day-ahead spatiotemporal wind speed forecasting based on a hybrid model of quantum and residual long short-term memory optimized by particle swarm algorithm. IEEE Syst. J. 2023, 17, 6081–6092. [Google Scholar] [CrossRef]
  21. Ajagekar, A.; You, F. Quantum computing and quantum artificial intelligence for renewable and sustainable energy: A emerging prospect towards climate neutrality. Renew. Sustain. Energy Rev. 2022, 165, 112493. [Google Scholar] [CrossRef]
  22. Kusyk, J.; Saeed, S.M.; Uyar, M.U. Survey on quantum circuit compilation for noisy intermediate-scale quantum computers: Artificial intelligence to heuristics. IEEE Trans. Quantum 2021, 2, 2501616. [Google Scholar] [CrossRef]
  23. Ji, X.; Wang, B.; Hu, F.; Wang, C.; Zhang, H. New advanced computing architecture for cryptography design and analysis by D-Wave quantum annealer. Tsinghua Sci. Technol. 2022, 27, 751–759. [Google Scholar] [CrossRef]
  24. Maruo, A.; Soeda, T.; Igarashi, H. Topology optimization of electromagnetic devices using digital annealer. IEEE Trans. Magn. 2022, 58, 7001504. [Google Scholar] [CrossRef]
  25. Available online: https://docs.pennylane.ai/en/stable/development/guide/documentation.html (accessed on 13 October 2022).
  26. Willsch, D.; Willsch, M.; Jin, F.; Michielsen, K.; Raedt, H.D. GPU-accelerated simulations of quantum annealing and the quantum approximate optimization algorithm. Comput. Phys. Commun. 2022, 278, 108411. [Google Scholar] [CrossRef]
  27. Zhou, L.; Wang, S.T.; Choi, S.; Pichler, H.; Lukin, M.D. Quantum approximate optimization algorithm: Performance, mechanism, and implementation on near-term devices. Phys. Rev. X 2022, 10, 021067. [Google Scholar] [CrossRef]
  28. Tilly, J.; Chen, H.; Cao, S.; Picozzi, D.; Setia, K.; Li, Y.; Grant, E.; Wossnig, L.; Rungger, I.; Booth, G.H.; et al. The variational quantum eigensolver: A review of methods and best practices. Phys. Rep. 2022, 986, 1–128. [Google Scholar] [CrossRef]
  29. Choi, J.; Kim, J. A tutorial on quantum approximate optimization algorithm (QAOA): Fundamentals and applications. In Proceedings of the 2019 International Conference on Information and Communication Technology Convergence, Jeju, Republic of Korea, 16–18 October 2019. [Google Scholar] [CrossRef]
  30. Available online: https://docs.pennylane.ai/en/stable/code/api/pennylane.QAOAEmbedding.html (accessed on 20 October 2022).
  31. Kim, J.; El-Khamy, M.; Lee, J. Residual LSTM: Design of a deep recurrent architecture for distant speech recognition. arXiv 2017, arXiv:1701.03360. [Google Scholar] [CrossRef]
  32. Incudini, M.; Grossi, M.; Mandarino, A.; Vallecorsa, S.; Pierro, A.D.; Windridge, D. The quantum path kernel: A generalized quantum neural tangent kernel for deep quantum machine learning. arXiv 2022, arXiv:2212.11826. [Google Scholar] [CrossRef]
  33. Available online: https://pennylane.ai/devices/lightning-qubit (accessed on 20 October 2022).
  34. Nikolic, G.S.; Dimitrijevic, B.R.; Nikolic, T.R.; Stojcev, M.K. A survey of three types of processing units: CPU, GPU and TPU. In Proceedings of the 57th International Scientific Conference on Information, Communication and Energy Systems and Technologies (ICEST), Ohrid, North Macedonia, 16–18 June 2022. [Google Scholar] [CrossRef]
  35. Dehal, R.S.; Munjal, C.; Ansari, A.A.; Kushwaha, A.S. GPU Computing Revolution: CUDA. In Proceedings of the IEEE 2018 International Conference on Advances in Computing, Communication Control and Networking, Greater Noida, India, 12–13 October 2018. [Google Scholar] [CrossRef]
  36. A Review of CUDA, MapReduce, and Pthreads Parallel Computing Models. Available online: https://www.researchgate.net/publication/267667400_A_Review_of_CUDA_MapReduce_and_Pthreads_Parallel_Computing_Models (accessed on 20 October 2022).
  37. Wang, H.; Liang, Z.; Gu, J.; Li, Z.; Ding, Y.; Jiang, W.; Shi, Y.; Pan, D.Z.; Chong, F.T.; Han, S. TorchQuantum case study for robust quantum circuits (invited paper). In Proceedings of the 2022 IEEE/ACM International Conference on Computer Aided Design (ICCAD), San Diego, CA, USA, 29 October 2022. [Google Scholar]
Figure 1. A QAOA circuit [30].
Figure 2. Architecture of proposed method.
Figure 3. Process flow of CUDA-enabled GPU.
Figure 4. Workflow including QNN design.
Figure 5. Performance with various numbers of qubits.
Figure 6. R-squared score with various numbers of qubits.
Figure 7. Forecasted and actual wind speeds for a typical day in the fall season.
Table 1. Templates of PennyLane for QNN.

Template | Aim | Main Feature
QAOA Embedding | To train the features that will allow for the computation of feature value gradients. | Enables gradient computations with respect to both the features and the weights arguments. It uses a multilayer, trainable quantum circuit modeled after the QAOA ansatz.
Amplitude Embedding | To embed 2^n features into the amplitude vector of n qubits. | Features are automatically padded to dimension 2^n when padding is set to a real or complex value. When this template is used, the features parameter is not differentiable, and PennyLane cannot compute gradients with respect to the features.
Angle Embedding | To encode N features into the rotation angles of n qubits, where N ≤ n. | Depending on the rotation parameter, the rotations can be implemented using Rx, Ry, or Rz gates.
IQP (Instantaneous Quantum Polynomial) Embedding | To construct a layer influenced by the diagonal gates of an IQP circuit to encode the features into qubits. | Entails non-trivial classical processing of the features; it is composed of a block of Hadamards followed by a block of gates that are diagonal in the computational basis. IQP refers to a class of quantum circuits consisting only of commuting gates.
Basic Entangler Layer | To construct a layer of single-qubit rotations with a single parameter on every qubit, coupled by a closed chain of CNOT gates. | When only two wires are employed, the closing entanglement between the final and first qubits is dropped so that the entangler is not repeated on the same pair of wires.
Strongly Entangling Layer | To construct a layer influenced by the circuit-centric classifier architecture, including single-qubit rotations and entanglers. | The 2-qubit gates, whose type is determined by the imprimitive argument, act on the wires sequentially. When used on a single qubit, this template employs no imprimitive gates.
Table 2. Results of statistical performance metrics obtained by different embeddings and layers.

QNN | MSE | R2 Score | RMSE | MAE
a | 0.0051 | 0.8942 | 0.0714 | 0.0531
b | 0.0027 | 0.9338 | 0.0523 | 0.0384
c | 0.0032 | 0.9274 | 0.0566 | 0.0407
d | 0.0029 | 0.9439 | 0.0544 | 0.0404
e | 0.0049 | 0.9063 | 0.0701 | 0.0684
f | 0.0044 | 0.9115 | 0.0663 | 0.0613
g | 0.0003 | 0.9917 | 0.0191 | 0.0126
h | 0.0007 | 0.9761 | 0.0281 | 0.0258
Table 3. Comparison of execution time and number of iterations obtained by different embeddings and layers.

QNN | Training Time (hrs:mins:secs) | Testing Time (hrs:mins:secs) | No. of Iterations
a | 01:07:32 | 00:00:01.0011 | 79
b | 01:11:19 | 00:00:01.3762 | 82
c | 01:17:29 | 00:00:01.7436 | 74
d | 01:15:34 | 00:00:00.9479 | 81
e | 04:40:12 | 00:00:01.1362 | 48
f | 04:33:01 | 00:00:01.5641 | 45
g | 00:47:16 | 00:00:05.2311 | 84
h | 00:05:23 | 00:00:00.9372 | 62
Table 4. Results of statistical performance metrics obtained by different quantum simulators.

Simulator | MSE | R2 Score | RMSE | MAE
PennyLane | 0.0003 | 0.9917 | 0.0191 | 0.0126
TorchQuantum | 0.0005 | 0.9785 | 0.0224 | 0.0163
Table 5. Comparison of execution time and number of iterations obtained by different simulators.

Simulator | Training Time (hrs:mins:secs) | Testing Time (hrs:mins:secs) | No. of Iterations
PennyLane | 00:47:16 | 00:00:05.2311 | 84
TorchQuantum | 00:56:21 | 00:00:05.4266 | 87
Table 6. Comparison of architectural differences and optimization strategies between PennyLane and TorchQuantum.

Feature | PennyLane | TorchQuantum
Quantum Circuit Representation | Abstract and hardware-agnostic; supports various quantum backends | Integrated with PyTorch’s tensor-based framework (v2.6)
Hardware Integration | Extensive hardware support (e.g., IBM Q and Rigetti) | Primarily focused on simulators, but extendable to hardware
Gradient Calculation | Parameter-shift rule, automatic differentiation | Built-in PyTorch autograd, parameter-shift rule
Optimization | Supports hybrid quantum–classical optimization, advanced optimizers | Uses PyTorch’s classical optimizers for hybrid models
Customization | Highly customizable quantum optimization strategies | Relies on PyTorch optimizers, less customization for quantum optimization
