Quantum Recurrent Neural Networks: Predicting the Dynamics of Oscillatory and Chaotic Systems

Abstract: In this study, we investigate Quantum Long Short-Term Memory (QLSTM) and Quantum Gated Recurrent Unit (QGRU) models integrated with Variational Quantum Circuits for modeling complex dynamical systems, including the Van der Pol oscillator, coupled oscillators, and the Lorenz system. We implement these quantum machine learning techniques and compare their performance with traditional Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) models. The results show that the quantum-based models deliver higher precision and more stable loss curves over 100 epochs for both the Van der Pol oscillator and the coupled harmonic oscillators, and over 20 epochs for the Lorenz system. The QGRU outperforms the competing models. For the Van der Pol oscillator, it reports MAE 0.0902 and RMSE 0.1031 for variable x and MAE 0.1500 and RMSE 0.1943 for y; for the coupled oscillators, MAE 0.2411 and RMSE 0.2701 for Oscillator 1 and MAE 0.0482 and RMSE 0.0602 for Oscillator 2; and for the Lorenz system, MAE 0.4864 and RMSE 0.4971 for x, MAE 0.4723 and RMSE 0.4846 for y, and MAE 0.4555 and RMSE 0.4745 for z. These outcomes mark a significant advancement in the field of quantum machine learning.


Introduction
Ordinary differential equations (ODEs) are foundational mathematical models that play a critical role across diverse domains, such as physics [1], biology [2,3], and economics [4]. The numerical solution of ODE systems is pivotal for understanding the dynamics of these varied processes. To this end, numerous methodologies have been devised, including linear multistep methods [5,6] and Runge-Kutta methods [7,8]. Nevertheless, the practical application of these methods frequently encounters systems characterized by nonlinear and time-varying elements, alongside a pronounced sensitivity to initial conditions. Such complexities [9] significantly challenge the efficacy of traditional analytical and numerical strategies, especially in terms of achieving desired levels of precision, stability, and computational efficiency. Given these considerations, there is a pressing need for more sophisticated techniques capable of handling such ODE systems.
In recent years, recurrent neural networks (RNNs) have achieved significant breakthroughs in processing sequential data. RNNs have been used extensively in various domains, including time series prediction [10-12], machine translation [13], and speech recognition [14]. Notably, RNNs have also played a critical role in scientific research [15,16], an area of particular interest to us. Within this framework, RNNs and their variants, such as Long Short-Term Memory (LSTM) [17] and Gated Recurrent Units (GRUs) [18], have proven effective in learning and forecasting complex, nonlinear problems. The intrinsic ability of RNNs to capture temporal dependencies and learn from sequential data makes them particularly suited for modeling dynamical systems. Unlike conventional approaches that require explicit numerical schemes for integration over time, RNNs can implicitly learn the underlying dynamics from data, offering a way to bypass the complexities of directly solving ODEs. This capability allows for the modeling of complex systems with high degrees of freedom and nonlinearity, providing a flexible and potentially more accurate alternative to traditional methods. Consequently, the use of RNNs in solving ODEs, especially systems of ODEs, represents a significant shift towards data-driven approaches in the prediction of dynamical systems [19-22] and chaotic time series [23-26].
Concurrently, quantum computing has been advancing rapidly, driven by hardware and software platforms such as IBM [27], Google [28], PennyLane [29], and D-Wave Systems [30]. These quantum systems offer theoretical advantages for specific computational tasks and simulations of intricate quantum phenomena. However, the practical utility of current Noisy Intermediate-Scale Quantum (NISQ) devices is constrained by limitations in quantum error correction [31-33]. This limitation poses a significant challenge, as it restricts the reliability and scalability of quantum computations, impeding their broader application to complex problems.
The advent of variational quantum algorithms and circuits, pioneered by Mitarai et al. [34], has been a landmark development in the integration of quantum computing with machine learning. These algorithms leverage quantum entanglement to address fundamental machine learning challenges, such as function approximation and classification [34,35]. A broader treatment of variational quantum algorithms and circuits can be found in [36]. This innovation has facilitated the development of hybrid quantum-classical algorithms that are adaptable for use on existing NISQ devices. Such hybrid approaches have shown promise across various domains, including classification [37,38], generative adversarial learning [39], and deep reinforcement learning [40]. Furthermore, the exploration of quantum machine learning [41-46] has opened a new path for processing and analyzing data, harnessing the unique computational capabilities of quantum systems.
Among these studies, Chen et al. [47,48] introduced Quantum LSTM (QLSTM) and Quantum GRU (QGRU), which integrate variational quantum circuits (VQCs) with recurrent neural network architectures. Designed for NISQ devices, these quantum models utilize quantum entanglement to enhance learning efficiency and stability. Their research shows that QLSTM and QGRU outperform traditional LSTM models in terms of learning efficiency and convergence stability, marking a significant step in sequential data learning within the quantum computing sphere. However, the scenarios examined in Chen's studies [47,48] involved only single ODEs, which prompted us to explore the application of quantum recurrent neural networks (QRNNs) to systems of ODEs.
In this research, we explore the application of Quantum Long Short-Term Memory (QLSTM) and Quantum Gated Recurrent Unit (QGRU) models, based on the Variational Quantum Circuits (VQCs) developed by Chen et al. [47,48], to complex dynamical systems described by ODEs. Our approach involves a small modification to the VQC that refines its compatibility with the QLSTM and QGRU models in this context. Our focus is on the Van der Pol oscillator, two coupled damped harmonic oscillators, and the Lorenz equations. We conduct a comparative analysis of these quantum models against their classical counterparts, LSTM and GRU, focusing on error metrics and loss evolution across epochs. This comparison aims to evaluate their effectiveness in modeling and predicting the dynamics of nonlinear and chaotic systems.
The structure of this paper is organized as follows: Section 2 presents an overview of fundamental concepts in quantum computing, including qubits, superposition, quantum gates, and entanglement. In Section 3, we introduce the computational models under investigation, namely variational quantum circuits, classical Long Short-Term Memory (LSTM), classical Gated Recurrent Unit (GRU), Quantum Long Short-Term Memory (QLSTM), and Quantum Gated Recurrent Unit (QGRU). Section 4 introduces three numerical experiments conducted to evaluate these models: the Van der Pol oscillator, two coupled damped harmonic oscillators, and the Lorenz equations. Section 5 presents the discussion and Section 6 the conclusion.

Qubits
The core distinction between classical and quantum computing lies in their fundamental units of information: the classical bit and the quantum bit (qubit), respectively. In classical computing, a bit is a binary unit that can exist in one of two states, either 0 or 1. This binary nature underpins all classical computing processes, with data storage and computations executed through combinations of these binary states.
In contrast, the qubit, the fundamental unit of quantum computing, transcends this binary limitation. A qubit differs from a classical bit in its ability to exist in a superposition of states. Rather than being limited to a strict 0 or 1, a qubit can represent both 0 and 1 simultaneously, a phenomenon that is central to quantum computing's potential for processing complexity.
This unique property of qubits arises from the principles of quantum mechanics. Unlike classical bits, whose state is definite and observable, a qubit's state is described probabilistically. The actual state of a qubit is not determined until it is measured. Before measurement, a qubit exists in a superposition of the states |0⟩ and |1⟩, where the probabilities of these states are determined by quantum amplitudes. These amplitudes, which are complex numbers, dictate the likelihood of the qubit collapsing into either the |0⟩ or |1⟩ state upon measurement.
Superposition underlies quantum computing's potential to solve certain problems that are currently intractable for classical computers. It is introduced in more detail in the following section.

Superpositions
Superposition in a quantum system allows a qubit to exist in multiple states simultaneously until it is measured. This characteristic differentiates a quantum bit from a classical bit, which can only be in one state at any given time. In a single-qubit system, the state of a qubit can be described as a linear combination of the basis states |0⟩ and |1⟩. Mathematically, this is represented as:

$$|\psi\rangle = \alpha|0\rangle + \beta|1\rangle,$$

where $\alpha$ and $\beta$ are complex probability amplitudes satisfying $|\alpha|^2 + |\beta|^2 = 1$.

Extending the concept of superposition to a two-qubit system, we find a more complex scenario. In a two-qubit system, the combined state can be represented as a superposition of all four possible states of the two qubits. The state vector for such a system is given by:

$$|\psi\rangle = \alpha|00\rangle + \beta|01\rangle + \gamma|10\rangle + \delta|11\rangle,$$

where α, β, γ, and δ are complex probability amplitudes corresponding to each of the four states. As per the normalization condition, the sum of the squares of the absolute values of these amplitudes must equal one:

$$|\alpha|^2 + |\beta|^2 + |\gamma|^2 + |\delta|^2 = 1.$$

For instance, if the amplitudes are $\alpha = \sqrt{0.1}$, $\beta = \sqrt{0.2}$, $\gamma = \sqrt{0.3}$, and $\delta = \sqrt{0.4}$, the probabilities of measuring the system in the states |00⟩, |01⟩, |10⟩, and |11⟩ are 10%, 20%, 30%, and 40%, respectively.
In both single- and two-qubit systems, superposition allows quantum computers to process and encode information in ways that are fundamentally different from classical computers, enabling them to perform certain complex calculations more efficiently.
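The relation between amplitudes and measurement probabilities can be checked numerically. The following is a minimal sketch, assuming real, non-negative amplitudes (zero phases) chosen to match the 10%/20%/30%/40% example; the function name `measurement_probabilities` is illustrative only.

```python
import numpy as np

# Two-qubit amplitudes for |00>, |01>, |10>, |11> (zero phases assumed).
amplitudes = np.sqrt(np.array([0.1, 0.2, 0.3, 0.4], dtype=complex))

def measurement_probabilities(state):
    """Born rule: the probability of each basis state is |amplitude|^2."""
    probs = np.abs(state) ** 2
    if not np.isclose(probs.sum(), 1.0):
        raise ValueError("state vector is not normalized")
    return probs

probs = measurement_probabilities(amplitudes)
for label, p in zip(["00", "01", "10", "11"], probs):
    print(f"P(|{label}>) = {p:.2f}")
```

Any complex phases attached to the amplitudes would leave these probabilities unchanged, since only the absolute values enter the Born rule.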

Quantum Gates
In traditional computing, gates are fundamental building blocks that process binary information, operating on bits that exist in one of two states: 0 or 1. Common logic gates include AND, OR, NOT, and XOR, each performing a specific logical operation for computational tasks.
Similarly, quantum computing has quantum gates, such as the Hadamard, Pauli-X, Pauli-Y, Pauli-Z, and Controlled-NOT (CNOT) gates, which manipulate qubits. Multi-qubit gates such as CNOT can entangle qubits, creating correlations between them that are essential to the power of quantum computation. These gates are unitary, and therefore reversible, a property that contrasts with some irreversible classical gates.
Here, we introduce one of the fundamental gates in quantum computing, the Hadamard gate, often used to create a superposition of states.The Hadamard gate acts on a single qubit and transforms it into a superposition of its basis states.
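The Hadamard gate is represented by the following matrix:

$$H = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}.$$

This matrix operates on a qubit state vector, transforming it from a definite state (|0⟩ or |1⟩) into a superposition. When the Hadamard gate acts on the state |0⟩ (represented by the vector $(1, 0)^T$), the resulting state is:

$$H|0\rangle = \frac{1}{\sqrt{2}}\left(|0\rangle + |1\rangle\right).$$

Similarly, when it acts on the state |1⟩ (represented by the vector $(0, 1)^T$), the output is:

$$H|1\rangle = \frac{1}{\sqrt{2}}\left(|0\rangle - |1\rangle\right).$$

The Hadamard gate, therefore, creates an equal superposition of the |0⟩ and |1⟩ states.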
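As a quick numerical check of the action of the Hadamard gate on the basis states, the sketch below applies the standard 2×2 Hadamard matrix to the column vectors for |0⟩ and |1⟩. The variable names are illustrative.

```python
import numpy as np

# Standard Hadamard matrix and computational basis vectors.
H = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2)
ket0 = np.array([1.0, 0.0])
ket1 = np.array([0.0, 1.0])

plus = H @ ket0   # (|0> + |1>) / sqrt(2)
minus = H @ ket1  # (|0> - |1>) / sqrt(2)

print("H|0> =", plus)
print("H|1> =", minus)
# Both outputs give a 50/50 measurement distribution.
print("probabilities:", np.abs(plus) ** 2, np.abs(minus) ** 2)
```

Applying `H` twice returns the original state, which reflects the reversibility of unitary gates.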

Quantum Entanglement
Quantum entanglement is a phenomenon in which multiple qubits become interconnected in such a way that the state of one qubit cannot be described independently of the state of the other qubits, even when the qubits are separated by large distances.
To explain the significance of entanglement in this study, consider an analogy involving the conventional method of information exchange between two individuals. The most traditional method of information exchange involves one individual physically moving within audible distance to relay news to another person. This method, while effective, is time-consuming and limits the speed of information transfer. Entanglement plays a loosely analogous role in our framework: although it cannot be used to transmit information instantaneously, the correlations it creates let the circuit act on the joint state of several qubits at once, which we use to enhance the efficiency and speed of learning. In the analogy, this mirrors the leap from having to physically move towards someone to share information to a direct exchange of knowledge.
Entanglement can be achieved using certain quantum gates. For instance, the combination of a Hadamard gate followed by a Controlled-NOT (CNOT) gate is commonly used to entangle two qubits.
First, the Hadamard gate is applied to one of the qubits to create a superposition, as previously discussed. Then, the CNOT gate, which flips the state of the second qubit if the first qubit is in the |1⟩ state, is applied. The operation of the CNOT gate can be described by the following matrix:

$$\mathrm{CNOT} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{pmatrix}.$$

This operation creates entanglement between the two qubits. Consider two qubits, initially in the state |00⟩. Applying a Hadamard gate to the first qubit creates the superposition $\frac{1}{\sqrt{2}}(|0\rangle + |1\rangle)$ for the first qubit. Now, the combined state of the two qubits is $\frac{1}{\sqrt{2}}(|00\rangle + |10\rangle)$. Applying a CNOT gate afterward results in the entangled state $\frac{1}{\sqrt{2}}(|00\rangle + |11\rangle)$, where the state of each qubit cannot be described independently of the other.
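The Hadamard-then-CNOT construction can be reproduced with plain matrix algebra. This is a minimal sketch that assumes the first qubit is the control and is the most significant bit in the basis ordering |00⟩, |01⟩, |10⟩, |11⟩.

```python
import numpy as np

H = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2)
I2 = np.eye(2)
# CNOT with the first qubit as control and the second as target.
CNOT = np.array([
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 0],
], dtype=float)

ket00 = np.array([1.0, 0.0, 0.0, 0.0])

after_h = np.kron(H, I2) @ ket00  # (|00> + |10>) / sqrt(2)
bell = CNOT @ after_h             # (|00> + |11>) / sqrt(2)

print("after Hadamard:", after_h)
print("after CNOT:    ", bell)
```

The final vector has non-zero amplitude only on |00⟩ and |11⟩, so measuring one qubit fixes the outcome of the other.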

Variational Quantum Circuits (VQCs)
Variational Quantum Circuits (VQCs) are the distinguishing component in the architecture of quantum recurrent neural networks. The VQC used in this context is structured in three main parts: an encoding layer, a variational layer, and a quantum measurement layer. Figure 1 shows an example of a VQC. The left-hand part of the figure is the encoding layer, which contains a Hadamard gate and two angle embedding layers that act as $R_y$ and $R_z$ gates. The middle part is the variational layer (entanglement layer). It is worth mentioning that the number of qubits (4 in this example) and the number of variational layers (2 in this example) can be modified to increase the capacity of the model to learn the data.

Encoding Layer
The encoding layer's primary function is to map classical data values into quantum amplitudes. It starts by initializing the circuit in the ground state and applying Hadamard gates to create an unbiased initial state. The state of an N-qubit quantum system can be represented as:

$$|\psi\rangle = \sum_{(q_1,\ldots,q_N)\in\{0,1\}^N} c_{q_1,\ldots,q_N}\,|q_1\rangle\otimes|q_2\rangle\otimes\cdots\otimes|q_N\rangle,$$

where $c_{q_1,\ldots,q_N}\in\mathbb{C}$ is the complex amplitude for each basis state and ⊗ stands for the tensor product. Again, by Born's rule, we know that:

$$\sum_{(q_1,\ldots,q_N)\in\{0,1\}^N} |c_{q_1,\ldots,q_N}|^2 = 1.$$

The encoding layer first transforms the input data into rotation angles for the single-qubit rotations. We apply the Hadamard gate to each qubit to transform the initial state into a superposition, also called the unbiased state. After applying the Hadamard gate H once to each of the N qubits, we obtain:

$$(H|0\rangle)^{\otimes N} = \frac{1}{\sqrt{2}}\left(|0\rangle + |1\rangle\right)\otimes\cdots\otimes\frac{1}{\sqrt{2}}\left(|0\rangle + |1\rangle\right).$$

The resulting state is a uniform superposition of all possible states of N qubits, which can be expressed as:

$$\frac{1}{\sqrt{2^N}}\left(|0\rangle\otimes\cdots\otimes|0\rangle + \cdots + |1\rangle\otimes\cdots\otimes|1\rangle\right).$$

Hence,

$$(H|0\rangle)^{\otimes N} = \frac{1}{\sqrt{2^N}}\sum_{i=0}^{2^N-1}|i\rangle.$$

Here, i is used as an index to sum over all possible states, where i represents the decimal equivalent of the binary numbers formed by the qubits. For example, for N = 2, the sum would run over |00⟩, |01⟩, |10⟩, and |11⟩, which correspond to decimal 0, 1, 2, and 3, respectively.
The encoding process uses two-angle encoding, where each data value is encoded into a qubit with a series of two gates, $R_y$ and $R_z$. In this study, a template named qml.templates.AngleEmbedding is used to play the role of $R_y$ and $R_z$.
The qml.templates.AngleEmbedding template maps classical information onto quantum states by applying rotation gates to each qubit in the system. The template can be customized in several respects, including the axis of rotation: $R_x$, $R_y$, or $R_z$. This adaptability makes the template well suited for a broad spectrum of applications within the domain of quantum neural networks and other machine learning models.
After transforming the initial states into unbiased states, we generate 2N rotation angles from an N-dimensional input vector, $\vec{v} = (x_1, x_2, \ldots, x_N)$. For each component $x_i$ of $\vec{v}$, Chen's original paper computes two angles: $\theta_{i,1} = \arctan(x_i)$ for the y-axis rotation and $\theta_{i,2} = \arctan(x_i^2)$ for the z-axis rotation, where the rotations are applied through $R_y(\theta_{i,1})$ and $R_z(\theta_{i,2})$ gates, respectively.
Although the input data are normalized before the encoding layer, this work uses a different angle mapping. Rather than employing arctan functions, we select the sin and cos functions for the $R_y(\theta_{i,1})$ and $R_z(\theta_{i,2})$ gates, respectively. This entails setting $\theta_{i,1} = \sin(x_i)$ for the $R_y$ gate and $\theta_{i,2} = \cos(x_i^2)$ for the $R_z$ gate. This mapping is a preliminary attempt in practice; improving it is left to future work and is not covered in this paper. More studies related to this aspect can be found in Mitarai's paper [34].
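To make the two angle mappings concrete, the sketch below computes both sets of rotation angles for a small, already normalized input vector. The function names `arctan_angles` and `sincos_angles` are hypothetical helpers, not taken from the original implementation.

```python
import numpy as np

def arctan_angles(x):
    """Angle mapping attributed to Chen's original work: arctan(x), arctan(x^2)."""
    x = np.asarray(x, dtype=float)
    return np.arctan(x), np.arctan(x ** 2)

def sincos_angles(x):
    """Angle mapping used in this work: sin(x) for R_y, cos(x^2) for R_z."""
    x = np.asarray(x, dtype=float)
    return np.sin(x), np.cos(x ** 2)

inputs = np.array([0.0, 0.5, 1.0])  # assumed to be normalized beforehand
theta_y_old, theta_z_old = arctan_angles(inputs)
theta_y_new, theta_z_new = sincos_angles(inputs)

print("arctan mapping :", theta_y_old, theta_z_old)
print("sin/cos mapping:", theta_y_new, theta_z_new)
```

In PennyLane, arrays like these could plausibly be passed as the `features` argument of the angle embedding template, once with `rotation="Y"` and once with `rotation="Z"`; that wiring is an assumption about usage rather than a description of the original code.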

Variational Circuit
The variational circuit is the trainable part of the VQC, consisting of parameterized unitary transformations. This section of the circuit includes multiple CNOT gates for qubit entanglement and unitary rotation gates controlled by learnable parameters α, β, and γ. It is worth mentioning that the variational circuit can be repeated more than once in practice to increase the number of parameters and the model's expressive capacity.

Quantum Measurement
Quantum measurement is used for extracting classical information from the quantum circuit. It involves measuring the qubits, which, due to the probabilistic nature of quantum systems, yield varying bit strings upon each measurement. The expectation value of an operator $\hat{O}$ for a state |ψ⟩ is given by:

$$\langle \hat{O} \rangle = \langle\psi|\hat{O}|\psi\rangle.$$

The expectation values can be either calculated analytically in a quantum simulation or obtained through multiple samplings in practical quantum devices with specific noise models.
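The analytic route can be illustrated on a single qubit. The sketch below is a plain NumPy simulation, not the PennyLane circuit used in the experiments: it prepares H|0⟩, applies $R_y$ and $R_z$ rotations with the standard matrix definitions, and evaluates the Pauli-Z expectation value. The helper names are illustrative.

```python
import numpy as np

H = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2)
Z = np.diag([1.0, -1.0])

def ry(theta):
    """Rotation about the y axis, exp(-i * theta * Y / 2)."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]], dtype=complex)

def rz(theta):
    """Rotation about the z axis, exp(-i * theta * Z / 2)."""
    return np.diag([np.exp(-1j * theta / 2), np.exp(1j * theta / 2)])

def expval(op, state):
    """<psi|O|psi>; real for a Hermitian operator."""
    return float(np.real(np.vdot(state, op @ state)))

def encode_and_measure(theta_y, theta_z):
    state = H @ np.array([1.0, 0.0], dtype=complex)  # unbiased state
    state = rz(theta_z) @ (ry(theta_y) @ state)      # two-angle encoding
    return expval(Z, state)

value = encode_and_measure(0.3, 1.1)
print(f"<Z> = {value:.4f}")
```

For this one-qubit circuit the result equals −sin(θ_y) and always lies in [−1, 1]; the z rotation only changes phases and therefore does not affect the Pauli-Z expectation.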
The VQC architecture described here is pivotal in the QLSTM and QGRU, as a quantum-enhanced approach to processing and learning from data.

Long Short-Term Memory (LSTM)
Long Short-Term Memory (LSTM) networks are a type of recurrent neural network (RNN) particularly adept at learning from sequences of data. LSTMs are designed to overcome the limitations of traditional RNNs, especially issues related to long-term dependencies in data sequences. Figure 2 shows the structure of a single LSTM unit. An LSTM unit is composed of a cell, an input gate, an output gate, and a forget gate. These components work together to regulate the flow of information into and out of the cell, and to decide which information to store and which to discard.
The key equations governing the operations within an LSTM unit are as follows:

$$f_t = \sigma\left(W_f \cdot [h_{t-1}, x_t] + b_f\right)$$
$$i_t = \sigma\left(W_i \cdot [h_{t-1}, x_t] + b_i\right)$$
$$\tilde{C}_t = \tanh\left(W_C \cdot [h_{t-1}, x_t] + b_C\right)$$
$$C_t = f_t * C_{t-1} + i_t * \tilde{C}_t$$
$$o_t = \sigma\left(W_o \cdot [h_{t-1}, x_t] + b_o\right)$$
$$h_t = o_t * \tanh(C_t)$$

Here, $f_t$, $i_t$, and $o_t$ represent the activations of the forget, input, and output gates, respectively, at time t. σ denotes the sigmoid function, and * represents element-wise multiplication. $C_t$ is the cell state at time t, $\tilde{C}_t$ is the candidate cell state, $x_t$ is the input, and $h_t$ is the hidden state. W and b are the weights and biases associated with the respective gates and cell state updates.
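The gate structure of an LSTM unit can be sketched as a single time step in NumPy. This is an illustrative implementation of the standard LSTM update with randomly initialized weights, assuming the gates act on the concatenation of the previous hidden state and the current input; it is not the trained model from the experiments.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM time step; W and b hold one matrix/vector per gate."""
    v = np.concatenate([h_prev, x_t])       # [h_{t-1}, x_t]
    f = sigmoid(W["f"] @ v + b["f"])        # forget gate
    i = sigmoid(W["i"] @ v + b["i"])        # input gate
    c_tilde = np.tanh(W["c"] @ v + b["c"])  # candidate cell state
    c = f * c_prev + i * c_tilde            # new cell state
    o = sigmoid(W["o"] @ v + b["o"])        # output gate
    h = o * np.tanh(c)                      # new hidden state
    return h, c

rng = np.random.default_rng(0)
hidden_size, input_size = 4, 2
W = {k: rng.normal(scale=0.1, size=(hidden_size, hidden_size + input_size)) for k in "fico"}
b = {k: np.zeros(hidden_size) for k in "fico"}

h, c = lstm_step(rng.normal(size=input_size), np.zeros(hidden_size), np.zeros(hidden_size), W, b)
print("hidden state:", h)
print("cell state:  ", c)
```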

Gated Recurrent Units (GRU)
The Gated Recurrent Unit (GRU) is a variation of recurrent neural networks that aims to solve the vanishing gradient problem, similar to LSTM. GRUs simplify the architecture seen in LSTM by combining certain gates and states, which often results in more efficient training for certain types of problems. The structure of a single GRU is shown in Figure 3. The GRU architecture is built around two gates: the update gate and the reset gate. These gates determine how much of the past information needs to be passed along to the future. The key equations defining a GRU are:

$$z_t = \sigma\left(W_z \cdot [h_{t-1}, x_t] + b_z\right) \tag{20}$$
$$r_t = \sigma\left(W_r \cdot [h_{t-1}, x_t] + b_r\right) \tag{21}$$
$$\tilde{h}_t = \tanh\left(W_h \cdot [r_t * h_{t-1}, x_t] + b_h\right) \tag{22}$$
$$h_t = z_t * h_{t-1} + (1 - z_t) * \tilde{h}_t \tag{23}$$

In Equations (20)-(23), $z_t$ and $r_t$ represent the activations of the update and reset gates at time t, respectively. $h_t$ is the hidden state at time t, and $\tilde{h}_t$ is the candidate hidden state. W and b are the weights and biases associated with the respective gates and hidden state updates. The symbol * denotes element-wise multiplication, and σ represents the sigmoid activation function.
GRUs provide an efficient alternative to LSTM and are particularly useful in modeling sequences where LSTM's complex structure may not be necessary.
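For comparison with the LSTM, a single GRU time step can be sketched in NumPy as follows. References differ on whether the update gate weights the previous or the candidate hidden state; this sketch assumes the update gate weights the previous hidden state, and the weights are random placeholders rather than trained values.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x_t, h_prev, W, b):
    """One GRU time step (assumed convention: z weights the previous state)."""
    v = np.concatenate([h_prev, x_t])
    z = sigmoid(W["z"] @ v + b["z"])              # update gate
    r = sigmoid(W["r"] @ v + b["r"])              # reset gate
    v_reset = np.concatenate([r * h_prev, x_t])   # reset applied to the past state
    h_tilde = np.tanh(W["h"] @ v_reset + b["h"])  # candidate hidden state
    return z * h_prev + (1.0 - z) * h_tilde

rng = np.random.default_rng(1)
hidden_size, input_size = 4, 2
W = {k: rng.normal(scale=0.1, size=(hidden_size, hidden_size + input_size)) for k in "zrh"}
b = {k: np.zeros(hidden_size) for k in "zrh"}

h_new = gru_step(rng.normal(size=input_size), np.zeros(hidden_size), W, b)
print("hidden state:", h_new)
```

The GRU carries a single state vector and uses three weight matrices instead of the four used by the LSTM, which is the simplification referred to above.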
We also need to consider the hardware realization of the two activation functions used here, tanh and sigmoid. On classical hardware, the tanh and sigmoid functions are computed directly, introducing the nonlinearity in neural networks that is crucial for learning complex patterns:

$$\tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}},$$
$$\sigma(x) = \frac{1}{1 + e^{-x}}.$$

Classical computers execute these operations efficiently. Quantum hardware, however, operates through linear (unitary) transformations of the quantum state, which makes direct computation of the tanh and sigmoid functions non-trivial. The encoding and variational layers in the VQCs allow the quantum-based models to capture nonlinear trends across data inputs. Additionally, the measurement layer outputs values within the range [−1, 1], which can then be processed classically to implement these nonlinear functions.

Quantum Long Short-Term Memory (QLSTM)
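Quantum Long Short-Term Memory (QLSTM) is a quantum-enhanced version of the traditional LSTM network. QLSTM integrates VQCs into the LSTM architecture, aiming to leverage the computational advantages of quantum mechanics. The structure of a single QLSTM unit is shown in Figure 4. In a QLSTM, two key memory components are present: the hidden state $h_t$ and the cell or internal state $c_t$. The functioning of a QLSTM cell can be mathematically described by the following equations:

$$v_t = [h_{t-1}, x_t] \tag{26}$$
$$f_t = \sigma\left(\mathrm{VQC}_1(v_t)\right) \tag{27}$$
$$i_t = \sigma\left(\mathrm{VQC}_2(v_t)\right) \tag{28}$$
$$\tilde{C}_t = \tanh\left(\mathrm{VQC}_3(v_t)\right) \tag{29}$$
$$c_t = f_t * c_{t-1} + i_t * \tilde{C}_t \tag{30}$$
$$o_t = \sigma\left(\mathrm{VQC}_4(v_t)\right) \tag{31}$$
$$h_t = \mathrm{VQC}_5\left(o_t * \tanh(c_t)\right) \tag{32}$$
$$y_t = \mathrm{VQC}_6(h_t) \tag{33}$$

In Equations (26)-(33), σ represents the sigmoid function, and * denotes element-wise multiplication. The input to the QLSTM cell at each time step is the concatenation $v_t$ of the previous hidden state $h_{t-1}$ and the current input vector $x_t$. The VQCs mentioned in the equations refer to Variational Quantum Circuits.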

Quantum Gated Recurrent Unit (QGRU)
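The Quantum Gated Recurrent Unit (QGRU) extends the traditional GRU network by integrating it with VQCs. The structure of a single QGRU unit is shown in Figure 5. A QGRU cell operates based on the following equations:

$$r_t = \sigma\left(\mathrm{VQC}_1(v_t)\right) \tag{34}$$
$$z_t = \sigma\left(\mathrm{VQC}_2(v_t)\right) \tag{35}$$
$$o_t = [x_t, r_t * H_{t-1}] \tag{36}$$
$$\tilde{H}_t = \tanh\left(\mathrm{VQC}_3(o_t)\right) \tag{37}$$
$$H_t = z_t * H_{t-1} + (1 - z_t) * \tilde{H}_t \tag{38}$$
$$y_t = \mathrm{VQC}_4(H_t) \tag{39}$$

In Equations (34)-(39), $r_t$ and $z_t$ represent the reset and update gates of the QGRU at time t, respectively. $H_t$ denotes the hidden state, and $\tilde{H}_t$ is the candidate hidden state. The input to the QGRU cell, $v_t$, is the concatenation of the previous hidden state $H_{t-1}$ and the current input vector $x_t$. The VQCs take the place of the classical weight-and-bias transformations in the corresponding GRU equations.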

Hyperparameter Configuration
For the numerical experiments conducted in this study, specific hyperparameters were chosen to optimize the performance of the quantum models. Table 1 outlines the hyperparameter configuration used in the Van der Pol oscillator simulation and the simulation of two coupled damped harmonic oscillators. More specifically, we use a single-layer model with hidden size 4, and the models are trained for 100 epochs. We use the default.qubit backend of PennyLane [16].

Van der Pol Oscillator Simulation
In the first experiment, we consider the Van der Pol oscillator, a classical example of a non-conservative oscillator with nonlinear damping. The oscillator is modeled by the following second-order differential equation:

$$\frac{d^2x}{dt^2} - \mu\left(1 - x^2\right)\frac{dx}{dt} + x = 0.$$

The parameter µ, representing the nonlinearity and strength of the damping, is set to 1.0 in our simulations. This choice of µ provides a balance between linear and nonlinear dynamical behaviors, which makes the system well suited for testing predictions across a range of oscillatory patterns.
The initial conditions for the oscillator are chosen as $x_0 = 2.0$ and $y_0 = 0.0$, where $x_0$ and $y_0$ represent the initial position and velocity, respectively.
To numerically solve the Van der Pol differential equation, we convert it into a system of first-order equations:

$$\frac{dx}{dt} = y, \qquad \frac{dy}{dt} = \mu\left(1 - x^2\right)y - x.$$

The numerical solution is obtained over a time span of 0 to 50 s, discretized into 250 time steps.
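A data-generation step of this kind can be sketched with a classical fourth-order Runge-Kutta integrator. The solver actually used is not stated here, so the RK4 scheme, the number of sub-steps per output interval, and the helper names below are all assumptions made for illustration; only µ = 1.0, the initial state (2.0, 0.0), the time span, and the 250 output points come from the text.

```python
import numpy as np

MU = 1.0  # nonlinearity and damping strength, as in the text

def van_der_pol(state, mu=MU):
    """Right-hand side of the first-order Van der Pol system."""
    x, y = state
    return np.array([y, mu * (1.0 - x ** 2) * y - x])

def rk4_step(f, state, dt):
    k1 = f(state)
    k2 = f(state + 0.5 * dt * k1)
    k3 = f(state + 0.5 * dt * k2)
    k4 = f(state + dt * k3)
    return state + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0

def solve(f, state0, t_grid, substeps=10):
    """Integrate f and record the state at every point of t_grid."""
    state = np.array(state0, dtype=float)
    traj = np.empty((len(t_grid), len(state)))
    traj[0] = state
    for n in range(1, len(t_grid)):
        dt = (t_grid[n] - t_grid[n - 1]) / substeps
        for _ in range(substeps):
            state = rk4_step(f, state, dt)
        traj[n] = state
    return traj

t = np.linspace(0.0, 50.0, 250)
trajectory = solve(van_der_pol, [2.0, 0.0], t)
print(trajectory.shape)  # one (x, y) pair per time point
```

The resulting array would then be normalized and windowed before being fed to the recurrent models.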
Following the numerical solution of the differential equations, the resultant time series data of x and y are normalized and prepared for analysis.The data are reshaped to fit the input requirements of RNN models, enabling us to predict the Van der Pol oscillator's dynamics.

Simulation Results for the Van der Pol Oscillator
The behavior of the Van der Pol oscillator was examined through the LSTM, QLSTM, GRU, and QGRU models by evaluating their performance on the Mean Absolute Error (MAE) and Root Mean Square Error (RMSE) metrics.
Mean Absolute Error (MAE) is a measure of errors between paired observations. It is calculated as the average of the absolute differences between the predicted values and the actual values, disregarding the direction of the error. The MAE is given by the formula:

$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - \hat{y}_i\right|.$$

Root Mean Square Error (RMSE) is a quadratic scoring rule that also measures the average magnitude of the error. It is the square root of the average of squared differences between prediction and actual observation. The RMSE is given by the formula:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2},$$

where:
• n is the number of observations;
• $y_i$ is the actual value of the ith observation;
• $\hat{y}_i$ is the predicted value of the ith observation.
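Both metrics are straightforward to compute. The sketch below implements them with NumPy on a small made-up example; the sample values are illustrative and are not results from the experiments.

```python
import numpy as np

def mae(y_true, y_pred):
    """Mean absolute error between actual and predicted values."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.mean(np.abs(y_true - y_pred)))

def rmse(y_true, y_pred):
    """Root mean square error between actual and predicted values."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

# Illustrative values only.
actual = [1.0, 2.0, 3.0]
predicted = [1.5, 2.0, 2.0]
print("MAE :", mae(actual, predicted))
print("RMSE:", rmse(actual, predicted))
```

Because the RMSE squares the errors before averaging, it is never smaller than the MAE and penalizes large individual errors more heavily.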
Data from the numerical solution of the Van der Pol differential equations formed the basis for training and testing these models. A comparative analysis of the models' performance is summarized in Tables 2 and 3. Table 2 shows that the quantum-based models exhibit superior predictive performance on the x value compared with the classical models. Among the models compared, the GRU models also outperform the LSTM models in accuracy, and the QGRU model gives the most accurate predictions.
Table 3 compares model performance on the y value. Notably, the models produced larger errors when predicting the y values than when predicting the x values. Moreover, the GRU models surpass the LSTM models, with the QGRU model achieving the highest accuracy among the evaluated models.
Example predictions of the different models over the epochs are shown in Figures 6-10. The Van der Pol oscillator exhibits periodic behavior, and all the models capture the patterns well. The comparison at epoch 5 shows that the QRNN models learn significantly faster than the classical RNN models. Again, the QGRU model shows great capability in learning the dynamics.
Comparing the training and test losses over the epochs, the QRNN models show a more stable decrease than the classical RNN models, and the QGRU converges faster than the QLSTM.

Two Coupled Damped Harmonic Oscillators Simulation
In this experiment, we study the capability of the QRNN models to learn the patterns in a system of two coupled harmonic oscillators. It is worth pointing out that this experiment is an extension of the prediction of a single damped harmonic oscillator by Chen [34,35]. It extends our understanding from a single oscillator to physical systems with interactions between oscillatory motions. Such systems are important in numerous fields, including physics and engineering. The governing differential equations for two coupled damped harmonic oscillators, representing an extended model of the single oscillator, are as follows: Here, $\theta_1$ and $\theta_2$ represent the angular displacements of the first and second pendulums, respectively. The constants $g = 9.81$ m/s$^2$ (gravitational acceleration), $b_1 = b_2 = 0.15$ (damping factors), $l_1 = l_2 = 1.0$ m (lengths of the pendulums), $m_1 = m_2 = 1.0$ kg (masses of the pendulums), and $k_c = 0.05$ (coupling constant) define the system's characteristics. The initial conditions are set with angular displacements $\theta_1 = \theta_2 = 0$ and angular velocities $\dot{\theta}_1 = 3.0$ rad/s and $\dot{\theta}_2 = 0.0$ rad/s.

Simulation Results for the Two Coupled Damped Harmonic Oscillators
As in the Van der Pol example, we present a comparative analysis of the models' performance in predicting the dynamic behavior of both oscillators (again, with MAE and RMSE), summarized in Tables 4 and 5. As in the first experiment, all the models successfully learn the dynamics, and the QGRU shows excellent predictive results, especially for Oscillator 2. We note, first, that the predictions for Oscillator 2 are more accurate than those for Oscillator 1. Second, the GRU models outperform the LSTM models in general. Similar to the Van der Pol case, the GRU models converge faster than the LSTM models, and the quantum-based models learn the pattern faster than the classical models (as can be seen by comparing the results at epoch 5). The quantum-based RNNs also show a more stable decrease in loss. Here, we observe several points of interest. Firstly, there is a minor increase in the test loss of the LSTM models from epoch 80 to 100, potentially indicative of mild overfitting. Although adjusting the learning rate could mitigate this, we maintain a constant learning rate to keep the comparison between the models straightforward. Secondly, spikes are observed in the LSTM model's test loss. Furthermore, at epoch 5, the LSTM model exhibits an undershoot in performance, which is not seen in the results of the other models.
In the previous two numerical experiments, the QRNN models demonstrate exceptional capability in learning the patterns of the system dynamics. In the last experiment, we apply the models to a chaotic system using a different approach.

System of Lorenz Equations
The Lorenz equations, fundamental in chaos theory, model the dynamics of atmospheric convection and are characterized by their chaotic nature for certain parameter values. The system is described by the following set of differential equations:

$$\frac{dx}{dt} = \sigma(y - x), \qquad \frac{dy}{dt} = x(\rho - z) - y, \qquad \frac{dz}{dt} = xy - \beta z. \tag{45}$$

In Equation (45), x, y, and z represent the system states, and the parameters σ, ρ, and β are crucial for the system's behavior. The chosen values σ = 10.0, ρ = 28.0, and β = 8/3 are known to induce chaotic dynamics in the Lorenz system. These parameters represent the Prandtl number, the normalized Rayleigh number, and certain physical dimensions of the convective cells, respectively. Their specific values make the system a classic example of chaos, and thus a compelling subject for studying the predictive capabilities of quantum-enhanced and classical neural networks. In particular, the use of classical neural networks for this system is discussed in [49].
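A trajectory of the Lorenz system for these parameter values can be generated with a few lines of NumPy. The sketch below uses a fourth-order Runge-Kutta step; the integrator, the step size of 0.01, the number of steps, and the initial state (1, 1, 1) are assumptions chosen for illustration, since they are not specified in the text.

```python
import numpy as np

SIGMA, RHO, BETA = 10.0, 28.0, 8.0 / 3.0  # parameter values from the text

def lorenz(state, sigma=SIGMA, rho=RHO, beta=BETA):
    """Right-hand side of the Lorenz system."""
    x, y, z = state
    return np.array([sigma * (y - x), x * (rho - z) - y, x * y - beta * z])

def rk4_step(f, state, dt):
    k1 = f(state)
    k2 = f(state + 0.5 * dt * k1)
    k3 = f(state + 0.5 * dt * k2)
    k4 = f(state + dt * k3)
    return state + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0

def simulate(state0, dt=0.01, n_steps=1000):
    """Return an array of shape (n_steps + 1, 3) holding x, y, z over time."""
    traj = np.empty((n_steps + 1, 3))
    traj[0] = state0
    for n in range(n_steps):
        traj[n + 1] = rk4_step(lorenz, traj[n], dt)
    return traj

trajectory = simulate([1.0, 1.0, 1.0])  # assumed initial state
print(trajectory.shape)
```

Because the system is chaotic, trajectories started from nearby initial states diverge quickly, which is what makes multi-step prediction difficult for any model.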

Data Preparation and Hyperparameter Configuration
Unlike the previous two experiments, we generated the data differently when applying the RNN models to the Lorenz equations. We first randomly generated 10 datasets; we used 67% of them for training and the remaining 33% for testing. The shapes of the training and test datasets are as follows:
• Training dataset shapes:
  - Features: torch.Size([3120, 10, 3]);
  - Labels: torch.Size([3120, 3]).
Regarding the hyperparameters, the previous two experiments demonstrate that the QRNN models rapidly capture the trend of the dataset. Consequently, we aim to explore these models' proficiency in learning features with shorter sequence lengths, fewer data points, and fewer epochs.
We pick a sequence length of 10 for all the models, and all the models are trained for 20 epochs instead of the 100 epochs used in the previous two experiments.
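The feature and label shapes quoted above are consistent with a sliding-window construction, in which each sample contains 10 consecutive three-dimensional states and the label is the state that follows. The sketch below illustrates that construction on placeholder data; the helper name `make_windows` and the one-step-ahead labeling are assumptions, not a description of the original preprocessing code.

```python
import numpy as np

def make_windows(series, seq_len=10):
    """Split a (T, d) series into (T - seq_len, seq_len, d) inputs and next-step labels."""
    series = np.asarray(series, dtype=float)
    n_samples = len(series) - seq_len
    features = np.stack([series[i:i + seq_len] for i in range(n_samples)])
    labels = series[seq_len:]
    return features, labels

# Placeholder series with 100 time steps and 3 state variables (x, y, z).
demo_series = np.arange(300, dtype=float).reshape(100, 3)
X, y = make_windows(demo_series, seq_len=10)
print(X.shape, y.shape)
```

Arrays produced this way could be converted with `torch.from_numpy` before training, giving tensors with the same layout as the shapes listed above.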

Results Analysis
The performance of each model on the Lorenz system data is summarized in Tables 7 and 8. These tables detail the Mean Absolute Error (MAE) and Root Mean Square Error (RMSE) for training and testing datasets across three dimensions.
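For reference, the two metrics can be computed separately for each state variable as in the following sketch. The function name and the toy arrays are illustrative assumptions and are not taken from the study's code.

```python
import numpy as np


def mae_rmse_per_dimension(y_true, y_pred):
    """Return (MAE, RMSE) arrays with one entry per column (here x, y, z)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    errors = y_pred - y_true
    mae = np.mean(np.abs(errors), axis=0)
    rmse = np.sqrt(np.mean(errors ** 2, axis=0))
    return mae, rmse


# Tiny hand-checkable example: two samples, three state variables.
y_true = np.array([[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]])
y_pred = np.array([[1.0, 3.0, 0.0], [-1.0, 4.0, 0.0]])
mae, rmse = mae_rmse_per_dimension(y_true, y_pred)
```

Because RMSE squares the errors before averaging, it is never smaller than MAE for the same predictions and penalizes occasional large deviations, such as a missed spike, more heavily.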
According to the error metrics, quantum-based RNNs exhibit a marked increase in accuracy, particularly the QGRU model in comparison to the classical GRU. Moreover, the LSTM model performs notably worse than the other models, potentially because of its slower rate of convergence across epochs (again, only 20 epochs were used in this case). To further present the models' predictions, we visualize their predictive outcomes across three dimensions, the trajectories, and the losses over the epochs in Figures 16-24.
First, regarding the three-dimensional predictions, it is evident that all models correctly capture the dataset's trend. Nonetheless, a closer look at the figure scales reveals that all models have difficulty predicting the spike at the figure's center (LSTM in the y and z dimensions, QLSTM in z, GRU in y and z, and QGRU in z). We believe this challenge could be solved by increasing the number of epochs, though the performance is deemed satisfactory for the current study.
Furthermore, the QGRU outperforms the other models in the trajectory plots and shows a consistent decline in both training and test losses across epochs. In contrast, the classical RNNs display slight fluctuations in their loss plots. Although a spike is observed in the QLSTM's loss plot, quantum-based RNNs in general tend to show a smoother and more rapid reduction in losses than the classical models.

Discussion
This study explores the efficacy of quantum-enhanced models, specifically the Quantum Long Short-Term Memory (QLSTM) and the Quantum Gated Recurrent Unit (QGRU), in predicting the dynamics of systems governed by ordinary differential equations. These systems include the Van der Pol oscillator, two damped coupled oscillators, and the Lorenz system. We benchmarked the performance of these quantum models against traditional Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) models by evaluating their accuracy through MAE and RMSE, alongside observing their training and testing loss trajectories over epochs.
The results from the three numerical experiments show that quantum-based models, particularly the QGRU, significantly outperform their classical counterparts in terms of predictive accuracy. This finding highlights the capability of quantum computing techniques to improve the predictive performance of recurrent neural networks, particularly for the complex dynamics of systems represented by ordinary differential equations. However, it is important to recognize that these quantum-based models require greater computational resources, primarily because the operations of the Variational Quantum Circuit (VQC) are simulated on classical computing hardware. The same challenge is observed across numerous studies in the QRNN field, as highlighted in [50,51]. To address these computational challenges and enhance efficiency, we used a batch size of 64 in this study, which significantly reduced the processing time. Additionally, a critical aim of our ongoing research is to test the performance of these models on actual quantum hardware. This will not only confirm their theoretical benefits but also evaluate their practical utility and efficiency within a real quantum computing context.

Figure 1. An example of VQC architecture with two layers of entanglements for QLSTM and QGRU.

Figure 2. Structure of a single unit of classical LSTM.

Figure 3. Structure of a single unit of classical GRU.

Figure 4. Structure of a single unit of QLSTM.

Figure 5. Structure of a single unit of QGRU.
The example predictive results over the 100 epochs by the different models are shown in
α and β are complex numbers representing the probability amplitudes for the qubit being in states |0⟩ and |1⟩, respectively. According to Born's rule, the probabilities of finding the qubit in either state upon measurement are |α|^2 and |β|^2, with the condition that |α|^2 + |β|^2 = 1.

Table 2. Comparison of train and test MAE and RMSE for the state of x.

Table 3. Comparison of train and test MAE and RMSE for the state of y.

Table 4. Comparison of train and test MAE and RMSE for Oscillator 1.

Table 5. Comparison of train and test MAE and RMSE for Oscillator 2.

Table 6. Hyperparameter configuration for classic models and quantum models.

Table 7. Mean Absolute Error (MAE) for each model on test set.

Table 8. Root Mean Square Error (RMSE) for each model on test set.