1. Introduction
Quantum error-correcting codes (QECCs) are encoding schemes designed to protect quantum information from noise-induced errors. Encoding quantum information redundantly across multiple qubits enables error detection and correction without destroying the underlying quantum state. Ancillary qubits are used to measure the error states of the system, thereby acquiring error information while preserving the encoded quantum state. Subsequent measurement operations determine the presence and nature of errors, allowing the corresponding corrective actions to be applied. Such encoding schemes play a critical role in quantum computing [1,2,3].
QECCs encompass a diverse range of code types, including Shor codes, Steane codes, surface codes, and five-qubit codes. The color codes investigated in this paper, along with surface codes, belong to the class of geometric structure-based quantum error-correcting codes. Their primary advantage lies in the ability to simultaneously correct multiple types of quantum errors, including bit flips (X errors), phase flips (Z errors), and their combined forms (Y errors). This ability provides strong robustness against multi-error patterns. The main differences between color codes and surface codes lie in their physical implementation and structural design: surface codes typically employ two-dimensional square or hexagonal lattices, where X-stabilizer and Z-stabilizer information is stored in two distinct types of faces (or vertices), with qubits and stabilizers arranged in a checkerboard configuration. In contrast, color codes are often implemented using two- or three-dimensional colored lattices. A typical structure uses face coloring (usually a three-color scheme) to ensure that adjacent faces have different colors. Their high symmetry allows for the unified implementation of X and Z operations [4,5,6]. Although surface codes generally exhibit higher error thresholds, their substantial redundancy requirements lead to significant resource consumption. In comparison, while color codes have slightly lower threshold stability, they can simultaneously correct both phase and amplitude errors and handle certain complex error patterns. Moreover, their stabilizer arrangement makes their resource requirements more compatible with conventional quantum computing environments.
For surface codes, numerous decoders have been developed due to their relative simplicity, such as the Minimum Weight Perfect Matching (MWPM) method, the Renormalization Group, Markov Chain Monte Carlo, and various approaches utilizing belief propagation, neural networks, or hierarchical designs [7,8,9]. In contrast, decoding strategies for color codes remain relatively limited. This paper employs the Union-Find (UF) approach to detect non-trivial measurement points in color codes, using primary and secondary growth stages to identify the most probable error paths. Additionally, a Recurrent Neural Network (RNN) is employed to refine the error paths identified by UF, thereby achieving efficient decoding of color codes.
Compared with the Hard-Decision Renormalization Group (HDRG) decoder proposed by Anwar, which employs hierarchical clustering and merging strategies to decode topological quantum codes [10], this paper adopts a distinct approach to decoding color codes. Anwar’s method applies local rules to segment error syndromes and progressively merge clusters, which may introduce additional complexity and result in suboptimal decoding performance under certain noise conditions. By contrast, this work integrates the UF-based defect detection framework with a Recurrent Neural Network (RNN) to more accurately predict error chains. The UF structure efficiently identifies connected error paths, while the RNN improves decoding accuracy by learning complex error correlations beyond local recursions.
Recent advances in quantum error correction have explored various machine learning-based decoders, including Convolutional Neural Networks (CNNs), Transformer-based models, and advanced RNN variants such as bidirectional or gated architectures. CNN decoders excel at capturing spatial patterns within local syndrome regions but often struggle with long-range correlations in topological codes such as color codes. Transformer decoders offer strong global context modeling through self-attention but require significant computational resources and large datasets, limiting their scalability in resource-constrained quantum hardware environments. Previous RNN-based decoders, including Gated Recurrent Units (GRUs) and bidirectional LSTMs, have shown promise in modeling sequential syndrome evolution but often incur higher inference latency or increased parameter complexity.
The novelty of this work lies in introducing a secondary-growth strategy into the Union-Find (UF) decoding process and coupling it with a lightweight two-layer LSTM-based recurrent neural network (RNN) for path-level error refinement in color codes. Unlike conventional UF decoders, which often misidentify correlated error chains in high-noise regimes, the proposed Neural-Guided Union-Find (NGUF) decoder leverages temporal learning to selectively optimize only the ambiguous paths suggested by UF clustering. This targeted enhancement preserves the near-linear time complexity and low space overhead of classical UF decoding while significantly improving decoding accuracy and modestly increasing the decoding threshold. Overall, the contributions of this work include the following: (i) proposing a hybrid UF–RNN decoding framework with a novel secondary-growth mechanism tailored to the structural properties of color codes, and (ii) achieving a favorable balance between decoding accuracy and computational efficiency, making it suitable for real-time and resource-constrained quantum computing.
The structure of this paper is organized as follows. Section 2 systematically overviews color codes, the Union-Find (UF) algorithm, and related background knowledge. Section 3 elaborates on the construction logic and core principles of the joint decoding framework combining the UF algorithm and Recurrent Neural Network (RNN) designed specifically for color codes. Section 4 presents an analysis and discussion based on the simulation results, compares the joint UF–RNN decoding of color codes with standalone UF decoding, and examines their respective advantages, disadvantages, and underlying causes. Finally, Section 5 summarizes the conclusions of this paper.
3. Hybrid Decoding Architecture Design
3.1. Cluster Growth Dynamics in 2D Color Codes
The growth process in the proposed decoder adopts a two-stage strategy. The first stage, primary growth, begins with each detected non-trivial measurement point expanding uniformly in all directions, thereby forming the initial error clusters. This is followed by secondary growth, a targeted expansion applied to clusters that remain statistically correlated after the primary stage. This additional step is designed to capture latent error connections that might otherwise be overlooked, thereby reducing the risk of under-connecting correlated errors—an inherent limitation of conventional Union–Find decoders, particularly under high-noise conditions.
An illustrative example is shown in the top panel of Figure 2, which depicts the stabilizer verification for the X-check graph in a color code. Yellow circles indicate non-trivial measurement points. The decoding process begins by identifying these points and initiating outward growth from each, starting from data qubits adjacent to the non-trivial checks and extending to neighboring check qubits. Growth from all non-trivial measurement points occurs simultaneously, and after the first growth cycle, distinct error clusters emerge. Steps 1, 2, and 3 in Figure 2 provide an example of the stabilizer verification process. Mathematically, a cluster is a connected subgraph of G. A cluster is active when its defects cannot be corrected using only the edges within the cluster.
Based on the experimental observations in Step 1 of Figure 2, the error clusters located in the lower-left and lower-right regions terminate their growth upon reaching the virtual boundary conditions. In contrast, the three initial error clusters positioned at the top proceed to a second growth phase, driven by the risk of statistical correlation. This situation arises when the minimum error-chain length required to connect two root node sets falls below a defined threshold [33]. The threshold is determined to ensure that no residual correlation remains between the two spanning trees during Union–Find decoding, and is calculated as follows:
As shown in Step 2 of Figure 2, the second growth phase causes the three initial error clusters to merge into a single connected cluster via expansion paths, where orange segments indicate the cluster boundaries formed during the first growth stage and purple segments denote the newly generated expansion paths from the second stage. Upon termination of the growth process, only the effective paths that connect distinct error clusters are preserved, while redundant connections generated by multiple growth cycles are pruned. After the second expansion stage, the merged upper cluster continues to grow until it reaches the virtual boundary, thereby satisfying the stopping condition [34,35]. As illustrated in Step 3 of Figure 2, the red-marked segments represent the effective connection paths retained after multiple rounds of growth. Notably, once clusters are merged, each large cluster corresponds to a correlated set of errors. Due to the structural properties of the color code lattice (e.g., the high connectivity of hexagonal lattices), multiple potential connection paths may exist within the same cluster. At this stage, the minimum-weight path (typically defined by path length or the inverse of the associated error probability) is selected, and all redundant paths are discarded. This marks the completion of the correction operator verification process.
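The two-stage growth described above can be sketched in a few lines of Python. This is a simplified illustration, not the implementation used in this work: it grows clusters by whole edges rather than half-edges, treats a cluster as active when it holds an odd number of defects and does not touch the boundary, and uses a hypothetical `threshold` parameter in place of the threshold formula referenced in the text. All function names are assumptions made for the example.

```python
from collections import deque


class DisjointSet:
    """Union-find with path halving and union by size."""

    def __init__(self, items):
        self.parent = {x: x for x in items}
        self.size = {x: 1 for x in items}

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return
        if self.size[ra] < self.size[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra
        self.size[ra] += self.size[rb]


def clusters(dsu, grown):
    """Group the grown vertices by their union-find root."""
    groups = {}
    for v in grown:
        groups.setdefault(dsu.find(v), set()).add(v)
    return list(groups.values())


def is_active(cluster, defects, boundary):
    """Assumed rule: odd defect count and no contact with the boundary."""
    odd = sum(v in defects for v in cluster) % 2 == 1
    return odd and not (cluster & boundary)


def primary_growth(adj, defects, boundary):
    """Grow every active cluster by one edge per round until none is active."""
    dsu = DisjointSet(adj)
    grown = set(defects)
    while True:
        active = [c for c in clusters(dsu, grown)
                  if is_active(c, defects, boundary)]
        changed = False
        for cluster in active:
            for v in cluster:
                for w in adj[v]:
                    if w not in grown or dsu.find(v) != dsu.find(w):
                        grown.add(w)
                        dsu.union(v, w)
                        changed = True
        if not changed:  # nothing active, or nothing left to grow into
            return dsu, grown


def shortest_link(adj, src_set, dst_set):
    """Multi-source BFS: shortest vertex path from src_set to dst_set."""
    prev = {v: None for v in src_set}
    queue = deque(src_set)
    while queue:
        v = queue.popleft()
        if v in dst_set:
            path = []
            while v is not None:
                path.append(v)
                v = prev[v]
            return path[::-1]
        for w in adj[v]:
            if w not in prev:
                prev[w] = v
                queue.append(w)
    return None


def secondary_growth(adj, dsu, grown, threshold):
    """Merge clusters whose shortest connecting chain is within `threshold` edges."""
    groups = clusters(dsu, grown)
    for i, a in enumerate(groups):
        for b in groups[i + 1:]:
            path = shortest_link(adj, a, b)
            if path is not None and len(path) - 1 <= threshold:
                for u, v in zip(path, path[1:]):
                    grown.update((u, v))
                    dsu.union(u, v)
    return clusters(dsu, grown)
```

On a toy line graph with defects at vertices 1, 2, 5, and 6, the primary stage yields two neutral clusters, and the secondary stage merges them only if the threshold admits the single connecting edge.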
3.2. Cluster Stripping Dynamics in 2D Color Codes
In the implementation of the peeling decoder, the definition of leaf nodes in spanning trees must first be clarified: a spanning tree is an undirected, connected, acyclic graph, and its leaf nodes are the vertices of degree 1 (i.e., terminal vertices connected by only one edge) [36,37,38]. The decoding process is as follows: select any leaf node in the spanning tree as the initial root node (if multiple spanning trees exist, choose one for processing). Starting from the root node, leaf nodes are peeled layer by layer, with two cases distinguished by the measurement state of the leaf node.
Non-Trivial Measurement Leaf Node: If the current leaf node is a non-trivial measurement point, record the information of its adjacent edge (the two qubits connected by this edge are potential Pauli error candidate positions). Additionally, flip the measurement states of this non-trivial measurement point and its parent node (i.e., converting the non-trivial measurement point to a trivial measurement point, and the original trivial measurement point of the parent node to a non-trivial measurement point).
Trivial Measurement Leaf Node: If the current leaf node is a trivial measurement point, directly perform the peeling operation to remove the connecting edge between the leaf node and its parent node.
Repeat the above peeling operations until the spanning tree is completely peeled down to only a single vertex, completing the decoding process. For non-terminal nodes, the expected value depends on the type of step s1: if s1 is a stripping step, the relevant quantity is determined by the stripping model; if s1 is a growth step, it is determined by choosing the best place to grow.
Figure 3 illustrates the peeling process for syndrome verification. The left diagram depicts the peeling of the spanning trees, which are drawn in distinct colors: green edges are those attached to leaves, while brown edges represent the trunk of the tree. Arrows indicate the peeling direction from the tree root to the tree leaves, and the error path after peeling is presented in the right diagram.
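The two leaf-node cases above can be sketched as a short Python routine. This is a minimal illustration under stated assumptions: the spanning tree is given as a list of edges, the root is passed in explicitly, and each recorded edge stands in for a candidate Pauli error location. The function name `peel` and its signature are hypothetical.

```python
def peel(tree_edges, syndrome, root):
    """Peel a spanning tree leaf by leaf toward `root`.

    Returns the recorded edges and whatever syndrome remains at the end.
    """
    syndrome = set(syndrome)
    adj = {}
    for u, v in tree_edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    correction = []
    leaves = [v for v in adj if len(adj[v]) == 1 and v != root]
    while leaves:
        leaf = leaves.pop()
        if len(adj[leaf]) != 1:
            continue  # already detached from the tree
        (parent,) = adj[leaf]
        if leaf in syndrome:
            # Non-trivial leaf: record its edge, then flip leaf and parent.
            correction.append((leaf, parent))
            syndrome.discard(leaf)
            syndrome ^= {parent}
        # In both cases the edge between leaf and parent is removed.
        adj[parent].discard(leaf)
        adj[leaf].clear()
        if len(adj[parent]) == 1 and parent != root:
            leaves.append(parent)
    return correction, syndrome
```

For a four-vertex path `0-1-2-3` rooted at 0 with non-trivial points at 0 and 3, the routine records all three edges and leaves no residual syndrome; with non-trivial points at 1 and 2 it records only the edge between them.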
3.3. RNN-Enhanced Error Pattern Recognition
In Sections 3.1 and 3.2, we discussed the application of the Union-Find (UF) algorithm to predict error paths in color codes, which involves peeling from the leaf nodes of the spanning tree, processing the measurement results, and identifying potential paths of Pauli errors. However, relying solely on the UF decoder has significant limitations: its merging performance is highly dependent on the definition of “edges” (e.g., the weights of nearest-neighbor edges), which may lead to failure in accurate error correction when the error density exceeds half the code distance. In contrast, neural networks can dynamically generate superior edge weights or connection rules for UF by learning from error data. Detailed steps are presented in Section 3.4, and the complete error correction process is illustrated in Figure 4.
We utilize a Recurrent Neural Network (RNN), which effectively processes qubit information in color codes. Its memory helps preserve information between processing rounds, bringing the output closer to the ideal. The RNN was chosen for its ability to decode efficiently without prior knowledge of the number of iterations. In this work, we adopt an RNN architecture incorporating long short-term memory (LSTM) layers. LSTM is a specialized RNN model: whereas the repeating module of a conventional RNN has a very simple structure, LSTM replaces the single neural layer with four interacting components. It relies on three distinct gating mechanisms: the forget gate, the input gate, and the output gate. These gates regulate the retention and transmission of information within the LSTM, ultimately influencing the cell state and output signals [39,40,41], as illustrated in Figure 5.
In the first step of LSTM, the system needs to determine which information to discard from the cell state, a decision made by the “forget gate” structure. Specifically, the forget gate takes the output from the previous time step (i.e., the error probability of qubits containing Pauli errors) and the input at the current time step (i.e., the error path) as inputs. After processing through an activation function (here, a sigmoid non-linear mapping), it outputs the forget gate value f(t) [42]. Since the output f(t) of the sigmoid function has the range [0, 1], this value directly characterizes the forgetting probability of the hidden cell state from the previous layer. Mathematically, this can be expressed as follows:
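In the standard LSTM formulation, the forget gate takes the form below. The symbols $h(t-1)$ (output of the previous time step), $x(t)$ (input at the current time step), the weight matrices $W_f$ and $U_f$, and the bias $b_f$ are notation assumed here for illustration; the text itself names only $f(t)$.

```latex
f(t) = \sigma\bigl(W_f\,h(t-1) + U_f\,x(t) + b_f\bigr),
\qquad
\sigma(z) = \frac{1}{1 + e^{-z}}
```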
Next, the input gate determines what new information to store in the cell state; in this experiment, this new information specifically refers to newly added error path information. The operation of the input gate involves two components: the first component generates the output i(t) through a sigmoid activation function, and the second component generates the output a(t) through a tanh activation function; ultimately, the update to the cell state is achieved through the product of the two [43]. Mathematically, this can be expressed as follows:
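In the standard LSTM formulation, the two input-gate components are written as below. As an assumption for illustration, $h(t-1)$ denotes the output of the previous time step, $x(t)$ the current input, and $W_i, U_i, b_i$ and $W_a, U_a, b_a$ the corresponding weights and biases; only $i(t)$ and $a(t)$ are named in the text.

```latex
\begin{aligned}
i(t) &= \sigma\bigl(W_i\,h(t-1) + U_i\,x(t) + b_i\bigr), \\
a(t) &= \tanh\bigl(W_a\,h(t-1) + U_a\,x(t) + b_a\bigr)
\end{aligned}
```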
Before entering the output gate phase, it is necessary to compute the new cell state of the LSTM (the memory state after incorporating new error path information) [44]. At this stage, the results of the forget gate and input gate collectively act on the cell state C(t), which is composed of two parts: the first part is the product of the previous cell state C(t−1) and the forget gate output f(t), and the second part is the product of the input gate outputs i(t) and a(t). That is,
$$C(t) = C(t-1) \odot f(t) + i(t) \odot a(t),$$
where $\odot$ denotes the element-wise product.
Finally, the output value of the LSTM needs to be determined. This output value is based on the current cell state: first, the sigmoid layer determines which part of the cell state should be output; subsequently, the cell state undergoes tanh processing (mapped to the [−1, 1] range), and the processed value is multiplied by the output of the sigmoid gate. The final output is this product [45]. That is,
$$h(t) = o(t) \odot \tanh\bigl(C(t)\bigr),$$
where $o(t)$ denotes the output of the sigmoid gate and $h(t)$ the output of the LSTM at time step t.
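The four steps described in this section (forget gate, input gate, cell-state update, output) combine into a single time step, sketched below in plain Python with lists instead of tensors. The parameter layout (a dictionary mapping each gate name to weights `W`, `U` and bias `b`) and the function names are assumptions made for this illustration, not the implementation used in the experiments.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(W, U, b, h_prev, x):
    """Row-wise W @ h_prev + U @ x + b for plain Python lists."""
    return [
        sum(w * h for w, h in zip(W_row, h_prev))
        + sum(u * xi for u, xi in zip(U_row, x))
        + b_k
        for W_row, U_row, b_k in zip(W, U, b)
    ]


def lstm_step(params, h_prev, c_prev, x):
    """One LSTM time step; `params` maps 'f', 'i', 'a', 'o' to (W, U, b)."""
    f = [sigmoid(z) for z in affine(*params["f"], h_prev, x)]    # forget gate
    i = [sigmoid(z) for z in affine(*params["i"], h_prev, x)]    # input gate
    a = [math.tanh(z) for z in affine(*params["a"], h_prev, x)]  # candidate
    o = [sigmoid(z) for z in affine(*params["o"], h_prev, x)]    # output gate
    # New cell state: keep part of the old memory, add the gated candidate.
    c = [cp * fk + ik * ak for cp, fk, ik, ak in zip(c_prev, f, i, a)]
    # Output: squash the cell state and gate it.
    h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
    return h, c
```

With all weights and biases set to zero, every sigmoid gate evaluates to 0.5 and the candidate to 0, so the cell state is simply halved at each step, which makes the role of the forget gate easy to check by hand.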
3.4. Training Dataset and Learning Methodology
To improve decoding precision beyond classical Union-Find (UF) decoding, we developed a Recurrent Neural Network (RNN)-based module that infers fine-grained qubit-level error probabilities along candidate paths extracted from the UF decoder. This module adopts a two-layer Long Short-Term Memory (LSTM) architecture, with each layer comprising 128 hidden units, effectively capturing sequential syndrome dependencies and enabling the memory of long-range correlations. A dropout rate of 0.2 is applied between layers to mitigate overfitting. The final hidden state is passed through a fully connected linear output layer with a sigmoid activation to produce error probabilities for each qubit in the range [0, 1].
The model was trained using the Adam optimizer (with hyperparameters $\beta_1$ and $\beta_2$) and an initial learning rate of 0.001. We employed cosine annealing to decay the learning rate by 10% every 10 epochs. Training proceeds for a maximum of 200 epochs or is terminated early if the validation loss plateaus for 20 consecutive epochs. A batch size of 40 is used for stable and efficient gradient descent.
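The schedule is described both as cosine annealing and as a 10% reduction every 10 epochs, which are different rules. The sketch below therefore shows both readings side by side, together with the 20-epoch early-stopping check. The function names, the `min_lr` floor, and the use of 200 epochs as the annealing period are assumptions made for illustration.

```python
import math


def step_decay_lr(epoch, base_lr=1e-3, drop=0.10, every=10):
    """Reading 1: multiply the rate by (1 - drop) once every `every` epochs."""
    return base_lr * (1.0 - drop) ** (epoch // every)


def cosine_annealing_lr(epoch, base_lr=1e-3, min_lr=0.0, total_epochs=200):
    """Reading 2: standard cosine annealing from base_lr down to min_lr."""
    cos = math.cos(math.pi * epoch / total_epochs)
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + cos)


def should_stop(val_losses, patience=20):
    """True once the best validation loss is at least `patience` epochs old."""
    if len(val_losses) <= patience:
        return False
    best_epoch = min(range(len(val_losses)), key=val_losses.__getitem__)
    return len(val_losses) - 1 - best_epoch >= patience
```

Under the step-decay reading the rate after 25 epochs is 0.001 × 0.9² = 0.00081, whereas cosine annealing over 200 epochs reaches half the initial rate at epoch 100.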
Training labels are generated using a depolarizing noise model applied to various color code lattices. Each decoding instance simulates a complete error injection and syndrome extraction cycle, from which we can extract the ground-truth Pauli error positions. The qubits along UF-generated candidate paths are then assigned real-valued error probability labels as regression targets. Crucially, model correctness is not defined by alignment with the UF decoder output but rather by consistency with the underlying physical error configuration.
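As a rough illustration of how ground-truth labels can be derived from a depolarizing noise model, the sketch below draws one Pauli per qubit and marks which qubits on a candidate path actually carry an error. It is an assumption-laden simplification: the labels here are binary indicators, and how they are turned into the real-valued regression targets mentioned above is not specified in the text. The function names are hypothetical.

```python
import random


def sample_depolarizing(num_qubits, p, rng):
    """Draw one Pauli per qubit: I with probability 1 - p, else X, Y or Z uniformly."""
    return [rng.choice("XYZ") if rng.random() < p else "I"
            for _ in range(num_qubits)]


def path_labels(errors, path):
    """Ground-truth indicator per path qubit: 1.0 if it carries an error, else 0.0."""
    return [0.0 if errors[q] == "I" else 1.0 for q in path]


# Usage: one noisy sample on 7 qubits and its labels along a hypothetical path.
rng = random.Random(0)
errors = sample_depolarizing(7, 0.1, rng)
labels = path_labels(errors, [0, 2, 5])
```

Labels built this way depend only on the injected errors, not on the UF output, which matches the requirement that correctness be judged against the physical error configuration.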
The training and evaluation datasets span color codes over a range of code distances, comprising approximately 8000 training samples and 2000 held-out validation and test samples. Each sample encodes a candidate path formed by linking non-trivial syndrome nodes via UF clustering, with input features constructed as spatial vector sequences. These samples cover a broad spectrum of error densities and cluster complexities, from sparse isolated faults to dense ambiguous regions that require deeper RNN inference.
All experiments were conducted on a high-performance computing environment comprising an Intel® Xeon® Silver 4210R CPU (Intel Corporation, Santa Clara, CA, USA), 256 GB of RAM, and an NVIDIA® RTX A6000 GPU with 48 GB VRAM (NVIDIA Corporation, Santa Clara, CA, USA).
We used PyTorch 2.0.1 with CUDA 11.8 on Python 3.10 and Ubuntu 20.04. Each complete training cycle typically takes 4–6 h, depending on code distance and path topology. Although our implementation and models are not yet publicly released, we intend to share them in a future open-source repository to promote transparency and reproducibility.
The proposed RNN decoder enhances the ability of classical decoders by dynamically refining error path predictions, particularly in overlapping or complex clusters where UF alone may fail. By learning from realistic noise statistics, the LSTM-based model significantly improves error localization and contributes to the robustness of the hybrid decoding architecture.
3.5. Sampling Strategy and RNN Decoder Architecture
In this study, we used a depolarizing error model to generate the dataset for training the RNN, with the aim of improving the prediction accuracy of qubit errors. Physical errors were simulated by inserting depolarizing noise channels into the quantum circuit. Constructing the error paths in this way and recording the true state of the data qubits ensures that the training data reflect the error distribution of qubits during operation. To sample the dataset, we simulated multiple quantum error correction cycles. In each cycle, we used the UF decoder to perform preliminary decoding on the color code and recorded the logical qubits with errors and their error paths to generate candidate error paths. However, the error locations provided by the UF decoder are not entirely accurate; thus, we further refined these paths using the RNN. The overall decoding workflow, integrating the Union-Find and RNN modules, is illustrated in Figure 6, which outlines the sequential stages from syndrome extraction and UF-based clustering to RNN-guided path refinement and final error correction.
To construct the dataset, we generated corresponding stabilizer and syndrome data for color code structures with different code distances. Figure 7 illustrates the stabilizer arrangement and path input for a 5 × 5 color code. Each error path is composed of stabilizer syndromes, and we assembled these into training sample paths based on the output of the UF decoder. For example, when an ‘X’ error is detected, the UF decoder marks non-trivial measurement points, and we then connect adjacent non-trivial measurement points along the path to form the vector input to the RNN. For more complex error clusters, we extended the path to the secondary growth region to enable the neural network to predict the state of erroneous data qubits more accurately. When the error cluster structure becomes more complex still, we expanded this to three layers, giving a multi-layer growth strategy. In each growth phase, we added connecting paths to cover broader potential error regions, ensuring that all interconnected non-trivial measurement points were included within the path. Through multi-layer growth, the RNN can capture global error information in complex paths, improving the comprehensiveness and accuracy of its predictions.
In the RNN training dataset, we generated a label for each error path, where the label represents the error probability distribution of each data qubit along the path. Each sample in the training dataset includes an error cluster path (formed by connecting non-trivial measurement points) as input and the error probability of each data qubit (a real number p between 0 and 1) as the output label. Training ran for at most 200 epochs or until the termination condition was met. In each epoch, the model performed a complete pass over all training samples, and the difference between the predicted error probability and the true label was measured using the mean squared error (MSE) loss function:
$$L_{\mathrm{MSE}} = \frac{1}{N} \sum_{i=1}^{N} \left(p_i - \hat{p}_i\right)^2,$$
where $p_i$ is the true error probability, $\hat{p}_i$ is the predicted error probability of the RNN, and N is the total number of data samples. The model adjusts its weights based on the loss value. To evaluate the training effect, we set an accuracy threshold of 0.95: the error-bit positions predicted by the RNN must match the real error positions at a rate of 0.95 or higher. We set the batch size to 40 to stabilize the gradient-descent process and speed up training. The model weights were updated using mini-batch stochastic gradient descent. The initial learning rate was 0.001, and it was reduced by 10% every 10 epochs using cosine annealing, ensuring stable convergence.
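The MSE loss and the match-rate criterion described above can be sketched as follows. The 0.5 cutoff used to turn a predicted probability into an error/no-error decision is an assumption for illustration; the text gives the 0.95 accuracy target but not the decision rule. Function names are hypothetical.

```python
def mse_loss(p_true, p_pred):
    """Mean squared error between true and predicted error probabilities."""
    if not p_true or len(p_true) != len(p_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - q) ** 2 for t, q in zip(p_true, p_pred)) / len(p_true)


def match_rate(p_true, p_pred, cutoff=0.5):
    """Fraction of qubits whose thresholded prediction agrees with the label."""
    hits = sum((t >= cutoff) == (q >= cutoff) for t, q in zip(p_true, p_pred))
    return hits / len(p_true)


def meets_target(p_true, p_pred, accuracy=0.95, max_loss=0.01):
    """Check the accuracy and loss targets stated in the text."""
    return (match_rate(p_true, p_pred) >= accuracy
            and mse_loss(p_true, p_pred) < max_loss)
```

For example, predictions `[0.9, 0.2, 0.4, 0.1]` against labels `[1.0, 0.0, 1.0, 0.0]` agree on three of four qubits, a match rate of 0.75, which falls short of the target.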
The training termination conditions were defined as follows: the model achieves a prediction accuracy of over 95% and a loss below 0.01 across multiple validation sets, or the validation loss stops decreasing for 20 consecutive epochs (early stopping). After training, the model’s predictions were compared with the output of the UF decoder. When the error path positions predicted by the RNN align with those decoded by the UF decoder, the model is considered to have effectively predicted the error locations. If the error paths from the two models are inconsistent, a backtracking mechanism is triggered: if the misjudgment arises from the premature merging of a high-weight edge, the merge of that edge is reversed, and the two corresponding clusters are split back into independent ones. The neural network then generates new merging priorities, the decoding process is re-executed, and the neural network checks the validity of the results again. If the conditions are met, backtracking terminates; if not, the backtracking mechanism is triggered again. Because decoding must run in real time, excessive backtracking may cause the decoding time to exceed the coherence time, leading to decoherence of the quantum state and offsetting the benefit of error correction. The number of backtracking rounds is therefore capped. The total time for three backtracking iterations remains controllable during decoding, so three backtracking iterations were used in the experiments. With these settings, the RNN model can accurately predict error paths, thereby improving the prediction and correction of qubit errors.
To improve reproducibility, we provide the overall pseudocode of the hybrid decoder integrating Union-Find (UF) and Recurrent Neural Network (RNN), as shown in Algorithm 1:
Algorithm 1: UF-RNN Hybrid Decoder for Color Codes
This modular approach ensures low (near-linear) time complexity during UF cluster construction and fast inference during RNN path refinement. The RNN module only activates when error clusters form long-range paths or ambiguous merges.
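The control flow described in this section (a UF pass, RNN refinement, and at most three backtracking rounds) can be sketched as a small driver function. This is a schematic under stated assumptions rather than a transcription of Algorithm 1: `uf_decode`, `rnn_refine`, and `is_consistent` are hypothetical callables supplied by the caller, and feeding the RNN probabilities back as merge priorities is one plausible reading of the backtracking step.

```python
def hybrid_decode(syndrome, uf_decode, rnn_refine, is_consistent,
                  max_backtracks=3):
    """Run UF, refine with the RNN, and backtrack at most `max_backtracks` times.

    uf_decode(syndrome, priorities) -> candidate error paths
    rnn_refine(paths)               -> per-qubit error probabilities
    is_consistent(paths, probs)     -> True when the two outputs agree
    """
    priorities = None  # None means the default UF merge order
    paths = uf_decode(syndrome, priorities)
    probs = rnn_refine(paths)
    for _ in range(max_backtracks):
        if is_consistent(paths, probs):
            break
        # Backtrack: rerun UF with merge priorities suggested by the RNN.
        priorities = probs
        paths = uf_decode(syndrome, priorities)
        probs = rnn_refine(paths)
    return paths, probs
```

Capping the loop keeps the worst case at four UF passes and four RNN inferences per decoding instance, which is what bounds the decoding time relative to the coherence time.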
5. Conclusions
A hybrid decoding framework was proposed that integrates the Union-Find (UF) algorithm with a Recurrent Neural Network (RNN) to improve the decoding performance of color codes. The experimental results demonstrate that at low physical error rates, standalone UF decoding achieves competitive accuracy with a significantly reduced resource overhead compared to surface codes. Under high-error regimes, the Neural-Guided Union-Find (NGUF) decoder provides a 4.7% improvement in decoding accuracy and increases the threshold from 0.134 to 0.1365, outperforming traditional UF decoding and approaching the performance of surface code decoders. These findings indicate that the hybrid approach effectively addresses the decoding challenges associated with the densely connected structure of color codes, such as error chain propagation and syndrome ambiguity. Beyond accuracy gains, the NGUF framework offers practical advantages for fault-tolerant quantum computing. Its lightweight RNN enables real-time inference with minimal overhead, allowing for its deployment on resource-constrained hardware. The UF backbone ensures scalability and parallelism, while the RNN refines uncertain regions without reducing throughput.
Nevertheless, our approach has limitations. Reliance on pre-trained RNNs entails a trade-off between generalization and accuracy, as parameters are tuned to specific noise models. Significant deviations in noise conditions may require retraining or fine-tuning, reducing adaptability in dynamic environments. Although latency is modest compared to more complex neural decoders, it may still be non-negligible for ultra-low-latency or high-throughput scenarios. These factors motivate further optimization of both the neural module (through lightweight architectures, pruning, and quantization) and the classical UF component, to minimize delay without sacrificing accuracy.
Future research will focus on broadening the applicability of the NGUF framework to encompass more general error models, including correlated, biased, and time-varying noise, as well as extending its support to diverse code families beyond two-dimensional color codes, such as higher-dimensional topological codes and subsystem codes. Another important direction is the exploration of hardware-efficient RNN deployment using techniques such as model pruning, quantization, and specialized accelerators, enabling low-latency inference on embedded or cryogenic control hardware. In parallel, the integration of adaptive learning strategies capable of online parameter adjustment in response to evolving noise characteristics will be crucial for maintaining robustness in dynamic environments. Collectively, these developments could pave the way for next-generation hybrid neural–classical decoders that combine scalability, resource efficiency, and adaptability, accelerating the practical realization of fault-tolerant quantum computing.