Article

Permutation-Based Trellis Optimization for a Large-Kernel Polar Code Decoding Algorithm

1 School of Internet of Things Engineering, Wuxi University of Technology, Wuxi 214121, China
2 College of Physics and Electronic Information Engineering, Zhejiang Normal University, Jinhua 321004, China
* Author to whom correspondence should be addressed.
Information 2026, 17(2), 127; https://doi.org/10.3390/info17020127
Submission received: 26 November 2025 / Revised: 26 January 2026 / Accepted: 27 January 2026 / Published: 29 January 2026
(This article belongs to the Section Information and Communications Technology)

Abstract

Compared to Arikan's $G_2$ kernel, large-kernel polar codes exhibit higher polarization rates and superior error-correction performance. The critical steps of exact successive cancellation (SC) decoding for such codes can be implemented via trellis-based computations to reduce complexity. However, the complexity remains high for large kernels. This paper proposes a permutation-based trellis optimization scheme. The approach builds on the Massey minimal trellis and reorders its time axis to find a permutation that minimizes the number of trellis edges, thereby further reducing the exact SC decoding complexity. For smaller kernels ($G_3$–$G_{12}$), an exhaustive search is conducted to identify the optimal trellis. For larger kernels ($G_{13}$–$G_{16}$), where an exhaustive search becomes infeasible due to the factorial growth of the permutation space, an ant colony optimization (ACO)-based method is employed to find a near-optimal permutation. Simulation results show that the permutation-optimized trellis drastically lowers the complexity of direct SC decoding. Furthermore, compared to the $l$-expression, the $W$-formula, and the original Massey trellis methods, it achieves reductions in multiplication operations of up to 99.2%, 58.1%, and 56.5%, respectively. The improvement is particularly beneficial for large kernels, where traditional decoding methods become computationally prohibitive.

1. Introduction

Polar codes are the first class of error-correction codes mathematically proven to achieve the Shannon limit over binary-input discrete memoryless channels (B-DMCs) with low encoding and decoding complexity [1]. This theoretical breakthrough led to their adoption in the 5G New Radio (NR) standard, where polar codes are specified for control channels while low-density parity-check (LDPC) codes serve data channels [2]. The binary LDPC codes adopted for 5G data channels provide a compelling balance of high throughput and feasible complexity. Non-binary LDPC codes, despite potential performance gains in some scenarios, incur substantially higher decoding complexity [3], which aligns with the system considerations that favored polar codes for control channels. In contrast, polar codes are well-suited for control signalling primarily due to their deterministic construction, which facilitates standardization and implementation, and their provable capacity-achieving property, which provides a solid theoretical foundation for high reliability.
The original polar code construction employs the small $2 \times 2$ kernel $G_2 = \begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix}$. In 5G NR, polar codes are implemented through recursive application of this kernel using the Kronecker product. However, theoretical work by Korada et al. established that a polarizing transformation based on a larger kernel matrix (e.g., $16 \times 16$) can achieve a higher polarization exponent than the recursive (Kronecker) construction based solely on $G_2$ [4]. This implies that for a fixed code length, large-kernel polar codes could potentially offer better error-correction performance. Furthermore, techniques have been developed to estimate the capacities and Bhattacharyya parameters of the bit subchannels induced by large kernels, which are essential for their practical construction [5]. However, directly adopting such large kernels introduces a fundamental challenge: the computational load of successive cancellation (SC) decoding scales exponentially with the kernel size $n$, quickly becoming prohibitive.
Research on large-kernel polar codes has thus evolved along two complementary directions: one focused on code construction and kernel optimization [5], and the other, which this paper addresses, concentrating on decoding efficiency. We specifically investigate exact SC decoding, which maintains accuracy without approximation and preserves the code's theoretical guarantees. Our work is confined to binary kernels $G_n$ with $3 \le n \le 16$ as defined in [6].
To address SC decoding complexity, several approaches have emerged. The successive cancellation list (SCL) decoder improves error-correction performance but significantly increases computational load [7]. Alternatively, sequential decoding methods such as the Fano algorithm and the more recent SC-Creeper decoder—which incorporates a cost metric threshold—offer different complexity–performance trade-offs through dynamic tree search [8]. Our work takes an alternative path by focusing on accelerating the exact SC decoder through computational optimization of its kernel operations, preserving the algorithm’s original structure and exactness.
Trellis-based implementations represent one promising direction for this kind of optimization [9]. Although early trellis-based SC decoders relied on approximations [10,11], the development of exact SC decoding led to formulations in the likelihood ( l -) domain [1], and later, to more efficient probability-pair ( W -formula) methods [12]. Zhang et al. made significant progress by showing that exact SC decoding could be implemented without approximation using the Viterbi algorithm on a Massey minimal trellis [13]. However, their work was limited to a kernel of size three, and the complexity remains high for larger kernels.
This paper aims to make exact SC decoding practical for larger kernels. We propose a permutation-based trellis optimization scheme built on the Massey minimal trellis. The core idea is to reorder the trellis time axis to minimize the number of edges, thereby directly reducing the computational load. We apply this optimization to binary kernels from $G_3$ to $G_{16}$. For smaller kernels ($G_3$–$G_{12}$), an exhaustive search is used to find the optimal permutation. For larger kernels ($G_{13}$–$G_{16}$), where an exhaustive search is infeasible, we employ an ant colony optimization (ACO) metaheuristic to efficiently find near-optimal permutations [14]. Simulations show that our approach achieves substantial complexity reductions compared to existing exact SC decoding methods, making large-kernel polar codes more amenable to practical implementation.

2. Background

This section provides the necessary background for the proposed trellis optimization scheme. It first reviews the mathematical formulation of large-kernel polar codes. Then, it details the time axis of a trellis, which serves as the foundation for the subsequent optimization problem. Finally, it establishes the fundamental connection between the SC decoding algorithm and trellis-based computation.

2.1. Large-Kernel Polar Code

Let $W: \{0,1\} \to \mathcal{Y}$ denote a B-DMC, where the input set is $\{0,1\}$, the output set is $\mathcal{Y}$, and the transition probability is $W(y \mid x)$, $x \in \{0,1\}$, $y \in \mathcal{Y}$. Let $a_1^n$ denote the row vector $(a_1, \ldots, a_n)$.
For a given kernel $G_n$, the channel $W_{G_n}: \{0,1\}^n \to \mathcal{Y}^n$ can be defined as
$$W_{G_n}(y_1^n \mid u_1^n) \triangleq \prod_{i=1}^{n} W\big(y_i \mid (u_1^n G_n)_i\big). \qquad (1)$$
The transition probability of the $i$-th subchannel $W_{G_n}^{(i)}: \{0,1\} \to \mathcal{Y}^n \times \{0,1\}^{i-1}$, $1 \le i \le n$, is defined as
$$W_{G_n}^{(i)}(y_1^n, u_1^{i-1} \mid u_i) = \frac{1}{2^{n-1}} \sum_{u_{i+1}^n} \prod_{j=1}^{n} W\big(y_j \mid (u_1^n G_n)_j\big). \qquad (2)$$
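To make the cost of a direct evaluation concrete, the sketch below computes Equation (2) by brute force for an arbitrary binary kernel. It is a minimal illustration rather than the authors' implementation: the channel is assumed to be available as a callable W(y, x) returning W(y|x), and the kernel is assumed to be given as a NumPy array over GF(2).

```python
import itertools
import numpy as np

def subchannel_prob_direct(W, G, y, u_past, u_i):
    """Evaluate Eq. (2) by summing over every assignment of the future bits.

    W      -- callable W(y_sym, x_bit) giving the channel probability W(y|x)
    G      -- n x n binary kernel matrix (NumPy array)
    y      -- received sequence y_1^n
    u_past -- tuple of already decoded bits u_1^{i-1}
    u_i    -- hypothesised value of the current bit
    """
    n = G.shape[0]
    i = len(u_past) + 1
    total = 0.0
    for future in itertools.product((0, 1), repeat=n - i):  # 2^(n-i) terms
        u = np.array(u_past + (u_i,) + future, dtype=int)
        x = (u @ G) % 2                                      # codeword u_1^n G_n
        total += np.prod([W(y[j], x[j]) for j in range(n)])
    return total / 2 ** (n - 1)
```

The loop over the future bits is exactly the $2^{n-i}$-term summation whose cost the trellis formulation of Section 2.3 reduces.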

2.2. Time Axis of a Trellis

The transition probability calculation in Equation (2) is the kernel internal operation of SC decoding, which can be simplified via a trellis diagram. A trellis diagram is a time-layered directed graph with edge labels, typically represented as a triple $(V, E, A)$, where $V$ is the vertex set, $E$ is the edge set, and $A$ is the edge label set.
Each edge in the trellis diagram can be denoted as $(v, v', a)$, indicating a directed connection from vertex $v$ to $v'$ ($v, v' \in V$) with edge label $a \in A$. For every vertex $v \in V$, there exists at least one directed path from the source vertex to $v$. If such a path contains $i$ edges, the depth of the vertex $v$ is defined as $i$. Let $V_i$ denote the set of vertices with depth $i$. For a trellis of depth $n$, its vertex set can be partitioned into $n+1$ disjoint subsets $V_0, V_1, \ldots, V_n$, and its edge set correspondingly partitions into $n$ disjoint subsets $E_0, E_1, \ldots, E_{n-1}$. The ordered set $(0, 1, \ldots, n)$ resulting from this partition is called the time axis of the trellis.

2.3. Connecting SC Decoding to Trellis Computation

Direct evaluation of Equation (2) requires summing over $2^{n-i}$ terms, leading to exponential complexity in $n$. This summation can be computed exactly on a trellis representing the linear code defined by $G_n$. For a fixed decoding step $i$ and given past bits $u_1^{i-1}$, each complete path from $V_0$ to $V_n$ in the trellis corresponds to one specific assignment of the future bits $u_{i+1}, \ldots, u_n$ in (2). The edge label at depth $j$ equals the channel transition probability $W(y_j \mid x_j)$, where the code bit $x_j$ is determined by the path and the kernel $G_n$. Thus, the product of labels along any path equals one term of the product $\prod_{j=1}^{n} W(\cdot)$ in (2).
The forward sum-product algorithm on the trellis computes the total probability mass at V n by merging paths at shared vertices. This merging exploits the distributive law, reducing the number of multiplications compared to direct expansion of the sum.
Consider decoding the first bit $u_1$ with the ternary kernel $G_3 = \begin{bmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 1 & 0 & 1 \end{bmatrix}$ as an example, and assume $u_1 = 0$. Equation (2) reduces to
$$W_{G_3}^{(1)}(y_1^3 \mid u_1 = 0) = \frac{1}{4} \sum_{u_2, u_3 \in \{0,1\}} W(y_1 \mid u_2 \oplus u_3)\, W(y_2 \mid u_2)\, W(y_3 \mid u_3). \qquad (3)$$
According to the method in [13], trellis T1 is built for this computation and is shown in Figure 1. It can be seen that T1 has four paths, each corresponding to one of the four ( u 2 , u 3 ) pairs. Computing the total flow to the terminal vertex yields the same result as (3). As shown in [13], this trellis-based method uses only 6 multiplications, compared to 8 for the direct sum—the reduction is a result of merging paths at the central vertex.
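The 8-versus-6 multiplication count can be reproduced in a few lines. The second function below merges the four paths at a shared intermediate quantity; this factorization is a sketch consistent with the count reported for T1, not necessarily the exact edge ordering of the trellis in [13]. As before, the channel is assumed to be a callable W(y, x).

```python
def W_G3_first_bit_direct(W, y):
    """Eq. (3) by direct expansion: 4 terms, 2 multiplications each (8 in total)."""
    total = 0.0
    for u2 in (0, 1):
        for u3 in (0, 1):
            total += W(y[0], u2 ^ u3) * W(y[1], u2) * W(y[2], u3)
    return total / 4

def W_G3_first_bit_merged(W, y):
    """Same quantity with paths merged at a shared vertex:
    2 multiplications per inner sum (4 total) plus 2 outer ones (6 in total)."""
    total = 0.0
    for u2 in (0, 1):
        inner = sum(W(y[0], u2 ^ u3) * W(y[2], u3) for u3 in (0, 1))
        total += W(y[1], u2) * inner
    return total / 4
```

Both functions return the same value; only the number of multiplications differs, which is the distributive-law saving described above.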
Critically, the number of multiplications in the forward pass is proportional to the number of edges in the trellis. Different trellis representations of the same G n code can have different edge counts while maintaining identical decoding performance. Therefore, minimizing the trellis edge count is equivalent to minimizing the kernel computation complexity. This observation directs us to find the trellis with the minimal edge count for a given kernel, which we address in Section 3 through time-axis permutation.

3. Permutation-Based Trellis Optimization

The computational cost of trellis-based exact SC decoding is primarily determined by the number of edges in the trellis. Therefore, minimizing the number of edges is crucial for reducing decoding complexity. For a fixed time axis, the best trellis is the minimal trellis. In this paper, the work is based on the Massey minimal trellis.
Figure 2 depicts the Massey minimal trellis generated from the generator matrix
$$\begin{bmatrix} 1 & 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 1 & 0 \\ 1 & 1 & 1 & 0 & 1 \end{bmatrix},$$
which corresponds to the time axis $I_1 = \{0, 1, 2, 3, 4\}$. Mapping this time axis to $I_2 = \{0, 3, 2, 1, 4\}$ yields the Massey trellis shown in Figure 3. Figures 2 and 3 were obtained using the methods in [13].
As can be observed, Figure 2 contains 14 vertices and 20 edges, whereas Figure 3 comprises only 9 vertices and 12 edges, a significantly simpler structure. Hence, exact SC decoding on the trellis in Figure 3 is more efficient than on the trellis in Figure 2. Importantly, the generator matrices corresponding to Figure 2 and Figure 3 are related by a one-to-one remapping of the time axis (a column permutation), so they represent the same code. Exact SC decoding on either trellis therefore yields identical decoding performance, which guarantees that the error-correction capability is preserved under the proposed permutation-based optimization. This indicates that the computational cost can be further reduced by reordering the time axis without any performance sacrifice.
The search for the trellis with the lowest computational cost involves generating Massey trellises from all possible time-axis permutations of the generator matrix and then selecting the one with the fewest edges. The resultant structure is termed the permutation-optimized trellis.
The above method is performed to identify the permutation-optimized trellis for smaller kernels ( G 3 G 12 ). Once the permutation-optimized trellis is determined, exact SC decoding proceeds on this trellis, following the method in [13].
It should be noted that for kernels up to $G_{12}$, the permutation space (e.g., $12! \approx 4.79 \times 10^{8}$ for $G_{12}$) permits an exhaustive search in an offline setting. This is feasible because evaluating the trellis complexity of a single permutation is computationally trivial (microsecond-scale), and the process is perfectly parallelizable. In our experiments, the search for $G_{12}$ completed within hours on a multi-core machine.
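The offline search can be sketched as follows. As the complexity measure, the sketch uses the classical minimal-trellis edge count, obtained by reducing the (column-permuted) generator matrix to trellis-oriented form and summing 2 raised to the number of rows active at each column; the paper builds Massey trellises with the method of [13], which may differ in detail, although this measure reproduces the 20- and 12-edge counts of the Figure 2/Figure 3 example above.

```python
import itertools
import numpy as np

def to_trellis_oriented_form(G):
    """Bring a full-rank binary generator matrix into minimal span
    (trellis-oriented) form: all row spans get distinct start columns
    and distinct end columns, via GF(2) row additions."""
    T = G.copy() % 2
    k = T.shape[0]
    while True:
        spans = [(int(np.flatnonzero(r)[0]), int(np.flatnonzero(r)[-1])) for r in T]
        modified = False
        for a in range(k):
            for b in range(k):
                if a == b or modified:
                    continue
                (sa, ea), (sb, eb) = spans[a], spans[b]
                if (sa == sb and ea >= eb) or (ea == eb and sa <= sb):
                    T[a] = (T[a] + T[b]) % 2   # strictly shrinks row a's span
                    modified = True
        if not modified:
            return T

def minimal_trellis_edges(G):
    """Edge count of the minimal trellis: sum over columns c of
    2**(number of rows whose span contains c)."""
    T = to_trellis_oriented_form(G)
    spans = [(int(np.flatnonzero(r)[0]), int(np.flatnonzero(r)[-1])) for r in T]
    return sum(2 ** sum(s <= c <= e for s, e in spans) for c in range(T.shape[1]))

def exhaustive_best_permutation(G):
    """Evaluate every column (time-axis) ordering and keep the cheapest one."""
    n = G.shape[1]
    best = min(itertools.permutations(range(n)),
               key=lambda p: minimal_trellis_edges(G[:, list(p)]))
    return best, minimal_trellis_edges(G[:, list(best)])
```

For the 3 x 5 matrix above, minimal_trellis_edges gives 20 edges for the natural order and 12 for the ordering (0, 3, 2, 1, 4); for the larger offline searches described here, the permutations can be split across worker processes.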
For larger kernels ( G 13 G 16 ), the factorial growth of the permutation space makes an exhaustive search intractable. We therefore employ the ACO-based heuristic method described next.

4. ACO-Based Time-Axis Permutation Optimization for Larger Kernels

The ACO algorithm is modeled on ant foraging behavior. Its core mechanism is the use of pheromones: as ants move, they deposit pheromone trails, and the strength of this scent influences the path choices of other ants, so the colony collectively reinforces favorable paths. Through this iterative process, the colony progressively identifies optimal or near-optimal solutions.
In this section, we apply ACO to the time-axis permutation optimization for larger kernels ( G 13 G 16 ). We first detail the foundational concepts and steps of the ACO algorithm for this task. We then present the procedure for identifying the optimal or near-optimal permutation using ACO.

4.1. Concepts and Basic Steps of the ACO Algorithm

The application of an ACO algorithm to the time-axis permutation problem involves some fundamental concepts such as “ants”, “pheromone”, and “path selection”.

4.1.1. Ants

For a kernel $G_n$, there are $n$ points at times $\{0, 1, \ldots, n-1\}$. As shown in Figure 4, $t_i$ ($i = 0, 1, \ldots, n-1$) denote the $n$ time-axis nodes, and $r_{i,k}$ ($k = 0, 1, \ldots, n-1-i$) denote the branch indices. $t_0 \in \{0, 1, \ldots, n-1\}$ is the first node of the time-axis permutation. There are $n$ lines from the starting point $O$ to $t_0$, representing the $n$ candidate branches for $t_0$. Given $t_0$, the $n-1$ lines from $t_0$ to $t_1$ represent the subsequent branch choices for $t_1$.
The ants are hypothetical path choosers. An ant starts from $O$ and chooses a branch to arrive at $t_0$; it then chooses a branch from $t_0$ to arrive at $t_1$, and so on, until it finally chooses a branch from $t_{n-2}$ to arrive at $t_{n-1}$. In this way, the path selection is completed and a time-axis permutation scheme is obtained.

4.1.2. Time-Axis Permutation Scheme

Let the path chosen by an ant be $\{r_{0,0}, r_{1,0}, \ldots, r_{n-2,0}, r_{n-1,0}\}$, denoted as
$$p = [r_{0,0},\, r_{1,0},\, \ldots,\, r_{n-2,0},\, r_{n-1,0}]. \qquad (4)$$
Here, $p$ represents a time-axis permutation scheme. Define the trellis complexity of $p$ as the number of edges in its corresponding Massey trellis, denoted $S(p)$.
There are $N_a$ ants whose initial paths are randomly selected and labeled $p_0^{(0)}, p_1^{(0)}, \ldots, p_{N_a-1}^{(0)}$. The ants update their paths iteratively according to a specific update rule, and the paths generated after the $m$-th iteration are denoted $p_0^{(m)}, p_1^{(m)}, \ldots, p_{N_a-1}^{(m)}$.

4.1.3. Pheromone

Pheromone guides the ants’ path selection. Paths with higher pheromone concentrations are more likely to be chosen by the ants. To ensure unbiased exploration at the start of the search, the initial pheromone level on every branch is set to a constant, τ 0 , giving all paths an equal selection probability.

4.1.4. Iterative Optimal and Global Optimal

Denote the trellis complexities of $p_0^{(m)}, p_1^{(m)}, \ldots, p_{N_a-1}^{(m)}$ as $S(p_0^{(m)}), S(p_1^{(m)}), \ldots, S(p_{N_a-1}^{(m)})$. The iterative optimal scheme, denoted $p_d^{(m)}$, is the one with the minimum trellis complexity among the time-axis permutations of the current iteration. The global optimal scheme, denoted $p_g^{(m)}$, is the one with the smallest complexity among all permutations generated so far.

4.1.5. Pheromone Update Rules

Let $L$ represent the set of all branches and $\Gamma$ the pheromone space, specifically:
$$L = \{r_{i,k} \mid i = 0, 1, \ldots, n-1;\ k = 0, 1, \ldots, n-1-i\}, \qquad (5)$$
$$\Gamma = \{\tau_{ik} \mid i = 0, 1, \ldots, n-1;\ k = 0, 1, \ldots, n-1-i\}. \qquad (6)$$
The pheromone value of branch $r_{i,k}$ after the $m$-th iteration is $\tau_{ik}(m)$. As mentioned earlier, for $m = 0$ all pheromone values are initialized to a constant $\tau_0$, that is, $\tau_{ik}(0) = \tau_0$ for $i = 0, 1, \ldots, n-1$ and $k = 0, 1, \ldots, n-1-i$. The pheromone update formula is as follows [14]:
$$\tau_{ik}(m+1) = (1-\rho)\,\tau_{ik}(m) + \Delta\tau_{ik}(m+1). \qquad (7)$$
In Equation (7), $\rho$ represents the pheromone evaporation coefficient, and $\Delta\tau_{ik}(m+1)$ is defined as follows [14]:
$$\Delta\tau_{ik}(m+1) = \begin{cases} \dfrac{Q}{S(p_g^{(m)})}, & r_{i,k} \in p_g^{(m)} \\[4pt] \dfrac{Q}{S(p_d^{(m)})}, & r_{i,k} \in p_d^{(m)} \\[4pt] 0, & \text{otherwise}, \end{cases} \qquad (8)$$
where $Q$ is the pheromone intensity, which refers to the amount of pheromone released by each ant after completing a full path search.
To prevent the ACO algorithm from converging prematurely to a local optimum and to enhance its global search ability, the following constraints are imposed on $\tau_{ik}(m+1)$:
$$\tau_{ik}(m+1) = \begin{cases} \tau_{\max}, & \tau_{ik}(m+1) > \tau_{\max} \\ \tau_{\min}, & \tau_{ik}(m+1) < \tau_{\min}. \end{cases} \qquad (9)$$
The values of $\tau_{\max}$ and $\tau_{\min}$ are given by
$$\tau_{\max} = \frac{1}{(1-\rho)\,S(p_d^{(m)})}, \qquad \tau_{\min} = 0.05\,\tau_{\max}. \qquad (10)$$
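A direct transcription of Equations (7)-(10) is sketched below. The pheromone space is stored as a dictionary keyed by the branch indices (i, k); giving the global-best path precedence when a branch lies on both best paths, and omitting the weights k_g and k_d introduced later in Section 6.2, are simplifying assumptions of this sketch.

```python
def update_pheromone(tau, rho, Q, path_g, S_g, path_d, S_d):
    """Eqs. (7)-(10): evaporate, deposit along the global- and
    iteration-best paths, then clamp each value to [tau_min, tau_max].

    tau        -- dict {(i, k): pheromone of branch r_{i,k}}
    path_g/S_g -- branches on the global-best permutation and its complexity
    path_d/S_d -- branches on the iteration-best permutation and its complexity
    """
    tau_max = 1.0 / ((1.0 - rho) * S_d)      # Eq. (10)
    tau_min = 0.05 * tau_max
    for branch in tau:
        if branch in path_g:                  # Eq. (8); global best takes precedence
            delta = Q / S_g
        elif branch in path_d:
            delta = Q / S_d
        else:
            delta = 0.0
        value = (1.0 - rho) * tau[branch] + delta          # Eq. (7)
        tau[branch] = min(max(value, tau_min), tau_max)    # Eq. (9)
    return tau, tau_max, tau_min
```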

4.1.6. Convergence Factor

The formula for the convergence factor $c_g$ is defined as follows [14]:
$$c_g(m+1) = 2\left(\frac{\sum_{r_{i,k} \in L} \max\{\tau_{\max} - \tau_{ik}(m+1),\ \tau_{ik}(m+1) - \tau_{\min}\}}{|L|\,(\tau_{\max} - \tau_{\min})} - 0.5\right), \qquad (11)$$
where $|L|$ is the cardinality of $L$.
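Equation (11) then reduces to a short computation over the same pheromone dictionary (a sketch under the same assumptions as above):

```python
def convergence_factor(tau, tau_max, tau_min):
    """Eq. (11): approaches 1 as every branch saturates at tau_max or tau_min."""
    saturation = sum(max(tau_max - t, t - tau_min) for t in tau.values())
    return 2.0 * (saturation / (len(tau) * (tau_max - tau_min)) - 0.5)
```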

4.1.7. Roulette Wheel Selection

Suppose that at some time point an ant can choose among branches $r_0, r_1, \ldots, r_{s-1}$ with corresponding probabilities $p_0, p_1, \ldots, p_{s-1}$, where $p_0 + p_1 + \cdots + p_{s-1} = 1$. The ant chooses a branch as follows:
(1) Generate a uniformly distributed random number $b$ in $[0, 1]$.
(2) Determine an index $i$ such that
$$\sum_{j=0}^{i-1} p_j \le b \le \sum_{j=0}^{i} p_j. \qquad (12)$$
(3) Choose $r_i$ as the branch for this ant at that time point.
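Steps (1)-(3) amount to inverse-CDF sampling and can be sketched as follows:

```python
import random

def roulette_select(probs):
    """Return index i with probability probs[i] (probs must sum to 1)."""
    b = random.random()          # step (1): uniform random number in [0, 1]
    cumulative = 0.0
    for i, p in enumerate(probs):
        cumulative += p
        if b <= cumulative:      # step (2): first i whose cumulative sum reaches b
            return i
    return len(probs) - 1        # guard against floating-point rounding
```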

4.1.8. Generate Time-Axis Permutation Scheme

For time point $t_i$, the probability of each branch is calculated from the pheromone space as
$$p_{ik}(m) = \frac{\tau_{ik}(m)}{\sum_{l=0}^{n-1-i} \tau_{il}(m)}, \qquad (13)$$
where $k = 0, 1, \ldots, n-1-i$.
For the kernel $G_n$, an ant generates a time-axis permutation scheme as follows:
(1) Start from the starting point $O$ and set $i = 0$.
(2) Calculate the probability of each branch for time point $t_i$ according to Equation (13), then choose the branch for time point $t_i$ by roulette wheel selection.
(3) Increment $i$ by 1 and repeat (2) until $i = n-1$.
In this way, an ant traces a path in Figure 4; that is, a time-axis permutation scheme is generated.
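This walk can be sketched as below, reusing roulette_select from Section 4.1.7. The mapping between a branch index k and a concrete time-axis value is not spelled out above, so the sketch assumes that branch r_{i,k} selects the k-th still-unused position.

```python
def construct_permutation(tau, n):
    """One ant's walk: at each time point t_i, pick among the remaining
    positions with the probabilities of Eq. (13), via roulette selection."""
    remaining = list(range(n))           # time-axis positions not yet placed
    perm, path = [], []
    for i in range(n):
        weights = [tau[(i, k)] for k in range(len(remaining))]
        probs = [w / sum(weights) for w in weights]          # Eq. (13)
        k = roulette_select(probs)
        path.append((i, k))              # branch r_{i,k} taken at step i
        perm.append(remaining.pop(k))    # position placed at slot i
    return perm, path
```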

4.2. Algorithm Design

Based on the foundational concepts and operations outlined in Section 4.1, the procedure for identifying the optimal permutation using the ACO algorithm is as follows:
Step 1: Initialization. Set the initial parameters, such as the number of ants ($N_a$), the initial pheromone evaporation coefficient ($\rho_0$), the pheromone intensity ($Q$), and the initial pheromone value ($\tau_0$).
Step 2: Solution construction and evaluation. Each ant generates a time-axis permutation scheme following the procedure in Section 4.1.8. The trellis complexity for each generated scheme is then computed. Subsequently, both the current iterative optimal scheme and the global optimal scheme are updated.
Step 3: Pheromone update and convergence assessment. The whole pheromone space is updated as per Section 4.1.5. Following this, the convergence factor for the current pheromone space is calculated according to Section 4.1.6.
Step 4: Termination check. The algorithm flow branches based on the value of the convergence factor c g :
(1)
If $c_g < 0.9999$ and the maximum number of iterations has not been reached, the algorithm returns to Step 2 to commence the next iteration.
(2)
Otherwise, the algorithm terminates and outputs the final result.
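Putting Steps 1-4 together, and reusing the helpers sketched earlier (minimal_trellis_edges from Section 3 and construct_permutation, update_pheromone, and convergence_factor from Section 4.1), one possible form of the overall loop is shown below. A fixed evaporation coefficient is used for simplicity; the dynamic schedule of Section 6.2 could be substituted by feeding the convergence factor back into rho.

```python
def aco_search(G, Na=90, Nc_max=100, rho=0.3, Q=50, tau0=0.0625):
    """ACO search for a low-complexity time-axis permutation (Steps 1-4)."""
    n = G.shape[1]
    tau = {(i, k): tau0 for i in range(n) for k in range(n - i)}   # Step 1
    best_perm, best_path, best_S = None, None, float("inf")
    for _ in range(Nc_max):
        ants = [construct_permutation(tau, n) for _ in range(Na)]  # Step 2
        scored = [(minimal_trellis_edges(G[:, list(p)]), p, path) for p, path in ants]
        S_d, p_d, path_d = min(scored, key=lambda s: s[0])         # iteration best
        if S_d < best_S:                                           # global best
            best_S, best_perm, best_path = S_d, p_d, path_d
        tau, tau_max, tau_min = update_pheromone(                  # Step 3
            tau, rho, Q, best_path, best_S, path_d, S_d)
        if convergence_factor(tau, tau_max, tau_min) >= 0.9999:    # Step 4
            break
    return best_perm, best_S
```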

5. Simulation Results and Analysis for Polar Codes with $G_3$–$G_{12}$ Kernels

This section presents the simulation results for polar codes with $G_3$–$G_{12}$ kernels. Table 1 lists the average number of multiplication operations required for kernel computations under direct SC decoding, the $l$-expression, the $W$-formula, the Massey trellis, and the proposed permutation-optimized trellis. Because the proposed optimization reduces the number of trellis edges, the reported reduction in multiplications proportionally reflects the reduction in overall computational complexity.
As shown in Table 1, for $n \ge 4$ the permutation-optimized trellis significantly outperforms direct SC decoding, and its advantage grows as $n$ increases. For $n = 10$, 11, and 12, the optimized trellis reduces operations by 23.4%, 63%, and 82.8%, respectively, relative to $l$-expression-based SC decoding, and by 4.7%, 23%, and 32.3% compared to the $W$-formula-based approach. Furthermore, for $n \ge 3$, it achieves operation reductions of 5.7% to 46.8% compared to SC decoding on the Massey trellis.
Consequently, the simulation results demonstrate that the permutation-optimized trellis offers a lower computational cost than all benchmark methods for $n \ge 10$, establishing it as a more efficient decoding method for large-kernel polar codes.

6. Simulation Results and Analysis for Polar Codes with $G_{13}$–$G_{16}$ Kernels

This section presents the simulation results for polar codes with $G_{13}$–$G_{16}$ kernels, using the ACO algorithm for permutation optimization. First, the parameter settings for the ACO algorithm are analyzed. Subsequently, the simulation results and their analysis are presented.

6.1. Parameter Settings

In ACO, the overall algorithm performance is highly sensitive to the parameter settings. Because the parameters are interdependent, adjusting one may affect the others, so no universally optimal parameter combination suits all scenarios. We finalized the parameters through repeated experiments that observed their individual impacts. The following analysis examines the ACO parameters using the time-axis permutation of the $G_{16}$ kernel matrix as an example.

6.1.1. Analysis of the Number of Ants

The number of ants, N a , requires careful consideration in ACO design. Although more ants allow for broader exploration of possible solutions, too many ants lead to repeated searches along similar paths. This repetition increases computation time while providing diminishing returns in solution quality. Therefore, N a should be determined according to the specific problem’s requirements and constraints.
For each value of the ant count $N_a$, we conducted 10 independent experiments and selected the best result. For this best result, we recorded both the total number of edges in the trellis corresponding to the optimal time-axis permutation (i.e., the optimal trellis edge count) and the corresponding number of iterations to convergence.
As shown in Table 2, across different values of $N_a$ the total number of optimal trellis edges remained unchanged at 220, whereas the number of iterations to convergence varied. The convergence iteration count stabilized at $N_a = 90$, which was consequently adopted for all subsequent simulations in this work.

6.1.2. Analysis of the Pheromone Evaporation Coefficient

The pheromone evaporation coefficient, ρ , controls the pheromone update rate. The value of ρ influences the balance between exploration and exploitation: a smaller ρ favors global exploration at the cost of slower convergence, while a larger ρ accelerates convergence but increases the risk of local optima. To address this, our study dynamically adjusts ρ based on the convergence factor c g , setting it low initially to promote exploration and increasing it later to enhance convergence speed.
We tested each initial value of ρ 0 with 10 independent runs, selecting the best result, and recorded the total number of optimal trellis edges and the number of iterations to convergence.
Table 3 shows that ρ 0 = 0.3 achieves the optimal time-axis permutation with minimal iterations. Decreasing ρ 0 increases the number of iterations to convergence, while increasing it prevents the algorithm from finding the optimal permutation. Therefore, ρ 0 was set to 0.3 for all subsequent simulations.

6.1.3. Analysis of the Pheromone Intensity

The pheromone intensity, Q , plays a key role in ACO. If Q is too low, pheromone differences between paths become too small to guide the search effectively, slowing convergence. Excessively high Q values, however, allocate a disproportionately large share of pheromone to paths found early on. This can cause the search to prematurely lock into these initially promising but potentially suboptimal regions. For each value of Q , we conducted 10 independent runs, selecting the best result, and recorded the total number of optimal trellis edges and the number of iterations to convergence.
As shown in Table 4, across 10 experiments with different Q values, the total number of optimal trellis edges remained constant at 220, while the number of iterations to convergence varied. The convergence iteration count reached its minimum when Q was set to 50. Any deviation from this value—whether an increase or decrease—resulted in a higher number of iterations to convergence. Therefore, a pheromone intensity of Q = 50 was adopted for subsequent simulations in this work.

6.2. Simulation Results and Analysis

This subsection presents the simulation results from the application of the ACO algorithm to the $G_{13}$–$G_{16}$ kernels. The ACO parameters are configured as follows:
Number of ants: $N_a = 90$;
Maximum iterations: $N_{c\mathrm{Max}} = 100$;
Initial pheromone evaporation coefficient: $\rho_0 = 0.3$;
Pheromone intensity: $Q = 50$;
Initial pheromone value: $\tau_0 = 0.0625$.
The weights of the global optimum ($k_g$) and the iterative optimum ($k_d$), together with the dynamically changing pheromone evaporation coefficient $\rho$, depend on the convergence factor $c_g$. They are set as follows:
$0 \le c_g < 0.5$: $k_g = 0.0$, $k_d = 1.0$, $\rho = 0.30$;
$0.5 \le c_g < 0.7$: $k_g = 0.382$, $k_d = 0.618$, $\rho = 0.32$;
$0.7 \le c_g < 0.9$: $k_g = 0.5$, $k_d = 0.5$, $\rho = 0.34$;
$0.9 \le c_g < 0.999$: $k_g = 0.618$, $k_d = 0.382$, $\rho = 0.36$;
$c_g \ge 0.999$: $k_g = 0.8$, $k_d = 0.2$, $\rho = 0.38$.
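For reference, this schedule can be written as a small lookup function (a sketch; how $k_g$ and $k_d$ enter the pheromone deposit of Equation (8) is not detailed above, so they are simply returned to the caller):

```python
def weight_schedule(c_g):
    """Return (k_g, k_d, rho) for the current convergence factor c_g."""
    if c_g < 0.5:
        return 0.0, 1.0, 0.30
    if c_g < 0.7:
        return 0.382, 0.618, 0.32
    if c_g < 0.9:
        return 0.5, 0.5, 0.34
    if c_g < 0.999:
        return 0.618, 0.382, 0.36
    return 0.8, 0.2, 0.38
```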
Table 5 lists the average number of multiplication operations required for kernel computations under direct SC decoding, the $l$-expression, the $W$-formula, the Massey trellis, and the proposed permutation-optimized trellis. As in Table 1, because the optimization reduces the number of trellis edges, the reported reduction in multiplications proportionally reflects the reduction in overall computational complexity.
As shown in Table 5, for kernel sizes from 13 to 16 the proposed optimized trellis significantly outperforms direct SC decoding, reducing the average number of multiplication operations by more than 99%. Compared to the $l$-expression, the $W$-formula, and the Massey trellis, the reductions in multiplication count for $n = 13$ to 16 are (89.8%, 95.8%, 98.1%, 99.2%), (45.8%, 56.8%, 57.4%, 58.1%), and (56.2%, 54.8%, 54.1%, 56.5%), respectively.
These results indicate that for $n \ge 13$ the proposed optimized trellis significantly outperforms all benchmark methods. Moreover, the parameter analysis in Section 6.1 suggests that the achieved complexity reduction is robust, persisting even when the ACO parameters used for the trellis discovery are not finely tuned to their optimum. This establishes the permutation-optimized trellis as a more efficient approach for large-kernel exact SC decoding.

7. Conclusions

This paper proposed a permutation-based trellis optimization scheme to reduce the computational complexity of exact SC decoding. Based on the Massey minimal trellis, our approach reordered the time axis to further minimize the number of edges, yielding a more computationally efficient structure.
We first introduced the fundamental permutation problem and illustrated the structural improvements achieved through optimization. Then, depending on the kernel size, we applied two distinct methods, an exhaustive search and ACO, to find the optimal or near-optimal time-axis permutation. Specifically, (1) for smaller kernels ($G_3$–$G_{12}$), a full permutation search was performed to identify the optimal time-axis permutation by exhaustively evaluating all possible configurations; (2) for larger kernels ($G_{13}$–$G_{16}$), the ACO algorithm was employed to efficiently determine a near-optimal permutation while keeping the search overhead manageable.
Simulation results indicated that our approach significantly outperformed existing methods, including direct SC decoding, the $l$-expression and $W$-formula methods, and the original Massey trellis, especially for large kernels. This improvement is particularly important because it brings large-kernel polar codes closer to practical adoption.

Author Contributions

Conceptualization, C.D.; methodology, Z.W.; software, F.Z.; validation, F.Z.; formal analysis, Y.X.; investigation, Z.W.; resources, Y.X.; data curation, F.Z.; writing—original draft, C.D.; writing—review & editing, C.D., Z.H. and Y.X.; supervision, Z.H.; project administration, Z.H.; funding acquisition, Y.X. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Outstanding Teaching Team of the “Qinglan Project” in Jiangsu Higher Education Institutions (2024), of which author Y.X. is a member.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
B-DMC: Binary-input discrete memoryless channel
NR: New Radio
LDPC: Low-density parity-check
SC: Successive cancellation
SCL: Successive cancellation list
ACO: Ant colony optimization

References

  1. Arikan, E. Channel polarization: A method for constructing capacity-achieving codes for symmetric binary-input memoryless channels. IEEE Trans. Inf. Theory 2009, 55, 3051–3073. [Google Scholar] [CrossRef]
  2. Indoonundon, M.; Fowdur, T.P. Overview of the challenges and solutions for 5G channel coding schemes. J. Inf. Telecommun. 2021, 5, 460–483. [Google Scholar] [CrossRef]
  3. Kruglik, S.; Potapova, V.; Frolov, A. On Performance of Multilevel Coding Schemes Based on Non-Binary LDPC Codes. In Proceedings of the European Wireless 2018, 24th European Wireless Conference, Catania, Italy, 2–4 May 2018. [Google Scholar]
  4. Korada, S.B.; Şaşoğlu, E.; Urbanke, R. Polar codes: Characterization of exponent, bounds, and constructions. IEEE Trans. Inf. Theory 2010, 56, 6253–6264. [Google Scholar] [CrossRef]
  5. Trifonov, P.V.; Trofimiuk, G.A. Design of Polar Codes with Large Kernels. Probl. Inf. Transm. 2024, 60, 304–326. [Google Scholar] [CrossRef]
  6. Lin, H.-P.; Lin, S.; Abdel-Ghaffar, K.A.S. Linear and nonlinear binary kernels of polar codes of small dimensions with maximum exponents. IEEE Trans. Inf. Theory 2015, 61, 5253–5270. [Google Scholar] [CrossRef]
  7. Tal, I.; Vardy, A. List decoding of polar codes. IEEE Trans. Inf. Theory 2015, 61, 2213–2226. [Google Scholar] [CrossRef]
  8. Timokhin, I.; Ivanov, F. Sequential Polar Decoding with Cost Metric Threshold. Appl. Sci. 2024, 14, 1847. [Google Scholar] [CrossRef]
  9. Vardy, A. Trellis Structure of Codes. In Handbook of Coding Theory; Pless, V.S., Huffman, W.C., Eds.; Elsevier Science: Amsterdam, The Netherlands, 1998; Volume 1, pp. 1989–2118. [Google Scholar]
  10. Trifonov, P. Recursive Trellis Processing of Large Polarization Kernels. In Proceedings of the 2021 IEEE International Symposium on Information Theory (ISIT), Melbourne, Australia, 12–20 July 2021. [Google Scholar] [CrossRef]
  11. Trifonov, P.; Karakchieva, L. Recursive Processing Algorithm for Low Complexity Decoding of Polar Codes With Large Kernels. IEEE Trans. Commun. 2023, 71, 5039–5050. [Google Scholar] [CrossRef]
  12. Huang, Z.; Jiang, Z.; Zhou, S.; Zhang, X. On the Non-Approximate Successive Cancellation Decoding of Binary Polar Codes with Medium Kernels. IEEE Access 2023, 11, 87505–87519. [Google Scholar] [CrossRef]
  13. Zhang, F.; Huang, Z.; Zhang, Y.; Zhou, S. A trellis decoding based on Massey trellis for polar codes with a ternary kernel. In Proceedings of the Third International Conference on Optics and Communication Technology (ICOCT 2023), Changchun, China, 15 December 2023. [Google Scholar] [CrossRef]
  14. Stützle, T.; Hoos, H.H. MAX-MIN Ant System. Future Gener. Comput. Syst. 2000, 16, 889–914. [Google Scholar] [CrossRef]
Figure 1. Trellis T1 of $G_3$.
Figure 2. Massey trellis for the time axis $I_1 = \{0, 1, 2, 3, 4\}$.
Figure 3. Massey trellis for the time axis $I_2 = \{0, 3, 2, 1, 4\}$.
Figure 4. Ant road map.
Table 1. Average number of multiplication operations required for kernel computations ($G_3$–$G_{12}$): (a) direct SC decoding; (b) $l$-expression method [1]; (c) $W$-formula method [12]; (d) Massey trellis [13]; (e) permutation-optimized trellis.

Kernel size n | (a) | (b) | (c) | (d) | (e)
3 | 9.3 | 3.7 | 8.7 | 10.6 | 10.0
4 | 22.5 | 2.0 | 4.0 | 17.5 | 15.5
5 | 49.6 | 7.8 | 19.2 | 26.4 | 21.2
6 | 105.0 | 11.3 | 25.3 | 35.6 | 27.6
7 | 217.7 | 21.9 | 33.7 | 52.5 | 37.7
8 | 446.3 | 42.8 | 44.8 | 71.7 | 46.7
9 | 908.4 | 52.1 | 56.7 | 91.1 | 58.0
10 | 1841.4 | 83.8 | 67.4 | 123.0 | 64.2
11 | 3721.8 | 246.0 | 118.5 | 159.6 | 91.2
12 | 7505.5 | 673.8 | 171.0 | 217.8 | 115.8
Table 2. Impact of the number of ants on ACO performance.

Number of ants | Total number of optimal trellis edges | Number of iterations to convergence
30 | 220 | 57
40 | 220 | 61
50 | 220 | 49
60 | 220 | 41
70 | 220 | 33
80 | 220 | 23
90 | 220 | 20
100 | 220 | 20
110 | 220 | 22
120 | 220 | 21
Table 3. Impact of the initial evaporation coefficient on ACO performance.

Initial evaporation coefficient | Total number of optimal trellis edges | Number of iterations to convergence
0.1 | 220 | 35
0.2 | 220 | 17
0.3 | 220 | 13
0.4 | 220 | 14
0.5 | 220 | 19
0.6 | 220 | 20
0.7 | 220 | 23
0.8 | 236 | 24
0.9 | 236 | 22
1.0 | 252 | 27
Table 4. Impact of the pheromone intensity on ACO performance.

Pheromone intensity | Total number of optimal trellis edges | Number of iterations to convergence
10 | 220 | 23
20 | 220 | 16
30 | 220 | 16
40 | 220 | 13
50 | 220 | 12
60 | 220 | 17
70 | 220 | 23
80 | 220 | 27
90 | 220 | 29
100 | 220 | 27
Table 5. Average number of multiplication operations required for kernel computations ($G_{13}$–$G_{16}$): (a) direct SC decoding; (b) $l$-expression method; (c) $W$-formula method; (d) Massey trellis; (e) permutation-optimized trellis.

Kernel size n | (a) | (b) | (c) | (d) | (e)
13 | 15,121.8 | 1271.5 | 240.2 | 297.2 | 130.3
14 | 30,425.6 | 3790.6 | 372.0 | 355.3 | 160.7
15 | 61,165.1 | 10,736.6 | 481.3 | 446.9 | 205.1
16 | 122,878.1 | 34,145.0 | 630.1 | 606.8 | 263.8
