Pre-Configured Error Pattern Ordered Statistics Decoding for CRC-Polar Codes

In this paper, we propose a pre-configured error pattern ordered statistics decoding (PEPOSD) algorithm and discuss its application to short cyclic redundancy check (CRC)-polar codes. Unlike traditional OSD, which changes the most reliable independent symbols, we regard the decoding process as testing error patterns (EPs), as in guessing random additive noise decoding (GRAND). The pre-configurator, adapted from ordered reliability bits (ORB) GRAND, gives better control over the range and testing order of the EPs, and an offline-online structure accelerates the decoding process. We also introduce two orders to optimize the search order for testing EPs. Compared with CRC-aided OSD and list decoding, PEPOSD can achieve a better trade-off between accuracy and complexity.


Introduction
In ultra-reliable and low latency communications (URLLC), the high reliability of short block codes is a key requirement [1]. For this purpose, cyclic redundancy check (CRC)-polar codes are particularly effective [2]. For decoding short CRC-polar codes, the state-of-the-art method is CRC-aided (CA) successive cancellation list (SCL) decoding [3].
Two cutting-edge short-code decoding algorithms are ordered statistics decoding (OSD) [4] and guessing random additive noise decoding (GRAND) [5]. OSD is a near-maximum-likelihood (ML) decoder and is well suited to parallel design. However, the decoding complexity of s-order OSD can be prohibitively high.
Therefore, much early research has been devoted to reducing the complexity of OSD [6]-[9]. Recently, a threshold-based OSD decoder was shown to reduce the number of tested codewords [10]. CA-OSD [10] and segmentation-discarding decoding [11] limit the number of valid codewords to improve performance. Probability-based OSD [12] calculates the promising probability and success probability to discard candidate codewords.
On the other hand, GRAND provides a new perspective on ML decoding by estimating the noise sequence [5]. Ordered reliability bits GRAND (ORBGRAND) [13] was proposed to improve decoding throughput by generating possible error patterns (EPs). A high-throughput and energy-efficient very large-scale integration (VLSI) circuit architecture for it is given in [14].
In this paper, we propose a new scheme called pre-configured error pattern (PEP) OSD that considers OSD from a new perspective. The main innovations and advantages of this scheme are summarized as follows:
(1) Decoding process: Instead of exhaustively querying the most reliable independent symbols [4] on Hamming balls, as s-order OSD does, we apply a large number of pre-configured EPs, as in ORBGRAND, to the transformed information bits. Since massive numbers of EPs can be pre-configured before decoding, the EPs can be continuously read and tested on the hard-decision bits to see whether they fix the errors in the information bits of the permuted systematic polar code. After a Euclidean distance competition among the δ codewords that pass the CRC check, the most likely result is obtained. Owing to the characteristics of CRC-polar codes, introducing the maximum number of valid codewords δ allows decoding to stop early and thus lowers complexity.
(2) EP pre-configuring process: The EPs can either be pre-configured once for all kinds of codes (with different lengths or rates) to achieve higher decoding speed, or be generated dynamically before decoding to save hardware resources. Because optimizing the test order of the pre-configured EPs further leverages the soft information, the number of queries can be reduced markedly. Two orders are introduced: index weight (IW) & Hamming weight (HW) order and priority weight (PW) order. IW and HW relate to the error probability of a specific EP, and IW is similar to the logistic weight in ORBGRAND [13], though it covers only the transformed information bits in this scheme, which reduces the computational complexity. Moreover, PW, which is quantitatively related to IW and HW, is designed to direct an efficient use of the possible EPs.
The remainder of this work is structured as follows. Preliminaries are provided in Section 2. The design of a PEPOSD decoder is given in Section 3. The generation theory and mechanism of PEPs and the testing orders are given in Section 4. Simulation results are evaluated in Section 5. Finally, conclusions are drawn in Section 6.

CRC-polar Codes
A CRC-polar code is characterized by its code length $n$, $k$ information bits, and an $m$-bit CRC, and is thus denoted by $[n, k + m]$. For CRC-polar codes, the information bits are assigned to the channels with indices in the information set $\mathcal{A}$, which corresponds to the more reliable subchannels, with $|\mathcal{A}| = k + m$. The frozen bits, which take the default value zero, are assigned to the complementary set $\mathcal{A}^c$. The channel input depends on the encoding function
$$c = uG,$$
where $u$ and $c$ are the source and code block, respectively, and $G$ is the generator matrix. The source block $u$ consists of information bits $u_{\mathcal{A}}$ and frozen bits $u_{\mathcal{A}^c}$, and the code block is then modulated into the BPSK vector $x = 1 - 2c$. Suppose that $x$ is transmitted over a noisy channel; the received vector $y$ is represented as
$$y = x + z,$$
where $z$ is the additive Gaussian noise. Therefore, we have
$$\theta(y) = c \oplus e,$$
where $\theta(y)$ denotes the hard-decision sequence of the received vector, and $e$ denotes the EP, whose "1" bits mark the positions where the hard decision of the received vector differs from the transmitted sequence.
Note that the $i$-th element of a vector is denoted by $[i]$; for example, the $i$-th bit of the code block is $c[i]$.
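The system model above can be sketched numerically as follows. This is a minimal illustration under stated assumptions, not the construction used in the simulations: the generator matrix is taken to be the standard polar kernel power without bit reversal, and the information set, noise level, and code length are hypothetical toy values.

```python
import numpy as np

def polar_generator(n):
    """n x n polar transform over GF(2): Kronecker power of [[1, 0], [1, 1]].

    Assumption: no bit-reversal permutation is applied.
    """
    F = np.array([[1, 0], [1, 1]], dtype=int)
    G = np.array([[1]], dtype=int)
    while G.shape[0] < n:
        G = np.kron(G, F)
    return G

rng = np.random.default_rng(0)
n = 8
info_set = [3, 5, 6, 7]            # hypothetical information set A, |A| = k + m
G = polar_generator(n)

u = np.zeros(n, dtype=int)         # frozen bits stay at zero
u[info_set] = rng.integers(0, 2, size=len(info_set))
c = (u @ G) % 2                    # encoding: c = uG over GF(2)
x = 1.0 - 2.0 * c                  # BPSK mapping: bit 0 -> +1, bit 1 -> -1
sigma = 0.8                        # hypothetical noise standard deviation
y = x + sigma * rng.normal(size=n) # received vector: y = x + z
theta = (y < 0).astype(int)        # hard decision theta(y)
e = theta ^ c                      # error pattern, so that theta(y) = c XOR e
```

The last line only restates the definition of the EP: flipping the hard decision at the positions where `e` is 1 recovers the transmitted code block.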

OSD Algorithm
In OSD, two permutations $\lambda_1$ and $\lambda_2$ are applied to $y$ and $G$ before decoding. The received signal $y$ and the hard decision $\theta(y)$ are reordered accordingly. For example, $y$ is reordered as
$$\tilde{y} = \lambda_2(\lambda_1(y)).$$
Meanwhile, the permutations and Gaussian elimination transform the generator matrix $G$ into its systematic form $\tilde{G}$ [3]. Therefore, only the $k + m$ most reliable positions of $\tilde{y}$ are considered.
Then a number of candidate codewords are tested and compared to find the most likely estimate. In traditional OSD, codeword estimates are tested in increasing order of the EP's Hamming weight. For instance, in s-order OSD, the codeword estimates whose EPs have Hamming weight from 1 to s are compared. After the inverse permutations are applied, the best codeword estimate is chosen as the output.
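To make the cost of this exhaustive test order concrete, the sketch below enumerates EPs in increasing Hamming weight and counts them. It assumes that the weight-0 pattern (the hard decision itself) is also tested; the function names are illustrative only.

```python
from itertools import combinations
from math import comb

def osd_error_patterns(K, s):
    """Yield EPs as tuples of flipped positions, in increasing Hamming weight 0..s."""
    for w in range(s + 1):
        for positions in combinations(range(K), w):
            yield positions

def osd_num_patterns(K, s):
    """Number of EPs tested by s-order OSD on K systematic bits."""
    return sum(comb(K, w) for w in range(s + 1))

K = 119  # k + m of the [128, 108 + 11] code considered later
print(osd_num_patterns(K, 3))  # 280960
```

The count grows roughly as K to the power s, which is why restricting and reordering the set of tested EPs matters for short high-rate codes.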

PEPOSD Decoder
In this section, we introduce the details of PEPOSD. The whole decoder, which can generate and test the EPs in parallel, and its related processes are shown in Figure 1. There are two key units in PEPOSD: the offline pre-configurator and the online EP estimator. The pre-configurator generates and reorders all the EPs, and needs to do so only once for all codes. The related details are described in Section 4. The EP estimator consists of three modules: the pre-processor, the EP tester, and the validity checker.
We also summarize the decoding process in Algorithm 1 and describe it in detail here.
Before decoding, the signals are preprocessed by the permutations $\lambda_1$ and $\lambda_2$, which yields the hard decision of the signal with $k + m$ systematic bits. The EP tester then tests one or several EPs in parallel on the processed sequences and obtains a possible result. The validity checker decides whether the result passes the CRC check. Valid results are stored in a list until their number reaches the limit δ; otherwise, the decoder backtracks and another EP is read and tested. Finally, the Euclidean distances of the δ results are compared, and the most likely result is selected as the decoding output.
The pre-processor performs two permutations and the systematic transform. The first permutation $\lambda_1$ sorts $y$ by its absolute value $|y|$, and the second permutation $\lambda_2$ finds $k + m$ linearly independent column vectors in $G$ and places them as the first $k + m$ columns. Gaussian elimination (GE) is then performed on the permuted generator matrix $\lambda_2(\lambda_1(G))$, so that the systematic form of the generator matrix is obtained: $\tilde{G} = [I_{k+m}, P]$, where $I_{k+m}$ is a $(k + m)$-dimensional identity matrix and $P$ is the parity sub-matrix.

Online EP Estimator
Meanwhile, $\lambda_1$ and $\lambda_2$ are applied to the hard decision $\theta(y)$ and the initial index $r_0$, where $r_0[i] = i$. The reliability index $r = \lambda_2(\lambda_1(r_0))$ is thus obtained, which corresponds to the ascending-order index of reliability in the most reliable $k + m$ bits. In this way the reordered $\tilde{\theta}(y)$ and $r$ are obtained. Note that $\tilde{\theta}(y)$ consists of the first $k + m$ bits $\tilde{\theta}(y)_I$ and the remaining bits $\tilde{\theta}(y)_P$, corresponding to $I_{k+m}$ and $P$ in $\tilde{G}$, respectively, i.e., $\tilde{\theta}(y) = [\tilde{\theta}(y)_I, \tilde{\theta}(y)_P]$, where $I$ and $P$ denote the index sets of the information and parity bits, respectively.
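The pre-processing step can be sketched as below. This is a minimal illustration, assuming that $\lambda_1$ sorts in descending order of $|y|$ (the usual OSD convention) and that $\lambda_2$ is found on the fly during Gauss-Jordan elimination. The function name and the toy (7, 4) generator matrix are hypothetical and stand in for a CRC-polar generator matrix.

```python
import numpy as np

def preprocess(y, G):
    """Reliability sort plus GF(2) elimination to the systematic form [I | P].

    Returns (G_sys, r), where r[j] is the original position of permuted
    position j, so that y_tilde = y[r] applies lambda_1 and lambda_2 together.
    """
    K, n = G.shape
    perm = np.argsort(-np.abs(y), kind="stable")   # lambda_1: most reliable first
    A = G[:, perm] % 2
    row, pivots = 0, []
    for col in range(n):
        if row == K:
            break
        hits = np.nonzero(A[row:, col])[0]
        if hits.size == 0:
            continue                     # dependent column, pushed back by lambda_2
        p = row + hits[0]
        A[[row, p]] = A[[p, row]]        # move the pivot row into place
        for i in range(K):               # clear the column in all other rows
            if i != row and A[i, col]:
                A[i] ^= A[row]
        pivots.append(col)
        row += 1
    if row < K:
        raise ValueError("generator matrix is rank deficient")
    rest = [c for c in range(n) if c not in pivots]
    order = np.array(pivots + rest)      # lambda_2: independent columns first
    return A[:, order], perm[order]

# Toy (7, 4) code used only for illustration
G = np.array([[1, 0, 0, 0, 1, 1, 0],
              [0, 1, 0, 0, 1, 0, 1],
              [0, 0, 1, 0, 0, 1, 1],
              [0, 0, 0, 1, 1, 1, 1]])
y = np.array([0.9, -0.2, 1.4, -1.1, 0.3, -0.7, 1.8])
G_sys, r = preprocess(y, G)
y_tilde = y[r]                            # reordered received vector
theta_tilde = (y_tilde < 0).astype(int)   # reordered hard decision
```

Returning the combined index vector `r` keeps the inverse permutation trivial: a codeword estimate in the permuted domain is mapped back by writing its entries to positions `r`.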
For each EP, the estimate of $x$ is represented by a codeword $\hat{c}$. The systematic bits $\hat{c}_I$ are generated by removing the error from the hard decision,
$$\hat{c}_I = \tilde{\theta}(y)_I \oplus e_l,$$
where $e_l$ denotes the $l$-th EP. The whole codeword estimate is then calculated by
$$\hat{c} = \hat{c}_I \tilde{G}.$$
Therefore, a candidate source block $\hat{u}$ can be obtained. The validity checker then tests whether $\hat{u}$ passes the CRC check. If it does, $\hat{u}$ is accepted as a valid result and added to the candidate list, and the Euclidean distance $d_E = \|y - (1 - 2\hat{c})\|^2$ is calculated and compared with the current minimum $d_{E,\min}$. Once the number of candidates reaches δ, decoding is complete and the most likely candidate $u^*$ is output. This leverages the characteristic of CRC-polar codes to control the complexity.
If the candidate is invalid or the number of candidates is not yet sufficient, the decoder returns to the EP tester and reads another EP. Although the CRC generator matrix could be merged into the overall generator matrix, a separate check is beneficial for controlling the number of queries.
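The online test loop can be sketched as follows. This is a simplified illustration rather than Algorithm 1 itself: the CRC check is replaced by a caller-supplied `is_valid` callback applied to the codeword estimate, the distance is computed in the permuted domain, and the toy code, received values, and EP list (including the empty pattern) are hypothetical.

```python
import numpy as np

def pep_osd_search(y_tilde, G_sys, error_patterns, is_valid, delta):
    """Test EPs on the systematic hard-decision bits; keep the closest valid codeword.

    error_patterns: iterable of tuples of systematic positions to flip, in test order.
    is_valid: stand-in for the CRC check (assumption: applied to the codeword estimate).
    delta: maximum number of valid candidates before stopping early.
    """
    K = G_sys.shape[0]
    theta_I = (y_tilde[:K] < 0).astype(int)
    best, best_dist, found, queries = None, np.inf, 0, 0
    for ep in error_patterns:
        queries += 1
        c_I = theta_I.copy()
        c_I[np.asarray(ep, dtype=int)] ^= 1        # c_I = theta_I XOR e_l
        c_hat = (c_I @ G_sys) % 2                  # re-encode with the systematic matrix
        if not is_valid(c_hat):
            continue                               # invalid: read the next EP
        found += 1
        dist = float(np.sum((y_tilde - (1 - 2 * c_hat)) ** 2))  # squared Euclidean distance
        if dist < best_dist:
            best, best_dist = c_hat, dist
        if found >= delta:
            break                                  # early stop after delta valid candidates
    return best, best_dist, queries

# Toy systematic (7, 4) code; an even-weight test plays the role of the CRC
G_sys = np.array([[1, 0, 0, 0, 1, 1, 0],
                  [0, 1, 0, 0, 1, 0, 1],
                  [0, 0, 1, 0, 0, 1, 1],
                  [0, 0, 0, 1, 1, 1, 1]])
y_tilde = np.array([1.6, 1.3, 1.1, 0.4, -0.35, -0.3, -0.2])
eps = [(), (3,), (2,), (1,), (0,)]                 # least reliable systematic bit first
even_weight = lambda c: int(c.sum()) % 2 == 0

best, dist, queries = pep_osd_search(y_tilde, G_sys, eps, even_weight, delta=2)
print(best.tolist(), queries)  # [0, 0, 0, 1, 1, 1, 1] 2
```

With `delta=1` the same call stops after the first valid candidate (the all-zero codeword here), which illustrates how δ trades queries against the chance of finding the closest valid codeword.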

Pre-configured Error Patterns
In this section, we first discuss the theoretical basis of the PEP generation mechanism in IW&HW order. Then two integer-splitting algorithms are introduced. Finally, PW order is introduced to better control the testing order of the EPs.

IW&HW Order
Since the reliability index $r$ is obtained by $\lambda_2(\lambda_1(r_0))$, it indicates the order in which errors on these bits should be eliminated. On this basis, and referring to ORBGRAND [13], we define the reliability weight (RW), IW, and HW. RW is the sum of the approximate reliability of $e$ and collects the prior reliability information of all permuted systematic bits. However, as RW is difficult to split and control, IW is introduced. For an error pattern $e$, the corresponding IW is defined as
$$w_I(e) = \sum_{i=1}^{k+m} r[i] \, e[i],$$
which is the accumulation of the reliability indices of all the error bits $e[i]$ of the specific EP $e$. A smaller IW generally corresponds to a bigger RW, and hence to a more probable noise effect for the specific EP. IW gives a quantitative integer indicator for deciding the order in which EPs are tested. The difference between IW and the logistic weight [13] is that IW only covers the systematic bits, which is determined endogenously by the OSD algorithm and accordingly leads to different impacts. Furthermore, $w_{I,\max}$ denotes the maximum IW over all the EPs.
Similarly, the HW of a given error pattern is defined as
$$w_H(e) = \sum_{i=1}^{k+m} e[i],$$
and $w_{H,\max}$ denotes the maximum HW over all the EPs. A smaller HW often corresponds to more common errors. Where no ambiguity arises, $w_I(e)$ and $w_H(e)$ are abbreviated as $w_I$ and $w_H$ for all eligible $e$.
To pre-configure the EPs for all the IW and HW values we set, the PEP generation process is designed as follows. We first generate the EPs with $w_H = 1$. When generating "new" EPs with $w_H$ from 2 to $w_{H,\max}$, the generator first reads the "old" EPs with $w_H^* = w_H - 1$ and stores them in $E_{old}$. By splitting only the biggest integer in each old EP and keeping the smaller integers aside, the corresponding new EPs are generated. The algorithm is summarized in Algorithm 2. When splitting the integer, $b$ stands for the biggest number, and $a_1, a_2, \ldots, b$ are in ascending order. Thus, all the required EPs can be pre-configured. The integer-splitting algorithm for ORBGRAND [13] can also be consulted.
The PEP pre-configurator can produce all the EPs and store them in memory before numerous code blocks are decoded, so the decoder can read EPs continuously, which significantly reduces the decoding delay; a single pre-configuration is enough for all kinds of codes and all code blocks. On the other hand, when only a small number of code blocks are decoded, each EP can instead be generated dynamically just before being tested to achieve better energy efficiency.
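The splitting rule described above can be sketched as follows. This is one possible reading, not a transcription of Algorithm 2: each EP is represented as an ascending tuple of distinct reliability indices (so its IW is the sum and its HW is the length), indices are assumed to run from 1 to $K = k + m$, and EPs containing an index above $K$ are filtered only at the end so that no valid child is lost.

```python
def generate_eps(w_i_max, w_h_max, K):
    """Pre-configure EPs level by level in Hamming weight.

    Returns {hamming_weight: [EP, ...]}, each EP an ascending tuple of distinct
    indices with sum (IW) <= w_i_max and largest index <= K.
    """
    levels = {1: [(w,) for w in range(1, w_i_max + 1)]}
    for h in range(2, w_h_max + 1):
        new = []
        for old in levels[h - 1]:
            *small, b = old                       # b is the biggest integer
            lo = small[-1] + 1 if small else 1    # keep the parts strictly ascending
            # split b = a + (b - a) with lo <= a < b - a; the IW is unchanged
            for a in range(lo, (b - 1) // 2 + 1):
                new.append((*small, a, b - a))
        levels[h] = new
    # an index above K does not address a systematic bit, so drop such EPs last
    return {h: [ep for ep in eps if ep[-1] <= K] for h, eps in levels.items()}

levels = generate_eps(w_i_max=10, w_h_max=4, K=10)
print([len(levels[h]) for h in sorted(levels)])  # [10, 20, 11, 1]
```

Because every EP with $h$ indices has exactly one parent (obtained by merging its two largest indices), each pattern is produced once, and within each HW level the EPs come out in non-decreasing IW.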

PW Order
With IW and HW introduced and all the EPs pre-configured, PW, denoted by $w_P$, is defined in terms of $w_I$ and $w_H$, with α and β as parameters to be set. The order in which the EPs are used depends on their PW, which prevents the decoder from trying EPs of very low probability even if their HW is small. Figure 3 gives a hypothetical example illustrating the difference between IW&HW order and PW order in the PEPOSD scheme. Figure 3(a) shows IW&HW-order PEPOSD: the decoder first tests the EPs with $w_H = 1$ in the order of $w_I$, then those with $w_H = 2$, then 3, and so on. In the newly proposed scheme, by contrast, the decoder simply tests the EPs in the order of PW. Figure 3(b) shows how the PWs of the EPs correspond to their HW and IW. In this example, the EPs are tested from $w_P = 2$ to $w_P = 23$. The two EPs with $w_P = 7$ are tested together, and likewise for $w_P = 8, 9, 10, 15$. In this way, the order of using EPs can be optimized and some less probable EPs are tested far later.
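The two test orders can be contrasted with a short sketch. EPs are again represented as ascending tuples of indices (an assumption of this illustration). The function `toy_pw` is a placeholder combination of $w_I$ and $w_H$ chosen only to make the example run; it is not the PW definition of this paper, and its weights are arbitrary.

```python
def iw_hw_order(eps):
    """IW&HW order: all EPs of HW 1 by increasing IW, then HW 2, and so on."""
    return sorted(eps, key=lambda ep: (len(ep), sum(ep)))

def pw_order(eps, priority_weight):
    """Order EPs by a caller-supplied w_P(w_I, w_H); ties keep the IW&HW order."""
    return sorted(iw_hw_order(eps), key=lambda ep: priority_weight(sum(ep), len(ep)))

def toy_pw(w_i, w_h, alpha=2, beta=3):
    """Placeholder only, NOT the paper's PW: a linear mix of IW and HW."""
    return alpha * w_i + beta * w_h

eps = [(9,), (1, 2), (1,), (2,), (1, 3)]
print(iw_hw_order(eps))       # [(1,), (2,), (9,), (1, 2), (1, 3)]
print(pw_order(eps, toy_pw))  # [(1,), (2,), (1, 2), (1, 3), (9,)]
```

Under the placeholder weight, the single-bit pattern with a large index is deferred behind two-bit patterns on the least reliable positions, which is the kind of reordering PW order is meant to allow.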

Performance Evaluation
In this section, we compare the performance of PEPOSD with 3-order CA-OSD and CA-SCL (L = 32) for CRC-polar codes.

BLER Analysis
First, we compare the BLER performance of these algorithms at low complexity. Figure 4 shows the BLER comparison between PEPOSD (IW&HW) and CA-SCL at different rates, with code length $n = 64$ and CRC length $m = 6$. In this figure, PEPOSD uses (IW/HW/δ) = (75/4/20). The results show that PEPOSD outperforms CA-SCL by about 0.3 dB at comparable complexity for the high rates. Increasing the CRC length improves the performance of PEPOSD but degrades that of CA-SCL, so the advantage can become more pronounced. Figure 5 shows the BLER comparison among PEPOSD, CA-OSD, and CA-SCL at $E_b/N_0 = 4.0$ for code rates from 0.5 to 0.85. It shows that PEPOSD$_2$ (75/4/20) achieves accuracy close to that of CA-OSD. Moreover, when $R = 0.5$ and $R = 0.68$ or higher, PEPOSD clearly outperforms CA-SCL. A more detailed analysis of PEPOSD in relation to its complexity is given in Section 5.2.
Then we analyze the ultimate performance at higher complexity. Figure 6 shows the performance comparison for the [128, 108 + 11] CRC-polar code. PEPOSD (IW&HW) is shown with (IW/HW) = (100/4) and different δ, together with PEPOSD (PW) with (IW/HW/δ/α/β) = (100/4/1/2/3) and CA-SCL (L = 32). The average decoding time is measured on the same CPU. It can be concluded that PEPOSD achieves better performance at a high rate for 128-bit CRC-polar codes, and that its decoding complexity can also be smaller than that of CA-SCL in the high-SNR region. The PW order also performs better for this code.
Therefore, the simulation results show that PEPOSD achieves a better trade-off between accuracy and complexity than CA-OSD, and can also perform better than CA-SCL for some short codes. Furthermore, the parameters can be configured flexibly and the decoding process can be parallelized to further increase its throughput.

Complexity Analysis
First, we compare the computational complexity of the proposed scheme with that of CA-SCL. It should be noted that all operations in this scheme are modulo-two operations (XOR), so for the same operation count, the hardware resources and time required in practical hardware implementations will be noticeably lower. As in most OSD research, we focus on the number of queries needed in the decoding process, i.e., the number of EPs tested. The number of bit flips in this period can therefore be calculated from $Q$, where $Q$ denotes the number of queries. Another key complexity that we consider, compared with SCL, GRAND, or other algorithms, is the GE in the pre-processor, whose complexity is $O(n \cdot \min(k, n - k)^2)$. Moreover, parallel or other efficient implementations, such as that in [16], can optimize this process.
In addition, the multiplication and addition operations needed in CA-SCL can be expressed as $O(n \cdot L \cdot \log n)$. Table 1 displays the complexity estimation of PEPOSD, CA-OSD, and CA-SCL. The number of queries of PEPOSD depends mainly on δ if IW and HW are sufficiently high. In conclusion, PEPOSD can achieve clearly lower complexity for high-rate codes; for lower rates, PEPOSD may still outperform CA-SCL because it uses modulo-two operations, although further hardware analysis is needed to confirm this.
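As a rough worked example, the two order-of-magnitude expressions can be evaluated for the [128, 108 + 11] code. This sketch drops all constant factors, assumes a base-2 logarithm, and uses $k + m = 119$ as the code dimension in the GE expression; the resulting numbers are not the operation counts of Table 1.

```python
from math import log2

def ge_ops(n, k):
    """Order-of-magnitude GE cost: n * min(k, n - k)^2 (modulo-two operations)."""
    return n * min(k, n - k) ** 2

def scl_ops(n, L):
    """Order-of-magnitude CA-SCL cost: n * L * log2(n) (real-valued operations)."""
    return n * L * log2(n)

n, K, L = 128, 108 + 11, 32
print(ge_ops(n, K))   # 10368
print(scl_ops(n, L))  # 28672.0
```

For such a high-rate code the factor $\min(k, n - k)$ is small, which is consistent with the observation that the GE overhead is most favorable at high rates.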
Considered together with Figure 5, PEPOSD$_2$ (75/4/20) achieves accuracy close to that of CA-OSD while its average number of bit flips is 1/9 to 1/36 of that of 3-order CA-OSD. Also, PEPOSD$_3$ (100/4/100) obtains better accuracy than CA-OSD while greatly reducing the number of queries. For high-rate codes, PEPOSD can outperform CA-SCL and CA-OSD in both accuracy and complexity.

Conclusion
In this paper, we introduce the PEPOSD algorithm to enhance the performance of short CRC-polar codes. It integrates the generation mechanism of noise queries in ORBGRAND into the generation of error patterns in OSD. As a result, all the EPs can be pre-generated, which allows pipelined decoding for better speed. Early stopping based on the CRC check can also significantly reduce the complexity.
To optimize the decoding order of the proposed scheme, two options are introduced. IW&HW order is suitable for most circumstances, while PW order shows lower complexity with bigger IW. In this way, the range of error patterns is more controllable than in s-order CA-OSD.
Simulation results show that PEPOSD has several advantages in performance and complexity over CA-OSD and CA-SCL for CRC-polar codes, which makes it a promising approach.

Figure 1: The offline-online structure of a PEPOSD decoder

Figure 4: The comparison of BLER performance between PEPOSD and CA-SCL with different rates with the code length n = 64

Table 1: Complexity estimation of PEPOSD, CA-OSD, and CA-SCL with different CRC-polar codes. For GE and CA-SCL, the number denotes the needed operations. For OSD algorithms, the number denotes the number of bit flips. For PEPOSD$_1$, PEPOSD$_2$, PEPOSD$_3$: * Gaussian elimination is necessary in all OSD algorithms and is a mod-2 operation. **