Dual Threshold Self-Corrected Minimum Sum Algorithm for 5G LDPC Decoders

Abstract: Fifth generation (5G) is a new generation of mobile communication system developed to meet the growing demand for mobile communication. Channel coding is an indispensable part of most modern digital communication systems because it improves transmission reliability and resistance to interference. To meet the requirements of 5G communication, this paper proposes a dual threshold self-corrected minimum sum (DT-SCMS) algorithm for low-density parity-check (LDPC) decoders and designs an LDPC decoder architecture. By setting thresholds to judge the reliability of messages, the DT-SCMS algorithm erases unreliable messages, improving decoding performance and efficiency. Simulation results show that DT-SCMS performs better than SCMS. At a code rate of 1/3, DT-SCMS gains 0.2 dB over SCMS at a bit error rate of $10^{-4}$. In terms of convergence, at a code rate of 2/3, DT-SCMS reduces the number of iterations by up to 20.46% compared with SCMS, with an average reduction of 18.68%.


Introduction
There are three major application scenarios for fifth generation (5G) communication: enhanced mobile broadband (eMBB), ultra-reliable low latency communication (uRLLC), and massive machine type communication (mMTC), as defined by the International Telecommunication Union (ITU) [1]. These scenarios are characterized by ultra-high traffic density, ultra-high connection density, and ultra-high mobility. Faced with diverse scenarios and widely differing performance requirements in each of them, 5G can hardly cover all scenarios with a single technology, as previous communication systems did. Channel coding is a key technology for meeting the requirements and targets of 5G. In 5G new radio (NR), the new channel coding solution needs to support incremental-redundancy hybrid automatic repeat request (IR-HARQ), as well as various block lengths and code rates [2]. To meet these demands and achieve high-rate transmission, the NR standard selects low-density parity-check (LDPC) codes as the long-code coding scheme for the data channel [3,4].
The 5G NR LDPC coding scheme uses two base graph (BG) check matrices to adapt to different code rates and block lengths. BG1 is suitable for code rates from 1/3 to 8/9, with a maximum block length of 8448 bits; BG2 is suitable for code rates from 1/5 to 2/3, with a maximum block length of 3840 bits [2]. As noted above, LDPC codes in 5G NR must therefore support various block lengths and code rates. Moreover, in the coding scheme, the first several columns of the code block are selected for puncturing. Experiments show that introducing punctured variable nodes of relatively high degree can improve performance at lower complexity [3]. However, punctured variable nodes also cause some performance loss during decoding, so the existing decoding algorithms need to be improved.
The LDPC code is a type of error correction code with strong error correction capability, first proposed by Gallager at MIT in the 1960s [5]. Because LDPC codes offer simple description, low decoding complexity, parallel implementation, flexible use, and a low error floor, they are widely used in practical communication and data storage systems [6][7][8][9]. LDPC codes attracted little attention when Gallager proposed them, owing to the hardware and computing limitations of the time. In the 1990s, MacKay and Neal showed that LDPC codes approach the Shannon limit under iterative decoding with the belief propagation (BP) algorithm; since then, LDPC codes have gradually become a hot topic in academic research [10].
The disadvantage of the BP algorithm is that its computational complexity is high and its performance degrades severely, especially when few bits are used for message quantization. In 1999, Fossorier proposed the minimum sum (MS) algorithm to simplify the calculation of check node messages [11], and MS quickly became the most widely used decoding algorithm. However, due to the simplification, MS suffers a large loss in decoding performance, as do some hard-decision algorithms [12][13][14][15]. Therefore, many improved algorithms have appeared [16], such as performance-improving modifications of MS [17][18][19][20], approximate simplifications of the decoding function [21][22][23][24], and scheduling algorithms with improved decoding efficiency [25][26][27][28][29][30]. Among them, the offset minimum sum (OMS) algorithm and the normalized minimum sum (NMS) algorithm proposed by Chen et al. improve the performance only to a limited extent. The function approximation algorithms improve performance but increase the computational complexity. The scheduling algorithms can reduce the computational complexity but lose some performance. In addition, the self-corrected minimum sum (SCMS) algorithm proposed by Savin et al. improves the decoding performance from the perspective of variable node messages and accelerates the convergence of decoding [31][32][33]. However, the SCMS algorithm does not perform well when the channel state is good, and cannot meet the requirements of high-reliability, efficiency-oriented communication systems.
In this paper, a dual threshold self-corrected minimum sum (DT-SCMS) algorithm based on the SCMS algorithm is proposed to meet the design requirements of LDPC codes in 5G communication. During iterative decoding, the new algorithm judges the reliability of the messages against a pair of thresholds and erases the unreliable messages, which improves the decoding performance and efficiency. Simulation results show that the DT-SCMS algorithm is superior to the SCMS algorithm in both error characteristics and convergence characteristics. When the code rate is 2/3, the DT-SCMS algorithm reduces the number of iterations by up to 20.46% compared with the SCMS algorithm, with an average reduction of 18.68%.
The remainder of this paper is organized as follows. Section 2 introduces the basic decoding algorithms of LDPC codes. Section 3 proposes a new decoding algorithm for LDPC codes. Section 4 presents the performance analyses and comparisons. An architecture of the LDPC decoder is designed in Section 5. Finally, Section 6 concludes the paper.

BP Algorithm
Among the decoding algorithms of LDPC codes, the BP algorithm has the most outstanding decoding performance, but its complexity is very high. For binary LDPC codes, converting the probability operations of the algorithm to the log domain yields the LLR (log-likelihood ratio) BP algorithm. The LLR BP algorithm uses simple addition operations instead of complex multiplication operations, which greatly reduces the complexity of the decoding operations in the BP algorithm. For convenience of description, the definitions of the relevant variable symbols are shown in Table 1.

Table 1. Definitions of relevant variable symbols.

| Symbol | Definition |
| --- | --- |
| $H$ | The parity check matrix |
| $C(i)$ | The set of check nodes connected to variable node $i$ |
| $R(j)$ | The set of variable nodes connected to check node $j$ |
| $C(i)\setminus j$ | The set of check nodes connected to variable node $i$ except $j$ |
| $R(j)\setminus i$ | The set of variable nodes connected to check node $j$ except $i$ |
| $L(P_i)$ | Channel initial probability likelihood ratio message |
| $L(r_{ji})$ | The check node message (extrinsic probability likelihood ratio message from check node $j$ to variable node $i$) |
| $L(q_{ij})$ | The variable node message (extrinsic probability likelihood ratio message from variable node $i$ to check node $j$) |
| $L(q_i)$ | All messages collected by variable node $i$ |
| $\alpha$ | The correction factor in the NMS algorithm |
| $\hat{c}$ | Decoding sequence obtained by decoding decision |

The steps of the LLR BP algorithm can be summarized as follows:

(1) Initialization
Set the maximum number of iterations ($I_{max}$). For every variable node, calculate the initial probability likelihood ratio message $L(P_i) = \ln\frac{\mathrm{Prob}(i=0)}{\mathrm{Prob}(i=1)}$ that the channel passes to the variable node, and set the initial message of the check nodes to 0. Then, for each variable node $i$ and its adjacent check nodes $j \in C(i)$, set the initial message that the variable node sends to the check node as:

$$L^{(0)}(q_{ij}) = L(P_i) \tag{1}$$

(2) Iterative processing
Step 1: Check Node Processing (CNP)
For each check node $j$ and its adjacent variable nodes $i \in R(j)$, at the $l$-th iteration the check node message is calculated as [22]:

$$L^{(l)}(r_{ji}) = 2\tanh^{-1}\left(\prod_{i' \in R(j)\setminus i} \tanh\left(\frac{L^{(l-1)}(q_{i'j})}{2}\right)\right) \tag{2}$$

Step 2: Variable Node Processing (VNP)
For each variable node $i$ and its adjacent check nodes $j \in C(i)$, at the $l$-th iteration the variable node message is calculated as [22]:

$$L^{(l)}(q_{ij}) = L(P_i) + \sum_{j' \in C(i)\setminus j} L^{(l)}(r_{j'i}) \tag{3}$$

Step 3: Decision
For every variable node, the decision message is calculated as [22]:

$$L^{(l)}(q_i) = L(P_i) + \sum_{j \in C(i)} L^{(l)}(r_{ji}) \tag{4}$$

Following this, a hard decision is performed. If $L^{(l)}(q_i) > 0$, then $\hat{c}_i = 0$; otherwise $\hat{c}_i = 1$.

(3) Stop
If $H\hat{c}^T = 0$, or the maximum number of iterations is reached, the iteration ends; otherwise the iteration continues from Step 1.
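The three stages above can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation used in the paper: the function name `bp_decode`, the dictionary-based message storage, the clipping constant, and the small (7,4) Hamming parity check matrix used as a toy input (it is not an LDPC code) are all assumptions made here for brevity.

```python
import math

def bp_decode(H, llr, max_iter=20):
    """Flooding LLR BP sketch. H is a list of 0/1 rows; llr holds L(P_i)."""
    M, N = len(H), len(H[0])
    R = [[i for i in range(N) if H[j][i]] for j in range(M)]   # R(j)
    C = [[j for j in range(M) if H[j][i]] for i in range(N)]   # C(i)
    q = {(i, j): llr[i] for i in range(N) for j in C[i]}       # L(q_ij) = L(P_i)
    r = {(j, i): 0.0 for j in range(M) for i in R[j]}          # L(r_ji) = 0
    c_hat = [0 if v > 0 else 1 for v in llr]
    for it in range(1, max_iter + 1):
        # Step 1 (CNP): tanh rule over R(j)\i
        for j in range(M):
            for i in R[j]:
                prod = 1.0
                for k in R[j]:
                    if k != i:
                        prod *= math.tanh(q[(k, j)] / 2.0)
                # Clip so that atanh stays finite (assumed guard value)
                prod = max(min(prod, 1 - 1e-12), -(1 - 1e-12))
                r[(j, i)] = 2.0 * math.atanh(prod)
        # Step 3 (decision message), computed first so Step 2 can reuse the sum
        total = [llr[i] + sum(r[(j, i)] for j in C[i]) for i in range(N)]
        # Step 2 (VNP): the extrinsic message excludes the target check node j
        for i in range(N):
            for j in C[i]:
                q[(i, j)] = total[i] - r[(j, i)]
        c_hat = [0 if v > 0 else 1 for v in total]
        # Stop when every parity check is satisfied, i.e. H * c_hat^T = 0 over GF(2)
        if all(sum(c_hat[i] for i in R[j]) % 2 == 0 for j in range(M)):
            return c_hat, it
    return c_hat, max_iter

# Toy input: all-zero codeword with one unreliable (negative) channel LLR
H = [[1, 1, 0, 1, 1, 0, 0],
     [1, 0, 1, 1, 0, 1, 0],
     [0, 1, 1, 1, 0, 0, 1]]
c_hat, iters = bp_decode(H, [-0.5, 2, 2, 2, 2, 2, 2])
```

In this toy run the two check nodes attached to the first bit outvote its weak negative channel value, so the syndrome check passes after a single iteration.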

MS Algorithm
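The MS algorithm is an approximation of the LLR BP algorithm. In the check node update, the minimum and second minimum magnitudes are used in place of the confidence information to be transmitted. The entire implementation requires only "sum" and "comparison" operations.
The CNP of the MS algorithm is described as follows.
For each check node $j$ and its adjacent variable nodes $i \in R(j)$, at the $l$-th iteration the check node message is calculated as [22]:

$$L^{(l)}(r_{ji}) = \prod_{i' \in R(j)\setminus i} \mathrm{sign}\left(L^{(l-1)}(q_{i'j})\right) \cdot \min_{i' \in R(j)\setminus i} \left|L^{(l-1)}(q_{i'j})\right| \tag{5}$$

The MS algorithm greatly reduces the hardware complexity, but the approximation sacrifices part of the performance. The NMS and OMS algorithms are the typical improvements that make up for this loss.
NMS algorithm: the magnitude of the message confidence is reduced by dividing by a value, and the check node message is expressed as [22]:

$$L^{(l)}(r_{ji}) = \frac{1}{\alpha} \prod_{i' \in R(j)\setminus i} \mathrm{sign}\left(L^{(l-1)}(q_{i'j})\right) \cdot \min_{i' \in R(j)\setminus i} \left|L^{(l-1)}(q_{i'j})\right| \tag{6}$$

where $\alpha$ is called the correction factor.
OMS algorithm: the magnitude of the message confidence is reduced by subtracting a value, and the check node message is expressed as [22]:

$$L^{(l)}(r_{ji}) = \prod_{i' \in R(j)\setminus i} \mathrm{sign}\left(L^{(l-1)}(q_{i'j})\right) \cdot \max\left(\min_{i' \in R(j)\setminus i} \left|L^{(l-1)}(q_{i'j})\right| - \beta,\ 0\right) \tag{7}$$

where $\beta$ is called the offset factor.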
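To make the role of the minimum and second minimum concrete, the following Python sketch updates all outgoing messages of one check node. The function name `check_node_update` and the way `alpha` and `beta` are combined in one routine are illustrative assumptions; setting `alpha=1, beta=0` gives plain MS, `alpha>1` gives NMS, and `beta>0` gives OMS.

```python
def check_node_update(q_in, alpha=1.0, beta=0.0):
    """Min-sum CNP for one check node j.

    q_in  : incoming messages L(q_ij) for all i in R(j) (at least two entries)
    alpha : NMS correction factor (the magnitude is divided by alpha)
    beta  : OMS offset factor (subtracted from the magnitude, floored at 0)
    Returns the outgoing messages L(r_ji) in the same order as q_in.
    """
    mags = [abs(v) for v in q_in]
    # Only the smallest and second smallest magnitudes are ever needed
    min1_idx = min(range(len(mags)), key=mags.__getitem__)
    min1 = mags[min1_idx]
    min2 = min(m for k, m in enumerate(mags) if k != min1_idx)
    # Product of all signs; each edge later removes its own sign
    sign_all = 1
    for v in q_in:
        if v < 0:
            sign_all = -sign_all
    out = []
    for k, v in enumerate(q_in):
        mag = min2 if k == min1_idx else min1      # minimum over R(j)\i
        mag = max(mag / alpha - beta, 0.0)
        own_sign = -1 if v < 0 else 1
        out.append(sign_all * own_sign * mag)
    return out

ms = check_node_update([2.0, -0.5, 3.0, -1.5])
nms = check_node_update([2.0, -0.5, 3.0, -1.5], alpha=2.0)
oms = check_node_update([2.0, -0.5, 3.0, -1.5], beta=0.6)
```

The edge that holds the overall minimum receives the second minimum, and every other edge receives the minimum, which is why only two magnitudes have to be tracked per check node.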

Proposed DT-SCMS Algorithm
The NMS and OMS algorithms apply their corrections in the processing of check node messages. Another type of improvement, the SCMS algorithm, applies its corrections to the variable node messages instead. During variable node message processing, "untrusted" messages are selectively erased according to the fluctuation of the variable node message, which speeds up convergence and recovers part of the performance loss of the MS algorithm.

SCMS Algorithm
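The VNP of the SCMS algorithm is described as follows. For each variable node $i$ and its adjacent check nodes $j \in C(i)$, at the $l$-th iteration the variable node message is calculated as [31]:

$$L^{(l)}(q_{ij}) = L(P_i) + \sum_{j' \in C(i)\setminus j} L^{(l)}(r_{j'i}) \tag{8}$$

$$e_{ij}^{(l)} = \left(s_{ij}^{(l-1)} \oplus s_{ij}^{(l)}\right) \wedge \overline{e_{ij}^{(l-1)}} \tag{9}$$

$$L^{(l)}(q_{ij}^{SC}) = \begin{cases} 0, & e_{ij}^{(l)} = 1 \\ L^{(l)}(q_{ij}), & e_{ij}^{(l)} = 0 \end{cases} \tag{10}$$

where "$\oplus$" means the Exclusive-OR (XOR) operation in binary, and "$\wedge$" and the overbar denote binary AND and NOT; $e_{ij}^{(l)}$ indicates the position of the $L^{(l)}(q_{ij})$ that is erased in this iteration, and is used to prevent erasing the message at the same position in two consecutive iterations; $s_{ij}^{(l)}$ denotes the sign of $L^{(l)}(q_{ij})$ in this iteration (0 for a non-negative value, 1 for a negative value); $L^{(l)}(q_{ij}^{SC})$ is the new variable node message obtained according to the erasure rule.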
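The erasure rule can be illustrated for a single edge with the short Python sketch below. It assumes the sign-bit convention 0 for non-negative and 1 for negative values, and it assumes that the sign compared against is that of the previous uncorrected message; the function name `scms_correct` and the sample message sequence are hypothetical.

```python
def scms_correct(q_tmp, s_prev, e_prev):
    """Apply the self-correction rule to one edge message.

    q_tmp  : uncorrected variable node message of this iteration
    s_prev : sign bit of the previous iteration's message (0 or 1)
    e_prev : 1 if this edge was erased in the previous iteration, else 0
    Returns (corrected message, current sign bit, current erasure flag).
    """
    s_cur = 1 if q_tmp < 0 else 0
    # Erase on a sign flip, unless the same edge was erased last time
    e_cur = (s_prev ^ s_cur) & (1 - e_prev)
    q_sc = 0.0 if e_cur else q_tmp
    return q_sc, s_cur, e_cur

# Follow one edge over four iterations, starting from a positive channel value
s, e = 0, 0
history = []
for q in [1.2, -0.3, 0.8, 0.9]:
    q_sc, s, e = scms_correct(q, s, e)
    history.append((q_sc, e))
```

In this sequence the second message flips sign and is erased, while the third flips back but is kept because the edge was erased in the previous iteration.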

DT-SCMS Algorithm
Based on the SCMS algorithm, the proposed DT-SCMS algorithm adds two thresholds when judging the reliability of the variable node messages, and dynamically adjusts the range of erased messages so that it can better adapt to different environments. In addition, the new algorithm uses the NMS algorithm in the check node update to preserve the decoding performance.
In the variable node processing, if the variable node message moves in the opposite sign direction and lands between the two set thresholds, it is set to zero in the present update. This can be understood by means of Figure 1. In Figure 1, $L^{(l-1)}(q_{ij})$ is the variable node message in the previous iteration, and th1 and th2 are the set thresholds. If $L^{(l-1)}(q_{ij})$ is negative, then $L^{(l)}(q_{ij})$ is erased when it lies between th1 and th2, as in Figure 1a. If $L^{(l-1)}(q_{ij})$ is positive, then $L^{(l)}(q_{ij})$ is erased when it falls between th1 and th2, as in Figure 1b.

Thresholds Setting
If th1 lies far from $L^{(l-1)}(q_{ij})$, only a small number of node messages are modified, so the performance improvement is limited. Conversely, if th1 lies close to $L^{(l-1)}(q_{ij})$, a large number of node messages are modified, which expands the range of messages judged unreliable and degrades the decoding performance. The purpose of adding th2 is to identify $L^{(l)}(q_{ij})$ as a reliable message when it deviates very far from $L^{(l-1)}(q_{ij})$. Here, the thresholds th1 and th2 are defined as:

$$\mathrm{th1} = \theta_1 \cdot L^{(l-1)}(q_{ij}) \tag{11}$$

$$\mathrm{th2} = \theta_2 \cdot L^{(l-1)}(q_{ij}) \tag{12}$$

where $\theta_1$ and $\theta_2$ are adjustment factors with $-0.5 \le \theta_1 \le 0.5$ and $-1.5 \le \theta_2 \le -0.5$, so that th1 lies neither too far from nor too close to $L^{(l-1)}(q_{ij})$, and th2 can filter out certain reliable message nodes. Both th1 and th2 are defined relative to $L^{(l-1)}(q_{ij})$. Because $L^{(l-1)}(q_{ij})$ changes in each iteration, th1 and th2 change with the iterative updates as well, so the criteria for judging reliability are adaptively updated.
For different LDPC codes, the thresholds can be set differently according to the requirements of the application. Density evolution can be used to choose the optimal values of $\theta_1$ and $\theta_2$, as is done for the NMS algorithm [17].
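The threshold rule of this subsection can be illustrated with a small Python sketch. It assumes that each threshold is the previous message scaled by its adjustment factor and that the interval between the thresholds is open; the function names and the values $\theta_1 = 0.25$, $\theta_2 = -1$ are hypothetical choices inside the stated ranges, not the values used in the simulations.

```python
def dt_thresholds(q_prev, theta1, theta2):
    """Return (th1, th2) as scaled copies of the previous message."""
    return theta1 * q_prev, theta2 * q_prev

def is_unreliable(q_new, q_prev, theta1=0.25, theta2=-1.0):
    """True if the new message falls strictly between th1 and th2."""
    th1, th2 = dt_thresholds(q_prev, theta1, theta2)
    # The order of th1 and th2 depends on the sign of q_prev (Figure 1a vs 1b)
    low, high = min(th1, th2), max(th1, th2)
    return low < q_new < high

# Previous message +4.0 gives th1 = 1.0 and th2 = -4.0
near = is_unreliable(2.0, 4.0)      # stays close to the old value: reliable
middle = is_unreliable(-1.0, 4.0)   # lands between the thresholds: erased
far = is_unreliable(-6.0, 4.0)      # moves far beyond th2: treated as reliable
mirrored = is_unreliable(1.0, -4.0) # same situation for a negative old value
```

Setting `theta1=0` and letting `theta2` grow very negative recovers the fixed zero threshold of SCMS, which shows how the two factors widen or narrow the erased region.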

Iterative Process
The flow chart of the DT-SCMS algorithm is shown in Figure 2. In the check node processing (CNP), NMS is adopted to preserve the decoding performance, and $\alpha$ is the correction factor. The iterative process of the DT-SCMS algorithm is described as follows:

Step 1: VNP
For each variable node $i$ and its adjacent check nodes $j \in C(i)$, at the $l$-th iteration the variable node message is calculated as:

$$L^{(l)}(q_{ij}) = L(P_i) + \sum_{j' \in C(i)\setminus j} L^{(l-1)}(r_{j'i}) \tag{13}$$

$$e_{ij}^{(l)} = \begin{cases} 1, & s_{ij}^{(l-1)} = 0,\ \mathrm{th2} < L^{(l)}(q_{ij}) < \mathrm{th1},\ e_{ij}^{(l-1)} = 0 \\ 1, & s_{ij}^{(l-1)} = 1,\ \mathrm{th1} < L^{(l)}(q_{ij}) < \mathrm{th2},\ e_{ij}^{(l-1)} = 0 \\ 0, & \text{otherwise} \end{cases} \tag{14}$$

$$L^{(l)}(q_{ij}^{DTSC}) = \begin{cases} 0, & e_{ij}^{(l)} = 1 \\ L^{(l)}(q_{ij}), & e_{ij}^{(l)} = 0 \end{cases} \tag{15}$$

where $e_{ij}^{(l)}$ indicates the position of the $L^{(l)}(q_{ij})$ that is erased in this iteration, and is used to prevent erasing the message at the same position in two consecutive iterations; $s_{ij}^{(l-1)}$ denotes the sign of $L^{(l-1)}(q_{ij})$ (0 for a non-negative value, 1 for a negative value); $L^{(l)}(q_{ij}^{DTSC})$ is the new variable node message obtained according to the erasure rule.

Step 2: CNP
For each check node $j$ and its adjacent variable nodes $i \in R(j)$, at the $l$-th iteration the check node message is calculated as:

$$L^{(l)}(r_{ji}) = \frac{1}{\alpha} \prod_{i' \in R(j)\setminus i} \mathrm{sign}\left(L^{(l)}(q_{i'j}^{DTSC})\right) \cdot \min_{i' \in R(j)\setminus i} \left|L^{(l)}(q_{i'j}^{DTSC})\right| \tag{16}$$

Step 3: Decision
For every variable node, the decision message is calculated as:

$$L^{(l)}(q_i) = L(P_i) + \sum_{j \in C(i)} L^{(l)}(r_{ji}) \tag{17}$$

A hard decision is then performed. If $L^{(l)}(q_i) > 0$, then $\hat{c}_i = 0$; otherwise $\hat{c}_i = 1$.

SCMS uses zero as a fixed threshold to identify unreliable messages, and therefore ignores how much a variable node message changes between two adjacent iterations. According to the dynamic adjustment rules of the thresholds, Equation (14) gives the positions of the unreliable messages that will be erased in the iteration, so both small and large changes are taken into account. If the new value $L^{(l)}(q_{ij})$ stays close to the last value $L^{(l-1)}(q_{ij})$, it is treated as a reliable message (the role of th1); if $L^{(l)}(q_{ij})$ deviates very far from $L^{(l-1)}(q_{ij})$, it is also treated as a reliable message (the role of th2); if $L^{(l)}(q_{ij})$ falls between th1 and th2, it is erased. In this way, the proposed algorithm effectively suppresses the transmission of unreliable messages and retains reliable ones, thereby improving the reliability of decoding and accelerating its convergence.
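As a rough illustration of the VNP and decision steps for a single variable node, consider the Python sketch below. Several details are assumptions rather than statements from the paper: the thresholds are taken as the previous message scaled by $\theta_1$ and $\theta_2$, the interval is open, the previous message kept for the next iteration is the uncorrected one, and the name `dt_scms_vnp` and the values $\theta_1 = 0$, $\theta_2 = -1$ are hypothetical.

```python
def dt_scms_vnp(llr_i, r_in, q_prev, e_prev, theta1=0.0, theta2=-1.0):
    """DT-SCMS variable node processing for one variable node i.

    llr_i  : channel message L(P_i)
    r_in   : incoming check node messages L(r_ji), one per j in C(i)
    q_prev : previous uncorrected messages L(q_ij), same order as r_in
    e_prev : previous erasure flags (0 or 1), same order as r_in
    Returns (corrected messages, uncorrected messages, new flags, hard bit).
    """
    total = llr_i + sum(r_in)                 # decision message L(q_i)
    q_out, q_tmp, e_cur = [], [], []
    for r, qp, ep in zip(r_in, q_prev, e_prev):
        q = total - r                         # extrinsic sum over C(i)\j
        th1, th2 = theta1 * qp, theta2 * qp
        between = min(th1, th2) < q < max(th1, th2)
        e = 1 if (between and not ep) else 0  # never erase twice in a row
        q_tmp.append(q)
        e_cur.append(e)
        q_out.append(0.0 if e else q)
    c_hat = 0 if total > 0 else 1
    return q_out, q_tmp, e_cur, c_hat

q_out, q_tmp, e_cur, bit = dt_scms_vnp(
    llr_i=1.0,
    r_in=[-3.0, 0.5, 0.5],
    q_prev=[2.0, 2.0, 2.0],
    e_prev=[0, 0, 1],
)
```

Here the second and third edges both land between the thresholds, but only the second is erased because the third was already erased in the previous iteration.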

Experiments and Simulations
Experiments are conducted at code rates of 1/3 and 2/3 in the 5G LDPC coding scheme. The coded bits are modulated with Binary Phase Shift Keying (BPSK) and transmitted over an Additive White Gaussian Noise (AWGN) channel, simulated on the MATLAB platform. The BP, MS, NMS, SCMS, and DT-SCMS algorithms are simulated with the parameter settings in Table 2. The simulation results are shown in Figures 3 and 4.
As can be seen from Figures 3 and 4, the DT-SCMS algorithm is superior to the SCMS algorithm in both error characteristics and convergence characteristics. In regions of high signal-to-noise ratio (SNR) in particular, the convergence advantage of the DT-SCMS algorithm over the SCMS algorithm is more pronounced. The numbers of iterations of the SCMS algorithm and the DT-SCMS algorithm at the different SNRs in Figure 4b are listed in Table 3. From Figure 4b and Table 3, it can be seen that, as the channel SNR increases, the number of iterations of LDPC decoding gradually decreases. The convergence characteristics of the DT-SCMS algorithm are better than those of the SCMS algorithm, especially when the SNR is high. This is because the dynamic judgment of the DT-SCMS algorithm on the erased messages is more effective in suppressing the passing of unreliable messages, which speeds up the convergence of decoding. According to Table 3, when the code rate is 2/3, the DT-SCMS algorithm reduces the number of iterations by up to 20.46% compared with the SCMS algorithm, with an average reduction of 18.68%. In the experiment with R = 2/3, several intermediate values of $L^{(l-1)}(q_{ij})$ are taken from the iterative process, and the corresponding values of th1 and th2 are calculated, as shown in Table 4. From Table 4, we can see that th1 and th2 change dynamically with the value of $L^{(l-1)}(q_{ij})$. In addition, the experiments show that this dynamic adjustability improves the decoding performance of the proposed algorithm.
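The simulation chain and the reduction figures can be illustrated with the Python sketch below. It is not the MATLAB code used for the experiments: it assumes the common bit mapping 0 to +1 and 1 to −1, assumes that the SNR is expressed as $E_b/N_0$ so that $\sigma^2 = 1/(2R \cdot E_b/N_0)$, and uses hypothetical function names and iteration counts.

```python
import math
import random

def noise_sigma(ebn0_db, rate):
    """Noise standard deviation for unit-energy BPSK (assumes SNR = Eb/N0)."""
    ebn0 = 10 ** (ebn0_db / 10)
    return math.sqrt(1.0 / (2.0 * rate * ebn0))

def bpsk_awgn_llr(bits, ebn0_db, rate, rng):
    """Map bits to BPSK, add Gaussian noise, return channel LLRs L(P_i)."""
    sigma = noise_sigma(ebn0_db, rate)
    llr = []
    for b in bits:
        y = (1.0 - 2.0 * b) + rng.gauss(0.0, sigma)   # 0 -> +1, 1 -> -1
        llr.append(2.0 * y / sigma ** 2)              # ln P(0|y) / P(1|y)
    return llr

def iteration_reduction_percent(iters_scms, iters_dt):
    """Relative saving in iterations of DT-SCMS with respect to SCMS."""
    return 100.0 * (iters_scms - iters_dt) / iters_scms

rng = random.Random(0)                                # fixed seed for repeatability
bits = [0, 1, 0, 0, 1]
llr = bpsk_awgn_llr(bits, ebn0_db=40.0, rate=2 / 3, rng=rng)
saving = iteration_reduction_percent(10.0, 8.0)       # hypothetical counts
```

With this sign convention a positive LLR favours bit 0, which matches the hard decision rule used in the decoding steps.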

Complexity Analyses
Table 5 compares the computational complexity of BP, MS, NMS, SCMS, and DT-SCMS algorithms in a single iteration where N represents the encoding length, K represents the effective information length, and W is the column weight of the check matrix.

| Algorithm | Multiplication | Division | Addition |
| --- | --- | --- | --- |
| BP [10] | 11NW − 6(N + K) | N(W + 1) | N(3W + 1) |
| MS [11] | 0 | 0 | N(4W − 1) + K(logW − 2) |
| NMS [17] | 0 | | |

It can be seen that the improved algorithms based on the MS algorithm eliminate a large number of multiplication and division operations relative to the BP algorithm, which greatly reduces the computational complexity of the implementation. The DT-SCMS algorithm proposed in this paper adds a small number of multiplication and division operations to the variable node message processing. In an implementation, the adjustment factors of the thresholds can be set to 1/2, 1/4, 1/8, etc., so that the thresholds can be obtained with simple shift and addition operations. In general, the DT-SCMS algorithm improves the performance of the decoding algorithm and reduces the number of iterations at the cost of a small amount of additional computational complexity.
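The shift-and-add idea can be sketched in Python for integer (fixed-point) messages. The helper name `scale_by_shifts` and the sample values are hypothetical, and the rounding behaviour shown is that of Python's arithmetic right shift, which rounds toward negative infinity; a hardware design may round differently.

```python
def scale_by_shifts(q, shifts, negate=False):
    """Multiply integer q by sum(2**-s for s in shifts) using shifts and adds.

    q      : fixed-point message stored as a Python int
    shifts : right-shift amounts, e.g. (1,) for 1/2 or (0, 1) for 1 + 1/2
    negate : set True for a negative adjustment factor
    """
    acc = 0
    for s in shifts:
        acc += q >> s          # arithmetic shift, no multiplier needed
    return -acc if negate else acc

q_prev = 48                                             # hypothetical quantized message
th1 = scale_by_shifts(q_prev, (2,))                     # theta1 = 1/4
th2 = scale_by_shifts(q_prev, (0, 1), negate=True)      # theta2 = -1.5
th1_neg = scale_by_shifts(-48, (1,))                    # theta1 = 1/2, negative input
floored = scale_by_shifts(-5, (1,))                     # -2.5 rounds down to -3
```

Restricting the adjustment factors to sums of powers of two in this way keeps the threshold computation free of general multipliers and dividers.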

Design Architecture and Implementation
The Vivado High-Level Synthesis (HLS) platform launched by Xilinx enables rapid IP-centric design, greatly improves productivity, and has been widely used in many designs [34,35]. Vivado HLS allows Register Transfer Level (RTL) hardware functions to be described in a high-level programming language, raising the level of abstraction of the system design.
An architecture of the LDPC decoder is designed in Figure 5. It mainly includes three modules: initialization (readData), information update (updateInfo), and decision (Check). The information update module consists of CNP (rowUpdate) and VNP (colUpdate). On the Vivado HLS platform, an LDPC decoder with a code length of 1584 bits and a code rate of R = 2/3 is simulated and synthesized, using the chip "xc7k420tffg901-2" from Xilinx's device library. In Figure 5, llr indicates the channel input information, and blockdout indicates the output of the decoding results. In Vivado HLS, each module is implemented as a C++ function. After compilation, the design can be simulated, synthesized, and verified on the test platform. The initialization module reads the initial values from the input channel information (llr) and stores them in the variable node message ($L(q_{ij})$) and the judgment information ($L(q_i)$). The CNP module, also called "the row update module", calculates the check node message ($L(r_{ji})$). The VNP module, also called "the column update module", first calculates the judgment message ($L(q_i)$) and then calculates the variable node message ($L(q_{ij})$). The performance evaluation of the decoder after synthesis in Vivado HLS is shown in Figure 6. A total of 42,336 clock cycles are required to complete one decoding process (22 iterations), of which approximately 1586 cycles are required to read the channel soft information, and 1802 cycles are required for decoding iterations. As described in Section 4.1, compared with the SCMS algorithm, the proposed DT-SCMS algorithm can greatly reduce the number of decoding iterations. Therefore, the total decoding latency is reduced correspondingly when the average number of iterations required for decoding is reduced.
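The relation between cycle count, iteration count, and latency can be made concrete with a small calculation in Python. The clock frequency of 100 MHz is purely hypothetical, since no clock is stated here, and the sketch assumes that all cycles other than the input read scale linearly with the number of iterations, which is a simplification of the synthesized design.

```python
def latency_us(cycles, clock_mhz):
    """Convert a cycle count to microseconds at the given clock frequency."""
    return cycles / clock_mhz

def scaled_cycles(total_cycles, read_cycles, iteration_ratio):
    """Scale the non-read part of the cycle count by a ratio of iteration counts."""
    return read_cycles + (total_cycles - read_cycles) * iteration_ratio

TOTAL_CYCLES = 42336     # reported total for one decoding process (22 iterations)
READ_CYCLES = 1586       # reported cycles for reading channel soft information
CLOCK_MHZ = 100.0        # hypothetical clock frequency, not taken from the paper

baseline_us = latency_us(TOTAL_CYCLES, CLOCK_MHZ)
# Apply the reported 18.68% average iteration reduction as an illustration
reduced_cycles = scaled_cycles(TOTAL_CYCLES, READ_CYCLES, 1 - 0.1868)
reduced_us = latency_us(reduced_cycles, CLOCK_MHZ)
```

Under these assumptions the latency saving is slightly smaller than the iteration saving, because the cycles spent reading the channel information do not shrink with the iteration count.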

Conclusions
In this paper, we propose a dual threshold self-corrected minimum sum algorithm based on the SCMS algorithm, and design an architecture for LDPC decoders. Two thresholds derived from the variable node message are used to judge whether a message is reliable; an unreliable message is erased and waits for the update in the next iteration. The thresholds are set relative to the variable node message and change with each iteration, so the criterion for judging reliability is adaptively updated. Simulation results indicate that the DT-SCMS algorithm achieves better performance in a certain code rate range at the cost of a small amount of computational complexity, effectively improving the error characteristics and convergence characteristics of the decoding system.

Figure 1. Schematic diagram of reliable message determination for the dual threshold self-corrected minimum sum (DT-SCMS) algorithm.

Figure 2. The flow chart of the DT-SCMS algorithm.

Table 4. Some values of th1 and th2.

Table 5. Computational complexity of decoding algorithms.