Article

Adaptive Learned Belief Propagation for Decoding Error-Correcting Codes

Telecom Paris, Institut Polytechnique de Paris, 91120 Palaiseau, France
*
Author to whom correspondence should be addressed.
Entropy 2025, 27(8), 795; https://doi.org/10.3390/e27080795
Submission received: 12 June 2025 / Revised: 16 July 2025 / Accepted: 22 July 2025 / Published: 25 July 2025
(This article belongs to the Section Information Theory, Probability and Statistics)

Abstract

Weighted belief propagation (WBP) for the decoding of linear block codes is considered. In WBP, the Tanner graph of the code is unrolled with respect to the iterations of the belief propagation decoder. Then, weights are assigned to the edges of the resulting recurrent network and optimized offline using a training dataset. The main contribution of this paper is an adaptive WBP where the weights of the decoder are determined for each received word. Two variants of this decoder are investigated. In the parallel WBP decoders, the weights take values in a discrete set. A number of WBP decoders are run in parallel to search for the best sequence of weights in real time. In the two-stage decoder, a small neural network is used to dynamically determine the weights of the WBP decoder for each received word. The proposed adaptive decoders demonstrate significant improvements over their static counterparts in two applications. In the first application, Bose–Chaudhuri–Hocquenghem, polar and quasi-cyclic low-density parity-check (QC-LDPC) codes are used over an additive white Gaussian noise channel. The results indicate that the adaptive WBP achieves bit error rates (BERs) up to an order of magnitude less than the BERs of the static WBP at about the same decoding complexity, depending on the code, its rate, and the signal-to-noise ratio. The second application is a concatenated code designed for a long-haul nonlinear optical fiber channel where the inner code is a QC-LDPC code and the outer code is a spatially coupled LDPC code. In this case, the inner code is decoded using an adaptive WBP, while the outer code is decoded using the sliding window decoder and static belief propagation. The results show that the adaptive WBP provides a coding gain of 0.8 dB compared to the neural normalized min-sum decoder, with about the same computational complexity and decoding latency.

1. Introduction

Neural networks (NNs) have been widely studied to improve communication systems. The ability of NNs to learn from data and model complex relationships makes them indispensable tools for tasks such as equalization, monitoring, modulation classification, and beamforming [1]. While NNs have also been considered for decoding error-correcting codes for quite some time [2,3,4,5,6,7,8,9,10], interest in this area has surged significantly in recent years due to advances in NNs and their widespread commercialization [11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46].
Two categories of neural decoders may be considered. In model-agnostic decoders, the NN has a general architecture independent of the conventional decoders in coding theory [12,34,42]. Many of the common architectures have been studied for decoding, including multi-layer perceptrons [5,12,47], convolutional NNs (CNNs) [48], recurrent neural networks (RNNs) [13], autoencoders [19,35,40], convolutional decoders [6], graph NNs [33,43], and transformers [34,41]. These models have been used to decode linear block codes [4,12], Reed–Solomon codes [49], convolutional codes [4,48,50], Bose–Chaudhuri–Hocquenghem (BCH) codes [11,16,18,31], Reed–Muller codes [5,25], turbo codes [13], low-density parity-check (LDPC) codes [37,51], and polar codes [52].
Training neural decoders is challenging because the number of codewords to classify depends exponentially on the number of information bits. Furthermore, the sample complexity of the NN is high for the small bit error rates (BERs) and long block lengths required in some applications. As a consequence, model-agnostic decoders often require a large number of parameters and may overfit, which makes them impractical unless the block length is short.
In model-based neural decoders, the architecture of the NN is based on the structure of a conventional decoder [11,14,25,28,31,53]. An example is weighted belief propagation (WBP), where the messages exchanged across the edges of the Tanner graph of the code are weighted and optimized [11,31,54]. This gives rise to a decoder in the form of a recurrent network obtained by unfolding the update equations of the belief propagation (BP) over the iterations. Since the WBP is a biased model, it has fewer parameters than the model-agnostic NNs at the same accuracy.
Prior work has demonstrated that the WBP outperforms BP for block lengths up to around 1000, particularly with structured codes, low-to-moderate code rates, and high signal-to-noise ratios (SNRs) [17,28,31,37,38,39,54,55,56]. It is believed that the improvement is achieved by altering the log-likelihood ratios (LLRs) that are passed along short cycles. For example, for BCH and LDPC codes with block lengths under 200, WBP provides frame error rate (FER) improvements of up to 0.4 dB in the waterfall region and up to 1.5 dB in the error-floor region [23,57,58]. Protograph-based (PB) QC-LDPC codes have been similarly decoded using the learned weighted min-sum (WMS) decoder [28].
The WBP does not generalize well at low BERs due to the requirement of long block lengths and the resulting high sample and training complexity [44]. For example, in optical fiber communication, the block length can be up to tens of thousands of bits to achieve a BER of $10^{-15}$. In this case, the sample complexity of the WBP is high, and the model does not generalize well when trained with a practically manageable number of examples.
The training complexity and storage requirements of the WBP can be reduced through parameter sharing. Lian et al. introduced a WBP decoder wherein the parameters are shared across or within the layers of the NN [18,39]. A number of parameter-sharing schemes in WBP are studied in [28,39]. Despite intensive research in recent years, WBP remains impractical in most real-world applications.
In this work, we improve the generalization of WBP to enhance its practical applicability. The WBP is a static NN, trained offline based on a dataset. The main contribution of this paper is the proposal of adaptive learned message-passing algorithms, where the weights assigned to messages are determined for each received word. In this case, the decoder is dynamic, changing its parameters for each transmission in real time.
Two variants of this decoder are proposed. In the parallel decoder architecture, the weights take values in a discrete set. A number of WMS decoders are run in parallel to find the best sequence of weights based on the Hamming weight of the syndrome of the received word. In the two-stage decoder, a secondary NN is trained to compute the weights to be used in the primary NN decoder. The secondary NN is a CNN that takes the LLRs of the received word and is optimized offline.
The performance and computational complexity of the static and adaptive decoders are compared in two applications. In the first application, a number of regular and irregular quasi-cyclic low-density parity-check (QC-LDPC) codes, along with a BCH and a polar code, are evaluated over an additive white Gaussian noise (AWGN) channel in both low- and high-rate regimes. The results indicate that the adaptive WMS decoders achieve decoding BERs up to an order of magnitude less than the BERs of the static WMS decoders, at about the same decoding complexity, depending on the code, its rate, and the SNR. The coding gain is 0.32 dB at a BER of $10^{-4}$ in one example.
The second application is coding over a nonlinear optical fiber link with wavelength division multiplexing (WDM). The data rates in today’s optical fiber communication system approach terabits/s per wavelength. Here, the complexity, power consumption, and latency of the decoder are important considerations. We apply concatenated coding by combining a low-complexity short-block-length soft-decision inner code with a long-block-length hard-decision outer code. This approach allows the component codes to have much shorter block lengths and higher BERs than the combined code. As a result, it becomes feasible to train the WBP for decoding the inner code, addressing the curse of dimensionality and sample complexity issues. For PB QC-LDPC inner codes and a spatially coupled (SC) QC-LDPC outer code, the results indicate that the adaptive WBP outperforms the static WBP by 0.8 dB at about the same complexity and decoding latency in a 16-QAM 8 × 80 km 32 GBaud WDM system with five channels.
The remainder of this paper is organized as follows. Section 2 introduces the notation, followed by the channel models in Section 3. In Section 4, we introduce the WBP, and in Section 5, we present two adaptive learned message-passing algorithms. In Section 6, we compare the performance and complexity of the static and adaptive decoders, and in Section 7, we conclude the paper. Appendix A and Appendix B provide supplementary information, and Appendix C presents the parameters of the codes.

2. Notation

Natural, real, complex, and non-negative real numbers are denoted by $\mathbb{N}$, $\mathbb{R}$, $\mathbb{C}$, and $\mathbb{R}^+$, respectively. The set of integers from $m$ to $n$ is written as $[m:n] = \{m, m+1, \ldots, n\}$. The special case $m = 1$ is shortened to $[n] \stackrel{\Delta}{=} [1:n]$. $\lfloor x \rfloor$ and $\lceil x \rceil$ denote, respectively, the floor and ceiling of $x \in \mathbb{R}$. The Galois field GF($q$) with $q \in \mathbb{N}$, $q \geq 2$, is $\mathbb{F}_q$. The set of matrices with $m$ rows, $n$ columns, and elements in $[0:q-1]$ is $\mathbb{F}_q^{m \times n}$.
A sequence of length $n$ is denoted by $x^n = (x_1, x_2, \ldots, x_n)$. Deterministic vectors are denoted by boldface font, e.g., $\mathbf{x} \in \mathbb{R}^n$. The $i$-th entry of $\mathbf{x}$ is $[\mathbf{x}]_i$. Deterministic matrices are shown by upper-case letters in mathrm font, e.g., $\mathsf{A}$.
The probability density function (PDF) of a random variable $X$ is denoted by $\Pr_X(x)$, shortened to $\Pr(x)$ if there is no ambiguity. The conditional PDF of $Y$ given $X$ is $\Pr_{Y|X}(y|x)$. The expected value of a random variable $X$ is denoted by $\mathbb{E}(X)$. The real Gaussian PDF with mean $\theta$ and standard deviation $\sigma$ is denoted by $\mathcal{N}(\theta, \sigma^2)$. The Q-function is $Q(x) = \frac{1}{2}\mathrm{erfc}\left(\frac{x}{\sqrt{2}}\right)$, where $\mathrm{erfc}(x)$ is the complementary error function. The binary entropy function is $H_b(x) = -\left(x\log_2(x) + (1-x)\log_2(1-x)\right)$, $x \in (0,1)$.

3. Channel Models

3.1. AWGN Channel

Encoder: We consider an $(n, k)$ binary linear code $\mathcal{C}$ with the parity-check matrix (PCM) $\mathsf{H} \in \mathbb{F}_2^{m \times n}$, where $n$ is the code length, $k$ is the code dimension, and $m \geq n - k$, $m \in \mathbb{N}$. The rate of the code is $r = k/n \geq 1 - m/n$. A PB QC-LDPC code is characterized by a lifting factor $M \in \mathbb{N}$, a base matrix $\mathsf{B} \in \{0, 1\}^{\lambda \times \omega}$, and an exponent matrix $\mathsf{P} \in \{-1, 0, 1, \ldots, M-1\}^{\lambda \times \omega}$, where $\lambda, \omega \in \mathbb{N}$, $\lambda < \omega$. Given $(\lambda, \omega, M, \mathsf{P})$, the PCM is obtained according to the procedure in Appendix A.
We evaluate a BCH code, seven regular and irregular QC-LDPC codes, and a polar code in the low- and high-rate regimes. These codes are summarized in Table 1 and described in Section 6.1. The parameters of the QC-LDPC codes are given in Appendix C.
The encoder maps a sequence of information bits $\mathbf{b} = (b_1, b_2, \ldots, b_k)$, $b_i \in \{0,1\}$, $i \in [k]$, to a codeword $\mathbf{c} = (c_1, c_2, \ldots, c_n)$ as $\mathbf{c} = \mathbf{b}\mathsf{G}$, where $\mathsf{G} \in \mathbb{F}_2^{k \times n}$ is the generator matrix of the code.
Channel model: The codeword $\mathbf{c}$ is modulated with binary phase shift keying with symbols $\pm A$, $A \in \mathbb{R}$, and transmitted over an AWGN channel. The vector of received symbols is $\mathbf{y} = (y_1, \ldots, y_n)$, where
$$y_i = (-1)^{c_i} A + z_i, \qquad i \in [n], \qquad (1)$$
and $z_i \stackrel{\mathrm{i.i.d.}}{\sim} \mathcal{N}(0, \sigma^2)$. If $\rho$ is the SNR, the channel can be normalized so that $A = 1$ and $\sigma^2 = (2 r \rho)^{-1}$.
The LLR function $\mathbf{L} : \mathbb{R}^n \to \mathbb{R}^n$ of $\mathbf{c}$ conditioned on $\mathbf{y}$ is
$$[\mathbf{L}(\mathbf{y})]_i \stackrel{\Delta}{=} \log\frac{\Pr(c_i = 0 \mid \mathbf{y})}{\Pr(c_i = 1 \mid \mathbf{y})} = \log\frac{\Pr(y_i \mid c_i = 0)}{\Pr(y_i \mid c_i = 1)} \qquad (2)$$
$$= 4 r \rho\, y_i, \qquad (3)$$
for each $i \in [n]$. Equation (2) holds under the assumption that the $c_i$ are independent and uniformly distributed, and (3) is obtained by substituting the Gaussian $\Pr(y_i \mid c_i)$ from (1).
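As a quick illustration of (1)–(3), the following sketch (our own, using numpy; it assumes $\rho$ is given in linear scale rather than dB and a rate-$r$ code) simulates the BPSK-AWGN channel and computes the channel LLRs.

```python
import numpy as np

def awgn_llrs(c, r, rho, rng=np.random.default_rng(0)):
    """Transmit codeword bits c over BPSK/AWGN and return (y, channel LLRs).

    c   : binary codeword (numpy array of 0/1)
    r   : code rate
    rho : SNR per information bit, in linear scale (not dB)
    """
    sigma2 = 1.0 / (2.0 * r * rho)            # noise variance, per the normalization above
    y = (-1.0) ** c + rng.normal(0.0, np.sqrt(sigma2), size=c.shape)   # per (1)
    llr = 4.0 * r * rho * y                   # per (3)
    return y, llr
```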
Decoder: We compare the performance and complexity of the static and adaptive belief propagation. The static decoders are tanh-based BP, the auto-regressive BP and WBP with different levels of parameter sharing, including BP with simple scaling and parameter-adapter networks (SS-PAN) [18]. Additionally, to assess the achievable performance with a large number of parameters in the decoder, we include a comparison with two model-agnostic neural decoders based on transformers [41] and graph NNs [33,43].

3.2. Optical Fiber Channel

In this application, we consider a multi-user fiber-optic transmission system using WDM with N c users, each of bandwidth B 0 Hz, as shown in Figure 1.
Transmitter (TX): A binary source generates a pseudo-random bit sequence of $n_b$ information bits $\mathbf{b}_u = (b_{u,1}, b_{u,2}, \ldots, b_{u,n_b})$, $b_{u,j} \in \{0,1\}$, for WDM channel $u \in [u_1 : u_2]$, $u_1 = -\lfloor N_c/2 \rfloor$, $u_2 = \lceil N_c/2 \rceil - 1$, $j \in [1 : n_b]$. Bit-interleaved coded modulation (BICM) with concatenated coding is applied in each WDM channel independently. The BICM comprises an outer encoder for the code $\mathcal{C}_o(n_o, k_o)$ with rate $r_o$, an inner encoder for $\mathcal{C}_i(n_i, k_i)$ with rate $r_i$, a permuter $\pi$, and a mapper $\mu$, where $n_o/k_i$ is assumed to be an integer. The concatenated code $\mathcal{C} = \mathcal{C}_i \circ \mathcal{C}_o$ has parameters $k = k_o$, $n = n_o/r_i$, and $r_{\mathrm{total}} = r_i r_o$. Each consecutive subsequence of $\mathbf{b}_u$ of length $k_o$ is mapped to $\tilde{\mathbf{c}}_u \in \mathcal{C}_o \subseteq \{0,1\}^{n_o}$ by the outer encoder and subsequently to $\mathbf{c}_u \in \mathcal{C} \subseteq \{0,1\}^{n}$ by the inner encoder. Next, $\mathbf{c}_u$ is mapped to $\bar{\mathbf{c}}_u = \pi(\mathbf{c}_u)$ by a random uniform permuter $\pi : \mathbb{F}_2^n \to \mathbb{F}_2^n$. The mapper $\mu : \mathbb{F}_2^m \to \mathcal{A}$ maps consecutive sub-sequences of $\bar{\mathbf{c}}_u$ of length $m$ to symbols in a constellation $\mathcal{A}$ of size $M = 2^m$. Thus, the BICM maps $\mathbf{b}_u$ to a sequence of symbols $\mathbf{s}_u = (s_{u,l_1}, s_{u,l_1+1}, \ldots, s_{u,l_2})$, where $s_{u,l} \in \mathcal{A}$, $l \in [l_1 : l_2]$, $l_1 = -\lfloor n_s/2 \rfloor$, $l_2 = \lceil n_s/2 \rceil - 1$, and $n_s = n_b/m$.
The symbols s u , l are modulated with a root raised cosine (RRC) pulse shape p ( t ) at symbol rate R s , where t is the time. The resulting electrical signal of each channel x u ( t ) is converted to an optical signal and subsequently multiplexed by a WDM multiplexer. The baseband representation of the transmitted signal is
$$x(t) = \sum_{u=u_1}^{u_2}\sum_{l=l_1}^{l_2} s_{u,l}\, p(t - l T_s)\, e^{j 2\pi u B_0 t}, \qquad (4)$$
where $T_s = 1/R_s$ and the $s_{u,l}$ are i.i.d. random variables. The average power of the transmitted signal is $P$; thus, $\mathbb{E}|s_{u,l}|^2 = P$ for all $u, l$.
Encoder: SC LDPC codes are attractive options for optical communications [59]. These codes approach the capacity of the canonical communication channels [60,61] and have a flexible performance–complexity trade-off. They are decoded with the BP and the sliding window decoder (SWD). Another class of codes in optical communication is the SC product-like codes, braided block codes [62], SC turbo product codes [63], staircase codes [64,65,66] and their generalizations [67,68]. These codes are decoded with iterative, algebraic hard decision algorithms and prioritize low-complexity, hardware-friendly decoding over coding gain.
In this paper, the encoding in BICM combines an inner (binary or non-binary) QC-LDPC code C i with an outer SC QC-LDPC code C o whose component code is a multi-edge QC-LDPC code, as outlined in Table 1. The construction and parameters of the codes are given in Appendix B.1 and Appendix C, respectively.
The choice of the inner code is due to the decoder complexity. Other options have been considered in the literature, for instance, algebraic codes, e.g., the BCH (Section 3.3 [69]) or Reed-Solomon codes (Section 3.4 [69]), or polar codes [70]. However, the QC-LDPC codes are simpler to decode, especially at high rates. The outer code can be an LDPC code [71,72], a staircase code [71,73], or a SC-LDPC code [72].
Fiber-optic link: The channel is an optical fiber link with $N_{sp}$ spans of length $L_{sp}$ of standard single-mode fiber, with the parameters in Table 2.
Let $q(t, z) : \mathbb{R} \times \mathbb{R}^+ \to \mathbb{C}$ be the complex envelope of the signal as a function of time $t$ and distance $z$ along the fiber. The propagation of the signal in one polarization over one span of optical fiber is modeled by the nonlinear Schrödinger equation [74]
$$\frac{\partial q(t,z)}{\partial z} = -\frac{\alpha}{2}\, q(t,z) - j\frac{\beta_2}{2}\frac{\partial^2 q(t,z)}{\partial t^2} + j\gamma\, |q(t,z)|^2\, q(t,z), \qquad (5)$$
where $\alpha$ is the loss constant, $\beta_2$ is the chromatic dispersion coefficient, $\gamma$ is the Kerr nonlinearity parameter, and $j = \sqrt{-1}$. The transmitter is located at $z = 0$ and the receiver at $z = L$. The continuous-time model (5) can be discretized to a discrete-time, discrete-space model using the split-step Fourier method (Section III.B [75]). The optical fiber channel described by the partial differential Equation (5) differs significantly from the AWGN channel due to the presence of nonlinearity.
An erbium-doped fiber amplifier (EDFA) is placed at the end of each span, which compensates for the fiber loss and introduces amplified spontaneous emission noise. The input $x_i(t)$–output $x_o(t)$ relation of the EDFA is given by $x_o(t) = \sqrt{G}\, x_i(t) + n(t)$, where $G = e^{\alpha L_{sp}}$ is the amplifier gain and $n(t)$ is a zero-mean circularly symmetric complex Gaussian noise process with the power spectral density
$$\sigma^2 = \frac{1}{2}(G - 1)\, h\, f_0\, \mathrm{NF},$$
where $\mathrm{NF}$ is the noise figure, $h$ is the Planck constant, and $f_0$ is the carrier frequency at 1550 nm.
Receiver: The advent of coherent detection paved the way for the compensation of transmission effects in optical fiber using digital signal processing (DSP). As a result, the linear effects in the channel, such as chromatic dispersion and polarization-induced impairments, as well as some of the nonlinear effects, can be compensated with DSP.
At the receiver, a demultiplexer filters the signal of each WDM channel. The optical signal for each channel is converted to an electrical signal by a coherent receiver. Next, DSP followed by bit-interleaved coded demodulation (BICD) is applied. The continuous-time electrical signal is converted to the discrete-time signals by analogue-to-digital converters, down-sampled, and passed to a digital signal processing unit for the mitigation of the channel impairments. For equalization, digital back-propagation (DBP) based on the symmetric split-step Fourier method is applied to compensate for most of the linear and nonlinear fiber impairments [76].
After DSP, the symbols are still subject to signal-dependent noise, which is mitigated by the bit-interleaved coded demodulator (BICD). Let $\mathbf{y} \in \mathbb{C}^{n_s}$ denote the equalized signal samples for the transmitted symbols $\mathbf{s} \in \mathcal{A}^{n_s}$ in the WDM channel of interest. Given that the deterministic effects have been equalized, we assume that the channel $\mathbf{s} \to \mathbf{y}$ is memoryless, so that $\Pr(\mathbf{y}|\mathbf{s}) = \prod_{l=1}^{n_s}\Pr(y_l|s_l)$. For $s \in \mathcal{A}$, let $\mu^{-1}(s) = (b_1(s), \ldots, b_m(s))$. From the symbol-to-symbol channel $\Pr(y|s)$, $s \in \mathcal{A}$, $y \in \mathbb{C}$, we obtain $m$ bit-to-symbol channels
$$\Pr_j(y|b) = \sum_{s \in \mathcal{A}:\, b_j(s) = b} \Pr(y|s), \qquad (6)$$
where $b \in \mathbb{F}_2$ and $j \in [m]$.
Let $\bar{\mathbf{c}} = (b_1(s_1), \ldots, b_m(s_1), \ldots, b_1(s_{n_s}), \ldots, b_m(s_{n_s}))$, $n = m n_s$. The LLR function $\mathbf{L} : \mathbb{C}^{n_s} \to \mathbb{R}^n$ of $\mathbf{c}$ conditioned on $\mathbf{y}$ is, for each $i \in [n]$,
$$[\mathbf{L}(\mathbf{y})]_i \stackrel{\Delta}{=} \log\frac{\Pr(c_i = 0 \mid \mathbf{y})}{\Pr(c_i = 1 \mid \mathbf{y})} = \log\frac{\Pr(\bar{c}_{i'} = 0 \mid \mathbf{y})}{\Pr(\bar{c}_{i'} = 1 \mid \mathbf{y})} = \log\frac{\Pr_j(y \mid b = 0)}{\Pr_j(y \mid b = 1)}, \qquad (7)$$
where $i'$ is obtained from $i$ according to $\pi$, $j = i' \bmod m$, and $\Pr_j(y|b)$ is defined in (6).
Decoder: The decoding of C i C o consists of two steps. First, C i is decoded using an adaptive WBP in Appendix B, which takes the soft information L ( y ) R n and corrects some errors. Second, C o is decoded using the min-sum (MS) decoder with SWD in Appendix B, which further lowers the BER, and outputs the decoded information bits b ^ . The LLRs in the inner decoder are represented with 32 bits, and in the outer decoder are quantized at 4 bits with per-window configuration.
In optical communication, a forward error correction (FEC) overhead of 6–25% is common [77]. Thus, the inner code typically has a high rate of $\geq 0.9$ and a block length of several thousand, achieving a BER of $10^{-6}$–$10^{-2}$. The outer code has a length of up to tens of thousands, lowering the BER to an error floor of ${\sim}10^{-15}$.

3.3. Performance Metrics

Q-factor: The SNR per bit in the optical fiber channel is $E_b/N_0$, where $E_b = P/m$ is the bit energy and $N_0 = \sigma^2 B N_{sp}$ is the total noise power in the link, with $B = B_0 N_c$. The performance of the uncoded communication system is often measured by the BER. The Q-factor for a given BER is the corresponding SNR in an additive white Gaussian noise channel with binary phase-shift keying modulation:
$$\mathrm{QF} = 20\log_{10}\left(\sqrt{2}\,\mathrm{erfc}^{-1}(2\,\mathrm{BER})\right)\ \mathrm{dB}.$$
Coding gain: Let $\mathrm{BER}_i$ and $\mathrm{QF}_i$ (respectively, $\mathrm{BER}_o$ and $\mathrm{QF}_o$) denote the BER and Q-factor at the input (respectively, output) of the decoder. The coding gain (CG) in dB is the difference in the Q-factor
$$\mathrm{CG} = \mathrm{QF}_o - \mathrm{QF}_i = 20\log_{10}\mathrm{erfc}^{-1}(2\,\mathrm{BER}_o) - 20\log_{10}\mathrm{erfc}^{-1}(2\,\mathrm{BER}_i).$$
The corresponding net CG (NCG) is
$$\mathrm{NCG} = \mathrm{CG} + 10\log_{10} r_{\mathrm{total}}. \qquad (8)$$
Finite block-length NCG: If $n$ is finite, the rate $r_{\mathrm{total}}$ in (8) may be replaced with the information rate in the finite block-length regime [78],
$$C_f \approx C - \log_2(e)\sqrt{\frac{\mathrm{BER}_i(1 - \mathrm{BER}_i)}{n}}\; Q^{-1}(\mathrm{BER}_o),$$
where $Q(x) = \frac{1}{2}\mathrm{erfc}\left(\frac{x}{\sqrt{2}}\right)$.
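The metrics above map directly to a few lines of code. The sketch below (a minimal illustration of our own using scipy's erfcinv; function names and the example numbers are ours) computes the Q-factor, CG, and NCG from pre- and post-FEC BERs.

```python
import numpy as np
from scipy.special import erfcinv

def q_factor_db(ber):
    # QF = 20 log10( sqrt(2) * erfcinv(2 BER) )  [dB]
    return 20.0 * np.log10(np.sqrt(2.0) * erfcinv(2.0 * ber))

def coding_gain_db(ber_in, ber_out):
    # CG = QF_out - QF_in
    return q_factor_db(ber_out) - q_factor_db(ber_in)

def net_coding_gain_db(ber_in, ber_out, r_total):
    # NCG = CG + 10 log10(r_total), per (8)
    return coding_gain_db(ber_in, ber_out) + 10.0 * np.log10(r_total)

# Example: a pre-FEC BER of 2e-2 corrected to 1e-12 by a rate-0.88 concatenated code
print(net_coding_gain_db(2e-2, 1e-12, 0.88))
```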

4. Weighted Belief Propagation

Given a code $\mathcal{C}$, one can construct a bipartite Tanner graph $\mathcal{T}_{\mathcal{C}} = (\mathcal{C}, \mathcal{V}, \mathcal{E})$, where $\mathcal{C} = [m]$, $\mathcal{V} = [n]$, and $\mathcal{E} = \{(c, v) \in \mathcal{C} \times \mathcal{V} : \mathsf{H}_{c,v} \neq 0\}$ are, respectively, the set of check nodes, the set of variable nodes, and the edges connecting them. Let $\mathcal{V}_c = \{v \in \mathcal{V} : (c, v) \in \mathcal{E}\}$, $\mathcal{C}_v = \{c \in \mathcal{C} : (c, v) \in \mathcal{E}\}$, and let $d_c$ and $d_v$ be the degrees of $c$ and $v$ in $\mathcal{T}_{\mathcal{C}}$, respectively.
The WBP is an iterative decoder based on the exchange of weighted LLRs between the variable nodes and the check nodes in $\mathcal{T}_{\mathcal{C}}$ [11,79]. Let $L_{c2v}^{(t)}$ denote the extrinsic LLR from check node $c$ to variable node $v$ at iteration $t$; $L_{v2c}^{(t)}$ is defined similarly.
The decoder is initialized at $t = 1$ with $L_{v2c}^{(0)} = L(y_j)$, where $v$ is the $j$-th variable node and $L(y_j)$ is obtained from (3) or (7). For each iteration $t \in [T]$, the LLRs are updated in two steps.
The check node update:
$$L_{c2v}^{(t)} \stackrel{(a)}{=} 2\tanh^{-1}\left(\prod_{v'\in\mathcal{V}_c\setminus\{v\}} \tanh\left(\frac{\gamma_{v',c}^{(t)}}{2}\, L_{v'2c}^{(t-1)}\right)\right) \stackrel{(b)}{\approx} \bar{L}_{c,v} \prod_{v'\in\mathcal{V}_c\setminus\{v\}} \gamma_{v',c}^{(t)}\, \mathrm{sign}\left(L_{v'2c}^{(t-1)}\right), \qquad (9)$$
where
$$\bar{L}_{c,v} = \min_{v'\in\mathcal{V}_c\setminus\{v\}} \left|L_{v'2c}^{(t-1)}\right|.$$
The equation in $(a)$ is the update relation of the BP [69], in which the LLR messages are scaled by the non-negative weights $\{\gamma_{v,c}^{(t)} : v \in \mathcal{V},\ c \in \mathcal{C}_v,\ t \in [T]\}$. Further, $(b)$ is obtained from $(a)$ through an approximation that lowers the computational cost. The WBP and WMS decoders use $(a)$ and $(b)$, respectively.
The variable-node update:
$$L_{v2c}^{(t)} = \alpha_v^{(t)}\, L(y_j) + \sum_{c'\in\mathcal{C}_v\setminus\{c\}} \beta_{c',v}^{(t)}\, L_{c'2v}^{(t-1)}. \qquad (10)$$
This is the update relation of the BP, to which the sets of non-negative weights $\{\alpha_v^{(t)} : v \in \mathcal{V}, t \in [T]\}$ and $\{\beta_{c,v}^{(t)} : c \in \mathcal{C}, v \in \mathcal{V}_c, t \in [T]\}$ are added.
At the end of each iteration t, a hard decision is made
$$\bar{y}_j = \begin{cases} 1, & \text{if } L_v^{(t)} < 0, \\ 0, & \text{if } L_v^{(t)} \geq 0, \end{cases} \qquad (11)$$
where
$$L_v^{(t)} = L(y_j) + \sum_{c\in\mathcal{C}_v} L_{c2v}^{(t)}. \qquad (12)$$
Let $\bar{\mathbf{y}} = (\bar{y}_1, \bar{y}_2, \ldots, \bar{y}_n) \in \mathbb{F}_2^n$, and let $\mathbf{s} = \bar{\mathbf{y}}\mathsf{H}^T \in \mathbb{F}_2^m$ be the syndrome. The algorithm stops if $\mathbf{s} = \mathbf{0}$ or $t = T$.
The computation in (9) and (10) can be expressed with an NN. The Tanner graph $\mathcal{T}_{\mathcal{C}}$ is unrolled over the iterations to obtain a recurrent network with $2T$ layers (see Figure 2), in which the weights $\gamma_{v,c}^{(t)}$ and $\beta_{c,v}^{(t)}$ are assigned to the edges of $\mathcal{T}_{\mathcal{C}}$, and the weights $\alpha_v^{(t)}$ to the outputs [16]. The weights are obtained by minimizing a loss function evaluated over a training dataset using standard optimizers for NNs.
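To make the update rules concrete, the sketch below (our own simplified flooding-schedule WMS iteration with one shared weight per iteration, applied once to the extrinsic magnitude as in the normalized min-sum variant of (9b); the dense-matrix message layout and function name are ours, not the paper's implementation) applies (9b) and (10), takes the hard decision (11)–(12), and checks the syndrome.

```python
import numpy as np

def wms_decode(H, llr, gamma, T):
    """Weighted min-sum decoding with one weight gamma[t] per iteration.

    H     : (m, n) binary parity-check matrix (numpy array)
    llr   : (n,) channel LLRs L(y)
    gamma : (T,) non-negative weights, one per iteration
    """
    m, n = H.shape
    mask = H.astype(bool)
    L_v2c = np.where(mask, llr, 0.0)                     # initialize with channel LLRs
    y_hat = (llr < 0).astype(int)
    for t in range(T):
        # Check-node update, per the min-sum approximation (9b)
        mag = np.where(mask, np.abs(L_v2c), np.inf)
        sgn = np.where(L_v2c >= 0, 1.0, -1.0)
        ext_sgn = np.prod(np.where(mask, sgn, 1.0), axis=1, keepdims=True) * sgn
        min1 = np.min(mag, axis=1, keepdims=True)
        mag2 = mag.copy()
        mag2[np.arange(m), np.argmin(mag, axis=1)] = np.inf
        min2 = np.min(mag2, axis=1, keepdims=True)
        ext_min = np.where(mag == min1, min2, min1)      # extrinsic minimum magnitude
        L_c2v = np.where(mask, gamma[t] * ext_sgn * ext_min, 0.0)
        # Variable-node update (10) and posterior LLR (12)
        total = llr + L_c2v.sum(axis=0)
        L_v2c = np.where(mask, total[None, :] - L_c2v, 0.0)
        # Hard decision (11) and syndrome check
        y_hat = (total < 0).astype(int)
        if not np.any((H @ y_hat) % 2):
            break
    return y_hat
```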

4.1. Parameter Sharing Schemes

The training complexity of the WBP can be reduced through parameter sharing at the cost of a performance loss. We consider the dimensions $(t, v, c)$ of the ragged arrays $\gamma_{v,c}^{(t)}$ and $\beta_{c,v}^{(t)}$. In Type T parameter sharing, parameters are shared with respect to the iteration $t$. In the Type $T_a$ scheme, $\gamma_{v,c}^{(t)} = \beta_{c,v}^{(t)} = \gamma_{v,c}$ for all $(c, v) \in \mathcal{E}$ and $t \in [T]$. In this case, there is a single ragged array with $|\mathcal{E}|$ trainable parameters $\{\gamma_{v,c}\}_{c\in\mathcal{C}_v, v\in\mathcal{V}_c}$. For a regular LDPC code, $|\mathcal{E}| = n d_v = m d_c$. It has been observed that, for typical block lengths, the weights indeed do not change significantly with the iterations [28]. In Type $T_b$, there are $T$ arrays $\gamma_{v,c}^{(t)} = \beta_{c,v}^{(t)}$, while in Type $T_c$, there are two arrays $\gamma_{v,c}^{(t)} = \gamma_{v,c}$ and $\beta_{c,v}^{(t)} = \beta_{c,v}$. Type $T_a$ and $T_c$ decoders can be referred to as BP-RNN decoders and Type $T_b$ as feedforward BP. In Type V sharing, $\gamma_{v,c}^{(t)} = \gamma_c^{(t)}$ is independent of $v$; this corresponds to one weight per check node. Likewise, in Type C sharing, there is one weight per variable node, and $\gamma_{v,c}^{(t)} = \gamma_v^{(t)}$.
These schemes can be combined. For instance, in Type $T_a VC$ parameter sharing, $\beta_{c,v}^{(t)} = \gamma_{v,c}^{(t)} = \gamma$. Thus, a single parameter $\gamma$ is shared across all layers of the NN. This decoder is referred to as the neural normalized BP, e.g., the neural normalized min-sum (NNMS) decoder when the BP is based on the MS algorithm. The latter is similar to the normalized MS decoder, except that the parameter $\gamma$ is determined empirically there. In the Type $T_b VC$ scheme, $\beta_{c,v}^{(t)} = \gamma_{v,c}^{(t)} = \gamma^{(t)}$; here, there is one weight per iteration. In this paper, $\alpha_v^{(t)} = 1$ for all $v$ and $t \in [T]$.

4.2. WBP over F q

The construction and decoding of the binary PB QC-LDPC codes can be extended to codes over a finite field $\mathbb{F}_q$ [80,81]. Here, $q-1$ LLR messages are sent from each node, defined in Equation (1) of [81]. The update equations of the BP are similar to (9) and (10), and are presented in [81] for the extended min-sum (EMS) decoder and in [82] for the weighted EMS (WEMS) decoder.
The parameter sharing for the four-dimensional ragged array $\{\gamma_{v,c,q}^{(t)}\}_{t\in[T],\, v\in\mathcal{V},\, c\in\mathcal{C}_v,\, q\in\mathbb{F}_q}$ is defined as in Section 4.1. In the check-node update of the WEMS algorithm, it is possible to assign a distinct weight to each coefficient $q \in \mathbb{F}_q$ for every variable node. For instance, in the Type $T_c CQ$ scheme, $\gamma_{v,c,q}^{(t)} = \gamma_v$ and $\beta_{c,v,q}^{(t)} = \beta_v$, so there is one weight per variable node in the check-node update and one per variable node in the variable-node update, for all $t \in [T]$. In the case of Type $T_b VCQ$, there is only one weight per iteration, shared across the variable nodes, check nodes, and coefficients. In this case, if the BP is based on the non-binary EMS algorithm, the decoder is called the neural normalized EMS (NNEMS).
Remark 1. 
The EMS decoder has a truncation factor in { 1 , 2 , , q } that provides a trade-off between complexity and accuracy. In this paper, it is set to q to investigate the maximum performance.

5. Adaptive Learned Message Passing Algorithms

The weights of the static WBP are obtained by training the network offline using a dataset. A WBP where the weights are determined for each received word $\mathbf{y}$ is an adaptive WBP. The weights must therefore be found by online optimization. To manage the complexity, we consider a WMS decoder with Type $T_a$ parameter sharing. Thus, the decoder has one weight per iteration, i.e., $T$ weights in total, which must be determined for each received $\mathbf{y}$.
Let $\mathbf{c} \in \mathbb{F}_2^n$ be a codeword and $\mathbf{y} \in \mathcal{U}$ be the corresponding received word, where $\mathcal{U} = \mathbb{R}^n$ for the AWGN channel and $\mathcal{U} = \mathbb{C}^{n_s}$ for the optical fiber channel. Let $\bar{\mathbf{y}} = D_{\boldsymbol{\gamma}}(\mathbf{y})$ be the word decoded by a Type $T_a$ WBP decoder with weight $\gamma^{(t)} \in \mathbb{R}^+$ in iteration $t \in [T]$, where $\boldsymbol{\gamma} = (\gamma^{(1)}, \gamma^{(2)}, \ldots, \gamma^{(T)})$. In the adaptive decoder, we wish to find a function $g : \mathcal{U} \to \mathcal{X}$, $\mathcal{X} \subseteq (\mathbb{R}^+)^T$, $\boldsymbol{\gamma} = g(\mathbf{y})$, that minimizes the probability that $D_{g(\mathbf{y})}(\mathbf{y})$ makes an error:
$$\min_{g\in\mathcal{H}}\ \Pr\left(\bar{\mathbf{y}} \neq \mathbf{c}\right), \qquad (13)$$
where H is a functional class. The static decoder is a special case where g ( . ) is a constant function. Two variants of this decoder are proposed, illustrated in Figure 3.

5.1. Parallel Decoders

Architecture: In the parallel decoders, $g(\cdot)$ is found by search. Here, $\gamma^{(t)}$ takes values in a discrete set $\mathcal{X}_t = \{x_1^{(t)}, x_2^{(t)}, \ldots, x_{K_t}^{(t)}\}$, $K_t \in \mathbb{N}$, and thus $\boldsymbol{\gamma} \in \mathcal{X} := \prod_{t=1}^{T}\mathcal{X}_t = \{\boldsymbol{\gamma}_1, \ldots, \boldsymbol{\gamma}_\nu\}$, where $\nu = \prod_t K_t$. The parallel decoders consist of $\nu$ independent decoders $\bar{\mathbf{y}}_i = D_{\boldsymbol{\gamma}_i}(\mathbf{y})$, $i \in [\nu]$, running concurrently. Since $\Pr(\bar{\mathbf{y}}_i \neq \mathbf{c})$ in (13) is generally intractable, a sub-optimal $g(\cdot)$ is selected as follows. At the end of decoding by $D_{\boldsymbol{\gamma}_i}$, the syndrome $\mathbf{s}_i = \bar{\mathbf{y}}_i\mathsf{H}^T$ is computed. Let
$$i^* = \operatorname*{argmin}_{i\in[\nu]}\ \|\mathbf{s}_i\|_H, \qquad \mathbf{s}_i = \bar{\mathbf{y}}_i\mathsf{H}^T, \qquad (14)$$
be the index of the decoder whose syndrome has the smallest Hamming weight. Then, $g(\mathbf{y}) = \boldsymbol{\gamma}_{i^*}$ and $\bar{\mathbf{y}} = D_{\boldsymbol{\gamma}_{i^*}}(\mathbf{y})$.
In practice, the search can be performed up to a depth of $T_1 = 5$ iterations. However, the BP decoder often has to run for more iterations. Thus, a WBP decoder with weights $\boldsymbol{\gamma}_{i^*}$ can continue decoding from this output for $T_2$ additional iterations.
Remark 2. 
The decoder obtained via (14) is generally sub-optimal. Minimizing $\|\mathbf{s}_i\|_H$ does not necessarily minimize the number of errors. However, for random codes, the decoder obtained from (14) outperforms the static decoder.
Remark 3. 
If $D_{\boldsymbol{\gamma}_1}$ and $D_{\boldsymbol{\gamma}_2}$ yield the same number of errors, the decoder with the smaller weight vector is selected, which tends to output smaller LLRs.
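A minimal sketch of the selection rule (14), reusing the wms_decode helper sketched in Section 4 (an illustration of ours, not the authors' implementation): one WMS decoder is run per candidate weight vector and the output whose syndrome has the smallest Hamming weight is kept. The loop is sequential for clarity; in practice the $\nu$ decoders run concurrently.

```python
import numpy as np
from itertools import product

def parallel_wms(H, llr, weight_sets, T1):
    """Parallel WMS decoders over a grid of per-iteration weights.

    weight_sets : list of T1 lists; weight_sets[t] holds the candidates x_k(t)
    Returns the decoded word with the smallest syndrome Hamming weight, per (14).
    """
    best_word, best_weights, best_syn = None, None, np.inf
    for gamma in product(*weight_sets):              # nu = prod_t K_t candidates
        y_hat = wms_decode(H, llr, np.array(gamma), T1)
        syn_weight = int(((H @ y_hat) % 2).sum())    # ||s_i||_H
        if syn_weight < best_syn:
            best_syn, best_word, best_weights = syn_weight, y_hat, gamma
        if syn_weight == 0:                          # a valid codeword was found
            break
    return best_word, best_weights
```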
Obtaining $x_k^{(t)}$ from the distribution of the weights: The values of $x_k^{(t)}$ can be determined by dividing a sub-interval of $[0, 1]$ uniformly. The resulting parallel WMS decoder outperforms the WMS; however, the performance can be improved by choosing $x_k^{(t)}$ based on the probability distribution of the weights.
The probability distribution of the channel noise induces a distribution on $\mathbf{y}$ and, consequently, on $\boldsymbol{\gamma} = g(\mathbf{y})$. Let $\Gamma^{(t)}$ be a random variable representing $\gamma^{(t)}$. Denote the corresponding mean by $\theta^{(t)}$, the standard deviation by $\sigma^{(t)}$, and the cumulative distribution function by $C_t(\cdot) \stackrel{\Delta}{=} C_{\Gamma^{(t)}}(\cdot)$. For $\epsilon > 0$, set
$$x_k^{(t)} = \inf\left\{ x : C_t(x) > \frac{\epsilon}{2} + (k-1)\frac{1-\epsilon}{K_t} \right\}. \qquad (15)$$
The numbers $x_1^{(t)} < x_2^{(t)} < \cdots < x_{K_t}^{(t)}$ partition the real line into intervals such that $\Pr\left(\Gamma^{(t)} \in [x_1^{(t)}, x_{K_t}^{(t)}]\right) = 1-\epsilon$ and $\Pr\left(\Gamma^{(t)} \in [x_k^{(t)}, x_{k+1}^{(t)}] \mid \Gamma^{(t)} \in [x_1^{(t)}, x_{K_t}^{(t)}]\right) = \frac{1}{K_t}$. In practice, $\Gamma^{(t)}$ has a distribution close to Gaussian, in which case the $x_k^{(t)}$ are given by the explicit formulas in Lemma 1.
Lemma 1. 
Let $\Gamma^{(t)}$ have a cumulative distribution function $C_t(\cdot)$ that is continuous and strictly monotonic. For $k \in [K_t]$,
$$x_k^{(t)} = C_t^{-1}\left(\frac{\epsilon}{2} + (k-1)\frac{1-\epsilon}{K_t}\right).$$
If $\Gamma^{(t)}$ has a Gaussian distribution with mean $\theta^{(t)}$ and standard deviation $\sigma^{(t)}$, then
$$x_k^{(t)} = \theta^{(t)} + \sqrt{2}\,\sigma^{(t)}\,\mathrm{erfc}^{-1}\left(2 - \epsilon - \frac{2(k-1)(1-\epsilon)}{K_t}\right), \qquad (16)$$
where $\mathrm{erfc}$ is the complementary error function.
Proof. 
The proof is based on elementary calculus. □
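Under the Gaussian assumption, (16) gives the candidate weights in closed form. A small sketch of ours (scipy's erfcinv; the value of $\epsilon$ is an assumption for illustration, and $\theta^{(t)}, \sigma^{(t)}$ would come from the offline-trained WBP described next):

```python
import numpy as np
from scipy.special import erfcinv

def candidate_weights(theta, sigma, K, eps=0.05):
    """Candidate weights x_k(t), k = 1..K, from (16) for Gamma(t) ~ N(theta, sigma^2)."""
    k = np.arange(1, K + 1)
    return theta + np.sqrt(2.0) * sigma * erfcinv(2.0 - eps - 2.0 * (k - 1) * (1.0 - eps) / K)

# Example: K_t = 4 candidate weights per iteration
print(candidate_weights(theta=0.8, sigma=0.1, K=4))
```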
Obtaining the distribution of the weights: To apply (15) or (16), $C_t(\cdot)$ is required. To this end, a static WBP (with no parameter sharing) is trained offline given a dataset $\{(\mathbf{y}^{(i)}, \mathbf{c}^{(i)})\}_i$. The empirical cumulative distribution of the weights in each iteration is computed as an approximation to $C_t(\cdot)$. However, if the BER is low, it can be difficult to obtain a dataset that contains a sufficient number of examples corresponding to incorrectly decoded words, which are required to obtain good generalization.
To address this issue, we apply active learning [23,83]. This approach is based on the fact that the training examples near the decision boundary of the optimal classifier determine the classifier the most. Hence, input examples are sampled from a probability distribution with a support near the decision boundary.
The following approach to active learning is considered. At epoch $e$ in the training of the WBP, random codewords $\mathbf{c}$ and the corresponding channel outputs $\mathbf{y}$ are generated. The decoder from epoch $e-1$ is applied to decode $\mathbf{y}$ to $\hat{\mathbf{c}} = \mathrm{WBP}(\boldsymbol{\gamma}, \mathbf{L}(\mathbf{y}))$. An acquisition function $A_f(\mathbf{c}, \hat{\mathbf{c}}) : \mathcal{C} \times \mathbb{F}_2^n \to \mathbb{R}^+$ evaluates whether the example pair $(\mathbf{y}, \mathbf{c})$ should be retained. A candidate example is retained if $A_f(\mathbf{c}, \hat{\mathbf{c}})$ is in a given range.
The choice of the acquisition function depends on the specific problem being solved, the architecture of the NN, and the availability of the labeled data [83]. In the context of training NN decoders for channel coding, the authors of [23] use distance parameters and reliability parameters. Inspired by [23], the authors of [84] define the acquisition function using importance sampling. In this paper, the acquisition function is the number of errors, $A_f(\mathbf{c}, \hat{\mathbf{c}}) = \|\mathbf{c}\oplus\hat{\mathbf{c}}\|_H = d_H(\mathbf{c}, \hat{\mathbf{c}})$, where $d_H$ is the Hamming distance.
The dataset is incrementally generated and pruned as follows. At each epoch $e$, a subset $\mathcal{S}_e = \{(\mathbf{y}^{(i)}, \mathbf{c}^{(i)})\}_{i=1}^{b_1}$ of $b_1$ examples, filtered by the acquisition function, is selected. The entire dataset at epoch $e$ is $\bar{\mathcal{S}}_e = \mathrm{Prune}\left(\bigcup_{e'=1}^{e}\mathcal{S}_{e'}\right)$ and has size $b_2 > b_1$. The operator $\mathrm{Prune}$ removes the subsets $\mathcal{S}_{e'}$ introduced in old epochs $e' \in [e - e_0]$ if $e > e_0 \stackrel{\Delta}{=} b_2/b_1$, and otherwise leaves its input intact. At each epoch $e$, the loss function is averaged over a batch of size $b_s$ obtained by randomly sampling from $\bar{\mathcal{S}}_e$.
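A sketch of this epoch-wise dataset construction (the channel simulator, decoder, and the retention threshold are placeholders of ours; the rule shown keeps examples whose Hamming-distance acquisition value is at most max_err, mirroring the description above):

```python
import numpy as np
from collections import deque

def build_active_dataset(simulate, decode, epochs, b1, b2, max_err=10):
    """Incrementally build a training set filtered by the acquisition function.

    simulate(): returns one example (y, c) from the channel
    decode(y) : decoding with the weights from the previous epoch
    b1, b2    : examples added per epoch / total dataset size
    """
    e0 = b2 // b1                          # number of epochs kept before pruning
    subsets = deque(maxlen=e0)             # Prune(): drop subsets older than e0 epochs
    dataset = []
    for epoch in range(epochs):
        new = []
        while len(new) < b1:
            y, c = simulate()
            c_hat = decode(y)
            if np.sum(c != c_hat) <= max_err:      # acquisition: d_H(c, c_hat) <= max_err
                new.append((y, c))
        subsets.append(new)
        dataset = [ex for s in subsets for ex in s]
        # ...train the WBP for one epoch on random batches of size b_s from `dataset`...
    return dataset
```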
Complexity of the parallel decoders: The computational complexity of the decoder is measured in real multiplications (RMs). For instance, the complexity of the WMS with $T$ iterations, $\alpha_v = 1$, without parameter sharing, or with Type $T_b$ parameter sharing, is $\mathrm{RM} = 2T|\mathcal{E}|$, where $|\mathcal{E}| = n d_v = m d_c$ is the number of edges of the Tanner graph of the code. For the WMS decoder with Type $T_a$ or Type $T_b VC$ parameter sharing, $\mathrm{RM} = T(m+n)$. The latter arises from the fact that equal weights factor out of the sum and the min terms in the BP and are applied once. Thus, the complexity of $\nu$ parallel WMS decoders with Type $T_a$ parameter sharing is $\mathrm{RM} = \nu T(m+n)$. If $\alpha_v \neq 1$, $nT$ is added to the above formulas. Finally, the complexity of the Type $T_a VC$ decoder is $\mathrm{RM} = n$ per iteration. These expressions neglect the cost of the syndrome check.
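For concreteness, these RM counts can be tabulated for a given code. The helper below is our own bookkeeping of the formulas above (syndrome checks neglected, as in the text); the example parameters correspond to the $(4,8)$-regular code of Section 6.

```python
def rm_counts(n, m, E, T, nu):
    """Real-multiplication counts for the WMS variants discussed above."""
    return {
        "no sharing / Type T_b":   2 * T * E,          # 2 T |E|
        "Type T_a or T_b VC":      T * (m + n),
        "parallel (nu decoders)":  nu * T * (m + n),
        "Type T_a VC (per iter.)": n,
    }

# Example: n = 3224, m = 1612, |E| = 4 n (variable degree 4), T = 4, nu = 16
print(rm_counts(3224, 1612, 4 * 3224, 4, 16))
```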

5.2. Two-Stage Decoder

In the parallel decoders, the weights are restricted to a discrete set, and the number of parallel decoders grows exponentially with the number of iterations. The two-stage decoder predicts arbitrary non-negative weights, without the exponential complexity of the parallel decoders. Further, since the weights are arbitrary, the two-stage decoder can improve upon the performance of the parallel decoders when the output LLRs are sensitive to the weights.
Architecture: Recall that we wish to find a function $g(\mathbf{y})$ that minimizes the BER in (13). In the two-stage decoder, this function is expressed by an NN $\bar{\boldsymbol{\gamma}} = g_{\boldsymbol{\theta}}(\mathbf{y})$ parameterized by the vector $\boldsymbol{\theta}$. Thus, the two-stage decoder is a combination of an NN and a WBP. First, the NN takes as input either the LLRs at the channel output $\mathbf{L}(\mathbf{y})$ or $(\mathbf{L}(\mathbf{y}), \mathbf{y})$ and outputs the vector of weights $\bar{\boldsymbol{\gamma}}$. Then, the WBP decoder takes the channel LLRs $\mathbf{L}(\mathbf{y})$ and the weights $\bar{\boldsymbol{\gamma}}$ and outputs the decoded word $\bar{\mathbf{y}}$.
The parameters $\boldsymbol{\theta}$ are found using a dataset of examples $\{(\mathbf{y}^{(i)}, \boldsymbol{\gamma}^{(i)})\}_i$, where $\boldsymbol{\gamma}^{(i)}$ is the target weight vector. This dataset can be obtained through simulation, i.e., transmitting a codeword $\mathbf{c}^{(i)}$, receiving $\mathbf{y}^{(i)}$, and using, e.g., an offline parallel decoder to determine the target weights $\boldsymbol{\gamma}^{(i)}$. In this manner, $g_{\boldsymbol{\theta}}(\mathbf{y})$ is expressed in a functional form instead of being determined by real-time search, which may be more expensive.
In this paper, the NN is a CNN consisting of a cascade of two one-dimensional convolutional layers $\mathrm{Conv}_1$ and $\mathrm{Conv}_2$, followed by a dense layer $\mathrm{Dense}$. $\mathrm{Conv}_i$ applies $F_i$ filters of size $S_i$ with stride 1, followed by the rectified linear unit (ReLU) activation, $i = 1, 2$. The output of $\mathrm{Conv}_2$ is flattened and passed to $\mathrm{Dense}$, which produces the vector of weights $\bar{\boldsymbol{\gamma}}$ of length $T$. This final layer is a linear transformation with ReLU activation so that the weights are non-negative.
The model is trained by minimizing the quantile loss function
$$l_\xi(\boldsymbol{\gamma}, \bar{\boldsymbol{\gamma}}) = \mathrm{mean}\Big(\max\big(\xi(\boldsymbol{\gamma} - \bar{\boldsymbol{\gamma}}),\ (\xi - 1)(\boldsymbol{\gamma} - \bar{\boldsymbol{\gamma}})\big)\Big),$$
where $\xi \in (0, 1)$ is the quantile parameter, $\max$ is applied per entry, and $\mathrm{mean}$ is taken over the vector entries. The choice of loss was made by cross-validating the validation error over a number of candidate functions. This is an asymmetric absolute-value-like loss which, for $\xi > 1/2$ as in Section 6, encourages the entries of $\bar{\boldsymbol{\gamma}}$ to approach the entries of $\boldsymbol{\gamma}$ from above rather than from below.
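A compact PyTorch sketch of the weight predictor and the quantile loss (the layer sizes follow the values reported in Section 6, $F_1{=}5$, $S_1{=}3$, $F_2{=}8$, $S_2{=}2$; the module name and everything else are our own illustration, not the authors' code):

```python
import torch
import torch.nn as nn

class WeightPredictor(nn.Module):
    """CNN g_theta: channel LLRs -> T non-negative WBP weights."""
    def __init__(self, n, T, F1=5, S1=3, F2=8, S2=2):
        super().__init__()
        self.conv1 = nn.Conv1d(1, F1, S1)          # Conv_1: F1 filters of size S1, stride 1
        self.conv2 = nn.Conv1d(F1, F2, S2)         # Conv_2: F2 filters of size S2, stride 1
        self.dense = nn.Linear(F2 * (n - S1 - S2 + 2), T)
        self.relu = nn.ReLU()

    def forward(self, llr):                        # llr: (batch, n)
        h = self.relu(self.conv1(llr.unsqueeze(1)))
        h = self.relu(self.conv2(h))
        return self.relu(self.dense(h.flatten(1)))    # non-negative weights gamma_bar

def quantile_loss(gamma, gamma_bar, xi=0.75):
    """Quantile loss of the text: mean over entries of max(xi*d, (xi-1)*d), d = gamma - gamma_bar."""
    d = gamma - gamma_bar
    return torch.mean(torch.maximum(xi * d, (xi - 1.0) * d))
```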
Complexity of the two-stage decoder: The computational complexity of the two-stage decoder in the inference mode is the sum of the complexity of the CNN and WMS decoder
$$\mathrm{RM} = T(m+n) + \mathrm{RM}(\mathrm{CNN}),$$
where the computational complexity of the CNN is
$$\mathrm{RM}(\mathrm{CNN}) = \mathrm{RM}(\mathrm{Conv}_1) + \mathrm{RM}(\mathrm{Conv}_2) + \mathrm{RM}(\mathrm{Dense}) = F_1(n - S_1 + 1)S_1 + F_1 F_2 (n - S_1 - S_2 + 2) S_2 + T F_2 (n - S_1 - S_2 + 2). \qquad (17)$$
The complexity can be significantly reduced by pruning the weights, for example, by setting to zero the weights below a threshold $\tau_{\mathrm{prun}}$.
Remark 4. 
Neural decoders are sensitive to distribution shifts and often require retraining when the input distribution or channel conditions change. To lower the training complexity, [18] proposed a decoder that learns a mapping from the input SNR to the weights in WBP, enabling the decoder to operate across a range of SNRs. However, the WBP decoder in [18] is static, since the weights remain fixed throughout the transmission once chosen, despite being referred to as dynamic WBP. We do not address the problem of distribution shift in this paper.

6. Performance and Complexity Comparison

In this section, we study the performance and complexity trade-off of the static and adaptive decoders for the AWGN in Section 3.1 and optical fiber channel in Section 3.2.

6.1. AWGN Channel

Low-rate regime: To investigate the error correction performance of the decoders at low rates, we consider a BCH code $\mathcal{C}_1(63, 36)$ of rate 0.57 with the cycle-reduced parity-check matrix $\mathsf{H}_{\mathrm{cr}}$ in [85] and two QC-LDPC codes $\mathcal{C}_2(3224, 1612)$ and $\mathcal{C}_3(4016, 2761)$, which are, respectively, $(4,8)$- and $(5,16)$-regular with rates of 0.5 and 0.69. The parity-check matrix of each QC-LDPC code is constructed using an exponent matrix obtained from the random progressive edge growth (PEG) algorithm [86], with a girth-search depth of two, which is subsequently refined manually to remove short cycles in the Tanner graph. The parameters of the QC-LDPC codes, including the exponent matrices $\mathsf{P}_2$ and $\mathsf{P}_3$, are given in Appendix C. In addition, we consider the irregular LDPC codes $\mathcal{C}_4(420, 180)$ specified in the 5G New Radio (NR) standard and $\mathcal{C}_5(128, 64)$ in the Consultative Committee for Space Data Systems (CCSDS) standard [85]. The code parameters, such as the exponent matrices, are also available in the public repository ([87] v0.1).
We compare our adaptive decoders with the tanh-based BP, the auto-regressive BP, and several static WMS decoders with different levels of parameter sharing, such as BP with SS-PAN [18]. The latter is a Type $T_a VC$ WBP with $\alpha_v = \alpha$, i.e., a BP with two parameters. Additionally, to assess the achievable performance with a large number of parameters in the decoder, we include a comparison with two model-agnostic neural decoders based on the transformer [41] and graph NNs [33,43].
The number of iterations in the WMS decoders of the parallel decoders is chosen so that the total computational complexities of the parallel decoders and the static WMS decoder are about the same. In Figure 4a, this value is $T = 5$ for $\mathcal{C}_1$, where $T = T_1 + T_2$ is the total number of iterations; in Figure 4b–d, $T = 4$ for $\mathcal{C}_2$ and $\mathcal{C}_3$, $T = 6$ for $\mathcal{C}_4$, and $T = 10$ for $\mathcal{C}_5$; in Figure 4f, $T = 10$ for $\mathcal{C}_1$. Furthermore, $K_t = 4$ for all $t \in [T]$.
To compute the values of the weights $x_k^{(t)}$, the probability distribution of $\Gamma^{(t)}$ is required. For this purpose, a WMS decoder is trained offline. The training dataset is a collection of examples obtained using the AWGN channel with a range of SNRs, $\rho \in \{5.8, 6.0, 6.2\}$ dB for $\mathcal{C}_1$ or $\rho \in \{3.8, 3.9, 4.0, 4.1, 4.2\}$ dB for $\mathcal{C}_2$ and $\mathcal{C}_3$. The datasets for $\mathcal{C}_4$ and $\mathcal{C}_5$ are obtained similarly, with different sets of SNRs. The acquisition function $A_f(\cdot,\cdot)$ in active learning is the Hamming distance. A candidate example for the training dataset is retained if $A_f \leq 10$. The parameters of the active learning are $b_1 = b_s = 2000$ and $b_2 = 40{,}000$. The loss function is the binary cross-entropy. The models are trained using the Adam optimizer with a learning rate of 0.0005. It is observed that the distribution of $\Gamma^{(t)}$ is nearly Gaussian. Thus, we obtain $x_k^{(t)}$ from (16). Table 3 presents the mean and variance of this distribution, and the resulting $x_k^{(t)}$, for the three codes considered.
For the two-stage decoder, we use a CNN with $F_1 = 5$, $S_1 = 3$, $F_2 = 8$, and $S_2 = 2$, determined by cross-validation. The CNN is trained with a dataset of size 80,000, a batch size of 300, and the quantile loss function with $\xi = 0.75$. The number of iterations of the WMS decoder for each code is the same as above.
Figure 4 illustrates the BER vs. SNR for different codes, and for different decoders applied to the same code. In each of Figure 4a–d, one can compare different decoders at about the same complexity (except for the parallel decoder with the largest $\nu$, which shows the smallest achievable BER). For instance, it can be seen in Figure 4a that the two-stage decoder achieves half the BER of the WMS with SS-PAN decoder at an SNR of 6.4 dB for the short-length code $\mathcal{C}_1$, or approximately a 0.32 dB gain in SNR at a BER of $10^{-4}$. For this code, the parallel WMS decoders with 3 iterations and $\nu = 9$ outperform the tanh-based BP with nearly the same complexity. Figure 4b,c show that the two-stage decoder offers about an order-of-magnitude improvement in the BER compared to the Type $T_a$ WMS decoder at 4.2 dB for the moderate-length codes $\mathcal{C}_2$ and $\mathcal{C}_3$, or over a 0.1 dB gain at a BER of $10^{-6}$. The performance gains vary with the code, its parameters, and the SNR.
Figure 4d–f compare decoders with different complexities at about the same performance. The proposed adaptive model-based decoders achieve the same performance as the model-agnostic static decoders, with far fewer parameters.
The computational complexities of the decoders are presented in Table 4. For the CNN, from (17), $\mathrm{RM} = n(8T + 95) - 24T - 310$, which is further reduced by a factor of 4 upon pruning at the threshold $\tau_{\mathrm{prun}} = 0.001$, with minimal impact on the BER. Thus, the two-stage decoder requires less than half of the RMs of the WMS decoder with no or Type $T_a$ parameter sharing. Moreover, the two-stage decoder requires approximately one-fifth of the RMs of the parallel decoders with $\nu = 16$. Compared to the WMS decoder with SS-PAN [18], the two-stage decoder has nearly double the complexity, albeit with a much lower BER, as seen in Figure 4a.
High-rate regime: To further investigate the error correction performance of the decoders at high rates, we consider three single-edge QC-LDPC codes, $\mathcal{C}_6(1050, 875)$, $\mathcal{C}_7(1050, 850)$, and $\mathcal{C}_8(4260, 3834)$, associated, respectively, with the PCMs $\mathsf{H}_6(\lambda{=}7, \omega{=}42, M{=}25, \mathsf{P}_6)$, $\mathsf{H}_7(8, 42, 25, \mathsf{P}_7)$, and $\mathsf{H}_8(6, 60, 71, \mathsf{P}_8)$. These codes have rates $r = 0.84$, $0.81$, and $0.9$, respectively, and are constructed using the PEG algorithm. The PEG algorithm requires the degree distributions of the Tanner graph, which are optimized using the stochastic extrinsic information transfer (EXIT) chart described in Appendix C. Additionally, we include the polar code $\mathcal{C}_9(1024, 854)$ with $r = 0.84$ from the 5G-NR standard as a state-of-the-art benchmark. The code parameters, including the degree distribution polynomials and the exponent matrices, are given in Appendix C.
The acquisition function used with active learning in the parallel decoders is based on Figure 5. The figure shows the scatter plot of the pre-FEC error $e_1 = \|\mathbf{c}\oplus\bar{\mathbf{y}}\|_H$ versus the post-FEC error $e_2 = \|\mathbf{c}\oplus\hat{\mathbf{c}}\|_H$, for 340 examples $(\mathbf{c}, \mathbf{y})$ for $\mathcal{C}_6$ at $E_b/N_0 = 4.25$ dB. Here, $\bar{\mathbf{y}}$ is the hard decision of the LLRs at the channel output defined in (11), and $\hat{\mathbf{c}}$ is decoded with the best decoder at epoch $e$, i.e., the WMS with the weights from epoch $e-1$. The acquisition function retains $(\mathbf{c}, \mathbf{y})$ if $e_1 = 0$ (no error) or if $(e_1, e_2)$ falls in the rectangle $\mathcal{S}$ in Figure 5 (with error). The rectangle is defined such that $\Pr\left((e_1, e_2)\in\mathcal{S}\right) \approx 0.95$. It is ensured that 70% of the retained examples satisfy $e_1 = 0$ and 30% have $(e_1, e_2)$ in the rectangle $\mathcal{S}$. In this example, $e_1 \in \{80, 81, \ldots, 100\}$ and $e_2 \in [\mu - 2\sigma, \mu + 2\sigma]$, $\mu = 149.97$, $\sigma = 12.2$. We use $b_1 = 2000$, $b_2 = 20{,}000$, $b_s = 500$, and a learning rate of 0.001.
For the adaptive decoder, we consider five parallel decoders with $T_1 = 4$ iterations. The decoder for the binary codes $\mathcal{C}_6$, $\mathcal{C}_7$, and $\mathcal{C}_8$ is the WMS with Type $T_a VC$ sharing. The output of the decoder with the smallest syndrome weight is continued with an MS decoder for $T_2 = 4$ iterations. The polar code, however, is decoded with either a cyclic redundancy check (CRC) and successive cancellation list (SCL) decoder with list size $L$ [88] or the optimized successive cancellation (OSC) decoder [89].
Figure 6 shows the performance of the adaptive and static MS decoders for $\mathcal{C}_6$, $\mathcal{C}_7$, and $\mathcal{C}_9$. The polar code $\mathcal{C}_9$ with 24 CRC bits is simulated using the AFF3CT software toolbox ([90] v3.0.2). It can be seen that at high SNRs, $E_b/N_0 \geq 4.6$ dB, $\mathcal{C}_6$ and $\mathcal{C}_7$ decoded with the adaptive parallel decoders outperform $\mathcal{C}_9$. Given this, and the higher complexity of decoding the polar code with either SCL or OSC [88], the choice of QC-LDPC codes for the inner code for the optical fiber channel in Section 6.2 is justified.
Figure 7 shows the performance of $\mathcal{C}_8(4260, 3834)$ with rate 0.9. The adaptive WMS decoder with $T_1 + T_2 = 8$ iterations outperforms the static MS decoder with $T = 8$ iterations at $E_b/N_0 = 5$ dB by an order of magnitude in BER.
The gains of WBP depend on parameters such as the block length or SNR [91] (Section IV. d [44]). In general, the gain is decreased when the block length is increased, with other parameters remaining fixed.

6.2. Optical Fiber Channel

We simulate the 16-QAM WDM optical communication system described in Section 3.2, with the parameters in Table 2. The continuous-time model (5) is simulated with the split-step Fourier method with a spatial step size of 100 m and a simulation bandwidth of 200 GHz. DBP with two samples/symbol is applied to compensate for the physical effects and to obtain the per-symbol channel law $\Pr(y|s)$, $s\in\mathcal{A}$, $y\in\mathbb{C}$. For the inner code in the concatenated code, we consider two QC-LDPC codes of rate $r_i = 0.92$: the binary single-edge code $\mathcal{C}_{10}(4000, 3680)$ and the non-binary multi-edge code $\mathcal{C}_{11}(800, 32)$ over $\mathbb{F}_{32}$, with the PCMs $\mathsf{H}_{10}(4, 50, 80, \mathsf{P}_{10})$ and $\mathsf{H}_{11}(2, 25, 32, \mathsf{P}_{11})$, respectively, given in Appendix C. For the component code used in the outer spatially coupled code, we consider the multi-edge QC-LDPC code $\mathcal{C}_{12}(3680, 3520)$ with the PCM $\mathsf{H}_{12}(1, 23, 160, \mathsf{P}_{12})$. For $m_s = 2$ and $L = 100$, the resulting SC-QC-LDPC code has the PCM $\mathsf{H}_{\mathrm{SC}}(1, 23, 160, \mathsf{P}_{12}, 2, 100, \bar{\mathsf{B}}_{12})$, where $\bar{\mathsf{B}}_{12}$ is the spreading matrix. The outer SC-QC-LDPC code is encoded with the sequential encoder [92]. This requires that the top-left $\lambda M \times \omega M$ block $\mathsf{H}_0(0)$ of $\mathsf{H}_{\mathrm{SC}}$ in Equation (A2) is of full rank; thus, $\bar{\mathsf{B}}_{12}$ is designed to fulfill this condition. In Equation (A2), we have $\mathsf{H}_t(\lambda{=}1, \omega{=}23, M{=}160) \stackrel{\Delta}{=} \mathsf{H} \in \mathbb{F}_2^{160\times 3680}$, $t = 0, 1, 2$, $\mathsf{H}_0$ is of full rank, and $\mathsf{H}_{\mathrm{SC}} \in \mathbb{F}_2^{16320\times 368000}$. The rate of the component code is $r_{\mathrm{QC}} = 1 - 1/23 \approx 0.956$, and the rate of the outer SC code is $r_o = r_{\mathrm{QC}} - \frac{m_s\lambda}{L\omega} \approx 0.955$, so $r_{\mathrm{total}} = r_i \cdot r_o \approx 0.88$. The $\mathsf{P}_{12}$ and $\bar{\mathsf{B}}_{12}$ matrices are constructed heuristically and are given in Appendix C.
The inner code is decoded with the parallel decoder, using five decoders with four iterations each. The decoder for the binary code $\mathcal{C}_{10}$ is the WMS with Type $T_a VC$ parameter sharing, while the non-binary code $\mathcal{C}_{11}$ is decoded with the WEMS with Type $T_a VCQ$ sharing and $\beta_{c,v,q}^{(t)} = 1$. The static EMS algorithm [93] is parameterized as in Section 4 and initialized with the LLRs computed from Equation (1) of [81]. The outer code is decoded with the SWD, with static MS decoding and a maximum of 26 iterations per window.
Table 5 and Table 6 contain a summary of the numerical results. $\mathrm{BER}_i$ is the pre-FEC BER, and the reference BER for the coding gain is $\mathrm{BER}_o = 10^{-12}$. At $P = 10$ dBm, the total gap to the finite block-length NCG for the adaptive weighted min-sum (AWMS) (resp., adaptive WEMS) decoder is 2.51 dB (resp., 1.75 dB), while this value is 3.31 dB (resp., 2.29 dB) and 3.44 dB (resp., 2.69 dB), respectively, for the NNMS (resp., NNEMS) and MS (resp., EMS) decoders. Thus, the adaptive WBP provides a coding gain of 0.8 dB compared to the static NNMS decoder, with about the same computational complexity and decoding latency.

7. Conclusions

Adaptive decoders are proposed for codes on graphs that can be decoded with message-passing algorithms. Two variants, the parallel WBP decoders and the two-stage decoder, are studied. The parallel decoders search for the best sequence of weights in real time using multiple instances of the WBP decoder running concurrently, while the two-stage neural decoder employs an NN to dynamically determine the weights of the WBP for each received word. The performance and complexity of the adaptive and several static decoders are compared for a number of codes over AWGN and optical fiber channels. The simulations show that significant improvements in BER can be obtained using adaptive decoders, depending on the channel, the SNR, and the code and its parameters. Future work could explore further reducing the computational complexity of the online learning and applying adaptive decoders to other types of codes and wireless channels.

Author Contributions

Conceptualization, A.T. and M.Y.; Methodology, A.T. and M.Y.; Software, A.T. and M.Y.; Formal analysis, A.T. and M.Y.; Investigation, A.T. and M.Y.; Writing—original draft, A.T. and M.Y.; Project administration, M.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This work has received funding from the European Research Council (ERC) research and innovation program, under the COMNFT project, Grant Agreement no. 805195.

Institutional Review Board Statement

Not applicable.

Data Availability Statement

Data is contained within the article.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript.
AWGN: Additive White Gaussian Noise
AWEMS: Adaptive Weighted Extended Min-sum
AWMS: Adaptive Weighted Min-sum
BCH: Bose–Chaudhuri–Hocquenghem
BER: Bit Error Rate
BICD: Bit-interleaved Coded Demodulation
BICM: Bit-interleaved Coded Modulation
BP: Belief Propagation
CG: Coding Gain
CNN: Convolutional Neural Network
CRC: Cyclic Redundancy Check
DSP: Digital Signal Processing
EDFA: Erbium Doped Fiber Amplifier
EMS: Extended Min-sum
EXIT: Extrinsic Information Transfer
FEC: Forward Error Correction
FER: Frame Error Rate
GF: Galois Field
LDPC: Low-density Parity-check
LLR: Log-likelihood Ratio
MS: Min-sum
NCG: Net Coding Gain
NF: Noise Figure
NN: Neural Network
NR: New Radio
NNEMS: Neural Normalized Extended Min-sum
NNMS: Neural Normalized Min-sum
OSC: Optimized Successive Cancellation
PB: Protograph-based
PCM: Parity-check Matrix
PDF: Probability Density Function
PEG: Progressive-edge Growth
QAM: Quadrature Amplitude Modulation
QC: Quasi-cyclic
ReLU: Rectified Linear Unit
RRC: Root Raised Cosine
RM: Real Multiplication
RNN: Recurrent Neural Network
SC: Spatially coupled
SS-PAN: Simple Scaling and Parameter-adapter Networks
SCL: Successive Cancellation List
SWD: Sliding Window Decoder
WBP: Weighted Belief Propagation
WDM: Wavelength Division Multiplexing
WEMS: Weighted Extended Min-sum
WMS: Weighted Min-sum

Appendix A. Protograph-Based QC-LDPC Codes

In this appendix and the next, we provide the supplementary information necessary to reproduce the results presented in this paper. The presentation in Appendix B may be of independent interest, as it provides an accessible exposition of the construction and decoding of the SC codes.

Appendix A.1. Construction for the Single-Edge Case

A single-edge PB QC-LDPC code $\mathcal{C}$ is constructed in two steps. First, a base matrix $\mathsf{B} \in \mathbb{F}_2^{\lambda\times\omega}$ is constructed, where $\lambda, \omega \in \mathbb{N}$, $\lambda \leq \omega$. Then, $\mathsf{B}$ is expanded to the PCM of $\mathcal{C}$ by replacing each zero in $\mathsf{B}$ with the all-zero matrix $\mathbf{0} \in \mathbb{F}_2^{M\times M}$, where $M \in \mathbb{N}$ is the lifting factor, and each one in row $i$ and column $j$ with a sparse circulant matrix $\mathsf{H}_{i,j} \in \mathbb{F}_2^{M\times M}$.
Let $\mathsf{P} \in [-1 : M-1]^{\lambda\times\omega}$ be the exponent matrix of the code, with entries denoted by $p_{i,j}$. The matrices $\mathsf{H}_{i,j}$ (and $\mathsf{B}$) can be obtained from $\mathsf{P}$ as follows. Denote by $\mathsf{I}_n \in \mathbb{F}_2^{M\times M}$, $n \in [-1 : M-1]$, the circulant permutation matrix obtained by cyclically shifting each row of the $M\times M$ identity matrix $n$ positions to the right, with the convention that $\mathsf{I}_{-1}$ is the all-zero matrix. Then, $\mathsf{H}_{i,j} = \mathsf{I}_{p_{i,j}}$. The PCM $\mathsf{H} := \mathsf{H}(\lambda, \omega, M, \mathsf{P}) \in \mathbb{F}_2^{\lambda M\times\omega M}$ of this QC-LDPC code is
$$\mathsf{H} = \begin{bmatrix} \mathsf{I}_{p_{1,1}} & \mathsf{I}_{p_{1,2}} & \cdots & \mathsf{I}_{p_{1,\omega}} \\ \mathsf{I}_{p_{2,1}} & \mathsf{I}_{p_{2,2}} & \cdots & \mathsf{I}_{p_{2,\omega}} \\ \vdots & \vdots & \ddots & \vdots \\ \mathsf{I}_{p_{\lambda,1}} & \mathsf{I}_{p_{\lambda,2}} & \cdots & \mathsf{I}_{p_{\lambda,\omega}} \end{bmatrix}. \qquad \mathrm{(A1)}$$
This code has length $n = \omega M$, dimension $k = \omega M - r_{\mathsf{H}}$, and rate $r = 1 - r_{\mathsf{H}}/(\omega M)$, where $r_{\mathsf{H}} \leq \lambda M$ is the rank of $\mathsf{H}$. If $\mathsf{H}$ is of full rank, $r = 1 - \lambda/\omega$. The base matrix is also obtained from $\mathsf{P}$, as $\mathsf{B}_{i,j} = 0$ if $p_{i,j} = -1$ and $\mathsf{B}_{i,j} = 1$ if $p_{i,j} \neq -1$.
Denote the Tanner graph of $\mathcal{C}$ by $\mathcal{T}_{\mathcal{C}}$, and let $d_c$ and $d_v$ be, respectively, the degrees of the check node $c$ and the variable node $v$ in $\mathcal{T}_{\mathcal{C}}$. If $\mathsf{B}$ is regular (i.e., its rows have the same Hamming weight), then $\mathsf{H}$ is regular, and the check and variable nodes of $\mathcal{C}$ have the same degrees $d_c$ and $d_v$, respectively. In this case, $\mathcal{C}$ is said to be $(d_c, d_v)$-regular. More generally, the variable node degree distribution polynomial can be defined as $\Upsilon(x) = \sum_d \Upsilon_d x^{d-1}$, where $\Upsilon_d$ is the fraction of variable nodes of degree $d$, and the check node degree distribution as $\Lambda(x) = \sum_d \Lambda_d x^{d-1}$, where $\Lambda_d$ is the fraction of check nodes of degree $d$.
The parameter matrices B and P can be obtained so as to maximize the girth of T C using search-based methods such as the PEG algorithm [86,94], algebraic methods ([69] Section 10), or a combination of them [95].
Example A1. 
Consider $\lambda = 3$, $\omega = 5$, and the base matrix $\mathsf{B} \in \mathbb{F}_2^{3\times 5}$
$$\mathsf{B} = \begin{bmatrix} 0 & 1 & 1 & 1 & 0 \\ 1 & 1 & 0 & 1 & 1 \\ 1 & 0 & 1 & 1 & 1 \end{bmatrix}.$$
For any exponent matrix $\mathsf{P}$, $\mathsf{H} \in \mathbb{F}_2^{3M\times 5M}$ is
$$\mathsf{H} = \begin{bmatrix} \mathsf{I}_{-1} & \mathsf{I}_{p_{1,2}} & \mathsf{I}_{p_{1,3}} & \mathsf{I}_{p_{1,4}} & \mathsf{I}_{-1} \\ \mathsf{I}_{p_{2,1}} & \mathsf{I}_{p_{2,2}} & \mathsf{I}_{-1} & \mathsf{I}_{p_{2,4}} & \mathsf{I}_{p_{2,5}} \\ \mathsf{I}_{p_{3,1}} & \mathsf{I}_{-1} & \mathsf{I}_{p_{3,3}} & \mathsf{I}_{p_{3,4}} & \mathsf{I}_{p_{3,5}} \end{bmatrix}.$$
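The lifting in (A1) is straightforward to implement. A small sketch of ours (numpy; the convention $p = -1 \to$ all-zero block follows the text, and the example exponents are arbitrary values matching the zero pattern of Example A1):

```python
import numpy as np

def circulant(M, p):
    """M x M circulant permutation matrix I_p; p = -1 gives the all-zero matrix."""
    if p < 0:
        return np.zeros((M, M), dtype=np.uint8)
    return np.roll(np.eye(M, dtype=np.uint8), p, axis=1)

def lift(P, M):
    """Expand an exponent matrix P (lambda x omega) into the QC-LDPC PCM H, per (A1)."""
    return np.block([[circulant(M, p) for p in row] for row in P])

# Example: the 3 x 5 pattern of Example A1 with M = 4 and arbitrary exponents
P = [[-1, 1, 2, 0, -1],
     [ 3, 0, -1, 2, 1],
     [ 1, -1, 0, 3, 2]]
H = lift(P, 4)
print(H.shape)        # (12, 20) = (lambda*M, omega*M)
```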

Appendix A.2. Construction for the Multi-Edge Case

The above construction can be extended to multi-edge PB QC-LDPC codes. Here, $\mathsf{B}_{i,j}$, instead of a binary number, is a sequence of length $D_{i,j}$ with entries in $\mathbb{F}_2$. Likewise, $p_{i,j}$ is a sequence of length $D_{i,j}$ with entries $p_{i,j}^d \in [-1 : M-1]$, $d \in [D_{i,j}]$. Then, $\mathsf{H}_{i,j} = \sum_{d=1}^{D_{i,j}} \mathsf{I}_{p_{i,j}^d}$. This code is represented by a Tanner graph in which there can be multiple edges of different types between the variable and check nodes. We say that this code has type $D = \max_{i,j} D_{i,j} \geq 1$. A single-edge PB QC-LDPC code is of type 1.
Example A2. 
Consider $\lambda = 2$, $\omega = 3$,
$$\mathsf{B} = \begin{bmatrix} 0 & (1,1) & 1 \\ 1 & 0 & (1,1,1) \end{bmatrix} \quad \text{and} \quad \mathsf{P} = \begin{bmatrix} -1 & (p_{1,2}^1, p_{1,2}^2) & p_{1,3} \\ p_{2,1} & -1 & (p_{2,3}^1, p_{2,3}^2, p_{2,3}^3) \end{bmatrix}.$$
Then
$$\mathsf{H} = \begin{bmatrix} \mathsf{I}_{-1} & \mathsf{I}_{p_{1,2}^1} + \mathsf{I}_{p_{1,2}^2} & \mathsf{I}_{p_{1,3}} \\ \mathsf{I}_{p_{2,1}} & \mathsf{I}_{-1} & \mathsf{I}_{p_{2,3}^1} + \mathsf{I}_{p_{2,3}^2} + \mathsf{I}_{p_{2,3}^3} \end{bmatrix}.$$

Appendix A.3. Construction for Non-Binary Codes

In non-binary codes, the entries of the codewords, parity-check, and generator matrices are in a finite field $\mathbb{F}_q$, where $q$ is a power of a prime. There are several ways to construct non-binary PB QC-LDPC codes. The base matrix $\mathsf{B} \in \mathbb{F}_2^{\lambda\times\omega}$ typically remains binary, as defined in Appendix A.1. We use the unconstrained and random assignment strategy in [80] to extend $\mathsf{B}$ to a PCM.
There is significant flexibility in selecting the edge weights for constructing non-binary QC-LDPC codes, which can be classified as constrained or unconstrained ([80] Section II). In this work, we focus on an unconstrained and random assignment strategy, where each edge in T C corresponding to a 1 in the binary matrix B is replaced with a coefficient h i , j F q . Alternatively, these coefficients could be selected based on predefined rules to ensure appropriate edge-weight diversity, which can lead to enhanced performance. This methodology allows non-binary codes to retain the structural benefits of binary QC-LDPC codes while extending their functionality to finite fields, offering improved error-correcting capabilities for larger q values.

Appendix A.4. Encoder and Decoder

The generator matrix of the code is obtained by applying Gaussian elimination in the binary field to (A1). The encoder is then implemented efficiently using shift registers [96].
The QC-LDPC codes are typically decoded using belief propagation (BP), as described in Section 4.

Appendix B. Spatially Coupled LDPC Codes

Appendix B.1. Construction

The SC-QC-LDPC codes in this paper are constructed based on the edge spreading process [97]. Denote the PCM of the constituent PB QC-LDPC code by H(λ, ω, M, P), and the PCM of the corresponding SC-QC-LDPC code by H_SC := H_SC(λ, ω, M, P, m_s, L, B̄), with the additional parameters of the syndrome memory m_s ∈ ℕ, the coupling length L ∈ ℕ, and the spreading matrix B̄ ∈ [−1 : m_s]^{λ×ω}. Then, H_SC ∈ F_2^{λM(m_s+L+1) × ωML} is given by
Entropy 27 00795 i001
in which H_t(ℓ) ∈ F_2^{λM×ωM}, t ∈ [0 : m_s], ℓ ∈ [0 : L−1], are λ × ω block matrices, and H_t(ℓ) is obtained from B̄ as
\[
\bigl[H_t(\ell)\bigr]_{i,j} =
\begin{cases}
I_{p_{i,j}}, & \text{if } \bar{B}_{i,j} = t, \\
0, & \text{otherwise},
\end{cases}
\]
where [H_t(ℓ)]_{i,j} denotes the block at row-block i and column-block j, and I_{p_{i,j}}, 0 ∈ F_2^{M×M}.
If H_t(ℓ_1) = H_t(ℓ_2) for all t ∈ [0 : m_s] and all ℓ_1, ℓ_2 ∈ [0 : L−1], the code is time-invariant. If H and H_SC are full-rank, then the rate of the SC-QC-LDPC code is r = r_QC − m_s λ/(Lω), where r_QC = 1 − λ/ω is the rate of the component QC-LDPC code. For a fixed m_s, r → r_QC as L → ∞. Thus, the rate loss in SC-LDPC codes can be reduced by increasing the coupling length.
Example A3. 
Consider any QC-LDPC code, m s = 2 , L = 4 and the spreading matrix
\[
\bar{B} = \begin{pmatrix}
-1 & 1 & 0 & 2 & -1 \\
1 & -1 & 1 & 0 & -1 \\
1 & 2 & -1 & 0 & 2
\end{pmatrix},
\]
where the entries equal to −1 mark the positions with no edge (the zeros of the base matrix B).
Then, H_2(ℓ) is obtained by replacing each entry of B̄ equal to 2, at row i and column j, with I_{p_{i,j}}, and every other entry with 0. Thus, for all ℓ ∈ [0 : 3],
\[
H_2(\ell) = \begin{pmatrix}
0 & 0 & 0 & I_{p_{1,4}} & 0 \\
0 & 0 & 0 & 0 & 0 \\
0 & I_{p_{3,2}} & 0 & 0 & I_{p_{3,5}}
\end{pmatrix}.
\]
In a similar manner,
\[
H_0(\ell) = \begin{pmatrix}
0 & 0 & I_{p_{1,3}} & 0 & 0 \\
0 & 0 & 0 & I_{p_{2,4}} & 0 \\
0 & 0 & 0 & I_{p_{3,4}} & 0
\end{pmatrix},
\qquad
H_1(\ell) = \begin{pmatrix}
0 & I_{p_{1,2}} & 0 & 0 & 0 \\
I_{p_{2,1}} & 0 & I_{p_{2,3}} & 0 & 0 \\
I_{p_{3,1}} & 0 & 0 & 0 & 0
\end{pmatrix}.
\]
If H_SC is full-rank, then r = r_QC − (2 · 3)/(4 · 5). If the component PCM H is also full-rank, then r_QC = 0.4 and r = 0.1. □
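The edge spreading of Example A3 can be sketched as follows (the exponents are hypothetical; B̄ entries equal to −1 mark positions with no edge, and circulant() is the helper from the sketch in Appendix A.1):

```python
import numpy as np

def circulant(M, p):
    if p < 0:
        return np.zeros((M, M), dtype=np.uint8)
    return np.roll(np.eye(M, dtype=np.uint8), p, axis=1)

def component(P, B_bar, M, t):
    # H_t: keep block (i, j) = I_{p_{i,j}} only where B_bar[i][j] == t, zero elsewhere
    rows = [[circulant(M, p) if b == t else np.zeros((M, M), dtype=np.uint8)
             for p, b in zip(Prow, Brow)]
            for Prow, Brow in zip(P, B_bar)]
    return np.block(rows)

B_bar = [[-1,  1,  0,  2, -1],
         [ 1, -1,  1,  0, -1],
         [ 1,  2, -1,  0,  2]]
P     = [[-1,  2,  4,  1, -1],   # hypothetical exponents with M = 5
         [ 0, -1,  3,  2, -1],
         [ 4,  1, -1,  0,  3]]
H0, H1, H2 = (component(P, B_bar, 5, t) for t in range(3))
```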
The SC-LDPC code is encoded efficiently and sequentially [92], so that at each spatial position ℓ, (ω − λ)M information bits are encoded out of the total of (ω − λ)ML.
If an entry of B corresponding to an H_i(ℓ) is a sequence, the corresponding entry in B̄ is also a sequence. Thus, if H_i(ℓ) represents a multi-edge code for some i or ℓ, so does H_SC.
Figure A1. (a) A window in the SWD is an m_w × m_w array of blocks, each block of size λM × ωM. The window slides diagonally by one block. The variable nodes in the block at block-position i × j are denoted by A(i), and the check nodes by B(j). Here, m_s = 2 and m_w = 3. (b) The message update for the current window ℓ is denoted by the solid rectangle. The variable nodes in the previous windows ℓ′ < ℓ and not in the current window ℓ, shown in green, send the fixed messages L_{v2c}^{(T_{ℓ−1}, ℓ−1)} to the check nodes in the current window in all iterations t. The blue variable nodes, in the current and a previous window, send L_{v2c}^{(T_{ℓ−1}, ℓ−1)} to the check nodes in the current window at iteration t = 1; these messages are updated according to (9) and (10) for t ≥ 2. The red variable nodes, in the current window but not in any previous window, send L(y) to the check nodes in the current window at t = 1, which is likewise updated for t ≥ 2. The edges from the gray check nodes in windows ℓ′ > ℓ are discarded for the decoding at position ℓ.
Entropy 27 00795 g0a1

Appendix B.2. Encoder

The SC-LDPC codes are encoded with the sequential encoder [92].

Appendix B.3. Sliding Window Decoder

Consider the SC-LDPC code H_SC(λ, ω, M, P, m_s, L, B̄) in Appendix B.1. Note that any two variable nodes in the Tanner graph of the code whose corresponding columns in H_SC are at least (m_s + 1)ωM columns apart do not share any common check nodes, and thus are not involved in the same parity-check equation. The SWD uses this property and runs a local BP decoder on the windows of H_SC shown in Figure A1.
The SWD works through a sequence of spatial iterations ℓ, in which a rectangular window slides from the top-left to the bottom-right of H_SC. In general, a window matrix H_w of size m_w consists of m_w λM consecutive rows and m_w ωM consecutive columns of H_SC. At each iteration ℓ, the window moves λM rows down and ωM columns to the right in H_SC; thus, the window is an m_w × m_w array of blocks, starting at the top left and moving diagonally by one block per iteration. A special case arises when the window reaches the boundary, and the way the windows near the boundary are terminated impacts the performance [98,99]. Our setup for window termination at the boundary is early termination, discussed in Section III-B1 of [99].
Denote the variable nodes in window ℓ by
\[
V(\ell) = \bigl\{ v_i \in V : i = (\ell - 1)\omega M, \ldots, (m_w + \ell - 1)\omega M - 1 \bigr\}.
\]
The check nodes directly connected to V(ℓ) are C(ℓ). Define Ṽ(ℓ) = ∪_{ℓ′<ℓ} {v ∈ V(ℓ′) \ V(ℓ) : v connected to C(ℓ)} (the variable nodes in the previous windows and not in the current window ℓ, shown in green in Figure A1), V̄(ℓ) = ∪_{ℓ′<ℓ} {v ∈ V(ℓ′) ∩ V(ℓ) : v connected to C(ℓ)} (the variable nodes in the current window and in a previous window, shown in blue in Figure A1), and V̂(ℓ) = {v ∈ V(ℓ) \ ∪_{ℓ′<ℓ} V(ℓ′) : v connected to C(ℓ)} (the variable nodes in the current window and not in any previous window, shown in red in Figure A1). Let L_{v2c}^{(t,ℓ)} and L_{c2v}^{(t,ℓ)} be the LLRs in window ℓ and iteration t ∈ [T_ℓ] of BP. At t = 1, the BP is initialized as
\[
L^{(1,\ell)}(v) =
\begin{cases}
L^{(T_{\ell-1},\, \ell-1)}(v), & v \in \tilde{V}(\ell) \cup \bar{V}(\ell), \\
L(y), & v \in \hat{V}(\ell).
\end{cases}
\]
The update equation for the variable nodes is
\[
L_{v2c}^{(t,\ell)} =
\begin{cases}
L_{v2c}^{(T_{\ell-1},\, \ell-1)}, & v \in \tilde{V}(\ell), \\
L(y) + \sum_{c' \in C_v \setminus \{c\}} L_{c'2v}^{(t-1,\, \ell)}, & v \in \bar{V}(\ell), \\
L(y), & v \in \hat{V}(\ell).
\end{cases}
\]
The update relation for L_{c2v}^{(t,ℓ)} is given by (9), with no weights, applied for c ∈ C(ℓ) and v ∈ V(ℓ). After T_ℓ iterations, the variables in the window ℓ, called the target symbols, are decoded. The SWD is illustrated in Figure A1.
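The windowing schedule can be sketched in Python as follows. This is a simplified illustration under our own naming: instead of freezing the per-edge messages L_{v2c}^{(T_{ℓ−1}, ℓ−1)} as in the equations above, it passes the posterior LLRs of earlier windows to later windows as channel values, and it runs a plain (unweighted) min-sum update inside each window:

```python
import numpy as np

def minsum(H, llr, iters):
    # Plain min-sum BP on the window submatrix H with input LLRs llr; returns posterior LLRs
    m, n = H.shape
    rows = [np.flatnonzero(H[i]) for i in range(m)]
    c2v = np.zeros((m, n))
    for _ in range(iters):
        total = llr + c2v.sum(axis=0)
        for i, idx in enumerate(rows):
            if idx.size < 2:
                continue
            v2c = total[idx] - c2v[i, idx]
            sgn = np.prod(np.sign(v2c)) * np.sign(v2c)    # product of the other edges' signs
            mag = np.abs(v2c)
            k = np.argsort(mag)[:2]                       # smallest and second-smallest magnitudes
            c2v[i, idx] = sgn * np.where(np.arange(idx.size) == k[0], mag[k[1]], mag[k[0]])
    return llr + c2v.sum(axis=0)

def sliding_window_decode(H_sc, llr, lamM, omgM, m_w, L, iters=10):
    # Window l covers m_w block-rows and block-columns starting at block position l
    llr = llr.astype(float).copy()
    bits = np.zeros(L * omgM, dtype=np.uint8)
    for l in range(L):
        r0, r1 = l * lamM, min((l + m_w) * lamM, H_sc.shape[0])
        c0, c1 = l * omgM, min((l + m_w) * omgM, H_sc.shape[1])
        post = minsum(H_sc[r0:r1, c0:c1], llr[c0:c1], iters)
        llr[c0:c1] = post                                 # reused by the later windows
        bits[c0:c0 + omgM] = post[:omgM] < 0              # decide the target symbols
    return bits
```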

Appendix C. Parameters of Codes

For the low-rate codes, the degree distributions of the Tanner graph are first determined using the extrinsic information transfer (EXIT) chart [100]. The EXIT chart produces accurate results as n → ∞ [100]. For the high-rate codes, we apply the stochastic EXIT chart, which, in the short block length regime, yields better coding gains than the deterministic variant [101]. For instance, while the EXIT chart suggests that the degree distributions of C_6 are optimal near E_b/N_0 = 3 dB, the stochastic EXIT chart in Figure A2 suggests E_b/N_0 = 3.5 dB. Indeed, at E_b/N_0 = 3 dB, the check-node extrinsic information I_C intersects the variable-node extrinsic information I_V in the deterministic EXIT chart. The exponent matrices are obtained using the PEG algorithm, which takes the optimized degree distribution polynomials as input.
Figure A2. Stochastic EXIT chart for the high-rate code C_6 at E_b/N_0 = 3.5 dB, for the AWGN channel.
Entropy 27 00795 g0a2
The matrices below are vectorized row-wise. They can be unvectorized considering their dimensions.
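For instance, a row-vectorized matrix can be restored with a single reshape (the values below are placeholders):

```python
import numpy as np

flat = [3, 1, 4, 1, 5, 9, 2, 6]      # an illustrative slice of a row-vectorized matrix
P = np.array(flat).reshape(2, 4)     # reshape(rows, cols), e.g., (lambda, omega)
```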
C 2 : λ = 4 , ω = 8 , M = 403 , Λ ( x ) = x 3 , Υ ( x ) = x 7 , P 2 below
[ 345 152 72 376 377 197 4 144 187 398 320 225 330 198 79 289 271 165 259 105 288 254 51 236 111 233 380 332 47 76 222 247 ]
C 3 : λ = 5 , ω = 16 , M = 251 , Λ ( x ) = x 4 , Υ ( x ) = x 15 , P 3 below
[ 6 98 208 177 76 76 76 48 111 76 76 34 76 76 64 85 198 42 155 127 29 32 35 10 76 44 47 8 53 56 47 71 31 211 158 0 238 111 199 8 195 248 121 167 46 170 246 140 117 51 3 65 57 150 243 57 213 20 113 164 48 141 222 85 181 142 121 210 229 98 218 59 242 76 196 23 185 54 162 52 ]
C 6 : λ = 7 , ω = 42 , M = 25 , Λ ( x ) = 0.714 x 2 + 0.286 x 3 , Υ ( x ) = 0.857 x 18 + 0.143 x 23 , P 6 below
[ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 3 8 18 7 17 22 24 1 1 1 1 1 16 1 1 1 1 1 1 14 1 1 1 1 19 4 1 1 22 1 8 1 1 5 1 1 9 0 14 1 15 3 12 7 11 14 18 13 22 1 1 1 1 1 1 1 17 1 1 1 1 1 1 15 1 1 3 1 18 1 1 1 9 8 1 1 19 9 20 1 21 5 1 13 1 1 1 1 1 1 1 2 6 16 11 14 9 19 23 12 1 1 1 1 1 1 1 20 1 19 1 1 1 17 1 1 1 8 16 18 4 3 1 17 1 1 1 15 1 1 1 1 1 6 24 14 22 3 11 1 19 1 1 13 1 1 1 1 1 18 1 18 1 15 1 1 1 1 13 11 19 1 1 20 17 1 21 1 1 1 1 17 1 1 1 1 13 1 1 1 1 1 1 3 9 24 4 21 1 16 22 14 1 1 14 23 24 11 1 16 23 1 1 1 1 1 1 7 23 1 1 1 1 1 1 18 1 1 1 1 15 1 1 1 1 9 11 21 8 17 4 14 16 1 11 1 6 11 13 13 20 6 13 1 1 16 1 1 1 1 1 ]
C 7 : λ = 8, ω = 42, M = 25, Λ(x) = 0.596x^2 + 0.404x^3, Υ(x) = 0.125x^9 + 0.125x^16 + 0.5x^17 + 0.125x^19 + 0.125x^23, P 7 below
[ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 3 8 18 7 17 22 24 1 1 1 1 1 16 1 1 1 1 1 1 14 1 1 1 1 20 1 16 1 1 1 1 4 1 10 1 1 20 19 1 2 9 3 12 7 11 14 18 13 22 1 1 1 1 1 1 1 17 1 1 1 1 1 1 15 1 1 1 1 22 6 1 1 13 1 21 4 1 1 1 1 14 1 17 13 1 1 1 1 1 1 1 2 6 16 11 14 9 19 23 12 1 1 1 1 1 1 1 1 1 1 5 1 1 3 1 10 1 23 1 1 8 4 1 2 1 1 1 15 1 1 1 1 1 6 24 14 22 3 11 1 19 1 1 13 1 1 1 1 1 20 1 1 1 1 13 1 1 1 1 0 3 3 1 12 3 7 1 1 1 1 1 17 1 1 1 1 13 1 1 1 1 1 1 3 9 24 4 21 1 16 22 1 1 12 1 6 1 1 6 1 1 1 1 0 6 18 19 1 6 1 1 1 1 1 1 18 1 1 1 1 15 1 1 1 1 9 11 21 8 17 4 14 16 23 1 3 1 24 10 0 1 1 23 1 0 1 1 1 23 1 19 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 8 5 20 1 1 7 10 24 21 10 1 11 22 1 1 1 1 1 ]
C 8 : λ = 6, ω = 60, M = 71, Λ(x) = 0.1x + 0.634x^2 + 0.266x^3, Υ(x) = 0.166x^27 + 0.668x^30 + 0.166x^37, P 8 below
[ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 61 1 1 1 1 1 1 1 1 1 1 1 1 1 1 62 1 1 20 60 24 37 1 1 17 45 52 57 62 63 68 70 54 26 19 14 9 8 3 0 1 1 1 1 1 1 28 1 1 1 1 1 1 1 1 1 46 1 1 1 1 34 51 1 1 59 1 18 1 1 1 61 1 52 46 21 48 1 1 44 34 1 2 48 6 1 17 40 29 11 67 23 65 70 54 31 42 60 4 1 0 1 1 1 1 2 1 1 1 1 1 1 1 1 1 46 1 1 1 1 1 21 1 23 1 1 7 1 1 53 1 1 62 1 32 27 27 39 1 1 15 1 16 0 1 1 1 1 1 1 16 1 1 1 1 1 1 1 1 3 51 64 14 29 44 47 62 68 20 7 57 42 27 24 9 1 1 1 32 29 1 1 1 1 38 1 34 1 45 1 67 1 1 60 40 1 37 1 9 20 1 69 20 1 0 1 1 1 1 21 1 1 1 1 1 1 1 1 1 2 18 3 51 49 16 33 59 69 53 68 20 22 55 38 12 1 1 62 1 1 1 1 48 0 37 48 1 26 19 59 60 49 38 1 1 1 1 67 1 1 1 24 1 1 1 0 1 40 42 1 1 1 1 1 1 1 1 1 1 1 1 67 14 48 55 1 1 1 1 1 1 1 1 1 1 25 67 16 41 64 0 68 66 50 20 63 67 34 45 0 7 29 11 11 1 1 1 61 1 1 1 1 1 ]
C 10 : λ = 4, ω = 50, M = 80, Λ(x) = 0.02 + 0.18x + 0.64x^2 + 0.16x^3, Υ(x) = 0.25x^29 + 0.25x^33 + 0.25x^34 + 0.25x^47, P 10 below
[ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 2 8 1 53 70 1 78 72 1 27 1 3 6 1 1 79 50 77 1 1 1 1 30 7 14 1 1 51 1 73 66 1 1 29 1 18 36 64 1 74 1 62 44 16 60 1 20 37 1 3 67 78 41 31 26 77 13 1 39 49 54 9 1 1 43 1 1 71 1 6 37 1 1 21 69 66 47 57 22 59 11 14 33 23 58 1 1 44 18 1 68 1 1 36 62 1 12 1 79 42 1 1 23 1 1 38 79 1 57 1 1 1 3 70 69 54 1 1 77 10 11 26 68 1 7 30 1 1 28 1 73 50 1 1 52 36 18 20 14 4 72 44 62 60 66 76 8 34 1 ]
C 11 : λ = 2 , ω = 25 , M = 32 , Λ ( x ) = x , Υ ( x ) = x 24 , P 11 below
[ ( 0 , 17 ) ( 1 ) ( 0 , 20 ) ( 1 ) ( 0 , 21 ) ( 1 ) ( 0 ) ( 0 ) ( 0 ) ( 0 ) ( 0 ) ( 0 ) ( 0 ) ( 0 ) ( 0 ) ( 0 ) ( 0 ) ( 0 ) ( 0 ) ( 0 ) ( 0 ) ( 0 ) ( 0 ) ( 0 ) ( 0 ) ( 1 ) ( 0 , 17 ) ( 1 ) ( 0 , 20 ) ( 1 ) ( 0 , 21 ) ( 0 ) ( 1 ) ( 2 ) ( 3 ) ( 4 ) ( 5 ) ( 6 ) ( 7 ) ( 8 ) ( 9 ) ( 10 ) ( 16 ) ( 25 ) ( 26 ) ( 27 ) ( 28 ) ( 29 ) ( 30 ) ( 31 ) ]
C 12 : λ = 1, ω = 23, M = 160, Λ(x) = 0.480x + 0.130x^2 + 0.217x^3 + 0.130x^4 + 0.043x^5, Υ(x) = x^71, P 12 below
[ ( 1 , 151 , 151 , 55 , 127 , 151 ) ( 138 , 27 , 139 , 144 ) ( 88 , 57 , 1 ) ( 111 , 1 , 47 ) ( 130 , 15 , 33 ) ( 11 , 47 , 118 , 15 , 108 ) ( 109 , 5 , 1 ) ( 143 , 1 , 140 ) ( 100 , 14 , 14 , 141 , 1 ) ( 12 , 1 , 20 ) ( 91 , 42 , 96 ) ( 74 , 1 , 72 ) ( 54 , 29 , 155 , 157 , 159 ) ( 83 , 82 , 1 ) ( 0 , 77 , 141 , 78 , 13 ) ( 112 , 1 , 59 ) ( 119 , 74 , 56 ) ( 48 , 6 , 55 , 157 , 1 ) ( 85 , 1 , 9 ) ( 41 , 80 , 121 , 2 , 1 ) ( 103 , 1 , 45 ) ( 60 , 117 , 52 , 87 ) ( 99 , 148 , 1 ) ]
m s = 2 , B ¯ 12 below
[ ( 0 , 0 , 1 , 2 , 2 , 2 ) ( 0 , 1 , 1 , 2 ) ( 0 , 1 , 1 ) ( 0 , 1 , 2 ) ( 0 , 1 , 2 ) ( 0 , 0 , 0 , 1 , 2 ) ( 0 , 1 , 1 ) ( 0 , 1 , 2 ) ( 0 , 1 , 1 , 1 , 1 ) ( 0 , 1 , 2 ) ( 0 , 1 , 2 ) ( 0 , 1 , 2 ) ( 0 , 1 , 2 , 2 , 2 ) ( 0 , 1 , 1 ) ( 0 , 0 , 0 , 1 , 2 ) ( 0 , 1 , 2 ) ( 0 , 1 , 2 ) ( 0 , 1 , 1 , 1 , 1 ) ( 0 , 1 , 2 ) ( 0 , 0 , 0 , 1 , 1 ) ( 0 , 1 , 2 ) ( 0 , 1 , 2 , 2 ) ( 0 , 1 , 1 ) ]

References

  1. Pham, Q.V.; Nguyen, N.T.; Huynh-The, T.; Bao Le, L.; Lee, K.; Hwang, W.J. Intelligent radio signal processing: A survey. IEEE Access 2021, 9, 83818–83850. [Google Scholar] [CrossRef]
  2. Bruck, J.; Blaum, M. Neural networks, error-correcting codes, and polynomials over the binary n-cube. IEEE Trans. Inf. Theory 1989, 35, 976–987. [Google Scholar] [CrossRef]
  3. Zeng, G.; Hush, D.; Ahmed, N. An application of neural net in decoding error-correcting codes. In Proceedings of the 1989 IEEE International Symposium on Circuits and Systems (ISCAS), Portland, OR, USA, 8–11 May 1989; Volume 2, pp. 782–785. [Google Scholar] [CrossRef]
  4. Caid, W.; Means, R. Neural network error correcting decoders for block and convolutional codes. In Proceedings of the GLOBECOM ’90: IEEE Global Telecommunications Conference and Exhibition, San Diego, CA, USA, 2–5 December 1990; Volume 2, pp. 1028–1031. [Google Scholar]
  5. Tseng, Y.H.; Wu, J.L. Decoding Reed-Muller codes by multi-layer perceptrons. Int. J. Electron. Theor. Exp. 1993, 75, 589–594. [Google Scholar] [CrossRef]
  6. Marcone, G.; Zincolini, E.; Orlandi, G. An efficient neural decoder for convolutional codes. Eur. Trans. Telecommun. Relat. Technol. 1995, 6, 439–445. [Google Scholar]
  7. Wang, X.A.; Wicker, S. An artificial neural net Viterbi decoder. IEEE Trans. Commun. 1996, 44, 165–171. [Google Scholar] [CrossRef]
  8. Tallini, L.G.; Cull, P. Neural nets for decoding error-correcting codes. In Proceedings of the IEEE Technical Applications Conference and Workshops. Northcon/95. Conference Record, Portland, OR, USA, 10–12 October 1995; pp. 89–94. [Google Scholar] [CrossRef]
  9. Ibnkahla, M. Applications of neural networks to digital communications—A survey. Signal Process. 2000, 80, 1185–1215. [Google Scholar] [CrossRef]
  10. Haroon, A. Decoding of Error Correcting Codes Using Neural Networks. Ph.D. Thesis, Blekinge Institute of Technology, Blekinge, Sweden, 2012. Available online: https://www.diva-portal.org/smash/get/diva2:832503/FULLTEXT01.pdf (accessed on 11 June 2025).
  11. Nachmani, E.; Be’ery, Y.; Burshtein, D. Learning to decode linear codes using deep learning. In Proceedings of the 54th Annual Allerton Conference on Communication, Control, and Computing (Allerton), Monticello, IL, USA, 27–30 September 2016; pp. 341–346. [Google Scholar] [CrossRef]
  12. Gruber, T.; Cammerer, S.; Hoydis, J.; Brink, S.T. On deep learning-based channel decoding. In Proceedings of the 2017 51st Annual Conference on Information Sciences and Systems (CISS), Baltimore, MD, USA, 22–24 March 2017; pp. 1–6. [Google Scholar] [CrossRef]
  13. Kim, H.; Jiang, Y.; Rana, R.B.; Kannan, S.; Oh, S.; Viswanath, P. Communication algorithms via deep learning. In Proceedings of the International Conference on Learning Representations, Vancouver, BC, Canada, 30 April–3 May 2018; Available online: https://openreview.net/forum?id=ryazCMbR- (accessed on 11 June 2025).
  14. Vasić, B.; Xiao, X.; Lin, S. Learning to decode LDPC codes with finite-alphabet message passing. In Proceedings of the 2018 Information Theory and Applications Workshop (ITA), San Diego, CA, USA, 11–16 February 2018; pp. 1–9. [Google Scholar]
  15. Bennatan, A.; Choukroun, Y.; Kisilev, P. Deep learning for decoding of linear codes—A syndrome-based approach. In Proceedings of the 2018 IEEE International Symposium on Information Theory (ISIT), Vail, CO, USA, 17–22 June 2018; pp. 1595–1599. [Google Scholar] [CrossRef]
  16. Nachmani, E.; Marciano, E.; Lugosch, L.; Gross, W.J.; Burshtein, D.; Be’ery, Y. Deep learning methods for improved decoding of linear codes. IEEE J. Sel. Top. Signal Process. 2018, 12, 119–131. [Google Scholar] [CrossRef]
  17. Lugosch, L.P. Learning Algorithms for Error Correction. Master’s Thesis, McGill University, Montreal, QC, Canada, 2018. Available online: https://escholarship.mcgill.ca/concern/theses/c247dv63d (accessed on 11 June 2025).
  18. Lian, M.; Carpi, F.; Häger, C.; Pfister, H.D. Learned belief-propagation decoding with simple scaling and SNR adaptation. In Proceedings of the 2019 IEEE International Symposium on Information Theory (ISIT), Paris, France, 7–12 July 2019; pp. 161–165. [Google Scholar] [CrossRef]
  19. Jiang, Y.; Kannan, S.; Kim, H.; Oh, S.; Asnani, H.; Viswanath, P. DEEPTURBO: Deep Turbo decoder. In Proceedings of the 2019 IEEE 20th International Workshop on Signal Processing Advances in Wireless Communications (SPAWC), Cannes, France, 2–5 July 2019; pp. 1–5. [Google Scholar] [CrossRef]
  20. Carpi, F.; Häger, C.; Martalò, M.; Raheli, R.; Pfister, H.D. Reinforcement learning for channel coding: Learned bit-flipping decoding. In Proceedings of the 2019 57th Annual Allerton Conference on Communication, Control, and Computing (Allerton), Monticello, IL, USA, 24–27 September 2019; pp. 922–929. [Google Scholar] [CrossRef]
  21. Wang, Q.; Wang, S.; Fang, H.; Chen, L.; Chen, L.; Guo, Y. A model-driven deep learning method for normalized Min-Sum LDPC decoding. In Proceedings of the 2020 IEEE International Conference on Communications Workshops (ICC Workshops), Dublin, Ireland, 7–11 June 2020; pp. 1–6. [Google Scholar] [CrossRef]
  22. Huang, L.; Zhang, H.; Li, R.; Ge, Y.; Wang, J. AI coding: Learning to construct error correction codes. IEEE Trans. Commun. 2020, 68, 26–39. [Google Scholar] [CrossRef]
  23. Be’Ery, I.; Raviv, N.; Raviv, T.; Be’Ery, Y. Active deep decoding of linear codes. IEEE Trans. Commun. 2020, 68, 728–736. [Google Scholar] [CrossRef]
  24. Xu, W.; Tan, X.; Be’ery, Y.; Ueng, Y.L.; Huang, Y.; You, X.; Zhang, C. Deep learning-aided belief propagation decoder for Polar codes. IEEE J. Emerg. Sel. Top. Circuits Syst. 2020, 10, 189–203. [Google Scholar] [CrossRef]
  25. Buchberger, A.; Häger, C.; Pfister, H.D.; Schmalen, L.; i Amat, A.G. Pruning and quantizing neural belief propagation decoders. IEEE J. Sel. Areas Commun. 2021, 39, 1957–1966. [Google Scholar] [CrossRef]
  26. Dai, J.; Tan, K.; Si, Z.; Niu, K.; Chen, M.; Poor, H.V.; Cui, S. Learning to decode protograph LDPC codes. IEEE J. Sel. Areas Commun. 2021, 39, 1983–1999. [Google Scholar] [CrossRef]
  27. Tonnellier, T.; Hashemipour, M.; Doan, N.; Gross, W.J.; Balatsoukas-Stimming, A. Towards practical near-maximum-likelihood decoding of error-correcting codes: An overview. In Proceedings of the ICASSP 2021—2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Toronto, ON, Canada, 6–11 June 2021; pp. 8283–8287. [Google Scholar] [CrossRef]
  28. Wang, L.; Chen, S.; Nguyen, J.; Dariush, D.; Wesel, R. Neural-network-optimized degree-specific weights for LDPC MinSum decoding. In Proceedings of the 2021 11th International Symposium on Topics in Coding (ISTC), Montreal, QC, Canada, 30 August–3 September 2021; pp. 1–5. [Google Scholar] [CrossRef]
  29. Habib, S.; Beemer, A.; Kliewer, J. Belief propagation decoding of short graph-based channel codes via reinforcement learning. IEEE J. Sel. Areas Inf. Theory 2021, 2, 627–640. [Google Scholar] [CrossRef]
  30. Nachmani, E.; Wolf, L. Autoregressive belief propagation for decoding block codes. arXiv 2021, arXiv:2103.11780. [Google Scholar]
  31. Nachmani, E.; Be’ery, Y. Neural decoding with optimization of node activations. IEEE Commun. Lett. 2022, 26, 2527–2531. [Google Scholar] [CrossRef]
  32. Cammerer, S.; Ait Aoudia, F.; Dörner, S.; Stark, M.; Hoydis, J.; Ten Brink, S. Trainable communication systems: Concepts and prototype. IEEE Trans. Commun. 2020, 68, 5489–5503. [Google Scholar] [CrossRef]
  33. Cammerer, S.; Hoydis, J.; Aoudia, F.A.; Keller, A. Graph neural networks for channel decoding. In Proceedings of the 2022 IEEE Globecom Workshops (GC Wkshps), Rio de Janeiro, Brazil, 4–8 December 2022; pp. 486–491. [Google Scholar] [CrossRef]
  34. Choukroun, Y.; Wolf, L. Error correction code transformer. Conf. Neural Inf. Proc. Syst. 2022, 35, 38695–38705. Available online: https://proceedings.neurips.cc/paper_files/paper/2022/file/fcd3909db30887ce1da519c4468db668-Paper-Conference.pdf (accessed on 11 June 2025).
  35. Jamali, M.V.; Saber, H.; Hatami, H.; Bae, J.H. ProductAE: Toward training larger channel codes based on neural product codes. In Proceedings of the ICC 2022—IEEE International Conference on Communications, Seoul, Republic of Korea, 16–20 May 2022; pp. 3898–3903. [Google Scholar] [CrossRef]
  36. Dörner, S.; Clausius, J.; Cammerer, S.; ten Brink, S. Learning joint detection, equalization and decoding for short-packet communications. IEEE Trans. Commun. 2022, 71, 837–850. [Google Scholar] [CrossRef]
  37. Li, G.; Yu, X.; Luo, Y.; Wei, G. A bottom-up design methodology of neural Min-Sum decoders for LDPC codes. IET Commun. 2023, 17, 377–386. [Google Scholar] [CrossRef]
  38. Wang, Q.; Liu, Q.; Wang, S.; Chen, L.; Fang, H.; Chen, L.; Guo, Y.; Wu, Z. Normalized Min-Sum neural network for LDPC decoding. IEEE Trans. Cogn. Commun. Netw. 2023, 9, 70–81. [Google Scholar] [CrossRef]
  39. Wang, L.; Terrill, C.; Divsalar, D.; Wesel, R. LDPC decoding with degree-specific neural message weights and RCQ decoding. IEEE Trans. Commun. 2023, 72, 1912–1924. [Google Scholar] [CrossRef]
  40. Clausius, J.; Geiselhart, M.; Ten Brink, S. Component training of Turbo Autoencoders. In Proceedings of the 2023 12th International Symposium on Topics in Coding (ISTC), Brest, France, 4–8 September 2023; pp. 1–5. [Google Scholar] [CrossRef]
  41. Choukroun, Y.; Wolf, L. A foundation model for error correction codes. In Proceedings of the International Conference on Learning Representations, Vienna, Austria, 7–11 May 2024; Available online: https://openreview.net/forum?id=7KDuQPrAF3 (accessed on 15 March 2012).
  42. Choukroun, Y.; Wolf, L. Learning linear block error correction codes. arXiv 2024, arXiv:2405.04050. [Google Scholar]
  43. Clausius, J.; Geiselhart, M.; Tandler, D.; Brink, S.T. Graph neural network-based joint equalization and decoding. In Proceedings of the 2024 IEEE International Symposium on Information Theory (ISIT), Athens, Greece, 7–12 July 2024; pp. 1203–1208. [Google Scholar] [CrossRef]
  44. Adiga, S.; Xiao, X.; Tandon, R.; Vasić, B.; Bose, T. Generalization bounds for neural belief propagation decoders. IEEE Trans. Inf. Theory 2024, 70, 4280–4296. [Google Scholar] [CrossRef]
  45. Kim, T.; Sung Park, J. Neural self-corrected Min-Sum decoder for NR LDPC codes. IEEE Commun. Lett. 2024, 28, 1504–1508. [Google Scholar] [CrossRef]
  46. Ninkovic, V.; Kundacina, O.; Vukobratovic, D.; Häger, C.; i Amat, A.G. Decoding Quantum LDPC Codes Using Graph Neural Networks. In Proceedings of the GLOBECOM 2024—2024 IEEE Global Communications Conference, Cape Town, South Africa, 8–12 December 2024; pp. 3479–3484. [Google Scholar]
  47. Cammerer, S.; Gruber, T.; Hoydis, J.; Ten Brink, S. Scaling deep learning-based decoding of Polar codes via partitioning. In Proceedings of the GLOBECOM 2017—2017 IEEE Global Communications Conference, Singapore, 4–8 December 2017; pp. 1–6. [Google Scholar] [CrossRef]
  48. Sagar, V.; Jacyna, G.M.; Szu, H. Block-parallel decoding of convolutional codes using neural network decoders. Neurocomputing 1994, 6, 455–471. [Google Scholar] [CrossRef]
  49. Hussain, M.; Bedi, J.S. Reed-Solomon encoder/decoder application using a neural network. Proc. SPIE 1991, 1469, 463–471. [Google Scholar] [CrossRef]
  50. Alston, M.D.; Chau, P.M. A neural network architecture for the decoding of long constraint length convolutional codes. In Proceedings of the 1990 IJCNN International Joint Conference on Neural Networks, San Diego, CA, USA, 17–21 June 1990; pp. 121–126. [Google Scholar] [CrossRef]
  51. Wu, X.; Jiang, M.; Zhao, C. Decoding optimization for 5G LDPC codes by machine learning. IEEE Access 2018, 6, 50179–50186. [Google Scholar] [CrossRef]
  52. Miloslavskaya, V.; Li, Y.; Vucetic, B. Neural network-based adaptive Polar coding. IEEE Trans. Commun. 2024, 72, 1881–1894. [Google Scholar] [CrossRef]
  53. Doan, N.; Hashemi, S.A.; Mambou, E.N.; Tonnellier, T.; Gross, W.J. Neural belief propagation decoding of CRC-polar concatenated codes. In Proceedings of the ICC 2019—2019 IEEE International Conference on Communications (ICC), Shanghai, China, 20–24 May 2019; pp. 1–6. [Google Scholar]
  54. Yang, C.; Zhou, Y.; Si, Z.; Dai, J. Learning to decode protograph LDPC codes over fadings with imperfect CSIs. In Proceedings of the 2023 IEEE Wireless Communications and Networking Conference (WCNC), Glasgow, UK, 26–29 March 2023; pp. 1–6. [Google Scholar] [CrossRef]
  55. Wang, M.; Li, Y.; Liu, J.; Guo, T.; Wu, H.; Lau, F.C. Neural layered min-sum decoders for cyclic codes. Phys. Commun. 2023, 61, 102194. [Google Scholar] [CrossRef]
  56. Raviv, T.; Goldman, A.; Vayner, O.; Be’ery, Y.; Shlezinger, N. CRC-aided learned ensembles of belief-propagation polar decoders. In Proceedings of the ICASSP 2024—2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Seoul, Republic of Korea, 14–19 April 2024; pp. 8856–8860. [Google Scholar]
  57. Raviv, T.; Raviv, N.; Be’ery, Y. Data-driven ensembles for deep and hard-decision hybrid decoding. In Proceedings of the 2020 IEEE International Symposium on Information Theory (ISIT), Los Angeles, CA, USA, 21–26 June 2020; pp. 321–326. [Google Scholar] [CrossRef]
  58. Kwak, H.Y.; Yun, D.Y.; Kim, Y.; Kim, S.H.; No, J.S. Boosting learning for LDPC codes to improve the error-floor performance. In Proceedings of the 37th Conference on Neural Information Processing Systems (NeurIPS 2023), New Orleans, LA, USA, 10–16 December 2023; Volume 36, pp. 22115–22131. Available online: https://proceedings.neurips.cc/paper_files/paper/2023/file/463a91da3c832bd28912cd0d1b8d9974-Paper-Conference.pdf (accessed on 11 June 2025).
  59. Schmalen, L.; Suikat, D.; Rösener, D.; Aref, V.; Leven, A.; ten Brink, S. Spatially coupled codes and optical fiber communications: An ideal match? In Proceedings of the 2015 IEEE 16th International Workshop on Signal Processing Advances in Wireless Communications (SPAWC), Stockholm, Sweden, 28 June–1 July 2015; pp. 460–464. [Google Scholar] [CrossRef]
  60. Kudekar, S.; Richardson, T.; Urbanke, R.L. Spatially coupled ensembles universally achieve capacity under belief propagation. IEEE Trans. Inf. Theory 2013, 59, 7761–7813. [Google Scholar] [CrossRef]
  61. Liga, G.; Alvarado, A.; Agrell, E.; Bayvel, P. Information rates of next-generation long-haul optical fiber systems using coded modulation. IEEE J. Lightw. Technol. 2017, 35, 113–123. [Google Scholar] [CrossRef]
  62. Feltstrom, A.J.; Truhachev, D.; Lentmaier, M.; Zigangirov, K.S. Braided block codes. IEEE Trans. Inf. Theory 2009, 55, 2640–2658. [Google Scholar] [CrossRef]
  63. Montorsi, G.; Benedetto, S. Design of spatially coupled Turbo product codes for optical communications. In Proceedings of the 2021 11th International Symposium on Topics in Coding (ISTC), Montreal, QC, Canada, 30 August–3 September 2021; pp. 1–5. [Google Scholar] [CrossRef]
  64. Smith, B.P.; Farhood, A.; Hunt, A.; Kschischang, F.R.; Lodge, J. Staircase codes: FEC for 100 Gb/s OTN. IEEE J. Lightw. Technol. 2012, 30, 110–117. [Google Scholar] [CrossRef]
  65. Zhang, L.M.; Kschischang, F.R. Staircase codes with 6% to 33% overhead. IEEE J. Lightw. Technol. 2014, 32, 1999–2002. [Google Scholar] [CrossRef]
  66. Zhang, L. Analysis and Design of Staircase Codes for High Bit-Rate Fibre-Optic Communication. Ph.D. Thesis, University of Toronto, Toronto, ON, Canada, 2017. Available online: https://tspace.library.utoronto.ca/bitstream/1807/79549/3/Zhang_Lei_201706_PhD_thesis.pdf (accessed on 11 June 2025).
  67. Shehadeh, M.; Kschischang, F.R.; Sukmadji, A.Y. Generalized staircase codes with arbitrary bit degree. In Proceedings of the 2024 Optical Fiber Communications Conference and Exhibition (OFC), San Diego, CA, USA, 24–28 March 2024; pp. 1–3. Available online: https://ieeexplore.ieee.org/abstract/document/10526860 (accessed on 11 June 2025).
  68. Sukmadji, A.Y.; Martínez-Peñas, U.; Kschischang, F.R. Zipper codes. IEEE J. Lightw. Technol. 2022, 40, 6397–6407. [Google Scholar] [CrossRef]
  69. Ryan, W.; Lin, S. Channel Codes: Classical and Modern; Cambridge University Press: Cambridge, UK, 2009; Available online: https://www.cambridge.org/fr/universitypress/subjects/engineering/communications-and-signal-processing/channel-codes-classical-and-modern?format=HB&isbn=9780521848688 (accessed on 11 June 2025).
  70. Ahmad, T. Polar Codes for Optical Communications. Ph.D. Thesis, Bilkent University, Ankara, Turkey, 2016. Available online: https://api.semanticscholar.org/CorpusID:116423770 (accessed on 11 June 2025).
  71. Barakatain, M.; Kschischang, F.R. Low-complexity concatenated LDPC-staircase codes. IEEE J. Lightw. Technol. 2018, 36, 2443–2449. [Google Scholar] [CrossRef]
  72. i Amat, A.G.; Liva, G.; Steiner, F. Coding for optical communications—Can we approach the Shannon limit with low complexity? In Proceedings of the 45th European Conference on Optical Communication (ECOC 2019), Dublin, Ireland, 22–26 September 2019; pp. 1–4. [Google Scholar] [CrossRef]
  73. Zhang, L.M.; Kschischang, F.R. Low-complexity soft-decision concatenated LDGM-staircase FEC for high-bit-rate fiber-optic communication. IEEE J. Lightw. Technol. 2017, 35, 3991–3999. [Google Scholar] [CrossRef]
  74. Agrawal, G.P. Nonlinear Fiber Optics, 6th ed.; Academic Press: San Francisco, CA, USA, 2019. [Google Scholar]
  75. Kramer, G.; Yousefi, M.I.; Kschischang, F. Upper bound on the capacity of a cascade of nonlinear and noisy channels. arXiv 2015, arXiv:1503.07652, 1–4. [Google Scholar]
  76. Secondini, M.; Rommel, S.; Meloni, G.; Fresi, F.; Forestieri, E.; Poti, L. Single-step digital backpropagation for nonlinearity mitigation. Photonic Netw. Commun. 2016, 31, 493–502. [Google Scholar] [CrossRef]
  77. Union, I.T. G.709: Interface for the Optical Transport Network (OTN). 2020. Available online: https://www.itu.int/rec/T-REC-G.709/ (accessed on 11 June 2025).
  78. Polyanskiy, Y.; Poor, H.V.; Verdu, S. Channel coding rate in the finite blocklength regime. IEEE Trans. Inf. Theory 2010, 56, 2307–2359. [Google Scholar] [CrossRef]
  79. Mezard, M.; Montanari, A. Information, Physics, and Computation; Oxford University Press: Oxford, UK, 2009. [Google Scholar]
  80. Dolecek, L.; Divsalar, D.; Sun, Y.; Amiri, B. Non-binary protograph-based LDPC codes: Enumerators, analysis, and designs. IEEE Trans. Inf. Theory 2014, 60, 3913–3941. [Google Scholar] [CrossRef]
  81. Boutillon, E.; Conde-Canencia, L.; Al Ghouwayel, A. Design of a GF(64)-LDPC decoder based on the EMS algorithm. IEEE Trans. Circ. Syst. I 2013, 60, 2644–2656. [Google Scholar] [CrossRef]
  82. Liang, Y.; Lam, C.T.; Wu, Q.; Ng, B.K.; Im, S.K. A model-driven deep learning-based non-binary LDPC decoding algorithm. TechRxiv 2024. [Google Scholar] [CrossRef]
  83. Fu, Y.; Zhu, X.; Li, B. A survey on instance selection for active learning. Knowl. Inf. Syst. 2013, 35, 249–283. [Google Scholar] [CrossRef]
  84. Noghrei, H.; Sadeghi, M.R.; Mow, W.H. Efficient active deep decoding of linear codes using importance sampling. IEEE Commun. Lett. 2024. [Google Scholar] [CrossRef]
  85. Helmling, M.; Scholl, S.; Gensheimer, F.; Dietz, T.; Kraft, K.; Ruzika, S.; Wehn, N. Database of Channel Codes and ML Simulation Results. 2024. Available online: https://rptu.de/channel-codes/ml-simulation-results (accessed on 11 June 2025).
  86. Hu, X.Y.; Eleftheriou, E.; Arnold, D.M. Progressive edge-growth Tanner graphs. In Proceedings of the GLOBECOM’01. IEEE Global Telecommunications Conference (Cat. No.01CH37270), San Antonio, TX, USA, 25–29 November 2001; Volume 2, pp. 995–1001. [Google Scholar] [CrossRef]
  87. Tasdighi, A.; Yousefi, M. The Repository for the Papers on the Adaptive Weighted Belief Propagation. 2025. Available online: https://github.com/comsys2/adaptive-wbp (accessed on 11 June 2025).
  88. Tal, I.; Vardy, A. List decoding of Polar codes. IEEE Trans. Inf. Theory 2015, 61, 2213–2226. [Google Scholar] [CrossRef]
  89. Süral, A.; Sezer, E.G.; Kolağasıoğlu, E.; Derudder, V.; Bertrand, K. Tb/s Polar successive cancellation decoder 16 nm ASIC implementation. arXiv 2020, arXiv:2009.09388. [Google Scholar]
  90. Cassagne, A.; Hartmann, O.; Léonardon, M.; He, K.; Leroux, C.; Tajan, R.; Aumage, O.; Barthou, D.; Tonnellier, T.; Pignoly, V.; et al. AFF3CT: A fast forward error correction toolbox. SoftwareX 2019, 10, 100345. [Google Scholar] [CrossRef]
  91. Tang, Y.; Zhou, L.; Zhang, S.; Chen, C. Normalized Neural Network for Belief Propagation LDPC Decoding. In Proceedings of the 2021 IEEE International Conference on Networking, Sensing and Control (ICNSC), Xiamen, China, 3–5 December 2021. [Google Scholar]
  92. Tazoe, K.; Kasai, K.; Sakaniwa, K. Efficient termination of spatially-coupled codes. In Proceedings of the 2012 IEEE Information Theory Workshop, Lausanne, Switzerland, 3–7 September 2012; pp. 30–34. [Google Scholar] [CrossRef]
  93. Takasu, T. PocketSDR. 2024. Available online: https://github.com/tomojitakasu/PocketSDR/tree/master/python (accessed on 11 June 2025).
  94. Li, Z.; Kumar, B.V. A class of good quasi-cyclic low-density parity check codes based on progressive edge growth graph. In Proceedings of the Conference Record of the Thirty-Eighth Asilomar Conference on Signals, Systems and Computers, Pacific Grove, CA, USA, 7–10 November 2004; Volume 2, pp. 1990–1994. [Google Scholar] [CrossRef]
  95. Tasdighi, A.; Boutillon, E. Integer ring sieve for constructing compact QC-LDPC codes with girths 8, 10, and 12. IEEE Trans. Inf. Theory 2022, 68, 35–46. [Google Scholar] [CrossRef]
  96. Li, Z.; Chen, L.; Zeng, L.; Lin, S.; Fong, W. Efficient encoding of quasi-cyclic low-density parity-check codes. IEEE Trans. Commun. 2006, 54, 71–81. [Google Scholar] [CrossRef]
  97. Mitchell, D.G.; Rosnes, E. Edge spreading design of high rate array-based SC-LDPC codes. In Proceedings of the 2017 IEEE International Symposium on Information Theory (ISIT), Aachen, Germany, 25–30 June 2017; pp. 2940–2944. [Google Scholar] [CrossRef]
  98. Lentmaier, M.; Prenda, M.M.; Fettweis, G.P. Efficient message passing scheduling for terminated LDPC convolutional codes. In Proceedings of the 2011 IEEE International Symposium on Information Theory Proceedings, St. Petersburg, Russia, 31 July–5 August 2011; pp. 1826–1830. [Google Scholar] [CrossRef]
  99. Ali, I.; Kim, J.H.; Kim, S.H.; Kwak, H.; No, J.S. Improving windowed decoding of SC LDPC codes by effective decoding termination, message reuse, and amplification. IEEE Access 2018, 6, 9336–9346. [Google Scholar] [CrossRef]
  100. Land, I. Code Design with EXIT Charts. 2013. Available online: https://api.semanticscholar.org/CorpusID:61966354 (accessed on 11 June 2025).
  101. Koike-Akino, T.; Millar, D.S.; Kojima, K.; Parsons, K. Stochastic EXIT design for low-latency short-block LDPC codes. In Proceedings of the 2020 Optical Fiber Communications Conference and Exhibition (OFC), San Diego, CA, USA, 8–12 March 2020; pp. 1–3. Available online: https://ieeexplore.ieee.org/document/9083080 (accessed on 11 June 2025).
Figure 1. Block diagram of an optical fiber transmission system.
Entropy 27 00795 g001
Figure 2. Tanner graph unrolled to an RNN.
Entropy 27 00795 g002
Figure 3. Adaptive decoders. (a) Parallel decoders; (b) the two-stage decoder. WMS ( γ , L ( y ) ) refers to D γ ( y ) .
Entropy 27 00795 g003
Figure 4. BER versus SNR ρ, for the AWGN channel in the low-rate regime. (a) BCH code C 1 ( 63 , 36 ) . Here, the curve for WMS, Type T a & T b is from ([16] Figure 8) and the curve for WMS SS-PAN is from ([18] Figure 5a). (b) QC-LDPC code C 2 ( 3224 , 1612 ) , (c) QC-LDPC code C 3 ( 4016 , 2761 ) , (d) 5G-NR LDPC code C 4 ( 420 , 180 ) . Here, the curve for Graph NN is from ([33] Figure 5). (e) CCSDS LDPC code C 5 ( 128 , 64 ) . In this and the next sub-figure, the Autoregressive BP and Transformers curves are from [30] and [34], respectively. (f) BCH code C 1 ( 63 , 36 ) . Figures (d–f) show that adaptive decoders achieve the performance of the static decoders with less complexity.
Entropy 27 00795 g004
Figure 5. The scatter plot of ( e 1 , e 2 ) for C 9 at E b / N 0 = 4.25 dB, for the AWGN channel. The scaled Gaussian approximation curve is fitted per axis.
Entropy 27 00795 g005
Figure 6. Performance of the polar code C 9 ( 1024 , 854 ) versus QC-LDPC codes C 6 ( 1050 , 875 ) and C 7 ( 1050 , 850 ) , for the AWGN channel in the high-rate regime. The curve for the OSC decoder is from [89].
Entropy 27 00795 g006
Figure 7. Performance of the static and adaptive MS decoder for C 8 ( 4260 , 3834 ) at E b / N 0 = 4 dB, for the AWGN channel in the high-rate regime.
Entropy 27 00795 g007
Table 1. Codes in this paper.

AWGN Channel
Low rate | High rate
BCH C_1(63, 36), r = 0.57 | QC-LDPC C_6(1050, 875), 0.83
QC-LDPC C_2(3224, 1612), 0.5 | QC-LDPC C_7(1050, 850), 0.81
QC-LDPC C_3(4016, 2761), 0.69 | QC-LDPC C_8(4260, 3834), 0.9
Irregular LDPC C_4(420, 180), 0.43 | Polar C_9(1024, 854), 0.83
Irregular LDPC C_5(128, 64), 0.5 |

Optical Fiber Channel
Inner code | Outer code
Single-edge QC-LDPC C_10(4000, 3680), 0.92 | Multi-edge QC-LDPC C_11(3680, 3520), 0.96
Non-binary multi-edge C_12(800, 32) |
Table 2. The parameters of the fiber-optic link.

Parameter Name | Value
Transmitter parameters
WDM channels | 5
Symbol rate R_s | 32 Gbaud
RRC roll-off | 0.01
Channel frequency spacing | 33 GHz
Fiber channel parameters
Attenuation α | 0.2 dB/km
Dispersion parameter D | 17 ps/nm/km
Nonlinearity parameter γ | 1.2 1/(W·km)
Span configuration | 8 × 80 km
EDFA gain | 16 dB
EDFA noise figure | 5 dB
Table 3. The mean and variance (θ(t), σ(t)) of (x_1(t), …, x_4(t)) in WMS for the AWGN channel.

 | C_1 | C_2 | C_3
t = 1 | 0.99, 0.019; 0.96, 0.98, 0.99, 1.02 | 0.90, 0.026; 0.86, 0.89, 0.90, 0.94 | 0.91, 0.023; 0.87, 0.90, 0.92, 0.95
t = 2 | 0.97, 0.036; 0.91, 0.96, 0.98, 1.02 | 0.84, 0.029; 0.79, 0.83, 0.85, 0.89 | 0.86, 0.030; 0.81, 0.85, 0.87, 0.90
t = 3 | 0.91, 0.049; 0.83, 0.89, 0.92, 0.99 | 0.73, 0.032; 0.68, 0.72, 0.74, 0.78 | 0.75, 0.031; 0.69, 0.74, 0.76, 0.80
t = 4 | 0.70, 0.086; 0.56, 0.67, 0.73, 0.84 | 0.63, 0.036; 0.57, 0.62, 0.64, 0.69 | 0.63, 0.034; 0.57, 0.61, 0.64, 0.68
t = 5 | 0.40, 0.175; 0.12, 0.34, 0.46, 0.68 | — | —

In each cell, the first pair is (θ(t), σ(t)) and the following four values are (x_1(t), …, x_4(t)).
Table 4. Computational complexity of decoders, for the AWGN channel.

Decoder | γ*(t) | α*(t) | Average RM per iteration: C_1 | C_2 | C_3
No weight sharing
WMS [16] | γ_{v,c}(t) | | 1768 | 25,792 | 40,160
Weight sharing
WMS, Type T_a | γ_{v,c} | | 1768 | 25,792 | 40,160
WMS, Type T_a^{VC} | γ | 1 | 63 | 3226 | 4016
Parallel WMS, Type T_b^{VC}, ν = 16 | γ(t) | 1 | 1440 | 77,376 | 84,336
Parallel WMS, Type T_b^{VC}, ν = 64 | γ(t) | 1 | — | 3.09 × 10^5 | 3.37 × 10^5
Parallel WMS, Type T_b^{VC}, ν = 1024 | γ(t) | 1 | 92,340 | — | —
Two-stage decoder, Type T_b^{VC}, τ_prun = 0.001 | γ(t) | 1 | ≃300 | ≃14,093 | ≃17,558
WMS SS-PAN, Type T_a^{VC} [18] | γ(t) | 1 | 153 | 8060 | 9287
Table 5. Concatenated inner binary QC-LDPC code C_10 and outer SC-QC-LDPC code C_12 with r_total = 0.88 for the optical fiber channel. The sections for BER_i ≃ 0.012 and 0.025 correspond to average powers −10 and −11 dBm, respectively. NCGs are in dB.

BER_i | Inner-SD decoder | BER_o Inner | BER_o Total | NCG Inner | NCG Total | NCG_f Inner | NCG_f Total | Gap to NCG_f Inner | Gap to NCG_f Total
0.012 | AWMS | 3.29 × 10^−6 | 4.52 × 10^−8 | 5.64 | 6.93 | 9.38 | 9.44 | 3.74 | 2.51
 | NNMS, θ = 0.75 | 4.02 × 10^−6 | 5.43 × 10^−7 | 5.56 | 6.13 | 9.38 | 9.44 | 3.82 | 3.31
 | MS | 4.77 × 10^−6 | 7.75 × 10^−7 | 5.49 | 6.00 | 9.38 | 9.44 | 3.89 | 3.44
0.025 | AWMS | 0.019 | 0.017 | 0.13 | 0.12 | 10.20 | 10.28 | 10.07 | 10.16
 | NNMS, θ = 0.72 | 0.02 | 0.018 | 0.04 | 0.03 | 10.20 | 10.28 | 10.16 | 10.25
 | MS | 0.023 | 0.02 | −0.20 | −0.15 | 10.20 | 10.28 | 10.40 | 10.43
Table 6. Concatenated inner non-binary QC-LDPC code C_11 and outer SC-QC-LDPC code C_12 with r_total = 0.88 for the optical fiber channel. The sections for BER_i ≃ 0.012 and 0.025 correspond to average powers −10 and −11 dBm, respectively. NCGs are in dB.

BER_i | Inner-SD decoder | BER_o Inner | BER_o Total | NCG Inner | NCG Total | NCG_f Inner | NCG_f Total | Gap to NCG_f Inner | Gap to NCG_f Total
0.012 | AWEMS | 3.21 × 10^−8 | 2.74 × 10^−9 | 7.23 | 7.69 | 9.38 | 9.44 | 2.15 | 1.75
 | NNEMS, θ = 0.2 | 4.61 × 10^−8 | 2.11 × 10^−8 | 7.12 | 7.15 | 9.38 | 9.44 | 2.26 | 2.29
 | EMS | 2.44 × 10^−7 | 8.20 × 10^−8 | 6.60 | 6.75 | 9.38 | 9.44 | 2.78 | 2.69
0.025 | AWEMS | 0.0063 | 0.0051 | 1.73 | 1.80 | 10.20 | 10.28 | 8.47 | 8.48
 | NNEMS, θ = 0.25 | 0.0087 | 0.0075 | 1.32 | 1.32 | 10.20 | 10.28 | 8.88 | 8.96
 | EMS | 0.025 | 0.022 | −0.36 | −0.32 | 10.20 | 10.28 | 10.56 | 10.60
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
