An LDPC-RS Concatenation and Decoding Scheme to Lower the Error Floor for FTN Signaling



Introduction
Faster-than-Nyquist (FTN) signaling dates back to the 1970s and has attracted increasing interest in the past two decades due to the scarcity of frequency resources and bandwidth in digital communication, such as wireless communication, satellite communication, and optical communication [1]. One common method to improve link capacity and spectrum efficiency (SE) is to adopt high-order modulation in an orthogonal (Nyquist) modulation scheme, for example, by using quadrature amplitude modulation (QAM), such as 16-QAM, 64-QAM, or even 4096-QAM. However, high-order modulation formats often require higher power and sensitivity on radio frequency (RF) devices. Moreover, a higher modulation order is more susceptible to noise, since constellation points lie closer together in 256-QAM than in 16-QAM, and it requires better linearity of RF devices, which makes symbol recovery and decisions challenging. Another attractive method is to use a lower modulation order with a non-orthogonal modulation scheme, i.e., FTN signaling. By transmitting symbols faster than the Nyquist rate, i.e., the highest rate free of inter-symbol interference (ISI), controlled ISI and colored noise are introduced after matched filtering; these can be handled in many ways, such as with a whitening filter, orthogonal basis model (OBM), precoding, turbo equalizer, or Bahl-Cocke-Jelinek-Raviv (BCJR) detector [2]. In band-limited scenarios like high-speed optical, wireless, and satellite communication, FTN signaling provides higher SE with the same modulation order at the price of a higher symbol rate (higher sampling rate) and additional digital signal processing to mitigate ISI [3].
In recent years, many ISI mitigation methods have been studied, including linear equalization, decision feedback equalization (DFE), and frequency domain equalization at the receiver side, and precoding and pre-equalization at the transmitter side. Although these methods are relatively simple, their performance degrades under the severe ISI caused by a strong acceleration factor in FTN signaling. The BCJR algorithm performs trellis searching to find survivors and make decisions for each bit. However, when dealing with a high modulation order or long ISI taps, BCJR detection requires considerable memory to store the states and paths in the trellis, whose number grows exponentially with the ISI length. To solve this problem, reduced-state M-BCJR detection was studied in [2], and the authors showed that only 3-state BCJR is required for FTN signaling with acceleration factor τ = 0.5. In a coded system with iterative decoding, we can apply FTN signaling as an internal ISI mechanism and use BCJR detection to generate soft decision bits, i.e., log-likelihood ratios (LLRs), and exchange LLRs between the BCJR detector and the soft channel decoder, which is called turbo equalization. In this work, we applied the M-BCJR algorithm together with LDPC-RS concatenation code to form a coded FTN signaling scheme and mitigate ISI through turbo equalization.
LDPC code was first proposed by Gallager in 1962 [4]. Originally, LDPC code received little attention due to limited theoretical knowledge and computer processing capacity. Over the years, with steady progress in the relevant basic theories and the vigorous development of computer science and technology, researchers have devoted increasing effort to the study of LDPC code. LDPC code has become one of the most popular coding technologies and has been widely used in various communication systems, such as wireless, satellite, and fiber communication. With sufficient code length, LDPC code can approach the Shannon limit [5] and is capable of error correction at a low signal-to-noise ratio (SNR). Among soft decision-decoding methods, belief propagation (BP) is a promising decoding scheme for LDPC code and for other codes that need soft decisions and iterative decoding.
A typical bit error rate (BER) curve of channel coding resembles a waterfall, i.e., the BER drops sharply when the signal-to-noise ratio (SNR) increases. However, an error floor phenomenon can be found with many LDPC codes: when the SNR increases, the BER no longer decreases, or the curve flattens. According to [6], the error floor [7] is caused by sub-optimal iterative decoding [8], near-code words [9], and absorbing and trapping sets [10]. In some situations, the error floor cannot be eliminated, but there are methods to lower it, for example, designing a large-girth parity-check matrix to optimize the code structure; using concatenation code, such as RS-LDPC [11] or DVB BCH-LDPC code [12], which utilizes RS or BCH as the outer code and LDPC as the inner code; or deliberately introducing low-magnitude noise into message-passing decoders. Under some conditions, such noise can actually improve decoding performance [13]. This improvement, based on the effect of noise, is called stochastic resonance, and we refer to the introduction of such noise as perturbation. Perturbation decoders have exceptional performance and very low resource costs [14] and may prove superior to the existing techniques [15] in computer memory and data storage applications [16]. Details about stochastic resonance can be found in [17]. Note that the error floor may occur in one particular code and not in another, and it is hard to predict whether it will occur. For example, the 5G LDPC (7680, 3840) code used in this study has no error floor in orthogonal (Nyquist) systems. However, when utilizing FTN signaling, we found an error floor in a rate-1/2 LDPC code masked from the 5G base graph "BG2". This is probably caused by the decoder's non-linearity, too few turbo iterations, improper loop gain or code rate, or a weakness in the masked base matrix structure. Since this phenomenon exists and is unpredictable, we require a method to lower an error floor that may occur in the BER region demanded by practical applications. In this work, we utilized two of the above methods to lower the error floor: LDPC-RS concatenated code and a perturbation BP decoder.
Reed-Solomon (RS) code was invented by Reed and Solomon [18]. RS code has undergone decades of development and has been used in wireless, satellite, and optical communications due to its maximum distance separable (MDS) feature. A hard decision decoder is often utilized for high-speed or concatenated code due to its simple structure and high throughput. In this study, we used soft decision LDPC code, chosen for its excellent error correction ability, concatenated with hard decision RS code, chosen for its high decoding speed and definite error correction ability.
Our contribution in this paper can be summarized in two points: (1) To combat the error floor, we propose one possible solution, the use of the LDPC-RS concatenation and shortening scheme. We introduce perturbation into the BP decoder to enhance its performance. (2) We propose a 4-parallel encoder architecture for the RS component code and a concise algorithm to calculate its constant multiplier coefficients, leveraging a traditional serial encoder, which can also be used for other parallelisms, code rates, and lengths. The simulation results show that the proposed concatenation and shortening scheme can lower the error floor of the original single LDPC code, and with perturbation in the BP decoder, the error floor can be eliminated below $10^{-7}$ for different LDPC maximum decoding iterations. The proposed scheme retains the error correction ability of LDPC code for FTN signaling with few-state BCJR detection and few turbo equalization iterations and successfully lowers the error floor.
The outline of this paper is as follows: Section 2 reviews preliminaries of FTN signaling with the OBM, BCJR detection, turbo equalization, LDPC, and RS codes, as well as our simulations with 5G LDPC code and an error floor. Section 3 details the structure of the proposed LDPC-RS concatenation code, the parallel RS encoder and the algorithm for parallel encoding coefficient calculation, and the perturbation BP decoder to lower the error floor. Section 4 describes the performed simulations and analysis. Section 5 concludes the paper.

FTN Signaling with OBM
The concept of FTN signaling was introduced by Mazo [19] in 1975. In FTN signaling, binary sinc(t/T) pulses can be transmitted faster than typical orthogonal (Nyquist) pulses without decreasing the squared minimum Euclidean distance between symbols, thus theoretically not affecting the bit error rate (BER) of demodulation. The binary baseband modulation of FTN signaling is linear and based on pulse shaping as follows:

$$s(t) = \sqrt{E_b} \sum_{n} a_n h(t - n\tau T),$$

where $E_b$ is the average symbol energy, $a_n$ represents ±1, $h(t)$ is a T-orthogonal unit-energy shaping pulse filter such as sinc or root-raised-cosine (rRC), the symbol interval of $a_n$ is $\tau T$, the acceleration factor is $\tau \le 1$, and $n$ is the symbol index. For Nyquist signaling, $\tau = 1$. We use the discrete-time ISI and additive white Gaussian noise (AWGN) channel model in [2], given by the following chain of signal processing.
Data $\{a_n\}$ at $n\tau T$ → linear modulation and pulse shaping by $h(t)$ at rate $1/(\tau T)$ → AWGN → matched filter → sample at $n\tau T$ → discrete-time post filter $B(z)$ → frame reverse → channel observation $y$.
This chain produces a minimum phase sequence of channel observation $y$ and can be applied to turbo equalization as detailed in the next subsection. Furthermore, the orthogonal basis model (OBM) receiver uses a sequence of wider-band orthogonal pulses $\phi(t)$ to express $h(t)$. It is worth mentioning that the basis function $\phi(t)$ is $\tau T$-orthogonal, and the matched filter is matched to $\phi(t)$ rather than $h(t)$; thus, the noise it outputs is white, which solves the colored noise problem in other receivers. The signal processing chain can be modeled as a discrete-time system $v$ such that

$$y = a * v + w,$$

where $a = \{a_n\}$, $*$ means convolution, and $w$ is a vector of independent and identically distributed (i.i.d.) white Gaussian noise.
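The discrete-time model y = a * v + w can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the taps are the shortened v quoted in the simulation section for τ = 0.703, while the function name `ftn_observation`, the truncation to one output sample per symbol, and the noise parameter `sigma` are choices made here, not details taken from the paper.

```python
import numpy as np

# Shortened discrete-time taps v for tau = 0.703, quoted from the simulation section
V_TAPS = np.array([0.553, 0.793, -0.084, -0.171, 0.154, -0.064])

def ftn_observation(a, v=V_TAPS, sigma=0.0, rng=None):
    """Return y = a * v + w, truncated to one sample per transmitted symbol.

    a     : sequence of +/-1 symbols
    v     : discrete-time ISI taps (hypothetical default: the shortened v above)
    sigma : standard deviation of the white Gaussian noise w (assumed parameter)
    """
    a = np.asarray(a, dtype=float)
    y = np.convolve(a, v)[: len(a)]          # ISI: each sample mixes the last len(v) symbols
    if sigma > 0.0:
        rng = np.random.default_rng() if rng is None else rng
        y = y + sigma * rng.standard_normal(len(a))
    return y

# Noise-free example with three symbols
y = ftn_observation([1, -1, 1])
print(y)
```

In the noise-free example, the second sample is the sum of the current symbol weighted by the first tap and the previous symbol weighted by the second tap, which is exactly the controlled ISI that the detector has to undo.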

BCJR Detection and Turbo Equalization
The BCJR algorithm computes the probabilities of states and paths in a trellis. It performs forward and backward recursions over the state transition map. Suppose the channel observation is $y = (y_1, y_2, \ldots, y_N)$ and the a priori probability of each transmitted symbol is known. With $s_n$ denoting the state at stage $n$, the joint probability can therefore be calculated as follows:

$$P(s_n = i, s_{n+1} = j, y) = \alpha_n[i]\,\Gamma_n(i, j)\,\beta_{n+1}[j],$$

where $\alpha_n[i]$ is called the forward path metric for state $i$ at stage $n$, $\beta_{n+1}[j]$ is called the backward path metric for state $j$ at stage $n + 1$, and $\Gamma_n(i, j)$ is the branch metric connecting states $i$ and $j$ at stage $n$. Usually, the transmitted symbols start from (and terminate at) an all-zero state, so the forward recursion is initialized as $\alpha_0 = (1, 0, \ldots, 0)$ and proceeds as

$$\alpha_{n+1}[j] = \sum_{i \in S} \alpha_n[i]\,\Gamma_n(i, j),$$

where $S$ is the set of states that can reach state $j$ at stage $n + 1$. For binary modulation, there are two elements in set $S$. The backward recursion is similar, with $\beta_N = (1, 0, \ldots, 0)$. It starts from the last symbol (stage) and then proceeds backward to the beginning, computing the backward path metric as follows:

$$\beta_n[i] = \sum_{j \in S} \beta_{n+1}[j]\,\Gamma_n(i, j),$$

where $S$ is now the set of states at stage $n + 1$ that are connected to state $i$ at stage $n$. The branch metric $\Gamma_n(i, j)$ combines the likelihood of the observation given the branch label with the a priori symbol probability, where $l_{i,j}$ is the label on the branch from state $i$ to $j$, and $P(a')$ is the a priori probability of the symbol $a'$ that causes the transition (from state $i$ to $j$).
Let $j$ be a state at stage $n$. Then, we calculate LLRs via

$$\mathrm{LLR}(a_n) = \ln \frac{\sum_{j \in L_{+1}} \alpha_n[j]\,\beta_n[j]}{\sum_{j \in L_{-1}} \alpha_n[j]\,\beta_n[j]},$$

where $L_{\pm 1}$ are the sets of states reached by $a_n = \pm 1$. The M-BCJR algorithm proposed in [2] uses a reduced search of trees and trellises. It proceeds breadth-first through a tree structure of metric values, keeping only the dominant $M$ paths at each tree stage, and it also deals with empty values of $\alpha_n[j]$ and $\beta_n[j]$. More details can be found in [2]. In coded FTN signaling, turbo equalization is usually adopted, whereby the BCJR detector and channel decoder exchange soft information iteratively. Turbo equalization enhances detector and decoder performance but increases complexity and latency in proportion to the number of turbo iterations. The coded FTN signaling diagram is depicted in Figure 1, where the dashed box represents turbo equalization.
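The forward and backward recursions above can be illustrated with a compact full-state BCJR detector for binary symbols over a short ISI response. The sketch below is not the reduced-state M-BCJR of [2]: it keeps all states, and it rests on assumptions made here for illustration, namely a Gaussian branch likelihood with known noise standard deviation `sigma`, a frame that is preceded and followed by known −1 symbols so that the trellis starts and ends in state 0, and LLRs computed from the joint probability α·Γ·β. The function name `bcjr_isi` and its arguments are hypothetical.

```python
import numpy as np

def bcjr_isi(y, v, sigma, n_data, prior_llr=None):
    """Full-state BCJR for +/-1 symbols over ISI taps v; returns LLRs ln P(+1)/P(-1)."""
    v = np.asarray(v, dtype=float)
    y = np.asarray(y, dtype=float)
    m = len(v) - 1                        # channel memory
    n_states = 1 << m
    T = len(y)                            # n_data symbols plus m known tail symbols
    states = np.arange(n_states)
    # Bit k of a state stores the symbol sent k+1 stages ago (0 -> -1, 1 -> +1).
    past = 2.0 * ((states[:, None] >> np.arange(m)) & 1) - 1.0
    isi = past @ v[1:]
    label = np.stack([isi - v[0], isi + v[0]], axis=1)   # branch label per (state, bit)
    nxt = np.stack([((states << 1) | b) & (n_states - 1) for b in (0, 1)], axis=1)

    # A priori probabilities; tail symbols are known to be -1 (bit 0).
    llr_in = np.zeros(n_data) if prior_llr is None else np.asarray(prior_llr, dtype=float)
    prior = np.zeros((T, 2))
    prior[:n_data, 1] = 1.0 / (1.0 + np.exp(-llr_in))
    prior[:, 0] = 1.0 - prior[:, 1]

    # Branch metrics, assuming white Gaussian noise of variance sigma^2.
    gamma = prior[:, None, :] * np.exp(-(y[:, None, None] - label[None]) ** 2 / (2 * sigma ** 2))

    # Forward recursion from the all-zero state, normalized at every stage.
    alpha = np.zeros((T + 1, n_states))
    alpha[0, 0] = 1.0
    for t in range(T):
        for b in (0, 1):
            np.add.at(alpha[t + 1], nxt[:, b], alpha[t] * gamma[t, :, b])
        alpha[t + 1] /= alpha[t + 1].sum()

    # Backward recursion from the all-zero terminal state.
    beta = np.zeros((T + 1, n_states))
    beta[T, 0] = 1.0
    for t in range(T - 1, -1, -1):
        beta[t] = (gamma[t] * beta[t + 1][nxt]).sum(axis=1)
        beta[t] /= beta[t].sum()

    # Joint probability alpha * Gamma * beta, summed over branches carrying +1 or -1.
    joint = alpha[:n_data, :, None] * gamma[:n_data] * beta[1:n_data + 1][:, nxt]
    eps = 1e-300                          # guards against log(0) after underflow
    return np.log(joint[:, :, 1].sum(axis=1) + eps) - np.log(joint[:, :, 0].sum(axis=1) + eps)
```

With six taps this trellis has 32 states per stage; the M-BCJR approach described above keeps only the M dominant paths instead, which is what makes M = 4 attractive for storage.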

LDPC and RS Code
Let $N$ be the block length of the LDPC code and $K$ be the data length; then, $M = N − K$ is the parity length. Let $d = (d_1, d_2, \ldots, d_K)$ be the data and $G$ be the generator matrix of size K-by-N. The code word $c = dG$ has the property $cH^T = 0$, where $H$ is the M-by-N parity-check matrix. The code is called low-density parity-check code because $H$ is sparse; namely, there are many zeros and few ones in $H$. There are binary and non-binary LDPC codes, and only binary codes are discussed in this work. LDPC codes can be classified by the construction of $H$ into two types, regular and irregular codes. The number of ones in a row (column) of $H$ is called the row (column) degree. Regular codes have the same row degree for each row and the same column degree for each column. Row $i$'s degree indicates how many variable nodes are connected to check node $i$, and column $j$'s degree indicates how many check nodes are connected to variable node $j$. Among soft decision-decoding methods, belief propagation (BP) is a promising decoding scheme for LDPC code and other codes that need soft decisions and iterative decoding. The BP decoder uses message passing and iteration to propagate soft information among check and variable nodes. Since BP involves many nonlinear calculations, the min-sum (MS) algorithm was proposed. This algorithm uses the minimum value of soft information from relevant variable nodes to approximate the result of the nonlinear functions that the BP decoder performs, thus reducing computation complexity and latency significantly [20]. Although the MS decoder requires fewer computations, the approximated results always have a greater absolute value than those of BP, which causes performance degradation. To mitigate this performance loss, the normalized min-sum (NMS) scheme [21] and its variant were studied [22]; moreover, a neural network-aided NMS approach was proposed in [23]. The BP decoding scheme can be described as Algorithm 1.
Algorithm 1 BP Decoding of LDPC Code.

Define:
Set of variable nodes connected to check node $i$: $N(i) = \{j : h_{ij} = 1\}$.
Set of check nodes connected to variable node $j$: $M(j) = \{i : h_{ij} = 1\}$.
Soft information from variable node $j$ to check node $i$: $Z_{ij}$.
Soft information from check node $i$ to variable node $j$: $L_{ij}$.
Input: Soft bit information: $L = (l_1, l_2, \ldots, l_n)$.
Initialization: $Z_{ij} = l_j$.
Iteration: For each $i$, $j$, the following steps are performed until the stopping criterion is satisfied.
(1) Row operation: For each check node $i$, the message from $i$ to each relevant variable node $j$ is, in the standard sum-product form,
$$L_{ij} = 2 \tanh^{-1}\Big( \prod_{j' \in N(i) \setminus \{j\}} \tanh\big(Z_{ij'}/2\big) \Big).$$
(2) Column operation: For each variable node $j$, the decision $b_j$ is taken from the sign of $l_j + \sum_{i \in M(j)} L_{ij}$, and the message passed from $j$ to each relevant check node $i$ is
$$Z_{ij} = l_j + \sum_{i' \in M(j) \setminus \{i\}} L_{i'j}.$$
(3) Stopping criterion: If $bH^T = 0$ (no error) or the maximum iteration limit has been reached, stop the decoder; otherwise, continue.
Output: The decoding result $b$.
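The row and column operations of Algorithm 1 can be sketched as a small dense-matrix decoder. This is a minimal illustration, assuming the standard sum-product (tanh) rule for the row operation and the convention that a positive LLR means bit 0; the function name `bp_decode` and the dense loops are choices made here for readability, not an efficient or authoritative implementation.

```python
import numpy as np

def bp_decode(H, llr, max_iter=20):
    """Sum-product BP decoding on parity-check matrix H with channel LLRs l_j."""
    H = np.asarray(H, dtype=bool)
    llr = np.asarray(llr, dtype=float)
    m, n = H.shape
    Z = np.where(H, llr, 0.0)                  # initialization: Z_ij = l_j on every edge
    b = (llr < 0).astype(int)
    for _ in range(max_iter):
        # (1) Row operation: check-to-variable messages L_ij
        t = np.where(H, np.tanh(np.clip(Z, -30, 30) / 2.0), 1.0)
        L = np.zeros((m, n))
        for i in range(m):
            for j in np.flatnonzero(H[i]):
                prod = np.prod(np.delete(t[i], j))       # product over N(i) \ {j}
                prod = np.clip(prod, -0.999999999999, 0.999999999999)
                L[i, j] = 2.0 * np.arctanh(prod)
        # (2) Column operation: total LLR, hard decision, variable-to-check messages
        total = llr + L.sum(axis=0)
        b = (total < 0).astype(int)
        # (3) Stopping criterion: all parity checks satisfied
        if not np.any((H.astype(int) @ b) % 2):
            break
        Z = np.where(H, total[None, :] - L, 0.0)         # exclude the edge's own message
    return b

# Toy example: (7, 4) Hamming parity checks, all-zero code word, one unreliable wrong LLR
H = [[1, 0, 1, 0, 1, 0, 1],
     [0, 1, 1, 0, 0, 1, 1],
     [0, 0, 0, 1, 1, 1, 1]]
llr = [2.5, 2.0, -0.5, 3.0, 1.5, 2.2, 2.8]
print(bp_decode(H, llr))
```

The toy matrix is far denser and shorter than a 5G LDPC code; it only serves to show the message flow of one iteration.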
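RS code is defined over the finite field GF($2^p$), where $p$ is a positive integer, and all calculations are performed in this field. RS($n$, $k$) code has code word length $n$ and $k$ data symbols, each symbol having $p$ bits. It can correct up to $\lfloor (n − k)/2 \rfloor$ symbol errors. Let the data symbols be $m = (m_0, m_1, \ldots, m_{k−1})$ and be represented with a polynomial $m(x)$ whose coefficients are these symbols, and let $\sigma$ be a primitive element of GF($2^p$), whose powers are the roots of the generator polynomial $g(x)$ of degree $n − k$. Then, the code word $c = (c_0, c_1, \ldots, c_{n−1})$ can be calculated in systematic form: the data symbols are kept, and the $n − k$ parity symbols are given by the remainder of $x^{n−k} m(x)$ divided by $g(x)$.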

Experiment with 5G LDPC Code and Turbo Equalization in FTN Signaling
We simulated 5G LDPC code and turbo equalization in FTN signaling with MATLAB. The diagram is depicted in Figure 1. The LDPC code used here is a (7680, 3840) code based on BG2 [24]. We used shortened v as in (5) and BPSK modulation for the baseband system model. We used turbo equalization containing an M-BCJR detector and BP decoding for demodulation. Considering complexity and latency, we chose a maximum of 3 turbo equalization iterations to limit the total number of iterations and stored M = 4 states at each stage of the trellis in BCJR detection, i.e., 4-BCJR detection, to reduce memory usage. We considered maximum iteration limits of 10, 20, and 50 for BP decoding. Therefore, over three turbo iterations, at most three iterations of the 4-BCJR algorithm and 30, 60, or 150 iterations of BP decoding were performed in total. Unfortunately, an error floor was found near $10^{-5}$, as shown in Figure 2. It can be eliminated by using many detection and decoding iterations, but this is unacceptable when low latency and low storage are required, so only a few turbo iterations and a few states of FTN detection can be adopted. With sufficient SNR, the error floor of the BP decoder will disappear on its own. However, it appears in the BER region ($10^{-5}$) of most concern in wireless communication, which is why we must lower or eliminate the error floor.

Occurrence of an Error Floor in FTN Signaling
Inter-symbol interference caused by FTN signaling introduces relevance between symbols and code words. Additionally, shortening or tail cutting of the pulse-shaping filter and the channel response further complicates the canceling of symbol relevance. LDPC code shows good performance with the AWGN channel, where symbols and noise are all independent. When applying FTN signaling and M-BCJR detection, there is residual relevance (interference) between symbols since we cannot accurately simulate an infinite pulse-shaping filter or channel response. On the other hand, LDPC code on its own sometimes suffers from error propagation and from stopping and trapping sets of its factor graph [17]. Thus, FTN signaling exposes the weakness of the LDPC code, and an error floor occurs. The smaller the acceleration factor is, the more severe the ISI becomes; thus, the BER performance of the LDPC code will be poorer, and the error floor will still occur.
The error floor is caused by FTN signaling and residual relevance (interference) between symbols. Since the M-BCJR algorithm is already a good detection scheme, we addressed the problem from the decoder perspective: we used shortening to provide the LDPC decoder with some bits that are certainly correct, i.e., known bits with a high LLR in the factor graph, in order to help the BP decoder partially overcome error propagation. Furthermore, we utilized perturbation to add low-power noise to the calculation of all check nodes during BP decoding. Since the operation is a summation, confident nodes with high LLR values will not be affected, but nodes of low reliability (a low LLR magnitude) will be perturbed and may be corrected or moved away from stopping or trapping sets, which are believed to be a reason for the error floor.

Structure of the Proposed LDPC-RS Concatenation Code
Typical concatenation code consists of an inner and an outer code. Data bits are encoded by the outer code, and the whole outer code word is treated as the data part of the inner code; then, the inner encoder encodes it to form the inner (final) code word.

Perturbation BP Decoder
One reason for the error floor is that the BP decoder cannot correct nodes for some code structures and absorbing sets, becoming trapped in incorrect states. One example of the (6, 3) trapping set is given in Figure 4, where all variable nodes $v_1$, $v_2$, $v_3$ are incorrect and where check nodes are too weak to correct them, so all three variable nodes remain erroneous. In the end, the few error bits left uncorrected can be corrected using the concatenated RS code due to its definite error correction ability. When using a parallel RS encoder or decoder, latency and complexity can be lowered. Moreover, the BP decoder will sometimes end up with errors in only a few LLRs, and adding some low-level noise to the decoder can actually improve decoding performance. This phenomenon is called stochastic resonance; it has been found in many devices and research fields and was studied in [25] for bit-flipping decoding. In this work, we adopted this method in the soft decision BP decoder, adding a low level of noise to nodes while decoding, and called it the perturbation BP decoder. We modified the message from check node $i$ to each relevant variable node $j$ by adding noise (perturbation) to affect or flip the signs of LLRs with a small absolute value, which are believed to be less reliable, as follows:

$$L'_{ij} = L_{ij} + n_p,$$

where $n_p$ is low-level i.i.d. random noise, independent of $i$, $j$, with expectation $E(n_p) = 0$ and variance $\mathrm{Var}(n_p) = \sigma_{n_p}^2$, which is also independent of $i$, $j$. Its power $\sigma_{n_p}^2$ can be chosen in practice, and its type was chosen to be Gaussian here; thus,

$$n_p \sim \mathcal{N}(0, \sigma_{n_p}^2).$$

Note that adding noise (perturbation) to LLRs is a more general form of adding the same offset value, which has been shown to be effective in a hard decision bit-flipping decoder. The perturbation decoder proposed here can be seen as a soft bit-flipping BP decoder. From a statistical viewpoint, it adds power $\sigma_{n_p}^2$ to LLRs of small absolute value. Since $E(n_p) = 0$, the summation of i.i.d. random Gaussian noise will tend toward a small value near zero, so for variable node $j$, the updating rule becomes

$$l_j + \sum_{i \in M(j)} L'_{ij},$$

and the message passed from variable node $j$ to each relevant check node $i$ is

$$Z_{ij} = l_j + \sum_{i' \in M(j) \setminus \{i\}} L'_{i'j}.$$

Therefore, perturbation here only affects unreliable check node messages $L_{ij}$ with a very small absolute value. The histogram in Figure 5 illustrates the value distribution of $L'_{ij}$ when $\sigma_{n_p} = 10^{-4}$, showing that perturbation has almost no effect on the value distribution of $L'_{ij}$. Therefore, perturbation decoding can help to perturb the less confident $L'_{ij}$ without destroying the original value distribution of $L'_{ij}$ or hindering the BP decoding scheme and decision. Figure 6 shows a (6, 3) trapping set of three variable nodes that remain erroneous during iterations and may be corrected with perturbation in check-to-variable messages. Since all three variable nodes are incorrect, perturbation here may flip their signs and therefore will not hinder BP decoding but rather provide the possibility for error correction. However, even a simple (6, 3) trapping set contains 9 nodes and 9 edges, and it is impossible to use the whole Tanner graph of the LDPC code to analyze the gain of perturbation and yield a closed-form expression. We therefore used Monte Carlo simulations to evaluate its performance.
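The effect described here, that low-level noise leaves confident messages untouched but can flip the sign of near-zero ones, can be checked numerically. The sketch below applies zero-mean Gaussian perturbation with the σ value 10^-4 mentioned in the text to a handful of made-up check-to-variable messages; the function name `perturb_messages` and the example values are hypothetical and only illustrate the idea.

```python
import numpy as np

def perturb_messages(L, sigma_np, rng):
    """Return L' = L + n_p with i.i.d. zero-mean Gaussian n_p of standard deviation sigma_np."""
    L = np.asarray(L, dtype=float)
    return L + rng.normal(0.0, sigma_np, size=L.shape)

rng = np.random.default_rng(0)
# Two confident and two unreliable check-to-variable messages (illustrative values)
L = np.array([4.2, -3.1, 5e-5, -2e-5])
trials = np.array([perturb_messages(L, 1e-4, rng) for _ in range(1000)])

# Fraction of trials in which the sign of each message was flipped by the perturbation
flip_rate = (np.sign(trials) != np.sign(L)).mean(axis=0)
print(flip_rate)
```

Only the two messages whose magnitude is comparable to the noise level ever change sign, which is the behaviour the perturbation decoder relies on to leave reliable nodes alone.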

RS Parallel Encoder
The conventional RS encoder adopts a serial structure that can be treated as a digital circuit. It consists of shift registers and constant multipliers [26], as depicted in Figure 7, whose control, initialization, and reset logic are not depicted here. There are 20 shift registers and 20 constant multipliers for RS (340, 320) code together with 20 adders (bitwise XOR). After the data $d_k$ are serially input, the contents of the registers, which are the parity symbols, are serially shifted out in the sequence $R_{19}, R_{18}, \ldots, R_0$. For RS (340, 320) code, let the input data be $d = \{d_k\}$, where $k = 0, 1, \ldots, 319$ is the index of the clock period in this digital circuit of the encoder, and let the generator polynomial be $g(x) = x^{20} + g_{19} x^{19} + \ldots + g_1 x + g_0$, whose coefficients $[g_{19}, \ldots, g_0]$ = [58, 257, 157, 222, 388, 151, 62, 280, 137, 404, 286, 394, 61, 399, 73, 84, 145, 293, 369, 331] are constant. Shift register $R_n(k)$ describes the value of register $n$ at clock period $k$, where $n = 0, 1, \ldots, 19$. We formulated the serial encoder as

$$R_0(k+1) = g_0 \big(d_k + R_{19}(k)\big), \qquad R_n(k+1) = R_{n-1}(k) + g_n \big(d_k + R_{19}(k)\big), \quad n = 1, \ldots, 19. \tag{22}$$

Furthermore, the initial states are $R_n(0) = 0$ for all registers. After 320 clock periods, all data have been input into the encoder, all adders and multipliers are stopped, and the 20 registers are shifted out in the sequence $[R_{19}(320), R_{18}(320), \ldots, R_0(320)]$ to constitute the parity part of the RS (340, 320) code. In fact, the calculation performed by this encoder is equivalent to the RS code word calculation given in Section 2. The shortcomings of a serial encoder are low throughput and high encoding latency. We accordingly require a new structure to encode in parallel and lower the encoding latency while improving its throughput. For a 4-parallel encoder, each clock period during operation should be equivalent to four clock periods in the serial encoder. By carefully observing (22) and Figure 7, we can see that $R_n(k + 1)$ is determined by two variable terms: $R_{n−1}(k)$ (except for $R_0$) and $(d_k + R_{19}(k))$. By recursively applying (22), $R_n(k + 4)$ is determined by five variable terms: $R_{n−4}(k)$ (except for $R_0$, $R_1$, $R_2$, $R_3$), $(d_k + R_{19}(k))$, $(d_{k+1} + R_{18}(k))$, $(d_{k+2} + R_{17}(k))$, and $(d_{k+3} + R_{16}(k))$. We take $R_5(k + 4)$ as an example, as in (23), which is complicated to derive and difficult to understand. In order to simplify the derivation, we came up with an easy approach for calculating parallel encoding coefficients, leveraging a conventional serial encoder structure.
This method, which uses the serial encoder to calculate the parallel encoding coefficients, is proposed as Algorithm 2. The details of its derivation are described in Appendix A.

Algorithm 2 Parallel Encoding Coefficient Calculation.
Initialize the serial encoder; set all registers to zero.
An example of a 4-parallel encoder structure is given in Figure 8. Other parallelism implementations are very similar to the 4-parallel scenario.
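One way to see why a serial encoder can be reused to obtain parallel coefficients is that the register update is linear over GF(2^p), so four serial clocks collapse into one linear map whose coefficients are the responses to unit impulses. The sketch below follows that idea; it is not a transcription of Algorithm 2 or of the Appendix A derivation. It assumes 9-bit symbols (the quoted coefficients reach 404) and the primitive polynomial x^9 + x^4 + 1, neither of which is stated in the text, and the names `gf_mul`, `serial_step`, `parallel_coefficients`, and `parallel_step` are hypothetical.

```python
import random

P = 9                       # assumed symbol size in bits
PRIM_POLY = 0b1000010001    # assumed field polynomial x^9 + x^4 + 1

def gf_mul(a, b):
    """Multiply two field elements: carry-less multiply with reduction by PRIM_POLY."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a >> P:          # degree reached P, reduce
            a ^= PRIM_POLY
    return result

# Generator coefficients [g19, ..., g0] quoted from the text
G_HIGH_TO_LOW = [58, 257, 157, 222, 388, 151, 62, 280, 137, 404,
                 286, 394, 61, 399, 73, 84, 145, 293, 369, 331]
G = G_HIGH_TO_LOW[::-1]     # G[n] = g_n
NREG = len(G)

def serial_step(regs, d):
    """One clock of the serial encoder: feedback d + R19, then shift and add."""
    fb = d ^ regs[NREG - 1]
    new = [gf_mul(G[0], fb)]
    for n in range(1, NREG):
        new.append(regs[n - 1] ^ gf_mul(G[n], fb))
    return new

def parallel_coefficients(par=4):
    """Unit-impulse responses of `par` serial clocks, one per register and per input."""
    cols = []
    for src in range(NREG + par):
        regs = [0] * NREG
        data = [0] * par
        if src < NREG:
            regs[src] = 1           # unit value in one register
        else:
            data[src - NREG] = 1    # unit value on one of the parallel inputs
        for d in data:
            regs = serial_step(regs, d)
        cols.append(regs)
    return cols

def parallel_step(regs, data, cols):
    """One clock of the parallel encoder: linear combination of the impulse responses."""
    out = [0] * NREG
    for x, col in zip(list(regs) + list(data), cols):
        if x:
            for n in range(NREG):
                out[n] ^= gf_mul(col[n], x)
    return out
```

Feeding 320 symbols four at a time through `parallel_step` should leave the registers in the same state as 320 calls to `serial_step`, so the parallel structure needs 80 clock periods instead of 320 at the cost of more constant multipliers per period.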

Performance Evaluation, Results, and Discussion
We carried out simulations to evaluate the proposed concatenation code and decoding algorithm. The simulations were performed with FTN signaling, τ = 0.703, and shortened v = [0.553, 0.793, −0.084, −0.171, 0.154, −0.064]. We used y = a * v as in (4) to model the received signal through the AWGN channel, detected it with the 4-BCJR method, and then decoded the channel code and performed turbo equalization. Finally, we computed the BER to evaluate each decoding scheme at different SNRs. To reduce the storage used for the BCJR trellis, we limited the states at each stage to M = 4. Moreover, we set the number of turbo equalization iterations to 3 to avoid excessive iterations of the LDPC decoder and decrease the decoding time. The LDPC code used here is the (7680, 3840) code defined by the 5G standard mentioned in Section 2, with 780 data bits shortened (around 10%) to lower the code rate, leading to more definite results. RS (340, 320) is a high-rate code of rate 0.94 with 6% parity redundancy. The shortening of the LDPC code combined with simple hard decision RS concatenation shows the ability to lower the error floor of the original LDPC-coded FTN signaling and has an overall code rate of 0.42. All simulations were performed with MATLAB 2022 on a CPU; no GPU was used.
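The overall code rate of 0.42 can be reproduced with a short calculation. The sketch assumes 9-bit RS symbols (inferred here from the code length 340 and the size of the quoted generator coefficients, not stated in the text) and assumes that the shortened bits are known to the receiver and not transmitted; the function name `concatenated_rate` is hypothetical.

```python
def concatenated_rate(n_ldpc=7680, k_ldpc=3840, shortened=780,
                      n_rs=340, k_rs=320, bits_per_symbol=9):
    """Overall rate of the RS-outer, shortened-LDPC-inner concatenation."""
    rs_codeword_bits = n_rs * bits_per_symbol      # 340 symbols -> 3060 bits
    # The RS code word must exactly fill the LDPC data part left after shortening.
    if rs_codeword_bits != k_ldpc - shortened:
        raise ValueError("RS code word does not fit the shortened LDPC data part")
    data_bits = k_rs * bits_per_symbol             # payload carried by the RS code
    transmitted_bits = n_ldpc - shortened          # shortened bits are not sent
    return data_bits / transmitted_bits

rate = concatenated_rate()
print(round(rate, 2))
```

Under these assumptions the RS code word of 3060 bits fills the 3840 − 780 remaining LDPC data positions exactly, and 2880 payload bits are carried in 6900 transmitted bits.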
Note that the demonstration of the proposed scheme is intended to show its effectiveness in lowering the error floor; the parameters of shortening and of the RS code can change with the trade-off between code rate and BER and with the LDPC code used. Thus, those parameters can be determined by practical considerations and trade-offs and are not required to be optimal, since the goal of the proposed scheme is to lower the error floor.
The BP decoders with maximum iteration limits of 20 and 50 also show an error floor of around $10^{-5}$. The concatenation code lowers the error floor to $4 \times 10^{-6}$, and no error floor was found in the simulations of the concatenation code with perturbation in the BP decoder down to $2 \times 10^{-7}$ and $4 \times 10^{-8}$, as depicted in Figure 9 and Figure 10, respectively. We found that, for three iterations of turbo equalization, increasing the iteration number of the LDPC code has little effect on the BER performance.
The saturation (no further decrease) of the red curve and the blue curve in Figure 9 and of the green curve and the red curve in Figure 10 is caused by residual relevance (interference) between symbols. From the view of the decoder, residual relevance leads to error propagation, especially with trapping sets in the factor graph. We utilize high-rate RS code to lower the error floor, but it still exists, which means that concatenating RS code is helpful, but high-rate RS code alone is inadequate for eliminating the error floor. We therefore introduce perturbation decoding to enhance the proposed LDPC-RS concatenation scheme and eliminate the error floor.
In [6], LDPC-RS product code and the hybrid error-erasure correction (HEEC) decoding algorithm were proposed. The difference is that our proposed approach utilizes concatenated code, i.e., a whole RS code word is treated as the information part of an LDPC code word, whereas [6] utilizes product code, in which multiple RS code words are put in columns and multiple LDPC codes are put in rows. LDPC (576, 288)-RS (255, 51) product code (code length 576 × 255) and HEEC decoding in [6] will have better performance than our proposed scheme. However, since product code is deeply coupled, RS decoding can only start after 255 LDPC codes finish decoding. The product code consists of 255 LDPC codes and 36 RS codes. The overall decoding latency is (255 × Latency(LDPC) + 36 × Latency(RS)) × Iteration, where Latency(LDPC) and Latency(RS) vary with the decoding algorithm. For the LDPC-RS concatenation scheme proposed in this paper, the overall decoding latency is (Latency(LDPC) + Latency(RS)) × Iteration. The storage required by product code in [6] is 255 × Storage(LDPC) + 36 × Storage(RS), while our proposed scheme needs Storage(LDPC) + Storage(RS). Therefore, product code in [6] introduces much more latency and needs much more storage than our proposed concatenated code. For our proposed scheme, RS decoding only needs to wait for one LDPC code to finish decoding. Additionally, the HEEC iterative decoding exchanges information between the LDPC and RS decoders, so its decoding latency increases in proportion to the number of iterations. Considering this increase in latency, our proposed scheme does not use iterative decoding between the LDPC and RS codes.
In [11], Qiu and coworkers achieved a similar level ($10^{-8}$ to $10^{-7}$) of error floor with their RS-SC-LDPC algorithm. Spatially coupled LDPC code may have better performance; however, it has much more complexity and latency, even when utilizing sliding window decoding. Meanwhile, SC or sliding window decoding needs more storage for multiple LDPC code words. The product code scheme or the spatially coupled code scheme will have better performance than our proposed concatenated code scheme. However, under constraints on latency, complexity, and storage, our proposed scheme will be a better choice.

Conclusions
In this paper, we demonstrate 5G LDPC (7680, 3840) code in FTN signaling to explore spectrum efficiency and reliability. When limited to low latency and low storage, i.e., few BCJR states and few turbo iterations, we found an undesirable error floor. To combat the error floor, we proposed one possible solution, namely, using the LDPC-RS concatenation and shortening scheme together with perturbation introduced into the BP decoder, where the LDPC code has a rate of 0.5 and is combined with a high-rate (0.94) hard decision RS code to limit the rate loss. We proposed a 4-parallel encoder architecture for the RS component code and a concise algorithm to calculate its constant multiplier coefficients, leveraging a traditional serial encoder. This algorithm can also be used for other parallelisms, code rates, and lengths. The simulation results show that the proposed concatenation and shortening scheme can lower the error floor of the original single LDPC code, and with perturbation in the BP decoder, the error floor can be eliminated below $10^{-7}$ for different LDPC maximum decoding iterations. The advantage of the proposed scheme is that the LDPC code provides error correction for FTN signaling with few-state BCJR detection and few turbo equalization iterations, while RS concatenation, shortening, and perturbation techniques together lower the error floor. In future research, other LDPC codes will be constructed, factor graph-based FTN detection with LDPC joint decoding will be explored, and LDPC-RS soft iterative decoding will also be studied. A better equalizer will also be studied to strengthen the LDPC-RS concatenation scheme and broaden its application to fading channels and other scenarios.

Figure 3. Proposed LDPC-RS concatenation code. With concatenation and shortening, the code rate is 0.42.

Figure 4. One example of BP decoding in a (6, 3) trapping set. All three variable nodes have incorrect negative values, while the relevant check nodes are weak and cannot provide error correction during BP iterations.