Optimized Design for NB-LDPC-Coded High-Order CPM: Power and Iterative Efficiencies

Abstract: In this paper, a non-binary low-density parity-check (NB-LDPC) coded high-order continuous phase modulation (CPM) system is designed and optimized to improve power and iterative efficiencies. Firstly, the minimum squared normalized Euclidean distance and the 99% double-sided power bandwidth are introduced to design a competitive CPM, improving its power efficiency under a given code rate and spectral efficiency. Secondly, a three-step method based on extrinsic information transfer (EXIT) and entropy theory is used to design NB-LDPC codes, which reduces the convergence threshold by approximately 0.42 and 0.58 dB compared with the candidate schemes. Thirdly, an extrinsic information operation is proposed to address the positive feedback issue in iterative detection and decoding, and the bit error rate (BER) can be reduced by approximately $5 \times 10^{-3}$. Finally, iteration optimization employing the EXIT chart and the mutual information between demodulation and decoding is performed to achieve a suitable tradeoff between communication reliability and iterative decoding delay. Simulation results show that the resulting scheme provides a coding gain of approximately 3.95 dB over uncoded CPM and achieves advantages of approximately 0.5 and 0.7 dB over the candidate schemes. For a given code rate and spectral efficiency, the resulting NB-LDPC-coded high-order CPM enters the turbo cliff region earlier than its competitors and significantly improves power and iterative efficiencies.

Because serially concatenated CPM schemes have difficulty approaching the Shannon limit in terms of convergence threshold, a non-binary (NB) LDPC code [21] is adopted as the outer code; such codes have been studied extensively owing to their excellent error-correction capability. Unlike traditional bit-interleaved coded modulation (BICM) systems, the interleaver of a high-order CPM system works at the symbol level, which always yields a lower convergence threshold than bit-level interleaving [22]. At the receiver, demodulation and decoding are accomplished through iterations between the CPM soft-input soft-output (SISO) module, which employs a maximum a posteriori (MAP)-like algorithm [28], and the LDPC-SISO module, which adopts log-domain belief propagation (BP) with the fast Fourier transform [29,30]. Each SISO module calculates extrinsic a posteriori probabilities (APPs) from the a priori probabilities of the information and code symbols, and the decision device selects the symbol with the maximum APP in the last iteration.

SIR
The symmetric information rate (SIR) is defined as the mutual information rate under the assumption that the input symbols are independent and identically distributed. In designing a coded CPM system, the SIR is generally used to select the CPM scheme that requires the lowest $E_s/N_0$ to achieve a target SIR.
Given that CPM is a time-invariant finite-state machine (FSM) with complex signal outputs, the CPM system over an AWGN channel can be viewed as a finite-state Markov channel (FSMC, Figure 2), and the SIR can be calculated using the algorithm developed in [31][32][33][34].

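In Figure 2, $x_1^N$ and $v_1^N$ denote the input information sequence and the modulated sequence, respectively, and $y_1^N$ is the output sequence disturbed by the AWGN sequence $w_1^N$. On the basis of the definition of channel capacity, the mutual information rate between $x_1^N = (x_1, x_2, \ldots, x_N)$ and $y_1^N = (y_1, y_2, \ldots, y_N)$ can be estimated as:
$$I(x; y) = \lim_{N \to \infty} \frac{1}{N}\left[H(x_1^N) - H(x_1^N \mid y_1^N)\right] \quad (1)$$
where $H(\cdot)$ is an entropy function and $x_1^N$ is independently and uniformly distributed. The term $H(x_1^N \mid y_1^N)$ must also be calculated. Let $s_0^N$ denote the state transition sequence of the CPM. $s_0^N$ is a Markov random process determined only by $x_1^N$; thus, for a known initial state $s_0$:
$$H(x_1^N \mid y_1^N) = H(s_0^N \mid y_1^N) = -\mathrm{E}\left[\sum_{i=1}^{N} \log_2 p(s_i \mid s_{i-1}, y_1^N)\right] \quad (2)$$
To compute $p(s_i \mid s_{i-1}, y_1^N)$, Bayes' rule transforms it to:
$$p(s_i \mid s_{i-1}, y_1^N) = \frac{p(s_i, y_1^N \mid s_{i-1})}{\sum_{s_i} p(s_i, y_1^N \mid s_{i-1})} \quad (3)$$
$\gamma$ and $\beta$ are introduced using the Bahl-Cocke-Jelinek-Raviv (BCJR)-like algorithm to estimate $p(s_i, y_1^N \mid s_{i-1})$, and they are defined as follows [34]:
$$\gamma_i(s_{i-1}, s_i) = p(s_i, y_i \mid s_{i-1}) \quad (4)$$
$$\beta_{i-1}(s_{i-1}) = \sum_{s_i} \gamma_i(s_{i-1}, s_i)\, \beta_i(s_i) \quad (5)$$
where $\beta_N(s_N)$ is initialized as an equally likely state. Then, combining (4) and (5), $p(s_i \mid s_{i-1}, y_1^N)$ is rewritten as:
$$p(s_i \mid s_{i-1}, y_1^N) = \frac{\gamma_i(s_{i-1}, s_i)\, \beta_i(s_i)}{\sum_{s_i} \gamma_i(s_{i-1}, s_i)\, \beta_i(s_i)} \quad (6)$$
Figure 3 shows the simulated SIR of CPM signals with 8M2 raised cosine (8M2RC) for various modulation indices h (h = p/q, where p and q are coprime), where 8M2RC denotes a particular CPM family with M = 8, memory length L = 2, and an RC frequency pulse.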
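The backward recursion and the normalization step described above can be sketched in a few lines. This is a minimal illustration, not the algorithm of [34]: the array layout `gamma[i, a, b]`, the function names, and the per-step rescaling of β are assumptions made here, and the branch metrics are taken as given rather than computed from a CPM waveform.

```python
import numpy as np

def transition_posteriors(gamma):
    """Posterior p(s_{i+1}=b | s_i=a, y_1^N) for every trellis step.

    gamma[i, a, b] is assumed to hold p(s_{i+1}=b, y_{i+1} | s_i=a),
    i = 0..N-1 (hypothetical layout chosen for this sketch).
    """
    n_steps, n_states, _ = gamma.shape
    beta = np.full(n_states, 1.0 / n_states)   # beta_N: equally likely states
    post = np.zeros_like(gamma)
    for i in range(n_steps - 1, -1, -1):
        joint = gamma[i] * beta[None, :]       # gamma(a, b) * beta(b)
        norm = joint.sum(axis=1, keepdims=True)
        post[i] = np.divide(joint, norm, out=np.zeros_like(joint), where=norm > 0)
        beta = norm[:, 0]                      # backward recursion for beta
        beta = beta / beta.sum()               # rescaling only prevents underflow
    return post

def conditional_entropy_rate(post, states):
    """Estimate (1/N) H(x|y) in bits from one simulated state path s_0..s_N."""
    idx = np.arange(post.shape[0])
    p = post[idx, states[:-1], states[1:]]
    return -np.mean(np.log2(p))
```

Under the i.u.d. input assumption, an SIR estimate in bits per symbol would then be $\log_2 M$ minus the value returned by `conditional_entropy_rate`, averaged over a long simulated sequence.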

Design Criterion
The power and spectral efficiencies are influenced by the choice of M, h, pulse shape, memory, and code rate R. For example, a greater h, with the remaining parameters held constant, typically increases power efficiency at the expense of spectral efficiency. Similarly, increasing M increases spectral efficiency and decreases power efficiency. Lowering R alone can increase the coding gain but decreases spectral efficiency. Thus, the combination of code and modulation parameters for coded CPM must be chosen carefully under these constraints. In practical applications, the complexity of the CPM demodulator must also be considered; it depends on the total number of matched filters and trellis states and can be represented as $qM^L$ matched filters followed by the CPM-SISO detector with a trellis of $PM^{L-1}$ states. To limit the implementation complexity, only the CPM schemes subject to the constraints q ≤ 5, L < 3, and M ≤ 8 are analyzed in this paper.
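As a small illustration of these constraints, the sketch below enumerates the admissible modulation indices h = p/q and evaluates the two complexity counts. The function names are hypothetical, and `n_phase_states` stands for the factor P in the state count, whose value depends on the phase-state representation used and is therefore passed in explicitly.

```python
from math import gcd

def candidate_indices(q_max=5):
    """All h = p/q with gcd(p, q) = 1, q <= q_max and 0 < h < 1."""
    return [(p, q) for q in range(2, q_max + 1)
            for p in range(1, q) if gcd(p, q) == 1]

def demodulator_size(q, M, L, n_phase_states):
    """Return (matched filters, trellis states) = (q*M^L, P*M^(L-1))."""
    return q * M**L, n_phase_states * M**(L - 1)

# Example: 8M2RC with h = 1/2, assuming two phase states (0 and pi).
filters, states = demodulator_size(q=2, M=8, L=2, n_phase_states=2)
```

For q ≤ 5 this yields nine candidate values of h, which bounds the size of the search over M, L, and pulse shape.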
In Table 1, the MSNED $d^2_{\min}$ and $B_{99\%}T_b$ of all candidate CPM schemes with RC and rectangular (REC) frequency pulses under the constraints (i.e., q ≤ 5, L < 3, M ≤ 8, and h < 1) are calculated, where $T_b$ is the bit period. The MSNED is given by (7) and (8) [10], where $NT_s$ denotes the observation interval of N symbols and $\chi_i$ is the difference between the transmitted sequence $\alpha$ and the received sequence $\hat{\alpha}$. The values of $\chi_{i,j}$ are taken from {0, ±2, ±4, ..., ±2(M − 1)}. $B_{99\%}$ can be computed indirectly by:
$$\int_{-B_{99\%}/2}^{B_{99\%}/2} G(f)\, df = 0.99 \quad (9)$$
where $G(f)$ is the normalized power spectral density of CPM, defined in (10) and (11). There, $R(\tau)$ denotes the autocorrelation function of CPM given by (12), in which $\lfloor \cdot \rfloor$ is the floor operator.
Symmetry 2020, 12, 1353
A definite relationship between SIR and MSNED can be observed in Table 1 and Figure 3. The SIR curve of the CPM scheme with the highest MSNED converges earliest and corresponds to the lowest bound on $E_s/N_0$, i.e., the minimum $E_s/N_0$ read from the SIR curve at the rate $R \log_2 M$. For instance, the SIR of 8M2RC with h = 1/2 converges earliest, followed by h = 1/3, 1/4, and 1/5, because 8M2RC with h = 1/2 has the highest $d^2_{\min}$ of 5.56, followed by h = 1/3, 1/4, and 1/5. The computational complexity of MSNED is significantly lower than that of SIR, which makes it convenient for designing a competitive CPM scheme. Thus, we used $B_{99\%}T_b$ and MSNED instead of SIR curves to select a suitable scheme from all candidate CPM signals under a given code rate R and spectral efficiency η.
The relationship among M, R, $B_{99\%}$, $T_s$, and η in coded modulation systems is defined as:
$$\eta = \frac{R \log_2 M}{B_{99\%} T_s} = \frac{R}{B_{99\%} T_b} \quad (13)$$
where $T_s = T_b \log_2 M$ is the symbol period. For given η and R, $B_{99\%}T_b$ can be obtained from (13). The competitive CPM scheme is then the one with the highest $d^2_{\min}$ among the candidate CPM signals with the corresponding $B_{99\%}T_b$. For example, when η and R are 0.5 bit/s/Hz and 2/3, respectively, (13) gives $B_{99\%}T_b \approx 1.33$. Table 1 shows that 8M2RC with h = 1/2, 8M1REC with h = 2/5, and 4M2RC with h = 2/3 approximately meet this constraint on $B_{99\%}T_b$, and the $d^2_{\min}$ values of the three schemes are 5.56, 4.60, and 3.47, respectively. Consequently, 8M2RC with h = 1/2 is preferred as the competitive CPM scheme because it has the highest $d^2_{\min}$ among the candidates. The competitive CPM schemes for other η and R can be selected in the same way according to (13) and Table 1.
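The selection rule in this example can be written out directly. The sketch assumes that (13) reduces to $B_{99\%}T_b = R/\eta$, which reproduces the value 1.33 quoted above; the three candidates and their $d^2_{\min}$ values are the ones listed in the text, and the function names are hypothetical.

```python
def target_b99_tb(rate, eta):
    """Normalized bandwidth B99%*Tb implied by code rate R and efficiency eta."""
    return rate / eta

def pick_scheme(candidates):
    """Choose the candidate with the largest minimum squared distance."""
    return max(candidates, key=lambda c: c[2])

# (scheme, modulation index h, d2_min) for candidates near B99%*Tb = 1.33
candidates = [
    ("8M2RC", "1/2", 5.56),
    ("8M1REC", "2/5", 4.60),
    ("4M2RC", "2/3", 3.47),
]

b99_tb = target_b99_tb(rate=2 / 3, eta=0.5)   # about 1.33
best = pick_scheme(candidates)                 # 8M2RC with h = 1/2
```

In a full design flow, the candidate list would first be filtered from Table 1 by closeness to the target bandwidth, and only then ranked by distance.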

EXIT Technique
The EXIT chart [35] is a powerful technique for predicting the convergence of iterative detection using SISO modules. Similar to density evolution, this technique models the extrinsic information in SISO decoders as independent Gaussian random variables. Figure 4 depicts the extrinsic information exchange between the two SISO decoders of the NB-LDPC-coded high-order CPM system. CPM-SISO has two inputs: the a priori information from the interleaver and the inner codeword information from the matched filters. As the inner codeword information depends on $E_b/N_0$, the average mutual information of the CPM-SISO output can be treated as a function of the average mutual information of the input $I_A^{\mathrm{CPM}}$ and $E_b/N_0$, given by (14), and correspondingly for LDPC-SISO by (15):
$$I_E^{\mathrm{CPM}} = T_{\mathrm{CPM-SISO}}\left(I_A^{\mathrm{CPM}}, E_b/N_0\right) \quad (14)$$
$$I_E^{\mathrm{LDPC}} = T_{\mathrm{LDPC-SISO}}\left(I_A^{\mathrm{LDPC}}\right) \quad (15)$$
where the functions $T_{\mathrm{CPM-SISO}}(\cdot)$ and $T_{\mathrm{LDPC-SISO}}(\cdot)$ are defined as the EXIT characteristics of CPM-SISO and LDPC-SISO, respectively.

NB-LDPC Code Design
A three-step method is used in this paper to design the NB-LDPC codes.
(1) Optimization of the degree distribution based on the binary parity-check matrix using the EXIT technique. The NB-LDPC code has the same Tanner graph and degree distribution as its binary representation, apart from the values of the non-zero entries; thus, the EXIT chart was used to find a binary sparse matrix with an acceptable degree distribution. Figure 5 provides the EXIT characteristics for different variable node (VN) and check node (CN) degrees. To design a suitable degree profile consisting of VN and CN degree distributions, the general approach described in [36] was used: the CN degree distribution is fixed, and the VN degree distribution is varied to search for the lowest convergence threshold. At that threshold, the area between the VN EXIT curve and the inverse CN EXIT curve is as small as possible without the curves intersecting, and the corresponding degree distribution is optimal. This systematic search uses the VN and CN EXIT curves given by (16) and (17), where $d_v$ and $d_c$ are the maximum VN and CN degrees, respectively, and $\lambda_i$ and $\rho_i$ are the fractions of edges connected to VNs and CNs of degree i, satisfying $\sum_i \lambda_i = 1$ and $\sum_i \rho_i = 1$. The function $J(\cdot)$ and its inverse $J^{-1}(\cdot)$ are defined in [33]. Equations (16) and (17) are linearly weighted sums of the VN and CN EXIT curves for the given R and $E_b/N_0$, respectively. The entire search for a suitable degree distribution is subject to the constraint (18).
(2) Construction of a parity-check matrix with a large girth. After the degree distribution is determined, the positions of the non-zero elements in the binary parity-check matrix $H_b$ must be fixed. A girth optimization tool, progressive edge growth [37], is adopted to avoid short cycles and achieve good girth properties when the BP-like algorithm is run on the Tanner graph.
(3) Choice of non-zero elements over GF(Q). Generally, this step can be performed by substituting the "1" elements of $H_b$ with random non-zero elements of GF(Q), which provides acceptable performance in most cases. Entropy, which is the appropriate measure of uncertainty, is introduced here to increase the randomness of the cycles in the Tanner graph, which helps obtain a low error rate. First, the cycle-searching algorithm proposed in [38] is applied to find all short cycles of $H_b$ with lengths l = 4, 6, and, if necessary, 8, and the positions of the non-zero elements in each cycle are recorded. Next, all the "1" elements of $H_b$ are randomly replaced with non-zero elements of GF(Q), as in the general construction. Finally, the entropy of each recorded cycle is calculated and maximized over the non-zero values of GF(Q) assigned to its elements, that is:
$$\max H = \max\left(-\sum_{i=1}^{Q-1} \mathrm{Pr}_i \log_2 \mathrm{Pr}_i\right) \quad (19)$$
where $\mathrm{Pr}_i = n_i / l$ and $n_i$ denotes the number of times the value i appears in the cycle, with $i \in \{1, 2, \ldots, Q - 1\}$.
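The cycle entropy used in step (3) is straightforward to compute once the GF(Q) values along a cycle are known. The sketch below assumes a base-2 logarithm and represents a cycle simply as the list of non-zero field values found at its recorded positions; the function name and this representation are illustrative choices, not taken from [38].

```python
from collections import Counter
from math import log2

def cycle_entropy(values):
    """Entropy (bits) of the non-zero GF(Q) values along one cycle.

    Pr_i = n_i / l, where n_i counts occurrences of value i and
    l = len(values) is the cycle length.
    """
    l = len(values)
    return -sum((n / l) * log2(n / l) for n in Counter(values).values())

# A length-4 cycle with a repeated value versus one with distinct values.
low = cycle_entropy([3, 3, 3, 3])    # all coefficients equal
high = cycle_entropy([1, 2, 3, 4])   # all coefficients different
```

A simple way to use this is to redraw the coefficients of a recorded cycle until its entropy stops improving, keeping in mind that neighboring cycles share entries, so the cycles cannot be optimized fully independently.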
In the case of R = 2/3 and η = 0.5 bit/s/Hz, the EXIT chart of the resulting scheme, 8M2RC concatenated with the designed NB-LDPC code, was obtained from (14) and (15) and is shown in Figure 6; the convergence threshold is 0.81 dB.
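To make the degree-distribution search more concrete, the sketch below evaluates edge-weighted VN and CN EXIT curves for a binary LDPC code. It is only an illustration under stated assumptions: it uses the widely quoted closed-form approximation of the J function with curve-fit constants (not necessarily the definition in [33]), the classical binary-input AWGN node formulas rather than the paper's (16) and (17), and a hypothetical channel parameter `sigma_ch` in place of the CPM-SISO output.

```python
import numpy as np

H1, H2, H3 = 0.3073, 0.8935, 1.1064   # curve-fit constants (assumed approximation)

def J(sigma):
    """Approximate mutual information of a consistent Gaussian LLR."""
    return (1.0 - 2.0 ** (-H1 * sigma ** (2 * H2))) ** H3

def J_inv(I):
    """Inverse of the approximation above, valid for 0 < I < 1."""
    return (-(1.0 / H1) * np.log2(1.0 - I ** (1.0 / H3))) ** (1.0 / (2 * H2))

def vn_exit(I_A, lam, sigma_ch):
    """Edge-weighted VN curve; lam maps degree -> edge fraction."""
    return sum(f * J(np.sqrt((d - 1) * J_inv(I_A) ** 2 + sigma_ch ** 2))
               for d, f in lam.items())

def cn_exit(I_A, rho):
    """Edge-weighted CN curve; rho maps degree -> edge fraction."""
    return sum(f * (1.0 - J(np.sqrt(d - 1) * J_inv(1.0 - I_A)))
               for d, f in rho.items())

def design_rate(lam, rho):
    """Code rate implied by edge-perspective degree distributions."""
    return 1.0 - sum(f / d for d, f in rho.items()) / sum(f / d for d, f in lam.items())
```

A search in the spirit of step (1) would fix `rho`, vary `lam` subject to the edge fractions summing to one and `design_rate` matching R, and keep the profile whose VN curve stays above the inverse CN curve at the lowest $E_b/N_0$.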

Additional Advantages
The investigation of NB-LDPC-coded high-order CPM is particularly significant for coded modulation systems. In addition to the excellent properties of CPM (continuous phase, rapidly decaying spectral side lobes, and constant envelope), this scheme has the following additional advantages:
(1) Each edge in the Tanner graph of a binary LDPC code carries bit messages, whereas each edge of an NB-LDPC code carries Q-ary symbol messages; thus, short cycles are avoided in the Tanner graph. This reduces the influence of short cycles and stopping sets on decoding convergence, so the BP algorithm becomes closer to maximum likelihood decoding. The NB-LDPC code, as an outer code in coded modulation systems, therefore provides an alternative way of enhancing BER performance in practical applications.
(2) In comparison with traditional BICM, the interleaver of the NB-LDPC-coded high-order CPM works at the symbol level, which always yields a lower convergence threshold than bit-level interleaving. This advantage is rather significant in a serial concatenation [22].
(3) As the NB-LDPC code and the CPM use the same M-ary alphabet in the investigated systems, the symbol mapping issue, which is likely to cause information loss in the conversion from bits to symbols, may be ignored. This loss usually occurs when M > q, because several input code symbols then connect the same pair of current and next phase states in the trellis. As an example, for the CPM scheme 8M2RC with h = 1/2, the transfer diagram of the phase states under Gray and natural mappings is shown in Figure 7. In this case, there are eight input code symbols and two phase states, 0 and π, and each transfer between adjacent phase states can be caused by four possible input code symbols.
Information loss under an inappropriate symbol mapping occurs when the MAP-like algorithm in CPM-SISO is used [25], that is, in (20) together with (21) and (22), where the factors $H_u$ and $H_{u^j}$ are normalization constants; $P_I(c; I)$ is the input probability of the inner codeword; $P_I(u; I)$ is the input probability of the inner information; $P_I(u; O)$ denotes the output probability of the inner information; and $A_k(\cdot)$ and $B_k(\cdot)$ are obtained through forward and backward recursions, respectively [28]. On the basis of (20), deciding on a certain bit requires knowing the identities of the other bits. For instance, suppose the current and next phase states are 0 and π, respectively; with natural mapping, there are four possible input code symbols (001, 101, 011, 111), as shown in Figure 7. It is hard to decide whether the first bit is 0 or 1 when the other bits are either 01 or 11. In the Gray mapping case, the decoder can make the appropriate decision on the first bit when the pairs formed by the other two mapped bits are different. Consequently, symbol mapping must be chosen carefully for BICM systems with high-order CPM.
Although NB-LDPC codes have many advantages over binary codes, their decoding complexity restricts their adoption. Since NB-LDPC decoding operates at the symbol level, the decoding complexity increases rapidly with Q. The check-node update of BP decoding has an asymptotic complexity of $O(Q^2)$ for log-domain decoding or $O(Q \log_2 Q)$ for fast Fourier transform (FFT) decoding. Hence, in comparison with binary coding, the per-bit decoding complexity is increased by a factor of at least Q. In practical applications, the Log-FFT-BP decoding algorithm can be adopted. First, the large number of convolution operations are converted into multiplications in the frequency domain; then, the multiplications are converted into additions in the log domain, which effectively reduces the decoding complexity.
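The complexity gap between the two check-node update styles can be illustrated on a single pairwise combination of messages. The sketch below assumes Q is a power of two, so that addition in GF(Q) is a bitwise XOR of symbol indices and the relevant transform is the Walsh-Hadamard transform. It works in the probability domain and omits both the permutations induced by the non-zero edge coefficients and the log-domain arithmetic of Log-FFT-BP; all function names are hypothetical.

```python
import numpy as np

def fwht(v):
    """Fast Walsh-Hadamard transform, O(Q log Q) for len(v) = Q = 2^m."""
    v = np.array(v, dtype=float)
    h = 1
    while h < len(v):
        for start in range(0, len(v), 2 * h):
            a = v[start:start + h].copy()
            b = v[start + h:start + 2 * h].copy()
            v[start:start + h] = a + b
            v[start + h:start + 2 * h] = a - b
        h *= 2
    return v

def xor_convolve_direct(p, r):
    """O(Q^2) combination: out[a ^ b] accumulates p[a] * r[b]."""
    Q = len(p)
    out = np.zeros(Q)
    for a in range(Q):
        for b in range(Q):
            out[a ^ b] += p[a] * r[b]
    return out

def xor_convolve_fast(p, r):
    """Same result via transform, pointwise product, inverse transform."""
    return fwht(fwht(p) * fwht(r)) / len(p)
```

For Q = 8 both routines return the same probability vector; the transform-based version replaces the double loop with three transforms and one pointwise product, which is where the $O(Q \log_2 Q)$ figure comes from.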

Positive Feedback Issue
In turbo-like receivers, an unwanted phenomenon called positive feedback, in which BER performance worsens with increasing iterations, commonly occurs. A similar phenomenon also occurs in the investigated system, and it is much more serious at low $E_b/N_0$ because of the insufficient interleaving length and the high probability of burst errors.
A method of extrinsic information operation is introduced to improve iterative convergence, thereby effectively avoiding this undesired phenomenon. As shown in Figure 1, the extrinsic information from one SISO decoder is processed before being passed to the other SISO decoder, as in (23), where $P_O(c, O)$ is the output probability of the outer codeword and $\psi(\cdot)$ is given by (24), in which the parameters a and b satisfy $a \in [0.6, 0.9]$ and $b \in [0.001, 0.01]$, respectively. Figures 8 and 9 compare the iterative convergence of the NB-LDPC code for 8M2RC using the proposed method and the original approach at $E_b/N_0$ = 0.4 and 1.2 dB, respectively, where a and b are set to 0.9 and 0.01. As can be seen from Figure 8, the proposed method effectively curbs the positive feedback phenomenon at low $E_b/N_0$, and the BER is reduced by approximately $5 \times 10^{-3}$ compared with the original method. Figure 9 reveals that, at medium-to-high $E_b/N_0$, the proposed method accelerates the iterative convergence, improves power efficiency, and enhances transmission reliability (the BER is reduced by approximately $2.5 \times 10^{-3}$ compared with the original method).

Optimization Design for Iterative Efficiency
LDPC decoding is itself iterative, using BP or a modified BP algorithm. Thus, the NB-LDPC-coded CPM system has two iterative decoding structures at the receiver: inner iterations within the LDPC decoder and outer iterations between demodulation and decoding. The choice of iteration numbers has a profound effect on iterative decoding performance. Large numbers of iterations between demodulation and decoding are generally used to achieve excellent BER performance, but they increase computational complexity and iterative decoding delay. To improve iterative efficiency, we propose iteration optimization using the EXIT chart and the mutual information between demodulation and decoding to achieve a suitable tradeoff between communication reliability and iterative decoding delay.
The optimized NB-LDPC code for the 8M2RC scheme with η = 0.5 bit/s/Hz and R = 2/3 was investigated as an example. The corresponding EXIT curves of the optimized NB-LDPC code with various numbers of inner iterations are shown in Figure 10. Inner iterations have no significant impact on the convergence threshold because more iterations barely improve the BER performance at low $E_b/N_0$.
The EXIT curves of the NB-LDPC code become steeper with increasing inner iterations, which widens the tunnel between the CPM-SISO and LDPC-SISO EXIT curves and results in easier convergence with fewer outer iterations. However, the improvement is no longer evident beyond five inner iterations. Hence, the number of inner iterations in this design is set to five.

Figure 9. Iterative convergence of the NB-LDPC code for 8M2RC with h = 1/2 using the proposed and original methods at $E_b/N_0$ = 1.2 dB.
Once the inner iterations were determined, the optimization of the outer iterations was investigated as follows. In the turbo cliff region, fewer outer iterations are required to achieve a good BER performance as $E_b/N_0$ increases. Thus, the number of outer iterations should be adapted to $E_b/N_0$. Mutual information is an effective tool for understanding the convergence of SISO decoders: a larger mutual information means that the SISO decoders identify the information symbols more accurately. The mutual information of the outer codeword probability at the LDPC-SISO output at the nth iteration, denoted here by $I^{(n)}$, is computed from $P_O^n(c_i = q, O)$, the probability that the ith element of the symbol vector c equals q. If the difference in mutual information between the current and previous iterations is extremely small, then their BER performance is comparable, and further outer iterations no longer bring a remarkable improvement. Therefore, an iterative stopping criterion can be expressed as:
$$\left| I^{(n)} - I^{(n-1)} \right| < \varepsilon,$$
where $\varepsilon$ is an extremely small value (for example, $\varepsilon = 10^{-5}$). If the current iteration satisfies this criterion at a certain $E_b/N_0$, it is taken as the optimal outer iteration, and no further iterations are performed.
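The sketch below illustrates the stopping rule described above. It uses a common entropy-based estimate of the average symbol-level mutual information, I = 1 − (1/N) Σ_i H_M(p_i), where H_M is the base-M entropy of the ith symbol APP; this estimator is an assumption for illustration and is not necessarily the definition used in this paper. The names `symbol_mutual_information`, `should_stop`, and `run_outer_iterations` are likewise hypothetical, and the list of APP arrays stands in for the real CPM-SISO / LDPC-SISO exchange.

```python
import numpy as np


def symbol_mutual_information(app):
    """Estimate the average mutual information (in M-ary symbols) from APPs.

    app: array of shape (N, M); row i holds P(c_i = q) for q = 0..M-1.
    Assumed estimator: I = 1 - mean_i H_M(app[i]), entropy taken in base M.
    """
    app = np.asarray(app, dtype=float)
    app = app / app.sum(axis=1, keepdims=True)   # normalise each row
    m = app.shape[1]
    safe = np.clip(app, 1e-300, 1.0)             # avoid log(0); 0 * log(.) = 0
    entropy = -(app * np.log(safe)).sum(axis=1) / np.log(m)
    return 1.0 - entropy.mean()


def should_stop(mi_curr, mi_prev, eps=1e-5):
    """Stopping rule: the change in mutual information is below eps."""
    return abs(mi_curr - mi_prev) < eps


def run_outer_iterations(app_per_iteration, eps=1e-5):
    """Return the 1-based outer iteration at which decoding would stop.

    app_per_iteration: iterable of (N, M) APP arrays, one per outer iteration.
    If the rule never fires, the total number of iterations is returned.
    """
    mi_prev = None
    n = 0
    for n, app in enumerate(app_per_iteration, start=1):
        mi = symbol_mutual_information(app)
        if mi_prev is not None and should_stop(mi, mi_prev, eps):
            return n
        mi_prev = mi
    return n
```

With this rule the number of outer iterations is not fixed in advance: at higher $E_b/N_0$ the mutual information saturates sooner, so the loop exits earlier.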

Simulation Results
In this section, BER simulations for various schemes with η = 0.5 bit/s/Hz and R = 2/3 are presented. Figure 11 depicts the BER performance of the optimized NB-LDPC code for 8M2RC with various inner iterations at a fixed 20 outer iterations. The BER curves for the various inner iterations almost coincide for $E_b/N_0$ up to approximately 0.8 dB. This finding indicates that the BER performance at low $E_b/N_0$ cannot be improved by further increasing the number of iterations. Conversely, when $E_b/N_0$ exceeds the convergence threshold and enters the turbo cliff region, the BER performance improves significantly as the number of inner iterations increases. However, the improvement is no longer notable beyond five inner iterations, in agreement with the EXIT chart analysis.

Figure 12 presents the BER simulations of the resulting scheme (8M2RC) and the other candidate schemes (8M1REC and 4M2RC) using the proposed methods for η = 0.5 bit/s/Hz and R = 2/3, as well as their original counterparts with or without the extrinsic information operation. In this analysis, a and b were set to 0.9 and 0.01, respectively. To ensure a fair comparison, the VN degree distribution was kept the same as that of the resulting scheme.
In Figure 12, the convergence thresholds of all the schemes are reported. The resulting scheme enters the turbo cliff region 0.42 and 0.58 dB earlier than the 8M1REC and 4M2RC candidate schemes, respectively, and therefore shows higher power efficiency. The resulting scheme provides an approximately 3.95 dB coding gain compared to the uncoded 8M2RC, while achieving approximately 0.5 dB and 0.7 dB advantages over the 8M1REC and 4M2RC schemes, respectively, at a BER of $10^{-3}$. The schemes with the extrinsic information operation always converge to a smaller BER in the turbo cliff region than those without it, with an advantage of approximately 0.1 dB.
In addition, Figure 13 shows that the optimized schemes using five inner iterations and $\varepsilon = 10^{-5}$ require fewer average outer iterations as $E_b/N_0$ increases and exhibit negligible BER performance degradation with respect to their original counterparts, which use a total of 140 iterations (seven inner iterations × 20 outer iterations). These optimized schemes attain a suitable tradeoff between communication reliability and iterative decoding delay and enhance the iterative efficiency of the system.
Figure 13. Average outer iteration comparison of the resulting scheme and other candidate schemes for η = 0.5 bit/s/Hz and R = 2/3, as well as their original counterparts with or without extrinsic information operation.
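The iteration budgets being compared can be written out explicitly. In the sketch below, $\bar{n}_{\mathrm{out}}$ denotes the average number of outer iterations at a given $E_b/N_0$ (the quantity plotted in Figure 13); the symbols $N_{\mathrm{orig}}$ and $N_{\mathrm{opt}}$ are introduced here only for illustration.

```latex
\begin{align}
N_{\mathrm{orig}} &= 7 \times 20 = 140, \\
N_{\mathrm{opt}}  &= 5 \times \bar{n}_{\mathrm{out}}, \\
\frac{N_{\mathrm{opt}}}{N_{\mathrm{orig}}} &= \frac{5\,\bar{n}_{\mathrm{out}}}{140} = \frac{\bar{n}_{\mathrm{out}}}{28}.
\end{align}
```

Under the additional assumption that the optimized schemes keep the same cap of 20 outer iterations, $N_{\mathrm{opt}} \le 100$, and the saving grows as $\bar{n}_{\mathrm{out}}$ falls with increasing $E_b/N_0$.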

Conclusions
The design of the CPM parameters, the decoding delay, and the positive feedback problem in iterative decoding are the main factors that limit the development of coded CPM systems. To address these problems, NB-LDPC-coded high-order CPM systems were designed and optimized in this paper. A novel design method based on the MSNED and $B_{99\%}$ was introduced to find a competitive CPM scheme for a given η and R under the constraint of implementation complexity. A three-step method based on the EXIT chart and entropy theory was used to design the NB-LDPC code, which reduces the convergence threshold by approximately 0.42 and 0.58 dB compared to the candidate schemes. A method of extrinsic information operation was proposed to address the positive feedback phenomenon in iterative detection and decoding.
The simulation results showed that the proposed method not only effectively inhibits the positive feedback phenomenon at low $E_b/N_0$ but also accelerates iterative convergence at medium-to-high $E_b/N_0$, and the BER can be reduced by approximately $5 \times 10^{-3}$. An improper iteration match between demodulation and decoding was addressed using the EXIT technique and mutual information to improve the iterative efficiency and attain a suitable tradeoff between communication reliability and iterative decoding delay. Finally, the simulation results show that the resulting NB-LDPC-coded high-order CPM scheme provides an approximately 3.95 dB coding gain compared to the uncoded CPM and achieves approximately 0.5 and 0.7 dB advantages compared with the candidate schemes. The resulting scheme using the proposed method reaches the convergence threshold earlier than the other competitors and further improves power and iterative efficiencies.
Author Contributions: R.X. conceived the idea and established the mathematical modeling. T.W. did the simulations and wrote the paper. Y.S. checked the simulation and analyzed the data. H.T. contributed to the revisions and the discussion of the results. All authors have read and agreed to the published version of the manuscript.